Query 012318
Match_columns 466
No_of_seqs 455 out of 3342
Neff 7.9
Searched_HMMs 46136
Date Fri Mar 29 01:22:19 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/012318.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/012318hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.2E-49 7E-54 411.3 38.3 296 126-462 41-361 (455)
2 TIGR02038 protease_degS peripl 100.0 1.2E-48 2.7E-53 395.7 38.9 297 125-463 45-350 (351)
3 PRK10898 serine endoprotease; 100.0 2.4E-48 5.3E-53 393.3 38.6 296 126-463 46-351 (353)
4 TIGR02037 degP_htrA_DO peripla 100.0 1.7E-46 3.6E-51 390.9 37.9 295 127-462 3-328 (428)
5 PRK10942 serine endoprotease; 100.0 2.3E-46 4.9E-51 391.7 37.5 295 126-461 39-381 (473)
6 COG0265 DegQ Trypsin-like seri 100.0 3E-36 6.5E-41 306.0 33.1 295 125-461 33-340 (347)
7 KOG1320 Serine protease [Postt 100.0 5.1E-28 1.1E-32 246.2 23.0 327 124-465 127-472 (473)
8 KOG1421 Predicted signaling-as 99.9 3.6E-25 7.8E-30 227.1 21.9 304 125-462 52-372 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 2.4E-16 5.3E-21 134.3 12.7 117 157-303 1-120 (120)
10 KOG1421 Predicted signaling-as 99.6 5E-14 1.1E-18 145.8 18.1 295 132-461 525-831 (955)
11 PF13180 PDZ_2: PDZ domain; PD 99.5 4.9E-14 1.1E-18 112.8 10.1 81 362-459 1-82 (82)
12 PF00089 Trypsin: Trypsin; In 99.5 2.6E-12 5.7E-17 120.6 18.0 177 135-327 12-220 (220)
13 cd00190 Tryp_SPc Trypsin-like 99.4 2.5E-11 5.5E-16 114.7 18.9 181 135-329 12-231 (232)
14 cd00987 PDZ_serine_protease PD 99.3 1.1E-11 2.3E-16 100.5 10.9 88 362-456 1-89 (90)
15 cd00991 PDZ_archaeal_metallopr 99.3 3.6E-11 7.7E-16 95.5 10.2 69 389-458 8-77 (79)
16 smart00020 Tryp_SPc Trypsin-li 99.2 2.7E-10 5.9E-15 107.9 15.6 161 134-308 12-208 (229)
17 cd00990 PDZ_glycyl_aminopeptid 99.2 1E-10 2.2E-15 92.8 9.7 68 390-460 11-78 (80)
18 cd00989 PDZ_metalloprotease PD 99.2 1.2E-10 2.7E-15 92.0 9.7 67 391-458 12-78 (79)
19 TIGR01713 typeII_sec_gspC gene 99.2 1.2E-10 2.5E-15 113.1 11.5 99 321-458 159-258 (259)
20 cd00986 PDZ_LON_protease PDZ d 99.2 1.7E-10 3.7E-15 91.5 10.1 71 390-462 7-78 (79)
21 KOG1320 Serine protease [Postt 99.2 7.2E-11 1.6E-15 121.3 9.3 277 131-450 56-351 (473)
22 cd00988 PDZ_CTP_protease PDZ d 99.1 5.2E-10 1.1E-14 89.7 9.5 71 390-460 12-84 (85)
23 cd00136 PDZ PDZ domain, also c 99.0 2.1E-09 4.6E-14 82.8 7.7 55 391-445 13-69 (70)
24 TIGR02037 degP_htrA_DO peripla 99.0 3.5E-09 7.6E-14 110.8 11.2 90 361-456 337-427 (428)
25 COG3591 V8-like Glu-specific e 98.8 6.3E-08 1.4E-12 92.4 13.8 160 155-331 64-250 (251)
26 TIGR00054 RIP metalloprotease 98.8 1.5E-08 3.3E-13 105.6 9.8 70 391-461 203-272 (420)
27 PRK10779 zinc metallopeptidase 98.7 4.1E-08 8.9E-13 103.3 10.2 69 391-460 221-289 (449)
28 PRK10779 zinc metallopeptidase 98.7 3.3E-08 7.1E-13 104.0 8.7 67 393-460 128-195 (449)
29 cd00992 PDZ_signaling PDZ doma 98.6 3E-07 6.4E-12 72.9 9.0 69 361-445 11-81 (82)
30 smart00228 PDZ Domain present 98.6 2.9E-07 6.2E-12 73.2 8.9 71 362-447 12-83 (85)
31 PRK10139 serine endoprotease; 98.6 1.6E-07 3.4E-12 98.7 9.5 66 390-457 389-454 (455)
32 TIGR02860 spore_IV_B stage IV 98.6 2.4E-07 5.2E-12 94.3 10.1 69 391-460 105-181 (402)
33 PF00595 PDZ: PDZ domain (Also 98.5 2.7E-07 6E-12 73.3 7.4 72 360-446 8-81 (81)
34 TIGR00225 prc C-terminal pepti 98.5 2.6E-07 5.7E-12 93.5 8.3 68 391-459 62-131 (334)
35 PLN00049 carboxyl-terminal pro 98.5 4E-07 8.7E-12 94.0 9.6 68 391-459 102-171 (389)
36 TIGR03279 cyano_FeS_chp putati 98.5 3.6E-07 7.8E-12 93.6 8.9 63 395-461 2-65 (433)
37 PRK10942 serine endoprotease; 98.5 3.8E-07 8.3E-12 96.3 9.2 65 391-457 408-472 (473)
38 KOG3627 Trypsin [Amino acid tr 98.4 1.9E-05 4E-10 76.5 17.7 169 155-331 38-254 (256)
39 PF14685 Tricorn_PDZ: Tricorn 98.3 3.2E-06 7E-11 68.0 8.0 68 390-457 11-88 (88)
40 PF00863 Peptidase_C4: Peptida 98.3 8.9E-05 1.9E-09 70.4 18.6 174 133-333 15-195 (235)
41 TIGR00054 RIP metalloprotease 98.3 1.7E-06 3.6E-11 90.3 7.0 64 391-456 128-191 (420)
42 KOG3129 26S proteasome regulat 98.2 3.1E-06 6.7E-11 77.3 7.5 73 392-465 140-215 (231)
43 COG0793 Prc Periplasmic protea 98.2 5.1E-06 1.1E-10 85.9 9.5 70 391-460 112-184 (406)
44 PF04495 GRASP55_65: GRASP55/6 98.0 1.5E-05 3.2E-10 70.0 7.7 72 390-461 42-115 (138)
45 COG3975 Predicted protease wit 97.9 1.9E-05 4E-10 81.7 7.1 65 389-462 460-525 (558)
46 PRK11186 carboxy-terminal prot 97.9 3.5E-05 7.6E-10 84.0 9.0 70 391-460 255-334 (667)
47 COG3480 SdrC Predicted secrete 97.9 4.4E-05 9.6E-10 74.3 8.6 70 389-459 128-198 (342)
48 PRK09681 putative type II secr 97.9 4.8E-05 1E-09 73.9 8.5 61 398-459 211-275 (276)
49 PF12812 PDZ_1: PDZ-like domai 97.7 0.00017 3.6E-09 56.9 7.0 69 361-437 8-76 (78)
50 COG5640 Secreted trypsin-like 97.6 0.00059 1.3E-08 67.6 12.0 51 282-332 223-279 (413)
51 PF05579 Peptidase_S32: Equine 97.6 0.0013 2.7E-08 62.8 13.2 116 154-308 111-229 (297)
52 KOG3553 Tax interaction protei 97.5 7.3E-05 1.6E-09 60.2 3.3 37 388-424 56-92 (124)
53 PF08192 Peptidase_S64: Peptid 97.4 0.0015 3.3E-08 69.6 11.6 118 208-331 541-689 (695)
54 PF03761 DUF316: Domain of unk 97.3 0.018 3.9E-07 56.7 17.8 176 135-325 53-273 (282)
55 COG3031 PulC Type II secretory 97.3 0.00067 1.4E-08 63.5 6.9 67 391-458 207-274 (275)
56 PF05580 Peptidase_S55: SpoIVB 97.2 0.0082 1.8E-07 56.0 13.2 161 154-323 19-215 (218)
57 KOG3580 Tight junction protein 96.7 0.0031 6.7E-08 66.0 6.2 59 388-446 426-487 (1027)
58 KOG3532 Predicted protein kina 96.6 0.0036 7.7E-08 66.5 6.4 51 389-439 396-446 (1051)
59 PF10459 Peptidase_S46: Peptid 96.6 0.0097 2.1E-07 65.5 9.4 24 155-178 47-70 (698)
60 PF10459 Peptidase_S46: Peptid 96.4 0.0055 1.2E-07 67.4 6.2 55 277-331 623-687 (698)
61 PF00548 Peptidase_C3: 3C cyst 96.2 0.049 1.1E-06 49.7 10.6 148 136-307 13-170 (172)
62 KOG3209 WW domain-containing p 95.9 0.011 2.5E-07 63.1 5.6 52 395-447 782-836 (984)
63 PF02122 Peptidase_S39: Peptid 95.8 0.048 1E-06 51.0 8.5 117 167-307 43-166 (203)
64 KOG3552 FERM domain protein FR 95.7 0.013 2.9E-07 64.3 5.0 56 391-448 75-132 (1298)
65 KOG3550 Receptor targeting pro 95.6 0.033 7.1E-07 48.6 6.1 57 389-446 113-172 (207)
66 TIGR02860 spore_IV_B stage IV 95.6 0.18 3.9E-06 52.0 12.4 45 278-323 351-395 (402)
67 PF00949 Peptidase_S7: Peptida 95.3 0.036 7.7E-07 48.1 5.5 33 278-310 88-120 (132)
68 COG0750 Predicted membrane-ass 95.3 0.064 1.4E-06 55.0 8.5 56 397-452 135-193 (375)
69 PF09342 DUF1986: Domain of un 95.3 0.56 1.2E-05 44.7 13.6 99 136-247 17-131 (267)
70 PF00944 Peptidase_S3: Alphavi 94.7 0.086 1.9E-06 45.3 6.0 39 279-317 98-136 (158)
71 KOG3580 Tight junction protein 94.4 0.088 1.9E-06 55.5 6.5 73 381-457 30-104 (1027)
72 KOG3542 cAMP-regulated guanine 94.1 0.041 8.8E-07 58.7 3.4 57 389-447 560-618 (1283)
73 KOG3571 Dishevelled 3 and rela 93.0 0.15 3.2E-06 53.0 5.1 75 360-447 259-338 (626)
74 KOG3209 WW domain-containing p 92.4 0.18 4E-06 54.3 5.1 56 390-447 922-980 (984)
75 KOG3605 Beta amyloid precursor 92.1 0.32 6.8E-06 52.1 6.3 123 284-439 677-806 (829)
76 KOG2921 Intramembrane metallop 92.0 0.17 3.7E-06 51.1 4.0 51 384-434 213-264 (484)
77 KOG3834 Golgi reassembly stack 91.9 0.29 6.3E-06 50.0 5.6 71 389-460 13-86 (462)
78 KOG3651 Protein kinase C, alph 90.6 0.5 1.1E-05 46.1 5.6 55 391-446 30-87 (429)
79 KOG3606 Cell polarity protein 90.0 0.65 1.4E-05 44.6 5.7 57 390-447 193-252 (358)
80 KOG3834 Golgi reassembly stack 89.7 0.57 1.2E-05 47.9 5.4 67 395-461 113-181 (462)
81 KOG3549 Syntrophins (type gamm 88.7 0.6 1.3E-05 46.4 4.5 56 390-446 79-137 (505)
82 KOG3551 Syntrophins (type beta 88.1 0.6 1.3E-05 47.1 4.2 58 389-447 108-170 (506)
83 KOG0609 Calcium/calmodulin-dep 87.4 0.97 2.1E-05 47.6 5.4 55 392-447 147-204 (542)
84 KOG0606 Microtubule-associated 86.5 0.93 2E-05 51.7 5.0 52 393-445 660-713 (1205)
85 KOG1892 Actin filament-binding 86.4 0.85 1.8E-05 50.9 4.4 57 390-447 959-1018 (1629)
86 KOG3605 Beta amyloid precursor 83.0 2.3 5E-05 45.8 5.7 55 393-447 675-733 (829)
87 PF03510 Peptidase_C24: 2C end 80.9 5.8 0.00013 33.0 6.2 53 159-230 3-55 (105)
88 PF02395 Peptidase_S6: Immunog 79.7 27 0.00059 39.4 13.0 163 157-330 67-266 (769)
89 PF00947 Pico_P2A: Picornaviru 78.3 10 0.00022 32.6 7.0 35 273-308 76-110 (127)
90 PF02907 Peptidase_S29: Hepati 71.0 9 0.0002 33.2 5.0 128 157-323 14-146 (148)
91 PF01732 DUF31: Putative pepti 69.1 3.4 7.4E-05 42.5 2.5 24 282-305 350-373 (374)
92 PF11874 DUF3394: Domain of un 64.3 7.3 0.00016 35.8 3.4 32 388-419 119-150 (183)
93 KOG3938 RGS-GAIP interacting p 63.8 10 0.00022 36.6 4.3 55 393-447 151-209 (334)
94 cd00600 Sm_like The eukaryotic 59.8 24 0.00053 25.8 5.1 33 186-218 6-38 (63)
95 cd01726 LSm6 The eukaryotic Sm 57.2 24 0.00053 26.6 4.7 33 186-218 10-42 (67)
96 PRK00737 small nuclear ribonuc 56.9 26 0.00057 26.9 4.9 33 186-218 14-46 (72)
97 cd01731 archaeal_Sm1 The archa 56.5 27 0.00059 26.3 4.9 33 186-218 10-42 (68)
98 cd01722 Sm_F The eukaryotic Sm 55.7 25 0.00054 26.6 4.5 33 186-218 11-43 (68)
99 cd01732 LSm5 The eukaryotic Sm 55.5 25 0.00055 27.4 4.6 32 186-217 13-44 (76)
100 cd01730 LSm3 The eukaryotic Sm 54.9 23 0.0005 27.9 4.4 32 186-217 11-42 (82)
101 cd06168 LSm9 The eukaryotic Sm 52.4 34 0.00074 26.6 4.9 33 186-218 10-42 (75)
102 cd01719 Sm_G The eukaryotic Sm 51.9 36 0.00077 26.2 4.9 33 186-218 10-42 (72)
103 cd01717 Sm_B The eukaryotic Sm 51.6 32 0.00069 26.9 4.7 33 186-218 10-42 (79)
104 cd01729 LSm7 The eukaryotic Sm 51.5 34 0.00075 26.9 4.8 33 186-218 12-44 (81)
105 cd01728 LSm1 The eukaryotic Sm 50.4 38 0.00082 26.3 4.8 32 186-217 12-43 (74)
106 cd01720 Sm_D2 The eukaryotic S 49.2 36 0.00079 27.3 4.7 32 187-218 15-46 (87)
107 cd01735 LSm12_N LSm12 belongs 48.8 60 0.0013 24.2 5.4 33 187-219 7-39 (61)
108 cd01721 Sm_D3 The eukaryotic S 47.0 47 0.001 25.3 4.8 33 186-218 10-42 (70)
109 smart00651 Sm snRNP Sm protein 46.3 49 0.0011 24.5 4.8 33 186-218 8-40 (67)
110 PF05416 Peptidase_C37: Southa 45.4 1.4E+02 0.0031 31.0 9.2 136 155-310 379-529 (535)
111 cd01727 LSm8 The eukaryotic Sm 44.2 50 0.0011 25.4 4.7 33 186-218 9-41 (74)
112 PF01423 LSM: LSM domain ; In 43.5 61 0.0013 24.0 5.0 34 186-219 8-41 (67)
113 COG1958 LSM1 Small nuclear rib 40.8 57 0.0012 25.4 4.6 33 187-219 18-50 (79)
114 PF00571 CBS: CBS domain CBS d 40.8 29 0.00063 24.4 2.7 21 286-306 28-48 (57)
115 PF12381 Peptidase_C3G: Tungro 40.6 31 0.00068 32.5 3.4 55 275-330 168-228 (231)
116 cd01723 LSm4 The eukaryotic Sm 37.2 84 0.0018 24.3 5.0 33 186-218 11-43 (76)
117 KOG1738 Membrane-associated gu 36.4 54 0.0012 35.7 4.9 37 391-427 225-262 (638)
118 PF02743 Cache_1: Cache domain 36.0 47 0.001 25.5 3.4 32 291-332 19-50 (81)
119 cd01725 LSm2 The eukaryotic Sm 30.3 1.2E+02 0.0025 23.9 4.8 33 186-218 11-43 (81)
120 COG5233 GRH1 Peripheral Golgi 30.0 30 0.00066 34.4 1.7 30 395-424 67-96 (417)
121 cd01733 LSm10 The eukaryotic S 30.0 1.4E+02 0.003 23.3 5.1 32 187-218 20-51 (78)
122 PF14827 Cache_3: Sensory doma 29.3 55 0.0012 27.3 3.0 18 291-308 94-111 (116)
123 cd01724 Sm_D1 The eukaryotic S 27.9 1.4E+02 0.003 24.0 4.9 33 186-218 11-43 (90)
124 PF14438 SM-ATX: Ataxin 2 SM d 27.3 1.5E+02 0.0032 22.8 4.9 28 187-214 13-43 (77)
125 COG0260 PepB Leucyl aminopepti 23.6 55 0.0012 34.9 2.3 30 394-424 301-330 (485)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3.2e-49 Score=411.28 Aligned_cols=296 Identities=33% Similarity=0.567 Sum_probs=258.8
Q ss_pred hHHHHHHHhCCceEEEEcccc-------------cccc-------ccCCceeEEEEEeC-CCeEEeccccccCCCCCCCC
Q 012318 126 TIANAAARVCPAVVNLSAPRE-------------FLGI-------LSGRGIGSGAIVDA-DGTILTCAHVVVDFHGSRAL 184 (466)
Q Consensus 126 ~~~~~~~~~~~SVV~I~~~~~-------------~~~~-------~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~~~~~ 184 (466)
+++++++++.||||.|.+... +++. ....+.||||+|++ +||||||+|||.+.
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a------ 114 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQA------ 114 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCC------
Confidence 499999999999999986421 0110 01236899999985 69999999999986
Q ss_pred CCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecC
Q 012318 185 PKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRK 264 (466)
Q Consensus 185 ~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~ 264 (466)
..+.|++.||+.|+|++++.|+.+||||||++....+++++|+++..++.|++|+++|+|.+...+++.|+|++..+.
T Consensus 115 --~~i~V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~ 192 (455)
T PRK10139 115 --QKISIQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRS 192 (455)
T ss_pred --CEEEEEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcccccc
Confidence 689999999999999999999999999999986678999999999999999999999999999999999999988765
Q ss_pred ccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEecC---CCeeeEEEeHHHHHHHHHHHHHcCCcccccc
Q 012318 265 SSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQK 341 (466)
Q Consensus 265 ~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~---~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~ 341 (466)
... . .....++++|+.+++|||||||||.+|+||||+++.... ..+++|+||++.+++++++|+++|
T Consensus 193 ~~~--~-~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g------- 262 (455)
T PRK10139 193 GLN--L-EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG------- 262 (455)
T ss_pred ccC--C-CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC-------
Confidence 221 1 123568999999999999999999999999999987642 357999999999999999999999
Q ss_pred ccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCE
Q 012318 342 VPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGK 421 (466)
Q Consensus 342 ~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~ 421 (466)
++.++|||+.+.+++++..+.++.. ...|++|.+|.++|||+++|||+||+|++|||+
T Consensus 263 ----------------~v~r~~LGv~~~~l~~~~~~~lgl~------~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~ 320 (455)
T PRK10139 263 ----------------EIKRGLLGIKGTEMSADIAKAFNLD------VQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGK 320 (455)
T ss_pred ----------------cccccceeEEEEECCHHHHHhcCCC------CCCceEEEEECCCChHHHCCCCCCCEEEEECCE
Confidence 8889999999999999988877642 356999999999999999999999999999999
Q ss_pred ecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318 422 PVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA 462 (466)
Q Consensus 422 ~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~ 462 (466)
+|.+|+++...+.. +.|+++.++|.| +|+.+++++++...
T Consensus 321 ~V~s~~dl~~~l~~~~~g~~v~l~V~R-~G~~~~l~v~~~~~ 361 (455)
T PRK10139 321 PLNSFAELRSRIATTEPGTKVKLGLLR-NGKPLEVEVTLDTS 361 (455)
T ss_pred ECCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEECCC
Confidence 99999999988876 788999999999 89998888887543
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=1.2e-48 Score=395.71 Aligned_cols=297 Identities=37% Similarity=0.590 Sum_probs=258.1
Q ss_pred hhHHHHHHHhCCceEEEEcccccc---ccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE
Q 012318 125 DTIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG 201 (466)
Q Consensus 125 ~~~~~~~~~~~~SVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a 201 (466)
.++.++++++.||||.|.+..... ......+.||||+|+++||||||+|||.++ ..+.|.+.||+.++|
T Consensus 45 ~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~--------~~i~V~~~dg~~~~a 116 (351)
T TIGR02038 45 ISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA--------DQIVVALQDGRKFEA 116 (351)
T ss_pred hhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC--------CEEEEEECCCCEEEE
Confidence 369999999999999998753221 111234679999999999999999999885 689999999999999
Q ss_pred EEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEc
Q 012318 202 TVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTD 281 (466)
Q Consensus 202 ~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~ 281 (466)
+++++|+.+||||||++. ..+++++++++..++.|++|+++|||.+...+++.|+|+...+.... ......++++|
T Consensus 117 ~vv~~d~~~DlAvlkv~~-~~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd 192 (351)
T TIGR02038 117 ELVGSDPLTDLAVLKIEG-DNLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD 192 (351)
T ss_pred EEEEecCCCCEEEEEecC-CCCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence 999999999999999996 45888899888889999999999999999999999999988765321 11235689999
Q ss_pred ccCCCCCccceeecCCCeEEEEEEeEecC-----CCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeee
Q 012318 282 CAINAGNSGGPLVNIDGEIVGINIMKVAA-----ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILC 356 (466)
Q Consensus 282 ~~i~~G~SGGPlvd~~G~VVGI~~~~~~~-----~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~ 356 (466)
+.+++|||||||||.+|+||||+++.... ..+++|+||++.+++++++++++|
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g---------------------- 250 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG---------------------- 250 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----------------------
Confidence 99999999999999999999999876432 257899999999999999999999
Q ss_pred eeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-
Q 012318 357 RRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD- 435 (466)
Q Consensus 357 ~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~- 435 (466)
++.++|||+.+.++++...+.++.. ...|++|.+|.++|||+++||++||+|++|||++|.+++++.+++..
T Consensus 251 -~~~r~~lGv~~~~~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~ 323 (351)
T TIGR02038 251 -RVIRGYIGVSGEDINSVVAQGLGLP------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAET 323 (351)
T ss_pred -cccceEeeeEEEECCHHHHHhcCCC------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhc
Confidence 7889999999999998887776542 23699999999999999999999999999999999999999998876
Q ss_pred CCCCeEEEEEEECCCeEEEEEEEecCCC
Q 012318 436 RVGEPLKVVVQRANDQLVTLTVIPEEAN 463 (466)
Q Consensus 436 ~~g~~v~l~v~R~~g~~~~l~v~~~~~~ 463 (466)
+.|+++.++|.| +|+.+++++++.+.+
T Consensus 324 ~~g~~v~l~v~R-~g~~~~~~v~l~~~p 350 (351)
T TIGR02038 324 RPGSKVMVTVLR-QGKQLELPVTIDEKP 350 (351)
T ss_pred CCCCEEEEEEEE-CCEEEEEEEEecCCC
Confidence 788999999999 899999999887654
No 3
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=2.4e-48 Score=393.30 Aligned_cols=296 Identities=36% Similarity=0.561 Sum_probs=255.4
Q ss_pred hHHHHHHHhCCceEEEEcccccc---ccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEE
Q 012318 126 TIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGT 202 (466)
Q Consensus 126 ~~~~~~~~~~~SVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~ 202 (466)
++.++++++.||||.|.+..... +.....+.||||+|+++||||||+|||.++ ..+.|++.||+.|+|+
T Consensus 46 ~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~ 117 (353)
T PRK10898 46 SYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA--------DQIIVALQDGRVFEAL 117 (353)
T ss_pred hHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC--------CEEEEEeCCCCEEEEE
Confidence 59999999999999999854321 111234689999999999999999999985 6899999999999999
Q ss_pred EEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcc
Q 012318 203 VLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDC 282 (466)
Q Consensus 203 vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~ 282 (466)
++++|+.+||||||++. ..+++++++++..++.|++|+++|||.+...+++.|+|++..+.... ......++++|+
T Consensus 118 vv~~d~~~DlAvl~v~~-~~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtda 193 (353)
T PRK10898 118 LVGSDSLTDLAVLKINA-TNLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTDA 193 (353)
T ss_pred EEEEcCCCCEEEEEEcC-CCCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEecc
Confidence 99999999999999985 46888999888889999999999999998889999999987764321 112246899999
Q ss_pred cCCCCCccceeecCCCeEEEEEEeEecC------CCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeee
Q 012318 283 AINAGNSGGPLVNIDGEIVGINIMKVAA------ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILC 356 (466)
Q Consensus 283 ~i~~G~SGGPlvd~~G~VVGI~~~~~~~------~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~ 356 (466)
.+++|||||||+|.+|+||||+++.... ..+++|+||++.+++++++++++|
T Consensus 194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G---------------------- 251 (353)
T PRK10898 194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG---------------------- 251 (353)
T ss_pred ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----------------------
Confidence 9999999999999999999999976542 247899999999999999999999
Q ss_pred eeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-
Q 012318 357 RRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD- 435 (466)
Q Consensus 357 ~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~- 435 (466)
++.++|||+.+.++++.....++. ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+.+..
T Consensus 252 -~~~~~~lGi~~~~~~~~~~~~~~~------~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~ 324 (353)
T PRK10898 252 -RVIRGYIGIGGREIAPLHAQGGGI------DQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEI 324 (353)
T ss_pred -cccccccceEEEECCHHHHHhcCC------CCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhc
Confidence 888999999999887765544322 234799999999999999999999999999999999999999888876
Q ss_pred CCCCeEEEEEEECCCeEEEEEEEecCCC
Q 012318 436 RVGEPLKVVVQRANDQLVTLTVIPEEAN 463 (466)
Q Consensus 436 ~~g~~v~l~v~R~~g~~~~l~v~~~~~~ 463 (466)
+.|+++.++|.| +|+.+++.+++.+.+
T Consensus 325 ~~g~~v~l~v~R-~g~~~~~~v~l~~~p 351 (353)
T PRK10898 325 RPGSVIPVVVMR-DDKQLTLQVTIQEYP 351 (353)
T ss_pred CCCCEEEEEEEE-CCEEEEEEEEeccCC
Confidence 788999999999 899999999887665
No 4
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.7e-46 Score=390.91 Aligned_cols=295 Identities=41% Similarity=0.685 Sum_probs=257.3
Q ss_pred HHHHHHHhCCceEEEEcccc----------------cccc-----------ccCCceeEEEEEeCCCeEEeccccccCCC
Q 012318 127 IANAAARVCPAVVNLSAPRE----------------FLGI-----------LSGRGIGSGAIVDADGTILTCAHVVVDFH 179 (466)
Q Consensus 127 ~~~~~~~~~~SVV~I~~~~~----------------~~~~-----------~~~~~~GSGfiI~~~G~ILT~aHvv~~~~ 179 (466)
+.++++++.||||.|.+... +++. ....+.||||+|+++|+||||+||+.++
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~- 81 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA- 81 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC-
Confidence 78999999999999986420 0000 0124679999999999999999999986
Q ss_pred CCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEe
Q 012318 180 GSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVS 259 (466)
Q Consensus 180 ~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs 259 (466)
..+.|.+.+++.++|++++.|+.+|||||+++....++++.|+++..++.|++|+++|||.+...+++.|+|+
T Consensus 82 -------~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs 154 (428)
T TIGR02037 82 -------DEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVS 154 (428)
T ss_pred -------CeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEE
Confidence 6899999999999999999999999999999976679999999888899999999999999999999999999
Q ss_pred eeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEec---CCCeeeEEEeHHHHHHHHHHHHHcCCc
Q 012318 260 CVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVA---AADGLSFAVPIDSAAKIIEQFKKNGWM 336 (466)
Q Consensus 260 ~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~---~~~g~~~aip~~~i~~~l~~l~~~g~~ 336 (466)
...+.... ...+..++++|+.+++|+|||||||.+|+||||++.... ...+++|+||++.+++++++|+++|
T Consensus 155 ~~~~~~~~---~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g-- 229 (428)
T TIGR02037 155 ALGRSGLG---IGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG-- 229 (428)
T ss_pred ecccCccC---CCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC--
Confidence 87765311 122346899999999999999999999999999988654 2457899999999999999999999
Q ss_pred cccccccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEE
Q 012318 337 HVEQKVPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVI 416 (466)
Q Consensus 337 ~~~~~~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~ 416 (466)
++.++|||+.+..++++.++.++.. ...|++|.+|.++|||+++||++||+|+
T Consensus 230 ---------------------~~~~~~lGi~~~~~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~ 282 (428)
T TIGR02037 230 ---------------------KVQRGWLGVTIQEVTSDLAKSLGLE------KQRGALVAQVLPGSPAEKAGLKAGDVIL 282 (428)
T ss_pred ---------------------cCcCCcCceEeecCCHHHHHHcCCC------CCCceEEEEccCCCChHHcCCCCCCEEE
Confidence 7889999999999999988888642 2479999999999999999999999999
Q ss_pred EECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318 417 KFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA 462 (466)
Q Consensus 417 ~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~ 462 (466)
+|||++|.++.++..++.. ..|+.++++|.| +|+.+++++++...
T Consensus 283 ~Vng~~i~~~~~~~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~l~~~ 328 (428)
T TIGR02037 283 SVNGKPISSFADLRRAIGTLKPGKKVTLGILR-KGKEKTITVTLGAS 328 (428)
T ss_pred EECCEEcCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEECcC
Confidence 9999999999999988876 678999999999 89998888887543
No 5
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=2.3e-46 Score=391.71 Aligned_cols=295 Identities=35% Similarity=0.566 Sum_probs=256.1
Q ss_pred hHHHHHHHhCCceEEEEccccc--------------ccc---c--------------------------cCCceeEEEEE
Q 012318 126 TIANAAARVCPAVVNLSAPREF--------------LGI---L--------------------------SGRGIGSGAIV 162 (466)
Q Consensus 126 ~~~~~~~~~~~SVV~I~~~~~~--------------~~~---~--------------------------~~~~~GSGfiI 162 (466)
+++++++++.||||.|.+.... ++. + ...+.||||+|
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 4999999999999999863310 110 0 01357999999
Q ss_pred eC-CCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEE
Q 012318 163 DA-DGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVV 241 (466)
Q Consensus 163 ~~-~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~ 241 (466)
++ +||||||+||+.+. ..+.|++.||+.|+|++++.|+.+||||||++...++++++|+++..+++|++|+
T Consensus 119 ~~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~ 190 (473)
T PRK10942 119 DADKGYVVTNNHVVDNA--------TKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTV 190 (473)
T ss_pred ECCCCEEEeChhhcCCC--------CEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEE
Confidence 96 59999999999986 6899999999999999999999999999999866789999999999999999999
Q ss_pred EEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEecC---CCeeeEEE
Q 012318 242 AMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAV 318 (466)
Q Consensus 242 ~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~---~~g~~~ai 318 (466)
++|+|.+...+++.|+|++..+.... . ..+..++++|+.+++|+|||||+|.+|+||||+++.... ..+.+|+|
T Consensus 191 aiG~P~g~~~tvt~GiVs~~~r~~~~--~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaI 267 (473)
T PRK10942 191 AIGNPYGLGETVTSGIVSALGRSGLN--V-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAI 267 (473)
T ss_pred EEcCCCCCCcceeEEEEEEeecccCC--c-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEE
Confidence 99999999999999999988765211 1 123468999999999999999999999999999987642 34689999
Q ss_pred eHHHHHHHHHHHHHcCCccccccccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeeccc
Q 012318 319 PIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVV 398 (466)
Q Consensus 319 p~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V 398 (466)
|++.+++++++|+++| ++.++|||+.+.++++++.+.++.. ...|++|.+|
T Consensus 268 P~~~~~~v~~~l~~~g-----------------------~v~rg~lGv~~~~l~~~~a~~~~l~------~~~GvlV~~V 318 (473)
T PRK10942 268 PSNMVKNLTSQMVEYG-----------------------QVKRGELGIMGTELNSELAKAMKVD------AQRGAFVSQV 318 (473)
T ss_pred EHHHHHHHHHHHHhcc-----------------------ccccceeeeEeeecCHHHHHhcCCC------CCCceEEEEE
Confidence 9999999999999999 8889999999999999988777542 3579999999
Q ss_pred CCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 012318 399 TPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEE 461 (466)
Q Consensus 399 ~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~ 461 (466)
.++|||+++||++||+|++|||++|.+++++...+.. ..|++++++|.| +|+.+++.+++..
T Consensus 319 ~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~v~R-~G~~~~v~v~l~~ 381 (473)
T PRK10942 319 LPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKLTLGLLR-DGKPVNVNVELQQ 381 (473)
T ss_pred CCCChHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEE-CCeEEEEEEEeCc
Confidence 9999999999999999999999999999999988876 678899999999 8998888887754
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=3e-36 Score=305.96 Aligned_cols=295 Identities=39% Similarity=0.627 Sum_probs=256.8
Q ss_pred hhHHHHHHHhCCceEEEEccccccc---------cccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC
Q 012318 125 DTIANAAARVCPAVVNLSAPREFLG---------ILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD 195 (466)
Q Consensus 125 ~~~~~~~~~~~~SVV~I~~~~~~~~---------~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~ 195 (466)
..+..+++++.|+||.+........ .....+.||||+++.+|||+|+.|++.++ .++.+.+.|
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a--------~~i~v~l~d 104 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGA--------EEITVTLAD 104 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCc--------ceEEEEeCC
Confidence 4699999999999999987542211 00014789999999999999999999985 689999999
Q ss_pred CcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccc
Q 012318 196 GRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRR 275 (466)
Q Consensus 196 g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~ 275 (466)
|+.+++++++.|+..|+|+|+++....++.+.++++..++.|++++++|+|+++..+++.|+++...+. ..+......
T Consensus 105 g~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~--~v~~~~~~~ 182 (347)
T COG0265 105 GREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT--GVGSAGGYV 182 (347)
T ss_pred CCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc--cccCccccc
Confidence 999999999999999999999997544888899999999999999999999999999999999998886 222212256
Q ss_pred cEEEEcccCCCCCccceeecCCCeEEEEEEeEecCCC---eeeEEEeHHHHHHHHHHHHHcCCcccccccccccccccee
Q 012318 276 EYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAAD---GLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQV 352 (466)
Q Consensus 276 ~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~---g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~ 352 (466)
.++|+|+.+++|+||||++|.+|++|||++....... +++|++|++.+..++.+++++|
T Consensus 183 ~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G------------------ 244 (347)
T COG0265 183 NFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG------------------ 244 (347)
T ss_pred chhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC------------------
Confidence 7899999999999999999999999999999876543 5899999999999999999999
Q ss_pred eeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHH
Q 012318 353 VILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEI 432 (466)
Q Consensus 353 ~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~ 432 (466)
++.++|+|+.+..++.+.. ++ + ....|++|..|.+++||+++|++.||+|+++||+++.+..++...
T Consensus 245 -----~v~~~~lgv~~~~~~~~~~--~g-----~-~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~ 311 (347)
T COG0265 245 -----KVVRGYLGVIGEPLTADIA--LG-----L-PVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAA 311 (347)
T ss_pred -----CccccccceEEEEcccccc--cC-----C-CCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHH
Confidence 8899999999988776554 22 2 246789999999999999999999999999999999999999988
Q ss_pred Hhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 012318 433 MGD-RVGEPLKVVVQRANDQLVTLTVIPEE 461 (466)
Q Consensus 433 l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~ 461 (466)
+.. .+|+.+.+++.| +|+..++.++..+
T Consensus 312 v~~~~~g~~v~~~~~r-~g~~~~~~v~l~~ 340 (347)
T COG0265 312 VASNRPGDEVALKLLR-GGKERELAVTLGD 340 (347)
T ss_pred HhccCCCCEEEEEEEE-CCEEEEEEEEecC
Confidence 876 679999999999 7999999998876
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=5.1e-28 Score=246.23 Aligned_cols=327 Identities=32% Similarity=0.429 Sum_probs=258.2
Q ss_pred chhHHHHHHHhCCceEEEEcccccccc------ccCCceeEEEEEeCCCeEEeccccccCCCCCCCCC---CceEEEEeC
Q 012318 124 RDTIANAAARVCPAVVNLSAPREFLGI------LSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALP---KGKVDVTLQ 194 (466)
Q Consensus 124 ~~~~~~~~~~~~~SVV~I~~~~~~~~~------~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~---~~~i~V~~~ 194 (466)
..-++.+.++-.+++|.|....-..+. .-....||||+++.+|+++||+||+.......... -..+.+...
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa 206 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA 206 (473)
T ss_pred hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence 456888999999999999963322111 13456799999999999999999997643221111 123566665
Q ss_pred CC--cEEEEEEEEecCCCCEEEEEeCCCC-CCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCC
Q 012318 195 DG--RTFEGTVLNADFHSDIAIVKINSKT-PLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG 271 (466)
Q Consensus 195 ~g--~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~ 271 (466)
+| ..+++.+.+.|+..|+|+++++.+. .+++++++.+..+..|+++..+|.|++..++.+.|+++...|.....+..
T Consensus 207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence 55 8999999999999999999997653 37888898889999999999999999999999999999998887665554
Q ss_pred --CccccEEEEcccCCCCCccceeecCCCeEEEEEEeEecC---CCeeeEEEeHHHHHHHHHHHHHcCCccccccccccc
Q 012318 272 --GMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLW 346 (466)
Q Consensus 272 --~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~---~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~ 346 (466)
....+++++++.++.|+||||++|.+|++||+++..... ..+.+|++|.+.++.++.+..+.. +.. ...
T Consensus 287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~-~~l-r~~---- 360 (473)
T KOG1320|consen 287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQ-ISL-RPV---- 360 (473)
T ss_pred cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhc-eee-ccc----
Confidence 566789999999999999999999999999998887542 357899999999999988886544 000 000
Q ss_pred cccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCC-CCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCC
Q 012318 347 STCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFP-NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS 425 (466)
Q Consensus 347 ~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~-~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~ 425 (466)
+. ..+.+.|+|..+..+...+......+.+.++ ....+++|..|.+++++...++++||+|++|||++|++
T Consensus 361 ~~--------~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n 432 (473)
T KOG1320|consen 361 KP--------LVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKN 432 (473)
T ss_pred cC--------cccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeec
Confidence 00 0234679999888888777666655555554 34468999999999999999999999999999999999
Q ss_pred HHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCCCCC
Q 012318 426 ITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEANPD 465 (466)
Q Consensus 426 ~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~~~~ 465 (466)
..++.+++.. ..++++.+...| +.+..++.+.++...+.
T Consensus 433 ~~~l~~~i~~~~~~~~v~vl~~~-~~e~~tl~Il~~~~~p~ 472 (473)
T KOG1320|consen 433 LKHLYELIEECSTEDKVAVLDRR-SAEDATLEILPEHKIPS 472 (473)
T ss_pred hHHHHHHHHhcCcCceEEEEEec-CccceeEEecccccCCC
Confidence 9999999987 556677777777 77889999988876554
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.94 E-value=3.6e-25 Score=227.13 Aligned_cols=304 Identities=22% Similarity=0.323 Sum_probs=243.6
Q ss_pred hhHHHHHHHhCCceEEEEccc--cccccccCCceeEEEEEeCC-CeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE
Q 012318 125 DTIANAAARVCPAVVNLSAPR--EFLGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG 201 (466)
Q Consensus 125 ~~~~~~~~~~~~SVV~I~~~~--~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a 201 (466)
.+|...+.++-+|||.|.... .++-...+...||||++++. |+||||+|++... .-.-.+.|.+....+.
T Consensus 52 e~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pg-------P~va~avf~n~ee~ei 124 (955)
T KOG1421|consen 52 EDWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPG-------PFVASAVFDNHEEIEI 124 (955)
T ss_pred hhhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCC-------CceeEEEecccccCCc
Confidence 379999999999999999754 23334466788999999987 8999999999753 1345666777777777
Q ss_pred EEEEecCCCCEEEEEeCCC----CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCC---CCcc
Q 012318 202 TVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGL---GGMR 274 (466)
Q Consensus 202 ~vv~~d~~~DlAlLkv~~~----~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~---~~~~ 274 (466)
-.++.|+-||+.+++.+.. ..+..+.+.. .-.+.|.+++++|+..+...++..|.++.+++...+++. +...
T Consensus 125 ~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfn 203 (955)
T KOG1421|consen 125 YPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFN 203 (955)
T ss_pred ccccCCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCcccccccccccc
Confidence 7889999999999999864 2233333432 346789999999998888888889999999998877643 3334
Q ss_pred ccEEEEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeee
Q 012318 275 REYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVI 354 (466)
Q Consensus 275 ~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~ 354 (466)
..++|.......|.||+|++|.+|..|.++..+... .+.+|++|++.+.+-|.-++++.
T Consensus 204 Tfy~QaasstsggssgspVv~i~gyAVAl~agg~~s-sas~ffLpLdrV~RaL~clq~n~-------------------- 262 (955)
T KOG1421|consen 204 TFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS-SASDFFLPLDRVVRALRCLQNNT-------------------- 262 (955)
T ss_pred ceeeeehhcCCCCCCCCceecccceEEeeecCCccc-ccccceeeccchhhhhhhhhcCC--------------------
Confidence 468899999999999999999999999999886543 45789999999999999999888
Q ss_pred eeeeeecccccceeecCCHHHHHHhhccC-------CCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHH
Q 012318 355 LCRRVVRPWLGLKMLDLNDMIIAQLKERD-------PSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSIT 427 (466)
Q Consensus 355 ~~~~~~~~~lG~~~~~~~~~~~~~~~~~~-------~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~ 427 (466)
.+.|+.|-+++..-.-+..+++++.+ ..++....=++|..|.+++||++. |++||++++||+.-+.++.
T Consensus 263 ---PItRGtLqvefl~k~~de~rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~GDillavN~t~l~df~ 338 (955)
T KOG1421|consen 263 ---PITRGTLQVEFLHKLFDECRRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEPGDILLAVNSTCLNDFE 338 (955)
T ss_pred ---CcccceEEEEEehhhhHHHHhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCCCcEEEEEcceehHHHH
Confidence 55677777776665555555555444 245543334456789999999998 9999999999999999999
Q ss_pred HHHHHHhcCCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318 428 EIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPEEA 462 (466)
Q Consensus 428 ~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~~ 462 (466)
++.++|.+..|+.+.|+|+| +|++.+++++.++.
T Consensus 339 ~l~~iLDegvgk~l~LtI~R-ggqelel~vtvqdl 372 (955)
T KOG1421|consen 339 ALEQILDEGVGKNLELTIQR-GGQELELTVTVQDL 372 (955)
T ss_pred HHHHHHhhccCceEEEEEEe-CCEEEEEEEEeccc
Confidence 99999999999999999999 89999998887654
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.69 E-value=2.4e-16 Score=134.29 Aligned_cols=117 Identities=35% Similarity=0.570 Sum_probs=77.9
Q ss_pred eEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEE--EEEEEecCC-CCEEEEEeCCCCCCCccccCCCCC
Q 012318 157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE--GTVLNADFH-SDIAIVKINSKTPLPAAKLGTSSK 233 (466)
Q Consensus 157 GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~l~~s~~ 233 (466)
||||+|+++|+||||+||+.+...........+.+...++..+. +++++.++. +|+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~--------------- 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVD--------------- 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEES---------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEe---------------
Confidence 89999999999999999998754333234578999999998888 999999999 999999999
Q ss_pred CCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEE
Q 012318 234 LCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGI 303 (466)
Q Consensus 234 ~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI 303 (466)
.....+... ............... ......+ +++.+.+|+|||||||.+|+||||
T Consensus 66 -----~~~~~~~~~-----~~~~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 66 -----PWTGVGGGV-----RVPGSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -----CEEEEEEEE-----EEEEEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -----cccceeeee-----EeeeeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 000000000 000000000000000 0001123 799999999999999999999997
No 10
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.59 E-value=5e-14 Score=145.81 Aligned_cols=295 Identities=17% Similarity=0.168 Sum_probs=196.5
Q ss_pred HHhCCceEEEEccccc--cccccCCceeEEEEEeCC-CeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecC
Q 012318 132 ARVCPAVVNLSAPREF--LGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADF 208 (466)
Q Consensus 132 ~~~~~SVV~I~~~~~~--~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~ 208 (466)
+++..+.|.+....+. ++.......|||.|++.. |+++++..++.... ...+|++.|...++|.+.+.|+
T Consensus 525 ~~i~~~~~~v~~~~~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~-------~d~~vt~~dS~~i~a~~~fL~~ 597 (955)
T KOG1421|consen 525 ADISNCLVDVEPMMPVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDA-------KDQRVTEADSDGIPANVSFLHP 597 (955)
T ss_pred hHHhhhhhhheeceeeccccchhhhhcCceEEEEccCCceeEecccCCchh-------hceEEeecccccccceeeEecC
Confidence 4555666666654332 233334467999999977 89999999997542 6788999988889999999999
Q ss_pred CCCEEEEEeCCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEee---ee-cCccCCCCCCccccEEEEcccC
Q 012318 209 HSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSC---VD-RKSSDLGLGGMRREYLQTDCAI 284 (466)
Q Consensus 209 ~~DlAlLkv~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~---~~-~~~~~~~~~~~~~~~i~~~~~i 284 (466)
..++|.++.+... ...++|.+ ..+..|+++...|+............++. +. .......+.....+.|..++..
T Consensus 598 t~n~a~~kydp~~-~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nl 675 (955)
T KOG1421|consen 598 TENVASFKYDPAL-EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNL 675 (955)
T ss_pred ccceeEeccChhH-hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEeccc
Confidence 9999999999543 34455644 46788999999999876543222222221 11 1111122333345677777777
Q ss_pred CCCCccceeecCCCeEEEEEEeEecCC-C----eeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeeeeee
Q 012318 285 NAGNSGGPLVNIDGEIVGINIMKVAAA-D----GLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRV 359 (466)
Q Consensus 285 ~~G~SGGPlvd~~G~VVGI~~~~~~~~-~----g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~ 359 (466)
..++--|-+.|.+|+|+|+.-....+. . ..-|.+.+..++++|++|+.++... + +..+++- .
T Consensus 676 sT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-----p----~i~~vef----~ 742 (955)
T KOG1421|consen 676 STSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-----P----TIAGVEF----S 742 (955)
T ss_pred cccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-----c----eeeccce----e
Confidence 666666788899999999987765422 1 2457789999999999999888110 0 0000000 0
Q ss_pred ecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCC
Q 012318 360 VRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGE 439 (466)
Q Consensus 360 ~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~ 439 (466)
.-...+.+...++.+++.+++... -....-++|+.|.+..+ +. |..||+|+++||+-|+.+.|+.+..
T Consensus 743 ~i~laqar~lglp~e~imk~e~es---~~~~ql~~ishv~~~~~--ki-l~~gdiilsvngk~itr~~dl~d~~------ 810 (955)
T KOG1421|consen 743 HITLAQARTLGLPSEFIMKSEEES---TIPRQLYVISHVRPLLH--KI-LGVGDIILSVNGKMITRLSDLHDFE------ 810 (955)
T ss_pred eEEeehhhccCCCHHHHhhhhhcC---CCcceEEEEEeeccCcc--cc-cccccEEEEecCeEEeeehhhhhhh------
Confidence 011223344445666655553321 12344567888877543 44 9999999999999999999998733
Q ss_pred eEEEEEEECCCeEEEEEEEecC
Q 012318 440 PLKVVVQRANDQLVTLTVIPEE 461 (466)
Q Consensus 440 ~v~l~v~R~~g~~~~l~v~~~~ 461 (466)
.+...|.| +|..+++.+...+
T Consensus 811 eid~~ilr-dg~~~~ikipt~p 831 (955)
T KOG1421|consen 811 EIDAVILR-DGIEMEIKIPTYP 831 (955)
T ss_pred hhheeeee-cCcEEEEEecccc
Confidence 47789999 8998888876644
No 11
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.53 E-value=4.9e-14 Score=112.81 Aligned_cols=81 Identities=35% Similarity=0.655 Sum_probs=69.5
Q ss_pred ccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCe
Q 012318 362 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP 440 (466)
Q Consensus 362 ~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~ 440 (466)
||||+.+....+ ..|++|.+|.++|||+++||++||+|++|||++|.++.++..++.. .+|++
T Consensus 1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~ 64 (82)
T PF13180_consen 1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDT 64 (82)
T ss_dssp -E-SEEEEECSC----------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred CEECeEEEEccC----------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence 589998876431 4699999999999999999999999999999999999999998854 89999
Q ss_pred EEEEEEECCCeEEEEEEEe
Q 012318 441 LKVVVQRANDQLVTLTVIP 459 (466)
Q Consensus 441 v~l~v~R~~g~~~~l~v~~ 459 (466)
++|+|.| +|+.+++++++
T Consensus 65 v~l~v~R-~g~~~~~~v~l 82 (82)
T PF13180_consen 65 VTLTVLR-DGEELTVEVTL 82 (82)
T ss_dssp EEEEEEE-TTEEEEEEEE-
T ss_pred EEEEEEE-CCEEEEEEEEC
Confidence 9999999 99999988864
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.47 E-value=2.6e-12 Score=120.56 Aligned_cols=177 Identities=21% Similarity=0.294 Sum_probs=117.1
Q ss_pred CCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeC-------CC--cEEEEEEEE
Q 012318 135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQ-------DG--RTFEGTVLN 205 (466)
Q Consensus 135 ~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~-------~g--~~~~a~vv~ 205 (466)
.|.+|.|..... ...|+|++|+++ +|||++||+... ..+.+.+. ++ ..+..+-+.
T Consensus 12 ~p~~v~i~~~~~-------~~~C~G~li~~~-~vLTaahC~~~~--------~~~~v~~g~~~~~~~~~~~~~~~v~~~~ 75 (220)
T PF00089_consen 12 FPWVVSIRYSNG-------RFFCTGTLISPR-WVLTAAHCVDGA--------SDIKVRLGTYSIRNSDGSEQTIKVSKII 75 (220)
T ss_dssp STTEEEEEETTT-------EEEEEEEEEETT-EEEEEGGGHTSG--------GSEEEEESESBTTSTTTTSEEEEEEEEE
T ss_pred CCeEEEEeeCCC-------CeeEeEEecccc-cccccccccccc--------cccccccccccccccccccccccccccc
Confidence 367788876442 467999999988 999999999872 34444332 22 344444443
Q ss_pred ec----C---CCCEEEEEeCCC----CCCCccccCC-CCCCCCCCEEEEEecCCCCC----CceEEeEEeeeecCccCCC
Q 012318 206 AD----F---HSDIAIVKINSK----TPLPAAKLGT-SSKLCPGDWVVAMGCPHSLQ----NTVTAGIVSCVDRKSSDLG 269 (466)
Q Consensus 206 ~d----~---~~DlAlLkv~~~----~~~~~~~l~~-s~~~~~G~~V~~iG~p~~~~----~~~t~G~Vs~~~~~~~~~~ 269 (466)
.+ . .+|||||+|+.+ ..+.++.+.. ...+..|+.+.++||+.... ..+....+.......+...
T Consensus 76 ~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~ 155 (220)
T PF00089_consen 76 IHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSS 155 (220)
T ss_dssp EETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHH
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 32 2 579999999976 3456666655 23457899999999997533 2455555544444332211
Q ss_pred CCC-ccccEEEEcc----cCCCCCccceeecCCCeEEEEEEeEecCCC--eeeEEEeHHHHHHHH
Q 012318 270 LGG-MRREYLQTDC----AINAGNSGGPLVNIDGEIVGINIMKVAAAD--GLSFAVPIDSAAKII 327 (466)
Q Consensus 270 ~~~-~~~~~i~~~~----~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~--g~~~aip~~~i~~~l 327 (466)
+.. .....+.... ..|.|+|||||+..++.|+||++.+..... ...++.+++.+.+|+
T Consensus 156 ~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 156 YNDNLTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp TTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 111 2234555554 789999999999877779999999843222 247889988888775
No 13
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.39 E-value=2.5e-11 Score=114.69 Aligned_cols=181 Identities=23% Similarity=0.288 Sum_probs=111.1
Q ss_pred CCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC---------CcEEEEEEEE
Q 012318 135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------GRTFEGTVLN 205 (466)
Q Consensus 135 ~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~---------g~~~~a~vv~ 205 (466)
.|.+|.|.... ....|+|++|+++ +|||+|||+.+.. ...+.|.+.. ...+..+-+.
T Consensus 12 ~Pw~v~i~~~~-------~~~~C~GtlIs~~-~VLTaAhC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~~v~~~~ 77 (232)
T cd00190 12 FPWQVSLQYTG-------GRHFCGGSLISPR-WVLTAAHCVYSSA------PSNYTVRLGSHDLSSNEGGGQVIKVKKVI 77 (232)
T ss_pred CCCEEEEEccC-------CcEEEEEEEeeCC-EEEECHHhcCCCC------CccEEEEeCcccccCCCCceEEEEEEEEE
Confidence 57788886542 2367999999988 9999999997632 1244444421 2233344444
Q ss_pred ec-------CCCCEEEEEeCCCC----CCCccccCCCC-CCCCCCEEEEEecCCCCCC-----ceEEeEEeeeecCccCC
Q 012318 206 AD-------FHSDIAIVKINSKT----PLPAAKLGTSS-KLCPGDWVVAMGCPHSLQN-----TVTAGIVSCVDRKSSDL 268 (466)
Q Consensus 206 ~d-------~~~DlAlLkv~~~~----~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~-----~~t~G~Vs~~~~~~~~~ 268 (466)
.+ ..+|||||+|+.+. .+.++.|.... .+..|+.+.+.||...... .+....+..+....+..
T Consensus 78 ~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~ 157 (232)
T cd00190 78 VHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKR 157 (232)
T ss_pred ECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhh
Confidence 44 35799999999752 25666675543 5678899999999765332 23333333332222221
Q ss_pred CCC---CccccEEEE-----cccCCCCCccceeecCC---CeEEEEEEeEecCC--CeeeEEEeHHHHHHHHHH
Q 012318 269 GLG---GMRREYLQT-----DCAINAGNSGGPLVNID---GEIVGINIMKVAAA--DGLSFAVPIDSAAKIIEQ 329 (466)
Q Consensus 269 ~~~---~~~~~~i~~-----~~~i~~G~SGGPlvd~~---G~VVGI~~~~~~~~--~g~~~aip~~~i~~~l~~ 329 (466)
... ......+.. ....|.|+|||||+... +.++||.+++..-. .....+..+....+|+++
T Consensus 158 ~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~ 231 (232)
T cd00190 158 AYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQK 231 (232)
T ss_pred hccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhc
Confidence 111 011122222 34578999999999654 89999999875421 233455667777777754
No 14
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.34 E-value=1.1e-11 Score=100.54 Aligned_cols=88 Identities=36% Similarity=0.692 Sum_probs=74.3
Q ss_pred ccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCe
Q 012318 362 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP 440 (466)
Q Consensus 362 ~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~ 440 (466)
+|+|+.+.++++.....+.. ....|++|.+|.++|||+++||++||+|++|||+++.++.++.+++.. ..++.
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~------~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~ 74 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGL------KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDK 74 (90)
T ss_pred CccceEEeECCHHHHHHcCC------CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCE
Confidence 58999999999876554321 234699999999999999999999999999999999999999988876 45788
Q ss_pred EEEEEEECCCeEEEEE
Q 012318 441 LKVVVQRANDQLVTLT 456 (466)
Q Consensus 441 v~l~v~R~~g~~~~l~ 456 (466)
+.+++.| +|+..++.
T Consensus 75 i~l~v~r-~g~~~~~~ 89 (90)
T cd00987 75 VTLTVLR-GGKELTVT 89 (90)
T ss_pred EEEEEEE-CCEEEEee
Confidence 9999999 78776554
No 15
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.27 E-value=3.6e-11 Score=95.49 Aligned_cols=69 Identities=26% Similarity=0.419 Sum_probs=62.7
Q ss_pred CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 012318 389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 458 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 458 (466)
...|++|..|.++|||+++||++||+|++|||+++.+|+++..++.. ..|+.+.+++.| +|+.++++++
T Consensus 8 ~~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r-~g~~~~~~~~ 77 (79)
T cd00991 8 AVAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLP-STTKLTNVST 77 (79)
T ss_pred cCCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEE
Confidence 35799999999999999999999999999999999999999998876 468899999999 8888887765
No 16
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.22 E-value=2.7e-10 Score=107.87 Aligned_cols=161 Identities=24% Similarity=0.324 Sum_probs=97.7
Q ss_pred hCCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCC--------cEEEEEEEE
Q 012318 134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDG--------RTFEGTVLN 205 (466)
Q Consensus 134 ~~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g--------~~~~a~vv~ 205 (466)
..|-+|.|.... ....|+|++|+++ +|||+|||+.+.. ...+.|.+... ..+...-+.
T Consensus 12 ~~Pw~~~i~~~~-------~~~~C~GtlIs~~-~VLTaahC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~v~~~~ 77 (229)
T smart00020 12 SFPWQVSLQYRG-------GRHFCGGSLISPR-WVLTAAHCVYGSD------PSNIRVRLGSHDLSSGEEGQVIKVSKVI 77 (229)
T ss_pred CCCcEEEEEEcC-------CCcEEEEEEecCC-EEEECHHHcCCCC------CcceEEEeCcccCCCCCCceEEeeEEEE
Confidence 356677776432 2467999999988 9999999998642 12455555432 233444444
Q ss_pred ec-------CCCCEEEEEeCCC----CCCCccccCCC-CCCCCCCEEEEEecCCCCC------CceEEeEEeeeecCccC
Q 012318 206 AD-------FHSDIAIVKINSK----TPLPAAKLGTS-SKLCPGDWVVAMGCPHSLQ------NTVTAGIVSCVDRKSSD 267 (466)
Q Consensus 206 ~d-------~~~DlAlLkv~~~----~~~~~~~l~~s-~~~~~G~~V~~iG~p~~~~------~~~t~G~Vs~~~~~~~~ 267 (466)
.+ ..+|||||+|+.+ ..+.++.+... ..+..++.+.+.||+.... .......+.......+.
T Consensus 78 ~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~ 157 (229)
T smart00020 78 IHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCR 157 (229)
T ss_pred ECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhh
Confidence 33 4679999999875 23456666543 3566789999999987542 12223333322222222
Q ss_pred CCCCC---ccccEEEE-----cccCCCCCccceeecCCC--eEEEEEEeEe
Q 012318 268 LGLGG---MRREYLQT-----DCAINAGNSGGPLVNIDG--EIVGINIMKV 308 (466)
Q Consensus 268 ~~~~~---~~~~~i~~-----~~~i~~G~SGGPlvd~~G--~VVGI~~~~~ 308 (466)
..+.. .....+.. ....|.|+||||++...+ .++||++.+.
T Consensus 158 ~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 158 RAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred hhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 11100 01111221 356789999999996543 9999999875
No 17
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21 E-value=1e-10 Score=92.82 Aligned_cols=68 Identities=24% Similarity=0.434 Sum_probs=58.0
Q ss_pred CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318 390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 460 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 460 (466)
..+++|.+|.++|||+++||++||+|++|||+++.+|.++.+.+ ..++.+.+++.| +|+..++.+++.
T Consensus 11 ~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r-~g~~~~~~v~~~ 78 (80)
T cd00990 11 EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFR-DDRLIEVPLTLA 78 (80)
T ss_pred CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEE-CCEEEEEEEEec
Confidence 35799999999999999999999999999999999876654332 467889999999 888888888775
No 18
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19 E-value=1.2e-10 Score=91.97 Aligned_cols=67 Identities=30% Similarity=0.589 Sum_probs=59.9
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEE
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVI 458 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~ 458 (466)
..++|..|.++|||+++||++||+|++|||+++.+++++..++....++.+.+++.| +|+..++.++
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r-~~~~~~~~l~ 78 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVER-NGETITLTLT 78 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEE-CCEEEEEEec
Confidence 457899999999999999999999999999999999999998877667889999999 7877777765
No 19
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.19 E-value=1.2e-10 Score=113.07 Aligned_cols=99 Identities=17% Similarity=0.209 Sum_probs=86.3
Q ss_pred HHHHHHHHHHHHcCCccccccccccccccceeeeeeeeeecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCC
Q 012318 321 DSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTP 400 (466)
Q Consensus 321 ~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~ 400 (466)
..++++++++++++ +..+.|+|+...... +...|+.|..+.+
T Consensus 159 ~~~~~v~~~l~~~g-----------------------~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~ 200 (259)
T TIGR01713 159 VVSRRIIEELTKDP-----------------------QKMFDYIRLSPVMKN---------------DKLEGYRLNPGKD 200 (259)
T ss_pred hhHHHHHHHHHHCH-----------------------HhhhheEeEEEEEeC---------------CceeEEEEEecCC
Confidence 46788999999999 788999999875321 2346999999999
Q ss_pred CChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 012318 401 GSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 458 (466)
Q Consensus 401 ~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 458 (466)
+++|+++|||+||+|++|||+++.+++++.+++.+ +.++.++|+|+| +|+.+++.+.
T Consensus 201 ~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R-~G~~~~i~v~ 258 (259)
T TIGR01713 201 PSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVER-DGQREDIYVR 258 (259)
T ss_pred CCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEE-CCEEEEEEEE
Confidence 99999999999999999999999999999998887 677899999999 8998888764
No 20
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19 E-value=1.7e-10 Score=91.48 Aligned_cols=71 Identities=28% Similarity=0.462 Sum_probs=64.0
Q ss_pred CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318 390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA 462 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~ 462 (466)
..|++|..|.++|||+. ||++||+|++|||+++.+|+++.+++.. ..|+.+.+++.| +|+..++++++.+.
T Consensus 7 ~~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r-~g~~~~~~v~l~~~ 78 (79)
T cd00986 7 YHGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKR-EEKELPEDLILKTF 78 (79)
T ss_pred ecCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEE-CCEEEEEEEEEecc
Confidence 35899999999999997 7999999999999999999999998875 678899999999 89999998888754
No 21
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.17 E-value=7.2e-11 Score=121.28 Aligned_cols=277 Identities=19% Similarity=0.195 Sum_probs=181.3
Q ss_pred HHHhCCceEEEEccccc-------cccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEE-eCCCcEEEEE
Q 012318 131 AARVCPAVVNLSAPREF-------LGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVT-LQDGRTFEGT 202 (466)
Q Consensus 131 ~~~~~~SVV~I~~~~~~-------~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~-~~~g~~~~a~ 202 (466)
.+....|++.+...... ....+....|+||.+... .++|++|++..... ...+.+. ...-+.|.++
T Consensus 56 ~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~-----~~~v~v~~~gs~~k~~~~ 129 (473)
T KOG1320|consen 56 VDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNND-----HKFVTVKKHGSPRKYKAF 129 (473)
T ss_pred ccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCcccccccc-----ccccccccCCCchhhhhh
Confidence 34455566776643221 111244567999999866 99999999984311 1233333 1223668888
Q ss_pred EEEecCCCCEEEEEeCCC---CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEE
Q 012318 203 VLNADFHSDIAIVKINSK---TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQ 279 (466)
Q Consensus 203 vv~~d~~~DlAlLkv~~~---~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~ 279 (466)
+...-.+.|+|++.++.. ....++.+++ -+...+.++++| +....+|.|.|....... +..+......++
T Consensus 130 v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~--ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~--y~~~~~~l~~vq 202 (473)
T KOG1320|consen 130 VAAVFEECDLAVVYIESEEFWKGMNPFELGD--IPSLNGSGFVVG---GDGIIVTNGHVVRVEPRI--YAHSSTVLLRVQ 202 (473)
T ss_pred HHHhhhcccceEEEEeeccccCCCcccccCC--CcccCccEEEEc---CCcEEEEeeEEEEEEecc--ccCCCcceeeEE
Confidence 888889999999999863 2233344433 344557899998 667789999998876653 223334455789
Q ss_pred EcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeeeeee
Q 012318 280 TDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRRV 359 (466)
Q Consensus 280 ~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~ 359 (466)
+++.+.+|+||+|.+.-.+++.|+.+..........+.+|.-.+..+.......+ ...
T Consensus 203 i~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a----------------------~~~ 260 (473)
T KOG1320|consen 203 IDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSA----------------------IGN 260 (473)
T ss_pred EEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeec----------------------ccc
Confidence 9999999999999997779999999998755446788898877766665544444 012
Q ss_pred ecccccceeecC-CHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCC----HH--HHHHH
Q 012318 360 VRPWLGLKMLDL-NDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS----IT--EIIEI 432 (466)
Q Consensus 360 ~~~~lG~~~~~~-~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~----~~--~~~~~ 432 (466)
..++++...+.+ +.+.++.++ + +...|+.+.++.+-+.|.+. ++.||.|+.+||..|.- .. .+...
T Consensus 261 ~f~~~nt~t~g~vs~~~R~~~~-----l-g~~~g~~i~~~~qtd~ai~~-~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~ 333 (473)
T KOG1320|consen 261 GFGLLNTLTQGMVSGQLRKSFK-----L-GLETGVLISKINQTDAAINP-GNSGGPLLNLDGEVIGVNTRKVTRIGFSHG 333 (473)
T ss_pred Cceeeeeeeecccccccccccc-----c-Ccccceeeeeecccchhhhc-ccCCCcEEEecCcEeeeeeeeeEEeecccc
Confidence 233444433222 122222221 1 22378999999999999988 99999999999988831 11 11122
Q ss_pred Hhc-CCCCeEEEEEEECCC
Q 012318 433 MGD-RVGEPLKVVVQRAND 450 (466)
Q Consensus 433 l~~-~~g~~v~l~v~R~~g 450 (466)
+.. .+++++.+.+.| .+
T Consensus 334 iSf~~p~d~vl~~v~r-~~ 351 (473)
T KOG1320|consen 334 ISFKIPIDTVLVIVLR-LG 351 (473)
T ss_pred ceeccCchHhhhhhhh-hh
Confidence 222 566777777777 44
No 22
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.10 E-value=5.2e-10 Score=89.73 Aligned_cols=71 Identities=24% Similarity=0.608 Sum_probs=62.8
Q ss_pred CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318 390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 460 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 460 (466)
..+++|..|.++|||+++||++||+|++|||+++.+| +++..++....++.+.+++.|.+|+..++++++.
T Consensus 12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~~~ 84 (85)
T cd00988 12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLTRL 84 (85)
T ss_pred CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEEEC
Confidence 3689999999999999999999999999999999999 9999888777788999999993288888887764
No 23
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.97 E-value=2.1e-09 Score=82.81 Aligned_cols=55 Identities=36% Similarity=0.696 Sum_probs=51.0
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEE
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVV 445 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v 445 (466)
.+++|..|.++|||+++||++||+|++|||+++.++ +++.+++....|+.++|++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 489999999999999999999999999999999999 9999999887788888876
No 24
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.96 E-value=3.5e-09 Score=110.79 Aligned_cols=90 Identities=30% Similarity=0.568 Sum_probs=77.8
Q ss_pred cccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCC
Q 012318 361 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGE 439 (466)
Q Consensus 361 ~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~ 439 (466)
..|+|+.+..+++...++++.. ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. +.++
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~-----~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~ 411 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLK-----GDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGG 411 (428)
T ss_pred ccccceEEecCCHHHHHHcCCC-----cCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCC
Confidence 5689999999998887766432 223699999999999999999999999999999999999999999876 5788
Q ss_pred eEEEEEEECCCeEEEEE
Q 012318 440 PLKVVVQRANDQLVTLT 456 (466)
Q Consensus 440 ~v~l~v~R~~g~~~~l~ 456 (466)
.+.|+|.| +|+...+.
T Consensus 412 ~v~l~v~R-~g~~~~~~ 427 (428)
T TIGR02037 412 RVALLILR-GGATIFVT 427 (428)
T ss_pred EEEEEEEE-CCEEEEEE
Confidence 99999999 78876654
No 25
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.83 E-value=6.3e-08 Score=92.36 Aligned_cols=160 Identities=18% Similarity=0.196 Sum_probs=95.2
Q ss_pred ceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEe----CCCc-EE--EEEEE-EecC---CCCEEEEEeCCC---
Q 012318 155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTL----QDGR-TF--EGTVL-NADF---HSDIAIVKINSK--- 220 (466)
Q Consensus 155 ~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~----~~g~-~~--~a~vv-~~d~---~~DlAlLkv~~~--- 220 (466)
..|++|+|.++ .+||++||+......+ ..+.+.. .++. .+ ..... .+.. ..|.+...+...
T Consensus 64 ~~~~~~lI~pn-tvLTa~Hc~~s~~~G~----~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~ 138 (251)
T COG3591 64 LCTAATLIGPN-TVLTAGHCIYSPDYGE----DDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALE 138 (251)
T ss_pred ceeeEEEEcCc-eEEEeeeEEecCCCCh----hhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhc
Confidence 44566999998 9999999997643211 1222211 1111 11 11111 1112 345555555421
Q ss_pred ------CCCCccccCCCCCCCCCCEEEEEecCCCCCCce----EEeEEeeeecCccCCCCCCccccEEEEcccCCCCCcc
Q 012318 221 ------TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTV----TAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSG 290 (466)
Q Consensus 221 ------~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~----t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SG 290 (466)
.......+......+.++.+.++|||.+..+.. ..+.+.... ...+.++|.+++|+||
T Consensus 139 ~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~------------~~~l~y~~dT~pG~SG 206 (251)
T COG3591 139 SGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK------------GNKLFYDADTLPGSSG 206 (251)
T ss_pred cCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEe------------cceEEEEecccCCCCC
Confidence 112223333445678899999999998765332 223332221 2368999999999999
Q ss_pred ceeecCCCeEEEEEEeEecCCC--eee-EEEeHHHHHHHHHHHH
Q 012318 291 GPLVNIDGEIVGINIMKVAAAD--GLS-FAVPIDSAAKIIEQFK 331 (466)
Q Consensus 291 GPlvd~~G~VVGI~~~~~~~~~--g~~-~aip~~~i~~~l~~l~ 331 (466)
+|+++.+.++||++..+....+ ..+ ...-...++++++++.
T Consensus 207 Spv~~~~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 207 SPVLISKDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred CceEecCceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 9999999999999998765221 122 2234556777777764
No 26
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.81 E-value=1.5e-08 Score=105.55 Aligned_cols=70 Identities=26% Similarity=0.507 Sum_probs=65.0
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEecC
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPEE 461 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~ 461 (466)
.+++|.+|.++|||+++||++||+|++|||++|++|+|+.+.+....++.+.+++.| +|+..+++++++.
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~~ 272 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVER-NGETLSISLTPEA 272 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEE-CCEEEEEEEEEcC
Confidence 478999999999999999999999999999999999999999988788889999999 8899888888854
No 27
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.73 E-value=4.1e-08 Score=103.26 Aligned_cols=69 Identities=28% Similarity=0.549 Sum_probs=63.6
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 460 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 460 (466)
.+++|.+|.++|||+++||++||+|++|||++|.+|+|+.+++....++.+.++|.| +|+..++++++.
T Consensus 221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~ 289 (449)
T PRK10779 221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIER-QGSPLSLTLTPD 289 (449)
T ss_pred cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEE-CCEEEEEEEEee
Confidence 357899999999999999999999999999999999999999887778899999999 889888888875
No 28
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.71 E-value=3.3e-08 Score=104.00 Aligned_cols=67 Identities=15% Similarity=0.063 Sum_probs=58.8
Q ss_pred eeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEec
Q 012318 393 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPE 460 (466)
Q Consensus 393 ~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~ 460 (466)
.+|.+|.++|||++||||+||+|++|||++|.+|+++...+.. ..+++++++|.| +|+.++++++..
T Consensus 128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R-~gk~~~~~v~l~ 195 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAP-FGSDQRRDKTLD 195 (449)
T ss_pred ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEe-CCccceEEEEec
Confidence 3689999999999999999999999999999999999887765 667889999999 888776666553
No 29
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.59 E-value=3e-07 Score=72.88 Aligned_cols=69 Identities=25% Similarity=0.447 Sum_probs=55.9
Q ss_pred cccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecC--CHHHHHHHHhcCCC
Q 012318 361 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ--SITEIIEIMGDRVG 438 (466)
Q Consensus 361 ~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~--~~~~~~~~l~~~~g 438 (466)
...+|+.+..... ...|++|..|.++|||+++||++||+|++|||+++. +++++.+++....+
T Consensus 11 ~~~~G~~~~~~~~---------------~~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~ 75 (82)
T cd00992 11 GGGLGFSLRGGKD---------------SGGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD 75 (82)
T ss_pred CCCcCEEEeCccc---------------CCCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC
Confidence 5678998765321 135899999999999999999999999999999999 88999988876433
Q ss_pred CeEEEEE
Q 012318 439 EPLKVVV 445 (466)
Q Consensus 439 ~~v~l~v 445 (466)
.+.+++
T Consensus 76 -~v~l~v 81 (82)
T cd00992 76 -EVTLTV 81 (82)
T ss_pred -eEEEEE
Confidence 566654
No 30
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.59 E-value=2.9e-07 Score=73.24 Aligned_cols=71 Identities=30% Similarity=0.504 Sum_probs=55.5
Q ss_pred ccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHh-cCCCCe
Q 012318 362 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMG-DRVGEP 440 (466)
Q Consensus 362 ~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~-~~~g~~ 440 (466)
..+|+.+.... ....|++|..|.++|||+++||++||+|++|||+++.++.+...... ...+..
T Consensus 12 ~~~G~~~~~~~---------------~~~~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~ 76 (85)
T smart00228 12 GGLGFSLVGGK---------------DEGGGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGK 76 (85)
T ss_pred CcccEEEECCC---------------CCCCCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence 78898876421 11168999999999999999999999999999999998766544332 234568
Q ss_pred EEEEEEE
Q 012318 441 LKVVVQR 447 (466)
Q Consensus 441 v~l~v~R 447 (466)
+.+.+.|
T Consensus 77 ~~l~i~r 83 (85)
T smart00228 77 VTLTVLR 83 (85)
T ss_pred EEEEEEe
Confidence 8999988
No 31
>PRK10139 serine endoprotease; Provisional
Probab=98.59 E-value=1.6e-07 Score=98.71 Aligned_cols=66 Identities=26% Similarity=0.450 Sum_probs=59.6
Q ss_pred CcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318 390 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV 457 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v 457 (466)
..|++|.+|.++|||+++||++||+|++|||++|.+|+++.+++..+. +.+.|+|+| +|+.+.+.+
T Consensus 389 ~~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R-~g~~~~~~~ 454 (455)
T PRK10139 389 TKGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVR-GNESIYLLL 454 (455)
T ss_pred CCceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence 368999999999999999999999999999999999999999998754 689999999 888777665
No 32
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.58 E-value=2.4e-07 Score=94.31 Aligned_cols=69 Identities=23% Similarity=0.486 Sum_probs=60.0
Q ss_pred cceeecccC--------CCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 012318 391 SGVLVPVVT--------PGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 460 (466)
Q Consensus 391 ~g~~V~~V~--------~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 460 (466)
.|++|.... .+|||+++|||+||+|++|||++|++|+|+.+++....++.+.++|.| +|+..++.++|.
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R-~Ge~~tv~V~Pv 181 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIER-GGKIIETVIKPV 181 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEE-CCEEEEEEEEEe
Confidence 688875432 369999999999999999999999999999999988678899999999 889888888764
No 33
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.54 E-value=2.7e-07 Score=73.27 Aligned_cols=72 Identities=31% Similarity=0.554 Sum_probs=56.5
Q ss_pred ecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCC
Q 012318 360 VRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRV 437 (466)
Q Consensus 360 ~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~ 437 (466)
....||+.+..-.+ ....+++|.+|.++|+|+++||++||+|++|||+++.++ .++..++....
T Consensus 8 ~~~~lG~~l~~~~~--------------~~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~ 73 (81)
T PF00595_consen 8 GNGPLGFTLRGGSD--------------NDEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSAS 73 (81)
T ss_dssp TTSBSSEEEEEEST--------------SSSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHST
T ss_pred CCCCcCEEEEecCC--------------CCcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCC
Confidence 56789998875321 012699999999999999999999999999999999977 45566666644
Q ss_pred CCeEEEEEE
Q 012318 438 GEPLKVVVQ 446 (466)
Q Consensus 438 g~~v~l~v~ 446 (466)
+ .++|+|+
T Consensus 74 ~-~v~L~V~ 81 (81)
T PF00595_consen 74 N-PVTLTVQ 81 (81)
T ss_dssp S-EEEEEEE
T ss_pred C-cEEEEEC
Confidence 4 7888774
No 34
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.51 E-value=2.6e-07 Score=93.50 Aligned_cols=68 Identities=25% Similarity=0.451 Sum_probs=56.9
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEe
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIP 459 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~ 459 (466)
.+++|..|.++|||+++||++||+|++|||+++.+| .++...+....|+.+.++|.| +|+..++++++
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R-~g~~~~~~v~l 131 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILR-AGKSKPLTFTL 131 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEe-CCCCceEEEEE
Confidence 578999999999999999999999999999999987 577777766778899999999 66544444433
No 35
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.51 E-value=4e-07 Score=93.99 Aligned_cols=68 Identities=25% Similarity=0.498 Sum_probs=59.1
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEe
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIP 459 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~ 459 (466)
.|++|..|.++|||+++||++||+|++|||++|.++ .++...+....|..+.|+|.| +|+..+++++.
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r-~g~~~~~~l~r 171 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRR-GPETRLVTLTR 171 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEe
Confidence 489999999999999999999999999999999864 677777776778899999999 78877777654
No 36
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.50 E-value=3.6e-07 Score=93.64 Aligned_cols=63 Identities=22% Similarity=0.370 Sum_probs=55.3
Q ss_pred ecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEE-ECCCeEEEEEEEecC
Q 012318 395 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQ-RANDQLVTLTVIPEE 461 (466)
Q Consensus 395 V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~-R~~g~~~~l~v~~~~ 461 (466)
|..|.++|+|+++||++||+|++|||+++.+|.|+..++. ++.+.++|. | +|+..++++.+++
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~r-dGe~~~l~Ie~~~ 65 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDA-NGESHQIEIEKDL 65 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcC-CCeEEEEEEecCC
Confidence 6789999999999999999999999999999999887774 356889997 6 8888888888754
No 37
>PRK10942 serine endoprotease; Provisional
Probab=98.49 E-value=3.8e-07 Score=96.28 Aligned_cols=65 Identities=34% Similarity=0.572 Sum_probs=58.9
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV 457 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v 457 (466)
.|++|.+|.++|+|+++||++||+|++|||++|.+++++.+++..+. +.+.|+|.| +|..+.+.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~~v~l~V~R-~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLALNIQR-GDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence 58999999999999999999999999999999999999999988744 689999999 888776654
No 38
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.39 E-value=1.9e-05 Score=76.50 Aligned_cols=169 Identities=20% Similarity=0.240 Sum_probs=95.3
Q ss_pred ceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC---------C---cEEEE-EEEEecC-------C-CCEE
Q 012318 155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------G---RTFEG-TVLNADF-------H-SDIA 213 (466)
Q Consensus 155 ~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~---------g---~~~~a-~vv~~d~-------~-~DlA 213 (466)
..|-|.+|+++ ||||++||+.... .. .+.|.+.. + ..... +++ .|+ . +|||
T Consensus 38 ~~Cggsli~~~-~vltaaHC~~~~~-----~~-~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDia 109 (256)
T KOG3627|consen 38 HLCGGSLISPR-WVLTAAHCVKGAS-----AS-LYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIA 109 (256)
T ss_pred eeeeeEEeeCC-EEEEChhhCCCCC-----Cc-ceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEE
Confidence 36788788776 9999999998742 00 33343321 1 11111 222 222 3 8999
Q ss_pred EEEeCCC----CCCCccccCCCCC---CCCCCEEEEEecCCCC------CCceEEeEEeeeecCccCCCCCC---ccccE
Q 012318 214 IVKINSK----TPLPAAKLGTSSK---LCPGDWVVAMGCPHSL------QNTVTAGIVSCVDRKSSDLGLGG---MRREY 277 (466)
Q Consensus 214 lLkv~~~----~~~~~~~l~~s~~---~~~G~~V~~iG~p~~~------~~~~t~G~Vs~~~~~~~~~~~~~---~~~~~ 277 (466)
||+++.+ ..+.++.+..... ...+..+++.||+... ...+....+..+....|...+.. .....
T Consensus 110 ll~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~ 189 (256)
T KOG3627|consen 110 LLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTM 189 (256)
T ss_pred EEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCE
Confidence 9999975 3345556643222 3444888899997532 22333333333333333322211 11122
Q ss_pred EEEc-----ccCCCCCccceeecCC---CeEEEEEEeEecC-CC--eeeEEEeHHHHHHHHHHHH
Q 012318 278 LQTD-----CAINAGNSGGPLVNID---GEIVGINIMKVAA-AD--GLSFAVPIDSAAKIIEQFK 331 (466)
Q Consensus 278 i~~~-----~~i~~G~SGGPlvd~~---G~VVGI~~~~~~~-~~--g~~~aip~~~i~~~l~~l~ 331 (466)
+... ..+|.|+|||||+-.+ ..++||++++... .. .-+.+..+....+|+++..
T Consensus 190 ~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~~ 254 (256)
T KOG3627|consen 190 LCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKENI 254 (256)
T ss_pred EeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHHh
Confidence 3332 3368999999999554 6999999998641 11 1233566777777777643
No 39
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.30 E-value=3.2e-06 Score=68.03 Aligned_cols=68 Identities=22% Similarity=0.430 Sum_probs=49.4
Q ss_pred CcceeecccCCC--------ChhhhCCC--CCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318 390 KSGVLVPVVTPG--------SPAHLAGF--LPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV 457 (466)
Q Consensus 390 ~~g~~V~~V~~~--------spA~~aGl--~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~v 457 (466)
..+..|..|.++ ||..+.|+ ++||+|++|||+++..-.++..+|..+.|+.+.|+|.+.+++.+++.|
T Consensus 11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~~~R~v~V 88 (88)
T PF14685_consen 11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPGGARTVVV 88 (88)
T ss_dssp TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT-EEEEEE
T ss_pred CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCCCceEEEC
Confidence 356778887664 78888875 599999999999999999999999999999999999995555666543
No 40
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.28 E-value=8.9e-05 Score=70.41 Aligned_cols=174 Identities=19% Similarity=0.252 Sum_probs=91.4
Q ss_pred HhCCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEE-----EEEec
Q 012318 133 RVCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGT-----VLNAD 207 (466)
Q Consensus 133 ~~~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~-----vv~~d 207 (466)
-++..|++|.-..+. ....--|+.+. + +|+|++|..+... ..+++....|.- ... -+..=
T Consensus 15 ~Ia~~ic~l~n~s~~-----~~~~l~gigyG-~-~iItn~HLf~~nn-------g~L~i~s~hG~f-~v~nt~~lkv~~i 79 (235)
T PF00863_consen 15 PIASNICRLTNESDG-----GTRSLYGIGYG-S-YIITNAHLFKRNN-------GELTIKSQHGEF-TVPNTTQLKVHPI 79 (235)
T ss_dssp HHHTTEEEEEEEETT-----EEEEEEEEEET-T-EEEEEGGGGSSTT-------CEEEEEETTEEE-EECEGGGSEEEE-
T ss_pred hhhheEEEEEEEeCC-----CeEEEEEEeEC-C-EEEEChhhhccCC-------CeEEEEeCceEE-EcCCccccceEEe
Confidence 345678888743321 11223456665 3 9999999997652 568887776632 211 23344
Q ss_pred CCCCEEEEEeCCCCCCCccccC-CCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCC
Q 012318 208 FHSDIAIVKINSKTPLPAAKLG-TSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINA 286 (466)
Q Consensus 208 ~~~DlAlLkv~~~~~~~~~~l~-~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~ 286 (466)
+..||.++|++. ++||.+-. .-..++.++.|+++|.-+..... ...|+....... .....++.+...+..
T Consensus 80 ~~~DiviirmPk--DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~--~s~vSesS~i~p-----~~~~~fWkHwIsTk~ 150 (235)
T PF00863_consen 80 EGRDIVIIRMPK--DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSI--SSTVSESSWIYP-----EENSHFWKHWISTKD 150 (235)
T ss_dssp TCSSEEEEE--T--TS----S---B----TT-EEEEEEEECSSCCC--EEEEEEEEEEEE-----ETTTTEEEE-C---T
T ss_pred CCccEEEEeCCc--ccCCcchhhhccCCCCCCEEEEEEEEEEcCCe--eEEECCceEEee-----cCCCCeeEEEecCCC
Confidence 688999999995 44543321 22467899999999985543321 112222211111 012457888889999
Q ss_pred CCccceeecC-CCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHc
Q 012318 287 GNSGGPLVNI-DGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKN 333 (466)
Q Consensus 287 G~SGGPlvd~-~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~ 333 (466)
|+=|.|||+. ||.+||+++..... ...+|+.|+.. ++++.+.+.
T Consensus 151 G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~~--~f~~~~l~~ 195 (235)
T PF00863_consen 151 GDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFPD--DFEEFYLEN 195 (235)
T ss_dssp T-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--T--THHHHHCC-
T ss_pred CccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCCH--HHHHHHhcc
Confidence 9999999976 49999999987644 45778888653 444444433
No 41
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.25 E-value=1.7e-06 Score=90.27 Aligned_cols=64 Identities=17% Similarity=0.280 Sum_probs=55.8
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEE
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLT 456 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~g~~~~l~ 456 (466)
.|.+|.+|.++|||+++||++||+|+++||+++.+++++...+.... .++.+++.| +++..++.
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~-~~v~~~I~r-~g~~~~l~ 191 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA-GEPMVEILA-ERENWTFE 191 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc-ccceEEEEE-ecCceEec
Confidence 67889999999999999999999999999999999999998887755 678899999 66655443
No 42
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=98.24 E-value=3.1e-06 Score=77.30 Aligned_cols=73 Identities=26% Similarity=0.369 Sum_probs=61.9
Q ss_pred ceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHH---HHhcCCCCeEEEEEEECCCeEEEEEEEecCCCCC
Q 012318 392 GVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIE---IMGDRVGEPLKVVVQRANDQLVTLTVIPEEANPD 465 (466)
Q Consensus 392 g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~---~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~~~~~ 465 (466)
-++|++|.++|||+++||+.||.|+++....-.++..++. ......++.+.++|.| .|+.+.+.++|..|.++
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R-~g~~v~L~ltP~~W~Gr 215 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIR-EGQKVVLSLTPKKWQGR 215 (231)
T ss_pred eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEec-CCCEEEEEeCcccccCC
Confidence 4679999999999999999999999999887766655543 3345778899999999 89999999999988764
No 43
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.21 E-value=5.1e-06 Score=85.93 Aligned_cols=70 Identities=36% Similarity=0.567 Sum_probs=59.9
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEec
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPE 460 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~ 460 (466)
.++.|.++.+++||+++||++||+|++|||+++... +++.+.++..+|..++|++.|.+ ++.++++++.+
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l~Re 184 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTLTRE 184 (406)
T ss_pred CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCCceeEEEEEEE
Confidence 688899999999999999999999999999999876 56777788889999999999942 45666666654
No 44
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa) protein are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65, an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=98.05 E-value=1.5e-05 Score=69.98 Aligned_cols=72 Identities=29% Similarity=0.494 Sum_probs=55.2
Q ss_pred CcceeecccCCCChhhhCCCCC-CCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEecC
Q 012318 390 KSGVLVPVVTPGSPAHLAGFLP-SDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPEE 461 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aGl~~-GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~~ 461 (466)
..+.-|.+|.|+|||++|||++ .|.|+.+|+....+.++|.+.+..+.++.+.|.|...+ ...+.++++|..
T Consensus 42 ~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~~ 115 (138)
T PF04495_consen 42 EEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYNSKTDSVREVTITPSR 115 (138)
T ss_dssp CCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEETTTTCEEEEEE---T
T ss_pred cceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEECCCCeEEEEEEEcCC
Confidence 4688899999999999999999 59999999999999999999999888999999999743 356788988863
No 45
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.94 E-value=1.9e-05 Score=81.67 Aligned_cols=65 Identities=25% Similarity=0.380 Sum_probs=54.6
Q ss_pred CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 012318 389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA 462 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~ 462 (466)
...+.+|..|.++|||++|||.+||.|++|||. .+.+.. +.++++++.+.| .|..+++.++....
T Consensus 460 ~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~-~~~L~e~~v~~~~~ 525 (558)
T COG3975 460 EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKIQVHVFR-EGRLREFLVKLGGD 525 (558)
T ss_pred cCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccccccccceEEEEcc-CCceEEeecccCCC
Confidence 356788999999999999999999999999998 233443 788899999999 88998888877543
No 46
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.90 E-value=3.5e-05 Score=84.00 Aligned_cols=70 Identities=14% Similarity=0.355 Sum_probs=55.7
Q ss_pred cceeecccCCCChhhhC-CCCCCCEEEEEC--CEecC---C--HHHHHHHHhcCCCCeEEEEEEEC--CCeEEEEEEEec
Q 012318 391 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFD--GKPVQ---S--ITEIIEIMGDRVGEPLKVVVQRA--NDQLVTLTVIPE 460 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~a-Gl~~GD~I~~in--g~~v~---~--~~~~~~~l~~~~g~~v~l~v~R~--~g~~~~l~v~~~ 460 (466)
.+++|.+|.+||||+++ ||++||+|++|| |+++. . .+++..+|....|.+|.|+|.|. +++.++++++.+
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl~R~ 334 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLTRD 334 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEEEee
Confidence 46889999999999998 999999999999 55443 2 35788888888899999999983 345666666543
No 47
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.90 E-value=4.4e-05 Score=74.27 Aligned_cols=70 Identities=26% Similarity=0.403 Sum_probs=60.6
Q ss_pred CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 012318 389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP 459 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~ 459 (466)
.-.|+++..|..++|+... |+.||.|++|||+++.+.+++..++.. ++|+.+++++.|.+++....+++.
T Consensus 128 ~y~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl 198 (342)
T COG3480 128 TYAGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITL 198 (342)
T ss_pred EEeeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence 3469999999999999987 999999999999999999999998876 899999999998666655544444
No 48
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.88 E-value=4.8e-05 Score=73.92 Aligned_cols=61 Identities=20% Similarity=0.356 Sum_probs=51.7
Q ss_pred cCCC---ChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 012318 398 VTPG---SPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP 459 (466)
Q Consensus 398 V~~~---spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~ 459 (466)
+.|+ .--+++|||+||++++|||.++++.++..+++.. .....++|+|+| +|+..++.+..
T Consensus 211 l~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeR-dGq~~~i~i~l 275 (276)
T PRK09681 211 VKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLR-KGARHDISIAL 275 (276)
T ss_pred ECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEE-CCEEEEEEEEc
Confidence 4555 3458899999999999999999999988888876 667789999999 99998887754
No 49
>PF12812 PDZ_1: PDZ-like domain
Probab=97.66 E-value=0.00017 Score=56.91 Aligned_cols=69 Identities=20% Similarity=0.236 Sum_probs=58.0
Q ss_pred cccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCC
Q 012318 361 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRV 437 (466)
Q Consensus 361 ~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~ 437 (466)
.-|.|..+.+++-..++++.. .-|.++.....++++...|+..|-+|.+|||+++.+.++|.+++++-+
T Consensus 8 v~~~Ga~f~~Ls~q~aR~~~~--------~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ip 76 (78)
T PF12812_consen 8 VEVCGAVFHDLSYQQARQYGI--------PVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIP 76 (78)
T ss_pred EEEcCeecccCCHHHHHHhCC--------CCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCC
Confidence 358899999999999888853 234555567889999987899999999999999999999999987644
No 50
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.65 E-value=0.00059 Score=67.59 Aligned_cols=51 Identities=18% Similarity=0.279 Sum_probs=35.3
Q ss_pred ccCCCCCccceeecC--CC-eEEEEEEeEecCCCe---eeEEEeHHHHHHHHHHHHH
Q 012318 282 CAINAGNSGGPLVNI--DG-EIVGINIMKVAAADG---LSFAVPIDSAAKIIEQFKK 332 (466)
Q Consensus 282 ~~i~~G~SGGPlvd~--~G-~VVGI~~~~~~~~~g---~~~aip~~~i~~~l~~l~~ 332 (466)
...|.|+||||+|-. +| .-+||++++.....+ -+...-++....|+++..+
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~ 279 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN 279 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence 456999999999943 25 457999998663322 1244557888888888544
No 51
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.61 E-value=0.0013 Score=62.80 Aligned_cols=116 Identities=22% Similarity=0.373 Sum_probs=62.2
Q ss_pred CceeEEEEEeCCC--eEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC-CCCCCCccccCC
Q 012318 154 RGIGSGAIVDADG--TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN-SKTPLPAAKLGT 230 (466)
Q Consensus 154 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~-~~~~~~~~~l~~ 230 (466)
...|||=+...+| .|+|+.||+.+ +...|...+ .. +...++..-|+|.-.++ -+..+|.+++..
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg~---------~~a~v~~~g-~~---~~~tF~~~GDfA~~~~~~~~G~~P~~k~a~ 177 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLGG---------NTARVSGVG-TR---RMLTFKKNGDFAEADITNWPGAAPKYKFAQ 177 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCBT---------TEEEEEETT-EE---EEEEEEEETTEEEEEETTS-S---B--B-T
T ss_pred ecccccceEEECCeEEEEEEEEEcCC---------CeEEEEecc-eE---EEEEEeccCcEEEEECCCCCCCCCceeecC
Confidence 3556666665555 89999999985 455555443 22 23345566799999994 345667777642
Q ss_pred CCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeEe
Q 012318 231 SSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKV 308 (466)
Q Consensus 231 s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~ 308 (466)
. ..|.--+.. ...+..|.|... ..+ |-+.+|+||+|+++.+|.+||+|+...
T Consensus 178 ~---~~GrAyW~t------~tGvE~G~ig~~--------------~~~---~fT~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 178 N---YTGRAYWLT------STGVEPGFIGGG--------------GAV---CFTGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp T----SEEEEEEE------TTEEEEEEEETT--------------EEE---ESS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred C---cccceEEEc------ccCcccceecCc--------------eEE---EEcCCCCCCCccCcCCCCEEEEEecCC
Confidence 1 122211111 112333433210 011 334689999999999999999999863
No 52
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.54 E-value=7.3e-05 Score=60.24 Aligned_cols=37 Identities=30% Similarity=0.523 Sum_probs=33.7
Q ss_pred CCCcceeecccCCCChhhhCCCCCCCEEEEECCEecC
Q 012318 388 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 424 (466)
Q Consensus 388 ~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~ 424 (466)
-...|++|++|.++|||+.|||+.+|.|+.+||...+
T Consensus 56 ytD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfT 92 (124)
T KOG3553|consen 56 YTDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFT 92 (124)
T ss_pred cCCccEEEEEeccCChhhhhcceecceEEEecCceeE
Confidence 3568999999999999999999999999999998753
No 53
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=97.37 E-value=0.0015 Score=69.59 Aligned_cols=118 Identities=22% Similarity=0.322 Sum_probs=73.9
Q ss_pred CCCCEEEEEeCCCC--------CC------CccccCC------CCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccC
Q 012318 208 FHSDIAIVKINSKT--------PL------PAAKLGT------SSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSD 267 (466)
Q Consensus 208 ~~~DlAlLkv~~~~--------~~------~~~~l~~------s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~ 267 (466)
.-.|+|||+++... ++ |.+.+.+ ...+..|.+|+-+|.-.+ .|.|.+.+..-..
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy-- 614 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY-- 614 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--
Confidence 34599999998531 11 2233322 134678999999998765 3455555442211
Q ss_pred CCCCCc-cccEEEEc----ccCCCCCccceeecCCCe------EEEEEEeEecCCCeeeEEEeHHHHHHHHHHHH
Q 012318 268 LGLGGM-RREYLQTD----CAINAGNSGGPLVNIDGE------IVGINIMKVAAADGLSFAVPIDSAAKIIEQFK 331 (466)
Q Consensus 268 ~~~~~~-~~~~i~~~----~~i~~G~SGGPlvd~~G~------VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~ 331 (466)
+..+.. ..+++... .-...|+||+-|++.-+. |+||.+..-.+..+++++.|+..|.+-|++.-
T Consensus 615 w~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~vT 689 (695)
T PF08192_consen 615 WADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEVT 689 (695)
T ss_pred ecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHhh
Confidence 111111 12333333 224689999999986444 99999987666678999999988877777653
No 54
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.29 E-value=0.018 Score=56.71 Aligned_cols=176 Identities=19% Similarity=0.247 Sum_probs=99.1
Q ss_pred CCceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCC----C-----CCCc------------eEEEE-
Q 012318 135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSR----A-----LPKG------------KVDVT- 192 (466)
Q Consensus 135 ~~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~----~-----~~~~------------~i~V~- 192 (466)
.|-.|.+...... ......+|++|+++ ||||+.||+-...... . -... .+.+.
T Consensus 53 ~pW~v~v~~~~~~----~~~~~~~gtlIS~R-HiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~~ 127 (282)
T PF03761_consen 53 APWAVSVYTKNHN----EGNYFSTGTLISPR-HILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVRC 127 (282)
T ss_pred CCCEEEEEeccCc----ccceecceEEeccC-eEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHHhccEEEEe
Confidence 4667777765432 11233499999999 9999999996322100 0 0001 12220
Q ss_pred ---eCCC-----cEEEEEEEEe--------cCCCCEEEEEeCCC--CCCCccccCCCC-CCCCCCEEEEEecCCCCCCce
Q 012318 193 ---LQDG-----RTFEGTVLNA--------DFHSDIAIVKINSK--TPLPAAKLGTSS-KLCPGDWVVAMGCPHSLQNTV 253 (466)
Q Consensus 193 ---~~~g-----~~~~a~vv~~--------d~~~DlAlLkv~~~--~~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~~~ 253 (466)
...+ +...|.++.. ....+++||+++.+ ....++=|+++. ....++.+.+.|+... ..+
T Consensus 128 ~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~ 205 (282)
T PF03761_consen 128 CNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKL 205 (282)
T ss_pred ecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeE
Confidence 0111 1222444322 34679999999987 666777776643 3667899999998222 122
Q ss_pred EEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceee---cCCCeEEEEEEeEecCC-CeeeEEEeHHHHHH
Q 012318 254 TAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLV---NIDGEIVGINIMKVAAA-DGLSFAVPIDSAAK 325 (466)
Q Consensus 254 t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlv---d~~G~VVGI~~~~~~~~-~g~~~aip~~~i~~ 325 (466)
....+.-..... ....+......+.|++|||++ |....||||.+...... ....+++.+..+.+
T Consensus 206 ~~~~~~i~~~~~--------~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 206 KHRKLKITNCTK--------CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD 273 (282)
T ss_pred EEEEEEEEEeec--------cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence 222222111110 123445566778999999998 33357899987654222 12556677666544
No 55
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.29 E-value=0.00067 Score=63.53 Aligned_cols=67 Identities=15% Similarity=0.193 Sum_probs=56.1
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 458 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 458 (466)
.|..+.=..+++.-++.|||.||+.+++|+..+++.+++..+|.. ..-+.++++|+| +|+...+.+.
T Consensus 207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R-~G~rhdInV~ 274 (275)
T COG3031 207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIR-RGKRHDINVR 274 (275)
T ss_pred EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEe-cCccceeeec
Confidence 355555566778889999999999999999999999999888876 455679999999 8998887764
No 56
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=97.21 E-value=0.0082 Score=55.95 Aligned_cols=161 Identities=17% Similarity=0.244 Sum_probs=86.7
Q ss_pred CceeEEEEEeCC-CeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecC----------------CCCEEEEE
Q 012318 154 RGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADF----------------HSDIAIVK 216 (466)
Q Consensus 154 ~~~GSGfiI~~~-G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~----------------~~DlAlLk 216 (466)
.+.||=.+++++ +..--=.|.+.+.+ ....+.+.+|+.|++++....+ ..-+.-+.
T Consensus 19 aGiGTlTf~dp~~~~fgALGH~I~D~d-------t~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~ 91 (218)
T PF05580_consen 19 AGIGTLTFYDPETGTFGALGHGISDVD-------TGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIE 91 (218)
T ss_pred cCeEEEEEEECCCCcEEecCCeEEcCC-------CCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEE
Confidence 477899999974 56666689888753 3445666778888877666532 11122222
Q ss_pred eCCC---------------CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCC----CccccE
Q 012318 217 INSK---------------TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG----GMRREY 277 (466)
Q Consensus 217 v~~~---------------~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~----~~~~~~ 277 (466)
-+.. ...++++++...++++|..-+..=.......... -.|..+.+.......+ ....++
T Consensus 92 ~Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~G~~ie~f~-ieI~~v~~~~~~~~k~~vi~vtd~~L 170 (218)
T PF05580_consen 92 KNTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVIDGTKIEEFD-IEIEKVLPQSSPSGKGMVIKVTDPRL 170 (218)
T ss_pred eccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEEcCCeEEEeE-EEEEEEccCCCCCCCcEEEEECCcch
Confidence 2211 1223344444455666643211101100000111 1112222211100000 001123
Q ss_pred EEEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHH
Q 012318 278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSA 323 (466)
Q Consensus 278 i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i 323 (466)
+..+..+..|+||+|++ .+|++||=++..+.+....+|.++++..
T Consensus 171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~M 215 (218)
T PF05580_consen 171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWM 215 (218)
T ss_pred hhhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHH
Confidence 33445678899999999 7999999999998877889999987643
No 57
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.68 E-value=0.0031 Score=65.95 Aligned_cols=59 Identities=22% Similarity=0.414 Sum_probs=47.3
Q ss_pred CCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCH---HHHHHHHhcCCCCeEEEEEE
Q 012318 388 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI---TEIIEIMGDRVGEPLKVVVQ 446 (466)
Q Consensus 388 ~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~---~~~~~~l~~~~g~~v~l~v~ 446 (466)
+...|++|..|.++|||++-||++||+|+.||.++..+. +.+..+|.--+|+.++|.-.
T Consensus 426 GNDVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ 487 (1027)
T KOG3580|consen 426 GNDVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQ 487 (1027)
T ss_pred CCceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhh
Confidence 345799999999999999999999999999999998775 23444455577887777544
No 58
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=96.64 E-value=0.0036 Score=66.46 Aligned_cols=51 Identities=25% Similarity=0.401 Sum_probs=45.4
Q ss_pred CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCC
Q 012318 389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGE 439 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~ 439 (466)
...-+.|..|.+++||.++.|++||++++|||.+|++.++..+.++.-.|+
T Consensus 396 ~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~ 446 (1051)
T KOG3532|consen 396 TNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGD 446 (1051)
T ss_pred CceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccc
Confidence 345677899999999999999999999999999999999999988875554
No 59
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.55 E-value=0.0097 Score=65.54 Aligned_cols=24 Identities=38% Similarity=0.440 Sum_probs=21.2
Q ss_pred ceeEEEEEeCCCeEEeccccccCC
Q 012318 155 GIGSGAIVDADGTILTCAHVVVDF 178 (466)
Q Consensus 155 ~~GSGfiI~~~G~ILT~aHvv~~~ 178 (466)
+-|||-||+++|+||||.||.-++
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~~~ 70 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGYGA 70 (698)
T ss_pred CceeEEEEcCCceEEecchhhhhH
Confidence 349999999999999999999653
No 60
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.38 E-value=0.0055 Score=67.42 Aligned_cols=55 Identities=24% Similarity=0.332 Sum_probs=43.4
Q ss_pred EEEEcccCCCCCccceeecCCCeEEEEEEeEec----------CCCeeeEEEeHHHHHHHHHHHH
Q 012318 277 YLQTDCAINAGNSGGPLVNIDGEIVGINIMKVA----------AADGLSFAVPIDSAAKIIEQFK 331 (466)
Q Consensus 277 ~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~----------~~~g~~~aip~~~i~~~l~~l~ 331 (466)
.+..+..+.+||||+|++|.+|+|||+++-+.- ..-+.+..|-+..|+.+|+++-
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 577888999999999999999999999987642 1223567777777888877654
No 61
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.21 E-value=0.049 Score=49.74 Aligned_cols=148 Identities=18% Similarity=0.239 Sum_probs=82.0
Q ss_pred CceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE--EEEEecC---CC
Q 012318 136 PAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG--TVLNADF---HS 210 (466)
Q Consensus 136 ~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a--~vv~~d~---~~ 210 (466)
..++.|... .+...++++.|..+ ++|...|.-.. ..+.+ +|+.++. .+...+. ..
T Consensus 13 ~N~~~v~~~-------~g~~t~l~~gi~~~-~~lvp~H~~~~---------~~i~i---~g~~~~~~d~~~lv~~~~~~~ 72 (172)
T PF00548_consen 13 KNVVPVTTG-------KGEFTMLALGIYDR-YFLVPTHEEPE---------DTIYI---DGVEYKVDDSVVLVDRDGVDT 72 (172)
T ss_dssp HHEEEEEET-------TEEEEEEEEEEEBT-EEEEEGGGGGC---------SEEEE---TTEEEEEEEEEEEEETTSSEE
T ss_pred ccEEEEEeC-------CceEEEecceEeee-EEEEECcCCCc---------EEEEE---CCEEEEeeeeEEEecCCCcce
Confidence 456666652 23456888889877 99999992221 34433 3555443 2223443 46
Q ss_pred CEEEEEeCCCCCCCccccCCCCCC-CCCCEEEEEecCCCCCC-ceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCC
Q 012318 211 DIAIVKINSKTPLPAAKLGTSSKL-CPGDWVVAMGCPHSLQN-TVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGN 288 (466)
Q Consensus 211 DlAlLkv~~~~~~~~~~l~~s~~~-~~G~~V~~iG~p~~~~~-~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~ 288 (466)
||++++++....++-++---.... ...+...++ +...... ....+.++..... .. .+......+.+++++..|+
T Consensus 73 Dl~~v~l~~~~kfrDIrk~~~~~~~~~~~~~l~v-~~~~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~ 148 (172)
T PF00548_consen 73 DLTLVKLPRNPKFRDIRKFFPESIPEYPECVLLV-NSTKFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGM 148 (172)
T ss_dssp EEEEEEEESSS-B--GGGGSBSSGGTEEEEEEEE-ESSSSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTG
T ss_pred eEEEEEccCCcccCchhhhhccccccCCCcEEEE-ECCCCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCc
Confidence 999999986543332211000112 223333333 3333332 3334444333322 10 1123345788899999999
Q ss_pred ccceeecC---CCeEEEEEEeE
Q 012318 289 SGGPLVNI---DGEIVGINIMK 307 (466)
Q Consensus 289 SGGPlvd~---~G~VVGI~~~~ 307 (466)
-||||+.. .++++|||.++
T Consensus 149 CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 149 CGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp TTEEEEESCGGTTEEEEEEEEE
T ss_pred cCCeEEEeeccCccEEEEEecc
Confidence 99999942 48999999885
No 62
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.93 E-value=0.011 Score=63.09 Aligned_cols=52 Identities=19% Similarity=0.489 Sum_probs=44.7
Q ss_pred ecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEEE
Q 012318 395 VPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQR 447 (466)
Q Consensus 395 V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~R 447 (466)
|..|.++|||++.| |+.||.|++|||+.|.+. .|+..++++ .|.+|+|+|.-
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd-aGlsVtLtIip 836 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD-AGLSVTLTIIP 836 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh-cCceEEEEEcC
Confidence 78899999999986 999999999999999876 466676665 68889999875
No 63
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.78 E-value=0.048 Score=50.96 Aligned_cols=117 Identities=24% Similarity=0.315 Sum_probs=48.9
Q ss_pred eEEeccccccCCCCCCCCCCceEEEEeCCCcEEE---EEEEEecCCCCEEEEEeCCC----CCCCccccCCCCCCCCCCE
Q 012318 167 TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE---GTVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDW 239 (466)
Q Consensus 167 ~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~---a~vv~~d~~~DlAlLkv~~~----~~~~~~~l~~s~~~~~G~~ 239 (466)
.++|+.||.... ..+ ..+.+|+.++ -+.+..+...|++||+.... ..++.+.+.....+..|
T Consensus 43 ~L~ta~Hv~~~~--------~~~-~~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g-- 111 (203)
T PF02122_consen 43 ALLTARHVWSRP--------SKV-TSLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG-- 111 (203)
T ss_dssp EEEE-HHHHTSS--------S----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE--
T ss_pred ceecccccCCCc--------cce-eEcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC--
Confidence 899999999873 222 2334454444 24556678999999999842 23344444322211100
Q ss_pred EEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEEEcccCCCCCccceeecCCCeEEEEEEeE
Q 012318 240 VVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMK 307 (466)
Q Consensus 240 V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~ 307 (466)
.+.. +....+...+...... .....+..+-+.+.+|.||.|+++.. +++|++...
T Consensus 112 --~~~~-----y~~~~~~~~~~sa~i~-----g~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~ 166 (203)
T PF02122_consen 112 --PVSF-----YGFSSGEWPCSSAKIP-----GTEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS 166 (203)
T ss_dssp --ESST-----TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred --Ceee-----eeecCCCceeccCccc-----cccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence 1111 1111211111111110 11133667778999999999999877 999999985
No 64
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=95.70 E-value=0.013 Score=64.28 Aligned_cols=56 Identities=23% Similarity=0.472 Sum_probs=47.0
Q ss_pred cceeecccCCCChhhhCCCCCCCEEEEECCEecCC--HHHHHHHHhcCCCCeEEEEEEEC
Q 012318 391 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQRA 448 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~--~~~~~~~l~~~~g~~v~l~v~R~ 448 (466)
..++|..|.+|+|+.-. |++||+|++|||.+|++ |+.+.++++. ....|.|+|.++
T Consensus 75 rPviVr~VT~GGps~GK-L~PGDQIl~vN~Epv~daprervIdlvRa-ce~sv~ltV~qP 132 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIGK-LQPGDQILAVNGEPVKDAPRERVIDLVRA-CESSVNLTVCQP 132 (1298)
T ss_pred CceEEEEecCCCCcccc-ccCCCeEEEecCcccccccHHHHHHHHHH-HhhhcceEEecc
Confidence 68899999999999866 99999999999999976 6778887765 345688888873
No 65
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=95.60 E-value=0.033 Score=48.61 Aligned_cols=57 Identities=25% Similarity=0.469 Sum_probs=43.4
Q ss_pred CCcceeecccCCCChhhhC-CCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEE
Q 012318 389 VKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ 446 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~ 446 (466)
.+.+++|+.+.|++.|++- ||+.||++++|||..|..- +...++|+...| .++|.|.
T Consensus 113 qnspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g-svklvvr 172 (207)
T KOG3550|consen 113 QNSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG-SVKLVVR 172 (207)
T ss_pred cCCceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC-cEEEEEe
Confidence 3568999999999999987 7999999999999998754 334455555444 4666654
No 66
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=95.57 E-value=0.18 Score=51.97 Aligned_cols=45 Identities=22% Similarity=0.386 Sum_probs=36.5
Q ss_pred EEEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEEEeHHHH
Q 012318 278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSA 323 (466)
Q Consensus 278 i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i 323 (466)
+..+..+..|+||+|++ .+|++||=++-...+.+..+|+|-++..
T Consensus 351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~M 395 (402)
T TIGR02860 351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWM 395 (402)
T ss_pred hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHH
Confidence 33345677899999999 7999999998888888889999965543
No 67
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.34 E-value=0.036 Score=48.13 Aligned_cols=33 Identities=36% Similarity=0.506 Sum_probs=23.4
Q ss_pred EEEcccCCCCCccceeecCCCeEEEEEEeEecC
Q 012318 278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA 310 (466)
Q Consensus 278 i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~ 310 (466)
...+..+.+|.||+|+||.+|++|||...+..-
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 344555779999999999999999999887653
No 68
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=95.32 E-value=0.064 Score=55.01 Aligned_cols=56 Identities=30% Similarity=0.520 Sum_probs=47.6
Q ss_pred ccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHHHHhcCCCCe---EEEEEEECCCeE
Q 012318 397 VVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEP---LKVVVQRANDQL 452 (466)
Q Consensus 397 ~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~g~~---v~l~v~R~~g~~ 452 (466)
.+..+|+|..+|+++||.|+++|++++.+|+++.+.+....+.. +.+.+.|.++..
T Consensus 135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~ 193 (375)
T COG0750 135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA 193 (375)
T ss_pred ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence 78899999999999999999999999999999988877655655 788888833443
No 69
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=95.27 E-value=0.56 Score=44.70 Aligned_cols=99 Identities=20% Similarity=0.283 Sum_probs=67.4
Q ss_pred CceEEEEccccccccccCCceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEE------EEEEEec--
Q 012318 136 PAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE------GTVLNAD-- 207 (466)
Q Consensus 136 ~SVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~------a~vv~~d-- 207 (466)
|-.+.|.. .|...|||++|+++ |||++-.|+.+.. .....+.+.+..++.+. -++...|
T Consensus 17 PWlA~IYv--------dG~~~CsgvLlD~~-WlLvsssCl~~I~----L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~ 83 (267)
T PF09342_consen 17 PWLADIYV--------DGRYWCSGVLLDPH-WLLVSSSCLRGIS----LSHHYVSALLGGGKTYLSVDGPHEQISRVDCF 83 (267)
T ss_pred cceeeEEE--------cCeEEEEEEEeccc-eEEEeccccCCcc----cccceEEEEecCcceecccCCChheEEEeeee
Confidence 66666665 45578999999998 9999999998743 11256777777776543 1233333
Q ss_pred ---CCCCEEEEEeCCCC----CCCccccCC-CCCCCCCCEEEEEecCC
Q 012318 208 ---FHSDIAIVKINSKT----PLPAAKLGT-SSKLCPGDWVVAMGCPH 247 (466)
Q Consensus 208 ---~~~DlAlLkv~~~~----~~~~~~l~~-s~~~~~G~~V~~iG~p~ 247 (466)
++.+++||.++.+. .+.|+-+.+ .......+.++++|...
T Consensus 84 ~~V~~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 84 KDVPESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeccccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 68899999999763 234444433 23444556899999876
No 70
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes. Togavirin is active only while part of the polyprotein: cleavage at a Trp-Ser bond results in total lack of activity. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=94.73 E-value=0.086 Score=45.26 Aligned_cols=39 Identities=18% Similarity=0.308 Sum_probs=30.5
Q ss_pred EEcccCCCCCccceeecCCCeEEEEEEeEecCCCeeeEE
Q 012318 279 QTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFA 317 (466)
Q Consensus 279 ~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~~g~~~a 317 (466)
.-...-.+|+||-|++|-.|+||||+..+..+.......
T Consensus 98 ip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaLS 136 (158)
T PF00944_consen 98 IPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTALS 136 (158)
T ss_dssp EETTS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEEE
T ss_pred eccCCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEEE
Confidence 335566899999999999999999999988766555443
No 71
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.40 E-value=0.088 Score=55.50 Aligned_cols=73 Identities=21% Similarity=0.428 Sum_probs=54.1
Q ss_pred ccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHHHHH--HHhcCCCCeEEEEEEECCCeEEEEEE
Q 012318 381 ERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIE--IMGDRVGEPLKVVVQRANDQLVTLTV 457 (466)
Q Consensus 381 ~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~--~l~~~~g~~v~l~v~R~~g~~~~l~v 457 (466)
..++.|.+-...++|..|.+++||+-. ||.||.|+-|||....+...... +|+ +.|+...|+|.|. +++++..
T Consensus 30 RDnPhf~~getSiViSDVlpGGPAeG~-LQenDrvvMVNGvsMenv~haFAvQqLr-ksgK~A~ItvkRp--rkvqvpa 104 (1027)
T KOG3580|consen 30 RDNPHFENGETSIVISDVLPGGPAEGL-LQENDRVVMVNGVSMENVLHAFAVQQLR-KSGKVAAITVKRP--RKVQVPA 104 (1027)
T ss_pred CCCCCccCCceeEEEeeccCCCCcccc-cccCCeEEEEcCcchhhhHHHHHHHHHH-hhccceeEEeccc--ceeeccc
Confidence 444566666678999999999999976 99999999999999877654332 333 4677788999883 3444433
No 72
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=94.14 E-value=0.041 Score=58.68 Aligned_cols=57 Identities=26% Similarity=0.447 Sum_probs=43.9
Q ss_pred CCcceeecccCCCChhhhCCCCCCCEEEEECCEecCCHHH--HHHHHhcCCCCeEEEEEEE
Q 012318 389 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR 447 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~--~~~~l~~~~g~~v~l~v~R 447 (466)
...|++|.+|.|+|.|++.|++.||.|++|||+...+..- ..++|.. +..++|+|.-
T Consensus 560 kGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKt 618 (1283)
T KOG3542|consen 560 KGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKT 618 (1283)
T ss_pred ccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEec
Confidence 4569999999999999999999999999999999877643 2334433 3456666653
No 73
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=92.98 E-value=0.15 Score=52.96 Aligned_cols=75 Identities=15% Similarity=0.421 Sum_probs=50.9
Q ss_pred ecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhh-CCCCCCCEEEEECCEecCCH--HHHHHHHhc-
Q 012318 360 VRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHL-AGFLPSDVVIKFDGKPVQSI--TEIIEIMGD- 435 (466)
Q Consensus 360 ~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~-aGl~~GD~I~~ing~~v~~~--~~~~~~l~~- 435 (466)
..+|||+.....+. . ....|++|.+|++++.-+. .-|.+||.|+.||.....++ ++....|.+
T Consensus 259 ~vnfLGiSivgqsn-------~------rgDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREa 325 (626)
T KOG3571|consen 259 TVNFLGISIVGQSN-------A------RGDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREA 325 (626)
T ss_pred ccccceeEeecccC-------c------CCCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHH
Confidence 46778887654221 0 2357999999999875444 45999999999999988765 455555554
Q ss_pred -CCCCeEEEEEEE
Q 012318 436 -RVGEPLKVVVQR 447 (466)
Q Consensus 436 -~~g~~v~l~v~R 447 (466)
....+++|+|-.
T Consensus 326 V~~~gPi~ltvAk 338 (626)
T KOG3571|consen 326 VSRPGPIKLTVAK 338 (626)
T ss_pred hccCCCeEEEEee
Confidence 222347888765
No 74
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=92.43 E-value=0.18 Score=54.30 Aligned_cols=56 Identities=20% Similarity=0.369 Sum_probs=43.3
Q ss_pred CcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCHHH--HHHHHhcCCCCeEEEEEEE
Q 012318 390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR 447 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~~~--~~~~l~~~~g~~v~l~v~R 447 (466)
..+++|..+.+++||.+-| ++.||+|++|||+..+.+.. ..+++++ |....+.++|
T Consensus 922 nM~LfVLRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~--gg~~vll~Lr 980 (984)
T KOG3209|consen 922 NMDLFVLRLAEDGPAIRDGRMRVGDQITEINGESTKGMTHDRAIELIKQ--GGRRVLLLLR 980 (984)
T ss_pred ccceEEEEeccCCCccccCceeecceEEEecCcccCCCcHHHHHHHHHh--CCeEEEEEec
Confidence 4578999999999999887 99999999999999988754 3444443 4444555555
No 75
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=92.07 E-value=0.32 Score=52.06 Aligned_cols=123 Identities=26% Similarity=0.403 Sum_probs=73.3
Q ss_pred CCCCCccceee-----cCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHHcCCccccccccccccccceeeeeeee
Q 012318 284 INAGNSGGPLV-----NIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGWMHVEQKVPLLWSTCKQVVILCRR 358 (466)
Q Consensus 284 i~~G~SGGPlv-----d~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~ 358 (466)
+-.-++|||.- |...+++.|+-.. =..+|.+....+++.+|+.- .+.+..-.|.-|+.-
T Consensus 677 iAnmm~~GpAarsgkLnIGDQiiaING~S-------LVGLPLstcQs~Ik~~KnQT------~VkltiV~cpPV~~V--- 740 (829)
T KOG3605|consen 677 IANMMHGGPAARSGKLNIGDQIMSINGTS-------LVGLPLSTCQSIIKGLKNQT------AVKLNIVSCPPVTTV--- 740 (829)
T ss_pred HHhcccCChhhhcCCccccceeEeecCce-------eccccHHHHHHHHhcccccc------eEEEEEecCCCceEE---
Confidence 33456777775 3334666665322 23489999999999888765 111111122222211
Q ss_pred eecccccceeecCCHHHHHHhhccCCCCCCCCcceeecccCCCChhhhCCCCCCCEEEEECCEecCC--HHHHHHHHhcC
Q 012318 359 VVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDR 436 (466)
Q Consensus 359 ~~~~~lG~~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~--~~~~~~~l~~~ 436 (466)
+-.-++.+.+|+. .+.+|+ |-+..-++-|++.|++.|-.|++|||+.|.- -+.+..+|...
T Consensus 741 ----------~I~RPd~kyQLGF------SVQNGi-ICSLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~a 803 (829)
T KOG3605|consen 741 ----------LIRRPDLRYQLGF------SVQNGI-ICSLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSNA 803 (829)
T ss_pred ----------Eeecccchhhccc------eeeCcE-eehhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHHh
Confidence 1112233333321 334675 6678899999999999999999999998743 24455666555
Q ss_pred CCC
Q 012318 437 VGE 439 (466)
Q Consensus 437 ~g~ 439 (466)
.|+
T Consensus 804 VGE 806 (829)
T KOG3605|consen 804 VGE 806 (829)
T ss_pred hhh
Confidence 553
No 76
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=92.02 E-value=0.17 Score=51.06 Aligned_cols=51 Identities=29% Similarity=0.455 Sum_probs=42.8
Q ss_pred CCCCCCCcceeecccCCCChhhhC-CCCCCCEEEEECCEecCCHHHHHHHHh
Q 012318 384 PSFPNVKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSITEIIEIMG 434 (466)
Q Consensus 384 ~~~~~~~~g~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~~~~~~~l~ 434 (466)
..|+....|+.|++|...||+..- ||++||+|+++||.+|++.+|..+-+.
T Consensus 213 sPfya~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~ 264 (484)
T KOG2921|consen 213 SPFYAHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLA 264 (484)
T ss_pred chhhhcCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHH
Confidence 355667889999999999998543 899999999999999999888766554
No 77
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=91.91 E-value=0.29 Score=49.99 Aligned_cols=71 Identities=21% Similarity=0.379 Sum_probs=52.9
Q ss_pred CCcceeecccCCCChhhhCCCCCC-CEEEEECCEecCCHHHH-HHHHhcCCCCeEEEEEEECCC-eEEEEEEEec
Q 012318 389 VKSGVLVPVVTPGSPAHLAGFLPS-DVVIKFDGKPVQSITEI-IEIMGDRVGEPLKVVVQRAND-QLVTLTVIPE 460 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aGl~~G-D~I~~ing~~v~~~~~~-~~~l~~~~g~~v~l~v~R~~g-~~~~l~v~~~ 460 (466)
...|.-|.+|.++|+|.++||.+= |-|++|||..+..-.|. ...|+.+..+ ++++|..... ..+.++|++.
T Consensus 13 gteg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps 86 (462)
T KOG3834|consen 13 GTEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPS 86 (462)
T ss_pred CceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecccceeEEEEeccc
Confidence 346888999999999999999997 89999999999865554 4455555444 9999986333 3455666654
No 78
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=90.64 E-value=0.5 Score=46.05 Aligned_cols=55 Identities=16% Similarity=0.273 Sum_probs=43.1
Q ss_pred cceeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEE
Q 012318 391 SGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ 446 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~ 446 (466)
.=++|..|..++||++-| ++.||.|++|||..|+.- .++.+++....+ .+++.+.
T Consensus 30 PClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~-eV~IhyN 87 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLN-EVKIHYN 87 (429)
T ss_pred CeEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhcc-ceEEEeh
Confidence 357888999999999887 999999999999999754 566777766443 3666664
No 79
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=90.03 E-value=0.65 Score=44.64 Aligned_cols=57 Identities=21% Similarity=0.454 Sum_probs=45.5
Q ss_pred CcceeecccCCCChhhhCC-CCCCCEEEEECCEec--CCHHHHHHHHhcCCCCeEEEEEEE
Q 012318 390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPV--QSITEIIEIMGDRVGEPLKVVVQR 447 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v--~~~~~~~~~l~~~~g~~v~l~v~R 447 (466)
..|+.|....+++.|+..| |...|.|++|||.+| ++.+++.+++-.+. ..+.++|+-
T Consensus 193 vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs-hNLIiTVkP 252 (358)
T KOG3606|consen 193 VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS-HNLIITVKP 252 (358)
T ss_pred cCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc-cceEEEecc
Confidence 4699999999999999999 567899999999998 56788988776532 235666654
No 80
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=89.72 E-value=0.57 Score=47.94 Aligned_cols=67 Identities=28% Similarity=0.503 Sum_probs=54.0
Q ss_pred ecccCCCChhhhCCCCC-CCEEEEECCEecCCHHHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEecC
Q 012318 395 VPVVTPGSPAHLAGFLP-SDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPEE 461 (466)
Q Consensus 395 V~~V~~~spA~~aGl~~-GD~I~~ing~~v~~~~~~~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~~ 461 (466)
|-+|.++|||+.|||++ +|.|+.+-+...+..+|+...+..+.++.+++-|+.-+ ...+++++++..
T Consensus 113 vl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~pn~ 181 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTITPNS 181 (462)
T ss_pred eeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceEEeeccc
Confidence 67899999999999994 59999995556677889998888888899999888522 345778888754
No 81
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=88.67 E-value=0.6 Score=46.40 Aligned_cols=56 Identities=18% Similarity=0.454 Sum_probs=47.3
Q ss_pred CcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEEE
Q 012318 390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ 446 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v~ 446 (466)
.-+++|+.+.++-.|+..| |-.||-|+.|||..|+.. +++..+|+. .|+.++|+|.
T Consensus 79 n~PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN-AGdeVtlTV~ 137 (505)
T KOG3549|consen 79 NLPVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN-AGDEVTLTVK 137 (505)
T ss_pred CccEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh-cCCEEEEEeH
Confidence 4589999999999999988 778999999999998764 677777764 7888888886
No 82
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=88.12 E-value=0.6 Score=47.07 Aligned_cols=58 Identities=17% Similarity=0.367 Sum_probs=44.4
Q ss_pred CCcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEE--EEEE
Q 012318 389 VKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKV--VVQR 447 (466)
Q Consensus 389 ~~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l--~v~R 447 (466)
....++|+.+.++-.|++.+ |..||.|++|||....+. ++..++|+. .|+.|.+ ++.|
T Consensus 108 NkMPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKr-aGkeV~levKy~R 170 (506)
T KOG3551|consen 108 NKMPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKR-AGKEVLLEVKYMR 170 (506)
T ss_pred cCCceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHh-hCceeeeeeeeeh
Confidence 45689999999999999986 999999999999987654 455556654 5666555 4456
No 83
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=87.45 E-value=0.97 Score=47.63 Aligned_cols=55 Identities=20% Similarity=0.349 Sum_probs=46.2
Q ss_pred ceeecccCCCChhhhCC-CCCCCEEEEECCEecCC--HHHHHHHHhcCCCCeEEEEEEE
Q 012318 392 GVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQR 447 (466)
Q Consensus 392 g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~--~~~~~~~l~~~~g~~v~l~v~R 447 (466)
-++|..+..|+-+.+.| |+.||.|.+|||..+.+ ..++.++|....| .+++.+.-
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~G-~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSRG-SITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCCC-cEEEEEcc
Confidence 58899999999999987 89999999999999976 5789998887554 47777764
No 84
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=86.50 E-value=0.93 Score=51.68 Aligned_cols=52 Identities=31% Similarity=0.544 Sum_probs=39.6
Q ss_pred eeecccCCCChhhhCCCCCCCEEEEECCEecCCH--HHHHHHHhcCCCCeEEEEE
Q 012318 393 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVV 445 (466)
Q Consensus 393 ~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~--~~~~~~l~~~~g~~v~l~v 445 (466)
=+|..|.++|||..+|+++||.|+.+||++|... .++.+.|.. .|..+.+.+
T Consensus 660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~-~gn~v~~~t 713 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK-SGNKVTLRT 713 (1205)
T ss_pred eeeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh-cCCeeEEEe
Confidence 3577899999999999999999999999999765 355555443 355555543
No 85
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=86.37 E-value=0.85 Score=50.95 Aligned_cols=57 Identities=25% Similarity=0.436 Sum_probs=45.5
Q ss_pred CcceeecccCCCChhhhCC-CCCCCEEEEECCEecCCHHH--HHHHHhcCCCCeEEEEEEE
Q 012318 390 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR 447 (466)
Q Consensus 390 ~~g~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~~~--~~~~l~~~~g~~v~l~v~R 447 (466)
.-|++|.+|.+|++|+.-| |+.||++++|||+..-.+.+ ..+ |....|..|.+.|..
T Consensus 959 klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~-lmtrtg~vV~leVaK 1018 (1629)
T KOG1892|consen 959 KLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAAR-LMTRTGNVVHLEVAK 1018 (1629)
T ss_pred ccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHH-HHhccCCeEEEehhh
Confidence 4599999999999998776 99999999999998766533 333 344578888898875
No 86
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=83.04 E-value=2.3 Score=45.81 Aligned_cols=55 Identities=16% Similarity=0.349 Sum_probs=40.1
Q ss_pred eeecccCCCChhhhCC-CCCCCEEEEECCEecCCH--HHHHHHHhc-CCCCeEEEEEEE
Q 012318 393 VLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGD-RVGEPLKVVVQR 447 (466)
Q Consensus 393 ~~V~~V~~~spA~~aG-l~~GD~I~~ing~~v~~~--~~~~~~l~~-~~g~~v~l~v~R 447 (466)
++|.....++||++.| |--||+|++|||...... ...+.+++. +.-..|+++|.+
T Consensus 675 VViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~ 733 (829)
T KOG3605|consen 675 VVIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS 733 (829)
T ss_pred HHHHhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence 5566778899999997 999999999999876542 455566665 333456666665
No 87
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein, while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=80.85 E-value=5.8 Score=32.95 Aligned_cols=53 Identities=28% Similarity=0.505 Sum_probs=32.6
Q ss_pred EEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCCCCCCccccCC
Q 012318 159 GAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGT 230 (466)
Q Consensus 159 GfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~l~~ 230 (466)
++-|. +|.++|+.|+++... .+ +|..+ +++. ..-|+++++.+... ++.+++++
T Consensus 3 avHIG-nG~~vt~tHva~~~~--------~v-----~g~~f--~~~~--~~ge~~~v~~~~~~-~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIG-NGRYVTVTHVAKSSD--------SV-----DGQPF--KIVK--TDGELCWVQSPLVH-LPAAQIGT 55 (105)
T ss_pred eEEeC-CCEEEEEEEEeccCc--------eE-----cCcCc--EEEE--eccCEEEEECCCCC-CCeeEecc
Confidence 45565 679999999998742 21 12222 2222 34499999998543 56666654
No 88
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes.
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=79.73 E-value=27 Score=39.44 Aligned_cols=163 Identities=18% Similarity=0.183 Sum_probs=73.4
Q ss_pred eEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCC--CcEEEEEEEEecCCCCEEEEEeCCC-CCCCccccCCC--
Q 012318 157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD--GRTFEGTVLNADFHSDIAIVKINSK-TPLPAAKLGTS-- 231 (466)
Q Consensus 157 GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~--g~~~~a~vv~~d~~~DlAlLkv~~~-~~~~~~~l~~s-- 231 (466)
|...+|++. ||+|.+|...+. -.|.|.. ...|...--.-++..|+.+-||..- ....|+.+...
T Consensus 67 G~aTLigpq-YiVSV~HN~~gy----------~~v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~t~~~~ 135 (769)
T PF02395_consen 67 GVATLIGPQ-YIVSVKHNGKGY----------NSVSFGNEGQNTYKIVDRNNYPSGDFHMPRLNKFVTEVAPAEMTTAGS 135 (769)
T ss_dssp SS-EEEETT-EEEBETTG-TSC----------CEECESCSSTCEEEEEEEEBETTSTEBEEEESS---SS----BBSSTT
T ss_pred ceEEEecCC-eEEEEEccCCCc----------CceeecccCCceEEEEEccCCCCcccceeecCceEEEEeccccccccc
Confidence 789999987 999999998442 2344544 3455332222334579999999852 22233332211
Q ss_pred -----CCCCCCCEEEEEecCC-------C--------CCCceEEeEEeeeecCccC-CCCCCccccEEEE----cccCCC
Q 012318 232 -----SKLCPGDWVVAMGCPH-------S--------LQNTVTAGIVSCVDRKSSD-LGLGGMRREYLQT----DCAINA 286 (466)
Q Consensus 232 -----~~~~~G~~V~~iG~p~-------~--------~~~~~t~G~Vs~~~~~~~~-~~~~~~~~~~i~~----~~~i~~ 286 (466)
.....-...+-+|... + .....+.|.+......... ............. .....+
T Consensus 136 ~~~~y~d~~rY~~f~R~GsG~Q~i~~~~g~~~~~~~~ay~yltgGt~~~~~~~~n~~~~~~~~~~~~~~~~~pL~n~~~~ 215 (769)
T PF02395_consen 136 DSNTYNDKERYPAFVRVGSGTQYIKDRNGNGTTILGGAYNYLTGGTVYNLPGYGNGSMILSGDLKKFNSYNGPLPNYGSP 215 (769)
T ss_dssp STTGGGHTTTC-EEEEEESSSEEEEECCEEEEEEEEETTSCEEEEEESSEEEEECTCEEEEESTTTCCCCCSSSBEB--T
T ss_pred cccccccchhchheeecCCceEEEEcCCCCeeEEEEeccceecCCccccccccccceEEEecccccccccCCcccccccc
Confidence 0111222334444221 1 1112333443221100000 0000000000111 223478
Q ss_pred CCccceee--cCC---CeEEEEEEeEecC--CCeeeEEEeHHHHHHHHHHH
Q 012318 287 GNSGGPLV--NID---GEIVGINIMKVAA--ADGLSFAVPIDSAAKIIEQF 330 (466)
Q Consensus 287 G~SGGPlv--d~~---G~VVGI~~~~~~~--~~g~~~aip~~~i~~~l~~l 330 (466)
|+||+||| |.. --++|+.+..... ..+....+|.+.+.+++++-
T Consensus 216 GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~~~~~~~~~~f~~~~~~~d 266 (769)
T PF02395_consen 216 GDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKGNWWNVIPPDFINQIKQND 266 (769)
T ss_dssp T-TT-EEEEEETTTTEEEEEEEEEEECCCCHSEEEEEEECHHHHHHHHHHC
T ss_pred CcCCCceEEEEccCCeEEEEEEEccccccCCccceeEEecHHHHHHHHhhh
Confidence 99999999 322 4699999876542 12345567777776666653
No 89
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=78.34 E-value=10 Score=32.62 Aligned_cols=35 Identities=20% Similarity=0.291 Sum_probs=27.5
Q ss_pred ccccEEEEcccCCCCCccceeecCCCeEEEEEEeEe
Q 012318 273 MRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKV 308 (466)
Q Consensus 273 ~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~~~~~ 308 (466)
....++....+..||+-||+|+ .+--||||++++-
T Consensus 76 ~Q~~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg 110 (127)
T PF00947_consen 76 YQYNLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGG 110 (127)
T ss_dssp EEECEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEE
T ss_pred eecCceeecccCCCCCCCceeE-eCCCeEEEEEeCC
Confidence 3445777788899999999999 8888999999863
No 90
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=71.02 E-value=9 Score=33.23 Aligned_cols=128 Identities=18% Similarity=0.215 Sum_probs=63.4
Q ss_pred eEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCC-CCCCccccCCCCCCC
Q 012318 157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSK-TPLPAAKLGTSSKLC 235 (466)
Q Consensus 157 GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~-~~~~~~~l~~s~~~~ 235 (466)
--|+.|+ |.+-|.+|=-... ++.-+.| +..-.+.+...|+..-..... ..+.+-.-+
T Consensus 14 fmgt~vn--GV~wT~~HGagsr-----------tlAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg------ 71 (148)
T PF02907_consen 14 FMGTCVN--GVMWTVYHGAGSR-----------TLAGPKG---PVNQMYTSVDDDLVGWPAPPGARSLTPCTCG------ 71 (148)
T ss_dssp EEEEEET--TEEEEEHHHHTTS-----------EEEBTTS---EB-ESEEETTTTEEEEE-STTB--BBB-SSS------
T ss_pred eehhEEc--cEEEEEEecCCcc-----------cccCCCC---cceEeEEcCCCCCcccccccccccCCccccC------
Confidence 4577775 7888999965431 1222222 122345677888888877642 223333222
Q ss_pred CCCEEEEEecCCCCCCceEEeEEeeeecCccCCCCCCccccEEE-EcccCCCCCccceeecCCCeEEEEEEeEecCC---
Q 012318 236 PGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQ-TDCAINAGNSGGPLVNIDGEIVGINIMKVAAA--- 311 (466)
Q Consensus 236 ~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~-~~~~i~~G~SGGPlvd~~G~VVGI~~~~~~~~--- 311 (466)
-..++++-.... +..++- ... ....++. .-.....|.||||++=.+|.+|||-....-..
T Consensus 72 -~~dlylVtr~~~----v~p~rr--~gd---------~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgva 135 (148)
T PF02907_consen 72 -SSDLYLVTRDAD----VIPVRR--RGD---------SRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVA 135 (148)
T ss_dssp -SSEEEEE-TTS-----EEEEEE--EST---------TEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEE
T ss_pred -CccEEEEeccCc----EeeeEE--cCC---------CceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCce
Confidence 146677744221 222211 000 0111111 11234689999999988899999987754321
Q ss_pred CeeeEEEeHHHH
Q 012318 312 DGLSFAVPIDSA 323 (466)
Q Consensus 312 ~g~~~aip~~~i 323 (466)
..+-| +|++.+
T Consensus 136 k~i~f-~P~e~l 146 (148)
T PF02907_consen 136 KAIDF-IPVETL 146 (148)
T ss_dssp EEEEE-EEHHHH
T ss_pred eeEEE-Eeeeec
Confidence 22344 587764
No 91
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=69.10 E-value=3.4 Score=42.48 Aligned_cols=24 Identities=33% Similarity=0.650 Sum_probs=21.3
Q ss_pred ccCCCCCccceeecCCCeEEEEEE
Q 012318 282 CAINAGNSGGPLVNIDGEIVGINI 305 (466)
Q Consensus 282 ~~i~~G~SGGPlvd~~G~VVGI~~ 305 (466)
.....|.||+.|+|.+|++|||.+
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 356789999999999999999975
No 92
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=64.31 E-value=7.3 Score=35.79 Aligned_cols=32 Identities=28% Similarity=0.240 Sum_probs=28.1
Q ss_pred CCCcceeecccCCCChhhhCCCCCCCEEEEEC
Q 012318 388 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFD 419 (466)
Q Consensus 388 ~~~~g~~V~~V~~~spA~~aGl~~GD~I~~in 419 (466)
.....++|++|..+|||+++|+.-|+.|+++-
T Consensus 119 ~e~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 119 EEGGKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred eeCCEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 44567899999999999999999999998873
No 93
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=63.75 E-value=10 Score=36.65 Aligned_cols=55 Identities=13% Similarity=0.269 Sum_probs=42.9
Q ss_pred eeecccCCCChhhhC-CCCCCCEEEEECCEecCCHH--HHHHHHhc-CCCCeEEEEEEE
Q 012318 393 VLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSIT--EIIEIMGD-RVGEPLKVVVQR 447 (466)
Q Consensus 393 ~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~~--~~~~~l~~-~~g~~v~l~v~R 447 (466)
..|..+.++|.-++. -++.||.|-+|||+.+-.+. ++.++|+. ..|++.+|.+..
T Consensus 151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie 209 (334)
T KOG3938|consen 151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE 209 (334)
T ss_pred eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence 457777888877665 48999999999999998875 55678877 678888777664
No 94
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=59.84 E-value=24 Score=25.77 Aligned_cols=33 Identities=27% Similarity=0.469 Sum_probs=29.2
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.|.+.+..+|...++.|-...
T Consensus 6 g~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~ 38 (63)
T cd00600 6 GKTVRVELKDGRVLEGVLVAFDKYMNLVLDDVE 38 (63)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence 368999999999999999999999999886664
No 95
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=57.25 E-value=24 Score=26.58 Aligned_cols=33 Identities=24% Similarity=0.278 Sum_probs=29.5
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.|.+++.++|...+|.|-...
T Consensus 10 ~~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 10 GRPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CCeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 368999999999999999999999999886664
No 96
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=56.93 E-value=26 Score=26.86 Aligned_cols=33 Identities=27% Similarity=0.491 Sum_probs=29.7
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.|.+.+.++|...++.|-...
T Consensus 14 ~k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 14 NSPVLVRLKGGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred CCEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence 368999999999999999999999999887765
No 97
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=56.49 E-value=27 Score=26.32 Aligned_cols=33 Identities=21% Similarity=0.359 Sum_probs=30.0
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
..++.|.+.+|+.+.+.+..+|...+|.|-...
T Consensus 10 ~~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (68)
T cd01731 10 NKPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAE 42 (68)
T ss_pred CCEEEEEECCCCEEEEEEEEECCcceEEEeeEE
Confidence 368999999999999999999999999988775
No 98
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=55.66 E-value=25 Score=26.64 Aligned_cols=33 Identities=21% Similarity=0.330 Sum_probs=29.3
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.|.+++..+|...+|.|=.+.
T Consensus 11 g~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 11 GKPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 368999999999999999999999999886654
No 99
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=55.53 E-value=25 Score=27.37 Aligned_cols=32 Identities=16% Similarity=0.365 Sum_probs=28.5
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEe
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI 217 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv 217 (466)
+..+.|.+.+|+.+.+++.++|...++.|=..
T Consensus 13 ~~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 13 GSRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CCEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 36899999999999999999999999987554
No 100
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.93 E-value=23 Score=27.93 Aligned_cols=32 Identities=22% Similarity=0.376 Sum_probs=28.3
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEe
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI 217 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv 217 (466)
...+.|.+.+|+.+.+.+.++|.+.+|.|=..
T Consensus 11 ~k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 11 DERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CCEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 36899999999999999999999999987544
No 101
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.44 E-value=34 Score=26.58 Aligned_cols=33 Identities=24% Similarity=0.404 Sum_probs=29.0
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.||+.+.+.+..+|...+|.|=...
T Consensus 10 ~~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 10 GRTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CCeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 368999999999999999999999999875554
No 102
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.86 E-value=36 Score=26.19 Aligned_cols=33 Identities=15% Similarity=0.231 Sum_probs=28.9
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
+.++.|.+.+|+.+.+++.++|...+|.|=...
T Consensus 10 ~k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (72)
T cd01719 10 DKKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV 42 (72)
T ss_pred CCeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence 468999999999999999999999999885553
No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.61 E-value=32 Score=26.86 Aligned_cols=33 Identities=36% Similarity=0.566 Sum_probs=29.1
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.+.+.+.++|...+|.|=...
T Consensus 10 ~~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 10 NYRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CCEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 468999999999999999999999999875554
No 104
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.51 E-value=34 Score=26.94 Aligned_cols=33 Identities=21% Similarity=0.301 Sum_probs=28.8
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
+.++.|.+.+|+.+.+.+.++|...+|.|=...
T Consensus 12 ~k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~~ 44 (81)
T cd01729 12 DKKIRVKFQGGREVTGILKGYDQLLNLVLDDTV 44 (81)
T ss_pred CCeEEEEECCCcEEEEEEEEEcCcccEEecCEE
Confidence 468999999999999999999999999875543
No 105
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.38 E-value=38 Score=26.27 Aligned_cols=32 Identities=28% Similarity=0.415 Sum_probs=28.6
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEe
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI 217 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv 217 (466)
+.++.|.+.+|+.+.+.+.++|+..+|.|=..
T Consensus 12 ~k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 12 DKKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CCEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 46899999999999999999999999988554
No 106
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=49.18 E-value=36 Score=27.28 Aligned_cols=32 Identities=16% Similarity=0.325 Sum_probs=28.8
Q ss_pred ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
..+.|.+.+++.+.+++.++|.+.+|.|=.+.
T Consensus 15 ~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 15 TQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred CEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 68999999999999999999999999976554
No 107
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=48.81 E-value=60 Score=24.24 Aligned_cols=33 Identities=24% Similarity=0.331 Sum_probs=29.4
Q ss_pred ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCC
Q 012318 187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS 219 (466)
Q Consensus 187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 219 (466)
..+.+..-.|..++++++.+|....+.+|+.++
T Consensus 7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 577788888999999999999999999999765
No 108
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=46.99 E-value=47 Score=25.32 Aligned_cols=33 Identities=18% Similarity=0.393 Sum_probs=29.9
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|..|.+++..+|...++.|-.+.
T Consensus 10 g~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 10 GHIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CCEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 368999999999999999999999999988774
No 109
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=46.26 E-value=49 Score=24.54 Aligned_cols=33 Identities=24% Similarity=0.462 Sum_probs=28.9
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.+.+.+..+|...++-|=...
T Consensus 8 ~~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~ 40 (67)
T smart00651 8 GKRVLVELKNGREYRGTLKGFDQFMNLVLEDVE 40 (67)
T ss_pred CcEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence 368999999999999999999999999876554
No 110
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=45.41 E-value=1.4e+02 Score=31.03 Aligned_cols=136 Identities=16% Similarity=0.209 Sum_probs=62.9
Q ss_pred ceeEEEEEeCCCeEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCCC--CCCCccccCCCC
Q 012318 155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSK--TPLPAAKLGTSS 232 (466)
Q Consensus 155 ~~GSGfiI~~~G~ILT~aHvv~~~~~~~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~~--~~~~~~~l~~s~ 232 (466)
+.|=||.|+++ +++|+-||+.... .++ | | .+-.-+..+..-++.-+++..+ .+++-+-|. .
T Consensus 379 GsGWGfWVS~~-lfITttHViP~g~-------~E~---F--G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--e 441 (535)
T PF05416_consen 379 GSGWGFWVSPT-LFITTTHVIPPGA-------KEA---F--G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--E 441 (535)
T ss_dssp TTEEEEESSSS-EEEEEGGGS-STT-------SEE---T--T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---S
T ss_pred CCceeeeecce-EEEEeeeecCCcc-------hhh---h--C--CChhHeEEeeccceEEEecCCCCCCCccceeec--c
Confidence 55789999998 9999999997632 111 0 0 0111123344446777777754 234444442 2
Q ss_pred CCCCCCEEEE-EecCCCC--CCceEEeEEeeeecCccCCCCCCccccEEEE-------cccCCCCCccceeecCCC---e
Q 012318 233 KLCPGDWVVA-MGCPHSL--QNTVTAGIVSCVDRKSSDLGLGGMRREYLQT-------DCAINAGNSGGPLVNIDG---E 299 (466)
Q Consensus 233 ~~~~G~~V~~-iG~p~~~--~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPlvd~~G---~ 299 (466)
....|.-+.+ |=.+.|. +..+..|........-... .....++.+ |-.+.||+-|-|-+-..| -
T Consensus 442 GapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v---~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~V 518 (535)
T PF05416_consen 442 GAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRTV---HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWV 518 (535)
T ss_dssp S--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEE---EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEE
T ss_pred CCCCceEEEEEEEcCCccchhhhhhhccceeEEEcceee---cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEE
Confidence 3344544332 3334332 2234444433322110000 001112222 334678999999995555 5
Q ss_pred EEEEEEeEecC
Q 012318 300 IVGINIMKVAA 310 (466)
Q Consensus 300 VVGI~~~~~~~ 310 (466)
|+|+|.....+
T Consensus 519 V~GVH~AAtr~ 529 (535)
T PF05416_consen 519 VIGVHAAATRS 529 (535)
T ss_dssp EEEEEEEE-SS
T ss_pred EEEEEehhccC
Confidence 89999886543
No 111
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.22 E-value=50 Score=25.40 Aligned_cols=33 Identities=24% Similarity=0.330 Sum_probs=29.2
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
+.++.|.+.+|+.+.+.+.++|...++.|=...
T Consensus 9 ~~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 9 NKTVSVITVDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CCEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence 368999999999999999999999999887654
No 112
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=43.47 E-value=61 Score=24.04 Aligned_cols=34 Identities=26% Similarity=0.560 Sum_probs=30.5
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS 219 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 219 (466)
...++|.+.+|+.+.+.+..+|...++.|-....
T Consensus 8 g~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 8 GKRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp TSEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred CcEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 3689999999999999999999999998887764
No 113
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=40.78 E-value=57 Score=25.39 Aligned_cols=33 Identities=24% Similarity=0.485 Sum_probs=29.4
Q ss_pred ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeCC
Q 012318 187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS 219 (466)
Q Consensus 187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 219 (466)
..+.|.+.+|+.|.+++.++|...++.|--+..
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 689999999999999999999999998876553
No 114
>PF00571 CBS: CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=40.76 E-value=29 Score=24.39 Aligned_cols=21 Identities=43% Similarity=0.566 Sum_probs=17.5
Q ss_pred CCCccceeecCCCeEEEEEEe
Q 012318 286 AGNSGGPLVNIDGEIVGINIM 306 (466)
Q Consensus 286 ~G~SGGPlvd~~G~VVGI~~~ 306 (466)
.+.+.-|++|.+|+++|+++.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 456778999999999999764
No 115
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=40.55 E-value=31 Score=32.46 Aligned_cols=55 Identities=25% Similarity=0.488 Sum_probs=37.9
Q ss_pred ccEEEEcccCCCCCccceeecCC----CeEEEEEEeEecCCCeeeEEEeH--HHHHHHHHHH
Q 012318 275 REYLQTDCAINAGNSGGPLVNID----GEIVGINIMKVAAADGLSFAVPI--DSAAKIIEQF 330 (466)
Q Consensus 275 ~~~i~~~~~i~~G~SGGPlvd~~----G~VVGI~~~~~~~~~g~~~aip~--~~i~~~l~~l 330 (466)
+..++...+...|+=|||++-.+ -+++||+..+.. ..+.+||-++ +.+.+-+..|
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~-~~~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA-NHAMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc-cccceehhhhhHHHHHHHHHhh
Confidence 34678888999999999999333 699999998753 2356676544 3344444444
No 116
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=37.19 E-value=84 Score=24.30 Aligned_cols=33 Identities=24% Similarity=0.448 Sum_probs=29.7
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|+.+.+.+..+|...++.+-.+.
T Consensus 11 g~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 11 NHPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 368999999999999999999999999987664
No 117
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=36.43 E-value=54 Score=35.66 Aligned_cols=37 Identities=19% Similarity=0.248 Sum_probs=32.0
Q ss_pred cceeecccCCCChhhhC-CCCCCCEEEEECCEecCCHH
Q 012318 391 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSIT 427 (466)
Q Consensus 391 ~g~~V~~V~~~spA~~a-Gl~~GD~I~~ing~~v~~~~ 427 (466)
+-.+|+++.++|||+.. -|..||.|+.||++.+..|+
T Consensus 225 g~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 225 GPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred CceeccccccCChHHHhhcccCccceeeecccccccch
Confidence 44567889999999877 49999999999999998884
No 118
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=35.95 E-value=47 Score=25.51 Aligned_cols=32 Identities=28% Similarity=0.535 Sum_probs=25.0
Q ss_pred ceeecCCCeEEEEEEeEecCCCeeeEEEeHHHHHHHHHHHHH
Q 012318 291 GPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKK 332 (466)
Q Consensus 291 GPlvd~~G~VVGI~~~~~~~~~g~~~aip~~~i~~~l~~l~~ 332 (466)
-|+++.+|+++|++.. .+.++.+.++++++.-
T Consensus 19 ~pi~~~~g~~~Gvv~~----------di~l~~l~~~i~~~~~ 50 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGI----------DISLDQLSEIISNIKF 50 (81)
T ss_dssp EEEEETTTEEEEEEEE----------EEEHHHHHHHHTTSBB
T ss_pred EEEECCCCCEEEEEEE----------EeccceeeeEEEeeEE
Confidence 5888889999999864 4788888887776543
No 119
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.32 E-value=1.2e+02 Score=23.88 Aligned_cols=33 Identities=24% Similarity=0.368 Sum_probs=29.7
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|..|.+++..+|...++-+-.+.
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~ 43 (81)
T cd01725 11 GKEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS 43 (81)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence 368999999999999999999999999887765
No 120
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=29.99 E-value=30 Score=34.37 Aligned_cols=30 Identities=33% Similarity=0.595 Sum_probs=26.7
Q ss_pred ecccCCCChhhhCCCCCCCEEEEECCEecC
Q 012318 395 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 424 (466)
Q Consensus 395 V~~V~~~spA~~aGl~~GD~I~~ing~~v~ 424 (466)
+..|.+.+||+++|.-.||.|+.+|+-++.
T Consensus 67 ~lrv~~~~~~e~~~~~~~dyilg~n~Dp~~ 96 (417)
T COG5233 67 VLRVNPESPAEKAGMVVGDYILGINEDPLR 96 (417)
T ss_pred heeccccChhHhhccccceeEEeecCCcHH
Confidence 556788999999999999999999987764
No 121
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=29.96 E-value=1.4e+02 Score=23.35 Aligned_cols=32 Identities=25% Similarity=0.403 Sum_probs=29.2
Q ss_pred ceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 187 GKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 187 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
..+.|.+.+|..|.+++..+|...++-|-.+.
T Consensus 20 ~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~ 51 (78)
T cd01733 20 KVVTVELRNETTVTGRIASVDAFMNIRLAKVT 51 (78)
T ss_pred CEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence 68999999999999999999999999887765
No 122
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=29.32 E-value=55 Score=27.31 Aligned_cols=18 Identities=28% Similarity=0.695 Sum_probs=13.6
Q ss_pred ceeecCCCeEEEEEEeEe
Q 012318 291 GPLVNIDGEIVGINIMKV 308 (466)
Q Consensus 291 GPlvd~~G~VVGI~~~~~ 308 (466)
.|++|.+|++||++..+.
T Consensus 94 ~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEE-TTS-EEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEE
Confidence 599999999999998753
No 123
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.90 E-value=1.4e+02 Score=24.04 Aligned_cols=33 Identities=18% Similarity=0.383 Sum_probs=29.9
Q ss_pred CceEEEEeCCCcEEEEEEEEecCCCCEEEEEeC
Q 012318 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (466)
Q Consensus 186 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLkv~ 218 (466)
...+.|.+.+|..|.+.+..+|...++.|-.+.
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~a~ 43 (90)
T cd01724 11 NETVTIELKNGTIVHGTITGVDPSMNTHLKNVK 43 (90)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCceeEEEEEEE
Confidence 368999999999999999999999999988765
No 124
>PF14438 SM-ATX: Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=27.31 E-value=1.5e+02 Score=22.79 Aligned_cols=28 Identities=29% Similarity=0.490 Sum_probs=21.1
Q ss_pred ceEEEEeCCCcEEEEEEEEecC---CCCEEE
Q 012318 187 GKVDVTLQDGRTFEGTVLNADF---HSDIAI 214 (466)
Q Consensus 187 ~~i~V~~~~g~~~~a~vv~~d~---~~DlAl 214 (466)
..+.|+..||..|++-+...++ +.+++|
T Consensus 13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL 43 (77)
T PF14438_consen 13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL 43 (77)
T ss_dssp SEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred CEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence 6899999999999999999988 566665
No 125
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=23.63 E-value=55 Score=34.93 Aligned_cols=30 Identities=30% Similarity=0.452 Sum_probs=23.9
Q ss_pred eecccCCCChhhhCCCCCCCEEEEECCEecC
Q 012318 394 LVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 424 (466)
Q Consensus 394 ~V~~V~~~spA~~aGl~~GD~I~~ing~~v~ 424 (466)
.|.-..+|.|.-.| .+|||||++.||+.|+
T Consensus 301 ~vl~~~ENm~~g~A-~rPGDVits~~GkTVE 330 (485)
T COG0260 301 GVLPAVENMPSGNA-YRPGDVITSMNGKTVE 330 (485)
T ss_pred EEEeeeccCCCCCC-CCCCCeEEecCCcEEE
Confidence 34455677787777 9999999999998763
Done!