Query 013444
Match_columns 443
No_of_seqs 403 out of 2973
Neff 7.8
Searched_HMMs 46136
Date Fri Mar 29 03:55:04 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013444.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013444hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.6E-50 7.8E-55 415.6 38.8 296 126-439 41-361 (455)
2 TIGR02038 protease_degS peripl 100.0 3.8E-49 8.3E-54 396.7 39.3 297 125-440 45-350 (351)
3 PRK10898 serine endoprotease; 100.0 6.9E-49 1.5E-53 394.6 39.1 296 126-440 46-351 (353)
4 PRK10942 serine endoprotease; 100.0 2.4E-47 5.3E-52 396.3 38.0 295 126-438 39-381 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 3.2E-47 6.9E-52 393.5 38.3 295 126-438 2-327 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 4.3E-37 9.4E-42 309.8 33.3 295 125-438 33-340 (347)
7 KOG1320 Serine protease [Postt 100.0 2.1E-29 4.5E-34 254.9 24.0 318 124-442 127-472 (473)
8 KOG1421 Predicted signaling-as 99.9 1.5E-25 3.3E-30 228.7 22.7 306 124-440 51-373 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 1.9E-16 4E-21 134.0 11.7 117 157-303 1-120 (120)
10 KOG1421 Predicted signaling-as 99.6 2E-14 4.3E-19 148.0 21.9 290 131-439 524-832 (955)
11 PF13180 PDZ_2: PDZ domain; PD 99.6 4E-14 8.7E-19 112.6 11.1 81 339-436 1-82 (82)
12 PF00089 Trypsin: Trypsin; In 99.5 3.4E-12 7.4E-17 118.8 17.9 177 135-327 12-220 (220)
13 cd00190 Tryp_SPc Trypsin-like 99.4 2.3E-11 5E-16 114.0 18.9 181 134-328 11-230 (232)
14 cd00987 PDZ_serine_protease PD 99.4 5.5E-12 1.2E-16 101.5 11.2 88 339-433 1-89 (90)
15 cd00991 PDZ_archaeal_metallopr 99.3 4.7E-11 1E-15 94.3 10.6 68 367-435 9-77 (79)
16 cd00990 PDZ_glycyl_aminopeptid 99.2 7.3E-11 1.6E-15 93.0 10.3 68 367-437 11-78 (80)
17 smart00020 Tryp_SPc Trypsin-li 99.2 2.6E-10 5.6E-15 107.2 16.0 161 134-308 12-208 (229)
18 TIGR01713 typeII_sec_gspC gene 99.2 5.9E-11 1.3E-15 114.4 11.5 99 321-435 159-258 (259)
19 KOG1320 Serine protease [Postt 99.2 5E-11 1.1E-15 121.7 10.8 273 132-424 57-349 (473)
20 cd00989 PDZ_metalloprotease PD 99.2 1.1E-10 2.4E-15 91.7 9.9 67 368-435 12-78 (79)
21 cd00986 PDZ_LON_protease PDZ d 99.2 2E-10 4.4E-15 90.4 10.8 70 368-439 8-78 (79)
22 cd00988 PDZ_CTP_protease PDZ d 99.1 4.5E-10 9.7E-15 89.5 9.9 71 367-437 12-84 (85)
23 TIGR02037 degP_htrA_DO peripla 99.0 2.7E-09 5.9E-14 110.8 11.5 90 338-433 337-427 (428)
24 cd00136 PDZ PDZ domain, also c 99.0 2.2E-09 4.8E-14 82.2 7.8 56 368-423 13-70 (70)
25 TIGR00054 RIP metalloprotease 98.8 1.6E-08 3.5E-13 104.6 10.0 70 368-438 203-272 (420)
26 COG3591 V8-like Glu-specific e 98.7 1.9E-07 4E-12 88.7 13.6 160 155-331 64-250 (251)
27 PRK10779 zinc metallopeptidase 98.7 4.9E-08 1.1E-12 101.9 10.3 69 368-437 221-289 (449)
28 PRK10779 zinc metallopeptidase 98.7 2.7E-08 5.9E-13 103.8 7.9 67 370-437 128-195 (449)
29 TIGR00225 prc C-terminal pepti 98.6 1.4E-07 3E-12 94.9 9.8 67 368-435 62-130 (334)
30 cd00992 PDZ_signaling PDZ doma 98.6 2.5E-07 5.4E-12 72.8 8.6 54 368-422 26-81 (82)
31 PRK10942 serine endoprotease; 98.6 2E-07 4.2E-12 97.8 10.2 65 368-434 408-472 (473)
32 PRK10139 serine endoprotease; 98.6 2E-07 4.4E-12 97.2 9.8 65 368-434 390-454 (455)
33 smart00228 PDZ Domain present 98.6 3.2E-07 6.8E-12 72.5 8.2 57 368-424 26-83 (85)
34 TIGR02860 spore_IV_B stage IV 98.5 3.5E-07 7.6E-12 92.5 10.2 70 367-437 104-181 (402)
35 PLN00049 carboxyl-terminal pro 98.5 4.5E-07 9.7E-12 93.0 10.0 68 368-436 102-171 (389)
36 PF00595 PDZ: PDZ domain (Also 98.5 2.8E-07 6.2E-12 72.7 6.4 70 339-423 10-81 (81)
37 TIGR03279 cyano_FeS_chp putati 98.5 3E-07 6.6E-12 93.6 7.7 63 372-438 2-65 (433)
38 KOG3627 Trypsin [Amino acid tr 98.3 2.9E-05 6.4E-10 74.5 17.6 180 136-329 25-252 (256)
39 COG0793 Prc Periplasmic protea 98.3 3.5E-06 7.5E-11 86.6 10.6 70 368-437 112-184 (406)
40 PF14685 Tricorn_PDZ: Tricorn 98.3 4.7E-06 1E-10 66.8 8.8 67 367-433 11-87 (88)
41 PF00863 Peptidase_C4: Peptida 98.3 3.4E-05 7.4E-10 72.8 15.7 165 133-321 15-185 (235)
42 TIGR00054 RIP metalloprotease 98.2 1.8E-06 3.8E-11 89.5 6.6 63 368-432 128-190 (420)
43 KOG3129 26S proteasome regulat 98.1 9.3E-06 2E-10 73.9 7.5 72 369-441 140-214 (231)
44 PF04495 GRASP55_65: GRASP55/6 98.0 1.6E-05 3.6E-10 69.3 7.6 72 367-438 42-115 (138)
45 PRK11186 carboxy-terminal prot 98.0 3.2E-05 6.9E-10 83.8 10.5 70 368-437 255-334 (667)
46 COG3480 SdrC Predicted secrete 97.9 3.8E-05 8.3E-10 74.3 8.8 68 368-436 130-198 (342)
47 PRK09681 putative type II secr 97.8 4.6E-05 1E-09 73.6 7.5 61 375-436 211-275 (276)
48 COG3975 Predicted protease wit 97.8 4.3E-05 9.2E-10 78.7 6.5 64 366-438 460-524 (558)
49 PF12812 PDZ_1: PDZ-like domai 97.7 0.00018 3.9E-09 56.4 7.7 68 339-414 9-76 (78)
50 COG5640 Secreted trypsin-like 97.5 0.0017 3.7E-08 64.1 13.5 51 282-332 223-279 (413)
51 PF03761 DUF316: Domain of unk 97.5 0.009 1.9E-07 58.4 18.2 177 134-325 52-273 (282)
52 PF05579 Peptidase_S32: Equine 97.4 0.00097 2.1E-08 63.2 10.0 116 153-307 110-228 (297)
53 KOG3553 Tax interaction protei 97.4 0.00015 3.4E-09 58.2 3.7 36 365-400 56-91 (124)
54 COG3031 PulC Type II secretory 97.4 0.0004 8.6E-09 64.7 6.2 67 368-435 207-274 (275)
55 PF05580 Peptidase_S55: SpoIVB 96.8 0.02 4.2E-07 53.2 12.1 160 154-322 19-214 (218)
56 KOG3580 Tight junction protein 96.7 0.0022 4.7E-08 66.8 5.1 59 366-424 427-488 (1027)
57 KOG3532 Predicted protein kina 96.5 0.006 1.3E-07 64.5 6.8 57 367-424 397-453 (1051)
58 PF10459 Peptidase_S46: Peptid 96.4 0.015 3.3E-07 63.6 9.6 24 155-178 47-70 (698)
59 PF08192 Peptidase_S64: Peptid 96.3 0.042 9E-07 58.6 11.7 117 208-330 541-688 (695)
60 PF10459 Peptidase_S46: Peptid 96.2 0.0075 1.6E-07 66.0 6.0 56 276-331 622-687 (698)
61 PF00548 Peptidase_C3: 3C cyst 96.1 0.32 6.8E-06 44.2 15.4 149 135-307 12-170 (172)
62 KOG3550 Receptor targeting pro 95.9 0.03 6.5E-07 48.7 6.9 58 365-423 112-172 (207)
63 KOG3209 WW domain-containing p 95.5 0.023 4.9E-07 60.6 5.6 52 372-424 782-836 (984)
64 COG0750 Predicted membrane-ass 95.4 0.056 1.2E-06 55.0 8.4 56 374-429 135-193 (375)
65 PF00949 Peptidase_S7: Peptida 95.4 0.047 1E-06 47.1 6.4 33 279-311 89-121 (132)
66 KOG3209 WW domain-containing p 95.0 0.067 1.5E-06 57.2 7.4 60 365-424 671-734 (984)
67 KOG3552 FERM domain protein FR 94.6 0.061 1.3E-06 59.0 5.9 55 368-424 75-131 (1298)
68 TIGR02860 spore_IV_B stage IV 94.4 0.34 7.3E-06 49.7 10.6 46 278-324 351-396 (402)
69 KOG3571 Dishevelled 3 and rela 94.4 0.079 1.7E-06 54.6 6.0 58 366-424 275-338 (626)
70 PF09342 DUF1986: Domain of un 94.3 0.57 1.2E-05 44.4 10.9 100 135-247 16-131 (267)
71 KOG3542 cAMP-regulated guanine 94.1 0.043 9.4E-07 58.2 3.5 56 367-424 561-618 (1283)
72 KOG3580 Tight junction protein 94.0 0.14 3E-06 53.9 6.9 63 361-424 33-96 (1027)
73 KOG3605 Beta amyloid precursor 94.0 0.096 2.1E-06 55.5 5.7 116 287-416 680-806 (829)
74 PF00944 Peptidase_S3: Alphavi 94.0 0.17 3.6E-06 43.4 6.1 33 281-313 100-132 (158)
75 PF02122 Peptidase_S39: Peptid 93.9 0.31 6.7E-06 45.3 8.5 117 167-307 43-166 (203)
76 KOG3834 Golgi reassembly stack 92.3 0.22 4.7E-06 50.6 5.2 70 367-437 14-86 (462)
77 KOG3606 Cell polarity protein 91.5 0.51 1.1E-05 45.1 6.4 57 367-424 193-252 (358)
78 KOG3549 Syntrophins (type gamm 91.1 0.23 5E-06 49.0 3.8 56 367-423 79-137 (505)
79 KOG3651 Protein kinase C, alph 90.5 0.54 1.2E-05 45.6 5.7 55 369-424 31-88 (429)
80 KOG3834 Golgi reassembly stack 90.5 0.45 9.7E-06 48.4 5.3 68 372-439 113-182 (462)
81 KOG2921 Intramembrane metallop 89.8 0.46 9.9E-06 47.8 4.6 50 362-411 214-264 (484)
82 KOG1892 Actin filament-binding 89.3 0.55 1.2E-05 52.1 5.1 59 365-424 957-1018(1629)
83 KOG3551 Syntrophins (type beta 88.8 0.48 1E-05 47.5 4.0 58 365-423 107-167 (506)
84 KOG0609 Calcium/calmodulin-dep 85.9 1.7 3.6E-05 45.6 6.1 55 369-424 147-204 (542)
85 KOG0606 Microtubule-associated 85.6 1.3 2.7E-05 50.4 5.4 52 370-422 660-713 (1205)
86 PF02907 Peptidase_S29: Hepati 79.0 4.2 9.2E-05 35.0 5.0 39 284-323 105-146 (148)
87 PF03510 Peptidase_C24: 2C end 73.1 11 0.00023 31.2 5.7 53 159-230 3-55 (105)
88 PF02395 Peptidase_S6: Immunog 71.4 41 0.00088 37.8 11.5 52 157-219 67-120 (769)
89 KOG3605 Beta amyloid precursor 71.1 3.6 7.8E-05 44.1 3.1 50 375-424 680-733 (829)
90 KOG3938 RGS-GAIP interacting p 66.4 6.8 0.00015 37.6 3.6 55 370-424 151-209 (334)
91 PF00947 Pico_P2A: Picornaviru 65.1 26 0.00057 29.9 6.6 33 275-308 78-110 (127)
92 PF01732 DUF31: Putative pepti 62.4 5.4 0.00012 40.7 2.4 24 282-305 350-373 (374)
93 cd00600 Sm_like The eukaryotic 59.4 26 0.00055 25.5 5.1 33 186-218 6-38 (63)
94 PRK00737 small nuclear ribonuc 54.7 30 0.00066 26.4 4.9 33 186-218 14-46 (72)
95 cd01731 archaeal_Sm1 The archa 54.5 31 0.00068 25.9 4.9 33 186-218 10-42 (68)
96 cd01726 LSm6 The eukaryotic Sm 54.4 29 0.00063 26.0 4.7 33 186-218 10-42 (67)
97 PF11874 DUF3394: Domain of un 54.0 19 0.00041 32.9 4.2 29 367-395 121-149 (183)
98 cd06168 LSm9 The eukaryotic Sm 53.8 32 0.00069 26.6 4.9 33 186-218 10-42 (75)
99 cd01722 Sm_F The eukaryotic Sm 53.4 29 0.00062 26.1 4.5 33 186-218 11-43 (68)
100 cd01732 LSm5 The eukaryotic Sm 53.4 29 0.00063 26.9 4.6 32 186-217 13-44 (76)
101 cd01730 LSm3 The eukaryotic Sm 52.1 27 0.00059 27.3 4.4 32 186-217 11-42 (82)
102 cd01717 Sm_B The eukaryotic Sm 52.0 32 0.00068 26.7 4.7 33 186-218 10-42 (79)
103 cd01729 LSm7 The eukaryotic Sm 49.6 39 0.00084 26.5 4.8 33 186-218 12-44 (81)
104 cd01719 Sm_G The eukaryotic Sm 48.3 44 0.00095 25.5 4.8 33 186-218 10-42 (72)
105 cd01721 Sm_D3 The eukaryotic S 47.8 49 0.0011 25.0 5.0 33 186-218 10-42 (70)
106 cd01728 LSm1 The eukaryotic Sm 47.0 46 0.001 25.6 4.8 32 186-217 12-43 (74)
107 cd01720 Sm_D2 The eukaryotic S 46.8 44 0.00096 26.7 4.8 33 186-218 14-46 (87)
108 smart00651 Sm snRNP Sm protein 46.4 49 0.0011 24.4 4.8 33 186-218 8-40 (67)
109 cd01735 LSm12_N LSm12 belongs 45.9 72 0.0016 23.7 5.4 34 186-219 6-39 (61)
110 PF01423 LSM: LSM domain ; In 45.2 41 0.00089 24.8 4.2 34 186-219 8-41 (67)
111 cd01727 LSm8 The eukaryotic Sm 41.6 59 0.0013 24.9 4.7 33 186-218 9-41 (74)
112 PF05416 Peptidase_C37: Southa 39.1 2.7E+02 0.0059 29.0 10.0 134 155-309 379-528 (535)
113 PF00571 CBS: CBS domain CBS d 36.9 34 0.00075 23.8 2.6 19 287-305 29-47 (57)
114 cd01723 LSm4 The eukaryotic Sm 36.1 91 0.002 24.0 5.0 33 186-218 11-43 (76)
115 PF12381 Peptidase_C3G: Tungro 35.7 42 0.00092 31.4 3.5 56 275-331 168-229 (231)
116 COG1958 LSM1 Small nuclear rib 35.7 76 0.0017 24.5 4.5 33 186-218 17-49 (79)
117 KOG1738 Membrane-associated gu 32.5 59 0.0013 35.2 4.4 37 368-404 225-262 (638)
118 PF02743 Cache_1: Cache domain 32.1 58 0.0013 24.8 3.3 32 290-331 18-49 (81)
119 COG0260 PepB Leucyl aminopepti 31.2 45 0.00097 35.3 3.3 32 369-401 299-330 (485)
120 PF14438 SM-ATX: Ataxin 2 SM d 30.9 1.3E+02 0.0028 23.0 5.1 29 186-214 12-43 (77)
121 cd01725 LSm2 The eukaryotic Sm 30.7 1.2E+02 0.0026 23.6 4.9 33 186-218 11-43 (81)
122 cd01733 LSm10 The eukaryotic S 30.1 1.4E+02 0.003 23.2 5.1 33 186-218 19-51 (78)
123 PF14827 Cache_3: Sensory doma 29.7 57 0.0012 27.0 3.1 18 291-308 94-111 (116)
124 cd05701 S1_Rrp5_repeat_hs10 S1 29.3 38 0.00081 25.4 1.6 33 210-242 13-54 (69)
125 cd01724 Sm_D1 The eukaryotic S 29.2 1.3E+02 0.0028 24.1 4.9 33 186-218 11-43 (90)
126 PF05578 Peptidase_S31: Pestiv 25.9 1.7E+02 0.0036 26.0 5.3 73 233-307 108-182 (211)
127 PF09465 LBR_tudor: Lamin-B re 23.9 2.9E+02 0.0063 20.1 5.7 35 186-220 9-44 (55)
128 PF09122 DUF1930: Domain of un 22.2 2.3E+02 0.0049 21.2 4.5 44 389-434 19-64 (68)
129 PRK05015 aminopeptidase B; Pro 22.2 90 0.0019 32.4 3.5 29 372-401 240-268 (424)
130 PRK00913 multifunctional amino 21.1 91 0.002 33.1 3.3 29 372-401 303-331 (483)
131 cd00433 Peptidase_M17 Cytosol 20.9 89 0.0019 33.1 3.2 29 372-401 289-317 (468)
132 cd04627 CBS_pair_14 The CBS do 20.6 72 0.0016 25.9 2.1 20 287-306 98-117 (123)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3.6e-50 Score=415.62 Aligned_cols=296 Identities=33% Similarity=0.563 Sum_probs=259.8
Q ss_pred hHHHHHHHhCCceEEEEeccc----------ccccc----------cCCcEEEEEEEeC-CCEEEeccccccCCCCCCCC
Q 013444 126 TIANAAARVCPAVVNLSAPRE----------FLGIL----------SGRGIGSGAIVDA-DGTILTCAHVVVDFHGSRAL 184 (443)
Q Consensus 126 ~~~~~~~~~~pSVV~I~~~~~----------~~~~~----------~~~~~GSGfiI~~-~G~ILTaaHvv~~~~~~~~~ 184 (443)
++.++++++.||||.|.+... +..++ ...+.||||||++ +||||||+|||.++
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a------ 114 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQA------ 114 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCC------
Confidence 499999999999999987421 01111 1236899999985 69999999999985
Q ss_pred CCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecC
Q 013444 185 PKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRK 264 (443)
Q Consensus 185 ~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~ 264 (443)
..+.|++.|++.++|++++.|+.+||||||++.+..+++++|+++..+++|++|+++|+|+++..+++.|+|++..+.
T Consensus 115 --~~i~V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~ 192 (455)
T PRK10139 115 --QKISIQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRS 192 (455)
T ss_pred --CEEEEEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcccccc
Confidence 589999999999999999999999999999986678999999999999999999999999999999999999988775
Q ss_pred ccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEEeHHHHHHHHHHHHHcCceeeeec
Q 013444 265 SSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGRVVRPWL 341 (443)
Q Consensus 265 ~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~l 341 (443)
.... ..+..+|++|+.+++|+|||||||.+|+||||+++.... ..+++|+||++.+++++++|+++|++.|+||
T Consensus 193 ~~~~---~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~L 269 (455)
T PRK10139 193 GLNL---EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLL 269 (455)
T ss_pred ccCC---CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccce
Confidence 3211 123578999999999999999999999999999987643 3579999999999999999999999999999
Q ss_pred CceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEE
Q 013444 342 GLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKV 420 (443)
Q Consensus 342 Gi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l 420 (443)
|+.+++++++.++.+++. ...|++|.+|.++|||+++||++||+|++|||++|.+|+|+...+.. ..|+++.+
T Consensus 270 Gv~~~~l~~~~~~~lgl~------~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l 343 (455)
T PRK10139 270 GIKGTEMSADIAKAFNLD------VQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKL 343 (455)
T ss_pred eEEEEECCHHHHHhcCCC------CCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEE
Confidence 999999999998887652 35699999999999999999999999999999999999999998876 78899999
Q ss_pred EEEECCCeEEEEEEEecCC
Q 013444 421 VVQRANDQLVTLTVIPEEA 439 (443)
Q Consensus 421 ~v~R~~g~~~~l~v~~~~~ 439 (443)
+|.| +|+.+++++++.+.
T Consensus 344 ~V~R-~G~~~~l~v~~~~~ 361 (455)
T PRK10139 344 GLLR-NGKPLEVEVTLDTS 361 (455)
T ss_pred EEEE-CCEEEEEEEEECCC
Confidence 9999 88988888887543
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=3.8e-49 Score=396.69 Aligned_cols=297 Identities=37% Similarity=0.591 Sum_probs=258.5
Q ss_pred hhHHHHHHHhCCceEEEEeccccc---ccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE
Q 013444 125 DTIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG 201 (443)
Q Consensus 125 ~~~~~~~~~~~pSVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a 201 (443)
.++.++++++.||||.|+...... ......+.||||+|+++||||||+|||.++ ..+.|.+.||+.++|
T Consensus 45 ~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~--------~~i~V~~~dg~~~~a 116 (351)
T TIGR02038 45 ISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA--------DQIVVALQDGRKFEA 116 (351)
T ss_pred hhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC--------CEEEEEECCCCEEEE
Confidence 469999999999999998754221 111234679999999999999999999885 579999999999999
Q ss_pred EEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEc
Q 013444 202 TVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTD 281 (443)
Q Consensus 202 ~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d 281 (443)
++++.|+.+||||||++.. .+++++++++..+++|++|+++|+|++...+++.|+|+...+.... ......++++|
T Consensus 117 ~vv~~d~~~DlAvlkv~~~-~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd 192 (351)
T TIGR02038 117 ELVGSDPLTDLAVLKIEGD-NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD 192 (351)
T ss_pred EEEEecCCCCEEEEEecCC-CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence 9999999999999999854 5788899888889999999999999999899999999988765321 11235789999
Q ss_pred ccCCCCCccceeeecCCCEEEEEEEEeec-----CCCeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHh
Q 013444 282 CAINAGNSGGPLVNIDGEIVGINIMKVAA-----ADGLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQL 356 (443)
Q Consensus 282 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~-----~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l 356 (443)
+.+++|+|||||||.+|+||||+++.... ..+++|+||++.+++++++++++|++.|||||+.++++++...+.+
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l 272 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL 272 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence 99999999999999999999999876532 2578999999999999999999999999999999999998888777
Q ss_pred hcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444 357 KERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 357 ~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
+.. ...|++|.+|.++|||+++||++||+|++|||++|.+++|+.+.+.. +.|+++.++|.| +|+.+++.++
T Consensus 273 gl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~ 345 (351)
T TIGR02038 273 GLP------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLR-QGKQLELPVT 345 (351)
T ss_pred CCC------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEE
Confidence 652 23699999999999999999999999999999999999999998876 788999999999 8898899998
Q ss_pred ecCCC
Q 013444 436 PEEAN 440 (443)
Q Consensus 436 ~~~~~ 440 (443)
+.+.+
T Consensus 346 l~~~p 350 (351)
T TIGR02038 346 IDEKP 350 (351)
T ss_pred ecCCC
Confidence 87643
No 3
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=6.9e-49 Score=394.61 Aligned_cols=296 Identities=36% Similarity=0.562 Sum_probs=255.5
Q ss_pred hHHHHHHHhCCceEEEEeccccc---ccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEE
Q 013444 126 TIANAAARVCPAVVNLSAPREFL---GILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGT 202 (443)
Q Consensus 126 ~~~~~~~~~~pSVV~I~~~~~~~---~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~ 202 (443)
++.++++++.||||.|....... ......+.||||+|+++|+||||+|+|.++ ..+.|++.||+.++|+
T Consensus 46 ~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~ 117 (353)
T PRK10898 46 SYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA--------DQIIVALQDGRVFEAL 117 (353)
T ss_pred hHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC--------CEEEEEeCCCCEEEEE
Confidence 58999999999999999864321 111223689999999999999999999984 5899999999999999
Q ss_pred EEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcc
Q 013444 203 VLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDC 282 (443)
Q Consensus 203 vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~ 282 (443)
+++.|+.+||||||++. ..+++++++++..+++|++|+++|||++...+++.|+|++..+.... ......++++|+
T Consensus 118 vv~~d~~~DlAvl~v~~-~~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtda 193 (353)
T PRK10898 118 LVGSDSLTDLAVLKINA-TNLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTDA 193 (353)
T ss_pred EEEEcCCCCEEEEEEcC-CCCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEecc
Confidence 99999999999999985 46888999888889999999999999998889999999987765321 112246899999
Q ss_pred cCCCCCccceeeecCCCEEEEEEEEeecC------CCeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHh
Q 013444 283 AINAGNSGGPLVNIDGEIVGINIMKVAAA------DGLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQL 356 (443)
Q Consensus 283 ~i~~G~SGGPlvd~~G~VVGI~s~~~~~~------~g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l 356 (443)
.+++|+|||||+|.+|+||||+++..... .+++|+||++.+++++++|+++|++.|+|||+.++++++.....+
T Consensus 194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~~ 273 (353)
T PRK10898 194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQGG 273 (353)
T ss_pred ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHhc
Confidence 99999999999999999999999866432 478999999999999999999999999999999999877655443
Q ss_pred hcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444 357 KERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 357 ~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
+. ....|++|.+|.++|||+++||++||+|++|||++|.++.++.+.+.. ..|+++.++|+| +++..+++++
T Consensus 274 ~~------~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~ 346 (353)
T PRK10898 274 GI------DQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMR-DDKQLTLQVT 346 (353)
T ss_pred CC------CCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEE
Confidence 32 224799999999999999999999999999999999999999988876 788999999999 8898899998
Q ss_pred ecCCC
Q 013444 436 PEEAN 440 (443)
Q Consensus 436 ~~~~~ 440 (443)
+.+.+
T Consensus 347 l~~~p 351 (353)
T PRK10898 347 IQEYP 351 (353)
T ss_pred eccCC
Confidence 87654
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=2.4e-47 Score=396.31 Aligned_cols=295 Identities=35% Similarity=0.559 Sum_probs=257.2
Q ss_pred hHHHHHHHhCCceEEEEecccc-----------cccc--------------------------------cCCcEEEEEEE
Q 013444 126 TIANAAARVCPAVVNLSAPREF-----------LGIL--------------------------------SGRGIGSGAIV 162 (443)
Q Consensus 126 ~~~~~~~~~~pSVV~I~~~~~~-----------~~~~--------------------------------~~~~~GSGfiI 162 (443)
+++++++++.|+||.|.+.... +.++ ...+.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 4999999999999999864310 0011 11357999999
Q ss_pred eC-CCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEE
Q 013444 163 DA-DGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVV 241 (443)
Q Consensus 163 ~~-~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~ 241 (443)
++ +||||||+|||.+. ..+.|++.|++.|+|++++.|+.+||||||++...++++++|+++..+++|++|+
T Consensus 119 ~~~~G~IlTn~HVv~~a--------~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~ 190 (473)
T PRK10942 119 DADKGYVVTNNHVVDNA--------TKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTV 190 (473)
T ss_pred ECCCCEEEeChhhcCCC--------CEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEE
Confidence 96 59999999999985 5899999999999999999999999999999866789999999999999999999
Q ss_pred EEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEE
Q 013444 242 AMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAV 318 (443)
Q Consensus 242 ~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aI 318 (443)
++|+|+++..+++.|+|+...+.... . ..+..+|++|+.+++|+|||||+|.+|+||||++..... ..+++|+|
T Consensus 191 aiG~P~g~~~tvt~GiVs~~~r~~~~--~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaI 267 (473)
T PRK10942 191 AIGNPYGLGETVTSGIVSALGRSGLN--V-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAI 267 (473)
T ss_pred EEcCCCCCCcceeEEEEEEeecccCC--c-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEE
Confidence 99999999999999999988765211 1 123578999999999999999999999999999987643 25689999
Q ss_pred eHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCE
Q 013444 319 PIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGK 398 (443)
Q Consensus 319 Pi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~ 398 (443)
|++.+++++++|+++|++.|+|||+.++++++++++.+++. ...|++|.+|.++|||+++||++||+|++|||+
T Consensus 268 P~~~~~~v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~------~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~ 341 (473)
T PRK10942 268 PSNMVKNLTSQMVEYGQVKRGELGIMGTELNSELAKAMKVD------AQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGK 341 (473)
T ss_pred EHHHHHHHHHHHHhccccccceeeeEeeecCHHHHHhcCCC------CCCceEEEEECCCChHHHcCCCCCCEEEEECCE
Confidence 99999999999999999999999999999999988887653 357999999999999999999999999999999
Q ss_pred eeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 013444 399 PVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEE 438 (443)
Q Consensus 399 ~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~ 438 (443)
+|.+|+++...+.. ..|+++.++|.| +|+.+++.+++..
T Consensus 342 ~V~s~~dl~~~l~~~~~g~~v~l~v~R-~G~~~~v~v~l~~ 381 (473)
T PRK10942 342 PISSFAALRAQVGTMPVGSKLTLGLLR-DGKPVNVNVELQQ 381 (473)
T ss_pred ECCCHHHHHHHHHhcCCCCEEEEEEEE-CCeEEEEEEEeCc
Confidence 99999999988866 678899999999 8888888887654
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=3.2e-47 Score=393.45 Aligned_cols=295 Identities=40% Similarity=0.674 Sum_probs=258.0
Q ss_pred hHHHHHHHhCCceEEEEecccc-------------cccc--------------cCCcEEEEEEEeCCCEEEeccccccCC
Q 013444 126 TIANAAARVCPAVVNLSAPREF-------------LGIL--------------SGRGIGSGAIVDADGTILTCAHVVVDF 178 (443)
Q Consensus 126 ~~~~~~~~~~pSVV~I~~~~~~-------------~~~~--------------~~~~~GSGfiI~~~G~ILTaaHvv~~~ 178 (443)
++.++++++.||||.|.+.... ..++ ...+.||||+|+++|+||||+||+.++
T Consensus 2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~ 81 (428)
T TIGR02037 2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA 81 (428)
T ss_pred cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC
Confidence 3789999999999999874210 0011 124679999999999999999999985
Q ss_pred CCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEE
Q 013444 179 HGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIV 258 (443)
Q Consensus 179 ~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~V 258 (443)
..+.|++.|++.++|++++.|+.+|||||+++....++++.|+++..+++|++|+++|||++...+++.|+|
T Consensus 82 --------~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~v 153 (428)
T TIGR02037 82 --------DEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIV 153 (428)
T ss_pred --------CeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEE
Confidence 589999999999999999999999999999987667999999988899999999999999999999999999
Q ss_pred EeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEEeHHHHHHHHHHHHHcCc
Q 013444 259 SCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKKNGR 335 (443)
Q Consensus 259 s~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aIPi~~i~~~l~~l~~~g~ 335 (443)
+...+.... ...+..++++|+.+++|+|||||||.+|+||||++..... ..+++|+||++.+++++++|+++++
T Consensus 154 s~~~~~~~~---~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~ 230 (428)
T TIGR02037 154 SALGRSGLG---IGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGK 230 (428)
T ss_pred EecccCccC---CCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCc
Confidence 988765311 1223568999999999999999999999999999886652 3578999999999999999999999
Q ss_pred eeeeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CC
Q 013444 336 VVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RV 414 (443)
Q Consensus 336 v~rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~ 414 (443)
+.|||||+.+++++++.++.++.. ...|++|.+|.++|||+++||++||+|++|||++|.++.++..++.. ..
T Consensus 231 ~~~~~lGi~~~~~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~ 304 (428)
T TIGR02037 231 VQRGWLGVTIQEVTSDLAKSLGLE------KQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIGTLKP 304 (428)
T ss_pred CcCCcCceEeecCCHHHHHHcCCC------CCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCC
Confidence 999999999999999998888763 24799999999999999999999999999999999999999998876 67
Q ss_pred CCeEEEEEEECCCeEEEEEEEecC
Q 013444 415 GEPLKVVVQRANDQLVTLTVIPEE 438 (443)
Q Consensus 415 g~~v~l~v~R~~g~~~~l~v~~~~ 438 (443)
|++++++|.| +++.+++++++..
T Consensus 305 g~~v~l~v~R-~g~~~~~~v~l~~ 327 (428)
T TIGR02037 305 GKKVTLGILR-KGKEKTITVTLGA 327 (428)
T ss_pred CCEEEEEEEE-CCEEEEEEEEECc
Confidence 8999999999 8888888887654
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=4.3e-37 Score=309.82 Aligned_cols=295 Identities=39% Similarity=0.627 Sum_probs=257.0
Q ss_pred hhHHHHHHHhCCceEEEEecccccc---------cccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC
Q 013444 125 DTIANAAARVCPAVVNLSAPREFLG---------ILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD 195 (443)
Q Consensus 125 ~~~~~~~~~~~pSVV~I~~~~~~~~---------~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d 195 (443)
..+..+++++.|+||.|........ .....+.||||+++.+|||+|+.|++.++ ..+.+.+.|
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a--------~~i~v~l~d 104 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGA--------EEITVTLAD 104 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCc--------ceEEEEeCC
Confidence 4689999999999999997542211 00014789999999899999999999984 588899999
Q ss_pred CcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccc
Q 013444 196 GRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRR 275 (443)
Q Consensus 196 g~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~ 275 (443)
|+.+++++++.|+..|+|+||++....++.+.++++..++.|++++++|+|+++..+++.|+++...+. .........
T Consensus 105 g~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~--~v~~~~~~~ 182 (347)
T COG0265 105 GREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT--GVGSAGGYV 182 (347)
T ss_pred CCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc--cccCccccc
Confidence 999999999999999999999997544888899999999999999999999999999999999998886 222222256
Q ss_pred eEEEEcccCCCCCccceeeecCCCEEEEEEEEeecCC---CeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHH
Q 013444 276 EYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAAD---GLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMI 352 (443)
Q Consensus 276 ~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~---g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~ 352 (443)
.+|++|+.+++|+||||++|.+|++|||++....... +++|+||++.++.+++.+.+.|++.|+|+|+.+.+++...
T Consensus 183 ~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~ 262 (347)
T COG0265 183 NFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADI 262 (347)
T ss_pred chhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEccccc
Confidence 7899999999999999999999999999999877543 5899999999999999999988999999999998887665
Q ss_pred HHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEE
Q 013444 353 IAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVT 431 (443)
Q Consensus 353 ~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~ 431 (443)
. ++ + ....|++|.+|.+++||+++|++.||+|+++||+++.+..++...+.. ..|+++.+++.| +|+.++
T Consensus 263 ~--~g-----~-~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r-~g~~~~ 333 (347)
T COG0265 263 A--LG-----L-PVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLR-GGKERE 333 (347)
T ss_pred c--cC-----C-CCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEE-CCEEEE
Confidence 4 22 2 245789999999999999999999999999999999999999988876 679999999999 799999
Q ss_pred EEEEecC
Q 013444 432 LTVIPEE 438 (443)
Q Consensus 432 l~v~~~~ 438 (443)
+.+++.+
T Consensus 334 ~~v~l~~ 340 (347)
T COG0265 334 LAVTLGD 340 (347)
T ss_pred EEEEecC
Confidence 9998876
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.97 E-value=2.1e-29 Score=254.91 Aligned_cols=318 Identities=33% Similarity=0.421 Sum_probs=257.2
Q ss_pred chhHHHHHHHhCCceEEEEecccc------cccccCCcEEEEEEEeCCCEEEeccccccCCCCCCC---CCCceEEEEeC
Q 013444 124 RDTIANAAARVCPAVVNLSAPREF------LGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRA---LPKGKVDVTLQ 194 (443)
Q Consensus 124 ~~~~~~~~~~~~pSVV~I~~~~~~------~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~---~~~~~i~V~~~ 194 (443)
...+.++.++-.+++|.|+...-+ ....-....||||+++.+|+++||+||+........ ..-..+.+...
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa 206 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA 206 (473)
T ss_pred hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence 455778889999999999963211 111134467999999999999999999986432211 11124666666
Q ss_pred CC--cEEEEEEEeecCCCCEEEEEEcCC-CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCC
Q 013444 195 DG--RTFEGTVLNADFHSDIAIVKINSK-TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG 271 (443)
Q Consensus 195 dg--~~~~a~vv~~d~~~DlAlLkl~~~-~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~ 271 (443)
++ ..+++.+.+.|+..|+|+++++.+ ...++++++.+..+..|+++..+|.|++..++.+.|+++...|...+++..
T Consensus 207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence 65 899999999999999999999755 337888898899999999999999999999999999999998887665544
Q ss_pred --CccceEEEEcccCCCCCccceeeecCCCEEEEEEEEeec---CCCeEEEEeHHHHHHHHHHHHH---cCce------e
Q 013444 272 --GMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAA---ADGLSFAVPIDSAAKIIEQFKK---NGRV------V 337 (443)
Q Consensus 272 --~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~---~~g~~~aIPi~~i~~~l~~l~~---~g~v------~ 337 (443)
....+++++|+.++.|+||||++|.+|++||++++.... ..+++|++|.+.+..++.+..+ ..+. .
T Consensus 287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~ 366 (473)
T KOG1320|consen 287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPV 366 (473)
T ss_pred cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccc
Confidence 556789999999999999999999999999998886542 3678999999998888777632 2222 2
Q ss_pred eeecCceeecccHHHHHHhhcCCCCCC-CCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCC
Q 013444 338 RPWLGLKMLDLNDMIIAQLKERDPSFP-NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVG 415 (443)
Q Consensus 338 rp~lGi~~~~~~~~~~~~l~~~~~~~~-~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g 415 (443)
+.|+|..+..+...+..++..+.+.++ ....+++|.+|.+++++...++++||+|++|||++|.+..++.++++. ..+
T Consensus 367 ~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~ 446 (473)
T KOG1320|consen 367 HQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTE 446 (473)
T ss_pred cccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHHHhcCcC
Confidence 459999988888877777766666666 344689999999999999999999999999999999999999999987 455
Q ss_pred CeEEEEEEECCCeEEEEEEEecCCCCC
Q 013444 416 EPLKVVVQRANDQLVTLTVIPEEANPD 442 (443)
Q Consensus 416 ~~v~l~v~R~~g~~~~l~v~~~~~~~~ 442 (443)
+++.+..+| +.+..++.+.+++..+.
T Consensus 447 ~~v~vl~~~-~~e~~tl~Il~~~~~p~ 472 (473)
T KOG1320|consen 447 DKVAVLDRR-SAEDATLEILPEHKIPS 472 (473)
T ss_pred ceEEEEEec-CccceeEEecccccCCC
Confidence 678888777 77888999988876654
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.94 E-value=1.5e-25 Score=228.72 Aligned_cols=306 Identities=22% Similarity=0.320 Sum_probs=252.8
Q ss_pred chhHHHHHHHhCCceEEEEeccc--ccccccCCcEEEEEEEeCC-CEEEeccccccCCCCCCCCCCceEEEEeCCCcEEE
Q 013444 124 RDTIANAAARVCPAVVNLSAPRE--FLGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE 200 (443)
Q Consensus 124 ~~~~~~~~~~~~pSVV~I~~~~~--~~~~~~~~~~GSGfiI~~~-G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~ 200 (443)
...+...+..+-++||.|...+- ++..+.+.+.+|||++++. |+||||+|++... ...-.+.|.+..+.+
T Consensus 51 ~e~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pg-------P~va~avf~n~ee~e 123 (955)
T KOG1421|consen 51 SEDWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPG-------PFVASAVFDNHEEIE 123 (955)
T ss_pred hhhhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCC-------CceeEEEecccccCC
Confidence 34788999999999999998643 3444566788999999986 8999999999863 245667788888888
Q ss_pred EEEEeecCCCCEEEEEEcCC----CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCC---CCc
Q 013444 201 GTVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGL---GGM 273 (443)
Q Consensus 201 a~vv~~d~~~DlAlLkl~~~----~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~---~~~ 273 (443)
.-.++.|+-+|+.+++.+.. ..+..+.+.. +-.++|.+++++|+..+.-.++..|.++..++...+++. +..
T Consensus 124 i~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndf 202 (955)
T KOG1421|consen 124 IYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDF 202 (955)
T ss_pred cccccCCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccc
Confidence 88899999999999999854 2333444533 456899999999998777778889999999998877643 334
Q ss_pred cceEEEEcccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHH
Q 013444 274 RREYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMII 353 (443)
Q Consensus 274 ~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~ 353 (443)
...++|.......|.||+|++|.+|..|.++..+... .+.+|++|++.+.+.+.-++++.-++|+.|.+++..-.-+..
T Consensus 203 nTfy~QaasstsggssgspVv~i~gyAVAl~agg~~s-sas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~ 281 (955)
T KOG1421|consen 203 NTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS-SASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDEC 281 (955)
T ss_pred cceeeeehhcCCCCCCCCceecccceEEeeecCCccc-ccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHH
Confidence 4568899999999999999999999999998876543 456789999999999999998888999999999887666667
Q ss_pred HHhhcCC-------CCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECC
Q 013444 354 AQLKERD-------PSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAN 426 (443)
Q Consensus 354 ~~l~~~~-------~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~ 426 (443)
+++++.. ..+|....-++|..|.+++||++. |++||++++||+.-+.++.++.+.|.+..|+.+.|+|+| +
T Consensus 282 rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~GDillavN~t~l~df~~l~~iLDegvgk~l~LtI~R-g 359 (955)
T KOG1421|consen 282 RRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEPGDILLAVNSTCLNDFEALEQILDEGVGKNLELTIQR-G 359 (955)
T ss_pred HhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCCCcEEEEEcceehHHHHHHHHHHhhccCceEEEEEEe-C
Confidence 7776644 356766556778899999999998 999999999999999999999999999999999999999 8
Q ss_pred CeEEEEEEEecCCC
Q 013444 427 DQLVTLTVIPEEAN 440 (443)
Q Consensus 427 g~~~~l~v~~~~~~ 440 (443)
|++.++++...+.+
T Consensus 360 gqelel~vtvqdlh 373 (955)
T KOG1421|consen 360 GQELELTVTVQDLH 373 (955)
T ss_pred CEEEEEEEEecccc
Confidence 89888888876543
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.69 E-value=1.9e-16 Score=134.02 Aligned_cols=117 Identities=35% Similarity=0.570 Sum_probs=77.6
Q ss_pred EEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEE--EEEEeecCC-CCEEEEEEcCCCCCCccccCCCCC
Q 013444 157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE--GTVLNADFH-SDIAIVKINSKTPLPAAKLGTSSK 233 (443)
Q Consensus 157 GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkl~~~~~~~~~~l~~s~~ 233 (443)
||||+|+++|+||||+||+.+...........+.+.+.++.... +++++.++. +|+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~--------------- 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVD--------------- 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEES---------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEe---------------
Confidence 89999999999999999999754332223568888999988888 999999999 999999998
Q ss_pred CCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEE
Q 013444 234 LCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGI 303 (443)
Q Consensus 234 ~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI 303 (443)
.....+... ............... ......+ +++.+.+|+|||||||.+|+||||
T Consensus 66 -----~~~~~~~~~-----~~~~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 66 -----PWTGVGGGV-----RVPGSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -----CEEEEEEEE-----EEEEEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -----cccceeeee-----EeeeeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 000000000 000000000000000 0001124 799999999999999999999997
No 10
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.64 E-value=2e-14 Score=147.99 Aligned_cols=290 Identities=15% Similarity=0.144 Sum_probs=204.4
Q ss_pred HHHhCCceEEEEecccc--cccccCCcEEEEEEEeCC-CEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeec
Q 013444 131 AARVCPAVVNLSAPREF--LGILSGRGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNAD 207 (443)
Q Consensus 131 ~~~~~pSVV~I~~~~~~--~~~~~~~~~GSGfiI~~~-G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d 207 (443)
.+++..+.|.++...+. ++.......|||.|++.+ |++++++.++.-. ..+.+|++.|...++|.+.+.|
T Consensus 524 ~~~i~~~~~~v~~~~~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d-------~~d~~vt~~dS~~i~a~~~fL~ 596 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSD-------AKDQRVTEADSDGIPANVSFLH 596 (955)
T ss_pred hhHHhhhhhhheeceeeccccchhhhhcCceEEEEccCCceeEecccCCch-------hhceEEeecccccccceeeEec
Confidence 45667777777765443 444444456999999976 8999999999753 3678899999999999999999
Q ss_pred CCCCEEEEEEcCCCCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEee---ecC-ccCCCCCCccceEEEEccc
Q 013444 208 FHSDIAIVKINSKTPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCV---DRK-SSDLGLGGMRREYLQTDCA 283 (443)
Q Consensus 208 ~~~DlAlLkl~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~---~~~-~~~~~~~~~~~~~i~~d~~ 283 (443)
+..++|.+|.+.. -...++|.+ ..+..||++...|+............|+.+ ... ..-..+.....+.|..++.
T Consensus 597 ~t~n~a~~kydp~-~~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~n 674 (955)
T KOG1421|consen 597 PTENVASFKYDPA-LEVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDN 674 (955)
T ss_pred CccceeEeccChh-Hhhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecc
Confidence 9999999999853 234455644 567889999999998765432222222222 111 1112233444567877777
Q ss_pred CCCCCccceeeecCCCEEEEEEEEeecCC-----CeEEEEeHHHHHHHHHHHHHcCceeeeecCceeecccHHHHHHhhc
Q 013444 284 INAGNSGGPLVNIDGEIVGINIMKVAAAD-----GLSFAVPIDSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQLKE 358 (443)
Q Consensus 284 i~~G~SGGPlvd~~G~VVGI~s~~~~~~~-----g~~~aIPi~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l~~ 358 (443)
+.-+.--|-+.|.+|+|+|++-..+.+.- ..-|.+.+..++..|+.|+.++..+...+|+++..++-..++.+++
T Consensus 675 lsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lgl 754 (955)
T KOG1421|consen 675 LSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGL 754 (955)
T ss_pred ccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCC
Confidence 65555456788999999999876654321 2457789999999999999888776667788776665544444433
Q ss_pred CCC-------CCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEE
Q 013444 359 RDP-------SFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVT 431 (443)
Q Consensus 359 ~~~-------~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~ 431 (443)
..- .-.....-++|+.|.+..+ +. |..||+|+++||+.|+...|+.+.. .+...|.| +|..++
T Consensus 755 p~e~imk~e~es~~~~ql~~ishv~~~~~--ki-l~~gdiilsvngk~itr~~dl~d~~------eid~~ilr-dg~~~~ 824 (955)
T KOG1421|consen 755 PSEFIMKSEEESTIPRQLYVISHVRPLLH--KI-LGVGDIILSVNGKMITRLSDLHDFE------EIDAVILR-DGIEME 824 (955)
T ss_pred CHHHHhhhhhcCCCcceEEEEEeeccCcc--cc-cccccEEEEecCeEEeeehhhhhhh------hhheeeee-cCcEEE
Confidence 210 0012334577888877544 34 9999999999999999999998733 47889999 899988
Q ss_pred EEEEecCC
Q 013444 432 LTVIPEEA 439 (443)
Q Consensus 432 l~v~~~~~ 439 (443)
+++...+.
T Consensus 825 ikipt~p~ 832 (955)
T KOG1421|consen 825 IKIPTYPE 832 (955)
T ss_pred EEeccccc
Confidence 88876543
No 11
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.55 E-value=4e-14 Score=112.62 Aligned_cols=81 Identities=35% Similarity=0.655 Sum_probs=69.8
Q ss_pred eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCe
Q 013444 339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP 417 (443)
Q Consensus 339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~ 417 (443)
||||+.+...+. ..|++|.+|.++|||+++||++||+|++|||++|+++.++..++.. ..|++
T Consensus 1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~ 64 (82)
T PF13180_consen 1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDT 64 (82)
T ss_dssp -E-SEEEEECSC----------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred CEECeEEEEccC----------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence 689999876532 4699999999999999999999999999999999999999998854 88999
Q ss_pred EEEEEEECCCeEEEEEEEe
Q 013444 418 LKVVVQRANDQLVTLTVIP 436 (443)
Q Consensus 418 v~l~v~R~~g~~~~l~v~~ 436 (443)
++|+|+| +++.+++++++
T Consensus 65 v~l~v~R-~g~~~~~~v~l 82 (82)
T PF13180_consen 65 VTLTVLR-DGEELTVEVTL 82 (82)
T ss_dssp EEEEEEE-TTEEEEEEEE-
T ss_pred EEEEEEE-CCEEEEEEEEC
Confidence 9999999 89998888864
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.46 E-value=3.4e-12 Score=118.81 Aligned_cols=177 Identities=21% Similarity=0.291 Sum_probs=116.9
Q ss_pred CCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeC-------CC--cEEEEEEEe
Q 013444 135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQ-------DG--RTFEGTVLN 205 (443)
Q Consensus 135 ~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~-------dg--~~~~a~vv~ 205 (443)
.|.+|.|..... ...|+|++|+++ +|||++||+... ..+.+.+. ++ ..+...-+.
T Consensus 12 ~p~~v~i~~~~~-------~~~C~G~li~~~-~vLTaahC~~~~--------~~~~v~~g~~~~~~~~~~~~~~~v~~~~ 75 (220)
T PF00089_consen 12 FPWVVSIRYSNG-------RFFCTGTLISPR-WVLTAAHCVDGA--------SDIKVRLGTYSIRNSDGSEQTIKVSKII 75 (220)
T ss_dssp STTEEEEEETTT-------EEEEEEEEEETT-EEEEEGGGHTSG--------GSEEEEESESBTTSTTTTSEEEEEEEEE
T ss_pred CCeEEEEeeCCC-------CeeEeEEecccc-cccccccccccc--------cccccccccccccccccccccccccccc
Confidence 478888876553 367999999988 999999999871 34444332 22 345544443
Q ss_pred ecC-------CCCEEEEEEcCC----CCCCccccCC-CCCCCCCCEEEEEecCCCCC----CceEEEEEEeeecCccCCC
Q 013444 206 ADF-------HSDIAIVKINSK----TPLPAAKLGT-SSKLCPGDWVVAMGCPHSLQ----NTVTAGIVSCVDRKSSDLG 269 (443)
Q Consensus 206 ~d~-------~~DlAlLkl~~~----~~~~~~~l~~-s~~~~~G~~V~~iG~p~~~~----~~~t~G~Vs~~~~~~~~~~ 269 (443)
.++ .+|||||+++.+ ..+.++.+.. ...++.|+.+.++||+.... ..+....+.......+...
T Consensus 76 ~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~ 155 (220)
T PF00089_consen 76 IHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSS 155 (220)
T ss_dssp EETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHH
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 432 469999999976 3455667755 23457899999999998633 2455555544444332211
Q ss_pred CCC-ccceEEEEcc----cCCCCCccceeeecCCCEEEEEEEEeecC-C-CeEEEEeHHHHHHHH
Q 013444 270 LGG-MRREYLQTDC----AINAGNSGGPLVNIDGEIVGINIMKVAAA-D-GLSFAVPIDSAAKII 327 (443)
Q Consensus 270 ~~~-~~~~~i~~d~----~i~~G~SGGPlvd~~G~VVGI~s~~~~~~-~-g~~~aIPi~~i~~~l 327 (443)
+.. .....++... ..|.|+|||||++.++.|+||++.+..-. . ...++.+++...++|
T Consensus 156 ~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 156 YNDNLTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp TTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 111 2234566555 78999999999987777999999873322 2 247788888777664
No 13
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.40 E-value=2.3e-11 Score=114.04 Aligned_cols=181 Identities=23% Similarity=0.267 Sum_probs=108.9
Q ss_pred hCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC---------CcEEEEEEE
Q 013444 134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------GRTFEGTVL 204 (443)
Q Consensus 134 ~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d---------g~~~~a~vv 204 (443)
..|.+|.|.... ....|+|++|+++ +|||+|||+.+.. ...+.|.+.. ...+..+-+
T Consensus 11 ~~Pw~v~i~~~~-------~~~~C~GtlIs~~-~VLTaAhC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~~v~~~ 76 (232)
T cd00190 11 SFPWQVSLQYTG-------GRHFCGGSLISPR-WVLTAAHCVYSSA------PSNYTVRLGSHDLSSNEGGGQVIKVKKV 76 (232)
T ss_pred CCCCEEEEEccC-------CcEEEEEEEeeCC-EEEECHHhcCCCC------CccEEEEeCcccccCCCCceEEEEEEEE
Confidence 357888887543 2367999999987 9999999998632 1344454432 223344444
Q ss_pred eecC-------CCCEEEEEEcCCC----CCCccccCCCC-CCCCCCEEEEEecCCCCCC-----ceEEEEEEeeecCccC
Q 013444 205 NADF-------HSDIAIVKINSKT----PLPAAKLGTSS-KLCPGDWVVAMGCPHSLQN-----TVTAGIVSCVDRKSSD 267 (443)
Q Consensus 205 ~~d~-------~~DlAlLkl~~~~----~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~-----~~t~G~Vs~~~~~~~~ 267 (443)
..++ .+|||||+|+.+. .+.++.|.... .+..|+.+.+.||+..... ......+..+....+.
T Consensus 77 ~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~ 156 (232)
T cd00190 77 IVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECK 156 (232)
T ss_pred EECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhh
Confidence 4443 5799999998652 25677775543 5778999999999765332 2233333322222221
Q ss_pred CCCC---CccceEEEE-----cccCCCCCccceeeecC---CCEEEEEEEEeecC--CCeEEEEeHHHHHHHHH
Q 013444 268 LGLG---GMRREYLQT-----DCAINAGNSGGPLVNID---GEIVGINIMKVAAA--DGLSFAVPIDSAAKIIE 328 (443)
Q Consensus 268 ~~~~---~~~~~~i~~-----d~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~~--~g~~~aIPi~~i~~~l~ 328 (443)
.... ......+.. ....|.|+|||||+... +.++||.+++..-. .....+..+....++|+
T Consensus 157 ~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 157 RAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred hhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence 1111 111222322 34578999999999664 78999999865311 12233455565666654
No 14
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.37 E-value=5.5e-12 Score=101.53 Aligned_cols=88 Identities=36% Similarity=0.692 Sum_probs=74.4
Q ss_pred eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCe
Q 013444 339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEP 417 (443)
Q Consensus 339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~ 417 (443)
||+|+.++++++.....+.. ....|++|.+|.++|||+++||++||+|++|||+++.++.++..++.. ..++.
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~------~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~ 74 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGL------KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDK 74 (90)
T ss_pred CccceEEeECCHHHHHHcCC------CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCE
Confidence 68999999999876655332 335699999999999999999999999999999999999999988876 45889
Q ss_pred EEEEEEECCCeEEEEE
Q 013444 418 LKVVVQRANDQLVTLT 433 (443)
Q Consensus 418 v~l~v~R~~g~~~~l~ 433 (443)
+.+++.| +|+..++.
T Consensus 75 i~l~v~r-~g~~~~~~ 89 (90)
T cd00987 75 VTLTVLR-GGKELTVT 89 (90)
T ss_pred EEEEEEE-CCEEEEee
Confidence 9999999 77765543
No 15
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.27 E-value=4.7e-11 Score=94.26 Aligned_cols=68 Identities=26% Similarity=0.425 Sum_probs=62.1
Q ss_pred CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
..|++|.+|.++|||+++||++||+|++|||+++.+|+++..++.. ..|+++.+++.| +++..+++++
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r-~g~~~~~~~~ 77 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLP-STTKLTNVST 77 (79)
T ss_pred CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEE
Confidence 4699999999999999999999999999999999999999999887 468899999999 8888777765
No 16
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.24 E-value=7.3e-11 Score=93.03 Aligned_cols=68 Identities=24% Similarity=0.434 Sum_probs=57.9
Q ss_pred CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 437 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 437 (443)
..+++|.+|.++|||+++||++||+|++|||+++.+|.++...+ ..++.+.+++.| +++..++.+++.
T Consensus 11 ~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r-~g~~~~~~v~~~ 78 (80)
T cd00990 11 EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFR-DDRLIEVPLTLA 78 (80)
T ss_pred CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEE-CCEEEEEEEEec
Confidence 35799999999999999999999999999999999876654433 467789999999 788888888765
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.24 E-value=2.6e-10 Score=107.16 Aligned_cols=161 Identities=24% Similarity=0.312 Sum_probs=98.7
Q ss_pred hCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCC--------cEEEEEEEe
Q 013444 134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDG--------RTFEGTVLN 205 (443)
Q Consensus 134 ~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg--------~~~~a~vv~ 205 (443)
..|.+|.|.... ....|+|++|+++ +|||+|||+.+.. ...+.|.+... ..+.+.-+.
T Consensus 12 ~~Pw~~~i~~~~-------~~~~C~GtlIs~~-~VLTaahC~~~~~------~~~~~v~~g~~~~~~~~~~~~~~v~~~~ 77 (229)
T smart00020 12 SFPWQVSLQYRG-------GRHFCGGSLISPR-WVLTAAHCVYGSD------PSNIRVRLGSHDLSSGEEGQVIKVSKVI 77 (229)
T ss_pred CCCcEEEEEEcC-------CCcEEEEEEecCC-EEEECHHHcCCCC------CcceEEEeCcccCCCCCCceEEeeEEEE
Confidence 457788886432 2367999999987 9999999998642 13455655432 334444444
Q ss_pred ec-------CCCCEEEEEEcCC----CCCCccccCCC-CCCCCCCEEEEEecCCCCC------CceEEEEEEeeecCccC
Q 013444 206 AD-------FHSDIAIVKINSK----TPLPAAKLGTS-SKLCPGDWVVAMGCPHSLQ------NTVTAGIVSCVDRKSSD 267 (443)
Q Consensus 206 ~d-------~~~DlAlLkl~~~----~~~~~~~l~~s-~~~~~G~~V~~iG~p~~~~------~~~t~G~Vs~~~~~~~~ 267 (443)
.+ ..+|||||+|+.+ ..+.++.|... ..+..++.+.+.||+.... .......+.......+.
T Consensus 78 ~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~ 157 (229)
T smart00020 78 IHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCR 157 (229)
T ss_pred ECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhh
Confidence 33 3579999999875 23456666543 3567789999999987542 12223333322222221
Q ss_pred CCCC---CccceEEEE-----cccCCCCCccceeeecCC--CEEEEEEEEe
Q 013444 268 LGLG---GMRREYLQT-----DCAINAGNSGGPLVNIDG--EIVGINIMKV 308 (443)
Q Consensus 268 ~~~~---~~~~~~i~~-----d~~i~~G~SGGPlvd~~G--~VVGI~s~~~ 308 (443)
.... ......+.. ....|+|+||||++...+ .++||++++.
T Consensus 158 ~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 158 RAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred hhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 1110 011112222 355789999999996543 8999999865
No 18
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.23 E-value=5.9e-11 Score=114.44 Aligned_cols=99 Identities=17% Similarity=0.209 Sum_probs=86.6
Q ss_pred HHHHHHHHHHHHcCceeeeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEee
Q 013444 321 DSAAKIIEQFKKNGRVVRPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPV 400 (443)
Q Consensus 321 ~~i~~~l~~l~~~g~v~rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V 400 (443)
..++++++++.+++++.+.|+|+.....+ +...|+.|..+.++++|+++||++||+|++|||+++
T Consensus 159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i 223 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN---------------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDL 223 (259)
T ss_pred hhHHHHHHHHHHCHHhhhheEeEEEEEeC---------------CceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEc
Confidence 45788899999999999999999874322 224699999999999999999999999999999999
Q ss_pred CCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444 401 QSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 401 ~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
.+++++.+++.+ ..++.++|+|+| +|+.+++.+.
T Consensus 224 ~~~~~~~~~l~~~~~~~~v~l~V~R-~G~~~~i~v~ 258 (259)
T TIGR01713 224 RDPEQAFQALQMLREETNLTLTVER-DGQREDIYVR 258 (259)
T ss_pred CCHHHHHHHHHhcCCCCeEEEEEEE-CCEEEEEEEE
Confidence 999999998887 677899999999 8888887765
No 19
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.22 E-value=5e-11 Score=121.70 Aligned_cols=273 Identities=19% Similarity=0.196 Sum_probs=181.1
Q ss_pred HHhCCceEEEEeccc-------ccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEE-eCCCcEEEEEE
Q 013444 132 ARVCPAVVNLSAPRE-------FLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVT-LQDGRTFEGTV 203 (443)
Q Consensus 132 ~~~~pSVV~I~~~~~-------~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~-~~dg~~~~a~v 203 (443)
+....+++.+..... +.........|+||.+... .++|++|++..... ...+.+. ...-+.|.+++
T Consensus 57 ~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~-----~~~v~v~~~gs~~k~~~~v 130 (473)
T KOG1320|consen 57 DLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNND-----HKFVTVKKHGSPRKYKAFV 130 (473)
T ss_pred cccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCcccccccc-----ccccccccCCCchhhhhhH
Confidence 455566777765322 1111134467999999865 99999999984321 1223332 23347778888
Q ss_pred EeecCCCCEEEEEEcCC---CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEE
Q 013444 204 LNADFHSDIAIVKINSK---TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQT 280 (443)
Q Consensus 204 v~~d~~~DlAlLkl~~~---~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~ 280 (443)
...-.+.|+|++.++.. ....++.+++ -+...+.++++| +....+|.|.|........ .........+++
T Consensus 131 ~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~--ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y--~~~~~~l~~vqi 203 (473)
T KOG1320|consen 131 AAVFEECDLAVVYIESEEFWKGMNPFELGD--IPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIY--AHSSTVLLRVQI 203 (473)
T ss_pred HHhhhcccceEEEEeeccccCCCcccccCC--CcccCccEEEEc---CCcEEEEeeEEEEEEeccc--cCCCcceeeEEE
Confidence 88888999999999853 2223344433 345567899998 5667899999998876532 222334557899
Q ss_pred cccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHHHcCce-eeeecCceeeccc-HHHHHHhhc
Q 013444 281 DCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGRV-VRPWLGLKMLDLN-DMIIAQLKE 358 (443)
Q Consensus 281 d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~~~g~v-~rp~lGi~~~~~~-~~~~~~l~~ 358 (443)
++...+|+||+|.+...+++.|+++...+....+.+.+|.-.+.++.......+.. .+++++...+.+- ...++.++
T Consensus 204 ~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~- 282 (473)
T KOG1320|consen 204 DAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFK- 282 (473)
T ss_pred EEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccc-
Confidence 99999999999999877999999999886544678889998888877665544322 3444544444432 22222211
Q ss_pred CCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeC-CHHH-----HHHHHhc-CCCCeEEEEEEE
Q 013444 359 RDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ-SITE-----IIEIMGD-RVGEPLKVVVQR 424 (443)
Q Consensus 359 ~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~-s~~d-----l~~~l~~-~~g~~v~l~v~R 424 (443)
++...|+.+.++.+-+.|.+. ++.||.|+.+||..|. ++.. +...+.. .+++++.+.+.|
T Consensus 283 -----lg~~~g~~i~~~~qtd~ai~~-~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r 349 (473)
T KOG1320|consen 283 -----LGLETGVLISKINQTDAAINP-GNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLR 349 (473)
T ss_pred -----cCcccceeeeeecccchhhhc-ccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhh
Confidence 122378999999999999888 9999999999999883 1111 1122222 455666666666
No 20
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21 E-value=1.1e-10 Score=91.67 Aligned_cols=67 Identities=30% Similarity=0.589 Sum_probs=59.8
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
..++|.+|.++|||+++||++||+|++|||+++.+|+++..++....++.+.+++.| +++..++.++
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r-~~~~~~~~l~ 78 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVER-NGETITLTLT 78 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEE-CCEEEEEEec
Confidence 458899999999999999999999999999999999999998877667889999999 7777777664
No 21
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19 E-value=2e-10 Score=90.45 Aligned_cols=70 Identities=29% Similarity=0.478 Sum_probs=63.5
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecCC
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEEA 439 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~~ 439 (443)
.|++|.+|.++|||++ ||++||+|++|||+++.+|+++..++.. ..++.+.+++.| +|+..++++++.+.
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r-~g~~~~~~v~l~~~ 78 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKR-EEKELPEDLILKTF 78 (79)
T ss_pred cCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEE-CCEEEEEEEEEecc
Confidence 5899999999999997 7999999999999999999999998875 678899999999 88888888888753
No 22
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.12 E-value=4.5e-10 Score=89.51 Aligned_cols=71 Identities=24% Similarity=0.608 Sum_probs=62.8
Q ss_pred CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 437 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 437 (443)
..+++|..|.+++||+++||++||+|++|||+++.+| .++..++....++.+.+++.|.+++..++++++.
T Consensus 12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~~~ 84 (85)
T cd00988 12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLTRL 84 (85)
T ss_pred CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEEEC
Confidence 3689999999999999999999999999999999999 9999888777788999999993277788877764
No 23
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.98 E-value=2.7e-09 Score=110.76 Aligned_cols=90 Identities=30% Similarity=0.568 Sum_probs=77.4
Q ss_pred eeecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCC
Q 013444 338 RPWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGE 416 (443)
Q Consensus 338 rp~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~ 416 (443)
+.|+|+.+..++.....++++. ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. +.++
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~-----~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~ 411 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLK-----GDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGG 411 (428)
T ss_pred ccccceEEecCCHHHHHHcCCC-----cCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCC
Confidence 4689999999998887766542 224699999999999999999999999999999999999999999976 5788
Q ss_pred eEEEEEEECCCeEEEEE
Q 013444 417 PLKVVVQRANDQLVTLT 433 (443)
Q Consensus 417 ~v~l~v~R~~g~~~~l~ 433 (443)
++.++|+| +++...+.
T Consensus 412 ~v~l~v~R-~g~~~~~~ 427 (428)
T TIGR02037 412 RVALLILR-GGATIFVT 427 (428)
T ss_pred EEEEEEEE-CCEEEEEE
Confidence 99999999 77766553
No 24
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.97 E-value=2.2e-09 Score=82.19 Aligned_cols=56 Identities=36% Similarity=0.692 Sum_probs=51.6
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ 423 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~ 423 (443)
.+++|.+|.+++||+++||++||+|++|||+++.++ +++.+++....|++++|+++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~ 70 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTVR 70 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEEC
Confidence 489999999999999999999999999999999999 99999998877888888763
No 25
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.81 E-value=1.6e-08 Score=104.56 Aligned_cols=70 Identities=26% Similarity=0.507 Sum_probs=64.8
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEecC
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPEE 438 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~ 438 (443)
.+++|.+|.++|||+++||++||+|++|||++|++|+|+.+.+....++++.+++.| +|+..+++++++.
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~~ 272 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVER-NGETLSISLTPEA 272 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEE-CCEEEEEEEEEcC
Confidence 489999999999999999999999999999999999999999988888899999999 8888888888753
No 26
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.73 E-value=1.9e-07 Score=88.69 Aligned_cols=160 Identities=18% Similarity=0.220 Sum_probs=93.7
Q ss_pred cEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEe----CCCc-E--EEEEEEeec-C---CCCEEEEEEcCC---
Q 013444 155 GIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTL----QDGR-T--FEGTVLNAD-F---HSDIAIVKINSK--- 220 (443)
Q Consensus 155 ~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~----~dg~-~--~~a~vv~~d-~---~~DlAlLkl~~~--- 220 (443)
..|++|+|+++ .+||++||+...... ...+.+.. .++. . +........ . ..|.+...+...
T Consensus 64 ~~~~~~lI~pn-tvLTa~Hc~~s~~~G----~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~ 138 (251)
T COG3591 64 LCTAATLIGPN-TVLTAGHCIYSPDYG----EDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALE 138 (251)
T ss_pred ceeeEEEEcCc-eEEEeeeEEecCCCC----hhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhc
Confidence 45677999998 999999999764321 11221111 1111 1 111112111 2 345555555421
Q ss_pred ------CCCCccccCCCCCCCCCCEEEEEecCCCCCCce----EEEEEEeeecCccCCCCCCccceEEEEcccCCCCCcc
Q 013444 221 ------TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTV----TAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSG 290 (443)
Q Consensus 221 ------~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~----t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SG 290 (443)
.-.....+......+.++.+.++|||.+..+.. ..+.+... ....+..+|.+++|+||
T Consensus 139 ~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SG 206 (251)
T COG3591 139 SGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSG 206 (251)
T ss_pred cCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCC
Confidence 111223344456778899999999998765332 22222211 12368899999999999
Q ss_pred ceeeecCCCEEEEEEEEeecCCC--eEE-EEeHHHHHHHHHHHH
Q 013444 291 GPLVNIDGEIVGINIMKVAAADG--LSF-AVPIDSAAKIIEQFK 331 (443)
Q Consensus 291 GPlvd~~G~VVGI~s~~~~~~~g--~~~-aIPi~~i~~~l~~l~ 331 (443)
+|+++.+.+|||++..+....++ .++ ..-...++++|+++.
T Consensus 207 Spv~~~~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 207 SPVLISKDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred CceEecCceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 99999999999999887653221 222 233445777766543
No 27
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.72 E-value=4.9e-08 Score=101.94 Aligned_cols=69 Identities=28% Similarity=0.549 Sum_probs=63.4
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 437 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 437 (443)
.+++|.+|.++|||+++||++||+|++|||++|++|+|+.+.+....++.+.+++.| +|+..++++++.
T Consensus 221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R-~g~~~~~~v~~~ 289 (449)
T PRK10779 221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIER-QGSPLSLTLTPD 289 (449)
T ss_pred cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEE-CCEEEEEEEEee
Confidence 367899999999999999999999999999999999999999888788899999999 888888888775
No 28
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.71 E-value=2.7e-08 Score=103.83 Aligned_cols=67 Identities=15% Similarity=0.063 Sum_probs=58.6
Q ss_pred eEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEec
Q 013444 370 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPE 437 (443)
Q Consensus 370 ~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~ 437 (443)
.+|.+|.++|||++||||+||+|++|||++|++|+|+...+.. ..+++++++|.| +|+++++++++.
T Consensus 128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R-~gk~~~~~v~l~ 195 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAP-FGSDQRRDKTLD 195 (449)
T ss_pred ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEe-CCccceEEEEec
Confidence 3689999999999999999999999999999999999987765 667889999999 787766666553
No 29
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.63 E-value=1.4e-07 Score=94.89 Aligned_cols=67 Identities=25% Similarity=0.469 Sum_probs=56.4
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEECCCeEEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
.+++|..|.++|||+++||++||+|++|||++|.+| .++...+....|+++.+++.| +++..+++++
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R-~g~~~~~~v~ 130 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILR-AGKSKPLTFT 130 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEe-CCCCceEEEE
Confidence 589999999999999999999999999999999987 577777777788899999999 6554443333
No 30
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.60 E-value=2.5e-07 Score=72.85 Aligned_cols=54 Identities=26% Similarity=0.535 Sum_probs=47.6
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeC--CHHHHHHHHhcCCCCeEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ--SITEIIEIMGDRVGEPLKVVV 422 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~--s~~dl~~~l~~~~g~~v~l~v 422 (443)
.|++|.+|.++|||+++||++||+|++|||+++. +++++.+++....+ .+.+++
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v 81 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82)
T ss_pred CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence 5899999999999999999999999999999999 89999998876443 566654
No 31
>PRK10942 serine endoprotease; Provisional
Probab=98.59 E-value=2e-07 Score=97.83 Aligned_cols=65 Identities=34% Similarity=0.572 Sum_probs=58.4
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV 434 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v 434 (443)
.|++|.+|.++|+|+++||++||+|++|||++|.+++++.+++.... +.+.|+|.| +|+...+.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~~v~l~V~R-~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLALNIQR-GDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence 58999999999999999999999999999999999999999988744 689999999 777766554
No 32
>PRK10139 serine endoprotease; Provisional
Probab=98.58 E-value=2e-07 Score=97.25 Aligned_cols=65 Identities=26% Similarity=0.456 Sum_probs=58.7
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTV 434 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v 434 (443)
.|++|.+|.++|||+++||++||+|++|||++|.+|+++.+++..+. +++.++|+| +|+...+.+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R-~g~~~~~~~ 454 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVR-GNESIYLLL 454 (455)
T ss_pred CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE-CCEEEEEEe
Confidence 58999999999999999999999999999999999999999997754 689999999 787766654
No 33
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.55 E-value=3.2e-07 Score=72.48 Aligned_cols=57 Identities=33% Similarity=0.584 Sum_probs=47.6
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHh-cCCCCeEEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMG-DRVGEPLKVVVQR 424 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~-~~~g~~v~l~v~R 424 (443)
.|++|..|.+++||+++||++||+|++|||+++.++.+...... ...++.+.+++.|
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r 83 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLR 83 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEe
Confidence 68999999999999999999999999999999998766544322 2335689999988
No 34
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.55 E-value=3.5e-07 Score=92.55 Aligned_cols=70 Identities=23% Similarity=0.480 Sum_probs=60.7
Q ss_pred CCceEEeEEC--------CCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEec
Q 013444 367 KSGVLVPVVT--------PGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIPE 437 (443)
Q Consensus 367 ~~g~~V~~v~--------~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~~ 437 (443)
..|++|.... .+|||+++||++||+|++|||++|++|+|+.+++....++++.++|.| +++..++.++|.
T Consensus 104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R-~Ge~~tv~V~Pv 181 (402)
T TIGR02860 104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIER-GGKIIETVIKPV 181 (402)
T ss_pred cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEE-CCEEEEEEEEEe
Confidence 3688885542 369999999999999999999999999999999988778899999999 888888888765
No 35
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.51 E-value=4.5e-07 Score=92.96 Aligned_cols=68 Identities=25% Similarity=0.498 Sum_probs=59.0
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEEECCCeEEEEEEEe
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQRANDQLVTLTVIP 436 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~v~~ 436 (443)
.+++|..|.++|||+++||++||+|++|||++|.+ +.++...+....|+.+.|+|.| +++..+++++.
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r-~g~~~~~~l~r 171 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRR-GPETRLVTLTR 171 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEE-CCEEEEEEEEe
Confidence 48999999999999999999999999999999985 4777777777788899999999 77776666654
No 36
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.49 E-value=2.8e-07 Score=72.72 Aligned_cols=70 Identities=31% Similarity=0.584 Sum_probs=54.6
Q ss_pred eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCC
Q 013444 339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGE 416 (443)
Q Consensus 339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~ 416 (443)
..|||++..-.. ....+++|.+|.++++|+++||++||+|++|||+.+.++ .++..++....+
T Consensus 10 ~~lG~~l~~~~~--------------~~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~- 74 (81)
T PF00595_consen 10 GPLGFTLRGGSD--------------NDEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN- 74 (81)
T ss_dssp SBSSEEEEEEST--------------SSSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS-
T ss_pred CCcCEEEEecCC--------------CCcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC-
Confidence 468888765322 113599999999999999999999999999999999876 455666666544
Q ss_pred eEEEEEE
Q 013444 417 PLKVVVQ 423 (443)
Q Consensus 417 ~v~l~v~ 423 (443)
+++|+|+
T Consensus 75 ~v~L~V~ 81 (81)
T PF00595_consen 75 PVTLTVQ 81 (81)
T ss_dssp EEEEEEE
T ss_pred cEEEEEC
Confidence 7888764
No 37
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.48 E-value=3e-07 Score=93.60 Aligned_cols=63 Identities=22% Similarity=0.370 Sum_probs=54.9
Q ss_pred EeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEE-ECCCeEEEEEEEecC
Q 013444 372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQ-RANDQLVTLTVIPEE 438 (443)
Q Consensus 372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~-R~~g~~~~l~v~~~~ 438 (443)
|..|.|+|+|+++||++||+|++|||++|.+|.|+...+. ++.+.++|. | +|+..++++.+++
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~r-dGe~~~l~Ie~~~ 65 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDA-NGESHQIEIEKDL 65 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcC-CCeEEEEEEecCC
Confidence 6679999999999999999999999999999999987774 356889997 6 7888888888753
No 38
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.35 E-value=2.9e-05 Score=74.53 Aligned_cols=180 Identities=22% Similarity=0.265 Sum_probs=97.5
Q ss_pred CceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC---------C---cEEEE-E
Q 013444 136 PAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD---------G---RTFEG-T 202 (443)
Q Consensus 136 pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d---------g---~~~~a-~ 202 (443)
|-.|.+..... ....|.|.+|+++ ||||++||+.... .. .+.|.+.. + ..... +
T Consensus 25 Pw~~~l~~~~~------~~~~Cggsli~~~-~vltaaHC~~~~~-----~~-~~~V~~G~~~~~~~~~~~~~~~~~~v~~ 91 (256)
T KOG3627|consen 25 PWQVSLQYGGN------GRHLCGGSLISPR-WVLTAAHCVKGAS-----AS-LYTVRLGEHDINLSVSEGEEQLVGDVEK 91 (256)
T ss_pred CCEEEEEECCC------cceeeeeEEeeCC-EEEEChhhCCCCC-----Cc-ceEEEECccccccccccCchhhhceeeE
Confidence 34666655432 1136778888766 9999999998742 01 33444321 1 11111 2
Q ss_pred EEeecC-------C-CCEEEEEEcCC----CCCCccccCCCCC---CCCCCEEEEEecCCCC------CCceEEEEEEee
Q 013444 203 VLNADF-------H-SDIAIVKINSK----TPLPAAKLGTSSK---LCPGDWVVAMGCPHSL------QNTVTAGIVSCV 261 (443)
Q Consensus 203 vv~~d~-------~-~DlAlLkl~~~----~~~~~~~l~~s~~---~~~G~~V~~iG~p~~~------~~~~t~G~Vs~~ 261 (443)
++ .|+ . +|||||+++.+ ..+.++.|..... ...+..+++.||+... ...+....+..+
T Consensus 92 ~i-~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~ 170 (256)
T KOG3627|consen 92 II-VHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPII 170 (256)
T ss_pred EE-ECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEc
Confidence 22 232 3 79999999874 3345666643222 3445888899997532 122332333333
Q ss_pred ecCccCCCCCC---ccceEEEEc-----ccCCCCCccceeeecC---CCEEEEEEEEeec--CC-CeEEEEeHHHHHHHH
Q 013444 262 DRKSSDLGLGG---MRREYLQTD-----CAINAGNSGGPLVNID---GEIVGINIMKVAA--AD-GLSFAVPIDSAAKII 327 (443)
Q Consensus 262 ~~~~~~~~~~~---~~~~~i~~d-----~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~--~~-g~~~aIPi~~i~~~l 327 (443)
....+...+.. .....+... ...|.|+|||||+-.+ ..++||++++... .. .-+....+....+++
T Consensus 171 ~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI 250 (256)
T KOG3627|consen 171 SNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWI 250 (256)
T ss_pred ChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHH
Confidence 33323222211 111234443 3368999999999654 5999999998642 11 122245555555555
Q ss_pred HH
Q 013444 328 EQ 329 (443)
Q Consensus 328 ~~ 329 (443)
++
T Consensus 251 ~~ 252 (256)
T KOG3627|consen 251 KE 252 (256)
T ss_pred HH
Confidence 44
No 39
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.30 E-value=3.5e-06 Score=86.60 Aligned_cols=70 Identities=36% Similarity=0.567 Sum_probs=60.2
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEECC-CeEEEEEEEec
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQRAN-DQLVTLTVIPE 437 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R~~-g~~~~l~v~~~ 437 (443)
.++.|.++.+++||+++||++||+|++|||+++... +++.+.+....|..++|++.|.+ ++.++++++.+
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l~Re 184 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTLTRE 184 (406)
T ss_pred CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCCceeEEEEEEE
Confidence 688999999999999999999999999999999765 56778888899999999999953 45666666654
No 40
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.29 E-value=4.7e-06 Score=66.75 Aligned_cols=67 Identities=21% Similarity=0.419 Sum_probs=49.5
Q ss_pred CCceEEeEECCC--------CccccCCC--CCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEEE
Q 013444 367 KSGVLVPVVTPG--------SPAHLAGF--LPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTLT 433 (443)
Q Consensus 367 ~~g~~V~~v~~~--------spA~~aGl--~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l~ 433 (443)
..+..|..|.++ ||..+.|+ ++||+|++|||+++..-.++..+|..+.|+.+.|+|.+.+++.+++.
T Consensus 11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~~~R~v~ 87 (88)
T PF14685_consen 11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPGGARTVV 87 (88)
T ss_dssp TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT-EEEEE
T ss_pred CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCCCceEEE
Confidence 368889999875 77777765 59999999999999999999999999999999999999555555554
No 41
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.28 E-value=3.4e-05 Score=72.79 Aligned_cols=165 Identities=18% Similarity=0.219 Sum_probs=84.4
Q ss_pred HhCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE----EEEeecC
Q 013444 133 RVCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG----TVLNADF 208 (443)
Q Consensus 133 ~~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a----~vv~~d~ 208 (443)
-+...|.+|.-..+... ..--|+.++ + +|+|++|..+.. ++.++|...-|.-.-. --+..=+
T Consensus 15 ~Ia~~ic~l~n~s~~~~-----~~l~gigyG-~-~iItn~HLf~~n-------ng~L~i~s~hG~f~v~nt~~lkv~~i~ 80 (235)
T PF00863_consen 15 PIASNICRLTNESDGGT-----RSLYGIGYG-S-YIITNAHLFKRN-------NGELTIKSQHGEFTVPNTTQLKVHPIE 80 (235)
T ss_dssp HHHTTEEEEEEEETTEE-----EEEEEEEET-T-EEEEEGGGGSST-------TCEEEEEETTEEEEECEGGGSEEEE-T
T ss_pred hhhheEEEEEEEeCCCe-----EEEEEEeEC-C-EEEEChhhhccC-------CCeEEEEeCceEEEcCCccccceEEeC
Confidence 44566777764332211 223356665 3 999999999764 3567777665532111 1223335
Q ss_pred CCCEEEEEEcCCCCCCccccC-CCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCC
Q 013444 209 HSDIAIVKINSKTPLPAAKLG-TSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAG 287 (443)
Q Consensus 209 ~~DlAlLkl~~~~~~~~~~l~-~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G 287 (443)
..||.++++..+ +||.+-. .-..++.++.|+++|.-+.... ....|+....... .....++.+......|
T Consensus 81 ~~DiviirmPkD--fpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p-----~~~~~fWkHwIsTk~G 151 (235)
T PF00863_consen 81 GRDIVIIRMPKD--FPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYP-----EENSHFWKHWISTKDG 151 (235)
T ss_dssp CSSEEEEE--TT--S----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEE-----ETTTTEEEE-C---TT
T ss_pred CccEEEEeCCcc--cCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEee-----cCCCCeeEEEecCCCC
Confidence 789999999753 4433221 1246789999999998554221 1222222221111 1123466677777889
Q ss_pred Cccceeeec-CCCEEEEEEEEeecCCCeEEEEeHH
Q 013444 288 NSGGPLVNI-DGEIVGINIMKVAAADGLSFAVPID 321 (443)
Q Consensus 288 ~SGGPlvd~-~G~VVGI~s~~~~~~~g~~~aIPi~ 321 (443)
+=|.|+|+. ||.+|||++..... ...+|+.|+.
T Consensus 152 ~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~ 185 (235)
T PF00863_consen 152 DCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFP 185 (235)
T ss_dssp -TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--
T ss_pred ccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCC
Confidence 999999976 59999999986544 4677888775
No 42
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.23 E-value=1.8e-06 Score=89.50 Aligned_cols=63 Identities=17% Similarity=0.285 Sum_probs=55.3
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCCeEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRANDQLVTL 432 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g~~~~l 432 (443)
.|.+|.+|.++|||++|||++||+|++|||+++.+++++...+.... +++.+++.| +++..++
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~-~~v~~~I~r-~g~~~~l 190 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA-GEPMVEILA-ERENWTF 190 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc-ccceEEEEE-ecCceEe
Confidence 68889999999999999999999999999999999999998887655 678899999 5555443
No 43
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=98.10 E-value=9.3e-06 Score=73.88 Aligned_cols=72 Identities=26% Similarity=0.376 Sum_probs=60.8
Q ss_pred ceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHH---HHhcCCCCeEEEEEEECCCeEEEEEEEecCCCC
Q 013444 369 GVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIE---IMGDRVGEPLKVVVQRANDQLVTLTVIPEEANP 441 (443)
Q Consensus 369 g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~---~l~~~~g~~v~l~v~R~~g~~~~l~v~~~~~~~ 441 (443)
-++|.+|.|+|||+++||+.||.|++++...--++..|+. ......++.+.++|.| .|+.+.+.++|....+
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R-~g~~v~L~ltP~~W~G 214 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIR-EGQKVVLSLTPKKWQG 214 (231)
T ss_pred eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEec-CCCEEEEEeCcccccC
Confidence 4678999999999999999999999999887777665553 3345778899999999 8999999999987654
No 44
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=98.03 E-value=1.6e-05 Score=69.27 Aligned_cols=72 Identities=28% Similarity=0.473 Sum_probs=55.0
Q ss_pred CCceEEeEECCCCccccCCCCC-CCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCC-eEEEEEEEecC
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLP-SDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAND-QLVTLTVIPEE 438 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~-GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g-~~~~l~v~~~~ 438 (443)
..++-|.+|.|+|||++|||++ .|.|+.+|+....+.++|.+.+..+.++++.|.|++... ....+.++|..
T Consensus 42 ~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~~ 115 (138)
T PF04495_consen 42 EEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYNSKTDSVREVTITPSR 115 (138)
T ss_dssp CCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEETTTTCEEEEEE---T
T ss_pred cceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEECCCCeEEEEEEEcCC
Confidence 4588899999999999999999 599999999999999999999999999999999997433 45678888764
No 45
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.99 E-value=3.2e-05 Score=83.79 Aligned_cols=70 Identities=14% Similarity=0.352 Sum_probs=55.9
Q ss_pred CceEEeEECCCCccccC-CCCCCCEEEEEC--CEeeC-----CHHHHHHHHhcCCCCeEEEEEEEC--CCeEEEEEEEec
Q 013444 368 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFD--GKPVQ-----SITEIIEIMGDRVGEPLKVVVQRA--NDQLVTLTVIPE 437 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~a-Gl~~GDiI~~vn--g~~V~-----s~~dl~~~l~~~~g~~v~l~v~R~--~g~~~~l~v~~~ 437 (443)
.+++|.+|.+|+||+++ ||++||+|++|| |+++. +.+++..+|....|.+|.|+|.|. +++..+++++..
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl~R~ 334 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLTRD 334 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEEEee
Confidence 46899999999999998 999999999999 55443 245788888888999999999983 245566666543
No 46
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.93 E-value=3.8e-05 Score=74.33 Aligned_cols=68 Identities=26% Similarity=0.425 Sum_probs=59.1
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP 436 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~ 436 (443)
.|+++..+..++|+... |+.||.|++|||+++.+.+|+.+++.. +.|++++|++.|.++++...++++
T Consensus 130 ~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl 198 (342)
T COG3480 130 AGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITL 198 (342)
T ss_pred eeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence 59999999999999887 999999999999999999999998876 899999999998666655444443
No 47
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.83 E-value=4.6e-05 Score=73.60 Aligned_cols=61 Identities=18% Similarity=0.350 Sum_probs=52.0
Q ss_pred ECCCC---ccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEe
Q 013444 375 VTPGS---PAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIP 436 (443)
Q Consensus 375 v~~~s---pA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~ 436 (443)
+.|+. -.+++|||+||++++|||.++.+.++..+++.+ .....++|+|+| +|+.+++.+.+
T Consensus 211 l~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeR-dGq~~~i~i~l 275 (276)
T PRK09681 211 VKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLR-KGARHDISIAL 275 (276)
T ss_pred ECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEE-CCEEEEEEEEc
Confidence 55653 457889999999999999999999998888876 667789999999 99998887754
No 48
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.77 E-value=4.3e-05 Score=78.65 Aligned_cols=64 Identities=25% Similarity=0.387 Sum_probs=53.6
Q ss_pred CCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEEecC
Q 013444 366 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVIPEE 438 (443)
Q Consensus 366 ~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~~~~ 438 (443)
...+.+|..|.++|||++|||.+||.|++|||. .+.+.. +.++.+++++.| .+..+++.+++..
T Consensus 460 ~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~-~~~L~e~~v~~~~ 524 (558)
T COG3975 460 EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKIQVHVFR-EGRLREFLVKLGG 524 (558)
T ss_pred cCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccccccccceEEEEcc-CCceEEeecccCC
Confidence 346789999999999999999999999999998 333443 788899999999 7888888777654
No 49
>PF12812 PDZ_1: PDZ-like domain
Probab=97.70 E-value=0.00018 Score=56.39 Aligned_cols=68 Identities=21% Similarity=0.249 Sum_probs=57.5
Q ss_pred eecCceeecccHHHHHHhhcCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCC
Q 013444 339 PWLGLKMLDLNDMIIAQLKERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRV 414 (443)
Q Consensus 339 p~lGi~~~~~~~~~~~~l~~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~ 414 (443)
-|.|..+.+|+-..++++.. .-|.++.....++++.+-|+..|-+|.+|||+++.+.++|.+.+++-+
T Consensus 9 ~~~Ga~f~~Ls~q~aR~~~~--------~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ip 76 (78)
T PF12812_consen 9 EVCGAVFHDLSYQQARQYGI--------PVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIP 76 (78)
T ss_pred EEcCeecccCCHHHHHHhCC--------CCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCC
Confidence 58899999999999998855 334566667888988877799999999999999999999999887643
No 50
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.54 E-value=0.0017 Score=64.06 Aligned_cols=51 Identities=18% Similarity=0.294 Sum_probs=34.0
Q ss_pred ccCCCCCccceeeec--CCC-EEEEEEEEeecCCC---eEEEEeHHHHHHHHHHHHH
Q 013444 282 CAINAGNSGGPLVNI--DGE-IVGINIMKVAAADG---LSFAVPIDSAAKIIEQFKK 332 (443)
Q Consensus 282 ~~i~~G~SGGPlvd~--~G~-VVGI~s~~~~~~~g---~~~aIPi~~i~~~l~~l~~ 332 (443)
...|.|+||||+|-. +|+ -+||++|+.....+ -+..--++....+|+...+
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~ 279 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN 279 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence 457899999999932 365 46999998764322 1223446677788877554
No 51
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.49 E-value=0.009 Score=58.42 Aligned_cols=177 Identities=20% Similarity=0.277 Sum_probs=100.2
Q ss_pred hCCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCC----C-----CCC------------ceEEEE
Q 013444 134 VCPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSR----A-----LPK------------GKVDVT 192 (443)
Q Consensus 134 ~~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~----~-----~~~------------~~i~V~ 192 (443)
-.|-.|.+....... .....+|++|+++ ||||++|++-...... . -.. ..+.+.
T Consensus 52 ~~pW~v~v~~~~~~~----~~~~~~gtlIS~R-HiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~ 126 (282)
T PF03761_consen 52 EAPWAVSVYTKNHNE----GNYFSTGTLISPR-HILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVR 126 (282)
T ss_pred CCCCEEEEEeccCcc----cceecceEEeccC-eEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHHhccEEEE
Confidence 456788887665322 1133499999998 9999999997322200 0 001 112220
Q ss_pred ----eCCC-----cEEEEEEEe-e-------cCCCCEEEEEEcCC--CCCCccccCCCC-CCCCCCEEEEEecCCCCCCc
Q 013444 193 ----LQDG-----RTFEGTVLN-A-------DFHSDIAIVKINSK--TPLPAAKLGTSS-KLCPGDWVVAMGCPHSLQNT 252 (443)
Q Consensus 193 ----~~dg-----~~~~a~vv~-~-------d~~~DlAlLkl~~~--~~~~~~~l~~s~-~~~~G~~V~~iG~p~~~~~~ 252 (443)
...+ +...|.++. + ...++++||+++.+ ....++.|+++. .+..++.+.+.|+... ..
T Consensus 127 ~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~ 204 (282)
T PF03761_consen 127 CCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GK 204 (282)
T ss_pred eecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--Ce
Confidence 0111 122344442 2 23579999999987 677888887643 4567899999888211 12
Q ss_pred eEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceee-ecCC--CEEEEEEEEeecC-CCeEEEEeHHHHHH
Q 013444 253 VTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLV-NIDG--EIVGINIMKVAAA-DGLSFAVPIDSAAK 325 (443)
Q Consensus 253 ~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlv-d~~G--~VVGI~s~~~~~~-~g~~~aIPi~~i~~ 325 (443)
+....+.-..... ....+......+.|++|||++ +.+| .||||.+.+.... ....+++.+..+++
T Consensus 205 ~~~~~~~i~~~~~--------~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 205 LKHRKLKITNCTK--------CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD 273 (282)
T ss_pred EEEEEEEEEEeec--------cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence 2222222111110 233455566778999999998 3334 5888876543221 12456677665544
No 52
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.44 E-value=0.00097 Score=63.24 Aligned_cols=116 Identities=23% Similarity=0.386 Sum_probs=60.7
Q ss_pred CCcEEEEEEEeCCC--EEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc-CCCCCCccccC
Q 013444 153 GRGIGSGAIVDADG--TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN-SKTPLPAAKLG 229 (443)
Q Consensus 153 ~~~~GSGfiI~~~G--~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~-~~~~~~~~~l~ 229 (443)
+...|||=++..+| .|+|+.||+.+ ....|...+ .... ..++..-|+|.-.++ -+...|..++.
T Consensus 110 Gss~Gsggvft~~~~~vvvTAtHVlg~---------~~a~v~~~g-~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a 176 (297)
T PF05579_consen 110 GSSVGSGGVFTIGGNTVVVTATHVLGG---------NTARVSGVG-TRRM---LTFKKNGDFAEADITNWPGAAPKYKFA 176 (297)
T ss_dssp SSSEEEEEEEECTTEEEEEEEHHHCBT---------TEEEEEETT-EEEE---EEEEEETTEEEEEETTS-S---B--B-
T ss_pred eecccccceEEECCeEEEEEEEEEcCC---------CeEEEEecc-eEEE---EEEeccCcEEEEECCCCCCCCCceeec
Confidence 34556666665554 89999999985 344454433 2222 234455699999994 34566777665
Q ss_pred CCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEE
Q 013444 230 TSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMK 307 (443)
Q Consensus 230 ~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~ 307 (443)
.. ..|---+.- +..+..|.|.. ...+ |-..+||||+|+++.+|.+|||++..
T Consensus 177 ~~---~~GrAyW~t------~tGvE~G~ig~--------------~~~~---~fT~~GDSGSPVVt~dg~liGVHTGS 228 (297)
T PF05579_consen 177 QN---YTGRAYWLT------STGVEPGFIGG--------------GGAV---CFTGPGDSGSPVVTEDGDLIGVHTGS 228 (297)
T ss_dssp TT----SEEEEEEE------TTEEEEEEEET--------------TEEE---ESS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred CC---cccceEEEc------ccCcccceecC--------------ceEE---EEcCCCCCCCccCcCCCCEEEEEecC
Confidence 21 112211110 11233444321 0111 33457999999999999999999874
No 53
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.41 E-value=0.00015 Score=58.18 Aligned_cols=36 Identities=31% Similarity=0.538 Sum_probs=32.8
Q ss_pred CCCCceEEeEECCCCccccCCCCCCCEEEEECCEee
Q 013444 365 NVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPV 400 (443)
Q Consensus 365 ~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V 400 (443)
-+..|++|++|.++|||+.|||+.+|.|+.+||...
T Consensus 56 ytD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~Df 91 (124)
T KOG3553|consen 56 YTDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDF 91 (124)
T ss_pred cCCccEEEEEeccCChhhhhcceecceEEEecCcee
Confidence 345799999999999999999999999999999765
No 54
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.35 E-value=0.0004 Score=64.70 Aligned_cols=67 Identities=15% Similarity=0.193 Sum_probs=56.5
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhc-CCCCeEEEEEEECCCeEEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGD-RVGEPLKVVVQRANDQLVTLTVI 435 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v~ 435 (443)
.|..+.-..+.+..+..|||.||+.+++|+..+++.+++..+|+. ...+.++++|+| +|+...+.+.
T Consensus 207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R-~G~rhdInV~ 274 (275)
T COG3031 207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIR-RGKRHDINVR 274 (275)
T ss_pred EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEe-cCccceeeec
Confidence 355555566678888899999999999999999999999998877 556789999999 8998887764
No 55
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.84 E-value=0.02 Score=53.18 Aligned_cols=160 Identities=18% Similarity=0.235 Sum_probs=84.8
Q ss_pred CcEEEEEEEeCC-CEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCC----------------CCEEEEE
Q 013444 154 RGIGSGAIVDAD-GTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFH----------------SDIAIVK 216 (443)
Q Consensus 154 ~~~GSGfiI~~~-G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLk 216 (443)
.+.||=.+++++ +..--=.|.+.+.+ ....+.+.+|+.+++.+....+. .-+.-+.
T Consensus 19 aGiGTlTf~dp~~~~fgALGH~I~D~d-------t~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~ 91 (218)
T PF05580_consen 19 AGIGTLTFYDPETGTFGALGHGISDVD-------TGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIE 91 (218)
T ss_pred cCeEEEEEEECCCCcEEecCCeEEcCC-------CCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEE
Confidence 367888999874 55555678887754 23345667788887776654321 1111222
Q ss_pred EcC----------C-----CCCCccccCCCCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCC----CccceE
Q 013444 217 INS----------K-----TPLPAAKLGTSSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG----GMRREY 277 (443)
Q Consensus 217 l~~----------~-----~~~~~~~l~~s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~----~~~~~~ 277 (443)
-+. . ...++++++...++++|..-+..=.. +.....-.=.|..+.........+ -...++
T Consensus 92 ~Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~-G~~ie~f~ieI~~v~~~~~~~~k~~vi~vtd~~L 170 (218)
T PF05580_consen 92 KNTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVID-GTKIEEFDIEIEKVLPQSSPSGKGMVIKVTDPRL 170 (218)
T ss_pred eccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEEc-CCeEEEeEEEEEEEccCCCCCCCcEEEEECCcch
Confidence 111 1 12234444445566666532211010 100000011111222211100000 011123
Q ss_pred EEEcccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHH
Q 013444 278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDS 322 (443)
Q Consensus 278 i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~ 322 (443)
+.....+-+|+||+|++ .+|++||=++..+.+....+|.++++.
T Consensus 171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~ 214 (218)
T PF05580_consen 171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEW 214 (218)
T ss_pred hhhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHH
Confidence 33445577899999999 699999999988877788999999765
No 56
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.68 E-value=0.0022 Score=66.76 Aligned_cols=59 Identities=24% Similarity=0.430 Sum_probs=47.3
Q ss_pred CCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HH-HHHHHhcCCCCeEEEEEEE
Q 013444 366 VKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TE-IIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 366 ~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~d-l~~~l~~~~g~~v~l~v~R 424 (443)
...|++|..|.++|||++-||+.||+|++||.++..+. ++ +.-+|.-..|+.++|.-.+
T Consensus 427 NDVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~ 488 (1027)
T KOG3580|consen 427 NDVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQS 488 (1027)
T ss_pred CceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhh
Confidence 35699999999999999999999999999999998775 23 3334455788888875543
No 57
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=96.48 E-value=0.006 Score=64.46 Aligned_cols=57 Identities=30% Similarity=0.471 Sum_probs=47.9
Q ss_pred CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEE
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R 424 (443)
..-+.|..|.++++|.++.|++||++++|||.+|.+..+..+.++...|+ +...++|
T Consensus 397 ~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~-~~~l~~~ 453 (1051)
T KOG3532|consen 397 NRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGD-LTVLVER 453 (1051)
T ss_pred ceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccc-eEEEEee
Confidence 45677899999999999999999999999999999999999998876665 3333333
No 58
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.39 E-value=0.015 Score=63.62 Aligned_cols=24 Identities=38% Similarity=0.440 Sum_probs=21.1
Q ss_pred cEEEEEEEeCCCEEEeccccccCC
Q 013444 155 GIGSGAIVDADGTILTCAHVVVDF 178 (443)
Q Consensus 155 ~~GSGfiI~~~G~ILTaaHvv~~~ 178 (443)
+-|||-||+++|+||||.||..++
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~~~ 70 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGYGA 70 (698)
T ss_pred CceeEEEEcCCceEEecchhhhhH
Confidence 349999999999999999998653
No 59
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.28 E-value=0.042 Score=58.65 Aligned_cols=117 Identities=22% Similarity=0.325 Sum_probs=72.9
Q ss_pred CCCCEEEEEEcCC--------CCC------CccccCC------CCCCCCCCEEEEEecCCCCCCceEEEEEEeeecCccC
Q 013444 208 FHSDIAIVKINSK--------TPL------PAAKLGT------SSKLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSD 267 (443)
Q Consensus 208 ~~~DlAlLkl~~~--------~~~------~~~~l~~------s~~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~ 267 (443)
.-.|+|||+++.. +++ |.+.+.+ ...+.+|..|+-+|...+ .|.|++.+.....
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy-- 614 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY-- 614 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--
Confidence 3469999999853 111 2222321 134677999999998766 4566665543221
Q ss_pred CCCCCc-cceEEEEc----ccCCCCCccceeeecCCC------EEEEEEEEeecCCCeEEEEeHHHHHHHHHHH
Q 013444 268 LGLGGM-RREYLQTD----CAINAGNSGGPLVNIDGE------IVGINIMKVAAADGLSFAVPIDSAAKIIEQF 330 (443)
Q Consensus 268 ~~~~~~-~~~~i~~d----~~i~~G~SGGPlvd~~G~------VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l 330 (443)
+..+.. ..+++... .-...|+||+=|++.-+. |+||.+..-....+++++.|+..|.+-|++.
T Consensus 615 w~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 615 WADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred ecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence 111111 12333333 223679999999986444 9999988666667899999998877766654
No 60
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.20 E-value=0.0075 Score=65.98 Aligned_cols=56 Identities=23% Similarity=0.315 Sum_probs=43.4
Q ss_pred eEEEEcccCCCCCccceeeecCCCEEEEEEEEee----------cCCCeEEEEeHHHHHHHHHHHH
Q 013444 276 EYLQTDCAINAGNSGGPLVNIDGEIVGINIMKVA----------AADGLSFAVPIDSAAKIIEQFK 331 (443)
Q Consensus 276 ~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~----------~~~g~~~aIPi~~i~~~l~~l~ 331 (443)
-.+.++..+..||||+|++|.+|+|||+++-+.. .....+..|-+..|.-+|+++-
T Consensus 622 v~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 622 VNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred eEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 3567888999999999999999999999986532 1224566777777888887764
No 61
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.14 E-value=0.32 Score=44.15 Aligned_cols=149 Identities=18% Similarity=0.230 Sum_probs=82.4
Q ss_pred CCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEE--EEEeecC---C
Q 013444 135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEG--TVLNADF---H 209 (443)
Q Consensus 135 ~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a--~vv~~d~---~ 209 (443)
..-++.|.+. ++...++++.|..+ ++|...|.-.. ..+.+ +|+.++. .+...+. .
T Consensus 12 ~~N~~~v~~~-------~g~~t~l~~gi~~~-~~lvp~H~~~~---------~~i~i---~g~~~~~~d~~~lv~~~~~~ 71 (172)
T PF00548_consen 12 KKNVVPVTTG-------KGEFTMLALGIYDR-YFLVPTHEEPE---------DTIYI---DGVEYKVDDSVVLVDRDGVD 71 (172)
T ss_dssp HHHEEEEEET-------TEEEEEEEEEEEBT-EEEEEGGGGGC---------SEEEE---TTEEEEEEEEEEEEETTSSE
T ss_pred hccEEEEEeC-------CceEEEecceEeee-EEEEECcCCCc---------EEEEE---CCEEEEeeeeEEEecCCCcc
Confidence 3456666662 23466888889876 99999992221 23433 3555543 2223343 4
Q ss_pred CCEEEEEEcCCCCCCccccCCCCCC-CCCCEEEEEecCCCCC-CceEEEEEEeeecCccCCCCCCccceEEEEcccCCCC
Q 013444 210 SDIAIVKINSKTPLPAAKLGTSSKL-CPGDWVVAMGCPHSLQ-NTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAG 287 (443)
Q Consensus 210 ~DlAlLkl~~~~~~~~~~l~~s~~~-~~G~~V~~iG~p~~~~-~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G 287 (443)
.||++++++....++-+.---...+ ...+...++ +..... .....+.++....... .+......+..+++..+|
T Consensus 72 ~Dl~~v~l~~~~kfrDIrk~~~~~~~~~~~~~l~v-~~~~~~~~~~~v~~v~~~~~i~~---~g~~~~~~~~Y~~~t~~G 147 (172)
T PF00548_consen 72 TDLTLVKLPRNPKFRDIRKFFPESIPEYPECVLLV-NSTKFPRMIVEVGFVTNFGFINL---SGTTTPRSLKYKAPTKPG 147 (172)
T ss_dssp EEEEEEEEESSS-B--GGGGSBSSGGTEEEEEEEE-ESSSSTCEEEEEEEEEEEEEEEE---TTEEEEEEEEEESEEETT
T ss_pred eeEEEEEccCCcccCchhhhhccccccCCCcEEEE-ECCCCccEEEEEEEEeecCcccc---CCCEeeEEEEEccCCCCC
Confidence 6999999987544432221011122 223333333 333333 2334444443333211 112234578888889999
Q ss_pred Cccceeeec---CCCEEEEEEEE
Q 013444 288 NSGGPLVNI---DGEIVGINIMK 307 (443)
Q Consensus 288 ~SGGPlvd~---~G~VVGI~s~~ 307 (443)
+-||||+.. .++++||+.++
T Consensus 148 ~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 148 MCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp GTTEEEEESCGGTTEEEEEEEEE
T ss_pred ccCCeEEEeeccCccEEEEEecc
Confidence 999999942 47999999875
No 62
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=95.86 E-value=0.03 Score=48.66 Aligned_cols=58 Identities=24% Similarity=0.466 Sum_probs=44.5
Q ss_pred CCCCceEEeEECCCCccccC-CCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEE
Q 013444 365 NVKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ 423 (443)
Q Consensus 365 ~~~~g~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~ 423 (443)
.....++|+.|.|++.|++- ||+.||.+++|||..|..- +...++|+...| .+++.|.
T Consensus 112 eqnspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g-svklvvr 172 (207)
T KOG3550|consen 112 EQNSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG-SVKLVVR 172 (207)
T ss_pred ccCCceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC-cEEEEEe
Confidence 34568999999999999986 7999999999999998643 344566665555 4666654
No 63
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.45 E-value=0.023 Score=60.62 Aligned_cols=52 Identities=19% Similarity=0.489 Sum_probs=43.1
Q ss_pred EeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEE
Q 013444 372 VPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 372 V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R 424 (443)
|..|.++|||++.| |+.||.|++|||+.|.+. .|+..++++ .|-.|+|+|.-
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd-aGlsVtLtIip 836 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD-AGLSVTLTIIP 836 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh-cCceEEEEEcC
Confidence 67789999999975 999999999999999754 566776665 57789998875
No 64
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=95.43 E-value=0.056 Score=55.05 Aligned_cols=56 Identities=30% Similarity=0.520 Sum_probs=47.9
Q ss_pred EECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHHHhcCCCCe---EEEEEEECCCeE
Q 013444 374 VVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEIMGDRVGEP---LKVVVQRANDQL 429 (443)
Q Consensus 374 ~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~l~~~~g~~---v~l~v~R~~g~~ 429 (443)
.+..++++..+|+++||.|+++|++++.+|+++...+....+.. +.+.+.|-++..
T Consensus 135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~ 193 (375)
T COG0750 135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA 193 (375)
T ss_pred ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence 78999999999999999999999999999999998887766665 788888833333
No 65
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.38 E-value=0.047 Score=47.13 Aligned_cols=33 Identities=36% Similarity=0.555 Sum_probs=23.3
Q ss_pred EEcccCCCCCccceeeecCCCEEEEEEEEeecC
Q 013444 279 QTDCAINAGNSGGPLVNIDGEIVGINIMKVAAA 311 (443)
Q Consensus 279 ~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~ 311 (443)
..+..+.+|.||+|+||.+|++|||...+..-.
T Consensus 89 ~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 89 AIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp EE---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred eeecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 344557799999999999999999998876543
No 66
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.01 E-value=0.067 Score=57.19 Aligned_cols=60 Identities=22% Similarity=0.401 Sum_probs=49.6
Q ss_pred CCCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeC--CHHHHHHHHhc-CCCCeEEEEEEE
Q 013444 365 NVKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQ--SITEIIEIMGD-RVGEPLKVVVQR 424 (443)
Q Consensus 365 ~~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~--s~~dl~~~l~~-~~g~~v~l~v~R 424 (443)
.....++|..|.+.+.|++.| |++||.|+.|||.+|. +-.++..+|.. ..+..|.|+|.|
T Consensus 671 ep~qpi~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~~vv~Lm~~AArnghV~LtVRR 734 (984)
T KOG3209|consen 671 EPGQPIYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHSEVVDLMEAAARNGHVNLTVRR 734 (984)
T ss_pred CCCCeeEEeeeeecccccccCcccCCCeEEEecCeeccCccHHHHHHHHHHHHhcCceEEEEee
Confidence 456789999999999999986 9999999999999995 66777777765 334468999987
No 67
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=94.57 E-value=0.061 Score=58.99 Aligned_cols=55 Identities=24% Similarity=0.465 Sum_probs=45.3
Q ss_pred CceEEeEECCCCccccCCCCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEEE
Q 013444 368 SGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~R 424 (443)
..++|..|.+|+|+.-. |++||+|++|||++|++ |+.+.++++.. .+.|.|+|.+
T Consensus 75 rPviVr~VT~GGps~GK-L~PGDQIl~vN~Epv~daprervIdlvRac-e~sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIGK-LQPGDQILAVNGEPVKDAPRERVIDLVRAC-ESSVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCcccc-ccCCCeEEEecCcccccccHHHHHHHHHHH-hhhcceEEec
Confidence 68999999999999865 99999999999999974 67777777653 2457788877
No 68
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.40 E-value=0.34 Score=49.66 Aligned_cols=46 Identities=22% Similarity=0.369 Sum_probs=37.2
Q ss_pred EEEcccCCCCCccceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHH
Q 013444 278 LQTDCAINAGNSGGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAA 324 (443)
Q Consensus 278 i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~ 324 (443)
+.....+-+|+||+|++ .+|++||=++-.+-+.+..+|+|-++.-.
T Consensus 351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~Ml 396 (402)
T TIGR02860 351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWML 396 (402)
T ss_pred hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHHH
Confidence 33345677899999999 79999998888888888899999776543
No 69
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=94.40 E-value=0.079 Score=54.58 Aligned_cols=58 Identities=21% Similarity=0.547 Sum_probs=43.5
Q ss_pred CCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhc---CCCCeEEEEEEE
Q 013444 366 VKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGD---RVGEPLKVVVQR 424 (443)
Q Consensus 366 ~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~---~~g~~v~l~v~R 424 (443)
...|++|.+|.+++.-+..| |.+||.|++||.....++ +|....|.+ +.| +++++|-.
T Consensus 275 gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~g-Pi~ltvAk 338 (626)
T KOG3571|consen 275 GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPG-PIKLTVAK 338 (626)
T ss_pred CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCC-CeEEEEee
Confidence 35799999999988776665 999999999999887654 344555554 333 58888876
No 70
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=94.25 E-value=0.57 Score=44.39 Aligned_cols=100 Identities=19% Similarity=0.256 Sum_probs=66.3
Q ss_pred CCceEEEEecccccccccCCcEEEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEE------EEEEeec-
Q 013444 135 CPAVVNLSAPREFLGILSGRGIGSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE------GTVLNAD- 207 (443)
Q Consensus 135 ~pSVV~I~~~~~~~~~~~~~~~GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~------a~vv~~d- 207 (443)
.|-.+.|.. .|...|||++|+++ |||++..|+.+..- ....+.+.+..++.+. -++..+|
T Consensus 16 WPWlA~IYv--------dG~~~CsgvLlD~~-WlLvsssCl~~I~L----~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~ 82 (267)
T PF09342_consen 16 WPWLADIYV--------DGRYWCSGVLLDPH-WLLVSSSCLRGISL----SHHYVSALLGGGKTYLSVDGPHEQISRVDC 82 (267)
T ss_pred CcceeeEEE--------cCeEEEEEEEeccc-eEEEeccccCCccc----ccceEEEEecCcceecccCCChheEEEeee
Confidence 466666654 34578999999988 99999999987431 1256777787776544 1233333
Q ss_pred ----CCCCEEEEEEcCCCC----CCccccCC-CCCCCCCCEEEEEecCC
Q 013444 208 ----FHSDIAIVKINSKTP----LPAAKLGT-SSKLCPGDWVVAMGCPH 247 (443)
Q Consensus 208 ----~~~DlAlLkl~~~~~----~~~~~l~~-s~~~~~G~~V~~iG~p~ 247 (443)
++.+++||.++.+.. +.|.-+.. .......+.++++|.-.
T Consensus 83 ~~~V~~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 83 FKDVPESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeeccccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 578999999997632 23444432 23444556899999765
No 71
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=94.09 E-value=0.043 Score=58.19 Aligned_cols=56 Identities=27% Similarity=0.457 Sum_probs=43.4
Q ss_pred CCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHH--HHHHhcCCCCeEEEEEEE
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEI--IEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl--~~~l~~~~g~~v~l~v~R 424 (443)
..|++|.+|.|++.|++.|++.||.|++|||+...++.-. .++|.+ ...+.|+|+.
T Consensus 561 GfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKt 618 (1283)
T KOG3542|consen 561 GFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKT 618 (1283)
T ss_pred cceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEec
Confidence 4589999999999999999999999999999998776432 234443 2346666653
No 72
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.01 E-value=0.14 Score=53.86 Aligned_cols=63 Identities=22% Similarity=0.443 Sum_probs=48.0
Q ss_pred CCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeCCHHHHHHH-HhcCCCCeEEEEEEE
Q 013444 361 PSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSITEIIEI-MGDRVGEPLKVVVQR 424 (443)
Q Consensus 361 ~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~~dl~~~-l~~~~g~~v~l~v~R 424 (443)
+.|.+-...++|..|.|++||+-- |+.||.|+-|||....+......+ .-.+.|+...|+|+|
T Consensus 33 Phf~~getSiViSDVlpGGPAeG~-LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~ItvkR 96 (1027)
T KOG3580|consen 33 PHFENGETSIVISDVLPGGPAEGL-LQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITVKR 96 (1027)
T ss_pred CCccCCceeEEEeeccCCCCcccc-cccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEecc
Confidence 344444567999999999999976 999999999999988776544331 122567778899988
No 73
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=93.98 E-value=0.096 Score=55.53 Aligned_cols=116 Identities=24% Similarity=0.373 Sum_probs=72.2
Q ss_pred CCccceee-----ecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHHHcCceeeee---cCc-eeecccHHHHHHhh
Q 013444 287 GNSGGPLV-----NIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFKKNGRVVRPW---LGL-KMLDLNDMIIAQLK 357 (443)
Q Consensus 287 G~SGGPlv-----d~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~~~g~v~rp~---lGi-~~~~~~~~~~~~l~ 357 (443)
=++|||.- |...+++.|+-... ..+|.+..+.+++.+|+.-.++.-. --+ ++.-.-++..-+|+
T Consensus 680 mm~~GpAarsgkLnIGDQiiaING~SL-------VGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~kyQLG 752 (829)
T KOG3605|consen 680 MMHGGPAARSGKLNIGDQIMSINGTSL-------VGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLRYQLG 752 (829)
T ss_pred cccCChhhhcCCccccceeEeecCcee-------ccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccchhhcc
Confidence 45666663 44445666653221 2499999999999998766553111 111 11112233333332
Q ss_pred cCCCCCCCCCCceEEeEECCCCccccCCCCCCCEEEEECCEeeC--CHHHHHHHHhcCCCC
Q 013444 358 ERDPSFPNVKSGVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ--SITEIIEIMGDRVGE 416 (443)
Q Consensus 358 ~~~~~~~~~~~g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~--s~~dl~~~l~~~~g~ 416 (443)
- ...+|++- +...++.|++-|++.|-.|++|||+.|. --+.+..+|...+|+
T Consensus 753 F------SVQNGiIC-SLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~aVGE 806 (829)
T KOG3605|consen 753 F------SVQNGIIC-SLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSNAVGE 806 (829)
T ss_pred c------eeeCcEee-hhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHHhhhh
Confidence 2 34567754 4778999999999999999999999883 334566667666654
No 74
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=93.95 E-value=0.17 Score=43.38 Aligned_cols=33 Identities=21% Similarity=0.352 Sum_probs=26.9
Q ss_pred cccCCCCCccceeeecCCCEEEEEEEEeecCCC
Q 013444 281 DCAINAGNSGGPLVNIDGEIVGINIMKVAAADG 313 (443)
Q Consensus 281 d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~g 313 (443)
...-.+|+||-|++|..|+||||+..+..+...
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~R 132 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRR 132 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEEEETTE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCCCCCCc
Confidence 445679999999999999999999988766443
No 75
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=93.95 E-value=0.31 Score=45.35 Aligned_cols=117 Identities=26% Similarity=0.334 Sum_probs=47.4
Q ss_pred EEEeccccccCCCCCCCCCCceEEEEeCCCcEEE---EEEEeecCCCCEEEEEEcCC----CCCCccccCCCCCCCCCCE
Q 013444 167 TILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFE---GTVLNADFHSDIAIVKINSK----TPLPAAKLGTSSKLCPGDW 239 (443)
Q Consensus 167 ~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~---a~vv~~d~~~DlAlLkl~~~----~~~~~~~l~~s~~~~~G~~ 239 (443)
.++|+.||..+.. .+ ..+.+|+.++ -+.+..+...|++||+.... ...+.+.+.....+.
T Consensus 43 ~L~ta~Hv~~~~~--------~~-~~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~---- 109 (203)
T PF02122_consen 43 ALLTARHVWSRPS--------KV-TSLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLA---- 109 (203)
T ss_dssp EEEE-HHHHTSSS------------EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEE----
T ss_pred ceecccccCCCcc--------ce-eEcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhC----
Confidence 8999999998732 22 2334444444 24555678899999999732 222333332221111
Q ss_pred EEEEecCCCCCCceEEEEEEeeecCccCCCCCCccceEEEEcccCCCCCccceeeecCCCEEEEEEEE
Q 013444 240 VVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLGGMRREYLQTDCAINAGNSGGPLVNIDGEIVGINIMK 307 (443)
Q Consensus 240 V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~ 307 (443)
-| +... .....+...+...... +....+...-+...+|.||.|+++.+ +++|++...
T Consensus 110 ---~g-~~~~-y~~~~~~~~~~sa~i~-----g~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~ 166 (203)
T PF02122_consen 110 ---KG-PVSF-YGFSSGEWPCSSAKIP-----GTEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS 166 (203)
T ss_dssp ---EE-ESST-TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred ---CC-Ceee-eeecCCCceeccCccc-----cccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence 01 0000 1111211111111110 11123566778889999999999877 999999874
No 76
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=92.28 E-value=0.22 Score=50.61 Aligned_cols=70 Identities=23% Similarity=0.434 Sum_probs=52.0
Q ss_pred CCceEEeEECCCCccccCCCCCC-CEEEEECCEeeCCHHHHHH-HHhcCCCCeEEEEEEECCCe-EEEEEEEec
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPS-DVVIKFDGKPVQSITEIIE-IMGDRVGEPLKVVVQRANDQ-LVTLTVIPE 437 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~G-DiI~~vng~~V~s~~dl~~-~l~~~~g~~v~l~v~R~~g~-~~~l~v~~~ 437 (443)
..|.-|.+|.++|+|.++||.+= |-|++|||..++..+|..+ .|+....+ |+++|...... ...+.|++.
T Consensus 14 teg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps 86 (462)
T KOG3834|consen 14 TEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPS 86 (462)
T ss_pred ceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecccceeEEEEeccc
Confidence 46888999999999999999997 8999999999986666554 44545444 99999863222 344555554
No 77
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=91.52 E-value=0.51 Score=45.13 Aligned_cols=57 Identities=21% Similarity=0.454 Sum_probs=45.7
Q ss_pred CCceEEeEECCCCccccCC-CCCCCEEEEECCEee--CCHHHHHHHHhcCCCCeEEEEEEE
Q 013444 367 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPV--QSITEIIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V--~s~~dl~~~l~~~~g~~v~l~v~R 424 (443)
..|++|....|++-|+..| |...|.|++|||.+| ++.+++.++|-.+. ..+.++|+-
T Consensus 193 vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs-hNLIiTVkP 252 (358)
T KOG3606|consen 193 VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS-HNLIITVKP 252 (358)
T ss_pred cCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc-cceEEEecc
Confidence 3599999999999999998 567899999999999 58889988775422 236666654
No 78
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=91.15 E-value=0.23 Score=49.00 Aligned_cols=56 Identities=18% Similarity=0.436 Sum_probs=46.8
Q ss_pred CCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEE
Q 013444 367 KSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQ 423 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~ 423 (443)
.-+++|..|.++-.|+..| |-.||-|++|||..|+. -+|+..+|+ +.|+.|+|+|.
T Consensus 79 n~PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLR-NAGdeVtlTV~ 137 (505)
T KOG3549|consen 79 NLPVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILR-NAGDEVTLTVK 137 (505)
T ss_pred CccEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHH-hcCCEEEEEeH
Confidence 4589999999999999887 67999999999999974 467777666 46888888886
No 79
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=90.53 E-value=0.54 Score=45.63 Aligned_cols=55 Identities=16% Similarity=0.287 Sum_probs=42.8
Q ss_pred ceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEEE
Q 013444 369 GVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 369 g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~R 424 (443)
-++|..|..++||++.| ++.||.|++|||..|+.- -++.++++...+ +|+|.+..
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~-eV~IhyNK 88 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLN-EVKIHYNK 88 (429)
T ss_pred eEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhcc-ceEEEehh
Confidence 57888999999999986 999999999999999754 455666665443 57777643
No 80
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=90.47 E-value=0.45 Score=48.43 Aligned_cols=68 Identities=26% Similarity=0.432 Sum_probs=54.4
Q ss_pred EeEECCCCccccCCCC-CCCEEEEECCEeeCCHHHHHHHHhcCCCCeEEEEEEECCC-eEEEEEEEecCC
Q 013444 372 VPVVTPGSPAHLAGFL-PSDVVIKFDGKPVQSITEIIEIMGDRVGEPLKVVVQRAND-QLVTLTVIPEEA 439 (443)
Q Consensus 372 V~~v~~~spA~~aGl~-~GDiI~~vng~~V~s~~dl~~~l~~~~g~~v~l~v~R~~g-~~~~l~v~~~~~ 439 (443)
|-+|.++|||+.|||+ -+|-|+-+-+..-...+|+..+|..+.++.+++.|+.-+. ...++++++..+
T Consensus 113 vl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~pn~a 182 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTITPNSA 182 (462)
T ss_pred eeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceEEeecccc
Confidence 5679999999999999 5699999955556778899999998999999999987333 346777776643
No 81
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=89.76 E-value=0.46 Score=47.85 Aligned_cols=50 Identities=30% Similarity=0.462 Sum_probs=41.3
Q ss_pred CCCCCCCceEEeEECCCCccccC-CCCCCCEEEEECCEeeCCHHHHHHHHh
Q 013444 362 SFPNVKSGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSITEIIEIMG 411 (443)
Q Consensus 362 ~~~~~~~g~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~~dl~~~l~ 411 (443)
.|+-...|+.|.+|...||+..- ||.+||+|+++||-+|++.+|-.+.++
T Consensus 214 Pfya~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~ 264 (484)
T KOG2921|consen 214 PFYAHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLA 264 (484)
T ss_pred hhhhcCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHH
Confidence 34455789999999999998653 999999999999999998877766554
No 82
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=89.29 E-value=0.55 Score=52.09 Aligned_cols=59 Identities=24% Similarity=0.414 Sum_probs=46.2
Q ss_pred CCCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCHHH--HHHHHhcCCCCeEEEEEEE
Q 013444 365 NVKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSITE--IIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 365 ~~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~~d--l~~~l~~~~g~~v~l~v~R 424 (443)
..+-|++|.+|.+|++|+..| |+.||.+++|||+..-.+.+ ..+ +....|..|.+.|.+
T Consensus 957 q~klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~-lmtrtg~vV~leVaK 1018 (1629)
T KOG1892|consen 957 QRKLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAAR-LMTRTGNVVHLEVAK 1018 (1629)
T ss_pred ccccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHH-HHhccCCeEEEehhh
Confidence 345699999999999999876 99999999999998865543 333 333567788888865
No 83
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=88.84 E-value=0.48 Score=47.49 Aligned_cols=58 Identities=19% Similarity=0.413 Sum_probs=44.3
Q ss_pred CCCCceEEeEECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEEE
Q 013444 365 NVKSGVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVVQ 423 (443)
Q Consensus 365 ~~~~g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v~ 423 (443)
+.+..++|++|.++-.|++.+ |..||.|++|||....+. ++...+|+ +.|+.|.++|+
T Consensus 107 eNkMPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLK-raGkeV~levK 167 (506)
T KOG3551|consen 107 ENKMPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALK-RAGKEVLLEVK 167 (506)
T ss_pred ccCCceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHH-hhCceeeeeee
Confidence 446799999999999999985 999999999999988643 44455555 45676665553
No 84
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=85.85 E-value=1.7 Score=45.64 Aligned_cols=55 Identities=20% Similarity=0.349 Sum_probs=46.1
Q ss_pred ceEEeEECCCCccccCC-CCCCCEEEEECCEeeCC--HHHHHHHHhcCCCCeEEEEEEE
Q 013444 369 GVLVPVVTPGSPAHLAG-FLPSDVVIKFDGKPVQS--ITEIIEIMGDRVGEPLKVVVQR 424 (443)
Q Consensus 369 g~~V~~v~~~spA~~aG-l~~GDiI~~vng~~V~s--~~dl~~~l~~~~g~~v~l~v~R 424 (443)
-++|..|..|+-+.+.| |..||.|.+|||..|.+ ..+++++|.+..| .+++.+.-
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~G-~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSRG-SITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCCC-cEEEEEcc
Confidence 68899999999998887 89999999999999964 6889998888665 57777754
No 85
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=85.63 E-value=1.3 Score=50.35 Aligned_cols=52 Identities=31% Similarity=0.544 Sum_probs=39.2
Q ss_pred eEEeEECCCCccccCCCCCCCEEEEECCEeeCCH--HHHHHHHhcCCCCeEEEEE
Q 013444 370 VLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQSI--TEIIEIMGDRVGEPLKVVV 422 (443)
Q Consensus 370 ~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~s~--~dl~~~l~~~~g~~v~l~v 422 (443)
=+|..|.++|||..+|++.||.|+.+||++|... .++.++|.+ .|.++.+.+
T Consensus 660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~-~gn~v~~~t 713 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK-SGNKVTLRT 713 (1205)
T ss_pred eeeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh-cCCeeEEEe
Confidence 4577899999999999999999999999999754 455565543 344444433
No 86
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=79.03 E-value=4.2 Score=35.02 Aligned_cols=39 Identities=28% Similarity=0.583 Sum_probs=25.4
Q ss_pred CCCCCccceeeecCCCEEEEEEEEeecC---CCeEEEEeHHHH
Q 013444 284 INAGNSGGPLVNIDGEIVGINIMKVAAA---DGLSFAVPIDSA 323 (443)
Q Consensus 284 i~~G~SGGPlvd~~G~VVGI~s~~~~~~---~g~~~aIPi~~i 323 (443)
...|.||||++-.+|.+|||-....-.. ..+-| +|++.+
T Consensus 105 ~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f-~P~e~l 146 (148)
T PF02907_consen 105 DLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDF-IPVETL 146 (148)
T ss_dssp HHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEE-EEHHHH
T ss_pred EEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEE-Eeeeec
Confidence 3579999999999999999976644321 12334 587654
No 87
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=73.05 E-value=11 Score=31.24 Aligned_cols=53 Identities=28% Similarity=0.500 Sum_probs=32.8
Q ss_pred EEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCCCCCCccccCC
Q 013444 159 GAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSKTPLPAAKLGT 230 (443)
Q Consensus 159 GfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~~~~~~~~l~~ 230 (443)
++-|. +|.++|+.|+.+... .| +|..+ +++. ..-|+++++.+.. .++.+++++
T Consensus 3 avHIG-nG~~vt~tHva~~~~--------~v-----~g~~f--~~~~--~~ge~~~v~~~~~-~~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIG-NGRYVTVTHVAKSSD--------SV-----DGQPF--KIVK--TDGELCWVQSPLV-HLPAAQIGT 55 (105)
T ss_pred eEEeC-CCEEEEEEEEeccCc--------eE-----cCcCc--EEEE--eccCEEEEECCCC-CCCeeEecc
Confidence 55665 689999999998742 21 12222 1222 3448999999854 356666654
No 88
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=71.35 E-value=41 Score=37.80 Aligned_cols=52 Identities=15% Similarity=0.257 Sum_probs=33.2
Q ss_pred EEEEEEeCCCEEEeccccccCCCCCCCCCCceEEEEeCC--CcEEEEEEEeecCCCCEEEEEEcC
Q 013444 157 GSGAIVDADGTILTCAHVVVDFHGSRALPKGKVDVTLQD--GRTFEGTVLNADFHSDIAIVKINS 219 (443)
Q Consensus 157 GSGfiI~~~G~ILTaaHvv~~~~~~~~~~~~~i~V~~~d--g~~~~a~vv~~d~~~DlAlLkl~~ 219 (443)
|...+|++. ||+|.+|...+. -.|.|.+ ...|...--.-++..|+.+-|++.
T Consensus 67 G~aTLigpq-YiVSV~HN~~gy----------~~v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK 120 (769)
T PF02395_consen 67 GVATLIGPQ-YIVSVKHNGKGY----------NSVSFGNEGQNTYKIVDRNNYPSGDFHMPRLNK 120 (769)
T ss_dssp SS-EEEETT-EEEBETTG-TSC----------CEECESCSSTCEEEEEEEEBETTSTEBEEEESS
T ss_pred ceEEEecCC-eEEEEEccCCCc----------CceeecccCCceEEEEEccCCCCcccceeecCc
Confidence 778999987 999999998442 2355654 344532222223447999999985
No 89
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=71.10 E-value=3.6 Score=44.13 Aligned_cols=50 Identities=16% Similarity=0.341 Sum_probs=34.2
Q ss_pred ECCCCccccCC-CCCCCEEEEECCEeeCCH--HHHHHHHhc-CCCCeEEEEEEE
Q 013444 375 VTPGSPAHLAG-FLPSDVVIKFDGKPVQSI--TEIIEIMGD-RVGEPLKVVVQR 424 (443)
Q Consensus 375 v~~~spA~~aG-l~~GDiI~~vng~~V~s~--~dl~~~l~~-~~g~~v~l~v~R 424 (443)
...++||++.| |-.||+|++|||...... ..-+.+++. +....|+++|.+
T Consensus 680 mm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~ 733 (829)
T KOG3605|consen 680 MMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS 733 (829)
T ss_pred cccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence 45689999986 999999999999876432 333455555 333356666655
No 90
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=66.38 E-value=6.8 Score=37.65 Aligned_cols=55 Identities=11% Similarity=0.250 Sum_probs=42.9
Q ss_pred eEEeEECCCCccccC-CCCCCCEEEEECCEeeCCHHH--HHHHHhc-CCCCeEEEEEEE
Q 013444 370 VLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSITE--IIEIMGD-RVGEPLKVVVQR 424 (443)
Q Consensus 370 ~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~~d--l~~~l~~-~~g~~v~l~v~R 424 (443)
..|..+.++|.-.+. -++.||.|-+|||+.+-.+.. +.++|+. ..|++.++.+..
T Consensus 151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie 209 (334)
T KOG3938|consen 151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE 209 (334)
T ss_pred eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence 557778888887765 489999999999999987754 5577877 678888877653
No 91
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=65.13 E-value=26 Score=29.94 Aligned_cols=33 Identities=21% Similarity=0.296 Sum_probs=25.2
Q ss_pred ceEEEEcccCCCCCccceeeecCCCEEEEEEEEe
Q 013444 275 REYLQTDCAINAGNSGGPLVNIDGEIVGINIMKV 308 (443)
Q Consensus 275 ~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~ 308 (443)
..++....+..||+-||+|+ .+--||||++.+-
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg 110 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGG 110 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEE
T ss_pred cCceeecccCCCCCCCceeE-eCCCeEEEEEeCC
Confidence 34566678899999999999 6778999998763
No 92
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=62.38 E-value=5.4 Score=40.74 Aligned_cols=24 Identities=33% Similarity=0.650 Sum_probs=21.1
Q ss_pred ccCCCCCccceeeecCCCEEEEEE
Q 013444 282 CAINAGNSGGPLVNIDGEIVGINI 305 (443)
Q Consensus 282 ~~i~~G~SGGPlvd~~G~VVGI~s 305 (443)
..+..|.||+.|+|.+|++|||.+
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 356789999999999999999975
No 93
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=59.39 E-value=26 Score=25.51 Aligned_cols=33 Identities=27% Similarity=0.469 Sum_probs=29.1
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.||+.+.+.+..+|...++.|-...
T Consensus 6 g~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~ 38 (63)
T cd00600 6 GKTVRVELKDGRVLEGVLVAFDKYMNLVLDDVE 38 (63)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence 468999999999999999999999998886664
No 94
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=54.67 E-value=30 Score=26.36 Aligned_cols=33 Identities=27% Similarity=0.491 Sum_probs=29.6
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.|.+++..+|...++.|-...
T Consensus 14 ~k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 14 NSPVLVRLKGGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred CCEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence 467999999999999999999999999887764
No 95
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=54.48 E-value=31 Score=25.86 Aligned_cols=33 Identities=21% Similarity=0.359 Sum_probs=29.8
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.+.+++..+|...+|.|-...
T Consensus 10 ~~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (68)
T cd01731 10 NKPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAE 42 (68)
T ss_pred CCEEEEEECCCCEEEEEEEEECCcceEEEeeEE
Confidence 468999999999999999999999999987765
No 96
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.44 E-value=29 Score=26.01 Aligned_cols=33 Identities=24% Similarity=0.278 Sum_probs=29.3
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.|.+++..+|...+|.|-...
T Consensus 10 ~~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 10 GRPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CCeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 468999999999999999999999999886664
No 97
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=53.95 E-value=19 Score=32.92 Aligned_cols=29 Identities=31% Similarity=0.245 Sum_probs=26.5
Q ss_pred CCceEEeEECCCCccccCCCCCCCEEEEE
Q 013444 367 KSGVLVPVVTPGSPAHLAGFLPSDVVIKF 395 (443)
Q Consensus 367 ~~g~~V~~v~~~spA~~aGl~~GDiI~~v 395 (443)
.+.+.|.+|..+|||+++|+.-|+.|+++
T Consensus 121 ~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v 149 (183)
T PF11874_consen 121 GGKVIVDEVEFGSPAEKAGIDFDWEITEV 149 (183)
T ss_pred CCEEEEEecCCCCHHHHcCCCCCcEEEEE
Confidence 45788999999999999999999999887
No 98
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=53.80 E-value=32 Score=26.62 Aligned_cols=33 Identities=24% Similarity=0.404 Sum_probs=29.1
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.||+.+.+++..+|.+.+|.|=...
T Consensus 10 ~~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 10 GRTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CCeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 468999999999999999999999999876554
No 99
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=53.41 E-value=29 Score=26.14 Aligned_cols=33 Identities=21% Similarity=0.330 Sum_probs=29.2
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.+.+++..+|...+|.+=.+.
T Consensus 11 g~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 11 GKPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 468999999999999999999999999886654
No 100
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=53.40 E-value=29 Score=26.89 Aligned_cols=32 Identities=16% Similarity=0.365 Sum_probs=28.4
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEE
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI 217 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl 217 (443)
+..+.|.+.+|+.+.+++.++|...++.|=..
T Consensus 13 ~~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 13 GSRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CCEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 46899999999999999999999999987554
No 101
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.12 E-value=27 Score=27.34 Aligned_cols=32 Identities=22% Similarity=0.376 Sum_probs=28.2
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEE
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI 217 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl 217 (443)
...+.|.+.+|+.+.+++.++|.+.+|.|=..
T Consensus 11 ~k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 11 DERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CCEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 35899999999999999999999999887544
No 102
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.97 E-value=32 Score=26.74 Aligned_cols=33 Identities=36% Similarity=0.566 Sum_probs=29.1
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.||+.+.+.+.++|.+.+|.|=...
T Consensus 10 ~~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 10 NYRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CCEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 468999999999999999999999999876554
No 103
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=49.63 E-value=39 Score=26.50 Aligned_cols=33 Identities=21% Similarity=0.301 Sum_probs=28.8
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.+.+++..+|...+|.|=...
T Consensus 12 ~k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~~ 44 (81)
T cd01729 12 DKKIRVKFQGGREVTGILKGYDQLLNLVLDDTV 44 (81)
T ss_pred CCeEEEEECCCcEEEEEEEEEcCcccEEecCEE
Confidence 468999999999999999999999999875543
No 104
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.34 E-value=44 Score=25.54 Aligned_cols=33 Identities=15% Similarity=0.231 Sum_probs=28.7
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
+..+.|.+.+|+.+.+++.++|...+|.|=...
T Consensus 10 ~k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (72)
T cd01719 10 DKKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV 42 (72)
T ss_pred CCeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence 468999999999999999999999998885553
No 105
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.83 E-value=49 Score=25.04 Aligned_cols=33 Identities=18% Similarity=0.393 Sum_probs=29.8
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|..|.+++..+|...++.+-...
T Consensus 10 g~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 10 GHIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CCEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 468999999999999999999999999988774
No 106
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.05 E-value=46 Score=25.64 Aligned_cols=32 Identities=28% Similarity=0.415 Sum_probs=28.5
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEE
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKI 217 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl 217 (443)
...+.|.+.+|+.+.+.+..+|++.++.|=..
T Consensus 12 ~k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 12 DKKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CCEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 46899999999999999999999999888554
No 107
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=46.79 E-value=44 Score=26.66 Aligned_cols=33 Identities=15% Similarity=0.324 Sum_probs=29.0
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+++.+.+++.++|.+.+|.|=...
T Consensus 14 ~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 14 NTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred CCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 358999999999999999999999999876554
No 108
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=46.37 E-value=49 Score=24.37 Aligned_cols=33 Identities=24% Similarity=0.462 Sum_probs=28.8
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.||+.+.+++..+|...++-|=...
T Consensus 8 ~~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~ 40 (67)
T smart00651 8 GKRVLVELKNGREYRGTLKGFDQFMNLVLEDVE 40 (67)
T ss_pred CcEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence 458999999999999999999999998876554
No 109
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=45.92 E-value=72 Score=23.70 Aligned_cols=34 Identities=24% Similarity=0.310 Sum_probs=29.3
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcC
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS 219 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~ 219 (443)
...+.+..-.|..++++++.+|....+.+|+.+.
T Consensus 6 Gs~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 6 GSQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred ccEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 3567777888999999999999999999998764
No 110
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq adopts an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=45.19 E-value=41 Score=24.84 Aligned_cols=34 Identities=26% Similarity=0.560 Sum_probs=30.2
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcC
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINS 219 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~ 219 (443)
...+.|.+.+|+.+.+.+..+|...++.|-....
T Consensus 8 g~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 8 GKRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp TSEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred CcEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 4689999999999999999999999998877753
No 111
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.56 E-value=59 Score=24.87 Aligned_cols=33 Identities=24% Similarity=0.330 Sum_probs=29.1
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
+..+.|.+.+++.+.+++.++|...++.|=...
T Consensus 9 ~~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 9 NKTVSVITVDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CCEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence 468999999999999999999999998887654
No 112
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=39.09 E-value=2.7e+02 Score=28.96 Aligned_cols=134 Identities=16% Similarity=0.245 Sum_probs=61.3
Q ss_pred cEEEEEEEeCCCEEEeccccccCCCCCC-CCCCceEEEEeCCCcEEEEEEEeecCCCCEEEEEEcCC--CCCCccccCCC
Q 013444 155 GIGSGAIVDADGTILTCAHVVVDFHGSR-ALPKGKVDVTLQDGRTFEGTVLNADFHSDIAIVKINSK--TPLPAAKLGTS 231 (443)
Q Consensus 155 ~~GSGfiI~~~G~ILTaaHvv~~~~~~~-~~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~~~--~~~~~~~l~~s 231 (443)
+.|=||.|+++ .++|+-||+......- ..+..+| .++..-+++-+++..+ .+++-+-|.
T Consensus 379 GsGWGfWVS~~-lfITttHViP~g~~E~FGv~i~~i---------------~vh~sGeF~~~rFpk~iRPDvtgmiLE-- 440 (535)
T PF05416_consen 379 GSGWGFWVSPT-LFITTTHVIPPGAKEAFGVPISQI---------------QVHKSGEFCRFRFPKPIRPDVTGMILE-- 440 (535)
T ss_dssp TTEEEEESSSS-EEEEEGGGS-STTSEETTEECGGE---------------EEEEETTEEEEEESS-SSTTS---EE---
T ss_pred CCceeeeecce-EEEEeeeecCCcchhhhCCChhHe---------------EEeeccceEEEecCCCCCCCccceeec--
Confidence 55779999998 9999999998632100 0001122 2233346677777653 334444442
Q ss_pred CCCCCCCEEEE-EecCCCC--CCceEEEEEEeeecCccCCCCCCccceEEE-------EcccCCCCCccceeeecCCC--
Q 013444 232 SKLCPGDWVVA-MGCPHSL--QNTVTAGIVSCVDRKSSDLGLGGMRREYLQ-------TDCAINAGNSGGPLVNIDGE-- 299 (443)
Q Consensus 232 ~~~~~G~~V~~-iG~p~~~--~~~~t~G~Vs~~~~~~~~~~~~~~~~~~i~-------~d~~i~~G~SGGPlvd~~G~-- 299 (443)
+..+.|.-+.+ +-.+.+. ...+.-|........-.- .++ ...++. .|-...||+-|.|-|-..|+
T Consensus 441 eGapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~--v~G-Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~ 517 (535)
T PF05416_consen 441 EGAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRT--VHG-QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDW 517 (535)
T ss_dssp SS--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEE--EEE-EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEE
T ss_pred cCCCCceEEEEEEEcCCccchhhhhhhccceeEEEccee--ecc-eeeeeeecCCccccccCCCCCCCCCceeeecCCcE
Confidence 23344554433 2233331 123344443332211000 000 011222 23346789999999976665
Q ss_pred -EEEEEEEEee
Q 013444 300 -IVGINIMKVA 309 (443)
Q Consensus 300 -VVGI~s~~~~ 309 (443)
|+|++.....
T Consensus 518 VV~GVH~AAtr 528 (535)
T PF05416_consen 518 VVIGVHAAATR 528 (535)
T ss_dssp EEEEEEEEE-S
T ss_pred EEEEEEehhcc
Confidence 8899987543
No 113
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=36.90 E-value=34 Score=23.82 Aligned_cols=19 Identities=47% Similarity=0.660 Sum_probs=16.2
Q ss_pred CCccceeeecCCCEEEEEE
Q 013444 287 GNSGGPLVNIDGEIVGINI 305 (443)
Q Consensus 287 G~SGGPlvd~~G~VVGI~s 305 (443)
+.+.-|++|.+|+++|+.+
T Consensus 29 ~~~~~~V~d~~~~~~G~is 47 (57)
T PF00571_consen 29 GISRLPVVDEDGKLVGIIS 47 (57)
T ss_dssp TSSEEEEESTTSBEEEEEE
T ss_pred CCcEEEEEecCCEEEEEEE
Confidence 5567899999999999975
No 114
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.06 E-value=91 Score=23.97 Aligned_cols=33 Identities=24% Similarity=0.448 Sum_probs=29.6
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus 11 g~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 11 NHPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 468999999999999999999999999987663
No 115
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=35.67 E-value=42 Score=31.44 Aligned_cols=56 Identities=25% Similarity=0.486 Sum_probs=39.1
Q ss_pred ceEEEEcccCCCCCccceeeecC----CCEEEEEEEEeecCCCeEEEEeH--HHHHHHHHHHH
Q 013444 275 REYLQTDCAINAGNSGGPLVNID----GEIVGINIMKVAAADGLSFAVPI--DSAAKIIEQFK 331 (443)
Q Consensus 275 ~~~i~~d~~i~~G~SGGPlvd~~----G~VVGI~s~~~~~~~g~~~aIPi--~~i~~~l~~l~ 331 (443)
+..++...+...|+=|||++-.+ -+++||+..+... .+.+||-++ +.+++.++.|+
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~-~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN-HAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc-ccceehhhhhHHHHHHHHHhhc
Confidence 34567788889999999998332 6899999987643 356777544 35666666554
No 116
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=35.65 E-value=76 Score=24.49 Aligned_cols=33 Identities=24% Similarity=0.491 Sum_probs=29.0
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|+.+.+++..+|...++.|--..
T Consensus 17 ~~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 49 (79)
T COG1958 17 NKRVLVKLKNGREYRGTLVGFDQYMNLVLDDVE 49 (79)
T ss_pred CCEEEEEECCCCEEEEEEEEEccceeEEEeceE
Confidence 368999999999999999999999998887554
No 117
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=32.46 E-value=59 Score=35.17 Aligned_cols=37 Identities=19% Similarity=0.248 Sum_probs=32.2
Q ss_pred CceEEeEECCCCccccC-CCCCCCEEEEECCEeeCCHH
Q 013444 368 SGVLVPVVTPGSPAHLA-GFLPSDVVIKFDGKPVQSIT 404 (443)
Q Consensus 368 ~g~~V~~v~~~spA~~a-Gl~~GDiI~~vng~~V~s~~ 404 (443)
+-.+|.++.++|||... .|..||.|+.||++.|..|+
T Consensus 225 g~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 225 GPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred CceeccccccCChHHHhhcccCccceeeecccccccch
Confidence 45667889999999876 59999999999999998885
No 118
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=32.09 E-value=58 Score=24.79 Aligned_cols=32 Identities=28% Similarity=0.574 Sum_probs=24.6
Q ss_pred cceeeecCCCEEEEEEEEeecCCCeEEEEeHHHHHHHHHHHH
Q 013444 290 GGPLVNIDGEIVGINIMKVAAADGLSFAVPIDSAAKIIEQFK 331 (443)
Q Consensus 290 GGPlvd~~G~VVGI~s~~~~~~~g~~~aIPi~~i~~~l~~l~ 331 (443)
.-|+.+.+|+++|++.. .+.++.+.++++++.
T Consensus 18 s~pi~~~~g~~~Gvv~~----------di~l~~l~~~i~~~~ 49 (81)
T PF02743_consen 18 SVPIYDDDGKIIGVVGI----------DISLDQLSEIISNIK 49 (81)
T ss_dssp EEEEEETTTEEEEEEEE----------EEEHHHHHHHHTTSB
T ss_pred EEEEECCCCCEEEEEEE----------EeccceeeeEEEeeE
Confidence 35888889999999754 478888888776653
No 119
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=31.21 E-value=45 Score=35.34 Aligned_cols=32 Identities=31% Similarity=0.425 Sum_probs=25.1
Q ss_pred ceEEeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444 369 GVLVPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 401 (443)
Q Consensus 369 g~~V~~v~~~spA~~aGl~~GDiI~~vng~~V~ 401 (443)
-+.|.-..+|.|.-.| .+|||||++.||+.|+
T Consensus 299 v~~vl~~~ENm~~g~A-~rPGDVits~~GkTVE 330 (485)
T COG0260 299 VVGVLPAVENMPSGNA-YRPGDVITSMNGKTVE 330 (485)
T ss_pred EEEEEeeeccCCCCCC-CCCCCeEEecCCcEEE
Confidence 3344456678888877 9999999999999874
No 120
>PF14438 SM-ATX: Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=30.92 E-value=1.3e+02 Score=22.98 Aligned_cols=29 Identities=28% Similarity=0.461 Sum_probs=21.5
Q ss_pred CceEEEEeCCCcEEEEEEEeecC---CCCEEE
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADF---HSDIAI 214 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~---~~DlAl 214 (443)
...++|++.||..|++.+..+++ +.+++|
T Consensus 12 G~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL 43 (77)
T PF14438_consen 12 GQTVEVTTKNGSVYEGIFHSASPESNEFDVVL 43 (77)
T ss_dssp TSEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred CCEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence 46899999999999999999988 556655
No 121
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.71 E-value=1.2e+02 Score=23.64 Aligned_cols=33 Identities=24% Similarity=0.368 Sum_probs=29.6
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|..+.+++..+|...++-+-.+.
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~ 43 (81)
T cd01725 11 GKEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS 43 (81)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence 468999999999999999999999999887764
No 122
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=30.13 E-value=1.4e+02 Score=23.22 Aligned_cols=33 Identities=24% Similarity=0.380 Sum_probs=29.3
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|..|.+++..+|...++-+-...
T Consensus 19 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~ 51 (78)
T cd01733 19 GKVVTVELRNETTVTGRIASVDAFMNIRLAKVT 51 (78)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence 468999999999999999999999998887664
No 123
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=29.66 E-value=57 Score=27.04 Aligned_cols=18 Identities=28% Similarity=0.695 Sum_probs=13.6
Q ss_pred ceeeecCCCEEEEEEEEe
Q 013444 291 GPLVNIDGEIVGINIMKV 308 (443)
Q Consensus 291 GPlvd~~G~VVGI~s~~~ 308 (443)
.|++|.+|++||++..+.
T Consensus 94 ~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEE-TTS-EEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEE
Confidence 589999999999998754
No 124
>cd05701 S1_Rrp5_repeat_hs10 S1_Rrp5_repeat_hs10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 10 (hs10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=29.26 E-value=38 Score=25.39 Aligned_cols=33 Identities=30% Similarity=0.282 Sum_probs=17.5
Q ss_pred CCEEEEEEcCCCCCCccc---------cCCCCCCCCCCEEEE
Q 013444 210 SDIAIVKINSKTPLPAAK---------LGTSSKLCPGDWVVA 242 (443)
Q Consensus 210 ~DlAlLkl~~~~~~~~~~---------l~~s~~~~~G~~V~~ 242 (443)
.|+||+.+.....+..++ ..+++++++|+.+.+
T Consensus 13 kdfAvvSL~~t~~L~a~p~~sHLNdtfrf~seklkvG~~l~v 54 (69)
T cd05701 13 KDFAIVSLATTGDLAAFPTRSHLNDTFRFDSEKLSVGQCLDV 54 (69)
T ss_pred hceEEEEeeccccEEEEEchhhccccccccceeeeccceEEE
Confidence 467777776543332222 224566677776554
No 125
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=29.18 E-value=1.3e+02 Score=24.10 Aligned_cols=33 Identities=18% Similarity=0.383 Sum_probs=29.7
Q ss_pred CceEEEEeCCCcEEEEEEEeecCCCCEEEEEEc
Q 013444 186 KGKVDVTLQDGRTFEGTVLNADFHSDIAIVKIN 218 (443)
Q Consensus 186 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkl~ 218 (443)
...+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~a~ 43 (90)
T cd01724 11 NETVTIELKNGTIVHGTITGVDPSMNTHLKNVK 43 (90)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCceeEEEEEEE
Confidence 468999999999999999999999999887764
No 126
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=25.95 E-value=1.7e+02 Score=25.97 Aligned_cols=73 Identities=16% Similarity=0.161 Sum_probs=36.4
Q ss_pred CCCCCCEEEEEecCCCCCCceEEEEEEeeecCccCCCCC-CccceEEEEcccCCCCCccceeeecC-CCEEEEEEEE
Q 013444 233 KLCPGDWVVAMGCPHSLQNTVTAGIVSCVDRKSSDLGLG-GMRREYLQTDCAINAGNSGGPLVNID-GEIVGINIMK 307 (443)
Q Consensus 233 ~~~~G~~V~~iG~p~~~~~~~t~G~Vs~~~~~~~~~~~~-~~~~~~i~~d~~i~~G~SGGPlvd~~-G~VVGI~s~~ 307 (443)
....|..+|++ +|.....+.+.|.+-.+...-.++..- ..... -..|..-..|.||=|+|... |++||-.-.+
T Consensus 108 gcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtp-af~~~knlkg~s~~pifeassgr~vgr~k~g 182 (211)
T PF05578_consen 108 GCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTP-AFFDLKNLKGWSGLPIFEASSGRVVGRVKVG 182 (211)
T ss_pred CCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCc-ceeeccccCCCCCCceeeccCCcEEEEEEec
Confidence 44567777777 555444344444333222211111000 00001 11233445689999999654 9999976543
No 127
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=23.87 E-value=2.9e+02 Score=20.08 Aligned_cols=35 Identities=29% Similarity=0.237 Sum_probs=27.6
Q ss_pred CceEEEEeCCCcEE-EEEEEeecCCCCEEEEEEcCC
Q 013444 186 KGKVDVTLQDGRTF-EGTVLNADFHSDIAIVKINSK 220 (443)
Q Consensus 186 ~~~i~V~~~dg~~~-~a~vv~~d~~~DlAlLkl~~~ 220 (443)
...+.+.+++...| ++++..+|...++.-++.++.
T Consensus 9 Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG 44 (55)
T PF09465_consen 9 GEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG 44 (55)
T ss_dssp S-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred CCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence 46889999987655 999999999999999998754
No 128
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase []. ; PDB: 1OKG_A.
Probab=22.19 E-value=2.3e+02 Score=21.19 Aligned_cols=44 Identities=18% Similarity=0.376 Sum_probs=27.0
Q ss_pred CCEEEEECCEeeCCHH-HHHHHHhc-CCCCeEEEEEEECCCeEEEEEE
Q 013444 389 SDVVIKFDGKPVQSIT-EIIEIMGD-RVGEPLKVVVQRANDQLVTLTV 434 (443)
Q Consensus 389 GDiI~~vng~~V~s~~-dl~~~l~~-~~g~~v~l~v~R~~g~~~~l~v 434 (443)
.-.-+.+||..|++.+ ++..++.. +.|++-++.++. ++...+++
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~FkS--~Rv~~iEv 64 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFKS--QRVAVIEV 64 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEETT--S-EEEEE-
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEec--CcEEEEEc
Confidence 4567889999999875 57777765 899987777653 45555555
No 129
>PRK05015 aminopeptidase B; Provisional
Probab=22.16 E-value=90 Score=32.43 Aligned_cols=29 Identities=21% Similarity=0.208 Sum_probs=23.6
Q ss_pred EeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444 372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 401 (443)
Q Consensus 372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~ 401 (443)
|.-+.+|.+...+ .++||||..-||+.|.
T Consensus 240 il~~aENmisg~A-~kpgDVIt~~nGkTVE 268 (424)
T PRK05015 240 FLCCAENLISGNA-FKLGDIITYRNGKTVE 268 (424)
T ss_pred EEEecccCCCCCC-CCCCCEEEecCCcEEe
Confidence 3445677777777 9999999999999874
No 130
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=21.05 E-value=91 Score=33.11 Aligned_cols=29 Identities=28% Similarity=0.502 Sum_probs=23.9
Q ss_pred EeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444 372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 401 (443)
Q Consensus 372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~ 401 (443)
|.-..+|.|...+ .++||+|..-||+.|.
T Consensus 303 v~~l~ENm~~~~A-~rPgDVi~~~~GkTVE 331 (483)
T PRK00913 303 VVAACENMPSGNA-YRPGDVLTSMSGKTIE 331 (483)
T ss_pred EEEeeccCCCCCC-CCCCCEEEECCCcEEE
Confidence 3345678888887 9999999999999874
No 131
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=20.87 E-value=89 Score=33.05 Aligned_cols=29 Identities=28% Similarity=0.353 Sum_probs=23.7
Q ss_pred EeEECCCCccccCCCCCCCEEEEECCEeeC
Q 013444 372 VPVVTPGSPAHLAGFLPSDVVIKFDGKPVQ 401 (443)
Q Consensus 372 V~~v~~~spA~~aGl~~GDiI~~vng~~V~ 401 (443)
+.-..+|.+...+ .+|||||..-||+.|+
T Consensus 289 i~~~~EN~is~~A-~rPgDVi~s~~GkTVE 317 (468)
T cd00433 289 VLPLAENMISGNA-YRPGDVITSRSGKTVE 317 (468)
T ss_pred EEEeeecCCCCCC-CCCCCEeEeCCCcEEE
Confidence 3345678888887 9999999999999874
No 132
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=20.61 E-value=72 Score=25.94 Aligned_cols=20 Identities=25% Similarity=0.386 Sum_probs=16.0
Q ss_pred CCccceeeecCCCEEEEEEE
Q 013444 287 GNSGGPLVNIDGEIVGINIM 306 (443)
Q Consensus 287 G~SGGPlvd~~G~VVGI~s~ 306 (443)
+.+.=|++|.+|+++|+++.
T Consensus 98 ~~~~lpVvd~~~~~vGiit~ 117 (123)
T cd04627 98 GISSVAVVDNQGNLIGNISV 117 (123)
T ss_pred CCceEEEECCCCcEEEEEeH
Confidence 34456999989999999875
Done!