Query 013804
Match_columns 436
No_of_seqs 445 out of 3214
Neff 7.9
Searched_HMMs 46136
Date Fri Mar 29 07:33:56 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013804.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013804hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3E-52 6.5E-57 429.3 39.4 301 117-432 41-361 (455)
2 TIGR02038 protease_degS peripl 100.0 5.4E-51 1.2E-55 408.4 39.4 303 112-433 41-350 (351)
3 PRK10898 serine endoprotease; 100.0 6.1E-51 1.3E-55 407.7 39.4 303 112-433 41-351 (353)
4 PRK10942 serine endoprotease; 100.0 9.8E-50 2.1E-54 412.4 37.3 300 117-431 39-381 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 1.6E-48 3.5E-53 401.4 35.9 301 118-433 3-329 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 6.3E-39 1.4E-43 321.9 31.5 302 116-431 33-340 (347)
7 KOG1320 Serine protease [Postt 100.0 2.9E-27 6.2E-32 238.3 22.0 305 117-432 129-469 (473)
8 KOG1421 Predicted signaling-as 99.9 1.1E-22 2.5E-27 206.9 22.5 295 117-433 53-373 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 4.9E-16 1.1E-20 130.9 12.6 109 154-293 1-120 (120)
10 PF13180 PDZ_2: PDZ domain; PD 99.6 7.5E-15 1.6E-19 116.3 10.7 82 332-429 1-82 (82)
11 KOG1421 Predicted signaling-as 99.6 6.1E-14 1.3E-18 143.8 16.0 296 123-432 525-832 (955)
12 PF00089 Trypsin: Trypsin; In 99.5 5.3E-13 1.1E-17 123.9 19.3 168 151-320 24-220 (220)
13 KOG1320 Serine protease [Postt 99.4 4.7E-13 1E-17 135.8 10.2 275 123-420 57-351 (473)
14 cd00190 Tryp_SPc Trypsin-like 99.4 5.1E-12 1.1E-16 118.1 15.9 170 151-322 24-231 (232)
15 smart00020 Tryp_SPc Trypsin-li 99.3 2.7E-11 5.9E-16 113.4 15.8 167 151-319 25-228 (229)
16 cd00991 PDZ_archaeal_metallopr 99.3 2.4E-11 5.1E-16 95.5 9.9 68 350-428 10-77 (79)
17 cd00986 PDZ_LON_protease PDZ d 99.3 4.4E-11 9.5E-16 93.9 10.4 72 350-433 8-79 (79)
18 cd00987 PDZ_serine_protease PD 99.2 4.8E-11 1E-15 95.6 9.9 84 332-426 1-89 (90)
19 TIGR01713 typeII_sec_gspC gene 99.2 3.1E-11 6.8E-16 115.8 9.6 101 314-429 159-259 (259)
20 cd00990 PDZ_glycyl_aminopeptid 99.2 5.9E-11 1.3E-15 93.1 9.4 78 332-430 1-78 (80)
21 cd00989 PDZ_metalloprotease PD 99.2 2.3E-10 5.1E-15 89.4 9.6 77 333-428 2-78 (79)
22 cd00988 PDZ_CTP_protease PDZ d 99.1 5.6E-10 1.2E-14 88.6 9.7 79 332-429 2-83 (85)
23 COG3591 V8-like Glu-specific e 99.0 3.6E-09 7.9E-14 99.7 13.4 159 152-324 64-250 (251)
24 cd00136 PDZ PDZ domain, also c 98.9 8.2E-09 1.8E-13 78.6 7.1 67 333-417 2-70 (70)
25 TIGR02037 degP_htrA_DO peripla 98.8 1.3E-08 2.8E-13 105.3 9.9 85 331-426 337-427 (428)
26 PRK10779 zinc metallopeptidase 98.8 1E-08 2.2E-13 106.5 7.7 68 352-430 128-195 (449)
27 TIGR00054 RIP metalloprotease 98.7 5.1E-08 1.1E-12 100.5 9.3 69 350-430 203-271 (420)
28 PRK10779 zinc metallopeptidase 98.7 9.9E-08 2.1E-12 99.2 10.0 68 351-430 222-289 (449)
29 TIGR00225 prc C-terminal pepti 98.6 1.1E-07 2.4E-12 95.1 8.8 80 332-430 51-132 (334)
30 TIGR02860 spore_IV_B stage IV 98.6 2.5E-07 5.4E-12 93.2 10.5 77 333-430 97-181 (402)
31 KOG3627 Trypsin [Amino acid tr 98.6 1.8E-06 4E-11 82.6 16.0 170 152-322 38-252 (256)
32 smart00228 PDZ Domain present 98.6 1.7E-07 3.8E-12 73.7 7.2 74 332-420 12-85 (85)
33 PLN00049 carboxyl-terminal pro 98.5 6.4E-07 1.4E-11 91.4 10.2 84 332-428 85-170 (389)
34 PRK10139 serine endoprotease; 98.5 4.5E-07 9.8E-12 94.2 9.1 65 350-427 390-454 (455)
35 PF00863 Peptidase_C4: Peptida 98.5 2.3E-05 5E-10 73.6 19.3 163 123-313 14-184 (235)
36 cd00992 PDZ_signaling PDZ doma 98.4 9E-07 1.9E-11 69.3 7.6 69 331-416 11-81 (82)
37 PF00595 PDZ: PDZ domain (Also 98.4 4.8E-07 1.1E-11 71.1 5.3 71 331-417 9-81 (81)
38 COG3480 SdrC Predicted secrete 98.4 1.5E-06 3.3E-11 83.4 9.0 71 350-432 130-201 (342)
39 PRK10942 serine endoprotease; 98.4 1.1E-06 2.4E-11 91.7 8.9 65 350-427 408-472 (473)
40 TIGR03279 cyano_FeS_chp putati 98.3 9.5E-07 2.1E-11 89.6 7.5 63 353-430 1-64 (433)
41 PF14685 Tricorn_PDZ: Tricorn 98.3 4.4E-06 9.6E-11 66.5 9.0 79 332-427 1-88 (88)
42 COG0793 Prc Periplasmic protea 98.2 3.4E-06 7.4E-11 86.3 8.8 80 330-427 98-181 (406)
43 TIGR00054 RIP metalloprotease 98.2 1.7E-06 3.6E-11 89.3 6.3 66 350-428 128-193 (420)
44 COG5640 Secreted trypsin-like 98.0 0.00024 5.2E-09 69.6 15.7 55 272-326 223-280 (413)
45 PRK09681 putative type II secr 98.0 1.6E-05 3.6E-10 76.3 7.7 57 362-429 219-275 (276)
46 COG3975 Predicted protease wit 98.0 1.2E-05 2.5E-10 82.3 5.9 64 350-432 462-525 (558)
47 KOG3129 26S proteasome regulat 97.9 2.6E-05 5.7E-10 70.6 7.2 74 351-435 140-215 (231)
48 PRK11186 carboxy-terminal prot 97.9 3.7E-05 8.1E-10 82.9 9.0 79 331-428 243-332 (667)
49 PF05579 Peptidase_S32: Equine 97.8 0.00026 5.6E-09 66.7 11.7 117 151-298 111-229 (297)
50 PF04495 GRASP55_65: GRASP55/6 97.8 5.9E-05 1.3E-09 65.5 6.7 87 332-430 26-114 (138)
51 COG3031 PulC Type II secretory 97.3 0.00056 1.2E-08 63.4 6.2 59 359-428 216-274 (275)
52 KOG3553 Tax interaction protei 96.9 0.0008 1.7E-08 53.9 2.9 34 350-394 59-92 (124)
53 PF00548 Peptidase_C3: 3C cyst 96.7 0.044 9.6E-07 49.5 13.1 137 151-297 24-170 (172)
54 PF12812 PDZ_1: PDZ-like domai 96.7 0.0045 9.7E-08 48.3 5.6 64 332-406 9-75 (78)
55 PF05580 Peptidase_S55: SpoIVB 96.7 0.031 6.6E-07 51.7 11.8 166 145-315 13-214 (218)
56 PF03761 DUF316: Domain of unk 96.6 0.09 2E-06 51.1 15.7 91 197-297 159-254 (282)
57 KOG3580 Tight junction protein 96.0 0.0069 1.5E-07 62.9 4.4 58 350-418 429-488 (1027)
58 PF10459 Peptidase_S46: Peptid 95.9 0.033 7.1E-07 60.8 9.0 22 152-173 47-68 (698)
59 PF08192 Peptidase_S64: Peptid 95.8 0.056 1.2E-06 57.4 10.1 117 197-322 541-687 (695)
60 KOG3580 Tight junction protein 95.5 0.026 5.6E-07 58.8 6.0 86 332-429 198-288 (1027)
61 KOG3532 Predicted protein kina 95.4 0.032 7E-07 58.9 6.4 58 332-406 386-443 (1051)
62 PF00949 Peptidase_S7: Peptida 95.3 0.025 5.5E-07 48.5 4.5 33 268-300 88-120 (132)
63 KOG3209 WW domain-containing p 95.1 0.03 6.6E-07 59.4 5.0 56 354-421 782-839 (984)
64 PF10459 Peptidase_S46: Peptid 95.0 0.018 3.8E-07 62.8 3.4 59 265-323 621-686 (698)
65 PF09342 DUF1986: Domain of un 94.7 0.26 5.7E-06 46.4 9.6 90 147-237 23-131 (267)
66 COG0750 Predicted membrane-ass 94.4 0.14 2.9E-06 52.0 8.0 56 356-423 135-194 (375)
67 KOG3552 FERM domain protein FR 94.4 0.061 1.3E-06 58.8 5.4 65 332-418 65-131 (1298)
68 TIGR02860 spore_IV_B stage IV 94.3 0.44 9.6E-06 48.6 11.1 41 271-315 354-394 (402)
69 KOG3606 Cell polarity protein 93.9 0.15 3.3E-06 48.3 6.5 59 349-419 193-253 (358)
70 KOG3209 WW domain-containing p 93.4 0.21 4.6E-06 53.3 7.0 58 352-419 373-432 (984)
71 PF02122 Peptidase_S39: Peptid 93.2 0.59 1.3E-05 43.3 8.9 134 164-314 43-182 (203)
72 KOG3542 cAMP-regulated guanine 91.9 0.12 2.6E-06 54.7 3.0 57 350-418 562-618 (1283)
73 KOG3834 Golgi reassembly stack 91.6 0.38 8.2E-06 48.7 5.9 69 350-429 15-85 (462)
74 KOG3651 Protein kinase C, alph 91.4 0.4 8.7E-06 46.3 5.6 56 351-418 31-88 (429)
75 KOG3549 Syntrophins (type gamm 91.4 0.34 7.4E-06 47.6 5.2 56 350-417 80-137 (505)
76 PF00944 Peptidase_S3: Alphavi 90.9 0.47 1E-05 40.4 4.9 29 271-299 100-128 (158)
77 KOG3550 Receptor targeting pro 90.8 0.8 1.7E-05 39.8 6.4 55 350-416 115-171 (207)
78 PF02395 Peptidase_S6: Immunog 90.6 0.98 2.1E-05 50.1 8.6 65 151-218 64-130 (769)
79 KOG3551 Syntrophins (type beta 90.2 0.42 9.2E-06 47.7 4.7 73 331-420 95-172 (506)
80 KOG3571 Dishevelled 3 and rela 89.6 0.63 1.4E-05 48.0 5.6 72 332-418 261-338 (626)
81 KOG3605 Beta amyloid precursor 88.7 0.64 1.4E-05 49.3 5.0 112 276-410 679-806 (829)
82 KOG0609 Calcium/calmodulin-dep 87.9 1 2.2E-05 47.0 5.8 68 333-418 135-204 (542)
83 KOG1892 Actin filament-binding 87.8 0.66 1.4E-05 51.3 4.6 60 350-421 960-1021(1629)
84 PF02907 Peptidase_S29: Hepati 86.7 0.44 9.5E-06 40.7 2.0 117 154-300 14-131 (148)
85 KOG2921 Intramembrane metallop 86.4 1.1 2.4E-05 44.9 5.0 45 350-405 220-265 (484)
86 PF00947 Pico_P2A: Picornaviru 81.9 7 0.00015 33.2 7.2 32 265-297 78-109 (127)
87 KOG0606 Microtubule-associated 81.6 2.2 4.9E-05 48.3 5.3 50 353-415 661-712 (1205)
88 PF03510 Peptidase_C24: 2C end 80.4 6.4 0.00014 32.4 6.3 53 155-219 2-54 (105)
89 KOG3834 Golgi reassembly stack 79.3 2.6 5.6E-05 42.9 4.4 65 354-429 113-179 (462)
90 PF01732 DUF31: Putative pepti 72.7 2.8 6.2E-05 42.6 2.8 24 272-295 350-373 (374)
91 KOG3605 Beta amyloid precursor 68.9 5.2 0.00011 42.8 3.7 58 351-418 674-733 (829)
92 PF05416 Peptidase_C37: Southa 68.1 19 0.0004 36.9 7.2 137 150-299 377-528 (535)
93 PF11874 DUF3394: Domain of un 55.8 42 0.00091 30.5 6.7 38 333-388 112-149 (183)
94 KOG3938 RGS-GAIP interacting p 54.3 14 0.00031 35.3 3.5 59 350-418 149-209 (334)
95 cd01735 LSm12_N LSm12 belongs 48.5 65 0.0014 23.8 5.5 34 175-208 6-39 (61)
96 PF00571 CBS: CBS domain CBS d 47.7 18 0.0004 25.2 2.6 21 276-296 28-48 (57)
97 cd00600 Sm_like The eukaryotic 45.2 61 0.0013 23.3 5.1 33 176-208 7-39 (63)
98 PRK14864 putative biofilm stre 44.4 40 0.00087 27.7 4.3 55 78-133 3-57 (104)
99 cd01720 Sm_D2 The eukaryotic S 43.5 44 0.00096 26.5 4.3 37 171-207 10-46 (87)
100 cd01731 archaeal_Sm1 The archa 39.3 72 0.0016 23.7 4.8 33 176-208 11-43 (68)
101 PRK00737 small nuclear ribonuc 39.1 75 0.0016 24.0 4.9 32 176-207 15-46 (72)
102 cd01722 Sm_F The eukaryotic Sm 38.0 67 0.0015 24.0 4.4 32 176-207 12-43 (68)
103 cd01726 LSm6 The eukaryotic Sm 37.7 76 0.0016 23.6 4.6 32 176-207 11-42 (67)
104 cd06168 LSm9 The eukaryotic Sm 36.5 87 0.0019 24.0 4.9 32 176-207 11-42 (75)
105 PF12381 Peptidase_C3G: Tungro 36.5 26 0.00055 32.7 2.2 55 265-323 168-228 (231)
106 COG4956 Integral membrane prot 36.4 29 0.00064 34.1 2.7 40 385-424 269-309 (356)
107 cd01730 LSm3 The eukaryotic Sm 35.0 69 0.0015 24.9 4.2 31 176-206 12-42 (82)
108 cd01717 Sm_B The eukaryotic Sm 35.0 86 0.0019 24.1 4.7 32 176-207 11-42 (79)
109 cd01729 LSm7 The eukaryotic Sm 34.6 88 0.0019 24.3 4.7 31 176-206 13-43 (81)
110 PF04225 OapA: Opacity-associa 33.3 21 0.00045 28.1 1.0 53 379-431 7-68 (85)
111 TIGR03000 plancto_dom_1 Planct 32.1 64 0.0014 24.9 3.4 47 382-428 11-62 (75)
112 cd01732 LSm5 The eukaryotic Sm 32.0 93 0.002 23.9 4.4 31 176-206 14-44 (76)
113 PF14827 Cache_3: Sensory doma 31.7 47 0.001 27.4 2.9 18 281-298 94-111 (116)
114 cd01719 Sm_G The eukaryotic Sm 31.2 1.2E+02 0.0026 23.0 4.9 32 176-207 11-42 (72)
115 cd01728 LSm1 The eukaryotic Sm 30.8 1.1E+02 0.0024 23.3 4.6 31 176-206 13-43 (74)
116 COG0298 HypC Hydrogenase matur 30.8 1.1E+02 0.0024 23.8 4.5 47 188-236 5-52 (82)
117 smart00651 Sm snRNP Sm protein 29.8 1.2E+02 0.0027 22.0 4.7 32 176-207 9-40 (67)
118 PF02743 Cache_1: Cache domain 28.9 53 0.0011 24.9 2.6 31 281-324 19-49 (81)
119 PF09122 DUF1930: Domain of un 28.6 1.4E+02 0.0031 22.1 4.5 45 382-427 19-64 (68)
120 PF09465 LBR_tudor: Lamin-B re 28.3 2.3E+02 0.005 20.5 5.6 35 174-208 8-43 (55)
121 PF05578 Peptidase_S31: Pestiv 28.1 1.1E+02 0.0024 26.9 4.5 73 224-298 109-183 (211)
122 cd01727 LSm8 The eukaryotic Sm 27.4 1.3E+02 0.0029 22.7 4.5 32 176-207 10-41 (74)
123 COG1958 LSM1 Small nuclear rib 27.1 1.2E+02 0.0027 23.2 4.4 33 176-208 18-50 (79)
124 cd01721 Sm_D3 The eukaryotic S 26.8 1.6E+02 0.0034 22.1 4.8 32 176-207 11-42 (70)
125 PF14275 DUF4362: Domain of un 26.3 2E+02 0.0042 23.4 5.5 47 381-429 2-62 (98)
126 PF01423 LSM: LSM domain ; In 25.5 1.5E+02 0.0032 21.6 4.4 33 176-208 9-41 (67)
127 cd04627 CBS_pair_14 The CBS do 25.2 58 0.0012 26.4 2.4 22 276-297 97-118 (123)
128 PF01455 HupF_HypC: HupF/HypC 24.2 2.3E+02 0.0051 21.2 5.2 43 188-233 5-47 (68)
129 PF02601 Exonuc_VII_L: Exonucl 23.8 87 0.0019 30.9 3.8 35 152-186 280-314 (319)
130 cd04603 CBS_pair_KefB_assoc Th 22.8 69 0.0015 25.5 2.4 20 277-296 86-105 (111)
131 cd04620 CBS_pair_7 The CBS dom 22.3 71 0.0015 25.3 2.4 20 277-296 90-109 (115)
132 PF10049 DUF2283: Protein of u 20.1 73 0.0016 22.3 1.7 11 285-295 36-46 (50)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3e-52 Score=429.25 Aligned_cols=301 Identities=39% Similarity=0.625 Sum_probs=261.9
Q ss_pred hHHHHHHHhCCceEEEEEeeeccC------ccc----c---cc-ccCcCeEEEEEEEcC-CCEEEecccccCCCCeEEEE
Q 013804 117 ATVRLFQENTPSVVNITNLAARQD------AFT----L---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT 181 (436)
Q Consensus 117 ~~~~~~~~~~~sVV~I~~~~~~~~------~~~----~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~i~V~ 181 (436)
++.++++++.||||.|.+...... .|. . +. .....+.||||+|++ +||||||+||+.+++.+.|+
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~ 120 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ 120 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence 578999999999999987643211 111 0 00 112247899999985 79999999999999999999
Q ss_pred ecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCC
Q 013804 182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG 261 (436)
Q Consensus 182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~ 261 (436)
+.|+++++|++++.|+.+||||||++.+ ..+++++|+++..+++|++|+++|||++...+++.|+|++..+.....
T Consensus 121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~--- 196 (455)
T PRK10139 121 LNDGREFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL--- 196 (455)
T ss_pred ECCCCEEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCC---
Confidence 9999999999999999999999999854 378999999999999999999999999999999999999887652211
Q ss_pred CCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecceeeecc
Q 013804 262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD 341 (436)
Q Consensus 262 ~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~ 341 (436)
..+..++++|+.+++|+|||||+|.+|+||||+++...+.++..+++|+||++.+++++++|+++|++.++|||+.+++.
T Consensus 197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l 276 (455)
T PRK10139 197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM 276 (455)
T ss_pred CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence 12357899999999999999999999999999999887766678999999999999999999999999999999999863
Q ss_pred --hhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 013804 342 --QSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEV 416 (436)
Q Consensus 342 --~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v 416 (436)
+.++.+|+ .|++|..|.++|||+++||++ ||+|++|||++|.+|+|+.+.+....+|+++.++|
T Consensus 277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V 345 (455)
T PRK10139 277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKA-----------GDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL 345 (455)
T ss_pred CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCC-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence 34566765 699999999999999999999 99999999999999999999998878899999999
Q ss_pred EECCEEEEEEEEeecC
Q 013804 417 LRGDQKEKIPVKLEPK 432 (436)
Q Consensus 417 ~R~g~~~~~~v~~~~~ 432 (436)
.|+|+.+++++++...
T Consensus 346 ~R~G~~~~l~v~~~~~ 361 (455)
T PRK10139 346 LRNGKPLEVEVTLDTS 361 (455)
T ss_pred EECCEEEEEEEEECCC
Confidence 9999999999987543
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=5.4e-51 Score=408.35 Aligned_cols=303 Identities=37% Similarity=0.623 Sum_probs=262.4
Q ss_pred CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeE
Q 013804 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (436)
Q Consensus 112 ~~~~~~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~ 191 (436)
...+.++.++++++.||||.|.+.....+. .......+.||||+|+++||||||+||+.+++.+.|.+.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~---~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~ 117 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNS---LNQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE 117 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEeccccc---cccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence 344557889999999999999886543321 11123357899999999999999999999999999999999999999
Q ss_pred EEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEc
Q 013804 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (436)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~ 271 (436)
+++.|+.+||||||++.. .+++++++++..+++|++|+++|||++...+++.|+|++..+.... ......++++|
T Consensus 118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd 192 (351)
T TIGR02038 118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD 192 (351)
T ss_pred EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence 999999999999999863 5888999888889999999999999998899999999988764321 11235789999
Q ss_pred cccCCCCCCCeEECCCCcEEEEEeeeecCCC--CCCcceeeeeeeccchhhhhccccceecceecceeeecc--hhhhhc
Q 013804 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG--ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQL 347 (436)
Q Consensus 272 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~--~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~--~~~~~~ 347 (436)
+.+++|+|||||+|.+|+||||+++.+...+ ...+++|+||++.+++++++++++|++.++|||+.+++. ..++.+
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l 272 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL 272 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence 9999999999999999999999997664322 246899999999999999999999999999999999863 345666
Q ss_pred Cc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEE
Q 013804 348 GV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEK 424 (436)
Q Consensus 348 g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~ 424 (436)
|+ .|++|..|.+++||+++||++ ||+|++|||++|.+++|+.+++...++|++++++|.|+|+.++
T Consensus 273 gl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~ 341 (351)
T TIGR02038 273 GLPDLRGIVITGVDPNGPAARAGILV-----------RDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLE 341 (351)
T ss_pred CCCccccceEeecCCCChHHHCCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence 76 699999999999999999999 9999999999999999999999887889999999999999999
Q ss_pred EEEEeecCC
Q 013804 425 IPVKLEPKP 433 (436)
Q Consensus 425 ~~v~~~~~~ 433 (436)
+++++.++|
T Consensus 342 ~~v~l~~~p 350 (351)
T TIGR02038 342 LPVTIDEKP 350 (351)
T ss_pred EEEEecCCC
Confidence 999987654
No 3
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=6.1e-51 Score=407.75 Aligned_cols=303 Identities=35% Similarity=0.535 Sum_probs=259.7
Q ss_pred CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeE
Q 013804 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (436)
Q Consensus 112 ~~~~~~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~ 191 (436)
...+.++.++++++.||||.|......... .......+.||||+|+++||||||+||+.+++.+.|.+.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~ 117 (353)
T PRK10898 41 DETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEAL 117 (353)
T ss_pred ccccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEE
Confidence 334457889999999999999886532211 11112347899999999999999999999999999999999999999
Q ss_pred EEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEc
Q 013804 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (436)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~ 271 (436)
++++|+.+||||||++.. .+++++++++..+++|++|+++|||++...+++.|+|++..+..... .....++++|
T Consensus 118 vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~---~~~~~~iqtd 192 (353)
T PRK10898 118 LVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP---TGRQNFLQTD 192 (353)
T ss_pred EEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC---ccccceEEec
Confidence 999999999999999863 58889998888899999999999999988899999999877643221 1224689999
Q ss_pred cccCCCCCCCeEECCCCcEEEEEeeeecCCC---CCCcceeeeeeeccchhhhhccccceecceecceeeecch--hhhh
Q 013804 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQ--SVEQ 346 (436)
Q Consensus 272 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~---~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~--~~~~ 346 (436)
+.+++|+|||||+|.+|+||||+++.+...+ ...+++|+||++.+++++++++++|++.++|||+.+++.. .++.
T Consensus 193 a~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~ 272 (353)
T PRK10898 193 ASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQG 272 (353)
T ss_pred cccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHh
Confidence 9999999999999999999999998765432 2368999999999999999999999999999999987532 2333
Q ss_pred cCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEE
Q 013804 347 LGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKE 423 (436)
Q Consensus 347 ~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~ 423 (436)
+++ .|++|.+|.+++||+++||++ ||+|++|||++|.++.|+.+.+....+|++++++|.|+|+.+
T Consensus 273 ~~~~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~ 341 (353)
T PRK10898 273 GGIDQLQGIVVNEVSPDGPAAKAGIQV-----------NDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQL 341 (353)
T ss_pred cCCCCCCeEEEEEECCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEE
Confidence 343 799999999999999999999 999999999999999999999988789999999999999999
Q ss_pred EEEEEeecCC
Q 013804 424 KIPVKLEPKP 433 (436)
Q Consensus 424 ~~~v~~~~~~ 433 (436)
++++++.+++
T Consensus 342 ~~~v~l~~~p 351 (353)
T PRK10898 342 TLQVTIQEYP 351 (353)
T ss_pred EEEEEeccCC
Confidence 9999987765
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=9.8e-50 Score=412.42 Aligned_cols=300 Identities=38% Similarity=0.600 Sum_probs=261.0
Q ss_pred hHHHHHHHhCCceEEEEEeeeccC---c--------cccc--------------------------cccCcCeEEEEEEE
Q 013804 117 ATVRLFQENTPSVVNITNLAARQD---A--------FTLD--------------------------VLEVPQGSGSGFVW 159 (436)
Q Consensus 117 ~~~~~~~~~~~sVV~I~~~~~~~~---~--------~~~~--------------------------~~~~~~~~GSGfiI 159 (436)
++.++++++.||||.|.+...... + |... ......+.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 588999999999999987653211 0 1000 00122468999999
Q ss_pred cC-CCEEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCC
Q 013804 160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG 238 (436)
Q Consensus 160 ~~-~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g 238 (436)
++ +||||||+||+.+++.+.|++.|++.++|++++.|+.+||||||++.. ..+++++|+++..+++|++|+++|+|++
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~G~~V~aiG~P~g 197 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRVGDYTVAIGNPYG 197 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEcCCCC
Confidence 86 599999999999999999999999999999999999999999999754 3689999999999999999999999999
Q ss_pred CCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccch
Q 013804 239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNG 318 (436)
Q Consensus 239 ~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~ 318 (436)
...+++.|+|++..+.... ...+..++++|+.+++|+|||||+|.+|+||||+++...+.++..+++|+||++.+++
T Consensus 198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~ 274 (473)
T PRK10942 198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN 274 (473)
T ss_pred CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence 9999999999988764211 1123578999999999999999999999999999998877666778999999999999
Q ss_pred hhhhccccceecceecceeeecc--hhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEe
Q 013804 319 IVDQLVKFGKVTRPILGIKFAPD--QSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV 393 (436)
Q Consensus 319 ~l~~l~~~g~v~~~~lGv~~~~~--~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V 393 (436)
++++|+++|++.|+|||+.+++. +.++.+++ .|++|..|.+++||+++||++ ||+|++|||++|
T Consensus 275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V 343 (473)
T PRK10942 275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKA-----------GDVITSLNGKPI 343 (473)
T ss_pred HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCC-----------CCEEEEECCEEC
Confidence 99999999999999999999863 34666775 599999999999999999999 999999999999
Q ss_pred CCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 013804 394 SNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 431 (436)
Q Consensus 394 ~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 431 (436)
.+++|+.+++....+|++++++|.|+|+.+++++++..
T Consensus 344 ~s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~ 381 (473)
T PRK10942 344 SSFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQ 381 (473)
T ss_pred CCHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCc
Confidence 99999999998888899999999999999999988754
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.6e-48 Score=401.44 Aligned_cols=301 Identities=44% Similarity=0.677 Sum_probs=262.6
Q ss_pred HHHHHHHhCCceEEEEEeeeccC---------c----ccc--c------cccCcCeEEEEEEEcCCCEEEecccccCCCC
Q 013804 118 TVRLFQENTPSVVNITNLAARQD---------A----FTL--D------VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS 176 (436)
Q Consensus 118 ~~~~~~~~~~sVV~I~~~~~~~~---------~----~~~--~------~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~ 176 (436)
+.++++++.||||.|.+...... + |.. . ......+.||||+|+++||||||+||+.++.
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~ 82 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD 82 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence 56799999999999988652211 1 100 0 1123457899999999999999999999999
Q ss_pred eEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeec
Q 013804 177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS 256 (436)
Q Consensus 177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~ 256 (436)
.+.|++.|++.++|++++.|+.+||||||++.+ ..++++.|+++..+++|++|+++|||++...+++.|+|++..+...
T Consensus 83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~ 161 (428)
T TIGR02037 83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL 161 (428)
T ss_pred eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence 999999999999999999999999999999864 3689999998888999999999999999999999999998776531
Q ss_pred cCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecce
Q 013804 257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGI 336 (436)
Q Consensus 257 ~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv 336 (436)
....+..++++|+.+++|+|||||+|.+|+||||+++.....++..+++|+||++.+++++++++++|++.++|||+
T Consensus 162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi 238 (428)
T TIGR02037 162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV 238 (428)
T ss_pred ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence 11234568999999999999999999999999999998776666778999999999999999999999999999999
Q ss_pred eeecc--hhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804 337 KFAPD--QSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (436)
Q Consensus 337 ~~~~~--~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~ 411 (436)
.+++. ..++.+|+ .|++|.+|.+++||+++||++ ||+|++|||++|.++.++.+++....+|++
T Consensus 239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~ 307 (428)
T TIGR02037 239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKA-----------GDVILSVNGKPISSFADLRRAIGTLKPGKK 307 (428)
T ss_pred EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCE
Confidence 99863 35677776 799999999999999999999 999999999999999999999988788999
Q ss_pred EEEEEEECCEEEEEEEEeecCC
Q 013804 412 VIVEVLRGDQKEKIPVKLEPKP 433 (436)
Q Consensus 412 v~l~v~R~g~~~~~~v~~~~~~ 433 (436)
++++|.|+|+.+++++++...+
T Consensus 308 v~l~v~R~g~~~~~~v~l~~~~ 329 (428)
T TIGR02037 308 VTLGILRKGKEKTITVTLGASP 329 (428)
T ss_pred EEEEEEECCEEEEEEEEECcCC
Confidence 9999999999999999876543
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=6.3e-39 Score=321.87 Aligned_cols=302 Identities=45% Similarity=0.689 Sum_probs=261.5
Q ss_pred hhHHHHHHHhCCceEEEEEeeeccC-cccccc--ccCcCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeEE
Q 013804 116 LATVRLFQENTPSVVNITNLAARQD-AFTLDV--LEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI 192 (436)
Q Consensus 116 ~~~~~~~~~~~~sVV~I~~~~~~~~-~~~~~~--~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~v 192 (436)
..+..+++++.|+||.|........ .|.... .....+.||||+++++|||+|+.||+.++..+.+.+.||+.+++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~ 112 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL 112 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence 5778899999999999988654332 111000 0001489999999999999999999999999999999999999999
Q ss_pred EEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEcc
Q 013804 193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA 272 (436)
Q Consensus 193 v~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~ 272 (436)
++.|+..|+|++|++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. ... ....+.+++++|+
T Consensus 113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~v~-~~~~~~~~IqtdA 189 (347)
T COG0265 113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-GVG-SAGGYVNFIQTDA 189 (347)
T ss_pred EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-ccc-Ccccccchhhccc
Confidence 9999999999999987543 888899999999999999999999999999999999998886 111 1112568899999
Q ss_pred ccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecceeeecchhhhhcC---c
Q 013804 273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLG---V 349 (436)
Q Consensus 273 ~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~g---~ 349 (436)
++++|+||||++|.+|++|||++......++..+++|+||++.++.++.++++.|++.++++|+.+.+......+| .
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~ 269 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA 269 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence 9999999999999999999999999887665677999999999999999999988999999999988633222144 3
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
.|++|..+.+++||+++|+++ ||+|+++||++|.+..++...+....+|+++.+++.|+|+++++.+++
T Consensus 270 ~G~~V~~v~~~spa~~agi~~-----------Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l 338 (347)
T COG0265 270 AGAVVLGVLPGSPAAKAGIKA-----------GDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTL 338 (347)
T ss_pred CceEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Confidence 799999999999999999999 999999999999999999999998889999999999999999999999
Q ss_pred ec
Q 013804 430 EP 431 (436)
Q Consensus 430 ~~ 431 (436)
.+
T Consensus 339 ~~ 340 (347)
T COG0265 339 GD 340 (347)
T ss_pred cC
Confidence 76
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95 E-value=2.9e-27 Score=238.34 Aligned_cols=305 Identities=37% Similarity=0.516 Sum_probs=244.6
Q ss_pred hHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCCCC-----------eEEEEecCC
Q 013804 117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFADQ 185 (436)
Q Consensus 117 ~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~-----------~i~V~~~dg 185 (436)
....+.++-..+||.|+...-...+......+.+...||||+++.+|.++|++||+.... .+.|...++
T Consensus 129 ~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~ 208 (473)
T KOG1320|consen 129 FVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIG 208 (473)
T ss_pred hHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeec
Confidence 345788899999999987544333322344455678999999999999999999997543 377777766
Q ss_pred --cEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCC--
Q 013804 186 --SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG-- 261 (436)
Q Consensus 186 --~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~-- 261 (436)
..+++.+.+.|+..|+|+++++.+..-.++++++-+..+..|+++..+|.|++..+..+.|.+++..|........
T Consensus 209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~g 288 (473)
T KOG1320|consen 209 PGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLETG 288 (473)
T ss_pred CCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcccc
Confidence 8899999999999999999997654347888998888999999999999999999999999999888765443332
Q ss_pred CCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcccccee---------cce
Q 013804 262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV---------TRP 332 (436)
Q Consensus 262 ~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v---------~~~ 332 (436)
....+++++|++++.|+||+|++|.+|++||+++......+-..+++|++|.+.+..++.+..+.... .+.
T Consensus 289 ~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~~ 368 (473)
T KOG1320|consen 289 VLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVHQ 368 (473)
T ss_pred eeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccccc
Confidence 34568899999999999999999999999999887765444457899999999999998887443322 234
Q ss_pred ecceeeec-------chhhhhc----C-ccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHH
Q 013804 333 ILGIKFAP-------DQSVEQL----G-VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLY 400 (436)
Q Consensus 333 ~lGv~~~~-------~~~~~~~----g-~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~ 400 (436)
|+|+...- ....+.+ + ..+|+|.+|.+++++...++.+ ||+|++|||++|.+..++.
T Consensus 369 ~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~-----------g~~V~~vng~~V~n~~~l~ 437 (473)
T KOG1320|consen 369 YIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKP-----------GDQVVKVNGKPVKNLKHLY 437 (473)
T ss_pred cCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccC-----------CCEEEEECCEEeechHHHH
Confidence 66664331 0011111 2 2689999999999999999999 9999999999999999999
Q ss_pred HHHhcCCCCCEEEEEEEECCEEEEEEEEeecC
Q 013804 401 RILDQCKVGDEVIVEVLRGDQKEKIPVKLEPK 432 (436)
Q Consensus 401 ~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~ 432 (436)
+++++...+++|.+..+|..|..++.+.....
T Consensus 438 ~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~~~ 469 (473)
T KOG1320|consen 438 ELIEECSTEDKVAVLDRRSAEDATLEILPEHK 469 (473)
T ss_pred HHHHhcCcCceEEEEEecCccceeEEeccccc
Confidence 99999888899999999998988888876543
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.91 E-value=1.1e-22 Score=206.88 Aligned_cols=295 Identities=27% Similarity=0.367 Sum_probs=233.9
Q ss_pred hHHHHHHHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCC-CEEEecccccCCC-CeEEEEecCCcEEeeEEEE
Q 013804 117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGA-SDIRVTFADQSAYDAKIVG 194 (436)
Q Consensus 117 ~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~-~~i~V~~~dg~~~~a~vv~ 194 (436)
.....+..+-++||.|......- | +....+.+.+|||++++. ||||||+||+... -...+.+.+..+.+.-.++
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~~--f--dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy 128 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVRA--F--DTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY 128 (955)
T ss_pred hhhhhhhhhcccEEEEEehheee--c--ccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence 55667889999999998754321 1 223345678999999976 8999999999754 4567888888888888999
Q ss_pred EcCCCCeEEEEEcCCC---CCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCC---CCCcccEE
Q 013804 195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAAT---GRPIQDVI 268 (436)
Q Consensus 195 ~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~---~~~~~~~i 268 (436)
.|+.+|+.+++.+... ..+..+.++. +-.++|.+++++|+..+.-.+...|.++.+.+....+.. +.....++
T Consensus 129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~ 207 (955)
T KOG1421|consen 129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI 207 (955)
T ss_pred CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence 9999999999998643 1233444432 335789999999998888888999999988887766532 11224567
Q ss_pred EEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceecceecceeeec--chhhhh
Q 013804 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQ 346 (436)
Q Consensus 269 ~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~--~~~~~~ 346 (436)
|.-+....|.||+|++|.+|..|.++..+.. ..+.+|++|++.+.+-+.-++++..++|+.|-++|.+ .+.+++
T Consensus 208 QaasstsggssgspVv~i~gyAVAl~agg~~----ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr 283 (955)
T KOG1421|consen 208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSI----SSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR 283 (955)
T ss_pred eehhcCCCCCCCCceecccceEEeeecCCcc----cccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence 7778888999999999999999999887654 3456799999999999999998889999999999986 334666
Q ss_pred cCc---------------cceE-EEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCC
Q 013804 347 LGV---------------SGVL-VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGD 410 (436)
Q Consensus 347 ~g~---------------~gv~-V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~ 410 (436)
+|+ .|++ |..+.+++||++. |++ ||++++||+.-+.++.++.+.|.+ ..|+
T Consensus 284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~-----------GDillavN~t~l~df~~l~~iLDe-gvgk 350 (955)
T KOG1421|consen 284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEP-----------GDILLAVNSTCLNDFEALEQILDE-GVGK 350 (955)
T ss_pred cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCC-----------CcEEEEEcceehHHHHHHHHHHhh-ccCc
Confidence 664 4554 5567788887654 444 999999999999999999999988 5899
Q ss_pred EEEEEEEECCEEEEEEEEeecCC
Q 013804 411 EVIVEVLRGDQKEKIPVKLEPKP 433 (436)
Q Consensus 411 ~v~l~v~R~g~~~~~~v~~~~~~ 433 (436)
.++|+|+|+|++.+++++...+.
T Consensus 351 ~l~LtI~Rggqelel~vtvqdlh 373 (955)
T KOG1421|consen 351 NLELTIQRGGQELELTVTVQDLH 373 (955)
T ss_pred eEEEEEEeCCEEEEEEEEecccc
Confidence 99999999999999999887553
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.68 E-value=4.9e-16 Score=130.90 Aligned_cols=109 Identities=38% Similarity=0.594 Sum_probs=74.8
Q ss_pred EEEEEEcCCCEEEecccccC--------CCCeEEEEecCCcEEe--eEEEEEcCC-CCeEEEEEcCCCCCCcceecCCCC
Q 013804 154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA 222 (436)
Q Consensus 154 GSGfiI~~~G~ILT~aHvv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~l~~~~ 222 (436)
||||+|+++|+||||+||+. ....+.+...+++.+. ++++..++. +|+|||+++.
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------- 66 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------- 66 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC--------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec--------------
Confidence 89999999999999999998 4567888888988888 999999999 9999999970
Q ss_pred CCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEE
Q 013804 223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI 293 (436)
Q Consensus 223 ~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI 293 (436)
....+...... ............ ......+ +++.+.+|+|||||||.+|+||||
T Consensus 67 ~~~~~~~~~~~------------~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 67 WTGVGGGVRVP------------GSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp EEEEEEEEEEE------------EEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ccceeeeeEee------------eeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 00000000000 000000000000 0001114 799999999999999999999997
No 10
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.60 E-value=7.5e-15 Score=116.29 Aligned_cols=82 Identities=40% Similarity=0.607 Sum_probs=72.6
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~ 411 (436)
||||+.+..... ..|++|.+|.++|||+++||++ ||+|++|||++|+++.|+..++....+|++
T Consensus 1 ~~lGv~~~~~~~-----~~g~~V~~V~~~spA~~aGl~~-----------GD~I~~ing~~v~~~~~~~~~l~~~~~g~~ 64 (82)
T PF13180_consen 1 GGLGVTVQNLSD-----TGGVVVVSVIPGSPAAKAGLQP-----------GDIILAINGKPVNSSEDLVNILSKGKPGDT 64 (82)
T ss_dssp -E-SEEEEECSC-----SSSEEEEEESTTSHHHHTTS-T-----------TEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred CEECeEEEEccC-----CCeEEEEEeCCCCcHHHCCCCC-----------CcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence 689999987542 2599999999999999999999 999999999999999999999988899999
Q ss_pred EEEEEEECCEEEEEEEEe
Q 013804 412 VIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 412 v~l~v~R~g~~~~~~v~~ 429 (436)
++++|.|+|+.+++++++
T Consensus 65 v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 65 VTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp EEEEEEETTEEEEEEEE-
T ss_pred EEEEEEECCEEEEEEEEC
Confidence 999999999999999875
No 11
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.56 E-value=6.1e-14 Score=143.82 Aligned_cols=296 Identities=19% Similarity=0.238 Sum_probs=196.2
Q ss_pred HHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCC-CEEEecccccC-CCCeEEEEecCCcEEeeEEEEEcCCCC
Q 013804 123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDKD 200 (436)
Q Consensus 123 ~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~~d~~~D 200 (436)
++...+.|.+.... ++.+++.......|||.|++.+ |++++...++. +..+.+|.+.|.-.++|.+.+.|+..+
T Consensus 525 ~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~n 600 (955)
T KOG1421|consen 525 ADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTEN 600 (955)
T ss_pred hHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCccc
Confidence 34445555554422 2334444444568999999865 89999999986 567899999999999999999999999
Q ss_pred eEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEee---eeeeeccCC-CCCCcccEEEEccccCC
Q 013804 201 VAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISG---LRREISSAA-TGRPIQDVIQTDAAINP 276 (436)
Q Consensus 201 lAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~---~~~~~~~~~-~~~~~~~~i~~~~~i~~ 276 (436)
+|.+|.+... ...+.|. ...+..|+++...|+......-.....+.. +........ ......+.|.+++.+.-
T Consensus 601 ~a~~kydp~~--~~~~kl~-~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nlsT 677 (955)
T KOG1421|consen 601 VASFKYDPAL--EVQLKLT-DTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLST 677 (955)
T ss_pred eeEeccChhH--hhhhccc-eeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEeccccc
Confidence 9999998632 3445553 345788999999998755432111111111 111111111 11222466777776666
Q ss_pred CCCCCeEECCCCcEEEEEeeeecCC--CCCCcceeeeeeeccchhhhhccccceecceecceeeecc--hhhhhcCccce
Q 013804 277 GNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQLGVSGV 352 (436)
Q Consensus 277 G~SGGPlvd~~G~VVGI~s~~~~~~--~~~~~~~~aIP~~~i~~~l~~l~~~g~v~~~~lGv~~~~~--~~~~~~g~~gv 352 (436)
++--|-+.|.+|+|+|+.-...+.. +.....-|.+.+.++++.++.|+..+......+|++|... ..++.+|+.--
T Consensus 678 ~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~e 757 (955)
T KOG1421|consen 678 SCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPSE 757 (955)
T ss_pred cccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCHH
Confidence 6666778999999999976555442 2233456778899999999999998888777888888752 24555554333
Q ss_pred EEEecCCCCcccccCccc--ccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 013804 353 LVLDAPPNGPAGKAGLLS--TKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE 430 (436)
Q Consensus 353 ~V~~v~~~spa~~agl~~--~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~ 430 (436)
++.+...++...+.-+.. .+...+..|..||+|+++||+.|+...||.+.. .++.+|.|||.++++++++-
T Consensus 758 ~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~-------eid~~ilrdg~~~~ikipt~ 830 (955)
T KOG1421|consen 758 FIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFE-------EIDAVILRDGIEMEIKIPTY 830 (955)
T ss_pred HHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhh-------hhheeeeecCcEEEEEeccc
Confidence 333333333222221111 111223446679999999999999999998632 47899999999999998875
Q ss_pred cC
Q 013804 431 PK 432 (436)
Q Consensus 431 ~~ 432 (436)
+.
T Consensus 831 p~ 832 (955)
T KOG1421|consen 831 PE 832 (955)
T ss_pred cc
Confidence 43
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.54 E-value=5.3e-13 Score=123.87 Aligned_cols=168 Identities=23% Similarity=0.351 Sum_probs=112.2
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCCeEEEEecC-------C--cEEeeEEEEEc----C---CCCeEEEEEcCC---CC
Q 013804 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD-------Q--SAYDAKIVGFD----Q---DKDVAVLRIDAP---KD 211 (436)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d-------g--~~~~a~vv~~d----~---~~DlAlLkv~~~---~~ 211 (436)
...|+|++|+++ +|||++||+.+...+.+.+.. + ..+..+.+..+ . .+|+|||+++.+ ..
T Consensus 24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~ 102 (220)
T PF00089_consen 24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD 102 (220)
T ss_dssp EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 468999999987 999999999996667665532 2 24555544343 2 469999999987 34
Q ss_pred CCcceecCCC-CCCCCCCEEEEEecCCCCCC----ceeEeEEeeeeeeeccC-CCCCCcccEEEEcc----ccCCCCCCC
Q 013804 212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSA-ATGRPIQDVIQTDA----AINPGNSGG 281 (436)
Q Consensus 212 ~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~~----~~~~G~vs~~~~~~~~~-~~~~~~~~~i~~~~----~i~~G~SGG 281 (436)
.+.++.+... ..+..|+.+.++||+..... ......+..+....... .........+.... ..+.|+|||
T Consensus 103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~ 182 (220)
T PF00089_consen 103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG 182 (220)
T ss_dssp SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 5677777652 33578999999999975332 34444443333321111 00111234555554 789999999
Q ss_pred eEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhh
Q 013804 282 PLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIV 320 (436)
Q Consensus 282 Plvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l 320 (436)
||++.++.|+||++.. ..++......++.++..+.+|+
T Consensus 183 pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 183 PLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence 9998776799999988 4444343467888888877764
No 13
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.43 E-value=4.7e-13 Score=135.77 Aligned_cols=275 Identities=20% Similarity=0.224 Sum_probs=193.7
Q ss_pred HHhCCceEEEEEeeeccCcccccccc-CcCeEEEEEEEcCCCEEEecccccC---CCCeEEEEe-cCCcEEeeEEEEEcC
Q 013804 123 QENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFDQ 197 (436)
Q Consensus 123 ~~~~~sVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHvv~---~~~~i~V~~-~dg~~~~a~vv~~d~ 197 (436)
+....+++.+............|... .....|+||.+... .++|++|++. +...+.+.- +.-+.|.+++...-.
T Consensus 57 ~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~~ 135 (473)
T KOG1320|consen 57 DLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVFE 135 (473)
T ss_pred cccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHhhh
Confidence 34466788887766555443333332 33567999999754 8999999998 555555542 234678899988889
Q ss_pred CCCeEEEEEcCCC--CCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 013804 198 DKDVAVLRIDAPK--DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN 275 (436)
Q Consensus 198 ~~DlAlLkv~~~~--~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~ 275 (436)
++|+|++.++... ....++.+. +-+...+.++++| +....++.|+|......... ........+++++.++
T Consensus 136 ~cd~Avv~Ie~~~f~~~~~~~e~~--~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~--~~~~~l~~vqi~aa~~ 208 (473)
T KOG1320|consen 136 ECDLAVVYIESEEFWKGMNPFELG--DIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYA--HSSTVLLRVQIDAAIG 208 (473)
T ss_pred cccceEEEEeeccccCCCcccccC--CCcccCccEEEEc---CCcEEEEeeEEEEEEecccc--CCCcceeeEEEEEeec
Confidence 9999999998643 122233443 3355668899998 67789999999877654322 2223345689999999
Q ss_pred CCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcccccee-cceecceeeecch---hhhhcCc--
Q 013804 276 PGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV-TRPILGIKFAPDQ---SVEQLGV-- 349 (436)
Q Consensus 276 ~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v-~~~~lGv~~~~~~---~~~~~g~-- 349 (436)
+|+||+|.+...+++.|+........ ..+++.||.-.+.+|.......+.. .+++++...+... ..+.+.+
T Consensus 209 ~~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~ 285 (473)
T KOG1320|consen 209 PGNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGL 285 (473)
T ss_pred CCccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCc
Confidence 99999999987789999988776432 2678999999999999877666654 3566666555322 2233322
Q ss_pred -cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC-H-----HHHHHHHhcCCCCCEEEEEEEECC
Q 013804 350 -SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN-G-----SDLYRILDQCKVGDEVIVEVLRGD 420 (436)
Q Consensus 350 -~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s-~-----~dl~~~l~~~~~g~~v~l~v~R~g 420 (436)
.|+.+.++.+-+.|-+. ++. ||.|+++||+.|-- + -.+...+....++|++.+.+.|.+
T Consensus 286 ~~g~~i~~~~qtd~ai~~-~ns-----------g~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 286 ETGVLISKINQTDAAINP-GNS-----------GGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred ccceeeeeecccchhhhc-ccC-----------CCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 57999999887776554 333 99999999998841 1 123345566678999999999987
No 14
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.42 E-value=5.1e-12 Score=118.11 Aligned_cols=170 Identities=21% Similarity=0.264 Sum_probs=104.0
Q ss_pred CeEEEEEEEcCCCEEEecccccCCC--CeEEEEecC---------CcEEeeEEEEEcC-------CCCeEEEEEcCCC--
Q 013804 151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFAD---------QSAYDAKIVGFDQ-------DKDVAVLRIDAPK-- 210 (436)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~--~~i~V~~~d---------g~~~~a~vv~~d~-------~~DlAlLkv~~~~-- 210 (436)
...|+|++|+++ +|||+|||+.+. ..+.|.++. ...+.++-+..++ .+|||||+++.+.
T Consensus 24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~ 102 (232)
T cd00190 24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL 102 (232)
T ss_pred cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence 568999999987 999999999875 566666642 2234444455553 5799999998653
Q ss_pred -CCCcceecCCCC-CCCCCCEEEEEecCCCCCC-----ceeEeEEeeeeeeeccCCCC---CCcccEEEE-----ccccC
Q 013804 211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN 275 (436)
Q Consensus 211 -~~~~~~~l~~~~-~~~~G~~V~~vG~p~g~~~-----~~~~G~vs~~~~~~~~~~~~---~~~~~~i~~-----~~~i~ 275 (436)
..+.|+.|.... .+..|+.+.++||+..... ......+..+....+..... ......+.. ....|
T Consensus 103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 182 (232)
T cd00190 103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC 182 (232)
T ss_pred CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence 236777776543 5678999999999765332 12222222221111110000 000111211 44578
Q ss_pred CCCCCCeEECCC---CcEEEEEeeeecCCCCCCcceeeeeeeccchhhhh
Q 013804 276 PGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ 322 (436)
Q Consensus 276 ~G~SGGPlvd~~---G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~ 322 (436)
.|+|||||+... +.++||.+++.. |+.......+..+...++|+++
T Consensus 183 ~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~~ 231 (232)
T cd00190 183 QGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQK 231 (232)
T ss_pred cCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhhc
Confidence 999999999654 789999998754 4422334455556666666653
No 15
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.34 E-value=2.7e-11 Score=113.42 Aligned_cols=167 Identities=22% Similarity=0.284 Sum_probs=100.3
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCC--eEEEEecCC--------cEEeeEEEEEc-------CCCCeEEEEEcCCC---
Q 013804 151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK--- 210 (436)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~--- 210 (436)
...|+|++|+++ +|||+|||+.+.. .+.|.+... ..+.+.-+..+ ..+|+|||+++.+.
T Consensus 25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~ 103 (229)
T smart00020 25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS 103 (229)
T ss_pred CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence 568999999977 9999999998754 677777543 33444444433 35799999998762
Q ss_pred CCCcceecCCC-CCCCCCCEEEEEecCCCCC------CceeEeEEeeeeeeeccCCCCC---CcccEEE-----EccccC
Q 013804 211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQ-----TDAAIN 275 (436)
Q Consensus 211 ~~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~------~~~~~G~vs~~~~~~~~~~~~~---~~~~~i~-----~~~~i~ 275 (436)
..+.|+.|... ..+..++.+.+.||+.... .......+..+........... .....+. .....|
T Consensus 104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 183 (229)
T smart00020 104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC 183 (229)
T ss_pred CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence 24667777543 3467789999999986542 1122222222211111100000 0011111 135678
Q ss_pred CCCCCCeEECCCC--cEEEEEeeeecCCCCCCcceeeeeeeccchh
Q 013804 276 PGNSGGPLLDSSG--SLIGINTAIYSPSGASSGVGFSIPVDTVNGI 319 (436)
Q Consensus 276 ~G~SGGPlvd~~G--~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~ 319 (436)
+|+||||++...+ .++||.+.+. .|+.......+..+....+|
T Consensus 184 ~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~~~~~~~~~~i~~~~~W 228 (229)
T smart00020 184 QGDSGGPLVCNDGRWVLVGIVSWGS-GCARPGKPGVYTRVSSYLDW 228 (229)
T ss_pred CCCCCCeeEEECCCEEEEEEEEECC-CCCCCCCCCEEEEecccccc
Confidence 9999999996543 8999999876 44433344445555444433
No 16
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.29 E-value=2.4e-11 Score=95.50 Aligned_cols=68 Identities=29% Similarity=0.423 Sum_probs=63.6
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK 428 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (436)
.|++|..+.+++||+++||++ ||+|++|||++|.+++|+.+++....+|+.+.+++.|+|+.++++++
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~-----------GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHT-----------GDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred CcEEEEEECCCChHHhcCCCC-----------CCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 699999999999999999999 99999999999999999999998866789999999999998887765
No 17
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.27 E-value=4.4e-11 Score=93.85 Aligned_cols=72 Identities=29% Similarity=0.393 Sum_probs=66.0
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
.|++|.+|.+++||+. ||++ ||+|++|||++|.+++++.+++....+|+.+.+++.|+|+.+++++++
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l 75 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKA-----------GDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLIL 75 (79)
T ss_pred cCEEEEEECCCCchhh-CCCC-----------CCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEE
Confidence 6899999999999986 7999 999999999999999999999987678899999999999999999998
Q ss_pred ecCC
Q 013804 430 EPKP 433 (436)
Q Consensus 430 ~~~~ 433 (436)
.+++
T Consensus 76 ~~~~ 79 (79)
T cd00986 76 KTFP 79 (79)
T ss_pred eccC
Confidence 7653
No 18
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.25 E-value=4.8e-11 Score=95.61 Aligned_cols=84 Identities=44% Similarity=0.681 Sum_probs=70.9
Q ss_pred eecceeeecchhh--hhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 013804 332 PILGIKFAPDQSV--EQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (436)
Q Consensus 332 ~~lGv~~~~~~~~--~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~ 406 (436)
+|+|+.+++.... +.++. .|++|.++.+++||+++||++ ||+|++|||++|.++.++.+++...
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~i~~~~~~~~~l~~~ 69 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKP-----------GDVILAVNGKPVKSVADLRRALAEL 69 (90)
T ss_pred CccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCc-----------CCEEEEECCEECCCHHHHHHHHHhc
Confidence 5789998864422 22332 599999999999999999999 9999999999999999999999876
Q ss_pred CCCCEEEEEEEECCEEEEEE
Q 013804 407 KVGDEVIVEVLRGDQKEKIP 426 (436)
Q Consensus 407 ~~g~~v~l~v~R~g~~~~~~ 426 (436)
..++.+.+++.|+|+.+++.
T Consensus 70 ~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 70 KPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred CCCCEEEEEEEECCEEEEee
Confidence 66899999999999876654
No 19
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.23 E-value=3.1e-11 Score=115.78 Aligned_cols=101 Identities=19% Similarity=0.229 Sum_probs=89.8
Q ss_pred eccchhhhhccccceecceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEe
Q 013804 314 DTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV 393 (436)
Q Consensus 314 ~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V 393 (436)
..++++++++++++++.++|+|+...... -...|++|..+.++++++++||++ ||+|++|||+++
T Consensus 159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~----g~~~G~~v~~v~~~s~a~~aGLr~-----------GDvIv~ING~~i 223 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN----DKLEGYRLNPGKDPSLFYKSGLQD-----------GDIAVALNGLDL 223 (259)
T ss_pred hhHHHHHHHHHHCHHhhhheEeEEEEEeC----CceeEEEEEecCCCCHHHHcCCCC-----------CCEEEEECCEEc
Confidence 45678899999999999999999975432 114799999999999999999999 999999999999
Q ss_pred CCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804 394 SNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 394 ~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
++++++.+++...+++++++++|+|+|+.+++.+.+
T Consensus 224 ~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 224 RDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred CCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 999999999998888999999999999998888764
No 20
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.23 E-value=5.9e-11 Score=93.14 Aligned_cols=78 Identities=33% Similarity=0.521 Sum_probs=66.3
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~ 411 (436)
+|+|+.+.... .|++|.+|.+++||+++||++ ||+|++|||++|.++.++ +.....++.
T Consensus 1 ~~~G~~~~~~~-------~~~~V~~V~~~s~a~~aGl~~-----------GD~I~~Ing~~v~~~~~~---l~~~~~~~~ 59 (80)
T cd00990 1 PYLGLTLDKEE-------GLGKVTFVRDDSPADKAGLVA-----------GDELVAVNGWRVDALQDR---LKEYQAGDP 59 (80)
T ss_pred CcccEEEEccC-------CcEEEEEECCCChHHHhCCCC-----------CCEEEEECCEEhHHHHHH---HHhcCCCCE
Confidence 57899886543 579999999999999999999 999999999999986654 444457889
Q ss_pred EEEEEEECCEEEEEEEEee
Q 013804 412 VIVEVLRGDQKEKIPVKLE 430 (436)
Q Consensus 412 v~l~v~R~g~~~~~~v~~~ 430 (436)
+.+++.|+|+..++.+++.
T Consensus 60 v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 60 VELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred EEEEEEECCEEEEEEEEec
Confidence 9999999999988888764
No 21
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.16 E-value=2.3e-10 Score=89.38 Aligned_cols=77 Identities=27% Similarity=0.421 Sum_probs=65.4
Q ss_pred ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEE
Q 013804 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEV 412 (436)
Q Consensus 333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v 412 (436)
|+|+.+.... ..++|..+.+++||+++||++ ||+|++|||+++.+++++.+++... .++.+
T Consensus 2 ~~~~~~g~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~ 62 (79)
T cd00989 2 ILGFVPGGPP-------IEPVIGEVVPGSPAAKAGLKA-----------GDRILAINGQKIKSWEDLVDAVQEN-PGKPL 62 (79)
T ss_pred eeeEeccCCc-------cCcEEEeECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHHC-CCceE
Confidence 5666655332 358899999999999999999 9999999999999999999999874 57899
Q ss_pred EEEEEECCEEEEEEEE
Q 013804 413 IVEVLRGDQKEKIPVK 428 (436)
Q Consensus 413 ~l~v~R~g~~~~~~v~ 428 (436)
.+++.|+|+..++.++
T Consensus 63 ~l~v~r~~~~~~~~l~ 78 (79)
T cd00989 63 TLTVERNGETITLTLT 78 (79)
T ss_pred EEEEEECCEEEEEEec
Confidence 9999999988777664
No 22
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.10 E-value=5.6e-10 Score=88.55 Aligned_cols=79 Identities=28% Similarity=0.558 Sum_probs=68.3
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g 409 (436)
.-||+.+.... .+++|..+.+++||+++||++ ||+|++|||+++.++ .++..++.. ..|
T Consensus 2 ~~lG~~~~~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~vng~~i~~~~~~~~~~~l~~-~~~ 62 (85)
T cd00988 2 GGIGLELKYDD-------GGLVITSVLPGSPAAKAGIKA-----------GDIIVAIDGEPVDGLSLEDVVKLLRG-KAG 62 (85)
T ss_pred eEEEEEEEEcC-------CeEEEEEecCCCCHHHcCCCC-----------CCEEEEECCEEcCCCCHHHHHHHhcC-CCC
Confidence 35788876533 689999999999999999999 999999999999999 899988876 468
Q ss_pred CEEEEEEEEC-CEEEEEEEEe
Q 013804 410 DEVIVEVLRG-DQKEKIPVKL 429 (436)
Q Consensus 410 ~~v~l~v~R~-g~~~~~~v~~ 429 (436)
+.+.+++.|+ |+..+++++.
T Consensus 63 ~~i~l~v~r~~~~~~~~~~~~ 83 (85)
T cd00988 63 TKVRLTLKRGDGEPREVTLTR 83 (85)
T ss_pred CEEEEEEEcCCCCEEEEEEEE
Confidence 8999999999 8888877754
No 23
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.03 E-value=3.6e-09 Score=99.75 Aligned_cols=159 Identities=18% Similarity=0.209 Sum_probs=94.0
Q ss_pred eEEEEEEEcCCCEEEecccccCCCCe----EEEEe----cCC-cEEeeE--EEEEc-C---CCCeEEEEEcCCC------
Q 013804 152 GSGSGFVWDSKGHVVTNYHVIRGASD----IRVTF----ADQ-SAYDAK--IVGFD-Q---DKDVAVLRIDAPK------ 210 (436)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~~----i~V~~----~dg-~~~~a~--vv~~d-~---~~DlAlLkv~~~~------ 210 (436)
..+++|+|+++ .+||++||+..... +.+.. .++ ..+..+ ..... . +.|.+...+....
T Consensus 64 ~~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~ 142 (251)
T COG3591 64 LCTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGIN 142 (251)
T ss_pred ceeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCC
Confidence 34566999987 99999999965431 11111 122 122222 12112 2 3466665554211
Q ss_pred --CCCcceecCCCCCCCCCCEEEEEecCCCCCCce----eEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEE
Q 013804 211 --DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTL----TTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLL 284 (436)
Q Consensus 211 --~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~----~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlv 284 (436)
.......+......+.++.+.++|||.+..... ..+.+... ....+.+++.+++|+||+|++
T Consensus 143 ~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SGSpv~ 210 (251)
T COG3591 143 IGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSGSPVL 210 (251)
T ss_pred ccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCCCceE
Confidence 112222333345567889999999998765322 22333211 124688999999999999999
Q ss_pred CCCCcEEEEEeeeecCCCCCCcceee-eeeeccchhhhhcc
Q 013804 285 DSSGSLIGINTAIYSPSGASSGVGFS-IPVDTVNGIVDQLV 324 (436)
Q Consensus 285 d~~G~VVGI~s~~~~~~~~~~~~~~a-IP~~~i~~~l~~l~ 324 (436)
+.+.+|||++..+....++ ...+++ .-...+++++++++
T Consensus 211 ~~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 211 ISKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred ecCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence 9988999999987664433 333433 34466677776654
No 24
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.85 E-value=8.2e-09 Score=78.63 Aligned_cols=67 Identities=39% Similarity=0.620 Sum_probs=57.9
Q ss_pred ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCC
Q 013804 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD 410 (436)
Q Consensus 333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~ 410 (436)
++|+.+..... .|++|..+.+++||+++||++ ||+|++|||++|.++ +++.+++... .|+
T Consensus 2 ~~G~~~~~~~~------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~~-~g~ 63 (70)
T cd00136 2 GLGFSIRGGTE------GGVVVLSVEPGSPAERAGLQA-----------GDVILAVNGTDVKNLTLEDVAELLKKE-VGE 63 (70)
T ss_pred CccEEEecCCC------CCEEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhhC-CCC
Confidence 57888775431 489999999999999999999 999999999999999 9999999884 488
Q ss_pred EEEEEEE
Q 013804 411 EVIVEVL 417 (436)
Q Consensus 411 ~v~l~v~ 417 (436)
+++|+|+
T Consensus 64 ~v~l~v~ 70 (70)
T cd00136 64 KVTLTVR 70 (70)
T ss_pred eEEEEEC
Confidence 8988763
No 25
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.83 E-value=1.3e-08 Score=105.34 Aligned_cols=85 Identities=35% Similarity=0.554 Sum_probs=73.1
Q ss_pred ceecceeeecch--hhhhcCc----cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHh
Q 013804 331 RPILGIKFAPDQ--SVEQLGV----SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 404 (436)
Q Consensus 331 ~~~lGv~~~~~~--~~~~~g~----~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~ 404 (436)
..++|+.+.+.. ..+.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.+++|+.+++.
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~-----------GDvI~~Ing~~V~s~~d~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQP-----------GDVILSVNQQPVSSVAELRKVLD 405 (428)
T ss_pred ccccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHH
Confidence 467899887633 2344453 599999999999999999999 99999999999999999999999
Q ss_pred cCCCCCEEEEEEEECCEEEEEE
Q 013804 405 QCKVGDEVIVEVLRGDQKEKIP 426 (436)
Q Consensus 405 ~~~~g~~v~l~v~R~g~~~~~~ 426 (436)
..+.|+.+.++|.|+|+...+.
T Consensus 406 ~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 406 RAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred hcCCCCEEEEEEEECCEEEEEE
Confidence 8778999999999999987654
No 26
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.79 E-value=1e-08 Score=106.53 Aligned_cols=68 Identities=16% Similarity=0.174 Sum_probs=63.1
Q ss_pred eEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 013804 352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE 430 (436)
Q Consensus 352 v~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~ 430 (436)
.+|.+|.++|||++|||++ ||+|++|||++|++++|+...+....+|++++++|.|+|+.+++++++.
T Consensus 128 ~lV~~V~~~SpA~kAGLk~-----------GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~ 195 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQIAP-----------GTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLD 195 (449)
T ss_pred ccccccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEec
Confidence 3689999999999999999 9999999999999999999999887888999999999999888887774
No 27
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.70 E-value=5.1e-08 Score=100.46 Aligned_cols=69 Identities=32% Similarity=0.491 Sum_probs=64.1
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
.|++|.+|.++|||+++||++ ||+|++|||++|++++|+.+.+.. .+++++.++++|+|+..++++++
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~-----------GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~ 270 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKE-----------GDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKVERNGETLSISLTP 270 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEEEECCEEEEEEEEE
Confidence 378999999999999999999 999999999999999999999987 57888999999999998888887
Q ss_pred e
Q 013804 430 E 430 (436)
Q Consensus 430 ~ 430 (436)
.
T Consensus 271 ~ 271 (420)
T TIGR00054 271 E 271 (420)
T ss_pred c
Confidence 4
No 28
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.65 E-value=9.9e-08 Score=99.24 Aligned_cols=68 Identities=22% Similarity=0.410 Sum_probs=63.5
Q ss_pred ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 013804 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE 430 (436)
Q Consensus 351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~ 430 (436)
+++|.+|.++|||+++||++ ||+|++|||++|++|+|+.+.+.. .+|+.+.++|.|+|+..++++++.
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~-----------GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQA-----------GDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence 57899999999999999999 999999999999999999999987 578899999999999998888875
No 29
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.62 E-value=1.1e-07 Score=95.11 Aligned_cols=80 Identities=29% Similarity=0.499 Sum_probs=66.4
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g 409 (436)
..+|+.+.... .+++|..|.+++||+++||++ ||+|++|||++|.++ .++...+.. ..|
T Consensus 51 ~~lG~~~~~~~-------~~~~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g 111 (334)
T TIGR00225 51 EGIGIQVGMDD-------GEIVIVSPFEGSPAEKAGIKP-----------GDKIIKINGKSVAGMSLDDAVALIRG-KKG 111 (334)
T ss_pred EEEEEEEEEEC-------CEEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHhccC-CCC
Confidence 56888876432 579999999999999999999 999999999999986 467666665 578
Q ss_pred CEEEEEEEECCEEEEEEEEee
Q 013804 410 DEVIVEVLRGDQKEKIPVKLE 430 (436)
Q Consensus 410 ~~v~l~v~R~g~~~~~~v~~~ 430 (436)
+++.++|.|+|+..++++++.
T Consensus 112 ~~v~l~v~R~g~~~~~~v~l~ 132 (334)
T TIGR00225 112 TKVSLEILRAGKSKPLTFTLK 132 (334)
T ss_pred CEEEEEEEeCCCCceEEEEEE
Confidence 999999999987776666664
No 30
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.59 E-value=2.5e-07 Score=93.18 Aligned_cols=77 Identities=29% Similarity=0.515 Sum_probs=65.0
Q ss_pred ecceeeecchhhhhcCccceEEEecC--------CCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHh
Q 013804 333 ILGIKFAPDQSVEQLGVSGVLVLDAP--------PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 404 (436)
Q Consensus 333 ~lGv~~~~~~~~~~~g~~gv~V~~v~--------~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~ 404 (436)
.+|+.+.. +||+|.... .++||+++||++ ||+|++|||++|.+++|+.+++.
T Consensus 97 ~iGI~l~t---------~GVlVvg~~~v~~~~g~~~SPAa~AGLq~-----------GDiIvsING~~V~s~~DL~~iL~ 156 (402)
T TIGR02860 97 SIGVKLNT---------KGVLVVGFSDIETEKGKIHSPGEEAGIQI-----------GDRILKINGEKIKNMDDLANLIN 156 (402)
T ss_pred EEEEEEec---------CEEEEEEEEcccccCCCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHH
Confidence 46666643 688886652 368999999999 99999999999999999999998
Q ss_pred cCCCCCEEEEEEEECCEEEEEEEEee
Q 013804 405 QCKVGDEVIVEVLRGDQKEKIPVKLE 430 (436)
Q Consensus 405 ~~~~g~~v~l~v~R~g~~~~~~v~~~ 430 (436)
.. .++.+.++|.|+|+..++++++.
T Consensus 157 ~~-~g~~V~LtV~R~Ge~~tv~V~Pv 181 (402)
T TIGR02860 157 KA-GGEKLTLTIERGGKIIETVIKPV 181 (402)
T ss_pred hC-CCCeEEEEEEECCEEEEEEEEEe
Confidence 85 48999999999999988888754
No 31
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.59 E-value=1.8e-06 Score=82.58 Aligned_cols=170 Identities=22% Similarity=0.249 Sum_probs=95.5
Q ss_pred eEEEEEEEcCCCEEEecccccCCCC--eEEEEecC---------C---cEEeeEEEEEcC-------C-CCeEEEEEcCC
Q 013804 152 GSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFAD---------Q---SAYDAKIVGFDQ-------D-KDVAVLRIDAP 209 (436)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~d---------g---~~~~a~vv~~d~-------~-~DlAlLkv~~~ 209 (436)
..|.|.+|+++ ||+|++||+.+.. .+.|+++. + .......+..|+ . .|||||+++.+
T Consensus 38 ~~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~ 116 (256)
T KOG3627|consen 38 HLCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEP 116 (256)
T ss_pred eeeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCC
Confidence 47888888665 9999999999976 77777642 1 111122112332 2 79999999874
Q ss_pred C---CCCcceecCCCCC---CCCCCEEEEEecCCCCC------CceeEeEEeeeeeeeccCCCCC---CcccEEEE----
Q 013804 210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQT---- 270 (436)
Q Consensus 210 ~---~~~~~~~l~~~~~---~~~G~~V~~vG~p~g~~------~~~~~G~vs~~~~~~~~~~~~~---~~~~~i~~---- 270 (436)
. ..+.|+.|..... ...+..+++.||+.... .......+.-+....+...... .....+..
T Consensus 117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~ 196 (256)
T KOG3627|consen 117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE 196 (256)
T ss_pred cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence 3 4566777743332 34458888899864321 1222222222221111111000 00112222
Q ss_pred -ccccCCCCCCCeEECCC---CcEEEEEeeeecCCCCCCcceeeeeeeccchhhhh
Q 013804 271 -DAAINPGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ 322 (436)
Q Consensus 271 -~~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~ 322 (436)
....|.|+|||||+-.+ ..++||++++...|+.....+....+....+|+++
T Consensus 197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~ 252 (256)
T KOG3627|consen 197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKE 252 (256)
T ss_pred CCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHH
Confidence 23468999999999654 69999999987645433223334445555555554
No 32
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.58 E-value=1.7e-07 Score=73.66 Aligned_cols=74 Identities=36% Similarity=0.490 Sum_probs=58.2
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~ 411 (436)
..+|+.+....... .|++|..+.+++||+++||++ ||+|++|||+++.++.+..........++.
T Consensus 12 ~~~G~~~~~~~~~~----~~~~i~~v~~~s~a~~~gl~~-----------GD~I~~In~~~v~~~~~~~~~~~~~~~~~~ 76 (85)
T smart00228 12 GGLGFSLVGGKDEG----GGVVVSSVVPGSPAAKAGLKV-----------GDVILEVNGTSVEGLTHLEAVDLLKKAGGK 76 (85)
T ss_pred CcccEEEECCCCCC----CCEEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence 57888877533111 589999999999999999999 999999999999987665554443334678
Q ss_pred EEEEEEECC
Q 013804 412 VIVEVLRGD 420 (436)
Q Consensus 412 v~l~v~R~g 420 (436)
+.+++.|++
T Consensus 77 ~~l~i~r~~ 85 (85)
T smart00228 77 VTLTVLRGG 85 (85)
T ss_pred EEEEEEeCC
Confidence 999999875
No 33
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.48 E-value=6.4e-07 Score=91.43 Aligned_cols=84 Identities=21% Similarity=0.404 Sum_probs=64.8
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g 409 (436)
.-+|+.+....... -...|++|..|.++|||+++||++ ||+|++|||++|.++ .++...+.. ..|
T Consensus 85 ~GiG~~~~~~~~~~-~~~~g~~V~~V~~~SPA~~aGl~~-----------GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g 151 (389)
T PLN00049 85 TGVGLEVGYPTGSD-GPPAGLVVVAPAPGGPAARAGIRP-----------GDVILAIDGTSTEGLSLYEAADRLQG-PEG 151 (389)
T ss_pred eEEEEEEEEccCCC-CccCcEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhc-CCC
Confidence 45777765322100 001489999999999999999999 999999999999864 677777765 578
Q ss_pred CEEEEEEEECCEEEEEEEE
Q 013804 410 DEVIVEVLRGDQKEKIPVK 428 (436)
Q Consensus 410 ~~v~l~v~R~g~~~~~~v~ 428 (436)
+.+.++|.|+|+..+++++
T Consensus 152 ~~v~ltv~r~g~~~~~~l~ 170 (389)
T PLN00049 152 SSVELTLRRGPETRLVTLT 170 (389)
T ss_pred CEEEEEEEECCEEEEEEEE
Confidence 9999999999987776654
No 34
>PRK10139 serine endoprotease; Provisional
Probab=98.48 E-value=4.5e-07 Score=94.25 Aligned_cols=65 Identities=22% Similarity=0.428 Sum_probs=59.6
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV 427 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (436)
.|++|.+|.+++||+++||++ ||+|++|||++|.+|+|+.+++.+. + +++.++|+|+|+...+.+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~R~g~~~~~~~ 454 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQK-----------DDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIVRGNESIYLLL 454 (455)
T ss_pred CceEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEEECCEEEEEEe
Confidence 589999999999999999999 9999999999999999999999873 3 789999999999877665
No 35
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.47 E-value=2.3e-05 Score=73.61 Aligned_cols=163 Identities=16% Similarity=0.283 Sum_probs=86.0
Q ss_pred HHhCCceEEEEEeeeccCccccccccCcCeEEEEEEEcCCCEEEecccccCC-CCeEEEEecCCcEEeeE-----EEEEc
Q 013804 123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGFD 196 (436)
Q Consensus 123 ~~~~~sVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~-----vv~~d 196 (436)
..+...|+.|....... ...=-|+..+ .+|+|++|.++. ...++|...-|. |... -+..-
T Consensus 14 n~Ia~~ic~l~n~s~~~-----------~~~l~gigyG--~~iItn~HLf~~nng~L~i~s~hG~-f~v~nt~~lkv~~i 79 (235)
T PF00863_consen 14 NPIASNICRLTNESDGG-----------TRSLYGIGYG--SYIITNAHLFKRNNGELTIKSQHGE-FTVPNTTQLKVHPI 79 (235)
T ss_dssp HHHHTTEEEEEEEETTE-----------EEEEEEEEET--TEEEEEGGGGSSTTCEEEEEETTEE-EEECEGGGSEEEE-
T ss_pred chhhheEEEEEEEeCCC-----------eEEEEEEeEC--CEEEEChhhhccCCCeEEEEeCceE-EEcCCccccceEEe
Confidence 34456688887643221 2233477775 389999999964 456777776663 3222 23444
Q ss_pred CCCCeEEEEEcCCCCCCcceecC-CCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 013804 197 QDKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN 275 (436)
Q Consensus 197 ~~~DlAlLkv~~~~~~~~~~~l~-~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~ 275 (436)
+..||.++|+.. ++||.+-. .-..+..++.|+++|.-+.... ..-.|+........ . ...+...-....
T Consensus 80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p~--~---~~~fWkHwIsTk 149 (235)
T PF00863_consen 80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYPE--E---NSHFWKHWISTK 149 (235)
T ss_dssp TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEEE--T---TTTEEEE-C---
T ss_pred CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEeec--C---CCCeeEEEecCC
Confidence 688999999965 56665432 3356889999999997544332 12223322221111 1 124455556667
Q ss_pred CCCCCCeEECC-CCcEEEEEeeeecCCCCCCcceeeeee
Q 013804 276 PGNSGGPLLDS-SGSLIGINTAIYSPSGASSGVGFSIPV 313 (436)
Q Consensus 276 ~G~SGGPlvd~-~G~VVGI~s~~~~~~~~~~~~~~aIP~ 313 (436)
.|+=|.|+|+. +|++|||++..... ...+|+.|+
T Consensus 150 ~G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f 184 (235)
T PF00863_consen 150 DGDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF 184 (235)
T ss_dssp TT-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred CCccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence 89999999984 99999999987543 345677766
No 36
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.42 E-value=9e-07 Score=69.32 Aligned_cols=69 Identities=32% Similarity=0.474 Sum_probs=55.6
Q ss_pred ceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeC--CHHHHHHHHhcCCC
Q 013804 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKV 408 (436)
Q Consensus 331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~--s~~dl~~~l~~~~~ 408 (436)
...+|+.+...... ..|++|.++.+++||+++||++ ||+|++|||+++. ++.++.+++... .
T Consensus 11 ~~~~G~~~~~~~~~----~~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~~~l~~~-~ 74 (82)
T cd00992 11 GGGLGFSLRGGKDS----GGGIFVSRVEPGGPAERGGLRV-----------GDRILEVNGVSVEGLTHEEAVELLKNS-G 74 (82)
T ss_pred CCCcCEEEeCcccC----CCCeEEEEECCCChHHhCCCCC-----------CCEEEEECCEEcCccCHHHHHHHHHhC-C
Confidence 35688888754311 3589999999999999999999 9999999999999 899999988863 2
Q ss_pred CCEEEEEE
Q 013804 409 GDEVIVEV 416 (436)
Q Consensus 409 g~~v~l~v 416 (436)
..+.+++
T Consensus 75 -~~v~l~v 81 (82)
T cd00992 75 -DEVTLTV 81 (82)
T ss_pred -CeEEEEE
Confidence 3666654
No 37
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.39 E-value=4.8e-07 Score=71.07 Aligned_cols=71 Identities=28% Similarity=0.462 Sum_probs=55.7
Q ss_pred ceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCC
Q 013804 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKV 408 (436)
Q Consensus 331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~ 408 (436)
...+|+.+....... ..+++|.++.+++||+++||++ ||.|++|||+++.++ .++..++...
T Consensus 9 ~~~lG~~l~~~~~~~---~~~~~V~~v~~~~~a~~~gl~~-----------GD~Il~INg~~v~~~~~~~~~~~l~~~-- 72 (81)
T PF00595_consen 9 NGPLGFTLRGGSDND---EKGVFVSSVVPGSPAERAGLKV-----------GDRILEINGQSVRGMSHDEVVQLLKSA-- 72 (81)
T ss_dssp TSBSSEEEEEESTSS---SEEEEEEEECTTSHHHHHTSST-----------TEEEEEETTEESTTSBHHHHHHHHHHS--
T ss_pred CCCcCEEEEecCCCC---cCCEEEEEEeCCChHHhcccch-----------hhhhheeCCEeCCCCCHHHHHHHHHCC--
Confidence 457888887643211 2599999999999999999999 999999999999976 4566667663
Q ss_pred CCEEEEEEE
Q 013804 409 GDEVIVEVL 417 (436)
Q Consensus 409 g~~v~l~v~ 417 (436)
+.+++|+|+
T Consensus 73 ~~~v~L~V~ 81 (81)
T PF00595_consen 73 SNPVTLTVQ 81 (81)
T ss_dssp TSEEEEEEE
T ss_pred CCcEEEEEC
Confidence 348888874
No 38
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=98.37 E-value=1.5e-06 Score=83.40 Aligned_cols=71 Identities=31% Similarity=0.500 Sum_probs=63.7
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLR-GDQKEKIPVK 428 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~ 428 (436)
.||++..+.+++|+ .++|+.||.|++|||+++.+.+|+..++.+.++|++|++++.| +++...++++
T Consensus 130 ~gvyv~~v~~~~~~------------~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~t 197 (342)
T COG3480 130 AGVYVLSVIDNSPF------------KGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTIT 197 (342)
T ss_pred eeEEEEEccCCcch------------hceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEE
Confidence 79999999999986 4556669999999999999999999999999999999999997 7888888888
Q ss_pred eecC
Q 013804 429 LEPK 432 (436)
Q Consensus 429 ~~~~ 432 (436)
+.+.
T Consensus 198 l~~~ 201 (342)
T COG3480 198 LIKN 201 (342)
T ss_pred EEee
Confidence 8766
No 39
>PRK10942 serine endoprotease; Provisional
Probab=98.37 E-value=1.1e-06 Score=91.74 Aligned_cols=65 Identities=34% Similarity=0.539 Sum_probs=59.3
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV 427 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (436)
.|++|.+|.++++|+++||++ ||+|++|||++|.+++|+.+++.. .+ +.+.|+|+|+|+.+.+.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~-----------GDvIv~VNg~~V~s~~dl~~~l~~-~~-~~v~l~V~R~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKK-----------GDVIIGANQQPVKNIAELRKILDS-KP-SVLALNIQRGDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CC-CeEEEEEEECCEEEEEEe
Confidence 489999999999999999999 999999999999999999999987 33 789999999999877655
No 40
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.35 E-value=9.5e-07 Score=89.62 Aligned_cols=63 Identities=24% Similarity=0.362 Sum_probs=54.8
Q ss_pred EEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE-ECCEEEEEEEEee
Q 013804 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVL-RGDQKEKIPVKLE 430 (436)
Q Consensus 353 ~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~-R~g~~~~~~v~~~ 430 (436)
+|..|.++|+|+++||++ ||+|++|||++|.+|.|+..++. ++.+.++|. |+|+..++++...
T Consensus 1 ~I~~V~pgSpAe~AGLe~-----------GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 1 LISAVLPGSIAEELGFEP-----------GDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred CcCCcCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEecC
Confidence 367889999999999999 99999999999999999988774 366899997 8898888887654
No 41
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.31 E-value=4.4e-06 Score=66.54 Aligned_cols=79 Identities=29% Similarity=0.524 Sum_probs=50.8
Q ss_pred eecceeeecchhhhhcCccceEEEecCCC--------CcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHH
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPN--------GPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRIL 403 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~--------spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l 403 (436)
+.||..+.... .+..|..+.++ ||-.+.|+. +++||+|++|||++|..-.++..+|
T Consensus 1 G~LGAd~~~~~-------~~y~I~~I~~gd~~~~~~~sPL~~pGv~---------v~~GD~I~aInG~~v~~~~~~~~lL 64 (88)
T PF14685_consen 1 GLLGADFSYDN-------GGYRIARIYPGDPWNPNARSPLAQPGVD---------VREGDYILAINGQPVTADANPYRLL 64 (88)
T ss_dssp -B-SEEEEEET-------TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB-HHHHH
T ss_pred CccceEEEEcC-------CEEEEEEEeCCCCCCccccCCccCCCCC---------CCCCCEEEEECCEECCCCCCHHHHh
Confidence 35777776543 56678887765 444444443 4669999999999999999999999
Q ss_pred hcCCCCCEEEEEEEECC-EEEEEEE
Q 013804 404 DQCKVGDEVIVEVLRGD-QKEKIPV 427 (436)
Q Consensus 404 ~~~~~g~~v~l~v~R~g-~~~~~~v 427 (436)
.. +.|+.|.|+|.+.+ +.+++.|
T Consensus 65 ~~-~agk~V~Ltv~~~~~~~R~v~V 88 (88)
T PF14685_consen 65 EG-KAGKQVLLTVNRKPGGARTVVV 88 (88)
T ss_dssp HT-TTTSEEEEEEE-STT-EEEEEE
T ss_pred cc-cCCCEEEEEEecCCCCceEEEC
Confidence 98 68999999999965 4555543
No 42
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.24 E-value=3.4e-06 Score=86.29 Aligned_cols=80 Identities=26% Similarity=0.468 Sum_probs=65.6
Q ss_pred cceecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCC
Q 013804 330 TRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCK 407 (436)
Q Consensus 330 ~~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~ 407 (436)
.+..+|+++...+. .++.|.++.+++||+++|+++ ||+|++|||+++... +++.+.+.. +
T Consensus 98 ~~~GiG~~i~~~~~------~~~~V~s~~~~~PA~kagi~~-----------GD~I~~IdG~~~~~~~~~~av~~irG-~ 159 (406)
T COG0793 98 EFGGIGIELQMEDI------GGVKVVSPIDGSPAAKAGIKP-----------GDVIIKIDGKSVGGVSLDEAVKLIRG-K 159 (406)
T ss_pred cccceeEEEEEecC------CCcEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCCCHHHHHHHhCC-C
Confidence 56889999876432 688999999999999999999 999999999999976 457777776 6
Q ss_pred CCCEEEEEEEECC--EEEEEEE
Q 013804 408 VGDEVIVEVLRGD--QKEKIPV 427 (436)
Q Consensus 408 ~g~~v~l~v~R~g--~~~~~~v 427 (436)
+|..|+|+|.|.+ +..++++
T Consensus 160 ~Gt~V~L~i~r~~~~k~~~v~l 181 (406)
T COG0793 160 PGTKVTLTILRAGGGKPFTVTL 181 (406)
T ss_pred CCCeEEEEEEEcCCCceeEEEE
Confidence 8999999999974 3444443
No 43
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments also matches a number of members of the peptidase family S2C. The region of match appears not to overlap the active site domain.
Probab=98.23 E-value=1.7e-06 Score=89.31 Aligned_cols=66 Identities=29% Similarity=0.322 Sum_probs=58.7
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK 428 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (436)
.|++|.+|.++|||+++||++ ||+|++|||+++.+++|+.+.+.... +++.+++.|+++..+++++
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~-----------GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I~r~g~~~~l~v~ 193 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEP-----------GDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEILAERENWTFEVM 193 (420)
T ss_pred CCceeeccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEEEEecCceEeccc
Confidence 578999999999999999999 99999999999999999999988754 6789999998877665443
No 44
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.00 E-value=0.00024 Score=69.56 Aligned_cols=55 Identities=18% Similarity=0.239 Sum_probs=39.5
Q ss_pred cccCCCCCCCeEECC--CC-cEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcccc
Q 013804 272 AAINPGNSGGPLLDS--SG-SLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKF 326 (436)
Q Consensus 272 ~~i~~G~SGGPlvd~--~G-~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~ 326 (436)
...|.|+||||+|-. +| .-+||++|+...|+...-.+...-++....|++..++.
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~ 280 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG 280 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence 567899999999843 45 47999999988877544344444567777888775543
No 45
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.00 E-value=1.6e-05 Score=76.31 Aligned_cols=57 Identities=21% Similarity=0.367 Sum_probs=52.1
Q ss_pred cccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804 362 PAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 362 pa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
--+++||++ ||++++|||.++++.++..++++.......++|+|+|||+.+++.+.+
T Consensus 219 lF~~~GLq~-----------GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l 275 (276)
T PRK09681 219 LFDASGFKE-----------GDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL 275 (276)
T ss_pred HHHHcCCCC-----------CCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence 357789999 999999999999999999999998888899999999999999988765
No 46
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.95 E-value=1.2e-05 Score=82.33 Aligned_cols=64 Identities=38% Similarity=0.473 Sum_probs=56.3
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
++.+|..|.++|||++|||.+ ||.|++|||. ...+..++.++.|++++.|.|+.+++.+++
T Consensus 462 g~~~i~~V~~~gPA~~AGl~~-----------Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~~~~L~e~~v~~ 522 (558)
T COG3975 462 GHEKITFVFPGGPAYKAGLSP-----------GDKIVAINGI--------SDQLDRYKVNDKIQVHVFREGRLREFLVKL 522 (558)
T ss_pred CeeEEEecCCCChhHhccCCC-----------ccEEEEEcCc--------cccccccccccceEEEEccCCceEEeeccc
Confidence 678999999999999999999 9999999998 334556788999999999999999998877
Q ss_pred ecC
Q 013804 430 EPK 432 (436)
Q Consensus 430 ~~~ 432 (436)
...
T Consensus 523 ~~~ 525 (558)
T COG3975 523 GGD 525 (558)
T ss_pred CCC
Confidence 543
No 47
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.94 E-value=2.6e-05 Score=70.63 Aligned_cols=74 Identities=27% Similarity=0.248 Sum_probs=61.8
Q ss_pred ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHH--HhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI--LDQCKVGDEVIVEVLRGDQKEKIPVK 428 (436)
Q Consensus 351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~--l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (436)
=++|..|.++|||+.+||+. ||.|+++....-.++..|..+ +.+...++.+.++|.|.|+...+.++
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~-----------gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~lt 208 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCV-----------GDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLT 208 (231)
T ss_pred eEEEeecCCCChhhhhCccc-----------CceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeC
Confidence 35799999999999999999 999999988777776655442 33446789999999999999999999
Q ss_pred eecCCCC
Q 013804 429 LEPKPDE 435 (436)
Q Consensus 429 ~~~~~~~ 435 (436)
+..|...
T Consensus 209 P~~W~Gr 215 (231)
T KOG3129|consen 209 PKKWQGR 215 (231)
T ss_pred cccccCC
Confidence 9988653
No 48
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.90 E-value=3.7e-05 Score=82.91 Aligned_cols=79 Identities=23% Similarity=0.354 Sum_probs=60.5
Q ss_pred ceecceeeecchhhhhcCccceEEEecCCCCccccc-CcccccccccCcccCCcEEEEEC--CEEeCC-----HHHHHHH
Q 013804 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVN--GKKVSN-----GSDLYRI 402 (436)
Q Consensus 331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~a-gl~~~~~~~~~~l~~GDiIl~vn--G~~V~s-----~~dl~~~ 402 (436)
...+|+.+.... ++++|..|.+++||+++ ||++ ||+|++|| |+++.+ .+++...
T Consensus 243 ~~GIGa~l~~~~-------~~~~V~~vipGsPA~ka~gLk~-----------GD~IlaVn~~g~~~~dv~g~~~~~vv~l 304 (667)
T PRK11186 243 LEGIGAVLQMDD-------DYTVINSLVAGGPAAKSKKLSV-----------GDKIVGVGQDGKPIVDVIGWRLDDVVAL 304 (667)
T ss_pred eeEEEEEEEEeC-------CeEEEEEccCCChHHHhCCCCC-----------CCEEEEECCCCCcccccccCCHHHHHHH
Confidence 356788876543 46899999999999998 9999 99999999 554432 3477778
Q ss_pred HhcCCCCCEEEEEEEEC---CEEEEEEEE
Q 013804 403 LDQCKVGDEVIVEVLRG---DQKEKIPVK 428 (436)
Q Consensus 403 l~~~~~g~~v~l~v~R~---g~~~~~~v~ 428 (436)
+.. ..|.+|.|+|.|+ ++..+++++
T Consensus 305 irG-~~Gt~V~LtV~r~~~~~~~~~vtl~ 332 (667)
T PRK11186 305 IKG-PKGSKVRLEILPAGKGTKTRIVTLT 332 (667)
T ss_pred hcC-CCCCEEEEEEEeCCCCCceEEEEEE
Confidence 877 6899999999984 445555543
No 49
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.81 E-value=0.00026 Score=66.73 Aligned_cols=117 Identities=26% Similarity=0.366 Sum_probs=63.0
Q ss_pred CeEEEEEEEcCCC--EEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCC
Q 013804 151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ 228 (436)
Q Consensus 151 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~ 228 (436)
...|||=+...+| .|+|+.||+. .+...|.. .+.... ..+...-|+|.-.++.-...+|.++++.. ..|.
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~-~g~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a~~---~~Gr 182 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSG-VGTRRM---LTFKKNGDFAEADITNWPGAAPKYKFAQN---YTGR 182 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE-TTEEEE---EEEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred ecccccceEEECCeEEEEEEEEEcC-CCeEEEEe-cceEEE---EEEeccCcEEEEECCCCCCCCCceeecCC---cccc
Confidence 4455555555444 4999999998 44444444 333222 23445679999999543346777777522 2333
Q ss_pred EEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeee
Q 013804 229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIY 298 (436)
Q Consensus 229 ~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~ 298 (436)
--+.- ...+..|.|..- ..+. -..+|+||+|+++.+|.+||||+...
T Consensus 183 AyW~t------~tGvE~G~ig~~--------------~~~~---fT~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 183 AYWLT------STGVEPGFIGGG--------------GAVC---FTGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp EEEEE------TTEEEEEEEETT--------------EEEE---SS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred eEEEc------ccCcccceecCc--------------eEEE---EcCCCCCCCccCcCCCCEEEEEecCC
Confidence 22221 223445555311 1122 23479999999999999999999764
No 50
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.78 E-value=5.9e-05 Score=65.49 Aligned_cols=87 Identities=24% Similarity=0.389 Sum_probs=59.0
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~ 411 (436)
+.||+.++-.... .-...+.-|.+|.++|||++|||++ ..|.|+.+|+....+.++|.+.++. ..++.
T Consensus 26 g~LG~sv~~~~~~-~~~~~~~~Vl~V~p~SPA~~AGL~p----------~~DyIig~~~~~l~~~~~l~~~v~~-~~~~~ 93 (138)
T PF04495_consen 26 GLLGISVRFESFE-GAEEEGWHVLRVAPNSPAAKAGLEP----------FFDYIIGIDGGLLDDEDDLFELVEA-NENKP 93 (138)
T ss_dssp SSS-EEEEEEE-T-TGCCCEEEEEEE-TTSHHHHTT--T----------TTEEEEEETTCE--STCHHHHHHHH-TTTS-
T ss_pred CCCcEEEEEeccc-ccccceEEEeEecCCCHHHHCCccc----------cccEEEEccceecCCHHHHHHHHHH-cCCCc
Confidence 6788877643211 0112578899999999999999998 2599999999999999999999988 57889
Q ss_pred EEEEEEEC--CEEEEEEEEee
Q 013804 412 VIVEVLRG--DQKEKIPVKLE 430 (436)
Q Consensus 412 v~l~v~R~--g~~~~~~v~~~ 430 (436)
+.+.|... +..+++++++.
T Consensus 94 l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 94 LQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp EEEEEEETTTTCEEEEEE---
T ss_pred EEEEEEECCCCeEEEEEEEcC
Confidence 99999873 45566666664
No 51
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.27 E-value=0.00056 Score=63.41 Aligned_cols=59 Identities=24% Similarity=0.424 Sum_probs=51.9
Q ss_pred CCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 013804 359 PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK 428 (436)
Q Consensus 359 ~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (436)
+.+--+..||++ ||+.+++|+..+++.+++..+++....-+.++++|+|+|+.+.+.|.
T Consensus 216 d~slF~~sglq~-----------GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 216 DGSLFYKSGLQR-----------GDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred CcchhhhhcCCC-----------cceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 445566778998 99999999999999999999999877778899999999999988875
No 52
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.88 E-value=0.0008 Score=53.85 Aligned_cols=34 Identities=35% Similarity=0.442 Sum_probs=31.7
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeC
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS 394 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~ 394 (436)
.|++|+.|.++|||+.|||+. +|.|+.+||...+
T Consensus 59 ~GiYvT~V~eGsPA~~AGLri-----------hDKIlQvNG~DfT 92 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGLRI-----------HDKILQVNGWDFT 92 (124)
T ss_pred ccEEEEEeccCChhhhhccee-----------cceEEEecCceeE
Confidence 799999999999999999999 9999999996544
No 53
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue.
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.68 E-value=0.044 Score=49.48 Aligned_cols=137 Identities=18% Similarity=0.271 Sum_probs=77.2
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeE--EEEEcC---CCCeEEEEEcCCCCCCccee-cCCCCCC
Q 013804 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK--IVGFDQ---DKDVAVLRIDAPKDKLRPIP-IGVSADL 224 (436)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~--vv~~d~---~~DlAlLkv~~~~~~~~~~~-l~~~~~~ 224 (436)
...++++.|-.+ ++|-..| -.....+.+ +|+.++.. +...+. ..|+++++++... +++-+. +-.....
T Consensus 24 ~~t~l~~gi~~~-~~lvp~H-~~~~~~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~~~ 97 (172)
T PF00548_consen 24 EFTMLALGIYDR-YFLVPTH-EEPEDTIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPESIP 97 (172)
T ss_dssp EEEEEEEEEEBT-EEEEEGG-GGGCSEEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSSGG
T ss_pred eEEEecceEeee-EEEEECc-CCCcEEEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccccc
Confidence 557888888765 9999999 222233333 45555433 222333 4599999997643 332221 1111112
Q ss_pred CCCCEEEEEecCCCCCC-ceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEEC---CCCcEEEEEeee
Q 013804 225 LVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD---SSGSLIGINTAI 297 (436)
Q Consensus 225 ~~G~~V~~vG~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd---~~G~VVGI~s~~ 297 (436)
...+...++-.. .... ....+.+...... .. .+......+.++++..+|+-||||+. ..++++|||.++
T Consensus 98 ~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 98 EYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp TEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred cCCCcEEEEECC-CCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 334444444322 2222 3334444433332 11 23334577888999999999999994 267999999986
No 54
>PF12812 PDZ_1: PDZ-like domain
Probab=96.66 E-value=0.0045 Score=48.29 Aligned_cols=64 Identities=28% Similarity=0.390 Sum_probs=50.1
Q ss_pred eecceeeec--chhhhhcCc-cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 013804 332 PILGIKFAP--DQSVEQLGV-SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (436)
Q Consensus 332 ~~lGv~~~~--~~~~~~~g~-~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~ 406 (436)
-|.|-.+.+ ...++.+++ -|+++.....++++..-|+.. |-+|++|||+++.+.++|.+.+++.
T Consensus 9 ~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~-----------g~iI~~Vn~kpt~~Ld~f~~vvk~i 75 (78)
T PF12812_consen 9 EVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISK-----------GFIITSVNGKPTPDLDDFIKVVKKI 75 (78)
T ss_pred EEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCC-----------CeEEEeECCcCCcCHHHHHHHHHhC
Confidence 367777776 345777775 345555667888887766888 9999999999999999999999874
No 55
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.65 E-value=0.031 Score=51.66 Aligned_cols=166 Identities=18% Similarity=0.249 Sum_probs=86.5
Q ss_pred ccccCcCeEEEEEEEcCC-CEEEecccccCCCCe-EEEEecCCcEEeeEEEEEcCC----------------CCeEEEEE
Q 013804 145 DVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQD----------------KDVAVLRI 206 (436)
Q Consensus 145 ~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLkv 206 (436)
|......+.||=.+++++ +..--=.|.+.+.+. ..+.+.+|+.+++++....+. .-+.-+.-
T Consensus 13 wVRD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~ 92 (218)
T PF05580_consen 13 WVRDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEK 92 (218)
T ss_pred EEEeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEe
Confidence 334445788999999874 566666899887664 456667888888777655421 11222222
Q ss_pred cCC--------------CCCCcceecCCCCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCC----cccEE
Q 013804 207 DAP--------------KDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRP----IQDVI 268 (436)
Q Consensus 207 ~~~--------------~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~----~~~~i 268 (436)
+.. ....++++++...+++.|..-+.--. .+.....-.-.|..+.+.......+.. ...++
T Consensus 93 Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~ieI~~v~~~~~~~~k~~vi~vtd~~Ll 171 (218)
T PF05580_consen 93 NTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDIEIEKVLPQSSPSGKGMVIKVTDPRLL 171 (218)
T ss_pred ccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEEEEEEEccCCCCCCCcEEEEECCcchh
Confidence 111 01234455555555666653221100 111111111112222221110000000 01222
Q ss_pred EEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeec
Q 013804 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT 315 (436)
Q Consensus 269 ~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~ 315 (436)
.....+..||||+|++ .+|++||=++..+.+ +...+|.++++.
T Consensus 172 ~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie~ 214 (218)
T PF05580_consen 172 EKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIEW 214 (218)
T ss_pred hhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHHH
Confidence 3344577899999999 799999998877653 456788887654
No 56
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.59 E-value=0.09 Score=51.12 Aligned_cols=91 Identities=20% Similarity=0.253 Sum_probs=55.1
Q ss_pred CCCCeEEEEEcCC-CCCCcceecCCCC-CCCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEEEEcccc
Q 013804 197 QDKDVAVLRIDAP-KDKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI 274 (436)
Q Consensus 197 ~~~DlAlLkv~~~-~~~~~~~~l~~~~-~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i 274 (436)
..++++||+++.+ .....++=|+++. ....|+.+.+.|+.. ........+.-.... .....+......
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~--~~~~~~~~~~i~~~~--------~~~~~~~~~~~~ 228 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNS--TGKLKHRKLKITNCT--------KCAYSICTKQYS 228 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCC--CCeEEEEEEEEEEee--------ccceeEeccccc
Confidence 4569999999876 2356677776543 356789999998721 112222222211110 012345556677
Q ss_pred CCCCCCCeEEC-CCC--cEEEEEeee
Q 013804 275 NPGNSGGPLLD-SSG--SLIGINTAI 297 (436)
Q Consensus 275 ~~G~SGGPlvd-~~G--~VVGI~s~~ 297 (436)
+.|++|||++. .+| -||||.+..
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~ 254 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASG 254 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccC
Confidence 89999999983 344 589987654
No 57
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.02 E-value=0.0069 Score=62.87 Aligned_cols=58 Identities=28% Similarity=0.353 Sum_probs=49.7
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLR 418 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~v~R 418 (436)
-|++|.+|.+++||++.||++ ||.|+.||.++..+. +|...+|....+|+.|+|.-++
T Consensus 429 VGIFVaGvqegspA~~eGlqE-----------GDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~ 488 (1027)
T KOG3580|consen 429 VGIFVAGVQEGSPAEQEGLQE-----------GDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQS 488 (1027)
T ss_pred eeEEEeecccCCchhhccccc-----------cceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhh
Confidence 489999999999999999999 999999999988875 4556667778899999886543
No 58
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.87 E-value=0.033 Score=60.80 Aligned_cols=22 Identities=36% Similarity=0.359 Sum_probs=20.3
Q ss_pred eEEEEEEEcCCCEEEecccccC
Q 013804 152 GSGSGFVWDSKGHVVTNYHVIR 173 (436)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~ 173 (436)
+-|||-+|+++|.||||-||..
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred CceeEEEEcCCceEEecchhhh
Confidence 4699999999999999999975
No 59
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.83 E-value=0.056 Score=57.43 Aligned_cols=117 Identities=18% Similarity=0.397 Sum_probs=71.7
Q ss_pred CCCCeEEEEEcCCC-------CCC------cceecCC------CCCCCCCCEEEEEecCCCCCCceeEeEEeeeeeeecc
Q 013804 197 QDKDVAVLRIDAPK-------DKL------RPIPIGV------SADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISS 257 (436)
Q Consensus 197 ~~~DlAlLkv~~~~-------~~~------~~~~l~~------~~~~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~ 257 (436)
.-.|+||++++... +.+ |.+.+.+ ...+..|.+|+-+|.-.+ .+.|.+.++.-....
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw~ 616 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYWA 616 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEec
Confidence 34599999997532 111 1222221 123567999999986644 567777766432211
Q ss_pred CCCCCC-cccEEEEc----cccCCCCCCCeEECCCCc------EEEEEeeeecCCCCCCcceeeeeeeccchhhhh
Q 013804 258 AATGRP-IQDVIQTD----AAINPGNSGGPLLDSSGS------LIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ 322 (436)
Q Consensus 258 ~~~~~~-~~~~i~~~----~~i~~G~SGGPlvd~~G~------VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~ 322 (436)
.+.. ..+++... .-...|+||+=|++.-+. |+||.++..+ ....++.+.|+..|.+=+++
T Consensus 617 --dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg---e~kqfglftPi~~il~rl~~ 687 (695)
T PF08192_consen 617 --DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG---EQKQFGLFTPINEILDRLEE 687 (695)
T ss_pred --CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC---ccceeeccCcHHHHHHHHHH
Confidence 1211 13334443 334579999999985344 9999887643 35578999999887766655
No 60
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=95.49 E-value=0.026 Score=58.76 Aligned_cols=86 Identities=24% Similarity=0.388 Sum_probs=61.4
Q ss_pred eecceeeecchhhhhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcC
Q 013804 332 PILGIKFAPDQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQC 406 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~ 406 (436)
+-+++.+.-....++||+ ..++|..+...+-|++ +|.|+.||+|++|||....| ..|..+++.+.
T Consensus 198 ~p~kv~LvKsR~nEEyGlrLgSqIFvKeit~~gLAar----------dgnlqEGDiiLkINGtvteNmSLtDar~LIEkS 267 (1027)
T KOG3580|consen 198 GPIKVLLVKSRANEEYGLRLGSQIFVKEITRTGLAAR----------DGNLQEGDIILKINGTVTENMSLTDARKLIEKS 267 (1027)
T ss_pred CcceEEEEeeccchhhcccccchhhhhhhcccchhhc----------cCCcccccEEEEECcEeeccccchhHHHHHHhc
Confidence 345666655555677886 6788888877666654 45555599999999987765 45888888763
Q ss_pred CCCCEEEEEEEECCEEEEEEEEe
Q 013804 407 KVGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 407 ~~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
. .++++.|+||.+..-+.+..
T Consensus 268 -~-GKL~lvVlRD~~qtLiNiP~ 288 (1027)
T KOG3580|consen 268 -R-GKLQLVVLRDSQQTLINIPS 288 (1027)
T ss_pred -c-CceEEEEEecCCceeeecCC
Confidence 3 45899999997765565543
No 61
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=95.38 E-value=0.032 Score=58.91 Aligned_cols=58 Identities=22% Similarity=0.357 Sum_probs=49.3
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~ 406 (436)
--+|+.|.... -.-|.|-.|.++++|.++.+++ ||++++|||.+|++..++.+.++..
T Consensus 386 ~~ig~vf~~~~------~~~v~v~tv~~ns~a~k~~~~~-----------gdvlvai~~~pi~s~~q~~~~~~s~ 443 (1051)
T KOG3532|consen 386 SPIGLVFDKNT------NRAVKVCTVEDNSLADKAAFKP-----------GDVLVAINNVPIRSERQATRFLQST 443 (1051)
T ss_pred CceeEEEecCC------ceEEEEEEecCCChhhHhcCCC-----------cceEEEecCccchhHHHHHHHHHhc
Confidence 35777775432 1567899999999999999999 9999999999999999999999884
No 62
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.33 E-value=0.025 Score=48.53 Aligned_cols=33 Identities=21% Similarity=0.424 Sum_probs=23.0
Q ss_pred EEEccccCCCCCCCeEECCCCcEEEEEeeeecC
Q 013804 268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYSP 300 (436)
Q Consensus 268 i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~ 300 (436)
...+..+.+|.||+|+||.+|++|||.......
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 344555778999999999999999998876553
No 63
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.05 E-value=0.03 Score=59.41 Aligned_cols=56 Identities=27% Similarity=0.481 Sum_probs=44.0
Q ss_pred EEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCE
Q 013804 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQ 421 (436)
Q Consensus 354 V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v~R~g~ 421 (436)
|..|.++|||++.| +|++||.|++|||+.|.+.. |+..+++. .|-+|+|+|.-.++
T Consensus 782 iGrIieGSPAdRCg----------kLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIip~ee 839 (984)
T KOG3209|consen 782 IGRIIEGSPADRCG----------KLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTIIPPEE 839 (984)
T ss_pred ccccccCChhHhhc----------cccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEcChhc
Confidence 66778888888764 44459999999999999764 77777776 68889999876443
No 64
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.02 E-value=0.018 Score=62.84 Aligned_cols=59 Identities=22% Similarity=0.273 Sum_probs=39.5
Q ss_pred ccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCC-------CCCcceeeeeeeccchhhhhc
Q 013804 265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSG-------ASSGVGFSIPVDTVNGIVDQL 323 (436)
Q Consensus 265 ~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~-------~~~~~~~aIP~~~i~~~l~~l 323 (436)
.-.+.++..|..||||+|++|.+|+|||++.=+.-.+- ....-+..|-+..|..+++++
T Consensus 621 pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv 686 (698)
T PF10459_consen 621 PVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV 686 (698)
T ss_pred eeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence 34567888999999999999999999999763211110 111234455566666666654
No 65
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=94.66 E-value=0.26 Score=46.37 Aligned_cols=90 Identities=18% Similarity=0.241 Sum_probs=62.0
Q ss_pred ccCcCeEEEEEEEcCCCEEEecccccCCCC----eEEEEecCCcEEe------eEEEEEc-----CCCCeEEEEEcCCC-
Q 013804 147 LEVPQGSGSGFVWDSKGHVVTNYHVIRGAS----DIRVTFADQSAYD------AKIVGFD-----QDKDVAVLRIDAPK- 210 (436)
Q Consensus 147 ~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~----~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLkv~~~~- 210 (436)
..++...|+|++|+++ |+|++..|+.+-. -+.+.++.++.+. -++..+| ++.+++||.++.+.
T Consensus 23 YvdG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~ 101 (267)
T PF09342_consen 23 YVDGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPAN 101 (267)
T ss_pred EEcCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCccc
Confidence 3456789999999987 9999999998743 3677777777543 1233333 67899999998764
Q ss_pred --CCCcceecCC-CCCCCCCCEEEEEecCC
Q 013804 211 --DKLRPIPIGV-SADLLVGQKVYAIGNPF 237 (436)
Q Consensus 211 --~~~~~~~l~~-~~~~~~G~~V~~vG~p~ 237 (436)
..+.|.-+.. ..+....+.++++|.-.
T Consensus 102 fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 102 FTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred ceeeecccccccccCCCCCCCceEEEEccc
Confidence 2344554533 23445566899999653
No 66
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=94.44 E-value=0.14 Score=52.00 Aligned_cols=56 Identities=34% Similarity=0.555 Sum_probs=48.6
Q ss_pred ecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE---EEEEEEE-CCEEE
Q 013804 356 DAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE---VIVEVLR-GDQKE 423 (436)
Q Consensus 356 ~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~---v~l~v~R-~g~~~ 423 (436)
++..++++..+|+++ ||.|+++|++++.+++++.+.+.. ..+.. +.+.+.| +++.+
T Consensus 135 ~v~~~s~a~~a~l~~-----------Gd~iv~~~~~~i~~~~~~~~~~~~-~~~~~~~~~~i~~~~~~~~~~ 194 (375)
T COG0750 135 EVAPKSAAALAGLRP-----------GDRIVAVDGEKVASWDDVRRLLVA-AAGDVFNLLTILVIRLDGEAH 194 (375)
T ss_pred ecCCCCHHHHcCCCC-----------CCEEEeECCEEccCHHHHHHHHHh-ccCCcccceEEEEEeccceee
Confidence 688999999999999 999999999999999999988876 34555 8899999 77663
No 67
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=94.37 E-value=0.061 Score=58.77 Aligned_cols=65 Identities=29% Similarity=0.491 Sum_probs=49.5
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCC
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVG 409 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g 409 (436)
+.||+-|... ..|+|..|.+|+|+ .|+|++||.|++|||++|.+ ++-+.+++... .
T Consensus 65 ~~lGFgfvag--------rPviVr~VT~GGps------------~GKL~PGDQIl~vN~Epv~daprervIdlvRac--e 122 (1298)
T KOG3552|consen 65 ASLGFGFVAG--------RPVIVRFVTEGGPS------------IGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--E 122 (1298)
T ss_pred ccccceeecC--------CceEEEEecCCCCc------------cccccCCCeEEEecCcccccccHHHHHHHHHHH--h
Confidence 5666666432 57899999999995 56777799999999999985 56667777663 3
Q ss_pred CEEEEEEEE
Q 013804 410 DEVIVEVLR 418 (436)
Q Consensus 410 ~~v~l~v~R 418 (436)
+.|.|+|.+
T Consensus 123 ~sv~ltV~q 131 (1298)
T KOG3552|consen 123 SSVNLTVCQ 131 (1298)
T ss_pred hhcceEEec
Confidence 568888876
No 68
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.28 E-value=0.44 Score=48.58 Aligned_cols=41 Identities=24% Similarity=0.552 Sum_probs=30.9
Q ss_pred ccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeeec
Q 013804 271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT 315 (436)
Q Consensus 271 ~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~ 315 (436)
...+..||||+|++ .||++||=++..+-+ ++.-+|+|-++.
T Consensus 354 tgGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~ie~ 394 (402)
T TIGR02860 354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYIEW 394 (402)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehHHH
Confidence 34567899999999 799999988877664 345678875543
No 69
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=93.92 E-value=0.15 Score=48.32 Aligned_cols=59 Identities=25% Similarity=0.430 Sum_probs=46.9
Q ss_pred ccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeC--CHHHHHHHHhcCCCCCEEEEEEEEC
Q 013804 349 VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKVGDEVIVEVLRG 419 (436)
Q Consensus 349 ~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~--s~~dl~~~l~~~~~g~~v~l~v~R~ 419 (436)
+.|++|+...+++-|+..||-. +.|.|++|||.+|. +.+++..+|-.. ...+-++|.-.
T Consensus 193 vpGIFISRlVpGGLAeSTGLLa----------VnDEVlEVNGIEVaGKTLDQVTDMMvAN--shNLIiTVkPA 253 (358)
T KOG3606|consen 193 VPGIFISRLVPGGLAESTGLLA----------VNDEVLEVNGIEVAGKTLDQVTDMMVAN--SHNLIITVKPA 253 (358)
T ss_pred cCceEEEeecCCccccccceee----------ecceeEEEcCEEeccccHHHHHHHHhhc--ccceEEEeccc
Confidence 3799999999999999999865 49999999999997 677888877652 24466666543
No 70
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=93.36 E-value=0.21 Score=53.33 Aligned_cols=58 Identities=29% Similarity=0.476 Sum_probs=46.6
Q ss_pred eEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEEC
Q 013804 352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLRG 419 (436)
Q Consensus 352 v~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g~~v~l~v~R~ 419 (436)
+.|.+|.+++||++. |+|+.||+|+.|||.-+.- -.|.-+.+.....|+.|.|++-|.
T Consensus 373 LqVKsvl~DGPAa~d----------Gkle~GDviV~INg~cvlGhTHAqaV~~fqaiPvg~~V~L~lcRg 432 (984)
T KOG3209|consen 373 LQVKSVLKDGPAAQD----------GKLETGDVIVHINGECVLGHTHAQAVKRFQAIPVGQSVDLVLCRG 432 (984)
T ss_pred eeeeecccCCchhhc----------CccccCcEEEEECCceeccccHHHHHHHhhccccCCeeeEEEecC
Confidence 458888999999875 4555699999999998874 456777777767899999999884
No 71
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=93.16 E-value=0.59 Score=43.31 Aligned_cols=134 Identities=15% Similarity=0.181 Sum_probs=48.1
Q ss_pred EEEecccccCCCCeEEEEecCCcEEee---EEEEEcCCCCeEEEEEcCCC---CCCcceecCCCCCCCCCCEEEEEecCC
Q 013804 164 HVVTNYHVIRGASDIRVTFADQSAYDA---KIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPF 237 (436)
Q Consensus 164 ~ILT~aHvv~~~~~i~V~~~dg~~~~a---~vv~~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~ 237 (436)
.++|+.||..+...+. .+.+|+.++- +.+..+...|++||++.... .....+.+.....+..| .+..
T Consensus 43 ~L~ta~Hv~~~~~~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g----~~~~-- 115 (203)
T PF02122_consen 43 ALLTARHVWSRPSKVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG----PVSF-- 115 (203)
T ss_dssp EEEE-HHHHTSSS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE----ESST--
T ss_pred ceecccccCCCcccee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC----Ceee--
Confidence 5999999999855543 3445555442 35556788899999997321 12222333211111000 0100
Q ss_pred CCCCceeEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCCeEECCCCcEEEEEeeeecCCCCCCcceeeeeee
Q 013804 238 GLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVD 314 (436)
Q Consensus 238 g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~ 314 (436)
.....+........+.. . ...+...-+...+|.||.|+++.+ ++||++... .........++..|+.
T Consensus 116 ---y~~~~~~~~~~sa~i~g--~---~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~-~~~~~~~n~n~~spip 182 (203)
T PF02122_consen 116 ---YGFSSGEWPCSSAKIPG--T---EGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS-PSGSNRENNNRMSPIP 182 (203)
T ss_dssp ---TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE-----------------
T ss_pred ---eeecCCCceeccCcccc--c---cCcCCceEcCCCCCCCCCCeEECC-CceEeecCc-cccccccccccccccc
Confidence 11122111111111111 1 123556667788999999999887 999999875 2222233445544443
No 72
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=91.94 E-value=0.12 Score=54.71 Aligned_cols=57 Identities=26% Similarity=0.366 Sum_probs=43.7
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLR 418 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R 418 (436)
-|++|.+|.+++.|++.|++- ||.|++|||+...+.. +.++..-+..+..+.+++.-
T Consensus 562 fgifV~~V~pgskAa~~GlKR-----------gDqilEVNgQnfenis-~~KA~eiLrnnthLtltvKt 618 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKR-----------GDQILEVNGQNFENIS-AKKAEEILRNNTHLTLTVKT 618 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhh-----------hhhhhhccccchhhhh-HHHHHHHhcCCceEEEEEec
Confidence 689999999999999999999 9999999999877654 33333333344566666654
No 73
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=91.58 E-value=0.38 Score=48.72 Aligned_cols=69 Identities=26% Similarity=0.399 Sum_probs=51.4
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEEC--CEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRG--DQKEKIPV 427 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~--g~~~~~~v 427 (436)
.|.-|.+|.++|++.++||.+ --|-|++|||..+..-+|..+.+.+... ++|+++|.-. -+.+.++|
T Consensus 15 eg~hvlkVqedSpa~~aglep----------ffdFIvSI~g~rL~~dnd~Lk~llk~~s-ekVkltv~n~kt~~~R~v~I 83 (462)
T KOG3834|consen 15 EGYHVLKVQEDSPAHKAGLEP----------FFDFIVSINGIRLNKDNDTLKALLKANS-EKVKLTVYNSKTQEVRIVEI 83 (462)
T ss_pred eeEEEEEeecCChHHhcCcch----------hhhhhheeCcccccCchHHHHHHHHhcc-cceEEEEEecccceeEEEEe
Confidence 577788999999999999998 3899999999999987776666665333 3499988753 23344444
Q ss_pred Ee
Q 013804 428 KL 429 (436)
Q Consensus 428 ~~ 429 (436)
+.
T Consensus 84 ~p 85 (462)
T KOG3834|consen 84 VP 85 (462)
T ss_pred cc
Confidence 43
No 74
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=91.38 E-value=0.4 Score=46.27 Aligned_cols=56 Identities=25% Similarity=0.396 Sum_probs=41.9
Q ss_pred ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 013804 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR 418 (436)
Q Consensus 351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g~~v~l~v~R 418 (436)
-++|..|..++||++.| .++-||.|++|||..|+. --++-++++.. -.+|++++..
T Consensus 31 ClYiVQvFD~tPAa~dG----------~i~~GDEi~avNg~svKGktKveVAkmIQ~~--~~eV~IhyNK 88 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDG----------RIRCGDEIVAVNGISVKGKTKVEVAKMIQVS--LNEVKIHYNK 88 (429)
T ss_pred eEEEEEeccCCchhccC----------ccccCCeeEEecceeecCccHHHHHHHHHHh--ccceEEEehh
Confidence 47899999999998755 344499999999999985 44677777763 2457777653
No 75
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=91.35 E-value=0.34 Score=47.60 Aligned_cols=56 Identities=32% Similarity=0.404 Sum_probs=45.9
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVL 417 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~v~ 417 (436)
-.|+|+++.++-.|+..|+-= .||-|+.|||..|+.. +|+..+|.+ .|+.|+++|.
T Consensus 80 ~PvviSkI~kdQaAd~tG~LF----------vGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~ 137 (505)
T KOG3549|consen 80 LPVVISKIYKDQAADITGQLF----------VGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVK 137 (505)
T ss_pred ccEEeehhhhhhhhhhcCceE----------eeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeH
Confidence 368899999998888887542 3999999999999864 578888876 7899999885
No 76
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in total loss of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=90.85 E-value=0.47 Score=40.44 Aligned_cols=29 Identities=31% Similarity=0.556 Sum_probs=23.8
Q ss_pred ccccCCCCCCCeEECCCCcEEEEEeeeec
Q 013804 271 DAAINPGNSGGPLLDSSGSLIGINTAIYS 299 (436)
Q Consensus 271 ~~~i~~G~SGGPlvd~~G~VVGI~s~~~~ 299 (436)
...-.+|+||-|++|..|+||||+-.+..
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~n 128 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGAN 128 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence 44567999999999999999999887643
No 77
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=90.82 E-value=0.8 Score=39.78 Aligned_cols=55 Identities=31% Similarity=0.398 Sum_probs=37.7
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEV 416 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v 416 (436)
..++|+.+.|++.|++- |.|+.||.+++|||..|..-. ...++|+. .. ..|++.|
T Consensus 115 spiyisriipggvadrh----------gglkrgdqllsvngvsvege~hekavellka-a~-gsvklvv 171 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRH----------GGLKRGDQLLSVNGVSVEGEHHEKAVELLKA-AV-GSVKLVV 171 (207)
T ss_pred CceEEEeecCCcccccc----------CcccccceeEeecceeecchhhHHHHHHHHH-hc-CcEEEEE
Confidence 67999999999998764 334449999999999987532 23334444 23 3466654
No 78
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=90.61 E-value=0.98 Score=50.06 Aligned_cols=65 Identities=17% Similarity=0.248 Sum_probs=37.2
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCCeEEEEecC--CcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceec
Q 013804 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD--QSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPI 218 (436)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d--g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l 218 (436)
...|...+|++. ||+|.+|+..+... |.|.+ +..|...----++..|+.+-|++.-..+..|+.+
T Consensus 64 ~~~G~aTLigpq-YiVSV~HN~~gy~~--v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~ 130 (769)
T PF02395_consen 64 RNKGVATLIGPQ-YIVSVKHNGKGYNS--VSFGNEGQNTYKIVDRNNYPSGDFHMPRLNKFVTEVAPAEM 130 (769)
T ss_dssp TTTSS-EEEETT-EEEBETTG-TSCCE--ECESCSSTCEEEEEEEEBETTSTEBEEEESS---SS----B
T ss_pred cCCceEEEecCC-eEEEEEccCCCcCc--eeecccCCceEEEEEccCCCCcccceeecCceEEEEecccc
Confidence 344789999986 99999999966554 44443 4455433223334579999999864334444444
No 79
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=90.15 E-value=0.42 Score=47.65 Aligned_cols=73 Identities=27% Similarity=0.388 Sum_probs=51.0
Q ss_pred ceecceeeecchhhhhcCccceEEEecCCCCcccccC-cccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCC
Q 013804 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAG-LLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCK 407 (436)
Q Consensus 331 ~~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~ag-l~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~ 407 (436)
-+-|||.+.-.... .-.++|+++.++-.|++++ |.. ||.|++|||....+. ++..++|+.
T Consensus 95 ~gGLGISIKGGreN----kMPIlISKIFkGlAADQt~aL~~-----------gDaIlSVNG~dL~~AtHdeAVqaLKr-- 157 (506)
T KOG3551|consen 95 AGGLGISIKGGREN----KMPILISKIFKGLAADQTGALFL-----------GDAILSVNGEDLRDATHDEAVQALKR-- 157 (506)
T ss_pred CCcceEEeecCccc----CCceehhHhccccccccccceee-----------ccEEEEecchhhhhcchHHHHHHHHh--
Confidence 46788887743211 1478999999999888875 445 999999999988754 455566665
Q ss_pred CCCEEEEE--EEECC
Q 013804 408 VGDEVIVE--VLRGD 420 (436)
Q Consensus 408 ~g~~v~l~--v~R~g 420 (436)
.|+.|.++ +.|+-
T Consensus 158 aGkeV~levKy~REv 172 (506)
T KOG3551|consen 158 AGKEVLLEVKYMREV 172 (506)
T ss_pred hCceeeeeeeeehhc
Confidence 67776555 45643
No 80
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=89.58 E-value=0.63 Score=47.98 Aligned_cols=72 Identities=26% Similarity=0.377 Sum_probs=46.5
Q ss_pred eecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--H----HHHHHhc
Q 013804 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--D----LYRILDQ 405 (436)
Q Consensus 332 ~~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--d----l~~~l~~ 405 (436)
++||+.+.-..+++ |-.|++|.+|.+++..+. +|.+.+||.|+.||.....++. | |++++.+
T Consensus 261 nfLGiSivgqsn~r--gDggIYVgsImkgGAVA~----------DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~ 328 (626)
T KOG3571|consen 261 NFLGISIVGQSNAR--GDGGIYVGSIMKGGAVAL----------DGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSR 328 (626)
T ss_pred ccceeEeecccCcC--CCCceEEeeeccCceeec----------cCccCccceEEEeeecchhhcCchHHHHHHHHHhcc
Confidence 67777765422211 337999999999887654 5556669999999998766543 3 3334433
Q ss_pred CCCCCEEEEEEEE
Q 013804 406 CKVGDEVIVEVLR 418 (436)
Q Consensus 406 ~~~g~~v~l~v~R 418 (436)
+| .++++|-.
T Consensus 329 --~g-Pi~ltvAk 338 (626)
T KOG3571|consen 329 --PG-PIKLTVAK 338 (626)
T ss_pred --CC-CeEEEEee
Confidence 32 36777654
No 81
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=88.69 E-value=0.64 Score=49.35 Aligned_cols=112 Identities=23% Similarity=0.391 Sum_probs=74.2
Q ss_pred CCCCCCeEE-----CCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhccccceec------ceecceeeecchhh
Q 013804 276 PGNSGGPLL-----DSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVT------RPILGIKFAPDQSV 344 (436)
Q Consensus 276 ~G~SGGPlv-----d~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~v~------~~~lGv~~~~~~~~ 344 (436)
.-|+|||.- |...+++.|+-.. -..+|.+..+..++.+++.-.|+ .|..-+.+...+..
T Consensus 679 nmm~~GpAarsgkLnIGDQiiaING~S----------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~k 748 (829)
T KOG3605|consen 679 NMMHGGPAARSGKLNIGDQIMSINGTS----------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLR 748 (829)
T ss_pred hcccCChhhhcCCccccceeEeecCce----------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccch
Confidence 457888873 4444677764322 13489999999999988766553 24444444444445
Q ss_pred hhcCc---cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC-H-HHHHHHHhcCCCCC
Q 013804 345 EQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN-G-SDLYRILDQCKVGD 410 (436)
Q Consensus 345 ~~~g~---~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s-~-~dl~~~l~~~~~g~ 410 (436)
.++|. .|| |-.+..++.|++-|+++ |-.|++|||+.|-- . +-+.++|.. ..|+
T Consensus 749 yQLGFSVQNGi-ICSLlRGGIAERGGVRV-----------GHRIIEINgQSVVA~pHekIV~lLs~-aVGE 806 (829)
T KOG3605|consen 749 YQLGFSVQNGI-ICSLLRGGIAERGGVRV-----------GHRIIEINGQSVVATPHEKIVQLLSN-AVGE 806 (829)
T ss_pred hhccceeeCcE-eehhhcccchhccCcee-----------eeeEEEECCceEEeccHHHHHHHHHH-hhhh
Confidence 55664 564 55678999999999999 99999999997753 2 235555554 3453
No 82
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=87.86 E-value=1 Score=47.03 Aligned_cols=68 Identities=28% Similarity=0.454 Sum_probs=52.5
Q ss_pred ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCC--HHHHHHHHhcCCCCC
Q 013804 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGD 410 (436)
Q Consensus 333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s--~~dl~~~l~~~~~g~ 410 (436)
.||+.+..... .-++|..+..|+.+.+.|+- +.||.|++|||..|.+ ..++..++.+.. .
T Consensus 135 plG~Tik~~e~------~~~~vARI~~GG~~~r~glL----------~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~--G 196 (542)
T KOG0609|consen 135 PLGATIRVEED------TKVVVARIMHGGMADRQGLL----------HVGDEILEVNGISVANKSPEELQELLRNSR--G 196 (542)
T ss_pred ccceEEEeccC------CccEEeeeccCCcchhccce----------eeccchheecCeecccCCHHHHHHHHHhCC--C
Confidence 58888875431 25899999999999988853 2499999999999986 578999998854 4
Q ss_pred EEEEEEEE
Q 013804 411 EVIVEVLR 418 (436)
Q Consensus 411 ~v~l~v~R 418 (436)
.++++|.-
T Consensus 197 ~itfkiiP 204 (542)
T KOG0609|consen 197 SITFKIIP 204 (542)
T ss_pred cEEEEEcc
Confidence 57777654
No 83
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=87.79 E-value=0.66 Score=51.25 Aligned_cols=60 Identities=30% Similarity=0.428 Sum_probs=44.9
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQ 421 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v~R~g~ 421 (436)
-|+||.+|.+|++|+. +|.|+.||.+++|||+..-... +.-++|- .-|..|.++|...|.
T Consensus 960 lGIYvKsVV~GgaAd~----------DGRL~aGDQLLsVdG~SLiGisQErAA~lmt--rtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAADH----------DGRLEAGDQLLSVDGHSLIGISQERAARLMT--RTGNVVHLEVAKQGA 1021 (1629)
T ss_pred cceEEEEeccCCcccc----------ccccccCceeeeecCcccccccHHHHHHHHh--ccCCeEEEehhhhhh
Confidence 5899999999999865 5566669999999999776544 3334443 467888998876553
No 84
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=86.67 E-value=0.44 Score=40.69 Aligned_cols=117 Identities=21% Similarity=0.242 Sum_probs=56.5
Q ss_pred EEEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEE
Q 013804 154 GSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI 233 (436)
Q Consensus 154 GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~v 233 (436)
--|+.|+ |-.-|.+|--....- - |..-+....+.+...|+..-....-...+.|-.-+. +.+|++
T Consensus 14 fmgt~vn--GV~wT~~HGagsrtl---A---gp~Gpv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~-------~dlylV 78 (148)
T PF02907_consen 14 FMGTCVN--GVMWTVYHGAGSRTL---A---GPKGPVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS-------SDLYLV 78 (148)
T ss_dssp EEEEEET--TEEEEEHHHHTTSEE---E---BTTSEB-ESEEETTTTEEEEE-STTB--BBB-SSSS-------SEEEEE
T ss_pred eehhEEc--cEEEEEEecCCcccc---c---CCCCcceEeEEcCCCCCcccccccccccCCccccCC-------ccEEEE
Confidence 3477774 788888885433110 0 111123344566778888877754333444433321 346666
Q ss_pred ecCCCCCCceeEeEEeeeeeeeccCCCCCCcccEE-EEccccCCCCCCCeEECCCCcEEEEEeeeecC
Q 013804 234 GNPFGLDHTLTTGVISGLRREISSAATGRPIQDVI-QTDAAINPGNSGGPLLDSSGSLIGINTAIYSP 300 (436)
Q Consensus 234 G~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~i-~~~~~i~~G~SGGPlvd~~G~VVGI~s~~~~~ 300 (436)
-+- ..+-.+ ++. ++....++ -.......|.||||++-.+|.+|||..+....
T Consensus 79 tr~----~~v~p~-----rr~------gd~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~t 131 (148)
T PF02907_consen 79 TRD----ADVIPV-----RRR------GDSRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVCT 131 (148)
T ss_dssp -TT----S-EEEE-----EEE------STTEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEEE
T ss_pred ecc----CcEeee-----EEc------CCCceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEEc
Confidence 322 111111 111 01001111 11122347999999999999999998766554
No 85
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=86.39 E-value=1.1 Score=44.93 Aligned_cols=45 Identities=38% Similarity=0.505 Sum_probs=37.7
Q ss_pred cceEEEecCCCCccccc-CcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhc
Q 013804 350 SGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQ 405 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~a-gl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~ 405 (436)
.||.|++|...||...- ||.+ ||+|+++||-+|.+.+|..+.++.
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~v-----------gdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSV-----------GDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred ceEEEEeccccCCCcCcccCCc-----------cceEEecCCcccCCHHHHHHHHHh
Confidence 79999999998887432 5666 999999999999999888777664
No 86
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=81.90 E-value=7 Score=33.20 Aligned_cols=32 Identities=31% Similarity=0.471 Sum_probs=24.3
Q ss_pred ccEEEEccccCCCCCCCeEECCCCcEEEEEeee
Q 013804 265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (436)
Q Consensus 265 ~~~i~~~~~i~~G~SGGPlvd~~G~VVGI~s~~ 297 (436)
.+++....+..||+-||+|+- +--||||++++
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~C-~HGViGi~Tag 109 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILRC-KHGVIGIVTAG 109 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEEE-TTCEEEEEEEE
T ss_pred cCceeecccCCCCCCCceeEe-CCCeEEEEEeC
Confidence 455666778899999999994 45599999986
No 87
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=81.65 E-value=2.2 Score=48.25 Aligned_cols=50 Identities=34% Similarity=0.493 Sum_probs=39.1
Q ss_pred EEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEE
Q 013804 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVE 415 (436)
Q Consensus 353 ~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~ 415 (436)
.|..|.+++||..+|+++ ||.|+.|||+.|... .++.+.+.+ .|..+.+.
T Consensus 661 ~v~sv~egsPA~~agls~-----------~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~ 712 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSA-----------GDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLR 712 (1205)
T ss_pred eeeeecCCCCccccCCCc-----------cceeEeccCcccchhhHHHHHHHHHh--cCCeeEEE
Confidence 577899999999999999 999999999999864 366666654 34445443
No 88
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=80.39 E-value=6.4 Score=32.37 Aligned_cols=53 Identities=25% Similarity=0.350 Sum_probs=34.7
Q ss_pred EEEEEcCCCEEEecccccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCCCCCcceecC
Q 013804 155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIG 219 (436)
Q Consensus 155 SGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~ 219 (436)
-++-|. +|..+|+.||.+..+.+ +|..+ +++. ...|+|+++.+.. .++.++++
T Consensus 2 ~avHIG-nG~~vt~tHva~~~~~v-----~g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig 54 (105)
T PF03510_consen 2 WAVHIG-NGRYVTVTHVAKSSDSV-----DGQPF--KIVK--TDGELCWVQSPLV--HLPAAQIG 54 (105)
T ss_pred ceEEeC-CCEEEEEEEEeccCceE-----cCcCc--EEEE--eccCEEEEECCCC--CCCeeEec
Confidence 356675 68999999999887654 22222 2222 3559999999753 35666664
No 89
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=79.35 E-value=2.6 Score=42.91 Aligned_cols=65 Identities=25% Similarity=0.419 Sum_probs=49.2
Q ss_pred EEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCE--EEEEEEEe
Q 013804 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQ--KEKIPVKL 429 (436)
Q Consensus 354 V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~--~~~~~v~~ 429 (436)
|-+|.+++||+.|||.+ .+|.|+-+-...-...+|+...|.. ..++.+++-|.--+. .++++++.
T Consensus 113 vl~V~p~SPaalAgl~~----------~~DYivG~~~~~~~~~eDl~~lIes-he~kpLklyVYN~D~d~~ReVti~p 179 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRP----------YTDYIVGIWDAVMHEEEDLFTLIES-HEGKPLKLYVYNHDTDSCREVTITP 179 (462)
T ss_pred eeecCCCCHHHhccccc----------ccceEecchhhhccchHHHHHHHHh-ccCCCcceeEeecCCCccceEEeec
Confidence 66788999999999996 3999999944445677899999988 468889998876433 34455443
No 90
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=72.72 E-value=2.8 Score=42.59 Aligned_cols=24 Identities=25% Similarity=0.504 Sum_probs=21.0
Q ss_pred cccCCCCCCCeEECCCCcEEEEEe
Q 013804 272 AAINPGNSGGPLLDSSGSLIGINT 295 (436)
Q Consensus 272 ~~i~~G~SGGPlvd~~G~VVGI~s 295 (436)
..+..|.||+.|+|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 355689999999999999999964
No 91
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=68.88 E-value=5.2 Score=42.82 Aligned_cols=58 Identities=28% Similarity=0.440 Sum_probs=38.6
Q ss_pred ceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEE
Q 013804 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLR 418 (436)
Q Consensus 351 gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~--~dl~~~l~~~~~g~~v~l~v~R 418 (436)
-|+|.....++||++.| +|-.||.|++|||...--. ..-+.+++..+.-..|+++|.+
T Consensus 674 TVViAnmm~~GpAarsg----------kLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~ 733 (829)
T KOG3605|consen 674 TVVIANMMHGGPAARSG----------KLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS 733 (829)
T ss_pred HHHHHhcccCChhhhcC----------CccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence 34556677888988765 3445999999999866542 3445566665544556666654
No 92
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=68.10 E-value=19 Score=36.88 Aligned_cols=137 Identities=20% Similarity=0.304 Sum_probs=67.0
Q ss_pred cCeEEEEEEEcCCCEEEecccccCCCCe-EEEEecCCcEEeeEEEEEcCCCCeEEEEEcCCC-CCCcceecCCCCCCCCC
Q 013804 150 PQGSGSGFVWDSKGHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVG 227 (436)
Q Consensus 150 ~~~~GSGfiI~~~G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~~~l~~~~~~~~G 227 (436)
.-+.|-||-|+++ ..+|+-||+..... + | | .+..-+.++..-+++-+++..+. .+++-+-|. +-...|
T Consensus 377 ~fGsGWGfWVS~~-lfITttHViP~g~~E~---F--G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--eGapEG 446 (535)
T PF05416_consen 377 KFGSGWGFWVSPT-LFITTTHVIPPGAKEA---F--G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--EGAPEG 446 (535)
T ss_dssp EETTEEEEESSSS-EEEEEGGGS-STTSEE---T--T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---SS--TT
T ss_pred ecCCceeeeecce-EEEEeeeecCCcchhh---h--C--CChhHeEEeeccceEEEecCCCCCCCccceeec--cCCCCc
Confidence 3467999999987 99999999975322 1 0 0 11122344556678888887653 245545552 223445
Q ss_pred CEEEE-EecCCCCC--CceeEeEEeeeeeeeccCCCCCCcccEEEE-------ccccCCCCCCCeEECCCC---cEEEEE
Q 013804 228 QKVYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSG---SLIGIN 294 (436)
Q Consensus 228 ~~V~~-vG~p~g~~--~~~~~G~vs~~~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPlvd~~G---~VVGI~ 294 (436)
.-+.+ +=.+.|.- ..+..|...+..-.-..- .+ ...++.+ |-...||+-|.|-|-..| -|+|||
T Consensus 447 tV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v-~G--Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH 523 (535)
T PF05416_consen 447 TVCSILIKRPSGELLPLAVRMGTHASMKIQGRTV-HG--QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVH 523 (535)
T ss_dssp -EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEE-EE--EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEE
T ss_pred eEEEEEEEcCCccchhhhhhhccceeEEEcceee-cc--eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEE
Confidence 54443 34554432 234444443222110000 00 0112222 234558999999996555 499999
Q ss_pred eeeec
Q 013804 295 TAIYS 299 (436)
Q Consensus 295 s~~~~ 299 (436)
++...
T Consensus 524 ~AAtr 528 (535)
T PF05416_consen 524 AAATR 528 (535)
T ss_dssp EEE-S
T ss_pred ehhcc
Confidence 87643
No 93
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=55.78 E-value=42 Score=30.55 Aligned_cols=38 Identities=29% Similarity=0.299 Sum_probs=31.4
Q ss_pred ecceeeecchhhhhcCccceEEEecCCCCcccccCcccccccccCcccCCcEEEEE
Q 013804 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSV 388 (436)
Q Consensus 333 ~lGv~~~~~~~~~~~g~~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~v 388 (436)
..|+.+.++. +.++|..+.-+|||+++|+.- |+.|++|
T Consensus 112 ~~GL~l~~e~-------~~~~Vd~v~fgS~A~~~g~d~-----------d~~I~~v 149 (183)
T PF11874_consen 112 AAGLTLMEEG-------GKVIVDEVEFGSPAEKAGIDF-----------DWEITEV 149 (183)
T ss_pred hCCCEEEeeC-------CEEEEEecCCCCHHHHcCCCC-----------CcEEEEE
Confidence 3477776544 568999999999999999998 8988887
No 94
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=54.26 E-value=14 Score=35.32 Aligned_cols=59 Identities=14% Similarity=0.325 Sum_probs=43.5
Q ss_pred cceEEEecCCCCcccccCcccccccccCcccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEE
Q 013804 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLR 418 (436)
Q Consensus 350 ~gv~V~~v~~~spa~~agl~~~~~~~~~~l~~GDiIl~vnG~~V~s~~--dl~~~l~~~~~g~~v~l~v~R 418 (436)
.-++|..+.++|.-++.- .+++||.|-+|||+.|-.+. ++-++|+....|++.++.+..
T Consensus 149 GyAFIKrIkegsvidri~----------~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie 209 (334)
T KOG3938|consen 149 GYAFIKRIKEGSVIDRIE----------AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE 209 (334)
T ss_pred ceeeeEeecCCchhhhhh----------heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence 345666676766655432 24459999999999998775 677889998889988887653
No 95
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=48.51 E-value=65 Score=23.80 Aligned_cols=34 Identities=15% Similarity=0.300 Sum_probs=29.1
Q ss_pred CCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804 175 ASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (436)
Q Consensus 175 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (436)
...+.+.+..|..++++++.+|....+.+|+...
T Consensus 6 Gs~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 6 GSQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred ccEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 3456778888999999999999999999998654
No 96
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=47.70 E-value=18 Score=25.16 Aligned_cols=21 Identities=38% Similarity=0.616 Sum_probs=17.8
Q ss_pred CCCCCCeEECCCCcEEEEEee
Q 013804 276 PGNSGGPLLDSSGSLIGINTA 296 (436)
Q Consensus 276 ~G~SGGPlvd~~G~VVGI~s~ 296 (436)
.+.+.-|++|.+|+++|+++.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 456788999999999999874
No 97
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.17 E-value=61 Score=23.32 Aligned_cols=33 Identities=18% Similarity=0.415 Sum_probs=28.5
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (436)
..+.|.+.||+.+.+.+..+|...++.+-....
T Consensus 7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence 468899999999999999999999888776643
No 98
>PRK14864 putative biofilm stress and motility protein A; Provisional
Probab=44.36 E-value=40 Score=27.72 Aligned_cols=55 Identities=11% Similarity=0.058 Sum_probs=25.4
Q ss_pred HHHHHHHHHHHHHHhhccccCcccccccCCccccCccchhHHHHHHHhCCceEEEE
Q 013804 78 SLFVFCGSVVLSFTLLFSNVDSASAFVVTPQRKLQTDELATVRLFQENTPSVVNIT 133 (436)
Q Consensus 78 ~~~~~~~~l~l~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sVV~I~ 133 (436)
+.++.+..+++++.|.+++..+..+. ++|++...+++....+....-+=.+|.|.
T Consensus 3 ~~mk~~~~l~~~l~LS~~s~~~~~p~-~~p~~~~~A~eI~~~qa~~lq~iGtVSvs 57 (104)
T PRK14864 3 MVMRRFASLLLTLLLSACSALQGTPQ-PAPPPADHAQEIRRAQTQGLQKMGTVSAL 57 (104)
T ss_pred hHHHHHHHHHHHHHHhhhhhcccCCC-CCCCccccceecCHHHhhCCceeeEEEEe
Confidence 34555555555555555655554433 33333445555554433222222355554
No 99
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.48 E-value=44 Score=26.50 Aligned_cols=37 Identities=5% Similarity=0.297 Sum_probs=30.9
Q ss_pred ccCCCCeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 171 vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
++.....+.|.+.+++.+.+++.++|...++.|=...
T Consensus 10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 3445578999999999999999999999998876653
No 100
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=39.32 E-value=72 Score=23.72 Aligned_cols=33 Identities=9% Similarity=0.245 Sum_probs=29.3
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (436)
..+.|.+.+|+.+.+++.++|+..++.+-....
T Consensus 11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence 468899999999999999999999998887753
No 101
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=39.15 E-value=75 Score=24.02 Aligned_cols=32 Identities=13% Similarity=0.330 Sum_probs=28.4
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.+|+.+.+++.++|+..++.+=...
T Consensus 15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred CEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence 46889999999999999999999998887764
No 102
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=37.95 E-value=67 Score=23.96 Aligned_cols=32 Identities=13% Similarity=0.217 Sum_probs=27.8
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.+|+.+.+++..+|...++.+=.+.
T Consensus 12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 46889999999999999999999888876653
No 103
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=37.72 E-value=76 Score=23.56 Aligned_cols=32 Identities=13% Similarity=0.233 Sum_probs=28.0
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.+|+.|.+++.++|+..++.+=...
T Consensus 11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 46889999999999999999999988886654
No 104
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.51 E-value=87 Score=24.04 Aligned_cols=32 Identities=13% Similarity=0.215 Sum_probs=27.7
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.||+.+.+.+..+|...+|.+=...
T Consensus 11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 46889999999999999999999998775553
No 105
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=36.50 E-value=26 Score=32.69 Aligned_cols=55 Identities=15% Similarity=0.412 Sum_probs=37.2
Q ss_pred ccEEEEccccCCCCCCCeEECC----CCcEEEEEeeeecCCCCCCcceeeeee--eccchhhhhc
Q 013804 265 QDVIQTDAAINPGNSGGPLLDS----SGSLIGINTAIYSPSGASSGVGFSIPV--DTVNGIVDQL 323 (436)
Q Consensus 265 ~~~i~~~~~i~~G~SGGPlvd~----~G~VVGI~s~~~~~~~~~~~~~~aIP~--~~i~~~l~~l 323 (436)
...+++..+...|+=|||++-. --+++||+.++.. ..+.+||-++ +.+++-+++|
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~----~~~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA----NHAMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc----cccceehhhhhHHHHHHHHHhh
Confidence 3456777888899999999732 3589999998753 3456777665 3444444433
No 106
>COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only]
Probab=36.45 E-value=29 Score=34.10 Aligned_cols=40 Identities=20% Similarity=0.398 Sum_probs=33.8
Q ss_pred EEEECCEEeCCHHHHHHHHhc-CCCCCEEEEEEEECCEEEE
Q 013804 385 ITSVNGKKVSNGSDLYRILDQ-CKVGDEVIVEVLRGDQKEK 424 (436)
Q Consensus 385 Il~vnG~~V~s~~dl~~~l~~-~~~g~~v~l~v~R~g~~~~ 424 (436)
+-++.|.+|-|.+|+..+++- ..+||++++++.++|++..
T Consensus 269 Vae~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE~~ 309 (356)
T COG4956 269 VAELQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKEPG 309 (356)
T ss_pred HHhhcCCceecHHHHHHHhCCcccCCCeeEEEEeecCcccC
Confidence 456788899999999999983 5789999999999998753
No 107
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.96 E-value=69 Score=24.89 Aligned_cols=31 Identities=10% Similarity=0.271 Sum_probs=27.0
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (436)
..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus 12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 4788999999999999999999998886544
No 108
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.95 E-value=86 Score=24.13 Aligned_cols=32 Identities=19% Similarity=0.429 Sum_probs=27.7
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.||+.+.+.+.++|...++.|=...
T Consensus 11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 46889999999999999999999998876553
No 109
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.61 E-value=88 Score=24.32 Aligned_cols=31 Identities=23% Similarity=0.334 Sum_probs=26.9
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (436)
..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus 13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence 4688999999999999999999998877554
No 110
>PF04225 OapA: Opacity-associated protein A LysM-like domain; InterPro: IPR007340 This entry includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonization, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation.; PDB: 2GU1_A.
Probab=33.33 E-value=21 Score=28.15 Aligned_cols=53 Identities=17% Similarity=0.246 Sum_probs=27.4
Q ss_pred ccCCcEEEEE---CCEEeCCHHHHHH------HHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 013804 379 LILGDIITSV---NGKKVSNGSDLYR------ILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 431 (436)
Q Consensus 379 l~~GDiIl~v---nG~~V~s~~dl~~------~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 431 (436)
++.||-+-.| .|.+..+...+.+ .|...+||+++.+.+..+|+...+.+....
T Consensus 7 V~~GDtLs~iF~~~gls~~dl~~v~~~~~~~k~L~~L~pGq~l~f~~d~~g~L~~L~~~~~~ 68 (85)
T PF04225_consen 7 VKSGDTLSTIFRRAGLSASDLYAVLEADGEAKPLTRLKPGQTLEFQLDEDGQLTALRYERSP 68 (85)
T ss_dssp --TT--HHHHHHHTT--HHHHHHHHHHGGGT--GGG--TT-EEEEEE-TTS-EEEEEEEEET
T ss_pred ECCCCcHHHHHHHcCCCHHHHHHHHhccCccchHhhCCCCCEEEEEECCCCCEEEEEEEcCC
Confidence 3447766555 4655544444433 455678999999999999998887776543
No 111
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=32.13 E-value=64 Score=24.85 Aligned_cols=47 Identities=17% Similarity=0.167 Sum_probs=33.0
Q ss_pred CcEEEEECCEEeCCHHHHHHHHh-cCCCCC----EEEEEEEECCEEEEEEEE
Q 013804 382 GDIITSVNGKKVSNGSDLYRILD-QCKVGD----EVIVEVLRGDQKEKIPVK 428 (436)
Q Consensus 382 GDiIl~vnG~~V~s~~dl~~~l~-~~~~g~----~v~l~v~R~g~~~~~~v~ 428 (436)
-|-.+.+||++.++....+.... .+..|. ++..++.|||+..+.+-+
T Consensus 11 adAkl~v~G~~t~~~G~~R~F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~~ 62 (75)
T TIGR03000 11 ADAKLKVDGKETNGTGTVRTFTTPPLEAGKEYEYTVTAEYDRDGRILTRTRT 62 (75)
T ss_pred CCCEEEECCeEcccCccEEEEECCCCCCCCEEEEEEEEEEecCCcEEEEEEE
Confidence 58889999999998776555443 244564 477778899987655433
No 112
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=31.99 E-value=93 Score=23.92 Aligned_cols=31 Identities=16% Similarity=0.409 Sum_probs=27.2
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (436)
..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus 14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 5788999999999999999999998886554
No 113
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=31.67 E-value=47 Score=27.40 Aligned_cols=18 Identities=33% Similarity=0.617 Sum_probs=13.3
Q ss_pred CeEECCCCcEEEEEeeee
Q 013804 281 GPLLDSSGSLIGINTAIY 298 (436)
Q Consensus 281 GPlvd~~G~VVGI~s~~~ 298 (436)
.|++|.+|++||+++.+.
T Consensus 94 ~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEE-TTS-EEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEE
Confidence 588889999999988654
No 114
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=31.23 E-value=1.2e+02 Score=22.97 Aligned_cols=32 Identities=9% Similarity=0.165 Sum_probs=27.4
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.+|+.+.+++.++|...+|.+=...
T Consensus 11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (72)
T cd01719 11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV 42 (72)
T ss_pred CeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence 46889999999999999999999888875553
No 115
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.81 E-value=1.1e+02 Score=23.35 Aligned_cols=31 Identities=16% Similarity=0.185 Sum_probs=27.1
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEE
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (436)
..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus 13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 4688999999999999999999988887554
No 116
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=30.79 E-value=1.1e+02 Score=23.83 Aligned_cols=47 Identities=19% Similarity=0.376 Sum_probs=31.3
Q ss_pred EeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEE-EecC
Q 013804 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYA-IGNP 236 (436)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~-vG~p 236 (436)
++++++..+...++|++.+-.-...+ -+.|-. .+++.|++|++ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV-~l~Lv~-~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREV-NLDLVG-EEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEE-Eeeeec-CccccCCEEEEEeeEE
Confidence 57888999988889999986422111 222322 26789999886 6764
No 117
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=29.78 E-value=1.2e+02 Score=21.99 Aligned_cols=32 Identities=19% Similarity=0.413 Sum_probs=27.7
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.||+.+.+.+..+|...++-+=...
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~ 40 (67)
T smart00651 9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE 40 (67)
T ss_pred cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence 46889999999999999999999888876654
No 118
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions.; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=28.86 E-value=53 Score=24.89 Aligned_cols=31 Identities=26% Similarity=0.588 Sum_probs=22.9
Q ss_pred CeEECCCCcEEEEEeeeecCCCCCCcceeeeeeeccchhhhhcc
Q 013804 281 GPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLV 324 (436)
Q Consensus 281 GPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~ 324 (436)
-|+.+.+|+++|++.. .+.++.+.++++++.
T Consensus 19 ~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~~ 49 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGI-------------DISLDQLSEIISNIK 49 (81)
T ss_dssp EEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTSB
T ss_pred EEEECCCCCEEEEEEE-------------EeccceeeeEEEeeE
Confidence 5788889999999654 366777777777653
No 119
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.; PDB: 1OKG_A.
Probab=28.61 E-value=1.4e+02 Score=22.10 Aligned_cols=45 Identities=22% Similarity=0.264 Sum_probs=27.6
Q ss_pred CcEEEEECCEEeCCHH-HHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 013804 382 GDIITSVNGKKVSNGS-DLYRILDQCKVGDEVIVEVLRGDQKEKIPV 427 (436)
Q Consensus 382 GDiIl~vnG~~V~s~~-dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (436)
.-.-+.+||..|.+.+ |+..++.....|+..++.+.. +....+++
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~FkS-~Rv~~iEv 64 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFKS-QRVAVIEV 64 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEETT-S-EEEEE-
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEec-CcEEEEEc
Confidence 4566789999999975 788888877899988876543 33444444
No 120
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope.; PDB: 2L8D_A 2DIG_A.
Probab=28.30 E-value=2.3e+02 Score=20.46 Aligned_cols=35 Identities=17% Similarity=0.335 Sum_probs=27.6
Q ss_pred CCCeEEEEecCCcEE-eeEEEEEcCCCCeEEEEEcC
Q 013804 174 GASDIRVTFADQSAY-DAKIVGFDQDKDVAVLRIDA 208 (436)
Q Consensus 174 ~~~~i~V~~~dg~~~-~a~vv~~d~~~DlAlLkv~~ 208 (436)
..+.+.++.++...| ++++..+|...++.-++.+.
T Consensus 8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D 43 (55)
T PF09465_consen 8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED 43 (55)
T ss_dssp SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence 446788899887765 99999999999999999976
No 121
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin. An HDS catalytic triad has been identified.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=28.07 E-value=1.1e+02 Score=26.94 Aligned_cols=73 Identities=21% Similarity=0.210 Sum_probs=39.1
Q ss_pred CCCCCEEEEEecCCCCCCceeEeEEeeeeeeeccCCC-CCCcccEEEEccccCCCCCCCeEECC-CCcEEEEEeeee
Q 013804 224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAAT-GRPIQDVIQTDAAINPGNSGGPLLDS-SGSLIGINTAIY 298 (436)
Q Consensus 224 ~~~G~~V~~vG~p~g~~~~~~~G~vs~~~~~~~~~~~-~~~~~~~i~~~~~i~~G~SGGPlvd~-~G~VVGI~s~~~ 298 (436)
...|..+|++ +|...+.+-+.|.+-...+.-..... ...... -.+|..-..|-||=|+|.. .|++||=.-.+.
T Consensus 109 cp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtp-af~~~knlkg~s~~pifeassgr~vgr~k~gk 183 (211)
T PF05578_consen 109 CPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTP-AFFDLKNLKGWSGLPIFEASSGRVVGRVKVGK 183 (211)
T ss_pred CCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCc-ceeeccccCCCCCCceeeccCCcEEEEEEecC
Confidence 4567888888 56554444455544333322110000 000001 1233344579999999974 899999866543
No 122
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.39 E-value=1.3e+02 Score=22.75 Aligned_cols=32 Identities=19% Similarity=0.267 Sum_probs=27.6
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.||+.+.++..++|...++.+=...
T Consensus 10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVITVDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence 46889999999999999999999888776653
No 123
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=27.07 E-value=1.2e+02 Score=23.20 Aligned_cols=33 Identities=21% Similarity=0.459 Sum_probs=28.8
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (436)
..+.|.+.+|+.+.+++.++|...++.+--+..
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 678999999999999999999999888776643
No 124
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.83 E-value=1.6e+02 Score=22.12 Aligned_cols=32 Identities=9% Similarity=0.272 Sum_probs=28.8
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEc
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (436)
..+.|.+.+|..|.+++..+|...++.+-...
T Consensus 11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 46889999999999999999999999888774
No 125
>PF14275 DUF4362: Domain of unknown function (DUF4362)
Probab=26.28 E-value=2e+02 Score=23.45 Aligned_cols=47 Identities=17% Similarity=0.312 Sum_probs=27.8
Q ss_pred CCcEEEEECCEEeCCHHHHHHHHhcCC--------------CCCEEEEEEEECCEEEEEEEEe
Q 013804 381 LGDIITSVNGKKVSNGSDLYRILDQCK--------------VGDEVIVEVLRGDQKEKIPVKL 429 (436)
Q Consensus 381 ~GDiIl~vnG~~V~s~~dl~~~l~~~~--------------~g~~v~l~v~R~g~~~~~~v~~ 429 (436)
.||+|.+ .|+ |.+.+.|...+.... .|+++-..+.-+|+...+.+.-
T Consensus 2 ~~DVi~~-~~~-i~Nl~kl~~Fi~nv~~~k~d~IrIv~yT~EGdPI~~~L~~~G~~I~y~~Dn 62 (98)
T PF14275_consen 2 NNDVINK-HGE-IENLDKLDQFIENVEQGKPDKIRIVQYTIEGDPIFQDLEYDGNQIKYTSDN 62 (98)
T ss_pred CCCEEEe-CCe-EEeHHHHHHHHHHHhcCCCCEEEEEEecCCCCCEEEEEEECCCEEEEEECC
Confidence 4999988 444 777777776665432 3344444455566655555544
No 126
>PF01423 LSM: LSM domain; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq forms an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=25.54 E-value=1.5e+02 Score=21.63 Aligned_cols=33 Identities=21% Similarity=0.444 Sum_probs=29.3
Q ss_pred CeEEEEecCCcEEeeEEEEEcCCCCeEEEEEcC
Q 013804 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (436)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (436)
..+.|.+.+|+.+.+.+..+|...++.+-....
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 568899999999999999999999988887754
No 127
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.21 E-value=58 Score=26.41 Aligned_cols=22 Identities=32% Similarity=0.453 Sum_probs=17.4
Q ss_pred CCCCCCeEECCCCcEEEEEeee
Q 013804 276 PGNSGGPLLDSSGSLIGINTAI 297 (436)
Q Consensus 276 ~G~SGGPlvd~~G~VVGI~s~~ 297 (436)
.+.+.=|++|.+|+++|+++..
T Consensus 97 ~~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 97 EGISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred cCCceEEEECCCCcEEEEEeHH
Confidence 3455679999999999998753
No 128
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation.; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=24.18 E-value=2.3e+02 Score=21.23 Aligned_cols=43 Identities=23% Similarity=0.396 Sum_probs=29.3
Q ss_pred EeeEEEEEcCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEE
Q 013804 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI 233 (436)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~v 233 (436)
++++++..+.....|++.... ....+.+.--.++++|++|++-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence 678899998888999998864 3344444333458899999875
No 129
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and 4 small ones. Exonuclease VII catalyses exonucleolytic cleavage in either the 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (InterPro: IPR004365) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=23.81 E-value=87 Score=30.86 Aligned_cols=35 Identities=29% Similarity=0.499 Sum_probs=30.7
Q ss_pred eEEEEEEEcCCCEEEecccccCCCCeEEEEecCCc
Q 013804 152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS 186 (436)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~ 186 (436)
..|-+++-+++|.++|+..-+...+.+++.+.||.
T Consensus 280 ~RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~ 314 (319)
T PF02601_consen 280 KRGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS 314 (319)
T ss_pred hCceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence 35667788788999999999999999999999995
No 130
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=22.81 E-value=69 Score=25.46 Aligned_cols=20 Identities=20% Similarity=0.270 Sum_probs=15.9
Q ss_pred CCCCCeEECCCCcEEEEEee
Q 013804 277 GNSGGPLLDSSGSLIGINTA 296 (436)
Q Consensus 277 G~SGGPlvd~~G~VVGI~s~ 296 (436)
+.+--|++|.+|+++|+++.
T Consensus 86 ~~~~lpVvd~~~~~~Giit~ 105 (111)
T cd04603 86 EPPVVAVVDKEGKLVGTIYE 105 (111)
T ss_pred CCCeEEEEcCCCeEEEEEEh
Confidence 44456899988999999874
No 131
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=22.35 E-value=71 Score=25.32 Aligned_cols=20 Identities=45% Similarity=0.599 Sum_probs=16.3
Q ss_pred CCCCCeEECCCCcEEEEEee
Q 013804 277 GNSGGPLLDSSGSLIGINTA 296 (436)
Q Consensus 277 G~SGGPlvd~~G~VVGI~s~ 296 (436)
+...-|++|.+|+++||++.
T Consensus 90 ~~~~~pVvd~~~~~~Gvit~ 109 (115)
T cd04620 90 QIRHLPVLDDQGQLIGLVTA 109 (115)
T ss_pred CCceEEEEcCCCCEEEEEEh
Confidence 44567899989999999875
No 132
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=20.09 E-value=73 Score=22.27 Aligned_cols=11 Identities=36% Similarity=0.866 Sum_probs=8.4
Q ss_pred CCCCcEEEEEe
Q 013804 285 DSSGSLIGINT 295 (436)
Q Consensus 285 d~~G~VVGI~s 295 (436)
|.+|++|||--
T Consensus 36 d~~G~ivGIEI 46 (50)
T PF10049_consen 36 DEDGRIVGIEI 46 (50)
T ss_pred CCCCCEEEEEE
Confidence 57899999843
Done!