Query psy18070
Match_columns 169
No_of_seqs 192 out of 1639
Neff 7.0
Searched_HMMs 46136
Date Fri Aug 16 21:52:17 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy18070.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/18070hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.6E-31 7.7E-36 233.5 15.7 140 1-156 202-348 (455)
2 TIGR02038 protease_degS peripl 100.0 8.4E-30 1.8E-34 218.3 16.4 140 1-156 188-336 (351)
3 PRK10898 serine endoprotease; 100.0 1.2E-29 2.6E-34 217.5 16.8 140 1-156 188-337 (353)
4 PRK10942 serine endoprotease; 100.0 1.9E-29 4.2E-34 223.4 15.1 140 1-156 223-369 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 8.3E-29 1.8E-33 216.6 16.6 140 1-156 169-315 (428)
6 COG0265 DegQ Trypsin-like seri 99.9 2E-23 4.4E-28 177.9 14.1 138 1-156 184-328 (347)
7 KOG1320|consensus 99.7 2.8E-16 6.1E-21 138.3 9.4 150 1-156 294-456 (473)
8 cd00987 PDZ_serine_protease PD 99.2 1.2E-10 2.6E-15 80.1 8.8 80 65-156 1-82 (90)
9 PF13180 PDZ_2: PDZ domain; PD 99.2 7E-11 1.5E-15 80.9 7.5 76 65-162 1-78 (82)
10 KOG1421|consensus 99.1 4.7E-10 1E-14 102.0 11.7 143 1-156 206-359 (955)
11 cd00990 PDZ_glycyl_aminopeptid 98.9 1.3E-08 2.8E-13 68.7 8.2 67 65-156 1-67 (80)
12 TIGR02037 degP_htrA_DO peripla 98.8 3.4E-08 7.4E-13 86.7 8.9 83 64-157 337-421 (428)
13 cd00991 PDZ_archaeal_metallopr 98.6 2.2E-07 4.8E-12 63.2 8.1 61 92-158 8-70 (79)
14 TIGR01713 typeII_sec_gspC gene 98.6 2.6E-07 5.6E-12 76.6 9.3 92 43-159 159-252 (259)
15 KOG1320|consensus 98.6 5.2E-08 1.1E-12 86.4 5.1 116 1-132 200-318 (473)
16 cd00136 PDZ PDZ domain, also c 98.5 4E-07 8.8E-12 59.7 7.0 36 94-135 13-48 (70)
17 smart00228 PDZ Domain present 98.5 9.2E-07 2E-11 59.5 7.8 71 65-156 12-84 (85)
18 cd00992 PDZ_signaling PDZ doma 98.4 1.1E-06 2.3E-11 59.2 7.4 49 64-133 11-59 (82)
19 cd00988 PDZ_CTP_protease PDZ d 98.4 1.5E-06 3.3E-11 59.0 7.9 57 94-156 13-72 (85)
20 cd00989 PDZ_metalloprotease PD 98.4 1.5E-06 3.2E-11 58.2 7.6 57 94-156 12-69 (79)
21 cd00986 PDZ_LON_protease PDZ d 98.4 2.8E-06 6E-11 57.4 7.8 56 94-156 8-65 (79)
22 PF00595 PDZ: PDZ domain (Also 98.3 1.5E-06 3.2E-11 58.9 6.2 71 63-153 8-80 (81)
23 PRK10942 serine endoprotease; 98.3 3.8E-06 8.1E-11 75.1 8.2 57 94-156 408-464 (473)
24 PRK10139 serine endoprotease; 98.0 1.9E-05 4.1E-10 70.3 6.8 57 94-156 390-446 (455)
25 TIGR00225 prc C-terminal pepti 97.9 5.8E-05 1.3E-09 64.4 8.5 57 94-156 62-121 (334)
26 TIGR00054 RIP metalloprotease 97.8 5E-05 1.1E-09 66.9 6.6 57 94-156 203-260 (420)
27 TIGR00054 RIP metalloprotease 97.8 3.6E-05 7.8E-10 67.8 5.4 57 93-155 127-183 (420)
28 PRK10779 zinc metallopeptidase 97.7 2.9E-05 6.2E-10 68.9 3.4 55 96-156 128-184 (449)
29 PRK10779 zinc metallopeptidase 97.7 0.0001 2.3E-09 65.4 6.7 57 94-156 221-278 (449)
30 PLN00049 carboxyl-terminal pro 97.6 0.00013 2.8E-09 63.7 6.7 35 94-134 102-136 (389)
31 COG0793 Prc Periplasmic protea 97.5 0.00044 9.6E-09 60.8 7.9 57 94-156 112-171 (406)
32 PF12812 PDZ_1: PDZ-like domai 97.5 0.00052 1.1E-08 47.0 6.4 64 64-141 8-71 (78)
33 KOG3553|consensus 97.5 9.9E-05 2.1E-09 53.0 2.8 32 94-131 59-90 (124)
34 COG3975 Predicted protease wit 97.2 0.0007 1.5E-08 60.9 5.6 54 67-130 439-492 (558)
35 PF04495 GRASP55_65: GRASP55/6 97.2 0.00057 1.2E-08 51.7 4.3 73 65-155 26-100 (138)
36 TIGR02860 spore_IV_B stage IV 97.1 0.0023 4.9E-08 56.3 7.9 58 93-156 104-170 (402)
37 PRK11186 carboxy-terminal prot 97.1 0.0017 3.8E-08 60.4 7.2 29 94-128 255-284 (667)
38 TIGR03279 cyano_FeS_chp putati 97.0 0.00091 2E-08 59.2 4.5 50 97-154 1-50 (433)
39 PF10459 Peptidase_S46: Peptid 96.7 0.0027 5.9E-08 59.4 5.4 50 3-52 625-686 (698)
40 KOG3129|consensus 96.6 0.0045 9.7E-08 49.9 5.4 61 95-161 140-204 (231)
41 PF14685 Tricorn_PDZ: Tricorn 96.1 0.044 9.6E-07 38.3 7.6 57 94-156 12-79 (88)
42 KOG3532|consensus 96.1 0.017 3.8E-07 53.7 6.6 59 94-158 398-456 (1051)
43 PF00949 Peptidase_S7: Peptida 96.0 0.0075 1.6E-07 45.3 3.4 29 4-32 90-118 (132)
44 PF00944 Peptidase_S3: Alphavi 95.4 0.022 4.8E-07 43.1 3.9 36 6-41 101-136 (158)
45 KOG1421|consensus 95.2 0.27 5.8E-06 46.2 10.8 115 9-137 677-805 (955)
46 PF05579 Peptidase_S32: Equine 94.9 0.034 7.3E-07 46.5 3.8 33 8-40 205-237 (297)
47 PRK09681 putative type II secr 94.8 0.052 1.1E-06 45.6 5.0 60 94-156 204-265 (276)
48 KOG3571|consensus 94.4 0.12 2.5E-06 46.8 6.2 59 92-156 275-339 (626)
49 KOG3580|consensus 93.8 0.082 1.8E-06 48.9 4.2 36 94-135 429-464 (1027)
50 KOG2921|consensus 93.0 0.099 2.2E-06 46.0 3.4 41 92-137 218-258 (484)
51 KOG3605|consensus 92.3 0.26 5.7E-06 45.8 5.2 82 38-136 707-792 (829)
52 KOG3550|consensus 92.0 0.31 6.8E-06 37.7 4.6 36 94-135 115-151 (207)
53 KOG3580|consensus 91.7 0.37 8E-06 44.7 5.4 64 91-161 37-101 (1027)
54 KOG3209|consensus 91.6 0.23 4.9E-06 46.7 4.0 51 98-154 782-835 (984)
55 COG3591 V8-like Glu-specific e 90.9 0.24 5.2E-06 41.1 3.2 32 2-33 194-225 (251)
56 KOG3542|consensus 90.6 0.2 4.4E-06 47.0 2.7 57 93-155 561-618 (1283)
57 KOG3834|consensus 90.1 0.39 8.5E-06 42.6 4.0 63 96-164 111-175 (462)
58 COG3480 SdrC Predicted secrete 89.8 0.95 2E-05 38.9 5.9 55 94-155 130-186 (342)
59 PF02907 Peptidase_S29: Hepati 88.5 0.49 1.1E-05 35.8 3.0 32 9-40 106-138 (148)
60 PF00947 Pico_P2A: Picornaviru 88.1 0.95 2.1E-05 33.8 4.3 35 4-39 83-117 (127)
61 KOG3209|consensus 84.8 2.1 4.5E-05 40.6 5.5 59 92-156 921-981 (984)
62 KOG0606|consensus 83.4 0.95 2.1E-05 44.6 2.8 34 96-135 660-693 (1205)
63 KOG3651|consensus 81.0 2.1 4.5E-05 36.8 3.7 42 94-141 30-72 (429)
64 COG3031 PulC Type II secretory 80.8 2.2 4.8E-05 35.3 3.7 56 95-156 208-265 (275)
65 PF00863 Peptidase_C4: Peptida 79.5 3.4 7.3E-05 34.0 4.4 40 4-43 144-185 (235)
66 KOG3552|consensus 76.0 4 8.7E-05 39.9 4.4 55 94-155 75-131 (1298)
67 KOG1892|consensus 75.7 3.1 6.8E-05 40.8 3.6 38 91-134 957-995 (1629)
68 PF08192 Peptidase_S64: Peptid 74.1 5.1 0.00011 37.6 4.5 44 8-51 636-687 (695)
69 COG0750 Predicted membrane-ass 73.8 7.1 0.00015 33.3 5.1 50 100-155 135-188 (375)
70 KOG3627|consensus 72.9 2.9 6.3E-05 33.3 2.4 26 8-33 201-229 (256)
71 PF11874 DUF3394: Domain of un 70.9 5.7 0.00012 31.5 3.5 28 94-127 122-149 (183)
72 PF00571 CBS: CBS domain CBS d 67.1 7.4 0.00016 23.4 2.8 20 11-30 29-48 (57)
73 KOG3606|consensus 65.2 5.6 0.00012 33.7 2.5 38 92-135 192-230 (358)
74 KOG0609|consensus 62.6 18 0.00039 33.2 5.4 35 95-135 147-182 (542)
75 PF02743 Cache_1: Cache domain 56.6 25 0.00053 23.0 4.1 33 15-55 19-51 (81)
76 cd00218 GlcAT-I Beta1,3-glucur 54.7 13 0.00028 30.4 2.9 31 14-45 136-172 (223)
77 KOG3605|consensus 53.1 9.8 0.00021 35.9 2.2 52 98-155 677-733 (829)
78 PF02122 Peptidase_S39: Peptid 46.3 22 0.00048 28.5 3.0 40 3-43 139-182 (203)
79 KOG3834|consensus 46.0 30 0.00066 31.0 4.0 58 91-155 12-72 (462)
80 KOG3549|consensus 40.8 39 0.00085 29.8 3.8 53 95-153 81-136 (505)
81 COG0260 PepB Leucyl aminopepti 37.5 42 0.00091 30.6 3.7 31 98-135 302-334 (485)
82 cd04582 CBS_pair_ABC_OpuCA_ass 36.7 33 0.00071 22.7 2.3 22 10-31 80-101 (106)
83 cd04596 CBS_pair_DRTGG_assoc T 35.9 33 0.00072 22.9 2.3 21 10-30 82-102 (108)
84 KOG1728|consensus 35.6 13 0.00029 28.2 0.2 31 104-141 111-141 (156)
85 PF08669 GCV_T_C: Glycine clea 35.2 53 0.0012 22.2 3.2 31 10-40 32-67 (95)
86 KOG1476|consensus 35.1 27 0.00059 30.1 2.0 32 15-47 223-260 (330)
87 PF10049 DUF2283: Protein of u 34.8 43 0.00093 20.4 2.4 13 18-30 35-47 (50)
88 cd04618 CBS_pair_5 The CBS dom 33.5 31 0.00068 23.3 1.8 21 11-31 72-93 (98)
89 cd04606 CBS_pair_Mg_transporte 33.3 39 0.00085 22.6 2.3 21 10-30 82-102 (109)
90 smart00116 CBS Domain in cysta 33.2 42 0.00092 18.0 2.1 20 11-30 22-41 (49)
91 cd04610 CBS_pair_ParBc_assoc T 32.5 41 0.00088 22.2 2.3 18 13-30 84-101 (107)
92 PRK09570 rpoH DNA-directed RNA 31.6 29 0.00063 23.7 1.3 16 104-125 43-59 (79)
93 cd04592 CBS_pair_EriC_assoc_eu 30.7 51 0.0011 23.8 2.6 22 10-31 22-43 (133)
94 cd04641 CBS_pair_28 The CBS do 30.2 53 0.0011 22.4 2.6 22 9-30 21-42 (120)
95 cd00433 Peptidase_M17 Cytosol 29.8 69 0.0015 29.0 3.8 28 101-135 292-321 (468)
96 TIGR00612 ispG_gcpE 1-hydroxy- 29.2 60 0.0013 28.2 3.1 37 42-84 107-143 (346)
97 PRK00913 multifunctional amino 28.8 76 0.0016 28.9 3.9 28 101-135 306-335 (483)
98 TIGR02913 HAF_rpt probable ext 28.4 62 0.0013 18.9 2.2 12 18-29 4-15 (39)
99 cd04614 CBS_pair_1 The CBS dom 28.3 66 0.0014 21.4 2.7 47 10-56 22-71 (96)
100 PF00883 Peptidase_M17: Cytoso 28.3 61 0.0013 27.8 3.0 29 100-135 136-166 (311)
101 PF03761 DUF316: Domain of unk 28.3 61 0.0013 26.4 3.0 28 4-31 224-254 (282)
102 cd00190 Tryp_SPc Trypsin-like 28.2 30 0.00065 26.3 1.1 17 6-22 179-195 (232)
103 PF08275 Toprim_N: DNA primase 27.6 47 0.001 24.3 2.0 17 16-32 82-98 (128)
104 KOG3551|consensus 27.3 60 0.0013 29.0 2.8 32 94-131 110-142 (506)
105 PRK05015 aminopeptidase B; Pro 25.2 96 0.0021 27.8 3.8 30 99-135 241-272 (424)
106 cd04603 CBS_pair_KefB_assoc Th 23.4 91 0.002 21.0 2.7 19 11-29 23-41 (111)
107 cd04801 CBS_pair_M50_like This 23.1 77 0.0017 21.2 2.3 20 12-31 25-44 (114)
108 PRK03760 hypothetical protein; 22.9 98 0.0021 22.5 2.9 25 94-125 89-113 (117)
109 cd04643 CBS_pair_30 The CBS do 22.8 73 0.0016 21.2 2.2 20 12-31 24-43 (116)
110 COG1792 MreC Cell shape-determ 22.6 3.3E+02 0.0071 22.8 6.4 41 2-44 134-175 (284)
111 PF12120 Arr-ms: Rifampin ADP- 22.6 43 0.00094 23.8 0.9 30 113-142 5-45 (100)
112 COG4043 Preprotein translocase 22.3 52 0.0011 23.7 1.3 27 115-141 31-65 (111)
113 PF01191 RNA_pol_Rpb5_C: RNA p 22.0 44 0.00094 22.6 0.8 18 103-126 39-57 (74)
114 COG5428 Uncharacterized conser 21.7 64 0.0014 21.5 1.5 14 18-31 36-49 (69)
115 cd04594 CBS_pair_EriC_assoc_ar 21.3 82 0.0018 20.9 2.1 21 9-30 78-98 (104)
116 cd04459 Rho_CSD Rho_CSD: Rho p 21.1 60 0.0013 21.4 1.3 12 115-126 38-49 (68)
117 cd04617 CBS_pair_4 The CBS dom 21.0 1E+02 0.0023 20.8 2.7 21 10-30 22-42 (118)
118 cd04619 CBS_pair_6 The CBS dom 20.8 1.1E+02 0.0024 20.6 2.8 22 10-31 22-43 (114)
119 cd04623 CBS_pair_10 The CBS do 20.6 95 0.0021 20.4 2.3 21 11-31 23-43 (113)
120 cd04621 CBS_pair_8 The CBS dom 20.0 1.1E+02 0.0023 21.8 2.6 21 11-31 23-43 (135)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=99.97 E-value=3.6e-31 Score=233.52 Aligned_cols=140 Identities=26% Similarity=0.323 Sum_probs=122.9
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN 75 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~ 75 (169)
+|||||+||||||||||+|.+||||||+++..+ +|++||||++.+++++++|+++| ++.++|||+++++++
T Consensus 202 ~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g----~v~r~~LGv~~~~l~ 277 (455)
T PRK10139 202 FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG----EIKRGLLGIKGTEMS 277 (455)
T ss_pred EEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC----cccccceeEEEEECC
Confidence 589999999999999999999999999999764 57999999999999999999999 899999999999999
Q ss_pred HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--hcCCCCeeEEEE
Q psy18070 76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--SINHPSITCHIL 153 (169)
Q Consensus 76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~~~~~~~~~~~i 153 (169)
+++++.++++ ...|++|.+|.++|||+ ++|||+||+|+++++.++.+..++... .......+.+.+
T Consensus 278 ~~~~~~lgl~------~~~Gv~V~~V~~~SpA~------~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V 345 (455)
T PRK10139 278 ADIAKAFNLD------VQRGAFVSEVLPNSGSA------KAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL 345 (455)
T ss_pred HHHHHhcCCC------CCCceEEEEECCCChHH------HCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence 9999999884 23799999999999999 999999999999988887776655433 223445668888
Q ss_pred EEe
Q psy18070 154 LRL 156 (169)
Q Consensus 154 ~r~ 156 (169)
.|.
T Consensus 346 ~R~ 348 (455)
T PRK10139 346 LRN 348 (455)
T ss_pred EEC
Confidence 885
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.97 E-value=8.4e-30 Score=218.26 Aligned_cols=140 Identities=29% Similarity=0.378 Sum_probs=121.9
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-------CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEE
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-------AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLT 73 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-------~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~ 73 (169)
++||||++|||||||||+|.+||||||+++.+. ++++||||++.+++++++++++| ++.++|||+++++
T Consensus 188 ~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g----~~~r~~lGv~~~~ 263 (351)
T TIGR02038 188 FIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG----RVIRGYIGVSGED 263 (351)
T ss_pred EEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----cccceEeeeEEEE
Confidence 589999999999999999999999999997652 57999999999999999999999 8899999999999
Q ss_pred CCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCCeeEE
Q psy18070 74 LNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCH 151 (169)
Q Consensus 74 l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~ 151 (169)
+++..++.++++ ...|++|.+|.++|||+ ++||++||+|+++++.++.+..++. +...+....+.+
T Consensus 264 ~~~~~~~~lgl~------~~~Gv~V~~V~~~spA~------~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l 331 (351)
T TIGR02038 264 INSVVAQGLGLP------DLRGIVITGVDPNGPAA------RAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMV 331 (351)
T ss_pred CCHHHHHhcCCC------ccccceEeecCCCChHH------HCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEE
Confidence 999999998884 23799999999999999 9999999999999988887765543 333344556788
Q ss_pred EEEEe
Q psy18070 152 ILLRL 156 (169)
Q Consensus 152 ~i~r~ 156 (169)
.++|+
T Consensus 332 ~v~R~ 336 (351)
T TIGR02038 332 TVLRQ 336 (351)
T ss_pred EEEEC
Confidence 88885
No 3
>PRK10898 serine endoprotease; Provisional
Probab=99.97 E-value=1.2e-29 Score=217.49 Aligned_cols=140 Identities=27% Similarity=0.352 Sum_probs=120.2
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc--------CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEE
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT--------AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITML 72 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~--------~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~ 72 (169)
+|||||++|||||||||+|.+||||||+++.+. ++++||||++.+++++++|+++| ++.++|||+.++
T Consensus 188 ~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G----~~~~~~lGi~~~ 263 (353)
T PRK10898 188 FLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG----RVIRGYIGIGGR 263 (353)
T ss_pred eEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----cccccccceEEE
Confidence 589999999999999999999999999998753 47899999999999999999999 889999999999
Q ss_pred ECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhh--hhhhcCCCCeeE
Q psy18070 73 TLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKL--VVWSINHPSITC 150 (169)
Q Consensus 73 ~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~--~~~~~~~~~~~~ 150 (169)
++++...+.++++ ...|++|.+|.++|||+ ++||++||+|+++++.++.+..++ .+........+.
T Consensus 264 ~~~~~~~~~~~~~------~~~Gv~V~~V~~~spA~------~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~ 331 (353)
T PRK10898 264 EIAPLHAQGGGID------QLQGIVVNEVSPDGPAA------KAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIP 331 (353)
T ss_pred ECCHHHHHhcCCC------CCCeEEEEEECCCChHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEE
Confidence 9998877776653 23799999999999999 999999999999999988765443 333334455668
Q ss_pred EEEEEe
Q psy18070 151 HILLRL 156 (169)
Q Consensus 151 ~~i~r~ 156 (169)
+.++|.
T Consensus 332 l~v~R~ 337 (353)
T PRK10898 332 VVVMRD 337 (353)
T ss_pred EEEEEC
Confidence 888885
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.96 E-value=1.9e-29 Score=223.45 Aligned_cols=140 Identities=31% Similarity=0.361 Sum_probs=122.6
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN 75 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~ 75 (169)
||||||++|||||||||+|.+||||||+++..+ .+++||||++.+++++++|++++ .+.++|+|+.+++++
T Consensus 223 ~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g----~v~rg~lGv~~~~l~ 298 (473)
T PRK10942 223 FIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYG----QVKRGELGIMGTELN 298 (473)
T ss_pred eEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhcc----ccccceeeeEeeecC
Confidence 589999999999999999999999999998764 46999999999999999999999 899999999999999
Q ss_pred HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCCeeEEEE
Q psy18070 76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCHIL 153 (169)
Q Consensus 76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i 153 (169)
+++++.++++. ..|++|.+|.++|||+ ++||++||+|+++++.++.+..++. +........+.+.+
T Consensus 299 ~~~a~~~~l~~------~~GvlV~~V~~~SpA~------~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~v 366 (473)
T PRK10942 299 SELAKAMKVDA------QRGAFVSQVLPNSSAA------KAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKLTLGL 366 (473)
T ss_pred HHHHHhcCCCC------CCceEEEEECCCChHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence 99999998852 3799999999999999 9999999999999888877765544 33334455668888
Q ss_pred EEe
Q psy18070 154 LRL 156 (169)
Q Consensus 154 ~r~ 156 (169)
+|+
T Consensus 367 ~R~ 369 (473)
T PRK10942 367 LRD 369 (473)
T ss_pred EEC
Confidence 774
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.96 E-value=8.3e-29 Score=216.60 Aligned_cols=140 Identities=30% Similarity=0.418 Sum_probs=123.1
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN 75 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~ 75 (169)
++||||++|||||||||+|.+||||||+++..+ .+++||||++.+++++++|++++ .+.++|||+++++++
T Consensus 169 ~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g----~~~~~~lGi~~~~~~ 244 (428)
T TIGR02037 169 FIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG----KVQRGWLGVTIQEVT 244 (428)
T ss_pred eEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC----cCcCCcCceEeecCC
Confidence 589999999999999999999999999998764 57899999999999999999999 889999999999999
Q ss_pred HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCCeeEEEE
Q psy18070 76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCHIL 153 (169)
Q Consensus 76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i 153 (169)
+++++.++++ ...|++|.+|.++|||+ ++||++||+|+++++.++.+..++. +........+.+.+
T Consensus 245 ~~~~~~lgl~------~~~Gv~V~~V~~~spA~------~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~v~l~v 312 (428)
T TIGR02037 245 SDLAKSLGLE------KQRGALVAQVLPGSPAE------KAGLKAGDVILSVNGKPISSFADLRRAIGTLKPGKKVTLGI 312 (428)
T ss_pred HHHHHHcCCC------CCCceEEEEccCCCChH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEE
Confidence 9999999984 23799999999999999 9999999999999998887665544 33334455678888
Q ss_pred EEe
Q psy18070 154 LRL 156 (169)
Q Consensus 154 ~r~ 156 (169)
+|+
T Consensus 313 ~R~ 315 (428)
T TIGR02037 313 LRK 315 (428)
T ss_pred EEC
Confidence 885
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.90 E-value=2e-23 Score=177.92 Aligned_cols=138 Identities=32% Similarity=0.421 Sum_probs=119.8
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECC
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLN 75 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~ 75 (169)
+|||||++||||||||++|.+|++|||+++... ++++||||++.++.++.++...| ++.++|+|+.+.+++
T Consensus 184 ~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G----~v~~~~lgv~~~~~~ 259 (347)
T COG0265 184 FIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG----KVVRGYLGVIGEPLT 259 (347)
T ss_pred hhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC----CccccccceEEEEcc
Confidence 589999999999999999999999999999986 24899999999999999999988 899999999999998
Q ss_pred HHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--cCCCCeeEEEE
Q psy18070 76 EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--INHPSITCHIL 153 (169)
Q Consensus 76 ~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--~~~~~~~~~~i 153 (169)
+... +++ ....|++|.+|.++|||+ ++|+++||+|++.++..+.+..++.... ......+.+.+
T Consensus 260 ~~~~--~g~------~~~~G~~V~~v~~~spa~------~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~ 325 (347)
T COG0265 260 ADIA--LGL------PVAAGAVVLGVLPGSPAA------KAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKL 325 (347)
T ss_pred cccc--cCC------CCCCceEEEecCCCChHH------HcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEE
Confidence 8776 554 244799999999999999 9999999999999888887776666443 33344668888
Q ss_pred EEe
Q psy18070 154 LRL 156 (169)
Q Consensus 154 ~r~ 156 (169)
+|.
T Consensus 326 ~r~ 328 (347)
T COG0265 326 LRG 328 (347)
T ss_pred EEC
Confidence 886
No 7
>KOG1320|consensus
Probab=99.66 E-value=2.8e-16 Score=138.30 Aligned_cols=150 Identities=29% Similarity=0.338 Sum_probs=113.1
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-----CCeEEEEehhhHHHHHHhhhhcCC-----ccceeecceecEE
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDI-----DRTITHKKYIGIT 70 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-----~~~~faiP~~~i~~~l~~l~~~g~-----~~~~~~~~~lGi~ 70 (169)
++|||+++|+||||||++|.+|++||+++++.. .+++|++|.+.+..++.+..+..+ -.....+.|+|+.
T Consensus 294 ~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~~~~g~~ 373 (473)
T KOG1320|consen 294 INQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVHQYIGLP 373 (473)
T ss_pred ecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccccccCCce
Confidence 579999999999999999999999999999986 789999999999999988743321 1112235688888
Q ss_pred EEECCHHHHHHhhcccCCCC-CCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--cCCCC
Q psy18070 71 MLTLNEKLIEQLRRDRHIPY-DLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--INHPS 147 (169)
Q Consensus 71 ~~~l~~~~~~~~~~~~~~~~-~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--~~~~~ 147 (169)
+-.++..+..+.--..+.+| ...+||+|++|.+++++. ..++++||+|+++|++++.+..++.-.. ....+
T Consensus 374 s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~------~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~~ 447 (473)
T KOG1320|consen 374 SYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSING------GYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTED 447 (473)
T ss_pred eEEEecceEEeecCCCccccccceeEEEEEEeccCCCcc------cccccCCCEEEEECCEEeechHHHHHHHHhcCcCc
Confidence 77776555544444444444 333699999999999999 9999999999999999888886666432 22233
Q ss_pred eeEEEEEEe
Q psy18070 148 ITCHILLRL 156 (169)
Q Consensus 148 ~~~~~i~r~ 156 (169)
.+....+|.
T Consensus 448 ~v~vl~~~~ 456 (473)
T KOG1320|consen 448 KVAVLDRRS 456 (473)
T ss_pred eEEEEEecC
Confidence 445555554
No 8
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21 E-value=1.2e-10 Score=80.07 Aligned_cols=80 Identities=28% Similarity=0.270 Sum_probs=62.2
Q ss_pred ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--
Q psy18070 65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS-- 142 (169)
Q Consensus 65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~-- 142 (169)
+|+|+.+++++++.+..+.+ ....|++|.+|.++|||+ ++||++||+|+++++.++.+..++....
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~------~~~~g~~V~~v~~~s~a~------~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~ 68 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGL------KDTKGVLVASVDPGSPAA------KAGLKPGDVILAVNGKPVKSVADLRRALAE 68 (90)
T ss_pred CccceEEeECCHHHHHHcCC------CCCCEEEEEEECCCCHHH------HcCCCcCCEEEEECCEECCCHHHHHHHHHh
Confidence 58999999999887776554 234699999999999999 9999999999999999887654444332
Q ss_pred cCCCCeeEEEEEEe
Q psy18070 143 INHPSITCHILLRL 156 (169)
Q Consensus 143 ~~~~~~~~~~i~r~ 156 (169)
......+.+.+.|+
T Consensus 69 ~~~~~~i~l~v~r~ 82 (90)
T cd00987 69 LKPGDKVTLTVLRG 82 (90)
T ss_pred cCCCCEEEEEEEEC
Confidence 22245667777774
No 9
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.20 E-value=7e-11 Score=80.91 Aligned_cols=76 Identities=24% Similarity=0.212 Sum_probs=58.4
Q ss_pred ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--h
Q psy18070 65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--S 142 (169)
Q Consensus 65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~ 142 (169)
||||+++...++ ..|++|.+|.++|||+ ++||++||+|+++++.++.+..++... .
T Consensus 1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~------~aGl~~GD~I~~ing~~v~~~~~~~~~l~~ 58 (82)
T PF13180_consen 1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAA------KAGLQPGDIILAINGKPVNSSEDLVNILSK 58 (82)
T ss_dssp -E-SEEEEECSC----------------SSSEEEEEESTTSHHH------HTTS-TTEEEEEETTEESSSHHHHHHHHHC
T ss_pred CEECeEEEEccC----------------CCeEEEEEeCCCCcHH------HCCCCCCcEEEEECCEEcCCHHHHHHHHHh
Confidence 689999998753 2699999999999999 999999999999999988776665533 3
Q ss_pred cCCCCeeEEEEEEeeEEeec
Q psy18070 143 INHPSITCHILLRLYLLVCS 162 (169)
Q Consensus 143 ~~~~~~~~~~i~r~~~~v~~ 162 (169)
......+.+.++|+--..+.
T Consensus 59 ~~~g~~v~l~v~R~g~~~~~ 78 (82)
T PF13180_consen 59 GKPGDTVTLTVLRDGEELTV 78 (82)
T ss_dssp SSTTSEEEEEEEETTEEEEE
T ss_pred CCCCCEEEEEEEECCEEEEE
Confidence 45566779999996554443
No 10
>KOG1421|consensus
Probab=99.14 E-value=4.7e-10 Score=102.03 Aligned_cols=143 Identities=15% Similarity=0.142 Sum_probs=113.1
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeecc-CCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECCHHHH
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLI 79 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~~-~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~ 79 (169)
++|.-+....|.||.|++|.+|..|..+..+.. ++.+|++|++.+.+.+..+++.. ...|+.|-+++.+-.-+..
T Consensus 206 y~QaasstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~----PItRGtLqvefl~k~~de~ 281 (955)
T KOG1421|consen 206 YIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNT----PITRGTLQVEFLHKLFDEC 281 (955)
T ss_pred eeeehhcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCC----CcccceEEEEEehhhhHHH
Confidence 579999999999999999999999999887764 56899999999999999999888 7888999999887777777
Q ss_pred HHhhcccC-------CCCCCCCcEE-EEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC--CCCee
Q psy18070 80 EQLRRDRH-------IPYDLTHGVL-IWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN--HPSIT 149 (169)
Q Consensus 80 ~~~~~~~~-------~~~~~~~Gv~-V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~--~~~~~ 149 (169)
+++|++.. .+|.. .|++ |..|.++|||+ +. |++||+++++|.- ..+++..+....+ -+..+
T Consensus 282 rrlGL~sE~eqv~r~k~P~~-tgmLvV~~vL~~gpa~------k~-Le~GDillavN~t-~l~df~~l~~iLDegvgk~l 352 (955)
T KOG1421|consen 282 RRLGLSSEWEQVVRTKFPER-TGMLVVETVLPEGPAE------KK-LEPGDILLAVNST-CLNDFEALEQILDEGVGKNL 352 (955)
T ss_pred HhcCCcHHHHHHHHhcCccc-ceeEEEEEeccCCchh------hc-cCCCcEEEEEcce-ehHHHHHHHHHHhhccCceE
Confidence 78877543 44543 4655 68899999999 66 9999999998843 3344333332222 34566
Q ss_pred EEEEEEe
Q psy18070 150 CHILLRL 156 (169)
Q Consensus 150 ~~~i~r~ 156 (169)
.+.++|.
T Consensus 353 ~LtI~Rg 359 (955)
T KOG1421|consen 353 ELTIQRG 359 (955)
T ss_pred EEEEEeC
Confidence 9999995
No 11
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.88 E-value=1.3e-08 Score=68.69 Aligned_cols=67 Identities=15% Similarity=-0.002 Sum_probs=50.1
Q ss_pred ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC
Q psy18070 65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN 144 (169)
Q Consensus 65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~ 144 (169)
+|+|+.+.+- ..|++|.+|.++|||+ ++||++||+|+++++.++.+ ....+....
T Consensus 1 ~~~G~~~~~~------------------~~~~~V~~V~~~s~a~------~aGl~~GD~I~~Ing~~v~~-~~~~l~~~~ 55 (80)
T cd00990 1 PYLGLTLDKE------------------EGLGKVTFVRDDSPAD------KAGLVAGDELVAVNGWRVDA-LQDRLKEYQ 55 (80)
T ss_pred CcccEEEEcc------------------CCcEEEEEECCCChHH------HhCCCCCCEEEEECCEEhHH-HHHHHHhcC
Confidence 5788888641 2589999999999999 99999999999999998776 222333333
Q ss_pred CCCeeEEEEEEe
Q psy18070 145 HPSITCHILLRL 156 (169)
Q Consensus 145 ~~~~~~~~i~r~ 156 (169)
....+.+.+.|.
T Consensus 56 ~~~~v~l~v~r~ 67 (80)
T cd00990 56 AGDPVELTVFRD 67 (80)
T ss_pred CCCEEEEEEEEC
Confidence 344567777764
No 12
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.76 E-value=3.4e-08 Score=86.72 Aligned_cols=83 Identities=22% Similarity=0.295 Sum_probs=66.7
Q ss_pred cceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--
Q psy18070 64 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW-- 141 (169)
Q Consensus 64 ~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~-- 141 (169)
..|+|+++.+++++.+++++++ ....|++|.+|.++|||+ ++||++||+|+++++.++.+..++...
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~-----~~~~Gv~V~~V~~~SpA~------~aGL~~GDvI~~Ing~~V~s~~d~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLK-----GDVKGVVVTKVVSGSPAA------RAGLQPGDVILSVNQQPVSSVAELRKVLD 405 (428)
T ss_pred ccccceEEecCCHHHHHHcCCC-----cCcCceEEEEeCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence 4689999999999999988774 334799999999999999 999999999999998887766554433
Q ss_pred hcCCCCeeEEEEEEee
Q psy18070 142 SINHPSITCHILLRLY 157 (169)
Q Consensus 142 ~~~~~~~~~~~i~r~~ 157 (169)
..+....+.+.++|+-
T Consensus 406 ~~~~g~~v~l~v~R~g 421 (428)
T TIGR02037 406 RAKKGGRVALLILRGG 421 (428)
T ss_pred hcCCCCEEEEEEEECC
Confidence 2234566788888864
No 13
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.64 E-value=2.2e-07 Score=63.19 Aligned_cols=61 Identities=16% Similarity=0.021 Sum_probs=46.4
Q ss_pred CCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC--CCCeeEEEEEEeeE
Q psy18070 92 LTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN--HPSITCHILLRLYL 158 (169)
Q Consensus 92 ~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~--~~~~~~~~i~r~~~ 158 (169)
...|++|.+|.++|||+ ++||++||+|+++++.++.+..++...... ....+.+.+.|+-.
T Consensus 8 ~~~Gv~V~~V~~~spa~------~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~ 70 (79)
T cd00991 8 AVAGVVIVGVIVGSPAE------NAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTT 70 (79)
T ss_pred cCCcEEEEEECCCChHH------hcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCE
Confidence 34799999999999999 999999999999998887765444433222 24456788887543
No 14
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.61 E-value=2.6e-07 Score=76.62 Aligned_cols=92 Identities=11% Similarity=0.016 Sum_probs=72.9
Q ss_pred hhHHHHHHhhhhcCCccceeecceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCc
Q psy18070 43 DYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTS 122 (169)
Q Consensus 43 ~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GD 122 (169)
..++++++++++.+ ..-+.|+|+.....+ ....|+.|..+.++|||+ ++|||+||
T Consensus 159 ~~~~~v~~~l~~~g----~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~~s~a~------~aGLr~GD 213 (259)
T TIGR01713 159 VVSRRIIEELTKDP----QKMFDYIRLSPVMKN---------------DKLEGYRLNPGKDPSLFY------KSGLQDGD 213 (259)
T ss_pred hhHHHHHHHHHHCH----HhhhheEeEEEEEeC---------------CceeEEEEEecCCCCHHH------HcCCCCCC
Confidence 46788999999988 888999999986543 112699999999999999 99999999
Q ss_pred EEEecCceeecchhhhh--hhhcCCCCeeEEEEEEeeEE
Q psy18070 123 SRLLGECLAQYTTSKLV--VWSINHPSITCHILLRLYLL 159 (169)
Q Consensus 123 vI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i~r~~~~ 159 (169)
+|+++|+.++.+..+.. +........+.+.+.|+--.
T Consensus 214 vIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~ 252 (259)
T TIGR01713 214 IAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQR 252 (259)
T ss_pred EEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEE
Confidence 99999999887765544 33334445778999986543
No 15
>KOG1320|consensus
Probab=98.60 E-value=5.2e-08 Score=86.38 Aligned_cols=116 Identities=18% Similarity=0.180 Sum_probs=92.7
Q ss_pred CEeeccccCCCCccceEEcCCccEEEEEeeec--cCCeEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEEC-CHH
Q psy18070 1 MSLTGIMVKFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTL-NEK 77 (169)
Q Consensus 1 ~iq~da~in~GnSGGplvn~~G~vvGi~~~~~--~~~~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l-~~~ 77 (169)
.+|+||+++|||||+|.+.-.++++|+++.++ .+++++.+|.-...++.......++ ....++++...+.+ +.+
T Consensus 200 ~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~---~~~f~~~nt~t~g~vs~~ 276 (473)
T KOG1320|consen 200 RVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAI---GNGFGLLNTLTQGMVSGQ 276 (473)
T ss_pred eEEEEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeecc---ccCceeeeeeeecccccc
Confidence 38999999999999999988899999999998 4578999999999998887766653 34556666666555 466
Q ss_pred HHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceee
Q psy18070 78 LIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQ 132 (169)
Q Consensus 78 ~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~ 132 (169)
.++.+.+. .. .|+.+.++.+-+.|. .. ++.||+|+..+++.+
T Consensus 277 ~R~~~~lg-----~~-~g~~i~~~~qtd~ai------~~-~nsg~~ll~~DG~~I 318 (473)
T KOG1320|consen 277 LRKSFKLG-----LE-TGVLISKINQTDAAI------NP-GNSGGPLLNLDGEVI 318 (473)
T ss_pred cccccccC-----cc-cceeeeeecccchhh------hc-ccCCCcEEEecCcEe
Confidence 66666553 33 789999999999888 55 999999999766655
No 16
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.54 E-value=4e-07 Score=59.67 Aligned_cols=36 Identities=25% Similarity=0.211 Sum_probs=33.0
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT 135 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~ 135 (169)
.|++|.+|.++|||+ ++||++||+|+++++.++.+.
T Consensus 13 ~~~~V~~v~~~s~a~------~~gl~~GD~I~~Ing~~v~~~ 48 (70)
T cd00136 13 GGVVVLSVEPGSPAE------RAGLQAGDVILAVNGTDVKNL 48 (70)
T ss_pred CCEEEEEeCCCCHHH------HcCCCCCCEEEEECCEECCCC
Confidence 489999999999999 999999999999998887665
No 17
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.48 E-value=9.2e-07 Score=59.46 Aligned_cols=71 Identities=20% Similarity=0.098 Sum_probs=50.0
Q ss_pred ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhhhh
Q psy18070 65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVVWS 142 (169)
Q Consensus 65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~~~ 142 (169)
..+|+.+..... ...|++|..|.++|||+ ++||++||+|+++++....+. .+.....
T Consensus 12 ~~~G~~~~~~~~---------------~~~~~~i~~v~~~s~a~------~~gl~~GD~I~~In~~~v~~~~~~~~~~~~ 70 (85)
T smart00228 12 GGLGFSLVGGKD---------------EGGGVVVSSVVPGSPAA------KAGLKVGDVILEVNGTSVEGLTHLEAVDLL 70 (85)
T ss_pred CcccEEEECCCC---------------CCCCEEEEEECCCCHHH------HcCCCCCCEEEEECCEECCCCCHHHHHHHH
Confidence 678888875321 11599999999999999 999999999999998877643 3333332
Q ss_pred cCCCCeeEEEEEEe
Q psy18070 143 INHPSITCHILLRL 156 (169)
Q Consensus 143 ~~~~~~~~~~i~r~ 156 (169)
......+.+.+.|.
T Consensus 71 ~~~~~~~~l~i~r~ 84 (85)
T smart00228 71 KKAGGKVTLTVLRG 84 (85)
T ss_pred HhCCCeEEEEEEeC
Confidence 33334556666663
No 18
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.45 E-value=1.1e-06 Score=59.16 Aligned_cols=49 Identities=16% Similarity=0.188 Sum_probs=41.2
Q ss_pred cceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeec
Q psy18070 64 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQY 133 (169)
Q Consensus 64 ~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~ 133 (169)
...+|+++...... ..|++|.+|.++|||+ ++||++||+|+++++....
T Consensus 11 ~~~~G~~~~~~~~~---------------~~~~~V~~v~~~s~a~------~~gl~~GD~I~~ing~~i~ 59 (82)
T cd00992 11 GGGLGFSLRGGKDS---------------GGGIFVSRVEPGGPAE------RGGLRVGDRILEVNGVSVE 59 (82)
T ss_pred CCCcCEEEeCcccC---------------CCCeEEEEECCCChHH------hCCCCCCCEEEEECCEEcC
Confidence 45689988764321 2699999999999999 9999999999999998877
No 19
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.43 E-value=1.5e-06 Score=59.03 Aligned_cols=57 Identities=23% Similarity=0.095 Sum_probs=43.8
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhhhhcC-CCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVVWSIN-HPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~~~~~-~~~~~~~~i~r~ 156 (169)
.+++|..|.++|||+ ++||++||+|+++++.+..+. .+....... ....+.+.+.|.
T Consensus 13 ~~~~V~~v~~~s~a~------~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~ 72 (85)
T cd00988 13 GGLVITSVLPGSPAA------KAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRG 72 (85)
T ss_pred CeEEEEEecCCCCHH------HcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcC
Confidence 689999999999999 999999999999999887764 444333222 344567777775
No 20
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.43 E-value=1.5e-06 Score=58.23 Aligned_cols=57 Identities=21% Similarity=0.027 Sum_probs=43.1
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCC-CCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINH-PSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~-~~~~~~~i~r~ 156 (169)
..++|.+|.++|||+ ++||++||+|+++++.+..+..+........ ...+.+.+.|.
T Consensus 12 ~~~~V~~v~~~s~a~------~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~ 69 (79)
T cd00989 12 IEPVIGEVVPGSPAA------KAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERN 69 (79)
T ss_pred cCcEEEeECCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEEC
Confidence 358999999999999 9999999999999999877654443332222 34557777764
No 21
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.36 E-value=2.8e-06 Score=57.39 Aligned_cols=56 Identities=16% Similarity=0.121 Sum_probs=42.8
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhh--cCCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWS--INHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~--~~~~~~~~~~i~r~ 156 (169)
.|++|.+|.++|||+ . ||++||+|+++++.++.+..++.... ......+.+.+.|.
T Consensus 8 ~Gv~V~~V~~~s~A~------~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~ 65 (79)
T cd00986 8 HGVYVTSVVEGMPAA------G-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKRE 65 (79)
T ss_pred cCEEEEEECCCCchh------h-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEEC
Confidence 699999999999999 6 79999999999998877654443222 23344567888774
No 22
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 microns. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.35 E-value=1.5e-06 Score=58.92 Aligned_cols=71 Identities=18% Similarity=0.122 Sum_probs=50.6
Q ss_pred ecceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhh
Q psy18070 63 HKKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVV 140 (169)
Q Consensus 63 ~~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~ 140 (169)
....+|+++....... ..|++|.+|.++|||+ ++||++||.|+++|+..+.+. .+...
T Consensus 8 ~~~~lG~~l~~~~~~~--------------~~~~~V~~v~~~~~a~------~~gl~~GD~Il~INg~~v~~~~~~~~~~ 67 (81)
T PF00595_consen 8 GNGPLGFTLRGGSDND--------------EKGVFVSSVVPGSPAE------RAGLKVGDRILEINGQSVRGMSHDEVVQ 67 (81)
T ss_dssp TTSBSSEEEEEESTSS--------------SEEEEEEEECTTSHHH------HHTSSTTEEEEEETTEESTTSBHHHHHH
T ss_pred CCCCcCEEEEecCCCC--------------cCCEEEEEEeCCChHH------hcccchhhhhheeCCEeCCCCCHHHHHH
Confidence 4567999998653210 2599999999999999 999999999999998887654 33333
Q ss_pred hhcCCCCeeEEEE
Q psy18070 141 WSINHPSITCHIL 153 (169)
Q Consensus 141 ~~~~~~~~~~~~i 153 (169)
.....+..+.+.+
T Consensus 68 ~l~~~~~~v~L~V 80 (81)
T PF00595_consen 68 LLKSASNPVTLTV 80 (81)
T ss_dssp HHHHSTSEEEEEE
T ss_pred HHHCCCCcEEEEE
Confidence 3333444555544
No 23
>PRK10942 serine endoprotease; Provisional
Probab=98.25 E-value=3.8e-06 Score=75.10 Aligned_cols=57 Identities=19% Similarity=0.144 Sum_probs=47.3
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r~ 156 (169)
.|++|.+|.++|||+ ++||++||+|+++|+.++.+..++.....+.+..+.+.+.|.
T Consensus 408 ~gvvV~~V~~~S~A~------~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~~~v~l~V~R~ 464 (473)
T PRK10942 408 KGVVVDNVKPGTPAA------QIGLKKGDVIIGANQQPVKNIAELRKILDSKPSVLALNIQRG 464 (473)
T ss_pred CCeEEEEeCCCChHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCeEEEEEEEC
Confidence 589999999999999 999999999999999988887666544444456678888885
No 24
>PRK10139 serine endoprotease; Provisional
Probab=97.96 E-value=1.9e-05 Score=70.34 Aligned_cols=57 Identities=19% Similarity=0.165 Sum_probs=46.0
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r~ 156 (169)
.|++|.+|.++|||+ ++||++||+|+++|+.++.+..++.....+++..+.+.++|+
T Consensus 390 ~Gv~V~~V~~~spA~------~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~v~l~v~R~ 446 (455)
T PRK10139 390 KGIKIDEVVKGSPAA------QAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKPAIIALQIVRG 446 (455)
T ss_pred CceEEEEeCCCChHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCeEEEEEEEC
Confidence 599999999999999 999999999999998888776555544333445667888885
No 25
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.90 E-value=5.8e-05 Score=64.35 Aligned_cols=57 Identities=21% Similarity=0.130 Sum_probs=41.9
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch--hhhhhhh-cCCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT--SKLVVWS-INHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~--~~~~~~~-~~~~~~~~~~i~r~ 156 (169)
.+++|.+|.++|||+ ++||++||+|+++++.++.+- .+..... ......+.+.+.|.
T Consensus 62 ~~~~V~~V~~~spA~------~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~ 121 (334)
T TIGR00225 62 GEIVIVSPFEGSPAE------KAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRA 121 (334)
T ss_pred CEEEEEEeCCCChHH------HcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeC
Confidence 589999999999999 999999999999999887652 2222121 12344557777774
No 26
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.80 E-value=5e-05 Score=66.90 Aligned_cols=57 Identities=18% Similarity=-0.057 Sum_probs=44.0
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC-CCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN-HPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~-~~~~~~~~i~r~ 156 (169)
.|++|.+|.++|||+ ++||++||+|+++|+.++.+-.+....... ....+.+.+.|+
T Consensus 203 ~g~vV~~V~~~SpA~------~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~ 260 (420)
T TIGR00054 203 IEPVLSDVTPNSPAE------KAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVERN 260 (420)
T ss_pred cCcEEEEECCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEEC
Confidence 489999999999999 999999999999998887765554433322 333457777775
No 27
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.79 E-value=3.6e-05 Score=67.81 Aligned_cols=57 Identities=19% Similarity=0.013 Sum_probs=46.2
Q ss_pred CCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEE
Q psy18070 93 THGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLR 155 (169)
Q Consensus 93 ~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r 155 (169)
..|++|.+|.++|||+ +|||++||+|+++|+.++.+..++........+...+.+.|
T Consensus 127 ~~g~~V~~V~~~SpA~------~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~~~v~~~I~r 183 (420)
T TIGR00054 127 EVGPVIELLDKNSIAL------EAGIEPGDEILSVNGNKIPGFKDVRQQIADIAGEPMVEILA 183 (420)
T ss_pred CCCceeeccCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhhcccceEEEEE
Confidence 3699999999999999 99999999999999888887766665544444566677766
No 28
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.69 E-value=2.9e-05 Score=68.90 Aligned_cols=55 Identities=15% Similarity=-0.019 Sum_probs=42.7
Q ss_pred EEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--hcCCCCeeEEEEEEe
Q psy18070 96 VLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--SINHPSITCHILLRL 156 (169)
Q Consensus 96 v~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~~~~~~~~~~~i~r~ 156 (169)
.+|.+|.++|||+ +||||+||+|+++|+.++.+-.++... .......+.+++.|.
T Consensus 128 ~lV~~V~~~SpA~------kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~ 184 (449)
T PRK10779 128 PVVGEIAPNSIAA------QAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPF 184 (449)
T ss_pred ccccccCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeC
Confidence 4789999999999 999999999999888887776555433 333334568888875
No 29
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.67 E-value=0.0001 Score=65.36 Aligned_cols=57 Identities=16% Similarity=0.002 Sum_probs=43.1
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhc-CCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSI-NHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~-~~~~~~~~~i~r~ 156 (169)
.+++|.+|.++|||+ ++||++||+|+++|+.++.+-.+...... .....+.+.+.|+
T Consensus 221 ~~~vV~~V~~~SpA~------~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~ 278 (449)
T PRK10779 221 IEPVLAEVQPNSAAS------KAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQ 278 (449)
T ss_pred cCcEEEeeCCCCHHH------HcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEEC
Confidence 368999999999999 99999999999999888766544433222 2334567777774
No 30
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.64 E-value=0.00013 Score=63.70 Aligned_cols=35 Identities=23% Similarity=0.253 Sum_probs=32.4
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecc
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYT 134 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~ 134 (169)
.|++|..|.++|||+ ++||++||+|+++++.++.+
T Consensus 102 ~g~~V~~V~~~SPA~------~aGl~~GD~Iv~InG~~v~~ 136 (389)
T PLN00049 102 AGLVVVAPAPGGPAA------RAGIRPGDVILAIDGTSTEG 136 (389)
T ss_pred CcEEEEEeCCCChHH------HcCCCCCCEEEEECCEECCC
Confidence 489999999999999 99999999999999988764
No 31
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.49 E-value=0.00044 Score=60.83 Aligned_cols=57 Identities=26% Similarity=0.183 Sum_probs=44.5
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchh-hhhhhhc--CCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTS-KLVVWSI--NHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~-~~~~~~~--~~~~~~~~~i~r~ 156 (169)
.++.|.++.+++||+ ++||++||+|+++++.++.... +.++..+ +....+.+.+.|.
T Consensus 112 ~~~~V~s~~~~~PA~------kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~ 171 (406)
T COG0793 112 GGVKVVSPIDGSPAA------KAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRA 171 (406)
T ss_pred CCcEEEecCCCChHH------HcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEc
Confidence 589999999999999 9999999999999988776663 3333333 3344668888884
No 32
>PF12812 PDZ_1: PDZ-like domain
Probab=97.46 E-value=0.00052 Score=46.98 Aligned_cols=64 Identities=14% Similarity=-0.000 Sum_probs=52.2
Q ss_pred cceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh
Q psy18070 64 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW 141 (169)
Q Consensus 64 ~~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~ 141 (169)
-.|+|..+++|+-+.++++++. -|+++.....++++. ..|+.+|-+|+++|+.++.+-.++...
T Consensus 8 v~~~Ga~f~~Ls~q~aR~~~~~--------~~gv~v~~~~g~~~~------~~~i~~g~iI~~Vn~kpt~~Ld~f~~v 71 (78)
T PF12812_consen 8 VEVCGAVFHDLSYQQARQYGIP--------VGGVYVAVSGGSLAF------AGGISKGFIITSVNGKPTPDLDDFIKV 71 (78)
T ss_pred EEEcCeecccCCHHHHHHhCCC--------CCEEEEEecCCChhh------hCCCCCCeEEEeECCcCCcCHHHHHHH
Confidence 3679999999999999999875 456666778899998 666999999999998887776655543
No 33
>KOG3553|consensus
Probab=97.46 E-value=9.9e-05 Score=52.98 Aligned_cols=32 Identities=28% Similarity=0.302 Sum_probs=30.1
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCcee
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLA 131 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~ 131 (169)
.|++|++|.+||||+ .||||.+|-|+.+|+-.
T Consensus 59 ~GiYvT~V~eGsPA~------~AGLrihDKIlQvNG~D 90 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAE------IAGLRIHDKILQVNGWD 90 (124)
T ss_pred ccEEEEEeccCChhh------hhcceecceEEEecCce
Confidence 699999999999999 99999999999988764
No 34
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.18 E-value=0.0007 Score=60.90 Aligned_cols=54 Identities=20% Similarity=0.179 Sum_probs=39.2
Q ss_pred ecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCce
Q psy18070 67 IGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECL 130 (169)
Q Consensus 67 lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v 130 (169)
.|+++.++.++ .-.+|++-. +...+.+|+.|.++|||+ +|||.+||.|++++++
T Consensus 439 ~gL~~~~~~~~-~~~LGl~v~---~~~g~~~i~~V~~~gPA~------~AGl~~Gd~ivai~G~ 492 (558)
T COG3975 439 FGLTFTPKPRE-AYYLGLKVK---SEGGHEKITFVFPGGPAY------KAGLSPGDKIVAINGI 492 (558)
T ss_pred cceEEEecCCC-CcccceEec---ccCCeeEEEecCCCChhH------hccCCCccEEEEEcCc
Confidence 56666555443 223333211 333678999999999999 9999999999999998
No 35
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.17 E-value=0.00057 Score=51.68 Aligned_cols=73 Identities=19% Similarity=0.099 Sum_probs=44.5
Q ss_pred ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccccccCCCC-CcEEEecCceeecchhhhh-hhh
Q psy18070 65 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFRTSAGIKP-TSSRLLGECLAQYTTSKLV-VWS 142 (169)
Q Consensus 65 ~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~-GDvI~~~~~v~~~~~~~~~-~~~ 142 (169)
+.||++++--..+- ..+.+.-|.+|.|+|||+ +|||+| .|-|+..+.....+..++. ...
T Consensus 26 g~LG~sv~~~~~~~------------~~~~~~~Vl~V~p~SPA~------~AGL~p~~DyIig~~~~~l~~~~~l~~~v~ 87 (138)
T PF04495_consen 26 GLLGISVRFESFEG------------AEEEGWHVLRVAPNSPAA------KAGLEPFFDYIIGIDGGLLDDEDDLFELVE 87 (138)
T ss_dssp SSS-EEEEEEE-TT------------GCCCEEEEEEE-TTSHHH------HTT--TTTEEEEEETTCE--STCHHHHHHH
T ss_pred CCCcEEEEEecccc------------cccceEEEeEecCCCHHH------HCCccccccEEEEccceecCCHHHHHHHHH
Confidence 67999987543210 223689999999999999 999999 6999996654433333333 223
Q ss_pred cCCCCeeEEEEEE
Q psy18070 143 INHPSITCHILLR 155 (169)
Q Consensus 143 ~~~~~~~~~~i~r 155 (169)
......+.+.+++
T Consensus 88 ~~~~~~l~L~Vyn 100 (138)
T PF04495_consen 88 ANENKPLQLYVYN 100 (138)
T ss_dssp HTTTS-EEEEEEE
T ss_pred HcCCCcEEEEEEE
Confidence 4455667787776
No 36
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.10 E-value=0.0023 Score=56.30 Aligned_cols=58 Identities=21% Similarity=0.064 Sum_probs=42.3
Q ss_pred CCcEEEEEE--------ccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcC-CCCeeEEEEEEe
Q psy18070 93 THGVLIWRV--------MYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSIN-HPSITCHILLRL 156 (169)
Q Consensus 93 ~~Gv~V~~V--------~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~-~~~~~~~~i~r~ 156 (169)
+.||+|... ..+|||+ ++|||+||+|+++|+.++.+..++...... ....+.+.+.|.
T Consensus 104 t~GVlVvg~~~v~~~~g~~~SPAa------~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~ 170 (402)
T TIGR02860 104 TKGVLVVGFSDIETEKGKIHSPGE------EAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERG 170 (402)
T ss_pred cCEEEEEEEEcccccCCCCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEEC
Confidence 379999665 2369999 999999999999998887776555433222 245567777775
No 37
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.06 E-value=0.0017 Score=60.42 Aligned_cols=29 Identities=10% Similarity=0.000 Sum_probs=26.9
Q ss_pred CcEEEEEEccCCccccccccccc-CCCCCcEEEecC
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSA-GIKPTSSRLLGE 128 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~a-GL~~GDvI~~~~ 128 (169)
.+++|.+|.|||||+ ++ ||++||+|++++
T Consensus 255 ~~~~V~~vipGsPA~------ka~gLk~GD~IlaVn 284 (667)
T PRK11186 255 DYTVINSLVAGGPAA------KSKKLSVGDKIVGVG 284 (667)
T ss_pred CeEEEEEccCCChHH------HhCCCCCCCEEEEEC
Confidence 468999999999999 98 999999999976
No 38
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=96.99 E-value=0.00091 Score=59.21 Aligned_cols=50 Identities=18% Similarity=0.056 Sum_probs=37.1
Q ss_pred EEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEE
Q psy18070 97 LIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILL 154 (169)
Q Consensus 97 ~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~ 154 (169)
+|.+|.|+|||+ ++||++||.|+++|+.++.+-.+....... ..+.+.+.
T Consensus 1 ~I~~V~pgSpAe------~AGLe~GD~IlsING~~V~Dw~D~~~~l~~--e~l~L~V~ 50 (433)
T TIGR03279 1 LISAVLPGSIAE------ELGFEPGDALVSINGVAPRDLIDYQFLCAD--EELELEVL 50 (433)
T ss_pred CcCCcCCCCHHH------HcCCCCCCEEEEECCEECCCHHHHHHHhcC--CcEEEEEE
Confidence 367899999999 999999999999999988775554433322 33455554
No 39
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.70 E-value=0.0027 Score=59.43 Aligned_cols=50 Identities=26% Similarity=0.401 Sum_probs=35.8
Q ss_pred eeccccCCCCccceEEcCCccEEEEEeeecc----------CCe--EEEEehhhHHHHHHhh
Q psy18070 3 LTGIMVKFGNSGGPLVNLDGEVIGINSMKVT----------AGI--SFAIPIDYAIEFLTNY 52 (169)
Q Consensus 3 q~da~in~GnSGGplvn~~G~vvGi~~~~~~----------~~~--~faiP~~~i~~~l~~l 52 (169)
-++.-|..||||+|++|.+||+||+++-..- ... +..|=+..+..+++.+
T Consensus 625 lstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv 686 (698)
T PF10459_consen 625 LSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV 686 (698)
T ss_pred EeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence 3567788999999999999999999997652 122 3444445555665554
No 40
>KOG3129|consensus
Probab=96.62 E-value=0.0045 Score=49.89 Aligned_cols=61 Identities=20% Similarity=0.009 Sum_probs=43.0
Q ss_pred cEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhh----hhcCCCCeeEEEEEEeeEEee
Q psy18070 95 GVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVV----WSINHPSITCHILLRLYLLVC 161 (169)
Q Consensus 95 Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~----~~~~~~~~~~~~i~r~~~~v~ 161 (169)
=++|.+|.|+|||+ +|||+.||-|++...+...+-..+.. ........+.+++.|.--.|+
T Consensus 140 Fa~V~sV~~~SPA~------~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~ 204 (231)
T KOG3129|consen 140 FAVVDSVVPGSPAD------EAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVV 204 (231)
T ss_pred eEEEeecCCCChhh------hhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEE
Confidence 36899999999999 99999999999966665555443221 122345566888888644443
No 41
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=96.13 E-value=0.044 Score=38.34 Aligned_cols=57 Identities=14% Similarity=0.004 Sum_probs=36.4
Q ss_pred CcEEEEEEccC--------CcccccccccccCC--CCCcEEEecCceeecchhhhhhhhcCC-CCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYN--------SPAYFIKFRTSAGI--KPTSSRLLGECLAQYTTSKLVVWSINH-PSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~--------spA~~~~~~~~aGL--~~GDvI~~~~~v~~~~~~~~~~~~~~~-~~~~~~~i~r~ 156 (169)
.+..|.++.++ ||-. +.|+ ++||+|+++|+.++..+.....+...+ ...+.+++.+.
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~------~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~ 79 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLA------QPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRK 79 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGG------GGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-S
T ss_pred CEEEEEEEeCCCCCCccccCCcc------CCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecC
Confidence 67889999776 5555 5554 599999999999999886666555554 44667777653
No 42
>KOG3532|consensus
Probab=96.05 E-value=0.017 Score=53.73 Aligned_cols=59 Identities=15% Similarity=0.024 Sum_probs=46.8
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhhhcCCCCeeEEEEEEeeE
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVWSINHPSITCHILLRLYL 158 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~~~~~~~~~~~~i~r~~~ 158 (169)
.-|-|..|.+++||. ++-+++|||+++++++++.+..+.......-.+.+.....|...
T Consensus 398 ~~v~v~tv~~ns~a~------k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~~~l~~~~~~ 456 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLAD------KAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDLTVLVERSLD 456 (1051)
T ss_pred eEEEEEEecCCChhh------HhcCCCcceEEEecCccchhHHHHHHHHHhcccceEEEEeeccc
Confidence 457799999999999 99999999999999999998888776655556665444444433
No 43
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.01 E-value=0.0075 Score=45.34 Aligned_cols=29 Identities=31% Similarity=0.598 Sum_probs=21.1
Q ss_pred eccccCCCCccceEEcCCccEEEEEeeec
Q psy18070 4 TGIMVKFGNSGGPLVNLDGEVIGINSMKV 32 (169)
Q Consensus 4 ~da~in~GnSGGplvn~~G~vvGi~~~~~ 32 (169)
.|.-+.+|.||.|++|.+|++||+-....
T Consensus 90 ~~~d~~~GsSGSpi~n~~g~ivGlYg~g~ 118 (132)
T PF00949_consen 90 IDLDFPKGSSGSPIFNQNGEIVGLYGNGV 118 (132)
T ss_dssp E---S-TTGTT-EEEETTSCEEEEEEEEE
T ss_pred eecccCCCCCCCceEcCCCcEEEEEccce
Confidence 34558999999999999999999976554
No 44
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=95.41 E-value=0.022 Score=43.05 Aligned_cols=36 Identities=28% Similarity=0.345 Sum_probs=29.2
Q ss_pred cccCCCCccceEEcCCccEEEEEeeeccCCeEEEEe
Q psy18070 6 IMVKFGNSGGPLVNLDGEVIGINSMKVTAGISFAIP 41 (169)
Q Consensus 6 a~in~GnSGGplvn~~G~vvGi~~~~~~~~~~faiP 41 (169)
..-+||.||-|++|-.|+||||+-...++|--.++.
T Consensus 101 g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaLS 136 (158)
T PF00944_consen 101 GVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTALS 136 (158)
T ss_dssp TS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEEE
T ss_pred CCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEEE
Confidence 345899999999999999999998887766555554
No 45
>KOG1421|consensus
Probab=95.18 E-value=0.27 Score=46.16 Aligned_cols=115 Identities=16% Similarity=0.093 Sum_probs=73.2
Q ss_pred CCCCccceEEcCCccEEEEEeeecc---CC----eEEEEehhhHHHHHHhhhhcCCccceeecceecEEEEECCHHHHHH
Q psy18070 9 KFGNSGGPLVNLDGEVIGINSMKVT---AG----ISFAIPIDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLIEQ 81 (169)
Q Consensus 9 n~GnSGGplvn~~G~vvGi~~~~~~---~~----~~faiP~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~~~ 81 (169)
..++| |-+.|-+|+|+++=-.-.. ++ +-|-+.+..+..+++.|+.++ ......+|+.+..++-..++.
T Consensus 677 T~c~s-g~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~----~~rp~i~~vef~~i~laqar~ 751 (955)
T KOG1421|consen 677 TSCLS-GRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGP----SARPTIAGVEFSHITLAQART 751 (955)
T ss_pred ccccc-eEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCC----CCCceeeccceeeEEeehhhc
Confidence 34455 4899999999995332222 22 346667778999999999887 444555677776666555555
Q ss_pred hhcccC------CCCCCC-CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh
Q psy18070 82 LRRDRH------IPYDLT-HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK 137 (169)
Q Consensus 82 ~~~~~~------~~~~~~-~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~ 137 (169)
+|++.. -..... +=.+|+.|.+.-+.. |..||||+++|+.-+..-.+
T Consensus 752 lglp~e~imk~e~es~~~~ql~~ishv~~~~~ki---------l~~gdiilsvngk~itr~~d 805 (955)
T KOG1421|consen 752 LGLPSEFIMKSEEESTIPRQLYVISHVRPLLHKI---------LGVGDIILSVNGKMITRLSD 805 (955)
T ss_pred cCCCHHHHhhhhhcCCCcceEEEEEeeccCcccc---------cccccEEEEecCeEEeeehh
Confidence 554311 000111 235678898766544 99999999988776554433
No 46
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=94.86 E-value=0.034 Score=46.51 Aligned_cols=33 Identities=30% Similarity=0.527 Sum_probs=25.4
Q ss_pred cCCCCccceEEcCCccEEEEEeeeccCCeEEEE
Q psy18070 8 VKFGNSGGPLVNLDGEVIGINSMKVTAGISFAI 40 (169)
Q Consensus 8 in~GnSGGplvn~~G~vvGi~~~~~~~~~~fai 40 (169)
.+||+||.|++..+|.+|||.+.+...|.++.-
T Consensus 205 T~~GDSGSPVVt~dg~liGVHTGSn~~G~g~vT 237 (297)
T PF05579_consen 205 TGPGDSGSPVVTEDGDLIGVHTGSNKRGSGAVT 237 (297)
T ss_dssp S-GGCTT-EEEETTC-EEEEEEEEETTTEEEEE
T ss_pred cCCCCCCCccCcCCCCEEEEEecCCCcCceEEE
Confidence 479999999999999999999988776666543
No 47
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=94.84 E-value=0.052 Score=45.57 Aligned_cols=60 Identities=10% Similarity=0.063 Sum_probs=41.9
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhh--hhhhcCCCCeeEEEEEEe
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKL--VVWSINHPSITCHILLRL 156 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~--~~~~~~~~~~~~~~i~r~ 156 (169)
.|+.=-.|.|+.+++ .-.++|||+|||++++|++...+..+. +...+.....+.+++.|.
T Consensus 204 ~Gl~GYrl~Pgkd~~---lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRd 265 (276)
T PRK09681 204 EGIVGYAVKPGADRS---LFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRK 265 (276)
T ss_pred CCceEEEECCCCcHH---HHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEEC
Confidence 462224567775543 123799999999999999987766543 344566777889999995
No 48
>KOG3571|consensus
Probab=94.36 E-value=0.12 Score=46.79 Aligned_cols=59 Identities=10% Similarity=0.074 Sum_probs=41.7
Q ss_pred CCCcEEEEEEccCCccccccccccc-CCCCCcEEEecCceeecch-----hhhhhhhcCCCCeeEEEEEEe
Q psy18070 92 LTHGVLIWRVMYNSPAYFIKFRTSA-GIKPTSSRLLGECLAQYTT-----SKLVVWSINHPSITCHILLRL 156 (169)
Q Consensus 92 ~~~Gv~V~~V~~~spA~~~~~~~~a-GL~~GDvI~~~~~v~~~~~-----~~~~~~~~~~~~~~~~~i~r~ 156 (169)
...|++|.++.+++.-+ .- -|.+||.|+.+|.+...+- .+.+.....+++.+.+++-+-
T Consensus 275 gDggIYVgsImkgGAVA------~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltvAk~ 339 (626)
T KOG3571|consen 275 GDGGIYVGSIMKGGAVA------LDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTVAKC 339 (626)
T ss_pred CCCceEEeeeccCceee------ccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEEeec
Confidence 34799999999999777 44 4999999999887754332 233333456777777776553
No 49
>KOG3580|consensus
Probab=93.77 E-value=0.082 Score=48.87 Aligned_cols=36 Identities=19% Similarity=0.195 Sum_probs=32.6
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT 135 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~ 135 (169)
-|+.|..|..+|||+ +-||+.||-|+.+|.+...+-
T Consensus 429 VGIFVaGvqegspA~------~eGlqEGDQIL~VN~vdF~nl 464 (1027)
T KOG3580|consen 429 VGIFVAGVQEGSPAE------QEGLQEGDQILKVNTVDFRNL 464 (1027)
T ss_pred eeEEEeecccCCchh------hccccccceeEEeccccchhh
Confidence 599999999999999 999999999999988875554
No 50
>KOG2921|consensus
Probab=93.03 E-value=0.099 Score=45.99 Aligned_cols=41 Identities=17% Similarity=0.084 Sum_probs=34.0
Q ss_pred CCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh
Q psy18070 92 LTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK 137 (169)
Q Consensus 92 ~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~ 137 (169)
...||.|++|...||+. +.+ ||.+||+|++.++.++.+..+
T Consensus 218 ~g~gV~Vtev~~~Spl~----gpr-GL~vgdvitsldgcpV~~v~d 258 (484)
T KOG2921|consen 218 HGEGVTVTEVPSVSPLF----GPR-GLSVGDVITSLDGCPVHKVSD 258 (484)
T ss_pred cCceEEEEeccccCCCc----Ccc-cCCccceEEecCCcccCCHHH
Confidence 34699999999999998 334 999999999988887777644
No 51
>KOG3605|consensus
Probab=92.29 E-value=0.26 Score=45.84 Aligned_cols=82 Identities=13% Similarity=0.091 Sum_probs=59.0
Q ss_pred EEEehhhHHHHHHhhhhcCCccceeec----ceecEEEEECCHHHHHHhhcccCCCCCCCCcEEEEEEccCCcccccccc
Q psy18070 38 FAIPIDYAIEFLTNYKRKDIDRTITHK----KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYFIKFR 113 (169)
Q Consensus 38 faiP~~~i~~~l~~l~~~g~~~~~~~~----~~lGi~~~~l~~~~~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~~~~~ 113 (169)
--+|.+..+.+++.++++- .++. .--=.++.-.-|+++.+||. ....||+++=. .|+-|+
T Consensus 707 VGLPLstcQs~Ik~~KnQT----~VkltiV~cpPV~~V~I~RPd~kyQLGF------SVQNGiICSLl-RGGIAE----- 770 (829)
T KOG3605|consen 707 VGLPLSTCQSIIKGLKNQT----AVKLNIVSCPPVTTVLIRRPDLRYQLGF------SVQNGIICSLL-RGGIAE----- 770 (829)
T ss_pred ccccHHHHHHHHhcccccc----eEEEEEecCCCceEEEeecccchhhccc------eeeCcEeehhh-cccchh-----
Confidence 3478888999998888775 3222 11112233334778888887 45589988755 589999
Q ss_pred cccCCCCCcEEEecCceeecchh
Q psy18070 114 TSAGIKPTSSRLLGECLAQYTTS 136 (169)
Q Consensus 114 ~~aGL~~GDvI~~~~~v~~~~~~ 136 (169)
|.|+|.|-.|+++|+..+..+-
T Consensus 771 -RGGVRVGHRIIEINgQSVVA~p 792 (829)
T KOG3605|consen 771 -RGGVRVGHRIIEINGQSVVATP 792 (829)
T ss_pred -ccCceeeeeEEEECCceEEecc
Confidence 9999999999999988877663
No 52
>KOG3550|consensus
Probab=92.02 E-value=0.31 Score=37.70 Aligned_cols=36 Identities=17% Similarity=0.115 Sum_probs=32.0
Q ss_pred CcEEEEEEccCCcccccccccc-cCCCCCcEEEecCceeecch
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTS-AGIKPTSSRLLGECLAQYTT 135 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~-aGL~~GDvI~~~~~v~~~~~ 135 (169)
+-++|+.+.||+-|+ + .||+.||-++++|++.+...
T Consensus 115 spiyisriipggvad------rhgglkrgdqllsvngvsvege 151 (207)
T KOG3550|consen 115 SPIYISRIIPGGVAD------RHGGLKRGDQLLSVNGVSVEGE 151 (207)
T ss_pred CceEEEeecCCcccc------ccCcccccceeEeecceeecch
Confidence 579999999999999 6 58999999999999887655
No 53
>KOG3580|consensus
Probab=91.70 E-value=0.37 Score=44.73 Aligned_cols=64 Identities=11% Similarity=0.081 Sum_probs=42.8
Q ss_pred CCCCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh-hhhhhcCCCCeeEEEEEEeeEEee
Q psy18070 91 DLTHGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK-LVVWSINHPSITCHILLRLYLLVC 161 (169)
Q Consensus 91 ~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~-~~~~~~~~~~~~~~~i~r~~~~v~ 161 (169)
+..+-++|++|.||+||+ .-||.||-|+.+|++...+... +++...++.+...-+..++-..|+
T Consensus 37 ~getSiViSDVlpGGPAe-------G~LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~ItvkRprkvq 101 (1027)
T KOG3580|consen 37 NGETSIVISDVLPGGPAE-------GLLQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITVKRPRKVQ 101 (1027)
T ss_pred CCceeEEEeeccCCCCcc-------cccccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEecccceee
Confidence 445679999999999999 4599999999999997666533 233344444444333333333333
No 54
>KOG3209|consensus
Probab=91.56 E-value=0.23 Score=46.74 Aligned_cols=51 Identities=18% Similarity=0.081 Sum_probs=36.8
Q ss_pred EEEEccCCcccccccccccC-CCCCcEEEecCceeecchhhhh--hhhcCCCCeeEEEEE
Q psy18070 98 IWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTTSKLV--VWSINHPSITCHILL 154 (169)
Q Consensus 98 V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~~~~~~i~ 154 (169)
|-+|.+||||+ +.| |+.||.|+++|+..+.+-.... .+..+..-.+.++|.
T Consensus 782 iGrIieGSPAd------RCgkLkVGDrilAVNG~sI~~lsHadiv~LIKdaGlsVtLtIi 835 (984)
T KOG3209|consen 782 IGRIIEGSPAD------RCGKLKVGDRILAVNGQSILNLSHADIVSLIKDAGLSVTLTII 835 (984)
T ss_pred ccccccCChhH------hhccccccceEEEecCeeeeccCchhHHHHHHhcCceEEEEEc
Confidence 77889999999 876 9999999999998887764433 233333444455553
No 55
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=90.88 E-value=0.24 Score=41.07 Aligned_cols=32 Identities=25% Similarity=0.270 Sum_probs=28.5
Q ss_pred EeeccccCCCCccceEEcCCccEEEEEeeecc
Q psy18070 2 SLTGIMVKFGNSGGPLVNLDGEVIGINSMKVT 33 (169)
Q Consensus 2 iq~da~in~GnSGGplvn~~G~vvGi~~~~~~ 33 (169)
++-|+-+-||+||.|+++.+.+|+|+......
T Consensus 194 l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~ 225 (251)
T COG3591 194 LFYDADTLPGSSGSPVLISKDEVIGVHYNGPG 225 (251)
T ss_pred EEEEecccCCCCCCceEecCceEEEEEecCCC
Confidence 67788999999999999999999999887654
No 56
>KOG3542|consensus
Probab=90.61 E-value=0.2 Score=47.04 Aligned_cols=57 Identities=14% Similarity=0.031 Sum_probs=39.1
Q ss_pred CCcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecch-hhhhhhhcCCCCeeEEEEEE
Q psy18070 93 THGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT-SKLVVWSINHPSITCHILLR 155 (169)
Q Consensus 93 ~~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~-~~~~~~~~~~~~~~~~~i~r 155 (169)
..|++|.+|.|+|-|+ +.||+.||-|+++|+....+- +..+........-..+++.-
T Consensus 561 GfgifV~~V~pgskAa------~~GlKRgDqilEVNgQnfenis~~KA~eiLrnnthLtltvKt 618 (1283)
T KOG3542|consen 561 GFGIFVAEVFPGSKAA------REGLKRGDQILEVNGQNFENISAKKAEEILRNNTHLTLTVKT 618 (1283)
T ss_pred cceeEEeeecCCchHH------HhhhhhhhhhhhccccchhhhhHHHHHHHhcCCceEEEEEec
Confidence 3589999999999999 999999999999887654433 22233333333333444433
No 57
>KOG3834|consensus
Probab=90.15 E-value=0.39 Score=42.61 Aligned_cols=63 Identities=16% Similarity=0.059 Sum_probs=47.3
Q ss_pred EEEEEEccCCcccccccccccCCC-CCcEEEec-CceeecchhhhhhhhcCCCCeeEEEEEEeeEEeeccc
Q psy18070 96 VLIWRVMYNSPAYFIKFRTSAGIK-PTSSRLLG-ECLAQYTTSKLVVWSINHPSITCHILLRLYLLVCSEL 164 (169)
Q Consensus 96 v~V~~V~~~spA~~~~~~~~aGL~-~GDvI~~~-~~v~~~~~~~~~~~~~~~~~~~~~~i~r~~~~v~~~~ 164 (169)
.-|-+|.++|||+ .|||+ -+|-|+.+ +.+-..++..+.+...+....+++.+|.-+.-.|.++
T Consensus 111 wHvl~V~p~SPaa------lAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReV 175 (462)
T KOG3834|consen 111 WHVLSVEPNSPAA------LAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREV 175 (462)
T ss_pred eeeeecCCCCHHH------hcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceE
Confidence 4467889999999 99999 68999987 6665555555555666677788888888777666554
No 58
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=89.83 E-value=0.95 Score=38.87 Aligned_cols=55 Identities=16% Similarity=0.137 Sum_probs=41.8
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhhhh--hcCCCCeeEEEEEE
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW--SINHPSITCHILLR 155 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~--~~~~~~~~~~~i~r 155 (169)
.||++..|..+|||. .-|+.||-|+++++.+..+..++.-. +.+....+.+...|
T Consensus 130 ~gvyv~~v~~~~~~~-------gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r 186 (342)
T COG3480 130 AGVYVLSVIDNSPFK-------GKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYER 186 (342)
T ss_pred eeEEEEEccCCcchh-------ceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEe
Confidence 699999999999998 45999999999988887777666633 33334455666665
No 59
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitus C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=88.47 E-value=0.49 Score=35.81 Aligned_cols=32 Identities=34% Similarity=0.596 Sum_probs=22.3
Q ss_pred CCCCccceEEcCCccEEEEEeeecc-CCeEEEE
Q psy18070 9 KFGNSGGPLVNLDGEVIGINSMKVT-AGISFAI 40 (169)
Q Consensus 9 n~GnSGGplvn~~G~vvGi~~~~~~-~~~~fai 40 (169)
-.|.||||++-.+|.+|||-.+... .+..-+|
T Consensus 106 lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i 138 (148)
T PF02907_consen 106 LKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAI 138 (148)
T ss_dssp HTT-TT-EEEETTSEEEEEEEEEEEETTEEEEE
T ss_pred EecCCCCcccCCCCCEEEEEEEEEEcCCceeeE
Confidence 3699999999999999999876653 3444333
No 60
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=88.13 E-value=0.95 Score=33.76 Aligned_cols=35 Identities=29% Similarity=0.304 Sum_probs=25.2
Q ss_pred eccccCCCCccceEEcCCccEEEEEeeeccCCeEEE
Q psy18070 4 TGIMVKFGNSGGPLVNLDGEVIGINSMKVTAGISFA 39 (169)
Q Consensus 4 ~da~in~GnSGGplvn~~G~vvGi~~~~~~~~~~fa 39 (169)
...+..||.-||+|+ .+--|+||.+++...-.+|+
T Consensus 83 g~Gp~~PGdCGg~L~-C~HGViGi~Tagg~g~VaF~ 117 (127)
T PF00947_consen 83 GEGPAEPGDCGGILR-CKHGVIGIVTAGGEGHVAFA 117 (127)
T ss_dssp EE-SSSTT-TCSEEE-ETTCEEEEEEEEETTEEEEE
T ss_pred ecccCCCCCCCceeE-eCCCeEEEEEeCCCceEEEE
Confidence 345789999999999 55569999999875434443
No 61
>KOG3209|consensus
Probab=84.78 E-value=2.1 Score=40.62 Aligned_cols=59 Identities=14% Similarity=0.118 Sum_probs=44.9
Q ss_pred CCCcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecch-hhhhhhhcCCCCeeEEEEEEe
Q psy18070 92 LTHGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTT-SKLVVWSINHPSITCHILLRL 156 (169)
Q Consensus 92 ~~~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~-~~~~~~~~~~~~~~~~~i~r~ 156 (169)
+.-+++|.+..+++||. +.| ++.||-|+++|+...... -..++..++..+...++++|+
T Consensus 921 ynM~LfVLRlAeDGPA~------rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~gg~~vll~Lr~ 981 (984)
T KOG3209|consen 921 YNMDLFVLRLAEDGPAI------RDGRMRVGDQITEINGESTKGMTHDRAIELIKQGGRRVLLLLRR 981 (984)
T ss_pred cccceEEEEeccCCCcc------ccCceeecceEEEecCcccCCCcHHHHHHHHHhCCeEEEEEecc
Confidence 34579999999999999 876 999999999988765554 334556677777666666664
No 62
>KOG0606|consensus
Probab=83.37 E-value=0.95 Score=44.56 Aligned_cols=34 Identities=18% Similarity=0.025 Sum_probs=28.9
Q ss_pred EEEEEEccCCcccccccccccCCCCCcEEEecCceeecch
Q psy18070 96 VLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTT 135 (169)
Q Consensus 96 v~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~ 135 (169)
-.|+.|.++|||. .+|+++||.|+.+++..+...
T Consensus 660 h~v~sv~egsPA~------~agls~~DlIthvnge~v~gl 693 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAF------EAGLSAGDLITHVNGEPVHGL 693 (1205)
T ss_pred eeeeeecCCCCcc------ccCCCccceeEeccCcccchh
Confidence 5789999999999 899999999999886554443
No 63
>KOG3651|consensus
Probab=81.01 E-value=2.1 Score=36.76 Aligned_cols=42 Identities=19% Similarity=0.169 Sum_probs=35.3
Q ss_pred CcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecchhhhhhh
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTTSKLVVW 141 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~~~~~~~ 141 (169)
.=++|..|-.++||+ +-| ++.||-|+++|++.+...-+..+.
T Consensus 30 PClYiVQvFD~tPAa------~dG~i~~GDEi~avNg~svKGktKveVA 72 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAA------KDGRIRCGDEIVAVNGISVKGKTKVEVA 72 (429)
T ss_pred CeEEEEEeccCCchh------ccCccccCCeeEEecceeecCccHHHHH
Confidence 468999999999999 765 999999999999988776555544
No 64
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=80.77 E-value=2.2 Score=35.35 Aligned_cols=56 Identities=11% Similarity=0.006 Sum_probs=37.6
Q ss_pred cEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhh--hhhhhcCCCCeeEEEEEEe
Q psy18070 95 GVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSK--LVVWSINHPSITCHILLRL 156 (169)
Q Consensus 95 Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~--~~~~~~~~~~~~~~~i~r~ 156 (169)
|-.+.=..++|.-+ ..|||+||+-+++|+....+..+ .++..+......++++.|+
T Consensus 208 Gyr~~pgkd~slF~------~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~ 265 (275)
T COG3031 208 GYRFEPGKDGSLFY------KSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRR 265 (275)
T ss_pred EEEecCCCCcchhh------hhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEec
Confidence 33344444555555 89999999999988876555433 3344556667778888885
No 65
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=79.48 E-value=3.4 Score=34.01 Aligned_cols=40 Identities=30% Similarity=0.532 Sum_probs=23.4
Q ss_pred eccccCCCCccceEEcCC-ccEEEEEeeecc-CCeEEEEehh
Q psy18070 4 TGIMVKFGNSGGPLVNLD-GEVIGINSMKVT-AGISFAIPID 43 (169)
Q Consensus 4 ~da~in~GnSGGplvn~~-G~vvGi~~~~~~-~~~~faiP~~ 43 (169)
+-.+...|.=|.|+|+.. |.+||+-++... ...+|+.|+.
T Consensus 144 HwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~~~N~F~~f~ 185 (235)
T PF00863_consen 144 HWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTSSRNYFTPFP 185 (235)
T ss_dssp E-C---TT-TT-EEEETTT--EEEEEEEEETTTSSEEEEE--
T ss_pred EEecCCCCccCCcEEEcCCCcEEEEEcCccCCCCeEEEEcCC
Confidence 345678999999999975 999999998764 4456766653
No 66
>KOG3552|consensus
Probab=75.98 E-value=4 Score=39.86 Aligned_cols=55 Identities=11% Similarity=0.044 Sum_probs=37.4
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEecCceeecchhhhh-hhhc-CCCCeeEEEEEE
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV-VWSI-NHPSITCHILLR 155 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~-~~~~-~~~~~~~~~i~r 155 (169)
.-|+|..|.+|+|+. ..|+|||-|+++|+-.+...-+.. +... .-...+.+++.+
T Consensus 75 rPviVr~VT~GGps~-------GKL~PGDQIl~vN~Epv~daprervIdlvRace~sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSI-------GKLQPGDQILAVNGEPVKDAPRERVIDLVRACESSVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCcc-------ccccCCCeEEEecCcccccccHHHHHHHHHHHhhhcceEEec
Confidence 358999999999999 569999999998877665553222 2222 223445666555
No 67
>KOG1892|consensus
Probab=75.67 E-value=3.1 Score=40.82 Aligned_cols=38 Identities=13% Similarity=0.107 Sum_probs=30.9
Q ss_pred CCCCcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecc
Q psy18070 91 DLTHGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYT 134 (169)
Q Consensus 91 ~~~~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~ 134 (169)
+..-|++|.+|.+|++|+ .-| |+.||-++++++.....
T Consensus 957 q~klGIYvKsVV~GgaAd------~DGRL~aGDQLLsVdG~SLiG 995 (1629)
T KOG1892|consen 957 QRKLGIYVKSVVEGGAAD------HDGRLEAGDQLLSVDGHSLIG 995 (1629)
T ss_pred ccccceEEEEeccCCccc------cccccccCceeeeecCccccc
Confidence 334599999999999999 544 99999999988776443
No 68
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=74.07 E-value=5.1 Score=37.62 Aligned_cols=44 Identities=18% Similarity=0.224 Sum_probs=34.8
Q ss_pred cCCCCccceEEcCCcc------EEEEEeeecc--CCeEEEEehhhHHHHHHh
Q psy18070 8 VKFGNSGGPLVNLDGE------VIGINSMKVT--AGISFAIPIDYAIEFLTN 51 (169)
Q Consensus 8 in~GnSGGplvn~~G~------vvGi~~~~~~--~~~~faiP~~~i~~~l~~ 51 (169)
-.+|.||.-+++.-+. |+||.++... -.+|++.|++.+..-|++
T Consensus 636 a~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~ 687 (695)
T PF08192_consen 636 ASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEE 687 (695)
T ss_pred cCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHH
Confidence 3579999999998766 9999998654 258999998877666654
No 69
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=73.83 E-value=7.1 Score=33.29 Aligned_cols=50 Identities=14% Similarity=0.117 Sum_probs=34.1
Q ss_pred EEccCCcccccccccccCCCCCcEEEecCceeecchhhhh--hhhcCCCC--eeEEEEEE
Q psy18070 100 RVMYNSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLV--VWSINHPS--ITCHILLR 155 (169)
Q Consensus 100 ~V~~~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~--~~~~~~~~--~~~~~i~r 155 (169)
.+..+|+|+ .+|+++||.|++.++.+..+-.+.. ........ ...+.+.|
T Consensus 135 ~v~~~s~a~------~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~ 188 (375)
T COG0750 135 EVAPKSAAA------LAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIR 188 (375)
T ss_pred ecCCCCHHH------HcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEe
Confidence 688999999 9999999999997777666554332 22222222 25666666
No 70
>KOG3627|consensus
Probab=72.93 E-value=2.9 Score=33.34 Aligned_cols=26 Identities=42% Similarity=0.506 Sum_probs=21.5
Q ss_pred cCCCCccceEEcCC---ccEEEEEeeecc
Q psy18070 8 VKFGNSGGPLVNLD---GEVIGINSMKVT 33 (169)
Q Consensus 8 in~GnSGGplvn~~---G~vvGi~~~~~~ 33 (169)
...|+|||||+-.+ ..++||++....
T Consensus 201 ~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 201 ACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred cccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 56799999999776 699999987654
No 71
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=70.87 E-value=5.7 Score=31.49 Aligned_cols=28 Identities=29% Similarity=0.184 Sum_probs=25.6
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEEec
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRLLG 127 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~~~ 127 (169)
..+.|..|..||||+ ++|+.-++.|+++
T Consensus 122 ~~~~Vd~v~fgS~A~------~~g~d~d~~I~~v 149 (183)
T PF11874_consen 122 GKVIVDEVEFGSPAE------KAGIDFDWEITEV 149 (183)
T ss_pred CEEEEEecCCCCHHH------HcCCCCCcEEEEE
Confidence 568999999999999 9999999999883
No 72
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=67.13 E-value=7.4 Score=23.43 Aligned_cols=20 Identities=45% Similarity=0.642 Sum_probs=16.8
Q ss_pred CCccceEEcCCccEEEEEee
Q psy18070 11 GNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 11 GnSGGplvn~~G~vvGi~~~ 30 (169)
+-+.-|++|.+|+++|+.+.
T Consensus 29 ~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 29 GISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp TSSEEEEESTTSBEEEEEEH
T ss_pred CCcEEEEEecCCEEEEEEEH
Confidence 45678999999999999774
No 73
>KOG3606|consensus
Probab=65.20 E-value=5.6 Score=33.70 Aligned_cols=38 Identities=18% Similarity=0.147 Sum_probs=32.3
Q ss_pred CCCcEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecch
Q psy18070 92 LTHGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTT 135 (169)
Q Consensus 92 ~~~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~ 135 (169)
...|+.|++..||+-|+ .-| |..+|-++++|++++..-
T Consensus 192 kvpGIFISRlVpGGLAe------STGLLaVnDEVlEVNGIEVaGK 230 (358)
T KOG3606|consen 192 KVPGIFISRLVPGGLAE------STGLLAVNDEVLEVNGIEVAGK 230 (358)
T ss_pred ccCceEEEeecCCcccc------ccceeeecceeEEEcCEEeccc
Confidence 34799999999999999 777 568999999999987543
No 74
>KOG0609|consensus
Probab=62.63 E-value=18 Score=33.17 Aligned_cols=35 Identities=17% Similarity=0.136 Sum_probs=31.3
Q ss_pred cEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecch
Q psy18070 95 GVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTT 135 (169)
Q Consensus 95 Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~ 135 (169)
-++|..+..|+-|. +.| |+.||.|.++|++.+.+.
T Consensus 147 ~~~vARI~~GG~~~------r~glL~~GD~i~EvNGi~v~~~ 182 (542)
T KOG0609|consen 147 KVVVARIMHGGMAD------RQGLLHVGDEILEVNGISVANK 182 (542)
T ss_pred ccEEeeeccCCcch------hccceeeccchheecCeecccC
Confidence 59999999999999 887 789999999999987776
No 75
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=56.56 E-value=25 Score=22.96 Aligned_cols=33 Identities=30% Similarity=0.511 Sum_probs=25.7
Q ss_pred ceEEcCCccEEEEEeeeccCCeEEEEehhhHHHHHHhhhhc
Q psy18070 15 GPLVNLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYKRK 55 (169)
Q Consensus 15 Gplvn~~G~vvGi~~~~~~~~~~faiP~~~i~~~l~~l~~~ 55 (169)
-|+.+.+|+++|+.. ..+..+.+.++++++.-+
T Consensus 19 ~pi~~~~g~~~Gvv~--------~di~l~~l~~~i~~~~~~ 51 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVG--------IDISLDQLSEIISNIKFG 51 (81)
T ss_dssp EEEEETTTEEEEEEE--------EEEEHHHHHHHHTTSBBT
T ss_pred EEEECCCCCEEEEEE--------EEeccceeeeEEEeeEEC
Confidence 488888999999865 457788888888776543
No 76
>cd00218 GlcAT-I Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a Key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately. The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43).
Probab=54.68 E-value=13 Score=30.38 Aligned_cols=31 Identities=23% Similarity=0.400 Sum_probs=23.7
Q ss_pred cceEEcCCccEEEEEeeecc------CCeEEEEehhhH
Q psy18070 14 GGPLVNLDGEVIGINSMKVT------AGISFAIPIDYA 45 (169)
Q Consensus 14 GGplvn~~G~vvGi~~~~~~------~~~~faiP~~~i 45 (169)
-||+++ +|+|+|+.+.-.. +-.|||+-+..+
T Consensus 136 egP~c~-~gkV~gw~~~w~~~R~f~idmAGFA~n~~ll 172 (223)
T cd00218 136 EGPVCE-NGKVVGWHTAWKPERPFPIDMAGFAFNSKLL 172 (223)
T ss_pred eccEee-CCeEeEEecCCCCCCCCcceeeeEEEehhhh
Confidence 379998 9999999987543 235899987654
No 77
>KOG3605|consensus
Probab=53.11 E-value=9.8 Score=35.87 Aligned_cols=52 Identities=12% Similarity=0.112 Sum_probs=34.5
Q ss_pred EEEEccCCcccccccccccC-CCCCcEEEecCceeecc----hhhhhhhhcCCCCeeEEEEEE
Q psy18070 98 IWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYT----TSKLVVWSINHPSITCHILLR 155 (169)
Q Consensus 98 V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~----~~~~~~~~~~~~~~~~~~i~r 155 (169)
|...+.++||+ +.| |..||-|+++|+....- +-+..+...+....+++++.+
T Consensus 677 iAnmm~~GpAa------rsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~ 733 (829)
T KOG3605|consen 677 IANMMHGGPAA------RSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS 733 (829)
T ss_pred HHhcccCChhh------hcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence 33458899999 887 99999999999876543 234444444555555555443
No 78
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=46.29 E-value=22 Score=28.52 Aligned_cols=40 Identities=20% Similarity=0.153 Sum_probs=16.2
Q ss_pred eeccccCCCCccceEEcCCccEEEEEeeec----cCCeEEEEehh
Q psy18070 3 LTGIMVKFGNSGGPLVNLDGEVIGINSMKV----TAGISFAIPID 43 (169)
Q Consensus 3 q~da~in~GnSGGplvn~~G~vvGi~~~~~----~~~~~faiP~~ 43 (169)
+.-....+|.||.|+++.. ++||+-.... .++.++.-|+.
T Consensus 139 ~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip 182 (203)
T PF02122_consen 139 SVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIP 182 (203)
T ss_dssp EE-----TT-TT-EEE-SS--EEEEEEEE----------------
T ss_pred ceEcCCCCCCCCCCeEECC-CceEeecCccccccccccccccccc
Confidence 3445667999999999988 9999988741 25566665553
No 79
>KOG3834|consensus
Probab=46.04 E-value=30 Score=31.02 Aligned_cols=58 Identities=17% Similarity=0.157 Sum_probs=40.9
Q ss_pred CCCCcEEEEEEccCCcccccccccccCCCC-CcEEEecCceeecchhhh--hhhhcCCCCeeEEEEEE
Q psy18070 91 DLTHGVLIWRVMYNSPAYFIKFRTSAGIKP-TSSRLLGECLAQYTTSKL--VVWSINHPSITCHILLR 155 (169)
Q Consensus 91 ~~~~Gv~V~~V~~~spA~~~~~~~~aGL~~-GDvI~~~~~v~~~~~~~~--~~~~~~~~~~~~~~i~r 155 (169)
..+.|--|.+|.++|||. ++||.+ -|-|++++++....+-+. .++....+. ++++++.
T Consensus 12 ggteg~hvlkVqedSpa~------~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n 72 (462)
T KOG3834|consen 12 GGTEGYHVLKVQEDSPAH------KAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYN 72 (462)
T ss_pred CCceeEEEEEeecCChHH------hcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEe
Confidence 445688999999999999 999998 688999888877644222 233333333 6666654
No 80
>KOG3549|consensus
Probab=40.79 E-value=39 Score=29.77 Aligned_cols=53 Identities=13% Similarity=-0.010 Sum_probs=36.8
Q ss_pred cEEEEEEccCCcccccccccccC-CCCCcEEEecCceeecchhhhh-h-hhcCCCCeeEEEE
Q psy18070 95 GVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLAQYTTSKLV-V-WSINHPSITCHIL 153 (169)
Q Consensus 95 Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~~~~~~~~~-~-~~~~~~~~~~~~i 153 (169)
-|+|+.+-++-.|+ ..| |-.||.|+.+|++.+...-... + ..-+.++.+.+++
T Consensus 81 PvviSkI~kdQaAd------~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNAGdeVtlTV 136 (505)
T KOG3549|consen 81 PVVISKIYKDQAAD------ITGQLFVGDAILQVNGIYVTACPHEEVVNILRNAGDEVTLTV 136 (505)
T ss_pred cEEeehhhhhhhhh------hcCceEeeeeeEEeccEEeecCChHHHHHHHHhcCCEEEEEe
Confidence 48999999888777 555 7799999999999877663222 2 2334455555543
No 81
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=37.45 E-value=42 Score=30.57 Aligned_cols=31 Identities=19% Similarity=0.257 Sum_probs=21.5
Q ss_pred EEEEccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070 98 IWRVMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT 135 (169)
Q Consensus 98 V~~V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~ 135 (169)
|.-..+|-|.. .| .||||||++ +.-|++.++
T Consensus 302 vl~~~ENm~~g------~A-~rPGDVits~~GkTVEV~NT 334 (485)
T COG0260 302 VLPAVENMPSG------NA-YRPGDVITSMNGKTVEVLNT 334 (485)
T ss_pred EEeeeccCCCC------CC-CCCCCeEEecCCcEEEEccc
Confidence 44445677777 55 899999999 555565555
No 82
>cd04582 CBS_pair_ABC_OpuCA_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzi
Probab=36.69 E-value=33 Score=22.69 Aligned_cols=22 Identities=27% Similarity=0.291 Sum_probs=17.0
Q ss_pred CCCccceEEcCCccEEEEEeee
Q psy18070 10 FGNSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~~ 31 (169)
.+-+--|++|.+|+++|+.+..
T Consensus 80 ~~~~~~~Vv~~~~~~~Gvi~~~ 101 (106)
T cd04582 80 HDMSWLPCVDEDGRYVGEVTQR 101 (106)
T ss_pred CCCCeeeEECCCCcEEEEEEHH
Confidence 3445578999999999998753
No 83
>cd04596 CBS_pair_DRTGG_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=35.93 E-value=33 Score=22.92 Aligned_cols=21 Identities=29% Similarity=0.294 Sum_probs=17.3
Q ss_pred CCCccceEEcCCccEEEEEee
Q psy18070 10 FGNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~ 30 (169)
.+-..-|++|.+|+++|+.+.
T Consensus 82 ~~~~~~~Vv~~~~~~~G~it~ 102 (108)
T cd04596 82 EGIEMLPVVDDNKKLLGIISR 102 (108)
T ss_pred cCCCeeeEEcCCCCEEEEEEH
Confidence 455667999999999999875
No 84
>KOG1728|consensus
Probab=35.57 E-value=13 Score=28.17 Aligned_cols=31 Identities=26% Similarity=0.402 Sum_probs=24.5
Q ss_pred CCcccccccccccCCCCCcEEEecCceeecchhhhhhh
Q psy18070 104 NSPAYFIKFRTSAGIKPTSSRLLGECLAQYTTSKLVVW 141 (169)
Q Consensus 104 ~spA~~~~~~~~aGL~~GDvI~~~~~v~~~~~~~~~~~ 141 (169)
=||+. + .+++||+++.+++-+...+..+.++
T Consensus 111 ~SPcF------r-di~~gDiVtvGecrPLSKtvrfnVL 141 (156)
T KOG1728|consen 111 VSPCF------R-DIQEGDIVTVGECRPLSKTVRFNVL 141 (156)
T ss_pred cchhh------h-ccccCCEEEEeecccccceEEEEEE
Confidence 38898 7 7999999999998877776555544
No 85
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=35.22 E-value=53 Score=22.19 Aligned_cols=31 Identities=23% Similarity=0.367 Sum_probs=21.1
Q ss_pred CCCccceEEcCCccEEEEEeeec-c----CCeEEEE
Q psy18070 10 FGNSGGPLVNLDGEVIGINSMKV-T----AGISFAI 40 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~~~-~----~~~~fai 40 (169)
+=..|.|++..+|+.||..+... + .+++++.
T Consensus 32 ~~~~g~~v~~~~g~~vG~vTS~~~sp~~~~~Iala~ 67 (95)
T PF08669_consen 32 PPRGGEPVYDEDGKPVGRVTSGAYSPTLGKNIALAY 67 (95)
T ss_dssp --STTCEEEETTTEEEEEEEEEEEETTTTEEEEEEE
T ss_pred CCCCCCEEEECCCcEEeEEEEEeECCCCCceEEEEE
Confidence 34567899988999999877654 2 3455554
No 86
>KOG1476|consensus
Probab=35.10 E-value=27 Score=30.06 Aligned_cols=32 Identities=28% Similarity=0.576 Sum_probs=24.3
Q ss_pred ceEEcCCccEEEEEeeecc------CCeEEEEehhhHHH
Q psy18070 15 GPLVNLDGEVIGINSMKVT------AGISFAIPIDYAIE 47 (169)
Q Consensus 15 Gplvn~~G~vvGi~~~~~~------~~~~faiP~~~i~~ 47 (169)
||.++ +|+|+|++..-.. +-.|||+-..++..
T Consensus 223 ~P~v~-~~kvvg~~~~w~~~r~f~vdmaGFAvNl~lll~ 260 (330)
T KOG1476|consen 223 GPVVN-NGKVVGWHTRWEPERPFAVDMAGFAVNLKLLLD 260 (330)
T ss_pred cceec-cCeeEEEEeccccCCCCccchhhheehhhhhcc
Confidence 69998 9999999987553 33589998766544
No 87
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=34.85 E-value=43 Score=20.45 Aligned_cols=13 Identities=31% Similarity=0.736 Sum_probs=9.8
Q ss_pred EcCCccEEEEEee
Q psy18070 18 VNLDGEVIGINSM 30 (169)
Q Consensus 18 vn~~G~vvGi~~~ 30 (169)
+|.+|++|||-..
T Consensus 35 ~d~~G~ivGIEIl 47 (50)
T PF10049_consen 35 YDEDGRIVGIEIL 47 (50)
T ss_pred ECCCCCEEEEEEE
Confidence 4667999998654
No 88
>cd04618 CBS_pair_5 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=33.51 E-value=31 Score=23.26 Aligned_cols=21 Identities=14% Similarity=0.148 Sum_probs=16.4
Q ss_pred CCccceEEcCC-ccEEEEEeee
Q psy18070 11 GNSGGPLVNLD-GEVIGINSMK 31 (169)
Q Consensus 11 GnSGGplvn~~-G~vvGi~~~~ 31 (169)
+-.-=|++|.+ |+++|+.+..
T Consensus 72 ~~~~lpVvd~~~~~~~giit~~ 93 (98)
T cd04618 72 KIHRLPVIDPSTGTGLYILTSR 93 (98)
T ss_pred CCCEeeEEECCCCCceEEeehh
Confidence 44456999987 9999998754
No 89
>cd04606 CBS_pair_Mg_transporter This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE. MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=33.30 E-value=39 Score=22.59 Aligned_cols=21 Identities=24% Similarity=0.539 Sum_probs=16.6
Q ss_pred CCCccceEEcCCccEEEEEee
Q psy18070 10 FGNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~ 30 (169)
.+-...|++|.+|+++|+.+.
T Consensus 82 ~~~~~~~Vv~~~~~~~Gvit~ 102 (109)
T cd04606 82 YDLLALPVVDEEGRLVGIITV 102 (109)
T ss_pred cCCceeeeECCCCcEEEEEEh
Confidence 334567999999999999875
No 90
>smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
Probab=33.22 E-value=42 Score=17.97 Aligned_cols=20 Identities=30% Similarity=0.554 Sum_probs=15.3
Q ss_pred CCccceEEcCCccEEEEEee
Q psy18070 11 GNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 11 GnSGGplvn~~G~vvGi~~~ 30 (169)
+-+.-|+++.+++++|+.+.
T Consensus 22 ~~~~~~v~~~~~~~~g~i~~ 41 (49)
T smart00116 22 GIRRLPVVDEEGRLVGIVTR 41 (49)
T ss_pred CCCcccEECCCCeEEEEEEH
Confidence 34456889988999998764
No 91
>cd04610 CBS_pair_ParBc_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain downstream. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=32.50 E-value=41 Score=22.19 Aligned_cols=18 Identities=28% Similarity=0.453 Sum_probs=14.7
Q ss_pred ccceEEcCCccEEEEEee
Q psy18070 13 SGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 13 SGGplvn~~G~vvGi~~~ 30 (169)
+--|++|.+|+++|+.+.
T Consensus 84 ~~~~Vv~~~g~~~Gvi~~ 101 (107)
T cd04610 84 SKLPVVDENNNLVGIITN 101 (107)
T ss_pred CeEeEECCCCeEEEEEEH
Confidence 346889999999999774
No 92
>PRK09570 rpoH DNA-directed RNA polymerase subunit H; Reviewed
Probab=31.65 E-value=29 Score=23.74 Aligned_cols=16 Identities=31% Similarity=0.439 Sum_probs=12.0
Q ss_pred CCccccccccccc-CCCCCcEEE
Q psy18070 104 NSPAYFIKFRTSA-GIKPTSSRL 125 (169)
Q Consensus 104 ~spA~~~~~~~~a-GL~~GDvI~ 125 (169)
..|++ +. |+++||||-
T Consensus 43 ~DPv~------r~~g~k~GdVvk 59 (79)
T PRK09570 43 SDPVV------KAIGAKPGDVIK 59 (79)
T ss_pred cChhh------hhcCCCCCCEEE
Confidence 45666 54 999999974
No 93
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC CIC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually
Probab=30.69 E-value=51 Score=23.82 Aligned_cols=22 Identities=23% Similarity=0.099 Sum_probs=17.7
Q ss_pred CCCccceEEcCCccEEEEEeee
Q psy18070 10 FGNSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~~ 31 (169)
.+-++-|++|.+|+++|+.+..
T Consensus 22 ~~~~~~~VvD~~g~l~Givt~~ 43 (133)
T cd04592 22 EKQSCVLVVDSDDFLEGILTLG 43 (133)
T ss_pred cCCCEEEEECCCCeEEEEEEHH
Confidence 3456789999999999998743
No 94
>cd04641 CBS_pair_28 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=30.22 E-value=53 Score=22.44 Aligned_cols=22 Identities=27% Similarity=0.382 Sum_probs=17.8
Q ss_pred CCCCccceEEcCCccEEEEEee
Q psy18070 9 KFGNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 9 n~GnSGGplvn~~G~vvGi~~~ 30 (169)
..+-+.-|++|.+|+++|+.+.
T Consensus 21 ~~~~~~~pVv~~~~~~~Giv~~ 42 (120)
T cd04641 21 ERRVSALPIVDENGKVVDVYSR 42 (120)
T ss_pred HcCCCeeeEECCCCeEEEEEeH
Confidence 3455678999999999999874
No 95
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=29.80 E-value=69 Score=28.95 Aligned_cols=28 Identities=18% Similarity=0.133 Sum_probs=20.3
Q ss_pred EccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070 101 VMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT 135 (169)
Q Consensus 101 V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~ 135 (169)
..+|.|.. .+ .||||||++ +.-|++.++
T Consensus 292 ~~EN~is~------~A-~rPgDVi~s~~GkTVEI~NT 321 (468)
T cd00433 292 LAENMISG------NA-YRPGDVITSRSGKTVEILNT 321 (468)
T ss_pred eeecCCCC------CC-CCCCCEeEeCCCcEEEEecC
Confidence 34677777 55 899999999 555665554
No 96
>TIGR00612 ispG_gcpE 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase. Chlamydial members of the family have a long insert. The family is largely restricted to Bacteria, where it is widely but not universally distributed. No homology can be detected between the GcpE family and other proteins.
Probab=29.16 E-value=60 Score=28.24 Aligned_cols=37 Identities=14% Similarity=0.254 Sum_probs=29.0
Q ss_pred hhhHHHHHHhhhhcCCccceeecceecEEEEECCHHHHHHhhc
Q psy18070 42 IDYAIEFLTNYKRKDIDRTITHKKYIGITMLTLNEKLIEQLRR 84 (169)
Q Consensus 42 ~~~i~~~l~~l~~~g~~~~~~~~~~lGi~~~~l~~~~~~~~~~ 84 (169)
-+.++++++..++.+ ..-++|+.-..|++++.+.++.
T Consensus 107 ~e~v~~vv~~ak~~~------ipIRIGVN~GSL~~~~~~kyg~ 143 (346)
T TIGR00612 107 RERVRDVVEKARDHG------KAMRIGVNHGSLERRLLEKYGD 143 (346)
T ss_pred HHHHHHHHHHHHHCC------CCEEEecCCCCCcHHHHHHcCC
Confidence 467788888888777 5567999999999988887753
No 97
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=28.82 E-value=76 Score=28.88 Aligned_cols=28 Identities=21% Similarity=0.250 Sum_probs=19.8
Q ss_pred EccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070 101 VMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT 135 (169)
Q Consensus 101 V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~ 135 (169)
..+|.|.. .| .||||||++ +.-|++.++
T Consensus 306 l~ENm~~~------~A-~rPgDVi~~~~GkTVEV~NT 335 (483)
T PRK00913 306 ACENMPSG------NA-YRPGDVLTSMSGKTIEVLNT 335 (483)
T ss_pred eeccCCCC------CC-CCCCCEEEECCCcEEEeecC
Confidence 34677777 66 999999999 445555444
No 98
>TIGR02913 HAF_rpt probable extracellular repeat, HAF family. The model for this family detects a homology domain of about 40 amino acids. Member proteins always have at least two tandem copies and as many as seven. The spacing between repeats as defined here usually is four residues exactly. This repeat is named for a tripeptide motif HAF found in most members. Some member proteins are found in species with no outer membrane (archaea and Gram-positive bacteria) while others have C-terminal autotransporter domains that suggest that the repeat region is transported across the outer membrane. This domain seems likely to be an extracellular protein repeat.
Probab=28.38 E-value=62 Score=18.89 Aligned_cols=12 Identities=42% Similarity=0.880 Sum_probs=9.8
Q ss_pred EcCCccEEEEEe
Q psy18070 18 VNLDGEVIGINS 29 (169)
Q Consensus 18 vn~~G~vvGi~~ 29 (169)
+|.+|+|||...
T Consensus 4 In~~G~VvG~s~ 15 (39)
T TIGR02913 4 INNDGQVVGYST 15 (39)
T ss_pred CCCCCcEEEEEE
Confidence 688999999754
No 99
>cd04614 CBS_pair_1 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=28.32 E-value=66 Score=21.36 Aligned_cols=47 Identities=19% Similarity=0.198 Sum_probs=31.2
Q ss_pred CCCccceEEcCCccEEEEEeeec--c-CCeEEEEehhhHHHHHHhhhhcC
Q psy18070 10 FGNSGGPLVNLDGEVIGINSMKV--T-AGISFAIPIDYAIEFLTNYKRKD 56 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~~~--~-~~~~faiP~~~i~~~l~~l~~~g 56 (169)
.+-+.-|++|.+|+++|+.+..- . ..+.+.=|-+.+.+.++.+.+++
T Consensus 22 ~~~~~~~V~d~~~~~~Giv~~~dl~~~~~~~~v~~~~~l~~a~~~m~~~~ 71 (96)
T cd04614 22 ANVKALPVLDDDGKLSGIITERDLIAKSEVVTATKRTTVSECAQKMKRNR 71 (96)
T ss_pred cCCCeEEEECCCCCEEEEEEHHHHhcCCCcEEecCCCCHHHHHHHHHHhC
Confidence 45577899999999999987443 1 12344445556677777776665
No 100
>PF00883 Peptidase_M17: Cytosol aminopeptidase family, catalytic domain; InterPro: IPR000819 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine). Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear [, ]. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids []. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another []. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape []. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices []. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core []. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer []. 
The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain [].; GO: 0004177 aminopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 3KZW_L 3KQX_C 3KQZ_L 3KR4_I 3KR5_J 3T8W_C 3H8F_D 3H8G_A 3H8E_B 3IJ3_A ....
Probab=28.27 E-value=61 Score=27.79 Aligned_cols=29 Identities=17% Similarity=0.072 Sum_probs=17.0
Q ss_pred EEccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070 100 RVMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT 135 (169)
Q Consensus 100 ~V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~ 135 (169)
-..+|.|.. .+ .+|||||++ +.-|++.++
T Consensus 136 ~~~EN~i~~------~a-~~pgDVi~s~~GkTVEI~NT 166 (311)
T PF00883_consen 136 PLAENMISG------NA-YRPGDVITSMNGKTVEIGNT 166 (311)
T ss_dssp EEEEE--ST------TS-TTTTEEEE-TTS-EEEES-T
T ss_pred EcccccCCC------CC-CCCCCEEEeCCCCEEEEEee
Confidence 334677777 55 899999999 555665555
No 101
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=28.25 E-value=61 Score=26.39 Aligned_cols=28 Identities=36% Similarity=0.459 Sum_probs=19.6
Q ss_pred eccccCCCCccceEE---cCCccEEEEEeee
Q psy18070 4 TGIMVKFGNSGGPLV---NLDGEVIGINSMK 31 (169)
Q Consensus 4 ~da~in~GnSGGplv---n~~G~vvGi~~~~ 31 (169)
++.-..+|.+||||+ |-+-.|||+.+..
T Consensus 224 ~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~ 254 (282)
T PF03761_consen 224 TKQYSCKGDRGGPLVKNINGRWTLIGVGASG 254 (282)
T ss_pred cccccCCCCccCeEEEEECCCEEEEEEEccC
Confidence 344567899999998 4445678876644
No 102
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=28.23 E-value=30 Score=26.27 Aligned_cols=17 Identities=47% Similarity=0.571 Sum_probs=13.4
Q ss_pred cccCCCCccceEEcCCc
Q psy18070 6 IMVKFGNSGGPLVNLDG 22 (169)
Q Consensus 6 a~in~GnSGGplvn~~G 22 (169)
....+|+|||||+...+
T Consensus 179 ~~~c~gdsGgpl~~~~~ 195 (232)
T cd00190 179 KDACQGDSGGPLVCNDN 195 (232)
T ss_pred CccccCCCCCcEEEEeC
Confidence 44567999999998665
No 103
>PF08275 Toprim_N: DNA primase catalytic core, N-terminal domain; InterPro: IPR013264 This is the N-terminal, catalytic core domain of DNA primases. DNA primase (2.7.7 from EC) is a nucleotidyltransferase which synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork. It can also prime the leading strand and has been implicated in cell division []. ; PDB: 1EQN_E 1DD9_A 3B39_B 1DDE_A 2AU3_A.
Probab=27.63 E-value=47 Score=24.33 Aligned_cols=17 Identities=24% Similarity=0.632 Sum_probs=12.8
Q ss_pred eEEcCCccEEEEEeeec
Q psy18070 16 PLVNLDGEVIGINSMKV 32 (169)
Q Consensus 16 plvn~~G~vvGi~~~~~ 32 (169)
|+.|.+|+|||+..-..
T Consensus 82 PI~d~~G~vvgF~gR~l 98 (128)
T PF08275_consen 82 PIRDERGRVVGFGGRRL 98 (128)
T ss_dssp EEE-TTS-EEEEEEEES
T ss_pred EEEcCCCCEEEEecccC
Confidence 89999999999977655
No 104
>KOG3551|consensus
Probab=27.27 E-value=60 Score=29.03 Aligned_cols=32 Identities=16% Similarity=0.060 Sum_probs=25.8
Q ss_pred CcEEEEEEccCCcccccccccccC-CCCCcEEEecCcee
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAG-IKPTSSRLLGECLA 131 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aG-L~~GDvI~~~~~v~ 131 (169)
--++|+.+-+|=.|+ +.+ |..||.|+++|+-.
T Consensus 110 MPIlISKIFkGlAAD------Qt~aL~~gDaIlSVNG~d 142 (506)
T KOG3551|consen 110 MPILISKIFKGLAAD------QTGALFLGDAILSVNGED 142 (506)
T ss_pred CceehhHhccccccc------cccceeeccEEEEecchh
Confidence 469999999988777 543 99999999976654
No 105
>PRK05015 aminopeptidase B; Provisional
Probab=25.16 E-value=96 Score=27.82 Aligned_cols=30 Identities=17% Similarity=-0.004 Sum_probs=20.2
Q ss_pred EEEccCCcccccccccccCCCCCcEEEe--cCceeecch
Q psy18070 99 WRVMYNSPAYFIKFRTSAGIKPTSSRLL--GECLAQYTT 135 (169)
Q Consensus 99 ~~V~~~spA~~~~~~~~aGL~~GDvI~~--~~~v~~~~~ 135 (169)
.-..+|.+.. .+ .||||||+. +.-|++.++
T Consensus 241 l~~aENmisg------~A-~kpgDVIt~~nGkTVEI~NT 272 (424)
T PRK05015 241 LCCAENLISG------NA-FKLGDIITYRNGKTVEVMNT 272 (424)
T ss_pred EEecccCCCC------CC-CCCCCEEEecCCcEEeeecc
Confidence 3344677777 55 999999999 444555444
No 106
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.36 E-value=91 Score=20.95 Aligned_cols=19 Identities=21% Similarity=0.363 Sum_probs=15.4
Q ss_pred CCccceEEcCCccEEEEEe
Q psy18070 11 GNSGGPLVNLDGEVIGINS 29 (169)
Q Consensus 11 GnSGGplvn~~G~vvGi~~ 29 (169)
+-+.-|++|.+|+++|+.+
T Consensus 23 ~~~~~~V~d~~~~~~G~v~ 41 (111)
T cd04603 23 GARAVVVVDEENKVLGQVT 41 (111)
T ss_pred CCCEEEEEcCCCCEEEEEE
Confidence 3456688999999999987
No 107
>cd04801 CBS_pair_M50_like This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the metalloprotease peptidase M50. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.13 E-value=77 Score=21.23 Aligned_cols=20 Identities=25% Similarity=0.351 Sum_probs=15.9
Q ss_pred CccceEEcCCccEEEEEeee
Q psy18070 12 NSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 12 nSGGplvn~~G~vvGi~~~~ 31 (169)
.+.=|++|.+|+++|+.+..
T Consensus 25 ~~~~~V~d~~~~~~G~v~~~ 44 (114)
T cd04801 25 QRRFVVVDNEGRYVGIISLA 44 (114)
T ss_pred ceeEEEEcCCCcEEEEEEHH
Confidence 45668899999999998743
No 108
>PRK03760 hypothetical protein; Provisional
Probab=22.93 E-value=98 Score=22.46 Aligned_cols=25 Identities=4% Similarity=-0.201 Sum_probs=18.7
Q ss_pred CcEEEEEEccCCcccccccccccCCCCCcEEE
Q psy18070 94 HGVLIWRVMYNSPAYFIKFRTSAGIKPTSSRL 125 (169)
Q Consensus 94 ~Gv~V~~V~~~spA~~~~~~~~aGL~~GDvI~ 125 (169)
.-.+|-++..|. ++ +.||++||.|.
T Consensus 89 ~a~~VLEl~aG~-~~------~~gi~~Gd~v~ 113 (117)
T PRK03760 89 PARYIIEGPVGK-IR------VLKVEVGDEIE 113 (117)
T ss_pred cceEEEEeCCCh-HH------HcCCCCCCEEE
Confidence 356788887655 55 58999999974
No 109
>cd04643 CBS_pair_30 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=22.84 E-value=73 Score=21.24 Aligned_cols=20 Identities=25% Similarity=0.483 Sum_probs=15.9
Q ss_pred CccceEEcCCccEEEEEeee
Q psy18070 12 NSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 12 nSGGplvn~~G~vvGi~~~~ 31 (169)
-+.-|++|.+|+++|+.+..
T Consensus 24 ~~~~~V~d~~~~~~Giv~~~ 43 (116)
T cd04643 24 YSAIPVLDKEGKYVGTISLT 43 (116)
T ss_pred CceeeeECCCCcEEEEEeHH
Confidence 34568999999999998753
No 110
>COG1792 MreC Cell shape-determining protein [Cell envelope biogenesis, outer membrane]
Probab=22.63 E-value=3.3e+02 Score=22.81 Aligned_cols=41 Identities=17% Similarity=0.148 Sum_probs=25.9
Q ss_pred EeeccccCCCCccc-eEEcCCccEEEEEeeeccCCeEEEEehhh
Q psy18070 2 SLTGIMVKFGNSGG-PLVNLDGEVIGINSMKVTAGISFAIPIDY 44 (169)
Q Consensus 2 iq~da~in~GnSGG-plvn~~G~vvGi~~~~~~~~~~faiP~~~ 44 (169)
+-+|+--|.|=.=| |+++..| |||-+..- +.+....+.+..
T Consensus 134 ivId~Gs~~GV~~~~~Vi~~~G-LVG~V~~V-~~~tS~V~Lltd 175 (284)
T COG1792 134 IVIDKGSNDGIKKGMPVVAEGG-LVGKVVEV-SKNTSRVLLLTD 175 (284)
T ss_pred EEEecCcccCccCCCeEEECCc-eEEEEEEE-cCceeEEEEeec
Confidence 34666777776644 9999999 99965543 334445554443
No 111
>PF12120 Arr-ms: Rifampin ADP-ribosyl transferase; InterPro: IPR021975 This domain is part of the beta subunit of bacterial DNA dependent RNA polymerase. This domain is the binding site for the antibacterial drug rifampin (and its analogues) which blocks the DNA/RNA tunnel and prevents initiation of transcription. ; PDB: 2HW2_A.
Probab=22.55 E-value=43 Score=23.82 Aligned_cols=30 Identities=17% Similarity=0.208 Sum_probs=14.8
Q ss_pred ccccCCCCCcEEEec-----------Cceeecchhhhhhhh
Q psy18070 113 RTSAGIKPTSSRLLG-----------ECLAQYTTSKLVVWS 142 (169)
Q Consensus 113 ~~~aGL~~GDvI~~~-----------~~v~~~~~~~~~~~~ 142 (169)
+|+|-|++||.|+.+ +-|.....++.+.|.
T Consensus 5 GTkAdL~~GDll~pG~~SNy~~~~~~n~iY~Ta~ld~A~w~ 45 (100)
T PF12120_consen 5 GTKADLQVGDLLTPGFRSNYGPGRVMNHIYFTATLDAAIWG 45 (100)
T ss_dssp EESS---TT-EE-S--B-SSSTT-B-S-EEEESBHHHHHHH
T ss_pred cccccCCCCcEecCCCccccCCCceeeEEEEeeccchhHHH
Confidence 578999999999973 234444556666553
No 112
>COG4043 Preprotein translocase subunit Sec61beta [Intracellular trafficking, secretion, and vesicular transport]
Probab=22.31 E-value=52 Score=23.74 Aligned_cols=27 Identities=22% Similarity=0.352 Sum_probs=17.5
Q ss_pred ccCCCCCcEEEe-cC-------ceeecchhhhhhh
Q psy18070 115 SAGIKPTSSRLL-GE-------CLAQYTTSKLVVW 141 (169)
Q Consensus 115 ~aGL~~GDvI~~-~~-------~v~~~~~~~~~~~ 141 (169)
+.+++|||.|+- ++ .+....+++..+.
T Consensus 31 rr~ik~GD~IiF~~~~l~v~V~~vr~Y~tF~~mlr 65 (111)
T COG4043 31 RRQIKPGDKIIFNGDKLKVEVIDVRVYDTFEEMLR 65 (111)
T ss_pred hcCCCCCCEEEEcCCeeEEEEEEEeehhHHHHHHH
Confidence 689999999986 22 3444555554443
No 113
>PF01191 RNA_pol_Rpb5_C: RNA polymerase Rpb5, C-terminal domain; InterPro: IPR000783 Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP; 2.7.7.6 from EC) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region (IPR005571 from INTERPRO), plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH). This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2). ; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 1EIK_A 2Y0S_Z 1DZF_A 3GTG_E 2VUM_E 3GTP_E 3GTO_E 3S17_E 3S1R_E 1I3Q_E ....
Probab=22.04 E-value=44 Score=22.55 Aligned_cols=18 Identities=22% Similarity=0.322 Sum_probs=10.9
Q ss_pred cCCcccccccccc-cCCCCCcEEEe
Q psy18070 103 YNSPAYFIKFRTS-AGIKPTSSRLL 126 (169)
Q Consensus 103 ~~spA~~~~~~~~-aGL~~GDvI~~ 126 (169)
...|.+ + .|+++||||--
T Consensus 39 ~~DPv~------r~~g~k~GdVvkI 57 (74)
T PF01191_consen 39 SSDPVA------RYLGAKPGDVVKI 57 (74)
T ss_dssp TTSHHH------HHTT--TTSEEEE
T ss_pred ccChhh------hhcCCCCCCEEEE
Confidence 355666 4 59999999743
No 114
>COG5428 Uncharacterized conserved small protein [Function unknown]
Probab=21.73 E-value=64 Score=21.53 Aligned_cols=14 Identities=36% Similarity=0.700 Sum_probs=10.9
Q ss_pred EcCCccEEEEEeee
Q psy18070 18 VNLDGEVIGINSMK 31 (169)
Q Consensus 18 vn~~G~vvGi~~~~ 31 (169)
+|.+|+|+||-.-.
T Consensus 36 ide~GkV~GiEi~~ 49 (69)
T COG5428 36 IDENGKVIGIEIWN 49 (69)
T ss_pred ecCCCcEEEEEEEc
Confidence 47889999997643
No 115
>cd04594 CBS_pair_EriC_assoc_archaea This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the EriC ClC-type chloride channels in archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. ClC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the ClC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS do
Probab=21.33 E-value=82 Score=20.86 Aligned_cols=21 Identities=33% Similarity=0.501 Sum_probs=15.8
Q ss_pred CCCCccceEEcCCccEEEEEee
Q psy18070 9 KFGNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 9 n~GnSGGplvn~~G~vvGi~~~ 30 (169)
..+.+--|+++ +|+++|+.+.
T Consensus 78 ~~~~~~~~Vv~-~~~~iGvit~ 98 (104)
T cd04594 78 KNKTRWCPVVD-DGKFKGIVTL 98 (104)
T ss_pred HcCcceEEEEE-CCEEEEEEEH
Confidence 34455678998 7999999874
No 116
>cd04459 Rho_CSD Rho_CSD: Rho protein cold-shock domain (CSD). Rho protein is a transcription termination factor in most bacteria. In bacteria, there are two distinct mechanisms for mRNA transcription termination. In intrinsic termination, RNA polymerase and nascent mRNA are released from DNA template by an mRNA stem loop structure, which resembles the transcription termination mechanism used by eukaryotic pol III. The second mechanism is mediated by Rho factor. Rho factor terminates transcription by using energy from ATP hydrolysis to forcibly dissociate the transcripts from RNA polymerase. Rho protein contains an N-terminal S1-like domain, which binds single-stranded RNA. Rho has a C-terminal ATPase domain which hydrolyzes ATP to provide energy to strip RNA polymerase and mRNA from the DNA template. Rho functions as a homohexamer.
Probab=21.14 E-value=60 Score=21.44 Aligned_cols=12 Identities=0% Similarity=-0.003 Sum_probs=10.9
Q ss_pred ccCCCCCcEEEe
Q psy18070 115 SAGIKPTSSRLL 126 (169)
Q Consensus 115 ~aGL~~GDvI~~ 126 (169)
+.|||.||.|..
T Consensus 38 r~~LR~GD~V~G 49 (68)
T cd04459 38 RFNLRTGDTVVG 49 (68)
T ss_pred HhCCCCCCEEEE
Confidence 789999999987
No 117
>cd04617 CBS_pair_4 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=21.01 E-value=1e+02 Score=20.85 Aligned_cols=21 Identities=24% Similarity=0.289 Sum_probs=16.5
Q ss_pred CCCccceEEcCCccEEEEEee
Q psy18070 10 FGNSGGPLVNLDGEVIGINSM 30 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~ 30 (169)
.+-+..|++|.+|+++|+.+.
T Consensus 22 ~~~~~~~V~d~~~~~~Givt~ 42 (118)
T cd04617 22 EDVGSLFVVDEDGDLVGVVSR 42 (118)
T ss_pred cCCCEEEEEcCCCCEEEEEEH
Confidence 344577899999999999774
No 118
>cd04619 CBS_pair_6 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=20.76 E-value=1.1e+02 Score=20.57 Aligned_cols=22 Identities=14% Similarity=0.135 Sum_probs=17.1
Q ss_pred CCCccceEEcCCccEEEEEeee
Q psy18070 10 FGNSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 10 ~GnSGGplvn~~G~vvGi~~~~ 31 (169)
.+...-|++|.+|+++|+.+..
T Consensus 22 ~~~~~~~Vvd~~g~~~G~vt~~ 43 (114)
T cd04619 22 PGIDLVVVCDPHGKLAGVLTKT 43 (114)
T ss_pred cCCCEEEEECCCCCEEEEEehH
Confidence 3455668999999999998743
No 119
>cd04623 CBS_pair_10 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=20.60 E-value=95 Score=20.43 Aligned_cols=21 Identities=24% Similarity=0.306 Sum_probs=16.1
Q ss_pred CCccceEEcCCccEEEEEeee
Q psy18070 11 GNSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 11 GnSGGplvn~~G~vvGi~~~~ 31 (169)
+-+.=|++|.+|+++|+.+..
T Consensus 23 ~~~~~~V~~~~~~~~Giv~~~ 43 (113)
T cd04623 23 NIGAVVVVDDGGRLVGIFSER 43 (113)
T ss_pred CCCeEEEECCCCCEEEEEehH
Confidence 345568999889999997743
No 120
>cd04621 CBS_pair_8 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=20.01 E-value=1.1e+02 Score=21.77 Aligned_cols=21 Identities=19% Similarity=0.374 Sum_probs=17.2
Q ss_pred CCccceEEcCCccEEEEEeee
Q psy18070 11 GNSGGPLVNLDGEVIGINSMK 31 (169)
Q Consensus 11 GnSGGplvn~~G~vvGi~~~~ 31 (169)
+-+.-|++|.+|+++|+.+..
T Consensus 23 ~~~~l~V~d~~~~~~Giv~~~ 43 (135)
T cd04621 23 GVGRVIVVDDNGKPVGVITYR 43 (135)
T ss_pred CCCcceEECCCCCEEEEEeHH
Confidence 456779999999999998743
Done!