Query psy2771
Match_columns 174
No_of_seqs 147 out of 1063
Neff 8.3
Searched_HMMs 46136
Date Sat Aug 17 00:43:34 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy2771.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/2771hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.4E-32 7.3E-37 232.1 18.1 160 4-166 110-275 (455)
2 PRK10898 serine endoprotease; 100.0 2.2E-31 4.8E-36 220.8 18.7 161 4-167 97-265 (353)
3 TIGR02038 protease_degS peripl 100.0 5.5E-31 1.2E-35 218.4 19.3 160 4-166 97-263 (351)
4 PRK10942 serine endoprotease; 100.0 4.3E-30 9.4E-35 220.1 17.9 159 5-166 132-296 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 4.7E-29 1E-33 211.8 18.3 159 6-167 79-243 (428)
6 COG0265 DegQ Trypsin-like seri 99.9 1.2E-23 2.6E-28 174.3 16.8 161 4-166 91-257 (347)
7 KOG1320|consensus 99.6 1.2E-14 2.6E-19 123.2 10.2 140 17-156 208-357 (473)
8 PF13365 Trypsin_2: Trypsin-li 99.1 6.2E-10 1.4E-14 77.6 8.9 84 11-124 33-120 (120)
9 PF00089 Trypsin: Trypsin; In 98.7 3.2E-07 6.9E-12 69.9 10.9 115 32-146 86-220 (220)
10 KOG1421|consensus 98.6 1.7E-07 3.6E-12 82.3 7.3 147 20-167 120-276 (955)
11 cd00190 Tryp_SPc Trypsin-like 98.3 7.2E-06 1.6E-10 62.8 10.4 116 32-147 88-230 (232)
12 PF00863 Peptidase_C4: Peptida 98.2 3.9E-05 8.4E-10 60.4 11.9 115 27-151 76-194 (235)
13 KOG1320|consensus 98.2 2.7E-06 5.8E-11 72.8 5.4 130 19-154 123-258 (473)
14 COG3591 V8-like Glu-specific e 98.1 7.9E-05 1.7E-09 59.2 11.1 87 53-150 155-250 (251)
15 smart00020 Tryp_SPc Trypsin-li 97.9 0.00014 3.1E-09 55.8 9.5 100 31-130 87-209 (229)
16 PF10459 Peptidase_S46: Peptid 97.6 0.00013 2.8E-09 65.8 5.8 57 94-150 619-687 (698)
17 PF05580 Peptidase_S55: SpoIVB 97.2 0.00075 1.6E-08 52.2 5.4 46 97-143 169-216 (218)
18 PF08192 Peptidase_S64: Peptid 97.2 0.0043 9.4E-08 55.1 10.4 117 30-150 540-689 (695)
19 PF00949 Peptidase_S7: Peptida 97.0 0.0011 2.3E-08 47.8 3.8 34 97-130 86-119 (132)
20 KOG3627|consensus 96.9 0.02 4.3E-07 45.0 11.2 117 33-149 106-253 (256)
21 PF00944 Peptidase_S3: Alphavi 96.6 0.004 8.6E-08 44.8 4.2 41 97-137 95-135 (158)
22 PF03761 DUF316: Domain of unk 96.4 0.087 1.9E-06 42.3 11.6 105 31-144 159-273 (282)
23 PF00548 Peptidase_C3: 3C cyst 96.3 0.075 1.6E-06 40.0 10.2 108 17-128 51-170 (172)
24 TIGR02860 spore_IV_B stage IV 96.1 0.012 2.5E-07 50.0 5.3 46 98-144 350-397 (402)
25 PF05579 Peptidase_S32: Equine 95.9 0.01 2.2E-07 47.5 3.6 28 105-132 205-232 (297)
26 KOG1421|consensus 95.9 0.18 3.9E-06 45.5 11.6 150 10-162 577-739 (955)
27 PF00947 Pico_P2A: Picornaviru 93.2 0.25 5.4E-06 35.2 5.0 40 96-136 78-117 (127)
28 PF02907 Peptidase_S29: Hepati 92.1 0.34 7.4E-06 35.0 4.5 38 104-142 104-146 (148)
29 PF01732 DUF31: Putative pepti 90.7 0.2 4.2E-06 42.2 2.6 25 102-126 349-373 (374)
30 COG5640 Secreted trypsin-like 85.7 2 4.4E-05 36.1 5.3 49 103-151 223-279 (413)
31 PF02122 Peptidase_S39: Peptid 85.4 1.2 2.7E-05 34.4 3.8 45 97-142 136-184 (203)
32 PF12381 Peptidase_C3G: Tungro 79.2 3.2 7E-05 32.4 3.9 53 97-149 169-228 (231)
33 PF00571 CBS: CBS domain CBS d 75.8 3.7 8E-05 24.0 2.8 21 107-127 28-48 (57)
34 cd01735 LSm12_N LSm12 belongs 71.0 16 0.00034 22.7 4.8 26 17-42 14-39 (61)
35 PF10459 Peptidase_S46: Peptid 70.9 6.8 0.00015 35.9 4.4 39 32-71 199-253 (698)
36 PF14827 Cache_3: Sensory doma 68.1 5.4 0.00012 27.5 2.6 18 112-129 94-111 (116)
37 PF02743 Cache_1: Cache domain 62.4 12 0.00026 23.7 3.3 31 112-150 19-49 (81)
38 PF05578 Peptidase_S31: Pestiv 61.4 28 0.00061 25.9 5.3 74 55-130 108-184 (211)
39 COG2524 Predicted transcriptio 60.7 90 0.002 25.4 8.7 21 107-128 201-221 (294)
40 cd04627 CBS_pair_14 The CBS do 59.8 8.1 0.00018 26.1 2.2 22 107-128 97-118 (123)
41 COG0298 HypC Hydrogenase matur 58.9 28 0.00061 22.8 4.4 47 22-69 5-53 (82)
42 cd04603 CBS_pair_KefB_assoc Th 55.2 12 0.00026 24.9 2.4 22 107-128 85-106 (111)
43 PF02395 Peptidase_S6: Immunog 54.5 26 0.00055 32.7 5.0 47 103-149 211-266 (769)
44 PF03510 Peptidase_C24: 2C end 53.9 33 0.00071 23.7 4.4 23 31-53 34-56 (105)
45 cd04618 CBS_pair_5 The CBS dom 53.2 29 0.00062 22.7 4.0 48 107-154 22-73 (98)
46 cd04620 CBS_pair_7 The CBS dom 52.7 14 0.00029 24.5 2.4 22 107-128 89-110 (115)
47 cd04643 CBS_pair_30 The CBS do 48.4 16 0.00034 24.1 2.2 17 112-128 95-111 (116)
48 cd04597 CBS_pair_DRTGG_assoc2 47.0 21 0.00046 24.0 2.7 21 107-127 87-107 (113)
49 cd04619 CBS_pair_6 The CBS dom 46.9 19 0.00042 23.9 2.4 22 107-128 88-109 (114)
50 cd04592 CBS_pair_EriC_assoc_eu 46.3 22 0.00047 25.0 2.7 22 107-128 22-43 (133)
51 cd04801 CBS_pair_M50_like This 46.3 20 0.00043 23.7 2.4 22 107-128 88-109 (114)
52 cd04607 CBS_pair_NTP_transfera 44.9 22 0.00047 23.4 2.4 22 107-128 87-108 (113)
53 PRK15431 ferrous iron transpor 44.3 20 0.00043 23.4 2.0 27 133-159 24-50 (78)
54 COG5428 Uncharacterized conser 44.3 43 0.00093 21.3 3.4 18 26-43 2-19 (69)
55 cd04602 CBS_pair_IMPDH_2 This 43.2 23 0.0005 23.4 2.3 22 107-128 88-109 (114)
56 cd04590 CBS_pair_CorC_HlyC_ass 43.1 22 0.00047 23.2 2.2 22 107-128 85-106 (111)
57 cd04617 CBS_pair_4 The CBS dom 42.9 21 0.00047 23.8 2.2 22 107-128 89-113 (118)
58 COG3448 CBS-domain-containing 42.5 21 0.00045 29.6 2.3 21 108-128 345-365 (382)
59 cd04582 CBS_pair_ABC_OpuCA_ass 42.5 25 0.00054 22.7 2.4 22 107-128 80-101 (106)
60 cd04641 CBS_pair_28 The CBS do 42.2 27 0.00059 23.3 2.6 22 106-127 21-42 (120)
61 cd04601 CBS_pair_IMPDH This cd 42.2 24 0.00053 22.8 2.3 22 107-128 84-105 (110)
62 smart00116 CBS Domain in cysta 42.1 28 0.0006 18.2 2.2 20 108-127 22-41 (49)
63 cd04614 CBS_pair_1 The CBS dom 42.0 30 0.00065 22.4 2.7 48 107-154 22-72 (96)
64 cd04583 CBS_pair_ABC_OpuCA_ass 41.9 25 0.00055 22.7 2.4 22 107-128 83-104 (109)
65 cd04596 CBS_pair_DRTGG_assoc T 41.3 26 0.00056 22.9 2.4 22 107-128 82-103 (108)
66 COG3290 CitA Signal transducti 41.0 23 0.00049 31.5 2.4 18 112-129 143-160 (537)
67 cd04606 CBS_pair_Mg_transporte 40.3 28 0.0006 22.7 2.4 22 107-128 82-103 (109)
68 cd04642 CBS_pair_29 The CBS do 40.1 28 0.00061 23.5 2.4 20 109-128 102-121 (126)
69 cd04610 CBS_pair_ParBc_assoc T 38.5 31 0.00067 22.2 2.4 19 110-128 84-102 (107)
70 cd04640 CBS_pair_27 The CBS do 38.2 28 0.00061 23.6 2.2 22 107-128 99-121 (126)
71 PF01455 HupF_HypC: HupF/HypC 38.2 1E+02 0.0022 19.3 5.0 43 22-65 5-47 (68)
72 cd04600 CBS_pair_HPP_assoc Thi 37.4 32 0.0007 22.9 2.4 22 107-128 98-119 (124)
73 cd04615 CBS_pair_2 The CBS dom 36.6 34 0.00074 22.3 2.4 22 107-128 87-108 (113)
74 cd04609 CBS_pair_PALP_assoc2 T 36.4 32 0.0007 22.2 2.2 18 111-128 88-105 (110)
75 PF09465 LBR_tudor: Lamin-B re 36.2 1E+02 0.0022 18.7 4.7 42 1-42 1-43 (55)
76 cd04587 CBS_pair_CAP-ED_DUF294 34.8 36 0.00078 22.2 2.3 18 111-128 91-108 (113)
77 COG3284 AcoR Transcriptional a 34.2 22 0.00048 32.1 1.4 23 107-129 158-180 (606)
78 PF15436 PGBA_N: Plasminogen-b 33.5 1.9E+02 0.0041 22.7 6.2 52 12-64 33-88 (218)
79 cd04624 CBS_pair_11 The CBS do 32.9 45 0.00097 21.7 2.5 22 107-128 86-107 (112)
80 cd04611 CBS_pair_PAS_GGDEF_DUF 32.2 43 0.00094 21.6 2.3 22 107-128 85-106 (111)
81 PF06003 SMN: Survival motor n 32.2 1.3E+02 0.0028 24.2 5.4 34 9-42 72-106 (264)
82 PF00741 Gas_vesicle: Gas vesi 32.1 98 0.0021 17.4 3.3 30 142-171 2-31 (39)
83 cd04635 CBS_pair_22 The CBS do 32.1 50 0.0011 21.8 2.7 21 107-127 96-116 (122)
84 PF10049 DUF2283: Protein of u 31.5 70 0.0015 18.6 2.9 16 27-42 3-18 (50)
85 PF09012 FeoC: FeoC like trans 30.3 58 0.0013 20.1 2.5 26 134-159 23-48 (69)
86 cd01717 Sm_B The eukaryotic Sm 30.2 1.2E+02 0.0026 19.3 4.1 29 13-41 14-42 (79)
87 cd00218 GlcAT-I Beta1,3-glucur 30.1 63 0.0014 25.4 3.1 32 111-143 136-173 (223)
88 cd05701 S1_Rrp5_repeat_hs10 S1 29.9 1.4E+02 0.0031 18.8 4.0 14 52-65 42-55 (69)
89 cd04605 CBS_pair_MET2_assoc Th 29.9 53 0.0011 21.2 2.4 22 107-128 84-105 (110)
90 cd04621 CBS_pair_8 The CBS dom 29.7 58 0.0013 22.5 2.7 20 108-127 23-42 (135)
91 TIGR00074 hypC_hupF hydrogenas 29.6 1.3E+02 0.0027 19.5 4.0 44 22-68 5-49 (76)
92 cd04604 CBS_pair_KpsF_GutQ_ass 29.5 61 0.0013 21.0 2.7 21 107-127 88-108 (114)
93 cd04585 CBS_pair_ACT_assoc2 Th 29.1 63 0.0014 21.1 2.7 22 107-128 96-117 (122)
94 cd04632 CBS_pair_19 The CBS do 29.1 62 0.0013 21.7 2.7 21 107-127 22-42 (128)
95 cd04803 CBS_pair_15 The CBS do 29.0 55 0.0012 21.6 2.4 22 107-128 96-117 (122)
96 cd04588 CBS_pair_CAP-ED_DUF294 29.0 53 0.0012 21.2 2.3 21 108-128 85-105 (110)
97 cd04598 CBS_pair_GGDEF_assoc T 28.8 47 0.001 21.8 2.1 18 111-128 97-114 (119)
98 cd04623 CBS_pair_10 The CBS do 28.3 58 0.0013 21.0 2.4 20 108-127 23-42 (113)
99 PF08275 Toprim_N: DNA primase 27.8 69 0.0015 22.6 2.8 17 113-129 82-98 (128)
100 cd04608 CBS_pair_PALP_assoc Th 27.6 59 0.0013 22.1 2.4 21 108-128 24-44 (124)
101 cd04639 CBS_pair_26 The CBS do 27.6 66 0.0014 20.8 2.6 22 107-128 85-106 (111)
102 cd06168 LSm9 The eukaryotic Sm 27.6 1.7E+02 0.0036 18.7 4.3 30 12-41 13-42 (75)
103 PRK09371 gas vesicle synthesis 27.2 1.2E+02 0.0026 19.1 3.4 33 139-171 6-38 (68)
104 cd02205 CBS_pair The CBS domai 27.1 63 0.0014 20.4 2.4 21 107-127 87-107 (113)
105 cd04593 CBS_pair_EriC_assoc_ba 26.7 59 0.0013 21.3 2.2 22 107-128 87-110 (115)
106 cd04595 CBS_pair_DHH_polyA_Pol 26.6 54 0.0012 21.2 2.0 20 108-128 86-105 (110)
107 PF04085 MreC: rod shape-deter 26.5 2.5E+02 0.0055 20.3 8.6 57 31-87 66-125 (152)
108 COG0517 FOG: CBS domain [Gener 26.5 69 0.0015 20.8 2.5 21 107-127 92-113 (117)
109 cd04637 CBS_pair_24 The CBS do 26.4 65 0.0014 21.3 2.4 22 107-128 96-117 (122)
110 cd01730 LSm3 The eukaryotic Sm 26.1 1.4E+02 0.0031 19.2 3.9 29 12-40 14-42 (82)
111 COG0490 Putative regulatory, l 26.0 81 0.0017 23.6 2.9 14 54-67 133-146 (162)
112 cd04599 CBS_pair_GGDEF_assoc2 25.5 61 0.0013 20.7 2.1 21 107-128 80-100 (105)
113 KOG3888|consensus 25.2 99 0.0021 26.3 3.6 45 109-153 292-338 (407)
114 cd04625 CBS_pair_12 The CBS do 25.2 62 0.0013 21.0 2.1 21 107-128 87-107 (112)
115 cd04631 CBS_pair_18 The CBS do 24.8 72 0.0016 21.1 2.4 22 107-128 99-120 (125)
116 PF08448 PAS_4: PAS fold; Int 24.5 80 0.0017 20.0 2.5 17 112-128 86-102 (110)
117 cd04629 CBS_pair_16 The CBS do 23.8 69 0.0015 20.8 2.1 20 108-127 23-42 (114)
118 cd04594 CBS_pair_EriC_assoc_ar 23.5 71 0.0015 20.6 2.1 21 107-128 79-99 (104)
119 cd04612 CBS_pair_SpoIVFB_EriC_ 23.4 88 0.0019 20.1 2.6 22 107-128 85-106 (111)
120 cd04630 CBS_pair_17 The CBS do 23.3 71 0.0015 20.9 2.1 21 107-128 89-109 (114)
121 cd01727 LSm8 The eukaryotic Sm 23.3 1.3E+02 0.0029 18.9 3.3 27 14-40 14-40 (74)
122 cd01732 LSm5 The eukaryotic Sm 23.2 2E+02 0.0043 18.4 4.1 29 12-40 16-44 (76)
123 cd04586 CBS_pair_BON_assoc Thi 23.2 80 0.0017 21.5 2.4 21 107-128 110-130 (135)
124 cd04622 CBS_pair_9 The CBS dom 22.8 85 0.0018 20.3 2.4 19 110-128 90-108 (113)
125 cd01728 LSm1 The eukaryotic Sm 22.5 2.2E+02 0.0047 18.1 4.2 53 12-66 15-72 (74)
126 PRK11543 gutQ D-arabinose 5-ph 22.5 73 0.0016 25.8 2.4 22 107-128 292-313 (321)
127 cd04613 CBS_pair_SpoIVFB_EriC_ 22.4 87 0.0019 20.1 2.4 20 108-127 23-42 (114)
128 cd01720 Sm_D2 The eukaryotic S 22.3 2.2E+02 0.0047 18.8 4.2 30 12-41 17-46 (87)
129 cd04584 CBS_pair_ACT_assoc Thi 22.1 82 0.0018 20.6 2.3 20 108-127 23-42 (121)
130 PF08669 GCV_T_C: Glycine clea 21.9 1.1E+02 0.0024 19.9 2.8 23 107-129 32-54 (95)
131 cd01731 archaeal_Sm1 The archa 21.6 2.1E+02 0.0045 17.6 4.3 29 13-41 14-42 (68)
132 cd01722 Sm_F The eukaryotic Sm 20.9 2.2E+02 0.0047 17.5 4.0 31 11-41 13-43 (68)
133 COG1958 LSM1 Small nuclear rib 20.9 2.3E+02 0.0051 17.9 4.1 32 10-41 18-49 (79)
134 cd00600 Sm_like The eukaryotic 20.8 2E+02 0.0042 16.9 4.5 29 13-41 10-38 (63)
135 PRK00737 small nuclear ribonuc 20.6 2.3E+02 0.005 17.7 4.3 30 12-41 17-46 (72)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3.4e-32 Score=232.08 Aligned_cols=160 Identities=33% Similarity=0.448 Sum_probs=141.4
Q ss_pred eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEc-CCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEee
Q psy2771 4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCL-QNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISN 82 (174)
Q Consensus 4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~-~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~ 82 (174)
|.+.++++.+.. .|++.++|+++++|+.+||||||++ +.++++++|++++.+++||+|+++|||+++..+++.|+|++
T Consensus 110 Vv~~a~~i~V~~-~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~ 188 (455)
T PRK10139 110 VINQAQKISIQL-NDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISA 188 (455)
T ss_pred HhCCCCEEEEEE-CCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcc
Confidence 344566776654 9999999999999999999999998 46899999999999999999999999999999999999999
Q ss_pred eccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeeee
Q psy2771 83 KQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFCA 157 (174)
Q Consensus 83 ~~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~~ 157 (174)
..+..... .....++++|+.+++|||||||||.+|+||||+++... .+++||||++.+++++++|+++|++.|
T Consensus 189 ~~r~~~~~--~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r 266 (455)
T PRK10139 189 LGRSGLNL--EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKR 266 (455)
T ss_pred ccccccCC--CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccc
Confidence 87653211 23457899999999999999999999999999998653 468999999999999999999999999
Q ss_pred eecCceeee
Q psy2771 158 YSKGKSDLR 166 (174)
Q Consensus 158 ~~lg~~~~~ 166 (174)
+|+|++..+
T Consensus 267 ~~LGv~~~~ 275 (455)
T PRK10139 267 GLLGIKGTE 275 (455)
T ss_pred cceeEEEEE
Confidence 999998765
No 2
>PRK10898 serine endoprotease; Provisional
Probab=99.98 E-value=2.2e-31 Score=220.80 Aligned_cols=161 Identities=29% Similarity=0.406 Sum_probs=140.0
Q ss_pred eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeee
Q psy2771 4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNK 83 (174)
Q Consensus 4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~ 83 (174)
|.+.++++.+.+ .||+.++|+++++|+.+||||||++..++|++++++++.+++|+.|+++|||++...+++.|+|++.
T Consensus 97 Vv~~a~~i~V~~-~dg~~~~a~vv~~d~~~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~ 175 (353)
T PRK10898 97 VINDADQIIVAL-QDGRVFEALLVGSDSLTDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISAT 175 (353)
T ss_pred EeCCCCEEEEEe-CCCCEEEEEEEEEcCCCCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEec
Confidence 344466666554 8999999999999999999999999888999999998889999999999999999999999999987
Q ss_pred ccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC--------CCeEEEEeHHHHHHHHHHHHhCCee
Q psy2771 84 QRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT--------AGISFAIPIDYAIEFLTNYKRKGKF 155 (174)
Q Consensus 84 ~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~--------~~~~~aiPi~~i~~~l~~l~~~g~~ 155 (174)
.+..... .....++++|+.+++|||||||+|.+|+||||+++... .+++|+||++.+++++++++++|++
T Consensus 176 ~r~~~~~--~~~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~ 253 (353)
T PRK10898 176 GRIGLSP--TGRQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRV 253 (353)
T ss_pred cccccCC--ccccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcc
Confidence 7643211 22347899999999999999999999999999997542 4689999999999999999999999
Q ss_pred eeeecCceeeee
Q psy2771 156 CAYSKGKSDLRT 167 (174)
Q Consensus 156 ~~~~lg~~~~~~ 167 (174)
.|+|+|+...+.
T Consensus 254 ~~~~lGi~~~~~ 265 (353)
T PRK10898 254 IRGYIGIGGREI 265 (353)
T ss_pred cccccceEEEEC
Confidence 999999987653
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.98 E-value=5.5e-31 Score=218.43 Aligned_cols=160 Identities=31% Similarity=0.437 Sum_probs=139.3
Q ss_pred eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeee
Q psy2771 4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNK 83 (174)
Q Consensus 4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~ 83 (174)
|.+.++.+.+. +.||+.++|+++++|+.+||||||++...+++++++++..+++||+|+++|||++...+++.|+|+..
T Consensus 97 VV~~~~~i~V~-~~dg~~~~a~vv~~d~~~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~ 175 (351)
T TIGR02038 97 VIKKADQIVVA-LQDGRKFEAELVGSDPLTDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISAT 175 (351)
T ss_pred EeCCCCEEEEE-ECCCCEEEEEEEEecCCCCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEec
Confidence 33445555554 59999999999999999999999999888999999988889999999999999999999999999988
Q ss_pred ccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-------CCeEEEEeHHHHHHHHHHHHhCCeee
Q psy2771 84 QRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-------AGISFAIPIDYAIEFLTNYKRKGKFC 156 (174)
Q Consensus 84 ~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-------~~~~~aiPi~~i~~~l~~l~~~g~~~ 156 (174)
.+..... .....++++|+.+++|||||||||.+|+||||+++... .+++|+||++.+++++++++++|++.
T Consensus 176 ~r~~~~~--~~~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~ 253 (351)
T TIGR02038 176 GRNGLSS--VGRQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI 253 (351)
T ss_pred cCcccCC--CCcceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc
Confidence 7653211 23357899999999999999999999999999987542 46899999999999999999999999
Q ss_pred eeecCceeee
Q psy2771 157 AYSKGKSDLR 166 (174)
Q Consensus 157 ~~~lg~~~~~ 166 (174)
|+|+|+...+
T Consensus 254 r~~lGv~~~~ 263 (351)
T TIGR02038 254 RGYIGVSGED 263 (351)
T ss_pred ceEeeeEEEE
Confidence 9999998765
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.97 E-value=4.3e-30 Score=220.07 Aligned_cols=159 Identities=32% Similarity=0.449 Sum_probs=139.9
Q ss_pred eEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEc-CCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeee
Q psy2771 5 EKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCL-QNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNK 83 (174)
Q Consensus 5 ~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~-~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~ 83 (174)
...++++.++ +.|++.+.|++++.|+.+||||||++ ..++++++|++++.+++|++|+++|||+++..+++.|+|++.
T Consensus 132 v~~a~~i~V~-~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~ 210 (473)
T PRK10942 132 VDNATKIKVQ-LSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPYGLGETVTSGIVSAL 210 (473)
T ss_pred cCCCCEEEEE-ECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCCCCCcceeEEEEEEe
Confidence 3446666665 49999999999999999999999997 568999999999999999999999999999999999999988
Q ss_pred ccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeeeee
Q psy2771 84 QRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFCAY 158 (174)
Q Consensus 84 ~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~~~ 158 (174)
.+.... ......++++|+.+++|+|||||+|.+|+||||+++... .+++|+||++.+++++++|++.|++.|+
T Consensus 211 ~r~~~~--~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g~v~rg 288 (473)
T PRK10942 211 GRSGLN--VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRG 288 (473)
T ss_pred ecccCC--cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhccccccc
Confidence 764221 123457899999999999999999999999999998653 3589999999999999999999999999
Q ss_pred ecCceeee
Q psy2771 159 SKGKSDLR 166 (174)
Q Consensus 159 ~lg~~~~~ 166 (174)
|+|+...+
T Consensus 289 ~lGv~~~~ 296 (473)
T PRK10942 289 ELGIMGTE 296 (473)
T ss_pred eeeeEeee
Confidence 99998765
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.97 E-value=4.7e-29 Score=211.76 Aligned_cols=159 Identities=36% Similarity=0.506 Sum_probs=139.1
Q ss_pred EeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCC-CCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeec
Q psy2771 6 KVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQN-NYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQ 84 (174)
Q Consensus 6 ~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~-~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~ 84 (174)
..+.++.+.+ .+++.++|+++++|+..||||||++.. .+++++|++++.+++|++|+++|||++...+++.|+|++..
T Consensus 79 ~~~~~i~V~~-~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~ 157 (428)
T TIGR02037 79 DGADEITVTL-SDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALG 157 (428)
T ss_pred CCCCeEEEEe-CCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecc
Confidence 3456666554 899999999999999999999999864 89999999988899999999999999999999999999887
Q ss_pred cCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeeeeee
Q psy2771 85 RSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFCAYS 159 (174)
Q Consensus 85 ~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~~~~ 159 (174)
+... .......++++|+++++|+|||||||.+|+||||+++... .+++|+||++.++++++++++.|++.++|
T Consensus 158 ~~~~--~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~ 235 (428)
T TIGR02037 158 RSGL--GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGW 235 (428)
T ss_pred cCcc--CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCc
Confidence 6531 1123456899999999999999999999999999987653 46899999999999999999999999999
Q ss_pred cCceeeee
Q psy2771 160 KGKSDLRT 167 (174)
Q Consensus 160 lg~~~~~~ 167 (174)
||+...+.
T Consensus 236 lGi~~~~~ 243 (428)
T TIGR02037 236 LGVTIQEV 243 (428)
T ss_pred CceEeecC
Confidence 99987653
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.92 E-value=1.2e-23 Score=174.27 Aligned_cols=161 Identities=33% Similarity=0.480 Sum_probs=140.8
Q ss_pred eeEeeeeEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCC-CCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEee
Q psy2771 4 VEKVTQDICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNN-YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISN 82 (174)
Q Consensus 4 v~~~a~~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~-~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~ 82 (174)
|...|+++.+.. .||+.++++++++|+..|+|++|++... ++.+.++++..++.|+.++++|+|+++..+++.|+++.
T Consensus 91 Vi~~a~~i~v~l-~dg~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~ 169 (347)
T COG0265 91 VIAGAEEITVTL-ADGREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSA 169 (347)
T ss_pred ecCCcceEEEEe-CCCCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEec
Confidence 344466666666 9999999999999999999999999654 89999999999999999999999999999999999999
Q ss_pred eccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecCC-----CeEEEEeHHHHHHHHHHHHhCCeeee
Q psy2771 83 KQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTA-----GISFAIPIDYAIEFLTNYKRKGKFCA 157 (174)
Q Consensus 83 ~~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~-----~~~~aiPi~~i~~~l~~l~~~g~~~~ 157 (174)
..+. ..........++++|+.+++|+||||++|.+|++|||++..... +++|+||++.+..++.++.++|++.+
T Consensus 170 ~~r~-~v~~~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~ 248 (347)
T COG0265 170 LGRT-GVGSAGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVR 248 (347)
T ss_pred cccc-cccCcccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccc
Confidence 9885 11111235688999999999999999999999999999988752 47999999999999999999999999
Q ss_pred eecCceeee
Q psy2771 158 YSKGKSDLR 166 (174)
Q Consensus 158 ~~lg~~~~~ 166 (174)
+++|+...+
T Consensus 249 ~~lgv~~~~ 257 (347)
T COG0265 249 GYLGVIGEP 257 (347)
T ss_pred cccceEEEE
Confidence 999998764
No 7
>KOG1320|consensus
Probab=99.58 E-value=1.2e-14 Score=123.16 Aligned_cols=140 Identities=37% Similarity=0.431 Sum_probs=121.6
Q ss_pred ecCcEEEEeeeEeeCCCcEEEEEEc--CCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCcc-
Q psy2771 17 SFNSLLTLPNIAYYFEKHIILFHCL--QNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLN- 93 (174)
Q Consensus 17 ~~~~~~~a~~v~~d~~~DlAllkv~--~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~~- 93 (174)
..+...++.+.+.|+..|+|+++++ ..-+++++++-+.++..|+++.++|.|++..++.+.|.++...+..+..+..
T Consensus 208 ~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~ 287 (473)
T KOG1320|consen 208 GPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLET 287 (473)
T ss_pred cCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCccc
Confidence 3448899999999999999999995 3348889999899999999999999999999999999999998877765544
Q ss_pred --ccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-----CCeEEEEeHHHHHHHHHHHHhCCeee
Q psy2771 94 --KTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKGKFC 156 (174)
Q Consensus 94 --~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-----~~~~~aiPi~~i~~~l~~l~~~g~~~ 156 (174)
...+++++|+.++.|+||||++|.+|++||+.++... .+++|++|.+.+..++.+..+.....
T Consensus 288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~l 357 (473)
T KOG1320|consen 288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISL 357 (473)
T ss_pred ceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceee
Confidence 5568899999999999999999999999999998765 68999999999999998885444443
No 8
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.11 E-value=6.2e-10 Score=77.60 Aligned_cols=84 Identities=21% Similarity=0.272 Sum_probs=50.2
Q ss_pred EEEEEeecCcEEE--EeeeEeeCC-CcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCc
Q psy2771 11 ICLSTFSFNSLLT--LPNIAYYFE-KHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSS 87 (174)
Q Consensus 11 ~~~~~~~~~~~~~--a~~v~~d~~-~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~ 87 (174)
+.+.. .++.... ++++..|+. .|+|||+++ .....+... ...+.........
T Consensus 33 ~~~~~-~~~~~~~~~~~~~~~~~~~~D~All~v~-------------------~~~~~~~~~-----~~~~~~~~~~~~~ 87 (120)
T PF13365_consen 33 VEVVF-PDGRRVPPVAEVVYFDPDDYDLALLKVD-------------------PWTGVGGGV-----RVPGSTSGVSPTS 87 (120)
T ss_dssp EEEEE-TTSCEEETEEEEEEEETT-TTEEEEEES-------------------CEEEEEEEE-----EEEEEEEEEEEEE
T ss_pred EEEEe-cCCCEEeeeEEEEEECCccccEEEEEEe-------------------cccceeeee-----Eeeeecccccccc
Confidence 34444 6777788 999999999 999999999 000000000 0000000000000
Q ss_pred cccCccccccEEE-EeeecCCCCccceEEcCCCcEEEE
Q psy2771 88 ETLGLNKTINYIQ-TDAAITFGNSGGPLVNLDGEVIGI 124 (174)
Q Consensus 88 ~~~~~~~~~~~~~-~~~~~~~G~SGGPl~n~~G~liGI 124 (174)
........ +++++.+|+|||||||.+|+||||
T Consensus 88 -----~~~~~~~~~~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 88 -----TNDNRMLYITDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -----EEETEEEEEESSS-STTTTTSEEEETTSEEEEE
T ss_pred -----CcccceeEeeecccCCCcEeHhEECCCCEEEeC
Confidence 01111112 799999999999999999999997
No 9
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=98.67 E-value=3.2e-07 Score=69.93 Aligned_cols=115 Identities=22% Similarity=0.230 Sum_probs=75.6
Q ss_pred CCcEEEEEEcCC-----CCCceeecCC-CCCCCCCEEEEEecCCCCCC----ceeecEEeeeccCccc--cCccccccEE
Q psy2771 32 EKHIILFHCLQN-----NYPALKLGKA-ADIRNGEFVIAMGSPLTLNN----TNTFGIISNKQRSSET--LGLNKTINYI 99 (174)
Q Consensus 32 ~~DlAllkv~~~-----~~~~~~l~~~-~~~~~G~~v~~~G~p~g~~~----~~~~G~vs~~~~~~~~--~~~~~~~~~~ 99 (174)
..|+|||+++.. ...++.+... ..++.|+.+.++|++..... .+....+.-....... .........+
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 469999999843 5567777652 34689999999999976332 3333333322221110 0001223444
Q ss_pred EEee----ecCCCCccceEEcCCCcEEEEEeeecCC----CeEEEEeHHHHHHHH
Q psy2771 100 QTDA----AITFGNSGGPLVNLDGEVIGINSMKVTA----GISFAIPIDYAIEFL 146 (174)
Q Consensus 100 ~~~~----~~~~G~SGGPl~n~~G~liGI~~~~~~~----~~~~aiPi~~i~~~l 146 (174)
.... ..++|+|||||++.++.|+||++.+..+ ...++.+++...+|+
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 4444 7899999999997777899999998542 258899998887764
No 10
>KOG1421|consensus
Probab=98.57 E-value=1.7e-07 Score=82.32 Aligned_cols=147 Identities=16% Similarity=0.190 Sum_probs=117.9
Q ss_pred cEEEEeeeEeeCCCcEEEEEEcCCC-----CCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCc--
Q psy2771 20 SLLTLPNIAYYFEKHIILFHCLQNN-----YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGL-- 92 (174)
Q Consensus 20 ~~~~a~~v~~d~~~DlAllkv~~~~-----~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~-- 92 (174)
...+.-.+..|+-+|+.+++.+++. +..+.++. +..++|.+++++|+-.+.-.++-.|.++++.+....++.
T Consensus 120 ee~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~ 198 (955)
T KOG1421|consen 120 EEIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDT 198 (955)
T ss_pred ccCCcccccCCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccc
Confidence 3334444788999999999998643 34444543 345899999999998888888889999988887765544
Q ss_pred --cccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC-CCeEEEEeHHHHHHHHHHHHhCCeeeeeecCceeeee
Q psy2771 93 --NKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKRKGKFCAYSKGKSDLRT 167 (174)
Q Consensus 93 --~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~-~~~~~aiPi~~i~~~l~~l~~~g~~~~~~lg~~~~~~ 167 (174)
.....++|.......|.||+||+|.+|..|..+..+.. .+..|++|++.+.+.+.-++++..++|+-|.++.+.+
T Consensus 199 yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k 276 (955)
T KOG1421|consen 199 YNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHK 276 (955)
T ss_pred cccccceeeeehhcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCCCcccceEEEEEehh
Confidence 33346678888899999999999999999999997764 5679999999999999999999999999888877654
No 11
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.33 E-value=7.2e-06 Score=62.80 Aligned_cols=116 Identities=18% Similarity=0.194 Sum_probs=68.1
Q ss_pred CCcEEEEEEcC-----CCCCceeecCCC-CCCCCCEEEEEecCCCCCC-----ceeecEEeeeccC--ccccC--ccccc
Q psy2771 32 EKHIILFHCLQ-----NNYPALKLGKAA-DIRNGEFVIAMGSPLTLNN-----TNTFGIISNKQRS--SETLG--LNKTI 96 (174)
Q Consensus 32 ~~DlAllkv~~-----~~~~~~~l~~~~-~~~~G~~v~~~G~p~g~~~-----~~~~G~vs~~~~~--~~~~~--~~~~~ 96 (174)
..|+||||++. ..+.|+.|.... .+..|+.+.+.|+...... ......+.-.... ..... .....
T Consensus 88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~ 167 (232)
T cd00190 88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITD 167 (232)
T ss_pred cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCC
Confidence 57999999973 236788886653 5788999999998764321 1122222111110 00000 00000
Q ss_pred cEE-E----EeeecCCCCccceEEcCC---CcEEEEEeeecCC----CeEEEEeHHHHHHHHH
Q psy2771 97 NYI-Q----TDAAITFGNSGGPLVNLD---GEVIGINSMKVTA----GISFAIPIDYAIEFLT 147 (174)
Q Consensus 97 ~~~-~----~~~~~~~G~SGGPl~n~~---G~liGI~~~~~~~----~~~~aiPi~~i~~~l~ 147 (174)
..+ . .....|+|+|||||+... ..|+||++++..+ ....+..+....+|++
T Consensus 168 ~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 168 NMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred ceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence 111 1 145578999999999654 7899999987632 2345566666666664
No 12
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.20 E-value=3.9e-05 Score=60.38 Aligned_cols=115 Identities=20% Similarity=0.320 Sum_probs=59.1
Q ss_pred eEeeCCCcEEEEEEcCCCCCceeec-CCCCCCCCCEEEEEecCCCCCC-ceeecEEeeeccCccccCccccccEEEEeee
Q psy2771 27 IAYYFEKHIILFHCLQNNYPALKLG-KAADIRNGEFVIAMGSPLTLNN-TNTFGIISNKQRSSETLGLNKTINYIQTDAA 104 (174)
Q Consensus 27 v~~d~~~DlAllkv~~~~~~~~~l~-~~~~~~~G~~v~~~G~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~~~~~~~ 104 (174)
+..-+..||.++|... ++||.+-. .-..++.++.|+++|.-+.... +.+.-.-+...+ .....+..+-..
T Consensus 76 v~~i~~~DiviirmPk-DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p-------~~~~~fWkHwIs 147 (235)
T PF00863_consen 76 VHPIEGRDIVIIRMPK-DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSISSTVSESSWIYP-------EENSHFWKHWIS 147 (235)
T ss_dssp EEE-TCSSEEEEE--T-TS----S---B----TT-EEEEEEEECSSCCCEEEEEEEEEEEE-------ETTTTEEEE-C-
T ss_pred eEEeCCccEEEEeCCc-ccCCcchhhhccCCCCCCEEEEEEEEEEcCCeeEEECCceEEee-------cCCCCeeEEEec
Confidence 4555688999999984 56665531 2356899999999997544332 111111111111 234477788888
Q ss_pred cCCCCccceEEcC-CCcEEEEEeeecC-CCeEEEEeHHHHHHHHHHHHh
Q psy2771 105 ITFGNSGGPLVNL-DGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKR 151 (174)
Q Consensus 105 ~~~G~SGGPl~n~-~G~liGI~~~~~~-~~~~~aiPi~~i~~~l~~l~~ 151 (174)
+..|+=|.||++. +|++|||++.... ...||+.|+.. ++.+.+.+
T Consensus 148 Tk~G~CG~PlVs~~Dg~IVGiHsl~~~~~~~N~F~~f~~--~f~~~~l~ 194 (235)
T PF00863_consen 148 TKDGDCGLPLVSTKDGKIVGIHSLTSNTSSRNYFTPFPD--DFEEFYLE 194 (235)
T ss_dssp --TT-TT-EEEETTT--EEEEEEEEETTTSSEEEEE--T--THHHHHCC
T ss_pred CCCCccCCcEEEcCCCcEEEEEcCccCCCCeEEEEcCCH--HHHHHHhc
Confidence 8999999999976 5999999998764 45788877753 44444443
No 13
>KOG1320|consensus
Probab=98.18 E-value=2.7e-06 Score=72.82 Aligned_cols=130 Identities=19% Similarity=0.181 Sum_probs=100.6
Q ss_pred CcEEEEeeeEeeCCCcEEEEEEcCC----CCCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCccc
Q psy2771 19 NSLLTLPNIAYYFEKHIILFHCLQN----NYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNK 94 (174)
Q Consensus 19 ~~~~~a~~v~~d~~~DlAllkv~~~----~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~~~ 94 (174)
-+.+.+++...-.++|+|++.++.. ...|+.+++. +.-.+.++++| +....++.|.|++.....+..+ ..
T Consensus 123 ~~k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~i--p~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~-~~ 196 (473)
T KOG1320|consen 123 PRKYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDI--PSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHS-ST 196 (473)
T ss_pred chhhhhhHHHhhhcccceEEEEeeccccCCCcccccCCC--cccCccEEEEc---CCcEEEEeeEEEEEEeccccCC-Cc
Confidence 3567788888888999999999853 3335666554 56678899998 7788999999999887655433 23
Q ss_pred cccEEEEeeecCCCCccceEEcCCCcEEEEEeeec--CCCeEEEEeHHHHHHHHHHHHhCCe
Q psy2771 95 TINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIEFLTNYKRKGK 154 (174)
Q Consensus 95 ~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~--~~~~~~aiPi~~i~~~l~~l~~~g~ 154 (174)
....+++++.+.+|+||+|.+...++..|+++... .+++.+.||.-.+.++.......+.
T Consensus 197 ~l~~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~ 258 (473)
T KOG1320|consen 197 VLLRVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAI 258 (473)
T ss_pred ceeeEEEEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeecc
Confidence 44568999999999999999988899999999887 4567889998777777665544443
No 14
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.05 E-value=7.9e-05 Score=59.16 Aligned_cols=87 Identities=21% Similarity=0.232 Sum_probs=61.1
Q ss_pred CCCCCCCCEEEEEecCCCCCCc----eeecEEeeeccCccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeee
Q psy2771 53 AADIRNGEFVIAMGSPLTLNNT----NTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 53 ~~~~~~G~~v~~~G~p~g~~~~----~~~G~vs~~~~~~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~ 128 (174)
....+.++.+-++|||.+..+. ...+.+.... ...+.+++.+++|+||+||++.+.++||++..+
T Consensus 155 ~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~-----------~~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g 223 (251)
T COG3591 155 ASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK-----------GNKLFYDADTLPGSSGSPVLISKDEVIGVHYNG 223 (251)
T ss_pred ccccccCceeEEEeccCCCCcceeEeeecceeEEEe-----------cceEEEEecccCCCCCCceEecCceEEEEEecC
Confidence 3457899999999999775532 2233333221 246899999999999999999999999999987
Q ss_pred cCC----CeEE-EEeHHHHHHHHHHHH
Q psy2771 129 VTA----GISF-AIPIDYAIEFLTNYK 150 (174)
Q Consensus 129 ~~~----~~~~-aiPi~~i~~~l~~l~ 150 (174)
... ..++ +.-...++++++++.
T Consensus 224 ~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 224 PGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred CCcccccccCcceEecHHHHHHHHHhh
Confidence 651 2333 344456677776654
No 15
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=97.87 E-value=0.00014 Score=55.76 Aligned_cols=100 Identities=20% Similarity=0.200 Sum_probs=57.8
Q ss_pred CCCcEEEEEEcC-----CCCCceeecCC-CCCCCCCEEEEEecCCCCC------CceeecEEeeeccCcc--ccC---cc
Q psy2771 31 FEKHIILFHCLQ-----NNYPALKLGKA-ADIRNGEFVIAMGSPLTLN------NTNTFGIISNKQRSSE--TLG---LN 93 (174)
Q Consensus 31 ~~~DlAllkv~~-----~~~~~~~l~~~-~~~~~G~~v~~~G~p~g~~------~~~~~G~vs~~~~~~~--~~~---~~ 93 (174)
...|+|||+++. ..+.|+.|... ..+..++.+.+.|++.... .......+........ ... ..
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~ 166 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAI 166 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhcccccc
Confidence 467999999974 24667777553 3577899999999886542 1111222211111000 000 00
Q ss_pred ccccEEE----EeeecCCCCccceEEcCCC--cEEEEEeeecC
Q psy2771 94 KTINYIQ----TDAAITFGNSGGPLVNLDG--EVIGINSMKVT 130 (174)
Q Consensus 94 ~~~~~~~----~~~~~~~G~SGGPl~n~~G--~liGI~~~~~~ 130 (174)
....+-. .....++|+|||||+...+ .|+||++.+..
T Consensus 167 ~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~~ 209 (229)
T smart00020 167 TDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGSG 209 (229)
T ss_pred CCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECCC
Confidence 0000000 1456789999999995443 89999999763
No 16
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=97.58 E-value=0.00013 Score=65.79 Aligned_cols=57 Identities=25% Similarity=0.302 Sum_probs=45.4
Q ss_pred ccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecC------------CCeEEEEeHHHHHHHHHHHH
Q psy2771 94 KTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT------------AGISFAIPIDYAIEFLTNYK 150 (174)
Q Consensus 94 ~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~------------~~~~~aiPi~~i~~~l~~l~ 150 (174)
...-.+.++..++.||||+||+|.+|+|||+++-+.- ...+.+|-+..+..+++.+-
T Consensus 619 ~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 619 SVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred CeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 3455678999999999999999999999999996531 23577788888888877653
No 17
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=97.21 E-value=0.00075 Score=52.23 Aligned_cols=46 Identities=24% Similarity=0.504 Sum_probs=35.7
Q ss_pred cEEEEeeecCCCCccceEEcCCCcEEEEEeeec--CCCeEEEEeHHHHH
Q psy2771 97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAI 143 (174)
Q Consensus 97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~--~~~~~~aiPi~~i~ 143 (174)
.++..+..+.+||||+|++ .+|+|||-++... .+..+|.+++++..
T Consensus 169 ~Ll~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~ML 216 (218)
T PF05580_consen 169 RLLEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWML 216 (218)
T ss_pred chhhhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHHh
Confidence 3444445677999999999 9999999999765 34579999987653
No 18
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus, where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=97.18 E-value=0.0043 Score=55.11 Aligned_cols=117 Identities=15% Similarity=0.178 Sum_probs=71.8
Q ss_pred eCCCcEEEEEEcCC---------CC------CceeecCC------CCCCCCCEEEEEecCCCCCCceeecEEeeeccCcc
Q psy2771 30 YFEKHIILFHCLQN---------NY------PALKLGKA------ADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSE 88 (174)
Q Consensus 30 d~~~DlAllkv~~~---------~~------~~~~l~~~------~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~ 88 (174)
..-.|+||||++.. ++ |.+.+.+. ..+++|..|+-+|...+ .+.|.+.+..-..-
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw 615 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYW 615 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEe
Confidence 33459999999741 22 22333221 34678999999988766 44565555432110
Q ss_pred ccCccccccEEEEe----eecCCCCccceEEcCCCc------EEEEEeeecC--CCeEEEEeHHHHHHHHHHHH
Q psy2771 89 TLGLNKTINYIQTD----AAITFGNSGGPLVNLDGE------VIGINSMKVT--AGISFAIPIDYAIEFLTNYK 150 (174)
Q Consensus 89 ~~~~~~~~~~~~~~----~~~~~G~SGGPl~n~~G~------liGI~~~~~~--~~~~~aiPi~~i~~~l~~l~ 150 (174)
..+.-...+++... .=...|+||+=|++.-++ |+||.+..-. ..++++.|+.+|.+-|++.-
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~vT 689 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEVT 689 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHhh
Confidence 00100112333333 223489999999976444 9999998653 35889999999888777754
No 19
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.95 E-value=0.0011 Score=47.79 Aligned_cols=34 Identities=29% Similarity=0.509 Sum_probs=25.0
Q ss_pred cEEEEeeecCCCCccceEEcCCCcEEEEEeeecC
Q psy2771 97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT 130 (174)
Q Consensus 97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~ 130 (174)
.+...+....+|.||+|+||.+|++|||...+..
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE 119 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence 4556677788999999999999999999987763
No 20
>KOG3627|consensus
Probab=96.92 E-value=0.02 Score=44.99 Aligned_cols=117 Identities=19% Similarity=0.185 Sum_probs=64.4
Q ss_pred CcEEEEEEcC-----CCCCceeecCCCC---CCCCCEEEEEecCCCC------CCceeecEEeeecc--CccccCcc-cc
Q psy2771 33 KHIILFHCLQ-----NNYPALKLGKAAD---IRNGEFVIAMGSPLTL------NNTNTFGIISNKQR--SSETLGLN-KT 95 (174)
Q Consensus 33 ~DlAllkv~~-----~~~~~~~l~~~~~---~~~G~~v~~~G~p~g~------~~~~~~G~vs~~~~--~~~~~~~~-~~ 95 (174)
+|||+|+++. +.+.|+.|..... ...++..++.|.+... ...+....+.-... ........ ..
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~ 185 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTI 185 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCcccc
Confidence 7999999984 3566777743332 3455888888865431 11222212221111 11111100 00
Q ss_pred -ccEE-----EEeeecCCCCccceEEcCC---CcEEEEEeeecC-CC----eEEEEeHHHHHHHHHHH
Q psy2771 96 -INYI-----QTDAAITFGNSGGPLVNLD---GEVIGINSMKVT-AG----ISFAIPIDYAIEFLTNY 149 (174)
Q Consensus 96 -~~~~-----~~~~~~~~G~SGGPl~n~~---G~liGI~~~~~~-~~----~~~aiPi~~i~~~l~~l 149 (174)
...+ .....+|.|+|||||+-.+ ..++||++++.. ++ .+....+....+++++.
T Consensus 186 ~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~ 253 (256)
T KOG3627|consen 186 TDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKEN 253 (256)
T ss_pred CCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHH
Confidence 0111 1123468999999999554 699999999864 21 35566666666666554
No 21
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=96.56 E-value=0.004 Score=44.76 Aligned_cols=41 Identities=22% Similarity=0.318 Sum_probs=31.1
Q ss_pred cEEEEeeecCCCCccceEEcCCCcEEEEEeeecCCCeEEEE
Q psy2771 97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISFAI 137 (174)
Q Consensus 97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~~~~~ai 137 (174)
.+..-+..-.+|+||-|++|.+|+||||+..+.+++...++
T Consensus 95 rftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaL 135 (158)
T PF00944_consen 95 RFTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTAL 135 (158)
T ss_dssp EEEEETTS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEE
T ss_pred eEEeccCCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEE
Confidence 33344555679999999999999999999998876654444
No 22
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.38 E-value=0.087 Score=42.28 Aligned_cols=105 Identities=18% Similarity=0.177 Sum_probs=67.1
Q ss_pred CCCcEEEEEEcCC---CCCceeecCCC-CCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCccccccEEEEeeecC
Q psy2771 31 FEKHIILFHCLQN---NYPALKLGKAA-DIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAIT 106 (174)
Q Consensus 31 ~~~DlAllkv~~~---~~~~~~l~~~~-~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~~~~~~~~~ 106 (174)
...++.||.++.+ ...|+.|+++. .+..++.+.+.|+... ..+....+.-.... .....+......+
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~~~~~~~i~~~~-------~~~~~~~~~~~~~ 229 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKLKHRKLKITNCT-------KCAYSICTKQYSC 229 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeEEEEEEEEEEee-------ccceeEecccccC
Confidence 4568999999854 78899998753 3678999999988211 11222222111110 1223445556778
Q ss_pred CCCccceEE---cCCCcEEEEEeeecCC---CeEEEEeHHHHHH
Q psy2771 107 FGNSGGPLV---NLDGEVIGINSMKVTA---GISFAIPIDYAIE 144 (174)
Q Consensus 107 ~G~SGGPl~---n~~G~liGI~~~~~~~---~~~~aiPi~~i~~ 144 (174)
.|++|||++ |.+-.||||.+..... +..+++.+..+++
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD 273 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence 999999999 3344699999877632 2577777776654
No 23
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.31 E-value=0.075 Score=39.97 Aligned_cols=108 Identities=17% Similarity=0.195 Sum_probs=60.8
Q ss_pred ecCcEEEEeee--EeeC---CCcEEEEEEcC-CCCCcee--ecCCCCCCCCCEEEEEecCCCCCC-ceeecEEeeeccCc
Q psy2771 17 SFNSLLTLPNI--AYYF---EKHIILFHCLQ-NNYPALK--LGKAADIRNGEFVIAMGSPLTLNN-TNTFGIISNKQRSS 87 (174)
Q Consensus 17 ~~~~~~~a~~v--~~d~---~~DlAllkv~~-~~~~~~~--l~~~~~~~~G~~v~~~G~p~g~~~-~~~~G~vs~~~~~~ 87 (174)
.++..+..... ..+. ..|+++++++. ..++-++ +.+. ..+..+...++=++ .... ....+.+......
T Consensus 51 i~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~~~-~~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i- 127 (172)
T PF00548_consen 51 IDGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFPES-IPEYPECVLLVNST-KFPRMIVEVGFVTNFGFI- 127 (172)
T ss_dssp ETTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSBSS-GGTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-
T ss_pred ECCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhccc-cccCCCcEEEEECC-CCccEEEEEEEEeecCcc-
Confidence 45665555442 2333 45999999963 2222111 1121 12444555555333 3333 3344444433322
Q ss_pred cccCccccccEEEEeeecCCCCccceEEcC---CCcEEEEEeee
Q psy2771 88 ETLGLNKTINYIQTDAAITFGNSGGPLVNL---DGEVIGINSMK 128 (174)
Q Consensus 88 ~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~---~G~liGI~~~~ 128 (174)
..........+.+++++.+|+-||||+.. .++++||+.++
T Consensus 128 -~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 128 -NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp -EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred -ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 11113456788999999999999999942 57899999986
No 24
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=96.09 E-value=0.012 Score=50.01 Aligned_cols=46 Identities=22% Similarity=0.505 Sum_probs=34.7
Q ss_pred EEEEeeecCCCCccceEEcCCCcEEEEEeeec--CCCeEEEEeHHHHHH
Q psy2771 98 YIQTDAAITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIE 144 (174)
Q Consensus 98 ~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~--~~~~~~aiPi~~i~~ 144 (174)
++..+..+.+||||+|++ .+|+|||-++=.. .+..+|.|-+++..+
T Consensus 350 ll~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~Ml~ 397 (402)
T TIGR02860 350 LLEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWMLK 397 (402)
T ss_pred HhhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHHHH
Confidence 333445677999999999 9999999887443 456799997776644
No 25
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=95.85 E-value=0.01 Score=47.48 Aligned_cols=28 Identities=36% Similarity=0.663 Sum_probs=21.7
Q ss_pred cCCCCccceEEcCCCcEEEEEeeecCCC
Q psy2771 105 ITFGNSGGPLVNLDGEVIGINSMKVTAG 132 (174)
Q Consensus 105 ~~~G~SGGPl~n~~G~liGI~~~~~~~~ 132 (174)
+.+|+||+|++..+|.+|||++.....+
T Consensus 205 T~~GDSGSPVVt~dg~liGVHTGSn~~G 232 (297)
T PF05579_consen 205 TGPGDSGSPVVTEDGDLIGVHTGSNKRG 232 (297)
T ss_dssp S-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred cCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence 4589999999999999999999876543
No 26
>KOG1421|consensus
Probab=95.85 E-value=0.18 Score=45.49 Aligned_cols=150 Identities=15% Similarity=0.083 Sum_probs=97.4
Q ss_pred eEEEEEeecCcEEEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEEecCCCCCC-----ceeecEEeeec
Q psy2771 10 DICLSTFSFNSLLTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAMGSPLTLNN-----TNTFGIISNKQ 84 (174)
Q Consensus 10 ~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~G~p~g~~~-----~~~~G~vs~~~ 84 (174)
|+.+.. .|-..+.|.+...++...+|.+|-+++-...++|.+ ..+..||++...|+-..... +++.-.+....
T Consensus 577 d~~vt~-~dS~~i~a~~~fL~~t~n~a~~kydp~~~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~p 654 (955)
T KOG1421|consen 577 DQRVTE-ADSDGIPANVSFLHPTENVASFKYDPALEVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIP 654 (955)
T ss_pred ceEEee-cccccccceeeEecCccceeEeccChhHhhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEec
Confidence 334433 666778888888999999999999977666777755 45899999999998865432 22221111111
Q ss_pred cC-ccccCccccccEEEEeeecCCCCccceEEcCCCcEEEEEeeecCC-------CeEEEEeHHHHHHHHHHHHhCCeee
Q psy2771 85 RS-SETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTA-------GISFAIPIDYAIEFLTNYKRKGKFC 156 (174)
Q Consensus 85 ~~-~~~~~~~~~~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~-------~~~~aiPi~~i~~~l~~l~~~g~~~ 156 (174)
+. ...+. ......+...+.+.-++--|-+.|.+|+++++=-....+ -..|.+.+..+++.+++|+..++..
T Consensus 655 s~~~pr~r-~~n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r 733 (955)
T KOG1421|consen 655 SSVMPRFR-ATNLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR 733 (955)
T ss_pred CCCCccee-ecceEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC
Confidence 11 00111 223455555545445555568899999999986544321 2578899999999999999988765
Q ss_pred eeecCc
Q psy2771 157 AYSKGK 162 (174)
Q Consensus 157 ~~~lg~ 162 (174)
.--+|+
T Consensus 734 p~i~~v 739 (955)
T KOG1421|consen 734 PTIAGV 739 (955)
T ss_pred ceeecc
Confidence 333343
No 27
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=93.19 E-value=0.25 Score=35.21 Aligned_cols=40 Identities=28% Similarity=0.375 Sum_probs=29.7
Q ss_pred ccEEEEeeecCCCCccceEEcCCCcEEEEEeeecCCCeEEE
Q psy2771 96 INYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISFA 136 (174)
Q Consensus 96 ~~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~~~~~~~a 136 (174)
..++.-..+..||+-||+|+ -+--||||++++...-..|+
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg~g~VaF~ 117 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGGEGHVAFA 117 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEETTEEEEE
T ss_pred cCceeecccCCCCCCCceeE-eCCCeEEEEEeCCCceEEEE
Confidence 35666778899999999999 66679999999876544444
No 28
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=92.09 E-value=0.34 Score=34.97 Aligned_cols=38 Identities=32% Similarity=0.606 Sum_probs=27.1
Q ss_pred ecCCCCccceEEcCCCcEEEEEeeecC-CC----eEEEEeHHHH
Q psy2771 104 AITFGNSGGPLVNLDGEVIGINSMKVT-AG----ISFAIPIDYA 142 (174)
Q Consensus 104 ~~~~G~SGGPl~n~~G~liGI~~~~~~-~~----~~~aiPi~~i 142 (174)
....|.||||++-.+|-+|||..+... .+ +.|. |++.+
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred EEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 345899999999999999999887663 22 4444 88765
No 29
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=90.67 E-value=0.2 Score=42.20 Aligned_cols=25 Identities=28% Similarity=0.472 Sum_probs=21.8
Q ss_pred eeecCCCCccceEEcCCCcEEEEEe
Q psy2771 102 DAAITFGNSGGPLVNLDGEVIGINS 126 (174)
Q Consensus 102 ~~~~~~G~SGGPl~n~~G~liGI~~ 126 (174)
...+..|.||+.|+|.+|++|||..
T Consensus 349 ~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 349 NYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred ccCCCCCCCcCeEECCCCCEEEEeC
Confidence 3456699999999999999999975
No 30
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=85.68 E-value=2 Score=36.10 Aligned_cols=49 Identities=18% Similarity=0.310 Sum_probs=34.8
Q ss_pred eecCCCCccceEEcC--CC-cEEEEEeeecC-CC----eEEEEeHHHHHHHHHHHHh
Q psy2771 103 AAITFGNSGGPLVNL--DG-EVIGINSMKVT-AG----ISFAIPIDYAIEFLTNYKR 151 (174)
Q Consensus 103 ~~~~~G~SGGPl~n~--~G-~liGI~~~~~~-~~----~~~aiPi~~i~~~l~~l~~ 151 (174)
...|.|+||||+|=. +| .-+||++|+.. ++ .....-++....++.+..+
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~ 279 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN 279 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence 567899999999922 23 48999999975 21 2445557777888777544
No 31
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=85.39 E-value=1.2 Score=34.39 Aligned_cols=45 Identities=18% Similarity=0.206 Sum_probs=18.9
Q ss_pred cEEEEeeecCCCCccceEEcCCCcEEEEEeeec----CCCeEEEEeHHHH
Q psy2771 97 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKV----TAGISFAIPIDYA 142 (174)
Q Consensus 97 ~~~~~~~~~~~G~SGGPl~n~~G~liGI~~~~~----~~~~~~aiPi~~i 142 (174)
.+...-+.+.+|.||.|+++.+ +++|++.... ..+.++--|+.-+
T Consensus 136 ~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip~~ 184 (203)
T PF02122_consen 136 KFASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIPPI 184 (203)
T ss_dssp TEEEE-----TT-TT-EEE-SS--EEEEEEEE------------------
T ss_pred cCCceEcCCCCCCCCCCeEECC-CceEeecCccccccccccccccccccc
Confidence 4667778899999999999877 9999999842 2466776666544
No 32
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=79.17 E-value=3.2 Score=32.42 Aligned_cols=53 Identities=17% Similarity=0.371 Sum_probs=38.6
Q ss_pred cEEEEeeecCCCCccceEE--cCC--CcEEEEEeeecC-CCeEEEEeH--HHHHHHHHHH
Q psy2771 97 NYIQTDAAITFGNSGGPLV--NLD--GEVIGINSMKVT-AGISFAIPI--DYAIEFLTNY 149 (174)
Q Consensus 97 ~~~~~~~~~~~G~SGGPl~--n~~--G~liGI~~~~~~-~~~~~aiPi--~~i~~~l~~l 149 (174)
.-+++..++..|+=|+|++ |.+ -+++||+.++.. .+.+||-++ +.+.+.+..+
T Consensus 169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANHAMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccccceehhhhhHHHHHHHHHhh
Confidence 4567888999999999999 222 589999999985 467787554 4444444444
No 33
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=75.82 E-value=3.7 Score=23.98 Aligned_cols=21 Identities=43% Similarity=0.563 Sum_probs=18.1
Q ss_pred CCCccceEEcCCCcEEEEEee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~ 127 (174)
.+.+.-|++|.+|+++|+++.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 457778999999999999875
No 34
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=70.96 E-value=16 Score=22.70 Aligned_cols=26 Identities=15% Similarity=0.464 Sum_probs=23.5
Q ss_pred ecCcEEEEeeeEeeCCCcEEEEEEcC
Q psy2771 17 SFNSLLTLPNIAYYFEKHIILFHCLQ 42 (174)
Q Consensus 17 ~~~~~~~a~~v~~d~~~DlAllkv~~ 42 (174)
-+|..++.+++++|....+.+||-.+
T Consensus 14 c~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 14 CFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cCCceEEEEEEEecCCCcEEEEECcc
Confidence 67999999999999999999999553
No 35
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=70.86 E-value=6.8 Score=35.94 Aligned_cols=39 Identities=26% Similarity=0.537 Sum_probs=26.9
Q ss_pred CCcEEEEEEc-C----------CCCC-----ceeecCCCCCCCCCEEEEEecCCCC
Q psy2771 32 EKHIILFHCL-Q----------NNYP-----ALKLGKAADIRNGEFVIAMGSPLTL 71 (174)
Q Consensus 32 ~~DlAllkv~-~----------~~~~-----~~~l~~~~~~~~G~~v~~~G~p~g~ 71 (174)
..|++++|+= . ++.| .+++. ...++.||.|+++|||...
T Consensus 199 tgDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is-~~G~keGD~vmv~GyPG~T 253 (698)
T PF10459_consen 199 TGDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKIS-LKGVKEGDFVMVAGYPGRT 253 (698)
T ss_pred CCceEEEEEEeCCCCCccccCcCCCCCCCccccccC-CCCCCCCCeEEEccCCCcc
Confidence 4499999992 1 1222 34443 3568999999999999653
No 36
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=68.11 E-value=5.4 Score=27.51 Aligned_cols=18 Identities=44% Similarity=0.785 Sum_probs=13.6
Q ss_pred ceEEcCCCcEEEEEeeec
Q psy2771 112 GPLVNLDGEVIGINSMKV 129 (174)
Q Consensus 112 GPl~n~~G~liGI~~~~~ 129 (174)
.|++|.+|++||+++.+.
T Consensus 94 ~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEE-TTS-EEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEE
Confidence 488899999999998764
No 37
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=62.36 E-value=12 Score=23.67 Aligned_cols=31 Identities=32% Similarity=0.606 Sum_probs=24.1
Q ss_pred ceEEcCCCcEEEEEeeecCCCeEEEEeHHHHHHHHHHHH
Q psy2771 112 GPLVNLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYK 150 (174)
Q Consensus 112 GPl~n~~G~liGI~~~~~~~~~~~aiPi~~i~~~l~~l~ 150 (174)
-|+.+.+|+++|++.. .+..+.+.++++++.
T Consensus 19 ~pi~~~~g~~~Gvv~~--------di~l~~l~~~i~~~~ 49 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGI--------DISLDQLSEIISNIK 49 (81)
T ss_dssp EEEEETTTEEEEEEEE--------EEEHHHHHHHHTTSB
T ss_pred EEEECCCCCEEEEEEE--------EeccceeeeEEEeeE
Confidence 4888889999999764 577778877777754
No 38
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=61.42 E-value=28 Score=25.88 Aligned_cols=74 Identities=19% Similarity=0.207 Sum_probs=46.0
Q ss_pred CCCCCCEEEEEecCCCCCCceeecEEeeeccCccccCc--cccccEEEEeeecCCCCccceEEcC-CCcEEEEEeeecC
Q psy2771 55 DIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGL--NKTINYIQTDAAITFGNSGGPLVNL-DGEVIGINSMKVT 130 (174)
Q Consensus 55 ~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~~~~~~--~~~~~~~~~~~~~~~G~SGGPl~n~-~G~liGI~~~~~~ 130 (174)
.-..|...|++ +|...+.+-+.|.+-.......++.- .+... -.+|..-..|.||=|+|.. .|++||=+-.+-+
T Consensus 108 gcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtp-af~~~knlkg~s~~pifeassgr~vgr~k~gkn 184 (211)
T PF05578_consen 108 GCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTP-AFFDLKNLKGWSGLPIFEASSGRVVGRVKVGKN 184 (211)
T ss_pred CCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCc-ceeeccccCCCCCCceeeccCCcEEEEEEecCC
Confidence 35678888888 77766667777766655543221110 00001 1234455689999999954 5999998776543
No 39
>COG2524 Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]
Probab=60.70 E-value=90 Score=25.36 Aligned_cols=21 Identities=33% Similarity=0.678 Sum_probs=18.5
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.|..|.|++|.+ +++||.+..
T Consensus 201 ~~i~GaPVvd~d-k~vGiit~~ 221 (294)
T COG2524 201 KGIRGAPVVDDD-KIVGIITLS 221 (294)
T ss_pred cCccCCceecCC-ceEEEEEHH
Confidence 899999999766 999999854
No 40
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=59.79 E-value=8.1 Score=26.14 Aligned_cols=22 Identities=32% Similarity=0.350 Sum_probs=18.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+.=||+|.+|+++|+++..
T Consensus 97 ~~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 97 EGISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred cCCceEEEECCCCcEEEEEeHH
Confidence 5556679999999999999864
No 41
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=58.94 E-value=28 Score=22.83 Aligned_cols=47 Identities=13% Similarity=0.244 Sum_probs=32.1
Q ss_pred EEEeeeEeeCCCcEEEEEEcC-CCCCceeecCCCCCCCCCEEEE-EecCC
Q psy2771 22 LTLPNIAYYFEKHIILFHCLQ-NNYPALKLGKAADIRNGEFVIA-MGSPL 69 (174)
Q Consensus 22 ~~a~~v~~d~~~DlAllkv~~-~~~~~~~l~~~~~~~~G~~v~~-~G~p~ 69 (174)
++.+++..|...++|++.+-. ..--.+.|-.. +++.|+.|++ +||..
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV~l~Lv~~-~v~~GdyVLVHvGfAi 53 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREVNLDLVGE-EVKVGDYVLVHVGFAM 53 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEEEeeeecC-ccccCCEEEEEeeEEE
Confidence 467888888888899999864 11112233222 6899999998 78764
No 42
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=55.19 E-value=12 Score=24.88 Aligned_cols=22 Identities=14% Similarity=0.197 Sum_probs=17.5
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+-=||+|.+|+++|+++..
T Consensus 85 ~~~~~lpVvd~~~~~~Giit~~ 106 (111)
T cd04603 85 TEPPVVAVVDKEGKLVGTIYER 106 (111)
T ss_pred cCCCeEEEEcCCCeEEEEEEhH
Confidence 4555569999999999999853
No 43
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=54.47 E-value=26 Score=32.70 Aligned_cols=47 Identities=23% Similarity=0.258 Sum_probs=30.6
Q ss_pred eecCCCCccceEE--cCC-Cc--EEEEEeeecCC----CeEEEEeHHHHHHHHHHH
Q psy2771 103 AAITFGNSGGPLV--NLD-GE--VIGINSMKVTA----GISFAIPIDYAIEFLTNY 149 (174)
Q Consensus 103 ~~~~~G~SGGPl~--n~~-G~--liGI~~~~~~~----~~~~aiPi~~i~~~l~~l 149 (174)
....+|+||+||| |.. .+ |+|+.+..... +....+|.+.+.++.++.
T Consensus 211 n~~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~~~~~~~~~~f~~~~~~~d 266 (769)
T PF02395_consen 211 NYGSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKGNWWNVIPPDFINQIKQND 266 (769)
T ss_dssp EB--TT-TT-EEEEEETTTTEEEEEEEEEEECCCCHSEEEEEEECHHHHHHHHHHC
T ss_pred cccccCcCCCceEEEEccCCeEEEEEEEccccccCCccceeEEecHHHHHHHHhhh
Confidence 3456999999999 333 33 99999876542 356678888887777764
No 44
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=53.93 E-value=33 Score=23.71 Aligned_cols=23 Identities=13% Similarity=0.257 Sum_probs=16.9
Q ss_pred CCCcEEEEEEcCCCCCceeecCC
Q psy2771 31 FEKHIILFHCLQNNYPALKLGKA 53 (174)
Q Consensus 31 ~~~DlAllkv~~~~~~~~~l~~~ 53 (174)
...|+|+++.+...+|.+++++.
T Consensus 34 ~~ge~~~v~~~~~~~p~~~ig~g 56 (105)
T PF03510_consen 34 TDGELCWVQSPLVHLPAAQIGTG 56 (105)
T ss_pred eccCEEEEECCCCCCCeeEeccC
Confidence 34699999998766777777543
No 45
>cd04618 CBS_pair_5 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=53.25 E-value=29 Score=22.71 Aligned_cols=48 Identities=10% Similarity=0.041 Sum_probs=33.5
Q ss_pred CCCccceEEcCC-CcEEEEEeeecCCC---eEEEEeHHHHHHHHHHHHhCCe
Q psy2771 107 FGNSGGPLVNLD-GEVIGINSMKVTAG---ISFAIPIDYAIEFLTNYKRKGK 154 (174)
Q Consensus 107 ~G~SGGPl~n~~-G~liGI~~~~~~~~---~~~aiPi~~i~~~l~~l~~~g~ 154 (174)
.+.++-|++|.+ |+++||++..--.. ....-|-+.+.+.++.+.+++.
T Consensus 22 ~~~~~~~Vvd~~~~~~~Givt~~Dl~~~~~~~~v~~~~~l~~a~~~m~~~~~ 73 (98)
T cd04618 22 NGIRSAPLWDSRKQQFVGMLTITDFILILRLVSIHPERSLFDAALLLLKNKI 73 (98)
T ss_pred cCCceEEEEeCCCCEEEEEEEHHHHhhheeeEEeCCCCcHHHHHHHHHHCCC
Confidence 456788999875 89999999542111 3445566678888888877654
No 46
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=52.69 E-value=14 Score=24.49 Aligned_cols=22 Identities=18% Similarity=0.360 Sum_probs=17.7
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|++|.+|+++|+++..
T Consensus 89 ~~~~~~pVvd~~~~~~Gvit~~ 110 (115)
T cd04620 89 HQIRHLPVLDDQGQLIGLVTAE 110 (115)
T ss_pred hCCceEEEEcCCCCEEEEEEhH
Confidence 4556679999999999999853
No 47
>cd04643 CBS_pair_30 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=48.40 E-value=16 Score=24.12 Aligned_cols=17 Identities=41% Similarity=0.581 Sum_probs=14.9
Q ss_pred ceEEcCCCcEEEEEeee
Q psy2771 112 GPLVNLDGEVIGINSMK 128 (174)
Q Consensus 112 GPl~n~~G~liGI~~~~ 128 (174)
-|++|.+|+++||++..
T Consensus 95 ~~Vv~~~~~~~Gvit~~ 111 (116)
T cd04643 95 LPVVDDDGIFIGIITRR 111 (116)
T ss_pred eeEEeCCCeEEEEEEHH
Confidence 69999999999999864
No 48
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=47.02 E-value=21 Score=24.03 Aligned_cols=21 Identities=29% Similarity=0.374 Sum_probs=18.1
Q ss_pred CCCccceEEcCCCcEEEEEee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~ 127 (174)
.+...-||+|.+|+++||++.
T Consensus 87 ~~~~~lpVvd~~~~l~Givt~ 107 (113)
T cd04597 87 HNIRTLPVVDDDGTPAGIITL 107 (113)
T ss_pred cCCCEEEEECCCCeEEEEEEH
Confidence 566788999999999999975
No 49
>cd04619 CBS_pair_6 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=46.92 E-value=19 Score=23.87 Aligned_cols=22 Identities=18% Similarity=0.339 Sum_probs=17.6
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...=|++|.+|+++|+++..
T Consensus 88 ~~~~~lpVvd~~~~~~Gvi~~~ 109 (114)
T cd04619 88 RGLKNIPVVDENARPLGVLNAR 109 (114)
T ss_pred cCCCeEEEECCCCcEEEEEEhH
Confidence 4555679999889999999863
No 50
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC ClC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. ClC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the ClC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually
Probab=46.29 E-value=22 Score=25.00 Aligned_cols=22 Identities=23% Similarity=0.099 Sum_probs=18.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.++-||+|.+|+++|+++..
T Consensus 22 ~~~~~~~VvD~~g~l~Givt~~ 43 (133)
T cd04592 22 EKQSCVLVVDSDDFLEGILTLG 43 (133)
T ss_pred cCCCEEEEECCCCeEEEEEEHH
Confidence 3556789999999999999953
No 51
>cd04801 CBS_pair_M50_like This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the metalloprotease peptidase M50. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=46.28 E-value=20 Score=23.68 Aligned_cols=22 Identities=27% Similarity=0.314 Sum_probs=18.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+--||+|.+|+++|+++..
T Consensus 88 ~~~~~l~Vv~~~~~~~Gvl~~~ 109 (114)
T cd04801 88 QGLDELAVVEDSGQVIGLITEA 109 (114)
T ss_pred CCCCeeEEEcCCCcEEEEEecc
Confidence 5566679998889999999864
No 52
>cd04607 CBS_pair_NTP_transferase_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain downstream. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=44.86 E-value=22 Score=23.43 Aligned_cols=22 Identities=18% Similarity=0.457 Sum_probs=17.7
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-||+|.+|+++|+++..
T Consensus 87 ~~~~~~~Vv~~~~~~~Gvit~~ 108 (113)
T cd04607 87 RSIRHLPILDEEGRVVGLATLD 108 (113)
T ss_pred CCCCEEEEECCCCCEEEEEEhH
Confidence 4556679999889999999853
No 53
>PRK15431 ferrous iron transport protein FeoC; Provisional
Probab=44.29 E-value=20 Score=23.38 Aligned_cols=27 Identities=15% Similarity=0.177 Sum_probs=23.8
Q ss_pred eEEEEeHHHHHHHHHHHHhCCeeeeee
Q psy2771 133 ISFAIPIDYAIEFLTNYKRKGKFCAYS 159 (174)
Q Consensus 133 ~~~aiPi~~i~~~l~~l~~~g~~~~~~ 159 (174)
..|..|.+.+...|+.|.+.|++.+..
T Consensus 24 ~~~~~p~~~VeaMLe~l~~kGkverv~ 50 (78)
T PRK15431 24 QTLNTPQPMINAMLQQLESMGKAVRIQ 50 (78)
T ss_pred HHHCcCHHHHHHHHHHHHHCCCeEeec
Confidence 367899999999999999999998663
No 54
>COG5428 Uncharacterized conserved small protein [Function unknown]
Probab=44.27 E-value=43 Score=21.29 Aligned_cols=18 Identities=11% Similarity=0.237 Sum_probs=14.5
Q ss_pred eeEeeCCCcEEEEEEcCC
Q psy2771 26 NIAYYFEKHIILFHCLQN 43 (174)
Q Consensus 26 ~v~~d~~~DlAllkv~~~ 43 (174)
.+.||++.|++-|.+.+.
T Consensus 2 kv~YD~daD~lYI~~~~~ 19 (69)
T COG5428 2 KVKYDTDADILYILLEEG 19 (69)
T ss_pred ceeecCCCcEEEEEEecC
Confidence 367999999998888754
No 55
>cd04602 CBS_pair_IMPDH_2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain in IMPDH have been associated with retinitis pigmentos
Probab=43.17 E-value=23 Score=23.44 Aligned_cols=22 Identities=27% Similarity=0.398 Sum_probs=17.9
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|++|.+|+++|+++..
T Consensus 88 ~~~~~~pVv~~~~~~~Gvit~~ 109 (114)
T cd04602 88 SKKGKLPIVNDDGELVALVTRS 109 (114)
T ss_pred cCCCceeEECCCCeEEEEEEHH
Confidence 4555679999899999999853
No 56
>cd04590 CBS_pair_CorC_HlyC_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the CorC_HlyC domain. CorC_HlyC is a transporter associated domain. This small domain is found in Na+/H+ antiporters, in proteins involved in magnesium and cobalt efflux, and in association with some proteins of unknown function. The function of the CorC_HlyC domain is uncertain but it might be involved in modulating transport of ion substrates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role,
Probab=43.08 E-value=22 Score=23.24 Aligned_cols=22 Identities=14% Similarity=0.149 Sum_probs=17.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+=-|++|.+|+++|+++..
T Consensus 85 ~~~~~~~Vv~~~~~~~Gvit~~ 106 (111)
T cd04590 85 ERSHMAIVVDEYGGTAGLVTLE 106 (111)
T ss_pred cCCcEEEEEECCCCEEEEeEHH
Confidence 3455568899889999999853
No 57
>cd04617 CBS_pair_4 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=42.88 E-value=21 Score=23.81 Aligned_cols=22 Identities=27% Similarity=0.155 Sum_probs=17.0
Q ss_pred CCCccceEEcCC---CcEEEEEeee
Q psy2771 107 FGNSGGPLVNLD---GEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~---G~liGI~~~~ 128 (174)
.+..-=||+|.+ |+++|+++..
T Consensus 89 ~~~~~lpVvd~~~~~~~l~Gvit~~ 113 (118)
T cd04617 89 HQVDSLPVVEKVDEGLEVIGRITKT 113 (118)
T ss_pred cCCCEeeEEeCCCccceEEEEEEhh
Confidence 455567999887 7999999864
No 58
>COG3448 CBS-domain-containing membrane protein [Signal transduction mechanisms]
Probab=42.54 E-value=21 Score=29.60 Aligned_cols=21 Identities=29% Similarity=0.537 Sum_probs=16.5
Q ss_pred CCccceEEcCCCcEEEEEeee
Q psy2771 108 GNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~~ 128 (174)
|.--=|++|.+|+++||++..
T Consensus 345 g~H~lpvld~~g~lvGIvsQt 365 (382)
T COG3448 345 GLHALPVLDAAGKLVGIVSQT 365 (382)
T ss_pred CcceeeEEcCCCcEEEEeeHH
Confidence 333459999999999999853
No 59
>cd04582 CBS_pair_ABC_OpuCA_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzi
Probab=42.46 E-value=25 Score=22.69 Aligned_cols=22 Identities=27% Similarity=0.291 Sum_probs=17.5
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+--|++|.+|+++|+++..
T Consensus 80 ~~~~~~~Vv~~~~~~~Gvi~~~ 101 (106)
T cd04582 80 HDMSWLPCVDEDGRYVGEVTQR 101 (106)
T ss_pred CCCCeeeEECCCCcEEEEEEHH
Confidence 4555579999899999999864
No 60
>cd04641 CBS_pair_28 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=42.22 E-value=27 Score=23.32 Aligned_cols=22 Identities=27% Similarity=0.362 Sum_probs=18.2
Q ss_pred CCCCccceEEcCCCcEEEEEee
Q psy2771 106 TFGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 106 ~~G~SGGPl~n~~G~liGI~~~ 127 (174)
..+.+.-||+|.+|+++|+++.
T Consensus 21 ~~~~~~~pVv~~~~~~~Giv~~ 42 (120)
T cd04641 21 ERRVSALPIVDENGKVVDVYSR 42 (120)
T ss_pred HcCCCeeeEECCCCeEEEEEeH
Confidence 3466778999999999999983
No 61
>cd04601 CBS_pair_IMPDH This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain in IMPDH have been associated with retinitis pigmentosa.
Probab=42.19 E-value=24 Score=22.84 Aligned_cols=22 Identities=23% Similarity=0.377 Sum_probs=17.3
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+..--|++|.+|+++|+++..
T Consensus 84 ~~~~~~~Vv~~~~~~~Gvi~~~ 105 (110)
T cd04601 84 HKIEKLPVVDDEGKLKGLITVK 105 (110)
T ss_pred hCCCeeeEEcCCCCEEEEEEhh
Confidence 3455568999889999999864
No 62
>smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
Probab=42.14 E-value=28 Score=18.23 Aligned_cols=20 Identities=30% Similarity=0.554 Sum_probs=15.7
Q ss_pred CCccceEEcCCCcEEEEEee
Q psy2771 108 GNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~ 127 (174)
+.+.-|+++.+++++|+++.
T Consensus 22 ~~~~~~v~~~~~~~~g~i~~ 41 (49)
T smart00116 22 GIRRLPVVDEEGRLVGIVTR 41 (49)
T ss_pred CCCcccEECCCCeEEEEEEH
Confidence 44566888888999999875
No 63
>cd04614 CBS_pair_1 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=41.96 E-value=30 Score=22.38 Aligned_cols=48 Identities=19% Similarity=0.175 Sum_probs=32.4
Q ss_pred CCCccceEEcCCCcEEEEEeeec--C-CCeEEEEeHHHHHHHHHHHHhCCe
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMKV--T-AGISFAIPIDYAIEFLTNYKRKGK 154 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~~--~-~~~~~aiPi~~i~~~l~~l~~~g~ 154 (174)
.+.+.-|++|.+|+++|+++... . ....+.-|-+.+.+.++.+.+.+.
T Consensus 22 ~~~~~~~V~d~~~~~~Giv~~~dl~~~~~~~~v~~~~~l~~a~~~m~~~~~ 72 (96)
T cd04614 22 ANVKALPVLDDDGKLSGIITERDLIAKSEVVTATKRTTVSECAQKMKRNRI 72 (96)
T ss_pred cCCCeEEEECCCCCEEEEEEHHHHhcCCCcEEecCCCCHHHHHHHHHHhCC
Confidence 46678899999999999998543 1 123444455666777777766544
No 64
>cd04583 CBS_pair_ABC_OpuCA_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyz
Probab=41.87 E-value=25 Score=22.70 Aligned_cols=22 Identities=27% Similarity=0.477 Sum_probs=17.4
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+..--|++|.+|+++|+++..
T Consensus 83 ~~~~~~~vv~~~g~~~Gvit~~ 104 (109)
T cd04583 83 RGPKYVPVVDEDGKLVGLITRS 104 (109)
T ss_pred cCCceeeEECCCCeEEEEEehH
Confidence 3555669999999999999864
No 65
>cd04596 CBS_pair_DRTGG_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=41.25 E-value=26 Score=22.86 Aligned_cols=22 Identities=27% Similarity=0.303 Sum_probs=17.9
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|++|.+|+++|+++..
T Consensus 82 ~~~~~~~Vv~~~~~~~G~it~~ 103 (108)
T cd04596 82 EGIEMLPVVDDNKKLLGIISRQ 103 (108)
T ss_pred cCCCeeeEEcCCCCEEEEEEHH
Confidence 4556779999999999998753
No 66
>COG3290 CitA Signal transduction histidine kinase regulating citrate/malate metabolism [Signal transduction mechanisms]
Probab=41.02 E-value=23 Score=31.50 Aligned_cols=18 Identities=33% Similarity=0.569 Sum_probs=16.4
Q ss_pred ceEEcCCCcEEEEEeeec
Q psy2771 112 GPLVNLDGEVIGINSMKV 129 (174)
Q Consensus 112 GPl~n~~G~liGI~~~~~ 129 (174)
-|+||.+|++||+++-++
T Consensus 143 ~PI~d~~g~~IGvVsVG~ 160 (537)
T COG3290 143 VPIFDEDGKQIGVVSVGY 160 (537)
T ss_pred cceECCCCCEEEEEEEee
Confidence 499999999999999875
No 67
>cd04606 CBS_pair_Mg_transporter This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE. MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=40.26 E-value=28 Score=22.74 Aligned_cols=22 Identities=23% Similarity=0.522 Sum_probs=17.3
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+....|++|.+|+++|+++..
T Consensus 82 ~~~~~~~Vv~~~~~~~Gvit~~ 103 (109)
T cd04606 82 YDLLALPVVDEEGRLVGIITVD 103 (109)
T ss_pred cCCceeeeECCCCcEEEEEEhH
Confidence 3445679999899999999864
No 68
>cd04642 CBS_pair_29 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=40.07 E-value=28 Score=23.51 Aligned_cols=20 Identities=20% Similarity=0.283 Sum_probs=15.7
Q ss_pred CccceEEcCCCcEEEEEeee
Q psy2771 109 NSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 109 ~SGGPl~n~~G~liGI~~~~ 128 (174)
.+=-|++|.+|+++|+++..
T Consensus 102 ~~~l~Vvd~~~~~~Giit~~ 121 (126)
T cd04642 102 VHRVWVVDEEGKPIGVITLT 121 (126)
T ss_pred CcEEEEECCCCCEEEEEEHH
Confidence 33358998889999999853
No 69
>cd04610 CBS_pair_ParBc_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain downstream. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=38.46 E-value=31 Score=22.23 Aligned_cols=19 Identities=26% Similarity=0.431 Sum_probs=15.2
Q ss_pred ccceEEcCCCcEEEEEeee
Q psy2771 110 SGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 110 SGGPl~n~~G~liGI~~~~ 128 (174)
.--|++|.+|+++|+++..
T Consensus 84 ~~~~Vv~~~g~~~Gvi~~~ 102 (107)
T cd04610 84 SKLPVVDENNNLVGIITNT 102 (107)
T ss_pred CeEeEECCCCeEEEEEEHH
Confidence 3458889899999998753
No 70
>cd04640 CBS_pair_27 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=38.20 E-value=28 Score=23.58 Aligned_cols=22 Identities=23% Similarity=0.297 Sum_probs=17.8
Q ss_pred CCCccceEEcCC-CcEEEEEeee
Q psy2771 107 FGNSGGPLVNLD-GEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~-G~liGI~~~~ 128 (174)
.+.+--||+|.+ |+++|+++..
T Consensus 99 ~~~~~lpVvd~~~~~~~G~it~~ 121 (126)
T cd04640 99 SGRQHALVVDREHHQIRGIISTS 121 (126)
T ss_pred CCCceEEEEECCCCEEEEEEeHH
Confidence 566678999887 8999999853
No 71
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation.; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=38.16 E-value=1e+02 Score=19.34 Aligned_cols=43 Identities=9% Similarity=0.128 Sum_probs=29.0
Q ss_pred EEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEEE
Q psy2771 22 LTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIAM 65 (174)
Q Consensus 22 ~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~~ 65 (174)
++++++..+.....|++..... ...+.+.=-.++++||.|++-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G~-~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGGV-RREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETTE-EEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCCc-EEEEEEEEeCCCCCCCEEEEe
Confidence 5788888888899999988742 333433333458999999985
No 72
>cd04600 CBS_pair_HPP_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the HPP motif domain. These proteins are integral membrane proteins with four transmembrane spanning helices. The function of these proteins is uncertain, but they are thought to be transporters. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=37.36 E-value=32 Score=22.90 Aligned_cols=22 Identities=27% Similarity=0.424 Sum_probs=18.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+.-|++|.+|+++|+++..
T Consensus 98 ~~~~~~~Vv~~~g~~~Gvit~~ 119 (124)
T cd04600 98 GGHHHVPVVDEDRRLVGIVTQT 119 (124)
T ss_pred cCCCceeEEcCCCCEEEEEEhH
Confidence 4566789999899999999853
No 73
>cd04615 CBS_pair_2 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=36.58 E-value=34 Score=22.33 Aligned_cols=22 Identities=27% Similarity=0.249 Sum_probs=17.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|++|.+|+++|+++..
T Consensus 87 ~~~~~~~Vvd~~g~~~Gvvt~~ 108 (113)
T cd04615 87 NNISRLPVLDDKGKVGGIVTED 108 (113)
T ss_pred cCCCeeeEECCCCeEEEEEEHH
Confidence 3445679999899999998853
No 74
>cd04609 CBS_pair_PALP_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain upstream. The vitamin B6 complex comprises pyridoxine, pyridoxal, and pyridoxamine, as well as the 5'-phosphate esters of pyridoxal (PALP) and pyridoxamine, the last two being the biologically active coenzyme derivatives. The members of the PALP family are principally involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and other amine-containing compounds. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a pote
Probab=36.39 E-value=32 Score=22.15 Aligned_cols=18 Identities=22% Similarity=0.340 Sum_probs=14.8
Q ss_pred cceEEcCCCcEEEEEeee
Q psy2771 111 GGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 111 GGPl~n~~G~liGI~~~~ 128 (174)
.-|+++.+|+++|+++..
T Consensus 88 ~~~vv~~~~~~~Gvvt~~ 105 (110)
T cd04609 88 VAVVVDEGGKFVGIITRA 105 (110)
T ss_pred ceeEEecCCeEEEEEeHH
Confidence 468888889999998853
No 75
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope.; PDB: 2L8D_A 2DIG_A.
Probab=36.18 E-value=1e+02 Score=18.70 Aligned_cols=42 Identities=12% Similarity=0.097 Sum_probs=26.1
Q ss_pred CCceeEeeeeEEEEEeecCcEE-EEeeeEeeCCCcEEEEEEcC
Q psy2771 1 MPGVEKVTQDICLSTFSFNSLL-TLPNIAYYFEKHIILFHCLQ 42 (174)
Q Consensus 1 ~~~v~~~a~~~~~~~~~~~~~~-~a~~v~~d~~~DlAllkv~~ 42 (174)
||...-...+..-.-+.+...+ ++++..+|...++.-++.+.
T Consensus 1 mp~~k~~~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D 43 (55)
T PF09465_consen 1 MPSRKFAIGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED 43 (55)
T ss_dssp SSSSSS-SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred CCcccccCCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence 4443333333334444555554 99999999999999888874
No 76
>cd04587 CBS_pair_CAP-ED_DUF294_PBI_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with either the CAP_ED (cAMP receptor protein effector domain) family of transcription factors and the DUF294 domain or the PB1 (Phox and Bem1p) domain. Members of CAP_ED include CAP, which binds cAMP, FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme-containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. The PB1 domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pai
Probab=34.84 E-value=36 Score=22.16 Aligned_cols=18 Identities=28% Similarity=0.575 Sum_probs=14.7
Q ss_pred cceEEcCCCcEEEEEeee
Q psy2771 111 GGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 111 GGPl~n~~G~liGI~~~~ 128 (174)
--|+++.+|+++|+++..
T Consensus 91 ~l~Vv~~~~~~~Gvvs~~ 108 (113)
T cd04587 91 HLPVVDKSGQVVGLLDVT 108 (113)
T ss_pred cccEECCCCCEEEEEEHH
Confidence 348998889999999853
No 77
>COG3284 AcoR Transcriptional activator of acetoin/glycerol metabolism [Secondary metabolites biosynthesis, transport, and catabolism / Transcription]
Probab=34.20 E-value=22 Score=32.05 Aligned_cols=23 Identities=22% Similarity=0.559 Sum_probs=19.3
Q ss_pred CCCccceEEcCCCcEEEEEeeec
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMKV 129 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~~ 129 (174)
--+++.|++|.+|+|+|+..-..
T Consensus 158 lsCsAaPI~D~qG~L~gVLDISs 180 (606)
T COG3284 158 LSCSAAPIFDEQGELVGVLDISS 180 (606)
T ss_pred ceeeeeccccCCCcEEEEEEecc
Confidence 34789999999999999987653
No 78
>PF15436 PGBA_N: Plasminogen-binding protein pgbA N-terminal
Probab=33.48 E-value=1.9e+02 Score=22.72 Aligned_cols=52 Identities=15% Similarity=0.052 Sum_probs=34.1
Q ss_pred EEEEe-ecCcEEEEeeeEeeCCCcEEEEEEcC---CCCCceeecCCCCCCCCCEEEE
Q psy2771 12 CLSTF-SFNSLLTLPNIAYYFEKHIILFHCLQ---NNYPALKLGKAADIRNGEFVIA 64 (174)
Q Consensus 12 ~~~~~-~~~~~~~a~~v~~d~~~DlAllkv~~---~~~~~~~l~~~~~~~~G~~v~~ 64 (174)
.+..+ .+-+.+.|+.+....+...|.+|+.. .....++... -.++.||.|+.
T Consensus 33 V~h~~~~~~~~IiA~a~V~~~~~g~A~~kf~~fd~L~Q~aLP~p~-~~pk~GD~vil 88 (218)
T PF15436_consen 33 VVHKFDKDHSSIIARAVVISKKNGVAKAKFSVFDSLKQDALPTPK-MVPKKGDEVIL 88 (218)
T ss_pred EEEEecCCcceeeeEEEEEEecCCeeEEEEeehhhhhhhcCCCCc-cccCCCCEEEE
Confidence 34455 56677778877777789999999863 2222333322 24799999876
No 79
>cd04624 CBS_pair_11 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=32.90 E-value=45 Score=21.68 Aligned_cols=22 Identities=23% Similarity=0.365 Sum_probs=17.4
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+..|++|.+|+++|+++..
T Consensus 86 ~~~~~~~Vv~~~g~~~Gilt~~ 107 (112)
T cd04624 86 NNIRHHLVVDKGGELVGVISIR 107 (112)
T ss_pred cCccEEEEEcCCCcEEEEEEHH
Confidence 3455678999889999999864
No 80
>cd04611 CBS_pair_PAS_GGDEF_DUF1_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with a PAS domain, a GGDEF (DiGuanylate-Cyclase (DGC)) domain, and a DUF1 domain downstream. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CB
Probab=32.24 E-value=43 Score=21.61 Aligned_cols=22 Identities=32% Similarity=0.403 Sum_probs=17.0
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+..--|++|.+|+++|+++..
T Consensus 85 ~~~~~~~Vv~~~~~~~Gvi~~~ 106 (111)
T cd04611 85 HGIRHLVVVDDDGELLGLLSQT 106 (111)
T ss_pred cCCeEEEEECCCCcEEEEEEhH
Confidence 3445578999889999999853
No 81
>PF06003 SMN: Survival motor neuron protein (SMN); InterPro: IPR010304 This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs.; GO: 0003723 RNA binding, 0006397 mRNA processing, 0005634 nucleus, 0005737 cytoplasm; PDB: 1MHN_A 4A4G_A 3S6N_M 4A4E_A 1G5V_A 4A4H_A 4A4F_A 2D9T_A.
Probab=32.21 E-value=1.3e+02 Score=24.16 Aligned_cols=34 Identities=12% Similarity=0.055 Sum_probs=28.3
Q ss_pred eeEEEEEee-cCcEEEEeeeEeeCCCcEEEEEEcC
Q psy2771 9 QDICLSTFS-FNSLLTLPNIAYYFEKHIILFHCLQ 42 (174)
Q Consensus 9 ~~~~~~~~~-~~~~~~a~~v~~d~~~DlAllkv~~ 42 (174)
-|.|.-.++ ||..|+|++...+.+.+-|++++..
T Consensus 72 Gd~C~A~~s~Dg~~Y~A~I~~i~~~~~~~~V~f~g 106 (264)
T PF06003_consen 72 GDKCMAVYSEDGQYYPATIESIDEEDGTCVVVFTG 106 (264)
T ss_dssp T-EEEEE-TTTSSEEEEEEEEEETTTTEEEEEETT
T ss_pred CCEEEEEECCCCCEEEEEEEEEcCCCCEEEEEEcc
Confidence 357888886 8999999999999999999999974
No 82
>PF00741 Gas_vesicle: Gas vesicle protein; InterPro: IPR000638 Gas vesicles are small, hollow, gas filled protein structures found in several cyanobacterial and archaebacterial microorganisms. They allow the positioning of the bacteria at the favourable depth for growth. Gas vesicles are hollow cylindrical tubes, closed by a hollow, conical cap at each end. Both the conical end caps and central cylinder are made up of 4-5 nm wide ribs that run at right angles to the long axis of the structure. Gas vesicles seem to be constituted of two different protein components, GVPa and GVPc. GVPa, a small protein of about 70 amino acid residues, is the main constituent of gas vesicles and forms the essential core of the structure. The sequence of GVPa is extremely well conserved. GvpJ and gvpM, two proteins encoded in the cluster of genes required for gas vesicle synthesis in the archaebacteria Halobacterium salinarium and Halobacterium mediterranei (Haloferax mediterranei), have been found to be evolutionarily related to GVPa. The exact function of these two proteins is not known, although they could be important for determining the shape of gas vesicles. The N-terminal domain of Aphanizomenon flos-aquae protein gvpA/J is also related to GVPa.; GO: 0005198 structural molecule activity, 0012506 vesicle membrane
Probab=32.14 E-value=98 Score=17.36 Aligned_cols=30 Identities=20% Similarity=0.119 Sum_probs=25.8
Q ss_pred HHHHHHHHHhCCeeeeeecCceeeeeeeee
Q psy2771 142 AIEFLTNYKRKGKFCAYSKGKSDLRTEVLY 171 (174)
Q Consensus 142 i~~~l~~l~~~g~~~~~~lg~~~~~~e~~~ 171 (174)
+.++++++..+|-+...++-++....|+..
T Consensus 2 L~d~LdriLdkGvVi~gdi~isva~veLl~ 31 (39)
T PF00741_consen 2 LVDLLDRILDKGVVIDGDIRISVAGVELLT 31 (39)
T ss_pred HHHHHHHHcCCceEEEEEEEEEEcceEEEE
Confidence 568899999999999999988888877764
No 83
>cd04635 CBS_pair_22 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=32.12 E-value=50 Score=21.82 Aligned_cols=21 Identities=24% Similarity=0.325 Sum_probs=17.6
Q ss_pred CCCccceEEcCCCcEEEEEee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~ 127 (174)
.+.+.-|+++.+|+++|+++.
T Consensus 96 ~~~~~~~Vvd~~g~~~Gvit~ 116 (122)
T cd04635 96 HDIGRLPVVNEKDQLVGIVDR 116 (122)
T ss_pred cCCCeeeEEcCCCcEEEEEEh
Confidence 566677999988999999985
No 84
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=31.54 E-value=70 Score=18.60 Aligned_cols=16 Identities=19% Similarity=0.204 Sum_probs=13.6
Q ss_pred eEeeCCCcEEEEEEcC
Q psy2771 27 IAYYFEKHIILFHCLQ 42 (174)
Q Consensus 27 v~~d~~~DlAllkv~~ 42 (174)
+.||++.|.+.|++..
T Consensus 3 i~YD~~~D~lyi~l~~ 18 (50)
T PF10049_consen 3 IEYDPEADALYIRLSD 18 (50)
T ss_pred eEEcCcCCEEEEEECC
Confidence 6799999999999953
No 85
>PF09012 FeoC: FeoC like transcriptional regulator; InterPro: IPR015102 This entry contains several transcriptional regulators, including FeoC, which contain an HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor.; PDB: 1XN7_A 2K02_A.
Probab=30.29 E-value=58 Score=20.11 Aligned_cols=26 Identities=23% Similarity=0.280 Sum_probs=20.1
Q ss_pred EEEEeHHHHHHHHHHHHhCCeeeeee
Q psy2771 134 SFAIPIDYAIEFLTNYKRKGKFCAYS 159 (174)
Q Consensus 134 ~~aiPi~~i~~~l~~l~~~g~~~~~~ 159 (174)
.|-++.+.+...++.|.+.|++.+-.
T Consensus 23 ~~~~s~~~ve~mL~~l~~kG~I~~~~ 48 (69)
T PF09012_consen 23 EFGISPEAVEAMLEQLIRKGYIRKVD 48 (69)
T ss_dssp HTT--HHHHHHHHHHHHCCTSCEEEE
T ss_pred HHCcCHHHHHHHHHHHHHCCcEEEec
Confidence 35588899999999999999987543
No 86
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.24 E-value=1.2e+02 Score=19.34 Aligned_cols=29 Identities=14% Similarity=0.123 Sum_probs=23.6
Q ss_pred EEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 13 LSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 13 ~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
...+.+|+.+.....++|....+.|=...
T Consensus 14 ~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 14 RVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred EEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 34459999999999999999988865553
No 87
>cd00218 GlcAT-I Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately. The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43).
Probab=30.12 E-value=63 Score=25.44 Aligned_cols=32 Identities=22% Similarity=0.369 Sum_probs=24.0
Q ss_pred cceEEcCCCcEEEEEeeecC------CCeEEEEeHHHHH
Q psy2771 111 GGPLVNLDGEVIGINSMKVT------AGISFAIPIDYAI 143 (174)
Q Consensus 111 GGPl~n~~G~liGI~~~~~~------~~~~~aiPi~~i~ 143 (174)
-||+++ +|+|+|-.+.-.. +-.+||+.+..+.
T Consensus 136 egP~c~-~gkV~gw~~~w~~~R~f~idmAGFA~n~~ll~ 173 (223)
T cd00218 136 EGPVCE-NGKVVGWHTAWKPERPFPIDMAGFAFNSKLLW 173 (223)
T ss_pred eccEee-CCeEeEEecCCCCCCCCcceeeeEEEehhhhc
Confidence 479997 8999999996543 2358888877664
No 88
>cd05701 S1_Rrp5_repeat_hs10 S1_Rrp5_repeat_hs10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 10 (hs10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=29.88 E-value=1.4e+02 Score=18.75 Aligned_cols=14 Identities=7% Similarity=0.181 Sum_probs=8.9
Q ss_pred CCCCCCCCCEEEEE
Q psy2771 52 KAADIRNGEFVIAM 65 (174)
Q Consensus 52 ~~~~~~~G~~v~~~ 65 (174)
+++.+++|+.+.+.
T Consensus 42 ~seklkvG~~l~v~ 55 (69)
T cd05701 42 DSEKLSVGQCLDVT 55 (69)
T ss_pred cceeeeccceEEEE
Confidence 45567777776664
No 89
>cd04605 CBS_pair_MET2_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the MET2 domain. Met2 is a key enzyme in the biosynthesis of methionine. It encodes a homoserine transacetylase involved in converting homoserine to O-acetyl homoserine. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=29.86 E-value=53 Score=21.23 Aligned_cols=22 Identities=32% Similarity=0.450 Sum_probs=17.3
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+..--|+++.+|+++|+++..
T Consensus 84 ~~~~~~~Vv~~~~~~~G~v~~~ 105 (110)
T cd04605 84 HNISALPVVDAENRVIGIITSE 105 (110)
T ss_pred hCCCEEeEECCCCcEEEEEEHH
Confidence 4445678999899999999863
No 90
>cd04621 CBS_pair_8 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=29.71 E-value=58 Score=22.54 Aligned_cols=20 Identities=20% Similarity=0.348 Sum_probs=17.3
Q ss_pred CCccceEEcCCCcEEEEEee
Q psy2771 108 GNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~ 127 (174)
+.+.-||+|.+|+++|+++.
T Consensus 23 ~~~~l~V~d~~~~~~Giv~~ 42 (135)
T cd04621 23 GVGRVIVVDDNGKPVGVITY 42 (135)
T ss_pred CCCcceEECCCCCEEEEEeH
Confidence 56778999999999999984
No 91
>TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. An additional proposed function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. PubMed:12441107.
Probab=29.65 E-value=1.3e+02 Score=19.45 Aligned_cols=44 Identities=14% Similarity=0.218 Sum_probs=26.8
Q ss_pred EEEeeeEeeCCCcEEEEEEcCCCCCceeecCCCCCCCCCEEEE-EecC
Q psy2771 22 LTLPNIAYYFEKHIILFHCLQNNYPALKLGKAADIRNGEFVIA-MGSP 68 (174)
Q Consensus 22 ~~a~~v~~d~~~DlAllkv~~~~~~~~~l~~~~~~~~G~~v~~-~G~p 68 (174)
++++++..+. +.|++..... ...+.+.=..++++||.|++ .||.
T Consensus 5 iP~~V~~i~~--~~A~v~~~G~-~~~v~l~lv~~~~vGD~VLVH~G~A 49 (76)
T TIGR00074 5 IPGQVVEIDE--NIALVEFCGI-KRDVSLDLVGEVKVGDYVLVHVGFA 49 (76)
T ss_pred cceEEEEEcC--CEEEEEcCCe-EEEEEEEeeCCCCCCCEEEEecChh
Confidence 4667777665 4688877632 12233322245799999998 5654
No 92
>cd04604 CBS_pair_KpsF_GutQ_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with KpsF/GutQ domains in the API [A5P (D-arabinose 5-phosphate) isomerase] protein. These APIs catalyze the conversion of the pentose pathway intermediate D-ribulose 5-phosphate into A5P, a precursor of 3-deoxy-D-manno-octulosonate, which is an integral carbohydrate component of various glycolipids coating the surface of the outer membrane of Gram-negative bacteria, including lipopolysaccharide and many group 2 K-antigen capsules. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other funct
Probab=29.49 E-value=61 Score=20.98 Aligned_cols=21 Identities=19% Similarity=0.336 Sum_probs=17.3
Q ss_pred CCCccceEEcCCCcEEEEEee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~ 127 (174)
.+.+.-|++|.+|+++|+++.
T Consensus 88 ~~~~~~~Vv~~~~~~iG~it~ 108 (114)
T cd04604 88 NKITALPVVDDNGRPVGVLHI 108 (114)
T ss_pred cCCCEEEEECCCCCEEEEEEH
Confidence 455677999888999999875
No 93
>cd04585 CBS_pair_ACT_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the acetoin utilization proteins in bacteria. Acetoin is a product of fermentative metabolism in many prokaryotic and eukaryotic microorganisms. They produce acetoin as an external carbon storage compound and then later reuse it as a carbon and energy source during their stationary phase and sporulation. In addition these CBS domains are associated with a downstream ACT domain, which is linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The i
Probab=29.09 E-value=63 Score=21.13 Aligned_cols=22 Identities=32% Similarity=0.433 Sum_probs=18.0
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+.-|++|.+|+++|+++..
T Consensus 96 ~~~~~~~Vv~~~~~~~Gvvt~~ 117 (122)
T cd04585 96 RKISGLPVVDDQGRLVGIITES 117 (122)
T ss_pred cCCCceeEECCCCcEEEEEEHH
Confidence 5667789998889999999853
No 94
>cd04632 CBS_pair_19 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=29.06 E-value=62 Score=21.73 Aligned_cols=21 Identities=33% Similarity=0.491 Sum_probs=17.6
Q ss_pred CCCccceEEcCCCcEEEEEee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~ 127 (174)
.+.++-|++|.+|+++|+++.
T Consensus 22 ~~~~~~~Vv~~~~~~~G~it~ 42 (128)
T cd04632 22 HGISRLPVVDDNGKLTGIVTR 42 (128)
T ss_pred cCCCEEEEECCCCcEEEEEEH
Confidence 456678999999999999993
No 95
>cd04803 CBS_pair_15 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=29.02 E-value=55 Score=21.63 Aligned_cols=22 Identities=23% Similarity=0.268 Sum_probs=16.7
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...=|++|.+|+++|+++..
T Consensus 96 ~~~~~~~Vv~~~~~~~Gvit~~ 117 (122)
T cd04803 96 NKIGCLPVVDDKGTLVGIITRS 117 (122)
T ss_pred cCCCeEEEEcCCCCEEEEEEHH
Confidence 3445568888889999999853
No 96
>cd04588 CBS_pair_CAP-ED_DUF294_assoc_arch This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the archaeal CAP_ED (cAMP receptor protein effector domain) family of transcription factors and the DUF294 domain. Members of CAP_ED include CAP, which binds cAMP, FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme-containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site.
Probab=28.99 E-value=53 Score=21.22 Aligned_cols=21 Identities=14% Similarity=0.150 Sum_probs=16.6
Q ss_pred CCccceEEcCCCcEEEEEeee
Q psy2771 108 GNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~~ 128 (174)
+...-|++|.+|+++|+++..
T Consensus 85 ~~~~~~V~~~~~~~~G~i~~~ 105 (110)
T cd04588 85 NVGRLIVTDDEGRPVGIITRT 105 (110)
T ss_pred CCCEEEEECCCCCEEEEEEhH
Confidence 445678888889999999864
No 97
>cd04598 CBS_pair_GGDEF_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=28.79 E-value=47 Score=21.84 Aligned_cols=18 Identities=33% Similarity=0.658 Sum_probs=14.7
Q ss_pred cceEEcCCCcEEEEEeee
Q psy2771 111 GGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 111 GGPl~n~~G~liGI~~~~ 128 (174)
..++++.+|+++|+++..
T Consensus 97 ~~~vv~~~~~~~Gvvs~~ 114 (119)
T cd04598 97 DGFIVTEEGRYLGIGTVK 114 (119)
T ss_pred ccEEEeeCCeEEEEEEHH
Confidence 457888899999999853
No 98
>cd04623 CBS_pair_10 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=28.32 E-value=58 Score=21.00 Aligned_cols=20 Identities=25% Similarity=0.277 Sum_probs=16.5
Q ss_pred CCccceEEcCCCcEEEEEee
Q psy2771 108 GNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~ 127 (174)
+.+.=|++|.+++++|+++.
T Consensus 23 ~~~~~~V~~~~~~~~Giv~~ 42 (113)
T cd04623 23 NIGAVVVVDDGGRLVGIFSE 42 (113)
T ss_pred CCCeEEEECCCCCEEEEEeh
Confidence 45566899988999999994
No 99
>PF08275 Toprim_N: DNA primase catalytic core, N-terminal domain; InterPro: IPR013264 This is the N-terminal, catalytic core domain of DNA primases. DNA primase (2.7.7 from EC) is a nucleotidyltransferase which synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork. It can also prime the leading strand and has been implicated in cell division.; PDB: 1EQN_E 1DD9_A 3B39_B 1DDE_A 2AU3_A.
Probab=27.80 E-value=69 Score=22.59 Aligned_cols=17 Identities=24% Similarity=0.632 Sum_probs=13.3
Q ss_pred eEEcCCCcEEEEEeeec
Q psy2771 113 PLVNLDGEVIGINSMKV 129 (174)
Q Consensus 113 Pl~n~~G~liGI~~~~~ 129 (174)
|+.|.+|+|||......
T Consensus 82 PI~d~~G~vvgF~gR~l 98 (128)
T PF08275_consen 82 PIRDERGRVVGFGGRRL 98 (128)
T ss_dssp EEE-TTS-EEEEEEEES
T ss_pred EEEcCCCCEEEEecccC
Confidence 89999999999988766
No 100
>cd04608 CBS_pair_PALP_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain upstream. The vitamin B6 complex comprises pyridoxine, pyridoxal, and pyridoxamine, as well as the 5'-phosphate esters of pyridoxal (PALP) and pyridoxamine, the last two being the biologically active coenzyme derivatives. The members of the PALP family are principally involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and other amine-containing compounds. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a poten
Probab=27.63 E-value=59 Score=22.07 Aligned_cols=21 Identities=24% Similarity=0.558 Sum_probs=17.1
Q ss_pred CCccceEEcCCCcEEEEEeee
Q psy2771 108 GNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~~ 128 (174)
+.+.-|++|.+|+++|+++..
T Consensus 24 ~~~~~~Vvd~~~~~~Gii~~~ 44 (124)
T cd04608 24 GFDQLPVVDESGKILGMVTLG 44 (124)
T ss_pred CCCEEEEEcCCCCEEEEEEHH
Confidence 455778999999999999853
No 101
>cd04639 CBS_pair_26 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=27.59 E-value=66 Score=20.80 Aligned_cols=22 Identities=23% Similarity=0.522 Sum_probs=17.6
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|++|.+|+++|+++..
T Consensus 85 ~~~~~~~Vv~~~~~~~G~it~~ 106 (111)
T cd04639 85 GGAPAVPVVDGSGRLVGLVTLE 106 (111)
T ss_pred cCCceeeEEcCCCCEEEEEEHH
Confidence 4566779998889999999863
No 102
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.56 E-value=1.7e+02 Score=18.72 Aligned_cols=30 Identities=13% Similarity=-0.119 Sum_probs=24.1
Q ss_pred EEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
....+.||+.+.....++|...++.+=...
T Consensus 13 v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 13 MRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred EEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 344559999999999999998888765553
No 103
>PRK09371 gas vesicle synthesis protein GvpA; Provisional
Probab=27.16 E-value=1.2e+02 Score=19.11 Aligned_cols=33 Identities=18% Similarity=0.020 Sum_probs=28.6
Q ss_pred HHHHHHHHHHHHhCCeeeeeecCceeeeeeeee
Q psy2771 139 IDYAIEFLTNYKRKGKFCAYSKGKSDLRTEVLY 171 (174)
Q Consensus 139 i~~i~~~l~~l~~~g~~~~~~lg~~~~~~e~~~ 171 (174)
...+.++++++..+|-+...++-++....|++.
T Consensus 6 s~sLadvldriLDKGiVI~adi~VSl~gieLL~ 38 (68)
T PRK09371 6 SSSLAEVIDRILDKGIVVDAWVRVSLVGIELLA 38 (68)
T ss_pred cccHHHHHHHHccCCeEEEEEEEEEEeeeEEEE
Confidence 346789999999999999999999988888775
No 104
>cd02205 CBS_pair The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generali
Probab=27.09 E-value=63 Score=20.37 Aligned_cols=21 Identities=29% Similarity=0.507 Sum_probs=17.3
Q ss_pred CCCccceEEcCCCcEEEEEee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~ 127 (174)
.+.+--|++|.+|+++|+.+.
T Consensus 87 ~~~~~~~V~~~~~~~~G~i~~ 107 (113)
T cd02205 87 HGIRRLPVVDDEGRLVGIVTR 107 (113)
T ss_pred cCCCEEEEEcCCCcEEEEEEH
Confidence 455667899999999999875
No 105
>cd04593 CBS_pair_EriC_assoc_bac_arch This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC CIC-type chloride channels in bacteria and archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS d
Probab=26.66 E-value=59 Score=21.31 Aligned_cols=22 Identities=27% Similarity=0.449 Sum_probs=17.3
Q ss_pred CCCccceEEcCC--CcEEEEEeee
Q psy2771 107 FGNSGGPLVNLD--GEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~--G~liGI~~~~ 128 (174)
.+..--||+|.. |+++|+++..
T Consensus 87 ~~~~~~~Vvd~~~~~~~~Gvit~~ 110 (115)
T cd04593 87 RGLRQLPVVDRGNPGQVLGLLTRE 110 (115)
T ss_pred cCCceeeEEeCCCCCeEEEEEEhH
Confidence 555567999887 8999999863
No 106
>cd04595 CBS_pair_DHH_polyA_Pol_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with an upstream DHH domain which performs a phosphoesterase function and a downstream polyA polymerase domain. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=26.63 E-value=54 Score=21.23 Aligned_cols=20 Identities=30% Similarity=0.534 Sum_probs=15.0
Q ss_pred CCccceEEcCCCcEEEEEeee
Q psy2771 108 GNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~~ 128 (174)
+..-=|+++ +|+++|+++..
T Consensus 86 ~~~~~~V~~-~~~~~Gvvt~~ 105 (110)
T cd04595 86 DIGRVPVVE-DGRLVGIVTRT 105 (110)
T ss_pred CCCeeEEEe-CCEEEEEEEhH
Confidence 334458888 89999999864
No 107
>PF04085 MreC: rod shape-determining protein MreC; InterPro: IPR007221 MreC (murein formation C) is involved in the rod shape determination in Escherichia coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped.; GO: 0008360 regulation of cell shape; PDB: 2J5U_B 2QF4_B 2QF5_A.
Probab=26.53 E-value=2.5e+02 Score=20.27 Aligned_cols=57 Identities=18% Similarity=0.159 Sum_probs=33.9
Q ss_pred CCCcEEEEEEcCCC---CCceeecCCCCCCCCCEEEEEecCCCCCCceeecEEeeeccCc
Q psy2771 31 FEKHIILFHCLQNN---YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSS 87 (174)
Q Consensus 31 ~~~DlAllkv~~~~---~~~~~l~~~~~~~~G~~v~~~G~p~g~~~~~~~G~vs~~~~~~ 87 (174)
...+.++++=.... +.--.+....+++.||.|+..|...-+..-+.-|.|.......
T Consensus 66 ~~~~~Gi~~G~~~~~~~~~l~~i~~~~~i~~GD~V~TSG~~~~fP~Gi~VG~V~~v~~~~ 125 (152)
T PF04085_consen 66 RSGDRGILRGDGSNTGLLKLEYIPKDADIKKGDIVVTSGLGGIFPPGIPVGTVSSVEPDK 125 (152)
T ss_dssp CTTEEEEEEEEETTTTEEEEEEECTTS---TT-EEEEE-TTSSS-CCEEEEEEEEEECTT
T ss_pred cCCeeEEEEeCCCCCceEEEEECCCCCCCCCCCEEEECCCCCcCCCCCEEEEEEEEEeCC
Confidence 33456888766433 2223344566799999999998765566678899998887653
No 108
>COG0517 FOG: CBS domain [General function prediction only]
Probab=26.52 E-value=69 Score=20.84 Aligned_cols=21 Identities=29% Similarity=0.491 Sum_probs=18.2
Q ss_pred CCCccceEEcCCC-cEEEEEee
Q psy2771 107 FGNSGGPLVNLDG-EVIGINSM 127 (174)
Q Consensus 107 ~G~SGGPl~n~~G-~liGI~~~ 127 (174)
.+...-|++|.++ +++||++.
T Consensus 92 ~~~~~lpVv~~~~~~lvGivt~ 113 (117)
T COG0517 92 HKIRRLPVVDDDGGKLVGIITL 113 (117)
T ss_pred cCcCeEEEEECCCCeEEEEEEH
Confidence 5788899999986 99999985
No 109
>cd04637 CBS_pair_24 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=26.40 E-value=65 Score=21.29 Aligned_cols=22 Identities=36% Similarity=0.460 Sum_probs=17.5
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|++|.+|+++|+++..
T Consensus 96 ~~~~~~~vv~~~~~~~Gvit~~ 117 (122)
T cd04637 96 NSISCLPVVDENGQLIGIITWK 117 (122)
T ss_pred cCCCeEeEECCCCCEEEEEEHH
Confidence 4555679998889999999853
No 110
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.09 E-value=1.4e+02 Score=19.21 Aligned_cols=29 Identities=17% Similarity=-0.018 Sum_probs=23.1
Q ss_pred EEEEeecCcEEEEeeeEeeCCCcEEEEEE
Q psy2771 12 CLSTFSFNSLLTLPNIAYYFEKHIILFHC 40 (174)
Q Consensus 12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv 40 (174)
....+.+|+.+...+.++|....|.+=..
T Consensus 14 V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 14 VYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred EEEEECCCCEEEEEEEEEccceEEeccce
Confidence 33445999999999999999888876443
No 111
>COG0490 Putative regulatory, ligand-binding protein related to C-terminal domains of K+ channels [Inorganic ion transport and metabolism]
Probab=26.03 E-value=81 Score=23.56 Aligned_cols=14 Identities=14% Similarity=0.525 Sum_probs=11.6
Q ss_pred CCCCCCCEEEEEec
Q psy2771 54 ADIRNGEFVIAMGS 67 (174)
Q Consensus 54 ~~~~~G~~v~~~G~ 67 (174)
..++.||.++++|-
T Consensus 133 ~vle~gDtlvviG~ 146 (162)
T COG0490 133 TVLEAGDTLVVIGE 146 (162)
T ss_pred hhhcCCCEEEEEec
Confidence 45799999999983
No 112
>cd04599 CBS_pair_GGDEF_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=25.47 E-value=61 Score=20.67 Aligned_cols=21 Identities=14% Similarity=0.132 Sum_probs=16.1
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...=|++|. |+++|+++..
T Consensus 80 ~~~~~~~Vv~~-~~~~G~it~~ 100 (105)
T cd04599 80 KKIERLPVLRE-RKLVGIITKG 100 (105)
T ss_pred cCCCEeeEEEC-CEEEEEEEHH
Confidence 45556788876 9999999853
No 113
>KOG3888|consensus
Probab=25.21 E-value=99 Score=26.29 Aligned_cols=45 Identities=18% Similarity=0.257 Sum_probs=36.5
Q ss_pred CccceEE--cCCCcEEEEEeeecCCCeEEEEeHHHHHHHHHHHHhCC
Q psy2771 109 NSGGPLV--NLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYKRKG 153 (174)
Q Consensus 109 ~SGGPl~--n~~G~liGI~~~~~~~~~~~aiPi~~i~~~l~~l~~~g 153 (174)
..=.|++ |.+|+++-|+........-|-+|.+.++.+...++..-
T Consensus 292 i~r~~vI~ld~egr~~rIN~s~~~Rds~fdvp~e~v~~~y~a~~~F~ 338 (407)
T KOG3888|consen 292 IWRAPVICLDDEGRVVRINFSNPQRDSIFDVPVEQVQPWYRALKLFV 338 (407)
T ss_pred hhcCceEEEcccceEEEEecCCccccccccCCHHHHHHHHHHHHHHH
Confidence 3445777 77899999999887767789999999999998886543
No 114
>cd04625 CBS_pair_12 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.21 E-value=62 Score=20.98 Aligned_cols=21 Identities=19% Similarity=0.314 Sum_probs=15.8
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-|+++ +|+++|+++..
T Consensus 87 ~~~~~l~Vv~-~~~~~Gvvt~~ 107 (112)
T cd04625 87 RHLRYLPVLD-GGTLLGVISFH 107 (112)
T ss_pred cCCCeeeEEE-CCEEEEEEEHH
Confidence 4455578887 69999999853
No 115
>cd04631 CBS_pair_18 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=24.78 E-value=72 Score=21.13 Aligned_cols=22 Identities=32% Similarity=0.533 Sum_probs=17.2
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-||+|.+|+++|+++..
T Consensus 99 ~~~~~~~V~~~~~~~~Gvit~~ 120 (125)
T cd04631 99 KRVGGLPVVDDDGKLVGIVTER 120 (125)
T ss_pred cCCceEEEEcCCCcEEEEEEHH
Confidence 4555678888789999999853
No 116
>PF08448 PAS_4: PAS fold; InterPro: IPR013656 The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs []. The PAS fold appears in archaea, eubacteria and eukarya. ; PDB: 3K3D_A 3K3C_B 3KX0_X 3FC7_B 3LUQ_D 3MXQ_A 3BWL_C 3FG8_A.
Probab=24.46 E-value=80 Score=20.02 Aligned_cols=17 Identities=35% Similarity=0.741 Sum_probs=14.8
Q ss_pred ceEEcCCCcEEEEEeee
Q psy2771 112 GPLVNLDGEVIGINSMK 128 (174)
Q Consensus 112 GPl~n~~G~liGI~~~~ 128 (174)
.|+.|.+|++.|++...
T Consensus 86 ~Pi~~~~g~~~g~~~~~ 102 (110)
T PF08448_consen 86 SPIFDEDGEVVGVLVII 102 (110)
T ss_dssp EEEECTTTCEEEEEEEE
T ss_pred EEeEcCCCCEEEEEEEE
Confidence 69999999999998754
No 117
>cd04629 CBS_pair_16 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=23.82 E-value=69 Score=20.76 Aligned_cols=20 Identities=40% Similarity=0.699 Sum_probs=16.1
Q ss_pred CCccceEEcCCCcEEEEEee
Q psy2771 108 GNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~ 127 (174)
+.+.-|++|.+|+++|+++.
T Consensus 23 ~~~~~~V~~~~~~~~G~v~~ 42 (114)
T cd04629 23 KISGGPVVDDNGNLVGFLSE 42 (114)
T ss_pred CCCCccEECCCCeEEEEeeh
Confidence 34566899999999999983
No 118
>cd04594 CBS_pair_EriC_assoc_archaea This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the EriC CIC-type chloride channels in archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS do
Probab=23.51 E-value=71 Score=20.58 Aligned_cols=21 Identities=29% Similarity=0.458 Sum_probs=15.6
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+..--|+++ +|+++|+++..
T Consensus 79 ~~~~~~~Vv~-~~~~iGvit~~ 99 (104)
T cd04594 79 NKTRWCPVVD-DGKFKGIVTLD 99 (104)
T ss_pred cCcceEEEEE-CCEEEEEEEHH
Confidence 4444578887 69999998853
No 119
>cd04612 CBS_pair_SpoIVFB_EriC_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with either the SpoIVFB domain (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) or the chloride channel protein EriC. SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A ), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. EriC is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase an
Probab=23.36 E-value=88 Score=20.06 Aligned_cols=22 Identities=27% Similarity=0.327 Sum_probs=17.9
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+.-|++|.+|+++|+++..
T Consensus 85 ~~~~~~~V~~~~~~~~G~it~~ 106 (111)
T cd04612 85 RDIGRLPVVDDSGRLVGIVSRS 106 (111)
T ss_pred CCCCeeeEEcCCCCEEEEEEHH
Confidence 4566789998889999999864
No 120
>cd04630 CBS_pair_17 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=23.31 E-value=71 Score=20.88 Aligned_cols=21 Identities=33% Similarity=0.447 Sum_probs=15.7
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+..--|++|. |+++|+++..
T Consensus 89 ~~~~~~~Vvd~-~~~~Gvi~~~ 109 (114)
T cd04630 89 TNIRRAPVVEN-NELIGIISLT 109 (114)
T ss_pred cCCCEeeEeeC-CEEEEEEEHH
Confidence 34555688876 9999999853
No 121
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=23.27 E-value=1.3e+02 Score=18.87 Aligned_cols=27 Identities=7% Similarity=-0.066 Sum_probs=22.6
Q ss_pred EEeecCcEEEEeeeEeeCCCcEEEEEE
Q psy2771 14 STFSFNSLLTLPNIAYYFEKHIILFHC 40 (174)
Q Consensus 14 ~~~~~~~~~~a~~v~~d~~~DlAllkv 40 (174)
..+.+++.+.....++|....+.+=..
T Consensus 14 V~l~dgr~~~G~L~~~D~~~NlvL~~~ 40 (74)
T cd01727 14 VITVDGRVIVGTLKGFDQATNLILDDS 40 (74)
T ss_pred EEECCCcEEEEEEEEEccccCEEccce
Confidence 345999999999999999888777665
No 122
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=23.19 E-value=2e+02 Score=18.36 Aligned_cols=29 Identities=3% Similarity=-0.072 Sum_probs=23.1
Q ss_pred EEEEeecCcEEEEeeeEeeCCCcEEEEEE
Q psy2771 12 CLSTFSFNSLLTLPNIAYYFEKHIILFHC 40 (174)
Q Consensus 12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv 40 (174)
....+.+++.+...+.++|..-.+.+=..
T Consensus 16 V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 16 IWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred EEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 33445999999999999999888876544
No 123
>cd04586 CBS_pair_BON_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the BON (bacterial OsmY and nodulation domain) domain. BON is a putative phospholipid-binding domain found in a family of osmotic shock protection proteins. It is also found in some secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.16 E-value=80 Score=21.46 Aligned_cols=21 Identities=29% Similarity=0.406 Sum_probs=17.3
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+.+.-|++| +|+++||++..
T Consensus 110 ~~~~~l~Vvd-~g~~~Gvit~~ 130 (135)
T cd04586 110 HRIKRVPVVR-GGRLVGIVSRA 130 (135)
T ss_pred cCCCccCEec-CCEEEEEEEhH
Confidence 5666789999 89999999853
No 124
>cd04622 CBS_pair_9 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=22.84 E-value=85 Score=20.28 Aligned_cols=19 Identities=37% Similarity=0.602 Sum_probs=14.8
Q ss_pred ccceEEcCCCcEEEEEeee
Q psy2771 110 SGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 110 SGGPl~n~~G~liGI~~~~ 128 (174)
.--|+++.+|+++|+++..
T Consensus 90 ~~~~V~~~~~~~~G~it~~ 108 (113)
T cd04622 90 RRLPVVDDDGRLVGIVSLG 108 (113)
T ss_pred CeeeEECCCCcEEEEEEHH
Confidence 3448888889999998753
No 125
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.50 E-value=2.2e+02 Score=18.11 Aligned_cols=53 Identities=17% Similarity=0.171 Sum_probs=33.1
Q ss_pred EEEEeecCcEEEEeeeEeeCCCcEEEEEEcC-----CCCCceeecCCCCCCCCCEEEEEe
Q psy2771 12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCLQ-----NNYPALKLGKAADIRNGEFVIAMG 66 (174)
Q Consensus 12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~~-----~~~~~~~l~~~~~~~~G~~v~~~G 66 (174)
....+.+|+.+.....++|+...+.+=...+ .......++. -+-.|+.|..+|
T Consensus 15 v~V~l~~gr~~~G~L~~fD~~~NlvL~d~~E~~~~~~~~~~~~lG~--~viRG~~V~~ig 72 (74)
T cd01728 15 VVVLLRDGRKLIGILRSFDQFANLVLQDTVERIYVGDKYGDIPRGI--FIIRGENVVLLG 72 (74)
T ss_pred EEEEEcCCeEEEEEEEEECCcccEEecceEEEEecCCccceeEeeE--EEEECCEEEEEE
Confidence 3445599999999999999988877755421 1111222222 244577777665
No 126
>PRK11543 gutQ D-arabinose 5-phosphate isomerase; Provisional
Probab=22.47 E-value=73 Score=25.79 Aligned_cols=22 Identities=18% Similarity=0.410 Sum_probs=18.7
Q ss_pred CCCccceEEcCCCcEEEEEeee
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMK 128 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~ 128 (174)
.+...-||+|.+|+++|+++..
T Consensus 292 ~~~~~lpVvd~~~~lvGvIt~~ 313 (321)
T PRK11543 292 RKITAAPVVDENGKLTGAINLQ 313 (321)
T ss_pred cCCCEEEEEcCCCeEEEEEEHH
Confidence 6677789999899999999854
No 127
>cd04613 CBS_pair_SpoIVFB_EriC_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with either the SpoIVFB domain (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) or the chloride channel protein EriC. SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A ), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. EriC is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase a
Probab=22.41 E-value=87 Score=20.14 Aligned_cols=20 Identities=35% Similarity=0.607 Sum_probs=16.2
Q ss_pred CCccceEEcCCCcEEEEEee
Q psy2771 108 GNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~ 127 (174)
+.+.-|++|.+|+++|+++.
T Consensus 23 ~~~~~~v~~~~~~~~G~v~~ 42 (114)
T cd04613 23 PENNFPVVDDDGRLVGIVSL 42 (114)
T ss_pred CCcceeEECCCCCEEEEEEH
Confidence 34567899888999999994
No 128
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.33 E-value=2.2e+02 Score=18.79 Aligned_cols=30 Identities=17% Similarity=-0.013 Sum_probs=24.3
Q ss_pred EEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
....+.+++.+...+.++|....+.+=...
T Consensus 17 V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 17 VLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred EEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 344459999999999999999988876553
No 129
>cd04584 CBS_pair_ACT_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the acetoin utilization proteins in bacteria. Acetoin is a product of fermentative metabolism in many prokaryotic and eukaryotic microorganisms. They produce acetoin as an external carbon storage compound and then later reuse it as a carbon and energy source during their stationary phase and sporulation. In addition these CBS domains are associated with a downstream ACT domain, which is linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The in
Probab=22.05 E-value=82 Score=20.64 Aligned_cols=20 Identities=25% Similarity=0.398 Sum_probs=16.1
Q ss_pred CCccceEEcCCCcEEEEEee
Q psy2771 108 GNSGGPLVNLDGEVIGINSM 127 (174)
Q Consensus 108 G~SGGPl~n~~G~liGI~~~ 127 (174)
+.+.-|++|.+|+++|+++.
T Consensus 23 ~~~~~~V~d~~~~~~G~v~~ 42 (121)
T cd04584 23 KIRHLPVVDEEGRLVGIVTD 42 (121)
T ss_pred CCCcccEECCCCcEEEEEEH
Confidence 44456888999999999983
No 130
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=21.93 E-value=1.1e+02 Score=19.89 Aligned_cols=23 Identities=22% Similarity=0.345 Sum_probs=17.9
Q ss_pred CCCccceEEcCCCcEEEEEeeec
Q psy2771 107 FGNSGGPLVNLDGEVIGINSMKV 129 (174)
Q Consensus 107 ~G~SGGPl~n~~G~liGI~~~~~ 129 (174)
+=..|.|++..+|+.||.++...
T Consensus 32 ~~~~g~~v~~~~g~~vG~vTS~~ 54 (95)
T PF08669_consen 32 PPRGGEPVYDEDGKPVGRVTSGA 54 (95)
T ss_dssp --STTCEEEETTTEEEEEEEEEE
T ss_pred CCCCCCEEEECCCcEEeEEEEEe
Confidence 34567899977999999999764
No 131
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=21.61 E-value=2.1e+02 Score=17.55 Aligned_cols=29 Identities=10% Similarity=-0.035 Sum_probs=24.5
Q ss_pred EEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 13 LSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 13 ~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
...+.+|+.+...+.++|...++.+-...
T Consensus 14 ~V~l~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (68)
T cd01731 14 LVKLKGGKEVRGRLKSYDQHMNLVLEDAE 42 (68)
T ss_pred EEEECCCCEEEEEEEEECCcceEEEeeEE
Confidence 33459999999999999999998887764
No 132
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=20.91 E-value=2.2e+02 Score=17.54 Aligned_cols=31 Identities=3% Similarity=-0.114 Sum_probs=24.6
Q ss_pred EEEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 11 ICLSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 11 ~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
.....+.+|+.+..++.++|..-++.+=.+.
T Consensus 13 ~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 13 PVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred EEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 3444559999999999999998888875553
No 133
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=20.87 E-value=2.3e+02 Score=17.90 Aligned_cols=32 Identities=6% Similarity=-0.065 Sum_probs=25.6
Q ss_pred eEEEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 10 DICLSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 10 ~~~~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
......+.+|+.+..++.++|....+.+--+.
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 49 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVE 49 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceE
Confidence 34455569999999999999998888776654
No 134
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.79 E-value=2e+02 Score=16.95 Aligned_cols=29 Identities=10% Similarity=-0.048 Sum_probs=23.9
Q ss_pred EEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 13 LSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 13 ~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
...+.+|+.+...+.++|...++.+-...
T Consensus 10 ~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~ 38 (63)
T cd00600 10 RVELKDGRVLEGVLVAFDKYMNLVLDDVE 38 (63)
T ss_pred EEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence 34459999999999999998888876664
No 135
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=20.62 E-value=2.3e+02 Score=17.69 Aligned_cols=30 Identities=10% Similarity=0.003 Sum_probs=24.6
Q ss_pred EEEEeecCcEEEEeeeEeeCCCcEEEEEEc
Q psy2771 12 CLSTFSFNSLLTLPNIAYYFEKHIILFHCL 41 (174)
Q Consensus 12 ~~~~~~~~~~~~a~~v~~d~~~DlAllkv~ 41 (174)
....+.+|+.+...+.++|..-++.+=...
T Consensus 17 V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 17 VLVRLKGGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred EEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence 334459999999999999999888877764
Done!