Query 045527
Match_columns 258
No_of_seqs 244 out of 1139
Neff 7.8
Searched_HMMs 46136
Date Fri Mar 29 03:00:10 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/045527.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/045527hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF08284 RVP_2: Retroviral asp 100.0 3.6E-29 7.7E-34 199.3 13.4 119 130-250 14-133 (135)
2 cd05479 RP_DDI RP_DDI; retrope 99.9 3.9E-24 8.4E-29 168.1 12.7 111 135-246 14-124 (124)
3 cd05484 retropepsin_like_LTR_2 99.8 9.7E-19 2.1E-23 130.0 8.8 90 139-230 2-91 (91)
4 PF09668 Asp_protease: Asparty 99.7 1.5E-17 3.2E-22 129.7 10.1 103 135-238 22-124 (124)
5 cd05480 NRIP_C NRIP_C; putativ 99.6 1.6E-15 3.4E-20 112.2 8.5 99 141-240 2-102 (103)
6 TIGR02281 clan_AA_DTGA clan AA 99.6 1.8E-14 4E-19 112.6 11.5 109 134-246 8-119 (121)
7 PF13650 Asp_protease_2: Aspar 99.6 1.1E-14 2.4E-19 106.7 9.2 87 140-228 1-90 (90)
8 PF00077 RVP: Retroviral aspar 99.6 7.5E-15 1.6E-19 110.5 6.8 96 137-238 5-100 (100)
9 cd05483 retropepsin_like_bacte 99.5 1E-13 2.2E-18 102.7 9.8 92 137-230 2-96 (96)
10 PF12384 Peptidase_A2B: Ty3 tr 99.5 1.9E-13 4.2E-18 109.7 10.5 100 135-234 32-131 (177)
11 TIGR03698 clan_AA_DTGF clan AA 99.5 3.8E-13 8.3E-18 102.9 9.5 92 147-244 15-107 (107)
12 KOG0012 DNA damage inducible p 99.4 2.5E-13 5.5E-18 120.9 8.1 116 134-250 232-347 (380)
13 cd00303 retropepsin_like Retro 99.4 1.1E-12 2.4E-17 92.9 9.7 90 141-230 2-92 (92)
14 PF02160 Peptidase_A3: Caulifl 99.4 1.1E-12 2.5E-17 109.7 9.0 113 135-250 2-119 (201)
15 cd06095 RP_RTVL_H_like Retrope 99.4 1.4E-12 3E-17 95.9 8.4 85 141-230 2-86 (86)
16 PF13975 gag-asp_proteas: gag- 99.4 3.7E-12 8.1E-17 90.5 8.5 64 134-197 5-69 (72)
17 COG3577 Predicted aspartyl pro 99.1 7.5E-10 1.6E-14 92.3 10.1 102 134-237 102-206 (215)
18 cd05481 retropepsin_like_LTR_1 99.1 6E-10 1.3E-14 83.1 7.9 84 142-227 3-90 (93)
19 cd06094 RP_Saci_like RP_Saci_l 98.9 1.9E-09 4.1E-14 78.8 5.6 79 148-233 9-88 (89)
20 COG5550 Predicted aspartyl pro 98.6 5.6E-07 1.2E-11 69.4 10.3 94 146-245 24-118 (125)
21 PF05585 DUF1758: Putative pep 98.5 4.9E-07 1.1E-11 74.1 7.6 32 147-178 11-42 (164)
22 cd05482 HIV_retropepsin_like R 98.2 8.3E-06 1.8E-10 59.9 7.3 86 141-230 2-87 (87)
23 PF00098 zf-CCHC: Zinc knuckle 98.0 3.6E-06 7.8E-11 43.4 1.2 17 66-82 2-18 (18)
24 PF12382 Peptidase_A2E: Retrot 97.5 0.00028 6E-09 52.5 5.3 83 149-232 48-130 (137)
25 cd05476 pepsin_A_like_plant Ch 96.3 0.018 3.8E-07 50.5 8.1 86 149-249 177-263 (265)
26 PF13696 zf-CCHC_2: Zinc knuck 95.8 0.0035 7.5E-08 37.0 0.7 18 65-82 9-26 (32)
27 cd06096 Plasmepsin_5 Plasmepsi 95.7 0.029 6.3E-07 50.8 6.8 92 148-248 231-322 (326)
28 COG4067 Uncharacterized protei 95.6 0.02 4.4E-07 46.0 4.6 93 147-241 38-155 (162)
29 cd05477 gastricsin Gastricsins 94.9 0.23 5E-06 44.6 9.9 93 149-247 202-316 (318)
30 PF13917 zf-CCHC_3: Zinc knuck 94.9 0.012 2.6E-07 37.1 1.0 20 63-82 3-22 (42)
31 smart00343 ZnF_C2HC zinc finge 94.9 0.012 2.6E-07 32.9 0.9 17 66-82 1-17 (26)
32 cd06098 phytepsin Phytepsin, a 94.9 0.19 4.1E-06 45.3 9.2 95 149-247 211-316 (317)
33 PF00026 Asp: Eukaryotic aspar 94.8 0.041 8.9E-07 48.9 4.6 96 148-247 199-315 (317)
34 cd05485 Cathepsin_D_like Cathe 94.7 0.29 6.3E-06 44.3 10.1 93 149-247 211-328 (329)
35 PF05618 Zn_protease: Putative 94.6 0.03 6.5E-07 44.7 2.9 92 147-242 15-132 (138)
36 cd05478 pepsin_A Pepsin A, asp 94.5 0.31 6.7E-06 43.8 9.7 100 142-247 194-316 (317)
37 cd06097 Aspergillopepsin_like 94.4 0.052 1.1E-06 47.9 4.3 79 148-247 198-277 (278)
38 cd05471 pepsin_like Pepsin-lik 94.3 0.077 1.7E-06 46.2 5.1 79 147-247 201-282 (283)
39 cd05474 SAP_like SAPs, pepsin- 94.0 0.14 3.1E-06 45.1 6.2 96 147-247 177-293 (295)
40 PF14787 zf-CCHC_5: GAG-polypr 93.9 0.024 5.1E-07 34.2 0.7 20 64-83 2-21 (36)
41 cd05472 cnd41_like Chloroplast 93.8 0.6 1.3E-05 41.5 9.9 27 221-248 270-296 (299)
42 cd05486 Cathespin_E Cathepsin 93.5 0.28 6E-06 44.1 7.3 91 150-247 200-315 (316)
43 COG5082 AIR1 Arginine methyltr 92.7 0.049 1.1E-06 45.4 1.1 23 59-81 55-77 (190)
44 PTZ00147 plasmepsin-1; Provisi 92.6 1.1 2.4E-05 42.7 10.2 98 148-249 332-449 (453)
45 PTZ00368 universal minicircle 92.3 0.072 1.6E-06 42.8 1.5 19 64-82 103-121 (148)
46 cd05488 Proteinase_A_fungi Fun 91.9 0.44 9.5E-06 42.9 6.3 93 149-247 206-319 (320)
47 cd05490 Cathepsin_D2 Cathepsin 91.2 0.56 1.2E-05 42.2 6.2 93 149-247 207-324 (325)
48 cd05487 renin_like Renin stimu 90.9 1.2 2.7E-05 40.1 8.2 26 221-247 299-324 (326)
49 cd05475 nucellin_like Nucellin 90.4 0.77 1.7E-05 40.4 6.3 85 149-248 178-270 (273)
50 PTZ00013 plasmepsin 4 (PM4); P 90.0 0.83 1.8E-05 43.5 6.5 96 148-248 331-447 (450)
51 COG5082 AIR1 Arginine methyltr 90.0 0.14 2.9E-06 42.8 1.0 19 65-83 98-117 (190)
52 PLN03146 aspartyl protease fam 89.8 0.63 1.4E-05 44.0 5.5 27 221-248 399-425 (431)
53 cd05474 SAP_like SAPs, pepsin- 88.9 1.8 3.9E-05 38.1 7.5 73 139-228 4-80 (295)
54 PF00026 Asp: Eukaryotic aspar 88.5 0.73 1.6E-05 40.8 4.7 87 139-228 3-114 (317)
55 PF14392 zf-CCHC_4: Zinc knuck 88.0 0.2 4.3E-06 32.5 0.5 21 62-82 29-49 (49)
56 COG5222 Uncharacterized conser 87.5 0.27 5.8E-06 43.8 1.2 17 66-82 178-194 (427)
57 PTZ00165 aspartyl protease; Pr 87.1 3.7 8.1E-05 39.5 8.9 29 221-250 419-447 (482)
58 PTZ00147 plasmepsin-1; Provisi 87.0 1.9 4.1E-05 41.2 6.7 89 137-228 139-252 (453)
59 PTZ00013 plasmepsin 4 (PM4); P 86.5 2.1 4.6E-05 40.8 6.8 88 138-228 139-251 (450)
60 cd05473 beta_secretase_like Be 85.9 2.4 5.2E-05 38.9 6.7 26 222-248 319-344 (364)
61 PTZ00368 universal minicircle 85.4 0.39 8.5E-06 38.4 1.1 17 66-82 2-18 (148)
62 cd05476 pepsin_A_like_plant Ch 83.5 3.1 6.8E-05 36.2 6.1 74 140-228 4-88 (265)
63 KOG4400 E3 ubiquitin ligase in 82.4 0.57 1.2E-05 41.2 0.9 18 65-82 144-161 (261)
64 cd05489 xylanase_inhibitor_I_l 79.6 9.9 0.00021 35.0 8.1 25 222-247 335-359 (362)
65 PF15288 zf-CCHC_6: Zinc knuck 78.9 1.2 2.5E-05 27.7 1.2 19 65-83 2-22 (40)
66 PF03539 Spuma_A9PTase: Spumav 78.5 4 8.7E-05 32.8 4.4 80 144-233 1-84 (163)
67 cd05470 pepsin_retropepsin_lik 75.8 2.7 5.9E-05 31.1 2.7 26 141-166 2-29 (109)
68 cd05487 renin_like Renin stimu 75.8 13 0.00028 33.5 7.6 27 138-164 9-37 (326)
69 cd05477 gastricsin Gastricsins 72.8 12 0.00026 33.5 6.6 26 139-164 5-32 (318)
70 cd05472 cnd41_like Chloroplast 71.9 7.6 0.00016 34.4 5.1 77 140-228 4-89 (299)
71 cd05486 Cathespin_E Cathepsin 70.9 11 0.00023 33.8 5.9 24 141-164 4-29 (316)
72 cd05488 Proteinase_A_fungi Fun 69.7 13 0.00029 33.2 6.2 27 138-164 11-39 (320)
73 PF14541 TAXi_C: Xylanase inhi 64.9 25 0.00055 28.1 6.4 28 219-247 133-160 (161)
74 PTZ00165 aspartyl protease; Pr 64.0 30 0.00064 33.4 7.6 31 136-166 119-151 (482)
75 KOG0109 RNA-binding protein LA 60.1 4.4 9.5E-05 36.2 1.1 18 66-83 162-179 (346)
76 KOG0107 Alternative splicing f 57.4 5.9 0.00013 32.9 1.3 20 63-82 99-118 (195)
77 smart00647 IBR In Between Ring 57.0 5.5 0.00012 26.4 1.0 17 64-80 48-64 (64)
78 cd06097 Aspergillopepsin_like 56.1 13 0.00027 32.6 3.4 26 140-165 3-30 (278)
79 PF01485 IBR: IBR domain; Int 55.4 4.2 9E-05 27.0 0.1 17 64-80 48-64 (64)
80 PF12353 eIF3g: Eukaryotic tra 52.7 7 0.00015 30.7 1.1 19 64-83 106-124 (128)
81 KOG4400 E3 ubiquitin ligase in 49.6 7.1 0.00015 34.2 0.7 21 63-83 91-111 (261)
82 KOG2673 Uncharacterized conser 49.2 8.4 0.00018 36.5 1.1 22 62-83 126-147 (485)
83 KOG0341 DEAD-box protein abstr 47.3 8.4 0.00018 36.1 0.8 18 65-82 571-588 (610)
84 cd06098 phytepsin Phytepsin, a 45.4 28 0.00061 31.1 3.9 27 138-164 11-39 (317)
85 KOG0119 Splicing factor 1/bran 45.3 10 0.00022 36.3 1.0 19 65-83 286-304 (554)
86 PF05515 Viral_NABP: Viral nuc 42.7 13 0.00028 29.0 1.1 19 63-81 61-79 (124)
87 PF13821 DUF4187: Domain of un 41.9 12 0.00027 24.8 0.8 23 60-82 23-49 (55)
88 cd05475 nucellin_like Nucellin 41.5 32 0.00069 30.0 3.6 25 139-163 4-30 (273)
89 cd05490 Cathepsin_D2 Cathepsin 40.7 34 0.00075 30.5 3.8 26 138-163 7-34 (325)
90 COG0282 ackA Acetate kinase [E 37.7 19 0.00041 33.6 1.5 38 155-192 178-215 (396)
91 cd05478 pepsin_A Pepsin A, asp 37.1 44 0.00096 29.7 3.9 27 138-164 11-39 (317)
92 PF14543 TAXi_N: Xylanase inhi 36.6 46 0.001 26.8 3.6 24 140-163 3-28 (164)
93 cd05473 beta_secretase_like Be 36.5 40 0.00088 30.7 3.6 26 139-164 5-32 (364)
94 cd06096 Plasmepsin_5 Plasmepsi 36.5 43 0.00093 30.0 3.7 27 139-165 5-33 (326)
95 KOG1339 Aspartyl protease [Pos 35.8 93 0.002 28.9 5.9 27 221-248 364-391 (398)
96 cd05485 Cathepsin_D_like Cathe 35.7 46 0.00099 29.9 3.7 29 137-165 11-41 (329)
97 KOG2044 5'-3' exonuclease HKE1 34.2 17 0.00037 36.9 0.8 19 65-83 261-279 (931)
98 PF03419 Peptidase_U4: Sporula 33.9 73 0.0016 28.4 4.7 22 137-158 157-180 (293)
99 cd05471 pepsin_like Pepsin-lik 32.8 48 0.001 28.4 3.3 27 141-167 4-32 (283)
100 TIGR02854 spore_II_GA sigma-E 32.2 81 0.0017 28.2 4.7 23 136-158 157-181 (288)
101 PLN03146 aspartyl protease fam 25.2 79 0.0017 29.8 3.5 27 137-163 84-112 (431)
102 PF13717 zinc_ribbon_4: zinc-r 22.7 35 0.00077 20.4 0.4 25 51-75 12-36 (36)
103 COG2383 Uncharacterized conser 22.4 26 0.00056 26.3 -0.2 20 222-241 50-69 (109)
104 PF03991 Prion_octapep: Copper 21.5 50 0.0011 13.2 0.6 6 12-17 1-6 (8)
105 PF09538 FYDLN_acid: Protein o 21.5 38 0.00082 25.8 0.5 20 63-82 8-31 (108)
106 KOG2560 RNA splicing factor - 21.0 23 0.00051 33.6 -0.9 19 63-81 111-129 (529)
107 PF09706 Cas_CXXC_CXXC: CRISPR 21.0 45 0.00097 23.1 0.7 12 62-73 3-14 (69)
108 KOG3794 CBF1-interacting corep 20.8 44 0.00096 31.2 0.8 19 64-82 124-144 (453)
109 cd01813 UBP_N UBP ubiquitin pr 20.2 1.5E+02 0.0032 20.5 3.3 39 138-176 1-39 (74)
No 1
>PF08284 RVP_2: Retroviral aspartyl protease; InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.
Probab=99.96 E-value=3.6e-29 Score=199.29 Aligned_cols=119 Identities=38% Similarity=0.615 Sum_probs=107.6
Q ss_pred cCccCCCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCC-CceEeecccccccccceEeeeeEeeeceeE
Q 045527 130 VGLTSPKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTE-PYGVILRTGSATKAQGICRGVGLILQGVEI 208 (258)
Q Consensus 130 ~g~~~~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~-~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~ 208 (258)
+....+.+|...+.|++.++.+||||||||||||.++|.+++++..+++ +..|. ++|+.+.+...|..+++.++|+.|
T Consensus 14 ~~~~~~~vi~g~~~I~~~~~~vLiDSGAThsFIs~~~a~~~~l~~~~l~~~~~V~-~~g~~~~~~~~~~~~~~~i~g~~~ 92 (135)
T PF08284_consen 14 EAEESPDVITGTFLINSIPASVLIDSGATHSFISSSFAKKLGLPLEPLPRPIVVS-APGGSINCEGVCPDVPLSIQGHEF 92 (135)
T ss_pred cccCCCCeEEEEEEeccEEEEEEEecCCCcEEccHHHHHhcCCEEEEccCeeEEe-cccccccccceeeeEEEEECCeEE
Confidence 3445688999999999999999999999999999999999999999875 55555 567777888889999999999999
Q ss_pred EeeccccCCCCccEEecchHHHhcCCeEEEeeCCEEEEeeCC
Q 045527 209 VEDFLPLDLGITDIIMGIHWLKTLGATHINWKTHSMKFNTRN 250 (258)
Q Consensus 209 ~~~f~Vl~~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~~~~ 250 (258)
..+|.|+++.++|+|||||||.+|+| .|||.+++|+|....
T Consensus 93 ~~dl~vl~l~~~DvILGm~WL~~~~~-~IDw~~k~v~f~~p~ 133 (135)
T PF08284_consen 93 VVDLLVLDLGGYDVILGMDWLKKHNP-VIDWATKTVTFNSPS 133 (135)
T ss_pred EeeeEEecccceeeEeccchHHhCCC-EEEccCCEEEEeCCC
Confidence 99999999999999999999999999 999999999998653
No 2
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=99.91 E-value=3.9e-24 Score=168.12 Aligned_cols=111 Identities=20% Similarity=0.197 Sum_probs=97.6
Q ss_pred CCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccc
Q 045527 135 PKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLP 214 (258)
Q Consensus 135 ~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~V 214 (258)
...+++.+.|||+++.+||||||++||||+++|+++|++.....+..+..++++.....+.+..+++.+++..+..+|.|
T Consensus 14 ~~~~~v~~~Ing~~~~~LvDTGAs~s~Is~~~a~~lgl~~~~~~~~~~~~~g~g~~~~~g~~~~~~l~i~~~~~~~~~~V 93 (124)
T cd05479 14 VPMLYINVEINGVPVKAFVDSGAQMTIMSKACAEKCGLMRLIDKRFQGIAKGVGTQKILGRIHLAQVKIGNLFLPCSFTV 93 (124)
T ss_pred eeEEEEEEEECCEEEEEEEeCCCceEEeCHHHHHHcCCccccCcceEEEEecCCCcEEEeEEEEEEEEECCEEeeeEEEE
Confidence 35789999999999999999999999999999999999876544566666554446667788888999999999999999
Q ss_pred cCCCCccEEecchHHHhcCCeEEEeeCCEEEE
Q 045527 215 LDLGITDIIMGIHWLKTLGATHINWKTHSMKF 246 (258)
Q Consensus 215 l~~~~~dvILG~dwL~~~~~i~ID~~~~~v~f 246 (258)
+++..+|+|||||||.+++. .|||++++|++
T Consensus 94 l~~~~~d~ILG~d~L~~~~~-~ID~~~~~i~~ 124 (124)
T cd05479 94 LEDDDVDFLIGLDMLKRHQC-VIDLKENVLRI 124 (124)
T ss_pred ECCCCcCEEecHHHHHhCCe-EEECCCCEEEC
Confidence 99999999999999999995 99999999874
No 3
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=99.78 E-value=9.7e-19 Score=129.97 Aligned_cols=90 Identities=19% Similarity=0.186 Sum_probs=82.8
Q ss_pred EEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCC
Q 045527 139 KLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLG 218 (258)
Q Consensus 139 ~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~ 218 (258)
++.+.|||+++.+||||||++||||++.+.+++++........+..|+|+.+.+.+.+ .+.+.+++..+..+|+|++..
T Consensus 2 ~~~~~Ing~~i~~lvDTGA~~svis~~~~~~lg~~~~~~~~~~v~~a~G~~~~~~G~~-~~~v~~~~~~~~~~~~v~~~~ 80 (91)
T cd05484 2 TVTLLVNGKPLKFQLDTGSAITVISEKTWRKLGSPPLKPTKKRLRTATGTKLSVLGQI-LVTVKYGGKTKVLTLYVVKNE 80 (91)
T ss_pred EEEEEECCEEEEEEEcCCcceEEeCHHHHHHhCCCccccccEEEEecCCCEeeEeEEE-EEEEEECCEEEEEEEEEEECC
Confidence 5789999999999999999999999999999999874445688999999999999887 789999999999999999998
Q ss_pred CccEEecchHHH
Q 045527 219 ITDIIMGIHWLK 230 (258)
Q Consensus 219 ~~dvILG~dwL~ 230 (258)
+|.|||+|||.
T Consensus 81 -~~~lLG~~wl~ 91 (91)
T cd05484 81 -GLNLLGRDWLD 91 (91)
T ss_pred -CCCccChhhcC
Confidence 99999999984
No 4
>PF09668 Asp_protease: Aspartyl protease; InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This family of eukaryotic aspartyl proteases has a fold similar to retroviral proteases, which implies they function proteolytically during regulated protein turnover.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=99.74 E-value=1.5e-17 Score=129.66 Aligned_cols=103 Identities=20% Similarity=0.236 Sum_probs=79.0
Q ss_pred CCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccc
Q 045527 135 PKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLP 214 (258)
Q Consensus 135 ~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~V 214 (258)
...+++.++|||++++|+|||||.+|.||.++|+++|+...-...+.-...+-+.....|+++.+++.+++..+...|.|
T Consensus 22 v~mLyI~~~ing~~vkA~VDtGAQ~tims~~~a~r~gL~~lid~r~~g~a~GvG~~~i~G~Ih~~~l~ig~~~~~~s~~V 101 (124)
T PF09668_consen 22 VSMLYINCKINGVPVKAFVDTGAQSTIMSKSCAERCGLMRLIDKRFAGVAKGVGTQKILGRIHSVQLKIGGLFFPCSFTV 101 (124)
T ss_dssp ----EEEEEETTEEEEEEEETT-SS-EEEHHHHHHTTGGGGEEGGG-EE-------EEEEEEEEEEEEETTEEEEEEEEE
T ss_pred cceEEEEEEECCEEEEEEEeCCCCccccCHHHHHHcCChhhccccccccccCCCcCceeEEEEEEEEEECCEEEEEEEEE
Confidence 45789999999999999999999999999999999999754333333332222445677888899999999999999999
Q ss_pred cCCCCccEEecchHHHhcCCeEEE
Q 045527 215 LDLGITDIIMGIHWLKTLGATHIN 238 (258)
Q Consensus 215 l~~~~~dvILG~dwL~~~~~i~ID 238 (258)
++....|+|||.|||++|+. .||
T Consensus 102 le~~~~d~llGld~L~~~~c-~ID 124 (124)
T PF09668_consen 102 LEDQDVDLLLGLDMLKRHKC-CID 124 (124)
T ss_dssp ETTSSSSEEEEHHHHHHTT--EEE
T ss_pred eCCCCcceeeeHHHHHHhCc-ccC
Confidence 99889999999999999997 887
No 5
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=99.63 E-value=1.6e-15 Score=112.23 Aligned_cols=99 Identities=20% Similarity=0.179 Sum_probs=83.7
Q ss_pred EEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCC-Cc-eEeecccccccccceEeeeeEeeeceeEEeeccccCCC
Q 045527 141 ASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTE-PY-GVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLG 218 (258)
Q Consensus 141 ~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~-~~-~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~ 218 (258)
.+++||++++|+|||||.+|+||+.+|+++||...-.. .+ -+.-+-|+..+..|+++.+++.+++..+...|.|++..
T Consensus 2 nCk~nG~~vkAfVDsGaQ~timS~~caercgL~r~v~~~r~~g~A~gvgt~~kiiGrih~~~ikig~~~~~CSftVld~~ 81 (103)
T cd05480 2 SCQCAGKELRALVDTGCQYNLISAACLDRLGLKERVLKAKAEEEAPSLPTSVKVIGQIERLVLQLGQLTVECSAQVVDDN 81 (103)
T ss_pred ceeECCEEEEEEEecCCchhhcCHHHHHHcChHhhhhhccccccccCCCcceeEeeEEEEEEEEeCCEEeeEEEEEEcCC
Confidence 57899999999999999999999999999999743221 22 22334455567778999999999999999999999998
Q ss_pred CccEEecchHHHhcCCeEEEee
Q 045527 219 ITDIIMGIHWLKTLGATHINWK 240 (258)
Q Consensus 219 ~~dvILG~dwL~~~~~i~ID~~ 240 (258)
+.|++||.|-|++|+. .||.+
T Consensus 82 ~~d~llGLdmLkrhqc-~IdL~ 102 (103)
T cd05480 82 EKNFSLGLQTLKSLKC-VINLE 102 (103)
T ss_pred CcceEeeHHHHhhcce-eeecc
Confidence 9999999999999997 89975
No 6
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=99.59 E-value=1.8e-14 Score=112.63 Aligned_cols=109 Identities=17% Similarity=0.235 Sum_probs=85.7
Q ss_pred CCCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccC-CCceEeecccccccccceEeeeeEeeeceeEE-ee
Q 045527 134 SPKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNT-EPYGVILRTGSATKAQGICRGVGLILQGVEIV-ED 211 (258)
Q Consensus 134 ~~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~-~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~-~~ 211 (258)
....+++.+.|||+++.+||||||++++|++++|+++++..... .+..+..++|..... ...--.+.+++..+. +.
T Consensus 8 ~~g~~~v~~~InG~~~~flVDTGAs~t~is~~~A~~Lgl~~~~~~~~~~~~ta~G~~~~~--~~~l~~l~iG~~~~~nv~ 85 (121)
T TIGR02281 8 GDGHFYATGRVNGRNVRFLVDTGATSVALNEEDAQRLGLDLNRLGYTVTVSTANGQIKAA--RVTLDRVAIGGIVVNDVD 85 (121)
T ss_pred CCCeEEEEEEECCEEEEEEEECCCCcEEcCHHHHHHcCCCcccCCceEEEEeCCCcEEEE--EEEeCEEEECCEEEeCcE
Confidence 35678999999999999999999999999999999999986553 356778889874322 222236889998876 88
Q ss_pred ccccCCCC-ccEEecchHHHhcCCeEEEeeCCEEEE
Q 045527 212 FLPLDLGI-TDIIMGIHWLKTLGATHINWKTHSMKF 246 (258)
Q Consensus 212 f~Vl~~~~-~dvILG~dwL~~~~~i~ID~~~~~v~f 246 (258)
+.|++... .+.|||||||.++.. +...++.|.+
T Consensus 86 ~~v~~~~~~~~~LLGm~fL~~~~~--~~~~~~~l~l 119 (121)
T TIGR02281 86 AMVAEGGALSESLLGMSFLNRLSR--FTVRGGKLIL 119 (121)
T ss_pred EEEeCCCcCCceEcCHHHHhcccc--EEEECCEEEE
Confidence 89998763 589999999999975 4444555554
No 7
>PF13650 Asp_protease_2: Aspartyl protease
Probab=99.58 E-value=1.1e-14 Score=106.74 Aligned_cols=87 Identities=29% Similarity=0.363 Sum_probs=72.4
Q ss_pred EEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCC-CceEeecccccccccceEeeeeEeeeceeE-EeeccccC-
Q 045527 140 LASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTE-PYGVILRTGSATKAQGICRGVGLILQGVEI-VEDFLPLD- 216 (258)
Q Consensus 140 i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~-~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~-~~~f~Vl~- 216 (258)
|+++|||+++.+||||||+.++|++++|++++++..... ...+..++|......... -.+.+++..+ ..++.|++
T Consensus 1 V~v~vng~~~~~liDTGa~~~~i~~~~~~~l~~~~~~~~~~~~~~~~~g~~~~~~~~~--~~i~ig~~~~~~~~~~v~~~ 78 (90)
T PF13650_consen 1 VPVKVNGKPVRFLIDTGASISVISRSLAKKLGLKPRPKSVPISVSGAGGSVTVYRGRV--DSITIGGITLKNVPFLVVDL 78 (90)
T ss_pred CEEEECCEEEEEEEcCCCCcEEECHHHHHHcCCCCcCCceeEEEEeCCCCEEEEEEEE--EEEEECCEEEEeEEEEEECC
Confidence 468899999999999999999999999999999887664 577788888843333332 3788999887 68899999
Q ss_pred CCCccEEecchH
Q 045527 217 LGITDIIMGIHW 228 (258)
Q Consensus 217 ~~~~dvILG~dw 228 (258)
...+|+|||+||
T Consensus 79 ~~~~~~iLG~df 90 (90)
T PF13650_consen 79 GDPIDGILGMDF 90 (90)
T ss_pred CCCCEEEeCCcC
Confidence 668999999998
No 8
>PF00077 RVP: Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026; InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=99.56 E-value=7.5e-15 Score=110.52 Aligned_cols=96 Identities=30% Similarity=0.321 Sum_probs=80.2
Q ss_pred eEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccC
Q 045527 137 TLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLD 216 (258)
Q Consensus 137 ~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~ 216 (258)
..++.+.|||+++.|||||||+.|+|+.+.+...... ......+..++|.. ...+. ..+.+.+++..+...|+|++
T Consensus 5 rp~i~v~i~g~~i~~LlDTGA~vsiI~~~~~~~~~~~--~~~~~~v~~~~g~~-~~~~~-~~~~v~~~~~~~~~~~~v~~ 80 (100)
T PF00077_consen 5 RPYITVKINGKKIKALLDTGADVSIISEKDWKKLGPP--PKTSITVRGAGGSS-SILGS-TTVEVKIGGKEFNHTFLVVP 80 (100)
T ss_dssp SSEEEEEETTEEEEEEEETTBSSEEESSGGSSSTSSE--EEEEEEEEETTEEE-EEEEE-EEEEEEETTEEEEEEEEESS
T ss_pred CceEEEeECCEEEEEEEecCCCcceeccccccccccc--ccCCceeccCCCcc-eeeeE-EEEEEEEECccceEEEEecC
Confidence 3467889999999999999999999999998877654 33456677788877 66555 47899999999999999999
Q ss_pred CCCccEEecchHHHhcCCeEEE
Q 045527 217 LGITDIIMGIHWLKTLGATHIN 238 (258)
Q Consensus 217 ~~~~dvILG~dwL~~~~~i~ID 238 (258)
....| |||+|||.+++. .|+
T Consensus 81 ~~~~~-ILG~D~L~~~~~-~i~ 100 (100)
T PF00077_consen 81 DLPMN-ILGRDFLKKLNA-VIN 100 (100)
T ss_dssp TCSSE-EEEHHHHTTTTC-EEE
T ss_pred CCCCC-EeChhHHHHcCC-EEC
Confidence 87778 999999999996 664
No 9
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=99.52 E-value=1e-13 Score=102.67 Aligned_cols=92 Identities=26% Similarity=0.300 Sum_probs=76.8
Q ss_pred eEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEE-eecccc
Q 045527 137 TLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIV-EDFLPL 215 (258)
Q Consensus 137 ~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~-~~f~Vl 215 (258)
.+.+.+.||++++.+||||||+.++|+.++++++++.........+..++|........ .-.+.+++..+. ..+.|+
T Consensus 2 ~~~v~v~i~~~~~~~llDTGa~~s~i~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~--~~~i~ig~~~~~~~~~~v~ 79 (96)
T cd05483 2 HFVVPVTINGQPVRFLLDTGASTTVISEELAERLGLPLTLGGKVTVQTANGRVRAARVR--LDSLQIGGITLRNVPAVVL 79 (96)
T ss_pred cEEEEEEECCEEEEEEEECCCCcEEcCHHHHHHcCCCccCCCcEEEEecCCCccceEEE--cceEEECCcEEeccEEEEe
Confidence 46889999999999999999999999999999999843344567788888887665443 347889999876 689999
Q ss_pred CCCC--ccEEecchHHH
Q 045527 216 DLGI--TDIIMGIHWLK 230 (258)
Q Consensus 216 ~~~~--~dvILG~dwL~ 230 (258)
+... .|+|||+|||+
T Consensus 80 d~~~~~~~gIlG~d~l~ 96 (96)
T cd05483 80 PGDALGVDGLLGMDFLR 96 (96)
T ss_pred CCcccCCceEeChHHhC
Confidence 9887 99999999984
No 10
>PF12384 Peptidase_A2B: Ty3 transposon peptidase; InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=99.50 E-value=1.9e-13 Score=109.67 Aligned_cols=100 Identities=20% Similarity=0.218 Sum_probs=87.7
Q ss_pred CCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccc
Q 045527 135 PKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLP 214 (258)
Q Consensus 135 ~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~V 214 (258)
.++..+...++|.++.+|+||||-.|||+.+++++|+|+....++++++.+.+...........+++.+++..+.+.++|
T Consensus 32 g~T~~v~l~~~~t~i~vLfDSGSPTSfIr~di~~kL~L~~~~app~~fRG~vs~~~~~tsEAv~ld~~i~n~~i~i~aYV 111 (177)
T PF12384_consen 32 GKTAIVQLNCKGTPIKVLFDSGSPTSFIRSDIVEKLELPTHDAPPFRFRGFVSGESATTSEAVTLDFYIDNKLIDIAAYV 111 (177)
T ss_pred CcEEEEEEeecCcEEEEEEeCCCccceeehhhHHhhCCccccCCCEEEeeeccCCceEEEEeEEEEEEECCeEEEEEEEE
Confidence 46788899999999999999999999999999999999999999999988654433333344578999999999999999
Q ss_pred cCCCCccEEecchHHHhcCC
Q 045527 215 LDLGITDIIMGIHWLKTLGA 234 (258)
Q Consensus 215 l~~~~~dvILG~dwL~~~~~ 234 (258)
++..++|+|+|.+.|.+|..
T Consensus 112 ~d~m~~dlIIGnPiL~ryp~ 131 (177)
T PF12384_consen 112 TDNMDHDLIIGNPILDRYPT 131 (177)
T ss_pred eccCCcceEeccHHHhhhHH
Confidence 99999999999999999874
No 11
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=99.46 E-value=3.8e-13 Score=102.94 Aligned_cols=92 Identities=24% Similarity=0.356 Sum_probs=75.2
Q ss_pred EEEEEEEcCCCCccc-cCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCccEEec
Q 045527 147 KKVVVLTDSGASHNF-ISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGITDIIMG 225 (258)
Q Consensus 147 ~~v~aLIDSGAt~sf-Is~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvILG 225 (258)
.++.+||||||+..+ |+.++|+++|++... ...+..|||...... .....+.++|....+.+.+.+..+ +++||
T Consensus 15 ~~v~~LVDTGat~~~~l~~~~a~~lgl~~~~--~~~~~tA~G~~~~~~--v~~~~v~igg~~~~~~v~~~~~~~-~~LLG 89 (107)
T TIGR03698 15 MEVRALVDTGFSGFLLVPPDIVNKLGLPELD--QRRVYLADGREVLTD--VAKASIIINGLEIDAFVESLGYVD-EPLLG 89 (107)
T ss_pred eEEEEEEECCCCeEEecCHHHHHHcCCCccc--CcEEEecCCcEEEEE--EEEEEEEECCEEEEEEEEecCCCC-ccEec
Confidence 479999999999997 999999999997643 568999999755544 235678899998866666666555 89999
Q ss_pred chHHHhcCCeEEEeeCCEE
Q 045527 226 IHWLKTLGATHINWKTHSM 244 (258)
Q Consensus 226 ~dwL~~~~~i~ID~~~~~v 244 (258)
|.||.+++ +.|||+++.+
T Consensus 90 ~~~L~~l~-l~id~~~~~~ 107 (107)
T TIGR03698 90 TELLEGLG-IVIDYRNQGL 107 (107)
T ss_pred HHHHhhCC-EEEehhhCcC
Confidence 99999999 6999998753
No 12
>KOG0012 consensus DNA damage inducible protein [Replication, recombination and repair]
Probab=99.44 E-value=2.5e-13 Score=120.94 Aligned_cols=116 Identities=21% Similarity=0.235 Sum_probs=98.8
Q ss_pred CCCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeecc
Q 045527 134 SPKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFL 213 (258)
Q Consensus 134 ~~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~ 213 (258)
....++|.++|||++|+|+|||||..|+||..+|+++|+...-.+.+.=........++.|.++.+.+.+++..+...|.
T Consensus 232 ~v~ML~iN~~ing~~VKAfVDsGaq~timS~~Caer~gL~rlid~r~~g~a~gvg~~ki~g~Ih~~~lki~~~~l~c~ft 311 (380)
T KOG0012|consen 232 QVTMLYINCEINGVPVKAFVDSGAQTTIMSAACAERCGLNRLIDKRFQGEARGVGTEKILGRIHQAQLKIEDLYLPCSFT 311 (380)
T ss_pred cceEEEEEEEECCEEEEEEEcccchhhhhhHHHHHHhChHHHhhhhhhccccCCCcccccceeEEEEEEeccEeeccceE
Confidence 45678999999999999999999999999999999999976544433222222235677788899999999999999999
Q ss_pred ccCCCCccEEecchHHHhcCCeEEEeeCCEEEEeeCC
Q 045527 214 PLDLGITDIIMGIHWLKTLGATHINWKTHSMKFNTRN 250 (258)
Q Consensus 214 Vl~~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~~~~ 250 (258)
|++..+.|++||.|-|++|+. .||.+++.+.+...+
T Consensus 312 V~d~~~~d~llGLd~Lrr~~c-cIdL~~~~L~ig~~~ 347 (380)
T KOG0012|consen 312 VLDRRDMDLLLGLDMLRRHQC-CIDLKTNVLRIGNTE 347 (380)
T ss_pred EecCCCcchhhhHHHHHhccc-eeecccCeEEecCCC
Confidence 999999999999999999998 999999999886543
No 13
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=99.43 E-value=1.1e-12 Score=92.95 Aligned_cols=90 Identities=38% Similarity=0.485 Sum_probs=75.7
Q ss_pred EEEECCEEEEEEEcCCCCccccCHHHHHHcCC-CcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCC
Q 045527 141 ASEINNKKVVVLTDSGASHNFISNEVVLVLKL-PITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGI 219 (258)
Q Consensus 141 ~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l-~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~ 219 (258)
.+.+++.++.+|+|+||++++++..++.++++ ......+..+..++|......+.+..+.+.+++..+...|++++...
T Consensus 2 ~~~~~~~~~~~liDtgs~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~ 81 (92)
T cd00303 2 KGKINGVPVRALVDSGASVNFISESLAKKLGLPPRLLPTPLKVKGANGSSVKTLGVILPVTIGIGGKTFTVDFYVLDLLS 81 (92)
T ss_pred EEEECCEEEEEEEcCCCcccccCHHHHHHcCCCcccCCCceEEEecCCCEeccCcEEEEEEEEeCCEEEEEEEEEEcCCC
Confidence 46688999999999999999999999999988 43334566777788876666555557788899999999999999999
Q ss_pred ccEEecchHHH
Q 045527 220 TDIIMGIHWLK 230 (258)
Q Consensus 220 ~dvILG~dwL~ 230 (258)
+|+|||+|||.
T Consensus 82 ~~~ilG~~~l~ 92 (92)
T cd00303 82 YDVILGRPWLE 92 (92)
T ss_pred cCEEecccccC
Confidence 99999999984
No 14
>PF02160 Peptidase_A3: Cauliflower mosaic virus peptidase (A3); InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contains an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco).; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=99.40 E-value=1.1e-12 Score=109.68 Aligned_cols=113 Identities=14% Similarity=0.168 Sum_probs=90.2
Q ss_pred CCeEEEEE--EECC---EEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEE
Q 045527 135 PKTLKLAS--EINN---KKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIV 209 (258)
Q Consensus 135 ~~~i~i~~--~I~g---~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~ 209 (258)
|+.+++.+ .+.| ..+.++|||||+.++++..+.-...|.. ...+..|+.||++......++..+.+.++++.|.
T Consensus 2 pNsiyI~~~i~~~gy~~~~~~~~vDTGAt~C~~~~~iiP~e~we~-~~~~i~v~~an~~~~~i~~~~~~~~i~I~~~~F~ 80 (201)
T PF02160_consen 2 PNSIYIKVKISFPGYKKFNYHCYVDTGATICCASKKIIPEEYWEK-SKKPIKVKGANGSIIQINKKAKNGKIQIADKIFR 80 (201)
T ss_pred CccEEEEEEEEEcCceeEEEEEEEeCCCceEEecCCcCCHHHHHh-CCCcEEEEEecCCceEEEEEecCceEEEccEEEe
Confidence 34444444 4444 5788999999999999998775555532 1236789999999888889999999999999999
Q ss_pred eeccccCCCCccEEecchHHHhcCCeEEEeeCCEEEEeeCC
Q 045527 210 EDFLPLDLGITDIIMGIHWLKTLGATHINWKTHSMKFNTRN 250 (258)
Q Consensus 210 ~~f~Vl~~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~~~~ 250 (258)
+.+...-..+.|+|||++||+.++| .+.|.+ .+.|+.++
T Consensus 81 IP~iYq~~~g~d~IlG~NF~r~y~P-fiq~~~-~I~f~~~~ 119 (201)
T PF02160_consen 81 IPTIYQQESGIDIILGNNFLRLYEP-FIQTED-RIQFHKKG 119 (201)
T ss_pred ccEEEEecCCCCEEecchHHHhcCC-cEEEcc-EEEEEeCC
Confidence 8866554478999999999999999 799975 68898876
No 15
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where
Probab=99.40 E-value=1.4e-12 Score=95.88 Aligned_cols=85 Identities=19% Similarity=0.250 Sum_probs=66.0
Q ss_pred EEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCc
Q 045527 141 ASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGIT 220 (258)
Q Consensus 141 ~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~ 220 (258)
.+.|||+++.+||||||+.+.|+..+|+++. ....+..+..++|........... .+.+++++....+.+++.. .
T Consensus 2 ~v~InG~~~~fLvDTGA~~tii~~~~a~~~~---~~~~~~~v~gagG~~~~~v~~~~~-~v~vg~~~~~~~~~v~~~~-~ 76 (86)
T cd06095 2 TITVEGVPIVFLVDTGATHSVLKSDLGPKQE---LSTTSVLIRGVSGQSQQPVTTYRT-LVDLGGHTVSHSFLVVPNC-P 76 (86)
T ss_pred EEEECCEEEEEEEECCCCeEEECHHHhhhcc---CCCCcEEEEeCCCcccccEEEeee-EEEECCEEEEEEEEEEcCC-C
Confidence 5789999999999999999999999999972 223567888898886211111111 5889999998888888754 5
Q ss_pred cEEecchHHH
Q 045527 221 DIIMGIHWLK 230 (258)
Q Consensus 221 dvILG~dwL~ 230 (258)
+.|||||||.
T Consensus 77 ~~lLG~dfL~ 86 (86)
T cd06095 77 DPLLGRDLLS 86 (86)
T ss_pred CcEechhhcC
Confidence 9999999984
No 16
>PF13975 gag-asp_proteas: gag-polyprotein putative aspartyl protease
Probab=99.36 E-value=3.7e-12 Score=90.51 Aligned_cols=64 Identities=33% Similarity=0.554 Sum_probs=58.2
Q ss_pred CCCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCC-CceEeecccccccccceEe
Q 045527 134 SPKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTE-PYGVILRTGSATKAQGICR 197 (258)
Q Consensus 134 ~~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~-~~~V~~a~G~~~~~~~~~~ 197 (258)
.+..+++.+.|+++.+.+||||||++|||+.++|++|+++..... +..+++|||+.....+...
T Consensus 5 ~~g~~~v~~~I~g~~~~alvDtGat~~fis~~~a~rLgl~~~~~~~~~~v~~a~g~~~~~~g~~~ 69 (72)
T PF13975_consen 5 DPGLMYVPVSIGGVQVKALVDTGATHNFISESLAKRLGLPLEKPPSPIRVKLANGSVIEIRGVAE 69 (72)
T ss_pred cCCEEEEEEEECCEEEEEEEeCCCcceecCHHHHHHhCCCcccCCCCEEEEECCCCccccceEEE
Confidence 467899999999999999999999999999999999999998886 8999999999888776653
No 17
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=99.09 E-value=7.5e-10 Score=92.30 Aligned_cols=102 Identities=20% Similarity=0.288 Sum_probs=84.3
Q ss_pred CCCeEEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCC-CceEeecccccccccceEeeeeEeeeceeEE-ee
Q 045527 134 SPKTLKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTE-PYGVILRTGSATKAQGICRGVGLILQGVEIV-ED 211 (258)
Q Consensus 134 ~~~~i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~-~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~-~~ 211 (258)
....+...+.|||+.+.+|||||||.-.+++..|+++|+....+. ++.|..|||......-. --.+.|++.... ++
T Consensus 102 ~~GHF~a~~~VNGk~v~fLVDTGATsVal~~~dA~RlGid~~~l~y~~~v~TANG~~~AA~V~--Ld~v~IG~I~~~nV~ 179 (215)
T COG3577 102 RDGHFEANGRVNGKKVDFLVDTGATSVALNEEDARRLGIDLNSLDYTITVSTANGRARAAPVT--LDRVQIGGIRVKNVD 179 (215)
T ss_pred CCCcEEEEEEECCEEEEEEEecCcceeecCHHHHHHhCCCccccCCceEEEccCCccccceEE--eeeEEEccEEEcCch
Confidence 456899999999999999999999999999999999999988875 78899999986554322 226788988764 78
Q ss_pred ccccCCCC-ccEEecchHHHhcCCeEE
Q 045527 212 FLPLDLGI-TDIIMGIHWLKTLGATHI 237 (258)
Q Consensus 212 f~Vl~~~~-~dvILG~dwL~~~~~i~I 237 (258)
.+|++.+. ...+|||.||.+++...+
T Consensus 180 A~V~~~g~L~~sLLGMSfL~rL~~fq~ 206 (215)
T COG3577 180 AMVAEDGALDESLLGMSFLNRLSGFQV 206 (215)
T ss_pred hheecCCccchhhhhHHHHhhccceEe
Confidence 89997764 458999999999986333
No 18
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsins of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=99.07 E-value=6e-10 Score=83.07 Aligned_cols=84 Identities=18% Similarity=0.171 Sum_probs=72.2
Q ss_pred EEECC-EEEEEEEcCCCCccccCHHHHHHcC---CCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCC
Q 045527 142 SEINN-KKVVVLTDSGASHNFISNEVVLVLK---LPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDL 217 (258)
Q Consensus 142 ~~I~g-~~v~aLIDSGAt~sfIs~~~a~~l~---l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~ 217 (258)
..|++ .+++++|||||+.|+|+.+++++++ .+....++..+..+||+.+...|. ..+.+.+++..+..+|+|++.
T Consensus 3 ~~i~g~~~v~~~vDtGA~vnllp~~~~~~l~~~~~~~L~~t~~~L~~~~g~~~~~~G~-~~~~v~~~~~~~~~~f~Vvd~ 81 (93)
T cd05481 3 MKINGKQSVKFQLDTGATCNVLPLRWLKSLTPDKDPELRPSPVRLTAYGGSTIPVEGG-VKLKCRYRNPKYNLTFQVVKE 81 (93)
T ss_pred eEeCCceeEEEEEecCCEEEeccHHHHhhhccCCCCcCccCCeEEEeeCCCEeeeeEE-EEEEEEECCcEEEEEEEEECC
Confidence 56888 9999999999999999999999998 555455678889999999999988 478999999999999999996
Q ss_pred CCccEEecch
Q 045527 218 GITDIIMGIH 227 (258)
Q Consensus 218 ~~~dvILG~d 227 (258)
. ..-|||.+
T Consensus 82 ~-~~~lLG~~ 90 (93)
T cd05481 82 E-GPPLLGAK 90 (93)
T ss_pred C-CCceEccc
Confidence 5 35567765
No 19
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=98.93 E-value=1.9e-09 Score=78.77 Aligned_cols=79 Identities=20% Similarity=0.295 Sum_probs=65.7
Q ss_pred EEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeece-eEEeeccccCCCCccEEecc
Q 045527 148 KVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGV-EIVEDFLPLDLGITDIIMGI 226 (258)
Q Consensus 148 ~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~-~~~~~f~Vl~~~~~dvILG~ 226 (258)
.+.+||||||.+|+|-....++. ....++.+..|||+.+.+.|. ..+.+.++.. .|.-.|.|.+.. .-|||.
T Consensus 9 ~~~fLVDTGA~vSviP~~~~~~~----~~~~~~~l~AANgt~I~tyG~-~~l~ldlGlrr~~~w~FvvAdv~--~pIlGa 81 (89)
T cd06094 9 GLRFLVDTGAAVSVLPASSTKKS----LKPSPLTLQAANGTPIATYGT-RSLTLDLGLRRPFAWNFVVADVP--HPILGA 81 (89)
T ss_pred CcEEEEeCCCceEeecccccccc----ccCCceEEEeCCCCeEeeeee-EEEEEEcCCCcEEeEEEEEcCCC--cceecH
Confidence 47899999999999998887753 234567899999999999886 5788889874 788899998765 479999
Q ss_pred hHHHhcC
Q 045527 227 HWLKTLG 233 (258)
Q Consensus 227 dwL~~~~ 233 (258)
|||..|+
T Consensus 82 DfL~~~~ 88 (89)
T cd06094 82 DFLQHYG 88 (89)
T ss_pred HHHHHcC
Confidence 9999986
No 20
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.61 E-value=5.6e-07 Score=69.41 Aligned_cols=94 Identities=26% Similarity=0.298 Sum_probs=75.7
Q ss_pred CEEEEEEEcCCCC-ccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCccEEe
Q 045527 146 NKKVVVLTDSGAS-HNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGITDIIM 224 (258)
Q Consensus 146 g~~v~aLIDSGAt-~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvIL 224 (258)
+.-...|||||++ --.+++++|++++++... ...+..++|+.+.+. .....+.++|.+..+-..+.+....+ +|
T Consensus 24 d~~~~~LiDTGFtg~lvlp~~vaek~~~~~~~--~~~~~~a~~~~v~t~--V~~~~iki~g~e~~~~Vl~s~~~~~~-li 98 (125)
T COG5550 24 DFVYDELIDTGFTGYLVLPPQVAEKLGLPLFS--TIRIVLADGGVVKTS--VALATIKIDGVEKVAFVLASDNLPEP-LI 98 (125)
T ss_pred cEEeeeEEecCCceeEEeCHHHHHhcCCCccC--ChhhhhhcCCEEEEE--EEEEEEEECCEEEEEEEEccCCCccc-ch
Confidence 3444559999999 889999999999998544 345677888877663 34568899999888777778888788 99
Q ss_pred cchHHHhcCCeEEEeeCCEEE
Q 045527 225 GIHWLKTLGATHINWKTHSMK 245 (258)
Q Consensus 225 G~dwL~~~~~i~ID~~~~~v~ 245 (258)
|++||+.++ +.+|+.+..++
T Consensus 99 G~~~lk~l~-~~vn~~~g~LE 118 (125)
T COG5550 99 GVNLLKLLG-LVVNPKTGKLE 118 (125)
T ss_pred hhhhhhhcc-EEEcCCcceEe
Confidence 999999999 59999887765
No 21
>PF05585 DUF1758: Putative peptidase (DUF1758); InterPro: IPR008737 This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases.
Probab=98.48 E-value=4.9e-07 Score=74.12 Aligned_cols=32 Identities=31% Similarity=0.453 Sum_probs=28.4
Q ss_pred EEEEEEEcCCCCccccCHHHHHHcCCCcccCC
Q 045527 147 KKVVVLTDSGASHNFISNEVVLVLKLPITNTE 178 (258)
Q Consensus 147 ~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~ 178 (258)
..+.+|+||||..|||++++|++|+|+.....
T Consensus 11 ~~~~~LlDsGSq~SfIt~~la~~L~L~~~~~~ 42 (164)
T PF05585_consen 11 VEARALLDSGSQRSFITESLANKLNLPGTGEK 42 (164)
T ss_pred EEEEEEEecCCchhHHhHHHHHHhCCCCCCce
Confidence 46889999999999999999999999876554
No 22
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=98.18 E-value=8.3e-06 Score=59.91 Aligned_cols=86 Identities=22% Similarity=0.191 Sum_probs=58.0
Q ss_pred EEEECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCc
Q 045527 141 ASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGIT 220 (258)
Q Consensus 141 ~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~ 220 (258)
..+|+|+.+.+|+||||..++|+....... ++. ...+..+. +-|+.+..... ..+.+.+.+......+.|.+....
T Consensus 2 ~~~i~g~~~~~llDTGAd~Tvi~~~~~p~~-w~~-~~~~~~i~-GIGG~~~~~~~-~~v~i~i~~~~~~g~vlv~~~~~P 77 (87)
T cd05482 2 TLYINGKLFEGLLDTGADVSIIAENDWPKN-WPI-QPAPSNLT-GIGGAITPSQS-SVLLLEIDGEGHLGTILVYVLSLP 77 (87)
T ss_pred EEEECCEEEEEEEccCCCCeEEcccccCCC-Ccc-CCCCeEEE-eccceEEEEEE-eeEEEEEcCCeEEEEEEEccCCCc
Confidence 467899999999999999999998443321 111 11233344 44444554332 468899999888888888876333
Q ss_pred cEEecchHHH
Q 045527 221 DIIMGIHWLK 230 (258)
Q Consensus 221 dvILG~dwL~ 230 (258)
.-|||.|.|.
T Consensus 78 ~nllGRd~L~ 87 (87)
T cd05482 78 VNLWGRDILS 87 (87)
T ss_pred ccEEccccCC
Confidence 4589999873
No 23
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process.
It is also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=97.95 E-value=3.6e-06 Score=43.37 Aligned_cols=17 Identities=29% Similarity=0.716 Sum_probs=15.9
Q ss_pred cceecCCCccCccccCC
Q 045527 66 LCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 66 lCf~Cg~~gh~~~~Cp~ 82 (258)
.||+|++.||++.+||+
T Consensus 2 ~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 2 KCFNCGEPGHIARDCPK 18 (18)
T ss_dssp BCTTTSCSSSCGCTSSS
T ss_pred cCcCCCCcCcccccCcc
Confidence 59999999999999984
No 24
>PF12382 Peptidase_A2E: Retrotransposon peptidase; InterPro: IPR024648 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. This entry represents a small family of fungal retroviral aspartyl peptidases.
Probab=97.46 E-value=0.00028 Score=52.51 Aligned_cols=83 Identities=24% Similarity=0.232 Sum_probs=57.1
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCccEEecchH
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGITDIIMGIHW 228 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvILG~dw 228 (258)
+..|||+||..|+|.++.++...+|..+.... |..+.--+.........+.|.++|..+...|.|+..-.++.-+...-
T Consensus 48 ipclidtgaq~niiteetvrahklptrpw~~s-viyggvyp~kinrkt~kl~i~lngisikteflvvkkfshpaaisftt 126 (137)
T PF12382_consen 48 IPCLIDTGAQVNIITEETVRAHKLPTRPWSQS-VIYGGVYPNKINRKTIKLNINLNGISIKTEFLVVKKFSHPAAISFTT 126 (137)
T ss_pred ceeEEccCceeeeeehhhhhhccCCCCcchhh-eEeccccccccccceEEEEEEecceEEEEEEEEEEeccCcceEEEEE
Confidence 56899999999999999999999988776432 22221112223344456778899999999999987655555444444
Q ss_pred HHhc
Q 045527 229 LKTL 232 (258)
Q Consensus 229 L~~~ 232 (258)
|...
T Consensus 127 lydn 130 (137)
T PF12382_consen 127 LYDN 130 (137)
T ss_pred EeeC
Confidence 4443
No 25
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco, and Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides which
Probab=96.34 E-value=0.018 Score=50.52 Aligned_cols=86 Identities=9% Similarity=0.035 Sum_probs=52.7
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCC-CCccEEecch
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDL-GITDIIMGIH 227 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~-~~~dvILG~d 227 (258)
..++||||++..++...+. +.+.+..++|........ +.-+...+..... .++.. ..--.|||..
T Consensus 177 ~~ai~DTGTs~~~lp~~~~----------P~i~~~f~~~~~~~i~~~--~y~~~~~~~~~C~--~~~~~~~~~~~ilG~~ 242 (265)
T cd05476 177 GGTIIDSGTTLTYLPDPAY----------PDLTLHFDGGADLELPPE--NYFVDVGEGVVCL--AILSSSSGGVSILGNI 242 (265)
T ss_pred CcEEEeCCCcceEcCcccc----------CCEEEEECCCCEEEeCcc--cEEEECCCCCEEE--EEecCCCCCcEEEChh
Confidence 4589999999999988876 345566654443332211 0001111111111 12222 3446899999
Q ss_pred HHHhcCCeEEEeeCCEEEEeeC
Q 045527 228 WLKTLGATHINWKTHSMKFNTR 249 (258)
Q Consensus 228 wL~~~~~i~ID~~~~~v~f~~~ 249 (258)
||+.+- +..|+.+++|.|...
T Consensus 243 fl~~~~-~vFD~~~~~iGfa~~ 263 (265)
T cd05476 243 QQQNFL-VEYDLENSRLGFAPA 263 (265)
T ss_pred hcccEE-EEEECCCCEEeeecC
Confidence 999999 599999999988643
No 26
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=95.80 E-value=0.0035 Score=36.97 Aligned_cols=18 Identities=22% Similarity=0.427 Sum_probs=16.6
Q ss_pred CcceecCCCccCccccCC
Q 045527 65 GLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 65 ~lCf~Cg~~gh~~~~Cp~ 82 (258)
-+|+.|+++||+..+||.
T Consensus 9 Y~C~~C~~~GH~i~dCP~ 26 (32)
T PF13696_consen 9 YVCHRCGQKGHWIQDCPT 26 (32)
T ss_pred CEeecCCCCCccHhHCCC
Confidence 469999999999999995
No 27
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=95.73 E-value=0.029 Score=50.78 Aligned_cols=92 Identities=10% Similarity=0.040 Sum_probs=56.8
Q ss_pred EEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCccEEecch
Q 045527 148 KVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGITDIIMGIH 227 (258)
Q Consensus 148 ~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvILG~d 227 (258)
...++||||++..++...+.+++.-.. +.+.+...+|..+..... +.-+...+... +..+....--.|||..
T Consensus 231 ~~~aivDSGTs~~~lp~~~~~~l~~~~---P~i~~~f~~g~~~~i~p~--~y~~~~~~~~c---~~~~~~~~~~~ILG~~ 302 (326)
T cd06096 231 GLGMLVDSGSTLSHFPEDLYNKINNFF---PTITIIFENNLKIDWKPS--SYLYKKESFWC---KGGEKSVSNKPILGAS 302 (326)
T ss_pred CCCEEEeCCCCcccCCHHHHHHHHhhc---CcEEEEEcCCcEEEECHH--HhccccCCceE---EEEEecCCCceEEChH
Confidence 345899999999999999988875333 345555554544332110 00011111111 1111112234799999
Q ss_pred HHHhcCCeEEEeeCCEEEEee
Q 045527 228 WLKTLGATHINWKTHSMKFNT 248 (258)
Q Consensus 228 wL~~~~~i~ID~~~~~v~f~~ 248 (258)
||+.+- +..|+.+++|.|..
T Consensus 303 flr~~y-~vFD~~~~riGfa~ 322 (326)
T cd06096 303 FFKNKQ-IIFDLDNNRIGFVE 322 (326)
T ss_pred HhcCcE-EEEECcCCEEeeEc
Confidence 999999 69999999999864
No 28
>COG4067 Uncharacterized protein conserved in archaea [Posttranslational modification, protein turnover, chaperones]
Probab=95.63 E-value=0.02 Score=45.97 Aligned_cols=93 Identities=18% Similarity=0.194 Sum_probs=64.3
Q ss_pred EEEEEEEcCCCCccccCHHHHHHcCC----------C--------ccc-----CCCceEeecccccccccceEeeeeEee
Q 045527 147 KKVVVLTDSGASHNFISNEVVLVLKL----------P--------ITN-----TEPYGVILRTGSATKAQGICRGVGLIL 203 (258)
Q Consensus 147 ~~v~aLIDSGAt~sfIs~~~a~~l~l----------~--------~~~-----~~~~~V~~a~G~~~~~~~~~~~v~i~i 203 (258)
..+.|=|||||..|-++..=...+.- . ... ....+|+.++|+....... ..+.+.+
T Consensus 38 ~~~kAkiDTGA~TSsL~A~dI~~fkRdGe~WVRF~~~~~~~~~~~~~~~e~pvi~~ikvR~s~~~~~e~RpV-V~~~l~l 116 (162)
T COG4067 38 IQLKAKIDTGAVTSSLSASDIERFKRDGERWVRFRLADTDNLDQRSEECEAPVIRKIKVRSSSGSRAERRPV-VRLTLCL 116 (162)
T ss_pred ceeeeeecccceeeeEEeecceeeeeCCceEEEEEeecccCccccceeeccceEEEEEEecCCCCccccccE-EEEEEee
Confidence 35788899999999988754443311 1 000 0113455566665444433 4678899
Q ss_pred eceeEEeeccccCCC--CccEEecchHHHhcCCeEEEeeC
Q 045527 204 QGVEIVEDFLPLDLG--ITDIIMGIHWLKTLGATHINWKT 241 (258)
Q Consensus 204 ~g~~~~~~f~Vl~~~--~~dvILG~dwL~~~~~i~ID~~~ 241 (258)
||....+.|.+.+-. .|++|||.-+|..... .+|-..
T Consensus 117 G~~~~~~E~tLtDR~~m~Yp~LlGrk~l~~~~~-~VDpSr 155 (162)
T COG4067 117 GGRILPIEFTLTDRSNMRYPVLLGRKALRHFGA-VVDPSR 155 (162)
T ss_pred CCeeeeEEEEeecccccccceEecHHHHhhCCe-EECchh
Confidence 999999999998864 6999999999999884 788654
No 29
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in the gastric mucosa of mammals. Gastricsin is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. Its self-conversion to a mature enzyme is triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. The mature enzyme has a pseudo-2-fold symmetry axis that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=94.93 E-value=0.23 Score=44.63 Aligned_cols=93 Identities=15% Similarity=0.118 Sum_probs=55.0
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccccc-ccccc--eEeeeeEeeeceeEEeec-------------
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSA-TKAQG--ICRGVGLILQGVEIVEDF------------- 212 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~-~~~~~--~~~~v~i~i~g~~~~~~f------------- 212 (258)
..++||||++..++...+++.+--...... ...|.. +.|.. ..+.+.+.+++..+.+..
T Consensus 202 ~~~iiDSGtt~~~lP~~~~~~l~~~~~~~~-----~~~~~~~~~C~~~~~~p~l~~~f~g~~~~v~~~~y~~~~~~~C~~ 276 (318)
T cd05477 202 CQAIVDTGTSLLTAPQQVMSTLMQSIGAQQ-----DQYGQYVVNCNNIQNLPTLTFTINGVSFPLPPSAYILQNNGYCTV 276 (318)
T ss_pred ceeeECCCCccEECCHHHHHHHHHHhCCcc-----ccCCCEEEeCCccccCCcEEEEECCEEEEECHHHeEecCCCeEEE
Confidence 368999999999999988876532111000 001111 11111 113455666665554321
Q ss_pred cccC------CCCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 213 LPLD------LGITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 213 ~Vl~------~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
-+.+ ......|||..||+.+. +..|+.+.+|.|.
T Consensus 277 ~i~~~~~~~~~~~~~~ilG~~fl~~~y-~vfD~~~~~ig~a 316 (318)
T cd05477 277 GIEPTYLPSQNGQPLWILGDVFLRQYY-SVYDLGNNQVGFA 316 (318)
T ss_pred EEEecccCCCCCCceEEEcHHHhhheE-EEEeCCCCEEeee
Confidence 1111 11235899999999999 5999999999875
No 30
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=94.91 E-value=0.012 Score=37.06 Aligned_cols=20 Identities=25% Similarity=0.333 Sum_probs=17.8
Q ss_pred cCCcceecCCCccCccccCC
Q 045527 63 ECGLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 63 ~~~lCf~Cg~~gh~~~~Cp~ 82 (258)
....|.+|+++||...+||+
T Consensus 3 ~~~~CqkC~~~GH~tyeC~~ 22 (42)
T PF13917_consen 3 ARVRCQKCGQKGHWTYECPN 22 (42)
T ss_pred CCCcCcccCCCCcchhhCCC
Confidence 35689999999999999994
No 31
>smart00343 ZnF_C2HC zinc finger.
Probab=94.90 E-value=0.012 Score=32.88 Aligned_cols=17 Identities=29% Similarity=0.757 Sum_probs=15.5
Q ss_pred cceecCCCccCccccCC
Q 045527 66 LCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 66 lCf~Cg~~gh~~~~Cp~ 82 (258)
.||+|++.||++.+||.
T Consensus 1 ~C~~CG~~GH~~~~C~~ 17 (26)
T smart00343 1 KCYNCGKEGHIARDCPK 17 (26)
T ss_pred CCccCCCCCcchhhCCc
Confidence 49999999999999984
No 32
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases. They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=94.87 E-value=0.19 Score=45.30 Aligned_cols=95 Identities=15% Similarity=0.157 Sum_probs=50.8
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCccc-----CCCceEeecccccccccceEeeeeEeeec-e--eEEeeccccCC---
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITN-----TEPYGVILRTGSATKAQGICRGVGLILQG-V--EIVEDFLPLDL--- 217 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~-----~~~~~V~~a~G~~~~~~~~~~~v~i~i~g-~--~~~~~f~Vl~~--- 217 (258)
..++||||++..++..++++.+...... .+.+.+.. +|..+..... +.-+.... . .-...+..++.
T Consensus 211 ~~aivDTGTs~~~lP~~~~~~i~~~~~C~~~~~~P~i~f~f-~g~~~~l~~~--~yi~~~~~~~~~~C~~~~~~~~~~~~ 287 (317)
T cd06098 211 CAAIADSGTSLLAGPTTIVTQINSAVDCNSLSSMPNVSFTI-GGKTFELTPE--QYILKVGEGAAAQCISGFTALDVPPP 287 (317)
T ss_pred cEEEEecCCcceeCCHHHHHhhhccCCccccccCCcEEEEE-CCEEEEEChH--HeEEeecCCCCCEEeceEEECCCCCC
Confidence 4689999999999999999877532211 11222222 2221111100 00000000 0 00001111111
Q ss_pred CCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 218 GITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 218 ~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
.+...|||-.||+.+- +..|+.+++|.|.
T Consensus 288 ~~~~~IlGd~Flr~~y-~VfD~~~~~iGfA 316 (317)
T cd06098 288 RGPLWILGDVFMGAYH-TVFDYGNLRVGFA 316 (317)
T ss_pred CCCeEEechHHhcccE-EEEeCCCCEEeec
Confidence 1224799999999999 4899999999874
No 33
>PF00026 Asp: Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.; InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23.-) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human). More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family does not include the retroviral or retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=94.80 E-value=0.041 Score=48.90 Aligned_cols=96 Identities=14% Similarity=0.181 Sum_probs=54.2
Q ss_pred EEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEee----------------
Q 045527 148 KVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVED---------------- 211 (258)
Q Consensus 148 ~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~---------------- 211 (258)
...++||||++...+...++..+--.............+-.... ....+.+.+++..+.+.
T Consensus 199 ~~~~~~Dtgt~~i~lp~~~~~~i~~~l~~~~~~~~~~~~c~~~~---~~p~l~f~~~~~~~~i~~~~~~~~~~~~~~~~C 275 (317)
T PF00026_consen 199 GQQAILDTGTSYIYLPRSIFDAIIKALGGSYSDGVYSVPCNSTD---SLPDLTFTFGGVTFTIPPSDYIFKIEDGNGGYC 275 (317)
T ss_dssp EEEEEEETTBSSEEEEHHHHHHHHHHHTTEEECSEEEEETTGGG---GSEEEEEEETTEEEEEEHHHHEEEESSTTSSEE
T ss_pred ceeeecccccccccccchhhHHHHhhhcccccceeEEEeccccc---ccceEEEeeCCEEEEecchHhccccccccccee
Confidence 46799999999999999877765221111000000000000000 01233444444333321
Q ss_pred -ccccC----CCCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 212 -FLPLD----LGITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 212 -f~Vl~----~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
+.+.. ......|||.+||+.+- +.+|+.+++|.|.
T Consensus 276 ~~~i~~~~~~~~~~~~iLG~~fl~~~y-~vfD~~~~~ig~A 315 (317)
T PF00026_consen 276 YLGIQPMDSSDDSDDWILGSPFLRNYY-VVFDYENNRIGFA 315 (317)
T ss_dssp EESEEEESSTTSSSEEEEEHHHHTTEE-EEEETTTTEEEEE
T ss_pred EeeeecccccccCCceEecHHHhhceE-EEEeCCCCEEEEe
Confidence 11222 23567999999999999 5999999999875
No 34
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets an
Probab=94.74 E-value=0.29 Score=44.35 Aligned_cols=93 Identities=15% Similarity=0.152 Sum_probs=53.6
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccccc-ccccc--eEeeeeEeeeceeEEee--ccc---------
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSA-TKAQG--ICRGVGLILQGVEIVED--FLP--------- 214 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~-~~~~~--~~~~v~i~i~g~~~~~~--f~V--------- 214 (258)
..++||||++..++...+++.+.-..... . + .++.. +.|.. ..+.+.+.+++..+.+. .++
T Consensus 211 ~~~iiDSGtt~~~lP~~~~~~l~~~~~~~-~--~--~~~~~~~~C~~~~~~p~i~f~fgg~~~~i~~~~yi~~~~~~~~~ 285 (329)
T cd05485 211 CQAIADTGTSLIAGPVDEIEKLNNAIGAK-P--I--IGGEYMVNCSAIPSLPDITFVLGGKSFSLTGKDYVLKVTQMGQT 285 (329)
T ss_pred cEEEEccCCcceeCCHHHHHHHHHHhCCc-c--c--cCCcEEEeccccccCCcEEEEECCEEeEEChHHeEEEecCCCCC
Confidence 36999999999999998777653211110 0 0 01111 11111 11234555555544422 111
Q ss_pred ------c--C---CCCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 215 ------L--D---LGITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 215 ------l--~---~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
. + ..+...|||..||+.+- +..|+.+++|.|.
T Consensus 286 ~C~~~~~~~~~~~~~~~~~IlG~~fl~~~y-~vFD~~~~~ig~a 328 (329)
T cd05485 286 ICLSGFMGIDIPPPAGPLWILGDVFIGKYY-TEFDLGNNRVGFA 328 (329)
T ss_pred EEeeeEEECcCCCCCCCeEEEchHHhccce-EEEeCCCCEEeec
Confidence 1 1 11224799999999999 4899999999874
No 35
>PF05618 Zn_protease: Putative ATP-dependant zinc protease; InterPro: IPR008503 This family consists of several hypothetical proteins from different archaeal and bacterial species.; PDB: 2PMA_B.
Probab=94.63 E-value=0.03 Score=44.67 Aligned_cols=92 Identities=12% Similarity=0.116 Sum_probs=48.0
Q ss_pred EEEEEEEcCCCCccccCHHHHHHc---CCCcc-------c--C-----------CCceEeecccccccccceEeeeeEee
Q 045527 147 KKVVVLTDSGASHNFISNEVVLVL---KLPIT-------N--T-----------EPYGVILRTGSATKAQGICRGVGLIL 203 (258)
Q Consensus 147 ~~v~aLIDSGAt~sfIs~~~a~~l---~l~~~-------~--~-----------~~~~V~~a~G~~~~~~~~~~~v~i~i 203 (258)
..+.|=|||||..|-|+..=.+.+ |-... . . ....|+-.+|. .... ....+.+.+
T Consensus 15 ~~~~aKiDTGA~tSSLhA~~I~~ferdg~~~VrF~~~~~~~~~~~~~~~e~p~~~~~~Ik~s~g~-~e~R-~VV~~~~~l 92 (138)
T PF05618_consen 15 LTIKAKIDTGAKTSSLHATDIEEFERDGEKWVRFTVHDPDNEEKEGVTFEAPVVRRRKIKSSNGE-SERR-PVVETTLCL 92 (138)
T ss_dssp EEEEEEE-TT-SSEEEE-EEEEEEEETTEEEEEE----EEEETTEEEEEEEEEECEEE----------CC-EEEEEEEEE
T ss_pred CEEEEEEcCCCcccceeecceEEeeeCCceEEEEecccccCCCcceEEEEEEeEeEEEEEcCCCc-eeEe-eEEEEEEEE
Confidence 347788899999998876433321 10000 0 0 01122333444 1111 223678889
Q ss_pred eceeEEeeccccCCC--CccEEec-chHHHhcCCeEEEeeCC
Q 045527 204 QGVEIVEDFLPLDLG--ITDIIMG-IHWLKTLGATHINWKTH 242 (258)
Q Consensus 204 ~g~~~~~~f~Vl~~~--~~dvILG-~dwL~~~~~i~ID~~~~ 242 (258)
++..+.+.|.+.+-. .|+++|| ..||... +.+|-...
T Consensus 93 g~~~~~~e~tL~dR~~m~yp~LlGrR~~l~~~--~lVD~s~~ 132 (138)
T PF05618_consen 93 GGKTWKIEFTLTDRSNMKYPMLLGRRNFLRGR--FLVDVSRS 132 (138)
T ss_dssp TTEEEEEEEEEE-S--SS-SEEE-HHHHHHTT--EEEETT--
T ss_pred CCEEEEEEEEEcCCCcCcCCEEEEehHHhcCC--EEECCChh
Confidence 999999999998754 6899999 9999885 47886643
No 36
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequences of the zymogens are self-cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well-researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures; however, their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which
Probab=94.54 E-value=0.31 Score=43.78 Aligned_cols=100 Identities=14% Similarity=0.152 Sum_probs=60.4
Q ss_pred EEECCEEE------EEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccc-cccccc--eEeeeeEeeeceeEEeec
Q 045527 142 SEINNKKV------VVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGS-ATKAQG--ICRGVGLILQGVEIVEDF 212 (258)
Q Consensus 142 ~~I~g~~v------~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~-~~~~~~--~~~~v~i~i~g~~~~~~f 212 (258)
+.|++..+ .++||||++..++....+..+.-...... ..+|. .+.|.. ..+.+.+.++|..+.+..
T Consensus 194 v~v~g~~~~~~~~~~~iiDTGts~~~lp~~~~~~l~~~~~~~~-----~~~~~~~~~C~~~~~~P~~~f~f~g~~~~i~~ 268 (317)
T cd05478 194 VTINGQVVACSGGCQAIVDTGTSLLVGPSSDIANIQSDIGASQ-----NQNGEMVVNCSSISSMPDVVFTINGVQYPLPP 268 (317)
T ss_pred EEECCEEEccCCCCEEEECCCchhhhCCHHHHHHHHHHhCCcc-----ccCCcEEeCCcCcccCCcEEEEECCEEEEECH
Confidence 45676654 58999999999999988876532211100 01111 112211 123456666666655331
Q ss_pred -----------c--ccCCC-CccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 213 -----------L--PLDLG-ITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 213 -----------~--Vl~~~-~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
+ +.+.. ....|||..||+.+- +..|+.+++|.|.
T Consensus 269 ~~y~~~~~~~C~~~~~~~~~~~~~IlG~~fl~~~y-~vfD~~~~~iG~A 316 (317)
T cd05478 269 SAYILQDQGSCTSGFQSMGLGELWILGDVFIRQYY-SVFDRANNKVGLA 316 (317)
T ss_pred HHheecCCCEEeEEEEeCCCCCeEEechHHhcceE-EEEeCCCCEEeec
Confidence 1 11222 235899999999999 4899999999874
No 37
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity, and Aspergillopepsins from the mold A. saitoi are used in the fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=94.42 E-value=0.052 Score=47.86 Aligned_cols=79 Identities=11% Similarity=0.101 Sum_probs=47.7
Q ss_pred EEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccc-cccccceEeeeeEeeeceeEEeeccccCCCCccEEecc
Q 045527 148 KVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGS-ATKAQGICRGVGLILQGVEIVEDFLPLDLGITDIIMGI 226 (258)
Q Consensus 148 ~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~-~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvILG~ 226 (258)
...++||||++..++...+++.+.-... ........|. .+.|....+.+.+.+ ..|||-
T Consensus 198 ~~~~iiDSGTs~~~lP~~~~~~l~~~l~---g~~~~~~~~~~~~~C~~~~P~i~f~~-----------------~~ilGd 257 (278)
T cd06097 198 GFSAIADTGTTLILLPDAIVEAYYSQVP---GAYYDSEYGGWVFPCDTTLPDLSFAV-----------------FSILGD 257 (278)
T ss_pred CceEEeecCCchhcCCHHHHHHHHHhCc---CCcccCCCCEEEEECCCCCCCEEEEE-----------------EEEEcc
Confidence 4569999999999999888776632210 0000001111 111111011222222 579999
Q ss_pred hHHHhcCCeEEEeeCCEEEEe
Q 045527 227 HWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 227 dwL~~~~~i~ID~~~~~v~f~ 247 (258)
.||+.+-. ..|+.+++|-|.
T Consensus 258 ~fl~~~y~-vfD~~~~~ig~A 277 (278)
T cd06097 258 VFLKAQYV-VFDVGGPKLGFA 277 (278)
T ss_pred hhhCceeE-EEcCCCceeeec
Confidence 99999995 999999998774
No 38
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well-known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=94.28 E-value=0.077 Score=46.19 Aligned_cols=79 Identities=14% Similarity=0.175 Sum_probs=49.6
Q ss_pred EEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccccc-cccc--ceEeeeeEeeeceeEEeeccccCCCCccEE
Q 045527 147 KKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSA-TKAQ--GICRGVGLILQGVEIVEDFLPLDLGITDII 223 (258)
Q Consensus 147 ~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~-~~~~--~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvI 223 (258)
....++||||++..++...+++.+--.....-.. ..... ..+. ...+.+.+.+ ..|
T Consensus 201 ~~~~~iiDsGt~~~~lp~~~~~~l~~~~~~~~~~----~~~~~~~~~~~~~~~p~i~f~f-----------------~~i 259 (283)
T cd05471 201 GGGGAIVDSGTSLIYLPSSVYDAILKALGAAVSS----SDGGYGVDCSPCDTLPDITFTF-----------------LWI 259 (283)
T ss_pred CCcEEEEecCCCCEeCCHHHHHHHHHHhCCcccc----cCCcEEEeCcccCcCCCEEEEE-----------------EEE
Confidence 4578999999999999999888764332221100 00000 0000 0001122222 799
Q ss_pred ecchHHHhcCCeEEEeeCCEEEEe
Q 045527 224 MGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 224 LG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
||..||+.+. +..|+.+++|.|.
T Consensus 260 lG~~fl~~~y-~vfD~~~~~igfa 282 (283)
T cd05471 260 LGDVFLRNYY-TVFDLDNNRIGFA 282 (283)
T ss_pred ccHhhhhheE-EEEeCCCCEEeec
Confidence 9999999999 5999999998774
No 39
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=93.96 E-value=0.14 Score=45.13 Aligned_cols=96 Identities=14% Similarity=0.173 Sum_probs=56.0
Q ss_pred EEEEEEEcCCCCccccCHHHHHHcCCCcccC-CCceEeecccccccccceE-eeeeEeeeceeEEeec--c---------
Q 045527 147 KKVVVLTDSGASHNFISNEVVLVLKLPITNT-EPYGVILRTGSATKAQGIC-RGVGLILQGVEIVEDF--L--------- 213 (258)
Q Consensus 147 ~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~-~~~~V~~a~G~~~~~~~~~-~~v~i~i~g~~~~~~f--~--------- 213 (258)
....++||||++..++...+++.+--..... ... ...-...|.... +.+.+.++|..+.+.. +
T Consensus 177 ~~~~~iiDSGt~~~~lP~~~~~~l~~~~~~~~~~~----~~~~~~~C~~~~~p~i~f~f~g~~~~i~~~~~~~~~~~~~~ 252 (295)
T cd05474 177 KNLPALLDSGTTLTYLPSDIVDAIAKQLGATYDSD----EGLYVVDCDAKDDGSLTFNFGGATISVPLSDLVLPASTDDG 252 (295)
T ss_pred CCccEEECCCCccEeCCHHHHHHHHHHhCCEEcCC----CcEEEEeCCCCCCCEEEEEECCeEEEEEHHHhEeccccCCC
Confidence 3467999999999999998887653221110 000 000001111000 3455556665444321 1
Q ss_pred --------ccCCCCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 214 --------PLDLGITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 214 --------Vl~~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
+.+......|||..||+.+- +..|+.+++|.|.
T Consensus 253 ~~~~C~~~i~~~~~~~~iLG~~fl~~~y-~vfD~~~~~ig~a 293 (295)
T cd05474 253 GDGACYLGIQPSTSDYNILGDTFLRSAY-VVYDLDNNEISLA 293 (295)
T ss_pred CCCCeEEEEEeCCCCcEEeChHHhhcEE-EEEECCCCEEEee
Confidence 11211135899999999999 5999999999875
No 40
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=93.86 E-value=0.024 Score=34.17 Aligned_cols=20 Identities=35% Similarity=0.843 Sum_probs=12.3
Q ss_pred CCcceecCCCccCccccCCc
Q 045527 64 CGLCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 64 ~~lCf~Cg~~gh~~~~Cp~k 83 (258)
.++|++|++.+|-+.+|..+
T Consensus 2 ~~~CprC~kg~Hwa~~C~sk 21 (36)
T PF14787_consen 2 PGLCPRCGKGFHWASECRSK 21 (36)
T ss_dssp --C-TTTSSSCS-TTT---T
T ss_pred CccCcccCCCcchhhhhhhh
Confidence 47899999999999999854
No 41
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with reduced amounts of CND41 maintained green leaves and constant protein levels, especially Rubisco. CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This fami
Probab=93.76 E-value=0.6 Score=41.45 Aligned_cols=27 Identities=15% Similarity=0.066 Sum_probs=24.0
Q ss_pred cEEecchHHHhcCCeEEEeeCCEEEEee
Q 045527 221 DIIMGIHWLKTLGATHINWKTHSMKFNT 248 (258)
Q Consensus 221 dvILG~dwL~~~~~i~ID~~~~~v~f~~ 248 (258)
-.|||..||+.+. +..|+.+++|.|..
T Consensus 270 ~~ilG~~fl~~~~-vvfD~~~~~igfa~ 296 (299)
T cd05472 270 LSIIGNVQQQTFR-VVYDVAGGRIGFAP 296 (299)
T ss_pred CEEEchHHccceE-EEEECCCCEEeEec
Confidence 3799999999999 59999999998864
No 42
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family: it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=93.50 E-value=0.28 Score=44.11 Aligned_cols=91 Identities=14% Similarity=0.131 Sum_probs=52.6
Q ss_pred EEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccccc-ccccc--eEeeeeEeeeceeEEeec--cc----------
Q 045527 150 VVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSA-TKAQG--ICRGVGLILQGVEIVEDF--LP---------- 214 (258)
Q Consensus 150 ~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~-~~~~~--~~~~v~i~i~g~~~~~~f--~V---------- 214 (258)
.++||||++..++....++.+.-.... ...+|.. +.|.. ..+.+.+.++|..+.+.. ++
T Consensus 200 ~aiiDTGTs~~~lP~~~~~~l~~~~~~------~~~~~~~~~~C~~~~~~p~i~f~f~g~~~~l~~~~y~~~~~~~~~~~ 273 (316)
T cd05486 200 QAIVDTGTSLITGPSGDIKQLQNYIGA------TATDGEYGVDCSTLSLMPSVTFTINGIPYSLSPQAYTLEDQSDGGGY 273 (316)
T ss_pred EEEECCCcchhhcCHHHHHHHHHHhCC------cccCCcEEEeccccccCCCEEEEECCEEEEeCHHHeEEecccCCCCE
Confidence 699999999999999877665211100 0011211 11211 113455556665444321 11
Q ss_pred -------cC---CCCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 215 -------LD---LGITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 215 -------l~---~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
++ ..+...|||-.||+.+-. ..|+.+++|-|.
T Consensus 274 C~~~~~~~~~~~~~~~~~ILGd~flr~~y~-vfD~~~~~IGfA 315 (316)
T cd05486 274 CSSGFQGLDIPPPAGPLWILGDVFIRQYYS-VFDRGNNRVGFA 315 (316)
T ss_pred EeeEEEECCCCCCCCCeEEEchHHhcceEE-EEeCCCCEeecc
Confidence 11 111237999999999994 899999998763
No 43
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=92.70 E-value=0.049 Score=45.42 Aligned_cols=23 Identities=17% Similarity=0.518 Sum_probs=20.7
Q ss_pred hhcccCCcceecCCCccCccccC
Q 045527 59 QSNQECGLCYKCDEKFSPGHRCR 81 (258)
Q Consensus 59 ~~rR~~~lCf~Cg~~gh~~~~Cp 81 (258)
+.+++...||+||+.||...+||
T Consensus 55 ~~~~~~~~C~nCg~~GH~~~DCP 77 (190)
T COG5082 55 AIREENPVCFNCGQNGHLRRDCP 77 (190)
T ss_pred cccccccccchhcccCcccccCC
Confidence 56677888999999999999999
No 44
>PTZ00147 plasmepsin-1; Provisional
Probab=92.55 E-value=1.1 Score=42.70 Aligned_cols=98 Identities=15% Similarity=0.163 Sum_probs=55.4
Q ss_pred EEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccccccccc-ceEeeeeEeeeceeEEee--cc-----------
Q 045527 148 KVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQ-GICRGVGLILQGVEIVED--FL----------- 213 (258)
Q Consensus 148 ~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~-~~~~~v~i~i~g~~~~~~--f~----------- 213 (258)
...++||||++..++..+.+..+--..... .+.....-...|. ...+.+.+.+++..+.+. .+
T Consensus 332 ~~~aIiDSGTsli~lP~~~~~ai~~~l~~~---~~~~~~~y~~~C~~~~lP~~~f~f~g~~~~L~p~~yi~~~~~~~~~~ 408 (453)
T PTZ00147 332 KANVIVDSGTSVITVPTEFLNKFVESLDVF---KVPFLPLYVTTCNNTKLPTLEFRSPNKVYTLEPEYYLQPIEDIGSAL 408 (453)
T ss_pred ceeEEECCCCchhcCCHHHHHHHHHHhCCe---ecCCCCeEEEeCCCCCCCeEEEEECCEEEEECHHHheeccccCCCcE
Confidence 457999999999999998877642111000 0000000001111 111344555555444322 11
Q ss_pred ----cc--CCCCccEEecchHHHhcCCeEEEeeCCEEEEeeC
Q 045527 214 ----PL--DLGITDIIMGIHWLKTLGATHINWKTHSMKFNTR 249 (258)
Q Consensus 214 ----Vl--~~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~~~ 249 (258)
+. +......|||..||+.+-. ..|+.+.+|.|...
T Consensus 409 C~~~i~~~~~~~~~~ILGd~FLr~~Yt-VFD~~n~rIGfA~a 449 (453)
T PTZ00147 409 CMLNIIPIDLEKNTFILGDPFMRKYFT-VFDYDNHTVGFALA 449 (453)
T ss_pred EEEEEEECCCCCCCEEECHHHhccEEE-EEECCCCEEEEEEe
Confidence 11 1122247999999999995 99999999998754
No 45
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=92.27 E-value=0.072 Score=42.77 Aligned_cols=19 Identities=21% Similarity=0.559 Sum_probs=14.7
Q ss_pred CCcceecCCCccCccccCC
Q 045527 64 CGLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 64 ~~lCf~Cg~~gh~~~~Cp~ 82 (258)
...||+|++.||++.+||.
T Consensus 103 ~~~C~~Cg~~gH~~~~C~~ 121 (148)
T PTZ00368 103 RRACYNCGGEGHISRDCPN 121 (148)
T ss_pred chhhcccCcCCcchhcCCC
Confidence 3468888888888888875
No 46
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, proteinase A is targeted to the vacuole as a zymogen; its activation at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme. Proteinase A preferentially hydro
Probab=91.89 E-value=0.44 Score=42.91 Aligned_cols=93 Identities=12% Similarity=0.123 Sum_probs=55.5
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccccc-ccccc--eEeeeeEeeeceeEEeec--cccC-------
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSA-TKAQG--ICRGVGLILQGVEIVEDF--LPLD------- 216 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~-~~~~~--~~~~v~i~i~g~~~~~~f--~Vl~------- 216 (258)
..++||||++..++...+++.+.-...... ..++.. +.|.. ..+.+.+.+++..+.+.. ++++
T Consensus 206 ~~~ivDSGtt~~~lp~~~~~~l~~~~~~~~-----~~~~~~~~~C~~~~~~P~i~f~f~g~~~~i~~~~y~~~~~g~C~~ 280 (320)
T cd05488 206 TGAAIDTGTSLIALPSDLAEMLNAEIGAKK-----SWNGQYTVDCSKVDSLPDLTFNFDGYNFTLGPFDYTLEVSGSCIS 280 (320)
T ss_pred CeEEEcCCcccccCCHHHHHHHHHHhCCcc-----ccCCcEEeeccccccCCCEEEEECCEEEEECHHHheecCCCeEEE
Confidence 358999999999999998876532211100 011111 11111 113456666666554331 1111
Q ss_pred ------C---CCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 217 ------L---GITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 217 ------~---~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
+ .+...|||..||+.+- +..|+.+++|.|.
T Consensus 281 ~~~~~~~~~~~~~~~ilG~~fl~~~y-~vfD~~~~~iG~a 319 (320)
T cd05488 281 AFTGMDFPEPVGPLAIVGDAFLRKYY-SVYDLGNNAVGLA 319 (320)
T ss_pred EEEECcCCCCCCCeEEEchHHhhheE-EEEeCCCCEEeec
Confidence 1 1225899999999998 5999999998774
No 47
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank
Probab=91.16 E-value=0.56 Score=42.23 Aligned_cols=93 Identities=18% Similarity=0.147 Sum_probs=53.6
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccc-cccccc--eEeeeeEeeeceeEEee--------------
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGS-ATKAQG--ICRGVGLILQGVEIVED-------------- 211 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~-~~~~~~--~~~~v~i~i~g~~~~~~-------------- 211 (258)
..++||||++..++....++.+.-...... ...|. .+.|.. ..+.+.+.+++..+.+.
T Consensus 207 ~~aiiDSGTt~~~~p~~~~~~l~~~~~~~~-----~~~~~~~~~C~~~~~~P~i~f~fgg~~~~l~~~~y~~~~~~~~~~ 281 (325)
T cd05490 207 CEAIVDTGTSLITGPVEEVRALQKAIGAVP-----LIQGEYMIDCEKIPTLPVISFSLGGKVYPLTGEDYILKVSQRGTT 281 (325)
T ss_pred CEEEECCCCccccCCHHHHHHHHHHhCCcc-----ccCCCEEecccccccCCCEEEEECCEEEEEChHHeEEeccCCCCC
Confidence 479999999999999988876532111100 00111 111211 11234555555544322
Q ss_pred -c----cccCC---CCccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 212 -F----LPLDL---GITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 212 -f----~Vl~~---~~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
+ ..++. .....|||..||+.+- +..|+.+++|.|.
T Consensus 282 ~C~~~~~~~~~~~~~~~~~ilGd~flr~~y-~vfD~~~~~IGfA 324 (325)
T cd05490 282 ICLSGFMGLDIPPPAGPLWILGDVFIGRYY-TVFDRDNDRVGFA 324 (325)
T ss_pred EEeeEEEECCCCCCCCceEEEChHhheeeE-EEEEcCCcEeecc
Confidence 1 11111 1234799999999999 4899999998774
No 48
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate r
Probab=90.89 E-value=1.2 Score=40.06 Aligned_cols=26 Identities=12% Similarity=0.365 Sum_probs=23.3
Q ss_pred cEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 221 DIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 221 dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
..|||..||+.+- +..|+.+++|-|.
T Consensus 299 ~~ilG~~flr~~y-~vfD~~~~~IGfA 324 (326)
T cd05487 299 LWVLGATFIRKFY-TEFDRQNNRIGFA 324 (326)
T ss_pred eEEEehHHhhccE-EEEeCCCCEEeee
Confidence 4799999999999 5999999999875
No 49
>cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more d
Probab=90.44 E-value=0.77 Score=40.38 Aligned_cols=85 Identities=16% Similarity=0.179 Sum_probs=48.8
Q ss_pred EEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeeccc---ccccccceEeeeeEee-eceeEEeeccccCC---C-Cc
Q 045527 149 VVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTG---SATKAQGICRGVGLIL-QGVEIVEDFLPLDL---G-IT 220 (258)
Q Consensus 149 v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G---~~~~~~~~~~~v~i~i-~g~~~~~~f~Vl~~---~-~~ 220 (258)
..++||||++..++..... .+++.+....+ ..+..... +.-+.. .+. .. +.++.. . .-
T Consensus 178 ~~~ivDTGTt~t~lp~~~y---------~p~i~~~f~~~~~~~~~~l~~~--~y~~~~~~~~-~C--l~~~~~~~~~~~~ 243 (273)
T cd05475 178 LEVVFDSGSSYTYFNAQAY---------FKPLTLKFGKGWRTRLLEIPPE--NYLIISEKGN-VC--LGILNGSEIGLGN 243 (273)
T ss_pred ceEEEECCCceEEcCCccc---------cccEEEEECCCCceeEEEeCCC--ceEEEcCCCC-EE--EEEecCCCcCCCc
Confidence 4689999999999988754 23445554332 11211110 000000 111 11 112211 1 12
Q ss_pred cEEecchHHHhcCCeEEEeeCCEEEEee
Q 045527 221 DIIMGIHWLKTLGATHINWKTHSMKFNT 248 (258)
Q Consensus 221 dvILG~dwL~~~~~i~ID~~~~~v~f~~ 248 (258)
..|||..||+.+- +..|+.+++|-|..
T Consensus 244 ~~ilG~~~l~~~~-~vfD~~~~riGfa~ 270 (273)
T cd05475 244 TNIIGDISMQGLM-VIYDNEKQQIGWVR 270 (273)
T ss_pred eEEECceEEEeeE-EEEECcCCEeCccc
Confidence 4799999999999 59999999998864
No 50
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=90.00 E-value=0.83 Score=43.55 Aligned_cols=96 Identities=14% Similarity=0.106 Sum_probs=53.8
Q ss_pred EEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccc-ccccc-ceEeeeeEeeeceeEEee--------------
Q 045527 148 KVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGS-ATKAQ-GICRGVGLILQGVEIVED-------------- 211 (258)
Q Consensus 148 ~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~-~~~~~-~~~~~v~i~i~g~~~~~~-------------- 211 (258)
...++||||++..++..+.++.+--..... .+ ...|. ...|. ...+.+.+.+++..+.+.
T Consensus 331 ~~~aIlDSGTSli~lP~~~~~~i~~~l~~~---~~-~~~~~y~~~C~~~~lP~i~F~~~g~~~~L~p~~Yi~~~~~~~~~ 406 (450)
T PTZ00013 331 KANVIVDSGTTTITAPSEFLNKFFANLNVI---KV-PFLPFYVTTCDNKEMPTLEFKSANNTYTLEPEYYMNPLLDVDDT 406 (450)
T ss_pred ccceEECCCCccccCCHHHHHHHHHHhCCe---ec-CCCCeEEeecCCCCCCeEEEEECCEEEEECHHHheehhccCCCC
Confidence 346999999999999998776652111000 00 00110 01111 111234455555443321
Q ss_pred ---ccccC--CCCccEEecchHHHhcCCeEEEeeCCEEEEee
Q 045527 212 ---FLPLD--LGITDIIMGIHWLKTLGATHINWKTHSMKFNT 248 (258)
Q Consensus 212 ---f~Vl~--~~~~dvILG~dwL~~~~~i~ID~~~~~v~f~~ 248 (258)
+.+.+ ...-..|||-.||+.+-. ..|+.+++|.|..
T Consensus 407 ~C~~~i~~~~~~~~~~ILGd~FLr~~Y~-VFD~~n~rIGfA~ 447 (450)
T PTZ00013 407 LCMITMLPVDIDDNTFILGDPFMRKYFT-VFDYDKESVGFAI 447 (450)
T ss_pred eeEEEEEECCCCCCCEEECHHHhccEEE-EEECCCCEEEEEE
Confidence 11111 112247999999999995 8999999998864
No 51
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=89.97 E-value=0.14 Score=42.84 Aligned_cols=19 Identities=26% Similarity=0.656 Sum_probs=16.8
Q ss_pred CcceecCCCccCcccc-CCc
Q 045527 65 GLCYKCDEKFSPGHRC-RKQ 83 (258)
Q Consensus 65 ~lCf~Cg~~gh~~~~C-p~k 83 (258)
.+||+||+-||++.+| |.+
T Consensus 98 ~~C~~Cg~~GH~~~dC~P~~ 117 (190)
T COG5082 98 KKCYNCGETGHLSRDCNPSK 117 (190)
T ss_pred cccccccccCccccccCccc
Confidence 5799999999999999 654
No 52
>PLN03146 aspartyl protease family protein; Provisional
Probab=89.78 E-value=0.63 Score=44.01 Aligned_cols=27 Identities=11% Similarity=0.103 Sum_probs=24.0
Q ss_pred cEEecchHHHhcCCeEEEeeCCEEEEee
Q 045527 221 DIIMGIHWLKTLGATHINWKTHSMKFNT 248 (258)
Q Consensus 221 dvILG~dwL~~~~~i~ID~~~~~v~f~~ 248 (258)
..|||..+++.+. +..|..+++|.|..
T Consensus 399 ~~IlG~~~q~~~~-vvyDl~~~~igFa~ 425 (431)
T PLN03146 399 IAIFGNLAQMNFL-VGYDLESKTVSFKP 425 (431)
T ss_pred ceEECeeeEeeEE-EEEECCCCEEeeec
Confidence 3799999999998 69999999999864
No 53
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=88.85 E-value=1.8 Score=38.08 Aligned_cols=73 Identities=18% Similarity=0.307 Sum_probs=46.2
Q ss_pred EEEEEECC--EEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEE-eecccc
Q 045527 139 KLASEINN--KKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIV-EDFLPL 215 (258)
Q Consensus 139 ~i~~~I~g--~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~-~~f~Vl 215 (258)
.+.+.|+. +.+.+++||||+..+|. ++.+..++|..+. |....=.+.+++.... ..|-+.
T Consensus 4 ~~~i~iGtp~q~~~v~~DTgS~~~wv~---------------~~~~~Y~~g~~~~--G~~~~D~v~~g~~~~~~~~fg~~ 66 (295)
T cd05474 4 SAELSVGTPPQKVTVLLDTGSSDLWVP---------------DFSISYGDGTSAS--GTWGTDTVSIGGATVKNLQFAVA 66 (295)
T ss_pred EEEEEECCCCcEEEEEEeCCCCcceee---------------eeEEEeccCCcEE--EEEEEEEEEECCeEecceEEEEE
Confidence 45566666 88999999999999988 4566667765433 2223335666665442 234333
Q ss_pred C-CCCccEEecchH
Q 045527 216 D-LGITDIIMGIHW 228 (258)
Q Consensus 216 ~-~~~~dvILG~dw 228 (258)
. ....|.|||+-+
T Consensus 67 ~~~~~~~GilGLg~ 80 (295)
T cd05474 67 NSTSSDVGVLGIGL 80 (295)
T ss_pred ecCCCCcceeeECC
Confidence 2 345788988654
No 54
>PF00026 Asp: Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.; InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human). More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event []. This family does not include the retroviral nor retrotransposon aspartic proteases which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=88.46 E-value=0.73 Score=40.80 Aligned_cols=87 Identities=20% Similarity=0.256 Sum_probs=47.8
Q ss_pred EEEEEEC--CEEEEEEEcCCCCccccCHHHHHHc-------CCCccc-------CCCceEeecccccccccceEeeeeEe
Q 045527 139 KLASEIN--NKKVVVLTDSGASHNFISNEVVLVL-------KLPITN-------TEPYGVILRTGSATKAQGICRGVGLI 202 (258)
Q Consensus 139 ~i~~~I~--g~~v~aLIDSGAt~sfIs~~~a~~l-------~l~~~~-------~~~~~V~~a~G~~~~~~~~~~~v~i~ 202 (258)
.+.+.|+ .+++.++|||||+..+|...-+... ...... ..++.+..++|. +. +....-.+.
T Consensus 3 ~~~v~iGtp~q~~~~~iDTGS~~~wv~~~~c~~~~~~~~~~~y~~~~S~t~~~~~~~~~~~y~~g~-~~--G~~~~D~v~ 79 (317)
T PF00026_consen 3 YINVTIGTPPQTFRVLIDTGSSDTWVPSSNCNSCSSCASSGFYNPSKSSTFSNQGKPFSISYGDGS-VS--GNLVSDTVS 79 (317)
T ss_dssp EEEEEETTTTEEEEEEEETTBSSEEEEBTTECSHTHHCTSC-BBGGGSTTEEEEEEEEEEEETTEE-EE--EEEEEEEEE
T ss_pred EEEEEECCCCeEEEEEEecccceeeeceeccccccccccccccccccccccccceeeeeeeccCcc-cc--cccccceEe
Confidence 4566776 7999999999999988864221111 111000 012344445555 33 333334667
Q ss_pred eeceeEE-eeccccCC--------CCccEEecchH
Q 045527 203 LQGVEIV-EDFLPLDL--------GITDIIMGIHW 228 (258)
Q Consensus 203 i~g~~~~-~~f~Vl~~--------~~~dvILG~dw 228 (258)
+++.... ..|.++.. ...|.|||+-+
T Consensus 80 ig~~~~~~~~f~~~~~~~~~~~~~~~~~GilGLg~ 114 (317)
T PF00026_consen 80 IGGLTIPNQTFGLADSYSGDPFSPIPFDGILGLGF 114 (317)
T ss_dssp ETTEEEEEEEEEEEEEEESHHHHHSSSSEEEE-SS
T ss_pred eeeccccccceeccccccccccccccccccccccC
Confidence 7776654 44444332 35688999873
No 55
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=87.99 E-value=0.2 Score=32.49 Aligned_cols=21 Identities=19% Similarity=0.534 Sum_probs=17.8
Q ss_pred ccCCcceecCCCccCccccCC
Q 045527 62 QECGLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 62 R~~~lCf~Cg~~gh~~~~Cp~ 82 (258)
|-...||+||--||....||+
T Consensus 29 ~lp~~C~~C~~~gH~~~~C~k 49 (49)
T PF14392_consen 29 RLPRFCFHCGRIGHSDKECPK 49 (49)
T ss_pred CcChhhcCCCCcCcCHhHcCC
Confidence 344669999999999999984
No 56
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=87.53 E-value=0.27 Score=43.81 Aligned_cols=17 Identities=29% Similarity=0.737 Sum_probs=16.0
Q ss_pred cceecCCCccCccccCC
Q 045527 66 LCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 66 lCf~Cg~~gh~~~~Cp~ 82 (258)
.||+||++||....||.
T Consensus 178 ~CyRCGqkgHwIqnCpT 194 (427)
T COG5222 178 VCYRCGQKGHWIQNCPT 194 (427)
T ss_pred eEEecCCCCchhhcCCC
Confidence 59999999999999995
No 57
>PTZ00165 aspartyl protease; Provisional
Probab=87.11 E-value=3.7 Score=39.50 Aligned_cols=29 Identities=3% Similarity=0.137 Sum_probs=25.2
Q ss_pred cEEecchHHHhcCCeEEEeeCCEEEEeeCC
Q 045527 221 DIIMGIHWLKTLGATHINWKTHSMKFNTRN 250 (258)
Q Consensus 221 dvILG~dwL~~~~~i~ID~~~~~v~f~~~~ 250 (258)
..|||-.||+++-. ..|..+++|-|....
T Consensus 419 ~~ILGd~Flr~yy~-VFD~~n~rIGfA~a~ 447 (482)
T PTZ00165 419 LFVLGNNFIRKYYS-IFDRDHMMVGLVPAK 447 (482)
T ss_pred eEEEchhhheeEEE-EEeCCCCEEEEEeec
Confidence 47999999999995 999999999997543
No 58
>PTZ00147 plasmepsin-1; Provisional
Probab=86.98 E-value=1.9 Score=41.19 Aligned_cols=89 Identities=11% Similarity=0.175 Sum_probs=50.5
Q ss_pred eEEEEEEEC--CEEEEEEEcCCCCccccCHHHHHHcCCCc----cc---------CCCceEeecccccccccceEeeeeE
Q 045527 137 TLKLASEIN--NKKVVVLTDSGASHNFISNEVVLVLKLPI----TN---------TEPYGVILRTGSATKAQGICRGVGL 201 (258)
Q Consensus 137 ~i~i~~~I~--g~~v~aLIDSGAt~sfIs~~~a~~l~l~~----~~---------~~~~~V~~a~G~~~~~~~~~~~v~i 201 (258)
.....+.|+ .+++.+++||||+..+|-..-+...++.. .+ ...+.+..++|. +.|....=.+
T Consensus 139 ~Y~~~I~IGTP~Q~f~Vi~DTGSsdlWVps~~C~~~~C~~~~~yd~s~SsT~~~~~~~f~i~Yg~Gs---vsG~~~~DtV 215 (453)
T PTZ00147 139 MSYGEAKLGDNGQKFNFIFDTGSANLWVPSIKCTTEGCETKNLYDSSKSKTYEKDGTKVEMNYVSGT---VSGFFSKDLV 215 (453)
T ss_pred EEEEEEEECCCCeEEEEEEeCCCCcEEEeecCCCcccccCCCccCCccCcceEECCCEEEEEeCCCC---EEEEEEEEEE
Confidence 445677887 68999999999999998543221111110 00 113445556664 2233333356
Q ss_pred eeeceeEEeecccc----------CCCCccEEecchH
Q 045527 202 ILQGVEIVEDFLPL----------DLGITDIIMGIHW 228 (258)
Q Consensus 202 ~i~g~~~~~~f~Vl----------~~~~~dvILG~dw 228 (258)
.+++......|..+ .....|.|||+-|
T Consensus 216 tiG~~~v~~qF~~~~~~~~f~~~~~~~~~DGILGLG~ 252 (453)
T PTZ00147 216 TIGNLSVPYKFIEVTDTNGFEPFYTESDFDGIFGLGW 252 (453)
T ss_pred EECCEEEEEEEEEEEeccCcccccccccccceecccC
Confidence 67776554333321 1224799999987
No 59
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=86.55 E-value=2.1 Score=40.81 Aligned_cols=88 Identities=15% Similarity=0.308 Sum_probs=49.3
Q ss_pred EEEEEEEC--CEEEEEEEcCCCCccccCHHHHHHcCCCcc----c--------C-CCceEeecccccccccceEeeeeEe
Q 045527 138 LKLASEIN--NKKVVVLTDSGASHNFISNEVVLVLKLPIT----N--------T-EPYGVILRTGSATKAQGICRGVGLI 202 (258)
Q Consensus 138 i~i~~~I~--g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~----~--------~-~~~~V~~a~G~~~~~~~~~~~v~i~ 202 (258)
....+.|+ .+++.+++||||+..+|...-+...++... + . ..+.+..++|. + .|....=.|.
T Consensus 139 Yy~~i~IGTP~Q~f~vi~DTGSsdlWV~s~~C~~~~C~~~~~yd~s~SsT~~~~~~~~~i~YG~Gs-v--~G~~~~Dtv~ 215 (450)
T PTZ00013 139 FYGEGEVGDNHQKFMLIFDTGSANLWVPSKKCDSIGCSIKNLYDSSKSKSYEKDGTKVDITYGSGT-V--KGFFSKDLVT 215 (450)
T ss_pred EEEEEEECCCCeEEEEEEeCCCCceEEecccCCccccccCCCccCccCcccccCCcEEEEEECCce-E--EEEEEEEEEE
Confidence 35566776 689999999999999986443221111100 0 0 12345556664 2 2333333566
Q ss_pred eeceeEEeecccc----------CCCCccEEecchH
Q 045527 203 LQGVEIVEDFLPL----------DLGITDIIMGIHW 228 (258)
Q Consensus 203 i~g~~~~~~f~Vl----------~~~~~dvILG~dw 228 (258)
+++......|..+ .....|.|||+.|
T Consensus 216 iG~~~~~~~f~~~~~~~~~~~~~~~~~~dGIlGLg~ 251 (450)
T PTZ00013 216 LGHLSMPYKFIEVTDTDDLEPIYSSSEFDGILGLGW 251 (450)
T ss_pred ECCEEEccEEEEEEeccccccceecccccceecccC
Confidence 7776554333322 1124699999976
No 60
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two
Probab=85.90 E-value=2.4 Score=38.88 Aligned_cols=26 Identities=8% Similarity=0.197 Sum_probs=23.8
Q ss_pred EEecchHHHhcCCeEEEeeCCEEEEee
Q 045527 222 IIMGIHWLKTLGATHINWKTHSMKFNT 248 (258)
Q Consensus 222 vILG~dwL~~~~~i~ID~~~~~v~f~~ 248 (258)
.|||..||+.+. +..|..+++|-|..
T Consensus 319 ~ILG~~flr~~y-vvfD~~~~rIGfa~ 344 (364)
T cd05473 319 TVIGAVIMEGFY-VVFDRANKRVGFAV 344 (364)
T ss_pred eEEeeeeEcceE-EEEECCCCEEeeEe
Confidence 699999999999 59999999999875
No 61
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=85.41 E-value=0.39 Score=38.44 Aligned_cols=17 Identities=24% Similarity=0.786 Sum_probs=15.5
Q ss_pred cceecCCCccCccccCC
Q 045527 66 LCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 66 lCf~Cg~~gh~~~~Cp~ 82 (258)
+||+|++.||++++||.
T Consensus 2 ~C~~C~~~GH~~~~c~~ 18 (148)
T PTZ00368 2 VCYRCGGVGHQSRECPN 18 (148)
T ss_pred cCCCCCCCCcCcccCcC
Confidence 69999999999999996
No 62
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides which
Probab=83.51 E-value=3.1 Score=36.21 Aligned_cols=74 Identities=18% Similarity=0.241 Sum_probs=43.5
Q ss_pred EEEEEC--CEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeece--eE-Eeeccc
Q 045527 140 LASEIN--NKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGV--EI-VEDFLP 214 (258)
Q Consensus 140 i~~~I~--g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~--~~-~~~f~V 214 (258)
+.+.|+ .+.+.+++||||+..++.. ..+.+..++|.... +....=.+.+++. .. ...|-+
T Consensus 4 ~~i~iGtP~q~~~v~~DTGSs~~wv~~-------------~~~~~~Y~dg~~~~--G~~~~D~v~~g~~~~~~~~~~Fg~ 68 (265)
T cd05476 4 VTLSIGTPPQPFSLIVDTGSDLTWTQC-------------CSYEYSYGDGSSTS--GVLATETFTFGDSSVSVPNVAFGC 68 (265)
T ss_pred EEEecCCCCcceEEEecCCCCCEEEcC-------------CceEeEeCCCceee--eeEEEEEEEecCCCCccCCEEEEe
Confidence 445555 4789999999999998853 23455555555433 2322335556655 22 122333
Q ss_pred cC------CCCccEEecchH
Q 045527 215 LD------LGITDIIMGIHW 228 (258)
Q Consensus 215 l~------~~~~dvILG~dw 228 (258)
.. ....|.|||+.+
T Consensus 69 ~~~~~~~~~~~~~GIlGLg~ 88 (265)
T cd05476 69 GTDNEGGSFGGADGILGLGR 88 (265)
T ss_pred cccccCCccCCCCEEEECCC
Confidence 32 235799999875
No 63
>KOG4400 consensus E3 ubiquitin ligase interacting with arginine methyltransferase [Posttranslational modification, protein turnover, chaperones]
Probab=82.43 E-value=0.57 Score=41.23 Aligned_cols=18 Identities=28% Similarity=0.719 Sum_probs=16.5
Q ss_pred CcceecCCCccCccccCC
Q 045527 65 GLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 65 ~lCf~Cg~~gh~~~~Cp~ 82 (258)
..||+||+.||+.++||.
T Consensus 144 ~~Cy~Cg~~GH~s~~C~~ 161 (261)
T KOG4400|consen 144 AKCYSCGEQGHISDDCPE 161 (261)
T ss_pred CccCCCCcCCcchhhCCC
Confidence 569999999999999994
No 64
>cd05489 xylanase_inhibitor_I_like TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradation glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability
Probab=79.59 E-value=9.9 Score=35.03 Aligned_cols=25 Identities=12% Similarity=0.274 Sum_probs=22.6
Q ss_pred EEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 222 IIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 222 vILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
.|||.-+|+.+. +..|..+++|-|.
T Consensus 335 ~IlG~~~~~~~~-vvyD~~~~riGfa 359 (362)
T cd05489 335 VVIGGHQMEDNL-LVFDLEKSRLGFS 359 (362)
T ss_pred EEEeeheecceE-EEEECCCCEeecc
Confidence 589999999999 6999999999875
No 65
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=78.85 E-value=1.2 Score=27.69 Aligned_cols=19 Identities=21% Similarity=0.301 Sum_probs=15.9
Q ss_pred CcceecCCCccCc--cccCCc
Q 045527 65 GLCYKCDEKFSPG--HRCRKQ 83 (258)
Q Consensus 65 ~lCf~Cg~~gh~~--~~Cp~k 83 (258)
..|-+||.-||.. ..||-+
T Consensus 2 ~kC~~CG~~GH~~t~k~CP~~ 22 (40)
T PF15288_consen 2 VKCKNCGAFGHMRTNKRCPMY 22 (40)
T ss_pred ccccccccccccccCccCCCC
Confidence 4599999999994 889954
No 66
>PF03539 Spuma_A9PTase: Spumavirus aspartic protease (A9); InterPro: IPR001641 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A9 (spumapepsin family, clan AA). Foamy viruses are single-stranded enveloped retroviruses that have been noted to infect monkeys, cats and humans. In the human virus, the aspartic protease is encoded by the retroviral gag gene, and in monkeys by the pol gene. At present, the virus has not been proven to cause any particular disease. However, studies have shown that Human foamy virus causes neurological disorders in infected mice. It is not clear whether the Foamy virus/spumavirus proteases share a common evolutionary origin with other aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 2JYS_A.
Probab=78.46 E-value=4 Score=32.80 Aligned_cols=80 Identities=18% Similarity=0.174 Sum_probs=43.7
Q ss_pred ECCEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeeceeEEeeccccCCCCccEE
Q 045527 144 INNKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGVEIVEDFLPLDLGITDII 223 (258)
Q Consensus 144 I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~~~~~~f~Vl~~~~~dvI 223 (258)
|.|..+.+.-||||+.++|-..|...- .+.....+....|....-. -=+.+.++|+...+.+.-.+ +|.+
T Consensus 1 ikg~~l~~~wDsga~ITCiP~~fl~~E----~Pi~~~~i~Tihg~~~~~v---YYl~fKi~grkv~aEVi~s~---~dy~ 70 (163)
T PF03539_consen 1 IKGTKLKGHWDSGAQITCIPESFLEEE----QPIGKTLIKTIHGEKEQDV---YYLTFKINGRKVEAEVIASP---YDYI 70 (163)
T ss_dssp ETTEEEEEEE-TT-SSEEEEGGGTTT-------SEEEEEE-SS-EEEEEE---EEEEEEESS-EEEEEEEEES---SSSE
T ss_pred CCCceeeEEecCCCeEEEccHHHhCcc----ccccceEEEEecCceeccE---EEEEEEEcCeEEEEEEecCc---cceE
Confidence 467889999999999999998885531 1122334555666543321 23578888887665544333 3322
Q ss_pred e----cchHHHhcC
Q 045527 224 M----GIHWLKTLG 233 (258)
Q Consensus 224 L----G~dwL~~~~ 233 (258)
| -.+|+....
T Consensus 71 li~p~diPw~~~~p 84 (163)
T PF03539_consen 71 LISPSDIPWYKKKP 84 (163)
T ss_dssp EE-TTT-HHHHS--
T ss_pred EEcccccccccCCC
Confidence 2 358988754
No 67
>cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site
Probab=75.80 E-value=2.7 Score=31.06 Aligned_cols=26 Identities=23% Similarity=0.327 Sum_probs=20.1
Q ss_pred EEEECC--EEEEEEEcCCCCccccCHHH
Q 045527 141 ASEINN--KKVVVLTDSGASHNFISNEV 166 (258)
Q Consensus 141 ~~~I~g--~~v~aLIDSGAt~sfIs~~~ 166 (258)
.+.|+. +++.+++||||+..++...-
T Consensus 2 ~i~vGtP~q~~~~~~DTGSs~~Wv~~~~ 29 (109)
T cd05470 2 EIGIGTPPQTFNVLLDTGSSNLWVPSVD 29 (109)
T ss_pred EEEeCCCCceEEEEEeCCCCCEEEeCCC
Confidence 345554 78999999999988887653
No 68
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate r
Probab=75.75 E-value=13 Score=33.47 Aligned_cols=27 Identities=19% Similarity=0.245 Sum_probs=21.5
Q ss_pred EEEEEEEC--CEEEEEEEcCCCCccccCH
Q 045527 138 LKLASEIN--NKKVVVLTDSGASHNFISN 164 (258)
Q Consensus 138 i~i~~~I~--g~~v~aLIDSGAt~sfIs~ 164 (258)
..+.+.|+ .+++.++|||||+..+|..
T Consensus 9 y~~~i~iGtP~q~~~v~~DTGSs~~Wv~~ 37 (326)
T cd05487 9 YYGEIGIGTPPQTFKVVFDTGSSNLWVPS 37 (326)
T ss_pred EEEEEEECCCCcEEEEEEeCCccceEEcc
Confidence 34556666 6889999999999999954
No 69
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in the gastric mucosa of mammals. Gastricsin is synthesized by the chief cells in the stomach as an inactive zymogen and self-converts to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40-residue prosequence. Self-conversion to the mature enzyme is triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. The mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=72.80 E-value=12 Score=33.46 Aligned_cols=26 Identities=23% Similarity=0.367 Sum_probs=20.5
Q ss_pred EEEEEECC--EEEEEEEcCCCCccccCH
Q 045527 139 KLASEINN--KKVVVLTDSGASHNFISN 164 (258)
Q Consensus 139 ~i~~~I~g--~~v~aLIDSGAt~sfIs~ 164 (258)
...+.|+. +++.++|||||+..++..
T Consensus 5 ~~~i~iGtP~q~~~v~~DTGS~~~wv~~ 32 (318)
T cd05477 5 YGEISIGTPPQNFLVLFDTGSSNLWVPS 32 (318)
T ss_pred EEEEEECCCCcEEEEEEeCCCccEEEcc
Confidence 34556664 789999999999998864
No 70
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with a reduced amount of CND41 maintained green leaves and constant protein levels, especially of Rubisco. CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, Asp-Thr-Gly-Ser in the N-terminal region and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This fami
Probab=71.91 E-value=7.6 Score=34.35 Aligned_cols=77 Identities=18% Similarity=0.168 Sum_probs=43.4
Q ss_pred EEEEEC--CEEEEEEEcCCCCccccCHHHHHHcCCCcccCCCceEeecccccccccceEeeeeEeeece-eEE-eecccc
Q 045527 140 LASEIN--NKKVVVLTDSGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKAQGICRGVGLILQGV-EIV-EDFLPL 215 (258)
Q Consensus 140 i~~~I~--g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~~~~~~~v~i~i~g~-~~~-~~f~Vl 215 (258)
+.+.|+ .+++.+++||||+..+|... .+ ..+.+..++|.... |....=.+.+++. ... ..|-+.
T Consensus 4 ~~i~iGtP~q~~~v~~DTGSs~~Wv~c~-----~c-----~~~~i~Yg~Gs~~~--G~~~~D~v~ig~~~~~~~~~Fg~~ 71 (299)
T cd05472 4 VTVGLGTPARDQTVIVDTGSDLTWVQCQ-----PC-----CLYQVSYGDGSYTT--GDLATDTLTLGSSDVVPGFAFGCG 71 (299)
T ss_pred EEEecCCCCcceEEEecCCCCcccccCC-----CC-----CeeeeEeCCCceEE--EEEEEEEEEeCCCCccCCEEEECC
Confidence 344555 47899999999999988321 11 34666777776433 2222224455554 221 223322
Q ss_pred CC-----CCccEEecchH
Q 045527 216 DL-----GITDIIMGIHW 228 (258)
Q Consensus 216 ~~-----~~~dvILG~dw 228 (258)
.. ...|.|||+-+
T Consensus 72 ~~~~~~~~~~~GilGLg~ 89 (299)
T cd05472 72 HDNEGLFGGAAGLLGLGR 89 (299)
T ss_pred ccCCCccCCCCEEEECCC
Confidence 21 15789999864
No 71
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family; it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=70.94 E-value=11 Score=33.77 Aligned_cols=24 Identities=21% Similarity=0.423 Sum_probs=18.6
Q ss_pred EEEEC--CEEEEEEEcCCCCccccCH
Q 045527 141 ASEIN--NKKVVVLTDSGASHNFISN 164 (258)
Q Consensus 141 ~~~I~--g~~v~aLIDSGAt~sfIs~ 164 (258)
.+.|+ .+++.++|||||+..+|-.
T Consensus 4 ~i~iGtP~Q~~~v~~DTGSs~~Wv~s 29 (316)
T cd05486 4 QISIGTPPQNFTVIFDTGSSNLWVPS 29 (316)
T ss_pred EEEECCCCcEEEEEEcCCCccEEEec
Confidence 44555 4789999999999888853
No 72
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, where it is targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme. Proteinase A preferentially hydro
Probab=69.74 E-value=13 Score=33.22 Aligned_cols=27 Identities=19% Similarity=0.320 Sum_probs=21.5
Q ss_pred EEEEEEEC--CEEEEEEEcCCCCccccCH
Q 045527 138 LKLASEIN--NKKVVVLTDSGASHNFISN 164 (258)
Q Consensus 138 i~i~~~I~--g~~v~aLIDSGAt~sfIs~ 164 (258)
..+.+.|+ .+++.++|||||+..+|..
T Consensus 11 Y~~~i~iGtp~q~~~v~~DTGSs~~wv~~ 39 (320)
T cd05488 11 YFTDITLGTPPQKFKVILDTGSSNLWVPS 39 (320)
T ss_pred EEEEEEECCCCcEEEEEEecCCcceEEEc
Confidence 45566776 4889999999999998854
No 73
>PF14541 TAXi_C: Xylanase inhibitor C-terminal; PDB: 3AUP_D 3HD8_A 1T6G_A 1T6E_X 2B42_A 3VLB_A 3VLA_A.
Probab=64.91 E-value=25 Score=28.11 Aligned_cols=28 Identities=7% Similarity=0.058 Sum_probs=24.0
Q ss_pred CccEEecchHHHhcCCeEEEeeCCEEEEe
Q 045527 219 ITDIIMGIHWLKTLGATHINWKTHSMKFN 247 (258)
Q Consensus 219 ~~dvILG~dwL~~~~~i~ID~~~~~v~f~ 247 (258)
..-.|||...+..+. |..|-.+++|.|.
T Consensus 133 ~~~~viG~~~~~~~~-v~fDl~~~~igF~ 160 (161)
T PF14541_consen 133 DGVSVIGNFQQQNYH-VVFDLENGRIGFA 160 (161)
T ss_dssp SSSEEE-HHHCCTEE-EEEETTTTEEEEE
T ss_pred CCcEEECHHHhcCcE-EEEECCCCEEEEe
Confidence 456899999999999 7999999999986
No 74
>PTZ00165 aspartyl protease; Provisional
Probab=63.95 E-value=30 Score=33.38 Aligned_cols=31 Identities=23% Similarity=0.309 Sum_probs=24.3
Q ss_pred CeEEEEEEECC--EEEEEEEcCCCCccccCHHH
Q 045527 136 KTLKLASEINN--KKVVVLTDSGASHNFISNEV 166 (258)
Q Consensus 136 ~~i~i~~~I~g--~~v~aLIDSGAt~sfIs~~~ 166 (258)
......+.|+. +++.+++||||+..+|-..-
T Consensus 119 ~~Y~~~I~IGTPpQ~f~Vv~DTGSS~lWVps~~ 151 (482)
T PTZ00165 119 SQYFGEIQVGTPPKSFVVVFDTGSSNLWIPSKE 151 (482)
T ss_pred CeEEEEEEeCCCCceEEEEEeCCCCCEEEEchh
Confidence 34566778876 89999999999998886543
No 75
>KOG0109 consensus RNA-binding protein LARK, contains RRM and retroviral-type Zn-finger domains [RNA processing and modification; General function prediction only]
Probab=60.09 E-value=4.4 Score=36.23 Aligned_cols=18 Identities=22% Similarity=0.600 Sum_probs=16.2
Q ss_pred cceecCCCccCccccCCc
Q 045527 66 LCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 66 lCf~Cg~~gh~~~~Cp~k 83 (258)
-||.||+.||-..+||..
T Consensus 162 ~cyrcGkeghwskEcP~~ 179 (346)
T KOG0109|consen 162 GCYRCGKEGHWSKECPVD 179 (346)
T ss_pred HheeccccccccccCCcc
Confidence 499999999999999953
No 76
>KOG0107 consensus Alternative splicing factor SRp20/9G8 (RRM superfamily) [RNA processing and modification]
Probab=57.44 E-value=5.9 Score=32.90 Aligned_cols=20 Identities=35% Similarity=0.994 Sum_probs=17.4
Q ss_pred cCCcceecCCCccCccccCC
Q 045527 63 ECGLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 63 ~~~lCf~Cg~~gh~~~~Cp~ 82 (258)
..+.|++||+.||....|.+
T Consensus 99 g~~~~~r~G~rg~~~r~~~~ 118 (195)
T KOG0107|consen 99 GRGFCYRCGERGHIGRNCKD 118 (195)
T ss_pred cccccccCCCcccccccccc
Confidence 34669999999999999986
No 77
>smart00647 IBR In Between Ring fingers. The domain occurs between pairs of RING fingers
Probab=57.01 E-value=5.5 Score=26.41 Aligned_cols=17 Identities=18% Similarity=0.821 Sum_probs=13.8
Q ss_pred CCcceecCCCccCcccc
Q 045527 64 CGLCYKCDEKFSPGHRC 80 (258)
Q Consensus 64 ~~lCf~Cg~~gh~~~~C 80 (258)
...||+|++.||..-.|
T Consensus 48 ~~fC~~C~~~~H~~~~C 64 (64)
T smart00647 48 FSFCFRCKVPWHSPVSC 64 (64)
T ss_pred CeECCCCCCcCCCCCCC
Confidence 44599999999987665
No 78
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=56.07 E-value=13 Score=32.58 Aligned_cols=26 Identities=15% Similarity=0.262 Sum_probs=20.6
Q ss_pred EEEEECC--EEEEEEEcCCCCccccCHH
Q 045527 140 LASEINN--KKVVVLTDSGASHNFISNE 165 (258)
Q Consensus 140 i~~~I~g--~~v~aLIDSGAt~sfIs~~ 165 (258)
+.+.|+. +++.+++||||+..+|-..
T Consensus 3 ~~i~vGtP~Q~~~v~~DTGS~~~wv~~~ 30 (278)
T cd06097 3 TPVKIGTPPQTLNLDLDTGSSDLWVFSS 30 (278)
T ss_pred eeEEECCCCcEEEEEEeCCCCceeEeeC
Confidence 4556666 8899999999999988643
No 79
>PF01485 IBR: IBR domain; InterPro: IPR002867 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is: C-x(4)-C-x(14-30)-C-x(1-4)-C-x(4)-C-x(2)-C-x(4)-H-x(4)-C. The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers.
Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures (InterPro: IPR001841). The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding; PDB: 2CT7_A 1WD2_A 2JMO_A 1WIM_A.
Probab=55.44 E-value=4.2 Score=26.97 Aligned_cols=17 Identities=35% Similarity=1.056 Sum_probs=13.3
Q ss_pred CCcceecCCCccCcccc
Q 045527 64 CGLCYKCDEKFSPGHRC 80 (258)
Q Consensus 64 ~~lCf~Cg~~gh~~~~C 80 (258)
...||.|++.||....|
T Consensus 48 ~~fC~~C~~~~H~~~~C 64 (64)
T PF01485_consen 48 TEFCFKCGEPWHEGVTC 64 (64)
T ss_dssp SEECSSSTSESCTTS-H
T ss_pred CcCccccCcccCCCCCC
Confidence 45699999999987766
No 80
>PF12353 eIF3g: Eukaryotic translation initiation factor 3 subunit G; InterPro: IPR024675 At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 and 700 kDa in mammalian cells. Subunits are designated eIF-3a - eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. Subunit G is required for eIF3 integrity. This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain PF00076 from PFAM.
Probab=52.74 E-value=7 Score=30.68 Aligned_cols=19 Identities=16% Similarity=0.212 Sum_probs=15.8
Q ss_pred CCcceecCCCccCccccCCc
Q 045527 64 CGLCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 64 ~~lCf~Cg~~gh~~~~Cp~k 83 (258)
...|..|+ ..|+..+||-+
T Consensus 106 ~v~CR~Ck-GdH~T~~CPyK 124 (128)
T PF12353_consen 106 KVKCRICK-GDHWTSKCPYK 124 (128)
T ss_pred eEEeCCCC-CCcccccCCcc
Confidence 45599997 77999999955
No 81
>KOG4400 consensus E3 ubiquitin ligase interacting with arginine methyltransferase [Posttranslational modification, protein turnover, chaperones]
Probab=49.60 E-value=7.1 Score=34.25 Aligned_cols=21 Identities=19% Similarity=0.504 Sum_probs=17.9
Q ss_pred cCCcceecCCCccCccccCCc
Q 045527 63 ECGLCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 63 ~~~lCf~Cg~~gh~~~~Cp~k 83 (258)
....||+|++.||..++||.+
T Consensus 91 ~~~~c~~C~~~gH~~~~c~~~ 111 (261)
T KOG4400|consen 91 IAAACFNCGEGGHIERDCPEA 111 (261)
T ss_pred cchhhhhCCCCccchhhCCcc
Confidence 355699999999999999954
No 82
>KOG2673 consensus Uncharacterized conserved protein, contains PSP domain [Function unknown]
Probab=49.23 E-value=8.4 Score=36.47 Aligned_cols=22 Identities=18% Similarity=0.681 Sum_probs=17.6
Q ss_pred ccCCcceecCCCccCccccCCc
Q 045527 62 QECGLCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 62 R~~~lCf~Cg~~gh~~~~Cp~k 83 (258)
+..--||+|++.-|-.++||.+
T Consensus 126 ~~~~~CFNC~g~~hsLrdC~rp 147 (485)
T KOG2673|consen 126 NKCDPCFNCGGTPHSLRDCPRP 147 (485)
T ss_pred ccCccccccCCCCCccccCCCc
Confidence 3333499999999999999965
No 83
>KOG0341 consensus DEAD-box protein abstrakt [RNA processing and modification]
Probab=47.29 E-value=8.4 Score=36.13 Aligned_cols=18 Identities=22% Similarity=0.239 Sum_probs=16.3
Q ss_pred CcceecCCCccCccccCC
Q 045527 65 GLCYKCDEKFSPGHRCRK 82 (258)
Q Consensus 65 ~lCf~Cg~~gh~~~~Cp~ 82 (258)
.-|-|||+-||+-.+||+
T Consensus 571 kGCayCgGLGHRItdCPK 588 (610)
T KOG0341|consen 571 KGCAYCGGLGHRITDCPK 588 (610)
T ss_pred cccccccCCCcccccCch
Confidence 349999999999999995
No 84
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases. They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=45.36 E-value=28 Score=31.10 Aligned_cols=27 Identities=22% Similarity=0.301 Sum_probs=21.7
Q ss_pred EEEEEEEC--CEEEEEEEcCCCCccccCH
Q 045527 138 LKLASEIN--NKKVVVLTDSGASHNFISN 164 (258)
Q Consensus 138 i~i~~~I~--g~~v~aLIDSGAt~sfIs~ 164 (258)
..+.+.|+ .+++.++|||||+..+|..
T Consensus 11 Y~~~i~iGtP~Q~~~v~~DTGSs~lWv~~ 39 (317)
T cd06098 11 YFGEIGIGTPPQKFTVIFDTGSSNLWVPS 39 (317)
T ss_pred EEEEEEECCCCeEEEEEECCCccceEEec
Confidence 45566776 5889999999999888864
No 85
>KOG0119 consensus Splicing factor 1/branch point binding protein (RRM superfamily) [RNA processing and modification]
Probab=45.34 E-value=10 Score=36.33 Aligned_cols=19 Identities=16% Similarity=0.380 Sum_probs=16.7
Q ss_pred CcceecCCCccCccccCCc
Q 045527 65 GLCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 65 ~lCf~Cg~~gh~~~~Cp~k 83 (258)
++|++|+.-||++.+|+.+
T Consensus 286 n~c~~cg~~gH~~~dc~~~ 304 (554)
T KOG0119|consen 286 NVCKICGPLGHISIDCKVN 304 (554)
T ss_pred ccccccCCcccccccCCCc
Confidence 4899999999999999854
No 86
>PF05515 Viral_NABP: Viral nucleic acid binding; InterPro: IPR008891 Members of this family are common to ssRNA positive-strand viruses and are commonly described as nucleic acid binding proteins (NABP).
Probab=42.72 E-value=13 Score=29.00 Aligned_cols=19 Identities=26% Similarity=0.775 Sum_probs=16.2
Q ss_pred cCCcceecCCCccCccccC
Q 045527 63 ECGLCYKCDEKFSPGHRCR 81 (258)
Q Consensus 63 ~~~lCf~Cg~~gh~~~~Cp 81 (258)
+-+.||.||.--|..+.|+
T Consensus 61 R~~~C~~CG~~l~~~~~C~ 79 (124)
T PF05515_consen 61 RYNRCFKCGRYLHNNGNCR 79 (124)
T ss_pred HhCccccccceeecCCcCC
Confidence 4577999999778899998
No 87
>PF13821 DUF4187: Domain of unknown function (DUF4187)
Probab=41.92 E-value=12 Score=24.79 Aligned_cols=23 Identities=22% Similarity=0.651 Sum_probs=17.8
Q ss_pred hcccCCcceecCCCccCc----cccCC
Q 045527 60 SNQECGLCYKCDEKFSPG----HRCRK 82 (258)
Q Consensus 60 ~rR~~~lCf~Cg~~gh~~----~~Cp~ 82 (258)
-|+.-.-||+||-++.-. ..||.
T Consensus 23 LR~~~~YC~~Cg~~Y~d~~dL~~~CPG 49 (55)
T PF13821_consen 23 LREEHNYCFWCGTKYDDEEDLERNCPG 49 (55)
T ss_pred HHhhCceeeeeCCccCCHHHHHhCCCC
Confidence 356677799999998764 78984
No 88
>cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, Asp-Thr-Gly-Ser in the N-terminal region and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more d
Probab=41.55 E-value=32 Score=30.01 Aligned_cols=25 Identities=16% Similarity=0.286 Sum_probs=19.6
Q ss_pred EEEEEEC--CEEEEEEEcCCCCccccC
Q 045527 139 KLASEIN--NKKVVVLTDSGASHNFIS 163 (258)
Q Consensus 139 ~i~~~I~--g~~v~aLIDSGAt~sfIs 163 (258)
.+.+.|+ .+.+.+++||||+...|.
T Consensus 4 ~~~i~iGtP~q~~~v~~DTGS~~~Wv~ 30 (273)
T cd05475 4 YVTINIGNPPKPYFLDIDTGSDLTWLQ 30 (273)
T ss_pred EEEEEcCCCCeeEEEEEccCCCceEEe
Confidence 3445555 578899999999999984
No 89
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets and flank
Probab=40.68 E-value=34 Score=30.54 Aligned_cols=26 Identities=19% Similarity=0.282 Sum_probs=20.4
Q ss_pred EEEEEEECC--EEEEEEEcCCCCccccC
Q 045527 138 LKLASEINN--KKVVVLTDSGASHNFIS 163 (258)
Q Consensus 138 i~i~~~I~g--~~v~aLIDSGAt~sfIs 163 (258)
..+.+.|+. +++.+++||||+..+|-
T Consensus 7 Y~~~i~iGtP~q~~~v~~DTGSs~~Wv~ 34 (325)
T cd05490 7 YYGEIGIGTPPQTFTVVFDTGSSNLWVP 34 (325)
T ss_pred EEEEEEECCCCcEEEEEEeCCCccEEEE
Confidence 345666663 78999999999999884
No 90
>COG0282 ackA Acetate kinase [Energy production and conversion]
Probab=37.67 E-value=19 Score=33.60 Aligned_cols=38 Identities=24% Similarity=0.404 Sum_probs=33.6
Q ss_pred CCCCccccCHHHHHHcCCCcccCCCceEeecccccccc
Q 045527 155 SGASHNFISNEVVLVLKLPITNTEPYGVILRTGSATKA 192 (258)
Q Consensus 155 SGAt~sfIs~~~a~~l~l~~~~~~~~~V~~a~G~~~~~ 192 (258)
-|-||.|++..+++.++.+.+..+-+...++||.++..
T Consensus 178 HGtSh~YVs~~aa~~L~k~~~~l~~I~~HLGNGASicA 215 (396)
T COG0282 178 HGTSHKYVSQRAAEILGKPLEDLNLITCHLGNGASICA 215 (396)
T ss_pred CccchHHHHHHHHHHhCCCccccCEEEEEecCchhhhh
Confidence 47899999999999999998888878888899987765
No 91
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequences of the zymogens are self-cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well-researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures; however, their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which
Probab=37.15 E-value=44 Score=29.75 Aligned_cols=27 Identities=19% Similarity=0.303 Sum_probs=21.2
Q ss_pred EEEEEEEC--CEEEEEEEcCCCCccccCH
Q 045527 138 LKLASEIN--NKKVVVLTDSGASHNFISN 164 (258)
Q Consensus 138 i~i~~~I~--g~~v~aLIDSGAt~sfIs~ 164 (258)
..+.+.|+ .+++.++|||||+..+|..
T Consensus 11 Y~~~i~vGtp~q~~~v~~DTGS~~~wv~~ 39 (317)
T cd05478 11 YYGTISIGTPPQDFTVIFDTGSSNLWVPS 39 (317)
T ss_pred EEEEEEeCCCCcEEEEEEeCCCccEEEec
Confidence 34556665 5789999999999999864
No 92
>PF14543 TAXi_N: Xylanase inhibitor N-terminal; PDB: 3HD8_A 3VLB_A 3VLA_A 3AUP_D 1T6G_A 1T6E_X 2B42_A.
Probab=36.56 E-value=46 Score=26.80 Aligned_cols=24 Identities=13% Similarity=0.425 Sum_probs=16.9
Q ss_pred EEEEECC--EEEEEEEcCCCCccccC
Q 045527 140 LASEINN--KKVVVLTDSGASHNFIS 163 (258)
Q Consensus 140 i~~~I~g--~~v~aLIDSGAt~sfIs 163 (258)
+.+.|+. +++.++||||+..+++.
T Consensus 3 ~~~~iGtP~~~~~lvvDtgs~l~W~~ 28 (164)
T PF14543_consen 3 VSVSIGTPPQPFSLVVDTGSDLTWVQ 28 (164)
T ss_dssp EEEECTCTTEEEEEEEETT-SSEEEE
T ss_pred EEEEeCCCCceEEEEEECCCCceEEc
Confidence 3444443 68899999999998873
No 93
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two
Probab=36.49 E-value=40 Score=30.74 Aligned_cols=26 Identities=23% Similarity=0.350 Sum_probs=20.2
Q ss_pred EEEEEEC--CEEEEEEEcCCCCccccCH
Q 045527 139 KLASEIN--NKKVVVLTDSGASHNFISN 164 (258)
Q Consensus 139 ~i~~~I~--g~~v~aLIDSGAt~sfIs~ 164 (258)
.+.+.|+ .+++.++|||||+...|..
T Consensus 5 ~~~i~iGtP~Q~~~v~~DTGSs~lWv~~ 32 (364)
T cd05473 5 YIEMLIGTPPQKLNILVDTGSSNFAVAA 32 (364)
T ss_pred EEEEEecCCCceEEEEEecCCcceEEEc
Confidence 3455665 5789999999999988754
No 94
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act, respectively, as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=36.47 E-value=43 Score=30.05 Aligned_cols=27 Identities=19% Similarity=0.155 Sum_probs=20.6
Q ss_pred EEEEEEC--CEEEEEEEcCCCCccccCHH
Q 045527 139 KLASEIN--NKKVVVLTDSGASHNFISNE 165 (258)
Q Consensus 139 ~i~~~I~--g~~v~aLIDSGAt~sfIs~~ 165 (258)
.+.+.|+ .+++.++|||||+..+|...
T Consensus 5 ~~~i~vGtP~Q~~~v~~DTGS~~~wv~~~ 33 (326)
T cd06096 5 FIDIFIGNPPQKQSLILDTGSSSLSFPCS 33 (326)
T ss_pred EEEEEecCCCeEEEEEEeCCCCceEEecC
Confidence 3455665 47899999999999887653
No 95
>KOG1339 consensus Aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=35.82 E-value=93 Score=28.88 Aligned_cols=27 Identities=11% Similarity=0.103 Sum_probs=23.2
Q ss_pred cEEecchHHHhcCCeEEEee-CCEEEEee
Q 045527 221 DIIMGIHWLKTLGATHINWK-THSMKFNT 248 (258)
Q Consensus 221 dvILG~dwL~~~~~i~ID~~-~~~v~f~~ 248 (258)
..|||--+++.+. +..|.. +.++-|..
T Consensus 364 ~~ilG~~~~~~~~-~~~D~~~~~riGfa~ 391 (398)
T KOG1339|consen 364 LWILGDVFQQNYL-VVFDLGENSRVGFAP 391 (398)
T ss_pred eEEEchHHhCCEE-EEEeCCCCCEEEecc
Confidence 6899999999999 599998 88887764
No 96
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets an
Probab=35.69 E-value=46 Score=29.95 Aligned_cols=29 Identities=17% Similarity=0.251 Sum_probs=22.8
Q ss_pred eEEEEEEEC--CEEEEEEEcCCCCccccCHH
Q 045527 137 TLKLASEIN--NKKVVVLTDSGASHNFISNE 165 (258)
Q Consensus 137 ~i~i~~~I~--g~~v~aLIDSGAt~sfIs~~ 165 (258)
...+.+.|+ .+++.++|||||+..++...
T Consensus 11 ~Y~~~i~vGtP~q~~~v~~DTGSs~~Wv~~~ 41 (329)
T cd05485 11 QYYGVITIGTPPQSFKVVFDTGSSNLWVPSK 41 (329)
T ss_pred eEEEEEEECCCCcEEEEEEcCCCccEEEecC
Confidence 345667777 47899999999999888653
No 97
>KOG2044 consensus 5'-3' exonuclease HKE1/RAT1 [Replication, recombination and repair; RNA processing and modification]
Probab=34.17 E-value=17 Score=36.86 Aligned_cols=19 Identities=16% Similarity=0.560 Sum_probs=16.5
Q ss_pred CcceecCCCccCccccCCc
Q 045527 65 GLCYKCDEKFSPGHRCRKQ 83 (258)
Q Consensus 65 ~lCf~Cg~~gh~~~~Cp~k 83 (258)
..||.||..||.+.+|..+
T Consensus 261 ~~C~~cgq~gh~~~dc~g~ 279 (931)
T KOG2044|consen 261 RRCFLCGQTGHEAKDCEGK 279 (931)
T ss_pred ccchhhcccCCcHhhcCCc
Confidence 3499999999999999853
No 98
>PF03419 Peptidase_U4: Sporulation factor SpoIIGA This family belongs to family U4 of the peptidase classification.; InterPro: IPR005081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. The peptidases families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported. 
This group of peptidases belong to the MEROPS peptidase family U4 (SpoIIGA peptidase family, clan U-). Sporulation in bacteria such as Bacillus subtilis involves the formation of a polar septum, which divides the sporangium into a mother cell and a forespore. The sigma E factor, which is encoded within the spoIIG operon, is a cell-specific regulatory protein that directs gene transcription in the mother cell. Sigma E is synthesised as an inactive proprotein pro-sigma E, which is converted to the mature factor by the putative processing enzyme SpoIIGA []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis, 0030436 asexual sporulation
Probab=33.85 E-value=73 Score=28.35 Aligned_cols=22 Identities=27% Similarity=0.412 Sum_probs=16.4
Q ss_pred eEEEEEEECCE--EEEEEEcCCCC
Q 045527 137 TLKLASEINNK--KVVVLTDSGAS 158 (258)
Q Consensus 137 ~i~i~~~I~g~--~v~aLIDSGAt 158 (258)
...+.+.++|+ .+++|+|||..
T Consensus 157 ~~~v~i~~~~~~~~~~allDTGN~ 180 (293)
T PF03419_consen 157 LYPVTIEIGGKKIELKALLDTGNQ 180 (293)
T ss_pred EEEEEEEECCEEEEEEEEEECCCc
Confidence 45667777876 56899999954
No 99
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=32.83 E-value=48 Score=28.38 Aligned_cols=27 Identities=22% Similarity=0.347 Sum_probs=19.9
Q ss_pred EEEECC--EEEEEEEcCCCCccccCHHHH
Q 045527 141 ASEINN--KKVVVLTDSGASHNFISNEVV 167 (258)
Q Consensus 141 ~~~I~g--~~v~aLIDSGAt~sfIs~~~a 167 (258)
.+.|+. +++.++|||||+..+|...-.
T Consensus 4 ~i~iGtp~q~~~l~~DTGS~~~wv~~~~c 32 (283)
T cd05471 4 EITIGTPPQKFSVIFDTGSSLLWVPSSNC 32 (283)
T ss_pred EEEECCCCcEEEEEEeCCCCCEEEecCCC
Confidence 344443 589999999999988866543
No 100
>TIGR02854 spore_II_GA sigma-E processing peptidase SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria.
Probab=32.21 E-value=81 Score=28.17 Aligned_cols=23 Identities=17% Similarity=0.313 Sum_probs=16.8
Q ss_pred CeEEEEEEECCE--EEEEEEcCCCC
Q 045527 136 KTLKLASEINNK--KVVVLTDSGAS 158 (258)
Q Consensus 136 ~~i~i~~~I~g~--~v~aLIDSGAt 158 (258)
....+.+.++|+ .+++|+|||..
T Consensus 157 ~~~~v~i~~~g~~~~~~alvDTGN~ 181 (288)
T TIGR02854 157 QIYELEICLDGKKVTIKGFLDTGNQ 181 (288)
T ss_pred eEEEEEEEECCEEEEEEEEEecCCc
Confidence 345667777776 57899999954
No 101
>PLN03146 aspartyl protease family protein; Provisional
Probab=25.18 E-value=79 Score=29.84 Aligned_cols=27 Identities=11% Similarity=0.263 Sum_probs=21.0
Q ss_pred eEEEEEEEC--CEEEEEEEcCCCCccccC
Q 045527 137 TLKLASEIN--NKKVVVLTDSGASHNFIS 163 (258)
Q Consensus 137 ~i~i~~~I~--g~~v~aLIDSGAt~sfIs 163 (258)
...+.+.|+ .+++.+++||||+..+|-
T Consensus 84 ~Y~v~i~iGTPpq~~~vi~DTGS~l~Wv~ 112 (431)
T PLN03146 84 EYLMNISIGTPPVPILAIADTGSDLIWTQ 112 (431)
T ss_pred cEEEEEEcCCCCceEEEEECCCCCcceEc
Confidence 345666666 468899999999999984
No 102
>PF13717 zinc_ribbon_4: zinc-ribbon domain
Probab=22.74 E-value=35 Score=20.38 Aligned_cols=25 Identities=16% Similarity=0.379 Sum_probs=17.4
Q ss_pred ccCCHHHHhhcccCCcceecCCCcc
Q 045527 51 RKLTEAELQSNQECGLCYKCDEKFS 75 (258)
Q Consensus 51 ~rlt~~e~~~rR~~~lCf~Cg~~gh 75 (258)
..+.++.+.....+..|-+|+..|+
T Consensus 12 y~i~d~~ip~~g~~v~C~~C~~~f~ 36 (36)
T PF13717_consen 12 YEIDDEKIPPKGRKVRCSKCGHVFF 36 (36)
T ss_pred EeCCHHHCCCCCcEEECCCCCCEeC
Confidence 4455666666667778999988764
No 103
>COG2383 Uncharacterized conserved protein [Function unknown]
Probab=22.43 E-value=26 Score=26.31 Aligned_cols=20 Identities=35% Similarity=0.943 Sum_probs=17.2
Q ss_pred EEecchHHHhcCCeEEEeeC
Q 045527 222 IIMGIHWLKTLGATHINWKT 241 (258)
Q Consensus 222 vILG~dwL~~~~~i~ID~~~ 241 (258)
.||+.-||.+++-|.|||..
T Consensus 50 yilsl~~La~~GVItin~~a 69 (109)
T COG2383 50 YILSLFWLAQYGVITINWEA 69 (109)
T ss_pred HHHHHHHHHHcCeEEEcHHH
Confidence 36889999999999999964
No 104
>PF03991 Prion_octapep: Copper binding octapeptide repeat; InterPro: IPR020949 Prion protein (PrP-c) [, , ] is a small glycoprotein found in high quantity in the brain of animals infected with certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jakob disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP-c is encoded in the host genome and is expressed both in normal and infected cells. During infection, however, the PrP-c molecule becomes altered (conformationally rather than at the amino acid level) to an abnormal isoform, PrP-sc. In detergent-treated brain extracts from infected individuals, fibrils composed of polymers of PrP-sc, namely scrapie-associated fibrils or prion rods, can be evidenced by electron microscopy. The precise function of the normal PrP isoform in healthy individuals remains unknown. Several results, mainly obtained in transgenic animals, indicate that PrP-c might play a role in long-term potentiation, in sleep physiology, in oxidative burst compensation (PrP can fix four Cu2+ through its octarepeat domain), in interactions with the extracellular matrix (PrP-c can bind to the precursor of the laminin receptor, LRP), in apoptosis and in signal transduction (costimulation of PrP-c induces a modulation of Fyn kinase phosphorylation) []. The normal isoform, PrP-c, is anchored at the cell membrane, in rafts, through a glycosyl phosphatidyl inositol (GPI); its half-life at the cell surface is 5 h, after which the protein is internalised through a caveolae-dependent mechanism and degraded in the endolysosome compartment. Conversion between PrP-c and PrP-sc likely occurs during the internalisation process. This repeat is found at the amino terminus of mammalian prion proteins. It has been shown to bind to copper [].
Probab=21.53 E-value=50 Score=13.18 Aligned_cols=6 Identities=50% Similarity=1.160 Sum_probs=3.2
Q ss_pred CCCCCc
Q 045527 12 PHVSGF 17 (258)
Q Consensus 12 ~~~~~~ 17 (258)
||++||
T Consensus 1 phgG~W 6 (8)
T PF03991_consen 1 PHGGGW 6 (8)
T ss_pred CCCCcC
Confidence 455554
No 105
>PF09538 FYDLN_acid: Protein of unknown function (FYDLN_acid); InterPro: IPR012644 Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=21.52 E-value=38 Score=25.78 Aligned_cols=20 Identities=30% Similarity=0.519 Sum_probs=14.4
Q ss_pred cCCcceecCCCccCcc----ccCC
Q 045527 63 ECGLCYKCDEKFSPGH----RCRK 82 (258)
Q Consensus 63 ~~~lCf~Cg~~gh~~~----~Cp~ 82 (258)
.|.+|..||.+|.=.+ .||+
T Consensus 8 tKR~Cp~CG~kFYDLnk~PivCP~ 31 (108)
T PF09538_consen 8 TKRTCPSCGAKFYDLNKDPIVCPK 31 (108)
T ss_pred CcccCCCCcchhccCCCCCccCCC
Confidence 5678999999966433 3774
No 106
>KOG2560 consensus RNA splicing factor - Slu7p [RNA processing and modification]
Probab=21.02 E-value=23 Score=33.64 Aligned_cols=19 Identities=21% Similarity=0.468 Sum_probs=17.5
Q ss_pred cCCcceecCCCccCccccC
Q 045527 63 ECGLCYKCDEKFSPGHRCR 81 (258)
Q Consensus 63 ~~~lCf~Cg~~gh~~~~Cp 81 (258)
++|.|-+||.-+|-..+|=
T Consensus 111 RKGACeNCGAmtHk~KDCm 129 (529)
T KOG2560|consen 111 RKGACENCGAMTHKVKDCM 129 (529)
T ss_pred hhhhhhhhhhhhcchHHHh
Confidence 5789999999999999994
No 107
>PF09706 Cas_CXXC_CXXC: CRISPR-associated protein (Cas_CXXC_CXXC); InterPro: IPR019121 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes []. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements [, ]. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci []. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. This entry represents a conserved domain of about 65 amino acids found in otherwise highly divergent proteins encoded in CRISPR-associated regions. This domain features two CXXC motifs.
Probab=20.99 E-value=45 Score=23.12 Aligned_cols=12 Identities=25% Similarity=0.509 Sum_probs=9.9
Q ss_pred ccCCcceecCCC
Q 045527 62 QECGLCYKCDEK 73 (258)
Q Consensus 62 R~~~lCf~Cg~~ 73 (258)
..+..|+.||++
T Consensus 3 k~~~~C~~Cg~r 14 (69)
T PF09706_consen 3 KKKYNCIFCGER 14 (69)
T ss_pred CCCCcCcCCCCc
Confidence 467889999986
No 108
>KOG3794 consensus CBF1-interacting corepressor CIR and related proteins [Transcription]
Probab=20.82 E-value=44 Score=31.19 Aligned_cols=19 Identities=21% Similarity=0.328 Sum_probs=16.2
Q ss_pred CCcceecCCCccC--ccccCC
Q 045527 64 CGLCYKCDEKFSP--GHRCRK 82 (258)
Q Consensus 64 ~~lCf~Cg~~gh~--~~~Cp~ 82 (258)
+..|++|+.-||. ...||-
T Consensus 124 NVrC~kChkwGH~n~DreCpl 144 (453)
T KOG3794|consen 124 NVRCLKCHKWGHINTDRECPL 144 (453)
T ss_pred eeeEEeecccccccCCccCcc
Confidence 4569999999998 788993
No 109
>cd01813 UBP_N UBP ubiquitin processing protease. The UBP (ubiquitin processing protease) domain (also referred to as USP which stands for "ubiquitin-specific protease") is present in a large family of cysteine proteases that specifically cleave ubiquitin conjugates. This family includes Rpn11, UBP6 (USP14), USP7 (HAUSP). This domain is closely related to the amino-terminal ubiquitin-like domain of BAG1 (Bcl2-associated athanogene 1) protein and is found only in eukaryotes.
Probab=20.23 E-value=1.5e+02 Score=20.50 Aligned_cols=39 Identities=8% Similarity=-0.066 Sum_probs=33.1
Q ss_pred EEEEEEECCEEEEEEEcCCCCccccCHHHHHHcCCCccc
Q 045527 138 LKLASEINNKKVVVLTDSGASHNFISNEVVLVLKLPITN 176 (258)
Q Consensus 138 i~i~~~I~g~~v~aLIDSGAt~sfIs~~~a~~l~l~~~~ 176 (258)
|.+.++.++..+.+=+|..+|..=+-..+....+++...
T Consensus 1 ~~i~vk~~g~~~~v~v~~~~Tv~~lK~~i~~~tgvp~~~ 39 (74)
T cd01813 1 VPVIVKWGGQEYSVTTLSEDTVLDLKQFIKTLTGVLPER 39 (74)
T ss_pred CEEEEEECCEEEEEEECCCCCHHHHHHHHHHHHCCCHHH
Confidence 357788899999999999999999999999999987644
Done!