Query 045020
Match_columns 221
No_of_seqs 164 out of 582
Neff 7.4
Searched_HMMs 46136
Date Fri Mar 29 09:03:50 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/045020.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/045020hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd05484 retropepsin_like_LTR_2 99.7 2.3E-17 4.9E-22 120.3 9.8 90 69-163 2-91 (91)
2 cd05481 retropepsin_like_LTR_1 99.6 6.2E-15 1.4E-19 108.3 8.4 88 71-161 2-91 (93)
3 cd05479 RP_DDI RP_DDI; retrope 99.5 1.9E-13 4.2E-18 105.3 10.7 109 64-176 13-122 (124)
4 PF08284 RVP_2: Retroviral asp 99.5 7.2E-13 1.6E-17 103.7 11.5 119 62-184 16-135 (135)
5 PF00077 RVP: Retroviral aspar 99.4 4.1E-13 8.8E-18 99.0 6.2 97 64-168 2-98 (100)
6 PF09668 Asp_protease: Asparty 99.3 1.7E-11 3.7E-16 94.3 9.6 101 64-168 21-122 (124)
7 PF13650 Asp_protease_2: Aspar 99.2 2.1E-10 4.6E-15 82.0 9.5 89 70-161 1-90 (90)
8 cd00303 retropepsin_like Retro 99.2 4.1E-10 9E-15 77.4 9.8 90 71-163 2-92 (92)
9 cd06095 RP_RTVL_H_like Retrope 98.8 4.4E-08 9.6E-13 70.6 8.7 85 70-162 1-85 (86)
10 cd05483 retropepsin_like_bacte 98.7 9.5E-08 2.1E-12 68.8 9.3 92 68-163 3-96 (96)
11 cd06094 RP_Saci_like RP_Saci_l 98.7 2.3E-08 4.9E-13 72.3 5.4 78 80-166 10-88 (89)
12 TIGR02281 clan_AA_DTGA clan AA 98.6 3.4E-07 7.3E-12 70.3 9.4 101 62-166 6-108 (121)
13 cd05480 NRIP_C NRIP_C; putativ 98.6 3.3E-07 7.1E-12 67.4 8.3 96 71-169 2-99 (103)
14 KOG0012 DNA damage inducible p 98.5 4.8E-07 1E-11 80.1 7.4 124 50-182 223-347 (380)
15 TIGR03698 clan_AA_DTGF clan AA 98.2 1.2E-05 2.6E-10 60.3 8.5 86 78-171 16-102 (107)
16 PF13975 gag-asp_proteas: gag- 98.1 9.2E-06 2E-10 56.6 6.5 65 63-129 4-68 (72)
17 PF12384 Peptidase_A2B: Ty3 tr 98.1 2.1E-05 4.6E-10 62.9 9.1 103 64-170 31-134 (177)
18 COG3577 Predicted aspartyl pro 96.8 0.0066 1.4E-07 50.5 7.6 133 25-168 67-204 (215)
19 PF02160 Peptidase_A3: Caulifl 96.8 0.0049 1.1E-07 51.2 6.7 104 71-182 10-119 (201)
20 cd05482 HIV_retropepsin_like R 96.1 0.042 9.1E-07 39.7 7.6 85 71-162 2-86 (87)
21 PF05585 DUF1758: Putative pep 95.7 0.033 7.2E-07 44.5 6.3 71 77-150 10-83 (164)
22 COG5550 Predicted aspartyl pro 95.6 0.084 1.8E-06 40.5 7.7 86 74-171 22-112 (125)
23 PF03732 Retrotrans_gag: Retro 95.5 0.018 3.9E-07 40.9 3.7 37 2-38 58-95 (96)
24 PF05618 Zn_protease: Putative 90.9 0.48 1E-05 37.2 4.7 47 128-174 85-134 (138)
25 PF12382 Peptidase_A2E: Retrot 90.5 1.8 3.9E-05 32.1 7.2 84 66-151 33-117 (137)
26 COG4067 Uncharacterized protei 87.4 1 2.2E-05 35.9 4.3 42 127-168 108-150 (162)
27 PTZ00147 plasmepsin-1; Provisi 78.4 8.1 0.00018 36.1 7.2 34 62-96 133-169 (453)
28 PTZ00013 plasmepsin 4 (PM4); P 75.3 12 0.00025 35.1 7.3 32 64-96 135-168 (450)
29 PF14223 UBN2: gag-polypeptide 63.2 7.7 0.00017 28.8 2.8 39 1-39 37-79 (119)
30 cd05471 pepsin_like Pepsin-lik 60.8 22 0.00048 29.8 5.6 23 80-102 203-225 (283)
31 cd05474 SAP_like SAPs, pepsin- 50.1 1E+02 0.0022 26.1 8.1 71 69-159 4-78 (295)
32 PF13260 DUF4051: Protein of u 47.0 23 0.00051 22.6 2.5 18 7-32 22-39 (54)
33 cd05476 pepsin_A_like_plant Ch 45.7 92 0.002 26.3 7.0 18 81-98 178-195 (265)
34 PF00026 Asp: Eukaryotic aspar 45.5 32 0.00069 29.4 4.2 31 71-101 186-221 (317)
35 PF00026 Asp: Eukaryotic aspar 45.4 1.1E+02 0.0025 25.9 7.7 89 69-160 3-113 (317)
36 KOG2209 Oxysterol-binding prot 39.5 51 0.0011 29.7 4.4 84 94-183 149-242 (445)
37 cd06096 Plasmepsin_5 Plasmepsi 35.6 1.5E+02 0.0033 25.9 7.0 22 81-102 233-254 (326)
38 cd05474 SAP_like SAPs, pepsin- 34.8 2.8E+02 0.0061 23.4 8.8 21 81-101 180-200 (295)
39 PF08990 Docking: Erythronolid 33.6 25 0.00055 19.6 1.1 14 1-14 1-14 (27)
40 PF13867 SAP30_Sin3_bdg: Sin3 32.5 41 0.00088 21.7 2.1 30 4-35 3-32 (53)
41 cd05472 cnd41_like Chloroplast 27.5 77 0.0017 27.2 3.6 31 71-101 154-193 (299)
42 PTZ00165 aspartyl protease; Pr 26.0 4.8E+02 0.01 24.6 8.9 21 81-101 329-349 (482)
43 cd06097 Aspergillopepsin_like 24.0 2.3E+02 0.005 24.0 5.9 22 80-101 199-220 (278)
44 cd05470 pepsin_retropepsin_lik 23.3 75 0.0016 22.6 2.4 25 71-96 2-28 (109)
45 PF09830 ATP_transf: ATP adeny 22.8 40 0.00086 22.5 0.7 18 152-172 1-18 (62)
46 cd05473 beta_secretase_like Be 21.6 1.2E+02 0.0027 26.9 3.9 21 81-101 213-233 (364)
47 TIGR02854 spore_II_GA sigma-E 21.2 4.7E+02 0.01 22.8 7.4 35 67-101 158-202 (288)
48 cd05486 Cathespin_E Cathepsin 20.1 1.2E+02 0.0027 26.2 3.5 30 72-101 186-220 (316)
No 1
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=99.73 E-value=2.3e-17 Score=120.32 Aligned_cols=90 Identities=21% Similarity=0.326 Sum_probs=85.2
Q ss_pred EEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeEEEEEEEEe
Q 045020 69 VISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTDFLVVK 148 (221)
Q Consensus 69 ~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~F~Vvd 148 (221)
++++.|+|.++. +|||+||++|+|+.+.++++|.++ ++++...+.+|+|.....+|.+.+.+++|+.+..++|+|++
T Consensus 2 ~~~~~Ing~~i~-~lvDTGA~~svis~~~~~~lg~~~--~~~~~~~v~~a~G~~~~~~G~~~~~v~~~~~~~~~~~~v~~ 78 (91)
T cd05484 2 TVTLLVNGKPLK-FQLDTGSAITVISEKTWRKLGSPP--LKPTKKRLRTATGTKLSVLGQILVTVKYGGKTKVLTLYVVK 78 (91)
T ss_pred EEEEEECCEEEE-EEEcCCcceEEeCHHHHHHhCCCc--cccccEEEEecCCCEeeEeEEEEEEEEECCEEEEEEEEEEE
Confidence 678999999985 999999999999999999999987 99999999999999999999999999999999999999998
Q ss_pred eCCceeeEEcchhhh
Q 045020 149 ATSPYNVILGRQWIH 163 (221)
Q Consensus 149 ~~~~yn~ILGr~wL~ 163 (221)
.. |+.||||+||.
T Consensus 79 ~~--~~~lLG~~wl~ 91 (91)
T cd05484 79 NE--GLNLLGRDWLD 91 (91)
T ss_pred CC--CCCccChhhcC
Confidence 76 99999999984
No 2
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=99.59 E-value=6.2e-15 Score=108.26 Aligned_cols=88 Identities=22% Similarity=0.293 Sum_probs=77.8
Q ss_pred EEEEcC-eeeeEEEecCCCccccccHHHHHhhcc-ccccccccccceecccCCcccccceeeEEEEEcCeeEEEEEEEEe
Q 045020 71 SIKIGN-CLVKRILIDNGSSVNILDDATVEKMRL-EQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTDFLVVK 148 (221)
Q Consensus 71 ~~~I~~-~~v~riLVD~GSsvnil~~~~~~~mg~-~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~F~Vvd 148 (221)
.+.|++ ..++ ++||+||++|+||.+++++||. ...+|+++...|++|+|+...+.|.+++++++|+.+.++.|+|+|
T Consensus 2 ~~~i~g~~~v~-~~vDtGA~vnllp~~~~~~l~~~~~~~L~~t~~~L~~~~g~~~~~~G~~~~~v~~~~~~~~~~f~Vvd 80 (93)
T cd05481 2 DMKINGKQSVK-FQLDTGATCNVLPLRWLKSLTPDKDPELRPSPVRLTAYGGSTIPVEGGVKLKCRYRNPKYNLTFQVVK 80 (93)
T ss_pred ceEeCCceeEE-EEEecCCEEEeccHHHHhhhccCCCCcCccCCeEEEeeCCCEeeeeEEEEEEEEECCcEEEEEEEEEC
Confidence 356777 5555 9999999999999999999983 245799999999999999999999999999999999999999999
Q ss_pred eCCceeeEEcchh
Q 045020 149 ATSPYNVILGRQW 161 (221)
Q Consensus 149 ~~~~yn~ILGr~w 161 (221)
...+ .|||+..
T Consensus 81 ~~~~--~lLG~~~ 91 (93)
T cd05481 81 EEGP--PLLGAKA 91 (93)
T ss_pred CCCC--ceEcccc
Confidence 8665 7999864
No 3
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=99.50 E-value=1.9e-13 Score=105.32 Aligned_cols=109 Identities=22% Similarity=0.280 Sum_probs=86.8
Q ss_pred CCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceee-EEEEEcCeeEEE
Q 045020 64 HDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVIT-LLVIAKSYSLLT 142 (221)
Q Consensus 64 h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~-L~v~~g~~t~~~ 142 (221)
....+++.+.|++..+ ++|||+||+.|+++.+..+++|++...-.+......|-+ .....|.+. .++++|+.++.+
T Consensus 13 ~~~~~~v~~~Ing~~~-~~LvDTGAs~s~Is~~~a~~lgl~~~~~~~~~~~~~g~g--~~~~~g~~~~~~l~i~~~~~~~ 89 (124)
T cd05479 13 KVPMLYINVEINGVPV-KAFVDSGAQMTIMSKACAEKCGLMRLIDKRFQGIAKGVG--TQKILGRIHLAQVKIGNLFLPC 89 (124)
T ss_pred eeeEEEEEEEECCEEE-EEEEeCCCceEEeCHHHHHHcCCccccCcceEEEEecCC--CcEEEeEEEEEEEEECCEEeee
Confidence 3456899999999997 599999999999999999999997542222223344433 356788874 799999999999
Q ss_pred EEEEEeeCCceeeEEcchhhhhccccccCcccee
Q 045020 143 DFLVVKATSPYNVILGRQWIHKMRAVPSTCHQVL 176 (221)
Q Consensus 143 ~F~Vvd~~~~yn~ILGr~wL~~~~~i~st~~q~v 176 (221)
+|.|++.. .++.|||.+||.+++++...-.+.|
T Consensus 90 ~~~Vl~~~-~~d~ILG~d~L~~~~~~ID~~~~~i 122 (124)
T cd05479 90 SFTVLEDD-DVDFLIGLDMLKRHQCVIDLKENVL 122 (124)
T ss_pred EEEEECCC-CcCEEecHHHHHhCCeEEECCCCEE
Confidence 99999855 8899999999999999877665554
No 4
>PF08284 RVP_2: Retroviral aspartyl protease; InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.
Probab=99.47 E-value=7.2e-13 Score=103.70 Aligned_cols=119 Identities=20% Similarity=0.329 Sum_probs=99.0
Q ss_pred CCCCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCccccccee-eEEEEEcCeeE
Q 045020 62 SPHDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVI-TLLVIAKSYSL 140 (221)
Q Consensus 62 ~~h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i-~L~v~~g~~t~ 140 (221)
+.+++.+...+.|+++.+. +|+|+||+-|+|..+..++++++...+..+-.. .+. |......+.+ .+++.+++.++
T Consensus 16 ~~~~~vi~g~~~I~~~~~~-vLiDSGAThsFIs~~~a~~~~l~~~~l~~~~~V-~~~-g~~~~~~~~~~~~~~~i~g~~~ 92 (135)
T PF08284_consen 16 EESPDVITGTFLINSIPAS-VLIDSGATHSFISSSFAKKLGLPLEPLPRPIVV-SAP-GGSINCEGVCPDVPLSIQGHEF 92 (135)
T ss_pred cCCCCeEEEEEEeccEEEE-EEEecCCCcEEccHHHHHhcCCEEEEccCeeEE-ecc-cccccccceeeeEEEEECCeEE
Confidence 4556677778889999988 999999999999999999999998888664433 343 4444455554 59999999999
Q ss_pred EEEEEEEeeCCceeeEEcchhhhhccccccCccceeeeecCCcE
Q 045020 141 LTDFLVVKATSPYNVILGRQWIHKMRAVPSTCHQVLCYQTIYGI 184 (221)
Q Consensus 141 ~~~F~Vvd~~~~yn~ILGr~wL~~~~~i~st~~q~vk~~~~~g~ 184 (221)
..+|.|.+ ...|++|||.+||.+..+......-+|.|..|.|.
T Consensus 93 ~~dl~vl~-l~~~DvILGm~WL~~~~~~IDw~~k~v~f~~p~~~ 135 (135)
T PF08284_consen 93 VVDLLVLD-LGGYDVILGMDWLKKHNPVIDWATKTVTFNSPSGE 135 (135)
T ss_pred EeeeEEec-ccceeeEeccchHHhCCCEEEccCCEEEEeCCCCC
Confidence 99999998 57799999999999999998888999999888873
No 5
>PF00077 RVP: Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026; InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=99.41 E-value=4.1e-13 Score=99.02 Aligned_cols=97 Identities=29% Similarity=0.396 Sum_probs=85.1
Q ss_pred CCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeEEEE
Q 045020 64 HDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTD 143 (221)
Q Consensus 64 h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~ 143 (221)
+++..++++.|++..+. +|+|+||++++++.+.++..+.+ ......+.|.+|.. ...|...+.++++.......
T Consensus 2 ~~~rp~i~v~i~g~~i~-~LlDTGA~vsiI~~~~~~~~~~~----~~~~~~v~~~~g~~-~~~~~~~~~v~~~~~~~~~~ 75 (100)
T PF00077_consen 2 LSNRPYITVKINGKKIK-ALLDTGADVSIISEKDWKKLGPP----PKTSITVRGAGGSS-SILGSTTVEVKIGGKEFNHT 75 (100)
T ss_dssp SSSSSEEEEEETTEEEE-EEEETTBSSEEESSGGSSSTSSE----EEEEEEEEETTEEE-EEEEEEEEEEEETTEEEEEE
T ss_pred CCCCceEEEeECCEEEE-EEEecCCCcceeccccccccccc----ccCCceeccCCCcc-eeeeEEEEEEEEECccceEE
Confidence 34455889999999988 99999999999999988776543 56667889999988 99999999999999999999
Q ss_pred EEEEeeCCceeeEEcchhhhhcccc
Q 045020 144 FLVVKATSPYNVILGRQWIHKMRAV 168 (221)
Q Consensus 144 F~Vvd~~~~yn~ILGr~wL~~~~~i 168 (221)
|+|++. .++| ||||+||.++++.
T Consensus 76 ~~v~~~-~~~~-ILG~D~L~~~~~~ 98 (100)
T PF00077_consen 76 FLVVPD-LPMN-ILGRDFLKKLNAV 98 (100)
T ss_dssp EEESST-CSSE-EEEHHHHTTTTCE
T ss_pred EEecCC-CCCC-EeChhHHHHcCCE
Confidence 999987 8888 9999999999875
No 6
>PF09668 Asp_protease: Aspartyl protease; InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=99.30 E-value=1.7e-11 Score=94.33 Aligned_cols=101 Identities=17% Similarity=0.212 Sum_probs=73.7
Q ss_pred CCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceee-EEEEEcCeeEEE
Q 045020 64 HDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVIT-LLVIAKSYSLLT 142 (221)
Q Consensus 64 h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~-L~v~~g~~t~~~ 142 (221)
....|+|.+.|+|.++. ++||+|+..|+|+.+.++++|+.. +-.....-...+-.+..++|.|. +++++|+..+++
T Consensus 21 ~v~mLyI~~~ing~~vk-A~VDtGAQ~tims~~~a~r~gL~~--lid~r~~g~a~GvG~~~i~G~Ih~~~l~ig~~~~~~ 97 (124)
T PF09668_consen 21 QVSMLYINCKINGVPVK-AFVDTGAQSTIMSKSCAERCGLMR--LIDKRFAGVAKGVGTQKILGRIHSVQLKIGGLFFPC 97 (124)
T ss_dssp -----EEEEEETTEEEE-EEEETT-SS-EEEHHHHHHTTGGG--GEEGGG-EE-------EEEEEEEEEEEEETTEEEEE
T ss_pred CcceEEEEEEECCEEEE-EEEeCCCCccccCHHHHHHcCChh--hccccccccccCCCcCceeEEEEEEEEEECCEEEEE
Confidence 34579999999999994 999999999999999999999853 44443322222224567999997 999999999999
Q ss_pred EEEEEeeCCceeeEEcchhhhhcccc
Q 045020 143 DFLVVKATSPYNVILGRQWIHKMRAV 168 (221)
Q Consensus 143 ~F~Vvd~~~~yn~ILGr~wL~~~~~i 168 (221)
.|.|++ ..+-+.|||.+||.+.+++
T Consensus 98 s~~Vle-~~~~d~llGld~L~~~~c~ 122 (124)
T PF09668_consen 98 SFTVLE-DQDVDLLLGLDMLKRHKCC 122 (124)
T ss_dssp EEEEET-TSSSSEEEEHHHHHHTT-E
T ss_pred EEEEeC-CCCcceeeeHHHHHHhCcc
Confidence 999999 4455899999999998875
No 7
>PF13650 Asp_protease_2: Aspartyl protease
Probab=99.18 E-value=2.1e-10 Score=82.01 Aligned_cols=89 Identities=27% Similarity=0.444 Sum_probs=74.2
Q ss_pred EEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeE-EEEEEEEe
Q 045020 70 ISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSL-LTDFLVVK 148 (221)
Q Consensus 70 i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~-~~~F~Vvd 148 (221)
|++.|+|..+ ++++|+||+.++++.+.++++|+....... ...+.+++|......+.+. .+++|+.++ ++.|.+++
T Consensus 1 V~v~vng~~~-~~liDTGa~~~~i~~~~~~~l~~~~~~~~~-~~~~~~~~g~~~~~~~~~~-~i~ig~~~~~~~~~~v~~ 77 (90)
T PF13650_consen 1 VPVKVNGKPV-RFLIDTGASISVISRSLAKKLGLKPRPKSV-PISVSGAGGSVTVYRGRVD-SITIGGITLKNVPFLVVD 77 (90)
T ss_pred CEEEECCEEE-EEEEcCCCCcEEECHHHHHHcCCCCcCCce-eEEEEeCCCCEEEEEEEEE-EEEECCEEEEeEEEEEEC
Confidence 4678999977 599999999999999999999987654432 3566888998666666666 899999887 89999999
Q ss_pred eCCceeeEEcchh
Q 045020 149 ATSPYNVILGRQW 161 (221)
Q Consensus 149 ~~~~yn~ILGr~w 161 (221)
...+++.|||.+|
T Consensus 78 ~~~~~~~iLG~df 90 (90)
T PF13650_consen 78 LGDPIDGILGMDF 90 (90)
T ss_pred CCCCCEEEeCCcC
Confidence 7788999999998
No 8
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=99.16 E-value=4.1e-10 Score=77.41 Aligned_cols=90 Identities=28% Similarity=0.483 Sum_probs=73.9
Q ss_pred EEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccce-eeEEEEEcCeeEEEEEEEEee
Q 045020 71 SIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGV-ITLLVIAKSYSLLTDFLVVKA 149 (221)
Q Consensus 71 ~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~-i~L~v~~g~~t~~~~F~Vvd~ 149 (221)
++.+++. ..++|+|+||++|++..+.+.+++. ..........+.+++|......+. ..+.+.++.......|++++.
T Consensus 2 ~~~~~~~-~~~~liDtgs~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~ 79 (92)
T cd00303 2 KGKINGV-PVRALVDSGASVNFISESLAKKLGL-PPRLLPTPLKVKGANGSSVKTLGVILPVTIGIGGKTFTVDFYVLDL 79 (92)
T ss_pred EEEECCE-EEEEEEcCCCcccccCHHHHHHcCC-CcccCCCceEEEecCCCEeccCcEEEEEEEEeCCEEEEEEEEEEcC
Confidence 4567774 4469999999999999999999987 223445567788888887777777 578889999999999999986
Q ss_pred CCceeeEEcchhhh
Q 045020 150 TSPYNVILGRQWIH 163 (221)
Q Consensus 150 ~~~yn~ILGr~wL~ 163 (221)
. .|+.|||++||+
T Consensus 80 ~-~~~~ilG~~~l~ 92 (92)
T cd00303 80 L-SYDVILGRPWLE 92 (92)
T ss_pred C-CcCEEecccccC
Confidence 6 899999999984
No 9
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where
Probab=98.79 E-value=4.4e-08 Score=70.56 Aligned_cols=85 Identities=22% Similarity=0.353 Sum_probs=68.2
Q ss_pred EEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeEEEEEEEEee
Q 045020 70 ISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTDFLVVKA 149 (221)
Q Consensus 70 i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~F~Vvd~ 149 (221)
+++.|+|.++. +||||||+..+++...++++ .+......+.|.+|....+.....-.+++|+.+....|.|++.
T Consensus 1 ~~v~InG~~~~-fLvDTGA~~tii~~~~a~~~-----~~~~~~~~v~gagG~~~~~v~~~~~~v~vg~~~~~~~~~v~~~ 74 (86)
T cd06095 1 VTITVEGVPIV-FLVDTGATHSVLKSDLGPKQ-----ELSTTSVLIRGVSGQSQQPVTTYRTLVDLGGHTVSHSFLVVPN 74 (86)
T ss_pred CEEEECCEEEE-EEEECCCCeEEECHHHhhhc-----cCCCCcEEEEeCCCcccccEEEeeeEEEECCEEEEEEEEEEcC
Confidence 36789999887 99999999999999999987 3456778899999987444443322689999999889988864
Q ss_pred CCceeeEEcchhh
Q 045020 150 TSPYNVILGRQWI 162 (221)
Q Consensus 150 ~~~yn~ILGr~wL 162 (221)
. + +.|||+.||
T Consensus 75 ~-~-~~lLG~dfL 85 (86)
T cd06095 75 C-P-DPLLGRDLL 85 (86)
T ss_pred C-C-CcEechhhc
Confidence 3 3 789999987
No 10
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacterial aspartate proteases is a subfamily of the retropepsin-like protease family, which includes enzymes from retroviruses and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, these bacterial aspartate proteases are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=98.74 E-value=9.5e-08 Score=68.81 Aligned_cols=92 Identities=16% Similarity=0.224 Sum_probs=71.2
Q ss_pred eEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeE-EEEEEE
Q 045020 68 LVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSL-LTDFLV 146 (221)
Q Consensus 68 L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~-~~~F~V 146 (221)
+++++.|++..+. +++|+||+.++++.+..++++.. ..........+.+|........ .-.+++|+.++ ++.+.|
T Consensus 3 ~~v~v~i~~~~~~-~llDTGa~~s~i~~~~~~~l~~~--~~~~~~~~~~~~~G~~~~~~~~-~~~i~ig~~~~~~~~~~v 78 (96)
T cd05483 3 FVVPVTINGQPVR-FLLDTGASTTVISEELAERLGLP--LTLGGKVTVQTANGRVRAARVR-LDSLQIGGITLRNVPAVV 78 (96)
T ss_pred EEEEEEECCEEEE-EEEECCCCcEEcCHHHHHHcCCC--ccCCCcEEEEecCCCccceEEE-cceEEECCcEEeccEEEE
Confidence 6789999988875 99999999999999999999872 2333445566677765544444 45788999875 588999
Q ss_pred EeeCC-ceeeEEcchhhh
Q 045020 147 VKATS-PYNVILGRQWIH 163 (221)
Q Consensus 147 vd~~~-~yn~ILGr~wL~ 163 (221)
+|... +.+.|||.+||.
T Consensus 79 ~d~~~~~~~gIlG~d~l~ 96 (96)
T cd05483 79 LPGDALGVDGLLGMDFLR 96 (96)
T ss_pred eCCcccCCceEeChHHhC
Confidence 98765 578999999974
No 11
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=98.73 E-value=2.3e-08 Score=72.29 Aligned_cols=78 Identities=22% Similarity=0.330 Sum_probs=67.5
Q ss_pred eEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCe-eEEEEEEEEeeCCceeeEEc
Q 045020 80 KRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSY-SLLTDFLVVKATSPYNVILG 158 (221)
Q Consensus 80 ~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~-t~~~~F~Vvd~~~~yn~ILG 158 (221)
.+.|||+||.++++|.+..++. ++++...|.+.||+.....|...+.+..|.. .+...|.|.|.+. +|||
T Consensus 10 ~~fLVDTGA~vSviP~~~~~~~------~~~~~~~l~AANgt~I~tyG~~~l~ldlGlrr~~~w~FvvAdv~~---pIlG 80 (89)
T cd06094 10 LRFLVDTGAAVSVLPASSTKKS------LKPSPLTLQAANGTPIATYGTRSLTLDLGLRRPFAWNFVVADVPH---PILG 80 (89)
T ss_pred cEEEEeCCCceEeecccccccc------ccCCceEEEeCCCCeEeeeeeEEEEEEcCCCcEEeEEEEEcCCCc---ceec
Confidence 3599999999999998866542 6777789999999999999999999999986 8888999988766 5999
Q ss_pred chhhhhcc
Q 045020 159 RQWIHKMR 166 (221)
Q Consensus 159 r~wL~~~~ 166 (221)
..+|++++
T Consensus 81 aDfL~~~~ 88 (89)
T cd06094 81 ADFLQHYG 88 (89)
T ss_pred HHHHHHcC
Confidence 99999875
No 12
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable-length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=98.61 E-value=3.4e-07 Score=70.28 Aligned_cols=101 Identities=14% Similarity=0.284 Sum_probs=78.3
Q ss_pred CCCCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceee-EEEEEcCeeE
Q 045020 62 SPHDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVIT-LLVIAKSYSL 140 (221)
Q Consensus 62 ~~h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~-L~v~~g~~t~ 140 (221)
+.+...+++.+.|+|.++. ++|||||+..+++....+++|+....+ .....+.+.+|.... ..+. -.+++|+.++
T Consensus 6 ~~~~g~~~v~~~InG~~~~-flVDTGAs~t~is~~~A~~Lgl~~~~~-~~~~~~~ta~G~~~~--~~~~l~~l~iG~~~~ 81 (121)
T TIGR02281 6 KDGDGHFYATGRVNGRNVR-FLVDTGATSVALNEEDAQRLGLDLNRL-GYTVTVSTANGQIKA--ARVTLDRVAIGGIVV 81 (121)
T ss_pred EcCCCeEEEEEEECCEEEE-EEEECCCCcEEcCHHHHHHcCCCcccC-CceEEEEeCCCcEEE--EEEEeCEEEECCEEE
Confidence 4456678999999999775 999999999999999999999976543 234556667776432 3334 4588999876
Q ss_pred E-EEEEEEeeCCceeeEEcchhhhhcc
Q 045020 141 L-TDFLVVKATSPYNVILGRQWIHKMR 166 (221)
Q Consensus 141 ~-~~F~Vvd~~~~yn~ILGr~wL~~~~ 166 (221)
. +.+.|++.....+.+||-.+|.+++
T Consensus 82 ~nv~~~v~~~~~~~~~LLGm~fL~~~~ 108 (121)
T TIGR02281 82 NDVDAMVAEGGALSESLLGMSFLNRLS 108 (121)
T ss_pred eCcEEEEeCCCcCCceEcCHHHHhccc
Confidence 4 8999987654447999999999986
No 13
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=98.59 E-value=3.3e-07 Score=67.44 Aligned_cols=96 Identities=20% Similarity=0.253 Sum_probs=73.8
Q ss_pred EEEEcCeeeeEEEecCCCccccccHHHHHhhcccccccc-ccccceecccCCcccccceee-EEEEEcCeeEEEEEEEEe
Q 045020 71 SIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIK-LFEQPLIGLSGIVTNSDGVIT-LLVIAKSYSLLTDFLVVK 148 (221)
Q Consensus 71 ~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~-~~~~~l~gf~g~~~~~lG~i~-L~v~~g~~t~~~~F~Vvd 148 (221)
...++|++|. ++||+|+-.|||+....+++|+...-.+ ...-.-.| -|++...+|.|. +++++|...++..|.|+|
T Consensus 2 nCk~nG~~vk-AfVDsGaQ~timS~~caercgL~r~v~~~r~~g~A~g-vgt~~kiiGrih~~~ikig~~~~~CSftVld 79 (103)
T cd05480 2 SCQCAGKELR-ALVDTGCQYNLISAACLDRLGLKERVLKAKAEEEAPS-LPTSVKVIGQIERLVLQLGQLTVECSAQVVD 79 (103)
T ss_pred ceeECCEEEE-EEEecCCchhhcCHHHHHHcChHhhhhhccccccccC-CCcceeEeeEEEEEEEEeCCEEeeEEEEEEc
Confidence 4578999998 9999999999999999999998532111 11111111 123357889996 999999999999999999
Q ss_pred eCCceeeEEcchhhhhccccc
Q 045020 149 ATSPYNVILGRQWIHKMRAVP 169 (221)
Q Consensus 149 ~~~~yn~ILGr~wL~~~~~i~ 169 (221)
..+.+.+||-..|.+.++..
T Consensus 80 -~~~~d~llGLdmLkrhqc~I 99 (103)
T cd05480 80 -DNEKNFSLGLQTLKSLKCVI 99 (103)
T ss_pred -CCCcceEeeHHHHhhcceee
Confidence 45678999999999887753
No 14
>KOG0012 consensus DNA damage inducible protein [Replication, recombination and repair]
Probab=98.46 E-value=4.8e-07 Score=80.09 Aligned_cols=124 Identities=20% Similarity=0.279 Sum_probs=93.4
Q ss_pred ccccCCCCCCCcCCCCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCccccccee
Q 045020 50 VSFLPQDLKLIQSPHDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVI 129 (221)
Q Consensus 50 I~F~~~d~~~~~~~h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i 129 (221)
|..+|++... -.-|+|.+.|+|++|+ +.||+|+-.|||+....+++|+..--.+...-...|.+ +.+.+|.|
T Consensus 223 i~~~pe~f~~-----v~ML~iN~~ing~~VK-AfVDsGaq~timS~~Caer~gL~rlid~r~~g~a~gvg--~~ki~g~I 294 (380)
T KOG0012|consen 223 IEYHPEDFTQ-----VTMLYINCEINGVPVK-AFVDSGAQTTIMSAACAERCGLNRLIDKRFQGEARGVG--TEKILGRI 294 (380)
T ss_pred hhcCcccccc-----ceEEEEEEEECCEEEE-EEEcccchhhhhhHHHHHHhChHHHhhhhhhccccCCC--ccccccee
Confidence 5555554332 3568999999999998 99999999999999999999986532222222222221 55678999
Q ss_pred e-EEEEEcCeeEEEEEEEEeeCCceeeEEcchhhhhccccccCccceeeeecCC
Q 045020 130 T-LLVIAKSYSLLTDFLVVKATSPYNVILGRQWIHKMRAVPSTCHQVLCYQTIY 182 (221)
Q Consensus 130 ~-L~v~~g~~t~~~~F~Vvd~~~~yn~ILGr~wL~~~~~i~st~~q~vk~~~~~ 182 (221)
. +++++|..++...|.|+|. .+-+++||-.-|.+.++...--+-.+.+-...
T Consensus 295 h~~~lki~~~~l~c~ftV~d~-~~~d~llGLd~Lrr~~ccIdL~~~~L~ig~~~ 347 (380)
T KOG0012|consen 295 HQAQLKIEDLYLPCSFTVLDR-RDMDLLLGLDMLRRHQCCIDLKTNVLRIGNTE 347 (380)
T ss_pred EEEEEEeccEeeccceEEecC-CCcchhhhHHHHHhccceeecccCeEEecCCC
Confidence 6 9999999999999999995 44589999999999999876655555554333
No 15
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=98.18 E-value=1.2e-05 Score=60.33 Aligned_cols=86 Identities=17% Similarity=0.168 Sum_probs=59.7
Q ss_pred eeeEEEecCCCcccc-ccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeEEEEEEEEeeCCceeeE
Q 045020 78 LVKRILIDNGSSVNI-LDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTDFLVVKATSPYNVI 156 (221)
Q Consensus 78 ~v~riLVD~GSsvni-l~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~F~Vvd~~~~yn~I 156 (221)
.|. +|||+|++..+ +|.+.++++|+++.. .......+|...... .....|.+|+........+.+ ..+ ..+
T Consensus 16 ~v~-~LVDTGat~~~~l~~~~a~~lgl~~~~----~~~~~tA~G~~~~~~-v~~~~v~igg~~~~~~v~~~~-~~~-~~L 87 (107)
T TIGR03698 16 EVR-ALVDTGFSGFLLVPPDIVNKLGLPELD----QRRVYLADGREVLTD-VAKASIIINGLEIDAFVESLG-YVD-EPL 87 (107)
T ss_pred EEE-EEEECCCCeEEecCHHHHHHcCCCccc----CcEEEecCCcEEEEE-EEEEEEEECCEEEEEEEEecC-CCC-ccE
Confidence 444 99999999997 999999999997642 345555677432221 224667788887744433323 224 679
Q ss_pred EcchhhhhccccccC
Q 045020 157 LGRQWIHKMRAVPST 171 (221)
Q Consensus 157 LGr~wL~~~~~i~st 171 (221)
||..||.+++.+...
T Consensus 88 LG~~~L~~l~l~id~ 102 (107)
T TIGR03698 88 LGTELLEGLGIVIDY 102 (107)
T ss_pred ecHHHHhhCCEEEeh
Confidence 999999999887644
No 16
>PF13975 gag-asp_proteas: gag-polyprotein putative aspartyl protease
Probab=98.13 E-value=9.2e-06 Score=56.57 Aligned_cols=65 Identities=23% Similarity=0.385 Sum_probs=51.5
Q ss_pred CCCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCccccccee
Q 045020 63 PHDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVI 129 (221)
Q Consensus 63 ~h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i 129 (221)
.+...+++.+.|++..+. .|||+||+.|+|+.+..+++|++..... ....+.-.+|......|.+
T Consensus 4 ~~~g~~~v~~~I~g~~~~-alvDtGat~~fis~~~a~rLgl~~~~~~-~~~~v~~a~g~~~~~~g~~ 68 (72)
T PF13975_consen 4 PDPGLMYVPVSIGGVQVK-ALVDTGATHNFISESLAKRLGLPLEKPP-SPIRVKLANGSVIEIRGVA 68 (72)
T ss_pred ccCCEEEEEEEECCEEEE-EEEeCCCcceecCHHHHHHhCCCcccCC-CCEEEEECCCCccccceEE
Confidence 455668999999999888 9999999999999999999998764332 2355566688877776655
No 17
>PF12384 Peptidase_A2B: Ty3 transposon peptidase; InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=98.13 E-value=2.1e-05 Score=62.93 Aligned_cols=103 Identities=15% Similarity=0.220 Sum_probs=81.8
Q ss_pred CCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccC-CcccccceeeEEEEEcCeeEEE
Q 045020 64 HDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSG-IVTNSDGVITLLVIAKSYSLLT 142 (221)
Q Consensus 64 h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g-~~~~~lG~i~L~v~~g~~t~~~ 142 (221)
=++.+...+.|.+-.+. +|.|+||-..++......+|+++.. ..+...+.|+.+ +....--.+++++..+...+.+
T Consensus 31 vg~T~~v~l~~~~t~i~-vLfDSGSPTSfIr~di~~kL~L~~~--~app~~fRG~vs~~~~~tsEAv~ld~~i~n~~i~i 107 (177)
T PF12384_consen 31 VGKTAIVQLNCKGTPIK-VLFDSGSPTSFIRSDIVEKLELPTH--DAPPFRFRGFVSGESATTSEAVTLDFYIDNKLIDI 107 (177)
T ss_pred cCcEEEEEEeecCcEEE-EEEeCCCccceeehhhHHhhCCccc--cCCCEEEeeeccCCceEEEEeEEEEEEECCeEEEE
Confidence 35667888899999988 9999999999999999999987653 334455666654 3444555677999999999999
Q ss_pred EEEEEeeCCceeeEEcchhhhhcccccc
Q 045020 143 DFLVVKATSPYNVILGRQWIHKMRAVPS 170 (221)
Q Consensus 143 ~F~Vvd~~~~yn~ILGr~wL~~~~~i~s 170 (221)
.++|.| .-++++|+|-|.|.++.-+-.
T Consensus 108 ~aYV~d-~m~~dlIIGnPiL~ryp~l~~ 134 (177)
T PF12384_consen 108 AAYVTD-NMDHDLIIGNPILDRYPTLLY 134 (177)
T ss_pred EEEEec-cCCcceEeccHHHhhhHHHHH
Confidence 999998 556899999999988766533
No 18
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=96.78 E-value=0.0066 Score=50.46 Aligned_cols=133 Identities=14% Similarity=0.197 Sum_probs=91.9
Q ss_pred cHHHHHHHHhcCCCCCCCcccccccccccCCCC--C-CCcCCCCCceEEEEEEcCeeeeEEEecCCCccccccHHHHHhh
Q 045020 25 DDQLAIVTFKRGLPSNHPFFESLSEVSFLPQDL--K-LIQSPHDDALVISIKIGNCLVKRILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 25 ~~~~~~~~fk~~~~~~~~~~es~~~I~F~~~d~--~-~~~~~h~~~L~i~~~I~~~~v~riLVD~GSsvnil~~~~~~~m 101 (221)
+...+-...+++.-|+.||--- ..|- . ......+-.+.+...|||..|. .|||||+|.=.+......++
T Consensus 67 ~l~~~~~~v~~gl~P~~~~a~~-------~~~g~~~v~Lak~~~GHF~a~~~VNGk~v~-fLVDTGATsVal~~~dA~Rl 138 (215)
T COG3577 67 ELQDATHRVLRGLNPGRAWATL-------VGDGYQEVSLAKSRDGHFEANGRVNGKKVD-FLVDTGATSVALNEEDARRL 138 (215)
T ss_pred ccccchhHHHhhcCCCCCcccc-------CCCCceEEEEEecCCCcEEEEEEECCEEEE-EEEecCcceeecCHHHHHHh
Confidence 3333444556666667777321 1111 0 1224455568889999999998 99999999999999999999
Q ss_pred ccccccccccccceecccCCcccccceee-EEEEEcCe-eEEEEEEEEeeCCceeeEEcchhhhhcccc
Q 045020 102 RLEQDKIKLFEQPLIGLSGIVTNSDGVIT-LLVIAKSY-SLLTDFLVVKATSPYNVILGRQWIHKMRAV 168 (221)
Q Consensus 102 g~~~~~L~~~~~~l~gf~g~~~~~lG~i~-L~v~~g~~-t~~~~F~Vvd~~~~yn~ILGr~wL~~~~~i 168 (221)
|+....|.-+ .++..-||...-. .++ =.|++|++ ..+++=+|++....-+.+||-.+|.+++-.
T Consensus 139 Gid~~~l~y~-~~v~TANG~~~AA--~V~Ld~v~IG~I~~~nV~A~V~~~g~L~~sLLGMSfL~rL~~f 204 (215)
T COG3577 139 GIDLNSLDYT-ITVSTANGRARAA--PVTLDRVQIGGIRVKNVDAMVAEDGALDESLLGMSFLNRLSGF 204 (215)
T ss_pred CCCccccCCc-eEEEccCCccccc--eEEeeeEEEccEEEcCchhheecCCccchhhhhHHHHhhccce
Confidence 9988766544 5556567754221 122 35778877 458888899877677899999999887643
No 19
>PF02160 Peptidase_A3: Cauliflower mosaic virus peptidase (A3); InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contain an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein []. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco) []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=96.75 E-value=0.0049 Score=51.25 Aligned_cols=104 Identities=10% Similarity=0.117 Sum_probs=69.9
Q ss_pred EEEEcCe---eeeEEEecCCCccccccHHHHHhhccccccccccc--cceecccCCccccccee-eEEEEEcCeeEEEEE
Q 045020 71 SIKIGNC---LVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFE--QPLIGLSGIVTNSDGVI-TLLVIAKSYSLLTDF 144 (221)
Q Consensus 71 ~~~I~~~---~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~--~~l~gf~g~~~~~lG~i-~L~v~~g~~t~~~~F 144 (221)
.+.+.|. .+. .+|||||+++++..... |...++.+. ..++|+|++..+..=.+ .+.+.+++..+.+.|
T Consensus 10 ~i~~~gy~~~~~~-~~vDTGAt~C~~~~~ii-----P~e~we~~~~~i~v~~an~~~~~i~~~~~~~~i~I~~~~F~IP~ 83 (201)
T PF02160_consen 10 KISFPGYKKFNYH-CYVDTGATICCASKKII-----PEEYWEKSKKPIKVKGANGSIIQINKKAKNGKIQIADKIFRIPT 83 (201)
T ss_pred EEEEcCceeEEEE-EEEeCCCceEEecCCcC-----CHHHHHhCCCcEEEEEecCCceEEEEEecCceEEEccEEEeccE
Confidence 4445553 334 79999999988776654 333344444 47888998865444444 388888988877777
Q ss_pred EEEeeCCceeeEEcchhhhhccccccCccceeeeecCC
Q 045020 145 LVVKATSPYNVILGRQWIHKMRAVPSTCHQVLCYQTIY 182 (221)
Q Consensus 145 ~Vvd~~~~yn~ILGr~wL~~~~~i~st~~q~vk~~~~~ 182 (221)
.-.. .+..++|||.+||..+....-+. ..+.|-.++
T Consensus 84 iYq~-~~g~d~IlG~NF~r~y~Pfiq~~-~~I~f~~~~ 119 (201)
T PF02160_consen 84 IYQQ-ESGIDIILGNNFLRLYEPFIQTE-DRIQFHKKG 119 (201)
T ss_pred EEEe-cCCCCEEecchHHHhcCCcEEEc-cEEEEEeCC
Confidence 6644 46789999999999888764443 345555544
No 20
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=96.10 E-value=0.042 Score=39.72 Aligned_cols=85 Identities=25% Similarity=0.336 Sum_probs=57.2
Q ss_pred EEEEcCeeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeEEEEEEEEeeC
Q 045020 71 SIKIGNCLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTDFLVVKAT 150 (221)
Q Consensus 71 ~~~I~~~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~F~Vvd~~ 150 (221)
++.|++..+. .|+|+||-..++....+..- -.+......+.|.+|.. .+.-.-.+++++.+....-.|.|....
T Consensus 2 ~~~i~g~~~~-~llDTGAd~Tvi~~~~~p~~----w~~~~~~~~i~GIGG~~-~~~~~~~v~i~i~~~~~~g~vlv~~~~ 75 (87)
T cd05482 2 TLYINGKLFE-GLLDTGADVSIIAENDWPKN----WPIQPAPSNLTGIGGAI-TPSQSSVLLLEIDGEGHLGTILVYVLS 75 (87)
T ss_pred EEEECCEEEE-EEEccCCCCeEEcccccCCC----CccCCCCeEEEeccceE-EEEEEeeEEEEEcCCeEEEEEEEccCC
Confidence 5678888877 99999999999876443211 12445667788888753 333333688888877666677776533
Q ss_pred CceeeEEcchhh
Q 045020 151 SPYNVILGRQWI 162 (221)
Q Consensus 151 ~~yn~ILGr~wL 162 (221)
.|.| |+||..|
T Consensus 76 ~P~n-llGRd~L 86 (87)
T cd05482 76 LPVN-LWGRDIL 86 (87)
T ss_pred Cccc-EEccccC
Confidence 4443 7899876
No 21
>PF05585 DUF1758: Putative peptidase (DUF1758); InterPro: IPR008737 This is a family of nematode proteins of unknown function []. However, it seems likely that these proteins act as aspartic peptidases.
Probab=95.72 E-value=0.033 Score=44.47 Aligned_cols=71 Identities=23% Similarity=0.245 Sum_probs=49.2
Q ss_pred eeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCc-ccccc--eeeEEEEEcCeeEEEEEEEEeeC
Q 045020 77 CLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIV-TNSDG--VITLLVIAKSYSLLTDFLVVKAT 150 (221)
Q Consensus 77 ~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~-~~~lG--~i~L~v~~g~~t~~~~F~Vvd~~ 150 (221)
...-|+|+|+||..+++..+..++|+++.. +......|..|.. ..... .+.+.+..+.....+..++++..
T Consensus 10 ~~~~~~LlDsGSq~SfIt~~la~~L~L~~~---~~~~~~~~~~g~~~~~~~~~~~~~i~~~~~~~~~~i~alvv~~I 83 (164)
T PF05585_consen 10 QVEARALLDSGSQRSFITESLANKLNLPGT---GEKILVIGTFGSSSPKSKKCVRVKISSRTSNNSLEIEALVVPKI 83 (164)
T ss_pred EEEEEEEEecCCchhHHhHHHHHHhCCCCC---CceEEEEeccCccCccceeEEEEEEEEecCCCceEEEEEecCcc
Confidence 444679999999999999999999998653 3444445544443 22333 34466666777788888888654
No 22
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=95.60 E-value=0.084 Score=40.49 Aligned_cols=86 Identities=20% Similarity=0.261 Sum_probs=56.7
Q ss_pred EcCeeeeEEEecCCCc-cccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCe----eEEEEEEEEe
Q 045020 74 IGNCLVKRILIDNGSS-VNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSY----SLLTDFLVVK 148 (221)
Q Consensus 74 I~~~~v~riLVD~GSs-vnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~----t~~~~F~Vvd 148 (221)
.+++.-.+ |||||.+ --++|.+.+.++|.++. ......+. .-|.+...|....+ ...+.|..+.
T Consensus 22 ~Gd~~~~~-LiDTGFtg~lvlp~~vaek~~~~~~--~~~~~~~a--------~~~~v~t~V~~~~iki~g~e~~~~Vl~s 90 (125)
T COG5550 22 QGDFVYDE-LIDTGFTGYLVLPPQVAEKLGLPLF--STIRIVLA--------DGGVVKTSVALATIKIDGVEKVAFVLAS 90 (125)
T ss_pred CCcEEeee-EEecCCceeEEeCHHHHHhcCCCcc--CChhhhhh--------cCCEEEEEEEEEEEEECCEEEEEEEEcc
Confidence 34554444 9999998 78899999999998752 22222222 22333344443332 4566777777
Q ss_pred eCCceeeEEcchhhhhccccccC
Q 045020 149 ATSPYNVILGRQWIHKMRAVPST 171 (221)
Q Consensus 149 ~~~~yn~ILGr~wL~~~~~i~st 171 (221)
+..+-+ ++|+.||+........
T Consensus 91 ~~~~~~-liG~~~lk~l~~~vn~ 112 (125)
T COG5550 91 DNLPEP-LIGVNLLKLLGLVVNP 112 (125)
T ss_pred CCCccc-chhhhhhhhccEEEcC
Confidence 777877 9999999988876543
No 23
>PF03732 Retrotrans_gag: Retrotransposon gag protein ; InterPro: IPR005162 Transposable elements (TEs) promote various chromosomal rearrangements more efficiently, and often more specifically, than other cellular processes. Retrotransposons are structurally similar to retroviruses and are bounded by long terminal repeats. This entry represents eukaryotic Gag or capsid-related retrotransposon-related proteins. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.
Probab=95.50 E-value=0.018 Score=40.85 Aligned_cols=37 Identities=27% Similarity=0.526 Sum_probs=34.1
Q ss_pred CCcchHHHHHHHHHHhcccccc-ccHHHHHHHHhcCCC
Q 045020 2 GRVETLKYYIRRFRLASAKVEN-CDDQLAIVTFKRGLP 38 (221)
Q Consensus 2 ~~~eslr~~~k~f~~~~l~ve~-~~~~~~~~~fk~~~~ 38 (221)
|.+||+++|+.||+..+.+++. .+++..+..|.+|+.
T Consensus 58 Q~~esv~~y~~rf~~l~~~~~~~~~e~~~v~~f~~GL~ 95 (96)
T PF03732_consen 58 QGNESVREYVNRFRELARRAPPPMDEEMLVERFIRGLR 95 (96)
T ss_pred ccCCcHHHHHHHHHHHHHHCCCCcCHHHHHHHHHHCCC
Confidence 4699999999999999999996 999999999999975
No 24
>PF05618 Zn_protease: Putative ATP-dependant zinc protease; InterPro: IPR008503 This family consists of several hypothetical proteins from different archaeal and bacterial species.; PDB: 2PMA_B.
Probab=90.85 E-value=0.48 Score=37.15 Aligned_cols=47 Identities=17% Similarity=0.282 Sum_probs=32.4
Q ss_pred eeeEEEEEcCeeEEEEEEEEeeCC-ceeeEEc-chhhhhcccc-ccCccc
Q 045020 128 VITLLVIAKSYSLLTDFLVVKATS-PYNVILG-RQWIHKMRAV-PSTCHQ 174 (221)
Q Consensus 128 ~i~L~v~~g~~t~~~~F~Vvd~~~-~yn~ILG-r~wL~~~~~i-~st~~q 174 (221)
-|.+.+++|+.+..+.|.+.|-.. .|.++|| |.||...-+| |+-.|.
T Consensus 85 VV~~~~~lg~~~~~~e~tL~dR~~m~yp~LlGrR~~l~~~~lVD~s~~~~ 134 (138)
T PF05618_consen 85 VVETTLCLGGKTWKIEFTLTDRSNMKYPMLLGRRNFLRGRFLVDVSRSFL 134 (138)
T ss_dssp EEEEEEEETTEEEEEEEEEE-S--SS-SEEE-HHHHHHTTEEEETT---T
T ss_pred EEEEEEEECCEEEEEEEEEcCCCcCcCCEEEEehHHhcCCEEECCChhhc
Confidence 357999999999999999998655 7999999 9999876555 554443
No 25
>PF12382 Peptidase_A2E: Retrotransposon peptidase; InterPro: IPR024648 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. This entry represents a small family of fungal retroviral aspartyl peptidases.
Probab=90.54 E-value=1.8 Score=32.10 Aligned_cols=84 Identities=27% Similarity=0.290 Sum_probs=55.7
Q ss_pred CceEEEEEEcCeeee-EEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceeeEEEEEcCeeEEEEE
Q 045020 66 DALVISIKIGNCLVK-RILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVITLLVIAKSYSLLTDF 144 (221)
Q Consensus 66 ~~L~i~~~I~~~~v~-riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~L~v~~g~~t~~~~F 144 (221)
.-+++...+-+|+.+ -.|+|+|+.+|++-..+.+.-.++......+ ++..|.-.. ..-.-.+.|.+...++...+.|
T Consensus 33 yemvlqa~lp~fkcsipclidtgaq~niiteetvrahklptrpw~~s-viyggvyp~-kinrkt~kl~i~lngisiktef 110 (137)
T PF12382_consen 33 YEMVLQAKLPDFKCSIPCLIDTGAQVNIITEETVRAHKLPTRPWSQS-VIYGGVYPN-KINRKTIKLNINLNGISIKTEF 110 (137)
T ss_pred hHhhhhhhCCCccccceeEEccCceeeeeehhhhhhccCCCCcchhh-eEecccccc-ccccceEEEEEEecceEEEEEE
Confidence 346667777666542 2799999999999999998876655433222 222221111 1123356788999999999999
Q ss_pred EEEeeCC
Q 045020 145 LVVKATS 151 (221)
Q Consensus 145 ~Vvd~~~ 151 (221)
.||..-+
T Consensus 111 lvvkkfs 117 (137)
T PF12382_consen 111 LVVKKFS 117 (137)
T ss_pred EEEEecc
Confidence 9997533
No 26
>COG4067 Uncharacterized protein conserved in archaea [Posttranslational modification, protein turnover, chaperones]
Probab=87.39 E-value=1 Score=35.91 Aligned_cols=42 Identities=26% Similarity=0.414 Sum_probs=36.0
Q ss_pred ceeeEEEEEcCeeEEEEEEEEeeCC-ceeeEEcchhhhhcccc
Q 045020 127 GVITLLVIAKSYSLLTDFLVVKATS-PYNVILGRQWIHKMRAV 168 (221)
Q Consensus 127 G~i~L~v~~g~~t~~~~F~Vvd~~~-~yn~ILGr~wL~~~~~i 168 (221)
=-|.+.+.+|+...++.|...|-.. .|.++|||..|+++.++
T Consensus 108 pVV~~~l~lG~~~~~~E~tLtDR~~m~Yp~LlGrk~l~~~~~~ 150 (162)
T COG4067 108 PVVRLTLCLGGRILPIEFTLTDRSNMRYPVLLGRKALRHFGAV 150 (162)
T ss_pred cEEEEEEeeCCeeeeEEEEeecccccccceEecHHHHhhCCeE
Confidence 3467899999999999999998655 79999999999996665
No 27
>PTZ00147 plasmepsin-1; Provisional
Probab=78.42 E-value=8.1 Score=36.13 Aligned_cols=34 Identities=15% Similarity=0.137 Sum_probs=23.7
Q ss_pred CCCCCc-eEEEEEEc--CeeeeEEEecCCCccccccHH
Q 045020 62 SPHDDA-LVISIKIG--NCLVKRILIDNGSSVNILDDA 96 (221)
Q Consensus 62 ~~h~~~-L~i~~~I~--~~~v~riLVD~GSsvnil~~~ 96 (221)
..+.+. .+..+.|| .... .+++||||+.-.++..
T Consensus 133 ~n~~n~~Y~~~I~IGTP~Q~f-~Vi~DTGSsdlWVps~ 169 (453)
T PTZ00147 133 KDLANVMSYGEAKLGDNGQKF-NFIFDTGSANLWVPSI 169 (453)
T ss_pred cccCCCEEEEEEEECCCCeEE-EEEEeCCCCcEEEeec
Confidence 344444 45688898 3444 5999999998888743
No 28
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=75.28 E-value=12 Score=35.11 Aligned_cols=32 Identities=13% Similarity=0.116 Sum_probs=22.7
Q ss_pred CCCceEEEEEEc--CeeeeEEEecCCCccccccHH
Q 045020 64 HDDALVISIKIG--NCLVKRILIDNGSSVNILDDA 96 (221)
Q Consensus 64 h~~~L~i~~~I~--~~~v~riLVD~GSsvnil~~~ 96 (221)
++.-.+..+.|| ...+ ++++||||+.=.++.+
T Consensus 135 ~n~~Yy~~i~IGTP~Q~f-~vi~DTGSsdlWV~s~ 168 (450)
T PTZ00013 135 ANIMFYGEGEVGDNHQKF-MLIFDTGSANLWVPSK 168 (450)
T ss_pred CCCEEEEEEEECCCCeEE-EEEEeCCCCceEEecc
Confidence 344456677887 4445 4999999998888754
No 29
>PF14223 UBN2: gag-polypeptide of LTR copia-type
Probab=63.24 E-value=7.7 Score=28.75 Aligned_cols=39 Identities=23% Similarity=0.363 Sum_probs=32.4
Q ss_pred CCCcchHHHHHHHHHHhccccccc----cHHHHHHHHhcCCCC
Q 045020 1 MGRVETLKYYIRRFRLASAKVENC----DDQLAIVTFKRGLPS 39 (221)
Q Consensus 1 ~~~~eslr~~~k~f~~~~l~ve~~----~~~~~~~~fk~~~~~ 39 (221)
|+.++|+.+|+.|+...+.++..+ ++...+..+-+++++
T Consensus 37 ~~~~~sv~~y~~~~~~i~~~L~~~g~~i~d~~~v~~iL~~Lp~ 79 (119)
T PF14223_consen 37 MKDGESVDEYISRLKEIVDELRAIGKPISDEDLVSKILRSLPP 79 (119)
T ss_pred hcccccHHHHHHHHHHhhhhhhhcCCcccchhHHHHHHhcCCc
Confidence 678999999999999988887664 666678888888774
No 30
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=60.83 E-value=22 Score=29.78 Aligned_cols=23 Identities=17% Similarity=0.343 Sum_probs=20.8
Q ss_pred eEEEecCCCccccccHHHHHhhc
Q 045020 80 KRILIDNGSSVNILDDATVEKMR 102 (221)
Q Consensus 80 ~riLVD~GSsvnil~~~~~~~mg 102 (221)
..+++|+|++...+|.+++.++-
T Consensus 203 ~~~iiDsGt~~~~lp~~~~~~l~ 225 (283)
T cd05471 203 GGAIVDSGTSLIYLPSSVYDAIL 225 (283)
T ss_pred cEEEEecCCCCEeCCHHHHHHHH
Confidence 56999999999999999999873
No 31
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=50.15 E-value=1e+02 Score=26.13 Aligned_cols=71 Identities=24% Similarity=0.243 Sum_probs=42.6
Q ss_pred EEEEEEcC--eeeeEEEecCCCccccccHHHHHhhccccccccccccceecccCCcccccceee-EEEEEcCeeE-EEEE
Q 045020 69 VISIKIGN--CLVKRILIDNGSSVNILDDATVEKMRLEQDKIKLFEQPLIGLSGIVTNSDGVIT-LLVIAKSYSL-LTDF 144 (221)
Q Consensus 69 ~i~~~I~~--~~v~riLVD~GSsvnil~~~~~~~mg~~~~~L~~~~~~l~gf~g~~~~~lG~i~-L~v~~g~~t~-~~~F 144 (221)
++.+.||. ..+ .+++|+||+.-.++ .+ -..|.. .....|.+. =.+++|+.+. +..|
T Consensus 4 ~~~i~iGtp~q~~-~v~~DTgS~~~wv~-------~~-----------~~~Y~~-g~~~~G~~~~D~v~~g~~~~~~~~f 63 (295)
T cd05474 4 SAELSVGTPPQKV-TVLLDTGSSDLWVP-------DF-----------SISYGD-GTSASGTWGTDTVSIGGATVKNLQF 63 (295)
T ss_pred EEEEEECCCCcEE-EEEEeCCCCcceee-------ee-----------EEEecc-CCcEEEEEEEEEEEECCeEecceEE
Confidence 56777876 444 49999999998888 00 011221 112344442 4456666544 5778
Q ss_pred EEEeeCCceeeEEcc
Q 045020 145 LVVKATSPYNVILGR 159 (221)
Q Consensus 145 ~Vvd~~~~yn~ILGr 159 (221)
.++.......-|||-
T Consensus 64 g~~~~~~~~~GilGL 78 (295)
T cd05474 64 AVANSTSSDVGVLGI 78 (295)
T ss_pred EEEecCCCCcceeeE
Confidence 888765666778763
No 32
>PF13260 DUF4051: Protein of unknown function (DUF4051)
Probab=47.04 E-value=23 Score=22.59 Aligned_cols=18 Identities=33% Similarity=0.547 Sum_probs=14.1
Q ss_pred HHHHHHHHHHhccccccccHHHHHHH
Q 045020 7 LKYYIRRFRLASAKVENCDDQLAIVT 32 (221)
Q Consensus 7 lr~~~k~f~~~~l~ve~~~~~~~~~~ 32 (221)
+|-|||-|+| +.|+.+++
T Consensus 22 mkrycrafrq--------drdallea 39 (54)
T PF13260_consen 22 MKRYCRAFRQ--------DRDALLEA 39 (54)
T ss_pred HHHHHHHHhh--------hHHHHHHH
Confidence 6889999999 66666554
No 33
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides which
Probab=45.68 E-value=92 Score=26.27 Aligned_cols=18 Identities=28% Similarity=0.552 Sum_probs=16.0
Q ss_pred EEEecCCCccccccHHHH
Q 045020 81 RILIDNGSSVNILDDATV 98 (221)
Q Consensus 81 riLVD~GSsvnil~~~~~ 98 (221)
.+++|+|++.-.+|.+++
T Consensus 178 ~ai~DTGTs~~~lp~~~~ 195 (265)
T cd05476 178 GTIIDSGTTLTYLPDPAY 195 (265)
T ss_pred cEEEeCCCcceEcCcccc
Confidence 389999999999998887
No 34
>PF00026 Asp: Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.; InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23.-) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human). More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family does not include the retroviral or retrotransposon aspartic proteases which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=45.50 E-value=32 Score=29.44 Aligned_cols=31 Identities=29% Similarity=0.460 Sum_probs=24.7
Q ss_pred EEEEcCeee-----eEEEecCCCccccccHHHHHhh
Q 045020 71 SIKIGNCLV-----KRILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 71 ~~~I~~~~v-----~riLVD~GSsvnil~~~~~~~m 101 (221)
.+.+++... ..+++|+|++.-.+|...++++
T Consensus 186 ~i~i~~~~~~~~~~~~~~~Dtgt~~i~lp~~~~~~i 221 (317)
T PF00026_consen 186 SISIGGESVFSSSGQQAILDTGTSYIYLPRSIFDAI 221 (317)
T ss_dssp EEEETTEEEEEEEEEEEEEETTBSSEEEEHHHHHHH
T ss_pred cccccccccccccceeeecccccccccccchhhHHH
Confidence 456766622 3699999999999999988887
No 35
>PF00026 Asp: Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.; InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23.-) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human). More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family does not include the retroviral or retrotransposon aspartic proteases which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=45.42 E-value=1.1e+02 Score=25.91 Aligned_cols=89 Identities=22% Similarity=0.284 Sum_probs=46.1
Q ss_pred EEEEEEc--CeeeeEEEecCCCccccccHHHHH-------hhcccccc---ccccccc-eecccCCcccccceee-EEEE
Q 045020 69 VISIKIG--NCLVKRILIDNGSSVNILDDATVE-------KMRLEQDK---IKLFEQP-LIGLSGIVTNSDGVIT-LLVI 134 (221)
Q Consensus 69 ~i~~~I~--~~~v~riLVD~GSsvnil~~~~~~-------~mg~~~~~---L~~~~~~-l~gf~g~~~~~lG~i~-L~v~ 134 (221)
++.+.|| ...+ ++++|+||+.-.++.+... +-.+.+.. ..+.+.. -..|... . .-|.+. =.+.
T Consensus 3 ~~~v~iGtp~q~~-~~~iDTGS~~~wv~~~~c~~~~~~~~~~~y~~~~S~t~~~~~~~~~~~y~~g-~-~~G~~~~D~v~ 79 (317)
T PF00026_consen 3 YINVTIGTPPQTF-RVLIDTGSSDTWVPSSNCNSCSSCASSGFYNPSKSSTFSNQGKPFSISYGDG-S-VSGNLVSDTVS 79 (317)
T ss_dssp EEEEEETTTTEEE-EEEEETTBSSEEEEBTTECSHTHHCTSC-BBGGGSTTEEEEEEEEEEEETTE-E-EEEEEEEEEEE
T ss_pred EEEEEECCCCeEE-EEEEecccceeeeceeccccccccccccccccccccccccceeeeeeeccCc-c-cccccccceEe
Confidence 5678888 4554 4999999997777632111 11111110 0001111 1123222 2 445442 4566
Q ss_pred EcCeeEE-EEEEEEeeC-------CceeeEEcch
Q 045020 135 AKSYSLL-TDFLVVKAT-------SPYNVILGRQ 160 (221)
Q Consensus 135 ~g~~t~~-~~F~Vvd~~-------~~yn~ILGr~ 160 (221)
+|+.+.. ..|..++.. .+++-|||-.
T Consensus 80 ig~~~~~~~~f~~~~~~~~~~~~~~~~~GilGLg 113 (317)
T PF00026_consen 80 IGGLTIPNQTFGLADSYSGDPFSPIPFDGILGLG 113 (317)
T ss_dssp ETTEEEEEEEEEEEEEEESHHHHHSSSSEEEE-S
T ss_pred eeeccccccceecccccccccccccccccccccc
Confidence 7777664 888888772 2457777765
No 36
>KOG2209 consensus Oxysterol-binding protein [Signal transduction mechanisms]
Probab=39.45 E-value=51 Score=29.71 Aligned_cols=84 Identities=18% Similarity=0.236 Sum_probs=46.2
Q ss_pred cHHHHHhhccccc-cccccccceecccCCc--ccccceeeEEEEE-cCeeEE--EEEEEEeeCCceeeEEcchhhhhcc-
Q 045020 94 DDATVEKMRLEQD-KIKLFEQPLIGLSGIV--TNSDGVITLLVIA-KSYSLL--TDFLVVKATSPYNVILGRQWIHKMR- 166 (221)
Q Consensus 94 ~~~~~~~mg~~~~-~L~~~~~~l~gf~g~~--~~~lG~i~L~v~~-g~~t~~--~~F~Vvd~~~~yn~ILGr~wL~~~~- 166 (221)
|.++|..-|++.. ....+-.|-.+|=|.+ ..|.|+|+|...- +..+.+ ..+- -||+|+|+-|+...+
T Consensus 149 PiSAfhaEgl~~dF~fhGsi~PklkFWgksvea~Pkgtitle~~k~nEaYtWtnp~Cc------vhNiIvGklwieqyg~ 222 (445)
T KOG2209|consen 149 PISAFHAEGLNNDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLKHNEAYTWTNPTCC------VHNIIVGKLWIEQYGN 222 (445)
T ss_pred ChhHhhhcccCcceEEeeeecccceeccceeecCCCceEEEEecccCcceeccCCcce------eeeehhhhhhHhhcCc
Confidence 4456666655443 2333334555565653 6788888876543 332221 1111 379999999999654
Q ss_pred --ccccCc-cceeeeecCCc
Q 045020 167 --AVPSTC-HQVLCYQTIYG 183 (221)
Q Consensus 167 --~i~st~-~q~vk~~~~~g 183 (221)
++.|-. |-|+---.+.|
T Consensus 223 ~eI~nh~Tg~~~vl~Fk~~G 242 (445)
T KOG2209|consen 223 VEIINHKTGHKCVLNFKPCG 242 (445)
T ss_pred EEEEecCccceeEEeccccc
Confidence 444433 66654333443
No 37
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/thr motif in both N- and C-terminal l
Probab=35.61 E-value=1.5e+02 Score=25.87 Aligned_cols=22 Identities=23% Similarity=0.504 Sum_probs=19.9
Q ss_pred EEEecCCCccccccHHHHHhhc
Q 045020 81 RILIDNGSSVNILDDATVEKMR 102 (221)
Q Consensus 81 riLVD~GSsvnil~~~~~~~mg 102 (221)
.++||+|++.-.+|..+++++-
T Consensus 233 ~aivDSGTs~~~lp~~~~~~l~ 254 (326)
T cd06096 233 GMLVDSGSTLSHFPEDLYNKIN 254 (326)
T ss_pred CEEEeCCCCcccCCHHHHHHHH
Confidence 3799999999999999999874
No 38
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=34.83 E-value=2.8e+02 Score=23.40 Aligned_cols=21 Identities=24% Similarity=0.525 Sum_probs=19.2
Q ss_pred EEEecCCCccccccHHHHHhh
Q 045020 81 RILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 81 riLVD~GSsvnil~~~~~~~m 101 (221)
.++||+|++.=.+|..+++.+
T Consensus 180 ~~iiDSGt~~~~lP~~~~~~l 200 (295)
T cd05474 180 PALLDSGTTLTYLPSDIVDAI 200 (295)
T ss_pred cEEECCCCccEeCCHHHHHHH
Confidence 589999999999999999886
No 39
>PF08990 Docking: Erythronolide synthase docking; InterPro: IPR015083 The N-terminal docking domain found in modular polyketide synthase assumes an alpha-helical structure, wherein two alpha-helices are connected by a short loop. Two such N-terminal domains dimerise to form amphipathic parallel alpha-helical coiled coils: dimerisation is essential for protein function []. ; GO: 0016740 transferase activity, 0048037 cofactor binding; PDB: 2HG4_E.
Probab=33.58 E-value=25 Score=19.63 Aligned_cols=14 Identities=29% Similarity=0.667 Sum_probs=9.5
Q ss_pred CCCcchHHHHHHHH
Q 045020 1 MGRVETLKYYIRRF 14 (221)
Q Consensus 1 ~~~~eslr~~~k~f 14 (221)
|-.++-||+|.|+-
T Consensus 1 M~~e~kLr~YLkr~ 14 (27)
T PF08990_consen 1 MANEDKLRDYLKRV 14 (27)
T ss_dssp ---HCHHHHHHHHH
T ss_pred CCcHHHHHHHHHHH
Confidence 56778899999984
No 40
>PF13867 SAP30_Sin3_bdg: Sin3 binding region of histone deacetylase complex subunit SAP30; PDB: 2LD7_A.
Probab=32.52 E-value=41 Score=21.69 Aligned_cols=30 Identities=13% Similarity=0.339 Sum_probs=16.6
Q ss_pred cchHHHHHHHHHHhccccccccHHHHHHHHhc
Q 045020 4 VETLKYYIRRFRLASAKVENCDDQLAIVTFKR 35 (221)
Q Consensus 4 ~eslr~~~k~f~~~~l~ve~~~~~~~~~~fk~ 35 (221)
..|||.|+++|++...+ ..+.++...+.++
T Consensus 3 ~~tLrrY~~~~~l~~~~--~~sK~qLa~~V~k 32 (53)
T PF13867_consen 3 TPTLRRYKKHYKLPERP--RSSKEQLANAVRK 32 (53)
T ss_dssp HHHHHHHHHHTT----S--S--HHHHHHHHHH
T ss_pred hHHHHHHHHHhCCCCCC--CCCHHHHHHHHHH
Confidence 46899999999887665 6666655444433
No 41
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with reduced amounts of CND41 maintained green leaves and constant protein levels, especially Rubisco. CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This fami
Probab=27.52 E-value=77 Score=27.21 Aligned_cols=31 Identities=19% Similarity=0.429 Sum_probs=24.4
Q ss_pred EEEEcCeeee---------EEEecCCCccccccHHHHHhh
Q 045020 71 SIKIGNCLVK---------RILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 71 ~~~I~~~~v~---------riLVD~GSsvnil~~~~~~~m 101 (221)
.+.+++..+. .++||+|++.-.+|..+++.+
T Consensus 154 ~i~vg~~~~~~~~~~~~~~~~ivDSGTt~~~lp~~~~~~l 193 (299)
T cd05472 154 GISVGGRRLPIPPASFGAGGVIIDSGTVITRLPPSAYAAL 193 (299)
T ss_pred EEEECCEECCCCccccCCCCeEEeCCCcceecCHHHHHHH
Confidence 3467766543 489999999999999988876
No 42
>PTZ00165 aspartyl protease; Provisional
Probab=26.02 E-value=4.8e+02 Score=24.61 Aligned_cols=21 Identities=24% Similarity=0.351 Sum_probs=18.8
Q ss_pred EEEecCCCccccccHHHHHhh
Q 045020 81 RILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 81 riLVD~GSsvnil~~~~~~~m 101 (221)
.+++|+|++.-.+|...++++
T Consensus 329 ~aIiDTGTSli~lP~~~~~~i 349 (482)
T PTZ00165 329 KAAIDTGSSLITGPSSVINPL 349 (482)
T ss_pred EEEEcCCCccEeCCHHHHHHH
Confidence 489999999999999988776
No 43
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=24.02 E-value=2.3e+02 Score=23.95 Aligned_cols=22 Identities=32% Similarity=0.463 Sum_probs=19.0
Q ss_pred eEEEecCCCccccccHHHHHhh
Q 045020 80 KRILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 80 ~riLVD~GSsvnil~~~~~~~m 101 (221)
..++||+|++.-.+|..+++++
T Consensus 199 ~~~iiDSGTs~~~lP~~~~~~l 220 (278)
T cd06097 199 FSAIADTGTTLILLPDAIVEAY 220 (278)
T ss_pred ceEEeecCCchhcCCHHHHHHH
Confidence 4489999999999998888776
No 44
>cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site
Probab=23.28 E-value=75 Score=22.57 Aligned_cols=25 Identities=32% Similarity=0.424 Sum_probs=17.6
Q ss_pred EEEEcC--eeeeEEEecCCCccccccHH
Q 045020 71 SIKIGN--CLVKRILIDNGSSVNILDDA 96 (221)
Q Consensus 71 ~~~I~~--~~v~riLVD~GSsvnil~~~ 96 (221)
.+.|+. ..+ .+++|+||+.=.++.+
T Consensus 2 ~i~vGtP~q~~-~~~~DTGSs~~Wv~~~ 28 (109)
T cd05470 2 EIGIGTPPQTF-NVLLDTGSSNLWVPSV 28 (109)
T ss_pred EEEeCCCCceE-EEEEeCCCCCEEEeCC
Confidence 455664 444 4999999997777654
No 45
>PF09830 ATP_transf: ATP adenylyltransferase; InterPro: IPR019200 Diadenosine 5',5'''-P-1,P-4-tetraphosphate (Ap4A) and related diadenosine oligophosphates such as Ap3A are important intracellular and extracellular signalling molecules in prokaryotes and eukaryotes. They are implicated in the regulation of many vital cellular functions including stress response, cell division and apoptosis. Synthesis primarily occurs via aminoacyl-tRNA synthetases adding the AMP moiety of an aminoacyl-AMP to an acceptor nucleotide, and is an inevitable byproduct of protein synthesis. The concentration of these compounds must thus be controlled both to ensure the proper regulation of various cellular processes and to prevent their buildup to potentially toxic levels. This domain is found in a group of ATP adenylyltransferases found in bacteria and lower eukaryotes which catalyse the interconversion of Ap4A to ATP and ADP. While these enzymes are thought to act primarily to break down Ap4A, there is evidence to suggest that in some circumstances they may also act in a biosynthetic role. Some variability in substrate range is apparent, e.g. the cyanobacterial enzyme can also utilise Ap3A as a substrate, while the Saccharomyces enzymes apparently cannot.; GO: 0003877 ATP adenylyltransferase activity
Probab=22.84 E-value=40 Score=22.51 Aligned_cols=18 Identities=33% Similarity=0.896 Sum_probs=13.8
Q ss_pred ceeeEEcchhhhhccccccCc
Q 045020 152 PYNVILGRQWIHKMRAVPSTC 172 (221)
Q Consensus 152 ~yn~ILGr~wL~~~~~i~st~ 172 (221)
|||.+|-+.|+ .+||-+.
T Consensus 1 ~yNll~T~~wm---~lvPR~~ 18 (62)
T PF09830_consen 1 SYNLLMTRRWM---MLVPRSR 18 (62)
T ss_pred CceEEEecCeE---EEEeccc
Confidence 69999999994 5666444
No 46
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two
Probab=21.60 E-value=1.2e+02 Score=26.86 Aligned_cols=21 Identities=14% Similarity=0.245 Sum_probs=19.0
Q ss_pred EEEecCCCccccccHHHHHhh
Q 045020 81 RILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 81 riLVD~GSsvnil~~~~~~~m 101 (221)
.++||+|++.-.+|...++++
T Consensus 213 ~~ivDSGTs~~~lp~~~~~~l 233 (364)
T cd05473 213 KAIVDSGTTNLRLPVKVFNAA 233 (364)
T ss_pred cEEEeCCCcceeCCHHHHHHH
Confidence 489999999999999998875
No 47
>TIGR02854 spore_II_GA sigma-E processing peptidase SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria.
Probab=21.24 E-value=4.7e+02 Score=22.79 Aligned_cols=35 Identities=14% Similarity=0.281 Sum_probs=25.7
Q ss_pred ceEEEEEEcCeee-eEEEecCCCc---------cccccHHHHHhh
Q 045020 67 ALVISIKIGNCLV-KRILIDNGSS---------VNILDDATVEKM 101 (221)
Q Consensus 67 ~L~i~~~I~~~~v-~riLVD~GSs---------vnil~~~~~~~m 101 (221)
-.-+++.+++..+ -+.|+|||.. |-++..+.++++
T Consensus 158 ~~~v~i~~~g~~~~~~alvDTGN~L~DPlT~~PV~Ive~~~~~~~ 202 (288)
T TIGR02854 158 IYELEICLDGKKVTIKGFLDTGNQLRDPLTKLPVIVVEYDSLKSI 202 (288)
T ss_pred EEEEEEEECCEEEEEEEEEecCCcccCCCCCCCEEEEEHHHhhhh
Confidence 3456778888876 6799998854 667777777765
No 48
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family, it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=20.08 E-value=1.2e+02 Score=26.22 Aligned_cols=30 Identities=17% Similarity=0.470 Sum_probs=23.4
Q ss_pred EEEcCeee-----eEEEecCCCccccccHHHHHhh
Q 045020 72 IKIGNCLV-----KRILIDNGSSVNILDDATVEKM 101 (221)
Q Consensus 72 ~~I~~~~v-----~riLVD~GSsvnil~~~~~~~m 101 (221)
+.|++..+ ..++||+|++.-.+|...++++
T Consensus 186 i~v~g~~~~~~~~~~aiiDTGTs~~~lP~~~~~~l 220 (316)
T cd05486 186 IQVGGTVIFCSDGCQAIVDTGTSLITGPSGDIKQL 220 (316)
T ss_pred EEEecceEecCCCCEEEECCCcchhhcCHHHHHHH
Confidence 45666543 2489999999999999988776
Done!