Query         028157
Match_columns 213
No_of_seqs    115 out of 1064
Neff          9.0 
Searched_HMMs 46136
Date          Fri Mar 29 07:08:00 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/028157.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/028157hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd05489 xylanase_inhibitor_I_l 100.0 8.7E-43 1.9E-47  295.5  22.8  210    1-210   129-361 (362)
  2 cd05472 cnd41_like Chloroplast 100.0 9.1E-43   2E-47  289.0  21.9  210    1-212    85-299 (299)
  3 PLN03146 aspartyl protease fam 100.0 1.1E-39 2.3E-44  282.3  22.1  205    1-213   216-429 (431)
  4 cd05476 pepsin_A_like_plant Ch 100.0 3.4E-37 7.3E-42  251.6  18.7  173    1-212    84-265 (265)
  5 KOG1339 Aspartyl protease [Pos 100.0 1.9E-35 4.1E-40  254.0  20.5  206    1-213   177-397 (398)
  6 cd05473 beta_secretase_like Be 100.0 1.3E-35 2.8E-40  252.4  18.8  207    1-213   109-348 (364)
  7 cd05475 nucellin_like Nucellin 100.0 2.4E-34 5.1E-39  235.8  18.5  166    1-212    98-273 (273)
  8 cd05478 pepsin_A Pepsin A, asp 100.0 2.9E-34 6.4E-39  239.9  16.2  186    1-209   118-317 (317)
  9 cd05474 SAP_like SAPs, pepsin- 100.0 2.9E-34 6.2E-39  237.4  15.2  195    1-210    76-295 (295)
 10 PF14541 TAXi_C:  Xylanase inhi 100.0 1.8E-33   4E-38  213.4  15.0  150   60-209     1-161 (161)
 11 cd05486 Cathespin_E Cathepsin  100.0   2E-33 4.4E-38  234.7  16.2  188    1-209   108-316 (316)
 12 cd05487 renin_like Renin stimu 100.0 3.4E-33 7.3E-38  234.3  17.3  189    1-210   117-326 (326)
 13 cd05477 gastricsin Gastricsins 100.0 4.4E-33 9.5E-38  232.9  17.4  189    1-210   111-318 (318)
 14 cd05490 Cathepsin_D2 Cathepsin 100.0 4.1E-33   9E-38  233.7  17.0  189    1-209   116-325 (325)
 15 PTZ00165 aspartyl protease; Pr 100.0 9.5E-33 2.1E-37  240.8  19.3  187    1-213   233-449 (482)
 16 cd05488 Proteinase_A_fungi Fun 100.0 4.1E-33 8.9E-38  233.3  16.0  186    1-209   118-320 (320)
 17 cd06096 Plasmepsin_5 Plasmepsi 100.0 5.1E-33 1.1E-37  233.2  16.0  170    1-213   134-326 (326)
 18 cd05485 Cathepsin_D_like Cathe 100.0 2.9E-32 6.2E-37  229.0  17.6  188    1-209   121-329 (329)
 19 cd06098 phytepsin Phytepsin, a 100.0 9.6E-32 2.1E-36  224.7  17.8  178    1-209   119-317 (317)
 20 PTZ00013 plasmepsin 4 (PM4); P 100.0 1.9E-31   4E-36  230.8  17.0  187    1-211   247-449 (450)
 21 PTZ00147 plasmepsin-1; Provisi 100.0 2.2E-31 4.8E-36  230.6  16.3  187    1-211   248-450 (453)
 22 PF00026 Asp:  Eukaryotic aspar 100.0 2.5E-28 5.4E-33  203.3  11.3  189    1-210   110-317 (317)
 23 cd06097 Aspergillopepsin_like  100.0 5.9E-28 1.3E-32  198.4  13.2  155    1-209   110-278 (278)
 24 cd05471 pepsin_like Pepsin-lik  99.9 5.6E-27 1.2E-31  192.1  14.1  160    1-209   109-283 (283)
 25 cd05479 RP_DDI RP_DDI; retrope  95.5    0.33 7.1E-06   34.9  10.3   26  182-207    99-124 (124)
 26 PF08284 RVP_2:  Retroviral asp  94.3    0.76 1.6E-05   33.6   9.6   29  182-210   104-132 (135)
 27 PF13650 Asp_protease_2:  Aspar  93.8    0.11 2.3E-06   34.5   4.0   30   67-106     2-31  (90)
 28 TIGR03698 clan_AA_DTGF clan AA  93.6    0.54 1.2E-05   32.9   7.3   23  182-204    84-106 (107)
 29 TIGR02281 clan_AA_DTGA clan AA  93.0    0.21 4.5E-06   35.8   4.6   36   58-106     9-44  (121)
 30 cd05484 retropepsin_like_LTR_2  92.8    0.22 4.7E-06   33.6   4.2   30   67-106     4-33  (91)
 31 PF13975 gag-asp_proteas:  gag-  92.8     0.3 6.6E-06   31.4   4.7   30   67-106    12-41  (72)
 32 cd05483 retropepsin_like_bacte  92.0    0.38 8.3E-06   32.1   4.7   31   66-106     5-35  (96)
 33 cd06095 RP_RTVL_H_like Retrope  90.3    0.48   1E-05   31.6   3.7   29   68-106     3-31  (86)
 34 PF00077 RVP:  Retroviral aspar  88.5    0.66 1.4E-05   31.6   3.5   28   66-103     8-35  (100)
 35 cd05481 retropepsin_like_LTR_1  84.7     1.4   3E-05   29.9   3.4   20   87-106    13-32  (93)
 36 PF09668 Asp_protease:  Asparty  84.5     1.4 3.1E-05   31.7   3.5   31   66-106    27-57  (124)
 37 COG3577 Predicted aspartyl pro  78.6     6.2 0.00014   30.9   5.3   35   58-105   103-137 (215)
 38 COG5550 Predicted aspartyl pro  68.8     3.8 8.3E-05   29.3   1.9   20   87-106    29-49  (125)
 39 PF12384 Peptidase_A2B:  Ty3 tr  66.7      18 0.00039   27.4   5.2   31   66-106    37-67  (177)
 40 cd05470 pepsin_retropepsin_lik  62.1     8.7 0.00019   26.2   2.7   16   87-102    14-29  (109)
 41 cd00303 retropepsin_like Retro  57.7      21 0.00046   21.8   3.9   19   87-105    12-30  (92)
 42 cd06094 RP_Saci_like RP_Saci_l  53.2      61  0.0013   21.9   5.4   75   86-194    11-87  (89)
 43 cd05480 NRIP_C NRIP_C; putativ  52.9      22 0.00047   24.6   3.3   29   68-106     3-31  (103)
 44 cd06097 Aspergillopepsin_like   52.7      15 0.00032   29.9   3.0   26   66-101     3-30  (278)
 45 PF05585 DUF1758:  Putative pep  41.9      16 0.00034   27.3   1.5   21   86-106    14-34  (164)
 46 cd06096 Plasmepsin_5 Plasmepsi  41.1      29 0.00064   28.9   3.1   29   65-101     5-33  (326)
 47 cd05482 HIV_retropepsin_like R  40.2      40 0.00086   22.6   3.0   23   68-100     3-25  (87)
 48 cd05474 SAP_like SAPs, pepsin-  39.0      38 0.00083   27.5   3.5   27   65-99      4-30  (295)
 49 PF00026 Asp:  Eukaryotic aspar  36.7      42 0.00092   27.4   3.4   26   66-101     4-31  (317)
 50 cd06098 phytepsin Phytepsin, a  31.6      59  0.0013   27.0   3.5   28   66-101    13-40  (317)
 51 cd05476 pepsin_A_like_plant Ch  31.0      39 0.00085   27.2   2.3   14   87-100    17-30  (265)
 52 cd05477 gastricsin Gastricsins  30.9      47   0.001   27.5   2.8   29   65-101     5-33  (318)
 53 TIGR03778 VPDSG_CTERM VPDSG-CT  30.5      11 0.00023   19.2  -0.7   13   90-102     3-15  (26)
 54 KOG0012 DNA damage inducible p  30.0      69  0.0015   27.4   3.5   38  170-209   307-345 (380)
 55 cd05478 pepsin_A Pepsin A, asp  28.5      53  0.0011   27.2   2.7   28   66-101    13-40  (317)
 56 PF14543 TAXi_N:  Xylanase inhi  26.3      47   0.001   24.8   1.8   14   87-100    16-29  (164)
 57 PTZ00147 plasmepsin-1; Provisi  25.7      87  0.0019   27.8   3.6   15   87-101   155-169 (453)
 58 cd05488 Proteinase_A_fungi Fun  25.4      61  0.0013   26.9   2.5   28   66-101    13-40  (320)
 59 PTZ00165 aspartyl protease; Pr  23.9      88  0.0019   28.0   3.3   29   66-102   123-151 (482)
 60 PLN03146 aspartyl protease fam  21.8      73  0.0016   28.0   2.4   27   66-100    87-113 (431)

No 1  
>cd05489 xylanase_inhibitor_I_like TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradation glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability 
Probab=100.00  E-value=8.7e-43  Score=295.51  Aligned_cols=210  Identities=26%  Similarity=0.401  Sum_probs=171.1

Q ss_pred             CCCCCCCChhhhhhCcC-----ceEEecCCCCCCccEEEECCCCC-CC------CCCeeEeecccCCCCCcceEEEEeEE
Q 028157            1 MGLDRSSVSIISKTNTS-----YFSYCLPSPYGSTAYITFGKPVS-VS------NKFIKYTPIVTTAEQSEYYDIILTGI   68 (213)
Q Consensus         1 ~Glg~~~~sl~~ql~~~-----~FS~cl~~~~~~~g~l~fG~~~~-~~------~~~~~y~pl~~~~~~~~~y~v~l~~i   68 (213)
                      ||||++++|+++||..+     .|||||++....+|+|+||+.+. .+      .+.+.||||++++..+.+|+|+|++|
T Consensus       129 lGLg~~~lSl~sql~~~~~~~~~FS~CL~~~~~~~g~l~fG~~~~~~~~~~~~~~~~~~~tPl~~~~~~~~~Y~v~l~~I  208 (362)
T cd05489         129 AGLGRSPLSLPAQLASAFGVARKFALCLPSSPGGPGVAIFGGGPYYLFPPPIDLSKSLSYTPLLTNPRKSGEYYIGVTSI  208 (362)
T ss_pred             cccCCCccchHHHhhhhcCCCcceEEEeCCCCCCCeeEEECCCchhcccccccccCCccccccccCCCCCCceEEEEEEE
Confidence            79999999999999863     49999987644689999999885 33      37899999998754457999999999


Q ss_pred             EECCeEeeccccccC-----CCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCC----cce
Q 028157           69 SVGGEKLPFKISYFT-----KLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAY----ETV  139 (213)
Q Consensus        69 ~vg~~~~~~~~~~~~-----~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~----~~~  139 (213)
                      +||++.+.+++..+.     .+++||||||++|+||+++|++|++++.+++...+.........+.||.....    ...
T Consensus       209 sVg~~~l~~~~~~~~~~~~~~~g~iiDSGTs~t~lp~~~y~~l~~a~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~  288 (362)
T cd05489         209 AVNGHAVPLNPTLSANDRLGPGGVKLSTVVPYTVLRSDIYRAFTQAFAKATARIPRVPAAAVFPELCYPASALGNTRLGY  288 (362)
T ss_pred             EECCEECCCCchhccccccCCCcEEEecCCceEEECHHHHHHHHHHHHHHhcccCcCCCCCCCcCccccCCCcCCccccc
Confidence            999999987654332     26899999999999999999999999998876433222211223689976431    135


Q ss_pred             ecCeEEEEEeC-CcEEEEcCCceEEEeCCCceEEEEEecCCC-CCeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          140 VVPKIAIHFLG-GVDLELDVRGTLVVASVSQVCLEFAIYPPD-LNSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       140 ~~P~l~~~f~~-g~~~~l~~~~y~~~~~~~~~C~~~~~~~~~-~~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                      .+|+|+|+|++ |++|+|+|++|++++..+..|++|...+.. ...||||+.|||+++++||++++|||||++
T Consensus       289 ~~P~it~~f~g~g~~~~l~~~ny~~~~~~~~~Cl~f~~~~~~~~~~~IlG~~~~~~~~vvyD~~~~riGfa~~  361 (362)
T cd05489         289 AVPAIDLVLDGGGVNWTIFGANSMVQVKGGVACLAFVDGGSEPRPAVVIGGHQMEDNLLVFDLEKSRLGFSSS  361 (362)
T ss_pred             ccceEEEEEeCCCeEEEEcCCceEEEcCCCcEEEEEeeCCCCCCceEEEeeheecceEEEEECCCCEeecccC
Confidence            79999999987 799999999999988777899999865422 357999999999999999999999999975


No 2  
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco.  CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. This fami
Probab=100.00  E-value=9.1e-43  Score=288.96  Aligned_cols=210  Identities=47%  Similarity=0.810  Sum_probs=171.7

Q ss_pred             CCCCCCCChhhhhhCcC---ceEEecCCCC-CCccEEEECCCCCCCCCCeeEeecccCCCCCcceEEEEeEEEECCeEee
Q 028157            1 MGLDRSSVSIISKTNTS---YFSYCLPSPY-GSTAYITFGKPVSVSNKFIKYTPIVTTAEQSEYYDIILTGISVGGEKLP   76 (213)
Q Consensus         1 ~Glg~~~~sl~~ql~~~---~FS~cl~~~~-~~~g~l~fG~~~~~~~~~~~y~pl~~~~~~~~~y~v~l~~i~vg~~~~~   76 (213)
                      ||||+..+|+++|+..+   .||+||++.. ..+|+|+||++|.. .+++.|+||+.++..+.+|.|+|++|+||++.+.
T Consensus        85 lGLg~~~~s~~~ql~~~~~~~FS~~L~~~~~~~~G~l~fGg~d~~-~g~l~~~pv~~~~~~~~~y~v~l~~i~vg~~~~~  163 (299)
T cd05472          85 LGLGRGKLSLPSQTASSYGGVFSYCLPDRSSSSSGYLSFGAAASV-PAGASFTPMLSNPRVPTFYYVGLTGISVGGRRLP  163 (299)
T ss_pred             EECCCCcchHHHHhhHhhcCceEEEccCCCCCCCceEEeCCcccc-CCCceECCCccCCCCCCeEEEeeEEEEECCEECC
Confidence            69999999999998764   5999998753 36899999999976 8899999999865445799999999999999987


Q ss_pred             ccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEEEeCCcEEEE
Q 028157           77 FKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIHFLGGVDLEL  156 (213)
Q Consensus        77 ~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~f~~g~~~~l  156 (213)
                      +++.....+++||||||+++++|+++|++|.+++.++....+..... .....||..++.....+|+|+|+|++|++++|
T Consensus       164 ~~~~~~~~~~~ivDSGTt~~~lp~~~~~~l~~~l~~~~~~~~~~~~~-~~~~~C~~~~~~~~~~~P~i~f~f~~g~~~~l  242 (299)
T cd05472         164 IPPASFGAGGVIIDSGTVITRLPPSAYAALRDAFRAAMAAYPRAPGF-SILDTCYDLSGFRSVSVPTVSLHFQGGADVEL  242 (299)
T ss_pred             CCccccCCCCeEEeCCCcceecCHHHHHHHHHHHHHHhccCCCCCCC-CCCCccCcCCCCcCCccCCEEEEECCCCEEEe
Confidence            65332334689999999999999999999999998876433222221 22346997765444689999999986899999


Q ss_pred             cCCceEEEe-CCCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeCCC
Q 028157          157 DVRGTLVVA-SVSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPGNC  212 (213)
Q Consensus       157 ~~~~y~~~~-~~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~C  212 (213)
                      +|++|++.. ..+..|+++.......+.||||+.|||++|+|||++++|||||+++|
T Consensus       243 ~~~~y~~~~~~~~~~C~~~~~~~~~~~~~ilG~~fl~~~~vvfD~~~~~igfa~~~C  299 (299)
T cd05472         243 DASGVLYPVDDSSQVCLAFAGTSDDGGLSIIGNVQQQTFRVVYDVAGGRIGFAPGGC  299 (299)
T ss_pred             CcccEEEEecCCCCEEEEEeCCCCCCCCEEEchHHccceEEEEECCCCEEeEecCCC
Confidence            999999843 44679998876532346799999999999999999999999999999


No 3  
>PLN03146 aspartyl protease family protein; Provisional
Probab=100.00  E-value=1.1e-39  Score=282.28  Aligned_cols=205  Identities=33%  Similarity=0.532  Sum_probs=167.0

Q ss_pred             CCCCCCCChhhhhhCcC---ceEEecCCCC---CCccEEEECCCCCCCCCCeeEeecccCCCCCcceEEEEeEEEECCeE
Q 028157            1 MGLDRSSVSIISKTNTS---YFSYCLPSPY---GSTAYITFGKPVSVSNKFIKYTPIVTTAEQSEYYDIILTGISVGGEK   74 (213)
Q Consensus         1 ~Glg~~~~sl~~ql~~~---~FS~cl~~~~---~~~g~l~fG~~~~~~~~~~~y~pl~~~~~~~~~y~v~l~~i~vg~~~   74 (213)
                      ||||++++|+++||...   .|||||++..   ...|.|+||+........+.||||+.+.. +.+|.|+|++|+||++.
T Consensus       216 lGLG~~~~Sl~sql~~~~~~~FSycL~~~~~~~~~~g~l~fG~~~~~~~~~~~~tPl~~~~~-~~~y~V~L~gIsVgg~~  294 (431)
T PLN03146        216 VGLGGGPLSLISQLGSSIGGKFSYCLVPLSSDSNGTSKINFGTNAIVSGSGVVSTPLVSKDP-DTFYYLTLEAISVGSKK  294 (431)
T ss_pred             EecCCCCccHHHHhhHhhCCcEEEECCCCCCCCCCcceEEeCCccccCCCCceEcccccCCC-CCeEEEeEEEEEECCEE
Confidence            69999999999998763   4999997532   24799999996433334589999986532 47999999999999999


Q ss_pred             eeccccccC---CCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEEEeCC
Q 028157           75 LPFKISYFT---KLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIHFLGG  151 (213)
Q Consensus        75 ~~~~~~~~~---~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~f~~g  151 (213)
                      +.++...|.   .+++||||||++|+||+++|++|++++.++++..+..... ..+..||....  ...+|+|+|+|+ |
T Consensus       295 l~~~~~~~~~~~~g~~iiDSGTt~t~Lp~~~y~~l~~~~~~~~~~~~~~~~~-~~~~~C~~~~~--~~~~P~i~~~F~-G  370 (431)
T PLN03146        295 LPYTGSSKNGVEEGNIIIDSGTTLTLLPSDFYSELESAVEEAIGGERVSDPQ-GLLSLCYSSTS--DIKLPIITAHFT-G  370 (431)
T ss_pred             CcCCccccccCCCCcEEEeCCccceecCHHHHHHHHHHHHHHhccccCCCCC-CCCCccccCCC--CCCCCeEEEEEC-C
Confidence            988765542   2579999999999999999999999999888642222111 34578997532  246899999997 8


Q ss_pred             cEEEEcCCceEEEeCCCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeCCCC
Q 028157          152 VDLELDVRGTLVVASVSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPGNCS  213 (213)
Q Consensus       152 ~~~~l~~~~y~~~~~~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~C~  213 (213)
                      +++.|+|++|++....+..|+++....   +.||||+.|||+++++||++++|||||+.+|+
T Consensus       371 a~~~l~~~~~~~~~~~~~~Cl~~~~~~---~~~IlG~~~q~~~~vvyDl~~~~igFa~~~C~  429 (431)
T PLN03146        371 ADVKLQPLNTFVKVSEDLVCFAMIPTS---SIAIFGNLAQMNFLVGYDLESKTVSFKPTDCT  429 (431)
T ss_pred             CeeecCcceeEEEcCCCcEEEEEecCC---CceEECeeeEeeEEEEEECCCCEEeeecCCcC
Confidence            999999999999876678999987542   46999999999999999999999999999995


No 4  
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  The enzymes specifically cleave bonds in peptides which 
Probab=100.00  E-value=3.4e-37  Score=251.59  Aligned_cols=173  Identities=38%  Similarity=0.688  Sum_probs=150.2

Q ss_pred             CCCCCCCChhhhhhCcC--ceEEecCCC--CCCccEEEECCCCCCCCCCeeEeecccCCCCCcceEEEEeEEEECCeEee
Q 028157            1 MGLDRSSVSIISKTNTS--YFSYCLPSP--YGSTAYITFGKPVSVSNKFIKYTPIVTTAEQSEYYDIILTGISVGGEKLP   76 (213)
Q Consensus         1 ~Glg~~~~sl~~ql~~~--~FS~cl~~~--~~~~g~l~fG~~~~~~~~~~~y~pl~~~~~~~~~y~v~l~~i~vg~~~~~   76 (213)
                      ||||+...|+++||..+  .||+||++.  ...+|+|+||++|..+.+++.|+|++.++....+|.|++++|+|+++.+.
T Consensus        84 lGLg~~~~s~~~ql~~~~~~Fs~~l~~~~~~~~~G~l~fGg~d~~~~~~l~~~p~~~~~~~~~~~~v~l~~i~v~~~~~~  163 (265)
T cd05476          84 LGLGRGPLSLVSQLGSTGNKFSYCLVPHDDTGGSSPLILGDAADLGGSGVVYTPLVKNPANPTYYYVNLEGISVGGKRLP  163 (265)
T ss_pred             EECCCCcccHHHHhhcccCeeEEEccCCCCCCCCCeEEECCcccccCCCceEeecccCCCCCCceEeeeEEEEECCEEec
Confidence            69999999999999988  699999874  24689999999996688999999999864345799999999999999887


Q ss_pred             cccccc-----CCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEEEeCC
Q 028157           77 FKISYF-----TKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIHFLGG  151 (213)
Q Consensus        77 ~~~~~~-----~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~f~~g  151 (213)
                      +++..+     ....+||||||+++++|+++|                                      |+|+|+|.+|
T Consensus       164 ~~~~~~~~~~~~~~~ai~DTGTs~~~lp~~~~--------------------------------------P~i~~~f~~~  205 (265)
T cd05476         164 IPPSVFAIDSDGSGGTIIDSGTTLTYLPDPAY--------------------------------------PDLTLHFDGG  205 (265)
T ss_pred             CCchhcccccCCCCcEEEeCCCcceEcCcccc--------------------------------------CCEEEEECCC
Confidence            543211     125799999999999999876                                      8899999768


Q ss_pred             cEEEEcCCceEEEeCCCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeCCC
Q 028157          152 VDLELDVRGTLVVASVSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPGNC  212 (213)
Q Consensus       152 ~~~~l~~~~y~~~~~~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~C  212 (213)
                      .++.+++++|++....+..|+++.... ..+.||||+.|||++|++||++++|||||+++|
T Consensus       206 ~~~~i~~~~y~~~~~~~~~C~~~~~~~-~~~~~ilG~~fl~~~~~vFD~~~~~iGfa~~~C  265 (265)
T cd05476         206 ADLELPPENYFVDVGEGVVCLAILSSS-SGGVSILGNIQQQNFLVEYDLENSRLGFAPADC  265 (265)
T ss_pred             CEEEeCcccEEEECCCCCEEEEEecCC-CCCcEEEChhhcccEEEEEECCCCEEeeecCCC
Confidence            999999999999776678999987653 357899999999999999999999999999999


No 5  
>KOG1339 consensus Aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=1.9e-35  Score=253.96  Aligned_cols=206  Identities=36%  Similarity=0.563  Sum_probs=169.4

Q ss_pred             CCCCCCCChhhhhhCcCc-----eEEecCCCCC---CccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEeEEEEC
Q 028157            1 MGLDRSSVSIISKTNTSY-----FSYCLPSPYG---STAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILTGISVG   71 (213)
Q Consensus         1 ~Glg~~~~sl~~ql~~~~-----FS~cl~~~~~---~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~~i~vg   71 (213)
                      ||||++++|+++|+....     ||+||.+...   .+|.|+||+.|. ++.+.+.|+||+.++.  .+|.|++++|+||
T Consensus       177 lGLg~~~~S~~~q~~~~~~~~~~FS~cL~~~~~~~~~~G~i~fG~~d~~~~~~~l~~tPl~~~~~--~~y~v~l~~I~vg  254 (398)
T KOG1339|consen  177 LGLGRGSLSVPSQLPSFYNAINVFSYCLSSNGSPSSGGGSIIFGGVDSSHYTGSLTYTPLLSNPS--TYYQVNLDGISVG  254 (398)
T ss_pred             eecCCCCccceeecccccCCceeEEEEeCCCCCCCCCCcEEEECCCcccCcCCceEEEeeccCCC--ccEEEEEeEEEEC
Confidence            699999999999999875     9999998753   489999999996 6788999999999642  5999999999999


Q ss_pred             CeEeeccccccCC--CcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEEEe
Q 028157           72 GEKLPFKISYFTK--LSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIHFL  149 (213)
Q Consensus        72 ~~~~~~~~~~~~~--~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~f~  149 (213)
                      ++. .++...++.  +++|+||||++++||+++|++|++++.++.. ...  ..+.....|+...... ..+|.|+|+|.
T Consensus       255 g~~-~~~~~~~~~~~~~~iiDSGTs~t~lp~~~y~~i~~~~~~~~~-~~~--~~~~~~~~C~~~~~~~-~~~P~i~~~f~  329 (398)
T KOG1339|consen  255 GKR-PIGSSLFCTDGGGAIIDSGTSLTYLPTSAYNALREAIGAEVS-VVG--TDGEYFVPCFSISTSG-VKLPDITFHFG  329 (398)
T ss_pred             Ccc-CCCcceEecCCCCEEEECCcceeeccHHHHHHHHHHHHhhee-ccc--cCCceeeecccCCCCc-ccCCcEEEEEC
Confidence            977 655555543  6899999999999999999999999988741 000  1113456899875322 45999999998


Q ss_pred             CCcEEEEcCCceEEEeCCCce-EEEEEecCCCCCeeEecceeeeeeEEEEeCC-CCEEEEee--CCCC
Q 028157          150 GGVDLELDVRGTLVVASVSQV-CLEFAIYPPDLNSITLGNVQQRGHEVHYDVG-GRRLGFGP--GNCS  213 (213)
Q Consensus       150 ~g~~~~l~~~~y~~~~~~~~~-C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~-~~riGfa~--~~C~  213 (213)
                      +|+.|.+++++|++.++.+.. |+++.........||||+.+||+++++||.. ++|||||+  ..|+
T Consensus       330 ~g~~~~l~~~~y~~~~~~~~~~Cl~~~~~~~~~~~~ilG~~~~~~~~~~~D~~~~~riGfa~~~~~c~  397 (398)
T KOG1339|consen  330 GGAVFSLPPKNYLVEVSDGGGVCLAFFNGMDSGPLWILGDVFQQNYLVVFDLGENSRVGFAPALTNCS  397 (398)
T ss_pred             CCcEEEeCccceEEEECCCCCceeeEEecCCCCceEEEchHHhCCEEEEEeCCCCCEEEeccccccCC
Confidence            789999999999998765444 9997665322258999999999999999999 99999999  7885


No 6  
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of pepsin family of aspartic proteases. Same as other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two 
Probab=100.00  E-value=1.3e-35  Score=252.44  Aligned_cols=207  Identities=23%  Similarity=0.351  Sum_probs=156.5

Q ss_pred             CCCCCCCC------------hhhhhhCcC-ceEEecCC---------CCCCccEEEECCCCC-CCCCCeeEeecccCCCC
Q 028157            1 MGLDRSSV------------SIISKTNTS-YFSYCLPS---------PYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQ   57 (213)
Q Consensus         1 ~Glg~~~~------------sl~~ql~~~-~FS~cl~~---------~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~   57 (213)
                      ||||+..+            +|++|...+ .||+||..         ....+|+|+||++|. ++.+++.|+|++.    
T Consensus       109 lGLg~~~l~~~~~~~~~~~~~l~~q~~~~~~FS~~l~~~~~~~~~~~~~~~~g~l~fGg~D~~~~~g~l~~~p~~~----  184 (364)
T cd05473         109 LGLAYAELARPDSSVEPFFDSLVKQTGIPDVFSLQMCGAGLPVNGSASGTVGGSMVIGGIDPSLYKGDIWYTPIRE----  184 (364)
T ss_pred             eeecccccccCCCCCCCHHHHHHhccCCccceEEEecccccccccccccCCCcEEEeCCcCHhhcCCCceEEecCc----
Confidence            68998766            566676554 49997632         112479999999996 7889999999976    


Q ss_pred             CcceEEEEeEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhh--ccccccccccccccccccC
Q 028157           58 SEYYDIILTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKK--YKKAKEFEDLLGTCYDLSA  135 (213)
Q Consensus        58 ~~~y~v~l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~--~~~~~~~~~~~~~C~~~~~  135 (213)
                      ..+|.|++++|+||++.+.++...+....+||||||+++++|+++|++|.+++.++...  .+..... .....|+....
T Consensus       185 ~~~~~v~l~~i~vg~~~~~~~~~~~~~~~~ivDSGTs~~~lp~~~~~~l~~~l~~~~~~~~~~~~~~~-~~~~~C~~~~~  263 (364)
T cd05473         185 EWYYEVIILKLEVGGQSLNLDCKEYNYDKAIVDSGTTNLRLPVKVFNAAVDAIKAASLIEDFPDGFWL-GSQLACWQKGT  263 (364)
T ss_pred             ceeEEEEEEEEEECCEecccccccccCccEEEeCCCcceeCCHHHHHHHHHHHHhhcccccCCccccC-cceeecccccC
Confidence            46899999999999999876554333346999999999999999999999999876531  1111000 11247986543


Q ss_pred             CcceecCeEEEEEeCC-----cEEEEcCCceEEEeC---CCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEE
Q 028157          136 YETVVVPKIAIHFLGG-----VDLELDVRGTLVVAS---VSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGF  207 (213)
Q Consensus       136 ~~~~~~P~l~~~f~~g-----~~~~l~~~~y~~~~~---~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGf  207 (213)
                      .....+|+|+|+|+++     .+++|+|++|+....   .+..|+.+.... ..+.||||+.|||++|+|||++++||||
T Consensus       264 ~~~~~~P~i~~~f~g~~~~~~~~l~l~p~~Y~~~~~~~~~~~~C~~~~~~~-~~~~~ILG~~flr~~yvvfD~~~~rIGf  342 (364)
T cd05473         264 TPWEIFPKISIYLRDENSSQSFRITILPQLYLRPVEDHGTQLDCYKFAISQ-STNGTVIGAVIMEGFYVVFDRANKRVGF  342 (364)
T ss_pred             chHhhCCcEEEEEccCCCCceEEEEECHHHhhhhhccCCCcceeeEEeeec-CCCceEEeeeeEcceEEEEECCCCEEee
Confidence            2234699999999752     478999999998642   246898754332 2356999999999999999999999999


Q ss_pred             eeCCCC
Q 028157          208 GPGNCS  213 (213)
Q Consensus       208 a~~~C~  213 (213)
                      |+++|+
T Consensus       343 a~~~C~  348 (364)
T cd05473         343 AVSTCA  348 (364)
T ss_pred             Eecccc
Confidence            999995


No 7  
>cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more d
Probab=100.00  E-value=2.4e-34  Score=235.78  Aligned_cols=166  Identities=22%  Similarity=0.387  Sum_probs=139.0

Q ss_pred             CCCCCCCChhhhhhCcC-----ceEEecCCCCCCccEEEECCCCCCCCCCeeEeecccCCCCCcceEEEEeEEEECCeEe
Q 028157            1 MGLDRSSVSIISKTNTS-----YFSYCLPSPYGSTAYITFGKPVSVSNKFIKYTPIVTTAEQSEYYDIILTGISVGGEKL   75 (213)
Q Consensus         1 ~Glg~~~~sl~~ql~~~-----~FS~cl~~~~~~~g~l~fG~~~~~~~~~~~y~pl~~~~~~~~~y~v~l~~i~vg~~~~   75 (213)
                      ||||+++.|+++||.++     .||+||++.  .+|.|+||+.. .+.+++.|+||.+++. ..+|.|++.+|+||++.+
T Consensus        98 lGLg~~~~s~~~ql~~~~~i~~~Fs~~l~~~--~~g~l~~G~~~-~~~g~i~ytpl~~~~~-~~~y~v~l~~i~vg~~~~  173 (273)
T cd05475          98 LGLGRGKISLPSQLASQGIIKNVIGHCLSSN--GGGFLFFGDDL-VPSSGVTWTPMRRESQ-KKHYSPGPASLLFNGQPT  173 (273)
T ss_pred             EECCCCCCCHHHHHHhcCCcCceEEEEccCC--CCeEEEECCCC-CCCCCeeecccccCCC-CCeEEEeEeEEEECCEEC
Confidence            69999999999999875     399999874  47999999543 3567899999987642 469999999999999854


Q ss_pred             eccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEEEeCC---c
Q 028157           76 PFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIHFLGG---V  152 (213)
Q Consensus        76 ~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~f~~g---~  152 (213)
                      ..     ...++||||||+++++|+++|                                     +|+|+|+|+++   +
T Consensus       174 ~~-----~~~~~ivDTGTt~t~lp~~~y-------------------------------------~p~i~~~f~~~~~~~  211 (273)
T cd05475         174 GG-----KGLEVVFDSGSSYTYFNAQAY-------------------------------------FKPLTLKFGKGWRTR  211 (273)
T ss_pred             cC-----CCceEEEECCCceEEcCCccc-------------------------------------cccEEEEECCCCcee
Confidence            32     124799999999999999876                                     48899999754   7


Q ss_pred             EEEEcCCceEEEeCCCceEEEEEecCC--CCCeeEecceeeeeeEEEEeCCCCEEEEeeCCC
Q 028157          153 DLELDVRGTLVVASVSQVCLEFAIYPP--DLNSITLGNVQQRGHEVHYDVGGRRLGFGPGNC  212 (213)
Q Consensus       153 ~~~l~~~~y~~~~~~~~~C~~~~~~~~--~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~C  212 (213)
                      +++|+|++|++....+..|+++.....  ..+.||||+.|||++|++||++++|||||+++|
T Consensus       212 ~~~l~~~~y~~~~~~~~~Cl~~~~~~~~~~~~~~ilG~~~l~~~~~vfD~~~~riGfa~~~C  273 (273)
T cd05475         212 LLEIPPENYLIISEKGNVCLGILNGSEIGLGNTNIIGDISMQGLMVIYDNEKQQIGWVRSDC  273 (273)
T ss_pred             EEEeCCCceEEEcCCCCEEEEEecCCCcCCCceEEECceEEEeeEEEEECcCCEeCcccCCC
Confidence            999999999987666779999875432  235799999999999999999999999999999


No 8  
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequence of the zymogens are self cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures, however their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which 
Probab=100.00  E-value=2.9e-34  Score=239.88  Aligned_cols=186  Identities=18%  Similarity=0.331  Sum_probs=148.4

Q ss_pred             CCCCCCCCh------hhhhhCcC------ceEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEeE
Q 028157            1 MGLDRSSVS------IISKTNTS------YFSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILTG   67 (213)
Q Consensus         1 ~Glg~~~~s------l~~ql~~~------~FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~~   67 (213)
                      ||||+..+|      ++.||.++      .||+||++....+|+|+||++|. ++.+++.|+|+..    +.+|.|.+++
T Consensus       118 lGLg~~~~s~~~~~~~~~~L~~~g~i~~~~FS~~L~~~~~~~g~l~~Gg~d~~~~~g~l~~~p~~~----~~~w~v~l~~  193 (317)
T cd05478         118 LGLAYPSIASSGATPVFDNMMSQGLVSQDLFSVYLSSNGQQGSVVTFGGIDPSYYTGSLNWVPVTA----ETYWQITVDS  193 (317)
T ss_pred             eeeccchhcccCCCCHHHHHHhCCCCCCCEEEEEeCCCCCCCeEEEEcccCHHHccCceEEEECCC----CcEEEEEeeE
Confidence            688876544      66676553      39999998655679999999996 7889999999975    4799999999


Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEE
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIH  147 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~  147 (213)
                      |+||++.+....    ...+||||||+++++|+++|++|.+++.+...   . ..  .....|+..     ..+|.|+|+
T Consensus       194 v~v~g~~~~~~~----~~~~iiDTGts~~~lp~~~~~~l~~~~~~~~~---~-~~--~~~~~C~~~-----~~~P~~~f~  258 (317)
T cd05478         194 VTINGQVVACSG----GCQAIVDTGTSLLVGPSSDIANIQSDIGASQN---Q-NG--EMVVNCSSI-----SSMPDVVFT  258 (317)
T ss_pred             EEECCEEEccCC----CCEEEECCCchhhhCCHHHHHHHHHHhCCccc---c-CC--cEEeCCcCc-----ccCCcEEEE
Confidence            999999886432    24699999999999999999999887744321   1 00  122467643     468999999


Q ss_pred             EeCCcEEEEcCCceEEEeCCCceEEE-EEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          148 FLGGVDLELDVRGTLVVASVSQVCLE-FAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       148 f~~g~~~~l~~~~y~~~~~~~~~C~~-~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |+ |++++|+|++|+...  ...|+. +.... ..+.||||+.|||++|++||++++|||||+
T Consensus       259 f~-g~~~~i~~~~y~~~~--~~~C~~~~~~~~-~~~~~IlG~~fl~~~y~vfD~~~~~iG~A~  317 (317)
T cd05478         259 IN-GVQYPLPPSAYILQD--QGSCTSGFQSMG-LGELWILGDVFIRQYYSVFDRANNKVGLAP  317 (317)
T ss_pred             EC-CEEEEECHHHheecC--CCEEeEEEEeCC-CCCeEEechHHhcceEEEEeCCCCEEeecC
Confidence            95 899999999999865  568985 65543 246799999999999999999999999996


No 9  
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=100.00  E-value=2.9e-34  Score=237.38  Aligned_cols=195  Identities=19%  Similarity=0.249  Sum_probs=155.0

Q ss_pred             CCCCCCCC-----------hhhhhhCcC------ceEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCC--CCcc
Q 028157            1 MGLDRSSV-----------SIISKTNTS------YFSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAE--QSEY   60 (213)
Q Consensus         1 ~Glg~~~~-----------sl~~ql~~~------~FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~--~~~~   60 (213)
                      ||||+.+.           |++.||..+      .||+||++.....|.|+||++|. ++.+++.|+|++.++.  ...+
T Consensus        76 lGLg~~~~~~~~~~~~~~~s~~~~L~~~g~i~~~~Fsl~l~~~~~~~g~l~~Gg~d~~~~~g~~~~~p~~~~~~~~~~~~  155 (295)
T cd05474          76 LGIGLPGNEATYGTGYTYPNFPIALKKQGLIKKNAYSLYLNDLDASTGSILFGGVDTAKYSGDLVTLPIVNDNGGSEPSE  155 (295)
T ss_pred             eeECCCCCcccccCCCcCCCHHHHHHHCCcccceEEEEEeCCCCCCceeEEEeeeccceeeceeEEEeCcCcCCCCCceE
Confidence            68888876           688888754      29999998644689999999995 7889999999988642  2378


Q ss_pred             eEEEEeEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCccee
Q 028157           61 YDIILTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVV  140 (213)
Q Consensus        61 y~v~l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~  140 (213)
                      |.|++++|+++++.+..+.. -....+||||||+++++|++++++|.+++.+....  . ..  .....|+...     .
T Consensus       156 ~~v~l~~i~v~~~~~~~~~~-~~~~~~iiDSGt~~~~lP~~~~~~l~~~~~~~~~~--~-~~--~~~~~C~~~~-----~  224 (295)
T cd05474         156 LSVTLSSISVNGSSGNTTLL-SKNLPALLDSGTTLTYLPSDIVDAIAKQLGATYDS--D-EG--LYVVDCDAKD-----D  224 (295)
T ss_pred             EEEEEEEEEEEcCCCccccc-CCCccEEECCCCccEeCCHHHHHHHHHHhCCEEcC--C-Cc--EEEEeCCCCC-----C
Confidence            99999999999988753211 12358999999999999999999999988654321  1 11  3346788642     3


Q ss_pred             cCeEEEEEeCCcEEEEcCCceEEEeC----CCceEE-EEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          141 VPKIAIHFLGGVDLELDVRGTLVVAS----VSQVCL-EFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       141 ~P~l~~~f~~g~~~~l~~~~y~~~~~----~~~~C~-~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                       |+|+|+|. |.++++++++|+++..    .+..|. ++.....  +.||||+.|||++|++||.+++|||||++
T Consensus       225 -p~i~f~f~-g~~~~i~~~~~~~~~~~~~~~~~~C~~~i~~~~~--~~~iLG~~fl~~~y~vfD~~~~~ig~a~a  295 (295)
T cd05474         225 -GSLTFNFG-GATISVPLSDLVLPASTDDGGDGACYLGIQPSTS--DYNILGDTFLRSAYVVYDLDNNEISLAQA  295 (295)
T ss_pred             -CEEEEEEC-CeEEEEEHHHhEeccccCCCCCCCeEEEEEeCCC--CcEEeChHHhhcEEEEEECCCCEEEeecC
Confidence             99999996 7999999999998764    256884 6776532  67999999999999999999999999986


No 10 
>PF14541 TAXi_C:  Xylanase inhibitor C-terminal; PDB: 3AUP_D 3HD8_A 1T6G_A 1T6E_X 2B42_A 3VLB_A 3VLA_A.
Probab=100.00  E-value=1.8e-33  Score=213.39  Aligned_cols=150  Identities=38%  Similarity=0.581  Sum_probs=120.5

Q ss_pred             ceEEEEeEEEECCeEeecccccc----CCCcEEEecCCcceecChhHHHHHHHHHHHHhhhcc--ccccccccccccccc
Q 028157           60 YYDIILTGISVGGEKLPFKISYF----TKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYK--KAKEFEDLLGTCYDL  133 (213)
Q Consensus        60 ~y~v~l~~i~vg~~~~~~~~~~~----~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~--~~~~~~~~~~~C~~~  133 (213)
                      +|+|+|++|+||+++++++...|    ..+++||||||++|+||+++|++|++++.+++...+  +.......+..||+.
T Consensus         1 ~Y~v~l~~Isvg~~~l~~~~~~~~~~~~~g~~iiDSGT~~T~L~~~~y~~l~~al~~~~~~~~~~~~~~~~~~~~~Cy~~   80 (161)
T PF14541_consen    1 FYYVNLTGISVGGKRLPIPPSVFQLSDGSGGTIIDSGTTYTYLPPPVYDALVQALDAQMGAPGVSREAPPFSGFDLCYNL   80 (161)
T ss_dssp             SEEEEEEEEEETTEEE---TTCSCETTSTCSEEE-SSSSSEEEEHHHHHHHHHHHHHHHHTCT--CEE---TT-S-EEEG
T ss_pred             CccEEEEEEEECCEEecCChHHhhccCCCCCEEEECCCCccCCcHHHHHHHHHHHHHHhhhcccccccccCCCCCceeec
Confidence            59999999999999999998877    347999999999999999999999999999997642  211111567899999


Q ss_pred             cC----CcceecCeEEEEEeCCcEEEEcCCceEEEeCCCceEEEEEec-CCCCCeeEecceeeeeeEEEEeCCCCEEEEe
Q 028157          134 SA----YETVVVPKIAIHFLGGVDLELDVRGTLVVASVSQVCLEFAIY-PPDLNSITLGNVQQRGHEVHYDVGGRRLGFG  208 (213)
Q Consensus       134 ~~----~~~~~~P~l~~~f~~g~~~~l~~~~y~~~~~~~~~C~~~~~~-~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa  208 (213)
                      +.    .....+|+|+|||.+|++|+|+|++|++...++.+|+++..+ ....+..|||..+|+++.++||++++||||+
T Consensus        81 ~~~~~~~~~~~~P~i~l~F~~ga~l~l~~~~y~~~~~~~~~Cla~~~~~~~~~~~~viG~~~~~~~~v~fDl~~~~igF~  160 (161)
T PF14541_consen   81 SSFGVNRDWAKFPTITLHFEGGADLTLPPENYFVQVSPGVFCLAFVPSDADDDGVSVIGNFQQQNYHVVFDLENGRIGFA  160 (161)
T ss_dssp             GCS-EETTEESS--EEEEETTSEEEEE-HHHHEEEECTTEEEESEEEETSTTSSSEEE-HHHCCTEEEEEETTTTEEEEE
T ss_pred             cccccccccccCCeEEEEEeCCcceeeeccceeeeccCCCEEEEEEccCCCCCCcEEECHHHhcCcEEEEECCCCEEEEe
Confidence            87    356899999999998999999999999998888999999887 2235789999999999999999999999999


Q ss_pred             e
Q 028157          209 P  209 (213)
Q Consensus       209 ~  209 (213)
                      |
T Consensus       161 ~  161 (161)
T PF14541_consen  161 P  161 (161)
T ss_dssp             E
T ss_pred             C
Confidence            6


No 11 
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family, it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=100.00  E-value=2e-33  Score=234.70  Aligned_cols=188  Identities=20%  Similarity=0.317  Sum_probs=144.9

Q ss_pred             CCCCCCCChh------hhhhCcC------ceEEecCCCC--CCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEE
Q 028157            1 MGLDRSSVSI------ISKTNTS------YFSYCLPSPY--GSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIIL   65 (213)
Q Consensus         1 ~Glg~~~~sl------~~ql~~~------~FS~cl~~~~--~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l   65 (213)
                      ||||+..++.      +.+|.++      .||+||++..  +.+|+|+||++|. ++.+++.|+|++.    ..+|.|++
T Consensus       108 lGLg~~~~s~~~~~p~~~~l~~qg~i~~~~FS~~L~~~~~~~~~g~l~fGg~d~~~~~g~l~~~pi~~----~~~w~v~l  183 (316)
T cd05486         108 LGLAYPSLAVDGVTPVFDNMMAQNLVELPMFSVYMSRNPNSADGGELVFGGFDTSRFSGQLNWVPVTV----QGYWQIQL  183 (316)
T ss_pred             eccCchhhccCCCCCHHHHHHhcCCCCCCEEEEEEccCCCCCCCcEEEEcccCHHHcccceEEEECCC----ceEEEEEe
Confidence            6888876653      4444432      3999998742  2579999999996 7889999999876    47999999


Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEE
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIA  145 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~  145 (213)
                      ++|+||++.+..+.    ...+||||||+++++|++++++|.+++.+..     ..+  .....|...     ..+|+|+
T Consensus       184 ~~i~v~g~~~~~~~----~~~aiiDTGTs~~~lP~~~~~~l~~~~~~~~-----~~~--~~~~~C~~~-----~~~p~i~  247 (316)
T cd05486         184 DNIQVGGTVIFCSD----GCQAIVDTGTSLITGPSGDIKQLQNYIGATA-----TDG--EYGVDCSTL-----SLMPSVT  247 (316)
T ss_pred             eEEEEecceEecCC----CCEEEECCCcchhhcCHHHHHHHHHHhCCcc-----cCC--cEEEecccc-----ccCCCEE
Confidence            99999998875432    1469999999999999999998877663221     111  122467643     4689999


Q ss_pred             EEEeCCcEEEEcCCceEEEe--CCCceEE-EEEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          146 IHFLGGVDLELDVRGTLVVA--SVSQVCL-EFAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       146 ~~f~~g~~~~l~~~~y~~~~--~~~~~C~-~~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |+|+ |++++|+|++|++..  .....|+ ++....   ...+.||||+.|||++|+|||.+++|||||+
T Consensus       248 f~f~-g~~~~l~~~~y~~~~~~~~~~~C~~~~~~~~~~~~~~~~~ILGd~flr~~y~vfD~~~~~IGfA~  316 (316)
T cd05486         248 FTIN-GIPYSLSPQAYTLEDQSDGGGYCSSGFQGLDIPPPAGPLWILGDVFIRQYYSVFDRGNNRVGFAP  316 (316)
T ss_pred             EEEC-CEEEEeCHHHeEEecccCCCCEEeeEEEECCCCCCCCCeEEEchHHhcceEEEEeCCCCEeeccC
Confidence            9995 899999999999875  3356897 465432   1235799999999999999999999999996


No 12 
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate  r
Probab=100.00  E-value=3.4e-33  Score=234.34  Aligned_cols=189  Identities=20%  Similarity=0.342  Sum_probs=147.1

Q ss_pred             CCCCCCCCh----------hhhh--hCcCceEEecCCCC--CCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEE
Q 028157            1 MGLDRSSVS----------IISK--TNTSYFSYCLPSPY--GSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIIL   65 (213)
Q Consensus         1 ~Glg~~~~s----------l~~q--l~~~~FS~cl~~~~--~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l   65 (213)
                      ||||+...|          |++|  +..+.||+||++..  ..+|+|+||++|. ++.+++.|+|+..    ..+|.|.+
T Consensus       117 lGLg~~~~s~~~~~~~~~~L~~qg~i~~~~FS~~L~~~~~~~~~G~l~fGg~d~~~y~g~l~~~~~~~----~~~w~v~l  192 (326)
T cd05487         117 LGMGYPKQAIGGVTPVFDNIMSQGVLKEDVFSVYYSRDSSHSLGGEIVLGGSDPQHYQGDFHYINTSK----TGFWQIQM  192 (326)
T ss_pred             EecCChhhcccCCCCHHHHHHhcCCCCCCEEEEEEeCCCCCCCCcEEEECCcChhhccCceEEEECCc----CceEEEEe
Confidence            688887655          4444  22234999998753  2589999999996 7889999999875    47999999


Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEE
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIA  145 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~  145 (213)
                      ++|+||++.+....    ...+||||||+++++|+++++++.+++.+..    . ..  .....|+..     ..+|+|+
T Consensus       193 ~~i~vg~~~~~~~~----~~~aiiDSGts~~~lP~~~~~~l~~~~~~~~----~-~~--~y~~~C~~~-----~~~P~i~  256 (326)
T cd05487         193 KGVSVGSSTLLCED----GCTAVVDTGASFISGPTSSISKLMEALGAKE----R-LG--DYVVKCNEV-----PTLPDIS  256 (326)
T ss_pred             cEEEECCEEEecCC----CCEEEECCCccchhCcHHHHHHHHHHhCCcc----c-CC--CEEEecccc-----CCCCCEE
Confidence            99999998875432    2369999999999999999999988774321    1 11  223467753     4689999


Q ss_pred             EEEeCCcEEEEcCCceEEEeCC--CceEE-EEEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          146 IHFLGGVDLELDVRGTLVVASV--SQVCL-EFAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       146 ~~f~~g~~~~l~~~~y~~~~~~--~~~C~-~~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                      |+| +|.+++|++++|+++...  +..|+ ++....   ..++.||||+.|||++|+|||++++|||||++
T Consensus       257 f~f-gg~~~~v~~~~yi~~~~~~~~~~C~~~~~~~~~~~~~~~~~ilG~~flr~~y~vfD~~~~~IGfA~a  326 (326)
T cd05487         257 FHL-GGKEYTLSSSDYVLQDSDFSDKLCTVAFHAMDIPPPTGPLWVLGATFIRKFYTEFDRQNNRIGFALA  326 (326)
T ss_pred             EEE-CCEEEEeCHHHhEEeccCCCCCEEEEEEEeCCCCCCCCCeEEEehHHhhccEEEEeCCCCEEeeeeC
Confidence            999 489999999999987532  56896 566432   12358999999999999999999999999985


No 13 
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. It is self-converting to a mature enzyme being triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. The mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=100.00  E-value=4.4e-33  Score=232.87  Aligned_cols=189  Identities=18%  Similarity=0.340  Sum_probs=147.5

Q ss_pred             CCCCCC------CChhhhhhCcC------ceEEecCCCC-CCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEe
Q 028157            1 MGLDRS------SVSIISKTNTS------YFSYCLPSPY-GSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILT   66 (213)
Q Consensus         1 ~Glg~~------~~sl~~ql~~~------~FS~cl~~~~-~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~   66 (213)
                      ||||+.      ..+++.||..+      .||+||++.. ..+|.|+||++|+ ++.+++.|+|+..    ..+|.|.++
T Consensus       111 lGLg~~~~s~~~~~~~~~~L~~~g~i~~~~FS~~L~~~~~~~~g~l~fGg~d~~~~~g~l~~~pv~~----~~~w~v~l~  186 (318)
T cd05477         111 LGLAYPSISAGGATTVMQGMMQQNLLQAPIFSFYLSGQQGQQGGELVFGGVDNNLYTGQIYWTPVTS----ETYWQIGIQ  186 (318)
T ss_pred             eecCcccccccCCCCHHHHHHhcCCcCCCEEEEEEcCCCCCCCCEEEEcccCHHHcCCceEEEecCC----ceEEEEEee
Confidence            588874      35677787654      3999998753 2479999999996 7889999999875    479999999


Q ss_pred             EEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEE
Q 028157           67 GISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAI  146 (213)
Q Consensus        67 ~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~  146 (213)
                      +|+||++.+.+...   ...+||||||+++++|+++|++|.+++.++...    ..  .....|+..     ..+|+|+|
T Consensus       187 ~i~v~g~~~~~~~~---~~~~iiDSGtt~~~lP~~~~~~l~~~~~~~~~~----~~--~~~~~C~~~-----~~~p~l~~  252 (318)
T cd05477         187 GFQINGQATGWCSQ---GCQAIVDTGTSLLTAPQQVMSTLMQSIGAQQDQ----YG--QYVVNCNNI-----QNLPTLTF  252 (318)
T ss_pred             EEEECCEEecccCC---CceeeECCCCccEECCHHHHHHHHHHhCCcccc----CC--CEEEeCCcc-----ccCCcEEE
Confidence            99999998754321   246999999999999999999998877544211    11  122456643     46899999


Q ss_pred             EEeCCcEEEEcCCceEEEeCCCceEE-EEEecC----CCCCeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          147 HFLGGVDLELDVRGTLVVASVSQVCL-EFAIYP----PDLNSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       147 ~f~~g~~~~l~~~~y~~~~~~~~~C~-~~~~~~----~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                      +|+ |+++++++++|+...  ...|+ ++....    .+...||||+.|||++|++||++++|||||++
T Consensus       253 ~f~-g~~~~v~~~~y~~~~--~~~C~~~i~~~~~~~~~~~~~~ilG~~fl~~~y~vfD~~~~~ig~a~~  318 (318)
T cd05477         253 TIN-GVSFPLPPSAYILQN--NGYCTVGIEPTYLPSQNGQPLWILGDVFLRQYYSVYDLGNNQVGFATA  318 (318)
T ss_pred             EEC-CEEEEECHHHeEecC--CCeEEEEEEecccCCCCCCceEEEcHHHhhheEEEEeCCCCEEeeeeC
Confidence            995 799999999999864  45796 675431    12347999999999999999999999999985


No 14 
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank 
Probab=100.00  E-value=4.1e-33  Score=233.67  Aligned_cols=189  Identities=19%  Similarity=0.271  Sum_probs=145.1

Q ss_pred             CCCCCCCChh------hhhhCcC------ceEEecCCCCC--CccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEE
Q 028157            1 MGLDRSSVSI------ISKTNTS------YFSYCLPSPYG--STAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIIL   65 (213)
Q Consensus         1 ~Glg~~~~sl------~~ql~~~------~FS~cl~~~~~--~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l   65 (213)
                      ||||+..+|.      +.||.++      .||+||++..+  .+|+|+||++|. ++.+++.|+|+.+    ..+|.|++
T Consensus       116 lGLg~~~~s~~~~~~~~~~l~~~g~i~~~~FS~~L~~~~~~~~~G~l~~Gg~d~~~~~g~l~~~~~~~----~~~w~v~l  191 (325)
T cd05490         116 LGMAYPRISVDGVTPVFDNIMAQKLVEQNVFSFYLNRDPDAQPGGELMLGGTDPKYYTGDLHYVNVTR----KAYWQIHM  191 (325)
T ss_pred             EecCCccccccCCCCHHHHHHhcCCCCCCEEEEEEeCCCCCCCCCEEEECccCHHHcCCceEEEEcCc----ceEEEEEe
Confidence            6888876653      3455442      39999986432  479999999996 7889999999875    47999999


Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEE
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIA  145 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~  145 (213)
                      ++|+||++......    ...+||||||+++++|++++++|.+++.+.    +....  .....|+..     ..+|+|+
T Consensus       192 ~~i~vg~~~~~~~~----~~~aiiDSGTt~~~~p~~~~~~l~~~~~~~----~~~~~--~~~~~C~~~-----~~~P~i~  256 (325)
T cd05490         192 DQVDVGSGLTLCKG----GCEAIVDTGTSLITGPVEEVRALQKAIGAV----PLIQG--EYMIDCEKI-----PTLPVIS  256 (325)
T ss_pred             eEEEECCeeeecCC----CCEEEECCCCccccCCHHHHHHHHHHhCCc----cccCC--CEEeccccc-----ccCCCEE
Confidence            99999987543221    247999999999999999999998877432    21111  234567753     4689999


Q ss_pred             EEEeCCcEEEEcCCceEEEeC--CCceEEE-EEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          146 IHFLGGVDLELDVRGTLVVAS--VSQVCLE-FAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       146 ~~f~~g~~~~l~~~~y~~~~~--~~~~C~~-~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |+|+ |++++|+|++|++...  ....|+. +....   .....||||+.|||++|+|||++++|||||+
T Consensus       257 f~fg-g~~~~l~~~~y~~~~~~~~~~~C~~~~~~~~~~~~~~~~~ilGd~flr~~y~vfD~~~~~IGfA~  325 (325)
T cd05490         257 FSLG-GKVYPLTGEDYILKVSQRGTTICLSGFMGLDIPPPAGPLWILGDVFIGRYYTVFDRDNDRVGFAK  325 (325)
T ss_pred             EEEC-CEEEEEChHHeEEeccCCCCCEEeeEEEECCCCCCCCceEEEChHhheeeEEEEEcCCcEeeccC
Confidence            9995 8999999999998753  2358984 65432   1245899999999999999999999999996


No 15 
>PTZ00165 aspartyl protease; Provisional
Probab=100.00  E-value=9.5e-33  Score=240.82  Aligned_cols=187  Identities=20%  Similarity=0.359  Sum_probs=146.8

Q ss_pred             CCCCCCCC---------hhhhhhCcC------ceEEecCCCCCCccEEEECCCCCC-C--CCCeeEeecccCCCCCcceE
Q 028157            1 MGLDRSSV---------SIISKTNTS------YFSYCLPSPYGSTAYITFGKPVSV-S--NKFIKYTPIVTTAEQSEYYD   62 (213)
Q Consensus         1 ~Glg~~~~---------sl~~ql~~~------~FS~cl~~~~~~~g~l~fG~~~~~-~--~~~~~y~pl~~~~~~~~~y~   62 (213)
                      ||||...+         +++.||.++      .||+||++....+|+|+|||+|++ +  .+++.|+|+++    ..||+
T Consensus       233 LGLg~~~~s~~s~~~~~p~~~~l~~qgli~~~~FS~yL~~~~~~~G~l~fGGiD~~~~~~~g~i~~~Pv~~----~~yW~  308 (482)
T PTZ00165        233 VGLGFPDKDFKESKKALPIVDNIKKQNLLKRNIFSFYMSKDLNQPGSISFGSADPKYTLEGHKIWWFPVIS----TDYWE  308 (482)
T ss_pred             eecCCCcccccccCCCCCHHHHHHHcCCcccceEEEEeccCCCCCCEEEeCCcCHHHcCCCCceEEEEccc----cceEE
Confidence            68887765         344455432      399999876556899999999963 3  57899999987    47999


Q ss_pred             EEEeEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecC
Q 028157           63 IILTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVP  142 (213)
Q Consensus        63 v~l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P  142 (213)
                      |++++|+||++.+.....   ...+|+||||+++++|++++++|.+++.+              ...|+..     ..+|
T Consensus       309 i~l~~i~vgg~~~~~~~~---~~~aIiDTGTSli~lP~~~~~~i~~~i~~--------------~~~C~~~-----~~lP  366 (482)
T PTZ00165        309 IEVVDILIDGKSLGFCDR---KCKAAIDTGSSLITGPSSVINPLLEKIPL--------------EEDCSNK-----DSLP  366 (482)
T ss_pred             EEeCeEEECCEEeeecCC---ceEEEEcCCCccEeCCHHHHHHHHHHcCC--------------ccccccc-----ccCC
Confidence            999999999988765321   24699999999999999999988876521              1358864     3689


Q ss_pred             eEEEEEeC--C--cEEEEcCCceEEEe----CCCceEE-EEEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          143 KIAIHFLG--G--VDLELDVRGTLVVA----SVSQVCL-EFAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       143 ~l~~~f~~--g--~~~~l~~~~y~~~~----~~~~~C~-~~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                      +|+|+|.+  |  ++++++|++|+++.    ..+..|+ ++...+   +.++.||||++|||+||+|||++++|||||++
T Consensus       367 ~itf~f~g~~g~~v~~~l~p~dYi~~~~~~~~~~~~C~~g~~~~d~~~~~g~~~ILGd~Flr~yy~VFD~~n~rIGfA~a  446 (482)
T PTZ00165        367 RISFVLEDVNGRKIKFDMDPEDYVIEEGDSEEQEHQCVIGIIPMDVPAPRGPLFVLGNNFIRKYYSIFDRDHMMVGLVPA  446 (482)
T ss_pred             ceEEEECCCCCceEEEEEchHHeeeecccCCCCCCeEEEEEEECCCCCCCCceEEEchhhheeEEEEEeCCCCEEEEEee
Confidence            99999963  2  38999999999974    2356896 576542   12467999999999999999999999999999


Q ss_pred             CCC
Q 028157          211 NCS  213 (213)
Q Consensus       211 ~C~  213 (213)
                      +|+
T Consensus       447 ~~~  449 (482)
T PTZ00165        447 KHD  449 (482)
T ss_pred             ccC
Confidence            985


No 16 
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, where it is targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme.  Proteinase A preferentially hydro
Probab=100.00  E-value=4.1e-33  Score=233.26  Aligned_cols=186  Identities=22%  Similarity=0.328  Sum_probs=144.3

Q ss_pred             CCCCCCCChhhhh------hC------cCceEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEeE
Q 028157            1 MGLDRSSVSIISK------TN------TSYFSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILTG   67 (213)
Q Consensus         1 ~Glg~~~~sl~~q------l~------~~~FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~~   67 (213)
                      ||||++..|..++      |.      .+.||+||++....+|+|+||++|. ++.+++.|+|++.    ..+|.|++++
T Consensus       118 lGLg~~~~s~~~~~~~~~~l~~qg~i~~~~FS~~L~~~~~~~G~l~fGg~d~~~~~g~l~~~p~~~----~~~w~v~l~~  193 (320)
T cd05488         118 LGLAYDTISVNKIVPPFYNMINQGLLDEPVFSFYLGSSEEDGGEATFGGIDESRFTGKITWLPVRR----KAYWEVELEK  193 (320)
T ss_pred             EecCCccccccCCCCHHHHHHhcCCCCCCEEEEEecCCCCCCcEEEECCcCHHHcCCceEEEeCCc----CcEEEEEeCe
Confidence            6899988765432      22      2349999998644689999999996 6889999999976    4699999999


Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEE
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIH  147 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~  147 (213)
                      |+||++.+....     ..++|||||+++++|+++++++.+++.+...      ........|+..     ..+|.|+|+
T Consensus       194 i~vg~~~~~~~~-----~~~ivDSGtt~~~lp~~~~~~l~~~~~~~~~------~~~~~~~~C~~~-----~~~P~i~f~  257 (320)
T cd05488         194 IGLGDEELELEN-----TGAAIDTGTSLIALPSDLAEMLNAEIGAKKS------WNGQYTVDCSKV-----DSLPDLTFN  257 (320)
T ss_pred             EEECCEEeccCC-----CeEEEcCCcccccCCHHHHHHHHHHhCCccc------cCCcEEeecccc-----ccCCCEEEE
Confidence            999998876432     4699999999999999999998877643211      010122356643     468999999


Q ss_pred             EeCCcEEEEcCCceEEEeCCCceEEE-EEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          148 FLGGVDLELDVRGTLVVASVSQVCLE-FAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       148 f~~g~~~~l~~~~y~~~~~~~~~C~~-~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |. |++++|+|++|+++.  ...|+. +....   ..++.||||+.|||++|++||++++|||||+
T Consensus       258 f~-g~~~~i~~~~y~~~~--~g~C~~~~~~~~~~~~~~~~~ilG~~fl~~~y~vfD~~~~~iG~a~  320 (320)
T cd05488         258 FD-GYNFTLGPFDYTLEV--SGSCISAFTGMDFPEPVGPLAIVGDAFLRKYYSVYDLGNNAVGLAK  320 (320)
T ss_pred             EC-CEEEEECHHHheecC--CCeEEEEEEECcCCCCCCCeEEEchHHhhheEEEEeCCCCEEeecC
Confidence            95 799999999999854  347985 44321   1234799999999999999999999999996


No 17 
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5.  Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=100.00  E-value=5.1e-33  Score=233.24  Aligned_cols=170  Identities=24%  Similarity=0.369  Sum_probs=134.9

Q ss_pred             CCCCCCCCh--------hhhhhCc----CceEEecCCCCCCccEEEECCCCC-CCC----------CCeeEeecccCCCC
Q 028157            1 MGLDRSSVS--------IISKTNT----SYFSYCLPSPYGSTAYITFGKPVS-VSN----------KFIKYTPIVTTAEQ   57 (213)
Q Consensus         1 ~Glg~~~~s--------l~~ql~~----~~FS~cl~~~~~~~g~l~fG~~~~-~~~----------~~~~y~pl~~~~~~   57 (213)
                      ||||+.+.+        +.+|...    ..||+||++.   +|+|+||++|+ ++.          +++.|+|+..    
T Consensus       134 lGLg~~~~~~~~~~~~~l~~~~~~~~~~~~FS~~l~~~---~G~l~~Gg~d~~~~~~~~~~~~~~~~~~~~~p~~~----  206 (326)
T cd06096         134 LGLSLTKNNGLPTPIILLFTKRPKLKKDKIFSICLSED---GGELTIGGYDKDYTVRNSSIGNNKVSKIVWTPITR----  206 (326)
T ss_pred             EEccCCcccccCchhHHHHHhcccccCCceEEEEEcCC---CeEEEECccChhhhcccccccccccCCceEEeccC----
Confidence            688887643        1123221    3499999874   69999999996 444          7899999986    


Q ss_pred             CcceEEEEeEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCc
Q 028157           58 SEYYDIILTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYE  137 (213)
Q Consensus        58 ~~~y~v~l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~  137 (213)
                      ..+|.|++++|+|+++......  .....+||||||++++||+++|++|.+++                           
T Consensus       207 ~~~y~v~l~~i~vg~~~~~~~~--~~~~~aivDSGTs~~~lp~~~~~~l~~~~---------------------------  257 (326)
T cd06096         207 KYYYYVKLEGLSVYGTTSNSGN--TKGLGMLVDSGSTLSHFPEDLYNKINNFF---------------------------  257 (326)
T ss_pred             CceEEEEEEEEEEcccccceec--ccCCCEEEeCCCCcccCCHHHHHHHHhhc---------------------------
Confidence            3689999999999998611111  01357999999999999999999887643                           


Q ss_pred             ceecCeEEEEEeCCcEEEEcCCceEEEeCCCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeCCCC
Q 028157          138 TVVVPKIAIHFLGGVDLELDVRGTLVVASVSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPGNCS  213 (213)
Q Consensus       138 ~~~~P~l~~~f~~g~~~~l~~~~y~~~~~~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~C~  213 (213)
                          |+|+|+|++|++++++|++|++.......|+.+...   .+.+|||+.|||++|+|||++++|||||+++|+
T Consensus       258 ----P~i~~~f~~g~~~~i~p~~y~~~~~~~~c~~~~~~~---~~~~ILG~~flr~~y~vFD~~~~riGfa~~~C~  326 (326)
T cd06096         258 ----PTITIIFENNLKIDWKPSSYLYKKESFWCKGGEKSV---SNKPILGASFFKNKQIIFDLDNNRIGFVESNCP  326 (326)
T ss_pred             ----CcEEEEEcCCcEEEECHHHhccccCCceEEEEEecC---CCceEEChHHhcCcEEEEECcCCEEeeEcCCCC
Confidence                889999976899999999999876544455555543   257999999999999999999999999999996


No 18 
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets an
Probab=100.00  E-value=2.9e-32  Score=228.96  Aligned_cols=188  Identities=19%  Similarity=0.313  Sum_probs=144.5

Q ss_pred             CCCCCCCChh------hhhhCcC------ceEEecCCCCC--CccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEE
Q 028157            1 MGLDRSSVSI------ISKTNTS------YFSYCLPSPYG--STAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIIL   65 (213)
Q Consensus         1 ~Glg~~~~sl------~~ql~~~------~FS~cl~~~~~--~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l   65 (213)
                      ||||+..+|.      +.||..+      .||+||++..+  .+|+|+||++|. ++.+++.|+|+..    +.+|.|++
T Consensus       121 lGLg~~~~s~~~~~p~~~~l~~qg~i~~~~FS~~l~~~~~~~~~G~l~fGg~d~~~~~g~l~~~p~~~----~~~~~v~~  196 (329)
T cd05485         121 LGMGYSSISVDGVVPVFYNMVNQKLVDAPVFSFYLNRDPSAKEGGELILGGSDPKHYTGNFTYLPVTR----KGYWQFKM  196 (329)
T ss_pred             EEcCCccccccCCCCHHHHHHhCCCCCCCEEEEEecCCCCCCCCcEEEEcccCHHHcccceEEEEcCC----ceEEEEEe
Confidence            6889887663      3444332      39999987532  479999999996 6889999999975    47999999


Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEE
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIA  145 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~  145 (213)
                      ++|+|+++.+...     ...+||||||+++++|++++++|.+++.+..  ..  ..  .....|+..     ..+|+|+
T Consensus       197 ~~i~v~~~~~~~~-----~~~~iiDSGtt~~~lP~~~~~~l~~~~~~~~--~~--~~--~~~~~C~~~-----~~~p~i~  260 (329)
T cd05485         197 DSVSVGEGEFCSG-----GCQAIADTGTSLIAGPVDEIEKLNNAIGAKP--II--GG--EYMVNCSAI-----PSLPDIT  260 (329)
T ss_pred             eEEEECCeeecCC-----CcEEEEccCCcceeCCHHHHHHHHHHhCCcc--cc--CC--cEEEecccc-----ccCCcEE
Confidence            9999999876421     2469999999999999999998887764321  11  11  223466643     4679999


Q ss_pred             EEEeCCcEEEEcCCceEEEeC--CCceEEE-EEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          146 IHFLGGVDLELDVRGTLVVAS--VSQVCLE-FAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       146 ~~f~~g~~~~l~~~~y~~~~~--~~~~C~~-~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |+|+ |++++|+|++|+++..  +...|+. +....   ...+.||||+.|||++|+|||++++|||||+
T Consensus       261 f~fg-g~~~~i~~~~yi~~~~~~~~~~C~~~~~~~~~~~~~~~~~IlG~~fl~~~y~vFD~~~~~ig~a~  329 (329)
T cd05485         261 FVLG-GKSFSLTGKDYVLKVTQMGQTICLSGFMGIDIPPPAGPLWILGDVFIGKYYTEFDLGNNRVGFAT  329 (329)
T ss_pred             EEEC-CEEeEEChHHeEEEecCCCCCEEeeeEEECcCCCCCCCeEEEchHHhccceEEEeCCCCEEeecC
Confidence            9995 8999999999999763  2468984 66431   1235799999999999999999999999985


No 19 
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases.  They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=100.00  E-value=9.6e-32  Score=224.72  Aligned_cols=178  Identities=20%  Similarity=0.330  Sum_probs=138.7

Q ss_pred             CCCCCCCChhh------hhhCc------CceEEecCCCC--CCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEE
Q 028157            1 MGLDRSSVSII------SKTNT------SYFSYCLPSPY--GSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIIL   65 (213)
Q Consensus         1 ~Glg~~~~sl~------~ql~~------~~FS~cl~~~~--~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l   65 (213)
                      ||||+...|..      .+|.+      +.||+||++..  ..+|+|+||++|+ ++.+++.|+|++.    ..+|.|.+
T Consensus       119 lGLg~~~~s~~~~~~~~~~l~~qg~i~~~~FS~~L~~~~~~~~~G~l~fGg~d~~~~~g~l~~~pv~~----~~~w~v~l  194 (317)
T cd06098         119 LGLGFQEISVGKAVPVWYNMVEQGLVKEPVFSFWLNRNPDEEEGGELVFGGVDPKHFKGEHTYVPVTR----KGYWQFEM  194 (317)
T ss_pred             ccccccchhhcCCCCHHHHHHhcCCCCCCEEEEEEecCCCCCCCcEEEECccChhhcccceEEEecCc----CcEEEEEe
Confidence            68888766542      23322      34999998642  2589999999996 7889999999975    46999999


Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEE
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIA  145 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~  145 (213)
                      ++|+||++.+.....   ...+||||||+++++|+++++++.                  ....|+..     ..+|+|+
T Consensus       195 ~~i~v~g~~~~~~~~---~~~aivDTGTs~~~lP~~~~~~i~------------------~~~~C~~~-----~~~P~i~  248 (317)
T cd06098         195 GDVLIGGKSTGFCAG---GCAAIADSGTSLLAGPTTIVTQIN------------------SAVDCNSL-----SSMPNVS  248 (317)
T ss_pred             CeEEECCEEeeecCC---CcEEEEecCCcceeCCHHHHHhhh------------------ccCCcccc-----ccCCcEE
Confidence            999999998765432   136999999999999998765542                  12458854     3689999


Q ss_pred             EEEeCCcEEEEcCCceEEEeCC--CceEEE-EEecC---CCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          146 IHFLGGVDLELDVRGTLVVASV--SQVCLE-FAIYP---PDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       146 ~~f~~g~~~~l~~~~y~~~~~~--~~~C~~-~~~~~---~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |+|+ |.+++|+|++|+++...  ...|+. +....   ...+.||||+.|||++|+|||++++|||||+
T Consensus       249 f~f~-g~~~~l~~~~yi~~~~~~~~~~C~~~~~~~~~~~~~~~~~IlGd~Flr~~y~VfD~~~~~iGfA~  317 (317)
T cd06098         249 FTIG-GKTFELTPEQYILKVGEGAAAQCISGFTALDVPPPRGPLWILGDVFMGAYHTVFDYGNLRVGFAE  317 (317)
T ss_pred             EEEC-CEEEEEChHHeEEeecCCCCCEEeceEEECCCCCCCCCeEEechHHhcccEEEEeCCCCEEeecC
Confidence            9994 79999999999987532  358974 65432   1245799999999999999999999999995


No 20 
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=99.98  E-value=1.9e-31  Score=230.78  Aligned_cols=187  Identities=19%  Similarity=0.277  Sum_probs=142.0

Q ss_pred             CCCCCCCCh------hhhhhCcC------ceEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEeE
Q 028157            1 MGLDRSSVS------IISKTNTS------YFSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILTG   67 (213)
Q Consensus         1 ~Glg~~~~s------l~~ql~~~------~FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~~   67 (213)
                      ||||+..+|      ++.||..+      .||+||++....+|+|+|||+|+ ++.+++.|+|+..    ..+|.|.++ 
T Consensus       247 lGLg~~~~s~~~~~p~~~~L~~qg~I~~~vFS~~L~~~~~~~G~L~fGGiD~~~y~G~L~y~pv~~----~~yW~I~l~-  321 (450)
T PTZ00013        247 LGLGWKDLSIGSIDPIVVELKNQNKIDNALFTFYLPVHDVHAGYLTIGGIEEKFYEGNITYEKLNH----DLYWQIDLD-  321 (450)
T ss_pred             ecccCCccccccCCCHHHHHHhccCcCCcEEEEEecCCCCCCCEEEECCcCccccccceEEEEcCc----CceEEEEEE-
Confidence            689887765      34455443      39999987544689999999996 7889999999965    479999998 


Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEE
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIH  147 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~  147 (213)
                      +.+|.....       ...+||||||+++++|+++++++.+++....  .+   ........|+.      ..+|+|+|+
T Consensus       322 v~~G~~~~~-------~~~aIlDSGTSli~lP~~~~~~i~~~l~~~~--~~---~~~~y~~~C~~------~~lP~i~F~  383 (450)
T PTZ00013        322 VHFGKQTMQ-------KANVIVDSGTTTITAPSEFLNKFFANLNVIK--VP---FLPFYVTTCDN------KEMPTLEFK  383 (450)
T ss_pred             EEECceecc-------ccceEECCCCccccCCHHHHHHHHHHhCCee--cC---CCCeEEeecCC------CCCCeEEEE
Confidence            677654331       2469999999999999999888877653221  11   11022356864      367999999


Q ss_pred             EeCCcEEEEcCCceEEEe--CCCceEE-EEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeCC
Q 028157          148 FLGGVDLELDVRGTLVVA--SVSQVCL-EFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPGN  211 (213)
Q Consensus       148 f~~g~~~~l~~~~y~~~~--~~~~~C~-~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~  211 (213)
                      |. |.+++|+|++|+...  ..+..|+ ++.+.+...+.||||+.|||++|+|||++++|||||+++
T Consensus       384 ~~-g~~~~L~p~~Yi~~~~~~~~~~C~~~i~~~~~~~~~~ILGd~FLr~~Y~VFD~~n~rIGfA~a~  449 (450)
T PTZ00013        384 SA-NNTYTLEPEYYMNPLLDVDDTLCMITMLPVDIDDNTFILGDPFMRKYFTVFDYDKESVGFAIAK  449 (450)
T ss_pred             EC-CEEEEECHHHheehhccCCCCeeEEEEEECCCCCCCEEECHHHhccEEEEEECCCCEEEEEEeC
Confidence            96 799999999999753  2346897 566543334579999999999999999999999999874


No 21 
>PTZ00147 plasmepsin-1; Provisional
Probab=99.97  E-value=2.2e-31  Score=230.64  Aligned_cols=187  Identities=18%  Similarity=0.281  Sum_probs=142.9

Q ss_pred             CCCCCCCChh------hhhhCcC------ceEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEeE
Q 028157            1 MGLDRSSVSI------ISKTNTS------YFSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILTG   67 (213)
Q Consensus         1 ~Glg~~~~sl------~~ql~~~------~FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~~   67 (213)
                      ||||+.++|.      +.||..+      .||+||++....+|+|+|||+|. ++.+++.|+|+..    ..+|.|.++ 
T Consensus       248 LGLG~~~~S~~~~~p~~~~L~~qg~I~~~vFS~~L~~~~~~~G~L~fGGiD~~ky~G~l~y~pl~~----~~~W~V~l~-  322 (453)
T PTZ00147        248 FGLGWKDLSIGSVDPYVVELKNQNKIEQAVFTFYLPPEDKHKGYLTIGGIEERFYEGPLTYEKLNH----DLYWQVDLD-  322 (453)
T ss_pred             ecccCCccccccCCCHHHHHHHcCCCCccEEEEEecCCCCCCeEEEECCcChhhcCCceEEEEcCC----CceEEEEEE-
Confidence            6899887653      3344433      39999987655689999999996 6889999999964    479999998 


Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEE
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIH  147 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~  147 (213)
                      +.+|+...       ....+||||||+++++|+++++++.+++.+..  .+...   .....|+.      ..+|+++|+
T Consensus       323 ~~vg~~~~-------~~~~aIiDSGTsli~lP~~~~~ai~~~l~~~~--~~~~~---~y~~~C~~------~~lP~~~f~  384 (453)
T PTZ00147        323 VHFGNVSS-------EKANVIVDSGTSVITVPTEFLNKFVESLDVFK--VPFLP---LYVTTCNN------TKLPTLEFR  384 (453)
T ss_pred             EEECCEec-------CceeEEECCCCchhcCCHHHHHHHHHHhCCee--cCCCC---eEEEeCCC------CCCCeEEEE
Confidence            57776432       12469999999999999999999888764321  11111   22356874      357999999


Q ss_pred             EeCCcEEEEcCCceEEEeC--CCceEE-EEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEeeCC
Q 028157          148 FLGGVDLELDVRGTLVVAS--VSQVCL-EFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPGN  211 (213)
Q Consensus       148 f~~g~~~~l~~~~y~~~~~--~~~~C~-~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~~  211 (213)
                      |. |.+++|+|++|+....  ....|+ ++...+...+.||||+.|||++|+|||++++|||||+++
T Consensus       385 f~-g~~~~L~p~~yi~~~~~~~~~~C~~~i~~~~~~~~~~ILGd~FLr~~YtVFD~~n~rIGfA~a~  450 (453)
T PTZ00147        385 SP-NKVYTLEPEYYLQPIEDIGSALCMLNIIPIDLEKNTFILGDPFMRKYFTVFDYDNHTVGFALAK  450 (453)
T ss_pred             EC-CEEEEECHHHheeccccCCCcEEEEEEEECCCCCCCEEECHHHhccEEEEEECCCCEEEEEEec
Confidence            96 7899999999997642  235797 476643233579999999999999999999999999875


No 22 
>PF00026 Asp:  Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.;  InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human). More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family does not include the retroviral or retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=99.95  E-value=2.5e-28  Score=203.34  Aligned_cols=189  Identities=21%  Similarity=0.319  Sum_probs=145.9

Q ss_pred             CCCCCC-------CChhhhhhCcCc------eEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEe
Q 028157            1 MGLDRS-------SVSIISKTNTSY------FSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILT   66 (213)
Q Consensus         1 ~Glg~~-------~~sl~~ql~~~~------FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~   66 (213)
                      ||||..       ..+++.||..++      ||+||.+.....|.|+||++|. ++.+++.|+|+..    ..+|.|.++
T Consensus       110 lGLg~~~~~~~~~~~~~~~~l~~~g~i~~~~fsl~l~~~~~~~g~l~~Gg~d~~~~~g~~~~~~~~~----~~~w~v~~~  185 (317)
T PF00026_consen  110 LGLGFPSLSSSSTYPTFLDQLVQQGLISSNVFSLYLNPSDSQNGSLTFGGYDPSKYDGDLVWVPLVS----SGYWSVPLD  185 (317)
T ss_dssp             EE-SSGGGSGGGTS-SHHHHHHHTTSSSSSEEEEEEESTTSSEEEEEESSEEGGGEESEEEEEEBSS----TTTTEEEEE
T ss_pred             ccccCCcccccccCCcceecchhhccccccccceeeeecccccchheeeccccccccCceeccCccc----ccccccccc
Confidence            477743       346777776653      9999988655689999999996 6889999999984    579999999


Q ss_pred             EEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEE
Q 028157           67 GISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAI  146 (213)
Q Consensus        67 ~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~  146 (213)
                      +|.+++........    ..++|||||+++++|+++++.|.+++......     .  .....|...     ..+|.++|
T Consensus       186 ~i~i~~~~~~~~~~----~~~~~Dtgt~~i~lp~~~~~~i~~~l~~~~~~-----~--~~~~~c~~~-----~~~p~l~f  249 (317)
T PF00026_consen  186 SISIGGESVFSSSG----QQAILDTGTSYIYLPRSIFDAIIKALGGSYSD-----G--VYSVPCNST-----DSLPDLTF  249 (317)
T ss_dssp             EEEETTEEEEEEEE----EEEEEETTBSSEEEEHHHHHHHHHHHTTEEEC-----S--EEEEETTGG-----GGSEEEEE
T ss_pred             cccccccccccccc----eeeecccccccccccchhhHHHHhhhcccccc-----e--eEEEecccc-----cccceEEE
Confidence            99999993322211    25999999999999999999998877543221     1  223456543     46899999


Q ss_pred             EEeCCcEEEEcCCceEEEeCC--CceEEE-EEec--CCCCCeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          147 HFLGGVDLELDVRGTLVVASV--SQVCLE-FAIY--PPDLNSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       147 ~f~~g~~~~l~~~~y~~~~~~--~~~C~~-~~~~--~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                      +|. +.+++++|++|+.....  ...|.. +...  ....+.+|||..|||++|++||.+++|||||++
T Consensus       250 ~~~-~~~~~i~~~~~~~~~~~~~~~~C~~~i~~~~~~~~~~~~iLG~~fl~~~y~vfD~~~~~ig~A~a  317 (317)
T PF00026_consen  250 TFG-GVTFTIPPSDYIFKIEDGNGGYCYLGIQPMDSSDDSDDWILGSPFLRNYYVVFDYENNRIGFAQA  317 (317)
T ss_dssp             EET-TEEEEEEHHHHEEEESSTTSSEEEESEEEESSTTSSSEEEEEHHHHTTEEEEEETTTTEEEEEEE
T ss_pred             eeC-CEEEEecchHhcccccccccceeEeeeecccccccCCceEecHHHhhceEEEEeCCCCEEEEecC
Confidence            996 89999999999998643  348975 5541  224578999999999999999999999999985


No 23 
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=99.95  E-value=5.9e-28  Score=198.38  Aligned_cols=155  Identities=25%  Similarity=0.412  Sum_probs=117.4

Q ss_pred             CCCCCCCChh---------hhhhCc----CceEEecCCCCCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEEe
Q 028157            1 MGLDRSSVSI---------ISKTNT----SYFSYCLPSPYGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIILT   66 (213)
Q Consensus         1 ~Glg~~~~sl---------~~ql~~----~~FS~cl~~~~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l~   66 (213)
                      ||||+...+.         +.+|..    ..||+||.+.  .+|+|+||++|+ ++.+++.|+|++.+   ..+|.|+++
T Consensus       110 lGLg~~~~~~~~~~~~~~~~~~l~~~~~~~~Fs~~l~~~--~~G~l~fGg~D~~~~~g~l~~~pi~~~---~~~w~v~l~  184 (278)
T cd06097         110 LGLAFSSINTVQPPKQKTFFENALSSLDAPLFTADLRKA--APGFYTFGYIDESKYKGEISWTPVDNS---SGFWQFTST  184 (278)
T ss_pred             eeeccccccccccCCCCCHHHHHHHhccCceEEEEecCC--CCcEEEEeccChHHcCCceEEEEccCC---CcEEEEEEe
Confidence            6888876543         333333    3499999873  579999999996 78999999999864   469999999


Q ss_pred             EEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEE
Q 028157           67 GISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAI  146 (213)
Q Consensus        67 ~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~  146 (213)
                      +|+||++.....    ....+||||||+++++|+++++++.+++....  +..  .     ..+|..+|..  .+|+|+|
T Consensus       185 ~i~v~~~~~~~~----~~~~~iiDSGTs~~~lP~~~~~~l~~~l~g~~--~~~--~-----~~~~~~~C~~--~~P~i~f  249 (278)
T cd06097         185 SYTVGGDAPWSR----SGFSAIADTGTTLILLPDAIVEAYYSQVPGAY--YDS--E-----YGGWVFPCDT--TLPDLSF  249 (278)
T ss_pred             eEEECCcceeec----CCceEEeecCCchhcCCHHHHHHHHHhCcCCc--ccC--C-----CCEEEEECCC--CCCCEEE
Confidence            999999844321    12469999999999999999988887762110  111  0     1233333322  2899999


Q ss_pred             EEeCCcEEEEcCCceEEEeCCCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          147 HFLGGVDLELDVRGTLVVASVSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       147 ~f~~g~~~~l~~~~y~~~~~~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      +|                                  .||||+.|||++|+|||++++|||||+
T Consensus       250 ~~----------------------------------~~ilGd~fl~~~y~vfD~~~~~ig~A~  278 (278)
T cd06097         250 AV----------------------------------FSILGDVFLKAQYVVFDVGGPKLGFAP  278 (278)
T ss_pred             EE----------------------------------EEEEcchhhCceeEEEcCCCceeeecC
Confidence            99                                  699999999999999999999999995


No 24 
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=99.95  E-value=5.6e-27  Score=192.06  Aligned_cols=160  Identities=29%  Similarity=0.450  Sum_probs=128.8

Q ss_pred             CCCCCCC------ChhhhhhCcC------ceEEecCCC--CCCccEEEECCCCC-CCCCCeeEeecccCCCCCcceEEEE
Q 028157            1 MGLDRSS------VSIISKTNTS------YFSYCLPSP--YGSTAYITFGKPVS-VSNKFIKYTPIVTTAEQSEYYDIIL   65 (213)
Q Consensus         1 ~Glg~~~------~sl~~ql~~~------~FS~cl~~~--~~~~g~l~fG~~~~-~~~~~~~y~pl~~~~~~~~~y~v~l   65 (213)
                      ||||+..      .+++.||..+      .||+||.+.  ....|.|+||++|. ++.+++.|+|++..  ...+|.|.+
T Consensus       109 lGLg~~~~~~~~~~s~~~~l~~~~~i~~~~Fs~~l~~~~~~~~~g~l~~Gg~d~~~~~~~~~~~p~~~~--~~~~~~v~l  186 (283)
T cd05471         109 LGLGFPSLSVDGVPSFFDQLKSQGLISSPVFSFYLGRDGDGGNGGELTFGGIDPSKYTGDLTYTPVVSN--GPGYWQVPL  186 (283)
T ss_pred             eecCCcccccccCCCHHHHHHHCCCCCCCEEEEEEcCCCCCCCCCEEEEcccCccccCCceEEEecCCC--CCCEEEEEe
Confidence            6899988      7899998874      399999985  23689999999996 57899999999885  247999999


Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEE
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIA  145 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~  145 (213)
                      ++|.|+++.....   .....++|||||++++||+++|++|.+++.+....          ...|+...+.....+|+|+
T Consensus       187 ~~i~v~~~~~~~~---~~~~~~iiDsGt~~~~lp~~~~~~l~~~~~~~~~~----------~~~~~~~~~~~~~~~p~i~  253 (283)
T cd05471         187 DGISVGGKSVISS---SGGGGAIVDSGTSLIYLPSSVYDAILKALGAAVSS----------SDGGYGVDCSPCDTLPDIT  253 (283)
T ss_pred             CeEEECCceeeec---CCCcEEEEecCCCCEeCCHHHHHHHHHHhCCcccc----------cCCcEEEeCcccCcCCCEE
Confidence            9999999751111   11257999999999999999999999988665431          1123333332346789999


Q ss_pred             EEEeCCcEEEEcCCceEEEeCCCceEEEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          146 IHFLGGVDLELDVRGTLVVASVSQVCLEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       146 ~~f~~g~~~~l~~~~y~~~~~~~~~C~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      |+|                                  .+|||+.|||++|++||.+++|||||+
T Consensus       254 f~f----------------------------------~~ilG~~fl~~~y~vfD~~~~~igfa~  283 (283)
T cd05471         254 FTF----------------------------------LWILGDVFLRNYYTVFDLDNNRIGFAP  283 (283)
T ss_pred             EEE----------------------------------EEEccHhhhhheEEEEeCCCCEEeecC
Confidence            999                                  699999999999999999999999985


No 25 
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=95.51  E-value=0.33  Score=34.86  Aligned_cols=26  Identities=15%  Similarity=0.162  Sum_probs=23.2

Q ss_pred             CeeEecceeeeeeEEEEeCCCCEEEE
Q 028157          182 NSITLGNVQQRGHEVHYDVGGRRLGF  207 (213)
Q Consensus       182 ~~~ilG~~~~~~~~~vfD~~~~riGf  207 (213)
                      ...|||..||+.+..+.|+.+++|-+
T Consensus        99 ~d~ILG~d~L~~~~~~ID~~~~~i~~  124 (124)
T cd05479          99 VDFLIGLDMLKRHQCVIDLKENVLRI  124 (124)
T ss_pred             cCEEecHHHHHhCCeEEECCCCEEEC
Confidence            45899999999999999999998853


No 26 
>PF08284 RVP_2:  Retroviral aspartyl protease;  InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases. 
Probab=94.33  E-value=0.76  Score=33.58  Aligned_cols=29  Identities=17%  Similarity=0.205  Sum_probs=26.4

Q ss_pred             CeeEecceeeeeeEEEEeCCCCEEEEeeC
Q 028157          182 NSITLGNVQQRGHEVHYDVGGRRLGFGPG  210 (213)
Q Consensus       182 ~~~ilG~~~~~~~~~vfD~~~~riGfa~~  210 (213)
                      -..|||..+|+.+...-|+.+++|-|...
T Consensus       104 ~DvILGm~WL~~~~~~IDw~~k~v~f~~p  132 (135)
T PF08284_consen  104 YDVILGMDWLKKHNPVIDWATKTVTFNSP  132 (135)
T ss_pred             eeeEeccchHHhCCCEEEccCCEEEEeCC
Confidence            35999999999999999999999999754


No 27 
>PF13650 Asp_protease_2:  Aspartyl protease
Probab=93.84  E-value=0.11  Score=34.46  Aligned_cols=30  Identities=23%  Similarity=0.367  Sum_probs=24.8

Q ss_pred             EEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           67 GISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        67 ~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      .++|||+.+.          +++|||++.+.+.++.++.+
T Consensus         2 ~v~vng~~~~----------~liDTGa~~~~i~~~~~~~l   31 (90)
T PF13650_consen    2 PVKVNGKPVR----------FLIDTGASISVISRSLAKKL   31 (90)
T ss_pred             EEEECCEEEE----------EEEcCCCCcEEECHHHHHHc
Confidence            3678887663          99999999999999888765


No 28 
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=93.59  E-value=0.54  Score=32.86  Aligned_cols=23  Identities=17%  Similarity=0.186  Sum_probs=20.5

Q ss_pred             CeeEecceeeeeeEEEEeCCCCE
Q 028157          182 NSITLGNVQQRGHEVHYDVGGRR  204 (213)
Q Consensus       182 ~~~ilG~~~~~~~~~vfD~~~~r  204 (213)
                      +..+||..+|+.+-++.|+.+++
T Consensus        84 ~~~LLG~~~L~~l~l~id~~~~~  106 (107)
T TIGR03698        84 DEPLLGTELLEGLGIVIDYRNQG  106 (107)
T ss_pred             CccEecHHHHhhCCEEEehhhCc
Confidence            47899999999999999998765


No 29 
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=93.03  E-value=0.21  Score=35.82  Aligned_cols=36  Identities=17%  Similarity=0.134  Sum_probs=28.3

Q ss_pred             CcceEEEEeEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           58 SEYYDIILTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        58 ~~~y~v~l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      .++|.+   .+.|||+.+.          ++||||.+.+.++++..+.+
T Consensus         9 ~g~~~v---~~~InG~~~~----------flVDTGAs~t~is~~~A~~L   44 (121)
T TIGR02281         9 DGHFYA---TGRVNGRNVR----------FLVDTGATSVALNEEDAQRL   44 (121)
T ss_pred             CCeEEE---EEEECCEEEE----------EEEECCCCcEEcCHHHHHHc
Confidence            456644   5678998653          99999999999999887654


No 30 
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=92.79  E-value=0.22  Score=33.55  Aligned_cols=30  Identities=27%  Similarity=0.485  Sum_probs=25.9

Q ss_pred             EEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           67 GISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        67 ~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      .+.|||+.+.          ..||||++.+.++++.+..+
T Consensus         4 ~~~Ing~~i~----------~lvDTGA~~svis~~~~~~l   33 (91)
T cd05484           4 TLLVNGKPLK----------FQLDTGSAITVISEKTWRKL   33 (91)
T ss_pred             EEEECCEEEE----------EEEcCCcceEEeCHHHHHHh
Confidence            5788999885          89999999999999888754


No 31 
>PF13975 gag-asp_proteas:  gag-polyprotein putative aspartyl protease
Probab=92.78  E-value=0.3  Score=31.45  Aligned_cols=30  Identities=20%  Similarity=0.391  Sum_probs=25.6

Q ss_pred             EEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           67 GISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        67 ~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      .+.|++..+.          +++|||++-++++++..+.|
T Consensus        12 ~~~I~g~~~~----------alvDtGat~~fis~~~a~rL   41 (72)
T PF13975_consen   12 PVSIGGVQVK----------ALVDTGATHNFISESLAKRL   41 (72)
T ss_pred             EEEECCEEEE----------EEEeCCCcceecCHHHHHHh
Confidence            4678887774          99999999999999988765


No 32 
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=91.97  E-value=0.38  Score=32.08  Aligned_cols=31  Identities=16%  Similarity=0.318  Sum_probs=25.1

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      ..+.||++.+.          +++|||++.+.++++..+.+
T Consensus         5 v~v~i~~~~~~----------~llDTGa~~s~i~~~~~~~l   35 (96)
T cd05483           5 VPVTINGQPVR----------FLLDTGASTTVISEELAERL   35 (96)
T ss_pred             EEEEECCEEEE----------EEEECCCCcEEcCHHHHHHc
Confidence            45778877764          99999999999999877654


No 33 
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where 
Probab=90.25  E-value=0.48  Score=31.58  Aligned_cols=29  Identities=24%  Similarity=0.249  Sum_probs=24.7

Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      +.|||+.+.          +++|||.+.+.+++...+.+
T Consensus         3 v~InG~~~~----------fLvDTGA~~tii~~~~a~~~   31 (86)
T cd06095           3 ITVEGVPIV----------FLVDTGATHSVLKSDLGPKQ   31 (86)
T ss_pred             EEECCEEEE----------EEEECCCCeEEECHHHhhhc
Confidence            678888774          89999999999999888754


No 34 
>PF00077 RVP:  Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026;  InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=88.51  E-value=0.66  Score=31.55  Aligned_cols=28  Identities=14%  Similarity=0.402  Sum_probs=22.9

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHH
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVY  103 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~  103 (213)
                      ..|.++++.+.          ++||||+..+.++++.+
T Consensus         8 i~v~i~g~~i~----------~LlDTGA~vsiI~~~~~   35 (100)
T PF00077_consen    8 ITVKINGKKIK----------ALLDTGADVSIISEKDW   35 (100)
T ss_dssp             EEEEETTEEEE----------EEEETTBSSEEESSGGS
T ss_pred             EEEeECCEEEE----------EEEecCCCcceeccccc
Confidence            45778888774          99999999999998643


No 35 
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=84.66  E-value=1.4  Score=29.94  Aligned_cols=20  Identities=25%  Similarity=0.373  Sum_probs=18.0

Q ss_pred             EEEecCCcceecChhHHHHH
Q 028157           87 TEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        87 ~iiDSGTt~~~lp~~~~~~l  106 (213)
                      +-+|||++.+.+|...|..+
T Consensus        13 ~~vDtGA~vnllp~~~~~~l   32 (93)
T cd05481          13 FQLDTGATCNVLPLRWLKSL   32 (93)
T ss_pred             EEEecCCEEEeccHHHHhhh
Confidence            88999999999999888755


No 36 
>PF09668 Asp_protease:  Aspartyl protease;  InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure.  This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=84.50  E-value=1.4  Score=31.70  Aligned_cols=31  Identities=13%  Similarity=0.171  Sum_probs=24.6

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      ..++|||+.+.          |.||||+..+.++.+.++++
T Consensus        27 I~~~ing~~vk----------A~VDtGAQ~tims~~~a~r~   57 (124)
T PF09668_consen   27 INCKINGVPVK----------AFVDTGAQSTIMSKSCAERC   57 (124)
T ss_dssp             EEEEETTEEEE----------EEEETT-SS-EEEHHHHHHT
T ss_pred             EEEEECCEEEE----------EEEeCCCCccccCHHHHHHc
Confidence            35788999885          99999999999999888753


No 37 
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=78.65  E-value=6.2  Score=30.93  Aligned_cols=35  Identities=17%  Similarity=0.208  Sum_probs=28.8

Q ss_pred             CcceEEEEeEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHH
Q 028157           58 SEYYDIILTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAA  105 (213)
Q Consensus        58 ~~~y~v~l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~  105 (213)
                      .++|.   ....|||+.+.          .+||||.|...++++..+.
T Consensus       103 ~GHF~---a~~~VNGk~v~----------fLVDTGATsVal~~~dA~R  137 (215)
T COG3577         103 DGHFE---ANGRVNGKKVD----------FLVDTGATSVALNEEDARR  137 (215)
T ss_pred             CCcEE---EEEEECCEEEE----------EEEecCcceeecCHHHHHH
Confidence            46665   45789999996          8999999999999977654


No 38 
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=68.79  E-value=3.8  Score=29.35  Aligned_cols=20  Identities=35%  Similarity=0.381  Sum_probs=18.0

Q ss_pred             EEEecCCc-ceecChhHHHHH
Q 028157           87 TEIDSGNI-ITRLPSPVYAAL  106 (213)
Q Consensus        87 ~iiDSGTt-~~~lp~~~~~~l  106 (213)
                      .+||||-+ ++.+|+++++++
T Consensus        29 ~LiDTGFtg~lvlp~~vaek~   49 (125)
T COG5550          29 ELIDTGFTGYLVLPPQVAEKL   49 (125)
T ss_pred             eEEecCCceeEEeCHHHHHhc
Confidence            58999999 999999999865


No 39 
>PF12384 Peptidase_A2B:  Ty3 transposon peptidase;  InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=66.71  E-value=18  Score=27.39  Aligned_cols=31  Identities=19%  Similarity=0.285  Sum_probs=23.6

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      +.+.+.+..+.          +.+|||++..++-+++.+.|
T Consensus        37 v~l~~~~t~i~----------vLfDSGSPTSfIr~di~~kL   67 (177)
T PF12384_consen   37 VQLNCKGTPIK----------VLFDSGSPTSFIRSDIVEKL   67 (177)
T ss_pred             EEEeecCcEEE----------EEEeCCCccceeehhhHHhh
Confidence            34556666663          99999999999998887665


No 40 
>cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site 
Probab=62.15  E-value=8.7  Score=26.18  Aligned_cols=16  Identities=25%  Similarity=0.275  Sum_probs=14.2

Q ss_pred             EEEecCCcceecChhH
Q 028157           87 TEIDSGNIITRLPSPV  102 (213)
Q Consensus        87 ~iiDSGTt~~~lp~~~  102 (213)
                      +++|||++.+-++.+-
T Consensus        14 ~~~DTGSs~~Wv~~~~   29 (109)
T cd05470          14 VLLDTGSSNLWVPSVD   29 (109)
T ss_pred             EEEeCCCCCEEEeCCC
Confidence            8999999999998754


No 41 
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic dna-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=57.69  E-value=21  Score=21.83  Aligned_cols=19  Identities=16%  Similarity=0.270  Sum_probs=16.7

Q ss_pred             EEEecCCcceecChhHHHH
Q 028157           87 TEIDSGNIITRLPSPVYAA  105 (213)
Q Consensus        87 ~iiDSGTt~~~lp~~~~~~  105 (213)
                      +++|+|++...+..+.+..
T Consensus        12 ~liDtgs~~~~~~~~~~~~   30 (92)
T cd00303          12 ALVDSGASVNFISESLAKK   30 (92)
T ss_pred             EEEcCCCcccccCHHHHHH
Confidence            8999999999999988754


No 42 
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=53.19  E-value=61  Score=21.89  Aligned_cols=75  Identities=13%  Similarity=0.156  Sum_probs=42.3

Q ss_pred             cEEEecCCcceecChhHHHHHHHHHHHHhhhccccccccccccccccccCCcceecCeEEEEEeCCcEEEEcCCceEEE-
Q 028157           86 STEIDSGNIITRLPSPVYAALRSAFRKRMKKYKKAKEFEDLLGTCYDLSAYETVVVPKIAIHFLGGVDLELDVRGTLVV-  164 (213)
Q Consensus        86 ~~iiDSGTt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~P~l~~~f~~g~~~~l~~~~y~~~-  164 (213)
                      ...||||+..+.+|.+.-+.-                         .       .-..+.+.=+||..+...++..+.- 
T Consensus        11 ~fLVDTGA~vSviP~~~~~~~-------------------------~-------~~~~~~l~AANgt~I~tyG~~~l~ld   58 (89)
T cd06094          11 RFLVDTGAAVSVLPASSTKKS-------------------------L-------KPSPLTLQAANGTPIATYGTRSLTLD   58 (89)
T ss_pred             EEEEeCCCceEeecccccccc-------------------------c-------cCCceEEEeCCCCeEeeeeeEEEEEE
Confidence            489999999999997553210                         0       0022455555666666666555432 


Q ss_pred             eCCCceE-EEEEecCCCCCeeEecceeeeee
Q 028157          165 ASVSQVC-LEFAIYPPDLNSITLGNVQQRGH  194 (213)
Q Consensus       165 ~~~~~~C-~~~~~~~~~~~~~ilG~~~~~~~  194 (213)
                      .+..... -.|.-.  +-+..|||.-|++.|
T Consensus        59 lGlrr~~~w~FvvA--dv~~pIlGaDfL~~~   87 (89)
T cd06094          59 LGLRRPFAWNFVVA--DVPHPILGADFLQHY   87 (89)
T ss_pred             cCCCcEEeEEEEEc--CCCcceecHHHHHHc
Confidence            2222111 122221  124589999998876


No 43 
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The  C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=52.85  E-value=22  Score=24.64  Aligned_cols=29  Identities=17%  Similarity=0.312  Sum_probs=23.1

Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecChhHHHHH
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      -.+||..+.          +.||||+..+.+.+.-.+..
T Consensus         3 Ck~nG~~vk----------AfVDsGaQ~timS~~caerc   31 (103)
T cd05480           3 CQCAGKELR----------ALVDTGCQYNLISAACLDRL   31 (103)
T ss_pred             eeECCEEEE----------EEEecCCchhhcCHHHHHHc
Confidence            456777764          99999999999998877654


No 44 
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=52.70  E-value=15  Score=29.86  Aligned_cols=26  Identities=23%  Similarity=0.378  Sum_probs=20.3

Q ss_pred             eEEEECC--eEeeccccccCCCcEEEecCCcceecChh
Q 028157           66 TGISVGG--EKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        66 ~~i~vg~--~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      ..|+||.  +.+          .+++|||++.+.+|..
T Consensus         3 ~~i~vGtP~Q~~----------~v~~DTGS~~~wv~~~   30 (278)
T cd06097           3 TPVKIGTPPQTL----------NLDLDTGSSDLWVFSS   30 (278)
T ss_pred             eeEEECCCCcEE----------EEEEeCCCCceeEeeC
Confidence            4577886  444          2999999999999965


No 45 
>PF05585 DUF1758:  Putative peptidase (DUF1758);  InterPro: IPR008737  This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases.
Probab=41.89  E-value=16  Score=27.30  Aligned_cols=21  Identities=19%  Similarity=0.243  Sum_probs=18.5

Q ss_pred             cEEEecCCcceecChhHHHHH
Q 028157           86 STEIDSGNIITRLPSPVYAAL  106 (213)
Q Consensus        86 ~~iiDSGTt~~~lp~~~~~~l  106 (213)
                      .+++|||+..+++-+++.+.|
T Consensus        14 ~~LlDsGSq~SfIt~~la~~L   34 (164)
T PF05585_consen   14 RALLDSGSQRSFITESLANKL   34 (164)
T ss_pred             EEEEecCCchhHHhHHHHHHh
Confidence            489999999999999888765


No 46 
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5.  Plasmepsins are a class of at least 10 enzymes produced by the plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bond in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/thr motif in both N- and C-terminal l
Probab=41.11  E-value=29  Score=28.94  Aligned_cols=29  Identities=21%  Similarity=0.307  Sum_probs=20.9

Q ss_pred             EeEEEECCeEeeccccccCCCcEEEecCCcceecChh
Q 028157           65 LTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        65 l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      +..|.||.-.-.     +   .+++|||++.+.+|..
T Consensus         5 ~~~i~vGtP~Q~-----~---~v~~DTGS~~~wv~~~   33 (326)
T cd06096           5 FIDIFIGNPPQK-----Q---SLILDTGSSSLSFPCS   33 (326)
T ss_pred             EEEEEecCCCeE-----E---EEEEeCCCCceEEecC
Confidence            567888862221     1   2999999999999864


No 47 
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=40.17  E-value=40  Score=22.57  Aligned_cols=23  Identities=13%  Similarity=0.359  Sum_probs=18.7

Q ss_pred             EEECCeEeeccccccCCCcEEEecCCcceecCh
Q 028157           68 ISVGGEKLPFKISYFTKLSTEIDSGNIITRLPS  100 (213)
Q Consensus        68 i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~  100 (213)
                      ++++++.+.          +.+|||.-.|.+++
T Consensus         3 ~~i~g~~~~----------~llDTGAd~Tvi~~   25 (87)
T cd05482           3 LYINGKLFE----------GLLDTGADVSIIAE   25 (87)
T ss_pred             EEECCEEEE----------EEEccCCCCeEEcc
Confidence            566777664          89999999999887


No 48 
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=39.00  E-value=38  Score=27.50  Aligned_cols=27  Identities=22%  Similarity=0.304  Sum_probs=19.3

Q ss_pred             EeEEEECCeEeeccccccCCCcEEEecCCcceecC
Q 028157           65 LTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLP   99 (213)
Q Consensus        65 l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp   99 (213)
                      +..|.||.-.-.+        .+++|||++.+-+|
T Consensus         4 ~~~i~iGtp~q~~--------~v~~DTgS~~~wv~   30 (295)
T cd05474           4 SAELSVGTPPQKV--------TVLLDTGSSDLWVP   30 (295)
T ss_pred             EEEEEECCCCcEE--------EEEEeCCCCcceee
Confidence            4567788733221        28999999999998


No 49 
>PF00026 Asp:  Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.;  InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases (EC 3.4.23.-) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).  More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family does not include the retroviral or retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=36.72  E-value=42  Score=27.39  Aligned_cols=26  Identities=27%  Similarity=0.483  Sum_probs=19.7

Q ss_pred             eEEEEC--CeEeeccccccCCCcEEEecCCcceecChh
Q 028157           66 TGISVG--GEKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        66 ~~i~vg--~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      ..|.||  .+.+.          +++|||++.+.+|..
T Consensus         4 ~~v~iGtp~q~~~----------~~iDTGS~~~wv~~~   31 (317)
T PF00026_consen    4 INVTIGTPPQTFR----------VLIDTGSSDTWVPSS   31 (317)
T ss_dssp             EEEEETTTTEEEE----------EEEETTBSSEEEEBT
T ss_pred             EEEEECCCCeEEE----------EEEecccceeeecee
Confidence            457777  45553          899999999999853


No 50 
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant-specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases.  They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=31.63  E-value=59  Score=26.97  Aligned_cols=28  Identities=25%  Similarity=0.392  Sum_probs=20.1

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChh
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      ..|+||.-.-.     +   .+++|||++.+-+|..
T Consensus        13 ~~i~iGtP~Q~-----~---~v~~DTGSs~lWv~~~   40 (317)
T cd06098          13 GEIGIGTPPQK-----F---TVIFDTGSSNLWVPSS   40 (317)
T ss_pred             EEEEECCCCeE-----E---EEEECCCccceEEecC
Confidence            46788853222     1   2999999999999964


No 51 
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  The enzymes specifically cleave bonds in peptides which 
Probab=31.05  E-value=39  Score=27.16  Aligned_cols=14  Identities=21%  Similarity=0.271  Sum_probs=12.6

Q ss_pred             EEEecCCcceecCh
Q 028157           87 TEIDSGNIITRLPS  100 (213)
Q Consensus        87 ~iiDSGTt~~~lp~  100 (213)
                      +++|||++.+.+|.
T Consensus        17 v~~DTGSs~~wv~~   30 (265)
T cd05476          17 LIVDTGSDLTWTQC   30 (265)
T ss_pred             EEecCCCCCEEEcC
Confidence            89999999999885


No 52 
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. It converts itself to a mature enzyme, triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. Mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=30.92  E-value=47  Score=27.51  Aligned_cols=29  Identities=28%  Similarity=0.328  Sum_probs=21.1

Q ss_pred             EeEEEECCeEeeccccccCCCcEEEecCCcceecChh
Q 028157           65 LTGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        65 l~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      +..|.||.-.-.+        .+++|||++.+.+|..
T Consensus         5 ~~~i~iGtP~q~~--------~v~~DTGS~~~wv~~~   33 (318)
T cd05477           5 YGEISIGTPPQNF--------LVLFDTGSSNLWVPSV   33 (318)
T ss_pred             EEEEEECCCCcEE--------EEEEeCCCccEEEccC
Confidence            4567888633222        2999999999999864


No 53 
>TIGR03778 VPDSG_CTERM VPDSG-CTERM exosortase interaction domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (PubMed:16930487). This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.
Probab=30.50  E-value=11  Score=19.25  Aligned_cols=13  Identities=31%  Similarity=0.399  Sum_probs=9.0

Q ss_pred             ecCCcceecChhH
Q 028157           90 DSGNIITRLPSPV  102 (213)
Q Consensus        90 DSGTt~~~lp~~~  102 (213)
                      |||||+..+--.+
T Consensus         3 DsGST~~Ll~~~l   15 (26)
T TIGR03778         3 DSGSTLALLGLGL   15 (26)
T ss_pred             CchhHHHHHHHHH
Confidence            8899887764433


No 54 
>KOG0012 consensus DNA damage inducible protein [Replication, recombination and repair]
Probab=30.00  E-value=69  Score=27.44  Aligned_cols=38  Identities=21%  Similarity=0.209  Sum_probs=29.3

Q ss_pred             eE-EEEEecCCCCCeeEecceeeeeeEEEEeCCCCEEEEee
Q 028157          170 VC-LEFAIYPPDLNSITLGNVQQRGHEVHYDVGGRRLGFGP  209 (213)
Q Consensus       170 ~C-~~~~~~~~~~~~~ilG~~~~~~~~~vfD~~~~riGfa~  209 (213)
                      .| +.+....  .-...||.-.+|.+--.-|++++++-++.
T Consensus       307 ~c~ftV~d~~--~~d~llGLd~Lrr~~ccIdL~~~~L~ig~  345 (380)
T KOG0012|consen  307 PCSFTVLDRR--DMDLLLGLDMLRRHQCCIDLKTNVLRIGN  345 (380)
T ss_pred             ccceEEecCC--CcchhhhHHHHHhccceeecccCeEEecC
Confidence            56 4565542  23589999999999999999999988764


No 55 
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequences of the zymogens are self-cleaved at acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures; however, their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which 
Probab=28.53  E-value=53  Score=27.21  Aligned_cols=28  Identities=29%  Similarity=0.382  Sum_probs=20.3

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChh
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      ..|.||.-.-.+        .+++|||++.+.+|..
T Consensus        13 ~~i~vGtp~q~~--------~v~~DTGS~~~wv~~~   40 (317)
T cd05478          13 GTISIGTPPQDF--------TVIFDTGSSNLWVPSV   40 (317)
T ss_pred             EEEEeCCCCcEE--------EEEEeCCCccEEEecC
Confidence            357888633222        2999999999999853


No 56 
>PF14543 TAXi_N:  Xylanase inhibitor N-terminal; PDB: 3HD8_A 3VLB_A 3VLA_A 3AUP_D 1T6G_A 1T6E_X 2B42_A.
Probab=26.29  E-value=47  Score=24.76  Aligned_cols=14  Identities=21%  Similarity=0.363  Sum_probs=11.9

Q ss_pred             EEEecCCcceecCh
Q 028157           87 TEIDSGNIITRLPS  100 (213)
Q Consensus        87 ~iiDSGTt~~~lp~  100 (213)
                      ++||||+.++.++=
T Consensus        16 lvvDtgs~l~W~~C   29 (164)
T PF14543_consen   16 LVVDTGSDLTWVQC   29 (164)
T ss_dssp             EEEETT-SSEEEET
T ss_pred             EEEECCCCceEEcC
Confidence            99999999999965


No 57 
>PTZ00147 plasmepsin-1; Provisional
Probab=25.72  E-value=87  Score=27.81  Aligned_cols=15  Identities=27%  Similarity=0.275  Sum_probs=13.8

Q ss_pred             EEEecCCcceecChh
Q 028157           87 TEIDSGNIITRLPSP  101 (213)
Q Consensus        87 ~iiDSGTt~~~lp~~  101 (213)
                      +++|||++.+.+|..
T Consensus       155 Vi~DTGSsdlWVps~  169 (453)
T PTZ00147        155 FIFDTGSANLWVPSI  169 (453)
T ss_pred             EEEeCCCCcEEEeec
Confidence            999999999999964


No 58 
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme.  Proteinase A preferentially hydro
Probab=25.42  E-value=61  Score=26.91  Aligned_cols=28  Identities=29%  Similarity=0.421  Sum_probs=20.4

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChh
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSP  101 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~  101 (213)
                      ..|+||.-.-.+        .+++|||++.+.+|..
T Consensus        13 ~~i~iGtp~q~~--------~v~~DTGSs~~wv~~~   40 (320)
T cd05488          13 TDITLGTPPQKF--------KVILDTGSSNLWVPSV   40 (320)
T ss_pred             EEEEECCCCcEE--------EEEEecCCcceEEEcC
Confidence            558888632221        2999999999999863


No 59 
>PTZ00165 aspartyl protease; Provisional
Probab=23.87  E-value=88  Score=28.02  Aligned_cols=29  Identities=28%  Similarity=0.318  Sum_probs=20.6

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecChhH
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPSPV  102 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~~~  102 (213)
                      ..|+||.-.-     .|   .+++|||++.+-+|...
T Consensus       123 ~~I~IGTPpQ-----~f---~Vv~DTGSS~lWVps~~  151 (482)
T PTZ00165        123 GEIQVGTPPK-----SF---VVVFDTGSSNLWIPSKE  151 (482)
T ss_pred             EEEEeCCCCc-----eE---EEEEeCCCCCEEEEchh
Confidence            3577876222     22   29999999999999753


No 60 
>PLN03146 aspartyl protease family protein; Provisional
Probab=21.79  E-value=73  Score=27.96  Aligned_cols=27  Identities=22%  Similarity=0.347  Sum_probs=19.2

Q ss_pred             eEEEECCeEeeccccccCCCcEEEecCCcceecCh
Q 028157           66 TGISVGGEKLPFKISYFTKLSTEIDSGNIITRLPS  100 (213)
Q Consensus        66 ~~i~vg~~~~~~~~~~~~~~~~iiDSGTt~~~lp~  100 (213)
                      ..|.||.-..++        .+++|||+.++-+|-
T Consensus        87 v~i~iGTPpq~~--------~vi~DTGS~l~Wv~C  113 (431)
T PLN03146         87 MNISIGTPPVPI--------LAIADTGSDLIWTQC  113 (431)
T ss_pred             EEEEcCCCCceE--------EEEECCCCCcceEcC
Confidence            457777533221        399999999999874


Done!