Query         040033
Match_columns 158
No_of_seqs    144 out of 791
Neff          7.7 
Searched_HMMs 46136
Date          Fri Mar 29 04:25:14 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/040033.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/040033hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF08284 RVP_2:  Retroviral asp  99.9 1.4E-25 3.1E-30  163.3  11.7   84   74-158    17-124 (135)
  2 cd05479 RP_DDI RP_DDI; retrope  99.9 4.3E-21 9.4E-26  137.7  11.9   82   76-158    14-119 (124)
  3 cd05484 retropepsin_like_LTR_2  99.6 1.3E-15 2.8E-20  103.4   9.1   67   80-147     2-91  (91)
  4 PF09668 Asp_protease:  Asparty  99.6 2.2E-15 4.7E-20  108.0  10.3   79   76-155    22-124 (124)
  5 TIGR02281 clan_AA_DTGA clan AA  99.5 1.4E-13 2.9E-18   98.5  10.6   82   75-156     8-114 (121)
  6 PF13650 Asp_protease_2:  Aspar  99.5 2.9E-13 6.3E-18   90.3   9.0   65   81-145     1-90  (90)
  7 PF00077 RVP:  Retroviral aspar  99.5 6.3E-13 1.4E-17   91.2   9.2   72   79-151     6-97  (100)
  8 cd05480 NRIP_C NRIP_C; putativ  99.4 5.3E-13 1.1E-17   91.4   8.3   75   82-157     2-102 (103)
  9 cd05483 retropepsin_like_bacte  99.4 2.1E-12 4.4E-17   87.0   9.4   70   78-147     2-96  (96)
 10 PF12384 Peptidase_A2B:  Ty3 tr  99.3 3.4E-11 7.3E-16   89.4   9.8   79   73-151    29-131 (177)
 11 TIGR03698 clan_AA_DTGF clan AA  99.3 3.5E-11 7.5E-16   84.3   9.4   70   87-158    14-104 (107)
 12 cd06095 RP_RTVL_H_like Retrope  99.2 6.1E-11 1.3E-15   79.7   8.5   65   82-147     2-86  (86)
 13 cd00303 retropepsin_like Retro  99.2 1.6E-10 3.6E-15   73.9   8.1   66   82-147     2-92  (92)
 14 KOG0012 DNA damage inducible p  99.1 1.4E-10 3.1E-15   95.1   6.7   80   77-157   234-337 (380)
 15 PF13975 gag-asp_proteas:  gag-  99.0 8.5E-10 1.8E-14   71.9   5.6   41   75-115     5-45  (72)
 16 COG3577 Predicted aspartyl pro  98.9 7.7E-09 1.7E-13   79.4   7.6   82   74-155   101-207 (215)
 17 PF02160 Peptidase_A3:  Caulifl  98.6   1E-07 2.3E-12   73.3   6.3   69   88-157    19-110 (201)
 18 cd05481 retropepsin_like_LTR_1  98.4 1.7E-06 3.6E-11   59.1   7.6   61   83-144     3-90  (93)
 19 cd06094 RP_Saci_like RP_Saci_l  98.2 2.8E-06 6.1E-11   57.4   5.3   61   88-150     8-88  (89)
 20 PF00098 zf-CCHC:  Zinc knuckle  98.2 6.1E-07 1.3E-11   43.1   0.9   17   31-47      2-18  (18)
 21 PF05585 DUF1758:  Putative pep  98.2   4E-06 8.6E-11   62.5   5.5   28   88-115    11-38  (164)
 22 COG5550 Predicted aspartyl pro  98.1 2.7E-05 5.9E-10   55.5   8.3   70   86-157    23-113 (125)
 23 cd05482 HIV_retropepsin_like R  97.9 6.5E-05 1.4E-09   50.7   6.5   66   82-147     2-87  (87)
 24 PF13696 zf-CCHC_2:  Zinc knuck  96.7 0.00073 1.6E-08   37.0   0.8   19   30-48      9-27  (32)
 25 smart00343 ZnF_C2HC zinc finge  95.9  0.0036 7.8E-08   32.4   0.9   18   31-48      1-18  (26)
 26 PF13917 zf-CCHC_3:  Zinc knuck  95.7  0.0045 9.7E-08   36.1   0.8   18   30-47      5-22  (42)
 27 PF12382 Peptidase_A2E:  Retrot  95.2   0.051 1.1E-06   37.6   4.9   65   79-143    35-124 (137)
 28 PF14787 zf-CCHC_5:  GAG-polypr  95.1  0.0074 1.6E-07   33.7   0.4   19   30-48      3-21  (36)
 29 COG5082 AIR1 Arginine methyltr  94.7   0.012 2.7E-07   45.0   0.9   21   26-46     56-77  (190)
 30 PF05618 Zn_protease:  Putative  94.2    0.14   3E-06   37.4   5.4   33  117-149    89-124 (138)
 31 COG5082 AIR1 Arginine methyltr  93.9   0.024 5.2E-07   43.4   0.9   19   30-48     98-117 (190)
 32 PF14392 zf-CCHC_4:  Zinc knuck  93.2   0.028   6E-07   33.6   0.2   22   26-47     27-49  (49)
 33 COG4067 Uncharacterized protei  93.1    0.26 5.6E-06   36.6   5.2   38  117-155   113-152 (162)
 34 PTZ00368 universal minicircle   91.2   0.089 1.9E-06   38.4   0.9   17   31-47      2-18  (148)
 35 KOG4400 E3 ubiquitin ligase in  90.9   0.092   2E-06   42.0   0.7   18   30-47    144-161 (261)
 36 COG5222 Uncharacterized conser  90.1    0.21 4.4E-06   41.1   2.1   19   30-48    177-195 (427)
 37 cd05476 pepsin_A_like_plant Ch  89.3     1.4   3E-05   34.9   6.3   62   90-157   177-254 (265)
 38 cd05477 gastricsin Gastricsins  89.2     1.8 3.8E-05   35.1   7.0   67   90-157   202-309 (318)
 39 PTZ00368 universal minicircle   88.9    0.17 3.7E-06   36.9   0.8   18   30-47    104-121 (148)
 40 PTZ00013 plasmepsin 4 (PM4); P  87.3       2 4.3E-05   37.3   6.4   26   81-106   141-168 (450)
 41 cd05470 pepsin_retropepsin_lik  86.8    0.79 1.7E-05   30.9   3.1   25   82-106     2-28  (109)
 42 cd05478 pepsin_A Pepsin A, asp  86.1     2.8   6E-05   34.0   6.4   74   83-157   194-309 (317)
 43 PTZ00147 plasmepsin-1; Provisi  85.7     3.1 6.7E-05   36.1   6.8   26   80-105   141-168 (453)
 44 PF15288 zf-CCHC_6:  Zinc knuck  85.3    0.45 9.7E-06   27.3   1.0   18   31-48      3-22  (40)
 45 cd06096 Plasmepsin_5 Plasmepsi  84.6     2.2 4.7E-05   34.9   5.2   68   89-157   231-314 (326)
 46 cd05474 SAP_like SAPs, pepsin-  84.3     4.1   9E-05   32.3   6.5   64   81-145     5-80  (295)
 47 KOG0109 RNA-binding protein LA  81.4    0.76 1.6E-05   37.7   1.2   17   32-48    163-179 (346)
 48 cd06097 Aspergillopepsin_like   79.3     1.6 3.4E-05   34.7   2.4   68   89-157   198-270 (278)
 49 cd05471 pepsin_like Pepsin-lik  79.2     2.7 5.9E-05   32.8   3.7   61   88-157   201-275 (283)
 50 PF00026 Asp:  Eukaryotic aspar  78.0     3.7   8E-05   32.7   4.2   26   80-105     3-30  (317)
 51 PF05515 Viral_NABP:  Viral nuc  75.8     1.2 2.6E-05   31.8   0.7   18   30-47     63-80  (124)
 52 cd06097 Aspergillopepsin_like   74.0     4.3 9.4E-05   32.2   3.6   25   81-105     3-29  (278)
 53 PF13821 DUF4187:  Domain of un  73.0     1.9 4.1E-05   26.4   1.0   22   26-47     22-49  (55)
 54 KOG0341 DEAD-box protein abstr  69.8     2.1 4.4E-05   36.9   0.9   19   30-48    571-589 (610)
 55 KOG4400 E3 ubiquitin ligase in  69.7     1.8 3.9E-05   34.5   0.5   19   30-48     93-111 (261)
 56 smart00647 IBR In Between Ring  69.6     2.2 4.9E-05   25.9   0.8   16   30-45     49-64  (64)
 57 PF12353 eIF3g:  Eukaryotic tra  68.5     2.6 5.6E-05   30.3   1.0   18   30-48    107-124 (128)
 58 PF03539 Spuma_A9PTase:  Spumav  68.5      16 0.00035   27.1   5.2   63   85-150     1-84  (163)
 59 cd05472 cnd41_like Chloroplast  67.3     6.7 0.00015   31.4   3.4   21   91-111   173-193 (299)
 60 cd06098 phytepsin Phytepsin, a  65.9      11 0.00023   30.7   4.3   26   80-105    12-39  (317)
 61 PF01485 IBR:  IBR domain;  Int  65.8     1.7 3.7E-05   26.4  -0.3   16   30-45     49-64  (64)
 62 KOG0119 Splicing factor 1/bran  64.0     3.4 7.5E-05   36.2   1.1   19   30-48    286-304 (554)
 63 cd05487 renin_like Renin stimu  62.6      12 0.00026   30.4   4.1   25   81-105    11-37  (326)
 64 cd05490 Cathepsin_D2 Cathepsin  62.4      12 0.00027   30.2   4.1   25   80-104     8-34  (325)
 65 cd05476 pepsin_A_like_plant Ch  62.1     9.9 0.00021   29.9   3.4   65   81-145     4-88  (265)
 66 cd05477 gastricsin Gastricsins  61.2      14  0.0003   29.9   4.1   25   81-105     6-32  (318)
 67 cd06098 phytepsin Phytepsin, a  60.5      12 0.00026   30.3   3.7   74   83-157   197-309 (317)
 68 cd05475 nucellin_like Nucellin  59.7      14  0.0003   29.3   3.9   24   81-104     5-30  (273)
 69 cd05485 Cathepsin_D_like Cathe  59.6      12 0.00026   30.6   3.5   67   90-157   211-321 (329)
 70 COG2383 Uncharacterized conser  59.4     2.7 5.9E-05   29.0  -0.2   18  140-157    51-68  (109)
 71 KOG0107 Alternative splicing f  59.1     4.7  0.0001   30.8   1.0   18   30-47    101-118 (195)
 72 cd05473 beta_secretase_like Be  58.3      11 0.00025   31.1   3.3   20   91-110   213-232 (364)
 73 cd05486 Cathespin_E Cathepsin   57.8     9.8 0.00021   30.8   2.7   29   83-111   186-220 (316)
 74 cd05486 Cathespin_E Cathepsin   57.5      15 0.00033   29.6   3.8   24   82-105     4-29  (316)
 75 cd05478 pepsin_A Pepsin A, asp  57.2      17 0.00037   29.4   4.1   26   80-105    12-39  (317)
 76 cd05473 beta_secretase_like Be  56.4      16 0.00035   30.2   3.8   25   81-105     6-32  (364)
 77 cd06096 Plasmepsin_5 Plasmepsi  56.3      18  0.0004   29.4   4.1   25   81-105     6-32  (326)
 78 cd05472 cnd41_like Chloroplast  56.3      11 0.00024   30.1   2.8   65   81-145     4-89  (299)
 79 cd05488 Proteinase_A_fungi Fun  55.7      18  0.0004   29.3   4.0   27   79-105    11-39  (320)
 80 cd05485 Cathepsin_D_like Cathe  55.6      19 0.00041   29.4   4.1   27   80-106    13-41  (329)
 81 PF14543 TAXi_N:  Xylanase inhi  51.7      24 0.00052   25.9   3.8   23   81-103     3-27  (164)
 82 cd05471 pepsin_like Pepsin-lik  51.2      22 0.00048   27.5   3.7   26   82-107     4-31  (283)
 83 KOG2673 Uncharacterized conser  49.9      11 0.00023   32.9   1.7   19   30-48    129-147 (485)
 84 cd05489 xylanase_inhibitor_I_l  46.7      29 0.00062   29.0   3.9   20   91-110   231-250 (362)
 85 PLN03146 aspartyl protease fam  46.5      23 0.00051   30.3   3.4   19   91-109   309-327 (431)
 86 PTZ00165 aspartyl protease; Pr  44.1      34 0.00073   30.0   4.0   28   79-106   121-150 (482)
 87 KOG2560 RNA splicing factor -   43.6     5.8 0.00013   34.5  -0.7   19   28-46    111-129 (529)
 88 PF14541 TAXi_C:  Xylanase inhi  42.1      22 0.00047   25.9   2.2   23   89-111    29-51  (161)
 89 cd05487 renin_like Renin stimu  41.9      31 0.00067   28.0   3.3   22   90-111   208-229 (326)
 90 cd05490 Cathepsin_D2 Cathepsin  41.6      19 0.00041   29.2   2.0   22   90-111   207-228 (325)
 91 PLN03146 aspartyl protease fam  40.9      35 0.00076   29.3   3.6   26   79-104    85-112 (431)
 92 KOG2044 5'-3' exonuclease HKE1  38.1      11 0.00025   35.0   0.2   19   30-48    261-279 (931)
 93 KOG0119 Splicing factor 1/bran  37.1      19 0.00041   31.8   1.3   34   15-48    240-280 (554)
 94 cd05488 Proteinase_A_fungi Fun  36.6      28 0.00062   28.1   2.3   67   90-157   206-312 (320)
 95 KOG4584 Uncharacterized conser  35.7      24 0.00051   29.3   1.6   22  122-143   196-217 (348)
 96 PTZ00147 plasmepsin-1; Provisi  35.6      53  0.0011   28.5   3.9   22   89-110   332-353 (453)
 97 PTZ00165 aspartyl protease; Pr  34.8      53  0.0012   28.8   3.8   23   89-111   327-349 (482)
 98 cd05474 SAP_like SAPs, pepsin-  34.6      30 0.00065   27.3   2.1   69   88-157   177-286 (295)
 99 TIGR02854 spore_II_GA sigma-E   33.9      59  0.0013   26.4   3.7   35   77-111   157-202 (288)
100 PF03419 Peptidase_U4:  Sporula  33.4      65  0.0014   26.0   3.9   34   78-111   157-201 (293)
101 PTZ00013 plasmepsin 4 (PM4); P  31.7      31 0.00068   29.9   1.8   23   89-111   331-353 (450)
102 COG1644 RPB10 DNA-directed RNA  30.3      24 0.00051   22.2   0.6   13   30-42      5-18  (63)
103 KOG3497 DNA-directed RNA polym  28.4      26 0.00056   22.0   0.6    8   31-38      6-13  (69)
104 PF00026 Asp:  Eukaryotic aspar  28.1      54  0.0012   25.9   2.6   68   89-157   199-308 (317)
105 PF13395 HNH_4:  HNH endonuclea  26.3      24 0.00051   21.1   0.1    9   32-40      1-9   (54)
106 smart00400 ZnF_CHCC zinc finge  25.6      34 0.00074   20.3   0.8   11   30-40     24-34  (55)
107 PF11880 DUF3400:  Domain of un  25.6      60  0.0013   19.0   1.7   23  125-149     9-31  (45)
108 PF13771 zf-HC5HC2H:  PHD-like   25.0      49  0.0011   21.4   1.5   19   30-48     37-55  (90)
109 PHA00689 hypothetical protein   24.9      18 0.00038   21.8  -0.6   18   24-41     12-29  (62)
110 PF13717 zinc_ribbon_4:  zinc-r  24.6      24 0.00052   19.4  -0.1   23   18-40     13-36  (36)
111 PLN00032 DNA-directed RNA poly  23.8      31 0.00068   22.3   0.3    9   30-38      5-13  (71)
112 PRK04016 DNA-directed RNA poly  23.1      31 0.00067   21.7   0.2    9   30-38      5-13  (62)
113 PF04746 DUF575:  Protein of un  22.8      45 0.00098   22.7   1.0   12  138-149    28-39  (101)
114 KOG1339 Aspartyl protease [Pos  20.8      97  0.0021   26.1   2.9   22   87-108    57-78  (398)

No 1  
>PF08284 RVP_2:  Retroviral aspartyl protease;  InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases. 
Probab=99.93  E-value=1.4e-25  Score=163.35  Aligned_cols=84  Identities=40%  Similarity=0.676  Sum_probs=79.0

Q ss_pred             ccCceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE------------------------EEEEECCEEEEEEE
Q 040033           74 RASETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI------------------------VNLILQGVYVIVDF  129 (158)
Q Consensus        74 ~~~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~------------------------A~i~i~g~~~~~~~  129 (158)
                      ..++.|...+.|+++++.+|||||||||||++++|++++++.                        .++.++|++|..+|
T Consensus        17 ~~~~vi~g~~~I~~~~~~vLiDSGAThsFIs~~~a~~~~l~~~~l~~~~~V~~~g~~~~~~~~~~~~~~~i~g~~~~~dl   96 (135)
T PF08284_consen   17 ESPDVITGTFLINSIPASVLIDSGATHSFISSSFAKKLGLPLEPLPRPIVVSAPGGSINCEGVCPDVPLSIQGHEFVVDL   96 (135)
T ss_pred             CCCCeEEEEEEeccEEEEEEEecCCCcEEccHHHHHhcCCEEEEccCeeEEecccccccccceeeeEEEEECCeEEEeee
Confidence            457889999999999999999999999999999999999887                        18899999999999


Q ss_pred             EEecCCCCcEEechhHhhhcCCceeeecC
Q 040033          130 NLRELEGYDVVLGTQWLRTLEPILWDFAS  158 (158)
Q Consensus       130 ~v~~~~~~dvILG~dwL~~~~~i~idw~~  158 (158)
                      .|+++.++|||||||||.+|+| .|||.+
T Consensus        97 ~vl~l~~~DvILGm~WL~~~~~-~IDw~~  124 (135)
T PF08284_consen   97 LVLDLGGYDVILGMDWLKKHNP-VIDWAT  124 (135)
T ss_pred             EEecccceeeEeccchHHhCCC-EEEccC
Confidence            9999999999999999999999 899974


No 2  
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=99.86  E-value=4.3e-21  Score=137.71  Aligned_cols=82  Identities=21%  Similarity=0.321  Sum_probs=75.6

Q ss_pred             CceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE------------------------EEEEECCEEEEEEEEE
Q 040033           76 SETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI------------------------VNLILQGVYVIVDFNL  131 (158)
Q Consensus        76 ~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~------------------------A~i~i~g~~~~~~~~v  131 (158)
                      ...+.+.+.|||+++.+||||||++|||+.++|+++|++.                        .++.+++.++.++|.|
T Consensus        14 ~~~~~v~~~Ing~~~~~LvDTGAs~s~Is~~~a~~lgl~~~~~~~~~~~~~g~g~~~~~g~~~~~~l~i~~~~~~~~~~V   93 (124)
T cd05479          14 VPMLYINVEINGVPVKAFVDSGAQMTIMSKACAEKCGLMRLIDKRFQGIAKGVGTQKILGRIHLAQVKIGNLFLPCSFTV   93 (124)
T ss_pred             eeEEEEEEEECCEEEEEEEeCCCceEEeCHHHHHHcCCccccCcceEEEEecCCCcEEEeEEEEEEEEECCEEeeeEEEE
Confidence            4578899999999999999999999999999999999853                        1888999999999999


Q ss_pred             ecCCCCcEEechhHhhhcCCceeeecC
Q 040033          132 RELEGYDVVLGTQWLRTLEPILWDFAS  158 (158)
Q Consensus       132 ~~~~~~dvILG~dwL~~~~~i~idw~~  158 (158)
                      +++..+|+|||||||++++. .|||++
T Consensus        94 l~~~~~d~ILG~d~L~~~~~-~ID~~~  119 (124)
T cd05479          94 LEDDDVDFLIGLDMLKRHQC-VIDLKE  119 (124)
T ss_pred             ECCCCcCEEecHHHHHhCCe-EEECCC
Confidence            99999999999999999995 899974


No 3  
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=99.64  E-value=1.3e-15  Score=103.40  Aligned_cols=67  Identities=19%  Similarity=0.226  Sum_probs=61.1

Q ss_pred             EEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE-----------------------EEEEECCEEEEEEEEEecCCC
Q 040033           80 RINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI-----------------------VNLILQGVYVIVDFNLRELEG  136 (158)
Q Consensus        80 ~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~-----------------------A~i~i~g~~~~~~~~v~~~~~  136 (158)
                      .+.+.|||+++.+||||||++|||+++.+.+++.+.                       ..+++++.++.++|+|++.. 
T Consensus         2 ~~~~~Ing~~i~~lvDTGA~~svis~~~~~~lg~~~~~~~~~~v~~a~G~~~~~~G~~~~~v~~~~~~~~~~~~v~~~~-   80 (91)
T cd05484           2 TVTLLVNGKPLKFQLDTGSAITVISEKTWRKLGSPPLKPTKKRLRTATGTKLSVLGQILVTVKYGGKTKVLTLYVVKNE-   80 (91)
T ss_pred             EEEEEECCEEEEEEEcCCcceEEeCHHHHHHhCCCccccccEEEEecCCCEeeEeEEEEEEEEECCEEEEEEEEEEECC-
Confidence            467899999999999999999999999999999753                       17788999999999999998 


Q ss_pred             CcEEechhHhh
Q 040033          137 YDVVLGTQWLR  147 (158)
Q Consensus       137 ~dvILG~dwL~  147 (158)
                      ++.|||+|||.
T Consensus        81 ~~~lLG~~wl~   91 (91)
T cd05484          81 GLNLLGRDWLD   91 (91)
T ss_pred             CCCccChhhcC
Confidence            99999999984


No 4  
>PF09668 Asp_protease:  Aspartyl protease;  InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure.  This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=99.64  E-value=2.2e-15  Score=107.97  Aligned_cols=79  Identities=24%  Similarity=0.381  Sum_probs=67.3

Q ss_pred             CceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE------------------------EEEEECCEEEEEEEEE
Q 040033           76 SETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI------------------------VNLILQGVYVIVDFNL  131 (158)
Q Consensus        76 ~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~------------------------A~i~i~g~~~~~~~~v  131 (158)
                      ...+.+.+.|||++++|+|||||.+|.||.++|+++|+..                        +++++++..+.+.|.|
T Consensus        22 v~mLyI~~~ing~~vkA~VDtGAQ~tims~~~a~r~gL~~lid~r~~g~a~GvG~~~i~G~Ih~~~l~ig~~~~~~s~~V  101 (124)
T PF09668_consen   22 VSMLYINCKINGVPVKAFVDTGAQSTIMSKSCAERCGLMRLIDKRFAGVAKGVGTQKILGRIHSVQLKIGGLFFPCSFTV  101 (124)
T ss_dssp             ----EEEEEETTEEEEEEEETT-SS-EEEHHHHHHTTGGGGEEGGG-EE-------EEEEEEEEEEEEETTEEEEEEEEE
T ss_pred             cceEEEEEEECCEEEEEEEeCCCCccccCHHHHHHcCChhhccccccccccCCCcCceeEEEEEEEEEECCEEEEEEEEE
Confidence            4567889999999999999999999999999999999864                        2889999999999999


Q ss_pred             ecCCCCcEEechhHhhhcCCceee
Q 040033          132 RELEGYDVVLGTQWLRTLEPILWD  155 (158)
Q Consensus       132 ~~~~~~dvILG~dwL~~~~~i~id  155 (158)
                      ++-...|+|||.|||++|+. .||
T Consensus       102 le~~~~d~llGld~L~~~~c-~ID  124 (124)
T PF09668_consen  102 LEDQDVDLLLGLDMLKRHKC-CID  124 (124)
T ss_dssp             ETTSSSSEEEEHHHHHHTT--EEE
T ss_pred             eCCCCcceeeeHHHHHHhCc-ccC
Confidence            99889999999999999997 676


No 5  
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between the trusted and noise cutoffs are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=99.52  E-value=1.4e-13  Score=98.49  Aligned_cols=82  Identities=21%  Similarity=0.239  Sum_probs=72.1

Q ss_pred             cCceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE-----------E------------EEEECCEEEE-EEEE
Q 040033           75 ASETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI-----------V------------NLILQGVYVI-VDFN  130 (158)
Q Consensus        75 ~~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~-----------A------------~i~i~g~~~~-~~~~  130 (158)
                      ...++.+++.|||+++.+||||||++++|++++|+++|+..           |            .+.+++..+. +.+.
T Consensus         8 ~~g~~~v~~~InG~~~~flVDTGAs~t~is~~~A~~Lgl~~~~~~~~~~~~ta~G~~~~~~~~l~~l~iG~~~~~nv~~~   87 (121)
T TIGR02281         8 GDGHFYATGRVNGRNVRFLVDTGATSVALNEEDAQRLGLDLNRLGYTVTVSTANGQIKAARVTLDRVAIGGIVVNDVDAM   87 (121)
T ss_pred             CCCeEEEEEEECCEEEEEEEECCCCcEEcCHHHHHHcCCCcccCCceEEEEeCCCcEEEEEEEeCEEEECCEEEeCcEEE
Confidence            35678899999999999999999999999999999999875           1            7888998888 8899


Q ss_pred             EecCC-CCcEEechhHhhhcCCceeee
Q 040033          131 LRELE-GYDVVLGTQWLRTLEPILWDF  156 (158)
Q Consensus       131 v~~~~-~~dvILG~dwL~~~~~i~idw  156 (158)
                      |++.. ..+.+||||||.++..+.+|-
T Consensus        88 v~~~~~~~~~LLGm~fL~~~~~~~~~~  114 (121)
T TIGR02281        88 VAEGGALSESLLGMSFLNRLSRFTVRG  114 (121)
T ss_pred             EeCCCcCCceEcCHHHHhccccEEEEC
Confidence            99886 358999999999998877774


No 6  
>PF13650 Asp_protease_2:  Aspartyl protease
Probab=99.48  E-value=2.9e-13  Score=90.28  Aligned_cols=65  Identities=22%  Similarity=0.330  Sum_probs=58.3

Q ss_pred             EEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE-----------------------EEEEECCEEE-EEEEEEec-CC
Q 040033           81 INGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI-----------------------VNLILQGVYV-IVDFNLRE-LE  135 (158)
Q Consensus        81 ~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~-----------------------A~i~i~g~~~-~~~~~v~~-~~  135 (158)
                      ++++|||+++.+||||||+.++|++++++++++..                       ..+++++..+ .+++.+++ ..
T Consensus         1 V~v~vng~~~~~liDTGa~~~~i~~~~~~~l~~~~~~~~~~~~~~~~~g~~~~~~~~~~~i~ig~~~~~~~~~~v~~~~~   80 (90)
T PF13650_consen    1 VPVKVNGKPVRFLIDTGASISVISRSLAKKLGLKPRPKSVPISVSGAGGSVTVYRGRVDSITIGGITLKNVPFLVVDLGD   80 (90)
T ss_pred             CEEEECCEEEEEEEcCCCCcEEECHHHHHHcCCCCcCCceeEEEEeCCCCEEEEEEEEEEEEECCEEEEeEEEEEECCCC
Confidence            46889999999999999999999999999998775                       1888999888 58899999 56


Q ss_pred             CCcEEechhH
Q 040033          136 GYDVVLGTQW  145 (158)
Q Consensus       136 ~~dvILG~dw  145 (158)
                      .+|+|||+||
T Consensus        81 ~~~~iLG~df   90 (90)
T PF13650_consen   81 PIDGILGMDF   90 (90)
T ss_pred             CCEEEeCCcC
Confidence            8999999998


No 7  
>PF00077 RVP:  Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026;  InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=99.45  E-value=6.3e-13  Score=91.18  Aligned_cols=72  Identities=22%  Similarity=0.226  Sum_probs=63.8

Q ss_pred             EEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE--------------------EEEEECCEEEEEEEEEecCCCCc
Q 040033           79 LRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI--------------------VNLILQGVYVIVDFNLRELEGYD  138 (158)
Q Consensus        79 i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~--------------------A~i~i~g~~~~~~~~v~~~~~~d  138 (158)
                      -.+.+.++|+++.+||||||+.|+|+++.++......                    +.+.+++..+...|+|++....|
T Consensus         6 p~i~v~i~g~~i~~LlDTGA~vsiI~~~~~~~~~~~~~~~~~v~~~~g~~~~~~~~~~~v~~~~~~~~~~~~v~~~~~~~   85 (100)
T PF00077_consen    6 PYITVKINGKKIKALLDTGADVSIISEKDWKKLGPPPKTSITVRGAGGSSSILGSTTVEVKIGGKEFNHTFLVVPDLPMN   85 (100)
T ss_dssp             SEEEEEETTEEEEEEEETTBSSEEESSGGSSSTSSEEEEEEEEEETTEEEEEEEEEEEEEEETTEEEEEEEEESSTCSSE
T ss_pred             ceEEEeECCEEEEEEEecCCCcceecccccccccccccCCceeccCCCcceeeeEEEEEEEEECccceEEEEecCCCCCC
Confidence            3578999999999999999999999999998775532                    18889999999999999987788


Q ss_pred             EEechhHhhhcCC
Q 040033          139 VVLGTQWLRTLEP  151 (158)
Q Consensus       139 vILG~dwL~~~~~  151 (158)
                       |||+|||++++.
T Consensus        86 -ILG~D~L~~~~~   97 (100)
T PF00077_consen   86 -ILGRDFLKKLNA   97 (100)
T ss_dssp             -EEEHHHHTTTTC
T ss_pred             -EeChhHHHHcCC
Confidence             999999999986


No 8  
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=99.45  E-value=5.3e-13  Score=91.41  Aligned_cols=75  Identities=20%  Similarity=0.184  Sum_probs=68.9

Q ss_pred             EEEECCEeeEEEecCCCceeEECHHHHHHcCCeE--------------------------EEEEECCEEEEEEEEEecCC
Q 040033           82 NGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI--------------------------VNLILQGVYVIVDFNLRELE  135 (158)
Q Consensus        82 ~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~--------------------------A~i~i~g~~~~~~~~v~~~~  135 (158)
                      ..++||++++|+|||||-+|+||..+|+++|+..                          |++++++..+...|.|++-.
T Consensus         2 nCk~nG~~vkAfVDsGaQ~timS~~caercgL~r~v~~~r~~g~A~gvgt~~kiiGrih~~~ikig~~~~~CSftVld~~   81 (103)
T cd05480           2 SCQCAGKELRALVDTGCQYNLISAACLDRLGLKERVLKAKAEEEAPSLPTSVKVIGQIERLVLQLGQLTVECSAQVVDDN   81 (103)
T ss_pred             ceeECCEEEEEEEecCCchhhcCHHHHHHcChHhhhhhccccccccCCCcceeEeeEEEEEEEEeCCEEeeEEEEEEcCC
Confidence            4679999999999999999999999999999863                          28889999999999999988


Q ss_pred             CCcEEechhHhhhcCCceeeec
Q 040033          136 GYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       136 ~~dvILG~dwL~~~~~i~idw~  157 (158)
                      +.|++||.|-|++|.. .||.+
T Consensus        82 ~~d~llGLdmLkrhqc-~IdL~  102 (103)
T cd05480          82 EKNFSLGLQTLKSLKC-VINLE  102 (103)
T ss_pred             CcceEeeHHHHhhcce-eeecc
Confidence            9999999999999998 79864


No 9  
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacterial aspartate proteases is a subfamily of the retropepsin-like protease family, which includes enzymes from retroviruses and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacterial aspartate proteases is half as long as its fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=99.41  E-value=2.1e-12  Score=87.03  Aligned_cols=70  Identities=21%  Similarity=0.318  Sum_probs=62.0

Q ss_pred             eEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE----------E------------EEEECCEEEE-EEEEEecC
Q 040033           78 TLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI----------V------------NLILQGVYVI-VDFNLREL  134 (158)
Q Consensus        78 ~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~----------A------------~i~i~g~~~~-~~~~v~~~  134 (158)
                      .+.+++.||++++.+||||||++++|+..+++++++..          |            .+++++..+. ..+.+++.
T Consensus         2 ~~~v~v~i~~~~~~~llDTGa~~s~i~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~i~ig~~~~~~~~~~v~d~   81 (96)
T cd05483           2 HFVVPVTINGQPVRFLLDTGASTTVISEELAERLGLPLTLGGKVTVQTANGRVRAARVRLDSLQIGGITLRNVPAVVLPG   81 (96)
T ss_pred             cEEEEEEECCEEEEEEEECCCCcEEcCHHHHHHcCCCccCCCcEEEEecCCCccceEEEcceEEECCcEEeccEEEEeCC
Confidence            46789999999999999999999999999999988522          1            8889999887 68999999


Q ss_pred             CC--CcEEechhHhh
Q 040033          135 EG--YDVVLGTQWLR  147 (158)
Q Consensus       135 ~~--~dvILG~dwL~  147 (158)
                      ..  .|.|||+|||+
T Consensus        82 ~~~~~~gIlG~d~l~   96 (96)
T cd05483          82 DALGVDGLLGMDFLR   96 (96)
T ss_pred             cccCCceEeChHHhC
Confidence            87  99999999984


No 10 
>PF12384 Peptidase_A2B:  Ty3 transposon peptidase;  InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like element found in budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species.
Probab=99.28  E-value=3.4e-11  Score=89.37  Aligned_cols=79  Identities=16%  Similarity=0.216  Sum_probs=72.2

Q ss_pred             cccCceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE---------------------E---EEEECCEEEEEE
Q 040033           73 VRASETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI---------------------V---NLILQGVYVIVD  128 (158)
Q Consensus        73 ~~~~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~---------------------A---~i~i~g~~~~~~  128 (158)
                      -.-..+...+..++|.++.+|+||||-.|||+..++.+|+++.                     |   ++.+++..+.+.
T Consensus        29 Pevg~T~~v~l~~~~t~i~vLfDSGSPTSfIr~di~~kL~L~~~~app~~fRG~vs~~~~~tsEAv~ld~~i~n~~i~i~  108 (177)
T PF12384_consen   29 PEVGKTAIVQLNCKGTPIKVLFDSGSPTSFIRSDIVEKLELPTHDAPPFRFRGFVSGESATTSEAVTLDFYIDNKLIDIA  108 (177)
T ss_pred             cccCcEEEEEEeecCcEEEEEEeCCCccceeehhhHHhhCCccccCCCEEEeeeccCCceEEEEeEEEEEEECCeEEEEE
Confidence            3345778889999999999999999999999999999999987                     1   788999999999


Q ss_pred             EEEecCCCCcEEechhHhhhcCC
Q 040033          129 FNLRELEGYDVVLGTQWLRTLEP  151 (158)
Q Consensus       129 ~~v~~~~~~dvILG~dwL~~~~~  151 (158)
                      ++|++..++|+|+|.+.|+++..
T Consensus       109 aYV~d~m~~dlIIGnPiL~ryp~  131 (177)
T PF12384_consen  109 AYVTDNMDHDLIIGNPILDRYPT  131 (177)
T ss_pred             EEEeccCCcceEeccHHHhhhHH
Confidence            99999999999999999999875


No 11 
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=99.28  E-value=3.5e-11  Score=84.30  Aligned_cols=70  Identities=19%  Similarity=0.229  Sum_probs=59.3

Q ss_pred             CEeeEEEecCCCceeE-ECHHHHHHcCCeE--------------------EEEEECCEEEEEEEEEecCCCCcEEechhH
Q 040033           87 NISPIVLVDSGSTHNF-ISDTFAKKVKNFI--------------------VNLILQGVYVIVDFNLRELEGYDVVLGTQW  145 (158)
Q Consensus        87 ~~~v~aLiDSGat~sf-I~~~~a~~~~~~~--------------------A~i~i~g~~~~~~~~v~~~~~~dvILG~dw  145 (158)
                      ..++.+||||||+..+ |++++|+++|+..                    +.+.++|.+....+.+.+..+ +++|||.|
T Consensus        14 ~~~v~~LVDTGat~~~~l~~~~a~~lgl~~~~~~~~~tA~G~~~~~~v~~~~v~igg~~~~~~v~~~~~~~-~~LLG~~~   92 (107)
T TIGR03698        14 FMEVRALVDTGFSGFLLVPPDIVNKLGLPELDQRRVYLADGREVLTDVAKASIIINGLEIDAFVESLGYVD-EPLLGTEL   92 (107)
T ss_pred             ceEEEEEEECCCCeEEecCHHHHHHcCCCcccCcEEEecCCcEEEEEEEEEEEEECCEEEEEEEEecCCCC-ccEecHHH
Confidence            3589999999999998 9999999999886                    178889988866666666656 89999999


Q ss_pred             hhhcCCceeeecC
Q 040033          146 LRTLEPILWDFAS  158 (158)
Q Consensus       146 L~~~~~i~idw~~  158 (158)
                      |.+++. .+||++
T Consensus        93 L~~l~l-~id~~~  104 (107)
T TIGR03698        93 LEGLGI-VIDYRN  104 (107)
T ss_pred             HhhCCE-EEehhh
Confidence            999984 899974


No 12 
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where 
Probab=99.24  E-value=6.1e-11  Score=79.73  Aligned_cols=65  Identities=25%  Similarity=0.272  Sum_probs=54.6

Q ss_pred             EEEECCEeeEEEecCCCceeEECHHHHHHcCCeE-------E-------------EEEECCEEEEEEEEEecCCCCcEEe
Q 040033           82 NGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI-------V-------------NLILQGVYVIVDFNLRELEGYDVVL  141 (158)
Q Consensus        82 ~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~-------A-------------~i~i~g~~~~~~~~v~~~~~~dvIL  141 (158)
                      .+.|||+++.+|+||||+++.|++..++++....       |             .+.+++++....+.+++-. .+.||
T Consensus         2 ~v~InG~~~~fLvDTGA~~tii~~~~a~~~~~~~~~~~v~gagG~~~~~v~~~~~~v~vg~~~~~~~~~v~~~~-~~~lL   80 (86)
T cd06095           2 TITVEGVPIVFLVDTGATHSVLKSDLGPKQELSTTSVLIRGVSGQSQQPVTTYRTLVDLGGHTVSHSFLVVPNC-PDPLL   80 (86)
T ss_pred             EEEECCEEEEEEEECCCCeEEECHHHhhhccCCCCcEEEEeCCCcccccEEEeeeEEEECCEEEEEEEEEEcCC-CCcEe
Confidence            5789999999999999999999999999852211       0             3788999999999888753 69999


Q ss_pred             chhHhh
Q 040033          142 GTQWLR  147 (158)
Q Consensus       142 G~dwL~  147 (158)
                      |||||.
T Consensus        81 G~dfL~   86 (86)
T cd06095          81 GRDLLS   86 (86)
T ss_pred             chhhcC
Confidence            999984


No 13 
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=99.18  E-value=1.6e-10  Score=73.92  Aligned_cols=66  Identities=41%  Similarity=0.663  Sum_probs=57.3

Q ss_pred             EEEECCEeeEEEecCCCceeEECHHHHHHcCC-eE------------------------EEEEECCEEEEEEEEEecCCC
Q 040033           82 NGRIGNISPIVLVDSGSTHNFISDTFAKKVKN-FI------------------------VNLILQGVYVIVDFNLRELEG  136 (158)
Q Consensus        82 ~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~-~~------------------------A~i~i~g~~~~~~~~v~~~~~  136 (158)
                      .+.+++.++.+|+|+||++++++..++.+++. ..                        ..+.+++..+...|++++...
T Consensus         2 ~~~~~~~~~~~liDtgs~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~   81 (92)
T cd00303           2 KGKINGVPVRALVDSGASVNFISESLAKKLGLPPRLLPTPLKVKGANGSSVKTLGVILPVTIGIGGKTFTVDFYVLDLLS   81 (92)
T ss_pred             EEEECCEEEEEEEcCCCcccccCHHHHHHcCCCcccCCCceEEEecCCCEeccCcEEEEEEEEeCCEEEEEEEEEEcCCC
Confidence            46688999999999999999999999999876 22                        055677888999999999999


Q ss_pred             CcEEechhHhh
Q 040033          137 YDVVLGTQWLR  147 (158)
Q Consensus       137 ~dvILG~dwL~  147 (158)
                      +|+|||+|||.
T Consensus        82 ~~~ilG~~~l~   92 (92)
T cd00303          82 YDVILGRPWLE   92 (92)
T ss_pred             cCEEecccccC
Confidence            99999999984


No 14 
>KOG0012 consensus DNA damage inducible protein [Replication, recombination and repair]
Probab=99.11  E-value=1.4e-10  Score=95.07  Aligned_cols=80  Identities=24%  Similarity=0.353  Sum_probs=74.1

Q ss_pred             ceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE------------------------EEEEECCEEEEEEEEEe
Q 040033           77 ETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI------------------------VNLILQGVYVIVDFNLR  132 (158)
Q Consensus        77 ~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~------------------------A~i~i~g~~~~~~~~v~  132 (158)
                      ..+.+.+.|||++|+|+|||||-.|.||.++|+++|+..                        +.++++...+...|.|+
T Consensus       234 ~ML~iN~~ing~~VKAfVDsGaq~timS~~Caer~gL~rlid~r~~g~a~gvg~~ki~g~Ih~~~lki~~~~l~c~ftV~  313 (380)
T KOG0012|consen  234 TMLYINCEINGVPVKAFVDSGAQTTIMSAACAERCGLNRLIDKRFQGEARGVGTEKILGRIHQAQLKIEDLYLPCSFTVL  313 (380)
T ss_pred             eEEEEEEEECCEEEEEEEcccchhhhhhHHHHHHhChHHHhhhhhhccccCCCcccccceeEEEEEEeccEeeccceEEe
Confidence            446778999999999999999999999999999999876                        28899999999999999


Q ss_pred             cCCCCcEEechhHhhhcCCceeeec
Q 040033          133 ELEGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       133 ~~~~~dvILG~dwL~~~~~i~idw~  157 (158)
                      +-.+.|+.||.|-|++|+. .||.+
T Consensus       314 d~~~~d~llGLd~Lrr~~c-cIdL~  337 (380)
T KOG0012|consen  314 DRRDMDLLLGLDMLRRHQC-CIDLK  337 (380)
T ss_pred             cCCCcchhhhHHHHHhccc-eeecc
Confidence            9999999999999999998 79975


No 15 
>PF13975 gag-asp_proteas:  gag-polyprotein putative aspartyl protease
Probab=99.00  E-value=8.5e-10  Score=71.85  Aligned_cols=41  Identities=34%  Similarity=0.539  Sum_probs=38.3

Q ss_pred             cCceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE
Q 040033           75 ASETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI  115 (158)
Q Consensus        75 ~~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~  115 (158)
                      ....+++.+.|+++.+.+||||||++|||++++|+++|++.
T Consensus         5 ~~g~~~v~~~I~g~~~~alvDtGat~~fis~~~a~rLgl~~   45 (72)
T PF13975_consen    5 DPGLMYVPVSIGGVQVKALVDTGATHNFISESLAKRLGLPL   45 (72)
T ss_pred             cCCEEEEEEEECCEEEEEEEeCCCcceecCHHHHHHhCCCc
Confidence            45788999999999999999999999999999999999877


No 16 
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=98.88  E-value=7.7e-09  Score=79.44  Aligned_cols=82  Identities=23%  Similarity=0.285  Sum_probs=70.0

Q ss_pred             ccCceEEEEEEECCEeeEEEecCCCceeEECHHHHHHcCCeE------------------E-----EEEECCEEEE-EEE
Q 040033           74 RASETLRINGRIGNISPIVLVDSGSTHNFISDTFAKKVKNFI------------------V-----NLILQGVYVI-VDF  129 (158)
Q Consensus        74 ~~~~~i~~~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~~~~------------------A-----~i~i~g~~~~-~~~  129 (158)
                      +...++...+.|||+.+.+|||||||.-.+++.-|+++|+..                  |     .++|++..+. ++.
T Consensus       101 ~~~GHF~a~~~VNGk~v~fLVDTGATsVal~~~dA~RlGid~~~l~y~~~v~TANG~~~AA~V~Ld~v~IG~I~~~nV~A  180 (215)
T COG3577         101 SRDGHFEANGRVNGKKVDFLVDTGATSVALNEEDARRLGIDLNSLDYTITVSTANGRARAAPVTLDRVQIGGIRVKNVDA  180 (215)
T ss_pred             cCCCcEEEEEEECCEEEEEEEecCcceeecCHHHHHHhCCCccccCCceEEEccCCccccceEEeeeEEEccEEEcCchh
Confidence            345678889999999999999999999999999999999876                  1     7888988887 788


Q ss_pred             EEecCC-CCcEEechhHhhhcCCceee
Q 040033          130 NLRELE-GYDVVLGTQWLRTLEPILWD  155 (158)
Q Consensus       130 ~v~~~~-~~dvILG~dwL~~~~~i~id  155 (158)
                      +|++-+ --..+|||.||.+++-.+.+
T Consensus       181 ~V~~~g~L~~sLLGMSfL~rL~~fq~~  207 (215)
T COG3577         181 MVAEDGALDESLLGMSFLNRLSGFQVD  207 (215)
T ss_pred             heecCCccchhhhhHHHHhhccceEec
Confidence            999766 67789999999998765443


No 17 
>PF02160 Peptidase_A3:  Cauliflower mosaic virus peptidase (A3);  InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contains an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco). ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=98.60  E-value=1e-07  Score=73.34  Aligned_cols=69  Identities=23%  Similarity=0.334  Sum_probs=53.3

Q ss_pred             EeeEEEecCCCceeEECHHHH-----HHcCCeE----E--------------EEEECCEEEEEEEEEecCCCCcEEechh
Q 040033           88 ISPIVLVDSGSTHNFISDTFA-----KKVKNFI----V--------------NLILQGVYVIVDFNLRELEGYDVVLGTQ  144 (158)
Q Consensus        88 ~~v~aLiDSGat~sfI~~~~a-----~~~~~~~----A--------------~i~i~g~~~~~~~~v~~~~~~dvILG~d  144 (158)
                      ..+.++|||||+.++++..+.     +++.-+.    |              .+.+.|+.|...+.-.--.+.|+|||++
T Consensus        19 ~~~~~~vDTGAt~C~~~~~iiP~e~we~~~~~i~v~~an~~~~~i~~~~~~~~i~I~~~~F~IP~iYq~~~g~d~IlG~N   98 (201)
T PF02160_consen   19 FNYHCYVDTGATICCASKKIIPEEYWEKSKKPIKVKGANGSIIQINKKAKNGKIQIADKIFRIPTIYQQESGIDIILGNN   98 (201)
T ss_pred             EEEEEEEeCCCceEEecCCcCCHHHHHhCCCcEEEEEecCCceEEEEEecCceEEEccEEEeccEEEEecCCCCEEecch
Confidence            567899999999999887654     4444322    2              7889999999865444336899999999


Q ss_pred             HhhhcCCceeeec
Q 040033          145 WLRTLEPILWDFA  157 (158)
Q Consensus       145 wL~~~~~i~idw~  157 (158)
                      |++.+.| -+.|.
T Consensus        99 F~r~y~P-fiq~~  110 (201)
T PF02160_consen   99 FLRLYEP-FIQTE  110 (201)
T ss_pred             HHHhcCC-cEEEc
Confidence            9999999 47774


No 18 
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsins of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=98.41  E-value=1.7e-06  Score=59.06  Aligned_cols=61  Identities=18%  Similarity=0.172  Sum_probs=50.6

Q ss_pred             EEECC-EeeEEEecCCCceeEECHHHHHHcC---CeE----------E-------------EEEECCEEEEEEEEEecCC
Q 040033           83 GRIGN-ISPIVLVDSGSTHNFISDTFAKKVK---NFI----------V-------------NLILQGVYVIVDFNLRELE  135 (158)
Q Consensus        83 ~~i~~-~~v~aLiDSGat~sfI~~~~a~~~~---~~~----------A-------------~i~i~g~~~~~~~~v~~~~  135 (158)
                      ..|++ +++++++||||+.|+|+.+..++++   .+.          |             .+.+++..+.++|+|++..
T Consensus         3 ~~i~g~~~v~~~vDtGA~vnllp~~~~~~l~~~~~~~L~~t~~~L~~~~g~~~~~~G~~~~~v~~~~~~~~~~f~Vvd~~   82 (93)
T cd05481           3 MKINGKQSVKFQLDTGATCNVLPLRWLKSLTPDKDPELRPSPVRLTAYGGSTIPVEGGVKLKCRYRNPKYNLTFQVVKEE   82 (93)
T ss_pred             eEeCCceeEEEEEecCCEEEeccHHHHhhhccCCCCcCccCCeEEEeeCCCEeeeeEEEEEEEEECCcEEEEEEEEECCC
Confidence            56888 9999999999999999999999987   222          1             7778999999999999975


Q ss_pred             CCcEEechh
Q 040033          136 GYDVVLGTQ  144 (158)
Q Consensus       136 ~~dvILG~d  144 (158)
                       ..-|||.+
T Consensus        83 -~~~lLG~~   90 (93)
T cd05481          83 -GPPLLGAK   90 (93)
T ss_pred             -CCceEccc
Confidence             55667764


No 19 
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=98.23  E-value=2.8e-06  Score=57.36  Aligned_cols=61  Identities=21%  Similarity=0.278  Sum_probs=46.8

Q ss_pred             EeeEEEecCCCceeEECHHHHHHc-C-CeE---E--------------EEEECC-EEEEEEEEEecCCCCcEEechhHhh
Q 040033           88 ISPIVLVDSGSTHNFISDTFAKKV-K-NFI---V--------------NLILQG-VYVIVDFNLRELEGYDVVLGTQWLR  147 (158)
Q Consensus        88 ~~v~aLiDSGat~sfI~~~~a~~~-~-~~~---A--------------~i~i~g-~~~~~~~~v~~~~~~dvILG~dwL~  147 (158)
                      -.+..||||||.+|.|.....++. . .+.   |              .+.++. +.|.-.|.|.+..  .-|||.|||+
T Consensus         8 s~~~fLVDTGA~vSviP~~~~~~~~~~~~~~l~AANgt~I~tyG~~~l~ldlGlrr~~~w~FvvAdv~--~pIlGaDfL~   85 (89)
T cd06094           8 SGLRFLVDTGAAVSVLPASSTKKSLKPSPLTLQAANGTPIATYGTRSLTLDLGLRRPFAWNFVVADVP--HPILGADFLQ   85 (89)
T ss_pred             CCcEEEEeCCCceEeeccccccccccCCceEEEeCCCCeEeeeeeEEEEEEcCCCcEEeEEEEEcCCC--cceecHHHHH
Confidence            357899999999999999988863 1 111   1              455564 4888899998875  4799999999


Q ss_pred             hcC
Q 040033          148 TLE  150 (158)
Q Consensus       148 ~~~  150 (158)
                      .|+
T Consensus        86 ~~~   88 (89)
T cd06094          86 HYG   88 (89)
T ss_pred             HcC
Confidence            886


No 20 
>PF00098 zf-CCHC:  Zinc knuckle;  InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:  C-X2-C-X4-H-X4-C  where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses. They are required for viral genome packaging and for the early infection process.
They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers. ; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=98.18  E-value=6.1e-07  Score=43.09  Aligned_cols=17  Identities=6%  Similarity=0.001  Sum_probs=15.9

Q ss_pred             ceeecCCCCCCCccCCc
Q 040033           31 VCATIATKSFPRDIGER   47 (158)
Q Consensus        31 ~Cf~Cg~~gH~~~~C~~   47 (158)
                      .||+|++.||++++||+
T Consensus         2 ~C~~C~~~GH~~~~Cp~   18 (18)
T PF00098_consen    2 KCFNCGEPGHIARDCPK   18 (18)
T ss_dssp             BCTTTSCSSSCGCTSSS
T ss_pred             cCcCCCCcCcccccCcc
Confidence            59999999999999985


No 21 
>PF05585 DUF1758:  Putative peptidase (DUF1758);  InterPro: IPR008737  This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases.
Probab=98.16  E-value=4e-06  Score=62.51  Aligned_cols=28  Identities=32%  Similarity=0.485  Sum_probs=26.1

Q ss_pred             EeeEEEecCCCceeEECHHHHHHcCCeE
Q 040033           88 ISPIVLVDSGSTHNFISDTFAKKVKNFI  115 (158)
Q Consensus        88 ~~v~aLiDSGat~sfI~~~~a~~~~~~~  115 (158)
                      .++.+|+||||..|||++++|++++++.
T Consensus        11 ~~~~~LlDsGSq~SfIt~~la~~L~L~~   38 (164)
T PF05585_consen   11 VEARALLDSGSQRSFITESLANKLNLPG   38 (164)
T ss_pred             EEEEEEEecCCchhHHhHHHHHHhCCCC
Confidence            5788999999999999999999999975


No 22 
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.10  E-value=2.7e-05  Score=55.45  Aligned_cols=70  Identities=17%  Similarity=0.155  Sum_probs=58.5

Q ss_pred             CCEeeEEEecCCCc-eeEECHHHHHHcCCeE--------------------EEEEECCEEEEEEEEEecCCCCcEEechh
Q 040033           86 GNISPIVLVDSGST-HNFISDTFAKKVKNFI--------------------VNLILQGVYVIVDFNLRELEGYDVVLGTQ  144 (158)
Q Consensus        86 ~~~~v~aLiDSGat-~sfI~~~~a~~~~~~~--------------------A~i~i~g~~~~~~~~v~~~~~~dvILG~d  144 (158)
                      +++....|||||.+ --.+++++|.+++++.                    |.++++|.+...-..+.+....+ ++|++
T Consensus        23 Gd~~~~~LiDTGFtg~lvlp~~vaek~~~~~~~~~~~~~a~~~~v~t~V~~~~iki~g~e~~~~Vl~s~~~~~~-liG~~  101 (125)
T COG5550          23 GDFVYDELIDTGFTGYLVLPPQVAEKLGLPLFSTIRIVLADGGVVKTSVALATIKIDGVEKVAFVLASDNLPEP-LIGVN  101 (125)
T ss_pred             CcEEeeeEEecCCceeEEeCHHHHHhcCCCccCChhhhhhcCCEEEEEEEEEEEEECCEEEEEEEEccCCCccc-chhhh
Confidence            34455569999999 8899999999999887                    38899999888888888888888 99999


Q ss_pred             HhhhcCCceeeec
Q 040033          145 WLRTLEPILWDFA  157 (158)
Q Consensus       145 wL~~~~~i~idw~  157 (158)
                      ||+.++- .+|..
T Consensus       102 ~lk~l~~-~vn~~  113 (125)
T COG5550         102 LLKLLGL-VVNPK  113 (125)
T ss_pred             hhhhccE-EEcCC
Confidence            9998885 67654


No 23 
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=97.87  E-value=6.5e-05  Score=50.68  Aligned_cols=66  Identities=17%  Similarity=-0.018  Sum_probs=49.4

Q ss_pred             EEEECCEeeEEEecCCCceeEECHHHHHHcC----CeE----------------EEEEECCEEEEEEEEEecCCCCcEEe
Q 040033           82 NGRIGNISPIVLVDSGSTHNFISDTFAKKVK----NFI----------------VNLILQGVYVIVDFNLRELEGYDVVL  141 (158)
Q Consensus        82 ~~~i~~~~v~aLiDSGat~sfI~~~~a~~~~----~~~----------------A~i~i~g~~~~~~~~v~~~~~~dvIL  141 (158)
                      ...|+|+.+.+|+||||..++|++.-..+.-    -+.                -.+++.++.....+.|.+.....-||
T Consensus         2 ~~~i~g~~~~~llDTGAd~Tvi~~~~~p~~w~~~~~~~~i~GIGG~~~~~~~~~v~i~i~~~~~~g~vlv~~~~~P~nll   81 (87)
T cd05482           2 TLYINGKLFEGLLDTGADVSIIAENDWPKNWPIQPAPSNLTGIGGAITPSQSSVLLLEIDGEGHLGTILVYVLSLPVNLW   81 (87)
T ss_pred             EEEECCEEEEEEEccCCCCeEEcccccCCCCccCCCCeEEEeccceEEEEEEeeEEEEEcCCeEEEEEEEccCCCcccEE
Confidence            4678999999999999999999986554321    111                16777888888888888864445689


Q ss_pred             chhHhh
Q 040033          142 GTQWLR  147 (158)
Q Consensus       142 G~dwL~  147 (158)
                      |+|.|.
T Consensus        82 GRd~L~   87 (87)
T cd05482          82 GRDILS   87 (87)
T ss_pred             ccccCC
Confidence            999874


No 24 
>PF13696 zf-CCHC_2:  Zinc knuckle
Probab=96.66  E-value=0.00073  Score=36.98  Aligned_cols=19  Identities=16%  Similarity=0.048  Sum_probs=17.5

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      .+|+.|+++||+..+||.+
T Consensus         9 Y~C~~C~~~GH~i~dCP~~   27 (32)
T PF13696_consen    9 YVCHRCGQKGHWIQDCPTN   27 (32)
T ss_pred             CEeecCCCCCccHhHCCCC
Confidence            6899999999999999974


No 25 
>smart00343 ZnF_C2HC zinc finger.
Probab=95.88  E-value=0.0036  Score=32.39  Aligned_cols=18  Identities=6%  Similarity=0.051  Sum_probs=16.0

Q ss_pred             ceeecCCCCCCCccCCcc
Q 040033           31 VCATIATKSFPRDIGERS   48 (158)
Q Consensus        31 ~Cf~Cg~~gH~~~~C~~~   48 (158)
                      .||+|++.||.+.+||..
T Consensus         1 ~C~~CG~~GH~~~~C~~~   18 (26)
T smart00343        1 KCYNCGKEGHIARDCPKX   18 (26)
T ss_pred             CCccCCCCCcchhhCCcc
Confidence            499999999999999853


No 26 
>PF13917 zf-CCHC_3:  Zinc knuckle
Probab=95.67  E-value=0.0045  Score=36.05  Aligned_cols=18  Identities=11%  Similarity=-0.040  Sum_probs=16.9

Q ss_pred             cceeecCCCCCCCccCCc
Q 040033           30 KVCATIATKSFPRDIGER   47 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~   47 (158)
                      ..|.+|++.||...+|++
T Consensus         5 ~~CqkC~~~GH~tyeC~~   22 (42)
T PF13917_consen    5 VRCQKCGQKGHWTYECPN   22 (42)
T ss_pred             CcCcccCCCCcchhhCCC
Confidence            689999999999999994


No 27 
>PF12382 Peptidase_A2E:  Retrotransposon peptidase;  InterPro: IPR024648 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  This entry represents a small family of fungal retroviral aspartyl peptidases.
Probab=95.21  E-value=0.051  Score=37.62  Aligned_cols=65  Identities=17%  Similarity=0.254  Sum_probs=48.3

Q ss_pred             EEEEEEECCE--eeEEEecCCCceeEECHHHHHHcCCeE---------E--------------EEEECCEEEEEEEEEec
Q 040033           79 LRINGRIGNI--SPIVLVDSGSTHNFISDTFAKKVKNFI---------V--------------NLILQGVYVIVDFNLRE  133 (158)
Q Consensus        79 i~~~~~i~~~--~v~aLiDSGat~sfI~~~~a~~~~~~~---------A--------------~i~i~g~~~~~~~~v~~  133 (158)
                      |.+++.+...  .+.-|||+||..|.|.+..++...++.         +              .+.++|......|+|+.
T Consensus        35 mvlqa~lp~fkcsipclidtgaq~niiteetvrahklptrpw~~sviyggvyp~kinrkt~kl~i~lngisikteflvvk  114 (137)
T PF12382_consen   35 MVLQAKLPDFKCSIPCLIDTGAQVNIITEETVRAHKLPTRPWSQSVIYGGVYPNKINRKTIKLNINLNGISIKTEFLVVK  114 (137)
T ss_pred             hhhhhhCCCccccceeEEccCceeeeeehhhhhhccCCCCcchhheEeccccccccccceEEEEEEecceEEEEEEEEEE
Confidence            3344444333  346799999999999999999888876         0              67789999999999998


Q ss_pred             CCCCcEEech
Q 040033          134 LEGYDVVLGT  143 (158)
Q Consensus       134 ~~~~dvILG~  143 (158)
                      .-.++.-+.+
T Consensus       115 kfshpaaisf  124 (137)
T PF12382_consen  115 KFSHPAAISF  124 (137)
T ss_pred             eccCcceEEE
Confidence            6655554443


No 28 
>PF14787 zf-CCHC_5:  GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=95.08  E-value=0.0074  Score=33.74  Aligned_cols=19  Identities=5%  Similarity=-0.027  Sum_probs=12.4

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ++|++|++..|.+.+|..+
T Consensus         3 ~~CprC~kg~Hwa~~C~sk   21 (36)
T PF14787_consen    3 GLCPRCGKGFHWASECRSK   21 (36)
T ss_dssp             -C-TTTSSSCS-TTT---T
T ss_pred             ccCcccCCCcchhhhhhhh
Confidence            7899999999999999875


No 29 
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=94.71  E-value=0.012  Score=44.95  Aligned_cols=21  Identities=19%  Similarity=0.180  Sum_probs=17.9

Q ss_pred             hhcc-cceeecCCCCCCCccCC
Q 040033           26 ERRA-KVCATIATKSFPRDIGE   46 (158)
Q Consensus        26 ~rR~-~~Cf~Cg~~gH~~~~C~   46 (158)
                      .+++ ..||+||+.||.+++||
T Consensus        56 ~~~~~~~C~nCg~~GH~~~DCP   77 (190)
T COG5082          56 IREENPVCFNCGQNGHLRRDCP   77 (190)
T ss_pred             ccccccccchhcccCcccccCC
Confidence            3344 88999999999999999


No 30 
>PF05618 Zn_protease:  Putative ATP-dependent zinc protease;  InterPro: IPR008503 This family consists of several hypothetical proteins from different archaeal and bacterial species.; PDB: 2PMA_B.
Probab=94.21  E-value=0.14  Score=37.36  Aligned_cols=33  Identities=30%  Similarity=0.526  Sum_probs=25.2

Q ss_pred             EEEECCEEEEEEEEEecCC--CCcEEec-hhHhhhc
Q 040033          117 NLILQGVYVIVDFNLRELE--GYDVVLG-TQWLRTL  149 (158)
Q Consensus       117 ~i~i~g~~~~~~~~v~~~~--~~dvILG-~dwL~~~  149 (158)
                      .+.++|..+.+.|.+.+=+  .|+|+|| ..||...
T Consensus        89 ~~~lg~~~~~~e~tL~dR~~m~yp~LlGrR~~l~~~  124 (138)
T PF05618_consen   89 TLCLGGKTWKIEFTLTDRSNMKYPMLLGRRNFLRGR  124 (138)
T ss_dssp             EEEETTEEEEEEEEEE-S--SS-SEEE-HHHHHHTT
T ss_pred             EEEECCEEEEEEEEEcCCCcCcCCEEEEehHHhcCC
Confidence            7788999999999998754  7999999 9999653


No 31 
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=93.89  E-value=0.024  Score=43.41  Aligned_cols=19  Identities=11%  Similarity=0.034  Sum_probs=16.9

Q ss_pred             cceeecCCCCCCCccC-Ccc
Q 040033           30 KVCATIATKSFPRDIG-ERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C-~~~   48 (158)
                      ..||+||+.||++++| |++
T Consensus        98 ~~C~~Cg~~GH~~~dC~P~~  117 (190)
T COG5082          98 KKCYNCGETGHLSRDCNPSK  117 (190)
T ss_pred             cccccccccCccccccCccc
Confidence            5799999999999999 655


No 32 
>PF14392 zf-CCHC_4:  Zinc knuckle
Probab=93.23  E-value=0.028  Score=33.63  Aligned_cols=22  Identities=9%  Similarity=0.129  Sum_probs=18.7

Q ss_pred             hhcc-cceeecCCCCCCCccCCc
Q 040033           26 ERRA-KVCATIATKSFPRDIGER   47 (158)
Q Consensus        26 ~rR~-~~Cf~Cg~~gH~~~~C~~   47 (158)
                      +-|- ..||+||..||....||+
T Consensus        27 YE~lp~~C~~C~~~gH~~~~C~k   49 (49)
T PF14392_consen   27 YERLPRFCFHCGRIGHSDKECPK   49 (49)
T ss_pred             ECCcChhhcCCCCcCcCHhHcCC
Confidence            3355 789999999999999985


No 33 
>COG4067 Uncharacterized protein conserved in archaea [Posttranslational modification, protein turnover, chaperones]
Probab=93.12  E-value=0.26  Score=36.62  Aligned_cols=38  Identities=32%  Similarity=0.533  Sum_probs=31.1

Q ss_pred             EEEECCEEEEEEEEEecCC--CCcEEechhHhhhcCCceee
Q 040033          117 NLILQGVYVIVDFNLRELE--GYDVVLGTQWLRTLEPILWD  155 (158)
Q Consensus       117 ~i~i~g~~~~~~~~v~~~~--~~dvILG~dwL~~~~~i~id  155 (158)
                      .+.++|....+.|.+.+=.  .|+|+||.-+|..... .+|
T Consensus       113 ~l~lG~~~~~~E~tLtDR~~m~Yp~LlGrk~l~~~~~-~VD  152 (162)
T COG4067         113 TLCLGGRILPIEFTLTDRSNMRYPVLLGRKALRHFGA-VVD  152 (162)
T ss_pred             EEeeCCeeeeEEEEeecccccccceEecHHHHhhCCe-EEC
Confidence            6778999999999988754  7999999999998654 444


No 34 
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=91.21  E-value=0.089  Score=38.43  Aligned_cols=17  Identities=12%  Similarity=0.077  Sum_probs=14.2

Q ss_pred             ceeecCCCCCCCccCCc
Q 040033           31 VCATIATKSFPRDIGER   47 (158)
Q Consensus        31 ~Cf~Cg~~gH~~~~C~~   47 (158)
                      +||+|++.||++++||.
T Consensus         2 ~C~~C~~~GH~~~~c~~   18 (148)
T PTZ00368          2 VCYRCGGVGHQSRECPN   18 (148)
T ss_pred             cCCCCCCCCcCcccCcC
Confidence            58888888888888886


No 35 
>KOG4400 consensus E3 ubiquitin ligase interacting with arginine methyltransferase [Posttranslational modification, protein turnover, chaperones]
Probab=90.86  E-value=0.092  Score=41.99  Aligned_cols=18  Identities=11%  Similarity=0.110  Sum_probs=16.2

Q ss_pred             cceeecCCCCCCCccCCc
Q 040033           30 KVCATIATKSFPRDIGER   47 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~   47 (158)
                      ..||.||+.||+.++||.
T Consensus       144 ~~Cy~Cg~~GH~s~~C~~  161 (261)
T KOG4400|consen  144 AKCYSCGEQGHISDDCPE  161 (261)
T ss_pred             CccCCCCcCCcchhhCCC
Confidence            349999999999999994


No 36 
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=90.13  E-value=0.21  Score=41.07  Aligned_cols=19  Identities=16%  Similarity=0.041  Sum_probs=17.5

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ..||+||++||....||.+
T Consensus       177 Y~CyRCGqkgHwIqnCpTN  195 (427)
T COG5222         177 YVCYRCGQKGHWIQNCPTN  195 (427)
T ss_pred             eeEEecCCCCchhhcCCCC
Confidence            6799999999999999966


No 37 
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco, and Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  The enzymes specifically cleave bonds in peptides which 
Probab=89.34  E-value=1.4  Score=34.86  Aligned_cols=62  Identities=18%  Similarity=0.272  Sum_probs=38.4

Q ss_pred             eEEEecCCCceeEECHHHHHHcCCeEEEEEEC-CEEEEEE--------------EEEecC-CCCcEEechhHhhhcCCce
Q 040033           90 PIVLVDSGSTHNFISDTFAKKVKNFIVNLILQ-GVYVIVD--------------FNLREL-EGYDVVLGTQWLRTLEPIL  153 (158)
Q Consensus        90 v~aLiDSGat~sfI~~~~a~~~~~~~A~i~i~-g~~~~~~--------------~~v~~~-~~~dvILG~dwL~~~~~i~  153 (158)
                      ..++||||++..++.+.+.-.+     .+.++ +..+.+.              +.++.. ..--.|||-.||+.+-- .
T Consensus       177 ~~ai~DTGTs~~~lp~~~~P~i-----~~~f~~~~~~~i~~~~y~~~~~~~~~C~~~~~~~~~~~~ilG~~fl~~~~~-v  250 (265)
T cd05476         177 GGTIIDSGTTLTYLPDPAYPDL-----TLHFDGGADLELPPENYFVDVGEGVVCLAILSSSSGGVSILGNIQQQNFLV-E  250 (265)
T ss_pred             CcEEEeCCCcceEcCccccCCE-----EEEECCCCEEEeCcccEEEECCCCCEEEEEecCCCCCcEEEChhhcccEEE-E
Confidence            3489999999999998876222     23333 3322211              122222 34568999999998764 4


Q ss_pred             eeec
Q 040033          154 WDFA  157 (158)
Q Consensus       154 idw~  157 (158)
                      .|+.
T Consensus       251 FD~~  254 (265)
T cd05476         251 YDLE  254 (265)
T ss_pred             EECC
Confidence            5654


No 38 
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. Self-conversion to the mature enzyme is triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. The mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=89.22  E-value=1.8  Score=35.14  Aligned_cols=67  Identities=15%  Similarity=0.207  Sum_probs=43.8

Q ss_pred             eEEEecCCCceeEECHHHHHHcC----CeE-----------E-------EEEECCEEEEEE-------------EEEecC
Q 040033           90 PIVLVDSGSTHNFISDTFAKKVK----NFI-----------V-------NLILQGVYVIVD-------------FNLREL  134 (158)
Q Consensus        90 v~aLiDSGat~sfI~~~~a~~~~----~~~-----------A-------~i~i~g~~~~~~-------------~~v~~~  134 (158)
                      ..++||||++..++...+++.+-    ...           .       .+.++|.++.+.             +.+.+.
T Consensus       202 ~~~iiDSGtt~~~lP~~~~~~l~~~~~~~~~~~~~~~~~C~~~~~~p~l~~~f~g~~~~v~~~~y~~~~~~~C~~~i~~~  281 (318)
T cd05477         202 CQAIVDTGTSLLTAPQQVMSTLMQSIGAQQDQYGQYVVNCNNIQNLPTLTFTINGVSFPLPPSAYILQNNGYCTVGIEPT  281 (318)
T ss_pred             ceeeECCCCccEECCHHHHHHHHHHhCCccccCCCEEEeCCccccCCcEEEEECCEEEEECHHHeEecCCCeEEEEEEec
Confidence            36899999999999998877642    111           0       456677766642             122211


Q ss_pred             ------CCCcEEechhHhhhcCCceeeec
Q 040033          135 ------EGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       135 ------~~~dvILG~dwL~~~~~i~idw~  157 (158)
                            +....|||..||+.+-- ..|+.
T Consensus       282 ~~~~~~~~~~~ilG~~fl~~~y~-vfD~~  309 (318)
T cd05477         282 YLPSQNGQPLWILGDVFLRQYYS-VYDLG  309 (318)
T ss_pred             ccCCCCCCceEEEcHHHhhheEE-EEeCC
Confidence                  12358999999998775 46764


No 39 
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=88.93  E-value=0.17  Score=36.90  Aligned_cols=18  Identities=6%  Similarity=0.064  Sum_probs=11.0

Q ss_pred             cceeecCCCCCCCccCCc
Q 040033           30 KVCATIATKSFPRDIGER   47 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~   47 (158)
                      ..||+|++.||++++||+
T Consensus       104 ~~C~~Cg~~gH~~~~C~~  121 (148)
T PTZ00368        104 RACYNCGGEGHISRDCPN  121 (148)
T ss_pred             hhhcccCcCCcchhcCCC
Confidence            356666666666666665


No 40 
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=87.34  E-value=2  Score=37.29  Aligned_cols=26  Identities=19%  Similarity=0.480  Sum_probs=21.1

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECHH
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISDT  106 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~~  106 (158)
                      .++.|+  .+++.+++||||+..+|...
T Consensus       141 ~~i~IGTP~Q~f~vi~DTGSsdlWV~s~  168 (450)
T PTZ00013        141 GEGEVGDNHQKFMLIFDTGSANLWVPSK  168 (450)
T ss_pred             EEEEECCCCeEEEEEEeCCCCceEEecc
Confidence            355665  79999999999999999643


No 41 
>cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well-known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site 
Probab=86.83  E-value=0.79  Score=30.93  Aligned_cols=25  Identities=28%  Similarity=0.414  Sum_probs=20.1

Q ss_pred             EEEECC--EeeEEEecCCCceeEECHH
Q 040033           82 NGRIGN--ISPIVLVDSGSTHNFISDT  106 (158)
Q Consensus        82 ~~~i~~--~~v~aLiDSGat~sfI~~~  106 (158)
                      ++.|+.  +++.++|||||+..++...
T Consensus         2 ~i~vGtP~q~~~~~~DTGSs~~Wv~~~   28 (109)
T cd05470           2 EIGIGTPPQTFNVLLDTGSSNLWVPSV   28 (109)
T ss_pred             EEEeCCCCceEEEEEeCCCCCEEEeCC
Confidence            355654  8999999999999888765


No 42 
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequence of the zymogens are self cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures, however their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which 
Probab=86.08  E-value=2.8  Score=34.03  Aligned_cols=74  Identities=16%  Similarity=0.183  Sum_probs=48.3

Q ss_pred             EEECCEee------EEEecCCCceeEECHHHHHHc----CCeE-------------E-----EEEECCEEEEEE------
Q 040033           83 GRIGNISP------IVLVDSGSTHNFISDTFAKKV----KNFI-------------V-----NLILQGVYVIVD------  128 (158)
Q Consensus        83 ~~i~~~~v------~aLiDSGat~sfI~~~~a~~~----~~~~-------------A-----~i~i~g~~~~~~------  128 (158)
                      +.|+++.+      .++||||++..++......++    +...             .     .+.++|..|.+.      
T Consensus       194 v~v~g~~~~~~~~~~~iiDTGts~~~lp~~~~~~l~~~~~~~~~~~~~~~~~C~~~~~~P~~~f~f~g~~~~i~~~~y~~  273 (317)
T cd05478         194 VTINGQVVACSGGCQAIVDTGTSLLVGPSSDIANIQSDIGASQNQNGEMVVNCSSISSMPDVVFTINGVQYPLPPSAYIL  273 (317)
T ss_pred             EEECCEEEccCCCCEEEECCCchhhhCCHHHHHHHHHHhCCccccCCcEEeCCcCcccCCcEEEEECCEEEEECHHHhee
Confidence            46777654      689999999999998877653    2211             0     455677766642      


Q ss_pred             -------EEEecCC-CCcEEechhHhhhcCCceeeec
Q 040033          129 -------FNLRELE-GYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       129 -------~~v~~~~-~~dvILG~dwL~~~~~i~idw~  157 (158)
                             +.+.+.. ....|||-.||+.+-. ..|+.
T Consensus       274 ~~~~~C~~~~~~~~~~~~~IlG~~fl~~~y~-vfD~~  309 (317)
T cd05478         274 QDQGSCTSGFQSMGLGELWILGDVFIRQYYS-VFDRA  309 (317)
T ss_pred             cCCCEEeEEEEeCCCCCeEEechHHhcceEE-EEeCC
Confidence                   1122222 2458999999998775 46664


No 43 
>PTZ00147 plasmepsin-1; Provisional
Probab=85.75  E-value=3.1  Score=36.11  Aligned_cols=26  Identities=15%  Similarity=0.304  Sum_probs=21.6

Q ss_pred             EEEEEEC--CEeeEEEecCCCceeEECH
Q 040033           80 RINGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        80 ~~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      ...+.|+  .+++.++|||||+..+|..
T Consensus       141 ~~~I~IGTP~Q~f~Vi~DTGSsdlWVps  168 (453)
T PTZ00147        141 YGEAKLGDNGQKFNFIFDTGSANLWVPS  168 (453)
T ss_pred             EEEEEECCCCeEEEEEEeCCCCcEEEee
Confidence            4466776  7999999999999999964


No 44 
>PF15288 zf-CCHC_6:  Zinc knuckle
Probab=85.34  E-value=0.45  Score=27.34  Aligned_cols=18  Identities=11%  Similarity=0.025  Sum_probs=15.6

Q ss_pred             ceeecCCCCCCC--ccCCcc
Q 040033           31 VCATIATKSFPR--DIGERS   48 (158)
Q Consensus        31 ~Cf~Cg~~gH~~--~~C~~~   48 (158)
                      .|-.||..||.+  +.||-+
T Consensus         3 kC~~CG~~GH~~t~k~CP~~   22 (40)
T PF15288_consen    3 KCKNCGAFGHMRTNKRCPMY   22 (40)
T ss_pred             cccccccccccccCccCCCC
Confidence            499999999998  789865


No 45 
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5.  Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=84.62  E-value=2.2  Score=34.86  Aligned_cols=68  Identities=19%  Similarity=0.312  Sum_probs=42.7

Q ss_pred             eeEEEecCCCceeEECHHHHHHcCC--eEEEEEEC-CEEEEEE-------------EEEecCCCCcEEechhHhhhcCCc
Q 040033           89 SPIVLVDSGSTHNFISDTFAKKVKN--FIVNLILQ-GVYVIVD-------------FNLRELEGYDVVLGTQWLRTLEPI  152 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~~~~--~~A~i~i~-g~~~~~~-------------~~v~~~~~~dvILG~dwL~~~~~i  152 (158)
                      ...++||||++..++...+.+++.-  +.-.+.++ |..+.+.             +..+....--.|||-.||+.+-- 
T Consensus       231 ~~~aivDSGTs~~~lp~~~~~~l~~~~P~i~~~f~~g~~~~i~p~~y~~~~~~~~c~~~~~~~~~~~ILG~~flr~~y~-  309 (326)
T cd06096         231 GLGMLVDSGSTLSHFPEDLYNKINNFFPTITIIFENNLKIDWKPSSYLYKKESFWCKGGEKSVSNKPILGASFFKNKQI-  309 (326)
T ss_pred             CCCEEEeCCCCcccCCHHHHHHHHhhcCcEEEEEcCCcEEEECHHHhccccCCceEEEEEecCCCceEEChHHhcCcEE-
Confidence            3458999999999999999887542  11134444 4433320             11122223357999999998774 


Q ss_pred             eeeec
Q 040033          153 LWDFA  157 (158)
Q Consensus       153 ~idw~  157 (158)
                      ..|+.
T Consensus       310 vFD~~  314 (326)
T cd06096         310 IFDLD  314 (326)
T ss_pred             EEECc
Confidence            56764


No 46 
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=84.25  E-value=4.1  Score=32.31  Aligned_cols=64  Identities=22%  Similarity=0.299  Sum_probs=40.5

Q ss_pred             EEEEECC--EeeEEEecCCCceeEECHHHHHHcCC--eE------EEEEECCEEEE-EEEEEecC-CCCcEEechhH
Q 040033           81 INGRIGN--ISPIVLVDSGSTHNFISDTFAKKVKN--FI------VNLILQGVYVI-VDFNLREL-EGYDVVLGTQW  145 (158)
Q Consensus        81 ~~~~i~~--~~v~aLiDSGat~sfI~~~~a~~~~~--~~------A~i~i~g~~~~-~~~~v~~~-~~~dvILG~dw  145 (158)
                      +++.|+.  +++.+++||||+..+|.. +-..++-  ..      ..+.+++.... ..|.++.. ...|-|||..+
T Consensus         5 ~~i~iGtp~q~~~v~~DTgS~~~wv~~-~~~~Y~~g~~~~G~~~~D~v~~g~~~~~~~~fg~~~~~~~~~GilGLg~   80 (295)
T cd05474           5 AELSVGTPPQKVTVLLDTGSSDLWVPD-FSISYGDGTSASGTWGTDTVSIGGATVKNLQFAVANSTSSDVGVLGIGL   80 (295)
T ss_pred             EEEEECCCCcEEEEEEeCCCCcceeee-eEEEeccCCcEEEEEEEEEEEECCeEecceEEEEEecCCCCcceeeECC
Confidence            4566665  899999999999998881 1111111  11      17777776553 45655543 46888988764


No 47 
>KOG0109 consensus RNA-binding protein LARK, contains RRM and retroviral-type Zn-finger domains [RNA processing and modification; General function prediction only]
Probab=81.40  E-value=0.76  Score=37.67  Aligned_cols=17  Identities=6%  Similarity=-0.023  Sum_probs=16.3

Q ss_pred             eeecCCCCCCCccCCcc
Q 040033           32 CATIATKSFPRDIGERS   48 (158)
Q Consensus        32 Cf~Cg~~gH~~~~C~~~   48 (158)
                      ||.||+.||.+.+||..
T Consensus       163 cyrcGkeghwskEcP~~  179 (346)
T KOG0109|consen  163 CYRCGKEGHWSKECPVD  179 (346)
T ss_pred             heeccccccccccCCcc
Confidence            99999999999999976


No 48 
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=79.30  E-value=1.6  Score=34.72  Aligned_cols=68  Identities=12%  Similarity=0.080  Sum_probs=37.8

Q ss_pred             eeEEEecCCCceeEECHHHHHHcCCeE--EEEEECCEEEEEEEEE-ecCC--CCcEEechhHhhhcCCceeeec
Q 040033           89 SPIVLVDSGSTHNFISDTFAKKVKNFI--VNLILQGVYVIVDFNL-RELE--GYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~~~~~~--A~i~i~g~~~~~~~~v-~~~~--~~dvILG~dwL~~~~~i~idw~  157 (158)
                      ...++||||++..++...+++++.-..  |........+.+++.- +|-.  .+..|||-.||+++-. ..||.
T Consensus       198 ~~~~iiDSGTs~~~lP~~~~~~l~~~l~g~~~~~~~~~~~~~C~~~~P~i~f~~~~ilGd~fl~~~y~-vfD~~  270 (278)
T cd06097         198 GFSAIADTGTTLILLPDAIVEAYYSQVPGAYYDSEYGGWVFPCDTTLPDLSFAVFSILGDVFLKAQYV-VFDVG  270 (278)
T ss_pred             CceEEeecCCchhcCCHHHHHHHHHhCcCCcccCCCCEEEEECCCCCCCEEEEEEEEEcchhhCceeE-EEcCC
Confidence            456999999999999987766542111  1110001111111110 0000  0157999999998875 57775


No 49 
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well-known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=79.24  E-value=2.7  Score=32.77  Aligned_cols=61  Identities=16%  Similarity=0.252  Sum_probs=39.1

Q ss_pred             EeeEEEecCCCceeEECHHHHHHcCCeE-EEEE------------EC-CEEEEEEEEEecCCCCcEEechhHhhhcCCce
Q 040033           88 ISPIVLVDSGSTHNFISDTFAKKVKNFI-VNLI------------LQ-GVYVIVDFNLRELEGYDVVLGTQWLRTLEPIL  153 (158)
Q Consensus        88 ~~v~aLiDSGat~sfI~~~~a~~~~~~~-A~i~------------i~-g~~~~~~~~v~~~~~~dvILG~dwL~~~~~i~  153 (158)
                      ....++||||++..++...++..+--.. +...            .. --.+...|        ..|||..||+.+-- .
T Consensus       201 ~~~~~iiDsGt~~~~lp~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~p~i~f~f--------~~ilG~~fl~~~y~-v  271 (283)
T cd05471         201 GGGGAIVDSGTSLIYLPSSVYDAILKALGAAVSSSDGGYGVDCSPCDTLPDITFTF--------LWILGDVFLRNYYT-V  271 (283)
T ss_pred             CCcEEEEecCCCCEeCCHHHHHHHHHHhCCcccccCCcEEEeCcccCcCCCEEEEE--------EEEccHhhhhheEE-E
Confidence            4678999999999999999887753222 1000            00 00111222        89999999998774 4


Q ss_pred             eeec
Q 040033          154 WDFA  157 (158)
Q Consensus       154 idw~  157 (158)
                      .|+.
T Consensus       272 fD~~  275 (283)
T cd05471         272 FDLD  275 (283)
T ss_pred             EeCC
Confidence            5653


No 50 
>PF00026 Asp:  Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.;  InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).  More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions []. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site []. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. 
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event []. This family does not include the retroviral nor retrotransposon aspartic proteases which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=78.01  E-value=3.7  Score=32.71  Aligned_cols=26  Identities=35%  Similarity=0.528  Sum_probs=21.7

Q ss_pred             EEEEEEC--CEeeEEEecCCCceeEECH
Q 040033           80 RINGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        80 ~~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      .+++.|+  .+++.++|||||+..+|..
T Consensus         3 ~~~v~iGtp~q~~~~~iDTGS~~~wv~~   30 (317)
T PF00026_consen    3 YINVTIGTPPQTFRVLIDTGSSDTWVPS   30 (317)
T ss_dssp             EEEEEETTTTEEEEEEEETTBSSEEEEB
T ss_pred             EEEEEECCCCeEEEEEEecccceeeece
Confidence            3567776  8999999999999998874


No 51 
>PF05515 Viral_NABP:  Viral nucleic acid binding ;  InterPro: IPR008891 This family is common to ssRNA positive-strand viruses and are commonly described as nucleic acid binding proteins (NABP).
Probab=75.84  E-value=1.2  Score=31.85  Aligned_cols=18  Identities=11%  Similarity=0.054  Sum_probs=16.0

Q ss_pred             cceeecCCCCCCCccCCc
Q 040033           30 KVCATIATKSFPRDIGER   47 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~   47 (158)
                      +.||.||.--|-.+.|+.
T Consensus        63 ~~C~~CG~~l~~~~~C~~   80 (124)
T PF05515_consen   63 NRCFKCGRYLHNNGNCRR   80 (124)
T ss_pred             CccccccceeecCCcCCC
Confidence            999999997788899984


No 52 
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=74.04  E-value=4.3  Score=32.17  Aligned_cols=25  Identities=20%  Similarity=0.287  Sum_probs=20.5

Q ss_pred             EEEEECC--EeeEEEecCCCceeEECH
Q 040033           81 INGRIGN--ISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        81 ~~~~i~~--~~v~aLiDSGat~sfI~~  105 (158)
                      +.+.|+.  +++.+++||||+..+|..
T Consensus         3 ~~i~vGtP~Q~~~v~~DTGS~~~wv~~   29 (278)
T cd06097           3 TPVKIGTPPQTLNLDLDTGSSDLWVFS   29 (278)
T ss_pred             eeEEECCCCcEEEEEEeCCCCceeEee
Confidence            4566766  899999999999999854


No 53 
>PF13821 DUF4187:  Domain of unknown function (DUF4187)
Probab=72.95  E-value=1.9  Score=26.42  Aligned_cols=22  Identities=18%  Similarity=-0.043  Sum_probs=16.6

Q ss_pred             hhcc--cceeecCCCCCCC----ccCCc
Q 040033           26 ERRA--KVCATIATKSFPR----DIGER   47 (158)
Q Consensus        26 ~rR~--~~Cf~Cg~~gH~~----~~C~~   47 (158)
                      +-|.  .-||+||-++--.    ..||-
T Consensus        22 YLR~~~~YC~~Cg~~Y~d~~dL~~~CPG   49 (55)
T PF13821_consen   22 YLREEHNYCFWCGTKYDDEEDLERNCPG   49 (55)
T ss_pred             HHHhhCceeeeeCCccCCHHHHHhCCCC
Confidence            3355  8899999887665    77875


No 54 
>KOG0341 consensus DEAD-box protein abstrakt [RNA processing and modification]
Probab=69.85  E-value=2.1  Score=36.92  Aligned_cols=19  Identities=16%  Similarity=-0.008  Sum_probs=16.9

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      .-|-|||..||+..+||+-
T Consensus       571 kGCayCgGLGHRItdCPKl  589 (610)
T KOG0341|consen  571 KGCAYCGGLGHRITDCPKL  589 (610)
T ss_pred             cccccccCCCcccccCchh
Confidence            3499999999999999983


No 55 
>KOG4400 consensus E3 ubiquitin ligase interacting with arginine methyltransferase [Posttranslational modification, protein turnover, chaperones]
Probab=69.69  E-value=1.8  Score=34.52  Aligned_cols=19  Identities=5%  Similarity=0.013  Sum_probs=17.5

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ..||+|++.||..++|+..
T Consensus        93 ~~c~~C~~~gH~~~~c~~~  111 (261)
T KOG4400|consen   93 AACFNCGEGGHIERDCPEA  111 (261)
T ss_pred             hhhhhCCCCccchhhCCcc
Confidence            6799999999999999976


No 56 
>smart00647 IBR In Between Ring fingers. The domain occurs between pairs of RING fingers
Probab=69.64  E-value=2.2  Score=25.87  Aligned_cols=16  Identities=6%  Similarity=-0.070  Sum_probs=13.6

Q ss_pred             cceeecCCCCCCCccC
Q 040033           30 KVCATIATKSFPRDIG   45 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C   45 (158)
                      ..||+|++.||..-.|
T Consensus        49 ~fC~~C~~~~H~~~~C   64 (64)
T smart00647       49 SFCFRCKVPWHSPVSC   64 (64)
T ss_pred             eECCCCCCcCCCCCCC
Confidence            6699999999987665


No 57 
>PF12353 eIF3g:  Eukaryotic translation initiation factor 3 subunit G ;  InterPro: IPR024675 At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding []. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 - 700 kDa in mammalian cells. Subunits are designated eIF-3a - eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. Subunit G is required for eIF3 integrity.   This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain PF00076 from PFAM. 
Probab=68.48  E-value=2.6  Score=30.33  Aligned_cols=18  Identities=6%  Similarity=-0.114  Sum_probs=16.2

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ..|+.|+ ..|+..+||.+
T Consensus       107 v~CR~Ck-GdH~T~~CPyK  124 (128)
T PF12353_consen  107 VKCRICK-GDHWTSKCPYK  124 (128)
T ss_pred             EEeCCCC-CCcccccCCcc
Confidence            6699996 99999999986


No 58 
>PF03539 Spuma_A9PTase:  Spumavirus aspartic protease (A9);  InterPro: IPR001641 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to MEROPS peptidase family A9 (spumapepsin family, clan AA). Foamy viruses are single-stranded enveloped retroviruses that have been noted to infect monkeys, cats and humans. In the human virus, the aspartic protease is encoded by the retroviral gag gene [], and in monkeys by the pol gene []. At present, the virus has not been proven to cause any particular disease. However, studies have shown Human foamy virus causes neurological disorders in infected mice []. It is not clear whether the Foamy virus/spumavirus proteases share a common evolutionary origin with other aspartic proteases. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 2JYS_A.
Probab=68.48  E-value=16  Score=27.10  Aligned_cols=63  Identities=19%  Similarity=0.225  Sum_probs=38.3

Q ss_pred             ECCEeeEEEecCCCceeEECHHHHHHcC-CeE-------------E---EEEECCEEEEEEEEEecCCCCcEEe----ch
Q 040033           85 IGNISPIVLVDSGSTHNFISDTFAKKVK-NFI-------------V---NLILQGVYVIVDFNLRELEGYDVVL----GT  143 (158)
Q Consensus        85 i~~~~v~aLiDSGat~sfI~~~~a~~~~-~~~-------------A---~i~i~g~~~~~~~~v~~~~~~dvIL----G~  143 (158)
                      |.|..+.+--||||+.+.|...|...-. +..             +   .++++|+.....++--++   |.||    -.
T Consensus         1 ikg~~l~~~wDsga~ITCiP~~fl~~E~Pi~~~~i~Tihg~~~~~vYYl~fKi~grkv~aEVi~s~~---dy~li~p~di   77 (163)
T PF03539_consen    1 IKGTKLKGHWDSGAQITCIPESFLEEEQPIGKTLIKTIHGEKEQDVYYLTFKINGRKVEAEVIASPY---DYILISPSDI   77 (163)
T ss_dssp             ETTEEEEEEE-TT-SSEEEEGGGTTT---SEEEEEE-SS-EEEEEEEEEEEEESS-EEEEEEEEESS---SSEEE-TTT-
T ss_pred             CCCceeeEEecCCCeEEEccHHHhCccccccceEEEEecCceeccEEEEEEEEcCeEEEEEEecCcc---ceEEEccccc
Confidence            4577889999999999999998865321 111             1   778899877766665553   3333    24


Q ss_pred             hHhhhcC
Q 040033          144 QWLRTLE  150 (158)
Q Consensus       144 dwL~~~~  150 (158)
                      +|+.+..
T Consensus        78 Pw~~~~p   84 (163)
T PF03539_consen   78 PWYKKKP   84 (163)
T ss_dssp             HHHHS--
T ss_pred             ccccCCC
Confidence            7887654


No 59 
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco.  CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. This fami
Probab=67.26  E-value=6.7  Score=31.37  Aligned_cols=21  Identities=19%  Similarity=0.246  Sum_probs=17.7

Q ss_pred             EEEecCCCceeEECHHHHHHc
Q 040033           91 IVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        91 ~aLiDSGat~sfI~~~~a~~~  111 (158)
                      .++||||++..++.+.+.+.+
T Consensus       173 ~~ivDSGTt~~~lp~~~~~~l  193 (299)
T cd05472         173 GVIIDSGTVITRLPPSAYAAL  193 (299)
T ss_pred             CeEEeCCCcceecCHHHHHHH
Confidence            589999999999998776643


No 60 
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, as its animal counterparts, possesses a topology typical of all aspartic proteases.  They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=65.86  E-value=11  Score=30.68  Aligned_cols=26  Identities=23%  Similarity=0.256  Sum_probs=21.1

Q ss_pred             EEEEEEC--CEeeEEEecCCCceeEECH
Q 040033           80 RINGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        80 ~~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      ...+.|+  .+++.++|||||+..+|..
T Consensus        12 ~~~i~iGtP~Q~~~v~~DTGSs~lWv~~   39 (317)
T cd06098          12 FGEIGIGTPPQKFTVIFDTGSSNLWVPS   39 (317)
T ss_pred             EEEEEECCCCeEEEEEECCCccceEEec
Confidence            4456666  6899999999999998864


No 61 
>PF01485 IBR:  IBR domain;  InterPro: IPR002867 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is:  C-x(4)-C-x(14-30)-C-x(1-4)-C-x(4)-C-x(2)-C-x(4)-H-x(4)-C  The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers. 
Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures IPR001841 from INTERPRO. The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding; PDB: 2CT7_A 1WD2_A 2JMO_A 1WIM_A.
Probab=65.84  E-value=1.7  Score=26.37  Aligned_cols=16  Identities=6%  Similarity=-0.020  Sum_probs=13.0

Q ss_pred             cceeecCCCCCCCccC
Q 040033           30 KVCATIATKSFPRDIG   45 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C   45 (158)
                      ..||.|+++||....|
T Consensus        49 ~fC~~C~~~~H~~~~C   64 (64)
T PF01485_consen   49 EFCFKCGEPWHEGVTC   64 (64)
T ss_dssp             EECSSSTSESCTTS-H
T ss_pred             cCccccCcccCCCCCC
Confidence            5699999999987665


No 62 
>KOG0119 consensus Splicing factor 1/branch point binding protein (RRM superfamily) [RNA processing and modification]
Probab=63.96  E-value=3.4  Score=36.18  Aligned_cols=19  Identities=11%  Similarity=0.004  Sum_probs=14.3

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ++|++||..||.+.+|+.+
T Consensus       286 n~c~~cg~~gH~~~dc~~~  304 (554)
T KOG0119|consen  286 NVCKICGPLGHISIDCKVN  304 (554)
T ss_pred             ccccccCCcccccccCCCc
Confidence            4788888888888888754


No 63 
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate  r
Probab=62.55  E-value=12  Score=30.39  Aligned_cols=25  Identities=24%  Similarity=0.294  Sum_probs=20.5

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECH
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      ..+.|+  .+++.++|||||+..+|..
T Consensus        11 ~~i~iGtP~q~~~v~~DTGSs~~Wv~~   37 (326)
T cd05487          11 GEIGIGTPPQTFKVVFDTGSSNLWVPS   37 (326)
T ss_pred             EEEEECCCCcEEEEEEeCCccceEEcc
Confidence            455665  7899999999999999953


No 64 
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank 
Probab=62.39  E-value=12  Score=30.24  Aligned_cols=25  Identities=24%  Similarity=0.283  Sum_probs=20.1

Q ss_pred             EEEEEEC--CEeeEEEecCCCceeEEC
Q 040033           80 RINGRIG--NISPIVLVDSGSTHNFIS  104 (158)
Q Consensus        80 ~~~~~i~--~~~v~aLiDSGat~sfI~  104 (158)
                      ..++.|+  .+++.++|||||+..+|.
T Consensus         8 ~~~i~iGtP~q~~~v~~DTGSs~~Wv~   34 (325)
T cd05490           8 YGEIGIGTPPQTFTVVFDTGSSNLWVP   34 (325)
T ss_pred             EEEEEECCCCcEEEEEEeCCCccEEEE
Confidence            3456665  488999999999999984


No 65 
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  The enzymes specifically cleave bonds in peptides which 
Probab=62.14  E-value=9.9  Score=29.92  Aligned_cols=65  Identities=20%  Similarity=0.210  Sum_probs=38.3

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECH-HHHHHcC--CeE------EEEEECCE--EE-EEEEEEecC------CCCcEE
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISD-TFAKKVK--NFI------VNLILQGV--YV-IVDFNLREL------EGYDVV  140 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~-~~a~~~~--~~~------A~i~i~g~--~~-~~~~~v~~~------~~~dvI  140 (158)
                      +.+.|+  .+.+.+++||||+..++.. .+...++  -..      ..+.+++.  .. ...|.++..      ...|-|
T Consensus         4 ~~i~iGtP~q~~~v~~DTGSs~~wv~~~~~~~~Y~dg~~~~G~~~~D~v~~g~~~~~~~~~~Fg~~~~~~~~~~~~~~GI   83 (265)
T cd05476           4 VTLSIGTPPQPFSLIVDTGSDLTWTQCCSYEYSYGDGSSTSGVLATETFTFGDSSVSVPNVAFGCGTDNEGGSFGGADGI   83 (265)
T ss_pred             EEEecCCCCcceEEEecCCCCCEEEcCCceEeEeCCCceeeeeEEEEEEEecCCCCccCCEEEEecccccCCccCCCCEE
Confidence            445565  6899999999999998853 1111111  000      16667665  22 234555443      258899


Q ss_pred             echhH
Q 040033          141 LGTQW  145 (158)
Q Consensus       141 LG~dw  145 (158)
                      ||..+
T Consensus        84 lGLg~   88 (265)
T cd05476          84 LGLGR   88 (265)
T ss_pred             EECCC
Confidence            99865


No 66 
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. It is self-converting to a mature enzyme being triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. Mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=61.17  E-value=14  Score=29.91  Aligned_cols=25  Identities=28%  Similarity=0.391  Sum_probs=20.4

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECH
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      .++.|+  .+++.++|||||+..++..
T Consensus         6 ~~i~iGtP~q~~~v~~DTGS~~~wv~~   32 (318)
T cd05477           6 GEISIGTPPQNFLVLFDTGSSNLWVPS   32 (318)
T ss_pred             EEEEECCCCcEEEEEEeCCCccEEEcc
Confidence            455666  4899999999999999964


No 67 
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, as its animal counterparts, possesses a topology typical of all aspartic proteases.  They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=60.54  E-value=12  Score=30.32  Aligned_cols=74  Identities=16%  Similarity=0.300  Sum_probs=47.8

Q ss_pred             EEECCEe-------eEEEecCCCceeEECHHHHHHcCCeE-----E-----EEEECCEEEEEE---EE------------
Q 040033           83 GRIGNIS-------PIVLVDSGSTHNFISDTFAKKVKNFI-----V-----NLILQGVYVIVD---FN------------  130 (158)
Q Consensus        83 ~~i~~~~-------v~aLiDSGat~sfI~~~~a~~~~~~~-----A-----~i~i~g~~~~~~---~~------------  130 (158)
                      +.|+++.       ..++||||++..++..++++.+....     .     .+.++|..+.+.   +.            
T Consensus       197 i~v~g~~~~~~~~~~~aivDTGTs~~~lP~~~~~~i~~~~~C~~~~~~P~i~f~f~g~~~~l~~~~yi~~~~~~~~~~C~  276 (317)
T cd06098         197 VLIGGKSTGFCAGGCAAIADSGTSLLAGPTTIVTQINSAVDCNSLSSMPNVSFTIGGKTFELTPEQYILKVGEGAAAQCI  276 (317)
T ss_pred             EEECCEEeeecCCCcEEEEecCCcceeCCHHHHHhhhccCCccccccCCcEEEEECCEEEEEChHHeEEeecCCCCCEEe
Confidence            3566654       46999999999999999988765322     0     456677666542   11            


Q ss_pred             --Ee--cC---CCCcEEechhHhhhcCCceeeec
Q 040033          131 --LR--EL---EGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       131 --v~--~~---~~~dvILG~dwL~~~~~i~idw~  157 (158)
                        +.  +.   .+...|||-.||+.+-- -.|+.
T Consensus       277 ~~~~~~~~~~~~~~~~IlGd~Flr~~y~-VfD~~  309 (317)
T cd06098         277 SGFTALDVPPPRGPLWILGDVFMGAYHT-VFDYG  309 (317)
T ss_pred             ceEEECCCCCCCCCeEEechHHhcccEE-EEeCC
Confidence              11  11   12246999999998765 45664


No 68 
>cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more d
Probab=59.74  E-value=14  Score=29.29  Aligned_cols=24  Identities=21%  Similarity=0.368  Sum_probs=19.6

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEEC
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFIS  104 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~  104 (158)
                      +.+.|+  .+.+.+++||||+..+|.
T Consensus         5 ~~i~iGtP~q~~~v~~DTGS~~~Wv~   30 (273)
T cd05475           5 VTINIGNPPKPYFLDIDTGSDLTWLQ   30 (273)
T ss_pred             EEEEcCCCCeeEEEEEccCCCceEEe
Confidence            345555  688999999999999994


No 69 
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets an
Probab=59.63  E-value=12  Score=30.60  Aligned_cols=67  Identities=13%  Similarity=0.131  Sum_probs=42.4

Q ss_pred             eEEEecCCCceeEECHHHHHHc----CCeE-----------E-------EEEECCEEEEEE---EE--------------
Q 040033           90 PIVLVDSGSTHNFISDTFAKKV----KNFI-----------V-------NLILQGVYVIVD---FN--------------  130 (158)
Q Consensus        90 v~aLiDSGat~sfI~~~~a~~~----~~~~-----------A-------~i~i~g~~~~~~---~~--------------  130 (158)
                      ..++||||++..++...+++.+    +...           .       .+.++|..|.+.   +.              
T Consensus       211 ~~~iiDSGtt~~~lP~~~~~~l~~~~~~~~~~~~~~~~~C~~~~~~p~i~f~fgg~~~~i~~~~yi~~~~~~~~~~C~~~  290 (329)
T cd05485         211 CQAIADTGTSLIAGPVDEIEKLNNAIGAKPIIGGEYMVNCSAIPSLPDITFVLGGKSFSLTGKDYVLKVTQMGQTICLSG  290 (329)
T ss_pred             cEEEEccCCcceeCCHHHHHHHHHHhCCccccCCcEEEeccccccCCcEEEEECCEEeEEChHHeEEEecCCCCCEEeee
Confidence            3699999999999999876654    2110           0       445667666542   11              


Q ss_pred             Eec-----CCCCcEEechhHhhhcCCceeeec
Q 040033          131 LRE-----LEGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       131 v~~-----~~~~dvILG~dwL~~~~~i~idw~  157 (158)
                      +..     ..+...|||..||+.+-. ..||.
T Consensus       291 ~~~~~~~~~~~~~~IlG~~fl~~~y~-vFD~~  321 (329)
T cd05485         291 FMGIDIPPPAGPLWILGDVFIGKYYT-EFDLG  321 (329)
T ss_pred             EEECcCCCCCCCeEEEchHHhccceE-EEeCC
Confidence            111     112348999999998775 46765


No 70 
>COG2383 Uncharacterized conserved protein [Function unknown]
Probab=59.40  E-value=2.7  Score=29.02  Aligned_cols=18  Identities=22%  Similarity=0.470  Sum_probs=16.1

Q ss_pred             EechhHhhhcCCceeeec
Q 040033          140 VLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       140 ILG~dwL~~~~~i~idw~  157 (158)
                      ||+.-||++++-|+|||.
T Consensus        51 ilsl~~La~~GVItin~~   68 (109)
T COG2383          51 ILSLFWLAQYGVITINWE   68 (109)
T ss_pred             HHHHHHHHHcCeEEEcHH
Confidence            577799999999999995


No 71 
>KOG0107 consensus Alternative splicing factor SRp20/9G8 (RRM superfamily) [RNA processing and modification]
Probab=59.09  E-value=4.7  Score=30.80  Aligned_cols=18  Identities=6%  Similarity=0.051  Sum_probs=16.6

Q ss_pred             cceeecCCCCCCCccCCc
Q 040033           30 KVCATIATKSFPRDIGER   47 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~   47 (158)
                      +.|++||+.||..+.|.+
T Consensus       101 ~~~~r~G~rg~~~r~~~~  118 (195)
T KOG0107|consen  101 GFCYRCGERGHIGRNCKD  118 (195)
T ss_pred             cccccCCCcccccccccc
Confidence            669999999999999987


No 72 
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two 
Probab=58.29  E-value=11  Score=31.10  Aligned_cols=20  Identities=25%  Similarity=0.257  Sum_probs=17.4

Q ss_pred             EEEecCCCceeEECHHHHHH
Q 040033           91 IVLVDSGSTHNFISDTFAKK  110 (158)
Q Consensus        91 ~aLiDSGat~sfI~~~~a~~  110 (158)
                      .++||||++..++.......
T Consensus       213 ~~ivDSGTs~~~lp~~~~~~  232 (364)
T cd05473         213 KAIVDSGTTNLRLPVKVFNA  232 (364)
T ss_pred             cEEEeCCCcceeCCHHHHHH
Confidence            48999999999999887664


No 73 
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family; it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=57.80  E-value=9.8  Score=30.77  Aligned_cols=29  Identities=17%  Similarity=0.142  Sum_probs=21.6

Q ss_pred             EEECCEe------eEEEecCCCceeEECHHHHHHc
Q 040033           83 GRIGNIS------PIVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        83 ~~i~~~~------v~aLiDSGat~sfI~~~~a~~~  111 (158)
                      +.|+++.      ..++||||++..++....++.+
T Consensus       186 i~v~g~~~~~~~~~~aiiDTGTs~~~lP~~~~~~l  220 (316)
T cd05486         186 IQVGGTVIFCSDGCQAIVDTGTSLITGPSGDIKQL  220 (316)
T ss_pred             EEEecceEecCCCCEEEECCCcchhhcCHHHHHHH
Confidence            3556543      3699999999999998866544


No 74 
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family; it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=57.46  E-value=15  Score=29.61  Aligned_cols=24  Identities=25%  Similarity=0.403  Sum_probs=19.3

Q ss_pred             EEEEC--CEeeEEEecCCCceeEECH
Q 040033           82 NGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        82 ~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      ++.|+  .+++.++|||||+..+|..
T Consensus         4 ~i~iGtP~Q~~~v~~DTGSs~~Wv~s   29 (316)
T cd05486           4 QISIGTPPQNFTVIFDTGSSNLWVPS   29 (316)
T ss_pred             EEEECCCCcEEEEEEcCCCccEEEec
Confidence            44554  6889999999999999853


No 75 
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequences of the zymogens are self-cleaved at acidic pH. The mature enzymes are called pepsin A and pepsin C, respectively. The well-researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures; however, their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which 
Probab=57.20  E-value=17  Score=29.36  Aligned_cols=26  Identities=23%  Similarity=0.284  Sum_probs=20.7

Q ss_pred             EEEEEEC--CEeeEEEecCCCceeEECH
Q 040033           80 RINGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        80 ~~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      ...+.|+  .+++.++|||||+..+|..
T Consensus        12 ~~~i~vGtp~q~~~v~~DTGS~~~wv~~   39 (317)
T cd05478          12 YGTISIGTPPQDFTVIFDTGSSNLWVPS   39 (317)
T ss_pred             EEEEEeCCCCcEEEEEEeCCCccEEEec
Confidence            3455665  6889999999999999964


No 76 
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two 
Probab=56.36  E-value=16  Score=30.21  Aligned_cols=25  Identities=32%  Similarity=0.384  Sum_probs=20.1

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECH
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      +++.|+  .+++.++|||||+..+|..
T Consensus         6 ~~i~iGtP~Q~~~v~~DTGSs~lWv~~   32 (364)
T cd05473           6 IEMLIGTPPQKLNILVDTGSSNFAVAA   32 (364)
T ss_pred             EEEEecCCCceEEEEEecCCcceEEEc
Confidence            455665  5899999999999998864


No 77 
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5.  Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=56.31  E-value=18  Score=29.39  Aligned_cols=25  Identities=24%  Similarity=0.262  Sum_probs=19.7

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECH
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      +.+.|+  .+++.++|||||+..+|..
T Consensus         6 ~~i~vGtP~Q~~~v~~DTGS~~~wv~~   32 (326)
T cd06096           6 IDIFIGNPPQKQSLILDTGSSSLSFPC   32 (326)
T ss_pred             EEEEecCCCeEEEEEEeCCCCceEEec
Confidence            455555  5899999999999988754


No 78 
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with a reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco.  CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This fami
Probab=56.26  E-value=11  Score=30.09  Aligned_cols=65  Identities=20%  Similarity=0.245  Sum_probs=37.1

Q ss_pred             EEEEEC--CEeeEEEecCCCceeEECHH----HHHHcCCe--E----E--EEEECCE-EEE-EEEEEecCC-----CCcE
Q 040033           81 INGRIG--NISPIVLVDSGSTHNFISDT----FAKKVKNF--I----V--NLILQGV-YVI-VDFNLRELE-----GYDV  139 (158)
Q Consensus        81 ~~~~i~--~~~v~aLiDSGat~sfI~~~----~a~~~~~~--~----A--~i~i~g~-~~~-~~~~v~~~~-----~~dv  139 (158)
                      ..+.|+  .+++.+++||||+..+|.-.    +..+++--  .    +  .+.+++. ... ..|.+....     ..|-
T Consensus         4 ~~i~iGtP~q~~~v~~DTGSs~~Wv~c~~c~~~~i~Yg~Gs~~~G~~~~D~v~ig~~~~~~~~~Fg~~~~~~~~~~~~~G   83 (299)
T cd05472           4 VTVGLGTPARDQTVIVDTGSDLTWVQCQPCCLYQVSYGDGSYTTGDLATDTLTLGSSDVVPGFAFGCGHDNEGLFGGAAG   83 (299)
T ss_pred             EEEecCCCCcceEEEecCCCCcccccCCCCCeeeeEeCCCceEEEEEEEEEEEeCCCCccCCEEEECCccCCCccCCCCE
Confidence            345555  58999999999999988321    11111100  0    1  6667664 322 345444321     5788


Q ss_pred             EechhH
Q 040033          140 VLGTQW  145 (158)
Q Consensus       140 ILG~dw  145 (158)
                      |||+.+
T Consensus        84 ilGLg~   89 (299)
T cd05472          84 LLGLGR   89 (299)
T ss_pred             EEECCC
Confidence            999864


No 79 
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, where the enzyme is targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop atypically points directly into the S(1) pocket of the enzyme.  Proteinase A preferentially hydro
Probab=55.72  E-value=18  Score=29.26  Aligned_cols=27  Identities=19%  Similarity=0.308  Sum_probs=21.4

Q ss_pred             EEEEEEEC--CEeeEEEecCCCceeEECH
Q 040033           79 LRINGRIG--NISPIVLVDSGSTHNFISD  105 (158)
Q Consensus        79 i~~~~~i~--~~~v~aLiDSGat~sfI~~  105 (158)
                      ...++.|+  .+++.++|||||+..+|..
T Consensus        11 Y~~~i~iGtp~q~~~v~~DTGSs~~wv~~   39 (320)
T cd05488          11 YFTDITLGTPPQKFKVILDTGSSNLWVPS   39 (320)
T ss_pred             EEEEEEECCCCcEEEEEEecCCcceEEEc
Confidence            34456666  5899999999999998854


No 80 
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets an
Probab=55.57  E-value=19  Score=29.36  Aligned_cols=27  Identities=26%  Similarity=0.231  Sum_probs=21.6

Q ss_pred             EEEEEEC--CEeeEEEecCCCceeEECHH
Q 040033           80 RINGRIG--NISPIVLVDSGSTHNFISDT  106 (158)
Q Consensus        80 ~~~~~i~--~~~v~aLiDSGat~sfI~~~  106 (158)
                      ...+.|+  .+++.++|||||+..++...
T Consensus        13 ~~~i~vGtP~q~~~v~~DTGSs~~Wv~~~   41 (329)
T cd05485          13 YGVITIGTPPQSFKVVFDTGSSNLWVPSK   41 (329)
T ss_pred             EEEEEECCCCcEEEEEEcCCCccEEEecC
Confidence            4466776  58999999999999988753


No 81 
>PF14543 TAXi_N:  Xylanase inhibitor N-terminal; PDB: 3HD8_A 3VLB_A 3VLA_A 3AUP_D 1T6G_A 1T6E_X 2B42_A.
Probab=51.70  E-value=24  Score=25.88  Aligned_cols=23  Identities=26%  Similarity=0.462  Sum_probs=16.9

Q ss_pred             EEEEECC--EeeEEEecCCCceeEE
Q 040033           81 INGRIGN--ISPIVLVDSGSTHNFI  103 (158)
Q Consensus        81 ~~~~i~~--~~v~aLiDSGat~sfI  103 (158)
                      +.+.|+.  +++.++||||+...++
T Consensus         3 ~~~~iGtP~~~~~lvvDtgs~l~W~   27 (164)
T PF14543_consen    3 VSVSIGTPPQPFSLVVDTGSDLTWV   27 (164)
T ss_dssp             EEEECTCTTEEEEEEEETT-SSEEE
T ss_pred             EEEEeCCCCceEEEEEECCCCceEE
Confidence            3445543  7899999999999987


No 82 
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well-known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=51.17  E-value=22  Score=27.53  Aligned_cols=26  Identities=23%  Similarity=0.289  Sum_probs=19.6

Q ss_pred             EEEEC--CEeeEEEecCCCceeEECHHH
Q 040033           82 NGRIG--NISPIVLVDSGSTHNFISDTF  107 (158)
Q Consensus        82 ~~~i~--~~~v~aLiDSGat~sfI~~~~  107 (158)
                      .+.|+  .+++.++|||||+..+|...-
T Consensus         4 ~i~iGtp~q~~~l~~DTGS~~~wv~~~~   31 (283)
T cd05471           4 EITIGTPPQKFSVIFDTGSSLLWVPSSN   31 (283)
T ss_pred             EEEECCCCcEEEEEEeCCCCCEEEecCC
Confidence            34454  368999999999999886553


No 83 
>KOG2673 consensus Uncharacterized conserved protein, contains PSP domain [Function unknown]
Probab=49.90  E-value=11  Score=32.90  Aligned_cols=19  Identities=11%  Similarity=0.069  Sum_probs=17.2

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      --||+|+..-|--++|+++
T Consensus       129 ~~CFNC~g~~hsLrdC~rp  147 (485)
T KOG2673|consen  129 DPCFNCGGTPHSLRDCPRP  147 (485)
T ss_pred             ccccccCCCCCccccCCCc
Confidence            4589999999999999987


No 84 
>cd05489 xylanase_inhibitor_I_like TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of the potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants have developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradative glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability 
Probab=46.72  E-value=29  Score=29.03  Aligned_cols=20  Identities=5%  Similarity=-0.014  Sum_probs=16.5

Q ss_pred             EEEecCCCceeEECHHHHHH
Q 040033           91 IVLVDSGSTHNFISDTFAKK  110 (158)
Q Consensus        91 ~aLiDSGat~sfI~~~~a~~  110 (158)
                      .++||||++.+++...+.+.
T Consensus       231 g~iiDSGTs~t~lp~~~y~~  250 (362)
T cd05489         231 GVKLSTVVPYTVLRSDIYRA  250 (362)
T ss_pred             cEEEecCCceEEECHHHHHH
Confidence            48999999999988876554


No 85 
>PLN03146 aspartyl protease family protein; Provisional
Probab=46.46  E-value=23  Score=30.32  Aligned_cols=19  Identities=26%  Similarity=0.578  Sum_probs=16.3

Q ss_pred             EEEecCCCceeEECHHHHH
Q 040033           91 IVLVDSGSTHNFISDTFAK  109 (158)
Q Consensus        91 ~aLiDSGat~sfI~~~~a~  109 (158)
                      .++||||++.+++.+...+
T Consensus       309 ~~iiDSGTt~t~Lp~~~y~  327 (431)
T PLN03146        309 NIIIDSGTTLTLLPSDFYS  327 (431)
T ss_pred             cEEEeCCccceecCHHHHH
Confidence            4799999999999998543


No 86 
>PTZ00165 aspartyl protease; Provisional
Probab=44.14  E-value=34  Score=29.97  Aligned_cols=28  Identities=25%  Similarity=0.339  Sum_probs=22.2

Q ss_pred             EEEEEEECC--EeeEEEecCCCceeEECHH
Q 040033           79 LRINGRIGN--ISPIVLVDSGSTHNFISDT  106 (158)
Q Consensus        79 i~~~~~i~~--~~v~aLiDSGat~sfI~~~  106 (158)
                      ....+.|+.  +++.+++||||+..+|...
T Consensus       121 Y~~~I~IGTPpQ~f~Vv~DTGSS~lWVps~  150 (482)
T PTZ00165        121 YFGEIQVGTPPKSFVVVFDTGSSNLWIPSK  150 (482)
T ss_pred             EEEEEEeCCCCceEEEEEeCCCCCEEEEch
Confidence            345667765  8999999999999998643


No 87 
>KOG2560 consensus RNA splicing factor - Slu7p [RNA processing and modification]
Probab=43.63  E-value=5.8  Score=34.51  Aligned_cols=19  Identities=11%  Similarity=0.097  Sum_probs=17.2

Q ss_pred             cccceeecCCCCCCCccCC
Q 040033           28 RAKVCATIATKSFPRDIGE   46 (158)
Q Consensus        28 R~~~Cf~Cg~~gH~~~~C~   46 (158)
                      |.|.|-+||.-+|....|=
T Consensus       111 RKGACeNCGAmtHk~KDCm  129 (529)
T KOG2560|consen  111 RKGACENCGAMTHKVKDCM  129 (529)
T ss_pred             hhhhhhhhhhhhcchHHHh
Confidence            4599999999999999994


No 88 
>PF14541 TAXi_C:  Xylanase inhibitor C-terminal; PDB: 3AUP_D 3HD8_A 1T6G_A 1T6E_X 2B42_A 3VLB_A 3VLA_A.
Probab=42.07  E-value=22  Score=25.88  Aligned_cols=23  Identities=22%  Similarity=0.368  Sum_probs=17.3

Q ss_pred             eeEEEecCCCceeEECHHHHHHc
Q 040033           89 SPIVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~~  111 (158)
                      .-.++||||++.+++.+.+...+
T Consensus        29 ~g~~iiDSGT~~T~L~~~~y~~l   51 (161)
T PF14541_consen   29 SGGTIIDSGTTYTYLPPPVYDAL   51 (161)
T ss_dssp             TCSEEE-SSSSSEEEEHHHHHHH
T ss_pred             CCCEEEECCCCccCCcHHHHHHH
Confidence            44578999999999998865543


No 89 
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate  r
Probab=41.91  E-value=31  Score=28.02  Aligned_cols=22  Identities=18%  Similarity=0.156  Sum_probs=17.8

Q ss_pred             eEEEecCCCceeEECHHHHHHc
Q 040033           90 PIVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        90 v~aLiDSGat~sfI~~~~a~~~  111 (158)
                      ..++||||++..++....++++
T Consensus       208 ~~aiiDSGts~~~lP~~~~~~l  229 (326)
T cd05487         208 CTAVVDTGASFISGPTSSISKL  229 (326)
T ss_pred             CEEEECCCccchhCcHHHHHHH
Confidence            3689999999999998865543


No 90 
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets and flank 
Probab=41.61  E-value=19  Score=29.17  Aligned_cols=22  Identities=14%  Similarity=0.042  Sum_probs=18.8

Q ss_pred             eEEEecCCCceeEECHHHHHHc
Q 040033           90 PIVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        90 v~aLiDSGat~sfI~~~~a~~~  111 (158)
                      ..++||||++..++....+..+
T Consensus       207 ~~aiiDSGTt~~~~p~~~~~~l  228 (325)
T cd05490         207 CEAIVDTGTSLITGPVEEVRAL  228 (325)
T ss_pred             CEEEECCCCccccCCHHHHHHH
Confidence            5799999999999998877654


No 91 
>PLN03146 aspartyl protease family protein; Provisional
Probab=40.89  E-value=35  Score=29.25  Aligned_cols=26  Identities=23%  Similarity=0.377  Sum_probs=20.5

Q ss_pred             EEEEEEEC--CEeeEEEecCCCceeEEC
Q 040033           79 LRINGRIG--NISPIVLVDSGSTHNFIS  104 (158)
Q Consensus        79 i~~~~~i~--~~~v~aLiDSGat~sfI~  104 (158)
                      ..+.+.|+  .+++.+++||||+..+|.
T Consensus        85 Y~v~i~iGTPpq~~~vi~DTGS~l~Wv~  112 (431)
T PLN03146         85 YLMNISIGTPPVPILAIADTGSDLIWTQ  112 (431)
T ss_pred             EEEEEEcCCCCceEEEEECCCCCcceEc
Confidence            44566665  578999999999999884


No 92 
>KOG2044 consensus 5'-3' exonuclease HKE1/RAT1 [Replication, recombination and repair; RNA processing and modification]
Probab=38.15  E-value=11  Score=34.98  Aligned_cols=19  Identities=11%  Similarity=0.085  Sum_probs=16.6

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ..||.||+.||...+|...
T Consensus       261 ~~C~~cgq~gh~~~dc~g~  279 (931)
T KOG2044|consen  261 RRCFLCGQTGHEAKDCEGK  279 (931)
T ss_pred             ccchhhcccCCcHhhcCCc
Confidence            4499999999999999754


No 93 
>KOG0119 consensus Splicing factor 1/branch point binding protein (RRM superfamily) [RNA processing and modification]
Probab=37.12  E-value=19  Score=31.76  Aligned_cols=34  Identities=15%  Similarity=0.201  Sum_probs=26.0

Q ss_pred             CCCCCCHHHHHhh----cc---cceeecCCCCCCCccCCcc
Q 040033           15 PVRGLSRAELQER----RA---KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        15 ~~~~ls~~e~~~r----R~---~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ..+++...|+..+    |.   ..|-.||..||....||.+
T Consensus       240 ~l~~~Qlrela~lNgt~r~~d~~~c~~cg~~~H~q~~cp~r  280 (554)
T KOG0119|consen  240 DLKRLQLRELARLNGTLRDDDNRACRNCGSTGHKQYDCPGR  280 (554)
T ss_pred             cccHHHHHHHHHhCCCCCccccccccccCCCccccccCCcc
Confidence            3556666676554    33   4799999999999999987


No 94 
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, where the enzyme is targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme.  Proteinase A preferentially hydro
Probab=36.61  E-value=28  Score=28.13  Aligned_cols=67  Identities=13%  Similarity=0.282  Sum_probs=42.3

Q ss_pred             eEEEecCCCceeEECHHHHHHcC----CeE-------------E-----EEEECCEEEEEE---E----------EEec-
Q 040033           90 PIVLVDSGSTHNFISDTFAKKVK----NFI-------------V-----NLILQGVYVIVD---F----------NLRE-  133 (158)
Q Consensus        90 v~aLiDSGat~sfI~~~~a~~~~----~~~-------------A-----~i~i~g~~~~~~---~----------~v~~-  133 (158)
                      ..++||||++..++...+++.+.    ...             .     .+.++|..+.+.   +          .+.. 
T Consensus       206 ~~~ivDSGtt~~~lp~~~~~~l~~~~~~~~~~~~~~~~~C~~~~~~P~i~f~f~g~~~~i~~~~y~~~~~g~C~~~~~~~  285 (320)
T cd05488         206 TGAAIDTGTSLIALPSDLAEMLNAEIGAKKSWNGQYTVDCSKVDSLPDLTFNFDGYNFTLGPFDYTLEVSGSCISAFTGM  285 (320)
T ss_pred             CeEEEcCCcccccCCHHHHHHHHHHhCCccccCCcEEeeccccccCCCEEEEECCEEEEECHHHheecCCCeEEEEEEEC
Confidence            46899999999999998876532    111             0     455667666542   1          1111 


Q ss_pred             -C---CCCcEEechhHhhhcCCceeeec
Q 040033          134 -L---EGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       134 -~---~~~dvILG~dwL~~~~~i~idw~  157 (158)
                       +   .+...|||..||+.+-- ..|+.
T Consensus       286 ~~~~~~~~~~ilG~~fl~~~y~-vfD~~  312 (320)
T cd05488         286 DFPEPVGPLAIVGDAFLRKYYS-VYDLG  312 (320)
T ss_pred             cCCCCCCCeEEEchHHhhheEE-EEeCC
Confidence             1   12358999999987664 46654


No 95 
>KOG4584 consensus Uncharacterized conserved protein [General function prediction only]
Probab=35.72  E-value=24  Score=29.33  Aligned_cols=22  Identities=27%  Similarity=0.446  Sum_probs=18.2

Q ss_pred             CEEEEEEEEEecCCCCcEEech
Q 040033          122 GVYVIVDFNLRELEGYDVVLGT  143 (158)
Q Consensus       122 g~~~~~~~~v~~~~~~dvILG~  143 (158)
                      +.++...+.-++.+++|+|||+
T Consensus       196 ~~p~K~~lif~DNSG~DvILGi  217 (348)
T KOG4584|consen  196 GKPHKCALIFVDNSGFDVILGI  217 (348)
T ss_pred             CCCcceEEEEecCCCcceeeee
Confidence            4566777788899999999998


No 96 
>PTZ00147 plasmepsin-1; Provisional
Probab=35.59  E-value=53  Score=28.54  Aligned_cols=22  Identities=32%  Similarity=0.454  Sum_probs=18.9

Q ss_pred             eeEEEecCCCceeEECHHHHHH
Q 040033           89 SPIVLVDSGSTHNFISDTFAKK  110 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~  110 (158)
                      ...++||||++..++....+..
T Consensus       332 ~~~aIiDSGTsli~lP~~~~~a  353 (453)
T PTZ00147        332 KANVIVDSGTSVITVPTEFLNK  353 (453)
T ss_pred             ceeEEECCCCchhcCCHHHHHH
Confidence            4679999999999999987764


No 97 
>PTZ00165 aspartyl protease; Provisional
Probab=34.78  E-value=53  Score=28.75  Aligned_cols=23  Identities=13%  Similarity=0.085  Sum_probs=19.2

Q ss_pred             eeEEEecCCCceeEECHHHHHHc
Q 040033           89 SPIVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~~  111 (158)
                      ...|+||||++..++...+++.+
T Consensus       327 ~~~aIiDTGTSli~lP~~~~~~i  349 (482)
T PTZ00165        327 KCKAAIDTGSSLITGPSSVINPL  349 (482)
T ss_pred             ceEEEEcCCCccEeCCHHHHHHH
Confidence            35689999999999999886654


No 98 
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=34.64  E-value=30  Score=27.32  Aligned_cols=69  Identities=17%  Similarity=0.310  Sum_probs=44.3

Q ss_pred             EeeEEEecCCCceeEECHHHHHHc----CCeE-----------------E-EEEECCEEEEEE-----------------
Q 040033           88 ISPIVLVDSGSTHNFISDTFAKKV----KNFI-----------------V-NLILQGVYVIVD-----------------  128 (158)
Q Consensus        88 ~~v~aLiDSGat~sfI~~~~a~~~----~~~~-----------------A-~i~i~g~~~~~~-----------------  128 (158)
                      ....++||||++..++...+...+    +...                 + .+.++|.++.+.                 
T Consensus       177 ~~~~~iiDSGt~~~~lP~~~~~~l~~~~~~~~~~~~~~~~~~C~~~~~p~i~f~f~g~~~~i~~~~~~~~~~~~~~~~~~  256 (295)
T cd05474         177 KNLPALLDSGTTLTYLPSDIVDAIAKQLGATYDSDEGLYVVDCDAKDDGSLTFNFGGATISVPLSDLVLPASTDDGGDGA  256 (295)
T ss_pred             CCccEEECCCCccEeCCHHHHHHHHHHhCCEEcCCCcEEEEeCCCCCCCEEEEEECCeEEEEEHHHhEeccccCCCCCCC
Confidence            345899999999999999987764    2211                 0 345566555432                 


Q ss_pred             --EEEecCCCCcEEechhHhhhcCCceeeec
Q 040033          129 --FNLRELEGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       129 --~~v~~~~~~dvILG~dwL~~~~~i~idw~  157 (158)
                        +.+.+......|||..||+.+-- ..|+.
T Consensus       257 C~~~i~~~~~~~~iLG~~fl~~~y~-vfD~~  286 (295)
T cd05474         257 CYLGIQPSTSDYNILGDTFLRSAYV-VYDLD  286 (295)
T ss_pred             eEEEEEeCCCCcEEeChHHhhcEEE-EEECC
Confidence              12222222468999999998764 46654


No 99 
>TIGR02854 spore_II_GA sigma-E processing peptidase SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria.
Probab=33.87  E-value=59  Score=26.40  Aligned_cols=35  Identities=9%  Similarity=0.087  Sum_probs=24.1

Q ss_pred             ceEEEEEEECCE--eeEEEecCCCc---------eeEECHHHHHHc
Q 040033           77 ETLRINGRIGNI--SPIVLVDSGST---------HNFISDTFAKKV  111 (158)
Q Consensus        77 ~~i~~~~~i~~~--~v~aLiDSGat---------~sfI~~~~a~~~  111 (158)
                      ....+.+.++|+  .+++|+|||..         +..++.+.++++
T Consensus       157 ~~~~v~i~~~g~~~~~~alvDTGN~L~DPlT~~PV~Ive~~~~~~~  202 (288)
T TIGR02854       157 QIYELEICLDGKKVTIKGFLDTGNQLRDPLTKLPVIVVEYDSLKSI  202 (288)
T ss_pred             eEEEEEEEECCEEEEEEEEEecCCcccCCCCCCCEEEEEHHHhhhh
Confidence            344566677776  57899999964         567776666554


No 100
>PF03419 Peptidase_U4:  Sporulation factor SpoIIGA  This family belongs to family U4 of the peptidase classification.;  InterPro: IPR005081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.   The peptidase families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported. 
This group of peptidases belong to the MEROPS peptidase family U4 (SpoIIGA peptidase family, clan U-).  Sporulation in bacteria such as Bacillus subtilis involves the formation of a polar septum, which divides the sporangium into a mother cell and a forespore. The sigma E factor, which is encoded within the spoIIG operon, is a cell-specific regulatory protein that directs gene transcription in the mother cell. Sigma E is synthesised as an inactive proprotein pro-sigma E, which is converted to the mature factor by the putative processing enzyme SpoIIGA []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis, 0030436 asexual sporulation
Probab=33.37  E-value=65  Score=26.03  Aligned_cols=34  Identities=18%  Similarity=0.198  Sum_probs=25.2

Q ss_pred             eEEEEEEECCE--eeEEEecCCCc---------eeEECHHHHHHc
Q 040033           78 TLRINGRIGNI--SPIVLVDSGST---------HNFISDTFAKKV  111 (158)
Q Consensus        78 ~i~~~~~i~~~--~v~aLiDSGat---------~sfI~~~~a~~~  111 (158)
                      ...+++.++++  .+++|+|||..         +.+++.+.++++
T Consensus       157 ~~~v~i~~~~~~~~~~allDTGN~L~DPitg~PV~Vve~~~~~~~  201 (293)
T PF03419_consen  157 LYPVTIEIGGKKIELKALLDTGNQLRDPITGRPVIVVEYEALEKL  201 (293)
T ss_pred             EEEEEEEECCEEEEEEEEEECCCcccCCCCCCcEEEEEHHHHHhh
Confidence            34566677776  56899999874         568888887777


No 101
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=31.69  E-value=31  Score=29.93  Aligned_cols=23  Identities=35%  Similarity=0.426  Sum_probs=18.8

Q ss_pred             eeEEEecCCCceeEECHHHHHHc
Q 040033           89 SPIVLVDSGSTHNFISDTFAKKV  111 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~~  111 (158)
                      ...|+||||++..++....++++
T Consensus       331 ~~~aIlDSGTSli~lP~~~~~~i  353 (450)
T PTZ00013        331 KANVIVDSGTTTITAPSEFLNKF  353 (450)
T ss_pred             ccceEECCCCccccCCHHHHHHH
Confidence            35699999999999998876543


No 102
>COG1644 RPB10 DNA-directed RNA polymerase, subunit N (RpoN/RPB10) [Transcription]
Probab=30.25  E-value=24  Score=22.21  Aligned_cols=13  Identities=8%  Similarity=0.077  Sum_probs=9.1

Q ss_pred             cceeecCCC-CCCC
Q 040033           30 KVCATIATK-SFPR   42 (158)
Q Consensus        30 ~~Cf~Cg~~-gH~~   42 (158)
                      ..||.||+. ||.-
T Consensus         5 iRCFsCGkvi~~~w   18 (63)
T COG1644           5 VRCFSCGKVIGHKW   18 (63)
T ss_pred             eEeecCCCCHHHHH
Confidence            359999986 5543


No 103
>KOG3497 consensus DNA-directed RNA polymerase, subunit RPB10 [Transcription]
Probab=28.41  E-value=26  Score=21.99  Aligned_cols=8  Identities=25%  Similarity=0.314  Sum_probs=6.5

Q ss_pred             ceeecCCC
Q 040033           31 VCATIATK   38 (158)
Q Consensus        31 ~Cf~Cg~~   38 (158)
                      .||.||+-
T Consensus         6 RCFtCGKv   13 (69)
T KOG3497|consen    6 RCFTCGKV   13 (69)
T ss_pred             Eeeecccc
Confidence            59999874


No 104
>PF00026 Asp:  Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.;  InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).  More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions []. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site []. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. 
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event []. This family does not include the retroviral nor retrotransposon aspartic proteases which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=28.10  E-value=54  Score=25.88  Aligned_cols=68  Identities=15%  Similarity=0.377  Sum_probs=45.1

Q ss_pred             eeEEEecCCCceeEECHHHHHHc----CCeE-----------------EEEEECCEEEEE-----------------EEE
Q 040033           89 SPIVLVDSGSTHNFISDTFAKKV----KNFI-----------------VNLILQGVYVIV-----------------DFN  130 (158)
Q Consensus        89 ~v~aLiDSGat~sfI~~~~a~~~----~~~~-----------------A~i~i~g~~~~~-----------------~~~  130 (158)
                      ...++||||++..++...+...+    +...                 -.+.+++.++.+                 -+.
T Consensus       199 ~~~~~~Dtgt~~i~lp~~~~~~i~~~l~~~~~~~~~~~~c~~~~~~p~l~f~~~~~~~~i~~~~~~~~~~~~~~~~C~~~  278 (317)
T PF00026_consen  199 GQQAILDTGTSYIYLPRSIFDAIIKALGGSYSDGVYSVPCNSTDSLPDLTFTFGGVTFTIPPSDYIFKIEDGNGGYCYLG  278 (317)
T ss_dssp             EEEEEEETTBSSEEEEHHHHHHHHHHHTTEEECSEEEEETTGGGGSEEEEEEETTEEEEEEHHHHEEEESSTTSSEEEES
T ss_pred             ceeeecccccccccccchhhHHHHhhhcccccceeEEEecccccccceEEEeeCCEEEEecchHhcccccccccceeEee
Confidence            35799999999999999876654    3221                 045556655542                 111


Q ss_pred             Eec----CCCCcEEechhHhhhcCCceeeec
Q 040033          131 LRE----LEGYDVVLGTQWLRTLEPILWDFA  157 (158)
Q Consensus       131 v~~----~~~~dvILG~dwL~~~~~i~idw~  157 (158)
                      +.+    ......|||+.||+++-- ..|+.
T Consensus       279 i~~~~~~~~~~~~iLG~~fl~~~y~-vfD~~  308 (317)
T PF00026_consen  279 IQPMDSSDDSDDWILGSPFLRNYYV-VFDYE  308 (317)
T ss_dssp             EEEESSTTSSSEEEEEHHHHTTEEE-EEETT
T ss_pred             eecccccccCCceEecHHHhhceEE-EEeCC
Confidence            222    346789999999998764 56654


No 105
>PF13395 HNH_4:  HNH endonuclease
Probab=26.29  E-value=24  Score=21.10  Aligned_cols=9  Identities=11%  Similarity=0.158  Sum_probs=6.3

Q ss_pred             eeecCCCCC
Q 040033           32 CATIATKSF   40 (158)
Q Consensus        32 Cf~Cg~~gH   40 (158)
                      |||||++-.
T Consensus         1 C~Y~g~~i~    9 (54)
T PF13395_consen    1 CPYCGKPIS    9 (54)
T ss_pred             CCCCCCCCC
Confidence            888887643


No 106
>smart00400 ZnF_CHCC zinc finger.
Probab=25.65  E-value=34  Score=20.34  Aligned_cols=11  Identities=9%  Similarity=-0.028  Sum_probs=8.4

Q ss_pred             cceeecCCCCC
Q 040033           30 KVCATIATKSF   40 (158)
Q Consensus        30 ~~Cf~Cg~~gH   40 (158)
                      --||.||+.|-
T Consensus        24 ~~Cf~cg~gGd   34 (55)
T smart00400       24 FHCFGCGAGGN   34 (55)
T ss_pred             EEEeCCCCCCC
Confidence            56999987764


No 107
>PF11880 DUF3400:  Domain of unknown function (DUF3400);  InterPro: IPR021817  This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 50 amino acids in length. This domain is found associated with PF02754 from PFAM, PF02913 from PFAM, PF01565 from PFAM. 
Probab=25.57  E-value=60  Score=19.01  Aligned_cols=23  Identities=22%  Similarity=0.540  Sum_probs=15.4

Q ss_pred             EEEEEEEecCCCCcEEechhHhhhc
Q 040033          125 VIVDFNLRELEGYDVVLGTQWLRTL  149 (158)
Q Consensus       125 ~~~~~~v~~~~~~dvILG~dwL~~~  149 (158)
                      ...|.+|+++-  .-+||-+|+..+
T Consensus         9 ~~aDYIVVEmA--~~lLGe~W~~~~   31 (45)
T PF11880_consen    9 LEADYIVVEMA--RHLLGENWQQDY   31 (45)
T ss_pred             CccceehHHHH--HHHhhhhHHHHH
Confidence            34566777663  447999998764


No 108
>PF13771 zf-HC5HC2H:  PHD-like zinc-binding domain
Probab=25.01  E-value=49  Score=21.42  Aligned_cols=19  Identities=11%  Similarity=-0.099  Sum_probs=13.7

Q ss_pred             cceeecCCCCCCCccCCcc
Q 040033           30 KVCATIATKSFPRDIGERS   48 (158)
Q Consensus        30 ~~Cf~Cg~~gH~~~~C~~~   48 (158)
                      ..|+.|++++--.-+|-.+
T Consensus        37 ~~C~~C~~~~Ga~i~C~~~   55 (90)
T PF13771_consen   37 LKCSICKKKGGACIGCSHP   55 (90)
T ss_pred             CCCcCCCCCCCeEEEEeCC
Confidence            8899999993355556554


No 109
>PHA00689 hypothetical protein
Probab=24.87  E-value=18  Score=21.81  Aligned_cols=18  Identities=28%  Similarity=0.350  Sum_probs=12.1

Q ss_pred             HHhhcccceeecCCCCCC
Q 040033           24 LQERRAKVCATIATKSFP   41 (158)
Q Consensus        24 ~~~rR~~~Cf~Cg~~gH~   41 (158)
                      -++-|...|-+||+.|-+
T Consensus        12 dqepravtckrcgktglr   29 (62)
T PHA00689         12 DQEPRAVTCKRCGKTGLR   29 (62)
T ss_pred             ccCcceeehhhccccCce
Confidence            345566779899887643


No 110
>PF13717 zinc_ribbon_4:  zinc-ribbon domain
Probab=24.56  E-value=24  Score=19.41  Aligned_cols=23  Identities=9%  Similarity=0.251  Sum_probs=13.9

Q ss_pred             CCCHHHHHhhcc-cceeecCCCCC
Q 040033           18 GLSRAELQERRA-KVCATIATKSF   40 (158)
Q Consensus        18 ~ls~~e~~~rR~-~~Cf~Cg~~gH   40 (158)
                      .+.++++..... ..|-+|+..|+
T Consensus        13 ~i~d~~ip~~g~~v~C~~C~~~f~   36 (36)
T PF13717_consen   13 EIDDEKIPPKGRKVRCSKCGHVFF   36 (36)
T ss_pred             eCCHHHCCCCCcEEECCCCCCEeC
Confidence            344444444444 77888887664


No 111
>PLN00032 DNA-directed RNA polymerase; Provisional
Probab=23.79  E-value=31  Score=22.26  Aligned_cols=9  Identities=22%  Similarity=0.217  Sum_probs=7.1

Q ss_pred             cceeecCCC
Q 040033           30 KVCATIATK   38 (158)
Q Consensus        30 ~~Cf~Cg~~   38 (158)
                      ..||.||+.
T Consensus         5 VRCFTCGkv   13 (71)
T PLN00032          5 VRCFTCGKV   13 (71)
T ss_pred             eeecCCCCC
Confidence            359999975


No 112
>PRK04016 DNA-directed RNA polymerase subunit N; Provisional
Probab=23.11  E-value=31  Score=21.68  Aligned_cols=9  Identities=22%  Similarity=0.217  Sum_probs=7.1

Q ss_pred             cceeecCCC
Q 040033           30 KVCATIATK   38 (158)
Q Consensus        30 ~~Cf~Cg~~   38 (158)
                      ..||.||+.
T Consensus         5 vRCFTCGkv   13 (62)
T PRK04016          5 VRCFTCGKV   13 (62)
T ss_pred             eEecCCCCC
Confidence            359999975


No 113
>PF04746 DUF575:  Protein of unknown function (DUF575);  InterPro: IPR006835 This represents a conserved region found in a number of Chlamydophila pneumoniae proteins.
Probab=22.80  E-value=45  Score=22.71  Aligned_cols=12  Identities=33%  Similarity=0.913  Sum_probs=9.7

Q ss_pred             cEEechhHhhhc
Q 040033          138 DVVLGTQWLRTL  149 (158)
Q Consensus       138 dvILG~dwL~~~  149 (158)
                      .++.|++||-..
T Consensus        28 hiv~GieWLvS~   39 (101)
T PF04746_consen   28 HIVMGIEWLVSR   39 (101)
T ss_pred             eEEeehHHHHHH
Confidence            578999999754


No 114
>KOG1339 consensus Aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=20.77  E-value=97  Score=26.06  Aligned_cols=22  Identities=23%  Similarity=0.203  Sum_probs=18.1

Q ss_pred             CEeeEEEecCCCceeEECHHHH
Q 040033           87 NISPIVLVDSGSTHNFISDTFA  108 (158)
Q Consensus        87 ~~~v~aLiDSGat~sfI~~~~a  108 (158)
                      .+++.+++||||+..+|.-.-.
T Consensus        57 pq~f~v~~DTGS~~lWV~c~~c   78 (398)
T KOG1339|consen   57 PQSFTVVLDTGSDLLWVPCAPC   78 (398)
T ss_pred             CeeeEEEEeCCCCceeeccccc
Confidence            5789999999999998877443


Done!