Query         047816
Match_columns 620
No_of_seqs    364 out of 1457
Neff          8.4 
Searched_HMMs 46136
Date          Fri Mar 29 03:27:31 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/047816.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/047816hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PLN03146 aspartyl protease fam 100.0 1.4E-52 3.1E-57  451.7  37.8  333   81-430    80-430 (431)
  2 PTZ00165 aspartyl protease; Pr 100.0 9.7E-53 2.1E-57  454.9  36.6  320   73-437   110-458 (482)
  3 cd05478 pepsin_A Pepsin A, asp 100.0   1E-51 2.2E-56  430.9  30.8  298   83-425     8-317 (317)
  4 cd05490 Cathepsin_D2 Cathepsin 100.0 2.1E-51 4.6E-56  430.1  31.1  300   83-425     4-325 (325)
  5 KOG1339 Aspartyl protease [Pos 100.0 1.7E-50 3.6E-55  433.4  35.8  340   79-429    40-397 (398)
  6 cd05486 Cathespin_E Cathepsin  100.0 5.4E-51 1.2E-55  425.3  29.0  296   86-425     1-316 (316)
  7 cd05487 renin_like Renin stimu 100.0 1.4E-50   3E-55  423.9  31.5  301   83-426     6-326 (326)
  8 PTZ00147 plasmepsin-1; Provisi 100.0 2.8E-50 6.1E-55  432.5  33.3  310   70-427   126-450 (453)
  9 cd06098 phytepsin Phytepsin, a 100.0 2.7E-50 5.9E-55  419.9  31.7  290   82-425     7-317 (317)
 10 cd06096 Plasmepsin_5 Plasmepsi 100.0 3.6E-50 7.8E-55  420.6  31.0  296   84-429     2-326 (326)
 11 PTZ00013 plasmepsin 4 (PM4); P 100.0 8.2E-50 1.8E-54  428.0  33.5  309   69-427   124-449 (450)
 12 cd05477 gastricsin Gastricsins 100.0 8.1E-50 1.7E-54  416.9  32.2  297   84-426     2-318 (318)
 13 cd05488 Proteinase_A_fungi Fun 100.0   5E-50 1.1E-54  418.6  30.5  295   83-425     8-320 (320)
 14 cd05485 Cathepsin_D_like Cathe 100.0   6E-50 1.3E-54  419.3  30.8  299   83-425     9-329 (329)
 15 cd05472 cnd41_like Chloroplast 100.0 4.4E-48 9.4E-53  400.4  34.1  293   85-428     1-299 (299)
 16 cd05473 beta_secretase_like Be 100.0 1.6E-47 3.4E-52  406.7  31.8  321   85-432     3-351 (364)
 17 PF00026 Asp:  Eukaryotic aspar 100.0 5.1E-47 1.1E-51  395.7  21.0  302   85-426     1-317 (317)
 18 cd06097 Aspergillopepsin_like  100.0 2.8E-46   6E-51  382.8  25.3  265   86-425     1-278 (278)
 19 cd05476 pepsin_A_like_plant Ch 100.0 1.4E-45 2.9E-50  375.0  29.3  255   85-428     1-265 (265)
 20 cd05475 nucellin_like Nucellin 100.0 4.2E-45 9.1E-50  372.7  30.4  261   84-428     1-273 (273)
 21 cd05474 SAP_like SAPs, pepsin- 100.0 3.2E-45   7E-50  378.5  28.7  272   85-426     2-295 (295)
 22 cd05489 xylanase_inhibitor_I_l 100.0 1.3E-43 2.9E-48  373.9  32.6  318   92-426     2-361 (362)
 23 cd05471 pepsin_like Pepsin-lik 100.0   5E-42 1.1E-46  352.0  28.9  269   86-425     1-283 (283)
 24 PF14543 TAXi_N:  Xylanase inhi  99.9 3.4E-25 7.4E-30  207.8  14.8  152   86-250     1-164 (164)
 25 PF14541 TAXi_C:  Xylanase inhi  99.9 1.3E-21 2.7E-26  183.4  14.9  155  269-425     1-161 (161)
 26 cd05470 pepsin_retropepsin_lik  99.9 4.7E-21   1E-25  167.5  12.1  107   88-211     1-109 (109)
 27 cd05483 retropepsin_like_bacte  97.9 3.1E-05 6.6E-10   65.3   6.9   93   85-213     2-94  (96)
 28 PF01102 Glycophorin_A:  Glycop  96.6  0.0023   5E-08   55.9   3.9   37  578-615    59-96  (122)
 29 TIGR02281 clan_AA_DTGA clan AA  96.5   0.012 2.6E-07   52.1   8.0   94   83-212     9-102 (121)
 30 PF13650 Asp_protease_2:  Aspar  95.9   0.041 8.9E-07   45.3   7.9   89   88-212     1-89  (90)
 31 PTZ00382 Variant-specific surf  95.4  0.0027 5.8E-08   53.5  -1.0   36  574-610    57-94  (96)
 32 PF01034 Syndecan:  Syndecan do  95.2  0.0058 1.2E-07   46.3   0.1   36  583-619    10-45  (64)
 33 cd05479 RP_DDI RP_DDI; retrope  94.2    0.18 3.9E-06   44.8   7.4   27  397-423    98-124 (124)
 34 cd05479 RP_DDI RP_DDI; retrope  94.2    0.19   4E-06   44.7   7.3   90   84-212    15-106 (124)
 35 PF04478 Mid2:  Mid2 like cell   93.2   0.011 2.3E-07   53.3  -2.3   30  583-612    50-79  (154)
 36 PF03302 VSP:  Giardia variant-  92.6   0.045 9.7E-07   58.8   0.9   37  574-611   358-396 (397)
 37 PF01299 Lamp:  Lysosome-associ  91.5    0.14 3.1E-06   53.0   3.1   34  583-619   271-304 (306)
 38 TIGR01478 STEVOR variant surfa  90.5    0.31 6.6E-06   48.6   4.1   33  586-618   262-294 (295)
 39 PF02009 Rifin_STEVOR:  Rifin/s  89.2    0.34 7.3E-06   49.6   3.4   31  581-614   255-285 (299)
 40 PF05454 DAG1:  Dystroglycan (D  88.8    0.13 2.7E-06   52.2   0.0   91  478-571     3-102 (290)
 41 PF08693 SKG6:  Transmembrane a  88.6   0.041 8.9E-07   37.8  -2.4    8  605-612    32-39  (40)
 42 PF02439 Adeno_E3_CR2:  Adenovi  88.4    0.96 2.1E-05   30.6   3.9   29  585-613     6-34  (38)
 43 cd05484 retropepsin_like_LTR_2  88.0    0.55 1.2E-05   39.0   3.4   28   86-115     1-28  (91)
 44 cd06095 RP_RTVL_H_like Retrope  87.6     2.7 5.9E-05   34.5   7.3   27   89-117     2-28  (86)
 45 PTZ00370 STEVOR; Provisional    87.5    0.42 9.1E-06   47.7   2.7   26  586-611   258-283 (296)
 46 TIGR02281 clan_AA_DTGA clan AA  86.9     7.5 0.00016   34.2  10.1   37  266-316     8-44  (121)
 47 PHA03286 envelope glycoprotein  85.8     1.4   3E-05   46.7   5.5   76  537-616   337-424 (492)
 48 TIGR03698 clan_AA_DTGF clan AA  85.5     3.2 6.9E-05   35.7   6.9   24  398-421    84-107 (107)
 49 PF08284 RVP_2:  Retroviral asp  84.6     6.1 0.00013   35.6   8.6   28  398-425   104-131 (135)
 50 COG3577 Predicted aspartyl pro  83.8       3 6.5E-05   39.8   6.3   79   73-180    95-173 (215)
 51 PF05808 Podoplanin:  Podoplani  83.7    0.34 7.4E-06   44.2   0.0   35  574-610   120-155 (162)
 52 PF13975 gag-asp_proteas:  gag-  82.7       2 4.4E-05   34.0   4.1   36   83-120     6-41  (72)
 53 cd05484 retropepsin_like_LTR_2  81.5     7.3 0.00016   32.2   7.3   30  276-316     4-33  (91)
 54 PTZ00046 rifin; Provisional     80.5     1.7 3.8E-05   45.2   3.7   30  586-615   316-345 (358)
 55 TIGR01477 RIFIN variant surfac  79.9     1.8   4E-05   44.8   3.7   30  586-615   311-340 (353)
 56 PF15176 LRR19-TM:  Leucine-ric  76.2     4.9 0.00011   33.6   4.5   39  573-612     8-47  (102)
 57 PF05393 Hum_adeno_E3A:  Human   74.9     3.7 8.1E-05   33.3   3.3   25  596-620    43-68  (94)
 58 PF00077 RVP:  Retroviral aspar  74.6     4.1 8.9E-05   34.2   3.9   26   88-115     8-33  (100)
 59 PF14575 EphA2_TM:  Ephrin type  73.5     3.4 7.5E-05   33.0   2.9   27  584-612     2-28  (75)
 60 PF12768 Rax2:  Cortical protei  72.0     2.3   5E-05   43.4   1.9   36  577-612   221-259 (281)
 61 PF06697 DUF1191:  Protein of u  70.4     1.1 2.3E-05   45.1  -0.8   29  583-613   215-243 (278)
 62 PF05568 ASFV_J13L:  African sw  70.0       7 0.00015   34.8   4.2   40  573-613    20-59  (189)
 63 PF13650 Asp_protease_2:  Aspar  69.3     5.9 0.00013   32.1   3.6   29  277-316     3-31  (90)
 64 PF11925 DUF3443:  Protein of u  68.8      12 0.00026   39.1   6.3  107   88-214    26-149 (370)
 65 PF12191 stn_TNFRSF12A:  Tumour  67.7     1.6 3.5E-05   38.0  -0.2   17  599-615    92-108 (129)
 66 PF06024 DUF912:  Nucleopolyhed  67.6     4.1 8.9E-05   34.7   2.3   31  583-614    63-93  (101)
 67 PF03229 Alpha_GJ:  Alphavirus   67.4     5.3 0.00012   34.2   2.8   27  584-612    85-111 (126)
 68 PF05545 FixQ:  Cbb3-type cytoc  67.0     6.6 0.00014   28.5   2.9   23  592-614    15-37  (49)
 69 PF15102 TMEM154:  TMEM154 prot  66.5     2.4 5.2E-05   38.2   0.6    6  585-590    59-64  (146)
 70 PF12384 Peptidase_A2B:  Ty3 tr  65.7      19 0.00042   33.2   6.2   22  295-316    46-67  (177)
 71 PF13975 gag-asp_proteas:  gag-  65.6     8.9 0.00019   30.2   3.7   30  276-316    12-41  (72)
 72 PHA03265 envelope glycoprotein  65.1     2.1 4.4E-05   43.9  -0.0   29  583-612   348-376 (402)
 73 PF15065 NCU-G1:  Lysosomal tra  63.6     7.3 0.00016   40.9   3.6   49  565-613   297-349 (350)
 74 cd05482 HIV_retropepsin_like R  62.6     9.6 0.00021   31.5   3.5   25   89-115     2-26  (87)
 75 PF02480 Herpes_gE:  Alphaherpe  62.5     2.5 5.4E-05   46.0   0.0   23  358-380   170-192 (439)
 76 PF02009 Rifin_STEVOR:  Rifin/s  61.5     6.4 0.00014   40.4   2.7   24  595-618   269-293 (299)
 77 PF07213 DAP10:  DAP10 membrane  57.9      16 0.00035   29.3   3.8   19  579-598    30-49  (79)
 78 cd05483 retropepsin_like_bacte  57.8      15 0.00033   29.9   4.1   30  276-316     6-35  (96)
 79 PF06365 CD34_antigen:  CD34/Po  56.6     8.2 0.00018   37.1   2.4   30  583-613   101-130 (202)
 80 TIGR01167 LPXTG_anchor LPXTG-m  56.0      13 0.00028   24.4   2.6   10  603-612    24-33  (34)
 81 PF12877 DUF3827:  Domain of un  55.9      31 0.00067   38.7   6.9   81  523-611   207-297 (684)
 82 cd06094 RP_Saci_like RP_Saci_l  55.0      58  0.0013   27.0   6.7   21  294-314     9-29  (89)
 83 PF13703 PepSY_TM_2:  PepSY-ass  54.9      19 0.00041   29.6   4.0   20  593-612    24-43  (88)
 84 cd06095 RP_RTVL_H_like Retrope  53.4      16 0.00035   29.8   3.4   29  277-316     3-31  (86)
 85 PF02160 Peptidase_A3:  Caulifl  53.3      17 0.00037   34.9   3.9   28  397-425    90-117 (201)
 86 PF01034 Syndecan:  Syndecan do  53.0     5.1 0.00011   30.7   0.3   34  577-612     7-41  (64)
 87 PF11014 DUF2852:  Protein of u  48.0      29 0.00063   30.1   4.1   31  580-611     7-38  (115)
 88 PF02529 PetG:  Cytochrome B6-F  47.9      42 0.00091   22.6   3.9   26  583-609     5-30  (37)
 89 TIGR01478 STEVOR variant surfa  47.6      16 0.00036   36.6   2.9   29  590-619   262-291 (295)
 90 KOG4818 Lysosomal-associated m  47.0      15 0.00033   38.1   2.7   29  584-613   328-356 (362)
 91 PF15099 PIRT:  Phosphoinositid  46.6      11 0.00023   33.1   1.3   32  583-615    81-113 (129)
 92 PF14986 DUF4514:  Domain of un  46.5      19 0.00042   26.3   2.3   28  583-612    23-50  (61)
 93 PTZ00370 STEVOR; Provisional    45.8      18 0.00039   36.5   2.9   26  590-615   258-284 (296)
 94 PF11353 DUF3153:  Protein of u  45.5      18 0.00038   35.2   2.9   46  566-613   162-208 (209)
 95 PTZ00382 Variant-specific surf  45.2     3.1 6.8E-05   35.0  -2.1   33  583-615    63-95  (96)
 96 PF13268 DUF4059:  Protein of u  43.5      25 0.00054   27.5   2.7   23  595-617    17-39  (72)
 97 PF08374 Protocadherin:  Protoc  42.8      11 0.00024   36.2   1.0   26  583-611    39-64  (221)
 98 KOG3540 Beta amyloid precursor  42.3      30 0.00064   37.3   4.0   57  554-613   518-576 (615)
 99 PF00077 RVP:  Retroviral aspar  42.2      20 0.00042   30.0   2.3   27  276-313     9-35  (100)
100 PF14575 EphA2_TM:  Ephrin type  41.2      32  0.0007   27.5   3.2   23  593-615     6-28  (75)
101 PF12384 Peptidase_A2B:  Ty3 tr  40.7      36 0.00077   31.5   3.7   29   87-115    34-62  (177)
102 PF09668 Asp_protease:  Asparty  38.9      16 0.00034   32.4   1.2   36   85-122    24-59  (124)
103 TIGR03867 MprA_tail MprA prote  38.8      42  0.0009   21.0   2.6   19  593-611     8-26  (27)
104 PF13908 Shisa:  Wnt and FGF in  38.0      16 0.00035   34.5   1.2   12  579-590    75-87  (179)
105 PRK09459 pspG phage shock prot  38.0      29 0.00064   27.4   2.4   21  599-619    53-73  (76)
106 PF05084 GRA6:  Granule antigen  37.8      36 0.00079   31.0   3.3   13  601-613   164-176 (215)
107 TIGR03370 PEPCTERM_Roseo varia  36.9      41  0.0009   21.0   2.4   17  597-613     9-25  (26)
108 PHA03283 envelope glycoprotein  36.7      34 0.00074   37.4   3.5   25  595-619   410-434 (542)
109 PF01102 Glycophorin_A:  Glycop  36.1      46 0.00099   29.4   3.6   30  583-614    69-98  (122)
110 CHL00008 petG cytochrome b6/f   35.5      73  0.0016   21.4   3.5   25  583-608     5-29  (37)
111 PHA03281 envelope glycoprotein  35.5      63  0.0014   35.5   5.2   20  537-556   505-524 (642)
112 COG3577 Predicted aspartyl pro  35.0      71  0.0015   30.8   4.9   36  266-315   102-137 (215)
113 TIGR03778 VPDSG_CTERM VPDSG-CT  34.9      46 0.00099   20.7   2.4   17  595-611     8-24  (26)
114 PF15176 LRR19-TM:  Leucine-ric  34.5      41  0.0009   28.3   2.9   36  583-619    15-51  (102)
115 PRK00665 petG cytochrome b6-f   33.7      78  0.0017   21.3   3.4   25  583-608     5-29  (37)
116 cd05481 retropepsin_like_LTR_1  33.2      47   0.001   27.7   3.1   21  296-316    12-32  (93)
117 PF10577 UPF0560:  Uncharacteri  33.1      51  0.0011   38.1   4.3   33  580-612   270-302 (807)
118 COG4736 CcoQ Cbb3-type cytochr  32.9      50  0.0011   25.2   2.9   23  589-612    13-35  (60)
119 cd01324 cbb3_Oxidase_CcoQ Cyto  32.4      53  0.0012   23.8   2.9   22  592-613    16-37  (48)
120 COG5550 Predicted aspartyl pro  32.3      32  0.0007   30.2   2.0   20  297-316    29-49  (125)
121 PRK10525 cytochrome o ubiquino  31.4      42 0.00091   34.8   3.1   29  592-620    51-80  (315)
122 TIGR02595 PEP_exosort PEP-CTER  31.1      56  0.0012   20.3   2.4    7  606-612    17-23  (26)
123 PF14654 Epiglycanin_C:  Mucin,  30.6      87  0.0019   26.2   4.1   26  583-609    19-44  (106)
124 PF14828 Amnionless:  Amnionles  30.5 1.7E+02  0.0037   32.0   7.6   61  471-532   231-291 (437)
125 PF09668 Asp_protease:  Asparty  29.8      56  0.0012   28.9   3.2   29  276-315    28-56  (124)
126 PF14979 TMEM52:  Transmembrane  29.8 1.1E+02  0.0023   27.7   4.9   36  578-613    15-51  (154)
127 PF02038 ATP1G1_PLM_MAT8:  ATP1  29.8      75  0.0016   23.2   3.2   15  583-598    15-29  (50)
128 PF10661 EssA:  WXG100 protein   29.7      60  0.0013   29.6   3.4   16  596-611   128-143 (145)
129 TIGR01433 CyoA cytochrome o ub  29.7      43 0.00094   33.0   2.7   17  597-613    44-60  (226)
130 PF06679 DUF1180:  Protein of u  29.0      57  0.0012   30.3   3.2    7  612-618   123-129 (163)
131 TIGR03698 clan_AA_DTGF clan AA  29.0 1.6E+02  0.0034   25.2   5.8   64   88-179     2-70  (107)
132 PF14316 DUF4381:  Domain of un  29.0      50  0.0011   30.1   2.8   11  602-612    36-46  (146)
133 PF05337 CSF-1:  Macrophage col  28.9      19  0.0004   36.1   0.0   20  596-615   236-255 (285)
134 PF10873 DUF2668:  Protein of u  28.9      33 0.00071   30.8   1.5   12  579-590    57-69  (155)
135 PF04689 S1FA:  DNA binding pro  28.6      45 0.00096   25.5   1.9   36  576-612     6-42  (69)
136 TIGR03501 gamma_C_targ gammapr  28.3      62  0.0013   20.2   2.2   13  598-610     9-21  (26)
137 PF05283 MGC-24:  Multi-glycosy  27.9      66  0.0014   30.6   3.5   17  592-608   166-182 (186)
138 PF06040 Adeno_E3:  Adenovirus   27.8      62  0.0014   27.8   2.9   23  562-598    82-104 (127)
139 PRK14748 kdpF potassium-transp  27.5 1.2E+02  0.0026   19.3   3.3   21  586-608     4-24  (29)
140 PF09472 MtrF:  Tetrahydrometha  27.1      86  0.0019   24.3   3.3   30  577-608    34-64  (64)
141 KOG1094 Discoidin domain recep  26.7      80  0.0017   35.5   4.2   13   96-108    53-65  (807)
142 PRK15348 type III secretion sy  26.7      79  0.0017   31.6   3.9   35  480-515   149-184 (249)
143 PF13706 PepSY_TM_3:  PepSY-ass  25.2 1.3E+02  0.0029   20.3   3.7   22  583-605     9-30  (37)
144 KOG0860 Synaptobrevin/VAMP-lik  24.8      72  0.0016   27.7   2.8   15  576-590    85-99  (116)
145 PF14610 DUF4448:  Protein of u  24.6      22 0.00047   34.0  -0.4    8  561-568   123-130 (189)
146 PF12301 CD99L2:  CD99 antigen   24.0      77  0.0017   29.7   3.1   12  534-545    68-79  (169)
147 PF13172 PepSY_TM_1:  PepSY-ass  24.0 1.2E+02  0.0025   20.0   3.2   26  577-603     2-29  (34)
148 PRK11486 flagellar biosynthesi  23.3      87  0.0019   27.7   3.1   20  592-611    24-43  (124)
149 PF11615 DUF3249:  Protein of u  23.1      56  0.0012   23.3   1.5   25  534-558    11-35  (60)
150 TIGR03063 srtB_target sortase   22.8   1E+02  0.0022   19.8   2.5    7  604-610    22-28  (29)
151 PF01002 Flavi_NS2B:  Flaviviru  22.8      86  0.0019   27.9   3.0   27  506-532    44-79  (128)
152 PF13179 DUF4006:  Family of un  22.5 1.4E+02   0.003   23.2   3.6   19  601-619    29-47  (66)
153 PF11669 WBP-1:  WW domain-bind  22.3      81  0.0018   26.9   2.7   15  597-611    32-46  (102)
154 PF13908 Shisa:  Wnt and FGF in  22.3      41 0.00088   31.8   1.0   11  583-593    76-86  (179)
155 PF14991 MLANA:  Protein melan-  22.2      20 0.00042   30.8  -1.1   18  593-610    33-50  (118)
156 PF01282 Ribosomal_S24e:  Ribos  22.1 3.7E+02   0.008   21.9   6.5   43  490-532    12-56  (84)
157 PF02038 ATP1G1_PLM_MAT8:  ATP1  22.1 1.2E+02  0.0025   22.2   3.0   25  586-611    13-37  (50)
158 cd05481 retropepsin_like_LTR_1  21.3      69  0.0015   26.6   2.0   23   90-114     3-26  (93)
159 PRK00523 hypothetical protein;  21.1      93   0.002   24.6   2.5   25  586-611     5-29  (72)
160 PF02480 Herpes_gE:  Alphaherpe  20.8      33 0.00071   37.5   0.0   30  583-612   353-382 (439)
161 PF03597 CcoS:  Cytochrome oxid  20.7 1.9E+02   0.004   20.7   3.8   12  596-607    12-23  (45)
162 PRK00972 tetrahydromethanopter  20.5      82  0.0018   31.2   2.6    9  611-619   284-292 (292)
163 PTZ00208 65 kDa invariant surf  20.3      92   0.002   33.0   3.1    6  338-343   237-242 (436)
164 PF11118 DUF2627:  Protein of u  20.2 1.1E+02  0.0025   24.3   2.9   27  588-615    44-70  (77)

No 1  
>PLN03146 aspartyl protease family protein; Provisional
Probab=100.00  E-value=1.4e-52  Score=451.69  Aligned_cols=333  Identities=28%  Similarity=0.537  Sum_probs=269.7

Q ss_pred             ccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-cc-------cCCCCC
Q 047816           81 LLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-CN-------CDRERA  152 (620)
Q Consensus        81 ~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c~-------c~~~~~  152 (620)
                      .++++|+++|.||||+|++.|++||||+++||+|.+|..|..+.++.|||++|+||+.+.|.+. |.       |... +
T Consensus        80 ~~~~~Y~v~i~iGTPpq~~~vi~DTGS~l~Wv~C~~C~~C~~~~~~~fdps~SST~~~~~C~s~~C~~~~~~~~c~~~-~  158 (431)
T PLN03146         80 SNGGEYLMNISIGTPPVPILAIADTGSDLIWTQCKPCDDCYKQVSPLFDPKKSSTYKDVSCDSSQCQALGNQASCSDE-N  158 (431)
T ss_pred             cCCccEEEEEEcCCCCceEEEEECCCCCcceEcCCCCcccccCCCCcccCCCCCCCcccCCCCcccccCCCCCCCCCC-C
Confidence            3567899999999999999999999999999999999999887778999999999999999864 74       5433 4


Q ss_pred             cceeEEeeccCCceeEEEEEEEEEeCCCC--CCCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCccc
Q 047816          153 QCVYERKYAEMSSSSGVLGEDIISFGNES--DLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISD  230 (620)
Q Consensus       153 ~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~--~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~  230 (620)
                      .|.|.+.|+||+.+.|.+++|+|+|++..  ..++.++.|||+....+.+. ...+||||||++..|++.||..+  +.+
T Consensus       159 ~c~y~i~Ygdgs~~~G~l~~Dtltlg~~~~~~~~v~~~~FGc~~~~~g~f~-~~~~GilGLG~~~~Sl~sql~~~--~~~  235 (431)
T PLN03146        159 TCTYSYSYGDGSFTKGNLAVETLTIGSTSGRPVSFPGIVFGCGHNNGGTFD-EKGSGIVGLGGGPLSLISQLGSS--IGG  235 (431)
T ss_pred             CCeeEEEeCCCCceeeEEEEEEEEeccCCCCcceeCCEEEeCCCCCCCCcc-CCCceeEecCCCCccHHHHhhHh--hCC
Confidence            69999999998888999999999998743  14578999999988766543 35799999999999999999763  557


Q ss_pred             ceEEeecCCC---CCCceEEECCCCCCC--CceEeecCCC-CCCeeEEEEeEEEEccEEecCCCCcc--CCCCceEeecc
Q 047816          231 SFSLCYGGMD---VGGGAMVLGGISPPK--DMVFTHSDPV-RSPYYNIDLKVIHVAGKPLPLNPKVF--DGKHGTVLDSG  302 (620)
Q Consensus       231 ~FSl~l~~~~---~~~G~l~fGgiD~~~--~~~~~~~~~~-~~~~w~v~l~~i~v~g~~~~~~~~~~--~~~~~ailDSG  302 (620)
                      .||+||.+..   ...|.|+||+...-.  .+.+++.... ...+|.|.|++|+|+++.+.++...+  .+...+|||||
T Consensus       236 ~FSycL~~~~~~~~~~g~l~fG~~~~~~~~~~~~tPl~~~~~~~~y~V~L~gIsVgg~~l~~~~~~~~~~~~g~~iiDSG  315 (431)
T PLN03146        236 KFSYCLVPLSSDSNGTSKINFGTNAIVSGSGVVSTPLVSKDPDTFYYLTLEAISVGSKKLPYTGSSKNGVEEGNIIIDSG  315 (431)
T ss_pred             cEEEECCCCCCCCCCcceEEeCCccccCCCCceEcccccCCCCCeEEEeEEEEEECCEECcCCccccccCCCCcEEEeCC
Confidence            9999996422   347999999953221  2456654422 35789999999999999988766544  23457999999


Q ss_pred             ceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecc
Q 047816          303 TTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSK  382 (620)
Q Consensus       303 tt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~  382 (620)
                      |++++||+++|+++.+++.+++.... .. ......+.||....      ...+|+|+|+| +|.++.|++++|++....
T Consensus       316 Tt~t~Lp~~~y~~l~~~~~~~~~~~~-~~-~~~~~~~~C~~~~~------~~~~P~i~~~F-~Ga~~~l~~~~~~~~~~~  386 (431)
T PLN03146        316 TTLTLLPSDFYSELESAVEEAIGGER-VS-DPQGLLSLCYSSTS------DIKLPIITAHF-TGADVKLQPLNTFVKVSE  386 (431)
T ss_pred             ccceecCHHHHHHHHHHHHHHhcccc-CC-CCCCCCCccccCCC------CCCCCeEEEEE-CCCeeecCcceeEEEcCC
Confidence            99999999999999999988875321 11 11234578986321      13689999999 689999999999987643


Q ss_pred             cCCeEEEEEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEecCCcc
Q 047816          383 VRGAYCLGIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCSE  430 (620)
Q Consensus       383 ~~~~~Cl~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~~  430 (620)
                        +..|+++...  .+.+|||+.|||++|++||++++|||||+++|+.
T Consensus       387 --~~~Cl~~~~~--~~~~IlG~~~q~~~~vvyDl~~~~igFa~~~C~~  430 (431)
T PLN03146        387 --DLVCFAMIPT--SSIAIFGNLAQMNFLVGYDLESKTVSFKPTDCTK  430 (431)
T ss_pred             --CcEEEEEecC--CCceEECeeeEeeEEEEEECCCCEEeeecCCcCc
Confidence              5689988754  3469999999999999999999999999999975


No 2  
>PTZ00165 aspartyl protease; Provisional
Probab=100.00  E-value=9.7e-53  Score=454.89  Aligned_cols=320  Identities=24%  Similarity=0.439  Sum_probs=253.4

Q ss_pred             eeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCC--CCCCCCCCCCCCCCcccccccCcCCcccCCC
Q 047816           73 RMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEH--CGDHQDPKFEPDLSSTYQPVKCNLYCNCDRE  150 (620)
Q Consensus        73 ~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~--C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~  150 (620)
                      ..++.+.  .|.+|+++|+||||||+|.|++||||+++||+|..|..  |..+  +.|||++|+||+...+..       
T Consensus       110 ~~~l~n~--~d~~Y~~~I~IGTPpQ~f~Vv~DTGSS~lWVps~~C~~~~C~~~--~~yd~s~SSTy~~~~~~~-------  178 (482)
T PTZ00165        110 QQDLLNF--HNSQYFGEIQVGTPPKSFVVVFDTGSSNLWIPSKECKSGGCAPH--RKFDPKKSSTYTKLKLGD-------  178 (482)
T ss_pred             ceecccc--cCCeEEEEEEeCCCCceEEEEEeCCCCCEEEEchhcCccccccc--CCCCccccCCcEecCCCC-------
Confidence            3444443  47899999999999999999999999999999999985  6554  799999999999843110       


Q ss_pred             CCcceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC---------chHH
Q 047816          151 RAQCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL---------SVVD  220 (620)
Q Consensus       151 ~~~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~---------s~~~  220 (620)
                       ....+.++|++ +++.|.+++|+|++|+   ++++++.||++..+++. +....+|||||||++..         ++++
T Consensus       179 -~~~~~~i~YGs-Gs~~G~l~~DtV~ig~---l~i~~q~FG~a~~~s~~~f~~~~~DGILGLg~~~~s~~s~~~~~p~~~  253 (482)
T PTZ00165        179 -ESAETYIQYGT-GECVLALGKDTVKIGG---LKVKHQSIGLAIEESLHPFADLPFDGLVGLGFPDKDFKESKKALPIVD  253 (482)
T ss_pred             -ccceEEEEeCC-CcEEEEEEEEEEEECC---EEEccEEEEEEEeccccccccccccceeecCCCcccccccCCCCCHHH
Confidence             11257799998 5678999999999998   78899999999987653 44557899999998753         5899


Q ss_pred             HHHHcCCcc-cceEEeecCCCCCCceEEECCCCCCCC--c-eEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCc
Q 047816          221 QLVEKGVIS-DSFSLCYGGMDVGGGAMVLGGISPPKD--M-VFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHG  296 (620)
Q Consensus       221 ~L~~~g~I~-~~FSl~l~~~~~~~G~l~fGgiD~~~~--~-~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~  296 (620)
                      +|++||+|+ +.||+||++....+|+|+|||+|+.+.  . ...+.+.....+|.|.+++|+|+++.+...    .....
T Consensus       254 ~l~~qgli~~~~FS~yL~~~~~~~G~l~fGGiD~~~~~~~g~i~~~Pv~~~~yW~i~l~~i~vgg~~~~~~----~~~~~  329 (482)
T PTZ00165        254 NIKKQNLLKRNIFSFYMSKDLNQPGSISFGSADPKYTLEGHKIWWFPVISTDYWEIEVVDILIDGKSLGFC----DRKCK  329 (482)
T ss_pred             HHHHcCCcccceEEEEeccCCCCCCEEEeCCcCHHHcCCCCceEEEEccccceEEEEeCeEEECCEEeeec----CCceE
Confidence            999999998 899999987655689999999998653  1 233333446789999999999999876542    23457


Q ss_pred             eEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCc-----EEEe
Q 047816          297 TVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQ-----KLLL  371 (620)
Q Consensus       297 ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~-----~~~l  371 (620)
                      +++||||+++++|.++++++.++++.               ..+|..        . +.+|+|+|+| +|.     +|.+
T Consensus       330 aIiDTGTSli~lP~~~~~~i~~~i~~---------------~~~C~~--------~-~~lP~itf~f-~g~~g~~v~~~l  384 (482)
T PTZ00165        330 AAIDTGSSLITGPSSVINPLLEKIPL---------------EEDCSN--------K-DSLPRISFVL-EDVNGRKIKFDM  384 (482)
T ss_pred             EEEcCCCccEeCCHHHHHHHHHHcCC---------------cccccc--------c-ccCCceEEEE-CCCCCceEEEEE
Confidence            99999999999999999999888732               136732        2 5789999999 443     8999


Q ss_pred             CCCCcEEEec--ccCCeEEE-EEEecC----CCCceeehHhhhceEEEEEeCCCCEEEEEecCCcc-ccccccc
Q 047816          372 APENYLFRHS--KVRGAYCL-GIFQNG----RDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCSE-LWERLHI  437 (620)
Q Consensus       372 ~~~~yi~~~~--~~~~~~Cl-~~~~~~----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~~-~~~~~~~  437 (620)
                      +|++|+++..  ..++..|+ ++....    .++.||||++|||++|+|||++|+|||||+++|+. ..+..++
T Consensus       385 ~p~dYi~~~~~~~~~~~~C~~g~~~~d~~~~~g~~~ILGd~Flr~yy~VFD~~n~rIGfA~a~~~~~~~~~~~~  458 (482)
T PTZ00165        385 DPEDYVIEEGDSEEQEHQCVIGIIPMDVPAPRGPLFVLGNNFIRKYYSIFDRDHMMVGLVPAKHDQSGPNFQEL  458 (482)
T ss_pred             chHHeeeecccCCCCCCeEEEEEEECCCCCCCCceEEEchhhheeEEEEEeCCCCEEEEEeeccCCCCCcEEEe
Confidence            9999999742  23456896 454321    23579999999999999999999999999999876 3334444


No 3  
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequence of the zymogens are self cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures, however their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which 
Probab=100.00  E-value=1e-51  Score=430.89  Aligned_cols=298  Identities=27%  Similarity=0.507  Sum_probs=248.2

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE  162 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d  162 (620)
                      +..|+++|.||||+|++.|++||||+++||+|..|..|....++.|+|++|+|++..+             +.+.+.|++
T Consensus         8 ~~~Y~~~i~vGtp~q~~~v~~DTGS~~~wv~~~~C~~~~c~~~~~f~~~~Sst~~~~~-------------~~~~~~yg~   74 (317)
T cd05478           8 DMEYYGTISIGTPPQDFTVIFDTGSSNLWVPSVYCSSQACSNHNRFNPRQSSTYQSTG-------------QPLSIQYGT   74 (317)
T ss_pred             CCEEEEEEEeCCCCcEEEEEEeCCCccEEEecCCCCcccccccCcCCCCCCcceeeCC-------------cEEEEEECC
Confidence            6789999999999999999999999999999999985333334799999999999865             589999998


Q ss_pred             CCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcC-CCcceEEecCCCCC------chHHHHHHcCCcc-cceEE
Q 047816          163 MSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYS-QHADGIIGLGRGDL------SVVDQLVEKGVIS-DSFSL  234 (620)
Q Consensus       163 g~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~-~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~~FSl  234 (620)
                      |. +.|.+++|+|++|+   +.++++.|||+..+.+.+.. ...+||||||++..      +++++|+++|+|+ ++||+
T Consensus        75 gs-~~G~~~~D~v~ig~---~~i~~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~L~~~g~i~~~~FS~  150 (317)
T cd05478          75 GS-MTGILGYDTVQVGG---ISDTNQIFGLSETEPGSFFYYAPFDGILGLAYPSIASSGATPVFDNMMSQGLVSQDLFSV  150 (317)
T ss_pred             ce-EEEEEeeeEEEECC---EEECCEEEEEEEecCccccccccccceeeeccchhcccCCCCHHHHHHhCCCCCCCEEEE
Confidence            55 89999999999998   67789999999877665433 35799999998753      5899999999998 89999


Q ss_pred             eecCCCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHH
Q 047816          235 CYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEA  311 (620)
Q Consensus       235 ~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~  311 (620)
                      ||.+.+..+|.|+|||+|++++   +.|++.  ....+|.|.++++.|+++.+...     .+..++|||||+++++|++
T Consensus       151 ~L~~~~~~~g~l~~Gg~d~~~~~g~l~~~p~--~~~~~w~v~l~~v~v~g~~~~~~-----~~~~~iiDTGts~~~lp~~  223 (317)
T cd05478         151 YLSSNGQQGSVVTFGGIDPSYYTGSLNWVPV--TAETYWQITVDSVTINGQVVACS-----GGCQAIVDTGTSLLVGPSS  223 (317)
T ss_pred             EeCCCCCCCeEEEEcccCHHHccCceEEEEC--CCCcEEEEEeeEEEECCEEEccC-----CCCEEEECCCchhhhCCHH
Confidence            9998665679999999999873   344443  45689999999999999987532     3457999999999999999


Q ss_pred             HHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEE
Q 047816          312 AFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGI  391 (620)
Q Consensus       312 ~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~  391 (620)
                      ++++|.+++++...       ..+.|..+|..        . ..+|.|+|+| +|+++.||+++|+.+.    +..|+..
T Consensus       224 ~~~~l~~~~~~~~~-------~~~~~~~~C~~--------~-~~~P~~~f~f-~g~~~~i~~~~y~~~~----~~~C~~~  282 (317)
T cd05478         224 DIANIQSDIGASQN-------QNGEMVVNCSS--------I-SSMPDVVFTI-NGVQYPLPPSAYILQD----QGSCTSG  282 (317)
T ss_pred             HHHHHHHHhCCccc-------cCCcEEeCCcC--------c-ccCCcEEEEE-CCEEEEECHHHheecC----CCEEeEE
Confidence            99999998855321       23356678842        1 4689999999 8899999999999864    5689866


Q ss_pred             EecCC-CCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          392 FQNGR-DPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       392 ~~~~~-~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      ++..+ ...||||++|||++|+|||++|+||||||
T Consensus       283 ~~~~~~~~~~IlG~~fl~~~y~vfD~~~~~iG~A~  317 (317)
T cd05478         283 FQSMGLGELWILGDVFIRQYYSVFDRANNKVGLAP  317 (317)
T ss_pred             EEeCCCCCeEEechHHhcceEEEEeCCCCEEeecC
Confidence            65543 46799999999999999999999999996


No 4  
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank 
Probab=100.00  E-value=2.1e-51  Score=430.15  Aligned_cols=300  Identities=27%  Similarity=0.487  Sum_probs=244.9

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC----CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE----HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER  158 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~----~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~  158 (620)
                      |.+|+++|.||||+|++.|++||||+++||+|..|.    .|..+  +.|+|++|+||+..+             +.|.+
T Consensus         4 ~~~Y~~~i~iGtP~q~~~v~~DTGSs~~Wv~~~~C~~~~~~C~~~--~~y~~~~SsT~~~~~-------------~~~~i   68 (325)
T cd05490           4 DAQYYGEIGIGTPPQTFTVVFDTGSSNLWVPSVHCSLLDIACWLH--HKYNSSKSSTYVKNG-------------TEFAI   68 (325)
T ss_pred             CCEEEEEEEECCCCcEEEEEEeCCCccEEEEcCCCCCCCccccCc--CcCCcccCcceeeCC-------------cEEEE
Confidence            668999999999999999999999999999999997    46655  689999999998754             68999


Q ss_pred             eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-c
Q 047816          159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-D  230 (620)
Q Consensus       159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~  230 (620)
                      .|++| ++.|.+++|+|++|+   .+++++.||+++...+. +.....+||||||++..      +++++|++||+|+ +
T Consensus        69 ~Yg~G-~~~G~~~~D~v~~g~---~~~~~~~Fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~l~~~g~i~~~  144 (325)
T cd05490          69 QYGSG-SLSGYLSQDTVSIGG---LQVEGQLFGEAVKQPGITFIAAKFDGILGMAYPRISVDGVTPVFDNIMAQKLVEQN  144 (325)
T ss_pred             EECCc-EEEEEEeeeEEEECC---EEEcCEEEEEEeeccCCcccceeeeEEEecCCccccccCCCCHHHHHHhcCCCCCC
Confidence            99995 589999999999998   67889999999877653 33346799999998754      5889999999998 8


Q ss_pred             ceEEeecCCC--CCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816          231 SFSLCYGGMD--VGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY  305 (620)
Q Consensus       231 ~FSl~l~~~~--~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~  305 (620)
                      .||+||++..  ..+|+|+|||+|++++   +.|++  .....+|.|++++|.|+++....     .....++|||||++
T Consensus       145 ~FS~~L~~~~~~~~~G~l~~Gg~d~~~~~g~l~~~~--~~~~~~w~v~l~~i~vg~~~~~~-----~~~~~aiiDSGTt~  217 (325)
T cd05490         145 VFSFYLNRDPDAQPGGELMLGGTDPKYYTGDLHYVN--VTRKAYWQIHMDQVDVGSGLTLC-----KGGCEAIVDTGTSL  217 (325)
T ss_pred             EEEEEEeCCCCCCCCCEEEECccCHHHcCCceEEEE--cCcceEEEEEeeEEEECCeeeec-----CCCCEEEECCCCcc
Confidence            9999998642  2469999999999873   33443  34568999999999998764321     23457999999999


Q ss_pred             eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCC
Q 047816          306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRG  385 (620)
Q Consensus       306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~  385 (620)
                      +++|.+++++|.+++.+.       ....+.|..+|..         ...+|+|+|+| +|+.|.|+|++|+++....+.
T Consensus       218 ~~~p~~~~~~l~~~~~~~-------~~~~~~~~~~C~~---------~~~~P~i~f~f-gg~~~~l~~~~y~~~~~~~~~  280 (325)
T cd05490         218 ITGPVEEVRALQKAIGAV-------PLIQGEYMIDCEK---------IPTLPVISFSL-GGKVYPLTGEDYILKVSQRGT  280 (325)
T ss_pred             ccCCHHHHHHHHHHhCCc-------cccCCCEEecccc---------cccCCCEEEEE-CCEEEEEChHHeEEeccCCCC
Confidence            999999999999888542       1123456778842         14689999999 899999999999997654445


Q ss_pred             eEEEEEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          386 AYCLGIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       386 ~~Cl~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      ..|+..+..     ..+..||||++|||++|+|||++++|||||+
T Consensus       281 ~~C~~~~~~~~~~~~~~~~~ilGd~flr~~y~vfD~~~~~IGfA~  325 (325)
T cd05490         281 TICLSGFMGLDIPPPAGPLWILGDVFIGRYYTVFDRDNDRVGFAK  325 (325)
T ss_pred             CEEeeEEEECCCCCCCCceEEEChHhheeeEEEEEcCCcEeeccC
Confidence            679765443     2245799999999999999999999999995


No 5  
>KOG1339 consensus Aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=1.7e-50  Score=433.36  Aligned_cols=340  Identities=37%  Similarity=0.685  Sum_probs=281.6

Q ss_pred             CCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC-CCCCCCCCCCCCCCCcccccccCcCC-cc----cCCCCC
Q 047816           79 DLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE-HCGDHQDPKFEPDLSSTYQPVKCNLY-CN----CDRERA  152 (620)
Q Consensus        79 ~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~-~C~~~~~~~y~p~~SsT~~~~~c~~~-c~----c~~~~~  152 (620)
                      ....+++|+++|.||||||+|.|++||||+++||+|..|. .|..+.+..|+|++|+||+...|.+. |.    |....+
T Consensus        40 ~~~~~~~Y~~~i~IGTPpq~f~v~~DTGS~~lWV~c~~c~~~C~~~~~~~f~p~~SSt~~~~~c~~~~c~~~~~~~~~~~  119 (398)
T KOG1339|consen   40 SSYSSGEYYGNISIGTPPQSFTVVLDTGSDLLWVPCAPCSSACYSQHNPIFDPSASSTYKSVGCSSPRCKSLPQSCSPNS  119 (398)
T ss_pred             ccccccccEEEEecCCCCeeeEEEEeCCCCceeeccccccccccccCCCccCccccccccccCCCCccccccccCcccCC
Confidence            3445678999999999999999999999999999999999 89875445699999999999999974 52    444567


Q ss_pred             cceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcC-CCcceEEecCCCCCchHHHHHHcCCcccc
Q 047816          153 QCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYS-QHADGIIGLGRGDLSVVDQLVEKGVISDS  231 (620)
Q Consensus       153 ~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~-~~~dGIlGLg~~~~s~~~~L~~~g~I~~~  231 (620)
                      .|.|.+.|+||+.+.|.+++|+|++++.+.+...++.|||+..+.+.+.. ...+||||||++.+++..|+...+...++
T Consensus       120 ~C~y~i~Ygd~~~~~G~l~~Dtv~~~~~~~~~~~~~~FGc~~~~~g~~~~~~~~dGIlGLg~~~~S~~~q~~~~~~~~~~  199 (398)
T KOG1339|consen  120 SCPYSIQYGDGSSTSGYLATDTVTFGGTTSLPVPNQTFGCGTNNPGSFGLFAAFDGILGLGRGSLSVPSQLPSFYNAINV  199 (398)
T ss_pred             cCceEEEeCCCCceeEEEEEEEEEEccccccccccEEEEeeecCccccccccccceEeecCCCCccceeecccccCCcee
Confidence            89999999999999999999999999853356678999999998765222 46899999999999999999988777679


Q ss_pred             eEEeecCCCC---CCceEEECCCCCCCCce---EeecCCCCCCeeEEEEeEEEEccEEecCCCCccCC-CCceEeeccce
Q 047816          232 FSLCYGGMDV---GGGAMVLGGISPPKDMV---FTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDG-KHGTVLDSGTT  304 (620)
Q Consensus       232 FSl~l~~~~~---~~G~l~fGgiD~~~~~~---~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~-~~~ailDSGtt  304 (620)
                      ||+||.+.+.   .+|.|+||++|+.++..   |++.......+|.|.+++|.|+++. .+....+.. ..++++||||+
T Consensus       200 FS~cL~~~~~~~~~~G~i~fG~~d~~~~~~~l~~tPl~~~~~~~y~v~l~~I~vgg~~-~~~~~~~~~~~~~~iiDSGTs  278 (398)
T KOG1339|consen  200 FSYCLSSNGSPSSGGGSIIFGGVDSSHYTGSLTYTPLLSNPSTYYQVNLDGISVGGKR-PIGSSLFCTDGGGAIIDSGTS  278 (398)
T ss_pred             EEEEeCCCCCCCCCCcEEEECCCcccCcCCceEEEeeccCCCccEEEEEeEEEECCcc-CCCcceEecCCCCEEEECCcc
Confidence            9999998653   47999999999998554   6665543335999999999999977 655555555 47899999999


Q ss_pred             eeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccC
Q 047816          305 YAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVR  384 (620)
Q Consensus       305 ~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~  384 (620)
                      +++||+++|++|.+++.+++.    .......+.+.|+......     ..+|.|+|+|++|+.|.+++++|+++.....
T Consensus       279 ~t~lp~~~y~~i~~~~~~~~~----~~~~~~~~~~~C~~~~~~~-----~~~P~i~~~f~~g~~~~l~~~~y~~~~~~~~  349 (398)
T KOG1339|consen  279 LTYLPTSAYNALREAIGAEVS----VVGTDGEYFVPCFSISTSG-----VKLPDITFHFGGGAVFSLPPKNYLVEVSDGG  349 (398)
T ss_pred             eeeccHHHHHHHHHHHHhhee----ccccCCceeeecccCCCCc-----ccCCcEEEEECCCcEEEeCccceEEEECCCC
Confidence            999999999999999988741    0224456778998653221     3589999999559999999999999876532


Q ss_pred             CeEEEEEEecCCC-CceeehHhhhceEEEEEeCC-CCEEEEEe--cCCc
Q 047816          385 GAYCLGIFQNGRD-PTTLLGGIIVRNTLVMYDRE-HSKIGFWK--TNCS  429 (620)
Q Consensus       385 ~~~Cl~~~~~~~~-~~~ILG~~fLr~~yvvfD~e-n~rIGfA~--~~c~  429 (620)
                      .. |++.+...+. ..||||+.|+|+++++||.. ++|||||+  .+|.
T Consensus       350 ~~-Cl~~~~~~~~~~~~ilG~~~~~~~~~~~D~~~~~riGfa~~~~~c~  397 (398)
T KOG1339|consen  350 GV-CLAFFNGMDSGPLWILGDVFQQNYLVVFDLGENSRVGFAPALTNCS  397 (398)
T ss_pred             Cc-eeeEEecCCCCceEEEchHHhCCEEEEEeCCCCCEEEeccccccCC
Confidence            22 9998877644 48999999999999999999 99999999  6664


No 6  
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family; it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=100.00  E-value=5.4e-51  Score=425.27  Aligned_cols=296  Identities=28%  Similarity=0.525  Sum_probs=243.6

Q ss_pred             EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccC
Q 047816           86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEM  163 (620)
Q Consensus        86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg  163 (620)
                      |+++|+||||+|+++|++||||+++||+|..|.  .|..+  +.|+|++|+|++..+             +.|++.|++|
T Consensus         1 Y~~~i~iGtP~Q~~~v~~DTGSs~~Wv~s~~C~~~~C~~~--~~y~~~~SsT~~~~~-------------~~~~i~Yg~g   65 (316)
T cd05486           1 YFGQISIGTPPQNFTVIFDTGSSNLWVPSIYCTSQACTKH--NRFQPSESSTYVSNG-------------EAFSIQYGTG   65 (316)
T ss_pred             CeEEEEECCCCcEEEEEEcCCCccEEEecCCCCCcccCcc--ceECCCCCcccccCC-------------cEEEEEeCCc
Confidence            679999999999999999999999999999997  57655  789999999998865             6899999985


Q ss_pred             CceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-cceEEe
Q 047816          164 SSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-DSFSLC  235 (620)
Q Consensus       164 ~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~~FSl~  235 (620)
                       ++.|.+++|+|++++   .++.++.||++..+.+. +.....+||||||++..      +++++|++||+|+ +.||+|
T Consensus        66 -~~~G~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~p~~~~l~~qg~i~~~~FS~~  141 (316)
T cd05486          66 -SLTGIIGIDQVTVEG---ITVQNQQFAESVSEPGSTFQDSEFDGILGLAYPSLAVDGVTPVFDNMMAQNLVELPMFSVY  141 (316)
T ss_pred             -EEEEEeeecEEEECC---EEEcCEEEEEeeccCcccccccccceEeccCchhhccCCCCCHHHHHHhcCCCCCCEEEEE
Confidence             689999999999998   67889999998776553 33456899999998764      4799999999998 899999


Q ss_pred             ecCCC--CCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecH
Q 047816          236 YGGMD--VGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPE  310 (620)
Q Consensus       236 l~~~~--~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~  310 (620)
                      |++..  ..+|+|+|||+|+++   .+.|+++  ....+|.|.+++|.|+++.+..     .....++|||||+++++|+
T Consensus       142 L~~~~~~~~~g~l~fGg~d~~~~~g~l~~~pi--~~~~~w~v~l~~i~v~g~~~~~-----~~~~~aiiDTGTs~~~lP~  214 (316)
T cd05486         142 MSRNPNSADGGELVFGGFDTSRFSGQLNWVPV--TVQGYWQIQLDNIQVGGTVIFC-----SDGCQAIVDTGTSLITGPS  214 (316)
T ss_pred             EccCCCCCCCcEEEEcccCHHHcccceEEEEC--CCceEEEEEeeEEEEecceEec-----CCCCEEEECCCcchhhcCH
Confidence            98642  357999999999987   3445543  4578999999999999987642     2235799999999999999


Q ss_pred             HHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEE
Q 047816          311 AAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLG  390 (620)
Q Consensus       311 ~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~  390 (620)
                      ++++++.+++.+.        ..++.|.++|..         .+.+|+|+|+| +|+.++|+|++|++.....++..|+.
T Consensus       215 ~~~~~l~~~~~~~--------~~~~~~~~~C~~---------~~~~p~i~f~f-~g~~~~l~~~~y~~~~~~~~~~~C~~  276 (316)
T cd05486         215 GDIKQLQNYIGAT--------ATDGEYGVDCST---------LSLMPSVTFTI-NGIPYSLSPQAYTLEDQSDGGGYCSS  276 (316)
T ss_pred             HHHHHHHHHhCCc--------ccCCcEEEeccc---------cccCCCEEEEE-CCEEEEeCHHHeEEecccCCCCEEee
Confidence            9999998877432        123446678842         14689999999 89999999999998753334568975


Q ss_pred             EEecC-----CCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          391 IFQNG-----RDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       391 ~~~~~-----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      .++..     ..+.||||++|||++|+|||.+++|||||+
T Consensus       277 ~~~~~~~~~~~~~~~ILGd~flr~~y~vfD~~~~~IGfA~  316 (316)
T cd05486         277 GFQGLDIPPPAGPLWILGDVFIRQYYSVFDRGNNRVGFAP  316 (316)
T ss_pred             EEEECCCCCCCCCeEEEchHHhcceEEEEeCCCCEeeccC
Confidence            54432     235799999999999999999999999995


No 7  
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate  r
Probab=100.00  E-value=1.4e-50  Score=423.86  Aligned_cols=301  Identities=26%  Similarity=0.504  Sum_probs=246.6

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCC----CCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEH----CGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER  158 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~----C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~  158 (620)
                      +..|+++|+||||+|+++|++||||+++||++..|..    |..+  +.|+|++|+|++..+             |.|++
T Consensus         6 ~~~y~~~i~iGtP~q~~~v~~DTGSs~~Wv~~~~C~~~~~~c~~~--~~y~~~~SsT~~~~~-------------~~~~~   70 (326)
T cd05487           6 DTQYYGEIGIGTPPQTFKVVFDTGSSNLWVPSSKCSPLYTACVTH--NLYDASDSSTYKENG-------------TEFTI   70 (326)
T ss_pred             CCeEEEEEEECCCCcEEEEEEeCCccceEEccCCCcCcchhhccc--CcCCCCCCeeeeECC-------------EEEEE
Confidence            5689999999999999999999999999999988874    5544  799999999999865             68999


Q ss_pred             eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccC-CCcCCCcceEEecCCCCC------chHHHHHHcCCcc-c
Q 047816          159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETG-DLYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-D  230 (620)
Q Consensus       159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~-~~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~  230 (620)
                      .|++| ++.|.+++|+|++|+   ..+ ++.||++..... .+.....+||||||++..      +++++|++||+|+ +
T Consensus        71 ~Yg~g-~~~G~~~~D~v~~g~---~~~-~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~L~~qg~i~~~  145 (326)
T cd05487          71 HYASG-TVKGFLSQDIVTVGG---IPV-TQMFGEVTALPAIPFMLAKFDGVLGMGYPKQAIGGVTPVFDNIMSQGVLKED  145 (326)
T ss_pred             EeCCc-eEEEEEeeeEEEECC---EEe-eEEEEEEEeccCCccceeecceEEecCChhhcccCCCCHHHHHHhcCCCCCC
Confidence            99985 599999999999998   444 478999887542 233346899999998653      5899999999998 8


Q ss_pred             ceEEeecCCC--CCCceEEECCCCCCCCc-eEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeee
Q 047816          231 SFSLCYGGMD--VGGGAMVLGGISPPKDM-VFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAY  307 (620)
Q Consensus       231 ~FSl~l~~~~--~~~G~l~fGgiD~~~~~-~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~  307 (620)
                      .||+||.+.+  ...|.|+|||+|++++. .+.+++.....+|.|+++++.|+++.+...     .+..++|||||++++
T Consensus       146 ~FS~~L~~~~~~~~~G~l~fGg~d~~~y~g~l~~~~~~~~~~w~v~l~~i~vg~~~~~~~-----~~~~aiiDSGts~~~  220 (326)
T cd05487         146 VFSVYYSRDSSHSLGGEIVLGGSDPQHYQGDFHYINTSKTGFWQIQMKGVSVGSSTLLCE-----DGCTAVVDTGASFIS  220 (326)
T ss_pred             EEEEEEeCCCCCCCCcEEEECCcChhhccCceEEEECCcCceEEEEecEEEECCEEEecC-----CCCEEEECCCccchh
Confidence            9999998653  35799999999998843 344444456789999999999999876432     235799999999999


Q ss_pred             ecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeE
Q 047816          308 LPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAY  387 (620)
Q Consensus       308 LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~  387 (620)
                      +|.++++++++++++...        ...|..+|..         ...+|+|+|+| +|.+++|++++|+++....++..
T Consensus       221 lP~~~~~~l~~~~~~~~~--------~~~y~~~C~~---------~~~~P~i~f~f-gg~~~~v~~~~yi~~~~~~~~~~  282 (326)
T cd05487         221 GPTSSISKLMEALGAKER--------LGDYVVKCNE---------VPTLPDISFHL-GGKEYTLSSSDYVLQDSDFSDKL  282 (326)
T ss_pred             CcHHHHHHHHHHhCCccc--------CCCEEEeccc---------cCCCCCEEEEE-CCEEEEeCHHHhEEeccCCCCCE
Confidence            999999999998854321        3456778843         14689999999 88999999999999876545678


Q ss_pred             EEEEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816          388 CLGIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT  426 (620)
Q Consensus       388 Cl~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~  426 (620)
                      |+..++.     ..++.||||++|||++|+|||++++|||||++
T Consensus       283 C~~~~~~~~~~~~~~~~~ilG~~flr~~y~vfD~~~~~IGfA~a  326 (326)
T cd05487         283 CTVAFHAMDIPPPTGPLWVLGATFIRKFYTEFDRQNNRIGFALA  326 (326)
T ss_pred             EEEEEEeCCCCCCCCCeEEEehHHhhccEEEEeCCCCEEeeeeC
Confidence            8755443     12357999999999999999999999999985


No 8  
>PTZ00147 plasmepsin-1; Provisional
Probab=100.00  E-value=2.8e-50  Score=432.51  Aligned_cols=310  Identities=24%  Similarity=0.359  Sum_probs=246.6

Q ss_pred             CceeeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCC
Q 047816           70 PNARMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDR  149 (620)
Q Consensus        70 ~~~~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~  149 (620)
                      .+..+++.+..  +.+|+++|+||||+|++.|++||||+++||+|..|..|..+.++.|||++|+||+..+         
T Consensus       126 ~~~~v~L~n~~--n~~Y~~~I~IGTP~Q~f~Vi~DTGSsdlWVps~~C~~~~C~~~~~yd~s~SsT~~~~~---------  194 (453)
T PTZ00147        126 EFDNVELKDLA--NVMSYGEAKLGDNGQKFNFIFDTGSANLWVPSIKCTTEGCETKNLYDSSKSKTYEKDG---------  194 (453)
T ss_pred             CCCeeeccccC--CCEEEEEEEECCCCeEEEEEEeCCCCcEEEeecCCCcccccCCCccCCccCcceEECC---------
Confidence            44556666543  6789999999999999999999999999999999985443344799999999999865         


Q ss_pred             CCCcceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC---CcCCCcceEEecCCCCC------chHH
Q 047816          150 ERAQCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD---LYSQHADGIIGLGRGDL------SVVD  220 (620)
Q Consensus       150 ~~~~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~---~~~~~~dGIlGLg~~~~------s~~~  220 (620)
                          +.|++.|++| ++.|.+++|+|++|+   .+++ ..|+++....+.   +.....|||||||++..      +++.
T Consensus       195 ----~~f~i~Yg~G-svsG~~~~DtVtiG~---~~v~-~qF~~~~~~~~f~~~~~~~~~DGILGLG~~~~S~~~~~p~~~  265 (453)
T PTZ00147        195 ----TKVEMNYVSG-TVSGFFSKDLVTIGN---LSVP-YKFIEVTDTNGFEPFYTESDFDGIFGLGWKDLSIGSVDPYVV  265 (453)
T ss_pred             ----CEEEEEeCCC-CEEEEEEEEEEEECC---EEEE-EEEEEEEeccCcccccccccccceecccCCccccccCCCHHH
Confidence                5899999985 689999999999998   5555 578888765431   22346799999999764      4788


Q ss_pred             HHHHcCCcc-cceEEeecCCCCCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCc
Q 047816          221 QLVEKGVIS-DSFSLCYGGMDVGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHG  296 (620)
Q Consensus       221 ~L~~~g~I~-~~FSl~l~~~~~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~  296 (620)
                      +|++||+|+ ++||+||++.+..+|.|+|||+|+++   ++.|++.  ....+|.|.++ +.+++...        ....
T Consensus       266 ~L~~qg~I~~~vFS~~L~~~~~~~G~L~fGGiD~~ky~G~l~y~pl--~~~~~W~V~l~-~~vg~~~~--------~~~~  334 (453)
T PTZ00147        266 ELKNQNKIEQAVFTFYLPPEDKHKGYLTIGGIEERFYEGPLTYEKL--NHDLYWQVDLD-VHFGNVSS--------EKAN  334 (453)
T ss_pred             HHHHcCCCCccEEEEEecCCCCCCeEEEECCcChhhcCCceEEEEc--CCCceEEEEEE-EEECCEec--------Ccee
Confidence            999999998 79999998766668999999999997   3445544  35679999998 47765431        2457


Q ss_pred             eEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCc
Q 047816          297 TVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENY  376 (620)
Q Consensus       297 ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~y  376 (620)
                      ++|||||+++++|+++++++.+++++...      ...+.|.++|+.          ..+|+|+|.| +|..++|+|++|
T Consensus       335 aIiDSGTsli~lP~~~~~ai~~~l~~~~~------~~~~~y~~~C~~----------~~lP~~~f~f-~g~~~~L~p~~y  397 (453)
T PTZ00147        335 VIVDSGTSVITVPTEFLNKFVESLDVFKV------PFLPLYVTTCNN----------TKLPTLEFRS-PNKVYTLEPEYY  397 (453)
T ss_pred             EEECCCCchhcCCHHHHHHHHHHhCCeec------CCCCeEEEeCCC----------CCCCeEEEEE-CCEEEEECHHHh
Confidence            99999999999999999999998854211      122346678842          3689999999 789999999999


Q ss_pred             EEEecccCCeEEEEEEec-C-CCCceeehHhhhceEEEEEeCCCCEEEEEecC
Q 047816          377 LFRHSKVRGAYCLGIFQN-G-RDPTTLLGGIIVRNTLVMYDREHSKIGFWKTN  427 (620)
Q Consensus       377 i~~~~~~~~~~Cl~~~~~-~-~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~  427 (620)
                      +....+.+...|+..+.. . ..+.||||++|||++|+|||++++|||||+++
T Consensus       398 i~~~~~~~~~~C~~~i~~~~~~~~~~ILGd~FLr~~YtVFD~~n~rIGfA~a~  450 (453)
T PTZ00147        398 LQPIEDIGSALCMLNIIPIDLEKNTFILGDPFMRKYFTVFDYDNHTVGFALAK  450 (453)
T ss_pred             eeccccCCCcEEEEEEEECCCCCCCEEECHHHhccEEEEEECCCCEEEEEEec
Confidence            986544344679754433 2 23579999999999999999999999999986


No 9  
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases. They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=100.00  E-value=2.7e-50  Score=419.88  Aligned_cols=290  Identities=28%  Similarity=0.495  Sum_probs=239.2

Q ss_pred             cceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC---CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816           82 LNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE---HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER  158 (620)
Q Consensus        82 ~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~---~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~  158 (620)
                      .+.+|+++|+||||+|++.|++||||+++||+|..|.   .|..+  +.|+|++|+|++..+             ..+.+
T Consensus         7 ~~~~Y~~~i~iGtP~Q~~~v~~DTGSs~lWv~~~~C~~~~~C~~~--~~y~~~~SsT~~~~~-------------~~~~i   71 (317)
T cd06098           7 LDAQYFGEIGIGTPPQKFTVIFDTGSSNLWVPSSKCYFSIACYFH--SKYKSSKSSTYKKNG-------------TSASI   71 (317)
T ss_pred             CCCEEEEEEEECCCCeEEEEEECCCccceEEecCCCCCCcccccc--CcCCcccCCCcccCC-------------CEEEE
Confidence            3678999999999999999999999999999999996   68766  799999999998865             57899


Q ss_pred             eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-c
Q 047816          159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-D  230 (620)
Q Consensus       159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~  230 (620)
                      .|++| ++.|.+++|+|++|+   .+++++.||++..+.+. +.....+||||||++..      +++.+|++||+|+ +
T Consensus        72 ~Yg~G-~~~G~~~~D~v~ig~---~~v~~~~f~~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~l~~qg~i~~~  147 (317)
T cd06098          72 QYGTG-SISGFFSQDSVTVGD---LVVKNQVFIEATKEPGLTFLLAKFDGILGLGFQEISVGKAVPVWYNMVEQGLVKEP  147 (317)
T ss_pred             EcCCc-eEEEEEEeeEEEECC---EEECCEEEEEEEecCCccccccccceeccccccchhhcCCCCHHHHHHhcCCCCCC
Confidence            99985 589999999999998   67889999999876543 34456899999999754      4788999999998 8


Q ss_pred             ceEEeecCCC--CCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816          231 SFSLCYGGMD--VGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY  305 (620)
Q Consensus       231 ~FSl~l~~~~--~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~  305 (620)
                      .||+||++..  ..+|+|+|||+|++++   +.|+++  ....+|.|++++|.|+++.+....    ....++|||||++
T Consensus       148 ~FS~~L~~~~~~~~~G~l~fGg~d~~~~~g~l~~~pv--~~~~~w~v~l~~i~v~g~~~~~~~----~~~~aivDTGTs~  221 (317)
T cd06098         148 VFSFWLNRNPDEEEGGELVFGGVDPKHFKGEHTYVPV--TRKGYWQFEMGDVLIGGKSTGFCA----GGCAAIADSGTSL  221 (317)
T ss_pred             EEEEEEecCCCCCCCcEEEECccChhhcccceEEEec--CcCcEEEEEeCeEEECCEEeeecC----CCcEEEEecCCcc
Confidence            9999998642  3579999999999974   345544  356799999999999998765422    3457999999999


Q ss_pred             eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCC
Q 047816          306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRG  385 (620)
Q Consensus       306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~  385 (620)
                      +++|+++++++.                   +..+|...         ..+|+|+|+| +|+++.|+|++|+++......
T Consensus       222 ~~lP~~~~~~i~-------------------~~~~C~~~---------~~~P~i~f~f-~g~~~~l~~~~yi~~~~~~~~  272 (317)
T cd06098         222 LAGPTTIVTQIN-------------------SAVDCNSL---------SSMPNVSFTI-GGKTFELTPEQYILKVGEGAA  272 (317)
T ss_pred             eeCCHHHHHhhh-------------------ccCCcccc---------ccCCcEEEEE-CCEEEEEChHHeEEeecCCCC
Confidence            999998776542                   34578421         4689999999 889999999999987655445


Q ss_pred             eEEEEEEecC-----CCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          386 AYCLGIFQNG-----RDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       386 ~~Cl~~~~~~-----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      ..|+..++..     .++.||||++|||++|+|||++|+|||||+
T Consensus       273 ~~C~~~~~~~~~~~~~~~~~IlGd~Flr~~y~VfD~~~~~iGfA~  317 (317)
T cd06098         273 AQCISGFTALDVPPPRGPLWILGDVFMGAYHTVFDYGNLRVGFAE  317 (317)
T ss_pred             CEEeceEEECCCCCCCCCeEEechHHhcccEEEEeCCCCEEeecC
Confidence            6897544321     235799999999999999999999999995


No 10 
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=100.00  E-value=3.6e-50  Score=420.60  Aligned_cols=296  Identities=30%  Similarity=0.554  Sum_probs=240.3

Q ss_pred             eeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-c----ccCCCCCcceeEE
Q 047816           84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-C----NCDRERAQCVYER  158 (620)
Q Consensus        84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c----~c~~~~~~~~~~~  158 (620)
                      ++|+++|.||||+|++.|+|||||+++||+|..|..|..+.++.|+|++|+|++.+.|++. |    .|.  .+.|.|.+
T Consensus         2 ~~Y~~~i~vGtP~Q~~~v~~DTGS~~~wv~~~~C~~c~~~~~~~y~~~~Sst~~~~~C~~~~c~~~~~~~--~~~~~~~i   79 (326)
T cd06096           2 AYYFIDIFIGNPPQKQSLILDTGSSSLSFPCSQCKNCGIHMEPPYNLNNSITSSILYCDCNKCCYCLSCL--NNKCEYSI   79 (326)
T ss_pred             ceEEEEEEecCCCeEEEEEEeCCCCceEEecCCCCCcCCCCCCCcCcccccccccccCCCccccccCcCC--CCcCcEEE
Confidence            4799999999999999999999999999999999999887778999999999999999863 4    243  35699999


Q ss_pred             eeccCCceeEEEEEEEEEeCCCCCC----CccceEEEEEEeccCCCcCCCcceEEecCCCCC----chHHHHHHcCCcc-
Q 047816          159 KYAEMSSSSGVLGEDIISFGNESDL----KPQRAVFGCENVETGDLYSQHADGIIGLGRGDL----SVVDQLVEKGVIS-  229 (620)
Q Consensus       159 ~Y~dg~~~~G~~~~D~v~lg~~~~~----~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~----s~~~~L~~~g~I~-  229 (620)
                      .|++|+.+.|.+++|+|+||+....    ...++.|||+..+.+.+.....+||||||+...    +...+|.+++.+. 
T Consensus        80 ~Y~~gs~~~G~~~~D~v~lg~~~~~~~~~~~~~~~fg~~~~~~~~~~~~~~~GilGLg~~~~~~~~~~~~~l~~~~~~~~  159 (326)
T cd06096          80 SYSEGSSISGFYFSDFVSFESYLNSNSEKESFKKIFGCHTHETNLFLTQQATGILGLSLTKNNGLPTPIILLFTKRPKLK  159 (326)
T ss_pred             EECCCCceeeEEEEEEEEeccCCCCccccccccEEeccCccccCcccccccceEEEccCCcccccCchhHHHHHhccccc
Confidence            9999878999999999999984310    112578999988877666667899999999764    2444566776653 


Q ss_pred             --cceEEeecCCCCCCceEEECCCCCCCC-------------ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCC
Q 047816          230 --DSFSLCYGGMDVGGGAMVLGGISPPKD-------------MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGK  294 (620)
Q Consensus       230 --~~FSl~l~~~~~~~G~l~fGgiD~~~~-------------~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~  294 (620)
                        ++||+||++   .+|.|+|||+|+.++             +.|++.  ....+|.|.+++|+|+++....   .....
T Consensus       160 ~~~~FS~~l~~---~~G~l~~Gg~d~~~~~~~~~~~~~~~~~~~~~p~--~~~~~y~v~l~~i~vg~~~~~~---~~~~~  231 (326)
T cd06096         160 KDKIFSICLSE---DGGELTIGGYDKDYTVRNSSIGNNKVSKIVWTPI--TRKYYYYVKLEGLSVYGTTSNS---GNTKG  231 (326)
T ss_pred             CCceEEEEEcC---CCeEEEECccChhhhcccccccccccCCceEEec--cCCceEEEEEEEEEEcccccce---ecccC
Confidence              799999986   469999999998753             244443  3458999999999999875110   11345


Q ss_pred             CceEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCC
Q 047816          295 HGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPE  374 (620)
Q Consensus       295 ~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~  374 (620)
                      ..++|||||++++||+++++++.+++                                    |+|+|+|++|+++.++|+
T Consensus       232 ~~aivDSGTs~~~lp~~~~~~l~~~~------------------------------------P~i~~~f~~g~~~~i~p~  275 (326)
T cd06096         232 LGMLVDSGSTLSHFPEDLYNKINNFF------------------------------------PTITIIFENNLKIDWKPS  275 (326)
T ss_pred             CCEEEeCCCCcccCCHHHHHHHHhhc------------------------------------CcEEEEEcCCcEEEECHH
Confidence            68999999999999999998877654                                    789999965899999999


Q ss_pred             CcEEEecccCCeEEEEEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEecCCc
Q 047816          375 NYLFRHSKVRGAYCLGIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCS  429 (620)
Q Consensus       375 ~yi~~~~~~~~~~Cl~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~  429 (620)
                      +|++....  ..+|+++. .. ++.+|||++|||++|+|||+|++|||||+++|.
T Consensus       276 ~y~~~~~~--~~c~~~~~-~~-~~~~ILG~~flr~~y~vFD~~~~riGfa~~~C~  326 (326)
T cd06096         276 SYLYKKES--FWCKGGEK-SV-SNKPILGASFFKNKQIIFDLDNNRIGFVESNCP  326 (326)
T ss_pred             HhccccCC--ceEEEEEe-cC-CCceEEChHHhcCcEEEEECcCCEEeeEcCCCC
Confidence            99987543  33555543 33 468999999999999999999999999999994


No 11 
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=100.00  E-value=8.2e-50  Score=428.04  Aligned_cols=309  Identities=25%  Similarity=0.381  Sum_probs=244.9

Q ss_pred             CCceeeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcc
Q 047816           69 HPNARMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCN  146 (620)
Q Consensus        69 ~~~~~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~  146 (620)
                      ..+..+++.+..  +.+|+++|+||||+|++.|++||||+++||+|..|.  .|..+  +.|+|++|+|++..+      
T Consensus       124 ~~~~~~~l~d~~--n~~Yy~~i~IGTP~Q~f~vi~DTGSsdlWV~s~~C~~~~C~~~--~~yd~s~SsT~~~~~------  193 (450)
T PTZ00013        124 SENDVIELDDVA--NIMFYGEGEVGDNHQKFMLIFDTGSANLWVPSKKCDSIGCSIK--NLYDSSKSKSYEKDG------  193 (450)
T ss_pred             cCCCceeeeccC--CCEEEEEEEECCCCeEEEEEEeCCCCceEEecccCCccccccC--CCccCccCcccccCC------
Confidence            344556666543  668889999999999999999999999999999997  56655  799999999999865      


Q ss_pred             cCCCCCcceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC---CcCCCcceEEecCCCCC------c
Q 047816          147 CDRERAQCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD---LYSQHADGIIGLGRGDL------S  217 (620)
Q Consensus       147 c~~~~~~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~---~~~~~~dGIlGLg~~~~------s  217 (620)
                             +.+++.|++| ++.|.+++|+|++|+   +++. ..|+++....+.   +....+|||||||++..      +
T Consensus       194 -------~~~~i~YG~G-sv~G~~~~Dtv~iG~---~~~~-~~f~~~~~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~~p  261 (450)
T PTZ00013        194 -------TKVDITYGSG-TVKGFFSKDLVTLGH---LSMP-YKFIEVTDTDDLEPIYSSSEFDGILGLGWKDLSIGSIDP  261 (450)
T ss_pred             -------cEEEEEECCc-eEEEEEEEEEEEECC---EEEc-cEEEEEEeccccccceecccccceecccCCccccccCCC
Confidence                   5899999985 599999999999998   4554 578887665321   22346799999999764      5


Q ss_pred             hHHHHHHcCCcc-cceEEeecCCCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCC
Q 047816          218 VVDQLVEKGVIS-DSFSLCYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDG  293 (620)
Q Consensus       218 ~~~~L~~~g~I~-~~FSl~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~  293 (620)
                      ++++|++||+|+ ++||+||++.+..+|.|+|||+|++++   +.|+++  ....+|.|.++ +.++....        .
T Consensus       262 ~~~~L~~qg~I~~~vFS~~L~~~~~~~G~L~fGGiD~~~y~G~L~y~pv--~~~~yW~I~l~-v~~G~~~~--------~  330 (450)
T PTZ00013        262 IVVELKNQNKIDNALFTFYLPVHDVHAGYLTIGGIEEKFYEGNITYEKL--NHDLYWQIDLD-VHFGKQTM--------Q  330 (450)
T ss_pred             HHHHHHhccCcCCcEEEEEecCCCCCCCEEEECCcCccccccceEEEEc--CcCceEEEEEE-EEECceec--------c
Confidence            889999999998 799999987655689999999999973   445444  35679999998 66654322        2


Q ss_pred             CCceEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCC
Q 047816          294 KHGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAP  373 (620)
Q Consensus       294 ~~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~  373 (620)
                      +..+++||||+++++|+++++++++++.....      ...+.|..+|+.          +.+|+|+|+| +|.+++|+|
T Consensus       331 ~~~aIlDSGTSli~lP~~~~~~i~~~l~~~~~------~~~~~y~~~C~~----------~~lP~i~F~~-~g~~~~L~p  393 (450)
T PTZ00013        331 KANVIVDSGTTTITAPSEFLNKFFANLNVIKV------PFLPFYVTTCDN----------KEMPTLEFKS-ANNTYTLEP  393 (450)
T ss_pred             ccceEECCCCccccCCHHHHHHHHHHhCCeec------CCCCeEEeecCC----------CCCCeEEEEE-CCEEEEECH
Confidence            35799999999999999999999988754311      122346678842          4689999999 789999999


Q ss_pred             CCcEEEecccCCeEEEEEEec-C-CCCceeehHhhhceEEEEEeCCCCEEEEEecC
Q 047816          374 ENYLFRHSKVRGAYCLGIFQN-G-RDPTTLLGGIIVRNTLVMYDREHSKIGFWKTN  427 (620)
Q Consensus       374 ~~yi~~~~~~~~~~Cl~~~~~-~-~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~  427 (620)
                      ++|+......++..|+..+.. . .++.||||++|||++|+|||++++|||||+++
T Consensus       394 ~~Yi~~~~~~~~~~C~~~i~~~~~~~~~~ILGd~FLr~~Y~VFD~~n~rIGfA~a~  449 (450)
T PTZ00013        394 EYYMNPLLDVDDTLCMITMLPVDIDDNTFILGDPFMRKYFTVFDYDKESVGFAIAK  449 (450)
T ss_pred             HHheehhccCCCCeeEEEEEECCCCCCCEEECHHHhccEEEEEECCCCEEEEEEeC
Confidence            999976443345689644443 2 24689999999999999999999999999975


No 12 
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. It self-converts to a mature enzyme, triggered by a drop in pH from neutral to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. Mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=100.00  E-value=8.1e-50  Score=416.92  Aligned_cols=297  Identities=26%  Similarity=0.520  Sum_probs=245.2

Q ss_pred             eeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeec
Q 047816           84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYA  161 (620)
Q Consensus        84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~  161 (620)
                      ..|+++|+||||+|++.|++||||+++||+|..|.  .|..+  +.|+|++|+||+..+             |.|++.|+
T Consensus         2 ~~y~~~i~iGtP~q~~~v~~DTGS~~~wv~~~~C~~~~C~~~--~~f~~~~SsT~~~~~-------------~~~~~~Yg   66 (318)
T cd05477           2 MSYYGEISIGTPPQNFLVLFDTGSSNLWVPSVLCQSQACTNH--TKFNPSQSSTYSTNG-------------ETFSLQYG   66 (318)
T ss_pred             cEEEEEEEECCCCcEEEEEEeCCCccEEEccCCCCCcccccc--CCCCcccCCCceECC-------------cEEEEEEC
Confidence            47999999999999999999999999999999998  46654  799999999999865             68999999


Q ss_pred             cCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCC------CchHHHHHHcCCcc-cceE
Q 047816          162 EMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGD------LSVVDQLVEKGVIS-DSFS  233 (620)
Q Consensus       162 dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~------~s~~~~L~~~g~I~-~~FS  233 (620)
                      +| ++.|.+++|+|++|+   ..+.++.|||+....+. +.....+||||||++.      .+++++|+++|.|+ ++||
T Consensus        67 ~G-s~~G~~~~D~i~~g~---~~i~~~~Fg~~~~~~~~~~~~~~~~GilGLg~~~~s~~~~~~~~~~L~~~g~i~~~~FS  142 (318)
T cd05477          67 SG-SLTGIFGYDTVTVQG---IIITNQEFGLSETEPGTNFVYAQFDGILGLAYPSISAGGATTVMQGMMQQNLLQAPIFS  142 (318)
T ss_pred             Cc-EEEEEEEeeEEEECC---EEEcCEEEEEEEecccccccccceeeEeecCcccccccCCCCHHHHHHhcCCcCCCEEE
Confidence            95 589999999999998   67789999999876543 3334579999999853      46999999999998 8999


Q ss_pred             EeecCCC-CCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeec
Q 047816          234 LCYGGMD-VGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLP  309 (620)
Q Consensus       234 l~l~~~~-~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP  309 (620)
                      +||++.. ..+|.|+|||+|++++   +.|+++  ....+|.|.+++|+|+++.+...    ..+..++|||||+++++|
T Consensus       143 ~~L~~~~~~~~g~l~fGg~d~~~~~g~l~~~pv--~~~~~w~v~l~~i~v~g~~~~~~----~~~~~~iiDSGtt~~~lP  216 (318)
T cd05477         143 FYLSGQQGQQGGELVFGGVDNNLYTGQIYWTPV--TSETYWQIGIQGFQINGQATGWC----SQGCQAIVDTGTSLLTAP  216 (318)
T ss_pred             EEEcCCCCCCCCEEEEcccCHHHcCCceEEEec--CCceEEEEEeeEEEECCEEeccc----CCCceeeECCCCccEECC
Confidence            9998742 3469999999999873   445543  45689999999999999876532    234579999999999999


Q ss_pred             HHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEE
Q 047816          310 EAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCL  389 (620)
Q Consensus       310 ~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl  389 (620)
                      ++++++|++++.++..       ..+.|..+|..         .+.+|+|+|+| +|+++.||+++|+.+.    ...|+
T Consensus       217 ~~~~~~l~~~~~~~~~-------~~~~~~~~C~~---------~~~~p~l~~~f-~g~~~~v~~~~y~~~~----~~~C~  275 (318)
T cd05477         217 QQVMSTLMQSIGAQQD-------QYGQYVVNCNN---------IQNLPTLTFTI-NGVSFPLPPSAYILQN----NGYCT  275 (318)
T ss_pred             HHHHHHHHHHhCCccc-------cCCCEEEeCCc---------cccCCcEEEEE-CCEEEEECHHHeEecC----CCeEE
Confidence            9999999998865432       23456778842         14689999999 7899999999999864    34685


Q ss_pred             -EEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816          390 -GIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT  426 (620)
Q Consensus       390 -~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~  426 (620)
                       ++...     .++..||||++|||++|++||++++|||||++
T Consensus       276 ~~i~~~~~~~~~~~~~~ilG~~fl~~~y~vfD~~~~~ig~a~~  318 (318)
T cd05477         276 VGIEPTYLPSQNGQPLWILGDVFLRQYYSVYDLGNNQVGFATA  318 (318)
T ss_pred             EEEEecccCCCCCCceEEEcHHHhhheEEEEeCCCCEEeeeeC
Confidence             55432     12356999999999999999999999999985


No 13 
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme.  Proteinase A preferentially hydro
Probab=100.00  E-value=5e-50  Score=418.62  Aligned_cols=295  Identities=27%  Similarity=0.520  Sum_probs=244.2

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEee
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKY  160 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y  160 (620)
                      +.+|+++|+||||+|++.|++||||+++||+|..|.  .|..+  +.|+|++|+|++..+             |.+.+.|
T Consensus         8 ~~~Y~~~i~iGtp~q~~~v~~DTGSs~~wv~~~~C~~~~C~~~--~~y~~~~Sst~~~~~-------------~~~~~~y   72 (320)
T cd05488           8 NAQYFTDITLGTPPQKFKVILDTGSSNLWVPSVKCGSIACFLH--SKYDSSASSTYKANG-------------TEFKIQY   72 (320)
T ss_pred             CCEEEEEEEECCCCcEEEEEEecCCcceEEEcCCCCCcccCCc--ceECCCCCcceeeCC-------------CEEEEEE
Confidence            568999999999999999999999999999999997  57655  699999999999765             5899999


Q ss_pred             ccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-cce
Q 047816          161 AEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-DSF  232 (620)
Q Consensus       161 ~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~~F  232 (620)
                      ++| +++|.+++|+|++++   +.++++.|||+..+.+. +.....+||||||++..      +.+.+|++||+|. +.|
T Consensus        73 ~~g-~~~G~~~~D~v~ig~---~~~~~~~f~~a~~~~g~~~~~~~~dGilGLg~~~~s~~~~~~~~~~l~~qg~i~~~~F  148 (320)
T cd05488          73 GSG-SLEGFVSQDTLSIGD---LTIKKQDFAEATSEPGLAFAFGKFDGILGLAYDTISVNKIVPPFYNMINQGLLDEPVF  148 (320)
T ss_pred             CCc-eEEEEEEEeEEEECC---EEECCEEEEEEecCCCcceeeeeeceEEecCCccccccCCCCHHHHHHhcCCCCCCEE
Confidence            985 589999999999998   67789999999876553 22346799999999764      3567899999998 899


Q ss_pred             EEeecCCCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeec
Q 047816          233 SLCYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLP  309 (620)
Q Consensus       233 Sl~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP  309 (620)
                      |+||++.+..+|.|+|||+|++++   +.|++.  ....+|.|.+++|+|+++.+...      +..++|||||++++||
T Consensus       149 S~~L~~~~~~~G~l~fGg~d~~~~~g~l~~~p~--~~~~~w~v~l~~i~vg~~~~~~~------~~~~ivDSGtt~~~lp  220 (320)
T cd05488         149 SFYLGSSEEDGGEATFGGIDESRFTGKITWLPV--RRKAYWEVELEKIGLGDEELELE------NTGAAIDTGTSLIALP  220 (320)
T ss_pred             EEEecCCCCCCcEEEECCcCHHHcCCceEEEeC--CcCcEEEEEeCeEEECCEEeccC------CCeEEEcCCcccccCC
Confidence            999998655689999999999873   445543  35679999999999999877532      3479999999999999


Q ss_pred             HHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEE
Q 047816          310 EAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCL  389 (620)
Q Consensus       310 ~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl  389 (620)
                      +++++++.+++.+..       .....|..+|..        . +.+|.|+|+| +|+++.||+++|+++.    +..|+
T Consensus       221 ~~~~~~l~~~~~~~~-------~~~~~~~~~C~~--------~-~~~P~i~f~f-~g~~~~i~~~~y~~~~----~g~C~  279 (320)
T cd05488         221 SDLAEMLNAEIGAKK-------SWNGQYTVDCSK--------V-DSLPDLTFNF-DGYNFTLGPFDYTLEV----SGSCI  279 (320)
T ss_pred             HHHHHHHHHHhCCcc-------ccCCcEEeeccc--------c-ccCCCEEEEE-CCEEEEECHHHheecC----CCeEE
Confidence            999999988875432       123456677842        1 4689999999 7899999999999853    34698


Q ss_pred             EEEecC-----CCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          390 GIFQNG-----RDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       390 ~~~~~~-----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      ..+...     ..+.||||+.|||++|+|||++++|||||+
T Consensus       280 ~~~~~~~~~~~~~~~~ilG~~fl~~~y~vfD~~~~~iG~a~  320 (320)
T cd05488         280 SAFTGMDFPEPVGPLAIVGDAFLRKYYSVYDLGNNAVGLAK  320 (320)
T ss_pred             EEEEECcCCCCCCCeEEEchHHhhheEEEEeCCCCEEeecC
Confidence            665532     134799999999999999999999999996


No 14 
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets an
Probab=100.00  E-value=6e-50  Score=419.27  Aligned_cols=299  Identities=25%  Similarity=0.472  Sum_probs=245.7

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC----CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE----HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER  158 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~----~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~  158 (620)
                      +.+|+++|+||||+|++.|++||||+++||+|..|.    .|..+  +.|+|++|+|++..+             +.|.+
T Consensus         9 ~~~Y~~~i~vGtP~q~~~v~~DTGSs~~Wv~~~~C~~~~~~c~~~--~~y~~~~Sst~~~~~-------------~~~~i   73 (329)
T cd05485           9 DAQYYGVITIGTPPQSFKVVFDTGSSNLWVPSKKCSWTNIACLLH--NKYDSTKSSTYKKNG-------------TEFAI   73 (329)
T ss_pred             CCeEEEEEEECCCCcEEEEEEcCCCccEEEecCCCCCCCccccCC--CeECCcCCCCeEECC-------------eEEEE
Confidence            568999999999999999999999999999999997    46544  689999999999865             68999


Q ss_pred             eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCCc------hHHHHHHcCCcc-c
Q 047816          159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDLS------VVDQLVEKGVIS-D  230 (620)
Q Consensus       159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~s------~~~~L~~~g~I~-~  230 (620)
                      .|++| ++.|.+++|+|++|+   ..++++.||++..+.+. +.....+||||||++..+      ++.+|++||+|+ +
T Consensus        74 ~Y~~g-~~~G~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~~GilGLg~~~~s~~~~~p~~~~l~~qg~i~~~  149 (329)
T cd05485          74 QYGSG-SLSGFLSTDTVSVGG---VSVKGQTFAEAINEPGLTFVAAKFDGILGMGYSSISVDGVVPVFYNMVNQKLVDAP  149 (329)
T ss_pred             EECCc-eEEEEEecCcEEECC---EEECCEEEEEEEecCCccccccccceEEEcCCccccccCCCCHHHHHHhCCCCCCC
Confidence            99985 489999999999998   66789999999876542 334567999999997653      679999999998 8


Q ss_pred             ceEEeecCCCC--CCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816          231 SFSLCYGGMDV--GGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY  305 (620)
Q Consensus       231 ~FSl~l~~~~~--~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~  305 (620)
                      .||+||.+...  .+|.|+|||+|++++   +.+++.  ....+|.|.++++.++++.+.      ..+..++|||||++
T Consensus       150 ~FS~~l~~~~~~~~~G~l~fGg~d~~~~~g~l~~~p~--~~~~~~~v~~~~i~v~~~~~~------~~~~~~iiDSGtt~  221 (329)
T cd05485         150 VFSFYLNRDPSAKEGGELILGGSDPKHYTGNFTYLPV--TRKGYWQFKMDSVSVGEGEFC------SGGCQAIADTGTSL  221 (329)
T ss_pred             EEEEEecCCCCCCCCcEEEEcccCHHHcccceEEEEc--CCceEEEEEeeEEEECCeeec------CCCcEEEEccCCcc
Confidence            99999986432  469999999999874   344443  457899999999999988654      23457999999999


Q ss_pred             eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCC
Q 047816          306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRG  385 (620)
Q Consensus       306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~  385 (620)
                      +++|++++++|.+++.+..     +  ....|.++|..         .+.+|+|+|+| +|+++.|++++|+++....+.
T Consensus       222 ~~lP~~~~~~l~~~~~~~~-----~--~~~~~~~~C~~---------~~~~p~i~f~f-gg~~~~i~~~~yi~~~~~~~~  284 (329)
T cd05485         222 IAGPVDEIEKLNNAIGAKP-----I--IGGEYMVNCSA---------IPSLPDITFVL-GGKSFSLTGKDYVLKVTQMGQ  284 (329)
T ss_pred             eeCCHHHHHHHHHHhCCcc-----c--cCCcEEEeccc---------cccCCcEEEEE-CCEEeEEChHHeEEEecCCCC
Confidence            9999999999988875431     1  12356778842         14679999999 889999999999998765445


Q ss_pred             eEEEEEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          386 AYCLGIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       386 ~~Cl~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      ..|+..+..     ..++.||||++|||++|+|||++++|||||+
T Consensus       285 ~~C~~~~~~~~~~~~~~~~~IlG~~fl~~~y~vFD~~~~~ig~a~  329 (329)
T cd05485         285 TICLSGFMGIDIPPPAGPLWILGDVFIGKYYTEFDLGNNRVGFAT  329 (329)
T ss_pred             CEEeeeEEECcCCCCCCCeEEEchHHhccceEEEeCCCCEEeecC
Confidence            689754442     2235799999999999999999999999985


No 15 
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with a reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco.  CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This fami
Probab=100.00  E-value=4.4e-48  Score=400.45  Aligned_cols=293  Identities=28%  Similarity=0.513  Sum_probs=233.7

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS  164 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~  164 (620)
                      +|+++|.||||||++.|++||||+++||+|..|                       |             .|.++|++|+
T Consensus         1 ~Y~~~i~iGtP~q~~~v~~DTGSs~~Wv~c~~c-----------------------~-------------~~~i~Yg~Gs   44 (299)
T cd05472           1 EYVVTVGLGTPARDQTVIVDTGSDLTWVQCQPC-----------------------C-------------LYQVSYGDGS   44 (299)
T ss_pred             CeEEEEecCCCCcceEEEecCCCCcccccCCCC-----------------------C-------------eeeeEeCCCc
Confidence            488999999999999999999999999987654                       2             5889999987


Q ss_pred             ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCcccceEEeecCCC-CCC
Q 047816          165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSLCYGGMD-VGG  243 (620)
Q Consensus       165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl~l~~~~-~~~  243 (620)
                      .++|.+++|+|+||+.  ..++++.|||+...++.+.  ..+||||||++..+++.||..+  .+++||+||++.+ ..+
T Consensus        45 ~~~G~~~~D~v~ig~~--~~~~~~~Fg~~~~~~~~~~--~~~GilGLg~~~~s~~~ql~~~--~~~~FS~~L~~~~~~~~  118 (299)
T cd05472          45 YTTGDLATDTLTLGSS--DVVPGFAFGCGHDNEGLFG--GAAGLLGLGRGKLSLPSQTASS--YGGVFSYCLPDRSSSSS  118 (299)
T ss_pred             eEEEEEEEEEEEeCCC--CccCCEEEECCccCCCccC--CCCEEEECCCCcchHHHHhhHh--hcCceEEEccCCCCCCC
Confidence            7899999999999983  1678999999987765432  6899999999999999998765  4589999998754 467


Q ss_pred             ceEEECCCCCCC-CceEeecCCC--CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHHHHHH
Q 047816          244 GAMVLGGISPPK-DMVFTHSDPV--RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAFKDAI  320 (620)
Q Consensus       244 G~l~fGgiD~~~-~~~~~~~~~~--~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i~~~l  320 (620)
                      |+|+|||+|++. .+.|+++...  ...+|.|++++|+|+++.+...... .....++|||||++++||++++++|.+++
T Consensus       119 G~l~fGg~d~~~g~l~~~pv~~~~~~~~~y~v~l~~i~vg~~~~~~~~~~-~~~~~~ivDSGTt~~~lp~~~~~~l~~~l  197 (299)
T cd05472         119 GYLSFGAAASVPAGASFTPMLSNPRVPTFYYVGLTGISVGGRRLPIPPAS-FGAGGVIIDSGTVITRLPPSAYAALRDAF  197 (299)
T ss_pred             ceEEeCCccccCCCceECCCccCCCCCCeEEEeeEEEEECCEECCCCccc-cCCCCeEEeCCCcceecCHHHHHHHHHHH
Confidence            999999999962 4556654332  2468999999999999987654321 23457999999999999999999999999


Q ss_pred             HHHhhhcccccCCCCCCc-cccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEecC-CCC
Q 047816          321 MSELQSLKQIRGPDPNYN-DICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQNG-RDP  398 (620)
Q Consensus       321 ~~~~~~~~~~~~~~~~~~-~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~~~-~~~  398 (620)
                      .++......   ....+. ..|+.....   . ...+|+|+|+|++|.++.|++++|++... ..+..|+++.... ..+
T Consensus       198 ~~~~~~~~~---~~~~~~~~~C~~~~~~---~-~~~~P~i~f~f~~g~~~~l~~~~y~~~~~-~~~~~C~~~~~~~~~~~  269 (299)
T cd05472         198 RAAMAAYPR---APGFSILDTCYDLSGF---R-SVSVPTVSLHFQGGADVELDASGVLYPVD-DSSQVCLAFAGTSDDGG  269 (299)
T ss_pred             HHHhccCCC---CCCCCCCCccCcCCCC---c-CCccCCEEEEECCCCEEEeCcccEEEEec-CCCCEEEEEeCCCCCCC
Confidence            887642211   111222 358754221   1 25799999999658999999999998432 2457899877653 346


Q ss_pred             ceeehHhhhceEEEEEeCCCCEEEEEecCC
Q 047816          399 TTLLGGIIVRNTLVMYDREHSKIGFWKTNC  428 (620)
Q Consensus       399 ~~ILG~~fLr~~yvvfD~en~rIGfA~~~c  428 (620)
                      .+|||+.|||++|+|||++++|||||+++|
T Consensus       270 ~~ilG~~fl~~~~vvfD~~~~~igfa~~~C  299 (299)
T cd05472         270 LSIIGNVQQQTFRVVYDVAGGRIGFAPGGC  299 (299)
T ss_pred             CEEEchHHccceEEEEECCCCEEeEecCCC
Confidence            799999999999999999999999999999


No 16 
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two 
Probab=100.00  E-value=1.6e-47  Score=406.71  Aligned_cols=321  Identities=28%  Similarity=0.422  Sum_probs=239.6

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS  164 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~  164 (620)
                      .|+++|.||||+|++.|++||||+++||+|..|..|  +  +.|+|++|+||+..+             |.|++.|++| 
T Consensus         3 ~Y~~~i~iGtP~Q~~~v~~DTGSs~lWv~~~~~~~~--~--~~f~~~~SsT~~~~~-------------~~~~i~Yg~G-   64 (364)
T cd05473           3 GYYIEMLIGTPPQKLNILVDTGSSNFAVAAAPHPFI--H--TYFHRELSSTYRDLG-------------KGVTVPYTQG-   64 (364)
T ss_pred             ceEEEEEecCCCceEEEEEecCCcceEEEcCCCccc--c--ccCCchhCcCcccCC-------------ceEEEEECcc-
Confidence            478999999999999999999999999999887432  2  589999999999876             5899999985 


Q ss_pred             ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcC-CCcceEEecCCCCC--------chHHHHHHcCCcccceEEe
Q 047816          165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYS-QHADGIIGLGRGDL--------SVVDQLVEKGVISDSFSLC  235 (620)
Q Consensus       165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~-~~~dGIlGLg~~~~--------s~~~~L~~~g~I~~~FSl~  235 (620)
                      ++.|.+++|+|+||+..... ..+.|++.....+.+.. ...|||||||++.+        +++++|++|+.++++||+|
T Consensus        65 s~~G~~~~D~v~ig~~~~~~-~~~~~~~~~~~~~~~~~~~~~dGIlGLg~~~l~~~~~~~~~~~~~l~~q~~~~~~FS~~  143 (364)
T cd05473          65 SWEGELGTDLVSIPKGPNVT-FRANIAAITESENFFLNGSNWEGILGLAYAELARPDSSVEPFFDSLVKQTGIPDVFSLQ  143 (364)
T ss_pred             eEEEEEEEEEEEECCCCccc-eEEeeEEEeccccceecccccceeeeecccccccCCCCCCCHHHHHHhccCCccceEEE
Confidence            67999999999998631111 12334555544433332 35799999998754        5889999999988899998


Q ss_pred             ecC---------CCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccc
Q 047816          236 YGG---------MDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGT  303 (620)
Q Consensus       236 l~~---------~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGt  303 (620)
                      |+.         ....+|.|+|||+|++++   +.|+++  ....+|.|.+++|+|+++.+......+. ...++|||||
T Consensus       144 l~~~~~~~~~~~~~~~~g~l~fGg~D~~~~~g~l~~~p~--~~~~~~~v~l~~i~vg~~~~~~~~~~~~-~~~~ivDSGT  220 (364)
T cd05473         144 MCGAGLPVNGSASGTVGGSMVIGGIDPSLYKGDIWYTPI--REEWYYEVIILKLEVGGQSLNLDCKEYN-YDKAIVDSGT  220 (364)
T ss_pred             ecccccccccccccCCCcEEEeCCcCHhhcCCCceEEec--CcceeEEEEEEEEEECCEeccccccccc-CccEEEeCCC
Confidence            853         123479999999999873   445544  3467999999999999998875433221 2369999999


Q ss_pred             eeeeecHHHHHHHHHHHHHHhhhcccccCC-CCCCccccccCCCCCccccCCCCCeEEEEECCC-----cEEEeCCCCcE
Q 047816          304 TYAYLPEAAFLAFKDAIMSELQSLKQIRGP-DPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNG-----QKLLLAPENYL  377 (620)
Q Consensus       304 t~~~LP~~~~~~i~~~l~~~~~~~~~~~~~-~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g-----~~~~l~~~~yi  377 (620)
                      ++++||+++++++.+++.++... ...... ...+..+|+.....    ....+|+|+|+|+++     .++.|+|++|+
T Consensus       221 s~~~lp~~~~~~l~~~l~~~~~~-~~~~~~~~~~~~~~C~~~~~~----~~~~~P~i~~~f~g~~~~~~~~l~l~p~~Y~  295 (364)
T cd05473         221 TNLRLPVKVFNAAVDAIKAASLI-EDFPDGFWLGSQLACWQKGTT----PWEIFPKISIYLRDENSSQSFRITILPQLYL  295 (364)
T ss_pred             cceeCCHHHHHHHHHHHHhhccc-ccCCccccCcceeecccccCc----hHhhCCcEEEEEccCCCCceEEEEECHHHhh
Confidence            99999999999999999887531 111111 01234578643211    113689999999542     36899999999


Q ss_pred             EEeccc-CCeEEEEEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEecCCcccc
Q 047816          378 FRHSKV-RGAYCLGIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCSELW  432 (620)
Q Consensus       378 ~~~~~~-~~~~Cl~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~~~~  432 (620)
                      ...... .+..|+.+......+.||||+.|||++|+|||++++|||||+++|.+.+
T Consensus       296 ~~~~~~~~~~~C~~~~~~~~~~~~ILG~~flr~~yvvfD~~~~rIGfa~~~C~~~~  351 (364)
T cd05473         296 RPVEDHGTQLDCYKFAISQSTNGTVIGAVIMEGFYVVFDRANKRVGFAVSTCAEHD  351 (364)
T ss_pred             hhhccCCCcceeeEEeeecCCCceEEeeeeEcceEEEEECCCCEEeeEeccccccc
Confidence            764321 2457975332223457999999999999999999999999999998743


No 17 
>PF00026 Asp:  Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.;  InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).  More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. 
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family does not include the retroviral or retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=100.00  E-value=5.1e-47  Score=395.72  Aligned_cols=302  Identities=32%  Similarity=0.563  Sum_probs=248.1

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCC-CCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHC-GDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEM  163 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C-~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg  163 (620)
                      .|+++|.||||+|+++|++||||+++||++..|..| .......|++++|+|++..+             +.+.+.|++|
T Consensus         1 ~Y~~~v~iGtp~q~~~~~iDTGS~~~wv~~~~c~~~~~~~~~~~y~~~~S~t~~~~~-------------~~~~~~y~~g   67 (317)
T PF00026_consen    1 QYYINVTIGTPPQTFRVLIDTGSSDTWVPSSNCNSCSSCASSGFYNPSKSSTFSNQG-------------KPFSISYGDG   67 (317)
T ss_dssp             EEEEEEEETTTTEEEEEEEETTBSSEEEEBTTECSHTHHCTSC-BBGGGSTTEEEEE-------------EEEEEEETTE
T ss_pred             CeEEEEEECCCCeEEEEEEecccceeeeceeccccccccccccccccccccccccce-------------eeeeeeccCc
Confidence            488999999999999999999999999999999876 33334799999999999875             5799999996


Q ss_pred             CceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCC-------CCchHHHHHHcCCcc-cceEE
Q 047816          164 SSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRG-------DLSVVDQLVEKGVIS-DSFSL  234 (620)
Q Consensus       164 ~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~-------~~s~~~~L~~~g~I~-~~FSl  234 (620)
                      . ++|.+++|+|+|++   +.+.++.||++....+. +.....+||||||++       ..+++++|+++|+|+ ++||+
T Consensus        68 ~-~~G~~~~D~v~ig~---~~~~~~~f~~~~~~~~~~~~~~~~~GilGLg~~~~~~~~~~~~~~~~l~~~g~i~~~~fsl  143 (317)
T PF00026_consen   68 S-VSGNLVSDTVSIGG---LTIPNQTFGLADSYSGDPFSPIPFDGILGLGFPSLSSSSTYPTFLDQLVQQGLISSNVFSL  143 (317)
T ss_dssp             E-EEEEEEEEEEEETT---EEEEEEEEEEEEEEESHHHHHSSSSEEEE-SSGGGSGGGTS-SHHHHHHHTTSSSSSEEEE
T ss_pred             c-cccccccceEeeee---ccccccceeccccccccccccccccccccccCCcccccccCCcceecchhhccccccccce
Confidence            6 99999999999999   67789999999986443 234568999999974       247999999999998 89999


Q ss_pred             eecCCCCCCceEEECCCCCCCCc-eEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHH
Q 047816          235 CYGGMDVGGGAMVLGGISPPKDM-VFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAF  313 (620)
Q Consensus       235 ~l~~~~~~~G~l~fGgiD~~~~~-~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~  313 (620)
                      +|++.+...|.|+|||+|++++. ...+.+.....+|.+.+++|.++++....     .....++||||+++++||.+++
T Consensus       144 ~l~~~~~~~g~l~~Gg~d~~~~~g~~~~~~~~~~~~w~v~~~~i~i~~~~~~~-----~~~~~~~~Dtgt~~i~lp~~~~  218 (317)
T PF00026_consen  144 YLNPSDSQNGSLTFGGYDPSKYDGDLVWVPLVSSGYWSVPLDSISIGGESVFS-----SSGQQAILDTGTSYIYLPRSIF  218 (317)
T ss_dssp             EEESTTSSEEEEEESSEEGGGEESEEEEEEBSSTTTTEEEEEEEEETTEEEEE-----EEEEEEEEETTBSSEEEEHHHH
T ss_pred             eeeecccccchheeeccccccccCceeccCccccccccccccccccccccccc-----ccceeeecccccccccccchhh
Confidence            99987666799999999999832 23333333788999999999999882221     2335699999999999999999


Q ss_pred             HHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEe
Q 047816          314 LAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQ  393 (620)
Q Consensus       314 ~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~  393 (620)
                      ++|++++.+....        ..|..+|.        .. +.+|.++|.| ++.+|.||+++|+.+........|+..+.
T Consensus       219 ~~i~~~l~~~~~~--------~~~~~~c~--------~~-~~~p~l~f~~-~~~~~~i~~~~~~~~~~~~~~~~C~~~i~  280 (317)
T PF00026_consen  219 DAIIKALGGSYSD--------GVYSVPCN--------ST-DSLPDLTFTF-GGVTFTIPPSDYIFKIEDGNGGYCYLGIQ  280 (317)
T ss_dssp             HHHHHHHTTEEEC--------SEEEEETT--------GG-GGSEEEEEEE-TTEEEEEEHHHHEEEESSTTSSEEEESEE
T ss_pred             HHHHhhhcccccc--------eeEEEecc--------cc-cccceEEEee-CCEEEEecchHhcccccccccceeEeeee
Confidence            9999999765431        45677883        22 5689999999 79999999999999887655558865554


Q ss_pred             c----CCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816          394 N----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT  426 (620)
Q Consensus       394 ~----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~  426 (620)
                      .    .....+|||.+|||++|++||+|++|||||+|
T Consensus       281 ~~~~~~~~~~~iLG~~fl~~~y~vfD~~~~~ig~A~a  317 (317)
T PF00026_consen  281 PMDSSDDSDDWILGSPFLRNYYVVFDYENNRIGFAQA  317 (317)
T ss_dssp             EESSTTSSSEEEEEHHHHTTEEEEEETTTTEEEEEEE
T ss_pred             cccccccCCceEecHHHhhceEEEEeCCCCEEEEecC
Confidence            4    44578999999999999999999999999986


No 18 
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity, and Aspergillopepsins from the mold A. saitoi are used in the fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=100.00  E-value=2.8e-46  Score=382.81  Aligned_cols=265  Identities=23%  Similarity=0.357  Sum_probs=219.7

Q ss_pred             EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCCc
Q 047816           86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMSS  165 (620)
Q Consensus        86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~~  165 (620)
                      |+++|+||||+|++.|++||||+++||+|..|..|..+.+..|++++|+|++...            .+.|.+.|++|+.
T Consensus         1 Y~~~i~vGtP~Q~~~v~~DTGS~~~wv~~~~c~~~~~~~~~~y~~~~Sst~~~~~------------~~~~~i~Y~~G~~   68 (278)
T cd06097           1 YLTPVKIGTPPQTLNLDLDTGSSDLWVFSSETPAAQQGGHKLYDPSKSSTAKLLP------------GATWSISYGDGSS   68 (278)
T ss_pred             CeeeEEECCCCcEEEEEEeCCCCceeEeeCCCCchhhccCCcCCCccCccceecC------------CcEEEEEeCCCCe
Confidence            6799999999999999999999999999999998887666789999999998753            3689999999878


Q ss_pred             eeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC---------chHHHHHHcCCcccceEEe
Q 047816          166 SSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL---------SVVDQLVEKGVISDSFSLC  235 (620)
Q Consensus       166 ~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~---------s~~~~L~~~g~I~~~FSl~  235 (620)
                      +.|.+++|+|+||+   .+++++.||+++...+. +.....+||||||++..         +++++|.+++. ++.||+|
T Consensus        69 ~~G~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~~~~~~~~~~~~~~~l~~~~~-~~~Fs~~  144 (278)
T cd06097          69 ASGIVYTDTVSIGG---VEVPNQAIELATAVSASFFSDTASDGLLGLAFSSINTVQPPKQKTFFENALSSLD-APLFTAD  144 (278)
T ss_pred             EEEEEEEEEEEECC---EEECCeEEEEEeecCccccccccccceeeeccccccccccCCCCCHHHHHHHhcc-CceEEEE
Confidence            99999999999998   67789999999987653 33457899999998643         47889999865 7899999


Q ss_pred             ecCCCCCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHH
Q 047816          236 YGGMDVGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAA  312 (620)
Q Consensus       236 l~~~~~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~  312 (620)
                      |.+  ...|+|+|||+|+++   .+.|++... ...+|.|++++|.|+++....     .....++|||||+++++|+++
T Consensus       145 l~~--~~~G~l~fGg~D~~~~~g~l~~~pi~~-~~~~w~v~l~~i~v~~~~~~~-----~~~~~~iiDSGTs~~~lP~~~  216 (278)
T cd06097         145 LRK--AAPGFYTFGYIDESKYKGEISWTPVDN-SSGFWQFTSTSYTVGGDAPWS-----RSGFSAIADTGTTLILLPDAI  216 (278)
T ss_pred             ecC--CCCcEEEEeccChHHcCCceEEEEccC-CCcEEEEEEeeEEECCcceee-----cCCceEEeecCCchhcCCHHH
Confidence            986  357999999999987   345555432 268999999999999874321     234679999999999999999


Q ss_pred             HHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEE
Q 047816          313 FLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIF  392 (620)
Q Consensus       313 ~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~  392 (620)
                      ++++.+++.+..     .....+.|..+|.           ..+|+|+|+|                             
T Consensus       217 ~~~l~~~l~g~~-----~~~~~~~~~~~C~-----------~~~P~i~f~~-----------------------------  251 (278)
T cd06097         217 VEAYYSQVPGAY-----YDSEYGGWVFPCD-----------TTLPDLSFAV-----------------------------  251 (278)
T ss_pred             HHHHHHhCcCCc-----ccCCCCEEEEECC-----------CCCCCEEEEE-----------------------------
Confidence            999998883211     1123345778883           2389999988                             


Q ss_pred             ecCCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          393 QNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       393 ~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                            .||||++|||++|+|||++|+|||||+
T Consensus       252 ------~~ilGd~fl~~~y~vfD~~~~~ig~A~  278 (278)
T cd06097         252 ------FSILGDVFLKAQYVVFDVGGPKLGFAP  278 (278)
T ss_pred             ------EEEEcchhhCceeEEEcCCCceeeecC
Confidence                  699999999999999999999999995


No 19 
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  The enzymes specifically cleave bonds in peptides which 
Probab=100.00  E-value=1.4e-45  Score=374.98  Aligned_cols=255  Identities=42%  Similarity=0.755  Sum_probs=217.5

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS  164 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~  164 (620)
                      .|+++|+||||+|++.|++||||+++||+|                          |             .|.+.|+|++
T Consensus         1 ~Y~~~i~iGtP~q~~~v~~DTGSs~~wv~~--------------------------~-------------~~~~~Y~dg~   41 (265)
T cd05476           1 EYLVTLSIGTPPQPFSLIVDTGSDLTWTQC--------------------------C-------------SYEYSYGDGS   41 (265)
T ss_pred             CeEEEEecCCCCcceEEEecCCCCCEEEcC--------------------------C-------------ceEeEeCCCc
Confidence            388999999999999999999999999985                          1             4789999989


Q ss_pred             ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCcccceEEeecCC--CCC
Q 047816          165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSLCYGGM--DVG  242 (620)
Q Consensus       165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl~l~~~--~~~  242 (620)
                      .++|.+++|+|+|++.. .++.++.|||+..+.+ +.....+||||||+...|++.||..++   ++||+||.+.  ...
T Consensus        42 ~~~G~~~~D~v~~g~~~-~~~~~~~Fg~~~~~~~-~~~~~~~GIlGLg~~~~s~~~ql~~~~---~~Fs~~l~~~~~~~~  116 (265)
T cd05476          42 STSGVLATETFTFGDSS-VSVPNVAFGCGTDNEG-GSFGGADGILGLGRGPLSLVSQLGSTG---NKFSYCLVPHDDTGG  116 (265)
T ss_pred             eeeeeEEEEEEEecCCC-CccCCEEEEecccccC-CccCCCCEEEECCCCcccHHHHhhccc---CeeEEEccCCCCCCC
Confidence            99999999999999832 1678999999998876 555678999999999999999999888   7999999864  356


Q ss_pred             CceEEECCCCCCC--CceEeecCCC--CCCeeEEEEeEEEEccEEecCCCCc----cCCCCceEeeccceeeeecHHHHH
Q 047816          243 GGAMVLGGISPPK--DMVFTHSDPV--RSPYYNIDLKVIHVAGKPLPLNPKV----FDGKHGTVLDSGTTYAYLPEAAFL  314 (620)
Q Consensus       243 ~G~l~fGgiD~~~--~~~~~~~~~~--~~~~w~v~l~~i~v~g~~~~~~~~~----~~~~~~ailDSGtt~~~LP~~~~~  314 (620)
                      +|.|+|||+|+++  .+.|++....  ...+|.|++++|+|+++.+.++...    ......++|||||++++||++++ 
T Consensus       117 ~G~l~fGg~d~~~~~~l~~~p~~~~~~~~~~~~v~l~~i~v~~~~~~~~~~~~~~~~~~~~~ai~DTGTs~~~lp~~~~-  195 (265)
T cd05476         117 SSPLILGDAADLGGSGVVYTPLVKNPANPTYYYVNLEGISVGGKRLPIPPSVFAIDSDGSGGTIIDSGTTLTYLPDPAY-  195 (265)
T ss_pred             CCeEEECCcccccCCCceEeecccCCCCCCceEeeeEEEEECCEEecCCchhcccccCCCCcEEEeCCCcceEcCcccc-
Confidence            7999999999963  5566665442  3679999999999999987643221    13456799999999999999876 


Q ss_pred             HHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEec
Q 047816          315 AFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQN  394 (620)
Q Consensus       315 ~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~~  394 (620)
                                                                |+|+|+|+++.++.+++++|+....  ++..|+++...
T Consensus       196 ------------------------------------------P~i~~~f~~~~~~~i~~~~y~~~~~--~~~~C~~~~~~  231 (265)
T cd05476         196 ------------------------------------------PDLTLHFDGGADLELPPENYFVDVG--EGVVCLAILSS  231 (265)
T ss_pred             ------------------------------------------CCEEEEECCCCEEEeCcccEEEECC--CCCEEEEEecC
Confidence                                                      6899999658999999999998543  36789988876


Q ss_pred             CCCCceeehHhhhceEEEEEeCCCCEEEEEecCC
Q 047816          395 GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNC  428 (620)
Q Consensus       395 ~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c  428 (620)
                      ...+.||||++|||++|++||++++|||||+++|
T Consensus       232 ~~~~~~ilG~~fl~~~~~vFD~~~~~iGfa~~~C  265 (265)
T cd05476         232 SSGGVSILGNIQQQNFLVEYDLENSRLGFAPADC  265 (265)
T ss_pred             CCCCcEEEChhhcccEEEEEECCCCEEeeecCCC
Confidence            5567899999999999999999999999999999


No 20 
>cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more d
Probab=100.00  E-value=4.2e-45  Score=372.75  Aligned_cols=261  Identities=34%  Similarity=0.695  Sum_probs=213.4

Q ss_pred             eeEEEEEEecCCCcEEEEEEeCCCCceeEeCC-CCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816           84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCA-TCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE  162 (620)
Q Consensus        84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~-~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d  162 (620)
                      |+|+++|.||||+|++.|++||||+++||+|. .|..|                   .             |.|+++|+|
T Consensus         1 ~~Y~~~i~iGtP~q~~~v~~DTGS~~~Wv~c~~~c~~c-------------------~-------------c~~~i~Ygd   48 (273)
T cd05475           1 GYYYVTINIGNPPKPYFLDIDTGSDLTWLQCDAPCTGC-------------------Q-------------CDYEIEYAD   48 (273)
T ss_pred             CceEEEEEcCCCCeeEEEEEccCCCceEEeCCCCCCCC-------------------c-------------CccEeEeCC
Confidence            47999999999999999999999999999984 57666                   1             469999998


Q ss_pred             CCceeEEEEEEEEEeCCCC-CCCccceEEEEEEeccCCC--cCCCcceEEecCCCCCchHHHHHHcCCcccceEEeecCC
Q 047816          163 MSSSSGVLGEDIISFGNES-DLKPQRAVFGCENVETGDL--YSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSLCYGGM  239 (620)
Q Consensus       163 g~~~~G~~~~D~v~lg~~~-~~~~~~~~fg~~~~~~~~~--~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl~l~~~  239 (620)
                      ++.+.|.+++|+|+++... ...+.++.|||+....+.+  .....+||||||++..++++||.++++|+++||+||.+ 
T Consensus        49 ~~~~~G~~~~D~v~~~~~~~~~~~~~~~Fgc~~~~~~~~~~~~~~~dGIlGLg~~~~s~~~ql~~~~~i~~~Fs~~l~~-  127 (273)
T cd05475          49 GGSSMGVLVTDIFSLKLTNGSRAKPRIAFGCGYDQQGPLLNPPPPTDGILGLGRGKISLPSQLASQGIIKNVIGHCLSS-  127 (273)
T ss_pred             CCceEEEEEEEEEEEeecCCCcccCCEEEEeeeccCCcccCCCccCCEEEECCCCCCCHHHHHHhcCCcCceEEEEccC-
Confidence            8999999999999997531 1456789999998765432  23467999999999999999999999998899999986 


Q ss_pred             CCCCceEEECCCCCC-CCceEeecCCC-CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHHH
Q 047816          240 DVGGGAMVLGGISPP-KDMVFTHSDPV-RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAFK  317 (620)
Q Consensus       240 ~~~~G~l~fGgiD~~-~~~~~~~~~~~-~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i~  317 (620)
                       ..+|.|+||+.... ..+.|+++... ...+|.|++.+|+|+++...      .....++|||||++++||++++    
T Consensus       128 -~~~g~l~~G~~~~~~g~i~ytpl~~~~~~~~y~v~l~~i~vg~~~~~------~~~~~~ivDTGTt~t~lp~~~y----  196 (273)
T cd05475         128 -NGGGFLFFGDDLVPSSGVTWTPMRRESQKKHYSPGPASLLFNGQPTG------GKGLEVVFDSGSSYTYFNAQAY----  196 (273)
T ss_pred             -CCCeEEEECCCCCCCCCeeecccccCCCCCeEEEeEeEEEECCEECc------CCCceEEEECCCceEEcCCccc----
Confidence             34689999854322 14556554321 24799999999999998543      2345799999999999999865    


Q ss_pred             HHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCC---cEEEeCCCCcEEEecccCCeEEEEEEec
Q 047816          318 DAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNG---QKLLLAPENYLFRHSKVRGAYCLGIFQN  394 (620)
Q Consensus       318 ~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g---~~~~l~~~~yi~~~~~~~~~~Cl~~~~~  394 (620)
                                                            +|+|+|+|+++   ++++|++++|++...  .+..|++++..
T Consensus       197 --------------------------------------~p~i~~~f~~~~~~~~~~l~~~~y~~~~~--~~~~Cl~~~~~  236 (273)
T cd05475         197 --------------------------------------FKPLTLKFGKGWRTRLLEIPPENYLIISE--KGNVCLGILNG  236 (273)
T ss_pred             --------------------------------------cccEEEEECCCCceeEEEeCCCceEEEcC--CCCEEEEEecC
Confidence                                                  46899999543   799999999998754  35689998865


Q ss_pred             CC---CCceeehHhhhceEEEEEeCCCCEEEEEecCC
Q 047816          395 GR---DPTTLLGGIIVRNTLVMYDREHSKIGFWKTNC  428 (620)
Q Consensus       395 ~~---~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c  428 (620)
                      .+   .+.||||+.|||++|+|||++++|||||+++|
T Consensus       237 ~~~~~~~~~ilG~~~l~~~~~vfD~~~~riGfa~~~C  273 (273)
T cd05475         237 SEIGLGNTNIIGDISMQGLMVIYDNEKQQIGWVRSDC  273 (273)
T ss_pred             CCcCCCceEEECceEEEeeEEEEECcCCEeCcccCCC
Confidence            42   35799999999999999999999999999998


No 21 
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=100.00  E-value=3.2e-45  Score=378.47  Aligned_cols=272  Identities=25%  Similarity=0.450  Sum_probs=225.7

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS  164 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~  164 (620)
                      +|+++|.||||+|++.|++||||+++||+                                         .|++.|++|+
T Consensus         2 ~Y~~~i~iGtp~q~~~v~~DTgS~~~wv~-----------------------------------------~~~~~Y~~g~   40 (295)
T cd05474           2 YYSAELSVGTPPQKVTVLLDTGSSDLWVP-----------------------------------------DFSISYGDGT   40 (295)
T ss_pred             eEEEEEEECCCCcEEEEEEeCCCCcceee-----------------------------------------eeEEEeccCC
Confidence            68999999999999999999999999997                                         1778999989


Q ss_pred             ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCCC-----------chHHHHHHcCCcc-cce
Q 047816          165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDL-----------SVVDQLVEKGVIS-DSF  232 (620)
Q Consensus       165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~-----------s~~~~L~~~g~I~-~~F  232 (620)
                      .+.|.+++|+|++++   ..++++.|||++...      ..+||||||++..           +|+++|++||+|+ ++|
T Consensus        41 ~~~G~~~~D~v~~g~---~~~~~~~fg~~~~~~------~~~GilGLg~~~~~~~~~~~~~~~s~~~~L~~~g~i~~~~F  111 (295)
T cd05474          41 SASGTWGTDTVSIGG---ATVKNLQFAVANSTS------SDVGVLGIGLPGNEATYGTGYTYPNFPIALKKQGLIKKNAY  111 (295)
T ss_pred             cEEEEEEEEEEEECC---eEecceEEEEEecCC------CCcceeeECCCCCcccccCCCcCCCHHHHHHHCCcccceEE
Confidence            999999999999998   567899999998742      4799999999775           6999999999998 899


Q ss_pred             EEeecCCCCCCceEEECCCCCCCC---ceEeecCCCC----CCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816          233 SLCYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVR----SPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY  305 (620)
Q Consensus       233 Sl~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~----~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~  305 (620)
                      |+||++.+...|.|+|||+|+.++   +.+++.....    ..+|.|.+++|.++++.+..+  .......++|||||++
T Consensus       112 sl~l~~~~~~~g~l~~Gg~d~~~~~g~~~~~p~~~~~~~~~~~~~~v~l~~i~v~~~~~~~~--~~~~~~~~iiDSGt~~  189 (295)
T cd05474         112 SLYLNDLDASTGSILFGGVDTAKYSGDLVTLPIVNDNGGSEPSELSVTLSSISVNGSSGNTT--LLSKNLPALLDSGTTL  189 (295)
T ss_pred             EEEeCCCCCCceeEEEeeeccceeeceeEEEeCcCcCCCCCceEEEEEEEEEEEEcCCCccc--ccCCCccEEECCCCcc
Confidence            999998655689999999998873   4555544432    278999999999999876532  1245568999999999


Q ss_pred             eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecc--c
Q 047816          306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSK--V  383 (620)
Q Consensus       306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~--~  383 (620)
                      ++||++++++|.+++.+....      ....|..+|+..         .. |+|+|+| +|.++.||+++|+++...  .
T Consensus       190 ~~lP~~~~~~l~~~~~~~~~~------~~~~~~~~C~~~---------~~-p~i~f~f-~g~~~~i~~~~~~~~~~~~~~  252 (295)
T cd05474         190 TYLPSDIVDAIAKQLGATYDS------DEGLYVVDCDAK---------DD-GSLTFNF-GGATISVPLSDLVLPASTDDG  252 (295)
T ss_pred             EeCCHHHHHHHHHHhCCEEcC------CCcEEEEeCCCC---------CC-CEEEEEE-CCeEEEEEHHHhEeccccCCC
Confidence            999999999999999765431      234577888532         23 9999999 789999999999987642  2


Q ss_pred             CCeEEE-EEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816          384 RGAYCL-GIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT  426 (620)
Q Consensus       384 ~~~~Cl-~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~  426 (620)
                      .+..|+ ++.... ++.||||++|||++|++||++++|||||++
T Consensus       253 ~~~~C~~~i~~~~-~~~~iLG~~fl~~~y~vfD~~~~~ig~a~a  295 (295)
T cd05474         253 GDGACYLGIQPST-SDYNILGDTFLRSAYVVYDLDNNEISLAQA  295 (295)
T ss_pred             CCCCeEEEEEeCC-CCcEEeChHHhhcEEEEEECCCCEEEeecC
Confidence            345674 554433 478999999999999999999999999986


No 22 
>cd05489 xylanase_inhibitor_I_like TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradation glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability 
Probab=100.00  E-value=1.3e-43  Score=373.91  Aligned_cols=318  Identities=24%  Similarity=0.437  Sum_probs=242.6

Q ss_pred             ecCCCcE-EEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-cc------cC----------CCCCc
Q 047816           92 IGTPPQT-FALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-CN------CD----------RERAQ  153 (620)
Q Consensus        92 iGTP~Q~-~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c~------c~----------~~~~~  153 (620)
                      +|||-.+ +.|++||||+++||+|.+              .+|+||..+.|.+. |.      |.          -..+.
T Consensus         2 ~~~~~~~~~~~~~DTGS~l~WvqC~~--------------~~sst~~~~~C~s~~C~~~~~~~~~~~~~~~~~~~c~~~~   67 (362)
T cd05489           2 TITPLKGAVPLVLDLAGPLLWSTCDA--------------GHSSTYQTVPCSSSVCSLANRYHCPGTCGGAPGPGCGNNT   67 (362)
T ss_pred             cccCccCCeeEEEECCCCceeeeCCC--------------CCcCCCCccCcCChhhccccccCCCccccCCCCCCCCCCc
Confidence            5788777 999999999999999864              34667777777753 52      11          01234


Q ss_pred             ceeEEe-eccCCceeEEEEEEEEEeCCCCC-----CCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCC
Q 047816          154 CVYERK-YAEMSSSSGVLGEDIISFGNESD-----LKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGV  227 (620)
Q Consensus       154 ~~~~~~-Y~dg~~~~G~~~~D~v~lg~~~~-----~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~  227 (620)
                      |.|... |++|+...|.+++|+|+|+....     .++.++.|||+............|||||||++.+|++.||..++.
T Consensus        68 C~y~~~~y~~gs~t~G~l~~Dtl~~~~~~g~~~~~~~~~~~~FGC~~~~~~~~~~~~~dGIlGLg~~~lSl~sql~~~~~  147 (362)
T cd05489          68 CTAHPYNPVTGECATGDLTQDVLSANTTDGSNPLLVVIFNFVFSCAPSLLLKGLPPGAQGVAGLGRSPLSLPAQLASAFG  147 (362)
T ss_pred             CeeEccccccCcEeeEEEEEEEEEecccCCCCcccceeCCEEEEcCCcccccCCccccccccccCCCccchHHHhhhhcC
Confidence            777654 77877999999999999975321     257899999998754322233489999999999999999998777


Q ss_pred             cccceEEeecCCCCCCceEEECCCCCCC---------CceEeecCCC--CCCeeEEEEeEEEEccEEecCCCCcc----C
Q 047816          228 ISDSFSLCYGGMDVGGGAMVLGGISPPK---------DMVFTHSDPV--RSPYYNIDLKVIHVAGKPLPLNPKVF----D  292 (620)
Q Consensus       228 I~~~FSl~l~~~~~~~G~l~fGgiD~~~---------~~~~~~~~~~--~~~~w~v~l~~i~v~g~~~~~~~~~~----~  292 (620)
                      ++++||+||.+....+|.|+||+.+..+         .+.|++....  ...+|.|+|++|+|+++.+.+++..+    .
T Consensus       148 ~~~~FS~CL~~~~~~~g~l~fG~~~~~~~~~~~~~~~~~~~tPl~~~~~~~~~Y~v~l~~IsVg~~~l~~~~~~~~~~~~  227 (362)
T cd05489         148 VARKFALCLPSSPGGPGVAIFGGGPYYLFPPPIDLSKSLSYTPLLTNPRKSGEYYIGVTSIAVNGHAVPLNPTLSANDRL  227 (362)
T ss_pred             CCcceEEEeCCCCCCCeeEEECCCchhcccccccccCCccccccccCCCCCCceEEEEEEEEECCEECCCCchhcccccc
Confidence            6689999998754567999999998643         3455554332  34799999999999999987754432    2


Q ss_pred             CCCceEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECC-CcEEEe
Q 047816          293 GKHGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGN-GQKLLL  371 (620)
Q Consensus       293 ~~~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~-g~~~~l  371 (620)
                      +...++|||||++++||+++|++|.+++.+++........ .....+.||......+......+|+|+|+|++ |.++.|
T Consensus       228 ~~~g~iiDSGTs~t~lp~~~y~~l~~a~~~~~~~~~~~~~-~~~~~~~C~~~~~~~~~~~~~~~P~it~~f~g~g~~~~l  306 (362)
T cd05489         228 GPGGVKLSTVVPYTVLRSDIYRAFTQAFAKATARIPRVPA-AAVFPELCYPASALGNTRLGYAVPAIDLVLDGGGVNWTI  306 (362)
T ss_pred             CCCcEEEecCCceEEECHHHHHHHHHHHHHHhcccCcCCC-CCCCcCccccCCCcCCcccccccceEEEEEeCCCeEEEE
Confidence            3457999999999999999999999999988764322211 11223689876543333334689999999965 799999


Q ss_pred             CCCCcEEEecccCCeEEEEEEecCC--CCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816          372 APENYLFRHSKVRGAYCLGIFQNGR--DPTTLLGGIIVRNTLVMYDREHSKIGFWKT  426 (620)
Q Consensus       372 ~~~~yi~~~~~~~~~~Cl~~~~~~~--~~~~ILG~~fLr~~yvvfD~en~rIGfA~~  426 (620)
                      ++++|+++..+  +..|+++.....  .+.||||+.|||++|++||++++|||||+.
T Consensus       307 ~~~ny~~~~~~--~~~Cl~f~~~~~~~~~~~IlG~~~~~~~~vvyD~~~~riGfa~~  361 (362)
T cd05489         307 FGANSMVQVKG--GVACLAFVDGGSEPRPAVVIGGHQMEDNLLVFDLEKSRLGFSSS  361 (362)
T ss_pred             cCCceEEEcCC--CcEEEEEeeCCCCCCceEEEeeheecceEEEEECCCCEeecccC
Confidence            99999998653  568998876542  357999999999999999999999999964


No 23 
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=100.00  E-value=5e-42  Score=352.04  Aligned_cols=269  Identities=36%  Similarity=0.682  Sum_probs=223.7

Q ss_pred             EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCC--CCCCCCcccccccCcCCcccCCCCCcceeEEeeccC
Q 047816           86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPK--FEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEM  163 (620)
Q Consensus        86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~--y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg  163 (620)
                      |+++|.||||+|++.|++||||+++||+|..|..|..+....  |++..|+++....             |.+++.|++ 
T Consensus         1 Y~~~i~iGtp~q~~~l~~DTGS~~~wv~~~~c~~~~~~~~~~~~~~~~~s~~~~~~~-------------~~~~~~Y~~-   66 (283)
T cd05471           1 YYGEITIGTPPQKFSVIFDTGSSLLWVPSSNCTSCSCQKHPRFKYDSSKSSTYKDTG-------------CTFSITYGD-   66 (283)
T ss_pred             CEEEEEECCCCcEEEEEEeCCCCCEEEecCCCCccccccCCCCccCccCCceeecCC-------------CEEEEEECC-
Confidence            678999999999999999999999999999999887665444  7888888887654             689999998 


Q ss_pred             CceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCC------CchHHHHHHcCCcc-cceEEee
Q 047816          164 SSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGD------LSVVDQLVEKGVIS-DSFSLCY  236 (620)
Q Consensus       164 ~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~------~s~~~~L~~~g~I~-~~FSl~l  236 (620)
                      +.+.|.+++|+|++++   ..+.++.|||++.....+.....+||||||+..      .+++++|.++++|. +.||+|+
T Consensus        67 g~~~g~~~~D~v~~~~---~~~~~~~fg~~~~~~~~~~~~~~~GilGLg~~~~~~~~~~s~~~~l~~~~~i~~~~Fs~~l  143 (283)
T cd05471          67 GSVTGGLGTDTVTIGG---LTIPNQTFGCATSESGDFSSSGFDGILGLGFPSLSVDGVPSFFDQLKSQGLISSPVFSFYL  143 (283)
T ss_pred             CeEEEEEEEeEEEECC---EEEeceEEEEEeccCCcccccccceEeecCCcccccccCCCHHHHHHHCCCCCCCEEEEEE
Confidence            6889999999999998   457899999999887644556789999999988      78999999999998 8999999


Q ss_pred             cCC--CCCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHH
Q 047816          237 GGM--DVGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEA  311 (620)
Q Consensus       237 ~~~--~~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~  311 (620)
                      .+.  ....|.|+|||+|+++   .+.+++.......+|.|.+++|.++++...    .......++|||||++++||++
T Consensus       144 ~~~~~~~~~g~l~~Gg~d~~~~~~~~~~~p~~~~~~~~~~v~l~~i~v~~~~~~----~~~~~~~~iiDsGt~~~~lp~~  219 (283)
T cd05471         144 GRDGDGGNGGELTFGGIDPSKYTGDLTYTPVVSNGPGYWQVPLDGISVGGKSVI----SSSGGGGAIVDSGTSLIYLPSS  219 (283)
T ss_pred             cCCCCCCCCCEEEEcccCccccCCceEEEecCCCCCCEEEEEeCeEEECCceee----ecCCCcEEEEecCCCCEeCCHH
Confidence            975  3467999999999985   455665554347899999999999987411    1134568999999999999999


Q ss_pred             HHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEE
Q 047816          312 AFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGI  391 (620)
Q Consensus       312 ~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~  391 (620)
                      +++++++++.+....      ....+...|        .. .+.+|+|+|+|                            
T Consensus       220 ~~~~l~~~~~~~~~~------~~~~~~~~~--------~~-~~~~p~i~f~f----------------------------  256 (283)
T cd05471         220 VYDAILKALGAAVSS------SDGGYGVDC--------SP-CDTLPDITFTF----------------------------  256 (283)
T ss_pred             HHHHHHHHhCCcccc------cCCcEEEeC--------cc-cCcCCCEEEEE----------------------------
Confidence            999999999776532      112223333        11 26899999999                            


Q ss_pred             EecCCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          392 FQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       392 ~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                             .+|||.+|||++|++||+++++||||+
T Consensus       257 -------~~ilG~~fl~~~y~vfD~~~~~igfa~  283 (283)
T cd05471         257 -------LWILGDVFLRNYYTVFDLDNNRIGFAP  283 (283)
T ss_pred             -------EEEccHhhhhheEEEEeCCCCEEeecC
Confidence                   699999999999999999999999985


No 24 
>PF14543 TAXi_N:  Xylanase inhibitor N-terminal; PDB: 3HD8_A 3VLB_A 3VLA_A 3AUP_D 1T6G_A 1T6E_X 2B42_A.
Probab=99.93  E-value=3.4e-25  Score=207.79  Aligned_cols=152  Identities=43%  Similarity=0.814  Sum_probs=123.4

Q ss_pred             EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-c--------ccCCCCCccee
Q 047816           86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-C--------NCDRERAQCVY  156 (620)
Q Consensus        86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c--------~c~~~~~~~~~  156 (620)
                      |+++|.||||+|++.|++||||+.+|++|         .++.|+|.+|+||+.+.|.+. |        .|......|.|
T Consensus         1 Y~~~~~iGtP~~~~~lvvDtgs~l~W~~C---------~~~~f~~~~Sst~~~v~C~s~~C~~~~~~~~~~~~~~~~C~y   71 (164)
T PF14543_consen    1 YYVSVSIGTPPQPFSLVVDTGSDLTWVQC---------PDPPFDPSKSSTYRPVPCSSPQCSSAPSFCPCCCCSNNSCPY   71 (164)
T ss_dssp             EEEEEECTCTTEEEEEEEETT-SSEEEET-------------STT-TTSSBEC-BTTSHHHHHCTSSBTCCTCESSEEEE
T ss_pred             CEEEEEeCCCCceEEEEEECCCCceEEcC---------CCcccCCccCCcccccCCCCcchhhcccccccCCCCcCcccc
Confidence            78999999999999999999999999998         237999999999999999763 6        35556788999


Q ss_pred             EEeeccCCceeEEEEEEEEEeCCCCC--CCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCcccceEE
Q 047816          157 ERKYAEMSSSSGVLGEDIISFGNESD--LKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSL  234 (620)
Q Consensus       157 ~~~Y~dg~~~~G~~~~D~v~lg~~~~--~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl  234 (620)
                      .+.|+++..+.|.+++|+|+++....  ....++.|||+....+.+.  ..+||||||++..||+.||.++  ..+.||+
T Consensus        72 ~~~y~~~s~~~G~l~~D~~~~~~~~~~~~~~~~~~FGC~~~~~g~~~--~~~GilGLg~~~~Sl~sQl~~~--~~~~FSy  147 (164)
T PF14543_consen   72 SQSYGDGSSSSGFLASDTLTFGSSSGGSNSVPDFIFGCATSNSGLFY--GADGILGLGRGPLSLPSQLASS--SGNKFSY  147 (164)
T ss_dssp             EEEETTTEEEEEEEEEEEEEEEEESSSSEEEEEEEEEEE-GGGTSST--TEEEEEE-SSSTTSHHHHHHHH----SEEEE
T ss_pred             eeecCCCccccCceEEEEEEecCCCCCCceeeeEEEEeeeccccCCc--CCCcccccCCCcccHHHHHHHh--cCCeEEE
Confidence            99999999999999999999987431  3467899999999886443  7899999999999999999998  5589999


Q ss_pred             eecC-CCCCCceEEECC
Q 047816          235 CYGG-MDVGGGAMVLGG  250 (620)
Q Consensus       235 ~l~~-~~~~~G~l~fGg  250 (620)
                      ||.+ .....|.|+||+
T Consensus       148 CL~~~~~~~~g~l~fG~  164 (164)
T PF14543_consen  148 CLPSSSPSSSGFLSFGD  164 (164)
T ss_dssp             EB-S-SSSSEEEEEECS
T ss_pred             ECCCCCCCCCEEEEeCc
Confidence            9998 456779999996


No 25 
>PF14541 TAXi_C:  Xylanase inhibitor C-terminal; PDB: 3AUP_D 3HD8_A 1T6G_A 1T6E_X 2B42_A 3VLB_A 3VLA_A.
Probab=99.87  E-value=1.3e-21  Score=183.45  Aligned_cols=155  Identities=35%  Similarity=0.663  Sum_probs=120.2

Q ss_pred             eeEEEEeEEEEccEEecCCCCcc---CCCCceEeeccceeeeecHHHHHHHHHHHHHHhhhccccc-CCCCCCccccccC
Q 047816          269 YYNIDLKVIHVAGKPLPLNPKVF---DGKHGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIR-GPDPNYNDICFSG  344 (620)
Q Consensus       269 ~w~v~l~~i~v~g~~~~~~~~~~---~~~~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~-~~~~~~~~~C~~~  344 (620)
                      +|.|+|++|+|+++.+.++...|   ++...++|||||++++||+++|+++.+++.+++.....-. .........||..
T Consensus         1 ~Y~v~l~~Isvg~~~l~~~~~~~~~~~~~g~~iiDSGT~~T~L~~~~y~~l~~al~~~~~~~~~~~~~~~~~~~~~Cy~~   80 (161)
T PF14541_consen    1 FYYVNLTGISVGGKRLPIPPSVFQLSDGSGGTIIDSGTTYTYLPPPVYDALVQALDAQMGAPGVSREAPPFSGFDLCYNL   80 (161)
T ss_dssp             SEEEEEEEEEETTEEE---TTCSCETTSTCSEEE-SSSSSEEEEHHHHHHHHHHHHHHHHTCT--CEE---TT-S-EEEG
T ss_pred             CccEEEEEEEECCEEecCChHHhhccCCCCCEEEECCCCccCCcHHHHHHHHHHHHHHhhhcccccccccCCCCCceeec
Confidence            48999999999999999988876   4567899999999999999999999999999987653110 1233556789987


Q ss_pred             CCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEec--CCCCceeehHhhhceEEEEEeCCCCEEE
Q 047816          345 APSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQN--GRDPTTLLGGIIVRNTLVMYDREHSKIG  422 (620)
Q Consensus       345 ~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~~--~~~~~~ILG~~fLr~~yvvfD~en~rIG  422 (620)
                      ...........+|+|+|+|.+|.++++++++|++....  +..|+++...  ..++..|||..+|++++++||++++|||
T Consensus        81 ~~~~~~~~~~~~P~i~l~F~~ga~l~l~~~~y~~~~~~--~~~Cla~~~~~~~~~~~~viG~~~~~~~~v~fDl~~~~ig  158 (161)
T PF14541_consen   81 SSFGVNRDWAKFPTITLHFEGGADLTLPPENYFVQVSP--GVFCLAFVPSDADDDGVSVIGNFQQQNYHVVFDLENGRIG  158 (161)
T ss_dssp             GCS-EETTEESS--EEEEETTSEEEEE-HHHHEEEECT--TEEEESEEEETSTTSSSEEE-HHHCCTEEEEEETTTTEEE
T ss_pred             cccccccccccCCeEEEEEeCCcceeeeccceeeeccC--CCEEEEEEccCCCCCCcEEECHHHhcCcEEEEECCCCEEE
Confidence            65323334478999999998899999999999998863  7899999888  5568899999999999999999999999


Q ss_pred             EEe
Q 047816          423 FWK  425 (620)
Q Consensus       423 fA~  425 (620)
                      |+|
T Consensus       159 F~~  161 (161)
T PF14541_consen  159 FAP  161 (161)
T ss_dssp             EEE
T ss_pred             EeC
Confidence            986


No 26 
>cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site
Probab=99.85  E-value=4.7e-21  Score=167.54  Aligned_cols=107  Identities=36%  Similarity=0.658  Sum_probs=92.6

Q ss_pred             EEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCC-CCCCCcccccccCcCCcccCCCCCcceeEEeeccCCce
Q 047816           88 TRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKF-EPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMSSS  166 (620)
Q Consensus        88 ~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y-~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~~~  166 (620)
                      ++|.||||+|++.|+|||||+++||+|..|..|..+....| +|+.|++++...             |.|.+.|++| ++
T Consensus         1 ~~i~vGtP~q~~~~~~DTGSs~~Wv~~~~c~~~~~~~~~~~~~~~~sst~~~~~-------------~~~~~~Y~~g-~~   66 (109)
T cd05470           1 IEIGIGTPPQTFNVLLDTGSSNLWVPSVDCQSLAIYSHSSYDDPSASSTYSDNG-------------CTFSITYGTG-SL   66 (109)
T ss_pred             CEEEeCCCCceEEEEEeCCCCCEEEeCCCCCCcccccccccCCcCCCCCCCCCC-------------cEEEEEeCCC-eE
Confidence            47999999999999999999999999999997775545566 999999998865             6899999985 67


Q ss_pred             eEEEEEEEEEeCCCCCCCccceEEEEEEeccCCC-cCCCcceEEec
Q 047816          167 SGVLGEDIISFGNESDLKPQRAVFGCENVETGDL-YSQHADGIIGL  211 (620)
Q Consensus       167 ~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~-~~~~~dGIlGL  211 (620)
                      .|.++.|+|+|++   ..+.++.|||+....+.+ .....+|||||
T Consensus        67 ~g~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~~GilGL  109 (109)
T cd05470          67 SGGLSTDTVSIGD---IEVVGQAFGCATDEPGATFLPALFDGILGL  109 (109)
T ss_pred             EEEEEEEEEEECC---EEECCEEEEEEEecCCccccccccccccCC
Confidence            8999999999998   667899999999887653 33568999998


No 27 
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.90  E-value=3.1e-05  Score=65.27  Aligned_cols=93  Identities=15%  Similarity=0.163  Sum_probs=64.5

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS  164 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~  164 (620)
                      .|++++.||  ++++++++|||++.+|+.......|...    +.                      ......+...+|.
T Consensus         2 ~~~v~v~i~--~~~~~~llDTGa~~s~i~~~~~~~l~~~----~~----------------------~~~~~~~~~~~G~   53 (96)
T cd05483           2 HFVVPVTIN--GQPVRFLLDTGASTTVISEELAERLGLP----LT----------------------LGGKVTVQTANGR   53 (96)
T ss_pred             cEEEEEEEC--CEEEEEEEECCCCcEEcCHHHHHHcCCC----cc----------------------CCCcEEEEecCCC
Confidence            478999999  8999999999999999976432222210    00                      0123456666666


Q ss_pred             ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCC
Q 047816          165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGR  213 (620)
Q Consensus       165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~  213 (620)
                      ........+.+++|+   .+..++.+.+......     ..+||||+.+
T Consensus        54 ~~~~~~~~~~i~ig~---~~~~~~~~~v~d~~~~-----~~~gIlG~d~   94 (96)
T cd05483          54 VRAARVRLDSLQIGG---ITLRNVPAVVLPGDAL-----GVDGLLGMDF   94 (96)
T ss_pred             ccceEEEcceEEECC---cEEeccEEEEeCCccc-----CCceEeChHH
Confidence            666666789999998   6667777777654332     4799999863


No 28 
>PF01102 Glycophorin_A:  Glycophorin A;  InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=96.58  E-value=0.0023  Score=55.94  Aligned_cols=37  Identities=19%  Similarity=0.336  Sum_probs=25.3

Q ss_pred             cchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816          578 TWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       578 ~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      ..++. +|+||++| +++.++++.+++.|+|||+|+|..
T Consensus        59 h~fs~~~i~~Ii~g-v~aGvIg~Illi~y~irR~~Kk~~   96 (122)
T PF01102_consen   59 HRFSEPAIIGIIFG-VMAGVIGIILLISYCIRRLRKKSS   96 (122)
T ss_dssp             SSSS-TCHHHHHHH-HHHHHHHHHHHHHHHHHHHS----
T ss_pred             cCccccceeehhHH-HHHHHHHHHHHHHHHHHHHhccCC
Confidence            36667 89999999 777777777777888888776654


No 29 
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=96.49  E-value=0.012  Score=52.12  Aligned_cols=94  Identities=12%  Similarity=0.179  Sum_probs=59.8

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE  162 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d  162 (620)
                      +|.|++++.|.  ++++.++||||++.+-+....-....      .++..      .             .....+.-+.
T Consensus         9 ~g~~~v~~~In--G~~~~flVDTGAs~t~is~~~A~~Lg------l~~~~------~-------------~~~~~~~ta~   61 (121)
T TIGR02281         9 DGHFYATGRVN--GRNVRFLVDTGATSVALNEEDAQRLG------LDLNR------L-------------GYTVTVSTAN   61 (121)
T ss_pred             CCeEEEEEEEC--CEEEEEEEECCCCcEEcCHHHHHHcC------CCccc------C-------------CceEEEEeCC
Confidence            67899999998  89999999999999987643211111      11110      0             0122333333


Q ss_pred             CCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecC
Q 047816          163 MSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLG  212 (620)
Q Consensus       163 g~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg  212 (620)
                      |......+.-|.+.+|+   ....++.+.+.....      ..+|+||+.
T Consensus        62 G~~~~~~~~l~~l~iG~---~~~~nv~~~v~~~~~------~~~~LLGm~  102 (121)
T TIGR02281        62 GQIKAARVTLDRVAIGG---IVVNDVDAMVAEGGA------LSESLLGMS  102 (121)
T ss_pred             CcEEEEEEEeCEEEECC---EEEeCcEEEEeCCCc------CCceEcCHH
Confidence            44344456789999999   677788877663221      137999986


No 30 
>PF13650 Asp_protease_2:  Aspartyl protease
Probab=95.88  E-value=0.041  Score=45.30  Aligned_cols=89  Identities=17%  Similarity=0.241  Sum_probs=52.9

Q ss_pred             EEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCCcee
Q 047816           88 TRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMSSSS  167 (620)
Q Consensus        88 ~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~~~~  167 (620)
                      +++.|+  ++++++++|||++.+.+....+......      +...                   .....+.-.+|....
T Consensus         1 V~v~vn--g~~~~~liDTGa~~~~i~~~~~~~l~~~------~~~~-------------------~~~~~~~~~~g~~~~   53 (90)
T PF13650_consen    1 VPVKVN--GKPVRFLIDTGASISVISRSLAKKLGLK------PRPK-------------------SVPISVSGAGGSVTV   53 (90)
T ss_pred             CEEEEC--CEEEEEEEcCCCCcEEECHHHHHHcCCC------CcCC-------------------ceeEEEEeCCCCEEE
Confidence            367787  8999999999999887764332221111      0000                   011223333344344


Q ss_pred             EEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecC
Q 047816          168 GVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLG  212 (620)
Q Consensus       168 G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg  212 (620)
                      .....+.+++|+   .+..+..+-+..      .....+||||+-
T Consensus        54 ~~~~~~~i~ig~---~~~~~~~~~v~~------~~~~~~~iLG~d   89 (90)
T PF13650_consen   54 YRGRVDSITIGG---ITLKNVPFLVVD------LGDPIDGILGMD   89 (90)
T ss_pred             EEEEEEEEEECC---EEEEeEEEEEEC------CCCCCEEEeCCc
Confidence            455667899998   556677766654      123578999974


No 31 
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=95.45  E-value=0.0027  Score=53.48  Aligned_cols=36  Identities=22%  Similarity=0.266  Sum_probs=23.4

Q ss_pred             Cccccchhh-hhHHHHHHHHHHHHHHHHHH-HHHhhhhh
Q 047816          574 QVKRTWWQE-HFLMVVLAITIMMVVGLSVF-GILFILRR  610 (620)
Q Consensus       574 ~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~-~~~~~~r~  610 (620)
                      +..+++++. +|+||++| +++++.+|.++ .|||++||
T Consensus        57 st~~~~ls~gaiagi~vg-~~~~v~~lv~~l~w~f~~r~   94 (96)
T PTZ00382         57 GANRSGLSTGAIAGISVA-VVAVVGGLVGFLCWWFVCRG   94 (96)
T ss_pred             ccCCCCcccccEEEEEee-hhhHHHHHHHHHhheeEEee
Confidence            455678888 99999998 55555455444 44455543


No 32 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence; an extracellular domain (ectodomain) of variable length, whose sequence is not evolutionarily conserved among the various forms of syndecans and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; a transmembrane region; and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: syndecan 1; syndecan 2 or fibroglycan; syndecan 3 or neuroglycan or N-syndecan; syndecan 4 or amphiglycan or ryudocan; Drosophila syndecan; and Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase C alpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of the phospholipid activator PIP2; in solution the overall structure exhibits a twisted clamp shape with a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=95.17  E-value=0.0058  Score=46.32  Aligned_cols=36  Identities=17%  Similarity=0.288  Sum_probs=2.1

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccccCCCC
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQSVNSYK  619 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~~~~~  619 (620)
                      .++|+++| +++.++..+++++++++|.|+|.+.+|.
T Consensus        10 vlaavIaG-~Vvgll~ailLIlf~iyR~rkkdEGSY~   45 (64)
T PF01034_consen   10 VLAAVIAG-GVVGLLFAILLILFLIYRMRKKDEGSYD   45 (64)
T ss_dssp             ----------------------------S------SS
T ss_pred             HHHHHHHH-HHHHHHHHHHHHHHHHHHHHhcCCCCcc
Confidence            45566666 5555555666667888898988888884


No 33 
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=94.22  E-value=0.18  Score=44.76  Aligned_cols=27  Identities=15%  Similarity=0.228  Sum_probs=23.6

Q ss_pred             CCceeehHhhhceEEEEEeCCCCEEEE
Q 047816          397 DPTTLLGGIIVRNTLVMYDREHSKIGF  423 (620)
Q Consensus       397 ~~~~ILG~~fLr~~yvvfD~en~rIGf  423 (620)
                      ....|||..||+.+-.+.|+.+++|-+
T Consensus        98 ~~d~ILG~d~L~~~~~~ID~~~~~i~~  124 (124)
T cd05479          98 DVDFLIGLDMLKRHQCVIDLKENVLRI  124 (124)
T ss_pred             CcCEEecHHHHHhCCeEEECCCCEEEC
Confidence            446799999999999999999998853


No 34 
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=94.15  E-value=0.19  Score=44.68  Aligned_cols=90  Identities=19%  Similarity=0.196  Sum_probs=55.5

Q ss_pred             eeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeE-Eeec-
Q 047816           84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYE-RKYA-  161 (620)
Q Consensus        84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~-~~Y~-  161 (620)
                      ..+++++.|+  ++++.+++|||++.+++....+..|+...      ...                    ..+. ...+ 
T Consensus        15 ~~~~v~~~In--g~~~~~LvDTGAs~s~Is~~~a~~lgl~~------~~~--------------------~~~~~~~~g~   66 (124)
T cd05479          15 PMLYINVEIN--GVPVKAFVDSGAQMTIMSKACAEKCGLMR------LID--------------------KRFQGIAKGV   66 (124)
T ss_pred             eEEEEEEEEC--CEEEEEEEeCCCceEEeCHHHHHHcCCcc------ccC--------------------cceEEEEecC
Confidence            3577899999  89999999999999998765444443320      000                    0111 1222 


Q ss_pred             cCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecC
Q 047816          162 EMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLG  212 (620)
Q Consensus       162 dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg  212 (620)
                      ++....|..-.+.+.+++.   .. ...|.+...       ...|+|||+-
T Consensus        67 g~~~~~g~~~~~~l~i~~~---~~-~~~~~Vl~~-------~~~d~ILG~d  106 (124)
T cd05479          67 GTQKILGRIHLAQVKIGNL---FL-PCSFTVLED-------DDVDFLIGLD  106 (124)
T ss_pred             CCcEEEeEEEEEEEEECCE---Ee-eeEEEEECC-------CCcCEEecHH
Confidence            2234566677788999983   22 245554421       1478999985


No 35 
>PF04478 Mid2:  Mid2 like cell wall stress sensor;  InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=93.22  E-value=0.011  Score=53.29  Aligned_cols=30  Identities=20%  Similarity=0.455  Sum_probs=19.9

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      .++|+++|+++++++++++++.||..|+|+
T Consensus        50 IVIGvVVGVGg~ill~il~lvf~~c~r~kk   79 (154)
T PF04478_consen   50 IVIGVVVGVGGPILLGILALVFIFCIRRKK   79 (154)
T ss_pred             EEEEEEecccHHHHHHHHHhheeEEEeccc
Confidence            699999997777776666655444444443


No 36 
>PF03302 VSP:  Giardia variant-specific surface protein;  InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation, which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=92.59  E-value=0.045  Score=58.78  Aligned_cols=37  Identities=19%  Similarity=0.289  Sum_probs=27.9

Q ss_pred             Cccccchhh-hhHHHHHHHHHHHHHHHH-HHHHHhhhhhh
Q 047816          574 QVKRTWWQE-HFLMVVLAITIMMVVGLS-VFGILFILRRR  611 (620)
Q Consensus       574 ~~~~~~~~~-~~~~i~~~~~~~~~~~l~-~~~~~~~~r~r  611 (620)
                      ...|+++|+ +|+||+++ +|++|-+|+ +|.||||.|+|
T Consensus       358 ~~n~s~LstgaIaGIsva-vvvvVgglvGfLcWwf~crgk  396 (397)
T PF03302_consen  358 STNKSGLSTGAIAGISVA-VVVVVGGLVGFLCWWFICRGK  396 (397)
T ss_pred             Ccccccccccceeeeeeh-hHHHHHHHHHHHhhheeeccc
Confidence            456789999 99999999 666666664 45577777765


No 37 
>PF01299 Lamp:  Lysosome-associated membrane glycoprotein (Lamp);  InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. [Schematic: luminal domain with two disulphide bonds, hinge, luminal domain with two disulphide bonds, TM, C.] In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin), a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region. In a lamp-family protein from nematodes only the part C-terminal to the hinge is conserved.; GO: 0016020 membrane
Probab=91.54  E-value=0.14  Score=53.04  Aligned_cols=34  Identities=21%  Similarity=0.340  Sum_probs=25.0

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccccCCCC
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQSVNSYK  619 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~~~~~  619 (620)
                      .+|-|++| ++.++++|++++.|+|.|||+++  -|+
T Consensus       271 ~~vPIaVG-~~La~lvlivLiaYli~Rrr~~~--gYq  304 (306)
T PF01299_consen  271 DLVPIAVG-AALAGLVLIVLIAYLIGRRRSRA--GYQ  304 (306)
T ss_pred             chHHHHHH-HHHHHHHHHHHHhheeEeccccc--ccc
Confidence            46778888 55566777778889999988765  454


No 38 
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=90.52  E-value=0.31  Score=48.56  Aligned_cols=33  Identities=18%  Similarity=0.255  Sum_probs=17.8

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhhccccCCC
Q 047816          586 MVVLAITIMMVVGLSVFGILFILRRRRQSVNSY  618 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~~~~  618 (620)
                      ||++=+.++++|+|.++=+|+.|||++.-.|.+
T Consensus       262 giaalvllil~vvliiLYiWlyrrRK~swkhe~  294 (295)
T TIGR01478       262 GIAALVLIILTVVLIILYIWLYRRRKKSWKHEC  294 (295)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhcccccccc
Confidence            777544556666665555555555544333443


No 39 
>PF02009 Rifin_STEVOR:  Rifin/stevor family;  InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=89.19  E-value=0.34  Score=49.60  Aligned_cols=31  Identities=23%  Similarity=0.414  Sum_probs=20.7

Q ss_pred             hhhhHHHHHHHHHHHHHHHHHHHHHhhhhhhccc
Q 047816          581 QEHFLMVVLAITIMMVVGLSVFGILFILRRRRQS  614 (620)
Q Consensus       581 ~~~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~  614 (620)
                      +.+|++.+   +++++++|..+++++|||.||++
T Consensus       255 ~t~I~aSi---iaIliIVLIMvIIYLILRYRRKK  285 (299)
T PF02009_consen  255 TTAIIASI---IAILIIVLIMVIIYLILRYRRKK  285 (299)
T ss_pred             HHHHHHHH---HHHHHHHHHHHHHHHHHHHHHHh
Confidence            33444444   45566677778889999988854


No 40 
>PF05454 DAG1:  Dystroglycan (Dystrophin-associated glycoprotein 1);  InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence suggesting that dystroglycan functions as part of a signal transduction pathway, because Grb2, a mediator of the Ras-related signal pathway, has been shown to interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of the dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear [].; PDB: 1EG4_P.
Probab=88.78  E-value=0.13  Score=52.24  Aligned_cols=91  Identities=15%  Similarity=0.266  Sum_probs=0.0

Q ss_pred             EEEEEeecCCCCC---CchhHHHHHHhhccc-ccccceEEeeeeecCCceeeEEEEecCCCcccccHHHHHHHHHHHccC
Q 047816          478 FDMFLSINYSDLR---PHIPELADSIAQELD-VNTSQVHLLNFMSKGNNSFIAWAVFPSGSANYISNATALRIISRLAEH  553 (620)
Q Consensus       478 ~~~~~~~~~~~~~---~~~~~~~~~~~~~l~-~~~~qv~~~~~~~~g~~~~~~~~~~P~~~~~~f~~~~~~~i~~~~~~~  553 (620)
                      |.+.|...+-.|.   -....|.+-||..++ -+.+++.|.++.  .+...+.|-=--- ....=...++.+++.+|...
T Consensus         3 F~~~l~~d~~~f~~dv~~ki~lVekLA~~~GD~nts~ItV~sIt--~gstiVtwtNnTL-p~~~CP~eeI~~L~~~L~~~   79 (290)
T PF05454_consen    3 FSATLDIDYESFNNDVQRKILLVEKLARLFGDRNTSSITVRSIT--SGSTIVTWTNNTL-PTSPCPKEEIEKLRKRLVDD   79 (290)
T ss_dssp             --------------------------------------------------------------------------------
T ss_pred             eEEEEcCCHHHhhhhHHHHHHHHHHHHHHhCCCCCCeEEEEEec--CCCEEEEEEcCCC-CCCCCCHHHHHHHHHHHhcC
Confidence            4455544454442   223358888998888 567899999987  3444455521111 12223456677777777666


Q ss_pred             ccCCC----CCCcce-eeeeeee
Q 047816          554 RVHIP----DTFGNY-KLLQWNI  571 (620)
Q Consensus       554 ~~~~~----~~fG~y-~l~~~~~  571 (620)
                      +-.+.    ..+||. .+.+.++
T Consensus        80 ~g~~~~~f~~am~pef~V~svsv  102 (290)
T PF05454_consen   80 DGKPSQEFVRAMGPEFKVKSVSV  102 (290)
T ss_dssp             -----------------------
T ss_pred             CCCcCHHHHHHhCCCCceeEEEE
Confidence            53322    456643 3444443


No 41 
>PF08693 SKG6:  Transmembrane alpha-helix domain;  InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation [, ]. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins [, ]. The full-length AXL2 protein has a negative regulatory function in cytokinesis [].
Probab=88.60  E-value=0.041  Score=37.76  Aligned_cols=8  Identities=50%  Similarity=0.899  Sum_probs=3.5

Q ss_pred             Hhhhhhhc
Q 047816          605 LFILRRRR  612 (620)
Q Consensus       605 ~~~~r~r~  612 (620)
                      +++||||+
T Consensus        32 l~~~~rR~   39 (40)
T PF08693_consen   32 LFFWYRRK   39 (40)
T ss_pred             hheEEecc
Confidence            34444443


No 42 
>PF02439 Adeno_E3_CR2:  Adenovirus E3 region protein CR2;  InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host []. This region called CR1 (conserved region 1) [] is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=88.36  E-value=0.96  Score=30.63  Aligned_cols=29  Identities=7%  Similarity=0.182  Sum_probs=12.0

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          585 LMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       585 ~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      ++|++|+++.+++..+.+..++..+||.+
T Consensus         6 IaIIv~V~vg~~iiii~~~~YaCcykk~~   34 (38)
T PF02439_consen    6 IAIIVAVVVGMAIIIICMFYYACCYKKHR   34 (38)
T ss_pred             hhHHHHHHHHHHHHHHHHHHHHHHHcccc
Confidence            45555523333333333334444554443


No 43 
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=87.98  E-value=0.55  Score=39.01  Aligned_cols=28  Identities=25%  Similarity=0.379  Sum_probs=24.7

Q ss_pred             EEEEEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816           86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCA  115 (620)
Q Consensus        86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~  115 (620)
                      |++.+.|+  ++++.+++||||+.+++...
T Consensus         1 ~~~~~~In--g~~i~~lvDTGA~~svis~~   28 (91)
T cd05484           1 KTVTLLVN--GKPLKFQLDTGSAITVISEK   28 (91)
T ss_pred             CEEEEEEC--CEEEEEEEcCCcceEEeCHH
Confidence            35789999  99999999999999999754


No 44 
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where 
Probab=87.60  E-value=2.7  Score=34.45  Aligned_cols=27  Identities=19%  Similarity=0.263  Sum_probs=22.0

Q ss_pred             EEEecCCCcEEEEEEeCCCCceeEeCCCC
Q 047816           89 RLWIGTPPQTFALIVDTGSTVTYVPCATC  117 (620)
Q Consensus        89 ~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c  117 (620)
                      .+.|.  ++++++++|||++.+-+....+
T Consensus         2 ~v~In--G~~~~fLvDTGA~~tii~~~~a   28 (86)
T cd06095           2 TITVE--GVPIVFLVDTGATHSVLKSDLG   28 (86)
T ss_pred             EEEEC--CEEEEEEEECCCCeEEECHHHh
Confidence            45666  8999999999999999976543


No 45 
>PTZ00370 STEVOR; Provisional
Probab=87.45  E-value=0.42  Score=47.72  Aligned_cols=26  Identities=19%  Similarity=0.324  Sum_probs=14.7

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816          586 MVVLAITIMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~r~r  611 (620)
                      ||++=+.++++|+|.++=+|+.|||+
T Consensus       258 giaalvllil~vvliilYiwlyrrRK  283 (296)
T PTZ00370        258 GIAALVLLILAVVLIILYIWLYRRRK  283 (296)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence            77754455555555555555555544


No 46 
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=86.89  E-value=7.5  Score=34.23  Aligned_cols=37  Identities=19%  Similarity=0.220  Sum_probs=27.5

Q ss_pred             CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816          266 RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       266 ~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      ..++|.++   +.|+|+..           .+++|||.+.+.++++..+++
T Consensus         8 ~~g~~~v~---~~InG~~~-----------~flVDTGAs~t~is~~~A~~L   44 (121)
T TIGR02281         8 GDGHFYAT---GRVNGRNV-----------RFLVDTGATSVALNEEDAQRL   44 (121)
T ss_pred             CCCeEEEE---EEECCEEE-----------EEEEECCCCcEEcCHHHHHHc
Confidence            34555444   56787754           489999999999999987663


No 47 
>PHA03286 envelope glycoprotein E; Provisional
Probab=85.75  E-value=1.4  Score=46.73  Aligned_cols=76  Identities=18%  Similarity=0.171  Sum_probs=44.3

Q ss_pred             cccHHHHHHHHHHHccCccCCCCCCcceeeeeeeec-----CC------ccccchhh-hhHHHHHHHHHHHHHHHHHHHH
Q 047816          537 YISNATALRIISRLAEHRVHIPDTFGNYKLLQWNIE-----PQ------VKRTWWQE-HFLMVVLAITIMMVVGLSVFGI  604 (620)
Q Consensus       537 ~f~~~~~~~i~~~~~~~~~~~~~~fG~y~l~~~~~~-----~~------~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~  604 (620)
                      +-=.||+.+++..+.+++-|   .||+-.++.-+..     |.      ..++-+.. .+..+++| ++++++.....+.
T Consensus       337 YtlvST~~~fvNVi~e~~~P---~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~s~~~~-~~~~~~~~~~~~~  412 (492)
T PHA03286        337 YVYLSTLETILNVFEDVHKP---GFGYNAVSANDPGNFTAAPTHTIAFKEGPTVIYSLLVSSMAAG-AILVVLLFALCIA  412 (492)
T ss_pred             EEEEehHHHhhhhhhhccCC---CCCCcccccCCccccccCCcchhhhccCCeEEHHHHHHHHHHH-HHHHHHHHHHHhH
Confidence            44568899999999998777   4886665543221     21      11222333 34456666 5555555555566


Q ss_pred             HhhhhhhccccC
Q 047816          605 LFILRRRRQSVN  616 (620)
Q Consensus       605 ~~~~r~r~~~~~  616 (620)
                      .+++|||+++..
T Consensus       413 ~~~~r~~~~r~~  424 (492)
T PHA03286        413 GLYRRRRRHRTN  424 (492)
T ss_pred             hHhhhhhhhhcc
Confidence            677766654433


No 48 
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=85.45  E-value=3.2  Score=35.72  Aligned_cols=24  Identities=17%  Similarity=0.282  Sum_probs=20.9

Q ss_pred             CceeehHhhhceEEEEEeCCCCEE
Q 047816          398 PTTLLGGIIVRNTLVMYDREHSKI  421 (620)
Q Consensus       398 ~~~ILG~~fLr~~yvvfD~en~rI  421 (620)
                      +..+||..||+.+-++.|+.++++
T Consensus        84 ~~~LLG~~~L~~l~l~id~~~~~~  107 (107)
T TIGR03698        84 DEPLLGTELLEGLGIVIDYRNQGL  107 (107)
T ss_pred             CccEecHHHHhhCCEEEehhhCcC
Confidence            478999999999999999987753


No 49 
>PF08284 RVP_2:  Retroviral aspartyl protease;  InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases. 
Probab=84.58  E-value=6.1  Score=35.56  Aligned_cols=28  Identities=14%  Similarity=0.131  Sum_probs=25.4

Q ss_pred             CceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          398 PTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       398 ~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      -..|||..+|+.+..+.|..+++|.|-.
T Consensus       104 ~DvILGm~WL~~~~~~IDw~~k~v~f~~  131 (135)
T PF08284_consen  104 YDVILGMDWLKKHNPVIDWATKTVTFNS  131 (135)
T ss_pred             eeeEeccchHHhCCCEEEccCCEEEEeC
Confidence            4589999999999999999999999864


No 50 
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=83.78  E-value=3  Score=39.83  Aligned_cols=79  Identities=13%  Similarity=0.149  Sum_probs=53.5

Q ss_pred             eeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCC
Q 047816           73 RMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERA  152 (620)
Q Consensus        73 ~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~  152 (620)
                      .+.+..+  .+|.|.++..|-  +|++..+||||-+.+-+...+...      -.|+....                   
T Consensus        95 ~v~Lak~--~~GHF~a~~~VN--Gk~v~fLVDTGATsVal~~~dA~R------lGid~~~l-------------------  145 (215)
T COG3577          95 EVSLAKS--RDGHFEANGRVN--GKKVDFLVDTGATSVALNEEDARR------LGIDLNSL-------------------  145 (215)
T ss_pred             EEEEEec--CCCcEEEEEEEC--CEEEEEEEecCcceeecCHHHHHH------hCCCcccc-------------------
Confidence            4555554  488999999998  999999999999998887543211      12333221                   


Q ss_pred             cceeEEeeccCCceeEEEEEEEEEeCCC
Q 047816          153 QCVYERKYAEMSSSSGVLGEDIISFGNE  180 (620)
Q Consensus       153 ~~~~~~~Y~dg~~~~G~~~~D~v~lg~~  180 (620)
                      ..++.+.-++|..-...+-.|.|.||+.
T Consensus       146 ~y~~~v~TANG~~~AA~V~Ld~v~IG~I  173 (215)
T COG3577         146 DYTITVSTANGRARAAPVTLDRVQIGGI  173 (215)
T ss_pred             CCceEEEccCCccccceEEeeeEEEccE
Confidence            1234555566555555677899999993


No 51 
>PF05808 Podoplanin:  Podoplanin;  InterPro: IPR008783 This family consists of several mammalian podoplanin-like proteins which are thought to control specifically the unique shape of podocytes [].; GO: 0016021 integral to membrane; PDB: 3IET_X.
Probab=83.68  E-value=0.34  Score=44.19  Aligned_cols=35  Identities=11%  Similarity=0.345  Sum_probs=0.0

Q ss_pred             Cccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhh
Q 047816          574 QVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRR  610 (620)
Q Consensus       574 ~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~  610 (620)
                      ...|.++++ .+|||++|  +++++++++-+++++.||
T Consensus       120 t~ek~GL~T~tLVGIIVG--VLlaIG~igGIIivvvRK  155 (162)
T PF05808_consen  120 TVEKDGLSTVTLVGIIVG--VLLAIGFIGGIIIVVVRK  155 (162)
T ss_dssp             --------------------------------------
T ss_pred             ccccCCcceeeeeeehhh--HHHHHHHHhheeeEEeeh
Confidence            356899999 99999998  555666666556666665


No 52 
>PF13975 gag-asp_proteas:  gag-polyprotein putative aspartyl protease
Probab=82.70  E-value=2  Score=33.97  Aligned_cols=36  Identities=22%  Similarity=0.330  Sum_probs=29.9

Q ss_pred             ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCC
Q 047816           83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHC  120 (620)
Q Consensus        83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C  120 (620)
                      .+.+++++.||  ++.+..++|||++...|..+.+..+
T Consensus         6 ~g~~~v~~~I~--g~~~~alvDtGat~~fis~~~a~rL   41 (72)
T PF13975_consen    6 PGLMYVPVSIG--GVQVKALVDTGATHNFISESLAKRL   41 (72)
T ss_pred             CCEEEEEEEEC--CEEEEEEEeCCCcceecCHHHHHHh
Confidence            46788999999  8999999999999998876554433


No 53 
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=81.45  E-value=7.3  Score=32.17  Aligned_cols=30  Identities=30%  Similarity=0.543  Sum_probs=24.7

Q ss_pred             EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816          276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      .+.|+|+.+.           +++|||++.+.++++.+..+
T Consensus         4 ~~~Ing~~i~-----------~lvDTGA~~svis~~~~~~l   33 (91)
T cd05484           4 TLLVNGKPLK-----------FQLDTGSAITVISEKTWRKL   33 (91)
T ss_pred             EEEECCEEEE-----------EEEcCCcceEEeCHHHHHHh
Confidence            4567887664           79999999999999998764


No 54 
>PTZ00046 rifin; Provisional
Probab=80.45  E-value=1.7  Score=45.17  Aligned_cols=30  Identities=30%  Similarity=0.462  Sum_probs=22.1

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816          586 MVVLAITIMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      +|++++..+++++|..++++||.|.||++.
T Consensus       316 aIiaSiiAIvVIVLIMvIIYLILRYRRKKK  345 (358)
T PTZ00046        316 AIIASIVAIVVIVLIMVIIYLILRYRRKKK  345 (358)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHhhhcch
Confidence            444454566777788888999999998653


No 55 
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=79.94  E-value=1.8  Score=44.83  Aligned_cols=30  Identities=27%  Similarity=0.446  Sum_probs=22.3

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816          586 MVVLAITIMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      +|++++..+++++|..++++||.|.||++.
T Consensus       311 ~IiaSiIAIvvIVLIMvIIYLILRYRRKKK  340 (353)
T TIGR01477       311 PIIASIIAILIIVLIMVIIYLILRYRRKKK  340 (353)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHhhhcch
Confidence            455555666777778888999999998653


No 56 
>PF15176 LRR19-TM:  Leucine-rich repeat family 19 TM domain
Probab=76.25  E-value=4.9  Score=33.62  Aligned_cols=39  Identities=8%  Similarity=-0.003  Sum_probs=22.2

Q ss_pred             CCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          573 PQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       573 ~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      +..++.+.+. .+|||+++ ++++-+.+.+++=+=+|||.+
T Consensus         8 ~~~~~~g~sW~~LVGVv~~-al~~SlLIalaaKC~~~~k~~   47 (102)
T PF15176_consen    8 PGPGEGGRSWPFLVGVVVT-ALVTSLLIALAAKCPVWYKYL   47 (102)
T ss_pred             CCCCCCCcccHhHHHHHHH-HHHHHHHHHHHHHhHHHHHHH
Confidence            3344556666 88999887 444444434444455566554


No 57 
>PF05393 Hum_adeno_E3A:  Human adenovirus early E3A glycoprotein;  InterPro: IPR008652 This family consists of several early glycoproteins (E3A), from human adenovirus type 2.; GO: 0016021 integral to membrane
Probab=74.93  E-value=3.7  Score=33.27  Aligned_cols=25  Identities=24%  Similarity=0.377  Sum_probs=12.3

Q ss_pred             HHHHHHHHHHhhhhhhcccc-CCCCC
Q 047816          596 VVGLSVFGILFILRRRRQSV-NSYKP  620 (620)
Q Consensus       596 ~~~l~~~~~~~~~r~r~~~~-~~~~~  620 (620)
                      +..|+.+.++..|+||+|++ -.|+|
T Consensus        43 iFil~VilwfvCC~kRkrsRrPIYrP   68 (94)
T PF05393_consen   43 IFILLVILWFVCCKKRKRSRRPIYRP   68 (94)
T ss_pred             HHHHHHHHHHHHHHHhhhccCCcccc
Confidence            33334444444455555444 46776


No 58 
>PF00077 RVP:  Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026;  InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=74.56  E-value=4.1  Score=34.18  Aligned_cols=26  Identities=19%  Similarity=0.323  Sum_probs=22.6

Q ss_pred             EEEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816           88 TRLWIGTPPQTFALIVDTGSTVTYVPCA  115 (620)
Q Consensus        88 ~~i~iGTP~Q~~~v~vDTGSs~~WV~~~  115 (620)
                      .+|.|.  ++++..++||||+.+-++..
T Consensus         8 i~v~i~--g~~i~~LlDTGA~vsiI~~~   33 (100)
T PF00077_consen    8 ITVKIN--GKKIKALLDTGADVSIISEK   33 (100)
T ss_dssp             EEEEET--TEEEEEEEETTBSSEEESSG
T ss_pred             EEEeEC--CEEEEEEEecCCCcceeccc
Confidence            578888  88999999999999988764


No 59 
>PF14575 EphA2_TM:  Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=73.47  E-value=3.4  Score=33.05  Aligned_cols=27  Identities=11%  Similarity=0.493  Sum_probs=10.3

Q ss_pred             hHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          584 FLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       584 ~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      |++++++ +++++++++++ +++++||++
T Consensus         2 ii~~~~~-g~~~ll~~v~~-~~~~~rr~~   28 (75)
T PF14575_consen    2 IIASIIV-GVLLLLVLVII-VIVCFRRCK   28 (75)
T ss_dssp             HHHHHHH-HHHHHHHHHHH-HHCCCTT--
T ss_pred             EEehHHH-HHHHHHHhhee-EEEEEeeEc
Confidence            3443433 34444443333 344444443


No 60 
>PF12768 Rax2:  Cortical protein marker for cell polarity
Probab=71.95  E-value=2.3  Score=43.37  Aligned_cols=36  Identities=17%  Similarity=0.291  Sum_probs=18.6

Q ss_pred             ccchhh-hhHH--HHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          577 RTWWQE-HFLM--VVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       577 ~~~~~~-~~~~--i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      ++.+++ .+|.  ++++++++++++|+.+++..++|||+
T Consensus       221 ~~~l~~G~VVlIslAiALG~v~ll~l~Gii~~~~~r~~~  259 (281)
T PF12768_consen  221 GKKLSRGFVVLISLAIALGTVFLLVLIGIILAYIRRRRQ  259 (281)
T ss_pred             cccccceEEEEEehHHHHHHHHHHHHHHHHHHHHHhhhc
Confidence            355555 4444  44444455555555555555555544


No 61 
>PF06697 DUF1191:  Protein of unknown function (DUF1191);  InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=70.45  E-value=1.1  Score=45.08  Aligned_cols=29  Identities=17%  Similarity=0.256  Sum_probs=16.1

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      .++|+++|  +++++.|+.+++++.+.||++
T Consensus       215 iv~g~~~G--~~~L~ll~~lv~~~vr~krk~  243 (278)
T PF06697_consen  215 IVVGVVGG--VVLLGLLSLLVAMLVRYKRKK  243 (278)
T ss_pred             EEEEehHH--HHHHHHHHHHHHhhhhhhHHH
Confidence            67777777  333444444555555555543


No 62 
>PF05568 ASFV_J13L:  African swine fever virus J13L protein;  InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins.
Probab=69.96  E-value=7  Score=34.82  Aligned_cols=40  Identities=15%  Similarity=0.436  Sum_probs=26.1

Q ss_pred             CCccccchhhhhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          573 PQVKRTWWQEHFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       573 ~~~~~~~~~~~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      +-...+-+++|+.-|++| .+++++.+..++.|+-+|||++
T Consensus        20 ~~~~psffsthm~tILia-IvVliiiiivli~lcssRKkKa   59 (189)
T PF05568_consen   20 PVTPPSFFSTHMYTILIA-IVVLIIIIIVLIYLCSSRKKKA   59 (189)
T ss_pred             CCCCccHHHHHHHHHHHH-HHHHHHHHHHHHHHHhhhhHHH
Confidence            445566777888888888 5555555555556666666654


No 63 
>PF13650 Asp_protease_2:  Aspartyl protease
Probab=69.33  E-value=5.9  Score=32.12  Aligned_cols=29  Identities=21%  Similarity=0.487  Sum_probs=23.3

Q ss_pred             EEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816          277 IHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       277 i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      +.|+|+.+           .+++|||++.+.+.++.++++
T Consensus         3 v~vng~~~-----------~~liDTGa~~~~i~~~~~~~l   31 (90)
T PF13650_consen    3 VKVNGKPV-----------RFLIDTGASISVISRSLAKKL   31 (90)
T ss_pred             EEECCEEE-----------EEEEcCCCCcEEECHHHHHHc
Confidence            55677654           489999999999999988764


No 64 
>PF11925 DUF3443:  Protein of unknown function (DUF3443);  InterPro: IPR021847  This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 400 to 434 amino acids in length. This protein has two conserved sequence motifs: NPV and DNNG. 
Probab=68.81  E-value=12  Score=39.12  Aligned_cols=107  Identities=21%  Similarity=0.275  Sum_probs=55.0

Q ss_pred             EEEEecCCC----cEE-EEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816           88 TRLWIGTPP----QTF-ALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE  162 (620)
Q Consensus        88 ~~i~iGTP~----Q~~-~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d  162 (620)
                      +.|+|=.|+    |.+ +|+|||||.-+=|..+.-..-..   ...-...+ .-..+.   ||            ..|++
T Consensus        26 VsVtVC~PGts~CqTIdnvlVDTGS~GLRi~~sAl~~~l~---~~Lp~~t~-~g~~la---EC------------~~F~s   86 (370)
T PF11925_consen   26 VSVTVCAPGTSNCQTIDNVLVDTGSYGLRIFASALPSSLA---GSLPQQTG-GGAPLA---EC------------AQFAS   86 (370)
T ss_pred             eEEEEeCCCCCCceeeCcEEEeccchhhhHHHhhhchhhh---ccCCcccC-CCcchh---hh------------hhccC
Confidence            566665543    666 89999999988776542210000   00111111 011110   12            34565


Q ss_pred             CCceeEEEEEEEEEeCCCCCCCccceEEEE----------EEecc--CCCcCCCcceEEecCCC
Q 047816          163 MSSSSGVLGEDIISFGNESDLKPQRAVFGC----------ENVET--GDLYSQHADGIIGLGRG  214 (620)
Q Consensus       163 g~~~~G~~~~D~v~lg~~~~~~~~~~~fg~----------~~~~~--~~~~~~~~dGIlGLg~~  214 (620)
                       +..=|-+.+-.|+|+++....++-|.++=          .....  .........||||+|.-
T Consensus        87 -gytWGsVr~AdV~igge~A~~iPiQvI~D~~~~~~P~sC~~~g~~~~t~~~lgaNGILGIg~~  149 (370)
T PF11925_consen   87 -GYTWGSVRTADVTIGGETASSIPIQVIGDSAAPSVPSSCSNSGASMNTVADLGANGILGIGPF  149 (370)
T ss_pred             -cccccceEEEEEEEcCeeccccCEEEEcCCCCCCCCchhhcCCCCCCCcccccCceEEeecCC
Confidence             55567788899999986433344444432          11110  00113467899999973


No 65 
>PF12191 stn_TNFRSF12A:  Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain;  InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved []. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration []. The receptor may also play a role in cell-matrix interactions [].; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=67.73  E-value=1.6  Score=38.01  Aligned_cols=17  Identities=24%  Similarity=0.442  Sum_probs=0.0

Q ss_pred             HHHHHHHhhhhhhcccc
Q 047816          599 LSVFGILFILRRRRQSV  615 (620)
Q Consensus       599 l~~~~~~~~~r~r~~~~  615 (620)
                      |.++..+++|||.||++
T Consensus        92 l~llsg~lv~rrcrrr~  108 (129)
T PF12191_consen   92 LALLSGFLVWRRCRRRE  108 (129)
T ss_dssp             -----------------
T ss_pred             HHHHHHHHHHhhhhccc
Confidence            33333455555554443


No 66 
>PF06024 DUF912:  Nucleopolyhedrovirus protein of unknown function (DUF912);  InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=67.60  E-value=4.1  Score=34.69  Aligned_cols=31  Identities=29%  Similarity=0.410  Sum_probs=17.5

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQS  614 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~  614 (620)
                      .++.++++ .+.+++.|.++.-++|.|.|+++
T Consensus        63 iili~lls-~v~IlVily~IyYFVILRer~~~   93 (101)
T PF06024_consen   63 IILISLLS-FVCILVILYAIYYFVILRERQKS   93 (101)
T ss_pred             chHHHHHH-HHHHHHHHhhheEEEEEeccccc
Confidence            55555555 44444555555566677766644


No 67 
>PF03229 Alpha_GJ:  Alphavirus glycoprotein J;  InterPro: IPR004913 The exact function of the herpesvirus glycoprotein J is unknown, but it appears to play a role in the inhibition of apoptosis of the host cell.; GO: 0019050 suppression by virus of host apoptosis
Probab=67.37  E-value=5.3  Score=34.17  Aligned_cols=27  Identities=26%  Similarity=0.294  Sum_probs=19.0

Q ss_pred             hHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          584 FLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       584 ~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      +++.|+|  ..+++.|.+++...+.||++
T Consensus        85 aLp~VIG--GLcaL~LaamGA~~LLrR~c  111 (126)
T PF03229_consen   85 ALPLVIG--GLCALTLAAMGAGALLRRCC  111 (126)
T ss_pred             chhhhhh--HHHHHHHHHHHHHHHHHHHH
Confidence            4588887  56677788888776666553


No 68 
>PF05545 FixQ:  Cbb3-type cytochrome oxidase component FixQ;  InterPro: IPR008621 This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen-fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon.
Probab=66.95  E-value=6.6  Score=28.50  Aligned_cols=23  Identities=17%  Similarity=0.134  Sum_probs=14.9

Q ss_pred             HHHHHHHHHHHHHHhhhhhhccc
Q 047816          592 TIMMVVGLSVFGILFILRRRRQS  614 (620)
Q Consensus       592 ~~~~~~~l~~~~~~~~~r~r~~~  614 (620)
                      .+++.+...++++|++|+||+++
T Consensus        15 ~v~~~~~F~gi~~w~~~~~~k~~   37 (49)
T PF05545_consen   15 TVLFFVFFIGIVIWAYRPRNKKR   37 (49)
T ss_pred             HHHHHHHHHHHHHHHHcccchhh
Confidence            45555566677778887776543


No 69 
>PF15102 TMEM154:  TMEM154 protein family
Probab=66.47  E-value=2.4  Score=38.21  Aligned_cols=6  Identities=33%  Similarity=0.905  Sum_probs=2.5

Q ss_pred             HHHHHH
Q 047816          585 LMVVLA  590 (620)
Q Consensus       585 ~~i~~~  590 (620)
                      +.|++.
T Consensus        59 LmIlIP   64 (146)
T PF15102_consen   59 LMILIP   64 (146)
T ss_pred             EEEeHH
Confidence            344444


No 70 
>PF12384 Peptidase_A2B:  Ty3 transposon peptidase;  InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=65.69  E-value=19  Score=33.23  Aligned_cols=22  Identities=14%  Similarity=0.302  Sum_probs=18.9

Q ss_pred             CceEeeccceeeeecHHHHHHH
Q 047816          295 HGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       295 ~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      ..++||||++..+.-.++.+.+
T Consensus        46 i~vLfDSGSPTSfIr~di~~kL   67 (177)
T PF12384_consen   46 IKVLFDSGSPTSFIRSDIVEKL   67 (177)
T ss_pred             EEEEEeCCCccceeehhhHHhh
Confidence            3589999999999999887764


No 71 
>PF13975 gag-asp_proteas:  gag-polyprotein putative aspartyl protease
Probab=65.61  E-value=8.9  Score=30.25  Aligned_cols=30  Identities=17%  Similarity=0.407  Sum_probs=24.1

Q ss_pred             EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816          276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      .+.|+|..+           .+++|||.+..+++.+.++.+
T Consensus        12 ~~~I~g~~~-----------~alvDtGat~~fis~~~a~rL   41 (72)
T PF13975_consen   12 PVSIGGVQV-----------KALVDTGATHNFISESLAKRL   41 (72)
T ss_pred             EEEECCEEE-----------EEEEeCCCcceecCHHHHHHh
Confidence            355677655           389999999999999998775


No 72 
>PHA03265 envelope glycoprotein D; Provisional
Probab=65.14  E-value=2.1  Score=43.92  Aligned_cols=29  Identities=17%  Similarity=0.191  Sum_probs=18.2

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      -.+||++| +.++-+++..+++++.||||+
T Consensus       348 ~~~g~~ig-~~i~glv~vg~il~~~~rr~k  376 (402)
T PHA03265        348 TFVGISVG-LGIAGLVLVGVILYVCLRRKK  376 (402)
T ss_pred             cccceEEc-cchhhhhhhhHHHHHHhhhhh
Confidence            57788887 444455555555666777663


No 73 
>PF15065 NCU-G1:  Lysosomal transcription factor, NCU-G1
Probab=63.59  E-value=7.3  Score=40.92  Aligned_cols=49  Identities=10%  Similarity=0.218  Sum_probs=25.9

Q ss_pred             eeeeeeec---CCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          565 KLLQWNIE---PQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       565 ~l~~~~~~---~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      +.+.|+..   -..+...+|. .|+.|++|.++-+++.|+..+.++++|+|+|
T Consensus       297 ~ylsWt~~~G~G~PP~d~~S~lvi~i~~vgLG~P~l~li~Ggl~v~~~r~r~~  349 (350)
T PF15065_consen  297 NYLSWTFLIGYGSPPVDSFSPLVIMIMAVGLGVPLLLLILGGLYVCLRRRRKR  349 (350)
T ss_pred             ceEEEEEecccCCCCccchhHHHHHHHHHHhhHHHHHHHHhhheEEEeccccC
Confidence            35566642   1245677888 5555667755554444444434444444444


No 74 
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=62.64  E-value=9.6  Score=31.46  Aligned_cols=25  Identities=28%  Similarity=0.498  Sum_probs=21.3

Q ss_pred             EEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816           89 RLWIGTPPQTFALIVDTGSTVTYVPCA  115 (620)
Q Consensus        89 ~i~iGTP~Q~~~v~vDTGSs~~WV~~~  115 (620)
                      .+.|+  +|.+..++|||++++-+...
T Consensus         2 ~~~i~--g~~~~~llDTGAd~Tvi~~~   26 (87)
T cd05482           2 TLYIN--GKLFEGLLDTGADVSIIAEN   26 (87)
T ss_pred             EEEEC--CEEEEEEEccCCCCeEEccc
Confidence            36677  89999999999999998753


No 75 
>PF02480 Herpes_gE:  Alphaherpesvirus glycoprotein E;  InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation.; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=62.55  E-value=2.5  Score=45.98  Aligned_cols=23  Identities=9%  Similarity=0.114  Sum_probs=9.4

Q ss_pred             eEEEEECCCcEEEeCCCCcEEEe
Q 047816          358 AVEMAFGNGQKLLLAPENYLFRH  380 (620)
Q Consensus       358 ~i~f~f~~g~~~~l~~~~yi~~~  380 (620)
                      .|...+.+...|.+.-+-|.++.
T Consensus       170 ~l~~~~~d~~~f~~~i~W~~~~~  192 (439)
T PF02480_consen  170 HLQSEAHDDPPFSLEIDWYYMPT  192 (439)
T ss_dssp             EEEEEESSS--EEEEEEEEEE--
T ss_pred             EEEeccCCCCCeeEEEEEEEecC
Confidence            44444433355555555555554


No 76 
>PF02009 Rifin_STEVOR:  Rifin/stevor family;  InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=61.52  E-value=6.4  Score=40.42  Aligned_cols=24  Identities=17%  Similarity=0.368  Sum_probs=15.0

Q ss_pred             HHHHHHHHHHHhhhhhhc-cccCCC
Q 047816          595 MVVGLSVFGILFILRRRR-QSVNSY  618 (620)
Q Consensus       595 ~~~~l~~~~~~~~~r~r~-~~~~~~  618 (620)
                      +++.|.+..+|-.||||+ ++.++|
T Consensus       269 VLIMvIIYLILRYRRKKKmkKKlQY  293 (299)
T PF02009_consen  269 VLIMVIIYLILRYRRKKKMKKKLQY  293 (299)
T ss_pred             HHHHHHHHHHHHHHHHhhhhHHHHH
Confidence            345556666789999765 444444


No 77 
>PF07213 DAP10:  DAP10 membrane protein;  InterPro: IPR009861 This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells.
Probab=57.93  E-value=16  Score=29.26  Aligned_cols=19  Identities=11%  Similarity=-0.031  Sum_probs=12.0

Q ss_pred             chhh-hhHHHHHHHHHHHHHH
Q 047816          579 WWQE-HFLMVVLAITIMMVVG  598 (620)
Q Consensus       579 ~~~~-~~~~i~~~~~~~~~~~  598 (620)
                      .++. .++||++| =+++.+.
T Consensus        30 ~ls~g~LaGiV~~-D~vlTLL   49 (79)
T PF07213_consen   30 PLSPGLLAGIVAA-DAVLTLL   49 (79)
T ss_pred             ccCHHHHHHHHHH-HHHHHHH
Confidence            4566 78899887 4444333


No 78 
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=57.83  E-value=15  Score=29.94  Aligned_cols=30  Identities=20%  Similarity=0.427  Sum_probs=23.2

Q ss_pred             EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816          276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      .+.|+++.+           .+++|||++.+.++.+..+.+
T Consensus         6 ~v~i~~~~~-----------~~llDTGa~~s~i~~~~~~~l   35 (96)
T cd05483           6 PVTINGQPV-----------RFLLDTGASTTVISEELAERL   35 (96)
T ss_pred             EEEECCEEE-----------EEEEECCCCcEEcCHHHHHHc
Confidence            456676654           489999999999999876654


No 79 
>PF06365 CD34_antigen:  CD34/Podocalyxin family;  InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion.
Probab=56.64  E-value=8.2  Score=37.07  Aligned_cols=30  Identities=10%  Similarity=0.134  Sum_probs=20.0

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      .+|++++. +.+++++++.++.|++|+||+.
T Consensus       101 ~lI~lv~~-g~~lLla~~~~~~Y~~~~Rrs~  130 (202)
T PF06365_consen  101 TLIALVTS-GSFLLLAILLGAGYCCHQRRSW  130 (202)
T ss_pred             EEEehHHh-hHHHHHHHHHHHHHHhhhhccC
Confidence            56665544 4556666666777888888864


No 80 
>TIGR01167 LPXTG_anchor LPXTG-motif cell wall anchor domain. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region.
Probab=56.03  E-value=13  Score=24.36  Aligned_cols=10  Identities=30%  Similarity=0.717  Sum_probs=4.8

Q ss_pred             HHHhhhhhhc
Q 047816          603 GILFILRRRR  612 (620)
Q Consensus       603 ~~~~~~r~r~  612 (620)
                      +.++++|||+
T Consensus        24 ~~~~~~~rk~   33 (34)
T TIGR01167        24 GGLLLRKRKK   33 (34)
T ss_pred             HHHHheeccc
Confidence            3455555543


No 81 
>PF12877 DUF3827:  Domain of unknown function (DUF3827);  InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=55.88  E-value=31  Score=38.70  Aligned_cols=81  Identities=15%  Similarity=0.152  Sum_probs=47.0

Q ss_pred             eeeEEEEecCCCcccccHHHHHHHHHHHccCccCCCCCCc---------ceeeeeeeecCCccccchhh-hhHHHHHHHH
Q 047816          523 SFIAWAVFPSGSANYISNATALRIISRLAEHRVHIPDTFG---------NYKLLQWNIEPQVKRTWWQE-HFLMVVLAIT  592 (620)
Q Consensus       523 ~~~~~~~~P~~~~~~f~~~~~~~i~~~~~~~~~~~~~~fG---------~y~l~~~~~~~~~~~~~~~~-~~~~i~~~~~  592 (620)
                      ++|.--+-- .++.....++|-.+...+..+++.+-  .|         ||+...   .|..+.-..+. +|+||++.  
T Consensus       207 ~EL~YyV~~-~~G~pl~a~~AA~~Ln~ld~Q~~Al~--LGy~V~~~~AqPv~~~a---~P~~~s~~~NlWII~gVlvP--  278 (684)
T PF12877_consen  207 VELTYYVEG-QNGKPLPAVTAAKDLNLLDSQRMALI--LGYRVQGIVAQPVEKQA---EPPAKSPPNNLWIIAGVLVP--  278 (684)
T ss_pred             eEEEEEEEc-CCCcCCcHHHHHHHHhccCHHHHHHh--cCceecccccccccccc---CCCCCCCCCCeEEEehHhHH--
Confidence            555555541 27888999999999999988886642  33         333332   24444444455 66888655  


Q ss_pred             HHHHHHHHHHHHHhhhhhh
Q 047816          593 IMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       593 ~~~~~~l~~~~~~~~~r~r  611 (620)
                      +++++.+.+++.|.++||.
T Consensus       279 v~vV~~Iiiil~~~LCRk~  297 (684)
T PF12877_consen  279 VLVVLLIIIILYWKLCRKN  297 (684)
T ss_pred             HHHHHHHHHHHHHHHhccc
Confidence            3333333444445555443


No 82 
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=54.98  E-value=58  Score=26.95  Aligned_cols=21  Identities=19%  Similarity=0.316  Sum_probs=17.3

Q ss_pred             CCceEeeccceeeeecHHHHH
Q 047816          294 KHGTVLDSGTTYAYLPEAAFL  314 (620)
Q Consensus       294 ~~~ailDSGtt~~~LP~~~~~  314 (620)
                      +...++|||.....+|....+
T Consensus         9 ~~~fLVDTGA~vSviP~~~~~   29 (89)
T cd06094           9 GLRFLVDTGAAVSVLPASSTK   29 (89)
T ss_pred             CcEEEEeCCCceEeecccccc
Confidence            346899999999999987654


No 83 
>PF13703 PepSY_TM_2:  PepSY-associated TM helix
Probab=54.93  E-value=19  Score=29.64  Aligned_cols=20  Identities=20%  Similarity=0.368  Sum_probs=10.2

Q ss_pred             HHHHHHHHHHHHHhhhhhhc
Q 047816          593 IMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       593 ~~~~~~l~~~~~~~~~r~r~  612 (620)
                      ..+.+.+++.|.+++|+|++
T Consensus        24 al~~l~~~isGl~l~~p~~~   43 (88)
T PF13703_consen   24 ALLLLLLLISGLYLWWPRRW   43 (88)
T ss_pred             HHHHHHHHHHHHHHhhHHhc
Confidence            33344444556666665443


No 84 
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where 
Probab=53.36  E-value=16  Score=29.81  Aligned_cols=29  Identities=28%  Similarity=0.315  Sum_probs=23.7

Q ss_pred             EEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816          277 IHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       277 i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i  316 (620)
                      +.|||+.+.           .++|||.+.+.++++..+.+
T Consensus         3 v~InG~~~~-----------fLvDTGA~~tii~~~~a~~~   31 (86)
T cd06095           3 ITVEGVPIV-----------FLVDTGATHSVLKSDLGPKQ   31 (86)
T ss_pred             EEECCEEEE-----------EEEECCCCeEEECHHHhhhc
Confidence            457777654           79999999999999988764


No 85 
>PF02160 Peptidase_A3:  Cauliflower mosaic virus peptidase (A3);  InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contain an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein []. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco) []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=53.31  E-value=17  Score=34.92  Aligned_cols=28  Identities=25%  Similarity=0.319  Sum_probs=20.5

Q ss_pred             CCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816          397 DPTTLLGGIIVRNTLVMYDREHSKIGFWK  425 (620)
Q Consensus       397 ~~~~ILG~~fLr~~yvvfD~en~rIGfA~  425 (620)
                      +-..|||..|+|.++=-.+.+ .+|-|-.
T Consensus        90 g~d~IlG~NF~r~y~Pfiq~~-~~I~f~~  117 (201)
T PF02160_consen   90 GIDIILGNNFLRLYEPFIQTE-DRIQFHK  117 (201)
T ss_pred             CCCEEecchHHHhcCCcEEEc-cEEEEEe
Confidence            456999999999887555554 4677764


No 86 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence; an extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans, and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; a transmembrane region; and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: syndecan 1; syndecan 2 or fibroglycan; syndecan 3 or neuroglycan or N-syndecan; syndecan 4 or amphiglycan or ryudocan; Drosophila syndecan; and Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of the phospholipid activator PIP2, and that its overall structure in solution exhibits a twisted clamp shape with a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with the fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=53.04  E-value=5.1  Score=30.67  Aligned_cols=34  Identities=15%  Similarity=0.135  Sum_probs=0.9

Q ss_pred             ccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          577 RTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       577 ~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      ++..-. .|+|+++|  +++++.|+.+.++-+++|--
T Consensus         7 ~~~vlaavIaG~Vvg--ll~ailLIlf~iyR~rkkdE   41 (64)
T PF01034_consen    7 RSEVLAAVIAGGVVG--LLFAILLILFLIYRMRKKDE   41 (64)
T ss_dssp             --------------------------------S----
T ss_pred             cchHHHHHHHHHHHH--HHHHHHHHHHHHHHHHhcCC
Confidence            344445 88999997  55666777778888888863


No 87 
>PF11014 DUF2852:  Protein of unknown function (DUF2852);  InterPro: IPR021273  This bacterial family of proteins has no known function. 
Probab=48.03  E-value=29  Score=30.11  Aligned_cols=31  Identities=19%  Similarity=0.227  Sum_probs=22.6

Q ss_pred             hhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816          580 WQE-HFLMVVLAITIMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       580 ~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r  611 (620)
                      |.- .|+.+|+| .|+.-.+-++++.|.||.+|
T Consensus         7 ~~~a~Ia~mVlG-Fi~fWPlGla~Lay~iw~~r   38 (115)
T PF11014_consen    7 WKPAWIAAMVLG-FIVFWPLGLALLAYMIWGKR   38 (115)
T ss_pred             CchHHHHHHHHH-HHHHHHHHHHHHHHHHHHHH
Confidence            344 68888999 66666666777778888766


No 88 
>PF02529 PetG:  Cytochrome B6-F complex subunit 5;  InterPro: IPR003683 This family consists of cytochrome b6/f complex subunit 5 (PetG). The cytochrome bf complex, found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase. The purified complex from the unicellular alga Chlamydomonas reinhardtii contains seven subunits, namely four high molecular weight subunits (cytochrome f, Rieske iron-sulphur protein, cytochrome b6, and subunit IV) and three miniproteins (PetG, PetL, and PetX). Stoichiometry measurements are consistent with every subunit being present as two copies per b6/f dimer. The absence of PetG affects either the assembly or stability of the cytochrome bf complex in C. reinhardtii.; GO: 0009512 cytochrome b6f complex; PDB: 1Q90_G 2ZT9_G 1VF5_G 2D2C_G 2E74_G 2E75_G 2E76_G.
Probab=47.94  E-value=42  Score=22.62  Aligned_cols=26  Identities=15%  Similarity=0.103  Sum_probs=15.9

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhh
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILR  609 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r  609 (620)
                      ...||++| .+.+.++-+.+..|+-.|
T Consensus         5 lL~GiVlG-li~vtl~Glfv~Ay~QY~   30 (37)
T PF02529_consen    5 LLSGIVLG-LIPVTLAGLFVAAYLQYR   30 (37)
T ss_dssp             HHHHHHHH-HHHHHHHHHHHHHHHHHC
T ss_pred             hhhhHHHH-hHHHHHHHHHHHHHHHHh
Confidence            46799999 666555555554454444


No 89 
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=47.57  E-value=16  Score=36.65  Aligned_cols=29  Identities=34%  Similarity=0.509  Sum_probs=16.9

Q ss_pred             HHHHHHHHHH-HHHHHHhhhhhhccccCCCC
Q 047816          590 AITIMMVVGL-SVFGILFILRRRRQSVNSYK  619 (620)
Q Consensus       590 ~~~~~~~~~l-~~~~~~~~~r~r~~~~~~~~  619 (620)
                      |+++.++++| ++++++.||-+|||+. ++|
T Consensus       262 giaalvllil~vvliiLYiWlyrrRK~-swk  291 (295)
T TIGR01478       262 GIAALVLIILTVVLIILYIWLYRRRKK-SWK  291 (295)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhcc-ccc
Confidence            5444444444 3445678899888774 443


No 90 
>KOG4818 consensus Lysosomal-associated membrane protein [General function prediction only]
Probab=47.03  E-value=15  Score=38.10  Aligned_cols=29  Identities=28%  Similarity=0.333  Sum_probs=19.9

Q ss_pred             hHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          584 FLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       584 ~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      ++-|++| ++..++.+..++.++|.||||+
T Consensus       328 v~PivVg-~~l~gl~~~vliaylIgrr~~~  356 (362)
T KOG4818|consen  328 VLPIAVG-AILAGLVLVVLIAYLIGRRRSH  356 (362)
T ss_pred             ecchHHH-HHHHHHHHHHHHHhheeheecc
Confidence            4556777 6666777777788888755543


No 91 
>PF15099 PIRT:  Phosphoinositide-interacting protein family
Probab=46.55  E-value=11  Score=33.07  Aligned_cols=32  Identities=19%  Similarity=0.249  Sum_probs=19.9

Q ss_pred             hhHHHH-HHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816          583 HFLMVV-LAITIMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       583 ~~~~i~-~~~~~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      .++|.+ ++ +..++++.++++|..+.|||++++
T Consensus        81 ~~~G~vlLs-~GLmlL~~~alcW~~~~rkK~~kr  113 (129)
T PF15099_consen   81 SIFGPVLLS-LGLMLLACSALCWKPIIRKKKKKR  113 (129)
T ss_pred             hhehHHHHH-HHHHHHHhhhheehhhhHhHHHHh
Confidence            355655 55 556666667777777777665443


No 92 
>PF14986 DUF4514:  Domain of unknown function (DUF4514)
Probab=46.51  E-value=19  Score=26.27  Aligned_cols=28  Identities=18%  Similarity=0.266  Sum_probs=19.1

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      +++|.++|++|  ..+.+++-++.|||.-.
T Consensus        23 a~IGtalGvai--sAgFLaLKicmIrkhlf   50 (61)
T PF14986_consen   23 AIIGTALGVAI--SAGFLALKICMIRKHLF   50 (61)
T ss_pred             eeehhHHHHHH--HHHHHHHHHHHHHHhhc
Confidence            78888888333  44556777788877653


No 93 
>PTZ00370 STEVOR; Provisional
Probab=45.77  E-value=18  Score=36.48  Aligned_cols=26  Identities=31%  Similarity=0.423  Sum_probs=15.8

Q ss_pred             HHHHHHHHHH-HHHHHHhhhhhhcccc
Q 047816          590 AITIMMVVGL-SVFGILFILRRRRQSV  615 (620)
Q Consensus       590 ~~~~~~~~~l-~~~~~~~~~r~r~~~~  615 (620)
                      |+++.++++| ++++++.||-+|||+.
T Consensus       258 giaalvllil~vvliilYiwlyrrRK~  284 (296)
T PTZ00370        258 GIAALVLLILAVVLIILYIWLYRRRKN  284 (296)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence            5433333333 3445778999998875


No 94 
>PF11353 DUF3153:  Protein of unknown function (DUF3153);  InterPro: IPR021499  This family of proteins with unknown function appears to be restricted to Cyanobacteria. Some members are annotated as membrane proteins; however, this cannot be confirmed. 
Probab=45.54  E-value=18  Score=35.23  Aligned_cols=46  Identities=17%  Similarity=0.299  Sum_probs=22.6

Q ss_pred             eeeeeecCCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          566 LLQWNIEPQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       566 l~~~~~~~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      .+.|++.| +....+.. .++-=.+| .+++++++++++.++++|+|++
T Consensus       162 ~l~W~L~p-Ge~N~L~~~~w~pn~lg-iG~v~I~~l~~~~~~l~~~r~~  208 (209)
T PF11353_consen  162 QLTWKLQP-GEINHLEASFWVPNPLG-IGTVLIVLLILLGFLLRRRRLP  208 (209)
T ss_pred             EEEEecCC-CceeEEEEEEEeccHHH-HHHHHHHHHHHHHHHHHHhhcC
Confidence            67788743 33333333 22222344 2333444455555677776654


No 95 
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=45.24  E-value=3.1  Score=35.02  Aligned_cols=33  Identities=6%  Similarity=-0.106  Sum_probs=23.5

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      .-.|.++|+++.+++++.+++.+++|.+.+|++
T Consensus        63 ls~gaiagi~vg~~~~v~~lv~~l~w~f~~r~k   95 (96)
T PTZ00382         63 LSTGAIAGISVAVVAVVGGLVGFLCWWFVCRGK   95 (96)
T ss_pred             cccccEEEEEeehhhHHHHHHHHHhheeEEeec
Confidence            677777875666666667778888887776653


No 96 
>PF13268 DUF4059:  Protein of unknown function (DUF4059)
Probab=43.45  E-value=25  Score=27.53  Aligned_cols=23  Identities=26%  Similarity=0.339  Sum_probs=12.9

Q ss_pred             HHHHHHHHHHHhhhhhhccccCC
Q 047816          595 MVVGLSVFGILFILRRRRQSVNS  617 (620)
Q Consensus       595 ~~~~l~~~~~~~~~r~r~~~~~~  617 (620)
                      .+..+.+-+.|..||.++++-++
T Consensus        17 ~i~V~~~~~~wi~~Ra~~~~DKT   39 (72)
T PF13268_consen   17 SILVLLVSGIWILWRALRKKDKT   39 (72)
T ss_pred             HHHHHHHHHHHHHHHHHHcCCCc
Confidence            34444455567777766655443


No 97 
>PF08374 Protocadherin:  Protocadherin;  InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain, where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated. 
Probab=42.85  E-value=11  Score=36.18  Aligned_cols=26  Identities=12%  Similarity=0.333  Sum_probs=13.0

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r  611 (620)
                      .++||+.| +++++|+  ++++.++|+.|
T Consensus        39 I~iaiVAG-~~tVILV--I~i~v~vR~CR   64 (221)
T PF08374_consen   39 IMIAIVAG-IMTVILV--IFIVVLVRYCR   64 (221)
T ss_pred             eeeeeecc-hhhhHHH--HHHHHHHHHHh
Confidence            57777776 4443333  33334445344


No 98 
>KOG3540 consensus Beta amyloid precursor protein [General function prediction only]
Probab=42.32  E-value=30  Score=37.28  Aligned_cols=57  Identities=14%  Similarity=0.293  Sum_probs=35.7

Q ss_pred             ccCCCCCC-cceeeeeeeecCCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          554 RVHIPDTF-GNYKLLQWNIEPQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       554 ~~~~~~~f-G~y~l~~~~~~~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      .++...-| ++|++-.-.+.|.+...+.|. +++|+.++ +  ++++-++++.+++.|||+.
T Consensus       518 ev~~d~e~d~~~e~~r~~~~~~~ed~~~s~~av~gllv~-~--~~i~tvivisl~mlrkr~y  576 (615)
T KOG3540|consen  518 EVRVDAEFDEGAEFYRHDLLPQSEDVGRSASAVIGLLVS-A--VFIATVIVISLVMLRKRQY  576 (615)
T ss_pred             hcccCCCCCCchhhhhhhhccccccccccHHHHHHHHHH-H--HHHHHHHHHHHHHHccccc
Confidence            34445445 788887777778888888899 99997654 1  1222233344556666653


No 99 
>PF00077 RVP:  Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026;  InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=42.25  E-value=20  Score=29.97  Aligned_cols=27  Identities=22%  Similarity=0.571  Sum_probs=21.1

Q ss_pred             EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHH
Q 047816          276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAF  313 (620)
Q Consensus       276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~  313 (620)
                      .+.++|+.+           .++||||+..+.++++.+
T Consensus         9 ~v~i~g~~i-----------~~LlDTGA~vsiI~~~~~   35 (100)
T PF00077_consen    9 TVKINGKKI-----------KALLDTGADVSIISEKDW   35 (100)
T ss_dssp             EEEETTEEE-----------EEEEETTBSSEEESSGGS
T ss_pred             EEeECCEEE-----------EEEEecCCCcceeccccc
Confidence            456677755           489999999999998653


No 100
>PF14575 EphA2_TM:  Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=41.18  E-value=32  Score=27.48  Aligned_cols=23  Identities=17%  Similarity=0.193  Sum_probs=9.6

Q ss_pred             HHHHHHHHHHHHHhhhhhhcccc
Q 047816          593 IMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       593 ~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      +++.+.++.+.++++.-.+||..
T Consensus         6 ~~~g~~~ll~~v~~~~~~~rr~~   28 (75)
T PF14575_consen    6 IIVGVLLLLVLVIIVIVCFRRCK   28 (75)
T ss_dssp             HHHHHHHHHHHHHHHHCCCTT--
T ss_pred             HHHHHHHHHHhheeEEEEEeeEc
Confidence            33344444444555555554443


No 101
>PF12384 Peptidase_A2B:  Ty3 transposon peptidase;  InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=40.67  E-value=36  Score=31.53  Aligned_cols=29  Identities=14%  Similarity=0.339  Sum_probs=23.8

Q ss_pred             EEEEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816           87 TTRLWIGTPPQTFALIVDTGSTVTYVPCA  115 (620)
Q Consensus        87 ~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~  115 (620)
                      +..+.+++-+.++++++||||+.-.+...
T Consensus        34 T~~v~l~~~~t~i~vLfDSGSPTSfIr~d   62 (177)
T PF12384_consen   34 TAIVQLNCKGTPIKVLFDSGSPTSFIRSD   62 (177)
T ss_pred             EEEEEEeecCcEEEEEEeCCCccceeehh
Confidence            34677777799999999999999888653


No 102
>PF09668 Asp_protease:  Aspartyl protease;  InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.  This family of eukaryotic aspartyl proteases has a fold similar to that of retroviral proteases, which implies that they function proteolytically during regulated protein turnover.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=38.92  E-value=16  Score=32.40  Aligned_cols=36  Identities=25%  Similarity=0.291  Sum_probs=26.6

Q ss_pred             eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCC
Q 047816           85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGD  122 (620)
Q Consensus        85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~  122 (620)
                      ..|++++|+  ++++...+|||...+-+..+-+..|+.
T Consensus        24 mLyI~~~in--g~~vkA~VDtGAQ~tims~~~a~r~gL   59 (124)
T PF09668_consen   24 MLYINCKIN--GVPVKAFVDTGAQSTIMSKSCAERCGL   59 (124)
T ss_dssp             --EEEEEET--TEEEEEEEETT-SS-EEEHHHHHHTTG
T ss_pred             eEEEEEEEC--CEEEEEEEeCCCCccccCHHHHHHcCC
Confidence            567899999  999999999999998887654556654


No 103
>TIGR03867 MprA_tail MprA protease C-terminal sorting domain. This model describes a protein C-terminal domain that occurs in species of the genus Ralstonia and is predicted to play a role in protein targeting. This sequence, though limited to members of the MprA serine in species distribution, resembles C-terminal sorting sequences of the sortase and exosortase systems, as well as a Shewanella-type C-terminal sequence modeled by TIGR03501. For all such cases, member proteins have homologs in other species with essentially full-length homology, save for the lack of the domain modeled here. All members of the present family are predicted serine proteases
Probab=38.79  E-value=42  Score=21.03  Aligned_cols=19  Identities=32%  Similarity=0.296  Sum_probs=11.1

Q ss_pred             HHHHHHHHHHHHHhhhhhh
Q 047816          593 IMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       593 ~~~~~~l~~~~~~~~~r~r  611 (620)
                      ..++..|++++.+.+.|||
T Consensus         8 ~~~A~Lll~aG~~~~~rR~   26 (27)
T TIGR03867         8 PWLAALLLAAGLLGFARRR   26 (27)
T ss_pred             HHHHHHHHHHHhhhHHhhc
Confidence            3445556666666666655


No 104
>PF13908 Shisa:  Wnt and FGF inhibitory regulator
Probab=38.02  E-value=16  Score=34.51  Aligned_cols=12  Identities=8%  Similarity=0.232  Sum_probs=6.6

Q ss_pred             chhh-hhHHHHHH
Q 047816          579 WWQE-HFLMVVLA  590 (620)
Q Consensus       579 ~~~~-~~~~i~~~  590 (620)
                      ++.. +++||++|
T Consensus        75 ~~~~~iivgvi~~   87 (179)
T PF13908_consen   75 YFITGIIVGVICG   87 (179)
T ss_pred             cceeeeeeehhhH
Confidence            3344 56666665


No 105
>PRK09459 pspG phage shock protein G; Reviewed
Probab=37.99  E-value=29  Score=27.45  Aligned_cols=21  Identities=24%  Similarity=0.322  Sum_probs=13.8

Q ss_pred             HHHHHHHhhhhhhccccCCCC
Q 047816          599 LSVFGILFILRRRRQSVNSYK  619 (620)
Q Consensus       599 l~~~~~~~~~r~r~~~~~~~~  619 (620)
                      +.+++.|++|.+++++.+.|+
T Consensus        53 l~~v~vW~~r~~~~~~~~~y~   73 (76)
T PRK09459         53 LAVVVVWVIRAIKAPKVPRYQ   73 (76)
T ss_pred             HHHHHHHHHHHhhcccccccc
Confidence            355667888776766666664


No 106
>PF05084 GRA6:  Granule antigen protein (GRA6);  InterPro: IPR008119  Toxoplasma gondii is an obligate intracellular apicomplexan protozoan parasite, with a complex lifestyle involving varied hosts []. It has two phases of growth: an intestinal phase in feline hosts, and an extra-intestinal phase in other mammals. Oocysts from infected cats develop into tachyzoites, and eventually, bradyzoites and zoitocysts in the extraintestinal host []. Transmission of the parasite occurs through contact with infected cats or raw/undercooked meat; in immunocompromised individuals, it can cause severe and often lethal toxoplasmosis. Acute infection in healthy humans can sometimes also cause tissue damage [].  The protozoan utilises a variety of secretory and antigenic proteins to invade a host and gain access to the intracellular environment []. These originate from distinct organelles in the T. gondii cell termed micronemes, rhoptries, and dense granules. They are released at specific times during invasion to ensure the proteins are allocated to their correct target destinations []. Dense granule antigens (GRAs) are released from the T. gondii tachyzoite while still encapsulated in a host vacuole. Gra6, one of these moieties, is associated with the parasitophorous vacuole []. It possesses a hydrophobic central region flanked by two hydrophilic domains, and is present as a single copy gene in the Toxoplasma gondii genome []. Gra6 shares a similar function with Gra2, in that it is rapidly targeted to a network of membranous tubules that connect with the vacuolar membrane []. Indeed, these two proteins, together with Gra4, form a multimeric complex that stabilises the parasite within the vacuole.
Probab=37.77  E-value=36  Score=31.01  Aligned_cols=13  Identities=31%  Similarity=0.455  Sum_probs=6.5

Q ss_pred             HHHHHhhhhhhcc
Q 047816          601 VFGILFILRRRRQ  613 (620)
Q Consensus       601 ~~~~~~~~r~r~~  613 (620)
                      ++.+|++.|||.|
T Consensus       164 A~L~~~F~RR~~r  176 (215)
T PF05084_consen  164 AMLTWFFLRRTGR  176 (215)
T ss_pred             HHHHHHHHHhhcc
Confidence            3344555555543


No 107
>TIGR03370 PEPCTERM_Roseo variant PEP-CTERM putative exosortase signal, Roseobacter type. A probable protein export sorting signal, PEP-CTERM, was described by Haft, et al. (PubMed:16930487). It is predicted to interact with a putative transpeptidase we designate exosortase. Most examples of this signal are recognized by model TIGR02595, but some unusual clades require different models. This model describes a variant with conserved motif VPLPA, rather than VPEP. This variant is found prominently in two members of the Rhodobacterales, namely Jannaschia sp. CCS1 and Roseobacter denitrificans OCh 114. One interesting member protein has a full-length duplication and therefore two copies of this putative sorting domain.
Probab=36.95  E-value=41  Score=20.95  Aligned_cols=17  Identities=41%  Similarity=0.479  Sum_probs=9.0

Q ss_pred             HHHHHHHHHhhhhhhcc
Q 047816          597 VGLSVFGILFILRRRRQ  613 (620)
Q Consensus       597 ~~l~~~~~~~~~r~r~~  613 (620)
                      +.++.++.+.+.|||++
T Consensus         9 LLl~gLggl~~~rRRrk   25 (26)
T TIGR03370         9 LLLAGLGGLGAMRRRRR   25 (26)
T ss_pred             HHHHHHHHHHHHHHhhc
Confidence            34455555556665543


No 108
>PHA03283 envelope glycoprotein E; Provisional
Probab=36.69  E-value=34  Score=37.41  Aligned_cols=25  Identities=16%  Similarity=0.139  Sum_probs=14.9

Q ss_pred             HHHHHHHHHHHhhhhhhccccCCCC
Q 047816          595 MVVGLSVFGILFILRRRRQSVNSYK  619 (620)
Q Consensus       595 ~~~~l~~~~~~~~~r~r~~~~~~~~  619 (620)
                      ++++|+++++|...|-|++.++.|+
T Consensus       410 ~~~~~~~l~vw~c~~~r~~~~~~y~  434 (542)
T PHA03283        410 CAALLVALVVWGCILYRRSNRKPYE  434 (542)
T ss_pred             HHHHHHHHhhhheeeehhhcCCccc
Confidence            3355666667766665555556664


No 109
>PF01102 Glycophorin_A:  Glycophorin A;  InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=36.09  E-value=46  Score=29.37  Aligned_cols=30  Identities=10%  Similarity=0.165  Sum_probs=18.3

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQS  614 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~  614 (620)
                      .|+||++|  ++++++|+++++.-.+||....
T Consensus        69 Ii~gv~aG--vIg~Illi~y~irR~~Kk~~~~   98 (122)
T PF01102_consen   69 IIFGVMAG--VIGIILLISYCIRRLRKKSSSD   98 (122)
T ss_dssp             HHHHHHHH--HHHHHHHHHHHHHHHS------
T ss_pred             hhHHHHHH--HHHHHHHHHHHHHHHhccCCCC
Confidence            67888887  4556678888888777777543


No 110
>CHL00008 petG cytochrome b6/f complex subunit V
Probab=35.52  E-value=73  Score=21.39  Aligned_cols=25  Identities=12%  Similarity=0.090  Sum_probs=14.6

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhh
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFIL  608 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~  608 (620)
                      .+-||++| .+.+.++-+++..|+=.
T Consensus         5 lL~GiVLG-lipvTl~GlfvaAylQY   29 (37)
T CHL00008          5 LLFGIVLG-LIPITLAGLFVTAYLQY   29 (37)
T ss_pred             hhhhHHHH-hHHHHHHHHHHHHHHHH
Confidence            35689999 66555554554444433


No 111
>PHA03281 envelope glycoprotein E; Provisional
Probab=35.45  E-value=63  Score=35.51  Aligned_cols=20  Identities=20%  Similarity=0.305  Sum_probs=15.4

Q ss_pred             cccHHHHHHHHHHHccCccC
Q 047816          537 YISNATALRIISRLAEHRVH  556 (620)
Q Consensus       537 ~f~~~~~~~i~~~~~~~~~~  556 (620)
                      +-=.||+.+++..+.++.++
T Consensus       505 YtlvSTad~fvNvV~d~~~P  524 (642)
T PHA03281        505 YTVVSTIDHFVNAIEEHGFP  524 (642)
T ss_pred             EEEEehHHhhhhhehhcCCC
Confidence            45567888899999988655


No 112
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=35.05  E-value=71  Score=30.78  Aligned_cols=36  Identities=25%  Similarity=0.306  Sum_probs=28.4

Q ss_pred             CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHH
Q 047816          266 RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLA  315 (620)
Q Consensus       266 ~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~  315 (620)
                      .+++|.++   ..|||+.+.           .++|||.|.+.++++..+.
T Consensus       102 ~~GHF~a~---~~VNGk~v~-----------fLVDTGATsVal~~~dA~R  137 (215)
T COG3577         102 RDGHFEAN---GRVNGKKVD-----------FLVDTGATSVALNEEDARR  137 (215)
T ss_pred             CCCcEEEE---EEECCEEEE-----------EEEecCcceeecCHHHHHH
Confidence            55666654   578888765           7999999999999988765


No 113
>TIGR03778 VPDSG_CTERM VPDSG-CTERM exosortase interaction domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (PubMed:16930487). This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.
Probab=34.89  E-value=46  Score=20.73  Aligned_cols=17  Identities=29%  Similarity=0.065  Sum_probs=7.9

Q ss_pred             HHHHHHHHHHHhhhhhh
Q 047816          595 MVVGLSVFGILFILRRR  611 (620)
Q Consensus       595 ~~~~l~~~~~~~~~r~r  611 (620)
                      +++...++..++.+|||
T Consensus         8 ~~Ll~~~l~~l~~~rRr   24 (26)
T TIGR03778         8 LALLGLGLLGLLGLRRR   24 (26)
T ss_pred             HHHHHHHHHHHHHHhhc
Confidence            33334444445555544


No 114
>PF15176 LRR19-TM:  Leucine-rich repeat family 19 TM domain
Probab=34.49  E-value=41  Score=28.28  Aligned_cols=36  Identities=11%  Similarity=0.066  Sum_probs=26.9

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc-cccCCCC
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRR-QSVNSYK  619 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~-~~~~~~~  619 (620)
                      .-+...+| .++.++.+++++.++++=... +...+|+
T Consensus        15 ~sW~~LVG-Vv~~al~~SlLIalaaKC~~~~k~~~SY~   51 (102)
T PF15176_consen   15 RSWPFLVG-VVVTALVTSLLIALAAKCPVWYKYLASYR   51 (102)
T ss_pred             cccHhHHH-HHHHHHHHHHHHHHHHHhHHHHHHHhccc
Confidence            56778899 888888889998888877664 4455553


No 115
>PRK00665 petG cytochrome b6-f complex subunit PetG; Reviewed
Probab=33.69  E-value=78  Score=21.26  Aligned_cols=25  Identities=12%  Similarity=-0.024  Sum_probs=14.5

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhh
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFIL  608 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~  608 (620)
                      .+-||++| .+.+.++-+++..|+=.
T Consensus         5 lL~GiVLG-lipiTl~GlfvaAylQY   29 (37)
T PRK00665          5 LLCGIVLG-LIPVTLAGLFVAAWNQY   29 (37)
T ss_pred             hhhhHHHH-hHHHHHHHHHHHHHHHH
Confidence            35689999 66555544444444433


No 116
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=33.24  E-value=47  Score=27.66  Aligned_cols=21  Identities=29%  Similarity=0.332  Sum_probs=18.6

Q ss_pred             ceEeeccceeeeecHHHHHHH
Q 047816          296 GTVLDSGTTYAYLPEAAFLAF  316 (620)
Q Consensus       296 ~ailDSGtt~~~LP~~~~~~i  316 (620)
                      .+.+|||++...+|...++.+
T Consensus        12 ~~~vDtGA~vnllp~~~~~~l   32 (93)
T cd05481          12 KFQLDTGATCNVLPLRWLKSL   32 (93)
T ss_pred             EEEEecCCEEEeccHHHHhhh
Confidence            589999999999999988764


No 117
>PF10577 UPF0560:  Uncharacterised protein family UPF0560;  InterPro: IPR018890  This family of proteins has no known function. 
Probab=33.12  E-value=51  Score=38.11  Aligned_cols=33  Identities=21%  Similarity=0.320  Sum_probs=19.9

Q ss_pred             hhhhhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          580 WQEHFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       580 ~~~~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      -++.++..++|..++++++|+++.+|..+||..
T Consensus       270 YHT~fLl~ILG~~~livl~lL~vLl~yCrrkc~  302 (807)
T PF10577_consen  270 YHTVFLLAILGGTALIVLILLCVLLCYCRRKCL  302 (807)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccC
Confidence            355444444443677777777777776666553


No 118
>COG4736 CcoQ Cbb3-type cytochrome oxidase, subunit 3 [Posttranslational modification, protein turnover, chaperones]
Probab=32.88  E-value=50  Score=25.16  Aligned_cols=23  Identities=9%  Similarity=-0.008  Sum_probs=13.2

Q ss_pred             HHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          589 LAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       589 ~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      .| .+++.+.+++++.|++|++|+
T Consensus        13 ~~-t~~~~l~fiavi~~ayr~~~K   35 (60)
T COG4736          13 WG-TIAFTLFFIAVIYFAYRPGKK   35 (60)
T ss_pred             HH-HHHHHHHHHHHHHHHhcccch
Confidence            44 555555556666666665554


No 119
>cd01324 cbb3_Oxidase_CcoQ Cytochrome cbb oxidase CcoQ.  Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  Found exclusively in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I.  Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center.  ccoQ, the fourth subunit, is a single transmembrane helix protein.  It has been shown to protect the core complex from proteolytic degradation by serine proteases.  See cd00919, cd01322
Probab=32.43  E-value=53  Score=23.77  Aligned_cols=22  Identities=5%  Similarity=0.004  Sum_probs=13.1

Q ss_pred             HHHHHHHHHHHHHHhhhhhhcc
Q 047816          592 TIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       592 ~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      .+.+++.-+++++|.+|+++++
T Consensus        16 l~~~~~~Figiv~wa~~p~~k~   37 (48)
T cd01324          16 LLYLALFFLGVVVWAFRPGRKK   37 (48)
T ss_pred             HHHHHHHHHHHHHHHhCCCcch
Confidence            3344445566667777776654


No 120
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=32.28  E-value=32  Score=30.19  Aligned_cols=20  Identities=30%  Similarity=0.511  Sum_probs=17.8

Q ss_pred             eEeeccce-eeeecHHHHHHH
Q 047816          297 TVLDSGTT-YAYLPEAAFLAF  316 (620)
Q Consensus       297 ailDSGtt-~~~LP~~~~~~i  316 (620)
                      .++|||.+ ++.+|+++++++
T Consensus        29 ~LiDTGFtg~lvlp~~vaek~   49 (125)
T COG5550          29 ELIDTGFTGYLVLPPQVAEKL   49 (125)
T ss_pred             eEEecCCceeEEeCHHHHHhc
Confidence            48999999 999999998774


No 121
>PRK10525 cytochrome o ubiquinol oxidase subunit II; Provisional
Probab=31.40  E-value=42  Score=34.80  Aligned_cols=29  Identities=17%  Similarity=0.392  Sum_probs=16.2

Q ss_pred             HHHHHHHHHHHHHHhhhhhhcc-ccCCCCC
Q 047816          592 TIMMVVGLSVFGILFILRRRRQ-SVNSYKP  620 (620)
Q Consensus       592 ~~~~~~~l~~~~~~~~~r~r~~-~~~~~~~  620 (620)
                      .+++++.+.++.++++||.|++ +...|.|
T Consensus        51 ~liv~i~V~~l~~~f~~ryR~~~~~a~y~p   80 (315)
T PRK10525         51 MLIVVIPAILMAVGFAWKYRASNKDAKYSP   80 (315)
T ss_pred             HHhhHHHHHHHHheeEEEEecCCCcCCCCC
Confidence            3444444444566777777754 3356654


No 122
>TIGR02595 PEP_exosort PEP-CTERM putative exosortase interaction domain. This model describes a 25-residue domain that includes a near-invariant Pro-Glu-Pro (PEP) motif, a thirteen residue strongly hydrophobic sequence likely to span the membrane, and a five-residue strongly basic motif that often contains four Arg residues. In nearly every case, this motif is found within nine residues, and usually within five residues, of the extreme C-terminus of the protein. Proteins with this motif typically have signal sequences at the N-terminus. This region appears many times per genome or not at all, and co-occurs in genomes with a proposed protein-sorting integral membrane protein we designate exosortase (see TIGR02602). PEP-CTERM proteins frequently are poorly conserved, Ser/Thr-rich proteins and may become extensively modified proteinaceous constituents of extracellular material in bacterial biofilms.
Probab=31.13  E-value=56  Score=20.28  Aligned_cols=7  Identities=71%  Similarity=1.312  Sum_probs=2.7

Q ss_pred             hhhhhhc
Q 047816          606 FILRRRR  612 (620)
Q Consensus       606 ~~~r~r~  612 (620)
                      +..|||+
T Consensus        17 ~~~rrrk   23 (26)
T TIGR02595        17 LLLRRRR   23 (26)
T ss_pred             HHHhhcc
Confidence            3334343


No 123
>PF14654 Epiglycanin_C:  Mucin, catalytic, TM and cytoplasmic tail region
Probab=30.63  E-value=87  Score=26.16  Aligned_cols=26  Identities=19%  Similarity=0.369  Sum_probs=16.1

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhh
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILR  609 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r  609 (620)
                      .|.-|++. .++++++|++-..+.+|+
T Consensus        19 eIfLItLa-sVvvavGl~aGLfFcvR~   44 (106)
T PF14654_consen   19 EIFLITLA-SVVVAVGLFAGLFFCVRN   44 (106)
T ss_pred             HHHHHHHH-HHHHHHHHHHHHHHHhhh
Confidence            45556666 667777777655555544


No 124
>PF14828 Amnionless:  Amnionless
Probab=30.48  E-value=1.7e+02  Score=31.99  Aligned_cols=61  Identities=13%  Similarity=0.086  Sum_probs=32.3

Q ss_pred             eEEEEEEEEEEEeecCCCCCCchhHHHHHHhhcccccccceEEeeeeecCCceeeEEEEecC
Q 047816          471 LQIGRITFDMFLSINYSDLRPHIPELADSIAQELDVNTSQVHLLNFMSKGNNSFIAWAVFPS  532 (620)
Q Consensus       471 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~qv~~~~~~~~g~~~~~~~~~~P~  532 (620)
                      .|-+-+++.+. ...--|++.|...+.+.+..+=....-|.|++.....+..-.+++.|.=.
T Consensus       231 iCGa~v~~~~~-~~~~fdl~~~~~~l~~~~~~~~~~~~v~~~v~kv~~~~~~~~iQiVi~d~  291 (437)
T PF14828_consen  231 ICGAIVTLEYS-CESTFDLQSYRQRLRHAFLELPQYDEVQMHVSKVWSDQSGNEIQIVITDR  291 (437)
T ss_pred             hcceEEEEeec-CCccccHHHHHHHHHHHHhccccccceeEEEEEeecCCCCceEEEEEecC
Confidence            46565555544 34444566665555555544333344566666655554445555555443


No 125
>PF09668 Asp_protease:  Aspartyl protease;  InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ].
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.  This family of eukaryotic aspartyl proteases has a fold similar to retroviral proteases, which implies they function proteolytically during regulated protein turnover [].; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=29.78  E-value=56  Score=28.92  Aligned_cols=29  Identities=14%  Similarity=0.323  Sum_probs=22.5

Q ss_pred             EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHH
Q 047816          276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLA  315 (620)
Q Consensus       276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~  315 (620)
                      .+.+||+.+.           |++|||+..+.++.+.+++
T Consensus        28 ~~~ing~~vk-----------A~VDtGAQ~tims~~~a~r   56 (124)
T PF09668_consen   28 NCKINGVPVK-----------AFVDTGAQSTIMSKSCAER   56 (124)
T ss_dssp             EEEETTEEEE-----------EEEETT-SS-EEEHHHHHH
T ss_pred             EEEECCEEEE-----------EEEeCCCCccccCHHHHHH
Confidence            4567888764           8999999999999998877


No 126
>PF14979 TMEM52:  Transmembrane 52
Probab=29.77  E-value=1.1e+02  Score=27.72  Aligned_cols=36  Identities=17%  Similarity=0.370  Sum_probs=17.1

Q ss_pred             cchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816          578 TWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ  613 (620)
Q Consensus       578 ~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~  613 (620)
                      .|.+- +|+-|++.++..++-++.+..+=|-|+||++
T Consensus        15 ~W~~LWyIwLill~~~llLLCG~ta~C~rfCClrk~~   51 (154)
T PF14979_consen   15 RWSSLWYIWLILLIGFLLLLCGLTASCVRFCCLRKQA   51 (154)
T ss_pred             ceehhhHHHHHHHHHHHHHHHHHHHHHHHHHHhcccc
Confidence            45555 4444443324444444444444445555653


No 127
>PF02038 ATP1G1_PLM_MAT8:  ATP1G1/PLM/MAT8 family;  InterPro: IPR000272  The FXYD protein family contains at least seven members in mammals []. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable.   Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=29.76  E-value=75  Score=23.21  Aligned_cols=15  Identities=13%  Similarity=0.465  Sum_probs=6.6

Q ss_pred             hhHHHHHHHHHHHHHH
Q 047816          583 HFLMVVLAITIMMVVG  598 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~  598 (620)
                      -+.|.+.+ ++.++++
T Consensus        15 rigGLi~A-~vlfi~G   29 (50)
T PF02038_consen   15 RIGGLIFA-GVLFILG   29 (50)
T ss_dssp             HHHHHHHH-HHHHHHH
T ss_pred             hccchHHH-HHHHHHH
Confidence            35555544 3333333


No 128
>PF10661 EssA:  WXG100 protein secretion system (Wss), protein EssA;  InterPro: IPR018920  The Wss (WXG100 protein secretion system) in Staphylococcus aureus seems to be encoded by a locus of eight ORFs, called ess (eSAT-6 secretion system) []. This locus encodes, amongst several other proteins, EssA, a protein predicted to possess one transmembrane domain. Due to its predicted membrane location and its absolute requirement for WXG100 protein secretion, it has been speculated that EssA could form a secretion apparatus in conjunction with YukC and YukAB. Proteins homologous to EssA, YukC, EsaA and YukD were absent from mycobacteria [].   Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This highly divergent protein family consists largely of a central region of highly polar low-complexity sequence containing occasional LF motifs in weak repeats about 17 residues in length, flanked by hydrophobic N- and C-terminal regions. 
Probab=29.74  E-value=60  Score=29.59  Aligned_cols=16  Identities=25%  Similarity=0.324  Sum_probs=11.0

Q ss_pred             HHHHHHHHHHhhhhhh
Q 047816          596 VVGLSVFGILFILRRR  611 (620)
Q Consensus       596 ~~~l~~~~~~~~~r~r  611 (620)
                      +|++++.+++.+.||-
T Consensus       128 ~ll~i~~giy~~~r~~  143 (145)
T PF10661_consen  128 ILLAICGGIYVVLRKV  143 (145)
T ss_pred             HHHHHHHHHHHHHHHh
Confidence            5555667778888864


No 129
>TIGR01433 CyoA cytochrome o ubiquinol oxidase subunit II. This enzyme catalyzes the oxidation of ubiquinol with the concomitant reduction of molecular oxygen to water. This acts as the terminal electron acceptor in the respiratory chain. Subunit II is responsible for binding and oxidation of the ubiquinone substrate. This sequence is closely related to QoxA, which oxidizes quinol in gram positive bacteria but which is in complex with subunits which utilize cytochromes a in the reduction of molecular oxygen. Slightly more distantly related is subunit II of cytochrome c oxidase which uses cyt. c as the oxidant.
Probab=29.67  E-value=43  Score=32.99  Aligned_cols=17  Identities=12%  Similarity=0.376  Sum_probs=8.8

Q ss_pred             HHHHHHHHHhhhhhhcc
Q 047816          597 VGLSVFGILFILRRRRQ  613 (620)
Q Consensus       597 ~~l~~~~~~~~~r~r~~  613 (620)
                      +...++.++++||.|++
T Consensus        44 v~v~~~~~~~~~r~r~~   60 (226)
T TIGR01433        44 IPVILMTLFFAWKYRAT   60 (226)
T ss_pred             HHHHHHHheeeEEEecc
Confidence            33344445666666543


No 130
>PF06679 DUF1180:  Protein of unknown function (DUF1180);  InterPro: IPR009565 This entry consists of several hypothetical eukaryotic proteins thought to be membrane proteins. Their function is unknown.
Probab=29.04  E-value=57  Score=30.31  Aligned_cols=7  Identities=14%  Similarity=0.482  Sum_probs=3.7

Q ss_pred             ccccCCC
Q 047816          612 RQSVNSY  618 (620)
Q Consensus       612 ~~~~~~~  618 (620)
                      +|+.+.|
T Consensus       123 ~rktRkY  129 (163)
T PF06679_consen  123 NRKTRKY  129 (163)
T ss_pred             cccceee
Confidence            4555555


No 131
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=29.03  E-value=1.6e+02  Score=25.17  Aligned_cols=64  Identities=14%  Similarity=0.053  Sum_probs=38.2

Q ss_pred             EEEEecCCCc----EEEEEEeCCCCcee-EeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816           88 TRLWIGTPPQ----TFALIVDTGSTVTY-VPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE  162 (620)
Q Consensus        88 ~~i~iGTP~Q----~~~v~vDTGSs~~W-V~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d  162 (620)
                      +++.|..|.|    ++..++|||.+..- ++...-..      -...+..                      ...+.-++
T Consensus         2 ~~v~~~~p~~~~~~~v~~LVDTGat~~~~l~~~~a~~------lgl~~~~----------------------~~~~~tA~   53 (107)
T TIGR03698         2 LDVELSNPKNPEFMEVRALVDTGFSGFLLVPPDIVNK------LGLPELD----------------------QRRVYLAD   53 (107)
T ss_pred             EEEEEeCCCCCCceEEEEEEECCCCeEEecCHHHHHH------cCCCccc----------------------CcEEEecC
Confidence            5677877732    68999999998664 43221110      0111111                      12344556


Q ss_pred             CCceeEEEEEEEEEeCC
Q 047816          163 MSSSSGVLGEDIISFGN  179 (620)
Q Consensus       163 g~~~~G~~~~D~v~lg~  179 (620)
                      |....-....++|.+++
T Consensus        54 G~~~~~~v~~~~v~igg   70 (107)
T TIGR03698        54 GREVLTDVAKASIIING   70 (107)
T ss_pred             CcEEEEEEEEEEEEECC
Confidence            65666677889999998


No 132
>PF14316 DUF4381:  Domain of unknown function (DUF4381)
Probab=28.99  E-value=50  Score=30.05  Aligned_cols=11  Identities=27%  Similarity=0.338  Sum_probs=4.8

Q ss_pred             HHHHhhhhhhc
Q 047816          602 FGILFILRRRR  612 (620)
Q Consensus       602 ~~~~~~~r~r~  612 (620)
                      ++++..+|+++
T Consensus        36 ~~~~~~~r~~~   46 (146)
T PF14316_consen   36 LLLWRLWRRWR   46 (146)
T ss_pred             HHHHHHHHHHH
Confidence            33444444443


No 133
>PF05337 CSF-1:  Macrophage colony stimulating factor-1 (CSF-1);  InterPro: IPR008001 Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose primary function is to regulate the survival, proliferation, differentiation, and function of cells of the mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell surface receptors for CSF-1. The protein has also been linked with male fertility [] and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption [].; GO: 0005125 cytokine activity, 0008083 growth factor activity, 0016021 integral to membrane; PDB: 3EJJ_A.
Probab=28.91  E-value=19  Score=36.12  Aligned_cols=20  Identities=40%  Similarity=0.556  Sum_probs=0.0

Q ss_pred             HHHHHHHHHHhhhhhhcccc
Q 047816          596 VVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       596 ~~~l~~~~~~~~~r~r~~~~  615 (620)
                      +++|++++.++|.|+|||.+
T Consensus       236 ILVLLaVGGLLfYr~rrRs~  255 (285)
T PF05337_consen  236 ILVLLAVGGLLFYRRRRRSH  255 (285)
T ss_dssp             --------------------
T ss_pred             hhhhhhccceeeeccccccc
Confidence            45567777777766665544


No 134
>PF10873 DUF2668:  Protein of unknown function (DUF2668);  InterPro: IPR022640  Members in this family of proteins are annotated as cysteine and tyrosine-rich protein 1, however currently no function is known []. 
Probab=28.88  E-value=33  Score=30.82  Aligned_cols=12  Identities=8%  Similarity=0.019  Sum_probs=9.0

Q ss_pred             chhh-hhHHHHHH
Q 047816          579 WWQE-HFLMVVLA  590 (620)
Q Consensus       579 ~~~~-~~~~i~~~  590 (620)
                      .++. +|+||+.|
T Consensus        57 ~lsgtAIaGIVfg   69 (155)
T PF10873_consen   57 VLSGTAIAGIVFG   69 (155)
T ss_pred             ccccceeeeeehh
Confidence            3445 89999988


No 135
>PF04689 S1FA:  DNA binding protein S1FA;  InterPro: IPR006779  S1FA is an unusual small plant peptide of only 70 amino acids with a basic domain which contains a nuclear localization signal and a putative DNA binding helix. S1FA is highly conserved between dicotyledonous and monocotyledonous plants and may be a DNA-binding protein that specifically recognises the negative promoter element S1F [].; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=28.63  E-value=45  Score=25.49  Aligned_cols=36  Identities=8%  Similarity=0.190  Sum_probs=23.8

Q ss_pred             cccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          576 KRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       576 ~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      ..++++- .||-|++| +..+++.+--++.+..|||.-
T Consensus         6 ~~KGlnPGlIVLlvV~-g~ll~flvGnyvlY~Yaqk~l   42 (69)
T PF04689_consen    6 EAKGLNPGLIVLLVVA-GLLLVFLVGNYVLYVYAQKTL   42 (69)
T ss_pred             cccCCCCCeEEeehHH-HHHHHHHHHHHHHHHHHhhcC
Confidence            3467777 78887776 555555555556778888763


No 136
>TIGR03501 gamma_C_targ gammaproteobacterial enzyme C-terminal transmembrane domain. This homology domain, largely restricted to a subset of the gamma proteobacteria that excludes the enterobacteria, is found at the extreme carboxyl-terminus of a diverse set of proteins, most of which are enzymes with conventional signal sequences and with hydrolytic activities: nucleases, proteases, agarases, etc. Species that have this domain at all typically have from two to fifteen proteins tagged with this domain at the C-terminus. The agarase AgaA from Vibrio sp. strain JT0107 is secreted into the medium, while the same protein heterologously expressed in E. coli is retained in the cell fraction. This suggests cleavage and release in species with this domain. Both this suggestion and the chemical structure of the domain (motif, hydrophobic predicted transmembrane helix, cluster of basic residues) closely parallel that of the LPXTG/sortase system and the PEP-CTERM/exosortase (EpsH) system.
Probab=28.32  E-value=62  Score=20.22  Aligned_cols=13  Identities=31%  Similarity=0.342  Sum_probs=5.1

Q ss_pred             HHHHHHHHhhhhh
Q 047816          598 GLSVFGILFILRR  610 (620)
Q Consensus       598 ~l~~~~~~~~~r~  610 (620)
                      +|+.+..+.++||
T Consensus         9 ~LllL~~~~~rRr   21 (26)
T TIGR03501         9 SLLLLLLLGLRRR   21 (26)
T ss_pred             HHHHHHHHHHHHh
Confidence            3333333444443


No 137
>PF05283 MGC-24:  Multi-glycosylated core protein 24 (MGC-24);  InterPro: IPR007947 CD164 is a mucin-like receptor, or sialomucin, with specificity in receptor/ligand interactions that depends on the structural characteristics of the mucin-like receptor. Its functions include mediating, or regulating, haematopoietic progenitor cell adhesion and the negative regulation of their growth and/or differentiation. It exists in the native state as a disulphide-linked homodimer of two 80-85kDa subunits. It is usually expressed by CD34+ and CD34lo/- haematopoietic stem cells and associated microenvironmental cells. It contains, in its extracellular region, two mucin domains (I and II) linked by a non-mucin domain, which has been predicted to contain intra-disulphide bridges. This receptor may play a key role in haematopoiesis by facilitating the adhesion of human CD34+ cells to bone marrow stroma and by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell proliferation. These effects involve the CD164 class I and/or II epitopes recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These epitopes are carbohydrate-dependent and are located on the N-terminal mucin domain I [, ]. It has been found that murine MGC-24v and rat endolyn share significant sequence similarities with human CD164. However, CD164 lacks the consensus glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible that GAG-association is responsible for the high molecular weight of the epithelial-derived MGC-24 glycoprotein [].  Genomic structure studies have placed CD164 within the mucin-subgroup that comprises multiple exons, and demonstrate the diverse chromosomal distribution of this family of molecules. Molecules with such multiple exons may have sophisticated regulatory mechanisms that involve not only post-translational modifications of the oligosaccharide side chains, but also differential exon usage.
Although differences in the intron and exon sizes are seen between the mouse and human genes, the predicted proteins are similar in size and structure, maintaining functionally important motifs that regulate cell proliferation or subcellular distribution [].  CD164 is a gene whose expression depends on differential usage of polyadenylation sites within the 3'-UTR. The conserved distribution of the 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that (i) a mechanism may exist to regulate tissue-specific polyadenylation, and (ii) differences in polyadenylation are important for the expression and function of CD164 in different tissues. Two other aspects of the structure of CD164 are of particular interest. First, it shares one of several conserved features of a cytokine-binding pocket - in this respect, it is notable that evidence exists for a class of cell-surface sialomucin modulators that directly interact with growth factor receptors to regulate their response to physiological ligands. Second, its cytoplasmic tail contains a C-terminal YHTL motif found in many endocytic membrane proteins or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate the sorting of membrane proteins into transport vesicles from the plasma membrane to the endosomes, and between intracellular compartments.
Probab=27.90  E-value=66  Score=30.61  Aligned_cols=17  Identities=12%  Similarity=0.280  Sum_probs=12.4

Q ss_pred             HHHHHHHHHHHHHHhhh
Q 047816          592 TIMMVVGLSVFGILFIL  608 (620)
Q Consensus       592 ~~~~~~~l~~~~~~~~~  608 (620)
                      +|+|+++|++++.++++
T Consensus       166 GIVL~LGv~aI~ff~~K  182 (186)
T PF05283_consen  166 GIVLTLGVLAIIFFLYK  182 (186)
T ss_pred             HHHHHHHHHHHHHHHhh
Confidence            78888998887655443


No 138
>PF06040 Adeno_E3:  Adenovirus E3 protein;  InterPro: IPR009266 This family consists of several Adenovirus E3 proteins. The E3 protein does not seem to be essential for virus replication in cultured cells suggesting that the protein may function in virus-host interactions [].
Probab=27.78  E-value=62  Score=27.79  Aligned_cols=23  Identities=17%  Similarity=0.332  Sum_probs=15.7

Q ss_pred             cceeeeeeeecCCccccchhhhhHHHHHHHHHHHHHH
Q 047816          562 GNYKLLQWNIEPQVKRTWWQEHFLMVVLAITIMMVVG  598 (620)
Q Consensus       562 G~y~l~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~  598 (620)
                      +|||...+.             ++||++| +.+++++
T Consensus        82 ~p~evvG~l-------------~LGvV~G-G~i~vLc  104 (127)
T PF06040_consen   82 SPWEVVGYL-------------ILGVVAG-GLIAVLC  104 (127)
T ss_pred             CCeeeeehh-------------hHHHHhc-cHHHHHH
Confidence            677666543             7899998 6665553


No 139
>PRK14748 kdpF potassium-transporting ATPase subunit F; Provisional
Probab=27.48  E-value=1.2e+02  Score=19.31  Aligned_cols=21  Identities=14%  Similarity=0.251  Sum_probs=11.0

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhh
Q 047816          586 MVVLAITIMMVVGLSVFGILFIL  608 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~  608 (620)
                      ++++|  +++++.|+....+++.
T Consensus         4 ~vi~G--~ilv~lLlgYLvyALi   24 (29)
T PRK14748          4 GVITG--VLLVFLLLGYLVYALI   24 (29)
T ss_pred             HHHHH--HHHHHHHHHHHHHHHh
Confidence            45555  4555555555555443


No 140
>PF09472 MtrF:  Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF);  InterPro: IPR013347  Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump []. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C-terminal region of what appears to be a fusion of the MtrA and MtrF proteins [, ]. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.; GO: 0030269 tetrahydromethanopterin S-methyltransferase activity, 0015948 methanogenesis, 0016020 membrane
Probab=27.13  E-value=86  Score=24.25  Aligned_cols=30  Identities=10%  Similarity=0.107  Sum_probs=16.9

Q ss_pred             ccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhh
Q 047816          577 RTWWQE-HFLMVVLAITIMMVVGLSVFGILFIL  608 (620)
Q Consensus       577 ~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~  608 (620)
                      .++.+. -+.|.++|  ++++++|..+-.++.|
T Consensus        34 ~SGv~~~~~~GfaiG--~~~AlvLv~ip~~l~~   64 (64)
T PF09472_consen   34 ESGVMATGIKGFAIG--FLFALVLVGIPILLMF   64 (64)
T ss_pred             HHHHhhhhhHHHHHH--HHHHHHHHHHHHHHhC
Confidence            345555 78888887  4444444444444443


No 141
>KOG1094 consensus Discoidin domain receptor DDR1 [Signal transduction mechanisms]
Probab=26.74  E-value=80  Score=35.47  Aligned_cols=13  Identities=38%  Similarity=0.470  Sum_probs=7.3

Q ss_pred             CcEEEEEEeCCCC
Q 047816           96 PQTFALIVDTGST  108 (620)
Q Consensus        96 ~Q~~~v~vDTGSs  108 (620)
                      +|.-++..|-||.
T Consensus        53 ~~~~rl~se~g~G   65 (807)
T KOG1094|consen   53 PQHARLHSEDGSG   65 (807)
T ss_pred             cccccccccCCCc
Confidence            5555566655544


No 142
>PRK15348 type III secretion system lipoprotein SsaJ; Provisional
Probab=26.65  E-value=79  Score=31.63  Aligned_cols=35  Identities=14%  Similarity=0.230  Sum_probs=19.5

Q ss_pred             EEEeecCCCCCCchhHHHHHHhhcc-cccccceEEee
Q 047816          480 MFLSINYSDLRPHIPELADSIAQEL-DVNTSQVHLLN  515 (620)
Q Consensus       480 ~~~~~~~~~~~~~~~~~~~~~~~~l-~~~~~qv~~~~  515 (620)
                      ++++|+ .+..+........+|+.. ++++..|.|..
T Consensus       149 I~~~~~-~~~~~~~v~I~~LVA~SV~gL~~enVTVvd  184 (249)
T PRK15348        149 IKYSPQ-VNMEAFRVKIKDLIEMSIPGLQYSKISILM  184 (249)
T ss_pred             EEeCCC-CChHHHHHHHHHHHHHhcCCCCccceEEEe
Confidence            334555 344444336777777743 45666666654


No 143
>PF13706 PepSY_TM_3:  PepSY-associated TM helix
Probab=25.21  E-value=1.3e+02  Score=20.26  Aligned_cols=22  Identities=18%  Similarity=0.338  Sum_probs=14.1

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHH
Q 047816          583 HFLMVVLAITIMMVVGLSVFGIL  605 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~  605 (620)
                      .++|+++| ...+++.++...+.
T Consensus         9 ~W~Gl~~g-~~l~~~~~tG~~~~   30 (37)
T PF13706_consen    9 RWLGLILG-LLLFVIFLTGAVMV   30 (37)
T ss_pred             HHHHHHHH-HHHHHHHHHhHHHH
Confidence            57889888 66656665554433


No 144
>KOG0860 consensus Synaptobrevin/VAMP-like protein [Intracellular trafficking, secretion, and vesicular transport]
Probab=24.80  E-value=72  Score=27.71  Aligned_cols=15  Identities=20%  Similarity=0.950  Sum_probs=7.9

Q ss_pred             cccchhhhhHHHHHH
Q 047816          576 KRTWWQEHFLMVVLA  590 (620)
Q Consensus       576 ~~~~~~~~~~~i~~~  590 (620)
                      ++-||...-+-+.+|
T Consensus        85 rk~wWkn~Km~~il~   99 (116)
T KOG0860|consen   85 RKMWWKNCKMRIILG   99 (116)
T ss_pred             HHHHHHHHHHHHHHH
Confidence            456777733333344


No 145
>PF14610 DUF4448:  Protein of unknown function (DUF4448)
Probab=24.55  E-value=22  Score=33.96  Aligned_cols=8  Identities=25%  Similarity=0.032  Sum_probs=3.5

Q ss_pred             Ccceeeee
Q 047816          561 FGNYKLLQ  568 (620)
Q Consensus       561 fG~y~l~~  568 (620)
                      -||--.+.
T Consensus       123 ~GP~V~~~  130 (189)
T PF14610_consen  123 KGPTVSLT  130 (189)
T ss_pred             cCCeEEee
Confidence            44444443


No 146
>PF12301 CD99L2:  CD99 antigen like protein 2;  InterPro: IPR022078  This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 237 amino acids in length. CD99L2 and CD99 are involved in trans-endothelial migration of neutrophils in vitro and in the recruitment of neutrophils into inflamed peritoneum. 
Probab=24.04  E-value=77  Score=29.68  Aligned_cols=12  Identities=8%  Similarity=0.122  Sum_probs=6.2

Q ss_pred             CcccccHHHHHH
Q 047816          534 SANYISNATALR  545 (620)
Q Consensus       534 ~~~~f~~~~~~~  545 (620)
                      .+-.|+.++...
T Consensus        68 ~gg~fsD~DL~D   79 (169)
T PF12301_consen   68 GGGGFSDSDLFD   79 (169)
T ss_pred             CCCCcCcccccc
Confidence            344566555444


No 147
>PF13172 PepSY_TM_1:  PepSY-associated TM helix
Probab=24.04  E-value=1.2e+02  Score=20.01  Aligned_cols=26  Identities=19%  Similarity=0.535  Sum_probs=15.1

Q ss_pred             ccchhh--hhHHHHHHHHHHHHHHHHHHH
Q 047816          577 RTWWQE--HFLMVVLAITIMMVVGLSVFG  603 (620)
Q Consensus       577 ~~~~~~--~~~~i~~~~~~~~~~~l~~~~  603 (620)
                      ++.+.+  .++|+..+ ...++++++.+.
T Consensus         2 r~~~~~~H~~~g~~~~-~~ll~~~lTG~~   29 (34)
T PF13172_consen    2 RKFWRKIHRWLGLIAA-IFLLLLALTGAL   29 (34)
T ss_pred             hHHHHHHHHHHHHHHH-HHHHHHHHHHHH
Confidence            344555  46677766 566666665543


No 148
>PRK11486 flagellar biosynthesis protein FliO; Provisional
Probab=23.31  E-value=87  Score=27.70  Aligned_cols=20  Identities=10%  Similarity=0.240  Sum_probs=14.6

Q ss_pred             HHHHHHHHHHHHHHhhhhhh
Q 047816          592 TIMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       592 ~~~~~~~l~~~~~~~~~r~r  611 (620)
                      +..++++|+++..|+++|..
T Consensus        24 ~L~lVl~lI~~~aWLlkR~~   43 (124)
T PRK11486         24 ALIGIIALILAAAWLVKRLG   43 (124)
T ss_pred             HHHHHHHHHHHHHHHHHHcC
Confidence            45667777777788888864


No 149
>PF11615 DUF3249:  Protein of unknown function (DUF3249);  InterPro: IPR021653  This family of proteins represents the gene product of the protein CAF4, the yeast protein YKR036c. This protein contains seven WD40 repeats in its C terminus. The function however is unknown []. ; PDB: 2PQR_D.
Probab=23.08  E-value=56  Score=23.31  Aligned_cols=25  Identities=40%  Similarity=0.723  Sum_probs=12.1

Q ss_pred             CcccccHHHHHHHHHHHccCccCCC
Q 047816          534 SANYISNATALRIISRLAEHRVHIP  558 (620)
Q Consensus       534 ~~~~f~~~~~~~i~~~~~~~~~~~~  558 (620)
                      ..++-...+..||..-|.++++|+|
T Consensus        11 qnnyadsattfrilahldeqryplp   35 (60)
T PF11615_consen   11 QNNYADSATTFRILAHLDEQRYPLP   35 (60)
T ss_dssp             ------HHHHHHHHT---TTTS---
T ss_pred             eccccchhhHHHHHHhhcccccCCC
Confidence            3455667889999999999999987


No 150
>TIGR03063 srtB_target sortase B cell surface sorting signal. Two different classes of sorting signal, both analogous to the sortase A signal LPXTG, may be recognized by the sortase SrtB. These are given as NXZTN and NPKXZ. Proteins sorted by this class of sortase are less common than the sortase A and LPXTG system. This model describes a number of cell surface protein C-terminal regions from Gram-positive bacteria that appear to be sortase B (SrtB) sorting signals.
Probab=22.79  E-value=1e+02  Score=19.80  Aligned_cols=7  Identities=29%  Similarity=0.762  Sum_probs=3.5

Q ss_pred             HHhhhhh
Q 047816          604 ILFILRR  610 (620)
Q Consensus       604 ~~~~~r~  610 (620)
                      .+++||+
T Consensus        22 ~~Li~k~   28 (29)
T TIGR03063        22 LFLIRKR   28 (29)
T ss_pred             HHHhhcc
Confidence            4555443


No 151
>PF01002 Flavi_NS2B:  Flavivirus non-structural protein NS2B;  InterPro: IPR000487 Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All but two are cleaved by the NS2B-NS3 protease complex [, ].; GO: 0004252 serine-type endopeptidase activity, 0019012 virion; PDB: 2WV9_A 2FOM_A 2VBC_B 3U1I_C 3U1J_A 3LKW_A 3L6P_A 2GGV_A 3E90_C 2IJO_A ....
Probab=22.77  E-value=86  Score=27.91  Aligned_cols=27  Identities=7%  Similarity=0.047  Sum_probs=13.1

Q ss_pred             ccccceEEe---eeeec------CCceeeEEEEecC
Q 047816          506 VNTSQVHLL---NFMSK------GNNSFIAWAVFPS  532 (620)
Q Consensus       506 ~~~~qv~~~---~~~~~------g~~~~~~~~~~P~  532 (620)
                      ....|+.+.   +++|+      |...+..+++-..
T Consensus        44 gks~~L~~E~ag~i~W~~ea~~sG~s~rldV~~d~~   79 (128)
T PF01002_consen   44 GKSTDLWLEWAGDISWEEEAEISGGSVRLDVKLDDD   79 (128)
T ss_dssp             ---SSEEEEEEE-S---TTHEEHSEEEEEEEEE-TT
T ss_pred             cccCceEEEEEeccccCccchhcCCceEEEEEECCC
Confidence            345566665   77888      7777777777554


No 152
>PF13179 DUF4006:  Family of unknown function (DUF4006)
Probab=22.54  E-value=1.4e+02  Score=23.24  Aligned_cols=19  Identities=26%  Similarity=0.345  Sum_probs=7.7

Q ss_pred             HHHHHhhhhhhccccCCCC
Q 047816          601 VFGILFILRRRRQSVNSYK  619 (620)
Q Consensus       601 ~~~~~~~~r~r~~~~~~~~  619 (620)
                      .++.+.|.--+......|+
T Consensus        29 ~lt~~ai~~Qq~~At~~Y~   47 (66)
T PF13179_consen   29 FLTYWAIKVQQEQATNPYK   47 (66)
T ss_pred             HHHHHHHHHHHHHhcCCcc
Confidence            3334444333333444443


No 153
>PF11669 WBP-1:  WW domain-binding protein 1;  InterPro: IPR021684  This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain []. 
Probab=22.32  E-value=81  Score=26.86  Aligned_cols=15  Identities=33%  Similarity=0.142  Sum_probs=8.5

Q ss_pred             HHHHHHHHHhhhhhh
Q 047816          597 VGLSVFGILFILRRR  611 (620)
Q Consensus       597 ~~l~~~~~~~~~r~r  611 (620)
                      +.|+.+.++-.||+|
T Consensus        32 ill~c~c~~~~~r~r   46 (102)
T PF11669_consen   32 ILLSCCCACRHRRRR   46 (102)
T ss_pred             HHHHHHHHHHHHHHH
Confidence            334555566666654


No 154
>PF13908 Shisa:  Wnt and FGF inhibitory regulator
Probab=22.32  E-value=41  Score=31.76  Aligned_cols=11  Identities=0%  Similarity=0.184  Sum_probs=7.9

Q ss_pred             hhHHHHHHHHH
Q 047816          583 HFLMVVLAITI  593 (620)
Q Consensus       583 ~~~~i~~~~~~  593 (620)
                      .+++|++||++
T Consensus        76 ~~~~iivgvi~   86 (179)
T PF13908_consen   76 FITGIIVGVIC   86 (179)
T ss_pred             ceeeeeeehhh
Confidence            68889998433


No 155
>PF14991 MLANA:  Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=22.17  E-value=20  Score=30.84  Aligned_cols=18  Identities=28%  Similarity=0.425  Sum_probs=0.0

Q ss_pred             HHHHHHHHHHHHHhhhhh
Q 047816          593 IMMVVGLSVFGILFILRR  610 (620)
Q Consensus       593 ~~~~~~l~~~~~~~~~r~  610 (620)
                      ++++.+|++++-|..+||
T Consensus        33 ~VILgiLLliGCWYckRR   50 (118)
T PF14991_consen   33 IVILGILLLIGCWYCKRR   50 (118)
T ss_dssp             ------------------
T ss_pred             HHHHHHHHHHhheeeeec
Confidence            333334444555555444


No 156
>PF01282 Ribosomal_S24e:  Ribosomal protein S24e;  InterPro: IPR001976 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.  Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. This family contains the S24e ribosomal proteins from eukaryotes and archaebacteria. These proteins have 101 to 148 amino acids.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 2V94_B 1YWX_A 2G1D_A 3IZ6_U 1XN9_A 2XZM_P 2XZN_P 3U5G_Y 3J16_D 3IZB_U ....
Probab=22.15  E-value=3.7e+02  Score=21.92  Aligned_cols=43  Identities=19%  Similarity=0.318  Sum_probs=36.6

Q ss_pred             CCchhHHHHHHhhcccccccceEEeeeeec--CCceeeEEEEecC
Q 047816          490 RPHIPELADSIAQELDVNTSQVHLLNFMSK--GNNSFIAWAVFPS  532 (620)
Q Consensus       490 ~~~~~~~~~~~~~~l~~~~~qv~~~~~~~~--g~~~~~~~~~~P~  532 (620)
                      .|-..+..+-||..|+++..+|.|.++.-+  +........|+-+
T Consensus        12 Tpsr~ei~~klA~~~~~~~~~ivv~~~~t~fG~~~s~g~a~IYd~   56 (84)
T PF01282_consen   12 TPSRKEIREKLAAMLNVDPDLIVVFGIKTEFGGGKSTGFAKIYDS   56 (84)
T ss_dssp             S--HHHHHHHHHHHHTSTGCCEEEEEEEESSSSSEEEEEEEEESS
T ss_pred             CCCHHHHHHHHHHHhCCCCCeEEEeccEecCCCceEEEEEEEeCC
Confidence            577889999999999999999999999988  5678888888875


No 157
>PF02038 ATP1G1_PLM_MAT8:  ATP1G1/PLM/MAT8 family;  InterPro: IPR000272  The FXYD protein family contains at least seven members in mammals []. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable.   Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=22.08  E-value=1.2e+02  Score=22.23  Aligned_cols=25  Identities=24%  Similarity=0.444  Sum_probs=16.1

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816          586 MVVLAITIMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~r~r  611 (620)
                      ..=+| +.+++.+|.+++++++.-+|
T Consensus        13 tLrig-GLi~A~vlfi~Gi~iils~k   37 (50)
T PF02038_consen   13 TLRIG-GLIFAGVLFILGILIILSGK   37 (50)
T ss_dssp             HHHHH-HHHHHHHHHHHHHHHHCTTH
T ss_pred             Hhhcc-chHHHHHHHHHHHHHHHcCc
Confidence            34466 77777777777776665443


No 158
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=21.27  E-value=69  Score=26.64  Aligned_cols=23  Identities=30%  Similarity=0.473  Sum_probs=18.3

Q ss_pred             EEecCCC-cEEEEEEeCCCCceeEeC
Q 047816           90 LWIGTPP-QTFALIVDTGSTVTYVPC  114 (620)
Q Consensus        90 i~iGTP~-Q~~~v~vDTGSs~~WV~~  114 (620)
                      +.|.  + +.+++++|||++..-++-
T Consensus         3 ~~i~--g~~~v~~~vDtGA~vnllp~   26 (93)
T cd05481           3 MKIN--GKQSVKFQLDTGATCNVLPL   26 (93)
T ss_pred             eEeC--CceeEEEEEecCCEEEeccH
Confidence            4444  5 899999999999877764


No 159
>PRK00523 hypothetical protein; Provisional
Probab=21.06  E-value=93  Score=24.59  Aligned_cols=25  Identities=8%  Similarity=-0.012  Sum_probs=11.5

Q ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816          586 MVVLAITIMMVVGLSVFGILFILRRR  611 (620)
Q Consensus       586 ~i~~~~~~~~~~~l~~~~~~~~~r~r  611 (620)
                      +++++ .+++++++-+++.+++-||.
T Consensus         5 ~l~I~-l~i~~li~G~~~Gffiark~   29 (72)
T PRK00523          5 GLALG-LGIPLLIVGGIIGYFVSKKM   29 (72)
T ss_pred             HHHHH-HHHHHHHHHHHHHHHHHHHH
Confidence            44444 23333333445556665554


No 160
>PF02480 Herpes_gE:  Alphaherpesvirus glycoprotein E;  InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation [].; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=20.81  E-value=33  Score=37.45  Aligned_cols=30  Identities=33%  Similarity=0.455  Sum_probs=0.0

Q ss_pred             hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816          583 HFLMVVLAITIMMVVGLSVFGILFILRRRR  612 (620)
Q Consensus       583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~  612 (620)
                      .++++++|++++++++++++++++.+||||
T Consensus       353 ~~l~vVlgvavlivVv~viv~vc~~~rrrR  382 (439)
T PF02480_consen  353 ALLGVVLGVAVLIVVVGVIVWVCLRCRRRR  382 (439)
T ss_dssp             ------------------------------
T ss_pred             chHHHHHHHHHHHHHHHHHhheeeeehhcc
Confidence            566666663444444434433333333333


No 161
>PF03597 CcoS:  Cytochrome oxidase maturation protein cbb3-type;  InterPro: IPR004714 Cytochrome cbb3 oxidases are found almost exclusively in Proteobacteria, and represent a distinctive class of proton-pumping respiratory haem-copper oxidases (HCO) that lack many of the key structural features that contribute to the reaction cycle of the intensely studied mitochondrial cytochrome c oxidase (CcO). Expression of cytochrome cbb3 oxidase allows human pathogens to colonise anoxic tissues and agronomically important diazotrophs to sustain nitrogen fixation []. Genes encoding a cytochrome cbb3 oxidase were initially designated fixNOQP (ccoNOQP); the ccoNOQP operon is always found close to a second gene cluster, known as fixGHIS (ccoGHIS), whose expression is necessary for the assembly of a functional cbb3 oxidase. On the basis of their derived amino acid sequences, each of the four proteins encoded by the ccoGHIS operon is thought to be membrane-bound. It has been suggested that they may function in concert as a multi-subunit complex, possibly playing a role in the uptake and metabolism of copper required for the assembly of the binuclear centre of cytochrome cbb3 oxidase. 
Probab=20.66  E-value=1.9e+02  Score=20.69  Aligned_cols=12  Identities=17%  Similarity=0.492  Sum_probs=4.4

Q ss_pred             HHHHHHHHHHhh
Q 047816          596 VVGLSVFGILFI  607 (620)
Q Consensus       596 ~~~l~~~~~~~~  607 (620)
                      ++++.+++.++|
T Consensus        12 ~l~~~~l~~f~W   23 (45)
T PF03597_consen   12 ILGLIALAAFLW   23 (45)
T ss_pred             HHHHHHHHHHHH
Confidence            333333333333


No 162
>PRK00972 tetrahydromethanopterin S-methyltransferase subunit E; Provisional
Probab=20.53  E-value=82  Score=31.24  Aligned_cols=9  Identities=33%  Similarity=0.608  Sum_probs=5.2

Q ss_pred             hccccCCCC
Q 047816          611 RRQSVNSYK  619 (620)
Q Consensus       611 r~~~~~~~~  619 (620)
                      .|++...||
T Consensus       284 aR~~yGpY~  292 (292)
T PRK00972        284 ARKKYGPYK  292 (292)
T ss_pred             HHhhcCCCC
Confidence            355666664


No 163
>PTZ00208 65 kDa invariant surface glycoprotein; Provisional
Probab=20.29  E-value=92  Score=33.04  Aligned_cols=6  Identities=17%  Similarity=0.169  Sum_probs=3.0

Q ss_pred             cccccc
Q 047816          338 NDICFS  343 (620)
Q Consensus       338 ~~~C~~  343 (620)
                      ...|..
T Consensus       237 ~~~C~~  242 (436)
T PTZ00208        237 DMNCNI  242 (436)
T ss_pred             Cccccc
Confidence            455643


No 164
>PF11118 DUF2627:  Protein of unknown function (DUF2627);  InterPro: IPR020138 This entry represents uncharacterised membrane proteins with no known function.
Probab=20.22  E-value=1.1e+02  Score=24.35  Aligned_cols=27  Identities=11%  Similarity=0.402  Sum_probs=19.0

Q ss_pred             HHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816          588 VLAITIMMVVGLSVFGILFILRRRRQSV  615 (620)
Q Consensus       588 ~~~~~~~~~~~l~~~~~~~~~r~r~~~~  615 (620)
                      ++| .+..++++..++.|.+.|-|+|.+
T Consensus        44 l~G-~~lf~~G~~Fi~GfI~~RDRKrnk   70 (77)
T PF11118_consen   44 LAG-LLLFAIGVGFIAGFILHRDRKRNK   70 (77)
T ss_pred             HHH-HHHHHHHHHHHHhHhheeeccccc
Confidence            344 666788888888888887776543


Done!