Query         044180
Match_columns 130
No_of_seqs    109 out of 268
Neff          6.0 
Searched_HMMs 46136
Date          Fri Mar 29 12:12:34 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/044180.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/044180hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF08284 RVP_2:  Retroviral asp  99.8 5.9E-21 1.3E-25  139.9   6.2   70   29-100    63-132 (135)
  2 cd05479 RP_DDI RP_DDI; retrope  99.5 2.1E-14 4.5E-19  103.1   7.0   68   29-97     57-124 (124)
  3 cd05484 retropepsin_like_LTR_2  98.9 2.4E-09 5.3E-14   72.4   5.7   53   27-81     39-91  (91)
  4 PF02160 Peptidase_A3:  Caulifl  98.2   1E-05 2.2E-10   63.4   8.7   76   27-105    47-124 (201)
  5 cd00303 retropepsin_like Retro  98.2 8.2E-06 1.8E-10   50.3   6.2   54   28-81     39-92  (92)
  6 TIGR03698 clan_AA_DTGF clan AA  97.8 4.4E-05 9.6E-10   53.6   5.6   63   29-95     45-107 (107)
  7 PF09668 Asp_protease:  Asparty  97.6 0.00014   3E-09   53.0   5.1   50   39-89     75-124 (124)
  8 PF00077 RVP:  Retroviral aspar  97.4 0.00028   6E-09   47.8   4.4   55   28-85     43-97  (100)
  9 cd05483 retropepsin_like_bacte  97.3 0.00059 1.3E-08   44.8   4.9   53   27-81     41-96  (96)
 10 PF13650 Asp_protease_2:  Aspar  97.3  0.0011 2.4E-08   43.1   6.1   48   30-79     41-90  (90)
 11 TIGR02281 clan_AA_DTGA clan AA  97.0  0.0035 7.5E-08   44.8   6.7   64   30-95     54-119 (121)
 12 cd05480 NRIP_C NRIP_C; putativ  96.3  0.0098 2.1E-07   42.1   5.0   54   38-92     50-103 (103)
 13 cd06094 RP_Saci_like RP_Saci_l  96.0   0.017 3.7E-07   39.9   5.0   55   27-84     33-88  (89)
 14 cd06095 RP_RTVL_H_like Retrope  95.4   0.043 9.3E-07   36.6   5.0   52   28-81     35-86  (86)
 15 COG5550 Predicted aspartyl pro  94.5   0.077 1.7E-06   38.8   4.6   63   30-96     56-118 (125)
 16 cd05481 retropepsin_like_LTR_1  91.9     0.6 1.3E-05   31.8   5.5   49   27-77     41-89  (93)
 17 KOG0012 DNA damage inducible p  91.1    0.29 6.2E-06   41.7   3.9   62   40-102   287-348 (380)
 18 PF12384 Peptidase_A2B:  Ty3 tr  89.1     1.2 2.5E-05   34.4   5.4   58   27-85     73-131 (177)
 19 COG2383 Uncharacterized conser  71.6    0.72 1.6E-05   32.8  -1.2   20   74-93     51-70  (109)
 20 COG3577 Predicted aspartyl pro  67.4     6.1 0.00013   31.4   3.0   58   29-88    147-206 (215)
 21 PF00098 zf-CCHC:  Zinc knuckle  58.0     3.2 6.9E-05   20.2  -0.1    8   13-20      2-9   (18)
 22 cd06396 PB1_NBR1 The PB1 domai  55.2      37 0.00081   22.9   4.8   35   94-128     2-36  (81)
 23 PF00026 Asp:  Eukaryotic aspar  51.1      37 0.00081   26.5   5.0   31   68-99    286-316 (317)
 24 PRK12442 translation initiatio  50.7      40 0.00086   23.2   4.4   53   27-91     18-72  (87)
 25 KOG2872 Uroporphyrinogen decar  44.6     6.9 0.00015   33.0  -0.2   33   69-112   271-303 (359)
 26 PF05515 Viral_NABP:  Viral nuc  43.4      11 0.00023   27.7   0.7   21    6-26     55-77  (124)
 27 cd06398 PB1_Joka2 The PB1 doma  41.6      88  0.0019   21.3   5.0   36   94-129     2-40  (91)
 28 PF13913 zf-C2HC_2:  zinc-finge  40.9     6.8 0.00015   20.5  -0.5   13   12-24      3-15  (25)
 29 PF04930 FUN14:  FUN14 family;   38.5     6.4 0.00014   27.0  -1.1   24   74-97     31-54  (100)
 30 PF14645 Chibby:  Chibby family  38.3      54  0.0012   23.4   3.7   31   73-108    34-64  (116)
 31 PRK13908 putative recombinatio  37.6      16 0.00035   28.8   0.9   15   12-26    139-163 (204)
 32 PF09040 H-K_ATPase_N:  Gastric  34.8      16 0.00034   21.5   0.4   12    1-12     21-32  (41)
 33 PF14787 zf-CCHC_5:  GAG-polypr  34.8      14  0.0003   21.4   0.1   21   11-31      2-22  (36)
 34 CHL00008 petG cytochrome b6/f   34.4      20 0.00044   20.8   0.8   12    2-13     24-35  (37)
 35 KOG1542 Cysteine proteinase Ca  34.2     9.5 0.00021   32.6  -0.9   42   57-101   261-302 (372)
 36 PF09538 FYDLN_acid:  Protein o  33.1      15 0.00033   26.0   0.1   17   10-26      8-24  (108)
 37 PRK00665 petG cytochrome b6-f   32.3      22 0.00048   20.6   0.7   11    2-12     24-34  (37)
 38 COG2081 Predicted flavoprotein  31.3      73  0.0016   27.7   4.0   35   50-84    144-185 (408)
 39 KOG4584 Uncharacterized conser  30.1      28 0.00061   29.4   1.2   23   56-78    196-218 (348)
 40 PF00622 SPRY:  SPRY domain;  I  29.8      93   0.002   20.6   3.7   21   85-105    71-91  (124)
 41 PF11148 DUF2922:  Protein of u  28.0 1.2E+02  0.0026   19.2   3.7   38   91-128     1-40  (69)
 42 PF00670 AdoHcyase_NAD:  S-aden  27.7 1.3E+02  0.0028   22.8   4.4   51   60-110   101-152 (162)
 43 TIGR00008 infA translation ini  27.7 1.3E+02  0.0029   19.6   3.9   43   27-73     16-60  (68)
 44 cd05477 gastricsin Gastricsins  27.5 2.2E+02  0.0049   22.6   6.1   28   71-99    290-317 (318)
 45 PRK14891 50S ribosomal protein  27.5      42 0.00091   24.8   1.6   34   11-48      4-38  (131)
 46 PF04746 DUF575:  Protein of un  27.0      22 0.00048   24.9   0.1   11   73-83     29-39  (101)
 47 PF08844 DUF1815:  Domain of un  26.4 1.1E+02  0.0025   21.5   3.6   28   11-42     32-60  (105)
 48 KOG3217 Protein tyrosine phosp  25.6      25 0.00055   26.7   0.2   12   67-78     82-93  (159)
 49 cd06097 Aspergillopepsin_like   24.6      95  0.0021   24.2   3.4   26   72-98    252-277 (278)
 50 TIGR02300 FYDLN_acid conserved  24.0      29 0.00063   25.5   0.3   18    9-26      7-24  (129)
 51 PF13975 gag-asp_proteas:  gag-  23.6      64  0.0014   20.4   1.8   20   30-49     51-70  (72)
 52 smart00647 IBR In Between Ring  23.4      34 0.00073   20.5   0.5   14   11-24     48-61  (64)
 53 PF11164 DUF2948:  Protein of u  23.4      97  0.0021   23.0   2.9   36   21-56     92-127 (138)
 54 cd05476 pepsin_A_like_plant Ch  21.6 2.2E+02  0.0048   22.0   4.9   53   48-101   196-264 (265)
 55 PF01684 ET:  ET module;  Inter  21.6 1.3E+02  0.0028   20.3   3.0   22   35-56     12-34  (82)
 56 PF09706 Cas_CXXC_CXXC:  CRISPR  21.5      28 0.00061   22.5  -0.2   17    9-25      3-19  (69)
 57 cd07429 Cby_like Chibby, a nuc  21.3 1.3E+02  0.0029   21.4   3.2   43   62-109    23-65  (108)
 58 TIGR03318 YdfZ_fam putative se  21.3      48   0.001   21.6   0.8   20    6-25     40-60  (65)
 59 PF02529 PetG:  Cytochrome B6-F  21.3      35 0.00077   19.8   0.2   11    2-12     24-34  (37)
 60 PHA01782 hypothetical protein   21.1      39 0.00084   26.0   0.5   33   78-111    74-111 (177)
 61 KOG1370 S-adenosylhomocysteine  20.9 1.1E+02  0.0023   26.4   3.0   73   38-110   270-343 (434)
 62 PF14452 Multi_ubiq:  Multiubiq  20.7 2.3E+02   0.005   17.8   4.2   27   93-122     1-27  (72)
 63 COG3880 Modulator of heat shoc  20.5      63  0.0014   25.0   1.5   33   13-53      2-35  (176)
 64 PF02989 DUF228:  Lyme disease   20.2      71  0.0015   24.8   1.7   82   10-109    67-149 (184)
 65 smart00507 HNHc HNH nucleases.  20.1      53  0.0012   18.2   0.8   16    7-23      7-22  (52)

No 1  
>PF08284 RVP_2:  Retroviral aspartyl protease;  InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases. 
Probab=99.83  E-value=5.9e-21  Score=139.86  Aligned_cols=70  Identities=33%  Similarity=0.593  Sum_probs=64.2

Q ss_pred             CceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccccEEEEEeC
Q 044180           29 MSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEFTYR  100 (130)
Q Consensus        29 ~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~  100 (130)
                      .++.|. ++|+.+.|.+.|++++|.+||++|..||++|++++||||||||||++|+| .|||.+++++|+..
T Consensus        63 ~~~~V~-~~g~~~~~~~~~~~~~~~i~g~~~~~dl~vl~l~~~DvILGm~WL~~~~~-~IDw~~k~v~f~~p  132 (135)
T PF08284_consen   63 RPIVVS-APGGSINCEGVCPDVPLSIQGHEFVVDLLVLDLGGYDVILGMDWLKKHNP-VIDWATKTVTFNSP  132 (135)
T ss_pred             CeeEEe-cccccccccceeeeEEEEECCeEEEeeeEEecccceeeEeccchHHhCCC-EEEccCCEEEEeCC
Confidence            345555 67888999999999999999999999999999999999999999999999 89999999999843


No 2  
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=99.53  E-value=2.1e-14  Score=103.15  Aligned_cols=68  Identities=12%  Similarity=0.133  Sum_probs=61.9

Q ss_pred             CceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccccEEEE
Q 044180           29 MSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEF   97 (130)
Q Consensus        29 ~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f   97 (130)
                      .++.+++++++...+.+.|..++++++|.+|..+|.++|+.++|+|||||||++++ +.|||++++|+|
T Consensus        57 ~~~~~~~~g~g~~~~~g~~~~~~l~i~~~~~~~~~~Vl~~~~~d~ILG~d~L~~~~-~~ID~~~~~i~~  124 (124)
T cd05479          57 KRFQGIAKGVGTQKILGRIHLAQVKIGNLFLPCSFTVLEDDDVDFLIGLDMLKRHQ-CVIDLKENVLRI  124 (124)
T ss_pred             cceEEEEecCCCcEEEeEEEEEEEEECCEEeeeEEEEECCCCcCEEecHHHHHhCC-eEEECCCCEEEC
Confidence            35677888766688899999999999999999999999999999999999999999 599999999975


No 3  
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=98.93  E-value=2.4e-09  Score=72.42  Aligned_cols=53  Identities=15%  Similarity=0.207  Sum_probs=49.5

Q ss_pred             ccCceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchh
Q 044180           27 KIMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLE   81 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~   81 (130)
                      +....+|..|||+.+.+.+.| .+++.++|.+|..+++|++.. +|.|||+|||+
T Consensus        39 ~~~~~~v~~a~G~~~~~~G~~-~~~v~~~~~~~~~~~~v~~~~-~~~lLG~~wl~   91 (91)
T cd05484          39 KPTKKRLRTATGTKLSVLGQI-LVTVKYGGKTKVLTLYVVKNE-GLNLLGRDWLD   91 (91)
T ss_pred             ccccEEEEecCCCEeeEeEEE-EEEEEECCEEEEEEEEEEECC-CCCccChhhcC
Confidence            345689999999999999999 899999999999999999999 99999999985


No 4  
>PF02160 Peptidase_A3:  Cauliflower mosaic virus peptidase (A3);  InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contain an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein []. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco) []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=98.20  E-value=1e-05  Score=63.36  Aligned_cols=76  Identities=16%  Similarity=0.240  Sum_probs=65.5

Q ss_pred             ccCceeEEeeCCCeEeecceeeccEEeecCEEEEceeE-EeecCCcceeecccchhccCceEEeccccEEEEEeCC-eEE
Q 044180           27 KIMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFH-ILDFFGANAVLAVQWLEKLGKIVTDHKALTMEFTYRG-QPI  104 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~-vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~g-~~I  104 (130)
                      ...++.|+.|||+......+|++..++|.|+.|...++ ..+ .|.|+|||+.+|+.+.|. +.|.. +++|+..+ ..+
T Consensus        47 ~~~~i~v~~an~~~~~i~~~~~~~~i~I~~~~F~IP~iYq~~-~g~d~IlG~NF~r~y~Pf-iq~~~-~I~f~~~~~~~~  123 (201)
T PF02160_consen   47 SKKPIKVKGANGSIIQINKKAKNGKIQIADKIFRIPTIYQQE-SGIDIILGNNFLRLYEPF-IQTED-RIQFHKKGLQKV  123 (201)
T ss_pred             CCCcEEEEEecCCceEEEEEecCceEEEccEEEeccEEEEec-CCCCEEecchHHHhcCCc-EEEcc-EEEEEeCCccee
Confidence            34468999999999999999999999999999998776 566 689999999999999994 88864 79999877 444


Q ss_pred             E
Q 044180          105 K  105 (130)
Q Consensus       105 ~  105 (130)
                      .
T Consensus       124 ~  124 (201)
T PF02160_consen  124 E  124 (201)
T ss_pred             e
Confidence            3


No 5  
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic dna-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=98.16  E-value=8.2e-06  Score=50.33  Aligned_cols=54  Identities=28%  Similarity=0.525  Sum_probs=47.5

Q ss_pred             cCceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchh
Q 044180           28 IMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLE   81 (130)
Q Consensus        28 ~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~   81 (130)
                      ..+..+..++|......+.+..+.+.+++..+...+++.+..++|+|||++||.
T Consensus        39 ~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~ilG~~~l~   92 (92)
T cd00303          39 PTPLKVKGANGSSVKTLGVILPVTIGIGGKTFTVDFYVLDLLSYDVILGRPWLE   92 (92)
T ss_pred             CCceEEEecCCCEeccCcEEEEEEEEeCCEEEEEEEEEEcCCCcCEEecccccC
Confidence            345677888888888888888999999999999999999999999999999984


No 6  
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=97.84  E-value=4.4e-05  Score=53.65  Aligned_cols=63  Identities=24%  Similarity=0.388  Sum_probs=47.8

Q ss_pred             CceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccccEE
Q 044180           29 MSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTM   95 (130)
Q Consensus        29 ~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl   95 (130)
                      ...+|..|||.......  .-..+.++|.+....+.+.+... ++.|||.||++++ +.+||++..|
T Consensus        45 ~~~~~~tA~G~~~~~~v--~~~~v~igg~~~~~~v~~~~~~~-~~LLG~~~L~~l~-l~id~~~~~~  107 (107)
T TIGR03698        45 DQRRVYLADGREVLTDV--AKASIIINGLEIDAFVESLGYVD-EPLLGTELLEGLG-IVIDYRNQGL  107 (107)
T ss_pred             cCcEEEecCCcEEEEEE--EEEEEEECCEEEEEEEEecCCCC-ccEecHHHHhhCC-EEEehhhCcC
Confidence            35689999998766553  46677889998855544445545 8999999999997 5899998754


No 7  
>PF09668 Asp_protease:  Aspartyl protease;  InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure.  This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=97.59  E-value=0.00014  Score=52.96  Aligned_cols=50  Identities=18%  Similarity=0.320  Sum_probs=42.5

Q ss_pred             CeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEe
Q 044180           39 EQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTD   89 (130)
Q Consensus        39 ~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD   89 (130)
                      +.-...|....+++++++..|+..|.|+|-...|++||.|||.+|.. .||
T Consensus        75 G~~~i~G~Ih~~~l~ig~~~~~~s~~Vle~~~~d~llGld~L~~~~c-~ID  124 (124)
T PF09668_consen   75 GTQKILGRIHSVQLKIGGLFFPCSFTVLEDQDVDLLLGLDMLKRHKC-CID  124 (124)
T ss_dssp             ---EEEEEEEEEEEEETTEEEEEEEEEETTSSSSEEEEHHHHHHTT--EEE
T ss_pred             CcCceeEEEEEEEEEECCEEEEEEEEEeCCCCcceeeeHHHHHHhCc-ccC
Confidence            55567778999999999999999999999889999999999999976 565


No 8  
>PF00077 RVP:  Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026;  InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=97.40  E-value=0.00028  Score=47.76  Aligned_cols=55  Identities=25%  Similarity=0.392  Sum_probs=45.3

Q ss_pred             cCceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCc
Q 044180           28 IMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGK   85 (130)
Q Consensus        28 ~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~   85 (130)
                      .....|.-++|.. ...+. ..+.+.+.+..|...+++.+-..+| |||.|||.+++.
T Consensus        43 ~~~~~v~~~~g~~-~~~~~-~~~~v~~~~~~~~~~~~v~~~~~~~-ILG~D~L~~~~~   97 (100)
T PF00077_consen   43 KTSITVRGAGGSS-SILGS-TTVEVKIGGKEFNHTFLVVPDLPMN-ILGRDFLKKLNA   97 (100)
T ss_dssp             EEEEEEEETTEEE-EEEEE-EEEEEEETTEEEEEEEEESSTCSSE-EEEHHHHTTTTC
T ss_pred             cCCceeccCCCcc-eeeeE-EEEEEEEECccceEEEEecCCCCCC-EeChhHHHHcCC
Confidence            3445666677776 66665 5889999999999999999988888 999999999986


No 9  
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.29  E-value=0.00059  Score=44.83  Aligned_cols=53  Identities=13%  Similarity=0.150  Sum_probs=43.5

Q ss_pred             ccCceeEEeeCCCeEeecceeeccEEeecCEEEE-ceeEEeecCC--cceeecccchh
Q 044180           27 KIMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFE-ADFHILDFFG--ANAVLAVQWLE   81 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~-~dl~vL~l~g--~DvILGmdWL~   81 (130)
                      .....++..++|........  -.++++++..|. ..+.++|...  .|.|||+|||.
T Consensus        41 ~~~~~~~~~~~G~~~~~~~~--~~~i~ig~~~~~~~~~~v~d~~~~~~~gIlG~d~l~   96 (96)
T cd05483          41 LGGKVTVQTANGRVRAARVR--LDSLQIGGITLRNVPAVVLPGDALGVDGLLGMDFLR   96 (96)
T ss_pred             CCCcEEEEecCCCccceEEE--cceEEECCcEEeccEEEEeCCcccCCceEeChHHhC
Confidence            34567888899998877666  446789999998 5899999997  99999999984


No 10 
>PF13650 Asp_protease_2:  Aspartyl protease
Probab=97.28  E-value=0.0011  Score=43.11  Aligned_cols=48  Identities=15%  Similarity=0.305  Sum_probs=38.5

Q ss_pred             ceeEEeeCCCeEeecceeeccEEeecCEEE-EceeEEee-cCCcceeecccc
Q 044180           30 SFLVDVSNGEQIRSEGHCSKVKFEMQGVEF-EADFHILD-FFGANAVLAVQW   79 (130)
Q Consensus        30 ~~~V~vanG~~l~~~~~c~~~~~~iqg~~f-~~dl~vL~-l~g~DvILGmdW   79 (130)
                      +..+..++|.........+  ++++++..+ ..++.+.+ -..+|.||||||
T Consensus        41 ~~~~~~~~g~~~~~~~~~~--~i~ig~~~~~~~~~~v~~~~~~~~~iLG~df   90 (90)
T PF13650_consen   41 PISVSGAGGSVTVYRGRVD--SITIGGITLKNVPFLVVDLGDPIDGILGMDF   90 (90)
T ss_pred             eEEEEeCCCCEEEEEEEEE--EEEECCEEEEeEEEEEECCCCCCEEEeCCcC
Confidence            4778889998555444544  788999998 67888999 678999999998


No 11 
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=96.97  E-value=0.0035  Score=44.84  Aligned_cols=64  Identities=14%  Similarity=0.280  Sum_probs=48.3

Q ss_pred             ceeEEeeCCCeEeecceeeccEEeecCEEEE-ceeEEeecCC-cceeecccchhccCceEEeccccEE
Q 044180           30 SFLVDVSNGEQIRSEGHCSKVKFEMQGVEFE-ADFHILDFFG-ANAVLAVQWLEKLGKIVTDHKALTM   95 (130)
Q Consensus        30 ~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~-~dl~vL~l~g-~DvILGmdWL~~~g~i~iD~~~~tl   95 (130)
                      +..+..|||.....  ...--++++.+..+. .++.++|.+. .|.+||||+|.++.++.+|-.+.+|
T Consensus        54 ~~~~~ta~G~~~~~--~~~l~~l~iG~~~~~nv~~~v~~~~~~~~~LLGm~fL~~~~~~~~~~~~l~l  119 (121)
T TIGR02281        54 TVTVSTANGQIKAA--RVTLDRVAIGGIVVNDVDAMVAEGGALSESLLGMSFLNRLSRFTVRGGKLIL  119 (121)
T ss_pred             eEEEEeCCCcEEEE--EEEeCEEEECCEEEeCcEEEEeCCCcCCceEcCHHHHhccccEEEECCEEEE
Confidence            56788899976432  335556789999988 7888999874 5899999999999877777655444


No 12 
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=96.32  E-value=0.0098  Score=42.11  Aligned_cols=54  Identities=17%  Similarity=0.225  Sum_probs=47.4

Q ss_pred             CCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccc
Q 044180           38 GEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKA   92 (130)
Q Consensus        38 G~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~   92 (130)
                      |....-.|....++++|++..++-.|.|||-...|++||.|=|.+|.. .||.++
T Consensus        50 gt~~kiiGrih~~~ikig~~~~~CSftVld~~~~d~llGLdmLkrhqc-~IdL~k  103 (103)
T cd05480          50 PTSVKVIGQIERLVLQLGQLTVECSAQVVDDNEKNFSLGLQTLKSLKC-VINLEK  103 (103)
T ss_pred             CcceeEeeEEEEEEEEeCCEEeeEEEEEEcCCCcceEeeHHHHhhcce-eeeccC
Confidence            444567788899999999999999999999999999999999999977 788653


No 13 
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=96.03  E-value=0.017  Score=39.93  Aligned_cols=55  Identities=20%  Similarity=0.391  Sum_probs=47.6

Q ss_pred             ccCceeEEeeCCCeEeecceeeccEEeecCE-EEEceeEEeecCCcceeecccchhccC
Q 044180           27 KIMSFLVDVSNGEQIRSEGHCSKVKFEMQGV-EFEADFHILDFFGANAVLAVQWLEKLG   84 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~~~~c~~~~~~iqg~-~f~~dl~vL~l~g~DvILGmdWL~~~g   84 (130)
                      +..++.++.|||..|..-|. ..+.+.++.. .|.-+|.|=|+.  .-|||+|.|..|+
T Consensus        33 ~~~~~~l~AANgt~I~tyG~-~~l~ldlGlrr~~~w~FvvAdv~--~pIlGaDfL~~~~   88 (89)
T cd06094          33 KPSPLTLQAANGTPIATYGT-RSLTLDLGLRRPFAWNFVVADVP--HPILGADFLQHYG   88 (89)
T ss_pred             cCCceEEEeCCCCeEeeeee-EEEEEEcCCCcEEeEEEEEcCCC--cceecHHHHHHcC
Confidence            45678999999999999995 8888888875 999999998886  4799999999886


No 14 
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where 
Probab=95.43  E-value=0.043  Score=36.57  Aligned_cols=52  Identities=13%  Similarity=0.206  Sum_probs=36.5

Q ss_pred             cCceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchh
Q 044180           28 IMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLE   81 (130)
Q Consensus        28 ~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~   81 (130)
                      ..+..|.-++|..-.......+ .+.+.+.++..++.+.+- ..|.|||||+|.
T Consensus        35 ~~~~~v~gagG~~~~~v~~~~~-~v~vg~~~~~~~~~v~~~-~~~~lLG~dfL~   86 (86)
T cd06095          35 TTSVLIRGVSGQSQQPVTTYRT-LVDLGGHTVSHSFLVVPN-CPDPLLGRDLLS   86 (86)
T ss_pred             CCcEEEEeCCCcccccEEEeee-EEEECCEEEEEEEEEEcC-CCCcEechhhcC
Confidence            3467788888886111111222 688999999998888773 469999999984


No 15 
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=94.51  E-value=0.077  Score=38.79  Aligned_cols=63  Identities=21%  Similarity=0.295  Sum_probs=49.9

Q ss_pred             ceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccccEEE
Q 044180           30 SFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTME   96 (130)
Q Consensus        30 ~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~   96 (130)
                      ..++..++|+.+.+.  .....++++|.+..+-....+....+ ++|++||+.+|- .+|.+.-.++
T Consensus        56 ~~~~~~a~~~~v~t~--V~~~~iki~g~e~~~~Vl~s~~~~~~-liG~~~lk~l~~-~vn~~~g~LE  118 (125)
T COG5550          56 TIRIVLADGGVVKTS--VALATIKIDGVEKVAFVLASDNLPEP-LIGVNLLKLLGL-VVNPKTGKLE  118 (125)
T ss_pred             ChhhhhhcCCEEEEE--EEEEEEEECCEEEEEEEEccCCCccc-chhhhhhhhccE-EEcCCcceEe
Confidence            456778888876654  34667789999998888888888888 999999999976 8998665554


No 16 
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=91.90  E-value=0.6  Score=31.78  Aligned_cols=49  Identities=18%  Similarity=0.314  Sum_probs=41.6

Q ss_pred             ccCceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecc
Q 044180           27 KIMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAV   77 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGm   77 (130)
                      +.+++.++.+||..+...|. ..+++.+++..+..+|+|+|..+-. |||.
T Consensus        41 ~~t~~~L~~~~g~~~~~~G~-~~~~v~~~~~~~~~~f~Vvd~~~~~-lLG~   89 (93)
T cd05481          41 RPSPVRLTAYGGSTIPVEGG-VKLKCRYRNPKYNLTFQVVKEEGPP-LLGA   89 (93)
T ss_pred             ccCCeEEEeeCCCEeeeeEE-EEEEEEECCcEEEEEEEEECCCCCc-eEcc
Confidence            45678999999999999999 6899999999999999999986544 4454


No 17 
>KOG0012 consensus DNA damage inducible protein [Replication, recombination and repair]
Probab=91.13  E-value=0.29  Score=41.69  Aligned_cols=62  Identities=13%  Similarity=0.212  Sum_probs=56.3

Q ss_pred             eEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccccEEEEEeCCe
Q 044180           40 QIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEFTYRGQ  102 (130)
Q Consensus        40 ~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~g~  102 (130)
                      ...-.|.+..++++|+...++-.|-|++-...|+-||.|=|.+|+. -||.++..|.|...+.
T Consensus       287 ~~ki~g~Ih~~~lki~~~~l~c~ftV~d~~~~d~llGLd~Lrr~~c-cIdL~~~~L~ig~~~t  348 (380)
T KOG0012|consen  287 TEKILGRIHQAQLKIEDLYLPCSFTVLDRRDMDLLLGLDMLRRHQC-CIDLKTNVLRIGNTET  348 (380)
T ss_pred             cccccceeEEEEEEeccEeeccceEEecCCCcchhhhHHHHHhccc-eeecccCeEEecCCCc
Confidence            4555778899999999999999999999999999999999999998 7999999999876655


No 18 
>PF12384 Peptidase_A2B:  Ty3 transposon peptidase;  InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=89.07  E-value=1.2  Score=34.37  Aligned_cols=58  Identities=10%  Similarity=0.129  Sum_probs=45.5

Q ss_pred             ccCceeEEeeC-CCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCc
Q 044180           27 KIMSFLVDVSN-GEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGK   85 (130)
Q Consensus        27 ~~~~~~V~van-G~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~   85 (130)
                      .+++++++-+- ++...|... ..+++.+++..|...++|+|--++|+|+|..-|.++..
T Consensus        73 ~app~~fRG~vs~~~~~tsEA-v~ld~~i~n~~i~i~aYV~d~m~~dlIIGnPiL~ryp~  131 (177)
T PF12384_consen   73 DAPPFRFRGFVSGESATTSEA-VTLDFYIDNKLIDIAAYVTDNMDHDLIIGNPILDRYPT  131 (177)
T ss_pred             cCCCEEEeeeccCCceEEEEe-EEEEEEECCeEEEEEEEEeccCCcceEeccHHHhhhHH
Confidence            46777777333 333444443 67899999999999999999999999999999988763


No 19 
>COG2383 Uncharacterized conserved protein [Function unknown]
Probab=71.57  E-value=0.72  Score=32.76  Aligned_cols=20  Identities=35%  Similarity=0.740  Sum_probs=17.7

Q ss_pred             eecccchhccCceEEecccc
Q 044180           74 VLAVQWLEKLGKIVTDHKAL   93 (130)
Q Consensus        74 ILGmdWL~~~g~i~iD~~~~   93 (130)
                      ||+..||+++|-|++||.+.
T Consensus        51 ilsl~~La~~GVItin~~al   70 (109)
T COG2383          51 ILSLFWLAQYGVITINWEAL   70 (109)
T ss_pred             HHHHHHHHHcCeEEEcHHHH
Confidence            57889999999999999764


No 20 
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=67.37  E-value=6.1  Score=31.40  Aligned_cols=58  Identities=16%  Similarity=0.295  Sum_probs=41.3

Q ss_pred             CceeEEeeCCCeEeecceeeccEEeecCEEEE-ceeEEeecCCcc-eeecccchhccCceEE
Q 044180           29 MSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFE-ADFHILDFFGAN-AVLAVQWLEKLGKIVT   88 (130)
Q Consensus        29 ~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~-~dl~vL~l~g~D-vILGmdWL~~~g~i~i   88 (130)
                      -++.|+.|||..-...-....+  .|.+.++. .+-+|++-+..| .-|||.+|.+++...+
T Consensus       147 y~~~v~TANG~~~AA~V~Ld~v--~IG~I~~~nV~A~V~~~g~L~~sLLGMSfL~rL~~fq~  206 (215)
T COG3577         147 YTITVSTANGRARAAPVTLDRV--QIGGIRVKNVDAMVAEDGALDESLLGMSFLNRLSGFQV  206 (215)
T ss_pred             CceEEEccCCccccceEEeeeE--EEccEEEcCchhheecCCccchhhhhHHHHhhccceEe
Confidence            3667888999875555444443  57777776 377888888665 4579999999987433


No 21 
>PF00098 zf-CCHC:  Zinc knuckle;  InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:  C-X2-C-X4-H-X4-C  where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses. It is required for viral genome packaging and for the early infection process [, , ]. 
It is also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=58.04  E-value=3.2  Score=20.20  Aligned_cols=8  Identities=63%  Similarity=1.219  Sum_probs=6.9

Q ss_pred             ceEecCCC
Q 044180           13 LRFNCDEP   20 (130)
Q Consensus        13 lcf~cde~   20 (130)
                      .||+|++.
T Consensus         2 ~C~~C~~~    9 (18)
T PF00098_consen    2 KCFNCGEP    9 (18)
T ss_dssp             BCTTTSCS
T ss_pred             cCcCCCCc
Confidence            69999995


No 22 
>cd06396 PB1_NBR1 The PB1 domain is an essential part of NBR1 protein, next to BRCA1, a scaffold protein mediating specific protein-protein interaction with both titin protein kinase and with another scaffold protein p62. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The NBR1 protein contains a type I PB1 domain.
Probab=55.20  E-value=37  Score=22.92  Aligned_cols=35  Identities=11%  Similarity=0.186  Sum_probs=30.0

Q ss_pred             EEEEEeCCeEEEEEeccCCCCChhhHHHHHhcccC
Q 044180           94 TMEFTYRGQPIKLVGAQNIRPKPTQSIHLQSRIFD  128 (130)
Q Consensus        94 tl~f~~~g~~I~l~G~~~~~is~~Q~~~l~~~~~~  128 (130)
                      +++.+++|..+++.=.++...+-.++..++.+.|.
T Consensus         2 ~vKaty~~d~~rf~~~~~~~~~~~~L~~ev~~rf~   36 (81)
T cd06396           2 NLKVTYNGESQSFLVSDSENTTWASVEAMVKVSFG   36 (81)
T ss_pred             EEEEEECCeEEEEEecCCCCCCHHHHHHHHHHHhC
Confidence            57778999999998888777888999999999886


No 23 
>PF00026 Asp:  Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.;  InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).  More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions []. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site []. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. 
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event []. This family does not include the retroviral nor retrotransposon aspartic proteases which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=51.08  E-value=37  Score=26.45  Aligned_cols=31  Identities=16%  Similarity=0.090  Sum_probs=25.9

Q ss_pred             cCCcceeecccchhccCceEEeccccEEEEEe
Q 044180           68 FFGANAVLAVQWLEKLGKIVTDHKALTMEFTY   99 (130)
Q Consensus        68 l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~   99 (130)
                      -.....|||+.+|+.+=- ..|+.++++.|..
T Consensus       286 ~~~~~~iLG~~fl~~~y~-vfD~~~~~ig~A~  316 (317)
T PF00026_consen  286 DDSDDWILGSPFLRNYYV-VFDYENNRIGFAQ  316 (317)
T ss_dssp             TSSSEEEEEHHHHTTEEE-EEETTTTEEEEEE
T ss_pred             ccCCceEecHHHhhceEE-EEeCCCCEEEEec
Confidence            346688999999999854 8999999999864


No 24 
>PRK12442 translation initiation factor IF-1; Reviewed
Probab=50.73  E-value=40  Score=23.20  Aligned_cols=53  Identities=23%  Similarity=0.328  Sum_probs=35.9

Q ss_pred             ccCceeEEeeCCCeEee--cceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEecc
Q 044180           27 KIMSFLVDVSNGEQIRS--EGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHK   91 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~--~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~   91 (130)
                      +...|+|.+.||..+.|  +|....-.++|    ..-|-+.+++..||.-        -|.|.+-++
T Consensus        18 p~~~frV~LenG~~vla~isGKmR~~rIrI----l~GD~V~VE~spYDlt--------kGRIiyR~~   72 (87)
T PRK12442         18 PDSRFRVTLENGVEVGAYASGRMRKHRIRI----LAGDRVTLELSPYDLT--------KGRINFRHK   72 (87)
T ss_pred             CCCEEEEEeCCCCEEEEEeccceeeeeEEe----cCCCEEEEEECcccCC--------ceeEEEEec
Confidence            45679999999999776  45555444444    3557788888888853        456555443


No 25 
>KOG2872 consensus Uroporphyrinogen decarboxylase [Coenzyme transport and metabolism]
Probab=44.55  E-value=6.9  Score=32.96  Aligned_cols=33  Identities=21%  Similarity=0.400  Sum_probs=22.3

Q ss_pred             CCcceeecccchhccCceEEeccccEEEEEeCCeEEEEEeccCC
Q 044180           69 FGANAVLAVQWLEKLGKIVTDHKALTMEFTYRGQPIKLVGAQNI  112 (130)
Q Consensus        69 ~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~g~~I~l~G~~~~  112 (130)
                      .||||| |.||       ++|=++-.   ...|+.|++||+-+-
T Consensus       271 tG~DVv-gLDW-------Tvdp~ear---~~~g~~VtlQGNlDP  303 (359)
T KOG2872|consen  271 TGYDVV-GLDW-------TVDPAEAR---RRVGNRVTLQGNLDP  303 (359)
T ss_pred             cCCcEE-eecc-------cccHHHHH---HhhCCceEEecCCCh
Confidence            489987 8899       55543321   346778999998743


No 26 
>PF05515 Viral_NABP:  Viral nucleic acid binding ;  InterPro: IPR008891 This family is common to ssRNA positive-strand viruses; its members are commonly described as nucleic acid binding proteins (NABP).
Probab=43.40  E-value=11  Score=27.67  Aligned_cols=21  Identities=19%  Similarity=0.243  Sum_probs=14.3

Q ss_pred             hHHHhc--cceEecCCCCCcccc
Q 044180            6 QICRAQ--GLRFNCDEPFNPAIE   26 (130)
Q Consensus         6 ~~rr~~--Glcf~cde~f~p~h~   26 (130)
                      +.|||+  |.||.|+.--.-+|.
T Consensus        55 ~KRRAkR~~~C~~CG~~l~~~~~   77 (124)
T PF05515_consen   55 AKRRAKRYNRCFKCGRYLHNNGN   77 (124)
T ss_pred             HHHHHHHhCccccccceeecCCc
Confidence            457777  899999983333443


No 27 
>cd06398 PB1_Joka2 The PB1 domain is present in the Nicotiana plumbaginifolia Joka2 protein which interacts with sulfur stress inducible UP9 protein. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module
Probab=41.59  E-value=88  Score=21.27  Aligned_cols=36  Identities=8%  Similarity=0.080  Sum_probs=28.2

Q ss_pred             EEEEEeCCeEEEEEeccC---CCCChhhHHHHHhcccCC
Q 044180           94 TMEFTYRGQPIKLVGAQN---IRPKPTQSIHLQSRIFDS  129 (130)
Q Consensus        94 tl~f~~~g~~I~l~G~~~---~~is~~Q~~~l~~~~~~~  129 (130)
                      +++.+++|..+++.-...   ..++..+|+..+.+.|.-
T Consensus         2 ~vKv~y~~~~rRf~l~~~~~~~d~~~~~L~~kI~~~f~l   40 (91)
T cd06398           2 VVKVKYGGTLRRFTFPVAENQLDLNMDGLREKVEELFSL   40 (91)
T ss_pred             EEEEEeCCEEEEEEeccccccCCCCHHHHHHHHHHHhCC
Confidence            567788998877776653   468999999999988853


No 28 
>PF13913 zf-C2HC_2:  zinc-finger of a C2HC-type
Probab=40.91  E-value=6.8  Score=20.45  Aligned_cols=13  Identities=31%  Similarity=0.388  Sum_probs=10.7

Q ss_pred             cceEecCCCCCcc
Q 044180           12 GLRFNCDEPFNPA   24 (130)
Q Consensus        12 Glcf~cde~f~p~   24 (130)
                      -.|..|+.+|.|+
T Consensus         3 ~~C~~CgR~F~~~   15 (25)
T PF13913_consen    3 VPCPICGRKFNPD   15 (25)
T ss_pred             CcCCCCCCEECHH
Confidence            3589999999875


No 29 
>PF04930 FUN14:  FUN14 family;  InterPro: IPR007014 This is a family of short proteins found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices.
Probab=38.48  E-value=6.4  Score=27.05  Aligned_cols=24  Identities=25%  Similarity=0.378  Sum_probs=19.6

Q ss_pred             eecccchhccCceEEeccccEEEE
Q 044180           74 VLAVQWLEKLGKIVTDHKALTMEF   97 (130)
Q Consensus        74 ILGmdWL~~~g~i~iD~~~~tl~f   97 (130)
                      ++.++||+..|=|.+||.+.+-..
T Consensus        31 ~l~lq~l~~~G~i~Vnw~kl~~~~   54 (100)
T PF04930_consen   31 FLLLQYLASKGYIKVNWDKLEKDV   54 (100)
T ss_pred             HHHHHHHHHCCeEEECHHHHHHHH
Confidence            456799999999999999865443


No 30 
>PF14645 Chibby:  Chibby family
Probab=38.29  E-value=54  Score=23.39  Aligned_cols=31  Identities=26%  Similarity=0.251  Sum_probs=24.9

Q ss_pred             eeecccchhccCceEEeccccEEEEEeCCeEEEEEe
Q 044180           73 AVLAVQWLEKLGKIVTDHKALTMEFTYRGQPIKLVG  108 (130)
Q Consensus        73 vILGmdWL~~~g~i~iD~~~~tl~f~~~g~~I~l~G  108 (130)
                      .=||+|+    ||+.++....++.| .+|+|+.-.|
T Consensus        34 ~el~ld~----~p~~l~Lg~~~l~F-~dG~W~~e~~   64 (116)
T PF14645_consen   34 AELGLDY----GPPRLNLGDQTLVF-EDGQWTSESG   64 (116)
T ss_pred             ccccccc----CCceEeECCeEEEE-ECCEEeccCC
Confidence            3467774    89999999999999 8999994444


No 31 
>PRK13908 putative recombination protein RecO; Provisional
Probab=37.61  E-value=16  Score=28.82  Aligned_cols=15  Identities=47%  Similarity=0.665  Sum_probs=12.9

Q ss_pred             cceEecCCC----------CCcccc
Q 044180           12 GLRFNCDEP----------FNPAIE   26 (130)
Q Consensus        12 Glcf~cde~----------f~p~h~   26 (130)
                      ..||-|||+          |-|+|.
T Consensus       139 ~~Cf~Ce~~i~~~iaL~RaflpaH~  163 (204)
T PRK13908        139 FICFLCDEKIENEIALARAFLPAHP  163 (204)
T ss_pred             CeEEecCCccccchHHHHhhcccCh
Confidence            579999998          888886


No 32 
>PF09040 H-K_ATPase_N:  Gastric H+/K+-ATPase, N terminal domain;  InterPro: IPR015127 ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport [, ]. The different types include:   F-ATPases (F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotic vacuoles and catalyse ATP hydrolysis to transport solutes and lower pH in organelles. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases (though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases). P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.   P-ATPases (sometime known as E1-E2 ATPases) (3.6.3.- from EC) are found in bacteria and in a number of eukaryotic plasma membranes and organelles []. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. 
There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. This entry represents the N-terminal domain found in gastric H+/K+-transporter ATPases. This domain adopts an alpha-helical conformation under hydrophobic conditions. The domain contains tyrosine residues, phosphorylation of which regulates the function of the ATPase. Additionally, the domain also interacts with various structural proteins, including the spectrin-binding domain of ankyrin III [].  More information about this protein can be found at Protein of the Month: ATP Synthases [].; GO: 0000287 magnesium ion binding, 0005524 ATP binding, 0008900 hydrogen:potassium-exchanging ATPase activity, 0015991 ATP hydrolysis coupled proton transport, 0016020 membrane; PDB: 1IWF_A 1IWC_A.
Probab=34.85  E-value=16  Score=21.47  Aligned_cols=12  Identities=50%  Similarity=0.573  Sum_probs=8.7

Q ss_pred             ChhhhhHHHhcc
Q 044180            1 MAAEMQICRAQG   12 (130)
Q Consensus         1 ~~a~~~~rr~~G   12 (130)
                      |+|.||.+++.|
T Consensus        21 m~AK~~kkk~~~   32 (41)
T PF09040_consen   21 MAAKMSKKKAGG   32 (41)
T ss_dssp             SSHCCHHHT-S-
T ss_pred             HHHHHhhhhccC
Confidence            789999888775


No 33 
>PF14787 zf-CCHC_5:  GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=34.82  E-value=14  Score=21.43  Aligned_cols=21  Identities=29%  Similarity=0.233  Sum_probs=10.8

Q ss_pred             ccceEecCCCCCccccccCce
Q 044180           11 QGLRFNCDEPFNPAIEKIMSF   31 (130)
Q Consensus        11 ~Glcf~cde~f~p~h~~~~~~   31 (130)
                      .++|++|+.-|.=+.++....
T Consensus         2 ~~~CprC~kg~Hwa~~C~sk~   22 (36)
T PF14787_consen    2 PGLCPRCGKGFHWASECRSKT   22 (36)
T ss_dssp             --C-TTTSSSCS-TTT---TC
T ss_pred             CccCcccCCCcchhhhhhhhh
Confidence            589999999877666644433


No 34 
>CHL00008 petG cytochrome b6/f complex subunit V
Probab=34.39  E-value=20  Score=20.80  Aligned_cols=12  Identities=33%  Similarity=0.335  Sum_probs=9.7

Q ss_pred             hhhhhHHHhccc
Q 044180            2 AAEMQICRAQGL   13 (130)
Q Consensus         2 ~a~~~~rr~~Gl   13 (130)
                      +|.+||||..-+
T Consensus        24 aAylQYrRg~~l   35 (37)
T CHL00008         24 TAYLQYRRGDQL   35 (37)
T ss_pred             HHHHHHhhcccc
Confidence            799999997643


No 35 
>KOG1542 consensus Cysteine proteinase Cathepsin F [Posttranslational modification, protein turnover, chaperones]
Probab=34.15  E-value=9.5  Score=32.61  Aligned_cols=42  Identities=26%  Similarity=0.373  Sum_probs=32.8

Q ss_pred             EEEEceeEEeecCCcceeecccchhccCceEEeccccEEEEEeCC
Q 044180           57 VEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEFTYRG  101 (130)
Q Consensus        57 ~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~g  101 (130)
                      ..+..||..|+..--+|+   .||.++||+++=-....|.|..+|
T Consensus       261 ~v~I~~f~~l~~nE~~ia---~wLv~~GPi~vgiNa~~mQ~YrgG  302 (372)
T KOG1542|consen  261 VVSIKDFSMLSNNEDQIA---AWLVTFGPLSVGINAKPMQFYRGG  302 (372)
T ss_pred             eEEEeccEecCCCHHHHH---HHHHhcCCeEEEEchHHHHHhccc
Confidence            455667888888655665   999999999887778888877666


No 36 
>PF09538 FYDLN_acid:  Protein of unknown function (FYDLN_acid);  InterPro: IPR012644 Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=33.11  E-value=15  Score=25.98  Aligned_cols=17  Identities=12%  Similarity=0.155  Sum_probs=13.6

Q ss_pred             hccceEecCCCCCcccc
Q 044180           10 AQGLRFNCDEPFNPAIE   26 (130)
Q Consensus        10 ~~Glcf~cde~f~p~h~   26 (130)
                      .|.+|..|+.+|--..+
T Consensus         8 tKR~Cp~CG~kFYDLnk   24 (108)
T PF09538_consen    8 TKRTCPSCGAKFYDLNK   24 (108)
T ss_pred             CcccCCCCcchhccCCC
Confidence            47899999999966554


No 37 
>PRK00665 petG cytochrome b6-f complex subunit PetG; Reviewed
Probab=32.34  E-value=22  Score=20.64  Aligned_cols=11  Identities=36%  Similarity=0.075  Sum_probs=9.1

Q ss_pred             hhhhhHHHhcc
Q 044180            2 AAEMQICRAQG   12 (130)
Q Consensus         2 ~a~~~~rr~~G   12 (130)
                      +|.+||||..-
T Consensus        24 aAylQYrRg~~   34 (37)
T PRK00665         24 AAWNQYKRGNQ   34 (37)
T ss_pred             HHHHHHhcccc
Confidence            79999999654


No 38 
>COG2081 Predicted flavoproteins [General function prediction only]
Probab=31.26  E-value=73  Score=27.75  Aligned_cols=35  Identities=14%  Similarity=0.160  Sum_probs=24.7

Q ss_pred             cEEeecCE-EEEceeEEeecCCccee------ecccchhccC
Q 044180           50 VKFEMQGV-EFEADFHILDFFGANAV------LAVQWLEKLG   84 (130)
Q Consensus        50 ~~~~iqg~-~f~~dl~vL~l~g~DvI------LGmdWL~~~g   84 (130)
                      ..+...+- ++..|-.||-+||.-+=      +|++|++++|
T Consensus       144 f~l~t~~g~~i~~d~lilAtGG~S~P~lGstg~gy~iA~~~G  185 (408)
T COG2081         144 FRLDTSSGETVKCDSLILATGGKSWPKLGSTGFGYPIARQFG  185 (408)
T ss_pred             EEEEcCCCCEEEccEEEEecCCcCCCCCCCCchhhHHHHHcC
Confidence            44444444 56777778888877665      7889998888


No 39 
>KOG4584 consensus Uncharacterized conserved protein [General function prediction only]
Probab=30.10  E-value=28  Score=29.40  Aligned_cols=23  Identities=17%  Similarity=0.332  Sum_probs=17.6

Q ss_pred             CEEEEceeEEeecCCcceeeccc
Q 044180           56 GVEFEADFHILDFFGANAVLAVQ   78 (130)
Q Consensus        56 g~~f~~dl~vL~l~g~DvILGmd   78 (130)
                      |.++..-++..|=+|+|+||||=
T Consensus       196 ~~p~K~~lif~DNSG~DvILGil  218 (348)
T KOG4584|consen  196 GKPHKCALIFVDNSGFDVILGIL  218 (348)
T ss_pred             CCCcceEEEEecCCCcceeeeec
Confidence            44555566778889999999983


No 40 
>PF00622 SPRY:  SPRY domain;  InterPro: IPR003877 The SPRY domain is of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin []. Ca2+-release from the sarcoplasmic or endoplasmic reticulum, the intracellular Ca2+ store, is mediated by the ryanodine receptor (RyR) and/or the inositol trisphosphate receptor (IP3R).; GO: 0005515 protein binding; PDB: 2V24_A 3EK9_A 2AFJ_A 2IWG_E 3EMW_A 2WL1_A 3TOJ_B 2VOK_A 2VOL_B 2FNJ_A ....
Probab=29.82  E-value=93  Score=20.62  Aligned_cols=21  Identities=14%  Similarity=0.066  Sum_probs=18.5

Q ss_pred             ceEEeccccEEEEEeCCeEEE
Q 044180           85 KIVTDHKALTMEFTYRGQPIK  105 (130)
Q Consensus        85 ~i~iD~~~~tl~f~~~g~~I~  105 (130)
                      .+.+|+.+.+|.|+.+|+.+.
T Consensus        71 G~~lD~~~g~l~F~~ng~~~~   91 (124)
T PF00622_consen   71 GCGLDLDNGELSFYKNGKFLG   91 (124)
T ss_dssp             EEEEETTTTEEEEEETTEEEE
T ss_pred             EEEEeecccEEEEEECCccce
Confidence            669999999999999998744


No 41 
>PF11148 DUF2922:  Protein of unknown function (DUF2922);  InterPro: IPR021321  This bacterial family of proteins has no known function. 
Probab=27.99  E-value=1.2e+02  Score=19.21  Aligned_cols=38  Identities=13%  Similarity=0.132  Sum_probs=26.4

Q ss_pred             cccEEEEE-eCCeEEEEEeccCC-CCChhhHHHHHhcccC
Q 044180           91 KALTMEFT-YRGQPIKLVGAQNI-RPKPTQSIHLQSRIFD  128 (130)
Q Consensus        91 ~~~tl~f~-~~g~~I~l~G~~~~-~is~~Q~~~l~~~~~~  128 (130)
                      +.+.|+|. -.|+..++.=..-- .++..+++..+..|-+
T Consensus         1 KtL~l~F~~~~gk~~ti~i~~pk~~lt~~~V~~~m~~ii~   40 (69)
T PF11148_consen    1 KTLELVFKTEDGKTFTISIPNPKEDLTEAEVKAAMQAIIA   40 (69)
T ss_pred             CEEEEEEEcCCCCEEEEEcCCCCCCCCHHHHHHHHHHHHH
Confidence            35678886 56777666544433 3899999998887754


No 42 
>PF00670 AdoHcyase_NAD:  S-adenosyl-L-homocysteine hydrolase, NAD binding domain;  InterPro: IPR015878 S-adenosyl-L-homocysteine hydrolase (3.3.1.1 from EC) (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydration of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is an ubiquitous enzyme which binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein [] of about 430 to 470 amino acids.  This entry represents the glycine-rich region in the central part of AdoHcyase, which is thought to be involved in NAD-binding.; GO: 0004013 adenosylhomocysteinase activity; PDB: 2ZJ1_C 3DHY_B 2ZIZ_C 2ZJ0_D 3CE6_B 3GLQ_B 3D64_A 3G1U_C 1A7A_A 3NJ4_C ....
Probab=27.70  E-value=1.3e+02  Score=22.81  Aligned_cols=51  Identities=14%  Similarity=0.061  Sum_probs=39.5

Q ss_pred             EceeEEeecCCcceeecccchhccCceEEeccccEEEEE-eCCeEEEEEecc
Q 044180           60 EADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEFT-YRGQPIKLVGAQ  110 (130)
Q Consensus        60 ~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~-~~g~~I~l~G~~  110 (130)
                      ..+.++-+.+.+|.=+-++||+.++--...-+.....++ ++|+.|.|-+..
T Consensus       101 kdgail~n~Gh~d~Eid~~~L~~~~~~~~~v~~~v~~y~l~~G~~i~lLa~G  152 (162)
T PF00670_consen  101 KDGAILANAGHFDVEIDVDALEANAVEREEVRPQVDRYTLPDGRRIILLAEG  152 (162)
T ss_dssp             -TTEEEEESSSSTTSBTHHHHHTCTSEEEEEETTEEEEEETTSEEEEEEGGG
T ss_pred             cCCeEEeccCcCceeEeeccccccCcEEEEcCCCeeEEEeCCCCEEEEEECC
Confidence            456788899999999999999999654555556677777 468888887765


No 43 
>TIGR00008 infA translation initiation factor IF-1. This family consists of translation initiation factor IF-1 as found in bacteria and chloroplasts. This protein, about 70 residues in length, consists largely of an S1 RNA binding domain (pfam00575).
Probab=27.69  E-value=1.3e+02  Score=19.56  Aligned_cols=43  Identities=14%  Similarity=0.208  Sum_probs=27.0

Q ss_pred             ccCceeEEeeCCCeEee--cceeeccEEeecCEEEEceeEEeecCCcce
Q 044180           27 KIMSFLVDVSNGEQIRS--EGHCSKVKFEMQGVEFEADFHILDFFGANA   73 (130)
Q Consensus        27 ~~~~~~V~vanG~~l~~--~~~c~~~~~~iqg~~f~~dl~vL~l~g~Dv   73 (130)
                      +...|+|++.||..+.|  +|.-..-.++|    ..-|-+.+++..||.
T Consensus        16 ~~~~f~V~l~ng~~vla~i~GKmr~~rI~I----~~GD~V~Ve~spyd~   60 (68)
T TIGR00008        16 PNAMFRVELENGHEVLAHISGKIRMHYIRI----LPGDKVKVELSPYDL   60 (68)
T ss_pred             CCCEEEEEECCCCEEEEEecCcchhccEEE----CCCCEEEEEECcccC
Confidence            45678999999998776  44444333333    234566666776663


No 44 
>cd05477 gastricsin Gastricsins, aspartic proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. It is self-converting to a mature enzyme being triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. Mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=27.53  E-value=2.2e+02  Score=22.56  Aligned_cols=28  Identities=18%  Similarity=0.108  Sum_probs=23.7

Q ss_pred             cceeecccchhccCceEEeccccEEEEEe
Q 044180           71 ANAVLAVQWLEKLGKIVTDHKALTMEFTY   99 (130)
Q Consensus        71 ~DvILGmdWL~~~g~i~iD~~~~tl~f~~   99 (130)
                      -..|||...|+.+-. ..|+.+.++.|..
T Consensus       290 ~~~ilG~~fl~~~y~-vfD~~~~~ig~a~  317 (318)
T cd05477         290 PLWILGDVFLRQYYS-VYDLGNNQVGFAT  317 (318)
T ss_pred             ceEEEcHHHhhheEE-EEeCCCCEEeeee
Confidence            358999999999855 7999999999864


No 45 
>PRK14891 50S ribosomal protein L24e/unknown domain fusion protein; Provisional
Probab=27.50  E-value=42  Score=24.78  Aligned_cols=34  Identities=15%  Similarity=0.251  Sum_probs=22.8

Q ss_pred             ccceEecCCCCCccccccCceeEEeeCCCe-Eeecceee
Q 044180           11 QGLRFNCDEPFNPAIEKIMSFLVDVSNGEQ-IRSEGHCS   48 (130)
Q Consensus        11 ~Glcf~cde~f~p~h~~~~~~~V~vanG~~-l~~~~~c~   48 (130)
                      ..+|.+|+-+--|+|.   ..-|+ .+|.+ ..|+.+|.
T Consensus         4 ~e~CsFcG~kIyPG~G---~~fVR-~DGkvf~FcssKC~   38 (131)
T PRK14891          4 TRTCDYTGEEIEPGTG---TMFVR-KDGTVLHFVDSKCE   38 (131)
T ss_pred             eeeecCcCCcccCCCC---cEEEe-cCCCEEEEecHHHH
Confidence            3589999999999996   22222 22333 45888875


No 46 
>PF04746 DUF575:  Protein of unknown function (DUF575);  InterPro: IPR006835 This represents a conserved region found in a number of Chlamydophila pneumoniae proteins.
Probab=26.97  E-value=22  Score=24.90  Aligned_cols=11  Identities=27%  Similarity=0.755  Sum_probs=9.2

Q ss_pred             eeecccchhcc
Q 044180           73 AVLAVQWLEKL   83 (130)
Q Consensus        73 vILGmdWL~~~   83 (130)
                      +|.|||||-+.
T Consensus        29 iv~GieWLvS~   39 (101)
T PF04746_consen   29 IVMGIEWLVSR   39 (101)
T ss_pred             EEeehHHHHHH
Confidence            67899999875


No 47 
>PF08844 DUF1815:  Domain of unknown function (DUF1815);  InterPro: IPR014943 This entry is about 100 amino acids in length and is functionally uncharacterised. 
Probab=26.42  E-value=1.1e+02  Score=21.49  Aligned_cols=28  Identities=21%  Similarity=0.498  Sum_probs=17.0

Q ss_pred             ccceEecCCCCCcccc-ccCceeEEeeCCCeEe
Q 044180           11 QGLRFNCDEPFNPAIE-KIMSFLVDVSNGEQIR   42 (130)
Q Consensus        11 ~Glcf~cde~f~p~h~-~~~~~~V~vanG~~l~   42 (130)
                      .--||-||+    ++. ..-.|-|.++++..|.
T Consensus        32 ~AsCYtC~d----G~~~~~ASFmv~lg~~HliR   60 (105)
T PF08844_consen   32 LASCYTCGD----GRDMNSASFMVSLGDNHLIR   60 (105)
T ss_pred             eeEEEecCC----CCCCCceeEEEEcCCCcEEE
Confidence            457999975    222 2345677777766554


No 48 
>KOG3217 consensus Protein tyrosine phosphatase [Signal transduction mechanisms]
Probab=25.56  E-value=25  Score=26.65  Aligned_cols=12  Identities=33%  Similarity=0.484  Sum_probs=10.1

Q ss_pred             ecCCcceeeccc
Q 044180           67 DFFGANAVLAVQ   78 (130)
Q Consensus        67 ~l~g~DvILGmd   78 (130)
                      |...||.|||||
T Consensus        82 DF~~FDYI~~MD   93 (159)
T KOG3217|consen   82 DFREFDYILAMD   93 (159)
T ss_pred             HhhhcceeEEec
Confidence            556799999997


No 49 
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=24.59  E-value=95  Score=24.21  Aligned_cols=26  Identities=19%  Similarity=0.169  Sum_probs=22.4

Q ss_pred             ceeecccchhccCceEEeccccEEEEE
Q 044180           72 NAVLAVQWLEKLGKIVTDHKALTMEFT   98 (130)
Q Consensus        72 DvILGmdWL~~~g~i~iD~~~~tl~f~   98 (130)
                      ..|||-..|+.+-. ..|+.++++.|.
T Consensus       252 ~~ilGd~fl~~~y~-vfD~~~~~ig~A  277 (278)
T cd06097         252 FSILGDVFLKAQYV-VFDVGGPKLGFA  277 (278)
T ss_pred             EEEEcchhhCceeE-EEcCCCceeeec
Confidence            56999999999866 899999999875


No 50 
>TIGR02300 FYDLN_acid conserved hypothetical protein TIGR02300. Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=24.01  E-value=29  Score=25.53  Aligned_cols=18  Identities=11%  Similarity=0.025  Sum_probs=13.8

Q ss_pred             HhccceEecCCCCCcccc
Q 044180            9 RAQGLRFNCDEPFNPAIE   26 (130)
Q Consensus         9 r~~Glcf~cde~f~p~h~   26 (130)
                      -.|.+|.+|+.+|--..+
T Consensus         7 GtKr~Cp~cg~kFYDLnk   24 (129)
T TIGR02300         7 GTKRICPNTGSKFYDLNR   24 (129)
T ss_pred             CccccCCCcCccccccCC
Confidence            357899999999875554


No 51 
>PF13975 gag-asp_proteas:  gag-polyprotein putative aspartyl protease
Probab=23.62  E-value=64  Score=20.41  Aligned_cols=20  Identities=25%  Similarity=0.405  Sum_probs=17.1

Q ss_pred             ceeEEeeCCCeEeecceeec
Q 044180           30 SFLVDVSNGEQIRSEGHCSK   49 (130)
Q Consensus        30 ~~~V~vanG~~l~~~~~c~~   49 (130)
                      +.+|++|||....+.+...+
T Consensus        51 ~~~v~~a~g~~~~~~g~~~~   70 (72)
T PF13975_consen   51 PIRVKLANGSVIEIRGVAEN   70 (72)
T ss_pred             CEEEEECCCCccccceEEEe
Confidence            68999999999998887654


No 52 
>smart00647 IBR In Between Ring fingers. The domain occurs between pairs of RING fingers
Probab=23.45  E-value=34  Score=20.51  Aligned_cols=14  Identities=21%  Similarity=0.669  Sum_probs=10.3

Q ss_pred             ccceEecCCCCCcc
Q 044180           11 QGLRFNCDEPFNPA   24 (130)
Q Consensus        11 ~Glcf~cde~f~p~   24 (130)
                      ..-||+|+++|.+.
T Consensus        48 ~~fC~~C~~~~H~~   61 (64)
T smart00647       48 FSFCFRCKVPWHSP   61 (64)
T ss_pred             CeECCCCCCcCCCC
Confidence            35689999988543


No 53 
>PF11164 DUF2948:  Protein of unknown function (DUF2948);  InterPro: IPR021335  This family of proteins with unknown function appear to be restricted to Proteobacteria. 
Probab=23.44  E-value=97  Score=23.00  Aligned_cols=36  Identities=28%  Similarity=0.453  Sum_probs=21.7

Q ss_pred             CCccccccCceeEEeeCCCeEeecceeeccEEeecC
Q 044180           21 FNPAIEKIMSFLVDVSNGEQIRSEGHCSKVKFEMQG   56 (130)
Q Consensus        21 f~p~h~~~~~~~V~vanG~~l~~~~~c~~~~~~iqg   56 (130)
                      |.|+..+...+..+.|+|+.|.-+-.|-++.+.--|
T Consensus        92 fe~~e~p~G~v~L~fAGgg~IrL~VE~ie~~L~D~~  127 (138)
T PF11164_consen   92 FEPGEAPAGHVLLTFAGGGAIRLEVECIEVQLRDLG  127 (138)
T ss_pred             EEeCCCCCcEEEEEECCCcEEEEEEEEEEEEEeecC
Confidence            444444455566666777777777777666655543


No 54 
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event.  The enzymes specifically cleave bonds in peptides which 
Probab=21.64  E-value=2.2e+02  Score=21.98  Aligned_cols=53  Identities=11%  Similarity=0.126  Sum_probs=36.2

Q ss_pred             eccEEeec-CEEEEcee--EEe------------ec-CCcceeecccchhccCceEEeccccEEEEEeCC
Q 044180           48 SKVKFEMQ-GVEFEADF--HIL------------DF-FGANAVLAVQWLEKLGKIVTDHKALTMEFTYRG  101 (130)
Q Consensus        48 ~~~~~~iq-g~~f~~dl--~vL------------~l-~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~g  101 (130)
                      |.+.+.+. |..|....  +++            .. ..--.|||..+|+.+-- ..|..++++.|...+
T Consensus       196 P~i~~~f~~~~~~~i~~~~y~~~~~~~~~C~~~~~~~~~~~~ilG~~fl~~~~~-vFD~~~~~iGfa~~~  264 (265)
T cd05476         196 PDLTLHFDGGADLELPPENYFVDVGEGVVCLAILSSSSGGVSILGNIQQQNFLV-EYDLENSRLGFAPAD  264 (265)
T ss_pred             CCEEEEECCCCEEEeCcccEEEECCCCCEEEEEecCCCCCcEEEChhhcccEEE-EEECCCCEEeeecCC
Confidence            77888887 66555332  111            11 23347999999999855 789999999987643


No 55 
>PF01684 ET:  ET module;  InterPro: IPR002603 The proteins in this entry have no known function, and are found in Caenorhabditis elegans and in Caenorhabditis briggsae. Each repeat contains 8-10 conserved cysteines that probably form 4-5 disulphide bridges. By inspection of the conservation of cysteines it looks like cysteines 1, 2, 3, 4, 9 and 10 are always present and that sometimes the pair 5 and 8 or the pair 6 and 7 are missing. This suggests that cysteines 5/8 and 6/7 make disulphide bridges.
Probab=21.61  E-value=1.3e+02  Score=20.26  Aligned_cols=22  Identities=27%  Similarity=0.764  Sum_probs=19.9

Q ss_pred             eeCCCeEeecceeeccEE-eecC
Q 044180           35 VSNGEQIRSEGHCSKVKF-EMQG   56 (130)
Q Consensus        35 vanG~~l~~~~~c~~~~~-~iqg   56 (130)
                      +..|+...|+|.|..+++ .++|
T Consensus        12 ~~~g~~~~C~G~CaSvs~~~~ng   34 (82)
T PF01684_consen   12 ISTGAEVACQGQCASVSITTYNG   34 (82)
T ss_pred             ccCCeeEEeCCEEEEEEEEeECC
Confidence            557889999999999999 9999


No 56 
>PF09706 Cas_CXXC_CXXC:  CRISPR-associated protein (Cas_CXXC_CXXC);  InterPro: IPR019121 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes []. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements [, ]. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci []. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.  This entry represents a conserved domain of about 65 amino acids found in otherwise highly divergent proteins encoded in CRISPR-associated regions. This domain features two CXXC motifs. 
Probab=21.49  E-value=28  Score=22.48  Aligned_cols=17  Identities=12%  Similarity=0.018  Sum_probs=12.1

Q ss_pred             HhccceEecCCCCCccc
Q 044180            9 RAQGLRFNCDEPFNPAI   25 (130)
Q Consensus         9 r~~Glcf~cde~f~p~h   25 (130)
                      ..+++|+.|+|.=...+
T Consensus         3 k~~~~C~~Cg~r~~~~~   19 (69)
T PF09706_consen    3 KKKYNCIFCGERPSKKK   19 (69)
T ss_pred             CCCCcCcCCCCcccccc
Confidence            46799999998544333


No 57 
>cd07429 Cby_like Chibby, a nuclear inhibitor of Wnt/beta-catenin mediated transcription, and similar proteins. Chibby(Cby) is a well-conserved nuclear protein that functions as part of the Wnt/beta-catenin signaling pathway. Specifically, Cby binds directly to beta-catenin by interacting with its central region, which harbors armadillo repeats. Cby-beta-catenin interactions may also involve 14-3-3 proteins. By competing with other binding partners of beta-catenin, the Tcf/Lef transcription factors, Cby inhibits transcriptional activation. Cby has been shown to play a role in adipocyte differentiation. The C-terminal region of Cby appears to contain an alpha-helical coiled-coil motif.
Probab=21.34  E-value=1.3e+02  Score=21.36  Aligned_cols=43  Identities=23%  Similarity=0.328  Sum_probs=30.1

Q ss_pred             eeEEeecCCcceeecccchhccCceEEeccccEEEEEeCCeEEEEEec
Q 044180           62 DFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEFTYRGQPIKLVGA  109 (130)
Q Consensus        62 dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f~~~g~~I~l~G~  109 (130)
                      ++..++-..-+.=||.++    ||+.+....+.+.|. +|+||.=.|-
T Consensus        23 ~~~~~d~~t~~~El~l~y----g~i~l~Lg~~~l~F~-dG~W~~e~~~   65 (108)
T cd07429          23 NLRLLDRSTRQAELGLDY----GPIRLKLGGQELVFE-DGRWISESGG   65 (108)
T ss_pred             cccccCCCcccccccccc----CCceeeeCCceEEee-CCEEecCCCC
Confidence            444445444444566664    899999999999976 8888775554


No 58 
>TIGR03318 YdfZ_fam putative selenium-binding protein YdfZ. This small protein has a very limited distribution, being found so far only among some gamma-Proteobacteria. The member from Escherichia coli was shown to bind selenium in the absence of a working SelD-dependent selenium incorporation system. Note that while the E. coli member contains a single Cys residue, a likely selenium binding site, some other members of this protein family contain two Cys residues or none.
Probab=21.29  E-value=48  Score=21.60  Aligned_cols=20  Identities=30%  Similarity=0.340  Sum_probs=13.5

Q ss_pred             hHHHhc-cceEecCCCCCccc
Q 044180            6 QICRAQ-GLRFNCDEPFNPAI   25 (130)
Q Consensus         6 ~~rr~~-Glcf~cde~f~p~h   25 (130)
                      |.||++ =+-=.|+|+|.|-.
T Consensus        40 Q~rR~k~Vel~g~e~~f~P~e   60 (65)
T TIGR03318        40 QARREKCVELEGCEERFAPLD   60 (65)
T ss_pred             HhhhccEEEEecccceecchh
Confidence            566665 22368999999854


No 59 
>PF02529 PetG:  Cytochrome B6-F complex subunit 5;  InterPro: IPR003683 This family consists of cytochrome b6/f complex subunit 5 (PetG). The cytochrome bf complex, found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase []. The purified complex from the unicellular alga Chlamydomonas reinhardtii contains seven subunits; namely four high molecular weight subunits (cytochrome f, Rieske iron-sulphur protein, cytochrome b6, and subunit IV) and three approximately miniproteins (PetG, PetL, and PetX) []. Stoichiometry measurements are consistent with every subunit being present as two copies per b6/f dimer. The absence of PetG affects either the assembly or stability of the cytochrome bf complex in C. reinhardtii [].; GO: 0009512 cytochrome b6f complex; PDB: 1Q90_G 2ZT9_G 1VF5_G 2D2C_G 2E74_G 2E75_G 2E76_G.
Probab=21.27  E-value=35  Score=19.83  Aligned_cols=11  Identities=36%  Similarity=0.292  Sum_probs=7.7

Q ss_pred             hhhhhHHHhcc
Q 044180            2 AAEMQICRAQG   12 (130)
Q Consensus         2 ~a~~~~rr~~G   12 (130)
                      +|+.||||.+-
T Consensus        24 ~Ay~QY~Rg~q   34 (37)
T PF02529_consen   24 AAYLQYRRGNQ   34 (37)
T ss_dssp             HHHHHHCS--T
T ss_pred             HHHHHHhcccc
Confidence            68999999763


No 60 
>PHA01782 hypothetical protein
Probab=21.11  E-value=39  Score=26.03  Aligned_cols=33  Identities=30%  Similarity=0.582  Sum_probs=22.4

Q ss_pred             cchhccCceEEeccccE---EEEE--eCCeEEEEEeccC
Q 044180           78 QWLEKLGKIVTDHKALT---MEFT--YRGQPIKLVGAQN  111 (130)
Q Consensus        78 dWL~~~g~i~iD~~~~t---l~f~--~~g~~I~l~G~~~  111 (130)
                      +||.++|+|.++-.+++   ..|-  ..++ -.|.|..+
T Consensus        74 ~wlv~~Gkv~vntDkk~aKefpf~~nK~~~-tdLegA~~  111 (177)
T PHA01782         74 EWLVKFGKVQVNTDKKSAKEFPFVYNKFGK-TDLEGATA  111 (177)
T ss_pred             HHHHHhCCccccccccccccCceeeccccc-hhhHhhhc
Confidence            89999999999988885   3343  3333 34555543


No 61 
>KOG1370 consensus S-adenosylhomocysteine hydrolase [Coenzyme transport and metabolism]
Probab=20.90  E-value=1.1e+02  Score=26.44  Aligned_cols=73  Identities=12%  Similarity=0.148  Sum_probs=43.1

Q ss_pred             CCeEeecceeeccEEeecCEEEEceeEEeecCCcceeecccchhccCceEEeccccEEEE-EeCCeEEEEEecc
Q 044180           38 GEQIRSEGHCSKVKFEMQGVEFEADFHILDFFGANAVLAVQWLEKLGKIVTDHKALTMEF-TYRGQPIKLVGAQ  110 (130)
Q Consensus        38 G~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~g~DvILGmdWL~~~g~i~iD~~~~tl~f-~~~g~~I~l~G~~  110 (130)
                      +..+.+..-|.++-..-.=.....|.+|-++|.+|+=..+.||.+..--..+-+-+.=.+ .++|+.|.|-...
T Consensus       270 ~difVTtTGc~dii~~~H~~~mk~d~IvCN~Ghfd~EiDv~~L~~~~~~~~~vk~QvD~~~~~~gr~iIlLAeG  343 (434)
T KOG1370|consen  270 VDIFVTTTGCKDIITGEHFDQMKNDAIVCNIGHFDTEIDVKWLNTPALTWENVKPQVDRYILPNGKHIILLAEG  343 (434)
T ss_pred             CCEEEEccCCcchhhHHHHHhCcCCcEEeccccccceeehhhccCCcceeeecccccceeeccCCcEEEEEecC
Confidence            444444444555433222234567899999999999999999999543222222122222 2577777776654


No 62 
>PF14452 Multi_ubiq:  Multiubiquitin
Probab=20.72  E-value=2.3e+02  Score=17.76  Aligned_cols=27  Identities=19%  Similarity=0.239  Sum_probs=16.5

Q ss_pred             cEEEEEeCCeEEEEEeccCCCCChhhHHHH
Q 044180           93 LTMEFTYRGQPIKLVGAQNIRPKPTQSIHL  122 (130)
Q Consensus        93 ~tl~f~~~g~~I~l~G~~~~~is~~Q~~~l  122 (130)
                      |+.+|.-+|+.+.|   ..-.||..|+..|
T Consensus         1 r~~~i~vn~~~~~~---~~~~iTg~qi~~l   27 (72)
T PF14452_consen    1 RTFRIIVNGRPYEW---PDPTITGRQILAL   27 (72)
T ss_pred             CeEEEEECCeEEEE---CCCCcCHHHHHHH
Confidence            34566667776665   3344777777665


No 63 
>COG3880 Modulator of heat shock repressor CtsR, McsA [Signal transduction mechanisms]
Probab=20.53  E-value=63  Score=24.99  Aligned_cols=33  Identities=21%  Similarity=0.351  Sum_probs=25.4

Q ss_pred             ceEecCCCCCccccccCceeE-EeeCCCeEeecceeeccEEe
Q 044180           13 LRFNCDEPFNPAIEKIMSFLV-DVSNGEQIRSEGHCSKVKFE   53 (130)
Q Consensus        13 lcf~cde~f~p~h~~~~~~~V-~vanG~~l~~~~~c~~~~~~   53 (130)
                      +|++|.+       +.-++.+ +|-||+.+. ..+|..|+-.
T Consensus         2 iCq~Cqq-------npAti~~tkI~~~~k~e-~~vCe~Ca~~   35 (176)
T COG3880           2 ICQNCQQ-------NPATIHFTKIINGEKIE-LYVCETCAKP   35 (176)
T ss_pred             cchhhcC-------CcceEEEEEeecCCeeE-eehhhcCCCc
Confidence            6899987       2235677 689999999 8899888744


No 64 
>PF02989 DUF228:  Lyme disease proteins of unknown function;  InterPro: IPR004239 This group comprises proteins of unknown function from Borrelia burgdorferi, the causative organism of Lyme disease.
Probab=20.20  E-value=71  Score=24.85  Aligned_cols=82  Identities=15%  Similarity=0.264  Sum_probs=47.1

Q ss_pred             hccceEecCCCCCccccccCceeEEeeCCCeEeecceeeccEEeecCEEEEceeEEeecC-CcceeecccchhccCceEE
Q 044180           10 AQGLRFNCDEPFNPAIEKIMSFLVDVSNGEQIRSEGHCSKVKFEMQGVEFEADFHILDFF-GANAVLAVQWLEKLGKIVT   88 (130)
Q Consensus        10 ~~Glcf~cde~f~p~h~~~~~~~V~vanG~~l~~~~~c~~~~~~iqg~~f~~dl~vL~l~-g~DvILGmdWL~~~g~i~i   88 (130)
                      .+|-||+|+-|.++.   ...+.|..+.|..+-  |.|.++.      +|+....|+|+- +|.-     ||-.-.+ +|
T Consensus        67 ~kgfPYKrGVKLv~~---~~~~~Ve~Ggg~DLY--GICvDiD------efs~tAtVvPITnnFeg-----yLvak~~-~i  129 (184)
T PF02989_consen   67 FKGFPYKRGVKLVDK---ENEIYVEAGGGSDLY--GICVDID------EFSKTATVVPITNNFEG-----YLVAKDS-TI  129 (184)
T ss_pred             ccCCCccceeEecCC---CceEEEEeCCCCccE--EEEEehh------hccceEEEEeccCCeEE-----EEEECCC-CC
Confidence            579999999999943   335677766666654  5666553      455555566654 2333     2222221 22


Q ss_pred             eccccEEEEEeCCeEEEEEec
Q 044180           89 DHKALTMEFTYRGQPIKLVGA  109 (130)
Q Consensus        89 D~~~~tl~f~~~g~~I~l~G~  109 (130)
                      .. .-.|.|...|......|.
T Consensus       130 k~-gdkL~fN~~G~leK~~g~  149 (184)
T PF02989_consen  130 KA-GDKLIFNKDGELEKATGN  149 (184)
T ss_pred             Cc-CcEEEecCCCeEEEccCC
Confidence            21 134566666766666665


No 65 
>smart00507 HNHc HNH nucleases.
Probab=20.11  E-value=53  Score=18.17  Aligned_cols=16  Identities=25%  Similarity=0.561  Sum_probs=12.4

Q ss_pred             HHHhccceEecCCCCCc
Q 044180            7 ICRAQGLRFNCDEPFNP   23 (130)
Q Consensus         7 ~rr~~Glcf~cde~f~p   23 (130)
                      ..|. +.|..|++++.+
T Consensus         7 ~~r~-~~C~~C~~~~~~   22 (52)
T smart00507        7 LHRD-GVCAYCGKPASE   22 (52)
T ss_pred             HHHC-CCCcCCcCCCCC
Confidence            3466 899999998754


Done!