Query         048720
Match_columns 109
No_of_seqs    113 out of 765
Neff          6.9 
Searched_HMMs 46136
Date          Fri Mar 29 12:23:42 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/048720.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/048720hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00042 CY Substituted updates  99.9 9.4E-24   2E-28  138.6  11.8   85    8-94      1-105 (105)
  2 smart00043 CY Cystatin-like do  99.9 1.9E-24 4.2E-29  142.4   8.4   87    6-94      2-106 (107)
  3 PF00031 Cystatin:  Cystatin do  99.9 8.1E-21 1.8E-25  122.2  12.0   74    8-83      1-94  (94)
  4 PF07430 PP1:  Phloem filament   99.7 1.5E-15 3.3E-20  109.6  11.1   83    6-90    113-199 (202)
  5 PF07430 PP1:  Phloem filament   99.2 5.4E-11 1.2E-15   86.1   5.5   89    6-95      6-99  (202)
  6 TIGR01638 Atha_cystat_rel Arab  99.0 2.4E-09 5.2E-14   69.5   8.7   65   17-81     10-77  (92)
  7 PF06907 Latexin:  Latexin;  In  96.3    0.21 4.5E-06   37.3  12.3   78   17-95      4-89  (220)
  8 TIGR01572 A_thl_para_3677 Arab  94.7    0.21 4.5E-06   38.2   7.4   60   22-81     42-104 (265)
  9 PF05679 CHGN:  Chondroitin N-a  82.2     5.7 0.00012   32.8   6.6   49   19-67    160-210 (499)
 10 PF00666 Cathelicidins:  Cathel  61.3      13 0.00027   22.8   3.0   46   22-68      4-53  (67)
 11 KOG2650 Zinc carboxypeptidase   56.5      38 0.00083   27.7   5.8   63    6-69    317-389 (418)
 12 COG3360 Uncharacterized conser  53.9      51  0.0011   20.4   6.3   45   21-66     18-64  (71)
 13 PF12274 DUF3615:  Protein of u  49.6      65  0.0014   20.3   8.2   67   33-100     1-78  (96)
 14 PF07311 Dodecin:  Dodecin;  In  48.2      61  0.0013   19.7   7.3   48   18-66     12-61  (66)
 15 cd05881 Ig1_Necl-2 First (N-te  40.4      35 0.00076   21.8   2.7   28   55-82     58-85  (95)
 16 PRK15344 type III secretion sy  29.5      87  0.0019   19.4   3.1   21   16-36     29-49  (71)
 17 TIGR02105 III_needle type III   28.9      86  0.0019   19.3   3.0   21   16-36     30-50  (72)
 18 PF03504 Chlam_OMP6:  Chlamydia  27.2      84  0.0018   20.3   2.8   30   51-82     66-95  (95)
 19 PF02995 DUF229:  Protein of un  26.7      75  0.0016   26.3   3.2   30    6-37    450-479 (497)
 20 KOG1693 emp24/gp25L/p24 family  24.4 1.7E+02  0.0037   21.8   4.3   24   44-67     44-67  (209)
 21 smart00678 WWE Domain in Delte  23.5 1.7E+02  0.0036   17.2   4.6   12   54-65     38-49  (73)
 22 PF14201 DUF4318:  Domain of un  21.4 2.1E+02  0.0046   17.6   7.7   44   26-70     20-67  (74)
 23 PF04475 DUF555:  Protein of un  20.8 1.4E+02  0.0031   19.7   3.0   41    8-52      9-49  (102)
 24 KOG3205 Rho GDP-dissociation i  20.4   2E+02  0.0043   21.2   4.0   28   37-64    103-130 (200)

No 1  
>cd00042 CY Substituted updates: Jan 30, 2002
Probab=99.91  E-value=9.4e-24  Score=138.57  Aligned_cols=85  Identities=40%  Similarity=0.573  Sum_probs=79.2

Q ss_pred             cceeecCCCCCCHHHHHHHHHHHHHHHhhcCCc-eeEEEEEEEEEEeeeeEeEEEEEEEeeC------------------
Q 048720            8 QGVYDYGSNRKNADVEGFVHFSIQEHNKKENAL-LEFARVLKAKVQVVAGKLYCFILQVIKN------------------   68 (109)
Q Consensus         8 GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~~-~~~~kVv~a~~QVVaG~nY~l~v~~~~~------------------   68 (109)
                      |||.++  +.+||++++++++|+.+||+.+++. +++.+|++|++|||+|++|+|+|++.++                  
T Consensus         1 gg~~~~--~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~i~~~~~QvvaG~~y~i~~~~~~t~C~k~~~~~~~~~c~~~~   78 (105)
T cd00042           1 GGPSDI--PANDPEVQELADFAVAEYNKKSNDKYLEFFKVLSAKSQVVAGTNYYITVEAGDTNCKKSSVPLDCPDCKLLE   78 (105)
T ss_pred             CCCccC--CCCCHHHHHHHHHHHHHHHhhcCccceeEEEEEEEEEEEEeeeEEEEEEEEecccccccCcccccccccccc
Confidence            899988  5899999999999999999999887 8999999999999999999999999874                  


Q ss_pred             -ceeeEEEEEEEEecCCCceeEEEEEe
Q 048720           69 -VKKKIYEAKISVKSWNKFKQLWEFKH   94 (109)
Q Consensus        69 -~~~~~y~~~Vw~~PW~~~~~l~~f~~   94 (109)
                       +....|.+.||.+||+++.+|++|.|
T Consensus        79 ~~~~~~C~~~V~~~pw~~~~~l~~~~C  105 (105)
T cd00042          79 EGKKKFCTAKVWEKPWENFKELLSFKC  105 (105)
T ss_pred             cCCCEEEEEEEEecCCCCceeeeeccC
Confidence             35789999999999999999999876


No 2  
>smart00043 CY Cystatin-like domain. Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as  kininogen, His-rich glycoprotein and fetuin also contain these domains.
Probab=99.91  E-value=1.9e-24  Score=142.40  Aligned_cols=87  Identities=36%  Similarity=0.446  Sum_probs=79.8

Q ss_pred             cccceeecCCCCCCHHHHHHHHHHHHHHHhhcCCcee--EEEEEEEEEEeeeeEeEEEEEEEeeCcee------------
Q 048720            6 LLQGVYDYGSNRKNADVEGFVHFSIQEHNKKENALLE--FARVLKAKVQVVAGKLYCFILQVIKNVKK------------   71 (109)
Q Consensus         6 ~~GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~~~~--~~kVv~a~~QVVaG~nY~l~v~~~~~~~~------------   71 (109)
                      ++|||+++  +.+||++++++++|+++||+++++.+.  +++|++|++|||+|++|+|+|++.++...            
T Consensus         2 ~~Gg~~~~--~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~~v~~a~~QvvaG~~y~l~~~v~~t~C~k~~~~~~~C~~~   79 (107)
T smart00043        2 CLGGPSDV--PPNDPEVQEAADFAVAEYNKKSNDKYELRVIKVVSAKSQVVAGTNYYLKVEVGETNCKKLSVDLENCPFL   79 (107)
T ss_pred             CCCCCccC--CCCCHHHHHHHHHHHHHHHHhcccchhhhhhhhheeeeeeecceEEEEEEEEEeceeccCCcccccCCCC
Confidence            68999999  588999999999999999999988776  79999999999999999999999986543            


Q ss_pred             ----eEEEEEEEEecCCCceeEEEEEe
Q 048720           72 ----KIYEAKISVKSWNKFKQLWEFKH   94 (109)
Q Consensus        72 ----~~y~~~Vw~~PW~~~~~l~~f~~   94 (109)
                          ..|.++||.+||+++.++++|+|
T Consensus        80 ~~~~~~C~~~V~~~pw~~~~~~~~~~C  106 (107)
T smart00043       80 DQGEKFCTAKVWEKPWENKIKLVEFKC  106 (107)
T ss_pred             CCCccEEEEEEEecCCCCccCccceec
Confidence                38999999999999999999987


No 3  
>PF00031 Cystatin:  Cystatin domain;  InterPro: IPR000010 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.  The cystatins are cysteine proteinase inhibitors belonging to MEROPS inhibitor family I25, clan IH. They mainly inhibit peptidases belonging to peptidase families C1 (papain family) and C13 (legumain family). The cystatin family includes:   The Type 1 cystatins, which are intracellular cystatins that are present in the cytosol of many cell types, but can also appear in body fluids at significant concentrations. They are single-chain polypeptides of about 100 residues, which have neither disulphide bonds nor carbohydrate side chains.  The Type 2 cystatins, which are mainly extracellular secreted polypeptides synthesised with a 19-28 residue signal peptide. They are broadly distributed and found in most body fluids.  The Type 3 cystatins, which are multidomain proteins. The mammalian representatives of this group are the kininogens. There are three different kininogens in mammals: H- (high molecular mass, IPR002395 from INTERPRO) and L- (low molecular mass) kininogen which are found in a number of species, and T-kininogen that is found only in rat.  Unclassified cystatins. These are cystatin-like proteins found in a range of organisms: plant phytocystatins, fetuin in mammals, insect cystatins and a puff adder venom cystatin which inhibits metalloproteases of the MEROPS peptidase family M12 (astacin/adamalysin). Also, a number of the cystatin-like proteins have been shown to be devoid of inhibitory activity.   All true cystatins inhibit cysteine peptidases of the papain family (MEROPS peptidase family C1), and some also inhibit legumain family enzymes (MEROPS peptidase family C13). These peptidases play key roles in physiological processes, such as intracellular protein degradation (cathepsins B, H and L), are pivotal in the remodelling of bone (cathepsin K), and may be important in the control of antigen presentation (cathepsin S, mammalian legumain). Moreover, the activities of such peptidases are increased in pathophysiological conditions, such as cancer metastasis and inflammation. Additionally, such peptidases are essential for several pathogenic parasites and bacteria. Thus in animals cystatins not only have the capacity to regulate normal body processes and perhaps cause disease when down-regulated, but in other organisms may also participate in defence against biotic and abiotic stress. ; GO: 0004869 cysteine-type endopeptidase inhibitor activity; PDB: 3L0R_B 2W9P_K 2W9Q_A 3S67_A 3QRD_D 1R4C_G 3GAX_A 1TIJ_B 1G96_A 3NX0_A ....
Probab=99.86  E-value=8.1e-21  Score=122.22  Aligned_cols=74  Identities=22%  Similarity=0.385  Sum_probs=66.7

Q ss_pred             cceeecCCCCCCHHHHHHHHHHHHHHHhhcCC--ceeEEEEEEEEEEeeeeEeEEEEEEEeeC-----------------
Q 048720            8 QGVYDYGSNRKNADVEGFVHFSIQEHNKKENA--LLEFARVLKAKVQVVAGKLYCFILQVIKN-----------------   68 (109)
Q Consensus         8 GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~--~~~~~kVv~a~~QVVaG~nY~l~v~~~~~-----------------   68 (109)
                      |||+++  +++||+++++|++|+.+||+++++  .|++.+|++|++|||+|++|+|+|.+.++                 
T Consensus         1 Gg~~~~--~~~dp~v~~~~~~al~~~N~~~~~~~~~~~~~v~~a~~QvV~G~~Y~i~~~~~~t~C~k~~~~~~~C~~~~~   78 (94)
T PF00031_consen    1 GGPSPV--DPNDPEVQEAAEFALDKFNEQSNSGYKFKLVKVISATTQVVAGINYYIEFEVGETNCKKSSKDFENCPFQEE   78 (94)
T ss_dssp             SSEEEE--CTTSHHHHHHHHHHHHHHHHHSTTSEEEEEEEEEEEEEEESSSEEEEEEEEEEEEEEETCEEEEEECEBEST
T ss_pred             CCCccC--CCCCHHHHHHHHHHHHHHHHhCcccCcceeeeeeEEEEeecCCceEEEEEEEEcccccccccccccCCcccc
Confidence            899999  589999999999999999999855  47889999999999999999999999874                 


Q ss_pred             -ceeeEEEEEEEEecC
Q 048720           69 -VKKKIYEAKISVKSW   83 (109)
Q Consensus        69 -~~~~~y~~~Vw~~PW   83 (109)
                       .....|.++||.+||
T Consensus        79 ~~~~~~C~~~v~~~pW   94 (94)
T PF00031_consen   79 QPWTKFCKFTVWERPW   94 (94)
T ss_dssp             TSSEEEEEEEEEEECG
T ss_pred             CCceeeEEEEEEECCC
Confidence             235689999999998


No 4  
>PF07430 PP1:  Phloem filament protein PP1;  InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem.
Probab=99.66  E-value=1.5e-15  Score=109.58  Aligned_cols=83  Identities=23%  Similarity=0.205  Sum_probs=72.8

Q ss_pred             cccceeecCCCCCCHHHHHHHHHHHHHHHhhcCCceeEEEEEEEEEEeee--eEeEEEEEEEeeC-ceeeEEEEEEEEe-
Q 048720            6 LLQGVYDYGSNRKNADVEGFVHFSIQEHNKKENALLEFARVLKAKVQVVA--GKLYCFILQVIKN-VKKKIYEAKISVK-   81 (109)
Q Consensus         6 ~~GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QVVa--G~nY~l~v~~~~~-~~~~~y~~~Vw~~-   81 (109)
                      ...+|.+++ |+++|.+|++++|||.||| +.+++++|.+|.+++.|=++  |++|+|++.+.|+ ++...|+|.||++ 
T Consensus       113 ~~~~Wi~I~-nin~p~VQeLgkFAV~EhN-K~gd~LkF~KV~eGw~q~l~~d~ikYrLhI~AkDg~G~~~~YeAvV~~k~  190 (202)
T PF07430_consen  113 QSKKWIPIP-NINNPFVQELGKFAVIEHN-KAGDKLKFEKVYEGWYQDLGNDGIKYRLHIVAKDGLGRLGNYEAVVWEKQ  190 (202)
T ss_pred             ccCCCEECC-CCCcHHHHHHHHHHHHHHh-hcCCceEEEEEeeEEEEeccCCCceEEEEEEeecCCCCcCceEEEEEEec
Confidence            345799997 7999999999999999999 67889999999999999885  6999999999997 8889999999999 


Q ss_pred             cCCCceeEE
Q 048720           82 SWNKFKQLW   90 (109)
Q Consensus        82 PW~~~~~l~   90 (109)
                      +|.+...+.
T Consensus       191 ~~sk~i~i~  199 (202)
T PF07430_consen  191 FLSKKIKIL  199 (202)
T ss_pred             cCcceEEEE
Confidence            566644443


No 5  
>PF07430 PP1:  Phloem filament protein PP1;  InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem.
Probab=99.16  E-value=5.4e-11  Score=86.10  Aligned_cols=89  Identities=13%  Similarity=0.146  Sum_probs=80.8

Q ss_pred             cccceeecCCCCCCHHHHHHHHHHHHHHHhhcCCceeEEEEEEEE--EEeeeeEeEEEEEEEee-CceeeEEEEEEEEec
Q 048720            6 LLQGVYDYGSNRKNADVEGFVHFSIQEHNKKENALLEFARVLKAK--VQVVAGKLYCFILQVIK-NVKKKIYEAKISVKS   82 (109)
Q Consensus         6 ~~GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~~~~~~kVv~a~--~QVVaG~nY~l~v~~~~-~~~~~~y~~~Vw~~P   82 (109)
                      ..+||.+++ |..+|.+|.++.|||.+++.+.++.|+|..|++++  .|.+.+++|||.+++.| -+..-.|++.|+++-
T Consensus         6 ~~~~w~~ip-~v~~~~~q~v~~~~veq~k~~~~~~l~~~~v~egwy~el~~~~~~yrlhv~a~d~l~r~l~~e~ii~e~~   84 (202)
T PF07430_consen    6 FSPKWIKIP-DVKEPCLQEVAKFAVEQFKIQYGDSLKFRSVVEGWYFELCPNSLKYRLHVKAIDFLGRSLKYEAIIIEEK   84 (202)
T ss_pred             cCcccccCC-cccchHHHHHHHHHHHHHhhhcccceeeeeeeeceeecccccceeEEEeehhhhhhccccceeeeeeehh
Confidence            468999998 79999999999999999999999999999999999  67889999999999987 567788999999995


Q ss_pred             --CCCceeEEEEEec
Q 048720           83 --WNKFKQLWEFKHT   95 (109)
Q Consensus        83 --W~~~~~l~~f~~~   95 (109)
                        |++.++|.+|-..
T Consensus        85 ~~~~~~~kl~s~l~~   99 (202)
T PF07430_consen   85 PQLTRIRKLASILAI   99 (202)
T ss_pred             hhhhhhhhhheeeEE
Confidence              9999999987543


No 6  
>TIGR01638 Atha_cystat_rel Arabidopsis thaliana cystatin-related protein. This model represents a family similar in sequence and probably homologous to a large family of cysteine proteinase inhibitors, or cystatins, as described by pfam model pfam00031. Cystatins may help plants resist attack by insects.
Probab=99.04  E-value=2.4e-09  Score=69.55  Aligned_cols=65  Identities=12%  Similarity=0.051  Sum_probs=56.5

Q ss_pred             CCCHHHHHHHHHHHHHHHhhcCCceeEEEEEEEEEEeeeeEeEEEEEEEeeC---ceeeEEEEEEEEe
Q 048720           17 RKNADVEGFVHFSIQEHNKKENALLEFARVLKAKVQVVAGKLYCFILQVIKN---VKKKIYEAKISVK   81 (109)
Q Consensus        17 ~~d~~v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QVVaG~nY~l~v~~~~~---~~~~~y~~~Vw~~   81 (109)
                      .+.+-++.++..|+++||+..+.+++|++|++|..|..+|++|+|||.+.|.   ...+.|++.||..
T Consensus        10 T~rd~~~~la~~al~k~N~~~~t~lEfV~vVrAn~~~~~g~~~yITF~Ard~~d~p~~e~~q~~v~~~   77 (92)
T TIGR01638        10 TNRDLLERLSYVASKKYNDTKFLNLELVEVVRANYRGGAKSKSYITFEARDKPDGPLGEYQQAAVVYL   77 (92)
T ss_pred             CHHHHHHHHHHHHHHHhhhhcCceEEEEEEEEEEeeccceEEEEEEEEEecCCCCCHHHhhheeeEec
Confidence            4556699999999999999999999999999999999999999999999873   3456677777764


No 7  
>PF06907 Latexin:  Latexin;  InterPro: IPR009684 This family consists of several animal specific latexin and proteins related to latexin that belong to MEROPS proteinase inhibitor family I47, clan I-.  Latexin, a protein possessing inhibitory activity against rat carboxypeptidase A1 (CPA1) and CPA2 (MEROPS peptidase family M14A), is expressed in a neuronal subset in the cerebral cortex and cells in other neural and non-neural tissues of rat. OCX-32, the 32 kDa eggshell matrix protein, is present at high levels in the uterine fluid during the terminal phase of eggshell formation, and is localised predominantly in the outer eggshell. The timing of OCX-32 secretion into the uterine fluid suggests that it may play a role in the termination of mineral deposition. OCX-32 protein possesses limited identity (32%) to two unrelated proteins: latexin and to a skin protein that is encoded by a retinoic acid receptor-responsive gene, TIG1. Tazarotene Induced Gene 1 (TIG1) is a putative 228 transmembrane protein with a small N-terminal intracellular region, a single membrane-spanning hydrophobic region, and a large C-terminal extracellular region containing a glycosylation signal. TIG1 is up-regulated by retinoic acid receptor but not by retinoid X receptor-specific synthetic retinoids. TIG1 may be a tumour suppressor gene whose diminished expression is involved in the malignant progression of prostate cancer.; PDB: 1WNH_A 2BO9_B.
Probab=96.34  E-value=0.21  Score=37.26  Aligned_cols=78  Identities=13%  Similarity=0.100  Sum_probs=54.8

Q ss_pred             CCCHHHHHHHHHHHHHHHhhcCCc---eeEEEEEEEEEEeee--eEeEEEEEEEee---CceeeEEEEEEEEecCCCcee
Q 048720           17 RKNADVEGFVHFSIQEHNKKENAL---LEFARVLKAKVQVVA--GKLYCFILQVIK---NVKKKIYEAKISVKSWNKFKQ   88 (109)
Q Consensus        17 ~~d~~v~~la~~Av~~~N~~~~~~---~~~~kVv~a~~QVVa--G~nY~l~v~~~~---~~~~~~y~~~Vw~~PW~~~~~   88 (109)
                      ++.-..+++|+-|+.=+|-+.+.+   +.+.+|.+|...++.  |-+|.|.|.+.+   +.....|.|+|+-. -.+..-
T Consensus         4 p~h~~a~rAA~va~hy~N~~~GSP~~l~~l~~V~~a~~e~ip~~G~Ky~L~FSte~~~~~e~~g~CsA~V~f~-~qkp~P   82 (220)
T PF06907_consen    4 PSHRPAQRAARVAQHYINYRAGSPSRLFVLQQVQKARAEDIPGEGCKYDLVFSTEEYIEGEHLGNCSAEVFFK-NQKPRP   82 (220)
T ss_dssp             TTSHHHHHHHHHHHHHHHHHH-BTTB-EEEEEEEEEEEEEETTTEEEEEEEEEEEETTT---EEEEEEEEEET-T-----
T ss_pred             CcchHHHHHHHHHHHHhccccCCCceeeehhhhhhhhheeccCCCCEEEEEEEhHHhhcCCceeEeEEEEEec-CCCCCC
Confidence            445568899999998889987765   566889999999884  799999999987   35678999999873 233333


Q ss_pred             EEEEEec
Q 048720           89 LWEFKHT   95 (109)
Q Consensus        89 l~~f~~~   95 (109)
                      -..+.|.
T Consensus        83 ~V~vtc~   89 (220)
T PF06907_consen   83 AVNVTCT   89 (220)
T ss_dssp             EEEEEEC
T ss_pred             cEEEEEE
Confidence            4555554


No 8  
>TIGR01572 A_thl_para_3677 Arabidopsis paralogous family TIGR01572. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL.
Probab=94.68  E-value=0.21  Score=38.20  Aligned_cols=60  Identities=17%  Similarity=0.241  Sum_probs=53.9

Q ss_pred             HHHHHHHHHHHHHhhcCCceeEEEEEEEEEEeeeeEeEEEEEEEeeC---ceeeEEEEEEEEe
Q 048720           22 VEGFVHFSIQEHNKKENALLEFARVLKAKVQVVAGKLYCFILQVIKN---VKKKIYEAKISVK   81 (109)
Q Consensus        22 v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QVVaG~nY~l~v~~~~~---~~~~~y~~~Vw~~   81 (109)
                      +.-.|+.++.-||-+.+.+|+|..|.+.-++..+=+.|+||+.+.|.   ...+.|+..|.++
T Consensus        42 vklyAr~GLH~YN~~~GTNlel~~v~K~N~~~~~~~syyITL~A~DP~s~~s~qTFQtrV~e~  104 (265)
T TIGR01572        42 VKIYARVGLHRYNFLEGTNLELDHVDKFNKRMCALSSYYITLLAVDPDSRFLQQTFQVRVDEQ  104 (265)
T ss_pred             HHHHHHhhhhhhhhccCccceehhhhhhccchhhheeeeEEEEEecCCccccceEEEEEEEec
Confidence            68889999999999999999999999999999999999999999884   3567888888776


No 9  
>PF05679 CHGN:  Chondroitin N-acetylgalactosaminyltransferase;  InterPro: IPR008428 This family represents Chondroitin N-acetylgalactosaminyltransferase. Proteins have a type II transmembrane topology. The enzyme is involved in the biosynthetic initiation and elongation of chondroitin sulphate and is the key enzyme responsible for the selective chain assembly of chondroitin/dermatan sulphate on the linkage region tetrasaccharide common to various proteoglycans containing chondroitin/dermatan sulphate or heparin/heparan sulphate chains. ; GO: 0016758 transferase activity, transferring hexosyl groups, 0032580 Golgi cisterna membrane
Probab=82.25  E-value=5.7  Score=32.81  Aligned_cols=49  Identities=20%  Similarity=0.352  Sum_probs=43.2

Q ss_pred             CHHHHHHHHHHHHHHHhhcCCceeEEEEEEEEEEe--eeeEeEEEEEEEee
Q 048720           19 NADVEGFVHFSIQEHNKKENALLEFARVLKAKVQV--VAGKLYCFILQVIK   67 (109)
Q Consensus        19 d~~v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QV--VaG~nY~l~v~~~~   67 (109)
                      -.++.++...|+...|+.....++|.+++.+...+  .-|+-|.|++.+..
T Consensus       160 ~~dl~~vi~~a~~~ln~~~~~~~~~~~l~~GY~R~dp~rG~~Y~Ldl~l~~  210 (499)
T PF05679_consen  160 REDLDDVIEQAMEELNRKSRRVLEFRDLINGYRRFDPTRGMDYILDLLLKY  210 (499)
T ss_pred             HHHHHHHHHHHHHHHhccccccEEeeeeeeEEEEecCCCCceEEEEEEEee
Confidence            36799999999999999988889999999998876  56999999998765


No 10 
>PF00666 Cathelicidins:  Cathelicidin;  InterPro: IPR001894 The precursor sequences of a number of antimicrobial peptides secreted by neutrophils (polymorphonuclear leukocytes) upon activation have been found to be evolutionarily related and are collectively known as cathelicidins []. Structurally, these proteins consist of three domains: a signal sequence, a conserved region of about 100 residues that contains four cysteines involved in two disulphide bonds, and a highly divergent C-terminal section of variable size. It is in this C-terminal section that the antibacterial peptides are found; they are proteolytically processed from their precursor by enzymes such as elastase. This structure is shown in the following schematic representation:  +---+--------------------------------+--------------------+ |Sig| Propeptide C C C C | Antibacterial pep. | +---+----------------|--|--|--|------+--------------------+ | | | | +--+ +--+ 'C': conserved cysteine involved in a disulphide bond. ; GO: 0006952 defense response, 0005576 extracellular region; PDB: 1KWI_A 1PFP_A 1LXE_A 1N5P_A 1N5H_A.
Probab=61.33  E-value=13  Score=22.80  Aligned_cols=46  Identities=11%  Similarity=0.013  Sum_probs=26.8

Q ss_pred             HHHHHHHHHHHHHhhcCCceeEEEEEEEEEEee----eeEeEEEEEEEeeC
Q 048720           22 VEGFVHFSIQEHNKKENALLEFARVLKAKVQVV----AGKLYCFILQVIKN   68 (109)
Q Consensus        22 v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QVV----aG~nY~l~v~~~~~   68 (109)
                      ++++...||..||+++... -+++++.+.-|--    .++.--+.|.+.++
T Consensus         4 Y~eav~~Av~~yN~~s~~~-nlfRLLe~~p~P~~~~~~~~~~pl~FtIkET   53 (67)
T PF00666_consen    4 YEEAVLRAVDFYNQGSSGE-NLFRLLELDPPPGWDEDPSTPKPLNFTIKET   53 (67)
T ss_dssp             CHHHHHHHHHHHHHCS-SS-EEEEEEEE---SSSSSSSSS-EEEEEEEEEE
T ss_pred             HHHHHHHHHHHHhcCCCcc-CceeeeeccCCCCCCCCcCcceeeEEEEeec
Confidence            6789999999999997754 2345555554432    23445555665554


No 11 
>KOG2650 consensus Zinc carboxypeptidase [Function unknown]
Probab=56.47  E-value=38  Score=27.67  Aligned_cols=63  Identities=10%  Similarity=0.028  Sum_probs=44.3

Q ss_pred             cccceeecCCCCCCHHHHHHHHHHHHHHHhhcCCceeEEEEEE------E----EEEeeeeEeEEEEEEEeeCc
Q 048720            6 LLQGVYDYGSNRKNADVEGFVHFSIQEHNKKENALLEFARVLK------A----KVQVVAGKLYCFILQVIKNV   69 (109)
Q Consensus         6 ~~GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~~~~~~kVv~------a----~~QVVaG~nY~l~v~~~~~~   69 (109)
                      .|=|++... .++-++++++|+.|++...+..+..|++...-.      +    ..+=+.|++|-+++++.|.+
T Consensus       317 yPyg~~~~~-~~~~~dl~~va~~a~~ai~~~~gt~Y~~G~~~~~~y~asG~S~Dway~~~gi~~~ft~ELrd~g  389 (418)
T KOG2650|consen  317 YPYGYTNDL-PEDYEDLQEVARAAADALKSVYGTKYTVGSSADTLYPASGGSDDWAYDVLGIPYAFTFELRDTG  389 (418)
T ss_pred             ecccccCCC-CCCHHHHHHHHHHHHHHHHHHhCCEEEeccccceeeccCCchHHHhhhccCCCEEEEEEeccCC
Confidence            344565554 356677999999999999999888887742221      1    12346788999999988654


No 12 
>COG3360 Uncharacterized conserved protein [Function unknown]
Probab=53.85  E-value=51  Score=20.35  Aligned_cols=45  Identities=20%  Similarity=0.284  Sum_probs=33.6

Q ss_pred             HHHHHHHHHHHHHHhhcCCceeEEEEEEEEEEeeee--EeEEEEEEEe
Q 048720           21 DVEGFVHFSIQEHNKKENALLEFARVLKAKVQVVAG--KLYCFILQVI   66 (109)
Q Consensus        21 ~v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QVVaG--~nY~l~v~~~   66 (109)
                      .+.++++-|++.-. ++-+.+...+|+.-+-+|+.|  .-|.++++++
T Consensus        18 S~d~Ai~~Ai~RA~-~t~~~l~wfeV~~~rg~v~~g~v~hyqv~lkVg   64 (71)
T COG3360          18 SIDAAIANAIARAA-DTLDNLDWFEVVETRGHVVDGAVAHYQVTLKVG   64 (71)
T ss_pred             cHHHHHHHHHHHHH-hhhhcceEEEEEeecccEeecceEEEEEEEEEE
Confidence            36777777887543 244578899999999999987  4688888764


No 13 
>PF12274 DUF3615:  Protein of unknown function (DUF3615);  InterPro: IPR022059  This domain family is found in bacteria and eukaryotes, and is typically between 86 and 97 amino acids in length. There is a conserved FAE sequence motif. There is a single completely conserved residue F that may be functionally important. 
Probab=49.57  E-value=65  Score=20.34  Aligned_cols=67  Identities=12%  Similarity=0.042  Sum_probs=43.8

Q ss_pred             HHhhc---CCceeEEEEEEEEEEeeee-E-eEEEEEEEeeC------ceeeEEEEEEEEecCCCceeEEEEEecCCCCC
Q 048720           33 HNKKE---NALLEFARVLKAKVQVVAG-K-LYCFILQVIKN------VKKKIYEAKISVKSWNKFKQLWEFKHTKHGPF  100 (109)
Q Consensus        33 ~N~~~---~~~~~~~kVv~a~~QVVaG-~-nY~l~v~~~~~------~~~~~y~~~Vw~~PW~~~~~l~~f~~~~~~~~  100 (109)
                      ||++.   +..|++.+|+....=.=.| . =|++.|.+...      +....|=|+|. ..-.....+.-+-++...++
T Consensus         1 Yn~~~~~~~~~yeL~~v~~~~~~~e~~~~~y~HvNF~A~~~~~~~~~~~~~LFFAE~~-~~~~~~~~v~~C~~l~~~~~   78 (96)
T PF12274_consen    1 YNEDHPLLGLEYELVDVLHSCFIFERGGWNYYHVNFTAKTKGPDSDDGSPTLFFAEVS-NDCKDEDDVSCCCPLEPPDN   78 (96)
T ss_pred             CcccCCCCCcCEEEeEEEeeeeeEeCCCcEEEeEEEEEEcCCccCCCCCceEEEEEEe-cCCCCCCEEEEEEEeECCCC
Confidence            45554   5578999988876544333 3 36888888653      35678999997 23345667777777655554


No 14 
>PF07311 Dodecin:  Dodecin;  InterPro: IPR009923 This entry represents proteins with a Dodecin-like topology. Dodecin flavoprotein is a small dodecameric flavin-binding protein from Halobacterium salinarium (Halobacterium halobium) that contains two flavins stacked in a single binding pocket between two tryptophan residues to form an aromatic tetrade. Dodecin binds riboflavin, although it appears to have a broad specificity for flavins. Lumichrome, a molecule associated with flavin metabolism, appears to be a ligand of dodecin, which could act as a waste-trapping device. ; PDB: 2VYX_L 2DEG_F 2V18_K 2V19_D 2UX9_B 2CZ8_E 2V21_F 2CC8_A 2CCB_A 2VX9_A ....
Probab=48.24  E-value=61  Score=19.66  Aligned_cols=48  Identities=19%  Similarity=0.165  Sum_probs=37.4

Q ss_pred             CCHHHHHHHHHHHHHHHhhcCCceeEEEEEEEEEEeeee--EeEEEEEEEe
Q 048720           18 KNADVEGFVHFSIQEHNKKENALLEFARVLKAKVQVVAG--KLYCFILQVI   66 (109)
Q Consensus        18 ~d~~v~~la~~Av~~~N~~~~~~~~~~kVv~a~~QVVaG--~nY~l~v~~~   66 (109)
                      +.....++++.|+.+-++- =.+++-++|..-+-.|..|  ..|+.+++++
T Consensus        12 S~~S~edAv~~Av~~A~kT-l~ni~~~eV~e~~~~v~dg~i~~y~v~lkv~   61 (66)
T PF07311_consen   12 SPKSWEDAVQNAVARASKT-LRNIRWFEVKEQRGHVEDGKITEYQVNLKVS   61 (66)
T ss_dssp             ESSHHHHHHHHHHHHHHHH-SSSEEEEEEEEEEEEEETTCEEEEEEEEEEE
T ss_pred             CCCCHHHHHHHHHHHHhhc-hhCcEEEEEEEEEEEEeCCcEEEEEEEEEEE
Confidence            3456889999999887654 3467888999988889887  5788888764


No 15 
>cd05881 Ig1_Necl-2 First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule 2 (also known as cell adhesion molecule 1 (CADM1)). Ig1_Necl-2: domain similar to the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-2, Necl-2 (also known as cell adhesion molecule 1 (CADM1), SynCAM1, IGSF4A, Tslc1, sgIGSF, and RA175).  Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region, belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-2 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-2 is expressed in a wide variety of tissues, and is a 
Probab=40.40  E-value=35  Score=21.77  Aligned_cols=28  Identities=7%  Similarity=0.035  Sum_probs=21.2

Q ss_pred             eeEeEEEEEEEeeCceeeEEEEEEEEec
Q 048720           55 AGKLYCFILQVIKNVKKKIYEAKISVKS   82 (109)
Q Consensus        55 aG~nY~l~v~~~~~~~~~~y~~~Vw~~P   82 (109)
                      .|..|.|+|+-........|.|.+|..|
T Consensus        58 ~~~~~tL~I~~vq~~D~G~Y~Cqv~t~p   85 (95)
T cd05881          58 SSNELRVSLSNVSLSDEGRYFCQLYTDP   85 (95)
T ss_pred             CCCEEEEEECcCCcccCEEEEEEEEccc
Confidence            4778887776555555678999999987


No 16 
>PRK15344 type III secretion system needle protein SsaG; Provisional
Probab=29.46  E-value=87  Score=19.37  Aligned_cols=21  Identities=14%  Similarity=0.097  Sum_probs=18.1

Q ss_pred             CCCCHHHHHHHHHHHHHHHhh
Q 048720           16 NRKNADVEGFVHFSIQEHNKK   36 (109)
Q Consensus        16 ~~~d~~v~~la~~Av~~~N~~   36 (109)
                      +++||+..--+.|++.+|+.-
T Consensus        29 ~~~nP~~ml~lQf~i~QyS~~   49 (71)
T PRK15344         29 DLLNPESMIKAQFALQQYSTF   49 (71)
T ss_pred             CCCCHHHHHHHHHHHHHHHHH
Confidence            688999888899999999764


No 17 
>TIGR02105 III_needle type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus.
Probab=28.93  E-value=86  Score=19.28  Aligned_cols=21  Identities=5%  Similarity=0.177  Sum_probs=18.0

Q ss_pred             CCCCHHHHHHHHHHHHHHHhh
Q 048720           16 NRKNADVEGFVHFSIQEHNKK   36 (109)
Q Consensus        16 ~~~d~~v~~la~~Av~~~N~~   36 (109)
                      .++||...--..|++.+||--
T Consensus        30 ~~~nP~~La~~Q~~~~qYs~~   50 (72)
T TIGR02105        30 LPNDPELMAELQFALNQYSAY   50 (72)
T ss_pred             CCCCHHHHHHHHHHHHHHHHH
Confidence            468999988899999999864


No 18 
>PF03504 Chlam_OMP6:  Chlamydia cysteine-rich outer membrane protein 6;  InterPro: IPR003506 Three cysteine-rich proteins (also believed to be lipoproteins) make up the extracellular matrix of the Chlamydial outer membrane. They are involved in the essential structural integrity of both the elementary body (EB) and reticulate body (RB) phase. As these bacteria lack the peptidoglycan layer common to most Gram-negative microbes, such proteins are highly important in the pathogenicity of the organism. The largest of these is the major outer membrane protein (MOMP), and constitutes around 60% of the total protein for the membrane. OMP6 is the second largest, with a molecular mass of 58 kDa, while the OMP3 protein is ~15 kDa. MOMP is believed to elicit the strongest immune response, and has recently been linked to heart disease through its sequence similarity to a murine heart-muscle specific alpha myosin. The OMP6 family plays a structural role in the outer membrane during the EB stage of the Chlamydial cell, and different biovars show a small, yet highly significant, change at peptide charge level. Members of this family include Chlamydia trachomatis, Chlamydia pneumoniae and Chlamydia psittaci.; GO: 0005201 extracellular matrix structural constituent
Probab=27.21  E-value=84  Score=20.26  Aligned_cols=30  Identities=7%  Similarity=0.055  Sum_probs=17.0

Q ss_pred             EEeeeeEeEEEEEEEeeCceeeEEEEEEEEec
Q 048720           51 VQVVAGKLYCFILQVIKNVKKKIYEAKISVKS   82 (109)
Q Consensus        51 ~QVVaG~nY~l~v~~~~~~~~~~y~~~Vw~~P   82 (109)
                      +|-+++.+-..++.-.+  .-..|.+.||.+|
T Consensus        66 ttp~~D~kLVW~Ig~l~--~G~k~kItVwVKP   95 (95)
T PF03504_consen   66 TTPTPDGKLVWKIGRLG--QGEKCKITVWVKP   95 (95)
T ss_pred             cccCCCCEEEEEecccc--CCceeEEEEEeCC
Confidence            34445554444443222  2346999999987


No 19 
>PF02995 DUF229:  Protein of unknown function (DUF229);  InterPro: IPR004245 Members of this family are uncharacterised with a long conserved region that may contain several domains.
Probab=26.70  E-value=75  Score=26.25  Aligned_cols=30  Identities=13%  Similarity=0.137  Sum_probs=25.2

Q ss_pred             cccceeecCCCCCCHHHHHHHHHHHHHHHhhc
Q 048720            6 LLQGVYDYGSNRKNADVEGFVHFSIQEHNKKE   37 (109)
Q Consensus         6 ~~GG~~~~~~~~~d~~v~~la~~Av~~~N~~~   37 (109)
                      .|.+|.++  +.+++.++.+|+++|...|+..
T Consensus       450 ~C~~~~~~--~~~~~~~~~~a~~~v~~iN~~l  479 (497)
T PF02995_consen  450 TCEGWKTI--PTNDSLVQRIAKFLVDHINEYL  479 (497)
T ss_pred             cCcCcccc--ccCcHHHHHHHHHHHHHHHHHH
Confidence            35678888  4788899999999999999973


No 20 
>KOG1693 consensus emp24/gp25L/p24 family of membrane trafficking proteins [Intracellular trafficking, secretion, and vesicular transport]
Probab=24.37  E-value=1.7e+02  Score=21.80  Aligned_cols=24  Identities=13%  Similarity=0.073  Sum_probs=20.5

Q ss_pred             EEEEEEEEEeeeeEeEEEEEEEee
Q 048720           44 ARVLKAKVQVVAGKLYCFILQVIK   67 (109)
Q Consensus        44 ~kVv~a~~QVVaG~nY~l~v~~~~   67 (109)
                      .+....+.||.+|-+|-+++.+.+
T Consensus        44 ~~~~~~~fqV~tGG~fDVD~~I~a   67 (209)
T KOG1693|consen   44 DDTTSFEFQVQTGGHFDVDYDIEA   67 (209)
T ss_pred             CceEEEEEEEEeCCceeeEEEEEC
Confidence            456778899999999999998876


No 21 
>smart00678 WWE Domain in Deltex and TRIP12 homologues. Possibly involved in regulation of ubiquitin-mediated proteolysis.
Probab=23.46  E-value=1.7e+02  Score=17.18  Aligned_cols=12  Identities=17%  Similarity=0.332  Sum_probs=9.2

Q ss_pred             eeeEeEEEEEEE
Q 048720           54 VAGKLYCFILQV   65 (109)
Q Consensus        54 VaG~nY~l~v~~   65 (109)
                      +.|.+|.|.|..
T Consensus        38 ~~g~~Y~IdF~~   49 (73)
T smart00678       38 ICGFPYTIDFNA   49 (73)
T ss_pred             ECCeEEEEECcC
Confidence            357899998864


No 22 
>PF14201 DUF4318:  Domain of unknown function (DUF4318)
Probab=21.44  E-value=2.1e+02  Score=17.64  Aligned_cols=44  Identities=20%  Similarity=0.209  Sum_probs=30.2

Q ss_pred             HHHHHHHHHhhcCCceeEEEEEE-EEEEeeeeEeEEEEE---EEeeCce
Q 048720           26 VHFSIQEHNKKENALLEFARVLK-AKVQVVAGKLYCFIL---QVIKNVK   70 (109)
Q Consensus        26 a~~Av~~~N~~~~~~~~~~kVv~-a~~QVVaG~nY~l~v---~~~~~~~   70 (109)
                      ...|+..|=.+++..++|++=.. +... ..|.+|.+++   ...+++-
T Consensus        20 i~~aIE~YC~~~~~~l~Fisr~~Pi~~~-idg~lYev~i~~~~~~rggy   67 (74)
T PF14201_consen   20 ICEAIEKYCIKNGESLEFISRDKPITFK-IDGVLYEVEIDEEYMARGGY   67 (74)
T ss_pred             HHHHHHHHHHHcCCceEEEecCCcEEEE-ECCeEEEEEEEeeecccCcE
Confidence            34488899888888888854333 3333 3899999999   4445553


No 23 
>PF04475 DUF555:  Protein of unknown function (DUF555);  InterPro: IPR007564 This is a family of uncharacterised, hypothetical archaeal proteins.
Probab=20.82  E-value=1.4e+02  Score=19.69  Aligned_cols=41  Identities=15%  Similarity=0.175  Sum_probs=31.1

Q ss_pred             cceeecCCCCCCHHHHHHHHHHHHHHHhhcCCceeEEEEEEEEEE
Q 048720            8 QGVYDYGSNRKNADVEGFVHFSIQEHNKKENALLEFARVLKAKVQ   52 (109)
Q Consensus         8 GG~~~~~~~~~d~~v~~la~~Av~~~N~~~~~~~~~~kVv~a~~Q   52 (109)
                      +.|.-.  |.  +.++++..-|+++-.+..|..+.|++|--+.++
T Consensus         9 aa~~V~--dv--~s~dDAI~iAIseaGkrLn~~~~~VeIevG~~~   49 (102)
T PF04475_consen    9 AAWIVR--DV--ESVDDAIGIAISEAGKRLNPDLDYVEIEVGDTI   49 (102)
T ss_pred             eeEEEe--cC--CcHHHHHHHHHHHHHHhhCCCCCeEEEecCccc
Confidence            456555  34  347889999999999998889999888765543


No 24 
>KOG3205 consensus Rho GDP-dissociation inhibitor [Signal transduction mechanisms]
Probab=20.35  E-value=2e+02  Score=21.23  Aligned_cols=28  Identities=14%  Similarity=0.258  Sum_probs=19.0

Q ss_pred             cCCceeEEEEEEEEEEeeeeEeEEEEEE
Q 048720           37 ENALLEFARVLKAKVQVVAGKLYCFILQ   64 (109)
Q Consensus        37 ~~~~~~~~kVv~a~~QVVaG~nY~l~v~   64 (109)
                      .|..|++.-..+++.-+|+|..|.=++.
T Consensus       103 EGs~Y~lki~F~Vq~eIvSGLrY~q~v~  130 (200)
T KOG3205|consen  103 EGSEYRLKISFRVQREIVSGLRYVQTVY  130 (200)
T ss_pred             cCcEEEEEEEEEEeeheeccceeeeEEe
Confidence            3455666666777777888887766554


Done!