Query         033492
Match_columns 118
No_of_seqs    131 out of 1034
Neff          7.3 
Searched_HMMs 46136
Date          Fri Mar 29 02:54:20 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/033492.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/033492hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00042 CY Substituted updates  99.9 1.5E-22 3.3E-27  133.6  12.1   84   30-114     1-104 (105)
  2 smart00043 CY Cystatin-like do  99.9 1.4E-22 2.9E-27  134.3   8.8   86   28-114     2-105 (107)
  3 PF00031 Cystatin:  Cystatin do  99.9 1.1E-20 2.4E-25  122.3  10.8   74   30-104     1-94  (94)
  4 PF07430 PP1:  Phloem filament   99.6 9.1E-15   2E-19  105.5  10.2   86   25-111   110-199 (202)
  5 PF07430 PP1:  Phloem filament   99.1 2.2E-10 4.8E-15   82.9   6.0   91   25-115     3-98  (202)
  6 TIGR01638 Atha_cystat_rel Arab  98.7 7.4E-08 1.6E-12   62.5   7.2   66   37-102     9-77  (92)
  7 PF06907 Latexin:  Latexin;  In  96.3   0.086 1.9E-06   39.3  10.1   66   37-102     3-76  (220)
  8 TIGR01572 A_thl_para_3677 Arab  90.9    0.55 1.2E-05   36.0   4.9   60   43-102    42-104 (265)
  9 PF07172 GRP:  Glycine rich pro  88.7    0.27 5.9E-06   32.1   1.5    7    1-7       1-7   (95)
 10 PF05679 CHGN:  Chondroitin N-a  78.5     9.9 0.00021   31.6   6.9   50   41-90    161-212 (499)
 11 COG3360 Uncharacterized conser  76.0      16 0.00034   22.5   6.5   45   42-87     18-64  (71)
 12 PF12276 DUF3617:  Protein of u  74.5       4 8.6E-05   28.3   3.1   34    1-34      1-34  (162)
 13 PF07311 Dodecin:  Dodecin;  In  72.5      19 0.00041   21.9   7.8   48   39-87     12-61  (66)
 14 PF10731 Anophelin:  Thrombin i  71.9     4.1 8.9E-05   24.5   2.2   18    1-18      1-18  (65)
 15 PF05984 Cytomega_UL20A:  Cytom  66.3     5.9 0.00013   25.5   2.2   21    1-21      1-21  (100)
 16 PF00666 Cathelicidins:  Cathel  65.7     7.1 0.00015   23.9   2.4   47   42-89      3-53  (67)
 17 KOG2650 Zinc carboxypeptidase   49.8      47   0.001   27.3   5.3   63   28-90    317-389 (418)
 18 PF01456 Mucin:  Mucin-like gly  48.6      12 0.00026   25.4   1.5   22    1-22      2-25  (143)
 19 PF03032 Brevenin:  Brevenin/es  48.2      15 0.00033   20.7   1.7   18    4-21      4-21  (46)
 20 TIGR01601 PYST-C1 Plasmodium y  43.9      14 0.00031   23.3   1.2   16    1-16      1-17  (82)
 21 cd06379 PBP1_iGluR_NMDA_NR1 N-  42.6      37 0.00081   26.3   3.7   42   11-58      5-46  (377)
 22 PRK10386 curli assembly protei  41.6 1.2E+02  0.0025   21.0   6.9   81    1-90      1-82  (130)
 23 PRK15344 type III secretion sy  33.8      67  0.0014   19.9   3.0   22   36-57     28-49  (71)
 24 TIGR02105 III_needle type III   33.8      60  0.0013   20.0   2.9   21   37-57     30-50  (72)
 25 PF13028 DUF3889:  Protein of u  30.2 1.6E+02  0.0035   19.2  10.8   69   39-108    19-89  (97)
 26 PF06157 DUF973:  Protein of un  29.4      82  0.0018   24.5   3.6   29   72-103   257-285 (285)
 27 CHL00132 psaF photosystem I su  27.2 2.1E+02  0.0046   20.9   5.1   26   28-55     25-50  (185)
 28 PF12274 DUF3615:  Protein of u  27.2 1.7E+02  0.0036   18.5   7.9   56   59-115     9-72  (96)
 29 COG4676 Uncharacterized protei  26.2      28 0.00061   26.4   0.5   33   17-50     19-54  (268)
 30 PRK09408 ompX outer membrane p  26.2      56  0.0012   23.4   2.1   37    1-38      1-37  (171)
 31 cd06247 M14_CPO Peptidase M14   26.2   1E+02  0.0022   24.0   3.7   61   29-90    199-269 (298)
 32 PF03823 Neurokinin_B:  Neuroki  26.2      81  0.0018   18.7   2.4   34    1-34      1-38  (59)
 33 PRK15346 outer membrane secret  24.8      95  0.0021   25.8   3.5   48    1-51      1-48  (499)
 34 PF08139 LPAM_1:  Prokaryotic m  23.9      53  0.0011   16.2   1.1   12    3-14      8-19  (25)
 35 PF03082 MAGSP:  Male accessory  23.0      55  0.0012   24.7   1.6   22    1-22      1-23  (264)
 36 TIGR03431 PhnD phosphonate ABC  22.0 3.3E+02  0.0072   20.1   5.9   34    1-34      1-35  (288)
 37 PF05887 Trypan_PARP:  Procycli  21.9      30 0.00065   24.1   0.0   13    1-13      1-13  (143)
 38 PF05438 TRH:  Thyrotropin-rele  20.7 1.8E+02  0.0039   21.7   3.8   16    6-21      2-17  (212)
 39 cd01781 AF6_RA_repeat2 Ubiquit  20.3 1.5E+02  0.0032   19.5   3.0   30   39-68     24-55  (100)
 40 PF07353 Uroplakin_II:  Uroplak  20.2 2.4E+02  0.0051   20.5   4.2   16   75-90    110-125 (184)
 41 PF15281 Consortin_C:  Consorti  20.2      76  0.0016   21.3   1.6   15    7-21     56-70  (113)
 42 PF07312 DUF1459:  Protein of u  20.1      93   0.002   19.8   1.9   16    1-16      1-16  (84)

No 1  
>cd00042 CY Substituted updates: Jan 30, 2002
Probab=99.89  E-value=1.5e-22  Score=133.56  Aligned_cols=84  Identities=46%  Similarity=0.727  Sum_probs=78.2

Q ss_pred             ceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCC-eEEEEEEEEEEEEeceEEEEEEEEEEeCC------------------
Q 033492           30 GGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSA-LKFESVEKGETQVVSGTNYRLILVVKDGP------------------   90 (118)
Q Consensus        30 GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~-~~~~~V~~a~~QVVaG~nY~l~v~~~~~~------------------   90 (118)
                      |||.++ +.+||++++++++|+.+||+++++. |.+.+|+++++|||+|++|++++++.+++                  
T Consensus         1 gg~~~~-~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~i~~~~~QvvaG~~y~i~~~~~~t~C~k~~~~~~~~~c~~~~~   79 (105)
T cd00042           1 GGPSDI-PANDPEVQELADFAVAEYNKKSNDKYLEFFKVLSAKSQVVAGTNYYITVEAGDTNCKKSSVPLDCPDCKLLEE   79 (105)
T ss_pred             CCCccC-CCCCHHHHHHHHHHHHHHHhhcCccceeEEEEEEEEEEEEeeeEEEEEEEEecccccccCccccccccccccc
Confidence            899995 8999999999999999999999988 77899999999999999999999999863                  


Q ss_pred             -CceeEEEEEEEecCCCceEEEEee
Q 033492           91 -STKKFEAVVWEKPWEHFKSLTSFK  114 (118)
Q Consensus        91 -~~~~c~~~V~~~PW~~~~~l~sf~  114 (118)
                       ....|.+.||.+||.+..++++++
T Consensus        80 ~~~~~C~~~V~~~pw~~~~~l~~~~  104 (105)
T cd00042          80 GKKKFCTAKVWEKPWENFKELLSFK  104 (105)
T ss_pred             CCCEEEEEEEEecCCCCceeeeecc
Confidence             578999999999999999999875


No 2  
>smart00043 CY Cystatin-like domain. Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as  kininogen, His-rich glycoprotein and fetuin also contain these domains.
Probab=99.88  E-value=1.4e-22  Score=134.34  Aligned_cols=86  Identities=41%  Similarity=0.628  Sum_probs=78.5

Q ss_pred             ccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeE--EEEEEEEEEEEeceEEEEEEEEEEeCC-Cc------------
Q 033492           28 LVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALK--FESVEKGETQVVSGTNYRLILVVKDGP-ST------------   92 (118)
Q Consensus        28 ~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~--~~~V~~a~~QVVaG~nY~l~v~~~~~~-~~------------   92 (118)
                      ++|||.++ +.+||++++++++|+++||+++++.|.  +.+++++++|||+|++|++++++.+++ +.            
T Consensus         2 ~~Gg~~~~-~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~~v~~a~~QvvaG~~y~l~~~v~~t~C~k~~~~~~~C~~~~   80 (107)
T smart00043        2 CLGGPSDV-PPNDPEVQEAADFAVAEYNKKSNDKYELRVIKVVSAKSQVVAGTNYYLKVEVGETNCKKLSVDLENCPFLD   80 (107)
T ss_pred             CCCCCccC-CCCCHHHHHHHHHHHHHHHHhcccchhhhhhhhheeeeeeecceEEEEEEEEEeceeccCCcccccCCCCC
Confidence            68999995 899999999999999999999998887  689999999999999999999999876 11            


Q ss_pred             ---eeEEEEEEEecCCCceEEEEee
Q 033492           93 ---KKFEAVVWEKPWEHFKSLTSFK  114 (118)
Q Consensus        93 ---~~c~~~V~~~PW~~~~~l~sf~  114 (118)
                         ..|.++||.+||+++.++++|+
T Consensus        81 ~~~~~C~~~V~~~pw~~~~~~~~~~  105 (107)
T smart00043       81 QGEKFCTAKVWEKPWENKIKLVEFK  105 (107)
T ss_pred             CCccEEEEEEEecCCCCccCcccee
Confidence               4899999999999999998875


No 3  
>PF00031 Cystatin:  Cystatin domain;  InterPro: IPR000010 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.  The cystatins are cysteine proteinase inhibitors belonging to MEROPS inhibitor family I25, clan IH [, , ]. They mainly inhibit peptidases belonging to peptidase families C1 (papain family) and C13 (legumain family). The cystatin family includes:   The Type 1 cystatins, which are intracellular cystatins that are present in the cytosol of many cell types, but can also appear in body fluids at significant concentrations. They are single-chain polypeptides of about 100 residues, which have neither disulphide bonds nor carbohydrate side chains.  The Type 2 cystatins, which are mainly extracellular secreted polypeptides synthesised with a 19-28 residue signal peptide. They are broadly distributed and found in most body fluids.  The Type 3 cystatins, which are multidomain proteins. The mammalian representatives of this group are the kininogens. There are three different kininogens in mammals: H- (high molecular mass, IPR002395 from INTERPRO) and L- (low molecular mass) kininogen which are found in a number of species, and T-kininogen that is found only in rat.  Unclassified cystatins. These are cystatin-like proteins found in a range of organisms: plant phytocystatins, fetuin in mammals, insect cystatins and a puff adder venom cystatin which inhibits metalloproteases of the MEROPS peptidase family M12 (astacin/adamalysin). Also a number of the cystatin-like proteins have been shown to be devoid of inhibitory activity.   All true cystatins inhibit cysteine peptidases of the papain family (MEROPS peptidase family C1), and some also inhibit legumain family enzymes (MEROPS peptidase family C13). These peptidases play key roles in physiological processes, such as intracellular protein degradation (cathepsins B, H and L), are pivotal in the remodelling of bone (cathepsin K), and may be important in the control of antigen presentation (cathepsin S, mammalian legumain). Moreover, the activities of such peptidases are increased in pathophysiological conditions, such as cancer metastasis and inflammation. Additionally, such peptidases are essential for several pathogenic parasites and bacteria. Thus in animals cystatins not only have capacity to regulate normal body processes and perhaps cause disease when down-regulated, but in other organisms may also participate in defence against biotic and abiotic stress. ; GO: 0004869 cysteine-type endopeptidase inhibitor activity; PDB: 3L0R_B 2W9P_K 2W9Q_A 3S67_A 3QRD_D 1R4C_G 3GAX_A 1TIJ_B 1G96_A 3NX0_A ....
Probab=99.85  E-value=1.1e-20  Score=122.32  Aligned_cols=74  Identities=41%  Similarity=0.687  Sum_probs=67.6

Q ss_pred             ceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCe--EEEEEEEEEEEEeceEEEEEEEEEEeCC-----------------
Q 033492           30 GGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSAL--KFESVEKGETQVVSGTNYRLILVVKDGP-----------------   90 (118)
Q Consensus        30 GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~--~~~~V~~a~~QVVaG~nY~l~v~~~~~~-----------------   90 (118)
                      |||+++ +++||++++++++|+.+||+++++.+  .+.+|++|++|||+|++|+|++++.+++                 
T Consensus         1 Gg~~~~-~~~dp~v~~~~~~al~~~N~~~~~~~~~~~~~v~~a~~QvV~G~~Y~i~~~~~~t~C~k~~~~~~~C~~~~~~   79 (94)
T PF00031_consen    1 GGPSPV-DPNDPEVQEAAEFALDKFNEQSNSGYKFKLVKVISATTQVVAGINYYIEFEVGETNCKKSSKDFENCPFQEEQ   79 (94)
T ss_dssp             SSEEEE-CTTSHHHHHHHHHHHHHHHHHSTTSEEEEEEEEEEEEEEESSSEEEEEEEEEEEEEEETCEEEEEECEBESTT
T ss_pred             CCCccC-CCCCHHHHHHHHHHHHHHHHhCcccCcceeeeeeEEEEeecCCceEEEEEEEEcccccccccccccCCccccC
Confidence            899995 88999999999999999999996665  5699999999999999999999998862                 


Q ss_pred             -CceeEEEEEEEecC
Q 033492           91 -STKKFEAVVWEKPW  104 (118)
Q Consensus        91 -~~~~c~~~V~~~PW  104 (118)
                       ....|.++||.+||
T Consensus        80 ~~~~~C~~~v~~~pW   94 (94)
T PF00031_consen   80 PWTKFCKFTVWERPW   94 (94)
T ss_dssp             SSEEEEEEEEEEECG
T ss_pred             CceeeEEEEEEECCC
Confidence             56799999999999


No 4  
>PF07430 PP1:  Phloem filament protein PP1;  InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem [].
Probab=99.60  E-value=9.1e-15  Score=105.50  Aligned_cols=86  Identities=38%  Similarity=0.566  Sum_probs=76.2

Q ss_pred             CCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEec--eEEEEEEEEEEeC-CCceeEEEEEEE
Q 033492           25 KGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVS--GTNYRLILVVKDG-PSTKKFEAVVWE  101 (118)
Q Consensus        25 ~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVa--G~nY~l~v~~~~~-~~~~~c~~~V~~  101 (118)
                      ..+...+|.+|.|+++|.+|++++||+.++| +.+++++|.+|.+++.|-++  |++|+|.+.+.+. ++...|++.||+
T Consensus       110 ~~p~~~~Wi~I~nin~p~VQeLgkFAV~EhN-K~gd~LkF~KV~eGw~q~l~~d~ikYrLhI~AkDg~G~~~~YeAvV~~  188 (202)
T PF07430_consen  110 ATPQSKKWIPIPNINNPFVQELGKFAVIEHN-KAGDKLKFEKVYEGWYQDLGNDGIKYRLHIVAKDGLGRLGNYEAVVWE  188 (202)
T ss_pred             cCcccCCCEECCCCCcHHHHHHHHHHHHHHh-hcCCceEEEEEeeEEEEeccCCCceEEEEEEeecCCCCcCceEEEEEE
Confidence            4566789999999999999999999999999 67899999999999999885  6999999999998 688999999999


Q ss_pred             e-cCCCceEEE
Q 033492          102 K-PWEHFKSLT  111 (118)
Q Consensus       102 ~-PW~~~~~l~  111 (118)
                      . +|.++.+++
T Consensus       189 k~~~sk~i~i~  199 (202)
T PF07430_consen  189 KQFLSKKIKIL  199 (202)
T ss_pred             eccCcceEEEE
Confidence            7 577765443


No 5  
>PF07430 PP1:  Phloem filament protein PP1;  InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem [].
Probab=99.09  E-value=2.2e-10  Score=82.92  Aligned_cols=91  Identities=29%  Similarity=0.403  Sum_probs=82.7

Q ss_pred             CCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEE--EEEeceEEEEEEEEEEeC-CCceeEEEEEEE
Q 033492           25 KGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGE--TQVVSGTNYRLILVVKDG-PSTKKFEAVVWE  101 (118)
Q Consensus        25 ~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~--~QVVaG~nY~l~v~~~~~-~~~~~c~~~V~~  101 (118)
                      +....+||.+|+|+.+|.+|+++++++.+++-+.++.++|.+|.+.+  .|..+|++|++.+++.|- .+...|.+.|++
T Consensus         3 ~~~~~~~w~~ip~v~~~~~q~v~~~~veq~k~~~~~~l~~~~v~egwy~el~~~~~~yrlhv~a~d~l~r~l~~e~ii~e   82 (202)
T PF07430_consen    3 QVPFSPKWIKIPDVKEPCLQEVAKFAVEQFKIQYGDSLKFRSVVEGWYFELCPNSLKYRLHVKAIDFLGRSLKYEAIIIE   82 (202)
T ss_pred             CcccCcccccCCcccchHHHHHHHHHHHHHhhhcccceeeeeeeeceeecccccceeEEEeehhhhhhccccceeeeeee
Confidence            44678999999999999999999999999999999999999999999  889999999999999876 478899999999


Q ss_pred             ec--CCCceEEEEeee
Q 033492          102 KP--WEHFKSLTSFKP  115 (118)
Q Consensus       102 ~P--W~~~~~l~sf~~  115 (118)
                      +-  |++.++|.|+-.
T Consensus        83 ~~~~~~~~~kl~s~l~   98 (202)
T PF07430_consen   83 EKPQLTRIRKLASILA   98 (202)
T ss_pred             hhhhhhhhhhhheeeE
Confidence            85  999999988754


No 6  
>TIGR01638 Atha_cystat_rel Arabidopsis thaliana cystatin-related protein. This model represents a family similar in sequence and probably homologous to a large family of cysteine proteinase inhibitors, or cystatins, as described by pfam model pfam00031. Cystatins may help plants resist attack by insects.
Probab=98.72  E-value=7.4e-08  Score=62.55  Aligned_cols=66  Identities=14%  Similarity=0.134  Sum_probs=56.0

Q ss_pred             CCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEeceEEEEEEEEEEeCC---CceeEEEEEEEe
Q 033492           37 DPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSGTNYRLILVVKDGP---STKKFEAVVWEK  102 (118)
Q Consensus        37 ~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG~nY~l~v~~~~~~---~~~~c~~~V~~~  102 (118)
                      ..+.+-+..+++.|+.+||+..+.++.|++|++|..|..+|..|+|++.+.+..   ..+.+++.||..
T Consensus         9 ~T~rd~~~~la~~al~k~N~~~~t~lEfV~vVrAn~~~~~g~~~yITF~Ard~~d~p~~e~~q~~v~~~   77 (92)
T TIGR01638         9 ETNRDLLERLSYVASKKYNDTKFLNLELVEVVRANYRGGAKSKSYITFEARDKPDGPLGEYQQAAVVYL   77 (92)
T ss_pred             cCHHHHHHHHHHHHHHHhhhhcCceEEEEEEEEEEeeccceEEEEEEEEEecCCCCCHHHhhheeeEec
Confidence            346667889999999999999999999999999999999999999999999764   345556666653


No 7  
>PF06907 Latexin:  Latexin;  InterPro: IPR009684 This family consists of several animal specific latexin and proteins related to latexin that belong to MEROPS proteinase inhibitor family I47, clan I- [].  Latexin, a protein possessing inhibitory activity against rat carboxypeptidase A1 (CPA1) and CPA2 (MEROPS peptidase family M14A), is expressed in a neuronal subset in the cerebral cortex and cells in other neural and non-neural tissues of rat [, ]. OCX-32, the 32 kDa eggshell matrix protein, is present at high levels in the uterine fluid during the terminal phase of eggshell formation, and is localised predominantly in the outer eggshell. The timing of OCX-32 secretion into the uterine fluid suggests that it may play a role in the termination of mineral deposition []. OCX-32 protein possesses limited identity (32%) to two unrelated proteins: latexin and to a skin protein that is encoded by a retinoic acid receptor-responsive gene, TIG1. Tazarotene Induced Gene 1 (TIG1) is a putative 228 transmembrane protein with a small N-terminal intracellular region, a single membrane-spanning hydrophobic region, and a large C-terminal extracellular region containing a glycosylation signal. TIG1 is up-regulated by retinoic acid receptor but not by retinoid X receptor-specific synthetic retinoids []. TIG1 may be a tumour suppressor gene whose diminished expression is involved in the malignant progression of prostate cancer [].; PDB: 1WNH_A 2BO9_B.
Probab=96.29  E-value=0.086  Score=39.33  Aligned_cols=66  Identities=18%  Similarity=0.264  Sum_probs=52.8

Q ss_pred             CCCcHHHHHHHHHHHHHHHHhcCCCe---EEEEEEEEEEEEec--eEEEEEEEEEEeCC---CceeEEEEEEEe
Q 033492           37 DPKEKHVMEIGQFAVTEYNKQSKSAL---KFESVEKGETQVVS--GTNYRLILVVKDGP---STKKFEAVVWEK  102 (118)
Q Consensus        37 ~~~d~~v~~~a~~Av~~~n~~~~~~~---~~~~V~~a~~QVVa--G~nY~l~v~~~~~~---~~~~c~~~V~~~  102 (118)
                      +++.-..+++|+-|..-+|-..++.+   .+..|.+|+..+++  |-+|.|.+.+.+-.   ....|.++|+..
T Consensus         3 ~p~h~~a~rAA~va~hy~N~~~GSP~~l~~l~~V~~a~~e~ip~~G~Ky~L~FSte~~~~~e~~g~CsA~V~f~   76 (220)
T PF06907_consen    3 NPSHRPAQRAARVAQHYINYRAGSPSRLFVLQQVQKARAEDIPGEGCKYDLVFSTEEYIEGEHLGNCSAEVFFK   76 (220)
T ss_dssp             -TTSHHHHHHHHHHHHHHHHHH-BTTB-EEEEEEEEEEEEEETTTEEEEEEEEEEEETTT---EEEEEEEEEET
T ss_pred             CCcchHHHHHHHHHHHHhccccCCCceeeehhhhhhhhheeccCCCCEEEEEEEhHHhhcCCceeEeEEEEEec
Confidence            56677788999999999998877765   34789999999985  78999999998753   678999999983


No 8  
>TIGR01572 A_thl_para_3677 Arabidopsis paralogous family TIGR01572. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL.
Probab=90.93  E-value=0.55  Score=36.01  Aligned_cols=60  Identities=20%  Similarity=0.288  Sum_probs=53.1

Q ss_pred             HHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEeceEEEEEEEEEEeCC---CceeEEEEEEEe
Q 033492           43 VMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSGTNYRLILVVKDGP---STKKFEAVVWEK  102 (118)
Q Consensus        43 v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG~nY~l~v~~~~~~---~~~~c~~~V~~~  102 (118)
                      ++-.|+.++.-||-..+.+++|..|.+.-++..+=+.|+|++++-+.+   ..+.|+..|.++
T Consensus        42 vklyAr~GLH~YN~~~GTNlel~~v~K~N~~~~~~~syyITL~A~DP~s~~s~qTFQtrV~e~  104 (265)
T TIGR01572        42 VKIYARVGLHRYNFLEGTNLELDHVDKFNKRMCALSSYYITLLAVDPDSRFLQQTFQVRVDEQ  104 (265)
T ss_pred             HHHHHHhhhhhhhhccCccceehhhhhhccchhhheeeeEEEEEecCCccccceEEEEEEEec
Confidence            577899999999999999999999999999999999999999999875   567788877765


No 9  
>PF07172 GRP:  Glycine rich protein family;  InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=88.70  E-value=0.27  Score=32.09  Aligned_cols=7  Identities=14%  Similarity=0.111  Sum_probs=4.8

Q ss_pred             CcchHHH
Q 033492            1 MNQRFCC    7 (118)
Q Consensus         1 m~~~~~~    7 (118)
                      |++|.++
T Consensus         1 MaSK~~l    7 (95)
T PF07172_consen    1 MASKAFL    7 (95)
T ss_pred             CchhHHH
Confidence            8887644


No 10 
>PF05679 CHGN:  Chondroitin N-acetylgalactosaminyltransferase;  InterPro: IPR008428 This family represents Chondroitin N-acetylgalactosaminyltransferase. Proteins have a type II transmembrane topology. The enzyme is involved in the biosynthetic initiation and elongation of chondroitin sulphate and is the key enzyme responsible for the selective chain assembly of chondroitin/dermatan sulphate on the linkage region tetrasaccharide common to various proteoglycans containing chondroitin/dermatan sulphate or heparin/heparan sulphate chains. ; GO: 0016758 transferase activity, transferring hexosyl groups, 0032580 Golgi cisterna membrane
Probab=78.54  E-value=9.9  Score=31.59  Aligned_cols=50  Identities=24%  Similarity=0.377  Sum_probs=43.2

Q ss_pred             HHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEe--ceEEEEEEEEEEeCC
Q 033492           41 KHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVV--SGTNYRLILVVKDGP   90 (118)
Q Consensus        41 ~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVV--aG~nY~l~v~~~~~~   90 (118)
                      +++.++.+.|+...|++....+.|.+++.+...+-  -|+.|.|++.+....
T Consensus       161 ~dl~~vi~~a~~~ln~~~~~~~~~~~l~~GY~R~dp~rG~~Y~Ldl~l~~~~  212 (499)
T PF05679_consen  161 EDLDDVIEQAMEELNRKSRRVLEFRDLINGYRRFDPTRGMDYILDLLLKYKK  212 (499)
T ss_pred             HHHHHHHHHHHHHHhccccccEEeeeeeeEEEEecCCCCceEEEEEEEeecc
Confidence            57889999999999999888889999999987764  599999999886654


No 11 
>COG3360 Uncharacterized conserved protein [Function unknown]
Probab=76.05  E-value=16  Score=22.54  Aligned_cols=45  Identities=20%  Similarity=0.244  Sum_probs=32.4

Q ss_pred             HHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEece--EEEEEEEEEE
Q 033492           42 HVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSG--TNYRLILVVK   87 (118)
Q Consensus        42 ~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG--~nY~l~v~~~   87 (118)
                      .+.++++-|++.-.. +-+.+.+.+|++.+-+++.|  ..|.++++++
T Consensus        18 S~d~Ai~~Ai~RA~~-t~~~l~wfeV~~~rg~v~~g~v~hyqv~lkVg   64 (71)
T COG3360          18 SIDAAIANAIARAAD-TLDNLDWFEVVETRGHVVDGAVAHYQVTLKVG   64 (71)
T ss_pred             cHHHHHHHHHHHHHh-hhhcceEEEEEeecccEeecceEEEEEEEEEE
Confidence            344556666665433 34667889999999999988  4688888775


No 12 
>PF12276 DUF3617:  Protein of unknown function (DUF3617);  InterPro: IPR022061  This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 179 amino acids in length. There is a single completely conserved residue C that may be functionally important. 
Probab=74.45  E-value=4  Score=28.25  Aligned_cols=34  Identities=24%  Similarity=0.300  Sum_probs=18.9

Q ss_pred             CcchHHHHHHHHHHhhhcccccCCCCcccceeee
Q 033492            1 MNQRFCCLIVLFLSVVPLLAAGDRKGALVGGWKP   34 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~~~~~~~~~~~~~~GG~~~   34 (118)
                      |++..++++++++++....+++......+|-|.-
T Consensus         1 M~~~~~~~~~~~~~~~~~~~~a~~~~~kpGlWe~   34 (162)
T PF12276_consen    1 MKRRLLLALALALLALAAAAAAAAPDIKPGLWEV   34 (162)
T ss_pred             CchHHHHHHHHHHHHhhcccccccCCCCCcccEE
Confidence            6666555444444432333444456667899974


No 13 
>PF07311 Dodecin:  Dodecin;  InterPro: IPR009923 This entry represents proteins with a Dodecin-like topology. Dodecin flavoprotein is a small dodecameric flavin-binding protein from Halobacterium salinarium (Halobacterium halobium) that contains two flavins stacked in a single binding pocket between two tryptophan residues to form an aromatic tetrade []. Dodecin binds riboflavin, although it appears to have a broad specificity for flavins. Lumichrome, a molecule associated with flavin metabolism, appears to be a ligand of dodecin, which could act as a waste-trapping device. ; PDB: 2VYX_L 2DEG_F 2V18_K 2V19_D 2UX9_B 2CZ8_E 2V21_F 2CC8_A 2CCB_A 2VX9_A ....
Probab=72.55  E-value=19  Score=21.90  Aligned_cols=48  Identities=25%  Similarity=0.273  Sum_probs=37.9

Q ss_pred             CcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEece--EEEEEEEEEE
Q 033492           39 KEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSG--TNYRLILVVK   87 (118)
Q Consensus        39 ~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG--~nY~l~v~~~   87 (118)
                      +.....++++.|+.+-++. -.++..++|.+.+-.|..|  ..|+.+++++
T Consensus        12 S~~S~edAv~~Av~~A~kT-l~ni~~~eV~e~~~~v~dg~i~~y~v~lkv~   61 (66)
T PF07311_consen   12 SPKSWEDAVQNAVARASKT-LRNIRWFEVKEQRGHVEDGKITEYQVNLKVS   61 (66)
T ss_dssp             ESSHHHHHHHHHHHHHHHH-SSSEEEEEEEEEEEEEETTCEEEEEEEEEEE
T ss_pred             CCCCHHHHHHHHHHHHhhc-hhCcEEEEEEEEEEEEeCCcEEEEEEEEEEE
Confidence            3456777888888887664 3578889999999999998  6799888875


No 14 
>PF10731 Anophelin:  Thrombin inhibitor from mosquito;  InterPro: IPR018932  Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing. 
Probab=71.85  E-value=4.1  Score=24.47  Aligned_cols=18  Identities=17%  Similarity=0.482  Sum_probs=13.4

Q ss_pred             CcchHHHHHHHHHHhhhc
Q 033492            1 MNQRFCCLIVLFLSVVPL   18 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~~~   18 (118)
                      |++|++++.|+|+.++.+
T Consensus         1 MA~Kl~vialLC~aLva~   18 (65)
T PF10731_consen    1 MASKLIVIALLCVALVAI   18 (65)
T ss_pred             CcchhhHHHHHHHHHHHH
Confidence            888887777777777654


No 15 
>PF05984 Cytomega_UL20A:  Cytomegalovirus UL20A protein;  InterPro: IPR009245 This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein [].
Probab=66.33  E-value=5.9  Score=25.47  Aligned_cols=21  Identities=33%  Similarity=0.425  Sum_probs=13.5

Q ss_pred             CcchHHHHHHHHHHhhhcccc
Q 033492            1 MNQRFCCLIVLFLSVVPLLAA   21 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~~~~~~   21 (118)
                      |++|+++|-|+++.++...|+
T Consensus         1 MaRRlwiLslLAVtLtVALAA   21 (100)
T PF05984_consen    1 MARRLWILSLLAVTLTVALAA   21 (100)
T ss_pred             CchhhHHHHHHHHHHHHHhhc
Confidence            889887766666555544444


No 16 
>PF00666 Cathelicidins:  Cathelicidin;  InterPro: IPR001894 The precursor sequences of a number of antimicrobial peptides secreted by neutrophils (polymorphonuclear leukocytes) upon activation have been found to be evolutionarily related and are collectively known as cathelicidins []. Structurally, these proteins consist of three domains: a signal sequence, a conserved region of about 100 residues that contains four cysteines involved in two disulphide bonds, and a highly divergent C-terminal section of variable size. It is in this C-terminal section that the antibacterial peptides are found; they are proteolytically processed from their precursor by enzymes such as elastase. This structure is shown in the following schematic representation:  +---+--------------------------------+--------------------+ |Sig| Propeptide C C C C | Antibacterial pep. | +---+----------------|--|--|--|------+--------------------+ | | | | +--+ +--+ 'C': conserved cysteine involved in a disulphide bond. ; GO: 0006952 defense response, 0005576 extracellular region; PDB: 1KWI_A 1PFP_A 1LXE_A 1N5P_A 1N5H_A.
Probab=65.71  E-value=7.1  Score=23.89  Aligned_cols=47  Identities=19%  Similarity=0.051  Sum_probs=28.0

Q ss_pred             HHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEe----ceEEEEEEEEEEeC
Q 033492           42 HVMEIGQFAVTEYNKQSKSALKFESVEKGETQVV----SGTNYRLILVVKDG   89 (118)
Q Consensus        42 ~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVV----aG~nY~l~v~~~~~   89 (118)
                      .|++++..|+..||+++.+.. +.+++++.-|--    .++.--+.+.+.++
T Consensus         3 sY~eav~~Av~~yN~~s~~~n-lfRLLe~~p~P~~~~~~~~~~pl~FtIkET   53 (67)
T PF00666_consen    3 SYEEAVLRAVDFYNQGSSGEN-LFRLLELDPPPGWDEDPSTPKPLNFTIKET   53 (67)
T ss_dssp             CCHHHHHHHHHHHHHCS-SSE-EEEEEEE---SSSSSSSSS-EEEEEEEEEE
T ss_pred             CHHHHHHHHHHHHhcCCCccC-ceeeeeccCCCCCCCCcCcceeeEEEEeec
Confidence            367889999999999987653 456676665532    22344555555554


No 17 
>KOG2650 consensus Zinc carboxypeptidase [Function unknown]
Probab=49.77  E-value=47  Score=27.31  Aligned_cols=63  Identities=13%  Similarity=0.096  Sum_probs=45.5

Q ss_pred             ccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEE---EEEE---E----EEEEeceEEEEEEEEEEeCC
Q 033492           28 LVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFE---SVEK---G----ETQVVSGTNYRLILVVKDGP   90 (118)
Q Consensus        28 ~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~---~V~~---a----~~QVVaG~nY~l~v~~~~~~   90 (118)
                      .|=|++.....+-++++++|+.|+..+++..+..|++-   .++-   .    +.+-++|+.|-+++++.++.
T Consensus       317 yPyg~~~~~~~~~~dl~~va~~a~~ai~~~~gt~Y~~G~~~~~~y~asG~S~Dway~~~gi~~~ft~ELrd~g  389 (418)
T KOG2650|consen  317 YPYGYTNDLPEDYEDLQEVARAAADALKSVYGTKYTVGSSADTLYPASGGSDDWAYDVLGIPYAFTFELRDTG  389 (418)
T ss_pred             ecccccCCCCCCHHHHHHHHHHHHHHHHHHhCCEEEeccccceeeccCCchHHHhhhccCCCEEEEEEeccCC
Confidence            36677663334677889999999999999988888763   2221   1    13557899999999998654


No 18 
>PF01456 Mucin:  Mucin-like glycoprotein;  InterPro: IPR000458 This family of trypanosomal proteins resembles vertebrate mucins. The protein consists of three regions. The N and C termini are conserved between all members of the family, whereas the central region is not well conserved and contains a large number of threonine residues which can be glycosylated []. Indirect evidence suggested that these genes might encode the core protein of parasite mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion of, mammalian host cells.
Probab=48.62  E-value=12  Score=25.43  Aligned_cols=22  Identities=36%  Similarity=0.596  Sum_probs=13.4

Q ss_pred             CcchHHH-HHHHHHHhh-hccccc
Q 033492            1 MNQRFCC-LIVLFLSVV-PLLAAG   22 (118)
Q Consensus         1 m~~~~~~-~~~~~~~~~-~~~~~~   22 (118)
                      |--|+|+ ||++.|+|+ +.++..
T Consensus         2 mtcRLLCalLvlaLcCCpsvc~t~   25 (143)
T PF01456_consen    2 MTCRLLCALLVLALCCCPSVCATA   25 (143)
T ss_pred             chHHHHHHHHHHHHHcCcchhccc
Confidence            3447766 777777777 444443


No 19 
>PF03032 Brevenin:  Brevenin/esculentin/gaegurin/rugosin family;  InterPro: IPR004275 In addition to the highly specific cell-mediated immune system, vertebrates possess an efficient host-defence mechanism against invading microorganisms which involves the synthesis of highly potent antimicrobial peptides with a large spectrum of activity. This entry represents a number of these defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins.; GO: 0006952 defense response, 0042742 defense response to bacterium, 0005576 extracellular region
Probab=48.18  E-value=15  Score=20.72  Aligned_cols=18  Identities=22%  Similarity=0.458  Sum_probs=12.5

Q ss_pred             hHHHHHHHHHHhhhcccc
Q 033492            4 RFCCLIVLFLSVVPLLAA   21 (118)
Q Consensus         4 ~~~~~~~~~~~~~~~~~~   21 (118)
                      |..+++|++|.+++++.-
T Consensus         4 KKsllLlfflG~ISlSlC   21 (46)
T PF03032_consen    4 KKSLLLLFFLGTISLSLC   21 (46)
T ss_pred             hHHHHHHHHHHHcccchH
Confidence            445788888888876443


No 20 
>TIGR01601 PYST-C1 Plasmodium yoelii subtelomeric domain PYST-C1. The C-terminal portions of the genes which contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (TIGR01604).
Probab=43.93  E-value=14  Score=23.34  Aligned_cols=16  Identities=25%  Similarity=0.611  Sum_probs=9.0

Q ss_pred             Ccch-HHHHHHHHHHhh
Q 033492            1 MNQR-FCCLIVLFLSVV   16 (118)
Q Consensus         1 m~~~-~~~~~~~~~~~~   16 (118)
                      |++| |+|+|.+++.+.
T Consensus         1 MNkrIfslVcivlY~ll   17 (82)
T TIGR01601         1 MNKRIFSLVCIVLYILL   17 (82)
T ss_pred             CCceEeehhHHHHHHHH
Confidence            8776 455555554443


No 21 
>cd06379 PBP1_iGluR_NMDA_NR1 N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits.  The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor.  When co-expressed with NR1, the NR3 subunits form receptors that are activated by glycine alone and therefore 
Probab=42.57  E-value=37  Score=26.31  Aligned_cols=42  Identities=26%  Similarity=0.181  Sum_probs=27.1

Q ss_pred             HHHHhhhcccccCCCCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhc
Q 033492           11 LFLSVVPLLAAGDRKGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQS   58 (118)
Q Consensus        11 ~~~~~~~~~~~~~~~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~   58 (118)
                      ++|.++++ +++.+..-..|+..+ .    ...++..++|++.+|...
T Consensus         5 ~~~~~~~~-~~~~~~~i~IG~i~~-~----~~~~~~~~~Ai~~~N~~~   46 (377)
T cd06379           5 LFLSLCAR-AGCSPKTVNIGAVLS-N----KKHEQEFKEAVNAANVER   46 (377)
T ss_pred             HHHHHhcc-cCCCCcEEEEeEEec-c----hhHHHHHHHHHHHHhhhh
Confidence            33444344 233345566788887 2    256789999999999854


No 22 
>PRK10386 curli assembly protein CsgE; Provisional
Probab=41.56  E-value=1.2e+02  Score=20.98  Aligned_cols=81  Identities=9%  Similarity=-0.024  Sum_probs=41.0

Q ss_pred             CcchH-HHHHHHHHHhhhcccccCCCCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEeceEE
Q 033492            1 MNQRF-CCLIVLFLSVVPLLAAGDRKGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSGTN   79 (118)
Q Consensus         1 m~~~~-~~~~~~~~~~~~~~~~~~~~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG~n   79 (118)
                      |.+.. ++++++++++++...++ .+..+.|=..+ ...+-. =.+--.+-.+.++    +.+.  ..+.+..++.+|..
T Consensus         1 ~~r~~~~~l~~~~l~~~~~~~a~-~eiEi~GLIiD-~T~Tr~-G~DFY~~Fs~~~~----~~~~--~nltI~E~p~a~~G   71 (130)
T PRK10386          1 MKRYLRWIVAAELLFAAGNLHAA-VEVEVPGLLTD-HTVSSI-GHDFYRAFSDKWE----SDYD--GNLTINERPSARWG   71 (130)
T ss_pred             ChhHHHHHHHHHHHHhCcccccc-ccccccceEec-cccccc-cHhHHHHHHHHHh----hhCC--CcEEEEEEEcCCCC
Confidence            55532 44555555554432232 34455555555 222221 1222223334444    1222  46667888888777


Q ss_pred             EEEEEEEEeCC
Q 033492           80 YRLILVVKDGP   90 (118)
Q Consensus        80 Y~l~v~~~~~~   90 (118)
                      =.|++.+.+.-
T Consensus        72 S~ItV~~n~~v   82 (130)
T PRK10386         72 SWITITVNQDV   82 (130)
T ss_pred             cEEEEEECCEE
Confidence            78999887654


No 23 
>PRK15344 type III secretion system needle protein SsaG; Provisional
Probab=33.84  E-value=67  Score=19.88  Aligned_cols=22  Identities=27%  Similarity=0.345  Sum_probs=18.3

Q ss_pred             CCCCcHHHHHHHHHHHHHHHHh
Q 033492           36 EDPKEKHVMEIGQFAVTEYNKQ   57 (118)
Q Consensus        36 ~~~~d~~v~~~a~~Av~~~n~~   57 (118)
                      .+++||+..--+.|++.+|+.-
T Consensus        28 ~~~~nP~~ml~lQf~i~QyS~~   49 (71)
T PRK15344         28 NDLLNPESMIKAQFALQQYSTF   49 (71)
T ss_pred             CCCCCHHHHHHHHHHHHHHHHH
Confidence            3678999888899999999753


No 24 
>TIGR02105 III_needle type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus.
Probab=33.76  E-value=60  Score=20.00  Aligned_cols=21  Identities=29%  Similarity=0.455  Sum_probs=18.0

Q ss_pred             CCCcHHHHHHHHHHHHHHHHh
Q 033492           37 DPKEKHVMEIGQFAVTEYNKQ   57 (118)
Q Consensus        37 ~~~d~~v~~~a~~Av~~~n~~   57 (118)
                      +|+||+..--..|++.+||.-
T Consensus        30 ~~~nP~~La~~Q~~~~qYs~~   50 (72)
T TIGR02105        30 LPNDPELMAELQFALNQYSAY   50 (72)
T ss_pred             CCCCHHHHHHHHHHHHHHHHH
Confidence            568999988899999999763


No 25 
>PF13028 DUF3889:  Protein of unknown function (DUF3889)
Probab=30.17  E-value=1.6e+02  Score=19.21  Aligned_cols=69  Identities=16%  Similarity=0.073  Sum_probs=48.8

Q ss_pred             CcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEE-EEEEece-EEEEEEEEEEeCCCceeEEEEEEEecCCCce
Q 033492           39 KEKHVMEIGQFAVTEYNKQSKSALKFESVEKG-ETQVVSG-TNYRLILVVKDGPSTKKFEAVVWEKPWEHFK  108 (118)
Q Consensus        39 ~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a-~~QVVaG-~nY~l~v~~~~~~~~~~c~~~V~~~PW~~~~  108 (118)
                      .+|.+.+-.+.|+++.-++= .+..++.-+.. ++|+-.+ +.-.+++.+.++++.--..+.|+-.|-.+..
T Consensus        19 ~~p~yaKWgrlA~~~~k~~Y-p~a~v~DY~~vGr~~~~~~~t~e~Fkl~l~~~~kefgV~v~V~f~p~T~ki   89 (97)
T PF13028_consen   19 AQPSYAKWGRLAVQETKEKY-PGAEVVDYLYVGRTKVNDEQTVEKFKLWLREGGKEFGVFVTVSFNPKTEKI   89 (97)
T ss_pred             CCCcHHHHHHHHHHHHHHHC-CCCEEeeeeeecceecCCcceEEEEEEEEEcCCeEEEEEEEEEEeCCCCcE
Confidence            45888888999999875542 12233343433 4455566 7889999999999888888889888866653


No 26 
>PF06157 DUF973:  Protein of unknown function (DUF973);  InterPro: IPR009321 This family consists of several hypothetical archaeal proteins of unknown function.
Probab=29.35  E-value=82  Score=24.47  Aligned_cols=29  Identities=24%  Similarity=0.526  Sum_probs=20.0

Q ss_pred             EEEeceEEEEEEEEEEeCCCceeEEEEEEEec
Q 033492           72 TQVVSGTNYRLILVVKDGPSTKKFEAVVWEKP  103 (118)
Q Consensus        72 ~QVVaG~nY~l~v~~~~~~~~~~c~~~V~~~P  103 (118)
                      .+.+.|.+|.+++.++++.   ..++.+-.+|
T Consensus       257 ~~l~~g~~Y~i~l~l~ng~---~v~v~~~y~p  285 (285)
T PF06157_consen  257 LNLVPGNTYTITLTLSNGQ---TVDVNVIYQP  285 (285)
T ss_pred             ccCCCCCEEEEEEEEcCCc---EEEEEEEEeC
Confidence            3466788888888888775   5555555554


No 27 
>CHL00132 psaF photosystem I subunit III; Validated
Probab=27.20  E-value=2.1e+02  Score=20.91  Aligned_cols=26  Identities=15%  Similarity=0.138  Sum_probs=17.2

Q ss_pred             ccceeeeCCCCCcHHHHHHHHHHHHHHH
Q 033492           28 LVGGWKPIEDPKEKHVMEIGQFAVTEYN   55 (118)
Q Consensus        28 ~~GG~~~i~~~~d~~v~~~a~~Av~~~n   55 (118)
                      -.+|.+|=  .++|.+++-++.++.+..
T Consensus        25 d~agLtpC--ses~aF~kR~~~~~k~Le   50 (185)
T CHL00132         25 DVAGLTPC--SESPAFQKRLNNSVKKLE   50 (185)
T ss_pred             cccCCccC--ccCHHHHHHHHHHHHHHH
Confidence            45666662  478888887777776643


No 28 
>PF12274 DUF3615:  Protein of unknown function (DUF3615);  InterPro: IPR022059  This domain family is found in bacteria and eukaryotes, and is typically between 86 and 97 amino acids in length. There is a conserved FAE sequence motif. There is a single completely conserved residue F that may be functionally important. 
Probab=27.17  E-value=1.7e+02  Score=18.45  Aligned_cols=56  Identities=16%  Similarity=0.088  Sum_probs=35.0

Q ss_pred             CCCeEEEEEEEEEEEEeceE--EEEEEEEEEeCC------CceeEEEEEEEecCCCceEEEEeee
Q 033492           59 KSALKFESVEKGETQVVSGT--NYRLILVVKDGP------STKKFEAVVWEKPWEHFKSLTSFKP  115 (118)
Q Consensus        59 ~~~~~~~~V~~a~~QVVaG~--nY~l~v~~~~~~------~~~~c~~~V~~~PW~~~~~l~sf~~  115 (118)
                      +..|.+.+++....=.-.|.  =|++.+.+...+      ....+-|++. ..-.....+...-+
T Consensus         9 ~~~yeL~~v~~~~~~~e~~~~~y~HvNF~A~~~~~~~~~~~~~LFFAE~~-~~~~~~~~v~~C~~   72 (96)
T PF12274_consen    9 GLEYELVDVLHSCFIFERGGWNYYHVNFTAKTKGPDSDDGSPTLFFAEVS-NDCKDEDDVSCCCP   72 (96)
T ss_pred             CcCEEEeEEEeeeeeEeCCCcEEEeEEEEEEcCCccCCCCCceEEEEEEe-cCCCCCCEEEEEEE
Confidence            56788888887764444443  368888887654      5677889887 22233344444433


No 29 
>COG4676 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=26.22  E-value=28  Score=26.38  Aligned_cols=33  Identities=18%  Similarity=0.450  Sum_probs=19.2

Q ss_pred             hcccccCCCCcc---cceeeeCCCCCcHHHHHHHHHH
Q 033492           17 PLLAAGDRKGAL---VGGWKPIEDPKEKHVMEIGQFA   50 (118)
Q Consensus        17 ~~~~~~~~~~~~---~GG~~~i~~~~d~~v~~~a~~A   50 (118)
                      ++.|.+.+...+   .+||+. ..-.|..+.+.+++-
T Consensus        19 ~l~A~ae~~v~ld~P~~GWr~-s~g~~~~~~q~v~YP   54 (268)
T COG4676          19 SLVAWAEPEVELDAPLSGWRP-SGGEDASYRQSVNYP   54 (268)
T ss_pred             chhhhcCCcccccCccccccc-CCCccccccccccCC
Confidence            555666555443   689988 665555554444443


No 30 
>PRK09408 ompX outer membrane protein X; Provisional
Probab=26.19  E-value=56  Score=23.36  Aligned_cols=37  Identities=19%  Similarity=0.273  Sum_probs=19.3

Q ss_pred             CcchHHHHHHHHHHhhhcccccCCCCcccceeeeCCCC
Q 033492            1 MNQRFCCLIVLFLSVVPLLAAGDRKGALVGGWKPIEDP   38 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~~~~~~~~~~~~~~GG~~~i~~~   38 (118)
                      |.+..|+.+++++++++..++......+.+|+.. .++
T Consensus         1 mkk~~~~~~~~~~~~~~~~~~~~~~~t~s~GYaq-~~~   37 (171)
T PRK09408          1 MKKIACLSALACVLAVTAGTAVAATSTVTGGYAQ-SDA   37 (171)
T ss_pred             CceEehHHHHHHHHHHhhhhhhcccceEEEEEEE-eec
Confidence            6665555444333332222122234788899887 454


No 31 
>cd06247 M14_CPO Peptidase M14 carboxypeptidase (CP) O (CPO, also known as metallocarboxypeptidase C; EC 3.4.17.) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPO has not been well characterized as yet, and little is known about it. Based on modeling studies, CPO has been suggested to have specificity for acidic residues rather than aliphatic/aromatic residues as in A-like enzymes or basic residues as in B-like enzymes. It remains to be demonstrated that CPO is functional as an MCP.
Probab=26.17  E-value=1e+02  Score=23.96  Aligned_cols=61  Identities=13%  Similarity=0.190  Sum_probs=39.3

Q ss_pred             cceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEE---EEE---E----EEEEeceEEEEEEEEEEeCC
Q 033492           29 VGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFES---VEK---G----ETQVVSGTNYRLILVVKDGP   90 (118)
Q Consensus        29 ~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~---V~~---a----~~QVVaG~nY~l~v~~~~~~   90 (118)
                      |=|++.-..++++++.++++.++..+.+..+..|..-.   ++-   .    +.+ ..|..|-+++++.+..
T Consensus       199 P~g~~~~~~~n~~~~~~~a~~~~~ai~~~~~~~y~~g~~~~~~y~a~G~s~Dwa~-~~~~~~s~t~El~~~g  269 (298)
T cd06247         199 PYGYTKEPSSNHEEMMLVAQKAAAALKEKHGTEYRVGSSALILYSNSGSSRDWAV-DIGIPFSYTFELRDNG  269 (298)
T ss_pred             CCcCCCCCCCCHHHHHHHHHHHHHHHHHhcCCCCccCCcccccccCCCChhhhhh-ccCCCEEEEEEeCCCC
Confidence            33444423457888999999999988887776775421   111   0    112 2588899999997654


No 32 
>PF03823 Neurokinin_B:  Neurokinin B;  InterPro: IPR003635 Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles. This family includes neurokinins, as well as many other peptides. Like other tachykinins, neurokinins are synthesized as larger protein precursors that are enzymatically converted to their mature forms.; GO: 0007217 tachykinin receptor signaling pathway
Probab=26.16  E-value=81  Score=18.66  Aligned_cols=34  Identities=12%  Similarity=0.136  Sum_probs=15.5

Q ss_pred             CcchHHHHH-HHHHHhhhcccccC--CCCccc-ceeee
Q 033492            1 MNQRFCCLI-VLFLSVVPLLAAGD--RKGALV-GGWKP   34 (118)
Q Consensus         1 m~~~~~~~~-~~~~~~~~~~~~~~--~~~~~~-GG~~~   34 (118)
                      |+.-.++++ |++-.+-+..|.+.  ++...+ ||.++
T Consensus         1 MR~~lLf~aiLalsla~s~gavCeesQeQ~~p~gg~sk   38 (59)
T PF03823_consen    1 MRSTLLFAAILALSLARSFGAVCEESQEQVVPGGGHSK   38 (59)
T ss_pred             ChhHHHHHHHHHHHHHHHhhhhhhhhhhccCCCCCccc
Confidence            655444433 33333335555555  233345 45555


No 33 
>PRK15346 outer membrane secretin SsaC; Provisional
Probab=24.77  E-value=95  Score=25.78  Aligned_cols=48  Identities=13%  Similarity=0.056  Sum_probs=23.1

Q ss_pred             CcchHHHHHHHHHHhhhcccccCCCCcccceeeeCCCCCcHHHHHHHHHHH
Q 033492            1 MNQRFCCLIVLFLSVVPLLAAGDRKGALVGGWKPIEDPKEKHVMEIGQFAV   51 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~~~~~~~~~~~~~~GG~~~i~~~~d~~v~~~a~~Av   51 (118)
                      |-++.++.+|++|+.+.-.++ . ...-.|-.-.+ +..|.++.++++..-
T Consensus         1 ~~~~~~~~~~~~~~~~~~~~~-~-~~~w~~~~~~~-~~~~~di~~vl~~~a   48 (499)
T PRK15346          1 MKKLLILIFLFLLNTAKFAAS-K-SIPWQGNPFFI-YSRGMPLAEVLHDLG   48 (499)
T ss_pred             CchhHHHHHHHHHhhhhhhcc-C-CCCCCCCCEEE-EECCCcHHHHHHHHH
Confidence            445555555555555433222 1 11122322332 567888877777433


No 34 
>PF08139 LPAM_1:  Prokaryotic membrane lipoprotein lipid attachment site;  InterPro: IPR012640  In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognises a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached.  This lipid attachment site is found in homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection.
Probab=23.94  E-value=53  Score=16.15  Aligned_cols=12  Identities=0%  Similarity=0.185  Sum_probs=5.1

Q ss_pred             chHHHHHHHHHH
Q 033492            3 QRFCCLIVLFLS   14 (118)
Q Consensus         3 ~~~~~~~~~~~~   14 (118)
                      +|++++++++++
T Consensus         8 Kkil~~l~a~~~   19 (25)
T PF08139_consen    8 KKILFPLLALFM   19 (25)
T ss_pred             HHHHHHHHHHHH
Confidence            444444444333


No 35 
>PF03082 MAGSP:  Male accessory gland secretory protein;  InterPro: IPR004315 The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. The protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. During copulation it is transferred to the female genital tract where it is rapidly altered.; GO: 0007618 mating, 0005576 extracellular region
Probab=23.05  E-value=55  Score=24.74  Aligned_cols=22  Identities=36%  Similarity=0.439  Sum_probs=16.2

Q ss_pred             CcchH-HHHHHHHHHhhhccccc
Q 033492            1 MNQRF-CCLIVLFLSVVPLLAAG   22 (118)
Q Consensus         1 m~~~~-~~~~~~~~~~~~~~~~~   22 (118)
                      |++.. |-.+|+++++|..+.+.
T Consensus         1 MNQILLCS~iLLllfaVAnC~~~   23 (264)
T PF03082_consen    1 MNQILLCSAILLLLFAVANCDGL   23 (264)
T ss_pred             CceeeehHHHHHHHHHHhhcccc
Confidence            88876 44888888887776653


No 36 
>TIGR03431 PhnD phosphonate ABC transporter, periplasmic phosphonate binding protein. Note that this model does not identify all phnD-subfamily genes with evident phosphonate context, but all sequences above the trusted context may be inferred to bind phosphonate compounds even in the absence of such context. Furthermore, there is ample evidence to suggest that many other members of the TIGR01098 subfamily have a different primary function.
Probab=21.98  E-value=3.3e+02  Score=20.07  Aligned_cols=34  Identities=21%  Similarity=0.156  Sum_probs=17.0

Q ss_pred             CcchHHHHHHHHHHhhhcccccCCC-Ccccceeee
Q 033492            1 MNQRFCCLIVLFLSVVPLLAAGDRK-GALVGGWKP   34 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~~~~~~~~~~-~~~~GG~~~   34 (118)
                      |.+|+|+..+..+++-++.+..... ..+.-|..+
T Consensus         1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~vg~~~   35 (288)
T TIGR03431         1 MLRRLILSLVAAFMLISSNAQAEDWPKELNFGIIP   35 (288)
T ss_pred             ChhhHHHHHHHHHHHHhcchhhhcCCCeEEEEEcC
Confidence            7778766444444443333332222 345666665


No 37 
>PF05887 Trypan_PARP:  Procyclic acidic repetitive protein (PARP);  InterPro: IPR008882 This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. The procyclic acidic repetitive protein (parp) genes of T. brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated.; GO: 0016020 membrane; PDB: 2X34_B 2X32_B.
Probab=21.86  E-value=30  Score=24.12  Aligned_cols=13  Identities=38%  Similarity=0.869  Sum_probs=0.0

Q ss_pred             CcchHHHHHHHHH
Q 033492            1 MNQRFCCLIVLFL   13 (118)
Q Consensus         1 m~~~~~~~~~~~~   13 (118)
                      |..|.+|+|.++|
T Consensus         1 m~pr~l~~LavLL   13 (143)
T PF05887_consen    1 MTPRHLCLLAVLL   13 (143)
T ss_dssp             -------------
T ss_pred             Ccccccccccccc
Confidence            7777655433333


No 38 
>PF05438 TRH:  Thyrotropin-releasing hormone (TRH);  InterPro: IPR008857 This family consists of several thyrotropin-releasing hormone (TRH) proteins. Thyrotropin-Releasing Hormone (TRH; pyroGlu-His-Pro-NH2), originally isolated as a hypothalamic neuropeptide hormone, most likely acts also as a neuromodulator and/or neurotransmitter in the central nervous system (CNS). This interpretation is supported by the identification of a peptidase localised on the surface of neuronal cells which has been termed TRH-degrading ectoenzyme (TRH-DE) since it selectively inactivates TRH. TRH has been used clinically for the treatment of spinocerebellar degeneration and disturbance of consciousness in humans.; GO: 0005184 neuropeptide hormone activity, 0009755 hormone-mediated signaling pathway, 0005576 extracellular region
Probab=20.73  E-value=1.8e+02  Score=21.72  Aligned_cols=16  Identities=19%  Similarity=0.187  Sum_probs=8.3

Q ss_pred             HHHHHHHHHhhhcccc
Q 033492            6 CCLIVLFLSVVPLLAA   21 (118)
Q Consensus         6 ~~~~~~~~~~~~~~~~   21 (118)
                      |+++|++|+++...+.
T Consensus         2 wlllll~L~l~~~~v~   17 (212)
T PF05438_consen    2 WLLLLLALTLCNTGVP   17 (212)
T ss_pred             HHHHHHHHHHhhcccc
Confidence            4556666655444333


No 39 
>cd01781 AF6_RA_repeat2 Ubiquitin domain of AF-6, second repeat. The AF-6 protein (also known as afadin and canoe) is a multidomain cell junction protein that contains two N-terminal Ras-associating (RA) domains in addition to FHA (forkhead-associated), DIL (class V myosin homology region), and PDZ domains and a proline-rich region. AF6 acts downstream of the Egfr (Epidermal Growth Factor-receptor)/Ras signalling pathway and provides a link from Egfr to cytoskeletal elements.
Probab=20.33  E-value=1.5e+02  Score=19.52  Aligned_cols=30  Identities=13%  Similarity=0.096  Sum_probs=23.2

Q ss_pred             CcHHHHHHHHHHHHHHHHhcC--CCeEEEEEE
Q 033492           39 KEKHVMEIGQFAVTEYNKQSK--SALKFESVE   68 (118)
Q Consensus        39 ~d~~v~~~a~~Av~~~n~~~~--~~~~~~~V~   68 (118)
                      ++....++...|+.+|+-+..  .+|.+++|+
T Consensus        24 ~~~~a~~vV~eALeKygL~~e~p~~Y~LveV~   55 (100)
T cd01781          24 INDNADRIVGEALEKYGLEKSDPDDYCLVEVS   55 (100)
T ss_pred             CCccHHHHHHHHHHHhCCCccCccceEEEEEe
Confidence            566778899999999987654  367877775


No 40 
>PF07353 Uroplakin_II:  Uroplakin II;  InterPro: IPR009952 This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension.; GO: 0016044 cellular membrane organization, 0030176 integral to endoplasmic reticulum membrane
Probab=20.21  E-value=2.4e+02  Score=20.45  Aligned_cols=16  Identities=31%  Similarity=0.590  Sum_probs=13.9

Q ss_pred             eceEEEEEEEEEEeCC
Q 033492           75 VSGTNYRLILVVKDGP   90 (118)
Q Consensus        75 VaG~nY~l~v~~~~~~   90 (118)
                      ..|++|++...+.++.
T Consensus       110 ~pGTkY~isY~Vtkgt  125 (184)
T PF07353_consen  110 QPGTKYYISYLVTKGT  125 (184)
T ss_pred             CCCcEEEEEEEEecCc
Confidence            4699999999998875


No 41 
>PF15281 Consortin_C:  Consortin C-terminus
Probab=20.21  E-value=76  Score=21.34  Aligned_cols=15  Identities=33%  Similarity=0.388  Sum_probs=9.8

Q ss_pred             HHHHHHHHhhhcccc
Q 033492            7 CLIVLFLSVVPLLAA   21 (118)
Q Consensus         7 ~~~~~~~~~~~~~~~   21 (118)
                      +|+|+++++|.++..
T Consensus        56 ~L~LlclvTv~lS~g   70 (113)
T PF15281_consen   56 LLLLLCLVTVVLSVG   70 (113)
T ss_pred             HHHHHHHHHHHHhcc
Confidence            466777777766555


No 42 
>PF07312 DUF1459:  Protein of unknown function (DUF1459);  InterPro: IPR009924 This family consists of several hypothetical Caenorhabditis elegans proteins of around 85 residues in length. The function of this family is unknown.
Probab=20.12  E-value=93  Score=19.75  Aligned_cols=16  Identities=25%  Similarity=0.366  Sum_probs=9.9

Q ss_pred             CcchHHHHHHHHHHhh
Q 033492            1 MNQRFCCLIVLFLSVV   16 (118)
Q Consensus         1 m~~~~~~~~~~~~~~~   16 (118)
                      |+.|-+.++++++++.
T Consensus         1 MF~Kc~~~l~l~~f~i   16 (84)
T PF07312_consen    1 MFQKCIIVLLLCLFCI   16 (84)
T ss_pred             ChHHHHHHHHHHHHHH
Confidence            7777655555555553


Done!