Query         030834
Match_columns 170
No_of_seqs    103 out of 272
Neff          6.1 
Searched_HMMs 46136
Date          Fri Mar 29 05:16:54 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/030834.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/030834hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF00197 Kunitz_legume:  Trypsi 100.0 2.4E-48 5.2E-53  310.7  15.8  132   31-169     1-176 (176)
  2 cd00178 STI Soybean trypsin in 100.0 6.3E-48 1.4E-52  307.6  14.9  130   31-169     1-172 (172)
  3 smart00452 STI Soybean trypsin 100.0 5.7E-47 1.2E-51  302.1  14.7  130   32-170     1-171 (172)
  4 PF08194 DIM:  DIM protein;  In  69.0     8.1 0.00018   23.4   3.1   16    1-17      1-16  (36)
  5 KOG3858 Ephrin, ligand for Eph  64.4     7.5 0.00016   32.8   3.2   45   35-80    116-165 (233)
  6 PF07172 GRP:  Glycine rich pro  57.4     6.3 0.00014   28.6   1.4    9   34-42     34-42  (95)
  7 PRK10220 hypothetical protein;  55.8      18 0.00038   27.2   3.5   31   30-60     42-72  (111)
  8 PF00879 Defensin_propep:  Defe  53.0      16 0.00035   23.8   2.6   19    1-20      1-19  (52)
  9 TIGR00686 phnA alkylphosphonat  50.5      24 0.00052   26.4   3.5   31   30-60     41-71  (109)
 10 COG2824 PhnA Uncharacterized Z  45.1      30 0.00065   26.0   3.3   32   29-60     42-73  (112)
 11 PF14009 DUF4228:  Domain of un  44.3      15 0.00033   27.9   1.8   20   35-54     64-83  (181)
 12 PF05474 Semenogelin:  Semenoge  43.8      12 0.00027   34.9   1.3   15    1-15      1-15  (582)
 13 PF03831 PhnA:  PhnA protein;    43.6      19 0.00042   23.8   1.9   29   32-60      2-30  (56)
 14 PF09466 Yqai:  Hypothetical pr  43.2      19 0.00041   25.0   1.9   21   30-50     24-44  (71)
 15 PF15284 PAGK:  Phage-encoded v  42.5      13 0.00029   25.0   1.0   16    1-16      1-18  (61)
 16 PF00812 Ephrin:  Ephrin;  Inte  40.7      20 0.00043   28.0   1.9   24   36-59    101-124 (145)
 17 COG5341 Uncharacterized protei  39.3 1.2E+02  0.0026   23.4   5.8   11   42-52     46-56  (132)
 18 TIGR02588 conserved hypothetic  35.7      66  0.0014   24.6   4.0   30   29-58     34-63  (122)
 19 PF10657 RC-P840_PscD:  Photosy  35.2      34 0.00074   26.4   2.3   26   36-61     24-49  (144)
 20 PF10731 Anophelin:  Thrombin i  33.1      52  0.0011   22.2   2.7   41    1-43      1-46  (65)
 21 PF02402 Lysis_col:  Lysis prot  32.6      42 0.00091   21.3   2.1   17   28-44     20-36  (46)
 22 PF09888 DUF2115:  Uncharacteri  31.2      72  0.0016   25.3   3.7   30  101-133   113-142 (163)
 23 PF10813 DUF2733:  Protein of u  30.9      25 0.00053   20.8   0.8   19   28-46     11-29  (32)
 24 COG3045 CreA Uncharacterized p  30.9      91   0.002   24.9   4.1   24   31-54     42-65  (165)
 25 COG5510 Predicted small secret  30.2      46   0.001   21.0   1.9   16    1-16      2-17  (44)
 26 PRK01022 hypothetical protein;  27.5      87  0.0019   25.0   3.6   30  101-133   115-144 (167)
 27 PRK10159 outer membrane phosph  27.2      48   0.001   28.9   2.3   15   30-44     22-36  (351)
 28 PF11153 DUF2931:  Protein of u  26.0      70  0.0015   25.9   2.9   17    1-17      1-17  (216)
 29 PRK11289 ampC beta-lactamase/D  25.8      46 0.00099   29.5   1.9   15    1-15      2-16  (384)
 30 PF05550 Peptidase_C53:  Pestiv  24.7      56  0.0012   25.9   2.0   16   28-43     20-35  (168)
 31 PF11777 DUF3316:  Protein of u  23.0      68  0.0015   23.6   2.1   14    1-15      1-14  (114)
 32 PF02950 Conotoxin:  Conotoxin;  21.5      31 0.00067   23.0   0.0    7    1-7       1-7   (75)
 33 PF11355 DUF3157:  Protein of u  20.5 1.1E+02  0.0023   25.4   2.9   22   30-51     22-46  (199)
 34 KOG3352 Cytochrome c oxidase,   20.3 1.2E+02  0.0026   24.0   3.1   25   96-124   109-133 (153)
 35 PF12071 DUF3551:  Protein of u  20.2 1.3E+02  0.0028   21.2   3.0    7    1-7       1-7   (82)

No 1  
>PF00197 Kunitz_legume:  Trypsin and protease inhibitor;  InterPro: IPR002160 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.  The Kunitz-type soybean trypsin inhibitor (STI) family consists mainly of proteinase inhibitors from Leguminosae seeds []. They belong to MEROPS inhibitor family I3, clan IC. They exhibit proteinase inhibitory activity against serine proteinases; trypsin (MEROPS peptidase family S1, IPR001254 from INTERPRO) and subtilisin (MEROPS peptidase family S8, IPR000209 from INTERPRO), thiol proteinases (MEROPS peptidase family C1, IPR000668 from INTERPRO) and aspartic proteinases (MEROPS peptidase family A1, IPR001461 from INTERPRO) [].  Inhibitors from cereals are active against subtilisin and endogenous alpha-amylases, while some also inhibit tissue plasminogen activator. The inhibitors are usually specific for either trypsin or chymotrypsin, and some are effective against both. They are thought to protect the seeds against consumption by animal predators, while at the same time existing as seed storage proteins themselves - all the actively inhibitory members contain 2 disulphide bridges. 
The existence of a member with no inhibitory activity, winged bean albumin 1, suggests that the inhibitors may have evolved from seed storage proteins. Proteins from the Kunitz family contain from 170 to 200 amino acid residues and one or two intra-chain disulphide bonds. The best conserved region is found in their N-terminal section. The crystal structures of soybean trypsin inhibitor (STI), trypsin inhibitor DE-3 from the Kaffir tree Erythrina caffra (ETI) [] and the bifunctional proteinase K/alpha-amylase inhibitor from wheat (PK13) have been solved, showing them to share the same 12-stranded beta-sheet structure as those of interleukin-1 and heparin-binding growth factors []. The beta-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel beta-barrel. Despite the structural similarity, STI shows no interleukin-1 bioactivity, presumably as a result of their primary sequence disparities. The active inhibitory site containing the scissile bond is located in the loop between beta-strands 4 and 5 in STI and ETI. The STIs belong to a superfamily that also contains the interleukin-1 proteins, heparin binding growth factors (HBGF) and histactophilin, all of which have very similar structures, but share no sequence similarity with the STI family.; GO: 0004866 endopeptidase inhibitor activity; PDB: 3TC2_B 3S8J_A 3S8K_A 1TIE_A 2GZB_A 3E8L_C 2IWT_B 3BX1_C 1AVA_D 3IIR_A ....
Probab=100.00  E-value=2.4e-48  Score=310.71  Aligned_cols=132  Identities=56%  Similarity=0.997  Sum_probs=118.6

Q ss_pred             ceecCCCCccccCCeEEEEecccCCCCcEEEeecCCCCCCCCceEecCCC------------------------------
Q 030834           31 PVLDIAGKQLRAGSKYYILPVTKGRGGGLTLAGRSNNKTCPLDVVQEQHS------------------------------   80 (170)
Q Consensus        31 ~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~l~~t~n~~~CPl~VvQ~~~~------------------------------   80 (170)
                      ||+|+|||||++|.+|||+|++++.|||+++++++| ++||++|+|++++                              
T Consensus         1 pVlD~~G~~l~~g~~YyI~p~~~~~GGGl~l~~~~n-~~CPl~Vvq~~~~~~~GlPv~Fs~~~~~~~~~~ir~st~l~I~   79 (176)
T PF00197_consen    1 PVLDTDGNPLRNGGEYYILPAIRGAGGGLTLAKTGN-ETCPLDVVQSPSELSRGLPVKFSPPYRNSFDTVIRESTDLNIE   79 (176)
T ss_dssp             B-BETTSCB-BTTSEEEEEESSTGCSEEEEEECCTT-SSSSEEEEEESSTTS-BSEEEEEESSSSSSTBCTBTTSEEEEE
T ss_pred             CcCCCCCCCCcCCCCEEEEeCccCCCCeeEecCCCC-CCCChheEEccCCCCCceeEEEEeCCcccCCCeeEcceEEEEE
Confidence            799999999999999999999999999999999999 9999999999887                              


Q ss_pred             --------CCccEEEeecCCCcccEEEEeCCCCCCCCCCCCcccEEEEEeCCc----ceEEeccCcccccccceeeeEEE
Q 030834           81 --------FRNVWKLDNFDAILGQWFVKTGGIEGNLGPQTTINWFRIEKFYGD----YKLVFCPLVCKFCKVLCIDVGIF  148 (170)
Q Consensus        81 --------~s~~W~v~~~d~~~~~~~V~~gg~~g~pg~~t~~~~FkIek~~~~----YKLvfCp~~~~~~~~~C~dvGi~  148 (170)
                              .+++|+|++.++++++ +|+|||.+|   .++..|||||||++.+    |||+|||+.|+  ...|+||||+
T Consensus        80 F~~~~~c~~~~~W~V~~~~~~~~~-~V~~gg~~~---~~~~~~~FkIek~~~~~~~~YKLvfCp~~~~--~~~C~dvGi~  153 (176)
T PF00197_consen   80 FSSPTSCACSTVWKVVKDDPETGQ-FVKTGGVKG---PETVDSWFKIEKYEDGFNNAYKLVFCPSVCC--DSLCGDVGIY  153 (176)
T ss_dssp             ESSECTTSSSSBEEEEEETTTTEE-EEEEESSSS---SGCGCCEEEEEEESSSSTTEEEEEEESSSSS--TSSEEEEEEE
T ss_pred             EccCCCCCccCEEEEeecCcccce-EEEeCCccc---CCccCcEEEEEEeCCCCCCcEEEEECCCccc--cCccceeeEE
Confidence                    6789999987776555 899999987   6788999999999985    99999999764  8999999999


Q ss_pred             Ee-CCeEEEEeeC-CcEEEEEec
Q 030834          149 VN-GGVWHLALSD-VTFNVTFLN  169 (170)
Q Consensus       149 ~~-~g~rrL~ls~-~p~~V~F~k  169 (170)
                      +| +|+|||||++ +||.|+|||
T Consensus       154 ~d~~g~rrL~l~~~~p~~V~F~K  176 (176)
T PF00197_consen  154 FDDNGNRRLALSDDNPFVVVFQK  176 (176)
T ss_dssp             EETTSEEEEEEESSSB-EEEEEE
T ss_pred             EcCCCeEEEEECCCCcEEEEEEC
Confidence            95 8999999998 599999998


No 2  
>cd00178 STI Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors.
Probab=100.00  E-value=6.3e-48  Score=307.60  Aligned_cols=130  Identities=52%  Similarity=0.932  Sum_probs=119.8

Q ss_pred             ceecCCCCccccCCeEEEEecccCCCCcEEEeecCCCCCCCCceEecCCC------------------------------
Q 030834           31 PVLDIAGKQLRAGSKYYILPVTKGRGGGLTLAGRSNNKTCPLDVVQEQHS------------------------------   80 (170)
Q Consensus        31 ~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~l~~t~n~~~CPl~VvQ~~~~------------------------------   80 (170)
                      ||+|++||||++|.+|||+|+++|.||||++++++| ++||++|+|++++                              
T Consensus         1 ~VlD~~G~~l~~g~~YyI~p~~~g~GGGl~l~~~~~-~~CPl~VvQ~~~~~~~GlPv~Fs~~~~~~~~I~e~t~lnI~F~   79 (172)
T cd00178           1 PVLDTDGNPLRNGGRYYILPAIRGGGGGLTLAATGN-ETCPLTVVQSPSELDRGLPVKFSPPNPKSDVIRESTDLNIEFD   79 (172)
T ss_pred             CcCcCCCCCCcCCCeEEEEEceeCCCCcEEEcCCCC-CCCCCeeEECCCCCCCCeeEEEEeCCCCCCEEECCCcEEEEeC
Confidence            699999999999999999999999899999999999 9999999999986                              


Q ss_pred             -------CCccEEEeecCCCcccEEEEeCCCCCCCCCCCCcccEEEEEeCC---cceEEeccCcccccccceeeeEEEEe
Q 030834           81 -------FRNVWKLDNFDAILGQWFVKTGGIEGNLGPQTTINWFRIEKFYG---DYKLVFCPLVCKFCKVLCIDVGIFVN  150 (170)
Q Consensus        81 -------~s~~W~v~~~d~~~~~~~V~~gg~~g~pg~~t~~~~FkIek~~~---~YKLvfCp~~~~~~~~~C~dvGi~~~  150 (170)
                             +|++|+|+++|+ .++|+|+|||.+++    +.+|||||||+++   .|||+|||+.|   +..|+||||+.+
T Consensus        80 ~~~~c~~~st~W~V~~~~~-~~~~~V~~Gg~~~~----~~~~~FkIek~~~~~~~YKL~~Cp~~~---~~~C~~VGi~~d  151 (172)
T cd00178          80 APTWCCGSSTVWKVDRDST-PEGLFVTTGGVKGN----TLNSWFKIEKVSEGLNAYKLVFCPSSC---DSKCGDVGIFID  151 (172)
T ss_pred             CCCcCCCCCCEEEEeccCC-ccCeEEEeCCcCCC----cccceEEEEECCCCCCcEEEEEcCCCC---CCceeecccEEC
Confidence                   689999987665 78999999999875    6899999999987   69999999875   679999999995


Q ss_pred             -CCeEEEEeeC-CcEEEEEec
Q 030834          151 -GGVWHLALSD-VTFNVTFLN  169 (170)
Q Consensus       151 -~g~rrL~ls~-~p~~V~F~k  169 (170)
                       +|.|||+|++ +||.|+|+|
T Consensus       152 ~~g~rrL~l~~~~p~~V~F~k  172 (172)
T cd00178         152 PEGVRRLVLSDDNPLVVVFKK  172 (172)
T ss_pred             CCCcEEEEEcCCCCeEEEEeC
Confidence             8999999998 599999997


No 3  
>smart00452 STI Soybean trypsin inhibitor (Kunitz) family of protease inhibitors.
Probab=100.00  E-value=5.7e-47  Score=302.14  Aligned_cols=130  Identities=47%  Similarity=0.835  Sum_probs=116.9

Q ss_pred             eecCCCCccccCCeEEEEecccCCCCcEEEeecCCCCCCCCceEecCCC-------------------------------
Q 030834           32 VLDIAGKQLRAGSKYYILPVTKGRGGGLTLAGRSNNKTCPLDVVQEQHS-------------------------------   80 (170)
Q Consensus        32 VlD~~G~~l~~g~~YYI~p~~~~~gGGl~l~~t~n~~~CPl~VvQ~~~~-------------------------------   80 (170)
                      |+|++||||++|++|||+|++++.||||++++++| ++||++|+|++++                               
T Consensus         1 VlDt~G~~l~~G~~YyI~p~~~g~GGGl~l~~~~n-~~CPl~VvQ~~~~~~~GlPV~Fs~~~~~~~ii~e~t~lnI~F~~   79 (172)
T smart00452        1 VLDTDGNPLRNGGTYYILPAIRGHGGGLTLAATGN-EICPLTVVQSPNEVDNGLPVKFSPPNPSDFIIRESTDLNIEFDA   79 (172)
T ss_pred             CCCCCCCCCcCCCcEEEEEccccCCCCEEEccCCC-CCCCCeeEECCCCCCCceeEEEeecCCCCCEEecCceEEEEeCC
Confidence            79999999999999999999999889999999999 9999999999876                               


Q ss_pred             -----CCccEEEeecCCCcccEEEEeCCCCCCCCCCCCcccEEEEEeCC---cceEEeccCcccccccceeeeEEEEe-C
Q 030834           81 -----FRNVWKLDNFDAILGQWFVKTGGIEGNLGPQTTINWFRIEKFYG---DYKLVFCPLVCKFCKVLCIDVGIFVN-G  151 (170)
Q Consensus        81 -----~s~~W~v~~~d~~~~~~~V~~gg~~g~pg~~t~~~~FkIek~~~---~YKLvfCp~~~~~~~~~C~dvGi~~~-~  151 (170)
                           +|++|+|++ |++.++|+|+|||   +|+..  .|||||||+++   .|||+|||+.|+  ...|.||||+++ +
T Consensus        80 ~~~C~~st~W~V~~-~~~~~~~~V~~gg---~~~~~--~~~FkIek~~~~~~~YKLv~Cp~~~~--~~~C~~vGi~~d~~  151 (172)
T smart00452       80 PPLCAQSTVWTVDE-DSAPEGLAVKTGG---YPGVR--DSWFKIEKYSGESNGYKLVYCPNGSD--DDKCGDVGIFIDPE  151 (172)
T ss_pred             CCCCCCCCEEEEec-CCccccEEEEeCC---cCCCC--CCeEEEEECCCCCCCEEEEEcCCCCC--CCccCccCeEECCC
Confidence                 789999975 6678899999998   44443  69999999987   699999998774  678999999995 8


Q ss_pred             CeEEEEeeCC-cEEEEEecC
Q 030834          152 GVWHLALSDV-TFNVTFLNG  170 (170)
Q Consensus       152 g~rrL~ls~~-p~~V~F~k~  170 (170)
                      |+|||||+++ ||.|+|+|.
T Consensus       152 g~rrL~ls~~~p~~v~F~k~  171 (172)
T smart00452      152 GGRRLVLSNENPLVVVFKKA  171 (172)
T ss_pred             CcEEEEEcCCCCeEEEEEEC
Confidence            9999999975 999999983


No 4  
>PF08194 DIM:  DIM protein;  InterPro: IPR013172 Drosophila immune-induced molecules (DIMs) are short proteins induced during the immune response of Drosophila []. This entry includes DIMs 1 to 4 and DIM23.
Probab=68.99  E-value=8.1  Score=23.40  Aligned_cols=16  Identities=25%  Similarity=0.366  Sum_probs=9.0

Q ss_pred             CCchhhHHHHHHHHHHH
Q 030834            1 MRSTLVLTPLILLFAFI   17 (170)
Q Consensus         1 MK~~~~l~l~fll~a~~   17 (170)
                      ||.+.+ .+.|+++|+.
T Consensus         1 Mk~l~~-a~~l~lLal~   16 (36)
T PF08194_consen    1 MKCLSL-AFALLLLALA   16 (36)
T ss_pred             CceeHH-HHHHHHHHHH
Confidence            998873 3334444443


No 5  
>KOG3858 consensus Ephrin, ligand for Eph receptor tyrosine kinase [Signal transduction mechanisms]
Probab=64.38  E-value=7.5  Score=32.80  Aligned_cols=45  Identities=24%  Similarity=0.397  Sum_probs=30.5

Q ss_pred             CCCCccccCCeEEEEecccCCCCcEE-----EeecCCCCCCCCceEecCCC
Q 030834           35 IAGKQLRAGSKYYILPVTKGRGGGLT-----LAGRSNNKTCPLDVVQEQHS   80 (170)
Q Consensus        35 ~~G~~l~~g~~YYI~p~~~~~gGGl~-----l~~t~n~~~CPl~VvQ~~~~   80 (170)
                      ..|.+-++|.+||+++...|.-.|+-     +-.+.+ ..+-..|.|++..
T Consensus       116 p~G~EF~pG~~YY~IStStg~~~g~~~~~ggvc~~~~-mk~~~~V~~~~~~  165 (233)
T KOG3858|consen  116 PLGFEFQPGHTYYYISTSTGDAEGLCNLRGGVCVTRN-MKLLMKVGQSPRS  165 (233)
T ss_pred             CCCccccCCCeEEEEeCCCccccccchhhCCEeccCC-ceEEEEecccCCC
Confidence            36999999999999998766433332     123334 5677778887653


No 6  
>PF07172 GRP:  Glycine rich protein family;  InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=57.43  E-value=6.3  Score=28.62  Aligned_cols=9  Identities=0%  Similarity=0.176  Sum_probs=3.7

Q ss_pred             cCCCCcccc
Q 030834           34 DIAGKQLRA   42 (170)
Q Consensus        34 D~~G~~l~~   42 (170)
                      ..+-++|+.
T Consensus        34 ~~~~~~v~~   42 (95)
T PF07172_consen   34 EEEENEVQD   42 (95)
T ss_pred             cccCCCCCc
Confidence            333344443


No 7  
>PRK10220 hypothetical protein; Provisional
Probab=55.78  E-value=18  Score=27.22  Aligned_cols=31  Identities=26%  Similarity=0.214  Sum_probs=22.5

Q ss_pred             CceecCCCCccccCCeEEEEecccCCCCcEE
Q 030834           30 DPVLDIAGKQLRAGSKYYILPVTKGRGGGLT   60 (170)
Q Consensus        30 ~~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~   60 (170)
                      ..|.|.+|++|..|..--++=...=.|.+.+
T Consensus        42 ~~vkDsnG~~L~dGDsV~viKDLkVKGss~~   72 (111)
T PRK10220         42 LIVKDANGNLLADGDSVTIVKDLKVKGSSSM   72 (111)
T ss_pred             ceEEcCCCCCccCCCEEEEEeeccccccccc
Confidence            4689999999999998777655433444444


No 8  
>PF00879 Defensin_propep:  Defensin propeptide The pattern for this Prosite entry doesn't match the propeptide.;  InterPro: IPR002366 Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses [], containing three pairs of intramolecular disulphide bonds []. On the basis of their size and pattern of disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Alpha-defensins, which have been identified in humans, monkeys and several rodent species, are particularly abundant in neutrophils, certain macrophage populations and Paneth cells of the small intestine. Every mammalian species explored thus far has beta-defensins. In cows, as many as 13 beta-defensins exist in neutrophils. However, in other species, beta-defensins are more often produced by epithelial cells lining various organs (e.g. the epidermis, bronchial tree and genitourinary tract). Theta-defensins are cyclic and have so far only been identified in primate phagocytes.   Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic) regions interact with anionic phospholipid head groups and water. 
Subsequently, some defensins can aggregate to form `channel-like' pores; others might bind to and cover the microbial membrane in a `carpet-like' manner. The net outcome is the disruption of membrane integrity and function, which ultimately leads to the lysis of microorganisms. Some defensins are synthesized as propeptides which may be relevant to this process - in neutrophils only the mature peptides have been identified but in Paneth cells, the propeptide is stored in vesicles [] and appears to be cleaved by trypsin on activation.  ; GO: 0006952 defense response
Probab=52.96  E-value=16  Score=23.83  Aligned_cols=19  Identities=37%  Similarity=0.574  Sum_probs=13.2

Q ss_pred             CCchhhHHHHHHHHHHHhCC
Q 030834            1 MRSTLVLTPLILLFAFIATP   20 (170)
Q Consensus         1 MK~~~~l~l~fll~a~~t~~   20 (170)
                      ||++.+|.. +||+||-++.
T Consensus         1 MRTL~LLaA-lLLlAlqaQA   19 (52)
T PF00879_consen    1 MRTLALLAA-LLLLALQAQA   19 (52)
T ss_pred             CcHHHHHHH-HHHHHHHHhc
Confidence            898875544 7888876544


No 9  
>TIGR00686 phnA alkylphosphonate utilization operon protein PhnA. The protein family includes an uncharacterized member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterized phosphonoacetate hydrolase designated PhnA by Kulakova, et al. (2001, 1997).
Probab=50.54  E-value=24  Score=26.45  Aligned_cols=31  Identities=26%  Similarity=0.293  Sum_probs=22.7

Q ss_pred             CceecCCCCccccCCeEEEEecccCCCCcEE
Q 030834           30 DPVLDIAGKQLRAGSKYYILPVTKGRGGGLT   60 (170)
Q Consensus        30 ~~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~   60 (170)
                      ..|.|.+|++|..|-+=-|+=...=.|.+.+
T Consensus        41 ~~~kDsnG~~L~dGDsV~liKDLkVKGss~~   71 (109)
T TIGR00686        41 LIVKDCNGNLLANGDSVILIKDLKVKGSSLV   71 (109)
T ss_pred             ceEEcCCCCCccCCCEEEEEeeccccCcccc
Confidence            4689999999999998877755433444444


No 10 
>COG2824 PhnA Uncharacterized Zn-ribbon-containing protein involved in phosphonate metabolism [Inorganic ion transport and metabolism]
Probab=45.13  E-value=30  Score=25.95  Aligned_cols=32  Identities=22%  Similarity=0.204  Sum_probs=22.7

Q ss_pred             CCceecCCCCccccCCeEEEEecccCCCCcEE
Q 030834           29 PDPVLDIAGKQLRAGSKYYILPVTKGRGGGLT   60 (170)
Q Consensus        29 ~~~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~   60 (170)
                      ...|.|.+||.|..|..--|+-...-.|.+.+
T Consensus        42 ~~~v~DsnGn~L~dGDsV~lIKDLkVKGss~~   73 (112)
T COG2824          42 ALIVKDSNGNLLADGDSVTLIKDLKVKGSSKV   73 (112)
T ss_pred             ceEEEcCCCcEeccCCeEEEEEeeeecCCcce
Confidence            35899999999999998877654433333333


No 11 
>PF14009 DUF4228:  Domain of unknown function (DUF4228)
Probab=44.33  E-value=15  Score=27.88  Aligned_cols=20  Identities=25%  Similarity=0.665  Sum_probs=16.6

Q ss_pred             CCCCccccCCeEEEEecccC
Q 030834           35 IAGKQLRAGSKYYILPVTKG   54 (170)
Q Consensus        35 ~~G~~l~~g~~YYI~p~~~~   54 (170)
                      .-.++|++|.-||++|..+-
T Consensus        64 ~~d~~L~~G~~Y~llP~~~~   83 (181)
T PF14009_consen   64 PPDEELQPGQIYFLLPMSRL   83 (181)
T ss_pred             CccCeecCCCEEEEEEcccc
Confidence            45678999999999998653


No 12 
>PF05474 Semenogelin:  Semenogelin;  InterPro: IPR008836 This family consists of several mammalian semenogelin (I and II) proteins. Freshly ejaculated Homo sapiens semen has the appearance of a loose gel in which the predominant structural protein components are the seminal vesicle secreted semenogelins (Sg) [].; GO: 0005198 structural molecule activity, 0019953 sexual reproduction, 0005576 extracellular region, 0030141 stored secretory granule
Probab=43.83  E-value=12  Score=34.89  Aligned_cols=15  Identities=27%  Similarity=0.439  Sum_probs=12.2

Q ss_pred             CCchhhHHHHHHHHH
Q 030834            1 MRSTLVLTPLILLFA   15 (170)
Q Consensus         1 MK~~~~l~l~fll~a   15 (170)
                      ||+++++.||+||++
T Consensus         1 MK~~I~F~lSLLLiL   15 (582)
T PF05474_consen    1 MKSIIFFVLSLLLIL   15 (582)
T ss_pred             CCceeehHHHHHHHH
Confidence            999877778888776


No 13 
>PF03831 PhnA:  PhnA protein;  InterPro: IPR013988 The PhnA protein family includes the uncharacterised Escherichia coli protein PhnA and its homologues. The E. coli phnA gene is part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage []. The protein is not related to the characterised phosphonoacetate hydrolase designated PhnA []. This entry represents the C-terminal domain of PhnA.; PDB: 2AKK_A 2AKL_A.
Probab=43.63  E-value=19  Score=23.85  Aligned_cols=29  Identities=28%  Similarity=0.434  Sum_probs=15.2

Q ss_pred             eecCCCCccccCCeEEEEecccCCCCcEE
Q 030834           32 VLDIAGKQLRAGSKYYILPVTKGRGGGLT   60 (170)
Q Consensus        32 VlD~~G~~l~~g~~YYI~p~~~~~gGGl~   60 (170)
                      |.|.+|++|..|-+--++--..=.|.+.+
T Consensus         2 v~DsnGn~L~dGDsV~~iKDLkVKG~s~~   30 (56)
T PF03831_consen    2 VKDSNGNELQDGDSVTLIKDLKVKGSSFT   30 (56)
T ss_dssp             -B-TTS-B--TTEEEEESS-EEETTTTEE
T ss_pred             eEcCCCCCccCCCEEEEEeeeeeccCCcc
Confidence            68999999999988776544332344444


No 14 
>PF09466 Yqai:  Hypothetical protein Yqai;  InterPro: IPR018474 The hypothetical protein YqaI is expressed in bacteria, particularly Bacillus subtilis. It forms a homo-dimer, with each monomer containing an alpha helix and four beta strands.; PDB: 2DSM_B.
Probab=43.21  E-value=19  Score=25.01  Aligned_cols=21  Identities=33%  Similarity=0.801  Sum_probs=12.2

Q ss_pred             CceecCCCCccccCCeEEEEe
Q 030834           30 DPVLDIAGKQLRAGSKYYILP   50 (170)
Q Consensus        30 ~~VlD~~G~~l~~g~~YYI~p   50 (170)
                      -|+.|.-|+++.+|.+|+|.|
T Consensus        24 ~~i~D~yG~EI~~~D~y~i~~   44 (71)
T PF09466_consen   24 HPIEDFYGDEIFPGDDYFISP   44 (71)
T ss_dssp             -B---TTSS-B-TTS-EEE-E
T ss_pred             cceeeeeccccccCCeEEEeC
Confidence            467789999999999999976


No 15 
>PF15284 PAGK:  Phage-encoded virulence factor
Probab=42.52  E-value=13  Score=25.00  Aligned_cols=16  Identities=25%  Similarity=0.254  Sum_probs=8.0

Q ss_pred             CCchh--hHHHHHHHHHH
Q 030834            1 MRSTL--VLTPLILLFAF   16 (170)
Q Consensus         1 MK~~~--~l~l~fll~a~   16 (170)
                      ||...  +|.|.|.|.|.
T Consensus         1 Mkk~ksifL~l~~~LsA~   18 (61)
T PF15284_consen    1 MKKFKSIFLALVFILSAA   18 (61)
T ss_pred             ChHHHHHHHHHHHHHHHh
Confidence            77332  35555555553


No 16 
>PF00812 Ephrin:  Ephrin;  InterPro: IPR001799 Ephrins are a family of proteins [] that are ligands of class V (EPH-related) receptor protein-tyrosine kinases (see IPR001426 from INTERPRO). These receptors and their ligands have been implicated in regulating neuronal axon guidance and in patterning of the developing nervous system and may also serve a patterning and compartmentalisation role outside of the nervous system as well. Ephrins are membrane-attached proteins of 205 to 340 residues. Attachment appears to be crucial for their normal function. Type-A ephrins are linked to the membrane via a glycosylphosphatidylinositol (GPI)-linkage, while type-B ephrins are type-I membrane proteins.; GO: 0016020 membrane; PDB: 3HEI_P 3CZU_B 3MBW_B 1KGY_E 1IKO_P 2WO3_B 2I85_A 2VSK_B 3GXU_B 2VSM_B ....
Probab=40.72  E-value=20  Score=28.02  Aligned_cols=24  Identities=29%  Similarity=0.703  Sum_probs=18.2

Q ss_pred             CCCccccCCeEEEEecccCCCCcE
Q 030834           36 AGKQLRAGSKYYILPVTKGRGGGL   59 (170)
Q Consensus        36 ~G~~l~~g~~YYI~p~~~~~gGGl   59 (170)
                      .|-+-++|.+||++....|.-+|+
T Consensus       101 ~G~EF~pG~~YY~ISts~g~~~g~  124 (145)
T PF00812_consen  101 LGLEFQPGHDYYYISTSTGTQEGL  124 (145)
T ss_dssp             TSSS--TTEEEEEEEEESSSSTTT
T ss_pred             CCeeecCCCeEEEEEccCCCCCCc
Confidence            789999999999999877765554


No 17 
>COG5341 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=39.33  E-value=1.2e+02  Score=23.40  Aligned_cols=11  Identities=27%  Similarity=0.495  Sum_probs=6.9

Q ss_pred             cCCeEEEEecc
Q 030834           42 AGSKYYILPVT   52 (170)
Q Consensus        42 ~g~~YYI~p~~   52 (170)
                      .|..||..|..
T Consensus        46 ~Gk~~r~i~l~   56 (132)
T COG5341          46 DGKVIRTIPLT   56 (132)
T ss_pred             CCEEEEEEEcc
Confidence            45667777765


No 18 
>TIGR02588 conserved hypothetical protein TIGR02588. The function of this protein is unknown. It is always found as part of a two-gene operon with TIGR02587, a protein that appears to span the membrane seven times. It is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus, so far, all of which are bacterial.
Probab=35.65  E-value=66  Score=24.56  Aligned_cols=30  Identities=17%  Similarity=0.151  Sum_probs=21.4

Q ss_pred             CCceecCCCCccccCCeEEEEecccCCCCc
Q 030834           29 PDPVLDIAGKQLRAGSKYYILPVTKGRGGG   58 (170)
Q Consensus        29 ~~~VlD~~G~~l~~g~~YYI~p~~~~~gGG   58 (170)
                      |.......+..=+.+++||+--..++.||+
T Consensus        34 p~l~v~~~~~~r~~~gqyyVpF~V~N~gg~   63 (122)
T TIGR02588        34 AVLEVAPAEVERMQTGQYYVPFAIHNLGGT   63 (122)
T ss_pred             CeEEEeehheeEEeCCEEEEEEEEEeCCCc
Confidence            344556666655578999998888877654


No 19 
>PF10657 RC-P840_PscD:  Photosystem P840 reaction centre protein PscD;  InterPro: IPR019608 Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.  The photosynthetic reaction centres (RCs) of aerotolerant organisms contain a heterodimeric core, built up of two strongly homologous polypeptides each of which contributes five transmembrane peptide helices to hold a pseudo-symmetric double set of redox components. Two molecules of PscD are housed within a subunit. PscD may be involved in stabilising the PscB component since it is found to co-precipitate with FMO (Fenna-Mathews-Olson BChl a-protein) and PscB. It may also be involved in the interaction with ferredoxin []. 
Probab=35.22  E-value=34  Score=26.39  Aligned_cols=26  Identities=31%  Similarity=0.608  Sum_probs=23.2

Q ss_pred             CCCccccCCeEEEEecccCCCCcEEE
Q 030834           36 AGKQLRAGSKYYILPVTKGRGGGLTL   61 (170)
Q Consensus        36 ~G~~l~~g~~YYI~p~~~~~gGGl~l   61 (170)
                      .||++.-..+|||-+|.|+..|-|.+
T Consensus        24 sGNa~HK~eKYfITsAkRD~~g~Lql   49 (144)
T PF10657_consen   24 SGNAVHKAEKYFITSAKRDRYGKLQL   49 (144)
T ss_pred             cCchhhhhheeEEeeeecccCCceEE
Confidence            79999999999999999998886654


No 20 
>PF10731 Anophelin:  Thrombin inhibitor from mosquito;  InterPro: IPR018932  Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing. 
Probab=33.07  E-value=52  Score=22.24  Aligned_cols=41  Identities=24%  Similarity=0.370  Sum_probs=20.0

Q ss_pred             CCchhhHHHHHHHHHHHh--CCCCCCCCCCCCceecCCC---CccccC
Q 030834            1 MRSTLVLTPLILLFAFIA--TPLPVRGNASPDPVLDIAG---KQLRAG   43 (170)
Q Consensus         1 MK~~~~l~l~fll~a~~t--~~l~~~~~~~~~~VlD~~G---~~l~~g   43 (170)
                      |-+-+ ..++||++|+..  +..|.- +...+|-+|-+-   ++|.+.
T Consensus         1 MA~Kl-~vialLC~aLva~vQ~APQY-a~GeeP~YDEdd~dde~l~ph   46 (65)
T PF10731_consen    1 MASKL-IVIALLCVALVAIVQSAPQY-APGEEPSYDEDDDDDEPLKPH   46 (65)
T ss_pred             Ccchh-hHHHHHHHHHHHHHhcCccc-CCCCCCCcCcccCcccccccC
Confidence            44443 345566666542  322211 123478887664   566553


No 21 
>PF02402 Lysis_col:  Lysis protein;  InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined. The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells. A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB. Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveals extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology at the C terminus, and in the C-terminal end of the central region are thought to correspond to the catalytic and receptor-recognition domains, respectively.  Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides. Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase. The mature ColE2 lysis protein is located in the cell envelope.; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane
Probab=32.58  E-value=42  Score=21.27  Aligned_cols=17  Identities=18%  Similarity=0.348  Sum_probs=12.5

Q ss_pred             CCCceecCCCCccccCC
Q 030834           28 SPDPVLDIAGKQLRAGS   44 (170)
Q Consensus        28 ~~~~VlD~~G~~l~~g~   44 (170)
                      +...|-|+.|--+.|..
T Consensus        20 QaN~iRDvqGGtVaPSS   36 (46)
T PF02402_consen   20 QANYIRDVQGGTVAPSS   36 (46)
T ss_pred             hhcceecCCCceECCCc
Confidence            33689999998776653


No 22 
>PF09888 DUF2115:  Uncharacterized protein conserved in archaea (DUF2115);  InterPro: IPR019215  This entry represents various hypothetical archaeal proteins with no known function.
Probab=31.22  E-value=72  Score=25.28  Aligned_cols=30  Identities=20%  Similarity=0.296  Sum_probs=21.5

Q ss_pred             eCCCCCCCCCCCCcccEEEEEeCCcceEEeccC
Q 030834          101 TGGIEGNLGPQTTINWFRIEKFYGDYKLVFCPL  133 (170)
Q Consensus       101 ~gg~~g~pg~~t~~~~FkIek~~~~YKLvfCp~  133 (170)
                      +-...+||=--...|-|+|++-++.|   |||-
T Consensus       113 I~~~PlHPvG~~FPGG~~V~~~~g~Y---YCPV  142 (163)
T PF09888_consen  113 ILKEPLHPVGMPFPGGFKVEEKNGNY---YCPV  142 (163)
T ss_pred             HhCCCCCCCCCCCCCCeEEEEECCEE---eCcc
Confidence            34456677333478899999998877   8993


No 23 
>PF10813 DUF2733:  Protein of unknown function (DUF2733);  InterPro: IPR024360 The UL11 gene product of herpes simplex virus is a membrane-associated tegument protein that is incorporated into the HSV virion and functions in viral envelopment. UL11 is acylated, which is crucial for lipid raft association.
Probab=30.92  E-value=25  Score=20.77  Aligned_cols=19  Identities=16%  Similarity=0.464  Sum_probs=14.6

Q ss_pred             CCCceecCCCCccccCCeE
Q 030834           28 SPDPVLDIAGKQLRAGSKY   46 (170)
Q Consensus        28 ~~~~VlD~~G~~l~~g~~Y   46 (170)
                      ...|+.|.+|+++.--.+|
T Consensus        11 r~n~l~Dv~G~~Inl~~dF   29 (32)
T PF10813_consen   11 RHNPLKDVKGNPINLYKDF   29 (32)
T ss_pred             cCCcccccCCCEEechhcc
Confidence            4579999999998765444


No 24 
>COG3045 CreA Uncharacterized protein conserved in bacteria [Function unknown]
Probab=30.86  E-value=91  Score=24.89  Aligned_cols=24  Identities=17%  Similarity=0.227  Sum_probs=19.2

Q ss_pred             ceecCCCCccccCCeEEEEecccC
Q 030834           31 PVLDIAGKQLRAGSKYYILPVTKG   54 (170)
Q Consensus        31 ~VlD~~G~~l~~g~~YYI~p~~~~   54 (170)
                      .|++.--+|.-.|.+-||--+.+|
T Consensus        42 IvveafdDP~V~gVTCyvs~a~~g   65 (165)
T COG3045          42 IVVEAFDDPDVKGVTCYVSRAKTG   65 (165)
T ss_pred             EEEEecCCCCcCcEEEEEEEeccc
Confidence            566666778888999999988765


No 25 
>COG5510 Predicted small secreted protein [Function unknown]
Probab=30.20  E-value=46  Score=20.99  Aligned_cols=16  Identities=38%  Similarity=0.617  Sum_probs=8.8

Q ss_pred             CCchhhHHHHHHHHHH
Q 030834            1 MRSTLVLTPLILLFAF   16 (170)
Q Consensus         1 MK~~~~l~l~fll~a~   16 (170)
                      ||-+.++.+++++.++
T Consensus         2 mk~t~l~i~~vll~s~   17 (44)
T COG5510           2 MKKTILLIALVLLAST   17 (44)
T ss_pred             chHHHHHHHHHHHHHH
Confidence            7776645444555444


No 26 
>PRK01022 hypothetical protein; Provisional
Probab=27.45  E-value=87  Score=24.98  Aligned_cols=30  Identities=20%  Similarity=0.250  Sum_probs=21.1

Q ss_pred             eCCCCCCCCCCCCcccEEEEEeCCcceEEeccC
Q 030834          101 TGGIEGNLGPQTTINWFRIEKFYGDYKLVFCPL  133 (170)
Q Consensus       101 ~gg~~g~pg~~t~~~~FkIek~~~~YKLvfCp~  133 (170)
                      +-+..+||=--...|-|+|++.++.|   |||-
T Consensus       115 I~~~PlHPvG~~FPGG~~V~~~~g~y---YCPV  144 (167)
T PRK01022        115 ILKEPLHPVGTPFPGGFKVEEKNGVY---YCPV  144 (167)
T ss_pred             HhCCCCCCCCCCCCCCeEEEeECCEE---eCcc
Confidence            33456677333477889999998876   7993


No 27 
>PRK10159 outer membrane phosphoporin protein E; Provisional
Probab=27.20  E-value=48  Score=28.95  Aligned_cols=15  Identities=20%  Similarity=0.257  Sum_probs=11.2

Q ss_pred             CceecCCCCccccCC
Q 030834           30 DPVLDIAGKQLRAGS   44 (170)
Q Consensus        30 ~~VlD~~G~~l~~g~   44 (170)
                      .+|+|.||.-|.-.+
T Consensus        22 ~~vy~~d~ssvtlyG   36 (351)
T PRK10159         22 AEVYNKDGNKLDVYG   36 (351)
T ss_pred             EEEEECCCCEEEEEE
Confidence            589999997776543


No 28 
>PF11153 DUF2931:  Protein of unknown function (DUF2931);  InterPro: IPR021326  Some members of this family of proteins are annotated as lipoproteins; however, this cannot be confirmed. Currently, there is no known function.
Probab=26.01  E-value=70  Score=25.88  Aligned_cols=17  Identities=35%  Similarity=0.456  Sum_probs=10.0

Q ss_pred             CCchhhHHHHHHHHHHH
Q 030834            1 MRSTLVLTPLILLFAFI   17 (170)
Q Consensus         1 MK~~~~l~l~fll~a~~   17 (170)
                      ||.+++|++++||.+-+
T Consensus         1 mk~i~~l~l~lll~~C~   17 (216)
T PF11153_consen    1 MKKILLLLLLLLLTGCS   17 (216)
T ss_pred             ChHHHHHHHHHHHHhhc
Confidence            89887555545544433


No 29 
>PRK11289 ampC beta-lactamase/D-alanine carboxypeptidase; Provisional
Probab=25.75  E-value=46  Score=29.51  Aligned_cols=15  Identities=33%  Similarity=0.441  Sum_probs=9.5

Q ss_pred             CCchhhHHHHHHHHH
Q 030834            1 MRSTLVLTPLILLFA   15 (170)
Q Consensus         1 MK~~~~l~l~fll~a   15 (170)
                      ||..++|+++||+++
T Consensus         2 ~~~~~~~~~~~~~~~   16 (384)
T PRK11289          2 MKMMLLLLLAALLLT   16 (384)
T ss_pred             cchhhHHHHHHHHHH
Confidence            887775655555554


No 30 
>PF05550 Peptidase_C53:  Pestivirus Npro endopeptidase C53;  InterPro: IPR008751 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This group of cysteine peptidases belongs to MEROPS peptidase family C53 (clan C-). The active site residues occur in the order E, H, C in the sequence, which is unlike that in any other family. They are unique to pestiviruses. The N-terminal cysteine peptidase (Npro) encoded by the bovine viral diarrhoea virus genome is responsible for the self-cleavage that releases the N terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core.; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=24.74  E-value=56  Score=25.92  Aligned_cols=16  Identities=38%  Similarity=0.526  Sum_probs=13.7

Q ss_pred             CCCceecCCCCccccC
Q 030834           28 SPDPVLDIAGKQLRAG   43 (170)
Q Consensus        28 ~~~~VlD~~G~~l~~g   43 (170)
                      ..|||+|..|+||.-.
T Consensus        20 v~EPVyd~~g~plfGe   35 (168)
T PF05550_consen   20 VEEPVYDSAGNPLFGE   35 (168)
T ss_pred             cccccccCCCCCccCC
Confidence            4599999999999854


No 31 
>PF11777 DUF3316:  Protein of unknown function (DUF3316);  InterPro: IPR016879 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=22.98  E-value=68  Score=23.56  Aligned_cols=14  Identities=29%  Similarity=0.655  Sum_probs=7.9

Q ss_pred             CCchhhHHHHHHHHH
Q 030834            1 MRSTLVLTPLILLFA   15 (170)
Q Consensus         1 MK~~~~l~l~fll~a   15 (170)
                      ||.+++|+ +.||++
T Consensus         1 MKk~~ll~-~~ll~s   14 (114)
T PF11777_consen    1 MKKIILLA-SLLLLS   14 (114)
T ss_pred             CchHHHHH-HHHHHH
Confidence            99887443 334443


No 32 
>PF02950 Conotoxin:  Conotoxin;  InterPro: IPR004214 Cone snail toxins, conotoxins, are small neurotoxic peptides with disulphide connectivity that target ion-channels or G-protein coupled receptors. Based on the number and pattern of disulphide bonds and biological activities, conotoxins can be classified into several families. Omega, delta and kappa families of conotoxins have a knottin or inhibitor cysteine knot scaffold. The knottin scaffold is a very special disulphide-through-disulphide knot, in which the III-VI disulphide bond crosses the macrocycle formed by two other disulphide bonds (I-IV and II-V) and the interconnecting backbone segments, where I-VI indicates the six cysteine residues starting from the N terminus.  The disulphide bonding network, as well as specific amino acids in inter-cysteine loops, provides the specificity of conotoxins. The cysteine arrangements are the same for omega, delta and kappa families, even though omega conotoxins are calcium channel blockers, whereas delta conotoxins delay the inactivation of sodium channels, and kappa conotoxins are potassium channel blockers. Mu conotoxins have two types of cysteine arrangements, but the knottin scaffold is not observed. Mu conotoxins target the voltage-gated sodium channels, and are useful probes for investigating voltage-dependent sodium channels of excitable tissues. Alpha conotoxins have two types of cysteine arrangements, and are competitive nicotinic acetylcholine receptor antagonists.; GO: 0008200 ion channel inhibitor activity, 0009405 pathogenesis, 0005576 extracellular region; PDB: 2EFZ_A 1FYG_A 1RMK_A 1DG0_A 1DFY_A 1DFZ_A 2JQC_A 2YYF_A 2JQB_A 1F3K_A ....
Probab=21.49  E-value=31  Score=22.99  Aligned_cols=7  Identities=57%  Similarity=0.634  Sum_probs=0.0

Q ss_pred             CCchhhH
Q 030834            1 MRSTLVL    7 (170)
Q Consensus         1 MK~~~~l    7 (170)
                      ||-+.+|
T Consensus         1 mKLt~vl    7 (75)
T PF02950_consen    1 MKLTCVL    7 (75)
T ss_dssp             -------
T ss_pred             CCcchHH
Confidence            8988644


No 33 
>PF11355 DUF3157:  Protein of unknown function (DUF3157);  InterPro: IPR021501  This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 
Probab=20.50  E-value=1.1e+02  Score=25.39  Aligned_cols=22  Identities=23%  Similarity=0.530  Sum_probs=17.1

Q ss_pred             CceecCCCCccccCCeE---EEEec
Q 030834           30 DPVLDIAGKQLRAGSKY---YILPV   51 (170)
Q Consensus        30 ~~VlD~~G~~l~~g~~Y---YI~p~   51 (170)
                      ..|-=.||..|+-..++   |+++-
T Consensus        22 ~~VTLedGrqV~LnDDFTWeYv~~~   46 (199)
T PF11355_consen   22 ATVTLEDGRQVQLNDDFTWEYVIPE   46 (199)
T ss_pred             cEEEecCCCEEEecCCceEEEEecc
Confidence            47888999999988665   66653


No 34 
>KOG3352 consensus Cytochrome c oxidase, subunit Vb/COX4 [Energy production and conversion]
Probab=20.35  E-value=1.2e+02  Score=24.02  Aligned_cols=25  Identities=24%  Similarity=0.338  Sum_probs=17.6

Q ss_pred             cEEEEeCCCCCCCCCCCCcccEEEEEeCC
Q 030834           96 QWFVKTGGIEGNLGPQTTINWFRIEKFYG  124 (170)
Q Consensus        96 ~~~V~~gg~~g~pg~~t~~~~FkIek~~~  124 (170)
                      .++|.-|..+++    +.-.||.|+|.+.
T Consensus       109 ~RiVGC~c~eD~----~~V~Wmwl~Kge~  133 (153)
T KOG3352|consen  109 KRIVGCGCEEDS----HAVVWMWLEKGET  133 (153)
T ss_pred             ceEEeecccCCC----cceEEEEEEcCCc
Confidence            567888666652    3357999999864


No 35 
>PF12071 DUF3551:  Protein of unknown function (DUF3551);  InterPro: IPR021937  This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 79 and 104 amino acids in length. This protein has a single completely conserved residue C that may be functionally important.
Probab=20.24  E-value=1.3e+02  Score=21.19  Aligned_cols=7  Identities=43%  Similarity=0.577  Sum_probs=4.6

Q ss_pred             CCchhhH
Q 030834            1 MRSTLVL    7 (170)
Q Consensus         1 MK~~~~l    7 (170)
                      ||..++.
T Consensus         1 MR~~~~a    7 (82)
T PF12071_consen    1 MRRLLLA    7 (82)
T ss_pred             ChhHHHH
Confidence            8877633


Done!