Query 030834
Match_columns 170
No_of_seqs 103 out of 272
Neff 6.1
Searched_HMMs 46136
Date Fri Mar 29 05:16:54 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/030834.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/030834hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF00197 Kunitz_legume: Trypsi 100.0 2.4E-48 5.2E-53 310.7 15.8 132 31-169 1-176 (176)
2 cd00178 STI Soybean trypsin in 100.0 6.3E-48 1.4E-52 307.6 14.9 130 31-169 1-172 (172)
3 smart00452 STI Soybean trypsin 100.0 5.7E-47 1.2E-51 302.1 14.7 130 32-170 1-171 (172)
4 PF08194 DIM: DIM protein; In 69.0 8.1 0.00018 23.4 3.1 16 1-17 1-16 (36)
5 KOG3858 Ephrin, ligand for Eph 64.4 7.5 0.00016 32.8 3.2 45 35-80 116-165 (233)
6 PF07172 GRP: Glycine rich pro 57.4 6.3 0.00014 28.6 1.4 9 34-42 34-42 (95)
7 PRK10220 hypothetical protein; 55.8 18 0.00038 27.2 3.5 31 30-60 42-72 (111)
8 PF00879 Defensin_propep: Defe 53.0 16 0.00035 23.8 2.6 19 1-20 1-19 (52)
9 TIGR00686 phnA alkylphosphonat 50.5 24 0.00052 26.4 3.5 31 30-60 41-71 (109)
10 COG2824 PhnA Uncharacterized Z 45.1 30 0.00065 26.0 3.3 32 29-60 42-73 (112)
11 PF14009 DUF4228: Domain of un 44.3 15 0.00033 27.9 1.8 20 35-54 64-83 (181)
12 PF05474 Semenogelin: Semenoge 43.8 12 0.00027 34.9 1.3 15 1-15 1-15 (582)
13 PF03831 PhnA: PhnA protein; 43.6 19 0.00042 23.8 1.9 29 32-60 2-30 (56)
14 PF09466 Yqai: Hypothetical pr 43.2 19 0.00041 25.0 1.9 21 30-50 24-44 (71)
15 PF15284 PAGK: Phage-encoded v 42.5 13 0.00029 25.0 1.0 16 1-16 1-18 (61)
16 PF00812 Ephrin: Ephrin; Inte 40.7 20 0.00043 28.0 1.9 24 36-59 101-124 (145)
17 COG5341 Uncharacterized protei 39.3 1.2E+02 0.0026 23.4 5.8 11 42-52 46-56 (132)
18 TIGR02588 conserved hypothetic 35.7 66 0.0014 24.6 4.0 30 29-58 34-63 (122)
19 PF10657 RC-P840_PscD: Photosy 35.2 34 0.00074 26.4 2.3 26 36-61 24-49 (144)
20 PF10731 Anophelin: Thrombin i 33.1 52 0.0011 22.2 2.7 41 1-43 1-46 (65)
21 PF02402 Lysis_col: Lysis prot 32.6 42 0.00091 21.3 2.1 17 28-44 20-36 (46)
22 PF09888 DUF2115: Uncharacteri 31.2 72 0.0016 25.3 3.7 30 101-133 113-142 (163)
23 PF10813 DUF2733: Protein of u 30.9 25 0.00053 20.8 0.8 19 28-46 11-29 (32)
24 COG3045 CreA Uncharacterized p 30.9 91 0.002 24.9 4.1 24 31-54 42-65 (165)
25 COG5510 Predicted small secret 30.2 46 0.001 21.0 1.9 16 1-16 2-17 (44)
26 PRK01022 hypothetical protein; 27.5 87 0.0019 25.0 3.6 30 101-133 115-144 (167)
27 PRK10159 outer membrane phosph 27.2 48 0.001 28.9 2.3 15 30-44 22-36 (351)
28 PF11153 DUF2931: Protein of u 26.0 70 0.0015 25.9 2.9 17 1-17 1-17 (216)
29 PRK11289 ampC beta-lactamase/D 25.8 46 0.00099 29.5 1.9 15 1-15 2-16 (384)
30 PF05550 Peptidase_C53: Pestiv 24.7 56 0.0012 25.9 2.0 16 28-43 20-35 (168)
31 PF11777 DUF3316: Protein of u 23.0 68 0.0015 23.6 2.1 14 1-15 1-14 (114)
32 PF02950 Conotoxin: Conotoxin; 21.5 31 0.00067 23.0 0.0 7 1-7 1-7 (75)
33 PF11355 DUF3157: Protein of u 20.5 1.1E+02 0.0023 25.4 2.9 22 30-51 22-46 (199)
34 KOG3352 Cytochrome c oxidase, 20.3 1.2E+02 0.0026 24.0 3.1 25 96-124 109-133 (153)
35 PF12071 DUF3551: Protein of u 20.2 1.3E+02 0.0028 21.2 3.0 7 1-7 1-7 (82)
No 1
>PF00197 Kunitz_legume: Trypsin and protease inhibitor; InterPro: IPR002160 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. The Kunitz-type soybean trypsin inhibitor (STI) family consists mainly of proteinase inhibitors from Leguminosae seeds. They belong to MEROPS inhibitor family I3, clan IC. They exhibit proteinase inhibitory activity against serine proteinases; trypsin (MEROPS peptidase family S1, IPR001254 from INTERPRO) and subtilisin (MEROPS peptidase family S8, IPR000209 from INTERPRO), thiol proteinases (MEROPS peptidase family C1, IPR000668 from INTERPRO) and aspartic proteinases (MEROPS peptidase family A1, IPR001461 from INTERPRO). Inhibitors from cereals are active against subtilisin and endogenous alpha-amylases, while some also inhibit tissue plasminogen activator. The inhibitors are usually specific for either trypsin or chymotrypsin, and some are effective against both. They are thought to protect the seeds against consumption by animal predators, while at the same time existing as seed storage proteins themselves - all the actively inhibitory members contain 2 disulphide bridges. 
The existence of a member with no inhibitory activity, winged bean albumin 1, suggests that the inhibitors may have evolved from seed storage proteins. Proteins from the Kunitz family contain from 170 to 200 amino acid residues and one or two intra-chain disulphide bonds. The best conserved region is found in their N-terminal section. The crystal structures of soybean trypsin inhibitor (STI), trypsin inhibitor DE-3 from the Kaffir tree Erythrina caffra (ETI) and the bifunctional proteinase K/alpha-amylase inhibitor from wheat (PK13) have been solved, showing them to share the same 12-stranded beta-sheet structure as those of interleukin-1 and heparin-binding growth factors. The beta-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel beta-barrel. Despite the structural similarity, STI shows no interleukin-1 bioactivity, presumably as a result of their primary sequence disparities. The active inhibitory site containing the scissile bond is located in the loop between beta-strands 4 and 5 in STI and ETI. The STIs belong to a superfamily that also contains the interleukin-1 proteins, heparin binding growth factors (HBGF) and histactophilin, all of which have very similar structures, but share no sequence similarity with the STI family.; GO: 0004866 endopeptidase inhibitor activity; PDB: 3TC2_B 3S8J_A 3S8K_A 1TIE_A 2GZB_A 3E8L_C 2IWT_B 3BX1_C 1AVA_D 3IIR_A ....
Probab=100.00 E-value=2.4e-48 Score=310.71 Aligned_cols=132 Identities=56% Similarity=0.997 Sum_probs=118.6
Q ss_pred ceecCCCCccccCCeEEEEecccCCCCcEEEeecCCCCCCCCceEecCCC------------------------------
Q 030834 31 PVLDIAGKQLRAGSKYYILPVTKGRGGGLTLAGRSNNKTCPLDVVQEQHS------------------------------ 80 (170)
Q Consensus 31 ~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~l~~t~n~~~CPl~VvQ~~~~------------------------------ 80 (170)
||+|+|||||++|.+|||+|++++.|||+++++++| ++||++|+|++++
T Consensus 1 pVlD~~G~~l~~g~~YyI~p~~~~~GGGl~l~~~~n-~~CPl~Vvq~~~~~~~GlPv~Fs~~~~~~~~~~ir~st~l~I~ 79 (176)
T PF00197_consen 1 PVLDTDGNPLRNGGEYYILPAIRGAGGGLTLAKTGN-ETCPLDVVQSPSELSRGLPVKFSPPYRNSFDTVIRESTDLNIE 79 (176)
T ss_dssp B-BETTSCB-BTTSEEEEEESSTGCSEEEEEECCTT-SSSSEEEEEESSTTS-BSEEEEEESSSSSSTBCTBTTSEEEEE
T ss_pred CcCCCCCCCCcCCCCEEEEeCccCCCCeeEecCCCC-CCCChheEEccCCCCCceeEEEEeCCcccCCCeeEcceEEEEE
Confidence 799999999999999999999999999999999999 9999999999887
Q ss_pred --------CCccEEEeecCCCcccEEEEeCCCCCCCCCCCCcccEEEEEeCCc----ceEEeccCcccccccceeeeEEE
Q 030834 81 --------FRNVWKLDNFDAILGQWFVKTGGIEGNLGPQTTINWFRIEKFYGD----YKLVFCPLVCKFCKVLCIDVGIF 148 (170)
Q Consensus 81 --------~s~~W~v~~~d~~~~~~~V~~gg~~g~pg~~t~~~~FkIek~~~~----YKLvfCp~~~~~~~~~C~dvGi~ 148 (170)
.+++|+|++.++++++ +|+|||.+| .++..|||||||++.+ |||+|||+.|+ ...|+||||+
T Consensus 80 F~~~~~c~~~~~W~V~~~~~~~~~-~V~~gg~~~---~~~~~~~FkIek~~~~~~~~YKLvfCp~~~~--~~~C~dvGi~ 153 (176)
T PF00197_consen 80 FSSPTSCACSTVWKVVKDDPETGQ-FVKTGGVKG---PETVDSWFKIEKYEDGFNNAYKLVFCPSVCC--DSLCGDVGIY 153 (176)
T ss_dssp ESSECTTSSSSBEEEEEETTTTEE-EEEEESSSS---SGCGCCEEEEEEESSSSTTEEEEEEESSSSS--TSSEEEEEEE
T ss_pred EccCCCCCccCEEEEeecCcccce-EEEeCCccc---CCccCcEEEEEEeCCCCCCcEEEEECCCccc--cCccceeeEE
Confidence 6789999987776555 899999987 6788999999999985 99999999764 8999999999
Q ss_pred Ee-CCeEEEEeeC-CcEEEEEec
Q 030834 149 VN-GGVWHLALSD-VTFNVTFLN 169 (170)
Q Consensus 149 ~~-~g~rrL~ls~-~p~~V~F~k 169 (170)
+| +|+|||||++ +||.|+|||
T Consensus 154 ~d~~g~rrL~l~~~~p~~V~F~K 176 (176)
T PF00197_consen 154 FDDNGNRRLALSDDNPFVVVFQK 176 (176)
T ss_dssp EETTSEEEEEEESSSB-EEEEEE
T ss_pred EcCCCeEEEEECCCCcEEEEEEC
Confidence 95 8999999998 599999998
No 2
>cd00178 STI Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors.
Probab=100.00 E-value=6.3e-48 Score=307.60 Aligned_cols=130 Identities=52% Similarity=0.932 Sum_probs=119.8
Q ss_pred ceecCCCCccccCCeEEEEecccCCCCcEEEeecCCCCCCCCceEecCCC------------------------------
Q 030834 31 PVLDIAGKQLRAGSKYYILPVTKGRGGGLTLAGRSNNKTCPLDVVQEQHS------------------------------ 80 (170)
Q Consensus 31 ~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~l~~t~n~~~CPl~VvQ~~~~------------------------------ 80 (170)
||+|++||||++|.+|||+|+++|.||||++++++| ++||++|+|++++
T Consensus 1 ~VlD~~G~~l~~g~~YyI~p~~~g~GGGl~l~~~~~-~~CPl~VvQ~~~~~~~GlPv~Fs~~~~~~~~I~e~t~lnI~F~ 79 (172)
T cd00178 1 PVLDTDGNPLRNGGRYYILPAIRGGGGGLTLAATGN-ETCPLTVVQSPSELDRGLPVKFSPPNPKSDVIRESTDLNIEFD 79 (172)
T ss_pred CcCcCCCCCCcCCCeEEEEEceeCCCCcEEEcCCCC-CCCCCeeEECCCCCCCCeeEEEEeCCCCCCEEECCCcEEEEeC
Confidence 699999999999999999999999899999999999 9999999999986
Q ss_pred -------CCccEEEeecCCCcccEEEEeCCCCCCCCCCCCcccEEEEEeCC---cceEEeccCcccccccceeeeEEEEe
Q 030834 81 -------FRNVWKLDNFDAILGQWFVKTGGIEGNLGPQTTINWFRIEKFYG---DYKLVFCPLVCKFCKVLCIDVGIFVN 150 (170)
Q Consensus 81 -------~s~~W~v~~~d~~~~~~~V~~gg~~g~pg~~t~~~~FkIek~~~---~YKLvfCp~~~~~~~~~C~dvGi~~~ 150 (170)
+|++|+|+++|+ .++|+|+|||.+++ +.+|||||||+++ .|||+|||+.| +..|+||||+.+
T Consensus 80 ~~~~c~~~st~W~V~~~~~-~~~~~V~~Gg~~~~----~~~~~FkIek~~~~~~~YKL~~Cp~~~---~~~C~~VGi~~d 151 (172)
T cd00178 80 APTWCCGSSTVWKVDRDST-PEGLFVTTGGVKGN----TLNSWFKIEKVSEGLNAYKLVFCPSSC---DSKCGDVGIFID 151 (172)
T ss_pred CCCcCCCCCCEEEEeccCC-ccCeEEEeCCcCCC----cccceEEEEECCCCCCcEEEEEcCCCC---CCceeecccEEC
Confidence 689999987665 78999999999875 6899999999987 69999999875 679999999995
Q ss_pred -CCeEEEEeeC-CcEEEEEec
Q 030834 151 -GGVWHLALSD-VTFNVTFLN 169 (170)
Q Consensus 151 -~g~rrL~ls~-~p~~V~F~k 169 (170)
+|.|||+|++ +||.|+|+|
T Consensus 152 ~~g~rrL~l~~~~p~~V~F~k 172 (172)
T cd00178 152 PEGVRRLVLSDDNPLVVVFKK 172 (172)
T ss_pred CCCcEEEEEcCCCCeEEEEeC
Confidence 8999999998 599999997
No 3
>smart00452 STI Soybean trypsin inhibitor (Kunitz) family of protease inhibitors.
Probab=100.00 E-value=5.7e-47 Score=302.14 Aligned_cols=130 Identities=47% Similarity=0.835 Sum_probs=116.9
Q ss_pred eecCCCCccccCCeEEEEecccCCCCcEEEeecCCCCCCCCceEecCCC-------------------------------
Q 030834 32 VLDIAGKQLRAGSKYYILPVTKGRGGGLTLAGRSNNKTCPLDVVQEQHS------------------------------- 80 (170)
Q Consensus 32 VlD~~G~~l~~g~~YYI~p~~~~~gGGl~l~~t~n~~~CPl~VvQ~~~~------------------------------- 80 (170)
|+|++||||++|++|||+|++++.||||++++++| ++||++|+|++++
T Consensus 1 VlDt~G~~l~~G~~YyI~p~~~g~GGGl~l~~~~n-~~CPl~VvQ~~~~~~~GlPV~Fs~~~~~~~ii~e~t~lnI~F~~ 79 (172)
T smart00452 1 VLDTDGNPLRNGGTYYILPAIRGHGGGLTLAATGN-EICPLTVVQSPNEVDNGLPVKFSPPNPSDFIIRESTDLNIEFDA 79 (172)
T ss_pred CCCCCCCCCcCCCcEEEEEccccCCCCEEEccCCC-CCCCCeeEECCCCCCCceeEEEeecCCCCCEEecCceEEEEeCC
Confidence 79999999999999999999999889999999999 9999999999876
Q ss_pred -----CCccEEEeecCCCcccEEEEeCCCCCCCCCCCCcccEEEEEeCC---cceEEeccCcccccccceeeeEEEEe-C
Q 030834 81 -----FRNVWKLDNFDAILGQWFVKTGGIEGNLGPQTTINWFRIEKFYG---DYKLVFCPLVCKFCKVLCIDVGIFVN-G 151 (170)
Q Consensus 81 -----~s~~W~v~~~d~~~~~~~V~~gg~~g~pg~~t~~~~FkIek~~~---~YKLvfCp~~~~~~~~~C~dvGi~~~-~ 151 (170)
+|++|+|++ |++.++|+|+||| +|+.. .|||||||+++ .|||+|||+.|+ ...|.||||+++ +
T Consensus 80 ~~~C~~st~W~V~~-~~~~~~~~V~~gg---~~~~~--~~~FkIek~~~~~~~YKLv~Cp~~~~--~~~C~~vGi~~d~~ 151 (172)
T smart00452 80 PPLCAQSTVWTVDE-DSAPEGLAVKTGG---YPGVR--DSWFKIEKYSGESNGYKLVYCPNGSD--DDKCGDVGIFIDPE 151 (172)
T ss_pred CCCCCCCCEEEEec-CCccccEEEEeCC---cCCCC--CCeEEEEECCCCCCCEEEEEcCCCCC--CCccCccCeEECCC
Confidence 789999975 6678899999998 44443 69999999987 699999998774 678999999995 8
Q ss_pred CeEEEEeeCC-cEEEEEecC
Q 030834 152 GVWHLALSDV-TFNVTFLNG 170 (170)
Q Consensus 152 g~rrL~ls~~-p~~V~F~k~ 170 (170)
|+|||||+++ ||.|+|+|.
T Consensus 152 g~rrL~ls~~~p~~v~F~k~ 171 (172)
T smart00452 152 GGRRLVLSNENPLVVVFKKA 171 (172)
T ss_pred CcEEEEEcCCCCeEEEEEEC
Confidence 9999999975 999999983
No 4
>PF08194 DIM: DIM protein; InterPro: IPR013172 Drosophila immune-induced molecules (DIMs) are short proteins induced during the immune response of Drosophila. This entry includes DIMs 1 to 4 and DIM23.
Probab=68.99 E-value=8.1 Score=23.40 Aligned_cols=16 Identities=25% Similarity=0.366 Sum_probs=9.0
Q ss_pred CCchhhHHHHHHHHHHH
Q 030834 1 MRSTLVLTPLILLFAFI 17 (170)
Q Consensus 1 MK~~~~l~l~fll~a~~ 17 (170)
||.+.+ .+.|+++|+.
T Consensus 1 Mk~l~~-a~~l~lLal~ 16 (36)
T PF08194_consen 1 MKCLSL-AFALLLLALA 16 (36)
T ss_pred CceeHH-HHHHHHHHHH
Confidence 998873 3334444443
No 5
>KOG3858 consensus Ephrin, ligand for Eph receptor tyrosine kinase [Signal transduction mechanisms]
Probab=64.38 E-value=7.5 Score=32.80 Aligned_cols=45 Identities=24% Similarity=0.397 Sum_probs=30.5
Q ss_pred CCCCccccCCeEEEEecccCCCCcEE-----EeecCCCCCCCCceEecCCC
Q 030834 35 IAGKQLRAGSKYYILPVTKGRGGGLT-----LAGRSNNKTCPLDVVQEQHS 80 (170)
Q Consensus 35 ~~G~~l~~g~~YYI~p~~~~~gGGl~-----l~~t~n~~~CPl~VvQ~~~~ 80 (170)
..|.+-++|.+||+++...|.-.|+- +-.+.+ ..+-..|.|++..
T Consensus 116 p~G~EF~pG~~YY~IStStg~~~g~~~~~ggvc~~~~-mk~~~~V~~~~~~ 165 (233)
T KOG3858|consen 116 PLGFEFQPGHTYYYISTSTGDAEGLCNLRGGVCVTRN-MKLLMKVGQSPRS 165 (233)
T ss_pred CCCccccCCCeEEEEeCCCccccccchhhCCEeccCC-ceEEEEecccCCC
Confidence 36999999999999998766433332 123334 5677778887653
No 6
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress.
Probab=57.43 E-value=6.3 Score=28.62 Aligned_cols=9 Identities=0% Similarity=0.176 Sum_probs=3.7
Q ss_pred cCCCCcccc
Q 030834 34 DIAGKQLRA 42 (170)
Q Consensus 34 D~~G~~l~~ 42 (170)
..+-++|+.
T Consensus 34 ~~~~~~v~~ 42 (95)
T PF07172_consen 34 EEEENEVQD 42 (95)
T ss_pred cccCCCCCc
Confidence 333344443
No 7
>PRK10220 hypothetical protein; Provisional
Probab=55.78 E-value=18 Score=27.22 Aligned_cols=31 Identities=26% Similarity=0.214 Sum_probs=22.5
Q ss_pred CceecCCCCccccCCeEEEEecccCCCCcEE
Q 030834 30 DPVLDIAGKQLRAGSKYYILPVTKGRGGGLT 60 (170)
Q Consensus 30 ~~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~ 60 (170)
..|.|.+|++|..|..--++=...=.|.+.+
T Consensus 42 ~~vkDsnG~~L~dGDsV~viKDLkVKGss~~ 72 (111)
T PRK10220 42 LIVKDANGNLLADGDSVTIVKDLKVKGSSSM 72 (111)
T ss_pred ceEEcCCCCCccCCCEEEEEeeccccccccc
Confidence 4689999999999998777655433444444
No 8
>PF00879 Defensin_propep: Defensin propeptide The pattern for this Prosite entry doesn't match the propeptide.; InterPro: IPR002366 Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses, containing three pairs of intramolecular disulphide bonds. On the basis of their size and pattern of disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Alpha-defensins, which have been identified in humans, monkeys and several rodent species, are particularly abundant in neutrophils, certain macrophage populations and Paneth cells of the small intestine. Every mammalian species explored thus far has beta-defensins. In cows, as many as 13 beta-defensins exist in neutrophils. However, in other species, beta-defensins are more often produced by epithelial cells lining various organs (e.g. the epidermis, bronchial tree and genitourinary tract). Theta-defensins are cyclic and have so far only been identified in primate phagocytes. Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic) regions interact with anionic phospholipid head groups and water. 
Subsequently, some defensins can aggregate to form `channel-like' pores; others might bind to and cover the microbial membrane in a `carpet-like' manner. The net outcome is the disruption of membrane integrity and function, which ultimately leads to the lysis of microorganisms. Some defensins are synthesized as propeptides which may be relevant to this process - in neutrophils only the mature peptides have been identified but in Paneth cells, the propeptide is stored in vesicles and appears to be cleaved by trypsin on activation.; GO: 0006952 defense response
Probab=52.96 E-value=16 Score=23.83 Aligned_cols=19 Identities=37% Similarity=0.574 Sum_probs=13.2
Q ss_pred CCchhhHHHHHHHHHHHhCC
Q 030834 1 MRSTLVLTPLILLFAFIATP 20 (170)
Q Consensus 1 MK~~~~l~l~fll~a~~t~~ 20 (170)
||++.+|.. +||+||-++.
T Consensus 1 MRTL~LLaA-lLLlAlqaQA 19 (52)
T PF00879_consen 1 MRTLALLAA-LLLLALQAQA 19 (52)
T ss_pred CcHHHHHHH-HHHHHHHHhc
Confidence 898875544 7888876544
No 9
>TIGR00686 phnA alkylphosphonate utilization operon protein PhnA. The protein family includes an uncharacterized member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterized phosphonoacetate hydrolase designated PhnA by Kulakova, et al. (2001, 1997).
Probab=50.54 E-value=24 Score=26.45 Aligned_cols=31 Identities=26% Similarity=0.293 Sum_probs=22.7
Q ss_pred CceecCCCCccccCCeEEEEecccCCCCcEE
Q 030834 30 DPVLDIAGKQLRAGSKYYILPVTKGRGGGLT 60 (170)
Q Consensus 30 ~~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~ 60 (170)
..|.|.+|++|..|-+=-|+=...=.|.+.+
T Consensus 41 ~~~kDsnG~~L~dGDsV~liKDLkVKGss~~ 71 (109)
T TIGR00686 41 LIVKDCNGNLLANGDSVILIKDLKVKGSSLV 71 (109)
T ss_pred ceEEcCCCCCccCCCEEEEEeeccccCcccc
Confidence 4689999999999998877755433444444
No 10
>COG2824 PhnA Uncharacterized Zn-ribbon-containing protein involved in phosphonate metabolism [Inorganic ion transport and metabolism]
Probab=45.13 E-value=30 Score=25.95 Aligned_cols=32 Identities=22% Similarity=0.204 Sum_probs=22.7
Q ss_pred CCceecCCCCccccCCeEEEEecccCCCCcEE
Q 030834 29 PDPVLDIAGKQLRAGSKYYILPVTKGRGGGLT 60 (170)
Q Consensus 29 ~~~VlD~~G~~l~~g~~YYI~p~~~~~gGGl~ 60 (170)
...|.|.+||.|..|..--|+-...-.|.+.+
T Consensus 42 ~~~v~DsnGn~L~dGDsV~lIKDLkVKGss~~ 73 (112)
T COG2824 42 ALIVKDSNGNLLADGDSVTLIKDLKVKGSSKV 73 (112)
T ss_pred ceEEEcCCCcEeccCCeEEEEEeeeecCCcce
Confidence 35899999999999998877654433333333
No 11
>PF14009 DUF4228: Domain of unknown function (DUF4228)
Probab=44.33 E-value=15 Score=27.88 Aligned_cols=20 Identities=25% Similarity=0.665 Sum_probs=16.6
Q ss_pred CCCCccccCCeEEEEecccC
Q 030834 35 IAGKQLRAGSKYYILPVTKG 54 (170)
Q Consensus 35 ~~G~~l~~g~~YYI~p~~~~ 54 (170)
.-.++|++|.-||++|..+-
T Consensus 64 ~~d~~L~~G~~Y~llP~~~~ 83 (181)
T PF14009_consen 64 PPDEELQPGQIYFLLPMSRL 83 (181)
T ss_pred CccCeecCCCEEEEEEcccc
Confidence 45678999999999998653
No 12
>PF05474 Semenogelin: Semenogelin; InterPro: IPR008836 This family consists of several mammalian semenogelin (I and II) proteins. Freshly ejaculated Homo sapiens semen has the appearance of a loose gel in which the predominant structural protein components are the seminal vesicle secreted semenogelins (Sg).; GO: 0005198 structural molecule activity, 0019953 sexual reproduction, 0005576 extracellular region, 0030141 stored secretory granule
Probab=43.83 E-value=12 Score=34.89 Aligned_cols=15 Identities=27% Similarity=0.439 Sum_probs=12.2
Q ss_pred CCchhhHHHHHHHHH
Q 030834 1 MRSTLVLTPLILLFA 15 (170)
Q Consensus 1 MK~~~~l~l~fll~a 15 (170)
||+++++.||+||++
T Consensus 1 MK~~I~F~lSLLLiL 15 (582)
T PF05474_consen 1 MKSIIFFVLSLLLIL 15 (582)
T ss_pred CCceeehHHHHHHHH
Confidence 999877778888776
No 13
>PF03831 PhnA: PhnA protein; InterPro: IPR013988 The PhnA protein family includes the uncharacterised Escherichia coli protein PhnA and its homologues. The E. coli phnA gene is part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. The protein is not related to the characterised phosphonoacetate hydrolase designated PhnA. This entry represents the C-terminal domain of PhnA.; PDB: 2AKK_A 2AKL_A.
Probab=43.63 E-value=19 Score=23.85 Aligned_cols=29 Identities=28% Similarity=0.434 Sum_probs=15.2
Q ss_pred eecCCCCccccCCeEEEEecccCCCCcEE
Q 030834 32 VLDIAGKQLRAGSKYYILPVTKGRGGGLT 60 (170)
Q Consensus 32 VlD~~G~~l~~g~~YYI~p~~~~~gGGl~ 60 (170)
|.|.+|++|..|-+--++--..=.|.+.+
T Consensus 2 v~DsnGn~L~dGDsV~~iKDLkVKG~s~~ 30 (56)
T PF03831_consen 2 VKDSNGNELQDGDSVTLIKDLKVKGSSFT 30 (56)
T ss_dssp -B-TTS-B--TTEEEEESS-EEETTTTEE
T ss_pred eEcCCCCCccCCCEEEEEeeeeeccCCcc
Confidence 68999999999988776544332344444
No 14
>PF09466 Yqai: Hypothetical protein Yqai; InterPro: IPR018474 The hypothetical protein YqaI is expressed in bacteria, particularly Bacillus subtilis. It forms a homo-dimer, with each monomer containing an alpha helix and four beta strands.; PDB: 2DSM_B.
Probab=43.21 E-value=19 Score=25.01 Aligned_cols=21 Identities=33% Similarity=0.801 Sum_probs=12.2
Q ss_pred CceecCCCCccccCCeEEEEe
Q 030834 30 DPVLDIAGKQLRAGSKYYILP 50 (170)
Q Consensus 30 ~~VlD~~G~~l~~g~~YYI~p 50 (170)
-|+.|.-|+++.+|.+|+|.|
T Consensus 24 ~~i~D~yG~EI~~~D~y~i~~ 44 (71)
T PF09466_consen 24 HPIEDFYGDEIFPGDDYFISP 44 (71)
T ss_dssp -B---TTSS-B-TTS-EEE-E
T ss_pred cceeeeeccccccCCeEEEeC
Confidence 467789999999999999976
No 15
>PF15284 PAGK: Phage-encoded virulence factor
Probab=42.52 E-value=13 Score=25.00 Aligned_cols=16 Identities=25% Similarity=0.254 Sum_probs=8.0
Q ss_pred CCchh--hHHHHHHHHHH
Q 030834 1 MRSTL--VLTPLILLFAF 16 (170)
Q Consensus 1 MK~~~--~l~l~fll~a~ 16 (170)
||... +|.|.|.|.|.
T Consensus 1 Mkk~ksifL~l~~~LsA~ 18 (61)
T PF15284_consen 1 MKKFKSIFLALVFILSAA 18 (61)
T ss_pred ChHHHHHHHHHHHHHHHh
Confidence 77332 35555555553
No 16
>PF00812 Ephrin: Ephrin; InterPro: IPR001799 Ephrins are a family of proteins that are ligands of class V (EPH-related) receptor protein-tyrosine kinases (see IPR001426 from INTERPRO). These receptors and their ligands have been implicated in regulating neuronal axon guidance and in patterning of the developing nervous system and may also serve a patterning and compartmentalisation role outside of the nervous system. Ephrins are membrane-attached proteins of 205 to 340 residues. Attachment appears to be crucial for their normal function. Type-A ephrins are linked to the membrane via a glycosylphosphatidylinositol (GPI)-linkage, while type-B ephrins are type-I membrane proteins.; GO: 0016020 membrane; PDB: 3HEI_P 3CZU_B 3MBW_B 1KGY_E 1IKO_P 2WO3_B 2I85_A 2VSK_B 3GXU_B 2VSM_B ....
Probab=40.72 E-value=20 Score=28.02 Aligned_cols=24 Identities=29% Similarity=0.703 Sum_probs=18.2
Q ss_pred CCCccccCCeEEEEecccCCCCcE
Q 030834 36 AGKQLRAGSKYYILPVTKGRGGGL 59 (170)
Q Consensus 36 ~G~~l~~g~~YYI~p~~~~~gGGl 59 (170)
.|-+-++|.+||++....|.-+|+
T Consensus 101 ~G~EF~pG~~YY~ISts~g~~~g~ 124 (145)
T PF00812_consen 101 LGLEFQPGHDYYYISTSTGTQEGL 124 (145)
T ss_dssp TSSS--TTEEEEEEEEESSSSTTT
T ss_pred CCeeecCCCeEEEEEccCCCCCCc
Confidence 789999999999999877765554
No 17
>COG5341 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=39.33 E-value=1.2e+02 Score=23.40 Aligned_cols=11 Identities=27% Similarity=0.495 Sum_probs=6.9
Q ss_pred cCCeEEEEecc
Q 030834 42 AGSKYYILPVT 52 (170)
Q Consensus 42 ~g~~YYI~p~~ 52 (170)
.|..||..|..
T Consensus 46 ~Gk~~r~i~l~ 56 (132)
T COG5341 46 DGKVIRTIPLT 56 (132)
T ss_pred CCEEEEEEEcc
Confidence 45667777765
No 18
>TIGR02588 conserved hypothetical protein TIGR02588. The function of this protein is unknown. It is always found as part of a two-gene operon with TIGR02587, a protein that appears to span the membrane seven times. It is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus, so far, all of which are bacterial.
Probab=35.65 E-value=66 Score=24.56 Aligned_cols=30 Identities=17% Similarity=0.151 Sum_probs=21.4
Q ss_pred CCceecCCCCccccCCeEEEEecccCCCCc
Q 030834 29 PDPVLDIAGKQLRAGSKYYILPVTKGRGGG 58 (170)
Q Consensus 29 ~~~VlD~~G~~l~~g~~YYI~p~~~~~gGG 58 (170)
|.......+..=+.+++||+--..++.||+
T Consensus 34 p~l~v~~~~~~r~~~gqyyVpF~V~N~gg~ 63 (122)
T TIGR02588 34 AVLEVAPAEVERMQTGQYYVPFAIHNLGGT 63 (122)
T ss_pred CeEEEeehheeEEeCCEEEEEEEEEeCCCc
Confidence 344556666655578999998888877654
No 19
>PF10657 RC-P840_PscD: Photosystem P840 reaction centre protein PscD; InterPro: IPR019608 Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. The photosynthetic reaction centres (RCs) of aerotolerant organisms contain a heterodimeric core, built up of two strongly homologous polypeptides each of which contributes five transmembrane peptide helices to hold a pseudo-symmetric double set of redox components. Two molecules of PscD are housed within a subunit. PscD may be involved in stabilising the PscB component since it is found to co-precipitate with FMO (Fenna-Matthews-Olson BChl a-protein) and PscB. It may also be involved in the interaction with ferredoxin.
Probab=35.22 E-value=34 Score=26.39 Aligned_cols=26 Identities=31% Similarity=0.608 Sum_probs=23.2
Q ss_pred CCCccccCCeEEEEecccCCCCcEEE
Q 030834 36 AGKQLRAGSKYYILPVTKGRGGGLTL 61 (170)
Q Consensus 36 ~G~~l~~g~~YYI~p~~~~~gGGl~l 61 (170)
.||++.-..+|||-+|.|+..|-|.+
T Consensus 24 sGNa~HK~eKYfITsAkRD~~g~Lql 49 (144)
T PF10657_consen 24 SGNAVHKAEKYFITSAKRDRYGKLQL 49 (144)
T ss_pred cCchhhhhheeEEeeeecccCCceEE
Confidence 79999999999999999998886654
No 20
>PF10731 Anophelin: Thrombin inhibitor from mosquito; InterPro: IPR018932 Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing.
Probab=33.07 E-value=52 Score=22.24 Aligned_cols=41 Identities=24% Similarity=0.370 Sum_probs=20.0
Q ss_pred CCchhhHHHHHHHHHHHh--CCCCCCCCCCCCceecCCC---CccccC
Q 030834 1 MRSTLVLTPLILLFAFIA--TPLPVRGNASPDPVLDIAG---KQLRAG 43 (170)
Q Consensus 1 MK~~~~l~l~fll~a~~t--~~l~~~~~~~~~~VlD~~G---~~l~~g 43 (170)
|-+-+ ..++||++|+.. +..|.- +...+|-+|-+- ++|.+.
T Consensus 1 MA~Kl-~vialLC~aLva~vQ~APQY-a~GeeP~YDEdd~dde~l~ph 46 (65)
T PF10731_consen 1 MASKL-IVIALLCVALVAIVQSAPQY-APGEEPSYDEDDDDDEPLKPH 46 (65)
T ss_pred Ccchh-hHHHHHHHHHHHHHhcCccc-CCCCCCCcCcccCcccccccC
Confidence 44443 345566666542 322211 123478887664 566553
No 21
>PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined. The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells. A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB. Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveals extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology at the C terminus, and in the C-terminal end of the central region are thought to correspond to the catalytic and receptor-recognition domains, respectively. Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides. Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase. The mature ColE2 lysis protein is located in the cell envelope.; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane
Probab=32.58 E-value=42 Score=21.27 Aligned_cols=17 Identities=18% Similarity=0.348 Sum_probs=12.5
Q ss_pred CCCceecCCCCccccCC
Q 030834 28 SPDPVLDIAGKQLRAGS 44 (170)
Q Consensus 28 ~~~~VlD~~G~~l~~g~ 44 (170)
+...|-|+.|--+.|..
T Consensus 20 QaN~iRDvqGGtVaPSS 36 (46)
T PF02402_consen 20 QANYIRDVQGGTVAPSS 36 (46)
T ss_pred hhcceecCCCceECCCc
Confidence 33689999998776653
No 22
>PF09888 DUF2115: Uncharacterized protein conserved in archaea (DUF2115); InterPro: IPR019215 This entry represents various hypothetical archaeal proteins with no known function.
Probab=31.22 E-value=72 Score=25.28 Aligned_cols=30 Identities=20% Similarity=0.296 Sum_probs=21.5
Q ss_pred eCCCCCCCCCCCCcccEEEEEeCCcceEEeccC
Q 030834 101 TGGIEGNLGPQTTINWFRIEKFYGDYKLVFCPL 133 (170)
Q Consensus 101 ~gg~~g~pg~~t~~~~FkIek~~~~YKLvfCp~ 133 (170)
+-...+||=--...|-|+|++-++.| |||-
T Consensus 113 I~~~PlHPvG~~FPGG~~V~~~~g~Y---YCPV 142 (163)
T PF09888_consen 113 ILKEPLHPVGMPFPGGFKVEEKNGNY---YCPV 142 (163)
T ss_pred HhCCCCCCCCCCCCCCeEEEEECCEE---eCcc
Confidence 34456677333478899999998877 8993
No 23
>PF10813 DUF2733: Protein of unknown function (DUF2733); InterPro: IPR024360 The UL11 gene product of herpes simplex virus is a membrane-associated tegument protein that is incorporated into the HSV virion and functions in viral envelopment. UL11 is acylated, which is crucial for lipid raft association.
Probab=30.92 E-value=25 Score=20.77 Aligned_cols=19 Identities=16% Similarity=0.464 Sum_probs=14.6
Q ss_pred CCCceecCCCCccccCCeE
Q 030834 28 SPDPVLDIAGKQLRAGSKY 46 (170)
Q Consensus 28 ~~~~VlD~~G~~l~~g~~Y 46 (170)
...|+.|.+|+++.--.+|
T Consensus 11 r~n~l~Dv~G~~Inl~~dF 29 (32)
T PF10813_consen 11 RHNPLKDVKGNPINLYKDF 29 (32)
T ss_pred cCCcccccCCCEEechhcc
Confidence 4579999999998765444
No 24
>COG3045 CreA Uncharacterized protein conserved in bacteria [Function unknown]
Probab=30.86 E-value=91 Score=24.89 Aligned_cols=24 Identities=17% Similarity=0.227 Sum_probs=19.2
Q ss_pred ceecCCCCccccCCeEEEEecccC
Q 030834 31 PVLDIAGKQLRAGSKYYILPVTKG 54 (170)
Q Consensus 31 ~VlD~~G~~l~~g~~YYI~p~~~~ 54 (170)
.|++.--+|.-.|.+-||--+.+|
T Consensus 42 IvveafdDP~V~gVTCyvs~a~~g 65 (165)
T COG3045 42 IVVEAFDDPDVKGVTCYVSRAKTG 65 (165)
T ss_pred EEEEecCCCCcCcEEEEEEEeccc
Confidence 566666778888999999988765
No 25
>COG5510 Predicted small secreted protein [Function unknown]
Probab=30.20 E-value=46 Score=20.99 Aligned_cols=16 Identities=38% Similarity=0.617 Sum_probs=8.8
Q ss_pred CCchhhHHHHHHHHHH
Q 030834 1 MRSTLVLTPLILLFAF 16 (170)
Q Consensus 1 MK~~~~l~l~fll~a~ 16 (170)
||-+.++.+++++.++
T Consensus 2 mk~t~l~i~~vll~s~ 17 (44)
T COG5510 2 MKKTILLIALVLLAST 17 (44)
T ss_pred chHHHHHHHHHHHHHH
Confidence 7776645444555444
No 26
>PRK01022 hypothetical protein; Provisional
Probab=27.45 E-value=87 Score=24.98 Aligned_cols=30 Identities=20% Similarity=0.250 Sum_probs=21.1
Q ss_pred eCCCCCCCCCCCCcccEEEEEeCCcceEEeccC
Q 030834 101 TGGIEGNLGPQTTINWFRIEKFYGDYKLVFCPL 133 (170)
Q Consensus 101 ~gg~~g~pg~~t~~~~FkIek~~~~YKLvfCp~ 133 (170)
+-+..+||=--...|-|+|++.++.| |||-
T Consensus 115 I~~~PlHPvG~~FPGG~~V~~~~g~y---YCPV 144 (167)
T PRK01022 115 ILKEPLHPVGTPFPGGFKVEEKNGVY---YCPV 144 (167)
T ss_pred HhCCCCCCCCCCCCCCeEEEeECCEE---eCcc
Confidence 33456677333477889999998876 7993
No 27
>PRK10159 outer membrane phosphoporin protein E; Provisional
Probab=27.20 E-value=48 Score=28.95 Aligned_cols=15 Identities=20% Similarity=0.257 Sum_probs=11.2
Q ss_pred CceecCCCCccccCC
Q 030834 30 DPVLDIAGKQLRAGS 44 (170)
Q Consensus 30 ~~VlD~~G~~l~~g~ 44 (170)
.+|+|.||.-|.-.+
T Consensus 22 ~~vy~~d~ssvtlyG 36 (351)
T PRK10159 22 AEVYNKDGNKLDVYG 36 (351)
T ss_pred EEEEECCCCEEEEEE
Confidence 589999997776543
No 28
>PF11153 DUF2931: Protein of unknown function (DUF2931); InterPro: IPR021326 Some members of this family of proteins are annotated as lipoproteins; however, this cannot be confirmed. Currently, there is no known function.
Probab=26.01 E-value=70 Score=25.88 Aligned_cols=17 Identities=35% Similarity=0.456 Sum_probs=10.0
Q ss_pred CCchhhHHHHHHHHHHH
Q 030834 1 MRSTLVLTPLILLFAFI 17 (170)
Q Consensus 1 MK~~~~l~l~fll~a~~ 17 (170)
||.+++|++++||.+-+
T Consensus 1 mk~i~~l~l~lll~~C~ 17 (216)
T PF11153_consen 1 MKKILLLLLLLLLTGCS 17 (216)
T ss_pred ChHHHHHHHHHHHHhhc
Confidence 89887555545544433
No 29
>PRK11289 ampC beta-lactamase/D-alanine carboxypeptidase; Provisional
Probab=25.75 E-value=46 Score=29.51 Aligned_cols=15 Identities=33% Similarity=0.441 Sum_probs=9.5
Q ss_pred CCchhhHHHHHHHHH
Q 030834 1 MRSTLVLTPLILLFA 15 (170)
Q Consensus 1 MK~~~~l~l~fll~a 15 (170)
||..++|+++||+++
T Consensus 2 ~~~~~~~~~~~~~~~ 16 (384)
T PRK11289 2 MKMMLLLLLAALLLT 16 (384)
T ss_pred cchhhHHHHHHHHHH
Confidence 887775655555554
No 30
>PF05550 Peptidase_C53: Pestivirus Npro endopeptidase C53; InterPro: IPR008751 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to MEROPS peptidase family C53 (clan C-). The active site residues occur in the order E, H, C in the sequence, which is unlike that in any other family. They are unique to pestiviruses. The N-terminal cysteine peptidase (Npro) encoded by the bovine viral diarrhoea virus genome is responsible for the self-cleavage that releases the N terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core.; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=24.74 E-value=56 Score=25.92 Aligned_cols=16 Identities=38% Similarity=0.526 Sum_probs=13.7
Q ss_pred CCCceecCCCCccccC
Q 030834 28 SPDPVLDIAGKQLRAG 43 (170)
Q Consensus 28 ~~~~VlD~~G~~l~~g 43 (170)
..|||+|..|+||.-.
T Consensus 20 v~EPVyd~~g~plfGe 35 (168)
T PF05550_consen 20 VEEPVYDSAGNPLFGE 35 (168)
T ss_pred cccccccCCCCCccCC
Confidence 4599999999999854
No 31
>PF11777 DUF3316: Protein of unknown function (DUF3316); InterPro: IPR016879 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=22.98 E-value=68 Score=23.56 Aligned_cols=14 Identities=29% Similarity=0.655 Sum_probs=7.9
Q ss_pred CCchhhHHHHHHHHH
Q 030834 1 MRSTLVLTPLILLFA 15 (170)
Q Consensus 1 MK~~~~l~l~fll~a 15 (170)
||.+++|+ +.||++
T Consensus 1 MKk~~ll~-~~ll~s 14 (114)
T PF11777_consen 1 MKKIILLA-SLLLLS 14 (114)
T ss_pred CchHHHHH-HHHHHH
Confidence 99887443 334443
No 32
>PF02950 Conotoxin: Conotoxin; InterPro: IPR004214 Cone snail toxins, conotoxins, are small neurotoxic peptides with disulphide connectivity that target ion-channels or G-protein coupled receptors. Based on the number and pattern of disulphide bonds and biological activities, conotoxins can be classified into several families. Omega, delta and kappa families of conotoxins have a knottin or inhibitor cysteine knot scaffold. The knottin scaffold is a very special disulphide-through-disulphide knot, in which the III-VI disulphide bond crosses the macrocycle formed by two other disulphide bonds (I-IV and II-V) and the interconnecting backbone segments, where I-VI indicates the six cysteine residues starting from the N terminus. The disulphide bonding network, as well as specific amino acids in inter-cysteine loops, provide the specificity of conotoxins. The cysteine arrangements are the same for omega, delta and kappa families, even though omega conotoxins are calcium channel blockers, whereas delta conotoxins delay the inactivation of sodium channels, and kappa conotoxins are potassium channel blockers. Mu conotoxins have two types of cysteine arrangements, but the knottin scaffold is not observed. Mu conotoxins target the voltage-gated sodium channels, and are useful probes for investigating voltage-dependent sodium channels of excitable tissues. Alpha conotoxins have two types of cysteine arrangements, and are competitive nicotinic acetylcholine receptor antagonists.; GO: 0008200 ion channel inhibitor activity, 0009405 pathogenesis, 0005576 extracellular region; PDB: 2EFZ_A 1FYG_A 1RMK_A 1DG0_A 1DFY_A 1DFZ_A 2JQC_A 2YYF_A 2JQB_A 1F3K_A ....
Probab=21.49 E-value=31 Score=22.99 Aligned_cols=7 Identities=57% Similarity=0.634 Sum_probs=0.0
Q ss_pred CCchhhH
Q 030834 1 MRSTLVL 7 (170)
Q Consensus 1 MK~~~~l 7 (170)
||-+.+|
T Consensus 1 mKLt~vl 7 (75)
T PF02950_consen 1 MKLTCVL 7 (75)
T ss_dssp -------
T ss_pred CCcchHH
Confidence 8988644
No 33
>PF11355 DUF3157: Protein of unknown function (DUF3157); InterPro: IPR021501 This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.
Probab=20.50 E-value=1.1e+02 Score=25.39 Aligned_cols=22 Identities=23% Similarity=0.530 Sum_probs=17.1
Q ss_pred CceecCCCCccccCCeE---EEEec
Q 030834 30 DPVLDIAGKQLRAGSKY---YILPV 51 (170)
Q Consensus 30 ~~VlD~~G~~l~~g~~Y---YI~p~ 51 (170)
..|-=.||..|+-..++ |+++-
T Consensus 22 ~~VTLedGrqV~LnDDFTWeYv~~~ 46 (199)
T PF11355_consen 22 ATVTLEDGRQVQLNDDFTWEYVIPE 46 (199)
T ss_pred cEEEecCCCEEEecCCceEEEEecc
Confidence 47888999999988665 66653
No 34
>KOG3352 consensus Cytochrome c oxidase, subunit Vb/COX4 [Energy production and conversion]
Probab=20.35 E-value=1.2e+02 Score=24.02 Aligned_cols=25 Identities=24% Similarity=0.338 Sum_probs=17.6
Q ss_pred cEEEEeCCCCCCCCCCCCcccEEEEEeCC
Q 030834 96 QWFVKTGGIEGNLGPQTTINWFRIEKFYG 124 (170)
Q Consensus 96 ~~~V~~gg~~g~pg~~t~~~~FkIek~~~ 124 (170)
.++|.-|..+++ +.-.||.|+|.+.
T Consensus 109 ~RiVGC~c~eD~----~~V~Wmwl~Kge~ 133 (153)
T KOG3352|consen 109 KRIVGCGCEEDS----HAVVWMWLEKGET 133 (153)
T ss_pred ceEEeecccCCC----cceEEEEEEcCCc
Confidence 567888666652 3357999999864
No 35
>PF12071 DUF3551: Protein of unknown function (DUF3551); InterPro: IPR021937 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 79 and 104 amino acids in length. This protein has a single completely conserved residue C that may be functionally important.
Probab=20.24 E-value=1.3e+02 Score=21.19 Aligned_cols=7 Identities=43% Similarity=0.577 Sum_probs=4.6
Q ss_pred CCchhhH
Q 030834 1 MRSTLVL 7 (170)
Q Consensus 1 MK~~~~l 7 (170)
||..++.
T Consensus 1 MR~~~~a 7 (82)
T PF12071_consen 1 MRRLLLA 7 (82)
T ss_pred ChhHHHH
Confidence 8877633
Done!