Query         psy11287
Match_columns 112
No_of_seqs    137 out of 1085
Neff          6.5 
Searched_HMMs 46136
Date          Fri Aug 16 22:57:30 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy11287.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/11287hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd01328 FSL_SPARC Follistatin-  99.8 9.2E-20   2E-24  119.7   6.1   69   22-90      1-85  (86)
  2 smart00280 KAZAL Kazal type se  99.6 1.6E-16 3.6E-21   92.5   4.2   44   45-88      2-46  (46)
  3 cd01327 KAZAL_PSTI Kazal-type   99.6 2.9E-16 6.3E-21   91.7   4.2   42   47-88      4-45  (45)
  4 PF00050 Kazal_1:  Kazal-type s  99.5 2.7E-14 5.8E-19   84.2   4.1   41   48-88      8-48  (48)
  5 PF07648 Kazal_2:  Kazal-type s  99.5 3.3E-14 7.2E-19   80.5   3.5   41   48-88      1-42  (42)
  6 cd00104 KAZAL_FS Kazal type se  99.4 9.5E-14 2.1E-18   78.6   3.7   41   48-88      1-41  (41)
  7 KOG4004|consensus               99.0 1.1E-10 2.4E-15   87.5   2.6   85   14-98     44-140 (259)
  8 KOG3555|consensus               98.6   2E-08 4.3E-13   80.4   2.1   72   19-90     86-185 (434)
  9 smart00057 FIMAC factor I memb  98.4 5.4E-07 1.2E-11   57.0   5.3   63   23-90      2-69  (69)
 10 KOG4578|consensus               98.3 4.6E-07 9.9E-12   72.4   3.1   43   47-90     37-79  (421)
 11 cd01330 KAZAL_SLC21 The kazal-  97.3 0.00017 3.7E-09   43.5   1.9   26   47-72     10-36  (54)
 12 KOG4597|consensus               96.0  0.0065 1.4E-07   50.9   3.5   64   20-84    103-168 (560)
 13 smart00274 FOLN Follistatin-N-  95.7  0.0085 1.8E-07   30.9   1.8   23   22-44      1-24  (26)
 14 PF09289 FOLN:  Follistatin/Ost  94.9   0.012 2.7E-07   29.3   0.9   21   23-43      1-22  (22)
 15 TIGR00805 oat sodium-independe  92.7   0.051 1.1E-06   46.4   1.2   29   42-70    448-478 (633)
 16 cd01327 KAZAL_PSTI Kazal-type   89.2    0.17 3.6E-06   29.0   0.7   22    9-30      5-26  (45)
 17 cd01328 FSL_SPARC Follistatin-  87.4    0.41 8.9E-06   31.4   1.8   26   87-112     6-31  (86)
 18 PF00008 EGF:  EGF-like domain   86.3    0.48   1E-05   25.0   1.4   24   23-46      1-25  (32)
 19 KOG1219|consensus               86.0    0.73 1.6E-05   45.8   3.2   35   12-46   3855-3891(4289)
 20 KOG3626|consensus               79.7       1 2.2E-05   39.8   1.6   24   47-70    519-543 (735)
 21 KOG4289|consensus               75.7     1.4   3E-05   42.2   1.2   70   20-109  1239-1308(2531)
 22 cd00053 EGF Epidermal growth f  69.5     6.4 0.00014   19.5   2.5   22   25-46      5-26  (36)
 23 smart00179 EGF_CA Calcium-bind  68.3     6.7 0.00015   20.2   2.5   26   21-46      3-29  (39)
 24 smart00181 EGF Epidermal growt  58.7      12 0.00027   19.0   2.3   23   23-46      2-25  (35)
 25 KOG3509|consensus               57.0      15 0.00032   33.7   3.9   79   12-90    163-262 (964)
 26 PF00954 S_locus_glycop:  S-loc  56.2     8.7 0.00019   25.3   1.8   24   86-111    84-107 (110)
 27 cd00054 EGF_CA Calcium-binding  54.5      17 0.00036   18.2   2.5   26   21-46      3-29  (38)
 28 PF09064 Tme5_EGF_like:  Thromb  51.6     7.7 0.00017   21.2   0.8   18   92-111    10-27  (34)
 29 PF07974 EGF_2:  EGF-like domai  37.1      25 0.00055   18.5   1.4   22   87-111     7-28  (32)
 30 PF12714 TILa:  TILa domain      36.3      22 0.00049   20.9   1.2   22   22-43     34-55  (56)
 31 KOG4004|consensus               35.2      22 0.00048   27.3   1.3   25   87-111    57-82  (259)
 32 PF12947 EGF_3:  EGF domain;  I  35.0      11 0.00023   20.5  -0.3   25   85-110     5-29  (36)
 33 PF12662 cEGF:  Complement Clr-  27.9      30 0.00065   17.2   0.7   10  102-111     2-11  (24)
 34 PF12661 hEGF:  Human growth fa  26.8      18 0.00038   15.4  -0.3    8  103-110     1-8   (13)
 35 PF11403 Yeast_MT:  Yeast metal  26.7      55  0.0012   18.0   1.6   19   94-112    14-32  (40)
 36 PF12946 EGF_MSP1_1:  MSP1 EGF   26.3      45 0.00098   18.5   1.3   24   23-46      2-26  (37)
 37 KOG3516|consensus               24.3 1.4E+02  0.0031   28.4   4.7   72    9-83    534-605 (1306)
 38 PF14380 WAK_assoc:  Wall-assoc  23.8      93   0.002   19.9   2.6   37   73-109    51-93  (94)
 39 PF07645 EGF_CA:  Calcium-bindi  21.1 1.1E+02  0.0024   16.5   2.3   21   26-46     10-30  (42)

No 1  
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=99.80  E-value=9.2e-20  Score=119.74  Aligned_cols=69  Identities=26%  Similarity=0.735  Sum_probs=62.2

Q ss_pred             CCCCCCCCCCCccccc-CCeeeeeCC-CCCCCC---CCeEcccCcEechhhHHHHHHhhCC-----------CCceEeee
Q psy11287         22 PCSSNPCRNDGHCVVK-NGKAVCKCP-SCSAEY---NPVCGSDGISYENECKLNLEACQHS-----------RQISVLYI   85 (112)
Q Consensus        22 ~C~~~~C~~g~~C~~~-~~~~~C~C~-~C~~~~---~pVCGSDG~TY~n~C~l~~~~C~~~-----------~~i~i~~~   85 (112)
                      ||+++.|++|++|+++ +++|+|+|. .||.++   .+||||||+||.|+|+|++++|..+           .+|+|.|+
T Consensus         1 pC~~v~C~~G~~C~~d~~~~p~CvC~~~Cp~~~~~~~~VCGsDg~TY~s~C~L~r~~C~~~~~~~~~~~~~~~~l~i~Y~   80 (86)
T cd01328           1 PCENHHCGAGKVCEVDDENTPKCVCIDPCPEEVDDRRKVCTNDNETFDSDCELYRTRCLCKGGKKGCRGPKYQHLHLDYY   80 (86)
T ss_pred             CCCCcCCCCCCEeeECCCCCeEEecCCcCCCCCCCCCceeCCCCCCcccHhHHhhhHhccCCCCcCCCCCccCeEEEEee
Confidence            6899999999999996 789999998 899865   5699999999999999999999732           48999999


Q ss_pred             cCCCC
Q psy11287         86 GLCSK   90 (112)
Q Consensus        86 G~C~~   90 (112)
                      |+|+.
T Consensus        81 G~Ck~   85 (86)
T cd01328          81 GECKE   85 (86)
T ss_pred             ccccC
Confidence            99974


No 2  
>smart00280 KAZAL Kazal type serine protease inhibitors. Kazal type serine protease inhibitors and follistatin-like domains.
Probab=99.65  E-value=1.6e-16  Score=92.48  Aligned_cols=44  Identities=52%  Similarity=1.164  Sum_probs=42.0

Q ss_pred             CC-CCCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEeeecCC
Q psy11287         45 CP-SCSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLYIGLC   88 (112)
Q Consensus        45 C~-~C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C   88 (112)
                      |+ .||.++.|||||||+||.|+|+|++++|..+..|+++|.|+|
T Consensus         2 C~~~C~~~~~pVCgsdg~TY~N~C~l~~~~c~~~~~i~~~~~G~C   46 (46)
T smart00280        2 CPEACPREYDPVCGSDGVTYSNECHLCKAACESGKSIEVKHDGPC   46 (46)
T ss_pred             CCCCCCCCCCccCCCCCCEeCCHhHHHHHHhcCCCCeEEeecCCC
Confidence            66 799999999999999999999999999999999999999987


No 3  
>cd01327 KAZAL_PSTI Kazal-type pancreatic secretory trypsin inhibitors (PSTI) and related proteins, including the second domain of the ovomucoid turkey inhibitor and the C-terminal domain of the esophagus cancer-related gene-2 protein (ECRG-2), are members of the superfamily of kazal-type proteinase inhibitors and follistatin-like proteins.
Probab=99.63  E-value=2.9e-16  Score=91.66  Aligned_cols=42  Identities=40%  Similarity=0.917  Sum_probs=40.4

Q ss_pred             CCCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEeeecCC
Q psy11287         47 SCSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLYIGLC   88 (112)
Q Consensus        47 ~C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C   88 (112)
                      .||..+.|||||||+||.|+|+|+.++|..+..|.++|+|+|
T Consensus         4 ~Cp~~~~PVCGsDg~TY~N~C~l~~~~c~~~~~i~~~~~G~C   45 (45)
T cd01327           4 GCPKDYDPVCGTDGVTYSNECLLCAENLKRQTNIRIKHDGEC   45 (45)
T ss_pred             CCCCCCCceeCCCCCEeCCHhHHHHHHhccCCCeEEeecCCC
Confidence            589999999999999999999999999999999999999987


No 4  
>PF00050 Kazal_1:  Kazal-type serine protease inhibitor domain;  InterPro: IPR002350 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.  This family of Kazal inhibitors belongs to MEROPS inhibitor family I1, clan IA. They inhibit serine peptidases of the S1 family (IPR001254 from INTERPRO). The members are primarily metazoan, but include exceptions in the alveolata (apicomplexa), stramenopiles, higher plants and bacteria. Kazal inhibitors, which inhibit a number of serine proteases (such as trypsin and elastase), belong to a family of proteins that includes pancreatic secretory trypsin inhibitor; avian ovomucoid; acrosin inhibitor; and elastase inhibitor. These proteins contain between 1 and 7 Kazal-type inhibitor repeats. The structure of the Kazal repeat includes a large quantity of extended chain, 2 short alpha-helices and a 3-stranded anti-parallel beta sheet. The inhibitor makes 11 contacts with its enzyme substrate: unusually, 8 of these important residues are hypervariable. Altering the enzyme-contact residues, and especially that of the active site bond, affects the strength of inhibition and specificity of the inhibitor for particular serine proteases.
The presence of this Pfam domain is usually indicative of serine protease inhibitors; however, Kazal-like domains are also seen in the extracellular part of agrins, which are not known to be proteinase inhibitors.; GO: 0005515 protein binding; PDB: 1HPT_A 1CGI_I 1CGJ_I 2P6A_D 3HH2_C 2B0U_D 2BUS_A 1BUS_A 1OVO_A 3OVO_A ....
Probab=99.50  E-value=2.7e-14  Score=84.22  Aligned_cols=41  Identities=44%  Similarity=0.969  Sum_probs=35.4

Q ss_pred             CCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEeeecCC
Q psy11287         48 CSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLYIGLC   88 (112)
Q Consensus        48 C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C   88 (112)
                      |+.++.|||||||+||.|+|+|..+.-+.+..|.++|+|+|
T Consensus         8 C~~~~~PVCGsdg~TY~N~C~lC~~~~~~~~~i~~~~~G~C   48 (48)
T PF00050_consen    8 CPREYDPVCGSDGVTYSNECELCAANRRSGKNIRIAHDGPC   48 (48)
T ss_dssp             EESSCS-EEETTSEEESSHHHHHHHHHHTTTSS-EEEESS-
T ss_pred             CCCCCCceECCCCCCccccchhhhhhcccCCCeEEeecCCC
Confidence            88899999999999999999997777788999999999997


No 5  
>PF07648 Kazal_2:  Kazal-type serine protease inhibitor domain;  InterPro: IPR011497 This domain is usually indicative of serine protease inhibitors that belong to Merops inhibitor families: I1, I2, I17 and I31. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays and have a central alpha-helix, a short two-stranded antiparallel beta-sheet and several disulphide bonds. The amino terminal segment of this domain binds to the active site of its target proteases, thus inhibiting their function.; PDB: 2KMP_A 1LDT_L 2KMR_A 2KMO_A 2KMQ_A 1AN1_I 4E0S_B 3T5O_A 4A5W_B 1LR7_A ....
Probab=99.48  E-value=3.3e-14  Score=80.46  Aligned_cols=41  Identities=46%  Similarity=1.057  Sum_probs=37.9

Q ss_pred             CCCCCC-CeEcccCcEechhhHHHHHHhhCCCCceEeeecCC
Q psy11287         48 CSAEYN-PVCGSDGISYENECKLNLEACQHSRQISVLYIGLC   88 (112)
Q Consensus        48 C~~~~~-pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C   88 (112)
                      ||..+. ||||+||+||.|+|+|++++|..+..++++|.|+|
T Consensus         1 C~~~~~~PVCg~dg~ty~n~C~l~~~~c~~~~~~~~~~~g~C   42 (42)
T PF07648_consen    1 CPREYSSPVCGSDGKTYSNECELRCANCRTNSKIQIVHDGSC   42 (42)
T ss_dssp             EESSSSSTEEETTSEEESSHHHHHHHHHHHTTTCEEEEESSS
T ss_pred             CcCCCCCCEEcCCCCEEhhHHHHHHHHHhcCCCeEEEeCCCC
Confidence            566666 99999999999999999999999999999999987


No 6  
>cd00104 KAZAL_FS Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor.
Probab=99.44  E-value=9.5e-14  Score=78.63  Aligned_cols=41  Identities=54%  Similarity=1.108  Sum_probs=38.8

Q ss_pred             CCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEeeecCC
Q psy11287         48 CSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLYIGLC   88 (112)
Q Consensus        48 C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C   88 (112)
                      ||.++.||||+||+||.|+|+|..++|..+..|.+.|.|+|
T Consensus         1 C~~~~~pVCgsdg~tY~n~C~~~~~~c~~~~~~~~~~~g~C   41 (41)
T cd00104           1 CPKEYDPVCGSDGKTYSNECHLGCAACRSGRSITVAHNGPC   41 (41)
T ss_pred             CCCCCCccCCCCCCEECCHhhhhHHhhcCCCCeEEEeccCC
Confidence            78888999999999999999999999998889999999987


No 7  
>KOG4004|consensus
Probab=99.03  E-value=1.1e-10  Score=87.50  Aligned_cols=85  Identities=19%  Similarity=0.544  Sum_probs=71.8

Q ss_pred             cCCCCCCCCCCCCCCCCCCccccc-CCeeeeeCC--CCCC----CCCCeEcccCcEechhhHHHHHHhhCC-----CCce
Q psy11287         14 VGYLGETGPCSSNPCRNDGHCVVK-NGKAVCKCP--SCSA----EYNPVCGSDGISYENECKLNLEACQHS-----RQIS   81 (112)
Q Consensus        14 ~g~C~~~~~C~~~~C~~g~~C~~~-~~~~~C~C~--~C~~----~~~pVCGSDG~TY~n~C~l~~~~C~~~-----~~i~   81 (112)
                      .+.=...+||++.+|++|..|+++ .++|+|+|.  .||+    .+..|||.|++||.+.|+|.+.+|...     ..++
T Consensus        44 ~~ae~a~npC~dh~Cg~gk~C~vd~~~~P~Cvc~~~kCP~~~~~p~~KVC~nnNqTf~S~C~~~~~~C~~~~~~k~~k~h  123 (259)
T KOG4004|consen   44 IAAEEARNPCADHKCGPGKNCLVDLQTQPRCVCCRYKCPRKQQRPVHKVCGNNNQTFNSWCEMHKHSCESRYFIKVHKLH  123 (259)
T ss_pred             hccccccCccccccCCCCceeeecCCCCceeEEecCCCCcccCCchhhhhcCCCcchhHHHHHHHhhhhhhcccccceee
Confidence            344446789999999999999997 679999985  7998    356799999999999999999999764     4567


Q ss_pred             EeeecCCCCCCccccCC
Q psy11287         82 VLYIGLCSKGLLREIDK   98 (112)
Q Consensus        82 i~~~G~C~~~~~C~~d~   98 (112)
                      +.|.|+|...+.|....
T Consensus       124 l~y~G~Ck~i~~c~d~~  140 (259)
T KOG4004|consen  124 LKYQGSCKYIPPCLDSE  140 (259)
T ss_pred             hhcccccccCCchhHHH
Confidence            79999999999887654


No 8  
>KOG3555|consensus
Probab=98.59  E-value=2e-08  Score=80.44  Aligned_cols=72  Identities=35%  Similarity=0.766  Sum_probs=61.3

Q ss_pred             CCCCCCCCCCCCCCccccc-CCeeee--------------------------eCCCCCCC-CCCeEcccCcEechhhHHH
Q psy11287         19 ETGPCSSNPCRNDGHCVVK-NGKAVC--------------------------KCPSCSAE-YNPVCGSDGISYENECKLN   70 (112)
Q Consensus        19 ~~~~C~~~~C~~g~~C~~~-~~~~~C--------------------------~C~~C~~~-~~pVCGSDG~TY~n~C~l~   70 (112)
                      +.+||..++|.+.++|+.. ...+.|                          .|..||.. ..+||||||+||.+.|.|+
T Consensus        86 ~kdpc~kvkcs~hkvci~Qd~Q~A~cis~k~l~~r~k~a~v~~~q~~d~~l~~CKpCpvaq~a~vCGsDghtYss~ckLe  165 (434)
T KOG3555|consen   86 IKDPCLKVKCSRHKVCIAQDYQTAGCISRKQLQHRQKAAGVSVIQWDDPELDNCKPCPVAQPAFVCGSDGHTYSSRCKLE  165 (434)
T ss_pred             ccChHhhhcccccceeeccccchhhhHHHHHHhhhccCCCcceecccCcccccCccCCcccccceecCCCCeehhhhhHH
Confidence            4679999999999999874 233444                          36779975 5689999999999999999


Q ss_pred             HHHhhCCCCceEeeecCCCC
Q psy11287         71 LEACQHSRQISVLYIGLCSK   90 (112)
Q Consensus        71 ~~~C~~~~~i~i~~~G~C~~   90 (112)
                      ..+|.....|.+...|+|..
T Consensus       166 ~~aC~~sksiav~c~g~cpc  185 (434)
T KOG3555|consen  166 YHACHVSKSIAVICEGPCPC  185 (434)
T ss_pred             HHhhhhhhhhhhhhCCCCCC
Confidence            99999999999999999975


No 9  
>smart00057 FIMAC factor I membrane attack complex.
Probab=98.44  E-value=5.4e-07  Score=56.95  Aligned_cols=63  Identities=29%  Similarity=0.590  Sum_probs=54.8

Q ss_pred             CCCCCCCCCCcccccCCeeeeeCC---CCCCCCCCeEcccCcEe--chhhHHHHHHhhCCCCceEeeecCCCC
Q psy11287         23 CSSNPCRNDGHCVVKNGKAVCKCP---SCSAEYNPVCGSDGISY--ENECKLNLEACQHSRQISVLYIGLCSK   90 (112)
Q Consensus        23 C~~~~C~~g~~C~~~~~~~~C~C~---~C~~~~~pVCGSDG~TY--~n~C~l~~~~C~~~~~i~i~~~G~C~~   90 (112)
                      |.-..|.+|..|+.    .+|+|.   .||.....||..|+..|  .|+|+|....|.. ..+.+.|.|+|..
T Consensus         2 c~c~~C~pWekc~~----~~CvCk~P~qC~~~~~~vCv~~~~~~~t~S~C~~~a~~C~g-~~~~~~~~g~C~~   69 (69)
T smart00057        2 CAKVFCQPWQKCSA----STCVCKLPYECPKAGTDVCVEDGRSEKTLTYCKQKSLECLN-PKYKFLHIGSCTA   69 (69)
T ss_pred             CcCccCCCcccccC----CeeEeCCHhHCCCCCCCeeEecCceeeeecHHHHHHHHhcC-CCcEEeccCCCCC
Confidence            45568999999988    789984   79997678999999999  9999999999996 5799999999963


No 10 
>KOG4578|consensus
Probab=98.30  E-value=4.6e-07  Score=72.41  Aligned_cols=43  Identities=37%  Similarity=0.850  Sum_probs=39.3

Q ss_pred             CCCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEeeecCCCC
Q psy11287         47 SCSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLYIGLCSK   90 (112)
Q Consensus        47 ~C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C~~   90 (112)
                      .|.....|||||||+||.++|+|.++.|... .+.++|+|.|..
T Consensus        37 ~C~~~~kPvCasDGrty~srCe~qRAkC~dp-ql~~~yrG~Ck~   79 (421)
T KOG4578|consen   37 ECDDNEKPVCASDGRTYPSRCELQRAKCGDP-QLSLKYRGSCKA   79 (421)
T ss_pred             ccCCCCCCccccCCccchhHHHHHHhhcCCC-ceeEEecCcHHH
Confidence            6777789999999999999999999999876 599999999975


No 11 
>cd01330 KAZAL_SLC21 The kazal-type serine protease inhibitor domain has been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The KAZAL_SLC21 domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=97.27  E-value=0.00017  Score=43.50  Aligned_cols=26  Identities=46%  Similarity=1.053  Sum_probs=21.6

Q ss_pred             CCCC-CCCCeEcccCcEechhhHHHHH
Q psy11287         47 SCSA-EYNPVCGSDGISYENECKLNLE   72 (112)
Q Consensus        47 ~C~~-~~~pVCGSDG~TY~n~C~l~~~   72 (112)
                      .|+. .+.||||+||+||-|.|+..-.
T Consensus        10 ~C~~~~~~PVCg~~g~tY~SpC~AGC~   36 (54)
T cd01330          10 SCSESAYSPVCGENGITYFSPCHAGCT   36 (54)
T ss_pred             CCCCCCcCCccCCCCCEEcCHhHcCCc
Confidence            4665 5889999999999999997643


No 12 
>KOG4597|consensus
Probab=96.05  E-value=0.0065  Score=50.93  Aligned_cols=64  Identities=31%  Similarity=0.790  Sum_probs=57.0

Q ss_pred             CCCCCCCCCC-CCCcccccCCeeeeeCC-CCCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEee
Q psy11287         20 TGPCSSNPCR-NDGHCVVKNGKAVCKCP-SCSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLY   84 (112)
Q Consensus        20 ~~~C~~~~C~-~g~~C~~~~~~~~C~C~-~C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~   84 (112)
                      ...|+...|. .+.+|.+.++++.|.|. .|..+...-|.+||.+|.| |.|...+|.++..++|.+
T Consensus       103 ea~~~~~~~~qq~s~~dif~~~~r~~~~~~~~~eP~~~~~d~~~k~~n-~t~cs~aCgKG~q~~iv~  168 (560)
T KOG4597|consen  103 EALCAQFPCSQQGSVCDIFDGQPRCTCIDRCEKEPSFTCADDGLKYYN-CTMCSEACGKGVQLRIVY  168 (560)
T ss_pred             cchhccCccccccccccccCCCCCcccccccccCCchhhhhcCceecc-eEehhhhhcCCceeeeEE
Confidence            3468888886 58999999999999998 7888899999999999999 999999999988777764


No 13 
>smart00274 FOLN Follistatin-N-terminal domain-like. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence
Probab=95.68  E-value=0.0085  Score=30.95  Aligned_cols=23  Identities=35%  Similarity=0.746  Sum_probs=20.3

Q ss_pred             CCCCCCCCCCCccccc-CCeeeee
Q psy11287         22 PCSSNPCRNDGHCVVK-NGKAVCK   44 (112)
Q Consensus        22 ~C~~~~C~~g~~C~~~-~~~~~C~   44 (112)
                      +|.++.|++|++|+++ .+.|+|+
T Consensus         1 ~C~~v~C~~G~~C~~d~~g~p~Cv   24 (26)
T smart00274        1 SCRNVQCPFGKVCVVDKGGNARCV   24 (26)
T ss_pred             CCCCEECCCCCEEEeCCCCCEEEe
Confidence            5899999999999995 6889885


No 14 
>PF09289 FOLN:  Follistatin/Osteonectin-like EGF domain;  InterPro: IPR015369 This domain is predominantly found in osteonectin and follistatin. They adopt an EGF-like structure. Follistatin is involved in diverse activities from embryonic development to cell secretion.; GO: 0005515 protein binding; PDB: 1LR7_A 1LR8_A 1LR9_A 2ARP_F 3B4V_H 2KCX_A 3SEK_C 2P6A_D 3HH2_C 2B0U_D ....
Probab=94.90  E-value=0.012  Score=29.26  Aligned_cols=21  Identities=33%  Similarity=0.860  Sum_probs=15.5

Q ss_pred             CCCCCCCCCCccccc-CCeeee
Q psy11287         23 CSSNPCRNDGHCVVK-NGKAVC   43 (112)
Q Consensus        23 C~~~~C~~g~~C~~~-~~~~~C   43 (112)
                      |.++.|++|.+|.++ .++|.|
T Consensus         1 C~n~~Ck~GKvC~~d~~~~P~C   22 (22)
T PF09289_consen    1 CDNFHCKRGKVCKVDEQGKPHC   22 (22)
T ss_dssp             STT---BTTEEEEEETTTCEEE
T ss_pred             CCCcccCCCCEeeeCCCCCcCC
Confidence            678999999999995 688877


No 15 
>TIGR00805 oat sodium-independent organic anion transporter. Proteins of the OAT family catalyze the Na+-independent facilitated transport of organic anions such as bromosulfobromophthalein and prostaglandins as well as conjugated and unconjugated bile acids (taurocholate and cholate, respectively). These transporters have been characterized in mammals, but homologues are present in C. elegans and A. thaliana. Some of the mammalian proteins exhibit a high degree of tissue specificity. For example, the rat OAT is found at high levels in liver and kidney and at lower levels in other tissues. These proteins possess 10-12 putative alpha-helical transmembrane spanners. They may catalyze electrogenic anion uniport or anion exchange.
Probab=92.71  E-value=0.051  Score=46.45  Aligned_cols=29  Identities=34%  Similarity=0.973  Sum_probs=22.6

Q ss_pred             eeeCC-CCCC-CCCCeEcccCcEechhhHHH
Q psy11287         42 VCKCP-SCSA-EYNPVCGSDGISYENECKLN   70 (112)
Q Consensus        42 ~C~C~-~C~~-~~~pVCGSDG~TY~n~C~l~   70 (112)
                      .|.=. .|+. .++||||.||.||-+-|+--
T Consensus       448 ~Cn~~C~C~~~~~~PVCg~~~~tY~SpC~AG  478 (633)
T TIGR00805       448 DCNRQCSCDSSFFDPVCGDNGLAYLSPCHAG  478 (633)
T ss_pred             ccCCCCCCCCCCcccccCCCCCEEECccccC
Confidence            45433 5664 69999999999999999854


No 16 
>cd01327 KAZAL_PSTI Kazal-type pancreatic secretory trypsin inhibitors (PSTI) and related proteins, including the second domain of the ovomucoid turkey inhibitor and the C-terminal domain of the esophagus cancer-related gene-2 protein (ECRG-2), are members of the superfamily of kazal-type proteinase inhibitors and follistatin-like proteins.
Probab=89.19  E-value=0.17  Score=29.03  Aligned_cols=22  Identities=27%  Similarity=0.185  Sum_probs=19.3

Q ss_pred             ccccccCCCCCCCCCCCCCCCC
Q psy11287          9 FQFDSVGYLGETGPCSSNPCRN   30 (112)
Q Consensus         9 ~~v~~~g~C~~~~~C~~~~C~~   30 (112)
                      -+..|.|+||+++.+|.+.|..
T Consensus         5 Cp~~~~PVCGsDg~TY~N~C~l   26 (45)
T cd01327           5 CPKDYDPVCGTDGVTYSNECLL   26 (45)
T ss_pred             CCCCCCceeCCCCCEeCCHhHH
Confidence            3567899999999999999976


No 17 
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=87.41  E-value=0.41  Score=31.38  Aligned_cols=26  Identities=15%  Similarity=0.240  Sum_probs=21.9

Q ss_pred             CCCCCCccccCCCcccccccCCCCCC
Q psy11287         87 LCSKGLLREIDKARQENEIPPGKLQS  112 (112)
Q Consensus        87 ~C~~~~~C~~d~~~~~~C~cp~~~~~  112 (112)
                      .|..+.+|..|+++.++|||+..+++
T Consensus         6 ~C~~G~~C~~d~~~~p~CvC~~~Cp~   31 (86)
T cd01328           6 HCGAGKVCEVDDENTPKCVCIDPCPE   31 (86)
T ss_pred             CCCCCCEeeECCCCCeEEecCCcCCC
Confidence            46778999999999999999987763


No 18 
>PF00008 EGF:  EGF-like domain. This is a sub-family of the Pfam entry;  InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=86.29  E-value=0.48  Score=25.02  Aligned_cols=24  Identities=54%  Similarity=1.308  Sum_probs=21.0

Q ss_pred             CCCCCCCCCCcccccC-CeeeeeCC
Q psy11287         23 CSSNPCRNDGHCVVKN-GKAVCKCP   46 (112)
Q Consensus        23 C~~~~C~~g~~C~~~~-~~~~C~C~   46 (112)
                      |..+.|..++.|+... +...|.|+
T Consensus         1 C~~~~C~n~g~C~~~~~~~y~C~C~   25 (32)
T PF00008_consen    1 CSSNPCQNGGTCIDLPGGGYTCECP   25 (32)
T ss_dssp             TTTTSSTTTEEEEEESTSEEEEEEB
T ss_pred             CCCCcCCCCeEEEeCCCCCEEeECC
Confidence            5677999999999987 89999986


No 19 
>KOG1219|consensus
Probab=85.97  E-value=0.73  Score=45.78  Aligned_cols=35  Identities=34%  Similarity=0.853  Sum_probs=27.7

Q ss_pred             cccCCCCCC-CCCCCCCCCCCCccccc-CCeeeeeCC
Q psy11287         12 DSVGYLGET-GPCSSNPCRNDGHCVVK-NGKAVCKCP   46 (112)
Q Consensus        12 ~~~g~C~~~-~~C~~~~C~~g~~C~~~-~~~~~C~C~   46 (112)
                      ...+-|-.. ++|..++|.+|+.|... .+.-.|.||
T Consensus      3855 ~l~pgC~l~~d~C~~npCqhgG~C~~~~~ggy~CkCp 3891 (4289)
T KOG1219|consen 3855 GLQPGCSLLTDPCNDNPCQHGGTCISQPKGGYKCKCP 3891 (4289)
T ss_pred             cccccccccccccccCcccCCCEecCCCCCceEEeCc
Confidence            345556443 89999999999999986 577889986


No 20 
>KOG3626|consensus
Probab=79.74  E-value=1  Score=39.79  Aligned_cols=24  Identities=50%  Similarity=1.204  Sum_probs=20.3

Q ss_pred             CCC-CCCCCeEcccCcEechhhHHH
Q psy11287         47 SCS-AEYNPVCGSDGISYENECKLN   70 (112)
Q Consensus        47 ~C~-~~~~pVCGSDG~TY~n~C~l~   70 (112)
                      .|+ ..++||||.||.||-+.|+--
T Consensus       519 ~C~~~~~~PVCg~~G~tY~SpChAG  543 (735)
T KOG3626|consen  519 SCDTSEYEPVCGENGITYFSPCHAG  543 (735)
T ss_pred             CCCCcCcCcccCCCCCEEeChhhhC
Confidence            455 478999999999999999854


No 21 
>KOG4289|consensus
Probab=75.68  E-value=1.4  Score=42.17  Aligned_cols=70  Identities=24%  Similarity=0.418  Sum_probs=50.0

Q ss_pred             CCCCCCCCCCCCCcccccCCeeeeeCCCCCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEeeecCCCCCCccccCCC
Q psy11287         20 TGPCSSNPCRNDGHCVVKNGKAVCKCPSCSAEYNPVCGSDGISYENECKLNLEACQHSRQISVLYIGLCSKGLLREIDKA   99 (112)
Q Consensus        20 ~~~C~~~~C~~g~~C~~~~~~~~C~C~~C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~~~G~C~~~~~C~~d~~   99 (112)
                      -|-||..+|+.++.|....+.-.|+|..         |=.    .-.|++...+      .+- -.|.|..++.|..+.+
T Consensus      1239 iDlCYs~pC~nng~C~srEggYtCeCrp---------g~t----GehCEvs~~a------grC-vpGvC~nggtC~~~~n 1298 (2531)
T KOG4289|consen 1239 IDLCYSGPCGNNGRCRSREGGYTCECRP---------GFT----GEHCEVSARA------GRC-VPGVCKNGGTCVNLLN 1298 (2531)
T ss_pred             hHhhhcCCCCCCCceEEecCceeEEecC---------Ccc----ccceeeeccc------Ccc-ccceecCCCEEeecCC
Confidence            5689999999999999999999999951         001    1123322211      111 2367788999999999


Q ss_pred             cccccccCCC
Q psy11287        100 RQENEIPPGK  109 (112)
Q Consensus       100 ~~~~C~cp~~  109 (112)
                      +.-.|+||.+
T Consensus      1299 ggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1299 GGFCCHCPYG 1308 (2531)
T ss_pred             CceeccCCCc
Confidence            9999999975


No 22 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF), present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=69.48  E-value=6.4  Score=19.51  Aligned_cols=22  Identities=55%  Similarity=1.186  Sum_probs=17.7

Q ss_pred             CCCCCCCCcccccCCeeeeeCC
Q psy11287         25 SNPCRNDGHCVVKNGKAVCKCP   46 (112)
Q Consensus        25 ~~~C~~g~~C~~~~~~~~C~C~   46 (112)
                      ...|..++.|+...+...|.|+
T Consensus         5 ~~~C~~~~~C~~~~~~~~C~C~   26 (36)
T cd00053           5 SNPCSNGGTCVNTPGSYRCVCP   26 (36)
T ss_pred             CCCCCCCCEEecCCCCeEeECC
Confidence            5567778899988778899886


No 23 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=68.27  E-value=6.7  Score=20.23  Aligned_cols=26  Identities=50%  Similarity=1.202  Sum_probs=20.1

Q ss_pred             CCCCC-CCCCCCCcccccCCeeeeeCC
Q psy11287         21 GPCSS-NPCRNDGHCVVKNGKAVCKCP   46 (112)
Q Consensus        21 ~~C~~-~~C~~g~~C~~~~~~~~C~C~   46 (112)
                      +.|.. ..|..++.|+...+...|.|+
T Consensus         3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~   29 (39)
T smart00179        3 DECASGNPCQNGGTCVNTVGSYRCECP   29 (39)
T ss_pred             ccCcCCCCcCCCCEeECCCCCeEeECC
Confidence            45555 678888899988788889875


No 24 
>smart00181 EGF Epidermal growth factor-like domain.
Probab=58.66  E-value=12  Score=18.97  Aligned_cols=23  Identities=43%  Similarity=1.223  Sum_probs=17.9

Q ss_pred             CCC-CCCCCCCcccccCCeeeeeCC
Q psy11287         23 CSS-NPCRNDGHCVVKNGKAVCKCP   46 (112)
Q Consensus        23 C~~-~~C~~g~~C~~~~~~~~C~C~   46 (112)
                      |.. ..|..+ .|+...+...|.|+
T Consensus         2 C~~~~~C~~~-~C~~~~~~~~C~C~   25 (35)
T smart00181        2 CASGGPCSNG-TCINTPGSYTCSCP   25 (35)
T ss_pred             CCCcCCCCCC-EEECCCCCeEeECC
Confidence            444 578888 89988888999886


No 25 
>KOG3509|consensus
Probab=57.01  E-value=15  Score=33.72  Aligned_cols=79  Identities=28%  Similarity=0.395  Sum_probs=58.0

Q ss_pred             cccCCCC-CCCCCCCCCCCCCCcccccCCeeee-------------------eCC-CCCCCCCCeEcccCcEechhhHHH
Q psy11287         12 DSVGYLG-ETGPCSSNPCRNDGHCVVKNGKAVC-------------------KCP-SCSAEYNPVCGSDGISYENECKLN   70 (112)
Q Consensus        12 ~~~g~C~-~~~~C~~~~C~~g~~C~~~~~~~~C-------------------~C~-~C~~~~~pVCGSDG~TY~n~C~l~   70 (112)
                      .++|.++ ....|..+.|..|+.|.++++.+.+                   .|+ .+..+.-++|..+|++|..++-++
T Consensus       163 ~~~~~~~~~~~~~~~~~~q~g~tC~~~~~~~~~~~~~~~~~~~~c~~~~~r~~~~f~~~~~g~~~~~~~~vp~~~e~S~~  242 (964)
T KOG3509|consen  163 TSEGGPGTEPIHCAQPVCQGGATCEVRNGKGYSLECPDCKVRVVCEACKPRAFCPFEKSVEGCLKCFCFGVPRPSESSLH  242 (964)
T ss_pred             ccCCCCccccccccCcccccceeEEecCCcceeeeccccccceehhhccCceecccccccccccceeecCCCccccchhh
Confidence            3444443 2457889999999999997553332                   222 233346789999999999999999


Q ss_pred             HHHhhCCCCceEeeecCCCC
Q psy11287         71 LEACQHSRQISVLYIGLCSK   90 (112)
Q Consensus        71 ~~~C~~~~~i~i~~~G~C~~   90 (112)
                      .+....+..+++.+.+.+..
T Consensus       243 ~~~~~h~~~~~~~~~~~~~~  262 (964)
T KOG3509|consen  243 AFRAIHGATLHVDSLGVFFS  262 (964)
T ss_pred             hHhhhccchhccchheeecc
Confidence            99998888888887777754


No 26 
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=56.20  E-value=8.7  Score=25.28  Aligned_cols=24  Identities=8%  Similarity=0.098  Sum_probs=16.0

Q ss_pred             cCCCCCCccccCCCcccccccCCCCC
Q psy11287         86 GLCSKGLLREIDKARQENEIPPGKLQ  111 (112)
Q Consensus        86 G~C~~~~~C~~d~~~~~~C~cp~~~~  111 (112)
                      +.|...++|..  +..+.|.|+.++.
T Consensus        84 ~~CG~~g~C~~--~~~~~C~Cl~GF~  107 (110)
T PF00954_consen   84 GFCGPNGICNS--NNSPKCSCLPGFE  107 (110)
T ss_pred             cccCCccEeCC--CCCCceECCCCcC
Confidence            44455666743  3567899999874


No 27 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=54.54  E-value=17  Score=18.21  Aligned_cols=26  Identities=50%  Similarity=1.188  Sum_probs=19.1

Q ss_pred             CCCCC-CCCCCCCcccccCCeeeeeCC
Q psy11287         21 GPCSS-NPCRNDGHCVVKNGKAVCKCP   46 (112)
Q Consensus        21 ~~C~~-~~C~~g~~C~~~~~~~~C~C~   46 (112)
                      +.|.. ..|..++.|....+...|.|+
T Consensus         3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~   29 (38)
T cd00054           3 DECASGNPCQNGGTCVNTVGSYRCSCP   29 (38)
T ss_pred             ccCCCCCCcCCCCEeECCCCCeEeECC
Confidence            45555 568888889887777888875


No 28 
>PF09064 Tme5_EGF_like:  Thrombomodulin like fifth domain, EGF-like;  InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=51.64  E-value=7.7  Score=21.20  Aligned_cols=18  Identities=6%  Similarity=-0.171  Sum_probs=12.6

Q ss_pred             CccccCCCcccccccCCCCC
Q psy11287         92 LLREIDKARQENEIPPGKLQ  111 (112)
Q Consensus        92 ~~C~~d~~~~~~C~cp~~~~  111 (112)
                      +.|+.  +...+|.||.+++
T Consensus        10 A~CDp--n~~~~C~CPeGyI   27 (34)
T PF09064_consen   10 ADCDP--NSPGQCFCPEGYI   27 (34)
T ss_pred             CccCC--CCCCceeCCCceE
Confidence            34444  4456999999986


No 29 
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=37.10  E-value=25  Score=18.51  Aligned_cols=22  Identities=9%  Similarity=-0.082  Sum_probs=16.6

Q ss_pred             CCCCCCccccCCCcccccccCCCCC
Q psy11287         87 LCSKGLLREIDKARQENEIPPGKLQ  111 (112)
Q Consensus        87 ~C~~~~~C~~d~~~~~~C~cp~~~~  111 (112)
                      .|...++|..+   .++|+|+.+|.
T Consensus         7 ~C~~~G~C~~~---~g~C~C~~g~~   28 (32)
T PF07974_consen    7 ICSGHGTCVSP---CGRCVCDSGYT   28 (32)
T ss_pred             ccCCCCEEeCC---CCEEECCCCCc
Confidence            46667777763   68999998874


No 30 
>PF12714 TILa:  TILa domain
Probab=36.34  E-value=22  Score=20.95  Aligned_cols=22  Identities=27%  Similarity=0.774  Sum_probs=16.4

Q ss_pred             CCCCCCCCCCCcccccCCeeee
Q psy11287         22 PCSSNPCRNDGHCVVKNGKAVC   43 (112)
Q Consensus        22 ~C~~~~C~~g~~C~~~~~~~~C   43 (112)
                      .|....|..+.+|.+.+|...|
T Consensus        34 ~C~~~~C~~~e~C~~~~G~~~C   55 (56)
T PF12714_consen   34 QCQPSSCPPGEVCQIQNGVRGC   55 (56)
T ss_pred             EEeCCCCCCCCEeEeCCCEEcC
Confidence            4777888888888887766554


No 31 
>KOG4004|consensus
Probab=35.17  E-value=22  Score=27.27  Aligned_cols=25  Identities=20%  Similarity=0.107  Sum_probs=20.8

Q ss_pred             CCCCCCccccCCCcccccccCC-CCC
Q psy11287         87 LCSKGLLREIDKARQENEIPPG-KLQ  111 (112)
Q Consensus        87 ~C~~~~~C~~d~~~~~~C~cp~-~~~  111 (112)
                      .|..+..|..|.++.++|||-. +||
T Consensus        57 ~Cg~gk~C~vd~~~~P~Cvc~~~kCP   82 (259)
T KOG4004|consen   57 KCGPGKNCLVDLQTQPRCVCCRYKCP   82 (259)
T ss_pred             cCCCCceeeecCCCCceeEEecCCCC
Confidence            3555999999999999999987 665


No 32 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=34.99  E-value=11  Score=20.48  Aligned_cols=25  Identities=8%  Similarity=-0.126  Sum_probs=17.7

Q ss_pred             ecCCCCCCccccCCCcccccccCCCC
Q psy11287         85 IGLCSKGLLREIDKARQENEIPPGKL  110 (112)
Q Consensus        85 ~G~C~~~~~C~~d~~~~~~C~cp~~~  110 (112)
                      .+.|+..++|..... +..|+|+.++
T Consensus         5 ~~~C~~nA~C~~~~~-~~~C~C~~Gy   29 (36)
T PF12947_consen    5 NGGCHPNATCTNTGG-SYTCTCKPGY   29 (36)
T ss_dssp             GGGS-TTCEEEE-TT-SEEEEE-CEE
T ss_pred             CCCCCCCcEeecCCC-CEEeECCCCC
Confidence            367888999988766 8899998765


No 33 
>PF12662 cEGF:  Complement Clr-like EGF-like
Probab=27.91  E-value=30  Score=17.22  Aligned_cols=10  Identities=20%  Similarity=-0.081  Sum_probs=8.2

Q ss_pred             cccccCCCCC
Q psy11287        102 ENEIPPGKLQ  111 (112)
Q Consensus       102 ~~C~cp~~~~  111 (112)
                      .+|.||.+++
T Consensus         2 y~C~C~~Gy~   11 (24)
T PF12662_consen    2 YTCSCPPGYQ   11 (24)
T ss_pred             EEeeCCCCCc
Confidence            5799999886


No 34 
>PF12661 hEGF:  Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=26.81  E-value=18  Score=15.41  Aligned_cols=8  Identities=13%  Similarity=-0.135  Sum_probs=5.0

Q ss_pred             ccccCCCC
Q psy11287        103 NEIPPGKL  110 (112)
Q Consensus       103 ~C~cp~~~  110 (112)
                      +|+||.++
T Consensus         1 ~C~C~~G~    8 (13)
T PF12661_consen    1 TCQCPPGW    8 (13)
T ss_dssp             EEEE-TTE
T ss_pred             CccCcCCC
Confidence            48888775


No 35 
>PF11403 Yeast_MT:  Yeast metallothionein;  InterPro: IPR022710 Metallothioneins are characterised by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification. For the first 40 residues in the protein the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster.; PDB: 1AQS_A 1AQR_A 1RJU_V 1FMY_A 1AOO_A 1AQQ_A.
Probab=26.71  E-value=55  Score=17.95  Aligned_cols=19  Identities=16%  Similarity=0.018  Sum_probs=9.3

Q ss_pred             cccCCCcccccccCCCCCC
Q psy11287         94 REIDKARQENEIPPGKLQS  112 (112)
Q Consensus        94 C~~d~~~~~~C~cp~~~~~  112 (112)
                      |-.++.-.-.|.||+++-|
T Consensus        14 cknneqcqkscscptgcns   32 (40)
T PF11403_consen   14 CKNNEQCQKSCSCPTGCNS   32 (40)
T ss_dssp             TTT-TTSTTS-SS-TTTTS
T ss_pred             ccChHHHhhcCCCCCCCCC
Confidence            3333334567888888753


No 36 
>PF12946 EGF_MSP1_1:  MSP1 EGF domain 1;  InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=26.34  E-value=45  Score=18.46  Aligned_cols=24  Identities=25%  Similarity=0.827  Sum_probs=16.5

Q ss_pred             CCCCCCCCCCcccccC-CeeeeeCC
Q psy11287         23 CSSNPCRNDGHCVVKN-GKAVCKCP   46 (112)
Q Consensus        23 C~~~~C~~g~~C~~~~-~~~~C~C~   46 (112)
                      |.+..|...+.|...+ |...|+|.
T Consensus         2 C~~~~cP~NA~C~~~~dG~eecrCl   26 (37)
T PF12946_consen    2 CIDTKCPANAGCFRYDDGSEECRCL   26 (37)
T ss_dssp             -SSS---TTEEEEEETTSEEEEEE-
T ss_pred             ccCccCCCCcccEEcCCCCEEEEee
Confidence            6678899999999864 89999984


No 37 
>KOG3516|consensus
Probab=24.28  E-value=1.4e+02  Score=28.37  Aligned_cols=72  Identities=22%  Similarity=0.435  Sum_probs=48.3

Q ss_pred             ccccccCCCCCCCCCCCCCCCCCCcccccCCeeeeeCCCCCCCCCCeEcccCcEechhhHHHHHHhhCCCCceEe
Q psy11287          9 FQFDSVGYLGETGPCSSNPCRNDGHCVVKNGKAVCKCPSCSAEYNPVCGSDGISYENECKLNLEACQHSRQISVL   83 (112)
Q Consensus         9 ~~v~~~g~C~~~~~C~~~~C~~g~~C~~~~~~~~C~C~~C~~~~~pVCGSDG~TY~n~C~l~~~~C~~~~~i~i~   83 (112)
                      +.....+-|+..++|..+.|+.|+.|.-....-.|.|.. .-....+|-+  ..|+-.|+-.+..=.+..+.-|.
T Consensus       534 ~~~v~id~C~i~drClPN~CehgG~C~Qs~~~f~C~C~~-TGY~GatCHt--si~e~SCeay~~~~~t~~~~~iD  605 (1306)
T KOG3516|consen  534 FSDVQIDMCGISDRCLPNPCEHGGKCSQSWDDFECNCEL-TGYKGATCHT--SIYELSCEAYKNIGQTSGNFLID  605 (1306)
T ss_pred             ccceeecccccccccCCccccCCCcccccccceeEeccc-cccccccccC--CCcchhhHHhhhhccccceEEEc
Confidence            345677889999999999999999999977788998841 1123445543  23666777666643333333333


No 38 
>PF14380 WAK_assoc:  Wall-associated receptor kinase C-terminal
Probab=23.78  E-value=93  Score=19.86  Aligned_cols=37  Identities=14%  Similarity=0.023  Sum_probs=27.0

Q ss_pred             HhhCCCCceEe-eecCCCC----CCccccCCC-cccccccCCC
Q psy11287         73 ACQHSRQISVL-YIGLCSK----GLLREIDKA-RQENEIPPGK  109 (112)
Q Consensus        73 ~C~~~~~i~i~-~~G~C~~----~~~C~~d~~-~~~~C~cp~~  109 (112)
                      ..+.+..|.-. ..+.|..    ++.|.-|+. ..-.|.||++
T Consensus        51 ~L~~GF~L~w~~~~~~C~~C~~SgG~Cgy~~~~~~f~C~C~dg   93 (94)
T PF14380_consen   51 VLKKGFELEWNADSGDCRECEASGGRCGYDSNSEQFTCFCSDG   93 (94)
T ss_pred             HHhcCcEEEEeCCCCcCcChhcCCCEeCCCCCCceEEEECCCC
Confidence            34555555555 4688876    899999865 6789999975


No 39 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx (disulphide bonds pair cysteines 1-3, 2-4 and 5-6); 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=21.11  E-value=1.1e+02  Score=16.46  Aligned_cols=21  Identities=38%  Similarity=1.081  Sum_probs=18.1

Q ss_pred             CCCCCCCcccccCCeeeeeCC
Q psy11287         26 NPCRNDGHCVVKNGKAVCKCP   46 (112)
Q Consensus        26 ~~C~~g~~C~~~~~~~~C~C~   46 (112)
                      ..|...+.|+...|.-.|.|+
T Consensus        10 ~~C~~~~~C~N~~Gsy~C~C~   30 (42)
T PF07645_consen   10 HNCPENGTCVNTEGSYSCSCP   30 (42)
T ss_dssp             SSSSTTSEEEEETTEEEEEES
T ss_pred             CcCCCCCEEEcCCCCEEeeCC
Confidence            458888999999999999986


Done!