Query         027901
Match_columns 217
No_of_seqs    152 out of 397
Neff          5.3 
Searched_HMMs 46136
Date          Fri Mar 29 03:11:18 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/027901.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/027901hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG1219 Uncharacterized conser  99.3 2.1E-12 4.5E-17  137.1   6.7  104   37-148  3864-3975(4289)
  2 KOG4289 Cadherin EGF LAG seven  99.2 1.8E-11   4E-16  126.6   3.8  102   33-173  1175-1301(2531)
  3 KOG4289 Cadherin EGF LAG seven  98.4 2.2E-07 4.7E-12   97.3   4.0   67   38-143  1240-1308(2531)
  4 KOG1214 Nidogen and related ba  98.2 1.6E-06 3.6E-11   87.4   6.5  108   35-148   732-860 (1289)
  5 KOG1219 Uncharacterized conser  98.1 5.8E-06 1.2E-10   89.9   6.1   83   84-168  3866-3959(4289)
  6 PF00008 EGF:  EGF-like domain   98.0   2E-06 4.4E-11   53.4   0.6   31   40-72      1-32  (32)
  7 PF00008 EGF:  EGF-like domain   98.0 4.9E-06 1.1E-10   51.7   2.1   30  118-147     1-31  (32)
  8 KOG1225 Teneurin-1 and related  97.9   4E-05 8.6E-10   74.6   7.9   93   37-152   244-343 (525)
  9 KOG1217 Fibrillins and related  97.9 9.3E-05   2E-09   67.1   9.5  124   38-165   170-330 (487)
 10 KOG4260 Uncharacterized conser  97.7 3.5E-05 7.6E-10   69.8   4.8  104   43-147   150-270 (350)
 11 PF07645 EGF_CA:  Calcium-bindi  97.7 2.1E-05 4.5E-10   51.4   2.5   31  114-145     1-34  (42)
 12 KOG1217 Fibrillins and related  97.6  0.0003 6.4E-09   63.8   9.2  104   36-147   270-389 (487)
 13 smart00179 EGF_CA Calcium-bind  97.6 6.6E-05 1.4E-09   46.6   3.4   34   37-73      2-38  (39)
 14 smart00179 EGF_CA Calcium-bind  97.5 0.00011 2.5E-09   45.5   3.6   32  115-147     2-36  (39)
 15 cd00054 EGF_CA Calcium-binding  97.3 0.00028 6.1E-09   42.9   3.3   34   37-73      2-37  (38)
 16 smart00181 EGF Epidermal growt  97.2 0.00044 9.5E-09   42.3   3.2   32   39-73      1-34  (35)
 17 cd00054 EGF_CA Calcium-binding  97.2 0.00051 1.1E-08   41.8   3.4   32  115-147     2-35  (38)
 18 cd00053 EGF Epidermal growth f  96.9  0.0013 2.7E-08   39.3   3.2   31   40-73      2-35  (36)
 19 smart00181 EGF Epidermal growt  96.8  0.0014 3.1E-08   40.0   3.3   30  117-147     1-31  (35)
 20 cd00053 EGF Epidermal growth f  96.7  0.0021 4.5E-08   38.3   3.4   27  120-147     5-32  (36)
 21 KOG1214 Nidogen and related ba  96.5  0.0098 2.1E-07   61.0   8.0  122   40-168   697-844 (1289)
 22 PF07645 EGF_CA:  Calcium-bindi  96.5  0.0012 2.5E-08   43.0   1.0   30   37-69      2-34  (42)
 23 KOG1225 Teneurin-1 and related  96.4   0.011 2.3E-07   58.0   7.7   92   58-166   232-325 (525)
 24 PF07974 EGF_2:  EGF-like domai  96.3   0.004 8.7E-08   38.9   2.7   26   43-73      6-32  (32)
 25 KOG1226 Integrin beta subunit   96.2   0.011 2.3E-07   59.9   6.3   23   45-74    557-580 (783)
 26 KOG1226 Integrin beta subunit   96.1   0.014   3E-07   59.1   6.7   90   47-148   472-578 (783)
 27 PF12661 hEGF:  Human growth fa  96.0  0.0016 3.5E-08   33.1  -0.1   13   61-73      1-13  (13)
 28 PF12947 EGF_3:  EGF domain;  I  96.0  0.0055 1.2E-07   39.2   2.1   25  122-147     7-32  (36)
 29 PF07974 EGF_2:  EGF-like domai  94.7    0.04 8.6E-07   34.4   2.9   24  121-147     6-30  (32)
 30 PHA03099 epidermal growth fact  93.4    0.14   3E-06   41.9   4.6   48   28-76     28-83  (139)
 31 PF12662 cEGF:  Complement Clr-  92.9   0.073 1.6E-06   31.3   1.6   13  135-147     1-13  (24)
 32 PF06247 Plasmod_Pvs28:  Plasmo  92.8   0.033 7.3E-07   48.1   0.2   71   44-148     7-82  (197)
 33 PHA02887 EGF-like protein; Pro  92.1    0.11 2.4E-06   41.9   2.3   30   45-75     94-123 (126)
 34 PF14670 FXa_inhibition:  Coagu  91.9    0.11 2.3E-06   33.3   1.6   21  127-148    11-31  (36)
 35 KOG4260 Uncharacterized conser  91.2    0.11 2.3E-06   47.6   1.5   80   36-162   235-317 (350)
 36 cd01475 vWA_Matrilin VWA_Matri  90.8    0.32 6.8E-06   41.6   4.0   35  112-148   184-220 (224)
 37 PF06247 Plasmod_Pvs28:  Plasmo  89.8   0.088 1.9E-06   45.5  -0.3   73   39-147    89-162 (197)
 38 PHA02887 EGF-like protein; Pro  89.6    0.39 8.5E-06   38.7   3.3   36  114-149    82-121 (126)
 39 PF12947 EGF_3:  EGF domain;  I  89.2    0.11 2.5E-06   33.0  -0.0   25   44-71      7-32  (36)
 40 KOG3514 Neurexin III-alpha [Si  86.7    0.42 9.2E-06   50.6   2.3   38   37-77    623-662 (1591)
 41 smart00051 DSL delta serrate l  85.2    0.71 1.5E-05   32.9   2.2   13   61-73     51-63  (63)
 42 PHA03099 epidermal growth fact  85.1     1.1 2.4E-05   36.7   3.6   28  121-148    51-79  (139)
 43 PF07172 GRP:  Glycine rich pro  84.0    0.76 1.6E-05   35.4   2.1   18    1-19      1-18  (95)
 44 PF14670 FXa_inhibition:  Coagu  83.7    0.42 9.1E-06   30.5   0.4   21   49-72     11-31  (36)
 45 KOG1836 Extracellular matrix g  83.2     2.4 5.1E-05   47.2   6.1   90   60-149   695-811 (1705)
 46 cd01475 vWA_Matrilin VWA_Matri  82.1    0.95 2.1E-05   38.6   2.2   42   28-72    172-220 (224)
 47 KOG3516 Neurexin IV [Signal tr  81.0     1.4   3E-05   47.2   3.2   47   28-77    946-994 (1306)
 48 PF13980 UPF0370:  Uncharacteri  78.9     1.8 3.8E-05   31.0   2.2   14  202-215     7-20  (63)
 49 PF12946 EGF_MSP1_1:  MSP1 EGF   78.6     1.4   3E-05   28.6   1.5   30  118-147     2-32  (37)
 50 PF00954 S_locus_glycop:  S-loc  73.7     2.9 6.3E-05   32.0   2.5   31  114-146    76-108 (110)
 51 PF04863 EGF_alliinase:  Alliin  68.2     1.3 2.7E-05   31.2  -0.6   35   42-76     16-52  (56)
 52 KOG3516 Neurexin IV [Signal tr  67.4     4.7  0.0001   43.3   3.0   39  112-151   542-582 (1306)
 53 PF00954 S_locus_glycop:  S-loc  63.4     7.2 0.00016   29.8   2.7   31   37-71     77-109 (110)
 54 KOG1836 Extracellular matrix g  59.9      22 0.00048   39.9   6.6   37   40-77    777-815 (1705)
 55 KOG3512 Netrin, axonal chemotr  59.8      19  0.0004   35.6   5.3   25   48-74    285-309 (592)
 56 PRK13664 hypothetical protein;  56.4      10 0.00022   27.0   2.2   14  202-215     7-21  (62)
 57 PF12955 DUF3844:  Domain of un  55.9     9.1  0.0002   30.1   2.1   33   38-70      6-43  (103)
 58 KOG3607 Meltrins, fertilins an  55.2      27 0.00058   35.9   5.8   21  123-147   632-653 (716)
 59 KOG1218 Proteins containing Ca  55.0      47   0.001   29.0   6.8   33  136-168   162-197 (316)
 60 cd00055 EGF_Lam Laminin-type e  50.2     9.5 0.00021   25.3   1.2   16   59-74     18-33  (50)
 61 KOG0994 Extracellular matrix g  49.8      30 0.00065   37.8   5.3   38   37-75    903-949 (1758)
 62 PF00053 Laminin_EGF:  Laminin   49.6     7.4 0.00016   25.5   0.6   22   48-74     11-32  (49)
 63 KOG3509 Basement membrane-spec  48.6      29 0.00062   36.9   4.9   36   36-74    405-441 (964)
 64 smart00180 EGF_Lam Laminin-typ  48.0      11 0.00023   24.8   1.2   15   60-74     18-32  (46)
 65 cd01328 FSL_SPARC Follistatin-  44.7      25 0.00054   26.6   2.9   27  117-143     1-28  (86)
 66 KOG0994 Extracellular matrix g  39.0      96  0.0021   34.2   7.0   89   60-148   830-946 (1758)
 67 PF01414 DSL:  Delta serrate li  33.9      13 0.00029   26.3  -0.0   11   63-73     53-63  (63)
 68 KOG3607 Meltrins, fertilins an  30.3      33 0.00072   35.3   2.1   16   59-74    641-656 (716)
 69 PF09064 Tme5_EGF_like:  Thromb  29.4      41 0.00089   21.4   1.6   13   59-71     17-29  (34)
 70 PF01683 EB:  EB module;  Inter  29.1      74  0.0016   20.9   3.0   20  123-147    28-48  (52)
 71 KOG1388 Attractin and platelet  27.3      46   0.001   29.5   2.2   40   26-71     42-88  (217)
 72 PLN03148 Blue copper-like prot  25.9      65  0.0014   27.3   2.8   20  153-175   100-120 (167)
 73 smart00274 FOLN Follistatin-N-  25.3      86  0.0019   18.5   2.4   22  117-138     1-23  (26)
 74 PF09289 FOLN:  Follistatin/Ost  22.6      88  0.0019   18.0   2.0   21  118-138     1-22  (22)

No 1  
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=99.31  E-value=2.1e-12  Score=137.14  Aligned_cols=104  Identities=22%  Similarity=0.584  Sum_probs=82.0

Q ss_pred             CCCCcCCCCCC-CeeecCCCCCCcccccCCCCCccCCCCCCCCCCCCCCccCC-CCcCCCC--ccccCCCCCCCC-CCCC
Q 027901           37 DKMCEKVDCGK-GKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIP-DCTLHYD--SCHTAPPPDPDK-VPHN  111 (217)
Q Consensus        37 ~d~C~~~pC~~-GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~-~Ct~~~g--sC~~~~~~~~~g-~g~n  111 (217)
                      .+.|+.+||+| |+|....  .++|.|+|.+-|+|.+|+.+...|.++||-.+ .|...++  .|.|     |.| +|+.
T Consensus      3864 ~d~C~~npCqhgG~C~~~~--~ggy~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC-----~~gyTG~~ 3936 (4289)
T KOG1219|consen 3864 TDPCNDNPCQHGGTCISQP--KGGYKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNC-----PNGYTGKR 3936 (4289)
T ss_pred             ccccccCcccCCCEecCCC--CCceEEeCcccccCcccccccccccCCCCCCCCEEEecCCCeeEeC-----CCCccCce
Confidence            38999999998 6999874  57899999999999999977667777777644 5654321  3554     556 8888


Q ss_pred             CC-C-CCCCCCCccCC-CeEeeCCCCceeeecCCCCccCC
Q 027901          112 IS-V-FEPCSWIYCGE-GTCRNTSNYKHTCECKPGFNNLL  148 (217)
Q Consensus       112 ~~-~-~DpC~~~~Cg~-GtC~~~~~~sY~C~C~~Gy~n~~  148 (217)
                      |+ . +++|..++|++ |+|++..+ +|+|.|.+||.|..
T Consensus      3937 Ce~~Gi~eCs~n~C~~gg~C~n~~g-sf~CncT~g~~gr~ 3975 (4289)
T KOG1219|consen 3937 CEARGISECSKNVCGTGGQCINIPG-SFHCNCTPGILGRT 3975 (4289)
T ss_pred             eecccccccccccccCCceeeccCC-ceEeccChhHhccc
Confidence            54 3 78999999994 69998855 89999999999875


No 2  
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=99.15  E-value=1.8e-11  Score=126.57  Aligned_cols=102  Identities=30%  Similarity=0.620  Sum_probs=82.9

Q ss_pred             CCCCCCCCcCCCCCC-CeeecCC-------------------CCCCcccccCCCCCccCCCCCCCCCCCCCCccCCCCcC
Q 027901           33 SPFFDKMCEKVDCGK-GKCRADM-------------------THPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTL   92 (217)
Q Consensus        33 ~~~~~d~C~~~pC~~-GtC~~~~-------------------~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~   92 (217)
                      -|++|+.|...||.| -.|+...                   +..++++|+|+|||||..|+++                
T Consensus      1175 lpfdDniClrEPCenymkCvsvlrFdssapf~~s~s~lfRpi~pvnglrCrCPpGFTgd~CeTe---------------- 1238 (2531)
T KOG4289|consen 1175 LPFDDNICLREPCENYMKCVSVLRFDSSAPFLASDSVLFRPIHPVNGLRCRCPPGFTGDYCETE---------------- 1238 (2531)
T ss_pred             eeccCchhhcchhHHHHhhhhheeecccCccccccceeeeeccccCceeEeCCCCCCcccccch----------------
Confidence            378999999999987 4676321                   1246899999999999988754                


Q ss_pred             CCCccccCCCCCCCCCCCCCCCCCCCCCCccC-CCeEeeCCCCceeeecCCCCccCCCCC---CCCCc-cCCCCCCCccC
Q 027901           93 HYDSCHTAPPPDPDKVPHNISVFEPCSWIYCG-EGTCRNTSNYKHTCECKPGFNNLLNTS---YFPCF-SNCTLGADCEK  167 (217)
Q Consensus        93 ~~gsC~~~~~~~~~g~g~n~~~~DpC~~~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~~n~t---~~pC~-~~C~~G~dC~~  167 (217)
                                            +|.|.+.||+ +|+|+.- .++|+|+|++||+|....-   ..-|+ .-|.+|+.|.+
T Consensus      1239 ----------------------iDlCYs~pC~nng~C~sr-EggYtCeCrpg~tGehCEvs~~agrCvpGvC~nggtC~~ 1295 (2531)
T KOG4289|consen 1239 ----------------------IDLCYSGPCGNNGRCRSR-EGGYTCECRPGFTGEHCEVSARAGRCVPGVCKNGGTCVN 1295 (2531)
T ss_pred             ----------------------hHhhhcCCCCCCCceEEe-cCceeEEecCCccccceeeecccCccccceecCCCEEee
Confidence                                  4889999999 6899986 4599999999999996432   58999 79999999999


Q ss_pred             CCccCC
Q 027901          168 LGIRSS  173 (217)
Q Consensus       168 lgi~~~  173 (217)
                      +++...
T Consensus      1296 ~~nggf 1301 (2531)
T KOG4289|consen 1296 LLNGGF 1301 (2531)
T ss_pred             cCCCce
Confidence            977653


No 3  
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.39  E-value=2.2e-07  Score=97.30  Aligned_cols=67  Identities=31%  Similarity=0.717  Sum_probs=52.2

Q ss_pred             CCCcCCCCC-CCeeecCCCCCCcccccCCCCCccCCCCCCCCCCCCCCccCCCCcCCCCccccCCCCCCCCCCCCCCCCC
Q 027901           38 KMCEKVDCG-KGKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPDPDKVPHNISVFE  116 (217)
Q Consensus        38 d~C~~~pC~-~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~~~g~g~n~~~~D  116 (217)
                      |.|-+.||+ ||+|+..   .++|+|+|.|||+|.+|+.+.   .            .+.                    
T Consensus      1240 DlCYs~pC~nng~C~sr---EggYtCeCrpg~tGehCEvs~---~------------agr-------------------- 1281 (2531)
T KOG4289|consen 1240 DLCYSGPCGNNGRCRSR---EGGYTCECRPGFTGEHCEVSA---R------------AGR-------------------- 1281 (2531)
T ss_pred             HhhhcCCCCCCCceEEe---cCceeEEecCCccccceeeec---c------------cCc--------------------
Confidence            457788998 5899987   578999999999999988542   1            122                    


Q ss_pred             CCCCCccCC-CeEeeCCCCceeeecCCC
Q 027901          117 PCSWIYCGE-GTCRNTSNYKHTCECKPG  143 (217)
Q Consensus       117 pC~~~~Cg~-GtC~~~~~~sY~C~C~~G  143 (217)
                       |....|.+ |+|++...++++|.|+.|
T Consensus      1282 -CvpGvC~nggtC~~~~nggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1282 -CVPGVCKNGGTCVNLLNGGFCCHCPYG 1308 (2531)
T ss_pred             -cccceecCCCEEeecCCCceeccCCCc
Confidence             33456874 699998777999999998


No 4  
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=98.25  E-value=1.6e-06  Score=87.36  Aligned_cols=108  Identities=26%  Similarity=0.516  Sum_probs=72.4

Q ss_pred             CCCCCCcCC--CCC-CCeeecCCCCCCcccccCCCCCc----cCCCCCCCCCCCCCCccCC--CCcCCC-----------
Q 027901           35 FFDKMCEKV--DCG-KGKCRADMTHPFNFRCECEPGWK----KTKDNDEDNDHSFLPCIIP--DCTLHY-----------   94 (217)
Q Consensus        35 ~~~d~C~~~--pC~-~GtC~~~~~~~~~Y~C~C~pGwt----G~~c~~~~~~~~~~PC~~~--~Ct~~~-----------   94 (217)
                      ++.++|++-  .|+ +-.|++.   +++|+|+|..||+    +-+|-...+--+.+||+..  +|.++.           
T Consensus       732 ~d~~eca~~~~~CGp~s~Cin~---pg~~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~  808 (1289)
T KOG1214|consen  732 VDENECATGFHRCGPNSVCINL---PGSYRCECRSGYEFADDRHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGST  808 (1289)
T ss_pred             CChhhhccCCCCCCCCceeecC---CCceeEEEeecceeccCCcceEEecCCCCCCccccCccccCcCCceEEEecCCce
Confidence            344556544  377 4679976   6889999999884    3445422223455677733  676552           


Q ss_pred             CccccCCCCCCCCCCCCCCCCCCCCCCccC-CCeEeeCCCCceeeecCCCCccCC
Q 027901           95 DSCHTAPPPDPDKVPHNISVFEPCSWIYCG-EGTCRNTSNYKHTCECKPGFNNLL  148 (217)
Q Consensus        95 gsC~~~~~~~~~g~g~n~~~~DpC~~~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~~  148 (217)
                      .+|++.++..-  .|.-+.+.|+|..+.|. +.+|+++.+ +|.|+|++||+|..
T Consensus       809 y~C~CLPGfsG--DG~~c~dvDeC~psrChp~A~Cyntpg-sfsC~C~pGy~GDG  860 (1289)
T KOG1214|consen  809 YSCACLPGFSG--DGHQCTDVDECSPSRCHPAATCYNTPG-SFSCRCQPGYYGDG  860 (1289)
T ss_pred             EEEeecCCccC--CccccccccccCccccCCCceEecCCC-cceeecccCccCCC
Confidence            14555444421  23335567999999999 679999974 99999999999983


No 5  
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.06  E-value=5.8e-06  Score=89.90  Aligned_cols=83  Identities=24%  Similarity=0.514  Sum_probs=61.0

Q ss_pred             CccCCCCcCCCCccccCC-CCC----CCC-CCCCC-CCCCCCCCCccC-CCeEeeCCCCceeeecCCCCccCCCCC--CC
Q 027901           84 PCIIPDCTLHYDSCHTAP-PPD----PDK-VPHNI-SVFEPCSWIYCG-EGTCRNTSNYKHTCECKPGFNNLLNTS--YF  153 (217)
Q Consensus        84 PC~~~~Ct~~~gsC~~~~-~~~----~~g-~g~n~-~~~DpC~~~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~~n~t--~~  153 (217)
                      ||.-+.|+++. +|..++ +.|    +.- +|++| ....||..+||. +|+|+.. .++|.|.|+.||+|.....  ..
T Consensus      3866 ~C~~npCqhgG-~C~~~~~ggy~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~-~n~f~CnC~~gyTG~~Ce~~Gi~ 3943 (4289)
T KOG1219|consen 3866 PCNDNPCQHGG-TCISQPKGGYKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPF-YNGFLCNCPNGYTGKRCEARGIS 3943 (4289)
T ss_pred             ccccCcccCCC-EecCCCCCceEEeCcccccCcccccccccccCCCCCCCCEEEec-CCCeeEeCCCCccCceeeccccc
Confidence            34444444333 565443 222    444 88885 478999999999 5799987 4599999999999996544  47


Q ss_pred             CCc-cCCCCCCCccCC
Q 027901          154 PCF-SNCTLGADCEKL  168 (217)
Q Consensus       154 pC~-~~C~~G~dC~~l  168 (217)
                      +|- .+|..|+.|.+.
T Consensus      3944 eCs~n~C~~gg~C~n~ 3959 (4289)
T KOG1219|consen 3944 ECSKNVCGTGGQCINI 3959 (4289)
T ss_pred             ccccccccCCceeecc
Confidence            898 799999999887


No 6  
>PF00008 EGF:  EGF-like domain This is a sub-family of the Pfam entry;  InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.97  E-value=2e-06  Score=53.39  Aligned_cols=31  Identities=29%  Similarity=0.795  Sum_probs=26.4

Q ss_pred             CcCCCCCC-CeeecCCCCCCcccccCCCCCccCC
Q 027901           40 CEKVDCGK-GKCRADMTHPFNFRCECEPGWKKTK   72 (217)
Q Consensus        40 C~~~pC~~-GtC~~~~~~~~~Y~C~C~pGwtG~~   72 (217)
                      |..+||.| |+|++..  ..+|+|+|.+||+|.+
T Consensus         1 C~~~~C~n~g~C~~~~--~~~y~C~C~~G~~G~~   32 (32)
T PF00008_consen    1 CSSNPCQNGGTCIDLP--GGGYTCECPPGYTGKR   32 (32)
T ss_dssp             TTTTSSTTTEEEEEES--TSEEEEEEBTTEESTT
T ss_pred             CCCCcCCCCeEEEeCC--CCCEEeECCCCCccCC
Confidence            67889997 7999884  3789999999999964


No 7  
>PF00008 EGF:  EGF-like domain This is a sub-family of the Pfam entry;  InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.95  E-value=4.9e-06  Score=51.65  Aligned_cols=30  Identities=40%  Similarity=0.834  Sum_probs=25.5

Q ss_pred             CCCCccCC-CeEeeCCCCceeeecCCCCccC
Q 027901          118 CSWIYCGE-GTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       118 C~~~~Cg~-GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      |..++|.+ |+|+.....+|+|+|++||+|.
T Consensus         1 C~~~~C~n~g~C~~~~~~~y~C~C~~G~~G~   31 (32)
T PF00008_consen    1 CSSNPCQNGGTCIDLPGGGYTCECPPGYTGK   31 (32)
T ss_dssp             TTTTSSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred             CCCCcCCCCeEEEeCCCCCEEeECCCCCccC
Confidence            56679995 6999886469999999999986


No 8  
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=97.88  E-value=4e-05  Score=74.62  Aligned_cols=93  Identities=22%  Similarity=0.546  Sum_probs=59.6

Q ss_pred             CCCCcCCCCCC-----CeeecCCCCCCcccccCCCCCccCCCCCCCCCCCCCCccCCCCcCCCCccccCCCCCCCC-CCC
Q 027901           37 DKMCEKVDCGK-----GKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPDPDK-VPH  110 (217)
Q Consensus        37 ~d~C~~~pC~~-----GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~~~g-~g~  110 (217)
                      +..|...-|.+     |.|++.       +|.|++||+|.+|+..  .|... |...-+-.+. .|.+     ++| +|+
T Consensus       244 g~~c~~~~C~~~c~~~g~c~~G-------~CIC~~Gf~G~dC~e~--~Cp~~-cs~~g~~~~g-~CiC-----~~g~~G~  307 (525)
T KOG1225|consen  244 GPLCSTIYCPGGCTGRGQCVEG-------RCICPPGFTGDDCDEL--VCPVD-CSGGGVCVDG-ECIC-----NPGYSGK  307 (525)
T ss_pred             CCccccccCCCCCcccceEeCC-------eEeCCCCCcCCCCCcc--cCCcc-cCCCceecCC-Eeec-----CCCcccc
Confidence            44565555654     478764       6999999999998743  24443 4432221222 6666     445 677


Q ss_pred             CCCCCCCCCCCccC-CCeEeeCCCCceeeecCCCCccCCCCCC
Q 027901          111 NISVFEPCSWIYCG-EGTCRNTSNYKHTCECKPGFNNLLNTSY  152 (217)
Q Consensus       111 n~~~~DpC~~~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~~n~t~  152 (217)
                      .++... |. ..|. +|.|++  +   +|+|.+||+|.+..+.
T Consensus       308 dCs~~~-cp-adC~g~G~Ci~--G---~C~C~~Gy~G~~C~~~  343 (525)
T KOG1225|consen  308 DCSIRR-CP-ADCSGHGKCID--G---ECLCDEGYTGELCIQR  343 (525)
T ss_pred             cccccc-CC-ccCCCCCcccC--C---ceEeCCCCcCCccccc
Confidence            765322 43 7787 689993  2   5999999999976555


No 9  
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.86  E-value=9.3e-05  Score=67.05  Aligned_cols=124  Identities=23%  Similarity=0.572  Sum_probs=76.4

Q ss_pred             CCCc--CCCCCC-CeeecCCCCCCcccccCCCCCccCCCCCCC--CCCC------------CCCccCC--CCcCCCCccc
Q 027901           38 KMCE--KVDCGK-GKCRADMTHPFNFRCECEPGWKKTKDNDED--NDHS------------FLPCIIP--DCTLHYDSCH   98 (217)
Q Consensus        38 d~C~--~~pC~~-GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~--~~~~------------~~PC~~~--~Ct~~~gsC~   98 (217)
                      +.|.  ..+|.+ ++|++.   ..+|.|.|.+||++..++...  ..+.            ...|.+.  .|..+.+.|.
T Consensus       170 ~~C~~~~~~c~~~~~C~~~---~~~~~C~c~~~~~~~~~~~~~~~~~c~~~~~~~~~~g~~~~~c~~~~~~~~~~~~~c~  246 (487)
T KOG1217|consen  170 DECIQYSSPCQNGGTCVNT---GGSYLCSCPPGYTGSTCETTGNGGTCVDSVACSCPPGARGPECEVSIVECASGDGTCV  246 (487)
T ss_pred             cccccCCCCcCCCcccccC---CCCeeEeCCCCccCCcCcCCCCCceEecceeccCCCCCCCCCcccccccccCCCCccc
Confidence            5676  335885 689977   456999999999999887430  0000            1112211  1111102444


Q ss_pred             cCCCCC----CCC-CCCC---CCCCCCCCCCc-cCC-CeEeeCCCCceeeecCCCCccCCC--C-CCCCC----c-cCCC
Q 027901           99 TAPPPD----PDK-VPHN---ISVFEPCSWIY-CGE-GTCRNTSNYKHTCECKPGFNNLLN--T-SYFPC----F-SNCT  160 (217)
Q Consensus        99 ~~~~~~----~~g-~g~n---~~~~DpC~~~~-Cg~-GtC~~~~~~sY~C~C~~Gy~n~~n--~-t~~pC----~-~~C~  160 (217)
                      +..+.+    ++| .+..   ..+++.|.... |.+ |+|+...+ .|.|.|++||++...  . ....|    . ++|.
T Consensus       247 ~~~~~~~C~~~~g~~~~~~~~~~~~~~C~~~~~c~~~~~C~~~~~-~~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~c~  325 (487)
T KOG1217|consen  247 NTVGSYTCRCPEGYTGDACVTCVDVDSCALIASCPNGGTCVNVPG-SYRCTCPPGFTGRLCTECVDVDECSPRNAGGPCA  325 (487)
T ss_pred             ccCCceeeeCCCCccccccceeeeccccCCCCccCCCCeeecCCC-cceeeCCCCCCCCCCccccccccccccccCCcCC
Confidence            433332    344 3333   34679998875 884 79998754 699999999999975  2 22566    2 5688


Q ss_pred             CCCCc
Q 027901          161 LGADC  165 (217)
Q Consensus       161 ~G~dC  165 (217)
                      +|..|
T Consensus       326 ~g~~C  330 (487)
T KOG1217|consen  326 NGGTC  330 (487)
T ss_pred             CCccc
Confidence            88888


No 10 
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=97.75  E-value=3.5e-05  Score=69.78  Aligned_cols=104  Identities=20%  Similarity=0.312  Sum_probs=60.0

Q ss_pred             CCC-CCCeeecCCCCCCcccccCCCCCccCCCCCCCC-CCCCC-CccCCCCcCCC----CccccCCCCC----CCC-CCC
Q 027901           43 VDC-GKGKCRADMTHPFNFRCECEPGWKKTKDNDEDN-DHSFL-PCIIPDCTLHY----DSCHTAPPPD----PDK-VPH  110 (217)
Q Consensus        43 ~pC-~~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~-~~~~~-PC~~~~Ct~~~----gsC~~~~~~~----~~g-~g~  110 (217)
                      .|| ++|.|.-.....++-.|+|++||+|..|..-.. ..... -=.+-.|+-=.    +.|.......    ..| .-.
T Consensus       150 r~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg~eyfes~Rne~~lvCt~Ch~~C~~~Csg~~~k~C~kCkkGW~ld  229 (350)
T KOG4260|consen  150 RPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCGIEYFESSRNEQHLVCTACHEGCLGVCSGESSKGCSKCKKGWKLD  229 (350)
T ss_pred             CCcCCCCcccCCCCCCCCCcccccCCCCCccccccchHHHHhhcccccchhhhhhhhhhcccCCCCCCChhhhcccceec
Confidence            368 489999665456788999999999999852110 00000 00011122111    1222111110    222 112


Q ss_pred             C--CCCCCCCCC--CccCC-CeEeeCCCCceeeecCCCCccC
Q 027901          111 N--ISVFEPCSW--IYCGE-GTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       111 n--~~~~DpC~~--~~Cg~-GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      .  +.++|+|..  .+|+. --|+|+.+ +|+|++++||...
T Consensus       230 e~gCvDvnEC~~ep~~c~~~qfCvNteG-Sf~C~dk~Gy~~g  270 (350)
T KOG4260|consen  230 EEGCVDVNECQNEPAPCKAHQFCVNTEG-SFKCEDKEGYKKG  270 (350)
T ss_pred             ccccccHHHHhcCCCCCChhheeecCCC-ceEecccccccCC
Confidence            2  568899964  56884 58999854 9999999999973


No 11 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx; 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=97.75  E-value=2.1e-05  Score=51.39  Aligned_cols=31  Identities=35%  Similarity=0.873  Sum_probs=25.6

Q ss_pred             CCCCCCCC--ccC-CCeEeeCCCCceeeecCCCCc
Q 027901          114 VFEPCSWI--YCG-EGTCRNTSNYKHTCECKPGFN  145 (217)
Q Consensus       114 ~~DpC~~~--~Cg-~GtC~~~~~~sY~C~C~~Gy~  145 (217)
                      ++|+|...  .|. +++|+++.+ +|+|.|++||.
T Consensus         1 DidEC~~~~~~C~~~~~C~N~~G-sy~C~C~~Gy~   34 (42)
T PF07645_consen    1 DIDECAEGPHNCPENGTCVNTEG-SYSCSCPPGYE   34 (42)
T ss_dssp             ESSTTTTTSSSSSTTSEEEEETT-EEEEEESTTEE
T ss_pred             CccccCCCCCcCCCCCEEEcCCC-CEEeeCCCCcE
Confidence            36788664  598 579999965 99999999999


No 12 
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.63  E-value=0.0003  Score=63.76  Aligned_cols=104  Identities=23%  Similarity=0.474  Sum_probs=68.2

Q ss_pred             CCCCCcCCC-CCC-CeeecCCCCCCcccccCCCCCccCCC-C-CCCCCC----CCCCccCC-CCcC-C---CCccccCCC
Q 027901           36 FDKMCEKVD-CGK-GKCRADMTHPFNFRCECEPGWKKTKD-N-DEDNDH----SFLPCIIP-DCTL-H---YDSCHTAPP  102 (217)
Q Consensus        36 ~~d~C~~~p-C~~-GtC~~~~~~~~~Y~C~C~pGwtG~~c-~-~~~~~~----~~~PC~~~-~Ct~-~---~gsC~~~~~  102 (217)
                      +.+.|.... |.+ |+|++.   .+.|.|.|.+||+|..+ . .+..++    ...+|..+ .|.. +   ...|..   
T Consensus       270 ~~~~C~~~~~c~~~~~C~~~---~~~~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c---  343 (487)
T KOG1217|consen  270 DVDSCALIASCPNGGTCVNV---PGSYRCTCPPGFTGRLCTECVDVDECSPRNAGGPCANGGTCNTLGSFGGFRCAC---  343 (487)
T ss_pred             eccccCCCCccCCCCeeecC---CCcceeeCCCCCCCCCCccccccccccccccCCcCCCCcccccCCCCCCCCcCC---
Confidence            467898875 875 799987   34599999999999998 1 111123    22234433 3411 1   002333   


Q ss_pred             CCCCC-CCCCCCCC-CCCCCCccC-CCeEeeCCCCceeeecCCCCccC
Q 027901          103 PDPDK-VPHNISVF-EPCSWIYCG-EGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       103 ~~~~g-~g~n~~~~-DpC~~~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                        ..+ .|+.++.. |+|...+|. +++|++...++|+|.|+.+|.+.
T Consensus       344 --~~~~~g~~C~~~~~~C~~~~~~~~~~c~~~~~~~~~c~~~~~~~~~  389 (487)
T KOG1217|consen  344 --GPGFTGRRCEDSNDECASSPCCPGGTCVNETPGSYRCACPAGFAGK  389 (487)
T ss_pred             --CCCCCCCccccCCccccCCccccCCEeccCCCCCeEecCCCccccC
Confidence              334 66776666 599998877 57999832348999999999984


No 13 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.63  E-value=6.6e-05  Score=46.57  Aligned_cols=34  Identities=29%  Similarity=0.736  Sum_probs=28.3

Q ss_pred             CCCCcC-CCCCC-CeeecCCCCCCcccccCCCCCc-cCCC
Q 027901           37 DKMCEK-VDCGK-GKCRADMTHPFNFRCECEPGWK-KTKD   73 (217)
Q Consensus        37 ~d~C~~-~pC~~-GtC~~~~~~~~~Y~C~C~pGwt-G~~c   73 (217)
                      .++|.. .+|.+ |+|++.   .++|.|.|.+||+ |.+|
T Consensus         2 ~~~C~~~~~C~~~~~C~~~---~g~~~C~C~~g~~~g~~C   38 (39)
T smart00179        2 IDECASGNPCQNGGTCVNT---VGSYRCECPPGYTDGRNC   38 (39)
T ss_pred             cccCcCCCCcCCCCEeECC---CCCeEeECCCCCccCCcC
Confidence            477887 79986 599977   5689999999999 8765


No 14 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.53  E-value=0.00011  Score=45.49  Aligned_cols=32  Identities=38%  Similarity=0.874  Sum_probs=26.3

Q ss_pred             CCCCCC-CccCC-CeEeeCCCCceeeecCCCCc-cC
Q 027901          115 FEPCSW-IYCGE-GTCRNTSNYKHTCECKPGFN-NL  147 (217)
Q Consensus       115 ~DpC~~-~~Cg~-GtC~~~~~~sY~C~C~~Gy~-n~  147 (217)
                      +|+|.. .+|.+ |+|+++.+ +|+|.|++||. |.
T Consensus         2 ~~~C~~~~~C~~~~~C~~~~g-~~~C~C~~g~~~g~   36 (39)
T smart00179        2 IDECASGNPCQNGGTCVNTVG-SYRCECPPGYTDGR   36 (39)
T ss_pred             cccCcCCCCcCCCCEeECCCC-CeEeECCCCCccCC
Confidence            578877 78984 69998855 89999999998 54


No 15 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.31  E-value=0.00028  Score=42.91  Aligned_cols=34  Identities=26%  Similarity=0.703  Sum_probs=28.0

Q ss_pred             CCCCcC-CCCCC-CeeecCCCCCCcccccCCCCCccCCC
Q 027901           37 DKMCEK-VDCGK-GKCRADMTHPFNFRCECEPGWKKTKD   73 (217)
Q Consensus        37 ~d~C~~-~pC~~-GtC~~~~~~~~~Y~C~C~pGwtG~~c   73 (217)
                      .++|.. .+|.+ |+|++.   .++|.|.|.+||+|.+|
T Consensus         2 ~~~C~~~~~C~~~~~C~~~---~~~~~C~C~~g~~g~~C   37 (38)
T cd00054           2 IDECASGNPCQNGGTCVNT---VGSYRCSCPPGYTGRNC   37 (38)
T ss_pred             cccCCCCCCcCCCCEeECC---CCCeEeECCCCCcCCcC
Confidence            467887 78985 699976   56799999999999776


No 16 
>smart00181 EGF Epidermal growth factor-like domain.
Probab=97.18  E-value=0.00044  Score=42.31  Aligned_cols=32  Identities=28%  Similarity=0.850  Sum_probs=25.9

Q ss_pred             CCcC-CCCCCCeeecCCCCCCcccccCCCCCcc-CCC
Q 027901           39 MCEK-VDCGKGKCRADMTHPFNFRCECEPGWKK-TKD   73 (217)
Q Consensus        39 ~C~~-~pC~~GtC~~~~~~~~~Y~C~C~pGwtG-~~c   73 (217)
                      +|.. .+|.+++|++.   .++|+|.|.+||+| ..|
T Consensus         1 ~C~~~~~C~~~~C~~~---~~~~~C~C~~g~~g~~~C   34 (35)
T smart00181        1 ECASGGPCSNGTCINT---PGSYTCSCPPGYTGDKRC   34 (35)
T ss_pred             CCCCcCCCCCCEEECC---CCCeEeECCCCCccCCcc
Confidence            3566 68988899976   57899999999999 654


No 17 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.17  E-value=0.00051  Score=41.77  Aligned_cols=32  Identities=34%  Similarity=0.821  Sum_probs=26.3

Q ss_pred             CCCCCC-CccC-CCeEeeCCCCceeeecCCCCccC
Q 027901          115 FEPCSW-IYCG-EGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       115 ~DpC~~-~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      +|+|.. .+|. +|+|++..+ +|+|.|++||.|.
T Consensus         2 ~~~C~~~~~C~~~~~C~~~~~-~~~C~C~~g~~g~   35 (38)
T cd00054           2 IDECASGNPCQNGGTCVNTVG-SYRCSCPPGYTGR   35 (38)
T ss_pred             cccCCCCCCcCCCCEeECCCC-CeEeECCCCCcCC
Confidence            478877 7898 569998754 8999999999984


No 18 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.87  E-value=0.0013  Score=39.27  Aligned_cols=31  Identities=32%  Similarity=0.915  Sum_probs=24.9

Q ss_pred             Cc-CCCCCC-CeeecCCCCCCcccccCCCCCccC-CC
Q 027901           40 CE-KVDCGK-GKCRADMTHPFNFRCECEPGWKKT-KD   73 (217)
Q Consensus        40 C~-~~pC~~-GtC~~~~~~~~~Y~C~C~pGwtG~-~c   73 (217)
                      |. ..+|.+ ++|++.   ..+|+|+|..||.|. .|
T Consensus         2 C~~~~~C~~~~~C~~~---~~~~~C~C~~g~~g~~~C   35 (36)
T cd00053           2 CAASNPCSNGGTCVNT---PGSYRCVCPPGYTGDRSC   35 (36)
T ss_pred             CCCCCCCCCCCEEecC---CCCeEeECCCCCcccCCc
Confidence            55 667875 899987   467999999999997 44


No 19 
>smart00181 EGF Epidermal growth factor-like domain.
Probab=96.85  E-value=0.0014  Score=39.99  Aligned_cols=30  Identities=40%  Similarity=0.809  Sum_probs=23.5

Q ss_pred             CCCC-CccCCCeEeeCCCCceeeecCCCCccC
Q 027901          117 PCSW-IYCGEGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       117 pC~~-~~Cg~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      +|.. .+|.+++|++. .++|+|.|++||.+.
T Consensus         1 ~C~~~~~C~~~~C~~~-~~~~~C~C~~g~~g~   31 (35)
T smart00181        1 ECASGGPCSNGTCINT-PGSYTCSCPPGYTGD   31 (35)
T ss_pred             CCCCcCCCCCCEEECC-CCCeEeECCCCCccC
Confidence            3555 57876699987 459999999999983


No 20 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.74  E-value=0.0021  Score=38.29  Aligned_cols=27  Identities=37%  Similarity=0.786  Sum_probs=22.2

Q ss_pred             CCccCC-CeEeeCCCCceeeecCCCCccC
Q 027901          120 WIYCGE-GTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       120 ~~~Cg~-GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      ..+|.+ +.|++..+ +|+|.|++||.+.
T Consensus         5 ~~~C~~~~~C~~~~~-~~~C~C~~g~~g~   32 (36)
T cd00053           5 SNPCSNGGTCVNTPG-SYRCVCPPGYTGD   32 (36)
T ss_pred             CCCCCCCCEEecCCC-CeEeECCCCCccc
Confidence            567874 79998754 8999999999886


No 21 
>KOG1214 consensus Nidogen and related basement membrane proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=96.47  E-value=0.0098  Score=60.98  Aligned_cols=122  Identities=24%  Similarity=0.486  Sum_probs=72.8

Q ss_pred             CcCCCCCCC-eeecCCCCCCcccccCCCCCcc--CCCCCCCCCCCCCCccCCCCcCCCCccccCCCCC----CC-----C
Q 027901           40 CEKVDCGKG-KCRADMTHPFNFRCECEPGWKK--TKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPD----PD-----K  107 (217)
Q Consensus        40 C~~~pC~~G-tC~~~~~~~~~Y~C~C~pGwtG--~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~----~~-----g  107 (217)
                      |.+.-|.-+ .|.++.  --.|+|+|..||.|  .+|.+. +++..-   .++|-.+. -|.+.++.+    ..     +
T Consensus       697 ~gsh~cdt~a~C~pg~--~~~~tcecs~g~~gdgr~c~d~-~eca~~---~~~CGp~s-~Cin~pg~~rceC~~gy~F~d  769 (1289)
T KOG1214|consen  697 DGSHMCDTTARCHPGT--GVDYTCECSSGYQGDGRNCVDE-NECATG---FHRCGPNS-VCINLPGSYRCECRSGYEFAD  769 (1289)
T ss_pred             ecCcccCCCccccCCC--CcceEEEEeeccCCCCCCCCCh-hhhccC---CCCCCCCc-eeecCCCceeEEEeecceecc
Confidence            445556644 588764  24699999999975  455432 244431   22344333 344333333    01     1


Q ss_pred             CCCCCC------CCCCCCC--CccC-CC--eEeeCCCCceeeecCCCCccCC--CCCCCCCc-cCCCCCCCccCC
Q 027901          108 VPHNIS------VFEPCSW--IYCG-EG--TCRNTSNYKHTCECKPGFNNLL--NTSYFPCF-SNCTLGADCEKL  168 (217)
Q Consensus       108 ~g~n~~------~~DpC~~--~~Cg-~G--tC~~~~~~sY~C~C~~Gy~n~~--n~t~~pC~-~~C~~G~dC~~l  168 (217)
                      .+.++.      ..++|..  +.|. +|  .|+.+.+.+|+|.|-+||+|..  +...++|. +-|.-.+.|-+.
T Consensus       770 d~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvDeC~psrChp~A~Cynt  844 (1289)
T KOG1214|consen  770 DRHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVDECSPSRCHPAATCYNT  844 (1289)
T ss_pred             CCcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccCCccccccccccCccccCCCceEecC
Confidence            222321      3467753  4676 55  6788877789999999999885  45568887 667666666543


No 22 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=96.46  E-value=0.0012  Score=43.01  Aligned_cols=30  Identities=27%  Similarity=0.835  Sum_probs=25.5

Q ss_pred             CCCCcCC--CCC-CCeeecCCCCCCcccccCCCCCc
Q 027901           37 DKMCEKV--DCG-KGKCRADMTHPFNFRCECEPGWK   69 (217)
Q Consensus        37 ~d~C~~~--pC~-~GtC~~~~~~~~~Y~C~C~pGwt   69 (217)
                      .|+|+..  .|. +++|++.   .++|+|.|.+||+
T Consensus         2 idEC~~~~~~C~~~~~C~N~---~Gsy~C~C~~Gy~   34 (42)
T PF07645_consen    2 IDECAEGPHNCPENGTCVNT---EGSYSCSCPPGYE   34 (42)
T ss_dssp             SSTTTTTSSSSSTTSEEEEE---TTEEEEEESTTEE
T ss_pred             ccccCCCCCcCCCCCEEEcC---CCCEEeeCCCCcE
Confidence            4788865  587 5899988   6899999999999


No 23 
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=96.42  E-value=0.011  Score=57.98  Aligned_cols=92  Identities=24%  Similarity=0.513  Sum_probs=56.7

Q ss_pred             CcccccCCCCCccCCCCCCCCCCCCCCccCCCCcCCCCccccCCCCCCCC-CCCCCCCCCCCCCCccCC-CeEeeCCCCc
Q 027901           58 FNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPDPDK-VPHNISVFEPCSWIYCGE-GTCRNTSNYK  135 (217)
Q Consensus        58 ~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~~~g-~g~n~~~~DpC~~~~Cg~-GtC~~~~~~s  135 (217)
                      +.++|+|..||+|.+|...       -|. ++|+.. +.|.+..---++| +|.+++. -.|-.. |.+ |.|++    +
T Consensus       232 ~~~ic~c~~~~~g~~c~~~-------~C~-~~c~~~-g~c~~G~CIC~~Gf~G~dC~e-~~Cp~~-cs~~g~~~~----g  296 (525)
T KOG1225|consen  232 FDGICECPEGYFGPLCSTI-------YCP-GGCTGR-GQCVEGRCICPPGFTGDDCDE-LVCPVD-CSGGGVCVD----G  296 (525)
T ss_pred             cCceeecCCceeCCccccc-------cCC-CCCccc-ceEeCCeEeCCCCCcCCCCCc-ccCCcc-cCCCceecC----C
Confidence            3457999999999987632       122 234432 2333322222566 7888653 346545 774 46654    3


Q ss_pred             eeeecCCCCccCCCCCCCCCccCCCCCCCcc
Q 027901          136 HTCECKPGFNNLLNTSYFPCFSNCTLGADCE  166 (217)
Q Consensus       136 Y~C~C~~Gy~n~~n~t~~pC~~~C~~G~dC~  166 (217)
                       +|.|++||.|. .+++-.|..+|...+.|.
T Consensus       297 -~CiC~~g~~G~-dCs~~~cpadC~g~G~Ci  325 (525)
T KOG1225|consen  297 -ECICNPGYSGK-DCSIRRCPADCSGHGKCI  325 (525)
T ss_pred             -EeecCCCcccc-ccccccCCccCCCCCccc
Confidence             79999999999 455444666666666665


No 24 
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal, proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=96.30  E-value=0.004  Score=38.87  Aligned_cols=26  Identities=27%  Similarity=0.668  Sum_probs=21.0

Q ss_pred             CCCC-CCeeecCCCCCCcccccCCCCCccCCC
Q 027901           43 VDCG-KGKCRADMTHPFNFRCECEPGWKKTKD   73 (217)
Q Consensus        43 ~pC~-~GtC~~~~~~~~~Y~C~C~pGwtG~~c   73 (217)
                      ..|. +|+|+..     ..+|+|++||+|..|
T Consensus         6 ~~C~~~G~C~~~-----~g~C~C~~g~~G~~C   32 (32)
T PF07974_consen    6 NICSGHGTCVSP-----CGRCVCDSGYTGPDC   32 (32)
T ss_pred             CccCCCCEEeCC-----CCEEECCCCCcCCCC
Confidence            3585 7999954     379999999999875


No 25 
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=96.16  E-value=0.011  Score=59.93  Aligned_cols=23  Identities=48%  Similarity=1.393  Sum_probs=16.6

Q ss_pred             CC-CCeeecCCCCCCcccccCCCCCccCCCC
Q 027901           45 CG-KGKCRADMTHPFNFRCECEPGWKKTKDN   74 (217)
Q Consensus        45 C~-~GtC~~~~~~~~~Y~C~C~pGwtG~~c~   74 (217)
                      |+ ||+|.-.       +|.|++||+|..|+
T Consensus       557 C~g~G~C~CG-------~CvC~~GwtG~~C~  580 (783)
T KOG1226|consen  557 CGGHGRCECG-------RCVCNPGWTGSACN  580 (783)
T ss_pred             cCCCCeEeCC-------cEEcCCCCccCCCC
Confidence            54 5777643       68888888888876


No 26 
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=96.08  E-value=0.014  Score=59.10  Aligned_cols=90  Identities=23%  Similarity=0.520  Sum_probs=45.5

Q ss_pred             CCeeecCCCCCCcccccCCCCCccCCCCCCCCCCCC----CCcc----CCCCcCCC----CccccCCCCCCCCCCCCCC-
Q 027901           47 KGKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSF----LPCI----IPDCTLHY----DSCHTAPPPDPDKVPHNIS-  113 (217)
Q Consensus        47 ~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~----~PC~----~~~Ct~~~----gsC~~~~~~~~~g~g~n~~-  113 (217)
                      ||+++-+       .|+|++||.|++|+-..++...    .-|.    .+.|+...    |.|.+.+...+.-+|+.++ 
T Consensus       472 ~G~~~CG-------~C~C~~G~~G~~CEC~~~~~ss~~~~~~Cr~~~~~~vCSgrG~C~CGqC~C~~~~~~~i~G~fCEC  544 (783)
T KOG1226|consen  472 NGTFVCG-------QCRCDEGWLGKKCECSTDELSSSEEEDKCRENSDSPVCSGRGDCVCGQCVCHKPDNGKIYGKFCEC  544 (783)
T ss_pred             CCcEEec-------ceecCCCCCCCcccCCccccCcHhHHhhccCCCCCCCcCCCCcEeCCceEecCCCCCceeeeeeec
Confidence            5665543       7999999999999833322332    1233    11344222    2344443333211455532 


Q ss_pred             CCCCCCCC---ccC-CCeEeeCCCCceeeecCCCCccCC
Q 027901          114 VFEPCSWI---YCG-EGTCRNTSNYKHTCECKPGFNNLL  148 (217)
Q Consensus       114 ~~DpC~~~---~Cg-~GtC~~~~~~sY~C~C~~Gy~n~~  148 (217)
                      +.--|..+   -|+ +|+|.=.     +|.|++||+|..
T Consensus       545 DnfsC~r~~g~lC~g~G~C~CG-----~CvC~~GwtG~~  578 (783)
T KOG1226|consen  545 DNFSCERHKGVLCGGHGRCECG-----RCVCNPGWTGSA  578 (783)
T ss_pred             cCcccccccCcccCCCCeEeCC-----cEEcCCCCccCC
Confidence            11223222   365 4666532     477777777764


No 27 
>PF12661 hEGF:  Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.02  E-value=0.0016  Score=33.06  Aligned_cols=13  Identities=38%  Similarity=1.257  Sum_probs=10.5

Q ss_pred             cccCCCCCccCCC
Q 027901           61 RCECEPGWKKTKD   73 (217)
Q Consensus        61 ~C~C~pGwtG~~c   73 (217)
                      +|.|++||+|.+|
T Consensus         1 ~C~C~~G~~G~~C   13 (13)
T PF12661_consen    1 TCQCPPGWTGPNC   13 (13)
T ss_dssp             EEEE-TTEETTTT
T ss_pred             CccCcCCCcCCCC
Confidence            5999999999875


No 28 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=95.95  E-value=0.0055  Score=39.17  Aligned_cols=25  Identities=44%  Similarity=0.953  Sum_probs=19.0

Q ss_pred             ccC-CCeEeeCCCCceeeecCCCCccC
Q 027901          122 YCG-EGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       122 ~Cg-~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      .|. +.+|+++.+ +|+|+|++||.|.
T Consensus         7 ~C~~nA~C~~~~~-~~~C~C~~Gy~Gd   32 (36)
T PF12947_consen    7 GCHPNATCTNTGG-SYTCTCKPGYEGD   32 (36)
T ss_dssp             GS-TTCEEEE-TT-SEEEEE-CEEECC
T ss_pred             CCCCCcEeecCCC-CEEeECCCCCccC
Confidence            476 579999865 9999999999987


No 29 
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal, proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=94.67  E-value=0.04  Score=34.36  Aligned_cols=24  Identities=29%  Similarity=0.782  Sum_probs=19.9

Q ss_pred             CccC-CCeEeeCCCCceeeecCCCCccC
Q 027901          121 IYCG-EGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       121 ~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      ..|. +|+|++.   ..+|.|.+||+|.
T Consensus         6 ~~C~~~G~C~~~---~g~C~C~~g~~G~   30 (32)
T PF07974_consen    6 NICSGHGTCVSP---CGRCVCDSGYTGP   30 (32)
T ss_pred             CccCCCCEEeCC---CCEEECCCCCcCC
Confidence            4587 6999975   5789999999986


No 30 
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=93.45  E-value=0.14  Score=41.88  Aligned_cols=48  Identities=17%  Similarity=0.412  Sum_probs=34.6

Q ss_pred             ccccCCCCCCCC-----CcCC---CCCCCeeecCCCCCCcccccCCCCCccCCCCCC
Q 027901           28 LAPALSPFFDKM-----CEKV---DCGKGKCRADMTHPFNFRCECEPGWKKTKDNDE   76 (217)
Q Consensus        28 ~s~~~~~~~~d~-----C~~~---pC~~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~   76 (217)
                      -||-+++...|.     |...   -|-||+|.--.+ ...+.|.|..||+|.+|+..
T Consensus        28 ~~~~~~~~~~~~~~i~~Cp~ey~~YClHG~C~yI~d-l~~~~CrC~~GYtGeRCEh~   83 (139)
T PHA03099         28 TSPEITNATTDIPAIRLCGPEGDGYCLHGDCIHARD-IDGMYCRCSHGYTGIRCQHV   83 (139)
T ss_pred             cChhhccCccCCcccccCChhhCCEeECCEEEeecc-CCCceeECCCCcccccccce
Confidence            667666655443     6533   388899985432 45799999999999999854


No 31 
>PF12662 cEGF:  Complement Clr-like EGF-like
Probab=92.87  E-value=0.073  Score=31.32  Aligned_cols=13  Identities=38%  Similarity=0.971  Sum_probs=11.5

Q ss_pred             ceeeecCCCCccC
Q 027901          135 KHTCECKPGFNNL  147 (217)
Q Consensus       135 sY~C~C~~Gy~n~  147 (217)
                      +|+|.|++||+..
T Consensus         1 sy~C~C~~Gy~l~   13 (24)
T PF12662_consen    1 SYTCSCPPGYQLS   13 (24)
T ss_pred             CEEeeCCCCCcCC
Confidence            6999999999954


No 32 
>PF06247 Plasmod_Pvs28:  Plasmodium ookinete surface protein Pvs28;  InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=92.79  E-value=0.033  Score=48.07  Aligned_cols=71  Identities=25%  Similarity=0.549  Sum_probs=42.4

Q ss_pred             CCCCCeeecCCCCCCcccccCCCCCccCCCCCCCCCCCCCCccCCCCcCCCCccccCCCCCCCCCCCCCCCCCCCCCCcc
Q 027901           44 DCGKGKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPDPDKVPHNISVFEPCSWIYC  123 (217)
Q Consensus        44 pC~~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~~~g~g~n~~~~DpC~~~~C  123 (217)
                      .|.||.-...+   +.|.|.|.+||.-.+   +           ++|+... .|...         .+       ..-+|
T Consensus         7 ~CKNG~LiQMS---NHfEC~Cnegfvl~~---E-----------ntCE~kv-~C~~~---------e~-------~~K~C   52 (197)
T PF06247_consen    7 ICKNGYLIQMS---NHFECKCNEGFVLKN---E-----------NTCEEKV-ECDKL---------EN-------VNKPC   52 (197)
T ss_dssp             --BTEEEEEES---SEEEEEESTTEEEEE---T-----------TEEEE-----SG----------GG-------TTSEE
T ss_pred             cccCCEEEEcc---CceEEEcCCCcEEcc---c-----------cccccce-ecCcc---------cc-------cCccc
Confidence            57889888875   459999999998763   1           1222111 12110         00       12468


Q ss_pred             CC-CeEeeCCC----CceeeecCCCCccCC
Q 027901          124 GE-GTCRNTSN----YKHTCECKPGFNNLL  148 (217)
Q Consensus       124 g~-GtC~~~~~----~sY~C~C~~Gy~n~~  148 (217)
                      ++ ++|++...    ..|+|.|.+||....
T Consensus        53 gdya~C~~~~~~~~~~~~~C~C~~gY~~~~   82 (197)
T PF06247_consen   53 GDYAKCINQANKGEERAYKCDCINGYILKQ   82 (197)
T ss_dssp             ETTEEEEE-SSTTSSTSEEEEE-TTEEESS
T ss_pred             cchhhhhcCCCcccceeEEEecccCceeeC
Confidence            87 79998653    469999999998763


No 33 
>PHA02887 EGF-like protein; Provisional
Probab=92.13  E-value=0.11  Score=41.86  Aligned_cols=30  Identities=20%  Similarity=0.506  Sum_probs=24.7

Q ss_pred             CCCCeeecCCCCCCcccccCCCCCccCCCCC
Q 027901           45 CGKGKCRADMTHPFNFRCECEPGWKKTKDND   75 (217)
Q Consensus        45 C~~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~   75 (217)
                      |-||+|.--.+ ...+.|.|+.||+|.+|+.
T Consensus        94 CiHG~C~yI~d-L~epsCrC~~GYtG~RCE~  123 (126)
T PHA02887         94 CINGECMNIID-LDEKFCICNKGYTGIRCDE  123 (126)
T ss_pred             eeCCEEEcccc-CCCceeECCCCcccCCCCc
Confidence            88999995532 4568999999999999973


No 34 
>PF14670 FXa_inhibition:  Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=91.85  E-value=0.11  Score=33.30  Aligned_cols=21  Identities=33%  Similarity=0.784  Sum_probs=15.9

Q ss_pred             eEeeCCCCceeeecCCCCccCC
Q 027901          127 TCRNTSNYKHTCECKPGFNNLL  148 (217)
Q Consensus       127 tC~~~~~~sY~C~C~~Gy~n~~  148 (217)
                      .|+++.+ +|+|.|++||+...
T Consensus        11 ~C~~~~g-~~~C~C~~Gy~L~~   31 (36)
T PF14670_consen   11 ICVNTPG-SYRCSCPPGYKLAE   31 (36)
T ss_dssp             EEEEETT-SEEEE-STTEEE-T
T ss_pred             CCccCCC-ceEeECCCCCEECc
Confidence            5888855 89999999998753


No 35 
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=91.18  E-value=0.11  Score=47.60  Aligned_cols=80  Identities=23%  Similarity=0.450  Sum_probs=52.1

Q ss_pred             CCCCCcC--CCCC-CCeeecCCCCCCcccccCCCCCccCCCCCCCCCCCCCCccCCCCcCCCCccccCCCCCCCCCCCCC
Q 027901           36 FDKMCEK--VDCG-KGKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPDPDKVPHNI  112 (217)
Q Consensus        36 ~~d~C~~--~pC~-~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~~~g~g~n~  112 (217)
                      +.|+|..  +||. +--|+++   .++|+|++.+||++.   .|            +|++-.+.|.-             
T Consensus       235 DvnEC~~ep~~c~~~qfCvNt---eGSf~C~dk~Gy~~g---~d------------~C~~~~d~~~~-------------  283 (350)
T KOG4260|consen  235 DVNECQNEPAPCKAHQFCVNT---EGSFKCEDKEGYKKG---VD------------ECQFCADVCAS-------------  283 (350)
T ss_pred             cHHHHhcCCCCCChhheeecC---CCceEecccccccCC---hH------------Hhhhhhhhccc-------------
Confidence            5677864  4676 4679987   688999999999983   22            23321111210             


Q ss_pred             CCCCCCCCCccCCCeEeeCCCCceeeecCCCCccCCCCCCCCCccCCCCC
Q 027901          113 SVFEPCSWIYCGEGTCRNTSNYKHTCECKPGFNNLLNTSYFPCFSNCTLG  162 (217)
Q Consensus       113 ~~~DpC~~~~Cg~GtC~~~~~~sY~C~C~~Gy~n~~n~t~~pC~~~C~~G  162 (217)
                                 .++.|.+.++ +|+|.|..|+.    .....|+..++.-
T Consensus       284 -----------kn~~c~ni~~-~~r~v~f~~~~----~~~g~cV~~~~p~  317 (350)
T KOG4260|consen  284 -----------KNRPCMNIDG-QYRCVCFSGLI----IIEGFCVWHGSPV  317 (350)
T ss_pred             -----------CCCCcccCCc-cEEEEecccce----eeeeeeeccCCch
Confidence                       1345878765 99999988875    3357788766654


No 36 
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=90.82  E-value=0.32  Score=41.56  Aligned_cols=35  Identities=26%  Similarity=0.575  Sum_probs=25.0

Q ss_pred             CCCCCCCCCC--ccCCCeEeeCCCCceeeecCCCCccCC
Q 027901          112 ISVFEPCSWI--YCGEGTCRNTSNYKHTCECKPGFNNLL  148 (217)
Q Consensus       112 ~~~~DpC~~~--~Cg~GtC~~~~~~sY~C~C~~Gy~n~~  148 (217)
                      +.+.|+|...  .|. ..|.++.+ +|.|.|++||+...
T Consensus       184 C~~~~~C~~~~~~c~-~~C~~~~g-~~~c~c~~g~~~~~  220 (224)
T cd01475         184 CVVPDLCATLSHVCQ-QVCISTPG-SYLCACTEGYALLE  220 (224)
T ss_pred             CcCchhhcCCCCCcc-ceEEcCCC-CEEeECCCCccCCC
Confidence            4456777533  454 36998855 99999999998653


No 37 
>PF06247 Plasmod_Pvs28:  Plasmodium ookinete surface protein Pvs28;  InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=89.75  E-value=0.088  Score=45.54  Aligned_cols=73  Identities=34%  Similarity=0.735  Sum_probs=36.8

Q ss_pred             CCcCCCCCCCeeecCCCCCCcccccCCCCCccCCCCCCCCCCCCCCccCCCCcCCCCccccCCCCCCCCCCCCCCCCCCC
Q 027901           39 MCEKVDCGKGKCRADMTHPFNFRCECEPGWKKTKDNDEDNDHSFLPCIIPDCTLHYDSCHTAPPPDPDKVPHNISVFEPC  118 (217)
Q Consensus        39 ~C~~~pC~~GtC~~~~~~~~~Y~C~C~pGwtG~~c~~~~~~~~~~PC~~~~Ct~~~gsC~~~~~~~~~g~g~n~~~~DpC  118 (217)
                      .|....|++|.|+.....+....|.|+-|.. ..   +          ...|+-...                    -+|
T Consensus        89 ~C~~~~Cg~GKCI~d~~~~~~~~CSC~IGkV-~~---d----------n~kCtk~G~--------------------T~C  134 (197)
T PF06247_consen   89 KCNNKDCGSGKCILDPDNPNNPTCSCNIGKV-PD---D----------NKKCTKTGE--------------------TKC  134 (197)
T ss_dssp             GGSS---TTEEEEEEEGGGSEEEEEE-TEEE-TT---T----------TTESEEEE------------------------
T ss_pred             hcCceecCCCeEEecCCCCCCceeEeeeceE-ec---c----------CCcccCCCc--------------------cce
Confidence            3556667777777543334456777777776 11   1          134543221                    133


Q ss_pred             CCCccCCC-eEeeCCCCceeeecCCCCccC
Q 027901          119 SWIYCGEG-TCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       119 ~~~~Cg~G-tC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      . .-|... .|.+..+ -|+|.|++||.+.
T Consensus       135 ~-LKCk~nE~CK~~~~-~Y~C~~~~~~~~~  162 (197)
T PF06247_consen  135 S-LKCKENEECKLVDG-YYKCVCKEGFPGD  162 (197)
T ss_dssp             -----TTTEEEEEETT-EEEEEE-TT-EEE
T ss_pred             e-eecCCCcceeeeCc-EEEeecCCCCCCC
Confidence            3 356543 8998854 8999999999875


No 38 
>PHA02887 EGF-like protein; Provisional
Probab=89.55  E-value=0.39  Score=38.72  Aligned_cols=36  Identities=28%  Similarity=0.655  Sum_probs=27.3

Q ss_pred             CCCCCCC---CccCCCeEeeCC-CCceeeecCCCCccCCC
Q 027901          114 VFEPCSW---IYCGEGTCRNTS-NYKHTCECKPGFNNLLN  149 (217)
Q Consensus       114 ~~DpC~~---~~Cg~GtC~~~~-~~sY~C~C~~Gy~n~~n  149 (217)
                      .++||..   ++|-+|+|.--. ...+.|.|++||+|...
T Consensus        82 hf~pC~~eyk~YCiHG~C~yI~dL~epsCrC~~GYtG~RC  121 (126)
T PHA02887         82 FFEKCKNDFNDFCINGECMNIIDLDEKFCICNKGYTGIRC  121 (126)
T ss_pred             CccccChHhhCEeeCCEEEccccCCCceeECCCCcccCCC
Confidence            3567754   579999997532 34799999999999853


No 39 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=89.18  E-value=0.11  Score=33.05  Aligned_cols=25  Identities=24%  Similarity=0.854  Sum_probs=17.5

Q ss_pred             CCC-CCeeecCCCCCCcccccCCCCCccC
Q 027901           44 DCG-KGKCRADMTHPFNFRCECEPGWKKT   71 (217)
Q Consensus        44 pC~-~GtC~~~~~~~~~Y~C~C~pGwtG~   71 (217)
                      .|. +-+|++.   ..+|+|+|.+||+|.
T Consensus         7 ~C~~nA~C~~~---~~~~~C~C~~Gy~Gd   32 (36)
T PF12947_consen    7 GCHPNATCTNT---GGSYTCTCKPGYEGD   32 (36)
T ss_dssp             GS-TTCEEEE----TTSEEEEE-CEEECC
T ss_pred             CCCCCcEeecC---CCCEEeECCCCCccC
Confidence            354 4689887   458999999999984


No 40 
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=86.69  E-value=0.42  Score=50.60  Aligned_cols=38  Identities=32%  Similarity=0.762  Sum_probs=31.9

Q ss_pred             CCCCcCCCCCC-CeeecCCCCCCcccccCC-CCCccCCCCCCC
Q 027901           37 DKMCEKVDCGK-GKCRADMTHPFNFRCECE-PGWKKTKDNDED   77 (217)
Q Consensus        37 ~d~C~~~pC~~-GtC~~~~~~~~~Y~C~C~-pGwtG~~c~~~~   77 (217)
                      ..+|+++||+| |+|.+.   -+.|.|+|. .||.|..|+-+.
T Consensus       623 ~~~C~~nPC~N~g~C~eg---wNrfiCDCs~T~~~G~~CerE~  662 (1591)
T KOG3514|consen  623 EKICESNPCQNGGKCSEG---WNRFICDCSGTGFEGRTCEREA  662 (1591)
T ss_pred             ccccCCCcccCCCCcccc---ccccccccccCcccCcccccee
Confidence            45899999997 799988   467999996 899999998543


No 41 
>smart00051 DSL delta serrate ligand.
Probab=85.24  E-value=0.71  Score=32.88  Aligned_cols=13  Identities=23%  Similarity=0.462  Sum_probs=10.2

Q ss_pred             cccCCCCCccCCC
Q 027901           61 RCECEPGWKKTKD   73 (217)
Q Consensus        61 ~C~C~pGwtG~~c   73 (217)
                      .|.|.+||+|.+|
T Consensus        51 ~~~C~~Gw~G~~C   63 (63)
T smart00051       51 NKGCLEGWMGPYC   63 (63)
T ss_pred             CEecCCCCcCCCC
Confidence            3559999999865


No 42 
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=85.14  E-value=1.1  Score=36.67  Aligned_cols=28  Identities=25%  Similarity=0.649  Sum_probs=22.5

Q ss_pred             CccCCCeEeeC-CCCceeeecCCCCccCC
Q 027901          121 IYCGEGTCRNT-SNYKHTCECKPGFNNLL  148 (217)
Q Consensus       121 ~~Cg~GtC~~~-~~~sY~C~C~~Gy~n~~  148 (217)
                      ++|.+|+|.-- +...|.|+|..||+|..
T Consensus        51 ~YClHG~C~yI~dl~~~~CrC~~GYtGeR   79 (139)
T PHA03099         51 GYCLHGDCIHARDIDGMYCRCSHGYTGIR   79 (139)
T ss_pred             CEeECCEEEeeccCCCceeECCCCccccc
Confidence            67999999753 23489999999999873


No 43 
>PF07172 GRP:  Glycine rich protein family;  InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=83.99  E-value=0.76  Score=35.40  Aligned_cols=18  Identities=39%  Similarity=0.434  Sum_probs=12.3

Q ss_pred             CCccchhHHHHHHHHHhhh
Q 027901            1 MAAFKPMAFLALLVVLLPT   19 (217)
Q Consensus         1 m~~~~~~~~~~~~~~~~~~   19 (217)
                      || +|.+.||+|||.++.+
T Consensus         1 Ma-SK~~llL~l~LA~lLl   18 (95)
T PF07172_consen    1 MA-SKAFLLLGLLLAALLL   18 (95)
T ss_pred             Cc-hhHHHHHHHHHHHHHH
Confidence            88 8888777777633333


No 44 
>PF14670 FXa_inhibition:  Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=83.71  E-value=0.42  Score=30.51  Aligned_cols=21  Identities=38%  Similarity=1.016  Sum_probs=16.0

Q ss_pred             eeecCCCCCCcccccCCCCCccCC
Q 027901           49 KCRADMTHPFNFRCECEPGWKKTK   72 (217)
Q Consensus        49 tC~~~~~~~~~Y~C~C~pGwtG~~   72 (217)
                      .|++.   +++|+|.|.+||+-..
T Consensus        11 ~C~~~---~g~~~C~C~~Gy~L~~   31 (36)
T PF14670_consen   11 ICVNT---PGSYRCSCPPGYKLAE   31 (36)
T ss_dssp             EEEEE---TTSEEEE-STTEEE-T
T ss_pred             CCccC---CCceEeECCCCCEECc
Confidence            68877   6789999999998754


No 45 
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=83.16  E-value=2.4  Score=47.20  Aligned_cols=90  Identities=22%  Similarity=0.413  Sum_probs=53.0

Q ss_pred             ccccCCCCCccCCCCCCC--------CCCCCCCccC-------CCCcCCCCccccCCCCC-------CCC-CCCCC-CCC
Q 027901           60 FRCECEPGWKKTKDNDED--------NDHSFLPCII-------PDCTLHYDSCHTAPPPD-------PDK-VPHNI-SVF  115 (217)
Q Consensus        60 Y~C~C~pGwtG~~c~~~~--------~~~~~~PC~~-------~~Ct~~~gsC~~~~~~~-------~~g-~g~n~-~~~  115 (217)
                      -.|.|..||+|..|+.=.        ..-++.||+.       ..|+...|.|.+...-+       ..| +|-.. ...
T Consensus       695 e~c~C~~g~tG~~Ce~C~~gfrr~~~~~~~~~~c~~C~cngh~~~Cd~~tG~C~C~~~t~G~~C~~C~~GfYg~~~~~~~  774 (1705)
T KOG1836|consen  695 EQCTCPVGYTGQFCESCAPGFRRLSPQLGPFCPCIPCDCNGHSNICDPRTGQCKCKHNTFGGQCAQCVDGFYGLPDLGTS  774 (1705)
T ss_pred             hhccCCCCcccchhhhcchhhhcccccCCCCCcccccccCCccccccCCCCceecccCCCCCchhhhcCCCCCccccCCC
Confidence            349999999999986211        1223345441       23444444554322221       334 33332 223


Q ss_pred             CCCCCCccC-CCeEeeCC-CCceeee-cCCCCccCCC
Q 027901          116 EPCSWIYCG-EGTCRNTS-NYKHTCE-CKPGFNNLLN  149 (217)
Q Consensus       116 DpC~~~~Cg-~GtC~~~~-~~sY~C~-C~~Gy~n~~n  149 (217)
                      +.|..-+|- +|.|..+. ..++.|+ |++||+|+..
T Consensus       775 ~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rC  811 (1705)
T KOG1836|consen  775 GDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRC  811 (1705)
T ss_pred             CCCccCCCCCChhhcCcCcccceecCCCCCCCccccc
Confidence            339888887 45887664 4579999 9999999863


No 46 
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=82.12  E-value=0.95  Score=38.60  Aligned_cols=42  Identities=21%  Similarity=0.540  Sum_probs=29.2

Q ss_pred             ccccCCCCCCCCCc-CCCCCC------CeeecCCCCCCcccccCCCCCccCC
Q 027901           28 LAPALSPFFDKMCE-KVDCGK------GKCRADMTHPFNFRCECEPGWKKTK   72 (217)
Q Consensus        28 ~s~~~~~~~~d~C~-~~pC~~------GtC~~~~~~~~~Y~C~C~pGwtG~~   72 (217)
                      |..++..+....|. ..+|..      .+|...   .++|.|.|.+||+...
T Consensus       172 l~~~~~~l~~~~C~~~~~C~~~~~~c~~~C~~~---~g~~~c~c~~g~~~~~  220 (224)
T cd01475         172 IEELTKKFQGKICVVPDLCATLSHVCQQVCIST---PGSYLCACTEGYALLE  220 (224)
T ss_pred             HHHHhhhcccccCcCchhhcCCCCCccceEEcC---CCCEEeECCCCccCCC
Confidence            56666666777785 334532      257765   6789999999998753


No 47 
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=80.96  E-value=1.4  Score=47.17  Aligned_cols=47  Identities=19%  Similarity=0.424  Sum_probs=38.9

Q ss_pred             ccccCCCCCCCCCcCCCCCC-CeeecCCCCCCcccccCC-CCCccCCCCCCC
Q 027901           28 LAPALSPFFDKMCEKVDCGK-GKCRADMTHPFNFRCECE-PGWKKTKDNDED   77 (217)
Q Consensus        28 ~s~~~~~~~~d~C~~~pC~~-GtC~~~~~~~~~Y~C~C~-pGwtG~~c~~~~   77 (217)
                      ..+.++|--...|.+.+|.| |+|++.   -.+|+|+|. .-|+|+.|..|+
T Consensus       946 ~~~gv~~GC~GhCss~~C~NGG~Cver---y~gytCDCs~Tay~Gp~Cs~ei  994 (1306)
T KOG3516|consen  946 GTAGVSPGCEGHCSSYPCLNGGHCVER---YDGYTCDCSRTAYDGPFCSKEI  994 (1306)
T ss_pred             cCCcccCCCccccccccccCCCEEEEe---cCceeeccccCcCCCCcccccc
Confidence            55677787788899999997 699988   458999996 789999997553


No 48 
>PF13980 UPF0370:  Uncharacterised protein family (UPF0370)
Probab=78.87  E-value=1.8  Score=31.01  Aligned_cols=14  Identities=29%  Similarity=0.581  Sum_probs=12.3

Q ss_pred             HHHHHHHHHHHHHh
Q 027901          202 HWMSILIMSMVIAI  215 (217)
Q Consensus       202 ~~~~~~~~~~~~~~  215 (217)
                      +|||||++.+||++
T Consensus         7 YWWiiLl~lvG~i~   20 (63)
T PF13980_consen    7 YWWIILLILVGMII   20 (63)
T ss_pred             HHHHHHHHHHHHHH
Confidence            79999999999875


No 49 
>PF12946 EGF_MSP1_1:  MSP1 EGF domain 1;  InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=78.65  E-value=1.4  Score=28.58  Aligned_cols=30  Identities=20%  Similarity=0.577  Sum_probs=19.6

Q ss_pred             CCCCccC-CCeEeeCCCCceeeecCCCCccC
Q 027901          118 CSWIYCG-EGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       118 C~~~~Cg-~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      |....|- +..|.+...+++.|+|..||.-.
T Consensus         2 C~~~~cP~NA~C~~~~dG~eecrCllgyk~~   32 (37)
T PF12946_consen    2 CIDTKCPANAGCFRYDDGSEECRCLLGYKKV   32 (37)
T ss_dssp             -SSS---TTEEEEEETTSEEEEEE-TTEEEE
T ss_pred             ccCccCCCCcccEEcCCCCEEEEeeCCcccc
Confidence            3344554 56799887679999999999854


No 50 
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=73.73  E-value=2.9  Score=31.96  Aligned_cols=31  Identities=32%  Similarity=0.920  Sum_probs=24.0

Q ss_pred             CCCCCCC-CccC-CCeEeeCCCCceeeecCCCCcc
Q 027901          114 VFEPCSW-IYCG-EGTCRNTSNYKHTCECKPGFNN  146 (217)
Q Consensus       114 ~~DpC~~-~~Cg-~GtC~~~~~~sY~C~C~~Gy~n  146 (217)
                      ..|+|+. ..|| .|.|..+  ....|+|.+||.=
T Consensus        76 p~d~Cd~y~~CG~~g~C~~~--~~~~C~Cl~GF~P  108 (110)
T PF00954_consen   76 PKDQCDVYGFCGPNGICNSN--NSPKCSCLPGFEP  108 (110)
T ss_pred             cccCCCCccccCCccEeCCC--CCCceECCCCcCC
Confidence            4589975 6799 6999654  3678999999963


No 51 
>PF04863 EGF_alliinase:  Alliinase EGF-like domain;  InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=68.21  E-value=1.3  Score=31.24  Aligned_cols=35  Identities=17%  Similarity=0.403  Sum_probs=17.5

Q ss_pred             CCCC-CCCeee-cCCCCCCcccccCCCCCccCCCCCC
Q 027901           42 KVDC-GKGKCR-ADMTHPFNFRCECEPGWKKTKDNDE   76 (217)
Q Consensus        42 ~~pC-~~GtC~-~~~~~~~~Y~C~C~pGwtG~~c~~~   76 (217)
                      .++| +||+.. +.....+.-.|+|+.-|+|++|.+.
T Consensus        16 ai~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~   52 (56)
T PF04863_consen   16 AISCSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTL   52 (56)
T ss_dssp             TS--TTSEE--TTS-EETTEE--EE-TTEESTTS-EE
T ss_pred             cCCcCCCCeeeeccccccCCccccccCCcCCCCcccC
Confidence            3467 478766 3221234579999999999998743


No 52 
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=67.40  E-value=4.7  Score=43.35  Aligned_cols=39  Identities=18%  Similarity=0.514  Sum_probs=31.4

Q ss_pred             CCCCCCCCCCccCC-CeEeeCCCCceeeecC-CCCccCCCCC
Q 027901          112 ISVFEPCSWIYCGE-GTCRNTSNYKHTCECK-PGFNNLLNTS  151 (217)
Q Consensus       112 ~~~~DpC~~~~Cg~-GtC~~~~~~sY~C~C~-~Gy~n~~n~t  151 (217)
                      +...|.|..++|.+ |.|.-+ ...|.|.|. .||+|....+
T Consensus       542 C~i~drClPN~CehgG~C~Qs-~~~f~C~C~~TGY~GatCHt  582 (1306)
T KOG3516|consen  542 CGISDRCLPNPCEHGGKCSQS-WDDFECNCELTGYKGATCHT  582 (1306)
T ss_pred             cccccccCCccccCCCccccc-ccceeEeccccccccccccC
Confidence            34679999999995 699874 458999999 9999985544


No 53 
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=63.38  E-value=7.2  Score=29.75  Aligned_cols=31  Identities=29%  Similarity=0.847  Sum_probs=23.5

Q ss_pred             CCCCcC-CCCC-CCeeecCCCCCCcccccCCCCCccC
Q 027901           37 DKMCEK-VDCG-KGKCRADMTHPFNFRCECEPGWKKT   71 (217)
Q Consensus        37 ~d~C~~-~pC~-~GtC~~~~~~~~~Y~C~C~pGwtG~   71 (217)
                      .|.|+. ..|| +|.|...    ....|+|.+||+-+
T Consensus        77 ~d~Cd~y~~CG~~g~C~~~----~~~~C~Cl~GF~P~  109 (110)
T PF00954_consen   77 KDQCDVYGFCGPNGICNSN----NSPKCSCLPGFEPK  109 (110)
T ss_pred             ccCCCCccccCCccEeCCC----CCCceECCCCcCCC
Confidence            578984 6898 5999643    24579999999854


No 54 
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=59.89  E-value=22  Score=39.90  Aligned_cols=37  Identities=22%  Similarity=0.671  Sum_probs=29.8

Q ss_pred             CcCCCCCC-CeeecCCCCCCccccc-CCCCCccCCCCCCC
Q 027901           40 CEKVDCGK-GKCRADMTHPFNFRCE-CEPGWKKTKDNDED   77 (217)
Q Consensus        40 C~~~pC~~-GtC~~~~~~~~~Y~C~-C~pGwtG~~c~~~~   77 (217)
                      |+.=+|-+ |.|....+ +..+.|+ |.+||+|.+|+..+
T Consensus       777 C~~C~Cp~~~~~~~~~~-~~~~iCk~Cp~gytG~rCe~c~  815 (1705)
T KOG1836|consen  777 CQPCPCPNGGACGQTPE-ILEVVCKNCPPGYTGLRCEECA  815 (1705)
T ss_pred             CccCCCCCChhhcCcCc-ccceecCCCCCCCcccccccCC
Confidence            88888975 67887754 5679999 99999999998543


No 55 
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=59.78  E-value=19  Score=35.64  Aligned_cols=25  Identities=20%  Similarity=0.489  Sum_probs=19.6

Q ss_pred             CeeecCCCCCCcccccCCCCCccCCCC
Q 027901           48 GKCRADMTHPFNFRCECEPGWKKTKDN   74 (217)
Q Consensus        48 GtC~~~~~~~~~Y~C~C~pGwtG~~c~   74 (217)
                      -.|+-..  .+.++|+|..+-+|+.|+
T Consensus       285 s~Cv~d~--~~~ltCdC~HNTaGPdCg  309 (592)
T KOG3512|consen  285 SRCVMDE--SSHLTCDCEHNTAGPDCG  309 (592)
T ss_pred             ceeeecc--CCceEEecccCCCCCCcc
Confidence            3588553  345999999999999986


No 56 
>PRK13664 hypothetical protein; Provisional
Probab=56.42  E-value=10  Score=27.04  Aligned_cols=14  Identities=14%  Similarity=0.522  Sum_probs=10.4

Q ss_pred             HHHH-HHHHHHHHHh
Q 027901          202 HWMS-ILIMSMVIAI  215 (217)
Q Consensus       202 ~~~~-~~~~~~~~~~  215 (217)
                      +||| ||++.+||++
T Consensus         7 yWWilill~lvG~i~   21 (62)
T PRK13664          7 YWWILVLVFLVGVLL   21 (62)
T ss_pred             HHHHHHHHHHHHHHH
Confidence            4555 7888888875


No 57 
>PF12955 DUF3844:  Domain of unknown function (DUF3844);  InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=55.93  E-value=9.1  Score=30.09  Aligned_cols=33  Identities=33%  Similarity=0.786  Sum_probs=21.7

Q ss_pred             CCCc--CCCC-CCCeeecCCCC--CCcccccCCCCCcc
Q 027901           38 KMCE--KVDC-GKGKCRADMTH--PFNFRCECEPGWKK   70 (217)
Q Consensus        38 d~C~--~~pC-~~GtC~~~~~~--~~~Y~C~C~pGwtG   70 (217)
                      +.|+  .+.| +||.|+.....  ..=|.|+|.+.+..
T Consensus         6 ~aC~~~Tn~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~   43 (103)
T PF12955_consen    6 DACENATNNCSGHGSCVKKYGSGGGDCFACKCKPTVVK   43 (103)
T ss_pred             HHHHHhccCCCCCceEeeccCCCccceEEEEeeccccc
Confidence            4454  4468 48999987422  13599999995443


No 58 
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=55.18  E-value=27  Score=35.95  Aligned_cols=21  Identities=38%  Similarity=1.160  Sum_probs=15.5

Q ss_pred             cC-CCeEeeCCCCceeeecCCCCccC
Q 027901          123 CG-EGTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       123 Cg-~GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      |. +|.|.+    .++|.|.+||...
T Consensus       632 C~g~GVCnn----~~~ChC~~gwapp  653 (716)
T KOG3607|consen  632 CNGHGVCNN----ELNCHCEPGWAPP  653 (716)
T ss_pred             cCCCcccCC----CcceeeCCCCCCC
Confidence            54 466643    4789999999876


No 59 
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=55.00  E-value=47  Score=28.97  Aligned_cols=33  Identities=30%  Similarity=0.793  Sum_probs=21.6

Q ss_pred             eeeecCCCCccCCCCCC-CCCc--cCCCCCCCccCC
Q 027901          136 HTCECKPGFNNLLNTSY-FPCF--SNCTLGADCEKL  168 (217)
Q Consensus       136 Y~C~C~~Gy~n~~n~t~-~pC~--~~C~~G~dC~~l  168 (217)
                      -.|.|.+||++...... ..|.  ..|.+|+.|...
T Consensus       162 ~~c~c~~g~~g~~~~~~~~~c~~~~~~~~g~~C~~~  197 (316)
T KOG1218|consen  162 GICTCQPGFVGVFCVESCSGCSPLTACENGAKCNRS  197 (316)
T ss_pred             CceeccCCcccccccccCCCcCCCcccCCCCeeecc
Confidence            45778888888865554 3366  566666667655


No 60 
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=50.21  E-value=9.5  Score=25.32  Aligned_cols=16  Identities=25%  Similarity=0.617  Sum_probs=14.0

Q ss_pred             cccccCCCCCccCCCC
Q 027901           59 NFRCECEPGWKKTKDN   74 (217)
Q Consensus        59 ~Y~C~C~pGwtG~~c~   74 (217)
                      .-+|.|.+||+|.+|+
T Consensus        18 ~G~C~C~~~~~G~~C~   33 (50)
T cd00055          18 TGQCECKPNTTGRRCD   33 (50)
T ss_pred             CCEEeCCCcCCCCCCC
Confidence            3589999999999986


No 61 
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=49.83  E-value=30  Score=37.82  Aligned_cols=38  Identities=24%  Similarity=0.598  Sum_probs=26.9

Q ss_pred             CCCCcCCCCCCC---------eeecCCCCCCcccccCCCCCccCCCCC
Q 027901           37 DKMCEKVDCGKG---------KCRADMTHPFNFRCECEPGWKKTKDND   75 (217)
Q Consensus        37 ~d~C~~~pC~~G---------tC~~~~~~~~~Y~C~C~pGwtG~~c~~   75 (217)
                      +..|..-||-.|         +|.-. +......|.|++||+|.+|+.
T Consensus       903 g~~CrPCpCP~gp~Sg~~~A~sC~~d-~~t~~ivC~C~~GY~G~RCe~  949 (1758)
T KOG0994|consen  903 GIGCRPCPCPDGPASGRQHADSCYLD-TRTQQIVCHCQEGYSGSRCEI  949 (1758)
T ss_pred             CCCCCCCCCCCCCccchhcccccccc-ccccceeeecccCccccchhh
Confidence            556877788653         35533 234567999999999999984


No 62 
>PF00053 Laminin_EGF:  Laminin EGF-like (Domains III and V);  InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below.  +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain  In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=49.62  E-value=7.4  Score=25.52  Aligned_cols=22  Identities=23%  Similarity=0.632  Sum_probs=17.1

Q ss_pred             CeeecCCCCCCcccccCCCCCccCCCC
Q 027901           48 GKCRADMTHPFNFRCECEPGWKKTKDN   74 (217)
Q Consensus        48 GtC~~~~~~~~~Y~C~C~pGwtG~~c~   74 (217)
                      .+|...     ..+|.|.++|+|.+|+
T Consensus        11 ~~C~~~-----~G~C~C~~~~~G~~C~   32 (49)
T PF00053_consen   11 QTCDPS-----TGQCVCKPGTTGPRCD   32 (49)
T ss_dssp             SSEEET-----CEEESBSTTEESTTS-
T ss_pred             CcccCC-----CCEEeccccccCCcCc
Confidence            367653     4799999999999987


No 63 
>KOG3509 consensus Basement membrane-specific heparan sulfate proteoglycan (HSPG) core protein [Posttranslational modification, protein turnover, chaperones]
Probab=48.62  E-value=29  Score=36.92  Aligned_cols=36  Identities=19%  Similarity=0.588  Sum_probs=29.0

Q ss_pred             CCCCCcCCCCCCC-eeecCCCCCCcccccCCCCCccCCCC
Q 027901           36 FDKMCEKVDCGKG-KCRADMTHPFNFRCECEPGWKKTKDN   74 (217)
Q Consensus        36 ~~d~C~~~pC~~G-tC~~~~~~~~~Y~C~C~pGwtG~~c~   74 (217)
                      .++.|...+|.+. -|-..   .....|.|.+||+|..|+
T Consensus       405 ~g~~c~~~p~~~~g~c~p~---~~~~~c~c~~g~~G~~c~  441 (964)
T KOG3509|consen  405 LGDVCWRIPCQHDGPCLQT---LEGKQCLCPPGYTGDSCE  441 (964)
T ss_pred             CCCccccccCCCCcccccc---ccccceeccccccCchhh
Confidence            3678888999874 56655   567899999999999886


No 64 
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=48.01  E-value=11  Score=24.81  Aligned_cols=15  Identities=27%  Similarity=0.667  Sum_probs=13.5

Q ss_pred             ccccCCCCCccCCCC
Q 027901           60 FRCECEPGWKKTKDN   74 (217)
Q Consensus        60 Y~C~C~pGwtG~~c~   74 (217)
                      -.|+|.+||+|.+|+
T Consensus        18 G~C~C~~~~~G~~C~   32 (46)
T smart00180       18 GQCECKPNVTGRRCD   32 (46)
T ss_pred             CEEECCCCCCCCCCC
Confidence            589999999999886


No 65 
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=44.70  E-value=25  Score=26.65  Aligned_cols=27  Identities=30%  Similarity=0.759  Sum_probs=20.7

Q ss_pred             CCCCCccCCC-eEeeCCCCceeeecCCC
Q 027901          117 PCSWIYCGEG-TCRNTSNYKHTCECKPG  143 (217)
Q Consensus       117 pC~~~~Cg~G-tC~~~~~~sY~C~C~~G  143 (217)
                      ||....|+.| +|+-+..+.-+|.|.+-
T Consensus         1 pC~~v~C~~G~~C~~d~~~~p~CvC~~~   28 (86)
T cd01328           1 PCENHHCGAGKVCEVDDENTPKCVCIDP   28 (86)
T ss_pred             CCCCcCCCCCCEeeECCCCCeEEecCCc
Confidence            6788889977 89865455789999754


No 66 
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=39.00  E-value=96  Score=34.23  Aligned_cols=89  Identities=19%  Similarity=0.381  Sum_probs=47.6

Q ss_pred             ccccCCCCCccCCCCCCC-CCCCCCCccCC-------CCcCCCCccc---cCCCCC-----CCC-CCCC-CCCCCCCCCC
Q 027901           60 FRCECEPGWKKTKDNDED-NDHSFLPCIIP-------DCTLHYDSCH---TAPPPD-----PDK-VPHN-ISVFEPCSWI  121 (217)
Q Consensus        60 Y~C~C~pGwtG~~c~~~~-~~~~~~PC~~~-------~Ct~~~gsC~---~~~~~~-----~~g-~g~n-~~~~DpC~~~  121 (217)
                      -.|.|.+|--|.+|+.-+ -.+-|..|..=       .|..-.|.|.   +.-...     ..| .|.- ...-++|..-
T Consensus       830 GQC~C~~g~ygrqCnqCqpG~WgFPeCr~CqCNgHA~~Cd~~tGaCi~CqD~T~G~~CdrCl~GyyGdP~lg~g~~CrPC  909 (1758)
T KOG0994|consen  830 GQCQCRPGTYGRQCNQCQPGYWGFPECRPCQCNGHADTCDPITGACIDCQDSTTGHSCDRCLDGYYGDPRLGSGIGCRPC  909 (1758)
T ss_pred             cceeeccccchhhccccCCCccCCCcCccccccCcccccCccccccccccccccccchhhhhccccCCcccCCCCCCCCC
Confidence            367788888887776322 14556544321       3433333332   221111     233 2222 2234677777


Q ss_pred             ccCCC---------eEeeC-CCCceeeecCCCCccCC
Q 027901          122 YCGEG---------TCRNT-SNYKHTCECKPGFNNLL  148 (217)
Q Consensus       122 ~Cg~G---------tC~~~-~~~sY~C~C~~Gy~n~~  148 (217)
                      ||-+|         +|.-. ....-.|.|++||+|..
T Consensus       910 pCP~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~R  946 (1758)
T KOG0994|consen  910 PCPDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGSR  946 (1758)
T ss_pred             CCCCCCccchhccccccccccccceeeecccCccccc
Confidence            77532         47532 23468899999999985


No 67 
>PF01414 DSL:  Delta serrate ligand;  InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=33.95  E-value=13  Score=26.34  Aligned_cols=11  Identities=36%  Similarity=0.966  Sum_probs=8.3

Q ss_pred             cCCCCCccCCC
Q 027901           63 ECEPGWKKTKD   73 (217)
Q Consensus        63 ~C~pGwtG~~c   73 (217)
                      .|.+||+|.+|
T Consensus        53 ~C~~Gw~G~~C   63 (63)
T PF01414_consen   53 VCLPGWTGPNC   63 (63)
T ss_dssp             EE-TTEESTTS
T ss_pred             CCCCCCcCCCC
Confidence            57899999875


No 68 
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=30.26  E-value=33  Score=35.25  Aligned_cols=16  Identities=38%  Similarity=1.001  Sum_probs=13.6

Q ss_pred             cccccCCCCCccCCCC
Q 027901           59 NFRCECEPGWKKTKDN   74 (217)
Q Consensus        59 ~Y~C~C~pGwtG~~c~   74 (217)
                      .+.|.|.+||.++.|+
T Consensus       641 ~~~ChC~~gwapp~C~  656 (716)
T KOG3607|consen  641 ELNCHCEPGWAPPFCF  656 (716)
T ss_pred             CcceeeCCCCCCCccc
Confidence            4789999999998876


No 69 
>PF09064 Tme5_EGF_like:  Thrombomodulin like fifth domain, EGF-like;  InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=29.42  E-value=41  Score=21.45  Aligned_cols=13  Identities=23%  Similarity=0.513  Sum_probs=10.4

Q ss_pred             cccccCCCCCccC
Q 027901           59 NFRCECEPGWKKT   71 (217)
Q Consensus        59 ~Y~C~C~pGwtG~   71 (217)
                      .+.|.|..||--.
T Consensus        17 ~~~C~CPeGyIld   29 (34)
T PF09064_consen   17 PGQCFCPEGYILD   29 (34)
T ss_pred             CCceeCCCceEec
Confidence            4699999999653


No 70 
>PF01683 EB:  EB module;  InterPro: IPR006149  The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO 
Probab=29.12  E-value=74  Score=20.85  Aligned_cols=20  Identities=35%  Similarity=1.041  Sum_probs=14.0

Q ss_pred             cCC-CeEeeCCCCceeeecCCCCccC
Q 027901          123 CGE-GTCRNTSNYKHTCECKPGFNNL  147 (217)
Q Consensus       123 Cg~-GtC~~~~~~sY~C~C~~Gy~n~  147 (217)
                      |.+ ..|++.     +|.|++||.-.
T Consensus        28 C~~~s~C~~g-----~C~C~~g~~~~   48 (52)
T PF01683_consen   28 CIGGSVCVNG-----RCQCPPGYVEV   48 (52)
T ss_pred             CCCcCEEcCC-----EeECCCCCEec
Confidence            553 478542     69999999754


No 71 
>KOG1388 consensus Attractin and platelet-activating factor acetylhydrolase [Signal transduction mechanisms; Defense mechanisms]
Probab=27.31  E-value=46  Score=29.47  Aligned_cols=40  Identities=20%  Similarity=0.401  Sum_probs=25.1

Q ss_pred             CcccccCCCCCCCCCcCCCCCCCeeecC------CCCCCccccc-CCCCCccC
Q 027901           26 DDLAPALSPFFDKMCEKVDCGKGKCRAD------MTHPFNFRCE-CEPGWKKT   71 (217)
Q Consensus        26 ~f~s~~~~~~~~d~C~~~pC~~GtC~~~------~~~~~~Y~C~-C~pGwtG~   71 (217)
                      +.|.-++.|  ...|  +  +++.|...      .|.+.++.|+ |-.||.|.
T Consensus        42 ~~W~fl~cP--~~~c--N--Gh~~c~t~~v~~~~~N~~~g~~c~kc~~g~~Gd   88 (217)
T KOG1388|consen   42 EIWRFLFCP--LCQC--N--GHSDCNTQHVCWRCENGTTGAHCEKCIVGFYGD   88 (217)
T ss_pred             chhhhhcCh--HHHh--c--CCCCcccceeeeeccCccccccCCceEEEEEec
Confidence            358888877  2333  3  55555533      2335678888 88888884


No 72 
>PLN03148 Blue copper-like protein; Provisional
Probab=25.92  E-value=65  Score=27.31  Aligned_cols=20  Identities=30%  Similarity=0.408  Sum_probs=9.9

Q ss_pred             CCCc-cCCCCCCCccCCCccCCCC
Q 027901          153 FPCF-SNCTLGADCEKLGIRSSDS  175 (217)
Q Consensus       153 ~pC~-~~C~~G~dC~~lgi~~~~~  175 (217)
                      |.|. .-|..|   ..|-|.+.+.
T Consensus       100 FIcg~ghC~~G---mKl~I~V~~~  120 (167)
T PLN03148        100 FICGNGQCFNG---MKVTILVHPL  120 (167)
T ss_pred             EEcCCCccccC---CEEEEEEcCC
Confidence            4444 345555   3445666554


No 73 
>smart00274 FOLN Follistatin-N-terminal domain-like. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence
Probab=25.30  E-value=86  Score=18.53  Aligned_cols=22  Identities=23%  Similarity=0.564  Sum_probs=14.4

Q ss_pred             CCCCCccCCC-eEeeCCCCceee
Q 027901          117 PCSWIYCGEG-TCRNTSNYKHTC  138 (217)
Q Consensus       117 pC~~~~Cg~G-tC~~~~~~sY~C  138 (217)
                      +|....|..| +|+.+..+.-+|
T Consensus         1 ~C~~v~C~~G~~C~~d~~g~p~C   23 (26)
T smart00274        1 SCRNVQCPFGKVCVVDKGGNARC   23 (26)
T ss_pred             CCCCEECCCCCEEEeCCCCCEEE
Confidence            3666778866 787754446666


No 74 
>PF09289 FOLN:  Follistatin/Osteonectin-like EGF domain;  InterPro: IPR015369 This domain is predominantly found in osteonectin and follistatin. They adopt an EGF-like structure [, ]. Follistatin is involved in diverse activities from embryonic development to cell secretion. ; GO: 0005515 protein binding; PDB: 1LR7_A 1LR8_A 1LR9_A 2ARP_F 3B4V_H 2KCX_A 3SEK_C 2P6A_D 3HH2_C 2B0U_D ....
Probab=22.56  E-value=88  Score=18.05  Aligned_cols=21  Identities=29%  Similarity=0.743  Sum_probs=11.1

Q ss_pred             CCCCccCCC-eEeeCCCCceee
Q 027901          118 CSWIYCGEG-TCRNTSNYKHTC  138 (217)
Q Consensus       118 C~~~~Cg~G-tC~~~~~~sY~C  138 (217)
                      |....|+.| .|.-+..+..+|
T Consensus         1 C~n~~Ck~GKvC~~d~~~~P~C   22 (22)
T PF09289_consen    1 CDNFHCKRGKVCKVDEQGKPHC   22 (22)
T ss_dssp             STT---BTTEEEEEETTTCEEE
T ss_pred             CCCcccCCCCEeeeCCCCCcCC
Confidence            556778877 787644445554


Done!