Query         005509
Match_columns 693
No_of_seqs    457 out of 2553
Neff          7.8 
Searched_HMMs 46136
Date          Fri Mar 29 00:35:38 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/005509.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/005509hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF03016 Exostosin:  Exostosin  100.0   1E-46 2.2E-51  399.9  18.8  283  348-687     2-290 (302)
  2 KOG1021 Acetylglucosaminyltran 100.0   2E-40 4.4E-45  368.2  18.8  296  349-688    71-398 (464)
  3 KOG2264 Exostosin EXT1L [Signa  99.7 1.1E-16 2.3E-21  170.3  12.9  229  398-685   218-474 (907)
  4 KOG1225 Teneurin-1 and related  99.6 6.7E-16 1.4E-20  169.8  10.6  123  114-305   243-365 (525)
  5 KOG1225 Teneurin-1 and related  99.5 7.8E-14 1.7E-18  153.6  11.2  171  111-309   167-343 (525)
  6 KOG1226 Integrin beta subunit   99.4 6.9E-13 1.5E-17  148.0  10.1  143  126-308   468-621 (783)
  7 KOG1022 Acetylglucosaminyltran  99.1 7.3E-10 1.6E-14  118.7  11.3  225  398-687   126-358 (691)
  8 KOG0994 Extracellular matrix g  99.0 7.1E-10 1.5E-14  126.7  11.0  200   96-308   881-1147(1758)
  9 KOG1226 Integrin beta subunit   99.0   5E-10 1.1E-14  125.4   7.9  134  126-309   515-653 (783)
 10 KOG1219 Uncharacterized conser  99.0 5.9E-10 1.3E-14  133.5   7.5  107  121-310  3865-3980(4289)
 11 KOG0994 Extracellular matrix g  98.9 3.1E-09 6.7E-14  121.6   8.7  170  121-308   854-1099(1758)
 12 KOG1836 Extracellular matrix g  98.4 1.4E-06 3.1E-11  108.4  13.3  105   93-202   699-814 (1705)
 13 KOG4289 Cadherin EGF LAG seven  98.2 2.6E-06 5.6E-11  100.1   8.4   73  120-195  1716-1799(2531)
 14 KOG4289 Cadherin EGF LAG seven  98.2 2.2E-06 4.7E-11  100.7   6.6   94  137-275  1220-1318(2531)
 15 KOG1219 Uncharacterized conser  98.1 3.1E-06 6.6E-11  103.0   5.1   68  233-309  3869-3940(4289)
 16 KOG1836 Extracellular matrix g  98.0   5E-05 1.1E-09   95.0  13.2  198   95-309   751-1023(1705)
 17 PF07974 EGF_2:  EGF-like domai  97.9 9.5E-06 2.1E-10   55.4   3.4   27  125-151     6-32  (32)
 18 KOG1217 Fibrillins and related  97.8 0.00015 3.3E-09   81.4  12.8   65  236-310   280-356 (487)
 19 KOG4260 Uncharacterized conser  97.6 9.3E-05   2E-09   73.9   6.6  132  139-301   130-303 (350)
 20 PF07974 EGF_2:  EGF-like domai  97.6 5.2E-05 1.1E-09   51.8   3.0   25  282-306     6-32  (32)
 21 KOG3512 Netrin, axonal chemotr  97.4 0.00096 2.1E-08   71.6  10.8  153  126-312   279-483 (592)
 22 KOG1214 Nidogen and related ba  97.4 0.00049 1.1E-08   78.0   9.0  142  124-305   699-860 (1289)
 23 KOG1217 Fibrillins and related  97.3  0.0023 4.9E-08   71.9  12.8  150  127-306   136-306 (487)
 24 smart00051 DSL delta serrate l  97.1 0.00049 1.1E-08   54.9   3.3   46  258-306    17-63  (63)
 25 PF00008 EGF:  EGF-like domain   97.0 0.00064 1.4E-08   46.6   2.7   27  124-150     3-32  (32)
 26 KOG1214 Nidogen and related ba  96.9  0.0028   6E-08   72.2   8.7  138  121-303   738-908 (1289)
 27 PF12661 hEGF:  Human growth fa  96.2  0.0027 5.8E-08   34.1   1.3   13  139-151     1-13  (13)
 28 PF00852 Glyco_transf_10:  Glyc  96.2  0.0087 1.9E-07   65.0   6.5  129  536-681   141-283 (349)
 29 PF12661 hEGF:  Human growth fa  96.1  0.0026 5.6E-08   34.2   0.9   13  294-306     1-13  (13)
 30 KOG1218 Proteins containing Ca  96.0   0.081 1.8E-06   56.3  12.8  153  134-303    45-209 (316)
 31 cd00055 EGF_Lam Laminin-type e  96.0  0.0074 1.6E-07   45.9   3.3   28  126-153     3-34  (50)
 32 smart00179 EGF_CA Calcium-bind  95.9   0.011 2.4E-07   41.8   3.7   32  121-152     3-39  (39)
 33 PF00008 EGF:  EGF-like domain   95.9  0.0041 8.8E-08   42.6   1.3   24  282-305     4-32  (32)
 34 PF00053 Laminin_EGF:  Laminin   95.8  0.0063 1.4E-07   46.0   2.4   23  131-153    11-33  (49)
 35 smart00051 DSL delta serrate l  95.6    0.01 2.3E-07   47.3   3.0   30  121-151    32-63  (63)
 36 KOG3512 Netrin, axonal chemotr  95.6   0.047   1E-06   59.1   8.6  103   95-202   301-430 (592)
 37 cd00054 EGF_CA Calcium-binding  95.2   0.026 5.6E-07   39.3   3.7   32  121-152     3-38  (38)
 38 KOG4260 Uncharacterized conser  94.9   0.019 4.1E-07   57.9   2.9   45  262-308   132-183 (350)
 39 smart00180 EGF_Lam Laminin-typ  94.7   0.033 7.1E-07   41.5   3.2   23  131-153    11-33  (46)
 40 PF01414 DSL:  Delta serrate li  94.3   0.014 3.1E-07   46.6   0.5   45  257-306    16-63  (63)
 41 cd00053 EGF Epidermal growth f  94.0   0.071 1.5E-06   36.4   3.4   29  124-152     5-36  (36)
 42 smart00181 EGF Epidermal growt  94.0   0.076 1.6E-06   36.6   3.5   27  125-152     6-35  (35)
 43 PHA02887 EGF-like protein; Pro  93.9    0.07 1.5E-06   47.0   3.9   33  121-154    84-124 (126)
 44 smart00179 EGF_CA Calcium-bind  92.9    0.11 2.4E-06   36.5   3.1   26  282-307     9-39  (39)
 45 KOG1218 Proteins containing Ca  92.8     1.1 2.4E-05   47.6  11.9   27  127-154    81-107 (316)
 46 cd00054 EGF_CA Calcium-binding  92.4    0.14 3.1E-06   35.4   3.0   26  282-307     9-38  (38)
 47 PF04863 EGF_alliinase:  Alliin  91.9    0.09   2E-06   40.0   1.5   29  126-154    18-52  (56)
 48 PF07645 EGF_CA:  Calcium-bindi  91.0    0.22 4.8E-06   36.2   2.8   27  121-147     3-34  (42)
 49 cd00053 EGF Epidermal growth f  90.5    0.26 5.7E-06   33.5   2.8   26  282-307     6-36  (36)
 50 cd00055 EGF_Lam Laminin-type e  90.1    0.35 7.5E-06   36.7   3.3   20  289-308    13-34  (50)
 51 KOG2619 Fucosyltransferase [Ca  89.7    0.63 1.4E-05   50.3   6.0  126  536-676   162-293 (372)
 52 PF01414 DSL:  Delta serrate li  89.6    0.17 3.6E-06   40.5   1.2   46  139-199    18-63  (63)
 53 PHA02887 EGF-like protein; Pro  87.8     0.4 8.6E-06   42.4   2.4   26  283-309    93-124 (126)
 54 PF00053 Laminin_EGF:  Laminin   87.5    0.31 6.8E-06   36.7   1.4   28  288-317    11-40  (49)
 55 smart00181 EGF Epidermal growt  86.5    0.66 1.4E-05   31.8   2.6   25  282-307     6-35  (35)
 56 PF12947 EGF_3:  EGF domain;  I  86.2    0.48   1E-05   33.3   1.7   26  125-150     6-33  (36)
 57 PF12955 DUF3844:  Domain of un  84.9    0.55 1.2E-05   41.2   1.8   32  282-313    13-66  (103)
 58 PF12955 DUF3844:  Domain of un  84.8    0.76 1.6E-05   40.3   2.6   31  124-154    12-62  (103)
 59 KOG3607 Meltrins, fertilins an  84.6    0.62 1.4E-05   54.9   2.7   33  121-154   626-658 (716)
 60 KOG3607 Meltrins, fertilins an  83.1     1.1 2.3E-05   53.1   3.7   35  278-312   626-661 (716)
 61 smart00180 EGF_Lam Laminin-typ  82.1     1.5 3.3E-05   32.6   3.0   17  292-308    17-33  (46)
 62 PF04863 EGF_alliinase:  Alliin  81.0    0.74 1.6E-05   35.1   0.9   29  282-310    17-53  (56)
 63 PHA03099 epidermal growth fact  78.6     1.5 3.2E-05   39.6   2.2   29  125-154    51-83  (139)
 64 PHA03099 epidermal growth fact  77.8     1.8 3.9E-05   39.1   2.5   26  284-310    53-84  (139)
 65 PF09064 Tme5_EGF_like:  Thromb  77.4     1.7 3.7E-05   29.9   1.7   22  175-196     6-28  (34)
 66 PF01683 EB:  EB module;  Inter  76.2     2.5 5.5E-05   32.1   2.7   31  266-302    16-46  (52)
 67 PF06247 Plasmod_Pvs28:  Plasmo  75.3     1.6 3.5E-05   42.2   1.6  133  130-305    10-163 (197)
 68 PF12947 EGF_3:  EGF domain;  I  72.6     1.9   4E-05   30.3   1.0   23  283-305     7-33  (36)
 69 PF00534 Glycos_transf_1:  Glyc  72.3     4.4 9.6E-05   38.4   4.0   41  628-669    83-124 (172)
 70 PF07645 EGF_CA:  Calcium-bindi  70.1     2.5 5.4E-05   30.7   1.3   21  282-302    10-34  (42)
 71 PF05686 Glyco_transf_90:  Glyc  69.5     8.1 0.00018   42.7   5.7  110  564-686   153-266 (395)
 72 cd03814 GT1_like_2 This family  60.6       9  0.0002   40.4   3.9   43  627-670   256-299 (364)
 73 cd03802 GT1_AviGT4_like This f  59.2      14 0.00031   38.8   5.1   41  629-670   235-277 (335)
 74 cd03808 GT1_cap1E_like This fa  58.5      13 0.00028   38.7   4.6   42  628-670   254-296 (359)
 75 cd03823 GT1_ExpE7_like This fa  58.4      10 0.00022   39.8   3.8   41  628-669   253-295 (359)
 76 PF01683 EB:  EB module;  Inter  58.3      11 0.00023   28.6   2.9   22  124-147    25-46  (52)
 77 cd03822 GT1_ecORF704_like This  52.0      15 0.00032   38.8   3.8   41  628-669   258-301 (366)
 78 KOG3516 Neurexin IV [Signal tr  51.3     9.4  0.0002   46.6   2.2   35  120-155   545-584 (1306)
 79 cd03798 GT1_wlbH_like This fam  49.1      17 0.00036   38.0   3.6   41  628-669   269-310 (377)
 80 cd03807 GT1_WbnK_like This fam  47.3      19  0.0004   37.7   3.6   40  629-669   260-300 (365)
 81 PF00954 S_locus_glycop:  S-loc  47.2      18 0.00039   32.0   3.0   29  121-149    78-109 (110)
 82 cd03819 GT1_WavL_like This fam  46.0      28  0.0006   36.8   4.8   41  628-669   254-296 (355)
 83 KOG1388 Attractin and platelet  45.6      15 0.00031   36.6   2.2   73  126-204    53-130 (217)
 84 cd03821 GT1_Bme6_like This fam  45.1      22 0.00047   37.3   3.7   41  629-670   273-314 (375)
 85 cd04951 GT1_WbdM_like This fam  44.3      24 0.00051   37.4   3.9   40  629-669   254-294 (360)
 86 KOG3514 Neurexin III-alpha [Si  43.3      15 0.00033   44.3   2.3   41  271-311   617-663 (1591)
 87 PF12662 cEGF:  Complement Clr-  42.9      16 0.00034   23.3   1.3   10  293-302     2-11  (24)
 88 cd03801 GT1_YqgM_like This fam  41.8      27 0.00058   36.3   3.8   41  628-669   266-307 (374)
 89 cd03818 GT1_ExpC_like This fam  41.2      81  0.0018   34.4   7.6   40  629-669   292-332 (396)
 90 PF13692 Glyco_trans_1_4:  Glyc  40.1      44 0.00095   29.9   4.5   40  629-669    62-103 (135)
 91 cd03816 GT1_ALG1_like This fam  37.9      77  0.0017   35.1   6.8   42  628-670   305-350 (415)
 92 TIGR03087 stp1 sugar transfera  37.1      44 0.00096   36.6   4.7   39  630-669   290-330 (397)
 93 KOG0196 Tyrosine kinase, EPH (  36.7      75  0.0016   37.9   6.4   65  127-198   248-320 (996)
 94 cd03805 GT1_ALG2_like This fam  36.4      51  0.0011   35.6   5.1   40  629-669   291-331 (392)
 95 PLN02871 UDP-sulfoquinovose:DA  36.1      34 0.00074   38.5   3.7   40  629-669   323-363 (465)
 96 cd03794 GT1_wbuB_like This fam  36.0      36 0.00078   35.8   3.7   43  628-671   285-333 (394)
 97 cd03804 GT1_wbaZ_like This fam  35.4      41  0.0009   35.8   4.1   41  628-669   252-292 (351)
 98 PRK15427 colanic acid biosynth  34.3 1.5E+02  0.0032   32.8   8.3   45  628-673   289-340 (406)
 99 PRK15484 lipopolysaccharide 1,  34.1      42 0.00091   36.7   3.9   43  628-671   267-311 (380)
100 cd03809 GT1_mtfB_like This fam  34.1      28 0.00061   36.6   2.5   41  628-669   263-304 (365)
101 cd03800 GT1_Sucrose_synthase T  33.4      39 0.00085   36.3   3.5   40  629-669   294-334 (398)
102 PF00919 UPF0004:  Uncharacteri  32.8      40 0.00086   29.4   2.7   32  392-424    12-44  (98)
103 KOG3516 Neurexin IV [Signal tr  32.7      26 0.00057   43.0   2.0   40  278-317   546-591 (1306)
104 cd05844 GT1_like_7 Glycosyltra  29.7      78  0.0017   33.6   5.1   42  629-671   256-304 (367)
105 cd04962 GT1_like_5 This family  28.8      57  0.0012   34.7   3.8   41  629-670   262-303 (371)
106 TIGR03088 stp2 sugar transfera  28.7      53  0.0012   35.3   3.6   41  629-670   264-305 (374)
107 cd03806 GT1_ALG11_like This fa  28.5 2.1E+02  0.0045   31.8   8.3   40  628-668   315-355 (419)
108 cd04955 GT1_like_6 This family  28.4      86  0.0019   33.1   5.1   41  628-669   258-300 (363)
109 smart00672 CAP10 Putative lipo  27.9      84  0.0018   32.6   4.6  105  564-680    79-192 (256)
110 cd03792 GT1_Trehalose_phosphor  27.1      64  0.0014   34.8   3.9   42  628-670   264-306 (372)
111 KOG3514 Neurexin III-alpha [Si  26.0      42 0.00092   40.8   2.2   34  233-275   628-661 (1591)
112 PHA01633 putative glycosyl tra  24.7      83  0.0018   34.0   4.0   40  629-669   215-255 (335)
113 PRK09922 UDP-D-galactose:(gluc  23.5      87  0.0019   33.7   4.0   38  630-668   250-288 (359)
114 cd03795 GT1_like_4 This family  22.8      86  0.0019   32.9   3.8   40  629-669   255-297 (357)
115 PF12946 EGF_MSP1_1:  MSP1 EGF   22.4      71  0.0015   22.7   1.9   24  125-148     5-31  (37)
116 PF00954 S_locus_glycop:  S-loc  20.5      79  0.0017   27.9   2.4   22  282-303    84-108 (110)
117 PF14670 FXa_inhibition:  Coagu  20.4      67  0.0015   22.6   1.5   16  132-147    11-28  (36)
118 cd03820 GT1_amsD_like This fam  20.4   1E+02  0.0023   31.6   3.8   41  628-669   243-284 (348)

No 1  
>PF03016 Exostosin:  Exostosin family;  InterPro: IPR004263 Hereditary multiple exostoses (EXT) is an autosomal dominant disorder that is characterised by the appearance of multiple outgrowths of the long bones (exostoses) at their epiphyses []. Mutations in two homologous genes, EXT1 and EXT2, are responsible for the EXT syndrome. The human and mouse EXT genes have at least two homologs in the invertebrate Caenorhabditis elegans, indicating that they do not function exclusively as regulators of bone growth. EXT1 and EXT2 have both been shown to encode glycosyltransferases involved in the chain elongation step of heparan sulphate biosynthesis [].; GO: 0016020 membrane
Probab=100.00  E-value=1e-46  Score=399.89  Aligned_cols=283  Identities=32%  Similarity=0.559  Sum_probs=207.1

Q ss_pred             ccCceEEeEcCChhhhHHHhhccccccccccccccCcCccccccccchhHHHHHHHHhcCCCcCCCcCCCceEEEeccce
Q 005509          348 KKRPLLYVYDLPPEFNSLLLEGRHYKLECVNRIYNEKNETLWTDMLYGSQMAFYESILASPHRTLNGEEADFFFVPVLDS  427 (693)
Q Consensus       348 ~~~p~IYvYdlP~~fn~~ll~~~~~~~~c~~~~~~~~~~~~w~~~~y~~E~~~~~~L~~s~~rT~dP~eAdlF~VP~~~~  427 (693)
                      .++++||||++|++||.+++...           .......+.+.+|++|.+||++|++|++||.||+|||+||||++.+
T Consensus         2 ~~~lkVYVY~lp~~~~~~~~~~~-----------~~~~~~~~~~~~~~~e~~l~~~l~~s~~~T~dp~eAdlF~vP~~~~   70 (302)
T PF03016_consen    2 HRGLKVYVYPLPPKFNKDLLDPR-----------EDEQCSWYETSQYALEVILHEALLNSPFRTDDPEEADLFFVPFYSS   70 (302)
T ss_pred             CCCCEEEEEeCCccccccceecc-----------ccccCCCcccccchHHHHHHHHHHhCCcEeCCHHHCeEEEEEcccc
Confidence            35789999999999999888321           1122333456799999999999999999999999999999999998


Q ss_pred             eeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCccEEEEeccCCCCccCCcc--ccCceEEeecc
Q 005509          428 CIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRDHIWFFSWDEGACYAPKE--IWNSMMLVHWG  505 (693)
Q Consensus       428 ~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~GrdH~~v~~~d~g~~~~~~~--~~~~~~l~~~g  505 (693)
                      +.......          ........+.+..++.++++++|||||++|+||||+.++|+|.+.....  +.+...++...
T Consensus        71 ~~~~~~~~----------~~~~~~~~~~~~~~~~~~~~~~p~w~r~~G~dH~~~~~~~~g~~~~~~~~~~~~~~~~~~~~  140 (302)
T PF03016_consen   71 CYFHHWWG----------SPNSGADRDSLSDALRHLLASYPYWNRSGGRDHFFVNSHDRGGCSFDRNPRLMNNSIRAVVA  140 (302)
T ss_pred             cccccccC----------CccchhhHHHHHHHHHHHHhcCchhhccCCCCeEEEeccccccccccccHhhhccchhheec
Confidence            87411100          0011123445567788888899999999999999999999888864321  11111111100


Q ss_pred             CCCcCCCcceeeeecCCCcccCcCCCCCCccccCCCceeecCccCCchhhhhccccCCCCCCCceeEEecccCCCCCCCC
Q 005509          506 NTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPEKDLVLPAWKAPDAFVLRSKLWASPREKRKTLFYFNGNLGSAYPNG  585 (693)
Q Consensus       506 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~kDvviP~~~~~~~~~~~~~~~~~~~~~R~~L~~F~G~~~~~~~~~  585 (693)
                                    ...         ....+|+|++||++|++............+..+..+|++|++|+|++...    
T Consensus       141 --------------~~~---------~~~~~~~~~~Di~~P~~~~~~~~~~~~~~~~~~~~~R~~l~~f~g~~~~~----  193 (302)
T PF03016_consen  141 --------------FSS---------FSSSCFRPGFDIVIPPFVPPSSLPDWRPWPQRPPARRPYLLFFAGTIRPS----  193 (302)
T ss_pred             --------------cCC---------CCcCcccCCCCeeccccccccccCCccccccCCccCCceEEEEeeecccc----
Confidence                          000         02358999999999998876543322222345678999999999998642    


Q ss_pred             CCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEecCCchhHHHHhhcCceeeccCCCC-CchhHHHHHhcCceeE
Q 005509          586 RPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVTSLRSENYHEDLSSSVFCGVLPGDG-WSGRMEDSILQGCIPV  664 (693)
Q Consensus       586 r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~p~Gd~-~s~Rl~dAi~~GCIPV  664 (693)
                        ...|++++|+.|+++|++.++.....+       ........+|.+.|++|||||+|+|++ ++.||+|||++|||||
T Consensus       194 --~~~~~~~~r~~l~~~~~~~~~~~~~~~-------~~~~~~~~~~~~~l~~S~FCL~p~G~~~~s~Rl~eal~~GcIPV  264 (302)
T PF03016_consen  194 --SNDYSGGVRQRLLDECKSDPDFRCSDG-------SETCPSPSEYMELLRNSKFCLCPRGDGPWSRRLYEALAAGCIPV  264 (302)
T ss_pred             --ccccchhhhhHHHHhcccCCcceeeec-------ccccccchHHHHhcccCeEEEECCCCCcccchHHHHhhhceeeE
Confidence              111678999999999987654321100       011245567999999999999999997 6899999999999999


Q ss_pred             EEeCCeeec---eecCCCccEEEece
Q 005509          665 VIQVVISSF---LLLCQNGSLKIRNK  687 (693)
Q Consensus       665 iisd~~~~p---~l~~~~fsv~v~~~  687 (693)
                      ||+|++.+|   +|||++|||+|+++
T Consensus       265 ii~d~~~lPf~~~ldw~~fsv~v~~~  290 (302)
T PF03016_consen  265 IISDDYVLPFEDVLDWSRFSVRVPEA  290 (302)
T ss_pred             EecCcccCCcccccCHHHEEEEECHH
Confidence            999999999   79999999999975


No 2  
>KOG1021 consensus Acetylglucosaminyltransferase EXT1/exostosin 1 [Carbohydrate transport and metabolism; Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=100.00  E-value=2e-40  Score=368.20  Aligned_cols=296  Identities=28%  Similarity=0.379  Sum_probs=208.4

Q ss_pred             cCceEEeEcCChhhhHHHhhcccccc--------ccccc---------cccCc-----Ccccc-ccccchhHHHHHHHHh
Q 005509          349 KRPLLYVYDLPPEFNSLLLEGRHYKL--------ECVNR---------IYNEK-----NETLW-TDMLYGSQMAFYESIL  405 (693)
Q Consensus       349 ~~p~IYvYdlP~~fn~~ll~~~~~~~--------~c~~~---------~~~~~-----~~~~w-~~~~y~~E~~~~~~L~  405 (693)
                      ....||+|++|+.|+..++..+....        .|..-         .+..+     ....| .++||++|.+||.+|+
T Consensus        71 ~~~~v~~~~~~~~F~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~~~E~~~~~~~~  150 (464)
T KOG1021|consen   71 AGASVYVYNLPSGFDVSLLLFHKQIPTSPNNKKFMCSYKLNEKRGKVYVYHEGNKPLFHTPSWCLTDQYASEGIFHNRML  150 (464)
T ss_pred             cCcceeeeccchhhhhhhhccCccccccCcchhhhhhhhhhcccCceEEecCCCCccccCCCcccccchhHHHHHHHHHh
Confidence            34578999999999999888754332        22210         11111     12244 5689999999999995


Q ss_pred             --cCCCcCCCcCCCceEEEeccceeeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCccEEEEec
Q 005509          406 --ASPHRTLNGEEADFFFVPVLDSCIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRDHIWFFS  483 (693)
Q Consensus       406 --~s~~rT~dP~eAdlF~VP~~~~~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~GrdH~~v~~  483 (693)
                        .+++||.||++||+||||||+++..+++...+.-.      .+ ....+++++.+..+++++|||||+.|+|||||+.
T Consensus       151 ~~~~~~Rt~dp~~Ad~f~vPf~~~~~~~~~~~~~~~~------~~-~~~~~~~~~~i~~~~~~~p~W~Rs~G~DH~~v~~  223 (464)
T KOG1021|consen  151 RRESAFRTLDPLEADAFYVPFYASLDYNRALLWPDER------VN-AILRSILQDYIVALLSKQPYWNRSSGRDHFFVAC  223 (464)
T ss_pred             cccCceecCChhhCcEEEEcceeeEehhhhcccCCcc------cc-hHHHHHHHHHHHHHHhcCchhhccCCCceEEEeC
Confidence              77999999999999999999999987764433210      01 1123444555556678999999999999999999


Q ss_pred             cCCCCccCCccccCceEEeeccCCCcCCCcceeeeecCCCcccCcCCCCCCccccCC-CceeecCccCCchhhhh--ccc
Q 005509          484 WDEGACYAPKEIWNSMMLVHWGNTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPE-KDLVLPAWKAPDAFVLR--SKL  560 (693)
Q Consensus       484 ~d~g~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~-kDvviP~~~~~~~~~~~--~~~  560 (693)
                      +|++...... .++++++..+..-+    ++..                 ...|.+. +||+||++...++....  .-.
T Consensus       224 ~~~~~~~~~~-~~~~~~~~i~~~~n----~a~l-----------------s~~~~~~~~dv~iP~~~~~~~~~~~~~~~~  281 (464)
T KOG1021|consen  224 HDWGDFRRRS-DWGASISLIPEFCN----GALL-----------------SLEFFPWNKDVAIPYPTIPHPLSPPENSWQ  281 (464)
T ss_pred             Ccchheeecc-chhhHHHHHHhhCC----ccee-----------------ecccccCCCcccCCCccCcCccCccccccc
Confidence            9998875431 22222211111110    0000                 1246777 99999998765543321  123


Q ss_pred             cCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEecCCchhHHHHhhcCce
Q 005509          561 WASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVTSLRSENYHEDLSSSVF  640 (693)
Q Consensus       561 ~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~y~~~l~~S~F  640 (693)
                      ...+..+|++|+||+|+.            ..+.||+.|+++|+++++.+..+.+.   +....+.+...|.+.|++|+|
T Consensus       282 ~~~~~~~R~~L~~F~G~~------------~~~~iR~~L~~~~~~~~~~~~~~~~~---~g~~~~~~~~~y~~~m~~S~F  346 (464)
T KOG1021|consen  282 GGVPFSNRPILAFFAGAP------------AGGQIRSILLDLWKKDPDTEVFVNCP---RGKVSCDRPLNYMEGMQDSKF  346 (464)
T ss_pred             cCCCCCCCceEEEEeccc------------cCCcHHHHHHHHhhcCcCccccccCC---CCccccCCcchHHHHhhcCeE
Confidence            345568999999999983            13569999999999844433222221   111234667899999999999


Q ss_pred             eeccCCCCC-chhHHHHHhcCceeEEEeCCeeec---eecCCCccEEEecee
Q 005509          641 CGVLPGDGW-SGRMEDSILQGCIPVVIQVVISSF---LLLCQNGSLKIRNKF  688 (693)
Q Consensus       641 CL~p~Gd~~-s~Rl~dAi~~GCIPViisd~~~~p---~l~~~~fsv~v~~~~  688 (693)
                      ||+|+||++ ++|+||||++|||||||+|++++|   .+||++|||+|++|.
T Consensus       347 CL~p~Gd~~ts~R~fdai~~gCvPViisd~~~lpf~~~~d~~~fSV~v~~~~  398 (464)
T KOG1021|consen  347 CLCPPGDTPTSPRLFDAIVSGCVPVIISDGIQLPFGDVLDWTEFSVFVPEKD  398 (464)
T ss_pred             EECCCCCCcccHhHHHHHHhCCccEEEcCCcccCcCCCccceEEEEEEEHHH
Confidence            999999975 789999999999999999999998   799999999999653


No 3  
>KOG2264 consensus Exostosin EXT1L [Signal transduction mechanisms]
Probab=99.69  E-value=1.1e-16  Score=170.33  Aligned_cols=229  Identities=21%  Similarity=0.231  Sum_probs=142.4

Q ss_pred             HHHHHHHhcCCCcCCCcCCCceEEEeccceeeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCcc
Q 005509          398 MAFYESILASPHRTLNGEEADFFFVPVLDSCIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRD  477 (693)
Q Consensus       398 ~~~~~~L~~s~~rT~dP~eAdlF~VP~~~~~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~Grd  477 (693)
                      ..|.+.+.+..|.|+||+.|+++++-+=      -. ..|       ..++.   .+     ++.| -++||| |++|+|
T Consensus       218 ~~fq~t~~~n~~~ve~pd~ACiyi~lvg------e~-q~P-------~~l~p---~e-----lekl-yslp~w-~~dg~N  273 (907)
T KOG2264|consen  218 QVFQETIPNNVYLVETPDKACIYIHLVG------EI-QSP-------VVLTP---AE-----LEKL-YSLPHW-RTDGFN  273 (907)
T ss_pred             HHHHHhcccceeEeeCCCccEEEEEEec------cc-cCC-------CcCCh---Hh-----hhhh-hcCccc-cCCCcc
Confidence            4677778888999999999999999771      11 111       11221   11     2233 478999 799999


Q ss_pred             EEEEeccCCCCccCCccccCceEEeeccCCCcCCCcceeeeecCCCcccCcCCCCCCccccCCCceeecCccCCchhhhh
Q 005509          478 HIWFFSWDEGACYAPKEIWNSMMLVHWGNTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPEKDLVLPAWKAPDAFVLR  557 (693)
Q Consensus       478 H~~v~~~d~g~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~kDvviP~~~~~~~~~~~  557 (693)
                      |+++......  +..+.++|    +..|++...         .+..         ...+|||++|+++++...+.....+
T Consensus       274 hvl~Nl~r~s--~~~n~lyn----~~t~raivv---------Qssf---------~~~q~RpgfDl~V~pv~h~~~e~~~  329 (907)
T KOG2264|consen  274 HVLFNLGRPS--DTQNLLYN----FQTGRAIVV---------QSSF---------YTVQIRPGFDLPVDPVNHIAVEKNF  329 (907)
T ss_pred             eEEEEccCcc--ccccceeE----eccCceEEE---------eecc---------eeeeeccCCCcccCcccccccCccc
Confidence            9999543321  11122222    222222100         0000         0127999999999988776655445


Q ss_pred             ccccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEe-------------
Q 005509          558 SKLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVT-------------  624 (693)
Q Consensus       558 ~~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~-------------  624 (693)
                      .++....+.+|++|+.|+|++.+.    +..   -...+... ++...++...   ..+..+-+++.             
T Consensus       330 ~e~~p~vP~~RkyL~t~qgki~~~----~ss---Ln~~~aF~-~e~~adp~~~---a~qds~i~qv~c~~t~k~Qe~~SL  398 (907)
T KOG2264|consen  330 VELTPLVPFQRKYLITLQGKIESD----NSS---LNEFSAFS-EELSADPSRR---AVQDSPIVQVKCSFTCKNQENCSL  398 (907)
T ss_pred             eecCcccchhhheeEEEEeeeccc----ccc---cchhhhhH-HHhccCCccc---ccccCceEEEEEeeccccCCCCCc
Confidence            556566788999999999988652    110   11233322 2233332211   01111111221             


Q ss_pred             -----cCCchhHHHHhhcCceeec-cCCCCC------chhHHHHHhcCceeEEEeCCeeec---eecCCCccEEEe
Q 005509          625 -----SLRSENYHEDLSSSVFCGV-LPGDGW------SGRMEDSILQGCIPVVIQVVISSF---LLLCQNGSLKIR  685 (693)
Q Consensus       625 -----~~~~~~y~~~l~~S~FCL~-p~Gd~~------s~Rl~dAi~~GCIPViisd~~~~p---~l~~~~fsv~v~  685 (693)
                           |...+.-.++|++|+|||+ ||||+-      -.|++||+..||||||+++...+|   .|||.+..+.++
T Consensus       399 pewalcg~~~~RrqLlk~STF~lilpp~d~rv~S~~~~~r~~eaL~~GavPviLg~~~~LPyqd~idWrraal~lP  474 (907)
T KOG2264|consen  399 PEWALCGERERRRQLLKSSTFCLILPPGDPRVISEMFFQRFLEALQLGAVPVILGNSQLLPYQDLIDWRRAALRLP  474 (907)
T ss_pred             chhhhccchHHHHHHhccceeEEEecCCCcchhhHHHHHHHHHHHhcCCeeEEeccccccchHHHHHHHHHhhhCC
Confidence                 2223466799999999995 889865      378999999999999999999888   799999888775


No 4  
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=99.63  E-value=6.7e-16  Score=169.77  Aligned_cols=123  Identities=33%  Similarity=0.824  Sum_probs=100.0

Q ss_pred             ccCccCCCCCCCCCCCCCEEeccCCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCC
Q 005509          114 LVEMIGGKSCKSDCSGQGVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEG  193 (693)
Q Consensus       114 ~~~~~~~~~C~~~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G  193 (693)
                      .++.+....|+..|+++|.|+  .|.|+|++||+|.+|++.   .|+.              .|+++-.+..|+|.|++|
T Consensus       243 ~g~~c~~~~C~~~c~~~g~c~--~G~CIC~~Gf~G~dC~e~---~Cp~--------------~cs~~g~~~~g~CiC~~g  303 (525)
T KOG1225|consen  243 FGPLCSTIYCPGGCTGRGQCV--EGRCICPPGFTGDDCDEL---VCPV--------------DCSGGGVCVDGECICNPG  303 (525)
T ss_pred             eCCccccccCCCCCcccceEe--CCeEeCCCCCcCCCCCcc---cCCc--------------ccCCCceecCCEeecCCC
Confidence            345565678888899999998  899999999999999874   2443              355656667789999999


Q ss_pred             cccCCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccc
Q 005509          194 TKYPNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCE  273 (693)
Q Consensus       194 ~~G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~  273 (693)
                      |+|..|+... |                           +.+|+++|.|             +.++|.|+ +||+|..|+
T Consensus       304 ~~G~dCs~~~-c---------------------------padC~g~G~C-------------i~G~C~C~-~Gy~G~~C~  341 (525)
T KOG1225|consen  304 YSGKDCSIRR-C---------------------------PADCSGHGKC-------------IDGECLCD-EGYTGELCI  341 (525)
T ss_pred             cccccccccc-C---------------------------CccCCCCCcc-------------cCCceEeC-CCCcCCccc
Confidence            9988885431 1                           5678899999             47899999 999999999


Q ss_pred             cccCCccCCCCCCCceeeCCeeecCCCcccCC
Q 005509          274 VPVSSTCVNQCSGHGHCRGGFCQCDSGWYGVD  305 (693)
Q Consensus       274 ~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~  305 (693)
                      +.      . |+++|.|++| |+|+.||.|.+
T Consensus       342 ~~------~-C~~~g~cv~g-C~C~~Gw~G~d  365 (525)
T KOG1225|consen  342 QR------A-CSGGGQCVNG-CKCKKGWRGPD  365 (525)
T ss_pred             cc------c-cCCCceeccC-ceeccCccCCC
Confidence            74      3 9999999999 99999999999


No 5  
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=99.50  E-value=7.8e-14  Score=153.60  Aligned_cols=171  Identities=26%  Similarity=0.469  Sum_probs=123.7

Q ss_pred             cccccCccCCCCCCCCCCCCCEEeccCCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCC--CCcee
Q 005509          111 EVDLVEMIGGKSCKSDCSGQGVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDT--TRAMC  188 (693)
Q Consensus       111 ~~~~~~~~~~~~C~~~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~--~~g~C  188 (693)
                      ..+.++.++...|++.|+.||.+.  .+.|.+..+++|..|...   .|...     ++....-..++..|..  ..+.|
T Consensus       167 ~~~~~~~~g~~~~~~~~~~hg~~~--~~~~l~~~~~s~~~~~~~---~~~~~-----~~~~~r~~~~~~~~~~~~~~~ic  236 (525)
T KOG1225|consen  167 PNPFGAECGQYKCPNDGSGHGRYY--FGNCLSGISASGETCNQL---GCNDD-----CFRTGRCREGRCFCTAGFFDGIC  236 (525)
T ss_pred             CCccccccceecCCcCCCCCccce--ecccccccCcchhhhhcc---cCCcc-----ceeccccccCcccccccccCcee
Confidence            445566677778889999999998  899999999999999763   12210     0110000111122221  23489


Q ss_pred             eeCCCcccCCCCCCCCCCCc-ccCCCCC--CCCCCCCCcCCCCCCc-cCCCCCCCceecCCcccccccccccccccccCC
Q 005509          189 FCGEGTKYPNRPVAEACGFQ-VNLPSQP--GAPKSTDWAKADLDNI-FTTNGSKPGWCNVDPEEAYALKVQFKEECDCKY  264 (693)
Q Consensus       189 ~C~~G~~G~~C~~~~~C~~~-~~~~~~~--~~~C~~gw~g~~c~~~-~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~  264 (693)
                      .|..+|+|+.|.. ..|... .....|.  .|.|++||+|.+|+.. .+.+|++++.|             ++++|+|. 
T Consensus       237 ~c~~~~~g~~c~~-~~C~~~c~~~g~c~~G~CIC~~Gf~G~dC~e~~Cp~~cs~~g~~-------------~~g~CiC~-  301 (525)
T KOG1225|consen  237 ECPEGYFGPLCST-IYCPGGCTGRGQCVEGRCICPPGFTGDDCDELVCPVDCSGGGVC-------------VDGECICN-  301 (525)
T ss_pred             ecCCceeCCcccc-ccCCCCCcccceEeCCeEeCCCCCcCCCCCcccCCcccCCCcee-------------cCCEeecC-
Confidence            9999999999863 233221 1112233  3458999999999963 35558777777             47899999 


Q ss_pred             CCCcCcccccccCCccCCCCCCCceeeCCeeecCCCcccCCCCCC
Q 005509          265 DGLLGQFCEVPVSSTCVNQCSGHGHCRGGFCQCDSGWYGVDCSIP  309 (693)
Q Consensus       265 ~G~~G~~C~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~C~~~  309 (693)
                      +||+|..|++.   .|+.+|+++|.|++|+|+|.+||+|..|+++
T Consensus       302 ~g~~G~dCs~~---~cpadC~g~G~Ci~G~C~C~~Gy~G~~C~~~  343 (525)
T KOG1225|consen  302 PGYSGKDCSIR---RCPADCSGHGKCIDGECLCDEGYTGELCIQR  343 (525)
T ss_pred             CCccccccccc---cCCccCCCCCcccCCceEeCCCCcCCccccc
Confidence            99999999986   7999999999999999999999999999998


No 6  
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=99.40  E-value=6.9e-13  Score=147.96  Aligned_cols=143  Identities=25%  Similarity=0.517  Sum_probs=102.1

Q ss_pred             CCCCCCEEeccCCeEEeCCCccCCCCCccccCcCCCCCC-CCCCCCCccccccCCCCCCCCceeeeCCCcc----cCCCC
Q 005509          126 DCSGQGVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKT-PELPYGRWVVSICPTHCDTTRAMCFCGEGTK----YPNRP  200 (693)
Q Consensus       126 ~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~-~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~----G~~C~  200 (693)
                      .|++||+++  .|+|.|.+||.|..||-..  .+.+... ...|...-....|+|..+|.-|+|.|.+...    |+.|+
T Consensus       468 ~C~g~G~~~--CG~C~C~~G~~G~~CEC~~--~~~ss~~~~~~Cr~~~~~~vCSgrG~C~CGqC~C~~~~~~~i~G~fCE  543 (783)
T KOG1226|consen  468 LCHGNGTFV--CGQCRCDEGWLGKKCECST--DELSSSEEEDKCRENSDSPVCSGRGDCVCGQCVCHKPDNGKIYGKFCE  543 (783)
T ss_pred             ccCCCCcEE--ecceecCCCCCCCcccCCc--cccCcHhHHhhccCCCCCCCcCCCCcEeCCceEecCCCCCceeeeeee
Confidence            599999998  9999999999999999532  1211100 0001111122389999888999999998877    77774


Q ss_pred             CCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCccccccc-CCc
Q 005509          201 VAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPV-SST  279 (693)
Q Consensus       201 ~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~-~~~  279 (693)
                      -.+.                      .|+...+.-|.++|.|.             .|+|.|. +||+|..|+.+. .+.
T Consensus       544 CDnf----------------------sC~r~~g~lC~g~G~C~-------------CG~CvC~-~GwtG~~C~C~~std~  587 (783)
T KOG1226|consen  544 CDNF----------------------SCERHKGVLCGGHGRCE-------------CGRCVCN-PGWTGSACNCPLSTDT  587 (783)
T ss_pred             ccCc----------------------ccccccCcccCCCCeEe-------------CCcEEcC-CCCccCCCCCCCCCcc
Confidence            3211                      12222245577888883             7899999 999999998763 345


Q ss_pred             cCC----CCCCCceeeCCeeecCCC-cccCCCCC
Q 005509          280 CVN----QCSGHGHCRGGFCQCDSG-WYGVDCSI  308 (693)
Q Consensus       280 C~~----~C~~~G~C~~g~C~C~~G-~~G~~C~~  308 (693)
                      |.+    .|+++|+|.-|+|+|... |+|..|+.
T Consensus       588 C~~~~G~iCSGrG~C~Cg~C~C~~~~~sG~~CE~  621 (783)
T KOG1226|consen  588 CESSDGQICSGRGTCECGRCKCTDPPYSGEFCEK  621 (783)
T ss_pred             ccCCCCceeCCCceeeCCceEcCCCCcCcchhhc
Confidence            652    599999999999999766 99999997


No 7  
>KOG1022 consensus Acetylglucosaminyltransferase EXT2/exostosin 2 [Carbohydrate transport and metabolism; Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=99.06  E-value=7.3e-10  Score=118.67  Aligned_cols=225  Identities=16%  Similarity=0.100  Sum_probs=138.3

Q ss_pred             HHHHHHHhcCCCcCCCcCCCceEEEeccceeeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCcc
Q 005509          398 MAFYESILASPHRTLNGEEADFFFVPVLDSCIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRD  477 (693)
Q Consensus       398 ~~~~~~L~~s~~rT~dP~eAdlF~VP~~~~~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~Grd  477 (693)
                      ..+.|+...|.+.|.|+.+|++|.--. .-+  +          +  ..    +..++    -..++++.-.|.|  |.+
T Consensus       126 ~~lleA~~~S~yyt~n~N~aclf~Ps~-d~l--n----------Q--n~----l~~kl----~~~ala~l~~wdr--g~n  180 (691)
T KOG1022|consen  126 IALLEAWHLSFYYTFNYNGACLFMPSS-DEL--N----------Q--NP----LSWKL----EKVALAKLLVWDR--GVN  180 (691)
T ss_pred             HHHHHHHHhccceecCCCceEEEecch-hhh--c----------c--Cc----chHHH----HHHHHhcccchhc--ccc
Confidence            467778888999999999999986433 111  1          1  11    22222    1234456779987  999


Q ss_pred             EEEEeccCCCCccCCccccCceEEeeccCCCcCCCcceeeeecCCCcccCcCCCCCCccccCCCceeecCccCCchhhhh
Q 005509          478 HIWFFSWDEGACYAPKEIWNSMMLVHWGNTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPEKDLVLPAWKAPDAFVLR  557 (693)
Q Consensus       478 H~~v~~~d~g~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~kDvviP~~~~~~~~~~~  557 (693)
                      |..+.-=.-|.-     .+|..+  +.|+-+    .-...-..+.|            .||++.||.||.|......   
T Consensus       181 H~~fnmLpGg~p-----~yntal--dv~~d~----a~~~gggf~tW------------~yr~g~dv~ipv~Sp~~v~---  234 (691)
T KOG1022|consen  181 HEGFNMLPGGDP-----TYNTAL--DVGQDE----AWYSGGGFGTW------------KYRKGNDVYIPVRSPGNVG---  234 (691)
T ss_pred             eeeEeeccCCCC-----Cccccc--cCCcce----eEEecCCcCcc------------cccCCCccccccccccccC---
Confidence            999932222221     112211  111110    00000112345            6899999999999865221   


Q ss_pred             ccccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEe--c--CCchhHHH
Q 005509          558 SKLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVT--S--LRSENYHE  633 (693)
Q Consensus       558 ~~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~--~--~~~~~y~~  633 (693)
                        .......+|..++.-.|.            .|...+|..|.++..........++.+...+....  +  +....|.+
T Consensus       235 --~~~~~~g~r~~~l~~~q~------------n~~pr~r~~l~el~~kh~e~~l~l~~c~nlsl~~r~~~qhH~~~~yp~  300 (691)
T KOG1022|consen  235 --RAFLYDGSRYRVLQDCQE------------NYGPRIRVSLIELLSKHEERELELPFCLNLSLNSRGVRQHHFDVKYPS  300 (691)
T ss_pred             --ccccCCccceeeeecccc------------ccchHhHHhHHHHHhhccceEEecchhccccccccchhhccccccccc
Confidence              112334566655544441            34566888888876555443333333322121111  1  22357999


Q ss_pred             HhhcCceeeccCCCCC-chhHHHHHhcCceeEEEeCCeeec---eecCCCccEEEece
Q 005509          634 DLSSSVFCGVLPGDGW-SGRMEDSILQGCIPVVIQVVISSF---LLLCQNGSLKIRNK  687 (693)
Q Consensus       634 ~l~~S~FCL~p~Gd~~-s~Rl~dAi~~GCIPViisd~~~~p---~l~~~~fsv~v~~~  687 (693)
                      .+...+||+.-++..- ..-+.+-+.++|||||+.|.+.+|   ++||.-.||.++|-
T Consensus       301 ~l~~~~fc~~~R~~r~gq~~lv~~~~a~c~pvi~vd~y~lpf~~Vvdw~~aSv~~~e~  358 (691)
T KOG1022|consen  301 SLEFIGFCDGDRVTRGGQFHLVILGYASCAPVISVDIYLLPFLGVVDWIVASVWCMEY  358 (691)
T ss_pred             ccceeeeEeccccccCCccceehhhhcccceeeeeehhhhhhhhhhhceeeeEEeehh
Confidence            9999999998888544 456999999999999999999999   89999999999885


No 8  
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=99.05  E-value=7.1e-10  Score=126.73  Aligned_cols=200  Identities=24%  Similarity=0.462  Sum_probs=117.9

Q ss_pred             cccCcccccCCCCcccccccCccCCCCC-CCCCCC--------CCEEeccCC----eEEeCCCccCCCCCccccCcCCCC
Q 005509           96 AEIGRWLSGCDSVAKEVDLVEMIGGKSC-KSDCSG--------QGVCNHELG----QCRCFHGFRGKGCSERIHFQCNFP  162 (693)
Q Consensus        96 ~~~g~~~~~c~~~~~~~~~~~~~~~~~C-~~~C~~--------~G~C~~~~G----~C~C~~G~~G~~Ce~~~~~~C~~~  162 (693)
                      .+.|.+|++|..++.+.+.-..  +..| |.+|-.        --.|...+-    .|.|.+||+|..|+.     |.++
T Consensus       881 ~T~G~~CdrCl~GyyGdP~lg~--g~~CrPCpCP~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~RCe~-----CA~~  953 (1758)
T KOG0994|consen  881 STTGHSCDRCLDGYYGDPRLGS--GIGCRPCPCPDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGSRCEI-----CADN  953 (1758)
T ss_pred             cccccchhhhhccccCCcccCC--CCCCCCCCCCCCCccchhccccccccccccceeeecccCccccchhh-----hccc
Confidence            3578889999777765444221  2344 444422        125632222    799999999999997     6654


Q ss_pred             --CCCCCCCCCccccccC--------CCCCCCCceee-------------eCCCcccC----CCCCCCCC------CCcc
Q 005509          163 --KTPELPYGRWVVSICP--------THCDTTRAMCF-------------CGEGTKYP----NRPVAEAC------GFQV  209 (693)
Q Consensus       163 --~~~~~~~g~~~~~~C~--------g~C~~~~g~C~-------------C~~G~~G~----~C~~~~~C------~~~~  209 (693)
                        +.|.. .|.|....|+        +.|+-.+|.|.             |..||.|.    +|..+ .|      ..+.
T Consensus       954 ~fGnP~~-GGtCq~CeC~~NiD~~d~~aCD~~TG~CLkCL~hTeG~hCe~Ck~Gf~GdA~~q~CqrC-~Cn~LGTn~~~~ 1031 (1758)
T KOG0994|consen  954 HFGNPSE-GGTCQKCECSNNIDLYDPGACDVATGACLKCLYHTEGDHCEHCKDGFYGDALRQNCQRC-VCNFLGTNSTCH 1031 (1758)
T ss_pred             ccCCccc-CCccccccccCCcCccCCCccchhhchhhhhhhcccccchhhccccchhHHHHhhhhhh-eccccccCCccc
Confidence              33333 4566666775        55777777664             55555553    22111 01      1122


Q ss_pred             cCCCCCCCCCCCCCcCCCCCCccCCC---CCCCc--eecCCcccccccccc--cccccccCCCCCcCcccccccC-----
Q 005509          210 NLPSQPGAPKSTDWAKADLDNIFTTN---GSKPG--WCNVDPEEAYALKVQ--FKEECDCKYDGLLGQFCEVPVS-----  277 (693)
Q Consensus       210 ~~~~~~~~~C~~gw~g~~c~~~~~~~---C~~~G--~C~~~~~~~~~~~~c--~~g~C~C~~~G~~G~~C~~~~~-----  277 (693)
                      ++.....|+|.++--|..|+.+-.+.   =+++|  .|+.++.   .+-+|  .+|+|.|+ +||.|..|++.-+     
T Consensus      1032 CDr~tGQCpClpNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~~---~~pqCN~ftGQCqCk-pGfGGR~C~qCqel~WGd 1107 (1758)
T KOG0994|consen 1032 CDRFTGQCPCLPNVQGVRCDQCAENHWNLASGEGCEPCNCDPI---GGPQCNEFTGQCQCK-PGFGGRTCSQCQELYWGD 1107 (1758)
T ss_pred             cccccCcCCCCcccccccccccccchhccccCCCCCccCCCcc---CCccccccccceecc-CCCCCcchhHHHHhhcCC
Confidence            34445577889999999888654221   01111  1222210   11122  58899999 9999999987521     


Q ss_pred             --CccC-CCCCCCc----eee--CCeeecCCCcccCCCCC
Q 005509          278 --STCV-NQCSGHG----HCR--GGFCQCDSGWYGVDCSI  308 (693)
Q Consensus       278 --~~C~-~~C~~~G----~C~--~g~C~C~~G~~G~~C~~  308 (693)
                        ..|. -.|...|    .|.  +|+|.|.+|..|..|..
T Consensus      1108 P~~~C~aCdCd~rG~~tpQCdr~tG~C~C~~Gv~G~rCdq 1147 (1758)
T KOG0994|consen 1108 PNEKCRACDCDPRGIETPQCDRATGRCVCRPGVGGPRCDQ 1147 (1758)
T ss_pred             CCCCceecCCCCCCCCCCCccccCCceeecCCCCCcchhh
Confidence              1121 1343333    475  89999999999999987


No 9  
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=99.01  E-value=5e-10  Score=125.43  Aligned_cols=134  Identities=27%  Similarity=0.507  Sum_probs=97.3

Q ss_pred             CCCCCCEEeccCCeEEeCCCcc----CCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCCcccCCCCC
Q 005509          126 DCSGQGVCNHELGQCRCFHGFR----GKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEGTKYPNRPV  201 (693)
Q Consensus       126 ~C~~~G~C~~~~G~C~C~~G~~----G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~G~~C~~  201 (693)
                      .|+|+|.|.  .|+|.|.+...    |+.||-. ...|...          .+..|+|+..|.-|.|+|.+||+|..|.-
T Consensus       515 vCSgrG~C~--CGqC~C~~~~~~~i~G~fCECD-nfsC~r~----------~g~lC~g~G~C~CG~CvC~~GwtG~~C~C  581 (783)
T KOG1226|consen  515 VCSGRGDCV--CGQCVCHKPDNGKIYGKFCECD-NFSCERH----------KGVLCGGHGRCECGRCVCNPGWTGSACNC  581 (783)
T ss_pred             CcCCCCcEe--CCceEecCCCCCceeeeeeecc-Ccccccc----------cCcccCCCCeEeCCcEEcCCCCccCCCCC
Confidence            699999999  99999999887    9999853 2334322          23478877777889999999999998843


Q ss_pred             CCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccccccCCccC
Q 005509          202 AEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSSTCV  281 (693)
Q Consensus       202 ~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~~C~  281 (693)
                      ..      ...              .|....+..|+++|.|             .-|+|.|.-++|.|.+||..  ..|+
T Consensus       582 ~~------std--------------~C~~~~G~iCSGrG~C-------------~Cg~C~C~~~~~sG~~CE~c--ptc~  626 (783)
T KOG1226|consen  582 PL------STD--------------TCESSDGQICSGRGTC-------------ECGRCKCTDPPYSGEFCEKC--PTCP  626 (783)
T ss_pred             CC------CCc--------------cccCCCCceeCCCcee-------------eCCceEcCCCCcCcchhhcC--CCCC
Confidence            21      111              2222334557778877             47899998334999999985  6888


Q ss_pred             CCCCCCceeeCCee-ecCCCcccCCCCCC
Q 005509          282 NQCSGHGHCRGGFC-QCDSGWYGVDCSIP  309 (693)
Q Consensus       282 ~~C~~~G~C~~g~C-~C~~G~~G~~C~~~  309 (693)
                      .+|..+..|+  +| .+..|+.+..|.+.
T Consensus       627 ~~C~~~~~Cv--eC~~~~~g~~~~~C~~~  653 (783)
T KOG1226|consen  627 DPCAENKSCV--ECQAFETGPVGDTCVEE  653 (783)
T ss_pred             Ccccccccch--hhcccccccccchHHHH
Confidence            8999988886  22 24556888887764


No 10 
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.99  E-value=5.9e-10  Score=133.52  Aligned_cols=107  Identities=27%  Similarity=0.654  Sum_probs=83.9

Q ss_pred             CCC-CCCCCCCCEEeccCC---eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCCccc
Q 005509          121 KSC-KSDCSGQGVCNHELG---QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEGTKY  196 (693)
Q Consensus       121 ~~C-~~~C~~~G~C~~~~G---~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~G  196 (693)
                      ..| .++|+++|+|+...+   .|.|++-|.|..||.. ..+|.                                    
T Consensus      3865 d~C~~npCqhgG~C~~~~~ggy~CkCpsqysG~~CEi~-~epC~------------------------------------ 3907 (4289)
T KOG1219|consen 3865 DPCNDNPCQHGGTCISQPKGGYKCKCPSQYSGNHCEID-LEPCA------------------------------------ 3907 (4289)
T ss_pred             cccccCcccCCCEecCCCCCceEEeCcccccCcccccc-ccccc------------------------------------
Confidence            678 788999999986443   7999999999999874 22222                                    


Q ss_pred             CCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCccccccc
Q 005509          197 PNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPV  276 (693)
Q Consensus       197 ~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~  276 (693)
                                                          +++|..+|+|....+         ...|.|+ .||+|..||...
T Consensus      3908 ------------------------------------snPC~~GgtCip~~n---------~f~CnC~-~gyTG~~Ce~~G 3941 (4289)
T KOG1219|consen 3908 ------------------------------------SNPCLTGGTCIPFYN---------GFLCNCP-NGYTGKRCEARG 3941 (4289)
T ss_pred             ------------------------------------CCCCCCCCEEEecCC---------CeeEeCC-CCccCceeeccc
Confidence                                                556777888866533         5689999 999999999874


Q ss_pred             CCccC-CCCCCCceee--CC--eeecCCCcccCCCCCCc
Q 005509          277 SSTCV-NQCSGHGHCR--GG--FCQCDSGWYGVDCSIPS  310 (693)
Q Consensus       277 ~~~C~-~~C~~~G~C~--~g--~C~C~~G~~G~~C~~~~  310 (693)
                      ...|. +.|.++|.|+  .|  .|.|.+||.|..|...+
T Consensus      3942 i~eCs~n~C~~gg~C~n~~gsf~CncT~g~~gr~c~~~~ 3980 (4289)
T KOG1219|consen 3942 ISECSKNVCGTGGQCINIPGSFHCNCTPGILGRTCCAEK 3980 (4289)
T ss_pred             ccccccccccCCceeeccCCceEeccChhHhcccCcccc
Confidence            56687 7899999997  34  89999999999996544


No 11 
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=98.90  E-value=3.1e-09  Score=121.65  Aligned_cols=170  Identities=23%  Similarity=0.505  Sum_probs=101.8

Q ss_pred             CCC-CCCCCCCC-EEeccCCeEE-eCCCccCCCCCccccCcCCCCCC--CCCCC-CCccccccCC----------CCCCC
Q 005509          121 KSC-KSDCSGQG-VCNHELGQCR-CFHGFRGKGCSERIHFQCNFPKT--PELPY-GRWVVSICPT----------HCDTT  184 (693)
Q Consensus       121 ~~C-~~~C~~~G-~C~~~~G~C~-C~~G~~G~~Ce~~~~~~C~~~~~--~~~~~-g~~~~~~C~g----------~C~~~  184 (693)
                      .+| +..|++|. +|+..+|.|+ |..-.+|..|+.     |..+-.  |.-.+ +.|.+.+||.          .|.-.
T Consensus       854 PeCr~CqCNgHA~~Cd~~tGaCi~CqD~T~G~~Cdr-----Cl~GyyGdP~lg~g~~CrPCpCP~gp~Sg~~~A~sC~~d  928 (1758)
T KOG0994|consen  854 PECRPCQCNGHADTCDPITGACIDCQDSTTGHSCDR-----CLDGYYGDPRLGSGIGCRPCPCPDGPASGRQHADSCYLD  928 (1758)
T ss_pred             CcCccccccCcccccCccccccccccccccccchhh-----hhccccCCcccCCCCCCCCCCCCCCCccchhcccccccc
Confidence            566 66788886 9999999996 999999999987     555422  21122 2566667762          23322


Q ss_pred             ----CceeeeCCCcccCCCCCCCCCC--CcccCCCCCCC----------------------------------CCCCCCc
Q 005509          185 ----RAMCFCGEGTKYPNRPVAEACG--FQVNLPSQPGA----------------------------------PKSTDWA  224 (693)
Q Consensus       185 ----~g~C~C~~G~~G~~C~~~~~C~--~~~~~~~~~~~----------------------------------~C~~gw~  224 (693)
                          .-.|.|.+||.|.+|+.+.+-.  ......+|..|                                  .|..||+
T Consensus       929 ~~t~~ivC~C~~GY~G~RCe~CA~~~fGnP~~GGtCq~CeC~~NiD~~d~~aCD~~TG~CLkCL~hTeG~hCe~Ck~Gf~ 1008 (1758)
T KOG0994|consen  929 TRTQQIVCHCQEGYSGSRCEICADNHFGNPSEGGTCQKCECSNNIDLYDPGACDVATGACLKCLYHTEGDHCEHCKDGFY 1008 (1758)
T ss_pred             ccccceeeecccCccccchhhhcccccCCcccCCccccccccCCcCccCCCccchhhchhhhhhhcccccchhhccccch
Confidence                2279999999999986432110  00001111111                                  1456666


Q ss_pred             CC----CCCCccCCCCCCCce---ecCCcccccccccccccccccCCCCCcCcccccccCC--------ccC-CCC--CC
Q 005509          225 KA----DLDNIFTTNGSKPGW---CNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSS--------TCV-NQC--SG  286 (693)
Q Consensus       225 g~----~c~~~~~~~C~~~G~---C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~--------~C~-~~C--~~  286 (693)
                      |.    +|..   ..|.-.|+   |..+         .++|+|.|. +...|..|+...+.        .|. -+|  .+
T Consensus      1009 GdA~~q~Cqr---C~Cn~LGTn~~~~CD---------r~tGQCpCl-pNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~~~ 1075 (1758)
T KOG0994|consen 1009 GDALRQNCQR---CVCNFLGTNSTCHCD---------RFTGQCPCL-PNVQGVRCDQCAENHWNLASGEGCEPCNCDPIG 1075 (1758)
T ss_pred             hHHHHhhhhh---heccccccCCccccc---------cccCcCCCC-cccccccccccccchhccccCCCCCccCCCccC
Confidence            64    2221   11211111   1111         147899998 99999999875321        111 012  34


Q ss_pred             Cceee--CCeeecCCCcccCCCCC
Q 005509          287 HGHCR--GGFCQCDSGWYGVDCSI  308 (693)
Q Consensus       287 ~G~C~--~g~C~C~~G~~G~~C~~  308 (693)
                      +-+|+  +|+|+|+|||-|..|++
T Consensus      1076 ~pqCN~ftGQCqCkpGfGGR~C~q 1099 (1758)
T KOG0994|consen 1076 GPQCNEFTGQCQCKPGFGGRTCSQ 1099 (1758)
T ss_pred             CccccccccceeccCCCCCcchhH
Confidence            55787  89999999999999986


No 12 
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=98.43  E-value=1.4e-06  Score=108.42  Aligned_cols=105  Identities=19%  Similarity=0.318  Sum_probs=74.5

Q ss_pred             CcccccCcccccCCCCcccccccCccCCCCCCCCCCCC-CEEeccCCeEEeCCCccCCCCCccccCcCCCCCCCCC---C
Q 005509           93 PWKAEIGRWLSGCDSVAKEVDLVEMIGGKSCKSDCSGQ-GVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKTPEL---P  168 (693)
Q Consensus        93 ~~~~~~g~~~~~c~~~~~~~~~~~~~~~~~C~~~C~~~-G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~---~  168 (693)
                      .....+|++++.|..+...+.-...-...-|+.+|++| .+|+..+|.|.|.+.-.|..|+.     |.++.....   .
T Consensus       699 C~~g~tG~~Ce~C~~gfrr~~~~~~~~~~c~~C~cngh~~~Cd~~tG~C~C~~~t~G~~C~~-----C~~GfYg~~~~~~  773 (1705)
T KOG1836|consen  699 CPVGYTGQFCESCAPGFRRLSPQLGPFCPCIPCDCNGHSNICDPRTGQCKCKHNTFGGQCAQ-----CVDGFYGLPDLGT  773 (1705)
T ss_pred             CCCCcccchhhhcchhhhcccccCCCCCcccccccCCccccccCCCCceecccCCCCCchhh-----hcCCCCCccccCC
Confidence            33567999999998777655443222123338889997 79999999999999999999997     666543322   2


Q ss_pred             CCCccccccCC------CCCCCCceee-eCCCcccCCCCCC
Q 005509          169 YGRWVVSICPT------HCDTTRAMCF-CGEGTKYPNRPVA  202 (693)
Q Consensus       169 ~g~~~~~~C~g------~C~~~~g~C~-C~~G~~G~~C~~~  202 (693)
                      +++|....|++      .++...+.|. |++||+|..|+.+
T Consensus       774 ~~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~c  814 (1705)
T KOG1836|consen  774 SGDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRCEEC  814 (1705)
T ss_pred             CCCCccCCCCCChhhcCcCcccceecCCCCCCCcccccccC
Confidence            23456666663      3444467999 9999999999654


No 13 
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.23  E-value=2.6e-06  Score=100.07  Aligned_cols=73  Identities=26%  Similarity=0.584  Sum_probs=53.2

Q ss_pred             CCCC-CCCCCCCCEEeccCC----eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC------CCCCCCCcee
Q 005509          120 GKSC-KSDCSGQGVCNHELG----QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP------THCDTTRAMC  188 (693)
Q Consensus       120 ~~~C-~~~C~~~G~C~~~~G----~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~------g~C~~~~g~C  188 (693)
                      .+.| -++|.+.|+|....|    +|+|++||+|++||.....+|+.+=   +.+..|....|.      ..|+-.+|+|
T Consensus      1716 ~~vC~lnpc~~~g~Cv~sp~a~GY~C~C~~g~~G~~Ce~~~dq~CPrGW---WG~P~CgpC~CavsKgfdp~CnKt~G~C 1792 (2531)
T KOG4289|consen 1716 VDVCSLNPCENQGTCVRSPGAHGYTCECPPGYTGPYCELRADQPCPRGW---WGFPTCGPCNCAVSKGFDPDCNKTNGQC 1792 (2531)
T ss_pred             cchhcccccccCceeecCCCCCceeEECCCcccCcchhhhccCCCCCcc---cCCCCccCccccccCCCCCCccccCcce
Confidence            3566 678999999987665    8999999999999998777787531   112234444442      4577788999


Q ss_pred             eeCCCcc
Q 005509          189 FCGEGTK  195 (693)
Q Consensus       189 ~C~~G~~  195 (693)
                      .|.+.+.
T Consensus      1793 qCKe~hy 1799 (2531)
T KOG4289|consen 1793 QCKENHY 1799 (2531)
T ss_pred             eeccccc
Confidence            9988765


No 14 
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.18  E-value=2.2e-06  Score=100.67  Aligned_cols=94  Identities=26%  Similarity=0.450  Sum_probs=66.0

Q ss_pred             CC-eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCC--CceeeeCCCcccCCCCCCCCCCCcccC
Q 005509          137 LG-QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTT--RAMCFCGEGTKYPNRPVAEACGFQVNL  211 (693)
Q Consensus       137 ~G-~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~--~g~C~C~~G~~G~~C~~~~~C~~~~~~  211 (693)
                      .| .|+|++||+|++||.. .+.|-.+             +|.  +.|-..  ..+|.|.+||+|..|+....-+.    
T Consensus      1220 nglrCrCPpGFTgd~CeTe-iDlCYs~-------------pC~nng~C~srEggYtCeCrpg~tGehCEvs~~agr---- 1281 (2531)
T KOG4289|consen 1220 NGLRCRCPPGFTGDYCETE-IDLCYSG-------------PCGNNGRCRSREGGYTCECRPGFTGEHCEVSARAGR---- 1281 (2531)
T ss_pred             CceeEeCCCCCCcccccch-hHhhhcC-------------CCCCCCceEEecCceeEEecCCccccceeeecccCc----
Confidence            44 8999999999999986 4556543             454  334333  34899999999998876533211    


Q ss_pred             CCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccccc
Q 005509          212 PSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVP  275 (693)
Q Consensus       212 ~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~  275 (693)
                                         +.+..|.|+|+|.+..+.        ...|.|++..|+++.|+..
T Consensus      1282 -------------------CvpGvC~nggtC~~~~ng--------gf~c~Cp~ge~e~prC~v~ 1318 (2531)
T KOG4289|consen 1282 -------------------CVPGVCKNGGTCVNLLNG--------GFCCHCPYGEFEDPRCEVT 1318 (2531)
T ss_pred             -------------------cccceecCCCEEeecCCC--------ceeccCCCcccCCCceEEE
Confidence                               125668899999776541        3489999767889999864


No 15 
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.08  E-value=3.1e-06  Score=103.00  Aligned_cols=68  Identities=26%  Similarity=0.671  Sum_probs=58.2

Q ss_pred             CCCCCCCceecCCcccccccccccccccccCCCCCcCcccccccCCccCCCCCCCceee----CCeeecCCCcccCCCCC
Q 005509          233 TTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSSTCVNQCSGHGHCR----GGFCQCDSGWYGVDCSI  308 (693)
Q Consensus       233 ~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~~C~~~C~~~G~C~----~g~C~C~~G~~G~~C~~  308 (693)
                      .++|+++|.|...+..        .++|.|+ .-|+|..||+..+.+-+++|..+|+|+    +..|.|+.||+|.+|+.
T Consensus      3869 ~npCqhgG~C~~~~~g--------gy~CkCp-sqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC~~gyTG~~Ce~ 3939 (4289)
T KOG1219|consen 3869 DNPCQHGGTCISQPKG--------GYKCKCP-SQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNCPNGYTGKRCEA 3939 (4289)
T ss_pred             cCcccCCCEecCCCCC--------ceEEeCc-ccccCcccccccccccCCCCCCCCEEEecCCCeeEeCCCCccCceeec
Confidence            6789999999876542        5699999 999999999985544468999999998    44899999999999997


Q ss_pred             C
Q 005509          309 P  309 (693)
Q Consensus       309 ~  309 (693)
                      .
T Consensus      3940 ~ 3940 (4289)
T KOG1219|consen 3940 R 3940 (4289)
T ss_pred             c
Confidence            7


No 16 
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=97.98  E-value=5e-05  Score=95.02  Aligned_cols=198  Identities=20%  Similarity=0.375  Sum_probs=113.3

Q ss_pred             ccccCcccccCCCCcccccccCccCCCCC-CCCCCCCCEEec----cCCeEE-eCCCccCCCCCccccCcCCCC-----C
Q 005509           95 KAEIGRWLSGCDSVAKEVDLVEMIGGKSC-KSDCSGQGVCNH----ELGQCR-CFHGFRGKGCSERIHFQCNFP-----K  163 (693)
Q Consensus        95 ~~~~g~~~~~c~~~~~~~~~~~~~~~~~C-~~~C~~~G~C~~----~~G~C~-C~~G~~G~~Ce~~~~~~C~~~-----~  163 (693)
                      ..+-|..+++|..+.........-  .+| +.+|-+.|.|..    ..+.|. |++||+|..|+.     |..+     .
T Consensus       751 ~~t~G~~C~~C~~GfYg~~~~~~~--~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~-----c~dgyfg~p~  823 (1705)
T KOG1836|consen  751 HNTFGGQCAQCVDGFYGLPDLGTS--GDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRCEE-----CADGYFGNPL  823 (1705)
T ss_pred             cCCCCCchhhhcCCCCCccccCCC--CCCccCCCCCChhhcCcCcccceecCCCCCCCccccccc-----CCCccccCCC
Confidence            345777888887776655443332  237 667777777743    346899 999999999998     4432     1


Q ss_pred             CCCCCCCCccccccC--------CCCCCCCcee-eeCCCcccCCCCCCC--------------CCCCc------------
Q 005509          164 TPELPYGRWVVSICP--------THCDTTRAMC-FCGEGTKYPNRPVAE--------------ACGFQ------------  208 (693)
Q Consensus       164 ~~~~~~g~~~~~~C~--------g~C~~~~g~C-~C~~G~~G~~C~~~~--------------~C~~~------------  208 (693)
                      ........+....|.        ++|+-..|.| .|.....|..|+.+.              .|..+            
T Consensus       824 ~~~~~~~~c~~c~c~~n~dp~~~g~c~~~tg~c~~ci~nT~g~~cd~c~~g~~gd~l~~~p~~~c~~c~c~p~gs~~~~~  903 (1705)
T KOG1836|consen  824 GHDGDVRPCQSCQCNFNVDPNAFGNCNRLTGECLKCIHNTAGEYCDLCKEGYFGDPLAPNPEDKCFACGCVPAGSELPSL  903 (1705)
T ss_pred             CCCCCcccCccceeccccCccccccccccccceeeccCCcccccccccccCccccccCCCcCCccccccCccCCcccccc
Confidence            111122244444553        6788888888 577777777664321              11111            


Q ss_pred             ccCCCCCCCCCCCCCcCCCCCCcc-------------CCCCCCCceecCCcccccccccc--cccccccCCCCCcCcccc
Q 005509          209 VNLPSQPGAPKSTDWAKADLDNIF-------------TTNGSKPGWCNVDPEEAYALKVQ--FKEECDCKYDGLLGQFCE  273 (693)
Q Consensus       209 ~~~~~~~~~~C~~gw~g~~c~~~~-------------~~~C~~~G~C~~~~~~~~~~~~c--~~g~C~C~~~G~~G~~C~  273 (693)
                      .+++....+.|.+.-.|.+|..+.             ..+|...|.=+         ..|  .+|+|.|. +|-+|..|+
T Consensus       904 ~c~~~tGQcec~~~v~g~~c~~c~~g~fnl~s~~gC~~c~c~~~gs~~---------~~c~~~tGqc~c~-~gVtgqrc~  973 (1705)
T KOG1836|consen  904 TCNPVTGQCECKPNVEGRDCLYCFKGFFNLNSGVGCEPCNCDPTGSES---------SDCDVGTGQCYCR-PGVTGQRCD  973 (1705)
T ss_pred             cCCCcccceeccCCCCccccccccccccccCCCCCccccccccccccc---------ccccccCCceeee-cCccccccC
Confidence            112222233444444455444222             11222222110         112  37899998 999999998


Q ss_pred             cccC-------CccC-CCCCCCc----eee--CCeeecCCCcccCCCCCC
Q 005509          274 VPVS-------STCV-NQCSGHG----HCR--GGFCQCDSGWYGVDCSIP  309 (693)
Q Consensus       274 ~~~~-------~~C~-~~C~~~G----~C~--~g~C~C~~G~~G~~C~~~  309 (693)
                      ....       ..|- -.|...|    .|+  +|+|.|.+++.|..|..-
T Consensus       974 qc~~~~~~~~~~gc~~c~c~~~Gs~~~qc~~~~G~c~c~~~~~g~~c~~c 1023 (1705)
T KOG1836|consen  974 QCETYHFGFQTEGCGLCECDPLGSRGFQCDPEDGQCPCRPGFEGRRCDQC 1023 (1705)
T ss_pred             ccccCcccccccCCcceecccCCcccceecccCCeeeecCCCCCcccccc
Confidence            6421       1111 1344455    586  899999999999777653


No 17 
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=97.94  E-value=9.5e-06  Score=55.37  Aligned_cols=27  Identities=41%  Similarity=1.058  Sum_probs=24.6

Q ss_pred             CCCCCCCEEeccCCeEEeCCCccCCCC
Q 005509          125 SDCSGQGVCNHELGQCRCFHGFRGKGC  151 (693)
Q Consensus       125 ~~C~~~G~C~~~~G~C~C~~G~~G~~C  151 (693)
                      ..|++||+|+...|+|.|.+||+|++|
T Consensus         6 ~~C~~~G~C~~~~g~C~C~~g~~G~~C   32 (32)
T PF07974_consen    6 NICSGHGTCVSPCGRCVCDSGYTGPDC   32 (32)
T ss_pred             CccCCCCEEeCCCCEEECCCCCcCCCC
Confidence            359999999977799999999999987


No 18 
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.82  E-value=0.00015  Score=81.43  Aligned_cols=65  Identities=29%  Similarity=0.729  Sum_probs=47.7

Q ss_pred             CCCCceecCCcccccccccccccccccCCCCCcCccc-ccccCCccC-----CCCCCCceee------CCeeecCCCccc
Q 005509          236 GSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFC-EVPVSSTCV-----NQCSGHGHCR------GGFCQCDSGWYG  303 (693)
Q Consensus       236 C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C-~~~~~~~C~-----~~C~~~G~C~------~g~C~C~~G~~G  303 (693)
                      |.++++|.....         .+.|.|+ +||+|..| .......|.     ..|.++++|.      ...|.|..||.|
T Consensus       280 c~~~~~C~~~~~---------~~~C~C~-~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c~~~~~g  349 (487)
T KOG1217|consen  280 CPNGGTCVNVPG---------SYRCTCP-PGFTGRLCTECVDVDECSPRNAGGPCANGGTCNTLGSFGGFRCACGPGFTG  349 (487)
T ss_pred             cCCCCeeecCCC---------cceeeCC-CCCCCCCCccccccccccccccCCcCCCCcccccCCCCCCCCcCCCCCCCC
Confidence            777888865432         3789999 99999998 221123552     4588888993      236999999999


Q ss_pred             CCCCCCc
Q 005509          304 VDCSIPS  310 (693)
Q Consensus       304 ~~C~~~~  310 (693)
                      ..|+.+.
T Consensus       350 ~~C~~~~  356 (487)
T KOG1217|consen  350 RRCEDSN  356 (487)
T ss_pred             CccccCC
Confidence            9999874


No 19 
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=97.65  E-value=9.3e-05  Score=73.92  Aligned_cols=132  Identities=20%  Similarity=0.379  Sum_probs=75.4

Q ss_pred             eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCC-------CceeeeCCCcccCCCCCCC--------
Q 005509          139 QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTT-------RAMCFCGEGTKYPNRPVAE--------  203 (693)
Q Consensus       139 ~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~-------~g~C~C~~G~~G~~C~~~~--------  203 (693)
                      .| |++|..|++|..     |+.+.-          ..|.|+..|.       .|.|.|.+||.|+.|..+.        
T Consensus       130 vC-Cp~gtyGpdCl~-----Cpggse----------r~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg~eyfes~R  193 (350)
T KOG4260|consen  130 VC-CPDGTYGPDCLQ-----CPGGSE----------RPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCGIEYFESSR  193 (350)
T ss_pred             ec-cCCCCcCCcccc-----CCCCCc----------CCcCCCCcccCCCCCCCCCcccccCCCCCccccccchHHHHhhc
Confidence            44 999999999986     543211          2354332221       4699999999999984321        


Q ss_pred             ------------CCCCcccCCCCCCC-CCCCCCcCC--CCCC---cc--CCCCCCCceecCCcccccccccccccccccC
Q 005509          204 ------------ACGFQVNLPSQPGA-PKSTDWAKA--DLDN---IF--TTNGSKPGWCNVDPEEAYALKVQFKEECDCK  263 (693)
Q Consensus       204 ------------~C~~~~~~~~~~~~-~C~~gw~g~--~c~~---~~--~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~  263 (693)
                                  .|...++.....+| .|..||.-.  .|-+   +.  +.+|..+..|.+..+         .++|.++
T Consensus       194 ne~~lvCt~Ch~~C~~~Csg~~~k~C~kCkkGW~lde~gCvDvnEC~~ep~~c~~~qfCvNteG---------Sf~C~dk  264 (350)
T KOG4260|consen  194 NEQHLVCTACHEGCLGVCSGESSKGCSKCKKGWKLDEEGCVDVNECQNEPAPCKAHQFCVNTEG---------SFKCEDK  264 (350)
T ss_pred             ccccchhhhhhhhhhcccCCCCCCChhhhcccceecccccccHHHHhcCCCCCChhheeecCCC---------ceEeccc
Confidence                        12211112222222 277888643  2211   11  566777777765432         5689998


Q ss_pred             CCCCcCc--ccccccCCccCCCCC-CCceee----CCeeecCCCc
Q 005509          264 YDGLLGQ--FCEVPVSSTCVNQCS-GHGHCR----GGFCQCDSGW  301 (693)
Q Consensus       264 ~~G~~G~--~C~~~~~~~C~~~C~-~~G~C~----~g~C~C~~G~  301 (693)
                       +||.+.  .|+     .|...|. .++.|.    ..+|+|..|.
T Consensus       265 -~Gy~~g~d~C~-----~~~d~~~~kn~~c~ni~~~~r~v~f~~~  303 (350)
T KOG4260|consen  265 -EGYKKGVDECQ-----FCADVCASKNRPCMNIDGQYRCVCFSGL  303 (350)
T ss_pred             -ccccCChHHhh-----hhhhhcccCCCCcccCCccEEEEecccc
Confidence             999862  222     2334443 355564    3478887765


No 20 
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=97.61  E-value=5.2e-05  Score=51.77  Aligned_cols=25  Identities=64%  Similarity=1.379  Sum_probs=23.0

Q ss_pred             CCCCCCceee--CCeeecCCCcccCCC
Q 005509          282 NQCSGHGHCR--GGFCQCDSGWYGVDC  306 (693)
Q Consensus       282 ~~C~~~G~C~--~g~C~C~~G~~G~~C  306 (693)
                      ..|++||+|+  .++|+|.+||+|.+|
T Consensus         6 ~~C~~~G~C~~~~g~C~C~~g~~G~~C   32 (32)
T PF07974_consen    6 NICSGHGTCVSPCGRCVCDSGYTGPDC   32 (32)
T ss_pred             CccCCCCEEeCCCCEEECCCCCcCCCC
Confidence            4699999999  799999999999987


No 21 
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=97.42  E-value=0.00096  Score=71.60  Aligned_cols=153  Identities=20%  Similarity=0.402  Sum_probs=87.3

Q ss_pred             CCCCCC-EEeccCC---eEEeCCCccCCCCCccccCcCCCCCCC-------CCCCCCccccccCC-------CCCC----
Q 005509          126 DCSGQG-VCNHELG---QCRCFHGFRGKGCSERIHFQCNFPKTP-------ELPYGRWVVSICPT-------HCDT----  183 (693)
Q Consensus       126 ~C~~~G-~C~~~~G---~C~C~~G~~G~~Ce~~~~~~C~~~~~~-------~~~~g~~~~~~C~g-------~C~~----  183 (693)
                      .|++|. .|+...+   +|.|.++.+|++|+.     |...-..       ..+-..+....|.+       ++..    
T Consensus       279 KCNgHAs~Cv~d~~~~ltCdC~HNTaGPdCgr-----CKpfy~dRPW~raT~~~a~~c~ac~Cn~harrcrfn~Ely~lS  353 (592)
T KOG3512|consen  279 KCNGHASRCVMDESSHLTCDCEHNTAGPDCGR-----CKPFYYDRPWGRATALPANECVACNCNGHARRCRFNMELYRLS  353 (592)
T ss_pred             eecCccceeeeccCCceEEecccCCCCCCccc-----ccccccCCCccccccCCCccccccccchhhhhcccchhhhccc
Confidence            478876 7865444   899999999999997     4432111       00111222333321       1111    


Q ss_pred             ---CCceee-eCCCcccCCCCCCCCCCCcccCCCCCCCCCCCCCcCCC---CC---CccCCCCCCCc----eecCCcccc
Q 005509          184 ---TRAMCF-CGEGTKYPNRPVAEACGFQVNLPSQPGAPKSTDWAKAD---LD---NIFTTNGSKPG----WCNVDPEEA  249 (693)
Q Consensus       184 ---~~g~C~-C~~G~~G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~---c~---~~~~~~C~~~G----~C~~~~~~~  249 (693)
                         ..|.|. |...+.|..|..                 |..||+-..   .+   .+...+|+.-|    +|+.     
T Consensus       354 gr~SggvClnCrHnTaGrhChy-----------------CreGyyRd~s~pl~hrkaCk~CdChpVGs~gktCNq-----  411 (592)
T KOG3512|consen  354 GRRSGGVCLNCRHNTAGRHCHY-----------------CREGYYRDGSKPLTHRKACKACDCHPVGSAGKTCNQ-----  411 (592)
T ss_pred             CccccceEeecccCCCCccccc-----------------ccCccccCCCCCCchhhhhhhcCCcccccccccccc-----
Confidence               134554 666667666632                 233333211   00   01133454433    3543     


Q ss_pred             cccccccccccccCCCCCcCccccccc---------CCccC-------CCCCCCceeeCCeeecCCCcccCCCCCCccC
Q 005509          250 YALKVQFKEECDCKYDGLLGQFCEVPV---------SSTCV-------NQCSGHGHCRGGFCQCDSGWYGVDCSIPSVM  312 (693)
Q Consensus       250 ~~~~~c~~g~C~C~~~G~~G~~C~~~~---------~~~C~-------~~C~~~G~C~~g~C~C~~G~~G~~C~~~~~~  312 (693)
                            .+|+|.|+ +|-+|..|+...         ...|.       ..|+++++=.+..|.|+.++.|..|+++..-
T Consensus       412 ------~tGqCpCk-eGvtG~tCnrCa~gyqqsrs~vapcik~p~~~~~~~~s~ve~qd~~s~Ck~~~~~~r~n~kkfc  483 (592)
T KOG3512|consen  412 ------TTGQCPCK-EGVTGLTCNRCAPGYQQSRSPVAPCIKIPTDAPTLGSSGVEPQDQCSKCKASPGGKRLNQKKFC  483 (592)
T ss_pred             ------cCCcccCC-CCCcccccccccchhhcccCCCcCceecCCCCccccCCCCcchhccccCCCCCcceeccccccC
Confidence                  37899999 999999998642         11221       2366666633556799999999999998765


No 22 
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=97.41  E-value=0.00049  Score=78.05  Aligned_cols=142  Identities=20%  Similarity=0.486  Sum_probs=83.4

Q ss_pred             CCCCCCCCEEeccCC---eEEeCCCccC--CCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCCCc--eeeeCCCc
Q 005509          124 KSDCSGQGVCNHELG---QCRCFHGFRG--KGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTTRA--MCFCGEGT  194 (693)
Q Consensus       124 ~~~C~~~G~C~~~~G---~C~C~~G~~G--~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~~g--~C~C~~G~  194 (693)
                      ...|.-+..|...+|   +|.|..||.|  .+|...  ++|....           ..|.  ..|.+..+  .|.|..||
T Consensus       699 sh~cdt~a~C~pg~~~~~tcecs~g~~gdgr~c~d~--~eca~~~-----------~~CGp~s~Cin~pg~~rceC~~gy  765 (1289)
T KOG1214|consen  699 SHMCDTTARCHPGTGVDYTCECSSGYQGDGRNCVDE--NECATGF-----------HRCGPNSVCINLPGSYRCECRSGY  765 (1289)
T ss_pred             CcccCCCccccCCCCcceEEEEeeccCCCCCCCCCh--hhhccCC-----------CCCCCCceeecCCCceeEEEeecc
Confidence            445777888885555   8999999986  467763  4665432           2554  44666554  67777776


Q ss_pred             c--cC--CCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcC-
Q 005509          195 K--YP--NRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLG-  269 (693)
Q Consensus       195 ~--G~--~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G-  269 (693)
                      .  +.  +|-.-..   +.-.              ..|++ -..+|.-.|.|.....    ++  ..+.|.|. +||.| 
T Consensus       766 ~F~dd~~tCV~i~~---pap~--------------n~Ce~-g~h~C~i~g~a~c~~h----Gg--s~y~C~CL-PGfsGD  820 (1289)
T KOG1214|consen  766 EFADDRHTCVLITP---PAPA--------------NPCED-GSHTCAIAGQARCVHH----GG--STYSCACL-PGFSGD  820 (1289)
T ss_pred             eeccCCcceEEecC---CCCC--------------Ccccc-CccccCcCCceEEEec----CC--ceEEEeec-CCccCC
Confidence            4  22  2310000   0000              01211 0234555554432110    00  25799998 99996 


Q ss_pred             -cccccccCCcc-CCCCCCCceee----CCeeecCCCcccCC
Q 005509          270 -QFCEVPVSSTC-VNQCSGHGHCR----GGFCQCDSGWYGVD  305 (693)
Q Consensus       270 -~~C~~~~~~~C-~~~C~~~G~C~----~g~C~C~~G~~G~~  305 (693)
                       ..|...  +.| ++.|.-..+|.    ...|+|++||.|+.
T Consensus       821 G~~c~dv--DeC~psrChp~A~CyntpgsfsC~C~pGy~GDG  860 (1289)
T KOG1214|consen  821 GHQCTDV--DECSPSRCHPAATCYNTPGSFSCRCQPGYYGDG  860 (1289)
T ss_pred             ccccccc--cccCccccCCCceEecCCCcceeecccCccCCC
Confidence             445442  466 47899999997    34899999999974


No 23 
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.28  E-value=0.0023  Score=71.88  Aligned_cols=150  Identities=22%  Similarity=0.500  Sum_probs=97.4

Q ss_pred             CCCCCEEecc-----CCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCCC--ceeeeCCCcccC
Q 005509          127 CSGQGVCNHE-----LGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTTR--AMCFCGEGTKYP  197 (693)
Q Consensus       127 C~~~G~C~~~-----~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~~--g~C~C~~G~~G~  197 (693)
                      +..++.|...     .-.|.|..||.|..|+.. ...|.....           .|.  +.|....  ..|.|.+||.|.
T Consensus       136 ~~~~~~c~~~~~~~~~~~c~C~~g~~~~~~~~~-~~~C~~~~~-----------~c~~~~~C~~~~~~~~C~c~~~~~~~  203 (487)
T KOG1217|consen  136 CCIDGSCSNGPGSVGPFRCSCTEGYEGEPCETD-LDECIQYSS-----------PCQNGGTCVNTGGSYLCSCPPGYTGS  203 (487)
T ss_pred             eeCchhhcCCCCCCCceeeeeCCCccccccccc-ccccccCCC-----------CcCCCcccccCCCCeeEeCCCCccCC
Confidence            4567777643     237999999999999974 245653221           233  4455544  479999999999


Q ss_pred             CCCCC---CCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCC-ceecCCcccccccccccccccccCCCCCcCccc-
Q 005509          198 NRPVA---EACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKP-GWCNVDPEEAYALKVQFKEECDCKYDGLLGQFC-  272 (693)
Q Consensus       198 ~C~~~---~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~-G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C-  272 (693)
                      .|...   ..|..   .   ..+.+..++.+..|+... ..|... +.|.....         ..+|.|. +||.+..+ 
T Consensus       204 ~~~~~~~~~~c~~---~---~~~~~~~g~~~~~c~~~~-~~~~~~~~~c~~~~~---------~~~C~~~-~g~~~~~~~  266 (487)
T KOG1217|consen  204 TCETTGNGGTCVD---S---VACSCPPGARGPECEVSI-VECASGDGTCVNTVG---------SYTCRCP-EGYTGDACV  266 (487)
T ss_pred             cCcCCCCCceEec---c---eeccCCCCCCCCCccccc-ccccCCCCcccccCC---------ceeeeCC-CCccccccc
Confidence            88644   11110   0   234566777777776322 223322 77765432         4689998 99999874 


Q ss_pred             -ccccCCccCC--CCCCCceee----CCeeecCCCcccCCC
Q 005509          273 -EVPVSSTCVN--QCSGHGHCR----GGFCQCDSGWYGVDC  306 (693)
Q Consensus       273 -~~~~~~~C~~--~C~~~G~C~----~g~C~C~~G~~G~~C  306 (693)
                       .+. .+.|..  .|.++++|.    ...|.|++||+|..|
T Consensus       267 ~~~~-~~~C~~~~~c~~~~~C~~~~~~~~C~C~~g~~g~~~  306 (487)
T KOG1217|consen  267 TCVD-VDSCALIASCPNGGTCVNVPGSYRCTCPPGFTGRLC  306 (487)
T ss_pred             eeee-ccccCCCCccCCCCeeecCCCcceeeCCCCCCCCCC
Confidence             111 235652  399999997    268999999999999


No 24 
>smart00051 DSL delta serrate ligand.
Probab=97.08  E-value=0.00049  Score=54.94  Aligned_cols=46  Identities=26%  Similarity=0.569  Sum_probs=37.5

Q ss_pred             cccccCCCCCcCcccccccCCccCCCCCCCceee-CCeeecCCCcccCCC
Q 005509          258 EECDCKYDGLLGQFCEVPVSSTCVNQCSGHGHCR-GGFCQCDSGWYGVDC  306 (693)
Q Consensus       258 g~C~C~~~G~~G~~C~~~~~~~C~~~C~~~G~C~-~g~C~C~~G~~G~~C  306 (693)
                      ..-.|+ ++|.|..|+..  +.+.+.+.++.+|+ .|.|.|.+||+|.+|
T Consensus        17 ~rv~C~-~~~yG~~C~~~--C~~~~d~~~~~~Cd~~G~~~C~~Gw~G~~C   63 (63)
T smart00051       17 IRVTCD-ENYYGEGCNKF--CRPRDDFFGHYTCDENGNKGCLEGWMGPYC   63 (63)
T ss_pred             EEeeCC-CCCcCCccCCE--eCcCccccCCccCCcCCCEecCCCCcCCCC
Confidence            355788 99999999862  23335688999997 889999999999988


No 25 
>PF00008 EGF:  EGF-like domain This is a sub-family of the Pfam entry;  InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=96.97  E-value=0.00064  Score=46.55  Aligned_cols=27  Identities=33%  Similarity=0.832  Sum_probs=23.1

Q ss_pred             CCCCCCCCEEeccC-C--eEEeCCCccCCC
Q 005509          124 KSDCSGQGVCNHEL-G--QCRCFHGFRGKG  150 (693)
Q Consensus       124 ~~~C~~~G~C~~~~-G--~C~C~~G~~G~~  150 (693)
                      +++|.|+|+|.... +  .|.|++||+|++
T Consensus         3 ~~~C~n~g~C~~~~~~~y~C~C~~G~~G~~   32 (32)
T PF00008_consen    3 SNPCQNGGTCIDLPGGGYTCECPPGYTGKR   32 (32)
T ss_dssp             TTSSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred             CCcCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence            56899999998766 4  899999999974


No 26 
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=96.93  E-value=0.0028  Score=72.22  Aligned_cols=138  Identities=20%  Similarity=0.391  Sum_probs=81.0

Q ss_pred             CCCCCCCCCCCEEeccCC--eEEeCCCc--cCC--CCCccc----cCcCCCCCCCCCCCCCccccccC--CCCCCC----
Q 005509          121 KSCKSDCSGQGVCNHELG--QCRCFHGF--RGK--GCSERI----HFQCNFPKTPELPYGRWVVSICP--THCDTT----  184 (693)
Q Consensus       121 ~~C~~~C~~~G~C~~~~G--~C~C~~G~--~G~--~Ce~~~----~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~----  184 (693)
                      .+|+..|..+.+|++..|  +|+|..||  .|+  +|-...    ..+|..+.           ..|.  +.|.+.    
T Consensus       738 a~~~~~CGp~s~Cin~pg~~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~-----------h~C~i~g~a~c~~hGg  806 (1289)
T KOG1214|consen  738 ATGFHRCGPNSVCINLPGSYRCECRSGYEFADDRHTCVLITPPAPANPCEDGS-----------HTCAIAGQARCVHHGG  806 (1289)
T ss_pred             ccCCCCCCCCceeecCCCceeEEEeecceeccCCcceEEecCCCCCCccccCc-----------cccCcCCceEEEecCC
Confidence            334667999999998888  68877776  343  565432    23343321           2343  444433    


Q ss_pred             -CceeeeCCCcccCCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccC
Q 005509          185 -RAMCFCGEGTKYPNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCK  263 (693)
Q Consensus       185 -~g~C~C~~G~~G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~  263 (693)
                       ...|.|.+||.|..-    .|.                 -+..|+   ++-|...+.|.++++         ...|.|.
T Consensus       807 s~y~C~CLPGfsGDG~----~c~-----------------dvDeC~---psrChp~A~Cyntpg---------sfsC~C~  853 (1289)
T KOG1214|consen  807 STYSCACLPGFSGDGH----QCT-----------------DVDECS---PSRCHPAATCYNTPG---------SFSCRCQ  853 (1289)
T ss_pred             ceEEEeecCCccCCcc----ccc-----------------cccccC---ccccCCCceEecCCC---------cceeecc
Confidence             238999999998531    010                 011222   667888899987764         5689999


Q ss_pred             CCCCcCc--cccccc--CCcc------CCCCCCCceee------CCeeecCCCccc
Q 005509          264 YDGLLGQ--FCEVPV--SSTC------VNQCSGHGHCR------GGFCQCDSGWYG  303 (693)
Q Consensus       264 ~~G~~G~--~C~~~~--~~~C------~~~C~~~G~C~------~g~C~C~~G~~G  303 (693)
                       +||.|.  .|--..  ...|      +..|.+...|.      ..+|.|+++-.|
T Consensus       854 -pGy~GDGf~CVP~~~~~T~C~~er~hpl~chg~t~~~~~~Dp~~~e~p~~~~ppG  908 (1289)
T KOG1214|consen  854 -PGYYGDGFQCVPDTSSLTPCEQERFHPLQCHGSTGFCWCVDPDGHEVPGTQTPPG  908 (1289)
T ss_pred             -cCccCCCceecCCCccCCccccccccceeeccccceeEeeCCCcccCCCCCCCCC
Confidence             999964  443210  1223      23465544332      337887776666


No 27 
>PF12661 hEGF:  Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.21  E-value=0.0027  Score=34.14  Aligned_cols=13  Identities=38%  Similarity=1.216  Sum_probs=11.1

Q ss_pred             eEEeCCCccCCCC
Q 005509          139 QCRCFHGFRGKGC  151 (693)
Q Consensus       139 ~C~C~~G~~G~~C  151 (693)
                      +|.|++||+|++|
T Consensus         1 ~C~C~~G~~G~~C   13 (13)
T PF12661_consen    1 TCQCPPGWTGPNC   13 (13)
T ss_dssp             EEEE-TTEETTTT
T ss_pred             CccCcCCCcCCCC
Confidence            5999999999998


No 28 
>PF00852 Glyco_transf_10:  Glycosyltransferase family 10 (fucosyltransferase);  InterPro: IPR001503 The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (2.4.1.- from EC) and related proteins into distinct sequence-based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 10 (GT10 in CAZy) comprises enzymes with two known activities: galactoside 3(4)-L-fucosyltransferase (2.4.1.65 from EC) and galactoside 3-fucosyltransferase (2.4.1.152 from EC). The galactoside 3-fucosyltransferases display similarities with the alpha-2 and alpha-6-fucosyltransferases. The biosynthesis of the carbohydrate antigen sialyl Lewis X (sLe(x)) is dependent on the activity of a galactoside 3-fucosyltransferase. This enzyme catalyses the transfer of fucose from GDP-beta-fucose to the 3-OH of N-acetylglucosamine present in lactosamine acceptors. Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 3(4)-L-fucosyltransferase (2.4.1.65 from EC) belongs to the Lewis blood group system and is associated with Le(a/b) antigen.; GO: 0008417 fucosyltransferase activity, 0006486 protein glycosylation, 0016020 membrane; PDB: 2NZX_B 2NZW_C 2NZY_C.
Probab=96.19  E-value=0.0087  Score=64.98  Aligned_cols=129  Identities=14%  Similarity=0.121  Sum_probs=56.7

Q ss_pred             cccCCCceeecCccCCchhhhhc--cccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCcccc
Q 005509          536 CFDPEKDLVLPAWKAPDAFVLRS--KLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKL  613 (693)
Q Consensus       536 ~f~p~kDvviP~~~~~~~~~~~~--~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~  613 (693)
                      .||...||.+|+...........  .+......+++..++++.+..            ....|..+++++...- .....
T Consensus       141 TYr~dSDi~~py~~~~~~~~~~~~~~~~~~~~~K~~~~~w~~Snc~------------~~~~R~~~~~~L~~~~-~vd~y  207 (349)
T PF00852_consen  141 TYRRDSDIPLPYGYFSPRESPSEKDDLPNILKKKTKLAAWIVSNCN------------PHSGREEYVRELSKYI-PVDSY  207 (349)
T ss_dssp             --------------------------------TSSEEEEE--S-S--------------H-HHHHHHHHHHTTS--EEE-
T ss_pred             ccccccccccccccccccccccccccccccccCCCceEEEEeeCcC------------CcccHHHHHHHHHhhc-CeEcc
Confidence            68899999999754322111111  111112233455666666543            2334999999887752 23344


Q ss_pred             CcccCcceEEecCCchhHHHHhhcCceeeccCCC---CC-chhHHHHHhcCceeEEEe--C-Ce---eec--eecCCCcc
Q 005509          614 GKQHAEDVIVTSLRSENYHEDLSSSVFCGVLPGD---GW-SGRMEDSILQGCIPVVIQ--V-VI---SSF--LLLCQNGS  681 (693)
Q Consensus       614 g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~p~Gd---~~-s~Rl~dAi~~GCIPViis--d-~~---~~p--~l~~~~fs  681 (693)
                      |++...    .......+.+.|++-||-|+.--.   .. |-++++|+.+|+|||+++  . ++   .+|  +|+.++|+
T Consensus       208 G~c~~~----~~~~~~~~~~~~~~ykF~lafENs~c~dYiTEK~~~al~~g~VPI~~G~~~~~~~~~~P~~SfI~~~df~  283 (349)
T PF00852_consen  208 GKCGNN----NPCPRDCKLELLSKYKFYLAFENSNCPDYITEKFWNALLAGTVPIYWGPPRPNYEEFAPPNSFIHVDDFK  283 (349)
T ss_dssp             SSTT------SSS--S-HHHHHHTEEEEEEE-SS--TT---HHHHHHHHTTSEEEEES---TTHHHHS-GGGSEEGGGSS
T ss_pred             CCCCCC----CCcccccccccccCcEEEEEecCCCCCCCCCHHHHHHHHCCeEEEEECCEecccccCCCCCCccchhcCC
Confidence            544100    012224488999999999986542   22 788999999999999999  3 33   233  77777773


No 29 
>PF12661 hEGF:  Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.10  E-value=0.0026  Score=34.18  Aligned_cols=13  Identities=54%  Similarity=1.551  Sum_probs=9.0

Q ss_pred             eeecCCCcccCCC
Q 005509          294 FCQCDSGWYGVDC  306 (693)
Q Consensus       294 ~C~C~~G~~G~~C  306 (693)
                      .|+|++||+|.+|
T Consensus         1 ~C~C~~G~~G~~C   13 (13)
T PF12661_consen    1 TCQCPPGWTGPNC   13 (13)
T ss_dssp             EEEE-TTEETTTT
T ss_pred             CccCcCCCcCCCC
Confidence            4778888888776


No 30 
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=95.99  E-value=0.081  Score=56.29  Aligned_cols=153  Identities=20%  Similarity=0.364  Sum_probs=82.6

Q ss_pred             eccCCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCCCceeeeCCCcccCCCCCCCCCCCc---
Q 005509          134 NHELGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTTRAMCFCGEGTKYPNRPVAEACGFQ---  208 (693)
Q Consensus       134 ~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~~g~C~C~~G~~G~~C~~~~~C~~~---  208 (693)
                      ....+.|.+..+|.|..|+...........+.    ..   ..|.  ..++...+.|. ..+|.|..|.....|+..   
T Consensus        45 ~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~c~----~~---~~c~~~~~~~~~~~~~~-~~~~~g~~C~~~~~~~~~c~~  116 (316)
T KOG1218|consen   45 EVNSGECGLGYGFVGSVCRIECVCGNAGGGCS----QP---CRCKNGGTCVSSTGYCH-LNGYEGPQCESPCPCGDGCAE  116 (316)
T ss_pred             cCCceeEecccccCCCccccccccCCCCCccc----Cc---cccCCCCcccCCCCccc-CCCCCcccccCCCCcCCcccc
Confidence            44578999999999999987532222111110    00   0121  22222333444 688889888766555432   


Q ss_pred             -ccCCCCCCCCCCCCCcCCCCCC--ccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccccccCCccC--CC
Q 005509          209 -VNLPSQPGAPKSTDWAKADLDN--IFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSSTCV--NQ  283 (693)
Q Consensus       209 -~~~~~~~~~~C~~gw~g~~c~~--~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~~C~--~~  283 (693)
                       .+.+....+.+..+|.+..|..  .....      |.... .+..+..+.++.|.|. +||.|.+|.... ..|.  ..
T Consensus       117 ~~C~~~~~~c~~~~~~~~~~C~~~~~~g~~------C~~~c-~~~~~~~~~~~~c~c~-~g~~g~~~~~~~-~~c~~~~~  187 (316)
T KOG1218|consen  117 KTCANPRRECRCGGGYIGEQCGEENLVGLK------CQRDC-QCTGGCDCKNGICTCQ-PGFVGVFCVESC-SGCSPLTA  187 (316)
T ss_pred             cccCCCccceecCCcCccccccccCCCCCC------ccCCC-CCccccCCCCCceecc-CCcccccccccC-CCcCCCcc
Confidence             1111111233445555555553  11111      21111 0111112247789999 999999998752 2255  45


Q ss_pred             CCCCceee--CCeeecCCCccc
Q 005509          284 CSGHGHCR--GGFCQCDSGWYG  303 (693)
Q Consensus       284 C~~~G~C~--~g~C~C~~G~~G  303 (693)
                      |.+++.|.  .+.|.|.+++.+
T Consensus       188 ~~~g~~C~~~~~~~~~~~~~~~  209 (316)
T KOG1218|consen  188 CENGAKCNRSTGSCLCYPGPSG  209 (316)
T ss_pred             cCCCCeeeccccccccCCCCcc
Confidence            66778997  678888888865


No 31 
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=95.96  E-value=0.0074  Score=45.88  Aligned_cols=28  Identities=39%  Similarity=1.047  Sum_probs=23.5

Q ss_pred             CCCCCCE----EeccCCeEEeCCCccCCCCCc
Q 005509          126 DCSGQGV----CNHELGQCRCFHGFRGKGCSE  153 (693)
Q Consensus       126 ~C~~~G~----C~~~~G~C~C~~G~~G~~Ce~  153 (693)
                      .|+++|.    |+..+|+|.|.+|++|..|+.
T Consensus         3 ~C~~~g~~~~~C~~~~G~C~C~~~~~G~~C~~   34 (50)
T cd00055           3 DCNGHGSLSGQCDPGTGQCECKPNTTGRRCDR   34 (50)
T ss_pred             cCcCCCCCCccccCCCCEEeCCCcCCCCCCCC
Confidence            3555554    988899999999999999985


No 32 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=95.87  E-value=0.011  Score=41.78  Aligned_cols=32  Identities=31%  Similarity=1.060  Sum_probs=26.7

Q ss_pred             CCC-C-CCCCCCCEEeccCC--eEEeCCCcc-CCCCC
Q 005509          121 KSC-K-SDCSGQGVCNHELG--QCRCFHGFR-GKGCS  152 (693)
Q Consensus       121 ~~C-~-~~C~~~G~C~~~~G--~C~C~~G~~-G~~Ce  152 (693)
                      ++| . .+|.++|+|....|  .|.|++||. |..|+
T Consensus         3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~~g~~C~   39 (39)
T smart00179        3 DECASGNPCQNGGTCVNTVGSYRCECPPGYTDGRNCE   39 (39)
T ss_pred             ccCcCCCCcCCCCEeECCCCCeEeECCCCCccCCcCC
Confidence            567 3 57999999987666  799999999 98885


No 33 
>PF00008 EGF:  EGF-like domain This is a sub-family of the Pfam entry;  InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=95.85  E-value=0.0041  Score=42.56  Aligned_cols=24  Identities=38%  Similarity=0.942  Sum_probs=20.6

Q ss_pred             CCCCCCceee-----CCeeecCCCcccCC
Q 005509          282 NQCSGHGHCR-----GGFCQCDSGWYGVD  305 (693)
Q Consensus       282 ~~C~~~G~C~-----~g~C~C~~G~~G~~  305 (693)
                      ++|.++|+|+     +..|+|++||+|.+
T Consensus         4 ~~C~n~g~C~~~~~~~y~C~C~~G~~G~~   32 (32)
T PF00008_consen    4 NPCQNGGTCIDLPGGGYTCECPPGYTGKR   32 (32)
T ss_dssp             TSSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred             CcCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence            5899999997     34899999999974


No 34 
>PF00053 Laminin_EGF:  Laminin EGF-like (Domains III and V);  InterPro: IPR002049 Laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains, each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines. The tertiary structure of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. The consensus pattern of the LE domain, whose eight conserved cysteines form four disulphide bonds, is xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx ('C': conserved cysteine involved in a disulphide bond; 'a': conserved aromatic residue; 'G': conserved glycine; lower case = less conserved). In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen. The binding-sites are located on the surface within the loops C1-C3 and C5-C6. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility, which determine the spacing in the formation of laminin networks of basement membranes.; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=95.82  E-value=0.0063  Score=46.00  Aligned_cols=23  Identities=35%  Similarity=0.859  Sum_probs=19.9

Q ss_pred             CEEeccCCeEEeCCCccCCCCCc
Q 005509          131 GVCNHELGQCRCFHGFRGKGCSE  153 (693)
Q Consensus       131 G~C~~~~G~C~C~~G~~G~~Ce~  153 (693)
                      .+|+..+|+|.|.++|+|..|++
T Consensus        11 ~~C~~~~G~C~C~~~~~G~~C~~   33 (49)
T PF00053_consen   11 QTCDPSTGQCVCKPGTTGPRCDQ   33 (49)
T ss_dssp             SSEEETCEEESBSTTEESTTS-E
T ss_pred             CcccCCCCEEeccccccCCcCcC
Confidence            38998899999999999999986


No 35 
>smart00051 DSL delta serrate ligand.
Probab=95.63  E-value=0.01  Score=47.35  Aligned_cols=30  Identities=33%  Similarity=0.783  Sum_probs=25.5

Q ss_pred             CCC--CCCCCCCCEEeccCCeEEeCCCccCCCC
Q 005509          121 KSC--KSDCSGQGVCNHELGQCRCFHGFRGKGC  151 (693)
Q Consensus       121 ~~C--~~~C~~~G~C~~~~G~C~C~~G~~G~~C  151 (693)
                      +.|  .+++.+|.+|+. .|.|.|.+||+|++|
T Consensus        32 ~~C~~~~d~~~~~~Cd~-~G~~~C~~Gw~G~~C   63 (63)
T smart00051       32 KFCRPRDDFFGHYTCDE-NGNKGCLEGWMGPYC   63 (63)
T ss_pred             CEeCcCccccCCccCCc-CCCEecCCCCcCCCC
Confidence            455  346889999986 799999999999988


No 36 
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=95.59  E-value=0.047  Score=59.12  Aligned_cols=103  Identities=20%  Similarity=0.419  Sum_probs=64.6

Q ss_pred             ccccCcccccCCCCcccccccCc--cCCCCC-CCCCCCCCE-Eec------c-----CCeE-EeCCCccCCCCCccccCc
Q 005509           95 KAEIGRWLSGCDSVAKEVDLVEM--IGGKSC-KSDCSGQGV-CNH------E-----LGQC-RCFHGFRGKGCSERIHFQ  158 (693)
Q Consensus        95 ~~~~g~~~~~c~~~~~~~~~~~~--~~~~~C-~~~C~~~G~-C~~------~-----~G~C-~C~~G~~G~~Ce~~~~~~  158 (693)
                      +.+.|.=|..|...+..-+++..  ...++| .+.|++|+. |-.      .     -|+| .|.+...|.+|.-     
T Consensus       301 HNTaGPdCgrCKpfy~dRPW~raT~~~a~~c~ac~Cn~harrcrfn~Ely~lSgr~SggvClnCrHnTaGrhChy-----  375 (592)
T KOG3512|consen  301 HNTAGPDCGRCKPFYYDRPWGRATALPANECVACNCNGHARRCRFNMELYRLSGRRSGGVCLNCRHNTAGRHCHY-----  375 (592)
T ss_pred             cCCCCCCcccccccccCCCccccccCCCccccccccchhhhhcccchhhhcccCccccceEeecccCCCCccccc-----
Confidence            34466666666555555555432  345788 778887764 411      1     2366 4999999999986     


Q ss_pred             CCCC-----CCCCCCCCCccccccC------CCCCCCCceeeeCCCcccCCCCCC
Q 005509          159 CNFP-----KTPELPYGRWVVSICP------THCDTTRAMCFCGEGTKYPNRPVA  202 (693)
Q Consensus       159 C~~~-----~~~~~~~g~~~~~~C~------g~C~~~~g~C~C~~G~~G~~C~~~  202 (693)
                      |.-+     +.+......|....|.      .+|+..+|+|.|.+|-+|..|..+
T Consensus       376 CreGyyRd~s~pl~hrkaCk~CdChpVGs~gktCNq~tGqCpCkeGvtG~tCnrC  430 (592)
T KOG3512|consen  376 CREGYYRDGSKPLTHRKACKACDCHPVGSAGKTCNQTTGQCPCKEGVTGLTCNRC  430 (592)
T ss_pred             ccCccccCCCCCCchhhhhhhcCCcccccccccccccCCcccCCCCCcccccccc
Confidence            4322     1111122234444553      578999999999999999998544


No 37 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=95.21  E-value=0.026  Score=39.34  Aligned_cols=32  Identities=31%  Similarity=1.049  Sum_probs=26.1

Q ss_pred             CCCC--CCCCCCCEEeccCC--eEEeCCCccCCCCC
Q 005509          121 KSCK--SDCSGQGVCNHELG--QCRCFHGFRGKGCS  152 (693)
Q Consensus       121 ~~C~--~~C~~~G~C~~~~G--~C~C~~G~~G~~Ce  152 (693)
                      ++|.  .+|.++|.|....|  .|.|..||.|..|+
T Consensus         3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g~~C~   38 (38)
T cd00054           3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRNCE   38 (38)
T ss_pred             ccCCCCCCcCCCCEeECCCCCeEeECCCCCcCCcCC
Confidence            5663  57999999986666  79999999998885


No 38 
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=94.91  E-value=0.019  Score=57.90  Aligned_cols=45  Identities=33%  Similarity=0.833  Sum_probs=35.4

Q ss_pred             cCCCCCcCcccccccCCccCCCCCCCceee-------CCeeecCCCcccCCCCC
Q 005509          262 CKYDGLLGQFCEVPVSSTCVNQCSGHGHCR-------GGFCQCDSGWYGVDCSI  308 (693)
Q Consensus       262 C~~~G~~G~~C~~~~~~~C~~~C~~~G~C~-------~g~C~C~~G~~G~~C~~  308 (693)
                      |+ +|-.|++|... +..-..+|.++|.|.       +|.|.|.+||+|+.|..
T Consensus       132 Cp-~gtyGpdCl~C-pggser~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~  183 (350)
T KOG4260|consen  132 CP-DGTYGPDCLQC-PGGSERPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRY  183 (350)
T ss_pred             cC-CCCcCCccccC-CCCCcCCcCCCCcccCCCCCCCCCcccccCCCCCccccc
Confidence            77 89999999842 111125799999996       67999999999999875


No 39 
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=94.73  E-value=0.033  Score=41.53  Aligned_cols=23  Identities=35%  Similarity=1.012  Sum_probs=20.8

Q ss_pred             CEEeccCCeEEeCCCccCCCCCc
Q 005509          131 GVCNHELGQCRCFHGFRGKGCSE  153 (693)
Q Consensus       131 G~C~~~~G~C~C~~G~~G~~Ce~  153 (693)
                      ..|+..+|+|.|.++++|..|+.
T Consensus        11 ~~C~~~~G~C~C~~~~~G~~C~~   33 (46)
T smart00180       11 GTCDPDTGQCECKPNVTGRRCDR   33 (46)
T ss_pred             CcccCCCCEEECCCCCCCCCCCc
Confidence            57888889999999999999985


No 40 
>PF01414 DSL:  Delta serrate ligand;  InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=94.34  E-value=0.014  Score=46.58  Aligned_cols=45  Identities=29%  Similarity=0.705  Sum_probs=25.6

Q ss_pred             ccccccCCCCCcCcccccccCCccCC--CCCCCceee-CCeeecCCCcccCCC
Q 005509          257 KEECDCKYDGLLGQFCEVPVSSTCVN--QCSGHGHCR-GGFCQCDSGWYGVDC  306 (693)
Q Consensus       257 ~g~C~C~~~G~~G~~C~~~~~~~C~~--~C~~~G~C~-~g~C~C~~G~~G~~C  306 (693)
                      ..+-.|. ..|.|..|+.    .|.+  .-.+|-+|+ +|.=.|.+||+|++|
T Consensus        16 ~~rv~C~-~nyyG~~C~~----~C~~~~d~~ghy~Cd~~G~~~C~~Gw~G~~C   63 (63)
T PF01414_consen   16 RIRVVCD-ENYYGPNCSK----FCKPRDDSFGHYTCDSNGNKVCLPGWTGPNC   63 (63)
T ss_dssp             --------TTEETTTT-E----E---EEETTEEEEE-SS--EEE-TTEESTTS
T ss_pred             EEEEECC-CCCCCccccC----CcCCCcCCcCCcccCCCCCCCCCCCCcCCCC
Confidence            4577898 9999999997    5643  245677787 889999999999998


No 41 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=93.96  E-value=0.071  Score=36.44  Aligned_cols=29  Identities=34%  Similarity=0.953  Sum_probs=23.4

Q ss_pred             CCCCCCCCEEeccCC--eEEeCCCccCC-CCC
Q 005509          124 KSDCSGQGVCNHELG--QCRCFHGFRGK-GCS  152 (693)
Q Consensus       124 ~~~C~~~G~C~~~~G--~C~C~~G~~G~-~Ce  152 (693)
                      ...|.+++.|....+  .|.|+.||.|. .|+
T Consensus         5 ~~~C~~~~~C~~~~~~~~C~C~~g~~g~~~C~   36 (36)
T cd00053           5 SNPCSNGGTCVNTPGSYRCVCPPGYTGDRSCE   36 (36)
T ss_pred             CCCCCCCCEEecCCCCeEeECCCCCcccCCcC
Confidence            356888999986544  89999999998 664


No 42 
>smart00181 EGF Epidermal growth factor-like domain.
Probab=93.96  E-value=0.076  Score=36.62  Aligned_cols=27  Identities=37%  Similarity=0.947  Sum_probs=21.9

Q ss_pred             CCCCCCCEEeccCC--eEEeCCCccC-CCCC
Q 005509          125 SDCSGQGVCNHELG--QCRCFHGFRG-KGCS  152 (693)
Q Consensus       125 ~~C~~~G~C~~~~G--~C~C~~G~~G-~~Ce  152 (693)
                      ..|.++ +|....+  .|.|+.||.| ..|+
T Consensus         6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~~~C~   35 (35)
T smart00181        6 GPCSNG-TCINTPGSYTCSCPPGYTGDKRCE   35 (35)
T ss_pred             CCCCCC-EEECCCCCeEeECCCCCccCCccC
Confidence            468888 9986544  8999999999 7774


No 43 
>PHA02887 EGF-like protein; Provisional
Probab=93.92  E-value=0.07  Score=47.05  Aligned_cols=33  Identities=33%  Similarity=0.827  Sum_probs=25.2

Q ss_pred             CCCC----CCCCCCCEEeccCC----eEEeCCCccCCCCCcc
Q 005509          121 KSCK----SDCSGQGVCNHELG----QCRCFHGFRGKGCSER  154 (693)
Q Consensus       121 ~~C~----~~C~~~G~C~~~~G----~C~C~~G~~G~~Ce~~  154 (693)
                      .+|+    +-|- ||+|.....    .|.|+.||+|..|+..
T Consensus        84 ~pC~~eyk~YCi-HG~C~yI~dL~epsCrC~~GYtG~RCE~v  124 (126)
T PHA02887         84 EKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGYTGIRCDEV  124 (126)
T ss_pred             cccChHhhCEee-CCEEEccccCCCceeECCCCcccCCCCcc
Confidence            5663    3487 789964333    7999999999999974


No 44 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=92.93  E-value=0.11  Score=36.52  Aligned_cols=26  Identities=35%  Similarity=0.961  Sum_probs=21.7

Q ss_pred             CCCCCCceee----CCeeecCCCcc-cCCCC
Q 005509          282 NQCSGHGHCR----GGFCQCDSGWY-GVDCS  307 (693)
Q Consensus       282 ~~C~~~G~C~----~g~C~C~~G~~-G~~C~  307 (693)
                      .+|.++|+|.    ...|.|++||. |..|+
T Consensus         9 ~~C~~~~~C~~~~g~~~C~C~~g~~~g~~C~   39 (39)
T smart00179        9 NPCQNGGTCVNTVGSYRCECPPGYTDGRNCE   39 (39)
T ss_pred             CCcCCCCEeECCCCCeEeECCCCCccCCcCC
Confidence            4788889997    34799999999 98885


No 45 
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=92.79  E-value=1.1  Score=47.56  Aligned_cols=27  Identities=30%  Similarity=0.912  Sum_probs=19.8

Q ss_pred             CCCCCEEeccCCeEEeCCCccCCCCCcc
Q 005509          127 CSGQGVCNHELGQCRCFHGFRGKGCSER  154 (693)
Q Consensus       127 C~~~G~C~~~~G~C~C~~G~~G~~Ce~~  154 (693)
                      |..++.+...++.|. ..+|.|..|+..
T Consensus        81 c~~~~~~~~~~~~~~-~~~~~g~~C~~~  107 (316)
T KOG1218|consen   81 CKNGGTCVSSTGYCH-LNGYEGPQCESP  107 (316)
T ss_pred             cCCCCcccCCCCccc-CCCCCcccccCC
Confidence            667777775566666 788888888874


No 46 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=92.38  E-value=0.14  Score=35.45  Aligned_cols=26  Identities=35%  Similarity=0.942  Sum_probs=21.4

Q ss_pred             CCCCCCceee----CCeeecCCCcccCCCC
Q 005509          282 NQCSGHGHCR----GGFCQCDSGWYGVDCS  307 (693)
Q Consensus       282 ~~C~~~G~C~----~g~C~C~~G~~G~~C~  307 (693)
                      .+|.+++.|.    ...|.|.+||.|..|+
T Consensus         9 ~~C~~~~~C~~~~~~~~C~C~~g~~g~~C~   38 (38)
T cd00054           9 NPCQNGGTCVNTVGSYRCSCPPGYTGRNCE   38 (38)
T ss_pred             CCcCCCCEeECCCCCeEeECCCCCcCCcCC
Confidence            4688888997    3479999999998885


No 47 
>PF04863 EGF_alliinase:  Alliinase EGF-like domain;  InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=91.88  E-value=0.09  Score=39.96  Aligned_cols=29  Identities=34%  Similarity=0.723  Sum_probs=17.3

Q ss_pred             CCCCCCEEec----cCC--eEEeCCCccCCCCCcc
Q 005509          126 DCSGQGVCNH----ELG--QCRCFHGFRGKGCSER  154 (693)
Q Consensus       126 ~C~~~G~C~~----~~G--~C~C~~G~~G~~Ce~~  154 (693)
                      .|++||..-.    ..|  .|.|+.-|.|++|++.
T Consensus        18 ~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~   52 (56)
T PF04863_consen   18 SCSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTL   52 (56)
T ss_dssp             --TTSEE--TTS-EETTEE--EE-TTEESTTS-EE
T ss_pred             CcCCCCeeeeccccccCCccccccCCcCCCCcccC
Confidence            6999998842    234  7999999999999985


No 48 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes [].  +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=90.97  E-value=0.22  Score=36.19  Aligned_cols=27  Identities=30%  Similarity=0.943  Sum_probs=23.0

Q ss_pred             CCC---CCCCCCCCEEeccCC--eEEeCCCcc
Q 005509          121 KSC---KSDCSGQGVCNHELG--QCRCFHGFR  147 (693)
Q Consensus       121 ~~C---~~~C~~~G~C~~~~G--~C~C~~G~~  147 (693)
                      ++|   +..|..++.|....|  .|.|++||.
T Consensus         3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~   34 (42)
T PF07645_consen    3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYE   34 (42)
T ss_dssp             STTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred             cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence            677   346988999998888  899999998


No 49 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=90.52  E-value=0.26  Score=33.48  Aligned_cols=26  Identities=38%  Similarity=0.912  Sum_probs=21.1

Q ss_pred             CCCCCCceee----CCeeecCCCcccC-CCC
Q 005509          282 NQCSGHGHCR----GGFCQCDSGWYGV-DCS  307 (693)
Q Consensus       282 ~~C~~~G~C~----~g~C~C~~G~~G~-~C~  307 (693)
                      .+|.+++.|.    ...|.|+.||.|. .|+
T Consensus         6 ~~C~~~~~C~~~~~~~~C~C~~g~~g~~~C~   36 (36)
T cd00053           6 NPCSNGGTCVNTPGSYRCVCPPGYTGDRSCE   36 (36)
T ss_pred             CCCCCCCEEecCCCCeEeECCCCCcccCCcC
Confidence            5688888897    4589999999998 664


No 50 
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=90.10  E-value=0.35  Score=36.66  Aligned_cols=20  Identities=30%  Similarity=0.830  Sum_probs=17.2

Q ss_pred             eee--CCeeecCCCcccCCCCC
Q 005509          289 HCR--GGFCQCDSGWYGVDCSI  308 (693)
Q Consensus       289 ~C~--~g~C~C~~G~~G~~C~~  308 (693)
                      .|+  +|+|.|+++|+|.+|+.
T Consensus        13 ~C~~~~G~C~C~~~~~G~~C~~   34 (50)
T cd00055          13 QCDPGTGQCECKPNTTGRRCDR   34 (50)
T ss_pred             cccCCCCEEeCCCcCCCCCCCC
Confidence            464  78999999999999995


No 51 
>KOG2619 consensus Fucosyltransferase [Carbohydrate transport and metabolism; Amino acid transport and metabolism]
Probab=89.66  E-value=0.63  Score=50.35  Aligned_cols=126  Identities=14%  Similarity=0.078  Sum_probs=72.1

Q ss_pred             cccCCCceeecCccCC-ch-hhhhccccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCcccc
Q 005509          536 CFDPEKDLVLPAWKAP-DA-FVLRSKLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKL  613 (693)
Q Consensus       536 ~f~p~kDvviP~~~~~-~~-~~~~~~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~  613 (693)
                      .||-+.|+.+|+-... .. ..+..++...-..+++.++++..+...            ..-|.++++++... -.....
T Consensus       162 Tyr~dSd~~~pygy~~~~~~~~~~~p~~~~~~~k~~~~aw~vSnc~~------------~~~R~~~~~~L~k~-l~iD~Y  228 (372)
T KOG2619|consen  162 TYRRDSDLFVPYGYLEKPEANPVLVPVNSILSAKTKLAAWLVSNCIP------------RSARLDYYKELMKH-LEIDSY  228 (372)
T ss_pred             EEeccCCCCCccceEeecccCceecccccccccccceeeeeccccCc------------chHHHHHHHHHHhh-Cceeec
Confidence            5677777777762211 11 111111111124566777788776542            33566666666543 122233


Q ss_pred             CcccCcceEEecCCchhHHHHhhcCceeeccCCC----CCchhHHHHHhcCceeEEEeCCeeeceec
Q 005509          614 GKQHAEDVIVTSLRSENYHEDLSSSVFCGVLPGD----GWSGRMEDSILQGCIPVVIQVVISSFLLL  676 (693)
Q Consensus       614 g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~p~Gd----~~s~Rl~dAi~~GCIPViisd~~~~p~l~  676 (693)
                      |.+..+.  .........++.++.=||=|+.--.    =-|-+|+-|+.+|.|||+++....+.++|
T Consensus       229 G~c~~~~--~~~~~~~~~~~~~s~YKFyLAfENS~c~DYVTEKfw~al~~gsVPVvlg~~n~e~fvP  293 (372)
T KOG2619|consen  229 GECLRKN--ANRDPSDCLLETLSHYKFYLAFENSNCEDYVTEKFWNALDAGSVPVVLGPPNYENFVP  293 (372)
T ss_pred             ccccccc--ccCCCCCcceeecccceEEEEecccCCcccccHHHHhhhhcCcccEEECCccccccCC
Confidence            3333211  0112234556788899999986642    22889999999999999999865554555


No 52 
>PF01414 DSL:  Delta serrate ligand;  InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=89.56  E-value=0.17  Score=40.50  Aligned_cols=46  Identities=24%  Similarity=0.447  Sum_probs=21.9

Q ss_pred             eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCCcccCCC
Q 005509          139 QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEGTKYPNR  199 (693)
Q Consensus       139 ~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~G~~C  199 (693)
                      .-.|...|.|+.|+..    |......          .=.-.|+ ..|.=.|.+||+|++|
T Consensus        18 rv~C~~nyyG~~C~~~----C~~~~d~----------~ghy~Cd-~~G~~~C~~Gw~G~~C   63 (63)
T PF01414_consen   18 RVVCDENYYGPNCSKF----CKPRDDS----------FGHYTCD-SNGNKVCLPGWTGPNC   63 (63)
T ss_dssp             -----TTEETTTT-EE-------EEET----------TEEEEE--SS--EEE-TTEESTTS
T ss_pred             EEECCCCCCCccccCC----cCCCcCC----------cCCcccC-CCCCCCCCCCCcCCCC
Confidence            5689999999999985    4321000          0012466 4688889999999876


No 53 
>PHA02887 EGF-like protein; Provisional
Probab=87.79  E-value=0.4  Score=42.45  Aligned_cols=26  Identities=35%  Similarity=1.016  Sum_probs=21.6

Q ss_pred             CCCCCceee------CCeeecCCCcccCCCCCC
Q 005509          283 QCSGHGHCR------GGFCQCDSGWYGVDCSIP  309 (693)
Q Consensus       283 ~C~~~G~C~------~g~C~C~~G~~G~~C~~~  309 (693)
                      -|- ||+|.      ...|.|..||+|..|+.-
T Consensus        93 YCi-HG~C~yI~dL~epsCrC~~GYtG~RCE~v  124 (126)
T PHA02887         93 FCI-NGECMNIIDLDEKFCICNKGYTGIRCDEV  124 (126)
T ss_pred             Eee-CCEEEccccCCCceeECCCCcccCCCCcc
Confidence            466 68996      459999999999999874


No 54 
>PF00053 Laminin_EGF:  Laminin EGF-like (Domains III and V);  InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains, each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below.  +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain  In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=87.47  E-value=0.31  Score=36.66  Aligned_cols=28  Identities=25%  Similarity=0.555  Sum_probs=19.3

Q ss_pred             ceee--CCeeecCCCcccCCCCCCccCCCCCC
Q 005509          288 GHCR--GGFCQCDSGWYGVDCSIPSVMSSMSE  317 (693)
Q Consensus       288 G~C~--~g~C~C~~G~~G~~C~~~~~~~~~~~  317 (693)
                      ..|.  +|+|.|+++|+|..|++  +..+..+
T Consensus        11 ~~C~~~~G~C~C~~~~~G~~C~~--C~~g~~~   40 (49)
T PF00053_consen   11 QTCDPSTGQCVCKPGTTGPRCDQ--CKPGYFG   40 (49)
T ss_dssp             SSEEETCEEESBSTTEESTTS-E--E-TTEEC
T ss_pred             CcccCCCCEEeccccccCCcCcC--CCCcccc
Confidence            3675  78999999999999996  4444333


No 55 
>smart00181 EGF Epidermal growth factor-like domain.
Probab=86.51  E-value=0.66  Score=31.79  Aligned_cols=25  Identities=32%  Similarity=0.852  Sum_probs=19.2

Q ss_pred             CCCCCCceee----CCeeecCCCccc-CCCC
Q 005509          282 NQCSGHGHCR----GGFCQCDSGWYG-VDCS  307 (693)
Q Consensus       282 ~~C~~~G~C~----~g~C~C~~G~~G-~~C~  307 (693)
                      .+|.++ .|.    ...|.|++||.| ..|+
T Consensus         6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~~~C~   35 (35)
T smart00181        6 GPCSNG-TCINTPGSYTCSCPPGYTGDKRCE   35 (35)
T ss_pred             CCCCCC-EEECCCCCeEeECCCCCccCCccC
Confidence            357777 786    458999999999 7764


No 56 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=86.17  E-value=0.48  Score=33.32  Aligned_cols=26  Identities=31%  Similarity=0.908  Sum_probs=19.1

Q ss_pred             CCCCCCCEEeccCC--eEEeCCCccCCC
Q 005509          125 SDCSGQGVCNHELG--QCRCFHGFRGKG  150 (693)
Q Consensus       125 ~~C~~~G~C~~~~G--~C~C~~G~~G~~  150 (693)
                      ..|+.+.+|....+  +|.|++||.|+-
T Consensus         6 ~~C~~nA~C~~~~~~~~C~C~~Gy~GdG   33 (36)
T PF12947_consen    6 GGCHPNATCTNTGGSYTCTCKPGYEGDG   33 (36)
T ss_dssp             GGS-TTCEEEE-TTSEEEEE-CEEECCS
T ss_pred             CCCCCCcEeecCCCCEEeECCCCCccCC
Confidence            35889999987666  899999999863


No 57 
>PF12955 DUF3844:  Domain of unknown function (DUF3844);  InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=84.89  E-value=0.55  Score=41.17  Aligned_cols=32  Identities=41%  Similarity=0.909  Sum_probs=25.0

Q ss_pred             CCCCCCceeeC---------CeeecCC-------------CcccCCCCCCccCC
Q 005509          282 NQCSGHGHCRG---------GFCQCDS-------------GWYGVDCSIPSVMS  313 (693)
Q Consensus       282 ~~C~~~G~C~~---------g~C~C~~-------------G~~G~~C~~~~~~~  313 (693)
                      ++|++||.|..         ..|+|.+             .|.|..|+......
T Consensus        13 n~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKkDvS~   66 (103)
T PF12955_consen   13 NNCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKGKTTHWGGPACQKKDVSV   66 (103)
T ss_pred             cCCCCCceEeeccCCCccceEEEEeeccccccccccCceeeecccccccccccc
Confidence            78999999972         1689987             68888888876554


No 58 
>PF12955 DUF3844:  Domain of unknown function (DUF3844);  InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=84.85  E-value=0.76  Score=40.28  Aligned_cols=31  Identities=32%  Similarity=0.993  Sum_probs=24.1

Q ss_pred             CCCCCCCCEEeccC----C---eEEeCC-------------CccCCCCCcc
Q 005509          124 KSDCSGQGVCNHEL----G---QCRCFH-------------GFRGKGCSER  154 (693)
Q Consensus       124 ~~~C~~~G~C~~~~----G---~C~C~~-------------G~~G~~Ce~~  154 (693)
                      .++|++||.|....    +   .|.|.+             .|.|..|+..
T Consensus        12 Tn~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKk   62 (103)
T PF12955_consen   12 TNNCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKGKTTHWGGPACQKK   62 (103)
T ss_pred             ccCCCCCceEeeccCCCccceEEEEeeccccccccccCceeeecccccccc
Confidence            46899999997642    1   699998             5778888875


No 59 
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=84.58  E-value=0.62  Score=54.93  Aligned_cols=33  Identities=33%  Similarity=0.772  Sum_probs=29.0

Q ss_pred             CCCCCCCCCCCEEeccCCeEEeCCCccCCCCCcc
Q 005509          121 KSCKSDCSGQGVCNHELGQCRCFHGFRGKGCSER  154 (693)
Q Consensus       121 ~~C~~~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~  154 (693)
                      ..|+..|++||+|+. ...|+|.+||.+++|+..
T Consensus       626 ~~~~~~C~g~GVCnn-~~~ChC~~gwapp~C~~~  658 (716)
T KOG3607|consen  626 SCCPTTCNGHGVCNN-ELNCHCEPGWAPPFCFIF  658 (716)
T ss_pred             cccccccCCCcccCC-CcceeeCCCCCCCccccc
Confidence            445778999999994 789999999999999985


No 60 
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=83.06  E-value=1.1  Score=53.09  Aligned_cols=35  Identities=37%  Similarity=0.835  Sum_probs=30.5

Q ss_pred             CccCCCCCCCceee-CCeeecCCCcccCCCCCCccC
Q 005509          278 STCVNQCSGHGHCR-GGFCQCDSGWYGVDCSIPSVM  312 (693)
Q Consensus       278 ~~C~~~C~~~G~C~-~g~C~C~~G~~G~~C~~~~~~  312 (693)
                      ..|+..|+++|.|+ ...|+|.+||.+++|++....
T Consensus       626 ~~~~~~C~g~GVCnn~~~ChC~~gwapp~C~~~~~~  661 (716)
T KOG3607|consen  626 SCCPTTCNGHGVCNNELNCHCEPGWAPPFCFIFGYG  661 (716)
T ss_pred             cccccccCCCcccCCCcceeeCCCCCCCccccccCC
Confidence            35677899999998 679999999999999998755


No 61 
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=82.14  E-value=1.5  Score=32.57  Aligned_cols=17  Identities=29%  Similarity=0.751  Sum_probs=14.1

Q ss_pred             CCeeecCCCcccCCCCC
Q 005509          292 GGFCQCDSGWYGVDCSI  308 (693)
Q Consensus       292 ~g~C~C~~G~~G~~C~~  308 (693)
                      +|+|.|+++|+|.+|+.
T Consensus        17 ~G~C~C~~~~~G~~C~~   33 (46)
T smart00180       17 TGQCECKPNVTGRRCDR   33 (46)
T ss_pred             CCEEECCCCCCCCCCCc
Confidence            67888888888888884


No 62 
>PF04863 EGF_alliinase:  Alliinase EGF-like domain;  InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=80.99  E-value=0.74  Score=35.14  Aligned_cols=29  Identities=45%  Similarity=0.927  Sum_probs=16.7

Q ss_pred             CCCCCCceee------CC--eeecCCCcccCCCCCCc
Q 005509          282 NQCSGHGHCR------GG--FCQCDSGWYGVDCSIPS  310 (693)
Q Consensus       282 ~~C~~~G~C~------~g--~C~C~~G~~G~~C~~~~  310 (693)
                      -.|++||..-      +|  .|.|..-|.|++|++..
T Consensus        17 i~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~~   53 (56)
T PF04863_consen   17 ISCSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTLI   53 (56)
T ss_dssp             S--TTSEE--TTS-EETTEE--EE-TTEESTTS-EE-
T ss_pred             CCcCCCCeeeeccccccCCccccccCCcCCCCcccCC
Confidence            4689999874      33  79999999999999765


No 63 
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=78.59  E-value=1.5  Score=39.63  Aligned_cols=29  Identities=38%  Similarity=0.834  Sum_probs=22.1

Q ss_pred             CCCCCCCEEeccC----CeEEeCCCccCCCCCcc
Q 005509          125 SDCSGQGVCNHEL----GQCRCFHGFRGKGCSER  154 (693)
Q Consensus       125 ~~C~~~G~C~~~~----G~C~C~~G~~G~~Ce~~  154 (693)
                      +-|-+ |+|....    -.|.|..||+|..||..
T Consensus        51 ~YClH-G~C~yI~dl~~~~CrC~~GYtGeRCEh~   83 (139)
T PHA03099         51 GYCLH-GDCIHARDIDGMYCRCSHGYTGIRCQHV   83 (139)
T ss_pred             CEeEC-CEEEeeccCCCceeECCCCcccccccce
Confidence            34765 5995433    27999999999999985


No 64 
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=77.80  E-value=1.8  Score=39.08  Aligned_cols=26  Identities=35%  Similarity=1.050  Sum_probs=21.4

Q ss_pred             CCCCceee------CCeeecCCCcccCCCCCCc
Q 005509          284 CSGHGHCR------GGFCQCDSGWYGVDCSIPS  310 (693)
Q Consensus       284 C~~~G~C~------~g~C~C~~G~~G~~C~~~~  310 (693)
                      |-+ |+|.      ...|.|..||+|..||.-.
T Consensus        53 ClH-G~C~yI~dl~~~~CrC~~GYtGeRCEh~d   84 (139)
T PHA03099         53 CLH-GDCIHARDIDGMYCRCSHGYTGIRCQHVV   84 (139)
T ss_pred             eEC-CEEEeeccCCCceeECCCCccccccccee
Confidence            555 4886      5689999999999999865


No 65 
>PF09064 Tme5_EGF_like:  Thrombomodulin like fifth domain, EGF-like;  InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=77.38  E-value=1.7  Score=29.86  Aligned_cols=22  Identities=41%  Similarity=1.043  Sum_probs=18.2

Q ss_pred             cccCCCCCCC-CceeeeCCCccc
Q 005509          175 SICPTHCDTT-RAMCFCGEGTKY  196 (693)
Q Consensus       175 ~~C~g~C~~~-~g~C~C~~G~~G  196 (693)
                      ..|+..|+.. .++|.|++||.-
T Consensus         6 t~CpA~CDpn~~~~C~CPeGyIl   28 (34)
T PF09064_consen    6 TECPADCDPNSPGQCFCPEGYIL   28 (34)
T ss_pred             ccCCCccCCCCCCceeCCCceEe
Confidence            4788999885 679999999973


No 66 
>PF01683 EB:  EB module;  InterPro: IPR006149  The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO 
Probab=76.24  E-value=2.5  Score=32.06  Aligned_cols=31  Identities=39%  Similarity=0.827  Sum_probs=23.0

Q ss_pred             CCcCcccccccCCccCCCCCCCceeeCCeeecCCCcc
Q 005509          266 GLLGQFCEVPVSSTCVNQCSGHGHCRGGFCQCDSGWY  302 (693)
Q Consensus       266 G~~G~~C~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~  302 (693)
                      -..|..|+..      .+|..+..|++|+|.|++||.
T Consensus        16 ~~~g~~C~~~------~qC~~~s~C~~g~C~C~~g~~   46 (52)
T PF01683_consen   16 VQPGESCESD------EQCIGGSVCVNGRCQCPPGYV   46 (52)
T ss_pred             CCCCCCCCCc------CCCCCcCEEcCCEeECCCCCE
Confidence            3446667653      345688999999999999984


No 67 
>PF06247 Plasmod_Pvs28:  Plasmodium ookinete surface protein Pvs28;  InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=75.34  E-value=1.6  Score=42.25  Aligned_cols=133  Identities=23%  Similarity=0.565  Sum_probs=66.6

Q ss_pred             CCEEeccCC--eEEeCCCcc---CCCCCccccCcCCCCCCCCCCCCCccccccC--CCC-------CCCCceeeeCCCcc
Q 005509          130 QGVCNHELG--QCRCFHGFR---GKGCSERIHFQCNFPKTPELPYGRWVVSICP--THC-------DTTRAMCFCGEGTK  195 (693)
Q Consensus       130 ~G~C~~~~G--~C~C~~G~~---G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C-------~~~~g~C~C~~G~~  195 (693)
                      +|....-.+  .|.|.+||.   -..||..  ..|.....        ....|.  +.|       ......|.|..||.
T Consensus        10 NG~LiQMSNHfEC~Cnegfvl~~EntCE~k--v~C~~~e~--------~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~   79 (197)
T PF06247_consen   10 NGYLIQMSNHFECKCNEGFVLKNENTCEEK--VECDKLEN--------VNKPCGDYAKCINQANKGEERAYKCDCINGYI   79 (197)
T ss_dssp             TEEEEEESSEEEEEESTTEEEEETTEEEE------SG-GG--------TTSEEETTEEEEE-SSTTSSTSEEEEE-TTEE
T ss_pred             CCEEEEccCceEEEcCCCcEEccccccccc--eecCcccc--------cCccccchhhhhcCCCcccceeEEEecccCce
Confidence            455554444  899999995   4567763  34543110        011332  122       22345899999998


Q ss_pred             cCCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCc---Cccc
Q 005509          196 YPNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLL---GQFC  272 (693)
Q Consensus       196 G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~---G~~C  272 (693)
                      -..    ..|...                  .|.   .-.|. .|.|..++..      -....|.|. .|+.   +..|
T Consensus        80 ~~~----~vCvp~------------------~C~---~~~Cg-~GKCI~d~~~------~~~~~CSC~-IGkV~~dn~kC  126 (197)
T PF06247_consen   80 LKQ----GVCVPN------------------KCN---NKDCG-SGKCILDPDN------PNNPTCSCN-IGKVPDDNKKC  126 (197)
T ss_dssp             ESS----SSEEEG------------------GGS---S---T-TEEEEEEEGG------GSEEEEEE--TEEETTTTTES
T ss_pred             eeC----CeEchh------------------hcC---ceecC-CCeEEecCCC------CCCceeEee-eceEeccCCcc
Confidence            432    111100                  111   22343 6888665431      013489998 8987   5667


Q ss_pred             ccccCCccCCCCCCCceee----CCeeecCCCcccCC
Q 005509          273 EVPVSSTCVNQCSGHGHCR----GGFCQCDSGWYGVD  305 (693)
Q Consensus       273 ~~~~~~~C~~~C~~~G~C~----~g~C~C~~G~~G~~  305 (693)
                      ...-+..|.-.|..+-.|.    -++|.|+.|+.|..
T Consensus       127 tk~G~T~C~LKCk~nE~CK~~~~~Y~C~~~~~~~~~~  163 (197)
T PF06247_consen  127 TKTGETKCSLKCKENEECKLVDGYYKCVCKEGFPGDG  163 (197)
T ss_dssp             EEEE--------TTTEEEEEETTEEEEEE-TT-EEET
T ss_pred             cCCCccceeeecCCCcceeeeCcEEEeecCCCCCCCC
Confidence            7766678888998889996    34999999997654


No 68 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=72.60  E-value=1.9  Score=30.34  Aligned_cols=23  Identities=26%  Similarity=0.743  Sum_probs=16.7

Q ss_pred             CCCCCceee----CCeeecCCCcccCC
Q 005509          283 QCSGHGHCR----GGFCQCDSGWYGVD  305 (693)
Q Consensus       283 ~C~~~G~C~----~g~C~C~~G~~G~~  305 (693)
                      .|..+++|.    ...|+|++||.|+.
T Consensus         7 ~C~~nA~C~~~~~~~~C~C~~Gy~GdG   33 (36)
T PF12947_consen    7 GCHPNATCTNTGGSYTCTCKPGYEGDG   33 (36)
T ss_dssp             GS-TTCEEEE-TTSEEEEE-CEEECCS
T ss_pred             CCCCCcEeecCCCCEEeECCCCCccCC
Confidence            577788886    45899999999863


No 69 
>PF00534 Glycos_transf_1:  Glycosyl transferases group 1;  InterPro: IPR001296 The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (2.4.1.- from EC) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Proteins containing this domain transfer UDP, ADP, GDP or CMP linked sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. The bacterial enzymes are involved in various biosynthetic processes that include exopolysaccharide biosynthesis, lipopolysaccharide core biosynthesis and the biosynthesis of the slime polysaccharide colanic acid. Mutations in this domain of the human N-acetylglucosaminyl-phosphatidylinositol biosynthetic protein are the cause of paroxysmal nocturnal hemoglobinuria (PNH), an acquired hemolytic blood disorder characterised by venous thrombosis, erythrocyte hemolysis, infections and defective hematopoiesis.; GO: 0009058 biosynthetic process; PDB: 2L7C_A 2IV3_B 2IUY_B 2XA9_A 2XA1_B 2X6R_A 2XMP_B 2XA2_B 2X6Q_A 3QHP_B ....
Probab=72.30  E-value=4.4  Score=38.38  Aligned_cols=41  Identities=20%  Similarity=0.342  Sum_probs=32.1

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..++.+.|+.|.+-+.|.- ++++.-++|||.+|| |||+++.
T Consensus        83 ~~~l~~~~~~~di~v~~s~~e~~~~~~~Ea~~~g~-pvI~~~~  124 (172)
T PF00534_consen   83 DDELDELYKSSDIFVSPSRNEGFGLSLLEAMACGC-PVIASDI  124 (172)
T ss_dssp             HHHHHHHHHHTSEEEE-BSSBSS-HHHHHHHHTT--EEEEESS
T ss_pred             ccccccccccceecccccccccccccccccccccc-ceeeccc
Confidence            4578899999999999877 477888999999999 7777774


No 70 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes [].  +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=70.13  E-value=2.5  Score=30.65  Aligned_cols=21  Identities=29%  Similarity=0.865  Sum_probs=17.8

Q ss_pred             CCCCCCceee----CCeeecCCCcc
Q 005509          282 NQCSGHGHCR----GGFCQCDSGWY  302 (693)
Q Consensus       282 ~~C~~~G~C~----~g~C~C~~G~~  302 (693)
                      +.|..++.|+    ++.|.|++||.
T Consensus        10 ~~C~~~~~C~N~~Gsy~C~C~~Gy~   34 (42)
T PF07645_consen   10 HNCPENGTCVNTEGSYSCSCPPGYE   34 (42)
T ss_dssp             SSSSTTSEEEEETTEEEEEESTTEE
T ss_pred             CcCCCCCEEEcCCCCEEeeCCCCcE
Confidence            4688899997    45999999997


No 71 
>PF05686 Glyco_transf_90:  Glycosyl transferase family 90;  InterPro: IPR006598  Cryptococcus neoformans is a pathogenic fungus which most commonly affects the central nervous system and causes fatal meningoencephalitis primarily in patients with AIDS. This fungus produces a thick extracellular polysaccharide capsule which is well recognised as a virulence factor. CAP10 is required for capsule formation and virulence [].
Probab=69.48  E-value=8.1  Score=42.75  Aligned_cols=110  Identities=17%  Similarity=0.189  Sum_probs=64.7

Q ss_pred             CCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEecCCchhHHHHhhcCceeec
Q 005509          564 PREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVTSLRSENYHEDLSSSVFCGV  643 (693)
Q Consensus       564 ~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~  643 (693)
                      +=.+|.-.+||+|+...            +.+|+.|+..-.+.+.......................=++..-+-||=+.
T Consensus       153 pW~~K~p~afWRG~~~~------------~~~R~~L~~~~~~~~~~~~a~i~~~d~~~~~~~~~~~~~l~~~~~yKYli~  220 (395)
T PF05686_consen  153 PWEDKKPKAFWRGSPTV------------AETRQRLVRCSRSHPDLWDARITKQDWDKEYKPGFKHVPLEDQCKYKYLIY  220 (395)
T ss_pred             ChhhcccceEECCCcCC------------CcchhHHHHHhccCCccceeeechhhhhhhccccccccCHHHHhhhheeec
Confidence            34567788999997642            237988887544432210000000000000000111122466778889888


Q ss_pred             cCCCCCchhHHHHHhcCceeEEEeCCeeec----eecCCCccEEEec
Q 005509          644 LPGDGWSGRMEDSILQGCIPVVIQVVISSF----LLLCQNGSLKIRN  686 (693)
Q Consensus       644 p~Gd~~s~Rl~dAi~~GCIPViisd~~~~p----~l~~~~fsv~v~~  686 (693)
                      .-|.+||.||.=-|..|.|.+.+...+.++    +.||..+ |-|+.
T Consensus       221 idG~~~S~RlkylL~c~SvVl~~~~~~~e~f~~~L~P~vHY-VPV~~  266 (395)
T PF05686_consen  221 IDGNAWSGRLKYLLACNSVVLKVKSPYYEFFYRALKPWVHY-VPVKR  266 (395)
T ss_pred             CCCceeehhHHHHHcCCceEEEeCCcHHHHHHhhhcccccE-EEecc
Confidence            999999999988899999988886665444    5677665 44444


No 72 
>cd03814 GT1_like_2 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=60.56  E-value=9  Score=40.37  Aligned_cols=43  Identities=14%  Similarity=0.080  Sum_probs=35.6

Q ss_pred             CchhHHHHhhcCceeeccCCC-CCchhHHHHHhcCceeEEEeCCe
Q 005509          627 RSENYHEDLSSSVFCGVLPGD-GWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       627 ~~~~y~~~l~~S~FCL~p~Gd-~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      ...++.+.|+.|.+++.|... +++..++|||.+|+ |||.+|.-
T Consensus       256 ~~~~~~~~~~~~d~~l~~s~~e~~~~~~lEa~a~g~-PvI~~~~~  299 (364)
T cd03814         256 DGEELAAAYASADVFVFPSRTETFGLVVLEAMASGL-PVVAPDAG  299 (364)
T ss_pred             CHHHHHHHHHhCCEEEECcccccCCcHHHHHHHcCC-CEEEcCCC
Confidence            346678999999999998774 66788999999998 88888753


No 73 
>cd03802 GT1_AviGT4_like This family is most closely related to the GT1 family of glycosyltransferases. aviGT4 in Streptomyces viridochromogenes has been shown to be involved in biosynthesis of oligosaccharide antibiotic avilamycin A. Inactivation of aviGT4 resulted in a mutant that accumulated a novel avilamycin derivative lacking the terminal eurekanate residue.
Probab=59.20  E-value=14  Score=38.75  Aligned_cols=41  Identities=15%  Similarity=-0.031  Sum_probs=33.9

Q ss_pred             hhHHHHhhcCceeeccCC--CCCchhHHHHHhcCceeEEEeCCe
Q 005509          629 ENYHEDLSSSVFCGVLPG--DGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G--d~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      ....+.|+.+.+.+.|.-  .+++.-++|||.+|+ |||.+|.-
T Consensus       235 ~~~~~~~~~~d~~v~ps~~~E~~~~~~lEAma~G~-PvI~~~~~  277 (335)
T cd03802         235 AEKAELLGNARALLFPILWEEPFGLVMIEAMACGT-PVIAFRRG  277 (335)
T ss_pred             HHHHHHHHhCcEEEeCCcccCCcchHHHHHHhcCC-CEEEeCCC
Confidence            456789999999999863  467777999999996 99999853


No 74 
>cd03808 GT1_cap1E_like This family is most closely related to the GT1 family of glycosyltransferases. cap1E in Streptococcus pneumoniae is required for the synthesis of type 1 capsular polysaccharides.
Probab=58.45  E-value=13  Score=38.72  Aligned_cols=42  Identities=17%  Similarity=0.192  Sum_probs=34.4

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      ..+..+.|+.|.+.+.|.. .+.+..++|||.+| +|||.+|.-
T Consensus       254 ~~~~~~~~~~adi~i~ps~~e~~~~~~~Ea~~~G-~Pvi~s~~~  296 (359)
T cd03808         254 RDDVPELLAAADVFVLPSYREGLPRVLLEAMAMG-RPVIATDVP  296 (359)
T ss_pred             cccHHHHHHhccEEEecCcccCcchHHHHHHHcC-CCEEEecCC
Confidence            4567899999999998875 46677899999999 588888753


No 75 
>cd03823 GT1_ExpE7_like This family is most closely related to the GT1 family of glycosyltransferases. ExpE7 in Sinorhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucans (exopolysaccharide II).
Probab=58.40  E-value=10  Score=39.84  Aligned_cols=41  Identities=12%  Similarity=0.218  Sum_probs=34.7

Q ss_pred             chhHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..++.+.|+.|.+.+.|.  +.+++..++|||.+| +|||.++.
T Consensus       253 ~~~~~~~~~~ad~~i~ps~~~e~~~~~~~Ea~a~G-~Pvi~~~~  295 (359)
T cd03823         253 QEEIDDFYAEIDVLVVPSIWPENFPLVIREALAAG-VPVIASDI  295 (359)
T ss_pred             HHHHHHHHHhCCEEEEcCcccCCCChHHHHHHHCC-CCEEECCC
Confidence            367889999999999886  467778899999999 88888874


No 76 
>PF01683 EB:  EB module;  InterPro: IPR006149  The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO 
Probab=58.29  E-value=11  Score=28.60  Aligned_cols=22  Identities=36%  Similarity=0.975  Sum_probs=18.5

Q ss_pred             CCCCCCCCEEeccCCeEEeCCCcc
Q 005509          124 KSDCSGQGVCNHELGQCRCFHGFR  147 (693)
Q Consensus       124 ~~~C~~~G~C~~~~G~C~C~~G~~  147 (693)
                      ...|.++..|.  .|+|.|++||.
T Consensus        25 ~~qC~~~s~C~--~g~C~C~~g~~   46 (52)
T PF01683_consen   25 DEQCIGGSVCV--NGRCQCPPGYV   46 (52)
T ss_pred             cCCCCCcCEEc--CCEeECCCCCE
Confidence            44688899997  89999999984


No 77 
>cd03822 GT1_ecORF704_like This family is most closely related to the GT1 family of glycosyltransferases. ORF704 in E. coli has been shown to be involved in the biosynthesis of O-specific mannose homopolysaccharides.
Probab=51.95  E-value=15  Score=38.80  Aligned_cols=41  Identities=24%  Similarity=0.146  Sum_probs=34.6

Q ss_pred             chhHHHHhhcCceeeccCC-C--CCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG-D--GWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d--~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..++.+.|+.|.+.+.|.- .  +++.-+.|||.+|+ |||.+|.
T Consensus       258 ~~~~~~~~~~ad~~v~ps~~e~~~~~~~~~Ea~a~G~-PvI~~~~  301 (366)
T cd03822         258 DEELPELFSAADVVVLPYRSADQTQSGVLAYAIGFGK-PVISTPV  301 (366)
T ss_pred             HHHHHHHHhhcCEEEecccccccccchHHHHHHHcCC-CEEecCC
Confidence            4678899999999998765 4  56778999999999 9999885


No 78 
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=51.30  E-value=9.4  Score=46.64  Aligned_cols=35  Identities=29%  Similarity=0.845  Sum_probs=30.0

Q ss_pred             CCCC-CCCCCCCCEEeccCC---eEEeC-CCccCCCCCccc
Q 005509          120 GKSC-KSDCSGQGVCNHELG---QCRCF-HGFRGKGCSERI  155 (693)
Q Consensus       120 ~~~C-~~~C~~~G~C~~~~G---~C~C~-~G~~G~~Ce~~~  155 (693)
                      ...| |+.|.++|.|.. .+   .|.|. .||+|..|+..+
T Consensus       545 ~drClPN~CehgG~C~Q-s~~~f~C~C~~TGY~GatCHtsi  584 (1306)
T KOG3516|consen  545 SDRCLPNPCEHGGKCSQ-SWDDFECNCELTGYKGATCHTSI  584 (1306)
T ss_pred             ccccCCccccCCCcccc-cccceeEeccccccccccccCCC
Confidence            3678 999999999985 44   89999 999999999754


No 79 
>cd03798 GT1_wlbH_like This family is most closely related to the GT1 family of glycosyltransferases. wlbH in Bordetella parapertussis has been shown to be required for the biosynthesis of a trisaccharide that, when attached to the B. pertussis lipopolysaccharide (LPS) core (band B), generates band A LPS.
Probab=49.08  E-value=17  Score=38.05  Aligned_cols=41  Identities=17%  Similarity=0.165  Sum_probs=34.2

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..++.+.|++|.+.+.|.- ++++..++|||.+|+ |||.++.
T Consensus       269 ~~~~~~~~~~ad~~i~~~~~~~~~~~~~Ea~~~G~-pvI~~~~  310 (377)
T cd03798         269 HEEVPAYYAAADVFVLPSLREGFGLVLLEAMACGL-PVVATDV  310 (377)
T ss_pred             HHHHHHHHHhcCeeecchhhccCChHHHHHHhcCC-CEEEecC
Confidence            3567899999999998876 577888999999998 7777764


No 80 
>cd03807 GT1_WbnK_like This family is most closely related to the GT1 family of glycosyltransferases. WbnK in Shigella dysenteriae has been shown to be involved in the type 7 O-antigen biosynthesis.
Probab=47.25  E-value=19  Score=37.68  Aligned_cols=40  Identities=18%  Similarity=0.225  Sum_probs=33.5

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .+..+.|+.+.+.+.|.. .+++.-++|||.+| +|||.+|.
T Consensus       260 ~~~~~~~~~adi~v~ps~~e~~~~~~~Ea~a~g-~PvI~~~~  300 (365)
T cd03807         260 SDVPALLNALDVFVLSSLSEGFPNVLLEAMACG-LPVVATDV  300 (365)
T ss_pred             ccHHHHHHhCCEEEeCCccccCCcHHHHHHhcC-CCEEEcCC
Confidence            457799999999998876 47778899999999 58888875


No 81 
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=47.24  E-value=18  Score=32.04  Aligned_cols=29  Identities=34%  Similarity=0.929  Sum_probs=22.2

Q ss_pred             CCC--CCCCCCCCEEeccCC-eEEeCCCccCC
Q 005509          121 KSC--KSDCSGQGVCNHELG-QCRCFHGFRGK  149 (693)
Q Consensus       121 ~~C--~~~C~~~G~C~~~~G-~C~C~~G~~G~  149 (693)
                      ++|  ...|..+|.|+.... .|.|.+||.-.
T Consensus        78 d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~P~  109 (110)
T PF00954_consen   78 DQCDVYGFCGPNGICNSNNSPKCSCLPGFEPK  109 (110)
T ss_pred             cCCCCccccCCccEeCCCCCCceECCCCcCCC
Confidence            567  467999999975433 89999999643


No 82 
>cd03819 GT1_WavL_like This family is most closely related to the GT1 family of glycosyltransferases. WavL in Vibrio cholerae has been shown to be involved in the biosynthesis of the lipopolysaccharide core.
Probab=45.96  E-value=28  Score=36.83  Aligned_cols=41  Identities=7%  Similarity=0.071  Sum_probs=33.5

Q ss_pred             chhHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..++.+.|+.|...+.|.  ..+++.-++|||.+|+ |||++|.
T Consensus       254 ~~~~~~~l~~ad~~i~ps~~~e~~~~~l~EA~a~G~-PvI~~~~  296 (355)
T cd03819         254 CSDMPAAYALADIVVSASTEPEAFGRTAVEAQAMGR-PVIASDH  296 (355)
T ss_pred             cccHHHHHHhCCEEEecCCCCCCCchHHHHHHhcCC-CEEEcCC
Confidence            456789999999988875  3566778999999998 8888874


No 83 
>KOG1388 consensus Attractin and platelet-activating factor acetylhydrolase [Signal transduction mechanisms; Defense mechanisms]
Probab=45.61  E-value=15  Score=36.61  Aligned_cols=73  Identities=29%  Similarity=0.591  Sum_probs=38.5

Q ss_pred             CCCCCCEEeccCCeE-EeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC---CCCCCCCceeee-CCCcccCCCC
Q 005509          126 DCSGQGVCNHELGQC-RCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP---THCDTTRAMCFC-GEGTKYPNRP  200 (693)
Q Consensus       126 ~C~~~G~C~~~~G~C-~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~---g~C~~~~g~C~C-~~G~~G~~C~  200 (693)
                      .|++|+.|+. .-.| .|..|-+|..|+.     |..+-..+...|.+....|.   ..|....++|.| .-|..|..|+
T Consensus        53 ~cNGh~~c~t-~~v~~~~~N~~~g~~c~k-----c~~g~~GdtN~g~c~~~~~~g~~~~~~~~~~~c~c~~kgvvgd~c~  126 (217)
T KOG1388|consen   53 QCNGHSDCNT-QHVCWRCENGTTGAHCEK-----CIVGFYGDTNGGKCQPCDCNGGASACVTLTGKCFCTTKGVVGDLCP  126 (217)
T ss_pred             HhcCCCCccc-ceeeeeccCccccccCCc-----eEEEEEecCCCCccCHhhhcCCeeeeeccCCccccccceEecccCc
Confidence            3667777763 2233 3555666666654     21110000011222223343   336667889999 5789998887


Q ss_pred             CCCC
Q 005509          201 VAEA  204 (693)
Q Consensus       201 ~~~~  204 (693)
                      .++.
T Consensus       127 ~~e~  130 (217)
T KOG1388|consen  127 KCEV  130 (217)
T ss_pred             cccc
Confidence            6644


No 84 
>cd03821 GT1_Bme6_like This family is most closely related to the GT1 family of glycosyltransferases. Bme6 in Brucella melitensis has been shown to be involved in the biosynthesis of a polysaccharide.
Probab=45.11  E-value=22  Score=37.34  Aligned_cols=41  Identities=15%  Similarity=0.216  Sum_probs=33.9

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      .++.+.|+.+.+.+.|.- .+++.-++|||.+|+ |||.++..
T Consensus       273 ~~~~~~~~~adv~v~ps~~e~~~~~~~Eama~G~-PvI~~~~~  314 (375)
T cd03821         273 EDKAAALADADLFVLPSHSENFGIVVAEALACGT-PVVTTDKV  314 (375)
T ss_pred             HHHHHHHhhCCEEEeccccCCCCcHHHHHHhcCC-CEEEcCCC
Confidence            467789999999988776 577788999999995 88888754


No 85 
>cd04951 GT1_WbdM_like This family is most closely related to the GT1 family of glycosyltransferases and is named after WbdM in Escherichia coli. In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have
Probab=44.32  E-value=24  Score=37.35  Aligned_cols=40  Identities=10%  Similarity=0.160  Sum_probs=32.9

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .+..+.|+.+.+-+.|.. .+++.-++|||.+|+ |||.+|.
T Consensus       254 ~~~~~~~~~ad~~v~~s~~e~~~~~~~Ea~a~G~-PvI~~~~  294 (360)
T cd04951         254 DDIAAYYNAADLFVLSSAWEGFGLVVAEAMACEL-PVVATDA  294 (360)
T ss_pred             ccHHHHHHhhceEEecccccCCChHHHHHHHcCC-CEEEecC
Confidence            456788999999888766 467778999999999 8888875


No 86 
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=43.28  E-value=15  Score=44.33  Aligned_cols=41  Identities=29%  Similarity=0.856  Sum_probs=30.2

Q ss_pred             ccccccCCccC-CCCCCCceeeCC----eeec-CCCcccCCCCCCcc
Q 005509          271 FCEVPVSSTCV-NQCSGHGHCRGG----FCQC-DSGWYGVDCSIPSV  311 (693)
Q Consensus       271 ~C~~~~~~~C~-~~C~~~G~C~~g----~C~C-~~G~~G~~C~~~~~  311 (693)
                      .|....+..|. ++|.|+|.|..|    .|.| ..||.|..|+....
T Consensus       617 sCs~~~~~~C~~nPC~N~g~C~egwNrfiCDCs~T~~~G~~CerE~t  663 (1591)
T KOG3514|consen  617 SCSLSNEKICESNPCQNGGKCSEGWNRFICDCSGTGFEGRTCEREAT  663 (1591)
T ss_pred             ccchhhccccCCCcccCCCCccccccccccccccCcccCccccceee
Confidence            34433233554 799999999855    7999 57999999998654


No 87 
>PF12662 cEGF:  Complement Clr-like EGF-like
Probab=42.86  E-value=16  Score=23.30  Aligned_cols=10  Identities=30%  Similarity=0.856  Sum_probs=6.2

Q ss_pred             CeeecCCCcc
Q 005509          293 GFCQCDSGWY  302 (693)
Q Consensus       293 g~C~C~~G~~  302 (693)
                      ++|.|++||.
T Consensus         2 y~C~C~~Gy~   11 (24)
T PF12662_consen    2 YTCSCPPGYQ   11 (24)
T ss_pred             EEeeCCCCCc
Confidence            3566666664


No 88 
>cd03801 GT1_YqgM_like This family is most closely related to the GT1 family of glycosyltransferases and named after YqgM in Bacillus licheniformis about which little is known. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. 
Probab=41.81  E-value=27  Score=36.26  Aligned_cols=41  Identities=17%  Similarity=0.199  Sum_probs=33.9

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..++.+.|++|.+-+.|.- ++.+..++|||.+|+ |||.++.
T Consensus       266 ~~~~~~~~~~~di~i~~~~~~~~~~~~~Ea~~~g~-pvI~~~~  307 (374)
T cd03801         266 DEDLPALYAAADVFVLPSLYEGFGLVLLEAMAAGL-PVVASDV  307 (374)
T ss_pred             hhhHHHHHHhcCEEEecchhccccchHHHHHHcCC-cEEEeCC
Confidence            4678899999999988765 466788999999996 7887774


No 89 
>cd03818 GT1_ExpC_like This family is most closely related to the GT1 family of glycosyltransferases. ExpC in Rhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucan (exopolysaccharide II).
Probab=41.18  E-value=81  Score=34.44  Aligned_cols=40  Identities=23%  Similarity=0.085  Sum_probs=31.3

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .++.+.|+.|...+.|.- .+.+.-++|||.+|+ |||.+|.
T Consensus       292 ~~~~~~l~~adv~v~~s~~e~~~~~llEAmA~G~-PVIas~~  332 (396)
T cd03818         292 DQYLALLQVSDVHVYLTYPFVLSWSLLEAMACGC-LVVGSDT  332 (396)
T ss_pred             HHHHHHHHhCcEEEEcCcccccchHHHHHHHCCC-CEEEcCC
Confidence            567789999998887654 344556999999998 8888874


No 90 
>PF13692 Glyco_trans_1_4:  Glycosyl transferases group 1; PDB: 3OY2_A 3OY7_B 2Q6V_A 2HY7_A 3CV3_A 3CUY_A.
Probab=40.14  E-value=44  Score=29.95  Aligned_cols=40  Identities=18%  Similarity=0.305  Sum_probs=28.2

Q ss_pred             hhHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      +++.+.|+++.+.+.|.  +.+.+.+++|+|.+|+ |||.++.
T Consensus        62 ~e~~~~l~~~dv~l~p~~~~~~~~~k~~e~~~~G~-pvi~~~~  103 (135)
T PF13692_consen   62 EELPEILAAADVGLIPSRFNEGFPNKLLEAMAAGK-PVIASDN  103 (135)
T ss_dssp             HHHHHHHHC-SEEEE-BSS-SCC-HHHHHHHCTT---EEEEHH
T ss_pred             HHHHHHHHhCCEEEEEeeCCCcCcHHHHHHHHhCC-CEEECCc
Confidence            57899999999999986  4456788999999997 5555654


No 91 
>cd03816 GT1_ALG1_like This family is most closely related to the GT1 family of glycosyltransferases. The yeast gene ALG1 has been shown to function as a mannosyltransferase that catalyzes the formation of dolichol pyrophosphate (Dol-PP)-GlcNAc2Man from GDP-Man and Dol-PP-Glc-NAc2, and participates in the formation of the lipid-linked precursor oligosaccharide for N-glycosylation. In humans ALG1 has been associated with the congenital disorders of glycosylation (CDG) designated as subtype CDG-Ik.
Probab=37.93  E-value=77  Score=35.09  Aligned_cols=42  Identities=24%  Similarity=0.181  Sum_probs=32.5

Q ss_pred             chhHHHHhhcCceeeccC----CCCCchhHHHHHhcCceeEEEeCCe
Q 005509          628 SENYHEDLSSSVFCGVLP----GDGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~----Gd~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      .+++.+.|+.|...+.|.    |.+....++|||.+|. |||.++.-
T Consensus       305 ~~~~~~~l~~aDv~v~~~~~~~~~~~p~~~~Eama~G~-PVI~s~~~  350 (415)
T cd03816         305 AEDYPKLLASADLGVSLHTSSSGLDLPMKVVDMFGCGL-PVCALDFK  350 (415)
T ss_pred             HHHHHHHHHhCCEEEEccccccccCCcHHHHHHHHcCC-CEEEeCCC
Confidence            467888999999887532    3345667999999998 99998853


No 92 
>TIGR03087 stp1 sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate.
Probab=37.08  E-value=44  Score=36.57  Aligned_cols=39  Identities=13%  Similarity=0.141  Sum_probs=32.0

Q ss_pred             hHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509          630 NYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       630 ~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      +..+.|+.+...+.|.  +.|....++|||.+|+ |||.++.
T Consensus       290 ~~~~~~~~adv~v~Ps~~~eG~~~~~lEAma~G~-PVV~t~~  330 (397)
T TIGR03087       290 DVRPYLAHAAVAVAPLRIARGIQNKVLEAMAMAK-PVVASPE  330 (397)
T ss_pred             CHHHHHHhCCEEEecccccCCcccHHHHHHHcCC-CEEecCc
Confidence            5678899999988874  5566678999999997 9999874


No 93 
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=36.66  E-value=75  Score=37.94  Aligned_cols=65  Identities=26%  Similarity=0.594  Sum_probs=35.5

Q ss_pred             CCCCCEEeccCCeEEeCCCcc----CCCCCccccCcCCCCCCCC-CCCCCccccccCCCCCC-CCc--eeeeCCCcccCC
Q 005509          127 CSGQGVCNHELGQCRCFHGFR----GKGCSERIHFQCNFPKTPE-LPYGRWVVSICPTHCDT-TRA--MCFCGEGTKYPN  198 (693)
Q Consensus       127 C~~~G~C~~~~G~C~C~~G~~----G~~Ce~~~~~~C~~~~~~~-~~~g~~~~~~C~g~C~~-~~g--~C~C~~G~~G~~  198 (693)
                      |++-|.=.--.|.|.|.+||.    |..|+.     |+.+.-.. .....|  ..||.+-.. ..|  .|.|..||+-..
T Consensus       248 C~~dGeWlvpiG~C~C~aGye~~~~~~~C~a-----Cp~G~yK~~~~~~~C--~~CP~~S~s~~ega~~C~C~~gyyRA~  320 (996)
T KOG0196|consen  248 CSGDGEWLVPIGGCVCKAGYEEAENGKACQA-----CPPGTYKASQGDSLC--LPCPPNSHSSSEGATSCTCENGYYRAD  320 (996)
T ss_pred             EcCCCcEEEEcCceeecCCCCcccCCCccee-----CCCCcccCCCCCCCC--CCCCCCCCCCCCCCCcccccCCcccCC
Confidence            666665544468999999994    566764     65431100 000011  245533222 222  789999987543


No 94 
>cd03805 GT1_ALG2_like This family is most closely related to the GT1 family of glycosyltransferases. ALG2, a 1,3-mannosyltransferase, in yeast catalyzes the mannosylation of Man(2)GlcNAc(2)-dolichol diphosphate and Man(1)GlcNAc(2)-dolichol diphosphate to form Man(3)GlcNAc(2)-dolichol diphosphate. A deficiency of this enzyme causes an abnormal accumulation of Man1GlcNAc2-PP-dolichol and Man2GlcNAc2-PP-dolichol, which is associated with a type of congenital disorder of glycosylation (CDG), designated CDG-Ii, in humans.
Probab=36.43  E-value=51  Score=35.60  Aligned_cols=40  Identities=18%  Similarity=0.079  Sum_probs=32.0

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ....+.|+.|.+.+.|.. .+++.-++|||.+| +|||.+|.
T Consensus       291 ~~~~~~l~~ad~~l~~s~~E~~g~~~lEAma~G-~PvI~s~~  331 (392)
T cd03805         291 SQKELLLSSARALLYTPSNEHFGIVPLEAMYAG-KPVIACNS  331 (392)
T ss_pred             HHHHHHHhhCeEEEECCCcCCCCchHHHHHHcC-CCEEEECC
Confidence            455688999999998766 35666689999999 78888875


No 95 
>PLN02871 UDP-sulfoquinovose:DAG sulfoquinovosyltransferase
Probab=36.12  E-value=34  Score=38.52  Aligned_cols=40  Identities=13%  Similarity=0.150  Sum_probs=34.3

Q ss_pred             hhHHHHhhcCceeeccCCC-CCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPGD-GWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~Gd-~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .++.+.|+.+...+.|... +++.-++|||.+| +|||.++.
T Consensus       323 ~ev~~~~~~aDv~V~pS~~E~~g~~vlEAmA~G-~PVI~s~~  363 (465)
T PLN02871        323 DELSQAYASGDVFVMPSESETLGFVVLEAMASG-VPVVAARA  363 (465)
T ss_pred             HHHHHHHHHCCEEEECCcccccCcHHHHHHHcC-CCEEEcCC
Confidence            5788999999999988763 6677799999999 99999874


No 96 
>cd03794 GT1_wbuB_like This family is most closely related to the GT1 family of glycosyltransferases. wbuB in E. coli is involved in the biosynthesis of the O26 O-antigen.  It has been proposed to function as an N-acetyl-L-fucosamine (L-FucNAc) transferase.
Probab=36.05  E-value=36  Score=35.79  Aligned_cols=43  Identities=19%  Similarity=0.076  Sum_probs=33.3

Q ss_pred             chhHHHHhhcCceeeccCCC-CC-----chhHHHHHhcCceeEEEeCCee
Q 005509          628 SENYHEDLSSSVFCGVLPGD-GW-----SGRMEDSILQGCIPVVIQVVIS  671 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~Gd-~~-----s~Rl~dAi~~GCIPViisd~~~  671 (693)
                      ..++.+.|+.+.+.+.|... ++     ..+++|||.+|+ |||.++.-.
T Consensus       285 ~~~~~~~~~~~di~i~~~~~~~~~~~~~p~~~~Ea~~~G~-pvi~~~~~~  333 (394)
T cd03794         285 KEELPELLAAADVGLVPLKPGPAFEGVSPSKLFEYMAAGK-PVLASVDGE  333 (394)
T ss_pred             hHHHHHHHHhhCeeEEeccCcccccccCchHHHHHHHCCC-cEEEecCCC
Confidence            35778999999999988763 22     456999999995 888887543


No 97 
>cd03804 GT1_wbaZ_like This family is most closely related to the GT1 family of glycosyltransferases. wbaZ in Salmonella enterica has been shown to possess mannosyl transferase activity. The members of this family are found in certain bacteria and Archaea.
Probab=35.45  E-value=41  Score=35.81  Aligned_cols=41  Identities=10%  Similarity=0.068  Sum_probs=33.4

Q ss_pred             chhHHHHhhcCceeeccCCCCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPGDGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~Gd~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ...+.+.|+.+...+.|.=.+++.-++|||.+|+ |||.++.
T Consensus       252 ~~~~~~~~~~ad~~v~ps~e~~g~~~~Eama~G~-Pvi~~~~  292 (351)
T cd03804         252 DEELRDLYARARAFLFPAEEDFGIVPVEAMASGT-PVIAYGK  292 (351)
T ss_pred             HHHHHHHHHhCCEEEECCcCCCCchHHHHHHcCC-CEEEeCC
Confidence            3557899999999888754667777899999997 9998874


No 98 
>PRK15427 colanic acid biosynthesis glycosyltransferase WcaL; Provisional
Probab=34.34  E-value=1.5e+02  Score=32.79  Aligned_cols=45  Identities=20%  Similarity=0.217  Sum_probs=34.2

Q ss_pred             chhHHHHhhcCceeeccC-----C--CCCchhHHHHHhcCceeEEEeCCeeec
Q 005509          628 SENYHEDLSSSVFCGVLP-----G--DGWSGRMEDSILQGCIPVVIQVVISSF  673 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~-----G--d~~s~Rl~dAi~~GCIPViisd~~~~p  673 (693)
                      ..+..+.|+.+...+.|.     |  +|...-++|||.+| +|||.++.-..+
T Consensus       289 ~~el~~~l~~aDv~v~pS~~~~~g~~Eg~p~~llEAma~G-~PVI~t~~~g~~  340 (406)
T PRK15427        289 SHEVKAMLDDADVFLLPSVTGADGDMEGIPVALMEAMAVG-IPVVSTLHSGIP  340 (406)
T ss_pred             HHHHHHHHHhCCEEEECCccCCCCCccCccHHHHHHHhCC-CCEEEeCCCCch
Confidence            356789999999988874     2  35566799999999 599998753333


No 99 
>PRK15484 lipopolysaccharide 1,2-N-acetylglucosaminetransferase; Provisional
Probab=34.14  E-value=42  Score=36.70  Aligned_cols=43  Identities=14%  Similarity=0.149  Sum_probs=34.5

Q ss_pred             chhHHHHhhcCceeeccCC--CCCchhHHHHHhcCceeEEEeCCee
Q 005509          628 SENYHEDLSSSVFCGVLPG--DGWSGRMEDSILQGCIPVVIQVVIS  671 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G--d~~s~Rl~dAi~~GCIPViisd~~~  671 (693)
                      ..+..+.|+.|...+.|..  .+++.-++|||.+| +|||.++.-.
T Consensus       267 ~~~l~~~~~~aDv~v~pS~~~E~f~~~~lEAma~G-~PVI~s~~gg  311 (380)
T PRK15484        267 PEKMHNYYPLADLVVVPSQVEEAFCMVAVEAMAAG-KPVLASTKGG  311 (380)
T ss_pred             HHHHHHHHHhCCEEEeCCCCccccccHHHHHHHcC-CCEEEeCCCC
Confidence            3567789999999998864  45666799999999 8999998533


No 100
>cd03809 GT1_mtfB_like This family is most closely related to the GT1 family of glycosyltransferases. mtfB (mannosyltransferase B) in E. coli has been shown to direct the growth of the O9-specific polysaccharide chain. It transfers two mannoses to position 3 of the previously synthesized polysaccharide.
Probab=34.07  E-value=28  Score=36.62  Aligned_cols=41  Identities=12%  Similarity=0.139  Sum_probs=32.9

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ...+.+.|+.+.+.+.|.- ++++.-++|||.+|+ |||.++.
T Consensus       263 ~~~~~~~~~~~d~~l~ps~~e~~~~~~~Ea~a~G~-pvI~~~~  304 (365)
T cd03809         263 DEELAALYRGARAFVFPSLYEGFGLPVLEAMACGT-PVIASNI  304 (365)
T ss_pred             hhHHHHHHhhhhhhcccchhccCCCCHHHHhcCCC-cEEecCC
Confidence            3567889999999888754 466777999999995 8888875


No 101
>cd03800 GT1_Sucrose_synthase This family is most closely related to the GT1 family of glycosyltransferases. The sucrose-phosphate synthases in this family may be unique to plants and photosynthetic bacteria. This enzyme catalyzes the synthesis of sucrose 6-phosphate from fructose 6-phosphate and uridine 5'-diphosphate-glucose, a key regulatory step of sucrose metabolism. The activity of this enzyme is regulated by phosphorylation and moderated by the concentration of various metabolites and light.
Probab=33.37  E-value=39  Score=36.35  Aligned_cols=40  Identities=15%  Similarity=0.114  Sum_probs=32.7

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .++.+.|+.|...+.|.- ++++.-++|||.+| +|||.++.
T Consensus       294 ~~~~~~~~~adi~l~ps~~e~~~~~l~Ea~a~G-~Pvi~s~~  334 (398)
T cd03800         294 EDLPALYRAADVFVNPALYEPFGLTALEAMACG-LPVVATAV  334 (398)
T ss_pred             HHHHHHHHhCCEEEecccccccCcHHHHHHhcC-CCEEECCC
Confidence            567788999999988865 35566799999999 69999874


No 102
>PF00919 UPF0004:  Uncharacterized protein family UPF0004;  InterPro: IPR013848  The methylthiotransferase (MTTase) or miaB-like family is named after the (dimethylallyl)adenosine tRNA MTTase miaB protein, which catalyses a C-H to C-S bond conversion in the methylthiolation of tRNA. A related bacterial enzyme rimO performs a similar methylthiolation, but on a protein substrate. RimO acts on the ribosomal protein S12 and forms a separate MTTase subfamily. The miaB-subfamily includes mammalian CDK5 regulatory subunit-associated proteins and similar proteins in other eukaryotes. Two other subfamilies, yqeV and CDKAL1, are named after a Bacillus subtilis and a human protein, respectively. While yqeV-like proteins are found in bacteria, CDKAL1 subfamily members occur in eukaryotes and in archaebacteria. The likely MTTases from these 4 subfamilies contain an N-terminal MTTase domain, a central radical generating fold and a C-terminal TRAM domain (see PDOC50926 from PROSITEDOC). The core forms a radical SAM fold (or AdoMet radical), containing a cysteine motif CxxxCxxC that binds a [4Fe-4S] cluster. A reducing equivalent from the [4Fe-4S]+ cluster is used to cleave S-adenosylmethionine (SAM) to generate methionine and a 5'-deoxyadenosyl radical. The latter is thought to produce a reactive substrate radical that is amenable to sulphur insertion. The N-terminal MTTase domain contains 3 cysteines that bind a second [4Fe-4S] cluster, in addition to the radical-generating [4Fe-4S] cluster, which could be involved in the thiolation reaction. The C-terminal TRAM domain is not shared with other radical SAM proteins outside the MTTase family. The TRAM domain can bind to RNA substrate and seems to be important for substrate recognition. The tertiary structure of the central radical SAM fold has six beta/alpha motifs resembling a three-quarter TIM barrel core (see PDOC00155 from PROSITEDOC). The N-terminal MTTase domain might form an additional [beta/alpha]2 TIM barrel unit.; GO: 0003824 catalytic activity, 0051539 4 iron, 4 sulfur cluster binding, 0009451 RNA modification
Probab=32.81  E-value=40  Score=29.44  Aligned_cols=32  Identities=22%  Similarity=0.162  Sum_probs=21.6

Q ss_pred             ccchhHHHHHHHHhcCCC-cCCCcCCCceEEEec
Q 005509          392 MLYGSQMAFYESILASPH-RTLNGEEADFFFVPV  424 (693)
Q Consensus       392 ~~y~~E~~~~~~L~~s~~-rT~dP~eAdlF~VP~  424 (693)
                      ++|.+|.+ ...|.+..+ .|.+|++||+++|--
T Consensus        12 N~~Dse~i-~~~l~~~G~~~~~~~e~AD~iiiNT   44 (98)
T PF00919_consen   12 NQYDSERI-ASILQAAGYEIVDDPEEADVIIINT   44 (98)
T ss_pred             cHHHHHHH-HHHHHhcCCeeecccccCCEEEEEc
Confidence            34555643 344555544 799999999998854


No 103
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=32.67  E-value=26  Score=43.05  Aligned_cols=40  Identities=28%  Similarity=0.674  Sum_probs=31.8

Q ss_pred             Ccc-CCCCCCCceee----CCeeecC-CCcccCCCCCCccCCCCCC
Q 005509          278 STC-VNQCSGHGHCR----GGFCQCD-SGWYGVDCSIPSVMSSMSE  317 (693)
Q Consensus       278 ~~C-~~~C~~~G~C~----~g~C~C~-~G~~G~~C~~~~~~~~~~~  317 (693)
                      +.| ||+|.++|.|.    +..|.|. .||.|..|..+......++
T Consensus       546 drClPN~CehgG~C~Qs~~~f~C~C~~TGY~GatCHtsi~e~SCea  591 (1306)
T KOG3516|consen  546 DRCLPNPCEHGGKCSQSWDDFECNCELTGYKGATCHTSIYELSCEA  591 (1306)
T ss_pred             cccCCccccCCCcccccccceeEeccccccccccccCCCcchhhHH
Confidence            456 48999999997    5699998 9999999998776544433


No 104
>cd05844 GT1_like_7 Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center
Probab=29.74  E-value=78  Score=33.64  Aligned_cols=42  Identities=14%  Similarity=0.060  Sum_probs=32.2

Q ss_pred             hhHHHHhhcCceeeccCC-------CCCchhHHHHHhcCceeEEEeCCee
Q 005509          629 ENYHEDLSSSVFCGVLPG-------DGWSGRMEDSILQGCIPVVIQVVIS  671 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-------d~~s~Rl~dAi~~GCIPViisd~~~  671 (693)
                      .+..+.|+.|...+.|.-       .+++..++|||.+|+ |||.+|.-.
T Consensus       256 ~~l~~~~~~ad~~v~ps~~~~~~~~E~~~~~~~EA~a~G~-PvI~s~~~~  304 (367)
T cd05844         256 AEVRELMRRARIFLQPSVTAPSGDAEGLPVVLLEAQASGV-PVVATRHGG  304 (367)
T ss_pred             HHHHHHHHhCCEEEECcccCCCCCccCCchHHHHHHHcCC-CEEEeCCCC
Confidence            567788999998776642       345778999999995 999998643


No 105
>cd04962 GT1_like_5 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=28.76  E-value=57  Score=34.75  Aligned_cols=41  Identities=17%  Similarity=0.161  Sum_probs=33.3

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      .+..+.|+.+...+.|.- .+++.-++|||.+| +|||.+|.-
T Consensus       262 ~~~~~~~~~~d~~v~ps~~E~~~~~~~EAma~g-~PvI~s~~~  303 (371)
T cd04962         262 DHVEELLSIADLFLLPSEKESFGLAALEAMACG-VPVVASNAG  303 (371)
T ss_pred             ccHHHHHHhcCEEEeCCCcCCCccHHHHHHHcC-CCEEEeCCC
Confidence            457789999999998864 35566799999999 899998753


No 106
>TIGR03088 stp2 sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate.
Probab=28.67  E-value=53  Score=35.32  Aligned_cols=41  Identities=15%  Similarity=0.228  Sum_probs=32.3

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      .+..+.|+.|.+.+.|.- .|.+.-++|||.+| +|||.+|.-
T Consensus       264 ~~~~~~~~~adi~v~pS~~Eg~~~~~lEAma~G-~Pvv~s~~~  305 (374)
T TIGR03088       264 DDVPALMQALDLFVLPSLAEGISNTILEAMASG-LPVIATAVG  305 (374)
T ss_pred             CCHHHHHHhcCEEEeccccccCchHHHHHHHcC-CCEEEcCCC
Confidence            467788999998877643 35666799999999 599999853


No 107
>cd03806 GT1_ALG11_like This family is most closely related to the GT1 family of glycosyltransferases. ALG11 in yeast is involved in adding the final 1,2-linked Man to the Man5GlcNAc2-PP-Dol synthesized on the cytosolic face of the ER. The deletion analysis of ALG11 was shown to block the early steps of core biosynthesis that takes place on the cytoplasmic face of the ER and lead to a defect in the assembly of lipid-linked oligosaccharides.
Probab=28.51  E-value=2.1e+02  Score=31.79  Aligned_cols=40  Identities=18%  Similarity=0.169  Sum_probs=31.3

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeC
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQV  668 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd  668 (693)
                      ..++.+.|+.|...+.|.= .+++.=++|||.+||+||. ++
T Consensus       315 ~~~l~~~l~~adv~v~~s~~E~Fgi~~lEAMa~G~pvIa-~~  355 (419)
T cd03806         315 FEELLEELSTASIGLHTMWNEHFGIGVVEYMAAGLIPLA-HA  355 (419)
T ss_pred             HHHHHHHHHhCeEEEECCccCCcccHHHHHHHcCCcEEE-Ec
Confidence            4677899999999988764 4666679999999996664 44


No 108
>cd04955 GT1_like_6 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=28.36  E-value=86  Score=33.11  Aligned_cols=41  Identities=17%  Similarity=0.237  Sum_probs=30.9

Q ss_pred             chhHHHHhhcCceeeccCC--CCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG--DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G--d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .....+.++.+...+.|.-  .+++.-++|||.+|+ |||.++.
T Consensus       258 ~~~~~~~~~~ad~~v~ps~~~e~~~~~~~EAma~G~-PvI~s~~  300 (363)
T cd04955         258 DQELLELLRYAALFYLHGHSVGGTNPSLLEAMAYGC-PVLASDN  300 (363)
T ss_pred             hHHHHHHHHhCCEEEeCCccCCCCChHHHHHHHcCC-CEEEecC
Confidence            3556788888888877653  355667999999999 7887764


No 109
>smart00672 CAP10 Putative lipopolysaccharide-modifying enzyme.
Probab=27.86  E-value=84  Score=32.57  Aligned_cols=105  Identities=14%  Similarity=0.096  Sum_probs=64.7

Q ss_pred             CCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCc--cccCcccCcce--EEec-CCchhHHHHhhcC
Q 005509          564 PREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKE--GKLGKQHAEDV--IVTS-LRSENYHEDLSSS  638 (693)
Q Consensus       564 ~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~--~~~g~~~~~~~--~~~~-~~~~~y~~~l~~S  638 (693)
                      +=++|.-.++|+|+...            +..|++|++...+.+...  +........+.  .... .....=.+...+-
T Consensus        79 pW~~K~~~a~WRG~~~~------------~~~R~~Lv~~~~~~p~~~da~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y  146 (256)
T smart00672       79 KWSDKNAYAYWRGNPTV------------ASERLDLIKCNQSSPELVNARITIQDWPGKCDGEEDAPGFKKSPLEEQCKH  146 (256)
T ss_pred             CccccCcCccccCCCCC------------CcchHHHHHHhcCCcccceeEEEEecCCCCChHHhcccCcCCCCHHHHhhc
Confidence            44667888999997632            127999998766654211  00000000000  0000 0011224666788


Q ss_pred             ceeeccCCCCCchhHHHHHhcCceeEEEeCCeeec----eecCCCc
Q 005509          639 VFCGVLPGDGWSGRMEDSILQGCIPVVIQVVISSF----LLLCQNG  680 (693)
Q Consensus       639 ~FCL~p~Gd~~s~Rl~dAi~~GCIPViisd~~~~p----~l~~~~f  680 (693)
                      ||=+..-|.++|.||.=-|.++.|++.....+..+    +.||..+
T Consensus       147 Kyli~~dG~~~S~rl~~~l~~~Svvl~~~~~~~~~~~~~L~P~~HY  192 (256)
T smart00672      147 KYKINIEGVAWSVRLKYILACDSVVLKVKPEYYEFFSRGLQPWVHY  192 (256)
T ss_pred             ceEEecCCccchhhHHHHHhcCceEEEeCCchhHHHHhcccCccce
Confidence            99888999999999999999999999988666554    5666554


No 110
>cd03792 GT1_Trehalose_phosphorylase Trehalose phosphorylase (TP) reversibly catalyzes trehalose synthesis and degradation from alpha-glucose-1-phosphate (alpha-Glc-1-P) and glucose. The catalyzing activity includes the phosphorolysis of trehalose, which produces alpha-Glc-1-P and glucose, and the subsequent synthesis of trehalose. This family is most closely related to the GT1 family of glycosyltransferases.
Probab=27.10  E-value=64  Score=34.81  Aligned_cols=42  Identities=14%  Similarity=0.127  Sum_probs=34.2

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI  670 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~  670 (693)
                      .....+.|+.+...+.|.- .+++.-++|||.+| +|||.++.-
T Consensus       264 ~~~~~~~~~~ad~~v~~s~~Eg~g~~~lEA~a~G-~Pvv~s~~~  306 (372)
T cd03792         264 DLEVNALQRASTVVLQKSIREGFGLTVTEALWKG-KPVIAGPVG  306 (372)
T ss_pred             HHHHHHHHHhCeEEEeCCCccCCCHHHHHHHHcC-CCEEEcCCC
Confidence            3566788999999888765 57777899999999 799999853


No 111
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=26.03  E-value=42  Score=40.82  Aligned_cols=34  Identities=26%  Similarity=0.514  Sum_probs=25.5

Q ss_pred             CCCCCCCceecCCcccccccccccccccccCCCCCcCcccccc
Q 005509          233 TTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVP  275 (693)
Q Consensus       233 ~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~  275 (693)
                      +++|.|+|.|...-+         ...|.|.-.+|.|..|+..
T Consensus       628 ~nPC~N~g~C~egwN---------rfiCDCs~T~~~G~~CerE  661 (1591)
T KOG3514|consen  628 SNPCQNGGKCSEGWN---------RFICDCSGTGFEGRTCERE  661 (1591)
T ss_pred             CCcccCCCCcccccc---------ccccccccCcccCccccce
Confidence            677888888854321         4689998679999999864


No 112
>PHA01633 putative glycosyl transferase group 1
Probab=24.73  E-value=83  Score=34.02  Aligned_cols=40  Identities=25%  Similarity=0.345  Sum_probs=32.4

Q ss_pred             hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      .+..+.++.|.+-+.|.- .+++.=+.|||.+|+ |||.+|-
T Consensus       215 ~dl~~~y~~aDifV~PS~~EgfGlvlLEAMA~G~-PVVas~~  255 (335)
T PHA01633        215 EYIFAFYGAMDFTIVPSGTEGFGMPVLESMAMGT-PVIHQLM  255 (335)
T ss_pred             HHHHHHHHhCCEEEECCccccCCHHHHHHHHcCC-CEEEccC
Confidence            556788999998877754 466777999999999 9999865


No 113
>PRK09922 UDP-D-galactose:(glucosyl)lipopolysaccharide-1,6-D-galactosyltransferase; Provisional
Probab=23.47  E-value=87  Score=33.69  Aligned_cols=38  Identities=11%  Similarity=0.198  Sum_probs=30.0

Q ss_pred             hHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeC
Q 005509          630 NYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQV  668 (693)
Q Consensus       630 ~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd  668 (693)
                      .+.+.|+.+...+.|.- .+++.-++|||.+| +|||.+|
T Consensus       250 ~~~~~~~~~d~~v~~s~~Egf~~~~lEAma~G-~Pvv~s~  288 (359)
T PRK09922        250 VVQQKIKNVSALLLTSKFEGFPMTLLEAMSYG-IPCISSD  288 (359)
T ss_pred             HHHHHHhcCcEEEECCcccCcChHHHHHHHcC-CCEEEeC
Confidence            44566778888877765 46777799999999 7898888


No 114
>cd03795 GT1_like_4 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP-linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=22.78  E-value=86  Score=32.94  Aligned_cols=40  Identities=13%  Similarity=0.142  Sum_probs=31.6

Q ss_pred             hhHHHHhhcCceeeccC---CCCCchhHHHHHhcCceeEEEeCC
Q 005509          629 ENYHEDLSSSVFCGVLP---GDGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       629 ~~y~~~l~~S~FCL~p~---Gd~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..+.+.++.+...+.|.   +.+++.-+.|||.+| +|||.+|.
T Consensus       255 ~~~~~~~~~ad~~i~ps~~~~e~~g~~~~Ea~~~g-~Pvi~~~~  297 (357)
T cd03795         255 EEKAALLAACDVFVFPSVERSEAFGIVLLEAMAFG-KPVISTEI  297 (357)
T ss_pred             HHHHHHHHhCCEEEeCCcccccccchHHHHHHHcC-CCEEecCC
Confidence            55778999999998874   356677799999998 68888774


No 115
>PF12946 EGF_MSP1_1:  MSP1 EGF domain 1;  InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=22.40  E-value=71  Score=22.69  Aligned_cols=24  Identities=29%  Similarity=0.888  Sum_probs=15.4

Q ss_pred             CCCCCCCEEeccC-C--eEEeCCCccC
Q 005509          125 SDCSGQGVCNHEL-G--QCRCFHGFRG  148 (693)
Q Consensus       125 ~~C~~~G~C~~~~-G--~C~C~~G~~G  148 (693)
                      ..|--|..|.... |  +|+|..||..
T Consensus         5 ~~cP~NA~C~~~~dG~eecrCllgyk~   31 (37)
T PF12946_consen    5 TKCPANAGCFRYDDGSEECRCLLGYKK   31 (37)
T ss_dssp             S---TTEEEEEETTSEEEEEE-TTEEE
T ss_pred             ccCCCCcccEEcCCCCEEEEeeCCccc
Confidence            4566788896555 6  8999999964


No 116
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=20.46  E-value=79  Score=27.90  Aligned_cols=22  Identities=27%  Similarity=0.705  Sum_probs=18.1

Q ss_pred             CCCCCCceee---CCeeecCCCccc
Q 005509          282 NQCSGHGHCR---GGFCQCDSGWYG  303 (693)
Q Consensus       282 ~~C~~~G~C~---~g~C~C~~G~~G  303 (693)
                      ..|...|.|+   ...|.|.+||.-
T Consensus        84 ~~CG~~g~C~~~~~~~C~Cl~GF~P  108 (110)
T PF00954_consen   84 GFCGPNGICNSNNSPKCSCLPGFEP  108 (110)
T ss_pred             cccCCccEeCCCCCCceECCCCcCC
Confidence            6799999997   347999999963


No 117
>PF14670 FXa_inhibition:  Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=20.42  E-value=67  Score=22.55  Aligned_cols=16  Identities=31%  Similarity=1.074  Sum_probs=11.3

Q ss_pred             EEeccCC--eEEeCCCcc
Q 005509          132 VCNHELG--QCRCFHGFR  147 (693)
Q Consensus       132 ~C~~~~G--~C~C~~G~~  147 (693)
                      .|....|  +|.|++||.
T Consensus        11 ~C~~~~g~~~C~C~~Gy~   28 (36)
T PF14670_consen   11 ICVNTPGSYRCSCPPGYK   28 (36)
T ss_dssp             EEEEETTSEEEE-STTEE
T ss_pred             CCccCCCceEeECCCCCE
Confidence            5655555  899999996


No 118
>cd03820 GT1_amsD_like This family is most closely related to the GT1 family of glycosyltransferases. AmsD in Erwinia amylovora has been shown to be involved in the biosynthesis of amylovoran, the acidic exopolysaccharide acting as a virulence factor. This enzyme may be responsible for the formation of galactose alpha-1,6 linkages in amylovoran.
Probab=20.39  E-value=1e+02  Score=31.55  Aligned_cols=41  Identities=12%  Similarity=0.141  Sum_probs=32.9

Q ss_pred             chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509          628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV  669 (693)
Q Consensus       628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~  669 (693)
                      ..+..+.|+++.+.+.|.. ++++..++|||.+|+. ||.+|.
T Consensus       243 ~~~~~~~~~~ad~~i~ps~~e~~~~~~~Ea~a~G~P-vi~~~~  284 (348)
T cd03820         243 TKNIEEYYAKASIFVLTSRFEGFPMVLLEAMAFGLP-VISFDC  284 (348)
T ss_pred             cchHHHHHHhCCEEEeCccccccCHHHHHHHHcCCC-EEEecC
Confidence            4667899999999998875 4677889999999985 556653


Done!