Query         040039
Match_columns 274
No_of_seqs    142 out of 1282
Neff          7.9 
Searched_HMMs 46136
Date          Fri Mar 29 04:29:13 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/040039.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/040039hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF01453 B_lectin:  D-mannose b 100.0   1E-31 2.2E-36  209.5   5.8  102   36-137     1-114 (114)
  2 cd00028 B_lectin Bulb-type man  99.9 2.3E-25 4.9E-30  174.3  13.2   99    1-111    17-116 (116)
  3 smart00108 B_lectin Bulb-type   99.9 2.4E-24 5.1E-29  168.0  12.6   97    1-110    17-114 (114)
  4 PF00954 S_locus_glycop:  S-loc  99.9 2.2E-23 4.7E-28  161.6   9.0  103  164-273     1-107 (110)
  5 smart00108 B_lectin Bulb-type   98.9 7.8E-09 1.7E-13   80.3   9.4   86   54-166    23-111 (114)
  6 cd00028 B_lectin Bulb-type man  98.9 8.1E-09 1.8E-13   80.5   9.2   87   54-167    23-113 (116)
  7 PF01453 B_lectin:  D-mannose b  98.3 6.4E-06 1.4E-10   64.0   8.8   74   37-112    38-114 (114)
  8 PF07974 EGF_2:  EGF-like domai  94.9   0.021 4.6E-07   33.9   2.0   22  253-274     6-29  (32)
  9 PF12661 hEGF:  Human growth fa  94.0  0.0082 1.8E-07   28.2  -0.9   10  265-274     1-10  (13)
 10 cd00053 EGF Epidermal growth f  92.4   0.097 2.1E-06   30.6   1.9   26  249-274     2-31  (36)
 11 PF01683 EB:  EB module;  Inter  92.0    0.13 2.7E-06   33.8   2.2   30  244-273    17-46  (52)
 12 cd00054 EGF_CA Calcium-binding  89.6    0.25 5.4E-06   29.2   1.8   28  247-274     3-34  (38)
 13 smart00179 EGF_CA Calcium-bind  89.5    0.27 5.8E-06   29.5   1.9   27  247-273     3-33  (39)
 14 PF07645 EGF_CA:  Calcium-bindi  88.8    0.11 2.5E-06   32.5  -0.2   27  247-273     3-34  (42)
 15 PF00008 EGF:  EGF-like domain   88.4   0.082 1.8E-06   31.2  -1.0   21  254-274     5-30  (32)
 16 PF12947 EGF_3:  EGF domain;  I  85.8    0.12 2.6E-06   31.5  -1.3   22  253-274     6-31  (36)
 17 PF12662 cEGF:  Complement Clr-  80.3    0.72 1.6E-05   25.4   0.5    9  265-273     3-11  (24)
 18 smart00181 EGF Epidermal growt  78.6     1.5 3.2E-05   25.7   1.6   21  253-274     6-30  (35)
 19 PHA02887 EGF-like protein; Pro  77.7     1.1 2.5E-05   34.5   1.1   27  247-274    84-118 (126)
 20 cd05845 Ig2_L1-CAM_like Second  70.9     7.9 0.00017   28.9   4.2   34   34-67     31-64  (95)
 21 PF09064 Tme5_EGF_like:  Thromb  67.0       4 8.7E-05   24.4   1.4   10  264-273    18-27  (34)
 22 KOG4289 Cadherin EGF LAG seven  66.4     2.4 5.3E-05   45.7   0.7   23  252-274  1244-1270(2531)
 23 PF01436 NHL:  NHL repeat;  Int  66.3      12 0.00026   21.0   3.4   22   53-74      5-26  (28)
 24 PF13360 PQQ_2:  PQQ-like domai  63.0      21 0.00046   30.0   5.9   48   60-107     2-62  (238)
 25 PF07354 Sp38:  Zona-pellucida-  61.4      13 0.00028   33.1   4.2   34   34-67     10-43  (271)
 26 PRK11138 outer membrane biogen  61.1      31 0.00068   32.0   7.1   19   89-107   342-361 (394)
 27 PF13360 PQQ_2:  PQQ-like domai  58.4 1.1E+02  0.0023   25.5  11.9   73   35-107    11-102 (238)
 28 PRK11138 outer membrane biogen  57.7      34 0.00074   31.8   6.8   55   53-107   121-186 (394)
 29 TIGR03300 assembly_YfgL outer   56.8      53  0.0012   30.1   7.9   70   37-106    40-130 (377)
 30 KOG4649 PQQ (pyrrolo-quinoline  56.7      24 0.00053   31.6   5.1   45   36-80    168-218 (354)
 31 PF11403 Yeast_MT:  Yeast metal  56.4     8.6 0.00019   22.9   1.5   19  253-272    12-30  (40)
 32 PHA03099 epidermal growth fact  56.3     5.5 0.00012   31.4   0.9   24  250-274    48-77  (139)
 33 TIGR03066 Gem_osc_para_1 Gemma  54.5      42  0.0009   25.9   5.5   54   50-103    33-104 (111)
 34 smart00051 DSL delta serrate l  53.5      11 0.00025   25.8   2.1   18  257-274    42-60  (63)
 35 cd05852 Ig5_Contactin-1 Fifth   52.9      64  0.0014   22.2   6.0   34   34-68     13-46  (73)
 36 KOG1214 Nidogen and related ba  48.7     9.8 0.00021   39.1   1.5   27  247-274   828-858 (1289)
 37 PF12690 BsuPI:  Intracellular   46.9      24 0.00052   25.4   3.0   16   62-77     27-42  (82)
 38 cd00055 EGF_Lam Laminin-type e  42.2      18 0.00039   23.2   1.6   10  264-273    19-28  (50)
 39 PF12946 EGF_MSP1_1:  MSP1 EGF   39.1     3.7 8.1E-05   25.1  -1.9   22  253-274     5-31  (37)
 40 PF06006 DUF905:  Bacterial pro  38.3      35 0.00075   23.9   2.5   19   93-111    34-52  (70)
 41 KOG0640 mRNA cleavage stimulat  37.2 1.7E+02  0.0037   27.0   7.4   68   39-106   250-333 (430)
 42 PF13570 PQQ_3:  PQQ-like domai  36.4      47   0.001   19.9   2.8   11   70-80      1-11  (40)
 43 KOG1214 Nidogen and related ba  36.4      18  0.0004   37.2   1.3   57  216-274   749-819 (1289)
 44 PF14670 FXa_inhibition:  Coagu  36.1     8.7 0.00019   23.3  -0.6   10  264-273    19-28  (36)
 45 cd00216 PQQ_DH Dehydrogenases   36.1 1.3E+02  0.0028   29.1   7.2   73   35-107    37-136 (488)
 46 PLN00033 photosystem II stabil  33.5 1.7E+02  0.0036   27.8   7.2   46   61-106   259-306 (398)
 47 PF05935 Arylsulfotrans:  Aryls  32.7      48   0.001   32.1   3.6   52   60-112   127-186 (477)
 48 TIGR03300 assembly_YfgL outer   31.9 1.6E+02  0.0035   26.9   6.9   19   89-107   327-346 (377)
 49 smart00180 EGF_Lam Laminin-typ  31.4      32 0.00069   21.7   1.4   10  264-273    18-27  (46)
 50 KOG4260 Uncharacterized conser  30.4      33 0.00072   30.7   1.8   21  254-274   151-178 (350)
 51 KOG3881 Uncharacterized conser  29.1 2.6E+02  0.0057   26.4   7.4   76   36-113   199-279 (412)
 52 PF10282 Lactonase:  Lactonase,  27.7 3.7E+02  0.0081   24.4   8.4   65   17-97    249-331 (345)
 53 PF05294 Toxin_5:  Scorpion sho  27.7      15 0.00033   21.5  -0.5   15  254-268    18-32  (32)
 54 smart00286 PTI Plant trypsin i  26.8      42 0.00091   19.2   1.2   18  255-273    10-28  (29)
 55 smart00564 PQQ beta-propeller   26.5 1.2E+02  0.0027   16.7   3.5   17   59-75     14-31  (33)
 56 PF06247 Plasmod_Pvs28:  Plasmo  25.9     8.9 0.00019   32.3  -2.4   27  247-273    40-79  (197)
 57 PF01011 PQQ:  PQQ enzyme repea  25.8      95  0.0021   18.4   2.8   22   58-79      7-29  (38)
 58 KOG1225 Teneurin-1 and related  25.8      46   0.001   32.7   2.1   20  255-274   287-306 (525)
 59 cd00150 PlantTI Plant trypsin   25.5      44 0.00095   18.8   1.1    8  254-261    19-26  (27)
 60 PF00053 Laminin_EGF:  Laminin   24.7      21 0.00046   22.7  -0.3   10  264-273    18-27  (49)
 61 KOG0291 WD40-repeat-containing  24.6 3.5E+02  0.0077   27.9   7.9   53   53-105   354-419 (893)
 62 PF02237 BPL_C:  Biotin protein  24.0      45 0.00098   21.2   1.2   15   87-101    21-35  (48)
 63 cd05764 Ig_2 Subgroup of the i  23.7 1.3E+02  0.0029   20.1   3.7   34   34-67     13-46  (74)
 64 KOG2106 Uncharacterized conser  23.0   3E+02  0.0064   27.1   6.8   68   53-120   250-329 (626)
 65 KOG0278 Serine/threonine kinas  22.3 5.2E+02   0.011   23.2   7.6   53   53-106   157-211 (334)
 66 PF05833 FbpA:  Fibronectin-bin  22.2      79  0.0017   30.1   2.9   38   85-122   116-159 (455)
 67 PF14870 PSII_BNR:  Photosynthe  21.8 2.9E+02  0.0062   25.1   6.3   52   55-107   159-212 (302)
 68 COG1520 FOG: WD40-like repeat   21.7 4.7E+02    0.01   23.9   8.0   73   36-108   130-226 (370)
 69 KOG4234 TPR repeat-containing   21.0      40 0.00086   29.2   0.5   16  132-148   250-265 (271)
 70 KOG4792 Crk family adapters [S  20.7 1.3E+02  0.0028   26.3   3.6   48  105-158    10-64  (293)
 71 KOG1225 Teneurin-1 and related  20.7      60  0.0013   31.9   1.8   16  259-274   260-275 (525)
 72 COG4787 FlgF Flagellar basal b  20.3 4.5E+02  0.0097   22.9   6.7   24   54-77     79-102 (251)

No 1  
>PF01453 B_lectin:  D-mannose binding lectin;  InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=99.97  E-value=1e-31  Score=209.45  Aligned_cols=102  Identities=54%  Similarity=0.820  Sum_probs=75.2

Q ss_pred             CCcEEEEcCCCCCCCC---CcEEEEecCCcEEEEcCCCCEEEee-cCCCCc--eeEEEEeeCCCeeEecCCCceEEeecC
Q 040039           36 FPQVVWSANRNNPVRI---NATLELTSDGNLVLQDADGAIAWST-NTSGKS--VVGLNLTDMGNLVLFDKNNAAVWQSFD  109 (274)
Q Consensus        36 ~~~vVWvANr~~Pv~~---~~~l~~~~~G~L~l~~~~~~~~Wss-~~~~~~--~~~~~l~d~GNlvl~~~~~~~lWqSFd  109 (274)
                      +++|||+|||++|+..   .++|.|+.||+|+|++..++++|++ ++.+..  ...|+|+|+|||||+|..+.+||||||
T Consensus         1 ~~tvvW~an~~~p~~~~s~~~~L~l~~dGnLvl~~~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~   80 (114)
T PF01453_consen    1 PRTVVWVANRNSPLTSSSGNYTLILQSDGNLVLYDSNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFD   80 (114)
T ss_dssp             ---------TTEEEEECETTEEEEEETTSEEEEEETTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTT
T ss_pred             CcccccccccccccccccccccceECCCCeEEEEcCCCCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecC
Confidence            3689999999999942   3899999999999999988899999 655543  688999999999999988999999999


Q ss_pred             CCCCcccCCCcccCC------ceEeeecCCCCCC
Q 040039          110 HPTDSLVPGQKLLEG------KKLTASVSTTNWT  137 (274)
Q Consensus       110 ~PtDTlLpgq~l~~~------~~L~Sw~s~~dps  137 (274)
                      |||||+||+|+|+.+      ..|+||++.+|||
T Consensus        81 ~ptdt~L~~q~l~~~~~~~~~~~~~sw~s~~dps  114 (114)
T PF01453_consen   81 YPTDTLLPGQKLGDGNVTGKNDSLTSWSSNTDPS  114 (114)
T ss_dssp             SSS-EEEEEET--TSEEEEESTSSEEEESS----
T ss_pred             CCccEEEeccCcccCCCccccceEEeECCCCCCC
Confidence            999999999999863      3599999999986


No 2  
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=99.93  E-value=2.3e-25  Score=174.30  Aligned_cols=99  Identities=41%  Similarity=0.650  Sum_probs=87.0

Q ss_pred             CeeccccCCCCCceEEEEEEeeeccccccccccCCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEcCCCCEEEeecCCC
Q 040039            1 YACGFFCNGTCDSYLFAVFIVQAYNASLIDYQHIEFPQVVWSANRNNPVRINATLELTSDGNLVLQDADGAIAWSTNTSG   80 (274)
Q Consensus         1 F~lGFf~~~~~~~~~l~iw~~~~~~~~~~~~~~~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~~~~~~~Wss~~~~   80 (274)
                      |++|||++......+++|||..           .+ .++||+|||+.|....+.|.|+.||+|+|+|.++.++|++++.+
T Consensus        17 f~~G~~~~~~q~~dgnlv~~~~-----------~~-~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~~   84 (116)
T cd00028          17 FELGFFKLIMQSRDYNLILYKG-----------SS-RTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTTR   84 (116)
T ss_pred             EEEecccCCCCCCeEEEEEEeC-----------CC-CeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEecccC
Confidence            7899999875323999999974           33 68999999999965568999999999999999999999999876


Q ss_pred             -CceeEEEEeeCCCeeEecCCCceEEeecCCC
Q 040039           81 -KSVVGLNLTDMGNLVLFDKNNAAVWQSFDHP  111 (274)
Q Consensus        81 -~~~~~~~l~d~GNlvl~~~~~~~lWqSFd~P  111 (274)
                       .....++|+|+|||||++.++.+||||||||
T Consensus        85 ~~~~~~~~L~ddGnlvl~~~~~~~~W~Sf~~P  116 (116)
T cd00028          85 VNGNYVLVLLDDGNLVLYDSDGNFLWQSFDYP  116 (116)
T ss_pred             CCCceEEEEeCCCCEEEECCCCCEEEcCCCCC
Confidence             5567889999999999999999999999998


No 3  
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=99.92  E-value=2.4e-24  Score=168.01  Aligned_cols=97  Identities=43%  Similarity=0.689  Sum_probs=86.7

Q ss_pred             CeeccccCCCCCceEEEEEEeeeccccccccccCCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEcCCCCEEEeecCC-
Q 040039            1 YACGFFCNGTCDSYLFAVFIVQAYNASLIDYQHIEFPQVVWSANRNNPVRINATLELTSDGNLVLQDADGAIAWSTNTS-   79 (274)
Q Consensus         1 F~lGFf~~~~~~~~~l~iw~~~~~~~~~~~~~~~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~~~~~~~Wss~~~-   79 (274)
                      |++|||++.. ...+++|||..           .+ .++||+|||+.|+..++.|.|++||+|+|++.++.++|++++. 
T Consensus        17 f~~G~~~~~~-q~dgnlV~~~~-----------~~-~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~t~~   83 (114)
T smart00108       17 FELGFFTLIM-QNDYNLILYKS-----------SS-RTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSSNTTG   83 (114)
T ss_pred             EeeeccccCC-CCCEEEEEEEC-----------CC-CcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEecccC
Confidence            6899999865 57999999974           34 7899999999998877899999999999999989999999986 


Q ss_pred             CCceeEEEEeeCCCeeEecCCCceEEeecCC
Q 040039           80 GKSVVGLNLTDMGNLVLFDKNNAAVWQSFDH  110 (274)
Q Consensus        80 ~~~~~~~~l~d~GNlvl~~~~~~~lWqSFd~  110 (274)
                      +.+...++|+|+|||||++..+.+|||||||
T Consensus        84 ~~~~~~~~L~ddGnlvl~~~~~~~~W~Sf~~  114 (114)
T smart00108       84 ANGNYVLVLLDDGNLVIYDSDGNFLWQSFDY  114 (114)
T ss_pred             CCCceEEEEeCCCCEEEECCCCCEEeCCCCC
Confidence            5566788999999999999999999999997


No 4  
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=99.89  E-value=2.2e-23  Score=161.60  Aligned_cols=103  Identities=20%  Similarity=0.311  Sum_probs=79.3

Q ss_pred             EecccCCCCcCCCcceeeeecCceEEEeecCCCCCceEEEEecCCCCCCeEEEEEcCCCCeEEEEEc-CCCCeEEeeecc
Q 040039          164 YYALVKATKTSKEPSHARYLNGSLAFFINSSEPREPDGAVPVPPASSSPGQYMRLWPDGHLRVYEWQ-ASIGWTEVADLL  242 (274)
Q Consensus       164 w~sg~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~rl~Ld~dG~lr~y~w~-~~~~W~~~~~~~  242 (274)
                      ||+|+|++..+.+.+.+..  ..+..+....+..+.+++|.+.+.+.  ++|++||++|++++|.|+ +.++|.++   |
T Consensus         1 wrsG~WnG~~f~g~p~~~~--~~~~~~~fv~~~~e~~~t~~~~~~s~--~~r~~ld~~G~l~~~~w~~~~~~W~~~---~   73 (110)
T PF00954_consen    1 WRSGPWNGQRFSGIPEMSS--NSLYNYSFVSNNEEVYYTYSLSNSSV--LSRLVLDSDGQLQRYIWNESTQSWSVF---W   73 (110)
T ss_pred             CCccccCCeEECCcccccc--cceeEEEEEECCCeEEEEEecCCCce--EEEEEEeeeeEEEEEEEecCCCcEEEE---E
Confidence            8999999976544344331  11222222224556677777665554  999999999999999999 89999997   7


Q ss_pred             cccCCCCCCCcCCCCCCccCC---CCccCCCCCC
Q 040039          243 TGYLGECGYPLVCGKYGICSQ---GQCSCPATYF  273 (274)
Q Consensus       243 ~~p~d~C~~y~~CG~~giC~~---~~C~Cl~gf~  273 (274)
                      .+|.|+||+|++||+||+|+.   ++|+|||||.
T Consensus        74 ~~p~d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~  107 (110)
T PF00954_consen   74 SAPKDQCDVYGFCGPNGICNSNNSPKCSCLPGFE  107 (110)
T ss_pred             EecccCCCCccccCCccEeCCCCCCceECCCCcC
Confidence            789999999999999999983   7899999996


No 5  
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=98.93  E-value=7.8e-09  Score=80.32  Aligned_cols=86  Identities=27%  Similarity=0.505  Sum_probs=65.3

Q ss_pred             EEEEecCCcEEEEcCC-CCEEEeecCCCC--ceeEEEEeeCCCeeEecCCCceEEeecCCCCCcccCCCcccCCceEeee
Q 040039           54 TLELTSDGNLVLQDAD-GAIAWSTNTSGK--SVVGLNLTDMGNLVLFDKNNAAVWQSFDHPTDSLVPGQKLLEGKKLTAS  130 (274)
Q Consensus        54 ~l~~~~~G~L~l~~~~-~~~~Wss~~~~~--~~~~~~l~d~GNlvl~~~~~~~lWqSFd~PtDTlLpgq~l~~~~~L~Sw  130 (274)
                      ++.++.||+||+++.. ..++|++++...  ....+.|+++|||||++.++.++|+|-..      .             
T Consensus        23 ~~~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~t~------~-------------   83 (114)
T smart00108       23 TLIMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSSNTT------G-------------   83 (114)
T ss_pred             ccCCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEeccc------C-------------
Confidence            5567789999999864 479999998532  22678999999999999889999998211      1             


Q ss_pred             cCCCCCCCCcceEEEecCCCceEEEecCCCeEEEec
Q 040039          131 VSTTNWTDGGLFSLSVTNEGLFAFIESNNTSIRYYA  166 (274)
Q Consensus       131 ~s~~dps~~G~ysl~~d~~g~~~~~~~~~~~~Yw~s  166 (274)
                            .. |.|.+.|+.+|...++.. ..++.|.+
T Consensus        84 ------~~-~~~~~~L~ddGnlvl~~~-~~~~~W~S  111 (114)
T smart00108       84 ------AN-GNYVLVLLDDGNLVIYDS-DGNFLWQS  111 (114)
T ss_pred             ------CC-CceEEEEeCCCCEEEECC-CCCEEeCC
Confidence                  13 788999999998877643 34677865


No 6  
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=98.91  E-value=8.1e-09  Score=80.50  Aligned_cols=87  Identities=26%  Similarity=0.509  Sum_probs=66.2

Q ss_pred             EEEEec-CCcEEEEcCC-CCEEEeecCCC--CceeEEEEeeCCCeeEecCCCceEEeecCCCCCcccCCCcccCCceEee
Q 040039           54 TLELTS-DGNLVLQDAD-GAIAWSTNTSG--KSVVGLNLTDMGNLVLFDKNNAAVWQSFDHPTDSLVPGQKLLEGKKLTA  129 (274)
Q Consensus        54 ~l~~~~-~G~L~l~~~~-~~~~Wss~~~~--~~~~~~~l~d~GNlvl~~~~~~~lWqSFd~PtDTlLpgq~l~~~~~L~S  129 (274)
                      .+.++. ||+|++++.. ..++|++++..  ...+.+.|+++|||||+|.++.++|+|-..                   
T Consensus        23 ~~~~q~~dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~-------------------   83 (116)
T cd00028          23 KLIMQSRDYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTT-------------------   83 (116)
T ss_pred             cCCCCCCeEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEeccc-------------------
Confidence            455676 9999999754 47999999864  345678999999999999999999998311                   


Q ss_pred             ecCCCCCCCCcceEEEecCCCceEEEecCCCeEEEecc
Q 040039          130 SVSTTNWTDGGLFSLSVTNEGLFAFIESNNTSIRYYAL  167 (274)
Q Consensus       130 w~s~~dps~~G~ysl~~d~~g~~~~~~~~~~~~Yw~sg  167 (274)
                           . .. +.+.+.|+.+|...++..+ .++.|.+.
T Consensus        84 -----~-~~-~~~~~~L~ddGnlvl~~~~-~~~~W~Sf  113 (116)
T cd00028          84 -----R-VN-GNYVLVLLDDGNLVLYDSD-GNFLWQSF  113 (116)
T ss_pred             -----C-CC-CceEEEEeCCCCEEEECCC-CCEEEcCC
Confidence                 0 13 7889999999988776433 46778764


No 7  
>PF01453 B_lectin:  D-mannose binding lectin;  InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=98.26  E-value=6.4e-06  Score=64.03  Aligned_cols=74  Identities=28%  Similarity=0.443  Sum_probs=51.4

Q ss_pred             CcEEEEc-CCCCCCCCCcEEEEecCCcEEEEcCCCCEEEeecCCCCceeEEEEee--CCCeeEecCCCceEEeecCCCC
Q 040039           37 PQVVWSA-NRNNPVRINATLELTSDGNLVLQDADGAIAWSTNTSGKSVVGLNLTD--MGNLVLFDKNNAAVWQSFDHPT  112 (274)
Q Consensus        37 ~~vVWvA-Nr~~Pv~~~~~l~~~~~G~L~l~~~~~~~~Wss~~~~~~~~~~~l~d--~GNlvl~~~~~~~lWqSFd~Pt  112 (274)
                      ..+||.. +........+.+.|+++|||||+|..+.++|++.. ......+.+++  .||++ +.....++|.|-+.|+
T Consensus        38 ~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~-~ptdt~L~~q~l~~~~~~-~~~~~~~sw~s~~dps  114 (114)
T PF01453_consen   38 GSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFD-YPTDTLLPGQKLGDGNVT-GKNDSLTSWSSNTDPS  114 (114)
T ss_dssp             TEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTT-SSS-EEEEEET--TSEEE-EESTSSEEEESS----
T ss_pred             CCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecC-CCccEEEeccCcccCCCc-cccceEEeECCCCCCC
Confidence            5679999 43433334589999999999999998999999943 33445566777  88888 7666789999977763


No 8  
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=94.86  E-value=0.021  Score=33.86  Aligned_cols=22  Identities=32%  Similarity=0.859  Sum_probs=19.4

Q ss_pred             cCCCCCCccC--CCCccCCCCCCC
Q 040039          253 LVCGKYGICS--QGQCSCPATYFK  274 (274)
Q Consensus       253 ~~CG~~giC~--~~~C~Cl~gf~~  274 (274)
                      .+|..+|+|+  ..+|.|.+||+|
T Consensus         6 ~~C~~~G~C~~~~g~C~C~~g~~G   29 (32)
T PF07974_consen    6 NICSGHGTCVSPCGRCVCDSGYTG   29 (32)
T ss_pred             CccCCCCEEeCCCCEEECCCCCcC
Confidence            5799999998  379999999986


No 9  
>PF12661 hEGF:  Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=93.98  E-value=0.0082  Score=28.23  Aligned_cols=10  Identities=30%  Similarity=1.026  Sum_probs=7.8

Q ss_pred             CccCCCCCCC
Q 040039          265 QCSCPATYFK  274 (274)
Q Consensus       265 ~C~Cl~gf~~  274 (274)
                      +|.|++||+|
T Consensus         1 ~C~C~~G~~G   10 (13)
T PF12661_consen    1 TCQCPPGWTG   10 (13)
T ss_dssp             EEEE-TTEET
T ss_pred             CccCcCCCcC
Confidence            4999999986


No 10 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=92.40  E-value=0.097  Score=30.55  Aligned_cols=26  Identities=31%  Similarity=0.796  Sum_probs=20.1

Q ss_pred             CCCCcCCCCCCccCC----CCccCCCCCCC
Q 040039          249 CGYPLVCGKYGICSQ----GQCSCPATYFK  274 (274)
Q Consensus       249 C~~y~~CG~~giC~~----~~C~Cl~gf~~  274 (274)
                      |.....|..++.|..    ..|.|++||.+
T Consensus         2 C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g   31 (36)
T cd00053           2 CAASNPCSNGGTCVNTPGSYRCVCPPGYTG   31 (36)
T ss_pred             CCCCCCCCCCCEEecCCCCeEeECCCCCcc
Confidence            443567888899973    67999999975


No 11 
>PF01683 EB:  EB module;  InterPro: IPR006149  The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO 
Probab=92.01  E-value=0.13  Score=33.83  Aligned_cols=30  Identities=27%  Similarity=0.637  Sum_probs=26.4

Q ss_pred             ccCCCCCCCcCCCCCCccCCCCccCCCCCC
Q 040039          244 GYLGECGYPLVCGKYGICSQGQCSCPATYF  273 (274)
Q Consensus       244 ~p~d~C~~y~~CG~~giC~~~~C~Cl~gf~  273 (274)
                      .+.+.|....-|-.++.|....|.|++||+
T Consensus        17 ~~g~~C~~~~qC~~~s~C~~g~C~C~~g~~   46 (52)
T PF01683_consen   17 QPGESCESDEQCIGGSVCVNGRCQCPPGYV   46 (52)
T ss_pred             CCCCCCCCcCCCCCcCEEcCCEeECCCCCE
Confidence            355789999999999999889999999985


No 12 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=89.65  E-value=0.25  Score=29.21  Aligned_cols=28  Identities=36%  Similarity=0.784  Sum_probs=21.5

Q ss_pred             CCCCCCcCCCCCCccCC----CCccCCCCCCC
Q 040039          247 GECGYPLVCGKYGICSQ----GQCSCPATYFK  274 (274)
Q Consensus       247 d~C~~y~~CG~~giC~~----~~C~Cl~gf~~  274 (274)
                      ++|.....|...+.|..    ..|.|++||.|
T Consensus         3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g   34 (38)
T cd00054           3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYTG   34 (38)
T ss_pred             ccCCCCCCcCCCCEeECCCCCeEeECCCCCcC
Confidence            56765457888889973    46999999975


No 13 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=89.50  E-value=0.27  Score=29.52  Aligned_cols=27  Identities=33%  Similarity=0.802  Sum_probs=21.2

Q ss_pred             CCCCCCcCCCCCCccCC----CCccCCCCCC
Q 040039          247 GECGYPLVCGKYGICSQ----GQCSCPATYF  273 (274)
Q Consensus       247 d~C~~y~~CG~~giC~~----~~C~Cl~gf~  273 (274)
                      ++|.....|...+.|..    -.|.|++||.
T Consensus         3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~   33 (39)
T smart00179        3 DECASGNPCQNGGTCVNTVGSYRCECPPGYT   33 (39)
T ss_pred             ccCcCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence            56765567888889973    4699999997


No 14 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, where 'n' is a negatively charged or polar residue [DEQN]; 'b' is a possibly beta-hydroxylated residue [DN]; 'a' is an aromatic amino acid; 'C' is a cysteine involved in a disulphide bond; 'x' is any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=88.82  E-value=0.11  Score=32.51  Aligned_cols=27  Identities=37%  Similarity=0.818  Sum_probs=22.3

Q ss_pred             CCCCCC-cCCCCCCccCC----CCccCCCCCC
Q 040039          247 GECGYP-LVCGKYGICSQ----GQCSCPATYF  273 (274)
Q Consensus       247 d~C~~y-~~CG~~giC~~----~~C~Cl~gf~  273 (274)
                      |+|... ..|..++.|.+    -.|.|++||.
T Consensus         3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~   34 (42)
T PF07645_consen    3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYE   34 (42)
T ss_dssp             STTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred             cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence            678875 48999999973    5799999996


No 15 
>PF00008 EGF:  EGF-like domain This is a sub-family of the Pfam entry;  InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=88.40  E-value=0.082  Score=31.22  Aligned_cols=21  Identities=33%  Similarity=0.835  Sum_probs=16.8

Q ss_pred             CCCCCCccCC-----CCccCCCCCCC
Q 040039          254 VCGKYGICSQ-----GQCSCPATYFK  274 (274)
Q Consensus       254 ~CG~~giC~~-----~~C~Cl~gf~~  274 (274)
                      .|...|.|..     .+|.|++||+|
T Consensus         5 ~C~n~g~C~~~~~~~y~C~C~~G~~G   30 (32)
T PF00008_consen    5 PCQNGGTCIDLPGGGYTCECPPGYTG   30 (32)
T ss_dssp             SSTTTEEEEEESTSEEEEEEBTTEES
T ss_pred             cCCCCeEEEeCCCCCEEeECCCCCcc
Confidence            6777888862     57999999986


No 16 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=85.79  E-value=0.12  Score=31.46  Aligned_cols=22  Identities=23%  Similarity=0.545  Sum_probs=16.2

Q ss_pred             cCCCCCCccCC----CCccCCCCCCC
Q 040039          253 LVCGKYGICSQ----GQCSCPATYFK  274 (274)
Q Consensus       253 ~~CG~~giC~~----~~C~Cl~gf~~  274 (274)
                      +-|-++..|..    -.|+|.+||+|
T Consensus         6 ~~C~~nA~C~~~~~~~~C~C~~Gy~G   31 (36)
T PF12947_consen    6 GGCHPNATCTNTGGSYTCTCKPGYEG   31 (36)
T ss_dssp             GGS-TTCEEEE-TTSEEEEE-CEEEC
T ss_pred             CCCCCCcEeecCCCCEEeECCCCCcc
Confidence            56889999972    56999999986


No 17 
>PF12662 cEGF:  Complement Clr-like EGF-like
Probab=80.32  E-value=0.72  Score=25.44  Aligned_cols=9  Identities=56%  Similarity=1.376  Sum_probs=8.1

Q ss_pred             CccCCCCCC
Q 040039          265 QCSCPATYF  273 (274)
Q Consensus       265 ~C~Cl~gf~  273 (274)
                      .|+|++||.
T Consensus         3 ~C~C~~Gy~   11 (24)
T PF12662_consen    3 TCSCPPGYQ   11 (24)
T ss_pred             EeeCCCCCc
Confidence            599999996


No 18 
>smart00181 EGF Epidermal growth factor-like domain.
Probab=78.59  E-value=1.5  Score=25.68  Aligned_cols=21  Identities=33%  Similarity=0.711  Sum_probs=16.0

Q ss_pred             cCCCCCCccCC----CCccCCCCCCC
Q 040039          253 LVCGKYGICSQ----GQCSCPATYFK  274 (274)
Q Consensus       253 ~~CG~~giC~~----~~C~Cl~gf~~  274 (274)
                      ..|... .|..    ..|.|++||.+
T Consensus         6 ~~C~~~-~C~~~~~~~~C~C~~g~~g   30 (35)
T smart00181        6 GPCSNG-TCINTPGSYTCSCPPGYTG   30 (35)
T ss_pred             CCCCCC-EEECCCCCeEeECCCCCcc
Confidence            456776 7862    67999999975


No 19 
>PHA02887 EGF-like protein; Provisional
Probab=77.68  E-value=1.1  Score=34.51  Aligned_cols=27  Identities=26%  Similarity=0.562  Sum_probs=22.0

Q ss_pred             CCCCC--CcCCCCCCccC------CCCccCCCCCCC
Q 040039          247 GECGY--PLVCGKYGICS------QGQCSCPATYFK  274 (274)
Q Consensus       247 d~C~~--y~~CG~~giC~------~~~C~Cl~gf~~  274 (274)
                      ++|.-  .+.|= +|.|.      .+.|.|.+||+|
T Consensus        84 ~pC~~eyk~YCi-HG~C~yI~dL~epsCrC~~GYtG  118 (126)
T PHA02887         84 EKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGYTG  118 (126)
T ss_pred             cccChHhhCEee-CCEEEccccCCCceeECCCCccc
Confidence            67864  57888 79997      188999999997


No 20 
>cd05845 Ig2_L1-CAM_like Second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. Ig2_L1-CAM_like: domain similar to the second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth.
Probab=70.90  E-value=7.9  Score=28.87  Aligned_cols=34  Identities=12%  Similarity=0.267  Sum_probs=23.8

Q ss_pred             CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEc
Q 040039           34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQD   67 (274)
Q Consensus        34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~   67 (274)
                      .|..++.|+-+....+.....++++.+|+|.+.+
T Consensus        31 ~P~P~i~W~~~~~~~i~~~~Ri~~~~~GnL~fs~   64 (95)
T cd05845          31 AVPLRIYWMNSDLLHITQDERVSMGQNGNLYFAN   64 (95)
T ss_pred             CCCCEEEEECCCCccccccccEEECCCceEEEEE
Confidence            5677888995544445545677888888888764


No 21 
>PF09064 Tme5_EGF_like:  Thrombomodulin like fifth domain, EGF-like;  InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=66.99  E-value=4  Score=24.40  Aligned_cols=10  Identities=60%  Similarity=1.491  Sum_probs=8.6

Q ss_pred             CCccCCCCCC
Q 040039          264 GQCSCPATYF  273 (274)
Q Consensus       264 ~~C~Cl~gf~  273 (274)
                      .+|.||.||-
T Consensus        18 ~~C~CPeGyI   27 (34)
T PF09064_consen   18 GQCFCPEGYI   27 (34)
T ss_pred             CceeCCCceE
Confidence            5899999984


No 22 
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=66.39  E-value=2.4  Score=45.69  Aligned_cols=23  Identities=26%  Similarity=0.677  Sum_probs=19.1

Q ss_pred             CcCCCCCCccCC----CCccCCCCCCC
Q 040039          252 PLVCGKYGICSQ----GQCSCPATYFK  274 (274)
Q Consensus       252 y~~CG~~giC~~----~~C~Cl~gf~~  274 (274)
                      -+.||++|-|..    -+|+|-|||+|
T Consensus      1244 s~pC~nng~C~srEggYtCeCrpg~tG 1270 (2531)
T KOG4289|consen 1244 SGPCGNNGRCRSREGGYTCECRPGFTG 1270 (2531)
T ss_pred             cCCCCCCCceEEecCceeEEecCCccc
Confidence            478999999973    57999999986


No 23 
>PF01436 NHL:  NHL repeat;  InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=66.25  E-value=12  Score=20.98  Aligned_cols=22  Identities=23%  Similarity=0.347  Sum_probs=15.5

Q ss_pred             cEEEEecCCcEEEEcCCCCEEE
Q 040039           53 ATLELTSDGNLVLQDADGAIAW   74 (274)
Q Consensus        53 ~~l~~~~~G~L~l~~~~~~~~W   74 (274)
                      .-+.++.+|+|++.|..+.-||
T Consensus         5 ~gvav~~~g~i~VaD~~n~rV~   26 (28)
T PF01436_consen    5 HGVAVDSDGNIYVADSGNHRVQ   26 (28)
T ss_dssp             EEEEEETTSEEEEEECCCTEEE
T ss_pred             cEEEEeCCCCEEEEECCCCEEE
Confidence            3466778888888886665554


No 24 
>PF13360 PQQ_2:  PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=62.95  E-value=21  Score=29.98  Aligned_cols=48  Identities=29%  Similarity=0.542  Sum_probs=25.4

Q ss_pred             CCcEEEEcC-CCCEEEeecCC---CCce--e-----EEEE-eeCCCeeEecC-CCceEEee
Q 040039           60 DGNLVLQDA-DGAIAWSTNTS---GKSV--V-----GLNL-TDMGNLVLFDK-NNAAVWQS  107 (274)
Q Consensus        60 ~G~L~l~~~-~~~~~Wss~~~---~~~~--~-----~~~l-~d~GNlvl~~~-~~~~lWqS  107 (274)
                      +|.|..+|. +|+.+|+....   ...+  +     .+.+ ..+|+|+..|. +|+++|+-
T Consensus         2 ~g~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~   62 (238)
T PF13360_consen    2 DGTLSALDPRTGKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRF   62 (238)
T ss_dssp             TSEEEEEETTTTEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEE
T ss_pred             CCEEEEEECCCCCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEe
Confidence            566777775 67777777541   1111  0     0111 25555556663 56777764


No 25 
>PF07354 Sp38:  Zona-pellucida-binding protein (Sp38);  InterPro: IPR010857 This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90 kDa family of zona pellucida glycoproteins in a calcium-dependent manner. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur.; GO: 0007339 binding of sperm to zona pellucida, 0005576 extracellular region
Probab=61.35  E-value=13  Score=33.07  Aligned_cols=34  Identities=21%  Similarity=0.543  Sum_probs=30.6

Q ss_pred             CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEc
Q 040039           34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQD   67 (274)
Q Consensus        34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~   67 (274)
                      +-.++..|+--.++++++++.+.|++.|.|++.|
T Consensus        10 ~iDP~y~W~GP~g~~l~gn~~~nIT~TG~L~~~~   43 (271)
T PF07354_consen   10 LIDPTYLWTGPNGKPLSGNSYVNITETGKLMFKN   43 (271)
T ss_pred             cCCCceEEECCCCcccCCCCeEEEccCceEEeec
Confidence            4567889999999999999999999999999986


No 26 
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=61.08  E-value=31  Score=32.04  Aligned_cols=19  Identities=21%  Similarity=0.235  Sum_probs=10.9

Q ss_pred             eeCCCeeEecC-CCceEEee
Q 040039           89 TDMGNLVLFDK-NNAAVWQS  107 (274)
Q Consensus        89 ~d~GNlvl~~~-~~~~lWqS  107 (274)
                      .++|.|...|. +++++|+-
T Consensus       342 ~~~G~l~~ld~~tG~~~~~~  361 (394)
T PRK11138        342 DSEGYLHWINREDGRFVAQQ  361 (394)
T ss_pred             eCCCEEEEEECCCCCEEEEE
Confidence            45566665553 36666664


No 27 
>PF13360 PQQ_2:  PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=58.44  E-value=1.1e+02  Score=25.54  Aligned_cols=73  Identities=26%  Similarity=0.481  Sum_probs=44.3

Q ss_pred             CCCcEEEEcCC----CCCC----CCCcEEEE-ecCCcEEEEcC-CCCEEEeecCCCC---c-e---eEEEEe-eCCCeeE
Q 040039           35 EFPQVVWSANR----NNPV----RINATLEL-TSDGNLVLQDA-DGAIAWSTNTSGK---S-V---VGLNLT-DMGNLVL   96 (274)
Q Consensus        35 ~~~~vVWvANr----~~Pv----~~~~~l~~-~~~G~L~l~~~-~~~~~Wss~~~~~---~-~---~~~~l~-d~GNlvl   96 (274)
                      .....+|..+-    ..++    .+...+.+ +.+|.|+.+|. +|+.+|+......   . .   ..+.+. .+|.|+.
T Consensus        11 ~tG~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~~   90 (238)
T PF13360_consen   11 RTGKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLYA   90 (238)
T ss_dssp             TTTEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEEE
T ss_pred             CCCCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEeeccccccceeeecccccccccceeeeEe
Confidence            35567888753    2222    12333434 58899999996 8999999886332   1 1   112222 3445666


Q ss_pred             ec-CCCceEEee
Q 040039           97 FD-KNNAAVWQS  107 (274)
Q Consensus        97 ~~-~~~~~lWqS  107 (274)
                      .| .+++++|+.
T Consensus        91 ~d~~tG~~~W~~  102 (238)
T PF13360_consen   91 LDAKTGKVLWSI  102 (238)
T ss_dssp             EETTTSCEEEEE
T ss_pred             cccCCcceeeee
Confidence            67 678999995


No 28 
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=57.74  E-value=34  Score=31.78  Aligned_cols=55  Identities=27%  Similarity=0.522  Sum_probs=34.8

Q ss_pred             cEEEEe-cCCcEEEEcC-CCCEEEeecCCCC----ce----eEEEEeeCCCeeEecC-CCceEEee
Q 040039           53 ATLELT-SDGNLVLQDA-DGAIAWSTNTSGK----SV----VGLNLTDMGNLVLFDK-NNAAVWQS  107 (274)
Q Consensus        53 ~~l~~~-~~G~L~l~~~-~~~~~Wss~~~~~----~~----~~~~l~d~GNlvl~~~-~~~~lWqS  107 (274)
                      ..+.+. .+|.|+-+|. +|+++|+....+.    ++    ....-..+|.|+-.|. +|+++|+-
T Consensus       121 ~~v~v~~~~g~l~ald~~tG~~~W~~~~~~~~~ssP~v~~~~v~v~~~~g~l~ald~~tG~~~W~~  186 (394)
T PRK11138        121 GKVYIGSEKGQVYALNAEDGEVAWQTKVAGEALSRPVVSDGLVLVHTSNGMLQALNESDGAVKWTV  186 (394)
T ss_pred             CEEEEEcCCCEEEEEECCCCCCcccccCCCceecCCEEECCEEEEECCCCEEEEEEccCCCEeeee
Confidence            344444 5688888885 6899999876432    11    1112234667777775 58899985


No 29 
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=56.77  E-value=53  Score=30.09  Aligned_cols=70  Identities=23%  Similarity=0.467  Sum_probs=39.7

Q ss_pred             CcEEEEcCCCCCC----------CCCcEEEE-ecCCcEEEEc-CCCCEEEeecCCCC---ce-----eEEEEeeCCCeeE
Q 040039           37 PQVVWSANRNNPV----------RINATLEL-TSDGNLVLQD-ADGAIAWSTNTSGK---SV-----VGLNLTDMGNLVL   96 (274)
Q Consensus        37 ~~vVWvANr~~Pv----------~~~~~l~~-~~~G~L~l~~-~~~~~~Wss~~~~~---~~-----~~~~l~d~GNlvl   96 (274)
                      ..++|..+-..++          -....+.+ +.+|.|.-+| .+|+++|+.+....   ..     ....-..+|+|+.
T Consensus        40 ~~~~W~~~~~~~~~~~~~~~~p~v~~~~v~v~~~~g~v~a~d~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~a  119 (377)
T TIGR03300        40 VDQVWSASVGDGVGHYYLRLQPAVAGGKVYAADADGTVVALDAETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIA  119 (377)
T ss_pred             ceeeeEEEcCCCcCccccccceEEECCEEEEECCCCeEEEEEccCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEE
Confidence            3467887644433          22233333 3457887777 47889998765321   11     1111234667776


Q ss_pred             ecC-CCceEEe
Q 040039           97 FDK-NNAAVWQ  106 (274)
Q Consensus        97 ~~~-~~~~lWq  106 (274)
                      .|. +++++|+
T Consensus       120 ld~~tG~~~W~  130 (377)
T TIGR03300       120 LDAEDGKELWR  130 (377)
T ss_pred             EECCCCcEeee
Confidence            775 5888996


No 30 
>KOG4649 consensus PQQ (pyrrolo-quinoline quinone) repeat protein [Secondary metabolites biosynthesis, transport and catabolism]
Probab=56.66  E-value=24  Score=31.58  Aligned_cols=45  Identities=29%  Similarity=0.488  Sum_probs=34.9

Q ss_pred             CCcEEEEcCCCCCCCCC------cEEEEecCCcEEEEcCCCCEEEeecCCC
Q 040039           36 FPQVVWSANRNNPVRIN------ATLELTSDGNLVLQDADGAIAWSTNTSG   80 (274)
Q Consensus        36 ~~~vVWvANr~~Pv~~~------~~l~~~~~G~L~l~~~~~~~~Wss~~~~   80 (274)
                      +.+..|-|.|..||-.+      ++..-+-||+|.-.++.|+.||+..+.+
T Consensus       168 ~~~~~w~~~~~~PiF~splcv~~sv~i~~VdG~l~~f~~sG~qvwr~~t~G  218 (354)
T KOG4649|consen  168 SSTEFWAATRFGPIFASPLCVGSSVIITTVDGVLTSFDESGRQVWRPATKG  218 (354)
T ss_pred             CcceehhhhcCCccccCceeccceEEEEEeccEEEEEcCCCcEEEeecCCC
Confidence            45789999999998754      2333356899999999999999877644


No 31 
>PF11403 Yeast_MT:  Yeast metallothionein;  InterPro: IPR022710 Metallothioneins are characterised by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification. For the first 40 residues in the protein the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster.; PDB: 1AQS_A 1AQR_A 1RJU_V 1FMY_A 1AOO_A 1AQQ_A.
Probab=56.38  E-value=8.6  Score=22.86  Aligned_cols=19  Identities=37%  Similarity=0.821  Sum_probs=11.1

Q ss_pred             cCCCCCCccCCCCccCCCCC
Q 040039          253 LVCGKYGICSQGQCSCPATY  272 (274)
Q Consensus       253 ~~CG~~giC~~~~C~Cl~gf  272 (274)
                      |.|-.+.-| ...|+||.|-
T Consensus        12 gscknneqc-qkscscptgc   30 (40)
T PF11403_consen   12 GSCKNNEQC-QKSCSCPTGC   30 (40)
T ss_dssp             STTTT-TTS-TTS-SS-TTT
T ss_pred             CCccChHHH-hhcCCCCCCC
Confidence            566677777 5679998773


No 32 
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=56.30  E-value=5.5  Score=31.37  Aligned_cols=24  Identities=25%  Similarity=0.401  Sum_probs=17.5

Q ss_pred             CCCcCCCCCCccC------CCCccCCCCCCC
Q 040039          250 GYPLVCGKYGICS------QGQCSCPATYFK  274 (274)
Q Consensus       250 ~~y~~CG~~giC~------~~~C~Cl~gf~~  274 (274)
                      +..+.|=. |.|.      .+.|.|..||+|
T Consensus        48 ey~~YClH-G~C~yI~dl~~~~CrC~~GYtG   77 (139)
T PHA03099         48 EGDGYCLH-GDCIHARDIDGMYCRCSHGYTG   77 (139)
T ss_pred             hhCCEeEC-CEEEeeccCCCceeECCCCccc
Confidence            34456654 6886      278999999997


No 33 
>TIGR03066 Gem_osc_para_1 Gemmata obscuriglobus paralogous family TIGR03066. This model represents an uncharacterized paralogous family in Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. This family shows sequence similarity to TIGR03067, which is also found in Gemmata obscuriglobus as well as in a few other species.
Probab=54.52  E-value=42  Score=25.86  Aligned_cols=54  Identities=19%  Similarity=0.298  Sum_probs=33.1

Q ss_pred             CCCcEEEEecCCcEEEEcCCCCE-E-----Eeec---------CCCC---ceeEEEEeeCCCeeEecCCCce
Q 040039           50 RINATLELTSDGNLVLQDADGAI-A-----WSTN---------TSGK---SVVGLNLTDMGNLVLFDKNNAA  103 (274)
Q Consensus        50 ~~~~~l~~~~~G~L~l~~~~~~~-~-----Wss~---------~~~~---~~~~~~l~d~GNlvl~~~~~~~  103 (274)
                      ...+.|+|..||.|+|..+++.- +     |+-.         ..+.   +-....-+++|-|||.|.+++.
T Consensus        33 ~~~~~leF~~dGKL~v~~gnng~~~~~~Gty~L~G~kLtL~~~p~g~t~k~~Vtv~~l~~~~Lvl~d~dg~~  104 (111)
T TIGR03066        33 KDDVVIEFAKDGKLVVTIGEKGKEVKADGTYKLDGNKLTLTLKAGGKEKKETLTVKKLTDDELVGKDPDGKK  104 (111)
T ss_pred             CCceEEEEcCCCeEEEecCCCCcEeccCceEEEECCEEEEEEcCCCccccceEEEEEecCCeEEEEcCCCCE
Confidence            35578999999999988765442 1     3321         1111   1011223688999999988763


No 34 
>smart00051 DSL delta serrate ligand.
Probab=53.51  E-value=11  Score=25.77  Aligned_cols=18  Identities=17%  Similarity=0.434  Sum_probs=13.4

Q ss_pred             CCCccCC-CCccCCCCCCC
Q 040039          257 KYGICSQ-GQCSCPATYFK  274 (274)
Q Consensus       257 ~~giC~~-~~C~Cl~gf~~  274 (274)
                      ....|+. ..|.|+||++|
T Consensus        42 ~~~~Cd~~G~~~C~~Gw~G   60 (63)
T smart00051       42 GHYTCDENGNKGCLEGWMG   60 (63)
T ss_pred             CCccCCcCCCEecCCCCcC
Confidence            3455763 67999999986


No 35 
>cd05852 Ig5_Contactin-1 Fifth Ig domain of contactin-1. Ig5_Contactin-1: fifth Ig domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.
Probab=52.91  E-value=64  Score=22.17  Aligned_cols=34  Identities=21%  Similarity=0.391  Sum_probs=24.1

Q ss_pred             CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEcC
Q 040039           34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQDA   68 (274)
Q Consensus        34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~~   68 (274)
                      .|.+++.|.=|.. ++.....+.+..+|.|+|.+.
T Consensus        13 ~P~p~v~W~k~~~-~l~~~~r~~~~~~g~L~I~~v   46 (73)
T cd05852          13 APKPKFSWSKGTE-LLVNNSRISIWDDGSLEILNI   46 (73)
T ss_pred             eCCCEEEEEeCCE-ecccCCCEEEcCCCEEEECcC
Confidence            4667889987643 555556677777899988764


No 36 
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=48.66  E-value=9.8  Score=39.07  Aligned_cols=27  Identities=30%  Similarity=0.805  Sum_probs=21.2

Q ss_pred             CCCCCCcCCCCCCccCC----CCccCCCCCCC
Q 040039          247 GECGYPLVCGKYGICSQ----GQCSCPATYFK  274 (274)
Q Consensus       247 d~C~~y~~CG~~giC~~----~~C~Cl~gf~~  274 (274)
                      |+|. +..|-++..|-.    ..|.|.|||+|
T Consensus       828 DeC~-psrChp~A~CyntpgsfsC~C~pGy~G  858 (1289)
T KOG1214|consen  828 DECS-PSRCHPAATCYNTPGSFSCRCQPGYYG  858 (1289)
T ss_pred             cccC-ccccCCCceEecCCCcceeecccCccC
Confidence            6675 788888888862    56999999976


No 37 
>PF12690 BsuPI:  Intracellular proteinase inhibitor;  InterPro: IPR020481 BsuPI is an intracellular proteinase inhibitor that directly regulates the major intracellular proteinase (ISP-1) activity in vivo. It inhibits ISP-1 in the early stages of sporulation and then may be inactivated by a membrane-bound proteinase.; PDB: 3ISY_A.
Probab=46.86  E-value=24  Score=25.43  Aligned_cols=16  Identities=25%  Similarity=0.686  Sum_probs=6.8

Q ss_pred             cEEEEcCCCCEEEeec
Q 040039           62 NLVLQDADGAIAWSTN   77 (274)
Q Consensus        62 ~L~l~~~~~~~~Wss~   77 (274)
                      +|+|.|.+|..||.-.
T Consensus        27 D~~v~d~~g~~vwrwS   42 (82)
T PF12690_consen   27 DFVVKDKEGKEVWRWS   42 (82)
T ss_dssp             EEEEE-TT--EEEETT
T ss_pred             EEEEECCCCCEEEEec
Confidence            4555555555665443


No 38 
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth, migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d), the first three of which resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable, ranging from 3 up to 22 copies
Probab=42.16  E-value=18  Score=23.23  Aligned_cols=10  Identities=40%  Similarity=0.953  Sum_probs=4.9

Q ss_pred             CCccCCCCCC
Q 040039          264 GQCSCPATYF  273 (274)
Q Consensus       264 ~~C~Cl~gf~  273 (274)
                      .+|.|.++++
T Consensus        19 G~C~C~~~~~   28 (50)
T cd00055          19 GQCECKPNTT   28 (50)
T ss_pred             CEEeCCCcCC
Confidence            4455555544


No 39 
>PF12946 EGF_MSP1_1:  MSP1 EGF domain 1;  InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=39.07  E-value=3.7  Score=25.09  Aligned_cols=22  Identities=27%  Similarity=0.578  Sum_probs=13.9

Q ss_pred             cCCCCCCccCC-----CCccCCCCCCC
Q 040039          253 LVCGKYGICSQ-----GQCSCPATYFK  274 (274)
Q Consensus       253 ~~CG~~giC~~-----~~C~Cl~gf~~  274 (274)
                      ..|=.|+-|-.     ..|.|++||+|
T Consensus         5 ~~cP~NA~C~~~~dG~eecrCllgyk~   31 (37)
T PF12946_consen    5 TKCPANAGCFRYDDGSEECRCLLGYKK   31 (37)
T ss_dssp             S---TTEEEEEETTSEEEEEE-TTEEE
T ss_pred             ccCCCCcccEEcCCCCEEEEeeCCccc
Confidence            45777888851     57999999975


No 40 
>PF06006 DUF905:  Bacterial protein of unknown function (DUF905);  InterPro: IPR009253 This family consists of several short hypothetical proteobacterial proteins of unknown function.; PDB: 2HJJ_A.
Probab=38.33  E-value=35  Score=23.88  Aligned_cols=19  Identities=26%  Similarity=0.648  Sum_probs=12.0

Q ss_pred             CeeEecCCCceEEeecCCC
Q 040039           93 NLVLFDKNNAAVWQSFDHP  111 (274)
Q Consensus        93 Nlvl~~~~~~~lWqSFd~P  111 (274)
                      -|||||.++..+|..+.+-
T Consensus        34 RlvvRd~~g~mvWRaWNFE   52 (70)
T PF06006_consen   34 RLVVRDTEGQMVWRAWNFE   52 (70)
T ss_dssp             EEEEE-SS--EEEEEESSS
T ss_pred             EEEEEcCCCcEEEEeeccC
Confidence            4788888888888887653


No 41 
>KOG0640 consensus mRNA cleavage stimulating factor complex; subunit 1 [RNA processing and modification]
Probab=37.25  E-value=1.7e+02  Score=26.99  Aligned_cols=68  Identities=25%  Similarity=0.462  Sum_probs=48.2

Q ss_pred             EEEEcCCCCCCCCC-cEEEEecCCcEEEEcC-CCCE-EEeecC-----------CCCceeEEEEeeCCCeeEecCCCc--
Q 040039           39 VVWSANRNNPVRIN-ATLELTSDGNLVLQDA-DGAI-AWSTNT-----------SGKSVVGLNLTDMGNLVLFDKNNA--  102 (274)
Q Consensus        39 vVWvANr~~Pv~~~-~~l~~~~~G~L~l~~~-~~~~-~Wss~~-----------~~~~~~~~~l~d~GNlvl~~~~~~--  102 (274)
                      .-=.||.+..+.+. ..+.-+..|+|.+... +|.+ +|..-.           .+..+.+|++..+|..+|......  
T Consensus       250 cfvsanPd~qht~ai~~V~Ys~t~~lYvTaSkDG~IklwDGVS~rCv~t~~~AH~gsevcSa~Ftkn~kyiLsSG~DS~v  329 (430)
T KOG0640|consen  250 CFVSANPDDQHTGAITQVRYSSTGSLYVTASKDGAIKLWDGVSNRCVRTIGNAHGGSEVCSAVFTKNGKYILSSGKDSTV  329 (430)
T ss_pred             EeeecCcccccccceeEEEecCCccEEEEeccCCcEEeeccccHHHHHHHHhhcCCceeeeEEEccCCeEEeecCCccee
Confidence            34568888888777 6788888899988854 3443 785321           134567899999999999864433  


Q ss_pred             eEEe
Q 040039          103 AVWQ  106 (274)
Q Consensus       103 ~lWq  106 (274)
                      -|||
T Consensus       330 kLWE  333 (430)
T KOG0640|consen  330 KLWE  333 (430)
T ss_pred             eeee
Confidence            4898


No 42 
>PF13570 PQQ_3:  PQQ-like domain; PDB: 3HXJ_B 3Q54_A.
Probab=36.44  E-value=47  Score=19.86  Aligned_cols=11  Identities=45%  Similarity=1.039  Sum_probs=5.1

Q ss_pred             CCEEEeecCCC
Q 040039           70 GAIAWSTNTSG   80 (274)
Q Consensus        70 ~~~~Wss~~~~   80 (274)
                      |+++|+....+
T Consensus         1 G~~~W~~~~~~   11 (40)
T PF13570_consen    1 GKVLWSYDTGG   11 (40)
T ss_dssp             S-EEEEEE-SS
T ss_pred             CceeEEEECCC
Confidence            34667666543


No 43 
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=36.36  E-value=18  Score=37.22  Aligned_cols=57  Identities=14%  Similarity=0.209  Sum_probs=36.3

Q ss_pred             EEEcCCCCeEEE-----EEc-CCCCeEEeeecccccCCCCCCC-cCCCCCCccC----C---CCccCCCCCCC
Q 040039          216 MRLWPDGHLRVY-----EWQ-ASIGWTEVADLLTGYLGECGYP-LVCGKYGICS----Q---GQCSCPATYFK  274 (274)
Q Consensus       216 l~Ld~dG~lr~y-----~w~-~~~~W~~~~~~~~~p~d~C~~y-~~CG~~giC~----~---~~C~Cl~gf~~  274 (274)
                      +-+..+|+.|.-     .+. +...-+.+.  -.+|.+.|+-- .-|-..|-|.    .   -+|.|||||-|
T Consensus       749 ~Cin~pg~~rceC~~gy~F~dd~~tCV~i~--~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsG  819 (1289)
T KOG1214|consen  749 VCINLPGSYRCECRSGYEFADDRHTCVLIT--PPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSG  819 (1289)
T ss_pred             eeecCCCceeEEEeecceeccCCcceEEec--CCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccC
Confidence            445567887653     344 445455431  12356888776 5688888775    1   47999999975


No 44 
>PF14670 FXa_inhibition:  Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=36.15  E-value=8.7  Score=23.26  Aligned_cols=10  Identities=50%  Similarity=1.132  Sum_probs=7.6

Q ss_pred             CCccCCCCCC
Q 040039          264 GQCSCPATYF  273 (274)
Q Consensus       264 ~~C~Cl~gf~  273 (274)
                      ..|+|++||+
T Consensus        19 ~~C~C~~Gy~   28 (36)
T PF14670_consen   19 YRCSCPPGYK   28 (36)
T ss_dssp             EEEE-STTEE
T ss_pred             eEeECCCCCE
Confidence            5799999985


No 45 
>cd00216 PQQ_DH Dehydrogenases with pyrrolo-quinoline quinone (PQQ) as cofactor, like ethanol, methanol, and membrane bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller.
Probab=36.15  E-value=1.3e+02  Score=29.06  Aligned_cols=73  Identities=25%  Similarity=0.413  Sum_probs=43.4

Q ss_pred             CCCcEEEEcCCC-------CCCCCCcEEEEe-cCCcEEEEcC-CCCEEEeecCCCC-----------cee-----EEE-E
Q 040039           35 EFPQVVWSANRN-------NPVRINATLELT-SDGNLVLQDA-DGAIAWSTNTSGK-----------SVV-----GLN-L   88 (274)
Q Consensus        35 ~~~~vVWvANr~-------~Pv~~~~~l~~~-~~G~L~l~~~-~~~~~Wss~~~~~-----------~~~-----~~~-l   88 (274)
                      ....++|..+-.       .|+-....+.+. .+|.|+-+|. .|+++|+.+....           +++     .+. -
T Consensus        37 ~~~~~~W~~~~~~~~~~~~sPvv~~g~vy~~~~~g~l~AlD~~tG~~~W~~~~~~~~~~~~~~~~~~g~~~~~~~~V~v~  116 (488)
T cd00216          37 KKLKVAWTFSTGDERGQEGTPLVVDGDMYFTTSHSALFALDAATGKVLWRYDPKLPADRGCCDVVNRGVAYWDPRKVFFG  116 (488)
T ss_pred             hcceeeEEEECCCCCCcccCCEEECCEEEEeCCCCcEEEEECCCChhhceeCCCCCccccccccccCCcEEccCCeEEEe
Confidence            345578987644       344444444444 5799988885 6889998764321           000     011 1


Q ss_pred             eeCCCeeEecC-CCceEEee
Q 040039           89 TDMGNLVLFDK-NNAAVWQS  107 (274)
Q Consensus        89 ~d~GNlvl~~~-~~~~lWqS  107 (274)
                      ..+|.++-.|. +++++|+-
T Consensus       117 ~~~g~v~AlD~~TG~~~W~~  136 (488)
T cd00216         117 TFDGRLVALDAETGKQVWKF  136 (488)
T ss_pred             cCCCeEEEEECCCCCEeeee
Confidence            23567776775 58899994


No 46 
>PLN00033 photosystem II stability/assembly factor; Provisional
Probab=33.48  E-value=1.7e+02  Score=27.75  Aligned_cols=46  Identities=17%  Similarity=0.325  Sum_probs=26.3

Q ss_pred             CcEEEEcCCCCEEEeecCC--CCceeEEEEeeCCCeeEecCCCceEEe
Q 040039           61 GNLVLQDADGAIAWSTNTS--GKSVVGLNLTDMGNLVLFDKNNAAVWQ  106 (274)
Q Consensus        61 G~L~l~~~~~~~~Wss~~~--~~~~~~~~l~d~GNlvl~~~~~~~lWq  106 (274)
                      |++++.+.+|...|.....  ......+...++|.++|....+.++|.
T Consensus       259 G~~~~s~d~G~~~W~~~~~~~~~~l~~v~~~~dg~l~l~g~~G~l~~S  306 (398)
T PLN00033        259 GNFYLTWEPGQPYWQPHNRASARRIQNMGWRADGGLWLLTRGGGLYVS  306 (398)
T ss_pred             ccEEEecCCCCcceEEecCCCccceeeeeEcCCCCEEEEeCCceEEEe
Confidence            4444444344445654332  223345566789999998777766554


No 47 
>PF05935 Arylsulfotrans:  Arylsulfotransferase (ASST);  InterPro: IPR010262 This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate.; PDB: 3ETT_B 3ELQ_A 3ETS_A.
Probab=32.72  E-value=48  Score=32.06  Aligned_cols=52  Identities=23%  Similarity=0.415  Sum_probs=30.2

Q ss_pred             CCcEEEEcCCCCEEEeecCCCCceeEEEEeeCCCeeEe--------cCCCceEEeecCCCC
Q 040039           60 DGNLVLQDADGAIAWSTNTSGKSVVGLNLTDMGNLVLF--------DKNNAAVWQSFDHPT  112 (274)
Q Consensus        60 ~G~L~l~~~~~~~~Wss~~~~~~~~~~~l~d~GNlvl~--------~~~~~~lWqSFd~Pt  112 (274)
                      .+..+++|.+|.++|.-.........+..+++|+|...        |-.|+++|+ ++.|.
T Consensus       127 ~~~~~~iD~~G~Vrw~~~~~~~~~~~~~~l~nG~ll~~~~~~~~e~D~~G~v~~~-~~l~~  186 (477)
T PF05935_consen  127 SSYTYLIDNNGDVRWYLPLDSGSDNSFKQLPNGNLLIGSGNRLYEIDLLGKVIWE-YDLPG  186 (477)
T ss_dssp             EEEEEEEETTS-EEEEE-GGGT--SSEEE-TTS-EEEEEBTEEEEE-TT--EEEE-EE--T
T ss_pred             CceEEEECCCccEEEEEccCccccceeeEcCCCCEEEecCCceEEEcCCCCEEEe-eecCC
Confidence            46788999999999987753322222678899999864        345788998 66665


No 48 
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=31.88  E-value=1.6e+02  Score=26.85  Aligned_cols=19  Identities=16%  Similarity=0.081  Sum_probs=12.5

Q ss_pred             eeCCCeeEecCC-CceEEee
Q 040039           89 TDMGNLVLFDKN-NAAVWQS  107 (274)
Q Consensus        89 ~d~GNlvl~~~~-~~~lWqS  107 (274)
                      ..+|.|.+.|.. ++++|+-
T Consensus       327 ~~~G~l~~~d~~tG~~~~~~  346 (377)
T TIGR03300       327 DFEGYLHWLSREDGSFVARL  346 (377)
T ss_pred             eCCCEEEEEECCCCCEEEEE
Confidence            457777777653 6777753


No 49 
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=31.37  E-value=32  Score=21.74  Aligned_cols=10  Identities=40%  Similarity=0.979  Sum_probs=4.4

Q ss_pred             CCccCCCCCC
Q 040039          264 GQCSCPATYF  273 (274)
Q Consensus       264 ~~C~Cl~gf~  273 (274)
                      .+|.|.++++
T Consensus        18 G~C~C~~~~~   27 (46)
T smart00180       18 GQCECKPNVT   27 (46)
T ss_pred             CEEECCCCCC
Confidence            3444444443


No 50 
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=30.43  E-value=33  Score=30.70  Aligned_cols=21  Identities=33%  Similarity=0.841  Sum_probs=18.1

Q ss_pred             CCCCCCccC-------CCCccCCCCCCC
Q 040039          254 VCGKYGICS-------QGQCSCPATYFK  274 (274)
Q Consensus       254 ~CG~~giC~-------~~~C~Cl~gf~~  274 (274)
                      .|+-||-|.       +.+|.|-+||+|
T Consensus       151 ~C~GnG~C~GdGsR~GsGkCkC~~GY~G  178 (350)
T KOG4260|consen  151 PCFGNGSCHGDGSREGSGKCKCETGYTG  178 (350)
T ss_pred             CcCCCCcccCCCCCCCCCcccccCCCCC
Confidence            599999997       278999999986


No 51 
>KOG3881 consensus Uncharacterized conserved protein [Function unknown]
Probab=29.06  E-value=2.6e+02  Score=26.37  Aligned_cols=76  Identities=17%  Similarity=0.149  Sum_probs=49.2

Q ss_pred             CCcEEEEcC-CCCCCCCC-cEEEEecCCcEEEEcCC--CCEEEeecCCCCceeEEEEeeCCCeeEecC-CCceEEeecCC
Q 040039           36 FPQVVWSAN-RNNPVRIN-ATLELTSDGNLVLQDAD--GAIAWSTNTSGKSVVGLNLTDMGNLVLFDK-NNAAVWQSFDH  110 (274)
Q Consensus        36 ~~~vVWvAN-r~~Pv~~~-~~l~~~~~G~L~l~~~~--~~~~Wss~~~~~~~~~~~l~d~GNlvl~~~-~~~~lWqSFd~  110 (274)
                      .+++||... |--|=... --++++..|.|.++|..  .+||=+-.-...+...+.|.-+||+|+... .+.+  -+||+
T Consensus       199 LrVPvW~tdi~Fl~g~~~~~fat~T~~hqvR~YDt~~qRRPV~~fd~~E~~is~~~l~p~gn~Iy~gn~~g~l--~~FD~  276 (412)
T KOG3881|consen  199 LRVPVWITDIRFLEGSPNYKFATITRYHQVRLYDTRHQRRPVAQFDFLENPISSTGLTPSGNFIYTGNTKGQL--AKFDL  276 (412)
T ss_pred             ceeeeeeccceecCCCCCceEEEEecceeEEEecCcccCcceeEeccccCcceeeeecCCCcEEEEecccchh--heecc
Confidence            345666644 22221113 56888999999999963  457766554445667788999999988743 3443  56887


Q ss_pred             CCC
Q 040039          111 PTD  113 (274)
Q Consensus       111 PtD  113 (274)
                      -+-
T Consensus       277 r~~  279 (412)
T KOG3881|consen  277 RGG  279 (412)
T ss_pred             cCc
Confidence            653


No 52 
>PF10282 Lactonase:  Lactonase, 7-bladed beta-propeller;  InterPro: IPR019405  6-phosphogluconolactonases (6PGL) 3.1.1.31 from EC, which hydrolyses 6-phosphogluconolactone to 6-phosphogluconate, is one of the enzymes in the pentose phosphate pathway. Two families of structurally dissimilar 6PGLs are known to exist: the Escherichia coli (strain K12) YbhE IPR022528 from INTERPRO [] and the Pseudomonas aeruginosa DevB IPR005900 from INTERPRO [] types.  This entry contains bacterial 6-phosphogluconolactonases (6PGL) YbhE-type 3.1.1.31 from EC which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonizing enzyme carboxy-cis,cis-muconate cyclase 5.5.1.5 from EC and muconate cycloisomerase 5.5.1.1 from EC, which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures have been reported for the E. coli 6-phosphogluconolactonase and Neurospora crassa muconate cycloisomerase. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold [].; PDB: 3SCY_A 1L0Q_A 3HFQ_B 3FGB_A 1RI6_A 3U4Y_A 3BWS_A 1JOF_H.
Probab=27.71  E-value=3.7e+02  Score=24.36  Aligned_cols=65  Identities=20%  Similarity=0.358  Sum_probs=0.0

Q ss_pred             EEEEeeeccccccccccCCCCcEEEEcCCCCCCCCC-cEEEE-ecCCcEE---------------EEcCCCCEEEeecCC
Q 040039           17 AVFIVQAYNASLIDYQHIEFPQVVWSANRNNPVRIN-ATLEL-TSDGNLV---------------LQDADGAIAWSTNTS   79 (274)
Q Consensus        17 ~iw~~~~~~~~~~~~~~~~~~~vVWvANr~~Pv~~~-~~l~~-~~~G~L~---------------l~~~~~~~~Wss~~~   79 (274)
                      +|.+.             +....++|+||.   .+. +.+.+ ..+|.|.               ..+++|+.++-++..
T Consensus       249 ~i~is-------------pdg~~lyvsnr~---~~sI~vf~~d~~~g~l~~~~~~~~~G~~Pr~~~~s~~g~~l~Va~~~  312 (345)
T PF10282_consen  249 EIAIS-------------PDGRFLYVSNRG---SNSISVFDLDPATGTLTLVQTVPTGGKFPRHFAFSPDGRYLYVANQD  312 (345)
T ss_dssp             EEEE--------------TTSSEEEEEECT---TTEEEEEEECTTTTTEEEEEEEEESSSSEEEEEE-TTSSEEEEEETT
T ss_pred             eEEEe-------------cCCCEEEEEecc---CCEEEEEEEecCCCceEEEEEEeCCCCCccEEEEeCCCCEEEEEecC


Q ss_pred             CCceeEEEEe-eCCCeeEe
Q 040039           80 GKSVVGLNLT-DMGNLVLF   97 (274)
Q Consensus        80 ~~~~~~~~l~-d~GNlvl~   97 (274)
                      ...+....+. ++|.|...
T Consensus       313 s~~v~vf~~d~~tG~l~~~  331 (345)
T PF10282_consen  313 SNTVSVFDIDPDTGKLTPV  331 (345)
T ss_dssp             TTEEEEEEEETTTTEEEEE
T ss_pred             CCeEEEEEEeCCCCcEEEe


No 53 
>PF05294 Toxin_5:  Scorpion short toxin;  InterPro: IPR007958 This family contains various secreted scorpion short toxins which seem to be unrelated to those described in IPR001947 from INTERPRO.; GO: 0009405 pathogenesis, 0005576 extracellular region; PDB: 1SIS_A 1CHL_A.
Probab=27.69  E-value=15  Score=21.46  Aligned_cols=15  Identities=47%  Similarity=1.096  Sum_probs=12.2

Q ss_pred             CCCCCCccCCCCccC
Q 040039          254 VCGKYGICSQGQCSC  268 (274)
Q Consensus       254 ~CG~~giC~~~~C~C  268 (274)
                      -||..|.|-.++|-|
T Consensus        18 CCgg~GkC~GpqClC   32 (32)
T PF05294_consen   18 CCGGRGKCFGPQCLC   32 (32)
T ss_dssp             HCTTSEEEETTEEEE
T ss_pred             HhCCCCeEcCCcccC
Confidence            388889997788876


No 54 
>smart00286 PTI Plant trypsin inhibitors.
Probab=26.76  E-value=42  Score=19.21  Aligned_cols=18  Identities=33%  Similarity=0.866  Sum_probs=9.1

Q ss_pred             CCCCCccCCCCccCCC-CCC
Q 040039          255 CGKYGICSQGQCSCPA-TYF  273 (274)
Q Consensus       255 CG~~giC~~~~C~Cl~-gf~  273 (274)
                      |-.-+-| .+.|.|++ ||-
T Consensus        10 Ck~DsDC-l~~CiC~~~G~C   28 (29)
T smart00286       10 CKRDSDC-MAECICLANGYC   28 (29)
T ss_pred             cccccCc-ccCCEEcccccc
Confidence            4444444 35566665 553


No 55 
>smart00564 PQQ beta-propeller repeat. Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases.
Probab=26.50  E-value=1.2e+02  Score=16.71  Aligned_cols=17  Identities=47%  Similarity=0.894  Sum_probs=9.6

Q ss_pred             cCCcEEEEcC-CCCEEEe
Q 040039           59 SDGNLVLQDA-DGAIAWS   75 (274)
Q Consensus        59 ~~G~L~l~~~-~~~~~Ws   75 (274)
                      .+|.|+-+|. +|..+|+
T Consensus        14 ~~g~l~a~d~~~G~~~W~   31 (33)
T smart00564       14 TDGTLYALDAKTGEILWT   31 (33)
T ss_pred             CCCEEEEEEcccCcEEEE
Confidence            3466665554 4566665


No 56 
>PF06247 Plasmod_Pvs28:  Plasmodium ookinete surface protein Pvs28;  InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=25.88  E-value=8.9  Score=32.29  Aligned_cols=27  Identities=33%  Similarity=0.815  Sum_probs=17.7

Q ss_pred             CCCCC----CcCCCCCCccCC---------CCccCCCCCC
Q 040039          247 GECGY----PLVCGKYGICSQ---------GQCSCPATYF  273 (274)
Q Consensus       247 d~C~~----y~~CG~~giC~~---------~~C~Cl~gf~  273 (274)
                      -.|+.    .-.||.|+.|..         -.|.|++||.
T Consensus        40 v~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~   79 (197)
T PF06247_consen   40 VECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGYI   79 (197)
T ss_dssp             ---SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTEE
T ss_pred             eecCcccccCccccchhhhhcCCCcccceeEEEecccCce
Confidence            45654    457999999971         3599999985


No 57 
>PF01011 PQQ:  PQQ enzyme repeat family.;  InterPro: IPR002372 Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor for a number of enzymes (quinoproteins) and particularly for some bacterial dehydrogenases [, ]. A number of bacterial quinoproteins belong to this family. Enzymes in this group have repeats of a beta propeller.; PDB: 1H4I_C 1H4J_E 1W6S_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A 1G72_A ....
Probab=25.80  E-value=95  Score=18.38  Aligned_cols=22  Identities=41%  Similarity=0.664  Sum_probs=17.0

Q ss_pred             ecCCcEEEEcC-CCCEEEeecCC
Q 040039           58 TSDGNLVLQDA-DGAIAWSTNTS   79 (274)
Q Consensus        58 ~~~G~L~l~~~-~~~~~Wss~~~   79 (274)
                      +.+|.|+-+|. .|+.+|+-+..
T Consensus         7 ~~~g~l~AlD~~TG~~~W~~~~~   29 (38)
T PF01011_consen    7 TPDGYLYALDAKTGKVLWKFQTG   29 (38)
T ss_dssp             TTTSEEEEEETTTTSEEEEEESS
T ss_pred             CCCCEEEEEECCCCCEEEeeeCC
Confidence            56788888875 68899988764


No 58 
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=25.77  E-value=46  Score=32.66  Aligned_cols=20  Identities=35%  Similarity=0.992  Sum_probs=9.5

Q ss_pred             CCCCCccCCCCccCCCCCCC
Q 040039          255 CGKYGICSQGQCSCPATYFK  274 (274)
Q Consensus       255 CG~~giC~~~~C~Cl~gf~~  274 (274)
                      |...+.|...+|.|.+||+|
T Consensus       287 cs~~g~~~~g~CiC~~g~~G  306 (525)
T KOG1225|consen  287 CSGGGVCVDGECICNPGYSG  306 (525)
T ss_pred             cCCCceecCCEeecCCCccc
Confidence            44444444344555555543


No 59 
>cd00150 PlantTI Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges.
Probab=25.48  E-value=44  Score=18.80  Aligned_cols=8  Identities=38%  Similarity=1.140  Sum_probs=3.5

Q ss_pred             CCCCCCcc
Q 040039          254 VCGKYGIC  261 (274)
Q Consensus       254 ~CG~~giC  261 (274)
                      +|..+|+|
T Consensus        19 iC~~~G~C   26 (27)
T cd00150          19 ICLENGYC   26 (27)
T ss_pred             EEcccccc
Confidence            44444444


No 60 
>PF00053 Laminin_EGF:  Laminin EGF-like (Domains III and V);  InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below.  +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain  In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=24.72  E-value=21  Score=22.67  Aligned_cols=10  Identities=40%  Similarity=0.860  Sum_probs=4.7

Q ss_pred             CCccCCCCCC
Q 040039          264 GQCSCPATYF  273 (274)
Q Consensus       264 ~~C~Cl~gf~  273 (274)
                      .+|.|.++|+
T Consensus        18 G~C~C~~~~~   27 (49)
T PF00053_consen   18 GQCVCKPGTT   27 (49)
T ss_dssp             EEESBSTTEE
T ss_pred             CEEecccccc
Confidence            3455555443


No 61 
>KOG0291 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
Probab=24.64  E-value=3.5e+02  Score=27.94  Aligned_cols=53  Identities=28%  Similarity=0.657  Sum_probs=39.2

Q ss_pred             cEEEEecCCcEEEEcC-CCCE-EEeecCC---------CCceeEEEEeeCCCeeEecC-CCce-EE
Q 040039           53 ATLELTSDGNLVLQDA-DGAI-AWSTNTS---------GKSVVGLNLTDMGNLVLFDK-NNAA-VW  105 (274)
Q Consensus        53 ~~l~~~~~G~L~l~~~-~~~~-~Wss~~~---------~~~~~~~~l~d~GNlvl~~~-~~~~-lW  105 (274)
                      .++....||.++...+ ++++ ||.+..+         ..+++.++..-+||.+|..+ +|.| .|
T Consensus       354 ~~l~YSpDgq~iaTG~eDgKVKvWn~~SgfC~vTFteHts~Vt~v~f~~~g~~llssSLDGtVRAw  419 (893)
T KOG0291|consen  354 TSLAYSPDGQLIATGAEDGKVKVWNTQSGFCFVTFTEHTSGVTAVQFTARGNVLLSSSLDGTVRAW  419 (893)
T ss_pred             eeEEECCCCcEEEeccCCCcEEEEeccCceEEEEeccCCCceEEEEEEecCCEEEEeecCCeEEee
Confidence            6899999999999865 4555 8987641         24567788899999999854 4553 45


No 62 
>PF02237 BPL_C:  Biotin protein ligase C terminal domain;  InterPro: IPR003142 This C-terminal domain has an SH3-like barrel fold, the function of which is unknown. It is found associated with prokaryotic bifunctional transcriptional repressors [] and eukaryotic enzymes involved in biotin utilization [, ].   In Escherichia coli the biotin operon repressor (BirA) is a bifunctional protein. BirA acts both as the acetyl-coA carboxylase biotin holoenzyme synthetase (6.3.4.15 from EC) and as the biotin operon repressor. DNA sequence analysis of mutations indicates that the helix-turn-helix DNA binding region is located at the N terminus while mutations affecting enzyme function, although mapping over a large region, are found mainly in the central part of the protein's primary sequence [].; GO: 0006464 protein modification process; PDB: 3RUX_A 2CGH_A 3L1A_B 3L2Z_A 1HXD_A 1BIB_A 2EWN_B 1BIA_A 2EJ9_A 3FJP_A ....
Probab=24.03  E-value=45  Score=21.19  Aligned_cols=15  Identities=20%  Similarity=0.432  Sum_probs=8.6

Q ss_pred             EEeeCCCeeEecCCC
Q 040039           87 NLTDMGNLVLFDKNN  101 (274)
Q Consensus        87 ~l~d~GNlvl~~~~~  101 (274)
                      -+.++|.|+|+..++
T Consensus        21 gId~~G~L~v~~~~g   35 (48)
T PF02237_consen   21 GIDDDGALLVRTEDG   35 (48)
T ss_dssp             EEETTSEEEEEETTE
T ss_pred             EECCCCEEEEEECCC
Confidence            455666666665554


No 63 
>cd05764 Ig_2 Subgroup of the immunoglobulin (Ig) superfamily. Ig_2: subgroup of the immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of the Ig superfamily are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond.
Probab=23.74  E-value=1.3e+02  Score=20.08  Aligned_cols=34  Identities=12%  Similarity=0.174  Sum_probs=21.6

Q ss_pred             CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEc
Q 040039           34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQD   67 (274)
Q Consensus        34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~   67 (274)
                      .|.+++.|.-+.+.++.......+..+|.|.|..
T Consensus        13 ~P~p~v~W~~~~~~~~~~~~~~~~~~~~~L~i~~   46 (74)
T cd05764          13 DPEPAIHWISPDGKLISNSSRTLVYDNGTLDILI   46 (74)
T ss_pred             cCCCEEEEEeCCCEEecCCCeEEEecCCEEEEEE
Confidence            3667889996655566544444455667777663


No 64 
>KOG2106 consensus Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]
Probab=23.01  E-value=3e+02  Score=27.10  Aligned_cols=68  Identities=21%  Similarity=0.455  Sum_probs=45.2

Q ss_pred             cEEEEecCCcEEEEcCCCC-EEEeecCC---------CCceeEEEEeeCCCeeEecCCCc-eEEee-cCCCCCcccCCCc
Q 040039           53 ATLELTSDGNLVLQDADGA-IAWSTNTS---------GKSVVGLNLTDMGNLVLFDKNNA-AVWQS-FDHPTDSLVPGQK  120 (274)
Q Consensus        53 ~~l~~~~~G~L~l~~~~~~-~~Wss~~~---------~~~~~~~~l~d~GNlvl~~~~~~-~lWqS-Fd~PtDTlLpgq~  120 (274)
                      -.++|.++|..+--|++|. .||+..+.         ..++-.+.|+.+|-|+=-..+-. ++|.. ..---+|-||.|.
T Consensus       250 l~v~F~engdviTgDS~G~i~Iw~~~~~~~~k~~~aH~ggv~~L~~lr~GtllSGgKDRki~~Wd~~y~k~r~~elPe~~  329 (626)
T KOG2106|consen  250 LCVTFLENGDVITGDSGGNILIWSKGTNRISKQVHAHDGGVFSLCMLRDGTLLSGGKDRKIILWDDNYRKLRETELPEQF  329 (626)
T ss_pred             EEEEEcCCCCEEeecCCceEEEEeCCCceEEeEeeecCCceEEEEEecCccEeecCccceEEeccccccccccccCchhc
Confidence            3677888998888888776 58987642         12455688899999886323322 68983 1123567778775


No 65 
>KOG0278 consensus Serine/threonine kinase receptor-associated protein [Lipid transport and metabolism]
Probab=22.33  E-value=5.2e+02  Score=23.23  Aligned_cols=53  Identities=19%  Similarity=0.317  Sum_probs=30.7

Q ss_pred             cEEEEecCCcEEEEcCC-CCEEEeecCCCCceeEEEEeeCCCeeEec-CCCceEEe
Q 040039           53 ATLELTSDGNLVLQDAD-GAIAWSTNTSGKSVVGLNLTDMGNLVLFD-KNNAAVWQ  106 (274)
Q Consensus        53 ~~l~~~~~G~L~l~~~~-~~~~Wss~~~~~~~~~~~l~d~GNlvl~~-~~~~~lWq  106 (274)
                      +.|.-+.|+.+.|-|.. +..|=+-. ...++.+|++.-+|..+... .++-..|.
T Consensus       157 ~iLSSadd~tVRLWD~rTgt~v~sL~-~~s~VtSlEvs~dG~ilTia~gssV~Fwd  211 (334)
T KOG0278|consen  157 CILSSADDKTVRLWDHRTGTEVQSLE-FNSPVTSLEVSQDGRILTIAYGSSVKFWD  211 (334)
T ss_pred             eEEeeccCCceEEEEeccCcEEEEEe-cCCCCcceeeccCCCEEEEecCceeEEec
Confidence            45555666777776643 33332222 23567889999899877654 33444564


No 66 
>PF05833 FbpA:  Fibronectin-binding protein A N-terminus (FbpA);  InterPro: IPR008616 This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein, the C-terminal region is IPR008532 from INTERPRO. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host [].; PDB: 3DOA_A 2ZBK_F 2HKJ_A 1Z5B_A 1Z5C_B 1MX0_F 1Z5A_A 1MU5_A 1Z59_A.
Probab=22.16  E-value=79  Score=30.11  Aligned_cols=38  Identities=16%  Similarity=0.270  Sum_probs=22.3

Q ss_pred             EEEEeeC-CCeeEecCCCceEEeecCCCCC-----cccCCCccc
Q 040039           85 GLNLTDM-GNLVLFDKNNAAVWQSFDHPTD-----SLVPGQKLL  122 (274)
Q Consensus        85 ~~~l~d~-GNlvl~~~~~~~lWqSFd~PtD-----TlLpgq~l~  122 (274)
                      .++|... ||++|.|.++.||+---..+.+     +++||+...
T Consensus       116 i~El~g~~~NiiL~d~~~~Il~a~~~~~~~~~~~R~i~~G~~Y~  159 (455)
T PF05833_consen  116 IIELMGRHSNIILTDEDGKILDALRRVSFSQSRDREILPGEPYI  159 (455)
T ss_dssp             EEE--GGG-EEEEEETT-BEEEESS-B---------BSTTSB--
T ss_pred             EEEEcCCcccEEEEcCCCeEEeehhhcCcccccceeeccCcccc
Confidence            4677877 9999999999888876555554     889998765


No 67 
>PF14870 PSII_BNR:  Photosynthesis system II assembly factor YCF48; PDB: 2XBG_A.
Probab=21.82  E-value=2.9e+02  Score=25.12  Aligned_cols=52  Identities=19%  Similarity=0.403  Sum_probs=23.9

Q ss_pred             EEEecCCcEEEEcCCCCEEEeecC--CCCceeEEEEeeCCCeeEecCCCceEEee
Q 040039           55 LELTSDGNLVLQDADGAIAWSTNT--SGKSVVGLNLTDMGNLVLFDKNNAAVWQS  107 (274)
Q Consensus        55 l~~~~~G~L~l~~~~~~~~Wss~~--~~~~~~~~~l~d~GNlvl~~~~~~~lWqS  107 (274)
                      +.+...|++++.-..+...|....  +.+.+..|-...+|+|.+... +..|..|
T Consensus       159 vavs~~G~~~~s~~~G~~~w~~~~r~~~~riq~~gf~~~~~lw~~~~-Gg~~~~s  212 (302)
T PF14870_consen  159 VAVSSRGNFYSSWDPGQTTWQPHNRNSSRRIQSMGFSPDGNLWMLAR-GGQIQFS  212 (302)
T ss_dssp             EEEETTSSEEEEE-TT-SS-EEEE--SSS-EEEEEE-TTS-EEEEET-TTEEEEE
T ss_pred             EEEECcccEEEEecCCCccceEEccCccceehhceecCCCCEEEEeC-CcEEEEc
Confidence            334444444444323344455432  234456677778888888763 3344554


No 68 
>COG1520 FOG: WD40-like repeat [Function unknown]
Probab=21.73  E-value=4.7e+02  Score=23.88  Aligned_cols=73  Identities=26%  Similarity=0.456  Sum_probs=43.6

Q ss_pred             CCcEEEEcCCCC-------CCCCCcEEEEe-cCCcEEEEcCC-CCEEEeecCC---CC-----ce---eEEEE-ee--CC
Q 040039           36 FPQVVWSANRNN-------PVRINATLELT-SDGNLVLQDAD-GAIAWSTNTS---GK-----SV---VGLNL-TD--MG   92 (274)
Q Consensus        36 ~~~vVWvANr~~-------Pv~~~~~l~~~-~~G~L~l~~~~-~~~~Wss~~~---~~-----~~---~~~~l-~d--~G   92 (274)
                      .-+.+|..+...       |+.....+-+. .+|.|+-++.+ |..+|.....   ..     ..   ..+.+ .+  +|
T Consensus       130 ~G~~~W~~~~~~~~~~~~~~v~~~~~v~~~s~~g~~~al~~~tG~~~W~~~~~~~~~~~~~~~~~~~~~~vy~~~~~~~~  209 (370)
T COG1520         130 TGTLVWSRNVGGSPYYASPPVVGDGTVYVGTDDGHLYALNADTGTLKWTYETPAPLSLSIYGSPAIASGTVYVGSDGYDG  209 (370)
T ss_pred             CCcEEEEEecCCCeEEecCcEEcCcEEEEecCCCeEEEEEccCCcEEEEEecCCccccccccCceeecceEEEecCCCcc
Confidence            466889887666       23333555566 57999988876 8899985442   11     11   01111 22  44


Q ss_pred             CeeEecC-CCceEEeec
Q 040039           93 NLVLFDK-NNAAVWQSF  108 (274)
Q Consensus        93 Nlvl~~~-~~~~lWqSF  108 (274)
                      +|+=.|. +|..+|+.+
T Consensus       210 ~~~a~~~~~G~~~w~~~  226 (370)
T COG1520         210 ILYALNAEDGTLKWSQK  226 (370)
T ss_pred             eEEEEEccCCcEeeeee
Confidence            5665565 678889854


No 69 
>KOG4234 consensus TPR repeat-containing protein [General function prediction only]
Probab=20.98  E-value=40  Score=29.19  Aligned_cols=16  Identities=13%  Similarity=0.260  Sum_probs=12.6

Q ss_pred             CCCCCCCCcceEEEecC
Q 040039          132 STTNWTDGGLFSLSVTN  148 (274)
Q Consensus       132 s~~dps~~G~ysl~~d~  148 (274)
                      --.||.+ |.|++.+..
T Consensus       250 mvqd~nT-GsySi~fk~  265 (271)
T KOG4234|consen  250 MVQDPNT-GSYSINFKG  265 (271)
T ss_pred             eeeCCCC-CceeEEecC
Confidence            3458889 999999864


No 70 
>KOG4792 consensus Crk family adapters [Signal transduction mechanisms]
Probab=20.69  E-value=1.3e+02  Score=26.31  Aligned_cols=48  Identities=25%  Similarity=0.342  Sum_probs=30.2

Q ss_pred             EeecCCC------CCcccCCCcccCCceEeeecCCCCCCCCcceEEEec-CCCceEEEecC
Q 040039          105 WQSFDHP------TDSLVPGQKLLEGKKLTASVSTTNWTDGGLFSLSVT-NEGLFAFIESN  158 (274)
Q Consensus       105 WqSFd~P------tDTlLpgq~l~~~~~L~Sw~s~~dps~~G~ysl~~d-~~g~~~~~~~~  158 (274)
                      |.||=.|      .-+||.||+  .|..|+--   ++-++ |.|.|.+- .+...+++|..
T Consensus        10 r~swYfg~mSRqeA~~lL~~~r--~G~FLvRD---Sst~p-GdYvLsV~E~srVshYiIn~   64 (293)
T KOG4792|consen   10 RSSWYFGPMSRQEAVALLQGQR--HGVFLVRD---SSTSP-GDYVLSVSENSRVSHYIINS   64 (293)
T ss_pred             ccceecCcccHHHHHHHhcCcc--eeeEEEec---CCCCC-CceEEEEecCcceeeeeecC
Confidence            5665544      346788887  56666532   22235 99999994 45666777654


No 71 
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=20.69  E-value=60  Score=31.91  Aligned_cols=16  Identities=38%  Similarity=1.140  Sum_probs=12.0

Q ss_pred             CccCCCCccCCCCCCC
Q 040039          259 GICSQGQCSCPATYFK  274 (274)
Q Consensus       259 giC~~~~C~Cl~gf~~  274 (274)
                      |.|...+|-|++||+|
T Consensus       260 g~c~~G~CIC~~Gf~G  275 (525)
T KOG1225|consen  260 GQCVEGRCICPPGFTG  275 (525)
T ss_pred             ceEeCCeEeCCCCCcC
Confidence            5566678888888875


No 72 
>COG4787 FlgF Flagellar basal body rod protein [Cell motility and secretion]
Probab=20.31  E-value=4.5e+02  Score=22.92  Aligned_cols=24  Identities=42%  Similarity=0.691  Sum_probs=18.0

Q ss_pred             EEEEecCCcEEEEcCCCCEEEeec
Q 040039           54 TLELTSDGNLVLQDADGAIAWSTN   77 (274)
Q Consensus        54 ~l~~~~~G~L~l~~~~~~~~Wss~   77 (274)
                      -+.|+.||=|.+.+.+|+....-+
T Consensus        79 Dvaiq~DGwlaVq~~dG~EaYTRn  102 (251)
T COG4787          79 DVAIQGDGWLAVQDADGSEAYTRN  102 (251)
T ss_pred             eEEEccCceEEEEcCCCcchheec
Confidence            478888998888888887655443


Done!