Query         psy8678
Match_columns 394
No_of_seqs    254 out of 1532
Neff          8.1 
Searched_HMMs 46136
Date          Sat Aug 17 00:55:58 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy8678.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/8678hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 TIGR03524 GldJ gliding motilit 100.0 8.6E-66 1.9E-70  508.1  23.2  360   10-393    56-546 (559)
  2 TIGR03530 GldJ_short gliding m 100.0 3.8E-61 8.1E-66  469.3  22.4  307   10-394    61-399 (402)
  3 TIGR03529 GldK_short gliding m 100.0 2.5E-55 5.3E-60  425.7  21.1  268   11-345    50-340 (344)
  4 TIGR03525 GldK gliding motilit 100.0 1.5E-53 3.2E-58  417.9  24.1  325    9-346    35-446 (449)
  5 PF03781 FGE-sulfatase:  Sulfat 100.0 1.5E-51 3.2E-56  389.1  12.2  246   12-341     2-260 (260)
  6 TIGR03440 unchr_TIGR03440 cons 100.0 1.5E-48 3.2E-53  386.9  20.4  235   11-340   166-406 (406)
  7 TIGR02171 Fb_sc_TIGR02171 Fibr 100.0 1.5E-48 3.2E-53  404.0  20.2  243   11-345    35-285 (912)
  8 COG1262 Uncharacterized conser 100.0 1.7E-46 3.7E-51  360.0  15.9  245    9-343    47-313 (314)
  9 TIGR03525 GldK gliding motilit  99.1 4.1E-11 8.9E-16  118.9   3.7   74   76-161   232-324 (449)
 10 TIGR02171 Fb_sc_TIGR02171 Fibr  99.0 1.4E-09 3.1E-14  114.7   8.6   43  348-393   238-283 (912)
 11 TIGR03529 GldK_short gliding m  98.7 1.1E-08 2.4E-13  100.0   3.3   48  114-161   155-219 (344)
 12 PF03781 FGE-sulfatase:  Sulfat  98.4 1.2E-07 2.5E-12   89.5   2.9   52  113-164    87-144 (260)
 13 PHA00653 mtd major tropism det  98.3 1.7E-06 3.7E-11   80.9   6.8  137  182-343   235-377 (381)
 14 TIGR03440 unchr_TIGR03440 cons  98.1 1.1E-06 2.4E-11   88.0   2.7   38  114-151   268-305 (406)
 15 COG1262 Uncharacterized conser  98.0 3.7E-06   8E-11   81.3   3.8   46  114-159   133-183 (314)
 16 TIGR03524 GldJ gliding motilit  97.3 0.00015 3.3E-09   73.6   2.9   46  297-345   503-548 (559)
 17 TIGR03530 GldJ_short gliding m  97.2 0.00023 5.1E-09   70.7   3.2   43  298-343   356-398 (402)
 18 PF07603 DUF1566:  Protein of u  75.5     2.6 5.6E-05   34.5   2.8   30  180-209    28-63  (124)
 19 PHA02673 ORF109 EEV glycoprote  56.9     8.8 0.00019   33.0   2.4   20  183-202    93-112 (161)
 20 PF00193 Xlink:  Extracellular   44.9      25 0.00053   27.6   3.0   39  182-220    13-51  (92)
 21 PHA00653 mtd major tropism det  41.1      15 0.00033   35.3   1.5   39  349-393   336-377 (381)
 22 cd03518 Link_domain_HAPLN_modu  38.7      56  0.0012   25.8   4.1   40  181-220    12-51  (95)
 23 cd03601 CLECT_TC14_like C-type  34.8      23 0.00051   28.6   1.6   19  182-200     9-27  (119)
 24 cd03520 Link_domain_CSPGs_modu  34.1      72  0.0016   25.3   4.1   39  182-220    10-48  (96)
 25 cd03515 Link_domain_TSG_6_like  33.3      86  0.0019   24.7   4.4   39  182-220    13-51  (93)
 26 cd03516 Link_domain_CD44_like   32.8      69  0.0015   27.3   4.0   39  182-220    18-56  (144)
 27 cd03592 CLECT_selectins_like C  31.7      32 0.00069   27.4   1.9   28  182-209     9-39  (115)
 28 smart00445 LINK Link (Hyaluron  31.1      74  0.0016   25.1   3.7   40  181-220    13-52  (94)
 29 cd03599 CLECT_DGCR2_like C-typ  31.1      35 0.00076   29.4   2.1   20  182-201    21-40  (153)
 30 PF05966 Chordopox_A33R:  Chord  30.2      26 0.00057   31.2   1.1   22  183-204   122-143 (190)
 31 TIGR02145 Fib_succ_major Fibro  29.4      21 0.00046   31.4   0.5   26  184-210    51-77  (171)
 32 PF00059 Lectin_C:  Lectin C-ty  29.2      22 0.00048   27.1   0.5   29  183-211     3-34  (105)
 33 PF09603 Fib_succ_major:  Fibro  28.9      32  0.0007   30.1   1.5   16  196-211    73-88  (184)
 34 cd01102 Link_Domain The link d  28.8      85  0.0018   24.6   3.7   40  181-220    12-51  (92)
 35 PHA02953 IEV and EEV membrane   28.2      46   0.001   29.2   2.4   22  182-203    65-86  (170)
 36 PF07979 Intimin_C:  Intimin C-  27.0      24 0.00051   28.2   0.3   23  181-203    11-33  (101)
 37 PHA03093 EEV glycoprotein; Pro  26.8      55  0.0012   29.0   2.5   20  183-202   118-137 (185)
 38 cd03595 CLECT_chondrolectin_li  25.6      46   0.001   28.2   1.9   21  181-201    23-43  (149)
 39 cd03517 Link_domain_CSPGs_modu  25.0 1.2E+02  0.0026   24.0   3.9   40  182-221    13-52  (95)
 40 cd03600 CLECT_thrombomodulin_l  23.9      48   0.001   27.6   1.6   27  182-208    13-42  (141)
 41 cd03591 CLECT_collectin_like C  23.7      45 0.00097   26.5   1.4   27  181-207     9-38  (114)
 42 cd03588 CLECT_CSPGs C-type lec  23.6      49  0.0011   26.8   1.6   28  182-209    19-49  (124)
 43 cd03603 CLECT_VCBS A bacterial  23.2      56  0.0012   26.3   1.9   30  181-210     8-40  (118)
 44 cd03596 CLECT_tetranectin_like  22.4      58  0.0013   26.5   1.8   28  182-209    18-48  (129)
 45 cd00037 CLECT C-type lectin (C  20.6   1E+02  0.0022   23.3   2.9   30  182-211     9-41  (116)

No 1  
>TIGR03524 GldJ gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. There is a GldJ homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=100.00  E-value=8.6e-66  Score=508.10  Aligned_cols=360  Identities=28%  Similarity=0.392  Sum_probs=240.5

Q ss_pred             CCCCCcEEeCCeeeEccCCCCCCCCCCCCCceEEEccceEeeecccCHHHHHHHHHHcCCcch---hh-------hcCCc
Q psy8678          10 ERYKDMVLLPGDTFRMGTNKPILIKDGEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYVTE---AE-------KFGDT   79 (394)
Q Consensus        10 ~~~~~mv~IpgG~f~mG~~~~~~~~~~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~~~---~~-------~~~~~   79 (394)
                      +..++||.||||+|.||+.+++++.+++.++|+|+|++|+|++|||||+||++||+++++...   ++       .++++
T Consensus        56 ~~~p~MV~IPGG~F~MGs~~dd~~~d~en~phqV~V~~F~IdktpVTNaEYlaFVeat~~~~~~~ea~~~~i~~~~lPdt  135 (559)
T TIGR03524        56 EAPPGLVFVEGGTFTMGQVQDDVMHDWNNTPTQQHVQSFYMDETEVTNSMYLEYLQYLKDVFPPSEANYKNIYSGALPDT  135 (559)
T ss_pred             CCCCceEEECCcEEEeCCCCCccccccCCCceEEEECCeEEECceecHHHHHHHHHHhccCCCccccccccceecccCCc
Confidence            457899999999999999888888889999999999999999999999999999999975432   11       34566


Q ss_pred             ccccCCccHHHHHhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCcc------
Q psy8678          80 FVFEPLLSEEERAKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLE------  153 (394)
Q Consensus        80 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~------  153 (394)
                      .+|+..++.+......+++|+.            +.+||||||||+||.+||.|+++++++.++.+.+......      
T Consensus       136 ~vWr~~~~~ne~~~~~y~rhP~------------y~~yPVVgVSW~qA~AYc~Wrt~~~ne~~~~~~g~~~~~~~~~~~~  203 (559)
T TIGR03524       136 LVWRNRLGNNETMTENYLRHPA------------YADYPVVGVSWIQAVEFSKWRTNRVNEKVLEDKGNIKKGAKIDVTA  203 (559)
T ss_pred             eeeecccCcccccccccccCCc------------ccCCCEeeeCHHHHHHHHHHHHHHHHHHHHhhcccccccccccccc
Confidence            6666655544444344443332            5689999999999999999999999998776433221000      


Q ss_pred             CccCCCCCccc-c----CCcCCccccc---cCCC---cccCCHHHHHHHHHH------cCCCCCCHHHHHHHHhcCCCC-
Q psy8678         154 NRLFPWGSWLH-P----EGIDSTIEHR---MNHP---VVHVSWNDAVAYCTW------RGARLPTEAEWEYGCRGGLEN-  215 (394)
Q Consensus       154 ~~~~~~~~w~~-~----~~~~~~~~~~---~~~P---v~~Vsw~dA~~yc~w------lg~RLPTEaEWEyAArg~~~~-  215 (394)
                      ...|+...++. |    .|......+.   ...+   ...+.=..+..||+.      .+|||||||||||||||+... 
T Consensus       204 ~~~f~t~~y~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~YRLPTEAEWEYAARGG~~~~  283 (559)
T TIGR03524       204 SSFFDTDVYLVDPSKTYGGDTTVYKRGIGRTRRKKGEARPAVPEEKDAYQQRKDGIITQRYRLPTEAEWEYAAKANVGNR  283 (559)
T ss_pred             ccccchhhhhcCccccccccchhhcccccccccccccccccccchhhhhhhhcccccccCCCCCCHHHHHHHHhCCCCCc
Confidence            01122111111 1    0111000000   0000   112333456677776      368999999999999998654 


Q ss_pred             --------ccccCCCCCCC------CCccccccccCCCCCC---CCCCCCCCccccCcCCCCCcccccccccCHHHhccc
Q psy8678         216 --------RLFPWGNNLTP------RGEHRANVWQGEFPTN---NTAADGYLSTAPVMSYKENKFGLYNMVGNVWEWTAD  278 (394)
Q Consensus       216 --------~~ypwg~~~~~------~~~~~an~~~~~~~~~---~~~~~g~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d  278 (394)
                              ..||||+...+      .++.+||+|++.++..   ....||+..|+|||+|+||+||||||+|||||||.|
T Consensus       284 ~~n~~~~~~~YPWG~~~~~~~~~~~~g~~~ANf~~g~g~y~~~ag~~~DG~~~TapVgsf~pN~~GLYDM~GNV~EW~~D  363 (559)
T TIGR03524       284 EYNNYRGRKKYPWNGKYTRSKNRRNRGDQLANFKQGKGDYGGIAGWSDDGADITNEIKSYPPNDFGLYDMAGNVAEWVAD  363 (559)
T ss_pred             cccccccccccCCCCccCccccccccccceeeeccccCCccccccccccCCcccccccccCCCCCceeecCCchhhhccc
Confidence                    57999987643      4567899999987654   345788899999999999999999999999999999


Q ss_pred             cccCCCCCCCCCCCCCCCCCCCeEEeCccccCCcc---ccccc--------------eeccccCC---------------
Q psy8678         279 WWNVHHHPAPSYNPKGPTTGTDKVKKGGSYLCNEQ---YCYRH--------------RCAARSQN---------------  326 (394)
Q Consensus       279 ~y~~~~~~~~~~~~~~~~~g~~~v~rGGs~~~~~~---~~~~~--------------r~~~r~~~---------------  326 (394)
                      +|++....+.         ......|||+|.....   ....+              +..+|...               
T Consensus       364 ~y~p~~~~~~---------~d~~~~rG~~~~~~~~~~~g~~~~~~~~~~~ydtl~ng~~~~~~~pg~~~~~~~~~~~~~~  434 (559)
T TIGR03524       364 VYRPIIDNEA---------NDFNYYRGNLYTKNMIDSDGNVVFAGTQEIEYDTLPNGKVVARYLPGEIAQVPVDKNETYL  434 (559)
T ss_pred             ccccccccCc---------ccceeEecCcccccccCCCCceEEecccceeeeeccCCceeeccCCCceeeeecCccchhh
Confidence            9998654321         1234455555542100   00000              00000000               


Q ss_pred             -CCCCCCCCCCcc----------------------------------------cccc-------CCCCCCCcccccCCcc
Q psy8678         327 -TPDSSAGNLGFR----------------------------------------CAAD-------KGPTTGTDKVKKGGSY  358 (394)
Q Consensus       327 -~~~~~~~~iGFR----------------------------------------~v~~-------~~~~~~~~~~~~ggsw  358 (394)
                       .-....+++.||                                        +++.       ..+..+..||+|||||
T Consensus       435 r~~~~~~d~~~~~dgd~~ss~~~~~~~~~~~~~~~~~my~~p~~~~~~~~~g~~~~~~d~~~~rttli~d~~RVlRGGSW  514 (559)
T TIGR03524       435 RTNFSKSDNANIRDGDKQSSRYYEFGDDEDEIARRPSMYNSPKSPIEIDPVGGMIVLYDDDKKRTTLIDDRVRVYKGGSW  514 (559)
T ss_pred             hhcccccccccccccccccchhhccccccccccccchhccCcccccccccccceeeecccccCceeeecCCeEEeecCCc
Confidence             000011122222                                        1211       1234477899999999


Q ss_pred             ccCCCccccccccccCCCCCCCCCCCcceEEEeec
Q psy8678         359 LCNEQYCYRHRCAARSQNTPDSSAGNLGFRCAADV  393 (394)
Q Consensus       359 ~~~~~~~~~~~~~~r~~~~~~~~~~~~gfr~~~~~  393 (394)
                      .+   .+||+|||+|++..|+.+..+|||||||++
T Consensus       515 ~~---~~~~~r~A~R~~~~~~~~~~~iGFR~a~~~  546 (559)
T TIGR03524       515 RD---REYWLDPAQRRYLPQYMATDYIGFRCAMSR  546 (559)
T ss_pred             CC---CccccchhhccCCCccccccceeEEEEecc
Confidence            96   455669999999999999999999999975


No 2  
>TIGR03530 GldJ_short gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. This model represents the GldJ homolog in Cytophaga hutchinsonii and several other species which is of shorter architecture than that found in Flavobacterium johnsoniae and is represented by a separate model (TIGR03524). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=100.00  E-value=3.8e-61  Score=469.26  Aligned_cols=307  Identities=26%  Similarity=0.420  Sum_probs=209.7

Q ss_pred             CCCCCcEEeCCeeeEccCCCCCCCCCCCCCceEEEccceEeeecccCHHHHHHHHHHcCCcchhhh----cCCcccccCC
Q psy8678          10 ERYKDMVLLPGDTFRMGTNKPILIKDGEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYVTEAEK----FGDTFVFEPL   85 (394)
Q Consensus        10 ~~~~~mv~IpgG~f~mG~~~~~~~~~~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~~~~~~----~~~~~~~~~~   85 (394)
                      +..++||.||||+|+||+++++.+.+.+.|+|+|+|++|+|++|||||+||++||+++++...++.    .++..+|...
T Consensus        61 ~~gp~MV~IPgG~F~MGs~~~d~~~~~d~p~h~V~v~~F~Id~~eVTn~qy~~Fv~a~~~~~~ae~~~~~~p~~~~w~~~  140 (402)
T TIGR03530        61 IEGPNLKFIEGGRAVLGSFEEDLMAFGDNLERTVTIANFYMDETEIANIDWLEFLFNMKKDSSADFIEKAEPDEDVWAGE  140 (402)
T ss_pred             CCCCCeEEECCcEEEeCCCcccccccccCCceEEEECCEEEEcccccHHHHHHHHHHhCCCcchhhccccCCCccccccc
Confidence            446899999999999999877666666788999999999999999999999999999988765542    2333333322


Q ss_pred             ccHHHHHhhhhhccccccccCCcccc--cccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCcc
Q psy8678          86 LSEEERAKISQVRHDMKRFEGLDSTI--EHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWL  163 (394)
Q Consensus        86 ~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~  163 (394)
                      +..+....              ..++  ..+.+||||+|||+||++||+|+++++++.++.+    .|+. ..|      
T Consensus       141 ~~~~~~~~--------------~~~~r~pg~~~~PVV~VSW~DA~aYc~Wls~~~~~~~~~~----~g~~-~~~------  195 (402)
T TIGR03530       141 LAFNDLYQ--------------DHYFRFPGFNFFPVAGVNWIQANAYCIWRTEVVNELLAEE----AGID-SPE------  195 (402)
T ss_pred             cccccccc--------------cccccCCCccCCCeeecCHHHHHHHHHHhhhhchhhhhhh----cccc-ccc------
Confidence            21110000              1111  2345799999999999999999999887765432    1211 000      


Q ss_pred             ccCCcCCccccccCCCcccCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCC----------CccccCCCCCCC-------
Q psy8678         164 HPEGIDSTIEHRMNHPVVHVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLE----------NRLFPWGNNLTP-------  226 (394)
Q Consensus       164 ~~~~~~~~~~~~~~~Pv~~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~----------~~~ypwg~~~~~-------  226 (394)
                            ++    .+     +...+.+..   ..|||||||||||||||+..          .+.||||+....       
T Consensus       196 ------~~----g~-----~~~~~g~~~---~~yRLPTEAEWEYAARgg~~~~~~~~~~~~~~~ypWg~~~~~~~~~~~~  257 (402)
T TIGR03530       196 ------GG----GQ-----IPIERGVAL---ADFRLPNEAEWEYAAKALIGNQWLDENQEHGRIYPWDGHALRNPYNVKR  257 (402)
T ss_pred             ------cc----cc-----ccccccccc---ccCcCCCHHHHHHHHhcCCCCcccccccccccccCCCCccccCcccccc
Confidence                  00    00     011111111   26899999999999999753          457999977431       


Q ss_pred             ----CCccccccccCCCCC----CCCCCCCCCccccCcCCCCCcccccccccCHHHhccccccCCCCC-CCCCCCCCCCC
Q psy8678         227 ----RGEHRANVWQGEFPT----NNTAADGYLSTAPVMSYKENKFGLYNMVGNVWEWTADWWNVHHHP-APSYNPKGPTT  297 (394)
Q Consensus       227 ----~~~~~an~~~~~~~~----~~~~~~g~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d~y~~~~~~-~~~~~~~~~~~  297 (394)
                          .+..++|++++.++.    .+...+++..|+||++++||+||||||+|||||||+|+|++.... .+..++..   
T Consensus       258 ~g~~~~~~~AN~~~g~g~y~~~~~~~~~dg~~~t~pvg~~~~N~~Glydm~GNv~EW~~D~~~~~~y~~~~~~n~~~---  334 (402)
T TIGR03530       258 KGKQMGDFLANFKRGRGDYAGIAGNKLNDGAIIPTNIYDFAPNDFGLYCMAGNMNEWVYDVYRPLSFQDFDDLNPLR---  334 (402)
T ss_pred             ccccccccceeeccCcCCcccccCCccccCCcccccccccCCCCCceeecCCCHHHhhcccccccccccccccCCCC---
Confidence                235689999887653    235678888999999999999999999999999999999874222 12222210   


Q ss_pred             CCCeEEeCccccCCccccccceeccccCCCCCCCCCCCCccccccCCCCCCCcccccCCccccCCCccccccccccCCCC
Q psy8678         298 GTDKVKKGGSYLCNEQYCYRHRCAARSQNTPDSSAGNLGFRCAADKGPTTGTDKVKKGGSYLCNEQYCYRHRCAARSQNT  377 (394)
Q Consensus       298 g~~~v~rGGs~~~~~~~~~~~r~~~r~~~~~~~~~~~iGFR~v~~~~~~~~~~~~~~ggsw~~~~~~~~~~~~~~r~~~~  377 (394)
                      .      .|+- ....                 ..+..+|     ..+.....||+|||||.+.+.+|   |++.|.+..
T Consensus       335 ~------~g~~-~~~~-----------------~~~~~~~-----~~~~~~~~RVlRGGSW~~~~~~~---Rsa~R~~~~  382 (402)
T TIGR03530       335 K------DGFF-DEEE-----------------GYDKAGF-----QSLLDDEFRVYKGGSWKDVAYWL---SPGTRRFLA  382 (402)
T ss_pred             C------CCcc-cccc-----------------ccccccc-----cccCCCceEEEeCCCCCCcccce---eeeecCCCC
Confidence            0      0000 0000                 0000111     11223456999999999866666   999999999


Q ss_pred             CCCCCCCcceEEEeecC
Q psy8678         378 PDSSAGNLGFRCAADVS  394 (394)
Q Consensus       378 ~~~~~~~~gfr~~~~~~  394 (394)
                      |+.+..+|||||||+.+
T Consensus       383 p~~~~~~iGFR~a~~~~  399 (402)
T TIGR03530       383 EDSATAAIGFRCAMIQA  399 (402)
T ss_pred             CCccCCceeEEEEEecc
Confidence            99999999999999863


No 3  
>TIGR03529 GldK_short gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. This model represents a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture than that found in Flavobacterium johnsoniae and related species (represented by TIGR03525). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=100.00  E-value=2.5e-55  Score=425.68  Aligned_cols=268  Identities=32%  Similarity=0.544  Sum_probs=179.2

Q ss_pred             CCCCcEEeCCeeeEccCCCCCCCCCCCCCceEEEccceEeeecccCHHHHHHHHHHcCCc--chhhhcCCcccccCCccH
Q psy8678          11 RYKDMVLLPGDTFRMGTNKPILIKDGEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYV--TEAEKFGDTFVFEPLLSE   88 (394)
Q Consensus        11 ~~~~mv~IpgG~f~mG~~~~~~~~~~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~--~~~~~~~~~~~~~~~~~~   88 (394)
                      ...+||.||+|+|+||+..++...+.++|.|+|+|++|+|++|||||+||++|+++++..  ......+..+......++
T Consensus        50 ~~~~mv~Ip~G~f~mGs~~~~~~~~~e~p~h~V~l~~F~i~~~eVTn~qy~~f~~~~~~~~~~~g~~~p~~~~~~~~~~~  129 (344)
T TIGR03529        50 VPVGMVVIPAGTFHMGQADEDVPATQINLNKQITISEFFMDKTEVTNNKYRQFLEVVLEGQLATGTPLPPEYDMEELYPD  129 (344)
T ss_pred             CCCCeEEECCCEEEcCCCCccCcccccCCcceEEECCeEEeCccccHHHHHHHHHhhccccccccccCCcccccccccCC
Confidence            457899999999999998776666778999999999999999999999999999875311  110011110000000000


Q ss_pred             HHHHhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCccccCCc
Q psy8678          89 EERAKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWLHPEGI  168 (394)
Q Consensus        89 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~~~~~~  168 (394)
                      ..     .|.....+..+ +..++.                                               .+.+|   
T Consensus       130 ~~-----~w~~~~~~~~~-~~~~~~-----------------------------------------------~~~~p---  153 (344)
T TIGR03529       130 TT-----VWSTSFSHHMG-DPLMEY-----------------------------------------------YFDHP---  153 (344)
T ss_pred             cc-----ccccccccccC-cccccc-----------------------------------------------cccCc---
Confidence            00     00000000000 000000                                               00000   


Q ss_pred             CCccccccCCCcccCCHHHHHHHHHHc-----------------CCCCCCHHHHHHHHhcCCCCccccCCCCCC--CCCc
Q psy8678         169 DSTIEHRMNHPVVHVSWNDAVAYCTWR-----------------GARLPTEAEWEYGCRGGLENRLFPWGNNLT--PRGE  229 (394)
Q Consensus       169 ~~~~~~~~~~Pv~~Vsw~dA~~yc~wl-----------------g~RLPTEaEWEyAArg~~~~~~ypwg~~~~--~~~~  229 (394)
                           ..+++||++|||+||++||+|+                 ++|||||+||||||||+...+.||||+...  ...+
T Consensus       154 -----~~~~~PVv~VSW~dA~ayc~Wls~~~~~~~~~~~~~~~~~~RLPTEAEWEyAARgg~~~~~ypwG~~~~~~~~~~  228 (344)
T TIGR03529       154 -----AFDNYPVVGVDWNAAKQFCEWRTYHMNAYRNEESQYDMPRFRLPSEAEWEYAARGGRDMAKYPWGGPYLRNKRGC  228 (344)
T ss_pred             -----cccCCCcccCCHHHHHHHHHHHhhhccccccccccccCCcccCcCHHHHHHHHhCCCCCCcCCCCCccCCCcccc
Confidence                 1135566666666666666665                 489999999999999988888899997643  2334


Q ss_pred             cccccccCCCCCCCCCCCCCCccccCcCCCCCcccccccccCHHHhccccccCCCCCCC-CCCCCC-CCCCCCeEEeCcc
Q psy8678         230 HRANVWQGEFPTNNTAADGYLSTAPVMSYKENKFGLYNMVGNVWEWTADWWNVHHHPAP-SYNPKG-PTTGTDKVKKGGS  307 (394)
Q Consensus       230 ~~an~~~~~~~~~~~~~~g~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d~y~~~~~~~~-~~~~~~-~~~g~~~v~rGGs  307 (394)
                      ..+|+..+.   .+...+++..++||++++||+||||||+|||||||.|+|.+...... ..++.. ...+..+|+||||
T Consensus       229 ~~a~~~~~~---~~~~~~g~~~t~pVgs~~pN~~GLyDM~GNVwEW~~D~y~~~~~~~~~~~~p~~~~~~~~~rVvRGGS  305 (344)
T TIGR03529       229 MLANFKPGR---GNYYDDGFPYTAPVAVYFPNDFGLYDMAGNVAEWVLDAYAATSVPIVWDLNPVYEDPNEVRKIIRGGS  305 (344)
T ss_pred             ccccccccc---CcccccCCcccccccccCCCCCCeeecCCChhhhccccccccccccccccCccccCCCCceeeecCCC
Confidence            556654332   23334566679999999999999999999999999999988643321 123322 1235689999999


Q ss_pred             ccCCccccccceeccccCCCCCCCCCCCCccccccCCC
Q psy8678         308 YLCNEQYCYRHRCAARSQNTPDSSAGNLGFRCAADKGP  345 (394)
Q Consensus       308 ~~~~~~~~~~~r~~~r~~~~~~~~~~~iGFR~v~~~~~  345 (394)
                      |.+.+..|   |++.|....|+.+..+||||||++...
T Consensus       306 w~~~~~~~---r~~~R~~~~p~~~~~~iGFR~v~~~~~  340 (344)
T TIGR03529       306 WKDIAYYL---ETGTRTFEYEDVSQAHIGFRTVMTYLG  340 (344)
T ss_pred             CCCChhhc---cceecCCCCCCCccCCEEEEEEeecCC
Confidence            99888764   899999999999999999999988644


No 4  
>TIGR03525 GldK gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. There is a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=100.00  E-value=1.5e-53  Score=417.92  Aligned_cols=325  Identities=30%  Similarity=0.470  Sum_probs=205.4

Q ss_pred             CCCCCCcEEeCCeeeEccCCCCCCCCCCCCCceEEEccceEeeecccCHHHHHHHHHHcCCcchhhhc-------C----
Q psy8678           9 VERYKDMVLLPGDTFRMGTNKPILIKDGEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYVTEAEKF-------G----   77 (394)
Q Consensus         9 ~~~~~~mv~IpgG~f~mG~~~~~~~~~~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~~~~~~~-------~----   77 (394)
                      .+.+.+||.||+|+|.||...+++.+|+|.|.|+|.|.+|+||++||||+||++|++|.....-..++       |    
T Consensus        35 ~~~p~~mv~IpgG~f~mG~~~~d~a~DnE~P~h~V~V~~F~id~~pVTNaEy~~FVe~vrdsi~r~~~a~~a~~~~~~~~  114 (449)
T TIGR03525        35 PEKPYGMVLVPGGSFIMGKSDEDIAGVMNAPTKTVTVRSFYMDETEITNSEYRQFVEWVRDSIVRTKLAELADLAGIGPG  114 (449)
T ss_pred             CCCCCceEEECCcEEEeCCCCCCcccccCCCceeEEECceEeECccchHHHHHHHHHHHHHHHHHHhhhhhhhhcccCCC
Confidence            45578999999999999998877888999999999999999999999999999999996433211111       0    


Q ss_pred             Cc----ccccCCccHH------HHHhh-hh--------------hcccccc--ccCC-ccccc--c----cCCCc-----
Q psy8678          78 DT----FVFEPLLSEE------ERAKI-SQ--------------VRHDMKR--FEGL-DSTIE--H----RMHHP-----  118 (394)
Q Consensus        78 ~~----~~~~~~~~~~------~~~~~-~~--------------~~~~~~~--~~~~-~~~~~--~----~~~~P-----  118 (394)
                      ++    +.+-..-+++      ....+ +.              +..++..  ..-+ +.+.+  .    ....|     
T Consensus       115 ~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~~~~~~~~~~~~y~~~~~~~~g~~  194 (449)
T TIGR03525       115 DGGGSIQDYAFKDAESDNATPYQKYMYDNYYSLGETDYAGRKLNKKTELIWDTSEYPDEYYVEVMDSLYLPEDESYNGLR  194 (449)
T ss_pred             CCCcccchhcccccccccccchhhhccccccccccccccCccCCcccccccccccCCCHHHHHHhhhcccccccCcCCce
Confidence            00    0000000000      00000 00              0000000  0000 00000  0    00112     


Q ss_pred             -------eeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCccccC--CcC-------CccccccCCCccc
Q psy8678         119 -------VVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWLHPE--GID-------STIEHRMNHPVVH  182 (394)
Q Consensus       119 -------V~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~~~~--~~~-------~~~~~~~~~Pv~~  182 (394)
                             +...+|+|-.+-+  +..    ...++|- +.-+.+++-++-.|...-  +..       -......++||++
T Consensus       195 ~~d~~~~~~~y~~~d~~~a~--~~~----~~~~~f~-~~~~v~~~pdt~vw~~~~~~~~n~~~~~~y~~h~~y~d~PVV~  267 (449)
T TIGR03525       195 TFDVTKLKYRYSWMDIDAAA--RSK----GSRKDFI-KTEEVQVYPDTTVWIKDFNYSYNEPMHNDYFWHQAYDDYPVVG  267 (449)
T ss_pred             ecchhHheeEEEEeeHHHHh--hcc----Cccccce-ecceeeecCCcceEecccccccCchhhhhhccCcccCCCCccC
Confidence                   2333444322211  000    0001100 000112222333343321  100       0012247899999


Q ss_pred             CCHHHHHHHHHHcC-----------------CCCCCHHHHHHHHhcCCCCccccCCCCCCC--CCccccccccCCCCCCC
Q psy8678         183 VSWNDAVAYCTWRG-----------------ARLPTEAEWEYGCRGGLENRLFPWGNNLTP--RGEHRANVWQGEFPTNN  243 (394)
Q Consensus       183 Vsw~dA~~yc~wlg-----------------~RLPTEaEWEyAArg~~~~~~ypwg~~~~~--~~~~~an~~~~~~~~~~  243 (394)
                      |||+||.+||+|++                 +||||||||||||||+...+.||||+....  .++.++|+....   .+
T Consensus       268 VSW~dA~aFC~Wls~~~n~~~~~~g~~t~~~yRLPTEAEWEYAARGG~~~~~YPWG~~~~~~~~~~~~ANf~~~r---G~  344 (449)
T TIGR03525       268 VTWKQARAFCNWRTKYKNDFRKKKGPANVNTFRLPTEAEWEYAARGGLEGATYPWGGPYTKNDRGCFMANFKPVR---GD  344 (449)
T ss_pred             CCHHHHHHHHHHHhccccccccccccccCccccCCCHHHHHHHHhcCCCCCccCCCCCCCccchhhhhhcccccc---CC
Confidence            99999999999996                 499999999999999988888999987543  345567775432   22


Q ss_pred             CCCCCCCccccCcCCCCCcccccccccCHHHhccccccCCC-CCCCCCCCCCCC-CCCCeEEeCccccCCccccccceec
Q psy8678         244 TAADGYLSTAPVMSYKENKFGLYNMVGNVWEWTADWWNVHH-HPAPSYNPKGPT-TGTDKVKKGGSYLCNEQYCYRHRCA  321 (394)
Q Consensus       244 ~~~~g~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d~y~~~~-~~~~~~~~~~~~-~g~~~v~rGGs~~~~~~~~~~~r~~  321 (394)
                      ...++...++||++++||+||||||+|||||||.|+|.+.. ...+..+|.... .+..+|+|||||.+.+..   +|++
T Consensus       345 ~~~dg~~~T~pVgs~~pN~fGLYDMaGNVwEWt~D~Y~~~~y~~~~~~nP~~~~~~~~~RVvRGGSW~d~a~~---lRsa  421 (449)
T TIGR03525       345 YAADEALYTVEAKSYEPNDYGLYNMAGNVSEWTNSSYDPSSYEYMSTMNPNVNDSENTRKVVRGGSWKDVAYF---LQVS  421 (449)
T ss_pred             cccccCcccCCCCCcCCCCcceeccCCChHhhhcccccccccccccccCCCCCCCCCceEEEecCCCCCcccc---eeee
Confidence            33456667899999999999999999999999999998753 333445554432 245799999999998876   4999


Q ss_pred             cccCCCCCCCCCCCCccccccCCCC
Q psy8678         322 ARSQNTPDSSAGNLGFRCAADKGPT  346 (394)
Q Consensus       322 ~r~~~~~~~~~~~iGFR~v~~~~~~  346 (394)
                      .|....++.....||||||++....
T Consensus       422 ~R~~~~pd~~~~~IGFR~Vrd~~~~  446 (449)
T TIGR03525       422 TRDYEYADSARSYIGFRTVQDYLGT  446 (449)
T ss_pred             ecCCcCCCccCCceEEEEEeeccCc
Confidence            9999999999999999999987554


No 5  
>PF03781 FGE-sulfatase:  Sulfatase-modifying factor enzyme 1;  InterPro: IPR005532 This domain is found in eukaryotic proteins required for post-translational sulphatase modification (SUMF1). These proteins are associated with the rare disorder multiple sulphatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, formylglycine-generating enzyme, which is a sulphatase. Sulphatases are enzymes essential for degradation and remodelling of sulphate esters, and formylglycine (FGly), the key catalytic residue in the active site, is unique to sulphatases. FGE is localised to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesised sulphatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilised by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulphide bond is present in the active site of FGE. An oxidised cysteine residue, possibly cysteine sulphenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulphatases. This domain is also found in a few methyltransferases and protein kinases.; PDB: 2Y3C_A 2Q17_B 1Y4J_B 1Y1E_X 2AFT_X 2HIB_X 2AII_X 1Z70_X 1Y1F_X 2HI8_X ....
Probab=100.00  E-value=1.5e-51  Score=389.13  Aligned_cols=246  Identities=44%  Similarity=0.842  Sum_probs=145.1

Q ss_pred             CCCcEEeCCeeeEccCCCCCCCCCCCCCceEEEccceEeeecccCHHHHHHHHHHcCCcchhhhcCCcccccCCccHHHH
Q psy8678          12 YKDMVLLPGDTFRMGTNKPILIKDGEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYVTEAEKFGDTFVFEPLLSEEER   91 (394)
Q Consensus        12 ~~~mv~IpgG~f~mG~~~~~~~~~~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~~~~~~~~~~~~~~~~~~~~~~   91 (394)
                      .++||+||+|+|.||+ ......+++.|.|+|+|++|+|++|||||+||++||+++++.....  ...+.          
T Consensus         2 ~~~~V~Ip~G~f~~G~-~~~~~~~~~~p~~~v~l~~f~i~~~eVT~~~y~~fl~~~~~~~~~~--~~~~~----------   68 (260)
T PF03781_consen    2 APEMVLIPGGTFLMGS-QPDNGWDDENPPHTVTLSPFYIDKYEVTNAQYRAFLNDGGYQTSAT--PEFWS----------   68 (260)
T ss_dssp             -TTEEEE--EEEEES--S-STGGGTTBSSEEEEE-SEEEESS--BHHHHHHHHHHHT---HSS--CTTEE----------
T ss_pred             CCceEEECCEEEEeCC-CCCCCCcCCCCceEEEECCEEEECEEeCHHHHHHhhhhcccccccc--cceee----------
Confidence            4789999999999999 4444567899999999999999999999999999999987764310  00000          


Q ss_pred             HhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCccccCCcCCc
Q psy8678          92 AKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWLHPEGIDST  171 (394)
Q Consensus        92 ~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~~~~~~~~~  171 (394)
                                                ++                  ..                    ..+..+......
T Consensus        69 --------------------------~~------------------~~--------------------~~~~~~~~~~~~   84 (260)
T PF03781_consen   69 --------------------------PA------------------SG--------------------ANWRNPSGRYEP   84 (260)
T ss_dssp             --------------------------EE------------------ET---------------------BTTBTTSTT-T
T ss_pred             --------------------------cc------------------CC--------------------cccccccccccc
Confidence                                      00                  00                    000000000011


Q ss_pred             cccccCCCcccCCHHHHHHHHHHcCC------CCCCHHHHHHHHhcCCCCccccCCCCCCCCCccccccccCC---CCCC
Q psy8678         172 IEHRMNHPVVHVSWNDAVAYCTWRGA------RLPTEAEWEYGCRGGLENRLFPWGNNLTPRGEHRANVWQGE---FPTN  242 (394)
Q Consensus       172 ~~~~~~~Pv~~Vsw~dA~~yc~wlg~------RLPTEaEWEyAArg~~~~~~ypwg~~~~~~~~~~an~~~~~---~~~~  242 (394)
                      .....++||++|||+||.+||+|++.      |||||+|||||||++...+.||||+...+..    +.+.+.   ....
T Consensus        85 ~~~~~~~Pv~~Vsw~~A~ayc~wl~~~~g~~yRLPteaEWe~Aar~g~~~~~~~~g~~~~~~~----~~~~g~~~~~~~~  160 (260)
T PF03781_consen   85 KPGPDNHPVVGVSWYDAQAYCNWLGKRTGEGYRLPTEAEWEYAARGGPDGRPYPWGDEFDPDA----NNWAGSLADYNNA  160 (260)
T ss_dssp             STTGTTSB--S--HHHHHHHHHHCTHHTTSS-B---HHHHHHHHHTTSSSSSBTTBSSSSGGG----B---S-HHSTTTE
T ss_pred             ccCCcchhcceeeHHHHHHHHHHhcccccccccCCCHHHHHHHhcccccccccccCCCCCccc----ccccccccccccc
Confidence            12357899999999999999999998      9999999999999988888999998866422    111110   0000


Q ss_pred             CCC-CCCCCccccCcCCCCCcccccccccCHHHhccccccCCCCCCCCCCCCCCCCCCCeEEeCccccCC--ccccccce
Q psy8678         243 NTA-ADGYLSTAPVMSYKENKFGLYNMVGNVWEWTADWWNVHHHPAPSYNPKGPTTGTDKVKKGGSYLCN--EQYCYRHR  319 (394)
Q Consensus       243 ~~~-~~g~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d~y~~~~~~~~~~~~~~~~~g~~~v~rGGs~~~~--~~~~~~~r  319 (394)
                      ... ......+.||+++++|+||||||+|||||||.|+|..+..............+..+|+|||||.+.  ...   .|
T Consensus       161 ~~~~~~~~~~~~pvg~~~~n~~Gl~Dm~GNV~EW~~d~~~~~~~~~~~~~~~~~~~~~~~v~rGGs~~~~~~~~~---~r  237 (260)
T PF03781_consen  161 YSNADSRSGETAPVGSFPPNPFGLYDMAGNVWEWTADWYSGYPPDPPDDNPNDDSDGGYRVVRGGSWASDPMPDS---CR  237 (260)
T ss_dssp             E-TTTS-SS----TTSS-BSTTS-SSSSSSSEEEEEEE-SCHCCCS-CCS----SGGSSEEEES--TTSBTTCCC---CS
T ss_pred             ccccCCCCccceeeeeccccccCcCCCCCCchheecccccCccCCccccccccccCCceEeeeCCccCCCcchhe---Ee
Confidence            011 112235899999999999999999999999999998322222333334445578999999999997  444   47


Q ss_pred             eccc-cCCCCCCCCCCCCccccc
Q psy8678         320 CAAR-SQNTPDSSAGNLGFRCAA  341 (394)
Q Consensus       320 ~~~r-~~~~~~~~~~~iGFR~v~  341 (394)
                      +..| ....+..+...+||||||
T Consensus       238 ~~~r~~~~~~~~~~~~vGFR~vR  260 (260)
T PF03781_consen  238 CAYRGSFYPPDQRSPNVGFRCVR  260 (260)
T ss_dssp             TT-EEEEE-TT--BTTEEE--EE
T ss_pred             eeeccCcCCCCCcCCCEEEEEEC
Confidence            8888 444788999999999996


No 6  
>TIGR03440 unchr_TIGR03440 conserved hypothetical protein TIGR03440. The model TIGR03438 describes a family of uncharacterized putative methyltransferases in bacteria. The family described here is a set of proteins also restricted to bacteria, and located close to the member of TIGR03438.
Probab=100.00  E-value=1.5e-48  Score=386.91  Aligned_cols=235  Identities=31%  Similarity=0.483  Sum_probs=173.8

Q ss_pred             CCCCcEEeCCeeeEccCCCCCCCCCCCCCceEEEccceEeeecccCHHHHHHHHHHcCCcchhhhcCCcccccCCccHHH
Q psy8678          11 RYKDMVLLPGDTFRMGTNKPILIKDGEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYVTEAEKFGDTFVFEPLLSEEE   90 (394)
Q Consensus        11 ~~~~mv~IpgG~f~mG~~~~~~~~~~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~~~~~~~~~~~~~~~~~~~~~   90 (394)
                      ...+||.||||+|.||++.+.+.+|+|.|.|+|.|++|+|+++||||+||++|++++||+.++-...+++          
T Consensus       166 ~~~~~v~ip~G~f~mG~~~~~f~~DnE~P~h~V~l~~F~i~~~~VTn~ey~~Fv~~gGy~~~~~w~~~gw----------  235 (406)
T TIGR03440       166 PPLRWVAFPGGEFEIGSDADGFAFDNERPRHRVLVPPFEIDARPVTNGEYLEFIEDGGYRRPELWLSDGW----------  235 (406)
T ss_pred             CCCCeEEECCeEEEeCCCCCCCcccCCCCceeEEeCCeEEECccCcHHHHHHHHHhcCCCCcccccccch----------
Confidence            3569999999999999987778899999999999999999999999999999999999986532111111          


Q ss_pred             HHhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCccc--cCCc
Q psy8678          91 RAKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWLH--PEGI  168 (394)
Q Consensus        91 ~~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~~--~~~~  168 (394)
                       ..+..                .....|..            |..    ..                  ..|..  ..+.
T Consensus       236 -~~~~~----------------~~~~~P~~------------w~~----~~------------------~~w~~~~~~g~  264 (406)
T TIGR03440       236 -AWVQA----------------EGWQAPLY------------WRR----DD------------------GTWWVFTLGGL  264 (406)
T ss_pred             -hhhhh----------------hcccCCcc------------ccc----cC------------------CcceeeccCCC
Confidence             11110                00011111            100    00                  01110  0000


Q ss_pred             CCccccccCCCcccCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccCCCCCCCCCccccccccCCCCCCCCCCCC
Q psy8678         169 DSTIEHRMNHPVVHVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGNNLTPRGEHRANVWQGEFPTNNTAADG  248 (394)
Q Consensus       169 ~~~~~~~~~~Pv~~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypwg~~~~~~~~~~an~~~~~~~~~~~~~~g  248 (394)
                         .....++||++|||+||.+||+|+|+|||||+||||||+++...     +           |+       .+     
T Consensus       265 ---~p~~~~~PV~~VS~~eA~Ay~~W~g~RLPTEaEWE~AAr~g~~~-----~-----------~~-------~~-----  313 (406)
T TIGR03440       265 ---RPLDPDAPVCHVSYYEADAYARWAGARLPTEAEWEKAARWGDAP-----P-----------NF-------AE-----  313 (406)
T ss_pred             ---CCCCCCCCccCCCHHHHHHHHHHhCCCCCCHHHHHHHHhcCCCC-----C-----------Cc-------cc-----
Confidence               01236899999999999999999999999999999999964321     0           11       00     


Q ss_pred             CCccccCcCCCCCcccccccccCHHHhccccccCCCCCCC----CCCCCCCCCCCCeEEeCccccCCccccccceecccc
Q psy8678         249 YLSTAPVMSYKENKFGLYNMVGNVWEWTADWWNVHHHPAP----SYNPKGPTTGTDKVKKGGSYLCNEQYCYRHRCAARS  324 (394)
Q Consensus       249 ~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d~y~~~~~~~~----~~~~~~~~~g~~~v~rGGs~~~~~~~~~~~r~~~r~  324 (394)
                      ....+||+++++|++|||||.|||||||+|+|.++....+    ..++.+++.++++|+|||||.+.+..   .|++.|+
T Consensus       314 ~~~~~PV~~~~~~~~Gl~dm~GNVWEWt~d~y~pypgf~~~~g~~~ey~~~~~~~~~VlRGGSw~t~~~~---~R~~~Rn  390 (406)
T TIGR03440       314 ANLGAPVGAYPAGAQGLGQLFGDVWEWTASPYEPYPGFRPPPGAYGEYNGKFMDGQMVLRGGSCATPPRH---LRPSYRN  390 (406)
T ss_pred             cCCCCcCCCcCCCCcccccCcCCeeeecccCCCCCCCCCCCCCccccCCCCcCCCeeEeeCCCCCCCCcc---cCccccC
Confidence            0134899999999999999999999999999998765432    23344555678999999999998876   4999999


Q ss_pred             CCCCCCCCCCCCcccc
Q psy8678         325 QNTPDSSAGNLGFRCA  340 (394)
Q Consensus       325 ~~~~~~~~~~iGFR~v  340 (394)
                      ...|+.+...+|||||
T Consensus       391 ~~~p~~r~~~~GFR~A  406 (406)
T TIGR03440       391 FFYPHRRWQFSGFRLA  406 (406)
T ss_pred             CCCCCccCCCceeeeC
Confidence            9999999999999997


No 7  
>TIGR02171 Fb_sc_TIGR02171 Fibrobacter succinogenes paralogous family TIGR02171. This model describes a paralogous family of the rumen bacterium Fibrobacter succinogenes. Eleven members are found in Fibrobacter succinogenes S85, averaging over 900 amino acids in length. More than half are predicted lipoproteins. The function is unknown.
Probab=100.00  E-value=1.5e-48  Score=403.99  Aligned_cols=243  Identities=26%  Similarity=0.410  Sum_probs=178.6

Q ss_pred             CCCCcEEeCCe-eeEccCCCCCCCCCCCCCceEEEcc-ceEeeecccCHHHHHHHHHHcCCcchhhhcCCcccccCCccH
Q psy8678          11 RYKDMVLLPGD-TFRMGTNKPILIKDGEFPSRNVTLD-AFYLDQHEVSNTQFQEFVSATGYVTEAEKFGDTFVFEPLLSE   88 (394)
Q Consensus        11 ~~~~mv~IpgG-~f~mG~~~~~~~~~~~~p~~~v~v~-~F~i~~~EVTn~~y~~fl~~~g~~~~~~~~~~~~~~~~~~~~   88 (394)
                      ..++||+|||| +|+||+++.. ..++|.|.|+|+|. +|||++|||||+||++|+.+.+.....               
T Consensus        35 ~~~gMV~IpgG~~F~MGsd~~~-a~ddE~P~H~VtL~~~FyI~k~EVTnaqF~aFv~a~~g~~~~---------------   98 (912)
T TIGR02171        35 SVDGFVYVKGKKSTTLGTDDIS-AKSNESPKMTVQLTYDFYIGRHEVTCGEFNDLMKGETGFKVP---------------   98 (912)
T ss_pred             CcCCeEEeCCCCeEEcCCCCCc-cCccCCCceEEEecCCeEEECeeecHHHHHHHHhcCCCCCCC---------------
Confidence            35789999999 9999997643 56789999999997 999999999999999999875211000               


Q ss_pred             HHHHhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCccccCCc
Q psy8678          89 EERAKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWLHPEGI  168 (394)
Q Consensus        89 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~~~~~~  168 (394)
                                             ....++||++|||+||++||+|++.+......|.+           ..++| ++.++
T Consensus        99 -----------------------~~~~d~PV~~VSW~DA~aYcnwLSkktGl~p~Y~~-----------tga~~-~p~g~  143 (912)
T TIGR02171        99 -----------------------CKEDKLPATNVTFYDAVLYANALSKSEGLDTVYTY-----------TSANF-DASGH  143 (912)
T ss_pred             -----------------------cCCCCCCccCCCHHHHHHHHHhhhhhcCCCceeec-----------ccccc-ccccc
Confidence                                   01247899999999999999999764333222222           12445 56666


Q ss_pred             CCccccccCCCcccCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccCCCCCCCCCccccccccCCCCCCCCCCCC
Q psy8678         169 DSTIEHRMNHPVVHVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGNNLTPRGEHRANVWQGEFPTNNTAADG  248 (394)
Q Consensus       169 ~~~~~~~~~~Pv~~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypwg~~~~~~~~~~an~~~~~~~~~~~~~~g  248 (394)
                      ...+.+...+|+.             .||||||||||||||||+... .++|+..                   +.    
T Consensus       144 c~~l~g~~~~p~~-------------~GYRLPTEAEWEYAARGG~~~-~~~W~~~-------------------nS----  186 (912)
T TIGR02171       144 CVNLEGLAFHPEV-------------KGYRLPTEAEWIYVASQSWDP-EKSWNSD-------------------NS----  186 (912)
T ss_pred             ccccccccccccc-------------ccccCCCHHHHHHHHhcCCCC-ccccccc-------------------cc----
Confidence            6555555555554             399999999999999986432 2344321                   00    


Q ss_pred             CCccccCcCCCCCcccccccccCHHHhccccccCCCCCCCCCCCCCCCC---CCCeEEeCccccCCccccccceeccccC
Q psy8678         249 YLSTAPVMSYKENKFGLYNMVGNVWEWTADWWNVHHHPAPSYNPKGPTT---GTDKVKKGGSYLCNEQYCYRHRCAARSQ  325 (394)
Q Consensus       249 ~~~~~pVg~~~~n~~GlyDM~GNV~EW~~d~y~~~~~~~~~~~~~~~~~---g~~~v~rGGs~~~~~~~~~~~r~~~r~~  325 (394)
                      ...++||+++++|+||||||+|||||||.|+|.++. ..+..++.+...   ...+|+|||||.+.+..|   |++.|..
T Consensus       187 ~~~t~PVGsfppN~fGLYDM~GNVWEWc~DwY~~y~-~~~~~np~G~~d~~~~~~RVlRGGSW~s~p~~c---R~a~R~~  262 (912)
T TIGR02171       187 SSEAHEVCTSPDNPGNVCDMAGNVLEWVNDWLASFK-DTTLTNYVGSSDPGSLGERVVKGGSYRNSPSAI---NLYTRGD  262 (912)
T ss_pred             CCccccccccCCCCcCccccCCChHHhhcccccccc-cccccCCCCCCCCCCCceEEEecCCCCCChhhc---ceeeccc
Confidence            013689999999999999999999999999998743 233345544332   357899999999998876   7777765


Q ss_pred             CC---CCCCCCCCCccccccCCC
Q psy8678         326 NT---PDSSAGNLGFRCAADKGP  345 (394)
Q Consensus       326 ~~---~~~~~~~iGFR~v~~~~~  345 (394)
                      ..   ++.+..++|||||++..|
T Consensus       263 ~~p~~pdsr~~~iGFRLAr~~iP  285 (912)
T TIGR02171       263 VYPVTSSTKGDYVGFRLALGAIP  285 (912)
T ss_pred             cCCCCcccccCceEEEEEEecCC
Confidence            43   456778999999998744


No 8  
>COG1262 Uncharacterized conserved protein [Function unknown]
Probab=100.00  E-value=1.7e-46  Score=360.03  Aligned_cols=245  Identities=44%  Similarity=0.740  Sum_probs=177.7

Q ss_pred             CCCCCCcEEeCCeeeEccCCCCCC-CCC-CCCCceEEEccceEeeecccCHHHHHHHHHHcCCcchhhhcCCcccccCCc
Q psy8678           9 VERYKDMVLLPGDTFRMGTNKPIL-IKD-GEFPSRNVTLDAFYLDQHEVSNTQFQEFVSATGYVTEAEKFGDTFVFEPLL   86 (394)
Q Consensus         9 ~~~~~~mv~IpgG~f~mG~~~~~~-~~~-~~~p~~~v~v~~F~i~~~EVTn~~y~~fl~~~g~~~~~~~~~~~~~~~~~~   86 (394)
                      ....++||.||+|+|.||+.+.+. ..+ +|.|.|+|+|++|+|++|||||+||++|++++++....+..+.        
T Consensus        47 ~~~~~~~v~ipgg~f~~g~~~~e~~~~~~~e~P~h~v~v~~F~i~k~pVT~aq~~~fv~~~g~~~~~~~~~~--------  118 (314)
T COG1262          47 LVIAPEMVLIPGGEFTMGSPDDEWERFDRNEAPVHKVTVPPFEIDKYPVTNAQFARFVEAGGYTTAWEEDGE--------  118 (314)
T ss_pred             cccCceEEEECCceeecCCCcccccccccccCCceeeEecceeeeCceEcHHHHHHHHHhcCcccccccccc--------
Confidence            346789999999999999544333 334 7999999999999999999999999999999987641000000        


Q ss_pred             cHHHHHhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCCccCccCCCCCccccC
Q psy8678          87 SEEERAKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGSWLHPE  166 (394)
Q Consensus        87 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g~~~~~~~~~~w~~~~  166 (394)
                                                     |+                  .|                    ..|..+.
T Consensus       119 -------------------------------~~------------------~p--------------------~~w~~~~  129 (314)
T COG1262         119 -------------------------------PV------------------YP--------------------SYWKGEG  129 (314)
T ss_pred             -------------------------------cC------------------Cc--------------------ccccCCC
Confidence                                           00                  00                    1122221


Q ss_pred             CcCCccccccCCCcccCCHHHHHHHHHHcC-----CCCCCHHHHHHHHhcCCCCccccCCCCCCCCCccccccccCCCCC
Q psy8678         167 GIDSTIEHRMNHPVVHVSWNDAVAYCTWRG-----ARLPTEAEWEYGCRGGLENRLFPWGNNLTPRGEHRANVWQGEFPT  241 (394)
Q Consensus       167 ~~~~~~~~~~~~Pv~~Vsw~dA~~yc~wlg-----~RLPTEaEWEyAArg~~~~~~ypwg~~~~~~~~~~an~~~~~~~~  241 (394)
                      +.     ...++||++|||+||.+||.|+|     +|||||+||||||+++.....|+||+...+....+.+-|.-    
T Consensus       130 ~~-----~~~~~Pv~~Vs~~da~aya~~lg~~tg~~rLPTEaEWE~Aar~g~~~~~~~~gd~~~~~~~~~~~~~~~----  200 (314)
T COG1262         130 GR-----LRLEHPVVGVSWYDAQAYAAWLGVKTGEYRLPTEAEWEYAARAGTTTDSYPWGDELEPGLNAYAGTWEY----  200 (314)
T ss_pred             Cc-----ccccCCeeeccHHHHHHHHHHhccccccccCCcHHHHHHHhccCCCCCccccCccccchhhhhccchhh----
Confidence            11     24679999999999999999999     99999999999999987765599999876544333322200    


Q ss_pred             CCCCCCC---CCccccCcCCCCC-cccccccccCHHHhcccccc----CCCCCCCCC------CCCC-CCCCCCeEEeCc
Q psy8678         242 NNTAADG---YLSTAPVMSYKEN-KFGLYNMVGNVWEWTADWWN----VHHHPAPSY------NPKG-PTTGTDKVKKGG  306 (394)
Q Consensus       242 ~~~~~~g---~~~~~pVg~~~~n-~~GlyDM~GNV~EW~~d~y~----~~~~~~~~~------~~~~-~~~g~~~v~rGG  306 (394)
                       .....+   ...++||++++++ .+|||||+|||||||.|++.    ..+...+.+      ...+ ...+..+|+|||
T Consensus       201 -~~~~~~~~~~~~~~pvg~~~~~~~~GlyDm~GnVWEWt~d~~~~~~~~~~~~~~~~g~a~~~~~~~~~~~~~~~v~rgg  279 (314)
T COG1262         201 -LRAAAGWARERETAPVGAFPPNAAYGLYDMHGNVWEWTADWEKEWHYDNYGPAPSDGSAWYDGNSGSKFFGSLRVVRGG  279 (314)
T ss_pred             -hccccccccccccCCccccCCccccChhhcccceeeeecccccccccccccCcccCCceeeccCCccccceeeeeeecc
Confidence             011122   2378999999766 99999999999999999875    222221111      1111 234467899999


Q ss_pred             cccCCccccccceeccccCCCCCCCCCCCCccccccC
Q psy8678         307 SYLCNEQYCYRHRCAARSQNTPDSSAGNLGFRCAADK  343 (394)
Q Consensus       307 s~~~~~~~~~~~r~~~r~~~~~~~~~~~iGFR~v~~~  343 (394)
                      ||.+....   .|++.|....++.+...+||||++.+
T Consensus       280 sw~~~~~~---~r~~~R~~~~~~~~~~~~GfR~~~~~  313 (314)
T COG1262         280 SWASYPGV---LRPAFRNFLVPDYRQAHVGFRCARLI  313 (314)
T ss_pred             cccCcccc---cCHhhhCccCcchhcceeeEEEEeec
Confidence            99996665   59999999999999999999999875


No 9  
>TIGR03525 GldK gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. There is a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=99.10  E-value=4.1e-11  Score=118.85  Aligned_cols=74  Identities=45%  Similarity=0.851  Sum_probs=58.0

Q ss_pred             cCCcccccCC--ccHHHHHhhhhhccccccccCCcccccccCCCceeeecHHHHHHhhhhhcC-----------------
Q psy8678          76 FGDTFVFEPL--LSEEERAKISQVRHDMKRFEGLDSTIEHRMHHPVVHISWNDAVAYCTWRGA-----------------  136 (394)
Q Consensus        76 ~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PV~~vsW~~A~~y~~w~~~-----------------  136 (394)
                      .|++.||.+.  ++.+.....++++|+.            +.++|||+|||+||.+||+|++.                 
T Consensus       232 ~pdt~vw~~~~~~~~n~~~~~~y~~h~~------------y~d~PVV~VSW~dA~aFC~Wls~~~n~~~~~~g~~t~~~y  299 (449)
T TIGR03525       232 YPDTTVWIKDFNYSYNEPMHNDYFWHQA------------YDDYPVVGVTWKQARAFCNWRTKYKNDFRKKKGPANVNTF  299 (449)
T ss_pred             cCCcceEecccccccCchhhhhhccCcc------------cCCCCccCCCHHHHHHHHHHHhccccccccccccccCccc
Confidence            5777777766  4444554444444332            57999999999999999999985                 


Q ss_pred             CCcchhhhhhcccCCccCccCCCCC
Q psy8678         137 RLPTEAEWEYGCRGGLENRLFPWGS  161 (394)
Q Consensus       137 ~~~~~~~~~~~~~~g~~~~~~~~~~  161 (394)
                      |||+++||||||++|.....|+|++
T Consensus       300 RLPTEAEWEYAARGG~~~~~YPWG~  324 (449)
T TIGR03525       300 RLPTEAEWEYAARGGLEGATYPWGG  324 (449)
T ss_pred             cCCCHHHHHHHHhcCCCCCccCCCC
Confidence            8999999999999998888788743


No 10 
>TIGR02171 Fb_sc_TIGR02171 Fibrobacter succinogenes paralogous family TIGR02171. This model describes a paralogous family of the rumen bacterium Fibrobacter succinogenes. Eleven members are found in Fibrobacter succinogenes S85, averaging over 900 amino acids in length. More than half are predicted lipoproteins. The function is unknown.
Probab=98.96  E-value=1.4e-09  Score=114.71  Aligned_cols=43  Identities=28%  Similarity=0.357  Sum_probs=36.7

Q ss_pred             CCcccccCCccccCCCccccccccccCCCCCC---CCCCCcceEEEeec
Q psy8678         348 GTDKVKKGGSYLCNEQYCYRHRCAARSQNTPD---SSAGNLGFRCAADV  393 (394)
Q Consensus       348 ~~~~~~~ggsw~~~~~~~~~~~~~~r~~~~~~---~~~~~~gfr~~~~~  393 (394)
                      ...||+|||||.+.+..|   |++.|....|.   .+..+||||||++.
T Consensus       238 ~~~RVlRGGSW~s~p~~c---R~a~R~~~~p~~pdsr~~~iGFRLAr~~  283 (912)
T TIGR02171       238 LGERVVKGGSYRNSPSAI---NLYTRGDVYPVTSSTKGDYVGFRLALGA  283 (912)
T ss_pred             CceEEEecCCCCCChhhc---ceeeccccCCCCcccccCceEEEEEEec
Confidence            356899999999999999   99999877654   57799999999874


No 11 
>TIGR03529 GldK_short gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. This model represents a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture than that found in Flavobacterium johnsoniae and related species (represented by TIGR03525). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=98.67  E-value=1.1e-08  Score=99.99  Aligned_cols=48  Identities=50%  Similarity=1.117  Sum_probs=43.1

Q ss_pred             cCCCceeeecHHHHHHhhhhhc-----------------CCCcchhhhhhcccCCccCccCCCCC
Q psy8678         114 RMHHPVVHISWNDAVAYCTWRG-----------------ARLPTEAEWEYGCRGGLENRLFPWGS  161 (394)
Q Consensus       114 ~~~~PV~~vsW~~A~~y~~w~~-----------------~~~~~~~~~~~~~~~g~~~~~~~~~~  161 (394)
                      ..++||++|||+||.+||+|++                 .|||+++||||||++|.....|+|++
T Consensus       155 ~~~~PVv~VSW~dA~ayc~Wls~~~~~~~~~~~~~~~~~~RLPTEAEWEyAARgg~~~~~ypwG~  219 (344)
T TIGR03529       155 FDNYPVVGVDWNAAKQFCEWRTYHMNAYRNEESQYDMPRFRLPSEAEWEYAARGGRDMAKYPWGG  219 (344)
T ss_pred             ccCCCcccCCHHHHHHHHHHHhhhccccccccccccCCcccCcCHHHHHHHHhCCCCCCcCCCCC
Confidence            3589999999999999999996                 69999999999999988877788754


No 12 
>PF03781 FGE-sulfatase:  Sulfatase-modifying factor enzyme 1;  InterPro: IPR005532 This domain is found in eukaryotic proteins required for post-translational sulphatase modification (SUMF1). These proteins are associated with the rare disorder multiple sulphatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, formylglycine-generating enzyme, which is a sulphatase. Sulphatases are enzymes essential for degradation and remodelling of sulphate esters, and formylglycine (FGly), the key catalytic residue in the active site, is unique to sulphatases. FGE is localised to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesised sulphatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilised by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulphide bond is present in the active site of FGE. An oxidised cysteine residue, possibly cysteine sulphenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulphatases. This domain is also found in a few methyltransferases and protein kinases.; PDB: 2Y3C_A 2Q17_B 1Y4J_B 1Y1E_X 2AFT_X 2HIB_X 2AII_X 1Z70_X 1Y1F_X 2HI8_X ....
Probab=98.42  E-value=1.2e-07  Score=89.55  Aligned_cols=52  Identities=58%  Similarity=1.199  Sum_probs=38.4

Q ss_pred             ccCCCceeeecHHHHHHhhhhhcC------CCcchhhhhhcccCCccCccCCCCCccc
Q psy8678         113 HRMHHPVVHISWNDAVAYCTWRGA------RLPTEAEWEYGCRGGLENRLFPWGSWLH  164 (394)
Q Consensus       113 ~~~~~PV~~vsW~~A~~y~~w~~~------~~~~~~~~~~~~~~g~~~~~~~~~~w~~  164 (394)
                      ...++||++|||++|.+||+|++.      |||+++||||||++|.....++|.+-..
T Consensus        87 ~~~~~Pv~~Vsw~~A~ayc~wl~~~~g~~yRLPteaEWe~Aar~g~~~~~~~~g~~~~  144 (260)
T PF03781_consen   87 GPDNHPVVGVSWYDAQAYCNWLGKRTGEGYRLPTEAEWEYAARGGPDGRPYPWGDEFD  144 (260)
T ss_dssp             TGTTSB--S--HHHHHHHHHHCTHHTTSS-B---HHHHHHHHHTTSSSSSBTTBSSSS
T ss_pred             CCcchhcceeeHHHHHHHHHHhcccccccccCCCHHHHHHHhcccccccccccCCCCC
Confidence            356899999999999999999999      9999999999999988777777755443


No 13 
>PHA00653 mtd major tropism determinant
Probab=98.28  E-value=1.7e-06  Score=80.89  Aligned_cols=137  Identities=23%  Similarity=0.341  Sum_probs=82.4

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccCCCCCCCCCccccccccCCCCCCCCCCCCCCccccCcCCCCC
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWGNNLTPRGEHRANVWQGEFPTNNTAADGYLSTAPVMSYKEN  261 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypwg~~~~~~~~~~an~~~~~~~~~~~~~~g~~~~~pVg~~~~n  261 (394)
                      ..+|+.+.+-....|+|||+-+||.-||.|..+...-  ++.          .|      ....+.|...+..+ .-.-+
T Consensus       235 ~~~WY~~~e~m~~~GKrLp~y~EF~~~afGS~egt~~--~~t----------~~------sat~~~gr~atg~~-~~~vS  295 (381)
T PHA00653        235 DGAWYNFAEVMTHHGKRLPNYNEFQALAFGTTEATSS--GGT----------DV------PTTGVNGTGATSAW-NIFTS  295 (381)
T ss_pred             chhHHHHHHHHHHhccCCCcHHHHHHHHhCCCcccCC--CCC----------Cc------ccccccccccchhH-hhhhh
Confidence            3679999988888899999999999999986543210  000          01      00111111111111 11236


Q ss_pred             cccccccccCHHHhccccccCCCCCCCCCCCCCC---CCCCCeEEeCccccCCccccccceeccccC---CCCCCCCCCC
Q psy8678         262 KFGLYNMVGNVWEWTADWWNVHHHPAPSYNPKGP---TTGTDKVKKGGSYLCNEQYCYRHRCAARSQ---NTPDSSAGNL  335 (394)
Q Consensus       262 ~~GlyDM~GNV~EW~~d~y~~~~~~~~~~~~~~~---~~g~~~v~rGGs~~~~~~~~~~~r~~~r~~---~~~~~~~~~i  335 (394)
                      .+|+.|..||||||..+.........-.-+..+.   +..-..++-||+|....      +|..|..   ..|.....++
T Consensus       296 ~~gv~d~~G~vW~W~~e~~~~~~aa~y~an~~~~G~~yq~~~a~l~GG~W~~g~------~cGsRa~~~~~~Pw~v~aN~  369 (381)
T PHA00653        296 KWGVVQASGCLWTWGNEFGGVNGASEYTANTGGRGSVYAQPAAALFGGSWNYTS------LSGSRAAYWYSGPSNSFANI  369 (381)
T ss_pred             hcchhhhcchHHHHHHHhcCCcccceeecccCCcchHhhchHHHhhCCcccccc------ccccceeceecCcccccccc
Confidence            8999999999999998865332111100011110   11123567899998764      3444443   5777888999


Q ss_pred             CccccccC
Q psy8678         336 GFRCAADK  343 (394)
Q Consensus       336 GFR~v~~~  343 (394)
                      |-|||++.
T Consensus       370 GaRgvCD~  377 (381)
T PHA00653        370 GARGVCDH  377 (381)
T ss_pred             ccceechh
Confidence            99999874


No 14 
>TIGR03440 unchr_TIGR03440 conserved hypothetical protein TIGR03440. The model TIGR03438 describes a family of uncharacterized putative methyltransferases in bacteria. The family described here is a set of proteins also restricted to bacteria, and located close to the member of TIGR03438.
Probab=98.14  E-value=1.1e-06  Score=87.97  Aligned_cols=38  Identities=55%  Similarity=0.961  Sum_probs=35.7

Q ss_pred             cCCCceeeecHHHHHHhhhhhcCCCcchhhhhhcccCC
Q psy8678         114 RMHHPVVHISWNDAVAYCTWRGARLPTEAEWEYGCRGG  151 (394)
Q Consensus       114 ~~~~PV~~vsW~~A~~y~~w~~~~~~~~~~~~~~~~~g  151 (394)
                      ..++||++|||+||.+||.|.+.|||+++|||+||+.|
T Consensus       268 ~~~~PV~~VS~~eA~Ay~~W~g~RLPTEaEWE~AAr~g  305 (406)
T TIGR03440       268 DPDAPVCHVSYYEADAYARWAGARLPTEAEWEKAARWG  305 (406)
T ss_pred             CCCCCccCCCHHHHHHHHHHhCCCCCCHHHHHHHHhcC
Confidence            46899999999999999999999999999999999864


No 15 
>COG1262 Uncharacterized conserved protein [Function unknown]
Probab=98.03  E-value=3.7e-06  Score=81.35  Aligned_cols=46  Identities=59%  Similarity=1.221  Sum_probs=40.3

Q ss_pred             cCCCceeeecHHHHHHhhhhhc-----CCCcchhhhhhcccCCccCccCCC
Q psy8678         114 RMHHPVVHISWNDAVAYCTWRG-----ARLPTEAEWEYGCRGGLENRLFPW  159 (394)
Q Consensus       114 ~~~~PV~~vsW~~A~~y~~w~~-----~~~~~~~~~~~~~~~g~~~~~~~~  159 (394)
                      +.++||++|||++|.+||.|++     .|||++++|||||++|....-|++
T Consensus       133 ~~~~Pv~~Vs~~da~aya~~lg~~tg~~rLPTEaEWE~Aar~g~~~~~~~~  183 (314)
T COG1262         133 RLEHPVVGVSWYDAQAYAAWLGVKTGEYRLPTEAEWEYAARAGTTTDSYPW  183 (314)
T ss_pred             cccCCeeeccHHHHHHHHHHhccccccccCCcHHHHHHHhccCCCCCcccc
Confidence            4579999999999999999999     999999999999999876543433


No 16 
>TIGR03524 GldJ gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. There is a GldJ homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=97.29  E-value=0.00015  Score=73.63  Aligned_cols=46  Identities=28%  Similarity=0.366  Sum_probs=40.0

Q ss_pred             CCCCeEEeCccccCCccccccceeccccCCCCCCCCCCCCccccccCCC
Q psy8678         297 TGTDKVKKGGSYLCNEQYCYRHRCAARSQNTPDSSAGNLGFRCAADKGP  345 (394)
Q Consensus       297 ~g~~~v~rGGs~~~~~~~~~~~r~~~r~~~~~~~~~~~iGFR~v~~~~~  345 (394)
                      ....||+|||||.+.+..|   |++.|....++.....||||||++...
T Consensus       503 ~d~~RVlRGGSW~~~~~~~---r~A~R~~~~~~~~~~~iGFR~a~~~~g  548 (559)
T TIGR03524       503 DDRVRVYKGGSWRDREYWL---DPAQRRYLPQYMATDYIGFRCAMSRVG  548 (559)
T ss_pred             cCCeEEeecCCcCCCcccc---chhhccCCCccccccceeEEEEecccC
Confidence            3468999999999888765   999999999999999999999998753


No 17 
>TIGR03530 GldJ_short gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. This model represents the GldJ homolog in Cytophaga hutchinsonii and several other species which is of shorter architecture than that found in Flavobacterium johnsoniae and is represented by a separate model (TIGR03524). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.
Probab=97.21  E-value=0.00023  Score=70.65  Aligned_cols=43  Identities=30%  Similarity=0.413  Sum_probs=37.6

Q ss_pred             CCCeEEeCccccCCccccccceeccccCCCCCCCCCCCCccccccC
Q psy8678         298 GTDKVKKGGSYLCNEQYCYRHRCAARSQNTPDSSAGNLGFRCAADK  343 (394)
Q Consensus       298 g~~~v~rGGs~~~~~~~~~~~r~~~r~~~~~~~~~~~iGFR~v~~~  343 (394)
                      +..||+|||||.+.+..+   |++.|....++.+..+||||||+..
T Consensus       356 ~~~RVlRGGSW~~~~~~~---Rsa~R~~~~p~~~~~~iGFR~a~~~  398 (402)
T TIGR03530       356 DEFRVYKGGSWKDVAYWL---SPGTRRFLAEDSATAAIGFRCAMIQ  398 (402)
T ss_pred             CceEEEeCCCCCCcccce---eeeecCCCCCCccCCceeEEEEEec
Confidence            357999999999887764   9999999999999999999999764


No 18 
>PF07603 DUF1566:  Protein of unknown function (DUF1566);  InterPro: IPR011460 These proteins of unknown function are found in Leptospira interrogans and in several gamma proteobacteria.
Probab=75.50  E-value=2.6  Score=34.52  Aligned_cols=30  Identities=37%  Similarity=0.581  Sum_probs=24.8

Q ss_pred             cccCCHHHHHHHHHHcC------CCCCCHHHHHHHH
Q psy8678         180 VVHVSWNDAVAYCTWRG------ARLPTEAEWEYGC  209 (394)
Q Consensus       180 v~~Vsw~dA~~yc~wlg------~RLPTEaEWEyAA  209 (394)
                      ....+|.+|+++|+-+.      -||||..|-+.-.
T Consensus        28 ~~~~~~~~A~~~c~~l~~~G~~dWRLPt~~EL~~L~   63 (124)
T PF07603_consen   28 PTYMNWDDAIAYCNNLNLGGYTDWRLPTIEELQSLY   63 (124)
T ss_pred             CccCcHHHHHHHHHHHhcCCCCCccCCCHHHHHHHH
Confidence            46689999999998873      4999999976654


No 19 
>PHA02673 ORF109 EEV glycoprotein; Provisional
Probab=56.90  E-value=8.8  Score=33.02  Aligned_cols=20  Identities=35%  Similarity=0.712  Sum_probs=18.8

Q ss_pred             CCHHHHHHHHHHcCCCCCCH
Q psy8678         183 VSWNDAVAYCTWRGARLPTE  202 (394)
Q Consensus       183 Vsw~dA~~yc~wlg~RLPTE  202 (394)
                      .+|+||.+-|.-+|++||..
T Consensus        93 ~tf~eAn~~C~~~g~~LPs~  112 (161)
T PHA02673         93 DTWTNANERCKELGQRLPSP  112 (161)
T ss_pred             CcHHHHHHHHHhcCCcCCCC
Confidence            69999999999999999994


No 20 
>PF00193 Xlink:  Extracellular link domain;  InterPro: IPR000538 The link domain is a hyaluronan (HA)-binding region found in proteins of vertebrates that are involved in the assembly of extracellular matrix, cell adhesion, and migration. The structure has been shown to consist of two alpha helices and two antiparallel beta sheets arranged around a large hydrophobic core similar to that of C-type lectin. This domain contains four conserved cysteines involved in two disulphide bonds. The link domain has also been termed HABM (HA binding module) and PTR (proteoglycan tandem repeat). Proteins with such a domain include the proteoglycans aggrecan, brevican, neurocan and versican, which are expressed in the CNS; the cartilage link protein (LP), a proteoglycan that together with HA and aggrecan forms multimolecular aggregates; Tumour necrosis factor-inducible protein TSG-6, which may be involved in cell-cell and cell-matrix interactions during inflammation and tumourigenesis; and CD44 antigen, the main cell surface receptor for HA.; GO: 0005540 hyaluronic acid binding, 0007155 cell adhesion; PDB: 1O7B_T 2PF5_C 1O7C_T 2JCQ_A 2JCR_A 2JCP_A 1UUH_B 1POZ_A 2I83_A.
Probab=44.90  E-value=25  Score=27.59  Aligned_cols=39  Identities=26%  Similarity=0.491  Sum_probs=27.4

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      .+++.+|.+.|..+|.+|-|-.|-+.|-+.+-+.=.+-|
T Consensus        13 ~l~f~eA~~~C~~~ga~LAs~~qL~~A~~~G~~~C~~GW   51 (92)
T PF00193_consen   13 KLTFTEAQQACRALGARLASPEQLEAAWKAGFETCRAGW   51 (92)
T ss_dssp             SB-HHHHHHHHHHTTCBE--HHHHHHHHHTT---SS-EE
T ss_pred             cCcHHHHHHHHHHcCCeeCCHHHHHHHHHhhhhHhHHHh
Confidence            588999999999999999999999998887655433433


No 21 
>PHA00653 mtd major tropism determinant
Probab=41.07  E-value=15  Score=35.34  Aligned_cols=39  Identities=26%  Similarity=0.326  Sum_probs=30.0

Q ss_pred             CcccccCCccccCCCccccccccccC---CCCCCCCCCCcceEEEeec
Q psy8678         349 TDKVKKGGSYLCNEQYCYRHRCAARS---QNTPDSSAGNLGFRCAADV  393 (394)
Q Consensus       349 ~~~~~~ggsw~~~~~~~~~~~~~~r~---~~~~~~~~~~~gfr~~~~~  393 (394)
                      ..+++.||+|.+   ..   +|+.|.   ...|-..+.+||-|+||+.
T Consensus       336 ~~a~l~GG~W~~---g~---~cGsRa~~~~~~Pw~v~aN~GaRgvCD~  377 (381)
T PHA00653        336 PAAALFGGSWNY---TS---LSGSRAAYWYSGPSNSFANIGARGVCDH  377 (381)
T ss_pred             hHHHhhCCcccc---cc---ccccceeceecCccccccccccceechh
Confidence            347899999985   33   677764   3467779999999999973


No 22 
>cd03518 Link_domain_HAPLN_module_1 Link_domain_HAPLN_module_1; this link domain is found in the first link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes.
Probab=38.68  E-value=56  Score=25.79  Aligned_cols=40  Identities=18%  Similarity=0.264  Sum_probs=32.7

Q ss_pred             ccCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         181 VHVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      -.+++.+|++.|.-.|..|.|-++-+.|-+.|.+.-.+-|
T Consensus        12 Y~l~f~eA~~aC~~~ga~lAs~~QL~~Aw~~Gld~C~~GW   51 (95)
T cd03518          12 YNLNFHEAQQACEEQDATLASFEQLYQAWTEGLDWCNAGW   51 (95)
T ss_pred             cccCHHHHHHHHHHcCCeeCCHHHHHHHHHcCccccCccc
Confidence            3488999999999999999999999988887655444444


No 23 
>cd03601 CLECT_TC14_like C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm. CLECT_TC14_like: C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  TC14 is homodimeric.  The CTLD of TC14 binds D-galactose and D-fucose.  TC14 is expressed constitutively by multipotent epithelial and mesenchymal cells and plays a role during budding, in inducing the aggregation of undifferentiated mesenchymal cells to give rise to epithelial forming tissue.   TC14-2 and TC14-3 show calcium-dependent galactose binding activity.  TC14-3 is a cytostatic factor which blocks cell growth and dedifferentiation of the atrial epithelium during asexual reproducti
Probab=34.81  E-value=23  Score=28.63  Aligned_cols=19  Identities=37%  Similarity=0.714  Sum_probs=16.9

Q ss_pred             cCCHHHHHHHHHHcCCCCC
Q psy8678         182 HVSWNDAVAYCTWRGARLP  200 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLP  200 (394)
                      .++|.+|..+|+.+|.+|-
T Consensus         9 ~~~w~~A~~~C~~~G~~La   27 (119)
T cd03601           9 TMNYAKAGAFCRSRGMRLA   27 (119)
T ss_pred             cCCHHHHHHHHHhcCCEEe
Confidence            4899999999999998775


No 24 
>cd03520 Link_domain_CSPGs_modules_2_4 Link_domain_CSPGs_modules_2_4; this link domain is found in the second and fourth link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan and, in the second link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) having link modules 3 and 4 which lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and
Probab=34.15  E-value=72  Score=25.26  Aligned_cols=39  Identities=28%  Similarity=0.388  Sum_probs=31.9

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      .+++.+|++.|.-+|..|.|-+|-+.|-+.|.+.=.+-|
T Consensus        10 ~l~f~eA~~aC~~~ga~lAs~~QL~~Aw~~Gld~C~~GW   48 (96)
T cd03520          10 KFTFQEARAECRSLGAVLATTGQLYAAWRQGLDQCDPGW   48 (96)
T ss_pred             CcCHHHHHHHHHHcCCEeCCHHHHHHHHHhccccccCcc
Confidence            589999999999999999999999888876654433333


No 25 
>cd03515 Link_domain_TSG_6_like This is the extracellular link domain of the type found in human TSG-6. The link domain is a hyaluronan (HA)-binding domain. TSG-6 is the protein product of tumor necrosis factor-stimulated gene-6. TSG-6 is up-regulated in inflammatory lesions and in the ovary during ovulation. It has a strong anti-inflammatory and chondroprotective effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. Also included in this group are the stabilins: stabilin-1 (FEEL-1, CLEVER-1) and stabilin-2 (FEEL-2). Stabilin-2 functions as the major liver and lymph node-scavenging receptor for HA and related glycosaminoglycans. Stabilin-2 is a scavenger receptor with a broad range of ligands including advanced glycation end (AGE) products, acetylated low density lipoprotein and procollagen peptides. In contrast, stabilin-1 does not bind HA, but binds acetylated low density lipoprotein and AGEs with lower affinity. As AGEs accum
Probab=33.28  E-value=86  Score=24.66  Aligned_cols=39  Identities=23%  Similarity=0.362  Sum_probs=31.8

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      .+++.+|++.|+-.|..|.|-++-+.|-+.|.+.=.+-|
T Consensus        13 ~l~f~eA~~aC~~~ga~lAs~~QL~~Aw~~G~d~C~~GW   51 (93)
T cd03515          13 KLTYTEAKAACEAEGAHLATYSQLSAAQQLGFHLCAAGW   51 (93)
T ss_pred             ccCHHHHHHHHHHcCCccCCHHHHHHHHHcCccccCccc
Confidence            488999999999999999999999888876655443433


No 26 
>cd03516 Link_domain_CD44_like This domain is a hyaluronan (HA)-binding domain. It is found in CD44 receptor and mediates adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It also plays an important role in arteriogenesis. The functional HA-binding domain of CD44 is an extended domain comprised of a single link module flanked with N-and C- extensions. These extensions are essential for folding and for functional activity. This group also contains the cell surface retention sequence (CRS) binding protein-1 (CRSBP-1) and lymph vessel endothelial receptor-1 (LYVE-1). CRSBP-1 is a cell surface binding protein for the CRS motif of PDGF-BB (platelet-derived growth factor-BB) and is responsible for the cell surface retention of PDGF-BB in SSV-transformed cells. CRSBP-1 may play a role in autocrine regulation of cell growth mediated by CRS containing growth regulators. LYVE-1 is preferentially expressed on the lymphatic endothelium and is used as a molecular marke
Probab=32.83  E-value=69  Score=27.32  Aligned_cols=39  Identities=23%  Similarity=0.439  Sum_probs=32.2

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      .+++.+|.+.|+.+|.+|.|-+|-+.|-+.|.+.=.+-|
T Consensus        18 ~lnf~eA~~aC~~~ga~lAs~~QL~~Aw~~Gld~C~aGW   56 (144)
T cd03516          18 SLNFTEAKEACRALGLTLASKAQVETALKFGFETCRYGW   56 (144)
T ss_pred             cCCHHHHHHHHHHcCCeeCCHHHHHHHHHcChhccCcce
Confidence            488999999999999999999999998887655444433


No 27 
>cd03592 CLECT_selectins_like C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins:  P(platelet)-, E(endothelial)-, and L(leukocyte)- selectins (sels). CLECT_selectins_like: C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins:  P(platelet)-, E(endothelial)-, and L(leukocyte)- selectins (sels).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  P- E- and L-sels are cell adhesion receptors that mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.  L- sel is expressed constitutively on most leukocytes.  P-sel is stored in the Weibel-Palade bodies of endothelial cells and in the alpha granules of platelets.  E- sels are present on endothelial cells.  Following platelet and/or endothelial cell activation P- sel is rapidly translocated to the cell surface and E-sel exp
Probab=31.66  E-value=32  Score=27.38  Aligned_cols=28  Identities=32%  Similarity=0.517  Sum_probs=20.7

Q ss_pred             cCCHHHHHHHHHHcCCCCC---CHHHHHHHH
Q psy8678         182 HVSWNDAVAYCTWRGARLP---TEAEWEYGC  209 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLP---TEaEWEyAA  209 (394)
                      ..+|.+|.++|+..|..|-   +++|=++..
T Consensus         9 ~~~w~~A~~~C~~~g~~La~i~s~~e~~~i~   39 (115)
T cd03592           9 KMTFNEAVKYCKSRGTDLVAIQNAEENALLN   39 (115)
T ss_pred             ccCHHHHHHHHHHcCCeEeecCCHHHHHHHH
Confidence            4789999999999998664   455544433


No 28 
>smart00445 LINK Link (Hyaluronan-binding).
Probab=31.10  E-value=74  Score=25.08  Aligned_cols=40  Identities=20%  Similarity=0.364  Sum_probs=32.2

Q ss_pred             ccCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         181 VHVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      ..+++.+|++.|+-.|..|-|-+|-+.|-+.+.+.=.+-|
T Consensus        13 y~l~f~eA~~aC~~~ga~lAs~~QL~~Aw~~Gld~C~~GW   52 (94)
T smart00445       13 YKLTFAEAREACRAQGATLATVGQLYAAWQDGFDTCDAGW   52 (94)
T ss_pred             CccCHHHHHHHHHHcCCEeCCHHHHHHHHHhchhhcCccc
Confidence            4588999999999999999999999888876655433333


No 29 
>cd03599 CLECT_DGCR2_like C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS). CLECT_DGCR2_like: C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  DGS is also known as velo-cardio-facial syndrome (VCFS).  DGS is a genetic abnormality that results in malformations of the heart, face, and limbs and is associated with schizophrenia and depressive disorders.  DGCR2 is a candidate for involvement in the pathogenesis of DGS since the DGCR2 gene lies within the minimal DGS critical region (MDGRC) of 22q11, which when deleted gives rise to DGS, and the DGCR2 gene is in close proximity to the balanced translocation breakpoint in a DGS patient having a balanced translocation.
Probab=31.06  E-value=35  Score=29.40  Aligned_cols=20  Identities=30%  Similarity=0.335  Sum_probs=17.3

Q ss_pred             cCCHHHHHHHHHHcCCCCCC
Q psy8678         182 HVSWNDAVAYCTWRGARLPT  201 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPT  201 (394)
                      ..+|.||.++|+.+|..|.+
T Consensus        21 ~~tw~dA~~~C~~~Gg~Las   40 (153)
T cd03599          21 GENYWDAVQTCQKVNGSLAT   40 (153)
T ss_pred             cCCHHHHHHHHHHcCCEEcC
Confidence            47999999999999987765


No 30 
>PF05966 Chordopox_A33R:  Chordopoxvirus A33R protein;  InterPro: IPR009238 This family consists of several Chordopoxvirus A33R proteins. A33R plays a role in promoting Ab-resistant cell-to-cell spread of virus [] and interacts with A36R to incorporate the protein into the outer membrane of intracellular enveloped virions (IEV) [].; PDB: 3K7B_A.
Probab=30.21  E-value=26  Score=31.21  Aligned_cols=22  Identities=23%  Similarity=0.536  Sum_probs=15.5

Q ss_pred             CCHHHHHHHHHHcCCCCCCHHH
Q psy8678         183 VSWNDAVAYCTWRGARLPTEAE  204 (394)
Q Consensus       183 Vsw~dA~~yc~wlg~RLPTEaE  204 (394)
                      .+|+||.+-|.-+|++||+...
T Consensus       122 ~T~~~A~~~C~~~g~~LPs~~l  143 (190)
T PF05966_consen  122 KTFDEANSDCNNKGQTLPSKDL  143 (190)
T ss_dssp             EEHHHHHHHHHHTT-B---HHH
T ss_pred             CCHHHHHHHHHhcCCcCCCcch
Confidence            5699999999999999999643


No 31 
>TIGR02145 Fib_succ_major Fibrobacter succinogenes major paralogous domain. This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron.
Probab=29.39  E-value=21  Score=31.38  Aligned_cols=26  Identities=31%  Similarity=0.751  Sum_probs=18.3

Q ss_pred             CHHHHHH-HHHHcCCCCCCHHHHHHHHh
Q psy8678         184 SWNDAVA-YCTWRGARLPTEAEWEYGCR  210 (394)
Q Consensus       184 sw~dA~~-yc~wlg~RLPTEaEWEyAAr  210 (394)
                      +|..|.. .|= .|.||||.+||+....
T Consensus        51 ~w~aa~~~~cP-~GWhlPs~~Ew~~L~~   77 (171)
T TIGR02145        51 TWAAAMDSICP-EGWHLPSTTEWNTLFD   77 (171)
T ss_pred             EHHHhccCcCC-CCCCCCCHHHHHHHHH
Confidence            4555554 442 4889999999987754


No 32 
>PF00059 Lectin_C:  Lectin C-type domain;  InterPro: IPR001304 Lectins occur in plants, animals, bacteria and viruses. Initially described for their carbohydrate-binding activity [], they are now recognised as a more diverse group of proteins, some of which are involved in protein-protein, protein-lipid or protein-nucleic acid interactions []. There are at least twelve structural families of lectins:   C-type lectins, which are Ca2+-dependent.  S-type (galectins), a widespread family of glycan-binding proteins []. I-type, which have an immunoglobulin-like fold and can recognise sialic acids, other sugars and glycosaminoglycans []. P-type, which bind phosphomannosyl receptors []. Pentraxins []. (Trout) egg lectins. Calreticulin and calnexin, which act as molecular chaperones of the endoplasmic reticulum []. ERGIC-53 and VIP-36 []. Discoidins []. Eel agglutinins (fucolectins) []. Annexin lectins []. Fibrinogen-type lectins, which include ficolins, tachylectins 5A and 5B, and Limax flavus (Spotted garden slug) agglutinin (these proteins have clear distinctions from one another, but they share a homologous fibrinogen-like domain used for carbohydrate binding). Also unclassified orphan lectins, including amphoterin, Cel-II, complement factor H, thrombospondin, sialic acid-binding lectins, adherence lectin, and cytokines (such as tumour necrosis factor and several interleukins).   C-type lectins can be further divided into seven subgroups based on additional non-lectin domains and gene structure: (I) hyalectans, (II) asialoglycoprotein receptors, (III) collectins, (IV) selectins, (V) NK group transmembrane receptors, (VI) macrophage mannose receptors, and (VII) simple (single domain) lectins []. Therefore, lectins are a diverse group of proteins, both in terms of structure and activity. 
Carbohydrate binding ability may have evolved independently and sporadically in numerous unrelated families, where each evolved a structure that was conserved to fulfil some other activity and function. In general, animal lectins act as recognition molecules within the immune system, their functions involving defence against pathogens, cell trafficking, immune regulation and the prevention of autoimmunity [].; GO: 0005488 binding; PDB: 1T8D_A 2H2T_B 1T8C_A 2H2R_A 1TN3_A 1RJH_A 1HTN_A 3G8K_B 2E3X_B 1UMR_D ....
Probab=29.16  E-value=22  Score=27.09  Aligned_cols=29  Identities=31%  Similarity=0.614  Sum_probs=22.9

Q ss_pred             CCHHHHHHHHHHcCCCC---CCHHHHHHHHhc
Q psy8678         183 VSWNDAVAYCTWRGARL---PTEAEWEYGCRG  211 (394)
Q Consensus       183 Vsw~dA~~yc~wlg~RL---PTEaEWEyAArg  211 (394)
                      ++|.+|..+|+-.|..|   .++.|.++...-
T Consensus         3 ~~~~~A~~~C~~~~~~L~~i~~~~e~~~i~~~   34 (105)
T PF00059_consen    3 MTWEEAQQYCQSMGAHLASINSEEENDFIQSQ   34 (105)
T ss_dssp             EEHHHHHHHHHHTTSEEB-GSSHHHHHHHHHH
T ss_pred             CCHHHHHHHHhcCCCEEeEeCCHHHhhhhhhc
Confidence            68999999999999755   556688776653


No 33 
>PF09603 Fib_succ_major:  Fibrobacter succinogenes major domain (Fib_succ_major);  InterPro: IPR011871  This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes subsp. succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulphide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron.
Probab=28.91  E-value=32  Score=30.05  Aligned_cols=16  Identities=50%  Similarity=0.727  Sum_probs=13.9

Q ss_pred             CCCCCCHHHHHHHHhc
Q psy8678         196 GARLPTEAEWEYGCRG  211 (394)
Q Consensus       196 g~RLPTEaEWEyAArg  211 (394)
                      |.||||.+||+.....
T Consensus        73 GWrlPt~~Ew~~L~~~   88 (184)
T PF09603_consen   73 GWRLPTRAEWNSLFKY   88 (184)
T ss_pred             CCCCCCHHHHHHHHHh
Confidence            8999999999877654


No 34 
>cd01102 Link_Domain The link domain is a hyaluronan (HA)-binding domain. It functions to mediate adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It is found in the CD44 receptor and in human TSG-6. TSG-6 is the protein product of the tumor necrosis factor-stimulated gene-6. TSG-6 has a strong anti-inflammatory effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. This group also contains the link domains of the chondroitin sulfate proteoglycan core proteins (CSPG) including aggrecan, versican, neurocan, and brevican and the link domains of the vertebrate HAPLN (HA and proteoglycan binding link) protein family. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates in which other CSPGs substitute for aggrecan might contribute to the structural integrity of many different tissues. Members of
Probab=28.82  E-value=85  Score=24.64  Aligned_cols=40  Identities=25%  Similarity=0.471  Sum_probs=32.3

Q ss_pred             ccCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccC
Q psy8678         181 VHVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPW  220 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypw  220 (394)
                      ..+++.+|.+.|+-+|..|-|-.|-+.|-+.|.+.=.+-|
T Consensus        12 y~l~f~eA~~aC~~~ga~lAs~~QL~~Aw~~G~~~C~~GW   51 (92)
T cd01102          12 YKLTFAEAALACKARGAHLATPGQLEAAWQDGFDVCTAGW   51 (92)
T ss_pred             cccCHHHHHHHHHHcCCEeCCHHHHHHHHHcchhhcCCcc
Confidence            4588999999999999999999999888876654433333


No 35 
>PHA02953 IEV and EEV membrane glycoprotein; Provisional
Probab=28.24  E-value=46  Score=29.23  Aligned_cols=22  Identities=36%  Similarity=0.421  Sum_probs=19.1

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHH
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEA  203 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEa  203 (394)
                      ..+|.||.++|..+|.+||...
T Consensus        65 ~~tW~~A~~~C~~~Gg~L~~~~   86 (170)
T PHA02953         65 QLSTYGAVYLCNKYRARLPKPN   86 (170)
T ss_pred             cCCHHHHHHHHHhcCCCCCCCc
Confidence            4799999999999999998743


No 36 
>PF07979 Intimin_C:  Intimin C-type lectin domain;  InterPro: IPR013117 This domain is found at the C terminus of intimin. Its structure has been solved and shown to have a C-lectin type of structure []. Intimin is a bacterial adhesion molecule involved in intimate attachment of enteropathogenic and enterohemorrhagic Escherichia coli to mammalian host cells. Intimin targets the translocated intimin receptor (Tir), which is exported by the bacteria and integrated into the host cell plasma membrane.; GO: 0005488 binding, 0009405 pathogenesis, 0009986 cell surface; PDB: 1CWV_A 2ZQK_B 2ZWK_C 1F02_I 1E5U_I 1F00_I 3NCX_B 3NCW_D.
Probab=26.98  E-value=24  Score=28.15  Aligned_cols=23  Identities=26%  Similarity=0.580  Sum_probs=20.0

Q ss_pred             ccCCHHHHHHHHHHcCCCCCCHH
Q psy8678         181 VHVSWNDAVAYCTWRGARLPTEA  203 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RLPTEa  203 (394)
                      ..|++.+|...|+-++.|||+-.
T Consensus        11 ~~~~Y~~A~~~C~~~s~~LpsS~   33 (101)
T PF07979_consen   11 SRVTYSEAESICQNNSGRLPSSQ   33 (101)
T ss_dssp             CEETHHHHHHHTTTTCCESBSSH
T ss_pred             ceEeHHHHHHHHHhccccCcccH
Confidence            35899999999999999999843


No 37 
>PHA03093 EEV glycoprotein; Provisional
Probab=26.76  E-value=55  Score=28.97  Aligned_cols=20  Identities=30%  Similarity=0.571  Sum_probs=18.8

Q ss_pred             CCHHHHHHHHHHcCCCCCCH
Q psy8678         183 VSWNDAVAYCTWRGARLPTE  202 (394)
Q Consensus       183 Vsw~dA~~yc~wlg~RLPTE  202 (394)
                      .+|+||.+-|.-+|++||..
T Consensus       118 kTf~dA~~~C~~~g~~LPs~  137 (185)
T PHA03093        118 KTFSDAKADCAKKSSTLPNS  137 (185)
T ss_pred             cCHHHHHHHHHhcCCcCCCc
Confidence            78999999999999999984


No 38 
>cd03595 CLECT_chondrolectin_like C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin. CLECT_chondrolectin_like: C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  CHODL is predominantly expressed in muscle cells and is associated with T-cell maturation.  Various alternatively spliced isoforms of CHODL have been identified.  The transmembrane form of CHODL is localized in the ER-Golgi apparatus.  Layilin is widely expressed in different cell types.  The extracellular CTLD of layilin binds hyaluronan (HA), a major constituent of the extracellular matrix (ECM).  The cytoplasmic tail of layilin binds various members of the band 4.1/ERM superfamily (talin, radixin, and merlin).  The ERM proteins are cytoskeleton-membrane l
Probab=25.57  E-value=46  Score=28.16  Aligned_cols=21  Identities=19%  Similarity=0.381  Sum_probs=17.8

Q ss_pred             ccCCHHHHHHHHHHcCCCCCC
Q psy8678         181 VHVSWNDAVAYCTWRGARLPT  201 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RLPT  201 (394)
                      ..++|.+|..+|+.+|..|.+
T Consensus        23 ~~~tw~~A~~~C~~~g~~Las   43 (149)
T cd03595          23 RRLNFEEARQACREDGGELLS   43 (149)
T ss_pred             cccCHHHHHHHHHHcCCEECc
Confidence            468999999999999986654


No 39 
>cd03517 Link_domain_CSPGs_modules_1_3 Link_domain_CSPGs_modules_1_3; this extracellular link domain is found in the first and third link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan. In addition, it is found in the first link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. In addition, aggrecan contains a second globular domain (G2) which contains link modules 3 and 4. G2 appears to lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues.
Probab=24.97  E-value=1.2e+02  Score=24.01  Aligned_cols=40  Identities=18%  Similarity=0.203  Sum_probs=33.3

Q ss_pred             cCCHHHHHHHHHHcCCCCCCHHHHHHHHhcCCCCccccCC
Q psy8678         182 HVSWNDAVAYCTWRGARLPTEAEWEYGCRGGLENRLFPWG  221 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLPTEaEWEyAArg~~~~~~ypwg  221 (394)
                      .+++.+|.+.|.-.|..|.|-++-+.|-+.|.+.=.+-|-
T Consensus        13 ~l~f~eA~~aC~~~ga~lAs~~QL~~Aw~~G~~~C~~GWL   52 (95)
T cd03517          13 ALTFPRAQRACLDISAQIATPEQLLAAYEDGFEQCDAGWL   52 (95)
T ss_pred             eECHHHHHHHHHHcCCEeCCHHHHHHHHHcCcccccCCCC
Confidence            4789999999999999999999998888876665555554


No 40 
>cd03600 CLECT_thrombomodulin_like C-type lectin-like domain (CTLD) of the type found in human thrombomodulin(TM), Endosialin, C14orf27, and C1qR. CLECT_thrombomodulin_like: C-type lectin-like domain (CTLD) of the type found in human thrombomodulin(TM), Endosialin, C14orf27, and C1qR.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  In these thrombomodulin-like proteins the residues involved in coordinating Ca2+ in the classical MBP-A CTLD are not conserved.  TM exerts anti-fibrinolytic and anti-inflammatory activity.  TM also regulates blood coagulation in the anticoagulant protein C pathway.  In this pathway, the procoagulant properties of thrombin (T) are lost when it binds TM.  TM also plays a key role in tumor biology.  It is expressed on endothelial cells and on several type of tumor cell including squamous cell carcinoma.  Loss of TM expression correlates with advanced stage and poor prognosis.  Loss of function of TM func
Probab=23.87  E-value=48  Score=27.59  Aligned_cols=27  Identities=19%  Similarity=0.257  Sum_probs=19.8

Q ss_pred             cCCHHHHHHHHHHcCCCCC---CHHHHHHH
Q psy8678         182 HVSWNDAVAYCTWRGARLP---TEAEWEYG  208 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLP---TEaEWEyA  208 (394)
                      ..+|.+|.++|+.+|..|-   +..|=++.
T Consensus        13 ~~sw~~A~~~C~~~gg~La~i~s~~E~~~v   42 (141)
T cd03600          13 KLTFLEAQRSCIELGGNLATVRSGEEADVV   42 (141)
T ss_pred             ccCHHHHHHHHHhhCCEeeecCCHHHHHHH
Confidence            4889999999999997664   44443333


No 41 
>cd03591 CLECT_collectin_like C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan binding lectin (MBL), and CL-L1 (collectin liver 1). CLECT_collectin_like: C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan binding lectin (MBL), and CL-L1 (collectin liver 1).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. The CTLDs of these collectins bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, or apoptotic cells) and mediate functions associated with killing and phagocytosis.  MBPs recognize high mannose oligosaccharides in a calcium dependent manner, bind to a broad range of pathogens, and trigger cell killing by activating the complement pathway.  MBP also acts directly as an opsonin.  SP-A and SP-D in addition to functioning as host defense components, a
Probab=23.66  E-value=45  Score=26.53  Aligned_cols=27  Identities=26%  Similarity=0.356  Sum_probs=19.4

Q ss_pred             ccCCHHHHHHHHHHcCCCCC---CHHHHHH
Q psy8678         181 VHVSWNDAVAYCTWRGARLP---TEAEWEY  207 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RLP---TEaEWEy  207 (394)
                      ...+|.+|..+|+.+|..|-   ++.|=++
T Consensus         9 ~~~~w~~A~~~C~~~g~~La~i~s~~e~~~   38 (114)
T cd03591           9 EEKNFDDAQKLCSEAGGTLAMPRNAAENAA   38 (114)
T ss_pred             ceeCHHHHHHHHhhcCCEEecCCCHHHHHH
Confidence            34789999999999987553   4444433


No 42 
>cd03588 CLECT_CSPGs C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins. CLECT_CSPGs: C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins (CSPGs) in human and chicken aggrecan, frog brevican, and zebra fish dermacan.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA).  These aggregates contribute to the tissue's load bearing properties.  Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues.  Xenopus brevican is expressed in the notochord and the brain during early embryogenesis.  Zebra fish dermacan is expressed in dermal bones and may play a role in dermal bone development.  CSPGs do contain LINK domain(s) which bind HA.  These LINK domains are considered by one classif
Probab=23.59  E-value=49  Score=26.77  Aligned_cols=28  Identities=29%  Similarity=0.449  Sum_probs=20.9

Q ss_pred             cCCHHHHHHHHHHcCCCCC---CHHHHHHHH
Q psy8678         182 HVSWNDAVAYCTWRGARLP---TEAEWEYGC  209 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RLP---TEaEWEyAA  209 (394)
                      .++|.+|..+|+.+|.+|-   +..|=++.+
T Consensus        19 ~~sw~~A~~~C~~~gg~La~i~s~~e~~fl~   49 (124)
T cd03588          19 RETWEDAERRCREQQGHLSSIVTPEEQEFVN   49 (124)
T ss_pred             ccCHHHHHHHHHhcCCEEeccCCHHHHHHHH
Confidence            4899999999999998773   444544443


No 43 
>cd03603 CLECT_VCBS A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. CLECT_VCBS: A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice.  Bacterial CTLDs within this group are functionally uncharacterized.  Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions.  CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.  CTLDs associate with each other through several different surface
Probab=23.23  E-value=56  Score=26.31  Aligned_cols=30  Identities=17%  Similarity=0.253  Sum_probs=23.4

Q ss_pred             ccCCHHHHHHHHHHcCCCC---CCHHHHHHHHh
Q psy8678         181 VHVSWNDAVAYCTWRGARL---PTEAEWEYGCR  210 (394)
Q Consensus       181 ~~Vsw~dA~~yc~wlg~RL---PTEaEWEyAAr  210 (394)
                      ..++|.+|..+|+..|..|   -++.|.++...
T Consensus         8 ~~~sw~~A~~~C~~~g~~La~I~s~~E~~fv~~   40 (118)
T cd03603           8 GGMTWEAAQTLAESLGGHLVTINSAEENDWLLS   40 (118)
T ss_pred             CCcCHHHHHHHHHHcCCEEcccCCHHHHHHHHH
Confidence            3589999999999999755   56777776654


No 44 
>cd03596 CLECT_tetranectin_like C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF). CLECT_tetranectin_like: C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  TN binds to plasminogen and stimulates activation of plasminogen, playing a key role in the regulation of proteolytic processes.  The TN CTLD binds two calcium ions.  Its calcium free form binds to various kringle-like protein ligands.  Two residues involved in the coordination of calcium are critical for the binding of TN to the fourth kringle (K4) domain of plasminogen (Plg K4).  TN binds the kringle 1-4 form of angiostatin (AST K1-4).  AST K1-4 is a fragment of Plg, commonly found in cancer tissues.  TN inhibits the bin
Probab=22.43  E-value=58  Score=26.54  Aligned_cols=28  Identities=21%  Similarity=0.242  Sum_probs=20.5

Q ss_pred             cCCHHHHHHHHHHcCCCC---CCHHHHHHHH
Q psy8678         182 HVSWNDAVAYCTWRGARL---PTEAEWEYGC  209 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RL---PTEaEWEyAA  209 (394)
                      ..+|.+|..+|+.+|.+|   =++.|-++..
T Consensus        18 ~~~w~~A~~~C~~~g~~La~i~s~~e~~~l~   48 (129)
T cd03596          18 TKHYHEASEDCIARGGTLATPRDSDENDALR   48 (129)
T ss_pred             cCCHHHHHHHHHhcCCeEecCCCHHHHHHHH
Confidence            468999999999998755   4455655443


No 45 
>cd00037 CLECT C-type lectin (CTL)/C-type lectin-like (CTLD) domain. CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs.  Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice.  Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis;  P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initia
Probab=20.56  E-value=1e+02  Score=23.30  Aligned_cols=30  Identities=27%  Similarity=0.570  Sum_probs=23.3

Q ss_pred             cCCHHHHHHHHHHcCCCC---CCHHHHHHHHhc
Q psy8678         182 HVSWNDAVAYCTWRGARL---PTEAEWEYGCRG  211 (394)
Q Consensus       182 ~Vsw~dA~~yc~wlg~RL---PTEaEWEyAArg  211 (394)
                      .++|.+|.++|+-.|.+|   .+..|.++-..-
T Consensus         9 ~~~~~~A~~~C~~~~~~L~~~~~~~e~~~i~~~   41 (116)
T cd00037           9 KLTWEEAQEYCRSLGGHLASIHSEEENDFLASL   41 (116)
T ss_pred             ccCHHHHHHHHHHcCCEEcccCCHHHHHHHHHH
Confidence            589999999999998765   445777766653


Done!