Query         006966
Match_columns 623
No_of_seqs    249 out of 1578
Neff          5.4 
Searched_HMMs 46136
Date          Thu Mar 28 17:02:14 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/006966.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/006966hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF00332 Glyco_hydro_17:  Glyco 100.0 8.2E-65 1.8E-69  532.0  13.6  284   26-346     1-306 (310)
  2 smart00768 X8 Possibly involve  99.9 2.9E-28 6.3E-33  210.9   7.8   84  420-504     1-85  (85)
  3 PF07983 X8:  X8 domain;  Inter  99.9 1.8E-22   4E-27  172.0   5.2   72  420-491     1-78  (78)
  4 PF00332 Glyco_hydro_17:  Glyco  99.1 3.5E-12 7.7E-17  134.8  -1.8  118  235-382   184-302 (310)
  5 COG5309 Exo-beta-1,3-glucanase  98.6 2.1E-07 4.7E-12   95.5   9.7  214   24-289    44-266 (305)
  6 PF03198 Glyco_hydro_72:  Gluca  94.5   0.061 1.3E-06   57.4   5.7  126   26-167    30-180 (314)
  7 KOG0260 RNA polymerase II, lar  83.0     7.1 0.00015   48.3  10.3   12  105-116  1031-1042(1605)
  8 KOG0260 RNA polymerase II, lar  73.8      22 0.00048   44.3  10.7   33  144-177   953-985 (1605)
  9 COG1671 Uncharacterized protei  55.4      16 0.00036   35.4   4.2   77   77-160    20-119 (150)
 10 COG5309 Exo-beta-1,3-glucanase  52.7     6.9 0.00015   41.4   1.3   65  299-380   231-303 (305)
 11 COG3889 Predicted solute bindi  48.8      15 0.00032   43.9   3.2   28  142-171   173-200 (872)
 12 PRK00124 hypothetical protein;  46.4      25 0.00054   34.2   3.9   58   99-160    50-120 (151)
 13 KOG1924 RhoA GTPase effector D  42.5      83  0.0018   38.1   7.9   24  433-456   424-447 (1102)
 14 PF00925 GTP_cyclohydro2:  GTP   38.5      11 0.00023   37.0   0.1   35   45-87    131-165 (169)
 15 PF07462 MSP1_C:  Merozoite sur  32.6      98  0.0021   35.9   6.4    8  467-474   247-254 (574)
 16 PF06508 QueC:  Queuosine biosy  29.0 1.6E+02  0.0035   29.8   6.7  122   23-171    26-159 (209)
 17 PF13756 Stimulus_sens_1:  Stim  27.7      45 0.00097   30.5   2.3   33   41-77      2-35  (112)
 18 PF05283 MGC-24:  Multi-glycosy  27.2   2E+02  0.0043   29.1   6.8   13  604-616   163-175 (186)
 19 PF07172 GRP:  Glycine rich pro  27.0      44 0.00096   30.0   2.0    9   15-23     15-23  (95)
 20 PRK12485 bifunctional 3,4-dihy  26.4      36 0.00077   37.7   1.6   33   45-86    330-362 (369)
 21 COG3889 Predicted solute bindi  23.7      52  0.0011   39.5   2.4   33  186-218   321-358 (872)
 22 PRK10629 EnvZ/OmpR regulon mod  22.7 2.4E+02  0.0052   26.6   6.2   36   23-64     35-70  (127)
 23 cd02875 GH18_chitobiase Chitob  22.7 1.6E+02  0.0034   32.2   5.7  102   61-169    57-159 (358)
 24 PRK00393 ribA GTP cyclohydrola  22.6      40 0.00087   33.8   1.1   33   46-86    134-166 (197)
 25 PHA03291 envelope glycoprotein  22.3 1.8E+02  0.0039   32.2   5.9   15  601-615   289-303 (401)
 26 TIGR00505 ribA GTP cyclohydrol  22.3      40 0.00086   33.7   1.0   33   46-86    131-163 (191)
 27 cd06156 eu_AANH_C_2 A group of  21.8 1.1E+02  0.0023   28.1   3.6   31  138-168    29-59  (118)

No 1  
>PF00332 Glyco_hydro_17:  Glycosyl hydrolases family 17;  InterPro: IPR000490 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 17 GH17 from CAZY comprises enzymes with several known activities; endo-1,3-beta-glucosidase (3.2.1.39 from EC); lichenase (3.2.1.73 from EC); exo-1,3-glucanase (3.2.1.58 from EC). Currently these enzymes have only been found in plants and in fungi. ; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 1AQ0_B 1GHR_A 1GHS_B 2CYG_A 3UR8_A 3UR7_B 3EM5_C 3F55_D.
Probab=100.00  E-value=8.2e-65  Score=531.99  Aligned_cols=284  Identities=23%  Similarity=0.270  Sum_probs=217.3

Q ss_pred             eeEeccCCCCCCCCCChhhhhccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEEEecccCchhHHHhhcChHH
Q 006966           26 VGFAFNGRENTSAASSTSEVTSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSSVDLYLNLSLVVDLMQSELS  105 (623)
Q Consensus        26 iGVnYG~~gdnL~lPsP~~vv~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~V~v~vpN~~i~~la~s~~~  105 (623)
                      |||||||+||||  |+|.+||+|    ||+++|  +||||||+|+++||  ||+||||+|+|+  |||++|++|++++..
T Consensus         1 iGvnyG~~~~nl--p~p~~vv~l----~ks~~i--~~vri~d~~~~iL~--a~a~S~i~v~v~--vpN~~l~~la~~~~~   68 (310)
T PF00332_consen    1 IGVNYGRVGNNL--PSPCKVVSL----LKSNGI--TKVRIYDADPSILR--AFAGSGIEVMVG--VPNEDLASLASSQSA   68 (310)
T ss_dssp             EEEEE---SSS-----HHHHHHH----HHHTT----EEEESS--HHHHH--HHTTS--EEEEE--E-GGGHHHHHHHHHH
T ss_pred             CeEeccCccCCC--CCHHHHHHH----HHhccc--ccEEeecCcHHHHH--HHhcCCceeeec--cChHHHHHhccCHHH
Confidence            899999999998  999999999    999998  99999999999999  999999999995  699999999999999


Q ss_pred             HHHHHHhhccCCCCCccEEEEEccCccccccCCCchhhHHHHHHHHHHHHHhCCCCCceEEcccCCcCcccccCCCc---
Q 006966          106 AISWLETNVLTTHPHVNIKSIILSCSSEEFEGKNVLPLILSALKSFHSALNRIHLDMKVKVSVAFPLPLLENLNTSH---  182 (623)
Q Consensus       106 A~~WV~~NV~py~p~t~I~~I~VGnenE~~~~~~~~~~LvPAM~Nih~AL~~~gL~~~IKVSTp~s~~vL~~s~~~~---  182 (623)
                      |..||++||++|+|+|||++|+||  ||++...... .|||||+|||+||.++||+++|||||+|+|++|.+++|++   
T Consensus        69 A~~Wv~~nv~~~~~~~~i~~i~VG--nEv~~~~~~~-~lvpAm~ni~~aL~~~~L~~~IkVst~~~~~vl~~s~PPS~g~  145 (310)
T PF00332_consen   69 AGSWVRTNVLPYLPAVNIRYIAVG--NEVLTGTDNA-YLVPAMQNIHNALTAAGLSDQIKVSTPHSMDVLSNSFPPSAGV  145 (310)
T ss_dssp             HHHHHHHHTCTCTTTSEEEEEEEE--ES-TCCSGGG-GHHHHHHHHHHHHHHTT-TTTSEEEEEEEGGGEEE-SSGGG-E
T ss_pred             HhhhhhhcccccCcccceeeeecc--cccccCccce-eeccHHHHHHHHHHhcCcCCcceeccccccccccccCCCccCc
Confidence            999999999999999999999999  8888642222 8999999999999999999999999999999999998874   


Q ss_pred             -c----hhHHHHHHHHhhcCCeeEEeeC-------CCCCcccccccccccccccccCCCCC-CCchHHHHH---HHhcCC
Q 006966          183 -E----GEIGLIFGYIKKTGSVVIIEAG-------IDGKLSMAEVLVQPLLKKAIKATSIL-PDSDILIDL---VMKSPL  246 (623)
Q Consensus       183 -~----~~i~plL~FL~~T~SPfmVNvY-------~~~~i~LdyALF~~~~d~v~~a~~~L-~~~Da~LDa---Al~~~G  246 (623)
                       +    ..|+|||+||++|+||||||+|       ++.+++|+|||||++. .+.|..-.+ ++||+++|+   ||++.|
T Consensus       146 F~~~~~~~~~~~l~fL~~t~spf~vN~yPyfa~~~~~~~~~l~yAlf~~~~-~~~D~~~~y~nlfDa~~da~~~a~~~~g  224 (310)
T PF00332_consen  146 FRSDIASVMDPLLKFLDGTNSPFMVNVYPYFAYQNNPQNISLDYALFQPNS-GVVDGGLAYTNLFDAMVDAVYAAMEKLG  224 (310)
T ss_dssp             ESHHHHHHHHHHHHHHHHHT--EEEE--HHHHHHHSTTTS-HHHHTT-SSS--SEETTEEESSHHHHHHHHHHHHHHTTT
T ss_pred             ccccchhhhhHHHHHhhccCCCceeccchhhhccCCcccCCcccccccccc-cccccchhhhHHHHHHHHHHHHHHHHhC
Confidence             2    3589999999999999999999       6789999999999873 444542111 579999997   699999


Q ss_pred             CCCcccccchhhhhccccCccchhhHHHHhHHhhhhhcccCCchhhhhhhcCcccccCCCCccccCCC---CCCCCCCCc
Q 006966          247 VPDAKQVAEFTEIVSKFFENNSQIDELYADVASSMGEFVQKGLKVVRRLQNSLKTSIHDTTIFPTTPV---PPDNKPTPT  323 (623)
Q Consensus       247 ~p~~~vV~eeTGwps~~~~daa~~da~~Aav~~nA~~yn~~li~~~~~l~~~~gTp~v~etgwPt~~~---~ed~kpgpt  323 (623)
                      +++++++..|||||+++.        ..++. +||++|+++++++.     ..||+.+++.+++.+.|   +|+.|++..
T Consensus       225 ~~~~~vvv~ETGWPs~G~--------~~a~~-~nA~~~~~nl~~~~-----~~gt~~~~~~~~~~y~F~~FdE~~K~~~~  290 (310)
T PF00332_consen  225 FPNVPVVVGETGWPSAGD--------PGATP-ENAQAYNQNLIKHV-----LKGTPLRPGNGIDVYIFEAFDENWKPGPE  290 (310)
T ss_dssp             -TT--EEEEEE---SSSS--------TTCSH-HHHHHHHHHHHHHC-----CGBBSSSBSS---EEES-SB--TTSSSSG
T ss_pred             CCCceeEEeccccccCCC--------CCCCc-chhHHHHHHHHHHH-----hCCCcccCCCCCeEEEEEEecCcCCCCCc
Confidence            999999777899999965        24566 88999999988874     27898998888777776   899998876


Q ss_pred             ccccCCCCCceecCCCCCCCCCC
Q 006966          324 IVTVPATNPVTVSPANPSGTPLP  346 (623)
Q Consensus       324 ~~~ay~~n~~~~gLf~pdGTPvY  346 (623)
                      .       |||||||++|++|+|
T Consensus       291 ~-------E~~wGlf~~d~~~ky  306 (310)
T PF00332_consen  291 V-------ERHWGLFYPDGTPKY  306 (310)
T ss_dssp             G-------GGG--SB-TTSSBSS
T ss_pred             c-------cceeeeECCCCCeec
Confidence            5       889999999999999


No 2  
>smart00768 X8 Possibly involved in carbohydrate binding. The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges.
Probab=99.95  E-value=2.9e-28  Score=210.91  Aligned_cols=84  Identities=67%  Similarity=1.198  Sum_probs=80.7

Q ss_pred             ceeEecCCCChHHHHhhhhcccccCCCCCCccCCCCCcCCCCChhhhHhHHHHHHHhhCC-CCCCCCCCCceEEEecCCC
Q 006966          420 SWCVAKNGVSETAIQQALDYACGIGGADCSLIQQGASCYNPNTLQNHASFAFNSYYQKNP-SPTSCDFGGTAMIVNTNPS  498 (623)
Q Consensus       420 ~wCVak~~~~~~~l~~~ldyaCg~~~~dCs~I~~gg~cy~p~t~~~~aSyAfN~Yyq~~~-~~~sCdF~G~A~ltt~dpS  498 (623)
                      +|||+|+++++++||++||||||++ +||++|++||+||+||++++|||||||+|||+++ ..++|||+|.|++++.|||
T Consensus         1 ~wCv~~~~~~~~~l~~~~~yaCg~~-~dC~~I~~~g~c~~~~~~~~~aS~a~N~YYq~~~~~~~aC~F~G~a~~~~~~ps   79 (85)
T smart00768        1 LWCVAKPDADEAALQAALDYACGQG-ADCTAIQPGGSCYSPNTVKAHASYAFNSYYQKQGQSSGACDFGGTATITTTDPS   79 (85)
T ss_pred             CccccCCCCCHHHHHHHHHHHhcCC-CCccccCCCCcccCCCCHHHHHHHHHHHHHHHcCCCCCcCCCCCceEEEecCCC
Confidence            4999999999999999999999986 9999999999999999999999999999999987 5899999999999999999


Q ss_pred             CCCeee
Q 006966          499 TGSCVF  504 (623)
Q Consensus       499 ~~~C~~  504 (623)
                      +++|+|
T Consensus        80 ~~~C~~   85 (85)
T smart00768       80 TGSCKF   85 (85)
T ss_pred             CCccCC
Confidence            999986


No 3  
>PF07983 X8:  X8 domain;  InterPro: IPR012946 The X8 domain [] contains 6 conserved cysteine residues that presumably form three disulphide bridges. The domain is found in an Olive pollen allergen [] as well as at the C terminus of family 17 glycosyl hydrolases []. This domain may be involved in carbohydrate binding.; PDB: 2JON_A 2W61_A 2W62_A 2W63_A.
Probab=99.86  E-value=1.8e-22  Score=172.05  Aligned_cols=72  Identities=47%  Similarity=0.938  Sum_probs=61.5

Q ss_pred             ceeEecCCCChHHHHhhhhcccccCCCCCCccCCCCC-----cCCCCChhhhHhHHHHHHHhhCC-CCCCCCCCCceE
Q 006966          420 SWCVAKNGVSETAIQQALDYACGIGGADCSLIQQGAS-----CYNPNTLQNHASFAFNSYYQKNP-SPTSCDFGGTAM  491 (623)
Q Consensus       420 ~wCVak~~~~~~~l~~~ldyaCg~~~~dCs~I~~gg~-----cy~p~t~~~~aSyAfN~Yyq~~~-~~~sCdF~G~A~  491 (623)
                      +|||+|+++++++||++|||||+++++||++|++||+     .|++|+.++|||||||+|||+++ ...+|||+|+||
T Consensus         1 l~Cv~~~~~~~~~l~~~l~~aC~~~~~dC~~I~~~g~~G~YG~~S~C~~~~~lSya~N~YY~~~~~~~~~C~F~G~at   78 (78)
T PF07983_consen    1 LWCVAKPDADDKELQDLLDYACGQGGVDCSPIQPNGTTGVYGAYSMCSPRQHLSYAFNQYYQKQGRNSSACDFSGNAT   78 (78)
T ss_dssp             -EEEE-TTS-HHHHHHHHHHHTTT-SSSCCCC-EETTTTEE-TTTTS-CCHHHHHHHHHHHHHHTSSCCG-SS-STEE
T ss_pred             CcceeCCCCCHHHHHHHHHHHHcCCCCChhhhCCCCcccccccccCCCHHHHHHHHHHHHHHHcCCCCCcCCCCCCCC
Confidence            5999999999999999999999998899999999999     89999999999999999999986 589999999997


No 4  
>PF00332 Glyco_hydro_17:  Glycosyl hydrolases family 17;  InterPro: IPR000490 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 17 GH17 from CAZY comprises enzymes with several known activities; endo-1,3-beta-glucosidase (3.2.1.39 from EC); lichenase (3.2.1.73 from EC); exo-1,3-glucanase (3.2.1.58 from EC). Currently these enzymes have only been found in plants and in fungi. ; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 1AQ0_B 1GHR_A 1GHS_B 2CYG_A 3UR8_A 3UR7_B 3EM5_C 3F55_D.
Probab=99.14  E-value=3.5e-12  Score=134.77  Aligned_cols=118  Identities=20%  Similarity=0.249  Sum_probs=76.0

Q ss_pred             hHHHHHHHhcCCCCCcccccchhhhhccccCccchhhHHHHhHHhhhhhcccCCchhhhhhhcCcccccCCCCccccCCC
Q 006966          235 DILIDLVMKSPLVPDAKQVAEFTEIVSKFFENNSQIDELYADVASSMGEFVQKGLKVVRRLQNSLKTSIHDTTIFPTTPV  314 (623)
Q Consensus       235 Da~LDaAl~~~G~p~~~vV~eeTGwps~~~~daa~~da~~Aav~~nA~~yn~~li~~~~~l~~~~gTp~v~etgwPt~~~  314 (623)
                      ++.||+|+++...   .++|.  |..|.+++| +++|++++++ ++++   .+.+++           +++|+|||+.+.
T Consensus       184 ~~~l~yAlf~~~~---~~~D~--~~~y~nlfD-a~~da~~~a~-~~~g---~~~~~v-----------vv~ETGWPs~G~  242 (310)
T PF00332_consen  184 NISLDYALFQPNS---GVVDG--GLAYTNLFD-AMVDAVYAAM-EKLG---FPNVPV-----------VVGETGWPSAGD  242 (310)
T ss_dssp             TS-HHHHTT-SSS----SEET--TEEESSHHH-HHHHHHHHHH-HTTT----TT--E-----------EEEEE---SSSS
T ss_pred             cCCcccccccccc---ccccc--chhhhHHHH-HHHHHHHHHH-HHhC---CCCcee-----------EEeccccccCCC
Confidence            5789999998542   23343  778999999 9999999999 5443   333443           789999999987


Q ss_pred             CCCCCCCCcccccCCCCCceecCCCCCCCCCCCCCCCcccc-CCCCCCCCCCCCCCCCCCCCCccCCCC
Q 006966          315 PPDNKPTPTIVTVPATNPVTVSPANPSGTPLPIPSTTPVNI-PPATPVNPAAPVTNPATIPAPVTVPGG  382 (623)
Q Consensus       315 ~ed~kpgpt~~~ay~~n~~~~gLf~pdGTPvYp~~~~~v~i-~lfne~~kpgp~se~~~~~~Gl~~p~g  382 (623)
                      .   .+...++..|++|.+.+ +.  .|||.+|+..+++|+ ++|||+.|+++..|||   ||||++|+
T Consensus       243 ~---~a~~~nA~~~~~nl~~~-~~--~gt~~~~~~~~~~y~F~~FdE~~K~~~~~E~~---wGlf~~d~  302 (310)
T PF00332_consen  243 P---GATPENAQAYNQNLIKH-VL--KGTPLRPGNGIDVYIFEAFDENWKPGPEVERH---WGLFYPDG  302 (310)
T ss_dssp             T---TCSHHHHHHHHHHHHHH-CC--GBBSSSBSS---EEES-SB--TTSSSSGGGGG-----SB-TTS
T ss_pred             C---CCCcchhHHHHHHHHHH-Hh--CCCcccCCCCCeEEEEEEecCcCCCCCcccce---eeeECCCC
Confidence            2   23445566688886542 22  899999999999996 9999999999988888   99999975


No 5  
>COG5309 Exo-beta-1,3-glucanase [Carbohydrate transport and metabolism]
Probab=98.58  E-value=2.1e-07  Score=95.48  Aligned_cols=214  Identities=13%  Similarity=0.099  Sum_probs=137.8

Q ss_pred             ceeeEeccCCCCCCCCCChhhhhc---cCccccccCCCCCCcEEEecCCc----cccchhhhcCCCceEEEecccCchhH
Q 006966           24 TLVGFAFNGRENTSAASSTSEVTS---FDSLGLKLDNVPSQRIRVYVANH----RVLNFSSLLNSNASSSVDLYLNLSLV   96 (623)
Q Consensus        24 ~~iGVnYG~~gdnL~lPsP~~vv~---l~~~~lk~~~i~~~~VRiyDadp----~vL~~~AlanTgI~v~V~v~vpN~~i   96 (623)
                      +..+++||..-++-.-++.+++..   +    |++..   ..||+|..|-    .|+.  |..-.|++|.++|. +-+++
T Consensus        44 g~~~f~l~~~n~dGtCKSa~~~~sDLe~----l~~~t---~~IR~Y~sDCn~le~v~p--Aa~~~g~kv~lGiw-~tdd~  113 (305)
T COG5309          44 GFLAFTLGPYNDDGTCKSADQVASDLEL----LASYT---HSIRTYGSDCNTLENVLP--AAEASGFKVFLGIW-PTDDI  113 (305)
T ss_pred             cccceeccccCCCCCCcCHHHHHhHHHH----hccCC---ceEEEeeccchhhhhhHH--HHHhcCceEEEEEe-eccch
Confidence            357899998766655588888854   5    66654   3899996443    5888  99999999999996 44455


Q ss_pred             HHhhcChHHHHHHHHhhccCCCCCccEEEEEccCccccccCCCc-hhhHHHHHHHHHHHHHhCCCCCceEEcccCCcCcc
Q 006966           97 VDLMQSELSAISWLETNVLTTHPHVNIKSIILSCSSEEFEGKNV-LPLILSALKSFHSALNRIHLDMKVKVSVAFPLPLL  175 (623)
Q Consensus        97 ~~la~s~~~A~~WV~~NV~py~p~t~I~~I~VGnenE~~~~~~~-~~~LvPAM~Nih~AL~~~gL~~~IKVSTp~s~~vL  175 (623)
                      ..      +.+.=++.-+++|..--.|+.|.||  ||.+...+. ..+|.-=+..++.+|.++|.+  .+|.|.-...++
T Consensus       114 ~~------~~~~til~ay~~~~~~d~v~~v~VG--nEal~r~~~tasql~~~I~~vrsav~~agy~--gpV~T~dsw~~~  183 (305)
T COG5309         114 HD------AVEKTILSAYLPYNGWDDVTTVTVG--NEALNRNDLTASQLIEYIDDVRSAVKEAGYD--GPVTTVDSWNVV  183 (305)
T ss_pred             hh------hHHHHHHHHHhccCCCCceEEEEec--hhhhhcCCCCHHHHHHHHHHHHHHHHhcCCC--Cceeecccceee
Confidence            43      2333455667888776789999999  788764443 677888899999999999985  678988777777


Q ss_pred             cccCCCcchhHHHHHHHHhhcCCeeEEeeCCCCCcccccccccccccccccCCCCCCCchHHHHHHHhcCCCCC-ccccc
Q 006966          176 ENLNTSHEGEIGLIFGYIKKTGSVVIIEAGIDGKLSMAEVLVQPLLKKAIKATSILPDSDILIDLVMKSPLVPD-AKQVA  254 (623)
Q Consensus       176 ~~s~~~~~~~i~plL~FL~~T~SPfmVNvY~~~~i~LdyALF~~~~d~v~~a~~~L~~~Da~LDaAl~~~G~p~-~~vV~  254 (623)
                      .+ +|.    +-.--+       +.|+|+         ++.||.+.  +-++..  .+.-.+|.. + +...|. +.++.
T Consensus       184 ~~-np~----l~~~SD-------fia~N~---------~aYwd~~~--~a~~~~--~f~~~q~e~-v-qsa~g~~k~~~v  236 (305)
T COG5309         184 IN-NPE----LCQASD-------FIAANA---------HAYWDGQT--VANAAG--TFLLEQLER-V-QSACGTKKTVWV  236 (305)
T ss_pred             eC-ChH----Hhhhhh-------hhhccc---------chhccccc--hhhhhh--HHHHHHHHH-H-HHhcCCCccEEE
Confidence            65 322    111112       345665         34466541  111110  111112221 1 112233 56655


Q ss_pred             chhhhhccccCccchhhHHHHhHHhhhhhcccCCc
Q 006966          255 EFTEIVSKFFENNSQIDELYADVASSMGEFVQKGL  289 (623)
Q Consensus       255 eeTGwps~~~~daa~~da~~Aav~~nA~~yn~~li  289 (623)
                      .|||||+++.    ++.+.++++ +|+..|-++.+
T Consensus       237 ~EtGWPS~G~----~~G~a~pS~-anq~~~~~~i~  266 (305)
T COG5309         237 TETGWPSDGR----TYGSAVPSV-ANQKIAVQEIL  266 (305)
T ss_pred             eeccCCCCCC----ccCCcCCCh-hHHHHHHHHHH
Confidence            6799999974    444556777 88877765443


No 6  
>PF03198 Glyco_hydro_72:  Glucanosyltransferase;  InterPro: IPR004886 This family is a group of yeast glycolipid proteins anchored to the membrane. It includes Candida albicans (Yeast) pH-regulated protein, which is required for apical growth and plays a role in morphogenesis and Saccharomyces cerevisiae glycolipid anchored surface protein.; PDB: 2W61_A 2W62_A 2W63_A.
Probab=94.48  E-value=0.061  Score=57.45  Aligned_cols=126  Identities=16%  Similarity=0.215  Sum_probs=70.8

Q ss_pred             eeEeccCCCCC------CCCCChhhhh----ccCccccccCCCCCCcEEEecCCcc-----ccchhhhcCCCceEEEecc
Q 006966           26 VGFAFNGRENT------SAASSTSEVT----SFDSLGLKLDNVPSQRIRVYVANHR-----VLNFSSLLNSNASSSVDLY   90 (623)
Q Consensus        26 iGVnYG~~gdn------L~lPsP~~vv----~l~~~~lk~~~i~~~~VRiyDadp~-----vL~~~AlanTgI~v~V~v~   90 (623)
                      .||.|=..++.      -||-.+ ++-    .+    ||+.||  .-||+|.-||+     -++  +|+..||-|+++|.
T Consensus        30 kGVaYQp~~~~~~~~~~DPLad~-~~C~rDi~~----l~~Lgi--NtIRVY~vdp~~nHd~CM~--~~~~aGIYvi~Dl~  100 (314)
T PF03198_consen   30 KGVAYQPGGSSEPSNYIDPLADP-EACKRDIPL----LKELGI--NTIRVYSVDPSKNHDECMS--AFADAGIYVILDLN  100 (314)
T ss_dssp             EEEE----------SS--GGG-H-HHHHHHHHH----HHHHT---SEEEES---TTS--HHHHH--HHHHTT-EEEEES-
T ss_pred             eeEEcccCCCCCCccCcCcccCH-HHHHHhHHH----HHHcCC--CEEEEEEeCCCCCHHHHHH--HHHhCCCEEEEecC
Confidence            68988665551      111222 232    37    888887  99999977764     588  99999999999998


Q ss_pred             cCchhHHHhhcChHHHHHHHH-------hhccCCCCCccEEEEEccCccccccC---CCchhhHHHHHHHHHHHHHhCCC
Q 006966           91 LNLSLVVDLMQSELSAISWLE-------TNVLTTHPHVNIKSIILSCSSEEFEG---KNVLPLILSALKSFHSALNRIHL  160 (623)
Q Consensus        91 vpN~~i~~la~s~~~A~~WV~-------~NV~py~p~t~I~~I~VGnenE~~~~---~~~~~~LvPAM~Nih~AL~~~gL  160 (623)
                      .|+..|.+-.    -+..|=.       .-|-.|..-.|.-...+|  |||+..   ....+.+--|.|-+++=+++.+.
T Consensus       101 ~p~~sI~r~~----P~~sw~~~l~~~~~~vid~fa~Y~N~LgFf~G--NEVin~~~~t~aap~vKAavRD~K~Yi~~~~~  174 (314)
T PF03198_consen  101 TPNGSINRSD----PAPSWNTDLLDRYFAVIDAFAKYDNTLGFFAG--NEVINDASNTNAAPYVKAAVRDMKAYIKSKGY  174 (314)
T ss_dssp             BTTBS--TTS----------HHHHHHHHHHHHHHTT-TTEEEEEEE--ESSS-STT-GGGHHHHHHHHHHHHHHHHHSSS
T ss_pred             CCCccccCCC----CcCCCCHHHHHHHHHHHHHhccCCceEEEEec--ceeecCCCCcccHHHHHHHHHHHHHHHHhcCC
Confidence            8877776532    2344521       123333333577888999  777753   24578888889999999998887


Q ss_pred             CCceEEc
Q 006966          161 DMKVKVS  167 (623)
Q Consensus       161 ~~~IKVS  167 (623)
                       ++|-|.
T Consensus       175 -R~IPVG  180 (314)
T PF03198_consen  175 -RSIPVG  180 (314)
T ss_dssp             -----EE
T ss_pred             -CCCcee
Confidence             557776


No 7  
>KOG0260 consensus RNA polymerase II, large subunit [Transcription]
Probab=83.00  E-value=7.1  Score=48.29  Aligned_cols=12  Identities=17%  Similarity=0.493  Sum_probs=7.0

Q ss_pred             HHHHHHHhhccC
Q 006966          105 SAISWLETNVLT  116 (623)
Q Consensus       105 ~A~~WV~~NV~p  116 (623)
                      .|-+||-.||-.
T Consensus      1031 eaf~w~~~~Ie~ 1042 (1605)
T KOG0260|consen 1031 EAFEWVLGEIEA 1042 (1605)
T ss_pred             HHHHHHhhhhhh
Confidence            456666666543


No 8  
>KOG0260 consensus RNA polymerase II, large subunit [Transcription]
Probab=73.75  E-value=22  Score=44.32  Aligned_cols=33  Identities=15%  Similarity=0.265  Sum_probs=23.3

Q ss_pred             HHHHHHHHHHHHHhCCCCCceEEcccCCcCcccc
Q 006966          144 ILSALKSFHSALNRIHLDMKVKVSVAFPLPLLEN  177 (623)
Q Consensus       144 LvPAM~Nih~AL~~~gL~~~IKVSTp~s~~vL~~  177 (623)
                      -.-.-+-|++||...+++ .+|++--+...|+..
T Consensus       953 p~n~~r~I~Na~~~f~~~-~r~~t~l~~~~v~~g  985 (1605)
T KOG0260|consen  953 PENLQRIIWNALKKFSID-ERKPTDLIPFKVVKG  985 (1605)
T ss_pred             chhHHHHHHHHHhhcccc-cccccccchhhhhhh
Confidence            334456788888888885 488887777777644


No 9  
>COG1671 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=55.44  E-value=16  Score=35.40  Aligned_cols=77  Identities=10%  Similarity=0.028  Sum_probs=44.9

Q ss_pred             hhcCCCceEEEecccCchhHHH----------hhcChHHHHHHHHhhccCCCCCccEE-------------EEEccCccc
Q 006966           77 SLLNSNASSSVDLYLNLSLVVD----------LMQSELSAISWLETNVLTTHPHVNIK-------------SIILSCSSE  133 (623)
Q Consensus        77 AlanTgI~v~V~v~vpN~~i~~----------la~s~~~A~~WV~~NV~py~p~t~I~-------------~I~VGnenE  133 (623)
                      +-.-.|++|++   |-|.-+..          +.+-.++|+.|+.++.-+.  ++-|+             .++.+.+-+
T Consensus        20 ~A~r~~~~v~~---Van~~~~~~~~~~i~~v~V~~g~DaaD~~Iv~~a~~g--DlVVT~Di~LA~~ll~kg~~v~~prGr   94 (150)
T COG1671          20 VAERMGLKVTF---VANFPHRVPPSPEIRTVVVDAGFDAADDWIVNLAEKG--DLVVTADIPLASLLLDKGAAVLNPRGR   94 (150)
T ss_pred             HHHHhCCeEEE---EeCCCccCCCCCceeEEEecCCcchHHHHHHHhCCCC--CEEEECchHHHHHHHhcCCEEECCCCc
Confidence            33446666665   44544431          3345689999999987776  32232             222232224


Q ss_pred             cccCCCchhhHHHHHHHHHHHHHhCCC
Q 006966          134 EFEGKNVLPLILSALKSFHSALNRIHL  160 (623)
Q Consensus       134 ~~~~~~~~~~LvPAM~Nih~AL~~~gL  160 (623)
                      ++...++  ..+=+|++|+.-|++.|.
T Consensus        95 ~y~~~nI--~~~L~~R~~~~~lR~~G~  119 (150)
T COG1671          95 LYTEENI--GERLAMRDFMAKLRRQGK  119 (150)
T ss_pred             ccCHhHH--HHHHHHHHHHHHHHHhcc
Confidence            4422222  244589999999999986


No 10 
>COG5309 Exo-beta-1,3-glucanase [Carbohydrate transport and metabolism]
Probab=52.74  E-value=6.9  Score=41.43  Aligned_cols=65  Identities=12%  Similarity=-0.059  Sum_probs=36.9

Q ss_pred             cccccCCCCccccCCCCC-CCCCCCcccccCCC----CCceecCCCCCCCCCCCCCCCcccc-CCCCCCCCCCC--CCCC
Q 006966          299 LKTSIHDTTIFPTTPVPP-DNKPTPTIVTVPAT----NPVTVSPANPSGTPLPIPSTTPVNI-PPATPVNPAAP--VTNP  370 (623)
Q Consensus       299 ~gTp~v~etgwPt~~~~e-d~kpgpt~~~ay~~----n~~~~gLf~pdGTPvYp~~~~~v~i-~lfne~~kpgp--~se~  370 (623)
                      +++..+.|+|||+.+... ...|.+.+...|-+    +.|.+|              +++++ +-|||+=|.-.  .-|+
T Consensus       231 ~k~~~v~EtGWPS~G~~~G~a~pS~anq~~~~~~i~~~~~~~G--------------~d~fvfeAFdd~WK~~~~y~VEk  296 (305)
T COG5309         231 KKTVWVTETGWPSDGRTYGSAVPSVANQKIAVQEILNALRSCG--------------YDVFVFEAFDDDWKADGSYGVEK  296 (305)
T ss_pred             CccEEEeeccCCCCCCccCCcCCChhHHHHHHHHHHhhhhccC--------------ccEEEeeeccccccCccccchhh
Confidence            466789999999998732 22333333222211    222222              26664 78999877433  2355


Q ss_pred             CCCCCCccCC
Q 006966          371 ATIPAPVTVP  380 (623)
Q Consensus       371 ~~~~~Gl~~p  380 (623)
                      +   ||++..
T Consensus       297 y---wGv~~s  303 (305)
T COG5309         297 Y---WGVLSS  303 (305)
T ss_pred             c---eeeecc
Confidence            5   888743


No 11 
>COG3889 Predicted solute binding protein [General function prediction only]
Probab=48.76  E-value=15  Score=43.92  Aligned_cols=28  Identities=14%  Similarity=0.039  Sum_probs=18.2

Q ss_pred             hhHHHHHHHHHHHHHhCCCCCceEEcccCC
Q 006966          142 PLILSALKSFHSALNRIHLDMKVKVSVAFP  171 (623)
Q Consensus       142 ~~LvPAM~Nih~AL~~~gL~~~IKVSTp~s  171 (623)
                      +.+=.||..+.+-+.+-|+  .+|+-++.-
T Consensus       173 q~veq~m~e~~a~~~~~G~--~~~~~~~~~  200 (872)
T COG3889         173 QYVEQAMAELNAEYMAKGL--WYKVGKPDD  200 (872)
T ss_pred             HHHHHHHHHHHHHHhhcCc--EEEecCCCc
Confidence            4456677777666666665  477777655


No 12 
>PRK00124 hypothetical protein; Validated
Probab=46.40  E-value=25  Score=34.22  Aligned_cols=58  Identities=14%  Similarity=0.086  Sum_probs=34.7

Q ss_pred             hhcChHHHHHHHHhhccCCCCCccEE------EEEccC-------ccccccCCCchhhHHHHHHHHHHHHHhCCC
Q 006966           99 LMQSELSAISWLETNVLTTHPHVNIK------SIILSC-------SSEEFEGKNVLPLILSALKSFHSALNRIHL  160 (623)
Q Consensus        99 la~s~~~A~~WV~~NV~py~p~t~I~------~I~VGn-------enE~~~~~~~~~~LvPAM~Nih~AL~~~gL  160 (623)
                      +.+..++|+.|+.+++.+-  +.=|+      ..+++-       .-++++..++-  -+=+||++.+-|++.|.
T Consensus        50 V~~g~D~AD~~Iv~~~~~g--DiVIT~Di~LAa~~l~Kga~vl~prG~~yt~~nI~--~~L~~R~~~~~lR~~G~  120 (151)
T PRK00124         50 VDAGFDAADNEIVQLAEKG--DIVITQDYGLAALALEKGAIVLNPRGYIYTNDNID--QLLAMRDLMATLRRSGI  120 (151)
T ss_pred             eCCCCChHHHHHHHhCCCC--CEEEeCCHHHHHHHHHCCCEEECCCCcCCCHHHHH--HHHHHHHHHHHHHHcCC
Confidence            3456789999999998775  33232      111111       01333222222  34589999999999985


No 13 
>KOG1924 consensus RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms; Cytoskeleton]
Probab=42.51  E-value=83  Score=38.06  Aligned_cols=24  Identities=0%  Similarity=-0.191  Sum_probs=9.9

Q ss_pred             HHhhhhcccccCCCCCCccCCCCC
Q 006966          433 IQQALDYACGIGGADCSLIQQGAS  456 (623)
Q Consensus       433 l~~~ldyaCg~~~~dCs~I~~gg~  456 (623)
                      |=.+++.+-++---.|+++-++-.
T Consensus       424 YykLIEecISqIvlHr~~~DPdf~  447 (1102)
T KOG1924|consen  424 YYKLIEECISQIVLHRTGMDPDFK  447 (1102)
T ss_pred             HHHHHHHHHHHHHHhcCCCCCCcc
Confidence            333444433332223555555544


No 14 
>PF00925 GTP_cyclohydro2:  GTP cyclohydrolase II;  InterPro: IPR000926 GTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin. The enzyme converts GTP and water to formate, 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)- pyrimidine and pyrophosphate, and requires magnesium as a cofactor. It is sometimes found as a bifunctional enzyme with 3,4-dihydroxy-2-butanone 4-phosphate synthase (DHBP_synthase) IPR000422 from INTERPRO. ; GO: 0003935 GTP cyclohydrolase II activity, 0009231 riboflavin biosynthetic process; PDB: 2BZ0_B 2BZ1_A.
Probab=38.52  E-value=11  Score=36.98  Aligned_cols=35  Identities=17%  Similarity=0.124  Sum_probs=25.4

Q ss_pred             hhccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEEE
Q 006966           45 VTSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSSV   87 (623)
Q Consensus        45 vv~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~V   87 (623)
                      .+|+    ||..||  ++|||.-.||.=+.  ||.|-||+|.=
T Consensus       131 gaqI----L~dLGV--~~~rLLtnnp~k~~--~L~g~gleV~~  165 (169)
T PF00925_consen  131 GAQI----LRDLGV--KKMRLLTNNPRKYV--ALEGFGLEVVE  165 (169)
T ss_dssp             HHHH----HHHTT----SEEEE-S-HHHHH--HHHHTT--EEE
T ss_pred             HHHH----HHHcCC--CEEEECCCChhHHH--HHhcCCCEEEE
Confidence            4677    898887  99999999999888  99999998875


No 15 
>PF07462 MSP1_C:  Merozoite surface protein 1 (MSP1) C-terminus;  InterPro: IPR010901 This entry represents the C-terminal region of merozoite surface protein 1 (MSP1), which is found in a number of Plasmodium species. MSP-1 is a 200 kDa protein expressed on the surface of the Plasmodium vivax merozoite. MSP-1 of Plasmodium species is synthesised as a high-molecular-weight precursor and then processed into several fragments. At the time of red cell invasion by the merozoite, only the 19 kDa C-terminal fragment (MSP-119), which contains two epidermal growth factor-like domains, remains on the surface. Antibodies against MSP-119 inhibit merozoite entry into red cells, and immunisation with MSP-119 protects monkeys from challenging infections. Hence, MSP-119 is considered a promising vaccine candidate [].; GO: 0009405 pathogenesis, 0016020 membrane
Probab=32.59  E-value=98  Score=35.89  Aligned_cols=8  Identities=25%  Similarity=0.430  Sum_probs=4.8

Q ss_pred             HhHHHHHH
Q 006966          467 ASFAFNSY  474 (623)
Q Consensus       467 aSyAfN~Y  474 (623)
                      --=||.+|
T Consensus       247 Vk~ALq~Y  254 (574)
T PF07462_consen  247 VKEALQAY  254 (574)
T ss_pred             HHHHHHHH
Confidence            34567666


No 16 
>PF06508 QueC:  Queuosine biosynthesis protein QueC;  InterPro: IPR018317 This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome.  In Rhizobium meliloti (Sinorhizobium meliloti), a species in which the exo genes make succinoglycan, a symbiotically important exopolysaccharide, exsB is located nearby and affects succinoglycan levels, probably through polar effects on exsA expression or the same polycistronic mRNA [, ].  In Arthrobacter viscosus, the homologous gene is designated alu1 and is associated with an aluminum tolerance phenotype. When expressed in Escherichia coli, it conferred aluminum tolerance []. The entry also contains the gene queC, which is responsible for the conversion of GTP to 7-cyano-7-deazaguanine (preQ0). The biosynthesis of hypermodified tRNA nucleoside queuosine only occurs in eubacteria. It occupies the wobble position for all known tRNAs that are specific for Asp, Asn, His or Tyr [].; PDB: 3BL5_B 2PG3_A.
Probab=28.95  E-value=1.6e+02  Score=29.78  Aligned_cols=122  Identities=14%  Similarity=0.160  Sum_probs=55.8

Q ss_pred             CceeeEeccCCCCCCCCCChhhhhccCccccccCCCCCCcEEEecCC--ccccchhhhcCCCceEEE---------eccc
Q 006966           23 ATLVGFAFNGRENTSAASSTSEVTSFDSLGLKLDNVPSQRIRVYVAN--HRVLNFSSLLNSNASSSV---------DLYL   91 (623)
Q Consensus        23 ~~~iGVnYG~~gdnL~lPsP~~vv~l~~~~lk~~~i~~~~VRiyDad--p~vL~~~AlanTgI~v~V---------~v~v   91 (623)
                      ...|-|+||++ ..-   --+.+.++    .+..++  .+-++.|-+  .++.. |+|-+.+++|-=         ...|
T Consensus        26 v~al~~~YGq~-~~~---El~~a~~i----~~~l~v--~~~~~i~l~~~~~~~~-s~L~~~~~~v~~~~~~~~~~~~t~v   94 (209)
T PF06508_consen   26 VYALTFDYGQR-HRR---ELEAAKKI----AKKLGV--KEHEVIDLSFLKEIGG-SALTDDSIEVPEEEYSEESIPSTYV   94 (209)
T ss_dssp             EEEEEEESSST-TCH---HHHHHHHH----HHHCT---SEEEEEE-CHHHHCSC-HHHHHTT------------------
T ss_pred             EEEEEEECCCC-CHH---HHHHHHHH----HHHhCC--CCCEEeeHHHHHhhCC-CcccCCCcCCcccccccCCCCceEE
Confidence            35688999999 321   23333445    666565  777888888  33442 566666543210         0123


Q ss_pred             CchhHHHhhcChHHHHHHHHhhccCCCCCccEEEEEccCcccccc-CCCchhhHHHHHHHHHHHHHhCCCCCceEEcccC
Q 006966           92 NLSLVVDLMQSELSAISWLETNVLTTHPHVNIKSIILSCSSEEFE-GKNVLPLILSALKSFHSALNRIHLDMKVKVSVAF  170 (623)
Q Consensus        92 pN~~i~~la~s~~~A~~WV~~NV~py~p~t~I~~I~VGnenE~~~-~~~~~~~LvPAM~Nih~AL~~~gL~~~IKVSTp~  170 (623)
                      |+-.+.-|    +.|..|-..        ..+..|++|.-.+-.. .++--+..+-+|+.+-+..    ....|+|.+|+
T Consensus        95 P~RN~l~l----siAa~~A~~--------~g~~~i~~G~~~~D~~~ypDc~~~F~~~~~~~~~~~----~~~~v~i~~P~  158 (209)
T PF06508_consen   95 PFRNGLFL----SIAASYAES--------LGAEAIYIGVNAEDASGYPDCRPEFIDAMNRLLNLG----EGGPVRIETPL  158 (209)
T ss_dssp             TTHHHHHH----HHHHHHHHH--------HT-SEEEE---S-STT--GGGSHHHHHHHHHHHHHH----HTS--EEE-TT
T ss_pred             ecCcHHHH----HHHHHHHHH--------CCCCEEEEEECcCccCCCCCChHHHHHHHHHHHHhc----CCCCEEEEecC
Confidence            43332222    234445443        3677888883111111 1344455666666544333    34569999986


Q ss_pred             C
Q 006966          171 P  171 (623)
Q Consensus       171 s  171 (623)
                      .
T Consensus       159 ~  159 (209)
T PF06508_consen  159 I  159 (209)
T ss_dssp             T
T ss_pred             C
Confidence            3


No 17 
>PF13756 Stimulus_sens_1:  Stimulus-sensing domain
Probab=27.72  E-value=45  Score=30.53  Aligned_cols=33  Identities=18%  Similarity=0.206  Sum_probs=23.2

Q ss_pred             ChhhhhccCccccccCCC-CCCcEEEecCCccccchhh
Q 006966           41 STSEVTSFDSLGLKLDNV-PSQRIRVYVANHRVLNFSS   77 (623)
Q Consensus        41 sP~~vv~l~~~~lk~~~i-~~~~VRiyDadp~vL~~~A   77 (623)
                      .|++|..|    |+.... .-+|+||||+|-.+|-.|-
T Consensus         2 ~pe~a~pl----LrrL~~Pt~~RARlyd~dG~Ll~DSr   35 (112)
T PF13756_consen    2 NPERARPL----LRRLISPTRTRARLYDPDGNLLADSR   35 (112)
T ss_pred             CHHHHHHH----HHHhCCCCCceEEEECCCCCEEeecc
Confidence            46777777    776421 2389999999998776443


No 18 
>PF05283 MGC-24:  Multi-glycosylated core protein 24 (MGC-24);  InterPro: IPR007947 CD164 is a mucin-like receptor, or sialomucin, with specificity in receptor/ligand interactions that depends on the structural characteristics of the mucin-like receptor. Its functions include mediating, or regulating, haematopoietic progenitor cell adhesion and the negative regulation of their growth and/or differentiation. It exists in the native state as a disulphide-linked homodimer of two 80-85kDa subunits. It is usually expressed by CD34+ and CD34lo/- haematopoietic stem cells and associated microenvironmental cells. It contains, in its extracellular region, two mucin domains (I and II) linked by a non-mucin domain, which has been predicted to contain intra-disulphide bridges. This receptor may play a key role in haematopoiesis by facilitating the adhesion of human CD34+ cells to bone marrow stroma and by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell proliferation. These effects involve the CD164 class I and/or II epitopes recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These epitopes are carbohydrate-dependent and are located on the N-terminal mucin domain I. It has been found that murine MGC-24v and rat endolyn share significant sequence similarities with human CD164. However, CD164 lacks the consensus glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible that GAG-association is responsible for the high molecular weight of the epithelial-derived MGC-24 glycoprotein.  Genomic structure studies have placed CD164 within the mucin-subgroup that comprises multiple exons, and demonstrate the diverse chromosomal distribution of this family of molecules. Molecules with such multiple exons may have sophisticated regulatory mechanisms that involve not only post-translational modifications of the oligosaccharide side chains, but also differential exon usage.
Although differences in the intron and exon sizes are seen between the mouse and human genes, the predicted proteins are similar in size and structure, maintaining functionally important motifs that regulate cell proliferation or subcellular distribution.  CD164 is a gene whose expression depends on differential usage of polyadenylation sites within the 3'-UTR. The conserved distribution of the 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that (i) a mechanism may exist to regulate tissue-specific polyadenylation, and (ii) differences in polyadenylation are important for the expression and function of CD164 in different tissues. Two other aspects of the structure of CD164 are of particular interest. First, it shares one of several conserved features of a cytokine-binding pocket; in this respect, it is notable that evidence exists for a class of cell-surface sialomucin modulators that directly interact with growth factor receptors to regulate their response to physiological ligands. Second, its cytoplasmic tail contains a C-terminal YHTL motif found in many endocytic membrane proteins or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate the sorting of membrane proteins into transport vesicles from the plasma membrane to the endosomes, and between intracellular compartments.
Probab=27.17  E-value=2e+02  Score=29.07  Aligned_cols=13  Identities=8%  Similarity=0.248  Sum_probs=9.3

Q ss_pred             eecceehhhHHhh
Q 006966          604 ILSSLTLVTPFVI  616 (623)
Q Consensus       604 ~~~~~~~~~~~~~  616 (623)
                      |||-|||+..+++
T Consensus       163 FiGGIVL~LGv~a  175 (186)
T PF05283_consen  163 FIGGIVLTLGVLA  175 (186)
T ss_pred             hhhHHHHHHHHHH
Confidence            8888888765443


No 19 
>PF07172 GRP:  Glycine rich protein family;  InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress.
Probab=26.98  E-value=44  Score=29.99  Aligned_cols=9  Identities=44%  Similarity=0.516  Sum_probs=5.3

Q ss_pred             HHHHhhcCC
Q 006966           15 NILTISSSA   23 (623)
Q Consensus        15 ~~~~~~~~~   23 (623)
                      ++|+|++.+
T Consensus        15 ~lLlisSev   23 (95)
T PF07172_consen   15 ALLLISSEV   23 (95)
T ss_pred             HHHHHHhhh
Confidence            556666653


No 20 
>PRK12485 bifunctional 3,4-dihydroxy-2-butanone 4-phosphate synthase/GTP cyclohydrolase II-like protein; Provisional
Probab=26.39  E-value=36  Score=37.71  Aligned_cols=33  Identities=12%  Similarity=0.024  Sum_probs=29.5

Q ss_pred             hhccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEE
Q 006966           45 VTSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSS   86 (623)
Q Consensus        45 vv~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~   86 (623)
                      .+++    ||..||  ++|||. .||.=+.  +|.+-||+|+
T Consensus       330 gAqI----Lr~LGV--~kirLL-nNP~K~~--~L~~~GIeV~  362 (369)
T PRK12485        330 GAQI----LQDLGV--GKLRHL-GPPLKYA--GLTGYDLEVV  362 (369)
T ss_pred             HHHH----HHHcCC--CEEEEC-CCchhhh--hhhhCCcEEE
Confidence            5788    999888  999999 7999888  9999999986


No 21 
>COG3889 Predicted solute binding protein [General function prediction only]
Probab=23.69  E-value=52  Score=39.54  Aligned_cols=33  Identities=9%  Similarity=0.023  Sum_probs=19.3

Q ss_pred             HHHHHHHHhhcCCeeEEeeC-----CCCCccccccccc
Q 006966          186 IGLIFGYIKKTGSVVIIEAG-----IDGKLSMAEVLVQ  218 (623)
Q Consensus       186 i~plL~FL~~T~SPfmVNvY-----~~~~i~LdyALF~  218 (623)
                      +..+.+||++-+--+..|.|     |+...-+++.-|+
T Consensus       321 y~d~~~~~~q~~~~~i~~~~v~~t~N~e~~v~~~~~~d  358 (872)
T COG3889         321 YDDIIRFLKQLNHMEISNPYVELTYNPENYVLNYNKFD  358 (872)
T ss_pred             HHHHHHHHHhhhhhccCCceEEEeeccceeeccccccc
Confidence            45677788876655555554     4444445555444


No 22 
>PRK10629 EnvZ/OmpR regulon moderator; Provisional
Probab=22.71  E-value=2.4e+02  Score=26.63  Aligned_cols=36  Identities=8%  Similarity=0.039  Sum_probs=27.1

Q ss_pred             CceeeEeccCCCCCCCCCChhhhhccCccccccCCCCCCcEE
Q 006966           23 ATLVGFAFNGRENTSAASSTSEVTSFDSLGLKLDNVPSQRIR   64 (623)
Q Consensus        23 ~~~iGVnYG~~gdnL~lPsP~~vv~l~~~~lk~~~i~~~~VR   64 (623)
                      ...|-|.-.+.|..+  |...+|.+.    |+++||.++++.
T Consensus        35 dpavQIs~~~~g~~~--~~~~~v~~~----L~~~gI~~ksi~   70 (127)
T PRK10629         35 ESTLAIRAVHQGASL--PDGFYVYQH----LDANGIHIKSIT   70 (127)
T ss_pred             CceEEEecCCCCCcc--chHHHHHHH----HHHCCCCcceEE
Confidence            456778777667666  899999999    999999444443


No 23 
>cd02875 GH18_chitobiase Chitobiase (also known as di-N-acetylchitobiase) is a lysosomal glycosidase that hydrolyzes the reducing-end N-acetylglucosamine from the chitobiose core of oligosaccharides during the ordered degradation of asparagine-linked glycoproteins in eukaryotes. Chitobiase can only do so if the asparagine that joins the oligosaccharide to protein is previously removed by a glycosylasparaginase. Chitobiase is therefore the final step in the lysosomal degradation of the protein/carbohydrate linkage component of asparagine-linked glycoproteins. The catalytic domain of chitobiase is an eight-stranded alpha/beta barrel fold similar to that of other family 18 glycosyl hydrolases such as hevamine and chitotriosidase.
Probab=22.68  E-value=1.6e+02  Score=32.21  Aligned_cols=102  Identities=10%  Similarity=0.124  Sum_probs=57.1

Q ss_pred             CcEEEec-CCccccchhhhcCCCceEEEecccCchhHHHhhcChHHHHHHHHhhccCCCCCccEEEEEccCccccccCCC
Q 006966           61 QRIRVYV-ANHRVLNFSSLLNSNASSSVDLYLNLSLVVDLMQSELSAISWLETNVLTTHPHVNIKSIILSCSSEEFEGKN  139 (623)
Q Consensus        61 ~~VRiyD-adp~vL~~~AlanTgI~v~V~v~vpN~~i~~la~s~~~A~~WV~~NV~py~p~t~I~~I~VGnenE~~~~~~  139 (623)
                      +.|.||+ .|++++.  .-..-|++|++...++.+    +.+++..=++|+++ |+.++-.-++-.|-+==|.-......
T Consensus        57 tti~~~~~~~~~~~~--~A~~~~v~v~~~~~~~~~----~l~~~~~R~~fi~s-iv~~~~~~gfDGIdIDwE~p~~~~~~  129 (358)
T cd02875          57 TTIAIFGDIDDELLC--YAHSKGVRLVLKGDVPLE----QISNPTYRTQWIQQ-KVELAKSQFMDGINIDIEQPITKGSP  129 (358)
T ss_pred             eEEEecCCCCHHHHH--HHHHcCCEEEEECccCHH----HcCCHHHHHHHHHH-HHHHHHHhCCCeEEEcccCCCCCCcc
Confidence            8889884 5778887  556678999886544432    23455555556553 44444333344443331100000112


Q ss_pred             chhhHHHHHHHHHHHHHhCCCCCceEEccc
Q 006966          140 VLPLILSALKSFHSALNRIHLDMKVKVSVA  169 (623)
Q Consensus       140 ~~~~LvPAM~Nih~AL~~~gL~~~IKVSTp  169 (623)
                      .-..++-=|+.|+++|.+.+.+-.|-|.++
T Consensus       130 d~~~~t~llkelr~~l~~~~~~~~Lsvav~  159 (358)
T cd02875         130 EYYALTELVKETTKAFKKENPGYQISFDVA  159 (358)
T ss_pred             hHHHHHHHHHHHHHHHhhcCCCcEEEEEEe
Confidence            234567778889999998865434444444


No 24 
>PRK00393 ribA GTP cyclohydrolase II; Reviewed
Probab=22.57  E-value=40  Score=33.84  Aligned_cols=33  Identities=18%  Similarity=0.324  Sum_probs=29.5

Q ss_pred             hccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEE
Q 006966           46 TSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSS   86 (623)
Q Consensus        46 v~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~   86 (623)
                      +|+    ||..||  ++|||.-.++.=+.  +|.|-||+|+
T Consensus       134 AQI----L~dLGV--~~mrLLtn~~~k~~--~L~g~GleV~  166 (197)
T PRK00393        134 ADM----LKALGV--KKVRLLTNNPKKVE--ALTEAGINIV  166 (197)
T ss_pred             HHH----HHHcCC--CEEEECCCCHHHHH--HHHhCCCEEE
Confidence            677    898887  99999999998788  9999999997


No 25 
>PHA03291 envelope glycoprotein I; Provisional
Probab=22.33  E-value=1.8e+02  Score=32.15  Aligned_cols=15  Identities=13%  Similarity=0.310  Sum_probs=6.9

Q ss_pred             ccceecceehhhHHh
Q 006966          601 SQLILSSLTLVTPFV  615 (623)
Q Consensus       601 ~~~~~~~~~~~~~~~  615 (623)
                      .|+-|=..|++..|+
T Consensus       289 iQiAIPasii~cV~l  303 (401)
T PHA03291        289 IQIAIPASIIACVFL  303 (401)
T ss_pred             heeccchHHHHHhhh
Confidence            444444444444443


No 26 
>TIGR00505 ribA GTP cyclohydrolase II. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. The function of archaeal members of the family has not been demonstrated and is assigned tentatively.
Probab=22.33  E-value=40  Score=33.68  Aligned_cols=33  Identities=15%  Similarity=0.241  Sum_probs=29.1

Q ss_pred             hccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEE
Q 006966           46 TSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSS   86 (623)
Q Consensus        46 v~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~   86 (623)
                      +|+    ||..||  ++|||.-.++.=+.  +|.|-||+|+
T Consensus       131 AQI----L~dLGV--~~~rLLtn~~~k~~--~L~g~gleVv  163 (191)
T TIGR00505       131 ADI----LEDLGV--KKVRLLTNNPKKIE--ILKKAGINIV  163 (191)
T ss_pred             HHH----HHHcCC--CEEEECCCCHHHHH--HHHhCCCEEE
Confidence            677    888887  99999999888777  9999999987


No 27 
>cd06156 eu_AANH_C_2 A group of hypothetical eukaryotic proteins, characterized by the presence of an adenine nucleotide alpha hydrolase (AANH)-like domain located N-terminal to two distinctly different YjgF-YER057c-UK114-like domains. This CD contains the second of these domains. The YjgF-YER057c-UK114 protein family is a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function.  The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.
Probab=21.78  E-value=1.1e+02  Score=28.11  Aligned_cols=31  Identities=3%  Similarity=0.088  Sum_probs=25.1

Q ss_pred             CCchhhHHHHHHHHHHHHHhCCCCCceEEcc
Q 006966          138 KNVLPLILSALKSFHSALNRIHLDMKVKVSV  168 (623)
Q Consensus       138 ~~~~~~LvPAM~Nih~AL~~~gL~~~IKVST  168 (623)
                      +++..++--+|+||.+.|.++|.++-||+++
T Consensus        29 ~~~~~Q~~qal~Ni~~vL~~aG~~dVvk~~i   59 (118)
T cd06156          29 GGITLQAVLSLQHLERVAKAMNVQWVLAAVC   59 (118)
T ss_pred             CCHHHHHHHHHHHHHHHHHHcCCCCEEEEEE
Confidence            3566789999999999999999955567663


Done!