Query 006966
Match_columns 623
No_of_seqs 249 out of 1578
Neff 5.4
Searched_HMMs 46136
Date Thu Mar 28 17:02:14 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/006966.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/006966hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF00332 Glyco_hydro_17: Glyco 100.0 8.2E-65 1.8E-69 532.0 13.6 284 26-346 1-306 (310)
2 smart00768 X8 Possibly involve 99.9 2.9E-28 6.3E-33 210.9 7.8 84 420-504 1-85 (85)
3 PF07983 X8: X8 domain; Inter 99.9 1.8E-22 4E-27 172.0 5.2 72 420-491 1-78 (78)
4 PF00332 Glyco_hydro_17: Glyco 99.1 3.5E-12 7.7E-17 134.8 -1.8 118 235-382 184-302 (310)
5 COG5309 Exo-beta-1,3-glucanase 98.6 2.1E-07 4.7E-12 95.5 9.7 214 24-289 44-266 (305)
6 PF03198 Glyco_hydro_72: Gluca 94.5 0.061 1.3E-06 57.4 5.7 126 26-167 30-180 (314)
7 KOG0260 RNA polymerase II, lar 83.0 7.1 0.00015 48.3 10.3 12 105-116 1031-1042(1605)
8 KOG0260 RNA polymerase II, lar 73.8 22 0.00048 44.3 10.7 33 144-177 953-985 (1605)
9 COG1671 Uncharacterized protei 55.4 16 0.00036 35.4 4.2 77 77-160 20-119 (150)
10 COG5309 Exo-beta-1,3-glucanase 52.7 6.9 0.00015 41.4 1.3 65 299-380 231-303 (305)
11 COG3889 Predicted solute bindi 48.8 15 0.00032 43.9 3.2 28 142-171 173-200 (872)
12 PRK00124 hypothetical protein; 46.4 25 0.00054 34.2 3.9 58 99-160 50-120 (151)
13 KOG1924 RhoA GTPase effector D 42.5 83 0.0018 38.1 7.9 24 433-456 424-447 (1102)
14 PF00925 GTP_cyclohydro2: GTP 38.5 11 0.00023 37.0 0.1 35 45-87 131-165 (169)
15 PF07462 MSP1_C: Merozoite sur 32.6 98 0.0021 35.9 6.4 8 467-474 247-254 (574)
16 PF06508 QueC: Queuosine biosy 29.0 1.6E+02 0.0035 29.8 6.7 122 23-171 26-159 (209)
17 PF13756 Stimulus_sens_1: Stim 27.7 45 0.00097 30.5 2.3 33 41-77 2-35 (112)
18 PF05283 MGC-24: Multi-glycosy 27.2 2E+02 0.0043 29.1 6.8 13 604-616 163-175 (186)
19 PF07172 GRP: Glycine rich pro 27.0 44 0.00096 30.0 2.0 9 15-23 15-23 (95)
20 PRK12485 bifunctional 3,4-dihy 26.4 36 0.00077 37.7 1.6 33 45-86 330-362 (369)
21 COG3889 Predicted solute bindi 23.7 52 0.0011 39.5 2.4 33 186-218 321-358 (872)
22 PRK10629 EnvZ/OmpR regulon mod 22.7 2.4E+02 0.0052 26.6 6.2 36 23-64 35-70 (127)
23 cd02875 GH18_chitobiase Chitob 22.7 1.6E+02 0.0034 32.2 5.7 102 61-169 57-159 (358)
24 PRK00393 ribA GTP cyclohydrola 22.6 40 0.00087 33.8 1.1 33 46-86 134-166 (197)
25 PHA03291 envelope glycoprotein 22.3 1.8E+02 0.0039 32.2 5.9 15 601-615 289-303 (401)
26 TIGR00505 ribA GTP cyclohydrol 22.3 40 0.00086 33.7 1.0 33 46-86 131-163 (191)
27 cd06156 eu_AANH_C_2 A group of 21.8 1.1E+02 0.0023 28.1 3.6 31 138-168 29-59 (118)
No 1
>PF00332 Glyco_hydro_17: Glycosyl hydrolases family 17; InterPro: IPR000490 O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 17 (GH17 in CAZy) comprises enzymes with several known activities: endo-1,3-beta-glucosidase (EC 3.2.1.39); lichenase (EC 3.2.1.73); exo-1,3-glucanase (EC 3.2.1.58). Currently these enzymes have only been found in plants and in fungi.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 1AQ0_B 1GHR_A 1GHS_B 2CYG_A 3UR8_A 3UR7_B 3EM5_C 3F55_D.
Probab=100.00 E-value=8.2e-65 Score=531.99 Aligned_cols=284 Identities=23% Similarity=0.270 Sum_probs=217.3
Q ss_pred eeEeccCCCCCCCCCChhhhhccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEEEecccCchhHHHhhcChHH
Q 006966 26 VGFAFNGRENTSAASSTSEVTSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSSVDLYLNLSLVVDLMQSELS 105 (623)
Q Consensus 26 iGVnYG~~gdnL~lPsP~~vv~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~V~v~vpN~~i~~la~s~~~ 105 (623)
|||||||+|||| |+|.+||+| ||+++| +||||||+|+++|| ||+||||+|+|+ |||++|++|++++..
T Consensus 1 iGvnyG~~~~nl--p~p~~vv~l----~ks~~i--~~vri~d~~~~iL~--a~a~S~i~v~v~--vpN~~l~~la~~~~~ 68 (310)
T PF00332_consen 1 IGVNYGRVGNNL--PSPCKVVSL----LKSNGI--TKVRIYDADPSILR--AFAGSGIEVMVG--VPNEDLASLASSQSA 68 (310)
T ss_dssp EEEEE---SSS-----HHHHHHH----HHHTT----EEEESS--HHHHH--HHTTS--EEEEE--E-GGGHHHHHHHHHH
T ss_pred CeEeccCccCCC--CCHHHHHHH----HHhccc--ccEEeecCcHHHHH--HHhcCCceeeec--cChHHHHHhccCHHH
Confidence 899999999998 999999999 999998 99999999999999 999999999995 699999999999999
Q ss_pred HHHHHHhhccCCCCCccEEEEEccCccccccCCCchhhHHHHHHHHHHHHHhCCCCCceEEcccCCcCcccccCCCc---
Q 006966 106 AISWLETNVLTTHPHVNIKSIILSCSSEEFEGKNVLPLILSALKSFHSALNRIHLDMKVKVSVAFPLPLLENLNTSH--- 182 (623)
Q Consensus 106 A~~WV~~NV~py~p~t~I~~I~VGnenE~~~~~~~~~~LvPAM~Nih~AL~~~gL~~~IKVSTp~s~~vL~~s~~~~--- 182 (623)
|..||++||++|+|+|||++|+|| ||++...... .|||||+|||+||.++||+++|||||+|+|++|.+++|++
T Consensus 69 A~~Wv~~nv~~~~~~~~i~~i~VG--nEv~~~~~~~-~lvpAm~ni~~aL~~~~L~~~IkVst~~~~~vl~~s~PPS~g~ 145 (310)
T PF00332_consen 69 AGSWVRTNVLPYLPAVNIRYIAVG--NEVLTGTDNA-YLVPAMQNIHNALTAAGLSDQIKVSTPHSMDVLSNSFPPSAGV 145 (310)
T ss_dssp HHHHHHHHTCTCTTTSEEEEEEEE--ES-TCCSGGG-GHHHHHHHHHHHHHHTT-TTTSEEEEEEEGGGEEE-SSGGG-E
T ss_pred HhhhhhhcccccCcccceeeeecc--cccccCccce-eeccHHHHHHHHHHhcCcCCcceeccccccccccccCCCccCc
Confidence 999999999999999999999999 8888642222 8999999999999999999999999999999999998874
Q ss_pred -c----hhHHHHHHHHhhcCCeeEEeeC-------CCCCcccccccccccccccccCCCCC-CCchHHHHH---HHhcCC
Q 006966 183 -E----GEIGLIFGYIKKTGSVVIIEAG-------IDGKLSMAEVLVQPLLKKAIKATSIL-PDSDILIDL---VMKSPL 246 (623)
Q Consensus 183 -~----~~i~plL~FL~~T~SPfmVNvY-------~~~~i~LdyALF~~~~d~v~~a~~~L-~~~Da~LDa---Al~~~G 246 (623)
+ ..|+|||+||++|+||||||+| ++.+++|+|||||++. .+.|..-.+ ++||+++|+ ||++.|
T Consensus 146 F~~~~~~~~~~~l~fL~~t~spf~vN~yPyfa~~~~~~~~~l~yAlf~~~~-~~~D~~~~y~nlfDa~~da~~~a~~~~g 224 (310)
T PF00332_consen 146 FRSDIASVMDPLLKFLDGTNSPFMVNVYPYFAYQNNPQNISLDYALFQPNS-GVVDGGLAYTNLFDAMVDAVYAAMEKLG 224 (310)
T ss_dssp ESHHHHHHHHHHHHHHHHHT--EEEE--HHHHHHHSTTTS-HHHHTT-SSS--SEETTEEESSHHHHHHHHHHHHHHTTT
T ss_pred ccccchhhhhHHHHHhhccCCCceeccchhhhccCCcccCCcccccccccc-cccccchhhhHHHHHHHHHHHHHHHHhC
Confidence 2 3589999999999999999999 6789999999999873 444542111 579999997 699999
Q ss_pred CCCcccccchhhhhccccCccchhhHHHHhHHhhhhhcccCCchhhhhhhcCcccccCCCCccccCCC---CCCCCCCCc
Q 006966 247 VPDAKQVAEFTEIVSKFFENNSQIDELYADVASSMGEFVQKGLKVVRRLQNSLKTSIHDTTIFPTTPV---PPDNKPTPT 323 (623)
Q Consensus 247 ~p~~~vV~eeTGwps~~~~daa~~da~~Aav~~nA~~yn~~li~~~~~l~~~~gTp~v~etgwPt~~~---~ed~kpgpt 323 (623)
+++++++..|||||+++. ..++. +||++|+++++++. ..||+.+++.+++.+.| +|+.|++..
T Consensus 225 ~~~~~vvv~ETGWPs~G~--------~~a~~-~nA~~~~~nl~~~~-----~~gt~~~~~~~~~~y~F~~FdE~~K~~~~ 290 (310)
T PF00332_consen 225 FPNVPVVVGETGWPSAGD--------PGATP-ENAQAYNQNLIKHV-----LKGTPLRPGNGIDVYIFEAFDENWKPGPE 290 (310)
T ss_dssp -TT--EEEEEE---SSSS--------TTCSH-HHHHHHHHHHHHHC-----CGBBSSSBSS---EEES-SB--TTSSSSG
T ss_pred CCCceeEEeccccccCCC--------CCCCc-chhHHHHHHHHHHH-----hCCCcccCCCCCeEEEEEEecCcCCCCCc
Confidence 999999777899999965 24566 88999999988874 27898998888777776 899998876
Q ss_pred ccccCCCCCceecCCCCCCCCCC
Q 006966 324 IVTVPATNPVTVSPANPSGTPLP 346 (623)
Q Consensus 324 ~~~ay~~n~~~~gLf~pdGTPvY 346 (623)
. |||||||++|++|+|
T Consensus 291 ~-------E~~wGlf~~d~~~ky 306 (310)
T PF00332_consen 291 V-------ERHWGLFYPDGTPKY 306 (310)
T ss_dssp G-------GGG--SB-TTSSBSS
T ss_pred c-------cceeeeECCCCCeec
Confidence 5 889999999999999
No 2
>smart00768 X8 Possibly involved in carbohydrate binding. The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges.
Probab=99.95 E-value=2.9e-28 Score=210.91 Aligned_cols=84 Identities=67% Similarity=1.198 Sum_probs=80.7
Q ss_pred ceeEecCCCChHHHHhhhhcccccCCCCCCccCCCCCcCCCCChhhhHhHHHHHHHhhCC-CCCCCCCCCceEEEecCCC
Q 006966 420 SWCVAKNGVSETAIQQALDYACGIGGADCSLIQQGASCYNPNTLQNHASFAFNSYYQKNP-SPTSCDFGGTAMIVNTNPS 498 (623)
Q Consensus 420 ~wCVak~~~~~~~l~~~ldyaCg~~~~dCs~I~~gg~cy~p~t~~~~aSyAfN~Yyq~~~-~~~sCdF~G~A~ltt~dpS 498 (623)
+|||+|+++++++||++||||||++ +||++|++||+||+||++++|||||||+|||+++ ..++|||+|.|++++.|||
T Consensus 1 ~wCv~~~~~~~~~l~~~~~yaCg~~-~dC~~I~~~g~c~~~~~~~~~aS~a~N~YYq~~~~~~~aC~F~G~a~~~~~~ps 79 (85)
T smart00768 1 LWCVAKPDADEAALQAALDYACGQG-ADCTAIQPGGSCYSPNTVKAHASYAFNSYYQKQGQSSGACDFGGTATITTTDPS 79 (85)
T ss_pred CccccCCCCCHHHHHHHHHHHhcCC-CCccccCCCCcccCCCCHHHHHHHHHHHHHHHcCCCCCcCCCCCceEEEecCCC
Confidence 4999999999999999999999986 9999999999999999999999999999999987 5899999999999999999
Q ss_pred CCCeee
Q 006966 499 TGSCVF 504 (623)
Q Consensus 499 ~~~C~~ 504 (623)
+++|+|
T Consensus 80 ~~~C~~ 85 (85)
T smart00768 80 TGSCKF 85 (85)
T ss_pred CCccCC
Confidence 999986
No 3
>PF07983 X8: X8 domain; InterPro: IPR012946 The X8 domain contains 6 conserved cysteine residues that presumably form three disulphide bridges. The domain is found in an Olive pollen allergen as well as at the C terminus of family 17 glycosyl hydrolases. This domain may be involved in carbohydrate binding.; PDB: 2JON_A 2W61_A 2W62_A 2W63_A.
Probab=99.86 E-value=1.8e-22 Score=172.05 Aligned_cols=72 Identities=47% Similarity=0.938 Sum_probs=61.5
Q ss_pred ceeEecCCCChHHHHhhhhcccccCCCCCCccCCCCC-----cCCCCChhhhHhHHHHHHHhhCC-CCCCCCCCCceE
Q 006966 420 SWCVAKNGVSETAIQQALDYACGIGGADCSLIQQGAS-----CYNPNTLQNHASFAFNSYYQKNP-SPTSCDFGGTAM 491 (623)
Q Consensus 420 ~wCVak~~~~~~~l~~~ldyaCg~~~~dCs~I~~gg~-----cy~p~t~~~~aSyAfN~Yyq~~~-~~~sCdF~G~A~ 491 (623)
+|||+|+++++++||++|||||+++++||++|++||+ .|++|+.++|||||||+|||+++ ...+|||+|+||
T Consensus 1 l~Cv~~~~~~~~~l~~~l~~aC~~~~~dC~~I~~~g~~G~YG~~S~C~~~~~lSya~N~YY~~~~~~~~~C~F~G~at 78 (78)
T PF07983_consen 1 LWCVAKPDADDKELQDLLDYACGQGGVDCSPIQPNGTTGVYGAYSMCSPRQHLSYAFNQYYQKQGRNSSACDFSGNAT 78 (78)
T ss_dssp -EEEE-TTS-HHHHHHHHHHHTTT-SSSCCCC-EETTTTEE-TTTTS-CCHHHHHHHHHHHHHHTSSCCG-SS-STEE
T ss_pred CcceeCCCCCHHHHHHHHHHHHcCCCCChhhhCCCCcccccccccCCCHHHHHHHHHHHHHHHcCCCCCcCCCCCCCC
Confidence 5999999999999999999999998899999999999 89999999999999999999986 589999999997
No 4
>PF00332 Glyco_hydro_17: Glycosyl hydrolases family 17; InterPro: IPR000490 O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 17 (GH17 in CAZy) comprises enzymes with several known activities: endo-1,3-beta-glucosidase (EC 3.2.1.39); lichenase (EC 3.2.1.73); exo-1,3-glucanase (EC 3.2.1.58). Currently these enzymes have only been found in plants and in fungi.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 1AQ0_B 1GHR_A 1GHS_B 2CYG_A 3UR8_A 3UR7_B 3EM5_C 3F55_D.
Probab=99.14 E-value=3.5e-12 Score=134.77 Aligned_cols=118 Identities=20% Similarity=0.249 Sum_probs=76.0
Q ss_pred hHHHHHHHhcCCCCCcccccchhhhhccccCccchhhHHHHhHHhhhhhcccCCchhhhhhhcCcccccCCCCccccCCC
Q 006966 235 DILIDLVMKSPLVPDAKQVAEFTEIVSKFFENNSQIDELYADVASSMGEFVQKGLKVVRRLQNSLKTSIHDTTIFPTTPV 314 (623)
Q Consensus 235 Da~LDaAl~~~G~p~~~vV~eeTGwps~~~~daa~~da~~Aav~~nA~~yn~~li~~~~~l~~~~gTp~v~etgwPt~~~ 314 (623)
++.||+|+++... .++|. |..|.+++| +++|++++++ ++++ .+.+++ +++|+|||+.+.
T Consensus 184 ~~~l~yAlf~~~~---~~~D~--~~~y~nlfD-a~~da~~~a~-~~~g---~~~~~v-----------vv~ETGWPs~G~ 242 (310)
T PF00332_consen 184 NISLDYALFQPNS---GVVDG--GLAYTNLFD-AMVDAVYAAM-EKLG---FPNVPV-----------VVGETGWPSAGD 242 (310)
T ss_dssp TS-HHHHTT-SSS----SEET--TEEESSHHH-HHHHHHHHHH-HTTT----TT--E-----------EEEEE---SSSS
T ss_pred cCCcccccccccc---ccccc--chhhhHHHH-HHHHHHHHHH-HHhC---CCCcee-----------EEeccccccCCC
Confidence 5789999998542 23343 778999999 9999999999 5443 333443 789999999987
Q ss_pred CCCCCCCCcccccCCCCCceecCCCCCCCCCCCCCCCcccc-CCCCCCCCCCCCCCCCCCCCCccCCCC
Q 006966 315 PPDNKPTPTIVTVPATNPVTVSPANPSGTPLPIPSTTPVNI-PPATPVNPAAPVTNPATIPAPVTVPGG 382 (623)
Q Consensus 315 ~ed~kpgpt~~~ay~~n~~~~gLf~pdGTPvYp~~~~~v~i-~lfne~~kpgp~se~~~~~~Gl~~p~g 382 (623)
. .+...++..|++|.+.+ +. .|||.+|+..+++|+ ++|||+.|+++..||| ||||++|+
T Consensus 243 ~---~a~~~nA~~~~~nl~~~-~~--~gt~~~~~~~~~~y~F~~FdE~~K~~~~~E~~---wGlf~~d~ 302 (310)
T PF00332_consen 243 P---GATPENAQAYNQNLIKH-VL--KGTPLRPGNGIDVYIFEAFDENWKPGPEVERH---WGLFYPDG 302 (310)
T ss_dssp T---TCSHHHHHHHHHHHHHH-CC--GBBSSSBSS---EEES-SB--TTSSSSGGGGG-----SB-TTS
T ss_pred C---CCCcchhHHHHHHHHHH-Hh--CCCcccCCCCCeEEEEEEecCcCCCCCcccce---eeeECCCC
Confidence 2 23445566688886542 22 899999999999996 9999999999988888 99999975
No 5
>COG5309 Exo-beta-1,3-glucanase [Carbohydrate transport and metabolism]
Probab=98.58 E-value=2.1e-07 Score=95.48 Aligned_cols=214 Identities=13% Similarity=0.099 Sum_probs=137.8
Q ss_pred ceeeEeccCCCCCCCCCChhhhhc---cCccccccCCCCCCcEEEecCCc----cccchhhhcCCCceEEEecccCchhH
Q 006966 24 TLVGFAFNGRENTSAASSTSEVTS---FDSLGLKLDNVPSQRIRVYVANH----RVLNFSSLLNSNASSSVDLYLNLSLV 96 (623)
Q Consensus 24 ~~iGVnYG~~gdnL~lPsP~~vv~---l~~~~lk~~~i~~~~VRiyDadp----~vL~~~AlanTgI~v~V~v~vpN~~i 96 (623)
+..+++||..-++-.-++.+++.. + |++.. ..||+|..|- .|+. |..-.|++|.++|. +-+++
T Consensus 44 g~~~f~l~~~n~dGtCKSa~~~~sDLe~----l~~~t---~~IR~Y~sDCn~le~v~p--Aa~~~g~kv~lGiw-~tdd~ 113 (305)
T COG5309 44 GFLAFTLGPYNDDGTCKSADQVASDLEL----LASYT---HSIRTYGSDCNTLENVLP--AAEASGFKVFLGIW-PTDDI 113 (305)
T ss_pred cccceeccccCCCCCCcCHHHHHhHHHH----hccCC---ceEEEeeccchhhhhhHH--HHHhcCceEEEEEe-eccch
Confidence 357899998766655588888854 5 66654 3899996443 5888 99999999999996 44455
Q ss_pred HHhhcChHHHHHHHHhhccCCCCCccEEEEEccCccccccCCCc-hhhHHHHHHHHHHHHHhCCCCCceEEcccCCcCcc
Q 006966 97 VDLMQSELSAISWLETNVLTTHPHVNIKSIILSCSSEEFEGKNV-LPLILSALKSFHSALNRIHLDMKVKVSVAFPLPLL 175 (623)
Q Consensus 97 ~~la~s~~~A~~WV~~NV~py~p~t~I~~I~VGnenE~~~~~~~-~~~LvPAM~Nih~AL~~~gL~~~IKVSTp~s~~vL 175 (623)
.. +.+.=++.-+++|..--.|+.|.|| ||.+...+. ..+|.-=+..++.+|.++|.+ .+|.|.-...++
T Consensus 114 ~~------~~~~til~ay~~~~~~d~v~~v~VG--nEal~r~~~tasql~~~I~~vrsav~~agy~--gpV~T~dsw~~~ 183 (305)
T COG5309 114 HD------AVEKTILSAYLPYNGWDDVTTVTVG--NEALNRNDLTASQLIEYIDDVRSAVKEAGYD--GPVTTVDSWNVV 183 (305)
T ss_pred hh------hHHHHHHHHHhccCCCCceEEEEec--hhhhhcCCCCHHHHHHHHHHHHHHHHhcCCC--Cceeecccceee
Confidence 43 2333455667888776789999999 788764443 677888899999999999985 678988777777
Q ss_pred cccCCCcchhHHHHHHHHhhcCCeeEEeeCCCCCcccccccccccccccccCCCCCCCchHHHHHHHhcCCCCC-ccccc
Q 006966 176 ENLNTSHEGEIGLIFGYIKKTGSVVIIEAGIDGKLSMAEVLVQPLLKKAIKATSILPDSDILIDLVMKSPLVPD-AKQVA 254 (623)
Q Consensus 176 ~~s~~~~~~~i~plL~FL~~T~SPfmVNvY~~~~i~LdyALF~~~~d~v~~a~~~L~~~Da~LDaAl~~~G~p~-~~vV~ 254 (623)
.+ +|. +-.--+ +.|+|+ ++.||.+. +-++.. .+.-.+|.. + +...|. +.++.
T Consensus 184 ~~-np~----l~~~SD-------fia~N~---------~aYwd~~~--~a~~~~--~f~~~q~e~-v-qsa~g~~k~~~v 236 (305)
T COG5309 184 IN-NPE----LCQASD-------FIAANA---------HAYWDGQT--VANAAG--TFLLEQLER-V-QSACGTKKTVWV 236 (305)
T ss_pred eC-ChH----Hhhhhh-------hhhccc---------chhccccc--hhhhhh--HHHHHHHHH-H-HHhcCCCccEEE
Confidence 65 322 111112 345665 34466541 111110 111112221 1 112233 56655
Q ss_pred chhhhhccccCccchhhHHHHhHHhhhhhcccCCc
Q 006966 255 EFTEIVSKFFENNSQIDELYADVASSMGEFVQKGL 289 (623)
Q Consensus 255 eeTGwps~~~~daa~~da~~Aav~~nA~~yn~~li 289 (623)
.|||||+++. ++.+.++++ +|+..|-++.+
T Consensus 237 ~EtGWPS~G~----~~G~a~pS~-anq~~~~~~i~ 266 (305)
T COG5309 237 TETGWPSDGR----TYGSAVPSV-ANQKIAVQEIL 266 (305)
T ss_pred eeccCCCCCC----ccCCcCCCh-hHHHHHHHHHH
Confidence 6799999974 444556777 88877765443
No 6
>PF03198 Glyco_hydro_72: Glucanosyltransferase; InterPro: IPR004886 This family is a group of yeast glycolipid proteins anchored to the membrane. It includes Candida albicans (Yeast) pH-regulated protein, which is required for apical growth and plays a role in morphogenesis and Saccharomyces cerevisiae glycolipid anchored surface protein.; PDB: 2W61_A 2W62_A 2W63_A.
Probab=94.48 E-value=0.061 Score=57.45 Aligned_cols=126 Identities=16% Similarity=0.215 Sum_probs=70.8
Q ss_pred eeEeccCCCCC------CCCCChhhhh----ccCccccccCCCCCCcEEEecCCcc-----ccchhhhcCCCceEEEecc
Q 006966 26 VGFAFNGRENT------SAASSTSEVT----SFDSLGLKLDNVPSQRIRVYVANHR-----VLNFSSLLNSNASSSVDLY 90 (623)
Q Consensus 26 iGVnYG~~gdn------L~lPsP~~vv----~l~~~~lk~~~i~~~~VRiyDadp~-----vL~~~AlanTgI~v~V~v~ 90 (623)
.||.|=..++. -||-.+ ++- .+ ||+.|| .-||+|.-||+ -++ +|+..||-|+++|.
T Consensus 30 kGVaYQp~~~~~~~~~~DPLad~-~~C~rDi~~----l~~Lgi--NtIRVY~vdp~~nHd~CM~--~~~~aGIYvi~Dl~ 100 (314)
T PF03198_consen 30 KGVAYQPGGSSEPSNYIDPLADP-EACKRDIPL----LKELGI--NTIRVYSVDPSKNHDECMS--AFADAGIYVILDLN 100 (314)
T ss_dssp EEEE----------SS--GGG-H-HHHHHHHHH----HHHHT---SEEEES---TTS--HHHHH--HHHHTT-EEEEES-
T ss_pred eeEEcccCCCCCCccCcCcccCH-HHHHHhHHH----HHHcCC--CEEEEEEeCCCCCHHHHHH--HHHhCCCEEEEecC
Confidence 68988665551 111222 232 37 888887 99999977764 588 99999999999998
Q ss_pred cCchhHHHhhcChHHHHHHHH-------hhccCCCCCccEEEEEccCccccccC---CCchhhHHHHHHHHHHHHHhCCC
Q 006966 91 LNLSLVVDLMQSELSAISWLE-------TNVLTTHPHVNIKSIILSCSSEEFEG---KNVLPLILSALKSFHSALNRIHL 160 (623)
Q Consensus 91 vpN~~i~~la~s~~~A~~WV~-------~NV~py~p~t~I~~I~VGnenE~~~~---~~~~~~LvPAM~Nih~AL~~~gL 160 (623)
.|+..|.+-. -+..|=. .-|-.|..-.|.-...+| |||+.. ....+.+--|.|-+++=+++.+.
T Consensus 101 ~p~~sI~r~~----P~~sw~~~l~~~~~~vid~fa~Y~N~LgFf~G--NEVin~~~~t~aap~vKAavRD~K~Yi~~~~~ 174 (314)
T PF03198_consen 101 TPNGSINRSD----PAPSWNTDLLDRYFAVIDAFAKYDNTLGFFAG--NEVINDASNTNAAPYVKAAVRDMKAYIKSKGY 174 (314)
T ss_dssp BTTBS--TTS----------HHHHHHHHHHHHHHTT-TTEEEEEEE--ESSS-STT-GGGHHHHHHHHHHHHHHHHHSSS
T ss_pred CCCccccCCC----CcCCCCHHHHHHHHHHHHHhccCCceEEEEec--ceeecCCCCcccHHHHHHHHHHHHHHHHhcCC
Confidence 8877776532 2344521 123333333577888999 777753 24578888889999999998887
Q ss_pred CCceEEc
Q 006966 161 DMKVKVS 167 (623)
Q Consensus 161 ~~~IKVS 167 (623)
++|-|.
T Consensus 175 -R~IPVG 180 (314)
T PF03198_consen 175 -RSIPVG 180 (314)
T ss_dssp -----EE
T ss_pred -CCCcee
Confidence 557776
No 7
>KOG0260 consensus RNA polymerase II, large subunit [Transcription]
Probab=83.00 E-value=7.1 Score=48.29 Aligned_cols=12 Identities=17% Similarity=0.493 Sum_probs=7.0
Q ss_pred HHHHHHHhhccC
Q 006966 105 SAISWLETNVLT 116 (623)
Q Consensus 105 ~A~~WV~~NV~p 116 (623)
.|-+||-.||-.
T Consensus 1031 eaf~w~~~~Ie~ 1042 (1605)
T KOG0260|consen 1031 EAFEWVLGEIEA 1042 (1605)
T ss_pred HHHHHHhhhhhh
Confidence 456666666543
No 8
>KOG0260 consensus RNA polymerase II, large subunit [Transcription]
Probab=73.75 E-value=22 Score=44.32 Aligned_cols=33 Identities=15% Similarity=0.265 Sum_probs=23.3
Q ss_pred HHHHHHHHHHHHHhCCCCCceEEcccCCcCcccc
Q 006966 144 ILSALKSFHSALNRIHLDMKVKVSVAFPLPLLEN 177 (623)
Q Consensus 144 LvPAM~Nih~AL~~~gL~~~IKVSTp~s~~vL~~ 177 (623)
-.-.-+-|++||...+++ .+|++--+...|+..
T Consensus 953 p~n~~r~I~Na~~~f~~~-~r~~t~l~~~~v~~g 985 (1605)
T KOG0260|consen 953 PENLQRIIWNALKKFSID-ERKPTDLIPFKVVKG 985 (1605)
T ss_pred chhHHHHHHHHHhhcccc-cccccccchhhhhhh
Confidence 334456788888888885 488887777777644
No 9
>COG1671 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=55.44 E-value=16 Score=35.40 Aligned_cols=77 Identities=10% Similarity=0.028 Sum_probs=44.9
Q ss_pred hhcCCCceEEEecccCchhHHH----------hhcChHHHHHHHHhhccCCCCCccEE-------------EEEccCccc
Q 006966 77 SLLNSNASSSVDLYLNLSLVVD----------LMQSELSAISWLETNVLTTHPHVNIK-------------SIILSCSSE 133 (623)
Q Consensus 77 AlanTgI~v~V~v~vpN~~i~~----------la~s~~~A~~WV~~NV~py~p~t~I~-------------~I~VGnenE 133 (623)
+-.-.|++|++ |-|.-+.. +.+-.++|+.|+.++.-+. ++-|+ .++.+.+-+
T Consensus 20 ~A~r~~~~v~~---Van~~~~~~~~~~i~~v~V~~g~DaaD~~Iv~~a~~g--DlVVT~Di~LA~~ll~kg~~v~~prGr 94 (150)
T COG1671 20 VAERMGLKVTF---VANFPHRVPPSPEIRTVVVDAGFDAADDWIVNLAEKG--DLVVTADIPLASLLLDKGAAVLNPRGR 94 (150)
T ss_pred HHHHhCCeEEE---EeCCCccCCCCCceeEEEecCCcchHHHHHHHhCCCC--CEEEECchHHHHHHHhcCCEEECCCCc
Confidence 33446666665 44544431 3345689999999987776 32232 222232224
Q ss_pred cccCCCchhhHHHHHHHHHHHHHhCCC
Q 006966 134 EFEGKNVLPLILSALKSFHSALNRIHL 160 (623)
Q Consensus 134 ~~~~~~~~~~LvPAM~Nih~AL~~~gL 160 (623)
++...++ ..+=+|++|+.-|++.|.
T Consensus 95 ~y~~~nI--~~~L~~R~~~~~lR~~G~ 119 (150)
T COG1671 95 LYTEENI--GERLAMRDFMAKLRRQGK 119 (150)
T ss_pred ccCHhHH--HHHHHHHHHHHHHHHhcc
Confidence 4422222 244589999999999986
No 10
>COG5309 Exo-beta-1,3-glucanase [Carbohydrate transport and metabolism]
Probab=52.74 E-value=6.9 Score=41.43 Aligned_cols=65 Identities=12% Similarity=-0.059 Sum_probs=36.9
Q ss_pred cccccCCCCccccCCCCC-CCCCCCcccccCCC----CCceecCCCCCCCCCCCCCCCcccc-CCCCCCCCCCC--CCCC
Q 006966 299 LKTSIHDTTIFPTTPVPP-DNKPTPTIVTVPAT----NPVTVSPANPSGTPLPIPSTTPVNI-PPATPVNPAAP--VTNP 370 (623)
Q Consensus 299 ~gTp~v~etgwPt~~~~e-d~kpgpt~~~ay~~----n~~~~gLf~pdGTPvYp~~~~~v~i-~lfne~~kpgp--~se~ 370 (623)
+++..+.|+|||+.+... ...|.+.+...|-+ +.|.+| +++++ +-|||+=|.-. .-|+
T Consensus 231 ~k~~~v~EtGWPS~G~~~G~a~pS~anq~~~~~~i~~~~~~~G--------------~d~fvfeAFdd~WK~~~~y~VEk 296 (305)
T COG5309 231 KKTVWVTETGWPSDGRTYGSAVPSVANQKIAVQEILNALRSCG--------------YDVFVFEAFDDDWKADGSYGVEK 296 (305)
T ss_pred CccEEEeeccCCCCCCccCCcCCChhHHHHHHHHHHhhhhccC--------------ccEEEeeeccccccCccccchhh
Confidence 466789999999998732 22333333222211 222222 26664 78999877433 2355
Q ss_pred CCCCCCccCC
Q 006966 371 ATIPAPVTVP 380 (623)
Q Consensus 371 ~~~~~Gl~~p 380 (623)
+ ||++..
T Consensus 297 y---wGv~~s 303 (305)
T COG5309 297 Y---WGVLSS 303 (305)
T ss_pred c---eeeecc
Confidence 5 888743
No 11
>COG3889 Predicted solute binding protein [General function prediction only]
Probab=48.76 E-value=15 Score=43.92 Aligned_cols=28 Identities=14% Similarity=0.039 Sum_probs=18.2
Q ss_pred hhHHHHHHHHHHHHHhCCCCCceEEcccCC
Q 006966 142 PLILSALKSFHSALNRIHLDMKVKVSVAFP 171 (623)
Q Consensus 142 ~~LvPAM~Nih~AL~~~gL~~~IKVSTp~s 171 (623)
+.+=.||..+.+-+.+-|+ .+|+-++.-
T Consensus 173 q~veq~m~e~~a~~~~~G~--~~~~~~~~~ 200 (872)
T COG3889 173 QYVEQAMAELNAEYMAKGL--WYKVGKPDD 200 (872)
T ss_pred HHHHHHHHHHHHHHhhcCc--EEEecCCCc
Confidence 4456677777666666665 477777655
No 12
>PRK00124 hypothetical protein; Validated
Probab=46.40 E-value=25 Score=34.22 Aligned_cols=58 Identities=14% Similarity=0.086 Sum_probs=34.7
Q ss_pred hhcChHHHHHHHHhhccCCCCCccEE------EEEccC-------ccccccCCCchhhHHHHHHHHHHHHHhCCC
Q 006966 99 LMQSELSAISWLETNVLTTHPHVNIK------SIILSC-------SSEEFEGKNVLPLILSALKSFHSALNRIHL 160 (623)
Q Consensus 99 la~s~~~A~~WV~~NV~py~p~t~I~------~I~VGn-------enE~~~~~~~~~~LvPAM~Nih~AL~~~gL 160 (623)
+.+..++|+.|+.+++.+- +.=|+ ..+++- .-++++..++- -+=+||++.+-|++.|.
T Consensus 50 V~~g~D~AD~~Iv~~~~~g--DiVIT~Di~LAa~~l~Kga~vl~prG~~yt~~nI~--~~L~~R~~~~~lR~~G~ 120 (151)
T PRK00124 50 VDAGFDAADNEIVQLAEKG--DIVITQDYGLAALALEKGAIVLNPRGYIYTNDNID--QLLAMRDLMATLRRSGI 120 (151)
T ss_pred eCCCCChHHHHHHHhCCCC--CEEEeCCHHHHHHHHHCCCEEECCCCcCCCHHHHH--HHHHHHHHHHHHHHcCC
Confidence 3456789999999998775 33232 111111 01333222222 34589999999999985
No 13
>KOG1924 consensus RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms; Cytoskeleton]
Probab=42.51 E-value=83 Score=38.06 Aligned_cols=24 Identities=0% Similarity=-0.191 Sum_probs=9.9
Q ss_pred HHhhhhcccccCCCCCCccCCCCC
Q 006966 433 IQQALDYACGIGGADCSLIQQGAS 456 (623)
Q Consensus 433 l~~~ldyaCg~~~~dCs~I~~gg~ 456 (623)
|=.+++.+-++---.|+++-++-.
T Consensus 424 YykLIEecISqIvlHr~~~DPdf~ 447 (1102)
T KOG1924|consen 424 YYKLIEECISQIVLHRTGMDPDFK 447 (1102)
T ss_pred HHHHHHHHHHHHHHhcCCCCCCcc
Confidence 333444433332223555555544
No 14
>PF00925 GTP_cyclohydro2: GTP cyclohydrolase II; InterPro: IPR000926 GTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin. The enzyme converts GTP and water to formate, 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)-pyrimidine and pyrophosphate, and requires magnesium as a cofactor. It is sometimes found as a bifunctional enzyme with 3,4-dihydroxy-2-butanone 4-phosphate synthase (DHBP_synthase, InterPro IPR000422).; GO: 0003935 GTP cyclohydrolase II activity, 0009231 riboflavin biosynthetic process; PDB: 2BZ0_B 2BZ1_A.
Probab=38.52 E-value=11 Score=36.98 Aligned_cols=35 Identities=17% Similarity=0.124 Sum_probs=25.4
Q ss_pred hhccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEEE
Q 006966 45 VTSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSSV 87 (623)
Q Consensus 45 vv~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~V 87 (623)
.+|+ ||..|| ++|||.-.||.=+. ||.|-||+|.=
T Consensus 131 gaqI----L~dLGV--~~~rLLtnnp~k~~--~L~g~gleV~~ 165 (169)
T PF00925_consen 131 GAQI----LRDLGV--KKMRLLTNNPRKYV--ALEGFGLEVVE 165 (169)
T ss_dssp HHHH----HHHTT----SEEEE-S-HHHHH--HHHHTT--EEE
T ss_pred HHHH----HHHcCC--CEEEECCCChhHHH--HHhcCCCEEEE
Confidence 4677 898887 99999999999888 99999998875
No 15
>PF07462 MSP1_C: Merozoite surface protein 1 (MSP1) C-terminus; InterPro: IPR010901 This entry represents the C-terminal region of merozoite surface protein 1 (MSP1), which is found in a number of Plasmodium species. MSP-1 is a 200 kDa protein expressed on the surface of the Plasmodium vivax merozoite. MSP-1 of Plasmodium species is synthesised as a high-molecular-weight precursor and then processed into several fragments. At the time of red cell invasion by the merozoite, only the 19 kDa C-terminal fragment (MSP-119), which contains two epidermal growth factor-like domains, remains on the surface. Antibodies against MSP-119 inhibit merozoite entry into red cells, and immunisation with MSP-119 protects monkeys from challenging infections. Hence, MSP-119 is considered a promising vaccine candidate.; GO: 0009405 pathogenesis, 0016020 membrane
Probab=32.59 E-value=98 Score=35.89 Aligned_cols=8 Identities=25% Similarity=0.430 Sum_probs=4.8
Q ss_pred HhHHHHHH
Q 006966 467 ASFAFNSY 474 (623)
Q Consensus 467 aSyAfN~Y 474 (623)
--=||.+|
T Consensus 247 Vk~ALq~Y 254 (574)
T PF07462_consen 247 VKEALQAY 254 (574)
T ss_pred HHHHHHHH
Confidence 34567666
No 16
>PF06508 QueC: Queuosine biosynthesis protein QueC; InterPro: IPR018317 This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome. In Rhizobium meliloti (Sinorhizobium meliloti), a species in which the exo genes make succinoglycan, a symbiotically important exopolysaccharide, exsB is located nearby and affects succinoglycan levels, probably through polar effects on exsA expression or the same polycistronic mRNA. In Arthrobacter viscosus, the homologous gene is designated alu1 and is associated with an aluminium tolerance phenotype. When expressed in Escherichia coli, it conferred aluminium tolerance. The entry also contains the gene queC, which is responsible for the conversion of GTP to 7-cyano-7-deazaguanine (preQ0). The biosynthesis of hypermodified tRNA nucleoside queuosine only occurs in eubacteria. It occupies the wobble position for all known tRNAs that are specific for Asp, Asn, His or Tyr.; PDB: 3BL5_B 2PG3_A.
Probab=28.95 E-value=1.6e+02 Score=29.78 Aligned_cols=122 Identities=14% Similarity=0.160 Sum_probs=55.8
Q ss_pred CceeeEeccCCCCCCCCCChhhhhccCccccccCCCCCCcEEEecCC--ccccchhhhcCCCceEEE---------eccc
Q 006966 23 ATLVGFAFNGRENTSAASSTSEVTSFDSLGLKLDNVPSQRIRVYVAN--HRVLNFSSLLNSNASSSV---------DLYL 91 (623)
Q Consensus 23 ~~~iGVnYG~~gdnL~lPsP~~vv~l~~~~lk~~~i~~~~VRiyDad--p~vL~~~AlanTgI~v~V---------~v~v 91 (623)
...|-|+||++ ..- --+.+.++ .+..++ .+-++.|-+ .++.. |+|-+.+++|-= ...|
T Consensus 26 v~al~~~YGq~-~~~---El~~a~~i----~~~l~v--~~~~~i~l~~~~~~~~-s~L~~~~~~v~~~~~~~~~~~~t~v 94 (209)
T PF06508_consen 26 VYALTFDYGQR-HRR---ELEAAKKI----AKKLGV--KEHEVIDLSFLKEIGG-SALTDDSIEVPEEEYSEESIPSTYV 94 (209)
T ss_dssp EEEEEEESSST-TCH---HHHHHHHH----HHHCT---SEEEEEE-CHHHHCSC-HHHHHTT------------------
T ss_pred EEEEEEECCCC-CHH---HHHHHHHH----HHHhCC--CCCEEeeHHHHHhhCC-CcccCCCcCCcccccccCCCCceEE
Confidence 35688999999 321 23333445 666565 777888888 33442 566666543210 0123
Q ss_pred CchhHHHhhcChHHHHHHHHhhccCCCCCccEEEEEccCcccccc-CCCchhhHHHHHHHHHHHHHhCCCCCceEEcccC
Q 006966 92 NLSLVVDLMQSELSAISWLETNVLTTHPHVNIKSIILSCSSEEFE-GKNVLPLILSALKSFHSALNRIHLDMKVKVSVAF 170 (623)
Q Consensus 92 pN~~i~~la~s~~~A~~WV~~NV~py~p~t~I~~I~VGnenE~~~-~~~~~~~LvPAM~Nih~AL~~~gL~~~IKVSTp~ 170 (623)
|+-.+.-| +.|..|-.. ..+..|++|.-.+-.. .++--+..+-+|+.+-+.. ....|+|.+|+
T Consensus 95 P~RN~l~l----siAa~~A~~--------~g~~~i~~G~~~~D~~~ypDc~~~F~~~~~~~~~~~----~~~~v~i~~P~ 158 (209)
T PF06508_consen 95 PFRNGLFL----SIAASYAES--------LGAEAIYIGVNAEDASGYPDCRPEFIDAMNRLLNLG----EGGPVRIETPL 158 (209)
T ss_dssp TTHHHHHH----HHHHHHHHH--------HT-SEEEE---S-STT--GGGSHHHHHHHHHHHHHH----HTS--EEE-TT
T ss_pred ecCcHHHH----HHHHHHHHH--------CCCCEEEEEECcCccCCCCCChHHHHHHHHHHHHhc----CCCCEEEEecC
Confidence 43332222 234445443 3677888883111111 1344455666666544333 34569999986
Q ss_pred C
Q 006966 171 P 171 (623)
Q Consensus 171 s 171 (623)
.
T Consensus 159 ~ 159 (209)
T PF06508_consen 159 I 159 (209)
T ss_dssp T
T ss_pred C
Confidence 3
No 17
>PF13756 Stimulus_sens_1: Stimulus-sensing domain
Probab=27.72 E-value=45 Score=30.53 Aligned_cols=33 Identities=18% Similarity=0.206 Sum_probs=23.2
Q ss_pred ChhhhhccCccccccCCC-CCCcEEEecCCccccchhh
Q 006966 41 STSEVTSFDSLGLKLDNV-PSQRIRVYVANHRVLNFSS 77 (623)
Q Consensus 41 sP~~vv~l~~~~lk~~~i-~~~~VRiyDadp~vL~~~A 77 (623)
.|++|..| |+.... .-+|+||||+|-.+|-.|-
T Consensus 2 ~pe~a~pl----LrrL~~Pt~~RARlyd~dG~Ll~DSr 35 (112)
T PF13756_consen 2 NPERARPL----LRRLISPTRTRARLYDPDGNLLADSR 35 (112)
T ss_pred CHHHHHHH----HHHhCCCCCceEEEECCCCCEEeecc
Confidence 46777777 776421 2389999999998776443
No 18
>PF05283 MGC-24: Multi-glycosylated core protein 24 (MGC-24); InterPro: IPR007947 CD164 is a mucin-like receptor, or sialomucin, with specificity in receptor/ligand interactions that depends on the structural characteristics of the mucin-like receptor. Its functions include mediating, or regulating, haematopoietic progenitor cell adhesion and the negative regulation of their growth and/or differentiation. It exists in the native state as a disulphide-linked homodimer of two 80-85 kDa subunits. It is usually expressed by CD34+ and CD34lo/- haematopoietic stem cells and associated microenvironmental cells. It contains, in its extracellular region, two mucin domains (I and II) linked by a non-mucin domain, which has been predicted to contain intra-disulphide bridges. This receptor may play a key role in haematopoiesis by facilitating the adhesion of human CD34+ cells to bone marrow stroma and by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell proliferation. These effects involve the CD164 class I and/or II epitopes recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These epitopes are carbohydrate-dependent and are located on the N-terminal mucin domain I. It has been found that murine MGC-24v and rat endolyn share significant sequence similarities with human CD164. However, CD164 lacks the consensus glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible that GAG-association is responsible for the high molecular weight of the epithelial-derived MGC-24 glycoprotein. Genomic structure studies have placed CD164 within the mucin-subgroup that comprises multiple exons, and demonstrate the diverse chromosomal distribution of this family of molecules. Molecules with such multiple exons may have sophisticated regulatory mechanisms that involve not only post-translational modifications of the oligosaccharide side chains, but also differential exon usage. Although differences in the intron and exon sizes are seen between the mouse and human genes, the predicted proteins are similar in size and structure, maintaining functionally important motifs that regulate cell proliferation or subcellular distribution. CD164 is a gene whose expression depends on differential usage of polyadenylation sites within the 3'-UTR. The conserved distribution of the 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that (i) a mechanism may exist to regulate tissue-specific polyadenylation, and (ii) differences in polyadenylation are important for the expression and function of CD164 in different tissues. Two other aspects of the structure of CD164 are of particular interest. First, it shares one of several conserved features of a cytokine-binding pocket - in this respect, it is notable that evidence exists for a class of cell-surface sialomucin modulators that directly interact with growth factor receptors to regulate their response to physiological ligands. Second, its cytoplasmic tail contains a C-terminal YHTL motif found in many endocytic membrane proteins or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate the sorting of membrane proteins into transport vesicles from the plasma membrane to the endosomes, and between intracellular compartments.
Probab=27.17 E-value=2e+02 Score=29.07 Aligned_cols=13 Identities=8% Similarity=0.248 Sum_probs=9.3
Q ss_pred eecceehhhHHhh
Q 006966 604 ILSSLTLVTPFVI 616 (623)
Q Consensus 604 ~~~~~~~~~~~~~ 616 (623)
|||-|||+..+++
T Consensus 163 FiGGIVL~LGv~a 175 (186)
T PF05283_consen 163 FIGGIVLTLGVLA 175 (186)
T ss_pred hhhHHHHHHHHHH
Confidence 8888888765443
No 19
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress.
Probab=26.98 E-value=44 Score=29.99 Aligned_cols=9 Identities=44% Similarity=0.516 Sum_probs=5.3
Q ss_pred HHHHhhcCC
Q 006966 15 NILTISSSA 23 (623)
Q Consensus 15 ~~~~~~~~~ 23 (623)
++|+|++.+
T Consensus 15 ~lLlisSev 23 (95)
T PF07172_consen 15 ALLLISSEV 23 (95)
T ss_pred HHHHHHhhh
Confidence 556666653
No 20
>PRK12485 bifunctional 3,4-dihydroxy-2-butanone 4-phosphate synthase/GTP cyclohydrolase II-like protein; Provisional
Probab=26.39 E-value=36 Score=37.71 Aligned_cols=33 Identities=12% Similarity=0.024 Sum_probs=29.5
Q ss_pred hhccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEE
Q 006966 45 VTSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSS 86 (623)
Q Consensus 45 vv~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~ 86 (623)
.+++ ||..|| ++|||. .||.=+. +|.+-||+|+
T Consensus 330 gAqI----Lr~LGV--~kirLL-nNP~K~~--~L~~~GIeV~ 362 (369)
T PRK12485 330 GAQI----LQDLGV--GKLRHL-GPPLKYA--GLTGYDLEVV 362 (369)
T ss_pred HHHH----HHHcCC--CEEEEC-CCchhhh--hhhhCCcEEE
Confidence 5788 999888 999999 7999888 9999999986
No 21
>COG3889 Predicted solute binding protein [General function prediction only]
Probab=23.69 E-value=52 Score=39.54 Aligned_cols=33 Identities=9% Similarity=0.023 Sum_probs=19.3
Q ss_pred HHHHHHHHhhcCCeeEEeeC-----CCCCccccccccc
Q 006966 186 IGLIFGYIKKTGSVVIIEAG-----IDGKLSMAEVLVQ 218 (623)
Q Consensus 186 i~plL~FL~~T~SPfmVNvY-----~~~~i~LdyALF~ 218 (623)
+..+.+||++-+--+..|.| |+...-+++.-|+
T Consensus 321 y~d~~~~~~q~~~~~i~~~~v~~t~N~e~~v~~~~~~d 358 (872)
T COG3889 321 YDDIIRFLKQLNHMEISNPYVELTYNPENYVLNYNKFD 358 (872)
T ss_pred HHHHHHHHHhhhhhccCCceEEEeeccceeeccccccc
Confidence 45677788876655555554 4444445555444
No 22
>PRK10629 EnvZ/OmpR regulon moderator; Provisional
Probab=22.71 E-value=2.4e+02 Score=26.63 Aligned_cols=36 Identities=8% Similarity=0.039 Sum_probs=27.1
Q ss_pred CceeeEeccCCCCCCCCCChhhhhccCccccccCCCCCCcEE
Q 006966 23 ATLVGFAFNGRENTSAASSTSEVTSFDSLGLKLDNVPSQRIR 64 (623)
Q Consensus 23 ~~~iGVnYG~~gdnL~lPsP~~vv~l~~~~lk~~~i~~~~VR 64 (623)
...|-|.-.+.|..+ |...+|.+. |+++||.++++.
T Consensus 35 dpavQIs~~~~g~~~--~~~~~v~~~----L~~~gI~~ksi~ 70 (127)
T PRK10629 35 ESTLAIRAVHQGASL--PDGFYVYQH----LDANGIHIKSIT 70 (127)
T ss_pred CceEEEecCCCCCcc--chHHHHHHH----HHHCCCCcceEE
Confidence 456778777667666 899999999 999999444443
No 23
>cd02875 GH18_chitobiase Chitobiase (also known as di-N-acetylchitobiase) is a lysosomal glycosidase that hydrolyzes the reducing-end N-acetylglucosamine from the chitobiose core of oligosaccharides during the ordered degradation of asparagine-linked glycoproteins in eukaryotes. Chitobiase can only do so if the asparagine that joins the oligosaccharide to protein is previously removed by a glycosylasparaginase. Chitobiase is therefore the final step in the lysosomal degradation of the protein/carbohydrate linkage component of asparagine-linked glycoproteins. The catalytic domain of chitobiase is an eight-stranded alpha/beta barrel fold similar to that of other family 18 glycosyl hydrolases such as hevamine and chitotriosidase.
Probab=22.68 E-value=1.6e+02 Score=32.21 Aligned_cols=102 Identities=10% Similarity=0.124 Sum_probs=57.1
Q ss_pred CcEEEec-CCccccchhhhcCCCceEEEecccCchhHHHhhcChHHHHHHHHhhccCCCCCccEEEEEccCccccccCCC
Q 006966 61 QRIRVYV-ANHRVLNFSSLLNSNASSSVDLYLNLSLVVDLMQSELSAISWLETNVLTTHPHVNIKSIILSCSSEEFEGKN 139 (623)
Q Consensus 61 ~~VRiyD-adp~vL~~~AlanTgI~v~V~v~vpN~~i~~la~s~~~A~~WV~~NV~py~p~t~I~~I~VGnenE~~~~~~ 139 (623)
+.|.||+ .|++++. .-..-|++|++...++.+ +.+++..=++|+++ |+.++-.-++-.|-+==|.-......
T Consensus 57 tti~~~~~~~~~~~~--~A~~~~v~v~~~~~~~~~----~l~~~~~R~~fi~s-iv~~~~~~gfDGIdIDwE~p~~~~~~ 129 (358)
T cd02875 57 TTIAIFGDIDDELLC--YAHSKGVRLVLKGDVPLE----QISNPTYRTQWIQQ-KVELAKSQFMDGINIDIEQPITKGSP 129 (358)
T ss_pred eEEEecCCCCHHHHH--HHHHcCCEEEEECccCHH----HcCCHHHHHHHHHH-HHHHHHHhCCCeEEEcccCCCCCCcc
Confidence 8889884 5778887 556678999886544432 23455555556553 44444333344443331100000112
Q ss_pred chhhHHHHHHHHHHHHHhCCCCCceEEccc
Q 006966 140 VLPLILSALKSFHSALNRIHLDMKVKVSVA 169 (623)
Q Consensus 140 ~~~~LvPAM~Nih~AL~~~gL~~~IKVSTp 169 (623)
.-..++-=|+.|+++|.+.+.+-.|-|.++
T Consensus 130 d~~~~t~llkelr~~l~~~~~~~~Lsvav~ 159 (358)
T cd02875 130 EYYALTELVKETTKAFKKENPGYQISFDVA 159 (358)
T ss_pred hHHHHHHHHHHHHHHHhhcCCCcEEEEEEe
Confidence 234567778889999998865434444444
No 24
>PRK00393 ribA GTP cyclohydrolase II; Reviewed
Probab=22.57 E-value=40 Score=33.84 Aligned_cols=33 Identities=18% Similarity=0.324 Sum_probs=29.5
Q ss_pred hccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEE
Q 006966 46 TSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSS 86 (623)
Q Consensus 46 v~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~ 86 (623)
+|+ ||..|| ++|||.-.++.=+. +|.|-||+|+
T Consensus 134 AQI----L~dLGV--~~mrLLtn~~~k~~--~L~g~GleV~ 166 (197)
T PRK00393 134 ADM----LKALGV--KKVRLLTNNPKKVE--ALTEAGINIV 166 (197)
T ss_pred HHH----HHHcCC--CEEEECCCCHHHHH--HHHhCCCEEE
Confidence 677 898887 99999999998788 9999999997
No 25
>PHA03291 envelope glycoprotein I; Provisional
Probab=22.33 E-value=1.8e+02 Score=32.15 Aligned_cols=15 Identities=13% Similarity=0.310 Sum_probs=6.9
Q ss_pred ccceecceehhhHHh
Q 006966 601 SQLILSSLTLVTPFV 615 (623)
Q Consensus 601 ~~~~~~~~~~~~~~~ 615 (623)
.|+-|=..|++..|+
T Consensus 289 iQiAIPasii~cV~l 303 (401)
T PHA03291 289 IQIAIPASIIACVFL 303 (401)
T ss_pred heeccchHHHHHhhh
Confidence 444444444444443
No 26
>TIGR00505 ribA GTP cyclohydrolase II. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. The function of archaeal members of the family has not been demonstrated and is assigned tentatively.
Probab=22.33 E-value=40 Score=33.68 Aligned_cols=33 Identities=15% Similarity=0.241 Sum_probs=29.1
Q ss_pred hccCccccccCCCCCCcEEEecCCccccchhhhcCCCceEE
Q 006966 46 TSFDSLGLKLDNVPSQRIRVYVANHRVLNFSSLLNSNASSS 86 (623)
Q Consensus 46 v~l~~~~lk~~~i~~~~VRiyDadp~vL~~~AlanTgI~v~ 86 (623)
+|+ ||..|| ++|||.-.++.=+. +|.|-||+|+
T Consensus 131 AQI----L~dLGV--~~~rLLtn~~~k~~--~L~g~gleVv 163 (191)
T TIGR00505 131 ADI----LEDLGV--KKVRLLTNNPKKIE--ILKKAGINIV 163 (191)
T ss_pred HHH----HHHcCC--CEEEECCCCHHHHH--HHHhCCCEEE
Confidence 677 888887 99999999888777 9999999987
No 27
>cd06156 eu_AANH_C_2 A group of hypothetical eukaryotic proteins, characterized by the presence of an adenine nucleotide alpha hydrolase (AANH)-like domain located N-terminal to two distinctly different YjgF-YER057c-UK114-like domains. This CD contains the second of these domains. The YjgF-YER057c-UK114 protein family is a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.
Probab=21.78 E-value=1.1e+02 Score=28.11 Aligned_cols=31 Identities=3% Similarity=0.088 Sum_probs=25.1
Q ss_pred CCchhhHHHHHHHHHHHHHhCCCCCceEEcc
Q 006966 138 KNVLPLILSALKSFHSALNRIHLDMKVKVSV 168 (623)
Q Consensus 138 ~~~~~~LvPAM~Nih~AL~~~gL~~~IKVST 168 (623)
+++..++--+|+||.+.|.++|.++-||+++
T Consensus 29 ~~~~~Q~~qal~Ni~~vL~~aG~~dVvk~~i 59 (118)
T cd06156 29 GGITLQAVLSLQHLERVAKAMNVQWVLAAVC 59 (118)
T ss_pred CCHHHHHHHHHHHHHHHHHHcCCCCEEEEEE
Confidence 3566789999999999999999955567663
Done!