Query         gi|254781013|ref|YP_003065426.1| phage-associated protein [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 302
No_of_seqs    124 out of 144
Neff          3.0
Searched_HMMs 39220
Date          Mon May 30 03:13:15 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254781013.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 COG3600 GepA Uncharacterized p 100.0 1.4E-40 4.2E-45  274.0  11.1  140    13-153     5-148 (154)
  2 KOG3676 consensus               85.1     1.9 4.8E-05   23.2   4.7   90    20-119   290-394 (782)
  3 COG3465 Uncharacterized conser  70.7     5.5 0.00014   20.3   3.5   43     33-81     29-71 (171)
  4 PRK13455 F0F1 ATP synthase sub  56.3      10 0.00026   18.7   2.6   24   226-249     17-40 (184)
  5 COG0105 Ndk Nucleoside diphosp  56.2       6 0.00015   20.1   1.4   75      1-75      1-76 (135)
  6 pfam11282 DUF3082 Protein of u  44.0      26 0.00066   16.1   4.0   49   248-296     28-76 (82)
  7 pfam10044 Ret_tiss Retinal tis  42.2      16 0.00041   17.4   1.8   34   143-176     37-70 (95)
  8 KOG0790 consensus               41.6      28 0.00071   15.9   4.6   36   252-290   310-345 (600)
  9 COG3877 Uncharacterized protei  38.5      30 0.00077   15.7   2.7   69   100-175    41-122 (122)
 10 TIGR02814 pfaD_fam PfaD family  36.0     7.3 0.00019   19.6  -0.7   57    57-119   290-353 (449)
 11 pfam05042 Caleosin Caleosin re  28.4      46  0.0012   14.6   3.1   56   107-163    96-168 (174)
 12 KOG2535 consensus               26.5      42  0.0011   14.8   1.8   60    77-140   307-371 (554)
 13 TIGR00920 2A060605 3-hydroxy-3  26.5      32 0.00081   15.6   1.2   11     15-25   353-363 (988)
 14 cd06431 GT8_LARGE_C LARGE cata  25.9      47  0.0012   14.5   2.0   52   108-160   113-172 (280)
 15 pfam07106 TBPIP Tat binding pr  24.0      55  0.0014   14.1   2.9   29      7-42     11-39 (169)
 16 TIGR01578 MiaB-like-B MiaB-lik  22.6      25 0.00063   16.2   0.0   71   107-178   100-187 (487)
 17 TIGR03076 near_not_gcvH Chlamy  21.8      61  0.0015   13.8   6.8  103   148-257   127-232 (686)
 18 TIGR00130 frhD coenzyme F420-r  21.8      43  0.0011   14.8   1.1   42   117-159     44-92 (162)
 19 TIGR00874 talAB transaldolase;  21.6      45  0.0012   14.6   1.2   83    48-143   188-270 (324)
 20 cd04418 NDPk5 Nucleoside dipho  20.5      44  0.0011   14.7   1.0   69      5-75      3-72 (132)
 21 pfam09049 SNN_transmemb Stanni  20.2      34 0.00086   15.4   0.3   12   275-286     22-33 (33)

No 1
>COG3600 GepA Uncharacterized phage-associated protein [Function unknown]
Probab=100.00 E-value=1.4e-40 Score=273.98 Aligned_cols=140 Identities=26% Similarity=0.506 Sum_probs=120.5

Q ss_pred             CCHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCC
Q ss_conf             88999999999966635987889999999999969999984484886723574568816999999851354444212465
Q gi|254781013|r   13 YSTIAVANFFIDKGVKYSIPIDHLKIQQFIYLTHCDVVLQKKKSMLDEEPQAWKQGPVFVGVYHRFKYFDSHPIEVIMDL   92 (302)
Q Consensus       13 YsaldVANyFI~kA~e~g~~ITnLKLQKLLYyAQg~~La~~gkpLFdE~fEAW~yGPVvP~VY~~FK~yg~~pI~~~~D~   92 (302)
                      |++..||||||+++.+.++++|||||||||||||||+|+.+|+|||+|.||||+||||+|++|+.||.+|+++|++....
T Consensus        5 ~d~~~IaN~~L~ka~~~~~~~t~lklqKLlYyA~~~~L~~~~~pL~~~~ieAW~~GPVip~~Yn~~K~~Gs~~I~~r~~~   84 (154)
T COG3600          5 VDPRAIANWFLDKADELDIPVTPLKLQKLLYYAHGWFLAVTGRPLFDEKIEAWKHGPVIPSLYNAFKQYGSNSIDERLPV   84 (154)
T ss_pred             CCHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHCCCCCHHHHHHHHHCCCCCCCCCCCH
T ss_conf             37999999997421541766677999999999999999980885632378887618873999999997088778711221

Q ss_pred             CHHHC-CC-CCCHHHHHHHHHHHHHHCCCCHHHHHHHHCCCCCCCHHCCCCCC--CCCCCEECHH
Q ss_conf             31103-22-48988999999999984489989999965188876212168778--7864202199
Q gi|254781013|r   93 SFRIF-PK-IANQEIGEIMESIWNKYQSHSTEQLQEIVQEENKTWRRLYDPND--PNTNRTITVE  153 (302)
Q Consensus       93 s~ei~-pk-i~DeEi~eILD~VWnkYG~ySA~qLeeLTHqEnSPWkkaYDP~d--p~~N~~ItvE  153 (302)
                      ....+ .. ..|.++.++|..||++||.|||++|+++||+| +||-.+++-.. -.|+..|+-.
T Consensus 85 ~~l~~~~~~~iD~~~s~~L~~Vw~~yG~ySa~~L~~iTHae-~PW~~~~~~~~~~~~~~~ri~D~ 148 (154) T COG3600 85 RGLSNGNALPIDADVSAILARVWDTYGRYSAWQLVDITHAE-SPWIKAWKGGGTSDSLGARISDK 148 (154) T ss_pred HHHHHHHCCCCCCHHHHHHHHHHHHHCCCCHHHHHHHHCCC-CHHHHHHHCCCCCCCCCCCCCHH T ss_conf 17776204753214999999999996344688999876143-70899986468631123310146 No 2 >KOG3676 consensus Probab=85.10 E-value=1.9 Score=23.24 Aligned_cols=90 Identities=14% Similarity=0.303 Sum_probs=48.3 Q ss_pred HHHHHHHHH------CCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 999996663------59878899999999999699999844848867235745688169999998513544442124653 Q gi|254781013|r 20 NFFIDKGVK------YSIPIDHLKIQQFIYLTHCDVVLQKKKSMLDEEPQAWKQGPVFVGVYHRFKYFDSHPIEVIMDLS 93 (302) Q Consensus 20 NyFI~kA~e------~g~~ITnLKLQKLLYyAQg~~La~~gkpLFdE~fEAW~yGPVvP~VY~~FK~yg~~pI~~~~D~s 93 (302) ++.+..+.+ +...+|||+|---+==++- +.+-|=-|.+.-|.||||-.++|..-.- +.|. ++.+ T Consensus 290 ~~~L~~ga~~l~~v~N~qgLTPLtLAaklGk~em-----f~~ile~~k~~~W~YGpvtsslYpL~~i---DT~~--n~~S 359 (782) T KOG3676 290 DLALELGANALEHVRNNQGLTPLTLAAKLGKKEM-----FQHILERRKFTDWAYGPVTSSLYPLNSI---DTIG--NENS 359 (782) T ss_pred HHHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHH-----HHHHHHHHCCCCEEECCCCCCCCCCHHC---CCCC--CCHH T ss_conf 9999758861003446679976899988706999-----9999986345401105620145561120---4335--6020 Q ss_pred H-H--HCC------CCCCHHHHHHHHHHHHHHCCC Q ss_conf 1-1--032------248988999999999984489 Q gi|254781013|r 94 F-R--IFP------KIANQEIGEIMESIWNKYQSH 119 (302) Q Consensus 94 ~-e--i~p------ki~DeEi~eILD~VWnkYG~y 119 (302) . + .|. ...+.-+.++|++=|++||.. 
T Consensus 360 vLeivvyg~~~eHl~Ll~~~i~~LL~~KW~~f~k~ 394 (782) T KOG3676 360 VLEIVVYGIKNEHLELLDGPIEELLEDKWKAFGKK 394 (782) T ss_pred HHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 45421038956789987579999999999998499 No 3 >COG3465 Uncharacterized conserved protein [Function unknown] Probab=70.67 E-value=5.5 Score=20.31 Aligned_cols=43 Identities=12% Similarity=0.093 Sum_probs=30.6 Q ss_pred CCHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCCHHHHHHHHHC Q ss_conf 8899999999999699999844848867235745688169999998513 Q gi|254781013|r 33 IDHLKIQQFIYLTHCDVVLQKKKSMLDEEPQAWKQGPVFVGVYHRFKYF 81 (302) Q Consensus 33 ITnLKLQKLLYyAQg~~La~~gkpLFdE~fEAW~yGPVvP~VY~~FK~y 81 (302) -.-.|||||+|++-+ .+-| |..+..-=.|||--+.+=..+-.- T Consensus 29 d~R~KlQKlVYi~Kk-----l~~~-~~~~Y~FnlYGPYS~eLt~~v~~L 71 (171) T COG3465 29 DGRKKLQKLVYIAKK-----LGFP-FSLDYDFNLYGPYSEELTDDVEEL 71 (171) T ss_pred CHHHHHHHHHHHHHH-----HCCC-CHHHCCCCCCCCCCHHHHHHHHHH T ss_conf 058888989878876-----4577-343147533588658888999998 No 4 >PRK13455 F0F1 ATP synthase subunit B; Provisional Probab=56.29 E-value=10 Score=18.69 Aligned_cols=24 Identities=29% Similarity=0.416 Sum_probs=16.3 Q ss_pred HCCEEEEEECCCCCHHHHHHHHHH Q ss_conf 111113442386101356789999 Q gi|254781013|r 226 ESNIMRFAYSNRSFWVTIVWITII 249 (302) Q Consensus 226 ~~~~~~~~~~~~~~~~~~~~~~~~ 249 (302) ++.---|-.+|-+|||+|-++..+ T Consensus 17 ~~~~~~~~~~d~~FWv~IsFvif~ 40 (184) T PRK13455 17 AAGGPFFSLSNTDFIVTLAFLLFI 40 (184) T ss_pred HCCCCCCCCCCCHHHHHHHHHHHH T ss_conf 647998888895499999999999 No 5 >COG0105 Ndk Nucleoside diphosphate kinase [Nucleotide transport and metabolism] Probab=56.16 E-value=6 Score=20.07 Aligned_cols=75 Identities=23% Similarity=0.204 Sum_probs=58.6 Q ss_pred CCCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCHHHHHHH-HHHHHHHHHHHHCCCCCCCCEEEECCCCCCHHHH Q ss_conf 97421123488988999999999966635987889999999-9999699999844848867235745688169999 Q gi|254781013|r 1 MKKRTFTQTNPPYSTIAVANFFIDKGVKYSIPIDHLKIQQF-IYLTHCDVVLQKKKSMLDEEPQAWKQGPVFVGVY 75 (302) Q Consensus 1 
m~~~~~tq~nppYsaldVANyFI~kA~e~g~~ITnLKLQKL-LYyAQg~~La~~gkpLFdE~fEAW~yGPVvP~VY 75 (302) |--|+|....|---...+--.+|.+-.+.|..|--||+-++ .-.|..+|-...++|.|.+-++-..-|||+-.++ T Consensus 1 ~~erT~~iiKPDaV~R~LIG~IisrfE~~Glkiva~K~~~~~~e~Ae~~Y~~h~~kpFf~~Lv~fitSgPvv~~Vl 76 (135) T COG0105 1 AMERTLSIIKPDAVKRGLIGEIISRFEKKGLKIVALKMVQLSRELAENHYAEHKGKPFFGELVEFITSGPVVAMVL 76 (135) T ss_pred CCCEEEEEECCCHHHHHHHHHHHHHHHHCCCEEEEEEEECCCHHHHHHHHHHHCCCCCCHHHHHHEECCCEEEEEE T ss_conf 9615899888426654218999999997798887644020479999777898767875287786231265899998 No 6 >pfam11282 DUF3082 Protein of unknown function (DUF3082). This family of proteins has no known function. Probab=43.96 E-value=26 Score=16.14 Aligned_cols=49 Identities=29% Similarity=0.226 Sum_probs=36.5 Q ss_pred HHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCHHHCC Q ss_conf 9999999997320489997522367889989999999999861102102 Q gi|254781013|r 248 IIVAFLLTQIINSNVFAQLCESKYIATVVSLSAIICGFWAIVGKGLFGV 296 (302) Q Consensus 248 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 296 (302) |...|-...+.++|-++|--.+-.-.-|+.++..--+..+++|-|||-+ T Consensus 28 i~~~Fa~~p~~s~~~~v~~I~~avrTLv~Gl~~LaTf~F~~i~~GL~ll 76 (82) T pfam11282 28 IAAYFAAHPPHSSNPIVQSIASAVRTLVVGLCFLATFVFAFVGLGLILL 76 (82) T ss_pred HHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9999842999866299999999999999999999999999999999999 No 7 >pfam10044 Ret_tiss Retinal tissue protein. Rtp is a family of proteins of approximately 112 amino acids in length which is conserved from nematodes to humans. The proposed tertiary structure is of almost entirely alpha helix interrupted only by loops located at proline residues. Three sites in the protein sequence reveal two types of possible post-translation modification. A serine residue, at position 41, is a candidate for protein kinase C phosphorylation. Glycine residues at position 69 and 91 are probable sites for acetylation by covalent amide linkage of myristate via N-myristoyl transferase. 
Rtp is differentially expressed in the trout retina between parr and smolt developmental stages (smoltification). It is likely to be a house-keeping protein. Probab=42.20 E-value=16 Score=17.42 Aligned_cols=34 Identities=15% Similarity=0.379 Sum_probs=28.6 Q ss_pred CCCCCCEECHHHHHHHCCCCCCCHHHHHHHHHCC Q ss_conf 8786420219986641157878988999985324 Q gi|254781013|r 143 DPNTNRTITVEEITKMADGSNISPENVVEQIKKP 176 (302) Q Consensus 143 dp~~N~~ItvE~itk~~~~~~~~~~~~~~~~~~~ 176 (302) -|.+....+.+||.++-.=+..+|++++|.||+- T Consensus 37 ~~~w~~~l~~~D~~~i~el~sLt~~~L~ekvk~L 70 (95) T pfam10044 37 PPKWTSGLTKDDMDKINELGSLTTSGLIAKVKKL 70 (95) T ss_pred CCCCCCCCCHHHHHHHHHHHCCCHHHHHHHHHHH T ss_conf 8742234899999999999868999999999999 No 8 >KOG0790 consensus Probab=41.60 E-value=28 Score=15.91 Aligned_cols=36 Identities=28% Similarity=0.393 Sum_probs=29.4 Q ss_pred HHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999997320489997522367889989999999999861 Q gi|254781013|r 252 FLLTQIINSNVFAQLCESKYIATVVSLSAIICGFWAIVG 290 (302) Q Consensus 252 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 290 (302) |.-+|..-+|-+++ ..||||--.|-+.+-.||.+|= T Consensus 310 ~~~~q~~~~~~~~k---KsyIAtQGCL~nTVnDFW~Mvw 345 (600) T KOG0790 310 MIEFQRLCNNSKPK---KSYIATQGCLQNTVNDFWRMVW 345 (600) T ss_pred HHHHHHHCCCCCHH---HHEEEHHHHHHHHHHHHHHHHH T ss_conf 44454306665401---2045012678888999999987 No 9 >COG3877 Uncharacterized protein conserved in bacteria [Function unknown] Probab=38.49 E-value=30 Score=15.69 Aligned_cols=69 Identities=22% Similarity=0.315 Sum_probs=42.1 Q ss_pred CCCHHHHHHHHHHHHHHCC---------CCHHH----HHHHHCCCCCCCHHCCCCCCCCCCCEECHHHHHHHCCCCCCCH Q ss_conf 4898899999999998448---------99899----9996518887621216877878642021998664115787898 Q gi|254781013|r 100 IANQEIGEIMESIWNKYQS---------HSTEQ----LQEIVQEENKTWRRLYDPNDPNTNRTITVEEITKMADGSNISP 166 (302) Q Consensus 100 i~DeEi~eILD~VWnkYG~---------ySA~q----LeeLTHqEnSPWkkaYDP~dp~~N~~ItvE~itk~~~~~~~~~ 166 (302) .++.+..+++.-....-|. .|-.. 
|.++..+ -.|+|.-++ ...|.+++|-+|..-..|+| T Consensus 41 ~Lt~d~LeFv~lf~r~RGnlKEvEr~lg~sYptvR~kld~vlra------mgy~p~~e~-~~~i~~~~i~~qle~Gei~p 113 (122) T COG3877 41 YLTSDQLEFVELFLRCRGNLKEVERELGISYPTVRTKLDEVLRA------MGYNPDSEN-SVNIGKKKIIDQLEKGEISP 113 (122) T ss_pred CCCHHHHHHHHHHHHHCCCHHHHHHHHCCCCHHHHHHHHHHHHH------CCCCCCCCC-HHHHHHHHHHHHHHCCCCCH T ss_conf 35875768999999972579999999777617899899999998------089989987-04553899999998178799 Q ss_pred HHHHHHHHC Q ss_conf 899998532 Q gi|254781013|r 167 ENVVEQIKK 175 (302) Q Consensus 167 ~~~~~~~~~ 175 (302) |...+-.+| T Consensus 114 eeA~~~L~k 122 (122) T COG3877 114 EEAIKMLNK 122 (122) T ss_pred HHHHHHHCC T ss_conf 999998519 No 10 >TIGR02814 pfaD_fam PfaD family protein; InterPro: IPR014179 The protein PfaD is part of a four-gene locus, similar to polyketide biosynthesis systems, which is responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. Several other members of the entry are found in loci presumed to act in polyketide biosyntheses per se.. Probab=35.98 E-value=7.3 Score=19.55 Aligned_cols=57 Identities=16% Similarity=0.455 Sum_probs=31.4 Q ss_pred CCC--CCEEEECCCCCCH----HHHHHHHHCCCC-CCCCCCCCCHHHCCCCCCHHHHHHHHHHHHHHCCC Q ss_conf 886--7235745688169----999998513544-44212465311032248988999999999984489 Q gi|254781013|r 57 MLD--EEPQAWKQGPVFV----GVYHRFKYFDSH-PIEVIMDLSFRIFPKIANQEIGEIMESIWNKYQSH 119 (302) Q Consensus 57 LFd--E~fEAW~yGPVvP----~VY~~FK~yg~~-pI~~~~D~s~ei~pki~DeEi~eILD~VWnkYG~y 119 (302) ||+ -++|--+-|--+| .||+.|+.|||- .|+...- ..+++.--+.-||+||+.-..| T Consensus 290 MFE~GvklQVLKrGtlFP~RANkLY~LYr~YdSle~l~~~~r------~~lE~~~Fkr~l~eVw~~T~~y 353 (449) T TIGR02814 290 MFELGVKLQVLKRGTLFPARANKLYELYRRYDSLEELDAKTR------AQLEKKYFKRSLDEVWEETRAY 353 (449) T ss_pred HHHCCCEEEEEECCCCCHHHCCHHHHHHCCCCCHHHCCHHHH------HHHHHHHCCCCHHHHHHHHHHH T ss_conf 544277688840252321011115798638988421487999------9999986178888999999997 No 11 >pfam05042 Caleosin Caleosin related protein. 
This family contains plant proteins related to caleosin. Caleosins contain calcium-binding domains and have an oleosin-like association with lipid bodies. Caleosins are present at relatively low levels and are mainly bound to microsomal membrane fractions at the early stages of seed development. As the seeds mature, overall levels of caleosins increased dramatically and they were associated almost exclusively with storage lipid bodies. This family is probably related to EF hands pfam00036. Probab=28.42 E-value=46 Score=14.58 Aligned_cols=56 Identities=21% Similarity=0.393 Sum_probs=39.1 Q ss_pred HHHHHHHHHHC-----CCCHHHHHHHHCCCCCC------------CHHCCCCCCCCCCCEECHHHHHHHCCCCC Q ss_conf 99999999844-----89989999965188876------------21216877878642021998664115787 Q gi|254781013|r 107 EIMESIWNKYQ-----SHSTEQLQEIVQEENKT------------WRRLYDPNDPNTNRTITVEEITKMADGSN 163 (302) Q Consensus 107 eILD~VWnkYG-----~ySA~qLeeLTHqEnSP------------WkkaYDP~dp~~N~~ItvE~itk~~~~~~ 163 (302) .=+++|+++|. ..|..+|.+|++.--.| |.-.|.-.. 
..+-..++|+|-.+-|||- T Consensus 96 ~kFE~IFsKya~~~~d~LT~~El~~ml~~nR~~~D~~GW~aa~~EW~~ly~L~~-d~dG~l~Ke~vR~vYDGSl 168 (174) T pfam05042 96 VNFEEIFSKYARTHPDALTLGELWVMTEANRDALDPFGWLASKGEWGLLYTLAK-DEEGFLSKEAVRRCFDGSL 168 (174) T ss_pred HHHHHHHHHHCCCCCCCCCHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHC-CCCCCEEHHHHHHHCCCHH T ss_conf 889999998466898745899999999955476673344889999999999953-6258761898978616158 No 12 >KOG2535 consensus Probab=26.53 E-value=42 Score=14.79 Aligned_cols=60 Identities=23% Similarity=0.508 Sum_probs=36.4 Q ss_pred HHHHCCCCCCCCCCCCCHHHCCCCCCHHHHHHHHHHHH--HHCCCCHHHHHHHHCC---CCCCCHHCCC Q ss_conf 98513544442124653110322489889999999999--8448998999996518---8876212168 Q gi|254781013|r 77 RFKYFDSHPIEVIMDLSFRIFPKIANQEIGEIMESIWN--KYQSHSTEQLQEIVQE---ENKTWRRLYD 140 (302) Q Consensus 77 ~FK~yg~~pI~~~~D~s~ei~pki~DeEi~eILD~VWn--kYG~ySA~qLeeLTHq---EnSPWkkaYD 140 (302) .|+.|-.+|-=- ....++||.+.-..+ =|.+.|+ +|+.|+..+|.++.-+ --.||-++|- T Consensus 307 qF~E~FenP~FR--~DGLKiYPTLVIrGT--GLyELWKtgrYk~Y~p~~LvdlvArILalVPPWtRvYR 371 (554) T KOG2535 307 QFKEYFENPAFR--PDGLKIYPTLVIRGT--GLYELWKTGRYKSYSPSALVDLVARILALVPPWTRVYR 371 (554) T ss_pred HHHHHHCCCCCC--CCCCEECCEEEEECC--CHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCHHHEEE T ss_conf 999983196768--787412226999344--27988752886668989999999999863785452343 No 13 >TIGR00920 2A060605 3-hydroxy-3-methylglutaryl-coenzyme A reductase; InterPro: IPR004816 Synonym(s): 3-hydroxy-3-methylglutaryl-coenzyme A reductase, HMG-CoA reductase. There are two distinct classes of hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase enzymes: class I consists of eukaryotic and most archaeal enzymes (1.1.1.34 from EC), while class II consists of prokaryotic enzymes (1.1.1.88 from EC) , . Class I HMG-CoA reductases catalyse the NADP-dependent synthesis of mevalonate from 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA). In vertebrates, membrane-bound HMG-CoA reductase is the rate-limiting enzyme in the biosynthesis of cholesterol and other isoprenoids. 
In plants, mevalonate is the precursor of all isoprenoid compounds. The reduction of HMG-CoA to mevalonate is regulated by feedback inhibition by sterols and non-sterol metabolites derived from mevalonate, including cholesterol. In archaea, HMG-CoA reductase is a cytoplasmic enzyme involved in the biosynthesis of the isoprenoid side chains of lipids. Class I HMG-CoA reductases consist of an N-terminal membrane domain (lacking in archaeal enzymes), and a C-terminal catalytic region. The catalytic region can be subdivided into three domains: an N-domain (N-terminal), a large L-domain, and a small S-domain (inserted within the L-domain). The L-domain binds the substrate, while the S-domain binds NADP. Class II HMG-CoA reductases catalyse the reverse reaction of class I enzymes, namely the NAD-dependent synthesis of HMG-CoA from mevalonate and CoA. Some bacteria, such as Pseudomonas mevalonii, can use mevalonate as the sole carbon source. Class II enzymes lack a membrane domain. Their catalytic region is structurally related to that of class I enzymes, but it consists of only two domains: a large L-domain and a small S-domain (inserted within the L-domain). As with class I enzymes, the L-domain binds substrate, but the S-domain binds NAD (instead of NADP in class I). This entry represents Metazoan class I HMG-CoA reductases, which are membrane-bound glycoproteins that remain in the endoplasmic reticulum after synthesis and glycosylation.; GO: 0004420 hydroxymethylglutaryl-CoA reductase (NADPH) activity, 0050661 NADP binding, 0008299 isoprenoid biosynthetic process, 0005789 endoplasmic reticulum membrane. 
Probab=26.46 E-value=32 Score=15.56 Aligned_cols=11 Identities=9% Similarity=0.646 Sum_probs=4.8 Q ss_pred HHHHHHHHHHH Q ss_conf 99999999996 Q gi|254781013|r 15 TIAVANFFIDK 25 (302) Q Consensus 15 aldVANyFI~k 25 (302) .+.++.+++.| T Consensus 353 S~~Lw~~~~~r 363 (988) T TIGR00920 353 SMPLWQFYLSR 363 (988) T ss_pred CHHHHHHHHHH T ss_conf 65479999961 No 14 >cd06431 GT8_LARGE_C LARGE catalytic domain has closest homology to GT8 glycosyltransferase involved in lipooligosaccharide synthesis. The catalytic domain of LARGE is a putative glycosyltransferase. Mutations of LARGE in mouse and human cause dystroglycanopathies, a disease associated with hypoglycosylation of the membrane protein alpha-dystroglycan (alpha-DG) and consequent loss of extracellular ligand binding. LARGE needs to both physically interact with alpha-dystroglycan and function as a glycosyltransferase in order to stimulate alpha-dystroglycan hyperglycosylation. LARGE localizes to the Golgi apparatus and contains three conserved DxD motifs. While two of the motifs are indispensable for glycosylation function, one is important for localization of the enzyme. LARGE was originally so named because it covers a large chunk of genomic DNA, more than 600bp long. The predicted protein structure contains an N-terminal cytoplasmic domain, a transmembrane region, a coiled-coil Probab=25.94 E-value=47 Score=14.49 Aligned_cols=52 Identities=12% Similarity=0.381 Sum_probs=35.4 Q ss_pred HHHHHHHHHCCCCHHHHHHHHCCCCCCCHHC--CCCC------CCCCCCEECHHHHHHHCC Q ss_conf 9999999844899899999651888762121--6877------878642021998664115 Q gi|254781013|r 108 IMESIWNKYQSHSTEQLQEIVQEENKTWRRL--YDPN------DPNTNRTITVEEITKMAD 160 (302) Q Consensus 108 ILD~VWnkYG~ySA~qLeeLTHqEnSPWkka--YDP~------dp~~N~~ItvE~itk~~~ 160 (302) =+++.|+.|..++..|+..|..+. +||... |... .+|-|+-.-.=..|+|-. 
T Consensus 113 ~I~eLW~~F~~f~~~q~~glape~-~~~Y~~~~~~~~~~~P~~G~G~NSGVmLmnL~rmR~ 172 (280) T cd06431 113 DIAELWKIFHKFTGQQVLGLVENQ-SDWYLGNLWKNHRPWPALGRGFNTGVILLDLDKLRK 172 (280) T ss_pred CHHHHHHHHHHCCHHHHHHCCCCC-CHHHHHHHHHCCCCCCCCCCCCCCCEEEEEHHHHHH T ss_conf 899999998745976753117545-413443255315889876666555316765677765 No 15 >pfam07106 TBPIP Tat binding protein 1(TBP-1)-interacting protein (TBPIP). This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat, then modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 then modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation. Probab=24.03 E-value=55 Score=14.08 Aligned_cols=29 Identities=21% Similarity=0.385 Sum_probs=19.9 Q ss_pred CCCCCCCCHHHHHHHHHHHHHHCCCCCCHHHHHHHH Q ss_conf 234889889999999999666359878899999999 Q gi|254781013|r 7 TQTNPPYSTIAVANFFIDKGVKYSIPIDHLKIQQFI 42 (302) Q Consensus 7 tq~nppYsaldVANyFI~kA~e~g~~ITnLKLQKLL 42 (302) -+.|.|||+.+|...+=. .++--.+||.| T Consensus 11 ~~qNRPys~~dv~~nL~~-------~~~K~~vqK~L 39 (169) T pfam07106 11 NEQNRPYSVQDVVDNLQN-------GLGKTAVQKAL 39 (169) T ss_pred HHHCCCCCHHHHHHHHHC-------CCCHHHHHHHH T ss_conf 983899849999998816-------24499999999 No 16 >TIGR01578 MiaB-like-B MiaB-like tRNA modifying enzyme, archaeal-type; InterPro: IPR006466 This clade of sequences is closely related to MiaB, a modifier of isopentenylated adenosine-37 of certain eukaryotic and bacterial tRNAs (see IPR006463 from INTERPRO). Sequence alignments suggest that this family of sequences perform the same chemical transformation as MiaB, perhaps on a different (or differently modified) tRNA base substrate. 
This clade represents a subfamily that spans the archaea and eukaryotes. The only archaeal miaB-like genes are in this group of sequences. Probab=22.64 E-value=25 Score=16.25 Aligned_cols=71 Identities=30% Similarity=0.390 Sum_probs=37.2 Q ss_pred HHHHHHHHHHCCCCHHHHHH---HHCCCCCCCHHCCCCCCCCCCCEECHHH------------HHHHCCCCCCC--HHHH Q ss_conf 99999999844899899999---6518887621216877878642021998------------66411578789--8899 Q gi|254781013|r 107 EIMESIWNKYQSHSTEQLQE---IVQEENKTWRRLYDPNDPNTNRTITVEE------------ITKMADGSNIS--PENV 169 (302) Q Consensus 107 eILD~VWnkYG~ySA~qLee---LTHqEnSPWkkaYDP~dp~~N~~ItvE~------------itk~~~~~~~~--~~~~ 169 (302) +.|.++.++--.--..+|-. ..|+++ --...++--+|-.|..|+.=- |||.|-|.-.| ||.+ T Consensus 100 ~rl~e~ve~~~~~~~~~L~~~~~~~~~~~-~~~~~~~~~~~~~~~~i~i~pI~~GC~~~CsYCi~K~ARG~L~S~PpEki 178 (487) T TIGR01578 100 ERLKELVEEILKRRSVQLLANKKKVLEES-EAKTLLKEPEPRKNPLIEILPINQGCLGNCSYCITKIARGKLASYPPEKI 178 (487) T ss_pred HHHHHHHHHHHHHHHHHHCCCCEEHHCCC-CCCHHCCCHHHHCCCCCCCCCCCCCCCCCCCEEEEEEEECCCCCCCCHHH T ss_conf 57899887764102332026870011033-13100032023146775555436663568875467776445248872256 Q ss_pred HHHHHCCCC Q ss_conf 998532433 Q gi|254781013|r 170 VEQIKKPVR 178 (302) Q Consensus 170 ~~~~~~~~~ 178 (302) |+++|.-+. T Consensus 179 V~~ar~l~~ 187 (487) T TIGR01578 179 VEKARELVA 187 (487) T ss_pred HHHHHHHHH T ss_conf 899999997 No 17 >TIGR03076 near_not_gcvH Chlamydial GcvH homolog upstream region protein. The H protein (GcvH) of the glycine cleavage system shuttles the methylamine group of glycine from the P protein to the T protein. Most Chlamydia lack the P and T proteins, and have a single homolog of GcvH that appears deeply split from canonical GcvH in molecular phylogenetic trees. The protein family modeled here is observed so far only in the Chlamydiae, always as part of a two-gene operon, upstream of the homolog of GcvH. Its function is unknown. 
Probab=21.83 E-value=61 Score=13.81 Aligned_cols=103 Identities=23% Similarity=0.350 Sum_probs=66.1 Q ss_pred CEECHHHHHHHCCCCCCCHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCHHHHH---HHHHHHHHHHHHHHHHHHH Q ss_conf 2021998664115787898899998532433377420567789998445407958888---8776447423124677654 Q gi|254781013|r 148 RTITVEEITKMADGSNISPENVVEQIKKPVRFDSDSEFATLESQLVKGFKSMPPQLTK---EFQEAQVQIKKDSAIARSI 224 (302) Q Consensus 148 ~~ItvE~itk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~ 224 (302) --|++|+.-|-..-.--..-.+..+|.+||. .-.|+ -.-||--+=|-.||-+-+ |+|+.|-.+.-|.+.+|.+ T Consensus 127 pFissEevWkssAP~l~~~f~~lq~~~~pVs---Pegf~-aRV~LFLeEkkfphy~LrqmLeYrrqmfnLp~D~~L~rg~ 202 (686) T TIGR03076 127 PFISSEEVWKSSAPQLRDALHIFQQIENPVS---PEGFA-ARVRLFLEEKKFPHYVLRQMLEYRRQMFNLPVDGSLAQGK 202 (686) T ss_pred CCCCHHHHHHHCCHHHHHHHHHHHHHCCCCC---HHHHH-HHHHHHHHHCCCCHHHHHHHHHHHHHHCCCCCCHHHHHCC T ss_conf 6445898874037778999999997338998---78899-9999998622598899999999999862899974665066 Q ss_pred HHCCEEEEEECCCCCHHHHHHHHHHHHHHHHHH Q ss_conf 311111344238610135678999999999997 Q gi|254781013|r 225 QESNIMRFAYSNRSFWVTIVWITIIVAFLLTQI 257 (302) Q Consensus 225 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 257 (302) +.--|.|.|-.=|-+--++...+.|++.-| T Consensus 203 ---dL~LFGY~~i~DWFg~~yvs~v~E~il~fi 232 (686) T TIGR03076 203 ---DLRLFGYRNIKDWFGDAYVSAAVEALLRFI 232 (686) T ss_pred ---CCEEECCCCHHHHHHHHHHHHHHHHHHHHH T ss_conf ---630213342887604889999999999999 No 18 >TIGR00130 frhD coenzyme F420-reducing hydrogenase delta subunit (putative coenzyme F420 hydrogenase processing subunit).; InterPro: IPR004411 This group of sequence are classed as unassigned endopeptidases belonging to the MEROPS peptidase family A31 (HybD endopeptidase family, clan AE). The sequences in this family represent the delta subunit, frhD, of the nickel-containing 8-hydroxy-5-deazaflavin reducing hydrogenase otherwise known as coenzyme F420-reducing hydrogenase. 
The delta subunit is not part of the active FRH heterotrimer, which contains three known subunits, alpha, beta, and gamma. The alpha subunit contains the metallo centre, which binds nickel. When nickel binds to the metallo centre the alpha subunit is processed to the active enzyme by cleavage at a conserved site near the C-terminus. This entry contains the archaeal hydrogenase maturation endopeptidases of family A31 that are thought to be responsible for these cleavages. At present there is no direct evidence to support their classification as peptidases.; GO: 0008233 peptidase activity, 0006464 protein modification process. Probab=21.80 E-value=43 Score=14.75 Aligned_cols=42 Identities=21% Similarity=0.401 Sum_probs=33.4 Q ss_pred CCCCHHHHHHHHCCCCCCCHHC-------CCCCCCCCCCEECHHHHHHHC Q ss_conf 4899899999651888762121-------687787864202199866411 Q gi|254781013|r 117 QSHSTEQLQEIVQEENKTWRRL-------YDPNDPNTNRTITVEEITKMA 159 (302) Q Consensus 117 G~ySA~qLeeLTHqEnSPWkka-------YDP~dp~~N~~ItvE~itk~~ 159 (302) ++-+|.+|..+--.|++|||+. |+- +||+=.+|.++|+-+-| T Consensus 44 AGtgg~~l~~~l~~e~~~~kKiiiVD~i~fg~-~PG~~~k~~v~~lpnna 92 (162) T TIGR00130 44 AGTGGFDLVNTLVDEEEKLKKIIIVDAIDFGL-EPGTVKKIEVEELPNNA 92 (162) T ss_pred CCCCHHHHHEECCCCCCCCCEEEEEEEEECCC-CCCEEEEECCCCCCCCC T ss_conf 67554565321003468757689998870767-98637761542066666 No 19 >TIGR00874 talAB transaldolase; InterPro: IPR004730 Transaldolase (2.2.1.2 from EC) catalyzes the reversible transfer of a three-carbon ketol unit from sedoheptulose 7-phosphate to glyceraldehyde 3-phosphate to form erythrose 4-phosphate and fructose 6-phosphate. This enzyme, together with transketolase, provides a link between the glycolytic and pentose-phosphate pathways. Transaldolase is an enzyme of about 34 Kd whose sequence has been well conserved throughout evolution. A lysine has been implicated in the catalytic mechanism of the enzyme; it acts as a nucleophilic group that attacks the carbonyl group of fructose-6-phosphate. 
Transaldolase is evolutionarily related to a bacterial protein of about 20 Kd (known as talC in Escherichia coli), whose exact function is not yet known.; GO: 0004801 transaldolase activity, 0006098 pentose-phosphate shunt, 0005737 cytoplasm. Probab=21.60 E-value=45 Score=14.60 Aligned_cols=83 Identities=19% Similarity=0.140 Sum_probs=38.2 Q ss_pred HHHHHHCCCCCCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCCCHHHCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHH Q ss_conf 99998448488672357456881699999985135444421246531103224898899999999998448998999996 Q gi|254781013|r 48 DVVLQKKKSMLDEEPQAWKQGPVFVGVYHRFKYFDSHPIEVIMDLSFRIFPKIANQEIGEIMESIWNKYQSHSTEQLQEI 127 (302) Q Consensus 48 ~~La~~gkpLFdE~fEAW~yGPVvP~VY~~FK~yg~~pI~~~~D~s~ei~pki~DeEi~eILD~VWnkYG~ySA~qLeeL 127 (302) ||.+.+|..=+ ..++=+-=-=|..+|+.||.||+.-+- +--+|+...+ |++=-==-+---|-.=|.+| T Consensus 188 WYka~~g~k~Y--~~~~DPGV~SV~~IY~YYK~~gy~T~v--MgASFR~~~e--------i~~LAGcD~LTIsP~LL~~L 255 (324) T TIGR00874 188 WYKASTGKKEY--SIEEDPGVASVKKIYNYYKKFGYKTEV--MGASFRNIEE--------ILALAGCDRLTISPALLDEL 255 (324) T ss_pred HHHHCCCCCCC--CCCCCCCCCCHHHHHHHHHHCCCCEEE--ECCCCCCHHH--------HHHHHCCCCCCCCHHHHHHH T ss_conf 88643788898--644577522257787776424995057--1531387888--------88874447001687899998 Q ss_pred HCCCCCCCHHCCCCCC Q ss_conf 5188876212168778 Q gi|254781013|r 128 VQEENKTWRRLYDPND 143 (302) Q Consensus 128 THqEnSPWkkaYDP~d 143 (302) -..+ .|=.|--+|.. T Consensus 256 ~~~~-~p~~rkL~~~~ 270 (324) T TIGR00874 256 KESE-GPVERKLDPES 270 (324) T ss_pred HHCC-CCCHHCCCCCC T ss_conf 4303-53000278654 No 20 >cd04418 NDPk5 Nucleoside diphosphate kinase homolog 5 (NDP kinase homolog 5, NDPk5, NM23-H5; Inhibitor of p53-induced apoptosis-beta, IPIA-beta): In human, mRNA for NDPk5 is almost exclusively found in testis, especially in the flagella of spermatids and spermatozoa, in association with axoneme microtubules, and may play a role in spermatogenesis by increasing the ability of late-stage spermatids to eliminate reactive oxygen species. 
It belongs to the nm23 Group II genes and appears to differ from the other human NDPks in that it lacks two important catalytic site residues, and thus does not appear to possess NDP kinase activity. NDPk5 confers protection from cell death by Bax and alters the cellular levels of several antioxidant enzymes, including glutathione peroxidase 5 (Gpx5). Probab=20.55 E-value=44 Score=14.71 Aligned_cols=69 Identities=14% Similarity=0.046 Sum_probs=45.1 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHCCCCCCHHHHHHHHH-HHHHHHHHHHCCCCCCCCEEEECCCCCCHHHH Q ss_conf 112348898899999999996663598788999999999-99699999844848867235745688169999 Q gi|254781013|r 5 TFTQTNPPYSTIAVANFFIDKGVKYSIPIDHLKIQQFIY-LTHCDVVLQKKKSMLDEEPQAWKQGPVFVGVY 75 (302) Q Consensus 5 ~~tq~nppYsaldVANyFI~kA~e~g~~ITnLKLQKLLY-yAQg~~La~~gkpLFdE~fEAW~yGPVvP~VY 75 (302) |+-...| +++.=..-+|.+-.+.|..|..+|+-.|-- .|..+|-...|++-|++-+.-..-|||+--++ T Consensus 3 Tl~iIKP--Dav~~~g~Il~~i~~~Gf~I~~~k~~~lt~~~a~~~Y~~~~gk~ff~~Lv~~mtSGPvvalvl 72 (132) T cd04418 3 TLAIIKP--DAVHKAEEIEDIILESGFTIVQKRKLQLSPEQCSDFYAEHYGKMFFPHLVAYMSSGPIVAMVL 72 (132) T ss_pred EEEEECC--CHHCCCCHHHHHHHHCCCEEEEHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHCCCCEEEEEE T ss_conf 7999892--053350399999998699990021205999999999999879874787743014788799997 No 21 >pfam09049 SNN_transmemb Stannin transmembrane. Members of this family consist of a single highly hydrophobic transmembrane helix that transverses the lipid bilayer at a 20 degree angle with respect to the membrane normal. They contain a conserved cysteine residue (Cys32) that, together with Cys34 found in the stannin unstructured linker domain, constitutes the putative trimethyltin-binding site that resides at the end of the transmembrane domain close to the lipid/solvent interface. 
Probab=20.21 E-value=34 Score=15.40 Aligned_cols=12 Identities=42% Similarity=1.010 Sum_probs=8.4 Q ss_pred HHHHHHHHHHHH Q ss_conf 998999999999 Q gi|254781013|r 275 VVSLSAIICGFW 286 (302) Q Consensus 275 ~~~~~~~~~~~~ 286 (302) |..|.+.|||.| T Consensus 22 vaalg~li~gcw 33 (33) T pfam09049 22 IAALGALILGCW 33 (33) T ss_pred HHHHHHHHEECC T ss_conf 998642550059 Done!