Query gi|254781034|ref|YP_003065447.1| hypothetical protein CLIBASIA_04680 [Candidatus Liberibacter asiaticus str. psy62] Match_columns 344 No_of_seqs 130 out of 162 Neff 6.2 Searched_HMMs 39220 Date Mon May 30 04:01:24 2011 Command /home/congqian_1/programs/hhpred/hhsearch -i 254781034.hhm -d /home/congqian_1/database/cdd/Cdd.hhm No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 COG4223 Uncharacterized protei 100.0 0 0 329.3 21.2 293 36-337 120-418 (422) 2 pfam09731 Mitofilin Mitochondr 100.0 1.8E-31 4.7E-36 214.1 31.8 240 97-337 302-559 (561) 3 KOG1854 consensus 99.8 3.8E-16 9.7E-21 118.9 26.2 163 176-339 470-649 (657) 4 PRK10920 putative uroporphyrin 95.8 0.13 3.2E-06 28.6 18.4 25 252-276 240-268 (389) 5 PRK06975 bifunctional uroporph 95.5 0.17 4.2E-06 27.9 19.9 50 255-304 541-601 (653) 6 pfam01601 Corona_S2 Coronaviru 94.6 0.3 7.8E-06 26.2 10.8 121 29-166 198-319 (609) 7 COG2959 HemX Uncharacterized e 84.5 2.6 6.7E-05 20.4 15.1 18 257-274 247-264 (391) 8 PRK13428 F0F1 ATP synthase sub 79.7 4 0.0001 19.3 20.8 24 46-69 3-26 (445) 9 COG4942 Membrane-bound metallo 74.6 5.5 0.00014 18.4 13.6 72 97-168 39-110 (420) 10 KOG2391 consensus 73.6 5.8 0.00015 18.2 10.8 103 125-240 240-346 (365) 11 pfam07889 DUF1664 Protein of u 72.8 6.1 0.00016 18.1 8.8 60 97-156 62-121 (126) 12 PRK09039 hypothetical protein; 68.9 7.4 0.00019 17.6 19.4 28 139-166 138-165 (343) 13 PRK09793 methyl-accepting prot 68.9 7.4 0.00019 17.6 16.0 31 293-323 466-496 (533) 14 TIGR00020 prfB peptide chain r 63.9 9.3 0.00024 17.0 9.9 146 133-288 4-158 (373) 15 COG4980 GvpP Gas vesicle prote 60.5 11 0.00027 16.6 10.7 19 36-54 5-23 (115) 16 pfam06160 EzrA Septation ring 59.1 11 0.00029 16.5 20.0 110 224-339 203-316 (559) 17 TIGR02508 type_III_yscG type I 57.3 12 0.00031 16.3 4.9 64 278-344 37-103 (118) 18 COG4649 Uncharacterized protei 55.9 3.6 9.1E-05 19.6 0.2 26 287-312 174-199 (221) 19 pfam02912 Phe_tRNA-synt_N Amin 53.9 14 0.00035 15.9 4.4 52 285-338 12-63 (73) 20 TIGR03319 YmdA_YtgF 
conserved 53.6 14 0.00036 15.9 12.0 16 307-322 425-440 (514) 21 pfam04375 HemX HemX. This fami 52.0 15 0.00038 15.7 28.6 24 252-275 234-261 (372) 22 COG3118 Thioredoxin domain-con 51.4 15 0.00039 15.7 5.2 26 59-84 69-94 (304) 23 pfam11853 DUF3373 Protein of u 48.2 17 0.00043 15.4 4.6 34 138-171 31-64 (485) 24 PRK13453 F0F1 ATP synthase sub 47.6 17 0.00044 15.3 11.5 45 40-84 14-58 (173) 25 pfam12072 DUF3552 Domain of un 45.5 19 0.00048 15.1 11.3 17 46-62 6-22 (201) 26 KOG2629 consensus 45.3 19 0.00048 15.1 10.4 27 135-161 158-184 (300) 27 COG4372 Uncharacterized protei 43.7 20 0.00051 14.9 14.3 21 85-105 77-97 (499) 28 PRK04778 septation ring format 41.1 22 0.00056 14.7 20.1 110 224-339 207-320 (569) 29 pfam05440 MtrB Tetrahydrometha 40.7 21 0.00053 14.8 2.2 28 34-61 67-94 (97) 30 TIGR02168 SMC_prok_B chromosom 36.0 26 0.00067 14.2 16.1 95 224-321 527-630 (1191) 31 PRK10564 maltose regulon perip 35.7 27 0.00068 14.2 2.7 28 278-305 255-282 (303) 32 pfam07148 MalM Maltose operon 34.4 28 0.00071 14.0 3.2 29 277-305 229-258 (279) 33 PRK00965 tetrahydromethanopter 33.9 28 0.00072 14.0 2.4 34 33-66 67-100 (108) 34 TIGR02978 phageshock_pspC phag 32.8 29 0.00075 13.9 3.9 77 35-117 32-119 (128) 35 PRK11638 lipopolysaccharide bi 32.4 30 0.00076 13.8 2.3 28 33-62 26-53 (348) 36 PRK06287 cobalt transport prot 28.9 34 0.00087 13.5 3.0 27 38-64 74-100 (105) 37 KOG0709 consensus 27.1 37 0.00094 13.3 6.5 102 120-228 268-370 (472) 38 COG4062 MtrB Tetrahydromethano 26.9 37 0.00095 13.2 3.7 26 35-60 69-94 (108) 39 pfam04156 IncA IncA protein. 
C 26.8 37 0.00095 13.2 16.2 25 134-158 124-148 (186) 40 pfam02605 PsaL Photosystem I r 25.1 40 0.001 13.0 1.8 35 25-59 115-149 (154) 41 PRK10697 DNA-binding transcrip 23.8 43 0.0011 12.9 5.5 70 35-115 38-108 (119) 42 TIGR00996 Mtu_fam_mce virulenc 23.2 44 0.0011 12.8 12.3 13 178-190 268-280 (304) 43 pfam01034 Syndecan Syndecan do 22.4 45 0.0012 12.7 3.3 24 36-59 146-169 (207) 44 PRK12705 hypothetical protein; 21.4 47 0.0012 12.6 14.0 27 37-63 5-31 (485) 45 PRK13729 conjugal transfer pil 20.7 49 0.0013 12.5 7.0 13 310-322 373-385 (474) No 1 >COG4223 Uncharacterized protein conserved in bacteria [Function unknown] Probab=100.00 E-value=0 Score=329.27 Aligned_cols=293 Identities=23% Similarity=0.264 Sum_probs=248.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHH------HHHHHHHHHHHHH Q ss_conf 4667776423467899999753320323202663351033367888731123232200278------9999999999999 Q gi|254781034|r 36 KFFWEKILSNKTFFKILALVCVIVLTFIFIFTALFTEKFLRTDNNLLLLPSVSPLKEDPKD------ISPVIEKEIISQN 109 (344) Q Consensus 36 ~~~~~~~~~~~~~ggiial~~~~~lq~~~~~~~~~~~~~a~~~~~~~~~~~~~~q~~~~~~------~l~~~~~el~~~~ 109 (344) .+.-.|++.++|.||+|+|...++|||.|+++.+..+ +.+.+.++-++..+..+...... 
....++|.| T Consensus 120 q~~~~g~iaAgi~gg~IAla~ag~Lq~ag~v~apg~~-~a~~~e~a~l~seiaglk~~g~a~~~Aapd~s~leqri---- 194 (422) T COG4223 120 QAGGEGVIAAGIDGGLIALAGAGALQYAGRVPAPGVG-DAGLLEIAFLKSEIAGLKWFGPANAPAAPDSSGLEQRI---- 194 (422) T ss_pred CCCCCCCEECCCCCCEEECCCCCHHHHCCCCCCCCCC-CCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHH---- T ss_conf 5676320013665412231676333432505789876-20368899999988899874445675586541167764---- Q ss_pred HHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHH Q ss_conf 99997531101447889988888763367899999999999999999999999999500124766999999999999999 Q gi|254781034|r 110 LSIAQQKDEETADKELANTQNFNIKPLLEEIASLKQLISDLSKNYQDIVTRLTKMETLTANPLRNPNTQRMVSLLILKNA 189 (344) Q Consensus 110 ~~~~s~~~e~~~~~e~~~~~~~~~~~~~~~v~~Le~~~~~~~~~~~~l~~~l~~~e~~~~~~~~~~~~~~~~A~~~L~~A 189 (344) ..+.+...+.-.. +......+..+....+.|..+++......+++..|+..+|...+.+.+++.+++.++++.||++ T Consensus 195 aal~aa~~~p~p~---v~al~~avtal~~~~salp~ersta~Aa~ael~gRiaalEqs~ne~ad~ieaA~aiaatalKtA 271 (422) T COG4223 195 AALEAASAEPAPR---VKALEVAVTALLPLESALPAERSTALAAVAELNGRIAALEQSLNEPADDIEAALAIAATALKTA 271 (422) T ss_pred HHHHHHHCCCCCC---HHHHHHHHHHCCCHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHH T ss_conf 3316654278984---2689999985030221162345667778998751699999874353168999999999999998 Q ss_pred HHCCCCHHHHHHHHHHHCCCCCCHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHEEE Q ss_conf 96699628999999951378842178887555289998999999999999999752046777888999999998763241 Q gi|254781034|r 190 LDKGEYSSLNTTMQENFSVLKPCTATLMQFANIKIPTTIEILAKFPKVSEEMVFASESLEKDSGFANYLLFQLTRLVKVR 269 (344) Q Consensus 190 i~~G~pf~~eL~~l~~~~~~~~~l~~L~~~A~~Gvpt~a~L~~~F~~~A~~~~~a~~~~~~~~g~~~~l~~~~~slv~vR 269 (344) ||+|+||..||++|..+.|++|.+..|.+|+.+||||+..|..+|+.++.+++.+.+.+++|.|+|+|+++.++|+|+|| T Consensus 272 idrggPF~aELdtL~~VaP~dP~l~~L~~~A~tGvPTRaeL~~qF~~~AnamvsA~~~pd~nagl~~rL~~Sa~slVsVR 351 (422) T COG4223 272 IDRGGPFLAELDTLESVAPGDPALAALRPYAATGVPTRAELATQFGAVANAMVSASNNPDPNAGLFDRLRSSASSLVSVR 351 (422) T ss_pred 
HHCCCCCHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHCCEEEE T ss_conf 86289851777667640899803677667776389838999998788998999733489987029999999874351243 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 27887788898889999999998299899999897199888999999999999999999999999975 Q gi|254781034|r 270 PIGGNIEGDAITDVIARIENNLKTGDLVKAAAEWDKIPEKARQPSMFLRNALEAHICSDAILKEEMAK 337 (344) Q Consensus 270 ~~~~~~~G~~~dailaRae~aL~~GdL~~Al~el~~Lp~~a~~~~~~w~~~~eaRl~ad~~~~~~~a~ 337 (344) ++ |+++|++++++++|||++|++|||.+|+.||+.||+.+|.++++|..++++|+.+|.+|..+.+. T Consensus 352 pV-GsveG~t~~a~iARmEa~L~~GDl~gA~~ewd~LpeaaKaa~a~f~~~l~aRieve~~V~a~va~ 418 (422) T COG4223 352 PV-GSVEGSTPDAMIARMEAALDNGDLEGAVLEWDSLPEAAKAASADFAVKLKARIEVETLVDALVAD 418 (422) T ss_pred EC-CCCCCCCCCHHHHHHHHHHHCCCHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 22-55578984068999999875445676777531471889986056899887666389999999863 No 2 >pfam09731 Mitofilin Mitochondrial inner membrane protein. Mitofilin controls mitochondrial cristae morphology. Mitofilin is enriched in the narrow space between the inner boundary and the outer membranes, where it forms a homotypic interaction and assembles into a large multimeric protein complex. The first 78 amino acids contain a typical amino-terminal-cleavable mitochondrial presequence rich in positive-charged and hydroxylated residues and a membrane anchor domain. In addition, it has three centrally located coiled coil domains. 
Probab=100.00 E-value=1.8e-31 Score=214.13 Aligned_cols=240 Identities=20% Similarity=0.179 Sum_probs=209.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHH Q ss_conf 9999999999999999975311014478899888887633678999-999999999999999999999995001247669 Q gi|254781034|r 97 ISPVIEKEIISQNLSIAQQKDEETADKELANTQNFNIKPLLEEIAS-LKQLISDLSKNYQDIVTRLTKMETLTANPLRNP 175 (344) Q Consensus 97 ~l~~~~~el~~~~~~~~s~~~e~~~~~e~~~~~~~~~~~~~~~v~~-Le~~~~~~~~~~~~l~~~l~~~e~~~~~~~~~~ 175 (344) ++...+++..........+..++...++...+.+.....+...|+. ++.++..+..++.++.+++..+|.....+.... T Consensus 302 ~l~~~~e~kL~~eL~r~~ea~~~~L~n~l~~q~iEl~r~f~~~i~ekve~ER~grl~kL~el~a~l~~LE~a~~~~~~~~ 381 (561) T pfam09731 302 ELRKKYEEKLRTELERQAEAHEQKLKNELAEQAIELQREFEKEIKEKVEEERNGRLAKLAELNSRLKGLEKALDSRSEAE 381 (561) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 99999999999999999999999999999999999999999999999999987689889999999999999999999999 Q ss_pred H-----HHHHHHHHHHHHHHHCC-----CCHHHHHHHHHHHCCCCCCHHHHHH----HH-HCCCCCHHHHHHHHHHHHHH Q ss_conf 9-----99999999999999669-----9628999999951378842178887----55-52899989999999999999 Q gi|254781034|r 176 N-----TQRMVSLLILKNALDKG-----EYSSLNTTMQENFSVLKPCTATLMQ----FA-NIKIPTTIEILAKFPKVSEE 240 (344) Q Consensus 176 ~-----~~~~~A~~~L~~Ai~~G-----~pf~~eL~~l~~~~~~~~~l~~L~~----~A-~~Gvpt~a~L~~~F~~~A~~ 240 (344) . .++++|+.+|++++++| .||..+|.+++.++.+++.+.++.. .+ ..||+|..+|..+|..+++. 
T Consensus 382 ~~n~~~~qL~~Av~aLk~~L~~~~~~~p~Pl~~el~~lk~~a~~d~~v~~~~~~i~~~a~~~gv~s~~~L~~rf~~v~~~ 461 (561) T pfam09731 382 DENHRVQQLWLAVEALKSALKSGSAGSPRPLKKELDALKELAKDDELVDAALASLPPEASQRGILSEEQLRNRFNLLAPE 461 (561) T ss_pred HHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHCCCCHHHHHHHHHCCHHHHCCCCCCHHHHHHHHHHHHHH T ss_conf 99999999999999999998658988887759999999986689669999998579877537999999999999999999 Q ss_pred HHHHHCCCCCCCCHHHHHHHHHHHHHEEECCCC--CCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHHHHH Q ss_conf 997520467778889999999987632412788--778889888999999999829989999989719988899999999 Q gi|254781034|r 241 MVFASESLEKDSGFANYLLFQLTRLVKVRPIGG--NIEGDAITDVIARIENNLKTGDLVKAAAEWDKIPEKARQPSMFLR 318 (344) Q Consensus 241 ~~~a~~~~~~~~g~~~~l~~~~~slv~vR~~~~--~~~G~~~dailaRae~aL~~GdL~~Al~el~~Lp~~a~~~~~~w~ 318 (344) ++.+++.| +++|+++|+.+++.|++.+++.++ .++|+|++.||+|++++|.+|||+.|+.+++.|.||+|.++.||+ T Consensus 462 ~r~~al~p-~~~g~~~~~~s~~~S~l~~~~~~~~~~~~~~d~~~vl~ra~~~l~~gdl~~A~~~~n~L~G~~r~la~dWl 540 (561) T pfam09731 462 LRKASLLP-ENAGLLGHLLSYLFSKLLFKPKQGEADPDGDDVESVLARAEYNLERGDLDKAAREVNSLKGWSRKLASDWL 540 (561) T ss_pred HHHHHCCC-CCCCHHHHHHHHHHHHHEECCCCCCCCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCCHHHHHHHHHH T ss_conf 86866169-99878999999999870226777767887687899999999999718899999999847535689899999 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 9999999999999999975 Q gi|254781034|r 319 NALEAHICSDAILKEEMAK 337 (344) Q Consensus 319 ~~~eaRl~ad~~~~~~~a~ 337 (344) ..++.||.+..++.-+.+- T Consensus 541 ~eaR~~LE~~q~~~~l~ae 559 (561) T pfam09731 541 KEARRRLEVEQALDLLDAE 559 (561) T ss_pred HHHHHHHHHHHHHHHHHHH T ss_conf 9999999999999999973 No 3 >KOG1854 consensus Probab=99.80 E-value=3.8e-16 Score=118.90 Aligned_cols=163 Identities=15% Similarity=0.145 Sum_probs=137.0 Q ss_pred HHHHHHHHHHHHHHHHCC------CCHHHHHHHHHHHCCCCCCHHHHHHH-----HHCCCCCHHHHHHHHHHHHHHHHHH Q ss_conf 999999999999999669------96289999999513788421788875-----5528999899999999999999975 Q gi|254781034|r 176 
NTQRMVSLLILKNALDKG------EYSSLNTTMQENFSVLKPCTATLMQF-----ANIKIPTTIEILAKFPKVSEEMVFA 244 (344) Q Consensus 176 ~~~~~~A~~~L~~Ai~~G------~pf~~eL~~l~~~~~~~~~l~~L~~~-----A~~Gvpt~a~L~~~F~~~A~~~~~a 244 (344) ..++++++..|+.-+..| .|....+..++...++++.+.++... -..||.|-.+|..+|..+.+-++.. T Consensus 470 a~q~w~ac~nlk~s~~~g~~e~r~~pLg~~vn~~k~~~~~delv~a~~~~ipk~~~~rgiysee~L~~RF~~l~ki~rr~ 549 (657) T KOG1854 470 AKQLWLACSNLKDSLNKGHYEMRRHPLGKHVNALKEVTKDDELVAAALDSIPKEADTRGIYSEEDLRNRFNTLSKIARRT 549 (657) T ss_pred HHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHCCCCCHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHHHHH T ss_conf 77999999877876640541113473267899884469958999999984562003688777899999999999998886 Q ss_pred HCCCCCCCCHHHHHHHHHHHHHEEE--CCCCCC----CCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHHHHH Q ss_conf 2046777888999999998763241--278877----8889888999999999829989999989719988899999999 Q gi|254781034|r 245 SESLEKDSGFANYLLFQLTRLVKVR--PIGGNI----EGDAITDVIARIENNLKTGDLVKAAAEWDKIPEKARQPSMFLR 318 (344) Q Consensus 245 ~~~~~~~~g~~~~l~~~~~slv~vR--~~~~~~----~G~~~dailaRae~aL~~GdL~~Al~el~~Lp~~a~~~~~~w~ 318 (344) +..++ ++|+++.....++|++.++ ..+... .-.+...||+|+.+.+..|||++|+..++.|.+|.|.++.||+ T Consensus 550 a~l~e-~gg~lg~yf~sl~Slfl~~~~q~g~~~~~~p~~~d~~~iLsrA~~~~~~gdl~~Avr~v~lLkG~pr~va~dWi 628 (657) T KOG1854 550 ALLPE-EGGFLGQYFLSLQSLFLLSPQQLGNPVFLDPNITDTYKILSRARYHLLKGDLDDAVRVVNLLKGWPRKVARDWI 628 (657) T ss_pred HCCCC-CCCHHHHHHHHHHHHEEECHHHCCCCCCCCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHHCCCHHHHHHHHH T ss_conf 10178-87629889998603425257642898657820100899999999998506588999999983243388899999 Q ss_pred HHHHHHHHHHHHHHHHHHHHH Q ss_conf 999999999999999997533 Q gi|254781034|r 319 NALEAHICSDAILKEEMAKIP 339 (344) Q Consensus 319 ~~~eaRl~ad~~~~~~~a~~~ 339 (344) ..+++++....++.-++|-.. 
T Consensus 629 ~daRr~lE~qql~eiL~AhAa 649 (657) T KOG1854 629 KDARRRLETQQLVEILKAHAA 649 (657) T ss_pred HHHHHHHHHHHHHHHHHHHHH T ss_conf 999999999999999999999 No 4 >PRK10920 putative uroporphyrinogen III C-methyltransferase; Provisional Probab=95.85 E-value=0.13 Score=28.61 Aligned_cols=25 Identities=12% Similarity=0.143 Sum_probs=16.4 Q ss_pred CCHHHHH----HHHHHHHHEEECCCCCCC Q ss_conf 8889999----999987632412788778 Q gi|254781034|r 252 SGFANYL----LFQLTRLVKVRPIGGNIE 276 (344) Q Consensus 252 ~g~~~~l----~~~~~slv~vR~~~~~~~ 276 (344) ++|..++ .+|+.++|+||+.+..++ T Consensus 240 ~~Wq~~l~~sw~~fl~~~I~Irr~d~~~~ 268 (389) T PRK10920 240 SEWRQNLQKSWQNFMDDFITIRRRDDTAV 268 (389) T ss_pred HHHHHHHHHHHHHHHHHHEEEEECCCCCC T ss_conf 58999999999999975257741798646 No 5 >PRK06975 bifunctional uroporphyrinogen-III synthetase/uroporphyrin-III C-methyltransferase; Reviewed Probab=95.54 E-value=0.17 Score=27.87 Aligned_cols=50 Identities=18% Similarity=0.037 Sum_probs=27.2 Q ss_pred HHHHHHHHHHHHEEECCCCCCCCC-CHHH----------HHHHHHHHHHCCCHHHHHHHHH Q ss_conf 999999998763241278877888-9888----------9999999998299899999897 Q gi|254781034|r 255 ANYLLFQLTRLVKVRPIGGNIEGD-AITD----------VIARIENNLKTGDLVKAAAEWD 304 (344) Q Consensus 255 ~~~l~~~~~slv~vR~~~~~~~G~-~~da----------ilaRae~aL~~GdL~~Al~el~ 304 (344) +..++.+++++|+||+.+..++.- +|+- -|-.++-+|-++|=.-.-..|+ T Consensus 541 ~~~~~~~l~~lI~IRr~D~~~~pLLsPeQ~~~LReNLrL~LlqAqlALL~~~~~~Y~~sL~ 601 (653) T PRK06975 541 SAGLGEQLKGLVQVRRIDNADAMLLSPDQGYFLRENVKLRLLNARLSLLSRNDAAFKSDLH 601 (653) T ss_pred HHHHHHHHHCCEEEEECCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHH T ss_conf 9999999638489997898745685957899999999999999999998368799999999 No 6 >pfam01601 Corona_S2 Coronavirus S2 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 pfam01600 and S2. 
Probab=94.62 E-value=0.3 Score=26.22 Aligned_cols=121 Identities=14% Similarity=0.068 Sum_probs=57.7 Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHH-HHHHHHCHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 45433014667776423467899999-75332032320266335103336788873112323220027899999999999 Q gi|254781034|r 29 VKKITWRKFFWEKILSNKTFFKILAL-VCVIVLTFIFIFTALFTEKFLRTDNNLLLLPSVSPLKEDPKDISPVIEKEIIS 107 (344) Q Consensus 29 ~~~~~~~~~~~~~~~~~~~~ggiial-~~~~~lq~~~~~~~~~~~~~a~~~~~~~~~~~~~~q~~~~~~~l~~~~~el~~ 107 (344) ........++...+.+++.+||+-+. +|+|.+|...++.......+..++..-.++ +.+....+.+.. T Consensus 198 v~da~~~amYT~sL~g~ma~ggitaaaaIPFa~~vQ~RlN~lglt~~VL~eNQk~iA-----------~sFN~Ai~~iq~ 266 (609) T pfam01601 198 VVDAEMMAMYTASLVGAMALGGITAAAAIPFATQVQARLNYVGLTTDVLQENQKLIA-----------NSFNKALGNIQD 266 (609) T ss_pred CCCHHHHHHHHHHHHHHHHHCCCHHHEECCHHHHHHHHHHHHEEEHHHHHHHHHHHH-----------HHHHHHHHHHHH T ss_conf 788789988999999987614510310174699899886334020999998899999-----------999999999988 Q ss_pred HHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99999975311014478899888887633678999999999999999999999999995 Q gi|254781034|r 108 QNLSIAQQKDEETADKELANTQNFNIKPLLEEIASLKQLISDLSKNYQDIVTRLTKMET 166 (344) Q Consensus 108 ~~~~~~s~~~e~~~~~e~~~~~~~~~~~~~~~v~~Le~~~~~~~~~~~~l~~~l~~~e~ 166 (344) -+.+.++. ..+....-+..-..+.+.+..|..--...++.+.++.+|++++|. 
T Consensus 267 gf~tts~A------L~kiQdVVN~q~~aL~~L~~qL~~nFgAISssI~dIy~RLD~leA 319 (609) T pfam01601 267 GFTTTASA------LSKIQDVVNQQGQALSQLTNQLSNNFGAISSSIQDIYSRLDALEA 319 (609) T ss_pred HHHHHHHH------HHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHH T ss_conf 88999999------999999998889999999999875302156789999998887776 No 7 >COG2959 HemX Uncharacterized enzyme of heme biosynthesis [Coenzyme metabolism] Probab=84.55 E-value=2.6 Score=20.41 Aligned_cols=18 Identities=17% Similarity=0.121 Sum_probs=13.8 Q ss_pred HHHHHHHHHHEEECCCCC Q ss_conf 999999876324127887 Q gi|254781034|r 257 YLLFQLTRLVKVRPIGGN 274 (344) Q Consensus 257 ~l~~~~~slv~vR~~~~~ 274 (344) ...+|+..+|.+||.+.+ T Consensus 247 s~~sfl~~fi~irrrd~~ 264 (391) T COG2959 247 SSRSFLDNFITIRRRDDN 264 (391) T ss_pred HHHHHHHHHEEEEECCCC T ss_conf 999998532055124777 No 8 >PRK13428 F0F1 ATP synthase subunit delta; Provisional Probab=79.73 E-value=4 Score=19.30 Aligned_cols=24 Identities=8% Similarity=0.332 Sum_probs=13.8 Q ss_pred HHHHHHHHHHHHHHHCHHHHCCCC Q ss_conf 467899999753320323202663 Q gi|254781034|r 46 KTFFKILALVCVIVLTFIFIFTAL 69 (344) Q Consensus 46 ~~~ggiial~~~~~lq~~~~~~~~ 69 (344) .|.|-+|++++.+-+=|-+++|.. T Consensus 3 ~fIgqLI~Faii~f~~~KfVvP~~ 26 (445) T PRK13428 3 TFIGQLIGFAVIVFLVVRFVVPPV 26 (445) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 399999999999999999985279 No 9 >COG4942 Membrane-bound metallopeptidase [Cell division and chromosome partitioning] Probab=74.62 E-value=5.5 Score=18.41 Aligned_cols=72 Identities=21% Similarity=0.252 Sum_probs=36.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999999999999999997531101447889988888763367899999999999999999999999999500 Q gi|254781034|r 97 ISPVIEKEIISQNLSIAQQKDEETADKELANTQNFNIKPLLEEIASLKQLISDLSKNYQDIVTRLTKMETLT 168 (344) Q Consensus 97 ~l~~~~~el~~~~~~~~s~~~e~~~~~e~~~~~~~~~~~~~~~v~~Le~~~~~~~~~~~~l~~~l~~~e~~~ 168 (344) ++....++|....................+......++.+.+++...+..+..+..+++++..++..++.+. 
T Consensus 39 ~l~q~q~ei~~~~~~i~~~~~~~~kL~~~lk~~e~~i~~~~~ql~~s~~~l~~~~~~I~~~~~~l~~l~~q~ 110 (420) T COG4942 39 QLKQIQKEIAALEKKIREQQDQRAKLEKQLKSLETEIASLEAQLIETADDLKKLRKQIADLNARLNALEVQE 110 (420) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 899899999999999999999999999999999988999999999988689998857999999999988899 No 10 >KOG2391 consensus Probab=73.56 E-value=5.8 Score=18.24 Aligned_cols=103 Identities=16% Similarity=0.090 Sum_probs=64.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHH Q ss_conf 89988888763367899999999999999999999999999500124766999999999999999966996289999999 Q gi|254781034|r 125 LANTQNFNIKPLLEEIASLKQLISDLSKNYQDIVTRLTKMETLTANPLRNPNTQRMVSLLILKNALDKGEYSSLNTTMQE 204 (344) Q Consensus 125 ~~~~~~~~~~~~~~~v~~Le~~~~~~~~~~~~l~~~l~~~e~~~~~~~~~~~~~~~~A~~~L~~Ai~~G~pf~~eL~~l~ 204 (344) -.++.......+...+..||+++.++..+.+-+.+...+......+ .+ ...+..+++.+.|.-..+-..- T Consensus 240 t~EeL~~G~~kL~~~~etLEqq~~~L~~niDIL~~k~~eal~~~~n-~~---------~~~~D~~~~~~~~l~kq~l~~~ 309 (365) T KOG2391 240 TEEELNIGKQKLVAMKETLEQQLQSLQKNIDILKSKVREALEKAEN-LE---------ALDIDEAIECTAPLYKQILECY 309 (365) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCC-CC---------CCCCHHHHHCCCHHHHHHHHHH T ss_conf 6999986699999999999999999986458988999997765416-76---------7981033320336899999864 Q ss_pred HHCCCCCCHH----HHHHHHHCCCCCHHHHHHHHHHHHHH Q ss_conf 5137884217----88875552899989999999999999 Q gi|254781034|r 205 NFSVLKPCTA----TLMQFANIKIPTTIEILAKFPKVSEE 240 (344) Q Consensus 205 ~~~~~~~~l~----~L~~~A~~Gvpt~a~L~~~F~~~A~~ 240 (344) + ++.+++ -|......||-.+.+-....+.++|+ T Consensus 310 --A-~d~aieD~i~~L~~~~r~G~i~l~~yLr~VR~lsRe 346 (365) T KOG2391 310 --A-LDLAIEDAIYSLGKSLRDGVIDLDQYLRHVRLLSRE 346 (365) T ss_pred --H-HHHHHHHHHHHHHHHHHCCEEEHHHHHHHHHHHHHH T ss_conf --1-466789999999888754810099999999998899 No 11 >pfam07889 DUF1664 Protein of unknown function (DUF1664). 
The members of this family are hypothetical plant proteins of unknown function. The region featured in this family is approximately 100 amino acids long. Probab=72.81 E-value=6.1 Score=18.13 Aligned_cols=60 Identities=12% Similarity=0.201 Sum_probs=23.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999999999999999997531101447889988888763367899999999999999999 Q gi|254781034|r 97 ISPVIEKEIISQNLSIAQQKDEETADKELANTQNFNIKPLLEEIASLKQLISDLSKNYQD 156 (344) Q Consensus 97 ~l~~~~~el~~~~~~~~s~~~e~~~~~e~~~~~~~~~~~~~~~v~~Le~~~~~~~~~~~~ 156 (344) .=+++.|+|......+-.++.-.....+.+.+...++..+..++..+...+..+-..+.. T Consensus 62 tKrhLsqRI~~vd~kld~~~eis~~i~~eV~e~~~~~~~i~~D~~~v~~~v~~Le~Ki~~ 121 (126) T pfam07889 62 TKKHLSQRIDNLDDKLDEQKEISESTRDEVTEIREDLSNIGEDVKSVQQAVEGLEGKLDS 121 (126) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 999999988735321999999999999999999964998876999999999989999988 No 12 >PRK09039 hypothetical protein; Validated Probab=68.94 E-value=7.4 Score=17.60 Aligned_cols=28 Identities=21% Similarity=0.322 Sum_probs=11.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 8999999999999999999999999995 Q gi|254781034|r 139 EIASLKQLISDLSKNYQDIVTRLTKMET 166 (344) Q Consensus 139 ~v~~Le~~~~~~~~~~~~l~~~l~~~e~ 166 (344) +|..|.+.+..+..+...+...+...+. T Consensus 138 qi~~Ln~Qi~aLr~qL~~l~~~L~~~e~ 165 (343) T PRK09039 138 QVELLNQQIAALRRQLAALEAALDASEK 165 (343) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9999999999999999999999999998 No 13 >PRK09793 methyl-accepting protein IV; Provisional Probab=68.93 E-value=7.4 Score=17.60 Aligned_cols=31 Identities=13% Similarity=0.114 Sum_probs=14.9 Q ss_pred CCCHHHHHHHHHHCCHHHHHHHHHHHHHHHH Q ss_conf 2998999998971998889999999999999 Q gi|254781034|r 293 TGDLVKAAAEWDKIPEKARQPSMFLRNALEA 323 (344) Q Consensus 293 ~GdL~~Al~el~~Lp~~a~~~~~~w~~~~ea 323 (344) ...+..++.+++...+.....+..-...++. 
T Consensus 466 ~~qI~~av~~i~~~tqqnaa~vee~a~aa~~ 496 (533) T PRK09793 466 IEQVAQAVSQMDQVTQQNASLVEEAAVATEQ 496 (533) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9999999999999999999999999999999 No 14 >TIGR00020 prfB peptide chain release factor 2; InterPro: IPR004374 In many but not all taxa, there is a conserved real translational frameshift at a TGA codon. RF-2 helps terminate translation at TGA codons and can therefore regulate its own production by readthrough when RF-2 is insufficient. There is a superfamily IPR000352 from INTERPRO of RF-1, RF-2, mitochondrial, RF-H, etc proteins.; GO: 0016149 translation release factor activity codon specific, 0006415 translational termination, 0005737 cytoplasm. Probab=63.87 E-value=9.3 Score=16.99 Aligned_cols=146 Identities=16% Similarity=0.132 Sum_probs=89.1 Q ss_pred HHHHHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHCCHH---HHHHHH-HHHHHHHHHHHHHCCCCH---HHHHHHH Q ss_conf 76336789999999999999--99999999999995001247---669999-999999999999669962---8999999 Q gi|254781034|r 133 IKPLLEEIASLKQLISDLSK--NYQDIVTRLTKMETLTANPL---RNPNTQ-RMVSLLILKNALDKGEYS---SLNTTMQ 203 (344) Q Consensus 133 ~~~~~~~v~~Le~~~~~~~~--~~~~l~~~l~~~e~~~~~~~---~~~~~~-~~~A~~~L~~Ai~~G~pf---~~eL~~l 203 (344) +..+...|..|...++.... ..+.+.+++++++.+.++|. +...++ ..-....|+..|++=.-. 
..+|..| T Consensus 4 ~~~~~~~~~~L~~rl~~~~~~Ld~e~~~~rle~le~e~~dP~fW~D~~rAq~v~~e~~~l~~~l~~~~~l~~~~~dl~~L 83 (373) T TIGR00020 4 LSELKEKIEELASRLDDVRGILDPEKLKARLEELEKEMEDPNFWNDQERAQKVIKERSSLEEELDTLEELKNSLDDLSEL 83 (373) T ss_pred CCHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 62268999999989998775138668999999998775078666125899999999999998632799987553257889 Q ss_pred HHHCCCCCCHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHEEECCCCCCCCCCHHHH Q ss_conf 95137884217888755528999899999999999999975204677788899999999876324127887788898889 Q gi|254781034|r 204 ENFSVLKPCTATLMQFANIKIPTTIEILAKFPKVSEEMVFASESLEKDSGFANYLLFQLTRLVKVRPIGGNIEGDAITDV 283 (344) Q Consensus 204 ~~~~~~~~~l~~L~~~A~~Gvpt~a~L~~~F~~~A~~~~~a~~~~~~~~g~~~~l~~~~~slv~vR~~~~~~~G~~~dai 283 (344) -.|+.+.+. ..|..|.-+...|-.+|+.+-..+...-...- =.| =.-.-.-++++++-.|.+|.-|=.++ T Consensus 84 ~Ela~ee~d-----~aaasGm~~~~el~~El~~Le~~~~~lE~~~~-LSg----E~D~~nA~ltI~~GAGGTEa~DWa~M 153 (373) T TIGR00020 84 LELAKEEDD-----EAAASGMETFAELEEELKALEKELEELELRTL-LSG----EYDANNAILTIQSGAGGTEAQDWASM 153 (373) T ss_pred HHHHCCCCH-----HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-CCC----CCCHHCCCEEECCCCCCCCHHHHHHH T ss_conf 987437603-----57778999999999999999999999999997-067----57721063352387798516669999 Q ss_pred HHHHH Q ss_conf 99999 Q gi|254781034|r 284 IARIE 288 (344) Q Consensus 284 laRae 288 (344) |-||= T Consensus 154 L~RMY 158 (373) T TIGR00020 154 LYRMY 158 (373) T ss_pred HHHHH T ss_conf 98753 No 15 >COG4980 GvpP Gas vesicle protein [General function prediction only] Probab=60.53 E-value=11 Score=16.61 Aligned_cols=19 Identities=32% Similarity=0.085 Sum_probs=11.3 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 4667776423467899999 Q gi|254781034|r 36 KFFWEKILSNKTFFKILAL 54 (344) Q Consensus 36 ~~~~~~~~~~~~~ggiial 54 (344) +.|+.|+++|+++|++.+| T Consensus 5 ~~~l~G~liGgiiGa~aaL 23 (115) T COG4980 5 KDFLFGILIGGIIGAAAAL 23 (115) T ss_pred CHHHHHHHHHHHHHHHHHH T ss_conf 0379999999999999999 No 16 >pfam06160 EzrA Septation ring 
formation regulator, EzrA. During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerizes into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation. Probab=59.07 E-value=11 Score=16.46 Aligned_cols=110 Identities=13% Similarity=0.068 Sum_probs=65.5 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHEE----ECCCCCCCCCCHHHHHHHHHHHHHCCCHHHH Q ss_conf 999899999999999999975204677788899999999876324----1278877888988899999999982998999 Q gi|254781034|r 224 IPTTIEILAKFPKVSEEMVFASESLEKDSGFANYLLFQLTRLVKV----RPIGGNIEGDAITDVIARIENNLKTGDLVKA 299 (344) Q Consensus 224 vpt~a~L~~~F~~~A~~~~~a~~~~~~~~g~~~~l~~~~~slv~v----R~~~~~~~G~~~dailaRae~aL~~GdL~~A 299 (344) +-.+..+....|.+-..+-.. -|+ -++.+....+.++.- ...+...+=.....-+......|..++|+.| T Consensus 203 ~~~L~~~me~IP~l~~~~~~~--~P~----Ql~eL~~Gy~~m~~~gy~~~~~~i~~~i~~i~~~l~~~~~~l~~L~l~~a 276 (559) T pfam06160 203 TDALEQKMEEIPPLLKELQNE--FPD----QLEELKAGYREMTEEGYHFDHLDIEKELQDLKEQIDQNLALLEELDLDEA 276 (559) T ss_pred HHHHHHHHHHHHHHHHHHHHH--HHH----HHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHH T ss_conf 999999999804899999987--049----99999999999998699788788799999999999999999873798989 Q ss_pred HHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9989719988899999999999999999999999997533 Q gi|254781034|r 300 AAEWDKIPEKARQPSMFLRNALEAHICSDAILKEEMAKIP 339 (344) Q Consensus 300 l~el~~Lp~~a~~~~~~w~~~~eaRl~ad~~~~~~~a~~~ 339 (344) -..++.+.+.--.....+-..++||--++.....+...+. T Consensus 277 ~~~~~~i~~~Id~LYd~lekEv~Ak~~V~~~~~~i~~~l~ 316 (559) T pfam06160 277 EEENEEIEERIDTLYDILEKEVKAKKFVEKNIDKLTDFLE 316 (559) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9899999999999999999999999999985788999999 No 17 >TIGR02508 type_III_yscG type III secretion protein, YscG family; InterPro: IPR013348 YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). 
The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins), in Yersinia. This entry consists of YscG from Yersinia, and functionally equivalent type III secretion proteins in other species: e.g. AscG in Aeromonas and LscG in Photorhabdus luminescens.; GO: 0009405 pathogenesis. Probab=57.29 E-value=12 Score=16.27 Aligned_cols=64 Identities=14% Similarity=0.126 Sum_probs=52.0 Q ss_pred CCHHH-HHHHHHHHHHCCCHHHHHHHHHHCC--HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCC Q ss_conf 89888-9999999998299899999897199--8889999999999999999999999999753340388 Q gi|254781034|r 278 DAITD-VIARIENNLKTGDLVKAAAEWDKIP--EKARQPSMFLRNALEAHICSDAILKEEMAKIPQTDLP 344 (344) Q Consensus 278 ~~~da-ilaRae~aL~~GdL~~Al~el~~Lp--~~a~~~~~~w~~~~eaRl~ad~~~~~~~a~~~~~~~~ 344 (344) +.-++ .|=|.....++|||..|+.-.+.++ =|. ..||++=.+.|+.-..++.+.+-++-+++-| T Consensus 37 e~~E~v~LIrlsSLmN~G~Y~~Al~lg~~~~tayPd---LepwlALce~rlGl~~Al~~Rl~rLa~s~~p 103 (118) T TIGR02508 37 ESEEAVVLIRLSSLMNRGDYQEALQLGEELCTAYPD---LEPWLALCEWRLGLLSALEERLLRLAASGDP 103 (118) T ss_pred CCHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCCCC---HHHHHHHHHHHHHHHHHHHHHHHHHHCCCCH T ss_conf 816999999986312755799999724336887778---7789999999987999999999987427985 No 18 >COG4649 Uncharacterized protein conserved in bacteria [Function unknown] Probab=55.93 E-value=3.6 Score=19.59 Aligned_cols=26 Identities=27% Similarity=0.295 Sum_probs=17.1 Q ss_pred HHHHHHCCCHHHHHHHHHHCCHHHHH Q ss_conf 99999829989999989719988899 Q gi|254781034|r 287 IENNLKTGDLVKAAAEWDKIPEKARQ 312 (344) Q Consensus 287 ae~aL~~GdL~~Al~el~~Lp~~a~~ 312 (344) .=++.+.||+++|..++..+-..++. T Consensus 174 glAa~kagd~a~A~~~F~qia~Da~a 199 (221) T COG4649 174 GLAAYKAGDFAKAKSWFVQIANDAQA 199 (221) T ss_pred HHHHHHCCCHHHHHHHHHHHHCCCCC T ss_conf 68887322467799999999701469 No 19 >pfam02912 Phe_tRNA-synt_N Aminoacyl tRNA synthetase class II, N-terminal domain. 
Probab=53.93 E-value=14 Score=15.93 Aligned_cols=52 Identities=21% Similarity=0.203 Sum_probs=41.7 Q ss_pred HHHHHHHHCCCHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999999982998999998971998889999999999999999999999999753 Q gi|254781034|r 285 ARIENNLKTGDLVKAAAEWDKIPEKARQPSMFLRNALEAHICSDAILKEEMAKI 338 (344) Q Consensus 285 aRae~aL~~GdL~~Al~el~~Lp~~a~~~~~~w~~~~eaRl~ad~~~~~~~a~~ 338 (344) -|.++.=+.|-|......+.+||...|.....++..++.++.. .+......+ T Consensus 12 ~r~~~lGKkG~l~~~~k~l~~l~~eekk~~G~~iN~~K~~i~~--~~~~k~~~l 63 (73) T pfam02912 12 IRVKYLGKKGPLTELLKGLGKLSPEERPKVGALINEAKEAVEE--ALEEKKAAL 63 (73) T ss_pred HHHHHHCCHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHH--HHHHHHHHH T ss_conf 9999927515999999977069999999989999999999999--999999999 No 20 >TIGR03319 YmdA_YtgF conserved hypothetical protein YmdA/YtgF. Probab=53.58 E-value=14 Score=15.89 Aligned_cols=16 Identities=25% Similarity=0.227 Sum_probs=7.5 Q ss_pred CHHHHHHHHHHHHHHH Q ss_conf 9888999999999999 Q gi|254781034|r 307 PEKARQPSMFLRNALE 322 (344) Q Consensus 307 p~~a~~~~~~w~~~~e 322 (344) |+.-+..+..|+..++ T Consensus 425 PGARre~~e~yi~Rl~ 440 (514) T TIGR03319 425 PGARRESLENYIKRLE 440 (514) T ss_pred CCCCHHHHHHHHHHHH T ss_conf 9724545999999999 No 21 >pfam04375 HemX HemX. This family consists of several bacterial HemX proteins. The hemX gene is not essential for haem synthesis in B. subtilis. HemX is a polytopic membrane protein which by an unknown mechanism down-regulates the level of HemA. Probab=52.04 E-value=15 Score=15.74 Aligned_cols=24 Identities=17% Similarity=0.134 Sum_probs=16.7 Q ss_pred CCHHHHHHH----HHHHHHEEECCCCCC Q ss_conf 888999999----998763241278877 Q gi|254781034|r 252 SGFANYLLF----QLTRLVKVRPIGGNI 275 (344) Q Consensus 252 ~g~~~~l~~----~~~slv~vR~~~~~~ 275 (344) ++|+.++.. 
++.++|+||+.+..+ T Consensus 234 ~~W~~~l~~~~~~~l~~li~Irr~d~~~ 261 (372) T pfam04375 234 SDWWENLWKSVRSFLNNFITIRRRDQTD 261 (372) T ss_pred HHHHHHHHHHHHHHHHCCEEEEECCCCC T ss_conf 8999999999999985145652178864 No 22 >COG3118 Thioredoxin domain-containing protein [Posttranslational modification, protein turnover, chaperones] Probab=51.40 E-value=15 Score=15.68 Aligned_cols=26 Identities=4% Similarity=-0.277 Sum_probs=10.5 Q ss_pred HHCHHHHCCCCCCCHHHHHHHHHHHH Q ss_conf 20323202663351033367888731 Q gi|254781034|r 59 VLTFIFIFTALFTEKFLRTDNNLLLL 84 (344) Q Consensus 59 ~lq~~~~~~~~~~~~~a~~~~~~~~~ 84 (344) .-.|.|-|-..-...+.......+.. T Consensus 69 a~~~~G~f~LakvN~D~~p~vAaqfg 94 (304) T COG3118 69 AAEYKGKFKLAKVNCDAEPMVAAQFG 94 (304) T ss_pred HHHHCCCEEEEEECCCCCHHHHHHHC T ss_conf 99858925999846873650898828 No 23 >pfam11853 DUF3373 Protein of unknown function (DUF3373). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 472 to 574 amino acids in length. Probab=48.23 E-value=17 Score=15.37 Aligned_cols=34 Identities=26% Similarity=0.453 Sum_probs=23.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCH Q ss_conf 7899999999999999999999999999500124 Q gi|254781034|r 138 EEIASLKQLISDLSKNYQDIVTRLTKMETLTANP 171 (344) Q Consensus 138 ~~v~~Le~~~~~~~~~~~~l~~~l~~~e~~~~~~ 171 (344) .+|..|+.+++++..+++++..|+.+.|.+.... 
T Consensus 31 qkI~~L~~ql~eLk~~~~~~~~~v~~~e~~~a~~ 64 (485) T pfam11853 31 QKIEALKKELAELKAQLKDLNKRVDKTEKKSAGD 64 (485) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 9999999999999999998776655566653036 No 24 >PRK13453 F0F1 ATP synthase subunit B; Provisional Probab=47.58 E-value=17 Score=15.31 Aligned_cols=45 Identities=11% Similarity=0.077 Sum_probs=32.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCHHHHCCCCCCCHHHHHHHHHHHH Q ss_conf 776423467899999753320323202663351033367888731 Q gi|254781034|r 40 EKILSNKTFFKILALVCVIVLTFIFIFTALFTEKFLRTDNNLLLL 84 (344) Q Consensus 40 ~~~~~~~~~ggiial~~~~~lq~~~~~~~~~~~~~a~~~~~~~~~ 84 (344) .|+=++.||+-+|.+++++.+-|-+.+++...-.+...+...... T Consensus 14 ~gidw~t~~~q~I~F~il~~ll~kf~~~pi~~~L~~R~~~I~~~l 58 (173) T PRK13453 14 GGVEWGTVIVQVLTFIVLLALLKKFAWGPLKDVMDKRERDINRDI 58 (173) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 699789999999999999999999989899999999999999889 No 25 >pfam12072 DUF3552 Domain of unknown function (DUF3552). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is about 200 amino acids in length. This domain is found associated with pfam00013, pfam01966. This domain has a single completely conserved residue A that may be functionally important. 
Probab=45.49 E-value=19 Score=15.11 Aligned_cols=17 Identities=0% Similarity=-0.068 Sum_probs=6.8 Q ss_pred HHHHHHHHHHHHHHHCH Q ss_conf 46789999975332032 Q gi|254781034|r 46 KTFFKILALVCVIVLTF 62 (344) Q Consensus 46 ~~~ggiial~~~~~lq~ 62 (344) +++|++++.+++|.+.+ T Consensus 6 ~i~~~~iG~~~G~~~~~ 22 (201) T pfam12072 6 AIIALVVGFAIGYFVRK 22 (201) T ss_pred HHHHHHHHHHHHHHHHH T ss_conf 99999999999999999 No 26 >KOG2629 consensus Probab=45.35 E-value=19 Score=15.09 Aligned_cols=27 Identities=30% Similarity=0.474 Sum_probs=11.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 336789999999999999999999999 Q gi|254781034|r 135 PLLEEIASLKQLISDLSKNYQDIVTRL 161 (344) Q Consensus 135 ~~~~~v~~Le~~~~~~~~~~~~l~~~l 161 (344) .+...+..|.+.+-.+.....++.+.+ T Consensus 158 Els~~L~~l~~~~~~~s~~~~k~esei 184 (300) T KOG2629 158 ELSRALASLKNTLVQLSRNIEKLESEI 184 (300) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 999999999977777530198888788 No 27 >COG4372 Uncharacterized protein conserved in bacteria with the myosin-like domain [Function unknown] Probab=43.69 E-value=20 Score=14.94 Aligned_cols=21 Identities=10% Similarity=0.034 Sum_probs=9.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHH Q ss_conf 123232200278999999999 Q gi|254781034|r 85 PSVSPLKEDPKDISPVIEKEI 105 (344) Q Consensus 85 ~~~~~q~~~~~~~l~~~~~el 105 (344) +++-++....+.++....+++ T Consensus 77 ddi~~qlr~~rtel~~a~~~k 97 (499) T COG4372 77 DDIRPQLRALRTELGTAQGEK 97 (499) T ss_pred HHHHHHHHHHHHHHHHHHHHH T ss_conf 888899999999998877789 No 28 >PRK04778 septation ring formation regulator EzrA; Provisional Probab=41.09 E-value=22 Score=14.69 Aligned_cols=110 Identities=15% Similarity=0.123 Sum_probs=64.7 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHEE----ECCCCCCCCCCHHHHHHHHHHHHHCCCHHHH Q ss_conf 999899999999999999975204677788899999999876324----1278877888988899999999982998999 Q gi|254781034|r 224 IPTTIEILAKFPKVSEEMVFASESLEKDSGFANYLLFQLTRLVKV----RPIGGNIEGDAITDVIARIENNLKTGDLVKA 299 (344) Q Consensus 224 
vpt~a~L~~~F~~~A~~~~~a~~~~~~~~g~~~~l~~~~~slv~v----R~~~~~~~G~~~dailaRae~aL~~GdL~~A 299 (344) .-.+..+....|.+-..+-.. -|. -++.+..+.+.++.- ...+...+=.....-+......|..++++.| T Consensus 207 ~~~L~~~me~IP~l~~~~~~~--~P~----Ql~eL~~Gy~~m~~~gy~l~~~~i~~~i~~l~~~l~~~~~~L~~l~l~~a 280 (569) T PRK04778 207 LAALEQIMEEIPELLKELQTE--FPD----QLDELKAGYRELVEEGYHLDELDIDKELQDLKEQIDKNLELLEELDLDEA 280 (569) T ss_pred HHHHHHHHHHHHHHHHHHHHH--HHH----HHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHH T ss_conf 999999998853789999987--159----99999999999998799788788799999999999999998877798989 Q ss_pred HHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9989719988899999999999999999999999997533 Q gi|254781034|r 300 AAEWDKIPEKARQPSMFLRNALEAHICSDAILKEEMAKIP 339 (344) Q Consensus 300 l~el~~Lp~~a~~~~~~w~~~~eaRl~ad~~~~~~~a~~~ 339 (344) -..++.+.+.--.....+-..++||--++.-...+...+. T Consensus 281 ~~~~~~i~~~Id~LYd~lekEv~Ak~~V~~~~~~i~~~l~ 320 (569) T PRK04778 281 EEENEEIEERIDTLYDILEREVKARKFVEKNIDILPDYLE 320 (569) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9889999999999999999999999999986267999999 No 29 >pfam05440 MtrB Tetrahydromethanopterin S-methyltransferase subunit B. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 
Probab=40.73 E-value=21 Score=14.80 Aligned_cols=28 Identities=18% Similarity=0.239 Sum_probs=21.7 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 0146677764234678999997533203 Q gi|254781034|r 34 WRKFFWEKILSNKTFFKILALVCVIVLT 61 (344) Q Consensus 34 ~~~~~~~~~~~~~~~ggiial~~~~~lq 61 (344) +-.....|+|.+.++|.++++.+.+.+- T Consensus 67 Eg~~~~aG~~tn~fyGf~igL~i~~l~a 94 (97) T pfam05440 67 EGVYYTAGILTNAFYGFVIGLAISALLA 94 (97) T ss_pred CCEEEEHHHHHHHHHHHHHHHHHHHHHH T ss_conf 4202110354667999999999999999 No 30 >TIGR02168 SMC_prok_B chromosome segregation protein SMC; InterPro: IPR011890 The SMC (structural maintenance of chromosomes) family of proteins, exist in virtually all organisms including both bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms and form three types of heterodimer (SMC1SMC3, SMC2SMC4, SMC5SMC6), which are core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal (IPR003395 from INTERPRO) domains separated by a long (circa 100 nm or 900 residues) coiled coil segment in the centre of which is a globular ''hinge'' domain, characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which by mutational studies has been shown to be essential in several proteins. 
The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and an LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. All SMC proteins appear to form dimers, either forming homodimers with themselves, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers are arranged in an antiparallel alignment. This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. SMCs share not only sequence similarity but also structural similarity with ABC proteins. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression. The smc gene is often associated with scpB (IPR005234 from INTERPRO) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle.; GO: 0005515 protein binding, 0005524 ATP binding, 0005694 chromosome.
Probab=35.99 E-value=26 Score=14.19 Aligned_cols=95 Identities=16% Similarity=0.160 Sum_probs=52.7 Q ss_pred CCCHHHHHHH---HHHHHHHHHHHHCC--CCCCCCHHHHHHHHHHH----HHEEECCCCCCCCCCHHHHHHHHHHHHHCC Q ss_conf 9998999999---99999999975204--67778889999999987----632412788778889888999999999829 Q gi|254781034|r 224 IPTTIEILAK---FPKVSEEMVFASES--LEKDSGFANYLLFQLTR----LVKVRPIGGNIEGDAITDVIARIENNLKTG 294 (344) Q Consensus 224 vpt~a~L~~~---F~~~A~~~~~a~~~--~~~~~g~~~~l~~~~~s----lv~vR~~~~~~~G~~~dailaRae~aL~~G 294 (344) +|.+.+|..- |...-..++..... ...+..-.-...++++. .+++-+.+ .+.|..+.. .+++..-... T Consensus 527 ~g~l~~~i~v~~~ye~A~e~aLg~~l~~~vv~~~~~a~~a~~~L~~~~~Gr~~fl~l~-~~~~~~~~~--~~~~~~~~~~ 603 (1191) T TIGR02168 527 VGVLSELIEVDEGYEAAIEAALGGRLQAVVVENLNAAKKAIAFLKQNELGRVTFLPLD-VIKGAEIQG--NDREVLKSIE 603 (1191) T ss_pred CHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHCCCCCCEEEEEECC-CCCCCCCCC--CCHHHHCCCC T ss_conf 0035765215488999999997860020014897999999973110258827763025-567766777--6245423775 Q ss_pred CHHHHHHHHHHCCHHHHHHHHHHHHHH Q ss_conf 989999989719988899999999999 Q gi|254781034|r 295 DLVKAAAEWDKIPEKARQPSMFLRNAL 321 (344) Q Consensus 295 dL~~Al~el~~Lp~~a~~~~~~w~~~~ 321 (344) .+-+.+..+.+-|...+.+..+|+..+ T Consensus 604 Gf~g~~~~lv~~~~~~~~~~~~lL~~~ 630 (1191) T TIGR02168 604 GFLGVAKDLVKFDPKLRKALSYLLGGV 630 (1191) T ss_pred HHHHHHHHHHCCCHHHHHHHHHHHCCE T ss_conf 067887666406066789999855872 No 31 >PRK10564 maltose regulon periplasmic protein; Provisional Probab=35.67 E-value=27 Score=14.16 Aligned_cols=28 Identities=25% Similarity=0.289 Sum_probs=21.3 Q ss_pred CCHHHHHHHHHHHHHCCCHHHHHHHHHH Q ss_conf 8988899999999982998999998971 Q gi|254781034|r 278 DAITDVIARIENNLKTGDLVKAAAEWDK 305 (344) Q Consensus 278 ~~~dailaRae~aL~~GdL~~Al~el~~ 305 (344) |+-.=...-++.||+.||+++|+.-++- T Consensus 255 dTe~Yy~~aI~~AVk~~DI~KAL~LldE 282 (303) T PRK10564 255 DTESYFNQAIKDAVKKGDVDKALKLLNE 282 (303) T ss_pred HHHHHHHHHHHHHHHCCCHHHHHHHHHH T ss_conf 3899999999999975999999999999 No 32 >pfam07148 MalM Maltose operon periplasmic 
protein precursor (MalM). This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown. Probab=34.41 E-value=28 Score=14.03 Aligned_cols=29 Identities=24% Similarity=0.344 Sum_probs=22.7 Q ss_pred CCCHHHH-HHHHHHHHHCCCHHHHHHHHHH Q ss_conf 8898889-9999999982998999998971 Q gi|254781034|r 277 GDAITDV-IARIENNLKTGDLVKAAAEWDK 305 (344) Q Consensus 277 G~~~dai-laRae~aL~~GdL~~Al~el~~ 305 (344) ..+.++. ..-++.||+.||+++|+.-++- T Consensus 229 ~~dTe~Yy~~aI~~AVk~~DI~KAL~LldE 258 (279) T pfam07148 229 QPDTQSYYLSAIEQAVAKGDIPKALSLLDE 258 (279) T ss_pred CCHHHHHHHHHHHHHHHCCCHHHHHHHHHH T ss_conf 742899999999999975999999999999 No 33 >PRK00965 tetrahydromethanopterin S-methyltransferase subunit B; Provisional Probab=33.89 E-value=28 Score=13.98 Aligned_cols=34 Identities=9% Similarity=0.222 Sum_probs=25.9 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHCHHHHC Q ss_conf 3014667776423467899999753320323202 Q gi|254781034|r 33 TWRKFFWEKILSNKTFFKILALVCVIVLTFIFIF 66 (344) Q Consensus 33 ~~~~~~~~~~~~~~~~ggiial~~~~~lq~~~~~ 66 (344) -+-.....|+|.+.++|.++++.+.+.+-+..+. T Consensus 67 REg~~~~aG~~tn~fyGf~igl~i~~l~~~~l~~ 100 (108) T PRK00965 67 REGTYLTAGMFTNMFYGFWIGLAILFLVAIILVI 100 (108) T ss_pred CCCCEEEHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 6420200136567899999999999999999999 No 34 >TIGR02978 phageshock_pspC phage shock protein C; InterPro: IPR014320 All members of this protein are the phage shock protein PspC. The phage shock regulon is restricted to the Proteobacteria and somewhat sparsely distributed there. It is expressed, under positive control of a sigma-54-dependent transcription factor; PspF, which binds and is modulated by PspA. Stresses that induce the psp regulon include phage secretin over expression, ethanol, heat shock and protein export defects.. 
Probab=32.84 E-value=29 Score=13.87 Aligned_cols=77 Identities=10% Similarity=0.124 Sum_probs=48.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCHHHHCCCCCCCHHHHHHH----------HHHHHHH-HHHHHHHHHHHHHHHHH Q ss_conf 146677764234678999997533203232026633510333678----------8873112-32322002789999999 Q gi|254781034|r 35 RKFFWEKILSNKTFFKILALVCVIVLTFIFIFTALFTEKFLRTDN----------NLLLLPS-VSPLKEDPKDISPVIEK 103 (344) Q Consensus 35 ~~~~~~~~~~~~~~ggiial~~~~~lq~~~~~~~~~~~~~a~~~~----------~~~~~~~-~~~q~~~~~~~l~~~~~ 103 (344) |-.+++++|-++.+..++||.+. +++..+.+......+. -=|...+ ......+.++++..+++ T Consensus 32 Ril~v~~~lfg~~~~~~~AYia~------~~~L~k~P~~~~~~~~~~~~~~vk~k~Wq~g~~~p~~~L~~~~~~~~~~e~ 105 (128) T TIGR02978 32 RILVVSALLFGGGFFVLVAYIAL------WLLLDKKPVNLYEDDDTSKEHEVKSKFWQAGQTSPKQALREVKRELRRLER 105 (128) T ss_pred HHHHHHHHHHHHHHHHHHHHHHH------HHHCCCCCCCCCCCCHHHHCCCCCCCCHHHHCCCHHHHHHHHHHHHHHHHH T ss_conf 99999999998799999999999------996354463323453011101145410022057888999999999999888 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 99999999997531 Q gi|254781034|r 104 EIISQNLSIAQQKD 117 (344) Q Consensus 104 el~~~~~~~~s~~~ 117 (344) .|++.|.-++|... T Consensus 106 RLr~mE~yVTS~~F 119 (128) T TIGR02978 106 RLRNMERYVTSDEF 119 (128) T ss_pred HHHHHCCEEECCCC T ss_conf 98976853206886 No 35 >PRK11638 lipopolysaccharide biosynthesis protein WzzE; Provisional Probab=32.38 E-value=30 Score=13.83 Aligned_cols=28 Identities=21% Similarity=0.513 Sum_probs=12.7 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHCH Q ss_conf 301466777642346789999975332032 Q gi|254781034|r 33 TWRKFFWEKILSNKTFFKILALVCVIVLTF 62 (344) Q Consensus 33 ~~~~~~~~~~~~~~~~ggiial~~~~~lq~ 62 (344) -|++-.| +++-.+++.++|++..+-.+. 
T Consensus 26 LW~~K~~--II~~t~lf~~ia~~ya~~a~q 53 (348) T PRK11638 26 LWAGKLW--IIGMGLLFALIALAYSFFARQ 53 (348) T ss_pred HHHCCHH--HHHHHHHHHHHHHHHHHHCCC T ss_conf 9835588--999999999999999981844 No 36 >PRK06287 cobalt transport protein CbiN; Validated Probab=28.94 E-value=34 Score=13.46 Aligned_cols=27 Identities=15% Similarity=-0.047 Sum_probs=17.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCHHH Q ss_conf 677764234678999997533203232 Q gi|254781034|r 38 FWEKILSNKTFFKILALVCVIVLTFIF 64 (344) Q Consensus 38 ~~~~~~~~~~~ggiial~~~~~lq~~~ 64 (344) -..|-.++++.|.++.+.+.|++.|-. T Consensus 74 ~k~g~i~a~iiG~l~t~aia~Gvg~ii 100 (105) T PRK06287 74 GKLGEVAAIIIGTLLVLAISFGVGSIF 100 (105) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 530158999999999999998888752 No 37 >KOG0709 consensus Probab=27.12 E-value=37 Score=13.26 Aligned_cols=102 Identities=18% Similarity=0.220 Sum_probs=36.8 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCHH-HHHHHHHHHHHHHHHHHHHCCCCHHH Q ss_conf 14478899888887633678999999999999999999999999995001247-66999999999999999966996289 Q gi|254781034|r 120 TADKELANTQNFNIKPLLEEIASLKQLISDLSKNYQDIVTRLTKMETLTANPL-RNPNTQRMVSLLILKNALDKGEYSSL 198 (344) Q Consensus 120 ~~~~e~~~~~~~~~~~~~~~v~~Le~~~~~~~~~~~~l~~~l~~~e~~~~~~~-~~~~~~~~~A~~~L~~Ai~~G~pf~~ 198 (344) ...++-++.....+..+..+=.+|...++.+-.....+.+++.++++...... ....+.+-+++..+..++--+ T Consensus 268 rkKkeYid~LE~rv~~~taeNqeL~kkV~~Le~~N~sLl~qL~klQt~v~q~an~s~qt~tC~av~~lS~~l~~s----- 342 (472) T KOG0709 268 RKKKEYIDGLESRVSAFTAENQELQKKVEELELSNRSLLAQLKKLQTLVIQVANKSTQTSTCLAVLLLSFCLLLS----- 342 (472) T ss_pred HHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHCCCCHHCCCHHHHHHHHHHHHHHH----- T ss_conf 767667888764221024674888899998762658899998877777740333100022349999999999874----- Q ss_pred HHHHHHHHCCCCCCHHHHHHHHHCCCCCHH Q ss_conf 999999513788421788875552899989 Q gi|254781034|r 199 NTTMQENFSVLKPCTATLMQFANIKIPTTI 228 (344) Q Consensus 199 eL~~l~~~~~~~~~l~~L~~~A~~Gvpt~a 228 (344) -+-.+....... -..+..+|..||-++. 
T Consensus 343 ~lp~~~~~~~p~--~t~~~d~a~~Gvts~~ 370 (472) T KOG0709 343 TLPCFSEFSQPI--TTPLEDSAPHGVTSRS 370 (472) T ss_pred HCCCCCCCCCCC--CCCCCCCCCCCCCCCC T ss_conf 122224557887--6675444544531233 No 38 >COG4062 MtrB Tetrahydromethanopterin S-methyltransferase, subunit B [Coenzyme metabolism] Probab=26.87 E-value=37 Score=13.24 Aligned_cols=26 Identities=15% Similarity=0.139 Sum_probs=20.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 14667776423467899999753320 Q gi|254781034|r 35 RKFFWEKILSNKTFFKILALVCVIVL 60 (344) Q Consensus 35 ~~~~~~~~~~~~~~ggiial~~~~~l 60 (344) -....+|.+.+.++|.++++++...+ T Consensus 69 gv~~~aG~~tna~yGfviGl~i~aLl 94 (108) T COG4062 69 GVYATAGYLTNAFYGFVIGLGIMALL 94 (108) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 25788788877899999999999999 No 39 >pfam04156 IncA IncA protein. Chlamydia trachomatis is an obligate intracellular bacterium that develops within a parasitophorous vacuole termed an inclusion. The inclusion is non-fusogenic with lysosomes but intercepts lipids from a host cell exocytic pathway. Initiation of chlamydial development is concurrent with modification of the inclusion membrane by a set of C. trachomatis-encoded proteins collectively designated Incs. One of these Incs, IncA, is functionally associated with the homotypic fusion of inclusions. This family probably includes members of the wider Inc family rather than just IncA. Probab=26.79 E-value=37 Score=13.23 Aligned_cols=25 Identities=32% Similarity=0.529 Sum_probs=9.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 6336789999999999999999999 Q gi|254781034|r 134 KPLLEEIASLKQLISDLSKNYQDIV 158 (344) Q Consensus 134 ~~~~~~v~~Le~~~~~~~~~~~~l~ 158 (344) ....+.+..++.......+...++. T Consensus 124 ~~~~~~l~~l~~~~~~~~~e~~~l~ 148 (186) T pfam04156 124 KSLEERLESLEESIKELAKELRELR 148 (186) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 8888889999989998999999999 No 40 >pfam02605 PsaL Photosystem I reaction centre subunit XI. 
This family consists of the photosystem I reaction centre subunit XI, PsaL, from plants and bacteria. PsaL is one of the smaller subunits in photosystem I with only two transmembrane alpha helices and interacts closely with PsaI. Probab=25.14 E-value=40 Score=13.04 Aligned_cols=35 Identities=9% Similarity=-0.066 Sum_probs=20.7 Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 86554543301466777642346789999975332 Q gi|254781034|r 25 PSCDVKKITWRKFFWEKILSNKTFFKILALVCVIV 59 (344) Q Consensus 25 ~~~~~~~~~~~~~~~~~~~~~~~~ggiial~~~~~ 59 (344) |+.+..+...-+.|.+|.|++++.|.+.|+.+... T Consensus 115 ~p~~l~t~~gWs~Ft~GF~~Gg~GGa~fAy~Ll~~ 149 (154) T pfam02605 115 PPDALQTSEGWSQFTSGFFVGGVGGAFFAYFLLSN 149 (154) T ss_pred CHHHCCCCCCHHHHHCCEEECCCCHHHHHHHHHHC T ss_conf 74542283368886121000134389999999934 No 41 >PRK10697 DNA-binding transcriptional activator PspC; Provisional Probab=23.82 E-value=43 Score=12.88 Aligned_cols=70 Identities=16% Similarity=0.169 Sum_probs=28.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH-HHHCHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 14667776423467899999753-32032320266335103336788873112323220027899999999999999999 Q gi|254781034|r 35 RKFFWEKILSNKTFFKILALVCV-IVLTFIFIFTALFTEKFLRTDNNLLLLPSVSPLKEDPKDISPVIEKEIISQNLSIA 113 (344) Q Consensus 35 ~~~~~~~~~~~~~~ggiial~~~-~~lq~~~~~~~~~~~~~a~~~~~~~~~~~~~~q~~~~~~~l~~~~~el~~~~~~~~ 113 (344) |-.++.++|.++++.+++++.+. +.+. |.| ...... +............++++..+++.|...+.-++ T Consensus 38 R~~~vl~~f~g~~~~~~~aYii~~~~l~-----p~P--~~~~~~----~~~~~p~~~l~~i~~~~~~~E~RLr~ME~YVT 106 (119) T PRK10697 38 RIIVVLSIFFGLFFFTLVAYIILSFVLD-----PMP--DNMAFG----EQQPTSSELLDEVDRELAAGEQRLREMERYVT 106 (119) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCC-----CCC--CCCCCC----CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 9999999999606899999999999807-----787--532322----34789999999999999999999999988873 Q ss_pred HH Q ss_conf 75 Q gi|254781034|r 114 QQ 115 (344) Q Consensus 114 s~ 115 (344) |. 
T Consensus 107 S~ 108 (119) T PRK10697 107 SD 108 (119) T ss_pred CC T ss_conf 67 No 42 >TIGR00996 Mtu_fam_mce virulence factor Mce family protein; InterPro: IPR005693 Mycobacterial species are usually slender, curved rods with a unique cell wall of complex waxes and glycolipids. They are resistant to acids, alkalis and dehydration, and are very slow to grow in vitro. The human pathogenic Mycobacteria (Mycobacterium tuberculosis and Mycobacterium leprae) are becoming resistant to conventional treatments and, together with HIV-related diseases, are fast posing a global health threat. An essential requirement, particularly of M. tuberculosis, is to gain entrance to, and to resist, the hostile intra-cellular environment of epithelial cells . The genome of M. tuberculosis contains four mammalian cell entry (mce) operons , which are widely distributed in both pathogenic and non-pathogenic mycobacteria suggesting that the presence of these putative virulence genes is not an indicator for the pathogenicity of the bacilli. At the 5' end of the transcriptional unit are two genes that have evolved from a tandem duplication, and whose products resemble YrbE, a conserved hypothetical protein found in Escherichia. coli, Haemophilus influenzae and Porphyra purpurea. All of the YrbE proteins, including the eight from M. tuberculosis, are probable integral membrane proteins with six TM alpha helices. The next six genes in each operon, the mce genes, are related, their products ranging in size from 275 to 564 amino acid residues. The corresponding protein sequences contain a number of highly conserved motifs that define a 24- member family with a common organization. Twenty of these proteins have a strongly hydrophobic segment at the NH2-terminal end that could span the lipid bilayer whereas the remaining four, all of which correspond to the seventh gene in their respective operons, mce1E to mce4E, are probably lipoprotein precursors. 
In all 24 cases the COOH-terminal domain of the mce proteins is predicted to be exposed on the external face of the cytoplasmic membrane. The ability to gain entry and resist the antimicrobial intracellular environment of mammalian cells is an essential virulence property of M. tuberculosis. This property is conferred by Mce1A, the third gene of operon 1, which when expressed in Escherichia coli conferred the ability to invade HeLa cells. The recombinant protein when used to coat latex spheres also promoted their uptake into HeLa cells. N-terminus deletion constructs of Mce1A identified a domain located between amino acid positions 106 and 163 that was needed for this cell uptake activity. Mce1A contains hydrophobic stretches at the N-terminus predictive of a signal sequence, and colloidal gold immunoelectron microscopy indicated that the corresponding native protein is expressed on the surface of M. tuberculosis. Recombinant Mce2A, which had the highest level of identity (67%) to Mce1A, was unable to promote the association of microspheres with HeLa cells and an mce-deletion mutant in Mycobacterium bovis greatly impaired the ability of the microbe to infect epithelial cells in vitro. Although the exact function of Mce1A is still unknown, it appears to serve as an effector molecule expressed on the surface of M. tuberculosis that is capable of eliciting plasma membrane perturbations in non-phagocytic mammalian cells. The distribution of the mce operons in both pathogenic and non-pathogenic mycobacteria suggests that the presence of these putative virulence genes is not an indicator for the pathogenicity of the bacilli - it may be that pathogenicity is determined by their expression. The members of this family represent all 24 genes associated with the four mammalian cell entry operons of M. tuberculosis and their homologs in other Actinomycetales.; GO: 0009405 pathogenesis.
Probab=23.19 E-value=44 Score=12.81 Aligned_cols=13 Identities=23% Similarity=0.066 Sum_probs=4.4 Q ss_pred HHHHHHHHHHHHH Q ss_conf 9999999999999 Q gi|254781034|r 178 QRMVSLLILKNAL 190 (344) Q Consensus 178 ~~~~A~~~L~~Ai 190 (344) .+..++..|.... T Consensus 268 ~L~~~l~~L~~~~ 280 (304) T TIGR00996 268 NLPQALANLAPVL 280 (304) T ss_pred HHHHHHHHHHHHH T ss_conf 6999999877899 No 43 >pfam01034 Syndecan Syndecan domain. Syndecans are transmembrane heparin sulfate proteoglycans which are implicated in the binding of extracellular matrix components and growth factors. Probab=22.43 E-value=45 Score=12.71 Aligned_cols=24 Identities=8% Similarity=0.106 Sum_probs=13.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 466777642346789999975332 Q gi|254781034|r 36 KFFWEKILSNKTFFKILALVCVIV 59 (344) Q Consensus 36 ~~~~~~~~~~~~~ggiial~~~~~ 59 (344) .+.....++++|.||+|++.+.+- T Consensus 146 ~~~~~~~l~~~i~~~~~~~~~a~~ 169 (207) T pfam01034 146 LLERKEVLAAVIAGGVVGLLFAVF 169 (207) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHH T ss_conf 102413789988512899999999 No 44 >PRK12705 hypothetical protein; Provisional Probab=21.41 E-value=47 Score=12.58 Aligned_cols=27 Identities=15% Similarity=0.309 Sum_probs=15.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHCHH Q ss_conf 667776423467899999753320323 Q gi|254781034|r 37 FFWEKILSNKTFFKILALVCVIVLTFI 63 (344) Q Consensus 37 ~~~~~~~~~~~~ggiial~~~~~lq~~ 63 (344) ++...+++..++|.+|+++++..+.|. 
T Consensus 5 ~~~~~~~i~~l~~~~ig~~lg~~~~~~ 31 (485) T PRK12705 5 YLVLTILVLFLILVLIGLVLGVFIRYL 31 (485) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 469999999999999999999999999 No 45 >PRK13729 conjugal transfer pilus assembly protein TraB; Provisional Probab=20.67 E-value=49 Score=12.49 Aligned_cols=13 Identities=23% Similarity=0.158 Sum_probs=4.4 Q ss_pred HHHHHHHHHHHHH Q ss_conf 8999999999999 Q gi|254781034|r 310 ARQPSMFLRNALE 322 (344) Q Consensus 310 a~~~~~~w~~~~e 322 (344) +...+.-|++.+| T Consensus 373 a~~LAdYYIKrAE 385 (474) T PRK13729 373 AQTLSDYYIKRAE 385 (474) T ss_pred HHHHHHHHHHHHH T ss_conf 9999999999998 Done!