Query         gi|254780430|ref|YP_003064843.1| nitrogen fixation protein [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 189
No_of_seqs    141 out of 1544
Neff          5.9
Searched_HMMs 39220
Date          Sun May 29 16:05:44 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780430.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG2358 consensus 100.0 6.4E-43 0 288.7 7.9 183 1-189 10-198 (213)
2 pfam08712 Nfu_N Scaffold prote 99.9 6.9E-28 1.8E-32 195.0 8.0 86 3-88 1-87 (87)
3 TIGR03341 YhgI_GntY IscR-regul 99.9 5.3E-25 1.4E-29 177.0 9.5 79 111-189 100-179 (190)
4 COG0694 Thioredoxin-like prote 99.9 3.3E-25 8.5E-30 178.2 7.8 78 112-189 5-84 (93)
5 PRK11190 putative DNA uptake p 99.9 4.6E-24 1.2E-28 171.1 9.5 79 111-189 101-181 (192)
6 pfam01106 NifU NifU-like domai 99.9 1.9E-23 4.9E-28 167.3 8.6 68 121-189 1-68 (68)
7 KOG2358 consensus 98.2 1.3E-06 3.4E-11 62.2 3.1 67 121-188 82-148 (213)
8 pfam01883 DUF59 Domain of unkn 88.2 1 2.6E-05 25.5 4.7 67 119-187 3-73 (76)
9 TIGR02000 NifU_proper Fe-S clu 83.6 0.72 1.8E-05 26.5 2.0 68 116-185 231-298 (302)
10 TIGR02176 pyruv_ox_red pyruvat 74.0 0.97 2.5E-05 25.7 0.2 27 119-145 641-667 (1194)
11 COG3086 RseC Positive regulato 61.9 8.1 0.00021 20.0 2.9 35 141-176 7-43 (150)
12 PRK10862 SoxR reducing system 60.5 8.3 0.00021 19.9 2.8 24 141-164 7-32 (158)
13 COG1308 EGD2 Transcription fac 58.0 16 0.0004 18.2 3.8 88 47-140 24-113 (122)
14 cd04874 ACT_Af1403 N-terminal 56.6 18 0.00046 17.8 5.2 49 128-188 16-71 (72)
15 cd07304 Chorismate_synthase Ch 56.5 7.4 0.00019 20.2 1.9 29 114-143 179-207 (344)
16 PRK05382 chorismate synthase; 55.1 8.5 0.00022 19.8 2.0 28 38-65 230-257 (357)
17 pfam04246 RseC_MucC Positive r 52.4 9.6 0.00025 19.5 2.0 20 143-162 2-23 (135)
18 pfam01264 Chorismate_synt Chor 47.5 12 0.0003 18.9 1.8 26 115-141 178-203 (346)
19 COG3435 Gentisate 1,2-dioxygen 44.8 16 0.00041 18.1 2.1 48 128-176 53-100 (351)
20 COG0082 AroC Chorismate syntha 44.0 11
0.00028 19.2 1.2 63 37-99 227-298 (369)
21 pfam03557 Bunya_G1 Bunyavirus 43.1 7.1 0.00018 20.3 0.1 30 135-165 738-767 (871)
22 PRK12463 chorismate synthase; 42.2 17 0.00043 18.0 1.9 27 38-64 242-268 (390)
23 TIGR02311 HpaI 2,4-dihydroxyhe 41.1 14 0.00035 18.5 1.3 47 43-89 156-205 (249)
24 KOG1482 consensus 38.5 34 0.00088 16.1 4.0 50 38-87 296-356 (379)
25 cd01815 BMSC_UbP_N BMSC_UbP (b 38.0 16 0.00041 18.1 1.3 30 154-183 12-41 (75)
26 cd02969 PRX_like1 Peroxiredoxi 37.4 36 0.00091 15.9 3.5 27 160-186 136-162 (171)
27 TIGR00033 aroC chorismate synt 36.5 21 0.00054 17.4 1.7 58 38-97 258-330 (391)
28 PRK10128 putative aldolase; Pr 34.7 31 0.0008 16.3 2.3 56 36-91 138-196 (250)
29 PRK11670 putative ATPase; Prov 34.6 40 0.001 15.7 7.1 42 41-82 29-75 (369)
30 PRK09193 indolepyruvate ferred 34.1 7.8 0.0002 20.1 -0.8 16 152-167 435-450 (1155)
31 PRK13030 2-oxoacid ferredoxin 33.3 8.3 0.00021 19.9 -0.8 17 152-168 434-450 (1168)
32 PRK13029 2-oxoacid ferredoxin 33.1 8.8 0.00023 19.7 -0.7 17 152-168 451-467 (1186)
33 pfam04320 DUF469 Protein with 32.9 42 0.0011 15.5 3.4 65 114-188 29-94 (102)
34 COG2151 PaaD Predicted metal-s 31.7 44 0.0011 15.4 5.8 70 116-187 12-86 (111)
35 TIGR00370 TIGR00370 conserved 31.6 44 0.0011 15.4 3.2 88 36-129 20-122 (217)
36 COG3696 Putative silver efflux 30.8 46 0.0012 15.3 6.1 108 37-146 65-181 (1027)
37 TIGR00852 pts-Glc PTS system, 30.6 35 0.00089 16.0 2.0 62 123-185 37-114 (312)
38 pfam00873 ACR_tran AcrB/AcrD/A 30.2 47 0.0012 15.2 9.7 36 119-154 151-188 (1021)
39 PRK10558 alpha-dehydro-beta-de 28.3 45 0.0011 15.3 2.2 36 38-78 28-63 (256)
40 COG4551 Predicted protein tyro 27.8 52 0.0013 15.0 2.7 91 38-136 14-109 (109)
41 TIGR01684 viral_ppase viral ph 27.0 53 0.0014 14.9 2.6 60 25-84 237-318 (323)
42 PRK02269 ribose-phosphate pyro 23.6 62 0.0016 14.5 3.6 34 150-183 207-248 (321)
43 TIGR02855 spore_yabG sporulati 23.4 18 0.00046 17.8 -0.5 48 116-164 149-223 (292)
44 TIGR02916 PEP_his_kin putative 23.2 63 0.0016 14.4 3.5 108 39-151 558-691
(696) 45 PRK08115 ribonucleotide-diphos 22.6 58 0.0015 14.6 1.9 37 132-171 820-856 (857) 46 PRK10614 multidrug efflux syst 21.8 67 0.0017 14.2 9.3 34 121-154 154-189 (1025) 47 KOG2679 consensus 21.5 68 0.0017 14.2 2.4 105 48-161 159-277 (336) 48 pfam06954 Resistin Resistin. T 21.4 66 0.0017 14.3 2.0 44 112-164 26-69 (109) 49 pfam08777 RRM_3 RNA binding mo 21.3 69 0.0018 14.2 4.1 29 129-158 18-48 (102) 50 cd03009 TryX_like_TryX_NRX Try 20.8 70 0.0018 14.1 2.6 16 124-139 112-127 (131) No 1 >KOG2358 consensus Probab=100.00 E-value=6.4e-43 Score=288.72 Aligned_cols=183 Identities=39% Similarity=0.659 Sum_probs=166.0 Q ss_pred CEEECCCCCCHHHHHHHCCCEEC-CCCCEECCCHHHCCCCHHHHHHHHCC-CCEEEEECCCEEEEEEC-C-CCCHHHHHH Q ss_conf 91021148872243541797543-78864306966701346889874168-81079980877998314-6-682123489 Q gi|254780430|r 1 MFIQTEDTPNPATLKFIPGQVVL-VEGAIHFSNAKEAEISPLASRIFSIP-GIASVYFGYDFITVGKD-Q-YDWEHLRPP 76 (189) Q Consensus 1 m~I~~e~TPNPn~lKFi~~~~i~-~~g~~~f~~~~~a~~spLa~~Lf~i~-GV~~Vfi~~nFITVtK~-~-~eW~~i~p~ 76 (189) |||+++.||||+++||.+++.++ ..++..|.+.-.+..+|||+++|.+. ||+++|+++|||||+|. + ..|..|+|. T Consensus 10 ~~i~t~~tPn~~Sl~f~p~~~i~~~~~~~~~~~~~s~~~s~La~s~~~~~~gvv~~~~g~dfvtv~k~~ee~~w~~L~p~ 89 (213) T KOG2358 10 MFIQTQITPNPSSLLFLPGKQILSERGLGDFATPCSAFFSPLAKSILFRDGGVVKVFFGPDFVTVTKLTEENVWSVLDPE 89 (213) T ss_pred HCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHCHHHHHHHHHCCCCEEEEECCCEEEEECCCHHHHHHHHCHH T ss_conf 24433489894210026898532321113345555311147889887525884798736973788455123447664405 Q ss_pred HHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEEC--CEEEEEE Q ss_conf 99999999740143232321122222222233445440268899999999987777875289759996433--4799996 Q gi|254780430|r 77 VLGMIMEHFISGDPIIHNGGLGDMKLDDMGSGDFIESDSAVVQRIKEVLDNRVRPAVARDGGDIVFKGYRD--GIVFLSM 154 (189) Q Consensus 77 I~~~I~~~l~~g~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~~~--g~v~v~~ 154 (189) +...+.++++.|.+.+......-. 
.....++|.+....|+++|++||||.++.||||+.|.++++ |+|+++| T Consensus 90 i~~~~sd~g~~g~pli~g~~~~~~------ ~~~~~e~d~e~t~~ikelietRiRp~i~edggdi~y~g~e~g~g~v~lkl 163 (213) T KOG2358 90 IPSLMSDGGNVGLPLIDGNIVVLK------ LQGACESDPESTMTIKELIETRIRPKIQEDGGDEDYVGFETGLGLVSLKL 163 (213) T ss_pred HHHHHHCCCCCCCHHHCCCHHHHH------ HCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCEEECCCCCCCCHHHHHH T ss_conf 678775224115602235134543------ 23444578258899999998753343110278304225557631578877 Q ss_pred CCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEEC Q ss_conf 36646666689999999999999978970035539 Q gi|254780430|r 155 RGACSGCPSASETLKYGVANILNHFVPEVKDIRTV 189 (189) Q Consensus 155 ~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~~v 189 (189) +|||++||||..|||+|||+||++|+|+|+.|+++ T Consensus 164 qgact~cpss~vtlk~Gie~mL~~y~~eVK~v~qv 198 (213) T KOG2358 164 QGACTECPSSLVTLKNGIENMLEIYVPEVKGVIQV 198 (213) T ss_pred HHHHCCCCCCCCHHHHHHHHHHHHHCCEEEEEEEC T ss_conf 66651598310012014899998625000577850 No 2 >pfam08712 Nfu_N Scaffold protein Nfu/NifU N terminal. This domain is found at the N terminus of NifU and NifU-related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters.
Probab=99.95 E-value=6.9e-28 Score=194.98 Aligned_cols=86 Identities=63% Similarity=1.029 Sum_probs=83.6 Q ss_pred EECCCCCCHHHHHHHCCCEECCCCCEECCCHHHCCCCHHHHHHHHCCCCEEEEECCCEEEEEEC-CCCCHHHHHHHHHHH Q ss_conf 0211488722435417975437886430696670134688987416881079980877998314-668212348999999 Q gi|254780430|r 3 IQTEDTPNPATLKFIPGQVVLVEGAIHFSNAKEAEISPLASRIFSIPGIASVYFGYDFITVGKD-QYDWEHLRPPVLGMI 81 (189) Q Consensus 3 I~~e~TPNPn~lKFi~~~~i~~~g~~~f~~~~~a~~spLa~~Lf~i~GV~~Vfi~~nFITVtK~-~~eW~~i~p~I~~~I 81 (189) ||+|+|||||+|||++++.++..|+++|++.+++.++|||++||+++||++||++.|||||||+ +++|++|+|+|+++| T Consensus 1 I~~e~TPNPn~lKf~~~~~l~~~gs~~f~~~~~a~~spLa~~Lf~i~gV~~Vf~~~nFITVtK~~~~~W~~l~~~I~~~I 80 (87) T pfam08712 1 IQTESTPNPNTLKFLPGKEVLPEGTFEFKNADEAEGSPLAQKLFKIPGVKSVFFGDDFITVTKADDADWDDLKPEVLEAI 80 (87) T ss_pred CCCCCCCCCCCEEEECCCEECCCCCEEECCHHHCCCCHHHHHHHCCCCEEEEEEECCEEEEEECCCCCHHHHHHHHHHHH T ss_conf 94647989037678679764279867878978752687999984888803999858889997379999899999999999 Q ss_pred HHHHHCC Q ss_conf 9997401 Q gi|254780430|r 82 MEHFISG 88 (189) Q Consensus 82 ~~~l~~g 88 (189) ++||.+| T Consensus 81 ~~~l~sG 87 (87) T pfam08712 81 MEHLESG 87 (87) T ss_pred HHHHHCC T ss_conf 9999649 No 3 >TIGR03341 YhgI_GntY IscR-regulated protein YhgI. IscR (TIGR02010) is an iron-sulfur cluster-binding transcriptional regulator (see Genome Property GenProp0138). Members of this protein family include YhgI, whose expression is under control of IscR, and show sequence similarity to IscA, a known protein of iron-sulfur cluster biosynthesis. These two lines of evidence strongly suggest a role as an iron-sulfur cluster biosynthesis protein. An older study designated this protein GntY and suggested a role for it and for the product of an adjacent gene, based on complementation studies, in gluconate utilization. 
Probab=99.92 E-value=5.3e-25 Score=176.98 Aligned_cols=79 Identities=33% Similarity=0.644 Sum_probs=74.5 Q ss_pred CCCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEE-ECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEEC Q ss_conf 54402688999999999877778752897599964-33479999636646666689999999999999978970035539 Q gi|254780430|r 111 IESDSAVVQRIKEVLDNRVRPAVARDGGDIVFKGY-RDGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIRTV 189 (189) Q Consensus 111 ~~~~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~-~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~~v 189 (189) ...|+.+..+|+++|+++|||+|++|||||+|+++ ++|+|+|+|+|||+|||+|++|||+|||.+|++++|||++|+.+ T Consensus 100 ~~~d~~l~~~i~~vl~~ei~P~l~~hGG~v~lv~i~~~~~v~~~~~G~C~gC~~s~~Tlk~gvE~~l~~~~Pei~~V~d~ 179 (190) T TIGR03341 100 VADDAPLEERINYVLQSEINPQLASHGGKVTLVEITDDGVAVLQFGGGCNGCSMVDVTLKDGVEKTLLERFPELKGVRDA 179 (190) T ss_pred CCCCCCHHHHHHHHHHHHCCHHHHHCCCEEEEEEEECCCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCCEEEEC T ss_conf 57776899999999986029357636998999999079789999575489983689999999999999869875358876 No 4 >COG0694 Thioredoxin-like proteins and domains [Posttranslational modification, protein turnover, chaperones] Probab=99.92 E-value=3.3e-25 Score=178.24 Aligned_cols=78 Identities=47% Similarity=0.939 Sum_probs=74.4 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEE--CCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEEC Q ss_conf 44026889999999998777787528975999643--3479999636646666689999999999999978970035539 Q gi|254780430|r 112 ESDSAVVQRIKEVLDNRVRPAVARDGGDIVFKGYR--DGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIRTV 189 (189) Q Consensus 112 ~~~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~~--~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~~v 189 (189) ..++++.++++++|+++|||+|++||||++|++++ +|+|+++|+|||+|||||+.|||+|||++||+++|++++|+++ T Consensus 5 ~~~~~~~e~v~~~l~~~irP~l~~dGGdve~~~i~~~~g~V~l~l~GaC~gC~sS~~TLk~gIE~~L~~~i~ev~~V~~v 84 (93) T COG0694 5 ETDAELLERVEEVLDEKIRPQLAMDGGDVELVGIDEEDGVVYLRLGGACSGCPSSTVTLKNGIERQLKEEIPEVKEVEQV 84 (93) T ss_pred CCCHHHHHHHHHHHHHCCCCEEEECCCEEEEEEEECCCCEEEEEECCCCCCCCCCHHHHHHHHHHHHHHHCCCCCEEEEC T ss_conf 
01289999999999844571021049808999875689869999677578996628999999999999668865169973 No 5 >PRK11190 putative DNA uptake protein; Provisional Probab=99.91 E-value=4.6e-24 Score=171.11 Aligned_cols=79 Identities=30% Similarity=0.643 Sum_probs=74.3 Q ss_pred CCCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEE-ECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCC-CCEEEEE Q ss_conf 54402688999999999877778752897599964-334799996366466666899999999999999789-7003553 Q gi|254780430|r 111 IESDSAVVQRIKEVLDNRVRPAVARDGGDIVFKGY-RDGIVFLSMRGACSGCPSASETLKYGVANILNHFVP-EVKDIRT 188 (189) Q Consensus 111 ~~~~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~-~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vp-ev~~V~~ 188 (189) ..+|+.+.++|+.+|+++|||+|++|||+|+|+++ ++|+|+|+|+|+|+|||+|.+|||+|||.+|++.+| ||++|+. T Consensus 101 ~~~d~~l~~~i~~vl~~~i~P~l~~hGG~v~l~~i~~~~~~~~~~~G~C~gC~~~~~Tlk~gvE~~l~~~~P~ei~~V~d 180 (192) T PRK11190 101 VADDAPLMERVEYVLQSQINPQLAGHGGRVSLMEITEDGYAILQFGGGCNGCSMVDVTLKEGIEKQLLNEFPGELKGVRD 180 (192) T ss_pred CCCCCCHHHHHHHHHHHHCCHHHHHCCCEEEEEEECCCCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCHHHCEEEE T ss_conf 68875599999999987319567637987999998279889999666578986889999999999999869786705787 Q ss_pred C Q ss_conf 9 Q gi|254780430|r 189 V 189 (189) Q Consensus 189 v 189 (189) + T Consensus 181 ~ 181 (192) T PRK11190 181 L 181 (192) T ss_pred C T ss_conf 6 No 6 >pfam01106 NifU NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 
Probab=99.89 E-value=1.9e-23 Score=167.27 Aligned_cols=68 Identities=40% Similarity=0.660 Sum_probs=65.7 Q ss_pred HHHHHHHHHHHHHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEEC Q ss_conf 999999987777875289759996433479999636646666689999999999999978970035539 Q gi|254780430|r 121 IKEVLDNRVRPAVARDGGDIVFKGYRDGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIRTV 189 (189) Q Consensus 121 i~~~l~~~IrP~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~~v 189 (189) |+++||+ |||+|++||||++|+++++|+|+|+|+|||+|||||+.||+++||++|++++|+++.|.+| T Consensus 1 I~~~le~-IRP~l~~dGGdvelv~v~~~~v~v~l~GaC~gC~~s~~Tlk~~Ie~~L~~~vpev~~Vv~V 68 (68) T pfam01106 1 IEEVIDE-IRPMLQRDGGDIELVDVDGDIVKVRLQGACGGCMSSTMTLKGGIERKLRERLGESLRVIPV 68 (68) T ss_pred CHHHHHH-HCHHHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCCEEEEC T ss_conf 9779987-5648885599289999869999999812898981089999999999999878997669879 No 7 >KOG2358 consensus Probab=98.16 E-value=1.3e-06 Score=62.24 Aligned_cols=67 Identities=25% Similarity=0.376 Sum_probs=61.2 Q ss_pred HHHHHHHHHHHHHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEE Q ss_conf 99999998777787528975999643347999963664666668999999999999997897003553 Q gi|254780430|r 121 IKEVLDNRVRPAVARDGGDIVFKGYRDGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIRT 188 (189) Q Consensus 121 i~~~l~~~IrP~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~~ 188 (189) ...+|+. ++|++.+|||+.-..-+.+.++.+.++|+|..||++++|+|.+||..+|..|||.-.+++ T Consensus 82 ~w~~L~p-~i~~~~sd~g~~g~pli~g~~~~~~~~~~~e~d~e~t~~ikelietRiRp~i~edggdi~ 148 (213) T KOG2358 82 VWSVLDP-EIPSLMSDGGNVGLPLIDGNIVVLKLQGACESDPESTMTIKELIETRIRPKIQEDGGDED 148 (213) T ss_pred HHHHHCH-HHHHHHHCCCCCCCHHHCCCHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCEE T ss_conf 4766440-567877522411560223513454323444578258899999998753343110278304 No 8 >pfam01883 DUF59 Domain of unknown function DUF59. This family includes prokaryotic proteins of unknown function. The family also includes PhaH from Pseudomonas putida. 
PhaH forms a complex with PhaF, PhaG, and PhaI, which hydroxylates phenylacetic acid to 2-hydroxyphenylacetic acid. So members of this family may all be components of ring hydroxylating complexes. Probab=88.19 E-value=1 Score=25.54 Aligned_cols=67 Identities=28% Similarity=0.423 Sum_probs=47.1 Q ss_pred HHHHHHHHHHHHHHHHH---CCCCEEEEEEE-CCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEE Q ss_conf 99999999987777875---28975999643-34799996366466666899999999999999789700355 Q gi|254780430|r 119 QRIKEVLDNRVRPAVAR---DGGDIVFKGYR-DGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIR 187 (189) Q Consensus 119 ~~i~~~l~~~IrP~l~~---dGG~i~~~~~~-~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~ 187 (189) ++|.+.|.+-+.|-+.. +=|-|.=+.++ +|.|.|.|.=.+.+||.+ ..+.+.++..++. +|.|.+|. T Consensus 3 e~I~~aL~~V~DPEl~~~Iv~LGlI~~i~v~~~g~v~I~~~lT~~~CP~~-~~i~~~i~~~l~~-v~gv~~V~ 73 (76) T pfam01883 3 EAILEALKTVIDPELPVDIVDLGLVYEVDIDDDGNVKVKMTLTTPGCPLA-DLIALDVREALLE-LPGVEDVE 73 (76) T ss_pred HHHHHHHHCCCCCCCCCCCCCCCEEEEEEECCCCEEEEEEEECCCCCCCH-HHHHHHHHHHHHC-CCCCEEEE T ss_conf 89999982778999997800245368999857984999999589999837-8999999999983-99940789 No 9 >TIGR02000 NifU_proper Fe-S cluster assembly protein NifU; InterPro: IPR010238 Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] . FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems. 
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as an S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly. The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets. In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase.
NifU binds one Fe atom at its N-terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen. This entry represents the NifU protein from the NIF system that is involved in nitrogenase maturation.; GO: 0005506 iron ion binding, 0051536 iron-sulfur cluster binding, 0016226 iron-sulfur cluster assembly. Probab=83.59 E-value=0.72 Score=26.52 Aligned_cols=68 Identities=35% Similarity=0.567 Sum_probs=54.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEE Q ss_conf 6889999999998777787528975999643347999963664666668999999999999997897003 Q gi|254780430|r 116 AVVQRIKEVLDNRVRPAVARDGGDIVFKGYRDGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKD 185 (189) Q Consensus 116 ~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~ 185 (189) .....+...+ ...+|..+.++|+..........+++.+.|+|.+|..+..++.. ++..+......... T Consensus 231 ~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~ 298 (302) T TIGR02000 231 QRIQLIVKVL-AELRPVLQADGGDVELLDVDGKIVYVSLTGACSGCSSSAATLAG-IQQKLVERLGEFVV 298 (302) T ss_pred HHHHHHHHHH-HHHCCHHCCCCCCCCCCCCCCCCCEECCCCCCCCCCHHHHHHHH-HHHHHHHHCCCCCC T ss_conf 0011111111-11100000012210000012221000011111110000111111-22211110011211 No 10 >TIGR02176 pyruv_ox_red pyruvate:ferredoxin (flavodoxin) oxidoreductase; InterPro: IPR011895 The oxidative decarboxylation of pyruvate to acetyl-CoA, a central step in energy metabolism, can occur by two different mechanisms. In mitochondria and aerobic bacteria this reaction is catalysed by the multienzyme complex pyruvate dehydrogenase using NAD as electron acceptor. In anaerobic organisms, however, this reaction is reversibly catalysed by a single enzyme using either ferredoxin or flavodoxin as the electron acceptor.
Pyruvate:ferredoxin/flavodoxin reductases (PFORs) in this entry occur in both obligately and facultatively anaerobic bacteria and also some eukaryotic microorganisms. These proteins are single-chain enzymes containing a thiamin pyrophosphate cofactor for the cleavage of carbon-carbon bonds next to a carbonyl group, and iron-sulphur clusters for electron transfer. The Desulfovibrio africanus enzyme is currently the only PFOR whose three-dimensional structure is known. It is a homodimer where each subunit contains one thiamin pyrophosphate cofactor and two ferredoxin-like 4Fe-S clusters and an atypical 4Fe-S cluster. Each monomer is composed of seven domains - domains I, II and VI make intersubunit contacts, while domains III, IV and V are located at the surface of the dimer, and domain VII forms a long arm extending over the other subunit. The cofactor is bound at the interface of domains I and VI and is proximal to the atypical 4Fe-S bound by domain VI, while the ferredoxin-like 4Fe-S clusters are bound by domain V. Comparison of this enzyme with the multi-chain PFORs shows a correspondence between the domains in this enzyme and the subunits of the multi-chain enzymes.; GO: 0005506 iron ion binding, 0016903 oxidoreductase activity acting on the aldehyde or oxo group of donors, 0006118 electron transport.
Probab=73.96 E-value=0.97 Score=25.72 Aligned_cols=27 Identities=30% Similarity=0.588 Sum_probs=16.5 Q ss_pred HHHHHHHHHHHHHHHHHCCCCEEEEEE Q ss_conf 999999999877778752897599964 Q gi|254780430|r 119 QRIKEVLDNRVRPAVARDGGDIVFKGY 145 (189) Q Consensus 119 ~~i~~~l~~~IrP~l~~dGG~i~~~~~ 145 (189) ....+...+=+||.+++.|-+|.+-.| T Consensus 641 ~~~~eFV~nv~~pi~~q~GD~lpVS~~ 667 (1194) T TIGR02176 641 EDAPEFVKNVVRPIAAQEGDDLPVSAF 667 (1194) T ss_pred CCCCHHHHHHHHHHHHCCCCCCCHHHH T ss_conf 747478999888886258898765777 No 11 >COG3086 RseC Positive regulator of sigma E activity [Signal transduction mechanisms] Probab=61.87 E-value=8.1 Score=19.98 Aligned_cols=35 Identities=20% Similarity=0.416 Sum_probs=25.8 Q ss_pred EEEEEECCEEEEEE--CCCCCCCHHHHHHHHHHHHHHH Q ss_conf 99964334799996--3664666668999999999999 Q gi|254780430|r 141 VFKGYRDGIVFLSM--RGACSGCPSASETLKYGVANIL 176 (189) Q Consensus 141 ~~~~~~~g~v~v~~--~GaC~~Cpss~~Tl~~gie~~l 176 (189) .++++++|.++|+- +-+|+.|+|...-. .+..+-| T Consensus 7 ~vv~~q~G~a~V~c~~~S~CgsC~a~~~CG-s~~l~kL 43 (150) T COG3086 7 TVVSWQNGQAKVSCQRQSACGSCAARAGCG-SGLLSKL 43 (150) T ss_pred EEEECCCCEEEEEEECCCCCCCCHHHCCCC-HHHHHHH T ss_conf 999733885999960467654542113422-6799985 No 12 >PRK10862 SoxR reducing system protein RseC; Provisional Probab=60.52 E-value=8.3 Score=19.89 Aligned_cols=24 Identities=21% Similarity=0.638 Sum_probs=16.5 Q ss_pred EEEEEECCEEEEEE--CCCCCCCHHH Q ss_conf 99964334799996--3664666668 Q gi|254780430|r 141 VFKGYRDGIVFLSM--RGACSGCPSA 164 (189) Q Consensus 141 ~~~~~~~g~v~v~~--~GaC~~Cpss 164 (189) .++.+++|.++|+. +.+|++|.+. 
T Consensus 7 ~Vv~~~~g~a~Ve~~r~SaCg~C~a~ 32 (158) T PRK10862 7 TVVSWQNGQALVRCDVKASCSSCASR 32 (158) T ss_pred EEEEEECCEEEEEEEECCCCCCCCCC T ss_conf 99999699899998526887678888 No 13 >COG1308 EGD2 Transcription factor homologous to NACalpha-BTF3 [Transcription] Probab=58.03 E-value=16 Score=18.17 Aligned_cols=88 Identities=20% Similarity=0.267 Sum_probs=44.9 Q ss_pred HCCCCEEEEECC--CEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHH Q ss_conf 168810799808--779983146682123489999999997401432323211222222222334454402688999999 Q gi|254780430|r 47 SIPGIASVYFGY--DFITVGKDQYDWEHLRPPVLGMIMEHFISGDPIIHNGGLGDMKLDDMGSGDFIESDSAVVQRIKEV 124 (189) Q Consensus 47 ~i~GV~~Vfi~~--nFITVtK~~~eW~~i~p~I~~~I~~~l~~g~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~ 124 (189) .++||+.|-+-. +-+.|.+..+. ......+ ..|-.+|.+.. ..................++|=+++-.=..+ T Consensus 24 eld~v~~V~i~~kd~e~vi~~P~V~----~~~~~g~-~~yqi~g~~~~-~~~~~~~ee~~~d~~~i~eeDIkLV~eQa~V 97 (122) T COG1308 24 ELDGVERVIIKLKDTEYVIENPQVT----VMKAMGQ-KTYQISGDPSA-KEAVKKPEEKTVDESDISEEDIKLVMEQAGV 97 (122) T ss_pred ECCCCEEEEEECCCCEEEEECCCEE----EEHHCCH-HHHHHHCCHHH-HCCCCCCHHCCCCCCCCCHHHHHHHHHHHCC T ss_conf 4267448999828953896189177----5510034-68988464345-3065561011245578988999999998189 Q ss_pred HHHHHHHHHHHCCCCE Q ss_conf 9998777787528975 Q gi|254780430|r 125 LDNRVRPAVARDGGDI 140 (189) Q Consensus 125 l~~~IrP~l~~dGG~i 140 (189) =.+..|-+|...|||+ T Consensus 98 sreeA~kAL~e~~GDl 113 (122) T COG1308 98 SREEAIKALEEAGGDL 113 (122) T ss_pred CHHHHHHHHHHCCCCH T ss_conf 9999999999838869 No 14 >cd04874 ACT_Af1403 N-terminal ACT domain of the yet uncharacterized, small (~133 a.a.), putative amino acid binding protein, Af1403, and related domains. This CD includes the N-terminal ACT domain of the yet uncharacterized, small (~133 a.a.), putative amino acid binding protein, Af1403, from Archaeoglobus fulgidus and other related archeal ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains. 
Probab=56.55 E-value=18 Score=17.81 Aligned_cols=49 Identities=16% Similarity=0.520 Sum_probs=33.4 Q ss_pred HHHHHHHHCCCCEEEEEE---EC--CEEEEEECCCCCCCHHHHHHHHHHHHHHHH--HHCCCCEEEEE Q ss_conf 877778752897599964---33--479999636646666689999999999999--97897003553 Q gi|254780430|r 128 RVRPAVARDGGDIVFKGY---RD--GIVFLSMRGACSGCPSASETLKYGVANILN--HFVPEVKDIRT 188 (189) Q Consensus 128 ~IrP~l~~dGG~i~~~~~---~~--g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~--~~vpev~~V~~ 188 (189) +|--.++.|||+|.+... ++ |..|+++.|- .-++.+++ +.+|.|.+|+. T Consensus 16 ~itgvIa~hg~NItytqqfi~~~g~~~iY~ElE~v------------~d~e~Li~~L~~~~~V~eVei 71 (72) T cd04874 16 DLTGVIAEHGGNITYTQQFIEREGKARIYMELEGV------------GDIEELVEELRSLPIVREVEI 71 (72) T ss_pred HHHHHHHHCCCCEEEEEEEEECCCEEEEEEEEECC------------CCHHHHHHHHHCCCCEEEEEE T ss_conf 98879986489869999998079828999999679------------998999999877995599994 No 15 >cd07304 Chorismate_synthase Chorismate synthase, the enzyme catalyzing the final step of the shikimate pathway. Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway: the conversion of 5-enolpyruvylshikimate-3-phosphate (EPSP) to chorismate, a precursor for the biosynthesis of aromatic compounds. This process has an absolute requirement for reduced FMN as a cofactor, which is thought to facilitate cleavage of C-O bonds by transiently donating an electron to the substrate, with no overall change in its redox state.
Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two classes: Enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment, are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fun Probab=56.47 E-value=7.4 Score=20.20 Aligned_cols=29 Identities=21% Similarity=0.376 Sum_probs=15.0 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHCCCCEEEE Q ss_conf 026889999999998777787528975999 Q gi|254780430|r 114 DSAVVQRIKEVLDNRVRPAVARDGGDIVFK 143 (189) Q Consensus 114 ~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~ 143 (189) |.+..+++.++|++ +|-.-.+-||-++++ T Consensus 179 d~~~~~~m~~~I~~-ak~~gDSlGG~ve~~ 207 (344) T cd07304 179 DPEAEEKMKELIDE-AKKEGDSVGGVVEVV 207 (344) T ss_pred CHHHHHHHHHHHHH-HHHCCCCCCCEEEEE T ss_conf 98999999999999-854499987289999 No 16 >PRK05382 chorismate synthase; Validated Probab=55.10 E-value=8.5 Score=19.84 Aligned_cols=28 Identities=25% Similarity=0.275 Sum_probs=20.2 Q ss_pred CCHHHHHHHHCCCCEEEEECCCEEEEEE Q ss_conf 3468898741688107998087799831 Q gi|254780430|r 38 ISPLASRIFSIPGIASVYFGYDFITVGK 65 (189) Q Consensus 38 ~spLa~~Lf~i~GV~~Vfi~~nFITVtK 65 (189) ++.||+.||.||.|+.|-++..|=.... T Consensus 230 da~La~a~msIpAvKgvEfG~Gf~~a~~ 257 (357) T PRK05382 230 DADLAHALMSINAVKGVEIGDGFEAARL 257 (357) T ss_pred HHHHHHHHHCCCCEEEEEECCHHHHHHC T ss_conf 5789888617664236896451657526 No 17 >pfam04246 RseC_MucC Positive regulator of sigma(E), RseC/MucC. This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions).
In Pseudomonas aeruginosa, de-repression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonisation in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein, of unknown function, that is essential for nitrogen fixation. This protein also contains an ApbE domain pfam02424, w Probab=52.38 E-value=9.6 Score=19.50 Aligned_cols=20 Identities=20% Similarity=0.531 Sum_probs=9.9 Q ss_pred EEEECCEEEEEE--CCCCCCCH Q ss_conf 964334799996--36646666 Q gi|254780430|r 143 KGYRDGIVFLSM--RGACSGCP 162 (189) Q Consensus 143 ~~~~~g~v~v~~--~GaC~~Cp 162 (189) ++++++.++|+. +.+|++|. T Consensus 2 v~v~~~~~~V~~~r~saC~~C~ 23 (135) T pfam04246 2 VAVEGGWATVEAQRKSACGSCA 23 (135) T ss_pred EEEECCEEEEEEEECCCCCCCC T ss_conf 8998999999982178662547 No 18 >pfam01264 Chorismate_synt Chorismate synthase. 
Probab=47.52 E-value=12 Score=18.93 Aligned_cols=26 Identities=19% Similarity=0.350 Sum_probs=10.0 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCCCEE Q ss_conf 268899999999987777875289759 Q gi|254780430|r 115 SAVVQRIKEVLDNRVRPAVARDGGDIV 141 (189) Q Consensus 115 ~~~~~~i~~~l~~~IrP~l~~dGG~i~ 141 (189) .+..+++.+.|++ +|-.=.+-||-++ T Consensus 178 ~~~~~~m~~~I~~-ak~~gDSlGG~ve 203 (346) T pfam01264 178 PEAEERMEELIDA-AKKEGDSLGGVVE 203 (346) T ss_pred HHHHHHHHHHHHH-HHHCCCCCCCEEE T ss_conf 8999999999999-9753999871899 No 19 >COG3435 Gentisate 1,2-dioxygenase [Secondary metabolites biosynthesis, transport, and catabolism] Probab=44.79 E-value=16 Score=18.14 Aligned_cols=48 Identities=29% Similarity=0.412 Sum_probs=20.7 Q ss_pred HHHHHHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHHHHHHH Q ss_conf 8777787528975999643347999963664666668999999999999 Q gi|254780430|r 128 RVRPAVARDGGDIVFKGYRDGIVFLSMRGACSGCPSASETLKYGVANIL 176 (189) Q Consensus 128 ~IrP~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l 176 (189) .|||.|+.-|.-|.-.+=.--++++.=-| |.|-.|.+.||..|++-+| T Consensus 53 ~ir~ll~~sgeli~~~~a~RRvi~L~NP~-l~g~ssiT~TLyAglQlil 100 (351) T COG3435 53 EIRPLLLRSGELISAREAVRRVIYLENPG-LRGRSSITPTLYAGLQLIL 100 (351) T ss_pred HHHHHHHHHHHCCCCCCCEEEEEEECCCC-CCCCCCCCHHHHHHHHEEC T ss_conf 99999998642247445436899964888-8886633088886554045 No 20 >COG0082 AroC Chorismate synthase [Amino acid transport and metabolism] Probab=43.97 E-value=11 Score=19.16 Aligned_cols=63 Identities=22% Similarity=0.323 Sum_probs=41.0 Q ss_pred CCCHHHHHHHHCCCCEEEEECCCEEEEEE------CCCCCHH---HHHHHHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 13468898741688107998087799831------4668212---348999999999740143232321122 Q gi|254780430|r 37 EISPLASRIFSIPGIASVYFGYDFITVGK------DQYDWEH---LRPPVLGMIMEHFISGDPIIHNGGLGD 99 (189) Q Consensus 37 ~~spLa~~Lf~i~GV~~Vfi~~nFITVtK------~~~eW~~---i~p~I~~~I~~~l~~g~~~i~~~~~~~ 99 (189) -++.||+.||.|+-|+.|-|+..|-...+ |...|++ -+..=..-|.--+.+|.|++...+... 
T Consensus 227 Lda~lA~AlmsI~AvKGVEiG~GF~~a~~~GSe~~De~~~~~~~~~~tN~~GGilGGitnG~pIv~r~a~KP 298 (369) T COG0082 227 LDAKLAHALMSIPAVKGVEIGDGFEAARMRGSEANDEITLDGGIVRKTNNAGGILGGITNGEPIVVRVAFKP 298 (369) T ss_pred CHHHHHHHHHCCCCCEEEEECCCHHHHHCCCCCCCCCEEECCCEEECCCCCCCEECCCCCCCCEEEEEEECC T ss_conf 168999986076650057865516553156631168634078806724667763035458961799998677 No 21 >pfam03557 Bunya_G1 Bunyavirus glycoprotein G1. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G1 glycoprotein which is the viral attachment protein. Probab=43.07 E-value=7.1 Score=20.33 Aligned_cols=30 Identities=33% Similarity=0.595 Sum_probs=18.8 Q ss_pred HCCCCEEEEEEECCEEEEEECCCCCCCHHHH Q ss_conf 5289759996433479999636646666689 Q gi|254780430|r 135 RDGGDIVFKGYRDGIVFLSMRGACSGCPSAS 165 (189) Q Consensus 135 ~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~ 165 (189) .+=||+.|..|.... -+...|.|+||+.-. T Consensus 738 ~~lgd~~YK~~~~~~-~l~~~~~C~GC~~Cf 767 (871) T pfam03557 738 AILGDIRYKTFAESI-DLEAEGKCVGCINCF 767 (871) T ss_pred EECCCHHHHCCCCCC-CEEEEEEECCCCCHH T ss_conf 861502354056674-146503851663033 No 22 >PRK12463 chorismate synthase; Reviewed Probab=42.20 E-value=17 Score=17.97 Aligned_cols=27 Identities=26% Similarity=0.289 Sum_probs=14.4 Q ss_pred CCHHHHHHHHCCCCEEEEECCCEEEEE Q ss_conf 346889874168810799808779983 Q gi|254780430|r 38 ISPLASRIFSIPGIASVYFGYDFITVG 64 (189) Q Consensus 38 ~spLa~~Lf~i~GV~~Vfi~~nFITVt 64 (189) ++-||+.||.||.|+.|-++..|=... 
T Consensus 242 da~LA~A~mSIpAvKGvE~G~GF~~a~ 268 (390) T PRK12463 242 DAKLAGAIMSINAFKGAEIGVGFEAAR 268 (390) T ss_pred HHHHHHHHHCCCCCEEEEECCCHHHHH T ss_conf 299999862778502788635442300 No 23 >TIGR02311 HpaI 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase; InterPro: IPR012689 This entry represents the aldolase which performs the final step unique to the 4-hydroxyphenylacetic acid catabolism pathway in which 2,4-dihydroxyhept-2-ene-1,7-dioic acid is split into pyruvate and succinate-semialdehyde. The gene for enzyme is generally found adjacent to other genes for this pathway.; GO: 0018802 24-dihydroxyhept-2-ene-17-dioate aldolase activity, 0010124 phenylacetate catabolic process. Probab=41.12 E-value=14 Score=18.50 Aligned_cols=47 Identities=13% Similarity=0.211 Sum_probs=28.7 Q ss_pred HHHHHCCCCEEEEECCCEEEEEE---CCCCCHHHHHHHHHHHHHHHHCCC Q ss_conf 98741688107998087799831---466821234899999999974014 Q gi|254780430|r 43 SRIFSIPGIASVYFGYDFITVGK---DQYDWEHLRPPVLGMIMEHFISGD 89 (189) Q Consensus 43 ~~Lf~i~GV~~Vfi~~nFITVtK---~~~eW~~i~p~I~~~I~~~l~~g~ 89 (189) ..+..++||..||||+-=++=+= -+-.=-++...|..+|+.=-..|+ T Consensus 156 ~~Ia~VeGVDGVFiGPADLaasmGH~GnPsHPEV~~AI~~Ai~~i~a~gK 205 (249) T TIGR02311 156 EEIAAVEGVDGVFIGPADLAASMGHLGNPSHPEVQDAIDDAIERIKAAGK 205 (249) T ss_pred HHHHCCCCCCCEEECCHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCC T ss_conf 57750178662475712443401568886961589999999999985489 No 24 >KOG1482 consensus Probab=38.54 E-value=34 Score=16.05 Aligned_cols=50 Identities=24% Similarity=0.218 Sum_probs=36.6 Q ss_pred CCHHHHHHHHCCCCEEEEECCC-EEEEEE----------CCCCCHHHHHHHHHHHHHHHHC Q ss_conf 3468898741688107998087-799831----------4668212348999999999740 Q gi|254780430|r 38 ISPLASRIFSIPGIASVYFGYD-FITVGK----------DQYDWEHLRPPVLGMIMEHFIS 87 (189) Q Consensus 38 ~spLa~~Lf~i~GV~~Vfi~~n-FITVtK----------~~~eW~~i~p~I~~~I~~~l~~ 87 (189) ...+-+.|..++||++|.=-.= -||+.| .+++|+.+..++++.|...+.- T Consensus 296 ~~~~~~~l~~iegV~~VHdLhIWsiTv~k~~ls~Hv~i~~~ad~~~vL~~~~~~i~~~~~~ 356 (379) T KOG1482 296 
FDKVKKGLLSIEGVKAVHDLHIWSITVGKVALSVHLAIDSEADAEEVLDEARSLIKRRYGI 356 (379) T ss_pred HHHHHHHHHHHCCEEEEEEEEEEEEECCCEEEEEEEEECCCCCHHHHHHHHHHHHHHHCCE T ss_conf 7899987742116468888888877427557899996168889899999999999742565 No 25 >cd01815 BMSC_UbP_N BMSC_UbP (bone marrow stromal cell-derived ubiquitin-like protein) has an N-terminal ubiquitin-like (UBQ) domain and a C-terminal ubiquitin-associated (UBA) domain, a domain architecture similar to those of the UBIN, Chap1, and ubiquilin proteins. This CD represents the N-terminal ubiquitin-like domain. Probab=38.04 E-value=16 Score=18.14 Aligned_cols=30 Identities=27% Similarity=0.159 Sum_probs=26.4 Q ss_pred ECCCCCCCHHHHHHHHHHHHHHHHHHCCCC Q ss_conf 636646666689999999999999978970 Q gi|254780430|r 154 MRGACSGCPSASETLKYGVANILNHFVPEV 183 (189) Q Consensus 154 ~~GaC~~Cpss~~Tl~~gie~~l~~~vpev 183 (189) +.|.|.-|.-+..||||-|-.+|...+|+= T Consensus 12 e~gd~~~ggy~vs~lKQLia~~L~dS~PDP 41 (75) T cd01815 12 ELGDVSPGGYQVSTLKQLIAAQLPDSLPDP 41 (75) T ss_pred CCCCCCCCCEEHHHHHHHHHHHCCCCCCCH T ss_conf 777659983437899999975565469984 No 26 >cd02969 PRX_like1 Peroxiredoxin (PRX)-like 1 family; hypothetical proteins that show sequence similarity to PRXs. Members of this group contain a conserved cysteine that aligns to the first cysteine in the CXXC motif of TRX. This does not correspond to the peroxidatic cysteine found in PRXs, which aligns to the second cysteine in the CXXC motif of TRX. In addition, these proteins do not contain the other two conserved residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from either thioredoxin, glutathione, trypanothione and AhpF. 
Probab=37.38 E-value=36 Score=15.94 Aligned_cols=27 Identities=15% Similarity=0.150 Sum_probs=17.8 Q ss_pred CCHHHHHHHHHHHHHHHHHHCCCCEEE Q ss_conf 666689999999999999978970035 Q gi|254780430|r 160 GCPSASETLKYGVANILNHFVPEVKDI 186 (189) Q Consensus 160 ~Cpss~~Tl~~gie~~l~~~vpev~~V 186 (189) ....+..=|+++|+++|+-.-..+... T Consensus 136 ~~~~~~~~l~~Ai~~~~~g~~i~~~~t 162 (171) T cd02969 136 DPPVTGRDLRAALDALLAGKPVPVPQT 162 (171) T ss_pred CCCCCHHHHHHHHHHHHCCCCCCCCCC T ss_conf 998887999999999983899994676 No 27 >TIGR00033 aroC chorismate synthase; InterPro: IPR000453 Chorismate synthase (4.2.3.5 from EC) catalyzes the last of the seven steps in the shikimate pathway which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows , a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues.; GO: 0004107 chorismate synthase activity, 0009073 aromatic amino acid family biosynthetic process. Probab=36.46 E-value=21 Score=17.38 Aligned_cols=58 Identities=24% Similarity=0.281 Sum_probs=36.7 Q ss_pred CCHHHHHHHHCCCCEEEEECCCEEEEEE------CCCCCH--HHHHHHHHHHHHH-------HHCCCCCCCCCCC Q ss_conf 3468898741688107998087799831------466821--2348999999999-------7401432323211 Q gi|254780430|r 38 ISPLASRIFSIPGIASVYFGYDFITVGK------DQYDWE--HLRPPVLGMIMEH-------FISGDPIIHNGGL 97 (189) Q Consensus 38 ~spLa~~Lf~i~GV~~Vfi~~nFITVtK------~~~eW~--~i~p~I~~~I~~~-------l~~g~~~i~~~~~ 97 (189) ++.||..||.|+=||.|-||..|=..+- |...|+ +-...- +..++ +..|.++.-+-+. 
T Consensus 258 dA~LA~A~~SI~A~KGvEiG~GF~~a~~~GS~~~Def~~~KE~~~~i~--~ktNn~GGi~GGIt~G~~I~~r~a~ 330 (391) T TIGR00033 258 DAELASALMSIPAVKGVEIGDGFELASMRGSEANDEFVLEKEDDGGIR--RKTNNSGGILGGITNGEPIRVRIAV 330 (391) T ss_pred HHHHHHHHHCCCCCCEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCE--EEECCCCCCCCCCCCCCCEEEEEEE T ss_conf 588888873246211478757765022146422584046666288847--8734557423674017312898866 No 28 >PRK10128 putative aldolase; Provisional Probab=34.75 E-value=31 Score=16.31 Aligned_cols=56 Identities=14% Similarity=0.166 Sum_probs=35.8 Q ss_pred CCCCHHHHHHHHCCCCEEEEECCCEEEEEE---CCCCCHHHHHHHHHHHHHHHHCCCCC Q ss_conf 013468898741688107998087799831---46682123489999999997401432 Q gi|254780430|r 36 AEISPLASRIFSIPGIASVYFGYDFITVGK---DQYDWEHLRPPVLGMIMEHFISGDPI 91 (189) Q Consensus 36 a~~spLa~~Lf~i~GV~~Vfi~~nFITVtK---~~~eW~~i~p~I~~~I~~~l~~g~~~ 91 (189) ...-.=+.+++++|||..+|+|++=++..- .+.+..++...+...+...-..|.++ T Consensus 138 ~~av~nldeI~av~GvD~~fiGp~DLs~slG~pg~~~~p~v~~ai~~v~~~~~~~gk~~ 196 (250) T PRK10128 138 KTALDNLDEILDVEGIDGVFIGPADLSASLGYPDNAGHPEVQRIIETSIRRIRAAGKAA 196 (250) T ss_pred HHHHHHHHHHHCCCCCCEEEECHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCCCE T ss_conf 89998799985889988899884889986599999998699999999999999869976 No 29 >PRK11670 putative ATPase; Provisional Probab=34.59 E-value=40 Score=15.66 Aligned_cols=42 Identities=12% Similarity=0.195 Sum_probs=23.6 Q ss_pred HHHHHHHCCCCEEEEECCCEEEEEEC-C----CCCHHHHHHHHHHHH Q ss_conf 88987416881079980877998314-6----682123489999999 Q gi|254780430|r 41 LASRIFSIPGIASVYFGYDFITVGKD-Q----YDWEHLRPPVLGMIM 82 (189) Q Consensus 41 La~~Lf~i~GV~~Vfi~~nFITVtK~-~----~eW~~i~p~I~~~I~ 82 (189) |-+.|.++..|++|.+.++-++|+=. . ..++.++.++...+. 
T Consensus 29 ~~~~iv~lg~v~~i~i~~~~v~i~l~l~~~~~~~~~~~~~~~~~~l~ 75 (369) T PRK11670 29 LKHNLTTLKALHHVAWMDDTLHIELVMPFVWNSAFEELKEQCSAELL 75 (369) T ss_pred CCCCEECCCCEEEEEEECCEEEEEEEECCCCCCHHHHHHHHHHHHHH T ss_conf 99880037970169997999999999688898879999999999998 No 30 >PRK09193 indolepyruvate ferredoxin oxidoreductase; Validated Probab=34.10 E-value=7.8 Score=20.07 Aligned_cols=16 Identities=38% Similarity=0.706 Sum_probs=12.5 Q ss_pred EEECCCCCCCHHHHHH Q ss_conf 9963664666668999 Q gi|254780430|r 152 LSMRGACSGCPSASET 167 (189) Q Consensus 152 v~~~GaC~~Cpss~~T 167 (189) .+.--=|+|||..+.| T Consensus 435 ~R~PyFCSGCPHNtST 450 (1155) T PRK09193 435 ARTPYFCSGCPHNTST 450 (1155) T ss_pred CCCCCCCCCCCCCCCC T ss_conf 8798677999997567 No 31 >PRK13030 2-oxoacid ferredoxin oxidoreductase; Provisional Probab=33.34 E-value=8.3 Score=19.90 Aligned_cols=17 Identities=35% Similarity=0.604 Sum_probs=12.3 Q ss_pred EEECCCCCCCHHHHHHH Q ss_conf 99636646666689999 Q gi|254780430|r 152 LSMRGACSGCPSASETL 168 (189) Q Consensus 152 v~~~GaC~~Cpss~~Tl 168 (189) -|.--=|+|||..+.|- T Consensus 434 ~R~P~FCSGCPHNtST~ 450 (1168) T PRK13030 434 RRTPYFCSGCPHNTSTK 450 (1168) T ss_pred CCCCCCCCCCCCCCCCC T ss_conf 67873458999976654 No 32 >PRK13029 2-oxoacid ferredoxin oxidoreductase; Provisional Probab=33.06 E-value=8.8 Score=19.74 Aligned_cols=17 Identities=35% Similarity=0.575 Sum_probs=12.5 Q ss_pred EEECCCCCCCHHHHHHH Q ss_conf 99636646666689999 Q gi|254780430|r 152 LSMRGACSGCPSASETL 168 (189) Q Consensus 152 v~~~GaC~~Cpss~~Tl 168 (189) -+.--=|+|||..+.|- T Consensus 451 ~R~P~FCSGCPHNtST~ 467 (1186) T PRK13029 451 ERKPWFCSGCPHNTSTR 467 (1186) T ss_pred CCCCCCCCCCCCCCCCC T ss_conf 67874468999964565 No 33 >pfam04320 DUF469 Protein with unknown function (DUF469). Family of bacteria protein with no known function. 
Probab=32.87 E-value=42 Score=15.49 Aligned_cols=65 Identities=26% Similarity=0.411 Sum_probs=43.0 Q ss_pred CHHHHHHHHHHHHHHHHH-HHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEEE Q ss_conf 026889999999998777-787528975999643347999963664666668999999999999997897003553 Q gi|254780430|r 114 DSAVVQRIKEVLDNRVRP-AVARDGGDIVFKGYRDGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIRT 188 (189) Q Consensus 114 ~~~~~~~i~~~l~~~IrP-~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~~ 188 (189) ++.+..-+...|++.|.| .|.-+||+ ...-+|.|...-.|+|.- .-+..|+.=|+. -|++++|+. T Consensus 29 ~e~~D~~vD~fIde~Ie~ngL~~~Ggg---~~~~eG~Vc~~~~gs~tE------e~R~~V~~WL~~-r~ev~~v~v 94 (102) T pfam04320 29 EEAIDAFVDAFIDEVIEANGLAFGGGG---YLEWEGFVCLQRYGSCTE------EDRAAVEAWLEA-RPEVKDVEV 94 (102) T ss_pred HHHHHHHHHHHHHHHHHHCCCEEECCC---CCEEEEEEEECCCCCCCH------HHHHHHHHHHHH-CCCCCEEEE T ss_conf 899999999999998633683683487---422537998420579998------999999999970-987436886 No 34 >COG2151 PaaD Predicted metal-sulfur cluster biosynthetic enzyme [General function prediction only] Probab=31.70 E-value=44 Score=15.37 Aligned_cols=70 Identities=27% Similarity=0.439 Sum_probs=46.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHC---CCCEEEEEEE--CCEEEEEECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEEE Q ss_conf 688999999999877778752---8975999643--34799996366466666899999999999999789700355 Q gi|254780430|r 116 AVVQRIKEVLDNRVRPAVARD---GGDIVFKGYR--DGIVFLSMRGACSGCPSASETLKYGVANILNHFVPEVKDIR 187 (189) Q Consensus 116 ~~~~~i~~~l~~~IrP~l~~d---GG~i~~~~~~--~g~v~v~~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~V~ 187 (189) +...+|.+.|.+=+.|.+--| =|-|.=+.++ +|.++|.|...-.|||++.. +.+-+++.++. 
+|.++.|+ T Consensus 12 ~~~~~i~~aL~~V~DPEi~idIvdLGLVy~v~i~~~~~~v~v~mtlT~~gCP~~~~-i~~~v~~al~~-~~~v~~v~ 86 (111) T COG2151 12 VTLEDILEALKTVIDPEIGIDIVDLGLVYEVDIDDVDGLVKVKMTLTSPGCPLAEV-IADQVEAALEE-IPGVEDVE 86 (111) T ss_pred HHHHHHHHHHHCCCCCCCCEEEEEECCEEEEEEECCCCEEEEEEECCCCCCCCHHH-HHHHHHHHHHH-CCCCCEEE T ss_conf 66999999853477966660357631079999726774699999517888882078-89999999984-68813079 No 35 >TIGR00370 TIGR00370 conserved hypothetical protein TIGR00370; InterPro: IPR010016 This is an uncharacterised family of proteins of unknown function.. Probab=31.64 E-value=44 Score=15.36 Aligned_cols=88 Identities=20% Similarity=0.246 Sum_probs=51.4 Q ss_pred CCCCHHHHHHHHC--CCCEEEEECCCEEEE-EECCC------------CCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCC Q ss_conf 0134688987416--881079980877998-31466------------82123489999999997401432323211222 Q gi|254780430|r 36 AEISPLASRIFSI--PGIASVYFGYDFITV-GKDQY------------DWEHLRPPVLGMIMEHFISGDPIIHNGGLGDM 100 (189) Q Consensus 36 a~~spLa~~Lf~i--~GV~~Vfi~~nFITV-tK~~~------------eW~~i~p~I~~~I~~~l~~g~~~i~~~~~~~~ 100 (189) -.-.-+|++|-+- ++|-.+..+.|-+|| ++|-+ -|++++.+|.+-+.+--+.-+. 
+ ...-+- T Consensus 20 ~~~wa~A~~l~~~Pfp~~vE~ipgmnnlTVf~rd~~~~~k~l~qrl~~~wee~krdveerlaeiae~~e~--~-~R~iEI 96 (217) T TIGR00370 20 KRVWALARRLEEEPFPDVVEVIPGMNNLTVFLRDMYEVYKDLVQRLERLWEEVKRDVEERLAEIAEALEV--E-SRFIEI 96 (217) T ss_pred HHHHHHHHHHHHCCCCCEEEEECCCCCCEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCC--C-CCEEEE T ss_conf 5799999998746898758830175420000005689988767756567788754388899888765145--8-877873 Q ss_pred CCCCCCCCCCCCCCHHHHHHHHHHHHHHH Q ss_conf 22222233445440268899999999987 Q gi|254780430|r 101 KLDDMGSGDFIESDSAVVQRIKEVLDNRV 129 (189) Q Consensus 101 ~~~~~~~~~~~~~~~~~~~~i~~~l~~~I 129 (189) ...- ......|=+.+.+++++=.++| T Consensus 97 PV~Y---GGe~GpDL~~VAk~~qlspe~v 122 (217) T TIGR00370 97 PVVY---GGERGPDLEEVAKFNQLSPEEV 122 (217) T ss_pred EEEE---CCCCCCCHHHHHHHHCCCHHHE T ss_conf 0475---7888989899997717884573 No 36 >COG3696 Putative silver efflux pump [Inorganic ion transport and metabolism] Probab=30.83 E-value=46 Score=15.28 Aligned_cols=108 Identities=20% Similarity=0.273 Sum_probs=56.3 Q ss_pred CCCHHHHHHHHCCCCEEE----EECCCEEEEE-ECCCC--CH-HHHHHHHHHHHHHHHCCCC-CCCCCCCCCCCCCCCCC Q ss_conf 134688987416881079----9808779983-14668--21-2348999999999740143-23232112222222223 Q gi|254780430|r 37 EISPLASRIFSIPGIASV----YFGYDFITVG-KDQYD--WE-HLRPPVLGMIMEHFISGDP-IIHNGGLGDMKLDDMGS 107 (189) Q Consensus 37 ~~spLa~~Lf~i~GV~~V----fi~~nFITVt-K~~~e--W~-~i~p~I~~~I~~~l~~g~~-~i~~~~~~~~~~~~~~~ 107 (189) ---|+-..+..+||++.| +++-.+|||- ||.+| |. .+-.+=.+..+..|..|-. .+.+....-...-... 
T Consensus 65 iT~Plet~m~G~pg~~tvRs~S~~g~S~VtViF~dgtDiY~ARq~V~ErL~~v~~~LP~g~~p~lgP~sTglG~i~~yt- 143 (1027) T COG3696 65 VTYPLETAMMGLPGVKTVRSISKYGLSLVTVIFKDGTDLYWARQRVLERLSQVQSQLPEGVEPELGPDSTGLGWIYQYT- 143 (1027) T ss_pred CEEEEHHHHCCCCCCEEEECCCCCCCEEEEEEEECCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEEE- T ss_conf 1001067654799851541225577448999980785479999999999999997489988888488766762048999- Q ss_pred CCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEE Q ss_conf 344544026889999999998777787528975999643 Q gi|254780430|r 108 GDFIESDSAVVQRIKEVLDNRVRPAVARDGGDIVFKGYR 146 (189) Q Consensus 108 ~~~~~~~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~~ 146 (189) -..+++..-......+.|-.|||.|.+-.|=.+...+- T Consensus 144 -l~~~~~~~~l~elr~lqdw~vrp~L~~vpGVaeV~s~G 181 (1027) T COG3696 144 -LVDKSGKTDLDELRELQDWVVRPQLRSVPGVAEVASVG 181 (1027) T ss_pred -EECCCCCCCHHHHHHHHHHHHHHHHHCCCCHHHHHHCC T ss_conf -97588998889999988782689871599715531047 No 37 >TIGR00852 pts-Glc PTS system, maltose and glucose-specific subfamily, IIC component; InterPro: IPR004719 Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family. These permeases show limited sequence similarity with members of the Fru family. Several of the Escherichia coli PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). 
Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all Scr permeases of other bacteria also lack a IIA domain. This HMM is specific for the IIC domain of the Glc family PTS transporters.; GO: 0008982 protein-N(PI)-phosphohistidine-sugar phosphotransferase activity, 0009401 phosphoenolpyruvate-dependent sugar phosphotransferase system, 0016021 integral to membrane. Probab=30.64 E-value=35 Score=16.02 Aligned_cols=62 Identities=19% Similarity=0.222 Sum_probs=49.3 Q ss_pred HHHHHHHHHHHHHCCCCEEEEEEEC-CEEEEE---------------ECCCCCCCHHHHHHHHHHHHHHHHHHCCCCEE Q ss_conf 9999987777875289759996433-479999---------------63664666668999999999999997897003 Q gi|254780430|r 123 EVLDNRVRPAVARDGGDIVFKGYRD-GIVFLS---------------MRGACSGCPSASETLKYGVANILNHFVPEVKD 185 (189) Q Consensus 123 ~~l~~~IrP~l~~dGG~i~~~~~~~-g~v~v~---------------~~GaC~~Cpss~~Tl~~gie~~l~~~vpev~~ 185 (189) -.+..-+.=..+.+||+|+..++-. .+-.+. ++|.|.+|..|...+ .-+|+.++..+|+... T Consensus 37 p~~~~~~~~~~~~~~g~Ipv~~~~~~~~~~~~~~~~g~~~~agIkTLytg~v~pi~~~~~~~-~~~er~~~~~lP~~L~ 114 (312) T TIGR00852 37 PVLNNAMGVGAQAVGGEIPVWNFFGFEIAKVGFNNSGLTLGAGIKTLYTGVVGPIIVGAIAL-ALHERFLDKKLPDVLG 114 (312) T ss_pred HHHHHHHHHHHHCCCCCCCEEEECCCCHHHHHCCCHHHHHHCCCHHHHHHHHHHHHHHHHHH-HHHHHHHHHHCCHHHH T ss_conf 99999999876316787102312243066653341455652453057778899999999999-9998884101002665 No 38 >pfam00873 ACR_tran AcrB/AcrD/AcrF family. Members of this family are integral membrane proteins. Some are involved in drug resistance. AcrB cooperates with a membrane fusion protein, AcrA, and an outer membrane channel TolC. The structure shows the AcrB forms a homotrimer. 
Probab=30.24 E-value=47 Score=15.22 Aligned_cols=36 Identities=14% Similarity=0.287 Sum_probs=20.0 Q ss_pred HHHHHHHHHHHHHHHHHCCC--CEEEEEEECCEEEEEE Q ss_conf 99999999987777875289--7599964334799996 Q gi|254780430|r 119 QRIKEVLDNRVRPAVARDGG--DIVFKGYRDGIVFLSM 154 (189) Q Consensus 119 ~~i~~~l~~~IrP~l~~dGG--~i~~~~~~~g~v~v~~ 154 (189) ..+.++.+++++|.|.+--| +|++.+..+..+.|++ T Consensus 151 ~~l~~~~~~~l~~~L~~v~gV~~V~~~G~~~~~i~I~~ 188 (1021) T pfam00873 151 TDLRDYADTNIKDQLSRIPGVGDVQLFGGSEYAMRIWL 188 (1021) T ss_pred HHHHHHHHHHHHHHHHCCCCEEEEEECCCCCEEEEEEE T ss_conf 99999999999999847979069997589706999997 No 39 >PRK10558 alpha-dehydro-beta-deoxy-D-glucarate aldolase; Provisional Probab=28.26 E-value=45 Score=15.35 Aligned_cols=36 Identities=11% Similarity=0.346 Sum_probs=15.2 Q ss_pred CCHHHHHHHHCCCCEEEEECCCEEEEEECCCCCHHHHHHHH Q ss_conf 34688987416881079980877998314668212348999 Q gi|254780430|r 38 ISPLASRIFSIPGIASVYFGYDFITVGKDQYDWEHLRPPVL 78 (189) Q Consensus 38 ~spLa~~Lf~i~GV~~Vfi~~nFITVtK~~~eW~~i~p~I~ 78 (189) .+|...+++..-|..-|+++..+- -.+|+++.+.++ T Consensus 28 ~sp~~~Ei~a~~G~Dfv~iD~EHg-----~~~~~~l~~~i~ 63 (256) T PRK10558 28 SNPITTEVLGLAGFDWLVLDGEHA-----PNDVSTFIPQLM 63 (256) T ss_pred CCHHHHHHHHCCCCCEEEECCCCC-----CCCHHHHHHHHH T ss_conf 998999999728989999837789-----999999999999 No 40 >COG4551 Predicted protein tyrosine phosphatase [General function prediction only] Probab=27.78 E-value=52 Score=14.95 Aligned_cols=91 Identities=14% Similarity=0.236 Sum_probs=50.4 Q ss_pred CCHHHHHHHH-CCCCEEEEECCCE---EEEEECCCCCHHHHHHHHHHHHHHHHCCC-CCCCCCCCCCCCCCCCCCCCCCC Q ss_conf 3468898741-6881079980877---99831466821234899999999974014-32323211222222222334454 Q gi|254780430|r 38 ISPLASRIFS-IPGIASVYFGYDF---ITVGKDQYDWEHLRPPVLGMIMEHFISGD-PIIHNGGLGDMKLDDMGSGDFIE 112 (189) Q Consensus 38 ~spLa~~Lf~-i~GV~~Vfi~~nF---ITVtK~~~eW~~i~p~I~~~I~~~l~~g~-~~i~~~~~~~~~~~~~~~~~~~~ 112 (189) .||-|..+|+ -+||+.--.|-+. +.++++..+|.+|.--.-..-+..|+..- +.+-++...-. 
+ .-+ T Consensus 14 rsptae~~Fa~~~~vetdSAGl~~dAe~Plt~e~leWAdiIfVMEr~HrqkL~krf~~~lk~kRviCL---D-----IPD 85 (109) T COG4551 14 RSPTAEVMFATWPGVETDSAGLAHDAETPLTREQLEWADIIFVMERVHRQKLQKRFKASLKGKRVICL---D-----IPD 85 (109) T ss_pred CCCCHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEE---E-----CCC T ss_conf 69516677512789754445566556787508886355634568999999999986687628848997---4-----885 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 402688999999999877778752 Q gi|254780430|r 113 SDSAVVQRIKEVLDNRVRPAVARD 136 (189) Q Consensus 113 ~~~~~~~~i~~~l~~~IrP~l~~d 136 (189) +..-..-..-++|+.++-|+|+.+ T Consensus 86 dy~yMq~eLi~lLkrkv~p~L~~~ 109 (109) T COG4551 86 DYEYMQPELIDLLKRKVGPYLRTY 109 (109) T ss_pred HHHHCCHHHHHHHHHHHHHHHCCC T ss_conf 476517999999998610142069 No 41 >TIGR01684 viral_ppase viral phosphatase; InterPro: IPR007827 This family contains uncharacterised baculoviral proteins.. Probab=26.96 E-value=53 Score=14.86 Aligned_cols=60 Identities=17% Similarity=0.267 Sum_probs=38.1 Q ss_pred CCCEECCCH--HHCCCCH--HHHHHH--HCCCCEEEEEC----------CCEEEEEE-C---C-C-CCHHHHHHHHHHHH Q ss_conf 886430696--6701346--889874--16881079980----------87799831-4---6-6-82123489999999 Q gi|254780430|r 25 EGAIHFSNA--KEAEISP--LASRIF--SIPGIASVYFG----------YDFITVGK-D---Q-Y-DWEHLRPPVLGMIM 82 (189) Q Consensus 25 ~g~~~f~~~--~~a~~sp--La~~Lf--~i~GV~~Vfi~----------~nFITVtK-~---~-~-eW~~i~p~I~~~I~ 82 (189) ..+..++.. +.--+|| ...-|- .+...+++.+- ++||-|.| + . . ||+....+|.+.|. 
T Consensus 237 ~~pf~l~~~~~~~LPKSPrvVL~~L~~~G~~~~KsitLVDDL~~NN~~YDyfV~v~rCPDDn~P~NDW~~~~~~Iv~~~~ 316 (323) T TIGR01684 237 KTPFYLNLTDGKRLPKSPRVVLWYLADKGVNYIKSITLVDDLADNNFAYDYFVNVSRCPDDNVPVNDWDSYHDQIVSNLV 316 (323) T ss_pred ECCEECCCCCCCCCCCCCCHHHHHHHHCCCEEEEEEEEEECCCCCCCCCCCEEEEEECCCCCCCCCCHHHHHHHHHHHHH T ss_conf 13335278742247998705453324669168644888604766785654438600077874288512577788888886 Q ss_pred HH Q ss_conf 99 Q gi|254780430|r 83 EH 84 (189) Q Consensus 83 ~~ 84 (189) +| T Consensus 317 ~Y 318 (323) T TIGR01684 317 EY 318 (323) T ss_pred HH T ss_conf 43 No 42 >PRK02269 ribose-phosphate pyrophosphokinase; Provisional Probab=23.64 E-value=62 Score=14.47 Aligned_cols=34 Identities=15% Similarity=0.154 Sum_probs=21.2 Q ss_pred EEEEECCCCCCCH--------HHHHHHHHHHHHHHHHHCCCC Q ss_conf 9999636646666--------689999999999999978970 Q gi|254780430|r 150 VFLSMRGACSGCP--------SASETLKYGVANILNHFVPEV 183 (189) Q Consensus 150 v~v~~~GaC~~Cp--------ss~~Tl~~gie~~l~~~vpev 183 (189) ....+.|.+.|+- ++..|+-+..+.+..+-...| T Consensus 207 ~~~~~~gdV~Gk~vIIVDDiIdTGgTl~~aa~~Lk~~GA~~V 248 (321) T PRK02269 207 EVMNIIGNVSGKKCILIDDMIDTAGTICHAADALAEAGATAV 248 (321) T ss_pred EECCCCCCCCCCEEEEECCHHHCHHHHHHHHHHHHHCCCCEE T ss_conf 420357740697699966243142669999999984899827 No 43 >TIGR02855 spore_yabG sporulation peptidase YabG; InterPro: IPR008764 Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. 
Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. The peptidases families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported. This is a group of peptidases belong to MEROPS peptidase family U57 (clan U-). The type example is the YabG protein of Bacillus subtilis. This is a protease involved in the synthesis and maturation of the spore coat proteins SpoIVA and YrbA of Bacillus subtilis .. Probab=23.37 E-value=18 Score=17.78 Aligned_cols=48 Identities=19% Similarity=0.441 Sum_probs=28.5 Q ss_pred HHHHHHHHHHHHHHHHHH----------HHCCCCEEEEEEECC-----------------EEEEEECCCCCCCHHH Q ss_conf 688999999999877778----------752897599964334-----------------7999963664666668 Q gi|254780430|r 116 AVVQRIKEVLDNRVRPAV----------ARDGGDIVFKGYRDG-----------------IVFLSMRGACSGCPSA 164 (189) Q Consensus 116 ~~~~~i~~~l~~~IrP~l----------~~dGG~i~~~~~~~g-----------------~v~v~~~GaC~~Cpss 164 (189) |--.+|.+||++ +||-+ .+-|+--.|..|++. 
-=.|-|.||||+|-=| T Consensus 149 emPe~v~~L~~~-~~PDIlViTGHDa~~K~~~~~~DL~aYRhSkyFv~~V~~aR~~~P~lD~LVIFAGACQShfE~ 223 (292) T TIGR02855 149 EMPEKVLDLIEE-VRPDILVITGHDAYSKNKGNYGDLNAYRHSKYFVETVREARKYVPSLDQLVIFAGACQSHFES 223 (292) T ss_pred CCCHHHHHHHHH-HCCCEEEEECCCCEEECCCCHHHHHHCCCCHHHHHHHHHHHHHCCCCCCHHHHHCCCHHHHHH T ss_conf 180889999973-099789994666302167871136423665689999999863178753234332121445799 No 44 >TIGR02916 PEP_his_kin putative PEP-CTERM system histidine kinase; InterPro: IPR014265 Proteins in this entry are putative periplasmic sensor signal transduction histidine kinases. They all contain a GAF domain that is present in phytochromes and cGMP-specific phosphodiesterases, and which has been experimentally proven to be involved in protein:protein interactions. They also contain a C-terminal histidine kinase domain, which is composed of a dimerisation sub-domain and an ATP/ADP-binding phosphotransfer, or catalytic, sub-domain. The proteins in this entry are found strictly within a subset of Gram-negative bacterial species with the proposed PEP-CTERM/exosortase system, analogous to the LPXTG/sortase system common in Gram-positive bacteria, where members of IPR014264 from INTERPRO and IPR014266 from INTERPRO also occur.. Probab=23.21 E-value=63 Score=14.42 Aligned_cols=108 Identities=19% Similarity=0.228 Sum_probs=56.3 Q ss_pred CHHHHHHHHC----CCCEEEEE-CCCE-EEEEECCCCCHHHHHHHHHHHHHHHHCCCCC----CCCCCCC-CCCCCCCCC Q ss_conf 4688987416----88107998-0877-9983146682123489999999997401432----3232112-222222223 Q gi|254780430|r 39 SPLASRIFSI----PGIASVYF-GYDF-ITVGKDQYDWEHLRPPVLGMIMEHFISGDPI----IHNGGLG-DMKLDDMGS 107 (189) Q Consensus 39 spLa~~Lf~i----~GV~~Vfi-~~nF-ITVtK~~~eW~~i~p~I~~~I~~~l~~g~~~----i~~~~~~-~~~~~~~~~ 107 (189) +.|++++-+- ..=-+|.+ +.+| ++|. ++=+.++.-|-+.+.+++++..+- +.-.... 
+...-+..+ T Consensus 558 ~~L~~~~~~~k~~q~p~~e~~~~~~~~rl~~~---a~~erl~rV~gHL~QNAlEAT~~~G~V~~~~~~~~~~~a~i~i~D 634 (696) T TIGR02916 558 VDLLRRVIASKRAQQPRPEVSIEDTDFRLSVR---ADRERLERVLGHLVQNALEATPEEGRVKIRVERECGGAAVIEIED 634 (696) T ss_pred HHHHHHHHHHHHHCCCCEEEEECCCCEEEEEE---ECHHHHHHHHHHHHHHHHHCCCCCCEEEEEEEECCCCEEEEEEEE T ss_conf 99999999998631894489971754178887---528889999999999888604999628999874188822799986 Q ss_pred CCCCCCCHHHHHHH-------H-------HHHHHHHHHHHHHCCCCEEEEEEEC-CEEE Q ss_conf 34454402688999-------9-------9999987777875289759996433-4799 Q gi|254780430|r 108 GDFIESDSAVVQRI-------K-------EVLDNRVRPAVARDGGDIVFKGYRD-GIVF 151 (189) Q Consensus 108 ~~~~~~~~~~~~~i-------~-------~~l~~~IrP~l~~dGG~i~~~~~~~-g~v~ 151 (189) +.+-=.++=+-++. | -+-| .|-+++++||+|++.+..+ |+.+ T Consensus 635 ~G~GM~~~FiR~rLF~PF~tTK~~aGMGIG~YE--~~~yv~e~GG~i~V~S~pG~Gt~f 691 (696) T TIGR02916 635 SGCGMSEAFIRERLFKPFDTTKGNAGMGIGVYE--CRQYVEELGGRIEVESTPGKGTIF 691 (696) T ss_pred CCCCCCHHHHHHHCCCCCCCCCCCCCCCCCHHH--HHHHHHHHCCCEEEEEECCCCEEE T ss_conf 578998589984078997544566787201899--999999838905888635885488 No 45 >PRK08115 ribonucleotide-diphosphate reductase subunit alpha; Validated Probab=22.55 E-value=58 Score=14.63 Aligned_cols=37 Identities=19% Similarity=0.409 Sum_probs=24.7 Q ss_pred HHHHCCCCEEEEEEECCEEEEEECCCCCCCHHHHHHHHHH Q ss_conf 7875289759996433479999636646666689999999 Q gi|254780430|r 132 AVARDGGDIVFKGYRDGIVFLSMRGACSGCPSASETLKYG 171 (189) Q Consensus 132 ~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss~~Tl~~g 171 (189) ..-.+.|++-=+.-++.+.-+ |+|.+|+.-..-||-| T Consensus 820 ~~g~~~~~~cpvc~~g~i~~i---ggc~tc~~cg~q~~cg 856 (857) T PRK08115 820 TIGSEIGNTCPICREGTVEEI---GGCNTCTNCGAQLKCG 856 (857) T ss_pred CCCCCCCCCCCCCCCCCEEEC---CCCCCCCCCHHHCCCC T ss_conf 535334775875788870125---6876676440020048 No 46 >PRK10614 multidrug efflux system subunit MdtC; Provisional Probab=21.76 E-value=67 Score=14.24 Aligned_cols=34 Identities=15% Similarity=0.232 Sum_probs=13.2 Q ss_pred HHHHHHHHHHHHHHHCCC--CEEEEEEECCEEEEEE Q ss_conf 
999999987777875289--7599964334799996 Q gi|254780430|r 121 IKEVLDNRVRPAVARDGG--DIVFKGYRDGIVFLSM 154 (189) Q Consensus 121 i~~~l~~~IrP~l~~dGG--~i~~~~~~~g~v~v~~ 154 (189) ..++.+.+++|.+.+--| .+.+.+..+..+.|.+ T Consensus 154 l~~~a~~~l~~~L~~i~GV~~V~~~G~~~~ei~I~~ 189 (1025) T PRK10614 154 LYDFASTQLAQTIAQIDGVGDVDVGGSSLPAVRVGL 189 (1025) T ss_pred HHHHHHHHHHHHHHCCCCCEEEEECCCCCEEEEEEE T ss_conf 999999999999857989269997278707999997 No 47 >KOG2679 consensus Probab=21.48 E-value=68 Score=14.21 Aligned_cols=105 Identities=17% Similarity=0.259 Sum_probs=49.9 Q ss_pred CCCCEEEEECCCEEEE-EECCCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCC---CCCCCCCCCCCCCCCCHHHHHHHHH Q ss_conf 6881079980877998-31466821234899999999974014323232112---2222222233445440268899999 Q gi|254780430|r 48 IPGIASVYFGYDFITV-GKDQYDWEHLRPPVLGMIMEHFISGDPIIHNGGLG---DMKLDDMGSGDFIESDSAVVQRIKE 123 (189) Q Consensus 48 i~GV~~Vfi~~nFITV-tK~~~eW~~i~p~I~~~I~~~l~~g~~~i~~~~~~---~~~~~~~~~~~~~~~~~~~~~~i~~ 123 (189) +-+|.--++..+++|. +|+-++|+...|.+. .++..+..=+--+.+..+. -..+....+..--....+ T Consensus 159 ~f~v~~~~f~~d~~~~~~~~~ydw~~v~PR~~-~~~~~l~~le~~L~~S~a~wkiVvGHh~i~S~~~HG~T~e------- 230 (336) T KOG2679 159 MFFVDTTPFMDDTFTLCTDDVYDWRGVLPRVK-YLRALLSWLEVALKASRAKWKIVVGHHPIKSAGHHGPTKE------- 230 (336) T ss_pred EECCCCCCCHHHHEECCCCCCCCCCCCCHHHH-HHHHHHHHHHHHHHHHHCCEEEEECCCCEEHHHCCCCHHH------- T ss_conf 31112441201101025523223446780889-9999999999999874325699942551001101598499------- Q ss_pred HHHHHHHHHHHHCCCCEEEEEE----------ECCEEEEEECCCCCCC Q ss_conf 9999877778752897599964----------3347999963664666 Q gi|254780430|r 124 VLDNRVRPAVARDGGDIVFKGY----------RDGIVFLSMRGACSGC 161 (189) Q Consensus 124 ~l~~~IrP~l~~dGG~i~~~~~----------~~g~v~v~~~GaC~~C 161 (189) |+++.+|.|++-|=|.-+-+. 
+.++-|+.=+|+-..- T Consensus 231 -L~~~LlPiL~~n~VdlY~nGHDHcLQhis~~e~~iqf~tSGagSkaw 277 (336) T KOG2679 231 -LEKQLLPILEANGVDLYINGHDHCLQHISSPESGIQFVTSGAGSKAW 277 (336) T ss_pred -HHHHHHHHHHHCCCCEEEECCHHHHHHCCCCCCCEEEEEECCCCCCC T ss_conf -99988889986397579756545665215777770589407755456 No 48 >pfam06954 Resistin Resistin. This family consists of several mammalian resistin proteins. Resistin is a 12.5-kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes. Probab=21.38 E-value=66 Score=14.27 Aligned_cols=44 Identities=32% Similarity=0.477 Sum_probs=29.8 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEECCEEEEEECCCCCCCHHH Q ss_conf 44026889999999998777787528975999643347999963664666668 Q gi|254780430|r 112 ESDSAVVQRIKEVLDNRVRPAVARDGGDIVFKGYRDGIVFLSMRGACSGCPSA 164 (189) Q Consensus 112 ~~~~~~~~~i~~~l~~~IrP~l~~dGG~i~~~~~~~g~v~v~~~GaC~~Cpss 164 (189) ..|+.+.++|++.+...+.|....-. +. -..|.=+|.=.+||+- T Consensus 26 ~~d~~v~~KIke~~~~l~~ps~~~k~--L~-------C~SV~s~G~lasCPaG 69 (109) T pfam06954 26 SLDSAVDKKIKEDLSSLEDPSAIRKT--LS-------CTSVKSRGRLASCPAG 69 (109) T ss_pred CHHHHHHHHHHHHHHCCCCCCCCCEE--EE-------EEEECCCCCEECCCCC T ss_conf 27999999999988433662001203--78-------8875147842138997 No 49 >pfam08777 RRM_3 RNA binding motif. 
This domain is found in protein La which functions as an RNA chaperone during RNA polymerase III transcription, and can also stimulate translation initiation. It contains a five-stranded beta sheet which forms an atypical RNA recognition motif. Probab=21.26 E-value=69 Score=14.18 Aligned_cols=29 Identities=7% Similarity=0.354 Sum_probs=20.8 Q ss_pred HHHHHHHCCCCEEEEEEEC--CEEEEEECCCC Q ss_conf 7777875289759996433--47999963664 Q gi|254780430|r 129 VRPAVARDGGDIVFKGYRD--GIVFLSMRGAC 158 (189) Q Consensus 129 IrP~l~~dGG~i~~~~~~~--g~v~v~~~GaC 158 (189) |+-.++.+| +|.|++|.. ..-||+|.-+. T Consensus 18 iK~~f~~~g-~V~yVD~~~Gd~eg~vRf~~~~ 48 (102) T pfam08777 18 iK~~f~~~g-~V~yVD~~~Gd~eg~vRf~~~~ 48 (102) T ss_pred HHHHHHHCC-CEEEEEECCCCCEEEEEECCCC T ss_conf 999998359-7568984278836999967921 No 50 >cd03009 TryX_like_TryX_NRX Tryparedoxin (TryX)-like family, TryX and nucleoredoxin (NRX) subfamily; TryX and NRX are thioredoxin (TRX)-like protein disulfide oxidoreductases that alter the redox state of target proteins via the reversible oxidation of an active center CXXC motif. TryX is involved in the regulation of oxidative stress in parasitic trypanosomatids by reducing TryX peroxidase, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. TryX derives reducing equivalents from reduced trypanothione, a polyamine peptide conjugate unique to trypanosomatids, which is regenerated by the NADPH-dependent flavoprotein trypanothione reductase. Vertebrate NRX is a 400-amino acid nuclear protein with one redox active TRX domain containing a CPPC active site motif followed by one redox inactive TRX-like domain. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos.
Plant NRX, longer than the Probab=20.82 E-value=70 Score=14.12 Aligned_cols=16 Identities=31% Similarity=0.420 Sum_probs=9.8 Q ss_pred HHHHHHHHHHHHCCCC Q ss_conf 9999877778752897 Q gi|254780430|r 124 VLDNRVRPAVARDGGD 139 (189) Q Consensus 124 ~l~~~IrP~l~~dGG~ 139 (189) ++.+..|+.+.++|+| T Consensus 112 IV~~~a~~~~~~~~~~ 127 (131) T cd03009 112 VVTTDARELVLEYGAD 127 (131) T ss_pred EEHHHHHHHHHHCCCC T ss_conf 9817767788641744 Done!