Query         gi|254780287|ref|YP_003064700.1| iron-sulfur cluster assembly accessory protein [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 109
No_of_seqs    119 out of 3041
Neff          7.2
Searched_HMMs 39220
Date          Tue May 24 15:10:07 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780287.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 TIGR00049 TIGR00049 iron-sulfu 100.0 1.5E-43       0  275.7  10.5  104   5-109    1-105 (105)
  2 PRK09502 iscA iron-sulfur clus 100.0 1.3E-38 3.2E-43  247.0  12.1  106   4-109    2-107 (107)
  3 COG0316 sufA Fe-S cluster asse 100.0 4.1E-38   1E-42  244.0  12.1  109   1-109    1-110 (110)
  4 PRK13623 iron-sulfur cluster i 100.0 5.2E-38 1.3E-42  243.4  12.6  109   1-109    9-118 (118)
  5 PRK09504 sufA iron-sulfur clus 100.0 4.3E-38 1.1E-42  243.8  12.0  107   3-109   16-122 (122)
  6 TIGR01997 sufA_proteo FeS asse 100.0 2.2E-35 5.5E-40  228.0   9.9  107   3-109    1-110 (110)
  7 KOG1120 consensus              100.0 1.3E-32 3.4E-37  211.8   7.6  108   2-109   27-134 (134)
  8 TIGR02011 IscA iron-sulfur clu 100.0 3.4E-29 8.6E-34  191.9   7.7  105   5-109    1-105 (105)
  9 PRK11190 putative DNA uptake p 100.0 1.5E-28 3.7E-33  188.1  10.7   98   4-101    1-100 (192)
 10 TIGR03341 YhgI_GntY IscR-regul 100.0   2E-28   5E-33  187.4  10.5   95    5-99     1-97 (190)
 11 pfam01521 Fe-S_biosyn Iron-sul  99.9 2.1E-27 5.4E-32  181.4  10.6   91    5-95     1-91 (91)
 12 KOG1119 consensus               99.9 9.8E-28 2.5E-32  183.3   7.5  104   4-109   93-197 (199)
 13 TIGR01911 HesB_rel_seleno HesB  98.9   1E-08 2.6E-13   72.2   8.1   91    2-93     1-93 (93)
 14 COG4841 Uncharacterized protei  98.0 1.3E-05 3.2E-10   54.2   5.7   87    4-92     2-94 (95)
 15 COG4918 Uncharacterized protei  96.8  0.0015 3.9E-08   42.0   4.1   79    4-84     2-85 (114)
 16 COG3564 Uncharacterized protei  95.0    0.16 4.1E-06   30.1   7.5   93    1-98    3-101 (116)
 17 pfam05610 DUF779 Protein of un  88.2     1.1 2.7E-05   25.4   4.8   79   15-98     3-87 (95)
 18 KOG4777 consensus               77.6     1.7 4.3E-05   24.2   2.2   38   59-96  114-151 (361)
 19 pfam03749 SfsA Sugar fermentat  63.1      11 0.00029   19.4   3.8   42    6-47  136-179 (215)
 20 PRK00347 sugar fermentation st  54.9      17 0.00043   18.3   3.6   42    6-47  151-194 (234)
 21 KOG3348 consensus               46.4      28 0.00071   17.1   3.7   38   10-48     3-40 (85)
 22 TIGR01369 CPSaseII_lrg carbamo  43.6      11 0.00028   19.5   1.1   47    3-60  251-300 (1089)
 23 PRK06031 phosphoribosyltransfe  41.7      32 0.00082   16.7   3.3   65  39-103  115-182 (233)
 24 TIGR02036 dsdC D-serine deamin  39.6      14 0.00036   18.8   1.2   73    4-96   31-105 (302)
 25 TIGR00954 3a01203 Peroxysomal   38.5      26 0.00066   17.3   2.4   50  51-106  511-570 (788)
 26 cd01234 PH_CADPS CADPS (Ca2+-d  36.9      24 0.00061   17.5   2.0   30   67-96    45-84 (117)
 27 COG4647 AcxC Acetone carboxyla  34.0      19 0.00048   18.1   1.1   31  78-108    45-80 (165)
 28 COG4393 Predicted membrane pro  33.7      32 0.00081   16.8   2.2   15   57-71  379-393 (405)
 29 KOG2775 consensus               33.4      44  0.0011   15.9   2.9   76    2-96  110-191 (397)
 30 TIGR01074 rep ATP-dependent DN  32.5      46  0.0012   15.8   3.0   69    4-80   53-124 (677)
 31 TIGR01931 cysJ sulfite reducta  31.3      48  0.0012   15.7   3.0   38   22-70  426-472 (628)
 32 KOG2862 consensus               30.1      50  0.0013   15.6   2.7   77    4-90   48-129 (385)
 33 pfam06905 FAIM1 Fas apoptotic   30.0      51  0.0013   15.5   3.2   38   57-94  105-147 (178)
 34 KOG0064 consensus               27.7      44  0.0011   15.9   2.1   38  60-106  485-522 (728)
 35 COG1489 SfsA DNA-binding prote  27.1      58  0.0015   15.2   3.3   72    6-77  148-226 (235)
 36 PRK13640 cbiO cobalt transport  25.6      61  0.0016   15.1   2.8   12   88-99    87-98 (283)
 37 PRK03824 hypA hydrogenase nick  25.3      26 0.00065   17.3   0.5   36    5-40     4-39 (135)
 38 pfam00262 Calreticulin Calreti  24.9      26 0.00065   17.3   0.5   16  90-105  274-289 (359)
 39 pfam03152 UFD1 Ubiquitin fusio  24.7      28 0.00071   17.1   0.6   14    5-18    28-41 (176)
 40 PRK00564 hypA hydrogenase nick  23.0      69  0.0018   14.8   3.2   36    5-40     4-39 (117)
 41 TIGR02588 TIGR02588 conserved   22.2      72  0.0018   14.7   2.7   25   74-98   83-110 (122)
 42 COG5134 Uncharacterized conser  22.1      63  0.0016   15.0   2.0   39   27-68   75-113 (272)
 43 KOG0060 consensus               22.1      71  0.0018   14.7   2.3   50  49-106  426-475 (659)
 44 TIGR01283 nifE nitrogenase MoF  21.8      28 0.00071   17.1   0.2   23   41-63   91-113 (470)
 45 COG1635 THI4 Ribulose 1,5-bisp  21.7      65  0.0017   14.9   2.0   62   30-94   33-107 (262)
 46 pfam09624 DUF2393 Protein of u  21.7      74  0.0019   14.6   3.4   77   20-97   26-106 (119)
 47 pfam01946 Thi4 Thi4 family. Th  21.2      66  0.0017   14.9   2.0   55   30-85    20-87 (229)
 48 TIGR02411 leuko_A4_hydro leuko  20.8      46  0.0012   15.8   1.1   73  28-108   21-104 (660)

No 1
>TIGR00049 TIGR00049 iron-sulfur cluster assembly accessory protein; InterPro: IPR016092 This family includes HesB which may be involved in nitrogen fixation; the hesB gene is expressed only under nitrogen fixation conditions. Other members of this family include various hypothetical proteins that also contain the NifU-like domain (IPR001075 from INTERPRO). NifU-like proteins are found in species as divergent as humans and H. influenzae suggesting that these proteins perform some basic cellular function.; GO: 0005198 structural molecule activity, 0051536 iron-sulfur cluster binding, 0016226 iron-sulfur cluster assembly.
Probab=100.00 E-value=1.5e-43 Score=275.66 Aligned_cols=104 Identities=45% Similarity=0.911 Sum_probs=101.9

Q ss_pred             EEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCC-CCCEECCCCEEEEECHHHHHHCCCCEEEEEE
Q ss_conf             331889999999999728988479999836887651465333210352-1130046878999846776540687999872
Q gi|254780287|r    5 IKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSE-DDIVFEKNGAQIFIDKISLAYLTNSEIDFVD   83 (109)
Q Consensus         5 i~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~-~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~~   83 (109)
                      |+|||+|++||+++++++++ .+|||+|++||||||+|.|+|++++++ +|.+++.+|++|+||++|++||.|++|||++
T Consensus         1 i~LTd~A~~~~~~l~~~~~~-~~LRv~V~~GGCSG~~Y~l~~~~~~~~~dD~v~~~~Gv~v~vD~~S~~~l~G~~~Dyv~   79 (105)
T TIGR00049         1 ITLTDSAAKRIKALLAGEGE-LGLRVGVKGGGCSGLQYGLEFDDEPNEKDDEVFEQDGVKVVVDPKSLPYLNGSEIDYVE   79 (105)
T ss_pred             CEECHHHHHHHHHHHHHCCC-CEEEEEEECCCCCCEEEEEECCCCCCCCCCEEEEECCEEEEECCCCHHHHCCCEEEEEE
T ss_conf             90188999999999871589-32789860777576023100114889998878863882898815414542587777873

Q ss_pred             CCCCCEEEEECCCCCCCCCCCCCCCC
Q ss_conf             78545238988888887788622469
Q gi|254780287|r   84 NLLSKSFQIRNPNATSNCGCGTSFSI  109 (109)
Q Consensus
84 ~~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) +++++||+|.||||+++||||+||++ T Consensus 80 ~l~~sgF~f~NPNA~~~CGCG~SF~~ 105 (105) T TIGR00049 80 ELEGSGFTFTNPNAKGTCGCGKSFSV 105 (105) T ss_pred CCCCCCCEEECCCCCCCCCCCCCCCC T ss_conf 56778337764864674787776889 No 2 >PRK09502 iscA iron-sulfur cluster assembly protein; Provisional Probab=100.00 E-value=1.3e-38 Score=246.96 Aligned_cols=106 Identities=35% Similarity=0.762 Sum_probs=103.0 Q ss_pred CEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEEE Q ss_conf 03318899999999997289884799998368876514653332103521130046878999846776540687999872 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFVD 83 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~~ 83 (109) .|+||++|+++|+++++++++..+|||+|++||||||+|.|++++++.++|.+++.+|++|+||++|++||+|++|||++ T Consensus 2 ~I~iT~~A~~~i~~~l~~~~~~~~lRl~Vk~gGCsG~~Y~l~~~~~~~~~D~~~~~~g~~i~Id~~s~~~l~G~~IDy~~ 81 (107) T PRK09502 2 SITLSDSAAARVNTFLANRGKGFGLRLGVRTSGCSGMAYVLEFVDEPTPEDIVFEDKGVKVVVDGKSLQFLDGTQLDFVK 81 (107) T ss_pred EEEECHHHHHHHHHHHHHCCCCEEEEEEEECCCCCCEEEEEEECCCCCCCCEEEEECCEEEEECHHHHHHHCCCEEEEEE T ss_conf 79999999999999997479970899999357758858375883557899799986998999987998775788997731 Q ss_pred CCCCCEEEEECCCCCCCCCCCCCCCC Q ss_conf 78545238988888887788622469 Q gi|254780287|r 84 NLLSKSFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 84 ~~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) ++++++|+|+||||+++||||+||+| T Consensus 82 ~~~~~gF~f~NPna~~~CGCG~SF~i 107 (107) T PRK09502 82 EGLNEGFKFTNPNVKDECGCGESFHV 107 (107) T ss_pred CCCCCCEEEECCCCCCCCCCCCCCCC T ss_conf 68846179989798850489877419 No 3 >COG0316 sufA Fe-S cluster assembly scaffold protein [Posttranslational modification, protein turnover, chaperones] Probab=100.00 E-value=4.1e-38 Score=244.00 Aligned_cols=109 Identities=45% Similarity=0.924 Sum_probs=105.3 Q ss_pred CCCCEEECHHHHHHHHHHHHHC-CCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEE Q 
ss_conf 9840331889999999999728-988479999836887651465333210352113004687899984677654068799 Q gi|254780287|r 1 MVPIIKITDAAATQIKTILESN-SDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEI 79 (109) Q Consensus 1 M~~mi~iT~~A~~~i~~l~~~~-~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~I 79 (109) |.+||+||++|++|++++++++ ++..+|||+|++|||+||+|.|+|+++++++|.+++.+|++|+||++|++||.|++| T Consensus 1 ~~~~itlT~~Aa~~v~~ll~~~~~~~~~lRv~V~~gGCsG~~Y~~~~~~~~~~~D~v~e~~g~~v~vD~~S~~~L~G~~I 80 (110) T COG0316 1 AAMMITLTDAAAARVKALLAKEGEENLGLRVGVKGGGCSGFQYGLEFDDEINEDDTVFEQDGVKVVVDPKSLPYLEGTEI 80 (110) T ss_pred CCCCEEECHHHHHHHHHHHHHCCCCCCEEEEEEECCCCCCCEEEEEECCCCCCCCEEEEECCEEEEECHHHHHHHCCCEE T ss_conf 98740449999999999997165888568999848977893767788667899987998599899987526656469888 Q ss_pred EEEECCCCCEEEEECCCCCCCCCCCCCCCC Q ss_conf 987278545238988888887788622469 Q gi|254780287|r 80 DFVDNLLSKSFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 80 Dy~~~~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) ||+++.+|++|+|+||||+++||||+||++ T Consensus 81 Dyv~~~~g~~F~~~NPNA~~~CgCg~Sf~v 110 (110) T COG0316 81 DYVEDLLGSGFTFKNPNAKSSCGCGESFSV 110 (110) T ss_pred EEEECCCCCCEEEECCCCCCCCCCCCCCCC T ss_conf 887747677348979998762028887779 No 4 >PRK13623 iron-sulfur cluster insertion protein ErpA; Provisional Probab=100.00 E-value=5.2e-38 Score=243.36 Aligned_cols=109 Identities=43% Similarity=0.837 Sum_probs=103.4 Q ss_pred CCCCEEECHHHHHHHHHHHHHCC-CCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEE Q ss_conf 98403318899999999997289-88479999836887651465333210352113004687899984677654068799 Q gi|254780287|r 1 MVPIIKITDAAATQIKTILESNS-DKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEI 79 (109) Q Consensus 1 M~~mi~iT~~A~~~i~~l~~~~~-~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~I 79 (109) |-+.|+||++|++||++++++++ +..+|||+|++|||+||+|.|.+++++.++|.+++.+|++|+||+.|++||+|++| T Consensus 9 ~p~~ItiT~~A~~~ik~ll~~~~~~~~~lRv~V~~gGCsG~~Y~~~~~~~~~~dD~v~~~~g~~v~Id~~Sl~~L~Gs~I 88 (118) T PRK13623 9 
VPLPLVFTDAAAAKVKELIEEEGNPDLKLRVYITGGGCSGFQYGFTFDEQVNEDDTTIEKQGVTLVVDPMSLQYLVGAEV 88 (118) T ss_pred CCCCCEECHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCEEEEEEECCCCCCCCCEEEECCEEEEECHHHHHHCCCCEE T ss_conf 99874999999999999998689996179999957878982856798755688886798378699999889845289899 Q ss_pred EEEECCCCCEEEEECCCCCCCCCCCCCCCC Q ss_conf 987278545238988888887788622469 Q gi|254780287|r 80 DFVDNLLSKSFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 80 Dy~~~~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) ||++++++++|+|+||||+++||||+||+| T Consensus 89 Dy~e~~~gs~F~f~NPna~~~CGCG~SFsv 118 (118) T PRK13623 89 DYTEGLEGSRFVIKNPNAKTTCGCGSSFSI 118 (118) T ss_pred EEEECCCCCCEEEECCCCCCCCCCCCCCCC T ss_conf 887669867169989898854699877349 No 5 >PRK09504 sufA iron-sulfur cluster assembly scaffold protein; Provisional Probab=100.00 E-value=4.3e-38 Score=243.85 Aligned_cols=107 Identities=34% Similarity=0.756 Sum_probs=103.0 Q ss_pred CCEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEE Q ss_conf 40331889999999999728988479999836887651465333210352113004687899984677654068799987 Q gi|254780287|r 3 PIIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFV 82 (109) Q Consensus 3 ~mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~ 82 (109) .-|+||++|++||+++++++++..+|||+|++||||||+|.|++++++.++|.+++.+|++|+||++|++||+|++|||+ T Consensus 16 ~~ItiT~~A~~~i~~l~~~~~~~~~lRl~Vk~gGCsG~~Y~~~~~~~~~~~D~v~~~~g~~v~vd~~s~~~l~Gt~IDy~ 95 (122) T PRK09504 16 QGLTLTPAAAAHIRELMAKQPGMKGVRLGVKQTGCAGFGYVLDSVSEPDKDDLVFEHDGAKLFVPLQAMPFIDGTEVDYV 95 (122) T ss_pred CCEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEEECCCCCCCCEEEEECCEEEEECHHHHCCCCCCEEEEE T ss_conf 44899999999999999749997279999938775888987064378999999998399799987478373279899886 Q ss_pred ECCCCCEEEEECCCCCCCCCCCCCCCC Q ss_conf 278545238988888887788622469 Q gi|254780287|r 83 DNLLSKSFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 83 ~~~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) +++++++|+|.||||+++||||+||+| T Consensus 96 
~~~~~~~F~f~NPna~~~CGCG~SF~V 122 (122) T PRK09504 96 REGLNQIFKFHNPKAQNECGCGESFGV 122 (122) T ss_pred ECCCCCCEEEECCCCCCCCCCCCCCCC T ss_conf 768867369989898851489877249 No 6 >TIGR01997 sufA_proteo FeS assembly scaffold SufA; InterPro: IPR011298 Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] . FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems. The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins , . It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transfering them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly . The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). 
The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA . SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA , acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoproteins targets. In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins . Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen . This entry represents the SufA protein of the SUF system of iron-sulphur cluster biosynthesis. SufA acts as a scaffold in which Fe and S are assembled into FeS clusters . This system performs FeS biosynthesis even during oxidative stress and tends to be absent in obligate anaerobic and microaerophilic bacteria.. 
Probab=100.00 E-value=2.2e-35 Score=228.05 Aligned_cols=107 Identities=36% Similarity=0.793 Sum_probs=103.4 Q ss_pred CCEEECHHHHHHHHHHHHHC-CCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEEC-CCCEEEEECHHHHHHCCCCEEE Q ss_conf 40331889999999999728-988479999836887651465333210352113004-6878999846776540687999 Q gi|254780287|r 3 PIIKITDAAATQIKTILESN-SDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFE-KNGAQIFIDKISLAYLTNSEID 80 (109) Q Consensus 3 ~mi~iT~~A~~~i~~l~~~~-~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~-~~gi~v~vd~~s~~~L~g~~ID 80 (109) ..|++|++|+.||++|++++ ++..++||+||.+|||||.|.++++.+|+.+|.+++ .+|.+|+|+|+.+.||+|+.+| T Consensus 1 ~~~~lT~AAa~~i~~l~~~~g~~~~g~Rl~vKktGCaG~~Y~~~~V~ep~~~D~L~et~~Gakv~v~p~a~~~i~GT~vD 80 (110) T TIGR01997 1 AVITLTDAAATHIRELVKKRGTEAVGIRLSVKKTGCAGMEYVLELVSEPKEDDDLIETADGAKVFVAPEAVLFILGTQVD 80 (110) T ss_pred CCCCCCHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCCEEEEEEECCCCCCCEEEECCCCCEEEECHHHHHHCCCCEEE T ss_conf 96520157889999998503898204887665576377302300110788887032103886588664440020486745 Q ss_pred EEECCCCC-EEEEECCCCCCCCCCCCCCCC Q ss_conf 87278545-238988888887788622469 Q gi|254780287|r 81 FVDNLLSK-SFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 81 y~~~~~g~-gF~i~nPna~~~CgCG~SF~~ 109 (109) |+...+++ +|+|+|||+++.|||||||.+ T Consensus 81 f~~~~L~~y~f~FnNPn~~~aCGCGES~~l 110 (110) T TIGR01997 81 FVKTTLRQYGFKFNNPNATSACGCGESFEL 110 (110) T ss_pred EEECCCCEEEEEECCCCCCCCCCCCCCCCC T ss_conf 787354122227428554678876421459 No 7 >KOG1120 consensus Probab=99.97 E-value=1.3e-32 Score=211.77 Aligned_cols=108 Identities=37% Similarity=0.719 Sum_probs=104.6 Q ss_pred CCCEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEE Q ss_conf 84033188999999999972898847999983688765146533321035211300468789998467765406879998 Q gi|254780287|r 2 VPIIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDF 81 (109) Q Consensus 2 ~~mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy 81 (109) 
++.|++||.|.+||+++++++++...|||+|+.+||+|.+|.|++.+++...|.+++.+|++|+||++++..|.|+++|| T Consensus 27 k~~ltLTp~Av~~ik~ll~~~~e~~~lrigVk~rGCnGlsYtleY~~~kgkfDE~VeqdGv~I~ie~KA~l~liGteMDy 106 (134) T KOG1120 27 KAALTLTPSAVNHIKQLLSDKPEDVCLRIGVKQRGCNGLSYTLEYTKTKGKFDEVVEQDGVRIFIEPKALLTLIGTEMDY 106 (134) T ss_pred CCCCCCCHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCCEEEEEEECCCCCCCCEEEECCCEEEECCCCEEEECCCEEHH T ss_conf 56120598999999999974876761688775177576355522001689875514544708998140104661211014 Q ss_pred EECCCCCEEEEECCCCCCCCCCCCCCCC Q ss_conf 7278545238988888887788622469 Q gi|254780287|r 82 VDNLLSKSFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 82 ~~~~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) +++.++|+|+|.|||++++||||+||++ T Consensus 107 vddkL~Sefvf~npna~gtcGcgeSf~~ 134 (134) T KOG1120 107 VDDKLSSEFVFSNPNAKGTCGCGESFSV 134 (134) T ss_pred HHHHHCCCEEEECCCCCCCCCCCCCCCC T ss_conf 4565317357617886653246633459 No 8 >TIGR02011 IscA iron-sulfur cluster assembly protein IscA; InterPro: IPR011302 Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] . FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems. The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins , . It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). 
IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transfering them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly . The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA . SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA , acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoproteins targets. In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins . Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen . 
This entry represents the IscA component of the ISC system for iron-sulphur cluster assembly. IscA is believed to act as a scaffold upon which 2Fe-2S clusters are assembled and subsequently transferred to ferredoxin , , . This clade is limited to the proteobacteria.; GO: 0005506 iron ion binding, 0005515 protein binding, 0051536 iron-sulfur cluster binding, 0016226 iron-sulfur cluster assembly. Probab=99.96 E-value=3.4e-29 Score=191.89 Aligned_cols=105 Identities=38% Similarity=0.805 Sum_probs=103.1 Q ss_pred EEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEEEC Q ss_conf 33188999999999972898847999983688765146533321035211300468789998467765406879998727 Q gi|254780287|r 5 IKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFVDN 84 (109) Q Consensus 5 i~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~~~ 84 (109) ||+|++|+++++..++..+++.+||++|+.+||+|+.|.|+|.++.+++|.+++..|++|+||++|+.||+|+.+||+.+ T Consensus 1 itl~~~a~~~~~~~la~rGkG~G~rlG~~tsGCsGmay~lefvd~~~~~d~~~~~~G~~~~~d~ksl~yl~G~~~d~~ke 80 (105) T TIGR02011 1 ITLTEAAAERVNSFLANRGKGLGLRLGVKTSGCSGMAYVLEFVDEADDDDLVFEDKGVKIVIDAKSLVYLDGTQLDFVKE 80 (105) T ss_pred CCCCHHHHHHHHHHHHHCCCCEEEEEEEEECCCCCEEEEEEECCCCCCCCEEECCCCCEEEECCCEEEEECCCEEEEEEC T ss_conf 94367789999998861488204674100036552000121102688885366168717898174134322622210100 Q ss_pred CCCCEEEEECCCCCCCCCCCCCCCC Q ss_conf 8545238988888887788622469 Q gi|254780287|r 85 LLSKSFQIRNPNATSNCGCGTSFSI 109 (109) Q Consensus 85 ~~g~gF~i~nPna~~~CgCG~SF~~ 109 (109) +++.||.|.|||++..||||+||++ T Consensus 81 Gl~eGf~f~nPn~k~~CGCGesf~v 105 (105) T TIGR02011 81 GLNEGFKFENPNVKDECGCGESFHV 105 (105) T ss_pred HHHCCCEECCCCCCCCCCCCCCCCC T ss_conf 1113731048876666677632349 No 9 >PRK11190 putative DNA uptake protein; Provisional Probab=99.96 E-value=1.5e-28 Score=188.15 Aligned_cols=98 Identities=23% Similarity=0.440 Sum_probs=92.8 Q ss_pred 
CEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHH--HHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEE Q ss_conf 0331889999999999728988479999836887651465333--21035211300468789998467765406879998 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDL--ESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDF 81 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~--~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy 81 (109) ||+||++|+.||+++++++.++.+|||+|+++||+||+|.|.| +++.+++|.+++.+|++|+||+.|++||+|++||| T Consensus 1 MItITd~A~~~l~~LL~~q~~g~~LRV~V~~~Gcsgaey~~~f~~~de~~~dD~~~e~~gf~V~VD~~S~~yL~ga~IDy 80 (192) T PRK11190 1 MITISDAAQAHFAKLLANQEEGTQIRVFVINPGTPNAECGVSYCPPDAVEATDTELKFNGFSAYVDELSAPFLEDAEIDF 80 (192) T ss_pred CEEECHHHHHHHHHHHHCCCCCCEEEEEEECCCCCCCEEEEEECCCCCCCCCEEEEEECCEEEEECCCHHHHHCCCEEEE T ss_conf 95868999999999984388896399999459989956888974677788852899869999998302667867889976 Q ss_pred EECCCCCEEEEECCCCCCCC Q ss_conf 72785452389888888877 Q gi|254780287|r 82 VDNLLSKSFQIRNPNATSNC 101 (109) Q Consensus 82 ~~~~~g~gF~i~nPna~~~C 101 (109) +++.+|++|+|+||||+... T Consensus 81 ~~d~~G~~ftIkNPNAK~~~ 100 (192) T PRK11190 81 VTDQLGSQLTLKAPNAKMRK 100 (192) T ss_pred ECCCCCCEEEEECCCCCCCC T ss_conf 52677877999899765555 No 10 >TIGR03341 YhgI_GntY IscR-regulated protein YhgI. IscR (TIGR02010) is an iron-sulfur cluster-binding transcriptional regulator (see Genome Property GenProp0138). Members of this protein family include YhgI, whose expression is under control of IscR, and show sequence similarity to IscA, a known protein of iron-sulfur cluster biosynthesis. These two lines of evidence strongly suggest a role as an iron-sulfur cluster biosynthesis protein. An older study designated this protein GntY and suggested a role for it and for the product of an adjacent gene, based on complementation studies, in gluconate utilization. 
Probab=99.96 E-value=2e-28 Score=187.39 Aligned_cols=95 Identities=24% Similarity=0.417 Sum_probs=90.8 Q ss_pred EEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHH--HCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEE Q ss_conf 3318899999999997289884799998368876514653332--10352113004687899984677654068799987 Q gi|254780287|r 5 IKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLE--SKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFV 82 (109) Q Consensus 5 i~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~--~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~ 82 (109) |+||++|++||++++++++++.+|||+|+++||+||+|.|.|+ ++.+++|.+++.+|++|+||+.|++||+|++|||+ T Consensus 1 ItITe~A~~~l~~Ll~~q~~g~~lRv~V~~~Gcsg~qy~l~f~~~de~~~dD~~~e~~g~~V~VD~~S~~yL~Ga~IDy~ 80 (190) T TIGR03341 1 ITITEAAQAYLAKLLAKQNEGTGIRVFVVNPGTPYAECCVSYCPPDEVEPSDIKLEFNGFSAYVDALSAPFLEDAVIDFV 80 (190) T ss_pred CEECHHHHHHHHHHHHCCCCCCEEEEEEECCCCCCCEEEEEECCCCCCCCCCEEEEECCEEEEECCCHHHHHCCCEEEEE T ss_conf 97789999999999840888956899995599998588878888777899878998499899984254678678899763 Q ss_pred ECCCCCEEEEECCCCCC Q ss_conf 27854523898888888 Q gi|254780287|r 83 DNLLSKSFQIRNPNATS 99 (109) Q Consensus 83 ~~~~g~gF~i~nPna~~ 99 (109) ++.+|++|+|+||||+. T Consensus 81 ~d~~g~gfti~NPNAK~ 97 (190) T TIGR03341 81 TDRMGGQLTLKAPNAKM 97 (190) T ss_pred CCCCCCEEEEECCCCCC T ss_conf 36768728997999765 No 11 >pfam01521 Fe-S_biosyn Iron-sulphur cluster biosynthesis. This family is involved in iron-sulphur cluster biosynthesis. Its members include proteins that are involved in nitrogen fixation such as the HesB and HesB-like proteins. 
Probab=99.95 E-value=2.1e-27 Score=181.40 Aligned_cols=91 Identities=46% Similarity=0.813 Sum_probs=88.1 Q ss_pred EEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEEEC Q ss_conf 33188999999999972898847999983688765146533321035211300468789998467765406879998727 Q gi|254780287|r 5 IKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFVDN 84 (109) Q Consensus 5 i~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~~~ 84 (109) |+||++|+++|+++++.++...+|||+|++|||+||+|.|+++++++++|.+++.+|++|+||+.|++||.|++|||+++ T Consensus 1 ItiT~~A~~~i~~~l~~~~~~~~lRl~v~~gGC~G~~Y~l~~~~~~~~~D~~~~~~g~~v~Id~~s~~~l~g~~iDy~~~ 80 (91) T pfam01521 1 ITLTDAAAKWIKKLLDLEGGENGLRIGVRYGGCSGFSYGLTFEDEAGEGDEVFEQDGVTVVVDEKSLPYLEGTEIDFVEE 80 (91) T ss_pred CEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCCEEEEEECCCCCCCCEEEEECCCEEEEEHHHHHHCCCCEEEEEEC T ss_conf 99999999999999974899946999996699798188789853577785999989929999169981559989998626 Q ss_pred CCCCEEEEECC Q ss_conf 85452389888 Q gi|254780287|r 85 LLSKSFQIRNP 95 (109) Q Consensus 85 ~~g~gF~i~nP 95 (109) +++++|+|+|| T Consensus 81 ~~~~~F~f~NP 91 (91) T pfam01521 81 LLGSGFTFSNP 91 (91) T ss_pred CCCCEEEEECC T ss_conf 88773999595 No 12 >KOG1119 consensus Probab=99.94 E-value=9.8e-28 Score=183.33 Aligned_cols=104 Identities=43% Similarity=0.888 Sum_probs=97.6 Q ss_pred CEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEEE Q ss_conf 03318899999999997289884799998368876514653332103521130046878999846776540687999872 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFVD 83 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~~ 83 (109) .+.++++|..+++++.+.. 
...|||.|++|||+||+|.|.++...+++|.+++.+|.+|+||..|+.+|.|+||||.+ T Consensus 93 ~~~lsds~~krl~EI~~~~--pe~LRl~VegGGCsGFQYkf~LD~~in~dD~vf~e~~arVVvD~~SL~~~kGatvdy~~ 170 (199) T KOG1119 93 NLHLSDSCSKRLKEIYENS--PEFLRLTVEGGGCSGFQYKFRLDNKINNDDRVFVENGARVVVDNVSLNLLKGATVDYTN 170 (199) T ss_pred EEEEHHHHHHHHHHHHHCC--CCEEEEEEECCCCCCEEEEEEECCCCCCCCEEEEECCCEEEEECCCHHHCCCCEEEHHH T ss_conf 4884267789999998098--30389998548703358888844777876657861880899853542112686333178 Q ss_pred CCCCCEEEEE-CCCCCCCCCCCCCCCC Q ss_conf 7854523898-8888887788622469 Q gi|254780287|r 84 NLLSKSFQIR-NPNATSNCGCGTSFSI 109 (109) Q Consensus 84 ~~~g~gF~i~-nPna~~~CgCG~SF~~ 109 (109) ++++++|++- ||.|+..||||+||.| T Consensus 171 ELIrSsF~ivnNP~A~~gCsCgSSF~i 197 (199) T KOG1119 171 ELIRSSFRIVNNPSAKQGCSCGSSFDI 197 (199) T ss_pred HHHHHHHEEECCCCCCCCCCCCCCCCC T ss_conf 886411066428300258778765444 No 13 >TIGR01911 HesB_rel_seleno HesB-related (seleno)protein; InterPro: IPR010965 This entry represents a family of small proteins related to HesB and its close homologs, which are likely to be involved in iron-sulphur cluster assembly. Several members are selenoproteins, with a TGA codon and Sec residue that aligns to the conserved Cys of the HesB domain.. Probab=98.88 E-value=1e-08 Score=72.18 Aligned_cols=91 Identities=23% Similarity=0.365 Sum_probs=80.1 Q ss_pred CCCEEECHHHHHHHHHHHHHCCC-CCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHH-HCCCCEE Q ss_conf 84033188999999999972898-847999983688765146533321035211300468789998467765-4068799 Q gi|254780287|r 2 VPIIKITDAAATQIKTILESNSD-KKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLA-YLTNSEI 79 (109) Q Consensus 2 ~~mi~iT~~A~~~i~~l~~~~~~-~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~-~L~g~~I 79 (109) |..+++++.|.+.+++.+++++. ..-|||...|-||+|+.|.+..+ +++++|.++...+++++||+.-.. 
||....| T Consensus 1 MkkV~~SD~Ay~efldfLK~n~~d~~V~rI~f~G~g~~GP~F~~~i~-e~nenD~~~~i~d~~f~ID~~lid~flgef~I 79 (93) T TIGR01911 1 MKKVVMSDEAYKEFLDFLKKNDVDKDVVRIYFEGFGPSGPVFGIAIA-EKNENDEVVVIKDLTFLIDKSLIDQFLGEFSI 79 (93) T ss_pred CCEEEECHHHHHHHHHHHHCCCCCCCEEEEECCCCCCCCCEEEEEEC-CCCCCCEEEEECCEEEEECCHHHHHCCCEEEE T ss_conf 95788517788889876200689875699951462589975578855-87999747885442798650245330880699 Q ss_pred EEEECCCCCEEEEE Q ss_conf 98727854523898 Q gi|254780287|r 80 DFVDNLLSKSFQIR 93 (109) Q Consensus 80 Dy~~~~~g~gF~i~ 93 (109) ...++..+.||+++ T Consensus 80 ~~~ee~fg~gl~le 93 (93) T TIGR01911 80 SLREENFGKGLKLE 93 (93) T ss_pred EEECCCCCCCEEEC T ss_conf 86113373520309 No 14 >COG4841 Uncharacterized protein conserved in bacteria [Function unknown] Probab=98.01 E-value=1.3e-05 Score=54.16 Aligned_cols=87 Identities=29% Similarity=0.422 Sum_probs=66.5 Q ss_pred CEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCC----CEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCC--C Q ss_conf 03318899999999997289884799998368876----51465333210352113004687899984677654068--7 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSDKKALRITIEGGGCS----GFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTN--S 77 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCs----G~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g--~ 77 (109) .|+||+.|.+-+++-+.- .++..+|+.|+=|||+ ||+..+.-+ .|++--..-+.+|++++|..+.+=|.++ . T Consensus 2 ni~vtd~A~~wfk~E~~l-~~g~~vrffvRyGG~~~~~~GFS~gv~~e-~PkE~g~~q~~Dgltffiee~DlWYF~d~d~ 79 (95) T COG4841 2 NIEVTDQALKWFKEELDL-EEGNKVRFFVRYGGCSSLQQGFSLGVAKE-VPKEIGYKQEYDGLTFFIEEKDLWYFDDHDL 79 (95) T ss_pred CEEECHHHHHHHHHHCCC-CCCCEEEEEEEECCCCCCCCCCCEEEECC-CCHHHCHHEEECCEEEEEECCCEEEECCCCE T ss_conf 458707988888874387-78987899999767112368723134212-8323152204457089994572688728867 Q ss_pred EEEEEECCCCCEEEE Q ss_conf 999872785452389 Q gi|254780287|r 78 EIDFVDNLLSKSFQI 92 (109) Q Consensus 78 ~IDy~~~~~g~gF~i 92 (109) +|||..+...-.|.. 
T Consensus 80 ~v~y~~~~Dei~fs~ 94 (95) T COG4841 80 KVDYSPDTDEISFSY 94 (95) T ss_pred EEECCCCCCCCEEEC T ss_conf 996167877504631 No 15 >COG4918 Uncharacterized protein conserved in bacteria [Function unknown] Probab=96.84 E-value=0.0015 Score=41.98 Aligned_cols=79 Identities=24% Similarity=0.379 Sum_probs=57.7 Q ss_pred CEEECHHHHHHHHHHHHHCCC-CCEEEEEEECCCCCC---EEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHC-CCCE Q ss_conf 033188999999999972898-847999983688765---14653332103521130046878999846776540-6879 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSD-KKALRITIEGGGCSG---FSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYL-TNSE 78 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~-~~~lRi~v~~gGCsG---~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L-~g~~ 78 (109) -|++|++|+.+|+.....+.. ...||....+-+|+| +.|.+ +.+....|..++.++.++++-.-..-|+ +.++ T Consensus 2 ~Itftd~a~~~l~~a~d~nl~~~~hl~ydtEgc~Ca~SGi~t~rl--vae~tg~d~~idsn~gPiyik~~~~~Ff~D~mt 79 (114) T COG4918 2 KITFTDKAADKLKAAGDVNLVFDDHLLYDTEGCACAGSGISTYRL--VAEETGFDASIDSNFGPIYIKDYGSYFFQDEMT 79 (114) T ss_pred EEEECHHHHHHHHHHHCCCCCCCCEEEEECCCCCCCCCCCCEEEE--EEECCCCCCCCCCCCCCEEEEECCEEEECCEEE T ss_conf 578528999999986436867541579805444102688534998--874147430000478748997130067311323 Q ss_pred EEEEEC Q ss_conf 998727 Q gi|254780287|r 79 IDFVDN 84 (109) Q Consensus 79 IDy~~~ 84 (109) |||.+. T Consensus 80 idyN~~ 85 (114) T COG4918 80 IDYNPS 85 (114) T ss_pred EECCCC T ss_conf 420871 No 16 >COG3564 Uncharacterized protein conserved in bacteria [Function unknown] Probab=94.99 E-value=0.16 Score=30.14 Aligned_cols=93 Identities=19% Similarity=0.268 Sum_probs=64.5 Q ss_pred CCCCEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHH---CCCCCCCEE-CCCCEEEEECHHHHHHCCC Q ss_conf 984033188999999999972898847999983688765146533321---035211300-4687899984677654068 Q gi|254780287|r 1 MVPIIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLES---KQSEDDIVF-EKNGAQIFIDKISLAYLTN 76 (109) Q Consensus 1 M~~mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~---~~~~~D~v~-~~~gi~v~vd~~s~~~L~g 76 (109) |.+-++.|++|..-|.++.+++++ + ++-++|||.-=+--|.+.. 
...++|+.+ +.+|++|||......+-+. T Consensus 3 ~~~~V~aT~aAl~Li~~l~~~hgp---v-mFHQSGGCCDGSsPMCYP~~~fivGd~DvlLG~i~gvPvyIs~~QyeaWKH 78 (116) T COG3564 3 MPARVLATPAALDLIAELQAEHGP---V-MFHQSGGCCDGSSPMCYPRADFIVGDNDVLLGEIDGVPVYISGPQYEAWKH 78 (116) T ss_pred CCCCEECCHHHHHHHHHHHHHCCC---E-EEECCCCCCCCCCCCCCCCCCEEECCCCEEEEEECCEEEEECCCHHHHHHC T ss_conf 876224388899999999875498---7-883268845899874014544566477358865378778864707752100 Q ss_pred C--EEEEEECCCCCEEEEECCCCC Q ss_conf 7--999872785452389888888 Q gi|254780287|r 77 S--EIDFVDNLLSKSFQIRNPNAT 98 (109) Q Consensus 77 ~--~IDy~~~~~g~gF~i~nPna~ 98 (109) + .||.+. +.|+.|-.+|-..+ T Consensus 79 TqLIIDVVp-GRGGmFSLdng~E~ 101 (116) T COG3564 79 TQLIIDVVP-GRGGMFSLDNGREK 101 (116) T ss_pred CEEEEEEEC-CCCCEEECCCCCCE T ss_conf 178999865-98864572578513 No 17 >pfam05610 DUF779 Protein of unknown function (DUF779). This family consists of several bacterial proteins of unknown function. Probab=88.18 E-value=1.1 Score=25.35 Aligned_cols=79 Identities=22% Similarity=0.397 Sum_probs=53.8 Q ss_pred HHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHH---CCCCCCCEE-CCCCEEEEECHHHHHHCCCC--EEEEEECCCCC Q ss_conf 9999972898847999983688765146533321---035211300-46878999846776540687--99987278545 Q gi|254780287|r 15 IKTILESNSDKKALRITIEGGGCSGFSYKFDLES---KQSEDDIVF-EKNGAQIFIDKISLAYLTNS--EIDFVDNLLSK 88 (109) Q Consensus 15 i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~---~~~~~D~v~-~~~gi~v~vd~~s~~~L~g~--~IDy~~~~~g~ 88 (109) |++|.+++++ | +.-++|||..=+--|.|.. 
.....|+.+ +.+|+++|+++....|-+.+ +||-++ +.|+ T Consensus 3 i~~L~~~HG~---L-mFHQSGGCCDGSaPMC~p~gef~vg~~DV~LG~i~g~pfym~~~QfeyWkhT~l~iDvv~-GrG~ 77 (95) T pfam05610 3 IRELKAKHGP---L-MFHQSGGCCDGSAPMCYPKGEFIVGDSDVLLGEIGGVPFYISKSQFEYWKHTQLIIDVVP-GRGG 77 (95) T ss_pred HHHHHHHHCC---E-EEEECCCCCCCCCCCEECCCCCCCCCCCEEEEEECCEEEEECCHHHHHHHCCEEEEEEEC-CCCC T ss_conf 8999876188---8-884079756898754034665052887678888669888983389876423679999862-6787 Q ss_pred EEEEECCCCC Q ss_conf 2389888888 Q gi|254780287|r 89 SFQIRNPNAT 98 (109) Q Consensus 89 gF~i~nPna~ 98 (109) +|-.++|... T Consensus 78 ~FSLE~p~G~ 87 (95) T pfam05610 78 MFSLEGPEGK 87 (95) T ss_pred EEEECCCCCC T ss_conf 5771488885 No 18 >KOG4777 consensus Probab=77.63 E-value=1.7 Score=24.19 Aligned_cols=38 Identities=16% Similarity=0.302 Sum_probs=34.9 Q ss_pred CCCEEEEECHHHHHHCCCCEEEEEECCCCCEEEEECCC Q ss_conf 68789998467765406879998727854523898888 Q gi|254780287|r 59 KNGAQIFIDKISLAYLTNSEIDFVDNLLSKSFQIRNPN 96 (109) Q Consensus 59 ~~gi~v~vd~~s~~~L~g~~IDy~~~~~g~gF~i~nPn 96 (109) .++++++|+.-.-..|++...-..+..++.|+.|.||| T Consensus 114 e~~VPLvvP~VNpehld~ik~~~~~~k~~~G~iI~nsN 151 (361) T KOG4777 114 EDGVPLVVPEVNPEHLDGIKVGLDTGKMGKGAIIANSN 151 (361) T ss_pred CCCCCEEECCCCHHHHHHHEECCCCCCCCCCEEEECCC T ss_conf 79974573345877842530222258889952896698 No 19 >pfam03749 SfsA Sugar fermentation stimulation protein. This family contains Sugar fermentation stimulation proteins. Which is probably a regulatory factor involved in maltose metabolism. SfsA has been shown to bind DNA and it contains a helix-turn-helix motif that probably binds DNA at its C-terminus. 
Probab=63.09 E-value=11 Score=19.38 Aligned_cols=42 Identities=7% Similarity=0.207 Sum_probs=24.3 Q ss_pred EECHHHHHHHHHHHHH--CCCCCEEEEEEECCCCCCEEEEEHHH Q ss_conf 3188999999999972--89884799998368876514653332 Q gi|254780287|r 6 KITDAAATQIKTILES--NSDKKALRITIEGGGCSGFSYKFDLE 47 (109) Q Consensus 6 ~iT~~A~~~i~~l~~~--~~~~~~lRi~v~~gGCsG~~y~l~~~ 47 (109) ..|+.+.+|+++|.+- ++....|-+-|.-.+|.-|.-..+.| T Consensus 136 a~T~RG~KHl~eL~~~~~~G~ra~vlF~vqr~d~~~f~p~~~~D 179 (215) T pfam03749 136 APTARGQKHLRELIALAKEGYRALVLFVVLREDVRCFSPNYEID 179 (215) T ss_pred CCHHHHHHHHHHHHHHHHCCCCEEEEEEEECCCCCEEEECHHCC T ss_conf 62265899999999999879949999999748997892865619 No 20 >PRK00347 sugar fermentation stimulation protein A; Reviewed Probab=54.90 E-value=17 Score=18.33 Aligned_cols=42 Identities=10% Similarity=0.311 Sum_probs=25.2 Q ss_pred EECHHHHHHHHHHHHH--CCCCCEEEEEEECCCCCCEEEEEHHH Q ss_conf 3188999999999972--89884799998368876514653332 Q gi|254780287|r 6 KITDAAATQIKTILES--NSDKKALRITIEGGGCSGFSYKFDLE 47 (109) Q Consensus 6 ~iT~~A~~~i~~l~~~--~~~~~~lRi~v~~gGCsG~~y~l~~~ 47 (109) ..|+.+.+|+++|.+- ++....+-+-|.-.+|.-|+-..+.| T Consensus 151 a~T~RG~kHl~eL~~~~~~G~ra~vlFvvqr~d~~~f~p~~~~D 194 (234) T PRK00347 151 AVTERGQKHLRELIELAAEGHRAVLLFLVQRSDITRFAPADEID 194 (234) T ss_pred CHHHHHHHHHHHHHHHHHCCCCEEEEEEEEECCCCEEEECHHCC T ss_conf 66788999999999997589948999999908997880864439 No 21 >KOG3348 consensus Probab=46.41 E-value=28 Score=17.10 Aligned_cols=38 Identities=16% Similarity=0.405 Sum_probs=26.5 Q ss_pred HHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHH Q ss_conf 999999999972898847999983688765146533321 Q gi|254780287|r 10 AAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLES 48 (109) Q Consensus 10 ~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~ 48 (109) .+.+++++++.+.-+...+++-=.++|| |..|...+.. 
T Consensus 3 ~~e~~l~~~L~~~l~p~~v~V~D~SgGC-G~~F~v~IvS 40 (85) T KOG3348 3 VTEERLEELLTEALEPEHVEVQDVSGGC-GSMFDVVIVS 40 (85) T ss_pred CHHHHHHHHHHHHCCCEEEEEEECCCCC-CCEEEEEEEC T ss_conf 0689999999834484699999757986-6407899973 No 22 >TIGR01369 CPSaseII_lrg carbamoyl-phosphate synthase, large subunit; InterPro: IPR006275 Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine (6.3.5.5 from EC) or ammonia (6.3.4.16 from EC), and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (6.4.1.2 from EC) (ACC), propionyl-CoA carboxylase (6.4.1.3 from EC) (PCCase), pyruvate carboxylase (6.4.1.1 from EC) (PC) and urea carboxylase (6.3.4.6 from EC). Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain.
CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain. Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains. This entry represents glutamine-dependent CPSase (6.3.5.5 from EC) from prokaryotes and eukaryotes (CPSase II). ; GO: 0004086 carbamoyl-phosphate synthase activity, 0006807 nitrogen compound metabolic process. Probab=43.56 E-value=11 Score=19.48 Aligned_cols=47 Identities=28% Similarity=0.502 Sum_probs=30.1 Q ss_pred CCEEECHHHHHHHHHHH-H--HCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCC Q ss_conf 40331889999999999-7--2898847999983688765146533321035211300468 Q gi|254780287|r 3 PIIKITDAAATQIKTIL-E--SNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKN 60 (109) Q Consensus 3 ~mi~iT~~A~~~i~~l~-~--~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~ 60 (109) |-=||||.=-++|++.. 
+ .+ |+|.+|||| .||.|+++ ...-.|+|.| T Consensus 251 PSQTLtD~EYQ~LR~~sikIIR~-------lGi~GgGCN-vQFAL~P~---s~~Y~vIEvN 300 (1089) T TIGR01369 251 PSQTLTDKEYQMLRDASIKIIRE-------LGIVGGGCN-VQFALDPD---SGRYYVIEVN 300 (1089) T ss_pred CCCCCCCHHHHHHHHHHHHHHHH-------CCCEECCCC-EEEEECCC---CCCEEEEEEC T ss_conf 76368807899999999999987-------391216742-13215078---9706999867 No 23 >PRK06031 phosphoribosyltransferase; Provisional Probab=41.68 E-value=32 Score=16.72 Aligned_cols=65 Identities=23% Similarity=0.370 Sum_probs=43.5 Q ss_pred CEEEEEHHHHCCCCCCCEEC--CCCEEEEECHHHHHHCCCCEEEEEECCCCCEEEEE-CCCCCCCCCC Q ss_conf 51465333210352113004--68789998467765406879998727854523898-8888887788 Q gi|254780287|r 39 GFSYKFDLESKQSEDDIVFE--KNGAQIFIDKISLAYLTNSEIDFVDNLLSKSFQIR-NPNATSNCGC 103 (109) Q Consensus 39 G~~y~l~~~~~~~~~D~v~~--~~gi~v~vd~~s~~~L~g~~IDy~~~~~g~gF~i~-nPna~~~CgC 103 (109) |++-+|.+.++....---++ ..+-++|+|+..+++|.|-.+-.+++-..+|-.+. .-+--..||+ T Consensus 115 g~SrKfwy~d~Ls~~vsSITtp~~~krlylDp~~lpLl~GrRV~lVDDVISSG~Si~a~l~LL~~~G~ 182 (233) T PRK06031 115 GTSRKFWYDDELSVPLSSITTPDQGKRLYIDPRMLPLLRGRRVALIDDVISSGASIVAALRLLATCGI 182 (233) T ss_pred CCCCCCCHHHHHCCCEECCCCCCCCCEEEECHHHHHHHCCCEEEEEECHHCCCHHHHHHHHHHHHCCC T ss_conf 64775331344355100035888773156774441243287799982122155659999999997599 No 24 >TIGR02036 dsdC D-serine deaminase transcriptional activator; InterPro: IPR011781 This family, part of the LysR family of transcriptional regulators, activates transcription of the gene for D-serine deaminase, dsdA. Trusted members of this family so far are found adjacent to dsdA and only in Gammaproteobacteria, including Escherichia coli, Vibrio cholerae, and Colwellia psychrerythraea.. 
Probab=39.57 E-value=14 Score=18.80 Aligned_cols=73 Identities=22% Similarity=0.358 Sum_probs=57.6 Q ss_pred CEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEEC-HHHHHHCCCCEEEEE Q ss_conf 0331889999999999728988479999836887651465333210352113004687899984-677654068799987 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFID-KISLAYLTNSEIDFV 82 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd-~~s~~~L~g~~IDy~ 82 (109) -+.+||+|.+|=...+++ ..||+|++++= + -++++.+|-+||+. +.|+.-|+-=.+|-+ T Consensus 31 ELSltPSAiSHRIN~LEe---ElGI~LF~RsH-------------R----KveLT~EG~RiY~alkssl~~LNQEIldiK 90 (302) T TIGR02036 31 ELSLTPSAISHRINKLEE---ELGIKLFKRSH-------------R----KVELTKEGKRIYIALKSSLDSLNQEILDIK 90 (302) T ss_pred HHCCCCCHHHHHHHHHHH---HHHHHHCCCCC-------------C----EEEECCCCCCHHHHHHHHHHHHCCEEEHHC T ss_conf 531673457775443356---52144203356-------------5----256446774046677876543141210002 Q ss_pred ECCCCCEEEEE-CCC Q ss_conf 27854523898-888 Q gi|254780287|r 83 DNLLSKSFQIR-NPN 96 (109) Q Consensus 83 ~~~~g~gF~i~-nPn 96 (109) ..+..+.+|+. .|- T Consensus 91 n~E~SG~LT~YSRPS 105 (302) T TIGR02036 91 NQELSGELTVYSRPS 105 (302) T ss_pred CCCCCCCEEECCCCC T ss_conf 675121020022553 No 25 >TIGR00954 3a01203 Peroxysomal long chain fatty acyl transporter; InterPro: IPR005283 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. 
The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif.
It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). The members of this family are integral membrane proteins and they are involved in the import of activated long-chain fatty acids from the cytosol to the peroxisomal matrix. ; GO: 0005524 ATP binding, 0006810 transport, 0016021 integral to membrane. Probab=38.55 E-value=26 Score=17.26 Aligned_cols=50 Identities=30% Similarity=0.549 Sum_probs=31.6 Q ss_pred CCCCCEECCCC--EEEEECHHH--------HHHCCCCEEEEEECCCCCEEEEECCCCCCCCCCCCC Q ss_conf 52113004687--899984677--------654068799987278545238988888887788622 Q gi|254780287|r 51 SEDDIVFEKNG--AQIFIDKIS--------LAYLTNSEIDFVDNLLSKSFQIRNPNATSNCGCGTS 106 (109) Q Consensus 51 ~~~D~v~~~~g--i~v~vd~~s--------~~~L~g~~IDy~~~~~g~gF~i~nPna~~~CgCG~S 106 (109) ..+|++++.=. ++++|||.. +.--++..+-|.+ +-|..+.|.-|| |||.| T Consensus 511 s~G~vLi~~L~F~v~lhidPitsksnsiqdlskandiklPflq-GsG~~lLi~GPN-----GCGKS 570 (788) T TIGR00954 511 SNGDVLIESLSFEVPLHIDPITSKSNSIQDLSKANDIKLPFLQ-GSGNHLLICGPN-----GCGKS 570 (788) T ss_pred CCCCEEECCCCEEEEEEECCCCCCHHHHHHHHHHCCCCCCCCC-CCCCEEEEECCC-----CCCHH T ss_conf 4784770145536766533423222356655543024355121-588766876889-----98647 No 26 >cd01234 PH_CADPS CADPS (Ca2+-dependent activator protein) Pleckstrin homology (PH) domain. CADPS is a calcium-dependent activator involved in secretion.
It contains a central PH domain that binds to phosphoinositide 4,5 bisphosphate containing liposomes. However, membrane association may also be mediated by binding to phosphatidylserine via general electrostatic interactions. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. Probab=36.93 E-value=24 Score=17.47 Aligned_cols=30 Identities=10% Similarity=0.192 Sum_probs=18.9 Q ss_pred CHHHHHHCCCCEEEEEEC----------CCCCEEEEECCC Q ss_conf 467765406879998727----------854523898888 Q gi|254780287|r 67 DKISLAYLTNSEIDFVDN----------LLSKSFQIRNPN 96 (109) Q Consensus 67 d~~s~~~L~g~~IDy~~~----------~~g~gF~i~nPn 96 (109) +|.-++-|+|-||||.+. 
..|++|-|+.-+ T Consensus 45 eP~E~mqLdGyTVDY~e~~~~~~~~~~~l~ggr~FF~aVk 84 (117) T cd01234 45 EPTEFIQLDGYTVDYMPESDPDPNSELSLQGGRHFFNAVK 84 (117) T ss_pred CHHHHHEECCEEEECCCCCCCCCCCCCCCCCCCEEEEEEC T ss_conf 8067521255477335788866544434546421001111 No 27 >COG4647 AcxC Acetone carboxylase, gamma subunit [Secondary metabolites biosynthesis, transport, and catabolism] Probab=33.97 E-value=19 Score=18.10 Aligned_cols=31 Identities=26% Similarity=0.585 Sum_probs=17.4 Q ss_pred EEEEEEC---CCCC-EEEEECCCCC-CCCCCCCCCC Q ss_conf 9998727---8545-2389888888-8778862246 Q gi|254780287|r 78 EIDFVDN---LLSK-SFQIRNPNAT-SNCGCGTSFS 108 (109) Q Consensus 78 ~IDy~~~---~~g~-gF~i~nPna~-~~CgCG~SF~ 108 (109) .+||.+- ..|- -|+...|..+ ..|.||-||- T Consensus 45 rv~~~dpillpvg~hlfi~qs~~~rv~rcecghsf~ 80 (165) T COG4647 45 RVDWDDPILLPVGDHLFICQSAQKRVIRCECGHSFG 80 (165) T ss_pred HCCCCCCEEEECCCEEEEEECCCCCEEEEECCCCCC T ss_conf 236678746513880798666655578873466546 No 28 >COG4393 Predicted membrane protein [Function unknown] Probab=33.65 E-value=32 Score=16.76 Aligned_cols=15 Identities=47% Similarity=0.720 Sum_probs=6.9 Q ss_pred ECCCCEEEEECHHHH Q ss_conf 046878999846776 Q gi|254780287|r 57 FEKNGAQIFIDKISL 71 (109) Q Consensus 57 ~~~~gi~v~vd~~s~ 71 (109) .|.+|-+++||++|+ T Consensus 379 ye~ddnki~Idkasl 393 (405) T COG4393 379 YEIDDNKIIIDKASL 393 (405) T ss_pred EEECCCEEEEEHHHH T ss_conf 785396899877776 No 29 >KOG2775 consensus Probab=33.45 E-value=44 Score=15.95 Aligned_cols=76 Identities=18% Similarity=0.370 Sum_probs=39.9 Q ss_pred CCCEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCC-EECCCCEEEEECHHHHHHCCCCEEE Q ss_conf 840331889999999999728988479999836887651465333210352113-0046878999846776540687999 Q gi|254780287|r 2 VPIIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDI-VFEKNGAQIFIDKISLAYLTNSEID 80 (109) Q Consensus 2 ~~mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~-v~~~~gi~v~vd~~s~~~L~g~~ID 80 (109) |.||.|.+.-..-.+.++.+++-..||-+ +.|||=-.-.-.+ .|+.+|. 
|+..+++ +.|| T Consensus 110 mtm~ei~e~iEnttR~li~e~gl~aGi~F---PtG~SlN~cAAHy--TpNaGd~tVLqydDV--------------~KiD 170 (397) T KOG2775 110 MTMIEICETIENTTRKLILENGLNAGIGF---PTGCSLNHCAAHY--TPNAGDKTVLKYDDV--------------MKID 170 (397) T ss_pred CCHHHHHHHHHHHHHHHHHHCCCCCCCCC---CCCCCCCCHHHHC--CCCCCCCEEEEECCE--------------EEEE T ss_conf 42999999998889999874551027667---7766621034306--899998346420656--------------8874 Q ss_pred EEE----CCCCCEEEEE-CCC Q ss_conf 872----7854523898-888 Q gi|254780287|r 81 FVD----NLLSKSFQIR-NPN 96 (109) Q Consensus 81 y~~----~~~g~gF~i~-nPn 96 (109) |-. ....+.|++. ||+ T Consensus 171 fGthi~GrIiDsAFTv~F~p~ 191 (397) T KOG2775 171 FGTHIDGRIIDSAFTVAFNPK 191 (397) T ss_pred CCCCCCCEEEEEEEEEEECCC T ss_conf 021106727534568860755 No 30 >TIGR01074 rep ATP-dependent DNA helicase Rep; InterPro: IPR005752 RepA hexameric DNA helicase contain ATP-binding domains similar to those seen in monomeric helicases but which are arranged in a ring. There is compelling evidence to suggest that a single ssDNA molecule passes through the centre of the hexameric ring. Activity of the enzyme is based upon two separate but coupled activities, ssDNA translocation and duplex destabilisation, and is driven by energy derived from the continuous ATP-binding and hydrolysis events that take place in the active-site cleft. The resulting conformational changes that accompany these events underpin the coupling process and allow the helicase to translocate along the DNA, destabilizing the duplex and separating the two strands in an active process ; GO: 0004003 ATP-dependent DNA helicase activity, 0006268 DNA unwinding during replication, 0005737 cytoplasm. 
Probab=32.52 E-value=46 Score=15.79 Aligned_cols=69 Identities=19% Similarity=0.199 Sum_probs=40.4 Q ss_pred CEEECHHHHHHHHHHHHHC---CCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHHHHCCCCEEE Q ss_conf 0331889999999999728---9884799998368876514653332103521130046878999846776540687999 Q gi|254780287|r 4 IIKITDAAATQIKTILESN---SDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISLAYLTNSEID 80 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~---~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~ID 80 (109) =||+|.+|+.++++=+++. ++..||-|+-=. +.+|++-. .+-+.+=-..|+.||=+..+...|...+=+ T Consensus 53 AvTFTNKAA~EMkERVA~~L~~~~~~GL~isTFH------~LGL~Ii~--~E~~~lG~K~nFSlFD~~D~~all~eL~~~ 124 (677) T TIGR01074 53 AVTFTNKAAREMKERVAKTLGKGQAKGLTISTFH------TLGLKIIR--REHNALGLKSNFSLFDETDQLALLKELLEG 124 (677) T ss_pred EEECCHHHHHHHHHHHHHHCCCCCCCCCEEECCH------HHHHHHHH--HHHHHCCCCCCCCCCCHHHHHHHHHHHHHH T ss_conf 9735237779999999852265455854475205------73389999--999864889996420678899999987523 No 31 >TIGR01931 cysJ sulfite reductase [NADPH] flavoprotein, alpha-component; InterPro: IPR010199 This entry describes an NADPH-dependent sulphite reductase flavoprotein subunit. Most members of the proteins of this entry are found in Cys biosynthesis gene clusters. The closest homologues are designated as subunits nitrate reductase.; GO: 0004783 sulfite reductase (NADPH) activity, 0000103 sulfate assimilation, 0019344 cysteine biosynthetic process. Probab=31.34 E-value=48 Score=15.68 Aligned_cols=38 Identities=26% Similarity=0.550 Sum_probs=23.4 Q ss_pred CCCCCEEEEEE---------ECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHH Q ss_conf 89884799998---------36887651465333210352113004687899984677 Q gi|254780287|r 22 NSDKKALRITI---------EGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKIS 70 (109) Q Consensus 22 ~~~~~~lRi~v---------~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s 70 (109) .++..-|-|++ .-||||||- ++...++|.| +|||.+.. 
T Consensus 426 v~dEVHlTVg~V~y~~~G~~r~GgaS~fL-----A~rl~~gd~v------~vyve~N~ 472 (628) T TIGR01931 426 VDDEVHLTVGVVRYEAEGRARLGGASGFL-----AERLEEGDTV------KVYVERND 472 (628) T ss_pred CCCEEEEEEEEEEEEECCEEEECCCHHHH-----HHHCCCCCEE------EEEEEECC T ss_conf 68806887668898205647744105778-----8650889767------78875177 No 32 >KOG2862 consensus Probab=30.10 E-value=50 Score=15.59 Aligned_cols=77 Identities=18% Similarity=0.286 Sum_probs=45.1 Q ss_pred CEEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECHHHH---HH--CCCCE Q ss_conf 03318899999999997289884799998368876514653332103521130046878999846776---54--06879 Q gi|254780287|r 4 IIKITDAAATQIKTILESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDKISL---AY--LTNSE 78 (109) Q Consensus 4 mi~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~~s~---~~--L~g~~ 78 (109) +++|++.-.+-++.+.+..+. .-+.+.+.|.+|+.-.+ .+-.+++|.++- +.+---+. +. =-|+. T Consensus 48 ~~qIm~~v~egikyVFkT~n~---~tf~isgsGh~g~E~al--~N~lePgd~vLv-----~~~G~wg~ra~D~~~r~ga~ 117 (385) T KOG2862 48 FVQIMDEVLEGIKYVFKTANA---QTFVISGSGHSGWEAAL--VNLLEPGDNVLV-----VSTGTWGQRAADCARRYGAE 117 (385) T ss_pred HHHHHHHHHHHHHHHHCCCCC---CEEEEECCCCCHHHHHH--HHHCCCCCEEEE-----EEECHHHHHHHHHHHHHCCE T ss_conf 999999999878998404789---62898369841689988--752578974999-----97233777889999860865 Q ss_pred EEEEECCCCCEE Q ss_conf 998727854523 Q gi|254780287|r 79 IDFVDNLLSKSF 90 (109) Q Consensus 79 IDy~~~~~g~gF 90 (109) +|+++...|++. T Consensus 118 V~~v~~~~G~~~ 129 (385) T KOG2862 118 VDVVEADIGQAV 129 (385) T ss_pred EEEEECCCCCCC T ss_conf 558715855675 No 33 >pfam06905 FAIM1 Fas apoptotic inhibitory molecule (FAIM1). This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. 
FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology. Probab=30.03 E-value=51 Score=15.54 Aligned_cols=38 Identities=13% Similarity=0.458 Sum_probs=26.9 Q ss_pred ECCCCEEEEECHHHHH-HCCCCEEE----EEECCCCCEEEEEC Q ss_conf 0468789998467765-40687999----87278545238988 Q gi|254780287|r 57 FEKNGAQIFIDKISLA-YLTNSEID----FVDNLLSKSFQIRN 94 (109) Q Consensus 57 ~~~~gi~v~vd~~s~~-~L~g~~ID----y~~~~~g~gF~i~n 94 (109) +.-.+.+|+.++..+. +++|-.++ |++++....|.+.+ T Consensus 105 ldg~~~RVVLeK~TmdvwvNg~~~et~geFvd~GtethF~~g~ 147 (178) T pfam06905 105 LDGQDYRVVLEKDTMDVWVNGEKMETAGEFVDDGTETHFSLGD 147 (178) T ss_pred ECCCEEEEEEECCCEEEEECCEEEEEEEEEECCCEEEEEEECC T ss_conf 2895589998147379999889977776795398089999568 No 34 >KOG0064 consensus Probab=27.67 E-value=44 Score=15.91 Aligned_cols=38 Identities=18% Similarity=0.357 Sum_probs=25.1 Q ss_pred CCEEEEECHHHHHHCCCCEEEEEECCCCCEEEEECCCCCCCCCCCCC Q ss_conf 87899984677654068799987278545238988888887788622 Q gi|254780287|r 60 NGAQIFIDKISLAYLTNSEIDFVDNLLSKSFQIRNPNATSNCGCGTS 106 (109) Q Consensus 60 ~gi~v~vd~~s~~~L~g~~IDy~~~~~g~gF~i~nPna~~~CgCG~S 106 (109) +.++|+.+.-.. .+...+++. 
..|..+.|.-|| |||.| T Consensus 485 enipvitP~~~v-vv~~Ltf~i---~~G~hLLItGPN-----GCGKS 522 (728) T KOG0064 485 ENIPVITPAGDV-LVPKLTFQI---EPGMHLLITGPN-----GCGKS 522 (728) T ss_pred ECCCEECCCCCE-EECCEEEEE---CCCCEEEEECCC-----CCCHH T ss_conf 147555567546-522215874---588269987899-----76588 No 35 >COG1489 SfsA DNA-binding protein, stimulates sugar fermentation [General function prediction only] Probab=27.15 E-value=58 Score=15.24 Aligned_cols=72 Identities=11% Similarity=0.071 Sum_probs=36.9 Q ss_pred EECHHHHHHHHHHHHHC--CCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCE-----EEEECHHHHHHCCCC Q ss_conf 31889999999999728--9884799998368876514653332103521130046878-----999846776540687 Q gi|254780287|r 6 KITDAAATQIKTILESN--SDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGA-----QIFIDKISLAYLTNS 77 (109) Q Consensus 6 ~iT~~A~~~i~~l~~~~--~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi-----~v~vd~~s~~~L~g~ 77 (109) ..|...++|+++|.+-. +-...+-+-|..++|.-|.-....|....+--......|+ ++-++++.+.+..-. T Consensus 148 ApT~RG~KHLreL~~~~~~G~ra~vlf~v~r~d~~~F~P~~e~Dp~fa~~l~~A~~~GVev~~~~~~~~~~~i~~~~~l 226 (235) T COG1489 148 APTARGQKHLRELERLAKEGYRAVVLFLVLRSDITRFSPNREIDPKFAELLREAIKAGVEVLAYRFEVDGEGIRLVGPL 226 (235) T ss_pred CCCHHHHHHHHHHHHHHHCCCCEEEEEEEECCCCCEECCCCCCCHHHHHHHHHHHHCCCEEEEEEEEECCCCEEEECCE T ss_conf 8523367799999999970774699999934897588733014988999999999759789999999844205861315 No 36 >PRK13640 cbiO cobalt transporter ATP-binding subunit; Provisional Probab=25.60 E-value=61 Score=15.07 Aligned_cols=12 Identities=17% Similarity=0.313 Sum_probs=8.0 Q ss_pred CEEEEECCCCCC Q ss_conf 523898888888 Q gi|254780287|r 88 KSFQIRNPNATS 99 (109) Q Consensus 88 ~gF~i~nPna~~ 99 (109) =|++|.||...- T Consensus 87 vg~VfQ~P~~q~ 98 (283) T PRK13640 87 VGIVFQNPDNQF 98 (283) T ss_pred EEEEEECCCCCC T ss_conf 189986887618 No 37 >PRK03824 hypA hydrogenase nickel incorporation protein; Provisional Probab=25.31 E-value=26 Score=17.30 Aligned_cols=36 Identities=8% Similarity=0.117 Sum_probs=18.3 Q ss_pred EEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCE Q ss_conf 
331889999999999728988479999836887651 Q gi|254780287|r 5 IKITDAAATQIKTILESNSDKKALRITIEGGGCSGF 40 (109) Q Consensus 5 i~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~ 40 (109) +.|+.+-.+-+.+..++++-..-.+|.+.=|-=||. T Consensus 4 ~Sia~sil~~v~~~a~~~g~~~V~~V~l~IG~ls~V 39 (135) T PRK03824 4 WALAEAIVRTVLDYAQKEGASKVKALKVVLGELQDV 39 (135) T ss_pred HHHHHHHHHHHHHHHHHCCCCEEEEEEEEECCCCCC T ss_conf 999999999999999981997599999998884643 No 38 >pfam00262 Calreticulin Calreticulin family. Probab=24.89 E-value=26 Score=17.31 Aligned_cols=16 Identities=44% Similarity=0.978 Sum_probs=8.8 Q ss_pred EEEECCCCCCCCCCCC Q ss_conf 3898888888778862 Q gi|254780287|r 90 FQIRNPNATSNCGCGT 105 (109) Q Consensus 90 F~i~nPna~~~CgCG~ 105 (109) =.|.||.-...|||++ T Consensus 274 P~I~NPkC~~~~g~~~ 289 (359) T pfam00262 274 PMIPNPKCEKACGCGK 289 (359) T ss_pred CCCCCCHHHCCCCCCC T ss_conf 6478951103677654 No 39 >pfam03152 UFD1 Ubiquitin fusion degradation protein UFD1. Post-translational ubiquitin-protein conjugates are recognized for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterized by microdeletions on 22q11.2. 
Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145 Probab=24.71 E-value=28 Score=17.10 Aligned_cols=14 Identities=14% Similarity=0.320 Sum_probs=7.8 Q ss_pred EEECHHHHHHHHHH Q ss_conf 33188999999999 Q gi|254780287|r 5 IKITDAAATQIKTI 18 (109) Q Consensus 5 i~iT~~A~~~i~~l 18 (109) |-+-++|..++.++ T Consensus 28 IiLP~S~L~~L~~~ 41 (176) T pfam03152 28 IILPPSALDRLSRL 41 (176) T ss_pred EECCHHHHHHHHHC T ss_conf 98699999999972 No 40 >PRK00564 hypA hydrogenase nickel incorporation protein; Provisional Probab=22.97 E-value=69 Score=14.77 Aligned_cols=36 Identities=14% Similarity=0.154 Sum_probs=20.7 Q ss_pred EEECHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCE Q ss_conf 331889999999999728988479999836887651 Q gi|254780287|r 5 IKITDAAATQIKTILESNSDKKALRITIEGGGCSGF 40 (109) Q Consensus 5 i~iT~~A~~~i~~l~~~~~~~~~lRi~v~~gGCsG~ 40 (109) +.|+.+-.+.+.+..++++-..-.+|.++-|-=||. T Consensus 4 lsi~~~iv~~v~~~a~~~~~~~V~~v~l~iG~ls~V 39 (117) T PRK00564 4 YSVVSSLIALCEEHAKKNQAHKIERVVVGIGERSAM 39 (117) T ss_pred HHHHHHHHHHHHHHHHHCCCCEEEEEEEEECCCCCC T ss_conf 999999999999999983997799999998885521 No 41 >TIGR02588 TIGR02588 conserved hypothetical protein TIGR02588; InterPro: IPR013417 The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416 from INTERPRO, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus. 
Probab=22.25 E-value=72 Score=14.69 Aligned_cols=25 Identities=16% Similarity=0.320 Sum_probs=18.1 Q ss_pred CCCCEEEEEEC--CCCCEEEEE-CCCCC Q ss_conf 06879998727--854523898-88888 Q gi|254780287|r 74 LTNSEIDFVDN--LLSKSFQIR-NPNAT 98 (109) Q Consensus 74 L~g~~IDy~~~--~~g~gF~i~-nPna~ 98 (109) --+.+|||.-+ ...+-|.|+ ||..- T Consensus 83 ~a~~~iDy~a~~~~~~G~liF~~dPr~G 110 (122) T TIGR02588 83 EAEVTIDYLASGSKEKGTLIFRSDPRNG 110 (122) T ss_pred CCCCEEEECCCCCCCCCEEEEEECCCCC T ss_conf 4673798715997413068871289996 No 42 >COG5134 Uncharacterized conserved protein [Function unknown] Probab=22.14 E-value=63 Score=15.02 Aligned_cols=39 Identities=21% Similarity=0.346 Sum_probs=29.5 Q ss_pred EEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEEEEECH Q ss_conf 799998368876514653332103521130046878999846 Q gi|254780287|r 27 ALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQIFIDK 68 (109) Q Consensus 27 ~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~v~vd~ 68 (109) --|+.+.-.||+ -.+++.+.|..++.|.|++|++-+++. T Consensus 75 iYRf~I~C~~C~---n~i~~RTDPkN~~YV~EsGg~R~i~pq 113 (272) T COG5134 75 IYRFSIKCHLCS---NPIDVRTDPKNTEYVVESGGRRKIEPQ 113 (272) T ss_pred EEEEEEECCCCC---CCEEEECCCCCCEEEEECCCEEECCCC T ss_conf 799999746777---730564079985078851754305766 No 43 >KOG0060 consensus Probab=22.08 E-value=71 Score=14.71 Aligned_cols=50 Identities=18% Similarity=0.304 Sum_probs=34.4 Q ss_pred CCCCCCCEECCCCEEEEECHHHHHHCCCCEEEEEECCCCCEEEEECCCCCCCCCCCCC Q ss_conf 0352113004687899984677654068799987278545238988888887788622 Q gi|254780287|r 49 KQSEDDIVFEKNGAQIFIDKISLAYLTNSEIDFVDNLLSKSFQIRNPNATSNCGCGTS 106 (109) Q Consensus 49 ~~~~~D~v~~~~gi~v~vd~~s~~~L~g~~IDy~~~~~g~gF~i~nPna~~~CgCG~S 106 (109) ...++|..++.+.+++..+...-..++..+.+- ..|+.+.|.-|| |||.| T Consensus 426 ~~~~~Dn~i~~e~v~l~tPt~g~~lie~Ls~~V---~~g~~LLItG~s-----G~GKt 475 (659) T KOG0060 426 KAEPADNAIEFEEVSLSTPTNGDLLIENLSLEV---PSGQNLLITGPS-----GCGKT 475 (659) T ss_pred CCCCCCCEEEEEEEEECCCCCCCEEEEEEEEEE---CCCCEEEEECCC-----CCCHH T ss_conf 236666458963101108999865632100570---589759997899-----87636 No 44 
>TIGR01283 nifE nitrogenase MoFe cofactor biosynthesis protein NifE; InterPro: IPR005973 The enzyme responsible for nitrogen fixation, the nitrogenase, shows a high degree of conservation of structure, function, and amino acid sequence across wide phylogenetic ranges. All known Mo-nitrogenases consist of two components, component I (also called dinitrogenase, or Fe-Mo protein), an alpha2beta2 tetramer encoded by the nifD and nifK genes, and component II (dinitrogenase reductase, or Fe protein) a homodimer encoded by the nifH gene. Two operons, nifDK and nifEN, encode a tetrameric (alpha2beta2 and N2E2) enzymatic complex. Nitrogenase contains two unusual rare metal clusters; one of them is the iron molybdenum cofactor (FeMo-co), which is considered to be the site of dinitrogen reduction and whose biosynthesis requires the products of nifNE and of some other nif genes. It has been proposed that NifNE might serve as a scaffold upon which FeMo-co is built and then inserted into component I.; GO: 0005515 protein binding, 0006461 protein complex assembly, 0009399 nitrogen fixation.
Probab=21.82 E-value=28 Score=17.08 Aligned_cols=23 Identities=17% Similarity=0.569 Sum_probs=14.0 Q ss_pred EEEEHHHHCCCCCCCEECCCCEE Q ss_conf 46533321035211300468789 Q gi|254780287|r 41 SYKFDLESKQSEDDIVFEKNGAQ 63 (109) Q Consensus 41 ~y~l~~~~~~~~~D~v~~~~gi~ 63 (109) .|.+.|.+...+.|+|+=.+.-| T Consensus 91 LyR~gFsTDL~E~DVifGrGEKK 113 (470) T TIGR01283 91 LYRLGFSTDLTEKDVIFGRGEKK 113 (470) T ss_pred HCCCCCCCCCCCCCEEECCHHHH T ss_conf 51366632660246673314478 No 45 >COG1635 THI4 Ribulose 1,5-bisphosphate synthetase, converts PRPP to RuBP, flavoprotein [Carbohydrate transport and metabolism] Probab=21.73 E-value=65 Score=14.93 Aligned_cols=62 Identities=19% Similarity=0.354 Sum_probs=40.3 Q ss_pred EEEECCCCCCEEEEEHHHHCCCCCCCEECCC-----C--------EEEEECHHHHHHCCCCEEEEEECCCCCEEEEEC Q ss_conf 9983688765146533321035211300468-----7--------899984677654068799987278545238988 Q gi|254780287|r 30 ITIEGGGCSGFSYKFDLESKQSEDDIVFEKN-----G--------AQIFIDKISLAYLTNSEIDFVDNLLSKSFQIRN 94 (109) Q Consensus 30 i~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~-----g--------i~v~vd~~s~~~L~g~~IDy~~~~~g~gF~i~n 94 (109) +.+.++|-||..-...+.+. .-.-.++|.. | -+++|.+....+|+...|-|.+.+.| +...+ T Consensus 33 ViIVGaGPsGLtAAyyLAk~-g~kV~i~E~~ls~GGG~w~GGmlf~~iVv~~~a~~iL~e~gI~ye~~e~g--~~v~d 107 (262) T COG1635 33 VIIVGAGPSGLTAAYYLAKA-GLKVAIFERKLSFGGGIWGGGMLFNKIVVREEADEILDEFGIRYEEEEDG--YYVAD 107 (262) T ss_pred EEEECCCCCHHHHHHHHHHC-CCEEEEEEEECCCCCCCCCCCCCCCCEEECCHHHHHHHHHCCCCEECCCC--EEEEC T ss_conf 79987685057899999867-96499997301468763344333560444253899999819852445796--69832 No 46 >pfam09624 DUF2393 Protein of unknown function (DUF2393). The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus. 
Probab=21.67 E-value=74 Score=14.62 Aligned_cols=77 Identities=9% Similarity=0.086 Sum_probs=34.9 Q ss_pred HHCCCCCEEEEEEECCCCCCEEEEEHHHHCCCCCCCEECCCCEE-EEECHHHHHHCCCCEEEEEEC--CCCCEEEEE-CC Q ss_conf 72898847999983688765146533321035211300468789-998467765406879998727--854523898-88 Q gi|254780287|r 20 ESNSDKKALRITIEGGGCSGFSYKFDLESKQSEDDIVFEKNGAQ-IFIDKISLAYLTNSEIDFVDN--LLSKSFQIR-NP 95 (109) Q Consensus 20 ~~~~~~~~lRi~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~gi~-v~vd~~s~~~L~g~~IDy~~~--~~g~gF~i~-nP 95 (109) .+.+...-|++.+..---.+=+|...|.- .+.++.....=.+. .+-++.-..--...||||.-. ...++|+|. || T Consensus 26 ~~~~~pP~l~v~~~~v~~~~~qyyVpF~v-~N~Gg~TAa~V~V~geL~~~~~~~E~~e~tiDflpg~~~~~G~fiF~~dP 104 (119) T pfam09624 26 TEENKPPNLVVAVLQVRQVGGQFYVPFAV-RNDGGQTAAAVTVIGELRQGGAVVEESEVTIDFLPGGSEAKGTLIFRSDP 104 (119) T ss_pred HCCCCCCEEEEEEHHEEEECCEEEEEEEE-EECCCCEEEEEEEEEEECCCCCEEEEEEEEEEECCCCCEEEEEEEECCCC T ss_conf 27999976999850389968878999999-96886677799999999769951787679999747997576899972695 Q ss_pred CC Q ss_conf 88 Q gi|254780287|r 96 NA 97 (109) Q Consensus 96 na 97 (109) .. T Consensus 105 ~~ 106 (119) T pfam09624 105 AG 106 (119) T ss_pred CC T ss_conf 56 No 47 >pfam01946 Thi4 Thi4 family. This family includes a putative thiamine biosynthetic enzyme. Probab=21.21 E-value=66 Score=14.90 Aligned_cols=55 Identities=22% Similarity=0.329 Sum_probs=34.1 Q ss_pred EEEECCCCCCEEEEEHHHHCCCCCCCEECCC-------------CEEEEECHHHHHHCCCCEEEEEECC Q ss_conf 9983688765146533321035211300468-------------7899984677654068799987278 Q gi|254780287|r 30 ITIEGGGCSGFSYKFDLESKQSEDDIVFEKN-------------GAQIFIDKISLAYLTNSEIDFVDNL 85 (109) Q Consensus 30 i~v~~gGCsG~~y~l~~~~~~~~~D~v~~~~-------------gi~v~vd~~s~~~L~g~~IDy~~~~ 85 (109) +.+.++|-||..-...+.+. .-.-.++|.. --++++.+....+|+..-|.|.+.. 
T Consensus 20 V~IVGaGpsGL~aA~~LAk~-g~KV~i~E~~ls~GGG~WgGGmlfn~ivv~~~a~~iLde~gi~y~~~~ 87 (229) T pfam01946 20 VVIVGAGPSGLTAAYYLAKK-GLKVAIIERSLSPGGGAWGGGMLFSAMVVRKPADEFLDEFGIRYEDEG 87 (229) T ss_pred EEEECCCCHHHHHHHHHHHC-CCEEEEEECCCCCCCCCCCCCCCCCCEEECCHHHHHHHHCCCCCEECC T ss_conf 89988781799999999878-985999964526888620201225633764138999997499527647 No 48 >TIGR02411 leuko_A4_hydro leukotriene A-4 hydrolase/aminopeptidase; InterPro: IPR012777 Members of this family represent a distinctive subset within the zinc metallopeptidases of MEROPS peptidase family M1 (aminopeptidase N, clan MA). The majority of the members are aminopeptidases, but the sequences in this family for which the function is known are leukotriene A-4 hydrolase (LTA4), which has both epoxide hydrolase and aminopeptidase activity at the same active site. The physiological substrate for aminopeptidase activity is not known.; GO: 0004463 leukotriene-A4 hydrolase activity, 0008270 zinc ion binding. Probab=20.79 E-value=46 Score=15.83 Aligned_cols=73 Identities=19% Similarity=0.294 Sum_probs=40.7 Q ss_pred EEEEEECCCCCC-EEEEEHHHHCCC-CCCCEECCCCEEE---EECH-HHHHHCCCCEEE-E----EECCCCCEEEEECCC Q ss_conf 999983688765-146533321035-2113004687899---9846-776540687999-8----727854523898888 Q gi|254780287|r 28 LRITIEGGGCSG-FSYKFDLESKQS-EDDIVFEKNGAQI---FIDK-ISLAYLTNSEID-F----VDNLLSKSFQIRNPN 96 (109) Q Consensus 28 lRi~v~~gGCsG-~~y~l~~~~~~~-~~D~v~~~~gi~v---~vd~-~s~~~L~g~~ID-y----~~~~~g~gF~i~nPn 96 (109) +||.-+..-=.| ..|.|.-..... ...+|++...++| .|+. -|. ..+| | ..+..||.++|.+|.
T Consensus 21 ~~vdF~~~~l~G~V~~~l~~~~~n~~~S~l~LDTsyL~i~~v~ingGGse-----~~~~nf~i~~R~~~~GS~L~I~lP~ 95 (660) T TIGR02411 21 LSVDFDKEKLQGSVTFKLKSLTENKNKSELVLDTSYLDIQKVTINGGGSE-----LPADNFEIGERKEPLGSPLTISLPS 95 (660) T ss_pred EEEEECCCEEEEEEEEEEEEECCCCCCCEEEEECCCCEEEEEEECCCCCC-----CCCCCEEHHHHCCCCCCCCEECCCC T ss_conf 87600031465689999986058875430565302233789998686655-----6677502321026778740430755 Q ss_pred CCCCCCCCCCCC Q ss_conf 888778862246 Q gi|254780287|r 97 ATSNCGCGTSFS 108 (109) Q Consensus 97 a~~~CgCG~SF~ 108 (109) +.+ ||.+|. T Consensus 96 ~~~---~n~~~~ 104 (660) T TIGR02411 96 ETS---KNKELE 104 (660) T ss_pred CCC---CCCEEE T ss_conf 677---896579 Done!