Query         gi|254780939|ref|YP_003065352.1| type I signal peptidase [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 248
No_of_seqs    132 out of 2445
Neff          6.6
Searched_HMMs 39220
Date          Mon May 30 03:23:12 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780939.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10861 lsignal peptidase I;  100.0       0       0  484.1  17.0  223    11-247   58-323 (324)
  2 TIGR02227 sigpep_I_bact signal 100.0       0       0  385.7  14.4  195    16-224    1-203 (203)
  3 KOG0171 consensus              100.0 1.5E-31 3.7E-36  218.4  11.1  134    28-225   26-160 (176)
  4 pfam10502 Peptidase_S26 Peptid 100.0 1.7E-29 4.4E-34  205.4   9.5  107    82-220   19-137 (138)
  5 cd06530 S26_SPase_I The S26 Ty  99.9 7.3E-25 1.9E-29  176.3   7.5   85    36-216     1-85 (85)
  6 KOG1568 consensus               99.9 2.3E-23 5.9E-28  166.8   7.2  134    15-222    9-150 (174)
  7 PRK13884 conjugal transfer pep  99.8 3.9E-19   1E-23  140.2  12.2  109    83-220   49-176 (178)
  8 COG0681 LepB Signal peptidase   99.8 3.5E-19 9.1E-24  140.5  10.3   91    14-124    8-100 (166)
  9 TIGR02771 TraF_Ti conjugative   99.7 2.2E-17 5.5E-22  129.3   9.3  107    82-217   47-178 (183)
 10 PRK13838 conjugal transfer pil  99.7 1.3E-15 3.2E-20  118.2  11.7  105    83-220   49-172 (176)
 11 TIGR02754 sod_Ni_protease nick  99.6 1.6E-15   4E-20  117.6   6.9   89    38-217     1-89 (90)
 12 COG4959 TraF Type IV secretory  99.3 3.9E-12   1E-16   96.2   6.4  106    83-220   51-169 (173)
 13 cd06462 Peptidase_S24_S26 The   99.1 3.1E-10   8E-15   84.2   6.7   82    37-215     2-83 (84)
 14 TIGR02228 sigpep_I_arch signal  99.0 1.7E-09 4.5E-14   79.5   6.6   80    12-110     2-95 (175)
 15 pfam00717 Peptidase_S24 Peptid  98.7 1.2E-08 2.9E-13   74.4   4.7   56    39-118     1-56 (67)
 16 COG2932 Predicted transcriptio  98.0   9E-06 2.3E-10   56.1   5.4   57    37-117  125-181 (214)
 17 cd06529 S24_LexA-like Peptidas  97.8 4.4E-05 1.1E-09   51.8   5.7   56    37-117     2-57 (81)
 18 KOG3342 consensus               97.4 0.00015 3.7E-09   48.5   3.1   52    38-110   51-102 (180)
 19 PRK00215 LexA repressor; Valid  96.7  0.0033 8.4E-08   40.0   5.3   53    37-115  119-172 (204)
 20 COG0681 LepB Signal peptidase   96.3  0.0041 1.1E-07   39.4   3.3   28   101-128  138-165 (166)
 21 PRK10276 DNA polymerase V subu  96.2  0.0088 2.2E-07   37.3   4.9   45    38-107    54-99 (139)
 22 PHA00361 cI Repressor           96.2   0.011 2.8E-07   36.7   5.2   56    31-112   72-131 (165)
 23 PRK12423 LexA repressor; Provi  95.5   0.033 8.4E-07   33.7   5.2   54    38-117  117-171 (202)
 24 COG1974 LexA SOS-response tran  92.6    0.29 7.4E-06   27.8   5.1   47    38-108  115-162 (201)
 25 TIGR02896 spore_III_AF stage I  84.7     1.1 2.9E-05   24.1   3.4   31     11-47     1-31 (113)
 26 COG3602 Uncharacterized protei  64.2       4  0.0001   20.6   1.6   17     43-59    12-28 (134)
 27 COG1969 HyaC Ni,Fe-hydrogenase  64.2     6.4 0.00016   19.3   2.7   34     13-48    20-53 (227)
 28 pfam11101 DUF2884 Protein of u  63.8     4.9 0.00013   20.0   2.1   27   106-132    17-44 (229)
 29 pfam10000 DUF2241 Uncharacteri  60.3     5.2 0.00013   19.9   1.7   16     43-58    12-27 (72)
 30 TIGR01237 D1pyr5carbox2 delta-  55.9     5.7 0.00014   19.7   1.3   21    98-118  273-295 (518)
 31 pfam02836 Glyco_hydro_2_C Glyc  54.6     5.7 0.00015   19.6   1.1   28   184-211  252-279 (297)
 32 KOG1247 consensus               50.8     9.7 0.00025   18.2   1.8   41   203-246  250-291 (567)
 33 TIGR01718 Uridine-psphlse urid  49.4      14 0.00036   17.2   2.4   31     24-58   75-106 (248)
 34 PRK09919 hypothetical protein;  45.0      11 0.00027   17.9   1.3   24   109-132    39-62 (114)
 35 cd02776 MopB_CT_Nitrate-R-NarG  44.1      23  0.0006   15.8   2.9   36    84-124    25-62 (141)
 36 TIGR01704 MTA/SAH-Nsdase MTA/S  43.8      17 0.00044   16.6   2.2   42     20-62    52-96 (229)
 37 TIGR00185 rRNA_methyl_2 RNA me  43.3      24 0.00061   15.7   3.5  103    89-193    3-109 (161)
 38 PRK09525 lacZ beta-D-galactosi  42.5      12 0.00031   17.6   1.2   17   114-130  335-351 (1027)
 39 COG0361 InfA Translation initi  41.3      21 0.00054   16.1   2.3   13     49-61    47-59 (75)
 40 TIGR00915 2A0602 RND transport  41.3     8.4 0.00021   18.6   0.3   15     83-97  659-673 (1058)
 41 cd04438 DEP_dishevelled DEP (D  39.0      15 0.00038   17.0   1.3   13   179-191    72-84 (84)
 42 cd01287 FabA FabA, beta-hydrox  38.2      16 0.00041   16.8   1.3   26   104-129  112-137 (150)
 43 PRK10340 ebgA cryptic beta-D-g  37.1      16 0.00042   16.7   1.2   16   114-129  319-334 (1030)
 44 TIGR02390 RNA_pol_rpoA1 DNA-di  35.6      21 0.00054   16.0   1.6   34    82-117  430-467 (901)
 45 PRK08566 DNA-directed RNA poly  35.2      32 0.00082   14.9   3.0   33    83-117  410-446 (881)
 46 COG1097 RRP4 RNA-binding prote  34.2      33 0.00085   14.8   4.3   17     42-58  106-122 (239)
 47 TIGR02219 phage_NlpC_fam putat  33.4      22 0.00057   15.9   1.4   19     79-97    71-90 (135)
 48 pfam11057 Cortexin Cortexin of  32.2      25 0.00064   15.6   1.5   29     19-47    35-66 (81)
 49 PRK10838 spr putative outer me  30.1      18 0.00046   16.5   0.5   12     22-33    15-26 (188)
 50 TIGR02656 cyanin_plasto plasto  29.9      18 0.00045   16.5   0.4   21   106-126    18-38 (102)
 51 PRK10369 heme lyase subunit Nr  29.4      40   0.001   14.3   2.9   23   106-128  428-450 (552)
 52 pfam00623 RNA_pol_Rpb1_2 RNA p  28.9      41   0.001   14.2   2.2   34    82-117   91-128 (165)
 53 TIGR00575 dnlj DNA ligase, NAD  28.7      34 0.00087   14.7   1.7   12     41-52  345-356 (706)
 54 KOG2915 consensus               28.3      20 0.00051   16.2   0.4   21   104-124    48-70 (314)
 55 TIGR02730 carot_isom carotene   28.0      43  0.0011   14.1   2.2   25     42-67  200-224 (506)
 56 TIGR02823 oxido_YhdH putative   27.7      43  0.0011   14.1   3.0   21     43-63    72-92 (330)
 57 TIGR00110 ilvD dihydroxy-acid   27.6      28 0.00072   15.3   1.1   18     80-97  455-474 (601)
 58 KOG1618 consensus               27.6      35 0.00088   14.7   1.6   47   180-228  294-342 (389)
 59 TIGR03468 HpnG hopanoid-associ  27.0      44  0.0011   14.0   2.1   23     42-64    55-77 (212)
 60 cd00986 PDZ_LON_protease PDZ d  26.8      30 0.00076   15.1   1.1   42    85-129    25-69 (79)
 61 COG3250 LacZ Beta-galactosidas  26.7      31 0.00078   15.0   1.2   17   113-129  284-300 (808)
 62 TIGR02634 xylF D-xylose ABC tr  26.2      31  0.0008   15.0   1.1   13   180-192  116-128 (307)
 63 PRK05174 3-hydroxydecanoyl-(ac  25.1      39   0.001   14.4   1.5   13   117-129  143-155 (172)
 64 PRK10150 beta-D-glucuronidase;  25.1      35  0.0009   14.7   1.2   19   114-132  280-298 (605)
 65 PRK06714 S-adenosylhomocystein  24.9      48  0.0012   13.8   3.6   42     21-63    55-99 (236)
 66 TIGR01857 FGAM-synthase phosph  24.6      39   0.001   14.4   1.4   20     31-51  361-380 (1279)
 67 COG2949 SanA Uncharacterized m  23.6      47  0.0012   13.9   1.6   14   185-198   96-109 (235)
 68 PRK05584 5'-methylthioadenosin  23.4      47  0.0012   13.9   1.6   40     22-62    55-97 (230)
 69 pfam00877 NLPC_P60 NlpC/P60 fa  23.1      38 0.00098   14.4   1.1   16     80-95    46-61 (105)
 70 pfam01048 PNP_UDP_1 Phosphoryl  22.6      49  0.0013   13.7   1.6   22     41-62    78-99 (232)
 71 PRK02122 glucosamine-6-phospha  22.4      21 0.00053   16.1  -0.4   21   180-200  528-548 (660)
 72 COG4615 PvdE ABC-type sideroph  22.1      35  0.0009   14.7   0.7   46    84-133  345-391 (546)
 73 pfam05382 Amidase_5 Bacterioph  22.0      43  0.0011   14.1   1.1   13     84-96    75-87 (145)
 74 TIGR01285 nifN nitrogenase mol  21.2      46  0.0012   13.9   1.2   97    28-133  195-335 (451)
 75 pfam05257 CHAP CHAP domain. Th  21.1      57  0.0015   13.3   2.7   14     82-95    55-68 (119)
 76 TIGR00066 g_glut_trans gamma-g  20.6      36 0.00091   14.6   0.5   11     39-49  455-465 (583)
 77 TIGR01026 fliI_yscN ATPase Fli  20.1      60  0.0015   13.2   3.8   71    33-125   23-103 (455)
 78 cd03295 ABC_OpuCA_Osmoprotecti  20.1      37 0.00093   14.6   0.5   44    84-131    23-67 (242)

No 1 >PRK10861 lsignal peptidase I; Provisional Probab=100.00 E-value=0 Score=484.08 Aligned_cols=223 Identities=38% Similarity=0.699 Sum_probs=170.0 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHEEEEEEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCE Q ss_conf 03587899999999999988751689899878667764336988999840058766412211542222112555334704 Q gi|254780939|r 11 IFGSDTLKSILQALFFAILIRTFLFQPSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDV 90 (248) Q Consensus 11 ~f~~e~i~~l~~~i~i~~~ir~fv~~~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDI 90 (248) -.+-|+.++++.++++++++|+|++|||+|||+||+|||++||+|||||++||+|.|.+..+.
+..++|||||| T Consensus 58 p~~~e~~~s~fpvi~~v~ilRsFl~EPF~IPSgSM~PTLlvGDfIlVnKf~YG~r~P~~~~~i------i~~~~PkRGDV 131 (324) T PRK10861 58 PGWLETGASVFPVLAIVLIVRSFIYEPFQIPSGSMMPTLLIGDFILVEKFAYGIKDPIYQKTL------IETGHPKRGDI 131 (324) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHEECEECCCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCC------CCCCCCCCCCE T ss_conf 615556677999999999999883237027864343423558889998120466777667512------45799988999 Q ss_pred EEEECCCCCHHHEEEEEEEECHHHEEEC--CCCEEECCC-----------CCCCCCCCCCE------------------- Q ss_conf 6640144200310000003042343200--773245353-----------12245446400------------------- Q gi|254780939|r 91 VVFRYPKDPSIDYVKRVIGLPGDRISLE--KGIIYINGA-----------PVVRHMEGYFS------------------- 138 (248) Q Consensus 91 VVF~~P~d~~~~yVKRvIGlPGDtV~i~--~~~l~INg~-----------~i~~~~~~~~~------------------- 138 (248) |||++|.|++++|||||||+|||+|+++ |++++|+.. ++.+......+ T Consensus 132 VVF~yP~d~~~dYIKRvIGLPGD~I~y~~~~k~l~i~p~~~~~~~~~~~l~i~~~~~~~~d~~~~~~~~~~~~~~~~~~~ 211 (324) T PRK10861 132 VVFKYPEDPKLDYIKRAVGLPGDKVTYDPVSKEVTIQPGCSSGQACENALPVTYSNVEPSDFVQTFSRRNGGEATSGFFE 211 (324) T ss_pred EEEECCCCCCCCEECCCCCCCCCEEEECCCCCEEEECCCCCCCCCCCCCCCCCCCCCCCHHHHHHHCCCCCCCCCCCCCC T ss_conf 99958999987644104556987797413565267613555451125642101343561345552102578622353101 Q ss_pred EECC--CCCCEEEEEECCCCCCCCCCCEEEC---------CCCCCCCCCCEEECCCCEEEEEECCCCCCCCCCCCCCCCC Q ss_conf 2214--7862012210221467850002212---------4557887453023244649997178877764453653026 Q gi|254780939|r 139 YHYK--EDWSSNVPIFQEKLSNGVLYNVLSQ---------DFLAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWVEVGFV 207 (248) Q Consensus 139 ~~~~--~~~~~~~~~~~e~l~~~~~~~~~~~---------~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~~~G~V 207 (248) .+.. .........+.|++++ ..|.++.. 
....++...++|+||+|||||||||||||.|||| |||| T Consensus 212 ~p~~~~~~~~~~~~~~~e~l~~-~~h~il~~p~~~~~~~~~~~~~~~~~~~~~VP~g~YFmMGDNRDNS~DSRy--WGFV 288 (324) T PRK10861 212 VPLNETKENGIRLSERKETLGD-VTHRILTVPIAQDQLGMYYQQPGQPLATWIVPPGQYFMMGDNRDNSADSRY--WGFV 288 (324) T ss_pred CCCCCCCCCCEEEEEEEEECCC-CCEEEEECCCCCCCCCCCCCCCCCCCCCEEECCCCEEEEECCCCCCCCCCC--EECC T ss_conf 5655577664354235650487-320367525643334421135777665479689958984128888764574--6167 Q ss_pred CHHHEEEEEEEEEEECCCCCCCCCCCCCCCCCCHHHCCCC Q ss_conf 0888253389999743788775435554567013342232 Q gi|254780939|r 208 PEENLVGRASFVLFSIGGDTPFSKVWLWIPNMRWDRLFKI 247 (248) Q Consensus 208 p~~~IvGka~~i~~S~d~~~~~~~~~~~~~~iRw~R~f~~ 247 (248) |++||+|||++||||+|.+.. .|+.++||+|+++. T Consensus 289 Pe~~IVGKA~~IWmS~d~~~~-----~~p~~iR~~RiG~i 323 (324) T PRK10861 289 PEANLVGKATAIWMSFEKQEG-----EWPTGVRLSRIGGI 323 (324) T ss_pred CHHHCEEEEEEEEEECCCCCC-----CCCCCCCEEECCCC T ss_conf 789945626899998148768-----78776624403267 No 2 >TIGR02227 sigpep_I_bact signal peptidase I; InterPro: IPR000223 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases . Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base . 
The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A.
At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane . Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric . Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space . The catalytic residues of a number of these peptides have been identified as a serine/lysine dyad .; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0016020 membrane. Probab=100.00 E-value=0 Score=385.69 Aligned_cols=195 Identities=41% Similarity=0.745 Sum_probs=160.6 Q ss_pred HHHHHHHHHHHHHHHHHEEEEE-EEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEE Q ss_conf 8999999999999887516898-998786677643369889998400587664122115422221125553347046640 Q gi|254780939|r 16 TLKSILQALFFAILIRTFLFQP-SVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFR 94 (248) Q Consensus 16 ~i~~l~~~i~i~~~ir~fv~~~-f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~ 94 (248) ++.++++|+++++++|+|++++ +.|||+||+|||+.||+|||||++|+ | .++|++...+ +.+++|+|||||||+ T Consensus 1 ~~~~~~~~~~~~~~~r~f~~~~~~~v~g~SM~PTL~~gD~~lv~K~~y~-r-~~lp~~~~~~---~~~~~~~rgDivVF~ 75 (203) T TIGR02227 1 VILSILIAILLALLIRTFVFFPVYKVPGGSMEPTLKEGDRILVNKFAYG-R-LKLPFTHKLL---FKTSDPKRGDIVVFK 75 (203) T ss_pred CHHHHHHHHHHHHHHHHHHEEEEEEECCCCCCHHHCCCCEEEEEEECCC-E-EEEECCEEEE---EEECCCEECCEEEEE T ss_conf 9555678999999986411478898088975532237988999981264-4-5630210003---560575004089995 Q ss_pred CCCCCHH-HEEEEEEEECHHHEEECC-CCEEECC-CCCCCC--CCCCCEEECCCCCCEEEEEECCCCCCCCCCCEEECCC 
Q ss_conf 1442003-100000030423432007-7324535-312245--4464002214786201221022146785000221245 Q gi|254780939|r 95 YPKDPSI-DYVKRVIGLPGDRISLEK-GIIYING-APVVRH--MEGYFSYHYKEDWSSNVPIFQEKLSNGVLYNVLSQDF 169 (248) Q Consensus 95 ~P~d~~~-~yVKRvIGlPGDtV~i~~-~~l~INg-~~i~~~--~~~~~~~~~~~~~~~~~~~~~e~l~~~~~~~~~~~~~ 169 (248) .|.+++. .|||||||+|||+|+++| |+||||| ++++++ +..........+...... .+ .+ +..... T Consensus 76 ~~~~~~~~~yiKRviglPGD~v~~~~~g~ly~Ng~~~~~e~~~y~~~~~~~~~~~~~~~~~-------~~-~~-~~~~~~ 146 (203) T TIGR02227 76 APDDPDNRIYIKRVIGLPGDKVEIKDKGKLYINGLKKIDEPNEYLKPNKSLDTSEFNAATG-------RG-KH-VTNFAS 146 (203) T ss_pred CCCCCCCCEEEEEEEECCCCEEEEEECCEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCC-------CC-HH-HHHHHH T ss_conf 3899889616789996588889998089069858601236765456544434555310220-------00-01-122211 Q ss_pred CCCCCCCCEEECCCCEEEEEECCCCCCCCCCCC--CCCCCCHHHEEEEEEEEEEECC Q ss_conf 578874530232446499971788777644536--5302608882533899997437 Q gi|254780939|r 170 LAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWV--EVGFVPEENLVGRASFVLFSIG 224 (248) Q Consensus 170 ~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~--~~G~Vp~~~IvGka~~i~~S~d 224 (248) .....+.++++||+||||||||||+||.|||++ .||+||+++|+|||.+++||++ T Consensus 147 ~~~~~~~~~~~VP~g~yFvLGDNR~~S~DSR~~~~~~G~v~~~~i~G~~~~~~~p~~ 203 (203) T TIGR02227 147 EEIITDYGPVTVPEGKYFVLGDNRDNSLDSRFWDDYFGFVPRDDIIGKVSFVFYPFD 203 (203) T ss_pred HHHHHCCCCEEECCCCEEEEECCCCCCCCCCCCCCEECCCCHHHEEEEEEEEEEECC T ss_conf 110112587680798579870254466655357862546207786589999986069 No 3 >KOG0171 consensus Probab=99.97 E-value=1.5e-31 Score=218.44 Aligned_cols=134 Identities=31% Similarity=0.528 Sum_probs=112.5 Q ss_pred HHHHHEEEEEEEECCCCCCCCCCC-CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEE Q ss_conf 988751689899878667764336-9889998400587664122115422221125553347046640144200310000 Q gi|254780939|r 28 ILIRTFLFQPSVIPSGSMIPTLLV-GDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKR 106 (248) Q Consensus 28 ~~ir~fv~~~f~Ips~SM~PTL~~-GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKR 106 (248) 
-....|+++.-.+++.||+|||+. ||++++.|++| +++.||+||||||+.|.+++++++|| T Consensus 26 h~t~~yl~e~~~~~gpSM~PTl~~~gd~l~aEkls~------------------~f~~~~~gDIVi~~sP~~~~~~~cKR 87 (176) T KOG0171 26 HVTHEYLGEFVMCSGPSMEPTLHDGGDVLLAEKLSY------------------RFRKPQVGDIVIAKSPPDPKEHICKR 87 (176) T ss_pred HHHHHHHCCEEECCCCCCCCEECCCCCEEEHHHHHH------------------HHCCCCCCCEEEEECCCCCHHHHHHE T ss_conf 999987422066258875753627986886335457------------------64589877789994899803422110 Q ss_pred EEEECHHHEEECCCCEEECCCCCCCCCCCCCEEECCCCCCEEEEEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEE Q ss_conf 00304234320077324535312245446400221478620122102214678500022124557887453023244649 Q gi|254780939|r 107 VIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHYKEDWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHY 186 (248) Q Consensus 107 vIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~y 186 (248) +||+|||-+++.++.+.+|+.- + +...+..||+||. T Consensus 88 Iva~eGD~v~v~~~~~~~n~~~--e------------------------------------------~~~~~i~VP~GhV 123 (176) T KOG0171 88 IVAMEGDLVEVHDGPLVVNDLV--E------------------------------------------KFSTPIRVPEGHV 123 (176) T ss_pred EECCCCCEEEEECCCCCCCHHH--H------------------------------------------HCCCEEECCCCEE T ss_conf 3214886289962774343113--2------------------------------------------0564046137518 Q ss_pred EEEECCCCCCCCCCCCCCCCCCHHHEEEEEEEEEEECCC Q ss_conf 997178877764453653026088825338999974378 Q gi|254780939|r 187 FMMGDNRDKSKDSRWVEVGFVPEENLVGRASFVLFSIGG 225 (248) Q Consensus 187 fvlGDNRdnS~DSR~~~~G~Vp~~~IvGka~~i~~S~d~ 225 (248) ||+||||.||.||| .||++|..+|.||..+-+|.... T Consensus 124 fv~GDN~~nS~DSr--~yGplP~glI~gRvv~r~Wp~s~ 160 (176) T KOG0171 124 FVEGDNRNNSLDSR--NYGPLPMGLIQGRVVFRIWPPSR 160 (176) T ss_pred EEECCCCCCCCCCC--CCCCCCHHHEEEEEEEEECCCHH T ss_conf 98658887766456--43777452433568888669311 No 4 >pfam10502 Peptidase_S26 Peptidase S26. This is a family of serine endopeptidases which function in the processing of newly-synthesized secreted proteins. 
Peptidase S26 removes the hydrophobic, N-terminal signal peptides as proteins are translocated across membranes. Probab=99.96 E-value=1.7e-29 Score=205.43 Aligned_cols=107 Identities=32% Similarity=0.557 Sum_probs=84.0 Q ss_pred CCCCCCCCEEEEECCCCC------------HHHEEEEEEEECHHHEEECCCCEEECCCCCCCCCCCCCEEECCCCCCEEE Q ss_conf 555334704664014420------------03100000030423432007732453531224544640022147862012 Q gi|254780939|r 82 NNQPRRGDVVVFRYPKDP------------SIDYVKRVIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHYKEDWSSNV 149 (248) Q Consensus 82 ~~~p~RGDIVVF~~P~d~------------~~~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~ 149 (248) ..+|+|||+|+|..|.+. +..|||||+|+|||+|++++++++|||+++.+..... . T Consensus 19 ~~~~~rGD~V~f~~P~~~~~~~~~rgyl~~g~~~iKrV~g~pGD~V~i~~~~v~INg~~~~~~~~~d--------~---- 86 (138) T pfam10502 19 LDRPEVGDLVAVCPPEPAAFFAAERGYLPRGVPLLKRVLALPGQRVCIRDGLVTIDGVPVSAALERD--------R---- 86 (138) T ss_pred CCCCCCCCEEEEECCHHHHHHHHHCCCCCCCCCEEEEEEEECCCEEEEECCEEEECCEECCCEECCC--------C---- T ss_conf 8987428899997984887778766867789957889998099899998999988899953210337--------6---- Q ss_pred EEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHEEEEEEEEE Q ss_conf 21022146785000221245578874530232446499971788777644536530260888253389999 Q gi|254780939|r 150 PIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWVEVGFVPEENLVGRASFVL 220 (248) Q Consensus 150 ~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~~~G~Vp~~~IvGka~~i~ 220 (248) .+.... 
...+..+||+|+||||||||++|.|||| ||+||+++|+|||.-|| T Consensus 87 --------~g~~l~----------~~~~~~~vp~g~~fvlgdn~~~S~DSRy--~G~V~~~~I~G~a~pi~ 137 (138) T pfam10502 87 --------KGRPLP----------PWQGCRVLPEGELFLMSVTSPDSFDSRY--FGPVPASAIIGRARPVW 137 (138) T ss_pred --------CCCCCC----------CCCCCCEECCCEEEEECCCCCCCCCCCC--EEECCHHHEEEEEEEEE T ss_conf --------788577----------6678629189989997699998754541--81357799699999967 No 5 >cd06530 S26_SPase_I The S26 Type I signal peptidase (SPase; LepB; leader peptidase B; leader peptidase I; EC 3.4.21.89) family members are essential membrane-bound serine proteases that function to cleave the amino-terminal signal peptide extension from proteins that are translocated across biological membranes. The bacterial signal peptidase I, which is the most intensively studied, has two N-terminal transmembrane segments inserted in the plasma membrane and a hydrophilic, C-terminal catalytic region that is located in the periplasmic space. Although the bacterial signal peptidase I is monomeric, signal peptidases of eukaryotic cells commonly function as oligomeric complexes containing two divergent copies of the catalytic monomer. These are the IMP1 and IMP2 signal peptidases of the mitochondrial inner membrane that remove leader peptides from nuclear- and mitochondrial-encoded proteins. Also, two components of the endoplasmic reticulum signal peptidase in mammals (18-kDa and 21-kDa Probab=99.91 E-value=7.3e-25 Score=176.30 Aligned_cols=85 Identities=58% Similarity=1.046 Sum_probs=78.1 Q ss_pred EEEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHE Q ss_conf 98998786677643369889998400587664122115422221125553347046640144200310000003042343 Q gi|254780939|r 36 QPSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRI 115 (248) Q Consensus 36 ~~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV 115 (248) +++.|+|+||+|||..||.++++|+.|. 
..++++||||+|++|.+++..+|||||| T Consensus 1 ~~~~V~g~SM~Pt~~~Gd~v~v~~~~~~------------------~~~~~~GDivv~~~p~~~~~~~ikRVi~------ 56 (85) T cd06530 1 EPVVVPGGSMEPTLQPGDLVLVNKLSYG------------------FREPKRGDVVVFKSPGDPGKPIIKRVIG------ 56 (85) T ss_pred CCEEECCCCCCCCCCCCCEEEEEECCCC------------------CCCCCCCCEEEEECCCCCCCEEEECCCE------ T ss_conf 9969568888060308989999961356------------------5777778699996799999759974327------ Q ss_pred EECCCCEEECCCCCCCCCCCCCEEECCCCCCEEEEEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCC Q ss_conf 20077324535312245446400221478620122102214678500022124557887453023244649997178877 Q gi|254780939|r 116 SLEKGIIYINGAPVVRHMEGYFSYHYKEDWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDK 195 (248) Q Consensus 116 ~i~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdn 195 (248) ||++||||+| T Consensus 57 ----------------------------------------------------------------------~~~~GDN~~n 66 (85) T cd06530 57 ----------------------------------------------------------------------YFVLGDNRNN 66 (85) T ss_pred ----------------------------------------------------------------------EEEEECCCCC T ss_conf ----------------------------------------------------------------------8987359676 Q ss_pred CCCCCCCCCCCCCHHHEEEEE Q ss_conf 764453653026088825338 Q gi|254780939|r 196 SKDSRWVEVGFVPEENLVGRA 216 (248) Q Consensus 196 S~DSR~~~~G~Vp~~~IvGka 216 (248) |.|||+ ||++|.++|+||+ T Consensus 67 s~Dsr~--~g~v~~~~i~Gkv 85 (85) T cd06530 67 SLDSRY--WGPVPEDDIVGKV 85 (85) T ss_pred CCCCCC--CCCCCHHHEEEEC T ss_conf 864571--4767789939959 No 6 >KOG1568 consensus Probab=99.89 E-value=2.3e-23 Score=166.83 Aligned_cols=134 Identities=33% Similarity=0.509 Sum_probs=101.8 Q ss_pred HHHHHHHHHHHHH--HHHHHEEEEEEEECCCCCCCCCCCC------CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCC Q ss_conf 7899999999999--9887516898998786677643369------8899984005876641221154222211255533 Q gi|254780939|r 15 DTLKSILQALFFA--ILIRTFLFQPSVIPSGSMIPTLLVG------DYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPR 86 (248) Q Consensus 15 
e~i~~l~~~i~i~--~~ir~fv~~~f~Ips~SM~PTL~~G------D~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~ 86 (248) -+.++++.++.-. +-+---+.-.-+|-+.||.|||..+ |+||+.|+. +...... T Consensus 9 ~~~ksl~~s~~~~v~~t~~DrV~~va~v~G~smqPtlnP~~~~~~~d~Vll~k~~------------------v~n~~~~ 70 (174) T KOG1568 9 VFEKSLTGSLKWHVLLTFSDRVVHVAQVYGSSMQPTLNPTMNTNEKDTVLLRKWN------------------VKNRKVS 70 (174) T ss_pred HHHHCEEEEEEEHEEEEEEEEEEEEEEEECCCCCCCCCCCCCCCCCCEEEEEEEC------------------CCCCEEC T ss_conf 9874003554401024562047787678547678751887665535489998503------------------3343034 Q ss_pred CCCEEEEECCCCCHHHEEEEEEEECHHHEEECCCCEEECCCCCCCCCCCCCEEECCCCCCEEEEEECCCCCCCCCCCEEE Q ss_conf 47046640144200310000003042343200773245353122454464002214786201221022146785000221 Q gi|254780939|r 87 RGDVVVFRYPKDPSIDYVKRVIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHYKEDWSSNVPIFQEKLSNGVLYNVLS 166 (248) Q Consensus 87 RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~~~~~e~l~~~~~~~~~~ 166 (248) |||||+|++|.|+.+.+||||+|+|||.+.-.+ T Consensus 71 rGDiVvl~sP~~p~~~~iKRv~alegd~~~t~~----------------------------------------------- 103 (174) T KOG1568 71 RGDIVVLKSPNDPDKVIIKRVAALEGDIMVTED----------------------------------------------- 103 (174) T ss_pred CCCEEEEECCCCHHHEEEEEEECCCCCEECCCC----------------------------------------------- T ss_conf 687899958999215235515135664751588----------------------------------------------- Q ss_pred CCCCCCCCCCCEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHEEEEEEEEEEE Q ss_conf 24557887453023244649997178877764453653026088825338999974 Q gi|254780939|r 167 QDFLAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWVEVGFVPEENLVGRASFVLFS 222 (248) Q Consensus 167 ~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~~~G~Vp~~~IvGka~~i~~S 222 (248) ..-.++.||+||+||.|||--+|.||| .||+|+-..|+|||..|.|. 
T Consensus 104 -------~k~~~v~vpkghcWVegDn~~hs~DSn--tFGPVS~gli~grai~ilwp 150 (174) T KOG1568 104 -------EKEEPVVVPKGHCWVEGDNQKHSYDSN--TFGPVSTGLIVGRAIYILWP 150 (174) T ss_pred -------CCCCCEECCCCCEEEECCCCCCCCCCC--CCCCCCHHHEEEEEEEEECC T ss_conf -------877735468984789648755334467--54773342124258999748 No 7 >PRK13884 conjugal transfer peptidase TraF; Provisional Probab=99.81 E-value=3.9e-19 Score=140.23 Aligned_cols=109 Identities=20% Similarity=0.268 Sum_probs=79.3 Q ss_pred CCCCCCCEEEEECCCCCH-------------------HHEEEEEEEECHHHEEECCCCEEECCCCCCCCCCCCCEEECCC Q ss_conf 553347046640144200-------------------3100000030423432007732453531224544640022147 Q gi|254780939|r 83 NQPRRGDVVVFRYPKDPS-------------------IDYVKRVIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHYKE 143 (248) Q Consensus 83 ~~p~RGDIVVF~~P~d~~-------------------~~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~~~ 143 (248) ..+++||.|+|-.|...- ...+|||+|+|||+|++.++.++|||+.+.....-.. T Consensus 49 ~~~~~gd~V~~~pp~~~~~~~a~~RgYl~~G~cpgg~~pLiKrV~AlpGd~V~i~~~~V~InG~~lp~s~~~~~------ 122 (178) T PRK13884 49 APVEKGAYVLFCPPQRGVFDDAKERGYIGAGFCPGGYGYMMKRVLAAKGDAVSVADDGVRVNGELLPLSKPIKA------ 122 (178) T ss_pred CCCCCCCEEEECCCCHHHHHHHHHCCCCCCCCCCCCCCEEEEEEEECCCCEEEEECCEEEECCEECCCCCCCCC------ T ss_conf 87565888999479669999998758656788998864037888506998899869999999998666433334------ Q ss_pred CCCEEEEEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHEEEEEEEEE Q ss_conf 86201221022146785000221245578874530232446499971788777644536530260888253389999 Q gi|254780939|r 144 DWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWVEVGFVPEENLVGRASFVL 220 (248) Q Consensus 144 ~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~~~G~Vp~~~IvGka~~i~ 220 (248) + ..|.. 
.+.-+...+++++|+||+|+|++.+|.|||| ||+||+++|+|+|.=+| T Consensus 123 D------------~~GRp---------Lp~~~~~~~~l~~~e~fll~~~~~~SfDSRY--FGPV~~s~I~G~a~Pl~ 176 (178) T PRK13884 123 D------------KAGRP---------LPRYQANSYTLGNSELLLMSDVSATSFDGRY--FGPINRSQIKTVIRPVI 176 (178) T ss_pred C------------CCCCC---------CCCCCCCCEECCCCEEEEECCCCCCCCCCCC--CCCCCHHHEEEEEEEEE T ss_conf 5------------58995---------8854788458279969996699998765445--55465788069999706 No 8 >COG0681 LepB Signal peptidase I [Intracellular trafficking and secretion] Probab=99.80 E-value=3.5e-19 Score=140.50 Aligned_cols=91 Identities=44% Similarity=0.799 Sum_probs=80.6 Q ss_pred HHHHHHHHHHHHHHHHH--HHEEEEEEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEE Q ss_conf 87899999999999988--7516898998786677643369889998400587664122115422221125553347046 Q gi|254780939|r 14 SDTLKSILQALFFAILI--RTFLFQPSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVV 91 (248) Q Consensus 14 ~e~i~~l~~~i~i~~~i--r~fv~~~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIV 91 (248) .++++.++.++++++++ +.|+++++.|||+||+|||+.||+|+|+|++|+.. 
+++.+|++ T Consensus 8 ~~~~~~~~~~~~~~~~i~~~~~~~~~~~V~s~SM~Ptl~~GD~v~v~k~~~~~~------------------~~~~~~~~ 69 (166) T COG0681 8 LELISSLLIAIILALIIGVRTFVFEPVVVPSGSMEPTLNVGDRVLVKKFSYGFG------------------KLKVPDII 69 (166) T ss_pred HHHHHHHHHHHHHHHHHHHEEEEEEEEEEECCCCCCCCCCCCEEEEECCCCCCC------------------CCCCCCEE T ss_conf 999999999999999753204346889980898745677788999966556766------------------45654200 Q ss_pred EEECCCCCHHHEEEEEEEECHHHEEECCCCEEE Q ss_conf 640144200310000003042343200773245 Q gi|254780939|r 92 VFRYPKDPSIDYVKRVIGLPGDRISLEKGIIYI 124 (248) Q Consensus 92 VF~~P~d~~~~yVKRvIGlPGDtV~i~~~~l~I 124 (248) ..|......++||++|+|||++.++++.+++ T Consensus 70 --~~~~~~~~~~~kr~~~~~GD~i~~~~~~~~~ 100 (166) T COG0681 70 --VLPAVVEGDLIKRVIGLRGDIVVFKDDRLYV 100 (166) T ss_pred --CCCCCCCCCCCCCCCCCCCCEEEECCCCCCC T ss_conf --1442233404032467998889976876310 No 9 >TIGR02771 TraF_Ti conjugative transfer signal peptidase TraF; InterPro: IPR014139 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases . Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base . 
The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This entry contains the conjugative transfer signal peptidase (TraF), which belongs to MEROPS peptidase family S26, subfamily S26C (TraF signal peptidase, clan SF). It is found in operons that encode elements of conjugative transfer systems. This family is homologous to a broader family of signal (leader) peptidases such as LepB.
This family is present in both Ti-type and I-type conjugative systems .. Probab=99.73 E-value=2.2e-17 Score=129.26 Aligned_cols=107 Identities=34% Similarity=0.576 Sum_probs=81.9 Q ss_pred CCCCCCCCEEEEECC-CCCHH-------------------HEEEEEEEECHHHEEECCCCEEECCCCCCCCCCCCCEEEC Q ss_conf 555334704664014-42003-------------------1000000304234320077324535312245446400221 Q gi|254780939|r 82 NNQPRRGDVVVFRYP-KDPSI-------------------DYVKRVIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHY 141 (248) Q Consensus 82 ~~~p~RGDIVVF~~P-~d~~~-------------------~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~ 141 (248) -++++|||.|+|.+| +++.. .++|||+|+|||+|+++.+.+.|||+.+........+ T Consensus 47 ~~~v~~G~yV~fcpPe~~~~~~~A~~RGY~~~G~C~GGf~pl~K~v~~l~Gd~v~~~~~~v~iNg~~~~~~~~~~~D--- 123 (183) T TIGR02771 47 SKPVERGDYVVFCPPEDNAVFEEARERGYLREGLCPGGFGPLLKRVLGLPGDRVTVRADVVAINGKLLPYSKPLATD--- 123 (183) T ss_pred CCCCCCCCEEEEECCCCCHHHHHHHHCCCCCCCCCCCCHHHHHHEEECCCCCEEEEECCEEEECCEECCCCCEEECC--- T ss_conf 26877787789835986678740545143357888986001202321278955886068888878677987311007--- Q ss_pred CCCCCEEEEEECCCCCCCCCCCEEECCCCCCCCCC-C-EEECCCC--EEEEEECCCC-CCCCCCCCCCCCCCHHHEEEEE Q ss_conf 47862012210221467850002212455788745-3-0232446--4999717887-7764453653026088825338 Q gi|254780939|r 142 KEDWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNI-S-EFLVPKG--HYFMMGDNRD-KSKDSRWVEVGFVPEENLVGRA 216 (248) Q Consensus 142 ~~~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~-~-~~~VP~g--~yfvlGDNRd-nS~DSR~~~~G~Vp~~~IvGka 216 (248) +.|.. . ..+ . +-++|+| .+|+|-|... .|.||||..||+|++++|+|++ T Consensus 124 ---------------~~GR~--------l---~p~k~s~~~~p~G~~~~~~v~~~~~a~SFDSRYA~fG~i~~~q~~~~~ 177 (183) T TIGR02771 124 ---------------SEGRP--------L---PPLKASEGVVPPGEAEFLVVSDTSPATSFDSRYAEFGPISREQVIGRV 177 (183) T ss_pred ---------------CCCCC--------C---CCCCCCCCEECCCCCEEEEEECCCCCCCCCHHHHCCCCEECCCEEEEE T ss_conf ---------------88888--------8---765788853028862178985088987610011024772000505877 Q ss_pred E Q ss_conf 9 Q gi|254780939|r 217 S 217 (248) Q Consensus 217 ~ 217 (248) . 
T Consensus 178 ~ 178 (183) T TIGR02771 178 K 178 (183) T ss_pred E T ss_conf 1 No 10 >PRK13838 conjugal transfer pilin processing protease TraF; Provisional Probab=99.67 E-value=1.3e-15 Score=118.17 Aligned_cols=105 Identities=28% Similarity=0.432 Sum_probs=74.6 Q ss_pred CCCCCCCEEEEECCCCCH-------------H------HEEEEEEEECHHHEEECCCCEEECCCCCCCCCCCCCEEECCC Q ss_conf 553347046640144200-------------3------100000030423432007732453531224544640022147 Q gi|254780939|r 83 NQPRRGDVVVFRYPKDPS-------------I------DYVKRVIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHYKE 143 (248) Q Consensus 83 ~~p~RGDIVVF~~P~d~~-------------~------~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~~~ 143 (248) +.+++||+|+|..|+..- . ..+|||+|+|||+|++.+ .+.|||+.+.......-+ T Consensus 49 ~~~~~GdlV~vcpP~~~a~~~a~~RgYL~~G~Cpgg~~pLlKrV~Al~Gd~V~i~~-~V~Ing~~v~~s~~~~~D----- 122 (176) T PRK13838 49 RPAAVGDLVFICPPDTAAFREARARGYLRSGLCPGGFAPLIKTVAAVAGQRVEIGD-SVSIDGRPVPSSSLARRD----- 122 (176) T ss_pred CCCCCCCEEEECCCCHHHHHHHHHCCCCCCCCCCCCCCCEEEEEECCCCCEEEECC-CEEECCEECCCCCCCCCC----- T ss_conf 87644989998689479999998778753575888865202565146998899689-789989981235332547----- Q ss_pred CCCEEEEEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHEEEEEEEEE Q ss_conf 86201221022146785000221245578874530232446499971788777644536530260888253389999 Q gi|254780939|r 144 DWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWVEVGFVPEENLVGRASFVL 220 (248) Q Consensus 144 ~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~~~G~Vp~~~IvGka~~i~ 220 (248) ..|... . 
....-++|+|++|+|.|+.+ |.|||| ||+||+++|+|+|.=+| T Consensus 123 -------------~~GrpL--------p---~~~~~~~~~g~~fL~~~~~~-SfDsRY--FGpv~~s~IiG~a~Pl~ 172 (176) T PRK13838 123 -------------GEGRPL--------L---PFPGGVVPPGHLFLHSSFAG-SYDSRY--FGPVPASGILGLARPVL 172 (176) T ss_pred -------------CCCCCC--------C---CCCCEECCCCEEEEECCCCC-CCCCCC--CCCCCHHHEEEEEEEEE T ss_conf -------------899806--------6---60881827986998579998-765555--14066778369999827 No 11 >TIGR02754 sod_Ni_protease nickel-type superoxide dismutase maturation protease; InterPro: IPR014124 Members of this protein family are predicted proteases that are encoded adjacent to the genes for a nickel-type superoxide dismutase (IPR014123 from INTERPRO). This family of predicted peptidases belong to MEROPS peptidase subfamily S26A (signal peptidase I), which have a Ser/Lys catalytic dyad.. Probab=99.61 E-value=1.6e-15 Score=117.58 Aligned_cols=89 Identities=39% Similarity=0.564 Sum_probs=77.3 Q ss_pred EEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHEEE Q ss_conf 99878667764336988999840058766412211542222112555334704664014420031000000304234320 Q gi|254780939|r 38 SVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRISL 117 (248) Q Consensus 38 f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~i 117 (248) .+|.+.||.|||..+|+|||+=+.|.. 
+-++-|.|||..||..+....|||..| T Consensus 1 ~kV~G~SM~P~L~~~D~~~V~P~~~~~------------------r~~~~G~v~v~~HP~~p~~~~iKRL~~-------- 54 (90) T TIGR02754 1 LKVTGESMSPTLKPGDRILVRPLLKIA------------------RVPPIGEVVVVRHPLKPSLLIIKRLAA-------- 54 (90) T ss_pred CEECCCCCCCCCCCCCEEEEEECCCCC------------------CCCCCCEEEEEECCCCCCEEEEEEEEE-------- T ss_conf 944332157613988869883011023------------------487788089985698997478986132-------- Q ss_pred CCCCEEECCCCCCCCCCCCCEEECCCCCCEEEEEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCCCC Q ss_conf 07732453531224544640022147862012210221467850002212455788745302324464999717887776 Q gi|254780939|r 118 EKGIIYINGAPVVRHMEGYFSYHYKEDWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDKSK 197 (248) Q Consensus 118 ~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~ 197 (248) +-+|-.++||||-+.|- T Consensus 55 ---------------------------------------------------------------~~~nG~~~LGDnp~ASt 71 (90) T TIGR02754 55 ---------------------------------------------------------------VDDNGLFLLGDNPKAST 71 (90) T ss_pred ---------------------------------------------------------------ECCCCCEEECCCCCCCC T ss_conf ---------------------------------------------------------------31788388567377876 Q ss_pred CCCCCCCCCCCHHHEEEEEE Q ss_conf 44536530260888253389 Q gi|254780939|r 198 DSRWVEVGFVPEENLVGRAS 217 (248) Q Consensus 198 DSR~~~~G~Vp~~~IvGka~ 217 (248) ||| .||.||++.++|+|. 
T Consensus 72 DSR--~~G~v~~~~L~G~v~ 89 (90) T TIGR02754 72 DSR--QLGPVPRELLLGKVL 89 (90) T ss_pred CCH--HHCCCCCHHEEEEEE T ss_conf 712--105888202544785 No 12 >COG4959 TraF Type IV secretory pathway, protease TraF [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion] Probab=99.31 E-value=3.9e-12 Score=96.18 Aligned_cols=106 Identities=25% Similarity=0.439 Sum_probs=78.7 Q ss_pred CCC-CCCCEEEEECCCCC------------HHHEEEEEEEECHHHEEECCCCEEECCCCCCCCCCCCCEEECCCCCCEEE Q ss_conf 553-34704664014420------------03100000030423432007732453531224544640022147862012 Q gi|254780939|r 83 NQP-RRGDVVVFRYPKDP------------SIDYVKRVIGLPGDRISLEKGIIYINGAPVVRHMEGYFSYHYKEDWSSNV 149 (248) Q Consensus 83 ~~p-~RGDIVVF~~P~d~------------~~~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~ 149 (248) +.| ++||+|++.+|+.- ....+||+.|+|||+|.+..+.+.|||+++....... . T Consensus 51 ~~Pvt~g~lV~v~pP~~~a~~aA~RGYLp~~~pllK~i~Alpgq~Vci~~~~I~I~G~~v~~sl~~D-------~----- 118 (173) T COG4959 51 SAPVTKGDLVLVCPPQRAAFLAAQRGYLPPYIPLLKRILALPGQHVCITSQGIAIDGKPVAASLPVD-------R----- 118 (173) T ss_pred CCCCCCCCEEEECCCCHHHHHHHHCCCCCCCCHHHHHHHCCCCCCEEEECCEEEECCEEEEEECCCC-------C----- T ss_conf 7875368889987970676767653766666278998751799827873361789899812340225-------6----- Q ss_pred EEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHEEEEEEEEE Q ss_conf 21022146785000221245578874530232446499971788777644536530260888253389999 Q gi|254780939|r 150 PIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDKSKDSRWVEVGFVPEENLVGRASFVL 220 (248) Q Consensus 150 ~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS~DSR~~~~G~Vp~~~IvGka~~i~ 220 (248) .|.....+ +... 
.+-+++.|+|+|--..|.|||| ||+||.++|+|.|.=+| T Consensus 119 --------~GR~lp~~---------~gcR-~l~~~el~lL~~~~~~SfDsRY--fGpipas~vig~aRPvw 169 (173) T COG4959 119 --------VGRALPRW---------QGCR-YLAPSELLLLTDRSSTSFDSRY--FGPIPASQVIGVARPVW 169 (173) T ss_pred --------CCCCCCCC---------CCCC-EECCCEEEEEECCCCCCCCCCE--ECCCCHHHCCEEEEEEE T ss_conf --------67758763---------6872-5568727998456776554422--05667788101111100 No 13 >cd06462 Peptidase_S24_S26 The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the tr Probab=99.08 E-value=3.1e-10 Score=84.21 Aligned_cols=82 Identities=52% Similarity=0.846 Sum_probs=65.4 Q ss_pred EEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHEE Q ss_conf 89987866776433698899984005876641221154222211255533470466401442003100000030423432 Q gi|254780939|r 37 PSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRIS 116 (248) Q Consensus 37 ~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~ 116 (248) .++|.++||+||+..||.|+|++.. .+++.||+++|+.+. 
+..+|||++..++ T Consensus 2 ~~~v~gdSM~P~i~~Gd~v~vd~~~---------------------~~~~~Gdiv~~~~~~--~~~~iKrl~~~~~---- 54 (84) T cd06462 2 ALRVEGDSMEPTIPDGDLVLVDKSS---------------------YEPKRGDIVVFRLPG--GELTVKRVIGLPG---- 54 (84) T ss_pred EEEEECCCCCCHHHCCCEEEEECCC---------------------CCCCCCCEEEEEECC--CEEEEEEEEEECC---- T ss_conf 7996567771046489899997356---------------------657899699999789--9399999998889---- Q ss_pred ECCCCEEECCCCCCCCCCCCCEEECCCCCCEEEEEECCCCCCCCCCCEEECCCCCCCCCCCEEECCCCEEEEEECCCCCC Q ss_conf 00773245353122454464002214786201221022146785000221245578874530232446499971788777 Q gi|254780939|r 117 LEKGIIYINGAPVVRHMEGYFSYHYKEDWSSNVPIFQEKLSNGVLYNVLSQDFLAPSSNISEFLVPKGHYFMMGDNRDKS 196 (248) Q Consensus 117 i~~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~~~~~e~l~~~~~~~~~~~~~~~~~~~~~~~~VP~g~yfvlGDNRdnS 196 (248) ++++++.||| .++ T Consensus 55 ------------------------------------------------------------------~~~~~l~~dN-~~~ 67 (84) T cd06462 55 ------------------------------------------------------------------EGHYFLLGDN-PNS 67 (84) T ss_pred ------------------------------------------------------------------CCCEEEECCC-CCC T ss_conf ------------------------------------------------------------------9989997989-999 Q ss_pred CCCCCCCCCCCCHHHEEEE Q ss_conf 6445365302608882533 Q gi|254780939|r 197 KDSRWVEVGFVPEENLVGR 215 (248) Q Consensus 197 ~DSR~~~~G~Vp~~~IvGk 215 (248) .|++. ++. +...++|+ T Consensus 68 ~~~~~--~~~-~~~~~~g~ 83 (84) T cd06462 68 PDSRI--DGP-PELDIVGV 83 (84) T ss_pred CCCCC--CCC-CCEEEEEE T ss_conf 88124--899-97299998 No 14 >TIGR02228 sigpep_I_arch signal peptidase I; InterPro: IPR001733 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. 
Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of serine peptidases belong to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26B. Eukaryotic microsomal signal peptidase is involved in the removal of signal peptides from secretory proteins as they pass into the endoplasmic reticulum lumen . The peptidase is more complex than its mitochondrial and bacterial counterparts, containing a number of subunits, ranging from two in the chicken oviduct peptidase, to five in the dog pancreas protein . They share sequence similarity with the bacterial leader peptidases (family S26A), although activity here is mediated by a serine/histidine dyad rather than a serine/lysine dyad . Archaeal signal peptidases also belong to this group. ; GO: 0008233 peptidase activity, 0006465 signal peptide processing, 0006508 proteolysis, 0016020 membrane. Probab=98.96 E-value=1.7e-09 Score=79.51 Aligned_cols=80 Identities=29% Similarity=0.414 Sum_probs=57.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHEE------------EEEEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCC Q ss_conf 35878999999999999887516------------898998786677643369889998400587664122115422221 Q gi|254780939|r 12 FGSDTLKSILQALFFAILIRTFL------------FQPSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGR 79 (248) Q Consensus 12 f~~e~i~~l~~~i~i~~~ir~fv------------~~~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~ 79 (248) .++|.+-.++++++++++.-.+. 
+|.+.|-|+||||+++.||.+++.-- T Consensus 2 ~~~~~i~~~~~~ll~~~l~~~l~~~~~~~~p~V~GY~~kSVlSgSMEP~f~~Gd~~~~~~~------------------- 62 (175) T TIGR02228 2 KISNVIYVILIILLVILLLVGLVSKASGPDPVVVGYQLKSVLSGSMEPTFNTGDLILVTGK------------------- 62 (175) T ss_pred CHHHHHHHHHHHHHHHHHHHEEEEEECCCCCEEEEEEEEEEEEEEECCEEECCEEEEEECC------------------- T ss_conf 0233235578999998754315466528972798579898985543062521309998042------------------- Q ss_pred CCCCCCCCCCEEEEECCCCC--HHHEEEEEEEE Q ss_conf 12555334704664014420--03100000030 Q gi|254780939|r 80 IFNNQPRRGDVVVFRYPKDP--SIDYVKRVIGL 110 (248) Q Consensus 80 i~~~~p~RGDIVVF~~P~d~--~~~yVKRvIGl 110 (248) ...++.|+||||+|+.+..+ +..-+-||+++ T Consensus 63 ~~~~~~~~GDvI~y~~~~~~WyG~~v~HRv~~~ 95 (175) T TIGR02228 63 VDPEDIQVGDVIVYKSEGKRWYGTPVIHRVIEI 95 (175) T ss_pred CCHHHCCCCCEEEEEECCCCCCCEEEEEEEEEE T ss_conf 176663015679992469952353899999988 No 15 >pfam00717 Peptidase_S24 Peptidase S24-like. Probab=98.74 E-value=1.2e-08 Score=74.35 Aligned_cols=56 Identities=52% Similarity=0.899 Sum_probs=47.0 Q ss_pred EECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHEEEC Q ss_conf 98786677643369889998400587664122115422221125553347046640144200310000003042343200 Q gi|254780939|r 39 VIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRISLE 118 (248) Q Consensus 39 ~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~i~ 118 (248) +|.++||+||++.||.|+|++- .++++||+|+|..+.++ .+|||+...+|+.+.+. 
T Consensus 1 ~V~GdSM~p~i~~Gd~viv~~~----------------------~~~~~Gdivv~~~~~~~--~~iKrl~~~~~~~~l~~ 56 (67) T pfam00717 1 RVPGDSMEPTIPDGDLLLVDKT----------------------SEPKRGDIVVARLPGEE--AYVKRLIGLPGDIILLP 56 (67) T ss_pred CCCCCCCCCCCCCCCEEEEEEC----------------------CCCCCCCEEEEEECCCC--EEEEEEEEECCEEEEEE T ss_conf 9717887557149999999834----------------------64546959999989995--69999996599299981 No 16 >COG2932 Predicted transcriptional regulator [Transcription] Probab=98.04 E-value=9e-06 Score=56.14 Aligned_cols=57 Identities=33% Similarity=0.442 Sum_probs=42.8 Q ss_pred EEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHEE Q ss_conf 89987866776433698899984005876641221154222211255533470466401442003100000030423432 Q gi|254780939|r 37 PSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRIS 116 (248) Q Consensus 37 ~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~ 116 (248) ...|-++||+|++..||.+||+. .....+||.|++.. +++..||||+...||=.++ T Consensus 125 ~i~V~GDSMeP~~~~Gd~ilVd~----------------------~~~~~~gd~v~v~~--~g~~~~VK~l~~~~~~~~~ 180 (214) T COG2932 125 ALRVTGDSMEPTYEDGDTLLVDP----------------------GVNTRRGDRVYVET--DGGELYVKKLQREPGGLLR 180 (214) T ss_pred EEEEECCCCCCCCCCCCEEEEEC----------------------CCCEEECCEEEEEE--ECCEEEEEEEEEECCCEEE T ss_conf 99996776663015999999978----------------------98425199999999--5994889999995798799 Q ss_pred E Q ss_conf 0 Q gi|254780939|r 117 L 117 (248) Q Consensus 117 i 117 (248) + T Consensus 181 l 181 (214) T COG2932 181 L 181 (214) T ss_pred E T ss_conf 9 No 17 >cd06529 S24_LexA-like Peptidase S24 LexA-like proteins are involved in the SOS response leading to the repair of single-stranded DNA within the bacterial cell. 
This family includes: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The LexA-like proteins contain two-domains: an N-terminal DNA binding domain and a C-terminal domain (CTD) that provides LexA dimerization as well as cleavage activity. They undergo autolysis, cleaving at an Ala-Gly or a Cys-Gly bond, separating the DNA-binding domain from the rest of the Probab=97.82 E-value=4.4e-05 Score=51.79 Aligned_cols=56 Identities=36% Similarity=0.488 Sum_probs=43.9 Q ss_pred EEEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHEE Q ss_conf 89987866776433698899984005876641221154222211255533470466401442003100000030423432 Q gi|254780939|r 37 PSVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRIS 116 (248) Q Consensus 37 ~f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~ 116 (248) ..+|.++||+|++..||.|+|++- .+++.||++++.... ..+|||+.-.+++.+. 
T Consensus 2 ~l~v~GdSM~P~i~~Gd~vivd~~----------------------~~~~~g~i~vv~~~~---~~~iKrl~~~~~~~~~ 56 (81) T cd06529 2 ALRVKGDSMEPTIPDGDLVLVDPS----------------------DTPRDGDIVVARLDG---ELTVKRLQRRGGGRLR 56 (81) T ss_pred EEEEECCCCCCCCCCCCEEEEECC----------------------CCCCCCCEEEEEECC---CCEEEEEEECCCCEEE T ss_conf 899967747714069999999377----------------------504799899999759---8069999991896399 Q ss_pred E Q ss_conf 0 Q gi|254780939|r 117 L 117 (248) Q Consensus 117 i 117 (248) + T Consensus 57 L 57 (81) T cd06529 57 L 57 (81) T ss_pred E T ss_conf 9 No 18 >KOG3342 consensus Probab=97.36 E-value=0.00015 Score=48.54 Aligned_cols=52 Identities=37% Similarity=0.542 Sum_probs=36.1 Q ss_pred EEECCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEE Q ss_conf 9987866776433698899984005876641221154222211255533470466401442003100000030 Q gi|254780939|r 38 SVIPSGSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGL 110 (248) Q Consensus 38 f~Ips~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGl 110 (248) ..|-|+||||..+.||.+|..... ....+-||||||+-+. 
.....|-|||-+ T Consensus 51 VVVLSgSMePaF~RGDlLfL~N~~--------------------~~p~~vGdivVf~veg-R~IPiVHRviK~ 102 (180) T KOG3342 51 VVVLSGSMEPAFHRGDLLFLTNRN--------------------EDPIRVGDIVVFKVEG-REIPIVHRVIKQ 102 (180) T ss_pred EEEECCCCCCCCCCCCEEEEECCC--------------------CCCCEECCEEEEEECC-CCCCHHHHHHHH T ss_conf 999738867551257489985588--------------------9964005489998889-247511788998 No 19 >PRK00215 LexA repressor; Validated Probab=96.74 E-value=0.0033 Score=40.00 Aligned_cols=53 Identities=34% Similarity=0.384 Sum_probs=38.8 Q ss_pred EEEECCCCC-CCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHE Q ss_conf 899878667-7643369889998400587664122115422221125553347046640144200310000003042343 Q gi|254780939|r 37 PSVIPSGSM-IPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRI 115 (248) Q Consensus 37 ~f~Ips~SM-~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV 115 (248) ..+|.++|| ++.++.||.++|.|- .+++-|||||... +. ...+||.-- -|+.+ T Consensus 119 ~LrV~GdSMi~~~I~dGD~viV~~~----------------------~~~~~G~Ivva~i--~~-e~tlKr~~~-~~~~i 172 (204) T PRK00215 119 LLRVRGDSMIDAGILDGDLVIVRKQ----------------------QTARNGQIVVALI--DD-EATVKRFRR-EGGHI 172 (204) T ss_pred EEEECCCCCCCCCCCCCCEEEEECC----------------------CCCCCCCEEEEEE--CC-CCEEEEEEE-ECCEE T ss_conf 9996378766579899999999578----------------------9688996999997--37-758999999-79999 No 20 >COG0681 LepB Signal peptidase I [Intracellular trafficking and secretion] Probab=96.27 E-value=0.0041 Score=39.40 Aligned_cols=28 Identities=57% Similarity=0.934 Sum_probs=14.9 Q ss_pred HHEEEEEEEECHHHEEECCCCEEECCCC Q ss_conf 3100000030423432007732453531 Q gi|254780939|r 101 IDYVKRVIGLPGDRISLEKGIIYINGAP 128 (248) Q Consensus 101 ~~yVKRvIGlPGDtV~i~~~~l~INg~~ 128 (248) ..++||++++|||.+...+..+++||++ T Consensus 138 ~~~~~~~~~~~gd~~~~~~~~~~~~g~~ 165 (166) T COG0681 138 KDYIKRVIGLPGDNILYTDDDLPINGKP 165 (166) T ss_pred CCCCEEEEECCCCCEEEECCCEEECCCC T ss_conf 2332028971476357504534478810 No 21 >PRK10276 DNA polymerase V 
subunit UmuD; Provisional Probab=96.24 E-value=0.0088 Score=37.34 Aligned_cols=45 Identities=27% Similarity=0.298 Sum_probs=36.3 Q ss_pred EEECCCCC-CCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEE Q ss_conf 99878667-76433698899984005876641221154222211255533470466401442003100000 Q gi|254780939|r 38 SVIPSGSM-IPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRV 107 (248) Q Consensus 38 f~Ips~SM-~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRv 107 (248) .+|-++|| +..+..||.++|+|- .+|+.|||||..- | +...|||. T Consensus 54 lrV~GdSMi~agI~dGDiliVdr~----------------------~~~~~GdIVva~i--d-ge~tvKrl 99 (139) T PRK10276 54 VKASGDSMIDAGISDGDLLIVDSS----------------------ITASHGDIVIAAV--D-GEFTVKKL 99 (139) T ss_pred EEEECCCCCCCCCCCCCEEEEEEC----------------------CCCCCCCEEEEEE--C-CEEEEEEE T ss_conf 997268734488899899999405----------------------9877899999998--9-98899999 No 22 >PHA00361 cI Repressor Probab=96.20 E-value=0.011 Score=36.73 Aligned_cols=56 Identities=29% Similarity=0.393 Sum_probs=39.7 Q ss_pred HHEEEEEEEECCCCCCC----CCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEE Q ss_conf 75168989987866776----43369889998400587664122115422221125553347046640144200310000 Q gi|254780939|r 31 RTFLFQPSVIPSGSMIP----TLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKR 106 (248) Q Consensus 31 r~fv~~~f~Ips~SM~P----TL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKR 106 (248) ++|. .+|.+.||+| ++..||.|+|+. -.+|+-||+||-+-..+... 
-+|| T Consensus 72 ~~F~---L~V~GdSM~~p~g~~~~~Gd~iiVdp----------------------~~~~~~G~~VvA~~~~~~ea-T~K~ 125 (165) T PHA00361 72 RTFW---LEVEGDSMTAPTGLSFPEGDSILVDP----------------------EVEAEPGDLVIARLEGASEA-TFKK 125 (165) T ss_pred CEEE---EEEECCCCCCCCCCCCCCCCEEEECC----------------------CCCCCCCCEEEEEECCCCCE-EEEE T ss_conf 8899---99966767887568738999999827----------------------87578899999996899834-8999 Q ss_pred EEEECH Q ss_conf 003042 Q gi|254780939|r 107 VIGLPG 112 (248) Q Consensus 107 vIGlPG 112 (248) .+--.| T Consensus 126 l~~d~~ 131 (165) T PHA00361 126 LIIDGG 131 (165) T ss_pred EEEECC T ss_conf 999599 No 23 >PRK12423 LexA repressor; Provisional Probab=95.47 E-value=0.033 Score=33.72 Aligned_cols=54 Identities=35% Similarity=0.444 Sum_probs=38.1 Q ss_pred EEECCCCCCC-CCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEEEECHHHEE Q ss_conf 9987866776-433698899984005876641221154222211255533470466401442003100000030423432 Q gi|254780939|r 38 SVIPSGSMIP-TLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRIS 116 (248) Q Consensus 38 f~Ips~SM~P-TL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~ 116 (248) .+|-++||.. .++.||.|+|.|- ..++-|||||..- |. ..-+||.- .-|+.+. T Consensus 117 LrV~GdSMi~~gI~dGD~viV~~~----------------------~~~~~GdIVvA~i--dg-E~TlKr~~-~~~~~i~ 170 (202) T PRK12423 117 LQVQGDSMIDDGILDGDLVGVHRS----------------------PEARDGQIVVARL--DG-EVTIKRLE-RGADRIR 170 (202) T ss_pred EEECCCCCCCCCCCCCCEEEEECC----------------------CCCCCCCEEEEEE--CC-EEEEEEEE-EECCEEE T ss_conf 998888654489689999999636----------------------8789996999998--99-28999999-9899999 Q ss_pred E Q ss_conf 0 Q gi|254780939|r 117 L 117 (248) Q Consensus 117 i 117 (248) . 
T Consensus 171 L 171 (202) T PRK12423 171 L 171 (202) T ss_pred E T ss_conf 9 No 24 >COG1974 LexA SOS-response transcriptional repressors (RecA-mediated autopeptidases) [Transcription / Signal transduction mechanisms] Probab=92.56 E-value=0.29 Score=27.76 Aligned_cols=47 Identities=30% Similarity=0.440 Sum_probs=31.2 Q ss_pred EEECCCCCCCC-CCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCHHHEEEEEE Q ss_conf 99878667764-336988999840058766412211542222112555334704664014420031000000 Q gi|254780939|r 38 SVIPSGSMIPT-LLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQPRRGDVVVFRYPKDPSIDYVKRVI 108 (248) Q Consensus 38 f~Ips~SM~PT-L~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p~RGDIVVF~~P~d~~~~yVKRvI 108 (248) -+|.++||..- +..||.|+|.+ ..+.+-|||||-.-.. ...=+||.. T Consensus 115 L~V~GdSM~~~gi~dGDlvvV~~----------------------~~~a~~GdiVvA~i~g--~e~TvKrl~ 162 (201) T COG1974 115 LRVSGDSMIDAGILDGDLVVVDP----------------------TEDAENGDIVVALIDG--EEATVKRLY 162 (201) T ss_pred EEECCCCCCCCCCCCCCEEEECC----------------------CCCCCCCCEEEEECCC--CCEEEEEEE T ss_conf 99458850027788898999838----------------------8877799789998389--817899999 No 25 >TIGR02896 spore_III_AF stage III sporulation protein AF; InterPro: IPR014245 This family represents the stage III sporulation protein AF (SpoIIIAF) of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of these proteins is poorly conserved. . 
Probab=84.72 E-value=1.1 Score=24.06 Aligned_cols=31 Identities=19% Similarity=0.402 Sum_probs=26.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHEEEEEEEECCCCCCC Q ss_conf 0358789999999999998875168989987866776 Q gi|254780939|r 11 IFGSDTLKSILQALFFAILIRTFLFQPSVIPSGSMIP 47 (248) Q Consensus 11 ~f~~e~i~~l~~~i~i~~~ir~fv~~~f~Ips~SM~P 47 (248) .|..||+++++.+++++.++-..+ |+++|-- T Consensus 1 ~~L~~Wv~~i~~~~llat~~e~LL------P~~~lkK 31 (113) T TIGR02896 1 EFLKEWVTNIIVLILLATILEMLL------PNSSLKK 31 (113) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHC------CCCCCCH T ss_conf 938999999999999999999976------7998517 No 26 >COG3602 Uncharacterized protein conserved in bacteria [Function unknown] Probab=64.25 E-value=4 Score=20.64 Aligned_cols=17 Identities=47% Similarity=0.804 Sum_probs=15.0 Q ss_pred CCCCCCCCCCCEEEEEC Q ss_conf 66776433698899984 Q gi|254780939|r 43 GSMIPTLLVGDYIIVNK 59 (248) Q Consensus 43 ~SM~PTL~~GD~i~VnK 59 (248) .||.|-|+.||++|..- T Consensus 12 ~smtPeL~~G~yVfcT~ 28 (134) T COG3602 12 ASMTPELLDGDYVFCTV 28 (134) T ss_pred HHCCCCCCCCCEEEEEE T ss_conf 96591005896699984 No 27 >COG1969 HyaC Ni,Fe-hydrogenase I cytochrome b subunit [Energy production and conversion] Probab=64.21 E-value=6.4 Score=19.32 Aligned_cols=34 Identities=15% Similarity=0.097 Sum_probs=27.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHEEEEEEEECCCCCCCC Q ss_conf 587899999999999988751689899878667764 Q gi|254780939|r 13 GSDTLKSILQALFFAILIRTFLFQPSVIPSGSMIPT 48 (248) Q Consensus 13 ~~e~i~~l~~~i~i~~~ir~fv~~~f~Ips~SM~PT 48 (248) ++-|++++.++++++... |+..||.-||.|=|+| T Consensus 20 lwHWv~alsi~vL~~TGy--yIg~pf~~Pss~geat 53 (227) T COG1969 20 LWHWVTALSIVVLIVTGY--YIGYPFLLPSSSGEAT 53 (227) T ss_pred HHHHHHHHHHHHHHHHCC--EECCCCCCCCCCCCHH T ss_conf 899999999999998231--5525434777777201 No 28 >pfam11101 DUF2884 Protein of unknown function (DUF2884). Some members in this bacterial family of proteins are annotated as YggN which currently has no known function. 
Probab=63.84 E-value=4.9 Score=20.05 Aligned_cols=27 Identities=22% Similarity=0.500 Sum_probs=22.7 Q ss_pred EEEEECHHHEEE-CCCCEEECCCCCCCC Q ss_conf 000304234320-077324535312245 Q gi|254780939|r 106 RVIGLPGDRISL-EKGIIYINGAPVVRH 132 (248) Q Consensus 106 RvIGlPGDtV~i-~~~~l~INg~~i~~~ 132 (248) +|++-.|+++.| .+|.|||||+.+..+ T Consensus 17 ~i~~~~~~~~~I~~~g~L~v~G~~v~L~ 44 (229) T pfam11101 17 EVVDKSGPKYQIDEDGQLFVDGKWVTLS 44 (229) T ss_pred EEEECCCCEEEECCCCCEEECCEECCCC T ss_conf 9997899658984999678999867799 No 29 >pfam10000 DUF2241 Uncharacterized protein conserved in bacteria (DUF2241). This domain, found in various hypothetical bacterial proteins, has no known function. Probab=60.28 E-value=5.2 Score=19.87 Aligned_cols=16 Identities=38% Similarity=0.657 Sum_probs=14.9 Q ss_pred CCCCCCCCCCCEEEEE Q ss_conf 6677643369889998 Q gi|254780939|r 43 GSMIPTLLVGDYIIVN 58 (248) Q Consensus 43 ~SM~PTL~~GD~i~Vn 58 (248) .||.|.|..|+|+|++ T Consensus 12 ~~m~P~L~~~~yVF~t 27 (72) T pfam10000 12 ASMSPELDDGEYVFCT 27 (72) T ss_pred HHCCCEECCCCEEEEE T ss_conf 6599377899689999 No 30 >TIGR01237 D1pyr5carbox2 delta-1-pyrroline-5-carboxylate dehydrogenase, putative; InterPro: IPR005932 The delta(1)-pyrroline-5-carboxylate synthetase (1.5.1.12 from EC) a mitochondrial inner membrane, ATP- and NADPH-dependent, bifunctional enzyme, catalyzes the reduction of glutamate to delta1-pyrroline-5-carboxylate, a critical step in the de novo biosynthesis of proline and ornithine. It is the rate-limiting enzyme in proline biosynthesis and is subject to feedback inhibition by proline.1-pyrroline-5-carboxylate + NAD+ + H2O = L-glutamate + NADH This model represents one of several related branches of delta-1-pyrroline-5-carboxylate dehydrogenase.; GO: 0003842 1-pyrroline-5-carboxylate dehydrogenase activity, 0006561 proline biosynthetic process. 
Probab=55.87 E-value=5.7 Score=19.65 Aligned_cols=21 Identities=33% Similarity=0.645 Sum_probs=16.3 Q ss_pred CCHHHEEEEEEE-ECH-HHEEEC Q ss_conf 200310000003-042-343200 Q gi|254780939|r 98 DPSIDYVKRVIG-LPG-DRISLE 118 (248) Q Consensus 98 d~~~~yVKRvIG-lPG-DtV~i~ 118 (248) +|++.++||||+ |.| |+|-++ T Consensus 273 QPGQkhlKRVIaEmGGKd~~iVD 295 (518) T TIGR01237 273 QPGQKHLKRVIAEMGGKDAVIVD 295 (518) T ss_pred CCCCCEEEEEEEEECCCCEEEEC T ss_conf 98850241133320788507875 No 31 >pfam02836 Glyco_hydro_2_C Glycosyl hydrolases family 2, TIM barrel domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities. Probab=54.64 E-value=5.7 Score=19.64 Aligned_cols=28 Identities=25% Similarity=0.361 Sum_probs=13.3 Q ss_pred CEEEEEECCCCCCCCCCCCCCCCCCHHH Q ss_conf 6499971788777644536530260888 Q gi|254780939|r 184 GHYFMMGDNRDKSKDSRWVEVGFVPEEN 211 (248) Q Consensus 184 g~yfvlGDNRdnS~DSR~~~~G~Vp~~~ 211 (248) .+++-.||-.+.-.|.++..-|.|-.++ T Consensus 252 ~~~w~g~Df~~~~~d~~~~~nGlv~~dR 279 (297) T pfam02836 252 ELYAYGGDFGDRPSDYRFCGNGLFFADR 279 (297) T ss_pred EEEEECCCCCCCCCCCCCCCCCCCCCCC T ss_conf 7999682118999988736585789999 No 32 >KOG1247 consensus Probab=50.85 E-value=9.7 Score=18.19 Aligned_cols=41 Identities=20% Similarity=0.379 Sum_probs=27.8 Q ss_pred CCC-CCCHHHEEEEEEEEEEECCCCCCCCCCCCCCCCCCHHHCCC Q ss_conf 530-26088825338999974378877543555456701334223 Q gi|254780939|r 203 EVG-FVPEENLVGRASFVLFSIGGDTPFSKVWLWIPNMRWDRLFK 246 (248) Q Consensus 203 ~~G-~Vp~~~IvGka~~i~~S~d~~~~~~~~~~~~~~iRw~R~f~ 246 (248) .|| +||-+.-.+|+..+|| |...++.+..+... 
--|++.+| T Consensus 250 kWGtpVPle~fk~KVfYVWF--DA~IGYlsit~~yt-~ew~kWwk 291 (567) T KOG1247 250 KWGTPVPLEKFKDKVFYVWF--DAPIGYLSITKNYT-DEWEKWWK 291 (567) T ss_pred CCCCCCCHHHHCCCEEEEEE--CCCCEEEEEEHHHH-HHHHHHHC T ss_conf 56887674551662799997--37513788506666-78999846 No 33 >TIGR01718 Uridine-psphlse uridine phosphorylase; InterPro: IPR010058 This entry represents a family of bacterial and archaeal uridine phosphorylases unrelated to the mammalian enzymes of the same name. The Escherichia coli , Salmonella and Klebsiella genes have been characterised. Sequences from Clostridium, Streptomyces, Treponema, Aeropyrum and Pyrobaculum are also included in this family, but it does not include related sequences from Halobacterium, which are more distantly related and represent enzymes with a slightly different substrate specificity. Also distantly related is a clade of archaeal sequences which are related to the DeoD family of inosine phosphorylases (IPR004402 from INTERPRO) as they are to these uridine phosphorylases. This clade includes a characterised protein from Sulfolobus solfataricus which has been miss-named as a methylthioadenosine phosphorylase, but which acts on inosine and guanosine - it is unclear whether uridine has been evaluated as a substrate .; GO: 0004850 uridine phosphorylase activity, 0009166 nucleotide catabolic process, 0005737 cytoplasm. Probab=49.40 E-value=14 Score=17.19 Aligned_cols=31 Identities=26% Similarity=0.511 Sum_probs=25.9 Q ss_pred HHHHHHHHHEEEEEEEEC-CCCCCCCCCCCCEEEEE Q ss_conf 999998875168989987-86677643369889998 Q gi|254780939|r 24 LFFAILIRTFLFQPSVIP-SGSMIPTLLVGDYIIVN 58 (248) Q Consensus 24 i~i~~~ir~fv~~~f~Ip-s~SM~PTL~~GD~i~Vn 58 (248) =++.+.++||+ ||= ||.|.|-++.||.|+.. 
T Consensus 75 EL~~lGa~TFi----RvGTtGa~qphI~~Gdv~i~t 106 (248) T TIGR01718 75 ELLYLGADTFI----RVGTTGALQPHIKVGDVVIAT 106 (248) T ss_pred HHHHHCCCEEE----EECCCCCCCCCCCCCCEEEEE T ss_conf 99973401255----422666532322004145552 No 34 >PRK09919 hypothetical protein; Provisional Probab=44.97 E-value=11 Score=17.91 Aligned_cols=24 Identities=21% Similarity=0.102 Sum_probs=19.8 Q ss_pred EECHHHEEECCCCEEECCCCCCCC Q ss_conf 304234320077324535312245 Q gi|254780939|r 109 GLPGDRISLEKGIIYINGAPVVRH 132 (248) Q Consensus 109 GlPGDtV~i~~~~l~INg~~i~~~ 132 (248) =.|||.+...+..+.|||++..-+ T Consensus 39 L~pG~~i~~~~~gvliN~k~~~it 62 (114) T PRK09919 39 LPPGSIFTPVKSGILLNDKEYPIT 62 (114) T ss_pred ECCCCEEEECCCEEEECCCEEEEE T ss_conf 089988897488389889386678 No 35 >cd02776 MopB_CT_Nitrate-R-NarG-like Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in prokaryotes and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. This CD (MopB_CT_Nitrate-R-NarG-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. Probab=44.13 E-value=23 Score=15.78 Aligned_cols=36 Identities=28% Similarity=0.425 Sum_probs=22.7 Q ss_pred CCCCCCEEEEECCCCCHHHEEEEEEEECHHHEEECC--CCEEE Q ss_conf 533470466401442003100000030423432007--73245 Q gi|254780939|r 84 QPRRGDVVVFRYPKDPSIDYVKRVIGLPGDRISLEK--GIIYI 124 (248) Q Consensus 84 ~p~RGDIVVF~~P~d~~~~yVKRvIGlPGDtV~i~~--~~l~I 124 (248) +-+||.-+|.-+|+|-...=|| -||.|++-| |.+.. 
T Consensus 25 ~L~rg~P~vwinp~DA~~~GI~-----DgD~Vev~N~~G~v~a 62 (141) T cd02776 25 RLQRGGPVVWMNPKDAAELGIK-----DNDWVEVFNDNGVVVA 62 (141) T ss_pred HCCCCCCEEEECHHHHHHCCCC-----CCCEEEEECCCCEEEE T ss_conf 5001897799999999886998-----8999999848925999 No 36 >TIGR01704 MTA/SAH-Nsdase MTA/SAH nucleosidase; InterPro: IPR010049 This entry represents the enzyme 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase which acts on its two substrates at the same active site. This enzyme is involved in the recycling of the components of S-adenosylmethionine after it has donated one of its two non-ribose sulphur ligands to an acceptor. In the case of 5'-methylthioadenosine this represents the first step of the methionine salvage pathway in bacteria , ,]. This enzyme is widely distributed in bacteria. ; GO: 0008782 adenosylhomocysteine nucleosidase activity, 0008930 methylthioadenosine nucleosidase activity, 0009164 nucleoside catabolic process, 0019509 methionine salvage. Probab=43.83 E-value=17 Score=16.63 Aligned_cols=42 Identities=29% Similarity=0.503 Sum_probs=25.3 Q ss_pred HHHHHHHHHHHHHEEEEEEEECCC---CCCCCCCCCCEEEEECCCC Q ss_conf 999999999887516898998786---6776433698899984005 Q gi|254780939|r 20 ILQALFFAILIRTFLFQPSVIPSG---SMIPTLLVGDYIIVNKFSY 62 (248) Q Consensus 20 l~~~i~i~~~ir~fv~~~f~Ips~---SM~PTL~~GD~i~VnK~~Y 62 (248) +-.|+-..||+--+= --|.|.|| -+.|||..||.++-+...| T Consensus 52 V~AA~~~TLLL~~~K-PD~~INTGSAGGl~~TL~VGD~V~S~~~R~ 96 (229) T TIGR01704 52 VAAALSATLLLDRYK-PDVVINTGSAGGLAHTLKVGDVVVSDDVRY 96 (229) T ss_pred HHHHHHHHHHHHCCC-CCEEEECCCCCCCCCCCCCCCEEEECCCEE T ss_conf 899988888875079-976985887544233132056787167402 No 37 >TIGR00185 rRNA_methyl_2 RNA methyltransferase, TrmH family, group 2; InterPro: IPR004440 The RNA methyltransferase, TrmH family, group 2 are part of the trmH (spoU) family of rRNA methylases that are involved in tRNA and rRNA base modification.; GO: 0008173 RNA methyltransferase activity, 0009451 RNA modification. 
Probab=43.31 E-value=24 Score=15.70 Aligned_cols=103 Identities=13% Similarity=0.104 Sum_probs=49.4 Q ss_pred CEEEEECCCCCHHHEEEEEEEECHHHEEEC-CCCEEECCCCCCCCCCCCCEEECCCCCCEEEEEECCCCCCCC-CCCEEE Q ss_conf 046640144200310000003042343200-773245353122454464002214786201221022146785-000221 Q gi|254780939|r 89 DVVVFRYPKDPSIDYVKRVIGLPGDRISLE-KGIIYINGAPVVRHMEGYFSYHYKEDWSSNVPIFQEKLSNGV-LYNVLS 166 (248) Q Consensus 89 DIVVF~~P~d~~~~yVKRvIGlPGDtV~i~-~~~l~INg~~i~~~~~~~~~~~~~~~~~~~~~~~~e~l~~~~-~~~~~~ 166 (248) +||.|.+.-.+.+-=|=|..|-=|=++++. -=....|+|.+.|.-..+++...-..=..-... -|....++ ..-.+. T Consensus 3 ~iVLy~PeIP~NTGNI~R~Caat~~~LHLi~PlGF~~~DK~L~RAGLdyw~fv~~~~H~s~E~f-le~~~~~~~nlf~lT 81 (161) T TIGR00185 3 NIVLYEPEIPPNTGNIVRTCAATGTRLHLIKPLGFELDDKRLKRAGLDYWEFVQLFYHKSWEEF-LEAEKPQKGNLFLLT 81 (161) T ss_pred EEEEECCCCCCCCCHHHHHHHCCCCEEEEECCCCCCCCCCEEEECCCCCCCCEEEEECCCHHHH-HHHCCCCCEEEEEEE T ss_conf 5774078897884112010111586245660578620781423147874452323562556888-863389971688884 Q ss_pred CCC--CCCCCCCCEEECCCCEEEEEECCC Q ss_conf 245--578874530232446499971788 Q gi|254780939|r 167 QDF--LAPSSNISEFLVPKGHYFMMGDNR 193 (248) Q Consensus 167 ~~~--~~~~~~~~~~~VP~g~yfvlGDNR 193 (248) ... ..+.. .-.|+-.+.+|||+|.-- T Consensus 82 ~~G~~t~~~~-~~~~~~~d~~yl~fG~ET 109 (161) T TIGR00185 82 KKGDKTPDHI-SVTYQDGDELYLVFGQET 109 (161) T ss_pred ECCCCCCCCE-EEEECCCCCEEEEECCCC T ss_conf 0388774504-665437861699836877 No 38 >PRK09525 lacZ beta-D-galactosidase; Reviewed Probab=42.48 E-value=12 Score=17.57 Aligned_cols=17 Identities=29% Similarity=0.927 Sum_probs=11.0 Q ss_pred HEEECCCCEEECCCCCC Q ss_conf 43200773245353122 Q gi|254780939|r 114 RISLEKGIIYINGAPVV 130 (248) Q Consensus 114 tV~i~~~~l~INg~~i~ 130 (248) +|+++|+.++|||+++. 
T Consensus 335 ~iei~~~~l~vNG~~i~ 351 (1027) T PRK09525 335 KVEIENGLLKLNGKPLL 351 (1027) T ss_pred EEEEECCEEEECCCEEE T ss_conf 99997999999996899 No 39 >COG0361 InfA Translation initiation factor 1 (IF-1) [Translation, ribosomal structure and biogenesis] Probab=41.34 E-value=21 Score=16.05 Aligned_cols=13 Identities=31% Similarity=0.537 Sum_probs=6.0 Q ss_pred CCCCCEEEEECCC Q ss_conf 3369889998400 Q gi|254780939|r 49 LLVGDYIIVNKFS 61 (248) Q Consensus 49 L~~GD~i~VnK~~ 61 (248) +..||.|+|.-+. T Consensus 47 I~~GD~V~Ve~~~ 59 (75) T COG0361 47 ILPGDVVLVELSP 59 (75) T ss_pred ECCCCEEEEEECC T ss_conf 5799999997456 No 40 >TIGR00915 2A0602 RND transporter, hydrophobe/amphiphile efflux-1 (HAE1) family; InterPro: IPR004764 Hydrophobe/amphiphile efflux-1 HAE1 is involved in toxin production and resistance processes.; GO: 0005215 transporter activity, 0006810 transport, 0016021 integral to membrane. Probab=41.34 E-value=8.4 Score=18.57 Aligned_cols=15 Identities=27% Similarity=0.359 Sum_probs=10.5 Q ss_pred CCCCCCCEEEEECCC Q ss_conf 553347046640144 Q gi|254780939|r 83 NQPRRGDVVVFRYPK 97 (248) Q Consensus 83 ~~p~RGDIVVF~~P~ 97 (248) ++.|-+-|+-|.+|. T Consensus 659 G~ikdA~v~a~~pPa 673 (1058) T TIGR00915 659 GQIKDAMVIAFVPPA 673 (1058) T ss_pred CCCCCCEEECCCCCC T ss_conf 768763174046864 No 41 >cd04438 DEP_dishevelled DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in dishevelled-like proteins. Dishevelled-like proteins play a key role in the transduction of the Wnt signal from the cell surface to the nucleus, which in turn is an important regulatory pathway for cellular development and growth. They contain an N-terminal DIX domain, a central PDZ domain, and a C-terminal DEP domain. 
Probab=38.98 E-value=15 Score=16.98 Aligned_cols=13 Identities=23% Similarity=0.680 Sum_probs=11.2 Q ss_pred EECCCCEEEEEEC Q ss_conf 2324464999717 Q gi|254780939|r 179 FLVPKGHYFMMGD 191 (248) Q Consensus 179 ~~VP~g~yfvlGD 191 (248) .++-+++|||+|| T Consensus 72 ~~F~e~cYyvfgd 84 (84) T cd04438 72 ITFSEQCYYVFGD 84 (84) T ss_pred CCCCCCEEEEECC T ss_conf 4424451686079 No 42 >cd01287 FabA FabA, beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase: Bacterial protein of the type II, fatty acid synthase system that binds ACP and catalyzes both dehydration and isomerization reactions, apparently in the same active site. The FabA structure is a homodimer with two independent active sites located at the dimer interface. Each active site is tunnel-shaped and completely inaccessible to solvent. No metal ions or cofactors are required for ligand binding or catalysis. Probab=38.23 E-value=16 Score=16.78 Aligned_cols=26 Identities=12% Similarity=0.201 Sum_probs=15.0 Q ss_pred EEEEEEECHHHEEECCCCEEECCCCC Q ss_conf 00000304234320077324535312 Q gi|254780939|r 104 VKRVIGLPGDRISLEKGIIYINGAPV 129 (248) Q Consensus 104 VKRvIGlPGDtV~i~~~~l~INg~~i 129 (248) ||||....+..+-+-|+.++++|+.+ T Consensus 112 I~~v~~~~~~~~~iADg~l~vDGk~I 137 (150) T cd01287 112 IKEVGRDGPRPYIIADASLWVDGLRI 137 (150) T ss_pred EEEEEECCCCEEEEEEEEEEECCEEE T ss_conf 99998049958999999999899899 No 43 >PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed Probab=37.14 E-value=16 Score=16.73 Aligned_cols=16 Identities=31% Similarity=0.517 Sum_probs=8.8 Q ss_pred HEEECCCCEEECCCCC Q ss_conf 4320077324535312 Q gi|254780939|r 114 RISLEKGIIYINGAPV 129 (248) Q Consensus 114 tV~i~~~~l~INg~~i 129 (248) +|++++++++|||+++ T Consensus 319 ~iei~~~~~llNGkpi 334 (1030) T PRK10340 319 DIKVRDGLFLINNRYV 334 (1030) T ss_pred EEEEECCEEEECCCEE T ss_conf 9999899999889788 No 44 >TIGR02390 RNA_pol_rpoA1 DNA-directed RNA polymerase subunit A'; InterPro: IPR012758 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as 
DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases.
Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kD, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family consists of the archaeal A' subunit of the DNA-directed RNA polymerase. The example from Methanococcus jannaschii contains an intein.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0008270 zinc ion binding, 0006350 transcription. Probab=35.64 E-value=21 Score=16.03 Aligned_cols=34 Identities=32% Similarity=0.608 Sum_probs=24.4 Q ss_pred CCCCCCCCEEEEECCCCCHHH----EEEEEEEECHHHEEE Q ss_conf 555334704664014420031----000000304234320 Q gi|254780939|r 82 NNQPRRGDVVVFRYPKDPSID----YVKRVIGLPGDRISL 117 (248) Q Consensus 82 ~~~p~RGDIVVF~~P~d~~~~----yVKRvIGlPGDtV~i 117 (248) .++-.-||||.|| .+|+.| +==+|-=|||-|.+. T Consensus 430 ERHL~dGDiVLFN--RQPSLHRmSmMgH~VkVLPgkTFRL 467 (901) T TIGR02390 430 ERHLIDGDIVLFN--RQPSLHRMSMMGHKVKVLPGKTFRL 467 (901) T ss_pred EEEECCCCEEEEC--CCCCHHHHHHCCCEEEECCCCCCCC T ss_conf 9871268887665--8864202101144546788885213 No 45 >PRK08566 DNA-directed RNA polymerase subunit alpha; Validated Probab=35.18 E-value=32 Score=14.90 Aligned_cols=33 Identities=36% Similarity=0.669 Sum_probs=17.9 Q ss_pred CCCCCCCEEEEECCCCCHHH----EEEEEEEECHHHEEE Q ss_conf 55334704664014420031----000000304234320 Q gi|254780939|r 83 NQPRRGDVVVFRYPKDPSID----YVKRVIGLPGDRISL 117 (248) Q Consensus 83 ~~p~RGDIVVF~~P~d~~~~----yVKRvIGlPGDtV~i 117 (248) ++...||+|+|+ .+|..| .-=|+.=+||-|+++ T Consensus 410 Rhl~dgD~Vl~N--RqPTLHr~sima~~v~v~~gktirl 446 (881) T PRK08566 410 RHLIDGDIVLFN--RQPSLHRMSIMAHRVRVLPGKTFRL 446 (881) T ss_pred EEEECCCEEEEC--CCCHHHHHCCCCEEEEEEECCEEEE T ss_conf 564359667752--7733544200010379974345775 No 46 >COG1097 RRP4 RNA-binding protein Rrp4 and related proteins (contain S1 domain and KH domain) [Translation, ribosomal structure and biogenesis] Probab=34.25 E-value=33 Score=14.81 Aligned_cols=17 Identities=29% Similarity=0.479 
Sum_probs=8.3 Q ss_pred CCCCCCCCCCCCEEEEE Q ss_conf 86677643369889998 Q gi|254780939|r 42 SGSMIPTLLVGDYIIVN 58 (248) Q Consensus 42 s~SM~PTL~~GD~i~Vn 58 (248) +..|.|.|..||.|.+- T Consensus 106 ~~~~r~~l~vGD~v~Ak 122 (239) T COG1097 106 EKDLRPFLNVGDLVYAK 122 (239) T ss_pred CCCCCCCCCCCCEEEEE T ss_conf 23452226768789999 No 47 >TIGR02219 phage_NlpC_fam putative phage cell wall peptidase, NlpC/P60 family; InterPro: IPR011929 Members of this family show sequence similarity to members of the NlpC/P60 family, described by Anantharaman and Aravind . The NlpC/P60 family includes a number of characterised bacterial cell wall hydrolases. Members of this related family are all found in prophage regions of bacterial genomes.. Probab=33.43 E-value=22 Score=15.92 Aligned_cols=19 Identities=21% Similarity=0.497 Sum_probs=14.5 Q ss_pred CCC-CCCCCCCCEEEEECCC Q ss_conf 112-5553347046640144 Q gi|254780939|r 79 RIF-NNQPRRGDVVVFRYPK 97 (248) Q Consensus 79 ~i~-~~~p~RGDIVVF~~P~ 97 (248) +++ ..++|.||+.||+.-. T Consensus 71 ~~pG~~~~qpGDlLlFRw~~ 90 (135) T TIGR02219 71 AVPGLEAAQPGDLLLFRWRP 90 (135) T ss_pred CCCCCCCCCCCCEEEECCCC T ss_conf 57888878887667771552 No 48 >pfam11057 Cortexin Cortexin of kidney. In the middle of cortexin protein there is a single membrane-spanning domain which indicates that this protein may be a membrane protein involved in intracellular or extracellular signalling of the kidney or brain, since it is expressed specifically in the kidneys and brain only. The protein is highly conserved among species. Cortexin is also thought to be important to neurons of both the developing and adult cerebral cortex. Probab=32.16 E-value=25 Score=15.57 Aligned_cols=29 Identities=24% Similarity=0.522 Sum_probs=16.1 Q ss_pred HHHHHHHHHHHHHHE--EEEEEE-ECCCCCCC Q ss_conf 999999999988751--689899-87866776 Q gi|254780939|r 19 SILQALFFAILIRTF--LFQPSV-IPSGSMIP 47 (248) Q Consensus 19 ~l~~~i~i~~~ir~f--v~~~f~-Ips~SM~P 47 (248) .++.+++.++++|.| +++||. .|++|-+. 
T Consensus 35 ~ll~ifL~~livRCfrillDPYssmPsStW~d 66 (81) T pfam11057 35 GLLLIFLGLLIVRCFRILLDPYSSMPASSWTD 66 (81) T ss_pred HHHHHHHHHHHHHHHHHHCCHHCCCCCCHHHH T ss_conf 99999999999999999808110388530344 No 49 >PRK10838 spr putative outer membrane lipoprotein; Provisional Probab=30.12 E-value=18 Score=16.51 Aligned_cols=12 Identities=25% Similarity=0.409 Sum_probs=4.5 Q ss_pred HHHHHHHHHHHE Q ss_conf 999999988751 Q gi|254780939|r 22 QALFFAILIRTF 33 (248) Q Consensus 22 ~~i~i~~~ir~f 33 (248) .+++++.++..+ T Consensus 15 ~~~~~~~~l~ac 26 (188) T PRK10838 15 PAIAVAVLLSAC 26 (188) T ss_pred HHHHHHHHHHHH T ss_conf 799999988863 No 50 >TIGR02656 cyanin_plasto plastocyanin; InterPro: IPR002387 Blue or 'type-1' copper proteins are small proteins which bind a single copper atom and which are characterised by an intense electronic absorption band near 600 nm , . The most well known members of this class of proteins are the plant chloroplastic plastocyanins, and the distantly related bacterial azurins, which exchange electrons with cytochrome c551. Plastocyanin participates in electron transfer between the cytochrome b6f complex and photosystem I. Many cyanobacteria and eukaryotic algae can synthesise both plastocyanin and its functional analog cytochrome c6, depending on bioavailabilities of copper and iron, respectively . Plastocyanin participates in electron transfer between P700 and the cytochrome b/f complex in photosystem I. ; GO: 0005507 copper ion binding, 0009055 electron carrier activity, 0006118 electron transport. 
Probab=29.90 E-value=18 Score=16.55 Aligned_cols=21 Identities=10% Similarity=0.044 Sum_probs=14.4 Q ss_pred EEEEECHHHEEECCCCEEECC Q ss_conf 000304234320077324535 Q gi|254780939|r 106 RVIGLPGDRISLEKGIIYING 126 (248) Q Consensus 106 RvIGlPGDtV~i~~~~l~INg 126 (248) .+=--|||||++.|++++=++ T Consensus 18 ~~~i~aGDtV~f~NNK~~PHN 38 (102) T TIGR02656 18 KISIAAGDTVKFVNNKGGPHN 38 (102) T ss_pred CEEECCCCEEEEEECCCCCCC T ss_conf 104688981788437889976 No 51 >PRK10369 heme lyase subunit NrfE; Provisional Probab=29.36 E-value=40 Score=14.30 Aligned_cols=23 Identities=13% Similarity=0.154 Sum_probs=16.1 Q ss_pred EEEEECHHHEEECCCCEEECCCC Q ss_conf 00030423432007732453531 Q gi|254780939|r 106 RVIGLPGDRISLEKGIIYINGAP 128 (248) Q Consensus 106 RvIGlPGDtV~i~~~~l~INg~~ 128 (248) -+.-.|||++++.+-++..++-+ T Consensus 428 ~~~L~~Ges~~i~~y~i~f~~i~ 450 (552) T PRK10369 428 SLNLQPGQQVTLAGYTFRFERLD 450 (552) T ss_pred EEECCCCCEEEECCEEEEECCCE T ss_conf 46507998599888899991248 No 52 >pfam00623 RNA_pol_Rpb1_2 RNA polymerase Rpb1, domain 2. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial. and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion. 
Probab=28.88 E-value=41 Score=14.24 Aligned_cols=34 Identities=29% Similarity=0.471 Sum_probs=22.1 Q ss_pred CCCCCCCCEEEEECCCCCHHH----EEEEEEEECHHHEEE Q ss_conf 555334704664014420031----000000304234320 Q gi|254780939|r 82 NNQPRRGDVVVFRYPKDPSID----YVKRVIGLPGDRISL 117 (248) Q Consensus 82 ~~~p~RGDIVVF~~P~d~~~~----yVKRvIGlPGDtV~i 117 (248) .+..+.||+|+|+ .+|..+ .--|+.-++|.|+++ T Consensus 91 ~R~l~dGD~Vl~N--RqPTLHr~si~a~~v~i~~~~tirl 128 (165) T pfam00623 91 LRHVIDGDVVLLN--RQPTLHRMSIMAHRPRVLEGKTIRL 128 (165) T ss_pred HHHHHCCCEEEEE--CCCCCCCCEEEEEEEEEECCCEEEE T ss_conf 6576479889992--7852251420366889968970587 No 53 >TIGR00575 dnlj DNA ligase, NAD-dependent; InterPro: IPR001679 DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two DNA fragments by catalyzing the formation of an internucleotide ester bond between phosphate and deoxyribose. It is active during DNA replication, DNA repair and DNA recombination. There are two forms of DNA ligase: one requires ATP (6.5.1.1 from EC), the other NAD (6.5.1.2 from EC). This family is predominantly composed of NAD-dependent bacterial DNA ligases. They are proteins of about 75 to 85 Kd whose sequence is well conserved , . They also show similarity to yicF, an Escherichia coli hypothetical protein of 63 Kd.; GO: 0003911 DNA ligase (NAD+) activity, 0006260 DNA replication, 0006281 DNA repair. 
Probab=28.68 E-value=34 Score=14.74 Aligned_cols=12 Identities=25% Similarity=0.706 Sum_probs=6.2 Q ss_pred CCCCCCCCCCCC Q ss_conf 786677643369 Q gi|254780939|r 41 PSGSMIPTLLVG 52 (248) Q Consensus 41 ps~SM~PTL~~G 52 (248) |.+-|||=-+.| T Consensus 345 PvA~LePV~vaG 356 (706) T TIGR00575 345 PVAKLEPVFVAG 356 (706) T ss_pred CEEEECCEEECC T ss_conf 036756558720 No 54 >KOG2915 consensus Probab=28.25 E-value=20 Score=16.19 Aligned_cols=21 Identities=33% Similarity=0.850 Sum_probs=11.6 Q ss_pred EEEEEEEC-HHHEEECCC-CEEE Q ss_conf 00000304-234320077-3245 Q gi|254780939|r 104 VKRVIGLP-GDRISLEKG-IIYI 124 (248) Q Consensus 104 VKRvIGlP-GDtV~i~~~-~l~I 124 (248) .+-+||.| |..|+...| .+|+ T Consensus 48 h~~iIGK~~G~~v~sskG~~vyl 70 (314) T KOG2915 48 HSDIIGKPYGSKVASSKGKFVYL 70 (314) T ss_pred HHHEECCCCCCEEEECCCCEEEE T ss_conf 01111577553465337847999 No 55 >TIGR02730 carot_isom carotene isomerase; InterPro: IPR014101 Members of this family, including sll0033 (crtH) of Synechocystis sp. (strain PCC 6803), catalyse a cis-trans isomerization of carotenes to the all-trans lycopene, a reaction that can also occur non-enzymatically in light through photoisomerization.. Probab=27.96 E-value=43 Score=14.14 Aligned_cols=25 Identities=12% Similarity=0.224 Sum_probs=12.0 Q ss_pred CCCCCCCCCCCCEEEEECCCCCCCCC Q ss_conf 86677643369889998400587664 Q gi|254780939|r 42 SGSMIPTLLVGDYIIVNKFSYGYSKY 67 (248) Q Consensus 42 s~SM~PTL~~GD~i~VnK~~YG~~~~ 67 (248) -..|-|-+..| +||-+++.=|+..| T Consensus 200 pA~~TPMINAg-MVfsDRH~GGiNYP 224 (506) T TIGR02730 200 PADQTPMINAG-MVFSDRHYGGINYP 224 (506) T ss_pred CCCCCCCCCCH-HHHCCCCCCCCCCC T ss_conf 10258740011-21003455763389 No 56 >TIGR02823 oxido_YhdH putative quinone oxidoreductase, YhdH/YhfP family; InterPro: IPR014188 This entry represents a subfamily of the alcohol dehydrogenase, a superfamily in which some members are zinc-binding medium-chain alcohol dehydrogenases while others are quinone oxidoreductases with no bound zinc. 
This entry includes YhdH from Escherichia coli and YhfP from Bacillus subtilis both of which bind NADPH or NAD, but not zinc. Both proteins have been studied crystallographically for insight into function. . Probab=27.72 E-value=43 Score=14.12 Aligned_cols=21 Identities=33% Similarity=0.521 Sum_probs=16.2 Q ss_pred CCCCCCCCCCCEEEEECCCCC Q ss_conf 667764336988999840058 Q gi|254780939|r 43 GSMIPTLLVGDYIIVNKFSYG 63 (248) Q Consensus 43 ~SM~PTL~~GD~i~VnK~~YG 63 (248) .|=.|....||.|+|+=|=-| T Consensus 72 ~S~dp~F~~GD~VivTGyglG 92 (330) T TIGR02823 72 SSEDPRFRPGDEVIVTGYGLG 92 (330) T ss_pred ECCCCCCCCCCEEEEEEECCC T ss_conf 448877578871899740245 No 57 >TIGR00110 ilvD dihydroxy-acid dehydratase; InterPro: IPR004404 Two dehydratases, dihydroxy-acid dehydratase (gene ilvD or ILV3) and 6-phosphogluconate dehydratase (gene edd) have been shown to be evolutionary related . Dihydroxy-acid dehydratase catalyzes the fourth step in the biosynthesis of isoleucine and valine, the dehydratation of 2,3-dihydroxy-isovaleic acid into alpha-ketoisovaleric acid. 6-Phosphogluconate dehydratase catalyzes the first step in the Entner-Doudoroff pathway, the dehydratation of 6-phospho-D-gluconate into 6-phospho-2-dehydro-3-deoxy-D-gluconate. Another protein containing this signature is the Escherichia coli hypothetical protein yjhG. The N-terminal part of the proteins contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulphur cluster . This family represents dihydroxy-acid dehydratase (DAD). It contains a catalytically essential [4Fe-4S] cluster and catalyses the fourth step in valine and isoleucine biosynthesis.; GO: 0004160 dihydroxy-acid dehydratase activity, 0009082 branched chain family amino acid biosynthetic process. 
Probab=27.62 E-value=28 Score=15.28 Aligned_cols=18 Identities=44% Similarity=0.781 Sum_probs=14.6 Q ss_pred CCCCCCCCCC--EEEEECCC Q ss_conf 1255533470--46640144 Q gi|254780939|r 80 IFNNQPRRGD--VVVFRYPK 97 (248) Q Consensus 80 i~~~~p~RGD--IVVF~~P~ 97 (248) |..+++++|| |||.+|.. T Consensus 455 Il~Gki~~GDktVVVIRYEG 474 (601) T TIGR00110 455 ILGGKIKEGDKTVVVIRYEG 474 (601) T ss_pred HHCCEEEECCCEEEEEEECC T ss_conf 75790431681689997048 No 58 >KOG1618 consensus Probab=27.59 E-value=35 Score=14.71 Aligned_cols=47 Identities=30% Similarity=0.327 Sum_probs=33.4 Q ss_pred ECCCCEEEEEECCCCCCCCCCC--CCCCCCCHHHEEEEEEEEEEECCCCCC Q ss_conf 3244649997178877764453--653026088825338999974378877 Q gi|254780939|r 180 LVPKGHYFMMGDNRDKSKDSRW--VEVGFVPEENLVGRASFVLFSIGGDTP 228 (248) Q Consensus 180 ~VP~g~yfvlGDNRdnS~DSR~--~~~G~Vp~~~IvGka~~i~~S~d~~~~ 228 (248) .=|..+.+|+|||-.. |=|- ..-++-|+.++.|.|-.-|.|+=-.++ T Consensus 294 ~~~~k~lymvGDNP~s--Dv~GA~lf~~yap~~~~g~~~~~~w~SILV~TG 342 (389) T KOG1618 294 AAPIKKLYMVGDNPMS--DVRGANLFHQYAPELGAGGSANYGWISILVRTG 342 (389) T ss_pred CCCCCEEEEECCCCCC--CCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEEE T ss_conf 6775256664688754--433330002346332544334777168998530 No 59 >TIGR03468 HpnG hopanoid-associated phosphorylase. The sequences in this family are members of the pfam01048 family of phosphorylases typically acting on nucleotide-sugar substrates. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of hopene, the cyclization product of the polyisoprenoid squalene. This gene is adjacent to the genes PhnA-E and squalene-hopene cyclase (which would be HpnF) in Zymomonas mobilis and their association with hopene biosynthesis has been noted in the literature. Extending the gene symbol sequence, we suggest the symbol HpnG for the product of this gene. Hopanoids are known to be components of the plasma membrane and to have polar sugar head groups in Z. mobilis and other species. 
Probab=27.00 E-value=44 Score=14.04 Aligned_cols=23 Identities=22% Similarity=0.410 Sum_probs=19.7 Q ss_pred CCCCCCCCCCCCEEEEECCCCCC Q ss_conf 86677643369889998400587 Q gi|254780939|r 42 SGSMIPTLLVGDYIIVNKFSYGY 64 (248) Q Consensus 42 s~SM~PTL~~GD~i~VnK~~YG~ 64 (248) .|++.|.|..||.|+.+++.|.. T Consensus 55 AGgL~p~L~~GDvVv~~~V~~~~ 77 (212) T TIGR03468 55 AGALDPALQPGDLVVPEEVRADG 77 (212) T ss_pred CCCCCCCCCCCCEEEEEEEECCC T ss_conf 04678889767899975675278 No 60 >cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. Probab=26.79 E-value=30 Score=15.11 Aligned_cols=42 Identities=14% Similarity=0.267 Sum_probs=24.1 Q ss_pred CCCCCEEEEEC--CCCCHHHEEEEEEEE-CHHHEEECCCCEEECCCCC Q ss_conf 33470466401--442003100000030-4234320077324535312 Q gi|254780939|r 85 PRRGDVVVFRY--PKDPSIDYVKRVIGL-PGDRISLEKGIIYINGAPV 129 (248) Q Consensus 85 p~RGDIVVF~~--P~d~~~~yVKRvIGl-PGDtV~i~~~~l~INg~~i 129 (248) .|.||+|+=-. |-.....+++++-+. |||+|.+. +..||++. 
T Consensus 25 Lk~GDvI~~vdGk~v~~~~~l~~~i~~~~~Gd~V~l~---v~R~gk~~ 69 (79) T cd00986 25 LKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLK---VKREEKEL 69 (79) T ss_pred CCCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEE---EEECCEEE T ss_conf 7789999999998957999999999659999989999---99999999 No 61 >COG3250 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism] Probab=26.66 E-value=31 Score=15.04 Aligned_cols=17 Identities=35% Similarity=0.872 Sum_probs=14.4 Q ss_pred HHEEECCCCEEECCCCC Q ss_conf 34320077324535312 Q gi|254780939|r 113 DRISLEKGIIYINGAPV 129 (248) Q Consensus 113 DtV~i~~~~l~INg~~i 129 (248) =+|+++++.++|||+++ T Consensus 284 R~iei~~~~~~iNGkpv 300 (808) T COG3250 284 RTVEIKDGLLLINGKPV 300 (808) T ss_pred EEEEEECCEEEECCEEE T ss_conf 79999768289988689 No 62 >TIGR02634 xylF D-xylose ABC transporter, substrate-binding protein; InterPro: IPR013456 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. 
In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release.
In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis , , , , , . Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This entry represents the D-xylose ABC transporter substrate-binding protein which is a periplasmic (when in Gram-negative bacteria) binding protein for D-xylose import by a high-affinity ATP-binding cassette (ABC) transporter.. Probab=26.19 E-value=31 Score=14.98 Aligned_cols=13 Identities=46% Similarity=1.165 Sum_probs=10.3 Q ss_pred ECCCCEEEEEECC Q ss_conf 3244649997178 Q gi|254780939|r 180 LVPKGHYFMMGDN 192 (248) Q Consensus 180 ~VP~g~yfvlGDN 192 (248) ..|+|.||+||=- T Consensus 116 ~~P~GnY~l~~Gs 128 (307) T TIGR02634 116 AAPKGNYFLLGGS 128 (307) T ss_pred CCCCCCEEEEECC T ss_conf 1678757884177 No 63 >PRK05174 3-hydroxydecanoyl-(acyl carrier protein) dehydratase; Validated Probab=25.14 E-value=39 Score=14.36 Aligned_cols=13 Identities=15% Similarity=0.644 Sum_probs=5.7 Q ss_pred ECCCCEEECCCCC Q ss_conf 0077324535312 Q gi|254780939|r 117 LEKGIIYINGAPV 129 (248) Q Consensus 117 i~~~~l~INg~~i 129 (248) +-|+.++++|+.| T Consensus 143 iADg~l~vDg~~I 155 (172) T PRK05174 143 IADGRVLVDGEEI 155 (172) T ss_pred EEEEEEEECCEEE T ss_conf 9989999899899 No 64 >PRK10150 beta-D-glucuronidase; Provisional Probab=25.08 E-value=35 Score=14.66 Aligned_cols=19 Identities=21% Similarity=0.380 Sum_probs=14.0 Q ss_pred HEEECCCCEEECCCCCCCC Q ss_conf 4320077324535312245 Q gi|254780939|r 114 RISLEKGIIYINGAPVVRH 132 (248) Q Consensus 114 tV~i~~~~l~INg~~i~~~ 132 (248) +|+++++++++||+++... 
T Consensus 280 ~i~~~~~~f~LNGkpi~Lr 298 (605) T PRK10150 280 SVAVKGGQFLINHKPFYFK 298 (605) T ss_pred EEEECCCEEEECCCEEEEE T ss_conf 9998599799899389973 No 65 >PRK06714 S-adenosylhomocysteine nucleosidase; Validated Probab=24.88 E-value=48 Score=13.79 Aligned_cols=42 Identities=17% Similarity=0.146 Sum_probs=28.3 Q ss_pred HHHHHHHHHHHHEEEEEEEE---CCCCCCCCCCCCCEEEEECCCCC Q ss_conf 99999999887516898998---78667764336988999840058 Q gi|254780939|r 21 LQALFFAILIRTFLFQPSVI---PSGSMIPTLLVGDYIIVNKFSYG 63 (248) Q Consensus 21 ~~~i~i~~~ir~fv~~~f~I---ps~SM~PTL~~GD~i~VnK~~YG 63 (248) -.|+...+++..|=.+ ..| ..|++.|.|..||.|+.+++.|. T Consensus 55 nAA~~t~~LI~~F~~d-~IIntGvAGgl~~~l~igDvVIa~~~~~h 99 (236) T PRK06714 55 SCASCVQLLISEFQPD-ELFMTGICGSLSNKVKNGHIVVALNAIQH 99 (236) T ss_pred HHHHHHHHHHHHCCCC-EEEECCCCCCCCCCCCCCCEEEECEEEEC T ss_conf 9999999999844999-99987863235898805889998725873 No 66 >TIGR01857 FGAM-synthase phosphoribosylformylglycinamidine synthase; InterPro: IPR010141 This entry represents a single-molecule form of phosphoribosylformylglycinamidine synthase, also called FGAM synthase, an enzyme of purine de novo biosynthesis, which represents a second clade of the enzymes found in Clostridia, Bifidobacteria and Streptococcus species. This enzyme performs the fourth step in IMP biosynthesis (the precursor of all purines) from PRPP.
Probab=24.63 E-value=39 Score=14.36 Aligned_cols=20 Identities=30% Similarity=0.644 Sum_probs=17.6 Q ss_pred HHEEEEEEEECCCCCCCCCCC Q ss_conf 751689899878667764336 Q gi|254780939|r 31 RTFLFQPSVIPSGSMIPTLLV 51 (248) Q Consensus 31 r~fv~~~f~Ips~SM~PTL~~ 51 (248) |+||.|+-|| |||=.||--+ T Consensus 361 RSYVYQAmRv-tGaadpt~~v 380 (1279) T TIGR01857 361 RSYVYQAMRV-TGAADPTVPV 380 (1279) T ss_pred HHHEEEEEEE-ECCCCCCCCH T ss_conf 0030110011-0577885540 No 67 >COG2949 SanA Uncharacterized membrane protein [Function unknown] Probab=23.64 E-value=47 Score=13.89 Aligned_cols=14 Identities=29% Similarity=0.389 Sum_probs=5.5 Q ss_pred EEEEEECCCCCCCC Q ss_conf 49997178877764 Q gi|254780939|r 185 HYFMMGDNRDKSKD 198 (248) Q Consensus 185 ~yfvlGDNRdnS~D 198 (248) +.++-|||+.+|.| T Consensus 96 ~LLlSGDN~~~sYn 109 (235) T COG2949 96 YLLLSGDNATVSYN 109 (235) T ss_pred EEEEECCCCCCCCC T ss_conf 99981687753465 No 68 >PRK05584 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase; Validated Probab=23.44 E-value=47 Score=13.87 Aligned_cols=40 Identities=30% Similarity=0.452 Sum_probs=25.4 Q ss_pred HHHHHHHHHHHEE---EEEEEECCCCCCCCCCCCCEEEEECCCC Q ss_conf 9999999887516---8989987866776433698899984005 Q gi|254780939|r 22 QALFFAILIRTFL---FQPSVIPSGSMIPTLLVGDYIIVNKFSY 62 (248) Q Consensus 22 ~~i~i~~~ir~fv---~~~f~Ips~SM~PTL~~GD~i~VnK~~Y 62 (248) .++....++..|= +=..- -.|++.|.|..||.++.+++.| T Consensus 55 AA~~~~~li~~f~p~~ii~~G-~AGgl~~~l~iGDvvia~~~~~ 97 (230) T PRK05584 55 AALTATILIEHFKVDAVINTG-VAGGLAPGLKVGDVVVADELVQ 97 (230) T ss_pred HHHHHHHHHHHCCCCEEEEEC-CCCCCCCCCCCCEEEEECEEEE T ss_conf 999999999737998999942-4334688984780999765797 No 69 >pfam00877 NLPC_P60 NlpC/P60 family. The function of this domain is unknown. It is found in several lipoproteins. Probab=23.14 E-value=38 Score=14.42 Aligned_cols=16 Identities=44% Similarity=0.794 Sum_probs=12.2 Q ss_pred CCCCCCCCCCEEEEEC Q ss_conf 1255533470466401 Q gi|254780939|r 80 IFNNQPRRGDVVVFRY 95 (248) Q Consensus 80 i~~~~p~RGDIVVF~~ 95 (248) +..+++|.||+|.|.. 
T Consensus 46 v~~~~~~~GDLvFf~~ 61 (105) T pfam00877 46 IPKSEPQRGDLVFFGT 61 (105) T ss_pred CCHHHCCCCCEEEECC T ss_conf 4789989878899778 No 70 >pfam01048 PNP_UDP_1 Phosphorylase superfamily. Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Probab=22.62 E-value=49 Score=13.75 Aligned_cols=22 Identities=27% Similarity=0.425 Sum_probs=18.7 Q ss_pred CCCCCCCCCCCCCEEEEECCCC Q ss_conf 7866776433698899984005 Q gi|254780939|r 41 PSGSMIPTLLVGDYIIVNKFSY 62 (248) Q Consensus 41 ps~SM~PTL~~GD~i~VnK~~Y 62 (248) ..|||.|.+..||.++.+++.+ T Consensus 78 ~aG~l~~~~~~Gdvvi~~~~i~ 99 (232) T pfam01048 78 TAGGLNPDLKPGDLVIPTDAIN 99 (232) T ss_pred CCCCCCCCCCCCCEEEEHHHHH T ss_conf 6455787899998996437752 No 71 >PRK02122 glucosamine-6-phosphate deaminase-like protein; Validated Probab=22.44 E-value=21 Score=16.08 Aligned_cols=21 Identities=24% Similarity=0.336 Sum_probs=15.1 Q ss_pred ECCCCEEEEEECCCCCCCCCC Q ss_conf 324464999717887776445 Q gi|254780939|r 180 LVPKGHYFMMGDNRDKSKDSR 200 (248) Q Consensus 180 ~VP~g~yfvlGDNRdnS~DSR 200 (248) .|.+.+.|+.||=-|-.---| T Consensus 528 ~vkPhqIyaAGDlaDPhgThr 548 (660) T PRK02122 528 EIKPHQIFVAGDLADPHGTHR 548 (660) T ss_pred HCCCCEEEECCCCCCCCCCCH T ss_conf 559777987366679876259 No 72 >COG4615 PvdE ABC-type siderophore export system, fused ATPase and permease components [Secondary metabolites biosynthesis, transport, and catabolism / Inorganic ion transport and metabolism] Probab=22.13 E-value=35 Score=14.67 Aligned_cols=46 Identities=22% Similarity=0.308 Sum_probs=30.0 Q ss_pred CCCCCCEEEEECC-CCCHHHEEEEEEEECHHHEEECCCCEEECCCCCCCCC Q ss_conf 5334704664014-4200310000003042343200773245353122454 Q gi|254780939|r 84 QPRRGDVVVFRYP-KDPSIDYVKRVIGLPGDRISLEKGIIYINGAPVVRHM 133 (248) Q Consensus 84 ~p~RGDIVVF~~P-~d~~~~yVKRvIGlPGDtV~i~~~~l~INg~~i~~~~ 133 (248) +++|||+|-..-. ..++..++|=..|+- +=+.|.+++||+++..+. 
T Consensus 345 ~ikrGelvFliG~NGsGKST~~~LLtGL~----~PqsG~I~ldg~pV~~e~ 391 (546) T COG4615 345 TIKRGELVFLIGGNGSGKSTLAMLLTGLY----QPQSGEILLDGKPVSAEQ 391 (546) T ss_pred EEECCCEEEEECCCCCCHHHHHHHHHHCC----CCCCCCEEECCCCCCCCC T ss_conf 87337389998889963889999997066----888882667893488447 No 73 >pfam05382 Amidase_5 Bacteriophage peptidoglycan hydrolase. At least one of the members of this family, the Pal protein from the pneumococcal bacteriophage Dp-1 has been shown to be a N-acetylmuramoyl-L-alanine amidase. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside at the N-terminal domain whereas the C-terminal domain binds to the choline residues of the cell wall teichoic acids. This family appears to be related to pfam00877. Probab=21.96 E-value=43 Score=14.14 Aligned_cols=13 Identities=23% Similarity=0.657 Sum_probs=10.2 Q ss_pred CCCCCCEEEEECC Q ss_conf 5334704664014 Q gi|254780939|r 84 QPRRGDVVVFRYP 96 (248) Q Consensus 84 ~p~RGDIVVF~~P 96 (248) ++||||||+.=.+ T Consensus 75 ~~q~GDI~IwG~~ 87 (145) T pfam05382 75 NAKRGDIFIWGKR 87 (145) T ss_pred CCCCCCEEEEECC T ss_conf 7776889999268 No 74 >TIGR01285 nifN nitrogenase molybdenum-iron cofactor biosynthesis protein NifN; InterPro: IPR005975 The enzyme responsible for nitrogen fixation, the nitrogenase, shows a high degree of conservation of structure, function, and amino acid sequence across wide phylogenetic ranges. All known Mo-nitrogenases consist of two components, component I (also called dinitrogenase, or Fe-Mo protein), an alpha2beta2 tetramer encoded by the nifD and nifK genes, and component II (dinitrogenase reductase, or Fe protein) a homodimer encoded by the nifH gene. Two operons, nifDK and nifEN, encode a tetrameric (alpha2beta2 and N2E2) enzymatic complex. 
Nitrogenase contains two unusual rare metal clusters; one of them is the iron molybdenum cofactor (FeMo-co), which is considered to be the site of dinitrogen reduction and whose biosynthesis requires the products of nifNE and of some other nif genes. It has been proposed that NifNE might serve as a scaffold upon which FeMo-co is built and then inserted into component I.; GO: 0005515 protein binding, 0006461 protein complex assembly, 0009399 nitrogen fixation. Probab=21.19 E-value=46 Score=13.95 Aligned_cols=97 Identities=20% Similarity=0.314 Sum_probs=55.4 Q ss_pred HHHHHEEEEEEEEC--CCCCCCCCCCCCEEEEECCCCCCCCC----CCCCC-------------CCCCCCCCCCCCCCCC Q ss_conf 98875168989987--86677643369889998400587664----12211-------------5422221125553347 Q gi|254780939|r 28 ILIRTFLFQPSVIP--SGSMIPTLLVGDYIIVNKFSYGYSKY----SFPFS-------------YNLFNGRIFNNQPRRG 88 (248) Q Consensus 28 ~~ir~fv~~~f~Ip--s~SM~PTL~~GD~i~VnK~~YG~~~~----~~p~~-------------~~~~~~~i~~~~p~RG 88 (248) =.|+.|=++|..+| |.||..+|-.||+ +-.++|...- ++|-+ ...+..| -+. T Consensus 195 ~~vEaFGL~P~~LPDLS~SLDGHLa~dd~---s~~T~GGT~l~~i~~~g~s~~tlaIGe~mr~aA~~l~~R------~g~ 265 (451) T TIGR01285 195 DMVEAFGLKPVVLPDLSRSLDGHLADDDF---SPITLGGTTLEDIRELGQSAVTLAIGESMRAAAELLKDR------CGV 265 (451) T ss_pred HHHHHCCCCCEECCCHHCCCCCCCCCCCC---CCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH------CCC T ss_conf 99986089722402110024650087870---056789973599987368874688777654446752212------177 Q ss_pred CEEEEECCCC-CHH-HEEEEEEEECHHH----------E-------------EECCCCEEECCCCCCCCC Q ss_conf 0466401442-003-1000000304234----------3-------------200773245353122454 Q gi|254780939|r 89 DVVVFRYPKD-PSI-DYVKRVIGLPGDR----------I-------------SLEKGIIYINGAPVVRHM 133 (248) Q Consensus 89 DIVVF~~P~d-~~~-~yVKRvIGlPGDt----------V-------------~i~~~~l~INg~~i~~~~ 133 (248) ...||-+=.. +.. .||.+...+-|.. 
| .+-|.+.|.||+++.-.- T Consensus 266 ~y~vF~~L~GLeavD~F~~~L~~~SG~~CdhhfPfvP~vP~~~~RqR~QL~DA~LD~HF~~gG~k~AiAa 335 (451) T TIGR01285 266 PYEVFPSLMGLEAVDAFVSVLSKISGSRCDHHFPFVPAVPERFERQRAQLQDAMLDTHFFLGGKKVAIAA 335 (451) T ss_pred CCHHCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCEEEEEEC T ss_conf 4001012244689999999999984432156789889886334520788999999887741310232016 No 75 >pfam05257 CHAP CHAP domain. This domain corresponds to an amidase function. Many of these proteins are involved in cell wall metabolism of bacteria. This domain is found at the N-terminus of the bifunctional Escherichia coli GSP synthetase/GSP amidase, where is functions as a glutathionylspermidine amidase EC:3.5.1.78. This domain is found to be the catalytic domain of PlyCA. Probab=21.11 E-value=57 Score=13.33 Aligned_cols=14 Identities=36% Similarity=0.674 Sum_probs=11.6 Q ss_pred CCCCCCCCEEEEEC Q ss_conf 55533470466401 Q gi|254780939|r 82 NNQPRRGDVVVFRY 95 (248) Q Consensus 82 ~~~p~RGDIVVF~~ 95 (248) ...|+.|||+||.. T Consensus 55 ~~~P~~G~i~v~~~ 68 (119) T pfam05257 55 GFTPKVGDIAVFDS 68 (119) T ss_pred CCCCCCCEEEEECC T ss_conf 98999988999878 No 76 >TIGR00066 g_glut_trans gamma-glutamyltransferase; InterPro: IPR000101 Gamma-glutamyltranspeptidase (2.3.2.2 from EC) (GGT) catalyzes the transfer of the gamma-glutamyl moiety of glutathione to an acceptor that may be an amino acid, a peptide or water (forming glutamate). GGT plays a key role in the gamma-glutamyl cycle, a pathway for the synthesis and degradation of glutathione and drug and xenobiotic detoxification . In prokaryotes and eukaryotes, it is an enzyme that consists of two polypeptide chains, a heavy and a light subunit, processed from a single chain precursor by an autocatalytic cleavage. The active site of GGT is known to be located in the light subunit. The sequences of mammalian and bacterial GGT show a number of regions of high similarity . 
Pseudomonas cephalosporin acylases (3.5.1 from EC) that convert 7-beta-(4-carboxybutanamido)-cephalosporanic acid (GL-7ACA) into 7-aminocephalosporanic acid (7ACA) and glutaric acid are evolutionarily related to GGT and also show some GGT activity. Like GGT, these GL-7ACA acylases are also composed of two subunits. As an autocatalytic peptidase, GGT belongs to MEROPS peptidase family T3 (gamma-glutamyltransferase family, clan PB(T)). The active site residue for members of this family and family T1 is C-terminal to the autolytic cleavage site. The type example is gamma-glutamyltransferase 1 from Escherichia coli.; GO: 0003840 gamma-glutamyltransferase activity. Probab=20.56 E-value=36 Score=14.62 Aligned_cols=11 Identities=45% Similarity=0.619 Sum_probs=4.4 Q ss_pred EECCCCCCCCC Q ss_conf 98786677643 Q gi|254780939|r 39 VIPSGSMIPTL 49 (248) Q Consensus 39 ~Ips~SM~PTL 49 (248) +=|.+||.||+ T Consensus 455 KRplSsm~PTI 465 (583) T TIGR00066 455 KRPLSSMAPTI 465 (583) T ss_pred CCCCCCCCCEE T ss_conf 87510047312 No 77 >TIGR01026 fliI_yscN ATPase FliI/YscN family; InterPro: IPR005714 Proteins in this entry show extensive homology to the ATP synthase F1 beta subunit, and are involved in type III protein secretion. They fall into the two separate functional groups outlined below. The first group, exemplified by the Salmonella typhimurium FliI protein (P26465 from SWISSPROT), is needed for flagellar assembly. Most structural components of the bacterial flagellum are translocated through the central channel of the growing flagellar structure by the type III flagellar protein-export apparatus in an ATPase-driven manner, to be assembled at the growing end. FliI is the ATPase that couples ATP hydrolysis to the translocation reaction. The second group couples ATP hydrolysis to protein translocation in non-flagellar type III secretion systems. Often these systems are involved in virulence and pathogenicity.
YscN (P40290 from SWISSPROT) from pathogenic Yersinia species, for example, energises the injection of antihost factors directly into eukaryotic cells, thus overcoming host defences.; GO: 0016887 ATPase activity, 0009058 biosynthetic process, 0015031 protein transport, 0005737 cytoplasm. Probab=20.05 E-value=60 Score=13.19 Aligned_cols=71 Identities=20% Similarity=0.283 Sum_probs=44.8 Q ss_pred EEEEEEEECC---CCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCC--CCCCEEEEECCCCCHHHEEEEE Q ss_conf 1689899878---6677643369889998400587664122115422221125553--3470466401442003100000 Q gi|254780939|r 33 FLFQPSVIPS---GSMIPTLLVGDYIIVNKFSYGYSKYSFPFSYNLFNGRIFNNQP--RRGDVVVFRYPKDPSIDYVKRV 107 (248) Q Consensus 33 fv~~~f~Ips---~SM~PTL~~GD~i~VnK~~YG~~~~~~p~~~~~~~~~i~~~~p--~RGDIVVF~~P~d~~~~yVKRv 107 (248) ++.+--.|-+ .|.-|....||...+.+- ..+. =+++||.|+-+.=--.-| .-+ T Consensus 23 ~~G~v~~v~Gl~~ea~gp~~~vG~~c~I~~~---------------------g~~~~~~~~EVVGf~~~~v~LmPy-~~~ 80 (455) T TIGR01026 23 RVGRVTKVKGLLIEAVGPQASVGDLCLIERK---------------------GSEGKEVVAEVVGFNGEKVLLMPY-EEV 80 (455) T ss_pred EEEEEEEEEEEEEEEECCCCCCCCEEEEEEE---------------------CCCCCEEEEEEEEEECCEEEECCC-CCC T ss_conf 5789999852689852477667777899973---------------------789877999988520675675236-544 Q ss_pred EEE-CHHHEEECC----CCEEEC Q ss_conf 030-423432007----732453 Q gi|254780939|r 108 IGL-PGDRISLEK----GIIYIN 125 (248) Q Consensus 108 IGl-PGDtV~i~~----~~l~IN 125 (248) -|+ ||++|...| ..|.++ T Consensus 81 ~G~~~G~~V~~~~isae~~L~~~ 103 (455) T TIGR01026 81 EGVEPGSKVLAKNISAEEGLSIK 103 (455) T ss_pred CCCCCCCEEEEECCCCCCCCCCC T ss_conf 43353452332043300254557 No 78 >cd03295 ABC_OpuCA_Osmoprotection OpuCA is the ATP-binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment.
ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. Probab=20.05 E-value=37 Score=14.56 Aligned_cols=44 Identities=23% Similarity=0.396 Sum_probs=26.3 Q ss_pred CCCCCCEEEEECCCCC-HHHEEEEEEEECHHHEEECCCCEEECCCCCCC Q ss_conf 5334704664014420-03100000030423432007732453531224 Q gi|254780939|r 84 QPRRGDVVVFRYPKDP-SIDYVKRVIGLPGDRISLEKGIIYINGAPVVR 131 (248) Q Consensus 84 ~p~RGDIVVF~~P~d~-~~~yVKRvIGlPGDtV~i~~~~l~INg~~i~~ 131 (248) +.++|+++++--|..- +....+=+.|+ .....|.+.+||+.+.. T Consensus 23 ~i~~Ge~~~ilGpSG~GKSTllr~i~gl----~~p~~G~I~i~g~~i~~ 67 (242) T cd03295 23 EIAKGEFLVLIGPSGSGKTTTMKMINRL----IEPTSGEIFIDGEDIRE 67 (242) T ss_pred EECCCCEEEEECCCCCHHHHHHHHHHCC----CCCCCEEEEECCEECCC T ss_conf 8869989999999995699999999759----99981599999999999 Done!