Query         gi|254780350|ref|YP_003064763.1| hypothetical protein CLIBASIA_01175 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 431
No_of_seqs    220 out of 641
Neff          5.9
Searched_HMMs 39220
Date          Sun May 29 15:46:33 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780350.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PRK10594 hypothetical protein; 100.0       0       0  824.4  37.4  396    13-415   156-599 (615)
  2 COG2989 Uncharacterized protei 100.0       0       0  831.1  29.6  396    24-421   144-551 (561)
  3 PRK10260 hypothetical protein;  99.9 1.8E-22 4.6E-27  158.4   6.7  131   185-358    91-223 (306)
  4 PRK10190 hypothetical protein;  99.9 3.2E-22 8.2E-27  156.9   6.5  127   191-359    93-221 (310)
  5 pfam03734 YkuD L,D-transpeptid  99.5 1.2E-14 3.1E-19  110.1   6.9  111   194-357     2-115 (122)
  6 TIGR02869 spore_SleB spore cor  99.5 1.1E-14 2.9E-19  110.2   5.9   64   100-165      5-68 (232)
  7 COG1376 ErfK Uncharacterized p  99.4 9.5E-13 2.4E-17   98.4   6.6  123   198-361    96-221 (232)
  8 pfam01471 PG_binding_1 Putativ  99.2 4.1E-11   1E-15   88.3   6.1   57   107-165      1-57 (57)
  9 PRK06132 hypothetical protein;  98.9 4.1E-09   1E-13   75.9   5.3  114   181-360    45-159 (365)
 10 COG3409 Putative peptidoglycan  98.7 1.1E-07 2.7E-12   67.2   7.6   69    99-168    36-105 (185)
 11 PRK12472 hypothetical protein;  98.3 9.3E-07 2.4E-11   61.4   4.0  121   193-383    56-176 (512)
 12 COG3409 Putative peptidoglycan  98.1 5.9E-06 1.5E-10   56.4   5.8   64   102-166   121-184 (185)
 13 COG3023 ampD N-acetyl-anhydrom  97.4 0.00074 1.9E-08   43.4   6.7   72    90-165   178-251 (257)
 14 pfam08823 PG_binding_2 Putativ  96.1   0.013 3.3E-07   35.8   5.0   57   107-165     14-74 (74)
 15 pfam09374 PG_binding_3 Predict  95.9  0.0092 2.3E-07   36.7   3.5   45   140-186      1-49 (67)
 16 KOG1565 consensus               93.2    0.19 4.8E-06   28.6   4.9   59   110-168     29-88 (469)
 17 COG3926 zliS Lysozyme family p  91.6    0.28 7.1E-06   27.5   4.1   47   139-186    94-144 (252)
 18 pfam10908 DUF2778 Protein of u  68.0     7.3 0.00019   18.8   3.6   18   337-354    88-105 (121)
 19 pfam09692 Arb1 Argonaute siRNA  66.7     6.4 0.00016   19.1   3.1   39     46-89     16-54 (392)
 20 COG3442 Predicted glutamine am  63.1     5.3 0.00014   19.6   2.1   51   304-359   164-214 (250)
 21 TIGR02546 III_secr_ATP type II  51.5      18 0.00045   16.4   5.4   59    56-147   345-403 (430)
 22 TIGR00534 OpcA opcA protein; I  45.8      18 0.00046   16.4   2.5  109    75-186    10-123 (420)
 23 PRK09726 DNA-binding transcrip  45.7      22 0.00056   15.9   2.9   17   137-153     12-28 (88)
 24 pfam05756 S-antigen S-antigen   38.6      18 0.00046   16.3   1.6   14      9-22      7-20 (310)
 25 pfam12068 DUF3548 Domain of un  37.6      28  0.0007   15.2   2.4   14   320-333   143-156 (207)
 26 pfam03662 Glyco_hydro_79n Glyc  35.9      15 0.00039   16.8   0.9   15   320-334   245-259 (320)
 27 COG3139 Uncharacterized protei  35.8      31 0.00079   14.9   3.5   20   135-154     38-57 (90)
 28 COG5383 Uncharacterized protei  34.9      24 0.00062   15.6   1.8   43    82-125     44-89 (295)
 29 pfam11625 DUF3253 Protein of u  32.6      35  0.0009   14.6   3.2   34   253-286     39-72 (83)
 30 COG3562 KpsS Capsule polysacch  31.5      30 0.00077   15.0   1.8   34   318-356   302-335 (403)
 31 TIGR01225 hutH histidine ammon  29.9      15 0.00039   16.8   0.1   34   134-168   163-196 (529)
 32 TIGR01811 sdhA_Bsu succinate d  29.5      40   0.001   14.3   3.0  133   245-387   304-451 (620)
 33 TIGR01047 nspC carboxynorsperm  27.9      39   0.001   14.3   1.9   33     42-74     55-90 (403)
 34 KOG3517 consensus               27.8      14 0.00035   17.1  -0.5   26   146-176   101-126 (334)
 35 TIGR01819 F420_cofD LPPG:Fo 2-  26.9      40   0.001   14.3   1.8  154    66-274   102-270 (359)
 36 pfam12621 DUF3779 Phosphate me  26.7      44  0.0011   13.9   2.5   57   335-406     32-88 (95)
 37 PRK01345 heat shock protein Ht  26.4      42  0.0011   14.1   1.8   19     74-95     65-83 (314)
 38 TIGR01935 NOT-MenG RraA family  26.4      18 0.00047   16.3  -0.0   12   268-279     58-69 (155)
 39 PRK01265 heat shock protein Ht  26.3      45  0.0011   13.9   1.9   22   133-154   133-154 (326)
 40 pfam08692 Pet20 Mitochondrial   26.0      39   0.001   14.3   1.6   31   222-252    94-125 (137)
 41 PRK09484 3-deoxy-D-manno-octul  26.0      28 0.00071   15.2   0.8   26   337-367   157-182 (186)
 42 TIGR01533 lipo_e_P4 5'-nucleot  25.5      26 0.00067   15.3   0.6   12   308-319   254-265 (295)
 43 KOG3516 consensus               25.0      42  0.0011   14.1   1.6   11     88-98     60-71 (1306)
 44 PRK02391 heat shock protein Ht  24.5      49  0.0012   13.7   2.2   22   133-154   128-149 (297)
 45 TIGR00302 TIGR00302 phosphorib  24.4      49  0.0012   13.7   2.2   24   101-124      8-31 (80)
 46 pfam06978 POP1 Ribonucleases P  23.2      10 0.00026   17.9  -1.8   32   314-345   113-144 (158)
 47 pfam11353 DUF3153 Protein of u  23.1      20 0.00051   16.1  -0.3   22   299-321   167-188 (210)
 48 TIGR02941 Sigma_B RNA polymera  22.6      25 0.00063   15.5   0.0   21     24-45     54-74 (256)
 49 pfam07220 DUF1420 Protein of u  21.7      55  0.0014   13.4   4.3  116     5-120   154-308 (672)
 50 PRK03001 heat shock protein Ht  21.7      54  0.0014   13.4   1.6   21   134-154   119-139 (284)
 51 PRK12285 tryptophanyl-tRNA syn  21.3      16 0.00042   16.6  -1.1   62   312-376   253-328 (369)
 52 TIGR01825 gly_Cac_T_rel pyrido  20.1      60  0.0015   13.1   1.8   24   253-276   152-176 (392)

No 1
>PRK10594 hypothetical protein; Provisional
Probab=100.00 E-value=0 Score=824.39 Aligned_cols=396 Identities=31% Similarity=0.472 Sum_probs=359.0

Q ss_pred HHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHCCCCHHHHH-HHHCCCHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHC
Q ss_conf 9999999985310155542301345676411157678764-200078888798736675878899999999999999880
Q gi|254780350|r 13 CFFVYLILPMGLSLVEKPIHASVLDEIINESYHSIVNDRF-DNFLARVDMGIDSDIPIISKETIAQTEKAIAFYQDILSR 91 (431)
Q Consensus 13 ~~~~~~~l~~~~~l~s~~l~~~~~~~~i~~~~~~~~d~~f-~~~~~~~~~~~~s~~P~~s~~~~~~l~~al~~y~~i~~~ 91 (431)
-|+.|+...-++...++.+-.+.....+..|....++.+. ......+..|+.++.|..+. |..++++|.++ +++.
T Consensus 156 a~l~Yl~~~~~v~~~g~~wLy~~~p~~~~~p~~~~i~~w~~A~~~~~~~~~v~sl~pq~p~--Y~~m~~~l~~~--l~~~ 231 (615) T PRK10594 156 AMMGYLHFIANIPVKGTRWLYSNKPYALATPPLSVINQWQLALDEGQLPTFVASLAPQHPQ--YAAMHEALLKL--LADT 231 (615) T ss_pred HHHHHHHHHHCCCCCCCEECCCCCCCCCCCCCCCCHHHHHHHHHHCCHHHHHHHCCCCCHH--HHHHHHHHHHH--HHCC T ss_conf 9999999985165125310124787787899831168999999825568899713888757--89999999998--7404 Q ss_pred CCCCCCCCC-CCCCCCCCCCHHHHHHHHHHCCCCCCCC------------------------------------------ Q ss_conf 998547777-4146888824899999999819866567------------------------------------------ Q gi|254780350|r 92 GGWPELPIR-PLHLGNSSVSVQRLRERLIISGDLDPSK------------------------------------------ 128 (431) Q Consensus 92 ggW~~i~~~-~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~------------------------------------------ 128 (431) ++||.+..+ .||||+++++|+.||++|..+|.++... T Consensus 232 ~~Wp~~~~~~~Lrpg~~~~~v~~lr~il~r~g~l~~~p~~~~~~d~~~~~~~~~p~a~~~~~~~~~~~~~~~~~~~~~~~ 311 (615) T PRK10594 232 RPWPQLTGKATLRPGQWSNDVPALREILQRTGMLDGGPKIALPGDDTATDAVVSPSAVTVETAETKPMDKQTTSRSKPAP 311 (615) T ss_pred CCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC T ss_conf 88853356775577655666188999998705445787534554334433113631012232222223322111235677 Q ss_pred CCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHCCCEEEEEE Q ss_conf 87764578999999999998088878702999998844898898889864077530355567652345420452489999 Q gi|254780350|r 129 GLSVAFDAYVESAVKLFQMRHGLDPSGMVDSSTLEAMNVPVDLRIRQLQVNLMRIKKLLEQKMGLRYVLVNIPAASLEAV 208 (431) Q Consensus 129 ~~~~~yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aLN~~~~~r~~qi~~nler~r~l~~~~~~~~~v~VNip~~~l~~~ 208 (431) ..+.+||..|++|||+||+||||++||+||++|++|||++++.|++||.+||||+||++. 
++ ++||+||||+|+|+++ T Consensus 312 ~~~~~Yd~~LveAVKrFQ~rHGL~~DGVIG~~Tl~aLNvs~~~Ri~ql~~NlER~Rwlp~-~~-~~~IlVNIP~y~L~~~ 389 (615) T PRK10594 312 AVRAAYDNELVEAVKRFQAWQGLGADGVIGPATRDWLNVTPAQRAGVLALNIQRLRLLPG-EL-STGIMVNIPAYSLVYY 389 (615) T ss_pred CCCCCCCHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHCCCHHHHHHHHHHHHHHHHHCCC-CC-CCEEEEECCCCEEEEE T ss_conf 631336799999999999971998666648889999779999999999999999873633-47-8658998464058989 Q ss_pred ECCEEEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCHHHHHHCCEEEECCC--CCEECCCCC Q ss_conf 888555541231388777785542100389844877887667777777776418677874993999389--978350204 Q gi|254780350|r 209 ENGKVGLRSTVIVGRVDRQTPILHSRINRIMFNPYWVIPRSIIQKDMMALLRQDPQYLKDNNIHMIDEK--GKEVFVEEV 286 (431) Q Consensus 209 e~g~~~~~~~viVGk~~~~TP~~~~~i~~iv~NP~W~vP~sI~~~eilpk~~~dp~yl~~~~~~i~~~~--g~~vdp~~i 286 (431) ++|+++++|+|||||++|+||+|++.|++||+||+|+||+||+++|++|++++||+||+++||+|++++ |++|||.+| T Consensus 390 ~~G~~v~~srVIVGkp~r~TPifss~I~~vV~NP~W~VP~SI~rkDiLPkl~~dP~YL~~~~~~V~~g~~~~~~vdp~~I 469 (615) T PRK10594 390 QNGNQVLDSRVIVGRPDRKTPMMSSALNNVVVNPPWNVPPTLARKDILPKVRNDPGYLERHGYTVMRGWNSAEAIDPWQV 469 (615) T ss_pred ECCCEEEEEEEEECCCCCCCHHHHCCCCEEEECCCCCCCHHHHHHHHHHHHHHCHHHHHHCCEEEEECCCCCEECCCCCC T ss_conf 88938998756855788888201121008997899889761888876577543868898799299866888735470324 Q ss_pred CCCCCCCC--CCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCCCCCCEECCCHHHHHHHHHCCCCCCC Q ss_conf 70104457--7224726899986314895315887668138898323286555541150344798999999840488999 Q gi|254780350|r 287 DWNSPEPP--NFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFETSGCVRVRNIIDLDVWLLKDTPTWS 364 (431) Q Consensus 287 ~w~~~~~~--~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ShGCVRv~np~~La~~ll~~~~~~~ 364 (431) ||...++. +|++||+||++||||+|||+|||+|+|||||||+|+||+++.|+|||||||||||++||+|||++. 
+|+ T Consensus 470 dW~~~~~~~fpYrlrQ~PG~~NALGrvKF~FPN~~sIYLHDTP~k~LF~r~~RAfSsGCVRVe~p~eLA~~LL~~~-gw~ 548 (615) T PRK10594 470 DWSTITPSNLPFRFQQAPGARNSLGRYKFNMPSSDAIYLHDTPNHNLFQRDIRALSSGCVRVNKASDLANMLLQDA-GWN 548 (615) T ss_pred CCCCCCCCCCCEEEEECCCCCCCCCCEEEECCCCCCEECCCCCCHHHHCCCCCCCCCCCEECCCHHHHHHHHHHCC-CCC T ss_conf 7222575668657886999988670358844799864136899857748786424777034189999999997535-999 Q ss_pred HHHHHHHHHCCCEEEEECCCCCCEEEEEEEEEECCCCCEEEECCCCCHHHH Q ss_conf 899999862598289965898868999999898799838883685744589 Q gi|254780350|r 365 RYHIEEVVKTRKTTPVKLATEVPVHFVYISAWSPKDSIIQFRDDIYGLDNV 415 (431) Q Consensus 365 ~~~i~~~~~~~~~~~v~l~~~iPV~i~Y~Taw~~~dG~v~fr~DiY~~D~~ 415 (431) .++|++++++|+|++|+|+++|||||+|||||+++||+++||+|||++|.- T Consensus 549 ~~ri~~~l~~g~t~~v~L~~~IPV~l~Y~TAwvd~dG~vqFR~DIY~~D~~ 599 (615) T PRK10594 549 DARISDALKQGDTRYVNIRQRIPVNLYYLTAFVGADGRPQYRTDIYNYDLT 599 (615) T ss_pred HHHHHHHHHCCCEEEEECCCCCCEEEEEEEEEECCCCCEEECCCCCCCCCC T ss_conf 999999985698579766997888999989888899968867875245630 No 2 >COG2989 Uncharacterized protein conserved in bacteria [Function unknown] Probab=100.00 E-value=0 Score=831.12 Aligned_cols=396 Identities=42% Similarity=0.632 Sum_probs=357.6 Q ss_pred HHCCCCCCCHHHHHHHHHHHCCCCHHHHHHHHCCC--------HHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHCCCCC Q ss_conf 10155542301345676411157678764200078--------8887987366758788999999999999998809985 Q gi|254780350|r 24 LSLVEKPIHASVLDEIINESYHSIVNDRFDNFLAR--------VDMGIDSDIPIISKETIAQTEKAIAFYQDILSRGGWP 95 (431) Q Consensus 24 ~~l~s~~l~~~~~~~~i~~~~~~~~d~~f~~~~~~--------~~~~~~s~~P~~s~~~~~~l~~al~~y~~i~~~ggW~ 95 (431) ..+..-.....+....+..++++.|+++|+..... 
....+.++.+.+++|+...+..+.+.|+.++..|+|| T Consensus 144 ~~~~~l~yaq~v~~~~i~~~~~~~~~d~~~~~~~~~~~~~~a~~~~~va~~l~sl~Pq~~~Y~al~~~l~~~~~~~~~wp 223 (561) T COG2989 144 ASLAYLAYAQDVPNGRIRWLRSSGWYDLADPPASVINALQQAVEEGQVASFLPSLAPQNPQYQALAQALYQLIADAGGWP 223 (561) T ss_pred HHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCHHHHCCCCCCCCHHHHHHHHHHHHHHHCCCCCC T ss_conf 99999999986301454743124843335886345789999873364766500238997789999999998430126876 Q ss_pred CC-CCC-CCCCCCCCCCHHHHHHHHHHCC-CCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHCCCHHHH Q ss_conf 47-777-4146888824899999999819-86656787764578999999999998088878702999998844898898 Q gi|254780350|r 96 EL-PIR-PLHLGNSSVSVQRLRERLIISG-DLDPSKGLSVAFDAYVESAVKLFQMRHGLDPSGMVDSSTLEAMNVPVDLR 172 (431) Q Consensus 96 ~i-~~~-~L~~G~~~~~V~~Lr~RL~~~G-dl~~~~~~~~~yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aLN~~~~~r 172 (431) .| +.+ .||||+++++|+.||+||+..| |+......+..||++|++|||+||++|||++||+||+.|++|||++++.| T Consensus 224 ~v~~~~~~LrpG~~~~~v~aL~~~L~~~~~d~~~a~~~s~~yd~el~~avKrfQ~~~GL~~DGviG~~T~~aLn~s~~~R 303 (561) T COG2989 224 QVIPAGALLRPGVTSPDVPALRARLARSGMDLPSAAGSSPAYDPELVEAVKRFQARHGLPADGVIGPATRAALNVSVQIR 303 (561) T ss_pred CCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCCHHHCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHCCCHHHH T ss_conf 33687642478987266899999997507530112057643457788999999997089977744589999862679889 Q ss_pred HHHHHHHHHHHCCCCCCCCCCHHHHHHCCCEEEEEEECCEEEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCCCHHHHH Q ss_conf 88986407753035556765234542045248999988855554123138877778554210038984487788766777 Q gi|254780350|r 173 IRQLQVNLMRIKKLLEQKMGLRYVLVNIPAASLEAVENGKVGLRSTVIVGRVDRQTPILHSRINRIMFNPYWVIPRSIIQ 252 (431) Q Consensus 173 ~~qi~~nler~r~l~~~~~~~~~v~VNip~~~l~~~e~g~~~~~~~viVGk~~~~TP~~~~~i~~iv~NP~W~vP~sI~~ 252 (431) +.++++||||+||+| .+++++||+||||+|.++++|+|.+++.|+|||||++|+||.|+++|..|++||+|+||.||++ T Consensus 304 l~~l~~n~eRlR~lP-~dl~~r~i~VNiPA~~l~y~~~G~~vl~~rvVVGr~~rqTp~~~ski~~VvvNP~WnvP~SIi~ 382 (561) T COG2989 304 LAQLALNLQRLRWLP-GDLGQRGIMVNIPAYSLEYYENGREVLRSRVVVGRPDRQTPVMNSKINNVVVNPYWNVPQSIIV 382 (561) 
T ss_pred HHHHHHHHHHHHHCC-CCCCCCEEEEECCHHEEEEEECCCEEEEEEEEECCCCCCCCCHHHHHCEEEECCCCCCCHHHHH T ss_conf 999999899986076-5568742898545010366768808998778865788877304524021674799888688998 Q ss_pred HHHHHHHHHCHHHHHHCCEEEECCCCCEECCCCCCCC-CCCCCCCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHH Q ss_conf 7777776418677874993999389978350204701-044577224726899986314895315887668138898323 Q gi|254780350|r 253 KDMMALLRQDPQYLKDNNIHMIDEKGKEVFVEEVDWN-SPEPPNFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPIL 331 (431) Q Consensus 253 ~eilpk~~~dp~yl~~~~~~i~~~~g~~vdp~~i~w~-~~~~~~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~l 331 (431) +||+||+++||+||++++|+|++++|++|||+.|||. +.+..+|+|||+||++|+||++||+|||+|+|||||||++++ T Consensus 383 kdilPk~r~DP~YL~rngy~v~~g~G~~V~p~~VdW~i~~~~~~~r~RQ~Pg~~NALG~~Kinfpn~~aIYmHDTP~ksl 462 (561) T COG2989 383 KDILPKVRKDPGYLDRNGYEVIDGWGEVVDPSAVDWSITGSNFPYRFRQAPGPDNALGSYKFNFPNSHAIYLHDTPSKSL 462 (561) T ss_pred HHHHHHHHCCCHHHHHCCEEEECCCCCCCCHHHCCCCCCCCCCCEEEECCCCCCCCHHHEECCCCCCCCEEEECCCCHHH T ss_conf 76405542392558768838985888565743256311577786024568863011110410478976235206861555 Q ss_pred CCCCCCCCCCCCEECCCHHHHHHHHHCCCCCCCHHHHHHHHHCCCEEEEECCCCCCEEEEEEEEEECCCCCEEEECCCCC Q ss_conf 28655554115034479899999984048899989999986259828996589886899999989879983888368574 Q gi|254780350|r 332 FNNVVRFETSGCVRVRNIIDLDVWLLKDTPTWSRYHIEEVVKTRKTTPVKLATEVPVHFVYISAWSPKDSIIQFRDDIYG 411 (431) Q Consensus 332 F~~~~Ra~ShGCVRv~np~~La~~ll~~~~~~~~~~i~~~~~~~~~~~v~l~~~iPV~i~Y~Taw~~~dG~v~fr~DiY~ 411 (431) |++.+|++|||||||+||.+||.|||++ +||+++++++.+++|+|+.++++++||||+.|||||+++||++|||+|||+ T Consensus 463 F~r~mRalSsGCVRvq~~rdla~~lL~d-~Gws~~~v~~~ik~g~t~~i~v~~~vPVyl~Y~TAW~~~dG~vqfrdDIY~ 541 (561) T COG2989 463 FNRDMRALSSGCVRVQKPRDLANALLKD-PGWSVDRVEETLKSGKTTPIKVRQPVPVYLYYFTAWVTKDGVVQFRDDIYG 541 (561) T ss_pred HHHHHHHHCCCCEEECCHHHHHHHHHHC-CCCCHHHHHHHHCCCCCEEEECCCCCCEEEEEEEEEECCCCCEEECCCCCC T ss_conf 6657777535855744889999999855-798889987674168840200377786799999977779985573563000 Q ss_pred HHHHHHHHHH Q ss_conf 4589898861 Q gi|254780350|r 412 
LDNVHVGIIP 421 (431) Q Consensus 412 ~D~~~~~~~~ 421 (431) +|...-.++. T Consensus 542 ~D~~~~~a~~ 551 (561) T COG2989 542 YDGYAELALQ 551 (561) T ss_pred CCHHHHHHHH T ss_conf 1348877656 No 3 >PRK10260 hypothetical protein; Provisional Probab=99.87 E-value=1.8e-22 Score=158.43 Aligned_cols=131 Identities=18% Similarity=0.244 Sum_probs=100.6 Q ss_pred CCCCCCCCCHHHHHHCCCEEEEEEEC-CEEEEEECCCCCCCCCCCCCC-CCCEEEEEECCCCCCCHHHHHHHHHHHHHHC Q ss_conf 35556765234542045248999988-855554123138877778554-2100389844877887667777777776418 Q gi|254780350|r 185 KLLEQKMGLRYVLVNIPAASLEAVEN-GKVGLRSTVIVGRVDRQTPIL-HSRINRIMFNPYWVIPRSIIQKDMMALLRQD 262 (431) Q Consensus 185 ~l~~~~~~~~~v~VNip~~~l~~~e~-g~~~~~~~viVGk~~~~TP~~-~~~i~~iv~NP~W~vP~sI~~~eilpk~~~d 262 (431) |..+. ....-|++|+|+++|++|.. ...+..++|.+|+..+.||+. .+.|+....||+|++|.||.++ .. T Consensus 91 ~ILP~-~~r~GIVINlaEmRLYYfp~~~~~V~t~PVGIGr~g~~TPlg~~t~I~~K~~~PtW~pp~sir~e-~~------ 162 (306) T PRK10260 91 LILPD-TVHEGIVINSAEMRLYYYPKGTNTVIVLPIGIGQLGKDTPINWTTKVERKKAGPTWTPTAKMHAE-YR------ 162 (306) T ss_pred EECCC-CCCCCEEEECHHHEEEEECCCCCEEEEEEEECCCCCCCCCCCCEEEEEECCCCCCCCCCHHHHHH-HH------ T ss_conf 02799-98573698532424478438998699982443668887887641799972579988898788899-99------ Q ss_pred HHHHHHCCEEEECCCCCEECCCCCCCCCCCCCCCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCCCCC Q ss_conf 67787499399938997835020470104457722472689998631489531588766813889832328655554115 Q gi|254780350|r 263 PQYLKDNNIHMIDEKGKEVFVEEVDWNSPEPPNFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFETSG 342 (431) Q Consensus 263 p~yl~~~~~~i~~~~g~~vdp~~i~w~~~~~~~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ShG 342 (431) ++ |. ..+-+-||||+||||..++.+.. +|.||||+.+..++. 
..||| T Consensus 163 ----~~---------G~---------------~LP~vvPpGPdNPLG~~Al~l~~--~YlIHGTN~p~gIG~---rvS~G 209 (306) T PRK10260 163 ----AA---------GE---------------PLPAVVPAGPDNPMGLYALYIGR--LYAIHGTNANFGIGL---RVSHG 209 (306) T ss_pred ----HC---------CC---------------CCCCCCCCCCCCCHHHHHHHHCC--CEEEECCCCCCCCCC---CCCCC T ss_conf ----72---------99---------------88766899998966467776278--405777899875132---01687 Q ss_pred CEECCCHHHHHHHHHC Q ss_conf 0344798999999840 Q gi|254780350|r 343 CVRVRNIIDLDVWLLK 358 (431) Q Consensus 343 CVRv~np~~La~~ll~ 358 (431) ||||-+ .|. ++|.. T Consensus 210 CIRl~p-eDI-~~Lf~ 223 (306) T PRK10260 210 CVRLRN-EDI-KFLFE 223 (306) T ss_pred EECCCH-HHH-HHHHH T ss_conf 512375-719-99995 No 4 >PRK10190 hypothetical protein; Provisional Probab=99.86 E-value=3.2e-22 Score=156.86 Aligned_cols=127 Identities=17% Similarity=0.196 Sum_probs=99.6 Q ss_pred CCCHHHHHHCCCEEEEEEEC-CEEEEEECCCCCCCCCCCCC-CCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCHHHHHH Q ss_conf 65234542045248999988-85555412313887777855-42100389844877887667777777776418677874 Q gi|254780350|r 191 MGLRYVLVNIPAASLEAVEN-GKVGLRSTVIVGRVDRQTPI-LHSRINRIMFNPYWVIPRSIIQKDMMALLRQDPQYLKD 268 (431) Q Consensus 191 ~~~~~v~VNip~~~l~~~e~-g~~~~~~~viVGk~~~~TP~-~~~~i~~iv~NP~W~vP~sI~~~eilpk~~~dp~yl~~ 268 (431) ....-|+||+|+++|++|.. +..+..++|.+|+..+.||. +.+.|+...-||+|++|.||. 
+|...+ T Consensus 93 ~~r~GIVINlaEmRLYYfp~~~~~V~t~PIGIGr~g~~TP~g~~t~i~~K~~nPtW~Pp~sir-~e~~~~---------- 161 (310) T PRK10190 93 TVRKGIVVNVAEMRLYYYPPDSNTVEVFPIGIGQAGRETPRNWVTTVERKQEAPTWTPTPNTR-REYAKR---------- 161 (310) T ss_pred CCCCCEEEEHHHHEEEEECCCCCEEEEEEEECCCCCCCCCCCCEEEEEECCCCCCCCCCHHHH-HHHHHC---------- T ss_conf 986864876035023785599987999714646588878887548998625699877977788-999971---------- Q ss_pred CCEEEECCCCCEECCCCCCCCCCCCCCCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCCCCCCEECCC Q ss_conf 99399938997835020470104457722472689998631489531588766813889832328655554115034479 Q gi|254780350|r 269 NNIHMIDEKGKEVFVEEVDWNSPEPPNFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFETSGCVRVRN 348 (431) Q Consensus 269 ~~~~i~~~~g~~vdp~~i~w~~~~~~~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ShGCVRv~n 348 (431) + ...+-+-||||+||||..++.+.. .|.||||+.+..++. | .||||||| . T Consensus 162 -G-----------------------~~LP~vvP~GPdNPLG~~Al~l~~--~YlIHGTN~p~gIG~--r-vS~GCIRl-~ 211 (310) T PRK10190 162 -G-----------------------ESLPAFVPAGPDNPMGLYAIYIGR--LYAIHGTNANFGIGL--R-VSQGCIRL-R 211 (310) T ss_pred -C-----------------------CCCCCCCCCCCCCCHHHHHHHHCC--CEEEECCCCCCCCCC--H-HCCCCCCC-C T ss_conf -9-----------------------988777799998957778886278--606878899975021--1-06875144-8 Q ss_pred HHHHHHHHHCC Q ss_conf 89999998404 Q gi|254780350|r 349 IIDLDVWLLKD 359 (431) Q Consensus 349 p~~La~~ll~~ 359 (431) |.|. ++|... T Consensus 212 peDI-~~Lf~~ 221 (310) T PRK10190 212 NDDI-KYLFDN 221 (310) T ss_pred HHHH-HHHHHC T ss_conf 7869-999842 No 5 >pfam03734 YkuD L,D-transpeptidase catalytic domain. This family of proteins are found in a range of bacteria. It has been shown that this domain can act as an L,D-transpeptidase that gives rise to an alternative pathway for peptidoglycan cross-linking. This gives bacteria resistance to beta-lactam antibiotics that inhibit PBPs which usually carry out the cross-linking reaction. 
The conserved region contains a conserved histidine and cysteine, with the cysteine thought to be an active site residue. Several members of this family contain peptidoglycan binding domains. The molecular structure of YkuD protein shows this domain has a novel tertiary fold consisting of a beta-sandwich with two mixed sheets, one containing five strands and the other, six strands. The two beta-sheets form a cradle capped by an alpha-helix. This family was formerly called the ErfK/YbiS/YcfS/YnhG family, but is now named after the first protein of known structure. Probab=99.55 E-value=1.2e-14 Score=110.08 Aligned_cols=111 Identities=30% Similarity=0.402 Sum_probs=83.6 Q ss_pred HHHHHHCCCEEE-EEEECCEEEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCHHHHHHCCEE Q ss_conf 345420452489-9998885555412313887777855421003898448778876677777777764186778749939 Q gi|254780350|r 194 RYVLVNIPAASL-EAVENGKVGLRSTVIVGRVDRQTPILHSRINRIMFNPYWVIPRSIIQKDMMALLRQDPQYLKDNNIH 272 (431) Q Consensus 194 ~~v~VNip~~~l-~~~e~g~~~~~~~viVGk~~~~TP~~~~~i~~iv~NP~W~vP~sI~~~eilpk~~~dp~yl~~~~~~ 272 (431) .+|.||+.+.++ .++++|+.++.+++.+|+.+.+||.....|.....+|.|..+..+.. T Consensus 2 ~~I~Vd~~~~~l~~~~~~g~~v~~~~vs~G~~~~~TP~G~~~i~~k~~~~~~~~~~~~~~-------------------- 61 (122) T pfam03734 2 RVIVVDLSEQRLLLLYENGKLVLTYPVSVGRGDTPTPLGTFTITEKVENPTWAPGPGNGL-------------------- 61 (122) T ss_pred CEEEEECCCCEEEEEEECCEEEEEEEEEECCCCCCCCCCEEEEEEEEECCCCCCCCCCCC-------------------- T ss_conf 299998943899999759999999778858899988665699899875884578675554-------------------- Q ss_pred EECCCCCEECCCCCCCCCCCCCCCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCCCCCCEECCC--HH Q ss_conf 9938997835020470104457722472689998631489531588766813889832328655554115034479--89 Q gi|254780350|r 273 MIDEKGKEVFVEEVDWNSPEPPNFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFETSGCVRVRN--II 350 (431) Q Consensus 273 i~~~~g~~vdp~~i~w~~~~~~~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ShGCVRv~n--p~ 350 (431) .++.+..++......+|+||+|+.+..+... +..|||||||.| +. 
T Consensus 62 --------------------------------~~~~~~~~~~~~~~~~i~iH~~~~~~~~~~g-~~~ShGCIrl~~~d~~ 108 (122) T pfam03734 62 --------------------------------GYVKFLDPWAFPNGGGIYIHGTGTPDLFSGG-APRSHGCIRLSNEDAK 108 (122) T ss_pred --------------------------------CCCCCCCEEEECCCCCEEEECCCCCCCCCCC-CCCCCCCCCCCHHHHH T ss_conf --------------------------------6876552468727984389668888422479-8467760157999999 Q ss_pred HHHHHHH Q ss_conf 9999984 Q gi|254780350|r 351 DLDVWLL 357 (431) Q Consensus 351 ~La~~ll 357 (431) .|.+++. T Consensus 109 ~l~~~v~ 115 (122) T pfam03734 109 ELYDWVL 115 (122) T ss_pred HHHHCCC T ss_conf 9997099 No 6 >TIGR02869 spore_SleB spore cortex-lytic enzyme; InterPro: IPR014224 The entry represents the spore cortex-lytic enzyme SleB from Bacillus subtilis and other Gram-positive, endospore-forming bacterial species. SleB is stored in an inactive form in the spore and activated during germination.. Probab=99.53 E-value=1.1e-14 Score=110.23 Aligned_cols=64 Identities=27% Similarity=0.357 Sum_probs=61.7 Q ss_pred CCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHH Q ss_conf 741468888248999999998198665678776457899999999999808887870299999884 Q gi|254780350|r 100 RPLHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHGLDPSGMVDSSTLEAM 165 (431) Q Consensus 100 ~~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aL 165 (431) ..|+-|.++.+|.+|++||..+||| .+..|++||-.|..|||.||..+||++||++|++|+++| T Consensus 5 ~~~~~G~~G~~V~~~Q~rLk~wGYY--~G~VDG~FG~~Ty~AVr~FQ~knGL~VDGivG~~T~~aL 68 (232) T TIGR02869 5 VTLKRGSTGSDVIEVQRRLKAWGYY--NGKVDGVFGVKTYKAVRKFQSKNGLTVDGIVGPKTKAAL 68 (232) T ss_pred CHHHCCCCCHHHHHHHHHHHHCCCC--CCCCCEEECCHHHHHHHHHHHHCCCCCCCCCCHHHHHHH T ss_conf 2011078621789999999874861--143010547369999999888708855642075789999 No 7 >COG1376 ErfK Uncharacterized protein conserved in bacteria [Function unknown] Probab=99.38 E-value=9.5e-13 Score=98.38 Aligned_cols=123 Identities=22% Similarity=0.270 Sum_probs=86.9 Q ss_pred 
HHCCCEEEEEEECCEEEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCHHHHHHCCEEEECCC Q ss_conf 20452489999888555541231388777785542100389844877887667777777776418677874993999389 Q gi|254780350|r 198 VNIPAASLEAVENGKVGLRSTVIVGRVDRQTPILHSRINRIMFNPYWVIPRSIIQKDMMALLRQDPQYLKDNNIHMIDEK 277 (431) Q Consensus 198 VNip~~~l~~~e~g~~~~~~~viVGk~~~~TP~~~~~i~~iv~NP~W~vP~sI~~~eilpk~~~dp~yl~~~~~~i~~~~ 277 (431) +......+.+.+.+.....+.+.+|+....++- ...++.....|.|+.+..+..++.- T Consensus 96 ~~~~~~~~~~~~~~~~~~~~~v~~g~~~~~~~~-~~~vs~~~~~p~~Tp~g~~~~~~~~--------------------- 153 (232) T COG1376 96 VDTGLRLLYLVDDSGTAQRYPVGIGKEGLDWPG-TAKVSRGKEGPTWTPTGEFIVREKK--------------------- 153 (232) T ss_pred CCCCCEEEEEECCCCCEEEEEEEECCCCCCCCC-EEEEECCCCCCCCCCCCEEEEECCC--------------------- T ss_conf 167635899835888169998875872320253-3787567647972788557850345--------------------- Q ss_pred CCEECCCCCCCCCCCCCCCCE-EECCCCCCCCEEEEEECCCCC--EEEECCCCCHHHCCCCCCCCCCCCEECCCHHHHHH Q ss_conf 978350204701044577224-726899986314895315887--66813889832328655554115034479899999 Q gi|254780350|r 278 GKEVFVEEVDWNSPEPPNFIF-RQDPGKINAMASTKIEFYSRN--NTYMHDTPEPILFNNVVRFETSGCVRVRNIIDLDV 354 (431) Q Consensus 278 g~~vdp~~i~w~~~~~~~~~~-rQ~PGp~NaLG~vkf~f~N~~--~iyLHdTP~~~lF~~~~Ra~ShGCVRv~np~~La~ 354 (431) + ...+.- ..++||+||||..|+.+.... .+++|+||.++.+++ +.|||||||.| +-|. T Consensus 154 ~--------------~~~~~~~~~~~~p~np~G~~y~~~~~~~~~~y~IHgt~~~~~iG~---~~ShGCIRL~n--~Da~ 214 (232) T COG1376 154 G--------------GFYMPNSGVPPGPNNPLGALYAVRSSPSDTGYGIHGTPEPASIGK---AVSHGCIRLSN--QDAK 214 (232) T ss_pred C--------------CCCCCCCCCCCCCCCCCCCEEEEEECCCCCEEEEECCCCCCCCCC---CCCCCCCCCCH--HHHH T ss_conf 3--------------444326666888989755203574047885078868877677786---25887671774--8899 Q ss_pred HHHCCCC Q ss_conf 9840488 Q gi|254780350|r 355 WLLKDTP 361 (431) Q Consensus 355 ~ll~~~~ 361 (431) ||.+..+ T Consensus 215 ~ly~~v~ 221 (232) T COG1376 215 DLYNRVP 221 (232) T ss_pred HHHHCCC T ss_conf 9983589 No 8 >pfam01471 PG_binding_1 Putative peptidoglycan binding domain. 
This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally. Probab=99.18 E-value=4.1e-11 Score=88.29 Aligned_cols=57 Identities=35% Similarity=0.406 Sum_probs=52.4 Q ss_pred CCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHH Q ss_conf 88248999999998198665678776457899999999999808887870299999884 Q gi|254780350|r 107 SSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHGLDPSGMVDSSTLEAM 165 (431) Q Consensus 107 ~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aL 165 (431) ++++|..|+++|...|+++ ...++.||+.+.+||++||+.|||++||++|++||++| T Consensus 1 ~~~~V~~lQ~~L~~~Gy~~--~~~dg~~~~~t~~Av~~fQ~~~gL~~tG~~~~~T~~~L 57 (57) T pfam01471 1 SGEDVKELQRYLKRLGYYP--GPVDGVFGPSTEAAVKAFQRFFGLPVTGIVDPETLAAL 57 (57) T ss_pred CHHHHHHHHHHHHHCCCCC--CCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHCC T ss_conf 9899999999999977999--99999688899999999999959999887699999619 No 9 >PRK06132 hypothetical protein; Provisional Probab=98.85 E-value=4.1e-09 Score=75.92 Aligned_cols=114 Identities=19% Similarity=0.091 Sum_probs=78.6 Q ss_pred HHHCCCCCC-CCCCHHHHHHCCCEEEEEEECCEEEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCCCHHHHHHHHHHHH Q ss_conf 753035556-7652345420452489999888555541231388777785542100389844877887667777777776 Q gi|254780350|r 181 MRIKKLLEQ-KMGLRYVLVNIPAASLEAVENGKVGLRSTVIVGRVDRQTPILHSRINRIMFNPYWVIPRSIIQKDMMALL 259 (431) Q Consensus 181 er~r~l~~~-~~~~~~v~VNip~~~l~~~e~g~~~~~~~viVGk~~~~TP~~~~~i~~iv~NP~W~vP~sI~~~eilpk~ 259 (431) ..+.|.+.. ..|.-.|+|++.+..+++|+++..+...+|..|+..++||.....|..... |+. 
T Consensus 45 g~~~~~~~~~p~gp~~i~vsl~~Q~~~~y~~~~~i~~s~vstG~~g~~TP~G~F~i~~K~~---~h~------------- 108 (365) T PRK06132 45 GEYLWYPERKPQGPLVIVVSLTEQRLYVYDNGILIAVSTVSTGKRGHETPTGVFSILQKDK---DHR------------- 108 (365) T ss_pred CCEEECCCCCCCCCEEEEEECCCEEEEEEECCEEEEEEEECCCCCCCCCCCEEEEEEECCC---CCC------------- T ss_conf 6236788778988889999832008999989989999884258888878861678887024---123------------- Q ss_pred HHCHHHHHHCCEEEECCCCCEECCCCCCCCCCCCCCCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCC Q ss_conf 41867787499399938997835020470104457722472689998631489531588766813889832328655554 Q gi|254780350|r 260 RQDPQYLKDNNIHMIDEKGKEVFVEEVDWNSPEPPNFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFE 339 (431) Q Consensus 260 ~~dp~yl~~~~~~i~~~~g~~vdp~~i~w~~~~~~~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ 339 (431) . + .+++.+.+ +...+.+. +|.||+.+-+ + ... T Consensus 109 -S--------~-------------------~Y~~ApMP-------------~~~rlt~~-GialH~g~lp---g---ypa 140 (365) T PRK06132 109 -S--------N-------------------IYSNAPMP-------------YMQRLTWS-GIALHAGNLP---G---YPA 140 (365) T ss_pred -C--------C-------------------CCCCCCCC-------------CEEEECCC-EEEEECCCCC---C---CCC T ss_conf -5--------6-------------------65788886-------------47985166-3797136679---9---877 Q ss_pred CCCCEECCCHHHHHHHHHCCC Q ss_conf 115034479899999984048 Q gi|254780350|r 340 TSGCVRVRNIIDLDVWLLKDT 360 (431) Q Consensus 340 ShGCVRv~np~~La~~ll~~~ 360 (431) ||||||| |.+||..|..-+ T Consensus 141 SHGCirl--P~~fA~~lf~~t 159 (365) T PRK06132 141 SHGCVRL--PMAFAKKLFGWT 159 (365) T ss_pred CCCCCCC--CHHHHHHHHCCC T ss_conf 6764569--789999985514 No 10 >COG3409 Putative peptidoglycan-binding domain-containing protein [Cell envelope biogenesis, outer membrane] Probab=98.66 E-value=1.1e-07 Score=67.19 Aligned_cols=69 Identities=26% Similarity=0.302 Sum_probs=61.6 Q ss_pred CCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCC-CCCCCCCHHHHHHHCCC Q ss_conf 77414688882489999999981986656787764578999999999998088-87870299999884489 Q gi|254780350|r 99 
IRPLHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHGL-DPSGMVDSSTLEAMNVP 168 (431) Q Consensus 99 ~~~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhGL-~~DGvig~~Tl~aLN~~ 168 (431) ....+.+..++.|..|+..|...|++.. +..++.|++.+..||++||+.||| ++||++|++|+.+|... T Consensus 36 ~~~~~~~~~~~~v~~lq~~L~~~g~~~~-~~~dg~~g~~t~~av~~fQ~~~gl~~~dG~~g~~t~~al~~~ 105 (185) T COG3409 36 DPVLTLGAEGPSVRILQAALNALGYYPD-GVIDGVYGPETAAAVRAFQQKNGLSPVDGIVGPATRAALPSQ 105 (185) T ss_pred CCCCCCCCCCCCHHHHHHHHHHCCCCCC-CCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHH T ss_conf 4210256677609999999997588766-766687672448999989997298767776588899998742 No 11 >PRK12472 hypothetical protein; Provisional Probab=98.26 E-value=9.3e-07 Score=61.37 Aligned_cols=121 Identities=19% Similarity=0.208 Sum_probs=77.2 Q ss_pred CHHHHHHCCCEEEEEEECCEEEEEECCCCCCCCCCCCCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCHHHHHHCCEE Q ss_conf 23454204524899998885555412313887777855421003898448778876677777777764186778749939 Q gi|254780350|r 193 LRYVLVNIPAASLEAVENGKVGLRSTVIVGRVDRQTPILHSRINRIMFNPYWVIPRSIIQKDMMALLRQDPQYLKDNNIH 272 (431) Q Consensus 193 ~~~v~VNip~~~l~~~e~g~~~~~~~viVGk~~~~TP~~~~~i~~iv~NP~W~vP~sI~~~eilpk~~~dp~yl~~~~~~ 272 (431) +-+.+|.|-..++++|+......+.+|..|+..+.||.....|- ..| .|+ .- T Consensus 56 Pi~aiVSi~~Q~vt~YDa~G~~~~apVSTG~~G~~TP~GVFsii--qK~-k~H------------------------rS- 107 (512) T PRK12472 56 PIMAIVSIKSQQVTLYDADGWILRAPVSTGTTGRETPAGVFAIV--EKD-KDH------------------------HS- 107 (512) T ss_pred CEEEEEEECCCEEEEECCCCEEEECCCCCCCCCCCCCCEEEEEE--ECC-CHH------------------------HH- T ss_conf 65999985132689981687289655347877787775345664--215-202------------------------20- Q ss_pred EECCCCCEECCCCCCCCCCCCCCCCEEECCCCCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCCCCCCEECCCHHHH Q ss_conf 99389978350204701044577224726899986314895315887668138898323286555541150344798999 Q gi|254780350|r 273 MIDEKGKEVFVEEVDWNSPEPPNFIFRQDPGKINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFETSGCVRVRNIIDL 352 (431) Q Consensus 273 
i~~~~g~~vdp~~i~w~~~~~~~~~~rQ~PGp~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ShGCVRv~np~~L 352 (431) +.++....++.|. |-.+ +|-||..+-|+. ..||||||| |.+| T Consensus 108 ----------------niY~~A~MP~MQR-----------iTWs---GiALH~G~lPGY------pASHGCvRm--P~~F 149 (512) T PRK12472 108 ----------------TMYDDAWMPNMQR-----------ITWN---GVALHGGPLPGY------AASHGCVRM--PYGF 149 (512) T ss_pred ----------------CCCCCCCCCCEEE-----------EEEC---CEEECCCCCCCC------CCCCCCCCC--CHHH T ss_conf ----------------3347888750001-----------2121---223216778998------676762427--6589 Q ss_pred HHHHHCCCCCCCHHHHHHHHHCCCEEEEECC Q ss_conf 9998404889998999998625982899658 Q gi|254780350|r 353 DVWLLKDTPTWSRYHIEEVVKTRKTTPVKLA 383 (431) Q Consensus 353 a~~ll~~~~~~~~~~i~~~~~~~~~~~v~l~ 383 (431) |+-|. +|++....-.+..++-..+.+. T Consensus 150 A~~lf----~~T~~G~RViv~p~d~aPv~fs 176 (512) T PRK12472 150 AEKLF----DKTRIGMRVIISPNDAAPVDFS 176 (512) T ss_pred HHHHH----HHCCCCCEEEECCCCCCCCCCC T ss_conf 99874----0002663799537877764457 No 12 >COG3409 Putative peptidoglycan-binding domain-containing protein [Cell envelope biogenesis, outer membrane] Probab=98.13 E-value=5.9e-06 Score=56.44 Aligned_cols=64 Identities=30% Similarity=0.337 Sum_probs=56.6 Q ss_pred CCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHC Q ss_conf 14688882489999999981986656787764578999999999998088878702999998844 Q gi|254780350|r 102 LHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHGLDPSGMVDSSTLEAMN 166 (431) Q Consensus 102 L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aLN 166 (431) ...+.....+..++..+...|+.... 
..+++|+..++.||+.||..|||.+||++|+.||.+|- T Consensus 121 ~~~~~~~~~~~~~~~~~~~~~~~~~~-~~dg~fg~~t~~~v~~~q~~~~l~~dgi~g~~t~~~l~ 184 (185) T COG3409 121 PGLGLGGGDVATLQQPLPLLGYRSGI-RVDGIFGPQTEAAVKAFQRQYGLTVDGIVGPQTWAALR 184 (185) T ss_pred CCCCCCCCCCHHHHHHHHHHCCCCCC-CCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHC T ss_conf 44454664412255541210234455-55664571479999999986066644443600376752 No 13 >COG3023 ampD N-acetyl-anhydromuramyl-L-alanine amidase [Cell envelope biogenesis, outer membrane] Probab=97.36 E-value=0.00074 Score=43.44 Aligned_cols=72 Identities=22% Similarity=0.239 Sum_probs=54.8 Q ss_pred HCCCCCCC-CCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCC-CCCCCCCHHHHHHH Q ss_conf 80998547-777414688882489999999981986656787764578999999999998088-87870299999884 Q gi|254780350|r 90 SRGGWPEL-PIRPLHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHGL-DPSGMVDSSTLEAM 165 (431) Q Consensus 90 ~~ggW~~i-~~~~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhGL-~~DGvig~~Tl~aL 165 (431) ..|.|+.- +....+.+...+.|..|++.|..-||-. .. +.||..+..+|++||...+= ..||+.|-+|.+.| T Consensus 178 gigaw~~~~~~~~~~~~~~~~~v~~lq~~L~~YGY~v---~~-~~~d~~t~~vv~aFQ~hfrp~~~dg~~d~et~a~l 251 (257) T COG3023 178 GIGAWLDTAQVQKYLALLKGEDVAALQEMLARYGYGV---EI-GVFDQETQQVVRAFQMHFRPGLYDGEADVETIAIL 251 (257) T ss_pred CCCCCCCHHHHHHHHHHHCCCCHHHHHHHHHHHCCCC---CC-CHHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHH T ss_conf 8644776666531244404678999999999848688---86-52569999999999987077788888876899999 No 14 >pfam08823 PG_binding_2 Putative peptidoglycan binding domain. This family may be a peptidoglycan binding domain. 
Probab=96.06 E-value=0.013 Score=35.76 Aligned_cols=57 Identities=16% Similarity=0.239 Sum_probs=47.5 Q ss_pred CCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCC----CCCCCCHHHHHHH Q ss_conf 8824899999999819866567877645789999999999980888----7870299999884 Q gi|254780350|r 107 SSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHGLD----PSGMVDSSTLEAM 165 (431) Q Consensus 107 ~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhGL~----~DGvig~~Tl~aL 165 (431) .+..+..++.-|..+|||. +..+++||+.+.+|++.|+...++. .||.||..+++-| T Consensus 14 ~~dv~~ev~~~L~rLGyy~--g~~~g~~dea~~~Al~~~~~~ENfEeR~~~d~~ID~~Vl~yL 74 (74) T pfam08823 14 TGDVAEEVQAALSRLGYYK--GEATGVFDEATRDALRAWIATENFENRYRGDGEIDRSVLSYL 74 (74) T ss_pred HHHHHHHHHHHHHHCCCCC--CCCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHC T ss_conf 2899999999999838534--887774229999999999987405541268981129999449 No 15 >pfam09374 PG_binding_3 Predicted Peptidoglycan domain. This family contains a potential peptidoglycan binding domain. Probab=95.85 E-value=0.0092 Score=36.71 Aligned_cols=45 Identities=27% Similarity=0.365 Sum_probs=32.3 Q ss_pred HHHHHHHHHCC---CCCCCCCCHHHHHHHCCC-HHHHHHHHHHHHHHHCCC Q ss_conf 99999999808---887870299999884489-889888986407753035 Q gi|254780350|r 140 SAVKLFQMRHG---LDPSGMVDSSTLEAMNVP-VDLRIRQLQVNLMRIKKL 186 (431) Q Consensus 140 ~AVk~FQ~rhG---L~~DGvig~~Tl~aLN~~-~~~r~~qi~~nler~r~l 186 (431) .|+|--|+.-| ..+||+|||.|+++++.. 
++.-+ ...+.+|..++ T Consensus 1 rA~k~LQr~~G~~~v~~DGiIGp~Tl~Av~~~d~~~l~--~~~~~~R~~fy 49 (67) T pfam09374 1 RAIKMLQRALGQPDVAADGIIGPKTLAALASMGENDLI--KALNAARIRFY 49 (67) T ss_pred CHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHCCHHHHH--HHHHHHHHHHH T ss_conf 95899999867898787878289999999967999999--99999999999 No 16 >KOG1565 consensus Probab=93.21 E-value=0.19 Score=28.61 Aligned_cols=59 Identities=32% Similarity=0.403 Sum_probs=48.0 Q ss_pred CHHHHHHHHHHCCCCCCCC-CCCCCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHCCC Q ss_conf 4899999999819866567-8776457899999999999808887870299999884489 Q gi|254780350|r 110 SVQRLRERLIISGDLDPSK-GLSVAFDAYVESAVKLFQMRHGLDPSGMVDSSTLEAMNVP 168 (431) Q Consensus 110 ~V~~Lr~RL~~~Gdl~~~~-~~~~~yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aLN~~ 168 (431) ....+..-|..-||+.... ..+...+..+.+|++.||.-.||++||.+|..|++.|+.| T Consensus 29 ~~~~~~~yl~~~~y~~~~~~~~~~~~~~~~~~al~~~q~~~~l~~tG~lD~~Tl~~m~~p 88 (469) T KOG1565 29 DKVALQDYLECYGYLPPTDLTATRASQNVLEDALKMMQDFFGLPVTGKLDNATLALMNKP 88 (469) T ss_pred CHHHHHHHHHCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCHHHHHHCCCC T ss_conf 355667665304536776545456675105668876676417565675564666431488 No 17 >COG3926 zliS Lysozyme family protein [General function prediction only] Probab=91.63 E-value=0.28 Score=27.55 Aligned_cols=47 Identities=21% Similarity=0.164 Sum_probs=35.1 Q ss_pred HHHHHHHHHHCC----CCCCCCCCHHHHHHHCCCHHHHHHHHHHHHHHHCCC Q ss_conf 999999999808----887870299999884489889888986407753035 Q gi|254780350|r 139 ESAVKLFQMRHG----LDPSGMVDSSTLEAMNVPVDLRIRQLQVNLMRIKKL 186 (431) Q Consensus 139 ~~AVk~FQ~rhG----L~~DGvig~~Tl~aLN~~~~~r~~qi~~nler~r~l 186 (431) -.|+|--|+.-| .+.||+||.+|++|++.-+.. -..-.++.+|+-|+ T Consensus 94 ~rAa~~LQkal~~~~~v~~DGvIG~~TLaAl~~~~~~-~~i~~~~d~r~a~l 144 (252) T COG3926 94 GRAAKWLQKALGPAYTVRVDGVIGAQTLAALKKDPAN-DLIGRICDARLAFL 144 (252) T ss_pred CHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHCCCH-HHHHHHHHHHHHHH T ss_conf 1699999998565776665676248889988716531-37888888999998 No 18 >pfam10908 DUF2778 Protein of unknown function (DUF2778). 
This is a bacterial family of uncharacterized proteins. Probab=68.04 E-value=7.3 Score=18.81 Aligned_cols=18 Identities=17% Similarity=0.460 Sum_probs=14.8 Q ss_pred CCCCCCCEECCCHHHHHH Q ss_conf 554115034479899999 Q gi|254780350|r 337 RFETSGCVRVRNIIDLDV 354 (431) Q Consensus 337 Ra~ShGCVRv~np~~La~ 354 (431) +-.|+|||=+++..++.. T Consensus 88 ~G~S~GCIT~~~~~~F~~ 105 (121) T pfam10908 88 RGDSNGCITFKDYARFLR 105 (121) T ss_pred CCCCCCCEEECCHHHHHH T ss_conf 866587777668899999 No 19 >pfam09692 Arb1 Argonaute siRNA chaperone (ARC) complex subunit Arb1. Arb1 is required for histone H3 Lys9 (H3-K9) methylation, heterochromatin, assembly and siRNA generation in fission yeast. Probab=66.72 E-value=6.4 Score=19.13 Aligned_cols=39 Identities=13% Similarity=0.073 Sum_probs=27.4 Q ss_pred CCHHHHHHHHCCCHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHH Q ss_conf 76787642000788887987366758788999999999999998 Q gi|254780350|r 46 SIVNDRFDNFLARVDMGIDSDIPIISKETIAQTEKAIAFYQDIL 89 (431) Q Consensus 46 ~~~d~~f~~~~~~~~~~~~s~~P~~s~~~~~~l~~al~~y~~i~ 89 (431) .=+|+-|.-...+.+++..... ++++ +++.+|+|||.-- T Consensus 16 TGFEeyy~DpP~Tp~E~eEek~-iY~~----RiE~~IQRyr~kR 54 (392) T pfam09692 16 TGFEEYYADPPITPEEYEEEKE-IYDP----RIEEAIQRYRAKR 54 (392) T ss_pred CCCCCCCCCCCCCHHHHHHHHH-HCCH----HHHHHHHHHHHHH T ss_conf 9773013789989788877652-2467----8999999999863 No 20 >COG3442 Predicted glutamine amidotransferase [General function prediction only] Probab=63.09 E-value=5.3 Score=19.65 Aligned_cols=51 Identities=12% Similarity=0.083 Sum_probs=38.4 Q ss_pred CCCCCEEEEEECCCCCEEEECCCCCHHHCCCCCCCCCCCCEECCCHHHHHHHHHCC Q ss_conf 99863148953158876681388983232865555411503447989999998404 Q gi|254780350|r 304 KINAMASTKIEFYSRNNTYMHDTPEPILFNNVVRFETSGCVRVRNIIDLDVWLLKD 359 (431) Q Consensus 304 p~NaLG~vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~ShGCVRv~np~~La~~ll~~ 359 (431) .--|||.|.++.-|.-+=.-.|-=.++.|+ ..+||-+=-.|| +||.||+.. 
T Consensus 164 d~~pLG~Vv~G~GNn~eD~~eG~~ykn~~a----TY~HGP~L~rNp-~LAd~Ll~t 214 (250) T COG3442 164 DVKPLGKVVYGYGNNGEDGTEGAHYKNVIA----TYFHGPILSRNP-ELADRLLTT 214 (250) T ss_pred CCCCCEEEEECCCCCCCCCCCCEEEEEEEE----EEECCCCCCCCH-HHHHHHHHH T ss_conf 876460078866777554664234520478----751175446887-899999999 No 21 >TIGR02546 III_secr_ATP type III secretion apparatus H+-transporting two-sector ATPase; InterPro: IPR013380 Proteins in this entry are found in a variety of bacteria, and are predicted to be ATPases involved in type III secretion systems. One example is YscN (P40290 from SWISSPROT) from Yersinia enterocolitica, which is thought to energise the YOP (Yersinia outer protein) secretion system .; GO: 0046961 hydrogen ion transporting ATPase activity rotational mechanism, 0030254 protein secretion by the type III secretion system. Probab=51.47 E-value=18 Score=16.41 Aligned_cols=59 Identities=19% Similarity=0.314 Sum_probs=41.8 Q ss_pred CCCHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 07888879873667587889999999999999988099854777741468888248999999998198665678776457 Q gi|254780350|r 56 LARVDMGIDSDIPIISKETIAQTEKAIAFYQDILSRGGWPELPIRPLHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFD 135 (431) Q Consensus 56 ~~~~~~~~~s~~P~~s~~~~~~l~~al~~y~~i~~~ggW~~i~~~~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD 135 (431) .+|++.-+..-...=+.+....+++-|++|+++ +-|+..|.|..- -| T Consensus 345 LaS~SRvm~~vv~~eH~~aA~~lR~LLA~Y~e~---------------------------e~LI~lGEY~~G------~D 391 (430) T TIGR02546 345 LASLSRVMSQVVSKEHRRAAGKLRRLLAKYKEV---------------------------ELLIRLGEYQPG------SD 391 (430) T ss_pred HHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHH---------------------------HHHHHHCCCCCC------CC T ss_conf 523664236778878999999999999999999---------------------------889874488899------89 Q ss_pred HHHHHHHHHHHH Q ss_conf 899999999999 Q gi|254780350|r 136 AYVESAVKLFQM 147 (431) Q Consensus 136 ~~l~~AVk~FQ~ 147 (431) +.+++||++.-. 
T Consensus 392 ~~~D~A~~~~~~ 403 (430) T TIGR02546 392 PETDKAIDKIDA 403 (430) T ss_pred HHHHHHHHHHHH T ss_conf 889999975778 No 22 >TIGR00534 OpcA opcA protein; InterPro: IPR004555 The opcA gene is found immediately downstream of zwf, the glucose-6-phosphate dehydrogenase (G6PDH) gene, in a number of species, including Mycobacterium tuberculosis, Streptomyces coelicolor, Nostoc punctiforme, and Synechococcus sp. (strain PCC 7942). In the latter, disruption of opcA was shown to block assembly of G6PDH into active oligomeric forms. The protein is thought to play a role in the functional assembly of glucose-6-phosphate dehydrogenase.. Probab=45.77 E-value=18 Score=16.39 Aligned_cols=109 Identities=20% Similarity=0.154 Sum_probs=72.4 Q ss_pred HHHHHHHHHHHHHH----HHCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHCC Q ss_conf 99999999999999----88099854777741468888248999999998198665678776457899999999999808 Q gi|254780350|r 75 IAQTEKAIAFYQDI----LSRGGWPELPIRPLHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVKLFQMRHG 150 (431) Q Consensus 75 ~~~l~~al~~y~~i----~~~ggW~~i~~~~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk~FQ~rhG 150 (431) .....+.+.+.+.- ...|+-|.+. ..+..--.-..-..++.-+.+.|.+ .+..+++.+++...|++.-|.+|. T Consensus 10 ~~~~~~~l~~~~~~~g~~~~~g~lp~~~-~~~t~~~~~~~~~~~~~~~~a~g~y--~g~~dg~~gp~~a~~~~~~~~~h~ 86 (420) T TIGR00534 10 LSEINKELNQLWESIGTPGEDGGLPSVG-RVLTLVIVPYEPEELQESLAAAGLY--NGPLDGLLGPNGASALREPKPKHE 86 (420) T ss_pred HHHHHHHHHHHHHHHCCCCCCCCCCCCC-EEEEEEEEECCHHHHHHHHHHHHHC--CCCCCCCCCCCCHHHHHCCCCCCC T ss_conf 7788889998887616641014654201-0246656414777899999874310--353233246431023311210014 Q ss_pred CC-CCCCCCHHHHHHHCCCHHHHHHHHHHHHHHHCCC Q ss_conf 88-7870299999884489889888986407753035 Q gi|254780350|r 151 LD-PSGMVDSSTLEAMNVPVDLRIRQLQVNLMRIKKL 186 (431) Q Consensus 151 L~-~DGvig~~Tl~aLN~~~~~r~~qi~~nler~r~l 186 (431) .+ .+|.-+.+|+..|...+..--..-...-..++.. 
T Consensus 87 ~p~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~ 123 (420) T TIGR00534 87 HPSETGTPDSRTLVALREEPAKGKHGGTGGEHDLAGL 123 (420) T ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHHHHHC T ss_conf 7754566203556655542443204776421343102 No 23 >PRK09726 DNA-binding transcriptional regulator HipB; Provisional Probab=45.69 E-value=22 Score=15.86 Aligned_cols=17 Identities=24% Similarity=0.417 Sum_probs=6.9 Q ss_pred HHHHHHHHHHHHCCCCC Q ss_conf 99999999999808887 Q gi|254780350|r 137 YVESAVKLFQMRHGLDP 153 (431) Q Consensus 137 ~l~~AVk~FQ~rhGL~~ 153 (431) .|..++|.+-..+||.. T Consensus 12 qLa~~Lr~~Rk~~gLsQ 28 (88) T PRK09726 12 QLANAMKLVRQQNGWTQ 28 (88) T ss_pred HHHHHHHHHHHHCCCCH T ss_conf 99999999999859879 No 24 >pfam05756 S-antigen S-antigen protein. S-antigens are heat stable proteins that are found in the blood of individuals infected with malaria. Probab=38.65 E-value=18 Score=16.34 Aligned_cols=14 Identities=36% Similarity=0.773 Sum_probs=10.4 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 89999999999985 Q gi|254780350|r 9 KILYCFFVYLILPM 22 (431) Q Consensus 9 ~~~~~~~~~~~l~~ 22 (431) --||.||+||.+-- T Consensus 7 VsfyLFFiYLYIYk 20 (310) T pfam05756 7 VTFYLFFIYLYIYK 20 (310) T ss_pred EHHHHHHHHHHHHH T ss_conf 15789999999987 No 25 >pfam12068 DUF3548 Domain of unknown function (DUF3548). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 184 to 216 amino acids in length. This domain is found associated with pfam00566. This domain is found at the N-terminus of GYP7 proteins. Probab=37.55 E-value=28 Score=15.22 Aligned_cols=14 Identities=7% Similarity=0.093 Sum_probs=6.8 Q ss_pred EEEECCCCCHHHCC Q ss_conf 66813889832328 Q gi|254780350|r 320 NTYMHDTPEPILFN 333 (431) Q Consensus 320 ~iyLHdTP~~~lF~ 333 (431) ..|.|+.-.++++. 
T Consensus 143 ~L~Fh~gg~~e~l~ 156 (207) T pfam12068 143 PLHFHNGGTREFLK 156 (207) T ss_pred CEEEECCCHHHHHH T ss_conf 54775398799999 No 26 >pfam03662 Glyco_hydro_79n Glycosyl hydrolase family 79, N-terminal domain. Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesized as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumour cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity.
Probab=35.88 E-value=15 Score=16.80 Aligned_cols=15 Identities=13% Similarity=0.175 Sum_probs=6.3 Q ss_pred EEEECCCCCHHHCCC Q ss_conf 668138898323286 Q gi|254780350|r 320 NTYMHDTPEPILFNN 334 (431) Q Consensus 320 ~iyLHdTP~~~lF~~ 334 (431) -|+|-+-.++.++++ T Consensus 245 ~Ynlg~g~d~~lv~k 259 (320) T pfam03662 245 IYNLGPGVDPHLIDK 259 (320) T ss_pred EECCCCCCCHHHHHH T ss_conf 854799877778876 No 27 >COG3139 Uncharacterized protein conserved in bacteria [Function unknown] Probab=35.76 E-value=31 Score=14.90 Aligned_cols=20 Identities=30% Similarity=0.393 Sum_probs=16.7 Q ss_pred CHHHHHHHHHHHHHCCCCCC Q ss_conf 78999999999998088878 Q gi|254780350|r 135 DAYVESAVKLFQMRHGLDPS 154 (431) Q Consensus 135 D~~l~~AVk~FQ~rhGL~~D 154 (431) -++--+||..+|.+|.++++ T Consensus 38 ke~clQaVmlwqarhN~~aq 57 (90) T COG3139 38 KENCLQAVMLWQARHNTEAQ 57 (90) T ss_pred HHHHHHHHHHHHHHHCCHHH T ss_conf 99899999999986077166 No 28 >COG5383 Uncharacterized protein conserved in bacteria [Function unknown] Probab=34.86 E-value=24 Score=15.58 Aligned_cols=43 Identities=12% Similarity=0.206 Sum_probs=23.2 Q ss_pred HHHHHHHHHCCCCCCCC---CCCCCCCCCCCCHHHHHHHHHHCCCCC Q ss_conf 99999998809985477---774146888824899999999819866 Q gi|254780350|r 82 IAFYQDILSRGGWPELP---IRPLHLGNSSVSVQRLRERLIISGDLD 125 (431) Q Consensus 82 l~~y~~i~~~ggW~~i~---~~~L~~G~~~~~V~~Lr~RL~~~Gdl~ 125 (431) |.+-+.++..+.-..+. .+.++.|.. .....||.-..++|-++ T Consensus 44 l~~~~~~~~~~~l~r~~~erhgairvgt~-~el~~~rr~fa~mgm~p 89 (295) T COG5383 44 LTRHRRLERTDSLERLTEERHGAIRVGTA-AELSMLRRLFAVMGMYP 89 (295) T ss_pred HHHHHHHHHCCHHHHHHHHHCCCEECCCH-HHHHHHHHHHHHHCCCC T ss_conf 98888888534387765876475230789-99999999999965775 No 29 >pfam11625 DUF3253 Protein of unknown function (DUF3253). This bacterial family of proteins has no known function. 
Probab=32.55 E-value=35 Score=14.58 Aligned_cols=34 Identities=24% Similarity=0.294 Sum_probs=24.3 Q ss_pred HHHHHHHHHCHHHHHHCCEEEECCCCCEECCCCC Q ss_conf 7777776418677874993999389978350204 Q gi|254780350|r 253 KDMMALLRQDPQYLKDNNIHMIDEKGKEVFVEEV 286 (431) Q Consensus 253 ~eilpk~~~dp~yl~~~~~~i~~~~g~~vdp~~i 286 (431) +++++..++.-.-|...|...+...|+.|||... T Consensus 39 R~lm~~vR~aA~~L~~~G~i~ItqkG~~VDp~~~ 72 (83) T pfam11625 39 RPLMPPVRRAARRLAEAGEVEITQKGKPVDPATA 72 (83) T ss_pred HHHHHHHHHHHHHHHHCCCEEEEECCEECCCCCC T ss_conf 9886999999999987896899779987470006 No 30 >COG3562 KpsS Capsule polysaccharide export protein [Cell envelope biogenesis, outer membrane] Probab=31.50 E-value=30 Score=14.99 Aligned_cols=34 Identities=32% Similarity=0.555 Sum_probs=24.4 Q ss_pred CCEEEECCCCCHHHCCCCCCCCCCCCEECCCHHHHHHHH Q ss_conf 876681388983232865555411503447989999998 Q gi|254780350|r 318 RNNTYMHDTPEPILFNNVVRFETSGCVRVRNIIDLDVWL 356 (431) Q Consensus 318 ~~~iyLHdTP~~~lF~~~~Ra~ShGCVRv~np~~La~~l 356 (431) ..-.|+||+|-|.+-.+ +-|||-|.--..|...+ T Consensus 302 ~RvlYvhd~~lpvllr~-----a~GmVTvNsTsGlsal~ 335 (403) T COG3562 302 GRVLYVHDVPLPVLLRH-----ALGMVTVNSTSGLSALL 335 (403) T ss_pred CEEEEECCCCCHHHHHH-----CCCEEEECCCCCHHHHH T ss_conf 44999628885688874-----36549974642167886 No 31 >TIGR01225 hutH histidine ammonia-lyase; InterPro: IPR005921 Histidine ammonia-lyase deaminates histidine to urocanic acid, the first step in histidine degradation. It is closely related to phenylalanine ammonia-lyase. ; GO: 0004397 histidine ammonia-lyase activity, 0006548 histidine catabolic process, 0005737 cytoplasm. 
Probab=29.85 E-value=15 Score=16.78 Aligned_cols=34 Identities=21% Similarity=0.284 Sum_probs=14.1 Q ss_pred CCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHCCC Q ss_conf 57899999999999808887870299999884489 Q gi|254780350|r 134 FDAYVESAVKLFQMRHGLDPSGMVDSSTLEAMNVP 168 (431) Q Consensus 134 yD~~l~~AVk~FQ~rhGL~~DGvig~~Tl~aLN~~ 168 (431) |+.....|-..- +.|||+|=--=-+|=++-+|.+ T Consensus 163 ~~G~~~~A~~~L-~~aGL~PV~L~~KEGLALINGT 196 (529) T TIGR01225 163 FEGERMPAAEAL-AAAGLKPVTLKAKEGLALINGT 196 (529) T ss_pred CCCCCCCHHHHH-HHHCCCCEEECCCCCHHHHHHH T ss_conf 776506388999-9839998020554330343067 No 32 >TIGR01811 sdhA_Bsu succinate dehydrogenase or fumarate reductase, flavoprotein subunit; InterPro: IPR011280 This entry represents the succinate dehydrogenase flavoprotein subunit as found in the low-GC Gram-positive bacteria and a few other lineages. This enzyme may act in a complete or partial TCA cycle, or act in the opposite direction as fumarate reductase. In some but not all species, succinate dehydrogenase and fumarate reductase may be encoded as separate isozymes.. Probab=29.49 E-value=40 Score=14.26 Aligned_cols=133 Identities=17% Similarity=0.153 Sum_probs=77.4 Q ss_pred CCCHHHHHHHHHHHHHHCHHHHHHCCEEEECCCC--CEECCCCCC--------CC----CCCCCCCCEEECCCCCCCCEE Q ss_conf 8876677777777764186778749939993899--783502047--------01----044577224726899986314 Q gi|254780350|r 245 VIPRSIIQKDMMALLRQDPQYLKDNNIHMIDEKG--KEVFVEEVD--------WN----SPEPPNFIFRQDPGKINAMAS 310 (431) Q Consensus 245 ~vP~sI~~~eilpk~~~dp~yl~~~~~~i~~~~g--~~vdp~~i~--------w~----~~~~~~~~~rQ~PGp~NaLG~ 310 (431) -||+-|+.+++-........--.-++...+|-.. 
..+-...|+ .+ ..++.+-|.+.-|--+=.+|- T Consensus 304 LVPRDiAsR~i~~~c~~~~gvg~~~~~vYLDf~d~~~RlG~~~~~~kygnl~~~Y~~~~gddPy~~PM~I~PavHYtMGG 383 (620) T TIGR01811 304 LVPRDIASRAIKEVCDAGKGVGPGENAVYLDFSDAIERLGRKEIDAKYGNLFEMYEKITGDDPYKVPMRIYPAVHYTMGG 383 (620) T ss_pred CCCHHHHHHHHHHHHHHHCCCCCCCCEEEEEHHHCCHHCCHHHHHHHHCCHHHHHHHHHCCCCCCCCCEECCCCEEECCC T ss_conf 86224777999998864037655443135014430021057889877242888999972687798985126852221685 Q ss_pred EEEECCCCCEEEECCCCCHHHCCCCCCCCC-CCCEECCCHHHHHHHHHCCCCCCCHHHHHHHHHCCCEEEEECCCCCC Q ss_conf 895315887668138898323286555541-15034479899999984048899989999986259828996589886 Q gi|254780350|r 311 TKIEFYSRNNTYMHDTPEPILFNNVVRFET-SGCVRVRNIIDLDVWLLKDTPTWSRYHIEEVVKTRKTTPVKLATEVP 387 (431) Q Consensus 311 vkf~f~N~~~iyLHdTP~~~lF~~~~Ra~S-hGCVRv~np~~La~~ll~~~~~~~~~~i~~~~~~~~~~~v~l~~~iP 387 (431) +|++|..- |.=|+||+-.+-.|| ||.=||- +-.|..-+....-. -...|+..+....... .+++..| T Consensus 384 LwvDYd~m-------T~~pGLFa~GEc~fs~HGANRLG-AnsLl~a~~dG~~~-lP~ti~~~~~~~~~~~-~~~~~~P 451 (620) T TIGR01811 384 LWVDYDLM-------TTVPGLFAAGECDFSDHGANRLG-ANSLLSALADGYFV-LPATIENYLGLELSSE-DLDEDAP 451 (620) T ss_pred EEEEHHCC-------CCCCCEEEEECCCCCCCCCCHHH-HHHHHHHHCCCEEE-CHHHHHHHHCCCCCCC-CCCCCCC T ss_conf 04412003-------68875135301671224531556-99998863497576-3025775530245777-6764440 No 33 >TIGR01047 nspC carboxynorspermidine decarboxylase; InterPro: IPR005730 Carboxynorspermidine synthase, mediates the nicotinamide-nucleotide-linked reduction of the Schiff base H2N(CH2)3N = CHCH2CH(NH2)COOH. This is formed from L-aspartic beta-semialdehyde (ASA) and 1,3-diaminopropane (DAP) and is reduced to carboxynorspermidine [H2N(CH2)3NH(CH2)2CH(NH2)COOH], an intermediate in the novel pathway for norspermidine biosynthesis shown in Vibrio alginolyticus.; GO: 0016830 carbon-carbon lyase activity, 0045312 nor-spermidine biosynthetic process. 
Probab=27.87 E-value=39 Score=14.27 Aligned_cols=33 Identities=6% Similarity=-0.012 Sum_probs=19.6 Q ss_pred HHCCCCHHHHHHHHCCCHH---HHHHHHCCCCCHHH Q ss_conf 1115767876420007888---87987366758788 Q gi|254780350|r 42 ESYHSIVNDRFDNFLARVD---MGIDSDIPIISKET 74 (431) Q Consensus 42 ~~~~~~~d~~f~~~~~~~~---~~~~s~~P~~s~~~ 74 (431) +-.+++||..|........ .=+-.+.|+|+++. T Consensus 55 ~taSgL~EAkLA~E~fgGreshkE~HvYsPay~e~d 90 (403) T TIGR01047 55 ATASGLWEAKLAKEEFGGRESHKEVHVYSPAYKEED 90 (403) T ss_pred CCCCCHHHHHHHHHHCCCCCCCCCEEEECCCCCHHH T ss_conf 242676788632231386026766687158888645 No 34 >KOG3517 consensus Probab=27.81 E-value=14 Score=17.08 Aligned_cols=26 Identities=31% Similarity=0.487 Sum_probs=12.8 Q ss_pred HHHCCCCCCCCCCHHHHHHHCCCHHHHHHHH Q ss_conf 9980888787029999988448988988898 Q gi|254780350|r 146 QMRHGLDPSGMVDSSTLEAMNVPVDLRIRQL 176 (431) Q Consensus 146 Q~rhGL~~DGvig~~Tl~aLN~~~~~r~~qi 176 (431) .-|--|-.||+-|+. |+|--.-+..| T Consensus 101 EIRDRLlsdgiCDk~-----NvPSVSSISRI 126 (334) T KOG3517 101 EIRDRLLSDGICDKY-----NVPSVSSISRI 126 (334) T ss_pred HHHHHHHHCCCCCCC-----CCCCHHHHHHH T ss_conf 124455411544334-----78626789999 No 35 >TIGR01819 F420_cofD LPPG:Fo 2-phospho-L-lactate transferase; InterPro: IPR010115 This entry represents LPPG:Fo 2-phospho-L-lactate transferase (CofD), which catalyses the fourth step in the biosynthesis of coenzyme F420, a flavin derivative found in methanogens, Mycobacteria, and several other lineages. This enzyme is characterised so far in Methanococcus jannaschii but appears restricted to F420-containing species and is predicted to carry out the same function in these other species. . 
Probab=26.88 E-value=40 Score=14.26 Aligned_cols=154 Identities=17% Similarity=0.227 Sum_probs=79.7 Q ss_pred HCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCC--CCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHH Q ss_conf 36675878899999999999999880998547777--4146888824899999999819866567877645789999999 Q gi|254780350|r 66 DIPIISKETIAQTEKAIAFYQDILSRGGWPELPIR--PLHLGNSSVSVQRLRERLIISGDLDPSKGLSVAFDAYVESAVK 143 (431) Q Consensus 66 ~~P~~s~~~~~~l~~al~~y~~i~~~ggW~~i~~~--~L~~G~~~~~V~~Lr~RL~~~Gdl~~~~~~~~~yD~~l~~AVk 143 (431) ..|.|-|..-+..-+-++| |....++ -|++||+|-++..+|..+...|+- |-+..+ T Consensus 102 dtPrylPddaqtaGrdiar---------Wrrfsa~~E~~~lGDrDrAtH~~Rt~~l~~G~~-------------Ls~vT~ 159 (359) T TIGR01819 102 DTPRYLPDDAQTAGRDIAR---------WRRFSAADEWLRLGDRDRATHILRTQMLRAGHS-------------LSEVTE 159 (359) T ss_pred CCCCCCCCHHHHHHHHHHH---------HHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCCC-------------HHHHHH T ss_conf 8886576412232224466---------520147764321362788899999999866887-------------779999 Q ss_pred HHHHHCCCCCCCCCCHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHCC----C----EEEEEE-ECCEEE Q ss_conf 9999808887870299999884489889888986407753035556765234542045----2----489999-888555 Q gi|254780350|r 144 LFQMRHGLDPSGMVDSSTLEAMNVPVDLRIRQLQVNLMRIKKLLEQKMGLRYVLVNIP----A----ASLEAV-ENGKVG 214 (431) Q Consensus 144 ~FQ~rhGL~~DGvig~~Tl~aLN~~~~~r~~qi~~nler~r~l~~~~~~~~~v~VNip----~----~~l~~~-e~g~~~ 214 (431) +--.+.|++++ + .++.++-=+-++.|..| . 
|+-+-+ +.|++- T Consensus 160 ~l~~~~~~~~~-l----------------------------~PMtdDRvev~~~v~~dvdg~~~~~HFQefWV~~rg~v~ 210 (359) T TIGR01819 160 ALADAFGIKAR-L----------------------------LPMTDDRVEVSTLVETDVDGKEGAMHFQEFWVRRRGDVE 210 (359) T ss_pred HHHHHHCCCEE-E----------------------------EECCCCCCCEEEEEEECCCCCCCCEECCHHHHHCCCCCC T ss_conf 99987289717-8----------------------------606789733289997478787475300125320379964 Q ss_pred EEECCCCCCC-CCCCCCCCCCEE---EEEECCCCCCCHHHHHHHHHHHHHHCHHHHHHCCEEEE Q ss_conf 5412313887-777855421003---89844877887667777777776418677874993999 Q gi|254780350|r 215 LRSTVIVGRV-DRQTPILHSRIN---RIMFNPYWVIPRSIIQKDMMALLRQDPQYLKDNNIHMI 274 (431) Q Consensus 215 ~~~~viVGk~-~~~TP~~~~~i~---~iv~NP~W~vP~sI~~~eilpk~~~dp~yl~~~~~~i~ 274 (431) -+-=..+|-. .++.+.--..|. .|++-|. |.=.||=--=-+|-+| ..|.+.+.+|+ T Consensus 211 v~dV~f~G~e~A~~a~~a~EAi~~~~~vligPS-NPvtSIGpILal~Gir---e~Lrda~~kVV 270 (359) T TIGR01819 211 VEDVEFRGAEKAKAAPEAIEAIRDADVVLIGPS-NPVTSIGPILALPGIR---EALRDATVKVV 270 (359) T ss_pred CEEEEECCCCCCCCCHHHHHHHHCCCEEEECCC-CCCCCCHHHCCHHHHH---HHHHHCCCCEE T ss_conf 013664065336688789998605998997786-6811132232725689---99983697389 No 36 >pfam12621 DUF3779 Phosphate metabolism protein. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam02714. There are two completely conserved residues (W and D) that may be functionally important. This family is likely to be involved in phosphate metabolism however there is little accompanying literature to confirm this. 
Probab=26.65 E-value=44 Score=13.95 Aligned_cols=57 Identities=18% Similarity=0.123 Sum_probs=45.4 Q ss_pred CCCCCCCCCEECCCHHHHHHHHHCCCCCCCHHHHHHHHHCCCEEEEECCCCCCEEEEEEEEEECCCCCEEEE Q ss_conf 555541150344798999999840488999899999862598289965898868999999898799838883 Q gi|254780350|r 335 VVRFETSGCVRVRNIIDLDVWLLKDTPTWSRYHIEEVVKTRKTTPVKLATEVPVHFVYISAWSPKDSIIQFR 406 (431) Q Consensus 335 ~~Ra~ShGCVRv~np~~La~~ll~~~~~~~~~~i~~~~~~~~~~~v~l~~~iPV~i~Y~Taw~~~dG~v~fr 406 (431) ...|+=|=||+=+.|. -|+-+|.-|.++++|+.+...| |-+.---||.|+.|++.+. T Consensus 32 ~~~AY~~Pav~~~~P~---lWIPrD~~GvS~~ei~~~~~~~------------v~isde~a~~dekgkv~~~ 88 (95) T pfam12621 32 EKHAYFHPAVTAPPPL---LWIPRDPMGLSRQEIEHTSDVG------------VPISDEGATFDEKGKIVWT 88 (95) T ss_pred HHHCCCCCCCCCCCCE---EEEECCCCCCCHHHHHHHHCCC------------EEEECCCCEECCCCCEEEC T ss_conf 7636568432689985---8854697775899999965387------------3676799658788538962 No 37 >PRK01345 heat shock protein HtpX; Provisional Probab=26.43 E-value=42 Score=14.10 Aligned_cols=19 Identities=21% Similarity=0.510 Sum_probs=7.9 Q ss_pred HHHHHHHHHHHHHHHHHCCCCC Q ss_conf 8999999999999998809985 Q gi|254780350|r 74 TIAQTEKAIAFYQDILSRGGWP 95 (431) Q Consensus 74 ~~~~l~~al~~y~~i~~~ggW~ 95 (431) .+-.+.+-++ +++.+.|-| T Consensus 65 e~P~L~~iVe---~La~~Aglp 83 (314) T PRK01345 65 SAPELYRMVA---ELARRAGLP 83 (314) T ss_pred CCHHHHHHHH---HHHHHCCCC T ss_conf 5669999999---999976989 No 38 >TIGR01935 NOT-MenG RraA family; InterPro: IPR010203 This entry includes a number of closely related sequences from bacteria and plants. The Escherichia coli member of this family has been characterised as a regulator of RNase E (see IPR004659 from INTERPRO), and its crystal structure has been analysed. E. coli RraA acts as a trans-acting modulator of RNA turnover, binding essential endonuclease RNase E and inhibiting RNA processing. RNase E forms the core of a large RNA-catalysis machine termed the degradosome.
RraA (and RraB) causes remodelling of degradosome composition, which is associated with alterations in RNA decay and global transcript abundance and as such is a bacterial mechanism for the regulation of RNA cleavage.; GO: 0008428 ribonuclease inhibitor activity, 0051252 regulation of RNA metabolic process. Probab=26.37 E-value=18 Score=16.33 Aligned_cols=12 Identities=17% Similarity=0.440 Sum_probs=5.4 Q ss_pred HCCEEEECCCCC Q ss_conf 499399938997 Q gi|254780350|r 268 DNNIHMIDEKGK 279 (431) Q Consensus 268 ~~~~~i~~~~g~ 279 (431) +.+..|+||.|+ T Consensus 58 ~GrVLVVDGgGS 69 (155) T TIGR01935 58 AGRVLVVDGGGS 69 (155) T ss_pred CCCEEEEECCCH T ss_conf 972799958850 No 39 >PRK01265 heat shock protein HtpX; Provisional Probab=26.29 E-value=45 Score=13.92 Aligned_cols=22 Identities=14% Similarity=-0.019 Sum_probs=14.4 Q ss_pred CCCHHHHHHHHHHHHHCCCCCC Q ss_conf 4578999999999998088878 Q gi|254780350|r 133 AFDAYVESAVKLFQMRHGLDPS 154 (431) Q Consensus 133 ~yD~~l~~AVk~FQ~rhGL~~D 154 (431) ..|.+=-+||-+-+-.|=..-| T Consensus 133 ~L~~dELegVlAHEl~HI~nrD 154 (326) T PRK01265 133 ILNPDEIKAVIGHELGHLKHRD 154 (326) T ss_pred HCCHHHHHHHHHHHHHHHHCCC T ss_conf 4799999999999998984534 No 40 >pfam08692 Pet20 Mitochondrial protein Pet20. Pet20 is a mitochondrial protein which is thought to play a role in the correct assembly/maintenance of mitochondrial components. Probab=26.04 E-value=39 Score=14.28 Aligned_cols=31 Identities=23% Similarity=0.420 Sum_probs=21.6 Q ss_pred CCCCCCCCCCCCCEEEEEECCCC-CCCHHHHH Q ss_conf 88777785542100389844877-88766777 Q gi|254780350|r 222 GRVDRQTPILHSRINRIMFNPYW-VIPRSIIQ 252 (431) Q Consensus 222 Gk~~~~TP~~~~~i~~iv~NP~W-~vP~sI~~ 252 (431) ++..-+-|.-++.++.++++|-| +||..++.
T Consensus 94 ~~~~p~ip~W~sS~~GmEyy~Ew~~VP~~Vv~ 125 (137) T pfam08692 94 WDYNPVKPNWSSSAMGMEFYPEWENVPSDVIK 125 (137) T ss_pred CCCCCCCCCCCCCCCCCEECHHHHHCHHHHHH T ss_conf 55688788753133224224655508599985 No 41 >PRK09484 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase; Provisional Probab=25.99 E-value=28 Score=15.21 Aligned_cols=26 Identities=19% Similarity=0.115 Sum_probs=19.6 Q ss_pred CCCCCCCEECCCHHHHHHHHHCCCCCCCHHH Q ss_conf 5541150344798999999840488999899 Q gi|254780350|r 337 RFETSGCVRVRNIIDLDVWLLKDTPTWSRYH 367 (431) Q Consensus 337 Ra~ShGCVRv~np~~La~~ll~~~~~~~~~~ 367 (431) +.--||||| |+++++|+.+..|+..+ T Consensus 157 ~~GG~GAVR-----E~~d~iL~~~g~~~~a~ 182 (186) T PRK09484 157 IAGGRGAVR-----EVCDLLLLAQGKLDEAK 182 (186) T ss_pred CCCCCCHHH-----HHHHHHHHHCCCHHHHC T ss_conf 988884999-----99999999769877854 No 42 >TIGR01533 lipo_e_P4 5'-nucleotidase, lipoprotein e(P4) family; InterPro: IPR006423 This group of sequences represents a set of bacterial lipoproteins belonging to a larger acid phosphatase family, which in turn belongs to the haloacid dehalogenase (HAD) superfamily of aspartate-dependent hydrolases. Members are found on the outer membrane of Gram-negative bacteria and the cytoplasmic membrane of Gram-positive bacteria. Most members have classic lipoprotein signal sequences. A critical role of this 5'-nucleotidase in Haemophilus influenzae is the degradation of external riboside in order to allow transport into the cell. An earlier suggested role in hemin transport is no longer current. This enzyme may also have other physiologically significant roles.. Probab=25.55 E-value=26 Score=15.35 Aligned_cols=12 Identities=8% Similarity=-0.081 Sum_probs=10.7 Q ss_pred CEEEEEECCCCC Q ss_conf 314895315887 Q gi|254780350|r 308 MASTKIEFYSRN 319 (431) Q Consensus 308 LG~vkf~f~N~~ 319 (431) -|+-+|+||||. 
T Consensus 254 FG~~fIiLPNp~ 265 (295) T TIGR01533 254 FGKKFIILPNPM 265 (295) T ss_pred CCCEEEECCCCC T ss_conf 487453067888 No 43 >KOG3516 consensus Probab=25.02 E-value=42 Score=14.13 Aligned_cols=11 Identities=27% Similarity=0.927 Sum_probs=6.1 Q ss_pred HHHCCCC-CCCC Q ss_conf 9880998-5477 Q gi|254780350|r 88 ILSRGGW-PELP 98 (431) Q Consensus 88 i~~~ggW-~~i~ 98 (431) .+..+|| |.+. T Consensus 60 ~~G~~gwsp~~~ 71 (1306) T KOG3516 60 RVGISGWSPKIS 71 (1306) T ss_pred HCCCCCCCCCCC T ss_conf 247554122437 No 44 >PRK02391 heat shock protein HtpX; Provisional Probab=24.45 E-value=49 Score=13.69 Aligned_cols=22 Identities=18% Similarity=0.128 Sum_probs=11.8 Q ss_pred CCCHHHHHHHHHHHHHCCCCCC Q ss_conf 4578999999999998088878 Q gi|254780350|r 133 AFDAYVESAVKLFQMRHGLDPS 154 (431) Q Consensus 133 ~yD~~l~~AVk~FQ~rhGL~~D 154 (431) ..|++=.+||-+=+-.|=..-| T Consensus 128 ~L~~dEL~aVLAHEl~Hikn~D 149 (297) T PRK02391 128 RLDPEELEAVLAHELSHVKNRD 149 (297) T ss_pred HCCHHHHHHHHHHHHHHHHCCC T ss_conf 3999999999999999997152 No 45 >TIGR00302 TIGR00302 phosphoribosylformylglycinamidine synthase, purS protein; InterPro: IPR003850 Phosphoribosylformylglycinamidine(FGAM) synthetase, 6.3.5.3 from EC, catalyses the fourth step in the de novo purine biosynthetic pathway .5-phosphoribosylformylglycinamide (FGAR) + glutamine + ATP = FGAM + glutamate + ADP + Pi In eukaryotes and many bacterial systems (including Escherichia coli and Salmonella typhimurium), the FGAM synthetase is encoded by the large form of PurL (lgPurL), which contains an N-terminal ATPase domain and a C-terminal glutamine-binding domain. In archaeal and other bacterial systems, however, FGAM synthetase is encoded by separate genes, making it a multisubunit (rather than multidomain) enzyme. The protein is composed of the small form of PurL (smPurL), which is homologus to the ATPase domain of lgPurL, PurQ which is homologous to the glutamine-binding domain of of lgPurL, and PurS, whose function is not known. 
This entry represents the PurS subunit of the multisubunit FGAM synthetase. Recent studies showed that disruption of the purS gene in B. subtilis resulted in a purine auxotrophic phenotype, due to defective FGAM synthetase activity. Therefore, the PurS protein appears to be required for the function of the PurL and PurQ subunits of the FGAM synthetase, but the molecular mechanism for the functional role of PurS is currently not known.; GO: 0016879 ligase activity forming carbon-nitrogen bonds. Probab=24.42 E-value=49 Score=13.69 Aligned_cols=24 Identities=17% Similarity=0.101 Sum_probs=22.6 Q ss_pred CCCCCCCCCCHHHHHHHHHHCCCC Q ss_conf 414688882489999999981986 Q gi|254780350|r 101 PLHLGNSSVSVQRLRERLIISGDL 124 (431) Q Consensus 101 ~L~~G~~~~~V~~Lr~RL~~~Gdl 124 (431) +||.|.-+|.-.++++-|..+||= T Consensus 8 ~LK~gVLdPeG~a~~~AL~~LGy~ 31 (80) T TIGR00302 8 KLKKGVLDPEGEAVQRALRLLGYN 31 (80) T ss_pred EECCCCCCCCHHHHHHHHHHCCCC T ss_conf 646763681148899998633778 No 46 >pfam06978 POP1 Ribonucleases P/MRP protein subunit POP1. This family represents a conserved region approximately 150 residues long located towards the N-terminus of the POP1 subunit that is common to both the RNase MRP and RNase P ribonucleoproteins (EC:3.1.26.5). These RNA-containing enzymes generate mature tRNA molecules by cleaving their 5' ends. Probab=23.23 E-value=10 Score=17.90 Aligned_cols=32 Identities=16% Similarity=0.085 Sum_probs=21.0 Q ss_pred ECCCCCEEEECCCCCHHHCCCCCCCCCCCCEE Q ss_conf 31588766813889832328655554115034 Q gi|254780350|r 314 EFYSRNNTYMHDTPEPILFNNVVRFETSGCVR 345 (431) Q Consensus 314 ~f~N~~~iyLHdTP~~~lF~~~~Ra~ShGCVR 345 (431) -|.+.+++-|=.+|..--|-...|+.+||||= T Consensus 113 ~M~~~WG~~lp~~~t~K~~R~~~Ra~~~~~~~ 144 (158) T pfam06978 113 HMTKLWGFKLPLTPTQKSYRATHRASKHGAVV 144 (158) T ss_pred CCHHHHCCCCCCCCCCCHHHHHHHHHHCCCEE T ss_conf 21887484358887740469999987259889 No 47 >pfam11353 DUF3153 Protein of unknown function (DUF3153).
This family of proteins with unknown function appears to be restricted to Cyanobacteria. Some members are annotated as membrane proteins; however, this cannot be confirmed. Probab=23.10 E-value=20 Score=16.07 Aligned_cols=22 Identities=23% Similarity=0.413 Sum_probs=11.5 Q ss_pred EECCCCCCCCEEEEEECCCCCEE Q ss_conf 72689998631489531588766 Q gi|254780350|r 299 RQDPGKINAMASTKIEFYSRNNT 321 (431) Q Consensus 299 rQ~PGp~NaLG~vkf~f~N~~~i 321 (431) ...||..|.|--.-.. +|+-+| T Consensus 167 ~L~~GeiN~Le~sfW~-~s~Lgi 188 (210) T pfam11353 167 QLEPGEINHLEASFWR-WSPLGL 188 (210) T ss_pred ECCCCCCCEEEEEEEE-CCHHHH T ss_conf 3588872369999987-167678 No 48 >TIGR02941 Sigma_B RNA polymerase sigma-B factor; InterPro: IPR014288 The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA), direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into subregions, e.g.
sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding 'helix-turn-helix' motif involved in binding the conserved -35 region of promoters recognized by the major sigma factors , . This entry represents the sigma factor, sigmaB. It is restricted to certain species within the order Bacillales including Staphylococcus aureus , Listeria monocytogenes and Bacillus cereus (strain ATCC 14579 / DSM 31) .. Probab=22.58 E-value=25 Score=15.53 Aligned_cols=21 Identities=5% Similarity=0.207 Sum_probs=9.8 Q ss_pred HHCCCCCCCHHHHHHHHHHHCC Q ss_conf 1015554230134567641115 Q gi|254780350|r 24 LSLVEKPIHASVLDEIINESYH 45 (431) Q Consensus 24 ~~l~s~~l~~~~~~~~i~~~~~ 45 (431) -.+-...+++|.++ +|.+.+| T Consensus 54 ~~~HEDlvQVGM~G-LlgAirR 74 (256) T TIGR02941 54 SAIHEDLVQVGMVG-LLGAIRR 74 (256) T ss_pred CCCCCCHHHHHHHH-HHHHHHH T ss_conf 88333045676899-9999865 No 49 >pfam07220 DUF1420 Protein of unknown function (DUF1420). This family consists of several hypothetical putative lipoproteins which seem to be found specifically in the bacterium Leptospira interrogans. Members of this family are typically around 670 resides in length and their function is unknown. Probab=21.69 E-value=55 Score=13.36 Aligned_cols=116 Identities=24% Similarity=0.342 Sum_probs=64.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHHH-------HHHHHHCCCCHHHHHHHHCCCHHHHHHHHCCCCCHHHHHH Q ss_conf 568889999999999985310155542301345-------6764111576787642000788887987366758788999 Q gi|254780350|r 5 LKINKILYCFFVYLILPMGLSLVEKPIHASVLD-------EIINESYHSIVNDRFDNFLARVDMGIDSDIPIISKETIAQ 77 (431) Q Consensus 5 ~~~~~~~~~~~~~~~l~~~~~l~s~~l~~~~~~-------~~i~~~~~~~~d~~f~~~~~~~~~~~~s~~P~~s~~~~~~ 77 (431) ..+||+++.|...+++.||+.-.+-..++-.+| ++++.-.-|.....|.+..+-+.+-+....=+..++-... 
T Consensus 154 id~n~~~nl~I~lLii~Y~lLsL~PITNADSLDYHIGVai~IlNqGkmP~~l~wFH~~laGSGEvlNAlGL~IGAEQFgs 233 (672) T pfam07220 154 IDINDFLNLFIILLLIAYGFLALCPITNADSLDYHIGVAIEILNQGKMPAFLGWFHGRLAGSGEVLNALGLAIGAEQFGS 233 (672) T ss_pred CCHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEEEEECCCCCCHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHH T ss_conf 44456789999999999999830776786420100473677403799740777767776274889999887834875036 Q ss_pred HHH---------HHHHH----HHHHHCC-CCCCC----------------CCC--CCCCCCCCCCHHHHHHHHHH Q ss_conf 999---------99999----9998809-98547----------------777--41468888248999999998 Q gi|254780350|r 78 TEK---------AIAFY----QDILSRG-GWPEL----------------PIR--PLHLGNSSVSVQRLRERLII 120 (431) Q Consensus 78 l~~---------al~~y----~~i~~~g-gW~~i----------------~~~--~L~~G~~~~~V~~Lr~RL~~ 120 (431) +-| -|+-| +..+.+| -|.++ ..+ -|+.|+++-+|..|-+-... T Consensus 234 LlQFsGlLsI~GIL~fysf~ek~~~~dgsvwr~iiiiaflsspvlvflvss~kpqllq~gmtsfa~~lllei~sk 308 (672) T pfam07220 234 LLQFCGLLGIYGILAFYSFAEKFLANDGSVWREIIIIAFLSSPVLVFLVSSPKPQLLQIGMTSFAITLLLEIFSK 308 (672) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHEEECCCCEEEEEECCCCHHHHHHCHHHHHHHHHHHHHHH T ss_conf 898743888998999999999850046405567653022178669999659970677623088999999999987 No 50 >PRK03001 heat shock protein HtpX; Provisional Probab=21.65 E-value=54 Score=13.44 Aligned_cols=21 Identities=10% Similarity=0.028 Sum_probs=11.4 Q ss_pred CCHHHHHHHHHHHHHCCCCCC Q ss_conf 578999999999998088878 Q gi|254780350|r 134 FDAYVESAVKLFQMRHGLDPS 154 (431) Q Consensus 134 yD~~l~~AVk~FQ~rhGL~~D 154 (431) .|++=.+||-+-+-.|=-.-| T Consensus 119 L~~dELeaVlAHEl~Hi~n~D 139 (284) T PRK03001 119 LSEREIRGVMAHELAHVKHRD 139 (284) T ss_pred CCHHHHHHHHHHHHHHHHCCC T ss_conf 799999999999999997468 No 51 >PRK12285 tryptophanyl-tRNA synthetase; Reviewed Probab=21.31 E-value=16 Score=16.63 Aligned_cols=62 Identities=21% Similarity=0.284 Sum_probs=37.8 Q ss_pred EEECCCCC-EEEECCCCCHHHCCCCCCCCCCCCEECCCHH------------HHHHH-HHCCCCCCCHHHHHHHHHCCC Q ss_conf 
95315887-6681388983232865555411503447989------------99999-840488999899999862598 Q gi|254780350|r 312 KIEFYSRN-NTYMHDTPEPILFNNVVRFETSGCVRVRNII------------DLDVW-LLKDTPTWSRYHIEEVVKTRK 376 (431) Q Consensus 312 kf~f~N~~-~iyLHdTP~~~lF~~~~Ra~ShGCVRv~np~------------~La~~-ll~~~~~~~~~~i~~~~~~~~ 376 (431) |+--+++. +|||+|||+.- =.+-.+|+|-|+--+|.=. ++-.+ ++.++. ..++|.+.-.+|+ T Consensus 253 KMSsS~p~saI~ltDtp~~i-kkKIk~AfSGGr~T~EEhr~~Gg~p~vdv~yq~~~~ff~eDD~--~L~~i~~~y~~G~ 328 (369) T PRK12285 253 KMSSSKPESAIYLTDDPETA-KKKIKNALTGGRPTLEEQRKLGGNPEKCVVYELYLYHLIEDDK--ELKEIYEECRSGK 328 (369) T ss_pred CCCCCCCCCEEEECCCHHHH-HHHHHHHHCCCCCCHHHHHHHCCCCCCCHHHHHHHHHHCCCHH--HHHHHHHHHCCCC T ss_conf 87689998536850899999-9999985427987799999858998512468999998146689--9999999844786 No 52 >TIGR01825 gly_Cac_T_rel pyridoxal phosphate-dependent acyltransferase, putative; InterPro: IPR010962 Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination , , . PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors . Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy . PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalyzed reactions, enzymatic and non-enzymatic . 
This entry contains a number of enzyme families: Serine palmitoyltransferase Glycine C-acetyltransferase (2-amino-3-ketobutyrate coenzyme A ligase) 5-aminolevulinic acid synthase 8-amino-7-oxononanoate synthase All transfer the R-group (acetyl, succinyl, or 6-carboxyhexanoyl) from coenzyme A to an amino acid (Gly, Gly, Ala, respectively), with release of CO2 for the latter two reactions.; GO: 0016740 transferase activity, 0030170 pyridoxal phosphate binding. Probab=20.08 E-value=60 Score=13.15 Aligned_cols=24 Identities=17% Similarity=0.292 Sum_probs=13.2 Q ss_pred HHHHHHHHHCHHH-HHHCCEEEECC Q ss_conf 7777776418677-87499399938 Q gi|254780350|r 253 KDMMALLRQDPQY-LKDNNIHMIDE 276 (431) Q Consensus 253 ~eilpk~~~dp~y-l~~~~~~i~~~ 276 (431) .|+-.++++++.+ -.+..+.|.|| T Consensus 152 ~DL~~~l~e~~~~Gqy~~~l~vTDG 176 (392) T TIGR01825 152 DDLERVLRENVEEGQYKKKLIVTDG 176 (392) T ss_pred HHHHHHHHHCCCCCCCCCEEEEECC T ss_conf 5899999716033754436899665 Done!