Query         gi|254780249|ref|YP_003064662.1| 50S ribosomal protein L5 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 185
No_of_seqs    126 out of 1598
Neff          5.7
Searched_HMMs 39220
Date          Tue May 24 06:47:01 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780249.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PRK00010 rplE 50S ribosomal pr 100.0       0       0  495.0  18.2  179     7-185     1-179 (179)
  2 CHL00078 rpl5 ribosomal protei 100.0       0       0  493.6  17.5  178     8-185     2-179 (180)
  3 COG0094 RplE Ribosomal protein 100.0       0       0  466.0  17.1  179     7-185     1-179 (180)
  4 KOG0398 consensus              100.0       0       0  361.4  12.9  178     7-184    52-274 (278)
  5 PRK04219 rpl5p 50S ribosomal p 100.0       0       0  332.8  13.4  150    26-184     6-155 (177)
  6 PTZ00156 60S ribosomal protein 100.0       0       0  297.3  11.7  129    27-165     3-131 (172)
  7 pfam00673 Ribosomal_L5_C ribos 100.0   1E-40 2.8E-45  257.7   8.6   95    90-184      1-95 (95)
  8 KOG0397 consensus               99.9   2E-24   5E-29  162.9   6.9  129    28-166     6-134 (176)
  9 pfam00281 Ribosomal_L5 Ribosom  99.7 1.8E-18 4.7E-23  128.1   5.1   56     30-86      1-56 (56)
 10 COG1913 Predicted Zn-dependent  63.9     6.2 0.00016   20.1   2.6   75    83-167    31-106 (181)
 11 KOG2961 consensus               52.6      11 0.00029   18.6   2.4   66   117-183    37-112 (190)
 12 TIGR02510 NrdE-prime ribonucle  52.1     8.1 0.00021   19.5   1.5   35    80-114    77-111 (560)
 13 pfam04921 XAP5 XAP5 protein. T  49.2      18 0.00046   17.5   2.9   43    97-146   112-154 (233)
 14 TIGR00705 SppA_67K signal pept  37.9      34 0.00087   15.8   4.8   71    29-100   352-427 (614)
 15 PRK00389 gcvT glycine cleavage  35.8      27  0.0007   16.4   2.2   40    93-132     53-92 (362)
 16 cd04481 RPA1_DBD_B_like RPA1_D  35.1      20  0.0005   17.2   1.3   43    80-122     24-76 (106)
 17 COG4031 Predicted metal-bindin  33.3      40   0.001   15.4   3.0   76    91-179   130-219 (227)
 18 PRK12486 dmdA putative dimethy  31.8      28 0.00071   16.3   1.7   15    95-109   151-165 (367)
 19 cd03074 PDI_b'_Calsequestrin_C  30.2      46  0.0012   15.1   2.9   48    88-147     21-68 (120)
 20 TIGR03212 uraD_N-term-dom puta  25.9      48  0.0012   15.0   2.0   25     51-75   136-160 (297)
 21 TIGR02062 RNase_B exoribonucle  25.9      26 0.00066   16.5   0.7   97    41-154   423-529 (664)
 22 COG0404 GcvT Glycine cleavage   24.3      52  0.0013   14.8   2.0   30    93-122     57-86 (379)
 23 pfam10871 DUF2748 Protein of u  23.9      32 0.00082   16.0   0.8   95    29-129    66-175 (452)
 24 TIGR00600 rad2 DNA excision re  23.5      51  0.0013   14.8   1.8   11     36-46   462-472 (1127)
 25 COG1058 CinA Predicted nucleot  22.4      64  0.0016   14.2   4.3  140    37-179    61-242 (255)
 26 PRK13579 gcvT glycine cleavage  22.2      64  0.0016   14.2   4.3   19    92-110   199-217 (371)
 27 PRK13249 phycoerythrobilin:fer  20.8      39 0.00099   15.5   0.8   23   136-160    86-108 (257)
 28 pfam06158 Phage_E Phage tail p  20.8      52  0.0013   14.8   1.4   39    75-118     11-53 (86)

No 1
>PRK00010 rplE 50S ribosomal protein L5; Validated
Probab=100.00 E-value=0 Score=494.99 Aligned_cols=179 Identities=63% Similarity=1.087 Sum_probs=177.7

Q ss_pred             CCHHHHHHHHHHHHHHHHHHCCCCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCC
Q ss_conf             61189999999999999982699822286044999974578756537899999998776538961354123764211155
Q gi|254780249|r    7 EPRLKKEYCLRIREAMQQEFSYKNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLR   86 (185)
Q Consensus         7 ~pRLk~~Y~~~i~~dL~~k~~~~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiR   86 (185)
                      +||||+||+++|++||+++++|+|+||+|+|+|||||+|+|+|..|++.++.+..+|+.||||||++++||+|||+||+|
T Consensus         1 m~rLk~~Y~~~v~~~L~~k~~y~N~~qvPkl~KIvin~gvg~a~~~~k~l~~a~~~L~~ITGQkP~~t~akksia~fkir
80 (179) T PRK00010 1 MARLKEKYKEEIVPALMKEFGYKNVMQVPKLEKIVLNMGVGEAVADKKLLENAVEDLTLITGQKPVVTKAKKSIAGFKLR 80 (179) T ss_pred CCHHHHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEEECCCHHHHHCHHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCCC T ss_conf 94789999999999999986889830057056999978862554186779999999998618975798601453200234 Q ss_pred CCCEEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECCC Q ss_conf 89862889996034289999999998767752046887110157773143301110068754014688554359998044 Q gi|254780249|r 87 TGMPIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTTT 166 (185) Q Consensus 87 kG~piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tta 166 (185) +|+|+||+|||||++||+||+||+++||||||||+|++.++||++|||+|||+|+++||||+||.++.++|||||||||| T Consensus 81 kg~~iG~kvTLRg~~My~Fl~kli~ivlPrirdFrGl~~~sfD~~GN~s~Gi~e~~~FPEI~~d~~~~~~G~~ItivTtA 160 (179) T PRK00010 81 EGMPIGCKVTLRGERMYEFLDRLINIALPRVRDFRGLSPKSFDGRGNYTLGIKEQIIFPEIDYDKIDKIRGMDITIVTTA 160 (179) T ss_pred CCCEEEEEEEECCHHHHHHHHHHHHHHCCHHCCCCCCCCCCCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCEEEEEECC T ss_conf 79736789987418699999999987452110147899766479983222554021877534022568889758999476 Q ss_pred CCHHHHHHHHHHHCCCCCC Q ss_conf 8968999999980898689 Q gi|254780249|r 167 RSDREAKYLLTLFGFPFPK 185 (185) Q Consensus 167 ~~~~ea~~Lls~~g~Pf~k 185 (185) ++++||++||++|||||+| T Consensus 161 ~t~~ea~~LL~~~g~PF~k 179 (179) T PRK00010 161 KTDEEARALLEAFGFPFRK 179 (179) T ss_pred CCHHHHHHHHHHCCCCCCC T ss_conf 9989999999985998579 No 2 >CHL00078 rpl5 ribosomal protein L5 Probab=100.00 E-value=0 Score=493.59 Aligned_cols=178 Identities=53% Similarity=0.969 Sum_probs=176.4 Q ss_pred CHHHHHHHHHHHHHHHHHHCCCCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCCC Q ss_conf 11899999999999999826998222860449999745787565378999999987765389613541237642111558 Q gi|254780249|r 8 PRLKKEYCLRIREAMQQEFSYKNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLRT 87 (185) Q Consensus 8 
pRLk~~Y~~~i~~dL~~k~~~~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiRk 87 (185) -|||++|+++|+++|+++|+|+|+||||+|+|||||+|+|++..|++.++.+..+|++||||+|++++||||||+||+|+ T Consensus 2 ~rLk~~Y~~~i~~~L~~~~~y~N~~~vPkl~KIvln~gvg~a~~~~k~l~~~~~~L~~ITGQkp~~t~AKksia~FKlRk 81 (180) T CHL00078 2 QRLKTLYLEKIVPKLIKEFGYKNIHQVPKLKKIVINRGLGEASQNAKILESSIKELSIITGQKPIITRAKKAIAGFKIRE 81 (180) T ss_pred CHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEEECCCCHHHHCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCCC T ss_conf 37789999999999999878998610572027999898864541737899999999998399847875124556535258 Q ss_pred CCEEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECCCC Q ss_conf 98628899960342899999999987677520468871101577731433011100687540146885543599980448 Q gi|254780249|r 88 GMPIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTTTR 167 (185) Q Consensus 88 G~piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tta~ 167 (185) |+|+||||||||++||+|||||+++||||||||+|++.++||++|||+|||+|+++||||+||.++.++||||||||||+ T Consensus 82 g~piG~kVTLRg~~My~FLdkLi~i~lPrirdFrGi~~~sfD~~GN~s~Gi~e~~iFPEI~~d~~~~~~G~~ItivTta~ 161 (180) T CHL00078 82 KMPVGVSVTLRGDKMYAFLDRLINLALPRIRDFQGISPKSFDGRGNYNLGLKEQLIFPEIDYDKIDQIRGMDISIVTTAK 161 (180) T ss_pred CCCCEEEEEECHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCEEEECCCHHCCCCCCCCCCCCCCCCCEEEEEECCC T ss_conf 97416799975799999999999985110002579984334799757546404215776044545787887589991779 Q ss_pred CHHHHHHHHHHHCCCCCC Q ss_conf 968999999980898689 Q gi|254780249|r 168 SDREAKYLLTLFGFPFPK 185 (185) Q Consensus 168 ~~~ea~~Lls~~g~Pf~k 185 (185) +++||++||++|||||++ T Consensus 162 td~ea~~LL~~~g~PF~~ 179 (180) T CHL00078 162 TDEEGLALLKELGMPFKD 179 (180) T ss_pred CHHHHHHHHHHCCCCCCC T ss_conf 989999999986998178 No 3 >COG0094 RplE Ribosomal protein L5 [Translation, ribosomal structure and biogenesis] Probab=100.00 E-value=0 Score=466.03 Aligned_cols=179 Identities=62% Similarity=1.042 Sum_probs=177.3 Q ss_pred 
CCHHHHHHHHHHHHHHHHHHCCCCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCC Q ss_conf 61189999999999999982699822286044999974578756537899999998776538961354123764211155 Q gi|254780249|r 7 EPRLKKEYCLRIREAMQQEFSYKNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLR 86 (185) Q Consensus 7 ~pRLk~~Y~~~i~~dL~~k~~~~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiR 86 (185) ++||+.+|.++|.+.|+++|+|+|+|++|+|+|||||+|+|+|..|.+.++.|..+|+.||||||++|+||+||++|||| T Consensus 1 ~~rlk~~y~~~i~~~l~~~~~y~n~M~~P~i~KvvvNmGvGea~~d~k~l~~A~~~L~~ItGQKPv~tkAkksia~FkiR 80 (180) T COG0094 1 MNRLKEKYKDEIVPALIKKFGYSNPMQVPRIEKVVVNMGVGEAAADGKRLEKAAKDLELITGQKPVITKAKKSIAGFKIR 80 (180) T ss_pred CCCHHHHHHHHHHHHHHHHHCCCCCCCCCEEEEEEEECCCHHHHHCHHHHHHHHHHHHHHHCCCCEEEEHHCCCCCCCCC T ss_conf 90066677778758988764337864255367999975532322144899999999999868986562011355456622 Q ss_pred CCCEEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECCC Q ss_conf 89862889996034289999999998767752046887110157773143301110068754014688554359998044 Q gi|254780249|r 87 TGMPIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTTT 166 (185) Q Consensus 87 kG~piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tta 166 (185) +|+||||||||||++||+||+||++++|||||||+|++.+|||+.|||||||+||++||||+||+.++++||||+|+|+| T Consensus 81 ~g~pIG~KVTLRg~rm~eFL~rl~~i~lPrvrdfrGls~~sFDg~GN~sfGI~E~i~FPei~yD~~~~i~GMdi~ivtta 160 (180) T COG0094 81 EGMPIGVKVTLRGERMYEFLDRLLNIALPRVRDFRGLSPKSFDGRGNYSFGIKEQIIFPEIDYDPIIGIRGMDITIVTTA 160 (180) T ss_pred CCCEEEEEEEECHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCEEECCCCEEECCCCCCCCCCCCCCCEEEEEECC T ss_conf 79732689997648899999999976354422346778566379785576650124067645676677267428999527 Q ss_pred CCHHHHHHHHHHHCCCCCC Q ss_conf 8968999999980898689 Q gi|254780249|r 167 RSDREAKYLLTLFGFPFPK 185 (185) Q Consensus 167 ~~~~ea~~Lls~~g~Pf~k 185 (185) +++.|++.||+++||||++ T Consensus 161 ~~d~e~R~ll~~~~~Pf~~ 179 (180) T COG0094 161 KGDVEARALLSAFGIPFRK 179 (180) T 
ss_pred CCHHHHHHHHHHCCCCCCC T ss_conf 9719999999855999778 No 4 >KOG0398 consensus Probab=100.00 E-value=0 Score=361.36 Aligned_cols=178 Identities=50% Similarity=0.833 Sum_probs=168.1 Q ss_pred CCHHHHHHHHHHHHHHH-----------------------------------------HHHCCCCHHHCCCEEEEEEECC Q ss_conf 61189999999999999-----------------------------------------9826998222860449999745 Q gi|254780249|r 7 EPRLKKEYCLRIREAMQ-----------------------------------------QEFSYKNVMQIPKIEKVVVNMG 45 (185) Q Consensus 7 ~pRLk~~Y~~~i~~dL~-----------------------------------------~k~~~~N~~~vPki~KIvin~g 45 (185) .-||.+||.+++.+|+. -++.|-|++++|+++|||+||+ T Consensus 52 ~~RLn~h~l~t~l~dvl~vsysh~v~v~ks~~e~aWsGDsPy~~nrpp~~~Rg~kallp~~~~vn~~nvP~v~kVVvnc~ 131 (278) T KOG0398 52 SSRLNAHMLSTPLRDVLKVSYSHTVLVEKSEAEKAWSGDSPYQRNRPPYLERGIKALLPEFKYVNIHNVPKVQKVVVNCG 131 (278) T ss_pred HHHHHHHHHCCCHHHHHHHHCCCCHHHHHHHHHHHCCCCCHHHCCCCCCCCCCCCCCCCCCCCCCHHHCCCEEEEEEECC T ss_conf 56788876163005466322245126553145541137880111488433346621156434357300875003543212 Q ss_pred CCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCEEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCC Q ss_conf 78756537899999998776538961354123764211155898628899960342899999999987677520468871 Q gi|254780249|r 46 VGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLRTGMPIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNS 125 (185) Q Consensus 46 vg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiRkG~piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~ 125 (185) +|+|.+|++.++.++++++.||||||..++||++|++||+|+|.|+|+||||||+.||.||++|+++||||+|||+|+++ T Consensus 132 ~~eA~~n~~~l~~am~~~~~ITG~kP~~~~ar~dV~twKlR~g~p~G~kVtL~G~~My~FLs~L~elvLPr~rdfkGvSp 211 (278) T KOG0398 132 IGEAAQNDKGLEAAMKDIALITGQKPIKTRARADVATWKLREGQPLGIKVTLRGDVMYSFLSRLIELVLPRTRDFKGVSP 211 (278) T ss_pred CHHHHHHHHHHHHHHHHHHHHHCCCCCEEEECCCCCCEEECCCCCCEEEEEEECHHHHHHHHHHHHHHCCCCCCCCCCCC T ss_conf 07766427779999999998748996200001567741201697230589981167999999999975421000367688 Q ss_pred CCCCCCCCEEEEEC--CCEECCCCC--CCCCCCCCCCEEEEECCCCCHHHHHHHHHHHCCCCC Q ss_conf 
10157773143301--110068754--014688554359998044896899999998089868 Q gi|254780249|r 126 RSFDGSGNFSFGIR--EHIVFPEIN--YDKVDCVLGMDISICTTTRSDREAKYLLTLFGFPFP 184 (185) Q Consensus 126 ~~~d~~Gn~sfGi~--e~~~FPEi~--yd~~~~~~G~~Iti~Tta~~~~ea~~Lls~~g~Pf~ 184 (185) +|+|++||||||++ ++-+||||+ ||.++..+||||+|.|||+++.+|+.|||++||||. T Consensus 212 ~Sgd~~GniSfGl~aEd~~~FPeI~An~d~~pkt~Gm~vnI~T~ak~d~~ar~lls~~~~PF~ 274 (278) T KOG0398 212 SSGDGNGNISFGLKAEDQGVFPEIRANFDAVPKTRGMDVNISTTAKSDQEARKLLSLMGMPFR 274 (278) T ss_pred CCCCCCCCEEECCCHHHCCCCCHHHHHHHCCCCCCCEEEEECCCCCCHHHHHHHHHHCCCCCC T ss_conf 776887887734464213557002322311441155037620453204899999986287655 No 5 >PRK04219 rpl5p 50S ribosomal protein L5P; Reviewed Probab=100.00 E-value=0 Score=332.77 Aligned_cols=150 Identities=43% Similarity=0.629 Sum_probs=141.4 Q ss_pred HCCCCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCEEEEEEEEECCHHHHH Q ss_conf 26998222860449999745787565378999999987765389613541237642111558986288999603428999 Q gi|254780249|r 26 FSYKNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLRTGMPIGTKVTLRGTNMYDF 105 (185) Q Consensus 26 ~~~~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiRkG~piG~kvTLRg~~my~F 105 (185) ..-.|+|++|+|+|||||+|+|++++ .++.+..+|+.||||+|+.++||++|++||||+|+|+||+|||||++||+| T Consensus 6 ~~~~N~M~~pki~KvviN~GvGe~~~---~l~~a~~~L~~ITGQkPv~t~AKksi~~FkiRkG~pIG~kVTLRg~~m~eF 82 (177) T PRK04219 6 LWEMNPMRKPRIEKVTVNIGVGESGE---RLTKAEKLLEELTGQKPVRTKAKRTIPDFGIRKGEPIGVKVTLRGEKAEEF 82 (177) T ss_pred HHHCCCCCCCEEEEEEEECCCCHHHH---HHHHHHHHHHHHCCCCCEEEECCCCCCCCCCCCCCEEEEEEEECHHHHHHH T ss_conf 75349854763789999747883277---899999999996199618871453233456678984789999857889999 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECCCCCHHHHHHHHHHHCCCCC Q ss_conf 9999998767752046887110157773143301110068754014688554359998044896899999998089868 Q gi|254780249|r 106 LDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTTTRSDREAKYLLTLFGFPFP 184 (185) Q Consensus 106 
L~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tta~~~~ea~~Lls~~g~Pf~ 184 (185) |+||+++++||++ .++||++|||+|||+||++|||++||+..+++||||+|++++.....++-.....++|-+ T Consensus 83 L~rll~~~~~ri~------~~~FD~~GN~sfGI~E~i~fPei~yD~~~gI~Gmdi~Vvl~rpG~Ri~~Rk~~~~~i~~~ 155 (177) T PRK04219 83 LKRALEAVGNRLK------ASSFDETGNVSFGIEEHIDFPGVKYDPEIGIFGMDVAVTLERPGYRVARRRRKRRKIPKR 155 (177) T ss_pred HHHHHHHHCCCCC------CCCCCCCCCEEECCCEEEECCCCCCCCCCCEEEEEEEEEEECCCCHHHHHHHHHCCCCCC T ss_conf 9988887566427------233189975674660158647652066566322369999826962799998874389954 No 6 >PTZ00156 60S ribosomal protein L11; Provisional Probab=100.00 E-value=0 Score=297.30 Aligned_cols=129 Identities=38% Similarity=0.647 Sum_probs=123.3 Q ss_pred CCCCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCEEEEEEEEECCHHHHHH Q ss_conf 69982228604499997457875653789999999877653896135412376421115589862889996034289999 Q gi|254780249|r 27 SYKNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLRTGMPIGTKVTLRGTNMYDFL 106 (185) Q Consensus 27 ~~~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiRkG~piG~kvTLRg~~my~FL 106 (185) .++|+|++|+|+|||||+|+|++++ .+..+..+|+.||||+|+.++||++|++||||+|+||||+|||||++||+|| T Consensus 3 k~~NpM~~pkI~KvvlNiGvGesg~---~l~~a~~~Le~iTGQkPv~tkAkkti~~F~iRkg~pIG~kVTLRg~ka~efL 79 (172) T PTZ00156 3 KKENPMREIRIEKLVLNICVGESGD---RLTRAAKVLEQLTGQKPVFSKARLTVRSFGIRRNEKIAVHCTVRGKKAEEIL 79 (172) T ss_pred CCCCCCCCCEEEEEEEECCCCCCHH---HHHHHHHHHHHHHCCCCEEEEECCCHHCCCCCCCCEEEEEEEECHHHHHHHH T ss_conf 5378766761689999747883346---5999999999973997136531100110476789857899997668899999 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECC Q ss_conf 99999876775204688711015777314330111006875401468855435999804 Q gi|254780249|r 107 DRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTT 165 (185) Q Consensus 107 ~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tt 165 (185) +|++ +++||+ ++.++||..||+||||+||++| +++||+-.+++|||++++.. 
T Consensus 80 ~r~l-----~v~d~~-L~~~~Fd~~GNfsFGI~EhId~-G~kYDP~iGI~Gmdv~V~l~ 131 (172) T PTZ00156 80 ERGL-----KVKEFE-LKKRNFSDTGNFGFGIQEHIDL-GIKYDPSTGIYGMDFYVVLS 131 (172) T ss_pred HHHH-----HHCCCC-CCCCCCCCCCCEEECCHHHEEC-CCEECCCCCEEEEEEEEEEE T ss_conf 9988-----431467-3623528998514152222357-74216867776536899971 No 7 >pfam00673 Ribosomal_L5_C ribosomal L5P family C-terminus. This region is found associated with pfam00281. Probab=100.00 E-value=1e-40 Score=257.69 Aligned_cols=95 Identities=63% Similarity=1.146 Sum_probs=93.9 Q ss_pred EEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECCCCCH Q ss_conf 62889996034289999999998767752046887110157773143301110068754014688554359998044896 Q gi|254780249|r 90 PIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTTTRSD 169 (185) Q Consensus 90 piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tta~~~ 169 (185) ||||||||||++||+||+||+++||||||||+|++.++||++|||+|||+||++|||++||++++++||||+|+|+|+++ T Consensus 1 piG~kVTLRg~~m~~FL~~li~ivlPrirdf~gi~~~sfD~~GN~sfGi~e~~~FPei~yD~~~~i~G~~I~ivt~a~~~ 80 (95) T pfam00673 1 PIGCKVTLRGEKMYEFLDRLINIVLPRIRDFRGLSPKSFDGRGNYSFGIKEHIIFPEIKYDPIIGIFGMDITIVTTAKTD 80 (95) T ss_pred CEEEEEEECCHHHHHHHHHHHHHHCCCEECCCCCCCCCCCCCCEEEECCCHHCCCCCCCCCCCCCCCCCEEEEEECCCCH T ss_conf 91699997719899999999987352000366578643478734876620413366753065568786089999566997 Q ss_pred HHHHHHHHHHCCCCC Q ss_conf 899999998089868 Q gi|254780249|r 170 REAKYLLTLFGFPFP 184 (185) Q Consensus 170 ~ea~~Lls~~g~Pf~ 184 (185) +|+++||+++||||. 
T Consensus 81 ~~~r~Ll~~~g~pf~ 95 (95) T pfam00673 81 KEARALLKELGMPFF 95 (95) T ss_pred HHHHHHHHHCCCCCC T ss_conf 999999998599989 No 8 >KOG0397 consensus Probab=99.90 E-value=2e-24 Score=162.85 Aligned_cols=129 Identities=38% Similarity=0.650 Sum_probs=119.7 Q ss_pred CCCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCEEEEEEEEECCHHHHHHH Q ss_conf 99822286044999974578756537899999998776538961354123764211155898628899960342899999 Q gi|254780249|r 28 YKNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLRTGMPIGTKVTLRGTNMYDFLD 107 (185) Q Consensus 28 ~~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiRkG~piG~kvTLRg~~my~FL~ 107 (185) -.|||+--+|.|.++|+++|++++ .|..+..+|+++|||+|+..+|+-.|.+|+||.+..|+|.||.||.++++.|+ T Consensus 6 ~~npMrel~i~KL~lnIcvgESGd---rLtRAaKvLEQLtGQ~pvfskaryTvR~fGirRNEKIAvh~tVrG~KAeeiLe 82 (176) T KOG0397 6 AQNPMRELKIQKLVLNICVGESGD---RLTRAAKVLEQLTGQTPVFSKARYTVRSFGIRRNEKIAVHVTVRGPKAEEILE 82 (176) T ss_pred CCCCHHHHHHHEEEEEEEECCCCH---HHHHHHHHHHHHCCCCCCCHHHHHHHHHHCCCCCCEEEEEEEEECCCHHHHHH T ss_conf 038156650120368874145300---78889999999618986305666647864541276289999950820899998 Q ss_pred HHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCCCCEEEEECCC Q ss_conf 99998767752046887110157773143301110068754014688554359998044 Q gi|254780249|r 108 RLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVLGMDISICTTT 166 (185) Q Consensus 108 kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~G~~Iti~Tta 166 (185) + .| +++++. +..+.|...||+.|||.||++.. |.||+-.+++|||..++..- T Consensus 83 ~----gL-kVkeYe-L~~~nFS~tgnFGFGiqEHIDLG-ikYDPsiGIyGmDFyVvl~R 134 (176) T KOG0397 83 R----GL-KVKEYE-LRKRNFSDTGNFGFGIQEHIDLG-IKYDPSIGIYGMDFYVVLGR 134 (176) T ss_pred H----CC-CHHHHH-HHHHCCCCCCCCCCCHHHHEECC-CEECCCCCEEEEEEEEEECC T ss_conf 3----54-142345-67643656687432426650025-15579763231158999469 No 9 >pfam00281 Ribosomal_L5 Ribosomal protein L5. 
Probab=99.74 E-value=1.8e-18 Score=128.07 Aligned_cols=56 Identities=61% Similarity=0.926 Sum_probs=53.4 Q ss_pred CHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCCCC Q ss_conf 822286044999974578756537899999998776538961354123764211155 Q gi|254780249|r 30 NVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFKLR 86 (185) Q Consensus 30 N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fkiR 86 (185) |+||+|+|+|||||+|+|++++ ++.++.+..+|+.||||+|++|+||+|||+||+| T Consensus 1 N~mqvPki~KIviN~gvG~a~~-~k~l~~a~~~l~~ItGQkPv~t~aKksia~FKlR 56 (56) T pfam00281 1 NVMEVPKLEKIVVNMGVGEAGD-NKILEKAALELEEISGQKPIITKAKKSIASFKIR 56 (56) T ss_pred CCCCCCEEEEEEEECCCCHHHH-HHHHHHHHHHHHHHCCCCCEEEEEECCHHCCCCC T ss_conf 9876755779999888670331-1789999999999729975253210112204669 No 10 >COG1913 Predicted Zn-dependent proteases [General function prediction only] Probab=63.92 E-value=6.2 Score=20.13 Aligned_cols=75 Identities=17% Similarity=0.204 Sum_probs=52.7 Q ss_pred CCCCCCCEEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEEC-CCCCCCCCCCCCCCEEE Q ss_conf 11558986288999603428999999999876775204688711015777314330111006-87540146885543599 Q gi|254780249|r 83 FKLRTGMPIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVF-PEINYDKVDCVLGMDIS 161 (185) Q Consensus 83 fkiRkG~piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~F-PEi~yd~~~~~~G~~It 161 (185) .++..|+.+-.-..--...|++|..+.+...|+++++.. |. -.+|+.+..++ |.+||..-....|.... T Consensus 31 v~l~~~~~~~~~~~ay~~~R~Qf~A~~~l~~l~~v~~~~-------~~---~ilgvt~~Diy~~g~NFVFGla~~~~~~A 100 (181) T COG1913 31 VKLLPGSLKSVPIEAYNWERGQFRARKVLLYLSLVEKDN-------DV---KILGVTDVDIYAPGLNFVFGLAYLGGKVA 100 (181) T ss_pred EEECCCCCCCCCHHCCCHHHCCHHHHHHHHHCCCCCCCC-------CC---CEEEEECCCCCCCCCEEEEEEEEECCCEE T ss_conf 996457644441310557760301899987403001377-------73---18998636766565307998974088279 Q ss_pred EECCCC Q ss_conf 980448 Q gi|254780249|r 162 ICTTTR 167 (185) Q Consensus 162 i~Tta~ 167 (185) ++.++. 
T Consensus 101 vvs~~R 106 (181) T COG1913 101 VVSTYR 106 (181) T ss_pred EEEEEE T ss_conf 999777 No 11 >KOG2961 consensus Probab=52.63 E-value=11 Score=18.62 Aligned_cols=66 Identities=21% Similarity=0.400 Sum_probs=45.8 Q ss_pred HHHCCCCCCCCCCCCCCEEEEECCCEECC-CC-CCCCCCCCCC-CEEEEECCC-------CCHHHHHHHHHHHCCCC Q ss_conf 52046887110157773143301110068-75-4014688554-359998044-------89689999999808986 Q gi|254780249|r 117 IRDFHGLNSRSFDGSGNFSFGIREHIVFP-EI-NYDKVDCVLG-MDISICTTT-------RSDREAKYLLTLFGFPF 183 (185) Q Consensus 117 ikdf~g~~~~~~d~~Gn~sfGi~e~~~FP-Ei-~yd~~~~~~G-~~Iti~Tta-------~~~~ea~~Lls~~g~Pf 183 (185) |.+|.|++.--+|+-+++++--+ ..++| ++ +.+.+..++| -+|.+..++ .+++.|..|=...|+|. T Consensus 37 I~~~~~ikavVlDKDNcit~P~~-~~Iwp~~l~~ie~~~~vygek~i~v~SNsaG~~~~D~d~s~Ak~le~k~gIpV 112 (190) T KOG2961 37 ILKRKGIKAVVLDKDNCITAPYS-LAIWPPLLPSIERCKAVYGEKDIAVFSNSAGLTEYDHDDSKAKALEAKIGIPV 112 (190) T ss_pred HHHCCCCEEEEECCCCEEECCCC-CCCCCHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCCHHHHHHHHHHHCCCE T ss_conf 11005713899737870517763-43381667789999987276527999547676436986689998887539824 No 12 >TIGR02510 NrdE-prime ribonucleoside-diphosphate reductase, alpha chain; InterPro: IPR013350 Proteins in this entry represent a small clade of ribonucleoside-diphosphate reductase alpha chains which are sufficiently divergent from the usual Class I ribonucleotide reductase (RNR) alpha chains (NrdE or NrdA, IPR013346 from INTERPRO) to form a distinct group. The genes from Thermus thermophilus, Dichelobacter and Salinibacter are adjacent to the usual RNR beta chain.. Probab=52.15 E-value=8.1 Score=19.47 Aligned_cols=35 Identities=26% Similarity=0.463 Sum_probs=31.7 Q ss_pred CCCCCCCCCCEEEEEEEEECCHHHHHHHHHHHHHH Q ss_conf 42111558986288999603428999999999876 Q gi|254780249|r 80 IAGFKLRTGMPIGTKVTLRGTNMYDFLDRLINMGM 114 (185) Q Consensus 80 i~~fkiRkG~piG~kvTLRg~~my~FL~kli~~vl 114 (185) -++|+++.|+||-|.=|-=++.|.+||.--.++.. 
T Consensus        77 waN~GldRGlPiSC~GsYv~Ds~~~il~~~aEVgM  111 (560)
T TIGR02510        77 WANYGLDRGLPISCYGSYVDDSVLDILEGQAEVGM  111 (560)
T ss_pred             HHHCCCCCCCCEEEECCCCCHHHHHHHCCCCCEEE
T ss_conf             33057678973243047521018988512431112

No 13
>pfam04921 XAP5 XAP5 protein. This protein is found in a wide range of eukaryotes. Its function is uncertain. It is a nuclear protein and is suggested to be DNA binding.
Probab=49.19 E-value=18 Score=17.46 Aligned_cols=43 Identities=26% Similarity=0.481 Sum_probs=26.9

Q ss_pred             EECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCC
Q ss_conf             60342899999999987677520468871101577731433011100687
Q gi|254780249|r   97 LRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPE  146 (185)
Q Consensus        97 LRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPE  146 (185)
                      -.|+..|.||+++....-.   +|+-+..-+.|.    -+-+++-++.|-
T Consensus       112 kKGdtI~~FL~~~r~~l~~---~frEl~~vsvd~----LM~VkedlIiPH  154 (233)
T pfam04921       112 KKGDTIWLFLDKCRKVLAK---DFRELRRVSVDD----LMLVKEDLIIPH  154 (233)
T ss_pred             CCCCCHHHHHHHHHHHHHH---HHHHHHHCCHHH----HHHHCCCEECCC
T ss_conf             2899899999999999888---868988468888----410103564146

No 14
>TIGR00705 SppA_67K signal peptide peptidase SppA, 67K type; InterPro: IPR004634 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases.
Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of serine peptidases belongs to MEROPS peptidase family S49 (protease IV family, clan S-). The predicted active site serine for members of this family occurs in a transmembrane domain.
Signal peptides of secretory proteins seem to serve at least two important biological functions. First, they are required for protein targeting to and translocation across membranes, such as the eubacterial plasma membrane and the endoplasmic reticular membrane of eukaryotes. Second, in addition to their role as determinants for protein targeting and translocation, certain signal peptides have a signalling function. During or shortly after pre-protein translocation, the signal peptide is removed by signal peptidases. The integral membrane protein, SppA (protease IV), of Escherichia coli was shown experimentally to degrade signal peptides. The member of this family from Bacillus subtilis has only been shown to be required for efficient processing of pre-proteins under conditions of hyper-secretion. These enzymes have a molecular mass of around 67 kDa and a duplication such that the N-terminal half shares extensive homology with the C-terminal half; the E. coli enzyme was shown to form homotetramers. E. coli SohB, which is most closely homologous to the C-terminal duplication of SppA, is predicted to perform a similar function of small peptide degradation, but in the periplasm. Many prokaryotes have a single SppA/SohB homolog that may perform the function of either or both.; GO: 0009003 signal peptidase activity, 0006465 signal peptide processing, 0016021 integral to membrane.
Probab=37.92 E-value=34 Score=15.84 Aligned_cols=71 Identities=17% Similarity=0.203 Sum_probs=47.6

Q ss_pred             CCHHHCCCEEEEEEECCCCCCCCCHHHHHHHHHHHHHHHCCCCEEEECCCCCCCCC----CCCCC-EEEEEEEEECC
Q ss_conf             98222860449999745787565378999999987765389613541237642111----55898-62889996034
Q gi|254780249|r   29 KNVMQIPKIEKVVVNMGVGESIADSKKAESAAADLALITGQKPVITRARRSIAGFK----LRTGM-PIGTKVTLRGT  100 (185)
Q Consensus        29 ~N~~~vPki~KIvin~gvg~a~~dkk~l~~~~~~L~~ITGqkP~~~~aKksi~~fk----iRkG~-piG~kvTLRg~  100 (185)
                      ..+|+-|.|+-|||.+..|.-..--..+.....+...-.|.||+++ |=-++|.-+    --.++ .++---|+.|.
T Consensus 352 r~a~~D~~iKAvvLRinSPGGsv~Ase~IR~e~~~~~~~GkKPViv-SMG~~AASGgYWiasaA~yIvA~p~TiTGS 427 (614) T TIGR00705 352 RKARSDPDIKAVVLRINSPGGSVFASEIIRRELERLQARGKKPVIV-SMGAMAASGGYWIASAADYIVADPNTITGS 427 (614) T ss_pred HHHCCCCCCEEEEEEEECCCCCEEHHHHHHHHHHHHHHCCCCCEEE-ECCHHHHCCCCHHCCCCCEEEECCCCCCCC T ss_conf 9870799812899886389863428789999999998268997898-435023205300204557133478743100 No 15 >PRK00389 gcvT glycine cleavage system aminomethyltransferase T; Reviewed Probab=35.81 E-value=27 Score=16.38 Aligned_cols=40 Identities=18% Similarity=0.380 Sum_probs=18.3 Q ss_pred EEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 8999603428999999999876775204688711015777 Q gi|254780249|r 93 TKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSG 132 (185) Q Consensus 93 ~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~G 132 (185) .|+.+.|+.+.+||++++.--+++++.-++...--.|..| T Consensus 53 ~ki~I~G~Da~~~L~~l~t~di~~l~~G~~~yt~~ln~~G 92 (362) T PRK00389 53 GEVDVTGPDALAFLQYLLANDVAKLKPGKALYTAMLNEDG 92 (362) T ss_pred EEEEEECCCHHHHHHHHHCCCCCCCCCCCEEEEEEECCCC T ss_conf 8999988899999988611353447998699998787998 No 16 >cd04481 RPA1_DBD_B_like RPA1_DBD_B_like: A subgroup of uncharacterized, plant OB folds with similarity to the third OB fold, the ssDNA-binding domain (DBD)-B, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-B, RPA1 contains three other OB folds: DBD-A, DBD-C, and RPA1N. The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. 
Probab=35.08 E-value=20 Score=17.23 Aligned_cols=43 Identities=21% Similarity=0.323 Sum_probs=32.9 Q ss_pred CCCCCCCCCCEEEEEEEEECCHHHHHHHHHHH-------H---HHHHHHHCCC Q ss_conf 42111558986288999603428999999999-------8---7677520468 Q gi|254780249|r 80 IAGFKLRTGMPIGTKVTLRGTNMYDFLDRLIN-------M---GMPRIRDFHG 122 (185) Q Consensus 80 i~~fkiRkG~piG~kvTLRg~~my~FL~kli~-------~---vlPrikdf~g 122 (185) -..|.||-..-.-.++||-|+.+.+|-..+.. + ..=||++|.| T Consensus 24 kr~~~i~D~~~~~l~~tlwG~~A~~F~~~~~~~~~~~~VV~v~~~~~v~~~~g 76 (106) T cd04481 24 KLDFEIRDLSDERLKCTLWGEYAEEFDAKFQSAGNGEPVVAVLRFWKIKEYKG 76 (106) T ss_pred EEEEEEEECCCCEEEEEEEHHHHHHHHHHHHHCCCCCCEEEEEEEEEEEEECC T ss_conf 89999996899989999994798887788875159986899999899887579 No 17 >COG4031 Predicted metal-binding protein [General function prediction only] Probab=33.33 E-value=40 Score=15.40 Aligned_cols=76 Identities=25% Similarity=0.303 Sum_probs=46.1 Q ss_pred EEEEEEEECCHHHHHHHHHHH------HHHHHHHHCCC--------CCCCCCCCCCCEEEEECCCEECCCCCCCCCCCCC Q ss_conf 288999603428999999999------87677520468--------8711015777314330111006875401468855 Q gi|254780249|r 91 IGTKVTLRGTNMYDFLDRLIN------MGMPRIRDFHG--------LNSRSFDGSGNFSFGIREHIVFPEINYDKVDCVL 156 (185) Q Consensus 91 iG~kvTLRg~~my~FL~kli~------~vlPrikdf~g--------~~~~~~d~~Gn~sfGi~e~~~FPEi~yd~~~~~~ 156 (185) =|.+.|+=|.+--.=|-+++. 
-|+|-+-+-+| ++.+.-|.+||+-+=+.|-..+-| T Consensus 130 gGsHsTiIGgR~G~klI~~va~~P~VKkVIPg~I~~~gs~~g~Gvr~KvtRaD~~GNlrlLl~eGss~Qe---------- 199 (227) T COG4031 130 GGSHSTIIGGRSGKKLILLVAQHPYVKKVIPGVISAKGSAGGGGVRLKVTRADARGNLRLLLSEGSSVQE---------- 199 (227) T ss_pred CCCCCEEECCCCHHHHHHHHHCCCCEEECCCCEEECCCCCCCCCEEEEEEEECCCCCEEEEEECCCCEEE---------- T ss_conf 8842146537327899999844986001046313057545787458899860578978887636873157---------- Q ss_pred CCEEEEECCCCCHHHHHHHHHHH Q ss_conf 43599980448968999999980 Q gi|254780249|r 157 GMDISICTTTRSDREAKYLLTLF 179 (185) Q Consensus 157 G~~Iti~Tta~~~~ea~~Lls~~ 179 (185) |.++|||.+.+|+...+..+ T Consensus 200 ---i~vVTTa~s~eeGe~V~~~L 219 (227) T COG4031 200 ---IRVVTTAGSREEGERVMNLL 219 (227) T ss_pred ---EEEEEEECCHHHHHHHHHHH T ss_conf ---99997525555579999999 No 18 >PRK12486 dmdA putative dimethyl sulfoniopropionate demethylase; Reviewed Probab=31.78 E-value=28 Score=16.35 Aligned_cols=15 Identities=20% Similarity=0.565 Sum_probs=6.5 Q ss_pred EEEECCHHHHHHHHH Q ss_conf 996034289999999 Q gi|254780249|r 95 VTLRGTNMYDFLDRL 109 (185) Q Consensus 95 vTLRg~~my~FL~kl 109 (185) ..|.|.++.+.|.++ T Consensus 151 lalqGP~a~~vl~~~ 165 (367) T PRK12486 151 LAVQGPKADDLMARV 165 (367) T ss_pred EEEECCCHHHHHHHH T ss_conf 985570849999763 No 19 >cd03074 PDI_b'_Calsequestrin_C Protein Disulfide Isomerase (PDIb') family, Calsequestrin subfamily, C-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. 
Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin Probab=30.16 E-value=46 Score=15.09 Aligned_cols=48 Identities=23% Similarity=0.301 Sum_probs=35.7 Q ss_pred CCEEEEEEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEECCCC Q ss_conf 986288999603428999999999876775204688711015777314330111006875 Q gi|254780249|r 88 GMPIGTKVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIVFPEI 147 (185) Q Consensus 88 G~piG~kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FPEi 147 (185) |.-|-+.+.=-...-|+||.-|-.++ |+ ...++|+||-.-+...||-+ T Consensus 21 g~~IvAFaee~d~dG~eFl~ilk~vA----~~--------nt~n~~LsivWIDPD~FPll 68 (120) T cd03074 21 GIHIVAFAEEEDPDGYEFLEILKEVA----RD--------NTDNPDLSIIWIDPDDFPLL 68 (120) T ss_pred CCEEEEEECCCCCCHHHHHHHHHHHH----HH--------CCCCCCEEEEEECCCCCCHH T ss_conf 82799986378922899999999999----97--------27698704998888645047 No 20 >TIGR03212 uraD_N-term-dom putative urate catabolism protein. This model represents a protein that is predominantly found just upstream of the UraD protein (OHCU decarboxylase) and in a number of instances as a N-terminal fusion with it. UraD itself catalyzes the last step in the catabolism of urate to allantoate. The function of this protein is presently unknown. It shows homology with the pfam01522 polysaccharide deacetylase domain family. 
Probab=25.93 E-value=48 Score=14.97 Aligned_cols=25 Identities=16% Similarity=0.170 Sum_probs=20.6 Q ss_pred CCHHHHHHHHHHHHHHHCCCCEEEE Q ss_conf 5378999999987765389613541 Q gi|254780249|r 51 ADSKKAESAAADLALITGQKPVITR 75 (185) Q Consensus 51 ~dkk~l~~~~~~L~~ITGqkP~~~~ 75 (185) ++...+..+..+|+.+||++|+--+ T Consensus 136 ~E~~~i~~~~~~l~~~tG~rP~Gw~ 160 (297) T TIGR03212 136 QEREHIAEAIRLHTEVTGERPLGWY 160 (297) T ss_pred HHHHHHHHHHHHHHHHCCCCCCCCC T ss_conf 9999999999999986599988117 No 21 >TIGR02062 RNase_B exoribonuclease II; InterPro: IPR011804 This family consists of exoribonuclease II (RNase II), the product of the rnb gene, as found in a number of gamma proteobacteria. In Escherichia coli, it is one of eight different exoribonucleases. It is involved in mRNA degradation and tRNA precursor end processing.; GO: 0003723 RNA binding, 0008859 exoribonuclease II activity, 0006401 RNA catabolic process. Probab=25.89 E-value=26 Score=16.51 Aligned_cols=97 Identities=21% Similarity=0.311 Sum_probs=55.9 Q ss_pred EEECCCCCCCCCHHHHHHHHHHHHHHHC--CCCEEEECCCCCCCCCCCCCCEEEEEEEEEC-CHHHHHHHH----HHHHH Q ss_conf 9974578756537899999998776538--9613541237642111558986288999603-428999999----99987 Q gi|254780249|r 41 VVNMGVGESIADSKKAESAAADLALITG--QKPVITRARRSIAGFKLRTGMPIGTKVTLRG-TNMYDFLDR----LINMG 113 (185) Q Consensus 41 vin~gvg~a~~dkk~l~~~~~~L~~ITG--qkP~~~~aKksi~~fkiRkG~piG~kvTLRg-~~my~FL~k----li~~v 113 (185) +-|.. +|=|+..++.+...|..=.. |.-.-.--+-++ --=.||-| .+|-.-|+- .++. 
T Consensus 423 ifNtH---~GFd~~~~~~~~~lL~~~~Aneqnqtela~~~~v-----------e~l~Tl~GFc~LRr~L~~~~~~YL~~- 487 (664) T TIGR02062 423 IFNTH---AGFDPKNAENVVELLKANGANEQNQTELALKVDV-----------EELATLEGFCKLRRELEAQETDYLDS- 487 (664) T ss_pred EEEEC---CCCCHHHHHHHHHHHHHCCCCCHHHHHHCCCCCH-----------HHEHHHHHHHHHHHHHCCCCCCHHHH- T ss_conf 58723---7878889999999998607540012322005654-----------21001577888877530278661344- Q ss_pred HHHHHHCCCCCCCCCCCCCCEEEEECCCEECC-CC-CC-CCCCC Q ss_conf 67752046887110157773143301110068-75-40-14688 Q gi|254780249|r 114 MPRIRDFHGLNSRSFDGSGNFSFGIREHIVFP-EI-NY-DKVDC 154 (185) Q Consensus 114 lPrikdf~g~~~~~~d~~Gn~sfGi~e~~~FP-Ei-~y-d~~~~ 154 (185) |||-|..++.-+-.-...|.+|+..|.=|. =| +| |++.+ T Consensus 488 --RiRrYqsFae~~~~p~PHFaLGL~~YATWTSPIRKYsDMiNH 529 (664) T TIGR02062 488 --RIRRYQSFAEISSEPAPHFALGLEAYATWTSPIRKYSDMINH 529 (664) T ss_pred --HHHHHHHHHHHCCCCCCCHHCCCCCCEEECCCCCCCCCCHHH T ss_conf --466641025542788872002346540005774312440448 No 22 >COG0404 GcvT Glycine cleavage system T protein (aminomethyltransferase) [Amino acid transport and metabolism] Probab=24.26 E-value=52 Score=14.76 Aligned_cols=30 Identities=23% Similarity=0.444 Sum_probs=23.0 Q ss_pred EEEEEECCHHHHHHHHHHHHHHHHHHHCCC Q ss_conf 899960342899999999987677520468 Q gi|254780249|r 93 TKVTLRGTNMYDFLDRLINMGMPRIRDFHG 122 (185) Q Consensus 93 ~kvTLRg~~my~FL~kli~~vlPrikdf~g 122 (185) .|+.++|..+.+||++++.--+++++.-+. T Consensus 57 gk~~V~GpdA~~~L~~l~~ndv~kl~~Gr~ 86 (379) T COG0404 57 GKVEVSGPDAAAFLQRLLTNDVSKLKPGRA 86 (379) T ss_pred EEEEEECCCHHHHHHHHCCCCCCCCCCCCE T ss_conf 699998989999999770566676777748 No 23 >pfam10871 DUF2748 Protein of unknown function (DUF2748). This is a bacterial family of proteins with unknown function. 
Probab=23.89 E-value=32 Score=15.97 Aligned_cols=95 Identities=16% Similarity=0.114 Sum_probs=44.5 Q ss_pred CCHHHCCCEEEEEEECCCCCCCCCH--HHHHHHHHHHHH-HHCCCCE-----------EEE-CCCCCCCCCCCCCCEEEE Q ss_conf 9822286044999974578756537--899999998776-5389613-----------541-237642111558986288 Q gi|254780249|r 29 KNVMQIPKIEKVVVNMGVGESIADS--KKAESAAADLAL-ITGQKPV-----------ITR-ARRSIAGFKLRTGMPIGT 93 (185) Q Consensus 29 ~N~~~vPki~KIvin~gvg~a~~dk--k~l~~~~~~L~~-ITGqkP~-----------~~~-aKksi~~fkiRkG~piG~ 93 (185) ..|+-+|.-++.-.|..-+.+.+|+ ..+...+..|.. |.++.|+ ... |.-.+--|-++++--| T Consensus 66 ~dp~L~p~T~~~lq~ly~~sa~ddk~~qKi~~i~~~Lkk~i~k~~~v~~~~~~~lARl~vQsahP~Vi~~lL~~~~ev-- 143 (452) T pfam10871 66 NDPHLIHETKALFQNIYKKIAEDDKVIHKIKQIFDNLKKKIAKLLAVEKDLLEKLARIFVQSAHPIVIHWLLLEKTEV-- 143 (452) T ss_pred CCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHCCCHHHHHHHHHCCCEE-- T ss_conf 686346447999999753323312789999999999999887406745789999999998414519999998628559-- Q ss_pred EEEEECCHHHHHHHHHHHHHHHHHHHCCCCCCCCCC Q ss_conf 999603428999999999876775204688711015 Q gi|254780249|r 94 KVTLRGTNMYDFLDRLINMGMPRIRDFHGLNSRSFD 129 (185) Q Consensus 94 kvTLRg~~my~FL~kli~~vlPrikdf~g~~~~~~d 129 (185) .+|-- ...-+.+|-+.. -+.-+..|+...+++ T Consensus 144 fisys-h~igdmmdi~sW---k~~G~nsGmQS~ng~ 175 (452) T pfam10871 144 FISYS-HNIGDMMDIASW---KRAGGNSGMQSINGK 175 (452) T ss_pred EEEEC-CCHHHHHHHHHH---HHHCCCCCCCCCCCC T ss_conf 99827-518889998889---873577774025899 No 24 >TIGR00600 rad2 DNA excision repair protein (rad2); InterPro: IPR001044 Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. People's skin cells with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. 
XP-G is one of the rarest and most phenotypically heterogeneous XP complementation groups, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region. XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. This entry represents XPGC, an acidic protein that confers normal UV resistance in expressing cells and can correct XP-G. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms.
XPGC cleaves one strand of the duplex at the border with the single-stranded region , .; GO: 0003697 single-stranded DNA binding, 0004519 endonuclease activity, 0006289 nucleotide-excision repair, 0005634 nucleus. Probab=23.53 E-value=51 Score=14.84 Aligned_cols=11 Identities=18% Similarity=0.223 Sum_probs=3.8 Q ss_pred CEEEEEEECCC Q ss_conf 04499997457 Q gi|254780249|r 36 KIEKVVVNMGV 46 (185) Q Consensus 36 ki~KIvin~gv 46 (185) +=++..+++.+ T Consensus 462 ~~t~~~~~S~~ 472 (1127) T TIGR00600 462 KKTKMLLISRI 472 (1127) T ss_pred CCCCEEEECCC T ss_conf 54415763178 No 25 >COG1058 CinA Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA [General function prediction only] Probab=22.36 E-value=64 Score=14.23 Aligned_cols=140 Identities=23% Similarity=0.260 Sum_probs=69.0 Q ss_pred EEEEEEECCCCCCCCCHHHHHHH-----------HHHHHHHH----CCCCEEEECCCCCCCC-------CCCCCCEEEEE Q ss_conf 44999974578756537899999-----------99877653----8961354123764211-------15589862889 Q gi|254780249|r 37 IEKVVVNMGVGESIADSKKAESA-----------AADLALIT----GQKPVITRARRSIAGF-------KLRTGMPIGTK 94 (185) Q Consensus 37 i~KIvin~gvg~a~~dkk~l~~~-----------~~~L~~IT----GqkP~~~~aKksi~~f-------kiRkG~piG~k 94 (185) -+=|+++-|+|-.-+|-. .+.+ ..+++.|. .+....+-+++--|-+ .=-.|...|+. 
T Consensus 61 ~D~vI~tGGLGPT~DDiT-~e~vAka~g~~lv~~~~al~~i~~~~~~r~~~~~~~~~K~A~~P~Ga~~l~NpvG~APG~~ 139 (255) T COG1058 61 ADVVITTGGLGPTHDDLT-AEAVAKALGRPLVLDEEALAMIEEKYAKRGREMTEANRKQAMLPEGAEVLDNPVGTAPGFV 139 (255) T ss_pred CCEEEECCCCCCCCCHHH-HHHHHHHHCCCCCCCHHHHHHHHHHHHHCCCCCCHHHHHHCCCCCCCEECCCCCCCCCEEE T ss_conf 998998798589962768-9999998299856699999999999985288888556641047898875778777687369 Q ss_pred EEEECCHHHHH-------HHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEEECCCEE----------CCCCCCCCCCCCCC Q ss_conf 99603428999-------99999987677520468871101577731433011100----------68754014688554 Q gi|254780249|r 95 VTLRGTNMYDF-------LDRLINMGMPRIRDFHGLNSRSFDGSGNFSFGIREHIV----------FPEINYDKVDCVLG 157 (185) Q Consensus 95 vTLRg~~my~F-------L~kli~~vlPrikdf~g~~~~~~d~~Gn~sfGi~e~~~----------FPEi~yd~~~~~~G 157 (185) +..+|..+|-+ ---+-+.+.|..+. ++.. ...-+.--..||+.|..+ +|++.+...+.-.+ T Consensus 140 v~~~~~~v~~lPGvP~Em~~M~e~~~~~~l~~-~~~~-~~~~~~~~~~~gi~ES~la~~L~~i~~~~~~~~i~s~p~~~~ 217 (255) T COG1058 140 VEGNGKNVYVLPGVPSEMKPMFENVLLPLLTG-RFPS-TKYYSRVLRVFGIGESSLAPTLKDLQDEQPNVTIASYPKDGE 217 (255) T ss_pred EECCCEEEEEECCCCHHHHHHHHHHHHHHHHC-CCCC-CCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCEEEECCCCCC T ss_conf 82288189990899799999999877777411-4788-706999999868775787899999985089977982588773 Q ss_pred C---EEEEECCCCCHHHHHHHHHHH Q ss_conf 3---599980448968999999980 Q gi|254780249|r 158 M---DISICTTTRSDREAKYLLTLF 179 (185) Q Consensus 158 ~---~Iti~Tta~~~~ea~~Lls~~ 179 (185) . 
+|.|...+.+.+++..++..+ T Consensus 218 ~~~~~~~i~~~~~~~~~~~~~~~~~ 242 (255) T COG1058 218 VRLRELVIRAEARDEEEADALLRWL 242 (255) T ss_pred EECCCEEEEEECCCHHHHHHHHHHH T ss_conf 2113158997537899999999999 No 26 >PRK13579 gcvT glycine cleavage system aminomethyltransferase T; Provisional Probab=22.24 E-value=64 Score=14.22 Aligned_cols=19 Identities=11% Similarity=0.305 Sum_probs=7.3 Q ss_pred EEEEEEECCHHHHHHHHHH Q ss_conf 8899960342899999999 Q gi|254780249|r 92 GTKVTLRGTNMYDFLDRLI 110 (185) Q Consensus 92 G~kvTLRg~~my~FL~kli 110 (185) |.-+-+..+.+..+.+.|+ T Consensus 199 G~Ei~~~~~~a~~l~~~l~ 217 (371) T PRK13579 199 GFEISVPADAAEALAEALL 217 (371) T ss_pred EEEEEECHHHHHHHHHHHH T ss_conf 5999965999999999999 No 27 >PRK13249 phycoerythrobilin:ferredoxin oxidoreductase; Provisional Probab=20.85 E-value=39 Score=15.49 Aligned_cols=23 Identities=17% Similarity=0.411 Sum_probs=9.5 Q ss_pred EEECCCEECCCCCCCCCCCCCCCEE Q ss_conf 3301110068754014688554359 Q gi|254780249|r 136 FGIREHIVFPEINYDKVDCVLGMDI 160 (185) Q Consensus 136 fGi~e~~~FPEi~yd~~~~~~G~~I 160 (185) .-|=...+||.-+||. -++|+|+ T Consensus 86 lqILn~V~fP~~~yDL--PiFG~Dl 108 (257) T PRK13249 86 ASVLNFVINPSNRFDL--PFFGADL 108 (257) T ss_pred CEEEEEEECCCCCCCC--CCCCEEE T ss_conf 6677888647878998--8503036 No 28 >pfam06158 Phage_E Phage tail protein E. Family of small phage tail protein, referred to as protein E. Probab=20.77 E-value=52 Score=14.78 Aligned_cols=39 Identities=36% Similarity=0.562 Sum_probs=26.1 Q ss_pred ECCCCCCCCCCCCCCEEEEEEEEECCHHHHHHH----HHHHHHHHHHH Q ss_conf 123764211155898628899960342899999----99998767752 Q gi|254780249|r 75 RARRSIAGFKLRTGMPIGTKVTLRGTNMYDFLD----RLINMGMPRIR 118 (185) Q Consensus 75 ~aKksi~~fkiRkG~piG~kvTLRg~~my~FL~----kli~~vlPrik 118 (185) |....|...-+||- .-=+|||-++++.+. -++. 
+||||- T Consensus 11 RG~~~It~vtlrkP----~aG~LRGl~L~dv~~~Dvdal~~-vLPRIT 53 (86) T pfam06158 11 RGEQTITEVTLRKP----NAGSLRGLSLADVLQMDVDALIT-LLPRIT 53 (86) T ss_pred CCCEEEEEEEEECC----CCCCCCCCCHHHHHHCCHHHHHH-HCCCCC T ss_conf 39988889998179----98766575699998267988987-611347 Done!