Query gi|254780789|ref|YP_003065202.1| hypothetical protein CLIBASIA_03400 [Candidatus Liberibacter asiaticus str. psy62] Match_columns 192 No_of_seqs 121 out of 1003 Neff 5.7 Searched_HMMs 39220 Date Sun May 29 20:48:34 2011 Command /home/congqian_1/programs/hhpred/hhsearch -i 254780789.hhm -d /home/congqian_1/database/cdd/Cdd.hhm No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 COG0779 Uncharacterized protei 100.0 1.5E-44 0 298.5 18.7 151 16-175 1-152 (153) 2 PRK00092 hypothetical protein; 100.0 1.8E-43 0 291.9 19.4 150 20-176 1-151 (153) 3 pfam02576 DUF150 Uncharacteriz 100.0 3.8E-41 1.4E-45 277.3 17.7 140 28-172 1-141 (141) 4 PRK02001 hypothetical protein; 100.0 1.4E-34 3.5E-39 236.4 17.1 143 22-174 4-152 (154) 5 cd01734 YlxS_C YxlS is a Bacil 99.9 3.8E-23 9.7E-28 165.0 10.2 82 90-175 1-82 (83) 6 pfam11562 EDC3_LSm Enhancer of 82.0 4.4 0.00011 21.0 5.8 62 110-174 3-68 (84) 7 cd01719 Sm_G The eukaryotic Sm 80.8 2.6 6.6E-05 22.4 4.0 38 106-147 2-40 (72) 8 cd01729 LSm7 The eukaryotic Sm 78.9 3.2 8.3E-05 21.8 4.0 38 106-147 4-42 (81) 9 pfam11451 DUF3202 Protein of u 78.3 2.7 6.8E-05 22.3 3.4 38 106-147 5-42 (67) 10 pfam06257 DUF1021 Protein of u 76.6 6.4 0.00016 20.0 6.5 60 106-171 9-71 (76) 11 cd00600 Sm_like The eukaryotic 73.3 6.4 0.00016 20.0 4.3 34 110-147 2-36 (63) 12 cd01731 archaeal_Sm1 The archa 72.4 7.2 0.00018 19.7 4.4 37 107-147 3-40 (68) 13 PRK00737 small nuclear ribonuc 71.9 6.2 0.00016 20.1 3.9 41 100-147 3-44 (72) 14 cd01737 LSm16_N LSm16 belongs 70.7 8.9 0.00023 19.1 6.9 58 110-170 2-61 (62) 15 COG1958 LSM1 Small nuclear rib 70.4 7.1 0.00018 19.7 4.0 36 107-146 10-46 (79) 16 cd01726 LSm6 The eukaryotic Sm 68.4 6.9 0.00018 19.8 3.5 34 109-146 5-39 (67) 17 KOG1783 consensus 65.6 4.5 0.00011 20.9 2.1 31 106-140 8-38 (77) 18 pfam01423 LSM LSM domain. 
The 63.8 11 0.00028 18.5 3.8 35 108-146 2-37 (66) 19 smart00651 Sm snRNP Sm protein 63.4 11 0.00029 18.4 3.9 35 108-146 2-37 (67) 20 COG1481 Uncharacterized protei 61.2 8.1 0.00021 19.3 2.8 111 19-139 50-186 (308) 21 cd01722 Sm_F The eukaryotic Sm 61.2 11 0.00029 18.4 3.6 29 109-141 6-34 (68) 22 TIGR00585 mutl DNA mismatch re 60.5 14 0.00036 17.8 5.7 50 22-74 24-73 (367) 23 PRK06955 biotin--protein ligas 58.3 15 0.00039 17.6 4.2 50 97-150 232-281 (300) 24 cd01732 LSm5 The eukaryotic Sm 53.2 19 0.00047 17.1 5.1 37 107-147 6-43 (76) 25 cd01213 tensin Tensin Phosphot 46.4 17 0.00043 17.4 2.4 45 69-113 24-74 (138) 26 cd01717 Sm_B The eukaryotic Sm 46.0 24 0.00061 16.4 5.0 36 108-147 4-40 (79) 27 COG4466 Veg Uncharacterized pr 44.6 25 0.00065 16.2 6.1 62 106-170 11-72 (80) 28 cd01727 LSm8 The eukaryotic Sm 44.3 26 0.00065 16.2 3.8 35 109-147 4-39 (74) 29 KOG1780 consensus 43.9 26 0.00066 16.2 3.7 30 106-140 7-36 (77) 30 COG1094 Predicted RNA-binding 43.2 18 0.00046 17.2 2.2 89 63-174 61-158 (194) 31 TIGR00302 TIGR00302 phosphorib 40.1 30 0.00076 15.8 5.7 66 19-95 11-79 (80) 32 KOG3482 consensus 39.8 30 0.00077 15.8 3.5 33 105-141 8-41 (79) 33 cd01723 LSm4 The eukaryotic Sm 39.2 31 0.00078 15.7 3.4 34 109-146 6-40 (76) 34 pfam02237 BPL_C Biotin protein 38.8 31 0.00079 15.7 4.4 35 112-151 1-35 (47) 35 cd01724 Sm_D1 The eukaryotic S 38.7 31 0.0008 15.7 3.7 31 107-141 4-34 (90) 36 cd01725 LSm2 The eukaryotic Sm 38.2 32 0.00081 15.6 3.8 30 108-141 5-34 (81) 37 pfam06372 Gemin6 Gemin6 protei 37.8 32 0.00082 15.6 5.6 35 104-142 7-41 (169) 38 PRK11886 biotin--protein ligas 36.1 34 0.00088 15.4 5.2 39 105-148 263-302 (319) 39 cd01733 LSm10 The eukaryotic S 35.4 35 0.0009 15.4 3.7 31 107-141 12-42 (78) 40 KOG1781 consensus 35.3 27 0.00068 16.1 2.0 42 94-139 6-48 (108) 41 cd01721 Sm_D3 The eukaryotic S 35.1 36 0.00091 15.3 3.7 29 109-141 5-33 (70) 42 KOG0095 consensus 33.8 27 0.00068 16.1 1.8 54 49-108 110-163 (213) 43 TIGR00922 nusG transcription t 33.2 38 0.00098 15.1 3.5 52 
113-170 142-193 (193) 44 cd01730 LSm3 The eukaryotic Sm 33.0 39 0.00098 15.1 4.8 30 110-143 7-36 (82) 45 PRK01191 rpl24p 50S ribosomal 32.0 40 0.001 15.0 3.9 70 112-185 48-119 (119) 46 pfam01545 Cation_efflux Cation 31.6 41 0.001 15.0 8.3 74 21-97 195-269 (273) 47 PRK13305 sgbH 3-keto-L-gulonat 31.5 41 0.001 15.0 3.5 19 64-82 93-111 (220) 48 pfam10842 DUF2642 Protein of u 31.5 41 0.001 15.0 4.8 37 108-149 15-51 (66) 49 cd02978 KaiB_like KaiB-like fa 31.3 41 0.001 14.9 4.4 40 51-97 2-41 (72) 50 PRK08942 D,D-heptose 1,7-bisph 30.9 24 0.00061 16.4 1.2 41 33-83 41-81 (181) 51 pfam11589 DUF3244 Protein of u 30.6 15 0.00039 17.6 0.1 58 39-98 37-95 (106) 52 pfam08496 Peptidase_S49_N Pept 30.2 43 0.0011 14.8 5.0 46 53-99 97-142 (154) 53 pfam04514 BTV_NS2 Bluetongue v 29.7 24 0.00061 16.4 1.1 35 69-103 23-69 (363) 54 cd04479 RPA3 RPA3: A subfamily 29.3 44 0.0011 14.7 4.9 46 101-149 2-48 (101) 55 pfam03983 SHD1 SLA1 homology d 28.6 46 0.0012 14.6 3.2 48 126-180 19-66 (70) 56 COG0323 MutL DNA mismatch repa 28.2 46 0.0012 14.6 5.6 46 24-72 27-72 (638) 57 PRK08255 salicylyl-CoA 5-hydro 28.0 47 0.0012 14.6 9.0 86 8-101 538-669 (770) 58 PHA00026 cp coat protein 27.6 23 0.00058 16.5 0.6 23 67-89 99-121 (129) 59 pfam02700 PurS Phosphoribosylf 27.6 48 0.0012 14.5 6.9 69 18-95 10-79 (80) 60 cd06168 LSm9 The eukaryotic Sm 27.3 48 0.0012 14.5 4.4 36 108-147 4-40 (75) 61 PRK09301 circadian clock prote 26.7 49 0.0013 14.4 4.1 76 48-138 4-82 (103) 62 pfam00479 G6PD_N Glucose-6-pho 26.1 51 0.0013 14.4 3.5 33 53-86 138-170 (183) 63 pfam08484 Methyltransf_14 C-me 26.0 51 0.0013 14.4 4.1 26 37-62 1-26 (169) 64 KOG3448 consensus 25.7 52 0.0013 14.3 3.1 31 106-140 4-34 (96) 65 COG0172 SerS Seryl-tRNA synthe 25.5 52 0.0013 14.3 4.4 50 88-137 335-386 (429) 66 TIGR01088 aroQ 3-dehydroquinat 25.1 53 0.0013 14.3 4.6 40 9-49 14-53 (144) 67 cd01728 LSm1 The eukaryotic Sm 25.1 53 0.0013 14.3 6.1 39 105-147 3-42 (74) 68 KOG2059 consensus 24.9 41 0.0011 14.9 1.5 76 91-171 546-621 (800) 69 pfam11468 
PTase_Orf2 Aromatic 24.8 54 0.0014 14.2 6.7 127 14-146 124-273 (294) 70 TIGR00876 tal_mycobact transal 24.6 54 0.0014 14.2 2.1 38 63-101 69-106 (350) 71 TIGR03361 VI_Rhs_Vgr type VI s 23.6 56 0.0014 14.1 5.3 40 4-47 91-130 (513) 72 pfam09957 DUF2191 Uncharacteri 23.2 49 0.0012 14.5 1.6 20 169-188 2-21 (47) 73 TIGR02776 NHEJ_ligase_prk DNA 23.0 58 0.0015 14.0 5.3 80 25-123 503-582 (645) 74 cd01236 PH_outspread Outspread 23.0 21 0.00053 16.8 -0.3 32 60-98 49-80 (104) 75 cd00770 SerRS_core Seryl-tRNA 22.1 60 0.0015 13.9 2.3 41 85-135 222-262 (297) 76 PRK11778 putative periplasmic 21.8 61 0.0016 13.8 4.9 67 53-120 69-137 (317) 77 COG0250 NusG Transcription ant 21.1 63 0.0016 13.8 4.7 50 113-170 126-177 (178) 78 TIGR00577 fpg formamidopyrimid 20.3 20 0.0005 16.9 -0.9 74 11-84 117-226 (292) 79 COG4004 Uncharacterized protei 20.2 66 0.0017 13.6 2.2 31 69-101 14-44 (96) 80 PRK12853 glucose-6-phosphate 1 20.2 66 0.0017 13.6 3.4 33 53-86 139-171 (486) 81 KOG0129 consensus 20.1 29 0.00074 15.9 -0.0 89 86-175 329-438 (520) No 1 >COG0779 Uncharacterized protein conserved in bacteria [Function unknown] Probab=100.00 E-value=1.5e-44 Score=298.54 Aligned_cols=151 Identities=34% Similarity=0.663 Sum_probs=142.2 Q ss_pred CCCCCHHHHHHHHHHHHHHHCCCEEEEEEEECCC-CCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEE Q ss_conf 0141279999999999997479779999995499-868999996587787899999999998752021135676507999 Q gi|254780789|r 16 FGDMGLAGDISSVIQPVIEEMSFRSVQISLLEEK-NLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEV 94 (192) Q Consensus 16 ~~~~~i~~~i~~li~p~v~~lG~eLv~v~~~~~~-~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEV 94 (192) ++..++.+++.++++|+++++||+||+|++.+.+ +++|||+||++ |+|++|||+.+||++|+.||++|||++.|+||| T Consensus 1 ~~~~~~~~~v~~li~p~~~~lG~ELv~ve~~~~~~~~~lrI~id~~-g~v~ldDC~~vSr~is~~LD~edpi~~~Y~LEV 79 (153) T COG0779 1 IMESPITEKVTELIEPVVESLGFELVDVEFVKEGRDSVLRIYIDKE-GGVTLDDCADVSRAISALLDVEDPIEGAYFLEV 79 (153) T ss_pred CCCCCHHHHHHHHHHHHHHHCCCEEEEEEEEECCCCCEEEEEECCC-CCCCHHHHHHHHHHHHHHHCCCCCCCCCEEEEE T ss_conf 
9756257899999987786449589999999728994899996799-998889999999998987365876666589996 Q ss_pred EECCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHCCEEC Q ss_conf 72788774457899998710568898722668807899998512697599996355456777726997688765172504 Q gi|254780789|r 95 SSPGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSARLIV 174 (192) Q Consensus 95 SSPGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kAkLv~ 174 (192) ||||+||||++++||.+|+|++|+|+|+.|++|+|+|.|+|.+++++.+++..+++ ++.|||++|++|||++ T Consensus 80 SSPGldRpL~~~~~f~r~~G~~Vkv~l~~~~~~~k~~~G~i~~~d~~~v~~~~~~k--------~v~Ip~~~i~kArl~~ 151 (153) T COG0779 80 SSPGLDRPLKTAEHFARFIGEKVKVKLRLPIEGRKKFEGKIVAVDGETVTLEVDGK--------EVEIPFSDIAKARLVP 151 (153) T ss_pred ECCCCCCCCCCHHHHHHHCCCEEEEEEECCCCCCEEEEEEEEEECCCEEEEEECCE--------EEEEECCCCHHHEECC T ss_conf 47998877579899998469589999965448840788999997298699997784--------8998756311200114 Q ss_pred C Q ss_conf 6 Q gi|254780789|r 175 T 175 (192) Q Consensus 175 ~ 175 (192) + T Consensus 152 ~ 152 (153) T COG0779 152 E 152 (153) T ss_pred C T ss_conf 6 No 2 >PRK00092 hypothetical protein; Reviewed Probab=100.00 E-value=1.8e-43 Score=291.86 Aligned_cols=150 Identities=31% Similarity=0.596 Sum_probs=140.1 Q ss_pred CHHHHHHHHHHHHHHHCCCEEEEEEEECC-CCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECC Q ss_conf 27999999999999747977999999549-98689999965877878999999999987520211356765079997278 Q gi|254780789|r 20 GLAGDISSVIQPVIEEMSFRSVQISLLEE-KNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPG 98 (192) Q Consensus 20 ~i~~~i~~li~p~v~~lG~eLv~v~~~~~-~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPG 98 (192) .++++|+++++|+++++||+||+|++.++ ++++||||||+++| |+||||+.+||.|++.||.+|+++++|+||||||| T Consensus 1 p~~e~i~~li~pvv~~~G~~L~dve~~~~~~~~~lrI~ID~~~g-v~lddc~~vSr~is~~LD~~d~i~~~Y~LEVSSPG 79 (153) T PRK00092 1 PLEEQLTELIEPVVEGLGYELVGVEFVKAGRPSTLRIYIDSDGG-ITLDDCEDVSRQLSAVLDVEDPIPDAYTLEVSSPG 79 (153) T ss_pred 
CHHHHHHHHHHHHHHHCCCEEEEEEEEECCCCEEEEEEEECCCC-CCHHHHHHHHHHHHHHHCCCCCCCCCEEEEEECCC T ss_conf 97899999999999877999999999918997399999988999-18999999889988752636567875599996799 Q ss_pred CCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHCCEECCH Q ss_conf 877445789999871056889872266880789999851269759999635545677772699768876517250469 Q gi|254780789|r 99 IDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSARLIVTD 176 (192) Q Consensus 99 idRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kAkLv~~d 176 (192) ++|||+.++||++|+|+.|+|+|+.+.+|+++|.|+|.+++++.++|+++++ ..++.|||++|++|||++.- T Consensus 80 i~RpL~~~~~f~~~~G~~v~v~l~~~~~~~k~~~G~L~~~~~~~i~l~~~~~------~~~~~i~~~~I~ka~l~~ef 151 (153) T PRK00092 80 LDRPLKTAEHFRRFVGREVKVKLREPIDGRKKFQGRLLAVDGETVTLEVEGK------PKVVEIPLDNIAKARLVPEF 151 (153) T ss_pred CCCCCCCHHHHHHHCCCEEEEEEECCCCCCEEEEEEEEEEECCEEEEEECCC------CEEEEEEHHHHEEEEEEEEE T ss_conf 9973269899998669389999944668964999999988499899998897------06999736981089999970 No 3 >pfam02576 DUF150 Uncharacterized BCR, YhbC family COG0779. 
Probab=100.00 E-value=3.8e-41 Score=277.27 Aligned_cols=140 Identities=34% Similarity=0.675 Sum_probs=130.7 Q ss_pred HHHHHHHHCCCEEEEEEEECC-CCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCCCCCCH Q ss_conf 999999747977999999549-9868999996587787899999999998752021135676507999727887744578 Q gi|254780789|r 28 VIQPVIEEMSFRSVQISLLEE-KNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDRPMVRK 106 (192) Q Consensus 28 li~p~v~~lG~eLv~v~~~~~-~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidRpL~~~ 106 (192) +++|+++++||+||+|++.++ ++++|||+||+++| |++|||+.+||.|++.||.+++++++|+|||||||++|||+++ T Consensus 1 li~pv~~~~G~el~dve~~~~~~~~~l~I~iD~~~g-v~iddc~~~Sr~i~~~Ld~~d~~~~~y~LEVSSPGi~RpL~~~ 79 (141) T pfam02576 1 LIEPVVESLGFELVDVEFVKEGRGWVLRIYIDKDGG-VTLDDCEEVSRAISALLDVEDPIPEAYFLEVSSPGLERPLKTE 79 (141) T ss_pred CCHHHHHHCCCEEEEEEEECCCCCEEEEEEEECCCC-CCHHHHHHHHHHHHHHHCCCCCCCCCEEEEEECCCCCCCCCCH T ss_conf 964478877999999999908997499999989999-7899999999999877512666677559999589999834888 Q ss_pred HHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHCCE Q ss_conf 999987105688987226688078999985126975999963554567777269976887651725 Q gi|254780789|r 107 SDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSARL 172 (192) Q Consensus 107 ~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kAkL 172 (192) +||.+|+|+.|+|+++.+.+++++|+|+|.+++++.++|+++++. 
.+..++|||++|++||| T Consensus 80 ~~f~~~~G~~v~v~l~~~~~~~k~~~G~L~~~~~~~i~l~~~~~~----~~~~~~i~~~~I~kA~L 141 (141) T pfam02576 80 RHFARFIGKLVKVSLKEPIEGRKNFTGKLLEVDGDTVTIEVDDKR----RKKEVEIPFADIKKARL 141 (141) T ss_pred HHHHHHCCCEEEEEEECCCCCEEEEEEEEEEEECCEEEEEECCCC----CCEEEEEEHHHHHHCCC T ss_conf 999986594899999246699389999999886999999985871----22689973799523359 No 4 >PRK02001 hypothetical protein; Validated Probab=100.00 E-value=1.4e-34 Score=236.36 Aligned_cols=143 Identities=22% Similarity=0.349 Sum_probs=124.8 Q ss_pred HHHHHHHHHHHHHHCCCEEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCC Q ss_conf 99999999999974797799999954998689999965877878999999999987520211356765079997278877 Q gi|254780789|r 22 AGDISSVIQPVIEEMSFRSVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDR 101 (192) Q Consensus 22 ~~~i~~li~p~v~~lG~eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidR 101 (192) .+-++.+++|.++..||.|+++++.+++ .++|+||+++| |+||||+.+||+|+..||.++ .+|+|||||||++| T Consensus 4 ~~~~e~l~e~~~~~~~lfLvdv~v~~~~--~i~V~iD~d~G-v~iddC~~iSr~i~~~LD~~~---~~y~LEVSSPGl~r 77 (154) T PRK02001 4 KKVVELIVEEWLETKSYFLVDVAISPGN--KIVVELDGDEG-VWIDDCVELSRFIESNLDREE---EDYELEVGSAGIGS 77 (154) T ss_pred HHHHHHHHHHHHHCCCEEEEEEEECCCC--EEEEEEECCCC-CCHHHHHHHHHHHHHHHCCCC---CCEEEEEECCCCCC T ss_conf 8999999999975699099999984899--89999979999-278999999999987645676---55399985799998 Q ss_pred CCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCC------CCCCEEEEEHHHHHHCCEEC Q ss_conf 44578999987105688987226688078999985126975999963554567------77726997688765172504 Q gi|254780789|r 102 PMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKD------MNELQIAISFDSLLSARLIV 174 (192) Q Consensus 102 pL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~------~~~~~v~ip~~~I~kAkLv~ 174 (192) ||+.++||.+|+|+.++|++. +.++|.|+|.++++++|+|.++.+.+++ ......+|||++|++|++.. 
T Consensus 78 PL~~~rqy~kniGr~v~V~~~----~g~~~~G~L~~v~~~~i~L~~~~r~~k~~gKgK~tv~~~~~i~f~~Ik~AkV~I 152 (154) T PRK02001 78 PLKVLRQYKKNIGRELEVLTK----NGRKLEGVLKDADEEKIKVSVKKKVKPEGAKRKKTVEEEETITYADIKYAKYLI 152 (154) T ss_pred CCCCHHHHHHHCCCEEEEEEC----CCCEEEEEEEEECCCEEEEEEEEEECCCCCCCCCCEEEEEEECHHHHEEEEEEE T ss_conf 778989999855988999978----998999999996398299999854057788886203577788378701689999 No 5 >cd01734 YlxS_C YxlS is a Bacillus subtilis gene of unknown function with two domains that each have an alpha/beta fold. The N-terminal domain is composed of two alpha-helices and a three-stranded beta-sheet, while the C-terminal domain is composed of one alpha-helix and a five-stranded beta-sheet. This CD represents the C-terminal domain which has a fold similar to the Sm fold of proteins like Sm-D3. Probab=99.89 E-value=3.8e-23 Score=164.95 Aligned_cols=82 Identities=29% Similarity=0.502 Sum_probs=75.9 Q ss_pred CEEEEEECCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHH Q ss_conf 07999727887744578999987105688987226688078999985126975999963554567777269976887651 Q gi|254780789|r 90 YRLEVSSPGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLS 169 (192) Q Consensus 90 Y~LEVSSPGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~k 169 (192) |+|||||||+||||++++||.+|+|+.|+|+|+.|++|+++|+|.|.+++++.++|.++.+. ...+++|||++|++ T Consensus 1 Y~LEVSSPGldRpL~~~~~f~~~~G~~v~v~l~~~~~g~k~f~G~L~~v~~~~i~l~~~~~~----~~~~~~i~~~~I~k 76 (83) T cd01734 1 YFLEVSSPGAERPLKKEADFERAVGKYVHVKLYQPIDGQKEFEGTLLGVDDDTVTLEVDIKT----RGKTVEIPLDKIAK 76 (83) T ss_pred CEEEEECCCCCCCCCCHHHHHHHCCCEEEEEEECCCCCCEEEEEEEEEEECCEEEEEEECCC----CCEEEEEEHHHHCE T ss_conf 98998489989867899999985798799999150189189999999883999999995277----87899974698214 Q ss_pred CCEECC Q ss_conf 725046 Q gi|254780789|r 170 ARLIVT 175 (192) Q Consensus 170 AkLv~~ 175 (192) |||++. 
T Consensus 77 ArL~~~ 82 (83) T cd01734 77 ARLAPE 82 (83) T ss_pred EEEEEE T ss_conf 789788 No 6 >pfam11562 EDC3_LSm Enhancer of mRNA-decapping protein 3- N terminal. EDC3 functions in mRNA decapping. This family represents the N-terminal LSm domain of EDC3. This LSm domain mediates DCP1 binding and P-body localisation. The LSm domain adopts a divergent Sm fold that has a disrupted beta4-strand and lacks the usual N-terminal alpha-helix. Probab=82.04 E-value=4.4 Score=21.02 Aligned_cols=62 Identities=16% Similarity=0.119 Sum_probs=45.3 Q ss_pred HHHHCCEEEEEEECCCCCEEEEEEEEEEC--CCCEEEEEECCCCCC--CCCCCEEEEEHHHHHHCCEEC Q ss_conf 98710568898722668807899998512--697599996355456--777726997688765172504 Q gi|254780789|r 110 LRWNGHVVACEIVLSSGDKQKLIGKIMGT--SETGFFLEKEKRGEK--DMNELQIAISFDSLLSARLIV 174 (192) Q Consensus 110 ~r~~G~~VkV~l~~~~~g~k~~~G~L~~v--~~~~i~l~~~~~~~k--~~~~~~v~ip~~~I~kAkLv~ 174 (192) ..|+|..|.++..+ +.-.|.|.+..+ ++..|+|...-+... +....++++.-.+|.+-+++- T Consensus 3 ~~~iG~~VSi~C~d---~lGvyQG~I~~vd~t~q~Itl~~af~NGvPlk~~~~EVtlsa~DI~~LkiI~ 68 (84) T pfam11562 3 DDWIGSSVSINCGD---TLGVYQGKIKQVDQTSQTISLTRPFHNGVPLKCLVPEVTFSAGDISSLKIIE 68 (84) T ss_pred CCCCCCEEEEEECC---CCCEEEEEEEEECCCCCEEEEEEHHHCCCCCCCCCCEEEEEECCCCCEEEEE T ss_conf 23213179998579---8537888999862789769971041478765468843998811011002787 No 7 >cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. 
Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=80.81 E-value=2.6 Score=22.41 Aligned_cols=38 Identities=11% Similarity=0.285 Sum_probs=29.1 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 899998710568898722668807899998512697-599996 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) +-+.++|.++.|.|+++ |.+.+.|+|.++|.- .++|.. T Consensus 2 ~pdL~~~ldk~v~Vkl~----ggR~i~G~L~GfD~~mNLVLdd 40 (72) T cd01719 2 PPELKKYMDKKLSLKLN----GNRKVSGILRGFDPFMNLVLDD 40 (72) T ss_pred CCHHHHHCCCEEEEEEC----CCCEEEEEEEEECCCCEEEEEE T ss_conf 81457754988999988----9969999999707420277230 No 8 >cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=78.93 E-value=3.2 Score=21.82 Aligned_cols=38 Identities=11% Similarity=0.276 Sum_probs=28.8 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 899998710568898722668807899998512697-599996 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) .=+..+|+++.|.|++. +.+.+.|+|.++|.. +++|.. 
T Consensus 4 ildL~k~ldk~V~Vkl~----~gR~v~G~L~gfD~~mNLVL~d 42 (81) T cd01729 4 ILDLSKYVDKKIRVKFQ----GGREVTGILKGYDQLLNLVLDD 42 (81) T ss_pred HHHHHHHCCCEEEEEEC----CCCEEEEEEECCCCCCEEEEEE T ss_conf 41478855968999987----9939999997046620177663 No 9 >pfam11451 DUF3202 Protein of unknown function (DUF3202). This archaeal family of proteins has no known function. Probab=78.25 E-value=2.7 Score=22.35 Aligned_cols=38 Identities=24% Similarity=0.445 Sum_probs=32.3 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEE Q ss_conf 899998710568898722668807899998512697599996 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEK 147 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~ 147 (192) .+-.++|.|++|.|. +++...|.|+|..++++.+.+.. T Consensus 5 dktL~~WKg~kvAv~----vg~ehSFtGiledFDeEviLL~d 42 (67) T pfam11451 5 DKTLEEWKGHKVAVG----IGGDHSFSGILEDFDEEVILLKD 42 (67) T ss_pred HHHHHHHCCCEEEEE----ECCCCEEEEEHHHCCCCEEEEHH T ss_conf 889998479679999----66852146565445863675004 No 10 >pfam06257 DUF1021 Protein of unknown function (DUF1021). This family consists of several hypothetical bacterial proteins of unknown function. Probab=76.65 E-value=6.4 Score=19.97 Aligned_cols=60 Identities=18% Similarity=0.230 Sum_probs=44.7 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEE---EEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHCC Q ss_conf 8999987105688987226688078---99998512697599996355456777726997688765172 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQK---LIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSAR 171 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~---~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kAk 171 (192) .+....++|+.|.++.. .|+|+ -.|+|.++=..-+++.++... .....++..|.+|---. 
T Consensus 9 k~~l~~~vG~~v~l~an---~GRkK~~~~~GvL~etyPsvFvV~ld~~~---~~~~rvSYSYsDvLT~~ 71 (76) T pfam06257 9 KEKLDAHVGERVTLKAN---GGRKKVTEREGILEETYPSVFVVELDQDE---NTFERVSYSYSDVLTKT 71 (76) T ss_pred HHHHHHCCCCEEEEEEC---CCCEEEEEEEEEEEEECCCEEEEEEECCC---CCEEEEEEEEEEEECCE T ss_conf 99998607988999962---89522899999993104618999992678---95889989973023131 No 11 >cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=73.25 E-value=6.4 Score=19.96 Aligned_cols=34 Identities=18% Similarity=0.237 Sum_probs=27.5 Q ss_pred HHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 98710568898722668807899998512697-599996 Q gi|254780789|r 110 LRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 110 ~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) ++++|+.|.|+++.. +.+.|+|.++|.. .+.|.. T Consensus 2 ~~~ig~~V~V~l~~g----~~~~G~L~~~D~~mNlvL~~ 36 (63) T cd00600 2 KDLVGKTVRVELKDG----RVLEGVLVAFDKYMNLVLDD 36 (63) T ss_pred HHHCCCEEEEEECCC----CEEEEEEEEECCCCCEEECC T ss_conf 468698599999899----59999999988865409867 No 12 >cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. 
Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain. Probab=72.40 E-value=7.2 Score=19.67 Aligned_cols=37 Identities=11% Similarity=0.118 Sum_probs=27.8 Q ss_pred HHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 99998710568898722668807899998512697-599996 Q gi|254780789|r 107 SDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 107 ~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) .-.++++|+.|.|+++. .+.+.|+|.++|+. .+.|.. T Consensus 3 ~~L~~~~~k~V~V~Lk~----g~~~~G~L~~~D~~mNlvL~d 40 (68) T cd01731 3 DVLKDSLNKPVLVKLKG----GKEVRGRLKSYDQHMNLVLED 40 (68) T ss_pred HHHHHHCCCEEEEEECC----CCEEEEEEEEECCCCCEEECC T ss_conf 79888549859999989----989999999994753189824 No 13 >PRK00737 small nuclear ribonucleoprotein; Provisional Probab=71.89 E-value=6.2 Score=20.06 Aligned_cols=41 Identities=15% Similarity=0.225 Sum_probs=29.5 Q ss_pred CCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 774457899998710568898722668807899998512697-599996 Q gi|254780789|r 100 DRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 100 dRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) +|||. -.++++|+.|.|+++ +.+.|.|+|.++|.- .++|.. 
T Consensus 3 ~~Pl~---~L~~~~~k~V~V~Lk----~gr~~~G~L~~~D~~mNlVL~d 44 (72) T PRK00737 3 ERPLD---VLNNSLNSPVLVRLK----GGREFRGELQGYDIHMNLVLAN 44 (72) T ss_pred CCHHH---HHHHHCCCEEEEEEC----CCCEEEEEEEEECCCCCEEECC T ss_conf 87699---999874984999998----9989999999985311179825 No 14 >cd01737 LSm16_N LSm16 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. LSm16 has, in addition to its N-terminal Sm-like domain, a C-terminal Yjef_N-type Rossmann fold domain of unknown function. Probab=70.73 E-value=8.9 Score=19.07 Aligned_cols=58 Identities=7% Similarity=0.038 Sum_probs=41.0 Q ss_pred HHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECC--CCCCCCCCCEEEEEHHHHHHC Q ss_conf 9871056889872266880789999851269759999635--545677772699768876517 Q gi|254780789|r 110 LRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEK--RGEKDMNELQIAISFDSLLSA 170 (192) Q Consensus 110 ~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kA 170 (192) ..|+|..|.+.... +.-.|.|.+.+++.+..+|.... 
...-+....++++.-.+|..- T Consensus 2 ~~wiG~~VSI~C~~---~lGv~QG~I~~v~~~~qtItl~~~f~ngi~~~~~EVtl~a~dI~~L 61 (62) T cd01737 2 QDWLGSIVSINCGE---TLGVYQGLVSAVDQESQTISLAFPFHNGVKCLVPEVTFRAGDIREL 61 (62) T ss_pred CCCEEEEEEEEECC---CCCEEEEEEEEECCCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHC T ss_conf 76240689998679---8528888999857666389984056688658883599983564333 No 15 >COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription] Probab=70.43 E-value=7.1 Score=19.69 Aligned_cols=36 Identities=14% Similarity=0.174 Sum_probs=27.3 Q ss_pred HHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCE-EEEE Q ss_conf 999987105688987226688078999985126975-9999 Q gi|254780789|r 107 SDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETG-FFLE 146 (192) Q Consensus 107 ~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~-i~l~ 146 (192) .-..+++|+.|.|+|++ | +.+.|+|.++|..- +.|. T Consensus 10 ~~l~~~~~~~V~V~lk~---g-~~~~G~L~~~D~~mNlvL~ 46 (79) T COG1958 10 SFLKKLLNKRVLVKLKN---G-REYRGTLVGFDQYMNLVLD 46 (79) T ss_pred HHHHHHCCCEEEEEECC---C-CEEEEEEEEECCCCCEEEE T ss_conf 99998629889999879---9-5999999998475418991 No 16 >cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
Probab=68.38 E-value=6.9 Score=19.78 Aligned_cols=34 Identities=15% Similarity=0.079 Sum_probs=25.7 Q ss_pred HHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEE Q ss_conf 998710568898722668807899998512697-59999 Q gi|254780789|r 109 FLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLE 146 (192) Q Consensus 109 f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~ 146 (192) +...+|+.|-|+|+. | ..|+|+|.++|.- .+.|. T Consensus 5 L~~~~gk~V~VkLk~---G-~ey~G~L~s~D~~MNl~L~ 39 (67) T cd01726 5 LKAIIGRPVVVKLNS---G-VDYRGILACLDGYMNIALE 39 (67) T ss_pred HHHCCCCEEEEEECC---C-CEEEEEEEEECCEEEEEEC T ss_conf 766059909999889---9-8989999988560846871 No 17 >KOG1783 consensus Probab=65.62 E-value=4.5 Score=20.94 Aligned_cols=31 Identities=10% Similarity=0.049 Sum_probs=24.0 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCC Q ss_conf 89999871056889872266880789999851269 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSE 140 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~ 140 (192) .+-+..++|+.|.||+....+ |+|+|...|+ T Consensus 8 ~~fl~~iiGr~V~VKl~sgvd----yrG~l~~lDg 38 (77) T KOG1783 8 GEFLKAIIGRTVVVKLNSGVD----YRGTLVCLDG 38 (77) T ss_pred HHHHHHHHCCEEEEEECCCCC----CCCEEHHHHH T ss_conf 899999719758999547755----2320335544 No 18 >pfam01423 LSM LSM domain. The LSM domain contains Sm proteins as well as other related LSM (Like Sm) proteins. The U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing contain seven Sm proteins (B/B', D1, D2, D3, E, F and G) in common, which assemble around the Sm site present in four of the major spliceosomal small nuclear RNAs. The U6 snRNP binds to the LSM (Like Sm) proteins. Sm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Sm proteins. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. This family also includes the bacterial Hfq (host factor Q) proteins. 
Hfq are also RNA-binding proteins, that form hexameric rings. Probab=63.82 E-value=11 Score=18.55 Aligned_cols=35 Identities=20% Similarity=0.257 Sum_probs=27.1 Q ss_pred HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCE-EEEE Q ss_conf 99987105688987226688078999985126975-9999 Q gi|254780789|r 108 DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETG-FFLE 146 (192) Q Consensus 108 ~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~-i~l~ 146 (192) ..+.++|+.|.|+++. .+.+.|+|.++|+.- +.|. T Consensus 2 ~L~~~~~~~V~V~l~~----g~~~~G~L~~~D~~mNlvL~ 37 (66) T pfam01423 2 FLQKLLGKRVTVELKN----GRELRGTLKGFDQFMNLVLD 37 (66) T ss_pred HHHHCCCCEEEEEECC----CCEEEEEEEEECCCCCEEEE T ss_conf 5797499879999989----92999999998899950991 No 19 >smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing Probab=63.39 E-value=11 Score=18.39 Aligned_cols=35 Identities=20% Similarity=0.256 Sum_probs=27.0 Q ss_pred HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEE Q ss_conf 9998710568898722668807899998512697-59999 Q gi|254780789|r 108 DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLE 146 (192) Q Consensus 108 ~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~ 146 (192) ....++|+.|.|+++. | +.+.|+|.++|.. .+.|. T Consensus 2 ~L~~~~~~~V~V~l~~---g-~~~~G~L~~~D~~mNlvL~ 37 (67) T smart00651 2 FLKKLIGKRVLVELKN---G-REYRGTLKGFDQFMNLVLE 37 (67) T ss_pred HHHHCCCCEEEEEECC---C-CEEEEEEEEECCCCCEEEC T ss_conf 4797199879999989---9-6999999998899972987 No 20 >COG1481 Uncharacterized protein conserved in bacteria [Function unknown] Probab=61.22 E-value=8.1 Score=19.33 Aligned_cols=111 Identities=14% Similarity=0.102 Sum_probs=61.8 Q ss_pred CCHHHHHHHHHHHHHHHCCCEEEEEEEECC----CCCEEEEEEECC-------------CC------CCCHHHHHHHHHH Q ss_conf 127999999999999747977999999549----986899999658-------------77------8789999999999 Q gi|254780789|r 19 MGLAGDISSVIQPVIEEMSFRSVQISLLEE----KNLLLQIFVERD-------------DG------NMTLRDCEELSQA 75 (192) Q Consensus 19 ~~i~~~i~~li~p~v~~lG~eLv~v~~~~~----~~~~LrI~ID~~-------------dg------~i~iddC~~vSr~ 75 (192) ..++.++..++...- ... 
++|.+... ++.+-.+++... .+ .+-=|+|..-+=. T Consensus 50 ~~iarri~~~l~~~~---~i~-~ei~~~~~~~lkkn~vY~v~~~~~~~~il~~l~l~d~~~~~~i~~~~v~~~~~~~~yl 125 (308) T COG1481 50 AAIARRLYNLLKKLY---NIK-VEIKVEKKSNLKKNNVYTVRLYEGAEELLEQLKLLDSFFGPVIPEQVVSDDEDFRAYL 125 (308) T ss_pred HHHHHHHHHHHHHHH---CCC-EEEEEEECCCCCCCCEEEEEEECCHHHHHHHHCCCCCCCCCCCCHHHHCCHHHHHHHH T ss_conf 899999999999873---676-0478730135654544899870577999998525455666556466414179999999 Q ss_pred HHHHHC---CCCCCCCCCEEEEEECCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECC Q ss_conf 875202---1135676507999727887744578999987105688987226688078999985126 Q gi|254780789|r 76 ISPILD---VENIIEGHYRLEVSSPGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTS 139 (192) Q Consensus 76 i~~~LD---~~d~i~~~Y~LEVSSPGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~ 139 (192) .++-|- ..||....|.||+||++-+--+...+=+. .-|..+++... ++.+.-.|.+++ T Consensus 126 rGAFLagGSis~Pe~~~YhLEi~s~~ee~a~~L~~l~~-~f~l~ak~~er-----kn~~vvYlK~~E 186 (308) T COG1481 126 RGAFLAGGSISDPETSSYHLEISSNYEEHALALVKLLR-RFGLNAKIIER-----KNKYVVYLKSAE 186 (308) T ss_pred HHHHHCCCCCCCCCCCCEEEEEECCCHHHHHHHHHHHH-HHCCCCEEEEE-----CCCEEEEEECHH T ss_conf 99997288668977776358982486899999999999-83544100112-----373499980498 No 21 >cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers. 
Probab=61.20 E-value=11 Score=18.39 Aligned_cols=29 Identities=10% Similarity=0.193 Sum_probs=23.7 Q ss_pred HHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC Q ss_conf 998710568898722668807899998512697 Q gi|254780789|r 109 FLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET 141 (192) Q Consensus 109 f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~ 141 (192) ....+|+.|-|+|+. | ..|+|+|.++|.- T Consensus 6 L~~~~gk~V~V~LK~---G-~~y~G~L~s~D~~ 34 (68) T cd01722 6 LNDLTGKPVIVKLKW---G-MEYKGTLVSVDSY 34 (68) T ss_pred HHHCCCCEEEEEECC---C-CEEEEEEEEECCC T ss_conf 876079829999889---9-8999999997242 No 22 >TIGR00585 mutl DNA mismatch repair protein MutL; InterPro: IPR014763 This entry represents DNA mismatch repair proteins, such as MutL. The dimeric MutL protein has a key function in communicating mismatch recognition by MutS to downstream repair processes. Mismatch repair contributes to the overall fidelity of DNA replication by targeting mispaired bases that arise through replication errors during homologous recombination and as a result of DNA damage. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex .. Probab=60.46 E-value=14 Score=17.83 Aligned_cols=50 Identities=12% Similarity=0.284 Sum_probs=35.7 Q ss_pred HHHHHHHHHHHHHHCCCEEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHH Q ss_conf 99999999999974797799999954998689999965877878999999999 Q gi|254780789|r 22 AGDISSVIQPVIEEMSFRSVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQ 74 (192) Q Consensus 22 ~~~i~~li~p~v~~lG~eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr 74 (192) ..=|++|||-.+.+ |=.-++|++..++-..++| .|+-.| |+-+|+..+-. 
T Consensus 24 ~~vVKELvENSLDA-GAt~I~v~~~~gG~~~I~V-~DNG~G-i~~~d~~~~~~ 73 (367) T TIGR00585 24 ASVVKELVENSLDA-GATKIEVEIEEGGLKLIEV-SDNGSG-IDKEDLELACE 73 (367) T ss_pred HHHHHHHHHHHHCC-CCCEEEEEEEECCEEEEEE-EECCCC-CCHHHHHHHHC T ss_conf 99999988731214-8858999996265358999-977856-77777998612 No 23 >PRK06955 biotin--protein ligase; Provisional Probab=58.28 E-value=15 Score=17.60 Aligned_cols=50 Identities=20% Similarity=0.167 Sum_probs=33.5 Q ss_pred CCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCC Q ss_conf 788774457899998710568898722668807899998512697599996355 Q gi|254780789|r 97 PGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKR 150 (192) Q Consensus 97 PGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~ 150 (192) -|...-+..-+.+..++|+.|+|.. .|.+.+.|+..++++++-.+-.... T Consensus 232 ~g~~~~~~~w~~~~~~~G~~V~v~~----~g~~~~~G~a~GId~~G~L~v~t~~ 281 (300) T PRK06955 232 DGLAPFAARWHALHAYAGREVVLLE----DGVELARGVARGIDETGQLLLDTPA 281 (300) T ss_pred CCCHHHHHHHHHHCCCCCCEEEEEE----CCCEEEEEEEEEECCCCEEEEEECC T ss_conf 6819999999985032798699997----8980899999889999769999899 No 24 >cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
Probab=53.23 E-value=19 Score=17.08 Aligned_cols=37 Identities=14% Similarity=0.279 Sum_probs=27.0 Q ss_pred HHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 99998710568898722668807899998512697-599996 Q gi|254780789|r 107 SDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 107 ~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) +-..+.+|.+|-|.++ +.+.|.|+|.++|+- .++|+. T Consensus 6 elidk~igs~Iwi~mk----~drE~~GtL~GFDdyvNmVLeD 43 (76) T cd01732 6 ELIDKCIGSRIWIVMK----SDKEFVGTLLGFDDYVNMVLED 43 (76) T ss_pred HHHHHHCCCEEEEEEC----CCCEEEEEEECCCCEEEEEEEE T ss_conf 9997536987999998----9919999997100006889830 No 25 >cd01213 tensin Tensin Phosphotyrosine-binding (PTB) domain. Tensin is a focal adhesion protein, which contains a C-terminal SH2 domain followed by a PTB domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. Probab=46.37 E-value=17 Score=17.37 Aligned_cols=45 Identities=20% Similarity=0.195 Sum_probs=35.4 Q ss_pred HHHHHHHHHHHHCCC-CCCCCCCEEEEEECCC-----CCCCCCHHHHHHHH Q ss_conf 999999987520211-3567650799972788-----77445789999871 Q gi|254780789|r 69 CEELSQAISPILDVE-NIIEGHYRLEVSSPGI-----DRPMVRKSDFLRWN 113 (192) Q Consensus 69 C~~vSr~i~~~LD~~-d~i~~~Y~LEVSSPGi-----DRpL~~~~~f~r~~ 113 (192) -..|.++++..+... .|.+..-.+.||+-|| .|.++..|||--+.
T Consensus 24 ~~AV~kAv~~~~~~~~~P~~t~VhfKVS~QGITLTDn~Rk~FFRRHYp~~~ 74 (138) T cd01213 24 NEAIKKAIAQCSGQAPDPQATEVHFKVSSQGITLTDNTRKKFFRRHYKVDS 74 (138) T ss_pred HHHHHHHHHHHHCCCCCCCCEEEEEEECCCCEEEEECCHHHHHHHCCCCCE T ss_conf 789999999997089998766899997577557884330133331166330 No 26 >cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=46.04 E-value=24 Score=16.38 Aligned_cols=36 Identities=8% Similarity=0.193 Sum_probs=27.5 Q ss_pred HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCE-EEEEE Q ss_conf 99987105688987226688078999985126975-99996 Q gi|254780789|r 108 DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETG-FFLEK 147 (192) Q Consensus 108 ~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~-i~l~~ 147 (192) -...|+++.++|++.+ .+.|.|+|.++|... +.|.. 
T Consensus 4 kL~~~ln~~vrv~~~D----GR~~vG~l~~~D~~~NlVL~~ 40 (79) T cd01717 4 KMLQLINYRLRVTLQD----GRQFVGQFLAFDKHMNLVLSD 40 (79) T ss_pred HHHHHCCCEEEEEEEC----CCEEEEEEEEECCCCCEEEEC T ss_conf 5688659879999968----959999999974766389837 No 27 >COG4466 Veg Uncharacterized protein conserved in bacteria [Function unknown] Probab=44.65 E-value=25 Score=16.25 Aligned_cols=62 Identities=16% Similarity=0.167 Sum_probs=43.2 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHC Q ss_conf 89999871056889872266880789999851269759999635545677772699768876517 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSA 170 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kA 170 (192) .+..+.+.|+.|.++....-....+-.|.|.++=...+.++.+... .+...++..|++|-.- T Consensus 11 K~~i~ah~G~~v~lk~ngGRKk~~~r~G~L~EtYpSvFIiel~~d~---~~~~~vSYsYsDILTe 72 (80) T COG4466 11 KESIDAHLGERVTLKANGGRKKTIERSGILIETYPSVFIIELDQDE---GNFERVSYSYSDILTE 72 (80) T ss_pred HHHHHHCCCCEEEEEECCCCEEEEHHCEEEEEECCCEEEEEECCCC---CCCEEEEEEEHHHEEE T ss_conf 9998735583899993598100202140786643728999943667---9836998872140200 No 28 >cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
Probab=44.32 E-value=26 Score=16.22 Aligned_cols=35 Identities=14% Similarity=0.293 Sum_probs=26.4 Q ss_pred HHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCE-EEEEE Q ss_conf 9987105688987226688078999985126975-99996 Q gi|254780789|r 109 FLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETG-FFLEK 147 (192) Q Consensus 109 f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~-i~l~~ 147 (192) ...|+.++|.|.+.+ -+.|.|+|.++|... +.|.. T Consensus 4 L~~~ldk~V~Vi~~D----GR~~vG~L~gfDq~~NlvL~~ 39 (74) T cd01727 4 LEDYLNKTVSVITVD----GRVIVGTLKGFDQATNLILDD 39 (74) T ss_pred HHHHHCCEEEEEECC----CCEEEEEEEECCCCCEEEEEE T ss_conf 667639789999858----959999998426732498642 No 29 >KOG1780 consensus Probab=43.86 E-value=26 Score=16.17 Aligned_cols=30 Identities=10% Similarity=0.300 Sum_probs=25.8 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCC Q ss_conf 89999871056889872266880789999851269 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSE 140 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~ 140 (192) | +.++|..+++.+++ +|.+++.|.|.++|- T Consensus 7 P-eLkkymdKki~lkl----nG~r~v~GiLrGyD~ 36 (77) T KOG1780 7 P-ELKKYMDKKIVLKL----NGGRKVTGILRGYDP 36 (77) T ss_pred C-HHHHHHHHEEEEEE----CCCCEEEEEEECCCH T ss_conf 2-28886302689994----798388888742456 No 30 >COG1094 Predicted RNA-binding protein (contains KH domains) [General function prediction only] Probab=43.16 E-value=18 Score=17.19 Aligned_cols=89 Identities=20% Similarity=0.279 Sum_probs=56.0 Q ss_pred CCCHHHHHHHHHHHHHHHCCCC---CCCCCCEEEEEECCCCCCCC-CHHHHHHHHCCEEEEEEECCCCCEE-----EEEE Q ss_conf 8789999999999875202113---56765079997278877445-7899998710568898722668807-----8999 Q gi|254780789|r 63 NMTLRDCEELSQAISPILDVEN---IIEGHYRLEVSSPGIDRPMV-RKSDFLRWNGHVVACEIVLSSGDKQ-----KLIG 133 (192) Q Consensus 63 ~i~iddC~~vSr~i~~~LD~~d---~i~~~Y~LEVSSPGidRpL~-~~~~f~r~~G~~VkV~l~~~~~g~k-----~~~G 133 (192) ...+-....+=++|+.=++-++ ++.++|.|||=.= ..-.+ ...|+.|-.||.+= -+|+. 
.++| T Consensus 61 p~~~~ka~d~VkAIgrGF~pe~A~~LL~d~~~levIdi--~~~~~~~~~~l~R~kgRIIG------~~GkTr~~IE~lt~ 132 (194) T COG1094 61 PLALLKARDVVKAIGRGFPPEKALKLLEDDYYLEVIDL--KDVVTLSGDHLRRIKGRIIG------REGKTRRAIEELTG 132 (194) T ss_pred HHHHHHHHHHHHHHHCCCCHHHHHHHHCCCCEEEEEEH--HHHCCCCHHHHHHHHCEEEC------CCCHHHHHHHHHHC T ss_conf 58899899999998668998999998627857999997--88426863466676510248------88508999999868 Q ss_pred EEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHCCEEC Q ss_conf 98512697599996355456777726997688765172504 Q gi|254780789|r 134 KIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSARLIV 174 (192) Q Consensus 134 ~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kAkLv~ 174 (192) .-+++-++.|.+- =+|.++.-||=.+ T Consensus 133 ~~I~V~g~tVaii---------------G~~~~v~iAr~AV 158 (194) T COG1094 133 VYISVYGKTVAII---------------GGFEQVEIAREAV 158 (194) T ss_pred CEEEEECCEEEEE---------------CCHHHHHHHHHHH T ss_conf 8099827689994---------------6826669999999 No 31 >TIGR00302 TIGR00302 phosphoribosylformylglycinamidine synthase, purS protein; InterPro: IPR003850 Phosphoribosylformylglycinamidine(FGAM) synthetase, 6.3.5.3 from EC, catalyses the fourth step in the de novo purine biosynthetic pathway .5-phosphoribosylformylglycinamide (FGAR) + glutamine + ATP = FGAM + glutamate + ADP + Pi In eukaryotes and many bacterial systems (including Escherichia coli and Salmonella typhimurium), the FGAM synthetase is encoded by the large form of PurL (lgPurL), which contains an N-terminal ATPase domain and a C-terminal glutamine-binding domain. In archaeal and other bacterial systems, however, FGAM synthetase is encoded by separate genes, making it a multisubunit (rather than multidomain) enzyme. The protein is composed of the small form of PurL (smPurL), which is homologus to the ATPase domain of lgPurL, PurQ which is homologous to the glutamine-binding domain of of lgPurL, and PurS, whose function is not known. This entry represents the PurS subunit of the multisubunit FGAM synthetase. 
Recent studies showed that disruption of the purS gene in B. subtilis resulted in a purine auxotrophic phenotype, due to defective FGAM synthetase activity. Therefore, the PurS protein appears to be required for the function of the PurL and PurQ subunits of the FGAM synthetase, but the molecular mechanism for the functional role of PurS is currently not known. For additional information please see , . ; GO: 0016879 ligase activity forming carbon-nitrogen bonds. Probab=40.15 E-value=30 Score=15.82 Aligned_cols=66 Identities=15% Similarity=0.363 Sum_probs=48.0 Q ss_pred CCHHHHHHHHHHHHHHHCCC-EEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHC--CCCCCCCCCEEEEE Q ss_conf 12799999999999974797-7999999549986899999658778789999999999875202--11356765079997 Q gi|254780789|r 19 MGLAGDISSVIQPVIEEMSF-RSVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQAISPILD--VENIIEGHYRLEVS 95 (192) Q Consensus 19 ~~i~~~i~~li~p~v~~lG~-eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD--~~d~i~~~Y~LEVS 95 (192) +++-+.=-..++..+..||| ++-+|.... +++|.+|.++. +.+++.+.+.=+ -++|.-++|..+|. T Consensus 11 ~gVLdPeG~a~~~AL~~LGy~~V~~V~t~K----~i~~~~E~~~~-------e~~~~ev~eMC~kLLANpVI~dY~~~~~ 79 (80) T TIGR00302 11 KGVLDPEGEAVQRALRLLGYNEVKDVRTGK----VIELTIEAEDR-------EEVEREVEEMCEKLLANPVIEDYEIEVE 79 (80) T ss_pred CCCCCCCHHHHHHHHHHCCCCCCCEEEEEE----EEEEEECCCCH-------HHHHHHHHHHHHHHCCCCCEECCEEEEE T ss_conf 763681148899998633778730346888----88997278777-------8899999988776247985205457883 No 32 >KOG3482 consensus Probab=39.82 E-value=30 Score=15.78 Aligned_cols=33 Identities=18% Similarity=0.318 Sum_probs=25.6 Q ss_pred CHHHHH-HHHCCEEEEEEECCCCCEEEEEEEEEECCCC Q ss_conf 789999-8710568898722668807899998512697 Q gi|254780789|r 105 RKSDFL-RWNGHVVACEIVLSSGDKQKLIGKIMGTSET 141 (192) Q Consensus 105 ~~~~f~-r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~ 141 (192) .|+.|. .-.|+.|.|+|+.. 
..++|+|.++|+- T Consensus 8 NPKpFL~~l~gk~V~vkLKwg----~eYkG~LvsvD~Y 41 (79) T KOG3482 8 NPKPFLNGLTGKPVLVKLKWG----QEYKGTLVSVDNY 41 (79) T ss_pred CCHHHHHHCCCCEEEEEEECC----CEEEEEEEEECCH T ss_conf 826887214597489998627----6887899982443 No 33 >cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=39.16 E-value=31 Score=15.72 Aligned_cols=34 Identities=12% Similarity=0.178 Sum_probs=26.4 Q ss_pred HHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEE Q ss_conf 998710568898722668807899998512697-59999 Q gi|254780789|r 109 FLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLE 146 (192) Q Consensus 109 f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~ 146 (192) +....|+.|.|+|+.. ..++|+|.++|+. .+.+. T Consensus 6 L~~a~g~~V~VELKng----~~~~G~L~~~D~~MN~~L~ 40 (76) T cd01723 6 LKTAQNHPMLVELKNG----ETYNGHLVNCDNWMNIHLR 40 (76) T ss_pred HHHCCCCEEEEEECCC----CEEEEEEEEEECCCCCEEE T ss_conf 7558998999998899----7999999997343581998 No 34 >pfam02237 BPL_C Biotin protein ligase C terminal domain. The function of this structural domain is unknown. It is found to the C terminus of the biotin protein ligase catalytic domain pfam01317. 
Probab=38.79 E-value=31 Score=15.68 Aligned_cols=35 Identities=23% Similarity=0.285 Sum_probs=26.1 Q ss_pred HHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCC Q ss_conf 7105688987226688078999985126975999963554 Q gi|254780789|r 112 WNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRG 151 (192) Q Consensus 112 ~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~ 151 (192) ++|+.|++.+. ...+.|+..++++++..+.....+ T Consensus 1 ~lG~~V~~~~~-----~~~v~G~a~gId~~G~Lll~~~~g 35 (47) T pfam02237 1 HLGKEVKVTTG-----GGKVEGIAVGIDDDGRLLLETDDG 35 (47) T ss_pred CCCCEEEEEEC-----CCEEEEEEECCCCCCEEEEECCCC T ss_conf 98978999938-----977999996718898499984896 No 35 >cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=38.65 E-value=31 Score=15.67 Aligned_cols=31 Identities=16% Similarity=0.261 Sum_probs=25.4 Q ss_pred HHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC Q ss_conf 99998710568898722668807899998512697 Q gi|254780789|r 107 SDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET 141 (192) Q Consensus 107 ~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~ 141 (192) +=++...|+.|.|+|+.. ..+.|+|.++++. 
T Consensus 4 ~fL~~l~g~~VtVELKng----~~~~G~L~~vd~~ 34 (90) T cd01724 4 RFLMKLTNETVTIELKNG----TIVHGTITGVDPS 34 (90) T ss_pred HHHHHCCCCEEEEEECCC----CEEEEEEEEECCC T ss_conf 978766898799998799----7999999881378 No 36 >cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=38.25 E-value=32 Score=15.63 Aligned_cols=30 Identities=17% Similarity=0.186 Sum_probs=24.6 Q ss_pred HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC Q ss_conf 9998710568898722668807899998512697 Q gi|254780789|r 108 DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET 141 (192) Q Consensus 108 ~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~ 141 (192) =|+.-.|+.|-|+|++. -.+.|+|.++|.- T Consensus 5 f~k~L~g~~vtVELKN~----~~i~G~L~svD~~ 34 (81) T cd01725 5 FFKTLVGKEVTVELKND----LSIRGTLHSVDQY 34 (81) T ss_pred HHHHHCCCEEEEEECCC----CEEEEEEEECCCC T ss_conf 89882798799997699----4999999643722 No 37 >pfam06372 Gemin6 Gemin6 protein. This family consists of several mammalian Gemin6 proteins. The exact function of Gemin6 is unknown but it has been found to form part of the pfam06003 complex. The SMN complex plays a key role in the biogenesis of spliceosomal small nuclear ribonucleoproteins (snRNPs) and other ribonucleoprotein particles. 
Probab=37.76 E-value=32 Score=15.58 Aligned_cols=35 Identities=11% Similarity=0.155 Sum_probs=28.6 Q ss_pred CCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCE Q ss_conf 578999987105688987226688078999985126975 Q gi|254780789|r 104 VRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETG 142 (192) Q Consensus 104 ~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~ 142 (192) +.|.+|..|+.+.|+|.+.+ .+.+.|.+.-+|.-. T Consensus 7 ~~p~~~~~yv~K~Vkv~~~d----~~~~~GwV~TvDPVS 41 (169) T pfam06372 7 ISLEEFEDFTEKEVKIIACD----NKEIEGWLFCTDPVS 41 (169) T ss_pred CCHHHHHHHHCCEEEEEEEC----CCEEEEEEEEECCCC T ss_conf 69889998526479999953----988987999967875 No 38 >PRK11886 biotin--protein ligase; Provisional Probab=36.06 E-value=34 Score=15.42 Aligned_cols=39 Identities=26% Similarity=0.371 Sum_probs=26.1 Q ss_pred CHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEE-EEEEC Q ss_conf 789999871056889872266880789999851269759-99963 Q gi|254780789|r 105 RKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGF-FLEKE 148 (192) Q Consensus 105 ~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i-~l~~~ 148 (192) .-+.+..++|+.|+|.. +...+.|+..++++++- .+..+ T Consensus 263 ~w~~~~~~~Gk~V~v~~-----~~~~~~G~a~gId~~G~Liv~~~ 302 (319) T PRK11886 263 RWKKLDLFLGREVKLII-----GQKEIFGIARGIDEQGALLLETD 302 (319) T ss_pred HHHHHHCCCCCEEEEEE-----CCEEEEEEEEEECCCCEEEEEEC T ss_conf 99985071698799998-----99899999988999980999989 No 39 >cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing. 
Probab=35.41 E-value=35 Score=15.35 Aligned_cols=31 Identities=19% Similarity=0.243 Sum_probs=25.1 Q ss_pred HHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC Q ss_conf 99998710568898722668807899998512697 Q gi|254780789|r 107 SDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET 141 (192) Q Consensus 107 ~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~ 141 (192) .=++...|+.|.|.|+.. ..+.|+|.++|+. T Consensus 12 ~lL~~l~g~~VtVELkNg----~~~~G~L~~vD~~ 42 (78) T cd01733 12 ILLQGLQGKVVTVELRNE----TTVTGRIASVDAF 42 (78) T ss_pred HHHHHCCCCEEEEEECCC----CEEEEEEEEECCC T ss_conf 999873897899997699----8999999987446 No 40 >KOG1781 consensus Probab=35.34 E-value=27 Score=16.12 Aligned_cols=42 Identities=10% Similarity=0.206 Sum_probs=25.0 Q ss_pred EEECCCCCCCC-CHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECC Q ss_conf 97278877445-78999987105688987226688078999985126 Q gi|254780789|r 94 VSSPGIDRPMV-RKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTS 139 (192) Q Consensus 94 VSSPGidRpL~-~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~ 139 (192) +++-+.++|=+ ..-|..+|+...|+|++. |.+...|+|.+++ T Consensus 6 ~~~~~~e~~kkEsilDLsky~Dk~Irvkf~----GGr~~sGiLkGyD 48 (108) T KOG1781 6 SQRKKFEKPKKESILDLSKYLDKKIRVKFT----GGREASGILKGYD 48 (108) T ss_pred HCCCCCCCCCHHHHHHHHHHHCCCEEEEEE----CCCEEEEEHHHHH T ss_conf 202543466325786598852001589960----6726410022289 No 41 >cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. 
Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=35.13 E-value=36 Score=15.32 Aligned_cols=29 Identities=21% Similarity=0.429 Sum_probs=24.0 Q ss_pred HHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC Q ss_conf 998710568898722668807899998512697 Q gi|254780789|r 109 FLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET 141 (192) Q Consensus 109 f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~ 141 (192) .....|+.|.|+|+.. ..+.|+|.++++. T Consensus 5 L~~a~g~~VtVELKnG----~~y~G~L~~~d~~ 33 (70) T cd01721 5 LHEAEGHIVTVELKTG----EVYRGKLIEAEDN 33 (70) T ss_pred HHHCCCCEEEEEECCC----EEEEEEEEEEECC T ss_conf 7557998899998899----4999999887023 No 42 >KOG0095 consensus Probab=33.82 E-value=27 Score=16.10 Aligned_cols=54 Identities=28% Similarity=0.301 Sum_probs=39.0 Q ss_pred CCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCCCCCCHHH Q ss_conf 986899999658778789999999999875202113567650799972788774457899 Q gi|254780789|r 49 KNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDRPMVRKSD 108 (192) Q Consensus 49 ~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidRpL~~~~~ 108 (192) ++++|.|.+-++ ++++|-.+|+.+|.+.+-.. ...|.||-|---.+..=+...| T Consensus 110 n~kvlkilvgnk---~d~~drrevp~qigeefs~~---qdmyfletsakea~nve~lf~~ 163 (213) T KOG0095 110 NNKVLKILVGNK---IDLADRREVPQQIGEEFSEA---QDMYFLETSAKEADNVEKLFLD 163 (213) T ss_pred HCCEEEEEECCC---CCHHHHHHHHHHHHHHHHHH---HHHHHHHHCCCCHHHHHHHHHH T ss_conf 064278761466---56123333058887888775---5566532020001039999999 No 43 >TIGR00922 nusG transcription termination/antitermination factor NusG; InterPro: IPR001062 Bacterial transcription antitermination protein, nusG, is a component of the transcription complex and interacts with the termination factor Rho and RNA polymerase , . NusG is a bacterial transcriptional elongation factor involved in transcription termination and anti-termination .; GO: 0003711 transcription elongation regulator activity, 0006355 regulation of transcription DNA-dependent. 
Probab=33.17 E-value=38 Score=15.13 Aligned_cols=52 Identities=12% Similarity=0.019 Sum_probs=34.2 Q ss_pred HCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHC Q ss_conf 1056889872266880789999851269759999635545677772699768876517 Q gi|254780789|r 113 NGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSA 170 (192) Q Consensus 113 ~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kA 170 (192) .|..|+|. -.|. .+|.|++.+++.+.=.|.+.-. .-.....+++.|++|.|. T Consensus 142 ~Ge~Vrv~-dGPF---~~F~G~Veev~~Ek~kLkV~VS--IFGR~TPVEL~F~QVEK~ 193 (193) T TIGR00922 142 VGEQVRVN-DGPF---ANFTGTVEEVDYEKSKLKVSVS--IFGRETPVELEFTQVEKI 193 (193) T ss_pred CCCEEEEE-CCCC---CCCCEEEEEEEHHCCEEEEEEE--CCCCCCCEEECCCEEECC T ss_conf 79888980-3888---8851479888021376999997--168787146051112039 No 44 >cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
Probab=33.02 E-value=39 Score=15.11 Aligned_cols=30 Identities=10% Similarity=-0.001 Sum_probs=23.7 Q ss_pred HHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEE Q ss_conf 9871056889872266880789999851269759 Q gi|254780789|r 110 LRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGF 143 (192) Q Consensus 110 ~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i 143 (192) +.++++.|.|+++ |.+.+.|+|.++|..-- T Consensus 7 ~~sl~~~V~Vklr----ggRel~G~L~afD~h~N 36 (82) T cd01730 7 RLSLDERVYVKLR----GDRELRGRLHAYDQHLN 36 (82) T ss_pred HHHCCCEEEEEEC----CCCEEEEEEEEECCEEE T ss_conf 8728986999987----99799999997340226 No 45 >PRK01191 rpl24p 50S ribosomal protein L24P; Validated Probab=32.01 E-value=40 Score=15.01 Aligned_cols=70 Identities=14% Similarity=0.089 Sum_probs=44.2 Q ss_pred HHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEE--HHHHHHCCEECCHHHHHHHHHC Q ss_conf 7105688987226688078999985126975999963554567777269976--8876517250469999999960 Q gi|254780789|r 112 WNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAIS--FDSLLSARLIVTDELLRASLNN 185 (192) Q Consensus 112 ~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip--~~~I~kAkLv~~d~l~~~~~~~ 185 (192) -.|-.|+|- .....| -.|+...++-....+.+|+-...+....++.+| =+++-=.+|.++|.+-.+.|++ T Consensus 48 rkgD~V~V~-rG~~kG---~~GkV~~V~~k~~~V~VEgv~~~K~~G~~v~~pIhpSnvvItkL~l~Dk~R~~~Ler 119 (119) T PRK01191 48 RKGDTVKVM-RGDFKG---EEGKVVEVDLKRYRIYVEGVTIKKADGTEVPYPIHPSNVMITKLDLSDERRFKILER 119 (119) T ss_pred ECCCEEEEE-ECCCCC---CCCEEEEEECCCCEEEEEEEEEECCCCCEEEEEECCCCEEEEECCCCCHHHHHHHCC T ss_conf 469999995-527789---623189997368899994369984799878642256317999746688789987439 No 46 >pfam01545 Cation_efflux Cation efflux family. Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells. 
Probab=31.59 E-value=41 Score=14.96 Aligned_cols=74 Identities=18% Similarity=0.258 Sum_probs=47.3 Q ss_pred HHHHHHHHHHHHHHHCCCEEEEEEEEC-CCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEEC Q ss_conf 799999999999974797799999954-99868999996587787899999999998752021135676507999727 Q gi|254780789|r 21 LAGDISSVIQPVIEEMSFRSVQISLLE-EKNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSP 97 (192) Q Consensus 21 i~~~i~~li~p~v~~lG~eLv~v~~~~-~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSP 97 (192) ..+++.+.++. ...-..+.++.+.. ++...+.+.+.-++ .+++.+|..+++.+...+-..-|.-...+.++.++ T Consensus 195 ~~~~i~~~i~~--~~~v~~v~~~~~~~~G~~~~v~v~i~v~~-~~~~~~~~~i~~~i~~~l~~~~~~i~~~~i~~~~~ 269 (273) T pfam01545 195 LVDKIRKALEA--LPGVLGVHDLRVWKSGPTLLVEIHIEVDP-DLTVEEAHEIADEIEKALKEKFPGIVHVTIHVEPA 269 (273) T ss_pred HHHHHHHHHHC--CCCCEEEEEEEEEEECCCEEEEEEEEECC-CCCHHHHHHHHHHHHHHHHHHCCCCCEEEEEECCC T ss_conf 89999999963--89950343579999689599999999899-99899999999999999998689988699981599 No 47 >PRK13305 sgbH 3-keto-L-gulonate-6-phosphate decarboxylase; Provisional Probab=31.50 E-value=41 Score=14.95 Aligned_cols=19 Identities=11% Similarity=0.202 Sum_probs=10.2 Q ss_pred CCHHHHHHHHHHHHHHHCC Q ss_conf 7899999999998752021 Q gi|254780789|r 64 MTLRDCEELSQAISPILDV 82 (192) Q Consensus 64 i~iddC~~vSr~i~~~LD~ 82 (192) -||..|.+.++..+..+-+ T Consensus 93 ~TI~~~~~~a~~~g~~v~v 111 (220) T PRK13305 93 ATVEKGHAVAQSCGGEIQI 111 (220) T ss_pred HHHHHHHHHHHHCCCEEEE T ss_conf 9999999999980998999 No 48 >pfam10842 DUF2642 Protein of unknown function (DUF2642). This family of proteins with unknown function appear to be restricted to Bacillus spp. Probab=31.49 E-value=41 Score=14.95 Aligned_cols=37 Identities=14% Similarity=0.206 Sum_probs=29.9 Q ss_pred HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECC Q ss_conf 999871056889872266880789999851269759999635 Q gi|254780789|r 108 DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEK 149 (192) Q Consensus 108 ~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~ 149 (192) -+..-+|+.+-|.+.. ...+|+|.++..|.++++... 
T Consensus 15 tlqs~iG~~vvVeT~r-----gsvrG~L~dVkPDHivle~~~ 51 (66) T pfam10842 15 TLQSLIGRRVVVQTVR-----GSVRGRLRDVKPDHLVIEAGD 51 (66) T ss_pred HHHHHHCCEEEEEEEC-----CCEEEEEEEECCCEEEEECCC T ss_conf 9998748379999832-----516878961079889997189 No 49 >cd02978 KaiB_like KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA. KaiB is an essential protein in maintaining circadian rhythm. It was originally discovered from the cyanobacterium Synechococcus as part of the circadian clock gene cluster, kaiABC. KaiB attenuates KaiA-enhanced KaiC autokinase activity by interacting with KaiA-KaiC complexes in a circadian fashion. KaiB is membrane-associated as well as cytosolic. The amount of membrane-associated protein peaks in the evening (at circadian time (CT) 12-16) while the cytosolic form peaks later (at CT 20). The rhythmic localization of KaiB may function in regulating the formation of Kai complexes. SasA is a sensory histidine kinase which associates with KaiC. Although it is not an essential oscillator component, it is important in enhancing kaiABC expression and is important in metabolic growth control under day/night cycle conditions. SasA contains an N-terminal sensor Probab=31.28 E-value=41 Score=14.93 Aligned_cols=40 Identities=25% Similarity=0.452 Sum_probs=27.8 Q ss_pred CEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEEC Q ss_conf 68999996587787899999999998752021135676507999727 Q gi|254780789|r 51 LLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSP 97 (192) Q Consensus 51 ~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSP 97 (192) ..|+.|+++.. .-+....+++-+.+...| ++.|.|||=-. 
T Consensus 2 ~~L~LyVaG~t-p~S~~ai~nl~~i~e~~l------~~~y~LeVIDv 41 (72) T cd02978 2 YVLRLYVAGRT-PKSERALQNLKRILEELL------GGPYELEVIDV 41 (72) T ss_pred EEEEEEECCCC-HHHHHHHHHHHHHHHHHC------CCCEEEEEEEC T ss_conf 18999985999-789999999999999747------99668999883 No 50 >PRK08942 D,D-heptose 1,7-bisphosphate phosphatase; Validated Probab=30.93 E-value=24 Score=16.42 Aligned_cols=41 Identities=15% Similarity=0.187 Sum_probs=23.5 Q ss_pred HHHCCCEEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCC Q ss_conf 974797799999954998689999965877878999999999987520211 Q gi|254780789|r 33 IEEMSFRSVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQAISPILDVE 83 (192) Q Consensus 33 v~~lG~eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~ 83 (192) +...||.++-|.=++ + |- -|-++.+|...++..+...|-.. T Consensus 41 l~~~g~~~~ivTNQs--G------I~--rG~~t~~~~~~i~~~m~~~l~~~ 81 (181) T PRK08942 41 LKQAGYRVVVATNQS--G------IA--RGLFTEAQLNALHEKMDWSLADR 81 (181) T ss_pred HHHCCCEEEEEECCH--H------HC--CCCCCHHHHHHHHHHHHHHHHHC T ss_conf 998799699995871--3------42--58677999999999999999976 No 51 >pfam11589 DUF3244 Protein of unknown function (DUF3244). This family of proteins with unknown function appear to be restricted to Bacteroidetes. The protein may have an immunoglobulin-like beta-sandwich fold however this cannot be confirmed. Probab=30.58 E-value=15 Score=17.65 Aligned_cols=58 Identities=17% Similarity=0.302 Sum_probs=41.9 Q ss_pred EEEEEEEECC-CCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECC Q ss_conf 7999999549-98689999965877878999999999987520211356765079997278 Q gi|254780789|r 39 RSVQISLLEE-KNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPG 98 (192) Q Consensus 39 eLv~v~~~~~-~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPG 98 (192) .+..|+|... .+ +.|.|...+|.|=-++|......-...++.++.-++.|.||.+++. 
T Consensus 37 ~~L~I~F~~~l~~--vtI~I~d~~G~vVYe~~is~~~~~~~~isL~~~~~G~Y~l~it~~~ 95 (106) T pfam11589 37 NILSIEFTSPLDN--LTITITDEKGVVVYEDTISVASGDTITISIAGEAPGEYKLELTHGL 95 (106) T ss_pred CEEEEEEECCCCC--EEEEEECCCCCEEEEEEECCCCCCEEEEEECCCCCCEEEEEEECCC T ss_conf 9999998655898--6999997999899998732678868999836756850899997589 No 52 >pfam08496 Peptidase_S49_N Peptidase family S49 N-terminal. This domain is found to the N-terminus of bacterial signal peptidases of the S49 family (pfam01343). Probab=30.18 E-value=43 Score=14.82 Aligned_cols=46 Identities=28% Similarity=0.385 Sum_probs=38.6 Q ss_pred EEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCC Q ss_conf 99999658778789999999999875202113567650799972788 Q gi|254780789|r 53 LQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGI 99 (192) Q Consensus 53 LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGi 99 (192) =|+|+=+=+|.|.-..++.+-..|+++|-++.+ .+.-.|-.-|||= T Consensus 97 ~rvfVldF~GDi~As~V~~LREEItAIL~~A~~-~DEVllrLES~GG 142 (154) T pfam08496 97 PRLFVLDFKGDIDASEVESLREEITAILSVAKP-EDEVLLRLESGGG 142 (154) T ss_pred CEEEEEECCCCCCHHHHHHHHHHHHHHHHHCCC-CCEEEEEEECCCC T ss_conf 718999535872667668899999999973899-9989999868997 No 53 >pfam04514 BTV_NS2 Bluetongue virus non-structural protein NS2. This family includes NS2 proteins from other members of the Orbivirus genus. NS2 is a non-specific single-stranded RNA-binding protein that forms large homomultimers and accumulates in viral inclusion bodies of infected cells. Three RNA binding regions have been identified in Bluetongue virus serotype 17 at residues 2-11, 153-166 and 274-286. NS2 multimers also possess nucleotidyl phosphatase activity. The precise function of NS2 is not known, but it may be involved in the transport and condensation of viral mRNAs. 
Probab=29.73 E-value=24 Score=16.38 Aligned_cols=35 Identities=29% Similarity=0.329 Sum_probs=23.7 Q ss_pred HHHHHHHHHH-----------HHC-CCCCCCCCCEEEEEECCCCCCC Q ss_conf 9999999875-----------202-1135676507999727887744 Q gi|254780789|r 69 CEELSQAISP-----------ILD-VENIIEGHYRLEVSSPGIDRPM 103 (192) Q Consensus 69 C~~vSr~i~~-----------~LD-~~d~i~~~Y~LEVSSPGidRpL 103 (192) |..+.++.+. .|. +..|.|..|.|||+-||.-|-. T Consensus 23 cGkIAk~~~~pYcqIKIGR~~a~~~v~~PePk~yVlei~~~gayriq 69 (363) T pfam04514 23 CGQIANAGSQPYCQIKIGRTFALKAVATPEPKGYVLEIQEVGSYRIQ 69 (363) T ss_pred HHHHHHHCCCCEEEEEECCEEEEEECCCCCCCEEEEEECCCCEEEEE T ss_conf 78887504785189995227872012799996489982587258876 No 54 >cd04479 RPA3 RPA3: A subfamily of OB folds similar to human RPA3 (also called RPA14). RPA3 is the smallest subunit of Replication protein A (RPA). RPA is a nuclear ssDNA binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). RPA3 is believed to have a structural role in assembly of the RPA heterotrimer. Probab=29.32 E-value=44 Score=14.72 Aligned_cols=46 Identities=9% Similarity=0.094 Sum_probs=28.1 Q ss_pred CCCCCHHHHHHHHCCEEEEEEE-CCCCCEEEEEEEEEECCCCEEEEEECC Q ss_conf 7445789999871056889872-266880789999851269759999635 Q gi|254780789|r 101 RPMVRKSDFLRWNGHVVACEIV-LSSGDKQKLIGKIMGTSETGFFLEKEK 149 (192) Q Consensus 101 RpL~~~~~f~r~~G~~VkV~l~-~~~~g~k~~~G~L~~v~~~~i~l~~~~ 149 (192) ||.....+...|+|+.|++--+ ...++. .-++.+.|+..+++.... T Consensus 2 ~pRVn~~~L~~f~gk~VrivGkV~~~~g~---~~~~~s~Dg~~v~v~l~~ 48 (101) T cd04479 2 TPRINGAMLSQFVGKTVRIVGKVEKVDGD---SLTLISSDGVNVTVELNR 48 (101) T ss_pred CCEECHHHHHHCCCCEEEEEEEEEEECCC---EEEEEECCCCEEEEEECC T ss_conf 83497899965389869999999986598---169992799989999889 No 55 >pfam03983 SHD1 SLA1 homology domain 1, SHD1. 
NPFXD peptides specifically interact with the SHD1 domain. NPFXD is a clathrin-facilitated endocytic targeting signal. NPFXD was originally discovered in the cytoplasmic domain of the furin-like protease Kex2p. Sla1 is thought to function as an endocytic adaptor. Probab=28.58 E-value=46 Score=14.64 Aligned_cols=48 Identities=17% Similarity=0.285 Sum_probs=32.8 Q ss_pred CCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHCCEECCHHHHH Q ss_conf 8807899998512697599996355456777726997688765172504699999 Q gi|254780789|r 126 GDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSARLIVTDELLR 180 (192) Q Consensus 126 ~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kAkLv~~d~l~~ 180 (192) .|.=++.+.++++.+..|.|.. .+...+.+|.+..+.+-|.+-+.+-+ T Consensus 19 tG~F~VEA~flg~~dgki~LhK-------~nGv~I~Vp~~klS~~Dl~yVe~~tg 66 (70) T pfam03983 19 SGTFKVEAEFLGLKDGKIHLHK-------ANGVKIAVPVEKMSVEDLEYVERVTG 66 (70) T ss_pred CCCEEEEEEEEEEECCEEEEEE-------CCCEEEEEEHHHCCHHHHHHHHHHHC T ss_conf 9973899999987489899993-------59919997848869879999998647 No 56 >COG0323 MutL DNA mismatch repair enzyme (predicted ATPase) [DNA replication, recombination, and repair] Probab=28.23 E-value=46 Score=14.61 Aligned_cols=46 Identities=7% Similarity=0.152 Sum_probs=28.9 Q ss_pred HHHHHHHHHHHHCCCEEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHH Q ss_conf 9999999999747977999999549986899999658778789999999 Q gi|254780789|r 24 DISSVIQPVIEEMSFRSVQISLLEEKNLLLQIFVERDDGNMTLRDCEEL 72 (192) Q Consensus 24 ~i~~li~p~v~~lG~eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~v 72 (192) -|++||+-.+.+ |-.-++|++.+++-..++| .|+-. 
||+-+|+.-+ T Consensus 27 VVKELVENSlDA-GAt~I~I~ve~gG~~~I~V-~DNG~-Gi~~~Dl~la 72 (638) T COG0323 27 VVKELVENSLDA-GATRIDIEVEGGGLKLIRV-RDNGS-GIDKEDLPLA 72 (638) T ss_pred HHHHHHHCCCCC-CCCEEEEEEECCCEEEEEE-EECCC-CCCHHHHHHH T ss_conf 999998610304-9988999993598018999-88999-9998999999 No 57 >PRK08255 salicylyl-CoA 5-hydroxylase; Reviewed Probab=28.04 E-value=47 Score=14.58 Aligned_cols=86 Identities=16% Similarity=0.200 Sum_probs=55.5 Q ss_pred CCCCCCCCCCCCCHHHHHHHHHHH--HHHHCCCEEEEEEEECC------------------------------------- Q ss_conf 133376100141279999999999--99747977999999549------------------------------------- Q gi|254780789|r 8 HSKYEPRIFGDMGLAGDISSVIQP--VIEEMSFRSVQISLLEE------------------------------------- 48 (192) Q Consensus 8 ~~~~~~r~~~~~~i~~~i~~li~p--~v~~lG~eLv~v~~~~~------------------------------------- 48 (192) .+..-||-|....|.+-+...+.- -+...||+.+.|--..+ T Consensus 538 ~~~~~Pr~mt~~eI~~vv~~F~~AA~rA~~AGFD~IEiH~AHGYLl~qFLSPlsN~RtDeYGGsleNR~Rf~lEV~~aVR 617 (770) T PRK08255 538 PGSQVPREMTRADMDRVRDQFVAATRRAAEAGFDWLELHCAHGYLLSSFISPLTNQRTDEYGGSLENRLRYPLEVFRAVR 617 (770) T ss_pred CCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCCEEEEECCCCCHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHH T ss_conf 99988756899999999999999999999839998999523455588753864467754357888877788999999999 Q ss_pred ------CCCEEEEEEE-CCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCC Q ss_conf ------9868999996-5877878999999999987520211356765079997278877 Q gi|254780789|r 49 ------KNLLLQIFVE-RDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDR 101 (192) Q Consensus 49 ------~~~~LrI~ID-~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidR 101 (192) ++..+||-.. -.+|+.+++|+..+.+.+.+. -+ =.+.|||-|+.. 
T Consensus 618 ~~~p~~~Pl~vRiSatDw~~gG~t~edsv~la~~l~~~-Gv-------D~IdvSsGg~~~ 669 (770) T PRK08255 618 AVWPADKPMSVRISAHDWVEGGNTPDDAVEIARAFKAA-GA-------DMIDVSSGQVSK 669 (770) T ss_pred HHCCCCCCEEEEEECCCCCCCCCCHHHHHHHHHHHHHC-CC-------CEEEECCCCCCC T ss_conf 86789886699985102568999999999999999974-99-------899957888886 No 58 >PHA00026 cp coat protein Probab=27.64 E-value=23 Score=16.53 Aligned_cols=23 Identities=30% Similarity=0.481 Sum_probs=16.1 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCC Q ss_conf 99999999987520211356765 Q gi|254780789|r 67 RDCEELSQAISPILDVENIIEGH 89 (192) Q Consensus 67 ddC~~vSr~i~~~LD~~d~i~~~ 89 (192) |||+-+|+++...|..-+||.++ T Consensus 99 ddc~li~kal~glfk~gnpia~a 121 (129) T PHA00026 99 DDCELISKALAGLFKDGNPIAEA 121 (129) T ss_pred CCHHHHHHHHHHHHCCCCCHHHH T ss_conf 84688999977775169855887 No 59 >pfam02700 PurS Phosphoribosylformylglycinamidine (FGAM) synthase. This family forms a component of the de novo purine biosynthesis pathway. Probab=27.58 E-value=48 Score=14.53 Aligned_cols=69 Identities=14% Similarity=0.341 Sum_probs=44.9 Q ss_pred CCCHHHHHHHHHHHHHHHCCCE-EEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEE Q ss_conf 4127999999999999747977-99999954998689999965877878999999999987520211356765079997 Q gi|254780789|r 18 DMGLAGDISSVIQPVIEEMSFR-SVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVS 95 (192) Q Consensus 18 ~~~i~~~i~~li~p~v~~lG~e-Lv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVS 95 (192) ++++-+.=-..+...+..+||. +-+|.+ ++.+.+.+|.. 
+-+++.+.-..+...| -.+|+-+.|..+|+ T Consensus 10 K~gVlDPqG~aI~~aL~~lG~~~v~~vr~----GK~iel~i~~~----~~e~a~~~v~~~c~~l-LaNpVIE~y~i~i~ 79 (80) T pfam02700 10 KPGVLDPQGEAIKKALHRLGYEGVEDVRI----GKYIELTLEAE----DEEEAEEQVEEMCDKL-LANPVIEDYRIELE 79 (80) T ss_pred CCCCCCCHHHHHHHHHHHCCCCCCCEEEE----EEEEEEEECCC----CHHHHHHHHHHHHHHH-CCCCCEEEEEEEEE T ss_conf 99873817999999998648644222774----12999998689----9899999999999986-38875466999997 No 60 >cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=27.31 E-value=48 Score=14.50 Aligned_cols=36 Identities=14% Similarity=0.230 Sum_probs=27.5 Q ss_pred HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCE-EEEEE Q ss_conf 99987105688987226688078999985126975-99996 Q gi|254780789|r 108 DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETG-FFLEK 147 (192) Q Consensus 108 ~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~-i~l~~ 147 (192) -.+.++|+.++|++.+ | +.|.|++...|.+. +.|.. 
T Consensus 4 ~l~~ll~~~lrV~l~D---G-R~~vG~f~c~Dk~~NiIL~~ 40 (75) T cd06168 4 KLRSLLGRTMRIHMTD---G-RTLVGVFLCTDRDCNIILGS 40 (75) T ss_pred HHHHHCCCEEEEEEEC---C-CEEEEEEEEECCCCCEEEEC T ss_conf 9898629879999967---9-99999999973767599808 No 61 >PRK09301 circadian clock protein KaiB; Provisional Probab=26.74 E-value=49 Score=14.44 Aligned_cols=76 Identities=14% Similarity=0.299 Sum_probs=40.8 Q ss_pred CCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCCCCCCHHHHHHHHCCEEEEE--EE-CC Q ss_conf 9986899999658778789999999999875202113567650799972788774457899998710568898--72-26 Q gi|254780789|r 48 EKNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDRPMVRKSDFLRWNGHVVACE--IV-LS 124 (192) Q Consensus 48 ~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidRpL~~~~~f~r~~G~~VkV~--l~-~~ 124 (192) .+..+||.|+-+.. .-+....+++-+..... +++.|.|||=- .+..| +.+.+..|-.. |- .. T Consensus 4 ~k~yvLrLYVaG~t-p~S~~Ai~nl~~ice~~------L~g~Y~LeVID-----v~~~P---e~Ae~~~IlAtPTLvk~~ 68 (103) T PRK09301 4 RKTYILKLYVAGNT-PNSMRALKTLKNILETE------FKGVYALKVID-----VLKNP---QLAEEDKILATPTLAKIL 68 (103) T ss_pred CCCEEEEEEECCCC-HHHHHHHHHHHHHHHHH------CCCCEEEEEEE-----CCCCH---HHHHHCCEEEECHHHHHC T ss_conf 86189999973899-78999999999999986------59963699998-----12698---577268867843100206 Q ss_pred CCCEEEEEEEEEEC Q ss_conf 68807899998512 Q gi|254780789|r 125 SGDKQKLIGKIMGT 138 (192) Q Consensus 125 ~~g~k~~~G~L~~v 138 (192) -.-.+++.|.|... T Consensus 69 P~P~RriIGDLSd~ 82 (103) T PRK09301 69 PPPVRRIIGDLSDR 82 (103) T ss_pred CCCCEEEECCCCCH T ss_conf 98620575057747 No 62 >pfam00479 G6PD_N Glucose-6-phosphate dehydrogenase, NAD binding domain. 
Probab=26.09 E-value=51 Score=14.37 Aligned_cols=33 Identities=21% Similarity=0.474 Sum_probs=25.4 Q ss_pred EEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCC Q ss_conf 9999965877878999999999987520211356 Q gi|254780789|r 53 LQIFVERDDGNMTLRDCEELSQAISPILDVENII 86 (192) Q Consensus 53 LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i 86 (192) -||.+++|=| -+++-+..++..|...++++.++ T Consensus 138 ~RiVvEKPfG-~Dl~Sa~~ln~~l~~~f~E~qIy 170 (183) T pfam00479 138 TRVVIEKPFG-HDLESARELNDQLASVFDEDQIY 170 (183) T ss_pred EEEEEECCCC-CCHHHHHHHHHHHHHHCCHHHEE T ss_conf 4799857888-97788999999999447997841 No 63 >pfam08484 Methyltransf_14 C-methyltransferase. This domain is found in bacterial C-methyltransferase proteins, often together with other methyltransferase domains such as pfam08241 or pfam08242. Probab=26.00 E-value=51 Score=14.36 Aligned_cols=26 Identities=12% Similarity=0.272 Sum_probs=21.6 Q ss_pred CCEEEEEEEECCCCCEEEEEEECCCC Q ss_conf 97799999954998689999965877 Q gi|254780789|r 37 SFRSVQISLLEEKNLLLQIFVERDDG 62 (192) Q Consensus 37 G~eLv~v~~~~~~~~~LrI~ID~~dg 62 (192) |++++|++...-.+.-+|+++-+.+. T Consensus 1 GL~I~dv~~~~~~GGSir~~i~k~~~ 26 (169) T pfam08484 1 GLRVIDVERLPTHGGSLRVTLAHEGS 26 (169) T ss_pred CCEEEEEEECCCCCCEEEEEEEECCC T ss_conf 98999978758887679999996799 No 64 >KOG3448 consensus Probab=25.67 E-value=52 Score=14.32 Aligned_cols=31 Identities=19% Similarity=0.210 Sum_probs=25.1 Q ss_pred HHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCC Q ss_conf 89999871056889872266880789999851269 Q gi|254780789|r 106 KSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSE 140 (192) Q Consensus 106 ~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~ 140 (192) ..-|+.-+|..|.|.|++. -.+.|+|.+++. 
T Consensus 4 ysfFkslvg~~V~VeLKnd----~~i~GtL~svDq 34 (96) T KOG3448 4 YSFFKSLVGKEVVVELKND----LSICGTLHSVDQ 34 (96) T ss_pred HHHHHHHCCCEEEEEECCC----CEEEEEECCCCH T ss_conf 9999975587489998188----389877535452 No 65 >COG0172 SerS Seryl-tRNA synthetase [Translation, ribosomal structure and biogenesis] Probab=25.53 E-value=52 Score=14.30 Aligned_cols=50 Identities=20% Similarity=0.272 Sum_probs=39.2 Q ss_pred CCCEEEEEECC--CCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEE Q ss_conf 65079997278--877445789999871056889872266880789999851 Q gi|254780789|r 88 GHYRLEVSSPG--IDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMG 137 (192) Q Consensus 88 ~~Y~LEVSSPG--idRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~ 137 (192) -.|-|||-=|| -.|.+.+-..+..|.+|...++.+...+|+..|.-+|-+ T Consensus 335 kkYDlEvWlP~q~~yrEisScSnc~DfQaRR~~~Ryr~~~~~k~~~vhTLNG 386 (429) T COG0172 335 KKYDLEVWLPGQNKYREISSCSNCTDFQARRLNIRYRDKEEGKREFVHTLNG 386 (429) T ss_pred CCEEEEEEECCCCCCEEEEEEECCCCHHHHHHHCCCCCCCCCCCEEEEECCC T ss_conf 7323799853778740212443454288898754056365799669995464 No 66 >TIGR01088 aroQ 3-dehydroquinate dehydratase, type II; InterPro: IPR001874 3-dehydroquinate dehydratase (4.2.1.10 from EC), or dehydroquinase, catalyzes the conversion of 3-dehydroquinate into 3-dehydroshikimate. It is the third step in the shikimate pathway for the biosynthesis of aromatic amino acids from chorismate. Two classes of dehydroquinases exist, known as types I and II. Class-II enzymes are homododecameric enzymes of about 17 kDa. They are found in some bacteria such as actinomycetales , and some fungi where they act in a catabolic pathway that allows the use of quinic acid as a carbon source.; GO: 0003855 3-dehydroquinate dehydratase activity, 0009073 aromatic amino acid family biosynthetic process. 
Probab=25.12 E-value=53 Score=14.25 Aligned_cols=40 Identities=13% Similarity=0.279 Sum_probs=31.6 Q ss_pred CCCCCCCCCCCCHHHHHHHHHHHHHHHCCCEEEEEEEECCC Q ss_conf 33376100141279999999999997479779999995499 Q gi|254780789|r 9 SKYEPRIFGDMGLAGDISSVIQPVIEEMSFRSVQISLLEEK 49 (192) Q Consensus 9 ~~~~~r~~~~~~i~~~i~~li~p~v~~lG~eLv~v~~~~~~ 49 (192) +.+||..++.+.+++ |.+.++..++.+++++----|+++. T Consensus 14 G~REP~~YG~~tle~-i~~~~~~~a~~~~ld~e~~~fQSN~ 53 (144) T TIGR01088 14 GLREPGVYGSQTLEE-IEEILETFAAQLNLDVEVEFFQSNS 53 (144) T ss_pred CCCCCCCCCCCCHHH-HHHHHHHHHHHCCCEEEEEEECCCC T ss_conf 874653247868789-9999999998539827898730443 No 67 >cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=25.11 E-value=53 Score=14.25 Aligned_cols=39 Identities=15% Similarity=0.144 Sum_probs=29.0 Q ss_pred CHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCC-EEEEEE Q ss_conf 7899998710568898722668807899998512697-599996 Q gi|254780789|r 105 RKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSET-GFFLEK 147 (192) Q Consensus 105 ~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~-~i~l~~ 147 (192) .......+++++|.|.|.+ .+.+.|+|.++|.- ++.|.. 
T Consensus 3 ~~asL~~~ldkkv~V~l~d----gR~~~G~Lr~fDq~~NlvL~~ 42 (74) T cd01728 3 GTASLVDDLDKKVVVLLRD----GRKLIGILRSFDQFANLVLQD 42 (74) T ss_pred CHHHHHHHHCCEEEEEECC----CCEEEEEEEEECCCCEEEEEE T ss_conf 5568788629899999889----989999999874654199320 No 68 >KOG2059 consensus Probab=24.89 E-value=41 Score=14.92 Aligned_cols=76 Identities=11% Similarity=0.066 Sum_probs=39.8 Q ss_pred EEEEEECCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEEECCCCCCCCCCCEEEEEHHHHHHC Q ss_conf 79997278877445789999871056889872266880789999851269759999635545677772699768876517 Q gi|254780789|r 91 RLEVSSPGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLEKEKRGEKDMNELQIAISFDSLLSA 170 (192) Q Consensus 91 ~LEVSSPGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~~~~~~~k~~~~~~v~ip~~~I~kA 170 (192) .=|+||+|.-|++....--..--|..++......--+++.|+-.-...+++..+...... .....+||+++|+-+ T Consensus 546 ld~is~~~~~~~~spq~p~v~k~glm~kr~~gr~~~~~~~FKKryf~LTt~~Ls~~Ksp~-----~q~~~~Ipl~nI~aV 620 (800) T KOG2059 546 LDEISSVGDRSSLSPQEPVVLKEGLMIKRAQGRGRFGKKNFKKRYFRLTTEELSYAKSPG-----KQPIYTIPLSNIRAV 620 (800) T ss_pred HHCCCCCCCCCCCCCCCCCEECCCEEEECCCCCCCHHHHHHHHEEEEECCCEEEEECCCC-----CCCCCEEEHHHHHHH T ss_conf 311355564333577788242155047504555403455233307883152468734877-----672231348887888 Q ss_pred C Q ss_conf 2 Q gi|254780789|r 171 R 171 (192) Q Consensus 171 k 171 (192) . T Consensus 621 E 621 (800) T KOG2059 621 E 621 (800) T ss_pred H T ss_conf 8 No 69 >pfam11468 PTase_Orf2 Aromatic prenyltransferase Orf2. In vivo Orf2 attaches a geranyl group to a 1,3,6,8-tetrahydroxynaphthalene-derived polyketide during naphterpin biosynthesis. In vitro, Orf2 catalyses carbon-carbon based and carbon-oxygen based prenylation of hydroxyl-containing aromatic acceptors of synthetic, microbial and plant origin. 
Probab=24.79 E-value=54 Score=14.22 Aligned_cols=127 Identities=11% Similarity=0.168 Sum_probs=75.4 Q ss_pred CCCCCCCHHHHHHHHHHHHHHHCCCEEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHHHHHH------HHC-CCCCC Q ss_conf 10014127999999999999747977999999549986899999658778789999999999875------202-11356 Q gi|254780789|r 14 RIFGDMGLAGDISSVIQPVIEEMSFRSVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQAISP------ILD-VENII 86 (192) Q Consensus 14 r~~~~~~i~~~i~~li~p~v~~lG~eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~------~LD-~~d~i 86 (192) ++..-+++-.-|....+ ....+|++.|...=.--.++++-+|-....|.++.+....+.|.++- .|+ ...-+ T Consensus 124 ~i~~lp~mP~sl~~~~~-~F~r~GLd~V~~i~vDy~~~TvNlYF~~s~g~~~~~~v~am~r~~G~~~Ps~~~l~~~~~~f 202 (294) T pfam11468 124 DILSLPSMPPSLAAHAE-RFARLGLDKVRHIGVDYRSRTVNLYFQRSQGPLEQETVLAMHRLIGLPPPSEEMLAFCRRAF 202 (294) T ss_pred HHHCCCCCCHHHHHHHH-HHHHCCCCCEEEEEEECCCCCEEEEEECCCCCCCHHHHHHHHHHCCCCCCCHHHHHHHHCCE T ss_conf 88438789978999899-99973854135897513678478998418887588999999985289999989999753463 Q ss_pred CCCCEEEEEECCCCCC----CCC--------HH----HHHHHHCCEEEEEEECCCCCEEEEEEEEEECCCCEEEEE Q ss_conf 7650799972788774----457--------89----999871056889872266880789999851269759999 Q gi|254780789|r 87 EGHYRLEVSSPGIDRP----MVR--------KS----DFLRWNGHVVACEIVLSSGDKQKLIGKIMGTSETGFFLE 146 (192) Q Consensus 87 ~~~Y~LEVSSPGidRp----L~~--------~~----~f~r~~G~~VkV~l~~~~~g~k~~~G~L~~v~~~~i~l~ 146 (192) .-.++|-..|+.|+|. |+. |. +.++|.-. 
--....+++...|.--+-.++.+.++ T Consensus 203 ~~y~Tl~wdSg~IeRv~f~~l~~~~~~p~~~Pa~i~~~iekF~~~-----aP~~~~~e~~v~a~s~~~~geY~Kle 273 (294) T pfam11468 203 TVYTTLDWDSGDIERVCFAVLKRPGRAPGELPARLEPRIEKFLRA-----APSAYEGEKNVYAASFGPEGEYLKLE 273 (294) T ss_pred EEEEEEECCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHH-----CCCCCCCCCEEEEEEECCCCCEEEEE T ss_conf 899997427764238998613778787211661001799999874-----88778665257887654887237614 No 70 >TIGR00876 tal_mycobact transaldolase; InterPro: IPR004732 Transaldolase (2.2.1.2 from EC) catalyzes the reversible transfer of a three-carbon ketol unit from sedoheptulose 7-phosphate to glyceraldehyde 3-phosphate to form erythrose 4-phosphate and fructose 6-phosphate. This enzyme, together with transketolase, provides a link between the glycolytic and pentose-phosphate pathways. Transaldolase is an enzyme of about 34 Kd whose sequence has been well conserved throughout evolution. A lysine has been implicated in the catalytic mechanism of the enzyme; it acts as a nucleophilic group that attacks the carbonyl group of fructose-6-phosphate. Transaldolase is evolutionary related to a bacterial protein of about 20 Kd (known as talC in Escherichia coli), whose exact function is not yet known.; GO: 0004801 transaldolase activity, 0006098 pentose-phosphate shunt, 0005737 cytoplasm. Probab=24.61 E-value=54 Score=14.19 Aligned_cols=38 Identities=24% Similarity=0.461 Sum_probs=33.2 Q ss_pred CCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCC Q ss_conf 878999999999987520211356765079997278877 Q gi|254780789|r 63 NMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDR 101 (192) Q Consensus 63 ~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidR 101 (192) .++++|.-..|..+-+..+..|+..+.-.|||- |=++. T Consensus 69 T~~~~D~~~A~~~L~P~yE~sD~~~G~~S~E~D-P~L~~ 106 (350) T TIGR00876 69 TLALDDVLSASDVLVPLYEDSDGVDGRVSLEVD-PFLED 106 (350) T ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEEC-CCCCH T ss_conf 774888997875205532367899866888755-86201 No 71 >TIGR03361 VI_Rhs_Vgr type VI secretion system Vgr family protein. 
Members of this protein family belong to the Rhs element Vgr protein family (see TIGR01646), but furthermore all are found in genomes with type VI secretion loci. However, members of this protein family, although recognizably correlated to type VI secretion according to the partial phylogenetic profiling algorithm, are often found far from the type VI secretion locus. Probab=23.59 E-value=56 Score=14.07 Aligned_cols=40 Identities=20% Similarity=0.386 Sum_probs=25.4 Q ss_pred HHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCEEEEEEEEC Q ss_conf 01011333761001412799999999999974797799999954 Q gi|254780789|r 4 THFLHSKYEPRIFGDMGLAGDISSVIQPVIEEMSFRSVQISLLE 47 (192) Q Consensus 4 ~~~~~~~~~~r~~~~~~i~~~i~~li~p~v~~lG~eLv~v~~~~ 47 (192) ..++..+..-|+|+++.+.+ +++.++...|+.-+...+.+ T Consensus 91 l~lL~~~~~~RIFq~~sv~d----Iv~~vL~~~g~~~~~~~l~~ 130 (513) T TIGR03361 91 LWLLTLRRDSRIFQNKSVPE----IITEVLKEHGITDFRFRLSK 130 (513) T ss_pred HHHHHCCCCCEEEECCCHHH----HHHHHHHHCCCCCCEEECCC T ss_conf 77843876425741898899----99999986146420121257 No 72 >pfam09957 DUF2191 Uncharacterized protein conserved in bacteria (DUF2191). This domain, found in various hypothetical bacterial proteins, has no known function. Probab=23.24 E-value=49 Score=14.48 Aligned_cols=20 Identities=25% Similarity=0.318 Sum_probs=15.6 Q ss_pred HCCEECCHHHHHHHHHCCCC Q ss_conf 17250469999999960796 Q gi|254780789|r 169 SARLIVTDELLRASLNNYGS 188 (192) Q Consensus 169 kAkLv~~d~l~~~~~~~~~~ 188 (192) ..++.++|+|+.++++..|- T Consensus 2 rTnI~iDD~Ll~~A~~~~g~ 21 (47) T pfam09957 2 RTNIEIDDELLAEAQRLTGL 21 (47) T ss_pred CCCHHCCHHHHHHHHHHCCC T ss_conf 86120069999999998098 No 73 >TIGR02776 NHEJ_ligase_prk DNA ligase D; InterPro: IPR014143 Members of this entry are DNA ligases involved in the repair of DNA double-stranded breaks by non-homologous end joining (NheJ). The system of the bacterial Ku protein (IPR009187 from INTERPRO) plus this DNA ligase is seen in about 20% of bacterial genomes to date and at least one archaeon (Archaeoglobus fulgidus). 
This entry describes a central and C-terminal domain. These two domains may be permuted, as in genus Mycobacterium, or divided into tandem ORFs. An additional N-terminal 3'-phosphoesterase (PE) domain (IPR014144 from INTERPRO) is present in some members of this ligase. Most examples of genes for this ligase are adjacent to the gene for Ku. Probab=23.03 E-value=58 Score=14.00 Aligned_cols=80 Identities=13% Similarity=0.201 Sum_probs=55.5 Q ss_pred HHHHHHHHHHHCCCEEEEEEEECCCCCEEEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCCCCC Q ss_conf 99999999974797799999954998689999965877878999999999987520211356765079997278877445 Q gi|254780789|r 25 ISSVIQPVIEEMSFRSVQISLLEEKNLLLQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDRPMV 104 (192) Q Consensus 25 i~~li~p~v~~lG~eLv~v~~~~~~~~~LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidRpL~ 104 (192) =..++.-.+++||+.-. ++-+|||+ |+|+|==..+.++=|+.-.++++|...|-.. +|+.|+.|- T Consensus 503 AA~~~k~~Ld~LgL~~F-~KTSGGKG--lhv~vPL~~~~~~w~~~k~Fa~aia~~La~~--~Pe~FTt~~---------- 567 (645) T TIGR02776 503 AAQLMKQLLDELGLESF-VKTSGGKG--LHVVVPLRPNTATWDEVKLFAKAIAEYLARQ--FPERFTTEM---------- 567 (645) T ss_pred HHHHHHHHHHHCCCCCC-CCCCCCCE--EEEEEEECCCCCCHHHHHHHHHHHHHHHHHH--CCCHHHHHH---------- T ss_conf 99999998876166342-30168960--3899851688879899999999999999985--784212475---------- Q ss_pred CHHHHHHHHCCEEEEEEEC Q ss_conf 7899998710568898722 Q gi|254780789|r 105 RKSDFLRWNGHVVACEIVL 123 (192) Q Consensus 105 ~~~~f~r~~G~~VkV~l~~ 123 (192) -+++++.++-|=.-. T Consensus 568 ----~kk~R~griFiDYLr 582 (645) T TIGR02776 568 ----GKKNRVGRIFIDYLR 582 (645) T ss_pred ----HHHHCCCCEEEEEEE T ss_conf ----277169964786557 No 74 >cd01236 PH_outspread Outspread Pleckstrin homology (PH) domain. Outspread contains two PH domains and a C-terminal coiled-coil region. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. 
They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. Probab=23.01 E-value=21 Score=16.80 Aligned_cols=32 Identities=16% Similarity=0.480 Sum_probs=24.0 Q ss_pred CCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECC Q ss_conf 877878999999999987520211356765079997278 Q gi|254780789|r 60 DDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPG 98 (192) Q Consensus 60 ~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPG 98 (192) +.|.|++..|..|+.+ ++.-+.++.|.|.+|. T Consensus 49 Pqg~Idm~~c~~V~~a-------ee~Tg~~~s~~I~tpd 80 (104) T cd01236 49 PQGTIDMNQCTDVVDA-------EARTGQKFSICILTPD 80 (104) T ss_pred CCCEEEHHHCEEEECC-------CCCCCCCCEEEEECCC T ss_conf 4507853575687223-------0025875569997488 No 75 >cd00770 SerRS_core Seryl-tRNA synthetase (SerRS) class II core catalytic domain. SerRS is responsible for the attachment of serine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. SerRS synthetase is a homodimer. 
Probab=22.11 E-value=60 Score=13.89 Aligned_cols=41 Identities=17% Similarity=0.214 Sum_probs=22.1 Q ss_pred CCCCCCEEEEEECCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEEEEEEE Q ss_conf 567650799972788774457899998710568898722668807899998 Q gi|254780789|r 85 IIEGHYRLEVSSPGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQKLIGKI 135 (192) Q Consensus 85 ~i~~~Y~LEVSSPGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~~~G~L 135 (192) |-.+.| .||||- ..+..|.-|...++.+...++++.|.-+| T Consensus 222 P~~~~y-~EvsS~---------Snc~DfQarRl~iry~~~~~~~~~~~htl 262 (297) T cd00770 222 PGQGKY-REISSC---------SNCTDFQARRLNIRYRDKKDGKKQYVHTL 262 (297) T ss_pred HHHCCE-EEEEEC---------CCCHHHHHHHCCCEEECCCCCCEEEEEEE T ss_conf 541976-886131---------54000666523677734899952654881 No 76 >PRK11778 putative periplasmic protease; Provisional Probab=21.75 E-value=61 Score=13.85 Aligned_cols=67 Identities=22% Similarity=0.253 Sum_probs=47.4 Q ss_pred EEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCCC--CCCHHHHHHHHCCEEEEE Q ss_conf 99999658778789999999999875202113567650799972788774--457899998710568898 Q gi|254780789|r 53 LQIFVERDDGNMTLRDCEELSQAISPILDVENIIEGHYRLEVSSPGIDRP--MVRKSDFLRWNGHVVACE 120 (192) Q Consensus 53 LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidRp--L~~~~~f~r~~G~~VkV~ 120 (192) =|+|+-.=+|.|.-...+.+-..|+++|-++.+- +...|-+-|||--=. 
=.-..|..|.....+.++ T Consensus 69 ~r~fvldF~Gdi~As~v~~LReeitaiL~~a~~~-DeV~~rles~GG~v~~yglaasql~rlr~~~i~lt 137 (317) T PRK11778 69 PRVFVLDFKGDIDASEVESLREEITAILAVAKPG-DEVLLRLESPGGVVHGYGLAASQLQRLRDAGIPLT 137 (317) T ss_pred CEEEEEECCCCCCHHHHHHHHHHHHHHHHHCCCC-CEEEEEEECCCCEEEHHHHHHHHHHHHHHCCCCEE T ss_conf 6399995468603776777899999999748789-86999997899566605779999999986799289 No 77 >COG0250 NusG Transcription antiterminator [Transcription] Probab=21.14 E-value=63 Score=13.77 Aligned_cols=50 Identities=18% Similarity=0.164 Sum_probs=31.8 Q ss_pred HCCEEEEEEECCCCCEEEEEEEEEECCCC--EEEEEECCCCCCCCCCCEEEEEHHHHHHC Q ss_conf 10568898722668807899998512697--59999635545677772699768876517 Q gi|254780789|r 113 NGHVVACEIVLSSGDKQKLIGKIMGTSET--GFFLEKEKRGEKDMNELQIAISFDSLLSA 170 (192) Q Consensus 113 ~G~~VkV~l~~~~~g~k~~~G~L~~v~~~--~i~l~~~~~~~k~~~~~~v~ip~~~I~kA 170 (192) .|..|+|. ..|..| |.|++..++.+ .+++.+..-+ ....++++|++|.+- T Consensus 126 ~Gd~VrI~-~GpFa~---f~g~V~evd~ek~~~~v~v~ifg----r~tPVel~~~qVek~ 177 (178) T COG0250 126 PGDVVRII-DGPFAG---FKAKVEEVDEEKGKLKVEVSIFG----RPTPVELEFDQVEKL 177 (178) T ss_pred CCCEEEEE-CCCCCC---CCEEEEEECCCCCEEEEEEEEEC----CCEEEEEECCCEEEE T ss_conf 99889991-667899---51789998476768999999717----740799860108970 No 78 >TIGR00577 fpg formamidopyrimidine-DNA glycosylase; InterPro: IPR000191 Formamidopyrimidine-DNA glycosylase (3.2.2.23 from EC) (Fapy-DNA glycosylase) (gene fpg) is a bacterial enzyme involved in DNA repair and which excise oxidized purine bases to release 2,6-diamino-4-hydroxy-5N-methylformamido- pyrimidine (Fapy) and 7,8-dihydro-8-oxoguanine (8-OxoG) residues. In addition to its glycosylase activity, FPG can also nick DNA at apurinic/apyrimidinic sites (AP sites). FPG is a monomeric protein of about 32 Kd which binds and require zinc for its activity. The C-terminal region is the zinc binding site in the enzyme where four conserved and essential cysteines are located .. 
Probab=20.28 E-value=20 Score=16.92 Aligned_cols=74 Identities=16% Similarity=0.302 Sum_probs=49.6 Q ss_pred CCCCCCC--CCCHHHHHHHHHHHHHHHCCCEEE-EEEEE--------CCCCCEEE--------------EEEE------- Q ss_conf 3761001--412799999999999974797799-99995--------49986899--------------9996------- Q gi|254780789|r 11 YEPRIFG--DMGLAGDISSVIQPVIEEMSFRSV-QISLL--------EEKNLLLQ--------------IFVE------- 58 (192) Q Consensus 11 ~~~r~~~--~~~i~~~i~~li~p~v~~lG~eLv-~v~~~--------~~~~~~Lr--------------I~ID------- 58 (192) ++||.|+ ..-+.++...-..+.+..||.|-. +=+|. ..+++.|+ ||+| T Consensus 117 ~D~R~FG~~~~~l~~~~~~~~~~~l~~LGpEPly~~~F~~~~l~~~l~~~~r~~K~~LLDQ~~V~G~GNIYADE~LF~A~ 196 (292) T TIGR00577 117 HDPRKFGKVTWLLLDRGEVEASLLLAKLGPEPLYSEDFTAEYLFEKLAKSKRKIKTALLDQRLVAGLGNIYADEVLFRAG 196 (292) T ss_pred ECCCEEEEEEEEECCCCCHHHHHHHHHHCCCCCCCHHCCHHHHHHHHHHCCCHHHHHHHCCCEEEEEHHHHHHHHHHHHC T ss_conf 55763556898716775301201366728888865211738999998740403456865487576510106668998736 Q ss_pred -CCC---CCCCHHHHHHHHHHHHHHHCCCC Q ss_conf -587---78789999999999875202113 Q gi|254780789|r 59 -RDD---GNMTLRDCEELSQAISPILDVEN 84 (192) Q Consensus 59 -~~d---g~i~iddC~~vSr~i~~~LD~~d 84 (192) +|+ .+++..+|+.+.+.|.+.|..+= T Consensus 197 ihP~~~A~~L~~~~~~~L~~~i~~vL~~Ai 226 (292) T TIGR00577 197 IHPERLANQLSKEECELLHKAIKEVLRKAI 226 (292) T ss_pred CCCCHHHHCCCHHHHHHHHHHHHHHHHHHH T ss_conf 881010001588899999999999999998 No 79 >COG4004 Uncharacterized protein conserved in archaea [Function unknown] Probab=20.24 E-value=66 Score=13.65 Aligned_cols=31 Identities=32% Similarity=0.442 Sum_probs=17.2 Q ss_pred HHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCC Q ss_conf 999999987520211356765079997278877 Q gi|254780789|r 69 CEELSQAISPILDVENIIEGHYRLEVSSPGIDR 101 (192) Q Consensus 69 C~~vSr~i~~~LD~~d~i~~~Y~LEVSSPGidR 101 (192) -..+-+-+++.. 
..--..++ +++-||||+.| T Consensus 14 ~dri~~~l~e~g-~~v~~eGD-~ivas~pgis~ 44 (96) T COG4004 14 PDRIMRGLSELG-WTVSEEGD-RIVASSPGISR 44 (96) T ss_pred HHHHHHHHHHHC-EEEEECCC-EEEEECCCCEE T ss_conf 899999999848-06764564-89983487207 No 80 >PRK12853 glucose-6-phosphate 1-dehydrogenase; Provisional Probab=20.20 E-value=66 Score=13.64 Aligned_cols=33 Identities=18% Similarity=0.411 Sum_probs=27.3 Q ss_pred EEEEEECCCCCCCHHHHHHHHHHHHHHHCCCCCC Q ss_conf 9999965877878999999999987520211356 Q gi|254780789|r 53 LQIFVERDDGNMTLRDCEELSQAISPILDVENII 86 (192) Q Consensus 53 LrI~ID~~dg~i~iddC~~vSr~i~~~LD~~d~i 86 (192) -||.+++|-| -+++.+.++++.|...++++.++ T Consensus 139 ~RvvvEKPFG-~Dl~SA~~Ln~~l~~~f~E~qIy 171 (486) T PRK12853 139 ARVVLEKPFG-HDLASARALNATLAAVFDEDQIY 171 (486) T ss_pred CEEEEECCCC-CCHHHHHHHHHHHHHHCCHHHEE T ss_conf 3278854777-87688999999999866854567 No 81 >KOG0129 consensus Probab=20.08 E-value=29 Score=15.88 Aligned_cols=89 Identities=15% Similarity=0.171 Sum_probs=43.9 Q ss_pred CCCCCEEEEEECCCCCCCCCHHHHHHHHCCEEEEEEECCCCCEEE-EEEEEEE----CC-----C---C---EEEEEECC Q ss_conf 676507999727887744578999987105688987226688078-9999851----26-----9---7---59999635 Q gi|254780789|r 86 IEGHYRLEVSSPGIDRPMVRKSDFLRWNGHVVACEIVLSSGDKQK-LIGKIMG----TS-----E---T---GFFLEKEK 149 (192) Q Consensus 86 i~~~Y~LEVSSPGidRpL~~~~~f~r~~G~~VkV~l~~~~~g~k~-~~G~L~~----v~-----~---~---~i~l~~~~ 149 (192) -.+.|++-||||++.-.-...+.|.-+-...|.- ...+++-+|. |.|-|-. ++ + . .+.|..+. 
T Consensus 329 ~~~~~yf~vss~~~k~k~VQIrPW~laDs~fv~d-~sq~lDprrTVFVGgvprpl~A~eLA~imd~lyGgV~yaGIDtD~ 407 (520) T KOG0129 329 GEGNYYFKVSSPTIKDKEVQIRPWVLADSDFVLD-HNQPIDPRRTVFVGGLPRPLTAEELAMIMEDLFGGVLYVGIDTDP 407 (520) T ss_pred CCCCEEEEEECCCCCCCCEEEEEEEECCCHHHHC-CCCCCCCCCEEEECCCCCCCHHHHHHHHHHHHCCCEEEEEECCCC T ss_conf 3564589983576677621577657415304325-787678750388677787425999999998743846898744673 Q ss_pred CCCCCCCCCEEEEE-----HHHHHHCCEECC Q ss_conf 54567777269976-----887651725046 Q gi|254780789|r 150 RGEKDMNELQIAIS-----FDSLLSARLIVT 175 (192) Q Consensus 150 ~~~k~~~~~~v~ip-----~~~I~kAkLv~~ 175 (192) +-+=+.+-..|++. +.+|..+=+.+. T Consensus 408 k~KYPkGaGRVtFsnqqsYi~AIsarFvql~ 438 (520) T KOG0129 408 KLKYPKGAGRVTFSNQQAYIKAISARFVQLD 438 (520) T ss_pred CCCCCCCCCEEEECCCHHHHHHHHHHEEEEE T ss_conf 4588777613566041889999753138875 Done!