Query         gi|254780955|ref|YP_003065368.1| hypothetical protein CLIBASIA_04270 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 204
No_of_seqs    127 out of 261
Neff          4.1
Searched_HMMs 39220
Date          Mon May 30 03:14:30 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780955.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 pfam07031 consensus            100.0       0       0  444.3  17.7  153    30-201     1-153 (153)
  2 COG3814 Uncharacterized protei 100.0       0       0  414.1  15.2  157    25-204     1-157 (157)
  3 pfam04386 SspB Stringent starv 100.0 1.6E-42       0  297.1  16.0  120    30-150     1-120 (152)
  4 PRK11798 ClpXP protease specif  98.0 0.00018 4.7E-09   49.9  11.6   94    43-145    11-107 (140)
  5 COG2969 SspB Stringent starvat  95.9    0.16   4E-06   30.8   9.8   95    44-147    13-110 (155)
  6 PRK08296 hypothetical protein;  62.2      11 0.00028   18.8   3.6   40     32-71   343-382 (602)
  7 pfam03315 SDH_beta Serine dehy  57.2      14 0.00037   18.1   3.5   66    44-110    72-137 (148)
  8 pfam10274 ParcG Parkin co-regu  48.3      13 0.00033   18.4   2.1   32     23-56   136-167 (183)
  9 pfam06794 UPF0270 Uncharacteri  46.4      16 0.00041   17.7   2.3   31     29-62      2-32 (70)
 10 TIGR01349 PDHac_trf_mito pyruv  44.6      11 0.00027   18.9   1.2   41     26-71   342-382 (584)
 11 pfam01122 Cobalamin_bind Eukar  44.1      27 0.00068   16.3   3.2   34     30-66   189-222 (305)
 12 pfam08977 BOFC_N Bypass of For  43.6      16  0.0004   17.8   1.9   27    80-106     29-55 (59)
 13 COG5165 POB3 Nucleosome-bindin  42.5      28 0.00072   16.2   4.8   39   109-147   116-155 (508)
 14 pfam06057 VirJ Bacterial virul  41.9     8.2 0.00021   19.6   0.3   10     75-84     88-97 (192)
 15 pfam05798 Phage_FRD3 Bacteriop  40.7      19 0.00049   17.2   2.0   37    78-130     16-53 (75)
 16 TIGR01182 eda 2-dehydro-3-deox  38.6      32 0.00082   15.8   3.0   35     52-90     29-63 (205)
 17 PRK02967 nickel responsive reg  37.0      34 0.00087   15.6   4.6   96    24-131     7-113 (138)
 18 TIGR02249 integrase_gron integ  36.1      14 0.00037   18.1   0.8   47     23-87   167-213 (320)
 19 pfam11875 DUF3395 Domain of un  35.9      36 0.00091   15.5   4.7   72    38-135    34-115 (144)
 20 pfam02995 DUF229 Protein of un  33.1      39   0.001   15.2   5.2   85    13-126   293-398 (498)
 21 pfam09664 DUF2399 Protein of u  33.0      39   0.001   15.2   2.7   52    40-104    52-103 (155)
 22 TIGR02342 chap_CCT_delta T-com  31.4      33 0.00085   15.7   2.0   57     16-89   375-431 (526)
 23 pfam04555 XhoI Restriction end  30.3      44  0.0011   14.9   3.2   52    38-106     21-72 (196)
 24 TIGR01572 A_thl_para_3677 Arab  30.1      28 0.00071   16.2   1.4   14     59-72   242-255 (290)
 25 pfam04776 DUF626 Protein of un  29.7      32  0.0008   15.8   1.6   28     59-92    77-104 (116)
 26 PRK04966 hypothetical protein;  28.2      47  0.0012   14.7   2.3   45     29-76      2-64 (72)
 27 pfam10154 DUF2362 Uncharacteri  26.2     7.2 0.00018   20.0  -2.1   11     86-96   229-239 (501)
 28 TIGR02340 chap_CCT_alpha T-com  25.6      53  0.0013   14.4   2.3   30     16-49   371-400 (540)
 29 TIGR02439 catechol_proteo cate  25.6      38 0.00096   15.3   1.4   23     45-67   205-227 (288)
 30 PRK01002 nickel responsive reg  25.1      54  0.0014   14.3   5.2   97    24-131    10-117 (141)
 31 pfam11383 DUF3187 Protein of u  24.9      26 0.00066   16.4   0.5   66    50-117    79-145 (273)
 32 TIGR00079 pept_deformyl peptid  24.8      33 0.00084   15.7   1.0   13    88-100   152-164 (188)
 33 pfam11757 RSS_P20 Suppressor o  24.7      48  0.0012   14.6   1.9   40     15-58    86-125 (137)
 34 COG3612 Uncharacterized protei  24.3      50  0.0013   14.6   1.9   36     42-79     39-75 (157)
 35 pfam09665 RE_Alw26IDE Type II   23.9      29 0.00073   16.1   0.6   96    24-142   326-432 (511)
 36 pfam00073 Rhv picornavirus cap  22.4      48  0.0012   14.7   1.5   42    72-131   111-152 (173)
 37 KOG3913 consensus               21.4      28 0.00072   16.2   0.1   94    37-139   194-287 (356)
 38 pfam06748 DUF1217 Protein of u  21.3      57  0.0015   14.2   1.7   21     31-51    99-119 (150)
 39 cd01611 GABARAP GABARAP (GABA   21.2      57  0.0014   14.2   1.6   33    76-108     13-47 (112)
 40 TIGR00174 miaA tRNA delta(2)-i  21.0      42  0.0011   15.0   1.0   42     16-65    57-101 (307)
 41 pfam02576 DUF150 Uncharacteriz  20.8      65  0.0017   13.8   3.4   90    30-128    37-138 (141)
 42 pfam11094 UL11 Membrane-associ  20.7      46  0.0012   14.8   1.1   10   191-200     21-30 (39)
 43 pfam02991 MAP1_LC3 Microtubule  20.7      58  0.0015   14.1   1.6   32    77-108      6-39 (104)
 44 KOG1932 consensus               20.4      67  0.0017   13.7   3.5   94    21-126   455-593 (1180)
 45 pfam11688 DUF3285 Protein of u  20.1      63  0.0016   13.9   1.7   13     35-47      7-19 (45)
 46 TIGR01722 MMSDH methylmalonate  20.0      41   0.001   15.1   0.7   27     19-45     53-82 (478)

No 1
>pfam07031 consensus
Probab=100.00 E-value=0 Score=444.26 Aligned_cols=153 Identities=52% Similarity=0.851 Sum_probs=131.3

Q ss_pred             CCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEE
Q ss_conf             49899999999999999999999708999874699998458888775989996397537998431121807645479999
Q gi|254780955|r   30 IRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVG  109 (204)
Q Consensus        30 i~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~  109 (204)
                      |+|++||++|||+|||++|+.|+++| |||+|||||||+|++|||+||+|||+|||+|||||||||||||+|++++|||+
T Consensus         1 i~Y~~l~~~Alr~vv~~~L~~v~~~G-lpg~hHfyItF~T~~~gV~ip~~L~~~YP~eMTIVlQhqf~dL~V~~~~FsV~   79 (153)
T pfam07031        1 IRYDKLVQDALRGVVRKVLTDVAKEG-LPGEHHFYITFLTNAPGVILPDRLKERYPEEMTIVLQHQFWDLNVTEDGFSVT   79 (153)
T ss_pred             CCHHHHHHHHHHHHHHHHHHHHHHCC-CCCCCEEEEEEECCCCCCCCCHHHHHHCCCCCEEEEEEEECCCEEECCCEEEE
T ss_conf             98799999999999999999999729-99873799999659999668999996498613998632232415506836999

Q ss_pred             EEECCEEEEEEEEHHHHEEEECCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
Q ss_conf             98498766899844242023067753035603467665433444565444445655664334444444555665556665
Q gi|254780955|r  110 LSFSNVPERLVIPFNAIKGFYDPSVNFELEFDVHIEHIEEKLEGGNTGKVLTSPDNFDKNQTNSVSQDSSKKKSTKKQNK  189 (204)
Q Consensus       110 LsF~g~~e~L~IPf~AIt~F~DPSV~FgLqF~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  189 (204)
                      |||||+||+|+|||+||++|+||||+|||||++..++..++.+..+...                  .....++......
T Consensus 80 LsF~~~~e~l~IPf~aI~~F~DPsv~FgL~F~~~~~~~~~~~~~~~~~~------------------~~~~~~~~~~~~~ 141 (153) T pfam07031 80 LSFGGVPERLTIPFSAITKFVDPSVNFGLQFEQQENDDDEEEPDEDNDD------------------PDDEAPSPAPLTP 141 (153) T ss_pred EEECCEEEEEEEEHHHHHHCCCCCCCEEEEECCCCCCCCCCCCCCCCCC------------------CCCCCCCCCCCCC T ss_conf 9849904678952688410128887757875576676556667777676------------------5567888878888 Q ss_pred CCCCCEEEEEEC Q ss_conf 677728983100 Q gi|254780955|r 190 NKMASVISLDNF 201 (204) Q Consensus 190 ~~~aeVVSLD~F 201 (204) .++|||||||+| T Consensus 142 ~~~aeVVSLD~F 153 (153) T pfam07031 142 AKGANVVSLDAF 153 (153) T ss_pred CCCCCEEECCCC T ss_conf 888868857789 No 2 >COG3814 Uncharacterized protein conserved in bacteria [Function unknown] Probab=100.00 E-value=0 Score=414.08 Aligned_cols=157 Identities=62% Similarity=1.009 Sum_probs=134.0 Q ss_pred CHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCC Q ss_conf 02210498999999999999999999997089998746999984588887759899963975379984311218076454 Q gi|254780955|r 25 MNYDHIRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDN 104 (204) Q Consensus 25 M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~ 104 (204) |.+|+|+|+.|+|+|||||||++|..++..| |||+|||||||+|.+|||++|.+||++||++||||||||||||.|+|. 
T Consensus 1 m~qd~i~Y~~laqealrgvvkkvL~kva~~g-Lp~dhh~yItf~T~apgV~~~s~lk~kYPeqmTIVlQ~Qfwdl~v~dt 79 (157) T COG3814 1 MGQDHIRYDILAQEALRGVVKKVLAKVAATG-LPGDHHFYITFLTGAPGVRIPSKLKQKYPEQMTIVLQHQFWDLKVTDT 79 (157) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHHCC-CCCCCEEEEEEECCCCCEECCHHHHHHCCCCEEEEEEEEEECCEECCC T ss_conf 9522478999999999999999999875328-998737999996389823444888763850359976545401210246 Q ss_pred EEEEEEEECCEEEEEEEEHHHHEEEECCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC Q ss_conf 79999984987668998442420230677530356034676654334445654444456556643344444445556655 Q gi|254780955|r 105 HFEVGLSFSNVPERLVIPFNAIKGFYDPSVNFELEFDVHIEHIEEKLEGGNTGKVLTSPDNFDKNQTNSVSQDSSKKKST 184 (204) Q Consensus 105 ~FsV~LsF~g~~e~L~IPf~AIt~F~DPSV~FgLqF~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 184 (204) +|||+|||+|+||+|+|||+||++|+||||||.|+|++.....++.+. ..+|.+ ..+. T Consensus 80 gFsv~lsF~~vpe~l~iPf~Al~~FyDpsvnf~LeF~~~~~~~e~~e~-------~~~p~~---------------~~~~ 137 (157) T COG3814 80 GFSVTLSFSGVPEKLYIPFDALRGFYDPSVNFELEFDVSLNIEEEAEP-------EAEPSN---------------KAKS 137 (157) T ss_pred CEEEEEEECCCCEEEEEEHHHHHHHCCCCCCEEEEECCCCCCCCCCCC-------CCCCCC---------------CCCC T ss_conf 248998857963079962588550148876579997442255445676-------555444---------------2236 Q ss_pred CCCCCCCCCCEEEEEECCCC Q ss_conf 56665677728983100269 Q gi|254780955|r 185 KKQNKNKMASVISLDNFRKK 204 (204) Q Consensus 185 ~~~~~~~~aeVVSLD~FRKK 204 (204) ......++++|||||+|||| T Consensus 138 ~at~~~~~p~VvsLD~FRkk 157 (157) T COG3814 138 GATSDSEGPNVVSLDAFRKK 157 (157) T ss_pred CCCCCCCCCCEEEHHHHHCC T ss_conf 66577889848873774049 No 3 >pfam04386 SspB Stringent starvation protein B. Escherichia coli stringent starvation protein B (SspB), is thought to enhance the specificity of degradation of tmRNA-tagged proteins by the ClpXP protease. The tmRNA tag, also known as ssrA, is an 11-aa peptide added to the C terminus of proteins stalled during translation, targets proteins for degradation by ClpXP and ClpAP. 
SspB is a cytoplasmic protein that specifically binds to residues 1-4 and 7 of the tag. Binding of SspB enhances degradation of tagged proteins by ClpX, and masks sequence elements important for ClpA interactions, inhibiting degradation by ClpA. However, more recent work has cast doubt on the importance of SspB in wild-type cells. SspB is encoded in an operon whose synthesis is stimulated by carbon, amino acid, and phosphate starvation. SspB may play a special role during nutrient stress, for example by ensuring rapid degradation of the products of stalled translation, without causing a global increase in degradation of
Probab=100.00 E-value=1.6e-42 Score=297.14 Aligned_cols=120 Identities=44% Similarity=0.757 Sum_probs=114.3

Q ss_pred             CCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEE
Q ss_conf             49899999999999999999999708999874699998458888775989996397537998431121807645479999
Q gi|254780955|r   30 IRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVG  109 (204)
Q Consensus        30 i~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~  109 (204)
                      |+|+.|++.|||+|++.+|+.+++.| |+|+|||||||+|+++||.||++|+++||++||||+|||||||.|++++||++
T Consensus         1 ~~y~~~~~~~lRav~~w~L~~~~~~g-ld~~~~pyI~~~t~~~gV~vP~~l~~~~~~~mlnV~~~a~~~L~v~~d~~sf~   79 (152)
T pfam04386        1 ITYDSLRPYALRAVYRWVLTDVAKEG-LDNDHTPYITFDTTAPGVQVPDELVERYPQIMLNVLQHAFWDLEVGNDGFSFN   79 (152)
T ss_pred             CCHHHHHHHHHHHHHHHHHHHHHHCC-CCCCCEEEEEEEECCCCCCCCHHHHHCCCCEEEEECHHHHHCEEECCCEEEEE
T ss_conf             97788899999999999998765348-89998379999908999187899984399479995477760516627889999

Q ss_pred             EEECCEEEEEEEEHHHHEEEECCCCCEEEEECCCCCCCCCC
Q ss_conf             98498766899844242023067753035603467665433
Q gi|254780955|r  110 LSFSNVPERLVIPFNAIKGFYDPSVNFELEFDVHIEHIEEK  150 (204)
Q Consensus       110 LsF~g~~e~L~IPf~AIt~F~DPSV~FgLqF~~~~~~~~~~  150 (204)
                      |+|+|++++|+|||+||++|+||+++|||+|+.......+.
T Consensus 80 ~rF~G~~~~i~iP~~Ai~~i~~~e~g~Gl~F~~~~~~~~~~ 120 (152) T pfam04386 80 LRFGGVPERLYVPFAAILAFYDPENGFGLQFEPEEADEDEE 120 (152) T ss_pred EEECCEEEEEEEEHHHEEEEECCCCCCCCCCCCCCCCCCCC T ss_conf 99899307999744771466666678652058988777777 No 4 >PRK11798 ClpXP protease specificity-enhancing factor; Provisional Probab=98.01 E-value=0.00018 Score=49.87 Aligned_cols=94 Identities=15% Similarity=0.274 Sum_probs=78.4 Q ss_pred HHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEE---ECCEEECCCEEEEEEEECCEEEEE Q ss_conf 99999999997089998746999984588887759899963975379984311---218076454799999849876689 Q gi|254780955|r 43 LVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQ---FWDLKVLDNHFEVGLSFSNVPERL 119 (204) Q Consensus 43 Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhq---f~dL~V~~~~FsV~LsF~g~~e~L 119 (204) ++|....=+..+|+ -=||.-.+.++||.+|.... ++=.|||--- -.||.++++..+-.--|+|++..+ T Consensus 11 LiRA~yeW~~Dn~~-----TPyi~V~a~~~~v~VP~~~v----~dg~IvLNIsp~Av~~L~i~nd~isF~ARF~G~~~~i 81 (140) T PRK11798 11 LLRALYEWLVDNGL-----TPHLLVDATYPGVQVPMEYV----KDGQIVLNISPRAVGNLQLDNDAISFNARFGGVPRQI 81 (140) T ss_pred HHHHHHHHHHHCCC-----CCEEEEEECCCCCCCCHHHC----CCCEEEEECCHHHHHCEEECCCEEEEEEEECCEEEEE T ss_conf 89999999972899-----85499981799966898880----1998999779888603077587899987979915899 Q ss_pred EEEHHHHEEEECCCCCEEEEECCCCC Q ss_conf 98442420230677530356034676 Q gi|254780955|r 120 VIPFNAIKGFYDPSVNFELEFDVHIE 145 (204) Q Consensus 120 ~IPf~AIt~F~DPSV~FgLqF~~~~~ 145 (204) .||..||.+-+-.-.+-|+-|+.... 
T Consensus 82 ~vP~~aV~aIyArEnG~Gm~F~~E~~ 107 (140) T PRK11798 82 YVPIAAVLAIYARENGQGMMFEPEAA 107 (140) T ss_pred EEEHHHHHHHHHHCCCCCCCCCCCCC T ss_conf 97789964253010378754488766 No 5 >COG2969 SspB Stringent starvation protein B [General function prediction only] Probab=95.85 E-value=0.16 Score=30.78 Aligned_cols=95 Identities=13% Similarity=0.188 Sum_probs=76.0 Q ss_pred HHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEE---CCEEECCCEEEEEEEECCEEEEEE Q ss_conf 99999999970899987469999845888877598999639753799843112---180764547999998498766899 Q gi|254780955|r 44 VKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQF---WDLKVLDNHFEVGLSFSNVPERLV 120 (204) Q Consensus 44 vr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf---~dL~V~~~~FsV~LsF~g~~e~L~ 120 (204) +|..-+-...+++- =||--.-+.+||.+|- .|-..=-|||---. -+|..+.+..|-.--|+|++..+. T Consensus 13 lRA~yeWl~DN~~T-----PhlvVd~t~~Gv~VP~----eyvkDgqIVLNvs~~Av~nL~l~Nd~vsFnARFgGvs~~v~ 83 (155) T COG2969 13 LRALYEWLLDNQLT-----PHLVVDVTLPGVKVPM----EYVRDGQIVLNIAPRAVGNLELGNDWVSFNARFGGVSRQVS 83 (155) T ss_pred HHHHHHHHHCCCCC-----CEEEEECCCCCCCCCH----HHCCCCEEEEEECCCCCCCEEECCCEEEEEEEECCCCEEEE T ss_conf 99999998517987-----1389962556732787----77138859997172110366853750898666088431368 Q ss_pred EEHHHHEEEECCCCCEEEEECCCCCCC Q ss_conf 844242023067753035603467665 Q gi|254780955|r 121 IPFNAIKGFYDPSVNFELEFDVHIEHI 147 (204) Q Consensus 121 IPf~AIt~F~DPSV~FgLqF~~~~~~~ 147 (204) ||-.|+..-+-.-.+=|.-|+....-+ T Consensus 84 vP~~avlAiYAREnG~G~~Fe~E~~~~ 110 (155) T COG2969 84 VPVGAVLAIYARENGQGMMFEPEAAYD 110 (155) T ss_pred EEHHHHHHHHHHHCCCCEECCCCCCCC T ss_conf 566784314313248831307311456 No 6 >PRK08296 hypothetical protein; Provisional Probab=62.23 E-value=11 Score=18.77 Aligned_cols=40 Identities=18% Similarity=0.204 Sum_probs=27.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCC Q ss_conf 8999999999999999999997089998746999984588 Q gi|254780955|r 32 YDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNA 71 (204) Q Consensus 32 Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~ 71 
(204)
                      |..|+...=|......|..+.+-=-+.-+|||||---+.+
T Consensus       343 yr~lL~~d~r~~Fd~~L~lA~~v~p~~EdH~Fyiehw~~~  382 (602)
T PRK08296        343 YRDLLDGDERAQFDEKLGLARTVFPYVENHNFYVEHWFHS  382 (602)
T ss_pred             HHHHCCHHHHHHHHHHHHHHHHHCCCCCCCEEEEEHHHHH
T ss_conf             9985481679999999999998522457653454104789

No 7
>pfam03315 SDH_beta Serine dehydratase beta chain. L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chain or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway.
Probab=57.21 E-value=14 Score=18.06 Aligned_cols=66 Identities=15% Similarity=0.132 Sum_probs=36.5

Q ss_pred             HHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEEE
Q ss_conf             9999999997089998746999984588887759899963975379984311218076454799999
Q gi|254780955|r   44 VKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVGL  110 (204)
Q Consensus        44 Vr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~L  110 (204)
                      +...+..+...+.|.=..+.-|.|.....=+--+......+|+-|+|.+-.. ........+|||+=
T Consensus        72 ~~~~~~~i~~~~~l~l~~~~~i~f~~~~di~f~~~~~l~~HpN~m~f~a~~~-~~~l~~~~yySiGG  137 (148)
T pfam03315       72 IEARLARIREEGKLNLAGEHSIPFAPERDIVFHFKEILPFHPNGMRFTAFDD-GGELLEETYYSIGG  137 (148)
T ss_pred             HHHHHHHHHHCCEEECCCCCCEEECCCCCEEEECCCCCCCCCCEEEEEEEEC-CCEEEEEEEEECCC
T ss_conf             6899999864775502896216415027939963767999998689999919-97699999998089

No 8
>pfam10274 ParcG Parkin co-regulated protein. This family of proteins is transcribed anti-sense along the DNA to the Parkin gene product and the two appear to be transcribed under the same promoter. The protein has predicted alpha-helical and beta-sheet domains which suggest its function is in the ubiquitin/proteasome system. Mutations in parkin are the genetic cause of early-onset and autosomal recessive juvenile parkinsonism.
Probab=48.31 E-value=13 Score=18.38 Aligned_cols=32 Identities=22% Similarity=0.285 Sum_probs=27.2 Q ss_pred CCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCC Q ss_conf 2202210498999999999999999999997089 Q gi|254780955|r 23 TLMNYDHIRYDILAKEALRGLVKVVLSEVASIGS 56 (204) Q Consensus 23 ~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~ 56 (204) +.+..|.|+| .....++.+|.++|+..+++|| T Consensus 136 ~~n~gd~idy--~~~~~i~dlI~etL~~lE~~GG 167 (183) T pfam10274 136 NVNLGDRIDY--RKKKNIGDLIQETLELLERNGG 167 (183) T ss_pred CCCCCHHHHH--HHCCCHHHHHHHHHHHHHHHCC T ss_conf 5442116777--6124375699999999999578 No 9 >pfam06794 UPF0270 Uncharacterized protein family (UPF0270). Probab=46.42 E-value=16 Score=17.74 Aligned_cols=31 Identities=26% Similarity=0.354 Sum_probs=24.4 Q ss_pred HCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCE Q ss_conf 0498999999999999999999997089998746 Q gi|254780955|r 29 HIRYDILAKEALRGLVKVVLSEVASIGSLPGEHH 62 (204) Q Consensus 29 li~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hH 62 (204) +|+|+.|-.++|+|++.+-+ .+.|.=.|++. T Consensus 2 iIP~~~L~~etL~~lIeefv---~ReGTDyG~~E 32 (70) T pfam06794 2 IIPWQQLPPETLNNLIEEFV---LREGTDYGEDE 32 (70) T ss_pred CCCHHHCCHHHHHHHHHHHH---HCCCCCCCCCC T ss_conf 27837799999999999998---62377666452 No 10 >TIGR01349 PDHac_trf_mito pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase; InterPro: IPR006257 This group of sequences represent one of several closely related clades of the dihydrolipoamide acetyltransferase subunit of the pyruvate dehydrogenase complex. It includes sequences from mitochondria and from alpha and beta branches of the proteobacteria, as well as from some other bacteria, but not the Gram-positive bacteria.; GO: 0004742 dihydrolipoyllysine-residue acetyltransferase activity, 0006090 pyruvate metabolic process, 0045254 pyruvate dehydrogenase complex. 
Probab=44.61 E-value=11 Score=18.94 Aligned_cols=41 Identities=17% Similarity=0.358 Sum_probs=31.6 Q ss_pred HHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCC Q ss_conf 2210498999999999999999999997089998746999984588 Q gi|254780955|r 26 NYDHIRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNA 71 (204) Q Consensus 26 ~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~ 71 (204) +.+.=.|..+=++.||.++-+=|. -+|.. .| |||+|..-+- T Consensus 342 ~~~tg~Y~d~P~sniRk~IA~RL~-eSKq~-iP---HyYvs~~~~~ 382 (584) T TIGR01349 342 PVSTGSYEDVPLSNIRKIIAKRLL-ESKQT-IP---HYYVSVECNV 382 (584) T ss_pred CCCCCCEEECCCCCHHHHHHHHHH-HHHCC-CC---EEEEEEEEEH T ss_conf 888864466788832689999998-87568-86---0799998861 No 11 >pfam01122 Cobalamin_bind Eukaryotic cobalamin-binding protein. Probab=44.11 E-value=27 Score=16.33 Aligned_cols=34 Identities=26% Similarity=0.353 Sum_probs=29.0 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEE Q ss_conf 4989999999999999999999970899987469999 Q gi|254780955|r 30 IRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYIT 66 (204) Q Consensus 30 i~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyIt 66 (204) .+|...+..|+|.++.++|+....+| +-|+ .|.| T Consensus 189 ~~~~~~i~~ai~~~~~kil~~q~~~G-~iGN--~yST 222 (305) T pfam01122 189 EGYRAEISQALKTVVEKILKSKKPNG-HIGN--IYST 222 (305) T ss_pred CCHHHHHHHHHHHHHHHHHHHHCCCC-CCCC--EECH T ss_conf 14799999999999999998637677-3253--0037 No 12 >pfam08977 BOFC_N Bypass of Forespore C, N terminal. The N-terminal domain of 'bypass of forespore C' is composed of a four-stranded beta-sheet covered by an alpha-helix. The beta-sheet has a beta2-beta1-beta4-beta3 topology, where strands beta1 and beta2 and strands beta3 and beta4 are connected by beta-turns, whereas strands beta2 and beta3 are joined by an alpha-helix that runs across one face of the beta-sheet. This domain is similar to the third immunoglobulin G-binding domain of protein G from Streptococcus, the latter belonging to a large and diverse group of cell surface-associated proteins that bind to immunoglobulins. 
It has been hypothesized that this domain may be a mediator of protein-protein interactions involved in proteolytic events at the cell surface. Probab=43.56 E-value=16 Score=17.84 Aligned_cols=27 Identities=15% Similarity=0.327 Sum_probs=23.0 Q ss_pred HHHHCCCEEEEEEEEEECCEEECCCEE Q ss_conf 996397537998431121807645479 Q gi|254780955|r 80 LRKNYPEKMTIVIQNQFWDLKVLDNHF 106 (204) Q Consensus 80 L~~~yP~emTIVlQhqf~dL~V~~~~F 106 (204) .-++-|-+|||.|+-+|-|=.|+++.+ T Consensus 29 v~e~~P~~vtV~LEr~YlDGEvSeE~~ 55 (59) T pfam08977 29 VEEKEPLQVTVQLERVYLDGEVSEEII 55 (59) T ss_pred EEECCCCEEEEEEEEEEECCCCCHHHC T ss_conf 774288079999999884461224222 No 13 >COG5165 POB3 Nucleosome-binding factor SPN, POB3 subunit [Transcription / DNA replication, recombination, and repair / Chromatin structure and dynamics] Probab=42.46 E-value=28 Score=16.16 Aligned_cols=39 Identities=23% Similarity=0.350 Sum_probs=28.4 Q ss_pred EEEECCEEEEEEEEHHHHEEE-ECCCCCEEEEECCCCCCC Q ss_conf 998498766899844242023-067753035603467665 Q gi|254780955|r 109 GLSFSNVPERLVIPFNAIKGF-YDPSVNFELEFDVHIEHI 147 (204) Q Consensus 109 ~LsF~g~~e~L~IPf~AIt~F-~DPSV~FgLqF~~~~~~~ 147 (204) .+-|-+.+-...||+++|+.- .|--..-++.|...++.. T Consensus 116 ~vf~~N~kp~FEIP~~~i~ntnl~~kNEv~vef~~~de~~ 155 (508) T COG5165 116 AVFFRNTKPIFEIPVDDIENTNLDIKNEVSVEFRIQDEEY 155 (508) T ss_pred EEEEECCCEEEEEEHHHHCCCCCCCCCEEEEEEECCCCCC T ss_conf 2454068703771466600355544332689996044013 No 14 >pfam06057 VirJ Bacterial virulence protein (VirJ). This family consists of several bacterial VirJ virulence proteins. VirJ is thought to be involved in the type IV secretion system. It is thought that the substrate proteins localized to the periplasm may associate with the pilus in a manner that is mediated by VirJ, and suggest a two-step process for type IV secretion in Agrobacterium. 
Probab=41.85 E-value=8.2 Score=19.65 Aligned_cols=10 Identities=10% Similarity=0.318 Sum_probs=4.9 Q ss_pred CCCHHHHHHC Q ss_conf 7598999639 Q gi|254780955|r 75 RISQNLRKNY 84 (204) Q Consensus 75 ~ip~~L~~~y 84 (204) .+|+.+|++- T Consensus 88 ~LP~~~r~~v 97 (192) T pfam06057 88 RLPPATKQRV 97 (192) T ss_pred HCCHHHHHHH T ss_conf 0999998541 No 15 >pfam05798 Phage_FRD3 Bacteriophage FRD3 protein. This family consists of bacteriophage FRD3 proteins. Probab=40.69 E-value=19 Score=17.23 Aligned_cols=37 Identities=27% Similarity=0.738 Sum_probs=25.8 Q ss_pred HHHHHHCCC-EEEEEEEEEECCEEECCCEEEEEEEECCEEEEEEEEHHHHEEEE Q ss_conf 899963975-37998431121807645479999984987668998442420230 Q gi|254780955|r 78 QNLRKNYPE-KMTIVIQNQFWDLKVLDNHFEVGLSFSNVPERLVIPFNAIKGFY 130 (204) Q Consensus 78 ~~L~~~yP~-emTIVlQhqf~dL~V~~~~FsV~LsF~g~~e~L~IPf~AIt~F~ 130 (204) +-+|.|||+ -||-|..-|||+..+. +.=|+++++.|- T Consensus 16 EvIRNRyPelsi~si~d~~f~~~~i~----------------i~GPle~l~~FM 53 (75) T pfam05798 16 EVIRNRYPELSIDSVQDSKFWSIQIV----------------IEGPLEDLKKFM 53 (75) T ss_pred HHHHCCCCCEEEEEEECCCCCEEEEE----------------EECCHHHHHHHH T ss_conf 99973597427776524886259999----------------856599999999 No 16 >TIGR01182 eda 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase; InterPro: IPR000887 4-Hydroxy-2-oxoglutarate aldolase (4.1.3.16 from EC) (KHG-aldolase) catalyzes the interconversion of 4-hydroxy-2-oxoglutarate into pyruvate and glyoxylate. Phospho-2-dehydro-3-deoxygluconate aldolase (4.1.2.14 from EC) (KDPG-aldolase) catalyzes the interconversion of 6-phospho-2-dehydro-3-deoxy-D-gluconate into pyruvate and glyceraldehyde 3-phosphate. These two enzymes are structurally and functionally related . They are both homotrimeric proteins of approximately 220 amino-acid residues. They are class I aldolases whose catalytic mechanism involves the formation of a Schiff-base intermediate between the substrate and the epsilon-amino group of a lysine residue. 
In both enzymes, an arginine is required for catalytic activity.; GO: 0003824 catalytic activity, 0008152 metabolic process. Probab=38.55 E-value=32 Score=15.78 Aligned_cols=35 Identities=29% Similarity=0.312 Sum_probs=26.9 Q ss_pred HHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEE Q ss_conf 970899987469999845888877598999639753799 Q gi|254780955|r 52 ASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTI 90 (204) Q Consensus 52 ~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTI 90 (204) .-.||+ +-.=|||+|....=.| ..|+.++|+++.| T Consensus 29 L~egG~---~~~EvTlRT~~A~~aI-~~l~~~~P~~~~i 63 (205) T TIGR01182 29 LIEGGL---RVLEVTLRTPVALEAI-RALRKEVPKDALI 63 (205) T ss_pred HHHCCC---EEEEEEECCCCHHHHH-HHHHHHCCCCCEE T ss_conf 986798---0898851472168999-9999728233487 No 17 >PRK02967 nickel responsive regulator; Provisional Probab=37.02 E-value=34 Score=15.62 Aligned_cols=96 Identities=22% Similarity=0.304 Sum_probs=58.9 Q ss_pred CCHHHHC-CHHHHHH--------HHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEE Q ss_conf 2022104-9899999--------999999999999999708999874699998458888775989996397537998431 Q gi|254780955|r 24 LMNYDHI-RYDILAK--------EALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQN 94 (204) Q Consensus 24 ~M~~Dli-~Y~~l~~--------~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQh 94 (204) .|+.+|+ ..|+++. +|.|..+|+.|..-.... 
-.++.-=.||+.-+|.--.+..+|.+ +|| T Consensus 7 Slp~~Ll~~~D~~i~~~Gy~sRSEaIRdliR~~l~e~~~~~-~~~~~~g~it~vYdH~~~~l~~~lt~---------iQH 76 (138) T PRK02967 7 TLDDDLLETLDSLIARRGYQNRSEAIRDLLRAALQQEATQE-HGTQCVAVLSYVYDHEKRDLASRLVS---------TQH 76 (138) T ss_pred ECCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHC-CCCCEEEEEEEEEECCCHHHHHHHHH---------HHH T ss_conf 75999999999999981999788999999999999744425-89728999999985671468999999---------988 Q ss_pred EECCEEECCCEEEEEEEECCEEEEEEE--EHHHHEEEEC Q ss_conf 121807645479999984987668998--4424202306 Q gi|254780955|r 95 QFWDLKVLDNHFEVGLSFSNVPERLVI--PFNAIKGFYD 131 (204) Q Consensus 95 qf~dL~V~~~~FsV~LsF~g~~e~L~I--Pf~AIt~F~D 131 (204) .|-|+.++..+ |-|.=.+--|.+.+ |-..|..|++ T Consensus 77 ~~~d~Iiss~H--vHld~~~CLEvivv~G~~~~i~~la~ 113 (138) T PRK02967 77 HHHDLSVATLH--VHLDHDDCLEVAVLKGDMGDVQHFAD 113 (138) T ss_pred HCCCEEEEEEE--EECCCCCCEEEEEEECCHHHHHHHHH T ss_conf 43042988878--73377762799999557899999999 No 18 >TIGR02249 integrase_gron integron integrase; InterPro: IPR011946 Members of this family are integrases associated with integrons (and super-integrons), which are systems for incorporating and expressing cassettes of laterally transferred DNA. Incorporation occurs at an attI site. A super-integron, as in Vibrio sp., may include over 100 cassettes. This family belongs to the phage integrase family that also includes recombinases XerC (IPR011931 from INTERPRO) and XerD (IPR011932 from INTERPRO), which are bacterial housekeeping proteins. Within this family of integron integrases, some are designated by class, e.g. IntI4, a class 4 integron integrase from Vibrio cholerae N16961.; GO: 0003677 DNA binding, 0006310 DNA recombination, 0015074 DNA integration. 
Probab=36.08 E-value=14 Score=18.06 Aligned_cols=47 Identities=17% Similarity=0.176 Sum_probs=31.5 Q ss_pred CCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCE Q ss_conf 22022104989999999999999999999970899987469999845888877598999639753 Q gi|254780955|r 23 TLMNYDHIRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEK 87 (204) Q Consensus 23 ~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~e 87 (204) .+||.+| ...|+.=+..|...=+++-.-+| +-||-||.-|+.|||+- T Consensus 167 v~Lp~~L-------~~~L~~q~~~~~~~h~~Dl~~~G-----------~g~V~LP~AL~rKYPnA 213 (320) T TIGR02249 167 VTLPKSL-------APPLREQLERARALHEKDLLAEG-----------YGGVYLPHALARKYPNA 213 (320) T ss_pred CCCHHHH-------HHHHHHHHHHHHHHHHHHHHCCC-----------CCCEEHHHHHHCCCCCC T ss_conf 2681755-------79999999999999998640588-----------53010045542157873 No 19 >pfam11875 DUF3395 Domain of unknown function (DUF3395). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 147 to 176 amino acids in length. This domain is found associated with pfam00226. 
Probab=35.86 E-value=36 Score=15.50 Aligned_cols=72 Identities=26% Similarity=0.355 Sum_probs=33.8 Q ss_pred HHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCC------CEEEEEEEEEECCEEECCCEEEEEEE Q ss_conf 999999999999999708999874699998458888775989996397------53799843112180764547999998 Q gi|254780955|r 38 EALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYP------EKMTIVIQNQFWDLKVLDNHFEVGLS 111 (204) Q Consensus 38 ~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP------~emTIVlQhqf~dL~V~~~~FsV~Ls 111 (204) ..|+.++.+.+..-.+.|||-=.+-.| |..-+.......+ -.+||.||..- .+ T Consensus 34 ~lm~~~~~r~~~~E~~~~GLVI~~A~Y--------G~~~~~~~~~~~~~~~~~viDVTiplq~lV-----~d-------- 92 (144) T pfam11875 34 SLMGDVVERKLTREEEKGGLVILKAYY--------GNLDSIKGKAQNNDEIEEVIDVTIPLQALV-----RD-------- 92 (144) T ss_pred HHHHHHHHHHHHHHHHCCCEEEEEEEC--------CCCCCCCCCCCCCCCCCCEEEEEEHHHHEE-----EC-------- T ss_conf 999999999999886039759999773--------676766665544457885799870320156-----17-------- Q ss_pred ECCEEEEEEEE----HHHHEEEECCCCC Q ss_conf 49876689984----4242023067753 Q gi|254780955|r 112 FSNVPERLVIP----FNAIKGFYDPSVN 135 (204) Q Consensus 112 F~g~~e~L~IP----f~AIt~F~DPSV~ 135 (204) .+|.|| .+-+.+|+||..+ T Consensus 93 -----s~L~l~~~~~Ks~L~GF~DPcpg 115 (144) T pfam11875 93 -----SQLVLPSGSSKSNLLGFYDPCPG 115 (144) T ss_pred -----CEEEECCCCCHHCCCCCCCCCCC T ss_conf -----77886388735217765699989 No 20 >pfam02995 DUF229 Protein of unknown function (DUF229). Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. 
Probab=33.10 E-value=39 Score=15.22 Aligned_cols=85 Identities=27% Similarity=0.488 Sum_probs=51.5 Q ss_pred EEEEEECCCCCCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCC-------------------- Q ss_conf 045554162222022104989999999999999999999970899987469999845888-------------------- Q gi|254780955|r 13 QWFFIIKWIDTLMNYDHIRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNAR-------------------- 72 (204) Q Consensus 13 ~~~~~~~~~~~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~-------------------- 72 (204) +=+|.+-|.... +-|.+++-..+++-++..++ ...+.|.| ++-+-| |..+|- T Consensus 293 ~p~F~~fW~~~~-sHd~~n~~~~~D~~~~~~l~----~~~~~g~l--~~tivi-~~SDHG~R~G~~r~t~~G~lEErlP~ 364 (498) T pfam02995 293 SPFFGFFWSNSL-SHDDFNYASALDEDLLKYLK----KLHERGLL--ENTIVI-FMSDHGLRFGKLRRTSQGRLEERLPF 364 (498) T ss_pred CCEEEEEEECCC-CCCCCHHHHHHHHHHHHHHH----HHHHCCCC--CCEEEE-EECCCCCCCCCHHHHCCCCCCCCCCE T ss_conf 965999997874-15663267788999999999----99865963--034999-98777876665322015620002860 Q ss_pred -CCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEEEEECCEEEEEEEEHHHH Q ss_conf -877598999639753799843112180764547999998498766899844242 Q gi|254780955|r 73 -GVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVGLSFSNVPERLVIPFNAI 126 (204) Q Consensus 73 -gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~LsF~g~~e~L~IPf~AI 126 (204) -+.+|+|+|++||+.+ .+|+... .+|+-||+-= T Consensus 365 l~i~lP~wfr~~yP~~~--------~nL~~N~-------------~rLts~fDlh 398 (498) T pfam02995 365 MSIRLPPWFREKYPQAV--------ENLELNA-------------NRLTTPFDLH 398 (498) T ss_pred EEEEECHHHHHHHHHHH--------HHHHHHH-------------HCCCCHHHHH T ss_conf 89991789976779999--------9999863-------------3358724689 No 21 >pfam09664 DUF2399 Protein of unknown function C-terminus (DUF2399). Proteins in this entry are encoded within a conserved gene four-gene neighbourhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. 
EbN1 and Ralstonia solanacearum (Beta-proteobacteria). Just the C-terminal region is included here.
Probab=33.04 E-value=39 Score=15.21 Aligned_cols=52 Identities=21% Similarity=0.394 Sum_probs=42.4

Q ss_pred             HHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCC
Q ss_conf             99999999999997089998746999984588887759899963975379984311218076454
Q gi|254780955|r   40 LRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDN  104 (204)
Q Consensus        40 lr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~  104 (204)
                      .+......|...++.|     ..+|-+=+=+.+|+.|..+|+.+|+..        +|...+.+=
T Consensus        52 p~~A~~~LL~~L~~~g-----~~l~Y~GDFD~~Gl~IA~~l~~ryg~~--------pWrm~~~dY  103 (155)
T pfam09664       52 PSAAALILLDRLAAAG-----ARLYYSGDFDWPGLRIANRLIARYGAR--------PWRMDAADY  103 (155)
T ss_pred             HHHHHHHHHHHHHHCC-----CEEEEECCCCHHHHHHHHHHHHHCCCC--------CCCCCHHHH
T ss_conf             7899999999998489-----869995889937999999999871995--------610789999

No 22
>TIGR02342 chap_CCT_delta T-complex protein 1, delta subunit; InterPro: IPR012717 Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT delta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.; GO: 0005524 ATP binding, 0051082 unfolded protein binding, 0006457 protein folding.
Probab=31.38 E-value=33 Score=15.68 Aligned_cols=57 Identities=18% Similarity=0.271 Sum_probs=38.8 Q ss_pred EEECCCCCCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEE Q ss_conf 55416222202210498999999999999999999997089998746999984588887759899963975379 Q gi|254780955|r 16 FIIKWIDTLMNYDHIRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMT 89 (204) Q Consensus 16 ~~~~~~~~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emT 89 (204) -.+|+.+.+.-+ .=++=+||||| |||- | |.+.+.+||+ .+|.|+|+-.|.+ |-.+|. T Consensus 375 v~~RGSN~Lvi~---EAeRSlHDALC-ViRs-L--Vk~r~L~~GG---------GaPE~E~a~~L~~-~a~~~~ 431 (526) T TIGR02342 375 VLVRGSNKLVID---EAERSLHDALC-VIRS-L--VKKRALIAGG---------GAPEIEIALKLSK-LARTLK 431 (526) T ss_pred EEEECCCCHHHH---HHHHHHHHHHH-HHHH-H--HHCCCCCCCC---------CCCHHHHHHHHHH-HHHHHC T ss_conf 998068602242---33221567899-9998-8--7506513886---------9528999999998-665413 No 23 >pfam04555 XhoI Restriction endonuclease XhoI. This family consists of type II restriction enzymes (EC:3.1.21.4) that recognize the double-stranded sequence CTCGAG and cleave after C-1. 
Probab=30.30 E-value=44 Score=14.92 Aligned_cols=52 Identities=21% Similarity=0.472 Sum_probs=36.9 Q ss_pred HHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEE Q ss_conf 999999999999999708999874699998458888775989996397537998431121807645479 Q gi|254780955|r 38 EALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHF 106 (204) Q Consensus 38 ~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~F 106 (204) ..|-+++.=++..+.+.| ||..+++ ++-..+.+|-+.|- +--|||.|-.++- T Consensus 21 k~mdgf~~li~~~~~~~G-l~~a~i~-----~~~~~l~lPGYfRp-----------tK~WDllVv~~g~ 72 (196) T pfam04555 21 KNMDGFAALVLDIIRANG-LAHAEIF-----QQRTALTLPGYFRP-----------TKLWDLLVVQKGV 72 (196) T ss_pred CCHHHHHHHHHHHHHHCC-CCHHHHH-----HCCCCCCCCCCCCC-----------CCCCCEEEEECCE T ss_conf 236889999999999859-9777875-----24753105654244-----------5652179998893 No 24 >TIGR01572 A_thl_para_3677 Arabidopsis paralogous family TIGR01572; InterPro: IPR006462 These sequences comprise a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The proteins have no known function.. Probab=30.10 E-value=28 Score=16.18 Aligned_cols=14 Identities=29% Similarity=0.361 Sum_probs=9.9 Q ss_pred CCCEEEEEEECCCC Q ss_conf 87469999845888 Q gi|254780955|r 59 GEHHFYITFATNAR 72 (204) Q Consensus 59 G~hHfyItF~T~~~ 72 (204) +.-||||||+-=.+ T Consensus 242 ~~A~vYItF~Gl~k 255 (290) T TIGR01572 242 KTAIVYITFKGLNK 255 (290) T ss_pred CCEEEEEEECCCCC T ss_conf 75589996558898 No 25 >pfam04776 DUF626 Protein of unknown function (DUF626). Protein of unknown function, currently only identified in Brassicaceae. 
Probab=29.67 E-value=32 Score=15.84 Aligned_cols=28 Identities=32% Similarity=0.562 Sum_probs=15.9 Q ss_pred CCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEE Q ss_conf 8746999984588887759899963975379984 Q gi|254780955|r 59 GEHHFYITFATNARGVRISQNLRKNYPEKMTIVI 92 (204) Q Consensus 59 G~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVl 92 (204) .+-+|||||++-+.- .+ .+.-+...||= T Consensus 77 ~nAifYI~fk~~~~~-r~-----g~~~dr~AIVR 104 (116) T pfam04776 77 KNAIFYITFKDSCKA-RI-----GKHVDRIAIVR 104 (116) T ss_pred CEEEEEEEECCCCCC-CC-----CCCCCEEEEEE T ss_conf 015999996555566-66-----87612024563 No 26 >PRK04966 hypothetical protein; Provisional Probab=28.24 E-value=47 Score=14.72 Aligned_cols=45 Identities=27% Similarity=0.316 Sum_probs=29.8 Q ss_pred HCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCC------------------EEEEEEECCCCCCCC Q ss_conf 049899999999999999999999708999874------------------699998458888775 Q gi|254780955|r 29 HIRYDILAKEALRGLVKVVLSEVASIGSLPGEH------------------HFYITFATNARGVRI 76 (204) Q Consensus 29 li~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~h------------------HfyItF~T~~~gV~i 76 (204) .|+|+.|-.++|++++-+ .|.+.|.=.|++ .-.|.|...+.-+.| T Consensus 2 iIP~~~L~~etL~~lie~---FV~REGTDYG~~E~sl~~Kv~qv~~qL~~G~a~Ivfd~~~Es~~I 64 (72) T PRK04966 2 IIPWQDLSPETLDNLIES---FVLREGTDYGEHERSLEQKVADVKRQLQSGEAVLVWSELHETVNI 64 (72) T ss_pred CCCHHHCCHHHHHHHHHH---HHHCCCCCCCCCCCCHHHHHHHHHHHHHCCCEEEEECCCCCEEEE T ss_conf 178377899999999999---970367777645458999999999999769989998688773761 No 27 >pfam10154 DUF2362 Uncharacterized conserved protein (DUF2362). This is a family of proteins conserved from nematodes to humans. The function is not known. 
Probab=26.19 E-value=7.2 Score=19.99 Aligned_cols=11 Identities=36% Similarity=0.522 Sum_probs=7.3 Q ss_pred CEEEEEEEEEE Q ss_conf 53799843112 Q gi|254780955|r 86 EKMTIVIQNQF 96 (204) Q Consensus 86 ~emTIVlQhqf 96 (204) |--||.|--|- T Consensus 229 ESFTIhLGsQl 239 (501) T pfam10154 229 ESFTIHLGSQL 239 (501) T ss_pred CEEEEEHHHHH T ss_conf 01687505677 No 28 >TIGR02340 chap_CCT_alpha T-complex protein 1, alpha subunit; InterPro: IPR012715 Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT alpha chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.; GO: 0005524 ATP binding, 0051082 unfolded protein binding, 0006457 protein folding. Probab=25.62 E-value=53 Score=14.39 Aligned_cols=30 Identities=23% Similarity=0.265 Sum_probs=21.4 Q ss_pred EEECCCCCCCHHHHCCHHHHHHHHHHHHHHHHHH Q ss_conf 5541622220221049899999999999999999 Q gi|254780955|r 16 FIIKWIDTLMNYDHIRYDILAKEALRGLVKVVLS 49 (204) Q Consensus 16 ~~~~~~~~~M~~Dli~Y~~l~~~Alr~Vvr~~L~ 49 (204) -++|+.+..|.+ .-++=+||||+ |||++|+ T Consensus 371 IILRGAN~~~lD---EmeRSlHDaLc-vVkRtLE 400 (540) T TIGR02340 371 IILRGANDFMLD---EMERSLHDALC-VVKRTLE 400 (540) T ss_pred EEECCCCCHHHH---HHHHHHHHHHH-HHHHHHC T ss_conf 686166511554---66677777887-8766213 No 29 >TIGR02439 catechol_proteo catechol 1,2-dioxygenase; InterPro: IPR012801 Members of this family known so far are catechol 1,2-dioxygenases of the proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the actinobacteria, which are quite similar to each other and resolved by separate entries. This enzyme catalyses intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. 
Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogues 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol.; GO: 0005506 iron ion binding, 0018576 catechol 12-dioxygenase activity, 0019614 catechol catabolic process. Probab=25.56 E-value=38 Score=15.34 Aligned_cols=23 Identities=22% Similarity=0.522 Sum_probs=18.3 Q ss_pred HHHHHHHHHCCCCCCCCEEEEEE Q ss_conf 99999999708999874699998 Q gi|254780955|r 45 KVVLSEVASIGSLPGEHHFYITF 67 (204) Q Consensus 45 r~~L~~v~~~g~LpG~hHfyItF 67 (204) -.+|....++|.=|-.-|||||= T Consensus 205 QqlLn~LGRHG~RPAHvHFFvSA 227 (288) T TIGR02439 205 QQLLNLLGRHGNRPAHVHFFVSA 227 (288) T ss_pred HHHHHHCCCCCCCCCCEEEEECC T ss_conf 99985417888898606875658 No 30 >PRK01002 nickel responsive regulator; Provisional Probab=25.06 E-value=54 Score=14.33 Aligned_cols=97 Identities=20% Similarity=0.284 Sum_probs=56.9 Q ss_pred CCHHHHC-CHHHHHH--------HHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEE Q ss_conf 2022104-9899999--------999999999999999708999874699998458888775989996397537998431 Q gi|254780955|r 24 LMNYDHI-RYDILAK--------EALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQN 94 (204) Q Consensus 24 ~M~~Dli-~Y~~l~~--------~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQh 94 (204) .++.+++ ..|.+++ +|+|..+|+.|..-...+...|+--=.||..-+|..-.+..+|.+ +|| T Consensus 10 Slp~~Ll~~lD~~i~~~Gy~nRSeaIRdliR~~l~~~~~~~~~~~~~~G~i~vvYdH~~~~l~~~L~~---------iqH 80 (141) T PRK01002 10 SLPTKLLAEFDEIIEERGYASRSEAIRDAIRDYIIKHKWIHSLEGERAGTISIIYDHHYTDVMEKLTD---------IQH 80 (141) T ss_pred ECCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCEEEEEEEEEECCCCCHHHHHHH---------HHH T ss_conf 80999999999999980998788999999999998734302578847999999996786519999999---------977 Q ss_pred EECCEEECCCEEEEEEEECCEEEEEEE--EHHHHEEEEC Q ss_conf 
121807645479999984987668998--4424202306 Q gi|254780955|r 95 QFWDLKVLDNHFEVGLSFSNVPERLVI--PFNAIKGFYD 131 (204) Q Consensus 95 qf~dL~V~~~~FsV~LsF~g~~e~L~I--Pf~AIt~F~D 131 (204) +|.++.++..+.-+ .=+.--|.+.+ |-..|..|++ T Consensus 81 ~~~~~I~ss~Hvhl--d~~~ClEvivv~G~~~~i~~l~~ 117 (141) T PRK01002 81 DYRKLIVATIHMHM--DHDHCMEVVLVKGDASEIRELTD 117 (141) T ss_pred HCCCEEEEEEEEEC--CCCCEEEEEEEECCHHHHHHHHH T ss_conf 33060988887751--78860899999737899999999 No 31 >pfam11383 DUF3187 Protein of unknown function (DUF3187). This family of proteins with unknown function appear to be restricted to Proteobacteria. Probab=24.92 E-value=26 Score=16.39 Aligned_cols=66 Identities=18% Similarity=0.257 Sum_probs=46.2 Q ss_pred HHHHCCC-CCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEEEEECCEEE Q ss_conf 9997089-9987469999845888877598999639753799843112180764547999998498766 Q gi|254780955|r 50 EVASIGS-LPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVGLSFSNVPE 117 (204) Q Consensus 50 ~v~~~g~-LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~LsF~g~~e 117 (204) .+..+|. ---.|-|+|..- ..||.+.+---+---.-+|..+|+|+....-.--.+..+|.||.+.. T Consensus 79 gidQNGRd~v~~~rf~I~~p--~~gi~~~dF~G~tL~~alt~Y~qYql~~~~~hglS~GgSLyyN~v~~ 145 (273) T pfam11383 79 GIGQNGRDEVDKHQFQISSP--EYGIHIEDFEGETLTSAFTLYLQYQLFQNEHHGLSIGGSLYYNNVSS 145 (273) T ss_pred CCCCCCCCCCCCCCEEEECC--CCCCCCCCCCCHHHHHHHHHHHHEEEECCCCCCEEEEEEEEECCCCC T ss_conf 87866742100572687378--67875024661447888987641243048875278779999546467 No 32 >TIGR00079 pept_deformyl peptide deformylase; InterPro: IPR000181 Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria 3.5.1.88 from EC. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction::N-formyl-L-methionine + H_2O = formate + methionyl peptide Catalytic efficiency strongly depends on the identity of the bound metal . The structure of these enzymes is known , . 
PDF, a member of the zinc metalloproteases family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.; GO: 0005506 iron ion binding, 0042586 peptide deformylase activity, 0006412 translation. Probab=24.81 E-value=33 Score=15.72 Aligned_cols=13 Identities=31% Similarity=0.478 Sum_probs=9.7 Q ss_pred EEEEEEEEECCEE Q ss_conf 7998431121807 Q gi|254780955|r 88 MTIVIQNQFWDLK 100 (204) Q Consensus 88 mTIVlQhqf~dL~ 100 (204) .-|||||++..|. T Consensus 152 lA~~iQHE~DHLn 164 (188) T TIGR00079 152 LAICIQHEMDHLN 164 (188) T ss_pred EEEEEEEEEECCC T ss_conf 3588764411258 No 33 >pfam11757 RSS_P20 Suppressor of RNA silencing P21-like. This is a large family of putative suppressors of RNA silencing proteins, P20-P25, from ssRNA positive-strand viruses such as Closterovirus, Potyvirus and Cucumovirus families. RNA silencing is one of the major mechanisms of defence against viruses, and, in response, some viruses have evolved or acquired functions for suppression of RNA silencing. These counter-defensive viral proteins with RNA silencing suppressor (RSS) activity were originally discovered in the members of plant virus genera Potyvirus and Cucumovirus. 
Each of the conserved blocks of amino acids found in P21-like proteins corresponds to a computer-predicted alpha-helix, with the most C-terminal element being 42 residues long. This suggests conservation of the predominantly alpha-helical secondary structure in the P21-like proteins. Probab=24.72 E-value=48 Score=14.63 Aligned_cols=40 Identities=25% Similarity=0.541 Sum_probs=29.0 Q ss_pred EEEECCCCCCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCC Q ss_conf 55541622220221049899999999999999999999708999 Q gi|254780955|r 15 FFIIKWIDTLMNYDHIRYDILAKEALRGLVKVVLSEVASIGSLP 58 (204) Q Consensus 15 ~~~~~~~~~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~Lp 58 (204) ||+++.+ .-...+|...++..++-++..||...++.-+|. T Consensus 86 Ffv~k~s----sl~~~~~s~i~~~kvk~~~~aVl~dlS~e~kLD 125 (137) T pfam11757 86 FFVMKYS----SLSHVPFSEVMRTKLKLVVKAVISDLSREHKLD 125 (137) T ss_pred HHHHHHC----CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 9999980----388998799998711788999999988985888 No 34 >COG3612 Uncharacterized protein conserved in archaea [Function unknown] Probab=24.29 E-value=50 Score=14.55 Aligned_cols=36 Identities=28% Similarity=0.370 Sum_probs=26.2 Q ss_pred HHHHHHHHHHHHCCCCCCC-CEEEEEEECCCCCCCCCHH Q ss_conf 9999999999970899987-4699998458888775989 Q gi|254780955|r 42 GLVKVVLSEVASIGSLPGE-HHFYITFATNARGVRISQN 79 (204) Q Consensus 42 ~Vvr~~L~~v~~~g~LpG~-hHfyItF~T~~~gV~ip~~ 79 (204) .++++++....+.|.|.+. --+.=||.|. ||++|.. T Consensus 39 d~akr~vd~A~~eGLL~~k~~~~~p~Fd~~--~V~lP~d 75 (157) T COG3612 39 DVAKRVVDEALAEGLLVKKGGRLAPTFDTS--GVELPLD 75 (157) T ss_pred HHHHHHHHHHHHCCCCCCCCCCCCCCCCCC--CEECCCC T ss_conf 779999999986564032275327777877--5561257 No 35 >pfam09665 RE_Alw26IDE Type II restriction endonuclease (RE_Alw26IDE). Members of this entry are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of the three members are GGTCTC, CGTCTC and the shared subsequence GTCTC. 
Probab=23.87 E-value=29 Score=16.11 Aligned_cols=96 Identities=18% Similarity=0.238 Sum_probs=63.1 Q ss_pred CCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEE-EEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEEC Q ss_conf 202210498999999999999999999997089998746999-9845888877598999639753799843112180764 Q gi|254780955|r 24 LMNYDHIRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYI-TFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVL 102 (204) Q Consensus 24 ~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyI-tF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~ 102 (204) .+.+|...|.+++.+.+...++ +|..+..+| ||+|+ ||++- .| ---+.-|.+|.++ T Consensus 326 ~~~~~a~~ls~~lr~n~~~~m~-iL~~i~~nG-----~~~fL~t~L~~------------~Y-----a~y~y~Fe~l~~~ 382 (511) T pfam09665 326 SNDETAKRLSKLLRDNRDTYMR-ILWYILSNG-----HAFFLATFLKP------------EY-----ALYDYTFEGLIIS 382 (511) T ss_pred CCHHHHHHHHHHHHHHHHHHHH-HHHHHHHCC-----HHHHHHHHCCC------------HH-----HHCCCEECCCCCC T ss_conf 7828899999999985899999-999999654-----49999997070------------33-----2011011255367 Q ss_pred CC---EEEEEEEEC-------CEEEEEEEEHHHHEEEECCCCCEEEEECC Q ss_conf 54---799999849-------87668998442420230677530356034 Q gi|254780955|r 103 DN---HFEVGLSFS-------NVPERLVIPFNAIKGFYDPSVNFELEFDV 142 (204) Q Consensus 103 ~~---~FsV~LsF~-------g~~e~L~IPf~AIt~F~DPSV~FgLqF~~ 142 (204) +. .-++.-++. .+.-++.|-|.|+..|+.---.=++.+.. T Consensus 383 ~~~~~~~~i~~~~~~T~~~~~~~aR~iRIAFesL~dY~~KEnRn~~~v~~ 432 (511) T pfam09665 383 NHLGQIKSIYKDRRLTKYADSQKARRIRIAFESLKDYNSKENRNAKEVLT 432 (511) T ss_pred CCCEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCC T ss_conf 76202432215730455667876568777799999888664110465334 No 36 >pfam00073 Rhv picornavirus capsid protein. CAUTION: This alignment is very weak. It can not be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognized. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. 
They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur. Probab=22.44 E-value=48 Score=14.67 Aligned_cols=42 Identities=29% Similarity=0.505 Sum_probs=26.1 Q ss_pred CCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEEEEECCEEEEEEEEHHHHEEEEC Q ss_conf 887759899963975379984311218076454799999849876689984424202306 Q gi|254780955|r 72 RGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVGLSFSNVPERLVIPFNAIKGFYD 131 (204) Q Consensus 72 ~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~LsF~g~~e~L~IPf~AIt~F~D 131 (204) ||...|... ++.+ +--||||||.- +-..+|+|||=++..+.. T Consensus 111 ~g~~~p~~~-----~qa~-~~ph~~~~~~~------------n~~~~~~vPyis~~~~~~ 152 (173) T pfam00073 111 PGAPKPTSR-----WQAT-LNPHQFWNLGT------------NSSARLSVPYVSIAPAYS 152 (173) T ss_pred CCCCCCCCH-----HHHH-CCCEEEEECCC------------CCEEEEEECCCCCCCCCC T ss_conf 888899898-----9961-08668894798------------985899976161657666 No 37 >KOG3913 consensus Probab=21.42 E-value=28 Score=16.15 Aligned_cols=94 Identities=13% Similarity=0.129 Sum_probs=60.3 Q ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCEEEEEEEEEECCEEECCCEEEEEEEECCEE Q ss_conf 99999999999999997089998746999984588887759899963975379984311218076454799999849876 Q gi|254780955|r 37 KEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQNLRKNYPEKMTIVIQNQFWDLKVLDNHFEVGLSFSNVP 116 (204) Q Consensus 37 ~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~~L~~~yP~emTIVlQhqf~dL~V~~~~FsV~LsF~g~~ 116 (204) ++|=|.+|++.|+.--|=-|+.|.--+=--.+.-.+==.+.+.||+||-.-..+..-..=.. ....+.... 
T Consensus 194 NeaGR~av~~~m~~~CKCHGvSGSC~~KTCW~~lp~Fr~vG~~Lk~KYd~A~~V~~~~~~~~---------~~~~~~~~~ 264 (356) T KOG3913 194 NEAGRKAVKKNMRRECKCHGVSGSCTVKTCWKQLPDFREVGDYLKEKYDGAIKVTVNNRGRR---------SAPALRPEK 264 (356) T ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHCCCHHHHHHHHHHHHHHHEEEEECCCCCC---------CCCCCCCCC T ss_conf 49899999986542364167155215364986576689999999998641168765267766---------665445555 Q ss_pred EEEEEEHHHHEEEECCCCCEEEE Q ss_conf 68998442420230677530356 Q gi|254780955|r 117 ERLVIPFNAIKGFYDPSVNFELE 139 (204) Q Consensus 117 e~L~IPf~AIt~F~DPSV~FgLq 139 (204) ++...|-..=..|.|+|-+|=.. T Consensus 265 ~~~~~~~~~dLVYle~SPdfC~~ 287 (356) T KOG3913 265 PRFKPPTETDLVYLEDSPDYCER 287 (356) T ss_pred CCCCCCCCCCEEEECCCCHHHCC T ss_conf 55689998742885699714315 No 38 >pfam06748 DUF1217 Protein of unknown function (DUF1217). This family represents a conserved region that is found within bacterial proteins, most of which are hypothetical. Some members contain multiple copies. Probab=21.33 E-value=57 Score=14.18 Aligned_cols=21 Identities=43% Similarity=0.513 Sum_probs=17.6 Q ss_pred CHHHHHHHHHHHHHHHHHHHH Q ss_conf 989999999999999999999 Q gi|254780955|r 31 RYDILAKEALRGLVKVVLSEV 51 (204) Q Consensus 31 ~Y~~l~~~Alr~Vvr~~L~~v 51 (204) =|+.|-..+||.||+.+|..= T Consensus 99 ~~~iL~d~~L~~v~~talgLp 119 (150) T pfam06748 99 AYDILADPALREVALTALGLP 119 (150) T ss_pred HHHHHCCHHHHHHHHHHHCCC T ss_conf 999987888999999992999 No 39 >cd01611 GABARAP GABARAP (GABA-receptor-associated protein) belongs to a large family of proteins that mediate intracellular membrane trafficking and/or fusion. GABARAP binds not only to GABA, type A but also to tubulin, gephyrin, and ULK1. Orthologues of GABARAP include Gate-16 (golgi-associated ATPase enhancer), LC3 (microtubule-associated protein light chain 3), and ATG8 (autophagy protein 8). 
ATG8 is a ubiquitin-like protein that is conjugated to the membrane phospholipid, phosphatidylethanolamine as part of a ubiquitin-like conjugation system essential for autophagosome-formation. Probab=21.16 E-value=57 Score=14.18 Aligned_cols=33 Identities=18% Similarity=0.493 Sum_probs=21.8 Q ss_pred CCHHHHHHCCCEEEEEEEE-EECCE-EECCCEEEE Q ss_conf 5989996397537998431-12180-764547999 Q gi|254780955|r 76 ISQNLRKNYPEKMTIVIQN-QFWDL-KVLDNHFEV 108 (204) Q Consensus 76 ip~~L~~~yP~emTIVlQh-qf~dL-~V~~~~FsV 108 (204) -+.+++++||+-+-||++- .=.+| ..+...|-| T Consensus 13 es~~i~~KyPdrIPVIve~~~~s~lp~ldk~KfLV 47 (112) T cd01611 13 EVERIRAKYPDRIPVIVERYPKSDLPDLDKKKYLV 47 (112) T ss_pred HHHHHHHHCCCCCEEEEEECCCCCCCCCCCCEEEE T ss_conf 99999997899766999987899970026856972 No 40 >TIGR00174 miaA tRNA delta(2)-isopentenylpyrophosphate transferase; InterPro: IPR002627 tRNA isopentenyltransferases 2.5.1.8 from EC also known as tRNA delta(2)-isopentenylpyrophosphate transferases or IPP transferases. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37) .; GO: 0004811 tRNA isopentenyltransferase activity, 0005524 ATP binding, 0008033 tRNA processing. Probab=21.04 E-value=42 Score=15.03 Aligned_cols=42 Identities=21% Similarity=0.514 Sum_probs=27.2 Q ss_pred EEECCCCCCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCCCCC---CCCEEEE Q ss_conf 5541622220221049899999999999999999999708999---8746999 Q gi|254780955|r 16 FIIKWIDTLMNYDHIRYDILAKEALRGLVKVVLSEVASIGSLP---GEHHFYI 65 (204) Q Consensus 16 ~~~~~~~~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g~Lp---G~hHfyI 65 (204) |.|.++|-.-.-..-+|...+.++|+ .+.+.|-+| |+.+||| T Consensus 57 ~l~Dildp~e~y~~~~F~~~~~~~~~--------~i~~~Gkipl~VGGT~lY~ 101 (307) T TIGR00174 57 HLIDILDPSESYSAADFQTQALNAIA--------DITARGKIPLLVGGTGLYL 101 (307) T ss_pred EEEECCCCCCCCCCHHHHHHHHHHHH--------HHHHCCCCEEEECCHHHHH T ss_conf 58513471200370889999999999--------9985698348868578899 No 41 >pfam02576 DUF150 Uncharacterized BCR, YhbC family COG0779. 
Probab=20.84 E-value=65 Score=13.79 Aligned_cols=90 Identities=13% Similarity=0.235 Sum_probs=48.4 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEEEECCCCCCCCCH----HHHHHCCCEEEEEEEE------EECC- Q ss_conf 4989999999999999999999970899987469999845888877598----9996397537998431------1218- Q gi|254780955|r 30 IRYDILAKEALRGLVKVVLSEVASIGSLPGEHHFYITFATNARGVRISQ----NLRKNYPEKMTIVIQN------QFWD- 98 (204) Q Consensus 30 i~Y~~l~~~Alr~Vvr~~L~~v~~~g~LpG~hHfyItF~T~~~gV~ip~----~L~~~yP~emTIVlQh------qf~d- 98 (204) ++-+... .+-..+...|.. ...+|++- +.--..||+.=|= +.+.---+.+.|.+.. +|.. T Consensus 37 v~iddc~--~~Sr~i~~~Ld~---~d~~~~~y----~LEVSSPGi~RpL~~~~~f~~~~G~~v~v~l~~~~~~~k~~~G~ 107 (141) T pfam02576 37 VTLDDCE--EVSRAISALLDV---EDPIPEAY----FLEVSSPGLERPLKTERHFARFIGKLVKVSLKEPIEGRKNFTGK 107 (141) T ss_pred CCHHHHH--HHHHHHHHHHCC---CCCCCCCE----EEEEECCCCCCCCCCHHHHHHHCCCEEEEEEECCCCCEEEEEEE T ss_conf 7899999--999999877512---66667755----99995899998348889999865948999992466993899999 Q ss_pred -EEECCCEEEEEEEECCEEEEEEEEHHHHEE Q ss_conf -076454799999849876689984424202 Q gi|254780955|r 99 -LKVLDNHFEVGLSFSNVPERLVIPFNAIKG 128 (204) Q Consensus 99 -L~V~~~~FsV~LsF~g~~e~L~IPf~AIt~ 128 (204) +.++++.+.+.+.=+...+.+.|||+.|.. T Consensus 108 L~~~~~~~i~l~~~~~~~~~~~~i~~~~I~k 138 (141) T pfam02576 108 LLEVDGDTVTIEVDDKRRKKEVEIPFADIKK 138 (141) T ss_pred EEEEECCEEEEEECCCCCCEEEEEEHHHHHH T ss_conf 9988699999998587122689973799523 No 42 >pfam11094 UL11 Membrane-associated tegument protein. The UL11 gene product of herpes simplex virus is a membrane-associated tegument protein that is incorporated into the HSV virion and functions in viral envelopment. UL11 is acylated which is crucial for lipid raft association. 
Probab=20.72 E-value=46 Score=14.78 Aligned_cols=10 Identities=40% Similarity=0.551 Sum_probs=7.7 Q ss_pred CCCCEEEEEE Q ss_conf 7772898310 Q gi|254780955|r 191 KMASVISLDN 200 (204) Q Consensus 191 ~~aeVVSLD~ 200 (204) .++||||||+ T Consensus 21 ~~GevvsL~a 30 (39) T pfam11094 21 SSGEVVSLDA 30 (39) T ss_pred CCCCEEEECH T ss_conf 5886898723 No 43 >pfam02991 MAP1_LC3 Microtubule associated protein 1A/1B, light chain 3. Light chain 3 is proposed to function primarily as a subunit of microtubule associated proteins 1A and 1B and that its expression may regulate microtubule binding activity. Probab=20.71 E-value=58 Score=14.13 Aligned_cols=32 Identities=25% Similarity=0.624 Sum_probs=20.5 Q ss_pred CHHHHHHCCCEEEEEEEEEE-CCE-EECCCEEEE Q ss_conf 98999639753799843112-180-764547999 Q gi|254780955|r 77 SQNLRKNYPEKMTIVIQNQF-WDL-KVLDNHFEV 108 (204) Q Consensus 77 p~~L~~~yP~emTIVlQhqf-~dL-~V~~~~FsV 108 (204) +.+++++||+-+-||++-.= .+| ..+...|-| T Consensus 6 s~~i~~KyPdriPVIve~~~~~~lp~ldk~KfLV 39 (104) T pfam02991 6 SEKIREKYPDRIPVIIEKASGSDLPDIDKKKYLV 39 (104) T ss_pred HHHHHHHCCCCCEEEEEECCCCCCCCCCCCEEEE T ss_conf 9999987899776999987889872347746872 No 44 >KOG1932 consensus Probab=20.39 E-value=67 Score=13.73 Aligned_cols=94 Identities=19% Similarity=0.180 Sum_probs=55.4 Q ss_pred CCCCCHHHHCCHHHHHHHHHHHHHHHHHHHHHHCC---------CCCCCCEEEEEEECCCCCCCCCHHHH---------- Q ss_conf 22220221049899999999999999999999708---------99987469999845888877598999---------- Q gi|254780955|r 21 IDTLMNYDHIRYDILAKEALRGLVKVVLSEVASIG---------SLPGEHHFYITFATNARGVRISQNLR---------- 81 (204) Q Consensus 21 ~~~~M~~Dli~Y~~l~~~Alr~Vvr~~L~~v~~~g---------~LpG~hHfyItF~T~~~gV~ip~~L~---------- 81 (204) ....|..+.|+-+..+ .|+.++|..+.+.- .++|-.-++|++.-+-.|-.|+--++ T Consensus 455 ~~~~m~~~~i~~e~~~-----q~f~kv~~~~~~~~~k~~~~~Wv~~~g~~~~r~~~~~N~k~~~Ie~~i~Q~v~~~~~A~ 529 (1180) T KOG1932 455 LLQRMSGNRINEELSF-----QVFNKVLELASKMLLKSFFQTWVYGLGVPILRLGQRFNVKGKDIEMGIDQWVRTGGHAP 529 (1180) T ss_pred 
HHHHHHHCCCCCCHHH-----HHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEEEEEECCCCCCHHHHHHHHHCCCCC T ss_conf 8898731323413799-----99999987620567778888777535870699999985035523278887765236554 Q ss_pred --------------------------HHCCCEEEEEEEEEECCEEECCCEEEEEEEECCEEEEEEEEHHHH Q ss_conf --------------------------639753799843112180764547999998498766899844242 Q gi|254780955|r 82 --------------------------KNYPEKMTIVIQNQFWDLKVLDNHFEVGLSFSNVPERLVIPFNAI 126 (204) Q Consensus 82 --------------------------~~yP~emTIVlQhqf~dL~V~~~~FsV~LsF~g~~e~L~IPf~AI 126 (204) ++|+--|||.+|- ++-. |.-+|--.|--.++-||+++= T Consensus 530 ~sv~~~~n~~rna~~~~~~qD~~~g~~~~~GpmtIrv~E------lDGt-feH~lqi~~~~~k~dI~chsK 593 (1180) T KOG1932 530 FSVFSDFNRKRNALEHEIKQDYTAGNEKYTGPMTIRVQE------LDGT-FEHTLQIDGDFTKLDIQCHSK 593 (1180) T ss_pred EEEECCCCHHHHHHHHHCCCCCCCCCCEECCCEEEEEEE------ECCC-CEEEEEECCCCCCCCEEECCC T ss_conf 022222000122332000356567774543534899996------0674-102588557620030152340 No 45 >pfam11688 DUF3285 Protein of unknown function (DUF3285). This family of proteins with unknown function appears to be restricted to Cyanobacteria. Probab=20.08 E-value=63 Score=13.90 Aligned_cols=13 Identities=31% Similarity=0.378 Sum_probs=10.0 Q ss_pred HHHHHHHHHHHHH Q ss_conf 9999999999999 Q gi|254780955|r 35 LAKEALRGLVKVV 47 (204) Q Consensus 35 l~~~Alr~Vvr~~ 47 (204) .+.-||||.||+- T Consensus 7 yvKLAMRNMVRKg 19 (45) T pfam11688 7 YVKLAMRNMVRKG 19 (45) T ss_pred HHHHHHHHHHHHC T ss_conf 9999999999806 No 46 >TIGR01722 MMSDH methylmalonate-semialdehyde dehydrogenase; InterPro: IPR010061 These proteins are involved in valine catabolism, methylmalonate-semialdehyde dehydrogenase catalyzes the irreversible NAD+- and CoA-dependent oxidative decarboxylation of methylmalonate semialdehyde to propionyl-CoA. Methylmalonate-semialdehyde dehydrogenase has been characterised in both prokaryotes , and eukaryotes , functioning as a mammalian tetramer and a bacterial homodimer. 
Although similar in monomeric molecular mass and enzymatic activity, the N-terminal sequence in Pseudomonas aeruginosa does not correspond with the N-terminal sequence predicted for rat liver. Sequence homology to a variety of prokaryotic and eukaryotic aldehyde dehydrogenases places MMSDH in the aldehyde dehydrogenase (NAD+) superfamily making MMSDH's CoA requirement unique among known ALDHs. Methylmalonate semialdehyde dehydrogenase is closely related to betaine aldehyde dehydrogenase, 2-hydroxymuconic semialdehyde dehydrogenase, and class 1 and 2 aldehyde dehydrogenase. In Bacillus, a protein highly homologous to methylmalonic acid semialdehyde dehydrogenase groups out from the main MMSDH clade with Listeria and Sulfolobus. This Bacillus protein has been suggested to be located in an iol operon and/or involved in myo-inositol catabolism, converting malonic semialdehyde to acetyl CoA and CO2. The preceding enzymes responsible for valine catabolism are present in Bacillus, Listeria, and Sulfolobus.; GO: 0004491 methylmalonate-semialdehyde dehydrogenase (acylating) activity, 0006573 valine metabolic process. Probab=20.03 E-value=41 Score=15.11 Aligned_cols=27 Identities=30% Similarity=0.473 Sum_probs=15.2 Q ss_pred CCCCCCCHHH---HCCHHHHHHHHHHHHHH Q ss_conf 1622220221---04989999999999999 Q gi|254780955|r 19 KWIDTLMNYD---HIRYDILAKEALRGLVK 45 (204) Q Consensus 19 ~~~~~~M~~D---li~Y~~l~~~Alr~Vvr 45 (204) .|.+++.++- |+||+.|+.+.+..+=| T Consensus 53 ~W~~ts~~~R~~vllRyqaLlkeh~dEiA~ 82 (478) T TIGR01722 53 AWKETSVAERARVLLRYQALLKEHRDEIAK 82 (478) T ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 102477656679999999998840789999 Done!