Query         gi|254780400|ref|YP_003064813.1| hypothetical protein CLIBASIA_01425 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 71
No_of_seqs    109 out of 273
Neff          3.2
Searched_HMMs 39220
Date          Sun May 29 15:37:09 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780400.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 COG3908 Uncharacterized protei  99.9 1.5E-26 3.8E-31  182.1   7.2   70    1-70      1-73  (77)
  2 pfam09866 DUF2093 Uncharacteri  99.8 3.2E-22 8.1E-27  156.1   2.9   42   21-62      1-42  (42)
  3 TIGR02890 spore_yteA sporulati  94.2   0.015 3.9E-07   38.0   0.8   23   21-43     82-104 (167)
  4 COG1734 DksA DnaK suppressor p  90.9    0.06 1.5E-06   34.4   0.1   22   22-43     75-96  (120)
  5 KOG3039 consensus               71.6     1.8 4.7E-05   25.5   1.2   26   18-43    174-199 (303)
  6 TIGR02420 dksA RNA polymerase-  64.8     1.7 4.4E-05   25.7  -0.1   25   19-43     72-96  (110)
  7 pfam12230 PRP21_like_P Pre-mRN  59.1       4  0.0001   23.5   0.9   17   26-42    161-177 (223)
  8 PRK10778 DnaK transcriptional   49.1     5.4 0.00014   22.7   0.3   26   19-44    103-128 (151)
  9 PRK10696 C32 tRNA thiolase; Pr  44.6     9.4 0.00024   21.3   0.9   22   17-38     31-52  (311)
 10 COG1096 Predicted RNA-binding   35.7      33 0.00084   18.0   2.6   47    5-51     45-91  (188)
 11 TIGR02027 rpoA DNA-directed RN  34.9      39 0.00099   17.6   2.8   31    8-39    109-139 (324)
 12 KOG2897 consensus               32.9      22 0.00056   19.0   1.3   43   17-64    280-326 (390)
 13 PRK00022 lolB outer membrane l  32.0      43  0.0011   17.3   2.7   46    4-49     84-138 (203)
 14 pfam09224 DUF1961 Domain of un  29.8      49  0.0013   16.9   2.6   31   17-47    162-195 (218)
 15 pfam05477 SURF2 Surfeit locus   28.6      49  0.0012   17.0   2.5   26   16-42     15-42  (244)
 16 pfam11811 DUF3331 Domain of un  28.4      32 0.00081   18.1   1.5   34   13-46     39-72  (96)
 17 COG0689 Rph RNase PH [Translat  27.4      52  0.0013   16.8   2.5   25   12-36     28-52  (230)
 18 pfam10000 DUF2241 Uncharacteri  26.7      45  0.0012   17.1   2.0   15   21-35     17-31  (72)
 19 COG3602 Uncharacterized protei  26.0      41   0.001   17.4   1.7   17   22-38     18-34  (134)
 20 PRK00276 infA translation init  25.0      61  0.0016   16.4   2.4   23   13-35     15-39  (72)
 21 pfam03550 LolB Outer membrane   24.9      12 0.00031   20.6  -1.1   45    5-49     41-94  (156)
 22 TIGR01274 ACC_deam 1-aminocycl  24.9      33 0.00085   18.0   1.1   11   26-36    199-209 (352)
 23 cd05694 S1_Rrp5_repeat_hs2_sc2  24.6      42  0.0011   17.3   1.6   20   17-36     39-58  (74)
 24 PRK10351 holo-(acyl carrier pr  24.1      79   0.002   15.7   3.4   44   16-59     66-114 (185)
 25 pfam08265 YL1_C YL1 nuclear pr  24.0      26 0.00067   18.6   0.4   20   29-48      2-25  (30)
 26 TIGR00601 rad23 UV excision re  23.6      46  0.0012   17.1   1.6   16   22-37     66-81  (453)
 27 cd05704 S1_Rrp5_repeat_hs13 S1  23.3      38 0.00096   17.6   1.1   22   13-34     40-61  (72)
 28 KOG0178 consensus               23.2      44  0.0011   17.2   1.4   15   20-34     64-78  (249)
 29 pfam00500 Late_protein_L1 L1 (  22.9      20 0.00051   19.3  -0.4   24   32-55    436-459 (503)
 30 COG4066 Uncharacterized protei  20.7      54  0.0014   16.7   1.5   38   15-52    124-162 (165)
 31 pfam02329 HDC Histidine carbox  20.6      47  0.0012   17.1   1.1   13   22-34    152-164 (306)
 32 pfam06159 DUF974 Protein of un  20.6      62  0.0016   16.3   1.8   16   18-33     70-86  (233)

No 1
>COG3908 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=99.93  E-value=1.5e-26  Score=182.06  Aligned_cols=70  Identities=49%  Similarity=0.914  Sum_probs=65.9

Q ss_pred             CCCCCC---CCEEEEEECCCCEEEEECCCEEEEEECCCEECHHHCCCCCHHHCCCCCCHHHHHHHHHHHCCCC
Q ss_conf             941237---7669999849980992079889984149670677654277878072259999999999848999
Q gi|254780400|r    1 MYNKVD---ENEASIRYKDGTFEIIRPGTYVVCAITGQRIPLKKLCYWSVDRQVPYANAEASFEAEKISGKIP   70 (71)
Q Consensus         1 m~nkm~---~~~Akl~Y~~~~F~ii~~G~yV~CAVsgk~IpL~~L~YWnVe~QEaY~s~e~~~~r~~~~~kip   70 (71)
                      |||++.   .++|+|+|+||+|+|+++|+||+||||||+||||+|+||||++||||+++.++++|++..+-+|
T Consensus         1 ~mnrfeg~~~~ea~iryldgdf~vv~~GsfV~CAVtgk~IPldeLrYWSvarQEaYv~~a~slere~~a~~~~   73 (77)
T COG3908           1 MMNRFEGPGSREAVIRYLDGDFQVVSPGSFVLCAVTGKPIPLDELRYWSVARQEAYVDAAASLEREREAGPEL   73 (77)
T ss_pred             CCCCCCCCCCCEEEEEEECCCEEEECCCCEEEEEECCCCCCHHHHHHCCHHHHHCCCCHHHHHHHHHHCCCCC
T ss_conf             9644558997626788744856787478679997509965678964134554200256889889887629765

No 2
>pfam09866 DUF2093 Uncharacterized protein conserved in bacteria (DUF2093).
This domain, found in various hypothetical prokaryotic proteins, has no known function.
Probab=99.85  E-value=3.2e-22  Score=156.09  Aligned_cols=42  Identities=52%  Similarity=1.090  Sum_probs=40.5

Q ss_pred             EEECCCEEEEEECCCEECHHHCCCCCHHHCCCCCCHHHHHHH
Q ss_conf             920798899841496706776542778780722599999999
Q gi|254780400|r   21 IIRPGTYVVCAITGQRIPLKKLCYWSVDRQVPYANAEASFEA   62 (71)
Q Consensus        21 ii~~G~yV~CAVsgk~IpL~~L~YWnVe~QEaY~s~e~~~~r   62 (71)
                      |+++||||+|||||++|||++|+|||||+||||+|++++++|
T Consensus         1 ii~~G~~V~CAVsgk~IpL~~L~YWsV~~QEaY~s~~~a~~r   42 (42)
T pfam09866         1 VLSPGSFVLCAVTGEPIPLDELRYWSVERQEAYASAEAALQR   42 (42)
T ss_pred             CCCCCCEEEEEEECCEECHHHCCCCCHHHCCCCCCHHHHHCC
T ss_conf             945799989974098002645474661022034799998459

No 3
>TIGR02890 spore_yteA sporulation protein, yteA family; InterPro: IPR014240 This entry contains predicted regulatory proteins that are found in nearly every species of the endospore-forming bacteria within the Firmicutes (low-GC Gram-positive bacteria), with the exception of Clostridium perfringens. Some (but not all) of these proteins contain an unusual DksA/TraR C4-type zinc finger, where only one of the four key Cys residues is conserved. All members of this entry share an additional C-terminal domain. The function of proteins in this family is unknown. YteA is found in mature spores of Bacillus subtilis and its expression appears to be regulated by sigma-K.
Probab=94.21  E-value=0.015  Score=37.96  Aligned_cols=23  Identities=39%  Similarity=0.810  Sum_probs=20.1

Q ss_pred             EEECCCEEEEEECCCEECHHHCC
Q ss_conf             92079889984149670677654
Q gi|254780400|r   21 IIRPGTYVVCAITGQRIPLKKLC   43 (71)
Q Consensus        21 ii~~G~yV~CAVsgk~IpL~~L~   43 (71)
                      =|+.|+|=+|-|||++||-|-|.
T Consensus        82 ~ie~GtYGICe~cG~~Ip~ERLE  104 (167)
T TIGR02890        82 KIENGTYGICEVCGKPIPYERLE  104 (167)
T ss_pred             HHHCCCCEEECCCCCCCCHHHHH
T ss_conf             98578970004487879844420

No 4
>COG1734 DksA DnaK suppressor protein [Signal transduction mechanisms]
Probab=90.87  E-value=0.06  Score=34.42  Aligned_cols=22  Identities=41%  Similarity=0.733  Sum_probs=20.5

Q ss_pred             EECCCEEEEEECCCEECHHHCC
Q ss_conf             2079889984149670677654
Q gi|254780400|r   22 IRPGTYVVCAITGQRIPLKKLC   43 (71)
Q Consensus        22 i~~G~yV~CAVsgk~IpL~~L~   43 (71)
                      |+.|+|-.|..+|.+||+.-|.
T Consensus        75 Ie~gtYG~Ce~cG~~Ip~~RL~   96 (120)
T COG1734          75 IEEGTYGICEECGEPIPEARLE   96 (120)
T ss_pred             HHCCCCCCHHCCCCCCCHHHHH
T ss_conf             8817861143369968899985

No 5
>KOG3039 consensus
Probab=71.63  E-value=1.8  Score=25.51  Aligned_cols=26  Identities=35%  Similarity=0.699  Sum_probs=22.1

Q ss_pred             CEEEEECCCEEEEEECCCEECHHHCC
Q ss_conf             80992079889984149670677654
Q gi|254780400|r   18 TFEIIRPGTYVVCAITGQRIPLKKLC   43 (71)
Q Consensus        18 ~F~ii~~G~yV~CAVsgk~IpL~~L~   43 (71)
                      .-.+-+|-.+|+|.||||+|-|.+|-
T Consensus       174 atklekP~~~v~CP~s~kplklkdL~  199 (303)
T KOG3039         174 ATKLEKPSTTVVCPVSGKPLKLKDLF  199 (303)
T ss_pred             HHCCCCCCCEEECCCCCCCCCHHHCC
T ss_conf             44015887526556889852010120

No 6
>TIGR02420 dksA RNA polymerase-binding protein DksA; InterPro: IPR012784 This entry describes a small, pleiotropic protein family, DksA (DnaK suppressor A), originally named as a multicopy suppressor of temperature sensitivity of dnaKJ mutants. DksA mutants are defective in quorum sensing, virulence, etc. DksA is now understood to bind RNA polymerase directly and modulate its response to small molecules to control the level of transcription of rRNA. Nearly all members of this family are in the proteobacteria. Whether the closest homologues outside the proteobacteria function equivalently is unknown.
The entry also contains possible DksA proteins from outside the proteobacteria while IPR012783 from INTERPRO describes a closely related family of short sequences usually found in prophage regions of proteobacterial genomes or in known phage.; GO: 0008270 zinc ion binding.
Probab=64.83  E-value=1.7  Score=25.70  Aligned_cols=25  Identities=36%  Similarity=0.623  Sum_probs=21.4

Q ss_pred             EEEEECCCEEEEEECCCEECHHHCC
Q ss_conf             0992079889984149670677654
Q gi|254780400|r   19 FEIIRPGTYVVCAITGQRIPLKKLC   43 (71)
Q Consensus        19 F~ii~~G~yV~CAVsgk~IpL~~L~   43 (71)
                      .+=|+.|+|=.|..+|.+|.|.-|.
T Consensus        72 l~~i~~G~YG~Ce~cGeeIGl~RLe   96 (110)
T TIGR02420        72 LERIEDGDYGYCEECGEEIGLRRLE   96 (110)
T ss_pred             HHHHHCCCCCCCCCCCCCCCCCHHC
T ss_conf             9997458986543678766630002

No 7
>pfam12230 PRP21_like_P Pre-mRNA splicing factor PRP21 like protein. This domain family is found in eukaryotes, and is typically between 212 and 238 amino acids in length. The family is found in association with pfam01805. There are two completely conserved residues (W and H) that may be functionally important. PRP21 is required for assembly of the prespliceosome and it interacts with U2 snRNP and/or pre-mRNA in the prespliceosome. This family also contains proteins similar to PRP21, such as the mammalian SF3a. SF3a also interacts with U2 snRNP from the prespliceosome, converting it to its active form.
Probab=59.07  E-value=4  Score=23.46  Aligned_cols=17  Identities=35%  Similarity=0.765  Sum_probs=15.1

Q ss_pred             CEEEEEECCCEECHHHC
Q ss_conf             88998414967067765
Q gi|254780400|r   26 TYVVCAITGQRIPLKKL   42 (71)
Q Consensus        26 ~yV~CAVsgk~IpL~~L   42 (71)
                      ..++|-+||+.||.+++
T Consensus       161 ~~~~cPitGq~IP~~e~  177 (223)
T pfam12230       161 KMIKCPITGELIPEDEM  177 (223)
T ss_pred             CEEECCCCCCCCCHHHH
T ss_conf             86577877781777899

No 8
>PRK10778 DnaK transcriptional regulator DksA; Provisional
Probab=49.11  E-value=5.4  Score=22.69  Aligned_cols=26  Identities=15%  Similarity=0.359  Sum_probs=22.0

Q ss_pred             EEEEECCCEEEEEECCCEECHHHCCC
Q ss_conf             09920798899841496706776542
Q gi|254780400|r   19 FEIIRPGTYVVCAITGQRIPLKKLCY   44 (71)
Q Consensus        19 F~ii~~G~yV~CAVsgk~IpL~~L~Y   44 (71)
                      .+-|+.|+|=.|-.+|.+|++.-|.-
T Consensus       103 L~rI~~g~YG~Ce~cGe~Ig~~RL~A  128 (151)
T PRK10778        103 LKKVEDEDFGYCESCGVEIGIRRLEA  128 (151)
T ss_pred             HHHHHCCCCCCCCCCCCCCCHHHHHC
T ss_conf             99984899875004798526999817

No 9
>PRK10696 C32 tRNA thiolase; Provisional
Probab=44.57  E-value=9.4  Score=21.25  Aligned_cols=22  Identities=18%  Similarity=0.516  Sum_probs=19.6

Q ss_pred             CCEEEEECCCEEEEEECCCEEC
Q ss_conf             9809920798899841496706
Q gi|254780400|r   17 GTFEIIRPGTYVVCAITGQRIP   38 (71)
Q Consensus        17 ~~F~ii~~G~yV~CAVsgk~Ip   38 (71)
                      .+|..|++||-|..+|||-+=.
T Consensus        31 ~dy~MIedGDRVlVglSGGKDS   52 (311)
T PRK10696         31 ADFNMIEEGDRIMVCLSGGKDS   52 (311)
T ss_pred             HHHCCCCCCCEEEEECCCCHHH
T ss_conf             9858778999999982678889

No 10
>COG1096 Predicted RNA-binding protein (consists of S1 domain and a Zn-ribbon domain) [Translation, ribosomal structure and biogenesis]
Probab=35.72  E-value=33  Score=17.98  Aligned_cols=47  Identities=15%  Similarity=0.053  Sum_probs=37.4

Q ss_pred             CCCCEEEEEECCCCEEEEECCCEEEEEECCCEECHHHCCCCCHHHCC
Q ss_conf             37766999984998099207988998414967067765427787807
Q gi|254780400|r    5 VDENEASIRYKDGTFEIIRPGTYVVCAITGQRIPLKKLCYWSVDRQV   51 (71)
Q Consensus         5 m~~~~Akl~Y~~~~F~ii~~G~yV~CAVsgk~IpL~~L~YWnVe~QE   51 (71)
                      |.+..+.++=.-..|.+.++||.|.|-||+.+-..-.++--.|+.++
T Consensus        45 ~~n~~~~V~p~~~~~~~~K~GdiV~grV~~v~~~~a~V~i~~ve~~~   91 (188)
T COG1096          45 DKNRVISVKPGKKTPPLPKGGDIVYGRVTDVREQRALVRIVGVEGKE   91 (188)
T ss_pred             CCCEEEEECCCCCCCCCCCCCCEEEEEEEECCCCEEEEEEEEEECCC
T ss_conf             36449995247777776898879999995326608999999994333

No 11
>TIGR02027 rpoA DNA-directed RNA polymerase, alpha subunit; InterPro: IPR011773 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length.
The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kD, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family consists of the bacterial (and chloroplast) DNA-directed RNA polymerase alpha subunit, encoded by the rpoA gene. The RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. The amino terminal domain is involved in dimerizing and assembling the other RNA polymerase subunits into a transcriptionally active enzyme. The carboxy-terminal domain contains determinants for interaction with DNA and with transcriptional activator proteins.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription DNA-dependent.
Probab=34.94  E-value=39  Score=17.56  Aligned_cols=31  Identities=23%  Similarity=0.495  Sum_probs=24.6

Q ss_pred             CEEEEEECCCCEEEEECCCEEEEEECCCEECH
Q ss_conf             66999984998099207988998414967067
Q gi|254780400|r    8 NEASIRYKDGTFEIIRPGTYVVCAITGQRIPL   39 (71)
Q Consensus         8 ~~Akl~Y~~~~F~ii~~G~yV~CAVsgk~IpL   39 (71)
                      ..+.++..+++|||+.| |+|+|-++...+.|
T Consensus       109 ~A~Di~~~~~~~EvvNp-dl~Iatl~~~Na~l  139 (324)
T TIGR02027       109 TAGDIKAQPGGVEVVNP-DLVIATLTEDNAKL  139 (324)
T ss_pred             EEEEEECCCCCEEEECC-CCEEEEECCCCEEE
T ss_conf             52200137996468788-85266740797279

No 12
>KOG2897 consensus
Probab=32.85  E-value=22  Score=19.05  Aligned_cols=43  Identities=33%  Similarity=0.492  Sum_probs=29.7

Q ss_pred             CCEEEEECCCE---EEEEECCCEECHHHCCCCCHHHCCCCCCHHHH-HHHHH
Q ss_conf             98099207988---99841496706776542778780722599999-99999
Q gi|254780400|r   17 GTFEIIRPGTY---VVCAITGQRIPLKKLCYWSVDRQVPYANAEAS-FEAEK   64 (71)
Q Consensus        17 ~~F~ii~~G~y---V~CAVsgk~IpL~~L~YWnVe~QEaY~s~e~~-~~r~~   64 (71)
                      +.|....++..   |+|+|||.+-     +|-..--|.+|+++.|. ..|+.
T Consensus       280 s~~~~~~~p~~~~~~~C~iTg~PA-----~Y~DPVT~lPy~ta~AFKviRe~  326 (390)
T KOG2897         280 SEFPTKSPPKPRERVVCVITGRPA-----RYLDPVTGLPYSTAQAFKVIRER  326 (390)
T ss_pred             HCCCCCCCCCCCCCCCCCCCCCCC-----CCCCCCCCCCCHHHHHHHHHHHH
T ss_conf             027756899874234034337754-----02586557752258999999999

No 13
>PRK00022 lolB outer membrane lipoprotein LolB; Provisional
Probab=32.03  E-value=43  Score=17.26  Aligned_cols=46  Identities=20%  Similarity=0.289  Sum_probs=31.1

Q ss_pred             CCCCCEEEEEECCCCEEEEEC-CCEE--------EEEECCCEECHHHCCCCCHHH
Q ss_conf             237766999984998099207-9889--------984149670677654277878
Q gi|254780400|r    4 KVDENEASIRYKDGTFEIIRP-GTYV--------VCAITGQRIPLKKLCYWSVDR   49 (71)
Q Consensus         4 km~~~~Akl~Y~~~~F~ii~~-G~yV--------~CAVsgk~IpL~~L~YWnVe~   49 (71)
                      -++.+.+.|.=.++...+... |...        +=.++|=.||++.|+||=.-+
T Consensus        84 pLG~~~~~i~~~~~~~~L~~~~g~~~~a~~~e~Ll~~~lG~~lPv~~L~~Wl~G~  138 (203)
T PRK00022         84 PLGSTELELTGRPGGATLEDNNGQRYTADDAEELLQELTGWSLPLSGLRDWLRGL  138 (203)
T ss_pred             CCCCEEEEEEECCCEEEEEECCCCEEECCCHHHHHHHHHCCCCCHHHHHHHHHCC
T ss_conf             2665499999879979999799988867999999999878846378889998089

No 14
>pfam09224 DUF1961 Domain of unknown function (DUF1961). Members of this family are found in a set of hypothetical bacterial proteins. Their exact function has not, as yet, been determined.
Probab=29.76  E-value=49  Score=16.94  Aligned_cols=31  Identities=29%  Similarity=0.354  Sum_probs=23.1

Q ss_pred             CCEEEEECCCEEEEEECCCEEC---HHHCCCCCH
Q ss_conf             9809920798899841496706---776542778
Q gi|254780400|r   17 GTFEIIRPGTYVVCAITGQRIP---LKKLCYWSV   47 (71)
Q Consensus        17 ~~F~ii~~G~yV~CAVsgk~Ip---L~~L~YWnV   47 (71)
                      -..++++.|.||.|+|-|-+|-   =|..+||-|
T Consensus       162 ~r~~lvKdg~~V~f~In~l~i~~W~Dd~~~~Gpv  195 (218)
T pfam09224       162 YRMKLIKDGPYVAFSINGLPILEWTDDGDRYGPV  195 (218)
T ss_pred             EEEEEEECCCEEEEEECCEEEEEEECCCCCCCCC
T ss_conf             4799984288269997990788873588630752

No 15
>pfam05477 SURF2 Surfeit locus protein 2 (SURF2). Surfeit locus protein 2 is part of a group of at least six sequence unrelated genes (Surf-1 to Surf-6). The six Surfeit genes have been classified as housekeeping genes, being expressed in all tissue types tested and not containing a TATA box in their promoter region. The exact function of SURF2 is unknown.
Probab=28.58  E-value=49  Score=16.96  Aligned_cols=26  Identities=31%  Similarity=0.645  Sum_probs=20.5

Q ss_pred             CCCEEEEECCCEEEEEECCCEEC--HHHC
Q ss_conf             99809920798899841496706--7765
Q gi|254780400|r   16 DGTFEIIRPGTYVVCAITGQRIP--LKKL   42 (71)
Q Consensus        16 ~~~F~ii~~G~yV~CAVsgk~Ip--L~~L   42 (71)
                      +..|+++ .|.-|.|..||-.||  |.+|
T Consensus        15 hP~l~l~-~~~kvrC~LTgHElP~rl~el   42 (244)
T pfam05477        15 HPFLELV-ENGKVRCVLTGHELPCRLPEL   42 (244)
T ss_pred             CCCEEEC-CCCEEEEEECCCCCCCCHHHH
T ss_conf             9961305-898067762487578874899

No 16
>pfam11811 DUF3331 Domain of unknown function (DUF3331). This family of proteins are functionally uncharacterized. This family is only found in bacteria. Proteins in this family vary in length from 96 to 160 amino acids.
Probab=28.40  E-value=32  Score=18.08  Aligned_cols=34  Identities=21%  Similarity=0.419  Sum_probs=23.6

Q ss_pred             EECCCCEEEEECCCEEEEEECCCEECHHHCCCCC
Q ss_conf             9849980992079889984149670677654277
Q gi|254780400|r   13 RYKDGTFEIIRPGTYVVCAITGQRIPLKKLCYWS   46 (71)
Q Consensus        13 ~Y~~~~F~ii~~G~yV~CAVsgk~IpL~~L~YWn   46 (71)
                      +|++-....-..-.-=+||+||.+|--.|.-|=-
T Consensus        39 ~YgeQ~W~~~~Ar~~G~CaLSG~~I~~GD~VyrP   72 (96)
T pfam11811        39 HYGEQRWRLARARRRGRCALSGRPIRRGDAVYRP   72 (96)
T ss_pred             CCCCCEEEEEECCCCCEEECCCCCCCCCCCEECC
T ss_conf             3265269987558586785659811389820688

No 17
>COG0689 Rph RNase PH [Translation, ribosomal structure and biogenesis]
Probab=27.43  E-value=52  Score=16.78  Aligned_cols=25  Identities=28%  Similarity=0.530  Sum_probs=19.4

Q ss_pred             EEECCCCEEEEECCCEEEEEECCCE
Q ss_conf             9984998099207988998414967
Q gi|254780400|r   12 IRYKDGTFEIIRPGTYVVCAITGQR   36 (71)
Q Consensus        12 l~Y~~~~F~ii~~G~yV~CAVsgk~   36 (71)
                      ++.-+|+-.+---.+.|+|+|||-.
T Consensus        28 ~~~a~GS~~~~~G~tkVic~vsGp~   52 (230)
T COG0689          28 LKHAEGSSLIEFGNTKVICTVSGPR   52 (230)
T ss_pred             CCCCCCCEEEEECCEEEEEEEECCC
T ss_conf             4678853799967808999970677

No 18
>pfam10000 DUF2241 Uncharacterized protein conserved in bacteria (DUF2241).
This domain, found in various hypothetical bacterial proteins, has no known function.
Probab=26.68  E-value=45  Score=17.15  Aligned_cols=15  Identities=33%  Similarity=0.882  Sum_probs=12.5

Q ss_pred             EEECCCEEEEEECCC
Q ss_conf             920798899841496
Q gi|254780400|r   21 IIRPGTYVVCAITGQ   35 (71)
Q Consensus        21 ii~~G~yV~CAVsgk   35 (71)
                      ++.+|+||-|.+.+.
T Consensus        17 ~L~~~~yVF~t~~~~   31 (72)
T pfam10000        17 ELDDGEYVFCTVPGD   31 (72)
T ss_pred             EECCCCEEEEEECCC
T ss_conf             778996899996786

No 19
>COG3602 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=26.01  E-value=41  Score=17.44  Aligned_cols=17  Identities=24%  Similarity=0.647  Sum_probs=12.9

Q ss_pred             EECCCEEEEEECCCEEC
Q ss_conf             20798899841496706
Q gi|254780400|r   22 IRPGTYVVCAITGQRIP   38 (71)
Q Consensus        22 i~~G~yV~CAVsgk~Ip   38 (71)
                      +-+||||.|-|.+-..+
T Consensus        18 L~~G~yVfcT~~~ga~~   34 (134)
T COG3602          18 LLDGDYVFCTVAPGALQ   34 (134)
T ss_pred             CCCCCEEEEEECCCCCC
T ss_conf             05896699984477679

No 20
>PRK00276 infA translation initiation factor IF-1; Validated
Probab=25.00  E-value=61  Score=16.38  Aligned_cols=23  Identities=22%  Similarity=0.443  Sum_probs=16.5

Q ss_pred             EECC-CCEEE-EECCCEEEEEECCC
Q ss_conf             9849-98099-20798899841496
Q gi|254780400|r   13 RYKD-GTFEI-IRPGTYVVCAITGQ   35 (71)
Q Consensus        13 ~Y~~-~~F~i-i~~G~yV~CAVsgk   35 (71)
                      +-+| +.|+| ++.|.-|+|-+|||
T Consensus        15 e~lpn~~F~V~Leng~~v~a~~sGK   39 (72)
T PRK00276         15 ETLPNAMFRVELENGHEVLAHISGK   39 (72)
T ss_pred             EECCCCEEEEEECCCCEEEEEECHH
T ss_conf             9859988999978999999997413

No 21
>pfam03550 LolB Outer membrane lipoprotein LolB.
Probab=24.93  E-value=12  Score=20.57  Aligned_cols=45  Identities=20%  Similarity=0.316  Sum_probs=30.1

Q ss_pred             CCCCEEEEEECCCCEEEEE-CCCEE--------EEEECCCEECHHHCCCCCHHH
Q ss_conf             3776699998499809920-79889--------984149670677654277878
Q gi|254780400|r    5 VDENEASIRYKDGTFEIIR-PGTYV--------VCAITGQRIPLKKLCYWSVDR   49 (71)
Q Consensus         5 m~~~~Akl~Y~~~~F~ii~-~G~yV--------~CAVsgk~IpL~~L~YWnVe~   49 (71)
                      +..+.++|.=.++...+.. .|...        +-.++|=.||++.|+||=--+
T Consensus        41 lG~~~~~i~~~~~~~~L~~~dg~~~~a~~~e~Ll~~~~Gw~lPv~~L~~Wl~G~   94 (156)
T pfam03550        41 LGSTELELEGTPGGATLEDSKGQRYTAADAEELLQELTGWDLPLEQLRDWIRGL   94 (156)
T ss_pred             CCCEEEEEEECCCEEEEEECCCCEEECCCHHHHHHHHHCCCCCHHHHHHHHCCC
T ss_conf             676399999869989999799989966999999999878854388899997289

No 22
>TIGR01274 ACC_deam 1-aminocyclopropane-1-carboxylate deaminase; InterPro: IPR005965 1-aminocyclopropane-1-carboxylate deaminase (3.5.99.7 from EC) is a pyridoxal phosphate-dependent enzyme which degrades 1-aminocyclopropane-1-carboxylate to ammonia and alpha-ketobutyrate. In plants, 1-aminocyclopropane-1-carboxylate is a precursor of the ripening hormone ethylene. This family includes all members of this family for which the function has been demonstrated experimentally, but excludes a closely related family often annotated as putative members of this family.; GO: 0008660 1-aminocyclopropane-1-carboxylate deaminase activity, 0030170 pyridoxal phosphate binding, 0009310 amine catabolic process.
Probab=24.92  E-value=33  Score=17.96  Aligned_cols=11  Identities=45%  Similarity=0.927  Sum_probs=7.9

Q ss_pred             CEEEEEECCCE
Q ss_conf             88998414967
Q gi|254780400|r   26 TYVVCAITGQR   36 (71)
Q Consensus        26 ~yV~CAVsgk~   36 (71)
                      .-|+|+|||-.
T Consensus       199 ~vvVC~VTGST  209 (352)
T TIGR01274       199 KVVVCSVTGST  209 (352)
T ss_pred             EEEEEEECCCC
T ss_conf             58898533642

No 23
>cd05694 S1_Rrp5_repeat_hs2_sc2 S1_Rrp5_repeat_hs2_sc2: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain.
Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 2 (hs2) and S. cerevisiae S1 repeat 2 (sc2). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=24.65  E-value=42  Score=17.33  Aligned_cols=20  Identities=20%  Similarity=0.680  Sum_probs=16.6

Q ss_pred             CCEEEEECCCEEEEEECCCE
Q ss_conf             98099207988998414967
Q gi|254780400|r   17 GTFEIIRPGTYVVCAITGQR   36 (71)
Q Consensus        17 ~~F~ii~~G~yV~CAVsgk~   36 (71)
                      +++.-+++|..+.|.|++++
T Consensus        39 ~~~~~l~~G~v~~c~V~~v~   58 (74)
T cd05694          39 GNFSKLKVGQLLLCVVEKVK   58 (74)
T ss_pred             CCCCCCCCCCEEEEEEEEEE
T ss_conf             75563348878999999992

No 24
>PRK10351 holo-(acyl carrier protein) synthase 2; Provisional
Probab=24.14  E-value=79  Score=15.72  Aligned_cols=44  Identities=5%  Similarity=-0.058  Sum_probs=31.3

Q ss_pred             CCCEEEEECCCEEEEEECCCE---ECHHHCCCCC--HHHCCCCCCHHHH
Q ss_conf             998099207988998414967---0677654277--8780722599999
Q gi|254780400|r   16 DGTFEIIRPGTYVVCAITGQR---IPLKKLCYWS--VDRQVPYANAEAS   59 (71)
Q Consensus        16 ~~~F~ii~~G~yV~CAVsgk~---IpL~~L~YWn--Ve~QEaY~s~e~~   59 (71)
                      +=.|.+--.||+++||||...   |.||.+|==.  ..+-..|||+.+.
T Consensus        66 pL~FNLSHSgd~~llavS~~~eVGVDIE~iRp~~d~~~LA~rfFS~~E~  114 (185)
T PRK10351         66 PLWFNLSHSGDDIALLLSDEGEVGCDIEVIRPRANWRWLANAVFSLGEH  114 (185)
T ss_pred             CCEEEECCCCCEEEEEEECCCCCCCCHHHHCCCCCHHHHHHHHCCHHHH
T ss_conf             8557520568808999945886442278946666899999986799999

No 25
>pfam08265 YL1_C YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be a transcription factor. This domain is found in proteins that are not YL1 proteins.
Probab=24.04  E-value=26  Score=18.58  Aligned_cols=20  Identities=40%  Similarity=0.614  Sum_probs=13.6

Q ss_pred             EEEECCCEE----CHHHCCCCCHH
Q ss_conf             984149670----67765427787
Q gi|254780400|r   29 VCAITGQRI----PLKKLCYWSVD   48 (71)
Q Consensus        29 ~CAVsgk~I----pL~~L~YWnVe   48 (71)
                      .|.|||.+-    |...|+|-|+|
T Consensus         2 ~C~ITglpA~Y~DP~T~l~Y~n~e   25 (30)
T pfam08265         2 YCDITGLPAKYKDPKTGLPYSNVE   25 (30)
T ss_pred             CCCCCCCCCCCCCCCCCCCCCCHH
T ss_conf             055629844434888798114888

No 26
>TIGR00601 rad23 UV excision repair protein Rad23; InterPro: IPR004806 All proteins in this family for which functions are known are components of a multiprotein complex used for targeting nucleotide excision repair to specific parts of the genome. Rad23 contains a ubiquitin-like domain that interacts with catalytically active proteasomes and two ubiquitin (Ub)-associated (UBA) sequences that bind Ub. Rad23 interacts with ubiquitinated cellular proteins through the synergistic action of its UBA domains. In humans, Rad23 complexes with the XPC protein.; GO: 0006289 nucleotide-excision repair, 0005634 nucleus.
Probab=23.57  E-value=46  Score=17.11  Aligned_cols=16  Identities=19%  Similarity=0.453  Sum_probs=13.6

Q ss_pred             EECCCEEEEEECCCEE
Q ss_conf             2079889984149670
Q gi|254780400|r   22 IRPGTYVVCAITGQRI   37 (71)
Q Consensus        22 i~~G~yV~CAVsgk~I   37 (71)
                      |+.+|||+|=||.+|=
T Consensus        66 I~E~~FvVvMV~k~K~   81 (453)
T TIGR00601        66 IKEKDFVVVMVSKPKT   81 (453)
T ss_pred             CCCCCEEEEEECCCCC
T ss_conf             6548758998515765

No 27
>cd05704 S1_Rrp5_repeat_hs13 S1_Rrp5_repeat_hs13: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions.
Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 13 (hs13). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=23.29  E-value=38  Score=17.64  Aligned_cols=22  Identities=32%  Similarity=0.694  Sum_probs=17.7

Q ss_pred             EECCCCEEEEECCCEEEEEECC
Q ss_conf             9849980992079889984149
Q gi|254780400|r   13 RYKDGTFEIIRPGTYVVCAITG   34 (71)
Q Consensus        13 ~Y~~~~F~ii~~G~yV~CAVsg   34 (71)
                      .|-++.-+..++|++|+|-|--
T Consensus        40 ~y~~~Pl~~f~~~qiVrc~VLs   61 (72)
T cd05704          40 SYTENPLEGFKPGKIVRCCILS   61 (72)
T ss_pred             CCCCCCHHHCCCCCEEEEEEEE
T ss_conf             4544967765789789999995

No 28
>KOG0178 consensus
Probab=23.25  E-value=44  Score=17.22  Aligned_cols=15  Identities=27%  Similarity=0.696  Sum_probs=11.5

Q ss_pred             EEEECCCEEEEEECC
Q ss_conf             992079889984149
Q gi|254780400|r   20 EIIRPGTYVVCAITG   34 (71)
Q Consensus        20 ~ii~~G~yV~CAVsg   34 (71)
                      .|-.-+|++.|||.|
T Consensus        64 KiY~l~d~iaC~vaG   78 (249)
T KOG0178          64 KIYKLNDNIACAVAG   78 (249)
T ss_pred             HHHHCCCCEEEEEEC
T ss_conf             762047744788723

No 29
>pfam00500 Late_protein_L1 L1 (late) protein.
Probab=22.87  E-value=20  Score=19.30  Aligned_cols=24  Identities=29%  Similarity=0.523  Sum_probs=19.8

Q ss_pred             ECCCEECHHHCCCCCHHHCCCCCC
Q ss_conf             149670677654277878072259
Q gi|254780400|r   32 ITGQRIPLKKLCYWSVDRQVPYAN   55 (71)
Q Consensus        32 Vsgk~IpL~~L~YWnVe~QEaY~s   55 (71)
                      ...+.=|.+.+++|+||+.|-..+
T Consensus       436 p~ek~DPy~~~~FW~VDl~ek~S~  459 (503)
T pfam00500       436 PKEKEDPYKKLKFWEVDLKEKFSL  459 (503)
T ss_pred             CCCCCCCCCCCCEEEECCCCCCCC
T ss_conf             989989766762046547210354

No 30
>COG4066 Uncharacterized protein conserved in archaea [Function unknown]
Probab=20.66  E-value=54  Score=16.68  Aligned_cols=38  Identities=24%  Similarity=0.551  Sum_probs=29.2

Q ss_pred             CCCCEEEEECCCEEEEEECCCEEC-HHHCCCCCHHHCCC
Q ss_conf             499809920798899841496706-77654277878072
Q gi|254780400|r   15 KDGTFEIIRPGTYVVCAITGQRIP-LKKLCYWSVDRQVP   52 (71)
Q Consensus        15 ~~~~F~ii~~G~yV~CAVsgk~Ip-L~~L~YWnVe~QEa   52 (71)
                      .||+|+|.+.|+---|.|-.+++. -..|.=..|..|-+
T Consensus       124 FPGgfkVrkkgnvyYCPVKdkq~n~pgslC~fCva~Qdp  162 (165)
T COG4066         124 FPGGFKVRKKGNVYYCPVKDKQLNQPGSLCEFCVAKQDP  162 (165)
T ss_pred             CCCCEEEEEECCEEECCCCCCCCCCCCCHHHEEECCCCC
T ss_conf             799637986588776556311358985255400003487

No 31
>pfam02329 HDC Histidine carboxylase PI chain. Histidine carboxylase catalyses the formation of histamine from histidine. Cleavage of the proenzyme PI chain yields two subunits, alpha and beta, which arrange as a hexamer (alpha beta)6.
Probab=20.61  E-value=47  Score=17.05  Aligned_cols=13  Identities=46%  Similarity=0.922  Sum_probs=9.8

Q ss_pred             EECCCEEEEEECC
Q ss_conf             2079889984149
Q gi|254780400|r   22 IRPGTYVVCAITG   34 (71)
Q Consensus        22 i~~G~yV~CAVsg   34 (71)
                      .-||.||+||--|
T Consensus       152 ~~PGa~vicAnK~  164 (306)
T pfam02329       152 PAPGSFVVCANKS  164 (306)
T ss_pred             CCCCCEEEECCCC
T ss_conf             9997468851676

No 32
>pfam06159 DUF974 Protein of unknown function (DUF974). Family of uncharacterized eukaryotic proteins.
Probab=20.58  E-value=62  Score=16.33  Aligned_cols=16  Identities=25%  Similarity=0.812  Sum_probs=12.1

Q ss_pred             CEEEEECCCEE-EEEEC
Q ss_conf             80992079889-98414
Q gi|254780400|r   18 TFEIIRPGTYV-VCAIT   33 (71)
Q Consensus        18 ~F~ii~~G~yV-~CAVs   33 (71)
                      +|+|-+.|.|| +|+|+
T Consensus        70 ~~evKE~G~HiLvC~V~   86 (233)
T pfam06159        70 SFDVKELGAHILVCSVS   86 (233)
T ss_pred             EEEECCCCCEEEEEEEE
T ss_conf             98603478779999999

Done!