Query         gi|254780667|ref|YP_003065080.1| hypothetical protein CLIBASIA_02770 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 43
No_of_seqs    1 out of 3
Neff          1.0
Searched_HMMs 39220
Date          Sun May 29 20:22:20 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780667.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols  Query HMM  Template HMM
  1 PRK00315 potassium-transportin  13.8 1.4E+02  0.0035   14.3   2.5   34       1-38     1-34  (195)
  2 pfam07436 Curto_V3 Curtovirus   12.6   1E+02  0.0026   14.9   1.2   17      13-29    46-62  (87)
  3 TIGR00717 rpsA ribosomal prote  12.3      67  0.0017   15.8   0.2   12       2-13  398-409  (534)
  4 pfam09272 Hepsin-SRCR Hepsin,   12.2      61  0.0015   16.0  -0.1   28      14-43   75-102  (110)
  5 TIGR02174 CXXU_selWTH selT/sel  12.0   1E+02  0.0027   14.9   1.1   20      24-43   86-105  (144)
  6 PRK13592 ubiA prenyltransferas  11.4 1.3E+02  0.0032   14.4   1.4   20      21-40  180-199  (299)
  7 TIGR00630 uvra excinuclease AB  11.3 1.2E+02  0.0031   14.6   1.2   18       6-23  894-911  (956)
  8 PRK02487 hypothetical protein;   8.8 1.6E+02  0.0042   13.9   1.1   33       8-40     5-37  (163)
  9 COG3124 Uncharacterized protei   7.4 2.4E+02  0.0062   13.1   1.7   23       8-30  163-185  (193)
 10 TIGR02247 HAD-1A3-hyp Epoxide    7.1      51  0.0013   16.4  -2.1   11      29-39     4-14  (228)

No 1
>PRK00315 potassium-transporting ATPase subunit C; Reviewed
Probab=13.78  E-value=1.4e+02  Score=14.29  Aligned_cols=34  Identities=24%  Similarity=0.155  Sum_probs=23.5

Q ss_pred             CCCCCCCCCCHHHHHHHHHHHHHHHCCHHHHHHHHHHH
Q ss_conf             93000110220225899999999981137897443233
Q gi|254780667|r    1 MMKGLLHADDIEFRFTAVQRLVFAFYPSAVVWEFGRIL   38 (43)
Q Consensus         1 mmkgllhaddiefrftavqrlvfafypsavvwefgril   38 (43)
                      |||-+..+    +|++.+--+++.+--..++|-+++++
T Consensus         1 M~k~l~~a----l~~~l~~~vl~G~~YPl~vtgiaq~~   34 (195)
T PRK00315          1 MMKGLRPA----LVLFLFLLLITGGAYPLLTTGLGQWW   34 (195)
T ss_pred             CHHHHHHH----HHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_conf             95889999----99999999999999999999999986

No 2
>pfam07436 Curto_V3 Curtovirus V3 protein. This family consists of several Curtovirus V3 proteins of around 90 residues in length.
The function of this family is unknown.
Probab=12.57  E-value=1e+02  Score=14.93  Aligned_cols=17  Identities=35%  Similarity=0.612  Sum_probs=14.1

Q ss_pred             HHHHHHHHHHHHHCCHH
Q ss_conf             25899999999981137
Q gi|254780667|r   13 FRFTAVQRLVFAFYPSA   29 (43)
Q Consensus        13 frftavqrlvfafypsa   29 (43)
                      --|-++|..||+-|||.
T Consensus        46 elf~k~qqvvy~r~~sr   62 (87)
T pfam07436        46 ELFVKLQQVVYTRYPSR   62 (87)
T ss_pred             HHHHHHHHHHHCCCCCC
T ss_conf             99999999874147642

No 3
>TIGR00717 rpsA ribosomal protein S1; InterPro: IPR000110 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains.
In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein S1 contains the S1 domain that has been found in a large number of RNA-associated proteins. S1 is a prominent component of the Escherichia coli ribosome and is most probably required for translation of most, if not all, natural mRNAs in E. coli in vivo. It has been suggested that S1 is an RNA-binding protein helping polynucleotide phosphorylase (PNPase, known to be phylogenetically related to S1) to degrade mRNA, or a helper molecule involved in other RNase activities. Unique among ribosomal proteins, the primary structure of S1 contains four repeating homologous stretches in the central and terminal region of the molecule. S1 is organised into at least two distinct domains: a ribosome-binding domain at the N-terminal region and a nucleic acid-binding domain at the C-terminal region. There may be a flexible region between the two domains permitting free movement of the domains relative to each other.; GO: 0003723 RNA binding, 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome.
Probab=12.27  E-value=67  Score=15.81  Aligned_cols=12  Identities=33%  Similarity=0.916  Sum_probs=7.9

Q ss_pred             CCCCCCCCCHHH
Q ss_conf             300011022022
Q gi|254780667|r    2 MKGLLHADDIEF   13 (43)
Q Consensus         2 mkgllhaddief   13 (43)
                      |+||+|.+||-.
T Consensus       398 idGliH~~D~SW  409 (534)
T TIGR00717       398 IDGLIHLSDLSW  409 (534)
T ss_pred             CCEEEECCCCCC
T ss_conf             223895640153

No 4
>pfam09272 Hepsin-SRCR Hepsin, SRCR. Members of this family form an extracellular domain of the serine protease hepsin. They are formed primarily by three elements of regular secondary structure: a 12-residue alpha helix, a twisted five-stranded antiparallel beta sheet, and a second, two-stranded, antiparallel sheet. The two beta-sheets lie at roughly right angles to each other, with the helix nestled between the two, adopting an SRCR fold. The exact function of this domain has not been identified, though it may serve to orient the protease domain or place it in the vicinity of its substrate.
Probab=12.16  E-value=61  Score=16.01  Aligned_cols=28  Identities=32%  Similarity=0.390  Sum_probs=18.9

Q ss_pred             HHHHHHHHHHHHCCHHHHHHHHHHHHHHCC
Q ss_conf             589999999998113789744323331029
Q gi|254780667|r   14 RFTAVQRLVFAFYPSAVVWEFGRILTATCQ   43 (43)
Q Consensus        14 rftavqrlvfafypsavvwefgriltatcq   43 (43)
                      +.+--|||.-..+|-.  -+-||+|++.||
T Consensus        75 ~L~y~krl~dvlsvCd--Cp~G~fL~~~CQ  102 (110)
T pfam09272        75 ELPYGQRLLTVISVCD--CPRGRFLEAICQ  102 (110)
T ss_pred             CCCHHHHHHHEEEEEC--CCCCCHHHHHHH
T ss_conf             2845561553220303--897512778777

No 5
>TIGR02174 CXXU_selWTH selT/selW/selH selenoprotein domain; InterPro: IPR011893 This is a family found in both bacteria and animals, including the animal proteins SelT, SelW, and SelH, all of which are selenoproteins. These proteins contain a domain with a CXXC motif near the N-terminus, where selenocysteine may replace the second Cys. Proteins with this domain may include an insert of about 70 amino acids.; GO: 0008430 selenium binding, 0045454 cell redox homeostasis.
Probab=12.01  E-value=1e+02  Score=14.86  Aligned_cols=20  Identities=30%  Similarity=0.645  Sum_probs=17.3

Q ss_pred             HHCCHHHHHHHHHHHHHHCC
Q ss_conf             98113789744323331029
Q gi|254780667|r   24 AFYPSAVVWEFGRILTATCQ   43 (43)
Q Consensus        24 afypsavvwefgriltatcq   43 (43)
                      -||-++.||=||..+.++|+
T Consensus        86 r~~~~~~vF~~GN~~e~~L~  105 (144)
T TIGR02174        86 RFYACMMVFFFGNMLESQLS  105 (144)
T ss_pred             HHHHHHHHHHHHHHHHHHHH
T ss_conf             89999999999999998741

No 6
>PRK13592 ubiA prenyltransferase; Provisional
Probab=11.38  E-value=1.3e+02  Score=14.45  Aligned_cols=20  Identities=35%  Similarity=0.787  Sum_probs=14.2

Q ss_pred             HHHHHCCHHHHHHHHHHHHH
Q ss_conf             99998113789744323331
Q gi|254780667|r   21 LVFAFYPSAVVWEFGRILTA   40 (43)
Q Consensus        21 lvfafypsavvwefgrilta   40 (43)
                      +.+..|-++.+||+||-..|
T Consensus       180 ~~~t~yf~gli~EI~RKira  199 (299)
T PRK13592        180 LAFTMYFPSLIWEVCRKIRA  199 (299)
T ss_pred             HHHHHHCCHHHHHHHHHCCC
T ss_conf             99999705444987622279

No 7
>TIGR00630 uvra excinuclease ABC, A subunit; InterPro: IPR004602 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function.
In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release.
In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). During the process of Escherichia coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products. The UvrC protein contains four conserved regions: a central region which interacts with UvrB (Uvr domain), a helix-hairpin-helix (HhH) domain important for 5' incision of damaged DNA, and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases.; GO: 0009381 excinuclease ABC activity, 0006289 nucleotide-excision repair, 0009380 excinuclease repair complex.
Probab=11.34  E-value=1.2e+02  Score=14.56  Aligned_cols=18  Identities=50%  Similarity=0.573  Sum_probs=15.1

Q ss_pred             CCCCCHHHHHHHHHHHHH
Q ss_conf             110220225899999999
Q gi|254780667|r    6 LHADDIEFRFTAVQRLVF   23 (43)
Q Consensus         6 lhaddiefrftavqrlvf   23 (43)
                      ||-|||.--..-+||||-
T Consensus       894 LHf~Di~kLL~VlqrLv~  911 (956)
T TIGR00630       894 LHFDDIKKLLEVLQRLVD  911 (956)
T ss_pred             CHHHHHHHHHHHHHHHHH
T ss_conf             418999999999999985

No 8
>PRK02487 hypothetical protein; Provisional
Probab=8.75  E-value=1.6e+02  Score=13.90  Aligned_cols=33  Identities=24%  Similarity=0.516  Sum_probs=24.7

Q ss_pred             CCCHHHHHHHHHHHHHHHCCHHHHHHHHHHHHH
Q ss_conf             022022589999999998113789744323331
Q gi|254780667|r    8 ADDIEFRFTAVQRLVFAFYPSAVVWEFGRILTA   40 (43)
Q Consensus         8 addiefrftavqrlvfafypsavvwefgrilta   40 (43)
                      ++|++------+.|+|..+-....|+.|.++..
T Consensus         5 ~~~l~~l~~qE~~l~f~~F~~~~A~~LG~~l~~   37 (163)
T PRK02487          5 AQDLAQIIAQEQALQFPHFDNDDAWQLGEIIVE   37 (163)
T ss_pred             HHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHH
T ss_conf             889999999999745788998899999999999

No 9
>COG3124 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=7.36  E-value=2.4e+02  Score=13.05  Aligned_cols=23  Identities=22%  Similarity=0.535  Sum_probs=18.6

Q ss_pred             CCCHHHHHHHHHHHHHHHCCHHH
Q ss_conf             02202258999999999811378
Q gi|254780667|r    8 ADDIEFRFTAVQRLVFAFYPSAV   30 (43)
Q Consensus         8 addiefrftavqrlvfafypsav   30 (43)
                      ..|+|-++++.+.+..+|||.-.
T Consensus       163 w~~l~~~Y~~Lea~F~~fYP~mm  185 (193)
T COG3124         163 WYDLDAHYDALEARFWQFYPRMM  185 (193)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHH
T ss_conf             98999887899999999859999

No 10
>TIGR02247 HAD-1A3-hyp Epoxide hydrolase N-terminal domain-like phosphatase; InterPro: IPR011945 This entry represents a small clade of sequences from the metazoa and bacteria. In eukaryotes, this domain exists as an N-terminal fusion to the soluble epoxide hydrolase enzyme and has recently been shown to be an active phosphatase, although the nature of the biological substrate is unclear. These appear to be members of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases by general homology and the conservation of all of the recognized catalytic motifs (although the first motif is unusual in the replacement of the more common aspartate with glycine). The variable domain is found in between motifs 1 and 2, indicating membership in subfamily I, and phylogeny and prediction of the alpha helical nature of the variable domain indicate membership in subfamily IA.
Probab=7.12  E-value=51  Score=16.40  Aligned_cols=11  Identities=45%  Similarity=1.017  Sum_probs=0.0

Q ss_pred             HHHHHHHHHHH
Q ss_conf             78974432333
Q gi|254780667|r   29 AVVWEFGRILT   39 (43)
Q Consensus        29 avvwefgrilt   39 (43)
                      ||+|.||-+|+
T Consensus         4 AVifD~GGVl~   14 (228)
T TIGR02247         4 AVIFDFGGVLL   14 (228)
T ss_pred             EEEEECCCEEC
T ss_conf             89984386564

Done!