Query         gi|254780886|ref|YP_003065299.1| hypothetical protein CLIBASIA_03915 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 41
No_of_seqs    1 out of 3
Neff          1.0
Searched_HMMs 39220
Date          Mon May 30 00:01:47 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780886.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols  Query HMM Template HMM
  1 pfam11685 DUF3281 Protein of u  41.9      12 0.00031   19.5   1.1   23  3-25      2-24 (269)
  2 PRK09859 multidrug efflux syst  25.6      16  0.0004   19.0  -0.5   23  1-23      1-23 (385)
  3 TIGR01250 pro_imino_pep_2 prol  16.3      58  0.0015   16.5   0.8   16  2-17      129-144 (302)
  4 pfam00802 Glycoprotein_G Pneum  15.3      43  0.0011   17.1  -0.1   14  17-30     16-29 (263)
  5 pfam02459 Adeno_terminal Adeno  13.8      59  0.0015   16.4   0.3   19  16-34     65-83 (548)
  6 KOG4611 consensus               11.6      92  0.0023   15.6   0.7   26  4-29      420-447 (747)
  7 COG4332 Uncharacterized protei  10.3      83  0.0021   15.8   0.1   22  4-29      42-63 (203)
  8 pfam09230 DFF40 DNA fragmentat   8.9      90  0.0023   15.6  -0.1   11  24-34     159-169 (227)
  9 TIGR01607 PST-A Plasmodium sub   8.6      95  0.0024   15.5  -0.1   13  22-34     12-24 (379)
 10 pfam05472 Ter DNA replication    7.9 1.1E+02  0.0027   15.3  -0.1   19  17-35     152-170 (290)

No 1
>pfam11685 DUF3281 Protein of unknown function (DUF3281). This family of bacterial proteins has no known function.
Probab=41.94  E-value=12  Score=19.51  Aligned_cols=23  Identities=52%  Similarity=0.514  Sum_probs=19.5

Q ss_pred             CCCEEEEEHHHHHHHHHCCCCEE
Q ss_conf             64036302010012212031101
Q gi|254780886|r    3 AKGLIVASIISSTAIMSSCSYSW   25 (41)
Q Consensus         3 akglivasiisstaimsscsysw   25 (41)
                      .|-||-+.||||++++.||.-+-
T Consensus         2 kk~lIg~~iis~~~lL~sCgKte   24 (269)
T pfam11685         2 KKLLIGAVIISSTVLLGSCGKSE   24 (269)
T ss_pred             CEEEEEEEEEHHHHHHHHCCCCC
T ss_conf             51699875200467886069865

No 2
>PRK09859 multidrug efflux system protein MdtE; Provisional
Probab=25.59  E-value=16  Score=19.00  Aligned_cols=23  Identities=17%  Similarity=0.473  Sum_probs=20.0

Q ss_pred             CCCCCEEEEEHHHHHHHHHCCCC
Q ss_conf             98640363020100122120311
Q gi|254780886|r    1 MNAKGLIVASIISSTAIMSSCSY   23 (41)
Q Consensus         1 mnakglivasiisstaimsscsy   23 (41)
                      ||.|..+..+++..+++++.|+-
T Consensus         1 m~~k~~~li~ll~~~~lL~gC~~   23 (385)
T PRK09859          1 MNRRRKLLIPLLFCGAMLTACDD   23 (385)
T ss_pred             CCCHHHHHHHHHHHHHHHHCCCC
T ss_conf             98206789999999999953799

No 3
>TIGR01250 pro_imino_pep_2 proline-specific peptidases; InterPro: IPR005945 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases.
Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of serine peptidases belongs to MEROPS peptidase family S33 (clan SC).
They are proline iminopeptidases (prolyl aminopeptidase, EC 3.4.11.5), which catalyze the removal of the N-terminal proline from peptides. This family represents one of two related families of proline iminopeptidase containing the alpha/beta fold. The fine specificities of the various members, including both the range of short peptides from which proline can be removed and whether other amino acids such as alanine can also be removed, may vary among members. One of the members of this family is the tricorn protease (TRI) interacting factor 1 from Thermoplasma acidophilum. Factor 1 (F1) is a 33.5 kDa serine peptidase of the alpha/beta-hydrolase family. Tricorn generates small peptides, which are cleaved by F1 to yield single amino acids. GO: 0016804 prolyl aminopeptidase activity, 0005737 cytoplasm.
Probab=16.32  E-value=58  Score=16.46  Aligned_cols=16  Identities=38%  Similarity=0.634  Sum_probs=12.7

Q ss_pred             CCCCEEEEEHHHHHHH
Q ss_conf             8640363020100122
Q gi|254780886|r    2 NAKGLIVASIISSTAI   17 (41)
Q Consensus         2 nakglivasiisstai   17 (41)
                      +.||||++|-++|-..
T Consensus       129 ~lkglI~ss~~~s~pe  144 (302)
T TIGR01250       129 HLKGLIISSMLDSAPE  144 (302)
T ss_pred             CCEEEEEECCCCCHHH
T ss_conf             8269998556567247

No 4
>pfam00802 Glycoprotein_G Pneumovirus attachment glycoprotein G. This family includes attachment proteins from respiratory syncytial virus. Glycoprotein G has not been shown to have any neuraminidase or hemagglutinin activity. The amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The extracellular region contains four completely conserved cysteine residues.
Probab=15.29  E-value=43  Score=17.05  Aligned_cols=14  Identities=50%  Similarity=0.861  Sum_probs=11.2

Q ss_pred             HHHCCCCEEEHHHH
Q ss_conf             21203110127763
Q gi|254780886|r   17 IMSSCSYSWNLKHA   30 (41)
Q Consensus        17 imsscsyswnlkha   30 (41)
                      +.+||-|..|||--
T Consensus        16 v~~SCLYklNLKSl   29 (263)
T pfam00802        16 VISSCLYKLNLKSL   29 (263)
T ss_pred             EHHHHHHHHHHHHH
T ss_conf             01225656207999

No 5
>pfam02459 Adeno_terminal Adenoviral DNA terminal protein. This protein is covalently attached to the termini of replicating DNA in vivo.
Probab=13.76  E-value=59  Score=16.42  Aligned_cols=19  Identities=37%  Similarity=0.567  Sum_probs=13.1

Q ss_pred             HHHHCCCCEEEHHHHEEEE
Q ss_conf             2212031101277630110
Q gi|254780886|r   16 AIMSSCSYSWNLKHAIRKI   34 (41)
Q Consensus        16 aimsscsyswnlkhairki   34 (41)
                      .+|+.|||+-|...--|-+
T Consensus        65 s~ladCSYtInTgaY~Rfl   83 (548)
T pfam02459        65 SILADCSYTINTGAYHRFI   83 (548)
T ss_pred             EEECCCEEEEECCHHHHHC
T ss_conf             7733640686120356520

No 6
>KOG4611 consensus
Probab=11.57  E-value=92  Score=15.56  Aligned_cols=26  Identities=38%  Similarity=0.733  Sum_probs=14.5

Q ss_pred             CCEEEEEHHHHHHHH--HCCCCEEEHHH
Q ss_conf             403630201001221--20311012776
Q gi|254780886|r    4 KGLIVASIISSTAIM--SSCSYSWNLKH   29 (41)
Q Consensus         4 kglivasiisstaim--sscsyswnlkh   29 (41)
                      |-.|..|||-.....  .-|||||.-..
T Consensus       420 kliialsiiipmslfwsalcsyswgrrq  447 (747)
T KOG4611         420 KLIIALSIIIPMSLFWSALCSYSWGRRQ  447 (747)
T ss_pred             HHEEEEEEHHHHHHHHHHHHCCHHHCCC
T ss_conf             0025411010089999987400443136

No 7
>COG4332 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=10.32  E-value=83  Score=15.75  Aligned_cols=22  Identities=36%  Similarity=0.631  Sum_probs=12.9

Q ss_pred             CCEEEEEHHHHHHHHHCCCCEEEHHH
Q ss_conf             40363020100122120311012776
Q gi|254780886|r    4 KGLIVASIISSTAIMSSCSYSWNLKH   29 (41)
Q Consensus         4 kglivasiisstaimsscsyswnlkh   29 (41)
                      |-|-|-+|.-    -+.|.|+||..-
T Consensus        42 K~LDvWlIYk----C~~Cd~tWN~~I   63 (203)
T COG4332          42 KVLDVWLIYK----CTHCDYTWNISI   63 (203)
T ss_pred             CEEEEEEEEE----EECCCCCCCHHH
T ss_conf             3788999998----504677256103

No 8
>pfam09230 DFF40 DNA fragmentation factor 40 kDa. Members of this family of eukaryotic apoptotic proteins induce DNA fragmentation and chromatin condensation during apoptosis.
Probab=8.85  E-value=90  Score=15.60  Aligned_cols=11  Identities=55%  Similarity=1.069  Sum_probs=8.5

Q ss_pred             EEEHHHHEEEE
Q ss_conf             01277630110
Q gi|254780886|r   24 SWNLKHAIRKI   34 (41)
Q Consensus        24 swnlkhairki   34 (41)
                      .|||.|-|.|-
T Consensus       159 tWNLDH~IEk~  169 (227)
T pfam09230       159 TWNLDHQIEKK  169 (227)
T ss_pred             ECCCCEEEEEH
T ss_conf             31662033101

No 9
>TIGR01607 PST-A Plasmodium subtelomeric family (PST-A); InterPro: IPR006494 These proteins represent a paralogous family of genes found in Plasmodium falciparum and Plasmodium yoelii that are closely related to various phospholipases and lysophospholipases of plants as well as generally being related to the alpha/beta-fold superfamily of hydrolases. These genes are preferentially located in the subtelomeric regions of the chromosomes of both P. falciparum and P. yoelii.
Probab=8.59  E-value=95  Score=15.49  Aligned_cols=13  Identities=54%  Similarity=0.917  Sum_probs=10.0

Q ss_pred             CCEEEHHHHEEEE
Q ss_conf             1101277630110
Q gi|254780886|r   22 SYSWNLKHAIRKI   34 (41)
Q Consensus        22 syswnlkhairki   34 (41)
                      ||||-.|.||-=|
T Consensus        12 tYsWiVKkAiGII   24 (379)
T TIGR01607        12 TYSWIVKKAIGII   24 (379)
T ss_pred             HHHHHHHHHHHHE
T ss_conf             1334222232011

No 10
>pfam05472 Ter DNA replication terminus site-binding protein (Ter protein). This family contains several bacterial Ter proteins. The Ter protein specifically binds to DNA replication terminus sites on the host and plasmid genome and then blocks progress of the DNA replication fork.
Probab=7.90  E-value=1.1e+02  Score=15.30  Aligned_cols=19  Identities=32%  Similarity=0.618  Sum_probs=15.8

Q ss_pred             HHHCCCCEEEHHHHEEEEE
Q ss_conf             2120311012776301101
Q gi|254780886|r   17 IMSSCSYSWNLKHAIRKIE   35 (41)
Q Consensus        17 imsscsyswnlkhairkie   35 (41)
                      -.+|+.++|-.||.|.++.
T Consensus       152 ~p~sVrF~Wa~K~~ik~~t  170 (290)
T pfam05472       152 NPQSIRFGWANKHSIKKLT  170 (290)
T ss_pred             CCCEEEEECCCCCCCCCCC
T ss_conf             8776788714665300146

Done!