Query 029902
Match_columns 185
No_of_seqs 115 out of 134
Neff 3.6
Searched_HMMs 46136
Date Fri Mar 29 05:27:11 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/029902.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/029902hhsearch_cdd -cpu 12 -v 0
 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PF07939 DUF1685: Protein of u   99.9 1.1E-25 2.5E-30  160.9   3.5   49    81-129     1-50 (64)
  2 TIGR03577 EF_0830 conserved hy  31.3     9.7 0.00021   30.4  -1.2   31    79-109   77-109 (115)
  3 PF15178 TOM_sub5: Mitochondri   31.3      55  0.0012   22.9   2.6   22   158-182     8-29 (51)
  4 PF11427 HTH_Tnp_Tc3_1: Tc3 tr   30.4      26 0.00056   24.1   0.9   22     77-98     1-22 (50)
  5 PF14272 Gly_rich_SFCGS: Glyci   29.4     9.3  0.0002   30.5  -1.6   22    89-110   87-110 (115)
  6 PF08338 DUF1731: Domain of un   27.5      25 0.00054   23.6   0.4   19    92-110    28-47 (48)
  7 PF13986 DUF4224: Domain of un   27.4      30 0.00066   23.1   0.8   13     81-93     3-15 (47)
  8 PF08920 SF3b1: Splicing facto   27.2      31 0.00066   28.4   1.0   14     76-89    82-95 (144)
  9 COG4004 Uncharacterized protei  25.7      18 0.00039   28.3  -0.6   26    90-116    19-44 (96)
 10 PF03869 Arc: Arc-like DNA bin   24.0      65  0.0014   21.6   2.0   14   163-176    12-25 (50)
 11 COG3115 ZipA Cell division pro  22.4 1.1E+02  0.0025   28.4   3.8   80    89-179  208-290 (324)
 12 PF05416 Peptidase_C37: Southa   21.9      30 0.00065   33.9   0.0   20     75-94  247-266 (535)
 13 PF03047 ComC: COMC family; I    20.7      33 0.00072   21.7   0.0   14     78-91    11-24 (32)
No 1
>PF07939 DUF1685: Protein of unknown function (DUF1685); InterPro: IPR012881 The members of this family are hypothetical eukaryotic proteins of unknown function. The region in question is approximately 100 amino acid residues long.
Probab=99.91 E-value=1.1e-25 Score=160.89 Aligned_cols=49 Identities=59% Similarity=1.211 Sum_probs=45.3
Q ss_pred CChhhhhhhhhchhcCCCCCCCCc-hhhhccchhhhhhhhhccccccccc
Q 029902 81 LTDEDLDELKGCLDLGFGFSYDEI-PELCNTLPALELCYSMSQKFMDEHQ 129 (185)
Q Consensus 81 LTDlDLEELKGc~DLGFgFs~E~~-PrL~~tLPaL~l~yav~~~~~d~~~ 129 (185)
|||+|||||||||||||||++++. |+||+|||||++|||||+++.+.+.
T Consensus 1 lTd~dldELkGc~dLGFgF~~~~~~p~L~~tlPaL~lyyavn~q~~~~~~ 50 (64)
T PF07939_consen 1 LTDDDLDELKGCIDLGFGFDEEDLDPRLCDTLPALELYYAVNRQYSDHKS 50 (64)
T ss_pred CcHhHHHHHhhhhhhccccCccccChHHHhhhHHHHHHHHHHHHhccccC
Confidence 799999999999999999988774 9999999999999999999987653
No 2
>TIGR03577 EF_0830 conserved hypothetical protein EF_0830/AHA_3911. Members of this family of small (about 120 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown.
Probab=31.34 E-value=9.7 Score=30.38 Aligned_cols=31 Identities=29% Similarity=0.434 Sum_probs=19.8
Q ss_pred CCCChhhhhhhhhchhcCCCC-CCCCc-hhhhc
Q 029902 79 KSLTDEDLDELKGCLDLGFGF-SYDEI-PELCN 109 (185)
Q Consensus 79 KSLTDlDLEELKGc~DLGFgF-s~E~~-PrL~~ 109 (185)
||+-+-----=.||.=||||| +.|++ .+|+.
T Consensus 77 RSveeGvTAi~eG~~VlGFGFmD~EeLG~rlve 109 (115)
T TIGR03577 77 RSVEEGVTAINEGKNVLGFGFMDKEELGKRLTE 109 (115)
T ss_pred cchhhhHHHHhcCCeEEeeccccHHHHHHHHHH
Confidence 444333333346999999999 55665 67754
No 3
>PF15178 TOM_sub5: Mitochondrial import receptor subunit TOM5 homolog
Probab=31.29 E-value=55 Score=22.92 Aligned_cols=22 Identities=32% Similarity=0.671 Sum_probs=16.2
Q ss_pred CCCCChHHHHHHHHHHHHHHhhhhc
Q 029902 158 SPGDHPEDVKARLKYWAQAVACTVR 182 (185)
Q Consensus 158 spGD~p~dmK~rLK~WAqaVAcsVR 182 (185)
.|.-||++||.+.| +-|-++||
T Consensus 8 ~pk~DPeE~k~kmR---~dvissvr 29 (51)
T PF15178_consen 8 GPKMDPEEMKRKMR---EDVISSVR 29 (51)
T ss_pred CCCCCHHHHHHHHH---HHHHHHHH
Confidence 46667999999987 45666665
No 4
>PF11427 HTH_Tnp_Tc3_1: Tc3 transposase; PDB: 1U78_A 1TC3_C.
Probab=30.41 E-value=26 Score=24.12 Aligned_cols=22 Identities=32% Similarity=0.496 Sum_probs=14.5
Q ss_pred ccCCCChhhhhhhhhchhcCCC
Q 029902 77 RTKSLTDEDLDELKGCLDLGFG 98 (185)
Q Consensus 77 rsKSLTDlDLEELKGc~DLGFg 98 (185)
|.+.|||.|--.|-+..+|||-
T Consensus 1 RG~~Lt~~Eqaqid~m~qlG~s 22 (50)
T PF11427_consen 1 RGKTLTDAEQAQIDVMHQLGMS 22 (50)
T ss_dssp -S----HHHHHHHHHHHHTT--
T ss_pred CCCcCCHHHHHHHHHHHHhchh
Confidence 5789999999999999999994
No 5
>PF14272 Gly_rich_SFCGS: Glycine-rich SFCGS
Probab=29.37 E-value=9.3 Score=30.48 Aligned_cols=22 Identities=36% Similarity=0.617 Sum_probs=16.3
Q ss_pred hhhchhcCCCC-CCCCc-hhhhcc
Q 029902 89 LKGCLDLGFGF-SYDEI-PELCNT 110 (185)
Q Consensus 89 LKGc~DLGFgF-s~E~~-PrL~~t 110 (185)
=.||.=||||| +.|++ .||+..
T Consensus 87 ~~G~~VlGFGFmD~EeLG~rlve~ 110 (115)
T PF14272_consen 87 NEGKKVLGFGFMDKEELGRRLVEA 110 (115)
T ss_pred HcCCeEEeeccccHHHHHHHHHHH
Confidence 35899999999 55665 677653
No 6
>PF08338 DUF1731: Domain of unknown function (DUF1731); InterPro: IPR013549 This domain of unknown function appears towards the C terminus of proteins of the NAD dependent epimerase/dehydratase family (IPR001509 from INTERPRO) in bacteria, eukaryotes and archaea. Many of the proteins in which it is found are involved in cell-division inhibition. ; PDB: 3OH8_A.
Probab=27.46 E-value=25 Score=23.57 Aligned_cols=19 Identities=32% Similarity=0.621 Sum_probs=9.3
Q ss_pred chhcCCCCCCCCc-hhhhcc
Q 029902 92 CLDLGFGFSYDEI-PELCNT 110 (185)
Q Consensus 92 c~DLGFgFs~E~~-PrL~~t 110 (185)
..+.||.|.+.++ .-|.++
T Consensus 28 L~~~GF~F~~p~l~~AL~~l 47 (48)
T PF08338_consen 28 LLEAGFQFRYPTLEEALRDL 47 (48)
T ss_dssp HHHTT---S-SSHHHHHHH-
T ss_pred HHHCCCcccCCCHHHHHhcc
Confidence 4688999999886 444443
No 7
>PF13986 DUF4224: Domain of unknown function (DUF4224)
Probab=27.45 E-value=30 Score=23.14 Aligned_cols=13 Identities=62% Similarity=0.874 Sum_probs=11.6
Q ss_pred CChhhhhhhhhch
Q 029902 81 LTDEDLDELKGCL 93 (185)
Q Consensus 81 LTDlDLEELKGc~ 93 (185)
||++||.||=|..
T Consensus 3 LT~~El~elTG~k 15 (47)
T PF13986_consen 3 LTDEELQELTGYK 15 (47)
T ss_pred CCHHHHHHHHCCC
Confidence 8999999998865
No 8
>PF08920 SF3b1: Splicing factor 3B subunit 1; InterPro: IPR015016 This group of proteins consists of several eukaryotic splicing factor 3B subunit 1 proteins, which associate with p14 through a C terminus beta-strand that interacts with beta-3 of the p14 RNA recognition motif (RRM) beta-sheet, which is in turn connected to an alpha-helix by a loop that makes extensive contacts with both the shorter C-terminal helix and RRM of p14. This subunit is required for 'A' splicing complex assembly (formed by the stable binding of U2 snRNP to the branchpoint sequence in pre-mRNA) and 'E' splicing complex assembly []. ; PDB: 2FHO_A 3LQV_P 2PEH_D 2F9J_P 2F9D_Q.
Probab=27.16 E-value=31 Score=28.36 Aligned_cols=14 Identities=50% Similarity=0.916 Sum_probs=10.4
Q ss_pred cccCCCChhhhhhh
Q 029902 76 KRTKSLTDEDLDEL 89 (185)
Q Consensus 76 rrsKSLTDlDLEEL 89 (185)
.|.|-|||+|||.|
T Consensus 82 ~RNrpLTDEELD~m 95 (144)
T PF08920_consen 82 ERNRPLTDEELDAM 95 (144)
T ss_dssp HCTS-S-HHHHHHT
T ss_pred hccCcCCHHHHHHh
Confidence 46788999999987
No 9
>COG4004 Uncharacterized protein conserved in archaea [Function unknown]
Probab=25.70 E-value=18 Score=28.29 Aligned_cols=26 Identities=19% Similarity=0.391 Sum_probs=21.9
Q ss_pred hhchhcCCCCCCCCchhhhccchhhhh
Q 029902 90 KGCLDLGFGFSYDEIPELCNTLPALEL 116 (185)
Q Consensus 90 KGc~DLGFgFs~E~~PrL~~tLPaL~l 116 (185)
+|.-+|||+|+.+-+ +++..+||.++
T Consensus 19 ~~l~e~g~~v~~eGD-~ivas~pgis~ 44 (96)
T COG4004 19 RGLSELGWTVSEEGD-RIVASSPGISR 44 (96)
T ss_pred HHHHHhCeeEeeccc-EEEEecCCceE
Confidence 567789999988755 99999999986
No 10
>PF03869 Arc: Arc-like DNA binding domain; InterPro: IPR005569 Arc repressor act by the cooperative binding of two Arc repressor dimers to a 21-base-pair operator site. Each Arc dimer uses an antiparallel beta-sheet to recognise bases in the major groove [].; GO: 0003677 DNA binding; PDB: 3QOQ_D 1MNT_B 1QTG_B 1BDV_A 1PAR_C 1BDT_C 1ARR_B 1MYL_F 1MYK_A 1NLA_B ....
Probab=23.96 E-value=65 Score=21.64 Aligned_cols=14 Identities=43% Similarity=0.681 Sum_probs=12.6
Q ss_pred hHHHHHHHHHHHHH
Q 029902 163 PEDVKARLKYWAQA 176 (185)
Q Consensus 163 p~dmK~rLK~WAqa 176 (185)
|.+||++||.+|..
T Consensus 12 P~~l~~~lk~~A~~ 25 (50)
T PF03869_consen 12 PEELKEKLKERAEE 25 (50)
T ss_dssp EHHHHHHHHHHHHH
T ss_pred CHHHHHHHHHHHHH
Confidence 78999999999975
No 11
>COG3115 ZipA Cell division protein [Cell division and chromosome partitioning]
Probab=22.38 E-value=1.1e+02 Score=28.45 Aligned_cols=80 Identities=25% Similarity=0.302 Sum_probs=42.4
Q ss_pred hhhchhcCCCCCCCCc-hhhhccchhhhhhhhhccccccccccCCCCCCCCCCCCCCCCCCCC-CC-CccccCCCCChHH
Q 029902 89 LKGCLDLGFGFSYDEI-PELCNTLPALELCYSMSQKFMDEHQSQKSPESHGNSPVSTDPVSSP-IA-NWKISSPGDHPED 165 (185)
Q Consensus 89 LKGc~DLGFgFs~E~~-PrL~~tLPaL~l~yav~~~~~d~~~~~~s~~s~~ssp~s~~~~~sP-l~-nw~I~spGD~p~d 165 (185)
|+..-.+||-|.+.++ -|..+.-+.=..+|+|-+-. .|. .--+...+...-| |. =..+|+|||+-+.
T Consensus 208 lqsi~q~Gf~FG~mnIfHRHl~~sg~gpvLFSvANm~--------kPG--TFd~dnm~dFsT~gIs~FMqLPs~g~~lqn 277 (324)
T COG3115 208 LQSIQQSGFIFGDMNIFHRHLSLSGSGPVLFSVANMV--------KPG--TFDPDNMADFSTPGISFFMQLPSPGDALQN 277 (324)
T ss_pred HHHHHHhCcccccchhheecccccCCCcceeehhhcc--------CCC--CCCccchhhccccceEEEEeCCCCCCHHHH
Confidence 5667789999999876 55544433333334443210 010 0011000011111 11 2678999998888
Q ss_pred HHHHHHHHHHHHhh
Q 029902 166 VKARLKYWAQAVAC 179 (185)
Q Consensus 166 mK~rLK~WAqaVAc 179 (185)
.|.-|+- ||..|-
T Consensus 278 fklMl~a-Aq~iAe 290 (324)
T COG3115 278 FKLMLQA-AQRIAE 290 (324)
T ss_pred HHHHHHH-HHHHHH
Confidence 8888765 666664
No 12
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belong to the MEROPS peptidase family C37, (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=21.95 E-value=30 Score=33.85 Aligned_cols=20 Identities=30% Similarity=0.438 Sum_probs=0.0
Q ss_pred ccccCCCChhhhhhhhhchh
Q 029902 75 LKRTKSLTDEDLDELKGCLD 94 (185)
Q Consensus 75 lrrsKSLTDlDLEELKGc~D 94 (185)
..++|.|||+|+||-|=.++
T Consensus 247 afs~rGLSDEEYDEyKkiRE 266 (535)
T PF05416_consen 247 AFSSRGLSDEEYDEYKKIRE 266 (535)
T ss_dssp --------------------
T ss_pred cccccCCChhHHHHHHHHHH
Confidence 35688899999999987665
No 13
>PF03047 ComC: COMC family; InterPro: IPR004288 Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. Cells that take up DNA inevitably acquire the nucleotides the DNA consists of, and, because nucleotides are needed for DNA and RNA synthesis and are expensive to synthesise, these may make a significant contribution to the cell's energy budget []. The lateral gene transfer caused by competence also contributes to the genetic diversity that makes evolution possible. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions []. This family consists of streptococcal competence stimulating peptide precursors, which are generally up to 50 amino acid residues long. In all the members of this family, the leader sequence is cleaved after two conserved glycine residues; thus the leader sequence is of the double- glycine type []. Competence stimulating peptides (CSP) are small (less than 25 amino acid residues) cationic peptides. The N-terminal amino acid residue is negatively charged, either glutamate or aspartate. The C-terminal end is positively charged. The third residue is also positively charged: a highly conserved arginine []. Some COMC proteins and their precursors (not included in this family) do not fully follow the above description. Functionally, CSP act as pheromones, stimulating competence for genetic transformation in streptococci. 
In streptococci, the (CSP mediated) competence response requires exponential cell growth at a critical density, a relatively simple requirement when compared to the stationary-phase requirement of Haemophilus, or the late-logarithmic- phase of Bacillus []. All bacteria induced to competence by a particular CSP are said to belong to the same pherotype, because each CSP is recognised by a specific receptor (the signalling domain of a histidine kinase ComD). Pherotypes are not necessarily species-specific. In addition, an organism may change pherotype. There are two possible mechanisms for pherotype switching: horizontal gene transfer, and accumulation of point mutations. The biological significance of pherotypes and pherotype switching is not definitively determined. Pherotype switching occurs frequently enough in naturally competent streptococci to suggest that it may be an important contributor to genetic exchange between different bacterial species []. This entry also includes proteins that form bacteriocin-like propetides with a glycine-glycine cleavage site. The bacteriocin is initially formed as a pre-propeptide and upon cleavage at the glycine-glycine cleavage site, a leader peptide and the propeptide would be formed. The propeptide then undergoes posttranslational modification before becoming functional [].; GO: 0005186 pheromone activity; PDB: 2I2J_A 2I2H_A 2A1C_A.
Probab=20.65 E-value=33 Score=21.71 Aligned_cols=14 Identities=43% Similarity=0.812 Sum_probs=0.0
Q ss_pred cCCCChhhhhhhhh
Q 029902 78 TKSLTDEDLDELKG 91 (185)
Q Consensus 78 sKSLTDlDLEELKG 91 (185)
=+-|||.||+++.|
T Consensus 11 F~~lt~~eL~~I~G 24 (32)
T PF03047_consen 11 FEELTEEELQEIQG 24 (32)
T ss_dssp --------------
T ss_pred HhcCCHHHHhhccC
Confidence 45789999999988
Done!