Query         042838
Match_columns 162
No_of_seqs    126 out of 837
Neff          6.8 
Searched_HMMs 46136
Date          Fri Mar 29 10:03:38 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/042838.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/042838hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF13947 GUB_WAK_bind:  Wall-as 100.0 9.7E-32 2.1E-36  193.2  10.7  103   12-118     1-106 (106)
  2 PF08261 Carcinustatin:  Carcin  49.6       8 0.00017   15.2   0.5    6   25-30      3-8   (8)
  3 PF08685 GON:  GON domain;  Int  31.1 1.9E+02   0.004   23.3   6.0   58   82-140   126-186 (201)
  4 PF05953 Allatostatin:  Allatos  26.0      35 0.00075   14.8   0.6    7   25-31      5-11  (11)
  5 PF14353 CpXC:  CpXC protein     25.7      34 0.00073   24.8   0.9   16   14-30     40-56  (128)
  6 cd00206 snake_toxin Snake toxi  21.0      49  0.0011   21.2   0.8   24  131-158    38-61  (64)
  7 PF09044 Kp4:  Kp4;  InterPro:   19.3      41 0.00089   25.1   0.2   18   13-31     96-113 (128)
  8 PRK08222 hydrogenase 4 subunit  15.6      67  0.0015   24.9   0.7   11   20-30     12-22  (181)
  9 PF02150 RNA_POL_M_15KD:  RNA p  15.2      80  0.0017   17.9   0.8    9   15-24      4-12  (35)
 10 TIGR01206 lysW lysine biosynth  13.4 2.5E+02  0.0054   17.6   2.8   26   14-43      4-35  (54)

No 1  
>PF13947 GUB_WAK_bind:  Wall-associated receptor kinase galacturonan-binding
Probab=99.97  E-value=9.7e-32  Score=193.20  Aligned_cols=103  Identities=42%  Similarity=0.729  Sum_probs=88.9

Q ss_pred             CCcccCCcCCeeeccCCCCCCCCCCCCCCeEEecCCCCCCeeeeecccceEEEEeeecCccceeeeeceecccCCCCCCC
Q 042838           12 KDLCQYDCGNATIRYPFGIGEGCYFDKSFEVICDYSSGSPKAFLASINNLQVLDNHVYGVSNIRVNIPVISLKSSNLTSN   91 (162)
Q Consensus        12 ~~~C~~~CGnv~IpYPFGig~~C~~~~gF~l~C~~~~~~p~l~l~~~~~~~V~~Is~~~~~~v~v~~~~~~~~C~~~~~~   91 (162)
                      +++||++||||+||||||++++|++.++|+|+|++++++|+|++.+. .|||++|+++ +++++|.+++ ++.|+.....
T Consensus         1 ~~~C~~~CGnv~IpYPFgi~~~C~~~~~F~L~C~~~~~~~~l~l~~~-~~~V~~I~~~-~~~i~v~~~~-~~~~~~~~~~   77 (106)
T PF13947_consen    1 KPGCPSSCGNVSIPYPFGIGPGCGRDPGFELTCNNNTSPPKLLLSSG-NYEVLSISYE-NGTIRVSDPI-SSNCYSSSSS   77 (106)
T ss_pred             CCCCCCccCCEeecCCCccCCCCCCCCCcEEECCCCCCCceeEecCC-cEEEEEEecC-CCEEEEEecc-ccceecCCCC
Confidence            68999999999999999999999995599999998877889999765 8999999999 9999999999 7777765443


Q ss_pred             c---cccccCCCCeEecCCCCEEEEEccCc
Q 042838           92 A---EGVSLSVSPFTFSPWDNRFTAIGCNN  118 (162)
Q Consensus        92 ~---~~~~l~~~pF~~S~~~N~f~~~GC~~  118 (162)
                      .   .++++.+ ||.+|+.+|+|+++||++
T Consensus        78 ~~~~~~~~~~~-~~~~s~~~N~~~~~GC~t  106 (106)
T PF13947_consen   78 NSSNSNLSLNG-PFFFSSSSNKFTVVGCNT  106 (106)
T ss_pred             cccccEEeecC-CceEccCCcEEEEECCCC
Confidence            2   2445544 899999999999999985


No 2  
>PF08261 Carcinustatin:  Carcinustatin peptide
Probab=49.56  E-value=8  Score=15.18  Aligned_cols=6  Identities=50%  Similarity=1.315  Sum_probs=4.3

Q ss_pred             ccCCCC
Q 042838           25 RYPFGI   30 (162)
Q Consensus        25 pYPFGi   30 (162)
                      ||-||+
T Consensus         3 py~fgl    8 (8)
T PF08261_consen    3 PYSFGL    8 (8)
T ss_pred             cccccC
Confidence            677774


No 3  
>PF08685 GON:  GON domain;  InterPro: IPR012314 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. The ADAMTSs (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) are a family of zinc-dependent metalloproteinases that play important roles in a variety of normal and pathological conditions. These enzymes show a complex domain organisation including signal sequence, propeptide, metalloproteinase domain (see PDOC50215 from PROSITEDOC), disintegrin-like domain (see PDOC00351 from PROSITEDOC), central TS-1 motif (see PDOC50092 from PROSITEDOC), cysteine-rich region, and a variable number of TS-like repeats at the C-terminal region. The GON domain is an approximately 200-residue module, whose presence is the hallmark of a subfamily of structurally and evolutionarily related ADAMTSs, called GON-ADAMTSs. The GON domain is characterised by the presence of several conserved cysteine residues and is likely to be globular. Proteins known to contain a GON domain include mammalian ADAMTS-9, mammalian ADAMTS-20, and Caenorhabditis elegans gon-1, a protease required for gonadal morphogenesis. Proteins containing the GON domain belong to MEROPS peptidase subfamily M12B (adamalysin, clan MA).; GO: 0004222 metalloendopeptidase activity, 0008270 zinc ion binding
Probab=31.15  E-value=1.9e+02  Score=23.30  Aligned_cols=58  Identities=17%  Similarity=0.242  Sum_probs=35.5

Q ss_pred             cccCCCCCC---CccccccCCCCeEecCCCCEEEEEccCcceeeeecCCCCceeeceEeecC
Q 042838           82 SLKSSNLTS---NAEGVSLSVSPFTFSPWDNRFTAIGCNNYTTIIKRQNDSSVFGGCLSIST  140 (162)
Q Consensus        82 ~~~C~~~~~---~~~~~~l~~~pF~~S~~~N~f~~~GC~~~a~l~~~~~~~~~~~gC~s~C~  140 (162)
                      +.+|+....   ...+++|.|++|.|++ .-++..-|......+.-..+......-|-=+|.
T Consensus       126 AGDCyS~~~CpqG~FsIdL~GTgf~vs~-~~~W~~~G~~a~~~i~~s~~~q~v~g~CGGyCG  186 (201)
T PF08685_consen  126 AGDCYSAARCPQGRFSIDLRGTGFRVSP-DTKWVTQGNYAVGKINRSPDGQKVSGRCGGYCG  186 (201)
T ss_pred             cccccccCCCCCceEEEeeCCCceEecC-CCEEEeCCcEeEEEEEEcCCCcEEEEEeCccCC
Confidence            346766532   1157899999999998 567888898776665321122223333555553


No 4  
>PF05953 Allatostatin:  Allatostatin;  InterPro: IPR010276 This family consists of allatostatins, bombystatins, helicostatins, cydiastatins and schistostatin from several insect species. Allatostatins (ASTs) of the Tyr/Phe-Xaa-Phe-Gly-Leu/Ile-NH2 family are a group of insect neuropeptides that inhibit juvenile hormone biosynthesis by the corpora allata.; GO: 0005184 neuropeptide hormone activity
Probab=26.03  E-value=35  Score=14.78  Aligned_cols=7  Identities=57%  Similarity=1.402  Sum_probs=5.2

Q ss_pred             ccCCCCC
Q 042838           25 RYPFGIG   31 (162)
Q Consensus        25 pYPFGig   31 (162)
                      .|-||+|
T Consensus         5 ~Y~FGLG   11 (11)
T PF05953_consen    5 MYSFGLG   11 (11)
T ss_pred             ccccCcC
Confidence            4888876


No 5  
>PF14353 CpXC:  CpXC protein
Probab=25.70  E-value=34  Score=24.76  Aligned_cols=16  Identities=31%  Similarity=0.883  Sum_probs=12.2

Q ss_pred             cccCCcC-CeeeccCCCC
Q 042838           14 LCQYDCG-NATIRYPFGI   30 (162)
Q Consensus        14 ~C~~~CG-nv~IpYPFGi   30 (162)
                      .|| +|| ...+.|||=.
T Consensus        40 ~CP-~Cg~~~~~~~p~lY   56 (128)
T PF14353_consen   40 TCP-SCGHKFRLEYPLLY   56 (128)
T ss_pred             ECC-CCCCceecCCCEEE
Confidence            488 898 5578888865


No 6  
>cd00206 snake_toxin Snake toxin domain, present in short and long neurotoxins, cytotoxins and short toxins, and in other miscellaneous venom peptides. The toxin acts by binding to the nicotinic acetylcholine receptors in the postsynaptic membrane of skeletal muscles and preventing the binding of acetylcholine, thereby blocking the excitation of muscles. This domain contains 60-75 amino acids that are fixed by 4-5 disulfide bridges and is nearly all beta sheet; it exists as either monomers or dimers.
Probab=20.95  E-value=49  Score=21.22  Aligned_cols=24  Identities=13%  Similarity=0.213  Sum_probs=15.5

Q ss_pred             eeeceEeecCCCCCCCCCcCCCCcccCC
Q 042838          131 VFGGCLSISTCDPALNPGCYDFLCALPQ  158 (162)
Q Consensus       131 ~~~gC~s~C~~~~~~~~~C~G~gCC~~~  158 (162)
                      ..-||+++|+ .... +  ..+-||+++
T Consensus        38 i~rGCa~tCP-~~~~-~--~~v~CC~TD   61 (64)
T cd00206          38 IERGCAATCP-KVKP-G--EYVTCCTTD   61 (64)
T ss_pred             EEccccCcCc-CCCC-C--cceEecCCC
Confidence            3469999999 4432 1  446777765


No 7  
>PF09044 Kp4:  Kp4;  InterPro: IPR015131 Killer toxins are polypeptides secreted by some fungal species that kill sensitive cells of the same or related species, often functioning by creating pores in target cell membranes. The fungal killer toxin KP4 from the corn smut fungus, Ustilago maydis, is encoded by a resident symbiotic double-stranded RNA virus, Ustilago maydis P4 virus (UmV4), within fungal cells. Unlike most killer toxins, KP4 is a single polypeptide. KP4 inhibits voltage-gated calcium channels in mammalian cells, which in turn inhibits cell growth and division by blocking calcium import. KP4 adopts a structure consisting of a two-layer alpha/beta sandwich with a left-handed crossover.; PDB: 1KPT_B.
Probab=19.31  E-value=41  Score=25.13  Aligned_cols=18  Identities=28%  Similarity=0.541  Sum_probs=9.4

Q ss_pred             CcccCCcCCeeeccCCCCC
Q 042838           13 DLCQYDCGNATIRYPFGIG   31 (162)
Q Consensus        13 ~~C~~~CGnv~IpYPFGig   31 (162)
                      -+|. +||.|.+-||++..
T Consensus        96 HGC~-~CGSvP~~y~~~gN  113 (128)
T PF09044_consen   96 HGCK-VCGSVPYFYTQGGN  113 (128)
T ss_dssp             HT-S-S-EEEE---SSTT-
T ss_pred             cCCC-CCCCCCcccCCCCC
Confidence            4677 99988888998753


No 8  
>PRK08222 hydrogenase 4 subunit H; Validated
Probab=15.62  E-value=67  Score=24.93  Aligned_cols=11  Identities=55%  Similarity=1.153  Sum_probs=9.8

Q ss_pred             CCeeeccCCCC
Q 042838           20 GNATIRYPFGI   30 (162)
Q Consensus        20 Gnv~IpYPFGi   30 (162)
                      |.++++|||.-
T Consensus        12 g~~T~~yP~~~   22 (181)
T PRK08222         12 GTATVKYPFAP   22 (181)
T ss_pred             CCccccCCCcc
Confidence            88999999984


No 9  
>PF02150 RNA_POL_M_15KD:  RNA polymerases M/15 Kd subunit;  InterPro: IPR001529 DNA-directed RNA polymerases (EC 2.7.7.6; also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I, located in the nucleoli, synthesises precursors of most ribosomal RNAs; RNA polymerase II occurs in the nucleoplasm and synthesises mRNA precursors; RNA polymerase III also occurs in the nucleoplasm and synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses range from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. In archaebacteria, there is generally a single form of RNA polymerase, which also consists of an oligomeric assemblage of 10 to 13 polypeptides. It has recently been shown that small subunits of about 15 kDa, found in polymerase types I and II, are highly conserved. These proteins contain a probable zinc finger in their N-terminal region and a C-terminal zinc ribbon domain (see IPR001222 from INTERPRO).; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 3H0G_I 3M4O_I 3S14_I 2E2J_I 4A3J_I 3HOZ_I 1TWA_I 3S1Q_I 3S1N_I 1TWG_I ....
Probab=15.20  E-value=80  Score=17.91  Aligned_cols=9  Identities=44%  Similarity=1.209  Sum_probs=5.0

Q ss_pred             ccCCcCCeee
Q 042838           15 CQYDCGNATI   24 (162)
Q Consensus        15 C~~~CGnv~I   24 (162)
                      || +|||+-+
T Consensus         4 Cp-~C~nlL~   12 (35)
T PF02150_consen    4 CP-ECGNLLY   12 (35)
T ss_dssp             ET-TTTSBEE
T ss_pred             CC-CCCccce
Confidence            44 5666643


No 10 
>TIGR01206 lysW lysine biosynthesis protein LysW. This very small, poorly characterized protein has been shown to be essential in Thermus thermophilus for an unusual pathway of Lys biosynthesis from aspartate by way of alpha-aminoadipate (AAA) rather than diaminopimelate. It is also found in Deinococcus radiodurans and Pyrococcus horikoshii, which appear to share the AAA pathway.
Probab=13.44  E-value=2.5e+02  Score=17.64  Aligned_cols=26  Identities=38%  Similarity=0.760  Sum_probs=17.2

Q ss_pred             cccCCcC-CeeeccCCCCCC-----CCCCCCCCeEE
Q 042838           14 LCQYDCG-NATIRYPFGIGE-----GCYFDKSFEVI   43 (162)
Q Consensus        14 ~C~~~CG-nv~IpYPFGig~-----~C~~~~gF~l~   43 (162)
                      .|| .|| .|.++.+.. |.     .|+.  .|+|.
T Consensus         4 ~CP-~CG~~iev~~~~~-GeiV~Cp~CGa--eleVv   35 (54)
T TIGR01206         4 ECP-DCGAEIELENPEL-GELVICDECGA--ELEVV   35 (54)
T ss_pred             CCC-CCCCEEecCCCcc-CCEEeCCCCCC--EEEEE
Confidence            577 788 677887765 53     5666  36665


Done!