Query 042838
Match_columns 162
No_of_seqs 126 out of 837
Neff 6.8
Searched_HMMs 46136
Date Fri Mar 29 10:03:38 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/042838.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/042838hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF13947 GUB_WAK_bind: Wall-as 100.0 9.7E-32 2.1E-36 193.2 10.7 103 12-118 1-106 (106)
2 PF08261 Carcinustatin: Carcin 49.6 8 0.00017 15.2 0.5 6 25-30 3-8 (8)
3 PF08685 GON: GON domain; Int 31.1 1.9E+02 0.004 23.3 6.0 58 82-140 126-186 (201)
4 PF05953 Allatostatin: Allatos 26.0 35 0.00075 14.8 0.6 7 25-31 5-11 (11)
5 PF14353 CpXC: CpXC protein 25.7 34 0.00073 24.8 0.9 16 14-30 40-56 (128)
6 cd00206 snake_toxin Snake toxi 21.0 49 0.0011 21.2 0.8 24 131-158 38-61 (64)
7 PF09044 Kp4: Kp4; InterPro: 19.3 41 0.00089 25.1 0.2 18 13-31 96-113 (128)
8 PRK08222 hydrogenase 4 subunit 15.6 67 0.0015 24.9 0.7 11 20-30 12-22 (181)
9 PF02150 RNA_POL_M_15KD: RNA p 15.2 80 0.0017 17.9 0.8 9 15-24 4-12 (35)
10 TIGR01206 lysW lysine biosynth 13.4 2.5E+02 0.0054 17.6 2.8 26 14-43 4-35 (54)
No 1
>PF13947 GUB_WAK_bind: Wall-associated receptor kinase galacturonan-binding
Probab=99.97 E-value=9.7e-32 Score=193.20 Aligned_cols=103 Identities=42% Similarity=0.729 Sum_probs=88.9
Q ss_pred CCcccCCcCCeeeccCCCCCCCCCCCCCCeEEecCCCCCCeeeeecccceEEEEeeecCccceeeeeceecccCCCCCCC
Q 042838 12 KDLCQYDCGNATIRYPFGIGEGCYFDKSFEVICDYSSGSPKAFLASINNLQVLDNHVYGVSNIRVNIPVISLKSSNLTSN 91 (162)
Q Consensus 12 ~~~C~~~CGnv~IpYPFGig~~C~~~~gF~l~C~~~~~~p~l~l~~~~~~~V~~Is~~~~~~v~v~~~~~~~~C~~~~~~ 91 (162)
+++||++||||+||||||++++|++.++|+|+|++++++|+|++.+. .|||++|+++ +++++|.+++ ++.|+.....
T Consensus 1 ~~~C~~~CGnv~IpYPFgi~~~C~~~~~F~L~C~~~~~~~~l~l~~~-~~~V~~I~~~-~~~i~v~~~~-~~~~~~~~~~ 77 (106)
T PF13947_consen 1 KPGCPSSCGNVSIPYPFGIGPGCGRDPGFELTCNNNTSPPKLLLSSG-NYEVLSISYE-NGTIRVSDPI-SSNCYSSSSS 77 (106)
T ss_pred CCCCCCccCCEeecCCCccCCCCCCCCCcEEECCCCCCCceeEecCC-cEEEEEEecC-CCEEEEEecc-ccceecCCCC
Confidence 68999999999999999999999995599999998877889999765 8999999999 9999999999 7777765443
Q ss_pred c---cccccCCCCeEecCCCCEEEEEccCc
Q 042838 92 A---EGVSLSVSPFTFSPWDNRFTAIGCNN 118 (162)
Q Consensus 92 ~---~~~~l~~~pF~~S~~~N~f~~~GC~~ 118 (162)
. .++++.+ ||.+|+.+|+|+++||++
T Consensus 78 ~~~~~~~~~~~-~~~~s~~~N~~~~~GC~t 106 (106)
T PF13947_consen 78 NSSNSNLSLNG-PFFFSSSSNKFTVVGCNT 106 (106)
T ss_pred cccccEEeecC-CceEccCCcEEEEECCCC
Confidence 2 2445544 899999999999999985
No 2
>PF08261 Carcinustatin: Carcinustatin peptide
Probab=49.56 E-value=8 Score=15.18 Aligned_cols=6 Identities=50% Similarity=1.315 Sum_probs=4.3
Q ss_pred ccCCCC
Q 042838 25 RYPFGI 30 (162)
Q Consensus 25 pYPFGi 30 (162)
||-||+
T Consensus 3 py~fgl 8 (8)
T PF08261_consen 3 PYSFGL 8 (8)
T ss_pred cccccC
Confidence 677774
No 3
>PF08685 GON: GON domain; InterPro: IPR012314 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. The ADAMTSs (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) are a family of zinc-dependent metalloproteinases that play important roles in a variety of normal and pathological conditions. These enzymes show a complex domain organisation including signal sequence, propeptide, metalloproteinase domain (see PDOC50215 from PROSITEDOC), disintegrin-like domain (see PDOC00351 from PROSITEDOC), central TS-1 motif (see PDOC50092 from PROSITEDOC), cysteine-rich region, and a variable number of TS-like repeats at the C-terminal region. The GON domain is an approximately 200-residue module, whose presence is the hallmark of a subfamily of structurally and evolutionarily related ADAMTSs, called GON-ADAMTSs. The GON domain is characterised by the presence of several conserved cysteine residues and is likely to be globular. Some proteins known to contain a GON domain are: mammalian ADAMTS-9; mammalian ADAMTS-20; and Caenorhabditis elegans gon-1, a protease required for gonadal morphogenesis. Proteins containing the GON domain belong to MEROPS peptidase subfamily M12B (adamalysin, clan MA).; GO: 0004222 metalloendopeptidase activity, 0008270 zinc ion binding
Probab=31.15 E-value=1.9e+02 Score=23.30 Aligned_cols=58 Identities=17% Similarity=0.242 Sum_probs=35.5
Q ss_pred cccCCCCCC---CccccccCCCCeEecCCCCEEEEEccCcceeeeecCCCCceeeceEeecC
Q 042838 82 SLKSSNLTS---NAEGVSLSVSPFTFSPWDNRFTAIGCNNYTTIIKRQNDSSVFGGCLSIST 140 (162)
Q Consensus 82 ~~~C~~~~~---~~~~~~l~~~pF~~S~~~N~f~~~GC~~~a~l~~~~~~~~~~~gC~s~C~ 140 (162)
+.+|+.... ...+++|.|++|.|++ .-++..-|......+.-..+......-|-=+|.
T Consensus 126 AGDCyS~~~CpqG~FsIdL~GTgf~vs~-~~~W~~~G~~a~~~i~~s~~~q~v~g~CGGyCG 186 (201)
T PF08685_consen 126 AGDCYSAARCPQGRFSIDLRGTGFRVSP-DTKWVTQGNYAVGKINRSPDGQKVSGRCGGYCG 186 (201)
T ss_pred cccccccCCCCCceEEEeeCCCceEecC-CCEEEeCCcEeEEEEEEcCCCcEEEEEeCccCC
Confidence 346766532 1157899999999998 567888898776665321122223333555553
No 4
>PF05953 Allatostatin: Allatostatin; InterPro: IPR010276 This family consists of allatostatins, bombystatins, helicostatins, cydiastatins and schistostatin from several insect species. Allatostatins (ASTs) of the Tyr/Phe-Xaa-Phe-Gly-Leu/Ile-NH2 family are a group of insect neuropeptides that inhibit juvenile hormone biosynthesis by the corpora allata.; GO: 0005184 neuropeptide hormone activity
Probab=26.03 E-value=35 Score=14.78 Aligned_cols=7 Identities=57% Similarity=1.402 Sum_probs=5.2
Q ss_pred ccCCCCC
Q 042838 25 RYPFGIG 31 (162)
Q Consensus 25 pYPFGig 31 (162)
.|-||+|
T Consensus 5 ~Y~FGLG 11 (11)
T PF05953_consen 5 MYSFGLG 11 (11)
T ss_pred ccccCcC
Confidence 4888876
No 5
>PF14353 CpXC: CpXC protein
Probab=25.70 E-value=34 Score=24.76 Aligned_cols=16 Identities=31% Similarity=0.883 Sum_probs=12.2
Q ss_pred cccCCcC-CeeeccCCCC
Q 042838 14 LCQYDCG-NATIRYPFGI 30 (162)
Q Consensus 14 ~C~~~CG-nv~IpYPFGi 30 (162)
.|| +|| ...+.|||=.
T Consensus 40 ~CP-~Cg~~~~~~~p~lY 56 (128)
T PF14353_consen 40 TCP-SCGHKFRLEYPLLY 56 (128)
T ss_pred ECC-CCCCceecCCCEEE
Confidence 488 898 5578888865
No 6
>cd00206 snake_toxin Snake toxin domain, present in short and long neurotoxins, cytotoxins and short toxins, and in other miscellaneous venom peptides. The toxin acts by binding to the nicotinic acetylcholine receptors in the postsynaptic membrane of skeletal muscles and preventing the binding of acetylcholine, thereby blocking the excitation of muscles. This domain contains 60-75 amino acids that are fixed by 4-5 disulfide bridges and is nearly all beta sheet; it exists as either monomers or dimers.
Probab=20.95 E-value=49 Score=21.22 Aligned_cols=24 Identities=13% Similarity=0.213 Sum_probs=15.5
Q ss_pred eeeceEeecCCCCCCCCCcCCCCcccCC
Q 042838 131 VFGGCLSISTCDPALNPGCYDFLCALPQ 158 (162)
Q Consensus 131 ~~~gC~s~C~~~~~~~~~C~G~gCC~~~ 158 (162)
..-||+++|+ .... + ..+-||+++
T Consensus 38 i~rGCa~tCP-~~~~-~--~~v~CC~TD 61 (64)
T cd00206 38 IERGCAATCP-KVKP-G--EYVTCCTTD 61 (64)
T ss_pred EEccccCcCc-CCCC-C--cceEecCCC
Confidence 3469999999 4432 1 446777765
No 7
>PF09044 Kp4: Kp4; InterPro: IPR015131 Killer toxins are polypeptides secreted by some fungal species that kill sensitive cells of the same or related species, often functioning by creating pores in target cell membranes. The fungal killer toxin KP4 from the corn smut fungus, Ustilago maydis (Smut fungus), is encoded by a resident symbiotic double-stranded RNA virus, Ustilago maydis P4 virus (UmV4), within fungal cells. Unlike most killer toxins, KP4 is a single polypeptide. KP4 inhibits voltage-gated calcium channels in mammalian cells, which in turn inhibits cell growth and division by blocking calcium import. KP4 adopts a structure consisting of a two-layer alpha/beta sandwich with a left-handed crossover.; PDB: 1KPT_B.
Probab=19.31 E-value=41 Score=25.13 Aligned_cols=18 Identities=28% Similarity=0.541 Sum_probs=9.4
Q ss_pred CcccCCcCCeeeccCCCCC
Q 042838 13 DLCQYDCGNATIRYPFGIG 31 (162)
Q Consensus 13 ~~C~~~CGnv~IpYPFGig 31 (162)
-+|. +||.|.+-||++..
T Consensus 96 HGC~-~CGSvP~~y~~~gN 113 (128)
T PF09044_consen 96 HGCK-VCGSVPYFYTQGGN 113 (128)
T ss_dssp HT-S-S-EEEE---SSTT-
T ss_pred cCCC-CCCCCCcccCCCCC
Confidence 4677 99988888998753
No 8
>PRK08222 hydrogenase 4 subunit H; Validated
Probab=15.62 E-value=67 Score=24.93 Aligned_cols=11 Identities=55% Similarity=1.153 Sum_probs=9.8
Q ss_pred CCeeeccCCCC
Q 042838 20 GNATIRYPFGI 30 (162)
Q Consensus 20 Gnv~IpYPFGi 30 (162)
|.++++|||.-
T Consensus 12 g~~T~~yP~~~ 22 (181)
T PRK08222 12 GTATVKYPFAP 22 (181)
T ss_pred CCccccCCCcc
Confidence 88999999984
No 9
>PF02150 RNA_POL_M_15KD: RNA polymerases M/15 Kd subunit; InterPro: IPR001529 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. 
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. It has recently been shown that small subunits of about 15 kDa, found in polymerase types I and II, are highly conserved. These proteins contain a probable zinc finger in their N-terminal region and a C-terminal zinc ribbon domain (see IPR001222 from INTERPRO).; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 3H0G_I 3M4O_I 3S14_I 2E2J_I 4A3J_I 3HOZ_I 1TWA_I 3S1Q_I 3S1N_I 1TWG_I ....
Probab=15.20 E-value=80 Score=17.91 Aligned_cols=9 Identities=44% Similarity=1.209 Sum_probs=5.0
Q ss_pred ccCCcCCeee
Q 042838 15 CQYDCGNATI 24 (162)
Q Consensus 15 C~~~CGnv~I 24 (162)
|| +|||+-+
T Consensus 4 Cp-~C~nlL~ 12 (35)
T PF02150_consen 4 CP-ECGNLLY 12 (35)
T ss_dssp ET-TTTSBEE
T ss_pred CC-CCCccce
Confidence 44 5666643
No 10
>TIGR01206 lysW lysine biosynthesis protein LysW. This very small, poorly characterized protein has been shown essential in Thermus thermophilus for an unusual pathway of Lys biosynthesis from aspartate by way of alpha-aminoadipate (AAA) rather than diaminopimelate. It is found also in Deinococcus radiodurans and Pyrococcus horikoshii, which appear to share the AAA pathway.
Probab=13.44 E-value=2.5e+02 Score=17.64 Aligned_cols=26 Identities=38% Similarity=0.760 Sum_probs=17.2
Q ss_pred cccCCcC-CeeeccCCCCCC-----CCCCCCCCeEE
Q 042838 14 LCQYDCG-NATIRYPFGIGE-----GCYFDKSFEVI 43 (162)
Q Consensus 14 ~C~~~CG-nv~IpYPFGig~-----~C~~~~gF~l~ 43 (162)
.|| .|| .|.++.+.. |. .|+. .|+|.
T Consensus 4 ~CP-~CG~~iev~~~~~-GeiV~Cp~CGa--eleVv 35 (54)
T TIGR01206 4 ECP-DCGAEIELENPEL-GELVICDECGA--ELEVV 35 (54)
T ss_pred CCC-CCCCEEecCCCcc-CCEEeCCCCCC--EEEEE
Confidence 577 788 677887765 53 5666 36665
Done!