Query         T0579 SR518, , 124 residues
Match_columns 124
No_of_seqs    37 out of 39
Neff          2.5
Searched_HMMs 11830
Date          Sun Jun 13 15:18:06 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0579.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0579.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols  Query HMM Template HMM
  1 PF07563 DUF1541: Protein of u   99.9 3.9E-29 3.3E-33  177.1   5.8   53  2-54     1-53    (53)
  2 PF07563 DUF1541: Protein of u   99.9 1.9E-28 1.6E-32  173.4   6.0   53  66-118   1-53    (53)
  3 PF03724 META: META domain; I    32.6      12   0.001   15.8   2.6   30  45-75    1-30    (101)
  4 PF06289 FlbD: Flagellar prote   28.9      14  0.0012   15.4   2.4   33  67-99    8-40    (60)
  5 PF03831 PhnA: PhnA protein;     21.5      13  0.0011   15.6   1.1   31  57-89    2-32    (56)
  6 PF02013 CBM_10: Cellulose or    19.0      12   0.001   15.8   0.5   20  34-53    16-35   (36)
  7 PF01828 Peptidase_A4: Peptida   14.1      33  0.0028   13.4   3.2   57  62-119   88-145  (208)
  8 PF09390 DUF1999: Protein of u   13.5     7.8 0.00066   16.9  -1.6   20  21-40    105-124 (161)
  9 PF09356 Phage_BR0599: Phage c   11.8      39  0.0033   13.0   2.4   32  45-77    22-53   (80)
 10 PF02417 Chromate_transp: Chro   11.0      15  0.0013   15.3  -0.8   15  45-59    31-45   (169)

No 1
>PF07563 DUF1541: Protein of unknown function (DUF1541); InterPro: IPR011438 This domain is found in several hypothetical bacterial proteins as a tandem repeat.
Probab=99.95  E-value=3.9e-29  Score=177.14  Aligned_cols=53  Identities=64%  Similarity=1.117  Sum_probs=41.8

Q ss_pred             CCCCEEEEEECCCCCCCCCCEEEEEECCCEEEEEEEECCCCCCCCCCCEEEEE
Q ss_conf             87875799606756536872268741020688887321788962345401333
Q T0579             2 KVGSQVIINTSHMKGMKGAEATVTGAYDTTAYVVSYTPTNGGQRVDHHKWVIQ   54 (124)
Q Consensus         2 ~vGs~v~l~adHM~GM~GA~atIvgAydTt~Y~VsYtPt~Gg~~V~nHKWVv~   54 (124)
                      |+||+|+|+|||||||+||+|||++||+||+|.|||+||+||++|+|||||+|
T Consensus         1 ~~G~~V~l~AdHM~GMkgA~AtI~~a~~tTvY~V~ytpt~gg~~V~NHKWV~e   53 (53)
T PF07563_consen    1 KVGSEVTLTADHMPGMKGAEATIDGAYDTTVYMVDYTPTTGGEEVKNHKWVTE   53 (53)
T ss_pred             CCCCEEEEECCCCCCCCCCEEEEEEEEEEEEEEEEEEECCCCCEEECCEEEEC
T ss_conf             99879999655377668976899532343799998687799948032250709

No 2
>PF07563 DUF1541: Protein of unknown function (DUF1541); InterPro: IPR011438 This domain is found in several hypothetical bacterial proteins as a tandem repeat.
Probab=99.94  E-value=1.9e-28  Score=173.42  Aligned_cols=53  Identities=70%  Similarity=1.017  Sum_probs=51.8

Q ss_pred             CCCCEEEEECCCCCCCCCCCEEEEECCCCEEEEEEEEECCCCCEEEEEECEEC
Q ss_conf             89877999701145547870688603686068886432699946530001100
Q T0579            66 QPGDQVILEASHMKGMKGATAEIDSAEKTTVYMVDYTSTTSGEKVKNHKWVTE  118 (124)
Q Consensus        66 ~~G~eV~l~AdHM~GMkgA~a~Id~~~~tTVYmVDy~~t~~g~~v~NHKWVtE  118 (124)
                      ++|++|+|+||||+||+||+|+|+++.++|||||||+||+||++|+|||||+|
T Consensus         1 ~~G~~V~l~AdHM~GMkgA~AtI~~a~~tTvY~V~ytpt~gg~~V~NHKWV~e   53 (53)
T PF07563_consen    1 KVGSEVTLTADHMPGMKGAEATIDGAYDTTVYMVDYTPTTGGEEVKNHKWVTE   53 (53)
T ss_pred             CCCCEEEEECCCCCCCCCCEEEEEEEEEEEEEEEEEEECCCCCEEECCEEEEC
T ss_conf             99879999655377668976899532343799998687799948032250709

No 3
>PF03724 META: META domain; InterPro: IPR005184 A domain found in proteins of unknown function , some of which are described as heat shock protein (HslJ). In Helicobacter pylori the protein is secreted e.g. (O25998 from SWISSPROT) and implicated in motility. In Leishmania spp.
it is described as an essential protein, over-expression of which, in L.amazonensis, increases virulence (O43987 from SWISSPROT; ). A pair of cysteine residues show correlated conservation, suggesting that they form a disulphide bond.
Probab=32.58  E-value=12  Score=15.77  Aligned_cols=30  Identities=17%  Similarity=0.407  Sum_probs=20.3

Q ss_pred             CCCCCEEEEEECCCCCCCCCCCCCCEEEEEC
Q ss_conf             2345401333002687755478987799970
Q T0579            45 RVDHHKWVIQEEIKDAGDKTLQPGDQVILEA   75 (124)
Q Consensus        45 ~V~nHKWVv~EEl~~ag~~~~~~G~eV~l~A   75 (124)
                      |+.+++|+++ .+...+..+..+....+|.-
T Consensus         1 ~L~~~~W~l~-~~~~~g~~~~~~~~~~~l~f   30 (101)
T PF03724_consen    1 PLQGTTWQLT-SINGDGAAPVPPSAKPTLTF   30 (101)
T ss_pred             CCCCCEEEEE-EECCCCCCCCCCCCCCEEEE
T ss_conf             9849889999-98178831367888736999

No 4
>PF06289 FlbD: Flagellar protein (FlbD); InterPro: IPR009384 This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown .
Probab=28.86  E-value=14  Score=15.42  Aligned_cols=33  Identities=15%  Similarity=0.359  Sum_probs=19.5

Q ss_pred             CCCEEEEECCCCCCCCCCCEEEEECCCCEEEEE
Q ss_conf             987799970114554787068860368606888
Q T0579            67 PGDQVILEASHMKGMKGATAEIDSAEKTTVYMV   99 (124)
Q Consensus        67 ~G~eV~l~AdHM~GMkgA~a~Id~~~~tTVYmV   99 (124)
                      .|.+..|||||.+-+....-|+-.-..+.-|+|
T Consensus         8 ng~~f~lN~dlIE~ie~~PDTvItL~nG~kyvV   40 (60)
T PF06289_consen    8 NGEEFVLNADLIETIEETPDTVITLTNGKKYVV   40 (60)
T ss_pred             CCCEEEECHHHEEEEEECCCEEEEEECCCEEEE
T ss_conf             898899977897887165987999947989999

No 5
>PF03831 PhnA: PhnA protein; InterPro: IPR013988 The PhnA protein family includes the uncharacterised Escherichia coli protein PhnA and its homologues. The E. coli phnA gene is part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage . The protein is not related to the characterised phosphonoacetate hydrolase designated PhnA . This entry represents the C-terminal domain of PhnA.; PDB: 2akk_A 2akl_A.
Probab=21.52  E-value=13  Score=15.60  Aligned_cols=31  Identities=32%  Similarity=0.469  Sum_probs=22.6

Q ss_pred             CCCCCCCCCCCCCEEEEECCCCCCCCCCCEEEE
Q ss_conf             268775547898779997011455478706886
Q T0579            57 IKDAGDKTLQPGDQVILEASHMKGMKGATAEID   89 (124)
Q Consensus        57 l~~ag~~~~~~G~eV~l~AdHM~GMkgA~a~Id   89 (124)
                      ++|+.-.+|++||.|+|--|-  -.||+..+|-
T Consensus         2 vkDsnG~~L~~GDsV~liKDL--kVKGss~~~K   32 (56)
T PF03831_consen    2 VKDSNGNVLADGDSVTLIKDL--KVKGSSFVIK   32 (56)
T ss_dssp             -B-----B-----EEEESS-E--EE----EEE-
T ss_pred             EECCCCCCCCCCCEEEEEEEE--EECCCCCCEE
T ss_conf             585899894579979998530--1136662060

No 6
>PF02013 CBM_10: Cellulose or protein binding domain; InterPro: IPR002883 The recycling of photosynthetically fixed carbon in plant cell walls is a key microbial process. Enzyme systems that attack the plant cell wall contain noncatalytic carbohydrate-binding modules that mediate attachment to this composite structure and play a pivotal role in maximizing the hydrolytic process. In anaerobes, the degradation is carried out by a high molecular weight, multifunctional complex termed the cellulosome. This consists of a number of independent enzyme components, each of which contains a conserved 40-residue dockerin domain, which functions to bind the enzyme to a cohesin domain within the scaffoldin protein , . In anaerobic bacteria that degrade plant cell walls, exemplified by Clostridium thermocellum, the dockerin domains of the catalytic polypeptides can bind equally well to any cohesin from the same organism. More recently, anaerobic fungi, typified by Piromyces equi, have been suggested to also synthesize a cellulosome complex, although the dockerin sequences of the bacterial and fungal enzymes are completely different . For example, the fungal enzymes contain one, two or three copies of the dockerin sequence in tandem within the catalytic polypeptide. In contrast, all the C. thermocellum cellulosome catalytic components contain a single dockerin domain.
The anaerobic bacterial dockerins are homologous to EF hands (calcium-binding motifs) and require calcium for activity whereas the fungal dockerin does not require calcium. Finally, the interaction between cohesin and dockerin appears to be species specific in bacteria, there is almost no species specificity of binding within fungal species and no identified sites that distinguish different species. The structure of dockerin from Piromyces equi contains two helical stretches and four short beta-strands which form an antiparallel sheet structure adjacent to an additional short twisted parallel strand. The N- and C-termini are adjacent to each other. Aerobic bacteria contain related regions, however these appear to function as cellulose/carbohydrate binding domains. ; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 2j4m_A 1e8p_A 2j4n_A 1e8q_A 1qld_A 1e8r_A.
Probab=18.97  E-value=12  Score=15.79  Aligned_cols=20  Identities=35%  Similarity=0.692  Sum_probs=17.3

Q ss_pred             EEEEECCCCCCCCCCCEEEE
Q ss_conf             88732178896234540133
Q T0579            34 VVSYTPTNGGQRVDHHKWVI   53 (124)
Q Consensus        34 ~VsYtPt~Gg~~V~nHKWVv   53 (124)
                      .|.|++.+|+=-|+|..|-+
T Consensus        16 ~v~ytd~~g~WGvEN~~wCg   35 (36)
T PF02013_consen   16 EVVYTDDDGGWGVENNQWCG   35 (36)
T ss_dssp             --SB-ST---BEEBTTBEEB
T ss_pred             CEEECCCCCCEEEECCCEEC
T ss_conf             74776799986018881660

No 7
>PF01828 Peptidase_A4: Peptidase A4 family; InterPro: IPR000250 Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine).
Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. The peptidases in family G1 form a subset of what were formerly termed 'pepstatin-insensitive carboxyl proteinases'. After its discovery in about 1970, the pentapeptide pepstatin soon came to be thought of as a very general inhibitor of the endopeptidases that are active at acidic pH. But more recently several acid-acting endopeptidases from bacteria and fungi had been found to be resistant to pepstatin. The unusual active sites of the 'pepstatin-insensitive carboxyl peptidases' proved difficult to characterise, but it has now been established that the enzymes from bacteria are acid-acting serine peptidases in family S53 (clan SB), IPR000209 from INTERPRO, whereas the fungal enzymes are in family G1 (formerly A4). The importance of glutamate ('E') and glutamine ('Q') residues in the active sites of the family G1 enzymes led to the family name, Eqolisin . This group of glutamate/glutamine peptidases belong to MEROPS peptidase family G1 (eqolisin family, clan GA). An example of this group is scytalidoglutamic peptidase. The proteins are thermostable, pepstatin insensitive and are active at low pH ranges . 
The enzyme has a unique heterodimeric structure, with a 39-residue light chain and a 173-residue heavy chain bound to each other non-covalently . The tertiary structure of the active site of scytalidoglutamic peptidase (MEROPS G01.001) with a bound tripeptide product has been interpreted as showing that Glu136 is the primary catalytic residue. The most likely mechanism is suggested to be nucleophilic attack by a water molecule activated by the Glu136 side chain on the si-face of the scissile peptide bond carbon atom to form the tetrahedral intermediate. Electrophilic assistance, and oxyanion stabilisation, are provided by the side-chain amide of Gln53. Both scytalidoglutamic peptidase (MEROPS G01.001) and aspergilloglutamic peptidase (MEROPS G01.002) cleave the Tyr26 Thr27 bond in the B chain of oxidized insulin; a bond not cleaved by other acid-acting endopeptidases. Scytalidoglutamic peptidase is most active on casein at pH 2 and is inhibited by 1,2-epoxy-3-(p-nitrophenoxy)propane (EPNP), a compound that also inhibits pepsin. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1y43_A 1s2k_A 2ifw_A 2ifr_A 1s2b_A. 
Probab=14.07  E-value=33  Score=13.42  Aligned_cols=57  Identities=18%  Similarity=0.321  Sum_probs=31.9

Q ss_pred             CCCCCCCCEEEEECCCCCCCCCCCEEEEECCCCEEEEEEEEE-CCCCCEEEEEECEECC
Q ss_conf             554789877999701145547870688603686068886432-6999465300011002
Q T0579            62 DKTLQPGDQVILEASHMKGMKGATAEIDSAEKTTVYMVDYTS-TTSGEKVKNHKWVTED  119 (124)
Q Consensus        62 ~~~~~~G~eV~l~AdHM~GMkgA~a~Id~~~~tTVYmVDy~~-t~~g~~v~NHKWVtE~  119 (124)
                      .-+.++||.+.+...+-..= ...++|....+.--|-..+.+ .....--.|--|+.||
T Consensus        88 ~~~V~~GD~I~v~V~~~s~~-~g~~~i~N~stG~t~t~t~s~~~~~~L~~~~AEWIvEd  145 (208)
T PF01828_consen   88 GFPVSPGDTITVTVTATSSS-SGTATIENLSTGQTVTKTLSAPSGATLCGQNAEWIVED  145 (208)
T ss_dssp             T-------EEEEEEEEEETT-E-EEEEEES----EEEE---S-------S--EEEEEE-
T ss_pred             CCCCCCCCEEEEEEEEECCC-CEEEEEEECCCCEEEEEEECCCCCCCCCCCCEEEEEEC
T ss_conf             54168999999999980599-77999998878879999973467898466566899977

No 8
>PF09390 DUF1999: Protein of unknown function (DUF1999); PDB: 2d4o_A 2d4p_A.
Probab=13.55  E-value=7.8  Score=16.86  Aligned_cols=20  Identities=35%  Similarity=0.497  Sum_probs=15.5

Q ss_pred             CEEEEEECCCEEEEEEEECC
Q ss_conf             22687410206888873217
Q T0579            21 EATVTGAYDTTAYVVSYTPT   40 (124)
Q Consensus        21 ~atIvgAydTt~Y~VsYtPt   40 (124)
                      -|.|..|||+-+|+|-|--+
T Consensus       105 rAvVKSAYDa~VYEv~l~l~  124 (161)
T PF09390_consen  105 RAVVKSAYDAAVYEVHLPLD  124 (161)
T ss_dssp             HHHHHHHHH----EEEE---
T ss_pred             HHHHHHHHCCEEEEEECCCC
T ss_conf             99998762253689852589

No 9
>PF09356 Phage_BR0599: Phage conserved hypothetical protein BR0599
Probab=11.78  E-value=39  Score=13.03  Aligned_cols=32  Identities=22%  Similarity=0.170  Sum_probs=23.7

Q ss_pred             CCCCCEEEEEECCCCCCCCCCCCCCEEEEECCC
Q ss_conf             234540133300268775547898779997011
Q T0579            45 RVDHHKWVIQEEIKDAGDKTLQPGDQVILEASH   77 (124)
Q Consensus        45 ~V~nHKWVv~EEl~~ag~~~~~~G~eV~l~AdH   77 (124)
                      .|+.|.- ..=+|-.+-..++..||+++|.|.-
T Consensus        22 ~V~~~~~-~~l~L~~p~~~~~~~Gd~~~l~~GC   53 (80)
T PF09356_consen   22 EVKSHRG-RTLTLWEPLPAPIAVGDAVRLYAGC   53 (80)
T ss_pred             EEEECCC-CEEEECCCCCCCCCCCCEEEEECCC
T ss_conf             9885328-8999806776788999889996699

No 10
>PF02417 Chromate_transp: Chromate transporter; InterPro: IPR003370 This region is found in known and predicted chromate transporters , in both bacteria and archaebacteria. These proteins reduce chromate accumulation and are essential for chromate resistance. They are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP.; GO: 0015109 chromate transmembrane transporter activity, 0015703 chromate transport
Probab=10.99  E-value=15  Score=15.28  Aligned_cols=15  Identities=33%  Similarity=0.738  Sum_probs=7.4

Q ss_pred             CCCCCEEEEEECCCC
Q ss_conf             234540133300268
Q T0579            45 RVDHHKWVIQEEIKD   59 (124)
Q Consensus        45 ~V~nHKWVv~EEl~~   59 (124)
                      -|++++|+.+||..|
T Consensus        31 ~V~~~~wit~~~f~~   45 (169)
T PF02417_consen   31 LVEKRGWITEEEFLD   45 (169)
T ss_pred             HHHCCCCCCHHHHHH
T ss_conf             856469999999999

Done!