Query         T0607 YP_094844.1, LEGIONELLA PNEUMOPHILA , 471 residues
Match_columns 471
No_of_seqs    126 out of 5489
Neff          10.4
Searched_HMMs 11830
Date          Mon Jul 5 09:35:54 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0607.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0607.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF01546 Peptidase_M20: Peptid  99.9 3.8E-24 3.2E-28  164.5  11.7  169   92-466     1-171 (171)
  2 PF07687 M20_dimer: Peptidase   99.1 1.8E-10 1.5E-14   79.7   9.7  106  203-363     1-106 (108)
  3 PF05343 Peptidase_M42: M42 gl  99.0 4.3E-09 3.7E-13   71.1  12.7   71  386-459   222-292 (292)
  4 PF04389 Peptidase_M28: Peptid  98.5 1.8E-07 1.5E-11   61.1   7.2   86   89-191     1-87  (176)
  5 PF05450 Nicastrin: Nicastrin;  95.3   0.031 2.6E-06   28.7   7.3   73   89-178     1-77  (234)
  6 PF02127 Peptidase_M18: Aminop  85.3    0.68 5.8E-05   20.3   6.3   45  416-463   387-431 (432)
  7 PF01320 Colicin_Pyocin: Colic  39.4     6.6 0.00056   14.2   4.9   52   13-66     30-84  (85)
  8 PF04692 PDGF_N: Platelet-deri  21.7     7.8 0.00066   13.8   0.0   30    4-34      6-35  (78)
  9 PF09280 XPC-binding: XPC-bind  18.8      15  0.0013   12.0   2.5   18    6-24     37-54  (59)
 10 PF00883 Peptidase_M17: Cytoso  13.4      21  0.0018   11.1  10.2  118   20-162     1-139 (311)

No 1
>PF01546 Peptidase_M20: Peptidase family M20/M25/M40 This family only corresponds to M20 family; InterPro: IPR002933 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. 
Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases . Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of proteins contains the metallopeptidases and non-peptidase homologues that belong to the MEROPS peptidase family M20 (clan MH) . 
The peptidases of this clan have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal aminoacid, usually neutral or hydrophobic, from a polypeptide.. The peptidase M20 family has four sub-families: M20A - type example, glutamate carboxypeptidase from Pseudomonas sp. (P06621 from SWISSPROT) M20B - type example, peptidase T from Escherichia coli (P29745 from SWISSPROT) M20C - type example, X-His dipeptidase from E. coli (P15288 from SWISSPROT) M20D - type example, carboxypeptidase Ss1 from Sulfolobus solfataricus (P80092 from SWISSPROT); GO: 0008237 metallopeptidase activity, 0006508 proteolysis; PDB: 2f8h_A 2f7v_A 1vgy_B 1lfw_A 2pok_B 2zof_A 2zog_B 3dlj_B 1q7l_B 2rb7_A .... Probab=99.91 E-value=3.8e-24 Score=164.45 Aligned_cols=169 Identities=29% Similarity=0.284 Sum_probs=126.0 Q ss_pred EEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCC-CCCCCEEEEEECCCCCCCH-HH Q ss_conf 997340323288776778886354334542135642224203589999988763147-8777368998156433644-69 Q T0607 92 LLYGHLDKQPEMSGWSDDLHPWKPVLKNGLLYGRGGADDGYSAYASLTAIRALEQQG-LPYPRCILIIEACEESGSY-DL 169 (471) Q Consensus 92 ~l~~H~DvVp~~~~~~w~~~Pf~~~~~~~~~~GrG~~D~k~~~~~~l~a~~~l~~~~-~~~~~~~~~~~~~EE~G~~-g~ 169 (471) +|++|||||| + ...|+ +||||+.|||+++++++.+++.+++.+ .+++++.++|+++||.|+. 
|+ T Consensus 1 ll~~H~D~vp-~-~~~w~------------l~grG~~D~k~~~~~~l~a~~~l~~~~~~~~~~l~~~~~~~EE~g~~~g~ 66 (171) T PF01546_consen 1 LLYGHMDTVP-G-PDGWK------------LYGRGAADMKGGLAAMLEALRALKEQGGDLPGNLKFLFTPDEESGSFGGA 66 (171) T ss_dssp EEEEES-BBS---CCGSS------------EEE---HSTHHHHHHHHHHHHHHHHCTTCGSSEEEEEEESTTTTTSTSHH T ss_pred CEEECCCCCC-C-CCCCE------------EECCCCCCCCHHHHHHHHHHHHHHHHCCCCCCCEEEEEECCCCCCCCCCH T ss_conf 9101326277-8-89872------------37785132644899999999999971466689848984034434674358 Q ss_pred HHHHHHHHHHCCCCEEEEECCCCCCCCCCCEEEEEEEEEEEEEEEEEECCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCC Q ss_conf 99999977642896089976877768776558885102368888986447763445677414789999999999974002 Q T0607 170 PFYIELLKERIGKPSLVICLDSGAGNYEQLWMTTSLRGNLVGKLTVELINEGVHSGSASGIVADSFRVARQLISRIEDEN 249 (471) Q Consensus 170 ~~~~~~~~~~~~~~d~~~~~~~~~~~~~~~~i~~g~kG~~~~~i~v~g~~~~~hs~~~~~~~~~~~~~~~~~~~~l~~~~ 249 (471) +.+++.. .....++++..+.+.. T Consensus 67 ~~~~~~~--~~~~~~~~i~~e~~~~------------------------------------------------------- 89 (171) T PF01546_consen 67 RYLIEAI--FGIDIDYVIVGEPGGE------------------------------------------------------- 89 (171) T ss_dssp HHHHHHE--EEEEESEEEECHCSTT------------------------------------------------------- T ss_pred HHHHHHH--HCCCCEEEEEECCCCC------------------------------------------------------- T ss_conf 9998765--3236302221043356------------------------------------------------------- Q ss_pred CCEEECCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHEEEECCEEEEECCCCCCCCEE Q ss_conf 45021331014443000899999986334555654211100123200101233332210012013021000578513035 Q T0607 250 TGEIKLPQLYCDIPDERIKQAKQCAEILGEQVYSEFPWIDSAKPVIQDKQQLILNRTWRPALTVTGADGFPAIADAGNVM 329 (471) Q Consensus 250 ~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~v~~i~~~~~~~~~~nvi 329 (471) +. 
T Consensus 90 -----------------------------------------------------------------------------~~- 91 (171) T PF01546_consen 90 -----------------------------------------------------------------------------DG- 91 (171) T ss_dssp -----------------------------------------------------------------------------CE- T ss_pred -----------------------------------------------------------------------------CC- T ss_conf -----------------------------------------------------------------------------65- Q ss_pred CCEEEEEEEEECCCCCCHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCEEE Q ss_conf 47489999751278788899999999999973236633899983157677668998889999999999983697652344 Q T0607 330 RPVTSLKLSMRLPPLVDPEAASVAMEKALTQNPPYNAKVDFKIQNGGSKGWNAPLLSDWLAKAASEASMTYYDKPAAYMG 409 (471) Q Consensus 330 p~~~~~~~d~R~~~~~~~~~~~~~i~~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~ 409 (471) ..+++.+.+.++++++++++....... T Consensus 92 -----------------------------------------------------~~~~~~~~~~~~~~~~~~~~~~~~~~~ 118 (171) T PF01546_consen 92 -----------------------------------------------------SDNSPPLVEALKKAAKEVGGPVKPEPS 118 (171) T ss_dssp -----------------------------------------------------HHTSHHHHHHHHHHHHHCTEEEEEEEB T ss_pred -----------------------------------------------------CCCHHHHHHHHHHHHHHHCCCCCCCCC T ss_conf -----------------------------------------------------545089999999999996287655300 Q ss_pred CCCCHHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH Q ss_conf 277655689999861899899968778666756777600489999999999999999 Q T0607 410 EGGTIPFMSMLGEQFPKAQFMITGVLGPHSNAHGPNEFLHLDMVKKLTSCVSYVLYS 466 (471) Q Consensus 410 ~gGt~~~~~~~~~~~~~~p~~~~G~g~~~~~~H~~~E~~~i~~l~~~~~il~~~l~~ 466 (471) .+| .|+..+....+++|++.+|++. 
.++|++||+++++++.+++++|+++|.+ T Consensus 119 ~g~--tD~~~~~~~~~~~~~i~~g~~~--~~~H~~~E~i~~~~~~~~~~~~~~~l~~ 171 (171) T PF01546_consen 119 GGG--TDAGYFANVGLGIPTIGFGPGG--GNAHSPNEYISIDSLEKGIEVYARILEN 171 (171) T ss_dssp SS----THHHHHTHHSEEEEEEEEECC--GGTTSTT-EEEHHHHHHHHHHHHHHHHH T ss_pred CCC--CCHHHHHHCCCCEEEEEECCCC--CCCCCCCCEEEHHHHHHHHHHHHHHHHC T ss_conf 047--7547876405780499968999--8889988483099999999999999969 No 2 >PF07687 M20_dimer: Peptidase dimerisation domain This family only corresponds to M20 family; InterPro: IPR011650 This domain consists of 4 beta strands and two alpha helices which make up the dimerisation surface of members of the MEROPS peptidase family M20 . This family includes a range of zinc exopeptidases: carboxypeptidases, dipeptidases and specialised aminopeptidases .; GO: 0016787 hydrolase activity, 0046983 protein dimerization activity; PDB: 2q43_A 1xmb_A 1ysj_A 2v8h_D 2v8g_B 2v8v_D 2v8d_B 1r43_B 2vl1_C 1r3n_H .... Probab=99.12 E-value=1.8e-10 Score=79.69 Aligned_cols=106 Identities=22% Similarity=0.194 Sum_probs=83.7 Q ss_pred EEEEEEEEEEEEEEECCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCEEECCCCCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 85102368888986447763445677414789999999999974002450213310144430008999999863345556 Q T0607 203 TSLRGNLVGKLTVELINEGVHSGSASGIVADSFRVARQLISRIEDENTGEIKLPQLYCDIPDERIKQAKQCAEILGEQVY 282 (471) Q Consensus 203 ~g~kG~~~~~i~v~g~~~~~hs~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~ 282 (471) +|+||.++++++++|. .+||+ .+..+.|++..+++++..+........ 
T Consensus 1 ~g~~G~~~~~i~~~G~--~~Hs~-~p~~g~nAi~~~~~~l~~l~~~~~~~~----------------------------- 48 (108) T PF07687_consen 1 YGHRGIIWFRITVKGK--SAHSS-RPENGVNAIEAAAKLLNALRDLRFEIA----------------------------- 48 (108) T ss_dssp -EBCEEEEEEEEEESS---EETT--TTCCSHHHHHHHHHHHHHHTTCBCBS----------------------------- T ss_pred CCCCCEEEEEEEEEEE--CCCCC-CCCCCCCHHHHHHHHHHHHHHHHCCCC----------------------------- T ss_conf 9688679999999725--56777-975672899999999999976541333----------------------------- Q ss_pred HHHHHHHHHHHHHCCCHHHHHHHHHHEEEECCEEEEECCCCCCCCEECCEEEEEEEEECCCCCCHHHHHHHHHHHHHHHC Q ss_conf 54211100123200101233332210012013021000578513035474899997512787888999999999999732 Q T0607 283 SEFPWIDSAKPVIQDKQQLILNRTWRPALTVTGADGFPAIADAGNVMRPVTSLKLSMRLPPLVDPEAASVAMEKALTQNP 362 (471) Q Consensus 283 ~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~v~~i~~~~~~~~~~nvip~~~~~~~d~R~~~~~~~~~~~~~i~~~~~~~~ 362 (471) .......++++++.+.++. +.|+||++|++++|+|+.+.++.+++.+.+++.+++.. T Consensus 49 -------------------~~~~~~~~~~~~~~i~gg~----~~n~ip~~~~~~~diR~~~~~~~~~i~~~i~~~~~~~~ 105 (108) T PF07687_consen 49 -------------------EEFFLGPPTINVTMIHGGT----APNVIPDEAEATVDIRLPPGEDLEEIEERIEAIIEEAA 105 (108) T ss_dssp -------------------HHHHHTSEEEEEEEEEESS----ETTSE-SEEEEEEEEEESTTHHHHHHHHHHHHHHHHHH T ss_pred -------------------CCCCCCCCEEEEEEEECCC----CCCEECCEEEEEEEEECCCCCHHHHHHHHHHHHHHHHH T ss_conf -------------------2356889668997740788----67458787999999978995249999999999998773 Q ss_pred C Q ss_conf 3 Q T0607 363 P 363 (471) Q Consensus 363 ~ 363 (471) . T Consensus 106 ~ 106 (108) T PF07687_consen 106 A 106 (108) T ss_dssp H T ss_pred H T ss_conf 3 No 3 >PF05343 Peptidase_M42: M42 glutamyl aminopeptidase; InterPro: IPR008007 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. 
This group of metallopeptidases belong to MEROPS peptidase family M42 (glutamyl aminopeptidase family, clan MH). For members of this family and family M28 the predicted metal ligands occur in the same order in the sequence: H, D, E, D/E, H; and the active site residues occur in the motifs HXD and EE. ; PDB: 1vhe_A 1ylo_E 2fvg_A 2vpu_A 2pe3_B 1y0r_A 1xfo_C 1y0y_A 1vho_A 2gre_K .... Probab=99.00 E-value=4.3e-09 Score=71.13 Aligned_cols=71 Identities=14% Similarity=0.020 Sum_probs=52.5 Q ss_pred CHHHHHHHHHHHHHHCCCCCCEEECCCCHHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHH Q ss_conf 88999999999998369765234427765568999986189989996877866675677760048999999999 Q T0607 386 SDWLAKAASEASMTYYDKPAAYMGEGGTIPFMSMLGEQFPKAQFMITGVLGPHSNAHGPNEFLHLDMVKKLTSC 459 (471) Q Consensus 386 ~~~~~~~~~~a~~~~~~~~~~~~~~gGt~~~~~~~~~~~~~~p~~~~G~g~~~~~~H~~~E~~~i~~l~~~~~i 459 (471) ++.+.+.+.+++++.. .|...-...++++|+..+.....|+|+..+++.. .++|+|.|.++++|+...+++ T Consensus 222 ~~~l~~~l~~~A~~~~-Ip~Q~~~~~~~gTDa~~i~~~~~Gi~t~~isiP~--ry~Hs~~E~~~~~Di~~~~~L 292 (292) T PF05343_consen 222 NPKLVEKLIELAEENG-IPYQLEVFSGGGTDAGAIQLSGGGIPTAVISIPC--RYMHSPYEMVHLDDIEATVKL 292 (292) T ss_dssp -HHHHHHHHHHHHHTT---EEEEEEST---CHCHHHTSCS-SEEEEEEEEE--BSTTSTTEEEEHHHHHHHHHH T ss_pred CHHHHHHHHHHHHHCC-CCEEEECCCCCCCHHHHHHHHCCCCCEEEECCCC--CCCCCCCEEEEHHHHHHHHHC T ss_conf 9999999999999859-9869981789770799999828998889981030--667883468779999998539 No 4 >PF04389 Peptidase_M28: Peptidase family M28; InterPro: IPR007484 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. 
Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases . Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This domain is found in metallopeptidases belonging to the MEROPS peptidase family M28 (aminopeptidase Y, clan MH) . 
They also contain a transferrin receptor-like dimerisation domain (IPR007365 from INTERPRO) and a protease-associated PA domain (IPR003137 from INTERPRO).; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 2zeg_B 2zeh_A 2afw_B 2zel_A 2zed_A 2zeo_B 2zee_B 2zef_B 2afs_A 2afo_B .... Probab=98.47 E-value=1.8e-07 Score=61.10 Aligned_cols=86 Identities=26% Similarity=0.310 Sum_probs=64.6 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCC-CCEEEEEECCCCCCCH Q ss_conf 879997340323288776778886354334542135642224203589999988763147877-7368998156433644 Q T0607 89 DTVLLYGHLDKQPEMSGWSDDLHPWKPVLKNGLLYGRGGADDGYSAYASLTAIRALEQQGLPY-PRCILIIEACEESGSY 167 (471) Q Consensus 89 ~~i~l~~H~DvVp~~~~~~w~~~Pf~~~~~~~~~~GrG~~D~k~~~~~~l~a~~~l~~~~~~~-~~~~~~~~~~EE~G~~ 167 (471) +.|++.+|+|+++ ... | +..|+.|+..|++++|+++|.|++...++ .++++++...||.|.. T Consensus 1 e~ivi~aH~Ds~~-~~~------~----------~~~GA~DnasG~a~lLe~ar~l~~~~~~~~r~I~fi~~~~EE~gl~ 63 (176) T PF04389_consen 1 EYIVIGAHYDSVG-KDA------P----------WSPGANDNASGVAALLELARALSESKPPPKRTIRFIFFDGEEQGLL 63 (176) T ss_dssp EEEEEEEE--B-T---------------------T----TTTTHHHHHHHHHHHHHHHSCHTTSEEEEEEEESSTTCS-- T ss_pred CEEEEEEECCCCC-CCC------C----------CCCCCCCCHHHHHHHHHHHHHHHHCCCCCCCCEEEEEECCCCCCCC T ss_conf 9899994227898-767------7----------7579565618899999999999863788887379999566225651 Q ss_pred HHHHHHHHHHHHCCCCEEEEECCC Q ss_conf 699999997764289608997687 Q T0607 168 DLPFYIELLKERIGKPSLVICLDS 191 (471) Q Consensus 168 g~~~~~~~~~~~~~~~d~~~~~~~ 191 (471) |++++++..........+++..|. 
T Consensus 64 GS~~~v~~~~~~~~~i~~~inlD~ 87 (176) T PF04389_consen 64 GSRAYVEHDHEELDNIKAVINLDM 87 (176) T ss_dssp --HHHHHHHHHHHHHEEEEEEEST T ss_pred CHHHHHHHCCCCHHCEEEEEEEEC T ss_conf 119999725033101569996413 No 5 >PF05450 Nicastrin: Nicastrin; InterPro: IPR008710 Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. Nicastrin is synthesised in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin) .; GO: 0016485 protein processing, 0016021 integral to membrane Probab=95.34 E-value=0.031 Score=28.68 Aligned_cols=73 Identities=14% Similarity=0.077 Sum_probs=57.6 Q ss_pred CEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHC-C---CCCCCEEEEEECCCCC Q ss_conf 87999734032328877677888635433454213564222420358999998876314-7---8777368998156433 Q T0607 89 DTVLLYGHLDKQPEMSGWSDDLHPWKPVLKNGLLYGRGGADDGYSAYASLTAIRALEQQ-G---LPYPRCILIIEACEES 164 (471) Q Consensus 89 ~~i~l~~H~DvVp~~~~~~w~~~Pf~~~~~~~~~~GrG~~D~k~~~~~~l~a~~~l~~~-~---~~~~~~~~~~~~~EE~ 164 (471) |-|++.+.||+.-...+ .+.|+.+...++++.|.+++.|.+. . ...+++.+.|...|.. T Consensus 1 ~iIlv~armDs~s~F~~-----------------~s~GAds~~SglvaLLaaA~~L~~~~~~~~~~~r~v~F~fF~GEs~ 63 (234) T PF05450_consen 1 PIILVSARMDSFSFFPD-----------------VSPGADSSLSGLVALLAAARALSKLLDDLSDLPRNVLFAFFNGESY 63 (234) T ss_pred CEEEEEECCCCCCCCCC-----------------CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCEEEEEECCCCC T ss_conf 97999964455014456-----------------7877456147899999999999985512014665459998468765 Q ss_pred CCHHHHHHHHHHHH Q ss_conf 64469999999776 Q T0607 165 GSYDLPFYIELLKE 178 (471) Q Consensus 165 G~~g~~~~~~~~~~ 178 (471) |-.|..+++..+.. 
T Consensus 64 dYiGS~Rfvydm~~ 77 (234) T PF05450_consen 64 DYIGSSRFVYDMEK 77 (234) T ss_pred CCCCHHHHHHHHHC T ss_conf 65015999999975 No 6 >PF02127 Peptidase_M18: Aminopeptidase I zinc metalloprotease (M18); InterPro: IPR001948 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases . Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. 
Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases belong to the MEROPS peptidase family M18, (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal aminoacid, usually neutral or hydrophobic, from a polypeptide . The type example is aminopeptidase I from Saccharomyces cerevisiae, the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids . A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic alpha-helix . The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria. It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways .; GO: 0004177 aminopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0005773 vacuole; PDB: 2ijz_C 2glf_B 1y7e_A 2glj_V. Probab=85.30 E-value=0.68 Score=20.33 Aligned_cols=45 Identities=13% Similarity=0.014 Sum_probs=34.9 Q ss_pred HHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHHHHH Q ss_conf 689999861899899968778666756777600489999999999999 Q T0607 416 FMSMLGEQFPKAQFMITGVLGPHSNAHGPNEFLHLDMVKKLTSCVSYV 463 (471) Q Consensus 416 ~~~~~~~~~~~~p~~~~G~g~~~~~~H~~~E~~~i~~l~~~~~il~~~ 463 (471) -..+++.. .|++++-+|+... .+|++.|.+...|+...++++... 
T Consensus 387 iGpi~~a~-~Gi~tiDiG~p~L--sMHS~rE~~~~~D~~~~~~~~~~F 431 (432) T PF02127_consen 387 IGPILSAR-LGIRTIDIGIPQL--SMHSIRETCGKKDVYYLYRAFKAF 431 (432) T ss_dssp -THHHHHT----EEEEE-EEEB--STTSSS-BEEHHHHHHHHHHHHHT T ss_pred HHHHHHHH-CCCCEEECCHHHH--HCCCHHHHHCCCCHHHHHHHHHHH T ss_conf 88999876-6997797262243--045688872554599999999975 No 7 >PF01320 Colicin_Pyocin: Colicin immunity protein / pyocin immunity protein; InterPro: IPR000290 This family includes bacterial colicin and pyocin immunity proteins , . These immunity proteins can bind specifically to the DNase-type colicins and pyocins and inhibit their bactericidal activity. The 1.8-angstrom crystal structure of the ImmE7 protein consists of four antiparallel alpha-helices . Sequence similarities between colicins E2, A and E1 are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 immunity proteins. Pyocin protects a cell that harbours the plasmid ColE2 encoding colicin E2 against colicin E2; it is thus essential both for autonomous replication and colicin E2 immunity .; GO: 0015643 toxin binding, 0030153 bacteriocin immunity; PDB: 1unk_D 1ayi_A 2jbg_C 2jaz_A 1znv_A 1ujz_A 2jb0_A 2k0d_X 2erh_A 1cei_A .... Probab=39.38 E-value=6.6 Score=14.24 Aligned_cols=52 Identities=19% Similarity=0.297 Sum_probs=34.2 Q ss_pred HHHHHHHHHHHHHHHCCCCCCC---CCCCCCCCCHHHHHHHHHHHHHHHHHCCCCCE Q ss_conf 9889999999999961789798---86666666547999999999999983548841 Q T0607 13 QQWQEEILPSLCDYIKIPNKSP---HFDAKWEEHGYMEQAVNHIANWCKSHAPKGMT 66 (471) Q Consensus 13 ~~~~~~~i~~l~~lv~ipS~s~---~~~~~~~~~~~~~~~~~~l~~~l~~~~~~g~~ 66 (471) ++..+++|+.|.+++..|+.|. .+......+ .+.+.+.+.+|=++.|..||. 
T Consensus 30 ee~~d~lv~~F~~iteHP~gsDLIfyP~~~~e~s--PegIv~~iK~WRa~nG~p~FK 84 (85) T PF01320_consen 30 EEEHDELVDHFEKITEHPDGSDLIFYPEDGREDS--PEGIVKIIKEWRASNGKPGFK 84 (85) T ss_dssp HHHHHHHHHHHHHHH------HHHHS--TTS-SS--CHHHHHHHHHHHHH------B T ss_pred HHHHHHHHHHHHHCCCCCCCCCEEEECCCCCCCC--HHHHHHHHHHHHHHCCCCCCC T ss_conf 8999999999997479999777355289887689--899999999999973997568 No 8 >PF04692 PDGF_N: Platelet-derived growth factor, N terminal region; InterPro: IPR006782 Platelet-derived growth factor (PDGF) , is a potent mitogen for cells of mesenchymal origin, including smooth muscle cells and glial cells. In both mouse and human, the PDGF signalling network consists of four ligands, PDGFA-D, and two receptors, PDGFRalpha and PDGFRbeta. All PDGFs function as secreted, disulphide-linked homodimers, but only PDGFA and B can form functional heterodimers. PDGFRs also function as homo- and heterodimers. All known PDGFs have characteristic 'PDGF domains', which include eight conserved cysteines that are involved in inter- and intramolecular bonds. Alternate splicing of the A chain transcript can give rise to two different forms that differ only in their C-terminal extremity. The transforming protein of simian sarcoma virus (SSV), encoded by the v-sis oncogene, is derived from the B chain of PDGF. PDGFs are mitogenic during early developmental stages, driving the proliferation of undifferentiated mesenchyme and some progenitor populations. During later maturation stages, PDGF signalling has been implicated in tissue remodelling and cellular differentiation, and in inductive events involved in patterning and morphogenesis. In addition to driving mesenchymal proliferation, PDGFs have been shown to direct the migration, differentiation and function of a variety of specialised mesenchymal and migratory cell types, both during development and in the adult animal . PDGF is structurally related to a number of other growth factors which also form disulphide-linked homo- or heterodimers. 
This domain consists of the N-terminal regions of PGDF A and B.; GO: 0008083 growth factor activity, 0016020 membrane; PDB: 1pdg_B. Probab=21.73 E-value=7.8 Score=13.77 Aligned_cols=30 Identities=13% Similarity=0.247 Sum_probs=21.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHCCCCCCC Q ss_conf 8999999999889999999999961789798 Q T0607 4 PQGLYDYICQQWQEEILPSLCDYIKIPNKSP 34 (471) Q Consensus 4 ~~~~~~~i~~~~~~~~i~~l~~lv~ipS~s~ 34 (471) |+.+++-+ .+..=.-|+.|++|+.|.|+.. T Consensus 6 P~elierL-s~S~I~SIsDLQRLL~iDSVg~ 35 (78) T PF04692_consen 6 PEELIERL-SNSEIHSISDLQRLLEIDSVGE 35 (78) T ss_dssp ------------------------------- T ss_pred CHHHHHHH-HCCCCCCHHHHHHHHHCCCCCC T ss_conf 69999998-2588565999999984267776 No 9 >PF09280 XPC-binding: XPC-binding domain; InterPro: IPR015360 Members of this entry adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair . ; GO: 0003684 damaged DNA binding, 0006289 nucleotide-excision repair, 0043161 proteasomal ubiquitin-dependent protein catabolic process; PDB: 1x3z_B 2qsh_X 2qsg_X 1x3w_B 3esw_B 2qsf_X 2f4o_B 2f4m_B 1pve_A 1oqy_A .... Probab=18.82 E-value=15 Score=11.96 Aligned_cols=18 Identities=50% Similarity=0.442 Sum_probs=10.1 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 9999999988999999999 Q T0607 6 GLYDYICQQWQEEILPSLC 24 (471) Q Consensus 6 ~~~~~i~~~~~~~~i~~l~ 24 (471) .++++| .+++++|++.|. T Consensus 37 ~l~q~I-~~n~e~Fl~ll~ 54 (59) T PF09280_consen 37 QLLQLI-QQNQEEFLRLLN 54 (59) T ss_dssp HHHHHH-HHTHHHHHHHHH T ss_pred HHHHHH-HHCHHHHHHHHC T ss_conf 999999-989999999880 No 10 >PF00883 Peptidase_M17: Cytosol aminopeptidase family, catalytic domain; InterPro: IPR000819 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. 
In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus. Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N-terminus of polypeptide chains, but substantial rates are evident for all amino acids. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain.; GO: 0004177 aminopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 2hb6_B 2hc9_A 1bpn_A 1lcp_A 1bll_E 1lan_A 1lam_A 2j9a_A 1bpm_A 1lap_A .... 
Probab=13.37 E-value=21 Score=11.11 Aligned_cols=118 Identities=21% Similarity=0.210 Sum_probs=0.0 Q ss_pred HHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCCEEEEEECC------------------CCCEEEEE Q ss_conf 999999961789798866666665479999999999999835488417897538------------------99248999 Q T0607 20 LPSLCDYIKIPNKSPHFDAKWEEHGYMEQAVNHIANWCKSHAPKGMTLEIVRLK------------------NRTPLLFM 81 (471) Q Consensus 20 i~~l~~lv~ipS~s~~~~~~~~~~~~~~~~~~~l~~~l~~~~~~g~~~~~~~~~------------------~~~~~v~~ 81 (471) +.+-++|+.-|+---. ....++++.+.+++. ++++++++.. .....|.. T Consensus 1 vn~aRdL~n~P~n~~~----------p~~~a~~a~~~~~~~---~~~v~v~~~~~l~~~gmg~~laV~~gS~~~P~lv~l 67 (311) T PF00883_consen 1 VNLARDLVNTPANRMT----------PETFAEYAKELAKEY---GVKVEVLDKKELEELGMGGLLAVGRGSAHPPRLVVL 67 (311) T ss_dssp -HHHHHHHHS-TTTSS----------HHHHHHHHHHHHHHC---TEEEEEEEHHHHHHTT-HHHHHHH----S--EEEEE T ss_pred CHHHHHHHCCCHHHCC----------HHHHHHHHHHHHHHC---CCEEEEEEHHHHHHCCCCCEEEEECCCCCCCEEEEE T ss_conf 9377765078944449----------999999999998656---988999729999877997045552346899879999 Q ss_pred EECCCC---CCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCEEEEE Q ss_conf 873799---88799973403232887767788863543345421356422242035899999887631478777368998 Q T0607 82 EIPGQI---DDTVLLYGHLDKQPEMSGWSDDLHPWKPVLKNGLLYGRGGADDGYSAYASLTAIRALEQQGLPYPRCILII 158 (471) Q Consensus 82 ~~~g~~---~~~i~l~~H~DvVp~~~~~~w~~~Pf~~~~~~~~~~GrG~~D~k~~~~~~l~a~~~l~~~~~~~~~~~~~~ 158 (471) ++.|.. .++|.|.|-==|-.-| ...++.+..--.--.||-|..+. 
+.+++++.+.+++ .+++.++ T Consensus 68 ~Y~g~~~~~~~~i~lVGKGiTFDtG----------G~~lKp~~~M~~Mk~DM~GAAaV-lga~~aia~l~~~-vnv~~vl 135 (311) T PF00883_consen 68 EYKGAGPESKKPIALVGKGITFDTG----------GLSLKPGAGMEGMKYDMGGAAAV-LGAMRAIAELKLP-VNVVAVL 135 (311) T ss_dssp EE----STTS-EEEEE-----EE------------------STTGGGGGGGG---CCC-HHHHHHHHHCTBS-SEEEEEE T ss_pred EECCCCCCCCCCEEEECCEEEEECC----------CCCCCCCCCHHHHHCCCHHHHHH-HHHHHHHHHHCCC-CEEEEEE T ss_conf 9689887778718997466898578----------87888800077503673689999-9999999980949-3799999 Q ss_pred ECCC Q ss_conf 1564 Q T0607 159 EACE 162 (471) Q Consensus 159 ~~~E 162 (471) ..+| T Consensus 136 ~~~E 139 (311) T PF00883_consen 136 PLAE 139 (311) T ss_dssp EE-E T ss_pred ECHH T ss_conf 7021 Done!