Query         T0543 Autotaxin, RAT, 887 residues
Match_columns 887
No_of_seqs    296 out of 593
Neff          5.6
Searched_HMMs 11830
Date          Fri May 21 18:12:50 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0543.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0543.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF01663 Phosphodiest: Type I   100.0       0       0  379.2  18.0  302   165-477    1-364 (364)
  2 PF01223 Endonuclease_NS: DNA/  100.0 1.3E-41 1.4E-45  302.3  10.1  194   639-868    2-203 (204)
  3 PF00884 Sulfatase: Sulfatase;   99.9 4.6E-24 3.9E-28  181.2  11.1  252   163-519    1-308 (379)
  4 PF08665 PglZ: PglZ domain; I    98.9 1.6E-09 1.3E-13   80.9   8.3  172   164-365    2-178 (181)
  5 PF01676 Metalloenzyme: Metall   98.9 1.9E-09 1.6E-13   80.3   7.7   73   285-366  139-211 (262)
  6 PF02995 DUF229: Protein of un   98.8 2.1E-07 1.8E-11   66.2  14.6  183   153-364  118-348 (498)
  7 PF01033 Somatomedin_B: Somato   98.5 1.9E-08 1.6E-12   73.4   1.8   42     55-97     2-43 (43)
  8 PF00245 Alk_phosphatase: Alka   98.2 4.5E-06 3.8E-10   57.1   8.8  193   163-362    2-313 (421)
  9 PF01033 Somatomedin_B: Somato   97.6 7.8E-06 6.6E-10   55.4   1.5   41   101-141     3-43 (43)
 10 PF04185 Phosphoesterase: Phos   95.5   0.047   4E-06   29.3   8.6  164   183-359  121-307 (376)
 11 PF07394 DUF1501: Protein of u   93.3    0.13 1.1E-05   26.2   6.8   61   300-364  246-306 (392)
 12 PF05033 Pre-SET: Pre-SET moti   79.3    0.23 1.9E-05   24.6   0.9   46    70-126   46-103 (103)
 13 PF11658 DUF3260: Protein of u   60.8     3.2 0.00027   16.7  12.2  181   267-455  312-513 (518)
 14 PF05827 ATP-synt_S1: Vacuolar   52.6     4.3 0.00037   15.8   4.0   53   297-359  124-176 (280)
 15 PF02739 5_3_exonuc_N: 5'-3' e   45.6     5.5 0.00046   15.0   4.7   35   330-367  109-143 (169)
 16 PF11946 DUF3463: Domain of un   26.3     5.5 0.00047   15.0  -0.0   41   619-659   59-101 (140)
 17 PF04852 DUF640: Protein of un   25.0      12 0.00098   12.8   3.3   65   283-352   48-114 (131)
 18 PF04666 Glyco_transf_54: N-Ac   25.0      11 0.00095   12.9
1.3 33 328-360 66-98 (297) 19 PF02127 Peptidase_M18: Aminop 24.5 12 0.001 12.7 1.4 41 479-521 160-204 (432) 20 PF02110 HK: Hydroxyethylthiaz 21.0 14 0.0012 12.3 2.0 19 744-762 143-161 (246) No 1 >PF01663 Phosphodiest: Type I phosphodiesterase / nucleotide pyrophosphatase; InterPro: IPR002591 This family consists of phosphodiesterases, including human plasma-cell membrane glycoprotein PC-1 / alkaline phosphodiesterase I / nucleotide pyrophosphatase (nppase). These enzymes catalyse the cleavage of phosphodiester and phosphosulphate bonds in NAD, deoxynucleotides and nucleotide sugars . Another member of this family is ATX an autotaxin, tumor cell motility-stimulating protein which exhibits type I phosphodiesterases activity . The alignment encompasses the active site , . Also present within this family is 60 kDa Ca^2+-ATPase from Myroides odoratus . This signature also hits a number of ethanolamine phosphate transferase involved in glycosylphosphatidylinositol-anchor biosynthesis.; GO: 0003824 catalytic activity; PDB: 1ei6_B 2gsu_A 2gso_A 2gsn_B 2rh6_A. Probab=100.00 E-value=0 Score=379.23 Aligned_cols=302 Identities=28% Similarity=0.503 Sum_probs=247.0 Q ss_pred EEEEEECCCCHHHHHCCCCCCHHHHHHHHCCCCCCEEEECCCCCCHHHHHHHHCCCCCCCCCEEECCEECCCCCCEEEEC Q ss_conf 89998448786887210003716899996595025224066542424368872386510165220401254357513226 Q T0543 165 LIIFSVDGFRASYMKKGSKVMPNIEKLRSCGTHAPYMRPVYPTKTFPNLYTLATGLYPESHGIVGNSMYDPVFDASFHLR 244 (887) Q Consensus 165 vILISiDGfR~dyl~~~~~~tPNL~~La~~Gv~a~~m~pvfPT~T~PNh~SIvTGlyPe~HGIv~N~~~Dp~~~~~f~l~ 244 (887) ||||++|||++++++++...+|||++|+++|+++++++++|||.|.|||+||+||++|+.|||++|.+|||.......+. 
T Consensus 1 vvli~vDGl~~~~l~~~~~~~p~l~~l~~~G~~~~~~~s~~Ps~T~~~~~si~TG~~P~~HGv~~n~~~d~~~~~~~~~~ 80 (364) T PF01663_consen 1 VVLIGVDGLRPDLLERYIGDLPNLARLAQEGVYAPNLRSVFPSTTAPNWTSILTGVYPGEHGVVGNTWYDPKRGKDSWFW 80 (364) T ss_dssp EEEEEE----HHHHHHHH---HHHHHHHH---EES-BB--SS-SHHHHHHHHH----TTT----SSCEEETTTTEEEE-- T ss_pred CEEEEECCCCHHHHHHHHCCCCHHHHHHHCCCEECCCCCCCCCCCCCCHHHHHHCCCHHHCCCCCCCCCCCCCCCCCCCC T ss_conf 98999838999999766426918999997896760124479985300076877447987719862125575544443222 Q ss_pred CCCCCCCHHCCCCCHHHHHHHCCCEEEEEECCCCCCHHHH---------------------------------------- Q ss_conf 7665671110677054576774970899975886347889---------------------------------------- Q T0543 245 GREKFNHRWWGGQPLWITATKQGVRAGTFFWSVSIPHERR---------------------------------------- 284 (887) Q Consensus 245 ~~~~~~~~w~~gePiW~ta~~~G~ksa~~~Wpgs~P~~~r---------------------------------------- 284 (887) .. ......+...|||.++.++|++++.++||++.+-... T Consensus 81 ~~-~~~~~~~~~~~i~~~~~~~g~~~a~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~P~~~~~~~~~~~~~~~~~~~~~ 159 (364) T PF01663_consen 81 DE-LDDSGDIRAPPIWETLADAGKKSAVVNWPGTYPPYPSYNGWLVSGFGAPDISRFYPSELSDEIYSGAGDVPLDVQWF 159 (364) T ss_dssp TC--TTTGG--CG-CCHHHCC---EEEECS-ESSHCHHHH----------EEESTTGGTSB---HHHH------H----- T ss_pred CC-CCCCCCCCCCCEEEHHHHCCCEEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC T ss_conf 34-44445666750033477649816888731003334554221111111111112332101112001366654321112 Q ss_pred ------HHHHHH-HHC-CCCCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEE Q ss_conf ------999999-860-656678728998425634200135888789999999999999999999985686346517997 Q T0543 285 ------ILTILQ-WLS-LPDNERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFV 356 (887) Q Consensus 285 ------id~vl~-wl~-lp~~e~P~l~~lY~~epD~~GH~yGp~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tnIIiv 356 (887) .+.++. ++. 
+.+.++|+|+++||.++|.+||.||++|+++.++++++|+.||+|+++|++.++.++++|||+ T Consensus 160 ~~~~~~~d~~~~~~~~~l~~~~~pd~~~~~~~~~D~~~H~~G~~s~~~~~a~~~~D~~lg~ll~~l~~~~~~~~t~vivt 239 (364) T PF01663_consen 160 FEQSPFVDEAITDVAEYLLERERPDFIFVYFPEPDSAGHAYGPDSPEYEDALRRLDRALGRLLDALDEDGLYEDTNVIVT 239 (364) T ss_dssp -SSCCHHHHHHHH--SHHHHTT-ECEEEEEEEHHHHHHHHS---SHHHHHHHHHHHHH---HHHHHHH---GGGEEEEEE T ss_pred CCCCHHHHHHHHHHHHHHHHCCCCCEEEEECCCCCHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEE T ss_conf 22537899999999999986079979999788897100179999899999999999999999998886077788799997 Q ss_pred CCCCCCCCCCCCEEEHHHHC--------CCCCCEE--ECCCCEEEEEEECCCCCCHHHHHHHHHHHHC-CCCCCEEEEE- Q ss_conf 15666567877626535323--------7530057--5167413688625752000189999999840-3777425652- Q T0543 357 GDHGMEDVTCDRTEFLSNYL--------TNVDDIT--LVPGTLGRIRAKSINNSKYDPKTIIANLTCK-KPDQHFKPYM- 424 (887) Q Consensus 357 SDHGM~~v~~~~~I~L~~yl--------~~~~~~~--~~~G~~~~I~pk~~~~~~~~~e~i~~~L~~~-~~~~h~~vY~- 424 (887) |||||.++..++.|.+++++ ....... ...+.+..++++ +.....+++++.|+.. .+...+.+++ T Consensus 240 sDHG~~~~~~~~~i~ln~~l~~~g~~~~~~~~~~~~~~~~~~~~~i~~~---~~~~~~~~v~~~L~~~~~~~~~~~~~~~ 316 (364) T PF01663_consen 240 SDHGMSPVPPRKVIDLNDYLREAGLLKLDEGDDRVDAVGEGRMAYIYVK---EDPKRIAEVAAALKELRDPQEGIAVILA 316 (364) T ss_dssp -----EEB--B-EEEHCHC--HHHHHHCHTTTECEEESTB--EE--EEE----TTS-HHHHHHHH------BSSEEEEEE T ss_pred CCCCCCCCCCCCEECHHHHHHHHHHCCCCCCCCEEECCCCCCEEEEECC---CCHHHHHHHHHHHHHCCCCCCCEEEEEC T ss_conf 7689888887868728995311100135755421100037751368757---7666899999999854467787489961 Q ss_pred --HHHHHHHHHCCCCCCCCCEEEEECCCEEEEECCCCCCCCCCCCCCCCCCCCCC Q ss_conf --01212566316788656248986377078743444343456666666567637 Q T0543 425 --KQHLPKRLHYANNRRIEDIHLLVDRRWHVARKPLDVYKKPSGKCFFQGDHGFD 477 (887) Q Consensus 425 --ke~lP~r~Hy~~n~Ri~~I~l~~d~Gw~i~~~~~~~~~~~~~~~~~~G~HGYd 477 (887) ++++|+|+||. ++|++||++++++||.+..... 
......+|+|||| T Consensus 317 ~~~~~~~~~~~~~-~~r~gdlvv~~~~g~~~~~~~~------~~~~~~~g~HG~~ 364 (364) T PF01663_consen 317 VYREELPERWHFG-SPRAGDLVVVARPGYSFAYSDK------EKREKSRGMHGYD 364 (364) T ss_dssp HHHHGSHHHHT---GCCS-SEEEEE-TT-EEE-HCC------S-----B------ T ss_pred CCHHHHHHHCCCC-CCCCCCEEEEECCCEEEEECCC------CCCCCCCCCCCCC T ss_conf 4454404640789-8887888999629989984676------7778888709989 No 2 >PF01223 Endonuclease_NS: DNA/RNA non-specific endonuclease; InterPro: IPR001604 A family of bacterial and eukaryotic endonucleases 3.1.30 from EC share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. An histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion.; GO: 0003676 nucleic acid binding, 0004519 endonuclease activity; PDB: 2o3b_A 1zm8_A 1ql0_B 1smn_B 1qae_A 1g8t_A. Probab=100.00 E-value=1.3e-41 Score=302.28 Aligned_cols=194 Identities=26% Similarity=0.492 Sum_probs=168.0 Q ss_pred ECCCEEEEHHHCCCCEEEEEEECCCCCCCCCCC-----CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECC-CCCCC Q ss_conf 043012111002883379998756213466777-----665200146557744433432324578754321067-00045 Q T0543 639 HTDFESGYSEIFLMPLWTSYTISKQAEVSSIPE-----HLTNCVRPDVRVSPGFSQNCLAYKNDKQMSYGFLFP-PYLSS 712 (887) Q Consensus 639 ~~~Yv~~Ys~~~~~P~Wvsy~L~~~~~~~~~~~-----~~s~~fr~D~rip~~~~~~csdY~~sg~~drGhL~P-a~~~~ 712 (887) .+.|.++||..+++|+||+|+|++......... ++.+.|+.|++|+..+....++|.++++ |||||+| +++.. 
T Consensus 2 ~~~y~~~y~~~~~~~~~~a~~l~~~~~~~~~~~~~~~~~~~~~~~~d~~i~~~~~~~~~~y~~~~~-dRGHL~p~~~~~~ 80 (204) T PF01223_consen 2 YENYSVCYDRSRRIPLWVAYNLTGANVGRSRRRNMSQERQSNRFFPDPRIPAAFQATNSDYKGSGY-DRGHLAPSADDFF 80 (204) T ss_dssp -SS-EEEE-TTTSSEEEEEEEESCCC------------------B--TTS-GGGS--GGGC---HE-E----S-GHHHHS T ss_pred CCEEEEEECCCCCCEEEEEEEEEHHHCCCCCCCCCCCCCCCEEEEECCCCCHHHEECCCCCCCCCC-CCCCCCCCCCCCC T ss_conf 856999998898948999999604353666775432223303336779988665624454566777-6466076532323 Q ss_pred CHHHHHHHHHHHCCCCCCHHHHH-HHHHHHHHHHHHHHH-HCCCEEEEECCCCCCCCCCCCCCCCCCEEECCCCEEECCC Q ss_conf 65678989888333556534688-999999999999998-5294799975413276677767601002412886001464 Q T0543 713 SPEAKYDAFLVTNMVPMYPAFKR-VWAYFQRVLVKKYAS-ERNGVNVISGPIFDYNYDGLRDTEDEIKQYVEGSSIPVPT 790 (887) Q Consensus 713 s~~a~~etfl~SNivPq~~~fn~-iW~~le~~l~r~~a~-~~~~V~VvsGPifd~~~Dg~~d~~~~~~~~i~~~~V~VPt 790 (887) +..+|.+||+||||+||.++||+ +|++||+ .+|+|+. ++++|||++||+|+.+ ...+++++|+||+ T Consensus 81 ~~~~~~~Tf~~tNi~PQ~~~~N~g~W~~lE~-~vr~~~~~~~~~~~V~tG~~~~~~-----------~~~~~~~~v~VP~ 148 (204) T PF01223_consen 81 SADAQRATFYYTNIVPQWQSFNQGNWNRLEN-YVRDWANSKNDKVYVVTGPIFVPD-----------YYDIGKNRVPVPT 148 (204) T ss_dssp SHCCHHHCCBGGG---EEHHHHHTTHHHHHH-HHHHHCCST--EEEEEE--E-ES----------------B-S-B--EC T ss_pred CHHHHHHHHHHHHCCCEECHHHHHHHHHHHH-HHHHHHHHCCCCEEEEEEEECCCC-----------CCCCCCCCCCCHH T ss_conf 6677987757751243400003679999999-999998734993899995541543-----------0015788644827 Q ss_pred CEEEEEEEECCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCCCHHHHHHHHHCCCHHHHHHHHCHHHCCCCCH Q ss_conf 126899986487877344688722788880657776420014431100035568862664899987715111015856 Q T0543 791 HYYSIITSCLDFTQPADKCDGPLSVSSFILPHRPDNDESCNSSEDESKWVEELMKMHTARVRDIEHLTGLDFYRKTSR 868 (887) Q Consensus 791 hffkVi~~c~~~~~~~~~c~g~l~viaFIlPn~~~~~~~c~~~~~~~~wv~~~L~~~~~sV~dVE~lTGldFf~~l~~ 868 (887) ||||||..+.+. +.++++||++||.+..... .+. 
.++|++||.+|||+||+++++ T Consensus 149 ~~~Kvv~~~~~~--------~~~~~~af~~~n~~~~~~~-------------~~~--~v~v~~iE~~tG~~f~~~l~~ 203 (204) T PF01223_consen 149 HFWKVVCCPDNN--------GKLKAIAFVLPNDNNPHST-------------NCR--QVSVDEIEELTGLTFFCNLPD 203 (204) T ss_dssp EEEEEEEECT-T--------TTBEEEEEEEEETTS--S---------------GG--B--HHHHHHH---BTTTTS-H T ss_pred HHHEEEEECCCC--------CCEEEEEEEECCCCCCCCC-------------CCE--EECHHHHHHHHCCCCCCCCCC T ss_conf 831799973799--------9667999995898877766-------------006--861899998549832689998 No 3 >PF00884 Sulfatase: Sulfatase; InterPro: IPR000917 Sulphatases 3.1.6. from EC are enzymes that hydrolyze various sulphate esters. The sequence of different types of sulphatases are available and have shown to be structurally related , , , including arylsulphatase A 3.1.6.8 from EC (ASA), a lysosomal enzyme which hydrolyzes cerebroside sulphate; arylsulphatase B 3.1.6.12 from EC (ASB), which hydrolyzes the sulphate ester group from N-acetylgalactosamine 4-sulphate residues of dermatan sulphate; arylsulphatase C (ASD) and E (ASE); steryl-sulphatase 3.1.6.2 from EC (STS), a membrane bound microsomal enzyme which hydrolyzes 3-beta-hydroxy steroid sulphates; iduronate 2-sulphatase precursor 3.1.6.13 from EC (IDS), a lysosomal enzyme that hydrolyzes the 2-sulphate groups from non-reducing-terminal iduronic acid residues in dermatan sulphate and heparan sulphate; N-acetylgalactosamine-6-sulphatase 3.1.6.4 from EC, which hydrolyzes the 6-sulphate groups of the N-acetyl-d-galactosamine 6-sulphate units of chondroitin sulphate and the D-galactose 6-sulphate units of keratan sulphate; glucosamine-6-sulphatase 3.1.6.14 from EC (G6S), which hydrolyzes the N-acetyl-D-glucosamine 6-sulphate units of heparan sulphate and keratan sulphate; N-sulphoglucosamine sulphohydrolase 3.10.1.1 from EC (sulphamidase), the lysosomal enzyme that catalyzes the hydrolysis of N-sulpho-d-glucosamine into glucosamine and sulphate; sea urchin embryo arylsulphatase 3.1.6.1 from EC; green algae 
arylsulphatase 3.1.6.1 from EC, which plays an important role in the mineralization of sulphates; and arylsulphatase 3.1.6.1 from EC from Escherichia coli (aslA), Klebsiella aerogenes (gene atsA) and Pseudomonas aeruginosa (gene atsA).; GO: 0008484 sulfuric ester hydrolase activity, 0008152 metabolic process; PDB: 1p49_A 1e3c_P 1e33_P 1e2s_P 1n2k_A 1auk_A 1n2l_A 1e1z_P 1fsu_A 1hdh_A .... Probab=99.90 E-value=4.6e-24 Score=181.20 Aligned_cols=252 Identities=22% Similarity=0.362 Sum_probs=167.0 Q ss_pred CCEEEEEECCCCHHHHHCCC---CCCHHHHHHHHCCCCCCEEEECCCCCCHHHHHHHHCCCCCCCCCEEECCEEC-CC-- Q ss_conf 87899984487868872100---0371689999659502522406654242436887238651016522040125-43-- Q T0543 163 PPLIIFSVDGFRASYMKKGS---KVMPNIEKLRSCGTHAPYMRPVYPTKTFPNLYTLATGLYPESHGIVGNSMYD-PV-- 236 (887) Q Consensus 163 PpvILISiDGfR~dyl~~~~---~~tPNL~~La~~Gv~a~~m~pvfPT~T~PNh~SIvTGlyPe~HGIv~N~~~D-p~-- 236 (887) |+||+|.+|.+|++.+...+ ..||||++|+++|+.+.++.+.-| .|.|++.+|+||+||..||+..|..+. .. T Consensus 1 PNIv~I~~ds~~~~~~~~~g~~~~~tP~ld~la~~g~~f~~~~~~~~-~t~~s~~s~ltG~~~~~~~~~~~~~~~~~~~~ 79 (379) T PF00884_consen 1 PNIVFIMADSLRADDLGCYGYERPTTPNLDRLAKEGVRFTNAYSNSP-STSPSRASLLTGLYPHNHGVTSNSPYGTPPLP 79 (379) T ss_dssp -EEEEEEEST--CCTCTCB--BSCSSHHHHHHHHSCEEESSEE-SSS-SHHHHHHHHHHS--TTTHT-HSSCCTTTHBTT T ss_pred CEEEEEECCCCCCCCCCCCCCCCCCCHHHHHHHHCCCEEEEEEECCC-CCCHHHHHHHHCCCCCCCCCCCCCCCCCCCCC T ss_conf 95999981249989878789974557899999861835634775698-88058999983762124443112345444677 Q ss_pred --CCCEEEECCCCCCCC--------HHCCCCCHHHHHHHCCCEEE----------------EEECC------CCCCHHHH Q ss_conf --575132267665671--------11067705457677497089----------------99758------86347889 Q T0543 237 --FDASFHLRGREKFNH--------RWWGGQPLWITATKQGVRAG----------------TFFWS------VSIPHERR 284 (887) Q Consensus 237 --~~~~f~l~~~~~~~~--------~w~~gePiW~ta~~~G~ksa----------------~~~Wp------gs~P~~~r 284 (887) ....+.+-...++.- .|+..... ....|+..- ...|. 
....-+.- T Consensus 80 ~~~~~l~~~lk~~GY~T~~~~~~~~~~~~~~~~---~~~~Gfd~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~ 156 (379) T PF00884_consen 80 QDQPSLPEILKEAGYRTAFFGKWHLGFYNRDSF---YPNRGFDKFFGFNGGSDDFWNPEDSDYFWEINRINSWGYSDEAL 156 (379) T ss_dssp TTS--HHHHHHHHT-EEEEE----TTGGGCSCC---STG---EEEEEE-CCGGGSCCSTTSEEETTTTESTTCCCHHHHH T ss_pred CCCCHHHHHHHHCCCCCCCCCCCCCCCCCCCCC---CCCCCCCEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHH T ss_conf 660339999987389630203444677666677---77789764214555543345664456676555565555309999 Q ss_pred HHHHHHHHCCCCCCCCEEEEEECCCCCCCC-------CCC----C------CCCHHHHHHHHHHHHHHHHHHHHHHHCCC Q ss_conf 999999860656678728998425634200-------135----8------88789999999999999999999985686 Q T0543 285 ILTILQWLSLPDNERPSVYAFYSEQPDFSG-------HKY----G------PFGPEMTNPLREIDKTVGQLMDGLKQLRL 347 (887) Q Consensus 285 id~vl~wl~lp~~e~P~l~~lY~~epD~~G-------H~y----G------p~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL 347 (887) ++.+++||. .+.++|.|+.+++..+-.-- ..+ . .....+.++|+.+|..||+|++.||+.|+ T Consensus 157 ~~~~~~~l~-~~~~kPffl~~~~~~~H~P~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~v~~~D~~lg~ll~~l~~~g~ 235 (379) T PF00884_consen 157 FDRAIEFLK-QKDDKPFFLFVHTMDPHSPYYYPDEYPDKYPDFFDDGPENPEYRNRYLNAVRYVDDQLGRLLDYLKESGL 235 (379) T ss_dssp HHHHHHHHH-HTTTSSEEEEEEE-TTSSS--S-HHHHGGGTCGCS-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHTT- T ss_pred HHHHHHHHH-HCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 999999999-6358995699862467544456842233211100000011689999999999999988799999997399 Q ss_pred CCCCCEEEECCCCCCCCCCCCEEEHHHHCCCCCCEEECCCCEEEEEEECCCCCCHHHHHHHHHHHHCCCCCCEEEEEHHH Q ss_conf 34651799715666567877626535323753005751674136886257520001899999998403777425652012 Q T0543 348 HRCVNVIFVGDHGMEDVTCDRTEFLSNYLTNVDDITLVPGTLGRIRAKSINNSKYDPKTIIANLTCKKPDQHFKPYMKQH 427 (887) Q Consensus 348 ~~~tnIIivSDHGM~~v~~~~~I~L~~yl~~~~~~~~~~G~~~~I~pk~~~~~~~~~e~i~~~L~~~~~~~h~~vY~ke~ 427 (887) .++|.||++||||..--. 
T Consensus 236 ~~nTlii~tsDHG~~~~e-------------------------------------------------------------- 253 (379) T PF00884_consen 236 YDNTLIIFTSDHGESFGE-------------------------------------------------------------- 253 (379) T ss_dssp GGGEEEEEEE-----TGG-------------------------------------------------------------- T ss_pred CCCEEEEEECCCCCCCCC-------------------------------------------------------------- T ss_conf 888399997887654355-------------------------------------------------------------- Q ss_pred HHHHHHCCCCCCCCCEEEEECCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCC-CCCCCH Q ss_conf 1256631678865624898637707874344434345666666656763788876640578877877788550-751323 Q T0543 428 LPKRLHYANNRRIEDIHLLVDRRWHVARKPLDVYKKPSGKCFFQGDHGFDNKVNSMQTVFVGYGPTFKYRTKV-PPFENI 506 (887) Q Consensus 428 lP~r~Hy~~n~Ri~~I~l~~d~Gw~i~~~~~~~~~~~~~~~~~~G~HGYdn~~~dM~aiFiA~GP~FK~g~~v-~pf~NI 506 (887) . .....+.++........+..||.++|+.+.+.++ ..+..| T Consensus 254 ----------------------------~----------~~~~~~~~~~~~~~~~~~vPl~i~~p~~~~~~~~~~~~s~~ 295 (379) T PF00884_consen 254 ----------------------------G----------NHPFRGGKGMSLYEEGIRVPLIIRWPGIIPPRVVDALVSHI 295 (379) T ss_dssp ----------------------------H----------HTTCSSSTTHSSHHHHHBEEEEEEETTTSTT-EECS-EEGG T ss_pred ----------------------------C----------CCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCCCCCCCHH T ss_conf ----------------------------3----------32344454355543454024787569865455546786799 Q ss_pred HHHHHHHHHHCCC Q ss_conf 4889999981888 Q T0543 507 ELYNVMCDLLGLK 519 (887) Q Consensus 507 dIYnLmc~LLgI~ 519 (887) ||+|+|++++||. T Consensus 296 Di~PTil~~~G~~ 308 (379) T PF00884_consen 296 DIAPTILDLAGVP 308 (379) T ss_dssp GHHHHHHHHCT-- T ss_pred HHHHHHHHHHCCC T ss_conf 9999999982886 No 4 >PF08665 PglZ: PglZ domain; InterPro: IPR013973 This entry is a member of the Alkaline phosphatase clan. 
Probab=98.93 E-value=1.6e-09 Score=80.91 Aligned_cols=172 Identities=16% Similarity=0.132 Sum_probs=90.0 Q ss_pred CEEEEEECCCCHHHHHCCCCCCHHHHHHHHCCCCCCEEEECCCCCCHHHHHHHHCCCCCCCCCEEECCEECCCCCCEEEE Q ss_conf 78999844878688721000371689999659502522406654242436887238651016522040125435751322 Q T0543 164 PLIIFSVDGFRASYMKKGSKVMPNIEKLRSCGTHAPYMRPVYPTKTFPNLYTLATGLYPESHGIVGNSMYDPVFDASFHL 243 (887) Q Consensus 164 pvILISiDGfR~dyl~~~~~~tPNL~~La~~Gv~a~~m~pvfPT~T~PNh~SIvTGlyPe~HGIv~N~~~Dp~~~~~f~l 243 (887) .|+||.+||||+|-.. .+.+.|++-..--+.-..+.++.||.|--++.+|+.|.-|.-- +........ T Consensus 2 rv~liv~D~lrye~~~---eL~~~L~~~~~~~~~~~~~~a~LPS~T~~~r~AL~~g~~~~~~---------~~~~~~~~~ 69 (181) T PF08665_consen 2 RVALIVSDGLRYEQAK---ELAEELNREYRFEVELDAMLAILPSYTQYGRAALFPGKMPSYF---------AKSFPDVWV 69 (181) T ss_pred EEEEEEECCCCHHHHH---HHHHHHHHCCCCEEECCEEEEECCCHHHHHHHHHCCCCCHHHH---------HCCCCCEEE T ss_conf 1999998277699999---9999970045641511345883675768799997599973331---------114786506 Q ss_pred CCCCCCCCHHCCCCCHHHHHHHCCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCHH Q ss_conf 67665671110677054576774970899975886347889999999860656678728998425634200135888789 Q T0543 244 RGREKFNHRWWGGQPLWITATKQGVRAGTFFWSVSIPHERRILTILQWLSLPDNERPSVYAFYSEQPDFSGHKYGPFGPE 323 (887) Q Consensus 244 ~~~~~~~~~w~~gePiW~ta~~~G~ksa~~~Wpgs~P~~~rid~vl~wl~lp~~e~P~l~~lY~~epD~~GH~yGp~S~e 323 (887) ........| ..+. + .+.+. ...+....-.-. .-+.... .-..-+++.+|...+|..||+-+-.... 
T Consensus 70 -d~~~~~~~~-~r~~-~--l~~~~--~~~~~~~~l~~~--~~~~~~~-----~~~~~~vv~vv~n~ID~~~~~~~~e~~~ 135 (181) T PF08665_consen 70 -DGKSWAGFW-NREK-I--LQAQN--SIAFQYDDLLDM--NGGEARE-----LIKGNRVVYVVHNFIDALGHKAITELAT 135 (181) T ss_pred -CCCCCCCHH-HHHH-H--HHHCC--CCEEEHHHHHCC--CHHHHHH-----HCCCCCEEEEEECCHHHHHCCCCCCCCH T ss_conf -871154766-7788-9--87351--403877355264--7466885-----0679988999978765403821335106 Q ss_pred HHHHHHHHH-----HHHHHHHHHHHHCCCCCCCCEEEECCCCCCCCC Q ss_conf 999999999-----999999999985686346517997156665678 Q T0543 324 MTNPLREID-----KTVGQLMDGLKQLRLHRCVNVIFVGDHGMEDVT 365 (887) Q Consensus 324 ~~~ai~~vD-----~~IG~Ll~~Lk~~gL~~~tnIIivSDHGM~~v~ 365 (887) ...+.+.++ ..|..|+..|.+. ..+|+||||||+.-.. T Consensus 136 ~~~~~~~i~~~~~~~~L~~ll~~l~~~----~~~vviTaDHG~i~~~ 178 (181) T PF08665_consen 136 FEAMYRAIEQWWFEHELRDLLRKLRNA----GYNVVITADHGFIYQR 178 (181) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHC----CCEEEEECCCCEEEEC T ss_conf 899999876611018899999999854----8469998278569946 No 5 >PF01676 Metalloenzyme: Metalloenzyme superfamily; InterPro: IPR006124 This domain unites alkaline phosphatase, N-acetylgalactosamine-4-sulphatase, and cerebroside sulphatase, enzymes with known three-dimensional structures, with phosphopentomutase, 2,3-bisphosphoglycerate-independent phosphoglycerate mutase, phosphoglycerol transferase, phosphonate monoesterase, streptomycin-6-phosphate phosphatase, alkaline phosphodiesterase/nucleotide pyrophosphatase PC-1, and several closely related sulphatases. This domain is also related to alkaline phosphatase IPR001952 from INTERPRO . The most conserved residues are probably involved in metal binding and catalysis.; GO: 0003824 catalytic activity, 0046872 metal ion binding; PDB: 2ify_A 1ejj_A 1eqj_A 1o99_A 1o98_A 2i09_B 2zkt_A. 
Probab=98.90 E-value=1.9e-09 Score=80.32 Aligned_cols=73 Identities=18% Similarity=0.273 Sum_probs=59.7 Q ss_pred HHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCCCC Q ss_conf 99999986065667872899842563420013588878999999999999999999998568634651799715666567 Q T0543 285 ILTILQWLSLPDNERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFVGDHGMEDV 364 (887) Q Consensus 285 id~vl~wl~lp~~e~P~l~~lY~~epD~~GH~yGp~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDHGM~~v 364 (887) ++.+++.++ .+.-+|+++++.++|.+||+- +-.++.++|+++|..|++|++++++. +..+||+||||+... T Consensus 139 ~~~~i~~l~---~~~~d~v~~n~~~~D~~GH~~--d~~~~~~aie~~D~~l~~ll~~l~~~----~~~liITaDHgn~~~ 209 (262) T PF01676_consen 139 ADKAIEALE---KKKYDFVFVNFKNTDMAGHRG--DPEGYVKAIERFDRFLGRLLDALDKE----DDLLIITADHGNDET 209 (262) T ss_dssp HHHHHHHHH---CTC-SEEEEEE-HHHH--------HHHHHHHHHHHHHHHHHHHHHHHHT----TEEEEEE---B-TTT T ss_pred HHHHHHHHH---HCCCCEEEEEECCCCHHHHCC--CHHHHHHHHHHHHHHHHHHHHHHHHC----CCEEEEECCCCCCCC T ss_conf 999999852---037888999505855776747--99999999999999999999999737----999999668999510 Q ss_pred CC Q ss_conf 87 Q T0543 365 TC 366 (887) Q Consensus 365 ~~ 366 (887) -. T Consensus 210 ~~ 211 (262) T PF01676_consen 210 MG 211 (262) T ss_dssp SB T ss_pred CC T ss_conf 28 No 6 >PF02995 DUF229: Protein of unknown function (DUF229); InterPro: IPR004245 Members of this family are uncharacterised with a long conserved region that may contain several domains. Probab=98.77 E-value=2.1e-07 Score=66.20 Aligned_cols=183 Identities=21% Similarity=0.278 Sum_probs=103.3 Q ss_pred CCCCCCCCCCCCEEEEEECCCCHHHHHCCCCCCHHHHH-HHHCCC-CCCEEEECCCCCCHHHHHHHHCCC-CCCCCCEEE Q ss_conf 78888888888789998448786887210003716899-996595-025224066542424368872386-510165220 Q T0543 153 VPECPAGFVRPPLIIFSVDGFRASYMKKGSKVMPNIEK-LRSCGT-HAPYMRPVYPTKTFPNLYTLATGL-YPESHGIVG 229 (887) Q Consensus 153 ~~~cp~g~~rPpvILISiDGfR~dyl~~~~~~tPNL~~-La~~Gv-~a~~m~pvfPT~T~PNh~SIvTGl-yPe~HGIv~ 229 (887) .++-+..-.+|.|+++.||....--+.+ .||-..+ |.+.|. 
-+....-| =--|+||...|+||. +++..=- . T Consensus 118 ~~~~~~~~~~~sVlilgiDS~Sr~~f~R---~mPkT~~fl~~~~~~ef~GynkV-gdnT~pNl~alltG~~~~~~~~~-~ 192 (498) T PF02995_consen 118 LKSKESSERKPSVLILGIDSMSRMNFRR---SMPKTAEFLRQLGWFEFQGYNKV-GDNTFPNLMALLTGKYFSEEELE-A 192 (498) T ss_pred CCCCCCCCCCCCEEEEEECCCCHHHHHH---HCHHHHHHHHHCCCEEECCCCCC-CCCCHHHHHHHHHCCCCCHHHHH-H T ss_conf 3456567899838999821516788976---26289999963896896684002-43326789999717888856761-2 Q ss_pred CCEECCCCCCEEEECCCCCCCCHHCCCCCHHHHHHHCCCEEEEE--------EC---CC--------------------- Q ss_conf 40125435751322676656711106770545767749708999--------75---88--------------------- Q T0543 230 NSMYDPVFDASFHLRGREKFNHRWWGGQPLWITATKQGVRAGTF--------FW---SV--------------------- 277 (887) Q Consensus 230 N~~~Dp~~~~~f~l~~~~~~~~~w~~gePiW~ta~~~G~ksa~~--------~W---pg--------------------- 277 (887) .++. ......+..-.-||...+++|..++-. |+ +| T Consensus 193 --~~~~-----------~~~~~~~d~~~~iwk~fk~~GY~T~~~ED~~~~~~f~y~~~GF~~~PtD~ylrp~~~~~~~~~ 259 (498) T PF02995_consen 193 --DCNK-----------PYKKGCLDKCPFIWKDFKNAGYVTAYAEDWPSIGTFNYNKKGFRKQPTDHYLRPFLLAIEKHL 259 (498) T ss_pred --CCCC-----------CCCCCCCCCCHHHHHHHHHCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHC T ss_conf --2556-----------456876443648999998778889876576655644557888768996826417999999734 Q ss_pred ------------CCC-HHHHHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf ------------634-7889999999860656678728998425634200135888789999999999999999999985 Q T0543 278 ------------SIP-HERRILTILQWLSLPDNERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQ 344 (887) Q Consensus 278 ------------s~P-~~~rid~vl~wl~lp~~e~P~l~~lY~~epD~~GH~yGp~S~e~~~ai~~vD~~IG~Ll~~Lk~ 344 (887) ..+ ++.-.+-+.+.+..- ..+|-|.++++.+ ..| +....+..+|..+-++++.+++ T Consensus 260 ~~~~~~~~~C~g~r~~~~~~~dy~~~f~~~y-~~~~~F~~~w~~~---~~h-------~~~~~~~~~D~~l~~~l~~~~~ 328 (498) T PF02995_consen 260 RYSKKFGLKCLGGRPYHEYLLDYIRQFMERY-KDRPKFGFFWFNS---GSH-------DDFNGASRLDDDLVEYLEKLEE 328 (498) T ss_pred 
CCCCCCCCCCCCCHHHHHHHHHHHHHHHHHH-HCCCEEEEEEECC---CCC-------CCCHHHHHHHHHHHHHHHHHHH T ss_conf 6324578888785367999999999999985-1797799998556---636-------7624778889999999999986 Q ss_pred CCCCCCCCEEEECCCCCCCC Q ss_conf 68634651799715666567 Q T0543 345 LRLHRCVNVIFVGDHGMEDV 364 (887) Q Consensus 345 ~gL~~~tnIIivSDHGM~~v 364 (887) .|+.++|.|||+||||+--. T Consensus 329 ~g~l~nTivil~SDHG~R~g 348 (498) T PF02995_consen 329 EGLLNNTIVILMSDHGLRFG 348 (498) T ss_pred CCCCCCEEEEEECCCCCCCC T ss_conf 49745439998747897764 No 7 >PF01033 Somatomedin_B: Somatomedin B domain; InterPro: IPR001212 Somatomedin B, a serum factor of unknown function, is a small cysteine-rich peptide, derived proteolytically from the N-terminus of the cell-substrate adhesion protein vitronectin . Cys-rich somatomedin B-like domains are found in a number of proteins , including plasma-cell membrane glycoprotein (which has nucleotide pyrophosphate and alkaline phosphodiesterase I activities) and placental protein 11 (which appears to possess amidolytic activity). The SMB domain of vitronectin has been demonstrated to interact with both the urokinase receptor and the plasminogen activator inhibitor-1 (PAI-1) and the conserved cysteines of the NPP1 somatomedin B-like domain have been shown to mediate homodimerization . As shown in the following schematic representation below the SMB domain contains eight Cys residues, arranged into four disulphide bonds. It has been suggested that the active SMB domain may be permitted considerable disulphide bond heterogeneity or variability, provided that the Cys25-Cys31 disulphide bond is preserved. The three dimensional structure of the SMB domain is extremely compact and the disulphide bonds are packed in the centre of the domain forming a covalently bonded core . 
The structure of the SMB domain presents a new protein fold, with the only ordered secondary structure being a single-turn alpha-helix and a single-turn 3(10)-helix .; PDB: 1ssu_A 1oc0_B 3bt2_B 3bt1_B 2jq8_A 1s4g_A 2cqw_A 2ys0_A. Probab=98.45 E-value=1.9e-08 Score=73.39 Aligned_cols=42 Identities=45% Similarity=1.126 Sum_probs=36.8 Q ss_pred CCCHHHCCCCCCCCCCCCCEECHHHHHCCCCCCCHHHHCCCCC Q ss_conf 7741001345456787775024334203674235345123555 Q T0543 55 SGSCKGRCFELQEVGPPDCRCDNLCKSYSSCCHDFDELCLKTA 97 (887) Q Consensus 55 ~~~c~~~c~~~~~~~~~~c~c~~~c~~~~~cc~d~~~~c~~~~ 97 (887) .+||+|||++....+. .|+||..|+.+|+||.||+++|++.+ T Consensus 2 ~~sC~gRCg~~~~~~~-~C~Cd~~C~~~gdCC~DY~~~C~~~~ 43 (43) T PF01033_consen 2 MDSCKGRCGEYFSRGR-PCQCDDDCVSYGDCCPDYEDVCVETT 43 (43) T ss_dssp -SS-TTTTT-B--TTS-SSBSSCCCCCCT-BTTTHHHHTSS-- T ss_pred CCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCHHHHHHCCCCC T ss_conf 7653472769988998-88875020654763342683617899 No 8 >PF00245 Alk_phosphatase: Alkaline phosphatase; InterPro: IPR001952 Alkaline phosphatase (3.1.3.1 from EC) (ALP) is a zinc and magnesium-containing metalloenzyme which hydrolyzes phosphate esters, optimally at high pH. It is found in nearly all living organisms, with the exception of some plants. In Escherichia coli, ALP (gene phoA) is found in the periplasmic space. In Saccharomyces cerevisiae it (gene PHO8) is found in lysosome-like vacuoles and in mammals, it is a glycoprotein attached to the membrane by a GPI-anchor. In streptomyces species alkaline phosphatase is involved in the synthesis of streptomycin (SM), an antibiotic, express a phosphatase (3.1.3.39 from EC) (gene strK) which is highly related to ALP. It specifically cleaves both streptomycin-6-phosphate and, more slowly, streptomycin-3''-phosphate . In mammals, four different isozymes are currently known . Three of them are tissue-specific: the placental, placental-like (germ cell) and intestinal isozymes. 
The fourth form is tissue non-specific and was previously known as the liver/bone/kidney isozyme. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymic activity, and folds into a 10-stranded beta-sheet structure .; GO: 0008152 metabolic process; PDB: 2glq_A 1zeb_A 1zef_A 1ew2_A 1zed_A 1k7h_B 1shn_B 1shq_A 1ew8_A 1ura_A .... Probab=98.18 E-value=4.5e-06 Score=57.08 Aligned_cols=193 Identities=18% Similarity=0.178 Sum_probs=117.1 Q ss_pred CCEEEEEECCCCHHHHHCCC------CCC-----HHHHHHHHCCCCCCEEEECCCCCCHHHHHHHHCCCCCCCC------ Q ss_conf 87899984487868872100------037-----1689999659502522406654242436887238651016------ Q T0543 163 PPLIIFSVDGFRASYMKKGS------KVM-----PNIEKLRSCGTHAPYMRPVYPTKTFPNLYTLATGLYPESH------ 225 (887) Q Consensus 163 PpvILISiDGfR~dyl~~~~------~~t-----PNL~~La~~Gv~a~~m~pvfPT~T~PNh~SIvTGlyPe~H------ 225 (887) -+||++.-||+.+..+.-+. ... -++++|-..|....|......|-+.+.-++++||.-..+. T Consensus 2 KNVI~~IgDGmg~~~~taaR~~~~~~~~~~~~~~l~~d~~p~~g~~~T~~~d~~vtDSAa~aTA~atG~Kt~ng~igv~~ 81 (421) T PF00245_consen 2 KNVILFIGDGMGPATVTAARIYKGGKNGRPGEETLAFDRFPYTGLVKTYSADSQVTDSAAAATALATGVKTYNGAIGVDP 81 (421) T ss_dssp SEEEEEE-----HHHHHHHHHHHHH-TCGGCG-SHCGGGSSEEEEEE--ESSSSS--HHHHHHHHH-----BTT-B---T T ss_pred CEEEEEEECCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCEEEEEEEECCCCCCCCCC T ss_conf 55999986899999999999997256788744526411465112441133776665556666486664661467521679 Q ss_pred -------------------CEEECC------------------EECCC-C------------------C---CEEEECCC Q ss_conf -------------------522040------------------12543-5------------------7---51322676 Q T0543 226 -------------------GIVGNS------------------MYDPV-F------------------D---ASFHLRGR 246 (887) Q Consensus 226 -------------------GIv~N~------------------~~Dp~-~------------------~---~~f~l~~~ 246 (887) |||... .++.. . . ..+.+++. 
T Consensus 82 ~~~~~~tile~Ak~~G~~tGiVtT~~ithATPAaf~AH~~~R~~~~~ia~~~~~~~~g~~dia~ql~~~~~~~dVilGGG 161 (421) T PF00245_consen 82 DGNPVETILEWAKEAGKATGIVTTTRITHATPAAFYAHVPSRNWENDIARPDAADAGGCPDIAQQLVDSEKGVDVILGGG 161 (421) T ss_dssp TSG----HHHHHHHTT-B---EESS-TTSHHHHTTT--BSSTT-CCHHHHHCCCCCTTS--HHHHHHHSS---SB-B--- T ss_pred CCCCCCCHHHHHHHCCCCEEEEECCCCCCCCCEEEEEECCCCCCHHHHHCCCCCCCCCCCCHHHHHHCCCCCCEEEEECC T ss_conf 99857569999997499178860452158864068870353345123201000023688569999753787742998376 Q ss_pred -CCCCC-----------HHCCCCCHHHHHHHCCCE-------------------EEEEECCCCCCH------------HH Q ss_conf -65671-----------110677054576774970-------------------899975886347------------88 Q T0543 247 -EKFNH-----------RWWGGQPLWITATKQGVR-------------------AGTFFWSVSIPH------------ER 283 (887) Q Consensus 247 -~~~~~-----------~w~~gePiW~ta~~~G~k-------------------sa~~~Wpgs~P~------------~~ 283 (887) ..+.| .-..+..+...++.+|.+ .-.+|-++..|+ .+ T Consensus 162 ~~~f~p~~~~~~~~~~~~r~d~~~L~~~~~~~gy~~v~~~~el~a~~~~~~~~~llGlf~~~~l~~~~~~~~~~~PsL~e 241 (421) T PF00245_consen 162 RRYFLPTGTPDNDGKKGKRKDGRNLIDEWKEKGYTYVRTREELDALAPGKTTDPLLGLFADSHLPYEIDRDNSEQPSLAE 241 (421) T ss_dssp -GGGSBCCCGGSTT---SBTT--BHHHHHHH--EEE-SSHHHHHHCCCTTTSSCEE---SSSS---SCTTSTTTS-HHHH T ss_pred HHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHCCCEEECCHHHHHCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCCHHH T ss_conf 33136556776656666535683499998744978976499984535777652156501544476522468778998999 Q ss_pred HHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCC Q ss_conf 9999999860656678728998425634200135888789999999999999999999985686346517997156665 Q T0543 284 RILTILQWLSLPDNERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFVGDHGME 362 (887) Q Consensus 284 rid~vl~wl~lp~~e~P~l~~lY~~epD~~GH~yGp~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDHGM~ 362 (887) ....+++-|+ ++++..|+++-=..+|.++|.- +.......+...|++|+..++..++. ++|.|||++||+.. 
T Consensus 242 Mt~~Al~~L~--~n~~GFfLmVEgg~ID~a~H~n--d~~~~i~e~~~fd~AV~~a~~~~~~~---~~TLiIVTADH~~g 313 (421) T PF00245_consen 242 MTRKALEVLS--KNPKGFFLMVEGGRIDWAAHAN--DAAGAIEETLEFDDAVAAALDFAEKH---DDTLIIVTADHEHG 313 (421) T ss_dssp HHHHHHHHHT--TCC--BEEEEE--CCHHHHHTT---HHHHHHHHHHHHHHHHHHHHHHHHT---TTEEEEEE-SSBES T ss_pred HHHHHHHHHH--CCCCCEEEEEECCCCCHHCCCC--CHHHHHHHHHHHHHHHHHHHHHHCCC---CCEEEEEEECCCCC T ss_conf 9999999973--1998559999542134100537--78899999999999999999972269---98399998357765 No 9 >PF01033 Somatomedin_B: Somatomedin B domain; InterPro: IPR001212 Somatomedin B, a serum factor of unknown function, is a small cysteine-rich peptide, derived proteolytically from the N-terminus of the cell-substrate adhesion protein vitronectin . Cys-rich somatomedin B-like domains are found in a number of proteins , including plasma-cell membrane glycoprotein (which has nucleotide pyrophosphate and alkaline phosphodiesterase I activities) and placental protein 11 (which appears to possess amidolytic activity). The SMB domain of vitronectin has been demonstrated to interact with both the urokinase receptor and the plasminogen activator inhibitor-1 (PAI-1) and the conserved cysteines of the NPP1 somatomedin B-like domain have been shown to mediate homodimerization . As shown in the following schematic representation below the SMB domain contains eight Cys residues, arranged into four disulphide bonds. It has been suggested that the active SMB domain may be permitted considerable disulphide bond heterogeneity or variability, provided that the Cys25-Cys31 disulphide bond is preserved. The three dimensional structure of the SMB domain is extremely compact and the disulphide bonds are packed in the centre of the domain forming a covalently bonded core . The structure of the SMB domain presents a new protein fold, with the only ordered secondary structure being a single-turn alpha-helix and a single-turn 3(10)-helix .; PDB: 1ssu_A 1oc0_B 3bt2_B 3bt1_B 2jq8_A 1s4g_A 2cqw_A 2ys0_A. 
Probab=97.58 E-value=7.8e-06 Score=55.42 Aligned_cols=41 Identities=41% Similarity=1.069 Sum_probs=37.0 Q ss_pred EECCCCCCCCCCCCCCCCCCHHHHCCCCCCCCHHHHCCCCC Q ss_conf 20553156200456777657677415782124043148885 Q T0543 101 ECTKDRCGEVRNEENACHCSEDCLSRGDCCTNYQVVCKGES 141 (887) Q Consensus 101 ~c~~~~c~~~r~~~~~c~c~~~c~~~~~cc~~~~~~c~~~~ 141 (887) ++-+.||||....+..|.|.++|+..||||.||..+|..++ T Consensus 3 ~sC~gRCg~~~~~~~~C~Cd~~C~~~gdCC~DY~~~C~~~~ 43 (43) T PF01033_consen 3 DSCKGRCGEYFSRGRPCQCDDDCVSYGDCCPDYEDVCVETT 43 (43) T ss_dssp SS-TTTTT-B--TTSSSBSSCCCCCCT-BTTTHHHHTSS-- T ss_pred CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCC T ss_conf 65347276998899888875020654763342683617899 No 10 >PF04185 Phosphoesterase: Phosphoesterase family; InterPro: IPR007312 This family includes both bacterial phospholipase C enzymes (3.1.4.3 from EC) and eukaryotic acid phosphatases 3.1.3.2 from EC.; GO: 0016788 hydrolase activity, acting on ester bonds; PDB: 2d1g_B. Probab=95.47 E-value=0.047 Score=29.30 Aligned_cols=164 Identities=18% Similarity=0.138 Sum_probs=84.9 Q ss_pred CCCHHHHHHHHCCCCCCEEEECCCCCCHHHHHHHHCCCCCCCCCEEEC--CEEC-CC----CCCEEEECCCCCCCCHHCC Q ss_conf 037168999965950252240665424243688723865101652204--0125-43----5751322676656711106 Q T0543 183 KVMPNIEKLRSCGTHAPYMRPVYPTKTFPNLYTLATGLYPESHGIVGN--SMYD-PV----FDASFHLRGREKFNHRWWG 255 (887) Q Consensus 183 ~~tPNL~~La~~Gv~a~~m~pvfPT~T~PNh~SIvTGlyPe~HGIv~N--~~~D-p~----~~~~f~l~~~~~~~~~w~~ 255 (887) ..+|++.+||++++.+++-.+.-++-|.|||..+++|.. +|...+ ...+ +. 
....+.........++||+ T Consensus 121 ~~~P~~~~LA~~f~l~D~yf~s~~~pT~PNr~~~~sG~~---~~~~~~~~~~~~~~~~~~~~~ti~d~L~~aGisW~~Y~ 197 (376) T PF04185_consen 121 ADLPFLHALADQFTLCDNYFCSVPGPTWPNRLFLVSGTS---DGDGNNGGPFIDNPSHPYNWPTIFDRLSAAGISWKVYQ 197 (376) T ss_dssp --SHHHHHHHHHSEEESSEE-----------HHHH---------------TTS-EEE----------HHHH-------EE T ss_pred CCCCHHHHHHHHCEEECCCCCCCCCCCCCCCCEEEECCC---CCCCCCCCCCCCCCCCCCCCCHHHHHHHHCCCEEEEEC T ss_conf 778589999865187311245898988898208983345---65466898654578766232219999987598178722 Q ss_pred CC-CHHHHHHHCCCE--EEEEE------CCC-------CCCHHHHHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCC Q ss_conf 77-054576774970--89997------588-------634788999999986065667872899842563420013588 Q T0543 256 GQ-PLWITATKQGVR--AGTFF------WSV-------SIPHERRILTILQWLSLPDNERPSVYAFYSEQPDFSGHKYGP 319 (887) Q Consensus 256 ge-PiW~ta~~~G~k--sa~~~------Wpg-------s~P~~~rid~vl~wl~lp~~e~P~l~~lY~~epD~~GH~yGp 319 (887) .. +........+.. ...+. ... .......+++-.+-+. ...-|.+.++-. .....+| .+ T Consensus 198 e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~F~~D~~--~g~LP~vsfI~p-~~~~d~H--p~ 272 (376) T PF04185_consen 198 EGYPNPGDNGLNGFDPYFQAFHYPFNPYSFPSYSGSNDRANHIVDLSQFAADLA--NGTLPQVSFIVP-NMCNDMH--PP 272 (376) T ss_dssp -----SEE----EE---EEE-----E--S-GGG---BSTTTT-EECHHHHHHHH--TT---SEEEE-------------- T ss_pred CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHH--CCCCCEEEEEEC-CCCCCCC--CC T ss_conf 478877765545546441001244343345543323334555136999999987--399983799934-7778889--88 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCC Q ss_conf 8789999999999999999999985686346517997156 Q T0543 320 FGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFVGDH 359 (887) Q Consensus 320 ~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDH 359 (887) . 
..++.-|..|++++++|.....+++|.|||+=|- T Consensus 273 ~-----~~~~~g~~~v~~vv~al~~sp~W~~TliiIt~DE 307 (376) T PF04185_consen 273 Y-----SVPADGDAFVAQVVNALRNSPEWNNTLIIITYDE 307 (376) T ss_dssp --------HHHHHHHHHHHHHHHHCSTTGGGEEEEEEES- T ss_pred C-----CCHHHHHHHHHHHHHHHHCCCCCCCEEEEEEEEC T ss_conf 7-----6356799999999999974956588089999977 No 11 >PF07394 DUF1501: Protein of unknown function (DUF1501); InterPro: IPR010869 This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long. Probab=93.30 E-value=0.13 Score=26.23 Aligned_cols=61 Identities=18% Similarity=0.152 Sum_probs=41.3 Q ss_pred CEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCCCC Q ss_conf 72899842563420013588878999999999999999999998568634651799715666567 Q T0543 300 PSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFVGDHGMEDV 364 (887) Q Consensus 300 P~l~~lY~~epD~~GH~yGp~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDHGM~~v 364 (887) ..++++.+.-=|+-. -........+..+|+.|..|++.|+++|++++|.|+++||=|=+.. T Consensus 246 ~~~~~v~~gGwDTH~----n~~~~~~~ll~~ld~alaaf~~dL~~~G~~d~t~vv~~SEFGRT~~ 306 (392) T PF07394_consen 246 VRVVFVSLGGWDTHS----NQGDRHARLLPELDDALAAFIDDLEERGLLDDTLVVTMSEFGRTPK 306 (392) T ss_pred CEEEEECCCCCCCCC----CCHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEEEEECCCCCCCC T ss_conf 879997878865735----4677899999999999999999998648867838999020488865 No 12 >PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulated chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. 
With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilizing the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0016568 chromatin modification, 0005634 nucleus; PDB: 2o8j_D 2rfi_B 3hna_A 2igq_A 3fpd_B 3bo5_A 2r3a_A 1peg_B 1ml9_A 1mvh_A ....
Probab=79.30 E-value=0.23 Score=24.58 Aligned_cols=46 Identities=30% Similarity=0.737 Sum_probs=30.8 Q ss_pred CCCCEECHHHHHCCCC-CCCHHHH--------CC---CCCCCEEECCCCCCCCCCCCCCCCCCHHHHCC Q ss_conf 7775024334203674-2353451--------23---55530320553156200456777657677415 Q T0543 70 PPDCRCDNLCKSYSSC-CHDFDEL--------CL---KTARGWECTKDRCGEVRNEENACHCSEDCLSR 126 (887) Q Consensus 70 ~~~c~c~~~c~~~~~c-c~d~~~~--------c~---~~~~~~~c~~~~c~~~r~~~~~c~c~~~c~~~ 126 (887) ..+|.|+..|.....| |+..... .| ..+-|++|+.. |.|+..|..| T Consensus 46 ~~gC~C~~~C~~~~~C~C~~~~~~~~~Y~~~g~l~~~~~~~I~ECn~~-----------C~C~~~C~NR 103 (103) T PF05033_consen 46 LVGCDCTDDCSDPSDCSCLQRNGGEFPYDSDGRLVLERGPPIYECNSS-----------CGCSPSCRNR 103 (103) T ss_dssp ------SSCSTCTTTSGGGTTTSS--SB-TTS-BSSSS--EEE---TT-----------SSS-TTSTT- T ss_pred CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCEEEECCCC-----------CCCCCCCCCC T ss_conf 747768888579988728000387667567881843799866738998-----------9868878888 No 13 >PF11658 DUF3260: Protein of unknown function (DUF3260) Probab=60.82 E-value=3.2 Score=16.66 Aligned_cols=181 Identities=14% Similarity=0.200 Sum_probs=92.9 Q ss_pred CCEEEEEECCCCCCHHHHHHHHHHHHCCC-CCCCCEEEEEECCCCCCCCCCC-CC---CCH-HHHHHHHHHHHHHHHHHH Q ss_conf 97089997588634788999999986065-6678728998425634200135-88---878-999999999999999999 Q T0543 267 GVRAGTFFWSVSIPHERRILTILQWLSLP-DNERPSVYAFYSEQPDFSGHKY-GP---FGP-EMTNPLREIDKTVGQLMD 340 (887) Q Consensus 267 G~ksa~~~Wpgs~P~~~rid~vl~wl~lp-~~e~P~l~~lY~~epD~~GH~y-Gp---~S~-e~~~ai~~vD~~IG~Ll~ 340 (887) |+..+..-.+||--++ ..+..-+|++.- +...+...+.|=.-+=|-|-+. |. .|. 
.|..-++++=..+.++++ T Consensus 312 ~~~~~~~~FDGSpIy~-D~~vL~rW~~~r~~~~~~~va~~YNtIsLHDGNr~~~~~~~~s~~sY~~R~~~LldDl~~F~~ 390 (518) T PF11658_consen 312 GLPVAMHSFDGSPIYD-DYAVLNRWWQQREKQSDGPVALYYNTISLHDGNRIPGADRLNSLASYKPRAQKLLDDLDRFFD 390 (518) T ss_pred CCCHHHHCCCCCCCCC-HHHHHHHHHHHHHCCCCCCEEEEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHH T ss_conf 7736765168974335-089999999986357898669998436425687477986665233269999999999999999 Q ss_pred HHHHCCCCCCCCEEEECCCC------------CCCCCCCCEEEHHHHCCCCCCEEECCCCEEEEEEECCCCCCHHHHHHH Q ss_conf 99856863465179971566------------656787762653532375300575167413688625752000189999 Q T0543 341 GLKQLRLHRCVNVIFVGDHG------------MEDVTCDRTEFLSNYLTNVDDITLVPGTLGRIRAKSINNSKYDPKTII 408 (887) Q Consensus 341 ~Lk~~gL~~~tnIIivSDHG------------M~~v~~~~~I~L~~yl~~~~~~~~~~G~~~~I~pk~~~~~~~~~e~i~ 408 (887) .|++.|. .+.|++|-.|| |-+|..+.++.+-.-+.-.+.-....|...+|- .+.+....-+++ T Consensus 391 ~le~SgR--~vvvv~VPEHGAALrGDk~QisGlREIPsP~IthvPVgik~iG~~~~~~g~~~~I~---~psSYlAlSeLv 465 (518) T PF11658_consen 391 ELEKSGR--KVVVVLVPEHGAALRGDKMQISGLREIPSPSITHVPVGIKLIGMKAPHQGSPVHID---QPSSYLALSELV 465 (518) T ss_pred HHHHCCC--CEEEEEECCCCCCCCCCHHHHCCCCCCCCCCCEEECEEEEEECCCCCCCCCCEEEC---CCCHHHHHHHHH T ss_conf 9997599--58999966853001464054325666899861520227999525788899986877---872299999999 Q ss_pred HHHHHCCCCCCEE---EEEHHHHHHHHHCCCCCCCCCEEEEECCCEEEEE Q ss_conf 9998403777425---6520121256631678865624898637707874 Q T0543 409 ANLTCKKPDQHFK---PYMKQHLPKRLHYANNRRIEDIHLLVDRRWHVAR 455 (887) Q Consensus 409 ~~L~~~~~~~h~~---vY~ke~lP~r~Hy~~n~Ri~~I~l~~d~Gw~i~~ 455 (887) .++-...+-+.-. .-+.++||+----+.|. .-+++....+|++.. 
T Consensus 466 sr~~~~~~f~~~~~~~~~l~~~LP~T~~VsEN~--~t~vm~~~~k~yv~l 513 (518) T PF11658_consen 466 SRLVAGNPFQADTVDWAQLTQDLPQTAMVSENE--GTIVMQYQGKYYVRL 513 (518) T ss_pred HHHHCCCCCCCCCCCHHHHHHCCCCCCEEECCC--CEEEEEECCEEEEEC T ss_conf 999717977789899999984199874230379--838999898668967 No 14 >PF05827 ATP-synt_S1: Vacuolar ATP synthase subunit S1 (ATP6S1); InterPro: IPR008388 ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport , . F-ATPases (F1F0-ATPases) in mitochondria, chloroplasts and bacterial plasma membranes are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases) are primarily found in eukaryotic vacuoles, catalysing ATP hydrolysis to transport solutes and lower pH in organelles. A-ATPases (A1A0-ATPases) are found in Archaea and function like F-ATPases. P-ATPases (E1E2-ATPases) are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) (3.6.3.14 from EC) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. 
V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release . V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c, c d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins . This entry represents the S1 subunit (or subunit AC45) found in the V1 complex of V-ATPases. This subunit is synthesized as an N-glycosylated 60 kDa precursor that is intracellularly cleaved to a protein of about 45 kDa. This subunit may assist the V-ATPase in the acidification of neuroendocrine granules . More information about this protein can be found at Protein of the Month: ATP Synthases .; GO: 0046933 hydrogen ion transporting ATP synthase activity, rotational mechanism, 0046961 hydrogen ion transporting ATPase activity, rotational mechanism, 0015986 ATP synthesis coupled proton transport, 0016021 integral to membrane, 0016469 proton-transporting two-sector ATPase complex Probab=52.61 E-value=4.3 Score=15.76 Aligned_cols=53 Identities=15% Similarity=0.145 Sum_probs=41.6 Q ss_pred CCCCEEEEEECCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEECCC Q ss_conf 678728998425634200135888789999999999999999999985686346517997156 Q T0543 297 NERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFVGDH 359 (887) Q Consensus 297 ~e~P~l~~lY~~epD~~GH~yGp~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDH 359 (887) +.+|.++.+.|...-..+ .+..++|..-|..|+.+++.|.... 
.+.+|++|+- T Consensus 124 ~~~~rVi~v~~~~l~~~~-------~~R~~~L~~~D~~l~~Il~~lps~~---~yTvIyts~~ 176 (280) T PF05827_consen 124 EYKPRVIRVDFPPLPSDS-------ESRAEVLSDNDEMLRKILDQLPSPD---PYTVIYTSLP 176 (280) T ss_pred CCCCCEEEEECCCCCCCC-------CCHHHHHHCCCHHHHHHHHHCCCCC---CEEEEEECCC T ss_conf 257858999899889875-------2568998613699999998458976---3689995168 No 15 >PF02739 5_3_exonuc_N: 5'-3' exonuclease, N-terminal resolvase-like domain; InterPro: IPR002421 The N-terminal and internal 5'3'-exonuclease domains are commonly found together, and are most often associated with 5' to 3' nuclease activities. The XPG protein signatures (PDOC00658 from PROSITEDOC) are never found outside the '53EXO' domains. The latter are found in more diverse proteins , , . The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs allow the diagnosis of several protein families. In the eubacterial type A DNA-polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A (IPR001098 from INTERPRO) is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53_EXO domains all carry the XPG protein signatures. ; GO: 0003677 DNA binding, 0008409 5'-3' exonuclease activity; PDB: 1cmw_A 1tau_A 1bgx_T 1taq_A 2ihn_A 1tfr_A 1ut8_A 1exn_B 1xo1_B 1ut5_A .... 
Probab=45.64 E-value=5.5 Score=15.04 Aligned_cols=35 Identities=17% Similarity=0.230 Sum_probs=29.3 Q ss_pred HHHHHHHHHHHHHHHCCCCCCCCEEEECCCCCCCCCCC Q ss_conf 99999999999998568634651799715666567877 Q T0543 330 EIDKTVGQLMDGLKQLRLHRCVNVIFVGDHGMEDVTCD 367 (887) Q Consensus 330 ~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDHGM~~v~~~ 367 (887) ++|..||.|.....+.|. -.+|+++|..|.+.-.+ T Consensus 109 EADDvIatla~~~~~~g~---~v~IvS~DKDl~QLv~~ 143 (169) T PF02739_consen 109 EADDVIATLARKASEEGY---EVVIVSSDKDLLQLVSE 143 (169) T ss_dssp -HHHHHHHHHHHHHH-SS---EEEEE-SSGGGGGGCCS T ss_pred CHHHHHHHHHHHHHHCCC---EEEEECCCCCHHHHCCC T ss_conf 788999999999997798---39998479877995779 No 16 >PF11946 DUF3463: Domain of unknown function (DUF3463) Probab=26.34 E-value=5.5 Score=15.02 Aligned_cols=41 Identities=15% Similarity=0.353 Sum_probs=33.2 Q ss_pred CCCCCCCCCC-CC-CCCCEEEEECCCEEEEHHHCCCCEEEEEE Q ss_conf 1256667400-05-77520353043012111002883379998 Q T0543 619 RHLLYGRPAV-LY-RTSYDILYHTDFESGYSEIFLMPLWTSYT 659 (887) Q Consensus 619 ~~lP~G~P~v-~~-~~~~clL~~~~Yv~~Ys~~~~~P~Wvsy~ 659 (887) .=.|||.|.. .+ -..-|-|-.++|...|.+-+..-.|-+|- T Consensus 59 ~CTPWg~Pt~n~~GWq~PCYLl~eG~~~tf~Elme~T~Wd~YG 101 (140) T PF11946_consen 59 ECTPWGNPTRNVFGWQKPCYLLNEGYYKTFKELMEETDWDKYG 101 (140) T ss_pred CCCCCCCCCCCCCCCCCCCEEECCCCHHHHHHHHHCCCHHHCC T ss_conf 6457878764754456772560574388899998727755318 No 17 >PF04852 DUF640: Protein of unknown function (DUF640); InterPro: IPR006936 This conserved region is found in plant proteins including the resistance protein-like protein (O49468 from SWISSPROT). 
Probab=25.02 E-value=12 Score=12.81 Aligned_cols=65 Identities=18% Similarity=0.280 Sum_probs=37.7 Q ss_pred HHHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCC--CCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCC Q ss_conf 89999999860656678728998425634200135--88878999999999999999999998568634651 Q T0543 283 RRILTILQWLSLPDNERPSVYAFYSEQPDFSGHKY--GPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVN 352 (887) Q Consensus 283 ~rid~vl~wl~lp~~e~P~l~~lY~~epD~~GH~y--Gp~S~e~~~ai~~vD~~IG~Ll~~Lk~~gL~~~tn 352 (887) ..|...+.|++.-- +.- +|-..=-..||.- ++-.--.++|.-.+|..||+|..+.++.|-.-..| T Consensus 48 ~hVl~FL~~~d~~G--kTk---Vh~~~C~~~g~~~pp~~C~CP~rqAwGslDaliGrLraa~ee~Gg~pe~N 114 (131) T PF04852_consen 48 AHVLEFLVYLDQFG--KTK---VHGQGCGFFGHPSPPAPCPCPLRQAWGSLDALIGRLRAAFEEHGGHPEAN 114 (131) T ss_pred HHHHHHHHHHHCCC--CEE---ECCCCCCCCCCCCCCCCCCCCHHHHHCCHHHHHHHHHHHHHHHCCCCCCC T ss_conf 98999999874269--753---35888888888999999998178865039999999999999928998888 No 18 >PF04666 Glyco_transf_54: N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; InterPro: IPR006759 The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyse the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (2.4.1.145 from EC) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms a GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very N-terminal, where there is a signal sequence on all members. 
For other members the conserved region does not occupy the entire protein but is still to the N-terminal end of the protein .; GO: 0016758 transferase activity, transferring hexosyl groups, 0005975 carbohydrate metabolic process, 0016020 membrane Probab=24.97 E-value=11 Score=12.91 Aligned_cols=33 Identities=27% Similarity=0.459 Sum_probs=24.7 Q ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCEEEECCCC Q ss_conf 999999999999999856863465179971566 Q T0543 328 LREIDKTVGQLMDGLKQLRLHRCVNVIFVGDHG 360 (887) Q Consensus 328 i~~vD~~IG~Ll~~Lk~~gL~~~tnIIivSDHG 360 (887) -.+++..||.|+++|-+....+.+.+|++||-- T Consensus 66 ~sYL~~Tl~SLl~~ls~eEr~~i~ivVliAdtd 98 (297) T PF04666_consen 66 QSYLDATLGSLLEGLSPEEREDIVIVVLIADTD 98 (297) T ss_pred CCHHHHHHHHHHHCCCHHHHCCEEEEEEECCCC T ss_conf 645999999998639988974879999973799 No 19 >PF02127 Peptidase_M18: Aminopeptidase I zinc metalloprotease (M18); InterPro: IPR001948 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases . Peptidases are grouped into clans and families. 
Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide. The type example is aminopeptidase I from Saccharomyces cerevisiae, the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids. A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic alpha-helix. The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria.
It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways .; GO: 0004177 aminopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0005773 vacuole; PDB: 2ijz_C 2glf_B 1y7e_A 2glj_V. Probab=24.45 E-value=12 Score=12.73 Aligned_cols=41 Identities=17% Similarity=0.273 Sum_probs=20.9 Q ss_pred CCCCCCEEEEEEC----CCCCCCCCCCCCCCHHHHHHHHHHHCCCCC Q ss_conf 8876640578877----877788550751323488999998188879 Q T0543 479 KVNSMQTVFVGYG----PTFKYRTKVPPFENIELYNVMCDLLGLKPA 521 (887) Q Consensus 479 ~~~dM~aiFiA~G----P~FK~g~~v~pf~NIdIYnLmc~LLgI~Pa 521 (887) ...+|.+++...| |.-.+.. ..-.-..|..++++.+||++. T Consensus 160 ~~~~l~pi~~~~~~~~~~~~~~~~--~~~~~~~ll~~la~~~gi~~~ 204 (432) T PF02127_consen 160 KQKHLEPIIGLIGENSIPTEDEKK--KEKHKPALLKLLAEELGIDEE 204 (432) T ss_dssp SSTTS--EEEE-------SS-SSS--SSHHHHHHHHHHHHHC---CT T ss_pred CCCCCCCEEEECCCCCCCCCCCCC--CCCHHHHHHHHHHHHHCCCHH T ss_conf 423442422320333234666200--131279999999988694988 No 20 >PF02110 HK: Hydroxyethylthiazole kinase family; InterPro: IPR000417 Thiamine pyrophosphate (TPP), a required cofactor for many enzymes in the cell, is synthesised de novo in Salmonella typhimurium . Five kinase activities have been implicated in TPP synthesis, which involves joining a 4-methyl-5-(beta-hydroxyethyl)thiazole (THZ) moiety and a 4-amino-5- hydroxymethyl-2-methylpyrimidine (HMP) moiety , . THZ kinase (2.7.1.50 from EC) activity is involved in the salvage synthesis of TH-P from the thiazole:2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate .; GO: 0004417 hydroxyethylthiazole kinase activity, 0009228 thiamin biosynthetic process; PDB: 3dzv_B 3hpd_A 1c3q_C 1ekk_B 1ekq_A 1esq_C 1esj_C. 
Probab=21.03 E-value=14 Score=12.28 Aligned_cols=19 Identities=26% Similarity=0.392 Sum_probs=14.8 Q ss_pred HHHHHHHHCCCEEEEECCC Q ss_conf 9999998529479997541 Q T0543 744 LVKKYASERNGVNVISGPI 762 (887) Q Consensus 744 l~r~~a~~~~~V~VvsGPi 762 (887) ..+++|++|+.|-|+||.+ T Consensus 143 ~a~~lA~k~~~vVvvTG~~ 161 (246) T PF02110_consen 143 AAKQLAKKYGCVVVVTGEV 161 (246) T ss_dssp HHHHHHHHCTSEEEE---- T ss_pred HHHHHHHHHCCEEEEECCC T ss_conf 9999999969999997898 Done!