Query         T0529 Nucleoprotein structure of lassa fever virus, unknown, 569 residues
Match_columns 569
No_of_seqs    39 out of 41
Neff          2.4
Searched_HMMs 11830
Date          Fri May 21 18:04:55 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0529.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/hhm/T0529.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF00843 Arena_nucleocap: Aren  100.0       0       0 2216.5  54.1  531     4-543  1-533 (533)
  2 PF00929 Exonuc_X-T: Exonuclea   75.6     2.1 0.00017   19.8   7.1  140   386-537  2-161 (165)
  3 PF06248 Zw10: Centromere/kine   60.4     4.1 0.00034   17.6   7.5  122    93-232  118-243 (595)
  4 PF07047 OPA3: Optic atrophy 3   54.3       5 0.00043   16.9   4.2   72      9-80  42-127 (134)
  5 PF01612 3_5_exonuc: 3'-5' exo   52.7     5.3 0.00045   16.7   5.4  123   370-518  7-136 (174)
  6 PF01853 MOZ_SAS: MOZ/SAS fami   52.1     5.4 0.00046   16.7   5.3   97    76-212  85-181 (188)
  7 PF10221 DUF2151: Cell cycle a   48.4     3.8 0.00032   17.8   2.0  135   380-546  423-572 (692)
  8 PF05450 Nicastrin: Nicastrin;   41.1     4.4 0.00037   17.3   1.4   64   256-324  6-70 (234)
  9 PF04481 DUF561: Protein of un   38.6     4.6 0.00038   17.2   1.2   68   175-264  23-90 (242)
 10 PF09254 Endonuc-FokI_C: Restr   37.9     2.6 0.00022   19.0  -0.1   29   479-507  23-51 (193)
 11 PF04522 DUF585: Protein of un   34.6     3.2 0.00027   18.4  -0.1   14      5-18  11-26 (248)
 12 PF12007 DUF3501: Protein of u   34.6     9.8 0.00083   14.8   2.7   81   370-454  97-188 (192)
 13 PF11502 BCL9: B-cell lymphoma   33.4     9.8 0.00083   14.8   2.2   17   128-144  1-17 (40)
 14 PF03223 V-ATPase_C: V-ATPase    33.2      10 0.00087   14.6   2.4   25    88-114  40-64 (371)
 15 PF05989 Chordopox_A35R: Chord   32.6     6.1 0.00051   16.3   1.1   25   300-324  144-175 (176)
 16 PF06009 Laminin_II: Laminin D   32.0      11 0.00091   14.5   5.3   83    29-118  13-106 (140)
 17 PF05813 Orthopox_F7: Orthopox   28.5     5.8 0.00049   16.5   0.4   31   212-242  35-67 (82)
 18 PF06953 ArsD: Arsenical resis   28.2     8.2 0.00069   15.4   1.1   43    94-144  21-65 (123)
 19 PF10757 YbaJ: Biofilm formati   25.9      14  0.0011   13.8   2.2   33   193-225  46-78 (122)
 20 PF11715 Nup160: Nucleoporin N   24.4     5.3 0.00045   16.7  -0.4   31   186-216  89-119 (547)
 21 PF11041 DUF2612: Protein of u   24.1      15  0.0012   13.5   3.1   40   245-284  128-169 (187)
 22 PF03410 Peptidase_M44: Protei   23.9      15  0.0012   13.5   2.1  180    16-233  62-277 (590)
 23 PF06780 Erp_C: Erp protein C-   21.3      16  0.0014   13.2   2.0   48    65-116  81-128 (141)
 24 PF05677 DUF818: Chlamydia CHL   20.9      13  0.0011   13.8   1.1  100   227-328  200-307 (365)
 25 PF04609 MCR_C: Methyl-coenzym   20.4      17  0.0014   13.0   3.1   14   361-374  115-128 (268)
 26 PF02426 MIase: Muconolactone    20.1      17  0.0015   13.0   2.3   29   154-182  46-74 (91)
 27 PF05415 Peptidase_C36: Beet n   20.0      17  0.0014   13.0   1.4   16   531-546  51-66 (104)

No 1 >PF00843 Arena_nucleocap: Arenavirus nucleocapsid protein; InterPro: IPR000229 Arenaviruses are single stranded RNA viruses. This family represents the nucleocapsid protein that encapsidates the viral ssRNA.; GO: 0019013 viral nucleocapsid Probab=100.00 E-value=0 Score=2216.46 Aligned_cols=531 Identities=63% Similarity=1.051 Sum_probs=521.5 Q ss_pred CCCCCHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 64462367789999764320115688998889999842176678999999854057865789998888998777753202 Q T0529 4 SKEIKSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNVQRLMRKERRDDNDLKRLRDLNQAVNNLVELKST 83 (569) Q Consensus 4 skevpSFrWtQsLRR~Ls~~t~~vK~dVl~Da~~ll~gLDF~~Va~VQR~mRk~KR~D~DL~~LRDlNkeVd~Lm~mkS~ 83 (569) ||||||||||||||||||+||++||+|||+||++|++||||+|||||||||||+||+|+||+||||||||||+||+|||+ T Consensus 1 skevpSFrWtQsLRRgLs~~t~~vK~dVlkDa~~l~~~LDF~~Va~VQR~mRk~KR~d~DL~~LRDlNkeVd~Lm~mkS~ 80 (533) T PF00843_consen 1 SKEVPSFRWTQSLRRGLSNWTTPVKADVLKDARALLSGLDFSQVAQVQRMMRKEKRDDSDLTKLRDLNKEVDSLMSMKST 80 (533) T ss_pred CCCCCCHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 99776125579998664125641288999889998720588989999999886227747789999888999888845411 Q ss_pred
CCCCEEEECCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCEEEE Q ss_conf 45526872577767899999889999999752036778850321664778999999999843676556777788875899 Q T0529 84 QQKSILRVGTLTSDDLLILAADLEKLKSKVIRTERPLSAGVYMGNLSSQQLDQRRALLNMIGMSGGNQGARAGRDGVVRV 163 (569) Q Consensus 84 Qk~~~lkvG~LskdeLm~LasDLeKLk~Kv~rtEr~~~~gvY~GNLt~~QL~~Rs~iL~~vG~~~~~~~~~~~~~GvVrv 163 (569) |+|++||||+|+|||||||||||||||+||+|+||+++|||||||||++||+||+|||+++|| ++++++++||||| T Consensus 81 q~~~~lkvG~LskdeLm~LasDLeKLk~Kv~r~er~~~~gvY~GNLt~~QL~~Rs~iL~~~G~----~~~~~~~~GVVrv 156 (533) T PF00843_consen 81 QKNNVLKVGGLSKDELMELASDLEKLKKKVQRTERSGSPGVYMGNLTQSQLDQRSEILRMVGM----QQPRGGRNGVVRV 156 (533) T ss_pred CCCCEEEECCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCEECCCCHHHHHHHHHHHHHHCC----CCCCCCCCCEEEE T ss_conf 545168735767889999997899999997413578998623056668889999999998476----8887899976999 Q ss_pred EECCCHHHHHHHCCCCHHHHHHHHHHCCCCCHHHHHHHHHHCCEEEEECCCCHHHHHHHHHCCCEEECCCCCCCCEEEEE Q ss_conf 73576467764206604678888644056536789998752120034417886799988521871110146543001000 Q T0529 164 WDVKNAELLNNQFGTMPSLTLACLTKQGQVDLNDAVQALTDLGLIYTAKYPNTSDLDRLTQSHPILNMIDTKKSSLNISG 243 (569) Q Consensus 164 WDvkd~sll~NQFGsmPalTiaCMt~Qgge~lndVVQ~Lt~LGLlYT~KyPNl~DLekLt~~Hp~L~iIt~~~S~iNISG 243 (569) ||||||++|||||||||||||||||+||||+|||||||||+|||+||||||||+|||||+++||||++||+|+||||||| T Consensus 157 WDvkd~sll~NQFGsmPalTiaCmt~Qg~e~lndvVQ~lt~LGLlYTvKyPNl~DLekLt~~Hp~L~~It~~~S~iNISG 236 (533) T PF00843_consen 157 WDVKDSSLLNNQFGSMPALTIACMTEQGGETLNDVVQALTDLGLLYTVKYPNLSDLEKLTQKHPCLKIITQEESQINISG 236 (533) T ss_pred EECCCHHHHHHCCCCCHHHHHHHHHHHCCCCHHHHHHHHHHCCEEEEEECCCHHHHHHHHHHCCCEEEECCCCCCEECCC T ss_conf 74688799874047854999999998358717889987520351576406977889987652983012144512200232 Q ss_pred ECHHHHHHHHHCCEEECCCCCEEEEEECHHHHHHHHHHHHHHHHHCCCEECCCCCCCCCHHHHHHHHEECCCCCCCEEEC Q ss_conf 02136666541400110666035676352128899999999886606421178778884454450110157986300012 Q T0529 244 
YNFSLGAAVKAGACMLDGGNMLETIKVSPQTMDGILKSILKVKKALGMFISDTPGERNPYENILYKICLSGDGWPYIASR 323 (569) Q Consensus 244 YNlSLsAAVKAGA~liDGGNMLETIkv~p~~f~~iiK~~L~vK~~e~MFv~~~PG~RNPYENlLYKlCLSGeGWPYI~SR 323 (569) |||||||||||||||||||||||||+|+|+||++|||++|+||++|+||||++||+|||||||||||||||||||||||| T Consensus 237 YNlSLsAAVKAGAcliDGGNMLETIkv~p~~f~~iiK~~L~vK~~e~MFV~~~pg~RNPYENlLYKlCLSGeGWPYIgSR 316 (533) T PF00843_consen 237 YNLSLSAAVKAGACLIDGGNMLETIKVTPSNFSTIIKAVLQVKNREGMFVSETPGQRNPYENLLYKLCLSGEGWPYIGSR 316 (533) T ss_pred CCCCHHHHHHCCCEEECCCCCEEEEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCEEEC T ss_conf 24117888750743751774167787255318999999998777418511799988780888799985078897643331 Q ss_pred CEEEEEEECCEEEEECCCC-CCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCEEEEECCCCCCCEEEEE Q ss_conf 4010011034068707888-877888888765545556556788889999999986138998657860478989647888 Q T0529 324 TSITGRAWENTVVDLESDG-KPQKADSNNSSKSLQSAGFTAGLTYSQLMTLKDAMLQLDPNAKTWMDIEGRPEDPVEIAL 402 (569) Q Consensus 324 SqI~GRAWDNT~VDl~~~~-~p~~p~~~~~~~~~~~~~~~~~Lt~~qe~~ik~~m~~Ldp~~tTWiDIEG~p~DPVElAi 402 (569) |||+|||||||+|||+.+| .++++|.+||+.+ ++++|+++||++||++|++|||++||||||||||+||||||| T Consensus 317 SqI~GRAWDNT~VDl~~~~~~~~~~p~~ng~~~-----~l~~Lt~~qe~~vk~am~~Ldp~~ttWiDIEGpp~DPVElAi 391 (533) T PF00843_consen 317 SQIKGRAWDNTTVDLSGKPDSGPPPPVRNGGNP-----RLSGLTESQEMQVKEAMEKLDPNATTWIDIEGPPNDPVELAI 391 (533) T ss_pred CCCCCCCCCCCEEECCCCCCCCCCCCCCCCCCC-----CCCCCCHHHHHHHHHHHHHCCCCCCEEEECCCCCCCCEEEEE T ss_conf 544444457856847889998999865778997-----877889889999999997179899836826789999757888 Q ss_pred EECCCCCEEEEEECCCCHHHHHCCCCCCCCCHHHHHHCCCCCHHHHHHHHCCCCCEEEECCHHHHHHHHHHCCCCCEEEE Q ss_conf 70799867888507551002121264100111655402475368999974586648981382899999984499622678 Q T0529 403 YQPSSGCYIHFFREPTDLKQFKQDAKYSHGIDVTDLFATQPGLTSAVIDALPRNMVITCQGSDDIRKLLESQGRKDIKLI 482 (569) Q Consensus 403 yQP~sg~YIHcyR~P~D~K~FK~~SKysHGillkDl~~aqPGL~S~vI~~LP~~MVlT~QGsDDIrkLld~hGRkDiKli 482 (569) 
|||++|+||||||+|||+|||||||||||||||||||+|||||+||||++||+|||||||||||||||||||||||||+| T Consensus 392 yQP~sg~YIHcyR~PhDeK~FK~~SKysHGillkDle~aqPGL~S~ii~~LP~~MVlT~QGsDDIrkLld~hGR~DiK~v 471 (533) T PF00843_consen 392 YQPESGNYIHCYRKPHDEKQFKNQSKYSHGILLKDLENAQPGLLSAIIGLLPQNMVLTCQGSDDIRKLLDMHGRRDIKLV 471 (533) T ss_pred ECCCCCCEEEEECCCCCHHHHCCCCCCCCCEEHHHHHHCCCCHHHHHHHHCCCCCEEEEECHHHHHHHHHHCCCCCCEEE T ss_conf 61688857988547862433214564445303565643077539999976786738984171889999984387663588 Q ss_pred EEEECHHHHHHHHHHHHHHHHHHHHHCCCCEEECCCCCCCCCCC-HHHHHHHHHHHHHHHCC Q ss_conf 64306478877788889988778864285043312468888876-03788999998666437 Q T0529 483 DIALSKTDSRKYENAVWDQYKDLCHMHTGVVVEKKKRGGKEEIT-PHCALMDCIMFDAAVSG 543 (569) Q Consensus 483 DV~ls~eqsR~FEd~VWd~f~~LC~~H~GiVi~kKKKg~~~~~t-pHCALlDCiMF~a~~~G 543 (569) ||+||+||||+||++|||+|+|||++||||||+|||||++|++| |||||||||||+||++| T Consensus 472 DV~ls~eqaR~fE~~VW~~f~~LC~~H~GvVv~kKKkg~~~~~t~pHCALlDCiMf~a~~~G 533 (533) T PF00843_consen 472 DVKLSSEQARKFEDQVWDRFGHLCKKHNGVVVKKKKKGKKPESTNPHCALLDCIMFQAAVTG 533 (533) T ss_pred EEEECHHHHHHHHHHHHHHHHHHHHHCCCEEEECCCCCCCCCCCCCHHHHHHHHHHHHHHCC T ss_conf 76506888888899999998898986584687524578999888823898887877764159 No 2 >PF00929 Exonuc_X-T: Exonuclease; InterPro: IPR013520 This entry includes a variety of exonuclease proteins, such as ribonuclease T and the epsilon subunit of DNA polymerase III. Ribonuclease T is responsible for the end-turnover of tRNA,and removes the terminal AMP residue from uncharged tRNA. DNA polymerase III is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria, and also exhibits 3' to 5' exonuclease activity.; PDB: 1zbh_C 1w0h_A 1zbu_B 3cg7_B 3cm5_B 3cm6_B 1wlj_A 2gbz_A 1j9a_A 2igi_A .... 
Probab=75.62 E-value=2.1 Score=19.76 Aligned_cols=140 Identities=20% Similarity=0.149 Sum_probs=85.0 Q ss_pred EEEEECCC-----CCCCEEEEEEECCCC-----CEEEEEECCCCHHHHHCCCCCCCCCHHHHHHCCCCC--HHHHHHHHC Q ss_conf 57860478-----989647888707998-----678885075510021212641001116554024753--689999745 Q T0529 386 TWMDIEGR-----PEDPVEIALYQPSSG-----CYIHFFREPTDLKQFKQDAKYSHGIDVTDLFATQPG--LTSAVIDAL 453 (569) Q Consensus 386 TWiDIEG~-----p~DPVElAiyQP~sg-----~YIHcyR~P~D~K~FK~~SKysHGillkDl~~aqPG--L~S~vI~~L 453 (569) .++|.|-. ...++|+|..--... .-.+.|-+|.+-......+.--|||--.++.++.+- ....+.+.+ T Consensus 2 v~~D~Ettg~~~~~~~iieig~v~~~~~~~~~~~~f~~~v~p~~~~~i~~~~~~i~GIt~~~l~~~~~~~~~~~~~~~~~ 81 (165) T PF00929_consen 2 VFLDTETTGLDPNRDEIIEIGAVKVDDDENREVDQFNTYVKPEDPPEISPFATKIHGITPEDLEDAPSFEEALDEFLEFL 81 (165) T ss_dssp EEEEEEE--BCTTTC-EEEEEEEEEETTCTEEEEEEEEEEEHSSHHTS-HHHHHHHHH-HHHHHCCCEHHHHHHHHHHHH T ss_pred EEEEEECCCCCCCCCCEEEEEEEEEECCCCCCCEEEEEEECCCCCCCCCHHHHHHCCCCHHHHHCCCCHHHHHHHHHHHH T ss_conf 89999838987999727999999987884333137888866877777987888755988889974998688513699998 Q ss_pred CCCCEEEECC-HHHHHH---HHHHCCC----CCEEEEEEEECHHHHHHHHHHHHHHHHHHHHHCCCCEEECCCCCCCCCC Q ss_conf 8664898138-289999---9984499----6226786430647887778888998877886428504331246888887 Q T0529 454 PRNMVITCQG-SDDIRK---LLESQGR----KDIKLIDIALSKTDSRKYENAVWDQYKDLCHMHTGVVVEKKKRGGKEEI 525 (569) Q Consensus 454 P~~MVlT~QG-sDDIrk---Lld~hGR----kDiKliDV~ls~eqsR~FEd~VWd~f~~LC~~H~GiVi~kKKKg~~~~~ 525 (569) -.+-++-+.+ +=|+.- -+..++. +....+|..-. ..+.|...-|-+-.+||..... .... 
T Consensus 82 ~~~~~~v~~~~~fd~~~l~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~L~~l~~~~~~----------~~~~ 149 (165) T PF00929_consen 82 DKGDVIVGHNASFDVGFLRRAFRRFLGRPIPKMLPFIDTLDL--ARRTFPGRKSPSLDDLAEYYGI----------PFEG 149 (165) T ss_dssp HHHCEEEETTHHHCEHHHHHHHHHHHHHHHHHHHHESEEEEH--HHHHHHHHHHHSHHHHHHHCTS----------SSTT T ss_pred HHHHHCCCCHHHHHHHHHHHHHHHHCCCCCCCCCCEEEHHHH--HHHHHCCCCCCCHHHHHHHCCC----------CCCC T ss_conf 655311441146779999999998442235555623206899--9997343457899999998599----------9989 Q ss_pred CHHHHHHHHHHH Q ss_conf 603788999998 Q T0529 526 TPHCALMDCIMF 537 (569) Q Consensus 526 tpHCALlDCiMF 537 (569) ++|.||=||.+- T Consensus 150 ~~H~Al~Da~~t 161 (165) T PF00929_consen 150 RAHDALDDARAT 161 (165) T ss_dssp CTTSHHHHHHHH T ss_pred CCCCCHHHHHHH T ss_conf 761829999998 No 3 >PF06248 Zw10: Centromere/kinetochore Zw10; InterPro: IPR009361 Zw10 and rough deal proteins are both required for correct metaphase check-pointing during mitosis ,. These proteins bind to the centromere/kinetochore .; GO: 0007067 mitosis, 0000775 chromosome, pericentric region, 0005634 nucleus Probab=60.37 E-value=4.1 Score=17.59 Aligned_cols=122 Identities=21% Similarity=0.188 Sum_probs=75.4 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHCCC-CCCCCEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCEEEEEECCCHHH Q ss_conf 77767899999889999999752036-77885032166477899999999984367655677778887589973576467 Q T0529 93 TLTSDDLLILAADLEKLKSKVIRTER-PLSAGVYMGNLSSQQLDQRRALLNMIGMSGGNQGARAGRDGVVRVWDVKNAEL 171 (569) Q Consensus 93 ~LskdeLm~LasDLeKLk~Kv~rtEr-~~~~gvY~GNLt~~QL~~Rs~iL~~vG~~~~~~~~~~~~~GvVrvWDvkd~sl 171 (569) .+...++++-+.-|++++..+..... ...--.-.+-|...-..+|..|...++-. -+..| +||.+... 
T Consensus 118 al~~~~~~~Aa~~Le~~~~~L~~~~~~~~~~~~v~~~L~~e~~~lr~~L~~~L~~~---------w~~lv-~~~~~~~~- 186 (595) T PF06248_consen 118 ALKEGDYLEAADLLEELEKLLKGLKDEKFTELNVLKLLKEEYSVLRQNLQYQLRER---------WDRLV-QVDKKSSK- 186 (595) T ss_pred HHHCCCHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHH---------HHHHE-EECCCCCC- T ss_conf 76148799999999999999861576665534499999999999999999999999---------98652-56677654- Q ss_pred HHHHCCCCHH---HHHHHHHHCCCCCHHHHHHHHHHCCEEEEECCCCHHHHHHHHHCCCEEECC Q ss_conf 7642066046---788886440565367899987521200344178867999885218711101 Q T0529 172 LNNQFGTMPS---LTLACLTKQGQVDLNDAVQALTDLGLIYTAKYPNTSDLDRLTQSHPILNMI 232 (569) Q Consensus 172 l~NQFGsmPa---lTiaCMt~Qgge~lndVVQ~Lt~LGLlYT~KyPNl~DLekLt~~Hp~L~iI 232 (569) +-+..+. .++---+.+..+.|+||++||..||.+.. +++-+.+.--+|=+--+| T Consensus 187 ---~~~~~~~~~~~~l~ls~~~~~~~L~~vl~AL~~Lg~L~~----~l~~~~~~Ll~~ii~PlI 243 (595) T PF06248_consen 187 ---DLSNPQNSLCVTLHLSKDESQSTLSDVLQALSRLGQLDY----KLDKFCKDLLKNIIEPLI 243 (595) T ss_pred ---CCCCCCCCEEEEEEECCCCCCCHHHHHHHHHHHHCCHHH----HHHHHHHHHHHHHHHHHH T ss_conf ---556766504788873377651049999999998462568----999999999999988875 No 4 >PF07047 OPA3: Optic atrophy 3 protein (OPA3); InterPro: IPR010754 OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity . This family consists of several optic atrophy 3 (OPA3) proteins and related proteins from other eukaryotic species, the function is unknown. 
Probab=54.32 E-value=5 Score=16.90 Aligned_cols=72 Identities=24% Similarity=0.357 Sum_probs=52.8 Q ss_pred HHHHHHHHHHHHHHC-CHHHHHHHHHHHHHHHHHCCHH------------HHHHHHHHHHHCCCCHHH-HHHHHHHHHHH Q ss_conf 367789999764320-1156889988899998421766------------789999998540578657-89998888998 Q T0529 9 SFLWTQSLRRELSGY-CSNIKLQVVKDAQALLHGLDFS------------EVSNVQRLMRKERRDDND-LKRLRDLNQAV 74 (569) Q Consensus 9 SFrWtQsLRR~Ls~~-t~~vK~dVl~Da~~ll~gLDF~------------~Va~VQR~mRk~KR~D~D-L~~LRDlNkeV 74 (569) +.+|.+-+++.+.++ -.+++..-|+|++|+--|-||- -+.-++|--||+.+-..+ ..++..|..++ T Consensus 42 ~h~~~~~l~~~~~~~~~~~~~i~pL~E~kAie~Ga~~lgE~fIF~Va~~li~~E~~Rs~~ke~~Ke~~~~~~l~~L~~~i 121 (134) T PF07047_consen 42 YHRFEVRLKMRLLGLGSKPVKIRPLNEEKAIELGAELLGEAFIFSVAGGLIVYEYYRSSRKEAKKEEALQQRLEELEQRI 121 (134) T ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 99999999898751667898689999899999899999999999999999999999997678879999999999999999 Q ss_pred HHHHHH Q ss_conf 777753 Q T0529 75 NNLVEL 80 (569) Q Consensus 75 d~Lm~m 80 (569) +.|... T Consensus 122 ~eL~~~ 127 (134) T PF07047_consen 122 EELEEE 127 (134) T ss_pred HHHHHH T ss_conf 999999 No 5 >PF01612 3_5_exonuc: 3'-5' exonuclease; InterPro: IPR002562 This domain is responsible for the 3'-5' exonuclease proofreading activity of Escherichia coli DNA polymerase I (polI) and other enzymes, it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D) .; GO: 0003676 nucleic acid binding, 0008408 3'-5' exonuclease activity, 0006139 nucleobase, nucleoside, nucleotide and nucleic acid metabolic process, 0005622 intracellular; PDB: 2fby_A 2fbt_A 2fbv_A 2fc0_A 2fbx_A 2e6m_A 2e6l_A 1yt3_A 3cym_A 2hbl_A .... 
Probab=52.66 E-value=5.3 Score=16.72 Aligned_cols=123 Identities=20% Similarity=0.230 Sum_probs=72.4 Q ss_pred HHHHHHHHHHCCCCCCEEEEECCCCCCC----EEEEEEECCCCCEEEEEE-CCCCHHHHHCCCCCCCCCHHHHHHCCCCC Q ss_conf 9999999861389986578604789896----478887079986788850-75510021212641001116554024753 Q T0529 370 LMTLKDAMLQLDPNAKTWMDIEGRPEDP----VEIALYQPSSGCYIHFFR-EPTDLKQFKQDAKYSHGIDVTDLFATQPG 444 (569) Q Consensus 370 e~~ik~~m~~Ldp~~tTWiDIEG~p~DP----VElAiyQP~sg~YIHcyR-~P~D~K~FK~~SKysHGillkDl~~aqPG 444 (569) +.++.+.+..+.-.....+|+|+.+.++ -.++++|-.++ -+||= .++. .+.+ +. T Consensus 7 ~~~l~~~~~~l~~~~~v~~D~E~~~~~~~~~~~~~~~iq~~~~--~~~~i~~~~~--~~~~-----------------~~ 65 (174) T PF01612_consen 7 EEELEELLEKLKKAKVVAFDTETTPLDPKSYSNKICLIQLATG--DGCYIIDPHS--LEDD-----------------PE 65 (174) T ss_dssp HHHHHHHHHHHCTSSEEEEEEEECSSSTTTSSBBEEEEEEEEE--EEEEEEGGSS--STTC-----------------HH T ss_pred HHHHHHHHHHHCCCCEEEEEEEECCCCCCCCCCCEEEEEEECC--CCEEEEEECC--CCCH-----------------HH T ss_conf 9999999999816995999988478875544662189999659--9669995233--3307-----------------99 Q ss_pred HHHHHHHHCCCCCEEEECCH-HHHHHHHHHCCCCCEEEEEEEECHHHHHHHHHHHHHHHHHHHHHCCC-CEEECCC Q ss_conf 68999974586648981382-89999998449962267864306478877788889988778864285-0433124 Q T0529 445 LTSAVIDALPRNMVITCQGS-DDIRKLLESQGRKDIKLIDIALSKTDSRKYENAVWDQYKDLCHMHTG-VVVEKKK 518 (569) Q Consensus 445 L~S~vI~~LP~~MVlT~QGs-DDIrkLld~hGRkDiKliDV~ls~eqsR~FEd~VWd~f~~LC~~H~G-iVi~kKK 518 (569) .+..+++. .+.+..+++. .|++.|...+|=.=-.++|. +-..+..-.... -.+..||..+.| ....|.. 
T Consensus 66 ~L~~lle~--~~i~kvg~n~k~D~~~L~~~~~i~~~~~~d~-~~a~~ll~~~~~--~~L~~l~~~~l~~~~~~k~~ 136 (174) T PF01612_consen 66 ELKELLED--PNILKVGHNAKFDLRVLKRDFGIDLKNVFDT-MLAAYLLDPPRS--YSLKDLAKKYLGKYDLDKEE 136 (174) T ss_dssp HHHHHHT---TTSEEEESSHHHHHHHHHHHH----SSBEEH-HHHHHHTTTCCC---SHHHHHHHHHSEEB--HHH T ss_pred HHHHHHHC--CCCEEEEECCHHHHHHHCCCCCCCCCCCHHH-HHHHHHHHCCCH--HHHHHHHHHHCCCCCCCHHH T ss_conf 99999958--7861899860644997637445544683849-999987432212--11999999982997664765 No 6 >PF01853 MOZ_SAS: MOZ/SAS family; InterPro: IPR002717 Moz is a monocytic leukemia Zn_finger protein and the SAS protein from Saccharomyces cerevisiae is involved in silencing the Hmr locus. These proteins were reported to be homologous to acetyltransferases but this similarity is not supported by standard sequence analysis.; PDB: 2pq8_A 2giv_A 1mjb_A 1fy7_A 1mj9_A 1mja_A 2ou2_A 2rc4_A 2ozu_A. Probab=52.11 E-value=5.4 Score=16.66 Aligned_cols=97 Identities=27% Similarity=0.410 Sum_probs=69.8 Q ss_pred HHHHHHHCCCCCEEEECCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCC Q ss_conf 77753202455268725777678999998899999997520367788503216647789999999998436765567777 Q T0529 76 NLVELKSTQQKSILRVGTLTSDDLLILAADLEKLKSKVIRTERPLSAGVYMGNLSSQQLDQRRALLNMIGMSGGNQGARA 155 (569) Q Consensus 76 ~Lm~mkS~Qk~~~lkvG~LskdeLm~LasDLeKLk~Kv~rtEr~~~~gvY~GNLt~~QL~~Rs~iL~~vG~~~~~~~~~~ 155 (569) -.|++-.-|++ ..|.+ |+++|-.|-+...+++--|||.+. .|.++-.--= |..|++.+--.. . 
T Consensus 85 CIltlPpyQrk---GyG~l----LI~~SY~LSr~E~~~G~PErPLSd---lG~~sY~sYW-~~~il~~L~~~~------~ 147 (188) T PF01853_consen 85 CILTLPPYQRK---GYGKL----LIDFSYELSRREGKIGGPERPLSD---LGLLSYRSYW-RRVILEYLLEHK------G 147 (188) T ss_dssp EB---GGGTT--------H----HHHHHHHHHHHTT----B-SS---------HHHHHHH-HHHHHHHHHCTS------S T ss_pred EEEECCCHHHC---CHHHH----HHHHHHHHHHHCCCCCCCCCCCCH---HHHHHHHHHH-HHHHHHHHHHCC------C T ss_conf 99966704636---78789----999999987632889999999898---8999999999-999999998638------8 Q ss_pred CCCCEEEEEECCCHHHHHHHCCCCHHHHHHHHHHCCCCCHHHHHHHHHHCCEEEEEC Q ss_conf 888758997357646776420660467888864405653678999875212003441 Q T0529 156 GRDGVVRVWDVKNAELLNNQFGTMPSLTLACLTKQGQVDLNDAVQALTDLGLIYTAK 212 (569) Q Consensus 156 ~~~GvVrvWDvkd~sll~NQFGsmPalTiaCMt~Qgge~lndVVQ~Lt~LGLlYT~K 212 (569) .+ .+||.-+++..+....||+.+|.+||++...| T Consensus 148 --~~---------------------~isi~dis~~T~i~~~DIi~tL~~l~~l~~~~ 181 (188) T PF01853_consen 148 --ED---------------------SISIEDISKETGIRPEDIISTLQSLGMLKYYK 181 (188) T ss_dssp --E-----------------------EEHHHHHHHH--THHHHHHHHHHTT-EEE-- T ss_pred --CC---------------------CEEHHHHHHHHCCCHHHHHHHHHHCCCEEEEC T ss_conf --76---------------------00799998886898889999999869889888 No 7 >PF10221 DUF2151: Cell cycle and development regulator Probab=48.36 E-value=3.8 Score=17.83 Aligned_cols=135 Identities=20% Similarity=0.255 Sum_probs=84.9 Q ss_pred CCCCCCEEEEECCCCCCCEEEEEEECCCCCEEEEEECCCCHHHHHCCCCCCCCCHHHH-HHCCC-----CCHHH-HHHHH Q ss_conf 3899865786047898964788870799867888507551002121264100111655-40247-----53689-99974 Q T0529 380 LDPNAKTWMDIEGRPEDPVEIALYQPSSGCYIHFFREPTDLKQFKQDAKYSHGIDVTD-LFATQ-----PGLTS-AVIDA 452 (569) Q Consensus 380 Ldp~~tTWiDIEG~p~DPVElAiyQP~sg~YIHcyR~P~D~K~FK~~SKysHGillkD-l~~aq-----PGL~S-~vI~~ 452 (569) |-|-..+-.|...-|..|.|- +. + --.|+|+.-+||.-=+.=.- ||+-+ +-|++ -+=+. 
T Consensus 423 l~p~~~~~~~~d~~~e~~~eq----~~----~------~~k~~l~r~TRy~P~~~s~T~ifn~~~~~~l~Pl~~li~K~~ 488 (692) T PF10221_consen 423 LVPVSATDLDDDNVPEQPGEQ----QN----Q------RAKKRLERITRYWPMTISDTFIFNMRVTKKLEPLLTLIVKEE 488 (692) T ss_pred EEECCCCCCCCCCCCCCCCHH----CC----H------HHHHHHHHCCCCCCCEECCEEEEECCCCCCCCHHHHHHHCCC T ss_conf 112466656767666664100----00----1------478999846877771211158985022112015778771334 Q ss_pred CCCCCEEEECCHHHHHHHHHHCCCCCEEE---E---EEEE--CHHHHHHHHHHHHHHHHHHHHHCCCCEEECCCCCCCCC Q ss_conf 58664898138289999998449962267---8---6430--64788777888899887788642850433124688888 Q T0529 453 LPRNMVITCQGSDDIRKLLESQGRKDIKL---I---DIAL--SKTDSRKYENAVWDQYKDLCHMHTGVVVEKKKRGGKEE 524 (569) Q Consensus 453 LP~~MVlT~QGsDDIrkLld~hGRkDiKl---i---DV~l--s~eqsR~FEd~VWd~f~~LC~~H~GiVi~kKKKg~~~~ 524 (569) |..+=|+.||. -|-+|.+|-.|+|--. + -+|. ..||-|. .|...+.+-.+|-+ . T Consensus 489 Lt~~dv~~Cqk--~I~~L~~m~~~~d~l~~~~~~g~r~k~~k~dEQyR~----~~~ELe~~i~~~~~------------~ 550 (692) T PF10221_consen 489 LTEEDVLQCQK--CIYNLYQMESRNDPLPFPTMDGSRGKGPKRDEQYRL----MWNELENHIMAYVN------------T 550 (692) T ss_pred CCHHHHHHHHH--HHHHHHHHHHCCCCCCCCCHHCCCCCCCCHHHHHHH----HHHHHHHHHHHHHC------------C T ss_conf 99899999999--999999886049966775100114578731899999----99999999997434------------5 Q ss_pred CCHHHHHHHHHHHHHHHCCCCC Q ss_conf 7603788999998666437657 Q T0529 525 ITPHCALMDCIMFDAAVSGGLN 546 (569) Q Consensus 525 ~tpHCALlDCiMF~a~~~G~~~ 546 (569) +.-|-++++|||=.....+... T Consensus 551 S~~Hk~vl~~i~~~r~~~~~~~ 572 (692) T PF10221_consen 551 SERHKRVLECIMSCRSADPEEE 572 (692) T ss_pred CHHHHHHHHHHHHHHCCCCCCC T ss_conf 7889999999999860385332 No 8 >PF05450 Nicastrin: Nicastrin; InterPro: IPR008710 Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. 
Nicastrin is synthesised in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin) .; GO: 0016485 protein processing, 0016021 integral to membrane Probab=41.13 E-value=4.4 Score=17.35 Aligned_cols=64 Identities=19% Similarity=0.468 Sum_probs=34.5 Q ss_pred CEEECCCCCEEEEEECHH-HHHHHHHHHHHHHHHCCCEECCCCCCCCCHHHHHHHHEECCCCCCCEEECC Q ss_conf 001106660356763521-288999999998866064211787788844544501101579863000124 Q T0529 256 ACMLDGGNMLETIKVSPQ-TMDGILKSILKVKKALGMFISDTPGERNPYENILYKICLSGDGWPYIASRT 324 (569) Q Consensus 256 A~liDGGNMLETIkv~p~-~f~~iiK~~L~vK~~e~MFv~~~PG~RNPYENlLYKlCLSGeGWPYI~SRS 324 (569) +|=+|.--|..-+-+-.+ ..+.|| ++|.+-+..+-+.++ .-++--||++ .-+.||-|-|||||- T Consensus 6 ~armDs~s~F~~~s~GAds~~Sglv-aLLaaA~~L~~~~~~---~~~~~r~v~F-~fF~GEs~dYiGS~R 70 (234) T PF05450_consen 6 SARMDSFSFFPDVSPGADSSLSGLV-ALLAAARALSKLLDD---LSDLPRNVLF-AFFNGESYDYIGSSR 70 (234) T ss_pred EECCCCCCCCCCCCCCCCCCHHHHH-HHHHHHHHHHHHCCC---CCCCCCCEEE-EEECCCCCCCCCHHH T ss_conf 9644550144567877456147899-999999999985512---0146654599-984687656501599 No 9 >PF04481 DUF561: Protein of unknown function (DUF561); InterPro: IPR007570 This is a protein of unknown function found in a cyanobacterium, and the chloroplasts of algae. Probab=38.61 E-value=4.6 Score=17.23 Aligned_cols=68 Identities=25% Similarity=0.261 Sum_probs=42.1 Q ss_pred HCCCCHHHHHHHHHHCCCCCHHHHHHHHHHCCEEEEECCCCHHHHHHHHHCCCEEECCCCCCCCEEEEEECHHHHHHHHH Q ss_conf 20660467888864405653678999875212003441788679998852187111014654300100002136666541 Q T0529 175 QFGTMPSLTLACLTKQGQVDLNDAVQALTDLGLIYTAKYPNTSDLDRLTQSHPILNMIDTKKSSLNISGYNFSLGAAVKA 254 (569) Q Consensus 175 QFGsmPalTiaCMt~Qgge~lndVVQ~Lt~LGLlYT~KyPNl~DLekLt~~Hp~L~iIt~~~S~iNISGYNlSLsAAVKA 254 (569) -|-.---..|+--+++||.++=|+ |--|+|-.+-+-.-.-| -+-|++.. 
-.+-+||+| T Consensus 23 NFd~~~V~~i~~AA~~ggAt~vDI------------Aadp~LV~~v~~~s~lP------iCVSaVep----~~f~~aV~A 80 (242) T PF04481_consen 23 NFDAESVAAIVKAAEIGGATFVDI------------AADPELVKLVKSLSNLP------ICVSAVEP----ELFVAAVKA 80 (242) T ss_pred CCCHHHHHHHHHHHHCCCCCEEEE------------CCCHHHHHHHHHHCCCC------EEEECCCH----HHHHHHHHH T ss_conf 269899999999997069966874------------17999999999738998------57714787----888999982 Q ss_pred CCEEECCCCC Q ss_conf 4001106660 Q T0529 255 GACMLDGGNM 264 (569) Q Consensus 255 GA~liDGGNM 264 (569) ||-|+.=||. T Consensus 81 GAdliEIGNf 90 (242) T PF04481_consen 81 GADLIEIGNF 90 (242) T ss_pred CCCEEEECCH T ss_conf 7877876455 No 10 >PF09254 Endonuc-FokI_C: Restriction endonuclease FokI, C terminal; InterPro: IPR015334 Type II restriction endonucleases (3.1.21.4 from EC) are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin . However, there is still considerable diversity amongst restriction endonucleases , . The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. 
Like other DNA-binding proteins, restriction enzymes are also capable of non-specific DNA binding, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone. There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements, as summarised below: Type I enzymes (3.1.21.3 from EC) cleave at sites remote from the recognition site; require both ATP and S-adenosyl-L-methionine to function; multifunctional proteins with both restriction and methylase (2.1.1.72 from EC) activities. Type II enzymes (3.1.21.4 from EC) cleave within or at short specific distances from the recognition site; most require magnesium; single-function (restriction) enzymes independent of methylase. Type III enzymes (3.1.21.5 from EC) cleave at sites a short distance from the recognition site; require ATP (but do not hydrolyse it); S-adenosyl-L-methionine stimulates the reaction but is not required; exist as part of a complex with a modification methylase (2.1.1.72 from EC). Type IV enzymes target methylated DNA. This entry represents the C-terminal domain of FokI restriction endonucleases, which adopts a structure consisting of an alpha/beta/alpha core containing a five-stranded beta-sheet. FokI recognises the double-stranded DNA sequence 5'-GGATG-3' and cleaves DNA phosphodiester groups 9 base pairs away on this strand and 13 base pairs away on the complementary strand.; GO: 0003677 DNA binding, 0009036 Type II site-specific deoxyribonuclease activity, 0009307 DNA restriction-modification system; PDB: 2fok_A 1fok_A.
Probab=37.86 E-value=2.6 Score=19.01 Aligned_cols=29 Identities=31% Similarity=0.485 Sum_probs=12.6 Q ss_pred EEEEEEEECHHHHHHHHHHHHHHHHHHHH Q ss_conf 26786430647887778888998877886 Q T0529 479 IKLIDIALSKTDSRKYENAVWDQYKDLCH 507 (569) Q Consensus 479 iKliDV~ls~eqsR~FEd~VWd~f~~LC~ 507 (569) |++||+..-+-|+|.||-.|-+-|..-|. T Consensus 23 i~lidiA~d~k~~r~fEm~~~elf~~~yg 51 (193) T PF09254_consen 23 IELIDIARDSKQNRDFEMKVMELFTNEYG 51 (193) T ss_dssp GGHHHHTT-GGGHHHHHHHHHHHHHHT-- T ss_pred HHHHHHHHCCCCCHHHHHHHHHHHHHHHC T ss_conf 98999985355212689999999999866 No 11 >PF04522 DUF585: Protein of unknown function (DUF585); InterPro: IPR007610 This region represents the N-termini of bromovirus 2a protein, and is always found N-terminal to a predicted RNA dependent RNA polymerase region (IPR001788 from INTERPRO).; GO: 0003723 RNA binding, 0003968 RNA-directed RNA polymerase activity, 0006350 transcription Probab=34.63 E-value=3.2 Score=18.38 Aligned_cols=14 Identities=50% Similarity=0.871 Sum_probs=9.0 Q ss_pred CCCCHHHHH--HHHHH Q ss_conf 446236778--99997 Q T0529 5 KEIKSFLWT--QSLRR 18 (569) Q Consensus 5 kevpSFrWt--QsLRR 18 (569) -.||||+|. |||.| T Consensus 11 y~vPSFQWlid~sle~ 26 (248) T PF04522_consen 11 YQVPSFQWLIDQSLER 26 (248) T ss_pred EECCCEEEEECCHHCC T ss_conf 7656423650541102 No 12 >PF12007 DUF3501: Protein of unknown function (DUF3501); PDB: 3fjv_B. Probab=34.60 E-value=9.8 Score=14.78 Aligned_cols=81 Identities=20% Similarity=0.293 Sum_probs=54.2 Q ss_pred HHHHHHHHHHC-CCCCCEEEEECCCCCCCEEEEEEEC----------CCCCEEEEEECCCCHHHHHCCCCCCCCCHHHHH Q ss_conf 99999998613-8998657860478989647888707----------998678885075510021212641001116554 Q T0529 370 LMTLKDAMLQL-DPNAKTWMDIEGRPEDPVEIALYQP----------SSGCYIHFFREPTDLKQFKQDAKYSHGIDVTDL 438 (569) Q Consensus 370 e~~ik~~m~~L-dp~~tTWiDIEG~p~DPVElAiyQP----------~sg~YIHcyR~P~D~K~FK~~SKysHGillkDl 438 (569) +...+.....| --.+.+|+.|.|.. || -|||-| ++=+|+||==.+.-...||++..-.-|++.... 
T Consensus 97 ~~er~~~L~~l~Gie~~v~l~v~~~~--~v-~ai~~eD~~r~~~ek~SsVhylrF~l~~~~~aa~k~~~~v~l~vdHp~Y 173 (192) T PF12007_consen 97 EAERRRELARLIGIEDSVFLEVGGHR--RV-YAIADEDLDRENGEKTSSVHYLRFDLTDEQRAAFKAGAPVELGVDHPNY 173 (192) T ss_dssp HHHHHHHHHHC---CCC-EEEETTS----E-E-EE-TTS-HHHHHS--SEEEEEEE---HHHHHHH----EEE-EEETTC T ss_pred HHHHHHHHHHHCCCCCEEEEEECCCC--CE-EEECCCCCCCCCCCCEEEEEEEEEECCHHHHHHHCCCCCEEEEECCCCC T ss_conf 89999999873684511699988934--14-6406732277655520157899996788899985069967998168876 Q ss_pred HCCCCCHHHHHHHHCC Q ss_conf 0247536899997458 Q T0529 439 FATQPGLTSAVIDALP 454 (569) Q Consensus 439 ~~aqPGL~S~vI~~LP 454 (569) ....+ |...+.+.|- T Consensus 174 ~~~~~-l~~~~~~sL~ 188 (192) T PF12007_consen 174 SHMTP-LPPETRESLA 188 (192) T ss_dssp EEEEE---HHHHHHH- T ss_pred CCCCC-CCHHHHHHHH T ss_conf 10351-8999999998 No 13 >PF11502 BCL9: B-cell lymphoma 9 protein; PDB: 2gl7_F. Probab=33.38 E-value=9.8 Score=14.78 Aligned_cols=17 Identities=41% Similarity=0.491 Sum_probs=11.5 Q ss_pred CCCHHHHHHHHHHHHHH Q ss_conf 66477899999999984 Q T0529 128 NLSSQQLDQRRALLNMI 144 (569) Q Consensus 128 NLt~~QL~~Rs~iL~~v 144 (569) |||.+|++.|-+-|+.+ T Consensus 1 ~Ls~qQ~qHRE~sL~tL 17 (40) T PF11502_consen 1 NLSPQQRQHRERSLQTL 17 (40) T ss_dssp ---HHHHHHHHHHHHHH T ss_pred CCCHHHHHHHHHHHHHH T ss_conf 97889999999999999 No 14 >PF03223 V-ATPase_C: V-ATPase subunit C; InterPro: IPR004907 ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. 
There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport. F-ATPases (F1F0-ATPases) in mitochondria, chloroplasts and bacterial plasma membranes are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases) are primarily found in eukaryotic vacuoles, catalysing ATP hydrolysis to transport solutes and lower pH in organelles. A-ATPases (A1A0-ATPases) are found in Archaea and function like F-ATPases. P-ATPases (E1E2-ATPases) are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) (3.6.3.14 from EC) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins. This entry represents the C subunit that is part of the V1 complex, and is localised to the interface between the V1 and V0 complexes. This subunit does not show any homology with F-ATPase subunits.
The C subunit plays an essential role in controlling the assembly of V-ATPase, acting as a flexible stator that holds together the catalytic (V1) and membrane (V0) sectors of the enzyme . The release of subunit C from the ATPase complex results in the dissociation of the V1 and V0 subcomplexes, which is an important mechanism in controlling V-ATPase activity in cells. More information about this protein can be found at Protein of the Month: ATP Synthases .; GO: 0005524 ATP binding, 0016820 hydrolase activity, acting on acid anhydrides, catalyzing transmembrane movement of substances, 0015986 ATP synthesis coupled proton transport, 0016469 proton-transporting two-sector ATPase complex; PDB: 1u7l_A. Probab=33.20 E-value=10 Score=14.63 Aligned_cols=25 Identities=56% Similarity=0.721 Sum_probs=20.5 Q ss_pred EEEECCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 687257776789999988999999975 Q T0529 88 ILRVGTLTSDDLLILAADLEKLKSKVI 114 (569) Q Consensus 88 ~lkvG~LskdeLm~LasDLeKLk~Kv~ 114 (569) -||+|+| |.||.++-||.|+..-|. T Consensus 40 ~lk~gTL--DsLv~~sDdL~Kld~~ve 64 (371) T PF03223_consen 40 DLKVGTL--DSLVQLSDDLAKLDSQVE 64 (371) T ss_dssp --B---G--GGHHHHHHHHHHHHHH-- T ss_pred CCCEECH--HHHHHHHHHHHHHHHHHH T ss_conf 7752869--999988898887889999 No 15 >PF05989 Chordopox_A35R: Chordopoxvirus A35R protein; InterPro: IPR009247 This family consists of several Chordopoxvirus sequences homologous to the Vaccinia virus A35R protein. The function of this family is unknown. 
Probab=32.56 E-value=6.1 Score=16.31 Aligned_cols=25 Identities=36% Similarity=0.946 Sum_probs=17.3 Q ss_pred CCCHHHHH------HHHEEC-CCCCCCEEECC Q ss_conf 88445445------011015-79863000124 Q T0529 300 RNPYENIL------YKICLS-GDGWPYIASRT 324 (569) Q Consensus 300 RNPYENlL------YKlCLS-GeGWPYI~SRS 324 (569) =||-+|.| +.+||+ |+||--+-.++ T Consensus 144 lnPS~~Fl~~L~~~~~~CLtD~~GW~IvD~K~ 175 (176) T PF05989_consen 144 LNPSDNFLMRLMKKSNVCLTDGNGWCIVDGKS 175 (176) T ss_pred ECCHHHHHHHHHHHEEEEEECCCCEEEEECCC T ss_conf 78289999998854159997798089995634 No 16 >PF06009 Laminin_II: Laminin Domain II; InterPro: IPR010307 It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure .; GO: 0007155 cell adhesion, 0005604 basement membrane Probab=32.03 E-value=11 Score=14.49 Aligned_cols=83 Identities=17% Similarity=0.175 Sum_probs=45.6 Q ss_pred HHHHHHHHHHHHHCCHHHHHHHHHHHHHCCC-----------CHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEECCCCHH Q ss_conf 8998889999842176678999999854057-----------86578999888899877775320245526872577767 Q T0529 29 LQVVKDAQALLHGLDFSEVSNVQRLMRKERR-----------DDNDLKRLRDLNQAVNNLVELKSTQQKSILRVGTLTSD 97 (569) Q Consensus 29 ~dVl~Da~~ll~gLDF~~Va~VQR~mRk~KR-----------~D~DL~~LRDlNkeVd~Lm~mkS~Qk~~~lkvG~Lskd 97 (569) ..|+++...+..-++.. ..+++.+.+...+ -+..-..+++|++.++.|+.=-+.-++...-..... T Consensus 13 ~~vl~~~~~i~~~l~~~-~~~~~~l~~~~~~tn~~i~~a~~~i~~a~~~v~~l~~~~~~L~~kl~~v~~~~~~~~~~~-- 89 (140) T PF06009_consen 13 QNVLKILEPISTQLPKW-KEKLGKLNSDVDETNKDISQANNQIDDAEENVRKLEQLAPDLLDKLKPVEQLSDNNSNRT-- 89 (140) T ss_pred HHHHHHHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-- T ss_conf 99999888899868999-999987504688888889999999999986899999887899999999998751100447-- Q ss_pred HHHHHHHHHHHHHHHHHHCCC Q ss_conf 899999889999999752036 Q T0529 98 DLLILAADLEKLKSKVIRTER 118 (569) Q Consensus 98 eLm~LasDLeKLk~Kv~rtEr 118 (569) ++.+++.||.+|.++-. 
T Consensus 90 ----ls~nI~~lkelIaqAR~ 106 (140) T PF06009_consen 90 ----LSRNIERLKELIAQARD 106 (140) T ss_pred ----HHHHHHHHHHHHHHHHH T ss_conf ----98899999999999999 No 17 >PF05813 Orthopox_F7: Orthopoxvirus F7 protein; InterPro: IPR008725 The function of the orthopoxvirus F7L proteins are unknown. Probab=28.47 E-value=5.8 Score=16.46 Aligned_cols=31 Identities=35% Similarity=0.489 Sum_probs=23.9 Q ss_pred CCCCHHHHHHHH--HCCCEEECCCCCCCCEEEE Q ss_conf 178867999885--2187111014654300100 Q T0529 212 KYPNTSDLDRLT--QSHPILNMIDTKKSSLNIS 242 (569) Q Consensus 212 KyPNl~DLekLt--~~Hp~L~iIt~~~S~iNIS 242 (569) .|-||.||+.-. -+.--|-+|..|+|.||-- T Consensus 35 dyknlndlde~~triefgplymineEKSdintl 67 (82) T PF05813_consen 35 DYKNLNDLDEFNTRIEFGPLYMINEEKSDINTL 67 (82) T ss_pred CCCCCCCCCHHHHEEEECCEEEECCCCCCCCEE T ss_conf 446422100244202314446862120256310 No 18 >PF06953 ArsD: Arsenical resistance operon trans-acting repressor ArsD; InterPro: IPR010712 This family consists of several bacterial arsenical resistance operon trans-acting repressor ArsD proteins. ArsD is a trans-acting repressor of the arsRDABC operon that confers resistance to arsenicals and antimonials in Escherichia coli. It possesses two-pairs of vicinal cysteine residues, Cys(12)-Cys(13) and Cys(112)-Cys(113), that potentially form separate binding sites for the metalloids that trigger dissociation of ArsD from the operon. 
However, as a homodimer it has four vicinal cysteine pairs .; GO: 0003677 DNA binding, 0016564 transcription repressor activity, 0016481 negative regulation of transcription, 0046685 response to arsenic Probab=28.16 E-value=8.2 Score=15.36 Aligned_cols=43 Identities=28% Similarity=0.273 Sum_probs=26.8 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEECCCCHHHHH--HHHHHHHHH Q ss_conf 776789999988999999975203677885032166477899--999999984 Q T0529 94 LTSDDLLILAADLEKLKSKVIRTERPLSAGVYMGNLSSQQLD--QRRALLNMI 144 (569) Q Consensus 94 LskdeLm~LasDLeKLk~Kv~rtEr~~~~gvY~GNLt~~QL~--~Rs~iL~~v 144 (569) =-.+||+.+|+|++.||++= .-|..=||+++-.. ....+.+++ T Consensus 21 ~vD~~Lv~~sa~~~~lk~~g--------v~v~RyNLa~~P~aF~~N~~V~~~L 65 (123) T PF06953_consen 21 SVDPELVRFSADLEWLKKQG--------VEVERYNLAQQPQAFVENPAVNQFL 65 (123) T ss_pred CCCHHHHHHHHHHHHHHHCC--------CEEEEECCCCCHHHHHHHHHHHHHH T ss_conf 87989999999999999779--------6289852536989998679999999 No 19 >PF10757 YbaJ: Biofilm formation regulator YbaJ Probab=25.90 E-value=14 Score=13.76 Aligned_cols=33 Identities=27% Similarity=0.544 Sum_probs=28.0 Q ss_pred CCHHHHHHHHHHCCEEEEECCCCHHHHHHHHHC Q ss_conf 536789998752120034417886799988521 Q T0529 193 VDLNDAVQALTDLGLIYTAKYPNTSDLDRLTQS 225 (569) Q Consensus 193 e~lndVVQ~Lt~LGLlYT~KyPNl~DLekLt~~ 225 (569) -+||+.+.-..+..+.|-.|||+-++|--+.++ T Consensus 46 lqLNeLIEHIa~f~~~fKIKYp~~~~l~~lvee 78 (122) T PF10757_consen 46 LQLNELIEHIAAFVWSFKIKYPDESDLIELVEE 78 (122) T ss_pred HHHHHHHHHHHHHHHHHEECCCCHHHHHHHHHH T ss_conf 109999999999898740026857579999999 No 20 >PF11715 Nup160: Nucleoporin Nup120/160 Probab=24.37 E-value=5.3 Score=16.73 Aligned_cols=31 Identities=29% Similarity=0.393 Sum_probs=22.3 Q ss_pred HHHHCCCCCHHHHHHHHHHCCEEEEECCCCH Q ss_conf 8644056536789998752120034417886 Q T0529 186 CLTKQGQVDLNDAVQALTDLGLIYTAKYPNT 216 (569) Q Consensus 186 CMt~Qgge~lndVVQ~Lt~LGLlYT~KyPNl 216 (569) |-+.+-.+.-.-+|-+.|.=+.+||.+.|.- T Consensus 89 ~v~~~~~~~~~~~v~vit~s~~~~~l~l~~~ 119 (547) T PF11715_consen 89 
CVSFSEQEDHVLIVFVITQSGHLYTLRLPLD 119 (547) T ss_pred CEEEEECCCCEEEEEEECCCCEEEEEEECCC T ss_conf 1467855886799999705758999992540 No 21 >PF11041 DUF2612: Protein of unknown function (DUF2612) Probab=24.05 E-value=15 Score=13.52 Aligned_cols=40 Identities=23% Similarity=0.238 Sum_probs=28.9 Q ss_pred CHHHHHHHHHC--CEEECCCCCEEEEEECHHHHHHHHHHHHH Q ss_conf 21366665414--00110666035676352128899999999 Q T0529 245 NFSLGAAVKAG--ACMLDGGNMLETIKVSPQTMDGILKSILK 284 (569) Q Consensus 245 NlSLsAAVKAG--A~liDGGNMLETIkv~p~~f~~iiK~~L~ 284 (569) |-.|+...-.+ +..+|++||--++.|....++++.+.+++ T Consensus 128 ~~~l~~lf~~~~~~~v~D~~DMs~~~~v~~~~~~~~~~~li~ 169 (187) T PF11041_consen 128 NAALRYLFGDEGKVFVIDNGDMSMMIYVFGKRPSPIEKALIE 169 (187) T ss_pred HHHHHHHHCCCCEEEEEECCCCEEEEEEECCCCCHHHHHHHH T ss_conf 999999858996089982899679999976779989999997 No 22 >PF03410 Peptidase_M44: Protein G1; InterPro: IPR005072 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families.
Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases belongs to MEROPS peptidase family M44 (clan ME). The active site residues for members of this family and family M16 occur in the motif HXXEH.
The type example is the vaccinia virus-type metalloendopeptidase G1 from vaccinia virus, it is a metalloendopeptidase expressed by many Poxviridae which appears to play a role in the maturation of viral proteins.; GO: 0004222 metalloendopeptidase activity, 0008270 zinc ion binding, 0019067 viral assembly, maturation, egress, and release Probab=23.89 E-value=15 Score=13.50 Aligned_cols=180 Identities=20% Similarity=0.265 Sum_probs=96.6 Q ss_pred HHHHHHHCCHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEECCCC Q ss_conf 99764320115688998889999842176678999999854057865789998888998777753202455268725777 Q T0529 16 LRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNVQRLMRKERRDDNDLKRLRDLNQAVNNLVELKSTQQKSILRVGTLT 95 (569) Q Consensus 16 LRR~Ls~~t~~vK~dVl~Da~~ll~gLDF~~Va~VQR~mRk~KR~D~DL~~LRDlNkeVd~Lm~mkS~Qk~~~lkvG~Ls 95 (569) -|-=.|-||.+.|+.---||---+-+-=|.. .--|++=.+.++|+--||..|---.|. -++. T Consensus 62 aRsYMSFWC~si~g~~~~DAvrtliSWFF~~---------~~Lr~~F~~~~ik~~ikELENEYYFRn----EvfH----- 123 (590) T PF03410_consen 62 ARSYMSFWCKSIRGRTYIDAVRTLISWFFDK---------GKLRTDFSRSKIKNHIKELENEYYFRN----EVFH----- 123 (590) T ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHHHCC---------CCCCCCCCHHHHHHHHHHHHHHHHHHH----HHHH----- T ss_conf 3445656547503997678999999982148---------963025646559999998743241232----4677----- Q ss_pred HHHHHHHHHHHHHHHHHHHHCCCCCCCCEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCEEEEEECCCH--HHHH Q ss_conf 67899999889999999752036778850321664778999999999843676556777788875899735764--6776 Q T0529 96 SDDLLILAADLEKLKSKVIRTERPLSAGVYMGNLSSQQLDQRRALLNMIGMSGGNQGARAGRDGVVRVWDVKNA--ELLN 173 (569) Q Consensus 96 kdeLm~LasDLeKLk~Kv~rtEr~~~~gvY~GNLt~~QL~~Rs~iL~~vG~~~~~~~~~~~~~GvVrvWDvkd~--sll~ 173 (569) -|+.-+=| +++-.|-|.- -+-|++--++-.+++=. -..-.|.|-|+=|-...+. 
++|+ T Consensus 124 ---CmDvLtfL-------------~gGDLYNGGR-i~mL~~l~~i~~~L~~R---M~~i~GpniVIFvk~l~~~~l~lL~ 183 (590) T PF03410_consen 124 ---CMDVLTFL-------------GGGDLYNGGR-ISMLNNLPDIRNMLSNR---MRRIIGPNIVIFVKELNDNTLSLLN 183 (590) T ss_pred ---HHHHHHHH-------------CCCCCCCCCH-HHHHHHHHHHHHHHHHH---HHHHCCCCEEEEEEECCHHHHHHHH T ss_conf ---88898885-------------3886557723-68876257899999998---8861089689998507888999999 Q ss_pred HHCCCCHH--HHHHHHHHCC-------------------CCCHHHHHHHH--HH-----------CCEEEEECCCCHHHH Q ss_conf 42066046--7888864405-------------------65367899987--52-----------120034417886799 Q T0529 174 NQFGTMPS--LTLACLTKQG-------------------QVDLNDAVQAL--TD-----------LGLIYTAKYPNTSDL 219 (569) Q Consensus 174 NQFGsmPa--lTiaCMt~Qg-------------------ge~lndVVQ~L--t~-----------LGLlYT~KyPNl~DL 219 (569) |-||+.|+ +||.|...+. ..+|+++.--+ .. =-|--|..|-+-+|- T Consensus 184 ~TFG~LP~~P~~Ip~~~~~~i~gKivMmpsPFYtvmv~V~~tl~NiLai~cL~E~YhLiDYETvg~~LYvtiSFv~E~dY 263 (590) T PF03410_consen 184 NTFGTLPSCPLTIPPTAFSSIGGKIVMMPSPFYTVMVRVDNTLDNILAIICLYETYHLIDYETVGDDLYVTISFVDEDDY 263 (590) T ss_pred HHCCCCCCCCCCCCCCCCCCCCCEEEEECCCCEEEEEECCCCHHHHHHHHHHHHHHHHCCCEEECCEEEEEEEEECHHHH T ss_conf 76278878962026777466687189805873589997473088899999999887200040106717999998354778 Q ss_pred HHHHHCCCEEECCC Q ss_conf 98852187111014 Q T0529 220 DRLTQSHPILNMID 233 (569) Q Consensus 220 ekLt~~Hp~L~iIt 233 (569) |.+-..-.-|++.. T Consensus 264 e~fl~gi~~l~f~~ 277 (590) T PF03410_consen 264 ESFLRGIKDLDFEI 277 (590) T ss_pred HHHHCCCCCCCCCC T ss_conf 99864633145554 No 23 >PF06780 Erp_C: Erp protein C-terminus; InterPro: IPR009618 This entry represents the C terminus of bacterial Erp proteins that seem to be specific to Borrelia burgdorferi (a causative agent of Lyme disease). Borrelia Erp proteins are particularly heterogeneous, which might enable them to interact with a wide variety of host components . 
Probab=21.31 E-value=16 Score=13.15 Aligned_cols=48 Identities=21% Similarity=0.290 Sum_probs=27.6 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCEEEECCCCHHHHHHHHHHHHHHHHHHHHC Q ss_conf 9998888998777753202455268725777678999998899999997520 Q T0529 65 KRLRDLNQAVNNLVELKSTQQKSILRVGTLTSDDLLILAADLEKLKSKVIRT 116 (569) Q Consensus 65 ~~LRDlNkeVd~Lm~mkS~Qk~~~lkvG~LskdeLm~LasDLeKLk~Kv~rt 116 (569) ++|...|+.+ ..-.-.|-++..+|... +.+|-.|.+.|++||....-+ T Consensus 81 tKl~e~n~~~---~~~~~~~lK~~V~vseI-k~DLekLKs~LeevKeYL~~~ 128 (141) T PF06780_consen 81 TKLNEGNKPY---TKDNEPKLKENVKVSEI-KSDLEKLKSKLEEVKEYLEDS 128 (141) T ss_pred HHHHHHCCCC---CCCCCCCCCCCCCHHHH-HHHHHHHHHHHHHHHHHHHCH T ss_conf 9998304101---45777320135449999-958999999999999997170 No 24 >PF05677 DUF818: Chlamydia CHLPS protein (DUF818); InterPro: IPR008536 This family consists of several Chlamydia and Parachlamydia proteins, the function of which are unknown. Probab=20.87 E-value=13 Score=13.78 Aligned_cols=100 Identities=21% Similarity=0.198 Sum_probs=62.8 Q ss_pred CEEECCCCCC---CCEEEEEECHHHHHHHHHCC---EEECCCCCEEEEEECHHHHHHHHHHHHHHHHHCCCEECCCC-CC Q ss_conf 7111014654---30010000213666654140---01106660356763521288999999998866064211787-78 Q T0529 227 PILNMIDTKK---SSLNISGYNFSLGAAVKAGA---CMLDGGNMLETIKVSPQTMDGILKSILKVKKALGMFISDTP-GE 299 (569) Q Consensus 227 p~L~iIt~~~---S~iNISGYNlSLsAAVKAGA---~liDGGNMLETIkv~p~~f~~iiK~~L~vK~~e~MFv~~~P-G~ 299 (569) -|+++.+++. 
-+-+|--|-.||..+|-|-| +..+|++-..-+-|....|+++=+..-+.-++.++|+---- =+ T Consensus 200 a~~~YL~d~~~~p~~~~IIlYG~SLGg~Vaa~~l~~~~~~~~d~~~~~~VkDRtfsSl~~~A~~~~~kig~l~~~l~~W~ 279 (365) T PF05677_consen 200 AVYNYLRDQLNGPKAKQIILYGYSLGGGVAAEALSKEQLKGSDGTRWFLVKDRTFSSLSDIAEQFFGKIGSLIIKLFGWN 279 (365) T ss_pred HHHHHHHHHHCCCCCCEEEEEECCHHHHHHHHHHHHHHHHCCCCCEEEEEECCCCCCHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 99999985023799647999930620899999998654302676104899647855268999999868988999986466 Q ss_pred CCCHHHHHHHHEECCCCCCCEEEC-CEEEE Q ss_conf 884454450110157986300012-40100 Q T0529 300 RNPYENILYKICLSGDGWPYIASR-TSITG 328 (569) Q Consensus 300 RNPYENlLYKlCLSGeGWPYI~SR-SqI~G 328 (569) -|- --.++=|-.-|=|+|-+++ +++.| T Consensus 280 ins--~k~s~~l~cpeifi~~~~~~~~~i~ 307 (365) T PF05677_consen 280 INS--EKSSKSLQCPEIFIYGADSRSSLIG 307 (365) T ss_pred CCC--HHHCCCCCCCEEEEECCCCCCCCCC T ss_conf 542--0001357798799853666766012 No 25 >PF04609 MCR_C: Methyl-coenzyme M reductase operon protein C; InterPro: IPR007687 Methyl-coenzyme M reductase (MCR) catalyses the reduction of methyl-coenzyme M (CH3-SCoM) and coenzyme B (HS-CoB) to methane and the corresponding heterosulphide CoM-S-S-CoB (2.8.4.1 from EC), the final step in methane biosynthesis. This reaction proceeds under anaerobic conditions by methanogenic Archaea , and requires a nickel-porphinoid prosthetic group, coenzyme F430, which is in the EPR-detectable Ni(I) oxidation state in the active enzyme. Studies on a catalytically inactive enzyme aerobically co-crystallized with coenzyme M displayed a fully occupied coenzyme M-binding site with no alternate conformations. The binding of coenzyme M appears to induce specific conformational changes that suggests a molecular mechanism by which the enzyme ensures that methyl-coenzyme M enters the substrate channel prior to coenzyme B, as required by the active-site geometry . 
MCR is a hexamer composed of 2 alpha, 2 beta, and 2 gamma subunits with two identical nickel porphinoid active sites, which form two long active site channels with F430 embedded at the bottom , . Genes encoding the beta (mcrB) and gamma (mcrG) subunits of MCR are separated by two open reading frames coding for two proteins C and D , . The function of proteins C and D is unknown. This entry represents protein C.; GO: 0003824 catalytic activity, 0015948 methanogenesis Probab=20.40 E-value=17 Score=13.02 Aligned_cols=14 Identities=21% Similarity=0.245 Sum_probs=7.2 Q ss_pred CCCCCCHHHHHHHH Q ss_conf 55678888999999 Q T0529 361 FTAGLTYSQLMTLK 374 (569) Q Consensus 361 ~~~~Lt~~qe~~ik 374 (569) ..++|++.--.++. T Consensus 115 ~~~gl~~~E~~~I~ 128 (268) T PF04609_consen 115 RIFGLTPEEIEQIN 128 (268) T ss_pred CCCCCCHHHHHHHH T ss_conf 76477999999874 No 26 >PF02426 MIase: Muconolactone delta-isomerase; InterPro: IPR003464 This small enzyme forms a homodecameric complex, that catalyses the third step in the catabolism of catechol to succinate- and acetyl-coa in the beta-ketoadipate pathway (5.3.3.4 from EC). The protein has a ferredoxin-like fold according to SCOP.; GO: 0006725 aromatic compound metabolic process Probab=20.10 E-value=17 Score=12.97 Aligned_cols=29 Identities=17% Similarity=0.442 Sum_probs=23.5 Q ss_pred CCCCCCEEEEEECCCHHHHHHHCCCCHHH Q ss_conf 77888758997357646776420660467 Q T0529 154 RAGRDGVVRVWDVKNAELLNNQFGTMPSL 182 (569) Q Consensus 154 ~~~~~GvVrvWDvkd~sll~NQFGsmPal 182 (569) .+|.-+++-||||.|..-|-..+.+.|-- T Consensus 46 v~G~~~n~sIfdv~s~~eLh~iL~~LPL~ 74 (91) T PF02426_consen 46 VVGEYANISIFDVESNDELHEILSSLPLF 74 (91) T ss_pred ECCCCCEEEEEECCCHHHHHHHHHHCCCC T ss_conf 66875068999759879999999838996 No 27 >PF05415 Peptidase_C36: Beet necrotic yellow vein furovirus-type papain-like endopeptidase; InterPro: IPR008746 Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. 
Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases corresponds to MEROPS peptidase family C36 (clan CA). The type example is beet necrotic yellow vein furovirus-type papain-like endopeptidase (beet necrotic yellow vein virus), which is involved in processing the viral polyprotein.
Probab=20.01 E-value=17 Score=13.03 Aligned_cols=16 Identities=31% Similarity=0.536 Sum_probs=11.3 Q ss_pred HHHHHHHHHHHCCCCC Q ss_conf 8999998666437657 Q T0529 531 LMDCIMFDAAVSGGLN 546 (569) Q Consensus 531 LlDCiMF~a~~~G~~~ 546 (569) --||+||--++.-+.. T Consensus 51 W~DC~mFA~~LkVsm~ 66 (104) T PF05415_consen 51 WDDCRMFADALKVSMQ 66 (104) T ss_pred HHHHHHHHHHHEEEEE T ss_conf 8999999874445899 Done!