Citrus sinensis ID: 000545


Local Sequence Feature Prediction

Prediction and (Method) Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-------240-------250-------260-------270-------280-------290-------300-------310-------320-------330-------340-------350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520-------530-------540-------550-------560-------570-------580-------590-------600-------610-------620-------630-------640-------650-------660-------670-------680-------690-------700-------710-------720-------730-------740-------750-------760-------770-------780-------790-------800-------810-------820-------830-------840-------850-------860-------870-------880-------890-------900-------910-------920-------930-------940-------950-------960-------970-------980-------990------1000------1010------1020------1030------1040------1050------1060------1070------1080------1090------1100------1110------1120------1130------1140------1150------1160------1170------1180------1190------1200------1210------1220------1230------1240------1250------1260------1270------1280------1290------1300------1310------1320------1330------1340------1350------1360------1370------1380------1390------1400------1410------1420------1430--
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
cccEEEEEcccccEEEEEEEEEEEcccccccccccccccccccccccccccccccccEEEEEEcEEEEEEEEEcccccccccccccccccEEEcccccccEEEEEEEEEEEEEEEEEEEEcccccccccccEEEEEEccccEEEEEEEcccccEEEEEEEEEcccccccccccccccccccEEEEcccccEEEEEEEccEEEEEEccccccccccccccccccccccccccccEEEEcccccccEEEEEEEEccccccEEEEEEEccccccccccccccEEEEEEEEEEcccccccEEEEcccccccccEEEEEcccccEEEEEEccEEEEEccccccEEEccccccccccccccccEEEEEEEEEEEEEEEEccEEEEEEccccEEEEEEEEcccEEEEEEEEEEccccccccEEEEcccEEEEEEEcccEEEEEEEEcccccccccccccccccccccccccHHccccccccccccccccccccccccccccccccccEEEEEEcccccccccEEEcccccccccccccccccccccEEEEcccccEEEEEEEccccccccccccccccccccccEEEEEEcccEEEEEEcccEEEEEEcccccccccEEEEEEEccccEEEEEEcccEEEEEccccEEEcccccccccccccccccEEEEEEEcccEEEEEEEcccEEEEEEcccccEEEEEcccccccccccEEEEEEEccccccccccccccccccccccccccccccccccccccEEEEEEEccccEEEEEccccEEEEEEEEEccccEEEEEcccccccccccccccccccccccccccccccccEEEEEEcccccccccccEEEEEEEccEEEEEEEccccccccccccccccccccccEEEcEEEcEEccccccccccccccccccccccccEEEEEEEcccccEEEEEcccccEEEEEEcccEEEEEccccccEEEEEEEcccccccEEEEEEcccEEEEEEccccccccccccEEEEEEccccccEEEEEccccEEEEEEEEccccccccHHHHccccccccccccccccccccccccccccEEEEEEEccccccccEEEEEEEccccccEEEEEEEEEEccccccccEEEEEEEEEEccccccccccEEEEEEEEEccccccEEEEEEEEEccccccEEccccccEEEEEccEEEEEEccccccccEEEEccccEEEEEEEEEccEEEEEEccccEEEEEEEccccEEEEEEcccccccccEEEEEEcccEEEEEEEEccccEEEEEEcccccccccccccccEEEEcccccEEEEEEEEEEEcccccccccccccccccEEEEEEEccccEEEEEEccHHHHHHHHHHHHHHHHccccccccccccccccccccccccccccccEEHHHHHHHccccHHHHHHHHHHHcccHHHHHHHHHHHHHccccc
ccHHHHHHccccccHHHHHEEEEcccHHcccccccccccccccccccccccccccccEEEEEccEEEEEEEEEccccccccEcccccccccccccccccEEEEEEEEEEEEEEEEEEEEEEccccccccccEEEEEEcccEEEEEEEcccccccEEEEEEEccccccccccccccccccccEEEEcccccEEEEEEcccEEEEEEEccccHHHcccccccccccccccccccEEEEEcHHccccccEEEHHcccccccEEEEEEcccccccccccccccEEEEEEEEEEcccccccEEEEEccccccHcEEEEccccccEEEEEEEcEEEEEcccccccEEEccccccccccccccccccEEEEEccEEEEccccEEEEEEccccEEEEEEEEcccEEEEEEEEEccccccccEEEEEcccEEEEEEcccccEEEEEEcccccccccccHcHccccccccccHHcccccccccccccccccccccHccccHcccccccccEEEEEEcccccccccccEEEcccccHcccccccccccccEEEEccccccEEEEEEcccccccccccccccccccccEEEEEEEcccEEEEEEcccEEEEcccccccccccEEEEEEcccccEEEEEEcccEEEEcccccEEEEcccccccccccccccccEEEEEEcccEEEEEEEccEEEEEEEcccccEEEEcccccccccccccEEEEEEEcccccccccccccccccccccccccccccccccccccEEEEEEEcccEEEEEEccccEEEEEEccccccccEEEcccccccccccccccccccccccccccccccccccEEEEEEEcccccccccEEEEEEccccEEEEEEEEcccccccccccccccEEEEEcccccEEcccccccccccccccccccccccccccEEEEEEccccccEEEEEEccccEEEEEEccccccccccccccEEEEcccccccccccEEEEEcccEEEEEEccccccccccccEEEEEEccccccEEEEEccccEEEEEEEccccccccHHHHcccccHcccccccHHHcccHccccccccccEEEEEccccccccEEEEEEEEcccccEEEEEEEEEEEccccccccEEEEEEEEEccccccccccEEEEEEEEEEccccccccEEEEEccccccEEEEEccccEEEEEEccEEEEEEccccccEEEEEEccccEEEEEEEEcccEEEEEEcccEEEEEEEcccccEEEEEEccccccEEEEEEEEcccccEEEEEEcccccEEEEEEcccccccccccEEEEEEEEEccccccEEEEEcccccccccccccccccccccEEEEEEEccccEEEEEcccHHHHHHHHHHHHHHHHcccccccccHHHHHccccccccccccccccccHHHHHHHHHccHHHHHHHHHHHcccHHHHHHHHHHHHHHHccc
msfaaykmmhwptgiancgsgfithsradyvpqipliqteeldselpskrgigpvpnlVVTAANVIEIYVVRVQeegskesknsgeTKRRVLMDGISAASLELVCHYRLHGNVESLAILSqggadnsrrrDSIILAFEDAKisvlefddsihglritsmhcfespewlhlkrgresfargplvkvdpqgrcggvLVYGLQMIILKAsqggsglvgdedtfgsgggfsariESSHvinlrdldmkHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSIsttlkqhpliwsamnlphdaykllavpspiggvLVVGANTIHYHSQSASCALALNNYAVsldssqelprssfsVELDAAHATWLQNDVALLSTKTGDLVLLTVVYdgrvvqrldlsktnpsvltsdittIGNSLFFLGSRLGDSLLVQFtcgsgtsmlssglkeefgdieadapstkrlrrSSSDALQdmvngeelslygsasnntesAQKTFSFAVRDSlvnigplkdfsyglrinadasatgiskqsnyelvelpgckgiwtvyhkssrghnadssrmaayddEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGArildgsymtqdlsfgpsnsesgsgsenstVLSVSIAdpyvllgmsdgsirllvgdpstctvsvqtpaaiesskkpvssctlyhdkgpepwlrktstdawlstgvgeaidgadggpldqgdiysVVCYesgaleifdvpnfncvftvdkfvsgrtHIVDTYMREALKDSeteinssseegtgqgrkeNIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYlfegpentsksddpvstsrslsvsnvsasrlrnlrfsrtpldaytreetphgapcqritifknisghqgfflsgsrpcwcmVFRErlrvhpqlcdgsIVAFTVLHNVNCNHGFIYVTSQGILkicqlpsgstydnywpvqkviplkatphqityfaeknlyplivsvpvlkpLNQVLSLLIDqevghqidnhnlssvdlhrtytvEEYEVrilepdraggpwqtratipmqssenALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFstgrnadnpqnLVTEVYSKELKGAISALASLQGHLliasgpkiilhkwtgtelngiafydapplyvVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKdfgsldcfateflidgstlslvvsdeqKNIQIFyyapkmseswkgqkllsraefhvgAHVTKFLRLQMLAtssdrtgaapgsdktNRFALLFGtldgsigciapldeLTFRRLQSLQKKLVdsvphvaglnprsfrqfhsngkahrpgpdsivdcellshyemlplEEQLEIAHQTGTTRSQILSNlndlalgtsfl
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVrvqeegskesknsgetkrrVLMDGISAASLELVCHYRLHGNVESLAIlsqggadnsrRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFArgplvkvdpqgRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGrvvqrldlsktnpsvltsdITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKeefgdieadapstkrlrrsssDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADAsatgiskqsnyELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTiaagnlfgrrrVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQtpaaiesskkpvssCTLYHDkgpepwlrkTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMRealkdseteinssseegtgqgrkenIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPEntsksddpvstsrslsvsnvsasrlrnlrfsrtpldaytreetphgapCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGhqidnhnlssvdlhrtYTVEEYEVRilepdraggpwqtratipmqssenaLTVRVVTLFNtttkenetlLAIGTAYVQGEDVAARGRVLLFstgrnadnpqnLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSsdrtgaapgsdKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQIlsnlndlalgtsfl
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQeegskesknsgeTKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPsnsesgsgsensTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPvstsrslsvsnvsasrlrnLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
***AAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEE********RGIGPVPNLVVTAANVIEIYVVRV*****************VLMDGISAASLELVCHYRLHGNVESLAILSQGGA****RRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSL**********FSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSG*********************************************************TFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHK***********MAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMT********************VLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQ**************CTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMR****************************MKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFE************************************************GAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPM***ENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLA***************NRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGL*********************IVDCELLSHYEMLPLEEQLEIAH***********************
**FAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEEL********GIGPVPNLVVTAANVIEIYVVRV********************DGISAASLELVCHYRLHGNVESLAILSQG*A**SRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEW*********FARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSG****************RIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGS************************RLRRSSSDALQDMVNGEEL****************FSFAVRDSLVNIGPLKDFSYGLRIN*************YELVELPGCKGIWTVYH***************YDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQ***AIESSKKPVSSCTLYHDKGPEPWLRKTSTDAW*****************DQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALK************************MKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFE***********************SASRLRN***********************RITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKP********************NLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYA************LSRAEFHVGAHVTKFLRLQML****************NRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQE************KRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADA************ALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLS****************VLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPA*************LYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALK******************KENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGP*********************SASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATS*********SDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
*SFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKESK*SG*TKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSG*V***********FSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCG*********************************ALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSS**************DEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLR**STDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEIN********QGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLR*******************GAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQ*VGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATS***********KTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
no confident homologs detected
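
The per-residue tracks above are easier to read once collapsed into runs. A minimal sketch, assuming each track is a string with exactly one character per residue and that positions are 1-based as in the residue number marker; the function name `track_segments` is hypothetical, and the page itself does not define what each track character means:

```python
from itertools import groupby


def track_segments(track):
    """Collapse a per-residue label string into (label, start, end) runs.

    Positions are 1-based and inclusive, matching the residue number marker.
    """
    segments = []
    pos = 1
    for label, run in groupby(track):
        length = sum(1 for _ in run)
        segments.append((label, pos, pos + length - 1))
        pos += length
    return segments


# Toy track, not taken from the data above.
print(track_segments("oooohhhhiii"))
# [('o', 1, 4), ('h', 5, 8), ('i', 9, 11)]
```

Applied to one of the transmembrane tracks, filtering the result for a single label would list the start and end residue of each predicted segment with that label.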

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   1432    2.2.26 [Sep-21-2011]
Q9FGR0  1442    Cleavage and polyadenylat  yes       no        0.986    0.979    0.747     0.0
Q7XWP1  1441    Probable cleavage and pol  yes       no        0.977    0.971    0.640     0.0
Q9V726  1455    Cleavage and polyadenylat  yes       no        0.905    0.890    0.283     1e-144
Q10569  1444    Cleavage and polyadenylat  yes       no        0.444    0.440    0.317     3e-90
Q9EPU4  1441    Cleavage and polyadenylat  yes       no        0.444    0.441    0.318     3e-90
Q10570  1443    Cleavage and polyadenylat  yes       no        0.442    0.439    0.313     9e-89
Q7SEY2  1456    Protein cft-1 OS=Neurospo  N/A       no        0.858    0.844    0.242     9e-78
O74733  1441    Protein cft1 OS=Schizosac  yes       no        0.864    0.859    0.233     5e-76
Q2TZ19  1393    Protein cft1 OS=Aspergill  yes       no        0.833    0.856    0.240     3e-75
Q5BDG7  1339    Protein cft1 OS=Emericell  yes       no        0.853    0.912    0.241     3e-72
>sp|Q9FGR0|CPSF1_ARATH Cleavage and polyadenylation specificity factor subunit 1 OS=Arabidopsis thaliana GN=CPSF160 PE=1 SV=2
 Score = 2245 bits (5817), Expect = 0.0,   Method: Compositional matrix adjust.
 Identities = 1092/1461 (74%), Positives = 1254/1461 (85%), Gaps = 48/1461 (3%)
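
The percentages on the line above can be checked from the raw counts. A minimal sketch; the helper name `blast_percent` is hypothetical, and the truncation to a whole percent is an assumption inferred from the numbers shown (1254/1461 is about 85.8% but is reported as 85%), not something the report states:

```python
def blast_percent(count, alignment_length):
    """Whole-number percentage, truncated rather than rounded (assumed)."""
    return (100 * count) // alignment_length


alignment_length = 1461  # aligned columns, including gap columns
identities, positives, gaps = 1092, 1254, 48

print(blast_percent(identities, alignment_length))  # 74
print(blast_percent(positives, alignment_length))   # 85
print(blast_percent(gaps, alignment_length))        # 3

# Identity as a fraction of the alignment length.
print(round(identities / alignment_length, 3))      # 0.747
```

The last value agrees with the Identity entry listed for Q9FGR0 in the close-homolog table, which suggests that column is identities divided by alignment length.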

Query: 1    MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQT-EELDSELPS-KRGIGPVPNL 58
            MSFAAYKMMHWPTG+ NC SG+ITHS +D   QIP++   +++++E P+ KRGIGP+PN+
Sbjct: 1    MSFAAYKMMHWPTGVENCASGYITHSLSDSTLQIPIVSVHDDIEAEWPNPKRGIGPLPNV 60

Query: 59   VVTAANVIEIYVVRVQEEG-SKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLA 117
            V+TAAN++E+Y+VR QEEG ++E +N    KR  +MDG+   SLELVCHYRLHGNVES+A
Sbjct: 61   VITAANILEVYIVRAQEEGNTQELRNPKLAKRGGVMDGVYGVSLELVCHYRLHGNVESIA 120

Query: 118  ILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESF 177
            +L  GG ++S+ RDSIIL F DAKISVLEFDDSIH LR+TSMHCFE P+WLHLKRGRESF
Sbjct: 121  VLPMGGGNSSKGRDSIILTFRDAKISVLEFDDSIHSLRMTSMHCFEGPDWLHLKRGRESF 180

Query: 178  ARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVIN 237
             RGPLVKVDPQGRCGGVLVYGLQMIILK SQ GSGLVGD+D F SGG  SAR+ESS++IN
Sbjct: 181  PRGPLVKVDPQGRCGGVLVYGLQMIILKTSQVGSGLVGDDDAFSSGGTVSARVESSYIIN 240

Query: 238  LRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPL 297
            LRDL+MKHVKDF+F+HGYIEPV+VIL E E TWAGRVSWKHHTC++SALSI++TLKQHP+
Sbjct: 241  LRDLEMKHVKDFVFLHGYIEPVIVILQEEEHTWAGRVSWKHHTCVLSALSINSTLKQHPV 300

Query: 298  IWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPR 357
            IWSA+NLPHDAYKLLAVPSPIGGVLV+ ANTIHYHSQSASCALALNNYA S DSSQELP 
Sbjct: 301  IWSAINLPHDAYKLLAVPSPIGGVLVLCANTIHYHSQSASCALALNNYASSADSSQELPA 360

Query: 358  SSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITT 417
            S+FSVELDAAH TW+ NDVALLSTK+G+L+LLT++YDGR VQRLDLSK+  SVL SDIT+
Sbjct: 361  SNFSVELDAAHGTWISNDVALLSTKSGELLLLTLIYDGRAVQRLDLSKSKASVLASDITS 420

Query: 418  IGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQD 477
            +GNSLFFLGSRLGDSLLVQF+C SG +    GL++E  DIE +    KRLR  +SD  QD
Sbjct: 421  VGNSLFFLGSRLGDSLLVQFSCRSGPAASLPGLRDEDEDIEGEGHQAKRLRM-TSDTFQD 479

Query: 478  MVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQS 537
             +  EELSL+GS  NN++SAQK+FSFAVRDSLVN+GP+KDF+YGLRINADA+ATG+SKQS
Sbjct: 480  TIGNEELSLFGSTPNNSDSAQKSFSFAVRDSLVNVGPVKDFAYGLRINADANATGVSKQS 539

Query: 538  NYEL--------------------------VELPGCKGIWTVYHKSSRGHNADSSRMAAY 571
            NYEL                          VELPGCKGIWTVYHKSSRGHNADSS+MAA 
Sbjct: 540  NYELVCCSGHGKNGALCVLRQSIRPEMITEVELPGCKGIWTVYHKSSRGHNADSSKMAAD 599

Query: 572  DDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGA 631
            +DEYHAYLIISLEARTMVLETADLLTEVTESVDY+VQGRTIAAGNLFGRRRVIQVFE GA
Sbjct: 600  EDEYHAYLIISLEARTMVLETADLLTEVTESVDYYVQGRTIAAGNLFGRRRVIQVFEHGA 659

Query: 632  RILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCT 691
            RILDGS+M Q+LSFG SNSES SGSE+STV SVSIADPYVLL M+D SIRLLVGDPSTCT
Sbjct: 660  RILDGSFMNQELSFGASNSESNSGSESSTVSSVSIADPYVLLRMTDDSIRLLVGDPSTCT 719

Query: 692  VSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGD 751
            VS+ +P+ +E SK+ +S+CTLYHDKGPEPWLRK STDAWLS+GVGEA+D  DGGP DQGD
Sbjct: 720  VSISSPSVLEGSKRKISACTLYHDKGPEPWLRKASTDAWLSSGVGEAVDSVDGGPQDQGD 779

Query: 752  IYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGT 811
            IY VVCYESGALEIFDVP+FNCVF+VDKF SGR H+ D  + E     E E+N +SE+ T
Sbjct: 780  IYCVVCYESGALEIFDVPSFNCVFSVDKFASGRRHLSDMPIHEL----EYELNKNSEDNT 835

Query: 812  GQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPV 871
                 + I + +VVELAMQRWS HH+RPFLFA+L DGTILCY AYLF+G ++T K+++ +
Sbjct: 836  S---SKEIKNTRVVELAMQRWSGHHTRPFLFAVLADGTILCYHAYLFDGVDST-KAENSL 891

Query: 872  STSRSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGS 931
            S+    ++++  +S+LRNL+F R PLD  TRE T  G   QRIT+FKNISGHQGFFLSGS
Sbjct: 892  SSENPAALNSSGSSKLRNLKFLRIPLDTSTREGTSDGVASQRITMFKNISGHQGFFLSGS 951

Query: 932  RPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDN 991
            RP WCM+FRERLR H QLCDGSI AFTVLHNVNCNHGFIYVT+QG+LKICQLPS S YDN
Sbjct: 952  RPGWCMLFRERLRFHSQLCDGSIAAFTVLHNVNCNHGFIYVTAQGVLKICQLPSASIYDN 1011

Query: 992  YWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLS 1051
            YWPVQK IPLKATPHQ+TY+AEKNLYPLIVS PV KPLNQVLS L+DQE G Q+DNHN+S
Sbjct: 1012 YWPVQK-IPLKATPHQVTYYAEKNLYPLIVSYPVSKPLNQVLSSLVDQEAGQQLDNHNMS 1070

Query: 1052 SVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETL 1111
            S DL RTYTVEE+E++ILEP+R+GGPW+T+A IPMQ+SE+ALTVRVVTL N +T ENETL
Sbjct: 1071 SDDLQRTYTVEEFEIQILEPERSGGPWETKAKIPMQTSEHALTVRVVTLLNASTGENETL 1130

Query: 1112 LAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIA 1171
            LA+GTAYVQGEDVAARGRVLLFS G+N DN QN+VTEVYS+ELKGAISA+AS+QGHLLI+
Sbjct: 1131 LAVGTAYVQGEDVAARGRVLLFSFGKNGDNSQNVVTEVYSRELKGAISAVASIQGHLLIS 1190

Query: 1172 SGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLN 1231
            SGPKIILHKW GTELNG+AF+DAPPLYVVS+N+VK+FILLGD+HKSIYFLSWKEQG+QL+
Sbjct: 1191 SGPKIILHKWNGTELNGVAFFDAPPLYVVSMNVVKSFILLGDVHKSIYFLSWKEQGSQLS 1250

Query: 1232 LLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHV 1291
            LLAKDF SLDCFATEFLIDGSTLSL VSDEQKNIQ+FYYAPKM ESWKG KLLSRAEFHV
Sbjct: 1251 LLAKDFESLDCFATEFLIDGSTLSLAVSDEQKNIQVFYYAPKMIESWKGLKLLSRAEFHV 1310

Query: 1292 GAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSL 1351
            GAHV+KFLRLQM+++         G+DK NRFALLFGTLDGS GCIAPLDE+TFRRLQSL
Sbjct: 1311 GAHVSKFLRLQMVSS---------GADKINRFALLFGTLDGSFGCIAPLDEVTFRRLQSL 1361

Query: 1352 QKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQT 1411
            QKKLVD+VPHVAGLNP +FRQF S+GKA R GPDSIVDCELL HYEMLPLEEQLE+AHQ 
Sbjct: 1362 QKKLVDAVPHVAGLNPLAFRQFRSSGKARRSGPDSIVDCELLCHYEMLPLEEQLELAHQI 1421

Query: 1412 GTTRSQILSNLNDLALGTSFL 1432
            GTTR  IL +L DL++GTSFL
Sbjct: 1422 GTTRYSILKDLVDLSVGTSFL 1442
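
The identity counts reported for a pairwise alignment can be re-derived from the Query/Sbjct rows themselves. The following is a minimal sketch, assuming each row has the form `Label: start residues end` as in the block above; the function names `parse_alignment_line` and `count_identities` are hypothetical helpers introduced for illustration and are not part of any BLAST tool.

```python
import re

# Matches one pairwise row such as "Query: 1412 GTTR... 1432".
# Letters, gap dashes, and stop asterisks are accepted as alignment symbols.
LINE_RE = re.compile(r"^(Query|Sbjct):\s+(\d+)\s+([A-Za-z*\-]+)\s+(\d+)\s*$")


def parse_alignment_line(line):
    """Return (label, start, residues, end) for a Query:/Sbjct: row."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not an alignment line: {line!r}")
    label, start, residues, end = match.groups()
    return label, int(start), residues, int(end)


def count_identities(query_line, sbjct_line):
    """Count columns where both rows carry the same residue (gaps never count)."""
    _, _, q, _ = parse_alignment_line(query_line)
    _, _, s, _ = parse_alignment_line(sbjct_line)
    if len(q) != len(s):
        raise ValueError("aligned rows must have equal length")
    return sum(1 for a, b in zip(q, s) if a == b and a != "-")


# Final row pair of the alignment above.
query = "Query: 1412 GTTRSQILSNLNDLALGTSFL 1432"
sbjct = "Sbjct: 1422 GTTRYSILKDLVDLSVGTSFL 1442"
print(count_identities(query, sbjct), "identical columns out of", 21)
```

Summing this count over every row pair of an alignment should reproduce the numerator of its "Identities" line, provided masked (X) segments are treated the same way as in the report.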




CPSF plays a key role in pre-mRNA 3'-end formation, recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. This subunit is involved in the RNA recognition step of the polyadenylation reaction.
Arabidopsis thaliana (taxid: 3702)
>sp|Q7XWP1|CPSF1_ORYSJ Probable cleavage and polyadenylation specificity factor subunit 1 OS=Oryza sativa subsp. japonica GN=Os04g0252200 PE=3 SV=2
>sp|Q9V726|CPSF1_DROME Cleavage and polyadenylation specificity factor subunit 1 OS=Drosophila melanogaster GN=Cpsf160 PE=1 SV=1
>sp|Q10569|CPSF1_BOVIN Cleavage and polyadenylation specificity factor subunit 1 OS=Bos taurus GN=CPSF1 PE=1 SV=1
>sp|Q9EPU4|CPSF1_MOUSE Cleavage and polyadenylation specificity factor subunit 1 OS=Mus musculus GN=Cpsf1 PE=1 SV=1
>sp|Q10570|CPSF1_HUMAN Cleavage and polyadenylation specificity factor subunit 1 OS=Homo sapiens GN=CPSF1 PE=1 SV=2
>sp|Q7SEY2|CFT1_NEUCR Protein cft-1 OS=Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) GN=cft-1 PE=3 SV=2
>sp|O74733|CFT1_SCHPO Protein cft1 OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=cft1 PE=3 SV=1
>sp|Q2TZ19|CFT1_ASPOR Protein cft1 OS=Aspergillus oryzae (strain ATCC 42149 / RIB 40) GN=cft1 PE=3 SV=1
>sp|Q5BDG7|CFT1_EMENI Protein cft1 OS=Emericella nidulans (strain FGSC A4 / ATCC 38163 / CBS 112.46 / NRRL 194 / M139) GN=cft1 PE=3 SV=1
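
The Swiss-Prot hit headers above pack several fields into one line: database, accession, entry name, a free-text description, and the `OS=`, `GN=`, `PE=`, `SV=` tags. A minimal sketch of splitting such a header into a dictionary follows. The function name `parse_uniprot_header` is a hypothetical helper, and the sketch assumes the header contains only the `>db|accession|name description KEY=value ...` portion, with no trailing page text.

```python
import re


def parse_uniprot_header(header):
    """Split a '>db|ACC|NAME description OS=... GN=... PE=n SV=n' line into fields."""
    match = re.match(r"^>(\w+)\|([^|]+)\|(\S+)\s+(.*)$", header.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {header!r}")
    db, accession, entry_name, rest = match.groups()
    # Split before each KEY= tag; the first chunk is the free-text description.
    parts = re.split(r"\s+(?=(?:OS|GN|PE|SV)=)", rest)
    record = {
        "db": db,
        "accession": accession,
        "entry_name": entry_name,
        "description": parts[0],
    }
    for part in parts[1:]:
        key, _, value = part.partition("=")
        record[key] = value
    return record


example = (
    ">sp|Q10569|CPSF1_BOVIN Cleavage and polyadenylation specificity "
    "factor subunit 1 OS=Bos taurus GN=CPSF1 PE=1 SV=1"
)
for field, value in parse_uniprot_header(example).items():
    print(f"{field}: {value}")
```

Splitting on a lookahead keeps multi-word organism names such as strain descriptions intact, since only whitespace directly before a known tag acts as a separator.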

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      1432
225455571  1442    PREDICTED: cleavage and polyadenylation   0.988    0.981    0.799     0.0
296084122  1448    unnamed protein product [Vitis vinifera]  0.988    0.977    0.796     0.0
255539681  1461    cleavage and polyadenylation specificity  0.999    0.979    0.803     0.0
356559917  1447    PREDICTED: cleavage and polyadenylation   0.990    0.980    0.786     0.0
356530945  1449    PREDICTED: cleavage and polyadenylation   0.991    0.979    0.785     0.0
224120960  1455    predicted protein [Populus trichocarpa]   0.991    0.975    0.783     0.0
297792471  1444    hypothetical protein ARALYDRAFT_495232 [  0.988    0.979    0.750     0.0
449470342  1504    PREDICTED: cleavage and polyadenylation   0.995    0.948    0.741     0.0
30696088   1442    cleavage and polyadenylation specificity  0.986    0.979    0.747     0.0
24415580   1442    putative cleavage and polyadenylation sp  0.986    0.979    0.746     0.0
>gi|225455571|ref|XP_002268371.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Vitis vinifera]
 Score = 2437 bits (6316), Expect = 0.0,   Method: Compositional matrix adjust.
 Identities = 1166/1458 (79%), Positives = 1283/1458 (87%), Gaps = 42/1458 (2%)

Query: 1    MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVV 60
            MS+AAYKMMHWPTGI NC SGF+THSRAD+ PQI  IQT++L+SE P+KR IGP+PNL+V
Sbjct: 1    MSYAAYKMMHWPTGIENCASGFVTHSRADFAPQIAPIQTDDLESEWPTKRQIGPLPNLIV 60

Query: 61   TAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILS 120
            TAAN++E+Y+VRVQE+ S+ES+ S ETKR  +M GIS A+LELVC YRLHGNVE++ +L 
Sbjct: 61   TAANILEVYMVRVQEDDSRESRASAETKRGGVMAGISGAALELVCQYRLHGNVETMTVLP 120

Query: 121  QGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARG 180
             GG DNSRRRDSIILAF+DAKISVLEFDDSIHGLR +SMHCFE PEW HLKRG ESFARG
Sbjct: 121  SGGGDNSRRRDSIILAFQDAKISVLEFDDSIHGLRTSSMHCFEGPEWFHLKRGHESFARG 180

Query: 181  PLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRD 240
            PLVKVDPQGRC GVLVYGLQMIILKASQ G GLVGDE+   SG   SAR+ESS+VI+LRD
Sbjct: 181  PLVKVDPQGRCSGVLVYGLQMIILKASQAGYGLVGDEEALSSGSAVSARVESSYVISLRD 240

Query: 241  LDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWS 300
            LDMKHVKDF FVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWS
Sbjct: 241  LDMKHVKDFTFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWS 300

Query: 301  AMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSF 360
            A+NLPHDAYKLL VPSPIGGV+V+ AN+IHYHSQSASCALALNNYAVS D+SQE+PRSSF
Sbjct: 301  AVNLPHDAYKLLPVPSPIGGVVVISANSIHYHSQSASCALALNNYAVSADNSQEMPRSSF 360

Query: 361  SVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGN 420
            SVELDAA+ATWL NDVA+LSTKTG+L+LLT+ YDGRVV RLDLSK+  SVLTS I  IGN
Sbjct: 361  SVELDAANATWLSNDVAMLSTKTGELLLLTLAYDGRVVHRLDLSKSRASVLTSGIAAIGN 420

Query: 421  SLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVN 480
            SLFFLGSRLGDSLLVQFT     S+LSS +KEE GDIE D PS KRLR+SSSDALQDMVN
Sbjct: 421  SLFFLGSRLGDSLLVQFT-----SILSSSVKEEVGDIEGDVPSAKRLRKSSSDALQDMVN 475

Query: 481  GEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYE 540
            GEELSLYGSA N+TE++QKTFSF+VRDS +N+GPLKDF+YGLRINAD  ATGI+KQSNYE
Sbjct: 476  GEELSLYGSAPNSTETSQKTFSFSVRDSFINVGPLKDFAYGLRINADPKATGIAKQSNYE 535

Query: 541  LV--------------------------ELPGCKGIWTVYHKSSRGHNADSSRMAAYDDE 574
            LV                          ELPGCKGIWTVYHK++RGHNADS++MA  DDE
Sbjct: 536  LVCCSGHGKNGALCILQQSIRPEMITEVELPGCKGIWTVYHKNTRGHNADSTKMATKDDE 595

Query: 575  YHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARIL 634
            YHAYLIISLE+RTMVLETADLL EVTESVDY+VQG TI+AGNLFGRRRV+QV+ RGARIL
Sbjct: 596  YHAYLIISLESRTMVLETADLLGEVTESVDYYVQGCTISAGNLFGRRRVVQVYARGARIL 655

Query: 635  DGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSV 694
            DG++MTQDL            SE+STVLSVSIADPYVLL MSDG+I+LLVGDPSTCTVS+
Sbjct: 656  DGAFMTQDLPI----------SESSTVLSVSIADPYVLLRMSDGNIQLLVGDPSTCTVSI 705

Query: 695  QTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYS 754
              PA  ESSKK +S+CTLYHDKGPEPWLRKTSTDAWLSTG+GEAIDGADG   DQGDIY 
Sbjct: 706  NIPAVFESSKKSISACTLYHDKGPEPWLRKTSTDAWLSTGIGEAIDGADGAAQDQGDIYC 765

Query: 755  VVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQG 814
            VV YESG LEIFDVPNFNCVF+VDKF+SG  H+VDT + E  +D++  ++ +SEE   QG
Sbjct: 766  VVSYESGDLEIFDVPNFNCVFSVDKFMSGNAHLVDTLILEPSEDTQKVMSKNSEEEADQG 825

Query: 815  RKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTS 874
            RKEN H++KVVELAMQRWS  HSRPFLF ILTDGTILCY AYL+EGPE+T K+++ VS  
Sbjct: 826  RKENAHNIKVVELAMQRWSGQHSRPFLFGILTDGTILCYHAYLYEGPESTPKTEEAVSAQ 885

Query: 875  RSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPC 934
             SLS+SNVSASRLRNLRF R PLD YTREE   G    R+T+FKNI G QG FLSGSRP 
Sbjct: 886  NSLSISNVSASRLRNLRFVRVPLDTYTREEALSGTTSPRMTVFKNIGGCQGLFLSGSRPL 945

Query: 935  WCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWP 994
            W MVFRER+RVHPQLCDGSIVAFTVLHN+NCNHG IYVTSQG LKICQLP+ S+YDNYWP
Sbjct: 946  WFMVFRERIRVHPQLCDGSIVAFTVLHNINCNHGLIYVTSQGFLKICQLPAVSSYDNYWP 1005

Query: 995  VQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVD 1054
            VQK IPLK TPHQ+TYFAEKNLYPLIVSVPVLKPLN VLS L+DQE GHQ++N NLSS +
Sbjct: 1006 VQK-IPLKGTPHQVTYFAEKNLYPLIVSVPVLKPLNHVLSSLVDQEAGHQLENDNLSSDE 1064

Query: 1055 LHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAI 1114
            LHR+Y+V+E+EVR+LEP+++G PWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAI
Sbjct: 1065 LHRSYSVDEFEVRVLEPEKSGAPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAI 1124

Query: 1115 GTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGP 1174
            GTAYVQGEDVAARGRVLLFS G+N DN QNLV+E+YSKELKGAISA+ASLQGHLLIASGP
Sbjct: 1125 GTAYVQGEDVAARGRVLLFSVGKNTDNSQNLVSEIYSKELKGAISAVASLQGHLLIASGP 1184

Query: 1175 KIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLA 1234
            KIILHKWTGTELNG+AF+DAPPLYVVSLNIVKNFILLGDIH+SIYFLSWKEQGAQLNLLA
Sbjct: 1185 KIILHKWTGTELNGVAFFDAPPLYVVSLNIVKNFILLGDIHRSIYFLSWKEQGAQLNLLA 1244

Query: 1235 KDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAH 1294
            KDFGSLDCFATEFLIDGSTLSL+VSD+QKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAH
Sbjct: 1245 KDFGSLDCFATEFLIDGSTLSLIVSDDQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAH 1304

Query: 1295 VTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKK 1354
            VTKFLRLQML  SSDRT A  GSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKK
Sbjct: 1305 VTKFLRLQMLPASSDRTSATQGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKK 1364

Query: 1355 LVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTT 1414
            LVD+VPHVAGLNPRSFRQF SNGKAHRPGPD+IVDCELL HYEMLP EEQLEIA Q GTT
Sbjct: 1365 LVDAVPHVAGLNPRSFRQFRSNGKAHRPGPDNIVDCELLCHYEMLPFEEQLEIAQQIGTT 1424

Query: 1415 RSQILSNLNDLALGTSFL 1432
            R QILSNLNDL+LGTSFL
Sbjct: 1425 RMQILSNLNDLSLGTSFL 1442
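
The percentages in the summary line of this alignment appear to be truncated rather than rounded: 1166/1458 is about 79.97%, yet the report shows 79%. A minimal sketch of that arithmetic follows; the function name `blast_percent` is hypothetical, and the truncation rule is an assumption inferred from the figures in this report rather than from any tool documentation.

```python
def blast_percent(count, alignment_length):
    """Integer percentage truncated toward zero (assumed reporting convention)."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * count) // alignment_length


# Counts taken from the Identities/Positives/Gaps line of the Vitis vinifera hit.
alignment_length = 1458
for label, count in (("Identities", 1166), ("Positives", 1283), ("Gaps", 42)):
    pct = blast_percent(count, alignment_length)
    print(f"{label} = {count}/{alignment_length} ({pct}%)")
```

Using integer arithmetic avoids floating-point edge cases when a ratio sits very close to a whole percent, as 1283/1458 does here.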




Source: Vitis vinifera

Species: Vitis vinifera

Genus: Vitis

Family: Vitaceae

Order: Vitales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|296084122|emb|CBI24510.3| unnamed protein product [Vitis vinifera]
>gi|255539681|ref|XP_002510905.1| cleavage and polyadenylation specificity factor cpsf, putative [Ricinus communis] gi|223550020|gb|EEF51507.1| cleavage and polyadenylation specificity factor cpsf, putative [Ricinus communis]
>gi|356559917|ref|XP_003548242.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Glycine max]
>gi|356530945|ref|XP_003534039.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Glycine max]
>gi|224120960|ref|XP_002318462.1| predicted protein [Populus trichocarpa] gi|222859135|gb|EEE96682.1| predicted protein [Populus trichocarpa]
>gi|297792471|ref|XP_002864120.1| hypothetical protein ARALYDRAFT_495232 [Arabidopsis lyrata subsp. lyrata] gi|297309955|gb|EFH40379.1| hypothetical protein ARALYDRAFT_495232 [Arabidopsis lyrata subsp. lyrata]
>gi|449470342|ref|XP_004152876.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Cucumis sativus]
>gi|30696088|ref|NP_199979.2| cleavage and polyadenylation specificity factor subunit 1 [Arabidopsis thaliana] gi|290457637|sp|Q9FGR0.2|CPSF1_ARATH RecName: Full=Cleavage and polyadenylation specificity factor subunit 1; AltName: Full=Cleavage and polyadenylation specificity factor 160 kDa subunit; Short=AtCPSF160; Short=CPSF 160 kDa subunit gi|332008729|gb|AED96112.1| cleavage and polyadenylation specificity factor subunit 1 [Arabidopsis thaliana]
>gi|24415580|gb|AAN41460.1| putative cleavage and polyadenylation specificity factor 160 kDa subunit [Arabidopsis thaliana]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                      Length  Definition                      Q cover  H cover  Identity  E-value
Query                   1432
TAIR|locus:2153122      1442    CPSF160 "cleavage and polyaden  0.926    0.920    0.723     0.0
ZFIN|ZDB-GENE-040709-2  1451    cpsf1 "cleavage and polyadenyl  0.422    0.416    0.309     6.8e-157
UNIPROTKB|F1PC28        1398    CPSF1 "Uncharacterized protein  0.412    0.422    0.327     2.9e-156
UNIPROTKB|Q10569        1444    CPSF1 "Cleavage and polyadenyl  0.412    0.409    0.323     5.7e-155
UNIPROTKB|Q10570        1443    CPSF1 "Cleavage and polyadenyl  0.415    0.412    0.320     6.8e-154
MGI|MGI:2679722         1441    Cpsf1 "cleavage and polyadenyl  0.407    0.405    0.328     6.1e-149
FB|FBgn0024698          1455    Cpsf160 "Cleavage and polyaden  0.346    0.340    0.306     5.3e-129
UNIPROTKB|F1RSN8        1108    CPSF1 "Uncharacterized protein  0.342    0.442    0.353     1.7e-125
DICTYBASE|DDB_G0281585  1628    cpsf1 "cleavage and polyadenyl  0.240    0.211    0.329     6.1e-115
RGD|1306406             1386    Cpsf1 "cleavage and polyadenyl  0.308    0.318    0.353     2.6e-113
TAIR|locus:2153122 CPSF160 "cleavage and polyadenylation specificity factor 160" [Arabidopsis thaliana (taxid:3702)]
 Score = 5068 (1789.1 bits), Expect = 0., P = 0.
 Identities = 991/1370 (72%), Positives = 1128/1370 (82%)

Query:    88 KRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEF 147
             KR  +MDG+   SLELVCHYRLHGNVES+A+L  GG ++S+ RDSIIL F DAKISVLEF
Sbjct:    91 KRGGVMDGVYGVSLELVCHYRLHGNVESIAVLPMGGGNSSKGRDSIILTFRDAKISVLEF 150

Query:   148 DDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKAS 207
             DDSIH LR+TSMHCFE P+WLHLKRGRESF RGPLVKVDPQGRCGGVLVYGLQMIILK S
Sbjct:   151 DDSIHSLRMTSMHCFEGPDWLHLKRGRESFPRGPLVKVDPQGRCGGVLVYGLQMIILKTS 210

Query:   208 QGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERE 267
             Q GSGLVGD+D F SGG  SAR+ESS++INLRDL+MKHVKDF+F+HGYIEPV+VIL E E
Sbjct:   211 QVGSGLVGDDDAFSSGGTVSARVESSYIINLRDLEMKHVKDFVFLHGYIEPVIVILQEEE 270

Query:   268 LTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGAN 327
              TWAGRVSWKHHTC++SALSI++TLKQHP+IWSA+NLPHDAYKLLAVPSPIGGVLV+ AN
Sbjct:   271 HTWAGRVSWKHHTCVLSALSINSTLKQHPVIWSAINLPHDAYKLLAVPSPIGGVLVLCAN 330

Query:   328 TIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLV 387
             TIHYHSQSASCALALNNYA S DSSQELP S+FSVELDAAH TW+ NDVALLSTK+G+L+
Sbjct:   331 TIHYHSQSASCALALNNYASSADSSQELPASNFSVELDAAHGTWISNDVALLSTKSGELL 390

Query:   388 LLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLS 447
             LLT++YDGR VQRLDLSK+  SVL SDIT++GNSLFFLGSRLGDSLLVQF+C SG +   
Sbjct:   391 LLTLIYDGRAVQRLDLSKSKASVLASDITSVGNSLFFLGSRLGDSLLVQFSCRSGPAASL 450

Query:   448 SGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQ--------- 498
              GL++E  DIE +    KRLR +S      + N E      + +N+  + +         
Sbjct:   451 PGLRDEDEDIEGEGHQAKRLRMTSDTFQDTIGNEELSLFGSTPNNSDSAQKSFSFAVRDS 510

Query:   499 -------KTFSFAVR-DSLVNI-GPLKDFSYGLRINADASATG---ISKQS-NYEL---V 542
                    K F++ +R ++  N  G  K  +Y L   +     G   + +QS   E+   V
Sbjct:   511 LVNVGPVKDFAYGLRINADANATGVSKQSNYELVCCSGHGKNGALCVLRQSIRPEMITEV 570

Query:   543 ELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTES 602
             ELPGCKGIWTVYHKSSRGHNADSS+MAA +DEYHAYLIISLEARTMVLETADLLTEVTES
Sbjct:   571 ELPGCKGIWTVYHKSSRGHNADSSKMAADEDEYHAYLIISLEARTMVLETADLLTEVTES 630

Query:   603 VDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPXXXXXXXXXXXXTVL 662
             VDY+VQGRTIAAGNLFGRRRVIQVFE GARILDGS+M Q+LSFG             TV 
Sbjct:   631 VDYYVQGRTIAAGNLFGRRRVIQVFEHGARILDGSFMNQELSFGASNSESNSGSESSTVS 690

Query:   663 SVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWL 722
             SVSIADPYVLL M+D SIRLLVGDPSTCTVS+ +P+ +E SK+ +S+CTLYHDKGPEPWL
Sbjct:   691 SVSIADPYVLLRMTDDSIRLLVGDPSTCTVSISSPSVLEGSKRKISACTLYHDKGPEPWL 750

Query:   723 RKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVS 782
             RK STDAWLS+GVGEA+D  DGGP DQGDIY VVCYESGALEIFDVP+FNCVF+VDKF S
Sbjct:   751 RKASTDAWLSSGVGEAVDSVDGGPQDQGDIYCVVCYESGALEIFDVPSFNCVFSVDKFAS 810

Query:   783 GRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLF 842
             GR H+ D  + E     E E+N +SE+ T    KE I + +VVELAMQRWS HH+RPFLF
Sbjct:   811 GRRHLSDMPIHEL----EYELNKNSEDNTSS--KE-IKNTRVVELAMQRWSGHHTRPFLF 863

Query:   843 AILTDGTILCYQAYLFEGPENTSKSDDPXXXXXXXXXXXXXXXXXXXLRFSRTPLDAYTR 902
             A+L DGTILCY AYLF+G ++T K+++                    L+F R PLD  TR
Sbjct:   864 AVLADGTILCYHAYLFDGVDST-KAENSLSSENPAALNSSGSSKLRNLKFLRIPLDTSTR 922

Query:   903 EETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHN 962
             E T  G   QRIT+FKNISGHQGFFLSGSRP WCM+FRERLR H QLCDGSI AFTVLHN
Sbjct:   923 EGTSDGVASQRITMFKNISGHQGFFLSGSRPGWCMLFRERLRFHSQLCDGSIAAFTVLHN 982

Query:   963 VNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVS 1022
             VNCNHGFIYVT+QG+LKICQLPS S YDNYWPVQK IPLKATPHQ+TY+AEKNLYPLIVS
Sbjct:   983 VNCNHGFIYVTAQGVLKICQLPSASIYDNYWPVQK-IPLKATPHQVTYYAEKNLYPLIVS 1041

Query:  1023 VPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRA 1082
              PV KPLNQVLS L+DQE G Q+DNHN+SS DL RTYTVEE+E++ILEP+R+GGPW+T+A
Sbjct:  1042 YPVSKPLNQVLSSLVDQEAGQQLDNHNMSSDDLQRTYTVEEFEIQILEPERSGGPWETKA 1101

Query:  1083 TIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNP 1142
              IPMQ+SE+ALTVRVVTL N +T ENETLLA+GTAYVQGEDVAARGRVLLFS G+N DN 
Sbjct:  1102 KIPMQTSEHALTVRVVTLLNASTGENETLLAVGTAYVQGEDVAARGRVLLFSFGKNGDNS 1161

Query:  1143 QNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSL 1202
             QN+VTEVYS+ELKGAISA+AS+QGHLLI+SGPKIILHKW GTELNG+AF+DAPPLYVVS+
Sbjct:  1162 QNVVTEVYSRELKGAISAVASIQGHLLISSGPKIILHKWNGTELNGVAFFDAPPLYVVSM 1221

Query:  1203 NIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQ 1262
             N+VK+FILLGD+HKSIYFLSWKEQG+QL+LLAKDF SLDCFATEFLIDGSTLSL VSDEQ
Sbjct:  1222 NVVKSFILLGDVHKSIYFLSWKEQGSQLSLLAKDFESLDCFATEFLIDGSTLSLAVSDEQ 1281

Query:  1263 KNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNR 1322
             KNIQ+FYYAPKM ESWKG KLLSRAEFHVGAHV+KFLRLQM+++         G+DK NR
Sbjct:  1282 KNIQVFYYAPKMIESWKGLKLLSRAEFHVGAHVSKFLRLQMVSS---------GADKINR 1332

Query:  1323 FALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRP 1382
             FALLFGTLDGS GCIAPLDE+TFRRLQSLQKKLVD+VPHVAGLNP +FRQF S+GKA R 
Sbjct:  1333 FALLFGTLDGSFGCIAPLDEVTFRRLQSLQKKLVDAVPHVAGLNPLAFRQFRSSGKARRS 1392

Query:  1383 GPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL 1432
             GPDSIVDCELL HYEMLPLEEQLE+AHQ GTTR  IL +L DL++GTSFL
Sbjct:  1393 GPDSIVDCELLCHYEMLPLEEQLELAHQIGTTRYSILKDLVDLSVGTSFL 1442


GO:0003676 "nucleic acid binding" evidence=IEA
GO:0005634 "nucleus" evidence=ISM;IEA;IDA
GO:0006378 "mRNA polyadenylation" evidence=ISS
GO:0006379 "mRNA cleavage" evidence=ISS
GO:0005515 "protein binding" evidence=IPI
GO:0005829 "cytosol" evidence=IDA
GO:0006397 "mRNA processing" evidence=RCA
GO:0009909 "regulation of flower development" evidence=RCA
GO:0016570 "histone modification" evidence=RCA
GO:0048449 "floral organ formation" evidence=RCA
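
Each GO annotation line above carries a term identifier, a quoted term name, and one or more evidence codes separated by semicolons. A minimal sketch of turning such a line into a structured record follows. The function name `parse_go_line` is a hypothetical helper, and the pattern assumes exactly the `GO:nnnnnnn "name" evidence=CODE;CODE` layout seen in this report.

```python
import re

# GO identifier, quoted term name, then a semicolon-separated evidence list.
GO_LINE_RE = re.compile(r'^(GO:\d{7})\s+"([^"]+)"\s+evidence=(\S+)$')


def parse_go_line(line):
    """Return {'id', 'name', 'evidence'} for one GO annotation line."""
    match = GO_LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised GO line: {line!r}")
    go_id, name, evidence = match.groups()
    return {"id": go_id, "name": name, "evidence": evidence.split(";")}


lines = [
    'GO:0003676 "nucleic acid binding" evidence=IEA',
    'GO:0005634 "nucleus" evidence=ISM;IEA;IDA',
    'GO:0006378 "mRNA polyadenylation" evidence=ISS',
]
for record in map(parse_go_line, lines):
    print(record["id"], record["name"], ",".join(record["evidence"]))
```

Keeping the evidence codes as a list makes it straightforward to filter terms, for example to keep only those with experimental support such as IDA or IPI.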
ZFIN|ZDB-GENE-040709-2 cpsf1 "cleavage and polyadenylation specific factor 1" [Danio rerio (taxid:7955)]
UNIPROTKB|F1PC28 CPSF1 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|Q10569 CPSF1 "Cleavage and polyadenylation specificity factor subunit 1" [Bos taurus (taxid:9913)]
UNIPROTKB|Q10570 CPSF1 "Cleavage and polyadenylation specificity factor subunit 1" [Homo sapiens (taxid:9606)]
MGI|MGI:2679722 Cpsf1 "cleavage and polyadenylation specific factor 1" [Mus musculus (taxid:10090)]
FB|FBgn0024698 Cpsf160 "Cleavage and polyadenylation specificity factor 160" [Drosophila melanogaster (taxid:7227)]
UNIPROTKB|F1RSN8 CPSF1 "Uncharacterized protein" [Sus scrofa (taxid:9823)]
DICTYBASE|DDB_G0281585 cpsf1 "cleavage and polyadenylation specificity factor 160 kDa subunit" [Dictyostelium discoideum (taxid:44689)]
RGD|1306406 Cpsf1 "cleavage and polyadenylation specific factor 1, 160kDa" [Rattus norvegicus (taxid:10116)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name         Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q9FGR0  CPSF1_ARATH  No assigned EC number  0.7474    0.9867          0.9798        yes       no
Q7XWP1  CPSF1_ORYSJ  No assigned EC number  0.6408    0.9776          0.9715        yes       no

EC Number Prediction by Ezypred Server

Failed to connect to the Ezypred server.

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input:
GSVIVG00038486001
SubName- Full=Chromosome chr16 scaffold_94, whole genome shotgun sequence; (1448 aa)
(Vitis vinifera)
Predicted Functional Partners:
GSVIVG00037665001
SubName- Full=Chromosome undetermined scaffold_91, whole genome shotgun sequence; (740 aa)
    Score: 0.865
GSVIVG00016982001
SubName- Full=Chromosome chr11 scaffold_14, whole genome shotgun sequence; (771 aa)
    Score: 0.757
GSVIVG00020879001
SubName- Full=Chromosome chr14 scaffold_21, whole genome shotgun sequence; (427 aa)
    Score: 0.681
GSVIVG00028411001
SubName- Full=Chromosome chr10 scaffold_43, whole genome shotgun sequence; (572 aa)
    Score: 0.676
GSVIVG00000022001
SubName- Full=Chromosome chr17 scaffold_101, whole genome shotgun sequence; (461 aa)
    Score: 0.517
GSVIVG00006902001
SubName- Full=Chromosome chr10 scaffold_179, whole genome shotgun sequence; (863 aa)
    Score: 0.510
GSVIVG00023663001
SubName- Full=Chromosome chr7 scaffold_31, whole genome shotgun sequence; (97 aa)
    Score: 0.506
GSVIVG00010279001
SubName- Full=Chromosome undetermined scaffold_252, whole genome shotgun sequence; (561 aa)
    Score: 0.506
GSVIVG00037110001
SubName- Full=Chromosome chr16 scaffold_86, whole genome shotgun sequence; (512 aa)
    Score: 0.496
GSVIVG00001949001
SubName- Full=Chromosome chr5 scaffold_124, whole genome shotgun sequence; (2072 aa)
    Score: 0.495

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition                                          E-value
Query      1432
pfam03178  318     pfam03178, CPSF_A, CPSF A subunit region            6e-92
COG5161    1319    COG5161, SFT1, Pre-mRNA cleavage and polyadenylati  1e-43
COG5161    1319    COG5161, SFT1, Pre-mRNA cleavage and polyadenylati  2e-22
pfam10433  513     pfam10433, MMS1_N, Mono-functional DNA-alkylating   1e-07
>gnl|CDD|217409 pfam03178, CPSF_A, CPSF A subunit region
 Score =  299 bits (769), Expect = 6e-92
 Identities = 116/338 (34%), Positives = 189/338 (55%), Gaps = 24/338 (7%)

Query: 1064 YEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGED 1123
              +R+++P      W+   T+ ++ +E  L+V+ V L ++  +  +  L +GTA+  GED
Sbjct: 2    SCIRLVDPIT----WEVIDTLELEENEAVLSVKSVNLEDS--EGRKEYLVVGTAFDLGED 55

Query: 1124 VAAR-GRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWT 1182
             AAR GR+ +F       N +  +  V+  E+KGA++AL   QG LL   G K+ ++   
Sbjct: 56   PAARSGRIYVFEIIEPETNRK--LKLVHKTEVKGAVTALCEFQGRLLAGQGQKLRVYDLG 113

Query: 1183 GTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDC 1242
              +L   AF D P  YVVSL +  N I++GD+ KS+ FL + E+  +L L A+D      
Sbjct: 114  KDKLLPKAFLDTPITYVVSLKVFGNRIIVGDLMKSVTFLGYDEEPYRLILFARDTQPRWV 173

Query: 1243 FATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQ-KLLSRAEFHVGAHVTKFLRL 1301
             A EFL+D  T  ++ +D+  N+ +  Y P+  ES  G  +LL RAEFH+G  VT F + 
Sbjct: 174  TAAEFLVDYDT--ILGADKFGNLHVLRYDPEAPESLDGDPRLLHRAEFHLGDIVTSFQKG 231

Query: 1302 QMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAP-LDELTFRRLQSLQKKLVDSVP 1360
             ++  +        G++ T+   +L+GTLDGSIG + P + E  +RRLQ LQ++L D +P
Sbjct: 232  SLVPKTG-------GAESTSSPQILYGTLDGSIGLLVPFISEEEYRRLQHLQQQLRDELP 284

Query: 1361 HVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEM 1398
            H+ GL+PR+FR ++S     +     ++D +LL  +  
Sbjct: 285  HLCGLDPRAFRSYYSRSPPVKN----VIDGDLLERFLD 318


This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs. The function of the aligned region is unknown but may be involved in RNA/DNA binding. Length = 318

>gnl|CDD|227490 COG5161, SFT1, Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]
>gnl|CDD|227490 COG5161, SFT1, Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]
>gnl|CDD|220751 pfam10433, MMS1_N, Mono-functional DNA-alkylating methyl methanesulfonate N-term

Conserved Domains Detected by HHsearch

ID         Length  Definition  Probability
Query      1432
KOG1896    1366    consensus mRNA cleavage and polyadenylation factor  100.0
KOG1897    1096    consensus Damage-specific DNA binding complex, sub  100.0
KOG1898    1205    consensus Splicing factor 3b, subunit 3 [RNA proce  100.0
COG5161    1319    SFT1 Pre-mRNA cleavage and polyadenylation specifi  100.0
PF10433    504     MMS1_N: Mono-functional DNA-alkylating methyl meth  100.0
PF03178    321     CPSF_A: CPSF A subunit region; InterPro: IPR004871  100.0
KOG0318    603     consensus WD40 repeat stress protein/actin interac  98.45
KOG2048    691     consensus WD40 repeat protein [General function pr  97.83
PRK11028   330     6-phosphogluconolactonase; Provisional  97.22
KOG1539    910     consensus WD repeat protein [General function pred  96.82
cd00200    289     WD40 WD40 domain, found in a number of eukaryotic  96.44
PF03178    321     CPSF_A: CPSF A subunit region; InterPro: IPR004871  96.39
KOG1273    405     consensus WD40 repeat protein [General function pr  96.23
PRK11028   330     6-phosphogluconolactonase; Provisional  96.1
KOG1446    311     consensus Histone H3 (Lys4) methyltransferase comp  96.08
cd00200    289     WD40 WD40 domain, found in a number of eukaryotic  95.9
KOG1274    933     consensus WD40 repeat protein [General function pr  95.9
KOG1036    323     consensus Mitotic spindle checkpoint protein BUB3,  95.75
PLN00181   793     protein SPA1-RELATED; Provisional  95.73
PF10282    345     Lactonase: Lactonase, 7-bladed beta-propeller; Int  95.51
KOG1539    910     consensus WD repeat protein [General function pred  94.91
PF08596    395     Lgl_C: Lethal giant larvae(Lgl) like, C-terminal;  94.91
KOG0291    893     consensus WD40-repeat-containing subunit of the 18  94.46
KOG0306    888     consensus WD40-repeat-containing subunit of the 18  94.3
KOG0285    460     consensus Pleiotropic regulator 1 [RNA processing  93.98
KOG1036    323     consensus Mitotic spindle checkpoint protein BUB3,  93.97
KOG0646    476     consensus WD40 repeat protein [General function pr  93.83
KOG0315    311     consensus G-protein beta subunit-like protein (con  93.21
KOG0282    503     consensus mRNA splicing factor [Function unknown]  92.78
KOG0291    893     consensus WD40-repeat-containing subunit of the 18  92.78
KOG0283    712     consensus WD40 repeat-containing protein [Function  92.59
KOG2111    346     consensus Uncharacterized conserved protein, conta  92.5
PLN00181   793     protein SPA1-RELATED; Provisional  92.48
KOG0319    775     consensus WD40-repeat-containing subunit of the 18  92.06
KOG0283    712     consensus WD40 repeat-containing protein [Function  91.76
COG2706    346     3-carboxymuconate cyclase [Carbohydrate transport  91.75
PTZ00420   568     coronin; Provisional  91.15
KOG2321    703     consensus WD40 repeat protein [General function pr  90.8
KOG1897    1096    consensus Damage-specific DNA binding complex, sub  90.58
KOG1273    405     consensus WD40 repeat protein [General function pr  90.07
PF08596    395     Lgl_C: Lethal giant larvae(Lgl) like, C-terminal;  89.95
KOG0650    733     consensus WD40 repeat nucleolar protein Bop1, invo  89.82
KOG2096    420     consensus WD40 repeat protein [General function pr  89.73
KOG0306    888     consensus WD40-repeat-containing subunit of the 18  89.67
KOG0319    775     consensus WD40-repeat-containing subunit of the 18  89.55
TIGR03866  300     PQQ_ABC_repeats PQQ-dependent catabolism-associate  89.43
KOG3881    412     consensus Uncharacterized conserved protein [Funct  88.97
KOG2106    626     consensus Uncharacterized conserved protein, conta  88.84
KOG2110    391     consensus Uncharacterized conserved protein, conta  88.45
KOG0278    334     consensus Serine/threonine kinase receptor-associa  88.21
KOG0294    362     consensus WD40 repeat-containing protein [Function  88.19
KOG2055    514     consensus WD40 repeat protein [General function pr  88.13
PF14727    418     PHTB1_N: PTHB1 N-terminus  87.34
KOG0290    364     consensus Conserved WD40 repeat-containing protein  87.06
KOG4378    673     consensus Nuclear protein COP1 [Signal transductio  86.96
PF14783    111     BBS2_Mid: Ciliary BBSome complex subunit 2, middle  86.95
KOG0296    399     consensus Angio-associated migratory cell protein  86.76
KOG0647    347     consensus mRNA export protein (contains WD40 repea  86.58
COG2706    346     3-carboxymuconate cyclase [Carbohydrate transport  85.46
KOG0646    476     consensus WD40 repeat protein [General function pr  85.12
KOG0772    641     consensus Uncharacterized conserved protein, conta  84.79
KOG2106    626     consensus Uncharacterized conserved protein, conta  83.94
KOG0266    456     consensus WD40 repeat-containing protein [General  83.18
KOG0318    603     consensus WD40 repeat stress protein/actin interac  83.11
KOG2055    514     consensus WD40 repeat protein [General function pr  83.0
KOG1408    1080    consensus WD40 repeat protein [Function unknown]  82.75
KOG0315    311     consensus G-protein beta subunit-like protein (con  82.59
KOG2110    391     consensus Uncharacterized conserved protein, conta  82.44
KOG0299    479     consensus U3 snoRNP-associated protein (contains W  82.38
KOG1034    385     consensus Transcriptional repressor EED/ESC/FIE, r  81.44
KOG0277    311     consensus Peroxisomal targeting signal type 2 rece  81.43
PF08450    246     SGL: SMP-30/Gluconolaconase/LRE-like region; Inter  81.11
PTZ00420   568     coronin; Provisional  80.93
PF14727    418     PHTB1_N: PTHB1 N-terminus  80.52
PTZ00421   493     coronin; Provisional  80.49
PHA02713   557     hypothetical protein; Provisional  80.38
>KOG1896 consensus mRNA cleavage and polyadenylation factor II complex, subunit CFT1 (CPSF subunit) [RNA processing and modification]
Probab=100.00  E-value=1.5e-195  Score=1750.69  Aligned_cols=1286  Identities=42%  Similarity=0.682  Sum_probs=1066.3

Q ss_pred             ccceeccccCCceeeeeEEEEeecCCCCCCCCCcccccccccccCCCCCCCCCCCcEEEEcCCeEEEEEEEEeccCCccc
Q 000545            2 SFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKES   81 (1432)
Q Consensus         2 ~~~~~~~~~~pT~V~~s~~~~Ft~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~nLvvak~~~LeIy~v~~~~~g~~~~   81 (1432)
                      .|++|++.|+||+|+||++|+||+....                           ||||+++|.|+||++..++.+.+. 
T Consensus         1 m~~vykq~h~~T~ve~s~ag~Ft~~~~~---------------------------nlvV~~~N~L~vyri~~~~e~~t~-   52 (1366)
T KOG1896|consen    1 MFAVYKQEHDPTVVENSSAGLFTNNRTE---------------------------NLVVAGTNILRVYRISRDAEALTK-   52 (1366)
T ss_pred             CcchhhhccCchhhccceeeeEecCCCc---------------------------ceEEecccEEEEEEeccchhhccc-
Confidence            3788999999999999999999977654                           999999999999999865333211 


Q ss_pred             cCCccccccccccccccceEEEEEEEEeeeeEeEEEEEecCCCCCCCCCcEEEEEeccceEEEEEEeCCCCCeeEEEeee
Q 000545           82 KNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHC  161 (1432)
Q Consensus        82 ~~~~~~~~~~~~~~~~~~~L~~v~~~~l~G~I~~l~~~r~~~~~~~~~~D~Lll~~~~~klsil~~d~~~~~l~t~Slh~  161 (1432)
                            ++.+.|+...+.+|+|+++|.+||+|++|++++..|+    .+|+|+++|++||+|+|+||+.+|.|+|.||||
T Consensus        53 ------~~~~~~~~~~~~~LeLv~~~~l~GnV~si~~~~~~gs----~rD~LlL~f~~AKiSvlefD~~t~sl~TlSLHy  122 (1366)
T KOG1896|consen   53 ------NDPGDMGKAHRKKLELVAEFKLFGNVTSIAKLPLKGS----NRDALLLLFKDAKISVLEFDPQTNSLRTLSLHY  122 (1366)
T ss_pred             ------cCccccccccceEEEEEEEEEeecceeeEEEeecCCC----CcceEEEEeccceEEEEEecCCccceeeeeeEE
Confidence                  1222333344567999999999999999999999987    699999999999999999999999999999999


Q ss_pred             ecCcccccccCCCccccCCCeEEECCCCcEEEEEecCceEEEEeCccCCCCCCCCCCCCCCCCCcccceeccEEEEcccC
Q 000545          162 FESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDL  241 (1432)
Q Consensus       162 ~E~~~~~~~~~g~~~~~~~~~l~vDP~~Rc~~l~~y~~~l~ilp~~~~~~~l~~~~~~~~~~~~~~~~~~~s~~~~l~~l  241 (1432)
                      ||.+++   +.|++....+|.++|||++||++|++|+..++||||++.+ .+++++ ....++...+++.+||+|.+.+|
T Consensus       123 fE~~~~---~~~~~~~~~~p~vrvDPdsrCa~llvyg~~m~iLpf~~~e-~~~~~~-~~~~~~~~ss~~~pSyvi~~reL  197 (1366)
T KOG1896|consen  123 FEGPEF---RKGLVGRAKIPTVRVDPDSRCALLLVYGLRMAILPFRVNE-HLDDEE-LFPSGFSKSSFTAPSYVIALREL  197 (1366)
T ss_pred             eccccc---cccccccccCceEEECCCCCeEEEEEecceEEEeeccccc-cccccc-cccccccccccccceeEEEhhhh
Confidence            999863   4555555678999999999999999999999999998863 344333 22222233457889999999999


Q ss_pred             C--CCceeeEeeecCCCCceEEEEeecCCCcccccccccceeEEEEEEEeecccccceeeEeccCCcccceEEEecCCCC
Q 000545          242 D--MKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIG  319 (1432)
Q Consensus       242 d--i~~V~D~~FL~gy~~PtlavL~e~~~tw~gr~~~~~dt~~~~~~sLd~~~k~~~~i~s~~~Lp~~~~~LipvP~p~g  319 (1432)
                      |  |+||+|++|||||++||+|+||||.+||+||+..|+|||.+.+++||+++|.||+||++.+||+||+++.++|.|+|
T Consensus       198 deki~niiD~qFLhgY~ePTl~ILyep~~tw~grv~~r~dt~~~vaisLni~q~~hpVI~sv~sLP~D~~~~~~vp~piG  277 (1366)
T KOG1896|consen  198 DEKIKNIIDFQFLHGYYEPTLAILYEPEQTWAGRVILRKDTCVLVAISLNITQKVHPVIWSVLSLPFDCYQATAVPTPIG  277 (1366)
T ss_pred             hhhhccceeEEeecCcccceEEEEecccccccceEEEecCcEEEEEEEcCccccccceEeeeccCChhhhhceeecccCc
Confidence            9  88999999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             eEEEEecCeEEEEecCc-cceEEccCCCccCCCCcccCCCCceeEeeceeEEEeeCceEEEEeCCCCEEEEEEEEc-Cee
Q 000545          320 GVLVVGANTIHYHSQSA-SCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYD-GRV  397 (1432)
Q Consensus       320 GvLVig~n~I~y~~~~~-~~~~~~n~~~~~~~~~~~~p~~~~~~~ld~~~~~~~~~~~~Ll~~~~G~l~~l~l~~d-g~~  397 (1432)
                      ||||++.|.++|.+|++ ++++++|++++..+.++.+||+.+.+.+|++..+|++.++++++..+||+|+|+|.+| ++.
T Consensus       278 gvLv~~~n~~iy~nqsv~~~gv~LNs~a~~~t~fpl~~qs~v~i~ld~a~~t~i~~dk~vis~~~Gd~y~Ltl~~D~~r~  357 (1366)
T KOG1896|consen  278 GVLVFTVNNLIYLNQSVSPYGVALNSYASKYTAFPLIPQSGVRIELDCANATWISNDKCVISLKNGDLYLLTLILDIGRS  357 (1366)
T ss_pred             cEEEEeeeeEEEEccCCCceeEEecchhhcccCCccccccceEEEEeeccceeecCCeEEEecCCCcEEEEEEEeccccc
Confidence            99999999999999998 5999999999999999999999999999999999999999999999999999999999 789


Q ss_pred             eeeEEEEecCCCccccceEEecCCeEEEEeecCCeeEEEEeeCCCccccCCCCccccCCcccCCcchhhccCCCcchhhc
Q 000545          398 VQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQD  477 (1432)
Q Consensus       398 V~~l~i~~~~~~~~~s~l~~l~~g~lF~gS~~GDS~L~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  477 (1432)
                      |+.+++..+...++++|++...+++||+||+.|||+|+||.+....+.+...+  ++.+.+....+.++.+...+..+ +
T Consensus       358 V~~~~f~k~~asvl~t~~v~~~n~llFlGSrlgnSlll~~s~~~~~~~e~~~r--e~~d~~~~~~~~~~~d~~~d~~~-~  434 (1366)
T KOG1896|consen  358 VQLLHFDKFKASVLATSIVGHGNNLLFLGSRLGNSLLLRFSELLQRASEGVRR--EEGDTESDGYSKKRVDDTQDVRR-D  434 (1366)
T ss_pred             hhhhhhhhhhcccceeeeeccCCccEEEEecCCCEEEEEehhccccCCccccc--cccCCcCCcchhhcccchhhhhh-h
Confidence            99999999999999999999999999999999999999999876532222222  22222222233333321101111 1


Q ss_pred             ccCcccc------cccCCCCCCcccccceeEEEEeeeecccCCccccccccccccCC---------------CccCCCCC
Q 000545          478 MVNGEEL------SLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADA---------------SATGISKQ  536 (1432)
Q Consensus       478 ~~~~~~~------~l~~~~~~~~~~~~~~~~l~v~d~l~NigPI~D~~vg~~~~~~~---------------~~sG~g~~  536 (1432)
                      +...++.      +-||+++..+   ...+.|++||+|+|+|||.||++|.....+.               .|+|+|+.
T Consensus       435 d~~~~~~~~~g~~~~~g~~a~~t---~~~f~fevcDsL~NIGPi~~~avG~~~~~~~~~~gl~~~~~~~elV~~sGhgkn  511 (1366)
T KOG1896|consen  435 DEKSAELFEAGSEENYGSGAQET---VQPFSFEVCDSLPNIGPITDFAVGKRSSASEAVEGLSPHNKCLELVATSGHGKN  511 (1366)
T ss_pred             hhhccchhhccccccCCccccee---eeeeEEeehhccccccccccceeccccchhhhccCCCCCCCeEEEEEeccCCCC
Confidence            1111111      2222221111   1238899999999999999999998654221               18899999


Q ss_pred             CCeEE------------EecCCCCEEEEEEecCCCCCCCCcccccccCcCcceEEEEeccccceEEEeccceeeeecccc
Q 000545          537 SNYEL------------VELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVD  604 (1432)
Q Consensus       537 g~L~~------------~~L~g~~~iWtv~~~~~~~~~~~~~~~~~~~~~~~~yLvlS~~~~T~Vl~~g~~~eEv~~~~g  604 (1432)
                      |.|.+            ++||||.++|||..+....+         .++..|.||++|..++|+||++|+++.|++. .+
T Consensus       512 gaL~V~r~sI~P~i~t~fel~Gc~~iWtV~~~~~~~~---------~~~~~h~~lilS~e~~t~il~tge~~~Ev~~-s~  581 (1366)
T KOG1896|consen  512 GALSVIRRSIRPEIATEFELPGCVDIWTVFIKGRKRE---------EDNTQHLYLILSTESRTMILETGEELLEVSG-SG  581 (1366)
T ss_pred             cceEEEeecccceeeEEEEecCeeeEEEEEEeccccc---------cccCcceEEEeecccchhhhhccchhhhccc-ce
Confidence            99987            68999999999998644322         2234599999999999999999999999975 58


Q ss_pred             cccccceEEEeeecCCcEEEEEecCcEEEEcCC-cceEEEeCCCCCCCCCCCCCCccEEEEEEeCCEEEEEEeCCcEEEE
Q 000545          605 YFVQGRTIAAGNLFGRRRVIQVFERGARILDGS-YMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLL  683 (1432)
Q Consensus       605 F~~~~~Tl~ag~l~~~~~ivQVt~~~irli~~~-~~~~~~~~~~~~~~~~~~~~~~~I~~asi~d~~vll~~~~g~i~~l  683 (1432)
                      |..+++||++|+++++.+||||||+++|++|++ ...|.++..          .+..+++++++||||++..+.|.+.+|
T Consensus       582 f~~~~~Tl~~gnlg~~rriVQVtp~~~rllDg~~r~lq~i~fd----------~~~~vv~~sv~dpyv~v~~~~g~i~~~  651 (1366)
T KOG1896|consen  582 FTRDGPTLFAGNLGNERRIVQVTPSGLRLLDGDLRMLQRIPFD----------SGAIVVQTSVADPYVAVRSSEGRITLY  651 (1366)
T ss_pred             eEeccceEEEEecCCceEEEEEccceeEEecCcchheeEeccc----------cCCcEEEEeccCceEEEEEcCCceEEE
Confidence            999999999999988899999999999999995 478888882          445689999999999999999999999


Q ss_pred             EecCCCceEEeecCccccCCCCceEEEEeeccCCC-------------CcccccccccccccCCccccccCCCCCCCCCC
Q 000545          684 VGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGP-------------EPWLRKTSTDAWLSTGVGEAIDGADGGPLDQG  750 (1432)
Q Consensus       684 ~~~~~~~~l~~~~~~~~~~~~~~i~~~~l~~d~~~-------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  750 (1432)
                      .++....+|.+.++     ....+.+++++.|.+.             .++.+.. .++...... .+.+.++++...+.
T Consensus       652 ~l~~~s~rl~~~~~-----~s~~~~sv~~~~dlsg~f~~~s~l~~k~~~~~gr~~-~~~~~~~~~-~kv~~~egg~~~~~  724 (1366)
T KOG1896|consen  652 DLEEKSHRLALHDP-----MSFKVVSVSLPADLSGMFTTLSDLSLKGNEANGRSS-EAEGLQSLP-CKVDDEEGGSPEQE  724 (1366)
T ss_pred             EeccccchhhccCc-----ccceeEEEechhhhccceEEEeeecccCcccccccc-cccccccCC-ccccCCCCCCcccC
Confidence            98776656655554     1344566666665432             2222221 111111111 22222332212222


Q ss_pred             cEEEEEEecCCeEEEEECCCCceeEEecccccccccccccccccccccccccccCCCccCCCCCcccccccccEEEEEee
Q 000545          751 DIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQ  830 (1432)
Q Consensus       751 ~~~l~v~~~~g~l~I~sLp~~~~v~~~~~~~~~~~~l~~~~~~~~~~~s~q~l~~~~~~~~~~~~~~~~~~~~v~~i~~~  830 (1432)
                      .+||++++++|+++||++|++++|+.++.|+.++.+|.+......   ..|               ...++..+.++..+
T Consensus       725 ~~~~~~~~e~g~leiy~~pd~~lVf~v~~f~~~~~~L~~~~~~~~---~~~---------------~~s~~~~l~q~~~~  786 (1366)
T KOG1896|consen  725 PYWCVFVTESGTLEIYALPDFDLVFEVDMFDTGNRVLMDSRLRGP---TTN---------------KESEDLELKQLFVN  786 (1366)
T ss_pred             ceEEEEEcCCCceEEEccCCcceEEEeeccCCCcceEEeecccCc---ccc---------------ccccchHHHHhhcc
Confidence            389999999999999999999999999999999999987543222   001               11223566677777


Q ss_pred             eccCC--CCccEEEEEeeCCeEEEEEEeecCCCCCCCCCCCCCcccccccccccccccccceeEEeccCCccCCCC----
Q 000545          831 RWSAH--HSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLRNLRFSRTPLDAYTREE----  904 (1432)
Q Consensus       831 ~~g~~--~~~~~L~vgl~~G~l~~y~~~~~~~~~~~~~~~~~~~~~~~~~lg~~~~~~~~~~rF~k~~~~~~~~~~----  904 (1432)
                      .+|.+  .+.+||++-+.+|.++.|++|+..+                    +    +...++|+|+|+....++.    
T Consensus       787 ~L~~e~~~~e~~L~lv~~~~eil~Ykaf~~~~--------------------~----~~~~~~f~kvp~~~~~~~~~p~~  842 (1366)
T KOG1896|consen  787 PLGSEIVFKEPHLFLVVSDNEILIYKAFPQLS--------------------Q----GNLKVFFKKVPHNLNIRTDKPHF  842 (1366)
T ss_pred             ccchhhhccCCceEEEEeCceEEEEeeccccC--------------------c----cchhhhhhhCCHhhcccccCCcc
Confidence            88877  6899999999999999999985111                    0    1124589999875422111    


Q ss_pred             -------------CCCCCCccceEEeeccCCceEEEEeCCCceEEEE-eCCceEEeeccCCCceEEEeeccCCCCCceEE
Q 000545          905 -------------TPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMV-FRERLRVHPQLCDGSIVAFTVLHNVNCNHGFI  970 (1432)
Q Consensus       905 -------------~~~~lg~~~v~~f~~~~g~~~Vf~~g~rP~~i~~-~~~~l~~~p~~~~~~v~~~~~f~~~~~~~g~i  970 (1432)
                                   ++.+.-.++++.|++++|++|||+||++|+||+. .+|.+++||+.++++|.+|++||+.+||+||+
T Consensus       843 ~~~~~~~~~~e~~~~~~~~~~~m~~f~~i~ghsgvfv~Gs~P~~il~t~rg~lr~h~~~gngpv~sfapfhnvn~p~gfi  922 (1366)
T KOG1896|consen  843 LCKKREGGGAEEGASVSVIVQRMTYFEDIGGHSGVFVTGSKPYLILLTFRGVLRFHPVFGNGPVGSFAPFHNVNCPRGFI  922 (1366)
T ss_pred             cchhhccccccccccccceeeeEEeeccccCeeEEEEecCCceEEEEEcccccceeeeecCCcceeeeeeeccCCCcceE
Confidence                         1122334677899999999999999999999987 59999999999999999999999999999999


Q ss_pred             EEEecCeEEEEEcCCCCccCCCcceEEEeeCCCcccEEEEeCCCCeEEEEEeecccccccccccccccccccccccCCCC
Q 000545          971 YVTSQGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNL 1050 (1432)
Q Consensus       971 ~~~~~~~L~I~~l~~~~~~d~~~~ir~~i~L~~tpr~I~y~~~~~~~~v~~s~~~~~~~~~~~~~~~d~~~~~~~~~~~~ 1050 (1432)
                      |++.++.|+||.++....||+.||+|| |||+.|||+++||++.++|+|+++.+.  ++   +...+|++.      +..
T Consensus       923 yvd~~~~l~i~~lp~~~~Ydn~wPvkk-Ipl~~T~~~vvYh~e~~vy~v~t~~~~--~~---~~~~~d~~e------~~~  990 (1366)
T KOG1896|consen  923 YVDRQGELVICVLPEALSYDNKWPVKK-IPLRKTPHQVVYHYEKKVYAVITSTPV--PY---ERLGEDGEE------EVI  990 (1366)
T ss_pred             EECCCceEEEEEcchhcccCCCCcccc-cccccchhheeeeccceEEEEEEeccc--ee---eeccccccc------ccc
Confidence            999999999999999999999999999 999999999999999999999998541  22   111223221      334


Q ss_pred             CccccccccccceEEEEEeccCCCCCCceeeeeEECCCCCceEEEEEEEeeec-CCCCcceEEEEEeeeecCCCccccee
Q 000545         1051 SSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNT-TTKENETLLAIGTAYVQGEDVAARGR 1129 (1432)
Q Consensus      1051 ~~~~~~~~~~~~~~~v~l~dp~~~~~~~~~~~~~~l~~~E~v~si~~v~l~~~-~~~~~~~~lvVGT~~~~~e~~~~~Gr 1129 (1432)
                      +.+|..+.|..++++|+|++|    .+|+.++.|+|++||++++|+.+.|..+ +++++++||+|||+++.|||.++|||
T Consensus       991 ~~de~~~~p~~~~f~i~LisP----~sw~vi~~iefq~~E~v~~~k~v~L~~~~t~~~~k~ylavGT~~~~gEDv~~RGr 1066 (1366)
T KOG1896|consen  991 SRDENVIHPEGEQFSIQLISP----ESWEVIDKIEFQENEHVLHMKYVILDDEETTKGKKPYLAVGTAFIQGEDVPARGR 1066 (1366)
T ss_pred             cccccccccccccceeEEecC----CccccccccccCccceeeEEEEEEEEecccccCCcceEEEEEeecccccccCccc
Confidence            567778889999999999999    4899999999999999999999999865 45567999999999999999999999


Q ss_pred             EEEEEEee---cCCCC--CccEEEEEEEeecCceEEEccccCeEEEEeCCeEEEEEc-cCCeeeeEEeecCCCeeEEEEE
Q 000545         1130 VLLFSTGR---NADNP--QNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKW-TGTELNGIAFYDAPPLYVVSLN 1203 (1432)
Q Consensus      1130 i~vf~i~~---~~~~~--~~~l~~v~~~~~~g~V~al~~~~g~Ll~~vg~~l~v~~~-~~~~L~~~a~~~~~~~~i~sl~ 1203 (1432)
                      +++|+|++   +|++|  +.|||+++++|+||+|.++|+++|+|+.+.|+||+||+| .+..|.++||+|. |.|+++++
T Consensus      1067 ~hi~diIeVVPepgkP~t~~KlKel~~eE~KGtVsavceV~G~l~~~~GqKI~v~~l~r~~~ligVaFiD~-~~yv~s~~ 1145 (1366)
T KOG1896|consen 1067 IHIFDIIEVVPEPGKPFTKNKLKELYIEEQKGTVSAVCEVRGHLLSSQGQKIIVRKLDRDSELIGVAFIDL-PLYVHSMK 1145 (1366)
T ss_pred             EEEEEEEEecCCCCCCcccceeeeeehhhcccceEEEEEeccEEEEccCcEEEEEEeccCCcceeeEEecc-ceeEEehh
Confidence            99999987   77776  447999999999999999999999999999999999999 5678999999999 99999999


Q ss_pred             EeCCEEEEEeccccEEEEEEecccCEEEEeeeccCCccEEEEEEEEcCCeeEEEEEecCCcEEEEeeCCCCCCCccCceE
Q 000545         1204 IVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKL 1283 (1432)
Q Consensus      1204 ~~~n~IlvgD~~~Sv~ll~~~~~~~~l~~~arD~~~~~vta~~fl~d~~~l~~l~~D~~gNl~vl~~~p~~~~s~~~~kL 1283 (1432)
                      +.||+|++||+|||++|++|++++.+|.+++||..++.|++++||+|+++|+|+++|+++||++|.|.|++++|++|+||
T Consensus      1146 ~vknlIl~gDV~ksisfl~fqeep~rlsL~srd~~~l~v~s~EFLVdg~~L~flvsDa~rNi~vy~Y~Pe~~eS~~G~RL 1225 (1366)
T KOG1896|consen 1146 VVKNLILAGDVMKSISFLGFQEEPYRLSLLSRDFEPLNVYSTEFLVDGSNLSFLVSDADRNIHVYMYAPENIESLSGQRL 1225 (1366)
T ss_pred             hhhhheehhhhhhceEEEEEccCceEEEEeecCCchhhceeeeeEEcCCeeEEEEEcCCCcEEEEEeCCCCccccCccee
Confidence            99999999999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             EEEEEEecCcceeEEEEEeeecCCCCCCCCCCCCCCCCceEEE--EEecCCcEEEEEeCChHhHHHHHHHHHHHHhcCCC
Q 000545         1284 LSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALL--FGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPH 1361 (1432)
Q Consensus      1284 ~~~~~f~lg~~vt~~~~~~l~~~~~~~~~~~~g~~~~~~~~il--~~t~~GsIg~l~pl~e~~~~~L~~Lq~~l~~~~~~ 1361 (1432)
                      +++++||+|..+++|.+...... .+ .    +   .+.+...  |||++|++|.++|++|+.||||..||++|...++|
T Consensus      1226 v~radfhvg~~vs~m~~lp~~~~-~e-~----~---~~~~~~~~v~gtlDG~l~~~~Pl~e~~YRRL~~lQn~L~~~~~h 1296 (1366)
T KOG1896|consen 1226 VRRADFHVGAHVSTMFRLPCHQN-AE-F----G---SNSPMFYEVFGTLDGGLGHLVPLDEKTYRRLLMLQNALMDRLPH 1296 (1366)
T ss_pred             eeeeeeEeccceeeeEecccccc-ch-h----c---cCCchhhhhhcccCCceeEEecCCHHHHHHHHHHHHHHHHhhhh
Confidence            99999999999999998653221 10 0    1   1233444  89999999999999999999999999999999999


Q ss_pred             CCCCCcccccccccCCCCCCCCCCcceeHHHHHHHcCCCHHHHHHHHHHhCCCHHHHHHHHHHhhhccCCC
Q 000545         1362 VAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL 1432 (1432)
Q Consensus      1362 ~~Gl~~~~~R~~~~~~~~~~~~~~~~IDGDlle~fl~L~~~~q~~ia~~l~~~~~~i~~~l~~l~~~~~~~ 1432 (1432)
                      +|||||++||..+... ....+.+++|||+||.+|..|+.++|.++|+++|+++.+|+++|-++...++||
T Consensus      1297 v~GLNPr~yR~~~s~~-~~~n~~r~ilDg~ll~~f~yl~~~er~elA~kiGt~~~eIl~DLvel~~~~s~~ 1366 (1366)
T KOG1896|consen 1297 VGGLNPRAYRLLDSSL-QLSNSLRSILDGELLNRFSYLSMSEREELAHKIGTTRKEILDDLVELDRLTSSL 1366 (1366)
T ss_pred             hcCCCHHHhhhccchh-hhcCCCcccchHhHHHHhhccchhhHHHHHHhcCCCHHHHHHHHHHHHHHhhcC
Confidence            9999999999987665 335789999999999999999999999999999999999999999999999886



>KOG1897 consensus Damage-specific DNA binding complex, subunit DDB1 [Replication, recombination and repair]
>KOG1898 consensus Splicing factor 3b, subunit 3 [RNA processing and modification]
>COG5161 SFT1 Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]
>PF10433 MMS1_N: Mono-functional DNA-alkylating methyl methanesulfonate N-term; PDB: 2B5M_A 4A0K_C 4A0B_C 3I7L_A 2B5N_C 3I8E_A 4A09_A 4A0A_A 3EI4_C 2B5L_A
>PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871 This family includes a region that lies towards the C terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit
>KOG0318 consensus WD40 repeat stress protein/actin interacting protein [Cytoskeleton]
>KOG2048 consensus WD40 repeat protein [General function prediction only]
>PRK11028 6-phosphogluconolactonase; Provisional
>KOG1539 consensus WD repeat protein [General function prediction only]
>cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and botto
>PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871 This family includes a region that lies towards the C terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit
>KOG1273 consensus WD40 repeat protein [General function prediction only]
>PRK11028 6-phosphogluconolactonase; Provisional
>KOG1446 consensus Histone H3 (Lys4) methyltransferase complex and RNA cleavage factor II complex, subunit SWD2 [RNA processing and modification; Chromatin structure and dynamics; Posttranslational modification, protein turnover, chaperones]
>cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and botto
>KOG1274 consensus WD40 repeat protein [General function prediction only]
>KOG1036 consensus Mitotic spindle checkpoint protein BUB3, WD repeat superfamily [Cell cycle control, cell division, chromosome partitioning]
>PLN00181 protein SPA1-RELATED; Provisional
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonases (6PGL) 3
>KOG1539 consensus WD repeat protein [General function prediction only]
>PF08596 Lgl_C: Lethal giant larvae(Lgl) like, C-terminal; InterPro: IPR013905 The Lethal giant larvae (Lgl) tumour suppressor protein is conserved from yeast to mammals
>KOG0291 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG0306 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG0285 consensus Pleiotropic regulator 1 [RNA processing and modification]
>KOG1036 consensus Mitotic spindle checkpoint protein BUB3, WD repeat superfamily [Cell cycle control, cell division, chromosome partitioning]
>KOG0646 consensus WD40 repeat protein [General function prediction only]
>KOG0315 consensus G-protein beta subunit-like protein (contains WD40 repeats) [General function prediction only]
>KOG0282 consensus mRNA splicing factor [Function unknown]
>KOG0291 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG0283 consensus WD40 repeat-containing protein [Function unknown]
>KOG2111 consensus Uncharacterized conserved protein, contains WD40 repeats [Function unknown]
>PLN00181 protein SPA1-RELATED; Provisional
>KOG0319 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG0283 consensus WD40 repeat-containing protein [Function unknown]
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
>PTZ00420 coronin; Provisional
>KOG2321 consensus WD40 repeat protein [General function prediction only]
>KOG1897 consensus Damage-specific DNA binding complex, subunit DDB1 [Replication, recombination and repair]
>KOG1273 consensus WD40 repeat protein [General function prediction only]
>PF08596 Lgl_C: Lethal giant larvae(Lgl) like, C-terminal; InterPro: IPR013905 The Lethal giant larvae (Lgl) tumour suppressor protein is conserved from yeast to mammals
>KOG0650 consensus WD40 repeat nucleolar protein Bop1, involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]
>KOG2096 consensus WD40 repeat protein [General function prediction only]
>KOG0306 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG0319 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein
>KOG3881 consensus Uncharacterized conserved protein [Function unknown]
>KOG2106 consensus Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]
>KOG2110 consensus Uncharacterized conserved protein, contains WD40 repeats [Function unknown]
>KOG0278 consensus Serine/threonine kinase receptor-associated protein [Lipid transport and metabolism]
>KOG0294 consensus WD40 repeat-containing protein [Function unknown]
>KOG2055 consensus WD40 repeat protein [General function prediction only]
>PF14727 PHTB1_N: PTHB1 N-terminus
>KOG0290 consensus Conserved WD40 repeat-containing protein AN11 [Function unknown]
>KOG4378 consensus Nuclear protein COP1 [Signal transduction mechanisms]
>PF14783 BBS2_Mid: Ciliary BBSome complex subunit 2, middle region
>KOG0296 consensus Angio-associated migratory cell protein (contains WD40 repeats) [Function unknown]
>KOG0647 consensus mRNA export protein (contains WD40 repeats) [RNA processing and modification]
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
>KOG0646 consensus WD40 repeat protein [General function prediction only]
>KOG0772 consensus Uncharacterized conserved protein, contains WD40 repeat [Function unknown]
>KOG2106 consensus Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]
>KOG0266 consensus WD40 repeat-containing protein [General function prediction only]
>KOG0318 consensus WD40 repeat stress protein/actin interacting protein [Cytoskeleton]
>KOG2055 consensus WD40 repeat protein [General function prediction only]
>KOG1408 consensus WD40 repeat protein [Function unknown]
>KOG0315 consensus G-protein beta subunit-like protein (contains WD40 repeats) [General function prediction only]
>KOG2110 consensus Uncharacterized conserved protein, contains WD40 repeats [Function unknown]
>KOG0299 consensus U3 snoRNP-associated protein (contains WD40 repeats) [RNA processing and modification]
>KOG1034 consensus Transcriptional repressor EED/ESC/FIE, required for transcriptional silencing, WD repeat superfamily [Transcription]
>KOG0277 consensus Peroxisomal targeting signal type 2 receptor [Intracellular trafficking, secretion, and vesicular transport]
>PF08450 SGL: SMP-30/Gluconolaconase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species
>PTZ00420 coronin; Provisional
>PF14727 PHTB1_N: PTHB1 N-terminus
>PTZ00421 coronin; Provisional
>PHA02713 hypothetical protein; Provisional

Homologous Structure Templates

Structure Templates Detected by BLAST

ID Length Definition E-value
Query 1432
3i7h_A 1143 Crystal Structure Of Ddb1 In Complex With The H-Box 3e-15
4a0b_A 1159 Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Dup 3e-15
4a0a_A 1159 Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Dup 3e-15
4a11_A 1159 Structure Of The Hsddb1-Hscsa Complex Length = 1159 3e-15
3ei4_A 1158 Structure Of The Hsddb1-Hsddb2 Complex Length = 115 3e-15
4a0l_A 1144 Structure Of Ddb1-Ddb2-Cul4b-Rbx1 Bound To A 12 Bp 3e-15
4a08_A 1159 Structure Of Hsddb1-Drddb2 Bound To A 13 Bp Cpd-Dup 4e-15
3e0c_A 1140 Crystal Structure Of Dna Damage-Binding Protein 1(D 4e-15
4e54_A 1150 Damaged Dna Induced Uv-Damaged Dna-Binding Protein 4e-15
3ei1_A 1158 Structure Of Hsddb1-Drddb2 Bound To A 14 Bp 6-4 Pho 4e-15
2b5l_A 1140 Crystal Structure Of Ddb1 In Complex With Simian Vi 4e-15
>pdb|3I7H|A Chain A, Crystal Structure Of Ddb1 In Complex With The H-Box Motif Of Hbx Length = 1143

Iteration: 1

Score = 81.6 bits (200), Expect = 3e-15, Method: Compositional matrix adjust.
Identities = 76/285 (26%), Positives = 126/285 (44%), Gaps = 34/285 (11%)

Query: 1106 KENETLLAIGTAYVQGEDVAAR-GRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASL 1164
            K+  T   +GTA V  E+   + GR+++F   + +D     V E   KE+KGA+ ++
Sbjct: 826  KDPNTYFIVGTAMVYPEEAEPKQGRIVVF---QYSDGKLQTVAE---KEVKGAVYSMVEF 879

Query: 1165 QGHLLIASGPKIILHKWTG-----TELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIY 1219
             G LL +    + L++WT      TE N   + +   LY   L    +FIL+GD+ +S+
Sbjct: 880  NGKLLASINSTVRLYEWTTEKDVRTECN--HYNNIMALY---LKTKGDFILVGDLMRSVL 934

Query: 1220 FLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWK 1279
             L++K        +A+DF      A E L D + L    ++   N+ +       +   +
Sbjct: 935  LLAYKPMEGNFEEIARDFNPNWMSAVEILDDDNFLG---AENAFNLFVCQKDSAATTDEE 991

Query: 1280 GQKLLSRAEFHVGAHVTKF----LRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIG 1335
             Q L     FH+G  V  F    L +Q L  +S  T  +          +LFGT++G IG
Sbjct: 992  RQHLQEVGLFHLGEFVNVFCHGSLVMQNLGETSTPTQGS----------VLFGTVNGMIG 1041

Query: 1336 CIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAH 1380
             +  L E  +  L  +Q +L   +  V  +    +R FH+  K
Sbjct: 1042 LVTSLSESWYNLLLDMQNRLNKVIKSVGKIEHSFWRSFHTERKTE 1086
>pdb|4A0B|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Duplex ( Pyrimidine At D-1 Position) At 3.8 A Resolution (Cpd 4) Length = 1159
>pdb|4A0A|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Duplex ( Pyrimidine At D-1 Position) At 3.6 A Resolution (Cpd 3) Length = 1159
>pdb|4A11|A Chain A, Structure Of The Hsddb1-Hscsa Complex Length = 1159
>pdb|3EI4|A Chain A, Structure Of The Hsddb1-Hsddb2 Complex Length = 1158
>pdb|4A0L|A Chain A, Structure Of Ddb1-Ddb2-Cul4b-Rbx1 Bound To A 12 Bp Abasic Site Containing Dna-Duplex Length = 1144
>pdb|4A08|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 13 Bp Cpd-Duplex ( Purine At D-1 Position) At 3.0 A Resolution (Cpd 1) Length = 1159
>pdb|3E0C|A Chain A, Crystal Structure Of Dna Damage-Binding Protein 1(Ddb1) Length = 1140
>pdb|4E54|A Chain A, Damaged Dna Induced Uv-Damaged Dna-Binding Protein (Uv-Ddb) Dimerization And Its Roles In Chromatinized Dna Repair Length = 1150
>pdb|3EI1|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 14 Bp 6-4 Photoproduct Containing Dna-Duplex Length = 1158
>pdb|2B5L|A Chain A, Crystal Structure Of Ddb1 In Complex With Simian Virus 5 V Protein Length = 1140

Structure Templates Detected by RPS-BLAST

ID       Length  Definition                                          E-value
Query    1432
3ei3_A   1158    DNA damage-binding protein 1; UV-damage, DDB, nucl  7e-79
3ei3_A   1158    DNA damage-binding protein 1; UV-damage, DDB, nucl  8e-55
1vt4_I   1221    APAF-1 related killer DARK; drosophila apoptosome,  4e-09
1vt4_I   1221    APAF-1 related killer DARK; drosophila apoptosome,  5e-05
1vt4_I   1221    APAF-1 related killer DARK; drosophila apoptosome,  1e-04
>3ei3_A DNA damage-binding protein 1; UV-damage, DDB, nucleotide excision repair, xeroderma pigmentosum, cytoplasm, DNA repair; HET: DNA PG4; 2.30A {Homo sapiens} PDB: 3ei1_A* 3ei2_A* 3ei4_A* 4a0l_A* 3e0c_A* 3i7k_A* 3i7h_A* 3i7l_A* 3i7n_A* 3i7o_A* 3i7p_A* 3i89_A* 3i8c_A* 3i8e_A* 2b5l_A 2b5m_A 2hye_A* 4a11_A* 4a0k_C* 4a0a_A* ... Length = 1158
 Score =  284 bits (726), Expect = 7e-79
 Identities = 107/641 (16%), Positives = 218/641 (34%), Gaps = 51/641 (7%)

Query: 803  INSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPE 862
            +    +E       E  H +  +++     S   S      + TD +    +   FE   
Sbjct: 537  LQIHPQELRQISHTEMEHEVACLDITPLGDSNGLSPLCAIGLWTDISARILKLPSFELLH 596

Query: 863  NTSKSDDPVSTSRSLSVSNVSASRLRNLR--------FSRTPLDAYTREETPHGAPCQRI 914
                  + +  S  ++    S   L  L          +        R++   G     +
Sbjct: 597  KEMLGGEIIPRSILMTTFESSHYLLCALGDGALFYFGLNIETGLLSDRKKVTLGTQPTVL 656

Query: 915  TIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTS 974
              F+++S     F    RP        +L     +    +     L++           +
Sbjct: 657  RTFRSLST-TNVFACSDRPTVIYSSNHKLVFSN-VNLKEVNYMCPLNSDGYPDSLALANN 714

Query: 975  QGILKICQLPSGSTYDNYWPVQKVIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLS 1034
               L I  +           ++  +PL  +P +I Y      + ++ S   ++  +   +
Sbjct: 715  ST-LTIGTID----EIQKLHIRT-VPLYESPRKICYQEVSQCFGVLSSRIEVQDTSGGTT 768

Query: 1035 LLIDQEVGHQIDNHNLSSVDLHRTYTVEE---------YEVRILEPDRAGGPWQTRATIP 1085
             L        + +   SS     +    E         + + I++       ++      
Sbjct: 769  ALRPSASTQALSSSVSSSKLFSSSTAPHETSFGEEVEVHNLLIIDQHT----FEVLHAHQ 824

Query: 1086 MQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAA-RGRVLLFSTGRNADNPQN 1144
               +E AL++    L     K+  T   +GTA V  E+    +GR+++F           
Sbjct: 825  FLQNEYALSLVSCKL----GKDPNTYFIVGTAMVYPEEAEPKQGRIVVFQYSDGK----- 875

Query: 1145 LVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNI 1204
             +  V  KE+KGA+ ++    G LL +    + L++WT  +           +  + L  
Sbjct: 876  -LQTVAEKEVKGAVYSMVEFNGKLLASINSTVRLYEWTTEKELRTECNHYNNIMALYLKT 934

Query: 1205 VKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKN 1264
              +FIL+GD+ +S+  L++K        +A+DF      A E L D + L    ++   N
Sbjct: 935  KGDFILVGDLMRSVLLLAYKPMEGNFEEIARDFNPNWMSAVEILDDDNFL---GAENAFN 991

Query: 1265 IQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFA 1324
            + +       +   + Q L     FH+G  V  F    ++  +   T          + +
Sbjct: 992  LFVCQKDSAATTDEERQHLQEVGLFHLGEFVNVFCHGSLVMQNLGET------STPTQGS 1045

Query: 1325 LLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGP 1384
            +LFGT++G IG +  L E  +  L  +Q +L   +  V  +    +R FH+  K      
Sbjct: 1046 VLFGTVNGMIGLVTSLSESWYNLLLDMQNRLNKVIKSVGKIEHSFWRSFHTERKTE--PA 1103

Query: 1385 DSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDL 1425
               +D +L+  +  +   +  E+           +      
Sbjct: 1104 TGFIDGDLIESFLDISRPKMQEVVANLQYDDGSGMKREATA 1144


>3ei3_A DNA damage-binding protein 1; UV-damage, DDB, nucleotide excision repair, xeroderma pigmentosum, cytoplasm, DNA repair; HET: DNA PG4; 2.30A {Homo sapiens} PDB: 3ei1_A* 3ei2_A* 3ei4_A* 4a0l_A* 3e0c_A* 3i7k_A* 3i7h_A* 3i7l_A* 3i7n_A* 3i7o_A* 3i7p_A* 3i89_A* 3i8c_A* 3i8e_A* 2b5l_A 2b5m_A 2hye_A* 4a11_A* 4a0k_C* 4a0a_A* ... Length = 1158
>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221
>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221
>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

ID        Length  Definition                                          Probability
Query     1432
d1k8kc_   371     Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taur  97.49
d1nexb2   355     Cdc4 propeller domain {Baker's yeast (Saccharomyce  96.77
d1gxra_   337     Groucho/tle1, C-terminal domain {Human (Homo sapie  96.2
d1nr0a2   299     Actin interacting protein 1 {Nematode (Caenorhabdi  96.16
d1pgua1   325     Actin interacting protein 1 {Baker's yeast (Saccha  95.83
d1gxra_   337     Groucho/tle1, C-terminal domain {Human (Homo sapie  95.57
d1pgua2   287     Actin interacting protein 1 {Baker's yeast (Saccha  95.09
d1k8kc_   371     Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taur  94.55
d1nr0a1   311     Actin interacting protein 1 {Nematode (Caenorhabdi  93.73
d1tbga_   340     beta1-subunit of the signal-transducing G protein   92.75
d1erja_   388     Tup1, C-terminal domain {Baker's yeast (Saccharomy  92.38
d1ri6a_   333     Putative isomerase YbhE {Escherichia coli [TaxId:   92.29
d1nr0a2   299     Actin interacting protein 1 {Nematode (Caenorhabdi  92.15
d1nr0a1   311     Actin interacting protein 1 {Nematode (Caenorhabdi  89.92
d1erja_   388     Tup1, C-terminal domain {Baker's yeast (Saccharomy  88.87
d1q7fa_   279     Brain tumor cg10719-pa {Fruit fly (Drosophila mela  86.26
d1l0qa2   301     Surface layer protein {Archaeon Methanosarcina maz  83.74
>d1k8kc_ b.69.4.1 (C:) Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taurus) [TaxId: 9913]}
class: All beta proteins
fold: 7-bladed beta-propeller
superfamily: WD40 repeat-like
family: WD40-repeat
domain: Arp2/3 complex 41 kDa subunit ARPC1
species: Cow (Bos taurus) [TaxId: 9913]
Probab=97.49  E-value=0.0022  Score=35.40  Aligned_cols=58  Identities=14%  Similarity=0.116  Sum_probs=27.1

Q ss_pred             EEEEEEEEEECCCCCCCCEEEEEEEEEECCCCCCCCEEEEEEEEECCCEEEECCC-CCEEEEE-ECCEEEEEECCC
Q ss_conf             1999996100598866660699999764599997438999988614841798444-6809999-689699997238
Q 000545         1110 TLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASL-QGHLLIA-SGPKIILHKWTG 1183 (1432)
Q Consensus      1110 ~~ivVGT~~~~~Ed~~~~Gri~vf~i~~~~~~~~~~lk~i~~~~~~G~V~al~~~-~g~Ll~a-vg~~i~i~~~~~ 1183 (1432)
                      .+++.|+.         .|.|.++++...     ..+..+  ....++|++++-. +|.++++ ....+.+|.++.
T Consensus       214 ~~l~s~~~---------d~~i~iwd~~~~-----~~~~~~--~~~~~~v~s~~fs~d~~~la~g~d~~~~~~~~~~  273 (371)
T d1k8kc_         214 SRVAWVSH---------DSTVCLADADKK-----MAVATL--ASETLPLLAVTFITESSLVAAGHDCFPVLFTYDS  273 (371)
T ss_dssp             SEEEEEET---------TTEEEEEEGGGT-----TEEEEE--ECSSCCEEEEEEEETTEEEEEETTSSCEEEEEET
T ss_pred             CCCCCCCC---------CCCCEEEEEECC-----CCEEEE--ECCCCCCEEEEECCCCCEEEEECCCCEEEEEEEC
T ss_conf             21000014---------786058864101-----210000--0146652036546999799998199267877608

>d1nexb2 b.69.4.1 (B:370-744) Cdc4 propeller domain {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1gxra_ b.69.4.1 (A:) Groucho/tle1, C-terminal domain {Human (Homo sapiens) [TaxId: 9606]}
>d1nr0a2 b.69.4.1 (A:313-611) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1pgua1 b.69.4.1 (A:2-326) Actin interacting protein 1 {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1gxra_ b.69.4.1 (A:) Groucho/tle1, C-terminal domain {Human (Homo sapiens) [TaxId: 9606]}
>d1pgua2 b.69.4.1 (A:327-613) Actin interacting protein 1 {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1k8kc_ b.69.4.1 (C:) Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taurus) [TaxId: 9913]}
>d1nr0a1 b.69.4.1 (A:2-312) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1tbga_ b.69.4.1 (A:) beta1-subunit of the signal-transducing G protein heterotrimer {Cow (Bos taurus) [TaxId: 9913]}
>d1erja_ b.69.4.1 (A:) Tup1, C-terminal domain {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1ri6a_ b.69.11.1 (A:) Putative isomerase YbhE {Escherichia coli [TaxId: 562]}
>d1nr0a2 b.69.4.1 (A:313-611) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1nr0a1 b.69.4.1 (A:2-312) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1erja_ b.69.4.1 (A:) Tup1, C-terminal domain {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1q7fa_ b.68.9.1 (A:) Brain tumor cg10719-pa {Fruit fly (Drosophila melanogaster) [TaxId: 7227]}
>d1l0qa2 b.69.2.3 (A:1-301) Surface layer protein {Archaeon Methanosarcina mazei [TaxId: 2209]}