Citrus sinensis ID: 000548


Local Sequence Feature Prediction

Prediction (Method) and Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-------240-------250-------260-------270-------280-------290-------300-------310-------320-------330-------340-------350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520-------530-------540-------550-------560-------570-------580-------590-------600-------610-------620-------630-------640-------650-------660-------670-------680-------690-------700-------710-------720-------730-------740-------750-------760-------770-------780-------790-------800-------810-------820-------830-------840-------850-------860-------870-------880-------890-------900-------910-------920-------930-------940-------950-------960-------970-------980-------990------1000------1010------1020------1030------1040------1050------1060------1070------1080------1090------1100------1110------1120------1130------1140------1150------1160------1170------1180------1190------1200------1210------1220------1230------1240------1250------1260------1270------1280------1290------1300------1310------1320------1330------1340------1350------1360------1370------1380------1390------1400------1410------1420------1430-
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
ccccEEEccccccEEEEEEEEEEEcccccccccccccccccccccccccccccccccEEEEEEcEEEEEEEEEcccccccccccccccccEEEcccccccEEEEEEEEEEEEEEEEEEEEcccccccccccEEEEEEccccEEEEEEEcccccEEEEEEEEEcccccccccccccccccccEEEEcccccEEEEEEEccEEEEEEccccccccccccccccccccccccccccEEEEccccccccEEEEEEEccccccEEEEEEEccccccccccccccEEEEEEEEEEcccccccEEEEcccccccccEEEEEcccccEEEEEEccEEEEEEcccccEEEccccccccccccccccEEEEEEEEEEEEEEEEccEEEEEEccccEEEEEEEEcccEEEEEEEEEEccccccccEEEEcccEEEEEEEEccEEEEEEEEcccccccccccccccccccccccccHHccccccccccccccccccccccccccccccccEEEEEEEEcccccccccEEEcccccccccccccccccccccEEEEcccccEEEEEEEccccccccccccccccccccccEEEEEEEccEEEEEEcccEEEEEEcccccccccEEEEEEEccccEEEEEEcccEEEEEccccEEEEEccccccccccccccccEEEEEEcccEEEEEEEcccEEEEEccccccEEEEccccccccccccEEEEEEEEcccccccccccccccccccccccccccccccccccccEEEEEEEcccEEEEEEccccEEEEEEEcccccccEEEEcccccccccccccccccccccccccccccccccEEEEEEEEcccccccccEEEEEEEccEEEEEEEEEcccccccccccccccccEEEEEEcccccccccccccccccccccccccccccccEEEEEEEcccccEEEEEcccccEEEEEEcccEEEEEccccccEEEEEEEcccccccEEEEEEEccEEEEEEccccccccccccEEEEEccccccEEEEEccccEEEEEEEcccccccHHHHHHcccccccccccccccccccccccccccEEEEEEEcccccccccEEEEEEEccccccEEEEEEEEEcccccccccEEEEEEEEEEccccccccccEEEEEEEEcccccccEEEEEEEEEEcccccEEccccccEEEEEccEEEEEEcccccccEEEEEccccEEEEEEEEEccEEEEEEccccEEEEEEEccccEEEEEEEccccccccEEEEEEcccEEEEEEEcccccEEEEEEcccccccccccccccEEEEEccccEEEEEEEEEEccccccccccccccccccEEEEEEEccccEEEEEEccHHHHHHHHHHHHHHHHccccccccccccccccccccccccccccccEEHHHHHHHccccHHHHHHHHHHHcccHHHHHHHHHHHHHccccc
ccHHHHHHccccccHHHHHEEEEcccHHcccccccccccccccccccccccccccccEEEEEccEEEEEEEEEcccccccEEcccccccccccccccccEEEEEEEEEEEEEEEEEEEEEEccccccccccEEEEEEcccEEEEEEEcccccccEEEEEEEccccccccccccccccccccEEEEcccccEEEEEEcccEEEEEEEccccHHHcccccccccccccccccccEEEEEcHHccccccEEEHHcccccccEEEEEEcccccccccccccccEEEEEEEEEEcccccccEEEEEccccccHcEEEEccccccEEEEEEEcEEEEEcccccccEEEccccccccccccccccccEEEEEccEEEEccccEEEEEEccccEEEEEEEEcccEEEEEEEEEccccccccEEEEEcccEEEEEEcccccEEEEEEcccccccccccHccccccccccccHHcccccccccccccccccccccHccccHcccccccccEEEEEEcccccccccccEEEcccccHcccccccccccccEEEEccccccEEEEEEcccccccccccccccccccccEEEEEEEcccEEEEEEcccEEEEcccccccccccEEEEEEcccccEEEEEEcccEEEEcccccEEEEcccccccccccccccccEEEEEEcccEEEEEEEccEEEEEEEcccccEEEEcccccccccccccEEEEEEEcccccccccccccccccccccccccccccccccccccEEEEEEEcccEEEEEEccccEEEEEEccccccccEEEcccccccccccccccccccccccccccccccccccEEEEEEEcccccccccEEEEEEccccEEEEEEEEcccccccccccccccEEEEEcccccEccccccccccccccccccccccccccccEEEEEEccccccEEEEEEccccEEEEEEccccccccccccccEEEEcccccccccccEEEEEcccEEEEEEccccccccccccEEEEEccccccEEEEEccccEEEEEEcccccccccHHHHHccccHcccccccHHHccHHccccccccccEEEEEccccccccEEEEEEEEcccccEEEEEEEEEEEccccccccEEEEEEEEEccccccccccEEEEEEEEEEccccccccEEEEEccccccEEEEEccccEEEEEEccEEEEEEccccccEEEEEEccccEEEEEEEEcccEEEEEEcccEEEEEEEcccccEEEEEEccccccEEEEEEEEcccccEEEEEEcccccEEEEEEcccccccccccEEEEEEEEEccccccEEEEEcccccccccccccccccccccEEEEEEEccccEEEEEcccHHHHHHHHHHHHHHHHcccccccccHHHHHccccccccccccccccccHHHHHHHHHccHHHHHHHHHHHcccHHHHHHHHHHHHHHHccc
msfaaykmmhwptgiancgsgfithsradyvpqipliqteeldselpskrgigpvpnlVVTAANVIEIYVVRVQeegskesknsgeTKRRVLMDGISAASLELVCHYRLHGNVESLAILSqggadnsrrrDSIILAFEDAKisvlefddsihglritsmhcfespewlhlkrgresfargplvkvdpqgrcggvLVYGLQMIILKAsqggsglvgdedtfgsgggfsariESSHvinlrdldmkHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSIsttlkqhpliwsamnlphdaykllavpspiggvLVVGANTIHYHSQSASCALALNNYAVsldssqelprssfsVELDAAHATWLQNDVALLSTKTGDLVLLTVVYdgrvvqrldlsktnpsvltsdittIGNSLFFLGSRLGDSLLVQFtcgsgtsmlssglkeefgdieadapstkrlrrSSSDALQdmvngeelslygsasnntesAQKTFSFAVRDSlvnigplkdfsyglrinadasatgiskqsnyelvelpgckgiwtvyhkssrghnadssrmaayddEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGArildgsymtqdlsfgpsnsesgsgsenstVLSVSIAdpyvllgmsdgsirllvgdpstctvsvqtpaaiesskkpvssctlyhdkgpepwlrktstdawlstgvgeaidgadggpldqgdiysVVCYesgaleifdvpnfncvftvdkfvsgrtHIVDTYMREALKDSeteinssseegtgqgrkeNIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYlfegpentsksddpvstsrslsvsnvsasrlrnlrfsrtpldaytreetphgapcqritifknisghqgfflsgsrpcwcmVFRErlrvhpqlcdgsIVAFTVLHNVNCNHGFIYVTSQGILkicqlpsgstydnywpvqkiplkatphqitYFAEKNLYPLIVSVPVLKPLNQVLSLLIDqevghqidnhnlssvdlhrtytvEEYEVrilepdraggpwqtratipmqssenALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFstgrnadnpqnLVTEVYSKELKGAISALASLQGHLliasgpkiilhkwtgtelngiafydapplyvVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKdfgsldcfateflidgstlslvvsdeqKNIQIFyyapkmseswkgqkllsraefhvgAHVTKFLRLQMLAtssdrtgaapgsdktNRFALLFGtldgsigciapldeLTFRRLQSLQKKLVdsvphvaglnprsfrqfhsngkahrpgpdsivdcellshyemlplEEQLEIAHQTGTTRSQILSNlndlalgtsfl
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVrvqeegskesknsgetkrrVLMDGISAASLELVCHYRLHGNVESLAIlsqggadnsrRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFArgplvkvdpqgRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGrvvqrldlsktnpsvltsdITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKeefgdieadapstkrlrrsssDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADAsatgiskqsnyELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTiaagnlfgrrrVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQtpaaiesskkpvssCTLYHDkgpepwlrkTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMRealkdseteinssseegtgqgrkenIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPEntsksddpvstsrslsvsnvsasrlrnlrfsrtpldaytreetphgapCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGhqidnhnlssvdlhrtYTVEEYEVRilepdraggpwqtratipmqssenaLTVRVVTLFNtttkenetlLAIGTAYVQGEDVAARGRVLLFstgrnadnpqnLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSsdrtgaapgsdKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQIlsnlndlalgtsfl
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQeegskesknsgeTKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPsnsesgsgsensTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPvstsrslsvsnvsasrlrnLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
***AAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEE********RGIGPVPNLVVTAANVIEIYVVRV*****************VLMDGISAASLELVCHYRLHGNVESLAILSQGGA****RRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSL**********FSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSG*********************************************************TFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHK***********MAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMT********************VLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQ**************CTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMR****************************MKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFE************************************************GAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPM***ENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLA***************NRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGL*********************IVDCELLSHYEMLPLEEQLEIAH***********************
**FAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEEL********GIGPVPNLVVTAANVIEIYVVRV********************DGISAASLELVCHYRLHGNVESLAILSQG*A**SRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEW*********FARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGS******************IESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSG**********************KRLRRSSSDALQDMVNGEELS**************TFSFAVRDSLVNIGPLKDFSYGLRINA************YELVELPGCKGIWTVY****************YDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAW*****************DQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREAL************************SMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFE***************************************************CQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLN******************NLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYA************LSRAEFHVGAHVTKFLRLQML****************NRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQE************KRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADA************ALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLS****************VLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPA*************LYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALK******************KENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGP*********************SASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATS*********SDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
*SFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSG*V***********FSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCG*********************************ALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSS**************DEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLR**STDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEI********GQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLR******************HGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLL*D**VGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATS***********KTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYELVELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL
no confident homologs detected
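
Each prediction track above is a string with one character per residue, aligned to the protein sequence. The following is a minimal sketch, not part of the original page, of how such a track could be collapsed into 1-based residue intervals. The function name `track_segments` and the toy track are hypothetical, and which symbols count as a predicted feature depends on the predictor, so the `symbols` argument is an assumption the caller has to supply.

```python
from itertools import groupby


def track_segments(track, symbols):
    """Return 1-based inclusive (start, end, symbol) runs for the chosen symbols."""
    segments = []
    pos = 1  # residue numbering starts at 1, as in the marker line
    for symbol, run in groupby(track):
        length = sum(1 for _ in run)
        if symbol in symbols:
            segments.append((pos, pos + length - 1, symbol))
        pos += length
    return segments


# Toy secondary-structure track (illustrative, not copied from the page):
# 'c' = coil, 'E' = strand, 'H' = helix
example = "ccccEEEccHHHHcc"
print(track_segments(example, "EH"))  # [(5, 7, 'E'), (10, 13, 'H')]
```

The same helper could be applied to any other track by passing the symbols of interest for that predictor.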

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   1431    2.2.26 [Sep-21-2011]
Q9FGR0  1442    Cleavage and polyadenylat  yes       no        0.987    0.979    0.747     0.0
Q7XWP1  1441    Probable cleavage and pol  yes       no        0.978    0.971    0.640     0.0
Q9V726  1455    Cleavage and polyadenylat  yes       no        0.905    0.890    0.282     1e-145
Q10569  1444    Cleavage and polyadenylat  yes       no        0.444    0.440    0.317     1e-91
Q9EPU4  1441    Cleavage and polyadenylat  yes       no        0.444    0.441    0.319     2e-91
Q10570  1443    Cleavage and polyadenylat  yes       no        0.443    0.439    0.313     4e-90
Q7SEY2  1456    Protein cft-1 OS=Neurospo  N/A       no        0.858    0.844    0.243     4e-78
O74733  1441    Protein cft1 OS=Schizosac  yes       no        0.865    0.859    0.232     3e-77
Q2TZ19  1393    Protein cft1 OS=Aspergill  yes       no        0.833    0.856    0.240     8e-77
Q5BDG7  1339    Protein cft1 OS=Emericell  yes       no        0.853    0.912    0.241     8e-73
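
As a rough illustration of how hits like those listed above might be screened before transferring an annotation, the sketch below filters on coverage and identity. The `Hit` type, the `confident_hits` function, and both thresholds (0.8 coverage, 0.4 identity) are assumptions made for this example; the page does not state which criteria it applies. The four rows are copied from the hit list.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    accession: str
    length: int
    q_cover: float   # fraction of the query covered by the alignment
    h_cover: float   # fraction of the hit covered by the alignment
    identity: float  # fraction of identical aligned positions


# First four hits from the list above
hits = [
    Hit("Q9FGR0", 1442, 0.987, 0.979, 0.747),
    Hit("Q7XWP1", 1441, 0.978, 0.971, 0.640),
    Hit("Q9V726", 1455, 0.905, 0.890, 0.282),
    Hit("Q10569", 1444, 0.444, 0.440, 0.317),
]


def confident_hits(hits, min_cover=0.8, min_identity=0.4):
    """Keep hits whose lower coverage and identity both meet the (assumed) cutoffs."""
    return [
        h for h in hits
        if min(h.q_cover, h.h_cover) >= min_cover and h.identity >= min_identity
    ]


print([h.accession for h in confident_hits(hits)])  # ['Q9FGR0', 'Q7XWP1']
```

Taking the minimum of the two coverage values is a design choice: it rejects hits that align to only a fragment of either protein, such as Q10569 here.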
>sp|Q9FGR0|CPSF1_ARATH Cleavage and polyadenylation specificity factor subunit 1 OS=Arabidopsis thaliana GN=CPSF160 PE=1 SV=2
 Score = 2249 bits (5829), Expect = 0.0,   Method: Compositional matrix adjust.
 Identities = 1092/1460 (74%), Positives = 1254/1460 (85%), Gaps = 47/1460 (3%)
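
The counts on the statistics line can be turned back into fractions. This is a small sketch under the assumption that the line follows the usual BLAST text layout of `Name = count/length (percent%)`; the function name `parse_blast_stats` is hypothetical.

```python
import re


def parse_blast_stats(line):
    """Map each 'Name = a/b' field to (a, b, percent as a float)."""
    stats = {}
    for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        num, den = int(num), int(den)
        stats[name] = (num, den, 100.0 * num / den)
    return stats


line = " Identities = 1092/1460 (74%), Positives = 1254/1460 (85%), Gaps = 47/1460 (3%)"
stats = parse_blast_stats(line)
for name, (num, den, pct) in stats.items():
    print(f"{name}: {num}/{den} = {pct:.1f}%")
```

Here 1092/1460 is about 0.748, which agrees with the 74% shown on this line and the identity of 0.747 reported for Q9FGR0 if both values are truncated rather than rounded.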

Query: 1    MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQT-EELDSELPS-KRGIGPVPNL 58
            MSFAAYKMMHWPTG+ NC SG+ITHS +D   QIP++   +++++E P+ KRGIGP+PN+
Sbjct: 1    MSFAAYKMMHWPTGVENCASGYITHSLSDSTLQIPIVSVHDDIEAEWPNPKRGIGPLPNV 60

Query: 59   VVTAANVIEIYVVRVQEEG-SKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLA 117
            V+TAAN++E+Y+VR QEEG ++E +N    KR  +MDG+   SLELVCHYRLHGNVES+A
Sbjct: 61   VITAANILEVYIVRAQEEGNTQELRNPKLAKRGGVMDGVYGVSLELVCHYRLHGNVESIA 120

Query: 118  ILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESF 177
            +L  GG ++S+ RDSIIL F DAKISVLEFDDSIH LR+TSMHCFE P+WLHLKRGRESF
Sbjct: 121  VLPMGGGNSSKGRDSIILTFRDAKISVLEFDDSIHSLRMTSMHCFEGPDWLHLKRGRESF 180

Query: 178  ARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVIN 237
             RGPLVKVDPQGRCGGVLVYGLQMIILK SQ GSGLVGD+D F SGG  SAR+ESS++IN
Sbjct: 181  PRGPLVKVDPQGRCGGVLVYGLQMIILKTSQVGSGLVGDDDAFSSGGTVSARVESSYIIN 240

Query: 238  LRDLDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPL 297
            LRDL+MKHVKDF+F+HGYIEPV+VIL E E TWAGRVSWKHHTC++SALSI++TLKQHP+
Sbjct: 241  LRDLEMKHVKDFVFLHGYIEPVIVILQEEEHTWAGRVSWKHHTCVLSALSINSTLKQHPV 300

Query: 298  IWSAMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPR 357
            IWSA+NLPHDAYKLLAVPSPIGGVLV+ ANTIHYHSQSASCALALNNYA S DSSQELP 
Sbjct: 301  IWSAINLPHDAYKLLAVPSPIGGVLVLCANTIHYHSQSASCALALNNYASSADSSQELPA 360

Query: 358  SSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITT 417
            S+FSVELDAAH TW+ NDVALLSTK+G+L+LLT++YDGR VQRLDLSK+  SVL SDIT+
Sbjct: 361  SNFSVELDAAHGTWISNDVALLSTKSGELLLLTLIYDGRAVQRLDLSKSKASVLASDITS 420

Query: 418  IGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQD 477
            +GNSLFFLGSRLGDSLLVQF+C SG +    GL++E  DIE +    KRLR +S D  QD
Sbjct: 421  VGNSLFFLGSRLGDSLLVQFSCRSGPAASLPGLRDEDEDIEGEGHQAKRLRMTS-DTFQD 479

Query: 478  MVNGEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQS 537
             +  EELSL+GS  NN++SAQK+FSFAVRDSLVN+GP+KDF+YGLRINADA+ATG+SKQS
Sbjct: 480  TIGNEELSLFGSTPNNSDSAQKSFSFAVRDSLVNVGPVKDFAYGLRINADANATGVSKQS 539

Query: 538  NYELV--------------------------ELPGCKGIWTVYHKSSRGHNADSSRMAAY 571
            NYELV                          ELPGCKGIWTVYHKSSRGHNADSS+MAA 
Sbjct: 540  NYELVCCSGHGKNGALCVLRQSIRPEMITEVELPGCKGIWTVYHKSSRGHNADSSKMAAD 599

Query: 572  DDEYHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGA 631
            +DEYHAYLIISLEARTMVLETADLLTEVTESVDY+VQGRTIAAGNLFGRRRVIQVFE GA
Sbjct: 600  EDEYHAYLIISLEARTMVLETADLLTEVTESVDYYVQGRTIAAGNLFGRRRVIQVFEHGA 659

Query: 632  RILDGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCT 691
            RILDGS+M Q+LSFG SNSES SGSE+STV SVSIADPYVLL M+D SIRLLVGDPSTCT
Sbjct: 660  RILDGSFMNQELSFGASNSESNSGSESSTVSSVSIADPYVLLRMTDDSIRLLVGDPSTCT 719

Query: 692  VSVQTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGD 751
            VS+ +P+ +E SK+ +S+CTLYHDKGPEPWLRK STDAWLS+GVGEA+D  DGGP DQGD
Sbjct: 720  VSISSPSVLEGSKRKISACTLYHDKGPEPWLRKASTDAWLSSGVGEAVDSVDGGPQDQGD 779

Query: 752  IYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGT 811
            IY VVCYESGALEIFDVP+FNCVF+VDKF SGR H+ D  + E     E E+N +SE+ T
Sbjct: 780  IYCVVCYESGALEIFDVPSFNCVFSVDKFASGRRHLSDMPIHEL----EYELNKNSEDNT 835

Query: 812  GQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPV 871
                 + I + +VVELAMQRWS HH+RPFLFA+L DGTILCY AYLF+G ++T K+++ +
Sbjct: 836  S---SKEIKNTRVVELAMQRWSGHHTRPFLFAVLADGTILCYHAYLFDGVDST-KAENSL 891

Query: 872  STSRSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGS 931
            S+    ++++  +S+LRNL+F R PLD  TRE T  G   QRIT+FKNISGHQGFFLSGS
Sbjct: 892  SSENPAALNSSGSSKLRNLKFLRIPLDTSTREGTSDGVASQRITMFKNISGHQGFFLSGS 951

Query: 932  RPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDN 991
            RP WCM+FRERLR H QLCDGSI AFTVLHNVNCNHGFIYVT+QG+LKICQLPS S YDN
Sbjct: 952  RPGWCMLFRERLRFHSQLCDGSIAAFTVLHNVNCNHGFIYVTAQGVLKICQLPSASIYDN 1011

Query: 992  YWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSS 1051
            YWPVQKIPLKATPHQ+TY+AEKNLYPLIVS PV KPLNQVLS L+DQE G Q+DNHN+SS
Sbjct: 1012 YWPVQKIPLKATPHQVTYYAEKNLYPLIVSYPVSKPLNQVLSSLVDQEAGQQLDNHNMSS 1071

Query: 1052 VDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLL 1111
             DL RTYTVEE+E++ILEP+R+GGPW+T+A IPMQ+SE+ALTVRVVTL N +T ENETLL
Sbjct: 1072 DDLQRTYTVEEFEIQILEPERSGGPWETKAKIPMQTSEHALTVRVVTLLNASTGENETLL 1131

Query: 1112 AIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIAS 1171
            A+GTAYVQGEDVAARGRVLLFS G+N DN QN+VTEVYS+ELKGAISA+AS+QGHLLI+S
Sbjct: 1132 AVGTAYVQGEDVAARGRVLLFSFGKNGDNSQNVVTEVYSRELKGAISAVASIQGHLLISS 1191

Query: 1172 GPKIILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNL 1231
            GPKIILHKW GTELNG+AF+DAPPLYVVS+N+VK+FILLGD+HKSIYFLSWKEQG+QL+L
Sbjct: 1192 GPKIILHKWNGTELNGVAFFDAPPLYVVSMNVVKSFILLGDVHKSIYFLSWKEQGSQLSL 1251

Query: 1232 LAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVG 1291
            LAKDF SLDCFATEFLIDGSTLSL VSDEQKNIQ+FYYAPKM ESWKG KLLSRAEFHVG
Sbjct: 1252 LAKDFESLDCFATEFLIDGSTLSLAVSDEQKNIQVFYYAPKMIESWKGLKLLSRAEFHVG 1311

Query: 1292 AHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQ 1351
            AHV+KFLRLQM+++         G+DK NRFALLFGTLDGS GCIAPLDE+TFRRLQSLQ
Sbjct: 1312 AHVSKFLRLQMVSS---------GADKINRFALLFGTLDGSFGCIAPLDEVTFRRLQSLQ 1362

Query: 1352 KKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTG 1411
            KKLVD+VPHVAGLNP +FRQF S+GKA R GPDSIVDCELL HYEMLPLEEQLE+AHQ G
Sbjct: 1363 KKLVDAVPHVAGLNPLAFRQFRSSGKARRSGPDSIVDCELLCHYEMLPLEEQLELAHQIG 1422

Query: 1412 TTRSQILSNLNDLALGTSFL 1431
            TTR  IL +L DL++GTSFL
Sbjct: 1423 TTRYSILKDLVDLSVGTSFL 1442




CPSF plays a key role in pre-mRNA 3'-end formation, recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. This subunit is involved in the RNA recognition step of the polyadenylation reaction.
Arabidopsis thaliana (taxid: 3702)
>sp|Q7XWP1|CPSF1_ORYSJ Probable cleavage and polyadenylation specificity factor subunit 1 OS=Oryza sativa subsp. japonica GN=Os04g0252200 PE=3 SV=2
>sp|Q9V726|CPSF1_DROME Cleavage and polyadenylation specificity factor subunit 1 OS=Drosophila melanogaster GN=Cpsf160 PE=1 SV=1
>sp|Q10569|CPSF1_BOVIN Cleavage and polyadenylation specificity factor subunit 1 OS=Bos taurus GN=CPSF1 PE=1 SV=1
>sp|Q9EPU4|CPSF1_MOUSE Cleavage and polyadenylation specificity factor subunit 1 OS=Mus musculus GN=Cpsf1 PE=1 SV=1
>sp|Q10570|CPSF1_HUMAN Cleavage and polyadenylation specificity factor subunit 1 OS=Homo sapiens GN=CPSF1 PE=1 SV=2
>sp|Q7SEY2|CFT1_NEUCR Protein cft-1 OS=Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) GN=cft-1 PE=3 SV=2
>sp|O74733|CFT1_SCHPO Protein cft1 OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=cft1 PE=3 SV=1
>sp|Q2TZ19|CFT1_ASPOR Protein cft1 OS=Aspergillus oryzae (strain ATCC 42149 / RIB 40) GN=cft1 PE=3 SV=1
>sp|Q5BDG7|CFT1_EMENI Protein cft1 OS=Emericella nidulans (strain FGSC A4 / ATCC 38163 / CBS 112.46 / NRRL 194 / M139) GN=cft1 PE=3 SV=1
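
The SWISS-PROT hit headers above follow a regular layout (database, accession, entry name, description, then OS=, GN=, PE= and SV= fields). As a minimal sketch, assuming that layout holds for every header, the fields could be pulled apart as follows. The function name parse_header and the regular expression are illustrative and are not part of the report or of any library.

```python
import re

# Hypothetical pattern for headers shaped like the ones listed above:
# >db|accession|entry description OS=organism GN=gene PE=n SV=n
HEADER_RE = re.compile(
    r"^>(?P<db>\w+)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+"
    r"(?P<description>.+?)\s+OS=(?P<organism>.+?)"
    r"(?:\s+GN=(?P<gene>\S+))?"
    r"(?:\s+PE=(?P<pe>\d+))?"
    r"(?:\s+SV=(?P<sv>\d+))?\s*$"
)


def parse_header(line):
    """Split one header line into a dict of named fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized header: " + line)
    return match.groupdict()


if __name__ == "__main__":
    example = (
        ">sp|Q7XWP1|CPSF1_ORYSJ Probable cleavage and polyadenylation "
        "specificity factor subunit 1 OS=Oryza sativa subsp. japonica "
        "GN=Os04g0252200 PE=3 SV=2"
    )
    for key, value in parse_header(example).items():
        print(key, "=", value)
```

Fields that are absent from a header (for example a missing GN=) come back as None rather than raising an error.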

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition  Q cover  H cover  Identity  E-value
Query      1431
225455571  1442    PREDICTED: cleavage and polyadenylation  0.989  0.981  0.800  0.0
296084122  1448    unnamed protein product [Vitis vinifera]  0.989  0.977  0.796  0.0
255539681  1461    cleavage and polyadenylation specificity  1.0  0.979  0.804  0.0
356559917  1447    PREDICTED: cleavage and polyadenylation  0.991  0.980  0.786  0.0
356530945  1449    PREDICTED: cleavage and polyadenylation  0.992  0.979  0.785  0.0
224120960  1455    predicted protein [Populus trichocarpa]  0.992  0.975  0.784  0.0
297792471  1444    hypothetical protein ARALYDRAFT_495232 [  0.988  0.979  0.750  0.0
449470342  1504    PREDICTED: cleavage and polyadenylation  0.996  0.948  0.740  0.0
30696088   1442    cleavage and polyadenylation specificity  0.987  0.979  0.747  0.0
24415580   1442    putative cleavage and polyadenylation sp  0.987  0.979  0.747  0.0
>gi|225455571|ref|XP_002268371.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Vitis vinifera]
 Score = 2441 bits (6327), Expect = 0.0,   Method: Compositional matrix adjust.
 Identities = 1166/1457 (80%), Positives = 1283/1457 (88%), Gaps = 41/1457 (2%)

Query: 1    MSFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVV 60
            MS+AAYKMMHWPTGI NC SGF+THSRAD+ PQI  IQT++L+SE P+KR IGP+PNL+V
Sbjct: 1    MSYAAYKMMHWPTGIENCASGFVTHSRADFAPQIAPIQTDDLESEWPTKRQIGPLPNLIV 60

Query: 61   TAANVIEIYVVRVQEEGSKESKNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILS 120
            TAAN++E+Y+VRVQE+ S+ES+ S ETKR  +M GIS A+LELVC YRLHGNVE++ +L 
Sbjct: 61   TAANILEVYMVRVQEDDSRESRASAETKRGGVMAGISGAALELVCQYRLHGNVETMTVLP 120

Query: 121  QGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHCFESPEWLHLKRGRESFARG 180
             GG DNSRRRDSIILAF+DAKISVLEFDDSIHGLR +SMHCFE PEW HLKRG ESFARG
Sbjct: 121  SGGGDNSRRRDSIILAFQDAKISVLEFDDSIHGLRTSSMHCFEGPEWFHLKRGHESFARG 180

Query: 181  PLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRD 240
            PLVKVDPQGRC GVLVYGLQMIILKASQ G GLVGDE+   SG   SAR+ESS+VI+LRD
Sbjct: 181  PLVKVDPQGRCSGVLVYGLQMIILKASQAGYGLVGDEEALSSGSAVSARVESSYVISLRD 240

Query: 241  LDMKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWS 300
            LDMKHVKDF FVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWS
Sbjct: 241  LDMKHVKDFTFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWS 300

Query: 301  AMNLPHDAYKLLAVPSPIGGVLVVGANTIHYHSQSASCALALNNYAVSLDSSQELPRSSF 360
            A+NLPHDAYKLL VPSPIGGV+V+ AN+IHYHSQSASCALALNNYAVS D+SQE+PRSSF
Sbjct: 301  AVNLPHDAYKLLPVPSPIGGVVVISANSIHYHSQSASCALALNNYAVSADNSQEMPRSSF 360

Query: 361  SVELDAAHATWLQNDVALLSTKTGDLVLLTVVYDGRVVQRLDLSKTNPSVLTSDITTIGN 420
            SVELDAA+ATWL NDVA+LSTKTG+L+LLT+ YDGRVV RLDLSK+  SVLTS I  IGN
Sbjct: 361  SVELDAANATWLSNDVAMLSTKTGELLLLTLAYDGRVVHRLDLSKSRASVLTSGIAAIGN 420

Query: 421  SLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQDMVN 480
            SLFFLGSRLGDSLLVQFT     S+LSS +KEE GDIE D PS KRLR+SSSDALQDMVN
Sbjct: 421  SLFFLGSRLGDSLLVQFT-----SILSSSVKEEVGDIEGDVPSAKRLRKSSSDALQDMVN 475

Query: 481  GEELSLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADASATGISKQSNYE 540
            GEELSLYGSA N+TE++QKTFSF+VRDS +N+GPLKDF+YGLRINAD  ATGI+KQSNYE
Sbjct: 476  GEELSLYGSAPNSTETSQKTFSFSVRDSFINVGPLKDFAYGLRINADPKATGIAKQSNYE 535

Query: 541  LV--------------------------ELPGCKGIWTVYHKSSRGHNADSSRMAAYDDE 574
            LV                          ELPGCKGIWTVYHK++RGHNADS++MA  DDE
Sbjct: 536  LVCCSGHGKNGALCILQQSIRPEMITEVELPGCKGIWTVYHKNTRGHNADSTKMATKDDE 595

Query: 575  YHAYLIISLEARTMVLETADLLTEVTESVDYFVQGRTIAAGNLFGRRRVIQVFERGARIL 634
            YHAYLIISLE+RTMVLETADLL EVTESVDY+VQG TI+AGNLFGRRRV+QV+ RGARIL
Sbjct: 596  YHAYLIISLESRTMVLETADLLGEVTESVDYYVQGCTISAGNLFGRRRVVQVYARGARIL 655

Query: 635  DGSYMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSV 694
            DG++MTQDL            SE+STVLSVSIADPYVLL MSDG+I+LLVGDPSTCTVS+
Sbjct: 656  DGAFMTQDLPI----------SESSTVLSVSIADPYVLLRMSDGNIQLLVGDPSTCTVSI 705

Query: 695  QTPAAIESSKKPVSSCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYS 754
              PA  ESSKK +S+CTLYHDKGPEPWLRKTSTDAWLSTG+GEAIDGADG   DQGDIY 
Sbjct: 706  NIPAVFESSKKSISACTLYHDKGPEPWLRKTSTDAWLSTGIGEAIDGADGAAQDQGDIYC 765

Query: 755  VVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQG 814
            VV YESG LEIFDVPNFNCVF+VDKF+SG  H+VDT + E  +D++  ++ +SEE   QG
Sbjct: 766  VVSYESGDLEIFDVPNFNCVFSVDKFMSGNAHLVDTLILEPSEDTQKVMSKNSEEEADQG 825

Query: 815  RKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTS 874
            RKEN H++KVVELAMQRWS  HSRPFLF ILTDGTILCY AYL+EGPE+T K+++ VS  
Sbjct: 826  RKENAHNIKVVELAMQRWSGQHSRPFLFGILTDGTILCYHAYLYEGPESTPKTEEAVSAQ 885

Query: 875  RSLSVSNVSASRLRNLRFSRTPLDAYTREETPHGAPCQRITIFKNISGHQGFFLSGSRPC 934
             SLS+SNVSASRLRNLRF R PLD YTREE   G    R+T+FKNI G QG FLSGSRP 
Sbjct: 886  NSLSISNVSASRLRNLRFVRVPLDTYTREEALSGTTSPRMTVFKNIGGCQGLFLSGSRPL 945

Query: 935  WCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTSQGILKICQLPSGSTYDNYWP 994
            W MVFRER+RVHPQLCDGSIVAFTVLHN+NCNHG IYVTSQG LKICQLP+ S+YDNYWP
Sbjct: 946  WFMVFRERIRVHPQLCDGSIVAFTVLHNINCNHGLIYVTSQGFLKICQLPAVSSYDNYWP 1005

Query: 995  VQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDL 1054
            VQKIPLK TPHQ+TYFAEKNLYPLIVSVPVLKPLN VLS L+DQE GHQ++N NLSS +L
Sbjct: 1006 VQKIPLKGTPHQVTYFAEKNLYPLIVSVPVLKPLNHVLSSLVDQEAGHQLENDNLSSDEL 1065

Query: 1055 HRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIG 1114
            HR+Y+V+E+EVR+LEP+++G PWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIG
Sbjct: 1066 HRSYSVDEFEVRVLEPEKSGAPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIG 1125

Query: 1115 TAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPK 1174
            TAYVQGEDVAARGRVLLFS G+N DN QNLV+E+YSKELKGAISA+ASLQGHLLIASGPK
Sbjct: 1126 TAYVQGEDVAARGRVLLFSVGKNTDNSQNLVSEIYSKELKGAISAVASLQGHLLIASGPK 1185

Query: 1175 IILHKWTGTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAK 1234
            IILHKWTGTELNG+AF+DAPPLYVVSLNIVKNFILLGDIH+SIYFLSWKEQGAQLNLLAK
Sbjct: 1186 IILHKWTGTELNGVAFFDAPPLYVVSLNIVKNFILLGDIHRSIYFLSWKEQGAQLNLLAK 1245

Query: 1235 DFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHV 1294
            DFGSLDCFATEFLIDGSTLSL+VSD+QKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHV
Sbjct: 1246 DFGSLDCFATEFLIDGSTLSLIVSDDQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHV 1305

Query: 1295 TKFLRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKL 1354
            TKFLRLQML  SSDRT A  GSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKL
Sbjct: 1306 TKFLRLQMLPASSDRTSATQGSDKTNRFALLFGTLDGSIGCIAPLDELTFRRLQSLQKKL 1365

Query: 1355 VDSVPHVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTR 1414
            VD+VPHVAGLNPRSFRQF SNGKAHRPGPD+IVDCELL HYEMLP EEQLEIA Q GTTR
Sbjct: 1366 VDAVPHVAGLNPRSFRQFRSNGKAHRPGPDNIVDCELLCHYEMLPFEEQLEIAQQIGTTR 1425

Query: 1415 SQILSNLNDLALGTSFL 1431
             QILSNLNDL+LGTSFL
Sbjct: 1426 MQILSNLNDLSLGTSFL 1442
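
The percentages in the summary line of this alignment (Identities, Positives, Gaps) can be recomputed from the printed counts. The sketch below is illustrative only: the function name alignment_stats is made up, and the use of integer truncation is an assumption that happens to reproduce the figures printed in this report (for example 41/1457 gives 2%, not 3%), not a statement about how the BLAST software defines them.

```python
import re

# Matches fragments such as "Identities = 1166/1457" in a BLAST summary line.
STAT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def alignment_stats(line):
    """Return {name: (count, alignment_length, truncated_percent)}."""
    stats = {}
    for name, count, length in STAT_RE.findall(line):
        count, length = int(count), int(length)
        # Assumption: percentages are truncated, not rounded.
        stats[name] = (count, length, 100 * count // length)
    return stats


if __name__ == "__main__":
    summary = (
        " Identities = 1166/1457 (80%), Positives = 1283/1457 (88%), "
        "Gaps = 41/1457 (2%)"
    )
    for name, (count, length, percent) in alignment_stats(summary).items():
        print(f"{name}: {count}/{length} = {percent}%")
```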




Source: Vitis vinifera

Species: Vitis vinifera

Genus: Vitis

Family: Vitaceae

Order: Vitales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|296084122|emb|CBI24510.3| unnamed protein product [Vitis vinifera]
>gi|255539681|ref|XP_002510905.1| cleavage and polyadenylation specificity factor cpsf, putative [Ricinus communis] gi|223550020|gb|EEF51507.1| cleavage and polyadenylation specificity factor cpsf, putative [Ricinus communis]
>gi|356559917|ref|XP_003548242.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Glycine max]
>gi|356530945|ref|XP_003534039.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Glycine max]
>gi|224120960|ref|XP_002318462.1| predicted protein [Populus trichocarpa] gi|222859135|gb|EEE96682.1| predicted protein [Populus trichocarpa]
>gi|297792471|ref|XP_002864120.1| hypothetical protein ARALYDRAFT_495232 [Arabidopsis lyrata subsp. lyrata] gi|297309955|gb|EFH40379.1| hypothetical protein ARALYDRAFT_495232 [Arabidopsis lyrata subsp. lyrata]
>gi|449470342|ref|XP_004152876.1| PREDICTED: cleavage and polyadenylation specificity factor subunit 1-like [Cucumis sativus]
>gi|30696088|ref|NP_199979.2| cleavage and polyadenylation specificity factor subunit 1 [Arabidopsis thaliana] gi|290457637|sp|Q9FGR0.2|CPSF1_ARATH RecName: Full=Cleavage and polyadenylation specificity factor subunit 1; AltName: Full=Cleavage and polyadenylation specificity factor 160 kDa subunit; Short=AtCPSF160; Short=CPSF 160 kDa subunit gi|332008729|gb|AED96112.1| cleavage and polyadenylation specificity factor subunit 1 [Arabidopsis thaliana]
>gi|24415580|gb|AAN41460.1| putative cleavage and polyadenylation specificity factor 160 kDa subunit [Arabidopsis thaliana]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                      Length  Definition  Q cover  H cover  Identity  E-value
Query                   1431
TAIR|locus:2153122      1442    CPSF160 "cleavage and polyaden  0.610  0.605  0.762  0.0
ZFIN|ZDB-GENE-040709-2  1451    cpsf1 "cleavage and polyadenyl  0.422  0.416  0.310  5.7e-158
UNIPROTKB|F1PC28        1398    CPSF1 "Uncharacterized protein  0.412  0.422  0.328  2.4e-157
UNIPROTKB|Q10569        1444    CPSF1 "Cleavage and polyadenyl  0.412  0.409  0.323  4.5e-156
UNIPROTKB|Q10570        1443    CPSF1 "Cleavage and polyadenyl  0.415  0.412  0.321  5.5e-155
MGI|MGI:2679722         1441    Cpsf1 "cleavage and polyadenyl  0.408  0.405  0.329  5.1e-150
FB|FBgn0024698          1455    Cpsf160 "Cleavage and polyaden  0.346  0.340  0.305  6.4e-130
UNIPROTKB|F1RSN8        1108    CPSF1 "Uncharacterized protein  0.342  0.442  0.353  5.9e-127
DICTYBASE|DDB_G0281585  1628    cpsf1 "cleavage and polyadenyl  0.240  0.211  0.329  5.7e-116
RGD|1306406             1386    Cpsf1 "cleavage and polyadenyl  0.308  0.318  0.353  2.6e-113
TAIR|locus:2153122 CPSF160 "cleavage and polyadenylation specificity factor 160" [Arabidopsis thaliana (taxid:3702)]
 Score = 3551 (1255.1 bits), Expect = 0., Sum P(2) = 0.
 Identities = 679/890 (76%), Positives = 760/890 (85%)

Query:   542 VELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTE 601
             VELPGCKGIWTVYHKSSRGHNADSS+MAA +DEYHAYLIISLEARTMVLETADLLTEVTE
Sbjct:   570 VELPGCKGIWTVYHKSSRGHNADSSKMAADEDEYHAYLIISLEARTMVLETADLLTEVTE 629

Query:   602 SVDYFVQGRTIAAGNLFGRRRVIQVFERGARILDGSYMTQDLSFGPXXXXXXXXXXXXTV 661
             SVDY+VQGRTIAAGNLFGRRRVIQVFE GARILDGS+M Q+LSFG             TV
Sbjct:   630 SVDYYVQGRTIAAGNLFGRRRVIQVFEHGARILDGSFMNQELSFGASNSESNSGSESSTV 689

Query:   662 LSVSIADPYVLLGMSDGSIRLLVGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGPEPW 721
              SVSIADPYVLL M+D SIRLLVGDPSTCTVS+ +P+ +E SK+ +S+CTLYHDKGPEPW
Sbjct:   690 SSVSIADPYVLLRMTDDSIRLLVGDPSTCTVSISSPSVLEGSKRKISACTLYHDKGPEPW 749

Query:   722 LRKTSTDAWLSTGVGEAIDGADGGPLDQGDIYSVVCYESGALEIFDVPNFNCVFTVDKFV 781
             LRK STDAWLS+GVGEA+D  DGGP DQGDIY VVCYESGALEIFDVP+FNCVF+VDKF 
Sbjct:   750 LRKASTDAWLSSGVGEAVDSVDGGPQDQGDIYCVVCYESGALEIFDVPSFNCVFSVDKFA 809

Query:   782 SGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFL 841
             SGR H+ D  + E     E E+N +SE+ T    KE I + +VVELAMQRWS HH+RPFL
Sbjct:   810 SGRRHLSDMPIHEL----EYELNKNSEDNTSS--KE-IKNTRVVELAMQRWSGHHTRPFL 862

Query:   842 FAILTDGTILCYQAYLFEGPENTSKSDDPXXXXXXXXXXXXXXXXXXXLRFSRTPLDAYT 901
             FA+L DGTILCY AYLF+G ++T K+++                    L+F R PLD  T
Sbjct:   863 FAVLADGTILCYHAYLFDGVDST-KAENSLSSENPAALNSSGSSKLRNLKFLRIPLDTST 921

Query:   902 REETPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLH 961
             RE T  G   QRIT+FKNISGHQGFFLSGSRP WCM+FRERLR H QLCDGSI AFTVLH
Sbjct:   922 REGTSDGVASQRITMFKNISGHQGFFLSGSRPGWCMLFRERLRFHSQLCDGSIAAFTVLH 981

Query:   962 NVNCNHGFIYVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVS 1021
             NVNCNHGFIYVT+QG+LKICQLPS S YDNYWPVQKIPLKATPHQ+TY+AEKNLYPLIVS
Sbjct:   982 NVNCNHGFIYVTAQGVLKICQLPSASIYDNYWPVQKIPLKATPHQVTYYAEKNLYPLIVS 1041

Query:  1022 VPVLKPLNQVLSLLIDQEVGHQIDNHNLSSVDLHRTYTVEEYEVRILEPDRAGGPWQTRA 1081
              PV KPLNQVLS L+DQE G Q+DNHN+SS DL RTYTVEE+E++ILEP+R+GGPW+T+A
Sbjct:  1042 YPVSKPLNQVLSSLVDQEAGQQLDNHNMSSDDLQRTYTVEEFEIQILEPERSGGPWETKA 1101

Query:  1082 TIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAARGRVLLFSTGRNADNP 1141
              IPMQ+SE+ALTVRVVTL N +T ENETLLA+GTAYVQGEDVAARGRVLLFS G+N DN 
Sbjct:  1102 KIPMQTSEHALTVRVVTLLNASTGENETLLAVGTAYVQGEDVAARGRVLLFSFGKNGDNS 1161

Query:  1142 QNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSL 1201
             QN+VTEVYS+ELKGAISA+AS+QGHLLI+SGPKIILHKW GTELNG+AF+DAPPLYVVS+
Sbjct:  1162 QNVVTEVYSRELKGAISAVASIQGHLLISSGPKIILHKWNGTELNGVAFFDAPPLYVVSM 1221

Query:  1202 NIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQ 1261
             N+VK+FILLGD+HKSIYFLSWKEQG+QL+LLAKDF SLDCFATEFLIDGSTLSL VSDEQ
Sbjct:  1222 NVVKSFILLGDVHKSIYFLSWKEQGSQLSLLAKDFESLDCFATEFLIDGSTLSLAVSDEQ 1281

Query:  1262 KNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNR 1321
             KNIQ+FYYAPKM ESWKG KLLSRAEFHVGAHV+KFLRLQM+++         G+DK NR
Sbjct:  1282 KNIQVFYYAPKMIESWKGLKLLSRAEFHVGAHVSKFLRLQMVSS---------GADKINR 1332

Query:  1322 FALLFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRP 1381
             FALLFGTLDGS GCIAPLDE+TFRRLQSLQKKLVD+VPHVAGLNP +FRQF S+GKA R 
Sbjct:  1333 FALLFGTLDGSFGCIAPLDEVTFRRLQSLQKKLVDAVPHVAGLNPLAFRQFRSSGKARRS 1392

Query:  1382 GPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL 1431
             GPDSIVDCELL HYEMLPLEEQLE+AHQ GTTR  IL +L DL++GTSFL
Sbjct:  1393 GPDSIVDCELLCHYEMLPLEEQLELAHQIGTTRYSILKDLVDLSVGTSFL 1442


GO:0003676 "nucleic acid binding" evidence=IEA
GO:0005634 "nucleus" evidence=ISM;IEA;IDA
GO:0006378 "mRNA polyadenylation" evidence=ISS
GO:0006379 "mRNA cleavage" evidence=ISS
GO:0005515 "protein binding" evidence=IPI
GO:0005829 "cytosol" evidence=IDA
GO:0006397 "mRNA processing" evidence=RCA
GO:0009909 "regulation of flower development" evidence=RCA
GO:0016570 "histone modification" evidence=RCA
GO:0048449 "floral organ formation" evidence=RCA
ZFIN|ZDB-GENE-040709-2 cpsf1 "cleavage and polyadenylation specific factor 1" [Danio rerio (taxid:7955)]
UNIPROTKB|F1PC28 CPSF1 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|Q10569 CPSF1 "Cleavage and polyadenylation specificity factor subunit 1" [Bos taurus (taxid:9913)]
UNIPROTKB|Q10570 CPSF1 "Cleavage and polyadenylation specificity factor subunit 1" [Homo sapiens (taxid:9606)]
MGI|MGI:2679722 Cpsf1 "cleavage and polyadenylation specific factor 1" [Mus musculus (taxid:10090)]
FB|FBgn0024698 Cpsf160 "Cleavage and polyadenylation specificity factor 160" [Drosophila melanogaster (taxid:7227)]
UNIPROTKB|F1RSN8 CPSF1 "Uncharacterized protein" [Sus scrofa (taxid:9823)]
DICTYBASE|DDB_G0281585 cpsf1 "cleavage and polyadenylation specificity factor 160 kDa subunit" [Dictyostelium discoideum (taxid:44689)]
RGD|1306406 Cpsf1 "cleavage and polyadenylation specific factor 1, 160kDa" [Rattus norvegicus (taxid:10116)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name         Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q9FGR0  CPSF1_ARATH  No assigned EC number  0.7479    0.9874          0.9798        yes       no
Q7XWP1  CPSF1_ORYSJ  No assigned EC number  0.6406    0.9783          0.9715        yes       no

EC Number Prediction by Ezypred Server

Failed to connect to the Ezypred server.

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input:
GSVIVG00038486001
SubName- Full=Chromosome chr16 scaffold_94, whole genome shotgun sequence; (1448 aa)
(Vitis vinifera)
Predicted Functional Partners:
GSVIVG00037665001
SubName- Full=Chromosome undetermined scaffold_91, whole genome shotgun sequence; (740 aa)
    0.865
GSVIVG00016982001
SubName- Full=Chromosome chr11 scaffold_14, whole genome shotgun sequence; (771 aa)
     0.757
GSVIVG00020879001
SubName- Full=Chromosome chr14 scaffold_21, whole genome shotgun sequence; (427 aa)
     0.681
GSVIVG00028411001
SubName- Full=Chromosome chr10 scaffold_43, whole genome shotgun sequence; (572 aa)
      0.676
GSVIVG00000022001
SubName- Full=Chromosome chr17 scaffold_101, whole genome shotgun sequence; (461 aa)
      0.517
GSVIVG00006902001
SubName- Full=Chromosome chr10 scaffold_179, whole genome shotgun sequence; (863 aa)
      0.510
GSVIVG00023663001
SubName- Full=Chromosome chr7 scaffold_31, whole genome shotgun sequence; (97 aa)
      0.506
GSVIVG00010279001
SubName- Full=Chromosome undetermined scaffold_252, whole genome shotgun sequence; (561 aa)
      0.506
GSVIVG00037110001
SubName- Full=Chromosome chr16 scaffold_86, whole genome shotgun sequence; (512 aa)
      0.496
GSVIVG00001949001
SubName- Full=Chromosome chr5 scaffold_124, whole genome shotgun sequence; (2072 aa)
      0.495

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition  E-value
Query      1431
pfam03178  318     pfam03178, CPSF_A, CPSF A subunit region  6e-92
COG5161    1319    COG5161, SFT1, Pre-mRNA cleavage and polyadenylati  4e-45
COG5161    1319    COG5161, SFT1, Pre-mRNA cleavage and polyadenylati  2e-22
pfam10433  513     pfam10433, MMS1_N, Mono-functional DNA-alkylating  1e-07
>gnl|CDD|217409 pfam03178, CPSF_A, CPSF A subunit region
 Score =  299 bits (769), Expect = 6e-92
 Identities = 115/338 (34%), Positives = 187/338 (55%), Gaps = 24/338 (7%)

Query: 1063 YEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGED 1122
              +R+++P      W+   T+ ++ +E  L+V+ V L ++  +  +  L +GTA+  GED
Sbjct: 2    SCIRLVDPIT----WEVIDTLELEENEAVLSVKSVNLEDS--EGRKEYLVVGTAFDLGED 55

Query: 1123 VAAR-GRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKWT 1181
             AAR GR+ +F       N +  +  V+  E+KGA++AL   QG LL   G K+ ++   
Sbjct: 56   PAARSGRIYVFEIIEPETNRK--LKLVHKTEVKGAVTALCEFQGRLLAGQGQKLRVYDLG 113

Query: 1182 GTELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDC 1241
              +L   AF D P  YVVSL +  N I++GD+ KS+ FL + E+  +L L A+D      
Sbjct: 114  KDKLLPKAFLDTPITYVVSLKVFGNRIIVGDLMKSVTFLGYDEEPYRLILFARDTQPRWV 173

Query: 1242 FATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQ-KLLSRAEFHVGAHVTKFLRL 1300
             A EFL+D  T  ++ +D+  N+ +  Y P+  ES  G  +LL RAEFH+G  VT F + 
Sbjct: 174  TAAEFLVDYDT--ILGADKFGNLHVLRYDPEAPESLDGDPRLLHRAEFHLGDIVTSFQKG 231

Query: 1301 QMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIGCIAP-LDELTFRRLQSLQKKLVDSVP 1359
             ++  +          + T+   +L+GTLDGSIG + P + E  +RRLQ LQ++L D +P
Sbjct: 232  SLVPKTGGA-------ESTSSPQILYGTLDGSIGLLVPFISEEEYRRLQHLQQQLRDELP 284

Query: 1360 HVAGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEM 1397
            H+ GL+PR+FR ++S     +     ++D +LL  +  
Sbjct: 285  HLCGLDPRAFRSYYSRSPPVKN----VIDGDLLERFLD 318


This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs. The function of the aligned region is unknown but may be involved in RNA/DNA binding. Length = 318
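
The description above notes that CPSF binds the conserved AAUAAA sequence in pre-mRNA. As a purely illustrative sketch (it says nothing about how CPSF recognizes the signal in vivo), the hexamer can be located in a transcript string with a simple scan. The function name find_poly_a_signals and the example sequence are hypothetical.

```python
def find_poly_a_signals(rna, signal="AAUAAA"):
    """Return 0-based start positions of every occurrence of the signal.

    DNA-style input is accepted: T is converted to U before searching.
    Overlapping occurrences are reported as well.
    """
    rna = rna.upper().replace("T", "U")
    positions = []
    start = rna.find(signal)
    while start != -1:
        positions.append(start)
        start = rna.find(signal, start + 1)
    return positions


if __name__ == "__main__":
    # Made-up sequence containing one RNA-style and one DNA-style hexamer.
    print(find_poly_a_signals("GGAAUAAACCAATAAAGG"))
```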

>gnl|CDD|227490 COG5161, SFT1, Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]
>gnl|CDD|227490 COG5161, SFT1, Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]
>gnl|CDD|220751 pfam10433, MMS1_N, Mono-functional DNA-alkylating methyl methanesulfonate N-term

Conserved Domains Detected by HHsearch

ID        Length  Definition  Probability
Query     1431
KOG1896   1366    consensus mRNA cleavage and polyadenylation factor  100.0
KOG1897   1096    consensus Damage-specific DNA binding complex, sub  100.0
KOG1898   1205    consensus Splicing factor 3b, subunit 3 [RNA proce  100.0
COG5161   1319    SFT1 Pre-mRNA cleavage and polyadenylation specifi  100.0
PF10433   504     MMS1_N: Mono-functional DNA-alkylating methyl meth  100.0
PF03178   321     CPSF_A: CPSF A subunit region; InterPro: IPR004871  100.0
KOG0318   603     consensus WD40 repeat stress protein/actin interac  98.45
KOG2048   691     consensus WD40 repeat protein [General function pr  98.25
PRK11028  330     6-phosphogluconolactonase; Provisional  97.1
cd00200   289     WD40 WD40 domain, found in a number of eukaryotic  96.69
PF03178   321     CPSF_A: CPSF A subunit region; InterPro: IPR004871  96.54
KOG1274   933     consensus WD40 repeat protein [General function pr  96.53
KOG1273   405     consensus WD40 repeat protein [General function pr  96.45
KOG1446   311     consensus Histone H3 (Lys4) methyltransferase comp  96.44
cd00200   289     WD40 WD40 domain, found in a number of eukaryotic  96.13
PLN00181  793     protein SPA1-RELATED; Provisional  95.87
KOG1539   910     consensus WD repeat protein [General function pred  95.85
PRK11028  330     6-phosphogluconolactonase; Provisional  95.75
KOG1036   323     consensus Mitotic spindle checkpoint protein BUB3,  95.65
KOG1539   910     consensus WD repeat protein [General function pred  95.3
KOG2055   514     consensus WD40 repeat protein [General function pr  95.17
KOG2106   626     consensus Uncharacterized conserved protein, conta  94.86
KOG0291   893     consensus WD40-repeat-containing subunit of the 18  94.71
KOG1036   323     consensus Mitotic spindle checkpoint protein BUB3,  94.54
KOG0283   712     consensus WD40 repeat-containing protein [Function  93.87
KOG0306   888     consensus WD40-repeat-containing subunit of the 18  93.75
PF10282   345     Lactonase: Lactonase, 7-bladed beta-propeller; Int  93.5
PF08596   395     Lgl_C: Lethal giant larvae(Lgl) like, C-terminal;  93.49
KOG2110   391     consensus Uncharacterized conserved protein, conta  92.91
PLN00181  793     protein SPA1-RELATED; Provisional  92.79
KOG0285   460     consensus Pleiotropic regulator 1 [RNA processing  92.57
KOG0646   476     consensus WD40 repeat protein [General function pr  92.5
KOG0306   888     consensus WD40-repeat-containing subunit of the 18  92.0
KOG2321   703     consensus WD40 repeat protein [General function pr  92.0
KOG0650   733     consensus WD40 repeat nucleolar protein Bop1, invo  91.36
KOG2111   346     consensus Uncharacterized conserved protein, conta  91.22
KOG0282   503     consensus mRNA splicing factor [Function unknown]  91.18
KOG0319   775     consensus WD40-repeat-containing subunit of the 18  91.05
KOG1273   405     consensus WD40 repeat protein [General function pr  90.9
PTZ00420  568     coronin; Provisional  90.34
KOG0294   362     consensus WD40 repeat-containing protein [Function  90.25
KOG0278   334     consensus Serine/threonine kinase receptor-associa  90.19
KOG1897   1096    consensus Damage-specific DNA binding complex, sub  89.88
KOG0315   311     consensus G-protein beta subunit-like protein (con  89.67
KOG2055   514     consensus WD40 repeat protein [General function pr  89.51
KOG4378   673     consensus Nuclear protein COP1 [Signal transductio  88.97
KOG0283   712     consensus WD40 repeat-containing protein [Function  88.07
PF08596   395     Lgl_C: Lethal giant larvae(Lgl) like, C-terminal;  88.06
PF14783   111     BBS2_Mid: Ciliary BBSome complex subunit 2, middle  88.01
PF14727   418     PHTB1_N: PTHB1 N-terminus  87.56
KOG2096   420     consensus WD40 repeat protein [General function pr  87.17
KOG0315   311     consensus G-protein beta subunit-like protein (con  86.79
COG2706   346     3-carboxymuconate cyclase [Carbohydrate transport  86.45
KOG0319   775     consensus WD40-repeat-containing subunit of the 18  86.1
KOG0277   311     consensus Peroxisomal targeting signal type 2 rece  85.74
KOG0299   479     consensus U3 snoRNP-associated protein (contains W  85.71
KOG0772   641     consensus Uncharacterized conserved protein, conta  85.68
KOG0296   399     consensus Angio-associated migratory cell protein  85.36
KOG0266   456     consensus WD40 repeat-containing protein [General  85.28
KOG0647   347     consensus mRNA export protein (contains WD40 repea  85.27
KOG0290   364     consensus Conserved WD40 repeat-containing protein  85.02
KOG3881   412     consensus Uncharacterized conserved protein [Funct  84.48
COG2706   346     3-carboxymuconate cyclase [Carbohydrate transport  84.33
KOG0296   399     consensus Angio-associated migratory cell protein  83.74
KOG1517   1387    consensus Guanine nucleotide binding protein MIP1  83.51
KOG2106   626     consensus Uncharacterized conserved protein, conta  83.33
KOG0282   503     consensus mRNA splicing factor [Function unknown]  80.79
PTZ00420  568     coronin; Provisional  80.58
KOG0772   641     consensus Uncharacterized conserved protein, conta  80.58
PF14727   418     PHTB1_N: PTHB1 N-terminus  80.51
PF08450   246     SGL: SMP-30/Gluconolaconase/LRE-like region; Inter  80.17
>KOG1896 consensus mRNA cleavage and polyadenylation factor II complex, subunit CFT1 (CPSF subunit) [RNA processing and modification]
Probab=100.00  E-value=8.7e-196  Score=1751.80  Aligned_cols=1286  Identities=41%  Similarity=0.683  Sum_probs=1066.2

Q ss_pred             ccceeccccCCceeeeeEEEEeecCCCCCCCCCcccccccccccCCCCCCCCCCCcEEEEcCCeEEEEEEEEeccCCccc
Q 000548            2 SFAAYKMMHWPTGIANCGSGFITHSRADYVPQIPLIQTEELDSELPSKRGIGPVPNLVVTAANVIEIYVVRVQEEGSKES   81 (1431)
Q Consensus         2 ~~~~~~~~~~pT~V~~s~~~~F~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~nLvvak~~~Leiy~v~~~~~g~~~~   81 (1431)
                      .|++|++.|+||+|+||++|+||.....                           ||||+++|.|+||++..++++.+. 
T Consensus         1 m~~vykq~h~~T~ve~s~ag~Ft~~~~~---------------------------nlvV~~~N~L~vyri~~~~e~~t~-   52 (1366)
T KOG1896|consen    1 MFAVYKQEHDPTVVENSSAGLFTNNRTE---------------------------NLVVAGTNILRVYRISRDAEALTK-   52 (1366)
T ss_pred             CcchhhhccCchhhccceeeeEecCCCc---------------------------ceEEecccEEEEEEeccchhhccc-
Confidence            3789999999999999999999987765                           999999999999999865333211 


Q ss_pred             cCCccccccccccccccceEEEEEEEEeeeeEeEEEEEecCCCCCCCCCcEEEEEeccceEEEEEEeCCCCCeeEEeeee
Q 000548           82 KNSGETKRRVLMDGISAASLELVCHYRLHGNVESLAILSQGGADNSRRRDSIILAFEDAKISVLEFDDSIHGLRITSMHC  161 (1431)
Q Consensus        82 ~~~~~~~~~~~~~~~~~~~L~lv~~~~l~G~I~~l~~~r~~~~~~~~~~D~Llv~~~~~klsil~~d~~~~~l~t~Slh~  161 (1431)
                            .+...|+...+.+|+|+++|.+||+|++|++++..|+    .+|+|+++|++||+|+|+||+.+|+|+|.||||
T Consensus        53 ------~~~~~~~~~~~~~LeLv~~~~l~GnV~si~~~~~~gs----~rD~LlL~f~~AKiSvlefD~~t~sl~TlSLHy  122 (1366)
T KOG1896|consen   53 ------NDPGDMGKAHRKKLELVAEFKLFGNVTSIAKLPLKGS----NRDALLLLFKDAKISVLEFDPQTNSLRTLSLHY  122 (1366)
T ss_pred             ------cCccccccccceEEEEEEEEEeecceeeEEEeecCCC----CcceEEEEeccceEEEEEecCCccceeeeeeEE
Confidence                  1223344455567999999999999999999999987    699999999999999999999999999999999


Q ss_pred             ecCcccccccCCCccccCCCeEEECCCCcEEEEEecCceEEEEeCccCCCCCCCCCCCCCCCCCcccceeccEEEEcccC
Q 000548          162 FESPEWLHLKRGRESFARGPLVKVDPQGRCGGVLVYGLQMIILKASQGGSGLVGDEDTFGSGGGFSARIESSHVINLRDL  241 (1431)
Q Consensus       162 ~E~~~~~~~~~g~~~~~~~~~l~vDP~~Rc~~l~~~~~~L~ilp~~~~~~~l~~~~~~~~~~~~~~~~~~~s~~i~l~~l  241 (1431)
                      ||.+++   +.|+.....+|.++|||++||++|++|+..|+||||++.. .+++++ ....++-..+++.+||+|.+++|
T Consensus       123 fE~~~~---~~~~~~~~~~p~vrvDPdsrCa~llvyg~~m~iLpf~~~e-~~~~~~-~~~~~~~~ss~~~pSyvi~~reL  197 (1366)
T KOG1896|consen  123 FEGPEF---RKGLVGRAKIPTVRVDPDSRCALLLVYGLRMAILPFRVNE-HLDDEE-LFPSGFSKSSFTAPSYVIALREL  197 (1366)
T ss_pred             eccccc---cccccccccCceEEECCCCCeEEEEEecceEEEeeccccc-cccccc-cccccccccccccceeEEEhhhh
Confidence            999864   4555555678999999999999999999999999998752 333322 22222233457899999999999


Q ss_pred             C--CCceeeEeeccCCCcceEEEEeecCCCcccccccccceeEEEEEEEeecccccceeeEeccCCcccceEEEecCCCC
Q 000548          242 D--MKHVKDFIFVHGYIEPVMVILHERELTWAGRVSWKHHTCMISALSISTTLKQHPLIWSAMNLPHDAYKLLAVPSPIG  319 (1431)
Q Consensus       242 d--i~~V~D~~FL~gy~~PtlavL~~~~~tw~gr~~~~~~t~~~~~~sLd~~~k~~~~i~s~~~lp~~~~~LipvP~p~g  319 (1431)
                      |  |+||+|++|||||++||||||||+.+||+||+..|+|||.+.+++||+++|.||+||++.+||+||+.+.++|.|+|
T Consensus       198 deki~niiD~qFLhgY~ePTl~ILyep~~tw~grv~~r~dt~~~vaisLni~q~~hpVI~sv~sLP~D~~~~~~vp~piG  277 (1366)
T KOG1896|consen  198 DEKIKNIIDFQFLHGYYEPTLAILYEPEQTWAGRVILRKDTCVLVAISLNITQKVHPVIWSVLSLPFDCYQATAVPTPIG  277 (1366)
T ss_pred             hhhhccceeEEeecCcccceEEEEecccccccceEEEecCcEEEEEEEcCccccccceEeeeccCChhhhhceeecccCc
Confidence            8  88999999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             eEEEEecCeEEEEecCC-cceEEccCCCccCCCCcccCCCCceeeecceeEEEeeCceEEEEeCCCCEEEEEEEEc-Cee
Q 000548          320 GVLVVGANTIHYHSQSA-SCALALNNYAVSLDSSQELPRSSFSVELDAAHATWLQNDVALLSTKTGDLVLLTVVYD-GRV  397 (1431)
Q Consensus       320 GvLVi~~n~I~y~~~~~-~~~~~~n~~~~~~~~~~~~p~~~~~~~ld~~~~~~~~~~~~Ll~~~~G~l~~l~l~~d-g~~  397 (1431)
                      ||||++.|.++|.+|++ ++++++|++++..+.++.+||+.+.+.+|++..+|++.++++++..+||+|+|+|.+| +|.
T Consensus       278 gvLv~~~n~~iy~nqsv~~~gv~LNs~a~~~t~fpl~~qs~v~i~ld~a~~t~i~~dk~vis~~~Gd~y~Ltl~~D~~r~  357 (1366)
T KOG1896|consen  278 GVLVFTVNNLIYLNQSVSPYGVALNSYASKYTAFPLIPQSGVRIELDCANATWISNDKCVISLKNGDLYLLTLILDIGRS  357 (1366)
T ss_pred             cEEEEeeeeEEEEccCCCceeEEecchhhcccCCccccccceEEEEeeccceeecCCeEEEecCCCcEEEEEEEeccccc
Confidence            99999999999999998 5999999999999999999999999999999999999999999999999999999999 789


Q ss_pred             eeeEEEEecCCCccccceEEecCCeEEEEeecCCeeEEEEeeCCCccccCCCCccccCCcccCCccchhccCCCcchhhc
Q 000548          398 VQRLDLSKTNPSVLTSDITTIGNSLFFLGSRLGDSLLVQFTCGSGTSMLSSGLKEEFGDIEADAPSTKRLRRSSSDALQD  477 (1431)
Q Consensus       398 V~~l~l~~~~~~~~~s~l~~l~~g~lFvgS~~GDS~L~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~  477 (1431)
                      |+.+++..+....+++|++...+++||+||+.|||+|++|.+....+.+.  ...++.+.+....+.++.+...+..+|+
T Consensus       358 V~~~~f~k~~asvl~t~~v~~~n~llFlGSrlgnSlll~~s~~~~~~~e~--~~re~~d~~~~~~~~~~~d~~~d~~~~d  435 (1366)
T KOG1896|consen  358 VQLLHFDKFKASVLATSIVGHGNNLLFLGSRLGNSLLLRFSELLQRASEG--VRREEGDTESDGYSKKRVDDTQDVRRDD  435 (1366)
T ss_pred             hhhhhhhhhhcccceeeeeccCCccEEEEecCCCEEEEEehhccccCCcc--ccccccCCcCCcchhhcccchhhhhhhh
Confidence            99999999999999999999999999999999999999999876522221  1111111222123333332111111111


Q ss_pred             ccCcccc------cccCCCCCCcccccceeEEEEeeeecccCCccccccccccccCC---------------CccCCCCC
Q 000548          478 MVNGEEL------SLYGSASNNTESAQKTFSFAVRDSLVNIGPLKDFSYGLRINADA---------------SATGISKQ  536 (1431)
Q Consensus       478 ~~~~~~~------~l~~~~~~~~~~~~~~~~l~v~d~l~NigPI~D~~vg~~~~~d~---------------~~sG~g~~  536 (1431)
                      . ..++.      +-||++...+   ...+.|++||+|+|+|||.||++|.....+.               .|+|+|+.
T Consensus       436 ~-~~~~~~~~g~~~~~g~~a~~t---~~~f~fevcDsL~NIGPi~~~avG~~~~~~~~~~gl~~~~~~~elV~~sGhgkn  511 (1366)
T KOG1896|consen  436 E-KSAELFEAGSEENYGSGAQET---VQPFSFEVCDSLPNIGPITDFAVGKRSSASEAVEGLSPHNKCLELVATSGHGKN  511 (1366)
T ss_pred             h-hccchhhccccccCCccccee---eeeeEEeehhccccccccccceeccccchhhhccCCCCCCCeEEEEEeccCCCC
Confidence            1 11111      2232221111   1238899999999999999999998754321               18899999


Q ss_pred             CCeEE------------EecCCCCEEEEEEecCCCCCCCCcccccccCcCcceEEEEEccccceEEEeccceeeeecccc
Q 000548          537 SNYEL------------VELPGCKGIWTVYHKSSRGHNADSSRMAAYDDEYHAYLIISLEARTMVLETADLLTEVTESVD  604 (1431)
Q Consensus       537 g~L~~------------~~L~g~~~iWtv~~~~~~~~~~~~~~~~~~~~~~~~yLvlS~~~~T~Vl~~g~~~eev~~~~g  604 (1431)
                      |.|.+            ++|+||.++|||..+....+         .++..|.||++|..++|+||++|+++.|++. .+
T Consensus       512 gaL~V~r~sI~P~i~t~fel~Gc~~iWtV~~~~~~~~---------~~~~~h~~lilS~e~~t~il~tge~~~Ev~~-s~  581 (1366)
T KOG1896|consen  512 GALSVIRRSIRPEIATEFELPGCVDIWTVFIKGRKRE---------EDNTQHLYLILSTESRTMILETGEELLEVSG-SG  581 (1366)
T ss_pred             cceEEEeecccceeeEEEEecCeeeEEEEEEeccccc---------cccCcceEEEeecccchhhhhccchhhhccc-ce
Confidence            99987            68999999999998644322         2234599999999999999999999999975 58


Q ss_pred             cccccceEEEeeecCCcEEEEEecCcEEEEcCC-cceEEEeCCCCCCCCCCCCCCccEEEEEEeCCEEEEEEeCCcEEEE
Q 000548          605 YFVQGRTIAAGNLFGRRRVIQVFERGARILDGS-YMTQDLSFGPSNSESGSGSENSTVLSVSIADPYVLLGMSDGSIRLL  683 (1431)
Q Consensus       605 F~~~~~TI~~g~l~~~~~ivQVt~~~irl~~~~-~~~~~~~~~~~~~~~~~~~~~~~I~~as~~d~~vll~~~~g~i~~l  683 (1431)
                      |..+++||++|+++++.+||||||+++|+++++ .+.|.++..          .+..+++++++||||++....|.+.+|
T Consensus       582 f~~~~~Tl~~gnlg~~rriVQVtp~~~rllDg~~r~lq~i~fd----------~~~~vv~~sv~dpyv~v~~~~g~i~~~  651 (1366)
T KOG1896|consen  582 FTRDGPTLFAGNLGNERRIVQVTPSGLRLLDGDLRMLQRIPFD----------SGAIVVQTSVADPYVAVRSSEGRITLY  651 (1366)
T ss_pred             eEeccceEEEEecCCceEEEEEccceeEEecCcchheeEeccc----------cCCcEEEEeccCceEEEEEcCCceEEE
Confidence            999999999999988899999999999999995 588888883          445699999999999999999999999


Q ss_pred             EecCCCceEEeecCcccccCCCceEEEEeeccCCC-------------CcccccccccccccCCccccccCCCCCCCCCC
Q 000548          684 VGDPSTCTVSVQTPAAIESSKKPVSSCTLYHDKGP-------------EPWLRKTSTDAWLSTGVGEAIDGADGGPLDQG  750 (1431)
Q Consensus       684 ~~~~~~~~l~~~~~~~~~~~~~~i~~~~l~~d~~~-------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  750 (1431)
                      .++....+|.+.++  +   ...+.+++++.|.+.             .++.+.. .++...... .+.+.+++....+.
T Consensus       652 ~l~~~s~rl~~~~~--~---s~~~~sv~~~~dlsg~f~~~s~l~~k~~~~~gr~~-~~~~~~~~~-~kv~~~egg~~~~~  724 (1366)
T KOG1896|consen  652 DLEEKSHRLALHDP--M---SFKVVSVSLPADLSGMFTTLSDLSLKGNEANGRSS-EAEGLQSLP-CKVDDEEGGSPEQE  724 (1366)
T ss_pred             EeccccchhhccCc--c---cceeEEEechhhhccceEEEeeecccCcccccccc-cccccccCC-ccccCCCCCCcccC
Confidence            98776555655554  1   344666666666432             2222221 111111111 22332332111122


Q ss_pred             cEEEEEEecCCeEEEEECCCCceeEEecccccccccccccccccccccccccccCCCccCCCCCcccccccccEEEEEee
Q 000548          751 DIYSVVCYESGALEIFDVPNFNCVFTVDKFVSGRTHIVDTYMREALKDSETEINSSSEEGTGQGRKENIHSMKVVELAMQ  830 (1431)
Q Consensus       751 ~~~l~v~~~~g~l~I~sLp~~~~v~~~~~~~~~~~~l~~~~~~~~~~~s~q~l~~~~~~~~~~~~~~~~~~~~i~~i~~~  830 (1431)
                      .+||++++++|+++||++|++++|+.++.|+.++.+|.+.......   .|              + ..++..++++..+
T Consensus       725 ~~~~~~~~e~g~leiy~~pd~~lVf~v~~f~~~~~~L~~~~~~~~~---~~--------------~-~s~~~~l~q~~~~  786 (1366)
T KOG1896|consen  725 PYWCVFVTESGTLEIYALPDFDLVFEVDMFDTGNRVLMDSRLRGPT---TN--------------K-ESEDLELKQLFVN  786 (1366)
T ss_pred             ceEEEEEcCCCceEEEccCCcceEEEeeccCCCcceEEeecccCcc---cc--------------c-cccchHHHHhhcc
Confidence            3999999999999999999999999999999999998875443220   00              1 1223567777778


Q ss_pred             eccCC--CCccEEEEEeeCCeEEEEEeeecCCCCCCCCCCCCCcccccccccccccccccceeEEeccCCccCCCC----
Q 000548          831 RWSAH--HSRPFLFAILTDGTILCYQAYLFEGPENTSKSDDPVSTSRSLSVSNVSASRLRNLRFSRTPLDAYTREE----  904 (1431)
Q Consensus       831 ~~g~~--~~~~~L~vgl~~G~l~~y~~~~~~~~~~~~~~~~~~~~~~~~~lg~~~~~~~~~~rf~k~~~~~~~~~~----  904 (1431)
                      .+|.+  .++|||++-+.+|+++.|++|+..+                        ++...++|+|+|+.....+.    
T Consensus       787 ~L~~e~~~~e~~L~lv~~~~eil~Ykaf~~~~------------------------~~~~~~~f~kvp~~~~~~~~~p~~  842 (1366)
T KOG1896|consen  787 PLGSEIVFKEPHLFLVVSDNEILIYKAFPQLS------------------------QGNLKVFFKKVPHNLNIRTDKPHF  842 (1366)
T ss_pred             ccchhhhccCCceEEEEeCceEEEEeeccccC------------------------ccchhhhhhhCCHhhcccccCCcc
Confidence            88877  7899999999999999999985111                        01124589999875422111    


Q ss_pred             -------------CCCCCCccceEEeeccCCceEEEEeCCCceEEEEe-CCceEEeeccCCCceEEEeeccCCCCCceEE
Q 000548          905 -------------TPHGAPCQRITIFKNISGHQGFFLSGSRPCWCMVF-RERLRVHPQLCDGSIVAFTVLHNVNCNHGFI  970 (1431)
Q Consensus       905 -------------~~~~~g~~~l~~f~~~~g~~~Vf~~g~rP~~i~~~-~~~l~~~pl~~~~~v~~~~~f~~~~~~~g~i  970 (1431)
                                   .+.+.-.++++.|++++|++|||+||++|+||+.. ++.+++||+..+++|.+|++||+.|||+||+
T Consensus       843 ~~~~~~~~~~e~~~~~~~~~~~m~~f~~i~ghsgvfv~Gs~P~~il~t~rg~lr~h~~~gngpv~sfapfhnvn~p~gfi  922 (1366)
T KOG1896|consen  843 LCKKREGGGAEEGASVSVIVQRMTYFEDIGGHSGVFVTGSKPYLILLTFRGVLRFHPVFGNGPVGSFAPFHNVNCPRGFI  922 (1366)
T ss_pred             cchhhccccccccccccceeeeEEeeccccCeeEEEEecCCceEEEEEcccccceeeeecCCcceeeeeeeccCCCcceE
Confidence                         11123346788999999999999999999999874 9999999999999999999999999999999


Q ss_pred             EEEecCeEEEEEcCCCCccCCCcceEEeeCCCcccEEEEecCCCeEEEEEeecccccccccccccccccccccccCCCCC
Q 000548          971 YVTSQGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSLLIDQEVGHQIDNHNLS 1050 (1431)
Q Consensus       971 ~~~~~~~L~I~~l~~~~~~d~~~~ir~i~L~~tprki~y~~~~~~~~v~~s~~~~~~~~~~~~~~~d~e~~~~~~~~~~~ 1050 (1431)
                      |++.++.|+||.++....||+.||+|+|||+.|||+++||++.++|+|+++.+  .++.   ...+|++.      +.++
T Consensus       923 yvd~~~~l~i~~lp~~~~Ydn~wPvkkIpl~~T~~~vvYh~e~~vy~v~t~~~--~~~~---~~~~d~~e------~~~~  991 (1366)
T KOG1896|consen  923 YVDRQGELVICVLPEALSYDNKWPVKKIPLRKTPHQVVYHYEKKVYAVITSTP--VPYE---RLGEDGEE------EVIS  991 (1366)
T ss_pred             EECCCceEEEEEcchhcccCCCCcccccccccchhheeeeccceEEEEEEecc--ceee---eccccccc------cccc
Confidence            99999999999999999999999999999999999999999999999998863  2221   11223321      2345


Q ss_pred             ccccccccccceEEEEEeccCCCCCCceeeeeEECCCCCceEEEEEEEeeecC-CCCcceEEEEEeeeecCCCcccceeE
Q 000548         1051 SVDLHRTYTVEEYEVRILEPDRAGGPWQTRATIPMQSSENALTVRVVTLFNTT-TKENETLLAIGTAYVQGEDVAARGRV 1129 (1431)
Q Consensus      1051 ~~~~~~~~~~~~~~v~lid~~~~~~~~~~~~~~~l~~~E~v~s~~~v~l~~~~-~~~~~~~ivVGT~~~~~e~~~~~Gri 1129 (1431)
                      .++....|..++++|+|++|    .+|++++.|+|++||++++++.+.|..+. +++.++||+|||+++.|||.++|||+
T Consensus       992 ~de~~~~p~~~~f~i~LisP----~sw~vi~~iefq~~E~v~~~k~v~L~~~~t~~~~k~ylavGT~~~~gEDv~~RGr~ 1067 (1366)
T KOG1896|consen  992 RDENVIHPEGEQFSIQLISP----ESWEVIDKIEFQENEHVLHMKYVILDDEETTKGKKPYLAVGTAFIQGEDVPARGRI 1067 (1366)
T ss_pred             ccccccccccccceeEEecC----CccccccccccCccceeeEEEEEEEEecccccCCcceEEEEEeecccccccCcccE
Confidence            56777888899999999999    49999999999999999999999998654 45579999999999999999999999


Q ss_pred             EEEEEee---cCCCC--CccEEEEEEEeecCceEEEccccCeEEEEeCCeEEEEEc-cCCeeeeeEeecCCCeeEEEEEE
Q 000548         1130 LLFSTGR---NADNP--QNLVTEVYSKELKGAISALASLQGHLLIASGPKIILHKW-TGTELNGIAFYDAPPLYVVSLNI 1203 (1431)
Q Consensus      1130 ~vf~i~~---~~~~~--~~~l~lv~~~~~~g~V~al~~~~g~Ll~~vg~~l~v~~~-~~~~L~~~a~~~~~~~~i~~l~~ 1203 (1431)
                      ++|+|++   +|++|  +.|||+++++|++|+|.++|+++|+|+.|.|+||++|+| .+.+|.++||+|. |.|++++++
T Consensus      1068 hi~diIeVVPepgkP~t~~KlKel~~eE~KGtVsavceV~G~l~~~~GqKI~v~~l~r~~~ligVaFiD~-~~yv~s~~~ 1146 (1366)
T KOG1896|consen 1068 HIFDIIEVVPEPGKPFTKNKLKELYIEEQKGTVSAVCEVRGHLLSSQGQKIIVRKLDRDSELIGVAFIDL-PLYVHSMKV 1146 (1366)
T ss_pred             EEEEEEEecCCCCCCcccceeeeeehhhcccceEEEEEeccEEEEccCcEEEEEEeccCCcceeeEEecc-ceeEEehhh
Confidence            9999987   77766  457999999999999999999999999999999999999 5678999999999 999999999


Q ss_pred             eCCEEEEEeccccEEEEEEeccccEEEEeeeccCCccEEEEEEEecCCeeEEEEEeCCCcEEEEeeCCCCCCCccCceEE
Q 000548         1204 VKNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWKGQKLL 1283 (1431)
Q Consensus      1204 ~~~~IlvGD~~~Sv~ll~~~~~~~~l~~~arD~~~~~vta~~fl~d~~~l~~i~~D~~gNl~vl~~~p~~~~s~~~~~L~ 1283 (1431)
                      +||+|++||+|||++|++|++++.+|.+++||..++.|++++||+|+++|+|+++|+++||++|.|.|++++|++|+||.
T Consensus      1147 vknlIl~gDV~ksisfl~fqeep~rlsL~srd~~~l~v~s~EFLVdg~~L~flvsDa~rNi~vy~Y~Pe~~eS~~G~RLv 1226 (1366)
T KOG1896|consen 1147 VKNLILAGDVMKSISFLGFQEEPYRLSLLSRDFEPLNVYSTEFLVDGSNLSFLVSDADRNIHVYMYAPENIESLSGQRLV 1226 (1366)
T ss_pred             hhhheehhhhhhceEEEEEccCceEEEEeecCCchhhceeeeeEEcCCeeEEEEEcCCCcEEEEEeCCCCccccCcceee
Confidence            99999999999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             EEEEEecCcceeEEEEEeeecCCCCCCCCCCCCCCCCceEEE--EEcCCCcEEEEEeCChHhHHHHHHHHHHHHhcCCCC
Q 000548         1284 SRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFALL--FGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHV 1361 (1431)
Q Consensus      1284 ~~~~fhlg~~vt~~~~~~~~~~~~~~~~~~~g~~~~~~~~il--~~T~~GsIg~l~pl~e~~~~~L~~Lq~~l~~~~~~~ 1361 (1431)
                      ++++||+|..|++|.+...... .+.     +   .+.+...  |||++|++|+++|++|+.||||..||++|...++|+
T Consensus      1227 ~radfhvg~~vs~m~~lp~~~~-~e~-----~---~~~~~~~~v~gtlDG~l~~~~Pl~e~~YRRL~~lQn~L~~~~~hv 1297 (1366)
T KOG1896|consen 1227 RRADFHVGAHVSTMFRLPCHQN-AEF-----G---SNSPMFYEVFGTLDGGLGHLVPLDEKTYRRLLMLQNALMDRLPHV 1297 (1366)
T ss_pred             eeeeeEeccceeeeEecccccc-chh-----c---cCCchhhhhhcccCCceeEEecCCHHHHHHHHHHHHHHHHhhhhh
Confidence            9999999999999998652221 110     1   1233444  899999999999999999999999999999999999


Q ss_pred             CCCCcccccccccCCCCCCCCCCCceeHHHHHHHcCCCHHHHHHHHHHhCCCHHHHHHHHHHhhhccCCC
Q 000548         1362 AGLNPRSFRQFHSNGKAHRPGPDSIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDLALGTSFL 1431 (1431)
Q Consensus      1362 ~Gl~~~~~R~~~~~~~~~~~~~~~~IDGDlie~fl~L~~~~q~~ia~~l~~~~~~i~~~l~~l~~~~~~~ 1431 (1431)
                      |||||++||..+...+ ...+.+++|||+||.+|..|+.++|.++|+++|+++.+|+++|-+|.+.++||
T Consensus      1298 ~GLNPr~yR~~~s~~~-~~n~~r~ilDg~ll~~f~yl~~~er~elA~kiGt~~~eIl~DLvel~~~~s~~ 1366 (1366)
T KOG1896|consen 1298 GGLNPRAYRLLDSSLQ-LSNSLRSILDGELLNRFSYLSMSEREELAHKIGTTRKEILDDLVELDRLTSSL 1366 (1366)
T ss_pred             cCCCHHHhhhccchhh-hcCCCcccchHhHHHHhhccchhhHHHHHHhcCCCHHHHHHHHHHHHHHhhcC
Confidence            9999999999876653 35789999999999999999999999999999999999999999999999886



>KOG1897 consensus Damage-specific DNA binding complex, subunit DDB1 [Replication, recombination and repair]
>KOG1898 consensus Splicing factor 3b, subunit 3 [RNA processing and modification]
>COG5161 SFT1 Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]
>PF10433 MMS1_N: Mono-functional DNA-alkylating methyl methanesulfonate N-term; PDB: 2B5M_A 4A0K_C 4A0B_C 3I7L_A 2B5N_C 3I8E_A 4A09_A 4A0A_A 3EI4_C 2B5L_A
>PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871 This family includes a region that lies towards the C terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit
>KOG0318 consensus WD40 repeat stress protein/actin interacting protein [Cytoskeleton]
>KOG2048 consensus WD40 repeat protein [General function prediction only]
>PRK11028 6-phosphogluconolactonase; Provisional
>cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and botto
>PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871 This family includes a region that lies towards the C terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit
>KOG1274 consensus WD40 repeat protein [General function prediction only]
>KOG1273 consensus WD40 repeat protein [General function prediction only]
>KOG1446 consensus Histone H3 (Lys4) methyltransferase complex and RNA cleavage factor II complex, subunit SWD2 [RNA processing and modification; Chromatin structure and dynamics; Posttranslational modification, protein turnover, chaperones]
>cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and botto
>PLN00181 protein SPA1-RELATED; Provisional
>KOG1539 consensus WD repeat protein [General function prediction only]
>PRK11028 6-phosphogluconolactonase; Provisional
>KOG1036 consensus Mitotic spindle checkpoint protein BUB3, WD repeat superfamily [Cell cycle control, cell division, chromosome partitioning]
>KOG1539 consensus WD repeat protein [General function prediction only]
>KOG2055 consensus WD40 repeat protein [General function prediction only]
>KOG2106 consensus Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]
>KOG0291 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG1036 consensus Mitotic spindle checkpoint protein BUB3, WD repeat superfamily [Cell cycle control, cell division, chromosome partitioning]
>KOG0283 consensus WD40 repeat-containing protein [Function unknown]
>KOG0306 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonases (6PGL) 3
>PF08596 Lgl_C: Lethal giant larvae(Lgl) like, C-terminal; InterPro: IPR013905 The Lethal giant larvae (Lgl) tumour suppressor protein is conserved from yeast to mammals
>KOG2110 consensus Uncharacterized conserved protein, contains WD40 repeats [Function unknown]
>PLN00181 protein SPA1-RELATED; Provisional
>KOG0285 consensus Pleiotropic regulator 1 [RNA processing and modification]
>KOG0646 consensus WD40 repeat protein [General function prediction only]
>KOG0306 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG2321 consensus WD40 repeat protein [General function prediction only]
>KOG0650 consensus WD40 repeat nucleolar protein Bop1, involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]
>KOG2111 consensus Uncharacterized conserved protein, contains WD40 repeats [Function unknown]
>KOG0282 consensus mRNA splicing factor [Function unknown]
>KOG0319 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG1273 consensus WD40 repeat protein [General function prediction only]
>PTZ00420 coronin; Provisional
>KOG0294 consensus WD40 repeat-containing protein [Function unknown]
>KOG0278 consensus Serine/threonine kinase receptor-associated protein [Lipid transport and metabolism]
>KOG1897 consensus Damage-specific DNA binding complex, subunit DDB1 [Replication, recombination and repair]
>KOG0315 consensus G-protein beta subunit-like protein (contains WD40 repeats) [General function prediction only]
>KOG2055 consensus WD40 repeat protein [General function prediction only]
>KOG4378 consensus Nuclear protein COP1 [Signal transduction mechanisms]
>KOG0283 consensus WD40 repeat-containing protein [Function unknown]
>PF08596 Lgl_C: Lethal giant larvae(Lgl) like, C-terminal; InterPro: IPR013905 The Lethal giant larvae (Lgl) tumour suppressor protein is conserved from yeast to mammals
>PF14783 BBS2_Mid: Ciliary BBSome complex subunit 2, middle region
>PF14727 PHTB1_N: PTHB1 N-terminus
>KOG2096 consensus WD40 repeat protein [General function prediction only]
>KOG0315 consensus G-protein beta subunit-like protein (contains WD40 repeats) [General function prediction only]
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
>KOG0319 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
>KOG0277 consensus Peroxisomal targeting signal type 2 receptor [Intracellular trafficking, secretion, and vesicular transport]
>KOG0299 consensus U3 snoRNP-associated protein (contains WD40 repeats) [RNA processing and modification]
>KOG0772 consensus Uncharacterized conserved protein, contains WD40 repeat [Function unknown]
>KOG0296 consensus Angio-associated migratory cell protein (contains WD40 repeats) [Function unknown]
>KOG0266 consensus WD40 repeat-containing protein [General function prediction only]
>KOG0647 consensus mRNA export protein (contains WD40 repeats) [RNA processing and modification]
>KOG0290 consensus Conserved WD40 repeat-containing protein AN11 [Function unknown]
>KOG3881 consensus Uncharacterized conserved protein [Function unknown]
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
>KOG0296 consensus Angio-associated migratory cell protein (contains WD40 repeats) [Function unknown]
>KOG1517 consensus Guanine nucleotide binding protein MIP1 [Cell cycle control, cell division, chromosome partitioning]
>KOG2106 consensus Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]
>KOG0282 consensus mRNA splicing factor [Function unknown]
>PTZ00420 coronin; Provisional
>KOG0772 consensus Uncharacterized conserved protein, contains WD40 repeat [Function unknown]
>PF14727 PHTB1_N: PTHB1 N-terminus
>PF08450 SGL: SMP-30/Gluconolaconase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species
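
Several accessions in the hit list above occur more than once (for example PF03178 and cd00200), presumably because the query matches the same domain model in more than one region. The following is a minimal Python sketch for tallying the hit headers by accession. The function names and the prefix-to-database mapping are hypothetical, introduced only for illustration and inferred from the accession styles visible in the list; they are not part of the report.

```python
import re
from collections import Counter

# Assumption: database labels are guessed from the accession prefixes seen
# in the hit list; this table is illustrative, not an official mapping.
PREFIX_TO_DB = {
    "KOG": "KOG",
    "COG": "COG",
    "PF": "Pfam",
    "cd": "CDD",
    "PRK": "PRK",
    "PLN": "PLN",
    "PTZ": "PTZ",
}

# A hit header starts with '>' followed by letters and digits, e.g. '>KOG1897'.
HEADER_RE = re.compile(r"^>(?P<acc>[A-Za-z]+\d+)\b")


def parse_accession(line):
    """Return the accession of a hit header line, or None for other lines."""
    match = HEADER_RE.match(line.strip())
    return match.group("acc") if match else None


def database_of(accession):
    """Map an accession to a database label using its alphabetic prefix."""
    prefix = re.match(r"[A-Za-z]+", accession).group(0)
    return PREFIX_TO_DB.get(prefix, "unknown")


def tally_hits(lines):
    """Count how often each accession appears among the header lines."""
    accessions = (parse_accession(line) for line in lines)
    return Counter(acc for acc in accessions if acc is not None)


if __name__ == "__main__":
    sample = [
        ">KOG1897 consensus Damage-specific DNA binding complex, subunit DDB1",
        ">PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871",
        ">PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871",
        ">cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins",
    ]
    for accession, count in tally_hits(sample).items():
        print(accession, database_of(accession), count)
```

The parser reads only the leading accession token, so it works whether or not the header lines carry trailing navigation text.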

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                           E-value
Query   1431
3i7h_A  1143    Crystal Structure Of Ddb1 In Complex With The H-Box  3e-15
3ei4_A  1158    Structure Of The Hsddb1-Hsddb2 Complex Length = 115  3e-15
3e0c_A  1140    Crystal Structure Of Dna Damage-Binding Protein 1(D  4e-15
4e54_A  1150    Damaged Dna Induced Uv-Damaged Dna-Binding Protein   4e-15
2b5l_A  1140    Crystal Structure Of Ddb1 In Complex With Simian Vi  4e-15
4a08_A  1159    Structure Of Hsddb1-Drddb2 Bound To A 13 Bp Cpd-Dup  4e-15
4a0b_A  1159    Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Dup  4e-15
4a0a_A  1159    Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Dup  4e-15
4a0l_A  1144    Structure Of Ddb1-Ddb2-Cul4b-Rbx1 Bound To A 12 Bp   4e-15
4a11_A  1159    Structure Of The Hsddb1-Hscsa Complex Length = 1159  4e-15
3ei1_A  1158    Structure Of Hsddb1-Drddb2 Bound To A 14 Bp 6-4 Pho  4e-15
>pdb|3I7H|A Chain A, Crystal Structure Of Ddb1 In Complex With The H-Box Motif Of Hbx Length = 1143

Iteration: 1

 Score = 81.6 bits (200), Expect = 3e-15, Method: Compositional matrix adjust.
 Identities = 76/285 (26%), Positives = 126/285 (44%), Gaps = 34/285 (11%)

Query: 1105 KENETLLAIGTAYVQGEDVAAR-GRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASL 1163
            K+  T   +GTA V  E+   + GR+++F   + +D     V E   KE+KGA+ ++
Sbjct: 826  KDPNTYFIVGTAMVYPEEAEPKQGRIVVF---QYSDGKLQTVAE---KEVKGAVYSMVEF 879

Query: 1164 QGHLLIASGPKIILHKWTG-----TELNGIAFYDAPPLYVVSLNIVKNFILLGDIHKSIY 1218
             G LL +    + L++WT      TE N   + +   LY   L    +FIL+GD+ +S+
Sbjct: 880  NGKLLASINSTVRLYEWTTEKDVRTECN--HYNNIMALY---LKTKGDFILVGDLMRSVL 934

Query: 1219 FLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNIQIFYYAPKMSESWK 1278
             L++K        +A+DF      A E L D + L    ++   N+ +       +   +
Sbjct: 935  LLAYKPMEGNFEEIARDFNPNWMSAVEILDDDNFLG---AENAFNLFVCQKDSAATTDEE 991

Query: 1279 GQKLLSRAEFHVGAHVTKF----LRLQMLATSSDRTGAAPGSDKTNRFALLFGTLDGSIG 1334
             Q L     FH+G  V  F    L +Q L  +S  T  +          +LFGT++G IG
Sbjct: 992  RQHLQEVGLFHLGEFVNVFCHGSLVMQNLGETSTPTQGS----------VLFGTVNGMIG 1041

Query: 1335 CIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAH 1379
             +  L E  +  L  +Q +L   +  V  +    +R FH+  K
Sbjct: 1042 LVTSLSESWYNLLLDMQNRLNKVIKSVGKIEHSFWRSFHTERKTE 1086

>pdb|3EI4|A Chain A, Structure Of The Hsddb1-Hsddb2 Complex Length = 1158
>pdb|3E0C|A Chain A, Crystal Structure Of Dna Damage-Binding Protein 1(Ddb1) Length = 1140
>pdb|4E54|A Chain A, Damaged Dna Induced Uv-Damaged Dna-Binding Protein (Uv-Ddb) Dimerization And Its Roles In Chromatinized Dna Repair Length = 1150
>pdb|2B5L|A Chain A, Crystal Structure Of Ddb1 In Complex With Simian Virus 5 V Protein Length = 1140
>pdb|4A08|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 13 Bp Cpd-Duplex ( Purine At D-1 Position) At 3.0 A Resolution (Cpd 1) Length = 1159
>pdb|4A0B|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Duplex ( Pyrimidine At D-1 Position) At 3.8 A Resolution (Cpd 4) Length = 1159
>pdb|4A0A|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 16 Bp Cpd-Duplex ( Pyrimidine At D-1 Position) At 3.6 A Resolution (Cpd 3) Length = 1159
>pdb|4A0L|A Chain A, Structure Of Ddb1-Ddb2-Cul4b-Rbx1 Bound To A 12 Bp Abasic Site Containing Dna-Duplex Length = 1144
>pdb|4A11|A Chain A, Structure Of The Hsddb1-Hscsa Complex Length = 1159
>pdb|3EI1|A Chain A, Structure Of Hsddb1-Drddb2 Bound To A 14 Bp 6-4 Photoproduct Containing Dna-Duplex Length = 1158
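
The statistics reported for the 3I7H alignment above give each count as a fraction with a percentage, for example Identities = 76/285 (26%). Since 76/285 is about 26.7%, the percentages in this report appear to be truncated rather than rounded. The following Python sketch parses such a statistics line and checks that reading; the function names are hypothetical and the truncation rule is an assumption inferred from the numbers shown here, not a documented guarantee.

```python
import re

# Matches fragments such as 'Identities = 76/285 (26%)'.
STAT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, alignment_length, reported_percent)}."""
    stats = {}
    for name, count, length, percent in STAT_RE.findall(line):
        stats[name] = (int(count), int(length), int(percent))
    return stats


def truncated_percent(count, length):
    """Integer percentage with the fractional part dropped (assumed rule)."""
    return (100 * count) // length


if __name__ == "__main__":
    line = (
        "Score = 81.6 bits (200), Expect = 3e-15, "
        "Method: Compositional matrix adjust. "
        "Identities = 76/285 (26%), Positives = 126/285 (44%), "
        "Gaps = 34/285 (11%)"
    )
    for name, (count, length, percent) in parse_stats(line).items():
        exact = 100.0 * count / length
        agrees = truncated_percent(count, length) == percent
        print(f"{name}: {count}/{length} = {exact:.2f}%, "
              f"reported {percent}%, truncation agrees: {agrees}")
```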

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   1431
3ei3_A  1158    DNA damage-binding protein 1; UV-damage, DDB, nucl  2e-80
3ei3_A  1158    DNA damage-binding protein 1; UV-damage, DDB, nucl  8e-55
1vt4_I  1221    APAF-1 related killer DARK; drosophila apoptosome,  4e-09
1vt4_I  1221    APAF-1 related killer DARK; drosophila apoptosome,  5e-05
1vt4_I  1221    APAF-1 related killer DARK; drosophila apoptosome,  9e-05
>3ei3_A DNA damage-binding protein 1; UV-damage, DDB, nucleotide excision repair, xeroderma pigmentosum, cytoplasm, DNA repair; HET: DNA PG4; 2.30A {Homo sapiens} PDB: 3ei1_A* 3ei2_A* 3ei4_A* 4a0l_A* 3e0c_A* 3i7k_A* 3i7h_A* 3i7l_A* 3i7n_A* 3i7o_A* 3i7p_A* 3i89_A* 3i8c_A* 3i8e_A* 2b5l_A 2b5m_A 2hye_A* 4a11_A* 4a0k_C* 4a0a_A* ... Length = 1158
 Score =  288 bits (738), Expect = 2e-80
 Identities = 107/640 (16%), Positives = 218/640 (34%), Gaps = 50/640 (7%)

Query: 803  INSSSEEGTGQGRKENIHSMKVVELAMQRWSAHHSRPFLFAILTDGTILCYQAYLFEGPE 862
            +    +E       E  H +  +++     S   S      + TD +    +   FE   
Sbjct: 537  LQIHPQELRQISHTEMEHEVACLDITPLGDSNGLSPLCAIGLWTDISARILKLPSFELLH 596

Query: 863  NTSKSDDPVSTSRSLSVSNVSASRLRNLR--------FSRTPLDAYTREETPHGAPCQRI 914
                  + +  S  ++    S   L  L          +        R++   G     +
Sbjct: 597  KEMLGGEIIPRSILMTTFESSHYLLCALGDGALFYFGLNIETGLLSDRKKVTLGTQPTVL 656

Query: 915  TIFKNISGHQGFFLSGSRPCWCMVFRERLRVHPQLCDGSIVAFTVLHNVNCNHGFIYVTS 974
              F+++S     F    RP        +L     +    +     L++           +
Sbjct: 657  RTFRSLST-TNVFACSDRPTVIYSSNHKLVFSN-VNLKEVNYMCPLNSDGYPDSLALANN 714

Query: 975  QGILKICQLPSGSTYDNYWPVQKIPLKATPHQITYFAEKNLYPLIVSVPVLKPLNQVLSL 1034
               L I  +           ++ +PL  +P +I Y      + ++ S   ++  +   + 
Sbjct: 715  ST-LTIGTID----EIQKLHIRTVPLYESPRKICYQEVSQCFGVLSSRIEVQDTSGGTTA 769

Query: 1035 LIDQEVGHQIDNHNLSSVDLHRTYTVEE---------YEVRILEPDRAGGPWQTRATIPM 1085
            L        + +   SS     +    E         + + I++       ++       
Sbjct: 770  LRPSASTQALSSSVSSSKLFSSSTAPHETSFGEEVEVHNLLIIDQHT----FEVLHAHQF 825

Query: 1086 QSSENALTVRVVTLFNTTTKENETLLAIGTAYVQGEDVAA-RGRVLLFSTGRNADNPQNL 1144
              +E AL++    L     K+  T   +GTA V  E+    +GR+++F            
Sbjct: 826  LQNEYALSLVSCKL----GKDPNTYFIVGTAMVYPEEAEPKQGRIVVFQYSDGK------ 875

Query: 1145 VTEVYSKELKGAISALASLQGHLLIASGPKIILHKWTGTELNGIAFYDAPPLYVVSLNIV 1204
            +  V  KE+KGA+ ++    G LL +    + L++WT  +           +  + L   
Sbjct: 876  LQTVAEKEVKGAVYSMVEFNGKLLASINSTVRLYEWTTEKELRTECNHYNNIMALYLKTK 935

Query: 1205 KNFILLGDIHKSIYFLSWKEQGAQLNLLAKDFGSLDCFATEFLIDGSTLSLVVSDEQKNI 1264
             +FIL+GD+ +S+  L++K        +A+DF      A E L D + L    ++   N+
Sbjct: 936  GDFILVGDLMRSVLLLAYKPMEGNFEEIARDFNPNWMSAVEILDDDNFL---GAENAFNL 992

Query: 1265 QIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLRLQMLATSSDRTGAAPGSDKTNRFAL 1324
             +       +   + Q L     FH+G  V  F    ++  +   T          + ++
Sbjct: 993  FVCQKDSAATTDEERQHLQEVGLFHLGEFVNVFCHGSLVMQNLGET------STPTQGSV 1046

Query: 1325 LFGTLDGSIGCIAPLDELTFRRLQSLQKKLVDSVPHVAGLNPRSFRQFHSNGKAHRPGPD 1384
            LFGT++G IG +  L E  +  L  +Q +L   +  V  +    +R FH+  K       
Sbjct: 1047 LFGTVNGMIGLVTSLSESWYNLLLDMQNRLNKVIKSVGKIEHSFWRSFHTERKTE--PAT 1104

Query: 1385 SIVDCELLSHYEMLPLEEQLEIAHQTGTTRSQILSNLNDL 1424
              +D +L+  +  +   +  E+           +      
Sbjct: 1105 GFIDGDLIESFLDISRPKMQEVVANLQYDDGSGMKREATA 1144
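
The percentages on the statistics line of the alignment above can be rechecked from the raw counts. The following is a minimal sketch, not part of the report itself: the function names `parse_stats` and `truncated_percent` are hypothetical, and the assumption that the reported percentages are the exact ratios with the fractional part dropped is inferred only from the three values shown here.

```python
import re

# Statistics line copied from the 3ei3_A alignment above.
STATS = "Identities = 107/640 (16%), Positives = 218/640 (34%), Gaps = 50/640 (7%)"


def parse_stats(line):
    """Return {name: (count, alignment_length, reported_percent)} for each field."""
    pattern = re.compile(r"(\w+) = (\d+)/(\d+) \((\d+)%\)")
    return {
        name: (int(count), int(total), int(pct))
        for name, count, total, pct in pattern.findall(line)
    }


def truncated_percent(count, total):
    """Integer percentage with the fraction dropped (assumed convention)."""
    return count * 100 // total


stats = parse_stats(STATS)
for name, (count, total, reported) in stats.items():
    # Print the unrounded ratio next to the value given in the report.
    print(f"{name}: {count}/{total} = {count / total:.1%} (reported {reported}%)")
```

Under that assumption, the unrounded identity over the 640 aligned columns is slightly higher than the reported integer value, which may matter when comparing hits near a threshold.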


>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    1431
d1k8kc_  371     Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taur  97.34
d1nexb2  355     Cdc4 propeller domain {Baker's yeast (Saccharomyce  97.19
d1gxra_  337     Groucho/tle1, C-terminal domain {Human (Homo sapie  96.63
d1nr0a2  299     Actin interacting protein 1 {Nematode (Caenorhabdi  96.43
d1pgua1  325     Actin interacting protein 1 {Baker's yeast (Saccha  96.38
d1gxra_  337     Groucho/tle1, C-terminal domain {Human (Homo sapie  96.28
d1erja_  388     Tup1, C-terminal domain {Baker's yeast (Saccharomy  94.81
d1pgua2  287     Actin interacting protein 1 {Baker's yeast (Saccha  94.52
d1k8kc_  371     Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taur  94.23
d1tbga_  340     beta1-subunit of the signal-transducing G protein   94.06
d1nr0a2  299     Actin interacting protein 1 {Nematode (Caenorhabdi  93.21
d1erja_  388     Tup1, C-terminal domain {Baker's yeast (Saccharomy  90.28
d1nr0a1  311     Actin interacting protein 1 {Nematode (Caenorhabdi  88.73
d1ri6a_  333     Putative isomerase YbhE {Escherichia coli [TaxId:   86.26
d1l0qa2  301     Surface layer protein {Archaeon Methanosarcina maz  85.4
d1q7fa_  279     Brain tumor cg10719-pa {Fruit fly (Drosophila mela  84.85
>d1k8kc_ b.69.4.1 (C:) Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taurus) [TaxId: 9913]}
class: All beta proteins
fold: 7-bladed beta-propeller
superfamily: WD40 repeat-like
family: WD40-repeat
domain: Arp2/3 complex 41 kDa subunit ARPC1
species: Cow (Bos taurus) [TaxId: 9913]
Probab=97.34  E-value=0.003  Score=33.94  Aligned_cols=57  Identities=14%  Similarity=0.116  Sum_probs=25.7

Q ss_pred             EEEEEEEEEECCCCCCCCEEEEEEEEEECCCCCCCCEEEEEEEEECCCEEEECCC-CCEEEEE-ECCEEEEEECC
Q ss_conf             1999996100598866660699999764599997538999988624831898444-6809999-68969999724
Q 000548         1109 TLLAIGTAYVQGEDVAARGRVLLFSTGRNADNPQNLVTEVYSKELKGAISALASL-QGHLLIA-SGPKIILHKWT 1181 (1431)
Q Consensus      1109 ~~ivVGT~~~~~E~~~~~Gri~vf~i~~~~~~~~~~l~~i~~~~~~G~V~al~~~-~g~Ll~a-vg~~i~i~~~~ 1181 (1431)
                      .+++.|+.         .|.|.++++...     ..+..+  ....++|++++-. +|.++++ ....+.+|.++
T Consensus       214 ~~l~s~~~---------d~~i~iwd~~~~-----~~~~~~--~~~~~~v~s~~fs~d~~~la~g~d~~~~~~~~~  272 (371)
T d1k8kc_         214 SRVAWVSH---------DSTVCLADADKK-----MAVATL--ASETLPLLAVTFITESSLVAAGHDCFPVLFTYD  272 (371)
T ss_dssp             SEEEEEET---------TTEEEEEEGGGT-----TEEEEE--ECSSCCEEEEEEEETTEEEEEETTSSCEEEEEE
T ss_pred             CCCCCCCC---------CCCCEEEEEECC-----CCEEEE--ECCCCCCEEEEECCCCCEEEEECCCCEEEEEEE
T ss_conf             21000014---------786058864101-----210000--014665203654699979999819926787760



>d1nexb2 b.69.4.1 (B:370-744) Cdc4 propeller domain {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1gxra_ b.69.4.1 (A:) Groucho/tle1, C-terminal domain {Human (Homo sapiens) [TaxId: 9606]}
>d1nr0a2 b.69.4.1 (A:313-611) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1pgua1 b.69.4.1 (A:2-326) Actin interacting protein 1 {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1gxra_ b.69.4.1 (A:) Groucho/tle1, C-terminal domain {Human (Homo sapiens) [TaxId: 9606]}
>d1erja_ b.69.4.1 (A:) Tup1, C-terminal domain {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1pgua2 b.69.4.1 (A:327-613) Actin interacting protein 1 {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1k8kc_ b.69.4.1 (C:) Arp2/3 complex 41 kDa subunit ARPC1 {Cow (Bos taurus) [TaxId: 9913]}
>d1tbga_ b.69.4.1 (A:) beta1-subunit of the signal-transducing G protein heterotrimer {Cow (Bos taurus) [TaxId: 9913]}
>d1nr0a2 b.69.4.1 (A:313-611) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1erja_ b.69.4.1 (A:) Tup1, C-terminal domain {Baker's yeast (Saccharomyces cerevisiae) [TaxId: 4932]}
>d1nr0a1 b.69.4.1 (A:2-312) Actin interacting protein 1 {Nematode (Caenorhabditis elegans) [TaxId: 6239]}
>d1ri6a_ b.69.11.1 (A:) Putative isomerase YbhE {Escherichia coli [TaxId: 562]}
>d1l0qa2 b.69.2.3 (A:1-301) Surface layer protein {Archaeon Methanosarcina mazei [TaxId: 2209]}
>d1q7fa_ b.68.9.1 (A:) Brain tumor cg10719-pa {Fruit fly (Drosophila melanogaster) [TaxId: 7227]}