Citrus sinensis ID: 013597

Local Sequence Feature Prediction

Prediction and (Method)	Result

Residue Number Marker

Protein Sequence

Secondary Structure (PSIPRED)

Secondary Structure Prediction (SSPRO)

Coil and Loop (DISEMBL)

Flexible Loop (DISEMBL)

Low Complexity Region (SEG)

Disordered Region (IsUnstruct)

Disordered Region (DISOPRED)

Disordered Region (DISEMBL)

Disordered Region (DISPRO)

Transmembrane Helix (TMHMM)

Transmembrane Helix (HMMTOP)

Transmembrane Helix (MEMSAT)

TM Helix, Signal Peptide (MEMSAT_SVM)

TM Helix, Signal Peptide (Phobius)

Signal Peptide (SignalP HMM Mode)

Signal Peptide (SignalP NN Mode)

Coiled Coils (COILS)

Positional Conservation

--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-------240-------250-------260-------270-------280-------290-------300-------310-------320-------330-------340-------350-------360-------370-------380-------390-------400-------410-------420-------430-------44

MSSTPGTHSLAFRVMRLCRPSLHVEPPLRVDPTDLFIGEDIFDDPIAASNLPPLISSDVTTNKSSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATMLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFVDQD

cccccccccEEEEEEEEccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccEEEEEEEEEcccccccEEEEEEEEEEccccEEEEcccccccccccccccEEEEEEEEEcEEEccEEEEEEEEEEcccccEEEEEEEEEEEEcccEEEEEEEEEEcccEEEEEEEEEcccccEEEEEEEEEEcccccEEEcccccccccccccccccccccEEEEccccEEEEEEEEEEcccccccccccccccccEEEEEEEEcccccccEEEEEEEEccccccccEEEEEEEcccEEEEcccEEEEEEEEEccccccccEEEEEEccccccccEEEEccccEEEEcccccccEEEEEEEEEEccccEEEEccEEEEEccccEEEcccccEEEEEccc

ccccccccEEEEEEEEccccccccccccccccccccccccccccccccccccHHccccccccccccccccccccccccccccccccccccccccccEEEEEEEEEEEEEEcccccEEEEEEEEEEEEccccEEEcccccccccccccccccccEEEEEEEHHcccEEEEEEEEEEcccccEEEEEEEEEEEEccccEEEEEEEcccccEEEEEEEEEcccccEEEEEEEEccccccEEEEcccccccccccccccccccccccEEccccEEEEEEEEEcccccccccccccccEEEEEEEEEEEcccccccEEEcccccccccccccEEEEEEEcccEEEEEEcEEEEEEEEEccccccccEEEEEEccccccEEEEEEcccEccEcccccccccEEEEEEEEEccccEEEEcEEEEEEccccEEEEccccEEEEEccc

msstpgthsLAFRVMRlcrpslhvepplrvdptdlfigedifddpiaasnlpplissdvttnkssdltyRSRFLLhdsadsiglsgllvlpqafgaiylgeTFCSYISinnsstlevRDVVIKAEIQTDKQRILLLdtskspvesiraggrydfIVEHDVKELGAHTLVCTAlysdgegerkyLPQFFKFIvsnplsvrtKVRVVKEITFLEACIENhtksnlymdqvefepsqnwsatmlkadgphsdynaqsreifkppvlirsgggiHNYLYQLKmlshgssspvkvqgsnvlgklqitwrtnlgepgrlqtQQILGTTITSKEIElnvvevpsvvgidkpfllklkltnqtdkeqgpfeiwlsqndsdeekVVMINGLRIMalapveafgstdFHLNLIATKLGVQRITGITVFDKLekitydslpdleifvdqd

msstpgthslAFRVMRLCrpslhvepplrvDPTDLFIGEDIFDDPIAasnlpplissdvttnksSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAeiqtdkqrillldtskspvesiRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFivsnplsvrtKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATMLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQITWrtnlgepgrlqtQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLtnqtdkeqgpfeiwlsqndsdEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKlekitydslpdleifvdqd

MSSTPGTHSLAFRVMRLCRPSLHVEPPLRVDPTDLFIGEDIFDDPIAASNLPPLISSDVTTNKSSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATMLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQITWRTNLGEPGRlqtqqilgttitSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFVDQD

*********LAFRVMRLCRPSLHVEPPLRVDPTDLFIGEDIFDDPIAASNLPPLIS*********DLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEF*************************EIFKPPVLIRSGGGIHNYLYQLKMLSHG****VKVQGSNVLGKLQITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTN*********EIW**********VVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFV***

*****GT**LAFRVMRLCRPSLHVEPPLRVDPTDLFIG******************************************SIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDK*******************GRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATM********************PVLIRSGGGIHNYLYQLKM*************SNVLGKLQITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFVDQD

MSSTPGTHSLAFRVMRLCRPSLHVEPPLRVDPTDLFIGEDIFDDPIAASNLPPLISSDVTTNKSSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATMLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSH********QGSNVLGKLQITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFVDQD

*****GTHSLAFRVMRLCRPSLHVEPPLRVDPTDLFIGE*************************************DSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATMLKA***********REIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFVDQ*

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHHHoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhhhhhhhoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooooooooooooooooooooooo

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

no confident homologs detected
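The per-residue tracks above assign one state character to each residue. For downstream use it is often easier to work with intervals. The following is a minimal sketch, assuming each track is a plain string with one character per residue and 1-based residue numbering; the function name `track_to_segments` and the toy input are illustrative and not part of the report.

```python
from itertools import groupby


def track_to_segments(track):
    """Collapse a per-residue track into (state, start, end) runs.

    Positions are 1-based and inclusive, matching the residue number marker.
    """
    segments = []
    pos = 1
    for state, run in groupby(track):
        length = sum(1 for _ in run)
        segments.append((state, pos, pos + length - 1))
        pos += length
    return segments


# Toy track for illustration only; it is not copied from the report.
toy = "iiiHHHHooo"
for state, start, end in track_to_segments(toy):
    print(state, start, end)
```

Applied to a transmembrane track, runs of `H` or `h` would give candidate helix intervals, while runs of `i` and `o` would give the predicted inside and outside stretches.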

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

Original result of BLAST against SWISS-PROT Database

ID	Alignment graph	Length	Definition	RBH(Q2H)	RBH(H2Q)	Q cover	H cover	Identity	E-value
Query		439	2.2.26 [Sep-21-2011]
Q6PBY7		412	UPF0533 protein C5orf44 h	yes	no	0.861	0.917	0.330	9e-53
Q5RCG0		417	UPF0533 protein C5orf44 h	yes	no	0.858	0.904	0.321	5e-52
A5PLN9		417	UPF0533 protein C5orf44 O	yes	no	0.858	0.904	0.318	8e-52
Q3TIR1		417	UPF0533 protein C5orf44 h	yes	no	0.867	0.913	0.319	2e-50
A7MB76		417	UPF0533 protein C5orf44 h	yes	no	0.858	0.904	0.314	2e-50
Q5M887		418	UPF0533 protein C5orf44 h	yes	no	0.870	0.913	0.319	3e-50
Q0VFT9		412	UPF0533 protein C5orf44 h	yes	no	0.867	0.924	0.323	7e-50
Q6GPR5		414	UPF0533 protein C5orf44 h	N/A	no	0.872	0.925	0.320	3e-48
A8WX89		401	UPF0533 protein CBG04321	N/A	no	0.876	0.960	0.304	3e-47
Q95QQ2		401	UPF0533 protein C56C10.7	yes	no	0.879	0.962	0.295	6e-47

>sp|Q6PBY7|CE044_DANRE UPF0533 protein C5orf44 homolog OS=Danio rerio GN=zgc:73187 PE=2 SV=2	Back alignment and function description

 Score =  207 bits (528), Expect = 9e-53,   Method: Compositional matrix adjust.
 Identities = 141/427 (33%), Positives = 222/427 (51%), Gaps = 49/427 (11%)

Query: 8   HSLAFRVMRLCRPSLHVEPPLRVD----PTDLFIGEDIFDDPIAASNLPPLISSDVTTNK 63
           H LA +VMRL +P+L    P+  +    P DLF+                L+  D +T K
Sbjct: 10  HLLALKVMRLTKPTLFTNMPVTCEDRDLPGDLFLR---------------LMKDDPSTVK 54

Query: 64  SSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIK 123
                          A+++ L  +L LPQ FG I+LGETF SYIS++N S+  V+D+++K
Sbjct: 55  G--------------AETLILGEMLTLPQNFGNIFLGETFSSYISVHNDSSQVVKDILVK 100

Query: 124 AEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKY 183
           A++QT  QR L L  S S V  ++     D ++ H+VKE+G H LVC   Y+   GE+ Y
Sbjct: 101 ADLQTSSQR-LNLSASNSAVSELKPECCIDDVIHHEVKEIGTHILVCAVSYTTQTGEKLY 159

Query: 184 LPQFFKFIVSNPLSVRTKVRVVK-EITFLEACIENHTKSNLYMDQVEFEPSQNWSATMLK 242
             +FFKF V  PL V+TK    + +  FLEA I+N T S ++M++V  EPS  ++ T L 
Sbjct: 160 FRKFFKFQVLKPLDVKTKFYNAETDEVFLEAQIQNITTSPMFMEKVSLEPSMMYNVTELN 219

Query: 243 --ADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQ 300
             A G  S  +   +  +  P+  R       YLY LK     +     ++G  V+GKL 
Sbjct: 220 NVASGDESSESTFGKMSYLQPLDTR------QYLYCLKPKPEFAEKAGVIKGVTVIGKLD 273

Query: 301 ITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQG 360
           I W+TNLGE GRLQT Q+        ++ L++  +P  V +++PF +  K+TN +++   
Sbjct: 274 IVWKTNLGERGRLQTSQLQRMAPGYGDVRLSLEFIPDTVDLEEPFDITCKITNCSERT-- 331

Query: 361 PFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFDK 420
             ++ L   ++       ++G ++  L+P     S    L L+++  G+Q I+G+ + D 
Sbjct: 332 -MDLLLEMCNTRSVHWCGVSGRQLGKLSPS---ASLSIPLKLLSSVQGLQSISGLRLTDT 387

Query: 421 LEKITYD 427
             K TY+
Sbjct: 388 FLKRTYE 394

Danio rerio (taxid: 7955)
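The Identity, Q cover and H cover columns in the summary table appear to be derivable from the alignment statistics line. The sketch below is an inference from the numbers shown here, not documented behaviour of the pipeline: it assumes identity is identical positions over alignment columns, coverage is (alignment columns minus gaps) over sequence length, and values are truncated to three decimals. The names `summarize` and `trunc3` are hypothetical.

```python
import re

# Matches the "Identities = a/b ... Gaps = c/d" statistics line of a BLAST report.
STATS = re.compile(r"Identities\s*=\s*(\d+)/(\d+).*?Gaps\s*=\s*(\d+)/(\d+)")


def trunc3(num, den):
    """Return num/den truncated (not rounded) to three decimal places."""
    return (num * 1000 // den) / 1000


def summarize(stats_line, query_len, hit_len):
    """Derive identity and coverage fractions from one statistics line."""
    m = STATS.search(stats_line)
    if m is None:
        raise ValueError("no Identities/Gaps fields found")
    identical, columns, gaps, _ = map(int, m.groups())
    aligned = columns - gaps  # assumption: gap columns are excluded from coverage
    return {
        "identity": trunc3(identical, columns),
        "q_cover": trunc3(aligned, query_len),
        "h_cover": trunc3(aligned, hit_len),
    }


line = " Identities = 141/427 (33%), Positives = 222/427 (51%), Gaps = 49/427 (11%)"
print(summarize(line, query_len=439, hit_len=412))
```

With the Q6PBY7 statistics line and lengths 439 and 412, this gives 0.330, 0.861 and 0.917, which agrees with the corresponding table row.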

>sp|Q5RCG0|CE044_PONAB UPF0533 protein C5orf44 homolog OS=Pongo abelii PE=2 SV=1	Back alignment and function description

>sp|A5PLN9|CE044_HUMAN UPF0533 protein C5orf44 OS=Homo sapiens GN=C5orf44 PE=2 SV=2	Back alignment and function description

>sp|Q3TIR1|CE044_MOUSE UPF0533 protein C5orf44 homolog OS=Mus musculus PE=2 SV=1	Back alignment and function description

>sp|A7MB76|CE044_BOVIN UPF0533 protein C5orf44 homolog OS=Bos taurus PE=2 SV=1	Back alignment and function description

>sp|Q5M887|CE044_RAT UPF0533 protein C5orf44 homolog OS=Rattus norvegicus PE=2 SV=2	Back alignment and function description

>sp|Q0VFT9|CE044_XENTR UPF0533 protein C5orf44 homolog OS=Xenopus tropicalis PE=2 SV=1	Back alignment and function description

>sp|Q6GPR5|CE044_XENLA UPF0533 protein C5orf44 homolog OS=Xenopus laevis PE=2 SV=2	Back alignment and function description

>sp|A8WX89|U533_CAEBR UPF0533 protein CBG04321 OS=Caenorhabditis briggsae GN=CBG04321 PE=3 SV=2	Back alignment and function description

>sp|Q95QQ2|U533_CAEEL UPF0533 protein C56C10.7 OS=Caenorhabditis elegans GN=C56C10.7 PE=1 SV=1	Back alignment and function description
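The SWISS-PROT hit headers above follow the UniProt FASTA header layout: database, accession and entry name separated by pipes, then a description and `KEY=value` fields (OS for organism, GN for gene name, PE for protein existence, SV for sequence version). A minimal parsing sketch follows, assuming the trailing tab-separated link text should be dropped and that pipes may arrive backslash-escaped; `parse_header` is an illustrative name.

```python
import re

HEADER = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+(?P<rest>.*)$"
)
FIELD = re.compile(r"\b(OS|OX|GN|PE|SV)=")


def parse_header(line):
    """Split a UniProt-style header line into a dict of its fields."""
    # Undo backslash-escaped pipes and drop anything after the first tab.
    line = line.replace("\\|", "|").split("\t")[0].strip()
    m = HEADER.match(line)
    if m is None:
        raise ValueError("not a UniProt-style header")
    # FIELD.split yields [description, key1, value1, key2, value2, ...].
    parts = FIELD.split(m.group("rest"))
    record = {
        "db": m.group("db"),
        "accession": m.group("accession"),
        "entry": m.group("entry"),
        "description": parts[0].strip(),
    }
    for key, value in zip(parts[1::2], parts[2::2]):
        record[key] = value.strip()
    return record


example = (
    ">sp|Q6PBY7|CE044_DANRE UPF0533 protein C5orf44 homolog "
    "OS=Danio rerio GN=zgc:73187 PE=2 SV=2\tBack alignment and function description"
)
print(parse_header(example))
```

Headers without a GN field, such as the Pongo abelii entry, simply yield a record with no `GN` key.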

Close Homologs in the Non-Redundant Database Detected by BLAST

Original result of BLAST against Non-Redundant Database

GI	Alignment Graph	Length	Definition	Q cover	H cover	Identity	E-value
Query		439
255556003		434	expressed protein, putative [Ricinus com	0.986	0.997	0.772	0.0
225470348		438	PREDICTED: UPF0533 protein C5orf44 [Viti	0.993	0.995	0.775	0.0
356548745		440	PREDICTED: UPF0533 protein C5orf44 homol	0.977	0.975	0.759	0.0
449457717		440	PREDICTED: UPF0533 protein C5orf44-like	0.995	0.993	0.723	0.0
224079249		450	predicted protein [Populus trichocarpa]	0.972	0.948	0.737	0.0
356521339		435	PREDICTED: UPF0533 protein C5orf44-like	0.968	0.977	0.743	0.0
388496064		437	unknown [Medicago truncatula]	0.965	0.970	0.726	0.0
358346667		446	hypothetical protein MTR_084s0010 [Medic	0.965	0.950	0.712	1e-180
18407493		442	uncharacterized protein [Arabidopsis tha	0.990	0.984	0.679	1e-170
297824907		443	hypothetical protein ARALYDRAFT_483987 [	0.986	0.977	0.678	1e-169

>gi|255556003|ref|XP_002519036.1| expressed protein, putative [Ricinus communis] gi|223541699|gb|EEF43247.1| expressed protein, putative [Ricinus communis]	Back alignment and taxonomy information

 Score =  704 bits (1818), Expect = 0.0,   Method: Compositional matrix adjust.
 Identities = 340/440 (77%), Positives = 384/440 (87%), Gaps = 7/440 (1%)

Query: 1   MSSTPGTHSLAFRVMRLCRPSLHVEPPLRVDPTDLFIGEDIFDDPIAASNLPPLISSDVT 60
           MS+TPGTHSLAFRVMRLCRPS HV+  L VDP+DL +GEDIFDDP+AAS LPPLI S +T
Sbjct: 1   MSTTPGTHSLAFRVMRLCRPSFHVDAQLLVDPSDLIVGEDIFDDPVAASRLPPLIDSHIT 60

Query: 61  T-NKSSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRD 119
               +SDL+YR+RFL    +DS GL+GLLVLPQAFGAIYLGETFCSYISINNSS  EVRD
Sbjct: 61  KLTDTSDLSYRTRFLHQHPSDSFGLTGLLVLPQAFGAIYLGETFCSYISINNSSNFEVRD 120

Query: 120 VVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEG 179
           V+IKAEIQT++QRILLLDTSK+PVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDG+G
Sbjct: 121 VIIKAEIQTERQRILLLDTSKNPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGDG 180

Query: 180 ERKYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSAT 239
           ERKYLPQFFKFIV+NPLSVRTKVRVVKE T+LEACIENHTK+NLYMDQVEFEP+Q+WSA 
Sbjct: 181 ERKYLPQFFKFIVANPLSVRTKVRVVKETTYLEACIENHTKTNLYMDQVEFEPAQHWSAK 240

Query: 240 MLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKL 299
           ++K D   S+ ++ +REIFKPPVLIRSGGGIHNYLYQL++ +HG++       SNVLGKL
Sbjct: 241 IIKDDEKQSEKDSLTREIFKPPVLIRSGGGIHNYLYQLRLSAHGAAQ------SNVLGKL 294

Query: 300 QITWRTNLGEPGRLQTQQILGTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQ 359
           QITWRTNLGEPGRLQTQQILGT IT KEIEL + +VP+V+ +DKPF + LKLTN TDKE 
Sbjct: 295 QITWRTNLGEPGRLQTQQILGTPITRKEIELCIAKVPAVINLDKPFSVHLKLTNHTDKEL 354

Query: 360 GPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFD 419
           GPFE+WLSQ+ S EEK V INGL+ M L+ +EAFG+TDFHLNLIATKLGVQRITGITVFD
Sbjct: 355 GPFEVWLSQDGSVEEKAVTINGLQTMELSQLEAFGTTDFHLNLIATKLGVQRITGITVFD 414

Query: 420 KLEKITYDSLPDLEIFVDQD 439
           K EK TYD LPDLEIFV  D
Sbjct: 415 KSEKKTYDPLPDLEIFVAID 434

Source: Ricinus communis

Species: Ricinus communis

Genus: Ricinus

Family: Euphorbiaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|225470348|ref|XP_002269604.1| PREDICTED: UPF0533 protein C5orf44 [Vitis vinifera] gi|296090651|emb|CBI41051.3| unnamed protein product [Vitis vinifera]	Back alignment and taxonomy information

>gi|356548745|ref|XP_003542760.1| PREDICTED: UPF0533 protein C5orf44 homolog [Glycine max]	Back alignment and taxonomy information

>gi|449457717|ref|XP_004146594.1| PREDICTED: UPF0533 protein C5orf44-like [Cucumis sativus]	Back alignment and taxonomy information

>gi|224079249|ref|XP_002305809.1| predicted protein [Populus trichocarpa] gi|222848773|gb|EEE86320.1| predicted protein [Populus trichocarpa]	Back alignment and taxonomy information

>gi|356521339|ref|XP_003529314.1| PREDICTED: UPF0533 protein C5orf44-like [Glycine max]	Back alignment and taxonomy information

>gi|388496064|gb|AFK36098.1| unknown [Medicago truncatula]	Back alignment and taxonomy information

>gi|358346667|ref|XP_003637387.1| hypothetical protein MTR_084s0010 [Medicago truncatula] gi|355503322|gb|AES84525.1| hypothetical protein MTR_084s0010 [Medicago truncatula]	Back alignment and taxonomy information

>gi|18407493|ref|NP_566117.1| uncharacterized protein [Arabidopsis thaliana] gi|16226796|gb|AAL16264.1|AF428334_1 At2g47960/T9J23.10 [Arabidopsis thaliana] gi|18377797|gb|AAL67048.1| unknown protein [Arabidopsis thaliana] gi|20197311|gb|AAC63650.2| expressed protein [Arabidopsis thaliana] gi|20197565|gb|AAM15133.1| expressed protein [Arabidopsis thaliana] gi|21281259|gb|AAM45021.1| unknown protein [Arabidopsis thaliana] gi|330255823|gb|AEC10917.1| uncharacterized protein [Arabidopsis thaliana]	Back alignment and taxonomy information

>gi|297824907|ref|XP_002880336.1| hypothetical protein ARALYDRAFT_483987 [Arabidopsis lyrata subsp. lyrata] gi|297326175|gb|EFH56595.1| hypothetical protein ARALYDRAFT_483987 [Arabidopsis lyrata subsp. lyrata]	Back alignment and taxonomy information

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

Original result of BLAST against Gene Ontology (AMIGO)

ID	Alignment graph	Length	Definition	Q cover	H cover	Identity	E-value
Query		439
TAIR|locus:2043433		442	AT2G47960 "AT2G47960" [Arabido	0.990	0.984	0.661	5.4e-150
ZFIN|ZDB-GENE-030131-9775		412	trappc13 "trafficking protein	0.765	0.815	0.338	1.1e-52
MGI|MGI:1914225		417	Trappc13 "trafficking protein	0.767	0.808	0.325	1.4e-48
DICTYBASE|DDB_G0269062		511	DDB_G0269062 "DUF974 family pr	0.535	0.459	0.348	1.2e-47
FB|FBgn0032204		438	CG4953 [Drosophila melanogaste	0.788	0.789	0.316	1.4e-44
UNIPROTKB|G4NC96		339	MGG_01105 "Uncharacterized pro	0.216	0.280	0.324	0.0005

TAIR|locus:2043433 AT2G47960 "AT2G47960" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

 Score = 1464 (520.4 bits), Expect = 5.4e-150, P = 5.4e-150
 Identities = 291/440 (66%), Positives = 343/440 (77%)

Query:     2 SSTPGTHSLAFRVMRLCRPSLHVEPPLRVDPTDLFIGEDIFDDPIAASNLPPLISSDVTT 61
             + T G HSLAFRVMRLC+PS HV+PPLR+DP DL  GED  DDP +AS     +SS    
Sbjct:     6 TQTHGPHSLAFRVMRLCKPSFHVDPPLRIDPFDLLAGEDFSDDPSSASLFRRHVSSADAV 65

Query:    62 NKSSDLTYRSRFLLHDSADSIGLSGLLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVV 121
             +  SDL+YR+RFLL+   D IGLSGLL+LPQ+FGAIYLGETFCSYIS+NNSST EVRDV 
Sbjct:    66 D--SDLSYRNRFLLNHPTDPIGLSGLLLLPQSFGAIYLGETFCSYISVNNSSTSEVRDVT 123

Query:   122 IKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALYSDGEGER 181
             IKAEIQT++QRILLLDTSKSPVESIR GGRYDFIVEHDVKELGAHTLVC+ALY+D +GER
Sbjct:   124 IKAEIQTERQRILLLDTSKSPVESIRTGGRYDFIVEHDVKELGAHTLVCSALYNDADGER 183

Query:   182 KYLPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEFEPSQNWSATML 241
             KYLPQFFKF+V+NPLSVRTKVRVVKE TFLEACIENHTK+NL+MDQV+FEP++ WSA  L
Sbjct:   184 KYLPQFFKFVVANPLSVRTKVRVVKETTFLEACIENHTKANLFMDQVDFEPAKQWSAVRL 243

Query:   242 KADGPHSD--YNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKL 299
             + +    D   +  S  I KPPV+IRSGGGIHNYLY+L   S   S   K QGSN+LGK 
Sbjct:   244 QNEDSTEDPPTSGLSGLIPKPPVIIRSGGGIHNYLYKLNP-SADVSGQTKFQGSNILGKF 302

Query:   300 QITWRTNLGEPGRXXXXXXXXXXXXSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQ 359
             QITWRTNLGEPGR             KEI + VVEVP+V+ +++PF   L LTNQTD++ 
Sbjct:   303 QITWRTNLGEPGRLQTQQILGAPVSRKEINMRVVEVPAVIHLNRPFRAYLNLTNQTDRQL 362

Query:   360 GPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRITGITVFD 419
             GPFE+ LSQ+++  EK V INGL+ + L  +EAFGS DF LNLIA+KLGVQ+I GIT  D
Sbjct:   363 GPFEVSLSQDETQLEKPVGINGLQTLMLPRIEAFGSNDFQLNLIASKLGVQKIAGITALD 422

Query:   420 KLEKITYDSLPDLEIFVDQD 439
               EK TY+ +PD+EIFV+ D
Sbjct:   423 TREKKTYELVPDMEIFVETD 442

GO:0003674 "molecular_function" evidence=ND

GO:0008150 "biological_process" evidence=ND

GO:0009507 "chloroplast" evidence=ISM

GO:0006635 "fatty acid beta-oxidation" evidence=RCA

GO:0016558 "protein import into peroxisome matrix" evidence=RCA

ZFIN|ZDB-GENE-030131-9775 trappc13 "trafficking protein particle complex 13" [Danio rerio (taxid:7955)]	Back alignment and assigned GO terms

MGI|MGI:1914225 Trappc13 "trafficking protein particle complex 13" [Mus musculus (taxid:10090)]	Back alignment and assigned GO terms

DICTYBASE|DDB_G0269062 DDB_G0269062 "DUF974 family protein" [Dictyostelium discoideum (taxid:44689)]	Back alignment and assigned GO terms

FB|FBgn0032204 CG4953 [Drosophila melanogaster (taxid:7227)]	Back alignment and assigned GO terms

UNIPROTKB|G4NC96 MGG_01105 "Uncharacterized protein" [Magnaporthe oryzae 70-15 (taxid:242507)]	Back alignment and assigned GO terms

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

Original result of BLAST against SWISS-PROT

ID	Name	Annotated EC number	Identity	Query coverage	Hit coverage	RBH(Q2H)	RBH(H2Q)
Q0VFT9	CE044_XENTR	No assigned EC number	0.3231	0.8678	0.9247	yes	no
Q95TN1	U533_DROME	No assigned EC number	0.3057	0.8838	0.8858	yes	no
Q5RCG0	CE044_PONAB	No assigned EC number	0.3210	0.8587	0.9040	yes	no
Q3TIR1	CE044_MOUSE	No assigned EC number	0.3193	0.8678	0.9136	yes	no
Q5M887	CE044_RAT	No assigned EC number	0.3193	0.8701	0.9138	yes	no
A7MB76	CE044_BOVIN	No assigned EC number	0.3140	0.8587	0.9040	yes	no
Q6PBY7	CE044_DANRE	No assigned EC number	0.3302	0.8610	0.9174	yes	no
A5PLN9	CE044_HUMAN	No assigned EC number	0.3187	0.8587	0.9040	yes	no

EC Number Prediction by EzyPred Server

Original result from EzyPred Server

Failed to connect to EzyPred Server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!

Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Original result from the STRING server

Your Input:
	gw1.IV.3206.1	hypothetical protein (423 aa)
		(Populus trichocarpa)
Predicted Functional Partners:
		Sorry, there are no predicted associations at the current settings.

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

Original result of RPS-BLAST against CDD database part I

ID	Alignment Graph	Length	Definition	E-value
Query		439
pfam06159		235	pfam06159, DUF974, Protein of unknown function (DU	1e-101

>gnl|CDD|218917 pfam06159, DUF974, Protein of unknown function (DUF974)	Back alignment and domain information

 Score =  302 bits (775), Expect = e-101
 Identities = 115/236 (48%), Positives = 151/236 (63%), Gaps = 7/236 (2%)

Query: 88  LVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIR 147
           L LPQ+FG+IYLGETF SY+ +NN S+ EVRDV IKAE+QT  QR+ L D+  +PVE++R
Sbjct: 1   LTLPQSFGSIYLGETFSSYLCVNNESSKEVRDVSIKAELQTPSQRLNLSDSVDAPVETLR 60

Query: 148 AGGRYDFIVEHDVKELGAHTLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVVKE 207
            G   DF+V  DVKE G H LVCT  Y++  GE +Y  +FFKFIV NPLSVRTK   +++
Sbjct: 61  PGESLDFVVSFDVKEEGTHILVCTVSYTEASGETRYFRKFFKFIVKNPLSVRTKFYQLED 120

Query: 208 I----TFLEACIENHTKSNLYMDQVEFEPSQNWSATMLKADGPHSDYNAQSREIFKPPVL 263
           +     +LEA IEN T+ NL++++V  EPS  + AT L  +    D +     + K PVL
Sbjct: 121 LSRRRVYLEAQIENITEDNLFLEKVTLEPSPGYKATSLNWEPSLGDVDGLDGGMDKRPVL 180

Query: 264 IRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQITWRTNLGEPGRLQTQQIL 319
               G I  YL+ LK    G+   +K+ G   LGKL I WRT +GE GRLQT Q+ 
Sbjct: 181 --KPGDIRQYLFCLKP-KEGALEELKLDGRTNLGKLDIVWRTAMGEKGRLQTSQLQ 233

Family of uncharacterized eukaryotic proteins. Length = 235

Conserved Domains Detected by HHsearch

Original result of HHsearch against CDD database

ID	Alignment Graph	Length	Definition	Probability
Query		439
KOG2625		348	consensus Uncharacterized conserved protein [Funct	100.0
PF06159		249	DUF974: Protein of unknown function (DUF974); Inte	100.0
PF07919		554	Gryzun: Gryzun, putative trafficking through Golgi	99.9
KOG4386		809	consensus Uncharacterized conserved protein [Funct	99.76
PF12735		306	Trs65: TRAPP trafficking subunit Trs65; InterPro:	99.07
PF08626		1185	TRAPPC9-Trs120: Transport protein Trs120 or TRAPPC	98.23
PF12742		57	Gryzun-like: Gryzun, putative Golgi trafficking	97.74
PF12584		147	TRAPPC10: Trafficking protein particle complex sub	97.56
PF07705		101	CARDB: CARDB; InterPro: IPR011635 The APHP (acidic	96.69
PF00927		107	Transglut_C: Transglutaminase family, C-terminal i	96.48
PF07919		554	Gryzun: Gryzun, putative trafficking through Golgi	95.58
PF10633		78	NPCBM_assoc: NPCBM-associated, NEW3 domain of alph	95.24
PF14874		102	PapD-like: Flagellar-associated PapD-like	94.86
PF07705		101	CARDB: CARDB; InterPro: IPR011635 The APHP (acidic	93.38
PF05753		181	TRAP_beta: Translocon-associated protein beta (TRA	92.46
PF10633		78	NPCBM_assoc: NPCBM-associated, NEW3 domain of alph	91.24
PF05753		181	TRAP_beta: Translocon-associated protein beta (TRA	89.34
PF11797		140	DUF3324: Protein of unknown function C-terminal (D	87.44
smart00809		104	Alpha_adaptinC2 Adaptin C-terminal domain. Adaptin	86.25
PF14874		102	PapD-like: Flagellar-associated PapD-like	84.86
PF02883		115	Alpha_adaptinC2: Adaptin C-terminal domain; InterP	84.25
PF00207		92	A2M: Alpha-2-macroglobulin family; InterPro: IPR00	82.84
PF00927		107	Transglut_C: Transglutaminase family, C-terminal i	81.43

>KOG2625 consensus Uncharacterized conserved protein [Function unknown]	Back alignment and domain information

Probab=100.00  E-value=1.4e-98  Score=692.83  Aligned_cols=341  Identities=31%  Similarity=0.542  Sum_probs=318.1

Q ss_pred             cccccccccceeecceEEEEEEEEcCCCcceEEEEEEEEEeCCCeeEeccCCCCCCccccCCCCeeeEEEEEEccccCce
Q 013597           87 LLVLPQAFGAIYLGETFCSYISINNSSTLEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAH  166 (439)
Q Consensus        87 ~L~LP~sfG~iylGEtFs~~i~v~N~s~~~v~~V~ikaelqT~s~r~~L~~~~~~~~~~L~pg~~ld~iv~~~lke~G~h  166 (439)
                      +|.+||.||+|||||||++||+|||+|++.|++|.+|+||||.+||+.|... .....+++|.++.+.+|+||+||+|+|
T Consensus         1 ~l~~pq~f~niflgetfs~yinv~nds~k~v~~i~lk~dlqtssqrl~l~~s-~~~~aei~~~~c~~~vi~hevkeig~h   79 (348)
T KOG2625|consen    1 MLIAPQMFENIFLGETFSFYINVHNDSEKTVKDILLKADLQTSSQRLNLPAS-NAAAAEIEPDCCEDDVIHHEVKEIGQH   79 (348)
T ss_pred             CccchhhhcceeeccceEEEEEEecchhhhhhhheeeecccccceeeccccc-hhhhhhcCccccchhhhhHHHHhhccE
Confidence            4789999999999999999999999999999999999999999999999653 345778999999999999999999999


Q ss_pred             EEEEEEEEEcCCCceeeeceEEEEEeecCeEEEEEEEEe-------CCceEEEEEEEecccccEEEEeEEeeecCCceee
Q 013597          167 TLVCTALYSDGEGERKYLPQFFKFIVSNPLSVRTKVRVV-------KEITFLEACIENHTKSNLYMDQVEFEPSQNWSAT  239 (439)
Q Consensus       167 ~L~c~VsY~~~~Ge~~~frK~fkF~v~~Pl~VktK~~~~-------~~~~~LEaqiqN~s~~~l~le~v~Lep~~~~~~~  239 (439)
                      +|+|+|+|++++||.++|||||||+|.+|+|||||||++       .+++||||||||+|..+|+||+|+|+|+.+|.++
T Consensus        80 ilicavny~tq~ge~myfrkffkf~v~kpidvktkfynaesdlssv~~dvfleaqien~s~a~mflekv~ldps~~ynvt  159 (348)
T KOG2625|consen   80 ILICAVNYKTQAGEKMYFRKFFKFPVLKPIDVKTKFYNAESDLSSVNDDVFLEAQIENMSNANMFLEKVELDPSIHYNVT  159 (348)
T ss_pred             EEEEEEeeeccCccchhHHhhccccccccccccceeecccccccccchhhhhhhhhhcccccchhhhhhccCchheecce
Confidence            999999999999999999999999999999999999976       5889999999999999999999999999999999


Q ss_pred             eecCCCCCCCCCCccccccCCceEEeCCCCeeeEEEEEeecCCCCCCCccccCceeeEEEEEEEEcCCCCCceeeEEEEe
Q 013597          240 MLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLKMLSHGSSSPVKVQGSNVLGKLQITWRTNLGEPGRLQTQQIL  319 (439)
Q Consensus       240 ~ln~~~~~~~~~~~~~~~~~~~~l~~~~gd~~q~lf~l~~~~~~~~~~~~~~g~~~lGkL~I~WRs~~Ge~G~LqTs~l~  319 (439)
                      +++.+.+.++..++    |... .+++|.|+|||||||+|+.+..++....++.+.+|||||.||++|||+|||||++||
T Consensus       160 ~i~~~~e~gdcvst----fg~~-~~lkp~d~rq~l~cl~pk~d~~~~~gi~k~lt~igkldi~wktnlgekgrlqts~lq  234 (348)
T KOG2625|consen  160 EIAHEDEAGDCVST----FGSG-ALLKPKDIRQFLFCLKPKADFAEKAGIIKDLTSIGKLDISWKTNLGEKGRLQTSALQ  234 (348)
T ss_pred             eecchhhccccccc----cccc-cccCccchhhheeecCchHHHHHhhccccccceeeeeEEEeeccccccccchHHHHH
Confidence            99987777665443    3332 247789999999999999887655555788999999999999999999999999999


Q ss_pred             eecCccCCeEEEEEecCceeEeCCcEEEEEEEEeCCCCCcccEEEEEEeCCCCCcceEEEecccceeecccCCCCeeEEE
Q 013597          320 GTTITSKEIELNVVEVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFH  399 (439)
Q Consensus       320 ~~~p~~~dl~l~v~~~P~~v~l~~pF~v~~~v~N~s~r~~~~l~l~l~~~~~~~~~~~~~~G~s~~~Lg~L~P~~s~~~~  399 (439)
                      |++|+++|++|+++.+|+.|.+++||.++|+|+|||+|.|| |++.+++..   ..-++|||+++++||+|.|.+...|.
T Consensus       235 riapgygdvrlsle~~p~~vdleepf~iscki~ncserald-l~l~l~~~n---nrhi~~c~~sg~qlgkl~ps~~l~~a  310 (348)
T KOG2625|consen  235 RIAPGYGDVRLSLEAIPACVDLEEPFEISCKITNCSERALD-LQLELCNPN---NRHIHFCGISGRQLGKLHPSQHLCFA  310 (348)
T ss_pred             hhcCCCCceEEEeeccccccccCCCeEEEEEEcccchhhhh-hhhhhcCCC---CceeEEeccccccccCCCCcceeeeE
Confidence            99999999999999999999999999999999999999999 999998864   45799999999999999999999999


Q ss_pred             EEEEecccceEEeCceEEEecCCCeEeccCCCeeeEee
Q 013597          400 LNLIATKLGVQRITGITVFDKLEKITYDSLPDLEIFVD  437 (439)
Q Consensus       400 L~l~pl~~Glq~isgI~l~D~~~~r~y~~~~~~~vfV~  437 (439)
                      |+++|...|+|+|+||+|+|+++||+|||||++||||.
T Consensus       311 l~l~~~~~giqsisgiritdtf~kr~ye~ddiaqi~v~  348 (348)
T KOG2625|consen  311 LNLFPSTQGIQSISGIRITDTFLKRIYEHDDIAQICVS  348 (348)
T ss_pred             EeeccchhcceeecceEeehhhhhhhhcccchHHhhcC
Confidence            99999999999999999999999999999999999984

>PF06159 DUF974: Protein of unknown function (DUF974); InterPro: IPR010378 This is a family of uncharacterised eukaryotic proteins	Back alignment and domain information

>PF07919 Gryzun: Gryzun, putative trafficking through Golgi; InterPro: IPR012880 The proteins featured in this family are all hypothetical eukaryotic proteins of unknown function	Back alignment and domain information

>KOG4386 consensus Uncharacterized conserved protein [Function unknown]	Back alignment and domain information

>PF12735 Trs65: TRAPP trafficking subunit Trs65; InterPro: IPR024662 This family is one of the subunits of the TRAPP Golgi trafficking complex	Back alignment and domain information

>PF08626 TRAPPC9-Trs120: Transport protein Trs120 or TRAPPC9, TRAPP II complex subunit; InterPro: IPR013935 The trafficking protein particle complex TRAPP is a multi-protein complex needed in the early stages of the secretory pathway	Back alignment and domain information

>PF12742 Gryzun-like: Gryzun, putative Golgi trafficking	Back alignment and domain information

>PF12584 TRAPPC10: Trafficking protein particle complex subunit 10, TRAPPC10; InterPro: IPR022233 The trafficking protein particle complex TRAPP is a multi-protein complex needed in the early stages of the secretory pathway	Back alignment and domain information

>PF07705 CARDB: CARDB; InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins	Back alignment and domain information

>PF00927 Transglut_C: Transglutaminase family, C-terminal ig like domain; InterPro: IPR008958 Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase Transglutaminases catalyse the post-translational modification of proteins at glutamine residues, with formation of isopeptide bonds	Back alignment and domain information

>PF07919 Gryzun: Gryzun, putative trafficking through Golgi; InterPro: IPR012880 The proteins featured in this family are all hypothetical eukaryotic proteins of unknown function	Back alignment and domain information

>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known	Back alignment and domain information

>PF14874 PapD-like: Flagellar-associated PapD-like	Back alignment and domain information

>PF07705 CARDB: CARDB; InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins	Back alignment and domain information

>PF05753 TRAP_beta: Translocon-associated protein beta (TRAPB); InterPro: IPR008856 This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins	Back alignment and domain information

>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known	Back alignment and domain information

>PF05753 TRAP_beta: Translocon-associated protein beta (TRAPB); InterPro: IPR008856 This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins	Back alignment and domain information

>PF11797 DUF3324: Protein of unknown function C-terminal (DUF3324); InterPro: IPR021759 This family consists of several hypothetical bacterial proteins of unknown function	Back alignment and domain information

>smart00809 Alpha_adaptinC2 Adaptin C-terminal domain	Back alignment and domain information

>PF14874 PapD-like: Flagellar-associated PapD-like	Back alignment and domain information

>PF02883 Alpha_adaptinC2: Adaptin C-terminal domain; InterPro: IPR008152 Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment	Back alignment and domain information

>PF00207 A2M: Alpha-2-macroglobulin family; InterPro: IPR001599 This entry contains serum complement C3 and C4 precursors and alpha-macroglobulins	Back alignment and domain information

>PF00927 Transglut_C: Transglutaminase family, C-terminal ig like domain; InterPro: IPR008958 Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase Transglutaminases catalyse the post-translational modification of proteins at glutamine residues, with formation of isopeptide bonds	Back alignment and domain information
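
The domain-hit headers above follow a regular shape (accession, short name, description, optional InterPro ID), so they can be split into fields mechanically. The following is a minimal sketch, assuming that shape holds for every header; the function name parse_domain_header and the regular expressions are illustrative and not part of the report.

```python
import re

# Assumed header shape: ">ACCESSION NAME: description; InterPro: IPRxxxxxx ..."
# The SMART entry has no colon after the name, so the colon is optional.
HEADER = re.compile(r"^>(?P<acc>\S+)\s+(?P<name>[^\s:]+):?\s+(?P<desc>.*)$")
INTERPRO = re.compile(r"InterPro:\s*(IPR\d{6})")

def parse_domain_header(line):
    """Split one domain-hit header into accession, short name and InterPro ID (or None)."""
    # Drop the tab-separated link text that trails each header on this page.
    text = line.split("\t")[0]
    m = HEADER.match(text)
    if m is None:
        raise ValueError(f"not a domain header: {line!r}")
    ipr = INTERPRO.search(m.group("desc"))
    return {
        "accession": m.group("acc"),
        "name": m.group("name"),
        "interpro": ipr.group(1) if ipr else None,
    }

examples = [
    ">PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known\tBack alignment and domain information",
    ">PF12742 Gryzun-like: Gryzun, putative Golgi trafficking\tBack alignment and domain information",
    ">smart00809 Alpha_adaptinC2 Adaptin C-terminal domain\tBack alignment and domain information",
]
parsed = [parse_domain_header(e) for e in examples]
```

Headers without an InterPro cross-reference (PF12742, PF14874, smart00809) yield None for that field rather than raising an error.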

Homologous Structure Templates

Structure Templates Detected by BLAST

Original result of BLAST against Protein Data Bank

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

Original result of RPS-BLAST against PDB70 database

ID	Alignment Graph	Length	Definition	E-value
Query		439
1vt4_I		1221	APAF-1 related killer DARK; drosophila apoptosome,	2e-07
1vt4_I		1221	APAF-1 related killer DARK; drosophila apoptosome,	8e-04

>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221	Back alignment and structure

 Score = 52.5 bits (125), Expect = 2e-07
 Identities = 71/408 (17%), Positives = 113/408 (27%), Gaps = 142/408 (34%)

Query: 115 LEVRDVVIKAEIQTDKQRILLLDTSKSPVESIRAGGRYDFIVEHDVKELGAHTLVCTALY 174
            + +DV        D  + +L  + +            D I+       G   L  T L 
Sbjct: 33  FDCKDV-------QDMPKSIL--SKEE----------IDHIIMSKDAVSGTLRLFWTLLS 73

Query: 175 SDGEGERKY----LPQFFKFIVSNPLSVRTKVRVVKEITFLEACIENHTKSNLYMDQVEF 230
              E  +K+    L   +KF++S P+    +   +    ++E       +  LY D   F
Sbjct: 74  KQEEMVQKFVEEVLRINYKFLMS-PIKTEQRQPSMMTRMYIE------QRDRLYNDNQVF 126

Query: 231 EPSQNWSATMLKADGPHSDYNAQSREIFKPPVLIRSGGGIHNYLYQLK---------MLS 281
                              YN  SR     P L      +   L +L+         +L 
Sbjct: 127 AK-----------------YNV-SRL---QPYL-----KLRQALLELRPAKNVLIDGVLG 160

Query: 282 HGSSSPVK--VQGSNVLGKL--QITWRTNLGEPGR----LQTQQILGTTIT--------- 324
            G +           V  K+  +I W  NL         L+  Q L   I          
Sbjct: 161 SGKTWVALDVCLSYKVQCKMDFKIFW-LNLKNCNSPETVLEMLQKLLYQIDPNWTSRSDH 219

Query: 325 SKEIELNVVEV-------------------------PSVVGIDKPFLLKLK--LT----N 353
           S  I+L +  +                                  F L  K  LT     
Sbjct: 220 SSNIKLRIHSIQAELRRLLKSKPYENCLLVLLNVQNAKAW---NAFNLSCKILLTTRFKQ 276

Query: 354 QTDKEQGPFEIWLSQNDS------DEEKVVMIN--GLRIMALAPVEAFGSTDFHLNLIAT 405
            TD         +S +        DE K +++     R   L P E   +    L++IA 
Sbjct: 277 VTDFLSAATTTHISLDHHSMTLTPDEVKSLLLKYLDCRPQDL-PREVLTTNPRRLSIIAE 335

Query: 406 KL--GVQRITGI--TVFDKLEKI---TYDSLP---------DLEIFVD 437
            +  G+           DKL  I   + + L           L +F  
Sbjct: 336 SIRDGLATWDNWKHVNCDKLTTIIESSLNVLEPAEYRKMFDRLSVFPP 383
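
The Identities, Positives and Gaps figures reported for this alignment can be re-derived from their raw counts. The following is a minimal sketch, assuming the statistics line keeps the "name = count/length (percent%)" layout shown above; parse_blast_stats is a hypothetical helper name.

```python
import re

STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")

def parse_blast_stats(line):
    """Return {name: (count, alignment_length, printed_percent)} for one statistics line."""
    return {name: (int(n), int(total), int(pct))
            for name, n, total, pct in STAT.findall(line)}

line = " Identities = 71/408 (17%), Positives = 113/408 (27%), Gaps = 142/408 (34%)"
stats = parse_blast_stats(line)

# Integer division reproduces the printed percentages for this line.
# Whether the program always rounds down is an assumption, not something the report states.
recomputed = {name: 100 * n // total for name, (n, total, _) in stats.items()}

# Columns of the 408-column alignment that contain no gap character.
gap_free_columns = stats["Gaps"][1] - stats["Gaps"][0]
```

For this hit, 266 of the 408 alignment columns are gap-free, which is worth keeping in mind when weighing the 17% identity.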

>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221	Back alignment and structure

Structure Templates Detected by HHsearch

Original result of HHsearch against PDB70 database

ID	Alignment Graph	Length	Definition	Probability
Query		439
2xzz_A		102	Protein-glutamine gamma-glutamyltransferase K; 2.3	96.78
1ex0_A		731	Coagulation factor XIII A chain; transglutaminase,	96.25
1vjj_A		692	Protein-glutamine glutamyltransferase E; transglut	96.22
3hrz_B		252	Cobra venom factor; serine protease, glycosilated,	95.82
3idu_A		127	Uncharacterized protein; all beta-protein, structu	94.01
1g0d_A		695	Protein-glutamine gamma-glutamyltransferase; tissu	93.34
2q3z_A		687	Transglutaminase 2; transglutaminase 2, tissue tra	92.69
3es6_B		118	Prolactin-inducible protein; major histocompatibil	90.02
2qsv_A		220	Uncharacterized protein; MCSG, structural genomics	89.54
4fxk_B		767	Complement C4-A alpha chain; immune system, proteo	89.41
2ys4_A		122	Hydrocephalus-inducing protein homolog; hydin, PAP	88.97
4acq_A		1451	Alpha-2-macroglobulin; hydrolase inhibitor, protei	88.97
2hr0_B		915	Complement C3 alpha' chain; complement component C	88.61
3prx_B		1642	Cobra venom factor; immune system, complement, imm	88.34
2b39_A		1661	C3; thioester, immune defense, immune system; HET:	86.62
2pn5_A		1325	TEP1R, thioester-containing protein I; FULL-length	86.09
2xzz_A		102	Protein-glutamine gamma-glutamyltransferase K; 2.3	85.01
2ys4_A		122	Hydrocephalus-inducing protein homolog; hydin, PAP	81.69
1vjj_A		692	Protein-glutamine glutamyltransferase E; transglut	80.61
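
Several templates appear more than once in the table above because HHsearch reports separate hits to different regions of the query. The following is a minimal sketch for collapsing the table to the best probability per template, assuming the rows are tab-separated with an empty Alignment Graph column; parse_row and best_probability are hypothetical helper names.

```python
def parse_row(line):
    """Split one tab-separated hit row into (template_id, length, probability)."""
    fields = line.split("\t")
    # fields[1] is the empty "Alignment Graph" column in the text dump.
    return fields[0], int(fields[2]), float(fields[-1])

def best_probability(rows):
    """Keep the highest probability seen for each template ID."""
    best = {}
    for template_id, length, prob in rows:
        if template_id not in best or prob > best[template_id][1]:
            best[template_id] = (length, prob)
    return best

lines = [
    "2xzz_A\t\t102\tProtein-glutamine gamma-glutamyltransferase K; 2.3\t96.78",
    "1vjj_A\t\t692\tProtein-glutamine glutamyltransferase E; transglut\t96.22",
    "2ys4_A\t\t122\tHydrocephalus-inducing protein homolog; hydin, PAP\t88.97",
    "2xzz_A\t\t102\tProtein-glutamine gamma-glutamyltransferase K; 2.3\t85.01",
    "1vjj_A\t\t692\tProtein-glutamine glutamyltransferase E; transglut\t80.61",
]
best = best_probability(parse_row(line) for line in lines)
```

Collapsing by ID discards the information that the lower-scoring hits cover a different query region, so the full table remains the reference.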

>2xzz_A Protein-glutamine gamma-glutamyltransferase K; 2.30A {Homo sapiens}	Back alignment and structure

Probab=96.78  E-value=0.0033  Score=51.98  Aligned_cols=73  Identities=5%  Similarity=0.138  Sum_probs=61.0

Q ss_pred             ecCceeEeCCcEEEEEEEEeCCCCCcccEEEEEEeCCCCCcceEEEecccceeecccCCCCeeEEEEEEEecccceEEeC
Q 013597          334 EVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRIT  413 (439)
Q Consensus       334 ~~P~~v~l~~pF~v~~~v~N~s~r~~~~l~l~l~~~~~~~~~~~~~~G~s~~~Lg~L~P~~s~~~~L~l~pl~~Glq~is  413 (439)
                      +++...++++|+.+++.++|--...+....+.++...      + ..+ ....++.+.||++..+.+.+.|..+|.++|-
T Consensus        11 ~v~g~~~v~~~l~v~vsf~NPL~~~L~~c~~~vEG~G------L-~~~-~~~~~~~v~pg~~~~~~~~~~P~~~G~~~L~   82 (102)
T 2xzz_A           11 TLLGAAVVGQECEVQIVFKNPLPVTLTNVVFRLEGSG------L-QRP-KILNVGDIGGNETVTLRQSFVPVRPGPRQLI   82 (102)
T ss_dssp             EESSCCCSSSCEEEEEEEECCSSSCBCSEEEEEEETT------T-EEE-EEEEECCBCTTCEEEEEEEECCCSCSSCCCE
T ss_pred             EECCCcccCCeEEEEEEEECCCCCcccCEEEEEECCC------C-Ccc-eEEEcCcCCCCCEEEEEEEEecCcccceEEE
Confidence            5667778999999999999997777777889998764      2 344 5567899999999999999999999998874


Q ss_pred             c
Q 013597          414 G  414 (439)
Q Consensus       414 g  414 (439)
                      .
T Consensus        83 a   83 (102)
T 2xzz_A           83 A   83 (102)
T ss_dssp             E
T ss_pred             E
Confidence            3
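
The Aligned_cols value in the hit summary can be checked against the alignment itself by counting the columns in which both the query row and the template row carry a residue. The following is a minimal sketch using the "Q 013597" and "T 2xzz_A" rows from the two blocks above; aligned_columns is a hypothetical helper name, and the coverage figures are simple ratios rather than values reported by HHsearch.

```python
def aligned_columns(query_row, template_row):
    """Count alignment columns where neither row has a '-' gap character."""
    if len(query_row) != len(template_row):
        raise ValueError("alignment rows must have equal length")
    return sum(q != "-" and t != "-" for q, t in zip(query_row, template_row))

# First block (query 334-413, template 11-82) joined with the one-column second block.
query = ("EVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRIT"
         + "G")
template = ("TLLGAAVVGQECEVQIVFKNPLPVTLTNVVFRLEGSG------L-QRP-KILNVGDIGGNETVTLRQSFVPVRPGPRQLI"
            + "A")

cols = aligned_columns(query, template)

# Fraction of each sequence covered by aligned columns (lengths taken from the hit table).
template_coverage = cols / 102
query_coverage = cols / 439
```

The count agrees with Aligned_cols=73: the hit spans most of the 102-residue template but only a short C-terminal stretch of the 439-residue query.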

>1ex0_A Coagulation factor XIII A chain; transglutaminase, blood coagulation, mutant, W279F, oxyanion, transferase; 2.00A {Homo sapiens} SCOP: b.1.18.9 b.1.5.1 b.1.5.1 d.3.1.4 PDB: 1evu_A 1fie_A 1f13_A 1ggt_A 1ggu_A 1ggy_A 1qrk_A	Back alignment and structure

>1vjj_A Protein-glutamine glutamyltransferase E; transglutaminase 3, X-RAY crystallography, metalloenzyme, calcium ION; HET: GDP; 1.90A {Homo sapiens} SCOP: b.1.18.9 b.1.5.1 b.1.5.1 d.3.1.4 PDB: 1sgx_A* 1l9m_A 1l9n_A* 1nud_A 1nuf_A 1nug_A 1rle_A*	Back alignment and structure

>3hrz_B Cobra venom factor; serine protease, glycosilated, multi-domain, complement SYST convertase, complement alternate pathway; HET: NAG P6G; 2.20A {Naja kaouthia} PDB: 3frp_G* 3hs0_B*	Back alignment and structure

>3idu_A Uncharacterized protein; all beta-protein, structural genomics, PSI-2, protein structure initiative; 1.70A {Pyrococcus furiosus} PDB: 2kl6_A	Back alignment and structure

>1g0d_A Protein-glutamine gamma-glutamyltransferase; tissue transglutaminase,acyltransferase; 2.50A {Pagrus major} SCOP: b.1.18.9 b.1.5.1 b.1.5.1 d.3.1.4	Back alignment and structure

>2q3z_A Transglutaminase 2; transglutaminase 2, tissue transglutaminase, TG2, transferas; 2.00A {Homo sapiens} SCOP: b.1.18.9 b.1.5.1 b.1.5.1 d.3.1.4 PDB: 1kv3_A 3ly6_A*	Back alignment and structure

>3es6_B Prolactin-inducible protein; major histocompatibility complex, protein-protein complex, P inducible protein, zinc 2-glycoprotein, ZAG-PIP complex; HET: NDG NAG BMA MAN P6G; 3.23A {Homo sapiens} SCOP: b.1.18.23	Back alignment and structure

>2qsv_A Uncharacterized protein; MCSG, structural genomics, porphyromonas gingivalis W83, PSI protein structure initiative; 2.10A {Porphyromonas gingivalis}	Back alignment and structure

>4fxk_B Complement C4-A alpha chain; immune system, proteolytic cascade; HET: NAG BMA; 3.60A {Homo sapiens} PDB: 4fxg_B*	Back alignment and structure

>2ys4_A Hydrocephalus-inducing protein homolog; hydin, PAPD-like, NPPSFA, national project on protein structural and functional analyses; NMR {Homo sapiens}	Back alignment and structure

>4acq_A Alpha-2-macroglobulin; hydrolase inhibitor, proteinase inhibitor, irreversible PROT inhibitor, conformational change, blood plasma inhibitor; HET: MEQ NAG MAN; 4.30A {Homo sapiens}	Back alignment and structure

>2hr0_B Complement C3 alpha' chain; complement component C3B, immune system; HET: THC; 2.26A {Homo sapiens} PDB: 2icf_B* 2wii_B* 2win_B* 3g6j_B 3l5n_B* 2a73_B* 2i07_B* 2xwj_B* 2xwb_B* 2a74_C* 2ice_C* 2qki_C* 3l3o_F* 3nms_C* 3nsa_C* 3ohx_C* 3t4a_C 2ice_B* 3l3o_B* 3nms_B* ...	Back alignment and structure

>3prx_B Cobra venom factor; immune system, complement, immune SYS complex; HET: NAG; 4.30A {Naja kaouthia} PDB: 3pvm_B*	Back alignment and structure

>2b39_A C3; thioester, immune defense, immune system; HET: NAG BMA; 3.00A {Bos taurus}	Back alignment and structure

>2pn5_A TEP1R, thioester-containing protein I; FULL-length mature peptide, immune system; HET: NAG; 2.70A {Anopheles gambiae}	Back alignment and structure

>2xzz_A Protein-glutamine gamma-glutamyltransferase K; 2.30A {Homo sapiens}	Back alignment and structure

>2ys4_A Hydrocephalus-inducing protein homolog; hydin, PAPD-like, NPPSFA, national project on protein structural and functional analyses; NMR {Homo sapiens}	Back alignment and structure

>1vjj_A Protein-glutamine glutamyltransferase E; transglutaminase 3, X-RAY crystallography, metalloenzyme, calcium ION; HET: GDP; 1.90A {Homo sapiens} SCOP: b.1.18.9 b.1.5.1 b.1.5.1 d.3.1.4 PDB: 1sgx_A* 1l9m_A 1l9n_A* 1nud_A 1nuf_A 1nug_A 1rle_A*	Back alignment and structure

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

Original result of RPS-BLAST against SCOP70(version1.75) database

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

Original result of HHsearch against SCOP70(version1.75) database

ID	Alignment Graph	Length	Definition	Probability
Query		439
d1vjja3		99	Transglutaminase, two C-terminal domains {Human (H	97.02
d1ex0a3		100	Transglutaminase, two C-terminal domains {Human (H	96.84
d1g0da3		101	Transglutaminase, two C-terminal domains {Red sea	96.82
d2q3za3		98	Transglutaminase, two C-terminal domains {Human (H	96.56
d1ex0a2		112	Transglutaminase, two C-terminal domains {Human (H	96.28
d1vjja2		115	Transglutaminase, two C-terminal domains {Human (H	96.05
d1g0da2		112	Transglutaminase, two C-terminal domains {Red sea	95.55
d2q3za2		114	Transglutaminase, two C-terminal domains {Human (H	95.32
d2q3za3		98	Transglutaminase, two C-terminal domains {Human (H	94.62
d1vjja3		99	Transglutaminase, two C-terminal domains {Human (H	94.36
d3es6b1		118	Prolactin-inducible protein, PIP {Human (Homo sapi	93.9
d1g0da3		101	Transglutaminase, two C-terminal domains {Red sea	91.94
d1ex0a3		100	Transglutaminase, two C-terminal domains {Human (H	90.58

>d1vjja3 b.1.5.1 (A:594-692) Transglutaminase, two C-terminal domains {Human (Homo sapiens), TGase E3 [TaxId: 9606]}	Back information, alignment and structure

class: All beta proteins
fold: Immunoglobulin-like beta-sandwich
superfamily: Transglutaminase, two C-terminal domains
family: Transglutaminase, two C-terminal domains
domain: Transglutaminase, two C-terminal domains
species: Human (Homo sapiens), TGase E3 [TaxId: 9606]

Probab=97.02  E-value=0.00073  Score=53.66  Aligned_cols=74  Identities=11%  Similarity=0.252  Sum_probs=60.8

Q ss_pred             ecCceeEeCCcEEEEEEEEeCCCCCcccEEEEEEeCCCCCcceEEEecccceeecccCCCCeeEEEEEEEecccceEEeC
Q 013597          334 EVPSVVGIDKPFLLKLKLTNQTDKEQGPFEIWLSQNDSDEEKVVMINGLRIMALAPVEAFGSTDFHLNLIATKLGVQRIT  413 (439)
Q Consensus       334 ~~P~~v~l~~pF~v~~~v~N~s~r~~~~l~l~l~~~~~~~~~~~~~~G~s~~~Lg~L~P~~s~~~~L~l~pl~~Glq~is  413 (439)
                      ++|...++++++.++++++|--+..+.+-.+.++...       ++.+.....++.+.||++.++.+.+.|..+|.++|-
T Consensus         6 ~v~~~~~v~~~~~v~vsf~NPL~~~L~~c~f~vEG~G-------L~~~~~~~~~~~v~p~~~~~~~~~~~P~~~G~~~l~   78 (99)
T d1vjja3           6 EVLNEARVRKPVNVQMLFSNPLDEPVRDCVLMVEGSG-------LLLGNLKIDVPTLGPKERSRVRFDILPSRSGTKQLL   78 (99)
T ss_dssp             EECSCCBTTSCEEEEEEEECCSSSCBCSEEEEEECTT-------TSSSCEEEEECCBCTTCEEEEEEEECCCSCEEEEEE
T ss_pred             EeCCCcCcCCeEEEEEEEECCCCCchhCEEEEEEeCC-------CCCccEEEecCccCCCCEEEEEEEEEcCCcccEEEE
Confidence            5778889999999999999997887776888887753       122333456888999999999999999999999974


Q ss_pred             c
Q 013597          414 G  414 (439)
Q Consensus       414 g  414 (439)
                      .
T Consensus        79 a   79 (99)
T d1vjja3          79 A   79 (99)
T ss_dssp             E
T ss_pred             E
Confidence            3

>d1ex0a3 b.1.5.1 (A:628-727) Transglutaminase, two C-terminal domains {Human (Homo sapiens), blood isozyme [TaxId: 9606]}	Back information, alignment and structure

>d1g0da3 b.1.5.1 (A:584-684) Transglutaminase, two C-terminal domains {Red sea bream (Chrysophrys major) [TaxId: 143350]}	Back information, alignment and structure

>d2q3za3 b.1.5.1 (A:586-683) Transglutaminase, two C-terminal domains {Human (Homo sapiens), tissue isozyme [TaxId: 9606]}	Back information, alignment and structure

>d1ex0a2 b.1.5.1 (A:516-627) Transglutaminase, two C-terminal domains {Human (Homo sapiens), blood isozyme [TaxId: 9606]}	Back information, alignment and structure

>d1vjja2 b.1.5.1 (A:479-593) Transglutaminase, two C-terminal domains {Human (Homo sapiens), TGase E3 [TaxId: 9606]}	Back information, alignment and structure

>d1g0da2 b.1.5.1 (A:472-583) Transglutaminase, two C-terminal domains {Red sea bream (Chrysophrys major) [TaxId: 143350]}	Back information, alignment and structure

>d2q3za2 b.1.5.1 (A:472-585) Transglutaminase, two C-terminal domains {Human (Homo sapiens), tissue isozyme [TaxId: 9606]}	Back information, alignment and structure

>d2q3za3 b.1.5.1 (A:586-683) Transglutaminase, two C-terminal domains {Human (Homo sapiens), tissue isozyme [TaxId: 9606]}	Back information, alignment and structure

>d1vjja3 b.1.5.1 (A:594-692) Transglutaminase, two C-terminal domains {Human (Homo sapiens), TGase E3 [TaxId: 9606]}	Back information, alignment and structure

>d3es6b1 b.1.18.23 (B:1-118) Prolactin-inducible protein, PIP {Human (Homo sapiens) [TaxId: 9606]}	Back information, alignment and structure

>d1g0da3 b.1.5.1 (A:584-684) Transglutaminase, two C-terminal domains {Red sea bream (Chrysophrys major) [TaxId: 143350]}	Back information, alignment and structure

>d1ex0a3 b.1.5.1 (A:628-727) Transglutaminase, two C-terminal domains {Human (Homo sapiens), blood isozyme [TaxId: 9606]}	Back information, alignment and structure
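
Each SCOP domain header above carries the domain identifier, the class.fold.superfamily.family code and the chain residue range, and the range length matches the Length column of the hit table. The following is a minimal sketch of that consistency check, assuming a single contiguous range with no insertion codes; parse_scop_header is a hypothetical helper name.

```python
import re

# Assumed header shape: ">sid c.f.sf.fa (Chain:start-end) description"
SCOP_HEADER = re.compile(
    r"^>(?P<sid>d\w+)\s+(?P<sccs>[a-z]\.\d+\.\d+\.\d+)\s+"
    r"\((?P<chain>\w):(?P<start>\d+)-(?P<end>\d+)\)"
)

def parse_scop_header(line):
    """Extract domain ID, SCOP code, chain and residue-range length from one header."""
    m = SCOP_HEADER.match(line)
    if m is None:
        raise ValueError(f"not a SCOP domain header: {line!r}")
    start, end = int(m.group("start")), int(m.group("end"))
    return {
        "sid": m.group("sid"),
        "sccs": m.group("sccs"),
        "chain": m.group("chain"),
        # Inclusive range; only valid when residue numbering is contiguous.
        "length": end - start + 1,
    }

headers = [
    ">d1vjja3 b.1.5.1 (A:594-692) Transglutaminase, two C-terminal domains {Human (Homo sapiens), TGase E3 [TaxId: 9606]}",
    ">d1ex0a2 b.1.5.1 (A:516-627) Transglutaminase, two C-terminal domains {Human (Homo sapiens), blood isozyme [TaxId: 9606]}",
    ">d3es6b1 b.1.18.23 (B:1-118) Prolactin-inducible protein, PIP {Human (Homo sapiens) [TaxId: 9606]}",
]
domains = [parse_scop_header(h) for h in headers]
```

The computed lengths (99, 112 and 118) agree with the Length column for d1vjja3, d1ex0a2 and d3es6b1 in the SCOP70 hit table.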