Query         FBpp0308935 type=protein; loc=3L:join(7019749..7019898,7034972..7035038,7035170..7035315,7035744..7035881,7038005..7038151,7041018..7041131,7043466..7043720,7044926..7045003,7045109..7045355,7045534..7045611,7046334..7046488,7046586..7046678,7048842..7048964,7050565..7050695,7051242..7051421,7051486..7051722,7051977..7052052); ID=FBpp0308935; name=Mp-PS; parent=FBgn0260660,FBtr0339903; dbxref=GB_protein:AHN57985,REFSEQ:NP_001286960,FlyBase:FBpp0308935,FlyBase_Annotation_IDs:CG42543-PS; MD5=ed0901aea9a341f7a0056f6887e2dbf6; length=804; release=r6.06; species=Dmel;
Match_columns 804
No_of_seqs    665 out of 2589
Neff          5.9
Searched_HMMs 16187

 No Hit                             Prob  E-value  P-value  Score    SS Cols Query HMM  Template HMM
  1 PF06482 Endostatin: Collagena  100.0    3E-76  1.8E-80  618.6  13.3  254   512-765    1-285 (286)
  2 PF07588 DUF1554: Protein of u   99.8    5E-19  3.1E-23  166.9   8.0  126   605-740    2-133 (135)
  3 PF01410 COLFI: Fibrillar coll   95.8  0.00092  5.7E-08   67.4   0.1   39   449-487     2-40 (233)
  4 PF01484 Col_cuticle_N: Nemato   79.7     0.28  1.7E-05   36.4   0.8   34     14-47     1-34 (50)
  5 PF11267 DUF3067: Domain of un   35.1      8.2   0.0005   32.9   1.7   13   790-802    42-54 (98)
  6 PF10983 DUF2793: Protein of u   21.1       28   0.0018   28.6   2.6   25   529-553    63-87 (87)
  7 PF06609 TRI12: Fungal trichot   19.2       43   0.0027   36.8   4.4   46      8-57  531-576 (599)
  8 PF14880 COX14: Cytochrome oxi   18.4       30   0.0018   26.2   2.0   30     10-39    15-44 (59)
  9 PF07172 GRP: Glycine rich pro   16.8       50   0.0031   27.4   3.3   25      1-28     1-25 (95)
 10 PF11359 gpUL132: Glycoprotein   16.5       17    0.001   34.5   0.0   50      4-53   46-100 (238)

No 1
>PF06482 Endostatin: Collagenase NC10 and Endostatin; InterPro: IPR010515 NC10 stands for Non-helical region 10 and is taken from P39059 from SWISSPROT. A mutation in this region in P39060 from SWISSPROT is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumour suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor [].
Endostatin also binds a zinc ion near the N terminus; this is likely to be of structural rather than functional importance according to [].; GO: 0005198 structural molecule activity, 0007155 cell adhesion, 0031012 extracellular matrix; PDB: 1DY2_A 1DY1_A 1DY0_A 1KOE_A 3N3F_B 1BNL_D 3HSH_E 3HON_A.
Probab=100.00 E-value=3e-76 Score=618.57 Aligned_cols=254 Identities=48% Similarity=0.870 Sum_probs=180.5

Q ss_pred             CceeeeccHHHHhhcccCCCCCceEEEecceeEEEEEcCCchhcccCcccccCCCCCCCcc----CCCCCc-cccc----
Q FBpp0308935     512 PGAVTFQNIDEMTKKSALNPPGTLAYITEEEALLVRVNKGWQYIALGTLVPIATPAPPTTV----APSMRF-DLQS----  582 (804)
Q Consensus       512 ~G~~~~~t~~~m~~~~~~s~~Gtlayv~d~~~l~vrv~~Gw~~i~lg~~~~~~~~~~~~~~----~~~~~~-~~~~----  582 (804)
                      +|+++|.|+++|++.++..+||||+||+|++||||||++|||+||||+++|+....++.++ .++..+ ....
T Consensus         1 sGV~vf~T~~~Ml~~a~~~pEGTLayV~e~~eLYVRVrnGWRkV~LG~~ip~~~~~~~~~va~~~p~P~v~~~~~~~~~~   80 (286)
T PF06482_consen    1 SGVTVFRTYETMLATAHRVPEGTLAYVIEREELYVRVRNGWRKVQLGELIPIPSDTPDNEVASTQPPPVVSSPPQSSPPS   80 (286)
T ss_dssp             --EEEESSHHHHHCHGGGS-TTEEEEETTTTEEEEEETTEEEEE-EEEEEE-----------------------------
T ss_pred             CCcEEecCHHHHHhhcccCCCeEEEEEEecceEEEEecCCeeeeccCCcccCCCCcccccccccCCCCccccCccccccc
Confidence            4899999999999999999999999999999999999999999999999998775532111 111110 0000

Q ss_pred             --cCC---C----------------CCCCCCCCCCceEEEEccCCCCCCCCccchhhHHHHHHHhhcCCCCceeEEEecc
Q FBpp0308935     583 --KNL---L----------------NSPPPLLNTPTLRVAALNEPSTGDLQGIRGADFACYRQGRRAGLLGTFKAFLSSR  641 (804)
Q Consensus       583 --~~~---~----------------~~~~~~~~~~~l~l~aln~~~~G~lgGi~GAD~~C~~~A~~~g~~gt~rA~Ls~~  641 (804)
                      . +..........|||||||+|++|||+||+|||++||+|||++|+.||||||||++
T Consensus        81 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~liAlN~P~~G~m~Gi~gAD~~C~~qAr~~gl~gtfRAfLSs~  160 (286)
T PF06482_consen   81 SHPRPPSTAPDPHYPPQPRRPPPPSPSAHTHHDDGPLHLIALNEPLSGNMRGIRGADFQCFRQARAAGLTGTFRAFLSSR  160 (286)
T ss_dssp             -----------------------------S--TTS-EEEEE-SS-B-SBSSHHHHHHHHHHHHHHHTT--S-EEESS-BT
T ss_pred             ccCCcccCCCCccCCCCccccCCCCCccccccCCCceEEEEcCCCCCCCccccccccHHHHHHHHHcCCCCceEEeeecc
Confidence            0 0 0001122334499999999999999999999999999999999999999999999

Q ss_pred             ccCcccccCCCCC-CCccccCCCcEEecCcchhccCCCCcccCCCceeccCCCccCCCCCCCCceEEEecCCCCccccCc
Q FBpp0308935     642 VQNLDTIVRPADR-DLPVVNTRGDVLFNSWKGIFNGQGGFFSQAPRIYSFSGKNVMTDSTWPMKMVWHGSLPNGERSMDT  720 (804)
Q Consensus       642 ~~~~~~~V~~~dr-~~P~vn~~g~vl~~~~~~l~~~~~~~~~~~~~i~~f~~~~v~~d~~~~~k~vWtGs~~~g~~~~~~  720 (804)
                      +|||++||++.|| ++||||+||||||+||++||+++++.|..+++||||||+|||+|+.||+|.|||||+++|++..++
T Consensus       161 ~qdL~~iV~~~dr~~~PivNlkgevLf~sw~~lf~g~~~~~~~~~~iySFdGr~v~~d~~wP~K~vWhGs~~~G~r~~~~  240 (286)
T PF06482_consen  161 LQDLYSIVRRADRDNVPIVNLKGEVLFNSWESLFSGSGGPFNPNAPIYSFDGRDVLTDPAWPQKMVWHGSDPRGRRLTDS  240 (286)
T ss_dssp             TB-GGGGS-GGGTSS--EE-TTS-EEES-HHHHTSSS-SB--TTS--BBTTS-BTTTSTTSSS-EEE--B-TTS-B-TTS
T ss_pred             cccHhhhccHhhCCCCCeEeCcCCEeecCHHHHhCCCCCCCCCCCcEEeECCccccCCCCcceEEEEeCCCCCCccCCcC
Confidence            9999999999999 899999999999999999999988889999999999999999999999999999999999999999

Q ss_pred             cCCccccCCCCceeecccCCccccccccccccccCceEEEEeccc
Q FBpp0308935     721 YCDAWHSGDHLKGSFASNLDGHKLLEQKRQSCDSKLIILCVEALS  765 (804)
Q Consensus       721 ~C~~W~s~~~~~~G~as~~~~~~l~~~~~~~C~~~~~vlCvE~~~  765 (804)
                      ||++|+|++.+++|+||+|.+++||.|+.++|+++||||||||+.
T Consensus       241 ~C~~Wrs~~~~~~G~As~l~~g~ll~q~~~sC~~~~ivLCiE~~~  285 (286)
T PF06482_consen  241 SYCEAWRSSDPAVTGQASSLQSGKLLDQQPYSCSNSFIVLCIENSF  285 (286)
T ss_dssp             BHHHHB---TTSEEEEEEGGGTBSS--EEEETTS-BB-EEEESS-
T ss_pred             CcccccccCCCCceEeeeecCCCCcccCCcccCCCceEEEEEeccc
Confidence            999999999999999999999999999999999999999999974

No 2
>PF07588 DUF1554: Protein of unknown function (DUF1554); InterPro: IPR011448 This is a domain that occurs in 1-2 copies in a family of proteins identified in Leptospira interrogans and other bacteria. The function of the proteins is not known.
Probab=99.75 E-value=5e-19 Score=166.95 Aligned_cols=126 Identities=20% Similarity=0.323 Sum_probs=97.2

Q ss_pred             CCCCCCCCccchhhHHHHHHHhhcC--CCCceeEEEeccccCccccc-CCCCCCCccccCCCcEEecCcchhccCCCC-c
Q FBpp0308935     605 EPSTGDLQGIRGADFACYRQGRRAG--LLGTFKAFLSSRVQNLDTIV-RPADRDLPVVNTRGDVLFNSWKGIFNGQGG-F  680 (804)
Q Consensus       605 ~~~~G~lgGi~GAD~~C~~~A~~~g--~~gt~rA~Ls~~~~~~~~~V-~~~dr~~P~vn~~g~vl~~~~~~l~~~~~~-~  680 (804)
                      ..|+|||+||+|||++|++.+.+.. ..++|||||++.+...+.+. .++-.. ...||||.+|.+|++. ++. +
T Consensus         2 ~~~~GnlGGi~GADa~C~~d~~~p~~~~~~~yKAml~~~~~~~R~a~~t~n~~~----g~~DWVl~pnt~Y~r~-dgt~i   76 (135)
T PF07588_consen    2 NTYNGNLGGISGADAKCNADANKPSPGGGGTYKAMLVDGSNSTRRACVTANCGD----GQIDWVLKPNTTYYRS-DGTTI   76 (135)
T ss_pred             ccccCcccchhhHhHHHHcCCCCCCCCCCcCeEEEEEcCccccceeecCCCCCC----CcccceecCCceEEec-CCCEE
Confidence            4689999999999999998887654 56799999999765323222 222221 2789999999999987 555 7

Q ss_pred             ccCCCc-eeccCCCccCCCCC-CCCceEEEecCCCCccccCccCCccccCCCCceeecccCC
Q FBpp0308935     681 FSQAPR-IYSFSGKNVMTDST-WPMKMVWHGSLPNGERSMDTYCDAWHSGDHLKGSFASNLD  740 (804)
Q Consensus       681 ~~~~~~-i~~f~~~~v~~d~~-~~~k~vWtGs~~~g~~~~~~~C~~W~s~~~~~~G~as~~~  740 (804)
                      |+++.. |++|+ |++++ -..+.+|||++.+++... .+|++|+++...++|..+..+
T Consensus        77 ~tTn~~glf~f~----l~~~i~~~~~~~WTGl~~~Wt~~~-~~C~~Wt~~s~~~~G~~G~~n  133 (135)
T PF07588_consen   77 FTTNSNGLFDFP----LSNPISGTSGTIWTGLNSDWTTAT-NNCNNWTSGSSGVTGAYGSSN  133 (135)
T ss_pred             EecCCCceEccc----ccceecCCCccEEEeECCCCeeCC-CcccCCcCCCCcccccccccc
Confidence            777776 99997 44443 348999999999988764 899999999988888777654

No 3
>PF01410 COLFI: Fibrillar collagen C-terminal domain; InterPro: IPR000885 Collagens contain a large number of globular domains in between the regions of triple helical repeats IPR008160 from INTERPRO. These domains are involved in binding diverse substrates. One of these domains is found at the C terminus of fibrillar collagens. The exact function of this domain is unknown.; GO: 0005201 extracellular matrix structural constituent, 0005581 collagen trimer; PDB: 4AEJ_A 4AE2_A 4AK3_A.
Probab=95.76 E-value=0.00092 Score=67.36 Aligned_cols=39 Identities=26% Similarity=0.352 Sum_probs=27.7

Q ss_pred             CCCcHHHHHHHHHHHhhhcCCCCCCCCCCCCCCCCCccc
Q FBpp0308935     449 ARSSLDELKALRELQDLRDRPDGTAEPPRQPGHSHKHEE  487 (804)
Q Consensus       449 ~~~v~~sLksL~~iqel~r~P~Gt~~~PartC~d~~~~~  487 (804)
                      +.+++.+|++|+++++.+++|+|||.+|||||+|+....
T Consensus         2 ~~~i~~~l~~l~~~i~~~~~P~Gtk~~PArtC~dl~~~~   40 (233)
T PF01410_consen    2 DEEIFAALDSLKEEIESIKKPDGTKENPARTCRDLKLCH   40 (233)
T ss_dssp             -----HHHHHHHHHHHHHHS--SSSSS-BSSHHHHHHH-
T ss_pred             HHHHHHHHHHHHHHHHhccCCCCCccChHHHHHHHHHhC
Confidence            457899999999999999999999999999999986543

No 4
>PF01484 Col_cuticle_N: Nematode cuticle collagen N-terminal domain; InterPro: IPR002486 The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens (see IPR008160 from INTERPRO).
Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins [, ].; GO: 0042302 structural constituent of cuticle
Probab=79.70 E-value=0.28 Score=36.36 Aligned_cols=34 Identities=18% Similarity=0.063 Sum_probs=23.9

Q ss_pred             HHHHHHHHHHHHHhhhccccccccccchhhhHHH
Q FBpp0308935      14 TALILFFLLGIVLVTGSTKGWFNPNRYNGERVAA   47 (804)
Q Consensus        14 ~~~~~~svvaiv~~~~~i~~~~~~n~~~~e~~~~   47 (804)
                      |+++++|++++++|++++|.+++--....+.+..
T Consensus         1 y~a~~~s~~~i~~~l~~~~~i~~~i~~~~~e~~~   34 (50)
T PF01484_consen    1 YVAIAFSTLSIISCLFTIPMIYNDIQEFQEELED   34 (50)
T ss_pred             ChhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4568889999999999998888744433333333

No 5
>PF11267 DUF3067: Domain of unknown function (DUF3067); InterPro: IPR021420 This family of proteins has no known function. ; PDB: 2LJW_A.
Probab=35.11 E-value=8.2 Score=32.87 Aligned_cols=13 Identities=38% Similarity=0.731 Sum_probs=11.0

Q ss_pred             cChHHHHHHHHhh
Q FBpp0308935     790 KTADEYAAHLENL  802 (804)
Q Consensus       790 ~~~~~~~~~~~~~  802 (804)
                      +||+||.+||+.+
T Consensus        42 ltE~eY~~hL~~i   54 (98)
T PF11267_consen   42 LTEEEYLEHLDAI   54 (98)
T ss_dssp             S-HHHHHHHHHHH
T ss_pred             CCHHHHHHHHHHH
Confidence            6999999999976

No 6
>PF10983 DUF2793: Protein of unknown function (DUF2793); InterPro: IPR021251 This family of proteins currently has no known function.
Probab=21.09 E-value=28 Score=28.63 Aligned_cols=25 Identities=28% Similarity=0.397 Sum_probs=22.2

Q ss_pred             CCCCCceEEEecceeEEEEEcCCch
Q FBpp0308935     529 LNPPGTLAYITEEEALLVRVNKGWQ  553 (804)
Q Consensus       529 ~s~~Gtlayv~d~~~l~vrv~~Gw~  553 (804)
                      ...+|..+|+.|+.+|||.....|+
T Consensus        63 ~P~~Gw~~~v~~~~~~~~~~g~~Wv   87 (87)
T PF10983_consen   63 TPREGWRAWVEDEGALYVFDGTAWV   87 (87)
T ss_pred             CCCCCCEEEEecCCcEEEEcCCeeC
Confidence            4567999999999999999998885

No 7
>PF06609 TRI12: Fungal trichothecene efflux pump (TRI12); InterPro: IPR010573 This family consists of several fungal specific trichothecene efflux pump proteins.
Many of the genes involved in trichothecene toxin biosynthesis in Fusarium sporotrichioides are present within a gene cluster. It has been suggested that TRI12 may play a role in F. sporotrichioides self-protection against trichothecenes [].
Probab=19.20 E-value=43 Score=36.81 Aligned_cols=46 Identities=22% Similarity=0.231 Sum_probs=38.1

Q ss_pred             hHHHHHHHHHHHHHHHHHHhhhccccccccccchhhhHHHHhhhcccccc
Q FBpp0308935       8 RAKLVITALILFFLLGIVLVTGSTKGWFNPNRYNGERVAARIQATDIFDA   57 (804)
Q Consensus         8 ~~~lv~~~~~~~svvaiv~~~~~i~~~~~~n~~~~e~~~~~i~~~~~~~~   57 (804)
                      ..+.|-+.+++|.++++++++++- +..+|..+++++.++..+.-+.
T Consensus       531 afr~V~~~siaFg~va~i~a~f~~----d~~~~mt~~Va~~l~~~~~~~~  576 (599)
T PF06609_consen  531 AFRYVYLASIAFGVVAIIAAFFLK----DIDKYMTNHVAVRLEDRKEADK  576 (599)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHcC----CcHHhhhhhhhHhHhccccccc
Confidence            457788899999999999999875 6788999999999997765443

No 8
>PF14880 COX14: Cytochrome oxidase c assembly; InterPro: IPR029208 COX14 plays an essential role in cytochrome oxidase assembly. The COX14 product is a low-molecular weight membrane protein of mitochondria, but it is not a subunit of cytochrome oxidase []. Orthology-prediction methods have identified the vertebrate C12orf62 orthologues to be orthologues of the yeast COX14 [].
Probab=18.40 E-value=30 Score=26.21 Aligned_cols=30 Identities=23% Similarity=0.229 Sum_probs=23.3

Q ss_pred             HHHHHHHHHHHHHHHHHhhhcccccccccc
Q FBpp0308935      10 KLVITALILFFLLGIVLVTGSTKGWFNPNR   39 (804)
Q Consensus        10 ~lv~~~~~~~svvaiv~~~~~i~~~~~~n~   39 (804)
                      ++++++++++++++.+++...+=.++..|+
T Consensus        15 R~tv~~Lig~T~~~g~~~~~~~~~~~~~~r   44 (59)
T PF14880_consen   15 RGTVLGLIGFTVYGGGLTVYTVYDYMRYNR   44 (59)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            677899999999999999888755555443

No 9
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=16.84 E-value=50 Score=27.40 Aligned_cols=25 Identities=24% Similarity=0.174 Sum_probs=0.0

Q ss_pred             CcceeehhHHHHHHHHHHHHHHHHHHhh
Q FBpp0308935       1 MAMVISTRAKLVITALILFFLLGIVLVT   28 (804)
Q Consensus         1 ~~~~~s~~~~lv~~~~~~~svvaiv~~~   28 (804)
                      || |+.+.|++..+.+|.+|++.+.+
T Consensus         1 Ma---SK~~llL~llLA~~LlisSevaa   25 (95)
T PF07172_consen    1 MA---SKAFLLLGLLLAAVLLISSEVAA   25 (95)
T ss_pred                          Cc---hhHHHHHHHHHHHHHHHHHHHHh

No 10
>PF11359 gpUL132: Glycoprotein UL132; InterPro: IPR021023 Glycoprotein UL132 is a low-abundance structural component of Human herpesvirus 5 []. The function of this protein is not fully understood.
Probab=16.47 E-value=17 Score=34.48 Aligned_cols=50 Identities=12% Similarity=0.073 Sum_probs=0.0

Q ss_pred             eeehhHHHHHHHHHHHHHHHHHHhhhcccccccc-----ccchhhhHHHHhhhcc
Q FBpp0308935       4 VISTRAKLVITALILFFLLGIVLVTGSTKGWFNP-----NRYNGERVAARIQATD   53 (804)
Q Consensus         4 ~~s~~~~lv~~~~~~~svvaiv~~~~~i~~~~~~-----n~~~~e~~~~~i~~~~   53 (804)
                      +|....-+++|.++.++++.++++++.+=|..|+ -+|..++++.+|+.++
T Consensus        46 eimkvLaIlfYcvtg~sifsfl~VlvavlYssC~~~pgr~~fsd~Eaa~Lld~~d  100 (238)
T PF11359_consen   46 eimkvLaIlfYcvtg~sifsfl~VlvavlYssC~~~pgr~~fsd~Eaa~Lld~~d  100 (238)
T ss_pred             hHHHHHhhhheeehhhHHHHHHHHHHHHHHHHHHcCCCCcccchhhhhhcccccc