Query         FBpp0290333 type=protein; loc=3L:join(7038101..7038151,7041018..7041131,7043466..7043720,7044926..7045003,7045109..7045355,7046334..7046488,7046586..7046678,7048842..7048964,7050565..7050695,7051242..7051421,7051486..7051722,7051977..7052052); ID=FBpp0290333; name=Mp-PJ; parent=FBgn0260660,FBtr0301111; dbxref=REFSEQ:NP_001163361,GB_protein:ACZ94632,FlyBase:FBpp0290333,FlyBase_Annotation_IDs:CG42543-PJ,FlyMine:FBpp0290333,modMine:FBpp0290333; MD5=de56d33fac9dbdccf91a29c01f9c2c0e; length=579; release=r6.06; species=Dmel;
Match_columns 579
No_of_seqs    502 out of 2325
Neff          5.9
Searched_HMMs 16187

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF06482 Endostatin: Collagena 100.0 1.5E-75 9.3E-80  590.8  12.1  254  287-540     1-285 (286)
  2 PF07588 DUF1554: Protein of u  99.6 2.1E-16 1.3E-20  143.2   7.3  124  380-515     2-133 (135)
  3 PF01410 COLFI: Fibrillar coll  92.0   0.016 9.9E-07   55.6   0.0   56  392-452   110-170 (233)
  4 PF11267 DUF3067: Domain of un  46.4     4.5 0.00028   33.1   1.8   14  565-578    42-55  (98)
  5 PF07805 HipA_N: HipA-like N-t  31.2      15 0.00093   28.0   2.6   56  371-429    25-81  (81)
  6 PF02676 TYW3: Methyltransfera  20.2      22  0.0013   32.6   1.8   18  522-540    41-58  (214)
  7 PF12207 DUF3600: Domain of un  15.9      37  0.0023   29.7   2.1   19  560-578   101-119 (160)
  8 PF10860 DUF2661: Protein of u  12.4      40  0.0025   27.9   1.3   12  531-542    36-47  (113)
  9 PF09851 SHOCT: Short C-termin  12.1      60  0.0037   19.8   1.8   15  564-578    14-28  (28)
 10 PF07406 NICE-3: NICE-3 protei  11.2      26  0.0016   31.2  -0.3   27  552-579   152-178 (181)

No 1
>PF06482 Endostatin: Collagenase NC10 and Endostatin; InterPro: IPR010515 NC10 stands for Non-helical region 10 and is taken from P39059 from SWISSPROT. A mutation in this region in P39060 from SWISSPROT is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumour suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor [].
Endostatin also binds a zinc ion near the N terminus; this is likely to be of structural rather than functional importance according to [].; GO: 0005198 structural molecule activity, 0007155 cell adhesion, 0031012 extracellular matrix; PDB: 1DY2_A 1DY1_A 1DY0_A 1KOE_A 3N3F_B 1BNL_D 3HSH_E 3HON_A.
Probab=100.00  E-value=1.5e-75  Score=590.77  Aligned_cols=254  Identities=47%  Similarity=0.867  Sum_probs=180.4

Q ss_pred             ceEEeeccHHHHhhccCCCCCCcEEEEccCccEEEEEcCCceeeeccccCcCCCCCCCCcCC---C-CCcccccc-----
Q FBpp0290333     287 PGAVTFQNIDEMTKKSALNPPGTLAYITEEEALLVRVNKGWQYIALGTLVPIATPAPPTTVA---P-SMRFDLQS-----  357 (579)
Q Consensus       287 ~g~~~~~~~d~m~~~~~~~~~GtlaYv~~~~~l~Vrv~~G~~~i~lg~~~p~~~~~pp~t~~---P-s~r~~~~~-----  357 (579)
                      +||++|+|+++|+++++..++|||+||+|+++|||||++|||+|+||.++|+....++..++   + +.......
T Consensus         1 sGV~vf~T~~~Ml~~a~~~pEGTLayV~e~~eLYVRVrnGWRkV~LG~~ip~~~~~~~~~va~~~p~P~v~~~~~~~~~~   80 (286)
T PF06482_consen    1 SGVTVFRTYETMLATAHRVPEGTLAYVIEREELYVRVRNGWRKVQLGELIPIPSDTPDNEVASTQPPPVVSSPPQSSPPS   80 (286)
T ss_dssp             --EEEESSHHHHHCHGGGS-TTEEEEETTTTEEEEEETTEEEEE-EEEEEE-----------------------------
T ss_pred             CCcEEecCHHHHHhhcccCCCeEEEEEEecceEEEEecCCeeeeccCCcccCCCCcccccccccCCCCccccCccccccc
Confidence            47999999999999999999999999999999999999999999999999987655432111   1 11100000

Q ss_pred             --cCcc--------------C-----CCCCCCCCCceEEEecCCCCCCCCCCccchhHHHHHHhhhcCCccccceeeeec
Q FBpp0290333     358 --KNLL--------------N-----SPPPLLNTPTLRVAALNEPSTGDLQGIRGADFACYRQGRRAGLLGTFKAFLSSR  416 (579)
Q Consensus       358 --~~~~--------------~-----~~~~~~~~~~l~l~a~n~~~~G~l~Gi~GAD~~C~~~A~~~g~~~tfrA~LSs~  416 (579)
                        ....              .     ..........|||||||+|++|||+||+|||++||+||+++|+.+|||||||++
T Consensus        81 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~liAlN~P~~G~m~Gi~gAD~~C~~qAr~~gl~gtfRAfLSs~  160 (286)
T PF06482_consen   81 SHPRPPSTAPDPHYPPQPRRPPPPSPSAHTHHDDGPLHLIALNEPLSGNMRGIRGADFQCFRQARAAGLTGTFRAFLSSR  160 (286)
T ss_dssp             -----------------------------S--TTS-EEEEE-SS-B-SBSSHHHHHHHHHHHHHHHTT--S-EEESS-BT
T ss_pred             ccCCcccCCCCccCCCCccccCCCCCccccccCCCceEEEEcCCCCCCCccccccccHHHHHHHHHcCCCCceEEeeecc
Confidence              0000              0     001122333499999999999999999999999999999999999999999999

Q ss_pred             ccccceeeccCCC-CCceecCCCcEEecCccccccCCCCCccCCCceeeccCCcccCCCCCCcceEeecCCCCCccccCC
Q FBpp0290333     417 VQNLDTIVRPADR-DLPVVNTRGDVLFNSWKGIFNGQGGFFSQAPRIYSFSGKNVMTDSTWPMKMVWHGSLPNGERSMDT  495 (579)
Q Consensus       417 ~q~~~~~V~~~dr-~~p~~n~~g~vlf~~~~~l~~~~~~~~~~~~~i~~f~~~dvl~d~~~p~k~vW~Gs~~~g~~~~~~  495 (579)
                      +|||.++|++.|| .+||||+||+|||++|++||+++.+.|+.+++||||||+|||+|++||+|+|||||+++|++..+.
T Consensus       161 ~qdL~~iV~~~dr~~~PivNlkgevLf~sw~~lf~g~~~~~~~~~~iySFdGr~v~~d~~wP~K~vWhGs~~~G~r~~~~  240 (286)
T PF06482_consen  161 LQDLYSIVRRADRDNVPIVNLKGEVLFNSWESLFSGSGGPFNPNAPIYSFDGRDVLTDPAWPQKMVWHGSDPRGRRLTDS  240 (286)
T ss_dssp             TB-GGGGS-GGGTSS--EE-TTS-EEES-HHHHTSSS-SB--TTS--BBTTS-BTTTSTTSSS-EEE--B-TTS-B-TTS
T ss_pred             cccHhhhccHhhCCCCCeEeCcCCEeecCHHHHhCCCCCCCCCCCcEEeECCccccCCCCcceEEEEeCCCCCCccCCcC
Confidence            9999999999999 799999999999999999999998899999999999999999999999999999999999999999

Q ss_pred             CCCCcccCCCCceeccccCCCCccccccccccCCCceEEEEeccc
Q FBpp0290333     496 YCDAWHSGDHLKGSFASNLDGHKLLEQKRQSCDSKLIILCVEALS  540 (579)
Q Consensus       496 ~C~~W~s~~~~~~G~as~~~~~~l~~~~~~~C~~~~~vlCvE~~~  540 (579)
                      ||++|+|++.+++|+||+|.+++||+|+.+||+++||||||||+.
T Consensus       241 ~C~~Wrs~~~~~~G~As~l~~g~ll~q~~~sC~~~~ivLCiE~~~  285 (286)
T PF06482_consen  241 SYCEAWRSSDPAVTGQASSLQSGKLLDQQPYSCSNSFIVLCIENSF  285 (286)
T ss_dssp             BHHHHB---TTSEEEEEEGGGTBSS--EEEETTS-BB-EEEESS-
T ss_pred             cCcccccccCCCCceEeeeecCCCCcccCCcccCCCceEEEEEeccc
Confidence            999999999999999999999999999999999999999999974

No 2
>PF07588 DUF1554: Protein of unknown function (DUF1554); InterPro: IPR011448 This is a domain that occurs in 1-2 copies in a family of proteins identified in Leptospira interrogans and other bacteria. The function of the proteins is not known.
Probab=99.62  E-value=2.1e-16  Score=143.19  Aligned_cols=124  Identities=21%  Similarity=0.368  Sum_probs=88.3

Q ss_pred             CCCCCCCCCccchhHHHHHHhhhcC--Cccccceeeeeccc--ccceeeccCCCCCceecCCCcEEecCccccccCCCC-
Q FBpp0290333     380 EPSTGDLQGIRGADFACYRQGRRAG--LLGTFKAFLSSRVQ--NLDTIVRPADRDLPVVNTRGDVLFNSWKGIFNGQGG-  454 (579)
Q Consensus       380 ~~~~G~l~Gi~GAD~~C~~~A~~~g--~~~tfrA~LSs~~q--~~~~~V~~~dr~~p~~n~~g~vlf~~~~~l~~~~~~-  454 (579)
                      ..|+|||+||.|||++|++.+.+..  ..++|||||++.+.  ++.. +.+.-..    ...+|||..+ .+++ +.++
T Consensus         2 ~~~~GnlGGi~GADa~C~~d~~~p~~~~~~~yKAml~~~~~~~R~a~-~t~n~~~----g~~DWVl~pn-t~Y~-r~dgt   74 (135)
T PF07588_consen    2 NTYNGNLGGISGADAKCNADANKPSPGGGGTYKAMLVDGSNSTRRAC-VTANCGD----GQIDWVLKPN-TTYY-RSDGT   74 (135)
T ss_pred             ccccCcccchhhHhHHHHcCCCCCCCCCCcCeEEEEEcCccccceee-cCCCCCC----CcccceecCC-ceEE-ecCCC
Confidence            3689999999999999998765543  58999999999764  3333 2222111    1455655555 3344 4455

Q ss_pred             -CccCCCc-eeeccCCcccCCCCCC-cceEeecCCCCCccccCCCCCCcccCCCCceeccccCC
Q FBpp0290333     455 -FFSQAPR-IYSFSGKNVMTDSTWP-MKMVWHGSLPNGERSMDTYCDAWHSGDHLKGSFASNLD  515 (579)
Q Consensus       455 -~~~~~~~-i~~f~~~dvl~d~~~p-~k~vW~Gs~~~g~~~~~~~C~~W~s~~~~~~G~as~~~  515 (579)
                       +|.++.. ||+|.    |++++-. .+.||||++.+|+... .+|++|+++...++|.....+
T Consensus        75 ~i~tTn~~glf~f~----l~~~i~~~~~~~WTGl~~~Wt~~~-~~C~~Wt~~s~~~~G~~G~~n  133 (135)
T PF07588_consen   75 TIFTTNSNGLFDFP----LSNPISGTSGTIWTGLNSDWTTAT-NNCNNWTSGSSGVTGAYGSSN  133 (135)
T ss_pred             EEEecCCCceEccc----ccceecCCCccEEEeECCCCeeCC-CcccCCcCCCCcccccccccc
Confidence             5656655 99996    6666644 6889999999998885 899999999977777766543

No 3
>PF01410 COLFI: Fibrillar collagen C-terminal domain; InterPro: IPR000885 Collagens contain a large number of globular domains in between the regions of triple helical repeats IPR008160 from INTERPRO. These domains are involved in binding diverse substrates. One of these domains is found at the C terminus of fibrillar collagens. The exact function of this domain is unknown.; GO: 0005201 extracellular matrix structural constituent, 0005581 collagen trimer; PDB: 4AEJ_A 4AE2_A 4AK3_A.
Probab=91.98  E-value=0.016  Score=55.57  Aligned_cols=56  Identities=20%  Similarity=0.076  Sum_probs=35.6

Q ss_pred             hhHHHHHHhhhcCCccccceeeeecc-cccce----eeccCCCCCceecCCCcEEecCccccccCC
Q FBpp0290333     392 ADFACYRQGRRAGLLGTFKAFLSSRV-QNLDT----IVRPADRDLPVVNTRGDVLFNSWKGIFNGQ  452 (579)
Q Consensus       392 AD~~C~~~A~~~g~~~tfrA~LSs~~-q~~~~----~V~~~dr~~p~~n~~g~vlf~~~~~l~~~~  452 (579)
                      +|..|+.   ...+|++||+|||+.| |+|++    +|+|+|..  ..+++.++.|.+||+.+-..
T Consensus       110 ~~~~~~~---~~~vQL~FLrLlS~~A~Q~iT~~C~ns~~~~d~~--~~~~~~ai~l~g~n~~~~~~  170 (233)
T PF01410_consen  110 VDELCYD---ISVVQLNFLRLLSSEASQNITYHCRNSVAWYDQS--SGSYNKAIKLLGDNDEELSY  170 (233)
T ss_dssp             -TTS-HH---HHHHHHHHHHHHSSEEEEEEEEEEES--SS-BTT--TTB-TT--EEE-SSS-EEBS
T ss_pred             Ccccccc---ccHHHHHHHHHHhHhhcceEEEecCCCccccccc--ccccccceEEecCCceEEec
Confidence            5666642   3345899999999999 99999    88888875  46788999999988765443

No 4
>PF11267 DUF3067: Domain of unknown function (DUF3067); InterPro: IPR021420 This family of proteins has no known function. ; PDB: 2LJW_A.
Probab=46.42  E-value=4.5  Score=33.10  Aligned_cols=14  Identities=36%  Similarity=0.650  Sum_probs=11.5

Q ss_pred             cChHHHHHHHHhhh
Q FBpp0290333     565 KTADEYAAHLENLL  578 (579)
Q Consensus       565 ~~~~~~~~~~~~~~  578 (579)
                      +||+||.+||+.+.
T Consensus        42 ltE~eY~~hL~~ia   55 (98)
T PF11267_consen   42 LTEEEYLEHLDAIA   55 (98)
T ss_dssp             S-HHHHHHHHHHHH
T ss_pred             CCHHHHHHHHHHHH
Confidence            69999999999874

No 5
>PF07805 HipA_N: HipA-like N-terminal domain; InterPro: IPR012894 The members of this entry contain a region that is found towards the N terminus of the HipA protein expressed by various bacterial species (for example P23874 from SWISSPROT). This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis []. When expressed alone, it is toxic to bacterial cells [], but it is usually tightly associated with HipB [], and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products []. ; PDB: 4PU5_A 4PU3_B 4PU4_A 2WIU_C 3HZI_A 3TPD_A 3DNT_B 3TPE_A 3TPV_B 3FBR_A ....
Probab=31.19  E-value=15  Score=28.04  Aligned_cols=56  Identities=18%  Similarity=0.268  Sum_probs=31.7

Q ss_pred             CceEEEecCCCCCCCCCCccchhHHHHHHhhhcCCccccceeeeeccccc-ceeeccCCC
Q FBpp0290333     371 PTLRVAALNEPSTGDLQGIRGADFACYRQGRRAGLLGTFKAFLSSRVQNL-DTIVRPADR  429 (579)
Q Consensus       371 ~~l~l~a~n~~~~G~l~Gi~GAD~~C~~~A~~~g~~~tfrA~LSs~~q~~-~~~V~~~dr  429 (579)
                      +.-|+|=+..   -++..+.=.-+.|++.|+++|+...--.+++...... ...|...||
T Consensus        25 ~s~~I~K~~~---~~~~~~~~nE~~~m~lA~~~Gi~v~~~~l~~~~~g~~~~l~v~RFDR   81 (81)
T PF07805_consen   25 PSTHILKFPS---EDYPDLVENEYLCMRLARAAGIDVPETELVRFGDGEGPALLVERFDR   81 (81)
T ss_dssp             --SEEEEESS---EECTTHHHHHHHHHHHHHHTT--B--EEEEEETTE-EEEEEEE-SSE
T ss_pred             CceEEEcCCc---ccCcchHHHHHHHHHHHHHhCCCcCceEEEEccCCCeEEEEEeCCCC
Confidence            3444444443   3455666678999999999999655555555443333 667777776

No 6
>PF02676 TYW3: Methyltransferase TYW3; InterPro: IPR003827 The methyltransferase TYW3 (tRNA-yW-synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis []. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3'-position adjacent to the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner [].; PDB: 1TLJ_A 2IT3_B 2IT2_A 2DRV_A 2DVK_A 2QG3_B.
Probab=20.22  E-value=22  Score=32.61  Aligned_cols=18  Identities=17%  Similarity=0.420  Sum_probs=10.6

Q ss_pred             ccccccCCCceEEEEeccc
Q FBpp0290333     522 QKRQSCDSKLIILCVEALS  540 (579)
Q Consensus       522 ~~~~~C~~~~~vlCvE~~~  540 (579)
                      .+++|||.+.+|+| |...
T Consensus        41 ~TTSSCSGRI~vf~-eg~~   58 (214)
T PF02676_consen   41 VTTSSCSGRISVFL-EGSK   58 (214)
T ss_dssp             EEEEEES-EEEEE------
T ss_pred             EEecccccceEEEe-eccc
Confidence            46789999999998 6544

No 7
>PF12207 DUF3600: Domain of unknown function (DUF3600); InterPro: IPR022019 This domain is the C-terminal of the putative ecf-type sigma factor negative effector. Proteins in this entry are approximately 230 amino acids in length. ; PDB: 3FH3_A 3FGG_A.
Probab=15.89  E-value=37  Score=29.70  Aligned_cols=19  Identities=37%  Similarity=0.583  Sum_probs=16.0

Q ss_pred             chhhccChHHHHHHHHhhh
Q FBpp0290333     560 ESREFKTADEYAAHLENLL  578 (579)
Q Consensus       560 ~~~~~~~~~~~~~~~~~~~  578 (579)
                      ++.+++|++||.+|.+.||
T Consensus       101 ssk~vlt~eEy~~y~~alm  119 (160)
T PF12207_consen  101 SSKEVLTDEEYDQYIEALM  119 (160)
T ss_dssp             -HHHHS-HHHHHHHHHHHH
T ss_pred             chHHhcCHHHHHHHHHHHh
Confidence            4789999999999999998

No 8
>PF10860 DUF2661: Protein of unknown function (DUF2661); InterPro: IPR020387 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf112; it is a family of uncharacterised viral proteins. This entry also represents the N-terminal region of Fowlpox virus (FPV) FPV217. The protein family is uncharacterised.
Probab=12.43  E-value=40  Score=27.86  Aligned_cols=12  Identities=17%  Similarity=0.492  Sum_probs=9.7

Q ss_pred             ceEEEEecccHH
Q FBpp0290333     531 LIILCVEALSQD  542 (579)
Q Consensus       531 ~~vlCvE~~~~~  542 (579)
                      ++|+||||.+..
T Consensus        36 yvlY~ie~~~~~   47 (113)
T PF10860_consen   36 YVLYCIENEDSL   47 (113)
T ss_pred             EEEEEEcCCCcc
Confidence            789999997653

No 9
>PF09851 SHOCT: Short C-terminal domain; InterPro: IPR018649 This presumed domain is functionally uncharacterised.
Probab=12.11  E-value=60  Score=19.84  Aligned_cols=15  Identities=33%  Similarity=0.421  Sum_probs=0.0

Q ss_pred             ccChHHHHHHHHhhh
Q FBpp0290333     564 FKTADEYAAHLENLL  578 (579)
Q Consensus       564 ~~~~~~~~~~~~~~~  578 (579)
                      .+|++||.+-.+.||
T Consensus        14 ~IteeEy~~~k~~lL   28 (28)
T PF09851_consen   14 EITEEEYEQKKKKLL   28 (28)
T ss_pred             CCCHHHHHHHHHHhC

No 10
>PF07406 NICE-3: NICE-3 protein; InterPro: IPR010876 This family consists of several eukaryotic NICE-3 and related proteins. The gene coding for NICE-3 is part of the epidermal differentiation complex (EDC), which comprises a large number of genes that are of crucial importance for the maturation of the human epidermis []. The function of NICE-3 is unknown.
Probab=11.20  E-value=26  Score=31.19  Aligned_cols=27  Identities=33%  Similarity=0.466  Sum_probs=0.0

Q ss_pred             CCCCCCCCchhhccChHHHHHHHHhhhC
Q FBpp0290333     552 DGSSHGESESREFKTADEYAAHLENLLL  579 (579)
Q Consensus       552 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~  579 (579)
                      +.|.|-|-+..|| +|+||.+|.+.|.+
T Consensus       152 d~Ye~AR~~~~~F-g~~EY~~y~~~l~~  178 (181)
T PF07406_consen  152 DGYEHARHGPEEF-GEEEYLKYMELLNE  178 (181)
T ss_pred             HHHHHhcCCCCCC-CHHHHHHHHHHHHH