Query 037939
Match_columns 309
No_of_seqs 68 out of 70
Neff 4.2
Searched_HMMs 46136
Date Fri Mar 29 05:54:57 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/037939.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/037939hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF06697 DUF1191: Protein of u 100.0 1E-110 2E-115 785.0 28.6 271 19-297 3-278 (278)
2 PF02480 Herpes_gE: Alphaherpe 91.6 0.15 3.3E-06 51.8 3.3 66 65-138 103-178 (439)
3 KOG3637 Vitronectin receptor, 88.8 0.16 3.4E-06 56.7 0.7 42 228-273 976-1021(1030)
4 PF07271 Cytadhesin_P30: Cytad 86.1 0.7 1.5E-05 44.7 3.3 42 229-270 67-109 (279)
5 PF06697 DUF1191: Protein of u 81.2 2.9 6.3E-05 40.6 5.3 46 227-274 207-252 (278)
6 PF04478 Mid2: Mid2 like cell 79.4 0.42 9.2E-06 42.7 -0.9 9 231-239 49-57 (154)
7 PF01102 Glycophorin_A: Glycop 78.4 1.2 2.6E-05 38.4 1.5 27 235-262 68-94 (122)
8 KOG4289 Cadherin EGF LAG seven 70.5 22 0.00047 42.2 9.0 64 148-215 2084-2156(2531)
9 PF12768 Rax2: Cortical protei 69.8 5.2 0.00011 38.5 3.7 54 223-279 220-278 (281)
10 PF01034 Syndecan: Syndecan do 68.9 1.4 3E-05 34.3 -0.3 27 235-261 13-39 (64)
11 PF02009 Rifin_STEVOR: Rifin/s 67.1 6.4 0.00014 38.5 3.7 23 244-266 267-289 (299)
12 PTZ00382 Variant-specific surf 67.0 2.3 5.1E-05 34.8 0.6 27 231-257 66-92 (96)
13 PF04995 CcmD: Heme exporter p 63.3 20 0.00043 25.5 4.8 14 257-270 26-39 (46)
14 PF15048 OSTbeta: Organic solu 62.3 2.6 5.7E-05 36.6 0.1 73 226-301 31-106 (125)
15 PHA03283 envelope glycoprotein 61.7 12 0.00026 39.4 4.7 18 194-211 327-349 (542)
16 PF04689 S1FA: DNA binding pro 59.8 23 0.0005 27.9 4.8 23 232-254 14-36 (69)
17 TIGR03141 cytochro_ccmD heme e 57.9 27 0.00058 24.8 4.6 14 257-270 27-40 (45)
18 PF07204 Orthoreo_P10: Orthore 57.9 8.4 0.00018 32.2 2.3 39 229-269 41-79 (98)
19 PTZ00046 rifin; Provisional 57.6 9.3 0.0002 38.4 3.0 23 244-266 326-348 (358)
20 PHA03265 envelope glycoprotein 55.9 8.4 0.00018 39.0 2.4 42 121-164 159-201 (402)
21 PF15345 TMEM51: Transmembrane 55.2 27 0.00059 33.3 5.5 15 285-299 98-112 (233)
22 TIGR03141 cytochro_ccmD heme e 54.9 36 0.00077 24.1 4.9 27 244-270 17-43 (45)
23 TIGR01477 RIFIN variant surfac 54.8 11 0.00024 37.8 3.1 23 244-266 321-343 (353)
24 PF06305 DUF1049: Protein of u 53.0 13 0.00028 27.4 2.5 11 261-271 54-64 (68)
25 PF15102 TMEM154: TMEM154 prot 53.0 4 8.7E-05 36.3 -0.3 56 238-302 63-118 (146)
26 PF05545 FixQ: Cbb3-type cytoc 52.3 20 0.00044 25.5 3.3 19 256-274 30-48 (49)
27 PHA03282 envelope glycoprotein 52.0 16 0.00035 38.3 3.7 25 187-211 334-367 (540)
28 KOG1226 Integrin beta subunit 49.7 24 0.00051 38.9 4.7 40 229-269 712-752 (783)
29 PF08693 SKG6: Transmembrane a 49.3 11 0.00025 26.7 1.6 21 229-251 10-30 (40)
30 PF15099 PIRT: Phosphoinositid 49.3 13 0.00029 32.5 2.3 9 242-250 90-98 (129)
31 PF14575 EphA2_TM: Ephrin type 48.4 5.5 0.00012 31.2 -0.2 9 233-241 2-10 (75)
32 PF12904 Collagen_bind_2: Puta 46.1 97 0.0021 25.4 6.8 32 153-190 31-63 (93)
33 PF06305 DUF1049: Protein of u 45.9 28 0.00062 25.6 3.4 12 258-269 44-55 (68)
34 COG3216 Uncharacterized protei 45.5 32 0.0007 31.8 4.3 85 188-272 87-182 (184)
35 PF01825 GPS: Latrophilin/CL-1 43.6 63 0.0014 22.4 4.6 36 176-211 2-44 (44)
36 smart00218 ZU5 Domain present 40.9 68 0.0015 26.9 5.2 75 68-166 9-85 (104)
37 PF04995 CcmD: Heme exporter p 40.4 65 0.0014 22.8 4.4 30 243-272 15-44 (46)
38 PLN00113 leucine-rich repeat r 40.2 22 0.00048 38.3 2.8 13 228-240 626-638 (968)
39 PF08374 Protocadherin: Protoc 39.4 26 0.00055 33.3 2.7 26 230-258 37-62 (221)
40 PHA03265 envelope glycoprotein 38.1 14 0.00031 37.4 0.9 27 239-268 356-382 (402)
41 PF05283 MGC-24: Multi-glycosy 37.3 25 0.00054 32.4 2.2 18 234-251 160-178 (186)
42 PF11368 DUF3169: Protein of u 36.3 14 0.00029 34.4 0.4 15 230-244 13-27 (248)
43 PF11166 DUF2951: Protein of u 34.8 7.3 0.00016 32.6 -1.4 28 223-250 65-92 (98)
44 PF12191 stn_TNFRSF12A: Tumour 33.9 14 0.0003 32.4 0.0 13 228-240 75-87 (129)
45 PF07172 GRP: Glycine rich pro 32.1 53 0.0012 27.0 3.2 22 262-283 26-47 (95)
46 PF12877 DUF3827: Domain of un 31.6 45 0.00097 36.3 3.3 17 86-108 112-128 (684)
47 PHA03286 envelope glycoprotein 31.4 83 0.0018 33.0 5.1 20 187-206 308-331 (492)
48 PF13908 Shisa: Wnt and FGF in 31.2 14 0.0003 32.6 -0.4 12 178-189 5-16 (179)
49 PF14880 COX14: Cytochrome oxi 30.7 88 0.0019 23.3 3.9 7 234-240 18-24 (59)
50 PF06212 GRIM-19: GRIM-19 prot 29.9 1E+02 0.0022 26.8 4.7 33 238-270 35-67 (130)
51 PF01589 Alpha_E1_glycop: Alph 29.9 40 0.00087 35.2 2.6 21 234-254 474-494 (502)
52 KOG3838 Mannose lectin ERGIC-5 29.5 1.2E+02 0.0027 31.5 5.8 138 7-161 11-165 (497)
53 PF05620 DUF788: Protein of un 29.2 27 0.00058 31.0 1.0 16 258-273 155-170 (170)
54 PF09926 DUF2158: Uncharacteri 28.0 1.1E+02 0.0023 22.8 3.9 33 153-187 10-42 (53)
55 PF03264 Cytochrom_NNT: NapC/N 27.7 45 0.00097 29.2 2.2 20 229-248 4-23 (173)
56 PF10183 ESSS: ESSS subunit of 26.9 79 0.0017 26.2 3.4 48 224-273 53-104 (105)
57 PF05808 Podoplanin: Podoplani 26.7 21 0.00046 32.3 0.0 22 233-254 131-152 (162)
58 PTZ00229 variable surface prot 26.6 30 0.00065 34.3 1.0 47 230-276 241-289 (317)
59 COG1862 YajC Preprotein transl 25.0 1.1E+02 0.0023 25.5 3.8 16 259-274 32-47 (97)
60 smart00303 GPS G-protein-coupl 24.9 2.2E+02 0.0047 20.1 4.9 40 176-215 2-48 (49)
61 PF12597 DUF3767: Protein of u 24.7 1.3E+02 0.0029 25.5 4.5 20 251-270 82-101 (118)
62 TIGR00739 yajC preprotein tran 24.5 1.4E+02 0.003 23.9 4.3 21 258-278 25-45 (84)
63 PF02439 Adeno_E3_CR2: Adenovi 24.2 50 0.0011 23.4 1.5 9 241-249 13-21 (38)
64 PF04971 Lysis_S: Lysis protei 24.1 66 0.0014 25.4 2.3 20 234-254 35-54 (68)
65 PRK05585 yajC preprotein trans 22.9 1.7E+02 0.0037 24.4 4.7 20 258-277 40-59 (106)
66 PF08370 PDR_assoc: Plant PDR 22.5 1.2E+02 0.0026 23.5 3.4 17 229-248 27-43 (65)
67 PF06295 DUF1043: Protein of u 22.4 37 0.0008 28.9 0.7 11 233-243 3-13 (128)
68 PF00558 Vpu: Vpu protein; In 22.2 73 0.0016 25.9 2.2 12 257-268 29-40 (81)
69 PF02158 Neuregulin: Neureguli 22.0 30 0.00065 35.4 0.0 44 235-278 10-55 (404)
70 PF15330 SIT: SHP2-interacting 22.0 51 0.0011 27.7 1.4 31 232-264 1-31 (107)
71 KOG2792 Putative cytochrome C 21.4 1.6E+02 0.0035 28.9 4.8 26 258-284 95-120 (280)
72 PF07006 DUF1310: Protein of u 21.4 35 0.00075 29.4 0.3 22 256-277 21-42 (122)
73 PF04286 DUF445: Protein of un 20.8 49 0.0011 31.1 1.1 21 229-249 343-363 (367)
74 KOG4007 Uncharacterized conser 20.2 1.4E+02 0.003 28.3 3.9 43 231-275 134-178 (229)
75 PF11669 WBP-1: WW domain-bind 20.1 2.3E+02 0.0049 23.4 4.8 22 256-277 41-62 (102)
No 1
>PF06697 DUF1191: Protein of unknown function (DUF1191); InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=100.00 E-value=1.1e-110 Score=784.96 Aligned_cols=271 Identities=51% Similarity=0.786 Sum_probs=246.4
Q ss_pred cccCCCChhHHHHHHHHHHHHhcccCCCccCceEeeeCCCCccCCeEEEEEeecCcccccCc-eeeeeEeCCcceeccce
Q 037939 19 ETQGVKSTRVLDLLIRDYTFKSLDNHAIKTGNLHNVHLPANLSGIKVDMVRFRCGSLRRYGA-RVKEFHLGIGVIVQPCV 97 (309)
Q Consensus 19 ~~q~~~~~~~LD~~lqd~A~k~l~~~~~~TG~~y~~~LPsNlSGi~VsavRlRsgSLrr~G~-~~~eF~IP~gv~~~P~v 97 (309)
++|+++++++||++|||||||+|+.+ ||||++|+++||+||||||||+||||||||||||+ +|+||+||||++++||+
T Consensus 3 ~~~~~~~~~~LD~~lqd~A~kal~~~-p~TG~~y~~~LP~nlsGi~vsavRlRsgSLrr~G~~~~~eF~IP~gv~~~P~v 81 (278)
T PF06697_consen 3 QSQQIYSARSLDALLQDYAFKALVLR-PRTGILYNVSLPSNLSGIEVSAVRLRSGSLRRRGVNNFSEFHIPPGVVVQPYV 81 (278)
T ss_pred cccccCCHHHHHHHHHHHHHHHhccc-cccCceeeeecCCcccceEEEEEEeecCchhhhcccccceeecCCcceecCcc
Confidence 57899999999999999999999644 79999999999999999999999999999999999 89999999999999999
Q ss_pred eEEEEEEeecCCCccccccccCCCCCCeEeceeeeeeeecCCCCcCCCCceEEEEeeCCCceEEEcCCCccccCCCCCcc
Q 037939 98 ERVVVVRQNLGYNWSSIYYANYDLSGYQLVSPVLGILAYNSVTDVNFNNRFELQILANGKPITIDFRNTTRVTNISGIKP 177 (309)
Q Consensus 98 ~Rv~lVyqnLG~NwSs~yy~~y~lpGY~lvsPVlGllaYdas~~~~~~n~~el~i~a~~~PI~V~F~~~~~~~~~~~~~~ 177 (309)
|||+||||||| |||++|| ++|||+|+|||||||||||+| +++.+++||+|.++|+||+|+|+|++..+..+++.+
T Consensus 82 ~Rl~lVyqnlG-NwSs~yy---~lpGY~lvsPVlGllaYdasn-~~~~~~~el~l~a~~~PI~V~F~~~~~~~~~~~~~~ 156 (278)
T PF06697_consen 82 ERLVLVYQNLG-NWSSHYY---PLPGYSLVSPVLGLLAYDASN-LSATSLPELSLRASGKPILVDFSNVSPAPQPGMSVP 156 (278)
T ss_pred eEEEEEEeccC-cccccee---cCCCceEEeeeeeeEEecccc-cccCCcceeeeeccCCcEEEEecCCccCCCcccccc
Confidence 99999999999 9999996 799999999999999999996 556667999999999999999999998754444899
Q ss_pred eeEEEccCCeEEEEecCCCceeEeeccceeEEEEeCCCCCCC----chhccccccceEEEeecchhHHHHHHHHHHHHHH
Q 037939 178 FCANFQRDGKVTLTNQVSPYVCVARKHGHFGLVTKYPPPSEG----PEQVRKKISRWKLAVGTTVGAAVGAFLLGLLLVA 253 (309)
Q Consensus 178 ~Cv~F~~~G~~~~~~~~~~nvC~~~~~GHfslVV~~~~~~~~----~~~~~~k~~~Wk~ivg~vvGg~~glvlLgll~v~ 253 (309)
|||+||+||+++|+|++++|||++++||||||||++++++|. .+..++++++||||+ +++||+++|+||+++++
T Consensus 157 ~Cv~F~~~G~~~~~~~~~~nvC~~~~~GHfslVV~~~~~~~~~~~~~~~~~~~~~~W~iv~-g~~~G~~~L~ll~~lv~- 234 (278)
T PF06697_consen 157 KCVTFDLDGSVTFSNMTSPNVCSTSRQGHFSLVVPSPAPPPAPPPPGAPPRKRSWWWKIVV-GVVGGVVLLGLLSLLVA- 234 (278)
T ss_pred eEEEEcCCCcEEEeccCCCceeeeecCceEEEEEcCCCCCCCCCCcccccCCcceeEEEEE-EehHHHHHHHHHHHHHH-
Confidence 999999999999999999999999999999999998755432 223466766776554 56777888999997665
Q ss_pred hhHHHHHHHHHHHHHHhcccccccceeeeccccCcccccccCCc
Q 037939 254 MFVKVKKKARMEELERRAYEEEALQVSMVGHIRAPTASVTRTVP 297 (309)
Q Consensus 254 ~~vr~~rkkr~~eMEr~A~~gE~L~~~~VG~sraPsA~~tRTqP 297 (309)
+++|+|||||||||||+||+||+|||+||||||||+|++|||||
T Consensus 235 ~~vr~krk~k~~eMEr~A~~gE~L~~~~VG~sraPsA~~tRT~P 278 (278)
T PF06697_consen 235 MLVRYKRKKKIEEMERRAEEGEALQMSWVGGSRAPSATVTRTQP 278 (278)
T ss_pred hhhhhhHHHHHHHHHHhhccCceeeeEEEccccCccccccccCC
Confidence 56999999999999999999999999999999999999999999
No 2
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation [].; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=91.61 E-value=0.15 Score=51.78 Aligned_cols=66 Identities=17% Similarity=-0.059 Sum_probs=16.9
Q ss_pred EEEEEeecCcccccCceee----e---e-EeCC-cceeccceeEEEEEEeecCCCccccccccCCCCC-CeEeceeeeee
Q 037939 65 VDMVRFRCGSLRRYGARVK----E---F-HLGI-GVIVQPCVERVVVVRQNLGYNWSSIYYANYDLSG-YQLVSPVLGIL 134 (309)
Q Consensus 65 VsavRlRsgSLrr~G~~~~----e---F-~IP~-gv~~~P~v~Rv~lVyqnLG~NwSs~yy~~y~lpG-Y~lvsPVlGll 134 (309)
|-.++.|+++-|.+=..|. + - ..|+ ..-+.|...++-+--+|+- +|-| -+| =-.+++-|=+-
T Consensus 103 vY~l~~~~~~~~~~~~~~~v~v~~~~~~~~~p~p~~p~~p~~~~~~~~~~~~~----s~vf----~~Gdtf~~~v~l~~~ 174 (439)
T PF02480_consen 103 VYTLYVRNDSGWAHQSVVFVTVKGPAPDPRTPPPHHPIVPHRHGATFHLKNYH----SHVF----SPGDTFHLSVHLQSE 174 (439)
T ss_dssp -------------------------------------------SEEEEEE--S----EEE------TT--EE---EEEEE
T ss_pred eEEEEecCCCCceEEEEEEEEEcCCCcCCCCCCCCCCCCCcccccEEEEeccc----eEEe----cCCCcEEEeEEEEec
Confidence 4466777777666442221 1 0 1221 1224466777777777655 6665 356 44567778888
Q ss_pred eecC
Q 037939 135 AYNS 138 (309)
Q Consensus 135 aYda 138 (309)
++|.
T Consensus 175 ~~d~ 178 (439)
T PF02480_consen 175 AHDD 178 (439)
T ss_dssp ESSS
T ss_pred cCCC
Confidence 8874
No 3
>KOG3637 consensus Vitronectin receptor, alpha subunit [Extracellular structures]
Probab=88.79 E-value=0.16 Score=56.70 Aligned_cols=42 Identities=21% Similarity=0.317 Sum_probs=23.8
Q ss_pred cceEEEeecchhHHHHHHHHHHHHHHhhHHH---HHHHH-HHHHHHhccc
Q 037939 228 SRWKLAVGTTVGAAVGAFLLGLLLVAMFVKV---KKKAR-MEELERRAYE 273 (309)
Q Consensus 228 ~~Wk~ivg~vvGg~~glvlLgll~v~~~vr~---~rkkr-~~eMEr~A~~ 273 (309)
--||||+++++||++ ||+||++++ -|. ||+++ ..+||++...
T Consensus 976 vp~wiIi~svl~GLL---lL~llv~~L-wK~GFFKR~r~~~~~~~~~~~~ 1021 (1030)
T KOG3637|consen 976 VPLWIIILSVLGGLL---LLALLVLLL-WKCGFFKRNRKHPKEQEEEDKS 1021 (1030)
T ss_pred cceeeehHHHHHHHH---HHHHHHHHH-HhcCccccCCCCcccccccccC
Confidence 367888888888754 445444333 444 55444 4455555443
No 4
>PF07271 Cytadhesin_P30: Cytadhesin P30/P32; InterPro: IPR009896 This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localised on the tip organelle. It is thought that it is important in cytadherence and virulence [].; GO: 0007157 heterophilic cell-cell adhesion, 0009405 pathogenesis, 0016021 integral to membrane
Probab=86.05 E-value=0.7 Score=44.72 Aligned_cols=42 Identities=33% Similarity=0.482 Sum_probs=27.0
Q ss_pred ceEE-EeecchhHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHh
Q 037939 229 RWKL-AVGTTVGAAVGAFLLGLLLVAMFVKVKKKARMEELERR 270 (309)
Q Consensus 229 ~Wk~-ivg~vvGg~~glvlLgll~v~~~vr~~rkkr~~eMEr~ 270 (309)
.|++ .||+.+|-.+.+++||+-+-...+|.|.|+.+||-|++
T Consensus 67 ~W~~P~v~~~~G~~~v~liLgl~ig~p~~krkek~~iee~e~~ 109 (279)
T PF07271_consen 67 SWFIPVVGGSAGLLAVALILGLAIGIPIYKRKEKRMIEEKEEH 109 (279)
T ss_pred cceeeeccchhhHHHHHHHHHHhhcchhhhhhHHHHHHHHHHH
Confidence 5655 67766776555666776555555676666677777764
No 5
>PF06697 DUF1191: Protein of unknown function (DUF1191); InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=81.19 E-value=2.9 Score=40.64 Aligned_cols=46 Identities=24% Similarity=0.243 Sum_probs=33.2
Q ss_pred ccceEEEeecchhHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHhcccc
Q 037939 227 ISRWKLAVGTTVGAAVGAFLLGLLLVAMFVKVKKKARMEELERRAYEE 274 (309)
Q Consensus 227 ~~~Wk~ivg~vvGg~~glvlLgll~v~~~vr~~rkkr~~eMEr~A~~g 274 (309)
.++||+|- -++|.+.|+++|+||. .++....|-||.++||.---+-
T Consensus 207 ~~~~~~W~-iv~g~~~G~~~L~ll~-~lv~~~vr~krk~k~~eMEr~A 252 (278)
T PF06697_consen 207 RKRSWWWK-IVVGVVGGVVLLGLLS-LLVAMLVRYKRKKKIEEMERRA 252 (278)
T ss_pred CCcceeEE-EEEEehHHHHHHHHHH-HHHHhhhhhhHHHHHHHHHHhh
Confidence 56888887 4444488888888766 4777779999988888644333
No 6
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway [].
Probab=79.44 E-value=0.42 Score=42.70 Aligned_cols=9 Identities=33% Similarity=0.866 Sum_probs=6.3
Q ss_pred EEEeecchh
Q 037939 231 KLAVGTTVG 239 (309)
Q Consensus 231 k~ivg~vvG 239 (309)
++|+|.|||
T Consensus 49 nIVIGvVVG 57 (154)
T PF04478_consen 49 NIVIGVVVG 57 (154)
T ss_pred cEEEEEEec
Confidence 467777777
No 7
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=78.41 E-value=1.2 Score=38.36 Aligned_cols=27 Identities=22% Similarity=0.336 Sum_probs=10.5
Q ss_pred ecchhHHHHHHHHHHHHHHhhHHHHHHH
Q 037939 235 GTTVGAAVGAFLLGLLLVAMFVKVKKKA 262 (309)
Q Consensus 235 g~vvGg~~glvlLgll~v~~~vr~~rkk 262 (309)
|-++|.+.|++++.|++.++ +|++|||
T Consensus 68 ~Ii~gv~aGvIg~Illi~y~-irR~~Kk 94 (122)
T PF01102_consen 68 GIIFGVMAGVIGIILLISYC-IRRLRKK 94 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHHH-HHHHS--
T ss_pred ehhHHHHHHHHHHHHHHHHH-HHHHhcc
Confidence 33344444444444444433 4444444
No 8
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=70.52 E-value=22 Score=42.21 Aligned_cols=64 Identities=20% Similarity=0.369 Sum_probs=43.9
Q ss_pred eEEEE-eeCCCceEEEcCCCccccCCCCCcceeEEEcc-CCeEEEE-------ecCCCceeEeeccceeEEEEeCCC
Q 037939 148 FELQI-LANGKPITIDFRNTTRVTNISGIKPFCANFQR-DGKVTLT-------NQVSPYVCVARKHGHFGLVTKYPP 215 (309)
Q Consensus 148 ~el~i-~a~~~PI~V~F~~~~~~~~~~~~~~~Cv~F~~-~G~~~~~-------~~~~~nvC~~~~~GHfslVV~~~~ 215 (309)
+.++| ....+||.|.|.-+.. ++-+.|.||-||. .|.=+.+ |. +--.|.-.+.|-|++.+..++
T Consensus 2084 P~~a~l~~l~~Pv~veF~lle~---~~rtkpiCV~wde~tG~WtARgcelv~rN~-tHaaCqcnr~gsF~vlmd~s~ 2156 (2531)
T KOG4289|consen 2084 PMIAILPRLEDPVIVEFRLLET---EERTKPICVFWDEGTGGWTARGCELVGRNL-THAACQCNRTGSFAVLMDDSR 2156 (2531)
T ss_pred cchhccccCCCCeEEEEEEEec---cCcccceEEEEcCCCCceeeccceeecccc-ceeeeeeccceeEEEEEccCc
Confidence 34554 3467899999985542 3446999999996 4444443 32 234677789999999997654
No 9
>PF12768 Rax2: Cortical protein marker for cell polarity
Probab=69.83 E-value=5.2 Score=38.53 Aligned_cols=54 Identities=26% Similarity=0.262 Sum_probs=25.9
Q ss_pred ccccccceEEEeecchhHHHHHH-HHHHHHHHhhHHHHHHHHHHHHHH----hcccccccce
Q 037939 223 VRKKISRWKLAVGTTVGAAVGAF-LLGLLLVAMFVKVKKKARMEELER----RAYEEEALQV 279 (309)
Q Consensus 223 ~~~k~~~Wk~ivg~vvGg~~glv-lLgll~v~~~vr~~rkkr~~eMEr----~A~~gE~L~~ 279 (309)
++|+..+.++++.++ +-++|++ ||+++.+. .-+.||||...++. +-||+|.+|+
T Consensus 220 ~~~~l~~G~VVlIsl-AiALG~v~ll~l~Gii--~~~~~r~~~~~~~~p~~~~~d~~~~~~~ 278 (281)
T PF12768_consen 220 GGKKLSRGFVVLISL-AIALGTVFLLVLIGII--LAYIRRRRQGYVPAPTSPRIDEDEMMQR 278 (281)
T ss_pred ccccccceEEEEEeh-HHHHHHHHHHHHHHHH--HHHHHhhhccCcCCCcccccCccccccc
Confidence 457888887765332 2223322 22322222 12233333333433 6788888775
No 10
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionary conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=68.90 E-value=1.4 Score=34.26 Aligned_cols=27 Identities=26% Similarity=0.503 Sum_probs=0.4
Q ss_pred ecchhHHHHHHHHHHHHHHhhHHHHHH
Q 037939 235 GTTVGAAVGAFLLGLLLVAMFVKVKKK 261 (309)
Q Consensus 235 g~vvGg~~glvlLgll~v~~~vr~~rk 261 (309)
|-+.|+++|+.+.-+|+++++-|.|||
T Consensus 13 avIaG~Vvgll~ailLIlf~iyR~rkk 39 (64)
T PF01034_consen 13 AVIAGGVVGLLFAILLILFLIYRMRKK 39 (64)
T ss_dssp ------------------------S--
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHhc
Confidence 333444444444443333333344443
No 11
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=67.08 E-value=6.4 Score=38.50 Aligned_cols=23 Identities=13% Similarity=0.502 Sum_probs=14.8
Q ss_pred HHHHHHHHHHhhHHHHHHHHHHH
Q 037939 244 AFLLGLLLVAMFVKVKKKARMEE 266 (309)
Q Consensus 244 lvlLgll~v~~~vr~~rkkr~~e 266 (309)
+++|-++++-...||||||||.+
T Consensus 267 iIVLIMvIIYLILRYRRKKKmkK 289 (299)
T PF02009_consen 267 IIVLIMVIIYLILRYRRKKKMKK 289 (299)
T ss_pred HHHHHHHHHHHHHHHHHHhhhhH
Confidence 34444445555569999888864
No 12
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=67.04 E-value=2.3 Score=34.80 Aligned_cols=27 Identities=22% Similarity=0.017 Sum_probs=16.1
Q ss_pred EEEeecchhHHHHHHHHHHHHHHhhHH
Q 037939 231 KLAVGTTVGAAVGAFLLGLLLVAMFVK 257 (309)
Q Consensus 231 k~ivg~vvGg~~glvlLgll~v~~~vr 257 (309)
-.|+|.+||+++.+..|..+++|++++
T Consensus 66 gaiagi~vg~~~~v~~lv~~l~w~f~~ 92 (96)
T PTZ00382 66 GAIAGISVAVVAVVGGLVGFLCWWFVC 92 (96)
T ss_pred ccEEEEEeehhhHHHHHHHHHhheeEE
Confidence 347788888776655554445555443
No 13
>PF04995 CcmD: Heme exporter protein D (CcmD); InterPro: IPR007078 The CcmD protein is part of a C-type cytochrome biogenesis operon []. The exact function of this protein is uncertain. It has been proposed that CcmC, CcmD and CcmE interact directly with each other, establishing a cytoplasm to periplasm haem delivery pathway for cytochrome c maturation []. This protein is found fused to CcmE in P52224 from SWISSPROT. These proteins contain a predicted transmembrane helix.; GO: 0006810 transport, 0016021 integral to membrane
Probab=63.28 E-value=20 Score=25.46 Aligned_cols=14 Identities=21% Similarity=0.287 Sum_probs=5.9
Q ss_pred HHHHHHHHHHHHHh
Q 037939 257 KVKKKARMEELERR 270 (309)
Q Consensus 257 r~~rkkr~~eMEr~ 270 (309)
..++|+-+++++++
T Consensus 26 ~~~~r~~~~~l~~~ 39 (46)
T PF04995_consen 26 LRRRRRLRKELKRL 39 (46)
T ss_pred HHHHHHHHHHHHHH
Confidence 33444444444443
No 14
>PF15048 OSTbeta: Organic solute transporter subunit beta protein
Probab=62.29 E-value=2.6 Score=36.60 Aligned_cols=73 Identities=12% Similarity=0.290 Sum_probs=43.9
Q ss_pred cccceEEEeecchhHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHhcccccccceeeeccccCc---ccccccCCccccc
Q 037939 226 KISRWKLAVGTTVGAAVGAFLLGLLLVAMFVKVKKKARMEELERRAYEEEALQVSMVGHIRAP---TASVTRTVPTIEQ 301 (309)
Q Consensus 226 k~~~Wk~ivg~vvGg~~glvlLgll~v~~~vr~~rkkr~~eMEr~A~~gE~L~~~~VG~sraP---sA~~tRTqP~lE~ 301 (309)
.-..|...+ ++=...++++|++++..-+.-.|++|++.+|++..+-..++-+.+-..+.. ..+...-.|.+..
T Consensus 31 D~tpWNysi---L~Ls~vvlvi~~~LLgrsi~ANRnrK~~~~~k~~pE~~~~~es~~kd~~sL~~l~etllsekp~l~q 106 (125)
T PF15048_consen 31 DATPWNYSI---LALSFVVLVISFFLLGRSIQANRNRKMQPQEKQTPEVLSLDESGLKDDNSLNILRETLLSEKPNLAQ 106 (125)
T ss_pred CCCCcchHH---HHHHHHHHHHHHHHHHHHhHhccccccccccccCHHHhhcccccccccccccHHHHHhhccCCCccC
Confidence 344676532 222223455677776666677888888899998888777776655544432 2344455555544
No 15
>PHA03283 envelope glycoprotein E; Provisional
Probab=61.75 E-value=12 Score=39.43 Aligned_cols=18 Identities=33% Similarity=0.565 Sum_probs=14.6
Q ss_pred CCCceeEeeccce-----eEEEE
Q 037939 194 VSPYVCVARKHGH-----FGLVT 211 (309)
Q Consensus 194 ~~~nvC~~~~~GH-----fslVV 211 (309)
.+-|||..+=.|| |.++.
T Consensus 327 SGLYVfVv~yNgHveAW~YtllS 349 (542)
T PHA03283 327 SGLYVFVLLYNGHPEAWTYTLLS 349 (542)
T ss_pred CceEEEEEEECCeeeeeEEEEEe
Confidence 5689999999999 65554
No 16
>PF04689 S1FA: DNA binding protein S1FA; InterPro: IPR006779 S1FA is an unusual small plant peptide of only 70 amino acids with a basic domain which contains a nuclear localization signal and a putative DNA binding helix. S1FA is highly conserved between dicotyledonous and monocotyledonous plants and may be a DNA-binding protein that specifically recognises the negative promoter element S1F [].; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=59.79 E-value=23 Score=27.86 Aligned_cols=23 Identities=26% Similarity=0.306 Sum_probs=16.6
Q ss_pred EEeecchhHHHHHHHHHHHHHHh
Q 037939 232 LAVGTTVGAAVGAFLLGLLLVAM 254 (309)
Q Consensus 232 ~ivg~vvGg~~glvlLgll~v~~ 254 (309)
.||--++|++++++|.|..++.+
T Consensus 14 lIVLlvV~g~ll~flvGnyvlY~ 36 (69)
T PF04689_consen 14 LIVLLVVAGLLLVFLVGNYVLYV 36 (69)
T ss_pred eEEeehHHHHHHHHHHHHHHHHH
Confidence 35556688888888888776644
No 17
>TIGR03141 cytochro_ccmD heme exporter protein CcmD. The model for this protein family describes a small, hydrophobic, and only moderately well-conserved protein, tricky to identify accurately for all of these reasons. However, members are found as part of large operons involved in heme export across the inner membrane for assembly of c-type cytochromes in a large number of bacteria. The gray zone between the trusted cutoff (13.0) and noise cutoff (4.75) includes both low-scoring examples and false-positive matches to hydrophobic domains of longer proteins.
Probab=57.92 E-value=27 Score=24.77 Aligned_cols=14 Identities=21% Similarity=0.394 Sum_probs=6.3
Q ss_pred HHHHHHHHHHHHHh
Q 037939 257 KVKKKARMEELERR 270 (309)
Q Consensus 257 r~~rkkr~~eMEr~ 270 (309)
..++|+.++++++.
T Consensus 27 ~~~~r~~~~~l~~~ 40 (45)
T TIGR03141 27 LLDRRRLLRELRRL 40 (45)
T ss_pred HHHHHHHHHHHHHH
Confidence 44444444444443
No 18
>PF07204 Orthoreo_P10: Orthoreovirus membrane fusion protein p10; InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction [].
Probab=57.89 E-value=8.4 Score=32.22 Aligned_cols=39 Identities=18% Similarity=0.245 Sum_probs=18.3
Q ss_pred ceEEEeecchhHHHHHHHHHHHHHHhhHHHHHHHHHHHHHH
Q 037939 229 RWKLAVGTTVGAAVGAFLLGLLLVAMFVKVKKKARMEELER 269 (309)
Q Consensus 229 ~Wk~ivg~vvGg~~glvlLgll~v~~~vr~~rkkr~~eMEr 269 (309)
-|..+++ .||++.++++-.++.++..|+|.+++..+-+|
T Consensus 41 yWpyLA~--GGG~iLilIii~Lv~CC~~K~K~~~~r~~~~r 79 (98)
T PF07204_consen 41 YWPYLAA--GGGLILILIIIALVCCCRAKHKTSAARNTFHR 79 (98)
T ss_pred hhHHhhc--cchhhhHHHHHHHHHHhhhhhhhHhhhhHHHH
Confidence 5655554 35555444442233344345554455444444
No 19
>PTZ00046 rifin; Provisional
Probab=57.57 E-value=9.3 Score=38.45 Aligned_cols=23 Identities=13% Similarity=0.515 Sum_probs=16.1
Q ss_pred HHHHHHHHHHhhHHHHHHHHHHH
Q 037939 244 AFLLGLLLVAMFVKVKKKARMEE 266 (309)
Q Consensus 244 lvlLgll~v~~~vr~~rkkr~~e 266 (309)
+++|-++++-+..||||||||.+
T Consensus 326 VIVLIMvIIYLILRYRRKKKMkK 348 (358)
T PTZ00046 326 VIVLIMVIIYLILRYRRKKKMKK 348 (358)
T ss_pred HHHHHHHHHHHHHHhhhcchhHH
Confidence 44455555556679999999875
No 20
>PHA03265 envelope glycoprotein D; Provisional
Probab=55.87 E-value=8.4 Score=38.95 Aligned_cols=42 Identities=19% Similarity=0.264 Sum_probs=22.9
Q ss_pred CCCCeEece-eeeeeeecCCCCcCCCCceEEEEeeCCCceEEEcC
Q 037939 121 LSGYQLVSP-VLGILAYNSVTDVNFNNRFELQILANGKPITIDFR 164 (309)
Q Consensus 121 lpGY~lvsP-VlGllaYdas~~~~~~n~~el~i~a~~~PI~V~F~ 164 (309)
+.+|++++. =|||+. |+......+.+.=-|..+|.+|-=+|.
T Consensus 159 lasfaf~TdDELGLvm--AAPA~~~~GqYrRlI~Ing~v~yTDFm 201 (402)
T PHA03265 159 LITMAAETDDELGLVL--AAPAHSASGLYRRVIEIDGRRIYTDFS 201 (402)
T ss_pred ccccccccccccceEE--ecCCcccCceEEEEEEECCEEEEEEEE
Confidence 346666665 366655 332222333333334456888888886
No 21
>PF15345 TMEM51: Transmembrane protein 51
Probab=55.23 E-value=27 Score=33.33 Aligned_cols=15 Identities=13% Similarity=-0.002 Sum_probs=8.1
Q ss_pred ccCcccccccCCccc
Q 037939 285 IRAPTASVTRTVPTI 299 (309)
Q Consensus 285 sraPsA~~tRTqP~l 299 (309)
...|.|...+.|+.-
T Consensus 98 ~~~~~a~~~~~q~e~ 112 (233)
T PF15345_consen 98 GAEPQAQEEDSQQEE 112 (233)
T ss_pred cccccccccccccch
Confidence 344555555666554
No 22
>TIGR03141 cytochro_ccmD heme exporter protein CcmD. The model for this protein family describes a small, hydrophobic, and only moderately well-conserved protein, tricky to identify accurately for all of these reasons. However, members are found as part of large operons involved in heme export across the inner membrane for assembly of c-type cytochromes in a large number of bacteria. The gray zone between the trusted cutoff (13.0) and noise cutoff (4.75) includes both low-scoring examples and false-positive matches to hydrophobic domains of longer proteins.
Probab=54.95 E-value=36 Score=24.15 Aligned_cols=27 Identities=26% Similarity=0.418 Sum_probs=20.4
Q ss_pred HHHHHHHHHHhhHHHHHHHHHHHHHHh
Q 037939 244 AFLLGLLLVAMFVKVKKKARMEELERR 270 (309)
Q Consensus 244 lvlLgll~v~~~vr~~rkkr~~eMEr~ 270 (309)
++|+++++....-+.+.++++++.|++
T Consensus 17 l~l~~li~~~~~~~r~~~~~l~~~~~r 43 (45)
T TIGR03141 17 LVLAGLILWSLLDRRRLLRELRRLEAR 43 (45)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHc
Confidence 455566666667899999999988876
No 23
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=54.80 E-value=11 Score=37.82 Aligned_cols=23 Identities=13% Similarity=0.502 Sum_probs=15.7
Q ss_pred HHHHHHHHHHhhHHHHHHHHHHH
Q 037939 244 AFLLGLLLVAMFVKVKKKARMEE 266 (309)
Q Consensus 244 lvlLgll~v~~~vr~~rkkr~~e 266 (309)
+++|-++++-...||||||||.+
T Consensus 321 vIVLIMvIIYLILRYRRKKKMkK 343 (353)
T TIGR01477 321 IIVLIMVIIYLILRYRRKKKMKK 343 (353)
T ss_pred HHHHHHHHHHHHHHhhhcchhHH
Confidence 34455555555579999999875
No 24
>PF06305 DUF1049: Protein of unknown function (DUF1049); InterPro: IPR010445 This entry consists of several hypothetical bacterial proteins of unknown function.
Probab=53.00 E-value=13 Score=27.44 Aligned_cols=11 Identities=18% Similarity=0.528 Sum_probs=5.0
Q ss_pred HHHHHHHHHhc
Q 037939 261 KARMEELERRA 271 (309)
Q Consensus 261 kkr~~eMEr~A 271 (309)
+|+++++|++-
T Consensus 54 ~k~l~~le~e~ 64 (68)
T PF06305_consen 54 RKELKKLEKEL 64 (68)
T ss_pred HHHHHHHHHHH
Confidence 34444454443
No 25
>PF15102 TMEM154: TMEM154 protein family
Probab=52.96 E-value=4 Score=36.31 Aligned_cols=56 Identities=25% Similarity=0.265 Sum_probs=31.2
Q ss_pred hhHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHhcccccccceeeeccccCcccccccCCcccccC
Q 037939 238 VGAAVGAFLLGLLLVAMFVKVKKKARMEELERRAYEEEALQVSMVGHIRAPTASVTRTVPTIEQY 302 (309)
Q Consensus 238 vGg~~glvlLgll~v~~~vr~~rkkr~~eMEr~A~~gE~L~~~~VG~sraPsA~~tRTqP~lE~~ 302 (309)
|..++ ++||-+++|+. +.+.||||..+=+-+...-+++|.-=.|+-+-| +|++|+|
T Consensus 63 IP~VL-LvlLLl~vV~l-v~~~kRkr~K~~~ss~gsq~~~qt~e~~~Env~-------~PiFEed 118 (146)
T PF15102_consen 63 IPLVL-LVLLLLSVVCL-VIYYKRKRTKQEPSSQGSQSALQTYELGSENVK-------VPIFEED 118 (146)
T ss_pred HHHHH-HHHHHHHHHHh-eeEEeecccCCCCccccccccccccccCccccc-------ccccccC
Confidence 55433 33333334433 444455554444556667777887777776665 3577775
No 26
>PF05545 FixQ: Cbb3-type cytochrome oxidase component FixQ; InterPro: IPR008621 This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon [].
Probab=52.27 E-value=20 Score=25.49 Aligned_cols=19 Identities=21% Similarity=0.391 Sum_probs=13.0
Q ss_pred HHHHHHHHHHHHHHhcccc
Q 037939 256 VKVKKKARMEELERRAYEE 274 (309)
Q Consensus 256 vr~~rkkr~~eMEr~A~~g 274 (309)
-+.++|++.+++.+-.-++
T Consensus 30 ~~~~~k~~~e~aa~lpl~d 48 (49)
T PF05545_consen 30 YRPRNKKRFEEAANLPLDD 48 (49)
T ss_pred HcccchhhHHHHHccCccC
Confidence 3667788888887655443
No 27
>PHA03282 envelope glycoprotein E; Provisional
Probab=51.96 E-value=16 Score=38.30 Aligned_cols=25 Identities=28% Similarity=0.570 Sum_probs=18.7
Q ss_pred eEEEEec----CCCceeEeeccce-----eEEEE
Q 037939 187 KVTLTNQ----VSPYVCVARKHGH-----FGLVT 211 (309)
Q Consensus 187 ~~~~~~~----~~~nvC~~~~~GH-----fslVV 211 (309)
..+|+|. .+.|||..+-.|| |.+|.
T Consensus 334 dL~F~nAp~saSGLYVfVl~yNGHVeAW~YtlVS 367 (540)
T PHA03282 334 NLQLRDASEASGGLYVCVVYVNGHVHAWGHVVIS 367 (540)
T ss_pred ceEecCCCcccCceEEEEEEECCeeeeeEEEEEe
Confidence 3567664 5689999999999 65554
No 28
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=49.65 E-value=24 Score=38.87 Aligned_cols=40 Identities=20% Similarity=0.212 Sum_probs=21.3
Q ss_pred ceEEEeecchhHHHHHHHHHHHHHHhhH-HHHHHHHHHHHHH
Q 037939 229 RWKLAVGTTVGAAVGAFLLGLLLVAMFV-KVKKKARMEELER 269 (309)
Q Consensus 229 ~Wk~ivg~vvGg~~glvlLgll~v~~~v-r~~rkkr~~eMEr 269 (309)
.|+.|+.++++++++++ |+||++|-++ ...-++...+||.
T Consensus 712 ~~~~i~lgvv~~ivlig-l~llliwkll~~~~DrrE~akFe~ 752 (783)
T KOG1226|consen 712 NILAIVLGVVAGIVLIG-LALLLIWKLLTTIHDRREFAKFEK 752 (783)
T ss_pred cEeeehHHHHHHHHHHH-HHHHHHHHHhheecccHHhhhhhH
Confidence 78788877777654333 3333443322 2234555566654
No 29
>PF08693 SKG6: Transmembrane alpha-helix domain; InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation [, ]. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins [, ]. The full-length AXL2 protein has a negative regulatory function in cytokinesis [].
Probab=49.34 E-value=11 Score=26.73 Aligned_cols=21 Identities=33% Similarity=0.408 Sum_probs=9.1
Q ss_pred ceEEEeecchhHHHHHHHHHHHH
Q 037939 229 RWKLAVGTTVGAAVGAFLLGLLL 251 (309)
Q Consensus 229 ~Wk~ivg~vvGg~~glvlLgll~ 251 (309)
.-.+++|-++. ++++++.+++
T Consensus 10 ~vaIa~~VvVP--V~vI~~vl~~ 30 (40)
T PF08693_consen 10 TVAIAVGVVVP--VGVIIIVLGA 30 (40)
T ss_pred eEEEEEEEEec--hHHHHHHHHH
Confidence 44445554444 3344444333
No 30
>PF15099 PIRT: Phosphoinositide-interacting protein family
Probab=49.32 E-value=13 Score=32.45 Aligned_cols=9 Identities=33% Similarity=0.508 Sum_probs=5.1
Q ss_pred HHHHHHHHH
Q 037939 242 VGAFLLGLL 250 (309)
Q Consensus 242 ~glvlLgll 250 (309)
+||.||+.-
T Consensus 90 ~GLmlL~~~ 98 (129)
T PF15099_consen 90 LGLMLLACS 98 (129)
T ss_pred HHHHHHHhh
Confidence 456666555
No 31
>PF14575 EphA2_TM: Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=48.44 E-value=5.5 Score=31.24 Aligned_cols=9 Identities=0% Similarity=0.154 Sum_probs=3.8
Q ss_pred EeecchhHH
Q 037939 233 AVGTTVGAA 241 (309)
Q Consensus 233 ivg~vvGg~ 241 (309)
|++++++|+
T Consensus 2 ii~~~~~g~ 10 (75)
T PF14575_consen 2 IIASIIVGV 10 (75)
T ss_dssp HHHHHHHHH
T ss_pred EEehHHHHH
Confidence 344444443
No 32
>PF12904 Collagen_bind_2: Putative collagen-binding domain of a collagenase; InterPro: IPR024749 This domain is likely to be the collagen-binding domain of a family of bacterial collagenase enzymes. The structure of one family member, Q8A905 from SWISSPROT, has been characterised. The domain occurs in the C-terminal region of the protein.; PDB: 3KZS_D.
Probab=46.13 E-value=97 Score=25.37 Aligned_cols=32 Identities=31% Similarity=0.466 Sum_probs=22.1
Q ss_pred eeCCCceEEEcCCCccccCCCCCcceeEEEcc-CCeEEE
Q 037939 153 LANGKPITIDFRNTTRVTNISGIKPFCANFQR-DGKVTL 190 (309)
Q Consensus 153 ~a~~~PI~V~F~~~~~~~~~~~~~~~Cv~F~~-~G~~~~ 190 (309)
...|+||+|+..+++ +...++.|||. +|+.+.
T Consensus 31 ~~~Gr~~~vdl~~l~------g~~~~a~WfdPR~G~~~~ 63 (93)
T PF12904_consen 31 TPTGRPFTVDLSKLS------GKKVKAWWFDPRTGKYTY 63 (93)
T ss_dssp ESS---EEEEGGGSS-------SEEEEEEEETTT-BEEE
T ss_pred CCCCCEEEEEccccc------CCceeEEEEcCCCCCEEE
Confidence 556889999988764 34679999999 999886
No 33
>PF06305 DUF1049: Protein of unknown function (DUF1049); InterPro: IPR010445 This entry consists of several hypothetical bacterial proteins of unknown function.
Probab=45.92 E-value=28 Score=25.57 Aligned_cols=12 Identities=17% Similarity=0.545 Sum_probs=4.9
Q ss_pred HHHHHHHHHHHH
Q 037939 258 VKKKARMEELER 269 (309)
Q Consensus 258 ~~rkkr~~eMEr 269 (309)
++.|+++.+.++
T Consensus 44 ~~~r~~~~~~~k 55 (68)
T PF06305_consen 44 LRLRRRIRRLRK 55 (68)
T ss_pred HHHHHHHHHHHH
Confidence 344444444443
No 34
>COG3216 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=45.55 E-value=32 Score=31.75 Aligned_cols=85 Identities=20% Similarity=0.222 Sum_probs=38.4
Q ss_pred EEEEecC-CCcee-EeeccceeEEEEeCCCCCCCc-hh--ccccc-cceEE-EeecchhHHHHHHHHHHHHH----HhhH
Q 037939 188 VTLTNQV-SPYVC-VARKHGHFGLVTKYPPPSEGP-EQ--VRKKI-SRWKL-AVGTTVGAAVGAFLLGLLLV----AMFV 256 (309)
Q Consensus 188 ~~~~~~~-~~nvC-~~~~~GHfslVV~~~~~~~~~-~~--~~~k~-~~Wk~-ivg~vvGg~~glvlLgll~v----~~~v 256 (309)
+.++|.. -|-+- .++..||+=+=.+..+-+|.. .. .--+. +-|-- ..+.++|++++-.+.+++.- +.+.
T Consensus 87 t~l~NPlT~P~I~~~tyelG~~ll~~~~~s~~~~~l~~~~~~~el~~lw~P~l~pm~vgav~~~a~~~ll~y~~~r~~v~ 166 (184)
T COG3216 87 TWLANPLTMPFIWGATYELGAWLLQRPAQSVGPVHLTWMWQSLELSSLWGPVLKPMLVGAVPAGAIGGLLFYGLTRYSVT 166 (184)
T ss_pred hHhcCCcccceeeeeeHhhhhHHhcCCCCCCCchHHHHHHHHhHHHHhcchHHHHHHHhhHHHHHHHHHHHHHHHHHHHH
Confidence 4455653 34444 588999986655433311110 00 00111 23422 23445555433222222221 2223
Q ss_pred HHHHHHHHHHHHHhcc
Q 037939 257 KVKKKARMEELERRAY 272 (309)
Q Consensus 257 r~~rkkr~~eMEr~A~ 272 (309)
+-.+||+.+.|||++-
T Consensus 167 ~f~~rR~~~~~~~~~~ 182 (184)
T COG3216 167 RFRERRRRSLAERAAL 182 (184)
T ss_pred HHHHHHHHHHHhhhcc
Confidence 4467777778888764
No 35
>PF01825 GPS: Latrophilin/CL-1-like GPS domain; InterPro: IPR000203 This domain has been termed the GPS domain (for GPCR proteolytic site), because it contains a cleavage site in O97830 from SWISSPROT latrophilin []. However this region in latrophilin is found in many otherwise unrelated cell surface receptors []. There is no evidence currently that this domain provides a cleavage site in any of the other receptors. However the peptide bond that is cleaved in latrophilin is between Leu and Thr residues that are conserved in some of the other receptors []. GPS domains are about 50 residues long and contain either 2 or 4 cysteine residues that are likely to form disulphide bridges. Based on conservation of these cysteines the following pairing can be predicted. +-----------------+ | | +-----------------+---------------+ | | | | | XXXCXXXXXXXXXXXXXXXXXCXXXXXXXXXXXXXXXCXCXXLTXXXXXXX ^ cleavage site ; GO: 0007218 neuropeptide signaling pathway, 0016020 membrane; PDB: 4DLQ_B 4DLO_B.
Probab=43.57 E-value=63 Score=22.38 Aligned_cols=36 Identities=19% Similarity=0.443 Sum_probs=24.8
Q ss_pred cceeEEEcc-------CCeEEEEecCCCceeEeeccceeEEEE
Q 037939 176 KPFCANFQR-------DGKVTLTNQVSPYVCVARKHGHFGLVT 211 (309)
Q Consensus 176 ~~~Cv~F~~-------~G~~~~~~~~~~nvC~~~~~GHfslVV 211 (309)
.+.|+.+|. +|=.........-.|.-.....|++.+
T Consensus 2 ~~~C~~Wd~~~~~W~~~GC~~~~~~~~~~~C~C~HlT~Favlm 44 (44)
T PF01825_consen 2 NPQCVYWDEGTGSWSSDGCQVVESSNSHVTCSCNHLTSFAVLM 44 (44)
T ss_dssp EEEEEEEETTTEEEE-TTEEEEEEETTEEEEEECC-SEEEEEE
T ss_pred CCEEEEeeCCCCCCCcccccEecccCCCEEEEeeCCCcEeEeC
Confidence 578999994 564443333446789999999999864
No 36
>smart00218 ZU5 Domain present in ZO-1 and Unc5-like netrin receptors. Domain of unknown function.
Probab=40.85 E-value=68 Score=26.88 Aligned_cols=75 Identities=24% Similarity=0.244 Sum_probs=44.8
Q ss_pred EEeecCcccccCceeeeeEeCCcceeccceeEEE-EEEeecCCCccccccccCCC-CCCeEeceeeeeeeecCCCCcCCC
Q 037939 68 VRFRCGSLRRYGARVKEFHLGIGVIVQPCVERVV-VVRQNLGYNWSSIYYANYDL-SGYQLVSPVLGILAYNSVTDVNFN 145 (309)
Q Consensus 68 vRlRsgSLrr~G~~~~eF~IP~gv~~~P~v~Rv~-lVyqnLG~NwSs~yy~~y~l-pGY~lvsPVlGllaYdas~~~~~~ 145 (309)
+=-|.|.|+-..... ...||||.++++--.-+. .|.|+.- .. .++ .|-+++|||+=.=
T Consensus 9 ~d~~GG~L~~~~~Gv-~L~IPpgAi~~~~~~~iyl~v~~~~~-----~~---p~l~~~e~llSpvV~cG----------- 68 (104)
T smart00218 9 FDARGGRLRGPRTGV-RLIIPPGAIPQGTRYTCYLVVHKTLS-----TP---PPLEEGETLLSPVVECG----------- 68 (104)
T ss_pred EeCCCCEEEeCCCCe-EEEeCCCCCCCCCEEEEEEEEecCcC-----CC---CccCCCcEeeCCeEEEC-----------
Confidence 334566666554322 388999999998543333 3444433 12 234 4789999997431
Q ss_pred CceEEEEeeCCCceEEEcCCC
Q 037939 146 NRFELQILANGKPITIDFRNT 166 (309)
Q Consensus 146 n~~el~i~a~~~PI~V~F~~~ 166 (309)
+.-+.+ .+||.++|++-
T Consensus 69 -P~G~~f---~~PViL~vPHc 85 (104)
T smart00218 69 -PHGALF---LRPVILEVPHC 85 (104)
T ss_pred -CCCCeE---cCCEEEecccc
Confidence 112333 57999988864
No 37
>PF04995 CcmD: Heme exporter protein D (CcmD); InterPro: IPR007078 The CcmD protein is part of a C-type cytochrome biogenesis operon []. The exact function of this protein is uncertain. It has been proposed that CcmC, CcmD and CcmE interact directly with each other, establishing a cytoplasm to periplasm haem delivery pathway for cytochrome c maturation []. This protein is found fused to CcmE in P52224 from SWISSPROT. These proteins contain a predicted transmembrane helix.; GO: 0006810 transport, 0016021 integral to membrane
Probab=40.39 E-value=65 Score=22.80 Aligned_cols=30 Identities=20% Similarity=0.356 Sum_probs=22.2
Q ss_pred HHHHHHHHHHHhhHHHHHHHHHHHHHHhcc
Q 037939 243 GAFLLGLLLVAMFVKVKKKARMEELERRAY 272 (309)
Q Consensus 243 glvlLgll~v~~~vr~~rkkr~~eMEr~A~ 272 (309)
.++|+++++.....+.+-++++++.|++-+
T Consensus 15 ~~~l~~l~~~~~~~~r~~~~~l~~~~~r~~ 44 (46)
T PF04995_consen 15 ALVLAGLIVWSLRRRRRLRKELKRLEAREQ 44 (46)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHhHc
Confidence 355566666667789988899999988753
No 38
>PLN00113 leucine-rich repeat receptor-like protein kinase; Provisional
Probab=40.24 E-value=22 Score=38.28 Aligned_cols=13 Identities=31% Similarity=0.654 Sum_probs=6.7
Q ss_pred cceEEEeecchhH
Q 037939 228 SRWKLAVGTTVGA 240 (309)
Q Consensus 228 ~~Wk~ivg~vvGg 240 (309)
..|+++++.++|+
T Consensus 626 ~~~~~~~~~~~~~ 638 (968)
T PLN00113 626 PSWWFYITCTLGA 638 (968)
T ss_pred ceeeeehhHHHHH
Confidence 3565555544444
No 39
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated [].
Probab=39.38 E-value=26 Score=33.26 Aligned_cols=26 Identities=19% Similarity=0.364 Sum_probs=12.2
Q ss_pred eEEEeecchhHHHHHHHHHHHHHHhhHHH
Q 037939 230 WKLAVGTTVGAAVGAFLLGLLLVAMFVKV 258 (309)
Q Consensus 230 Wk~ivg~vvGg~~glvlLgll~v~~~vr~ 258 (309)
-++++| +|+|+++++|+-+ ++++ +|+
T Consensus 37 ~~I~ia-iVAG~~tVILVI~-i~v~-vR~ 62 (221)
T PF08374_consen 37 VKIMIA-IVAGIMTVILVIF-IVVL-VRY 62 (221)
T ss_pred eeeeee-eecchhhhHHHHH-HHHH-HHH
Confidence 344444 4555555555443 3333 453
No 40
>PHA03265 envelope glycoprotein D; Provisional
Probab=38.13 E-value=14 Score=37.40 Aligned_cols=27 Identities=19% Similarity=0.415 Sum_probs=16.0
Q ss_pred hHHHHHHHHHHHHHHhhHHHHHHHHHHHHH
Q 037939 239 GAAVGAFLLGLLLVAMFVKVKKKARMEELE 268 (309)
Q Consensus 239 Gg~~glvlLgll~v~~~vr~~rkkr~~eME 268 (309)
||++||++.|.++..+ |+|||...+=|
T Consensus 356 ~~i~glv~vg~il~~~---~rr~k~~~k~~ 382 (402)
T PHA03265 356 LGIAGLVLVGVILYVC---LRRKKELKKSA 382 (402)
T ss_pred cchhhhhhhhHHHHHH---hhhhhhhhhhh
Confidence 3477788888777644 44444444433
No 41
>PF05283 MGC-24: Multi-glycosylated core protein 24 (MGC-24); InterPro: IPR007947 CD164 is a mucin-like receptor, or sialomucin, with specificity in receptor/ligand interactions that depends on the structural characteristics of the mucin-like receptor. Its functions include mediating, or regulating, haematopoietic progenitor cell adhesion and the negative regulation of their growth and/or differentiation. It exists in the native state as a disulphide-linked homodimer of two 80-85kDa subunits. It is usually expressed by CD34+ and CD34lo/- haematopoietic stem cells and associated microenvironmental cells. It contains, in its extracellular region, two mucin domains (I and II) linked by a non-mucin domain, which has been predicted to contain intra-disulphide bridges. This receptor may play a key role in haematopoiesis by facilitating the adhesion of human CD34+ cells to bone marrow stroma and by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell proliferation. These effects involve the CD164 class I and/or II epitopes recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These epitopes are carbohydrate-dependent and are located on the N-terminal mucin domain I [, ]. It has been found that murine MGC-24v and rat endolyn share significant sequence similarities with human CD164. However, CD164 lacks the consensus glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible that GAG-association is responsible for the high molecular weight of the epithelial-derived MGC-24 glycoprotein []. Genomic structure studies have placed CD164 within the mucin-subgroup that comprises multiple exons, and demonstrate the diverse chromosomal distribution of this family of molecules. Molecules with such multiple exons may have sophisticated regulatory mechanisms that involve not only post-translational modifications of the oligosaccharide side chains, but also differential exon usage. Although differences in the intron and exon sizes are seen between the mouse and human genes, the predicted proteins are similar in size and structure, maintaining functionally important motifs that regulate cell proliferation or subcellular distribution []. CD164 is a gene whose expression depends on differential usage of polyadenylation sites within the 3'-UTR. The conserved distribution of the 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that (i) a mechanism may exist to regulate tissue-specific polyadenylation, and (ii) differences in polyadenylation are important for the expression and function of CD164 in different tissues. Two other aspects of the structure of CD164 are of particular interest. First, it shares one of several conserved features of a cytokine-binding pocket - in this respect, it is notable that evidence exists for a class of cell-surface sialomucin modulators that directly interact with growth factor receptors to regulate their response to physiological ligands. Second, its cytoplasmic tail contains a C-terminal YHTL motif found in many endocytic membrane proteins or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate the sorting of membrane proteins into transport vesicles from the plasma membrane to the endosomes, and between intracellular compartments.
Probab=37.32 E-value=25 Score=32.42 Aligned_cols=18 Identities=17% Similarity=0.294 Sum_probs=11.1
Q ss_pred eecchhHHHH-HHHHHHHH
Q 037939 234 VGTTVGAAVG-AFLLGLLL 251 (309)
Q Consensus 234 vg~vvGg~~g-lvlLgll~ 251 (309)
.+|+|||++. |+||++++
T Consensus 160 ~~SFiGGIVL~LGv~aI~f 178 (186)
T PF05283_consen 160 AASFIGGIVLTLGVLAIIF 178 (186)
T ss_pred hhhhhhHHHHHHHHHHHHH
Confidence 5789999643 55555433
No 42
>PF11368 DUF3169: Protein of unknown function (DUF3169); InterPro: IPR021509 Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function.
Probab=36.29 E-value=14 Score=34.41 Aligned_cols=15 Identities=20% Similarity=0.448 Sum_probs=9.9
Q ss_pred eEEEeecchhHHHHH
Q 037939 230 WKLAVGTTVGAAVGA 244 (309)
Q Consensus 230 Wk~ivg~vvGg~~gl 244 (309)
+++++|+++||++|.
T Consensus 13 ~~illg~~iGg~~G~ 27 (248)
T PF11368_consen 13 LLILLGGLIGGFIGF 27 (248)
T ss_pred HHHHHHHHHHHHHHH
Confidence 345677777777663
No 43
>PF11166 DUF2951: Protein of unknown function (DUF2951); InterPro: IPR021337 This family of proteins has no known function. It has a highly conserved sequence.
Probab=34.85 E-value=7.3 Score=32.58 Aligned_cols=28 Identities=29% Similarity=0.425 Sum_probs=18.3
Q ss_pred ccccccceEEEeecchhHHHHHHHHHHH
Q 037939 223 VRKKISRWKLAVGTTVGAAVGAFLLGLL 250 (309)
Q Consensus 223 ~~~k~~~Wk~ivg~vvGg~~glvlLgll 250 (309)
++|+..-||.|+-+.||.++|.+++|+|
T Consensus 65 n~Knir~~KmwilGlvgTi~gsliia~l 92 (98)
T PF11166_consen 65 NDKNIRDIKMWILGLVGTIFGSLIIALL 92 (98)
T ss_pred HHhhHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4455566777776777777766666654
No 44
>PF12191 stn_TNFRSF12A: Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain; InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved []. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration []. The receptor may also play a role in cell-matrix interactions [].; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=33.89 E-value=14 Score=32.42 Aligned_cols=13 Identities=0% Similarity=-0.278 Sum_probs=1.7
Q ss_pred cceEEEeecchhH
Q 037939 228 SRWKLAVGTTVGA 240 (309)
Q Consensus 228 ~~Wk~ivg~vvGg 240 (309)
.+|-|.+++.+++
T Consensus 75 ~~l~~pi~~sal~ 87 (129)
T PF12191_consen 75 FPLLWPILGSALS 87 (129)
T ss_dssp SSSS---------
T ss_pred cceehhhhhhHHH
Confidence 3444444333443
No 45
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=32.05 E-value=53 Score=27.00 Aligned_cols=22 Identities=27% Similarity=0.309 Sum_probs=13.2
Q ss_pred HHHHHHHHhcccccccceeeec
Q 037939 262 ARMEELERRAYEEEALQVSMVG 283 (309)
Q Consensus 262 kr~~eMEr~A~~gE~L~~~~VG 283 (309)
++.+++|..+++.|+=+....|
T Consensus 26 ~~~~~~~~~~~~~~v~~~~~~g 47 (95)
T PF07172_consen 26 RELEETEKEEEENEVQDDKYGG 47 (95)
T ss_pred HHhhhccccccCCCCCccccCC
Confidence 3347777777777655544433
No 46
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas [, , ]. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=31.61 E-value=45 Score=36.26 Aligned_cols=17 Identities=24% Similarity=0.106 Sum_probs=11.6
Q ss_pred EeCCcceeccceeEEEEEEeecC
Q 037939 86 HLGIGVIVQPCVERVVVVRQNLG 108 (309)
Q Consensus 86 ~IP~gv~~~P~v~Rv~lVyqnLG 108 (309)
.-|+-.+.+|+ =|.||.
T Consensus 112 ~~P~l~IAEp~------~Yp~Ln 128 (684)
T PF12877_consen 112 GYPPLQIAEPF------HYPNLN 128 (684)
T ss_pred cCCceeecccc------ccCccc
Confidence 34888888875 466664
No 47
>PHA03286 envelope glycoprotein E; Provisional
Probab=31.45 E-value=83 Score=33.03 Aligned_cols=20 Identities=35% Similarity=0.459 Sum_probs=16.2
Q ss_pred eEEEEec----CCCceeEeeccce
Q 037939 187 KVTLTNQ----VSPYVCVARKHGH 206 (309)
Q Consensus 187 ~~~~~~~----~~~nvC~~~~~GH 206 (309)
+.+|+|. .+-|||..+=.||
T Consensus 308 nL~f~nA~~s~SGLYVfVl~yNGH 331 (492)
T PHA03286 308 SFRLANAQPTDAGLYVVVALYNGR 331 (492)
T ss_pred ceeecCCCcccCceEEEEEEECCc
Confidence 4667765 4689999999999
No 48
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=31.18 E-value=14 Score=32.58 Aligned_cols=12 Identities=17% Similarity=0.581 Sum_probs=7.9
Q ss_pred eeEEEccCCeEE
Q 037939 178 FCANFQRDGKVT 189 (309)
Q Consensus 178 ~Cv~F~~~G~~~ 189 (309)
.|-++|.+|...
T Consensus 5 ~C~y~d~~g~~~ 16 (179)
T PF13908_consen 5 YCHYYDVMGQWD 16 (179)
T ss_pred cceeecCCCCCc
Confidence 566777777654
No 49
>PF14880 COX14: Cytochrome oxidase c assembly
Probab=30.73 E-value=88 Score=23.33 Aligned_cols=7 Identities=29% Similarity=0.192 Sum_probs=2.9
Q ss_pred eecchhH
Q 037939 234 VGTTVGA 240 (309)
Q Consensus 234 vg~vvGg 240 (309)
|.+.+|+
T Consensus 18 V~~Lig~ 24 (59)
T PF14880_consen 18 VLGLIGF 24 (59)
T ss_pred HHHHHHH
Confidence 4444443
No 50
>PF06212 GRIM-19: GRIM-19 protein; InterPro: IPR009346 This family consists of several eukaryotic gene associated with retinoic-interferon-induced mortality 19 (GRIM-19) proteins. GRIM-19 was reported to encode a small protein that is primarily distributed in the nucleus and is able to promote cell death induced by IFN-beta and RA. A bovine homologue of GRIM-19 was co-purified with mitochondrial NADH:ubiquinone oxidoreductase (complex I) in bovine heart. Therefore, its exact cellular localisation and function are unclear. It has now been discovered that GRIM-19 is a specific interacting protein which negatively regulates Stat3 activity [].
Probab=29.90 E-value=1e+02 Score=26.82 Aligned_cols=33 Identities=15% Similarity=0.199 Sum_probs=21.3
Q ss_pred hhHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHh
Q 037939 238 VGAAVGAFLLGLLLVAMFVKVKKKARMEELERR 270 (309)
Q Consensus 238 vGg~~glvlLgll~v~~~vr~~rkkr~~eMEr~ 270 (309)
+|+++|+...|+-.+.-..|.++.-++|+++.+
T Consensus 35 ~~~~~~~~~~G~y~~~~~~r~~r~~~~E~~~ar 67 (130)
T PF06212_consen 35 FAGGAGIMAYGFYKVGQGNRERRELKREKRWAR 67 (130)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhH
Confidence 555566666776655555566777777777765
No 51
>PF01589 Alpha_E1_glycop: Alphavirus E1 glycoprotein; InterPro: IPR002548 Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses []. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 (IPR000936 from INTERPRO) and E3 (IPR002533 from INTERPRO) causes a change in the viral surface. Together the E1, E2, and sometimes E3, glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike []. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. The alphaviral glycoprotein E1 is a class II viral fusion protein, which is structurally different from the class I fusion proteins found in influenza virus and HIV. The structure of the Semliki Forest virus revealed a structure that is similar to that of flaviviral glycoprotein E, with three structural domains in the same primary sequence arrangement []. This entry represents all three domains of the alphaviral E1 glycoprotein.; GO: 0004252 serine-type endopeptidase activity, 0019028 viral capsid, 0055036 virion membrane; PDB: 2YEW_L 1LD4_P 1Z8Y_K 3MUU_B 3N44_F 2XFB_F 3N42_F 2XFC_H 3N40_F 3N41_F ....
Probab=29.90 E-value=40 Score=35.19 Aligned_cols=21 Identities=19% Similarity=0.504 Sum_probs=15.1
Q ss_pred eecchhHHHHHHHHHHHHHHh
Q 037939 234 VGTTVGAAVGAFLLGLLLVAM 254 (309)
Q Consensus 234 vg~vvGg~~glvlLgll~v~~ 254 (309)
++..+||...++++|++++.+
T Consensus 474 l~~l~GG~s~li~i~lii~~~ 494 (502)
T PF01589_consen 474 LQALFGGASSLIIIGLIILVC 494 (502)
T ss_dssp HCTTSSHHHHHHHHHHHHHHH
T ss_pred HHHHhcchHHHHHHHHHHHHh
Confidence 677788877777777766544
No 52
>KOG3838 consensus Mannose lectin ERGIC-53, involved in glycoprotein traffic [Intracellular trafficking, secretion, and vesicular transport]
Probab=29.53 E-value=1.2e+02 Score=31.51 Aligned_cols=138 Identities=22% Similarity=0.257 Sum_probs=71.6
Q ss_pred HHHHHHhhcccc--cccCCCChhHHHHHHHHHHHHh--cccCCCccCceEeee-CCCCccCCeEE---EEEeecCccccc
Q 037939 7 LILTILSLASLS--ETQGVKSTRVLDLLIRDYTFKS--LDNHAIKTGNLHNVH-LPANLSGIKVD---MVRFRCGSLRRY 78 (309)
Q Consensus 7 ~~~~~~~l~~~~--~~q~~~~~~~LD~~lqd~A~k~--l~~~~~~TG~~y~~~-LPsNlSGi~Vs---avRlRsgSLrr~ 78 (309)
+|++||+|+-+. .+++....|- +=..|.||. |+-+ .-++++=.+ =-+-.|+=.|. -+|=|+|+.|.|
T Consensus 11 f~~lLllLa~~~~~~~~~~~~~rr---FEYK~SFk~P~Laq~--dgtiPFW~~~GdAIas~eqvRlaPSmrsrkGavWtk 85 (497)
T KOG3838|consen 11 FCALLLLLAPHVPETGCGTPPHRR---FEYKYSFKGPRLAQP--DGTIPFWSHHGDAIASSEQVRLAPSMRSRKGAVWTK 85 (497)
T ss_pred HHHHHHHccCcCcccccCCCccce---eeeeecccCCccccC--CCCcceeeecCcccccccceeeccccccccCceeec
Confidence 455556565444 6666544432 334555554 3211 233443322 11233444444 689999999999
Q ss_pred C-ceeeeeEeCCcceeccceeEEEEEEeecCCCccccccccCCCCCCeEeceeeee--------eeecCCCCcCCCCceE
Q 037939 79 G-ARVKEFHLGIGVIVQPCVERVVVVRQNLGYNWSSIYYANYDLSGYQLVSPVLGI--------LAYNSVTDVNFNNRFE 149 (309)
Q Consensus 79 G-~~~~eF~IP~gv~~~P~v~Rv~lVyqnLG~NwSs~yy~~y~lpGY~lvsPVlGl--------laYdas~~~~~~n~~e 149 (309)
- ++|..|.+-.--.+..- -+.|-+-=.+||+ .|--++-||+|= +.+|.-+|-...|.+.
T Consensus 86 a~~~fe~weVev~~rVtGr--------GRiGAdGlaiWYt----~~~G~~GpVfGg~d~WnGigiffDSfdnD~qknnP~ 153 (497)
T KOG3838|consen 86 ASVPFENWEVEVQFRVTGR--------GRIGADGLAIWYT----RGRGHVGPVFGGLDSWNGIGIFFDSFDNDGQKNNPA 153 (497)
T ss_pred ccCCcccceEEEEEEeccc--------ccccCCceEEEEe----cCCCcccccccccccccceEEEeecccccCCcCCcc
Confidence 7 56776654332222211 1122222235664 233478888863 2455543333456688
Q ss_pred EEEeeCCCceEE
Q 037939 150 LQILANGKPITI 161 (309)
Q Consensus 150 l~i~a~~~PI~V 161 (309)
|+++.+..-|.-
T Consensus 154 Is~~lndGt~~y 165 (497)
T KOG3838|consen 154 ISVLLNDGTIPY 165 (497)
T ss_pred EEEEecCCcccc
Confidence 888877765544
No 53
>PF05620 DUF788: Protein of unknown function (DUF788); InterPro: IPR008506 This family consists of several eukaryotic proteins of unknown function.
Probab=29.16 E-value=27 Score=31.02 Aligned_cols=16 Identities=44% Similarity=0.565 Sum_probs=13.2
Q ss_pred HHHHHHHHHHHHhccc
Q 037939 258 VKKKARMEELERRAYE 273 (309)
Q Consensus 258 ~~rkkr~~eMEr~A~~ 273 (309)
.+++||.++|||+++.
T Consensus 155 ~~~sKrq~K~err~~K 170 (170)
T PF05620_consen 155 EAKSKRQEKMERRANK 170 (170)
T ss_pred ccccHHHHHHHHhccC
Confidence 4578999999999863
No 54
>PF09926 DUF2158: Uncharacterized small protein (DUF2158); InterPro: IPR019226 This entry represents a family of predominantly prokaryotic proteins with no known function.
Probab=27.95 E-value=1.1e+02 Score=22.77 Aligned_cols=33 Identities=21% Similarity=0.262 Sum_probs=19.8
Q ss_pred eeCCCceEEEcCCCccccCCCCCcceeEEEccCCe
Q 037939 153 LANGKPITIDFRNTTRVTNISGIKPFCANFQRDGK 187 (309)
Q Consensus 153 ~a~~~PI~V~F~~~~~~~~~~~~~~~Cv~F~~~G~ 187 (309)
-.+|-+.+|..-.-. ....+....|+|||.+|.
T Consensus 10 KSGGp~MTV~~v~~~--~~~~~~~v~C~WFd~~~~ 42 (53)
T PF09926_consen 10 KSGGPRMTVTEVGPN--AGASGGWVECQWFDGHGE 42 (53)
T ss_pred ccCCCCeEEEEcccc--ccCCCCeEEEEeCCCCCc
Confidence 334456677632111 123345889999999886
No 55
>PF03264 Cytochrom_NNT: NapC/NirT cytochrome c family, N-terminal region; InterPro: IPR005126 Within the NapC/NirT family of cytochrome c proteins, some members, such as NapC P33932 from SWISSPROT and NirT P24038 from SWISSPROT, bind four haem groups, while others, such as TorC P33226 from SWISSPROT, bind five haems. This family aligns the common N-terminal region that contains four haem-binding C-X(2)-CH motifs.; PDB: 2VR0_F 2J7A_C.
Probab=27.69 E-value=45 Score=29.23 Aligned_cols=20 Identities=40% Similarity=0.800 Sum_probs=6.8
Q ss_pred ceEEEeecchhHHHHHHHHH
Q 037939 229 RWKLAVGTTVGAAVGAFLLG 248 (309)
Q Consensus 229 ~Wk~ivg~vvGg~~glvlLg 248 (309)
+|++++..++|.++|+++++
T Consensus 4 ~~~~~~~~~~~~~~~~~~~~ 23 (173)
T PF03264_consen 4 KWSLLLLLLVGVVLGVVLLG 23 (173)
T ss_dssp ------HSTTCHHHHHHHHH
T ss_pred hhHHHHHHHHHHHHHHHHHH
Confidence 55554444455554444444
No 56
>PF10183 ESSS: ESSS subunit of NADH:ubiquinone oxidoreductase (complex I); InterPro: IPR019329 NADH:ubiquinone oxidoreductase (complex I) (1.6.5.3 from EC) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) []. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea [], mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins []. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters []. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes [, ]. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I []. This entry represents the ESSS subunit from mitochondrial NADH:ubiquinone oxidoreductase (complex I). It carries mitochondrial import sequences [].
Probab=26.88 E-value=79 Score=26.17 Aligned_cols=48 Identities=21% Similarity=0.329 Sum_probs=25.3
Q ss_pred cccccceEEEeecchhHHHHHHHHHHHHHHh----hHHHHHHHHHHHHHHhccc
Q 037939 224 RKKISRWKLAVGTTVGAAVGAFLLGLLLVAM----FVKVKKKARMEELERRAYE 273 (309)
Q Consensus 224 ~~k~~~Wk~ivg~vvGg~~glvlLgll~v~~----~vr~~rkkr~~eMEr~A~~ 273 (309)
+.++..|.. ..++|..++++++++.+... ..-|.+++-.++||++-.+
T Consensus 53 ~~d~e~we~--~~f~~~~~~~v~~~~~~~y~PD~~i~~WA~rEA~~rl~~rEa~ 104 (105)
T PF10183_consen 53 KRDWEGWEL--PFFFGFSGSLVFGGVFLAYKPDTSIQTWARREAYRRLERREAE 104 (105)
T ss_pred cchHhhhHH--HHHHHHHHHHHHHHHHHHcCCCCCHHHHHHHHHHHHHhHHhhc
Confidence 344455543 34456555556655443311 1346777777777755443
No 57
>PF05808 Podoplanin: Podoplanin; InterPro: IPR008783 This family consists of several mammalian podoplanin-like proteins that are thought to specifically control the unique shape of podocytes; GO: 0016021 integral to membrane; PDB: 3IET_X.
Probab=26.69 E-value=21 Score=32.31 Aligned_cols=22 Identities=23% Similarity=0.458 Sum_probs=0.0
Q ss_pred EeecchhHHHHHHHHHHHHHHh
Q 037939 233 AVGTTVGAAVGAFLLGLLLVAM 254 (309)
Q Consensus 233 ivg~vvGg~~glvlLgll~v~~ 254 (309)
+||.++|.++|++|+|-+++++
T Consensus 131 LVGIIVGVLlaIG~igGIIivv 152 (162)
T PF05808_consen 131 LVGIIVGVLLAIGFIGGIIIVV 152 (162)
T ss_dssp ----------------------
T ss_pred eeeehhhHHHHHHHHhheeeEE
Confidence 6788888888888887666654
No 58
>PTZ00229 variable surface protein Vir30; Provisional
Probab=26.55 E-value=30 Score=34.33 Aligned_cols=47 Identities=15% Similarity=0.159 Sum_probs=26.2
Q ss_pred eEEEeecchhHHHHHHHHHHHHH--HhhHHHHHHHHHHHHHHhcccccc
Q 037939 230 WKLAVGTTVGAAVGAFLLGLLLV--AMFVKVKKKARMEELERRAYEEEA 276 (309)
Q Consensus 230 Wk~ivg~vvGg~~glvlLgll~v--~~~vr~~rkkr~~eMEr~A~~gE~ 276 (309)
-++++-+|+|+++|++++-+++- +=+-.+.|++|...||..-..+|.
T Consensus 241 ~~~~~~~~~gs~~gl~~~f~~~YKFTP~G~~l~~~k~k~~~~~~n~~~e 289 (317)
T PTZ00229 241 NKIISTAVTGTIVGLIPLVGILYKFTPMGQLLKPKKGKLMDGQINNDKE 289 (317)
T ss_pred CceeeehhhhhhhhHHhHHhhhhcccchhHHhhhcccchhhhhcccHHH
Confidence 36678888998888777654432 101133455555566543333343
No 59
>COG1862 YajC Preprotein translocase subunit YajC [Intracellular trafficking and secretion]
Probab=24.96 E-value=1.1e+02 Score=25.52 Aligned_cols=16 Identities=19% Similarity=0.146 Sum_probs=7.0
Q ss_pred HHHHHHHHHHHhcccc
Q 037939 259 KKKARMEELERRAYEE 274 (309)
Q Consensus 259 ~rkkr~~eMEr~A~~g 274 (309)
||.|+.++|=-.=-.|
T Consensus 32 Kr~K~~~~ml~sL~kG 47 (97)
T COG1862 32 KRMKEHQELLNSLKKG 47 (97)
T ss_pred HHHHHHHHHHHhccCC
Confidence 4444455544333333
No 60
>smart00303 GPS G-protein-coupled receptor proteolytic site domain. Present in latrophilin/CL-1, sea urchin REJ and polycystin.
Probab=24.92 E-value=2.2e+02 Score=20.11 Aligned_cols=40 Identities=20% Similarity=0.496 Sum_probs=26.8
Q ss_pred cceeEEEcc-------CCeEEEEecCCCceeEeeccceeEEEEeCCC
Q 037939 176 KPFCANFQR-------DGKVTLTNQVSPYVCVARKHGHFGLVTKYPP 215 (309)
Q Consensus 176 ~~~Cv~F~~-------~G~~~~~~~~~~nvC~~~~~GHfslVV~~~~ 215 (309)
.++|+.+|. +|=-.......--.|.-.....|++.+..+|
T Consensus 2 ~~~Cv~W~~~~~~W~~~GC~~~~~~~~~~~C~CnHlT~Favl~~~~p 48 (49)
T smart00303 2 NPICVFWDESSGEWSTRGCELLETNSTHTTCSCNHLTTFAVLMDVPP 48 (49)
T ss_pred CCEEEEecCCCCCCccccCEEEeCCCCEEEEEEEccceEEEeEEecC
Confidence 368999887 3422222223566799999999999886543
No 61
>PF12597 DUF3767: Protein of unknown function (DUF3767); InterPro: IPR022533 This group of proteins includes mitochondrial cytochrome c oxidase proteins and some transmembrane domain-containing proteins of unknown function known as FAM36A. Proteins in this family are typically between 112 and 199 amino acids in length.
Probab=24.69 E-value=1.3e+02 Score=25.54 Aligned_cols=20 Identities=5% Similarity=0.197 Sum_probs=12.1
Q ss_pred HHHhhHHHHHHHHHHHHHHh
Q 037939 251 LVAMFVKVKKKARMEELERR 270 (309)
Q Consensus 251 ~v~~~vr~~rkkr~~eMEr~ 270 (309)
+-+..-|++|++..++|++.
T Consensus 82 ~~we~Cr~~r~~~~~~~~~~ 101 (118)
T PF12597_consen 82 GSWEYCRYNRRKERQQMKRA 101 (118)
T ss_pred HHHHHHHHHHHHHHHHHHHH
Confidence 33443477777777777653
No 62
>TIGR00739 yajC preprotein translocase, YajC subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas.
Probab=24.50 E-value=1.4e+02 Score=23.87 Aligned_cols=21 Identities=10% Similarity=0.126 Sum_probs=14.6
Q ss_pred HHHHHHHHHHHHhcccccccc
Q 037939 258 VKKKARMEELERRAYEEEALQ 278 (309)
Q Consensus 258 ~~rkkr~~eMEr~A~~gE~L~ 278 (309)
.||+|+.+||..+=..|...-
T Consensus 25 kK~~k~~~~m~~~L~~Gd~Vv 45 (84)
T TIGR00739 25 RKRRKAHKKLIESLKKGDKVL 45 (84)
T ss_pred HHHHHHHHHHHHhCCCCCEEE
Confidence 366777788888777776543
No 63
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region, called CR1 (conserved region 1), is found three times in the 49 kDa E3-region protein of Human adenovirus 19 (a subgroup D adenovirus). CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80-amino-acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=24.25 E-value=50 Score=23.38 Aligned_cols=9 Identities=22% Similarity=0.408 Sum_probs=3.7
Q ss_pred HHHHHHHHH
Q 037939 241 AVGAFLLGL 249 (309)
Q Consensus 241 ~~glvlLgl 249 (309)
++|++++.+
T Consensus 13 ~vg~~iiii 21 (38)
T PF02439_consen 13 VVGMAIIII 21 (38)
T ss_pred HHHHHHHHH
Confidence 334444443
No 64
>PF04971 Lysis_S: Lysis protein S; InterPro: IPR007054 The lysis S protein is a cytotoxic protein that forms holes in membranes, causing cell lysis. The action of Lysis S is independent of the proportion of acidic phospholipids in the membrane.
Probab=24.13 E-value=66 Score=25.42 Aligned_cols=20 Identities=20% Similarity=-0.044 Sum_probs=9.7
Q ss_pred eecchhHHHHHHHHHHHHHHh
Q 037939 234 VGTTVGAAVGAFLLGLLLVAM 254 (309)
Q Consensus 234 vg~vvGg~~glvlLgll~v~~ 254 (309)
+.+++||+ .+.+|+++.=+.
T Consensus 35 aIGvi~gi-~~~~lt~ltN~Y 54 (68)
T PF04971_consen 35 AIGVIGGI-FFGLLTYLTNLY 54 (68)
T ss_pred hHHHHHHH-HHHHHHHHhHhh
Confidence 33456653 345556544433
No 65
>PRK05585 yajC preprotein translocase subunit YajC; Validated
Probab=22.90 E-value=1.7e+02 Score=24.43 Aligned_cols=20 Identities=5% Similarity=0.137 Sum_probs=14.1
Q ss_pred HHHHHHHHHHHHhccccccc
Q 037939 258 VKKKARMEELERRAYEEEAL 277 (309)
Q Consensus 258 ~~rkkr~~eMEr~A~~gE~L 277 (309)
.||+|+++||..+=..|...
T Consensus 40 kK~~k~~~~~~~~Lk~Gd~V 59 (106)
T PRK05585 40 QKRQKEHKKMLSSLAKGDEV 59 (106)
T ss_pred HHHHHHHHHHHHhcCCCCEE
Confidence 36667778888877777654
No 66
>PF08370 PDR_assoc: Plant PDR ABC transporter associated; InterPro: IPR013581 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain []. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [, , ]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis [, , , , , ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions []. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This domain is found on the C terminus of ABC-2 type transporter domains (IPR013525 from INTERPRO). 
It seems to be associated with the plant pleiotropic drug resistance (PDR) protein family of ABC transporters. As in yeast, plant PDR ABC transporters may also play a role in the transport of antifungal agents (see also IPR010929 from INTERPRO). The PDR family is characterised by a configuration in which the ABC domain is nearer the N terminus of the protein than the transmembrane domain.
Probab=22.48 E-value=1.2e+02 Score=23.46 Aligned_cols=17 Identities=41% Similarity=0.749 Sum_probs=9.3
Q ss_pred ceEEEeecchhHHHHHHHHH
Q 037939 229 RWKLAVGTTVGAAVGAFLLG 248 (309)
Q Consensus 229 ~Wk~ivg~vvGg~~glvlLg 248 (309)
.|.| +| +|+.+|..+|=
T Consensus 27 ~WyW-Ig--vgaL~G~~vlF 43 (65)
T PF08370_consen 27 YWYW-IG--VGALLGFIVLF 43 (65)
T ss_pred cEEe-eh--HHHHHHHHHHH
Confidence 4544 44 67666655543
No 67
>PF06295 DUF1043: Protein of unknown function (DUF1043); InterPro: IPR009386 This entry consists of several hypothetical bacterial proteins of unknown function.
Probab=22.44 E-value=37 Score=28.93 Aligned_cols=11 Identities=36% Similarity=0.767 Sum_probs=6.0
Q ss_pred EeecchhHHHH
Q 037939 233 AVGTTVGAAVG 243 (309)
Q Consensus 233 ivg~vvGg~~g 243 (309)
++|.+||.++|
T Consensus 3 ~i~lvvG~iiG 13 (128)
T PF06295_consen 3 IIGLVVGLIIG 13 (128)
T ss_pred HHHHHHHHHHH
Confidence 45566665444
No 68
>PF00558 Vpu: Vpu protein; InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=22.17 E-value=73 Score=25.87 Aligned_cols=12 Identities=17% Similarity=0.398 Sum_probs=0.8
Q ss_pred HHHHHHHHHHHH
Q 037939 257 KVKKKARMEELE 268 (309)
Q Consensus 257 r~~rkkr~~eME 268 (309)
.|||-||.++.+
T Consensus 29 eYrk~~rqrkId 40 (81)
T PF00558_consen 29 EYRKIKRQRKID 40 (81)
T ss_dssp ----------CH
T ss_pred HHHHHHHHHhHH
Confidence 445444444443
No 69
>PF02158 Neuregulin: Neuregulin family; InterPro: IPR002154 Neuregulins are a sub-family of EGF-like molecules that have been shown to play multiple essential roles in vertebrate embryogenesis, including cardiac development, Schwann cell and oligodendrocyte differentiation, some aspects of neuronal development, and the formation of neuromuscular synapses. Included in the family are heregulin; neu differentiation factor; acetylcholine receptor synthesis stimulator; glial growth factor; and sensory and motor-neuron derived factor. Multiple family members are generated by alternate splicing or by use of several cell type-specific transcription initiation sites. In general, they bind to and activate the erbB family of receptor tyrosine kinases (erbB2 (HER2), erbB3 (HER3), and erbB4 (HER4)), functioning both as heterodimers and homodimers. The transmembrane forms of neuregulin 1 (NRG1) are present within synaptic vesicles, including those containing glutamate. After exocytosis, NRG1 is in the presynaptic membrane, where the ectodomain of NRG1 may be cleaved off. The ectodomain then migrates across the synaptic cleft and binds to and activates a member of the EGF-receptor family on the postsynaptic membrane. This has been shown to increase the expression of certain glutamate-receptor subunits. NRG1 appears to signal for glutamate-receptor subunit expression, localisation, and/or phosphorylation, facilitating subsequent glutamate transmission. The NRG1 gene has been identified as a potential gene determining susceptibility to schizophrenia by a combination of genetic linkage and association approaches; GO: 0005102 receptor binding, 0009790 embryo development; PDB: 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=22.01 E-value=30 Score=35.40 Aligned_cols=44 Identities=18% Similarity=0.384 Sum_probs=0.0
Q ss_pred ecchhHHHHHHHHHHHHHH-hhHHH-HHHHHHHHHHHhcccccccc
Q 037939 235 GTTVGAAVGAFLLGLLLVA-MFVKV-KKKARMEELERRAYEEEALQ 278 (309)
Q Consensus 235 g~vvGg~~glvlLgll~v~-~~vr~-~rkkr~~eMEr~A~~gE~L~ 278 (309)
-++-|-++|++++|+++|+ +.-|- |++|||++==|+...++...
T Consensus 10 LTITgIcvaLlVVGi~Cvv~aYCKTKKQRkklh~hLkqs~~~~~~n 55 (404)
T PF02158_consen 10 LTITGICVALLVVGIVCVVDAYCKTKKQRKKLHEHLKQSQRSKNYN 55 (404)
T ss_dssp ----------------------------------------------
T ss_pred hhhhhhhHHHHHHHHHHHHHHHHHhHHHHHHHHHHHhhhcccchhh
Confidence 3445556778888888776 54455 44555655555555555554
No 70
>PF15330 SIT: SHP2-interacting transmembrane adaptor protein, SIT
Probab=21.95 E-value=51 Score=27.70 Aligned_cols=31 Identities=10% Similarity=0.020 Sum_probs=0.0
Q ss_pred EEeecchhHHHHHHHHHHHHHHhhHHHHHHHHH
Q 037939 232 LAVGTTVGAAVGAFLLGLLLVAMFVKVKKKARM 264 (309)
Q Consensus 232 ~ivg~vvGg~~glvlLgll~v~~~vr~~rkkr~ 264 (309)
|.+-.++| ++.+++|++-++++ +..+|++|.
T Consensus 1 w~Ll~il~-llLll~l~asl~~w-r~~~rq~k~ 31 (107)
T PF15330_consen 1 WLLLGILA-LLLLLSLAASLLAW-RMKQRQKKA 31 (107)
T ss_pred ChHHHHHH-HHHHHHHHHHHHHH-HHHhhhccc
No 71
>KOG2792 consensus Putative cytochrome C oxidase assembly protein [Energy production and conversion]
Probab=21.44 E-value=1.6e+02 Score=28.92 Aligned_cols=26 Identities=31% Similarity=0.351 Sum_probs=18.6
Q ss_pred HHHHHHHHHHHHhcccccccceeeecc
Q 037939 258 VKKKARMEELERRAYEEEALQVSMVGH 284 (309)
Q Consensus 258 ~~rkkr~~eMEr~A~~gE~L~~~~VG~ 284 (309)
.++|++.+|-||+ +..++++-..||+
T Consensus 95 ~~~k~~~~e~~r~-~~~~~~gk~~iGG 120 (280)
T KOG2792|consen 95 KKEKARLLEKERE-SANRTAGKPAIGG 120 (280)
T ss_pred HHHHHHHHHHHhh-hhhhhcCCCccCC
Confidence 3566667777777 6667888888886
No 72
>PF07006 DUF1310: Protein of unknown function (DUF1310); InterPro: IPR010738 This family consists of several hypothetical proteins of around 125 residues in length. Members of this family seem to be specific to Listeria and Streptococcus species. The function of this family is unknown.
Probab=21.42 E-value=35 Score=29.37 Aligned_cols=22 Identities=18% Similarity=0.182 Sum_probs=15.6
Q ss_pred HHHHHHHHHHHHHHhccccccc
Q 037939 256 VKVKKKARMEELERRAYEEEAL 277 (309)
Q Consensus 256 vr~~rkkr~~eMEr~A~~gE~L 277 (309)
....++++-+||.+-++..|+=
T Consensus 21 ~~~~~~~~~~eMi~Iv~S~E~k 42 (122)
T PF07006_consen 21 FYMDQQKEKQEMIQIVKSEEAK 42 (122)
T ss_pred EEEEEhHHHHHHHHHhcCHHHH
Confidence 3445555678999988888763
No 73
>PF04286 DUF445: Protein of unknown function (DUF445); InterPro: IPR007383 This entry contains proteins of unknown function. They are predicted to be transmembrane proteins with 2 or 3 TM domains.
Probab=20.83 E-value=49 Score=31.13 Aligned_cols=21 Identities=19% Similarity=0.466 Sum_probs=16.0
Q ss_pred ceEEEeecchhHHHHHHHHHH
Q 037939 229 RWKLAVGTTVGAAVGAFLLGL 249 (309)
Q Consensus 229 ~Wk~ivg~vvGg~~glvlLgl 249 (309)
+|.-|-|+++||++|+++..+
T Consensus 343 ~~IrinGallG~liG~~~~~i 363 (367)
T PF04286_consen 343 QWIRINGALLGGLIGLLQYLI 363 (367)
T ss_pred HhhhhhhHHHHHHHHHHHHHH
Confidence 565588999999988766554
No 74
>KOG4007 consensus Uncharacterized conserved protein [Function unknown]
Probab=20.24 E-value=1.4e+02 Score=28.33 Aligned_cols=43 Identities=23% Similarity=0.322 Sum_probs=23.2
Q ss_pred EEEeecchhHHHHHHHHHHHHHHhhH--HHHHHHHHHHHHHhccccc
Q 037939 231 KLAVGTTVGAAVGAFLLGLLLVAMFV--KVKKKARMEELERRAYEEE 275 (309)
Q Consensus 231 k~ivg~vvGg~~glvlLgll~v~~~v--r~~rkkr~~eMEr~A~~gE 275 (309)
|++|.. +-.++|+ |++..++.|++ -.+|+-+..+-|-+-|++|
T Consensus 134 KVvvIi-vi~ii~i-L~lYMvfLmcldPlLrKr~~~~yq~hnded~e 178 (229)
T KOG4007|consen 134 KVVVII-VISIIGI-LLLYMVFLMCLDPLLRKRVKANYQEHNDEDDE 178 (229)
T ss_pred EEEEee-hHHHHHH-HHHHHHHHHhhhHHHhhhhhhhHHHhcccccc
Confidence 444433 3334443 33444445656 2355667777777777766
No 75
>PF11669 WBP-1: WW domain-binding protein 1; InterPro: IPR021684 This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain.
Probab=20.09 E-value=2.3e+02 Score=23.44 Aligned_cols=22 Identities=9% Similarity=0.175 Sum_probs=15.0
Q ss_pred HHHHHHHHHHHHHHhccccccc
Q 037939 256 VKVKKKARMEELERRAYEEEAL 277 (309)
Q Consensus 256 vr~~rkkr~~eMEr~A~~gE~L 277 (309)
.++|+|.++++=+|+.|.+++-
T Consensus 41 ~~~r~r~~~~~q~rq~e~~~~~ 62 (102)
T PF11669_consen 41 RHRRRRRRLQQQQRQREINLIA 62 (102)
T ss_pred HHHHHHHhhhhhhccccccccc
Confidence 4556667787777777776644
Done!