Query 018239
Match_columns 359
No_of_seqs 133 out of 194
Neff 6.4
Searched_HMMs 46136
Date Fri Mar 29 07:18:44 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/018239.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/018239hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09402 MSC: Man1-Src1p-C-ter 100.0 5.6E-62 1.2E-66 477.2 4.8 271 71-348 18-333 (334)
2 PF12946 EGF_MSP1_1: MSP1 EGF 95.6 0.0045 9.7E-08 41.7 0.8 28 94-121 5-37 (37)
3 PF01683 EB: EB module; Inter 77.1 4.1 8.9E-05 28.8 3.8 26 95-120 27-52 (52)
4 PF13314 DUF4083: Domain of un 73.8 16 0.00035 27.1 6.1 20 259-278 38-57 (58)
5 PF07127 Nodulin_late: Late no 67.7 8.4 0.00018 27.8 3.6 26 73-113 26-52 (54)
6 PTZ00382 Variant-specific surf 64.3 11 0.00024 30.7 4.1 34 90-123 19-56 (96)
7 PF07645 EGF_CA: Calcium-bindi 62.0 4.3 9.4E-05 27.6 1.1 21 95-115 11-35 (42)
8 COG2976 Uncharacterized protei 59.0 34 0.00074 31.8 6.7 50 227-277 14-65 (207)
9 PF06667 PspB: Phage shock pro 56.8 45 0.00096 26.1 6.1 33 240-272 11-51 (75)
10 KOG0196 Tyrosine kinase, EPH ( 56.2 9.1 0.0002 42.3 2.9 42 73-116 275-319 (996)
11 PF04891 NifQ: NifQ; InterPro 55.4 13 0.00029 33.4 3.4 57 39-104 107-167 (167)
12 PF06387 Calcyon: D1 dopamine 52.2 18 0.00039 32.7 3.6 16 109-124 113-128 (186)
13 PF12947 EGF_3: EGF domain; I 49.3 9.3 0.0002 25.4 1.1 26 95-120 7-36 (36)
14 KOG1214 Nidogen and related ba 47.9 12 0.00025 41.6 2.1 34 91-124 828-867 (1289)
15 TIGR02976 phageshock_pspB phag 46.5 74 0.0016 24.8 5.8 29 244-272 15-51 (75)
16 PF06864 PAP_PilO: Pilin acces 45.1 42 0.00092 34.2 5.6 14 290-303 220-233 (414)
17 PF07543 PGA2: Protein traffic 44.2 65 0.0014 28.1 5.8 34 243-276 21-54 (140)
18 PF07974 EGF_2: EGF-like domai 44.1 22 0.00047 23.1 2.2 20 95-114 7-28 (32)
19 smart00179 EGF_CA Calcium-bind 44.0 22 0.00048 22.6 2.3 26 94-120 9-38 (39)
20 PF01826 TIL: Trypsin Inhibito 43.7 12 0.00027 26.6 1.1 26 96-124 27-53 (55)
21 PF01102 Glycophorin_A: Glycop 43.3 33 0.00071 29.3 3.8 19 237-255 69-87 (122)
22 PF10576 EndIII_4Fe-2S: Iron-s 42.9 11 0.00023 21.1 0.5 14 89-102 4-17 (17)
23 PF08563 P53_TAD: P53 transact 42.8 11 0.00023 23.3 0.5 14 176-189 8-21 (25)
24 cd00053 EGF Epidermal growth f 40.7 27 0.00059 21.3 2.3 25 95-120 7-35 (36)
25 PF00558 Vpu: Vpu protein; In 37.8 36 0.00078 27.0 2.9 22 253-274 27-48 (81)
26 PF08114 PMP1_2: ATPase proteo 36.9 60 0.0013 22.5 3.5 20 245-264 19-38 (43)
27 PF02009 Rifin_STEVOR: Rifin/s 36.4 52 0.0011 32.4 4.5 8 146-153 95-102 (299)
28 PF06679 DUF1180: Protein of u 35.8 1.5E+02 0.0033 26.6 7.0 34 35-68 84-117 (163)
29 PRK09458 pspB phage shock prot 35.8 98 0.0021 24.2 5.0 30 243-272 14-51 (75)
30 PHA03399 pif3 per os infectivi 35.0 46 0.001 30.9 3.6 33 71-115 47-87 (200)
31 PF05568 ASFV_J13L: African sw 34.7 1E+02 0.0022 27.3 5.5 11 230-240 26-36 (189)
32 PF07466 DUF1517: Protein of u 34.5 1.2E+02 0.0026 29.7 6.6 24 43-66 62-85 (289)
33 PF06143 Baculo_11_kDa: Baculo 34.3 2.4E+02 0.0053 22.5 7.9 22 230-251 32-53 (84)
34 PRK07597 secE preprotein trans 33.1 82 0.0018 23.3 4.2 28 38-65 25-52 (64)
35 TIGR00964 secE_bact preprotein 32.8 86 0.0019 22.6 4.1 26 39-64 17-42 (55)
36 PF07271 Cytadhesin_P30: Cytad 30.8 1.6E+02 0.0035 28.6 6.6 16 260-275 104-119 (279)
37 KOG0474 Cl- channel CLC-7 and 30.0 61 0.0013 35.1 4.0 25 90-114 396-421 (762)
38 PF10588 NADH-G_4Fe-4S_3: NADH 29.9 24 0.00051 24.1 0.7 16 88-103 11-26 (41)
39 PRK11677 hypothetical protein; 29.9 2.1E+02 0.0045 24.9 6.6 15 262-276 49-63 (134)
40 PF03672 UPF0154: Uncharacteri 29.5 1.9E+02 0.0041 22.0 5.5 22 240-261 7-28 (64)
41 PF07699 GCC2_GCC3: GCC2 and G 29.4 44 0.00096 23.2 2.0 29 72-102 9-37 (48)
42 PF09402 MSC: Man1-Src1p-C-ter 29.0 18 0.0004 35.4 0.0 71 261-331 97-174 (334)
43 COG0690 SecE Preprotein transl 28.9 1.2E+02 0.0025 23.4 4.5 27 38-64 35-61 (73)
44 PF06247 Plasmod_Pvs28: Plasmo 28.7 21 0.00045 32.8 0.3 31 95-125 51-90 (197)
45 PRK09400 secE preprotein trans 28.2 1.3E+02 0.0027 22.6 4.4 19 40-58 27-45 (61)
46 PF09064 Tme5_EGF_like: Thromb 28.2 47 0.001 22.1 1.8 17 101-117 11-30 (34)
47 PF12662 cEGF: Complement Clr- 27.3 35 0.00075 20.8 1.0 16 107-122 4-21 (24)
48 PF00584 SecE: SecE/Sec61-gamm 26.8 1.5E+02 0.0033 21.2 4.6 21 39-59 18-38 (57)
49 PF14316 DUF4381: Domain of un 26.5 1.7E+02 0.0036 25.2 5.6 16 268-283 69-84 (146)
50 PF05545 FixQ: Cbb3-type cytoc 26.3 1.3E+02 0.0027 21.1 3.9 9 232-240 7-15 (49)
51 PF12729 4HB_MCP_1: Four helix 26.1 2.8E+02 0.0061 23.0 7.0 8 298-305 64-71 (181)
52 PRK15428 putative propanediol 26.0 58 0.0013 29.2 2.6 31 263-301 4-34 (163)
53 PF04882 Peroxin-3: Peroxin-3; 25.6 45 0.00099 34.3 2.1 29 227-255 4-32 (432)
54 PHA02817 EEV Host range protei 25.4 77 0.0017 29.9 3.4 43 72-117 66-126 (225)
55 PF10500 SR-25: Nuclear RNA-sp 24.9 51 0.0011 31.0 2.1 9 177-185 159-167 (225)
56 PF01102 Glycophorin_A: Glycop 24.8 69 0.0015 27.4 2.7 23 236-258 72-94 (122)
57 CHL00190 psaM photosystem I su 24.7 86 0.0019 20.2 2.5 15 52-66 7-21 (30)
58 PF12669 P12: Virus attachment 24.3 1E+02 0.0022 22.7 3.2 10 96-105 40-49 (58)
59 PF07047 OPA3: Optic atrophy 3 24.2 4.5E+02 0.0098 22.4 8.3 20 238-257 79-98 (134)
60 PF07803 GSG-1: GSG1-like prot 24.2 1.7E+02 0.0036 24.9 4.8 30 48-77 9-43 (118)
61 PF07465 PsaM: Photosystem I p 24.0 95 0.0021 19.9 2.6 15 52-66 6-20 (29)
62 PF11392 DUF2877: Protein of u 22.9 46 0.001 27.6 1.3 11 35-45 5-15 (110)
63 smart00181 EGF Epidermal growt 22.4 81 0.0018 19.6 2.2 25 94-120 6-34 (35)
64 cd00033 CCP Complement control 22.4 53 0.0011 22.4 1.4 20 106-125 26-48 (57)
65 TIGR03053 PS_I_psaM photosyste 22.1 1E+02 0.0023 19.7 2.5 15 52-66 6-20 (29)
66 KOG1225 Teneurin-1 and related 21.6 56 0.0012 34.6 1.9 22 94-115 316-337 (525)
67 cd03580 NTR_Sfrp1_like NTR dom 21.0 34 0.00074 29.1 0.1 27 90-116 1-29 (126)
68 PF15240 Pro-rich: Proline-ric 20.9 62 0.0013 29.5 1.8 14 49-62 2-15 (179)
69 cd00054 EGF_CA Calcium-binding 20.7 75 0.0016 19.6 1.7 21 95-115 10-34 (38)
70 PHA02642 C-type lectin-like pr 20.6 2E+02 0.0043 27.0 5.1 19 48-66 56-74 (216)
71 PHA02673 ORF109 EEV glycoprote 20.5 1.6E+02 0.0034 26.4 4.1 22 45-66 35-56 (161)
72 PF11743 DUF3301: Protein of u 20.4 2.6E+02 0.0056 22.6 5.2 22 255-276 15-36 (97)
73 PF14991 MLANA: Protein melan- 20.2 30 0.00066 29.2 -0.3 23 240-262 30-54 (118)
No 1
>PF09402 MSC: Man1-Src1p-C-terminal domain; InterPro: IPR018996 This entry represents the Inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope []. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad []. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. Has some overlapping function with emr-1.; GO: 0005639 integral to nuclear inner membrane; PDB: 2CH0_A.
Probab=100.00 E-value=5.6e-62 Score=477.21 Aligned_cols=271 Identities=29% Similarity=0.456 Sum_probs=71.6
Q ss_pred CCCCCCCCCCCCCCC------------CCCCCCCCccCCCCceecCC-eeeeCCCceec-----------CCCcccCchh
Q 018239 71 STSKPFCDSNLLLDS------------PQSPTDSCEPCPSNGECHQG-KLECFHGYRKH-----------GKLCVEDGDI 126 (359)
Q Consensus 71 ~~~~~fCds~~~~~~------------~~~~~p~C~PCP~hA~C~~g-~l~C~~gY~l~-----------~~~Cv~D~~k 126 (359)
+...+|||++.+..+ ...++|+|+|||+||+|++| ++.|++||+++ +++|++|+++
T Consensus 18 ~~~vgyC~~~~~~~~~~~~~~~~~~~~~~~~~P~C~pCP~~a~C~~~~~~~C~~~y~~~~~~l~~~g~~p~~~Ci~D~~k 97 (334)
T PF09402_consen 18 KIAVGYCGTESPSPSFADDDISVPDWLLENFKPSCEPCPEHAICYPGLKLECEPGYVLKPSPLSLFGLIPPPKCIPDTEK 97 (334)
T ss_dssp --------------------------------------------------------------------------------
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHHH
Confidence 468999999972111 14578999999999999999 99999999999 9999999999
Q ss_pred hHHHHHHHHHHHHHHHHHhhcccccCC---CCcccchhhHHHHhhhhhhhhccCCChHHHHHHHHHHHHHHHhhcceecc
Q 018239 127 NETAGRLSRWVENRLCRAYAQFLCDGT---GSIWVEENDIWNDLEGHELMKIFELDNPVYLYTKKRTMETVGRYLESRTN 203 (359)
Q Consensus 127 ~~~i~~l~~~i~~~Lr~~~a~~~CG~~---~s~~i~e~dl~~~l~~~~~~k~~~ls~~~fe~l~~~al~~l~~~~~~~~~ 203 (359)
++.+++|++++.++||++||+++||.+ .+..|+++||++++.+ ++++++++++|+++|..++..+.+.-+..+.
T Consensus 98 ~~~i~~l~~~~~~~Lr~~~a~~~Cg~~~~~~~~~ls~~el~~~~~~---~~~~~~~~~efe~l~~~a~~~L~~~~ei~~~ 174 (334)
T PF09402_consen 98 EEKIEELAKKILDELRERNAQYECGDSEDDESPGLSEEELKDILSS---KKSPWISDEEFEELWSAALQELKKNPEIIIR 174 (334)
T ss_dssp --------------------------------------------------------------------------------
T ss_pred HHHHHHHHHHHHHHHHHHHhhcccCCCCCCCCCCCcHHHHHHHHHh---ccCccccHHHHHHHHHHHHHHHHhCCcEEEe
Confidence 999999999999999999999999943 3567999999999999 7889999999999999999888743222222
Q ss_pred ------------cCCceeeecccccccCccCcchhhHH----HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 018239 204 ------------SYGMKELKCPELLAEHYKPLSCRIHQ----WVSTHALIIVPVCSLLVGCLLLLWKVHRRRYFAIRVEE 267 (359)
Q Consensus 204 ------------sn~~~~~k~~~l~s~~~ipl~Crir~----~i~~~~~~i~~~l~~~vgi~~l~~~~~r~~~e~~~v~~ 267 (359)
..+.+.+. +++++++||+|++++ ++.+|+..++++++++++++++.++++++++++++|++
T Consensus 175 ~~~~~~~~~~~~~~~~~~~~---s~s~~~lpl~C~~~~~i~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~v~~ 251 (334)
T PF09402_consen 175 DDIINSHSSDDSNEKDKYFR---SSSLPYLPLKCRLRRQIRQFISRYRLIILGVLILLLLIKYIRYRYRKRREEKARVEE 251 (334)
T ss_dssp --------------------------------------------------------------------STHHHHHTTTTT
T ss_pred cccccccccccccCCcEEEE---eeCCCccccEEEEehHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 11122233 368999999997655 55667777777777777777778888889999999999
Q ss_pred HHHHHHHHHHHHHHhhhccCCCCCCcccccccccccCCCCC--cCchhhHHHHHHHHhcCCCcceeeeEEcCceeeeeEe
Q 018239 268 LYHQVCEILEENALMSKSVNGECEPWVVASRLRDHLLLPKE--RKDPVIWKKVEELVQEDSRVDQYPKLLKGESKVVWEW 345 (359)
Q Consensus 268 Lv~~vi~~L~~q~~~~~~~~~~~~pyl~~~qLRD~lL~~~~--r~r~~LW~kV~~~Ve~nSnIr~~~~ei~GE~~~vWeW 345 (359)
||++|+++|++|+..+ ..+..++|||+++||||+||.++. +++++||++|+++||+|||||++++|+|||+|+||||
T Consensus 252 lv~~ii~~L~~~~~~~-~~~~~~~p~v~~~qLRD~ll~~~~~~~~~~~lW~~v~~~ve~ns~Vr~~~~e~~Ge~~~vWeW 330 (334)
T PF09402_consen 252 LVKKIIDRLQDQARAS-DPNSSPEPYVSISQLRDDLLPPEHRLKRRNRLWKKVVKKVEENSNVRTEVREVHGEIMRVWEW 330 (334)
T ss_dssp THHHHHHHHHHHHHHH-TTSS-S-S-B-HHHHHHTT--STTGGG-GHHHHHHHHHHHTT---SEEEEEEETTEEEEEEE-
T ss_pred HHHHHHHHHHHHhhhh-ccCCCCCCCccHHHHHHHhCCcccCHHHHHHHHHHHHHHHHcCCCeeEEEEEECCeEEEEEEe
Confidence 9999999999999843 344678999999999999998765 3379999999999999999999999999999999999
Q ss_pred cCC
Q 018239 346 QGA 348 (359)
Q Consensus 346 ig~ 348 (359)
||+
T Consensus 331 ig~ 333 (334)
T PF09402_consen 331 IGP 333 (334)
T ss_dssp ---
T ss_pred cCC
Confidence 994
No 2
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=95.64 E-value=0.0045 Score=41.70 Aligned_cols=28 Identities=39% Similarity=0.954 Sum_probs=20.1
Q ss_pred ccCCCCceecCC----e-eeeCCCceecCCCcc
Q 018239 94 EPCPSNGECHQG----K-LECFHGYRKHGKLCV 121 (359)
Q Consensus 94 ~PCP~hA~C~~g----~-l~C~~gY~l~~~~Cv 121 (359)
++||+||-|+.+ + -.|..||.+.+.+|+
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk~~~~~C~ 37 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYKKVGGKCV 37 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEEEETTEEE
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCccccCCCcC
Confidence 589999999864 3 399999999998886
No 3
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=77.14 E-value=4.1 Score=28.85 Aligned_cols=26 Identities=31% Similarity=0.778 Sum_probs=22.6
Q ss_pred cCCCCceecCCeeeeCCCceecCCCc
Q 018239 95 PCPSNGECHQGKLECFHGYRKHGKLC 120 (359)
Q Consensus 95 PCP~hA~C~~g~l~C~~gY~l~~~~C 120 (359)
.|..++.|.+|.=.|.+||......|
T Consensus 27 qC~~~s~C~~g~C~C~~g~~~~~~~C 52 (52)
T PF01683_consen 27 QCIGGSVCVNGRCQCPPGYVEVGGRC 52 (52)
T ss_pred CCCCcCEEcCCEeECCCCCEecCCCC
Confidence 34499999998889999999988877
No 4
>PF13314 DUF4083: Domain of unknown function (DUF4083)
Probab=73.81 E-value=16 Score=27.08 Aligned_cols=20 Identities=20% Similarity=0.277 Sum_probs=13.1
Q ss_pred HHHHHHHHHHHHHHHHHHHH
Q 018239 259 RYFAIRVEELYHQVCEILEE 278 (359)
Q Consensus 259 ~~e~~~v~~Lv~~vi~~L~~ 278 (359)
++....+++=.+.+++.|++
T Consensus 38 kq~~~~~eqKLDrIIeLLEK 57 (58)
T PF13314_consen 38 KQDVDSMEQKLDRIIELLEK 57 (58)
T ss_pred ccchhHHHHHHHHHHHHHcc
Confidence 34444577777778888764
No 5
>PF07127 Nodulin_late: Late nodulin protein; InterPro: IPR009810 This family consists of several plant specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the protein's C terminus which may be involved in metal-binding [].; GO: 0046872 metal ion binding, 0009878 nodule morphogenesis
Probab=67.71 E-value=8.4 Score=27.84 Aligned_cols=26 Identities=19% Similarity=0.586 Sum_probs=19.5
Q ss_pred CCCCCCCCCCCCCCCCCCCCCccCCCCceecCC-eeeeCCCc
Q 018239 73 SKPFCDSNLLLDSPQSPTDSCEPCPSNGECHQG-KLECFHGY 113 (359)
Q Consensus 73 ~~~fCds~~~~~~~~~~~p~C~PCP~hA~C~~g-~l~C~~gY 113 (359)
....|.++. .||.+ |..+ ..+|..|+
T Consensus 26 ~~~~C~~d~-------------DCp~~--c~~~~~~kCi~~~ 52 (54)
T PF07127_consen 26 AIIPCKTDS-------------DCPKD--CPPPFIPKCINNI 52 (54)
T ss_pred CCcccCccc-------------cCCCC--CCCCcCcEeCcCC
Confidence 467899985 78888 8887 45887664
No 6
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=64.26 E-value=11 Score=30.71 Aligned_cols=34 Identities=24% Similarity=0.574 Sum_probs=23.7
Q ss_pred CCCCccCCC--CceecCCee--eeCCCceecCCCcccC
Q 018239 90 TDSCEPCPS--NGECHQGKL--ECFHGYRKHGKLCVED 123 (359)
Q Consensus 90 ~p~C~PCP~--hA~C~~g~l--~C~~gY~l~~~~Cv~D 123 (359)
...|.+||. =+.|..... .|..||.+..+.|+.+
T Consensus 19 ~~~C~~C~~~~C~~C~~~~~C~~C~~GY~~~~~~Cv~~ 56 (96)
T PTZ00382 19 GSGCVLCSVGNCKSCVVDGVCGECNSGFSLDNGKCVSS 56 (96)
T ss_pred CCcCCcCCCCCCcCCCCCCccccCcCCcccCCCccccc
Confidence 356999985 234433322 7999999999999864
No 7
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=62.02 E-value=4.3 Score=27.57 Aligned_cols=21 Identities=38% Similarity=0.911 Sum_probs=17.6
Q ss_pred cCCCCceecCC--ee--eeCCCcee
Q 018239 95 PCPSNGECHQG--KL--ECFHGYRK 115 (359)
Q Consensus 95 PCP~hA~C~~g--~l--~C~~gY~l 115 (359)
+|++++.|.+- .. .|.+||..
T Consensus 11 ~C~~~~~C~N~~Gsy~C~C~~Gy~~ 35 (42)
T PF07645_consen 11 NCPENGTCVNTEGSYSCSCPPGYEL 35 (42)
T ss_dssp SSSTTSEEEEETTEEEEEESTTEEE
T ss_pred cCCCCCEEEcCCCCEEeeCCCCcEE
Confidence 68999999875 33 99999994
No 8
>COG2976 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=59.02 E-value=34 Score=31.78 Aligned_cols=50 Identities=14% Similarity=0.295 Sum_probs=24.8
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHHH-HHHHHHHHHHHHHHH
Q 018239 227 IHQWVSTHALIIVPVCSLLVGCLLLLWKVH-RRRYFA-IRVEELYHQVCEILE 277 (359)
Q Consensus 227 ir~~i~~~~~~i~~~l~~~vgi~~l~~~~~-r~~~e~-~~v~~Lv~~vi~~L~ 277 (359)
+++|++++-..++..+++.+|.+ +-|.++ .++.++ +.....|+++++.++
T Consensus 14 ik~wwkeNGk~li~gviLg~~~l-fGW~ywq~~q~~q~~~AS~~Y~~~i~~~~ 65 (207)
T COG2976 14 IKDWWKENGKALIVGVILGLGGL-FGWRYWQSHQVEQAQEASAQYQNAIKAVQ 65 (207)
T ss_pred HHHHHHHCCchhHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence 56777766543333222222222 335444 444333 345667777777763
No 9
>PF06667 PspB: Phage shock protein B; InterPro: IPR009554 This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages []. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one [].; GO: 0006355 regulation of transcription, DNA-dependent, 0009271 phage shock
Probab=56.80 E-value=45 Score=26.09 Aligned_cols=33 Identities=27% Similarity=0.312 Sum_probs=19.4
Q ss_pred HHHHHHHHHHHH-HHHHHHHH-------HHHHHHHHHHHHH
Q 018239 240 PVCSLLVGCLLL-LWKVHRRR-------YFAIRVEELYHQV 272 (359)
Q Consensus 240 ~~l~~~vgi~~l-~~~~~r~~-------~e~~~v~~Lv~~v 272 (359)
.+.+++|+..|+ .+|..+++ .+.++.++|++.+
T Consensus 11 ivf~ifVap~WL~lHY~sk~~~~~gLs~~d~~~L~~L~~~a 51 (75)
T PF06667_consen 11 IVFMIFVAPIWLILHYRSKWKSSQGLSEEDEQRLQELYEQA 51 (75)
T ss_pred HHHHHHHHHHHHHHHHHHhcccCCCCCHHHHHHHHHHHHHH
Confidence 334445555555 45655554 4677778877765
No 10
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=56.22 E-value=9.1 Score=42.31 Aligned_cols=42 Identities=24% Similarity=0.584 Sum_probs=29.2
Q ss_pred CCCCCCCCCCCCCCCCCCCCCccCCCCcee-cCC-ee-eeCCCceec
Q 018239 73 SKPFCDSNLLLDSPQSPTDSCEPCPSNGEC-HQG-KL-ECFHGYRKH 116 (359)
Q Consensus 73 ~~~fCds~~~~~~~~~~~p~C~PCP~hA~C-~~g-~l-~C~~gY~l~ 116 (359)
..--|..+. + +.......|.|||+|.+= ..| .. .|..||-..
T Consensus 275 ~C~aCp~G~-y-K~~~~~~~C~~CP~~S~s~~ega~~C~C~~gyyRA 319 (996)
T KOG0196|consen 275 ACQACPPGT-Y-KASQGDSLCLPCPPNSHSSSEGATSCTCENGYYRA 319 (996)
T ss_pred cceeCCCCc-c-cCCCCCCCCCCCCCCCCCCCCCCCcccccCCcccC
Confidence 334455553 1 233446889999999998 556 55 999999993
No 11
>PF04891 NifQ: NifQ; InterPro: IPR006975 NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co) [], which is an integral part of the active site of dinitrogenase []. The conserved C-terminal cysteine residues may be involved in metal binding [].; GO: 0030151 molybdenum ion binding, 0009399 nitrogen fixation
Probab=55.37 E-value=13 Score=33.43 Aligned_cols=57 Identities=19% Similarity=0.307 Sum_probs=32.8
Q ss_pred CCChhhHHHHHHHHHHHHHHHHHHHHHHh----hhcCCCCCCCCCCCCCCCCCCCCCCCccCCCCceecC
Q 018239 39 FPSKQDLLRLITVVAIASSVALTCNYLAN----FLNSTSKPFCDSNLLLDSPQSPTDSCEPCPSNGECHQ 104 (359)
Q Consensus 39 ~~~~~~~~~~~~~~~~a~~~a~~~~~l~~----~~~~~~~~fCds~~~~~~~~~~~p~C~PCP~hA~C~~ 104 (359)
+++++|+.+|+.=-+=+.+.. |.=.| |||+. -|..+. -+.=..|+|.-|.+++.|+.
T Consensus 107 L~~R~eLs~Lm~r~Fp~Laa~---N~~~MrWKKFfYrq---lCe~eG---~~~C~aPsC~~C~D~~~CFG 167 (167)
T PF04891_consen 107 LRSRAELSALMRRHFPPLAAR---NTRNMRWKKFFYRQ---LCEREG---LYLCRAPSCEECSDYAVCFG 167 (167)
T ss_pred CCCHHHHHHHHHHHhHHHHHh---ccCCCcHHHHHHHH---HHHHcC---CCcCCCCCCCCcCCHhhcCC
Confidence 577777777666555444443 33222 55532 233332 01123499999999999984
No 12
>PF06387 Calcyon: D1 dopamine receptor-interacting protein (calcyon); InterPro: IPR009431 This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca2+ as well as cAMP-dependent signalling []. Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) [] and schizophrenia.; GO: 0050780 dopamine receptor binding, 0007212 dopamine receptor signaling pathway, 0016021 integral to membrane
Probab=52.17 E-value=18 Score=32.75 Aligned_cols=16 Identities=25% Similarity=0.455 Sum_probs=11.2
Q ss_pred eCCCceecCCCcccCc
Q 018239 109 CFHGYRKHGKLCVEDG 124 (359)
Q Consensus 109 C~~gY~l~~~~Cv~D~ 124 (359)
|-+||++..+.|.|-+
T Consensus 113 CPdGFv~khk~C~P~~ 128 (186)
T PF06387_consen 113 CPDGFVLKHKRCTPLT 128 (186)
T ss_pred CCCcceeecccccchh
Confidence 3448888888887743
No 13
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=49.32 E-value=9.3 Score=25.41 Aligned_cols=26 Identities=31% Similarity=0.797 Sum_probs=17.5
Q ss_pred cCCCCceecCC--ee--eeCCCceecCCCc
Q 018239 95 PCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (359)
Q Consensus 95 PCP~hA~C~~g--~l--~C~~gY~l~~~~C 120 (359)
.|=+||.|.+- .+ .|.+||.--+-.|
T Consensus 7 ~C~~nA~C~~~~~~~~C~C~~Gy~GdG~~C 36 (36)
T PF12947_consen 7 GCHPNATCTNTGGSYTCTCKPGYEGDGFFC 36 (36)
T ss_dssp GS-TTCEEEE-TTSEEEEE-CEEECCSTCE
T ss_pred CCCCCcEeecCCCCEEeECCCCCccCCcCC
Confidence 68889999876 44 9999998655544
No 14
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=47.87 E-value=12 Score=41.59 Aligned_cols=34 Identities=35% Similarity=0.842 Sum_probs=29.5
Q ss_pred CCCcc--CCCCceecCC--ee--eeCCCceecCCCcccCc
Q 018239 91 DSCEP--CPSNGECHQG--KL--ECFHGYRKHGKLCVEDG 124 (359)
Q Consensus 91 p~C~P--CP~hA~C~~g--~l--~C~~gY~l~~~~Cv~D~ 124 (359)
++|.| |=++|.||+- .+ +|.+||.--+-.||||+
T Consensus 828 DeC~psrChp~A~CyntpgsfsC~C~pGy~GDGf~CVP~~ 867 (1289)
T KOG1214|consen 828 DECSPSRCHPAATCYNTPGSFSCRCQPGYYGDGFQCVPDT 867 (1289)
T ss_pred cccCccccCCCceEecCCCcceeecccCccCCCceecCCC
Confidence 67776 9999999986 33 99999999999999993
No 15
>TIGR02976 phageshock_pspB phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response.
Probab=46.47 E-value=74 Score=24.85 Aligned_cols=29 Identities=28% Similarity=0.359 Sum_probs=15.6
Q ss_pred HHHHHHHH-HHHHHHHH-------HHHHHHHHHHHHH
Q 018239 244 LLVGCLLL-LWKVHRRR-------YFAIRVEELYHQV 272 (359)
Q Consensus 244 ~~vgi~~l-~~~~~r~~-------~e~~~v~~Lv~~v 272 (359)
++++..|+ .+|..+++ .+.++..+|++++
T Consensus 15 ifVap~wl~lHY~~k~~~~~~ls~~d~~~L~~L~~~a 51 (75)
T TIGR02976 15 IFVAPLWLILHYRSKRKTAASLSTDDQALLQELYAKA 51 (75)
T ss_pred HHHHHHHHHHHHHhhhccCCCCCHHHHHHHHHHHHHH
Confidence 33444444 45554443 4666677777654
No 16
>PF06864 PAP_PilO: Pilin accessory protein (PilO); InterPro: IPR009663 This family consists of several enterobacterial PilO proteins. The function of PilO is unknown although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins, but PilO protein is translocated to the outer membrane in the presence of other Pil proteins. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body []. This family does not seem to be related to IPR007445 from INTERPRO.
Probab=45.09 E-value=42 Score=34.19 Aligned_cols=14 Identities=21% Similarity=0.581 Sum_probs=9.4
Q ss_pred CCCccccccccccc
Q 018239 290 CEPWVVASRLRDHL 303 (359)
Q Consensus 290 ~~pyl~~~qLRD~l 303 (359)
++||...+..-+.|
T Consensus 220 ~~PW~~~P~~~~fl 233 (414)
T PF06864_consen 220 PHPWAKQPSVQAFL 233 (414)
T ss_pred CCCcccCCCHHHHH
Confidence 56887777666554
No 17
>PF07543 PGA2: Protein trafficking PGA2; InterPro: IPR011431 A Saccharomyces cerevisiae (Baker's yeast) member of this family (PGA2, P53903 from SWISSPROT) is a single pass membrane protein which has been implicated in protein trafficking [, ].
Probab=44.18 E-value=65 Score=28.12 Aligned_cols=34 Identities=21% Similarity=0.198 Sum_probs=17.8
Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 018239 243 SLLVGCLLLLWKVHRRRYFAIRVEELYHQVCEIL 276 (359)
Q Consensus 243 ~~~vgi~~l~~~~~r~~~e~~~v~~Lv~~vi~~L 276 (359)
+++||.++|++-|.++...+..+.++-++..+.-
T Consensus 21 ViIVggYiLlRPY~~kl~~k~~~kq~eke~ae~e 54 (140)
T PF07543_consen 21 VIIVGGYILLRPYFRKLAAKDQKKQLEKEKAERE 54 (140)
T ss_pred hhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4556666666655545444555555554443333
No 18
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=44.07 E-value=22 Score=23.05 Aligned_cols=20 Identities=35% Similarity=0.856 Sum_probs=17.0
Q ss_pred cCCCCceec--CCeeeeCCCce
Q 018239 95 PCPSNGECH--QGKLECFHGYR 114 (359)
Q Consensus 95 PCP~hA~C~--~g~l~C~~gY~ 114 (359)
.|=.||+|. .|+=.|++||.
T Consensus 7 ~C~~~G~C~~~~g~C~C~~g~~ 28 (32)
T PF07974_consen 7 ICSGHGTCVSPCGRCVCDSGYT 28 (32)
T ss_pred ccCCCCEEeCCCCEEECCCCCc
Confidence 588999999 56779999984
No 19
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=43.95 E-value=22 Score=22.61 Aligned_cols=26 Identities=38% Similarity=1.017 Sum_probs=18.6
Q ss_pred ccCCCCceecCC--ee--eeCCCceecCCCc
Q 018239 94 EPCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (359)
Q Consensus 94 ~PCP~hA~C~~g--~l--~C~~gY~l~~~~C 120 (359)
.||..+|.|.+. .. .|.+||. .+..|
T Consensus 9 ~~C~~~~~C~~~~g~~~C~C~~g~~-~g~~C 38 (39)
T smart00179 9 NPCQNGGTCVNTVGSYRCECPPGYT-DGRNC 38 (39)
T ss_pred CCcCCCCEeECCCCCeEeECCCCCc-cCCcC
Confidence 379899999854 23 7889987 45555
No 20
>PF01826 TIL: Trypsin Inhibitor like cysteine rich domain; InterPro: IPR002919 This domain is found in proteinase inhibitors as well as in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. This inhibitor domain belongs to MEROPS inhibitor family I8 (clan IA). Proteins containing this domain inhibit peptidases belonging to families S1 (IPR001254 from INTERPRO), S8 (IPR000209 from INTERPRO), and M4 (IPR001570 from INTERPRO) [] and are restricted to the chordata, nematoda, arthropoda and echinodermata. Examples of proteins containing this domain are: chymotrypsin/elastase inhibitor from Ascaris suum (pig roundworm) Acp62F protein from Drosophila melanogaster Bombina trypsin inhibitor from Bombina maxima (large-webbed bell toad) Bombyx subtilisin inhibitor from Bombyx mori (silk moth) von Willebrand factor ; PDB: 2P3F_N 1HX2_A 1CCV_A 1EAI_D 2H9E_C 1COU_A 1ATE_A 1ATB_A 1ATD_A 1ATA_A ....
Probab=43.73 E-value=12 Score=26.63 Aligned_cols=26 Identities=31% Similarity=0.752 Sum_probs=20.7
Q ss_pred CCCCceecCCeeeeCCCceecCC-CcccCc
Q 018239 96 CPSNGECHQGKLECFHGYRKHGK-LCVEDG 124 (359)
Q Consensus 96 CP~hA~C~~g~l~C~~gY~l~~~-~Cv~D~ 124 (359)
|+ ..|.+| =.|.+||++... .||+-.
T Consensus 27 C~--~~C~~g-C~C~~G~v~~~~~~CV~~~ 53 (55)
T PF01826_consen 27 CS--EPCVEG-CFCPPGYVRNDNGRCVPPS 53 (55)
T ss_dssp CS--SS-ESE-EEETTTEEEETTSEEEEGG
T ss_pred cC--CCCCcc-CCCCCCeeEcCCCCEEcHH
Confidence 55 778888 789999999876 999864
No 21
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=43.32 E-value=33 Score=29.31 Aligned_cols=19 Identities=32% Similarity=0.427 Sum_probs=8.7
Q ss_pred HHHHHHHHHHHHHHHHHHH
Q 018239 237 IIVPVCSLLVGCLLLLWKV 255 (359)
Q Consensus 237 ~i~~~l~~~vgi~~l~~~~ 255 (359)
+++++++.++|+.+++.|.
T Consensus 69 Ii~gv~aGvIg~Illi~y~ 87 (122)
T PF01102_consen 69 IIFGVMAGVIGIILLISYC 87 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHH
T ss_pred hhHHHHHHHHHHHHHHHHH
Confidence 3445444444444444443
No 22
>PF10576 EndIII_4Fe-2S: Iron-sulfur binding domain of endonuclease III; InterPro: IPR003651 Endonuclease III (4.2.99.18 from EC) is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism [, ]. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair []. The 3-D structures of Escherichia coli endonuclease III [] and catalytic domain of MutY [] have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL) []. Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif [, ]. The iron-sulphur cluster loop (FCL) is also found in DNA-(apurinic or apyrimidinic site) lyase, a subfamily of endonuclease III. The enzyme has both apurinic and apyrimidinic endonuclease activity and a DNA N-glycosylase activity. It cuts damaged DNA at cytosines, thymines and guanines, and acts on the damaged strand 5' of the damaged site. 
The enzyme binds a 4Fe-4S cluster which is not important for the catalytic activity, but is probably involved in the alignment of the enzyme along the DNA strand.; GO: 0004519 endonuclease activity, 0051539 4 iron, 4 sulfur cluster binding; PDB: 1VRL_A 1RRQ_A 3G0Q_A 3FSQ_A 1RRS_A 3FSP_A 2ABK_A 1KG7_A 1KG2_A 1MUN_A ....
Probab=42.89 E-value=11 Score=21.13 Aligned_cols=14 Identities=36% Similarity=0.923 Sum_probs=8.6
Q ss_pred CCCCCccCCCCcee
Q 018239 89 PTDSCEPCPSNGEC 102 (359)
Q Consensus 89 ~~p~C~PCP~hA~C 102 (359)
.+|.|.-||-+..|
T Consensus 4 r~P~C~~Cpl~~~C 17 (17)
T PF10576_consen 4 RKPKCEECPLADYC 17 (17)
T ss_dssp SS--GGG-TTGGG-
T ss_pred CCCccccCCCcccC
Confidence 46999999999887
No 23
>PF08563 P53_TAD: P53 transactivation motif; InterPro: IPR013872 The binding of this protein by regulatory proteins regulates p53 transcription activation. This entry is comprised of a single amphipathic alpha helix and contains a highly conserved motif [, ]. ; GO: 0005515 protein binding; PDB: 1YCQ_B 2Z5T_R 3DAB_B 3DAC_B 2Z5S_Q 2K8F_B 2L14_B 1YCR_B.
Probab=42.80 E-value=11 Score=23.34 Aligned_cols=14 Identities=7% Similarity=0.007 Sum_probs=9.7
Q ss_pred cCCChHHHHHHHHH
Q 018239 176 FELDNPVYLYTKKR 189 (359)
Q Consensus 176 ~~ls~~~fe~l~~~ 189 (359)
+-|+++.|++||+.
T Consensus 8 ~PLSQeTF~~LW~~ 21 (25)
T PF08563_consen 8 LPLSQETFSDLWNL 21 (25)
T ss_dssp ---STCCHHHHHHT
T ss_pred CCccHHHHHHHHHh
Confidence 46889999999964
No 24
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=40.65 E-value=27 Score=21.27 Aligned_cols=25 Identities=32% Similarity=0.890 Sum_probs=18.0
Q ss_pred cCCCCceecCC--ee--eeCCCceecCCCc
Q 018239 95 PCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (359)
Q Consensus 95 PCP~hA~C~~g--~l--~C~~gY~l~~~~C 120 (359)
+|..||+|.+. .. .|.+||... ..|
T Consensus 7 ~C~~~~~C~~~~~~~~C~C~~g~~g~-~~C 35 (36)
T cd00053 7 PCSNGGTCVNTPGSYRCVCPPGYTGD-RSC 35 (36)
T ss_pred CCCCCCEEecCCCCeEeECCCCCccc-CCc
Confidence 67789999984 23 899998655 344
No 25
>PF00558 Vpu: Vpu protein; InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells [].; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=37.77 E-value=36 Score=27.03 Aligned_cols=22 Identities=14% Similarity=0.208 Sum_probs=5.9
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 018239 253 WKVHRRRYFAIRVEELYHQVCE 274 (359)
Q Consensus 253 ~~~~r~~~e~~~v~~Lv~~vi~ 274 (359)
+..||+...+++++++++.+.+
T Consensus 27 ~ieYrk~~rqrkId~li~RIre 48 (81)
T PF00558_consen 27 YIEYRKIKRQRKIDRLIERIRE 48 (81)
T ss_dssp ------------CHHHHHHHHC
T ss_pred HHHHHHHHHHHhHHHHHHHHHc
Confidence 3335555666777776664433
No 26
>PF08114 PMP1_2: ATPase proteolipid family; InterPro: IPR012589 This family consists of small proteolipids associated with the plasma membrane H+ ATPase. Two proteolipids (PMP1 and PMP2) are associated with the ATPase and both genes are similarly expressed in the wild-type strain of yeast. No modification of the level of transcription of one PMP gene is detected in a strain deleted of the other. Though both proteolipids show similarity with other small proteolipids associated with other cation-transporting ATPases, their functions remain unclear [].
Probab=36.94 E-value=60 Score=22.47 Aligned_cols=20 Identities=25% Similarity=0.212 Sum_probs=10.1
Q ss_pred HHHHHHHHHHHHHHHHHHHH
Q 018239 245 LVGCLLLLWKVHRRRYFAIR 264 (359)
Q Consensus 245 ~vgi~~l~~~~~r~~~e~~~ 264 (359)
++++..+.-++||+.+.+++
T Consensus 19 lv~i~iva~~iYRKw~aRkr 38 (43)
T PF08114_consen 19 LVGIGIVALFIYRKWQARKR 38 (43)
T ss_pred HHHHHHHHHHHHHHHHHHHH
Confidence 34444444555666555544
No 27
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=36.38 E-value=52 Score=32.37 Aligned_cols=8 Identities=13% Similarity=0.273 Sum_probs=4.7
Q ss_pred hcccccCC
Q 018239 146 AQFLCDGT 153 (359)
Q Consensus 146 a~~~CG~~ 153 (359)
+-+.||.+
T Consensus 95 ~CL~Cg~~ 102 (299)
T PF02009_consen 95 GCLKCGCG 102 (299)
T ss_pred hhhhhcCc
Confidence 45567664
No 28
>PF06679 DUF1180: Protein of unknown function (DUF1180); InterPro: IPR009565 This entry consists of several hypothetical eukaryotic proteins thought to be membrane proteins. Their function is unknown.
Probab=35.79 E-value=1.5e+02 Score=26.57 Aligned_cols=34 Identities=24% Similarity=0.278 Sum_probs=24.0
Q ss_pred CCCCCCChhhHHHHHHHHHHHHHHHHHHHHHHhh
Q 018239 35 PQSLFPSKQDLLRLITVVAIASSVALTCNYLANF 68 (359)
Q Consensus 35 ~~~~~~~~~~~~~~~~~~~~a~~~a~~~~~l~~~ 68 (359)
|..+-+.+.-+.|.++||..+++.+.+|+++=.|
T Consensus 84 ~s~~~~d~~~l~R~~~Vl~g~s~l~i~yfvir~~ 117 (163)
T PF06679_consen 84 PSPSSPDSPMLKRALYVLVGLSALAILYFVIRTF 117 (163)
T ss_pred cCCCcCCccchhhhHHHHHHHHHHHHHHHHHHHH
Confidence 3345567777888998888888877777666433
No 29
>PRK09458 pspB phage shock protein B; Provisional
Probab=35.76 E-value=98 Score=24.24 Aligned_cols=30 Identities=23% Similarity=0.261 Sum_probs=17.5
Q ss_pred HHHHHHHHH-HHHHHHHH-------HHHHHHHHHHHHH
Q 018239 243 SLLVGCLLL-LWKVHRRR-------YFAIRVEELYHQV 272 (359)
Q Consensus 243 ~~~vgi~~l-~~~~~r~~-------~e~~~v~~Lv~~v 272 (359)
+++|+-.|+ .+|..+++ .+.++.++|++.+
T Consensus 14 ~ifVaPiWL~LHY~sk~~~~~~Ls~~d~~~L~~L~~~A 51 (75)
T PRK09458 14 VLFVAPIWLWLHYRSKRQGSQGLSQEEQQRLAQLTEKA 51 (75)
T ss_pred HHHHHHHHHHHhhcccccCCCCCCHHHHHHHHHHHHHH
Confidence 344444444 45554444 4677778887765
No 30
>PHA03399 pif3 per os infectivity factor 3; Provisional
Probab=35.00 E-value=46 Score=30.86 Aligned_cols=33 Identities=18% Similarity=0.496 Sum_probs=22.6
Q ss_pred CCCCCCCCCCCCCCCCCCCCCCCccCCCCceecCC--------eeeeCCCcee
Q 018239 71 STSKPFCDSNLLLDSPQSPTDSCEPCPSNGECHQG--------KLECFHGYRK 115 (359)
Q Consensus 71 ~~~~~fCds~~~~~~~~~~~p~C~PCP~hA~C~~g--------~l~C~~gY~l 115 (359)
++..--|+++. +||=.+..|.++ .+.|+.||=-
T Consensus 47 r~~ivDC~~t~------------lPCVtD~QC~dnC~~~~~~~~~~C~~GFC~ 87 (200)
T PHA03399 47 RNGIVDCSLTR------------LPCVTDQQCRDNCAIGSAAGVMTCDGGFCS 87 (200)
T ss_pred ccCcccCcCCc------------CCcccHHHHHHHHHhccccceEECCCCeec
Confidence 55666677664 388888888654 4589888633
No 31
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=34.66 E-value=1e+02 Score=27.30 Aligned_cols=11 Identities=36% Similarity=0.495 Sum_probs=5.6
Q ss_pred HHHHHHHHHHH
Q 018239 230 WVSTHALIIVP 240 (359)
Q Consensus 230 ~i~~~~~~i~~ 240 (359)
++..|+..|+.
T Consensus 26 ffsthm~tILi 36 (189)
T PF05568_consen 26 FFSTHMYTILI 36 (189)
T ss_pred HHHHHHHHHHH
Confidence 44556655543
No 32
>PF07466 DUF1517: Protein of unknown function (DUF1517); InterPro: IPR010903 This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown.
Probab=34.50 E-value=1.2e+02 Score=29.69 Aligned_cols=24 Identities=4% Similarity=0.148 Sum_probs=14.0
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHH
Q 018239 43 QDLLRLITVVAIASSVALTCNYLA 66 (359)
Q Consensus 43 ~~~~~~~~~~~~a~~~a~~~~~l~ 66 (359)
..|.-++.+|+++.+++++..++.
T Consensus 62 gg~~gl~~iLIl~~Ia~~vv~~~r 85 (289)
T PF07466_consen 62 GGFGGLFDILILFGIAFFVVRFFR 85 (289)
T ss_pred cccchHHHHHHHHHHHHHHHHHHH
Confidence 334556666666666665555554
No 33
>PF06143 Baculo_11_kDa: Baculovirus 11 kDa family; InterPro: IPR009313 This is a family of uncharacterised Baculovirus proteins that are all about 11 kDa in size.
Probab=34.28 E-value=2.4e+02 Score=22.54 Aligned_cols=22 Identities=14% Similarity=0.354 Sum_probs=13.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 018239 230 WVSTHALIIVPVCSLLVGCLLL 251 (359)
Q Consensus 230 ~i~~~~~~i~~~l~~~vgi~~l 251 (359)
.++.+.+.+.+++++++.++++
T Consensus 32 firdFvLVic~~lVfVii~lFi 53 (84)
T PF06143_consen 32 FIRDFVLVICCFLVFVIIVLFI 53 (84)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 4556677676665555555444
No 34
>PRK07597 secE preprotein translocase subunit SecE; Reviewed
Probab=33.10 E-value=82 Score=23.35 Aligned_cols=28 Identities=25% Similarity=0.420 Sum_probs=19.8
Q ss_pred CCCChhhHHHHHHHHHHHHHHHHHHHHH
Q 018239 38 LFPSKQDLLRLITVVAIASSVALTCNYL 65 (359)
Q Consensus 38 ~~~~~~~~~~~~~~~~~a~~~a~~~~~l 65 (359)
-.|+++|..+...+.+++.++..+..++
T Consensus 25 ~WPs~~e~~~~t~~Vi~~~~~~~~~i~~ 52 (64)
T PRK07597 25 TWPTRKELVRSTIVVLVFVAFFALFFYL 52 (64)
T ss_pred cCcCHHHHHhHHHHHHHHHHHHHHHHHH
Confidence 3699999998887777777665444333
No 35
>TIGR00964 secE_bact preprotein translocase, SecE subunit, bacterial. This model represents exclusively the bacterial (and some organellar) SecE protein. SecE is part of the core heterotrimer, SecYEG, of the Sec preprotein translocase system. Other components are the ATPase SecA, a cytosolic chaperone SecB, and an accessory complex of SecDF and YajC.
Probab=32.76 E-value=86 Score=22.59 Aligned_cols=26 Identities=19% Similarity=0.327 Sum_probs=18.5
Q ss_pred CCChhhHHHHHHHHHHHHHHHHHHHH
Q 018239 39 FPSKQDLLRLITVVAIASSVALTCNY 64 (359)
Q Consensus 39 ~~~~~~~~~~~~~~~~a~~~a~~~~~ 64 (359)
.|+|+|..+...+.++++++..+..+
T Consensus 17 WPt~~e~~~~t~~Vi~~~~~~~~~~~ 42 (55)
T TIGR00964 17 WPSRKELITYTIVVIVFVIFFSLFLF 42 (55)
T ss_pred CcCHHHHHhHHHHHHHHHHHHHHHHH
Confidence 69999998887777776666444433
No 36
>PF07271 Cytadhesin_P30: Cytadhesin P30/P32; InterPro: IPR009896 This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localised on the tip organelle. It is thought that it is important in cytadherence and virulence [].; GO: 0007157 heterophilic cell-cell adhesion, 0009405 pathogenesis, 0016021 integral to membrane
Probab=30.83 E-value=1.6e+02 Score=28.59 Aligned_cols=16 Identities=25% Similarity=0.067 Sum_probs=10.8
Q ss_pred HHHHHHHHHHHHHHHH
Q 018239 260 YFAIRVEELYHQVCEI 275 (359)
Q Consensus 260 ~e~~~v~~Lv~~vi~~ 275 (359)
+|++++++++++.-.+
T Consensus 104 ee~e~~~q~~e~~~~i 119 (279)
T PF07271_consen 104 EEKEEHEQLAEQLGRI 119 (279)
T ss_pred HHHHHHHHHHHHHHHH
Confidence 5677788888875433
No 37
>KOG0474 consensus Cl- channel CLC-7 and related proteins (CLC superfamily) [Inorganic ion transport and metabolism]
Probab=30.01 E-value=61 Score=35.14 Aligned_cols=25 Identities=28% Similarity=0.537 Sum_probs=14.2
Q ss_pred CCCCccCCCCceecCC-eeeeCCCce
Q 018239 90 TDSCEPCPSNGECHQG-KLECFHGYR 114 (359)
Q Consensus 90 ~p~C~PCP~hA~C~~g-~l~C~~gY~ 114 (359)
-..|+|||+...=..- .+-|.+|+.
T Consensus 396 l~~C~P~~~~~~~~~~p~f~Cp~~~Y 421 (762)
T KOG0474|consen 396 LADCQPCPPSITEGQCPTFFCPDGEY 421 (762)
T ss_pred HhcCCCCCCCcccccCccccCCCCch
Confidence 3678888765432211 257777744
No 38
>PF10588 NADH-G_4Fe-4S_3: NADH-ubiquinone oxidoreductase-G iron-sulfur binding region; InterPro: IPR019574 NADH:ubiquinone oxidoreductase (complex I) (1.6.5.3 from EC) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) []. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea [], mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins []. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters []. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes [, ]. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I []. 
This entry describes the G subunit (one of 14 subunits, A to N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria while translocating protons, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This family does not contain related subunits from formate dehydrogenase complexes. This entry represents the iron-sulphur binding domain of the G subunit.; GO: 0016491 oxidoreductase activity, 0055114 oxidation-reduction process; PDB: 3M9S_C 2FUG_L 3IAS_L 2YBB_3 3IAM_3 3I9V_3.
Probab=29.95 E-value=24 Score=24.13 Aligned_cols=16 Identities=31% Similarity=0.866 Sum_probs=8.5
Q ss_pred CCCCCCccCCCCceec
Q 018239 88 SPTDSCEPCPSNGECH 103 (359)
Q Consensus 88 ~~~p~C~PCP~hA~C~ 103 (359)
+++-.|.-|+.+|.|.
T Consensus 11 ~H~~dC~~C~~~G~Ce 26 (41)
T PF10588_consen 11 NHPLDCPTCDKNGNCE 26 (41)
T ss_dssp T----TTT-TTGGG-H
T ss_pred CCCCcCcCCCCCCCCH
Confidence 4567899999999984
No 39
>PRK11677 hypothetical protein; Provisional
Probab=29.92 E-value=2.1e+02 Score=24.86 Aligned_cols=15 Identities=13% Similarity=0.264 Sum_probs=6.7
Q ss_pred HHHHHHHHHHHHHHH
Q 018239 262 AIRVEELYHQVCEIL 276 (359)
Q Consensus 262 ~~~v~~Lv~~vi~~L 276 (359)
++.|.+.+.+..+.|
T Consensus 49 kqeV~~HFa~TA~Ll 63 (134)
T PRK11677 49 RQELVSHFARSAELL 63 (134)
T ss_pred HHHHHHHHHHHHHHH
Confidence 444555444443333
No 40
>PF03672 UPF0154: Uncharacterised protein family (UPF0154); InterPro: IPR005359 The proteins in this entry are functionally uncharacterised.
Probab=29.49 E-value=1.9e+02 Score=21.98 Aligned_cols=22 Identities=5% Similarity=0.070 Sum_probs=11.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 018239 240 PVCSLLVGCLLLLWKVHRRRYF 261 (359)
Q Consensus 240 ~~l~~~vgi~~l~~~~~r~~~e 261 (359)
.++++++|.++.++++.+.-++
T Consensus 7 li~G~~~Gff~ar~~~~k~l~~ 28 (64)
T PF03672_consen 7 LIVGAVIGFFIARKYMEKQLKE 28 (64)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 3344555555555555554433
No 41
>PF07699 GCC2_GCC3: GCC2 and GCC3; InterPro: IPR011641 Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity []: Serine/threonine-protein kinases Tyrosine-protein kinases Dual specific protein kinases (e.g. MEK - phosphorylates both Thr and Tyr on target proteins) Protein kinase function has been evolutionarily conserved from Escherichia coli to human []. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation []. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [], leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases []. Tyrosine-protein kinases can transfer a phosphate group from ATP to a tyrosine residue in a protein. These enzymes can be divided into two main groups []: Receptor tyrosine kinases (RTK), which are transmembrane proteins involved in signal transduction; they play key roles in growth, differentiation, metabolism, adhesion, motility, death and oncogenesis []. RTKs are composed of 3 domains: an extracellular domain (binds ligand), a transmembrane (TM) domain, and an intracellular catalytic domain (phosphorylates substrate). 
The TM domain plays an important role in the dimerisation process necessary for signal transduction []. Cytoplasmic / non-receptor tyrosine kinases, which act as regulatory proteins, playing key roles in cell differentiation, motility, proliferation, and survival. For example, the Src-family of protein-tyrosine kinases []. This entry represents various ephrin type A and B receptors, which have tyrosine kinase activity.
Probab=29.43 E-value=44 Score=23.18 Aligned_cols=29 Identities=21% Similarity=0.512 Sum_probs=17.2
Q ss_pred CCCCCCCCCCCCCCCCCCCCCCccCCCCcee
Q 018239 72 TSKPFCDSNLLLDSPQSPTDSCEPCPSNGEC 102 (359)
Q Consensus 72 ~~~~fCds~~~~~~~~~~~p~C~PCP~hA~C 102 (359)
....-|.-+.+ +...-...|++||.+-.-
T Consensus 9 ~~C~~Cp~GtY--q~~~g~~~C~~Cp~g~~T 37 (48)
T PF07699_consen 9 NKCQPCPKGTY--QDEEGQTSCTPCPPGSTT 37 (48)
T ss_pred CccCCCCCCcc--CCccCCccCccCcCCCcc
Confidence 34556666652 223344588888887554
No 42
>PF09402 MSC: Man1-Src1p-C-terminal domain; InterPro: IPR018996 This entry represents the Inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope []. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad []. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. Has some overlapping function with emr-1.; GO: 0005639 integral to nuclear inner membrane; PDB: 2CH0_A.
Probab=29.00 E-value=18 Score=35.38 Aligned_cols=71 Identities=15% Similarity=0.203 Sum_probs=0.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHHhhhcc--CCCCCCcccccccccccCCCCC-----cCchhhHHHHHHHHhcCCCccee
Q 018239 261 FAIRVEELYHQVCEILEENALMSKSV--NGECEPWVVASRLRDHLLLPKE-----RKDPVIWKKVEELVQEDSRVDQY 331 (359)
Q Consensus 261 e~~~v~~Lv~~vi~~L~~q~~~~~~~--~~~~~pyl~~~qLRD~lL~~~~-----r~r~~LW~kV~~~Ve~nSnIr~~ 331 (359)
..+++.+|++.+.+.|++++..+.=+ .....++++...|+|.+..... ..-+.+|+.+...+.++..|...
T Consensus 97 k~~~i~~l~~~~~~~Lr~~~a~~~Cg~~~~~~~~~ls~~el~~~~~~~~~~~~~~~efe~l~~~a~~~L~~~~ei~~~ 174 (334)
T PF09402_consen 97 KEEKIEELAKKILDELRERNAQYECGDSEDDESPGLSEEELKDILSSKKSPWISDEEFEELWSAALQELKKNPEIIIR 174 (334)
T ss_dssp ------------------------------------------------------------------------------
T ss_pred HHHHHHHHHHHHHHHHHHHHhhcccCCCCCCCCCCCcHHHHHHHHHhccCccccHHHHHHHHHHHHHHHHhCCcEEEe
Confidence 35567889999999998887655433 2457899999999999995441 23388999999988887666555
No 43
>COG0690 SecE Preprotein translocase subunit SecE [Intracellular trafficking and secretion]
Probab=28.90 E-value=1.2e+02 Score=23.40 Aligned_cols=27 Identities=19% Similarity=0.444 Sum_probs=17.9
Q ss_pred CCCChhhHHHHHHHHHHHHHHHHHHHH
Q 018239 38 LFPSKQDLLRLITVVAIASSVALTCNY 64 (359)
Q Consensus 38 ~~~~~~~~~~~~~~~~~a~~~a~~~~~ 64 (359)
-+|+|.|..+...+.++..+++.+..+
T Consensus 35 ~WPsrke~~~~t~~Vl~~v~~~s~~~~ 61 (73)
T COG0690 35 VWPTRKELIRSTLIVLVVVAFFSLFLY 61 (73)
T ss_pred cCCCHHHHHHHHHHHHHHHHHHHHHHH
Confidence 369999988887766665555433333
No 44
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=28.74 E-value=21 Score=32.80 Aligned_cols=31 Identities=26% Similarity=0.720 Sum_probs=24.7
Q ss_pred cCCCCceecCC-e------e--eeCCCceecCCCcccCch
Q 018239 95 PCPSNGECHQG-K------L--ECFHGYRKHGKLCVEDGD 125 (359)
Q Consensus 95 PCP~hA~C~~g-~------l--~C~~gY~l~~~~Cv~D~~ 125 (359)
||=+.|.|... . + .|.+||++..+.|+|+.=
T Consensus 51 ~Cgdya~C~~~~~~~~~~~~~C~C~~gY~~~~~vCvp~~C 90 (197)
T PF06247_consen 51 PCGDYAKCINQANKGEERAYKCDCINGYILKQGVCVPNKC 90 (197)
T ss_dssp EEETTEEEEE-SSTTSSTSEEEEE-TTEEESSSSEEEGGG
T ss_pred cccchhhhhcCCCcccceeEEEecccCceeeCCeEchhhc
Confidence 89999999865 2 2 899999999999999853
No 45
>PRK09400 secE preprotein translocase subunit SecE; Reviewed
Probab=28.24 E-value=1.3e+02 Score=22.57 Aligned_cols=19 Identities=16% Similarity=0.461 Sum_probs=15.0
Q ss_pred CChhhHHHHHHHHHHHHHH
Q 018239 40 PSKQDLLRLITVVAIASSV 58 (359)
Q Consensus 40 ~~~~~~~~~~~~~~~a~~~ 58 (359)
|+++||.+...+.++..++
T Consensus 27 Pd~~Ef~~ia~~~~iG~~i 45 (61)
T PRK09400 27 PTREEFLLVAKVTGLGILL 45 (61)
T ss_pred CCHHHHHHHHHHHHHHHHH
Confidence 8999999988776666554
No 46
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=28.16 E-value=47 Score=22.06 Aligned_cols=17 Identities=24% Similarity=0.559 Sum_probs=12.0
Q ss_pred eecCC---eeeeCCCceecC
Q 018239 101 ECHQG---KLECFHGYRKHG 117 (359)
Q Consensus 101 ~C~~g---~l~C~~gY~l~~ 117 (359)
.|-++ .-.|..||++..
T Consensus 11 ~CDpn~~~~C~CPeGyIlde 30 (34)
T PF09064_consen 11 DCDPNSPGQCFCPEGYILDE 30 (34)
T ss_pred ccCCCCCCceeCCCceEecC
Confidence 66665 337889999853
No 47
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=27.33 E-value=35 Score=20.81 Aligned_cols=16 Identities=31% Similarity=0.800 Sum_probs=11.9
Q ss_pred eeeCCCceec--CCCccc
Q 018239 107 LECFHGYRKH--GKLCVE 122 (359)
Q Consensus 107 l~C~~gY~l~--~~~Cv~ 122 (359)
-.|.+||.+. +..|+.
T Consensus 4 C~C~~Gy~l~~d~~~C~D 21 (24)
T PF12662_consen 4 CSCPPGYQLSPDGRSCED 21 (24)
T ss_pred eeCCCCCcCCCCCCcccc
Confidence 3799999986 567764
No 48
>PF00584 SecE: SecE/Sec61-gamma subunits of protein translocation complex; InterPro: IPR001901 Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component []. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome. The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF) []. The chaperone protein SecB [] is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion []. SecE, part of the main SecYEG translocase complex, is ~106 residues in length, and spans the inner membrane of the Gram-negative bacterial envelope. Together with SecY and SecG, SecE forms a multimeric channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA. In eukaryotes, the evolutionarily related protein sec61-gamma plays a role in protein translocation through the endoplasmic reticulum; it is part of a trimeric complex that also consists of sec61-alpha and beta []. 
Both secE and sec61-gamma are small proteins of about 60 to 90 amino acids that contain a single transmembrane region at their C-terminal extremity (Escherichia coli secE is an exception, in that it possesses an extra N-terminal segment of 60 residues that contains two additional transmembrane domains) [].; GO: 0006605 protein targeting, 0006886 intracellular protein transport, 0016020 membrane; PDB: 3J01_B 2WW9_B 2WWA_B 3DL8_C 2WWB_B 3DIN_G 2ZJS_E 2ZQP_E.
Probab=26.76 E-value=1.5e+02 Score=21.15 Aligned_cols=21 Identities=24% Similarity=0.505 Sum_probs=15.2
Q ss_pred CCChhhHHHHHHHHHHHHHHH
Q 018239 39 FPSKQDLLRLITVVAIASSVA 59 (359)
Q Consensus 39 ~~~~~~~~~~~~~~~~a~~~a 59 (359)
.|+++|..+.-.+.++..++.
T Consensus 18 WP~~~e~~~~t~~Vl~~~~i~ 38 (57)
T PF00584_consen 18 WPSRKELLKSTIIVLVFVIIF 38 (57)
T ss_dssp CCCTHHHHHHHHHHHHHHHHH
T ss_pred CCCHHHHHHHHHHHHHHHHHH
Confidence 599999988776666655553
No 49
>PF14316 DUF4381: Domain of unknown function (DUF4381)
Probab=26.45 E-value=1.7e+02 Score=25.24 Aligned_cols=16 Identities=25% Similarity=0.243 Sum_probs=8.5
Q ss_pred HHHHHHHHHHHHHHhh
Q 018239 268 LYHQVCEILEENALMS 283 (359)
Q Consensus 268 Lv~~vi~~L~~q~~~~ 283 (359)
...++-.+|+..+..+
T Consensus 69 ~~~~l~~LLKr~a~~~ 84 (146)
T PF14316_consen 69 WLAALNELLKRVALQY 84 (146)
T ss_pred HHHHHHHHHHHHHHHh
Confidence 4445556665555444
No 50
>PF05545 FixQ: Cbb3-type cytochrome oxidase component FixQ; InterPro: IPR008621 This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon [].
Probab=26.29 E-value=1.3e+02 Score=21.07 Aligned_cols=9 Identities=22% Similarity=-0.064 Sum_probs=3.5
Q ss_pred HHHHHHHHH
Q 018239 232 STHALIIVP 240 (359)
Q Consensus 232 ~~~~~~i~~ 240 (359)
..+...+++
T Consensus 7 ~~~~~~~~~ 15 (49)
T PF05545_consen 7 QGFARSIGT 15 (49)
T ss_pred HHHHHHHHH
Confidence 334333333
No 51
>PF12729 4HB_MCP_1: Four helix bundle sensory module for signal transduction; InterPro: IPR024478 This entry represents a four-helix bundle that operates as a ubiquitous sensory module in prokaryotic signal-transduction, which is known as four-helix bundles methyl-accepting chemotaxis protein (4HB_MCP) domain. The 4HB_MCP is always found between two predicted transmembrane helices indicating that it detects only extracellular signals. In many cases the domain is associated with a cytoplasmic HAMP domain suggesting that most proteins carrying the bundle might share the mechanism of transmembrane signalling which is well-characterised in E. coli chemoreceptors [].
Probab=26.11 E-value=2.8e+02 Score=22.98 Aligned_cols=8 Identities=50% Similarity=0.517 Sum_probs=3.8
Q ss_pred ccccccCC
Q 018239 298 RLRDHLLL 305 (359)
Q Consensus 298 qLRD~lL~ 305 (359)
.+++.++.
T Consensus 64 ~~~~~~~~ 71 (181)
T PF12729_consen 64 ALRRYLLA 71 (181)
T ss_pred HHHHhhhc
Confidence 44455553
No 52
>PRK15428 putative propanediol utilization protein PduM; Provisional
Probab=25.97 E-value=58 Score=29.23 Aligned_cols=31 Identities=19% Similarity=0.266 Sum_probs=24.1
Q ss_pred HHHHHHHHHHHHHHHHHHHhhhccCCCCCCccccccccc
Q 018239 263 IRVEELYHQVCEILEENALMSKSVNGECEPWVVASRLRD 301 (359)
Q Consensus 263 ~~v~~Lv~~vi~~L~~q~~~~~~~~~~~~pyl~~~qLRD 301 (359)
..++.||++|+.+|+.++.... -++..|||+
T Consensus 4 ~~~~~iV~~Vv~RLk~Ra~~~~--------~ls~~ql~~ 34 (163)
T PRK15428 4 EMLQRIVEEVVARLQRRAQSTA--------TLSVAQLRD 34 (163)
T ss_pred HHHHHHHHHHHHHHHHHhhceE--------EEEHHHccC
Confidence 4578899999999998876543 377778887
No 53
>PF04882 Peroxin-3: Peroxin-3; InterPro: IPR006966 Peroxin 3 (Pex3p), also known as Peroxisomal biogenesis factor 3, has been identified and characterised as a peroxisomal membrane protein in yeasts and mammals []. Two putative peroxisomal membrane-bound Pex3p homologues have also been found in Arabidopsis thaliana []. They possess a membrane peroxisomal targeting signal. Pex3p is an integral membrane protein of peroxisomes, exposing its N- and C-terminal parts to the cytosol []. Peroxin is involved in peroxisome biosynthesis and integrity; it assembles membrane vesicles before the matrix proteins are translocated. In humans, defects in PEX3 are the cause of peroxisome biogenesis disorders [], which include Zellweger syndrome (ZWS), neonatal adrenoleukodystrophy (NALD), infantile Refsum disease (IRD), and classical rhizomelic chondrodysplasia punctata (RCDP). These are peroxisomal disorders that are the result of proteins failing to be imported into the peroxisome.; GO: 0007031 peroxisome organization, 0005779 integral to peroxisomal membrane; PDB: 3MK4_A 3AJB_A.
Probab=25.61 E-value=45 Score=34.33 Aligned_cols=29 Identities=17% Similarity=0.230 Sum_probs=0.0
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 018239 227 IHQWVSTHALIIVPVCSLLVGCLLLLWKV 255 (359)
Q Consensus 227 ir~~i~~~~~~i~~~l~~~vgi~~l~~~~ 255 (359)
++.+++||+..++...+++.|.+++..|.
T Consensus 4 ~~~f~~Rhr~k~~~~~~v~g~~y~~~~y~ 32 (432)
T PF04882_consen 4 LRSFFRRHRRKIIVTGGVVGGGYLLYQYA 32 (432)
T ss_dssp -----------------------------
T ss_pred ccccccccccccccccccccccccccccc
Confidence 56788999977765555555555444444
No 54
>PHA02817 EEV Host range protein; Provisional
Probab=25.38 E-value=77 Score=29.91 Aligned_cols=43 Identities=19% Similarity=0.401 Sum_probs=26.5
Q ss_pred CCCCCCCCCCCCCCCCCCCCCCc--cCC----CCce----------ecCCe--eeeCCCceecC
Q 018239 72 TSKPFCDSNLLLDSPQSPTDSCE--PCP----SNGE----------CHQGK--LECFHGYRKHG 117 (359)
Q Consensus 72 ~~~~fCds~~~~~~~~~~~p~C~--PCP----~hA~----------C~~g~--l~C~~gY~l~~ 117 (359)
...-.|..+. .++ ...|.|+ .|| +||. .+... +.|++||.+.+
T Consensus 66 ~~~i~C~~dG-~Ws--~~~P~C~~v~C~~P~i~NG~v~~~~~~~~y~yg~~Vty~C~~Gy~L~G 126 (225)
T PHA02817 66 EKNIICEKDG-KWN--KEFPVCKIIRCRFPALQNGFVNGIPDSKKFYYESEVSFSCKPGFVLIG 126 (225)
T ss_pred CCeEEECCCC-cCC--CCCCeeeeeECCCCCCcCceeEccccCCceEcCCEEEEEcCCCCEEcC
Confidence 4456787653 232 3468997 685 3442 23333 48999999964
No 55
>PF10500 SR-25: Nuclear RNA-splicing-associated protein; InterPro: IPR019532 SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissue types. At the N terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localisation signals strongly implies that this is a nuclear protein that may contribute to RNA splicing []. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1; also see IPR019093 from INTERPRO) signalling pathway [].
Probab=24.86 E-value=51 Score=31.03 Aligned_cols=9 Identities=11% Similarity=0.132 Sum_probs=5.4
Q ss_pred CCChHHHHH
Q 018239 177 ELDNPVYLY 185 (359)
Q Consensus 177 ~ls~~~fe~ 185 (359)
-||.|||+-
T Consensus 159 PmTkEEyea 167 (225)
T PF10500_consen 159 PMTKEEYEA 167 (225)
T ss_pred CCCHHHHHH
Confidence 467666553
No 56
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=24.78 E-value=69 Score=27.36 Aligned_cols=23 Identities=4% Similarity=0.182 Sum_probs=14.4
Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH
Q 018239 236 LIIVPVCSLLVGCLLLLWKVHRR 258 (359)
Q Consensus 236 ~~i~~~l~~~vgi~~l~~~~~r~ 258 (359)
-.+++++++++.++|++++.+|+
T Consensus 72 gv~aGvIg~Illi~y~irR~~Kk 94 (122)
T PF01102_consen 72 GVMAGVIGIILLISYCIRRLRKK 94 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHHHS--
T ss_pred HHHHHHHHHHHHHHHHHHHHhcc
Confidence 34567777777777777766655
No 57
>CHL00190 psaM photosystem I subunit XII; Provisional
Probab=24.71 E-value=86 Score=20.24 Aligned_cols=15 Identities=47% Similarity=0.414 Sum_probs=9.9
Q ss_pred HHHHHHHHHHHHHHH
Q 018239 52 VAIASSVALTCNYLA 66 (359)
Q Consensus 52 ~~~a~~~a~~~~~l~ 66 (359)
++||.++||+..+|+
T Consensus 7 i~iAL~~Al~~~iLA 21 (30)
T CHL00190 7 IFIALFLALTTGILA 21 (30)
T ss_pred HHHHHHHHHHHHHHH
Confidence 466777777776664
No 58
>PF12669 P12: Virus attachment protein p12 family
Probab=24.32 E-value=1e+02 Score=22.73 Aligned_cols=10 Identities=40% Similarity=1.006 Sum_probs=7.9
Q ss_pred CCCCceecCC
Q 018239 96 CPSNGECHQG 105 (359)
Q Consensus 96 CP~hA~C~~g 105 (359)
|+.++.|...
T Consensus 40 ~~~~~~C~~~ 49 (58)
T PF12669_consen 40 CGSSSSCHSK 49 (58)
T ss_pred CCCCCCCCCC
Confidence 5888888877
No 59
>PF07047 OPA3: Optic atrophy 3 protein (OPA3); InterPro: IPR010754 OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity []. This family consists of several optic atrophy 3 (OPA3) proteins and related proteins from other eukaryotic species; their function is unknown.
Probab=24.21 E-value=4.5e+02 Score=22.43 Aligned_cols=20 Identities=15% Similarity=0.237 Sum_probs=8.8
Q ss_pred HHHHHHHHHHHHHHHHHHHH
Q 018239 238 IVPVCSLLVGCLLLLWKVHR 257 (359)
Q Consensus 238 i~~~l~~~vgi~~l~~~~~r 257 (359)
+.=++++.+|...+++-++|
T Consensus 79 l~E~fiF~Va~~li~~E~~R 98 (134)
T PF07047_consen 79 LGEAFIFSVAAGLIIYEYWR 98 (134)
T ss_pred HHHHHHHHHHHHHHHHHHHH
Confidence 33333344444444444443
No 60
>PF07803 GSG-1: GSG1-like protein; InterPro: IPR012478 This family contains sequences bearing similarity to a region of GSG1 (Q9Z1H7 from SWISSPROT), a protein specifically expressed in testicular germ cells []. It is possible that over expression of the human homologue may be involved in tumourigenesis of human testicular germ cell tumours []. The region in question has four highly conserved cysteine residues.
Probab=24.16 E-value=1.7e+02 Score=24.90 Aligned_cols=30 Identities=20% Similarity=0.511 Sum_probs=15.6
Q ss_pred HHHHHHHHHHHHHHH-HHHHhh---hc-CCCCCCC
Q 018239 48 LITVVAIASSVALTC-NYLANF---LN-STSKPFC 77 (359)
Q Consensus 48 ~~~~~~~a~~~a~~~-~~l~~~---~~-~~~~~fC 77 (359)
+|++++..++.+|.. .|+..| .+ ..++|+|
T Consensus 9 ~Ls~~ln~LAL~~S~tA~~sSyWC~GTqKVpKPlC 43 (118)
T PF07803_consen 9 LLSLILNLLALAFSTTALLSSYWCEGTQKVPKPLC 43 (118)
T ss_pred HHHHHHHHHHHHHHHHHHhcccccccceecCCCCC
Confidence 455555554444443 334444 22 3688888
No 61
>PF07465 PsaM: Photosystem I protein M (PsaM); InterPro: IPR010010 Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction centre. PsaM forms part of the photosystem I complex and its binding is stabilised by PsaI []. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen.; GO: 0015979 photosynthesis, 0009522 photosystem I, 0030094 plasma membrane-derived photosystem I; PDB: 3PCQ_M 1JB0_M.
Probab=23.98 E-value=95 Score=19.89 Aligned_cols=15 Identities=40% Similarity=0.414 Sum_probs=9.4
Q ss_pred HHHHHHHHHHHHHHH
Q 018239 52 VAIASSVALTCNYLA 66 (359)
Q Consensus 52 ~~~a~~~a~~~~~l~ 66 (359)
++||.++|+...+|+
T Consensus 6 i~iAL~~Al~~~iLA 20 (29)
T PF07465_consen 6 IFIALVIALITGILA 20 (29)
T ss_dssp HHHHHHHHHHHHHHH
T ss_pred HHHHHHHHHHHHHHH
Confidence 456666666666654
No 62
>PF11392 DUF2877: Protein of unknown function (DUF2877); InterPro: IPR021530 Members of this bacterial protein family are putative carboxylases; however, this cannot be confirmed.
Probab=22.88 E-value=46 Score=27.60 Aligned_cols=11 Identities=36% Similarity=0.507 Sum_probs=9.0
Q ss_pred CCCCCCChhhH
Q 018239 35 PQSLFPSKQDL 45 (359)
Q Consensus 35 ~~~~~~~~~~~ 45 (359)
=|||+||.+||
T Consensus 5 G~GLTPSGDD~ 15 (110)
T PF11392_consen 5 GPGLTPSGDDF 15 (110)
T ss_pred CCCCCCchHHH
Confidence 37899999994
No 63
>smart00181 EGF Epidermal growth factor-like domain.
Probab=22.39 E-value=81 Score=19.58 Aligned_cols=25 Identities=32% Similarity=0.822 Sum_probs=17.4
Q ss_pred ccCCCCceecCC--ee--eeCCCceecCCCc
Q 018239 94 EPCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (359)
Q Consensus 94 ~PCP~hA~C~~g--~l--~C~~gY~l~~~~C 120 (359)
.+|..| .|.+. .. .|.+||... ..|
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~-~~C 34 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTGD-KRC 34 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCccC-Ccc
Confidence 478888 89864 33 899998764 444
No 64
>cd00033 CCP Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function.
Probab=22.36 E-value=53 Score=22.40 Aligned_cols=20 Identities=35% Similarity=0.773 Sum_probs=14.0
Q ss_pred eeeeCCCceecC---CCcccCch
Q 018239 106 KLECFHGYRKHG---KLCVEDGD 125 (359)
Q Consensus 106 ~l~C~~gY~l~~---~~Cv~D~~ 125 (359)
.+.|++||.+.+ -.|..|+.
T Consensus 26 ~~~C~~Gy~~~g~~~~~C~~~g~ 48 (57)
T cd00033 26 TYSCNEGYTLVGSSTITCTENGG 48 (57)
T ss_pred EEECCCCCeEeCCCeeEECCCCe
Confidence 459999999974 35665543
No 65
>TIGR03053 PS_I_psaM photosystem I reaction center subunit XII. Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. The seed alignment for this model includes sequences from Pfam model pfam07465 and additional sequences, as from Prochlorococcus.
Probab=22.08 E-value=1e+02 Score=19.67 Aligned_cols=15 Identities=40% Similarity=0.437 Sum_probs=9.6
Q ss_pred HHHHHHHHHHHHHHH
Q 018239 52 VAIASSVALTCNYLA 66 (359)
Q Consensus 52 ~~~a~~~a~~~~~l~ 66 (359)
++||.++|+...+|+
T Consensus 6 i~iaL~~Al~~~iLA 20 (29)
T TIGR03053 6 IFIALVIALIAGILA 20 (29)
T ss_pred HHHHHHHHHHHHHHH
Confidence 456666666666654
No 66
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=21.58 E-value=56 Score=34.65 Aligned_cols=22 Identities=32% Similarity=0.830 Sum_probs=19.8
Q ss_pred ccCCCCceecCCeeeeCCCcee
Q 018239 94 EPCPSNGECHQGKLECFHGYRK 115 (359)
Q Consensus 94 ~PCP~hA~C~~g~l~C~~gY~l 115 (359)
.+|..||.|.+|+-.|++||.-
T Consensus 316 adC~g~G~Ci~G~C~C~~Gy~G 337 (525)
T KOG1225|consen 316 ADCSGHGKCIDGECLCDEGYTG 337 (525)
T ss_pred ccCCCCCcccCCceEeCCCCcC
Confidence 5788999999999999999875
No 67
>cd03580 NTR_Sfrp1_like NTR domain, Secreted frizzled-related protein (Sfrp) 1-like subfamily; composed of proteins similar to human Sfrp1, Sfrp2 and Sfrp5. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp1 has been found frequently to be downregulated in breast cancer and is associated with disease progression and poor prognosis.
Probab=20.98 E-value=34 Score=29.08 Aligned_cols=27 Identities=22% Similarity=0.539 Sum_probs=22.2
Q ss_pred CCCCccCCCCceecCCee--eeCCCceec
Q 018239 90 TDSCEPCPSNGECHQGKL--ECFHGYRKH 116 (359)
Q Consensus 90 ~p~C~PCP~hA~C~~g~l--~C~~gY~l~ 116 (359)
++.|.+|+..+.++...+ -|..+|++.
T Consensus 1 ~~~C~~C~~~~~~~~~l~~~fC~sDFvik 29 (126)
T cd03580 1 PKVCPPCENEEESAKTLLDNFCASDFALK 29 (126)
T ss_pred CCcCCCcCcchhhHHHHHHHhccccEEEE
Confidence 378999999998865544 899999996
No 68
>PF15240 Pro-rich: Proline-rich
Probab=20.88 E-value=62 Score=29.48 Aligned_cols=14 Identities=21% Similarity=0.301 Sum_probs=10.5
Q ss_pred HHHHHHHHHHHHHH
Q 018239 49 ITVVAIASSVALTC 62 (359)
Q Consensus 49 ~~~~~~a~~~a~~~ 62 (359)
|+|||+|+++||.-
T Consensus 2 LlVLLSvALLALSS 15 (179)
T PF15240_consen 2 LLVLLSVALLALSS 15 (179)
T ss_pred hhHHHHHHHHHhhh
Confidence 77888888887544
No 69
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=20.66 E-value=75 Score=19.56 Aligned_cols=21 Identities=33% Similarity=0.885 Sum_probs=15.5
Q ss_pred cCCCCceecCC--ee--eeCCCcee
Q 018239 95 PCPSNGECHQG--KL--ECFHGYRK 115 (359)
Q Consensus 95 PCP~hA~C~~g--~l--~C~~gY~l 115 (359)
||..+|.|.+. .. .|.+||..
T Consensus 10 ~C~~~~~C~~~~~~~~C~C~~g~~g 34 (38)
T cd00054 10 PCQNGGTCVNTVGSYRCSCPPGYTG 34 (38)
T ss_pred CcCCCCEeECCCCCeEeECCCCCcC
Confidence 68888999766 23 78888753
No 70
>PHA02642 C-type lectin-like protein; Provisional
Probab=20.59 E-value=2e+02 Score=27.00 Aligned_cols=19 Identities=11% Similarity=0.034 Sum_probs=11.6
Q ss_pred HHHHHHHHHHHHHHHHHHH
Q 018239 48 LITVVAIASSVALTCNYLA 66 (359)
Q Consensus 48 ~~~~~~~a~~~a~~~~~l~ 66 (359)
++.||+.-.+++++.++++
T Consensus 56 ~i~~l~~~~~~~l~~~~~~ 74 (216)
T PHA02642 56 TICILITINLVPIIILMAF 74 (216)
T ss_pred hHHHHHHHHHHHHHHHHHh
Confidence 4455555556677666665
No 71
>PHA02673 ORF109 EEV glycoprotein; Provisional
Probab=20.47 E-value=1.6e+02 Score=26.41 Aligned_cols=22 Identities=23% Similarity=0.298 Sum_probs=15.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 018239 45 LLRLITVVAIASSVALTCNYLA 66 (359)
Q Consensus 45 ~~~~~~~~~~a~~~a~~~~~l~ 66 (359)
|+|+.++++|-++++++..+.+
T Consensus 35 ~~Ri~~~iSIisL~~l~v~LaL 56 (161)
T PHA02673 35 FFRLMAAIAIIVLAILVVILAL 56 (161)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 6777777777777776665543
No 72
>PF11743 DUF3301: Protein of unknown function (DUF3301); InterPro: IPR021732 This family is conserved in Proteobacteria, but the function is not known.
Probab=20.40 E-value=2.6e+02 Score=22.63 Aligned_cols=22 Identities=14% Similarity=0.095 Sum_probs=11.5
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 018239 255 VHRRRYFAIRVEELYHQVCEIL 276 (359)
Q Consensus 255 ~~r~~~e~~~v~~Lv~~vi~~L 276 (359)
+.+.++.++++.+.++..++.+
T Consensus 15 ~w~~~~~~E~A~~~a~~~C~~~ 36 (97)
T PF11743_consen 15 WWQSRRQRERALQAARRACKRQ 36 (97)
T ss_pred HHHHhhHHHHHHHHHHHHHHHc
Confidence 3333344455556666666655
No 73
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=20.19 E-value=30 Score=29.18 Aligned_cols=23 Identities=30% Similarity=0.626 Sum_probs=0.4
Q ss_pred HHHHHHHHHHHH--HHHHHHHHHHH
Q 018239 240 PVCSLLVGCLLL--LWKVHRRRYFA 262 (359)
Q Consensus 240 ~~l~~~vgi~~l--~~~~~r~~~e~ 262 (359)
+++++++|++++ .||++||.-.+
T Consensus 30 GiL~VILgiLLliGCWYckRRSGYk 54 (118)
T PF14991_consen 30 GILIVILGILLLIGCWYCKRRSGYK 54 (118)
T ss_dssp S------------------------
T ss_pred eeHHHHHHHHHHHhheeeeecchhh
Confidence 344555555544 57777665433
Done!