Query 016022
Match_columns 396
No_of_seqs 135 out of 194
Neff 6.1
Searched_HMMs 46136
Date Fri Mar 29 03:04:44 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016022.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016022hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09402 MSC: Man1-Src1p-C-ter 100.0 1.1E-61 2.4E-66 482.5 4.8 275 71-349 18-334 (334)
2 PF12946 EGF_MSP1_1: MSP1 EGF 95.4 0.006 1.3E-07 42.0 0.9 28 94-121 5-37 (37)
3 PF01683 EB: EB module; Inter 72.4 4 8.6E-05 29.5 2.8 26 92-120 27-52 (52)
4 PF13314 DUF4083: Domain of un 64.8 29 0.00063 26.3 6.0 18 261-278 40-57 (58)
5 PF06667 PspB: Phage shock pro 62.8 30 0.00066 27.6 6.2 31 242-272 13-51 (75)
6 PF07127 Nodulin_late: Late no 58.8 15 0.00033 27.0 3.7 26 73-113 26-52 (54)
7 PTZ00382 Variant-specific surf 57.4 16 0.00035 30.3 4.0 34 90-123 19-56 (96)
8 PF06387 Calcyon: D1 dopamine 55.1 15 0.00032 34.0 3.6 16 109-124 113-128 (186)
9 PF07645 EGF_CA: Calcium-bindi 54.7 6.8 0.00015 27.2 1.1 21 95-115 11-35 (42)
10 COG2976 Uncharacterized protei 51.5 52 0.0011 31.2 6.7 49 227-277 14-65 (207)
11 KOG0196 Tyrosine kinase, EPH ( 45.5 17 0.00036 41.1 2.9 41 74-116 276-319 (996)
12 PF02009 Rifin_STEVOR: Rifin/s 45.1 32 0.0007 34.4 4.6 16 248-263 274-289 (299)
13 PF01102 Glycophorin_A: Glycop 44.2 33 0.00071 30.0 3.9 20 236-255 68-87 (122)
14 KOG1214 Nidogen and related ba 41.6 16 0.00035 41.3 2.0 34 91-124 828-867 (1289)
15 PF06864 PAP_PilO: Pilin acces 41.5 50 0.0011 34.3 5.6 14 290-303 220-233 (414)
16 PF12947 EGF_3: EGF domain; I 41.3 15 0.00031 25.1 1.1 26 95-120 7-36 (36)
17 PF04891 NifQ: NifQ; InterPro 41.1 30 0.00065 31.8 3.4 16 89-104 152-167 (167)
18 PF08563 P53_TAD: P53 transact 40.0 13 0.00027 23.6 0.5 14 176-189 8-21 (25)
19 TIGR02976 phageshock_pspB phag 39.4 1.3E+02 0.0028 24.0 6.4 28 245-272 16-51 (75)
20 PF01826 TIL: Trypsin Inhibito 38.2 17 0.00036 26.5 1.1 26 96-124 27-53 (55)
21 smart00179 EGF_CA Calcium-bind 37.7 31 0.00067 22.4 2.3 26 94-120 9-38 (39)
22 PRK09458 pspB phage shock prot 37.2 86 0.0019 25.1 5.0 30 243-272 14-51 (75)
23 PF07974 EGF_2: EGF-like domai 36.2 33 0.00072 22.7 2.2 20 95-114 7-28 (32)
24 PRK11677 hypothetical protein; 34.6 1.5E+02 0.0032 26.3 6.6 7 265-271 52-58 (134)
25 cd00053 EGF Epidermal growth f 34.0 40 0.00086 20.9 2.3 25 95-120 7-35 (36)
26 PF06143 Baculo_11_kDa: Baculo 33.8 2.7E+02 0.0058 22.8 7.9 22 230-251 32-53 (84)
27 PF10576 EndIII_4Fe-2S: Iron-s 32.8 19 0.00042 20.6 0.5 14 89-102 4-17 (17)
28 PF05568 ASFV_J13L: African sw 32.7 1.1E+02 0.0024 27.7 5.5 10 230-239 26-35 (189)
29 PRK07597 secE preprotein trans 32.5 82 0.0018 23.9 4.1 28 38-65 25-52 (64)
30 TIGR00964 secE_bact preprotein 32.0 86 0.0019 23.1 4.1 27 39-65 17-43 (55)
31 PF00558 Vpu: Vpu protein; In 32.0 50 0.0011 26.8 2.9 19 256-274 30-48 (81)
32 PF07271 Cytadhesin_P30: Cytad 31.6 1.5E+02 0.0032 29.5 6.6 17 260-276 104-120 (279)
33 PF06679 DUF1180: Protein of u 30.9 1.9E+02 0.0042 26.4 6.9 31 35-65 84-114 (163)
34 PF07543 PGA2: Protein traffic 30.1 1.5E+02 0.0032 26.5 5.9 12 293-304 62-73 (140)
35 PF11044 TMEMspv1-c74-12: Plec 29.7 2.1E+02 0.0046 20.7 5.4 15 237-251 7-21 (49)
36 COG0690 SecE Preprotein transl 28.3 1.2E+02 0.0026 23.9 4.5 28 38-65 35-62 (73)
37 KOG4403 Cell surface glycoprot 28.0 2.9E+02 0.0063 29.3 8.3 11 157-167 117-127 (575)
38 PHA03399 pif3 per os infectivi 27.4 73 0.0016 30.2 3.6 21 94-114 58-86 (200)
39 KOG0474 Cl- channel CLC-7 and 27.3 90 0.0019 34.6 4.7 24 90-113 396-420 (762)
40 PF07466 DUF1517: Protein of u 26.5 1.8E+02 0.0039 29.0 6.5 23 43-65 62-84 (289)
41 PF14316 DUF4381: Domain of un 26.0 1.7E+02 0.0036 25.8 5.6 15 269-283 70-84 (146)
42 PF00584 SecE: SecE/Sec61-gamm 25.6 1.6E+02 0.0035 21.5 4.6 21 39-59 18-38 (57)
43 PF09064 Tme5_EGF_like: Thromb 25.2 55 0.0012 22.3 1.7 21 95-117 7-30 (34)
44 PF03672 UPF0154: Uncharacteri 25.1 3E+02 0.0065 21.4 6.0 18 243-260 10-27 (64)
45 PF06247 Plasmod_Pvs28: Plasmo 24.9 27 0.00059 32.7 0.3 31 95-125 51-90 (197)
46 PF08114 PMP1_2: ATPase proteo 24.0 1.9E+02 0.0042 20.5 4.3 18 246-263 20-37 (43)
47 PF14991 MLANA: Protein melan- 23.9 19 0.00041 31.0 -0.8 22 241-262 31-54 (118)
48 PF01102 Glycophorin_A: Glycop 23.9 88 0.0019 27.3 3.3 22 237-258 73-94 (122)
49 PF09402 MSC: Man1-Src1p-C-ter 23.4 27 0.00059 34.8 0.0 70 262-331 98-174 (334)
50 PHA02673 ORF109 EEV glycoprote 22.5 1.3E+02 0.0028 27.5 4.1 22 45-66 35-56 (161)
51 PF12729 4HB_MCP_1: Four helix 22.1 3.7E+02 0.0081 22.6 7.1 10 297-306 63-72 (181)
52 PRK15428 putative propanediol 21.7 81 0.0017 28.9 2.7 31 263-301 4-34 (163)
53 PF12662 cEGF: Complement Clr- 21.4 52 0.0011 20.5 1.0 16 107-122 4-21 (24)
54 PF15050 SCIMP: SCIMP protein 21.3 2.3E+02 0.0049 24.9 5.2 13 230-242 3-15 (133)
55 PF11392 DUF2877: Protein of u 21.2 52 0.0011 27.9 1.3 11 35-45 5-15 (110)
56 PF10500 SR-25: Nuclear RNA-sp 21.1 66 0.0014 30.9 2.1 9 177-185 159-167 (225)
57 PF10588 NADH-G_4Fe-4S_3: NADH 20.9 44 0.00095 23.3 0.7 16 88-103 11-26 (41)
58 cd00033 CCP Complement control 20.1 61 0.0013 22.5 1.3 20 106-125 26-48 (57)
59 PRK09400 secE preprotein trans 20.1 2E+02 0.0044 22.0 4.2 19 40-58 27-45 (61)
No 1
>PF09402 MSC: Man1-Src1p-C-terminal domain; InterPro: IPR018996 This entry represents the Inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope []. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad []. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. Has some overlapping function with emr-1.; GO: 0005639 integral to nuclear inner membrane; PDB: 2CH0_A.
Probab=100.00 E-value=1.1e-61 Score=482.48 Aligned_cols=275 Identities=28% Similarity=0.423 Sum_probs=71.3
Q ss_pred CCCCCCCCCCCCCCC------------CCCCCCCCccCCCCceecCC-eeeeCCCceec-----------CCCcccChhh
Q 016022 71 STSKPFCDSNLLLDS------------PQSPTDSCEPCPSNGECHQG-KLECFHGYRKH-----------GKLCVEDGDI 126 (396)
Q Consensus 71 ~~~~~fCds~~~~~~------------~~~~~p~C~PCPehAiC~~g-~l~C~~gYvl~-----------~~~CV~D~~k 126 (396)
+...||||++.+..+ ...++|+|+|||+||+|++| ++.|++||++. +++||+|+++
T Consensus 18 ~~~vgyC~~~~~~~~~~~~~~~~~~~~~~~~~P~C~pCP~~a~C~~~~~~~C~~~y~~~~~~l~~~g~~p~~~Ci~D~~k 97 (334)
T PF09402_consen 18 KIAVGYCGTESPSPSFADDDISVPDWLLENFKPSCEPCPEHAICYPGLKLECEPGYVLKPSPLSLFGLIPPPKCIPDTEK 97 (334)
T ss_dssp --------------------------------------------------------------------------------
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHHH
Confidence 468999999972211 14578999999999999999 99999999998 9999999999
Q ss_pred hHHHHHHHHHHHHHHHHHhhcccccC---CCCcccchhhHHHHhhhhhhhhccCCChHHHHHHHHHHHHHHHhhhhhccc
Q 016022 127 NETAGRLSRWVENRLCRAYAQFLCDG---TGSIWVEENDIWNDLEGHELMKIFELDNPVYLYTKKRTMETVGRYLESRTN 203 (396)
Q Consensus 127 ~~~i~~l~~~i~~~Lr~~~a~~~CG~---~~s~~i~e~dL~~~~~e~~~~k~~~l~~~~fe~l~~~al~~l~~~l~~~~~ 203 (396)
++.+.+|++++.++||++||+++||. ..+.+|+++||++++.+ ++++++++++|+++|..++..+.+.-+..+.
T Consensus 98 ~~~i~~l~~~~~~~Lr~~~a~~~Cg~~~~~~~~~ls~~el~~~~~~---~~~~~~~~~efe~l~~~a~~~L~~~~ei~~~ 174 (334)
T PF09402_consen 98 EEKIEELAKKILDELRERNAQYECGDSEDDESPGLSEEELKDILSS---KKSPWISDEEFEELWSAALQELKKNPEIIIR 174 (334)
T ss_dssp --------------------------------------------------------------------------------
T ss_pred HHHHHHHHHHHHHHHHHHHhhcccCCCCCCCCCCCcHHHHHHHHHh---ccCccccHHHHHHHHHHHHHHHHhCCcEEEe
Confidence 99999999999999999999999993 34678999999999998 7788999999999999999888743222221
Q ss_pred ---------CCCceeeecccccccCccCccchhHHH----HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 016022 204 ---------SYGMKELKCPELLAEHYKPLSCRIHQW----VSTHALIIVPVCSLLVGCLLLLWKVHRRRYFAIRVEELYH 270 (396)
Q Consensus 204 ---------sn~~~~~k~~~~~S~~~i~l~Crir~~----i~~~~~~i~~~l~llv~i~~l~~~~~r~~~e~~~v~~Lv~ 270 (396)
.+.........+++++++||+|++++. +.+++.+++++++++++++++++++++++.++++|++||+
T Consensus 175 ~~~~~~~~~~~~~~~~~~~~s~s~~~lpl~C~~~~~i~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~v~~lv~ 254 (334)
T PF09402_consen 175 DDIINSHSSDDSNEKDKYFRSSSLPYLPLKCRLRRQIRQFISRYRLIILGVLILLLLIKYIRYRYRKRREEKARVEELVK 254 (334)
T ss_dssp -----------------------------------------------------------------STHHHHHTTTTTTHH
T ss_pred cccccccccccccCCcEEEEeeCCCccccEEEEehHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 001111112224689999999976555 4556667777777777777778888888999999999999
Q ss_pred HHHHHHHHHHhhhhccCCCCCCcccccccccccCCCCCc--cchhhHHHHHHHHhcCCCcceeeeEEcCceeeeeEEeec
Q 016022 271 QVCEILEENALMSKSVNGECEPWVVASRLRDHLLLPKER--KDPVIWKKVEELVQEDSRVDQYPKLLKGESKVVWEWQVE 348 (396)
Q Consensus 271 ~vl~~L~~q~~~~~~~~~~~~pyl~~~qLRD~LL~~~~r--~r~~LW~kV~k~Ve~nSnIrt~~~ei~GE~~~vWEWig~ 348 (396)
+|+++|++|+..+ ..+..++|||+++||||+||.++++ ++++||++|+++||+|||||++++|+|||+|+||||||+
T Consensus 255 ~ii~~L~~~~~~~-~~~~~~~p~v~~~qLRD~ll~~~~~~~~~~~lW~~v~~~ve~ns~Vr~~~~e~~Ge~~~vWeWig~ 333 (334)
T PF09402_consen 255 KIIDRLQDQARAS-DPNSSPEPYVSISQLRDDLLPPEHRLKRRNRLWKKVVKKVEENSNVRTEVREVHGEIMRVWEWIGP 333 (334)
T ss_dssp HHHHHHHHHHHHH-TTSS-S-S-B-HHHHHHTT--STTGGG-GHHHHHHHHHHHTT---SEEEEEEETTEEEEEEE----
T ss_pred HHHHHHHHHhhhh-ccCCCCCCCccHHHHHHHhCCcccCHHHHHHHHHHHHHHHHcCCCeeEEEEEECCeEEEEEEecCC
Confidence 9999999999843 3446789999999999999987653 379999999999999999999999999999999999997
Q ss_pred C
Q 016022 349 G 349 (396)
Q Consensus 349 ~ 349 (396)
+
T Consensus 334 ~ 334 (334)
T PF09402_consen 334 N 334 (334)
T ss_dssp -
T ss_pred C
Confidence 5
No 2
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=95.43 E-value=0.006 Score=42.00 Aligned_cols=28 Identities=39% Similarity=0.954 Sum_probs=20.2
Q ss_pred ccCCCCceecCC----e-eeeCCCceecCCCcc
Q 016022 94 EPCPSNGECHQG----K-LECFHGYRKHGKLCV 121 (396)
Q Consensus 94 ~PCPehAiC~~g----~-l~C~~gYvl~~~~CV 121 (396)
++||+||-|+.+ + -.|..||.+.+.+|+
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk~~~~~C~ 37 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYKKVGGKCV 37 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEEEETTEEE
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCccccCCCcC
Confidence 589999999864 3 399999999999886
No 3
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=72.44 E-value=4 Score=29.52 Aligned_cols=26 Identities=31% Similarity=0.782 Sum_probs=23.0
Q ss_pred CCccCCCCceecCCeeeeCCCceecCCCc
Q 016022 92 SCEPCPSNGECHQGKLECFHGYRKHGKLC 120 (396)
Q Consensus 92 ~C~PCPehAiC~~g~l~C~~gYvl~~~~C 120 (396)
+|+ .++.|.+|.=.|.+||+....+|
T Consensus 27 qC~---~~s~C~~g~C~C~~g~~~~~~~C 52 (52)
T PF01683_consen 27 QCI---GGSVCVNGRCQCPPGYVEVGGRC 52 (52)
T ss_pred CCC---CcCEEcCCEeECCCCCEecCCCC
Confidence 555 99999998889999999988877
No 4
>PF13314 DUF4083: Domain of unknown function (DUF4083)
Probab=64.77 E-value=29 Score=26.34 Aligned_cols=18 Identities=22% Similarity=0.289 Sum_probs=11.1
Q ss_pred HHHHHHHHHHHHHHHHHH
Q 016022 261 FAIRVEELYHQVCEILEE 278 (396)
Q Consensus 261 e~~~v~~Lv~~vl~~L~~ 278 (396)
....+++=.+.+++.|++
T Consensus 40 ~~~~~eqKLDrIIeLLEK 57 (58)
T PF13314_consen 40 DVDSMEQKLDRIIELLEK 57 (58)
T ss_pred chhHHHHHHHHHHHHHcc
Confidence 333566666677777754
No 5
>PF06667 PspB: Phage shock protein B; InterPro: IPR009554 This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages []. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one [].; GO: 0006355 regulation of transcription, DNA-dependent, 0009271 phage shock
Probab=62.82 E-value=30 Score=27.60 Aligned_cols=31 Identities=26% Similarity=0.323 Sum_probs=18.2
Q ss_pred HHHHHHHHHH-HHHHHHHH-------HHHHHHHHHHHHH
Q 016022 242 CSLLVGCLLL-LWKVHRRR-------YFAIRVEELYHQV 272 (396)
Q Consensus 242 l~llv~i~~l-~~~~~r~~-------~e~~~v~~Lv~~v 272 (396)
.+++|+..|+ .+|..+++ .+.++.++|++.+
T Consensus 13 f~ifVap~WL~lHY~sk~~~~~gLs~~d~~~L~~L~~~a 51 (75)
T PF06667_consen 13 FMIFVAPIWLILHYRSKWKSSQGLSEEDEQRLQELYEQA 51 (75)
T ss_pred HHHHHHHHHHHHHHHHhcccCCCCCHHHHHHHHHHHHHH
Confidence 3445555555 56665554 4666677777765
No 6
>PF07127 Nodulin_late: Late nodulin protein; InterPro: IPR009810 This family consists of several plant specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the protein's C terminus which may be involved in metal-binding [].; GO: 0046872 metal ion binding, 0009878 nodule morphogenesis
Probab=58.76 E-value=15 Score=27.03 Aligned_cols=26 Identities=19% Similarity=0.586 Sum_probs=18.8
Q ss_pred CCCCCCCCCCCCCCCCCCCCCccCCCCceecCC-eeeeCCCc
Q 016022 73 SKPFCDSNLLLDSPQSPTDSCEPCPSNGECHQG-KLECFHGY 113 (396)
Q Consensus 73 ~~~fCds~~~~~~~~~~~p~C~PCPehAiC~~g-~l~C~~gY 113 (396)
....|.++. .||.+ |..+ ..+|..|+
T Consensus 26 ~~~~C~~d~-------------DCp~~--c~~~~~~kCi~~~ 52 (54)
T PF07127_consen 26 AIIPCKTDS-------------DCPKD--CPPPFIPKCINNI 52 (54)
T ss_pred CCcccCccc-------------cCCCC--CCCCcCcEeCcCC
Confidence 457888874 78888 8777 45887663
No 7
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=57.42 E-value=16 Score=30.32 Aligned_cols=34 Identities=24% Similarity=0.574 Sum_probs=23.5
Q ss_pred CCCCccCCC--CceecCCee--eeCCCceecCCCcccC
Q 016022 90 TDSCEPCPS--NGECHQGKL--ECFHGYRKHGKLCVED 123 (396)
Q Consensus 90 ~p~C~PCPe--hAiC~~g~l--~C~~gYvl~~~~CV~D 123 (396)
...|.+||. =+.|..... .|..||.+..+.|+.+
T Consensus 19 ~~~C~~C~~~~C~~C~~~~~C~~C~~GY~~~~~~Cv~~ 56 (96)
T PTZ00382 19 GSGCVLCSVGNCKSCVVDGVCGECNSGFSLDNGKCVSS 56 (96)
T ss_pred CCcCCcCCCCCCcCCCCCCccccCcCCcccCCCccccc
Confidence 346999985 234433322 7999999999989863
No 8
>PF06387 Calcyon: D1 dopamine receptor-interacting protein (calcyon); InterPro: IPR009431 This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca2+ as well as cAMP-dependent signalling []. Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) [] and schizophrenia.; GO: 0050780 dopamine receptor binding, 0007212 dopamine receptor signaling pathway, 0016021 integral to membrane
Probab=55.05 E-value=15 Score=33.98 Aligned_cols=16 Identities=25% Similarity=0.455 Sum_probs=11.5
Q ss_pred eCCCceecCCCcccCh
Q 016022 109 CFHGYRKHGKLCVEDG 124 (396)
Q Consensus 109 C~~gYvl~~~~CV~D~ 124 (396)
|-+||++..+.|+|-+
T Consensus 113 CPdGFv~khk~C~P~~ 128 (186)
T PF06387_consen 113 CPDGFVLKHKRCTPLT 128 (186)
T ss_pred CCCcceeecccccchh
Confidence 4458888888888744
No 9
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx (disulphide bonds pair cysteines 1-3, 2-4 and 5-6), where 'n': negatively charged or polar residue [DEQN], 'b': possibly beta-hydroxylated residue [DN], 'a': aromatic amino acid, 'C': cysteine, involved in disulphide bond, 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=54.72 E-value=6.8 Score=27.15 Aligned_cols=21 Identities=38% Similarity=0.911 Sum_probs=17.6
Q ss_pred cCCCCceecCC--ee--eeCCCcee
Q 016022 95 PCPSNGECHQG--KL--ECFHGYRK 115 (396)
Q Consensus 95 PCPehAiC~~g--~l--~C~~gYvl 115 (396)
+|+.++.|.+- .. .|.+||..
T Consensus 11 ~C~~~~~C~N~~Gsy~C~C~~Gy~~ 35 (42)
T PF07645_consen 11 NCPENGTCVNTEGSYSCSCPPGYEL 35 (42)
T ss_dssp SSSTTSEEEEETTEEEEEESTTEEE
T ss_pred cCCCCCEEEcCCCCEEeeCCCCcEE
Confidence 68999999876 33 99999994
No 10
>COG2976 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=51.54 E-value=52 Score=31.23 Aligned_cols=49 Identities=12% Similarity=0.346 Sum_probs=24.1
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHH-HHHHHHH-HHHHHH-HHHHHHHHHHHHHHH
Q 016022 227 IHQWVSTHALIIVPVCSLLVGCL-LLLWKVH-RRRYFA-IRVEELYHQVCEILE 277 (396)
Q Consensus 227 ir~~i~~~~~~i~~~l~llv~i~-~l~~~~~-r~~~e~-~~v~~Lv~~vl~~L~ 277 (396)
++.|++.+-..++. ++++|+. ++-|.++ .++.++ +.....|+.+++.++
T Consensus 14 ik~wwkeNGk~li~--gviLg~~~lfGW~ywq~~q~~q~~~AS~~Y~~~i~~~~ 65 (207)
T COG2976 14 IKDWWKENGKALIV--GVILGLGGLFGWRYWQSHQVEQAQEASAQYQNAIKAVQ 65 (207)
T ss_pred HHHHHHHCCchhHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence 56676666543322 2222222 3345554 333332 245667777777763
No 11
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=45.50 E-value=17 Score=41.11 Aligned_cols=41 Identities=24% Similarity=0.566 Sum_probs=28.5
Q ss_pred CCCCCCCCCCCCCCCCCCCCccCCCCcee-cCC-ee-eeCCCceec
Q 016022 74 KPFCDSNLLLDSPQSPTDSCEPCPSNGEC-HQG-KL-ECFHGYRKH 116 (396)
Q Consensus 74 ~~fCds~~~~~~~~~~~p~C~PCPehAiC-~~g-~l-~C~~gYvl~ 116 (396)
..-|..+. -+.......|.|||+|.+= ..| .. .|..||-..
T Consensus 276 C~aCp~G~--yK~~~~~~~C~~CP~~S~s~~ega~~C~C~~gyyRA 319 (996)
T KOG0196|consen 276 CQACPPGT--YKASQGDSLCLPCPPNSHSSSEGATSCTCENGYYRA 319 (996)
T ss_pred ceeCCCCc--ccCCCCCCCCCCCCCCCCCCCCCCCcccccCCcccC
Confidence 33455543 1233456789999999998 556 55 999999983
No 12
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=45.05 E-value=32 Score=34.44 Aligned_cols=16 Identities=13% Similarity=0.358 Sum_probs=8.5
Q ss_pred HHHHHHHHHHHHHHHH
Q 016022 248 CLLLLWKVHRRRYFAI 263 (396)
Q Consensus 248 i~~l~~~~~r~~~e~~ 263 (396)
|.||+++|||+++.+.
T Consensus 274 IIYLILRYRRKKKmkK 289 (299)
T PF02009_consen 274 IIYLILRYRRKKKMKK 289 (299)
T ss_pred HHHHHHHHHHHhhhhH
Confidence 4455566666554443
No 13
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=44.19 E-value=33 Score=29.95 Aligned_cols=20 Identities=30% Similarity=0.333 Sum_probs=9.4
Q ss_pred HHHHHHHHHHHHHHHHHHHH
Q 016022 236 LIIVPVCSLLVGCLLLLWKV 255 (396)
Q Consensus 236 ~~i~~~l~llv~i~~l~~~~ 255 (396)
.+|+++++.++|+.+++.|+
T Consensus 68 ~Ii~gv~aGvIg~Illi~y~ 87 (122)
T PF01102_consen 68 GIIFGVMAGVIGIILLISYC 87 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHHH
T ss_pred ehhHHHHHHHHHHHHHHHHH
Confidence 34455554445544444443
No 14
>KOG1214 consensus Nidogen and related basement membrane proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=41.62 E-value=16 Score=41.25 Aligned_cols=34 Identities=35% Similarity=0.842 Sum_probs=29.4
Q ss_pred CCCcc--CCCCceecCC--ee--eeCCCceecCCCcccCh
Q 016022 91 DSCEP--CPSNGECHQG--KL--ECFHGYRKHGKLCVEDG 124 (396)
Q Consensus 91 p~C~P--CPehAiC~~g--~l--~C~~gYvl~~~~CV~D~ 124 (396)
++|.| |=++|.||+. .+ +|.+||.--+-.||||+
T Consensus 828 DeC~psrChp~A~CyntpgsfsC~C~pGy~GDGf~CVP~~ 867 (1289)
T KOG1214|consen 828 DECSPSRCHPAATCYNTPGSFSCRCQPGYYGDGFQCVPDT 867 (1289)
T ss_pred cccCccccCCCceEecCCCcceeecccCccCCCceecCCC
Confidence 67776 9999999987 33 99999999999999993
No 15
>PF06864 PAP_PilO: Pilin accessory protein (PilO); InterPro: IPR009663 This family consists of several enterobacterial PilO proteins. The function of PilO is unknown although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins, but PilO protein is translocated to the outer membrane in the presence of other Pil proteins. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body []. This family does not seem to be related to IPR007445 from INTERPRO.
Probab=41.45 E-value=50 Score=34.26 Aligned_cols=14 Identities=21% Similarity=0.581 Sum_probs=9.6
Q ss_pred CCCccccccccccc
Q 016022 290 CEPWVVASRLRDHL 303 (396)
Q Consensus 290 ~~pyl~~~qLRD~L 303 (396)
++||...+..-+.|
T Consensus 220 ~~PW~~~P~~~~fl 233 (414)
T PF06864_consen 220 PHPWAKQPSVQAFL 233 (414)
T ss_pred CCCcccCCCHHHHH
Confidence 56888777666654
No 16
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=41.33 E-value=15 Score=25.05 Aligned_cols=26 Identities=31% Similarity=0.797 Sum_probs=17.4
Q ss_pred cCCCCceecCC--ee--eeCCCceecCCCc
Q 016022 95 PCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (396)
Q Consensus 95 PCPehAiC~~g--~l--~C~~gYvl~~~~C 120 (396)
.|=+||.|.+- .+ .|.+||.--+..|
T Consensus 7 ~C~~nA~C~~~~~~~~C~C~~Gy~GdG~~C 36 (36)
T PF12947_consen 7 GCHPNATCTNTGGSYTCTCKPGYEGDGFFC 36 (36)
T ss_dssp GS-TTCEEEE-TTSEEEEE-CEEECCSTCE
T ss_pred CCCCCcEeecCCCCEEeECCCCCccCCcCC
Confidence 67889999876 44 9999998655544
No 17
>PF04891 NifQ: NifQ; InterPro: IPR006975 NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co) [], which is an integral part of the active site of dinitrogenase []. The conserved C-terminal cysteine residues may be involved in metal binding [].; GO: 0030151 molybdenum ion binding, 0009399 nitrogen fixation
Probab=41.12 E-value=30 Score=31.79 Aligned_cols=16 Identities=31% Similarity=0.746 Sum_probs=14.0
Q ss_pred CCCCCccCCCCceecC
Q 016022 89 PTDSCEPCPSNGECHQ 104 (396)
Q Consensus 89 ~~p~C~PCPehAiC~~ 104 (396)
..|+|.-|.+++.||.
T Consensus 152 ~aPsC~~C~D~~~CFG 167 (167)
T PF04891_consen 152 RAPSCEECSDYAVCFG 167 (167)
T ss_pred CCCCCCCcCCHhhcCC
Confidence 3589999999999984
No 18
>PF08563 P53_TAD: P53 transactivation motif; InterPro: IPR013872 The binding of this protein by regulatory proteins regulates p53 transcription activation. This entry is comprised of a single amphipathic alpha helix and contains a highly conserved motif [, ]. ; GO: 0005515 protein binding; PDB: 1YCQ_B 2Z5T_R 3DAB_B 3DAC_B 2Z5S_Q 2K8F_B 2L14_B 1YCR_B.
Probab=40.02 E-value=13 Score=23.58 Aligned_cols=14 Identities=7% Similarity=0.007 Sum_probs=9.9
Q ss_pred cCCChHHHHHHHHH
Q 016022 176 FELDNPVYLYTKKR 189 (396)
Q Consensus 176 ~~l~~~~fe~l~~~ 189 (396)
+-|+++.|++||+.
T Consensus 8 ~PLSQeTF~~LW~~ 21 (25)
T PF08563_consen 8 LPLSQETFSDLWNL 21 (25)
T ss_dssp ---STCCHHHHHHT
T ss_pred CCccHHHHHHHHHh
Confidence 45889999999974
No 19
>TIGR02976 phageshock_pspB phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response.
Probab=39.42 E-value=1.3e+02 Score=24.01 Aligned_cols=28 Identities=29% Similarity=0.338 Sum_probs=14.9
Q ss_pred HHHHHHH-HHHHHHHH-------HHHHHHHHHHHHH
Q 016022 245 LVGCLLL-LWKVHRRR-------YFAIRVEELYHQV 272 (396)
Q Consensus 245 lv~i~~l-~~~~~r~~-------~e~~~v~~Lv~~v 272 (396)
+++..|+ .+|..+++ .+.++..+|++.+
T Consensus 16 fVap~wl~lHY~~k~~~~~~ls~~d~~~L~~L~~~a 51 (75)
T TIGR02976 16 FVAPLWLILHYRSKRKTAASLSTDDQALLQELYAKA 51 (75)
T ss_pred HHHHHHHHHHHHhhhccCCCCCHHHHHHHHHHHHHH
Confidence 3444444 55554443 3555666776654
No 20
>PF01826 TIL: Trypsin Inhibitor like cysteine rich domain; InterPro: IPR002919 This domain is found in proteinase inhibitors as well as in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. This inhibitor domain belongs to MEROPS inhibitor family I8 (clan IA). Proteins containing this domain inhibit peptidases belonging to families S1 (IPR001254 from INTERPRO), S8 (IPR000209 from INTERPRO), and M4 (IPR001570 from INTERPRO) [] and are restricted to the chordata, nematoda, arthropoda and echinodermata. Examples of proteins containing this domain are: chymotrypsin/elastase inhibitor from Ascaris suum (pig roundworm), Acp62F protein from Drosophila melanogaster, Bombina trypsin inhibitor from Bombina maxima (large-webbed bell toad), Bombyx subtilisin inhibitor from Bombyx mori (silk moth), and von Willebrand factor.; PDB: 2P3F_N 1HX2_A 1CCV_A 1EAI_D 2H9E_C 1COU_A 1ATE_A 1ATB_A 1ATD_A 1ATA_A ....
Probab=38.18 E-value=17 Score=26.45 Aligned_cols=26 Identities=31% Similarity=0.752 Sum_probs=20.6
Q ss_pred CCCCceecCCeeeeCCCceecCC-CcccCh
Q 016022 96 CPSNGECHQGKLECFHGYRKHGK-LCVEDG 124 (396)
Q Consensus 96 CPehAiC~~g~l~C~~gYvl~~~-~CV~D~ 124 (396)
|+ ..|.+| =.|.+||++... .||+-.
T Consensus 27 C~--~~C~~g-C~C~~G~v~~~~~~CV~~~ 53 (55)
T PF01826_consen 27 CS--EPCVEG-CFCPPGYVRNDNGRCVPPS 53 (55)
T ss_dssp CS--SS-ESE-EEETTTEEEETTSEEEEGG
T ss_pred cC--CCCCcc-CCCCCCeeEcCCCCEEcHH
Confidence 55 778888 789999999876 999864
No 21
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=37.69 E-value=31 Score=22.37 Aligned_cols=26 Identities=38% Similarity=1.017 Sum_probs=18.6
Q ss_pred ccCCCCceecCC--ee--eeCCCceecCCCc
Q 016022 94 EPCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (396)
Q Consensus 94 ~PCPehAiC~~g--~l--~C~~gYvl~~~~C 120 (396)
.||..+|.|.+. .. .|.+||. .+..|
T Consensus 9 ~~C~~~~~C~~~~g~~~C~C~~g~~-~g~~C 38 (39)
T smart00179 9 NPCQNGGTCVNTVGSYRCECPPGYT-DGRNC 38 (39)
T ss_pred CCcCCCCEeECCCCCeEeECCCCCc-cCCcC
Confidence 369899999854 22 7889987 45555
No 22
>PRK09458 pspB phage shock protein B; Provisional
Probab=37.19 E-value=86 Score=25.10 Aligned_cols=30 Identities=23% Similarity=0.261 Sum_probs=17.2
Q ss_pred HHHHHHHHH-HHHHHHHH-------HHHHHHHHHHHHH
Q 016022 243 SLLVGCLLL-LWKVHRRR-------YFAIRVEELYHQV 272 (396)
Q Consensus 243 ~llv~i~~l-~~~~~r~~-------~e~~~v~~Lv~~v 272 (396)
+++|+-.|+ .+|..+++ .+.++.++|++.+
T Consensus 14 ~ifVaPiWL~LHY~sk~~~~~~Ls~~d~~~L~~L~~~A 51 (75)
T PRK09458 14 VLFVAPIWLWLHYRSKRQGSQGLSQEEQQRLAQLTEKA 51 (75)
T ss_pred HHHHHHHHHHHhhcccccCCCCCCHHHHHHHHHHHHHH
Confidence 344454444 56655443 4666677777765
No 23
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=36.20 E-value=33 Score=22.68 Aligned_cols=20 Identities=35% Similarity=0.856 Sum_probs=17.0
Q ss_pred cCCCCceec--CCeeeeCCCce
Q 016022 95 PCPSNGECH--QGKLECFHGYR 114 (396)
Q Consensus 95 PCPehAiC~--~g~l~C~~gYv 114 (396)
.|=.||+|. .|.=.|++||.
T Consensus 7 ~C~~~G~C~~~~g~C~C~~g~~ 28 (32)
T PF07974_consen 7 ICSGHGTCVSPCGRCVCDSGYT 28 (32)
T ss_pred ccCCCCEEeCCCCEEECCCCCc
Confidence 588999999 56779999984
No 24
>PRK11677 hypothetical protein; Provisional
Probab=34.55 E-value=1.5e+02 Score=26.29 Aligned_cols=7 Identities=0% Similarity=0.192 Sum_probs=2.6
Q ss_pred HHHHHHH
Q 016022 265 VEELYHQ 271 (396)
Q Consensus 265 v~~Lv~~ 271 (396)
|.+.+.+
T Consensus 52 V~~HFa~ 58 (134)
T PRK11677 52 LVSHFAR 58 (134)
T ss_pred HHHHHHH
Confidence 3333333
No 25
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=34.03 E-value=40 Score=20.93 Aligned_cols=25 Identities=32% Similarity=0.890 Sum_probs=17.9
Q ss_pred cCCCCceecCC--ee--eeCCCceecCCCc
Q 016022 95 PCPSNGECHQG--KL--ECFHGYRKHGKLC 120 (396)
Q Consensus 95 PCPehAiC~~g--~l--~C~~gYvl~~~~C 120 (396)
+|..||+|.+. .. .|..||... ..|
T Consensus 7 ~C~~~~~C~~~~~~~~C~C~~g~~g~-~~C 35 (36)
T cd00053 7 PCSNGGTCVNTPGSYRCVCPPGYTGD-RSC 35 (36)
T ss_pred CCCCCCEEecCCCCeEeECCCCCccc-CCc
Confidence 67788999984 23 899998654 344
No 26
>PF06143 Baculo_11_kDa: Baculovirus 11 kDa family; InterPro: IPR009313 This is a family of uncharacterised Baculovirus proteins that are all about 11 kDa in size.
Probab=33.77 E-value=2.7e+02 Score=22.83 Aligned_cols=22 Identities=14% Similarity=0.354 Sum_probs=12.6
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 016022 230 WVSTHALIIVPVCSLLVGCLLL 251 (396)
Q Consensus 230 ~i~~~~~~i~~~l~llv~i~~l 251 (396)
.++.+++.|.+++++++.++++
T Consensus 32 firdFvLVic~~lVfVii~lFi 53 (84)
T PF06143_consen 32 FIRDFVLVICCFLVFVIIVLFI 53 (84)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 4566666666665555555444
No 27
>PF10576 EndIII_4Fe-2S: Iron-sulfur binding domain of endonuclease III; InterPro: IPR003651 Endonuclease III (4.2.99.18 from EC) is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism [, ]. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair []. The 3-D structures of Escherichia coli endonuclease III [] and catalytic domain of MutY [] have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL) []. Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif [, ]. The iron-sulphur cluster loop (FCL) is also found in DNA-(apurinic or apyrimidinic site) lyase, a subfamily of endonuclease III. The enzyme has both apurinic and apyrimidinic endonuclease activity and a DNA N-glycosylase activity. It cuts damaged DNA at cytosines, thymines and guanines, and acts on the damaged strand 5' of the damaged site. 
The enzyme binds a 4Fe-4S cluster which is not important for the catalytic activity, but is probably involved in the alignment of the enzyme along the DNA strand.; GO: 0004519 endonuclease activity, 0051539 4 iron, 4 sulfur cluster binding; PDB: 1VRL_A 1RRQ_A 3G0Q_A 3FSQ_A 1RRS_A 3FSP_A 2ABK_A 1KG7_A 1KG2_A 1MUN_A ....
Probab=32.83 E-value=19 Score=20.62 Aligned_cols=14 Identities=36% Similarity=0.923 Sum_probs=8.6
Q ss_pred CCCCCccCCCCcee
Q 016022 89 PTDSCEPCPSNGEC 102 (396)
Q Consensus 89 ~~p~C~PCPehAiC 102 (396)
.+|.|.-||-+..|
T Consensus 4 r~P~C~~Cpl~~~C 17 (17)
T PF10576_consen 4 RKPKCEECPLADYC 17 (17)
T ss_dssp SS--GGG-TTGGG-
T ss_pred CCCccccCCCcccC
Confidence 47899999999887
No 28
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=32.66 E-value=1.1e+02 Score=27.66 Aligned_cols=10 Identities=40% Similarity=0.630 Sum_probs=4.8
Q ss_pred HHHHHHHHHH
Q 016022 230 WVSTHALIIV 239 (396)
Q Consensus 230 ~i~~~~~~i~ 239 (396)
++..|+..|+
T Consensus 26 ffsthm~tIL 35 (189)
T PF05568_consen 26 FFSTHMYTIL 35 (189)
T ss_pred HHHHHHHHHH
Confidence 3455554443
No 29
>PRK07597 secE preprotein translocase subunit SecE; Reviewed
Probab=32.54 E-value=82 Score=23.87 Aligned_cols=28 Identities=25% Similarity=0.420 Sum_probs=20.0
Q ss_pred CCCChhhHHHHHHHHHHHHHHHHHHHHH
Q 016022 38 LFPSKQDLLRLITVVAIASSVALTCNYL 65 (396)
Q Consensus 38 ~~~~~~~~~~~~~v~~ia~~~a~~c~~l 65 (396)
-.|+++|..+...+.+++.++..+..++
T Consensus 25 ~WPs~~e~~~~t~~Vi~~~~~~~~~i~~ 52 (64)
T PRK07597 25 TWPTRKELVRSTIVVLVFVAFFALFFYL 52 (64)
T ss_pred cCcCHHHHHhHHHHHHHHHHHHHHHHHH
Confidence 3699999998888777777665444443
No 30
>TIGR00964 secE_bact preprotein translocase, SecE subunit, bacterial. This model represents exclusively the bacterial (and some organellar) SecE protein. SecE is part of the core heterotrimer, SecYEG, of the Sec preprotein translocase system. Other components are the ATPase SecA, a cytosolic chaperone SecB, and an accessory complex of SecDF and YajC.
Probab=32.02 E-value=86 Score=23.09 Aligned_cols=27 Identities=19% Similarity=0.261 Sum_probs=18.9
Q ss_pred CCChhhHHHHHHHHHHHHHHHHHHHHH
Q 016022 39 FPSKQDLLRLITVVAIASSVALTCNYL 65 (396)
Q Consensus 39 ~~~~~~~~~~~~v~~ia~~~a~~c~~l 65 (396)
.|+|+|..+...+.++++++..+..++
T Consensus 17 WPt~~e~~~~t~~Vi~~~~~~~~~~~~ 43 (55)
T TIGR00964 17 WPSRKELITYTIVVIVFVIFFSLFLFG 43 (55)
T ss_pred CcCHHHHHhHHHHHHHHHHHHHHHHHH
Confidence 699999988877777766664444333
No 31
>PF00558 Vpu: Vpu protein; InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells [].; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=31.95 E-value=50 Score=26.82 Aligned_cols=19 Identities=16% Similarity=0.238 Sum_probs=5.5
Q ss_pred HHHHHHHHHHHHHHHHHHH
Q 016022 256 HRRRYFAIRVEELYHQVCE 274 (396)
Q Consensus 256 ~r~~~e~~~v~~Lv~~vl~ 274 (396)
|++.+.+++++++++.+.+
T Consensus 30 Yrk~~rqrkId~li~RIre 48 (81)
T PF00558_consen 30 YRKIKRQRKIDRLIERIRE 48 (81)
T ss_dssp ---------CHHHHHHHHC
T ss_pred HHHHHHHHhHHHHHHHHHc
Confidence 5555556667666654433
No 32
>PF07271 Cytadhesin_P30: Cytadhesin P30/P32; InterPro: IPR009896 This family consists of several Mycoplasma species-specific cytadhesin P32 and P30 proteins. P30 has been found to be membrane-associated and localised on the tip organelle. It is thought to be important in cytadherence and virulence [].; GO: 0007157 heterophilic cell-cell adhesion, 0009405 pathogenesis, 0016021 integral to membrane
Probab=31.59 E-value=1.5e+02 Score=29.45 Aligned_cols=17 Identities=24% Similarity=0.022 Sum_probs=10.9
Q ss_pred HHHHHHHHHHHHHHHHH
Q 016022 260 YFAIRVEELYHQVCEIL 276 (396)
Q Consensus 260 ~e~~~v~~Lv~~vl~~L 276 (396)
+|+++.++++++.-.+-
T Consensus 104 ee~e~~~q~~e~~~~i~ 120 (279)
T PF07271_consen 104 EEKEEHEQLAEQLGRIS 120 (279)
T ss_pred HHHHHHHHHHHHHHHHH
Confidence 56667788887654443
No 33
>PF06679 DUF1180: Protein of unknown function (DUF1180); InterPro: IPR009565 This entry consists of several hypothetical eukaryotic proteins thought to be membrane proteins. Their function is unknown.
Probab=30.95 E-value=1.9e+02 Score=26.44 Aligned_cols=31 Identities=23% Similarity=0.231 Sum_probs=23.1
Q ss_pred CCCCCCChhhHHHHHHHHHHHHHHHHHHHHH
Q 016022 35 PQSLFPSKQDLLRLITVVAIASSVALTCNYL 65 (396)
Q Consensus 35 ~~~~~~~~~~~~~~~~v~~ia~~~a~~c~~l 65 (396)
|..+-+.+.-+.|.++||..+++.+.+|+++
T Consensus 84 ~s~~~~d~~~l~R~~~Vl~g~s~l~i~yfvi 114 (163)
T PF06679_consen 84 PSPSSPDSPMLKRALYVLVGLSALAILYFVI 114 (163)
T ss_pred cCCCcCCccchhhhHHHHHHHHHHHHHHHHH
Confidence 3345567777888999888888888777774
No 34
>PF07543 PGA2: Protein trafficking PGA2; InterPro: IPR011431 A Saccharomyces cerevisiae (Baker's yeast) member of this family (PGA2, P53903 from SWISSPROT) is a single pass membrane protein which has been implicated in protein trafficking [, ].
Probab=30.07 E-value=1.5e+02 Score=26.50 Aligned_cols=12 Identities=17% Similarity=0.094 Sum_probs=6.7
Q ss_pred cccccccccccC
Q 016022 293 WVVASRLRDHLL 304 (396)
Q Consensus 293 yl~~~qLRD~LL 304 (396)
=|+.+.||+..-
T Consensus 62 k~s~n~lRg~~~ 73 (140)
T PF07543_consen 62 KISPNALRGGKA 73 (140)
T ss_pred cCCchhhccccc
Confidence 356666666433
No 35
>PF11044 TMEMspv1-c74-12: Plectrovirus spv1-c74 ORF 12 transmembrane protein; InterPro: IPR022743 This is a group of proteins expressed by Plectroviruses. The Plectroviruses are single-stranded DNA viruses belonging to the Inoviridae. This entry represents putative transmembrane proteins of unknown function.
Probab=29.75 E-value=2.1e+02 Score=20.70 Aligned_cols=15 Identities=20% Similarity=0.104 Sum_probs=5.7
Q ss_pred HHHHHHHHHHHHHHH
Q 016022 237 IIVPVCSLLVGCLLL 251 (396)
Q Consensus 237 ~i~~~l~llv~i~~l 251 (396)
.|+++++++..++|+
T Consensus 7 ~iFsvvIil~If~~i 21 (49)
T PF11044_consen 7 TIFSVVIILGIFAWI 21 (49)
T ss_pred HHHHHHHHHHHHHHH
Confidence 344443333333343
No 36
>COG0690 SecE Preprotein translocase subunit SecE [Intracellular trafficking and secretion]
Probab=28.35 E-value=1.2e+02 Score=23.92 Aligned_cols=28 Identities=18% Similarity=0.376 Sum_probs=18.5
Q ss_pred CCCChhhHHHHHHHHHHHHHHHHHHHHH
Q 016022 38 LFPSKQDLLRLITVVAIASSVALTCNYL 65 (396)
Q Consensus 38 ~~~~~~~~~~~~~v~~ia~~~a~~c~~l 65 (396)
-+|++.|..+...+.++..+++.+..++
T Consensus 35 ~WPsrke~~~~t~~Vl~~v~~~s~~~~~ 62 (73)
T COG0690 35 VWPTRKELIRSTLIVLVVVAFFSLFLYG 62 (73)
T ss_pred cCCCHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 3699999888877666655554443333
No 37
>KOG4403 consensus Cell surface glycoprotein STIM, contains SAM domain [General function prediction only]
Probab=27.95 E-value=2.9e+02 Score=29.35 Aligned_cols=11 Identities=18% Similarity=0.691 Sum_probs=6.6
Q ss_pred ccchhhHHHHh
Q 016022 157 WVEENDIWNDL 167 (396)
Q Consensus 157 ~i~e~dL~~~~ 167 (396)
.|+.+|||+.-
T Consensus 117 ~ItVedLWeaW 127 (575)
T KOG4403|consen 117 HITVEDLWEAW 127 (575)
T ss_pred ceeHHHHHHHH
Confidence 46666666653
No 38
>PHA03399 pif3 per os infectivity factor 3; Provisional
Probab=27.43 E-value=73 Score=30.17 Aligned_cols=21 Identities=24% Similarity=0.768 Sum_probs=15.7
Q ss_pred ccCCCCceecCC--------eeeeCCCce
Q 016022 94 EPCPSNGECHQG--------KLECFHGYR 114 (396)
Q Consensus 94 ~PCPehAiC~~g--------~l~C~~gYv 114 (396)
+||=.+..|.++ .+.|+.||=
T Consensus 58 lPCVtD~QC~dnC~~~~~~~~~~C~~GFC 86 (200)
T PHA03399 58 LPCVTDQQCRDNCAIGSAAGVMTCDGGFC 86 (200)
T ss_pred CCcccHHHHHHHHHhccccceEECCCCee
Confidence 488899888754 458988863
No 39
>KOG0474 consensus Cl- channel CLC-7 and related proteins (CLC superfamily) [Inorganic ion transport and metabolism]
Probab=27.34 E-value=90 Score=34.59 Aligned_cols=24 Identities=29% Similarity=0.584 Sum_probs=13.6
Q ss_pred CCCCccCCCCceecCC-eeeeCCCc
Q 016022 90 TDSCEPCPSNGECHQG-KLECFHGY 113 (396)
Q Consensus 90 ~p~C~PCPehAiC~~g-~l~C~~gY 113 (396)
-..|+|||....=..- .+-|.+|+
T Consensus 396 l~~C~P~~~~~~~~~~p~f~Cp~~~ 420 (762)
T KOG0474|consen 396 LADCQPCPPSITEGQCPTFFCPDGE 420 (762)
T ss_pred HhcCCCCCCCcccccCccccCCCCc
Confidence 3578888876533211 25676664
No 40
>PF07466 DUF1517: Protein of unknown function (DUF1517); InterPro: IPR010903 This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown.
Probab=26.53 E-value=1.8e+02 Score=28.98 Aligned_cols=23 Identities=4% Similarity=0.163 Sum_probs=12.5
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHH
Q 016022 43 QDLLRLITVVAIASSVALTCNYL 65 (396)
Q Consensus 43 ~~~~~~~~v~~ia~~~a~~c~~l 65 (396)
..|.-++.+|+++.+++++..++
T Consensus 62 gg~~gl~~iLIl~~Ia~~vv~~~ 84 (289)
T PF07466_consen 62 GGFGGLFDILILFGIAFFVVRFF 84 (289)
T ss_pred cccchHHHHHHHHHHHHHHHHHH
Confidence 33455666666555555554444
No 41
>PF14316 DUF4381: Domain of unknown function (DUF4381)
Probab=26.04 E-value=1.7e+02 Score=25.77 Aligned_cols=15 Identities=27% Similarity=0.275 Sum_probs=6.9
Q ss_pred HHHHHHHHHHHHhhh
Q 016022 269 YHQVCEILEENALMS 283 (396)
Q Consensus 269 v~~vl~~L~~q~~~~ 283 (396)
..++-.+|+..+..+
T Consensus 70 ~~~l~~LLKr~a~~~ 84 (146)
T PF14316_consen 70 LAALNELLKRVALQY 84 (146)
T ss_pred HHHHHHHHHHHHHHh
Confidence 334455555444433
No 42
>PF00584 SecE: SecE/Sec61-gamma subunits of protein translocation complex; InterPro: IPR001901 Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component []. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome. The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecCY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF) []. The chaperone protein SecB [] is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion []. SecE, part of the main SecYEG translocase complex, is ~106 residues in length, and spans the inner membrane of the Gram-negative bacterial envelope. Together with SecY and SecG, SecE forms a multimeric channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA. In eukaryotes, the evolutionarily related protein sec61-gamma plays a role in protein translocation through the endoplasmic reticulum; it is part of a trimeric complex that also consists of sec61-alpha and beta []. 
Both secE and sec61-gamma are small proteins of about 60 to 90 amino acids that contain a single transmembrane region at their C-terminal extremity (Escherichia coli secE is an exception, in that it possesses an extra N-terminal segment of 60 residues that contains two additional transmembrane domains) [].; GO: 0006605 protein targeting, 0006886 intracellular protein transport, 0016020 membrane; PDB: 3J01_B 2WW9_B 2WWA_B 3DL8_C 2WWB_B 3DIN_G 2ZJS_E 2ZQP_E.
Probab=25.55 E-value=1.6e+02 Score=21.51 Aligned_cols=21 Identities=24% Similarity=0.505 Sum_probs=15.2
Q ss_pred CCChhhHHHHHHHHHHHHHHH
Q 016022 39 FPSKQDLLRLITVVAIASSVA 59 (396)
Q Consensus 39 ~~~~~~~~~~~~v~~ia~~~a 59 (396)
.|+++|..+.-.+.++..++.
T Consensus 18 WP~~~e~~~~t~~Vl~~~~i~ 38 (57)
T PF00584_consen 18 WPSRKELLKSTIIVLVFVIIF 38 (57)
T ss_dssp CCCTHHHHHHHHHHHHHHHHH
T ss_pred CCCHHHHHHHHHHHHHHHHHH
Confidence 599999888776666655553
No 43
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=25.22 E-value=55 Score=22.26 Aligned_cols=21 Identities=29% Similarity=0.765 Sum_probs=14.2
Q ss_pred cCCCCceecCC---eeeeCCCceecC
Q 016022 95 PCPSNGECHQG---KLECFHGYRKHG 117 (396)
Q Consensus 95 PCPehAiC~~g---~l~C~~gYvl~~ 117 (396)
.||. .|-++ .-.|-+||++..
T Consensus 7 ~CpA--~CDpn~~~~C~CPeGyIlde 30 (34)
T PF09064_consen 7 ECPA--DCDPNSPGQCFCPEGYILDE 30 (34)
T ss_pred cCCC--ccCCCCCCceeCCCceEecC
Confidence 3553 77776 338889999853
No 44
>PF03672 UPF0154: Uncharacterised protein family (UPF0154); InterPro: IPR005359 The proteins in this entry are functionally uncharacterised.
Probab=25.12 E-value=3e+02 Score=21.38 Aligned_cols=18 Identities=6% Similarity=0.143 Sum_probs=9.0
Q ss_pred HHHHHHHHHHHHHHHHHH
Q 016022 243 SLLVGCLLLLWKVHRRRY 260 (396)
Q Consensus 243 ~llv~i~~l~~~~~r~~~ 260 (396)
++++|+++.++++.++-+
T Consensus 10 G~~~Gff~ar~~~~k~l~ 27 (64)
T PF03672_consen 10 GAVIGFFIARKYMEKQLK 27 (64)
T ss_pred HHHHHHHHHHHHHHHHHH
Confidence 444555554565554443
No 45
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=24.94 E-value=27 Score=32.74 Aligned_cols=31 Identities=26% Similarity=0.720 Sum_probs=24.7
Q ss_pred cCCCCceecCC-e------e--eeCCCceecCCCcccChh
Q 016022 95 PCPSNGECHQG-K------L--ECFHGYRKHGKLCVEDGD 125 (396)
Q Consensus 95 PCPehAiC~~g-~------l--~C~~gYvl~~~~CV~D~~ 125 (396)
||=+.|.|... . + .|.+||++..+.|+|+.=
T Consensus 51 ~Cgdya~C~~~~~~~~~~~~~C~C~~gY~~~~~vCvp~~C 90 (197)
T PF06247_consen 51 PCGDYAKCINQANKGEERAYKCDCINGYILKQGVCVPNKC 90 (197)
T ss_dssp EEETTEEEEE-SSTTSSTSEEEEE-TTEEESSSSEEEGGG
T ss_pred cccchhhhhcCCCcccceeEEEecccCceeeCCeEchhhc
Confidence 78899999865 2 2 899999999999999864
No 46
>PF08114 PMP1_2: ATPase proteolipid family; InterPro: IPR012589 This family consists of small proteolipids associated with the plasma membrane H+ ATPase. Two proteolipids (PMP1 and PMP2) are associated with the ATPase and both genes are similarly expressed in the wild-type strain of yeast. No modification of the level of transcription of one PMP gene is detected in a strain deleted of the other. Though both proteolipids show similarity with other small proteolipids associated with other cation-transporting ATPases, their functions remain unclear [].
Probab=24.03 E-value=1.9e+02 Score=20.52 Aligned_cols=18 Identities=17% Similarity=0.074 Sum_probs=8.1
Q ss_pred HHHHHHHHHHHHHHHHHH
Q 016022 246 VGCLLLLWKVHRRRYFAI 263 (396)
Q Consensus 246 v~i~~l~~~~~r~~~e~~ 263 (396)
+++..+...+||+.+.++
T Consensus 20 v~i~iva~~iYRKw~aRk 37 (43)
T PF08114_consen 20 VGIGIVALFIYRKWQARK 37 (43)
T ss_pred HHHHHHHHHHHHHHHHHH
Confidence 344444444455544443
No 47
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=23.87 E-value=19 Score=31.04 Aligned_cols=22 Identities=32% Similarity=0.679 Sum_probs=0.0
Q ss_pred HHHHHHHHHHH--HHHHHHHHHHH
Q 016022 241 VCSLLVGCLLL--LWKVHRRRYFA 262 (396)
Q Consensus 241 ~l~llv~i~~l--~~~~~r~~~e~ 262 (396)
++++++|+++| .||++||.-++
T Consensus 31 iL~VILgiLLliGCWYckRRSGYk 54 (118)
T PF14991_consen 31 ILIVILGILLLIGCWYCKRRSGYK 54 (118)
T ss_dssp ------------------------
T ss_pred eHHHHHHHHHHHhheeeeecchhh
Confidence 34445555444 68887765443
No 48
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=23.86 E-value=88 Score=27.30 Aligned_cols=22 Identities=5% Similarity=0.256 Sum_probs=14.2
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 016022 237 IIVPVCSLLVGCLLLLWKVHRR 258 (396)
Q Consensus 237 ~i~~~l~llv~i~~l~~~~~r~ 258 (396)
.+++++++++.++|++++.+++
T Consensus 73 v~aGvIg~Illi~y~irR~~Kk 94 (122)
T PF01102_consen 73 VMAGVIGIILLISYCIRRLRKK 94 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHHS--
T ss_pred HHHHHHHHHHHHHHHHHHHhcc
Confidence 5667777777777777766554
No 49
>PF09402 MSC: Man1-Src1p-C-terminal domain; InterPro: IPR018996 This entry represents the Inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope []. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad []. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. It has some overlapping function with emr-1.; GO: 0005639 integral to nuclear inner membrane; PDB: 2CH0_A.
Probab=23.37 E-value=27 Score=34.83 Aligned_cols=70 Identities=16% Similarity=0.221 Sum_probs=0.0
Q ss_pred HHHHHHHHHHHHHHHHHHHhhhhcc--CCCCCCcccccccccccCCCCC-----ccchhhHHHHHHHHhcCCCccee
Q 016022 262 AIRVEELYHQVCEILEENALMSKSV--NGECEPWVVASRLRDHLLLPKE-----RKDPVIWKKVEELVQEDSRVDQY 331 (396)
Q Consensus 262 ~~~v~~Lv~~vl~~L~~q~~~~~~~--~~~~~pyl~~~qLRD~LL~~~~-----r~r~~LW~kV~k~Ve~nSnIrt~ 331 (396)
.+.+..|++.+.+.|++++..+.=+ .....++++...|+|.+..... ..-+.+|+.+...+.++..|...
T Consensus 98 ~~~i~~l~~~~~~~Lr~~~a~~~Cg~~~~~~~~~ls~~el~~~~~~~~~~~~~~~efe~l~~~a~~~L~~~~ei~~~ 174 (334)
T PF09402_consen 98 EEKIEELAKKILDELRERNAQYECGDSEDDESPGLSEEELKDILSSKKSPWISDEEFEELWSAALQELKKNPEIIIR 174 (334)
T ss_dssp -----------------------------------------------------------------------------
T ss_pred HHHHHHHHHHHHHHHHHHHhhcccCCCCCCCCCCCcHHHHHHHHHhccCccccHHHHHHHHHHHHHHHHhCCcEEEe
Confidence 4568888999999998776665433 2457899999999999995441 23388999999999887666544
No 50
>PHA02673 ORF109 EEV glycoprotein; Provisional
Probab=22.51 E-value=1.3e+02 Score=27.50 Aligned_cols=22 Identities=23% Similarity=0.298 Sum_probs=15.4
Q ss_pred HHHHHHHHHHHHHHHHHHHHHH
Q 016022 45 LLRLITVVAIASSVALTCNYLA 66 (396)
Q Consensus 45 ~~~~~~v~~ia~~~a~~c~~l~ 66 (396)
|+|+.++++|-++.+++..+.+
T Consensus 35 ~~Ri~~~iSIisL~~l~v~LaL 56 (161)
T PHA02673 35 FFRLMAAIAIIVLAILVVILAL 56 (161)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 6777777777777776665543
No 51
>PF12729 4HB_MCP_1: Four helix bundle sensory module for signal transduction; InterPro: IPR024478 This entry represents a four-helix bundle that operates as a ubiquitous sensory module in prokaryotic signal-transduction, which is known as four-helix bundles methyl-accepting chemotaxis protein (4HB_MCP) domain. The 4HB_MCP is always found between two predicted transmembrane helices indicating that it detects only extracellular signals. In many cases the domain is associated with a cytoplasmic HAMP domain suggesting that most proteins carrying the bundle might share the mechanism of transmembrane signalling which is well-characterised in E. coli chemoreceptors [].
Probab=22.10 E-value=3.7e+02 Score=22.62 Aligned_cols=10 Identities=40% Similarity=0.411 Sum_probs=5.0
Q ss_pred cccccccCCC
Q 016022 297 SRLRDHLLLP 306 (396)
Q Consensus 297 ~qLRD~LL~~ 306 (396)
..+++.++.+
T Consensus 63 ~~~~~~~~~~ 72 (181)
T PF12729_consen 63 RALRRYLLAT 72 (181)
T ss_pred HHHHHhhhcC
Confidence 3455555543
No 52
>PRK15428 putative propanediol utilization protein PduM; Provisional
Probab=21.73 E-value=81 Score=28.93 Aligned_cols=31 Identities=19% Similarity=0.266 Sum_probs=24.1
Q ss_pred HHHHHHHHHHHHHHHHHHhhhhccCCCCCCccccccccc
Q 016022 263 IRVEELYHQVCEILEENALMSKSVNGECEPWVVASRLRD 301 (396)
Q Consensus 263 ~~v~~Lv~~vl~~L~~q~~~~~~~~~~~~pyl~~~qLRD 301 (396)
..++.||++|+.+|++++.... -++..|||+
T Consensus 4 ~~~~~iV~~Vv~RLk~Ra~~~~--------~ls~~ql~~ 34 (163)
T PRK15428 4 EMLQRIVEEVVARLQRRAQSTA--------TLSVAQLRD 34 (163)
T ss_pred HHHHHHHHHHHHHHHHHhhceE--------EEEHHHccC
Confidence 4578899999999998776543 377778887
No 53
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=21.38 E-value=52 Score=20.53 Aligned_cols=16 Identities=31% Similarity=0.800 Sum_probs=11.9
Q ss_pred eeeCCCceec--CCCccc
Q 016022 107 LECFHGYRKH--GKLCVE 122 (396)
Q Consensus 107 l~C~~gYvl~--~~~CV~ 122 (396)
-.|.+||.+. +..|+.
T Consensus 4 C~C~~Gy~l~~d~~~C~D 21 (24)
T PF12662_consen 4 CSCPPGYQLSPDGRSCED 21 (24)
T ss_pred eeCCCCCcCCCCCCcccc
Confidence 3699999985 567764
No 54
>PF15050 SCIMP: SCIMP protein
Probab=21.33 E-value=2.3e+02 Score=24.86 Aligned_cols=13 Identities=31% Similarity=0.606 Sum_probs=6.5
Q ss_pred HHHHHHHHHHHHH
Q 016022 230 WVSTHALIIVPVC 242 (396)
Q Consensus 230 ~i~~~~~~i~~~l 242 (396)
|+..+..+|+.+.
T Consensus 3 WWr~nFWiiLAVa 15 (133)
T PF15050_consen 3 WWRDNFWIILAVA 15 (133)
T ss_pred hHHhchHHHHHHH
Confidence 4455555554443
No 55
>PF11392 DUF2877: Protein of unknown function (DUF2877); InterPro: IPR021530 This bacterial family consists of putative carboxylase proteins; however, this cannot be confirmed.
Probab=21.19 E-value=52 Score=27.85 Aligned_cols=11 Identities=36% Similarity=0.507 Sum_probs=9.0
Q ss_pred CCCCCCChhhH
Q 016022 35 PQSLFPSKQDL 45 (396)
Q Consensus 35 ~~~~~~~~~~~ 45 (396)
=|||+||.+||
T Consensus 5 G~GLTPSGDD~ 15 (110)
T PF11392_consen 5 GPGLTPSGDDF 15 (110)
T ss_pred CCCCCCchHHH
Confidence 37899999995
No 56
>PF10500 SR-25: Nuclear RNA-splicing-associated protein; InterPro: IPR019532 SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissue types. At the N terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localisation signals strongly implies that this is a nuclear protein that may contribute to RNA splicing []. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1; also see IPR019093 from INTERPRO) signalling pathway [].
Probab=21.08 E-value=66 Score=30.91 Aligned_cols=9 Identities=11% Similarity=0.132 Sum_probs=5.5
Q ss_pred CCChHHHHH
Q 016022 177 ELDNPVYLY 185 (396)
Q Consensus 177 ~l~~~~fe~ 185 (396)
-|+.|||+-
T Consensus 159 PmTkEEyea 167 (225)
T PF10500_consen 159 PMTKEEYEA 167 (225)
T ss_pred CCCHHHHHH
Confidence 467776653
No 57
>PF10588 NADH-G_4Fe-4S_3: NADH-ubiquinone oxidoreductase-G iron-sulfur binding region; InterPro: IPR019574 NADH:ubiquinone oxidoreductase (complex I) (1.6.5.3 from EC) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol) []. Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea [], mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins []. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters []. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes [, ]. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I []. 
This entry describes the G subunit (one of 14 subunits, A to N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria while translocating protons, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This family does not contain related subunits from formate dehydrogenase complexes. This entry represents the iron-sulphur binding domain of the G subunit.; GO: 0016491 oxidoreductase activity, 0055114 oxidation-reduction process; PDB: 3M9S_C 2FUG_L 3IAS_L 2YBB_3 3IAM_3 3I9V_3.
Probab=20.88 E-value=44 Score=23.30 Aligned_cols=16 Identities=31% Similarity=0.866 Sum_probs=8.4
Q ss_pred CCCCCCccCCCCceec
Q 016022 88 SPTDSCEPCPSNGECH 103 (396)
Q Consensus 88 ~~~p~C~PCPehAiC~ 103 (396)
.++-.|.-|+.+|.|.
T Consensus 11 ~H~~dC~~C~~~G~Ce 26 (41)
T PF10588_consen 11 NHPLDCPTCDKNGNCE 26 (41)
T ss_dssp T----TTT-TTGGG-H
T ss_pred CCCCcCcCCCCCCCCH
Confidence 3456899999999984
No 58
>cd00033 CCP Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system. SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function.
Probab=20.11 E-value=61 Score=22.52 Aligned_cols=20 Identities=35% Similarity=0.773 Sum_probs=14.1
Q ss_pred eeeeCCCceecC---CCcccChh
Q 016022 106 KLECFHGYRKHG---KLCVEDGD 125 (396)
Q Consensus 106 ~l~C~~gYvl~~---~~CV~D~~ 125 (396)
.+.|++||.+.+ -.|..|+.
T Consensus 26 ~~~C~~Gy~~~g~~~~~C~~~g~ 48 (57)
T cd00033 26 TYSCNEGYTLVGSSTITCTENGG 48 (57)
T ss_pred EEECCCCCeEeCCCeeEECCCCe
Confidence 459999999874 35666553
No 59
>PRK09400 secE preprotein translocase subunit SecE; Reviewed
Probab=20.08 E-value=2e+02 Score=21.95 Aligned_cols=19 Identities=16% Similarity=0.461 Sum_probs=15.3
Q ss_pred CChhhHHHHHHHHHHHHHH
Q 016022 40 PSKQDLLRLITVVAIASSV 58 (396)
Q Consensus 40 ~~~~~~~~~~~v~~ia~~~ 58 (396)
|+++||.+...+.++..++
T Consensus 27 Pd~~Ef~~ia~~~~iG~~i 45 (61)
T PRK09400 27 PTREEFLLVAKVTGLGILL 45 (61)
T ss_pred CCHHHHHHHHHHHHHHHHH
Confidence 8999999988877666555
Done!