Query psy954
Match_columns 337
No_of_seqs 259 out of 2055
Neff 7.9
Searched_HMMs 46136
Date Fri Aug 16 22:12:06 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy954.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/954hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1215|consensus 99.8 6E-21 1.3E-25 203.2 12.9 215 20-336 73-295 (877)
2 KOG1214|consensus 99.7 2.5E-17 5.5E-22 164.2 6.2 104 1-106 1183-1286(1289)
3 KOG1215|consensus 99.4 1.5E-12 3.3E-17 139.1 10.7 114 1-116 595-710 (877)
4 PF00057 Ldl_recept_a: Low-den 99.2 1.3E-11 2.7E-16 78.9 2.9 35 120-154 2-36 (37)
5 PF00057 Ldl_recept_a: Low-den 99.2 2E-11 4.3E-16 78.0 2.9 36 300-335 2-37 (37)
6 cd00112 LDLa Low Density Lipop 99.2 1.6E-11 3.5E-16 77.5 2.3 34 121-154 1-34 (35)
7 cd00112 LDLa Low Density Lipop 99.1 4.2E-11 9.2E-16 75.6 2.7 35 301-335 1-35 (35)
8 PF14670 FXa_inhibition: Coagu 99.1 2.5E-11 5.3E-16 76.7 0.4 36 78-114 1-36 (36)
9 smart00192 LDLa Low-density li 99.0 1.3E-10 2.9E-15 72.4 2.7 32 121-152 2-33 (33)
10 smart00192 LDLa Low-density li 98.8 3.3E-09 7.1E-14 66.0 3.1 32 301-332 2-33 (33)
11 PF12999 PRKCSH-like: Glucosid 98.8 5.6E-09 1.2E-13 89.2 4.3 65 214-290 38-110 (176)
12 PF12999 PRKCSH-like: Glucosid 98.5 8.5E-08 1.8E-12 82.0 4.6 67 261-332 35-110 (176)
13 PF00058 Ldl_recept_b: Low-den 98.2 2.4E-06 5.2E-11 56.1 5.2 40 26-66 1-41 (42)
14 KOG1214|consensus 98.2 2.5E-06 5.5E-11 86.8 6.6 65 2-67 1096-1165(1289)
15 smart00135 LY Low-density lipo 97.7 0.00012 2.6E-09 47.4 5.8 36 9-44 2-39 (43)
16 KOG2397|consensus 97.5 7.2E-05 1.6E-09 72.7 4.1 65 214-290 45-113 (480)
17 PF00058 Ldl_recept_b: Low-den 96.8 0.0014 3.1E-08 42.8 3.1 24 1-24 17-41 (42)
18 PF12662 cEGF: Complement Clr- 96.7 0.00095 2.1E-08 38.1 1.7 22 96-117 1-22 (24)
19 KOG3509|consensus 96.7 0.0015 3.2E-08 69.3 3.9 105 226-335 2-110 (964)
20 KOG2397|consensus 96.5 0.0024 5.3E-08 62.3 4.0 67 263-334 44-115 (480)
21 PF07645 EGF_CA: Calcium-bindi 95.3 0.0072 1.6E-07 39.4 1.0 38 76-114 3-42 (42)
22 KOG3509|consensus 93.1 0.1 2.2E-06 55.9 4.2 105 132-294 2-110 (964)
23 smart00181 EGF Epidermal growt 93.0 0.096 2.1E-06 32.1 2.5 25 83-108 6-31 (35)
24 PF09064 Tme5_EGF_like: Thromb 92.7 0.099 2.1E-06 32.2 2.1 26 84-111 7-32 (34)
25 cd01475 vWA_Matrilin VWA_Matri 92.5 0.082 1.8E-06 47.4 2.4 38 75-113 187-224 (224)
26 smart00179 EGF_CA Calcium-bind 90.4 0.31 6.8E-06 30.3 2.8 23 88-114 16-38 (39)
27 PF01436 NHL: NHL repeat; Int 90.1 0.88 1.9E-05 26.7 4.4 26 15-40 1-27 (28)
28 PF00008 EGF: EGF-like domain 88.2 0.17 3.7E-06 30.8 0.3 27 75-106 3-29 (32)
29 cd00053 EGF Epidermal growth f 85.4 0.9 2E-05 27.2 2.5 21 87-108 12-32 (36)
30 PF08450 SGL: SMP-30/Gluconola 81.6 5.5 0.00012 35.8 7.1 38 8-45 126-165 (246)
31 PF12947 EGF_3: EGF domain; I 80.5 0.44 9.5E-06 30.0 -0.3 29 78-107 1-31 (36)
32 TIGR03032 conserved hypothetic 79.8 4.2 9.2E-05 38.4 5.7 57 8-66 195-251 (335)
33 KOG1219|consensus 79.6 1.9 4.1E-05 50.2 3.8 58 73-141 3867-3927(4289)
34 PF01731 Arylesterase: Arylest 78.2 5.7 0.00012 30.1 5.1 43 1-43 39-83 (86)
35 TIGR02276 beta_rpt_yvtn 40-res 78.1 7.5 0.00016 24.2 5.0 38 25-64 3-40 (42)
36 PF03088 Str_synth: Strictosid 73.0 11 0.00023 28.9 5.3 47 20-68 2-67 (89)
37 KOG4499|consensus 70.5 8.7 0.00019 34.9 5.0 49 6-54 195-251 (310)
38 PF03088 Str_synth: Strictosid 69.2 9.9 0.00021 29.1 4.4 36 6-41 47-84 (89)
39 KOG1219|consensus 65.4 5.9 0.00013 46.5 3.4 55 75-141 3908-3966(4289)
40 PF08450 SGL: SMP-30/Gluconola 60.3 26 0.00057 31.3 6.3 57 7-65 172-232 (246)
41 cd00054 EGF_CA Calcium-binding 56.8 11 0.00025 22.5 2.4 20 87-107 15-34 (38)
42 PLN02919 haloacid dehalogenase 54.8 38 0.00082 37.7 7.5 33 15-47 682-716 (1057)
43 TIGR03606 non_repeat_PQQ dehyd 51.3 44 0.00095 33.5 6.6 41 7-47 21-62 (454)
44 PF06247 Plasmod_Pvs28: Plasmo 49.3 7.3 0.00016 33.9 0.7 43 88-136 57-103 (197)
45 PLN02919 haloacid dehalogenase 48.6 29 0.00064 38.6 5.4 33 15-47 623-657 (1057)
46 COG3386 Gluconolactonase [Carb 46.1 57 0.0012 30.9 6.2 32 13-44 160-193 (307)
47 TIGR02604 Piru_Ver_Nterm putat 38.4 1.1E+02 0.0024 29.4 7.1 36 9-46 65-100 (367)
48 TIGR02604 Piru_Ver_Nterm putat 38.3 40 0.00087 32.4 4.1 42 2-43 170-212 (367)
49 PF08309 LVIVD: LVIVD repeat; 34.7 1.3E+02 0.0028 19.4 5.9 28 17-45 3-30 (42)
50 PF12661 hEGF: Human growth fa 34.0 16 0.00035 17.5 0.3 9 98-106 1-9 (13)
51 PF01826 TIL: Trypsin Inhibito 30.7 20 0.00043 24.3 0.4 19 98-117 34-52 (55)
52 PF08887 GAD-like: GAD-like do 29.4 59 0.0013 25.8 3.0 28 14-41 78-105 (109)
53 TIGR03032 conserved hypothetic 29.1 2.3E+02 0.0049 27.1 7.2 66 3-68 230-315 (335)
54 COG4257 Vgb Streptogramin lyas 25.5 1.2E+02 0.0026 28.5 4.5 51 14-65 60-111 (353)
55 PF07995 GSDH: Glucose / Sorbo 23.4 88 0.0019 29.7 3.5 37 7-43 172-210 (331)
56 PF05096 Glu_cyclase_2: Glutam 23.4 1.7E+02 0.0037 27.1 5.2 33 15-47 89-121 (264)
57 PRK04043 tolB translocation pr 23.3 2.1E+02 0.0045 28.2 6.2 50 1-51 174-228 (419)
58 PF03022 MRJP: Major royal jel 22.9 2.3E+02 0.0051 26.3 6.2 48 20-67 190-243 (287)
59 PF12942 Archaeal_AmoA: Archae 22.7 65 0.0014 27.4 2.1 23 21-43 8-33 (183)
60 TIGR03866 PQQ_ABC_repeats PQQ- 22.1 3.4E+02 0.0074 23.8 7.1 47 17-65 250-298 (300)
61 PRK04792 tolB translocation pr 21.3 1.8E+02 0.004 28.8 5.5 47 1-47 203-253 (448)
62 PF07995 GSDH: Glucose / Sorbo 20.9 1.9E+02 0.0042 27.3 5.3 31 16-46 253-291 (331)
63 PF00954 S_locus_glycop: S-loc 20.6 4E+02 0.0087 20.5 6.4 12 95-106 96-107 (110)
64 COG3386 Gluconolactonase [Carb 20.2 1.4E+02 0.003 28.2 4.1 28 25-52 36-63 (307)
No 1
>KOG1215|consensus
Probab=99.84 E-value=6e-21 Score=203.21 Aligned_cols=215 Identities=37% Similarity=0.711 Sum_probs=170.8
Q ss_pred EEEEECCEEEEEeCCCCceEEEecccCCceEEEeecccCCCccceeeccCccccCCCCCcCC-CCCccccccCCCCCcCe
Q psy954 20 AITVHRNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVYSADSQKCSVNPCNIH-NGGCAQSCHPGPNGTAE 98 (337)
Q Consensus 20 ~Lav~~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~~~~~q~~~~npC~~~-nggCs~lCl~~~~~~~~ 98 (337)
+|++|+++|||+| +.|.+++|.+|....++...... |+.|+++++..++...++|..+ +++|+|
T Consensus 73 ~l~~~~~~~y~~d---~~v~~~~~~sg~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~----------- 137 (877)
T KOG1215|consen 73 ALTLFEDGLYWTD---KSVSAANKKTGKDVTRLSQDSHF-PLDIHAYHPSSQPLAPDPCAESGNGPCSH----------- 137 (877)
T ss_pred eeeeeccceeecc---chhhhhccCCCCcceeehhcCCC-CcceeEEecCCCCCCCCcccccCCCCCcc-----------
Confidence 9999999999999 78999999999999988877755 9999999999888877887763 223333
Q ss_pred ecCCCCcccccCCcccccCCCCCCCCccccC--CCceeCCccccCCCCCCCCCCCCCCCccCCcccCCCCCCCCcccCCC
Q psy954 99 CKCDESTKLVNEGRMCVAKNITCDGSKFFCR--NGKCISRMWSCDGDDDCGDNSDEDPNYCNVQITGVSQPPGELGVPGH 176 (337)
Q Consensus 99 C~Cp~g~~L~~~~~~C~~~~~~C~~~~f~C~--~g~CI~~~~~CDG~~DC~DgsDE~~~~C~~~~c~c~~~~~~~~~~~~ 176 (337)
|...+|.|. +++||+..|+|||..||+||+||.. |....+.
T Consensus 138 ----------------------~~~~~~~c~~~~~~Cip~~~~cd~~~~C~dg~de~~--~~~~~~~------------- 180 (877)
T KOG1215|consen 138 ----------------------CCLDKFSCRTGSCKCIPGDWLCDGEADCPDGSDELN--CAVRRCE------------- 180 (877)
T ss_pred ----------------------ccCCCCCCcCccccCCCCceeCCCCCccccchhhhc--ccccccC-------------
Confidence 333446666 7899999999999999999999986 2211000
Q ss_pred ccccCCCCCCCCccceeeecCCCCCCCCCCCCCCCCceeeCCCCCCCCCeeecCCCcccCCCCCCCCCccccccCCCCCC
Q psy954 177 VQITGVSQPPGIVMVMTTVQTGLMNHPNNRKCDEETEFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADENTTALNCPKQ 256 (337)
Q Consensus 177 ~C~~~~~~~~~~C~~~~~C~~~~d~~c~~~~C~~~~~f~C~~~~~~~~~~Ci~~~~~CDg~~dC~dgsDE~~~~~~C~~~ 256 (337)
+.... | +||...|+||+..+|.+++|| ..+..
T Consensus 181 -------------------------------~~~~~-~-----------~~~~~~~~~d~~~~~~~~~d~----~~~~~- 212 (877)
T KOG1215|consen 181 -------------------------------PRGAS-L-----------DCIVAIKVCDIQHDCADDYDE----SEGRI- 212 (877)
T ss_pred -------------------------------ccccc-c-----------ccceeeeecCccccccccccc----ccCcc-
Confidence 00000 2 448889999999999999999 34432
Q ss_pred CCCC---CCcEEeeC-CceecCCCcCCCCCCCCCCCCCC-CCCCCCCCCCCCCcEEcCCCCeecCCCCCCCcCCCCCCCC
Q psy954 257 SSCS---PDQFSCGN-GRCINTGWLCDHDNDCGDGSDEG-KECHDKYRTCSSEEFACQNFKCIRKTYHCDGEDDCGDRSD 331 (337)
Q Consensus 257 ~~C~---~~~f~C~~-g~Ci~~~~~CDg~~dC~d~sDE~-~~C~~~~~~C~~~~f~C~~~~Ci~~~~~CDg~~dC~dgsD 331 (337)
..+. ...++|.. .+||...|.|||..||.+++||. .++. ...|...++.|.++.|++..++|||..||+||+|
T Consensus 213 ~~~~~~~~~~~~c~g~~~~i~~~~~~Dg~~dc~~~~de~~~~~~--~~~~~~~e~~~~~~~~~~~~~~~~g~~d~pdg~d 290 (877)
T KOG1215|consen 213 YWTDDSRIEVTRCDGSSRCILISEVCDGPRDCVDGPDEGVMNCS--DATCEAPEIECADGDCSDRQKLCDGDLDCPDGLD 290 (877)
T ss_pred cccCCcceeEEEecCCCcEEeehhccCCCcccccCCcCceeEee--ccccCCcceeecCCCCccceEEecCccCCCCccc
Confidence 2232 57889987 59999999999999999999994 3454 4567778999999999999999999999999999
Q ss_pred CCCCC
Q psy954 332 EFNCN 336 (337)
Q Consensus 332 E~~C~ 336 (337)
|..|.
T Consensus 291 e~~~~ 295 (877)
T KOG1215|consen 291 EDYCK 295 (877)
T ss_pred ccccc
Confidence 98775
No 2
>KOG1214|consensus
Probab=99.68 E-value=2.5e-17 Score=164.20 Aligned_cols=104 Identities=31% Similarity=0.517 Sum_probs=94.1
Q ss_pred CccCCCceEEEEcCCCCceEEEEECCEEEEEeCCCCceEEEecccCCceEEEeecccCCCccceeeccCccccCCCCCcC
Q psy954 1 MRIATGASMVLISATIYPFAITVHRNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVYSADSQKCSVNPCNI 80 (337)
Q Consensus 1 ~~~dG~~R~vl~~~~~~Pf~Lav~~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~~~~~q~~~~npC~~ 80 (337)
+.++|+.|++|+++|++||+|+.+++.+|||||+.++|..++|+.++.+...+.....+++||+++.+.....+ +||+.
T Consensus 1183 ~~p~g~gRR~i~~~LqYPF~itsy~~~fY~TDWk~n~vvsv~~~~~~~td~~~p~~~s~lyGItav~~~Cp~gs-tpCSe 1261 (1289)
T KOG1214|consen 1183 TLPDGTGRRVIQNNLQYPFSITSYADHFYHTDWKRNGVVSVNKHSGQFTDEYLPEQRSHLYGITAVYPYCPTGS-TPCSE 1261 (1289)
T ss_pred ecCCCCcchhhhhcccCceeeeeccccceeeccccCceEEeeccccccccccccccccceEEEEeccccCCCCC-Ccccc
Confidence 46899999999999999999999999999999999999999999998887777666677999999988766655 99999
Q ss_pred CCCCccccccCCCCCcCeecCCCCcc
Q psy954 81 HNGGCAQSCHPGPNGTAECKCDESTK 106 (337)
Q Consensus 81 ~nggCs~lCl~~~~~~~~C~Cp~g~~ 106 (337)
+||||.||||++-++ +.|.||++.+
T Consensus 1262 dNGGCqHLCLpgqng-avcecpdnvk 1286 (1289)
T KOG1214|consen 1262 DNGGCQHLCLPGQNG-AVCECPDNVK 1286 (1289)
T ss_pred cCCcceeecccCcCC-ccccCCccce
Confidence 999999999998888 8999998854
No 3
>KOG1215|consensus
Probab=99.38 E-value=1.5e-12 Score=139.07 Aligned_cols=114 Identities=35% Similarity=0.696 Sum_probs=94.9
Q ss_pred CccCCCceEEEE-cCCCCceEEEEECCEEEEEeCCCCceEEEecccCCceEEEeecccCCCccceee-ccCccccCCCCC
Q psy954 1 MRIATGASMVLI-SATIYPFAITVHRNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVY-SADSQKCSVNPC 78 (337)
Q Consensus 1 ~~~dG~~R~vl~-~~~~~Pf~Lav~~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~-~~~~q~~~~npC 78 (337)
+.++|.+|+++. ..+.|||+|++|+++|||+||....+.++.|..+.....+... ...+..++++ +...++-..|+|
T Consensus 595 ~~~~g~~r~~~~~~~~~~p~~~~~~~~~iyw~d~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~n~C 673 (877)
T KOG1215|consen 595 ANMDGQNRRVVDSEDLPHPFGLSVFEDYIYWTDWSNRAISRAEKHKGSDSRTSRSN-LAQPLDIILVHHSSSRPTGVNPC 673 (877)
T ss_pred eecCCCceEEeccccCCCceEEEEecceeEEeeccccceEeeecccCCcceeeecc-cCcccceEEEeccccCCCCCCcc
Confidence 368999998433 6789999999999999999999999999999988762123333 3567777777 555566677999
Q ss_pred cCCCCCccccccCCCCCcCeecCCCCcccccCCccccc
Q psy954 79 NIHNGGCAQSCHPGPNGTAECKCDESTKLVNEGRMCVA 116 (337)
Q Consensus 79 ~~~nggCs~lCl~~~~~~~~C~Cp~g~~L~~~~~~C~~ 116 (337)
..+||+|++||++.|.+. +|+||.|+.|..+++.|.+
T Consensus 674 ~~~n~~c~~KOG~~p~~~-~c~c~~~~~l~~~~~~C~~ 710 (877)
T KOG1215|consen 674 ESSNGGCSQLCLPRPQGS-TCACPEGYRLSPDGKSCSS 710 (877)
T ss_pred cccCCCCCeeeecCCCCC-eeeCCCCCeecCCCCeecC
Confidence 999999999999999885 9999999999999999987
No 4
>PF00057 Ldl_recept_a: Low-density lipoprotein receptor domain class A This prints entry is specific to LDL receptor; InterPro: IPR002172 Low-density lipoprotein (LDL) is the major cholesterol-carrying lipoprotein of plasma; its receptor (LDLR) acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits. 
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class A (cysteine-rich) repeat, which contains 6 disulphide-bound cysteines and a highly conserved cluster of negatively charged amino acids, of which many are clustered on one face of the module. In LDL receptors, the class A domains form the binding site for LDL and calcium. The acidic residues between the fourth and sixth cysteines are important for high-affinity binding of positively charged sequences in LDLR's ligands. The repeat consists of a beta-hairpin structure followed by a series of beta turns. In the absence of calcium, LDL-A domains are unstructured; the bound calcium ion imparts structural integrity. Following these repeats is a 350-residue domain that resembles part of the epidermal growth factor (EGF) precursor. Numerous familial hypercholesterolemia mutations of the LDL receptor alter the calcium coordinating residue of LDL-A domains or other crucial scaffolding residues.; GO: 0005515 protein binding; PDB: 2I1P_A 3OJY_A 4E0S_B 3T5O_A 4A5W_B 1JRF_A 1K7B_A 1V9U_5 3DPR_E 2KNY_A ....
Probab=99.19 E-value=1.3e-11 Score=78.87 Aligned_cols=35 Identities=51% Similarity=1.126 Sum_probs=33.2
Q ss_pred CCCCCccccCCCceeCCccccCCCCCCCCCCCCCC
Q psy954 120 TCDGSKFFCRNGKCISRMWSCDGDDDCGDNSDEDP 154 (337)
Q Consensus 120 ~C~~~~f~C~~g~CI~~~~~CDG~~DC~DgsDE~~ 154 (337)
+|.+++|+|.+++||+..|+|||++||.|||||.+
T Consensus 2 ~C~~~~f~C~~~~CI~~~~~CDg~~DC~dgsDE~~ 36 (37)
T PF00057_consen 2 TCPPGEFRCGNGQCIPKSWVCDGIPDCPDGSDEQN 36 (37)
T ss_dssp SSSTTEEEETTSSEEEGGGTTSSSCSSSSSTTTSS
T ss_pred cCcCCeeEcCCCCEEChHHcCCCCCCCCCCccccc
Confidence 58899999999999999999999999999999964
No 5
>PF00057 Ldl_recept_a: Low-density lipoprotein receptor domain class A This prints entry is specific to LDL receptor; InterPro: IPR002172 Low-density lipoprotein (LDL) is the major cholesterol-carrying lipoprotein of plasma; its receptor (LDLR) acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits. 
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class A (cysteine-rich) repeat, which contains 6 disulphide-bound cysteines and a highly conserved cluster of negatively charged amino acids, of which many are clustered on one face of the module. In LDL receptors, the class A domains form the binding site for LDL and calcium. The acidic residues between the fourth and sixth cysteines are important for high-affinity binding of positively charged sequences in LDLR's ligands. The repeat consists of a beta-hairpin structure followed by a series of beta turns. In the absence of calcium, LDL-A domains are unstructured; the bound calcium ion imparts structural integrity. Following these repeats is a 350-residue domain that resembles part of the epidermal growth factor (EGF) precursor. Numerous familial hypercholesterolemia mutations of the LDL receptor alter the calcium coordinating residue of LDL-A domains or other crucial scaffolding residues.; GO: 0005515 protein binding; PDB: 2I1P_A 3OJY_A 4E0S_B 3T5O_A 4A5W_B 1JRF_A 1K7B_A 1V9U_5 3DPR_E 2KNY_A ....
Probab=99.16 E-value=2e-11 Score=77.95 Aligned_cols=36 Identities=56% Similarity=1.081 Sum_probs=34.2
Q ss_pred CCCCCcEEcCCCCeecCCCCCCCcCCCCCCCCCCCC
Q psy954 300 TCSSEEFACQNFKCIRKTYHCDGEDDCGDRSDEFNC 335 (337)
Q Consensus 300 ~C~~~~f~C~~~~Ci~~~~~CDg~~dC~dgsDE~~C 335 (337)
.|...+|+|.++.||+..|+|||+.||.|||||.+|
T Consensus 2 ~C~~~~f~C~~~~CI~~~~~CDg~~DC~dgsDE~~C 37 (37)
T PF00057_consen 2 TCPPGEFRCGNGQCIPKSWVCDGIPDCPDGSDEQNC 37 (37)
T ss_dssp SSSTTEEEETTSSEEEGGGTTSSSCSSSSSTTTSSH
T ss_pred cCcCCeeEcCCCCEEChHHcCCCCCCCCCCcccccC
Confidence 578899999999999999999999999999999886
No 6
>cd00112 LDLa Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure
Probab=99.15 E-value=1.6e-11 Score=77.48 Aligned_cols=34 Identities=56% Similarity=1.341 Sum_probs=31.8
Q ss_pred CCCCccccCCCceeCCccccCCCCCCCCCCCCCC
Q psy954 121 CDGSKFFCRNGKCISRMWSCDGDDDCGDNSDEDP 154 (337)
Q Consensus 121 C~~~~f~C~~g~CI~~~~~CDG~~DC~DgsDE~~ 154 (337)
|.+.+|+|.+++||+..|+|||++||+|||||..
T Consensus 1 C~~~~f~C~~~~Ci~~~~~CDg~~DC~dgsDE~~ 34 (35)
T cd00112 1 CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEEN 34 (35)
T ss_pred CCCCeEEcCCCCeeCHHHcCCCccCCCCCccccc
Confidence 5678999999999999999999999999999974
No 7
>cd00112 LDLa Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure
Probab=99.11 E-value=4.2e-11 Score=75.60 Aligned_cols=35 Identities=60% Similarity=1.271 Sum_probs=32.8
Q ss_pred CCCCcEEcCCCCeecCCCCCCCcCCCCCCCCCCCC
Q psy954 301 CSSEEFACQNFKCIRKTYHCDGEDDCGDRSDEFNC 335 (337)
Q Consensus 301 C~~~~f~C~~~~Ci~~~~~CDg~~dC~dgsDE~~C 335 (337)
|++.+|+|.++.||+..++|||+.||+|||||.+|
T Consensus 1 C~~~~f~C~~~~Ci~~~~~CDg~~DC~dgsDE~~C 35 (35)
T cd00112 1 CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEENC 35 (35)
T ss_pred CCCCeEEcCCCCeeCHHHcCCCccCCCCCcccccC
Confidence 45689999999999999999999999999999987
No 8
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=99.07 E-value=2.5e-11 Score=76.74 Aligned_cols=36 Identities=39% Similarity=1.048 Sum_probs=31.4
Q ss_pred CcCCCCCccccccCCCCCcCeecCCCCcccccCCccc
Q psy954 78 CNIHNGGCAQSCHPGPNGTAECKCDESTKLVNEGRMC 114 (337)
Q Consensus 78 C~~~nggCs~lCl~~~~~~~~C~Cp~g~~L~~~~~~C 114 (337)
|+.+||||+|+|++.+.+ ++|+||.||+|.+|+++|
T Consensus 1 C~~~NGgC~h~C~~~~g~-~~C~C~~Gy~L~~D~~tC 36 (36)
T PF14670_consen 1 CSVNNGGCSHICVNTPGS-YRCSCPPGYKLAEDGRTC 36 (36)
T ss_dssp CTTGGGGSSSEEEEETTS-EEEE-STTEEE-TTSSSE
T ss_pred CCCCCCCcCCCCccCCCc-eEeECCCCCEECcCCCCC
Confidence 677899999999999877 999999999999999987
No 9
>smart00192 LDLa Low-density lipoprotein receptor domain class A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia.
Probab=99.04 E-value=1.3e-10 Score=72.36 Aligned_cols=32 Identities=56% Similarity=1.267 Sum_probs=30.3
Q ss_pred CCCCccccCCCceeCCccccCCCCCCCCCCCC
Q psy954 121 CDGSKFFCRNGKCISRMWSCDGDDDCGDNSDE 152 (337)
Q Consensus 121 C~~~~f~C~~g~CI~~~~~CDG~~DC~DgsDE 152 (337)
|...+|+|.+++||+..|+|||++||+|+|||
T Consensus 2 C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDE 33 (33)
T smart00192 2 CPPGEFQCDNGRCIPLSWVCDGVDDCSDGSDE 33 (33)
T ss_pred CCCCeEECCCCCEECchhhCCCcCcCcCCCCC
Confidence 66679999999999999999999999999998
No 10
>smart00192 LDLa Low-density lipoprotein receptor domain class A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia.
Probab=98.82 E-value=3.3e-09 Score=66.05 Aligned_cols=32 Identities=53% Similarity=1.125 Sum_probs=29.9
Q ss_pred CCCCcEEcCCCCeecCCCCCCCcCCCCCCCCC
Q psy954 301 CSSEEFACQNFKCIRKTYHCDGEDDCGDRSDE 332 (337)
Q Consensus 301 C~~~~f~C~~~~Ci~~~~~CDg~~dC~dgsDE 332 (337)
|...+|+|.++.||+..++|||+.||+|||||
T Consensus 2 C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDE 33 (33)
T smart00192 2 CPPGEFQCDNGRCIPLSWVCDGVDDCSDGSDE 33 (33)
T ss_pred CCCCeEECCCCCEECchhhCCCcCcCcCCCCC
Confidence 55569999999999999999999999999998
No 11
>PF12999 PRKCSH-like: Glucosidase II beta subunit-like
Probab=98.78 E-value=5.6e-09 Score=89.19 Aligned_cols=65 Identities=43% Similarity=0.716 Sum_probs=56.8
Q ss_pred eeeCCCCCCCCCee-ecCCCcccCCCCCCCCCccccccCCCCCCCCCCCCcEEeeCC----ceecCCCcCCCCCC---CC
Q psy954 214 FTCTENKAWNRAQC-IPKKWLCDGDPDCVDGADENTTALNCPKQSSCSPDQFSCGNG----RCINTGWLCDHDND---CG 285 (337)
Q Consensus 214 f~C~~~~~~~~~~C-i~~~~~CDg~~dC~dgsDE~~~~~~C~~~~~C~~~~f~C~~g----~Ci~~~~~CDg~~d---C~ 285 (337)
|+|.++. .= |+.+++.|+.-||+|||||.. ...|+.+.|+|.|. .-||..+|-||+-| |=
T Consensus 38 f~Cl~~~-----~~~I~~~~iNDdyCDC~DGSDEPG-------TsAC~~~~FyC~N~g~~p~~i~~s~VnDGICDy~~CC 105 (176)
T PF12999_consen 38 FTCLDGS-----KIVIPFSQINDDYCDCPDGSDEPG-------TSACSNGKFYCENKGHIPRYIPSSRVNDGICDYDICC 105 (176)
T ss_pred eEecCCC-----CceecHHHccCcceeCCCCCCccc-------cccCcCceEeeccCCCCCceeehhhhcCCcCcccccC
Confidence 9999884 44 999999999999999999942 24677789999874 68999999999999 99
Q ss_pred CCCCC
Q psy954 286 DGSDE 290 (337)
Q Consensus 286 d~sDE 290 (337)
|||||
T Consensus 106 DGSDE 110 (176)
T PF12999_consen 106 DGSDE 110 (176)
T ss_pred CCCCC
Confidence 99999
No 12
>PF12999 PRKCSH-like: Glucosidase II beta subunit-like
Probab=98.54 E-value=8.5e-08 Score=81.97 Aligned_cols=67 Identities=39% Similarity=0.634 Sum_probs=57.3
Q ss_pred CCcEEeeCCc-e-ecCCCcCCCCCCCCCCCCCCCCCCCCCCCCCCCcEEcCCC----CeecCCCCCCCcCC---CCCCCC
Q psy954 261 PDQFSCGNGR-C-INTGWLCDHDNDCGDGSDEGKECHDKYRTCSSEEFACQNF----KCIRKTYHCDGEDD---CGDRSD 331 (337)
Q Consensus 261 ~~~f~C~~g~-C-i~~~~~CDg~~dC~d~sDE~~~C~~~~~~C~~~~f~C~~~----~Ci~~~~~CDg~~d---C~dgsD 331 (337)
.+.|+|-++. - |+..++.|+.-||+||||| ++- ..|+...|+|.|. .-||.++|=||+=| |=||||
T Consensus 35 ~~~f~Cl~~~~~~I~~~~iNDdyCDC~DGSDE-PGT----sAC~~~~FyC~N~g~~p~~i~~s~VnDGICDy~~CCDGSD 109 (176)
T PF12999_consen 35 NGKFTCLDGSKIVIPFSQINDDYCDCPDGSDE-PGT----SACSNGKFYCENKGHIPRYIPSSRVNDGICDYDICCDGSD 109 (176)
T ss_pred CCceEecCCCCceecHHHccCcceeCCCCCCc-ccc----ccCcCceEeeccCCCCCceeehhhhcCCcCcccccCCCCC
Confidence 4679998663 4 8999999999999999999 542 3577789999975 57999999999999 999999
Q ss_pred C
Q psy954 332 E 332 (337)
Q Consensus 332 E 332 (337)
|
T Consensus 110 E 110 (176)
T PF12999_consen 110 E 110 (176)
T ss_pred C
Confidence 9
No 13
>PF00058 Ldl_recept_b: Low-density lipoprotein receptor repeat class B; InterPro: IPR000033 Low-density lipoprotein (LDL) is the major cholesterol-carrying lipoprotein of plasma; its receptor (LDLR) acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits. 
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class B (YWTD) repeat, the structure of which has been solved. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins that include vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor, preproepidermal growth factor, and nidogen (entactin).; PDB: 3S2K_A 3S8Z_A 3S8V_B 4A0P_A 3SOB_B 3S94_B 4DG6_A 3SOV_A 3SOQ_A 1NPE_A ....
Probab=98.25 E-value=2.4e-06 Score=56.13 Aligned_cols=40 Identities=30% Similarity=0.446 Sum_probs=32.2
Q ss_pred CEEEEEeCCCC-ceEEEecccCCceEEEeecccCCCccceee
Q psy954 26 NYIYWTDLQLR-GVYRAEKHTGANMIEMVKRLEDSPRDIHVY 66 (337)
Q Consensus 26 d~IYWtDw~~~-~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~ 66 (337)
++||||||... +|++++. +|+.+++++......|.+|+|.
T Consensus 1 ~~iYWtD~~~~~~I~~a~~-dGs~~~~vi~~~l~~P~giaVD 41 (42)
T PF00058_consen 1 GKIYWTDWSQDPSIERANL-DGSNRRTVISDDLQHPEGIAVD 41 (42)
T ss_dssp TEEEEEETTTTEEEEEEET-TSTSEEEEEESSTSSEEEEEEE
T ss_pred CEEEEEECCCCcEEEEEEC-CCCCeEEEEECCCCCcCEEEEC
Confidence 58999999999 8888776 5666666666555889999985
No 14
>KOG1214|consensus
Probab=98.20 E-value=2.5e-06 Score=86.77 Aligned_cols=65 Identities=22% Similarity=0.335 Sum_probs=54.3
Q ss_pred ccCCCceEEEE-cCCCCceEEEEE--CCEEEEEeCCCC--ceEEEecccCCceEEEeecccCCCccceeec
Q psy954 2 RIATGASMVLI-SATIYPFAITVH--RNYIYWTDLQLR--GVYRAEKHTGANMIEMVKRLEDSPRDIHVYS 67 (337)
Q Consensus 2 ~~dG~~R~vl~-~~~~~Pf~Lav~--~d~IYWtDw~~~--~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~~ 67 (337)
.|||+.|++|+ ..|.+|.+|++. +..||||||... +|++++. +|.++.+++..-...|.||++..
T Consensus 1096 ~LdG~~rkvLf~tdLVNPR~iv~D~~rgnLYwtDWnRenPkIets~m-DG~NrRilin~DigLPNGLtfdp 1165 (1289)
T KOG1214|consen 1096 LLDGSERKVLFYTDLVNPRAIVVDPIRGNLYWTDWNRENPKIETSSM-DGENRRILINTDIGLPNGLTFDP 1165 (1289)
T ss_pred ecCCceeeEEEeecccCcceEEeecccCceeeccccccCCcceeecc-CCccceEEeecccCCCCCceeCc
Confidence 58999999999 789999999875 999999999876 7877765 77777777766557899999853
No 15
>smart00135 LY Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Probab=97.71 E-value=0.00012 Score=47.45 Aligned_cols=36 Identities=25% Similarity=0.336 Sum_probs=28.4
Q ss_pred EEEEcCCCCc--eEEEEECCEEEEEeCCCCceEEEecc
Q psy954 9 MVLISATIYP--FAITVHRNYIYWTDLQLRGVYRAEKH 44 (337)
Q Consensus 9 ~vl~~~~~~P--f~Lav~~d~IYWtDw~~~~I~r~~k~ 44 (337)
+++..++.+| +++...+++|||+|+....|++++..
T Consensus 2 ~~~~~~~~~~~~la~d~~~~~lYw~D~~~~~I~~~~~~ 39 (43)
T smart00135 2 TLLSEGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLD 39 (43)
T ss_pred EEEECCCCCcCEEEEeecCCEEEEEeCCCCEEEEEeCC
Confidence 4455788899 55555699999999999999988764
No 16
>KOG2397|consensus
Probab=97.55 E-value=7.2e-05 Score=72.73 Aligned_cols=65 Identities=38% Similarity=0.679 Sum_probs=57.6
Q ss_pred eeeCCCCCCCCCeeecCCCcccCCCCCCCCCccccccCCCCCCCCCCCCcEEeeCC----ceecCCCcCCCCCCCCCCCC
Q psy954 214 FTCTENKAWNRAQCIPKKWLCDGDPDCVDGADENTTALNCPKQSSCSPDQFSCGNG----RCINTGWLCDHDNDCGDGSD 289 (337)
Q Consensus 214 f~C~~~~~~~~~~Ci~~~~~CDg~~dC~dgsDE~~~~~~C~~~~~C~~~~f~C~~g----~Ci~~~~~CDg~~dC~d~sD 289 (337)
|.|.++. .-|+.+++-|..-||.|||||. ....|+.+.|+|.|. .-||...+-||+-||-||||
T Consensus 45 ~~CLdgs-----~~i~f~qlNDd~CDC~DGsDEP-------GtsACpngkF~C~N~G~~p~~i~ssrV~DGICDCCDgSD 112 (480)
T KOG2397|consen 45 FKCLDGS-----KTISFSQLNDDSCDCLDGSDEP-------GTSACPNGKFYCVNQGHQPKYIPSSRVNDGICDCCDGSD 112 (480)
T ss_pred eeeccCC-----cccCHHHhccccccCCCCCCCC-------ccccCCCCceeeeecCCCceeeechhccCcccccccCCC
Confidence 9999885 8899999999999999999993 235688899999863 57888999999999999999
Q ss_pred C
Q psy954 290 E 290 (337)
Q Consensus 290 E 290 (337)
|
T Consensus 113 E 113 (480)
T KOG2397|consen 113 E 113 (480)
T ss_pred C
Confidence 9
No 17
>PF00058 Ldl_recept_b: Low-density lipoprotein receptor repeat class B; InterPro: IPR000033 The low-density lipoprotein receptor (LDLR) binds LDL, the major cholesterol-carrying lipoprotein of plasma, and acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class B (YWTD) repeat, the structure of which has been solved. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins that include vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor, preproepidermal growth factor, and nidogen (entactin).; PDB: 3S2K_A 3S8Z_A 3S8V_B 4A0P_A 3SOB_B 3S94_B 4DG6_A 3SOV_A 3SOQ_A 1NPE_A ....
Probab=96.79 E-value=0.0014 Score=42.84 Aligned_cols=24 Identities=17% Similarity=0.174 Sum_probs=21.3
Q ss_pred CccCCCceEEEE-cCCCCceEEEEE
Q psy954 1 MRIATGASMVLI-SATIYPFAITVH 24 (337)
Q Consensus 1 ~~~dG~~R~vl~-~~~~~Pf~Lav~ 24 (337)
+++||++|++|+ +.+.+|+||||+
T Consensus 17 a~~dGs~~~~vi~~~l~~P~giaVD 41 (42)
T PF00058_consen 17 ANLDGSNRRTVISDDLQHPEGIAVD 41 (42)
T ss_dssp EETTSTSEEEEEESSTSSEEEEEEE
T ss_pred EECCCCCeEEEEECCCCCcCEEEEC
Confidence 368999999998 778999999996
No 18
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=96.73 E-value=0.00095 Score=38.07 Aligned_cols=22 Identities=27% Similarity=0.652 Sum_probs=19.5
Q ss_pred cCeecCCCCcccccCCcccccC
Q psy954 96 TAECKCDESTKLVNEGRMCVAK 117 (337)
Q Consensus 96 ~~~C~Cp~g~~L~~~~~~C~~~ 117 (337)
+++|.|+.||+|..++++|...
T Consensus 1 sy~C~C~~Gy~l~~d~~~C~DI 22 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDGRSCEDI 22 (24)
T ss_pred CEEeeCCCCCcCCCCCCccccC
Confidence 3799999999999999999764
No 19
>KOG3509|consensus
Probab=96.67 E-value=0.0015 Score=69.28 Aligned_cols=105 Identities=39% Similarity=0.932 Sum_probs=89.8
Q ss_pred eeecCCCcccCCCCCCCCCccccccCCCCC-CCCCCCCcEEeeCCceecCCCcCCCCCCCCCCCCCCCCCCCC--CCCCC
Q psy954 226 QCIPKKWLCDGDPDCVDGADENTTALNCPK-QSSCSPDQFSCGNGRCINTGWLCDHDNDCGDGSDEGKECHDK--YRTCS 302 (337)
Q Consensus 226 ~Ci~~~~~CDg~~dC~dgsDE~~~~~~C~~-~~~C~~~~f~C~~g~Ci~~~~~CDg~~dC~d~sDE~~~C~~~--~~~C~ 302 (337)
.|......|++..|+.+.+|+ .+++. ...+.+.+|.|.++++....|.||....+..++++ .+|... ...|.
T Consensus 2 ~c~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~s~~~ 76 (964)
T KOG3509|consen 2 ECVKNRYACDRQPDCRDRSDV----ANDPAIGSACSPNEFKCNNPRCVQPEALLDADSTCGPNSTP-SGCNAKPSASDCK 76 (964)
T ss_pred chhhhhhhhccchhhHhhccc----CCCccccccCCcchhccCCccccCchhhhccccccCCCCCc-CCccccccccccC
Confidence 567778899999999999999 56553 25678899999999999999999999999999987 776543 35677
Q ss_pred CCcEEcCCC-CeecCCCCCCCcCCCCCCCCCCCC
Q psy954 303 SEEFACQNF-KCIRKTYHCDGEDDCGDRSDEFNC 335 (337)
Q Consensus 303 ~~~f~C~~~-~Ci~~~~~CDg~~dC~dgsDE~~C 335 (337)
+.+++|.+- ++.+.+..|||.++|.++++|..+
T Consensus 77 ~~~~~c~~~~~~~~~~~~~~g~~~~~~~~~~~~~ 110 (964)
T KOG3509|consen 77 PTETQCRDRLRCNPQSFQCDGTNDCKDGSDEVGC 110 (964)
T ss_pred CcccccccchhcCCccccccCCCCCCccchhccc
Confidence 888999987 789999999999999999999754
No 20
>KOG2397|consensus
Probab=96.52 E-value=0.0024 Score=62.32 Aligned_cols=67 Identities=40% Similarity=0.638 Sum_probs=55.7
Q ss_pred cEEeeCC-ceecCCCcCCCCCCCCCCCCCCCCCCCCCCCCCCCcEEcCCC----CeecCCCCCCCcCCCCCCCCCCC
Q psy954 263 QFSCGNG-RCINTGWLCDHDNDCGDGSDEGKECHDKYRTCSSEEFACQNF----KCIRKTYHCDGEDDCGDRSDEFN 334 (337)
Q Consensus 263 ~f~C~~g-~Ci~~~~~CDg~~dC~d~sDE~~~C~~~~~~C~~~~f~C~~~----~Ci~~~~~CDg~~dC~dgsDE~~ 334 (337)
.|.|-++ .-|+...+-|..-||.||+|| +. ...|+.+.|+|.|. .=|+.+.+=||+-||=|||||..
T Consensus 44 ~~~CLdgs~~i~f~qlNDd~CDC~DGsDE-PG----tsACpngkF~C~N~G~~p~~i~ssrV~DGICDCCDgSDE~~ 115 (480)
T KOG2397|consen 44 MFKCLDGSKTISFSQLNDDSCDCLDGSDE-PG----TSACPNGKFYCVNQGHQPKYIPSSRVNDGICDCCDGSDEYL 115 (480)
T ss_pred ceeeccCCcccCHHHhccccccCCCCCCC-Cc----cccCCCCceeeeecCCCceeeechhccCcccccccCCCCcc
Confidence 6777655 467788899999999999999 43 34688899999863 35889999999999999999974
No 21
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=95.34 E-value=0.0072 Score=39.44 Aligned_cols=38 Identities=21% Similarity=0.620 Sum_probs=27.0
Q ss_pred CCCcCCCCCcc--ccccCCCCCcCeecCCCCcccccCCccc
Q psy954 76 NPCNIHNGGCA--QSCHPGPNGTAECKCDESTKLVNEGRMC 114 (337)
Q Consensus 76 npC~~~nggCs--~lCl~~~~~~~~C~Cp~g~~L~~~~~~C 114 (337)
+.|+.....|. ..|+.+.++ |+|.|+.||.+..++++|
T Consensus 3 dEC~~~~~~C~~~~~C~N~~Gs-y~C~C~~Gy~~~~~~~~C 42 (42)
T PF07645_consen 3 DECAEGPHNCPENGTCVNTEGS-YSCSCPPGYELNDDGTTC 42 (42)
T ss_dssp STTTTTSSSSSTTSEEEEETTE-EEEEESTTEEECTTSSEE
T ss_pred cccCCCCCcCCCCCEEEcCCCC-EEeeCCCCcEECCCCCcC
Confidence 34444444454 678887776 999999999977776655
No 22
>KOG3509|consensus
Probab=93.05 E-value=0.1 Score=55.87 Aligned_cols=105 Identities=32% Similarity=0.757 Sum_probs=81.2
Q ss_pred ceeCCccccCCCCCCCCCCCCCCCccCCcccCCCCCCCCcccCCCccccCCCCCCCCccceeeecCCCCCCCCCCCCCCC
Q psy954 132 KCISRMWSCDGDDDCGDNSDEDPNYCNVQITGVSQPPGELGVPGHVQITGVSQPPGIVMVMTTVQTGLMNHPNNRKCDEE 211 (337)
Q Consensus 132 ~CI~~~~~CDG~~DC~DgsDE~~~~C~~~~c~c~~~~~~~~~~~~~C~~~~~~~~~~C~~~~~C~~~~d~~c~~~~C~~~ 211 (337)
+|....+.|++..|+.+.||+.+..+.. +.+++.
T Consensus 2 ~c~~~~~~~~~~~~~~~~~~~~~~~~~~----------------------------------------------~~~~p~ 35 (964)
T KOG3509|consen 2 ECVKNRYACDRQPDCRDRSDVANDPAIG----------------------------------------------SACSPN 35 (964)
T ss_pred chhhhhhhhccchhhHhhcccCCCcccc----------------------------------------------ccCCcc
Confidence 5778889999999999999997522211 123333
Q ss_pred CceeeCCCCCCCCCeeecCCCcccCCCCCCCCCccccccCCCCC---CCCCCCCcEEeeCC-ceecCCCcCCCCCCCCCC
Q psy954 212 TEFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADENTTALNCPK---QSSCSPDQFSCGNG-RCINTGWLCDHDNDCGDG 287 (337)
Q Consensus 212 ~~f~C~~~~~~~~~~Ci~~~~~CDg~~dC~dgsDE~~~~~~C~~---~~~C~~~~f~C~~g-~Ci~~~~~CDg~~dC~d~ 287 (337)
. |.|.++ ++....|.||...++..++++ .+|.. ...|....+.|.+- ++.+.+..|+|.++|.++
T Consensus 36 ~-~~~~~~------~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~s~~~~~~~~c~~~~~~~~~~~~~~g~~~~~~~ 104 (964)
T KOG3509|consen 36 E-FKCNNP------RCVQPEALLDADSTCGPNSTP----SGCNAKPSASDCKPTETQCRDRLRCNPQSFQCDGTNDCKDG 104 (964)
T ss_pred h-hccCCc------cccCchhhhccccccCCCCCc----CCccccccccccCCcccccccchhcCCccccccCCCCCCcc
Confidence 3 778777 889999999999999999977 45532 35667788888765 788888999999999999
Q ss_pred CCCCCCC
Q psy954 288 SDEGKEC 294 (337)
Q Consensus 288 sDE~~~C 294 (337)
++| ..+
T Consensus 105 ~~~-~~~ 110 (964)
T KOG3509|consen 105 SDE-VGC 110 (964)
T ss_pred chh-ccc
Confidence 999 443
No 23
>smart00181 EGF Epidermal growth factor-like domain.
Probab=93.03 E-value=0.096 Score=32.12 Aligned_cols=25 Identities=24% Similarity=0.619 Sum_probs=19.6
Q ss_pred CCccc-cccCCCCCcCeecCCCCcccc
Q psy954 83 GGCAQ-SCHPGPNGTAECKCDESTKLV 108 (337)
Q Consensus 83 ggCs~-lCl~~~~~~~~C~Cp~g~~L~ 108 (337)
+.|.+ .|+...++ ++|.|+.||.+.
T Consensus 6 ~~C~~~~C~~~~~~-~~C~C~~g~~g~ 31 (35)
T smart00181 6 GPCSNGTCINTPGS-YTCSCPPGYTGD 31 (35)
T ss_pred CCCCCCEEECCCCC-eEeECCCCCccC
Confidence 45666 89988555 999999999864
No 24
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=92.68 E-value=0.099 Score=32.17 Aligned_cols=26 Identities=27% Similarity=0.686 Sum_probs=19.6
Q ss_pred CccccccCCCCCcCeecCCCCcccccCC
Q psy954 84 GCAQSCHPGPNGTAECKCDESTKLVNEG 111 (337)
Q Consensus 84 gCs~lCl~~~~~~~~C~Cp~g~~L~~~~ 111 (337)
.|...|-+... .+|.||+||+|.++.
T Consensus 7 ~CpA~CDpn~~--~~C~CPeGyIlde~~ 32 (34)
T PF09064_consen 7 ECPADCDPNSP--GQCFCPEGYILDEGS 32 (34)
T ss_pred cCCCccCCCCC--CceeCCCceEecCCc
Confidence 47777776433 489999999998754
No 25
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=92.53 E-value=0.082 Score=47.42 Aligned_cols=38 Identities=24% Similarity=0.564 Sum_probs=32.6
Q ss_pred CCCCcCCCCCccccccCCCCCcCeecCCCCcccccCCcc
Q psy954 75 VNPCNIHNGGCAQSCHPGPNGTAECKCDESTKLVNEGRM 113 (337)
Q Consensus 75 ~npC~~~nggCs~lCl~~~~~~~~C~Cp~g~~L~~~~~~ 113 (337)
.++|...++.|.|.|+..+++ |.|.|+.||.|..++++
T Consensus 187 ~~~C~~~~~~c~~~C~~~~g~-~~c~c~~g~~~~~~~~~ 224 (224)
T cd01475 187 PDLCATLSHVCQQVCISTPGS-YLCACTEGYALLEDNKT 224 (224)
T ss_pred chhhcCCCCCccceEEcCCCC-EEeECCCCccCCCCCCC
Confidence 367887788899999988777 99999999999888764
No 26
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=90.40 E-value=0.31 Score=30.29 Aligned_cols=23 Identities=26% Similarity=0.741 Sum_probs=17.8
Q ss_pred cccCCCCCcCeecCCCCcccccCCccc
Q psy954 88 SCHPGPNGTAECKCDESTKLVNEGRMC 114 (337)
Q Consensus 88 lCl~~~~~~~~C~Cp~g~~L~~~~~~C 114 (337)
+|+..+.+ ++|.|+.||. +++.|
T Consensus 16 ~C~~~~g~-~~C~C~~g~~---~g~~C 38 (39)
T smart00179 16 TCVNTVGS-YRCECPPGYT---DGRNC 38 (39)
T ss_pred EeECCCCC-eEeECCCCCc---cCCcC
Confidence 78877666 9999999987 55555
No 27
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many it occurs in tandem arrays, for example in the RING finger, B-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=90.12 E-value=0.88 Score=26.67 Aligned_cols=26 Identities=27% Similarity=0.318 Sum_probs=22.2
Q ss_pred CCCceEEEEE-CCEEEEEeCCCCceEE
Q psy954 15 TIYPFAITVH-RNYIYWTDLQLRGVYR 40 (337)
Q Consensus 15 ~~~Pf~Lav~-~d~IYWtDw~~~~I~r 40 (337)
+.+|.||++. ++.||-+|+....|+.
T Consensus 1 f~~P~gvav~~~g~i~VaD~~n~rV~v 27 (28)
T PF01436_consen 1 FNYPHGVAVDSDGNIYVADSGNHRVQV 27 (28)
T ss_dssp BSSEEEEEEETTSEEEEEECCCTEEEE
T ss_pred CcCCcEEEEeCCCCEEEEECCCCEEEE
Confidence 3589999995 9999999999888764
No 28
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=88.24 E-value=0.17 Score=30.85 Aligned_cols=27 Identities=41% Similarity=0.964 Sum_probs=19.5
Q ss_pred CCCCcCCCCCccccccCCCCCcCeecCCCCcc
Q psy954 75 VNPCNIHNGGCAQSCHPGPNGTAECKCDESTK 106 (337)
Q Consensus 75 ~npC~~~nggCs~lCl~~~~~~~~C~Cp~g~~ 106 (337)
.+||+ |+| .|+....+.++|.|+.||.
T Consensus 3 ~~~C~--n~g---~C~~~~~~~y~C~C~~G~~ 29 (32)
T PF00008_consen 3 SNPCQ--NGG---TCIDLPGGGYTCECPPGYT 29 (32)
T ss_dssp TTSST--TTE---EEEEESTSEEEEEEBTTEE
T ss_pred CCcCC--CCe---EEEeCCCCCEEeECCCCCc
Confidence 46777 333 5777775569999999985
No 29
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=85.35 E-value=0.9 Score=27.16 Aligned_cols=21 Identities=19% Similarity=0.414 Sum_probs=16.6
Q ss_pred ccccCCCCCcCeecCCCCcccc
Q psy954 87 QSCHPGPNGTAECKCDESTKLV 108 (337)
Q Consensus 87 ~lCl~~~~~~~~C~Cp~g~~L~ 108 (337)
.+|+..+.+ ++|.||.||...
T Consensus 12 ~~C~~~~~~-~~C~C~~g~~g~ 32 (36)
T cd00053 12 GTCVNTPGS-YRCVCPPGYTGD 32 (36)
T ss_pred CEEecCCCC-eEeECCCCCccc
Confidence 577877755 999999998754
No 30
>PF08450 SGL: SMP-30/Gluconolaconase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE. ; PDB: 2GHS_A 2DG0_L 2DG1_D 2DSO_D 3E5Z_A 2IAT_A 2IAV_A 2GVV_A 3HLI_A 2GVU_A ....
Probab=81.62 E-value=5.5 Score=35.77 Aligned_cols=38 Identities=16% Similarity=0.224 Sum_probs=31.9
Q ss_pred eEEEEcCCCCceEEEEE--CCEEEEEeCCCCceEEEeccc
Q psy954 8 SMVLISATIYPFAITVH--RNYIYWTDLQLRGVYRAEKHT 45 (337)
Q Consensus 8 R~vl~~~~~~Pf~Lav~--~d~IYWtDw~~~~I~r~~k~~ 45 (337)
.+++...+..|.||++. ++.||++|...+.|++.+...
T Consensus 126 ~~~~~~~~~~pNGi~~s~dg~~lyv~ds~~~~i~~~~~~~ 165 (246)
T PF08450_consen 126 VTVVADGLGFPNGIAFSPDGKTLYVADSFNGRIWRFDLDA 165 (246)
T ss_dssp EEEEEEEESSEEEEEEETTSSEEEEEETTTTEEEEEEEET
T ss_pred EEEEecCcccccceEECCcchheeecccccceeEEEeccc
Confidence 34445778999999999 668999999999999998863
No 31
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=80.50 E-value=0.44 Score=29.95 Aligned_cols=29 Identities=28% Similarity=0.696 Sum_probs=19.3
Q ss_pred CcCCCCCcc--ccccCCCCCcCeecCCCCccc
Q psy954 78 CNIHNGGCA--QSCHPGPNGTAECKCDESTKL 107 (337)
Q Consensus 78 C~~~nggCs--~lCl~~~~~~~~C~Cp~g~~L 107 (337)
|+.+|++|. -.|..++.+ ++|.|+.||.-
T Consensus 1 C~~~~~~C~~nA~C~~~~~~-~~C~C~~Gy~G 31 (36)
T PF12947_consen 1 CLENNGGCHPNATCTNTGGS-YTCTCKPGYEG 31 (36)
T ss_dssp TTTGGGGS-TTCEEEE-TTS-EEEEE-CEEEC
T ss_pred CCCCCCCCCCCcEeecCCCC-EEeECCCCCcc
Confidence 455566775 378888775 99999999864
No 32
>TIGR03032 conserved hypothetical protein TIGR03032. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown.
Probab=79.77 E-value=4.2 Score=38.44 Aligned_cols=57 Identities=11% Similarity=0.016 Sum_probs=45.3
Q ss_pred eEEEEcCCCCceEEEEECCEEEEEeCCCCceEEEecccCCceEEEeecccCCCccceee
Q psy954 8 SMVLISATIYPFAITVHRNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVY 66 (337)
Q Consensus 8 R~vl~~~~~~Pf~Lav~~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~ 66 (337)
.++|++++..|.+.=.|++.||.+|+.++.|.+++..+|+...+. .++..|.||...
T Consensus 195 ~evl~~GLsmPhSPRWhdgrLwvldsgtGev~~vD~~~G~~e~Va--~vpG~~rGL~f~ 251 (335)
T TIGR03032 195 GEVVASGLSMPHSPRWYQGKLWLLNSGRGELGYVDPQAGKFQPVA--FLPGFTRGLAFA 251 (335)
T ss_pred CCEEEcCccCCcCCcEeCCeEEEEECCCCEEEEEcCCCCcEEEEE--ECCCCCccccee
Confidence 467778999999999999999999999999999999888755553 333556665544
No 33
>KOG1219|consensus
Probab=79.57 E-value=1.9 Score=50.18 Aligned_cols=58 Identities=34% Similarity=0.882 Sum_probs=38.5
Q ss_pred cCCCCCcCCCCCccccccCCCCCcCeecCCCCcccccCCcccccCCCCCCCCccccCC-CceeCC--ccccC
Q psy954 73 CSVNPCNIHNGGCAQSCHPGPNGTAECKCDESTKLVNEGRMCVAKNITCDGSKFFCRN-GKCISR--MWSCD 141 (337)
Q Consensus 73 ~~~npC~~~nggCs~lCl~~~~~~~~C~Cp~g~~L~~~~~~C~~~~~~C~~~~f~C~~-g~CI~~--~~~CD 141 (337)
|..|||+ ||| .|...|.++|.|.||.-|. |+.|...-..|.++ .|.+ |.||+. .+.|+
T Consensus 3867 C~~npCq--hgG---~C~~~~~ggy~CkCpsqys----G~~CEi~~epC~sn--PC~~GgtCip~~n~f~Cn 3927 (4289)
T KOG1219|consen 3867 CNDNPCQ--HGG---TCISQPKGGYKCKCPSQYS----GNHCEIDLEPCASN--PCLTGGTCIPFYNGFLCN 3927 (4289)
T ss_pred cccCccc--CCC---EecCCCCCceEEeCccccc----CcccccccccccCC--CCCCCCEEEecCCCeeEe
Confidence 3446666 555 6888898889999998885 66676543346543 5666 478754 45554
No 34
>PF01731 Arylesterase: Arylesterase; InterPro: IPR002640 The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity. Mammals have 3 distinct paraoxonase types, termed PON1-3. In mice and humans, the PON genes are found on the same chromosome in close proximity. PON activity has been found in a variety of tissues, with highest levels in liver and serum; the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low density lipoproteins (LDL). Although PON2 has antioxidant properties, the enzyme does not associate with HDL. Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1, 2 or 3) share 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo. This family consists of arylesterases (also known as serum paraoxonase) 3.1.1.2 from EC. These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood.
They confer resistance to organophosphate toxicity. Human arylesterase (PON1) P27169 from SWISSPROT is associated with HDL and may protect against LDL oxidation.; GO: 0004064 arylesterase activity
Probab=78.19 E-value=5.7 Score=30.15 Aligned_cols=43 Identities=12% Similarity=0.193 Sum_probs=35.9
Q ss_pred CccCCCceEEEEcCCCCceEEEEE--CCEEEEEeCCCCceEEEec
Q psy954 1 MRIATGASMVLISATIYPFAITVH--RNYIYWTDLQLRGVYRAEK 43 (337)
Q Consensus 1 ~~~dG~~R~vl~~~~~~Pf~Lav~--~d~IYWtDw~~~~I~r~~k 43 (337)
+.++|+.-+++.+++..|=||++. +.+||-++-..++|..-.+
T Consensus 39 vyyd~~~~~~va~g~~~aNGI~~s~~~k~lyVa~~~~~~I~vy~~ 83 (86)
T PF01731_consen 39 VYYDGKEVKVVASGFSFANGIAISPDKKYLYVASSLAHSIHVYKR 83 (86)
T ss_pred EEEeCCEeEEeeccCCCCceEEEcCCCCEEEEEeccCCeEEEEEe
Confidence 357888888888999999999996 7899999988888876554
No 35
>TIGR02276 beta_rpt_yvtn 40-residue YVTN family beta-propeller repeat. This repeat of about 40 amino acids is found in up to 14 copies per protein. Archaea Methanosarcina mazei and Methanosarcina acetivorans each have over 10 genes that encode tandem copies of this repeat, which is also found in other species. PSIPRED predicts with high confidence that each 40-residue repeat contains four beta strands. This model overlaps somewhat with the NHL repeat (Pfam pfam01436) and also shows sequence similarity to the WD domain, G-beta repeat (Pfam pfam00400).
Probab=78.07 E-value=7.5 Score=24.24 Aligned_cols=38 Identities=13% Similarity=0.089 Sum_probs=26.0
Q ss_pred CCEEEEEeCCCCceEEEecccCCceEEEeecccCCCccce
Q psy954 25 RNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIH 64 (337)
Q Consensus 25 ~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~ 64 (337)
+++||-++|..+.|..++..++.....+.. ...|.+|.
T Consensus 3 ~~~lyv~~~~~~~v~~id~~~~~~~~~i~v--g~~P~~i~ 40 (42)
T TIGR02276 3 GTKLYVTNSGSNTVSVIDTATNKVIATIPV--GGYPFGVA 40 (42)
T ss_pred CCEEEEEeCCCCEEEEEECCCCeEEEEEEC--CCCCceEE
Confidence 578999999999999988766654433332 24466554
No 36
>PF03088 Str_synth: Strictosidine synthase; InterPro: IPR018119 This entry represents a conserved region found in strictosidine synthase (4.3.3.2 from EC), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom.; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2FPB_A 2V91_B 2FP8_A 3V1S_B 2FPC_A 2VAQ_A 2FP9_B.
Probab=72.99 E-value=11 Score=28.88 Aligned_cols=47 Identities=19% Similarity=0.327 Sum_probs=27.0
Q ss_pred EEEEEC--CEEEEEeCCCC-----------------ceEEEecccCCceEEEeecccCCCccceeecc
Q psy954 20 AITVHR--NYIYWTDLQLR-----------------GVYRAEKHTGANMIEMVKRLEDSPRDIHVYSA 68 (337)
Q Consensus 20 ~Lav~~--d~IYWtDw~~~-----------------~I~r~~k~~G~~~~~l~~~~~~~p~gI~v~~~ 68 (337)
+|+|.. +.|||||..++ +|++.+..++ ..++++.++ ..|.||.+-..
T Consensus 2 dldv~~~~g~vYfTdsS~~~~~~~~~~~~le~~~~GRll~ydp~t~-~~~vl~~~L-~fpNGVals~d 67 (89)
T PF03088_consen 2 DLDVDQDTGTVYFTDSSSRYDRRDWVYDLLEGRPTGRLLRYDPSTK-ETTVLLDGL-YFPNGVALSPD 67 (89)
T ss_dssp EEEE-TTT--EEEEES-SS--TTGHHHHHHHT---EEEEEEETTTT-EEEEEEEEE-SSEEEEEE-TT
T ss_pred ceeEecCCCEEEEEeCccccCccceeeeeecCCCCcCEEEEECCCC-eEEEehhCC-CccCeEEEcCC
Confidence 466774 59999997543 4555555454 345666666 67888887543
No 37
>KOG4499|consensus
Probab=70.47 E-value=8.7 Score=34.95 Aligned_cols=49 Identities=16% Similarity=0.297 Sum_probs=39.9
Q ss_pred CceEEEE--c-----CCCCceEEEEE-CCEEEEEeCCCCceEEEecccCCceEEEee
Q psy954 6 GASMVLI--S-----ATIYPFAITVH-RNYIYWTDLQLRGVYRAEKHTGANMIEMVK 54 (337)
Q Consensus 6 ~~R~vl~--~-----~~~~Pf~Lav~-~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~ 54 (337)
++|++|+ + ....|=||+|- ++.||-+-|..+.|++++..+|.....+..
T Consensus 195 snr~~i~dlrk~~~~e~~~PDGm~ID~eG~L~Va~~ng~~V~~~dp~tGK~L~eikl 251 (310)
T KOG4499|consen 195 SNRKVIFDLRKSQPFESLEPDGMTIDTEGNLYVATFNGGTVQKVDPTTGKILLEIKL 251 (310)
T ss_pred cCcceeEEeccCCCcCCCCCCcceEccCCcEEEEEecCcEEEEECCCCCcEEEEEEc
Confidence 6788886 2 23577899998 889999999999999999999987666543
No 38
>PF03088 Str_synth: Strictosidine synthase; InterPro: IPR018119 This entry represents a conserved region found in strictosidine synthase (4.3.3.2 from EC), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom.; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2FPB_A 2V91_B 2FP8_A 3V1S_B 2FPC_A 2VAQ_A 2FP9_B.
Probab=69.23 E-value=9.9 Score=29.05 Aligned_cols=36 Identities=11% Similarity=0.306 Sum_probs=28.2
Q ss_pred CceEEEEcCCCCceEEEEE--CCEEEEEeCCCCceEEE
Q psy954 6 GASMVLISATIYPFAITVH--RNYIYWTDLQLRGVYRA 41 (337)
Q Consensus 6 ~~R~vl~~~~~~Pf~Lav~--~d~IYWtDw~~~~I~r~ 41 (337)
..-+||+.++..|-||++. +++|+.++.....|.|.
T Consensus 47 ~~~~vl~~~L~fpNGVals~d~~~vlv~Et~~~Ri~ry 84 (89)
T PF03088_consen 47 KETTVLLDGLYFPNGVALSPDESFVLVAETGRYRILRY 84 (89)
T ss_dssp TEEEEEEEEESSEEEEEE-TTSSEEEEEEGGGTEEEEE
T ss_pred CeEEEehhCCCccCeEEEcCCCCEEEEEeccCceEEEE
Confidence 3345667999999999999 56999999888877764
No 39
>KOG1219|consensus
Probab=65.35 E-value=5.9 Score=46.51 Aligned_cols=55 Identities=36% Similarity=0.878 Sum_probs=37.9
Q ss_pred CCCCcCCCCCccccccCCCCCcCeecCCCCcccccCCcccccC-CCCCCCCccccCC-CceeCC--ccccC
Q psy954 75 VNPCNIHNGGCAQSCHPGPNGTAECKCDESTKLVNEGRMCVAK-NITCDGSKFFCRN-GKCISR--MWSCD 141 (337)
Q Consensus 75 ~npC~~~nggCs~lCl~~~~~~~~C~Cp~g~~L~~~~~~C~~~-~~~C~~~~f~C~~-g~CI~~--~~~CD 141 (337)
+|||. +|| .|++.+++ +.|-||.||. |++|..+ -..|.. -.|.+ |+||+. ++.|+
T Consensus 3908 snPC~--~Gg---tCip~~n~-f~CnC~~gyT----G~~Ce~~Gi~eCs~--n~C~~gg~C~n~~gsf~Cn 3966 (4289)
T KOG1219|consen 3908 SNPCL--TGG---TCIPFYNG-FLCNCPNGYT----GKRCEARGISECSK--NVCGTGGQCINIPGSFHCN 3966 (4289)
T ss_pred CCCCC--CCC---EEEecCCC-eeEeCCCCcc----Cceeeccccccccc--ccccCCceeeccCCceEec
Confidence 35555 555 67888777 9999999996 7788775 223653 35776 588854 45776
No 40
>PF08450 SGL: SMP-30/Gluconolaconase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE. ; PDB: 2GHS_A 2DG0_L 2DG1_D 2DSO_D 3E5Z_A 2IAT_A 2IAV_A 2GVV_A 3HLI_A 2GVU_A ....
Probab=60.32 E-value=26 Score=31.27 Aligned_cols=57 Identities=16% Similarity=0.207 Sum_probs=38.0
Q ss_pred ceEEEE--cCC-CCceEEEEE-CCEEEEEeCCCCceEEEecccCCceEEEeecccCCCcccee
Q psy954 7 ASMVLI--SAT-IYPFAITVH-RNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHV 65 (337)
Q Consensus 7 ~R~vl~--~~~-~~Pf~Lav~-~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v 65 (337)
+++++. ... ..|=||+|- .+.||.+.|..+.|++++.. |.....+.... .+|+.+.+
T Consensus 172 ~~~~~~~~~~~~g~pDG~~vD~~G~l~va~~~~~~I~~~~p~-G~~~~~i~~p~-~~~t~~~f 232 (246)
T PF08450_consen 172 NRRVFIDFPGGPGYPDGLAVDSDGNLWVADWGGGRIVVFDPD-GKLLREIELPV-PRPTNCAF 232 (246)
T ss_dssp EEEEEEE-SSSSCEEEEEEEBTTS-EEEEEETTTEEEEEETT-SCEEEEEE-SS-SSEEEEEE
T ss_pred eeeeEEEcCCCCcCCCcceEcCCCCEEEEEcCCCEEEEECCC-ccEEEEEcCCC-CCEEEEEE
Confidence 355554 222 469999998 78999999999999999986 76555444332 34554444
No 41
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=56.80 E-value=11 Score=22.46 Aligned_cols=20 Identities=15% Similarity=0.356 Sum_probs=15.1
Q ss_pred ccccCCCCCcCeecCCCCccc
Q psy954 87 QSCHPGPNGTAECKCDESTKL 107 (337)
Q Consensus 87 ~lCl~~~~~~~~C~Cp~g~~L 107 (337)
..|+..+.+ ++|.|+.||..
T Consensus 15 ~~C~~~~~~-~~C~C~~g~~g 34 (38)
T cd00054 15 GTCVNTVGS-YRCSCPPGYTG 34 (38)
T ss_pred CEeECCCCC-eEeECCCCCcC
Confidence 367766655 99999999864
No 42
>PLN02919 haloacid dehalogenase-like hydrolase family protein
Probab=54.77 E-value=38 Score=37.71 Aligned_cols=33 Identities=9% Similarity=0.207 Sum_probs=28.4
Q ss_pred CCCceEEEEE--CCEEEEEeCCCCceEEEecccCC
Q psy954 15 TIYPFAITVH--RNYIYWTDLQLRGVYRAEKHTGA 47 (337)
Q Consensus 15 ~~~Pf~Lav~--~d~IYWtDw~~~~I~r~~k~~G~ 47 (337)
+.+|.+|++. ++.||++|+..+.|++.+..+|.
T Consensus 682 ln~P~gVa~dp~~g~LyVad~~~~~I~v~d~~~g~ 716 (1057)
T PLN02919 682 LNSPWDVCFEPVNEKVYIAMAGQHQIWEYNISDGV 716 (1057)
T ss_pred cCCCeEEEEecCCCeEEEEECCCCeEEEEECCCCe
Confidence 4689999998 68999999999999998886654
No 43
>TIGR03606 non_repeat_PQQ dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.
Probab=51.33 E-value=44 Score=33.47 Aligned_cols=41 Identities=24% Similarity=0.317 Sum_probs=32.5
Q ss_pred ceEEEEcCCCCceEEEEEC-CEEEEEeCCCCceEEEecccCC
Q psy954 7 ASMVLISATIYPFAITVHR-NYIYWTDLQLRGVYRAEKHTGA 47 (337)
Q Consensus 7 ~R~vl~~~~~~Pf~Lav~~-d~IYWtDw~~~~I~r~~k~~G~ 47 (337)
..++|.+++.+|.+|++.. +.||-|+...+.|++++..++.
T Consensus 21 ~~~~va~GL~~Pw~maflPDG~llVtER~~G~I~~v~~~~~~ 62 (454)
T TIGR03606 21 DKKVLLSGLNKPWALLWGPDNQLWVTERATGKILRVNPETGE 62 (454)
T ss_pred EEEEEECCCCCceEEEEcCCCeEEEEEecCCEEEEEeCCCCc
Confidence 3456668999999999984 5899998878899999765543
No 44
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=49.28 E-value=7.3 Score=33.91 Aligned_cols=43 Identities=30% Similarity=0.683 Sum_probs=28.2
Q ss_pred cccCCCC----CcCeecCCCCcccccCCcccccCCCCCCCCccccCCCceeCC
Q psy954 88 SCHPGPN----GTAECKCDESTKLVNEGRMCVAKNITCDGSKFFCRNGKCISR 136 (337)
Q Consensus 88 lCl~~~~----~~~~C~Cp~g~~L~~~~~~C~~~~~~C~~~~f~C~~g~CI~~ 136 (337)
.|+..++ ..++|.|-.||.|.++ .|.+. .|. .+.|.+|+||..
T Consensus 57 ~C~~~~~~~~~~~~~C~C~~gY~~~~~--vCvp~--~C~--~~~Cg~GKCI~d 103 (197)
T PF06247_consen 57 KCINQANKGEERAYKCDCINGYILKQG--VCVPN--KCN--NKDCGSGKCILD 103 (197)
T ss_dssp EEEE-SSTTSSTSEEEEE-TTEEESSS--SEEEG--GGS--S---TTEEEEEE
T ss_pred hhhcCCCcccceeEEEecccCceeeCC--eEchh--hcC--ceecCCCeEEec
Confidence 5666554 4589999999999875 68764 464 388999999954
No 45
>PLN02919 haloacid dehalogenase-like hydrolase family protein
Probab=48.61 E-value=29 Score=38.56 Aligned_cols=33 Identities=12% Similarity=0.182 Sum_probs=27.4
Q ss_pred CCCceEEEEE--CCEEEEEeCCCCceEEEecccCC
Q psy954 15 TIYPFAITVH--RNYIYWTDLQLRGVYRAEKHTGA 47 (337)
Q Consensus 15 ~~~Pf~Lav~--~d~IYWtDw~~~~I~r~~k~~G~ 47 (337)
+.+|.||++. ++.||++|+..+.|.+++..++.
T Consensus 623 f~~P~GIavd~~gn~LYVaDt~n~~Ir~id~~~~~ 657 (1057)
T PLN02919 623 FNRPQGLAYNAKKNLLYVADTENHALREIDFVNET 657 (1057)
T ss_pred cCCCcEEEEeCCCCEEEEEeCCCceEEEEecCCCE
Confidence 3579999997 57899999999999988875553
No 46
>COG3386 Gluconolactonase [Carbohydrate transport and metabolism]
Probab=46.08 E-value=57 Score=30.86 Aligned_cols=32 Identities=13% Similarity=0.190 Sum_probs=27.2
Q ss_pred cCCCCceEEEEE--CCEEEEEeCCCCceEEEecc
Q psy954 13 SATIYPFAITVH--RNYIYWTDLQLRGVYRAEKH 44 (337)
Q Consensus 13 ~~~~~Pf~Lav~--~d~IYWtDw~~~~I~r~~k~ 44 (337)
..+..|=||++. +..||++|...+.|+|..-.
T Consensus 160 ~~~~~~NGla~SpDg~tly~aDT~~~~i~r~~~d 193 (307)
T COG3386 160 DDLTIPNGLAFSPDGKTLYVADTPANRIHRYDLD 193 (307)
T ss_pred CcEEecCceEECCCCCEEEEEeCCCCeEEEEecC
Confidence 448888899988 66899999999999998664
No 47
>TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.
Probab=38.42 E-value=1.1e+02 Score=29.40 Aligned_cols=36 Identities=11% Similarity=0.090 Sum_probs=26.2
Q ss_pred EEEEcCCCCceEEEEECCEEEEEeCCCCceEEEecccC
Q psy954 9 MVLISATIYPFAITVHRNYIYWTDLQLRGVYRAEKHTG 46 (337)
Q Consensus 9 ~vl~~~~~~Pf~Lav~~d~IYWtDw~~~~I~r~~k~~G 46 (337)
+++.+.+..|.||+++.+-||-++.. .|++.....|
T Consensus 65 ~vfa~~l~~p~Gi~~~~~GlyV~~~~--~i~~~~d~~g 100 (367)
T TIGR02604 65 NVFAEELSMVTGLAVAVGGVYVATPP--DILFLRDKDG 100 (367)
T ss_pred EEeecCCCCccceeEecCCEEEeCCC--eEEEEeCCCC
Confidence 44557788999999986669999743 5777754433
No 48
>TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.
Probab=38.31 E-value=40 Score=32.43 Aligned_cols=42 Identities=14% Similarity=0.185 Sum_probs=32.5
Q ss_pred ccCCCceEEEEcCCCCceEEEEE-CCEEEEEeCCCCceEEEec
Q psy954 2 RIATGASMVLISATIYPFAITVH-RNYIYWTDLQLRGVYRAEK 43 (337)
Q Consensus 2 ~~dG~~R~vl~~~~~~Pf~Lav~-~d~IYWtDw~~~~I~r~~k 43 (337)
+.+|+..+++..++.+|++|++. .+.||.+|-......+++.
T Consensus 170 ~pdg~~~e~~a~G~rnp~Gl~~d~~G~l~~tdn~~~~~~~i~~ 212 (367)
T TIGR02604 170 NPDGGKLRVVAHGFQNPYGHSVDSWGDVFFCDNDDPPLCRVTP 212 (367)
T ss_pred ecCCCeEEEEecCcCCCccceECCCCCEEEEccCCCceeEEcc
Confidence 46777777777889999999996 7788999876666666654
No 49
>PF08309 LVIVD: LVIVD repeat; InterPro: IPR013211 This repeat is found in bacterial and archaeal cell surface proteins, many of which are hypothetical. The secondary structure corresponding to this repeat is predicted to comprise 4 beta-strands, which may associate to form a beta-propeller. The repeat copy number varies from 2-14. This repeat is sometimes found with the PKD domain IPR000601 from INTERPRO.
Probab=34.69 E-value=1.3e+02 Score=19.43 Aligned_cols=28 Identities=21% Similarity=0.245 Sum_probs=20.3
Q ss_pred CceEEEEECCEEEEEeCCCCceEEEeccc
Q psy954 17 YPFAITVHRNYIYWTDLQLRGVYRAEKHT 45 (337)
Q Consensus 17 ~Pf~Lav~~d~IYWtDw~~~~I~r~~k~~ 45 (337)
...+|+|.++|+|-+++.. .|..++-.+
T Consensus 3 ~a~~v~v~g~yaYva~~~~-Gl~IvDISn 30 (42)
T PF08309_consen 3 DARDVAVSGNYAYVADGNN-GLVIVDISN 30 (42)
T ss_pred eEEEEEEECCEEEEEeCCC-CEEEEECCC
Confidence 3468999999999998864 355555543
No 50
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=33.97 E-value=16 Score=17.46 Aligned_cols=9 Identities=22% Similarity=0.748 Sum_probs=6.0
Q ss_pred eecCCCCcc
Q psy954 98 ECKCDESTK 106 (337)
Q Consensus 98 ~C~Cp~g~~ 106 (337)
+|.||.||.
T Consensus 1 ~C~C~~G~~ 9 (13)
T PF12661_consen 1 TCQCPPGWT 9 (13)
T ss_dssp EEEE-TTEE
T ss_pred CccCcCCCc
Confidence 488888875
No 51
>PF01826 TIL: Trypsin Inhibitor like cysteine rich domain; InterPro: IPR002919 This domain is found in proteinase inhibitors as well as in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. This inhibitor domain belongs to MEROPS inhibitor family I8 (clan IA). Proteins containing this domain inhibit peptidases belonging to families S1 (IPR001254 from INTERPRO), S8 (IPR000209 from INTERPRO), and M4 (IPR001570 from INTERPRO) [] and are restricted to the chordata, nematoda, arthropoda and echinodermata. Examples of proteins containing this domain are: chymotrypsin/elastase inhibitor from Ascaris suum (pig roundworm) Acp62F protein from Drosophila melanogaster Bombina trypsin inhibitor from Bombina maxima (large-webbed bell toad) Bombyx subtilisin inhibitor from Bombyx mori (silk moth) von Willebrand factor ; PDB: 2P3F_N 1HX2_A 1CCV_A 1EAI_D 2H9E_C 1COU_A 1ATE_A 1ATB_A 1ATD_A 1ATA_A ....
Probab=30.70 E-value=20 Score=24.26 Aligned_cols=19 Identities=21% Similarity=0.441 Sum_probs=14.9
Q ss_pred eecCCCCcccccCCcccccC
Q psy954 98 ECKCDESTKLVNEGRMCVAK 117 (337)
Q Consensus 98 ~C~Cp~g~~L~~~~~~C~~~ 117 (337)
-|.|+.||++..+ ..|++.
T Consensus 34 gC~C~~G~v~~~~-~~CV~~ 52 (55)
T PF01826_consen 34 GCFCPPGYVRNDN-GRCVPP 52 (55)
T ss_dssp EEEETTTEEEETT-SEEEEG
T ss_pred cCCCCCCeeEcCC-CCEEcH
Confidence 3999999998776 577753
No 52
>PF08887 GAD-like: GAD-like domain; InterPro: IPR014983 This domain is functionally uncharacterised, but it appears to be distantly related to the GAD domain IPR004115 from INTERPRO.
Probab=29.41 E-value=59 Score=25.78 Aligned_cols=28 Identities=14% Similarity=0.180 Sum_probs=22.0
Q ss_pred CCCCceEEEEECCEEEEEeCCCCceEEE
Q psy954 14 ATIYPFAITVHRNYIYWTDLQLRGVYRA 41 (337)
Q Consensus 14 ~~~~Pf~Lav~~d~IYWtDw~~~~I~r~ 41 (337)
...||++.+-||+.++|.......+..+
T Consensus 78 ~~~~~ia~tAFGdl~~w~e~~g~~~~i~ 105 (109)
T PF08887_consen 78 DNYIPIARTAFGDLYVWGENTGISLIIT 105 (109)
T ss_pred ceEEEEEEcccccEEEEEcCCceEEEEE
Confidence 3479999999999999998766555443
No 53
>TIGR03032 conserved hypothetical protein TIGR03032. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown.
Probab=29.13 E-value=2.3e+02 Score=27.11 Aligned_cols=66 Identities=12% Similarity=0.111 Sum_probs=38.2
Q ss_pred cCCCceEEEEcCCCCceEEEEECCEEEEEeCCCC-------------------ceEEEecccCCceEEEeec-ccCCCcc
Q psy954 3 IATGASMVLISATIYPFAITVHRNYIYWTDLQLR-------------------GVYRAEKHTGANMIEMVKR-LEDSPRD 62 (337)
Q Consensus 3 ~dG~~R~vl~~~~~~Pf~Lav~~d~IYWtDw~~~-------------------~I~r~~k~~G~~~~~l~~~-~~~~p~g 62 (337)
++.+..++|..-..+|.||+.+++++|-+-.+.+ .|+.++..+|.....+.-. ....+++
T Consensus 230 ~~~G~~e~Va~vpG~~rGL~f~G~llvVgmSk~R~~~~f~glpl~~~l~~~~CGv~vidl~tG~vv~~l~feg~v~Eifd 309 (335)
T TIGR03032 230 PQAGKFQPVAFLPGFTRGLAFAGDFAFVGLSKLRESRVFGGLPIEERLDALGCGVAVIDLNSGDVVHWLRFEGVIEEIYD 309 (335)
T ss_pred CCCCcEEEEEECCCCCcccceeCCEEEEEeccccCCCCcCCCchhhhhhhhcccEEEEECCCCCEEEEEEeCCceeEEEE
Confidence 3334455555555799999999999998754332 3556666666644444321 1123455
Q ss_pred ceeecc
Q psy954 63 IHVYSA 68 (337)
Q Consensus 63 I~v~~~ 68 (337)
+.|.-.
T Consensus 310 V~vLPg 315 (335)
T TIGR03032 310 VAVLPG 315 (335)
T ss_pred EEEecC
Confidence 555433
No 54
>COG4257 Vgb Streptogramin lyase [Defense mechanisms]
Probab=25.51 E-value=1.2e+02 Score=28.53 Aligned_cols=51 Identities=18% Similarity=0.203 Sum_probs=38.5
Q ss_pred CCCCceEEEEE-CCEEEEEeCCCCceEEEecccCCceEEEeecccCCCcccee
Q psy954 14 ATIYPFAITVH-RNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHV 65 (337)
Q Consensus 14 ~~~~Pf~Lav~-~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v 65 (337)
.-..||.|+.. .+.|++++..+++|-+.+..+|+..++-+.. ...|.+|.+
T Consensus 60 ~G~ap~dvapapdG~VWft~qg~gaiGhLdP~tGev~~ypLg~-Ga~Phgiv~ 111 (353)
T COG4257 60 NGSAPFDVAPAPDGAVWFTAQGTGAIGHLDPATGEVETYPLGS-GASPHGIVV 111 (353)
T ss_pred CCCCccccccCCCCceEEecCccccceecCCCCCceEEEecCC-CCCCceEEE
Confidence 34688999988 5559999999999999999999766555443 255666655
No 55
>PF07995 GSDH: Glucose / Sorbosone dehydrogenase; InterPro: IPR012938 Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase (P13650 from SWISSPROT) from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor [].; GO: 0016901 oxidoreductase activity, acting on the CH-OH group of donors, quinone or similar compound as acceptor, 0048038 quinone binding, 0005975 carbohydrate metabolic process; PDB: 2ISM_A 2WG3_D 3HO5_A 3HO4_A 3HO3_A 2WFT_A 2WG4_B 2WFX_B 1CRU_A 1CQ1_B ....
Probab=23.43 E-value=88 Score=29.69 Aligned_cols=37 Identities=14% Similarity=0.171 Sum_probs=25.2
Q ss_pred ceEEEEcCCCCceEEEEEC--CEEEEEeCCCCceEEEec
Q psy954 7 ASMVLISATIYPFAITVHR--NYIYWTDLQLRGVYRAEK 43 (337)
Q Consensus 7 ~R~vl~~~~~~Pf~Lav~~--d~IYWtDw~~~~I~r~~k 43 (337)
+.+++..++.+||+|++.. +.||.+|-.......++.
T Consensus 172 ~~~i~A~GlRN~~~~~~d~~tg~l~~~d~G~~~~dein~ 210 (331)
T PF07995_consen 172 DSEIYAYGLRNPFGLAFDPNTGRLWAADNGPDGWDEINR 210 (331)
T ss_dssp TTTEEEE--SEEEEEEEETTTTEEEEEEE-SSSSEEEEE
T ss_pred eEEEEEeCCCccccEEEECCCCcEEEEccCCCCCcEEEE
Confidence 5666668999999999995 689988866555544443
No 56
>PF05096 Glu_cyclase_2: Glutamine cyclotransferase; InterPro: IPR007788 This family of enzymes 2.3.2.5 from EC catalyse the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively []. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.; PDB: 3NOK_B 2FAW_A 2IWA_A 3NOM_A 3NOL_A 3MBR_X.
Probab=23.36 E-value=1.7e+02 Score=27.07 Aligned_cols=33 Identities=12% Similarity=0.074 Sum_probs=26.9
Q ss_pred CCCceEEEEECCEEEEEeCCCCceEEEecccCC
Q psy954 15 TIYPFAITVHRNYIYWTDLQLRGVYRAEKHTGA 47 (337)
Q Consensus 15 ~~~Pf~Lav~~d~IYWtDw~~~~I~r~~k~~G~ 47 (337)
...--|||++++.||--.|+.+..+..++.+-+
T Consensus 89 ~~FgEGit~~~d~l~qLTWk~~~~f~yd~~tl~ 121 (264)
T PF05096_consen 89 RYFGEGITILGDKLYQLTWKEGTGFVYDPNTLK 121 (264)
T ss_dssp T--EEEEEEETTEEEEEESSSSEEEEEETTTTE
T ss_pred cccceeEEEECCEEEEEEecCCeEEEEccccce
Confidence 345578999999999999999999988886544
No 57
>PRK04043 tolB translocation protein TolB; Provisional
Probab=23.28 E-value=2.1e+02 Score=28.22 Aligned_cols=50 Identities=12% Similarity=0.090 Sum_probs=30.9
Q ss_pred CccCCCceEEEEcCCCCceEEEEE--CCE-EEEEeCC--CCceEEEecccCCceEE
Q psy954 1 MRIATGASMVLISATIYPFAITVH--RNY-IYWTDLQ--LRGVYRAEKHTGANMIE 51 (337)
Q Consensus 1 ~~~dG~~R~vl~~~~~~Pf~Lav~--~d~-IYWtDw~--~~~I~r~~k~~G~~~~~ 51 (337)
|++||.+.++|..+. .-.+.... +++ ||.+... ...|+..+..+|..+.+
T Consensus 174 ~d~dg~~~~~~~~~~-~~~~p~wSpDG~~~i~y~s~~~~~~~Iyv~dl~tg~~~~l 228 (419)
T PRK04043 174 ADYTLTYQKVIVKGG-LNIFPKWANKEQTAFYYTSYGERKPTLYKYNLYTGKKEKI 228 (419)
T ss_pred ECCCCCceeEEccCC-CeEeEEECCCCCcEEEEEEccCCCCEEEEEECCCCcEEEE
Confidence 578999988877442 22223323 554 6665544 46799888877765444
No 58
>PF03022 MRJP: Major royal jelly protein; InterPro: IPR003534 The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and 82-90% of the protein content [], of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees [] and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content []. Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that results from a high genetic variability. This polymorphism may be useful for genotyping individual bees [].; PDB: 3Q6P_B 3Q6K_A 3Q6T_A 2QE8_B.
Probab=22.91 E-value=2.3e+02 Score=26.29 Aligned_cols=48 Identities=10% Similarity=0.153 Sum_probs=0.0
Q ss_pred EEEEE-CCEEEEEeCCCCceEEEeccc----CCceEEEeecc-cCCCccceeec
Q psy954 20 AITVH-RNYIYWTDLQLRGVYRAEKHT----GANMIEMVKRL-EDSPRDIHVYS 67 (337)
Q Consensus 20 ~Lav~-~d~IYWtDw~~~~I~r~~k~~----G~~~~~l~~~~-~~~p~gI~v~~ 67 (337)
|+++. .+.||+++....+|.+.+..+ .....++.... ..+|.++++..
T Consensus 190 g~~~D~~G~ly~~~~~~~aI~~w~~~~~~~~~~~~~l~~d~~~l~~pd~~~i~~ 243 (287)
T PF03022_consen 190 GMAIDPNGNLYFTDVEQNAIGCWDPDGPYTPENFEILAQDPRTLQWPDGLKIDP 243 (287)
T ss_dssp EEEEETTTEEEEEECCCTEEEEEETTTSB-GCCEEEEEE-CC-GSSEEEEEE-T
T ss_pred eEEECCCCcEEEecCCCCeEEEEeCCCCcCccchheeEEcCceeeccceeeecc
No 59
>PF12942 Archaeal_AmoA: Archaeal ammonia monooxygenase subunit A (AmoA); InterPro: IPR024656 This entry represents a group of archaeal proteins that contains ammonia monooxygenase subunit A. Ammonia monooxygenase is an enzyme that oxidises ammonia to nitrite and nitrate, thus playing a significant role in the nitrogen cycle. Ammonia-oxidising archaea (AOA) are widespread in marine environments [].
Probab=22.73 E-value=65 Score=27.43 Aligned_cols=23 Identities=26% Similarity=0.625 Sum_probs=15.1
Q ss_pred EEEE-CCEEEEEeCCCC--ceEEEec
Q psy954 21 ITVH-RNYIYWTDLQLR--GVYRAEK 43 (337)
Q Consensus 21 Lav~-~d~IYWtDw~~~--~I~r~~k 43 (337)
|++. +||||+|||.=. .|++++.
T Consensus 8 ltinagdyifytdwawtsfvvFsisq 33 (183)
T PF12942_consen 8 LTINAGDYIFYTDWAWTSFVVFSISQ 33 (183)
T ss_pred EEEecCceeEEeccCCceEEEEEech
Confidence 4444 999999997533 4555543
No 60
>TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol.
Probab=22.14 E-value=3.4e+02 Score=23.85 Aligned_cols=47 Identities=11% Similarity=0.058 Sum_probs=27.7
Q ss_pred CceEEEEE--CCEEEEEeCCCCceEEEecccCCceEEEeecccCCCcccee
Q psy954 17 YPFAITVH--RNYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHV 65 (337)
Q Consensus 17 ~Pf~Lav~--~d~IYWtDw~~~~I~r~~k~~G~~~~~l~~~~~~~p~gI~v 65 (337)
.|++|++. +.+||-+....+.|...+..++.....+..+ ..|.+|.+
T Consensus 250 ~~~~~~~~~~g~~l~~~~~~~~~i~v~d~~~~~~~~~~~~~--~~~~~~~~ 298 (300)
T TIGR03866 250 RVWQLAFTPDEKYLLTTNGVSNDVSVIDVAALKVIKSIKVG--RLPWGVVV 298 (300)
T ss_pred CcceEEECCCCCEEEEEcCCCCeEEEEECCCCcEEEEEEcc--cccceeEe
Confidence 35566654 5567666655667777777666554444322 45666654
No 61
>PRK04792 tolB translocation protein TolB; Provisional
Probab=21.29 E-value=1.8e+02 Score=28.76 Aligned_cols=47 Identities=4% Similarity=0.004 Sum_probs=27.8
Q ss_pred CccCCCceEEEEcCCCCceEEEE--ECCEEEEEeCC--CCceEEEecccCC
Q psy954 1 MRIATGASMVLISATIYPFAITV--HRNYIYWTDLQ--LRGVYRAEKHTGA 47 (337)
Q Consensus 1 ~~~dG~~R~vl~~~~~~Pf~Lav--~~d~IYWtDw~--~~~I~r~~k~~G~ 47 (337)
|+++|.+.++|..+.....+.+. -+++|+|+... ...|+..+..+|.
T Consensus 203 ~d~dG~~~~~l~~~~~~~~~p~wSPDG~~La~~s~~~g~~~L~~~dl~tg~ 253 (448)
T PRK04792 203 ADYDGYNEQMLLRSPEPLMSPAWSPDGRKLAYVSFENRKAEIFVQDIYTQV 253 (448)
T ss_pred EeCCCCCceEeecCCCcccCceECCCCCEEEEEEecCCCcEEEEEECCCCC
Confidence 35678777776643322233333 37788887544 3468887776664
No 62
>PF07995 GSDH: Glucose / Sorbosone dehydrogenase; InterPro: IPR012938 Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase (P13650 from SWISSPROT) from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor [].; GO: 0016901 oxidoreductase activity, acting on the CH-OH group of donors, quinone or similar compound as acceptor, 0048038 quinone binding, 0005975 carbohydrate metabolic process; PDB: 2ISM_A 2WG3_D 3HO5_A 3HO4_A 3HO3_A 2WFT_A 2WG4_B 2WFX_B 1CRU_A 1CQ1_B ....
Probab=20.91 E-value=1.9e+02 Score=27.32 Aligned_cols=31 Identities=16% Similarity=0.221 Sum_probs=25.4
Q ss_pred CCceEEEEE--------CCEEEEEeCCCCceEEEecccC
Q psy954 16 IYPFAITVH--------RNYIYWTDLQLRGVYRAEKHTG 46 (337)
Q Consensus 16 ~~Pf~Lav~--------~d~IYWtDw~~~~I~r~~k~~G 46 (337)
.-|.||+++ .+.+|.++|....|+++...++
T Consensus 253 ~ap~G~~~y~g~~fp~~~g~~~~~~~~~~~i~~~~~~~~ 291 (331)
T PF07995_consen 253 SAPTGIIFYRGSAFPEYRGDLFVADYGGGRIWRLDLDED 291 (331)
T ss_dssp --EEEEEEE-SSSSGGGTTEEEEEETTTTEEEEEEEETT
T ss_pred cccCceEEECCccCccccCcEEEecCCCCEEEEEeeecC
Confidence 568899988 7789999999999999988644
No 63
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=20.60 E-value=4e+02 Score=20.50 Aligned_cols=12 Identities=17% Similarity=0.595 Sum_probs=9.7
Q ss_pred CcCeecCCCCcc
Q psy954 95 GTAECKCDESTK 106 (337)
Q Consensus 95 ~~~~C~Cp~g~~ 106 (337)
....|.|+.||.
T Consensus 96 ~~~~C~Cl~GF~ 107 (110)
T PF00954_consen 96 NSPKCSCLPGFE 107 (110)
T ss_pred CCCceECCCCcC
Confidence 346799999986
No 64
>COG3386 Gluconolactonase [Carbohydrate transport and metabolism]
Probab=20.24 E-value=1.4e+02 Score=28.23 Aligned_cols=28 Identities=21% Similarity=0.369 Sum_probs=22.4
Q ss_pred CCEEEEEeCCCCceEEEecccCCceEEE
Q psy954 25 RNYIYWTDLQLRGVYRAEKHTGANMIEM 52 (337)
Q Consensus 25 ~d~IYWtDw~~~~I~r~~k~~G~~~~~l 52 (337)
.+.|||+|...+.|+|.+..+|......
T Consensus 36 ~~~L~w~DI~~~~i~r~~~~~g~~~~~~ 63 (307)
T COG3386 36 RGALLWVDILGGRIHRLDPETGKKRVFP 63 (307)
T ss_pred CCEEEEEeCCCCeEEEecCCcCceEEEE
Confidence 5678999999999999998877544433
Done!