Query psy10880
Match_columns 166
No_of_seqs 200 out of 1169
Neff 6.0
Searched_HMMs 46136
Date Fri Aug 16 20:48:53 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy10880.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/10880hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 smart00051 DSL delta serrate l 99.7 1.9E-18 4.2E-23 116.9 5.3 63 67-129 1-63 (63)
2 PF01414 DSL: Delta serrate li 99.7 1.9E-19 4.2E-24 121.8 -1.0 63 67-129 1-63 (63)
3 KOG1225|consensus 98.8 8.6E-09 1.9E-13 94.3 6.6 75 81-161 263-338 (525)
4 KOG1219|consensus 98.7 1E-08 2.2E-13 104.3 4.6 97 68-165 3870-3978(4289)
5 KOG1225|consensus 98.7 3.1E-08 6.8E-13 90.7 5.7 72 84-161 235-307 (525)
6 KOG4289|consensus 98.2 1.4E-06 3.1E-11 86.6 4.5 79 78-157 1217-1308(2531)
7 KOG4289|consensus 97.8 1.1E-05 2.4E-10 80.5 2.5 47 117-164 1223-1273(2531)
8 KOG1226|consensus 97.5 0.00013 2.9E-09 69.1 5.6 16 82-97 477-492 (783)
9 smart00051 DSL delta serrate l 97.5 0.00017 3.8E-09 48.6 4.0 44 117-161 18-61 (63)
10 KOG1214|consensus 97.5 0.00011 2.4E-09 70.5 4.1 88 75-163 749-861 (1289)
11 KOG1226|consensus 97.5 0.0002 4.3E-09 68.0 5.7 77 84-161 526-617 (783)
12 PF07974 EGF_2: EGF-like domai 97.3 0.00019 4.1E-09 42.3 2.3 24 106-129 8-32 (32)
13 PF12661 hEGF: Human growth fa 97.1 0.00013 2.7E-09 34.9 -0.0 12 118-129 2-13 (13)
14 KOG4260|consensus 96.9 0.0017 3.6E-08 55.8 5.1 51 80-131 125-183 (350)
15 KOG1219|consensus 96.9 0.00087 1.9E-08 70.0 3.8 48 116-164 3886-3937(4289)
16 KOG1217|consensus 96.8 0.003 6.6E-08 54.4 6.5 87 76-162 103-204 (487)
17 KOG1217|consensus 96.5 0.0097 2.1E-07 51.3 7.0 87 73-161 283-389 (487)
18 KOG0994|consensus 95.7 0.015 3.3E-07 57.9 4.9 82 80-161 1034-1143(1758)
19 PF09026 CENP-B_dimeris: Centr 95.4 0.0043 9.3E-08 45.3 0.2 13 34-46 31-43 (101)
20 PF01414 DSL: Delta serrate li 95.4 0.0029 6.4E-08 42.6 -0.7 45 116-161 17-61 (63)
21 KOG1218|consensus 95.3 0.038 8.2E-07 46.2 5.5 70 84-159 134-208 (316)
22 KOG0943|consensus 95.1 0.013 2.7E-07 59.1 2.4 25 4-28 1730-1754(3015)
23 PF00008 EGF: EGF-like domain 95.1 0.0055 1.2E-07 35.6 -0.0 22 140-161 5-31 (32)
24 PHA02608 67 prohead core prote 95.1 0.011 2.4E-07 41.5 1.4 7 2-8 43-49 (80)
25 cd00055 EGF_Lam Laminin-type e 95.1 0.035 7.6E-07 35.2 3.6 22 110-131 12-34 (50)
26 smart00180 EGF_Lam Laminin-typ 94.7 0.049 1.1E-06 34.1 3.4 22 110-131 11-33 (46)
27 PF07974 EGF_2: EGF-like domai 94.6 0.038 8.3E-07 32.4 2.6 20 142-161 10-30 (32)
28 KOG1218|consensus 94.5 0.1 2.3E-06 43.5 6.0 72 88-161 96-173 (316)
29 KOG0994|consensus 94.3 0.056 1.2E-06 54.2 4.5 51 110-161 1030-1095(1758)
30 KOG1836|consensus 92.8 0.13 2.9E-06 53.5 4.5 76 85-161 697-809 (1705)
31 PF09026 CENP-B_dimeris: Centr 92.7 0.032 7E-07 40.8 0.0 16 31-46 31-46 (101)
32 PF00008 EGF: EGF-like domain 92.3 0.099 2.1E-06 30.3 1.7 24 71-94 7-31 (32)
33 PF12947 EGF_3: EGF domain; I 90.3 0.065 1.4E-06 32.1 -0.5 15 149-163 20-34 (36)
34 smart00181 EGF Epidermal growt 89.8 0.29 6.3E-06 27.8 2.1 11 119-129 23-34 (35)
35 KOG1214|consensus 89.8 0.58 1.3E-05 46.0 5.1 83 81-164 807-912 (1289)
36 PF00053 Laminin_EGF: Laminin 88.6 0.12 2.7E-06 32.3 -0.1 22 110-131 11-33 (49)
37 smart00179 EGF_CA Calcium-bind 87.5 0.58 1.3E-05 26.8 2.3 12 119-130 27-39 (39)
38 PF14812 PBP1_TM: Transmembran 85.5 0.25 5.5E-06 35.0 0.0 7 48-54 49-55 (81)
39 KOG3130|consensus 84.9 0.54 1.2E-05 42.6 1.8 6 4-9 265-270 (514)
40 KOG3607|consensus 84.7 0.6 1.3E-05 45.0 2.1 27 106-132 632-658 (716)
41 KOG3512|consensus 84.6 2.1 4.6E-05 39.6 5.4 54 108-161 405-475 (592)
42 PF03153 TFIIA: Transcription 83.9 0.32 6.8E-06 42.6 -0.1 10 4-13 273-282 (375)
43 cd00054 EGF_CA Calcium-binding 83.7 1 2.2E-05 25.2 2.1 11 119-129 27-37 (38)
44 cd00053 EGF Epidermal growth f 83.5 1.3 2.8E-05 24.3 2.4 11 119-129 24-35 (36)
45 PF12662 cEGF: Complement Clr- 83.3 0.61 1.3E-05 25.7 0.9 13 149-161 1-13 (24)
46 KOG2652|consensus 83.2 0.93 2E-05 40.0 2.5 15 3-17 252-266 (348)
47 PF07645 EGF_CA: Calcium-bindi 80.2 0.7 1.5E-05 28.0 0.5 17 143-159 15-34 (42)
48 PF03115 Astro_capsid: Astrovi 77.0 0.78 1.7E-05 44.6 0.0 10 51-60 713-722 (787)
49 KOG4260|consensus 76.3 3.1 6.8E-05 36.1 3.5 50 110-161 122-179 (350)
50 PF09064 Tme5_EGF_like: Thromb 76.0 2.5 5.4E-05 25.2 2.0 12 149-160 17-28 (34)
51 KOG1836|consensus 76.0 4.4 9.6E-05 42.7 5.0 51 111-161 953-1018(1705)
52 PHA02887 EGF-like protein; Pro 70.8 2.3 5E-05 32.3 1.2 17 115-131 107-123 (126)
53 KOG3512|consensus 65.9 21 0.00045 33.3 6.5 17 82-98 294-310 (592)
54 PF14670 FXa_inhibition: Coagu 61.0 1.5 3.2E-05 26.3 -1.2 22 95-125 7-28 (36)
55 PF05285 SDA1: SDA1; InterPro 56.6 6.9 0.00015 34.1 1.7 12 52-63 181-192 (324)
56 PF01683 EB: EB module; Inter 52.2 12 0.00027 23.3 2.0 11 150-160 37-47 (52)
57 PF04147 Nop14: Nop14-like fam 50.8 12 0.00025 36.8 2.4 9 82-90 419-427 (840)
58 PHA02887 EGF-like protein; Pro 50.6 9.9 0.00022 28.9 1.5 18 81-98 106-123 (126)
59 PHA03099 epidermal growth fact 46.6 11 0.00023 29.2 1.1 18 116-133 67-84 (139)
60 KOG3509|consensus 45.3 38 0.00082 34.0 5.0 73 84-161 719-792 (964)
61 PHA03099 epidermal growth fact 44.8 13 0.00028 28.8 1.3 26 74-99 56-83 (139)
62 PF04546 Sigma70_ner: Sigma-70 41.3 16 0.00034 29.6 1.5 6 50-55 79-84 (211)
63 PF00954 S_locus_glycop: S-loc 41.1 28 0.0006 24.9 2.6 10 151-160 99-108 (110)
64 cd01475 vWA_Matrilin VWA_Matri 40.4 31 0.00067 27.7 3.1 14 149-162 207-220 (224)
65 PF03153 TFIIA: Transcription 40.2 6.7 0.00015 34.3 -0.9 6 63-68 344-349 (375)
66 PF04281 Tom22: Mitochondrial 33.9 31 0.00068 26.7 2.0 8 49-56 51-58 (137)
67 KOG2023|consensus 32.5 26 0.00057 34.1 1.6 8 64-71 404-411 (885)
68 KOG3607|consensus 28.5 47 0.001 32.3 2.6 24 141-165 632-656 (716)
69 KOG0196|consensus 25.9 1.3E+02 0.0029 30.1 5.1 10 150-159 308-317 (996)
70 smart00017 OSTEO Osteopontin. 25.4 77 0.0017 27.1 3.1 13 72-84 134-146 (287)
71 KOG3516|consensus 24.3 52 0.0011 33.9 2.1 31 133-164 545-580 (1306)
72 KOG3509|consensus 23.9 1.1E+02 0.0025 30.8 4.4 60 70-130 414-479 (964)
73 PF12955 DUF3844: Domain of un 23.1 35 0.00075 25.2 0.5 9 123-131 53-61 (103)
74 PF12946 EGF_MSP1_1: MSP1 EGF 21.6 24 0.00052 21.4 -0.5 15 149-163 20-34 (37)
No 1
>smart00051 DSL delta serrate ligand.
Probab=99.74 E-value=1.9e-18 Score=116.89 Aligned_cols=63 Identities=54% Similarity=1.314 Sum_probs=60.8
Q ss_pred CcccccccceeeeceeEEEecCCCCCCCCCCCCCCCCCCCCCcccCCCCCCccCCCCCCCCCC
Q psy10880 67 WTEDEHKSAHSSMLYEYRVTCDPHYYGNGCATLCRPRDDSFGHYTCSHTGDRKCLPGWSGDYC 129 (166)
Q Consensus 67 W~~~~~~~~~~~l~~~~r~~C~~gyyG~~C~~~C~p~~~~~ghy~C~~~G~c~C~~GwtG~~C 129 (166)
|++.++.+..+.|.+++|+.|+++|||..|+++|+|+++.++||+|++.|.|+|+|||+|++|
T Consensus 1 w~~~~~~~~~~~l~~~~rv~C~~~~yG~~C~~~C~~~~d~~~~~~Cd~~G~~~C~~Gw~G~~C 63 (63)
T smart00051 1 WSTDLHIGGRTFLEYQIRVTCDENYYGEGCNKFCRPRDDFFGHYTCDENGNKGCLEGWMGPYC 63 (63)
T ss_pred CcccccccccceEEEEEEeeCCCCCcCCccCCEeCcCccccCCccCCcCCCEecCCCCcCCCC
Confidence 888899999999999999999999999999999999999999999999999999999999987
No 2
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=99.73 E-value=1.9e-19 Score=121.80 Aligned_cols=63 Identities=52% Similarity=1.313 Sum_probs=32.0
Q ss_pred CcccccccceeeeceeEEEecCCCCCCCCCCCCCCCCCCCCCcccCCCCCCccCCCCCCCCCC
Q psy10880 67 WTEDEHKSAHSSMLYEYRVTCDPHYYGNGCATLCRPRDDSFGHYTCSHTGDRKCLPGWSGDYC 129 (166)
Q Consensus 67 W~~~~~~~~~~~l~~~~r~~C~~gyyG~~C~~~C~p~~~~~ghy~C~~~G~c~C~~GwtG~~C 129 (166)
|++.++.+..+.|.+++|++|.++|||+.|+++|.|+.+.++||+|+..|+++|+|||+|++|
T Consensus 1 W~~~~~~~~~~~l~~~~rv~C~~nyyG~~C~~~C~~~~d~~ghy~Cd~~G~~~C~~Gw~G~~C 63 (63)
T PF01414_consen 1 WNTDTHIGNRTSLSYRIRVVCDENYYGPNCSKFCKPRDDSFGHYTCDSNGNKVCLPGWTGPNC 63 (63)
T ss_dssp ----------------------TTEETTTT-EE---EEETTEEEEE-SS--EEE-TTEESTTS
T ss_pred CccccccCceeEEEEEEEEECCCCCCCccccCCcCCCcCCcCCcccCCCCCCCCCCCCcCCCC
Confidence 889999999999999999999999999999999999988899999999999999999999998
No 3
>KOG1225|consensus
Probab=98.80 E-value=8.6e-09 Score=94.31 Aligned_cols=75 Identities=23% Similarity=0.430 Sum_probs=48.3
Q ss_pred eeEEEecCCCCCCCCCCC-CCCCCCCCCCcccCCCCCCccCCCCCCCCCCCccccCCCCCCCCCCccCCCCceeCCCCCc
Q psy10880 81 YEYRVTCDPHYYGNGCAT-LCRPRDDSFGHYTCSHTGDRKCLPGWSGDYCTKAVQKLSPTKALPNRTSRTLFCVLEPSMQ 159 (166)
Q Consensus 81 ~~~r~~C~~gyyG~~C~~-~C~p~~~~~ghy~C~~~G~c~C~~GwtG~~C~~~iC~~~c~~c~nG~C~~~~~C~C~~G~~ 159 (166)
...+|+|+++|+|..|+. .|... ..+|+.| .+|+|+|.+||+|..|++.-|...|. ++|.|+ .++|+|.+||+
T Consensus 263 ~~G~CIC~~Gf~G~dC~e~~Cp~~--cs~~g~~-~~g~CiC~~g~~G~dCs~~~cpadC~--g~G~Ci-~G~C~C~~Gy~ 336 (525)
T KOG1225|consen 263 VEGRCICPPGFTGDDCDELVCPVD--CSGGGVC-VDGECICNPGYSGKDCSIRRCPADCS--GHGKCI-DGECLCDEGYT 336 (525)
T ss_pred eCCeEeCCCCCcCCCCCcccCCcc--cCCCcee-cCCEeecCCCccccccccccCCccCC--CCCccc-CCceEeCCCCc
Confidence 445677777777777777 46432 3345555 45677777777777777665443222 236676 77788888877
Q ss_pred cC
Q psy10880 160 LY 161 (166)
Q Consensus 160 G~ 161 (166)
|.
T Consensus 337 G~ 338 (525)
T KOG1225|consen 337 GE 338 (525)
T ss_pred CC
Confidence 76
No 4
>KOG1219|consensus
Probab=98.72 E-value=1e-08 Score=104.31 Aligned_cols=97 Identities=21% Similarity=0.362 Sum_probs=78.2
Q ss_pred cccccccceeeec-eeEEEecCCCCCCCCCCC---CCCCCCCCCCcccCCC---CCCccCCCCCCCCCCCcc-ccCCCCC
Q psy10880 68 TEDEHKSAHSSML-YEYRVTCDPHYYGNGCAT---LCRPRDDSFGHYTCSH---TGDRKCLPGWSGDYCTKA-VQKLSPT 139 (166)
Q Consensus 68 ~~~~~~~~~~~l~-~~~r~~C~~gyyG~~C~~---~C~p~~~~~ghy~C~~---~G~c~C~~GwtG~~C~~~-iC~~~c~ 139 (166)
++.+|+|.++... ..|.|+|++-|.|.+|+. .|.+++. ...++|.+ ...|.|+.||||..|+.. +-.|+..
T Consensus 3870 npCqhgG~C~~~~~ggy~CkCpsqysG~~CEi~~epC~snPC-~~GgtCip~~n~f~CnC~~gyTG~~Ce~~Gi~eCs~n 3948 (4289)
T KOG1219|consen 3870 NPCQHGGTCISQPKGGYKCKCPSQYSGNHCEIDLEPCASNPC-LTGGTCIPFYNGFLCNCPNGYTGKRCEARGISECSKN 3948 (4289)
T ss_pred CcccCCCEecCCCCCceEEeCcccccCcccccccccccCCCC-CCCCEEEecCCCeeEeCCCCccCceeecccccccccc
Confidence 4677888887775 689999999999999997 6887653 34678986 358999999999999987 6556666
Q ss_pred CCCC-CccCC---CCceeCCCCCccCCCCC
Q psy10880 140 KALP-NRTSR---TLFCVLEPSMQLYGNCA 165 (166)
Q Consensus 140 ~c~n-G~C~~---~~~C~C~~G~~G~~~~~ 165 (166)
.|.+ |.|.+ .|+|.|.+|+.|.+.|+
T Consensus 3949 ~C~~gg~C~n~~gsf~CncT~g~~gr~c~~ 3978 (4289)
T KOG1219|consen 3949 VCGTGGQCINIPGSFHCNCTPGILGRTCCA 3978 (4289)
T ss_pred cccCCceeeccCCceEeccChhHhcccCcc
Confidence 6665 57863 47999999999998775
No 5
>KOG1225|consensus
Probab=98.66 E-value=3.1e-08 Score=90.67 Aligned_cols=72 Identities=25% Similarity=0.490 Sum_probs=60.0
Q ss_pred EEecCCCCCCCCCCC-CCCCCCCCCCcccCCCCCCccCCCCCCCCCCCccccCCCCCCCCCCccCCCCceeCCCCCccC
Q psy10880 84 RVTCDPHYYGNGCAT-LCRPRDDSFGHYTCSHTGDRKCLPGWSGDYCTKAVQKLSPTKALPNRTSRTLFCVLEPSMQLY 161 (166)
Q Consensus 84 r~~C~~gyyG~~C~~-~C~p~~~~~ghy~C~~~G~c~C~~GwtG~~C~~~iC~~~c~~c~nG~C~~~~~C~C~~G~~G~ 161 (166)
+|+|..+|.|+.|++ .|.++. .+++.| ..|+|+|.+||+|.+|++..|... |+++.+...++|+|++||+|+
T Consensus 235 ic~c~~~~~g~~c~~~~C~~~c--~~~g~c-~~G~CIC~~Gf~G~dC~e~~Cp~~---cs~~g~~~~g~CiC~~g~~G~ 307 (525)
T KOG1225|consen 235 ICECPEGYFGPLCSTIYCPGGC--TGRGQC-VEGRCICPPGFTGDDCDELVCPVD---CSGGGVCVDGECICNPGYSGK 307 (525)
T ss_pred eeecCCceeCCccccccCCCCC--cccceE-eCCeEeCCCCCcCCCCCcccCCcc---cCCCceecCCEeecCCCcccc
Confidence 799999999999998 587643 345777 789999999999999999888754 455566667799999999997
No 6
>KOG4289|consensus
Probab=98.19 E-value=1.4e-06 Score=86.57 Aligned_cols=79 Identities=20% Similarity=0.417 Sum_probs=55.7
Q ss_pred eeceeEEEecCCCCCCCCCCC---CCCCCCCCCCcccCCC---CCCccCCCCCCCCCCCccccCC--CCCCCCC-CccCC
Q psy10880 78 SMLYEYRVTCDPHYYGNGCAT---LCRPRDDSFGHYTCSH---TGDRKCLPGWSGDYCTKAVQKL--SPTKALP-NRTSR 148 (166)
Q Consensus 78 ~l~~~~r~~C~~gyyG~~C~~---~C~p~~~~~ghy~C~~---~G~c~C~~GwtG~~C~~~iC~~--~c~~c~n-G~C~~ 148 (166)
.-...+||+|++||.|..|++ .|..+. +..|++|-. .++|.|.|||+|.+|+...=.- .+--|.| |.|..
T Consensus 1217 ~pvnglrCrCPpGFTgd~CeTeiDlCYs~p-C~nng~C~srEggYtCeCrpg~tGehCEvs~~agrCvpGvC~nggtC~~ 1295 (2531)
T KOG4289|consen 1217 HPVNGLRCRCPPGFTGDYCETEIDLCYSGP-CGNNGRCRSREGGYTCECRPGFTGEHCEVSARAGRCVPGVCKNGGTCVN 1295 (2531)
T ss_pred cccCceeEeCCCCCCcccccchhHhhhcCC-CCCCCceEEecCceeEEecCCccccceeeecccCccccceecCCCEEee
Confidence 345678999999999999987 575443 233667753 3578999999999999765222 2223444 56753
Q ss_pred ----CCceeCCCC
Q psy10880 149 ----TLFCVLEPS 157 (166)
Q Consensus 149 ----~~~C~C~~G 157 (166)
-+.|+||.|
T Consensus 1296 ~~nggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1296 LLNGGFCCHCPYG 1308 (2531)
T ss_pred cCCCceeccCCCc
Confidence 378999988
No 7
>KOG4289|consensus
Probab=97.79 E-value=1.1e-05 Score=80.55 Aligned_cols=47 Identities=23% Similarity=0.536 Sum_probs=38.0
Q ss_pred CccCCCCCCCCCCCccccCCCCCCCC-CCccC---CCCceeCCCCCccCCCC
Q psy10880 117 DRKCLPGWSGDYCTKAVQKLSPTKAL-PNRTS---RTLFCVLEPSMQLYGNC 164 (166)
Q Consensus 117 ~c~C~~GwtG~~C~~~iC~~~c~~c~-nG~C~---~~~~C~C~~G~~G~~~~ 164 (166)
+|.|+|||||.+|++.|-.+-..+|. ||+|. ..|+|+|.+||+|. -|
T Consensus 1223 rCrCPpGFTgd~CeTeiDlCYs~pC~nng~C~srEggYtCeCrpg~tGe-hC 1273 (2531)
T KOG4289|consen 1223 RCRCPPGFTGDYCETEIDLCYSGPCGNNGRCRSREGGYTCECRPGFTGE-HC 1273 (2531)
T ss_pred eEeCCCCCCcccccchhHhhhcCCCCCCCceEEecCceeEEecCCcccc-ce
Confidence 66999999999999998555555555 47886 46999999999997 44
No 8
>KOG1226|consensus
Probab=97.54 E-value=0.00013 Score=69.13 Aligned_cols=16 Identities=25% Similarity=0.636 Sum_probs=13.5
Q ss_pred eEEEecCCCCCCCCCC
Q psy10880 82 EYRVTCDPHYYGNGCA 97 (166)
Q Consensus 82 ~~r~~C~~gyyG~~C~ 97 (166)
-.+|.|.++|+|..|+
T Consensus 477 CG~C~C~~G~~G~~CE 492 (783)
T KOG1226|consen 477 CGQCRCDEGWLGKKCE 492 (783)
T ss_pred ecceecCCCCCCCccc
Confidence 3468999999999995
No 9
>smart00051 DSL delta serrate ligand.
Probab=97.47 E-value=0.00017 Score=48.61 Aligned_cols=44 Identities=9% Similarity=-0.105 Sum_probs=33.9
Q ss_pred CccCCCCCCCCCCCccccCCCCCCCCCCccCCCCceeCCCCCccC
Q psy10880 117 DRKCLPGWSGDYCTKAVQKLSPTKALPNRTSRTLFCVLEPSMQLY 161 (166)
Q Consensus 117 ~c~C~~GwtG~~C~~~iC~~~c~~c~nG~C~~~~~C~C~~G~~G~ 161 (166)
+-+|.++|.|..|.+ .|.+......+..|...+.|.|++||+|.
T Consensus 18 rv~C~~~~yG~~C~~-~C~~~~d~~~~~~Cd~~G~~~C~~Gw~G~ 61 (63)
T smart00051 18 RVTCDENYYGEGCNK-FCRPRDDFFGHYTCDENGNKGCLEGWMGP 61 (63)
T ss_pred EeeCCCCCcCCccCC-EeCcCccccCCccCCcCCCEecCCCCcCC
Confidence 448899999999964 55654333344578888999999999996
No 10
>KOG1214|consensus
Probab=97.46 E-value=0.00011 Score=70.53 Aligned_cols=88 Identities=24% Similarity=0.362 Sum_probs=55.3
Q ss_pred ceeeeceeEEEecCCCCC----CCCCCCCCCCC-----------CCCCCcccCCC----CCCccCCCCCCCC--C-CCcc
Q psy10880 75 AHSSMLYEYRVTCDPHYY----GNGCATLCRPR-----------DDSFGHYTCSH----TGDRKCLPGWSGD--Y-CTKA 132 (166)
Q Consensus 75 ~~~~l~~~~r~~C~~gyy----G~~C~~~C~p~-----------~~~~ghy~C~~----~G~c~C~~GwtG~--~-C~~~ 132 (166)
.++.+..+|||.|..+|- |..|..+-.|+ ....||.+|.. ...|.|+|||.|. . |...
T Consensus 749 ~Cin~pg~~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvD 828 (1289)
T KOG1214|consen 749 VCINLPGSYRCECRSGYEFADDRHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVD 828 (1289)
T ss_pred eeecCCCceeEEEeecceeccCCcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccCCcccccccc
Confidence 456777788888888763 34664322211 12345666653 3588999999985 3 4444
Q ss_pred ccCCCCCCCCCCccC---CCCceeCCCCCccCCC
Q psy10880 133 VQKLSPTKALPNRTS---RTLFCVLEPSMQLYGN 163 (166)
Q Consensus 133 iC~~~c~~c~nG~C~---~~~~C~C~~G~~G~~~ 163 (166)
-|.++.+. ++..|. ..+.|+|.+||+|.|-
T Consensus 829 eC~psrCh-p~A~CyntpgsfsC~C~pGy~GDGf 861 (1289)
T KOG1214|consen 829 ECSPSRCH-PAATCYNTPGSFSCRCQPGYYGDGF 861 (1289)
T ss_pred ccCccccC-CCceEecCCCcceeecccCccCCCc
Confidence 44433222 245675 3589999999999983
No 11
>KOG1226|consensus
Probab=97.46 E-value=0.0002 Score=67.99 Aligned_cols=77 Identities=22% Similarity=0.377 Sum_probs=47.5
Q ss_pred EEecCCC----CCCCCCCC---CCCC--CCCCCCcccCCCCCCccCCCCCCCCCCCccccCCCCCCCCCCccCCC-----
Q psy10880 84 RVTCDPH----YYGNGCAT---LCRP--RDDSFGHYTCSHTGDRKCLPGWSGDYCTKAVQKLSPTKALPNRTSRT----- 149 (166)
Q Consensus 84 r~~C~~g----yyG~~C~~---~C~p--~~~~~ghy~C~~~G~c~C~~GwtG~~C~~~iC~~~c~~c~nG~C~~~----- 149 (166)
+|+|-+. +||+.|+- .|.- ..-+.||++| .=|+|+|.+||+|.+|.-+.-...|..-+++.|...
T Consensus 526 qC~C~~~~~~~i~G~fCECDnfsC~r~~g~lC~g~G~C-~CG~CvC~~GwtG~~C~C~~std~C~~~~G~iCSGrG~C~C 604 (783)
T KOG1226|consen 526 QCVCHKPDNGKIYGKFCECDNFSCERHKGVLCGGHGRC-ECGRCVCNPGWTGSACNCPLSTDTCESSDGQICSGRGTCEC 604 (783)
T ss_pred ceEecCCCCCceeeeeeeccCcccccccCcccCCCCeE-eCCcEEcCCCCccCCCCCCCCCccccCCCCceeCCCceeeC
Confidence 4555444 45888764 2321 1124578887 679999999999999986643333333333355532
Q ss_pred CceeCCCC-CccC
Q psy10880 150 LFCVLEPS-MQLY 161 (166)
Q Consensus 150 ~~C~C~~G-~~G~ 161 (166)
++|+|... |+|.
T Consensus 605 g~C~C~~~~~sG~ 617 (783)
T KOG1226|consen 605 GRCKCTDPPYSGE 617 (783)
T ss_pred CceEcCCCCcCcc
Confidence 67777766 7775
No 12
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=97.30 E-value=0.00019 Score=42.26 Aligned_cols=24 Identities=38% Similarity=0.776 Sum_probs=19.0
Q ss_pred CCCcccCCCC-CCccCCCCCCCCCC
Q psy10880 106 SFGHYTCSHT-GDRKCLPGWSGDYC 129 (166)
Q Consensus 106 ~~ghy~C~~~-G~c~C~~GwtG~~C 129 (166)
..+|++|... ++|+|.+||+|++|
T Consensus 8 C~~~G~C~~~~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 8 CSGHGTCVSPCGRCVCDSGYTGPDC 32 (32)
T ss_pred cCCCCEEeCCCCEEECCCCCcCCCC
Confidence 3468888765 88889999988876
No 13
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=97.08 E-value=0.00013 Score=34.86 Aligned_cols=12 Identities=50% Similarity=1.508 Sum_probs=7.9
Q ss_pred ccCCCCCCCCCC
Q psy10880 118 RKCLPGWSGDYC 129 (166)
Q Consensus 118 c~C~~GwtG~~C 129 (166)
|+|++||+|.+|
T Consensus 2 C~C~~G~~G~~C 13 (13)
T PF12661_consen 2 CQCPPGWTGPNC 13 (13)
T ss_dssp EEE-TTEETTTT
T ss_pred ccCcCCCcCCCC
Confidence 577777777766
No 14
>KOG4260|consensus
Probab=96.89 E-value=0.0017 Score=55.79 Aligned_cols=51 Identities=29% Similarity=0.741 Sum_probs=39.7
Q ss_pred ceeEEEecCCCCCCCCCCCCCCC--CCCCCCcccCCC------CCCccCCCCCCCCCCCc
Q psy10880 80 LYEYRVTCDPHYYGNGCATLCRP--RDDSFGHYTCSH------TGDRKCLPGWSGDYCTK 131 (166)
Q Consensus 80 ~~~~r~~C~~gyyG~~C~~~C~p--~~~~~ghy~C~~------~G~c~C~~GwtG~~C~~ 131 (166)
..+.++-|+.|.||+.|.. |.- ...++|++.|.- +|.|.|.+||+|+.|..
T Consensus 125 vdqLkvCCp~gtyGpdCl~-Cpggser~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~ 183 (350)
T KOG4260|consen 125 VDQLKVCCPDGTYGPDCLQ-CPGGSERPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRY 183 (350)
T ss_pred hhhheeccCCCCcCCcccc-CCCCCcCCcCCCCcccCCCCCCCCCcccccCCCCCccccc
Confidence 4678899999999999964 521 113678888872 58999999999998873
No 15
>KOG1219|consensus
Probab=96.88 E-value=0.00087 Score=70.03 Aligned_cols=48 Identities=17% Similarity=0.337 Sum_probs=38.5
Q ss_pred CCccCCCCCCCCCCCccccCCCCCCCCC-CccC---CCCceeCCCCCccCCCC
Q psy10880 116 GDRKCLPGWSGDYCTKAVQKLSPTKALP-NRTS---RTLFCVLEPSMQLYGNC 164 (166)
Q Consensus 116 G~c~C~~GwtG~~C~~~iC~~~c~~c~n-G~C~---~~~~C~C~~G~~G~~~~ 164 (166)
+.|.|++-|+|.+|++.+=.|.+.||.. |.|. +.+.|.|+.||+|. -|
T Consensus 3886 y~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC~~gyTG~-~C 3937 (4289)
T KOG1219|consen 3886 YKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNCPNGYTGK-RC 3937 (4289)
T ss_pred eEEeCcccccCcccccccccccCCCCCCCCEEEecCCCeeEeCCCCccCc-ee
Confidence 3569999999999999885566667765 5776 45899999999997 44
No 16
>KOG1217|consensus
Probab=96.84 E-value=0.003 Score=54.43 Aligned_cols=87 Identities=18% Similarity=0.343 Sum_probs=63.8
Q ss_pred eeeeceeEEEecCCCCCCCCCCC--CCCCCCC-CCCcccCCC------CCCccCCCCCCCCCCCcc--ccCCCCCCCCC-
Q psy10880 76 HSSMLYEYRVTCDPHYYGNGCAT--LCRPRDD-SFGHYTCSH------TGDRKCLPGWSGDYCTKA--VQKLSPTKALP- 143 (166)
Q Consensus 76 ~~~l~~~~r~~C~~gyyG~~C~~--~C~p~~~-~~ghy~C~~------~G~c~C~~GwtG~~C~~~--iC~~~c~~c~n- 143 (166)
.......++|.|.++|.|..|.. .|..... ...+..|.. ...|.|..||.|..|... .|......|.+
T Consensus 103 ~~~~~~~~~c~c~~g~~~~~~~~~~~C~~~~~~~~~~~~c~~~~~~~~~~~c~C~~g~~~~~~~~~~~~C~~~~~~c~~~ 182 (487)
T KOG1217|consen 103 CVDCVGSYECTCPPGYQGTPCEGECECVTGPGVCCIDGSCSNGPGSVGPFRCSCTEGYEGEPCETDLDECIQYSSPCQNG 182 (487)
T ss_pred ccCCCCCceeeCCCccccCcCCcceeecCCCCCeeCchhhcCCCCCCCceeeeeCCCcccccccccccccccCCCCcCCC
Confidence 34466788999999999999998 5766542 223455553 467899999999999875 46644545665
Q ss_pred CccCC---CCceeCCCCCccCC
Q psy10880 144 NRTSR---TLFCVLEPSMQLYG 162 (166)
Q Consensus 144 G~C~~---~~~C~C~~G~~G~~ 162 (166)
+.|.. .|.|.|++||+|..
T Consensus 183 ~~C~~~~~~~~C~c~~~~~~~~ 204 (487)
T KOG1217|consen 183 GTCVNTGGSYLCSCPPGYTGST 204 (487)
T ss_pred cccccCCCCeeEeCCCCccCCc
Confidence 46753 37899999999984
No 17
>KOG1217|consensus
Probab=96.45 E-value=0.0097 Score=51.30 Aligned_cols=87 Identities=24% Similarity=0.421 Sum_probs=58.8
Q ss_pred ccceeeeceeEEEecCCCCCCCCCCC------CCCCC---CCCCCcccCCC-----CCCccCCCCCCCCCCCcc--ccCC
Q psy10880 73 KSAHSSMLYEYRVTCDPHYYGNGCAT------LCRPR---DDSFGHYTCSH-----TGDRKCLPGWSGDYCTKA--VQKL 136 (166)
Q Consensus 73 ~~~~~~l~~~~r~~C~~gyyG~~C~~------~C~p~---~~~~ghy~C~~-----~G~c~C~~GwtG~~C~~~--iC~~ 136 (166)
.+..+.+...|+|.|.++|.|..| . .|.++ ......+.|.. ...|.|.+||+|..|+.. .|..
T Consensus 283 ~~~C~~~~~~~~C~C~~g~~g~~~-~~~~~~~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c~~~~~g~~C~~~~~~C~~ 361 (487)
T KOG1217|consen 283 GGTCVNVPGSYRCTCPPGFTGRLC-TECVDVDECSPRNAGGPCANGGTCNTLGSFGGFRCACGPGFTGRRCEDSNDECAS 361 (487)
T ss_pred CCeeecCCCcceeeCCCCCCCCCC-ccccccccccccccCCcCCCCcccccCCCCCCCCcCCCCCCCCCccccCCccccC
Confidence 455666666699999999999998 2 34321 11222345632 235899999999999877 4665
Q ss_pred CCCCCCCCccCC----CCceeCCCCCccC
Q psy10880 137 SPTKALPNRTSR----TLFCVLEPSMQLY 161 (166)
Q Consensus 137 ~c~~c~nG~C~~----~~~C~C~~G~~G~ 161 (166)
..+ ..++.|.. .+.|.|+.+|.+.
T Consensus 362 ~~~-~~~~~c~~~~~~~~~c~~~~~~~~~ 389 (487)
T KOG1217|consen 362 SPC-CPGGTCVNETPGSYRCACPAGFAGK 389 (487)
T ss_pred Ccc-ccCCEeccCCCCCeEecCCCccccC
Confidence 442 23455653 4899999999874
No 18
>KOG0994|consensus
Probab=95.67 E-value=0.015 Score=57.94 Aligned_cols=82 Identities=28% Similarity=0.562 Sum_probs=48.7
Q ss_pred ceeEEEecCCCCCCCCCCC------------CCCCC-CCCCCcccCCC-CCCccCCCCCCCCCCCccc--------cCCC
Q psy10880 80 LYEYRVTCDPHYYGNGCAT------------LCRPR-DDSFGHYTCSH-TGDRKCLPGWSGDYCTKAV--------QKLS 137 (166)
Q Consensus 80 ~~~~r~~C~~gyyG~~C~~------------~C~p~-~~~~ghy~C~~-~G~c~C~~GwtG~~C~~~i--------C~~~ 137 (166)
+.+..|.|.++-.|..|.. -|.|- .+..+.-+|+. +|.|+|.|||-|-.|++-. =.++
T Consensus 1034 r~tGQCpClpNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~~~~pqCN~ftGQCqCkpGfGGR~C~qCqel~WGdP~~~C~ 1113 (1758)
T KOG0994|consen 1034 RFTGQCPCLPNVQGVRCDQCAENHWNLASGEGCEPCNCDPIGGPQCNEFTGQCQCKPGFGGRTCSQCQELYWGDPNEKCR 1113 (1758)
T ss_pred cccCcCCCCcccccccccccccchhccccCCCCCccCCCccCCccccccccceeccCCCCCcchhHHHHhhcCCCCCCce
Confidence 3445566666666666642 02211 12234458887 7999999999999987532 1122
Q ss_pred CCCCCC-C----ccCC-CCceeCCCCCccC
Q psy10880 138 PTKALP-N----RTSR-TLFCVLEPSMQLY 161 (166)
Q Consensus 138 c~~c~n-G----~C~~-~~~C~C~~G~~G~ 161 (166)
.+.|.. | .|.. .++|.|++|-.|.
T Consensus 1114 aCdCd~rG~~tpQCdr~tG~C~C~~Gv~G~ 1143 (1758)
T KOG0994|consen 1114 ACDCDPRGIETPQCDRATGRCVCRPGVGGP 1143 (1758)
T ss_pred ecCCCCCCCCCCCccccCCceeecCCCCCc
Confidence 233322 2 3543 5899999987775
No 19
>PF09026 CENP-B_dimeris: Centromere protein B dimerisation domain; InterPro: IPR015115 Centromere protein B (CENP-B) interacts with centromeric heterochromatin in chromosomes and binds to a specific subset of alphoid satellite DNA, called the CENP-B box. CENP-B may organise arrays of centromere satellite DNA into a higher order structure, which then directs centromere formation and kinetochore assembly in mammalian chromosomes. The CENP-B dimerisation domain is composed of two alpha-helices, which are folded into an antiparallel configuration. Dimerisation of CENP-B is mediated by this domain, in which monomers dimerise to form a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation.; GO: 0003677 DNA binding, 0003682 chromatin binding, 0006355 regulation of transcription, DNA-dependent, 0000775 chromosome, centromeric region, 0005634 nucleus; PDB: 1UFI_A.
Probab=95.44 E-value=0.0043 Score=45.35 Aligned_cols=13 Identities=62% Similarity=1.122 Sum_probs=0.0
Q ss_pred CCCCCCCCccccc
Q psy10880 34 DDDDDDDDDEVII 46 (166)
Q Consensus 34 ~~~~~~~~~~~~~ 46 (166)
++|||+|++++++
T Consensus 31 Dddddee~de~p~ 43 (101)
T PF09026_consen 31 DDDDDEEEDEVPV 43 (101)
T ss_dssp -------------
T ss_pred ccccccccccccc
Confidence 3344444446654
No 20
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=95.40 E-value=0.0029 Score=42.64 Aligned_cols=45 Identities=16% Similarity=-0.006 Sum_probs=23.0
Q ss_pred CCccCCCCCCCCCCCccccCCCCCCCCCCccCCCCceeCCCCCccC
Q psy10880 116 GDRKCLPGWSGDYCTKAVQKLSPTKALPNRTSRTLFCVLEPSMQLY 161 (166)
Q Consensus 116 G~c~C~~GwtG~~C~~~iC~~~c~~c~nG~C~~~~~C~C~~G~~G~ 161 (166)
.+.+|.+.|.|+.|.+ .|.+.-....+-.|...++=.|.+||+|.
T Consensus 17 ~rv~C~~nyyG~~C~~-~C~~~~d~~ghy~Cd~~G~~~C~~Gw~G~ 61 (63)
T PF01414_consen 17 IRVVCDENYYGPNCSK-FCKPRDDSFGHYTCDSNGNKVCLPGWTGP 61 (63)
T ss_dssp ------TTEETTTT-E-E---EEETTEEEEE-SS--EEE-TTEEST
T ss_pred EEEECCCCCCCccccC-CcCCCcCCcCCcccCCCCCCCCCCCCcCC
Confidence 4558999999999976 44543222222368888999999999997
No 21
>KOG1218|consensus
Probab=95.32 E-value=0.038 Score=46.17 Aligned_cols=70 Identities=21% Similarity=0.395 Sum_probs=43.2
Q ss_pred EEecCC-CCCCCCCCCCCCCCCCCCCcccCCCCCCccCCCCCCCCCCCcccc--CCCCCCCCCC-ccCC-CCceeCCCCC
Q psy10880 84 RVTCDP-HYYGNGCATLCRPRDDSFGHYTCSHTGDRKCLPGWSGDYCTKAVQ--KLSPTKALPN-RTSR-TLFCVLEPSM 158 (166)
Q Consensus 84 r~~C~~-gyyG~~C~~~C~p~~~~~ghy~C~~~G~c~C~~GwtG~~C~~~iC--~~~c~~c~nG-~C~~-~~~C~C~~G~ 158 (166)
...|.. +|+|..|...|.+. +......+.|.|++||+|.+|..... .. ...+.++ .|.. ...+.+.++|
T Consensus 134 ~~~C~~~~~~g~~C~~~c~~~-----~~~~~~~~~c~c~~g~~g~~~~~~~~~c~~-~~~~~~g~~C~~~~~~~~~~~~~ 207 (316)
T KOG1218|consen 134 GEQCGEENLVGLKCQRDCQCT-----GGCDCKNGICTCQPGFVGVFCVESCSGCSP-LTACENGAKCNRSTGSCLCYPGP 207 (316)
T ss_pred cccccccCCCCCCccCCCCCc-----cccCCCCCceeccCCcccccccccCCCcCC-CcccCCCCeeeccccccccCCCC
Confidence 344554 77788887777321 23333678999999999999987653 32 2333443 6653 3555555555
Q ss_pred c
Q psy10880 159 Q 159 (166)
Q Consensus 159 ~ 159 (166)
.
T Consensus 208 ~ 208 (316)
T KOG1218|consen 208 S 208 (316)
T ss_pred c
Confidence 4
No 22
>KOG0943|consensus
Probab=95.14 E-value=0.013 Score=59.06 Aligned_cols=25 Identities=56% Similarity=0.812 Sum_probs=15.8
Q ss_pred ecCcccccCCCCCCCCCCCCCCCCC
Q psy10880 4 VDGEWWMADDDNNDDDDDDDDDDDD 28 (166)
Q Consensus 4 ~~~~~~~~~~~~~~~~~~~~~~~~~ 28 (166)
++|+.+..|++|++++|+++..+++
T Consensus 1730 f~GEed~~Dddnddddddd~EaEdd 1754 (3015)
T KOG0943|consen 1730 FAGEEDHHDDDNDDDDDDDAEAEDD 1754 (3015)
T ss_pred ccCcccccccccccccccchhhccc
Confidence 5788887777766666555443333
No 23
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=95.14 E-value=0.0055 Score=35.65 Aligned_cols=22 Identities=9% Similarity=0.001 Sum_probs=16.6
Q ss_pred CCCC-CccC----CCCceeCCCCCccC
Q psy10880 140 KALP-NRTS----RTLFCVLEPSMQLY 161 (166)
Q Consensus 140 ~c~n-G~C~----~~~~C~C~~G~~G~ 161 (166)
+|+| |.|. ..|+|+|++||+|.
T Consensus 5 ~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 5 PCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp SSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred cCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 5555 5675 34899999999986
No 24
>PHA02608 67 prohead core protein; Provisional
Probab=95.10 E-value=0.011 Score=41.49 Aligned_cols=7 Identities=43% Similarity=0.838 Sum_probs=4.8
Q ss_pred ceecCcc
Q psy10880 2 MMVDGEW 8 (166)
Q Consensus 2 ~~~~~~~ 8 (166)
+||.|+-
T Consensus 43 v~iEGEe 49 (80)
T PHA02608 43 VMIEGEE 49 (80)
T ss_pred HhhcCCC
Confidence 4788873
No 25
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=95.09 E-value=0.035 Score=35.21 Aligned_cols=22 Identities=32% Similarity=0.758 Sum_probs=18.6
Q ss_pred ccCCC-CCCccCCCCCCCCCCCc
Q psy10880 110 YTCSH-TGDRKCLPGWSGDYCTK 131 (166)
Q Consensus 110 y~C~~-~G~c~C~~GwtG~~C~~ 131 (166)
..|++ +|+|.|.+||+|..|++
T Consensus 12 ~~C~~~~G~C~C~~~~~G~~C~~ 34 (50)
T cd00055 12 GQCDPGTGQCECKPNTTGRRCDR 34 (50)
T ss_pred ccccCCCCEEeCCCcCCCCCCCC
Confidence 45876 78999999999999973
No 26
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=94.67 E-value=0.049 Score=34.08 Aligned_cols=22 Identities=36% Similarity=0.809 Sum_probs=18.4
Q ss_pred ccCCC-CCCccCCCCCCCCCCCc
Q psy10880 110 YTCSH-TGDRKCLPGWSGDYCTK 131 (166)
Q Consensus 110 y~C~~-~G~c~C~~GwtG~~C~~ 131 (166)
..|++ .|+|.|.+||+|..|++
T Consensus 11 ~~C~~~~G~C~C~~~~~G~~C~~ 33 (46)
T smart00180 11 GTCDPDTGQCECKPNVTGRRCDR 33 (46)
T ss_pred CcccCCCCEEECCCCCCCCCCCc
Confidence 46876 68999999999999963
No 27
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=94.63 E-value=0.038 Score=32.36 Aligned_cols=20 Identities=10% Similarity=-0.086 Sum_probs=17.2
Q ss_pred CCCccCCC-CceeCCCCCccC
Q psy10880 142 LPNRTSRT-LFCVLEPSMQLY 161 (166)
Q Consensus 142 ~nG~C~~~-~~C~C~~G~~G~ 161 (166)
++|.|+.+ ++|+|.+||+|.
T Consensus 10 ~~G~C~~~~g~C~C~~g~~G~ 30 (32)
T PF07974_consen 10 GHGTCVSPCGRCVCDSGYTGP 30 (32)
T ss_pred CCCEEeCCCCEEECCCCCcCC
Confidence 35889876 999999999996
No 28
>KOG1218|consensus
Probab=94.46 E-value=0.1 Score=43.49 Aligned_cols=72 Identities=18% Similarity=0.356 Sum_probs=48.8
Q ss_pred CCCCCCCCCCCCCCCCCCCCCcccCCCCC-CccCCCCCCCCCCCcc-----ccCCCCCCCCCCccCCCCceeCCCCCccC
Q psy10880 88 DPHYYGNGCATLCRPRDDSFGHYTCSHTG-DRKCLPGWSGDYCTKA-----VQKLSPTKALPNRTSRTLFCVLEPSMQLY 161 (166)
Q Consensus 88 ~~gyyG~~C~~~C~p~~~~~ghy~C~~~G-~c~C~~GwtG~~C~~~-----iC~~~c~~c~nG~C~~~~~C~C~~G~~G~ 161 (166)
..+|.|..|...|......+. .+|.... .|.|..+|.+..|... .|...+ .+..+.......|.|++||.|.
T Consensus 96 ~~~~~g~~C~~~~~~~~~c~~-~~C~~~~~~c~~~~~~~~~~C~~~~~~g~~C~~~c-~~~~~~~~~~~~c~c~~g~~g~ 173 (316)
T KOG1218|consen 96 LNGYEGPQCESPCPCGDGCAE-KTCANPRRECRCGGGYIGEQCGEENLVGLKCQRDC-QCTGGCDCKNGICTCQPGFVGV 173 (316)
T ss_pred CCCCCcccccCCCCcCCcccc-cccCCCccceecCCcCccccccccCCCCCCccCCC-CCccccCCCCCceeccCCcccc
Confidence 688889999887765331111 5776665 5899999999988771 144333 2223344467999999999987
No 29
>KOG0994|consensus
Probab=94.33 E-value=0.056 Score=54.15 Aligned_cols=51 Identities=22% Similarity=0.399 Sum_probs=34.6
Q ss_pred ccCCC-CCCccCCCCCCCCCCCccc-----------cCC-CCCCCCCC-ccCC-CCceeCCCCCccC
Q psy10880 110 YTCSH-TGDRKCLPGWSGDYCTKAV-----------QKL-SPTKALPN-RTSR-TLFCVLEPSMQLY 161 (166)
Q Consensus 110 y~C~~-~G~c~C~~GwtG~~C~~~i-----------C~~-~c~~c~nG-~C~~-~~~C~C~~G~~G~ 161 (166)
..|+. +|.|.|+|.-.|..|+.-. |.+ .|.+ .++ .|+. .++|+|.|||-|.
T Consensus 1030 ~~CDr~tGQCpClpNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~-~~~pqCN~ftGQCqCkpGfGGR 1095 (1758)
T KOG0994|consen 1030 CHCDRFTGQCPCLPNVQGVRCDQCAENHWNLASGEGCEPCNCDP-IGGPQCNEFTGQCQCKPGFGGR 1095 (1758)
T ss_pred cccccccCcCCCCcccccccccccccchhccccCCCCCccCCCc-cCCccccccccceeccCCCCCc
Confidence 34665 7999999999999998632 221 1111 123 3543 4899999999887
No 30
>KOG1836|consensus
Probab=92.81 E-value=0.13 Score=53.48 Aligned_cols=76 Identities=28% Similarity=0.514 Sum_probs=51.6
Q ss_pred EecCCCCCCCCCCCCCCCC----------------CCCCCc-ccCCC-CCCccCCCCCCCCCCCccc------------c
Q psy10880 85 VTCDPHYYGNGCATLCRPR----------------DDSFGH-YTCSH-TGDRKCLPGWSGDYCTKAV------------Q 134 (166)
Q Consensus 85 ~~C~~gyyG~~C~~~C~p~----------------~~~~gh-y~C~~-~G~c~C~~GwtG~~C~~~i------------C 134 (166)
|.|+.||.|..|.. |.|. .+..|| -+|++ +|.|.|.+-=.|..|.+-. =
T Consensus 697 c~C~~g~tG~~Ce~-C~~gfrr~~~~~~~~~~c~~C~cngh~~~Cd~~tG~C~C~~~t~G~~C~~C~~GfYg~~~~~~~~ 775 (1705)
T KOG1836|consen 697 CTCPVGYTGQFCES-CAPGFRRLSPQLGPFCPCIPCDCNGHSNICDPRTGQCKCKHNTFGGQCAQCVDGFYGLPDLGTSG 775 (1705)
T ss_pred ccCCCCcccchhhh-cchhhhcccccCCCCCcccccccCCccccccCCCCceecccCCCCCchhhhcCCCCCccccCCCC
Confidence 99999999999974 2211 234566 57876 7899888877787776532 1
Q ss_pred CCCCCCCCC-CccC-----CCCcee-CCCCCccC
Q psy10880 135 KLSPTKALP-NRTS-----RTLFCV-LEPSMQLY 161 (166)
Q Consensus 135 ~~~c~~c~n-G~C~-----~~~~C~-C~~G~~G~ 161 (166)
.+.+++|.+ +.|. ..+.|. |++||+|.
T Consensus 776 dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~ 809 (1705)
T KOG1836|consen 776 DCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGL 809 (1705)
T ss_pred CCccCCCCCChhhcCcCcccceecCCCCCCCccc
Confidence 133444444 3443 348998 99999997
No 31
>PF09026 CENP-B_dimeris: Centromere protein B dimerisation domain; InterPro: IPR015115 Centromere protein B (CENP-B) interacts with centromeric heterochromatin in chromosomes and binds to a specific subset of alphoid satellite DNA, called the CENP-B box. CENP-B may organise arrays of centromere satellite DNA into a higher order structure, which then directs centromere formation and kinetochore assembly in mammalian chromosomes. The CENP-B dimerisation domain is composed of two alpha-helices, which are folded into an antiparallel configuration. Dimerisation of CENP-B is mediated by this domain, in which monomers dimerise to form a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation []. ; GO: 0003677 DNA binding, 0003682 chromatin binding, 0006355 regulation of transcription, DNA-dependent, 0000775 chromosome, centromeric region, 0005634 nucleus; PDB: 1UFI_A.
Probab=92.71 E-value=0.032 Score=40.80 Aligned_cols=16 Identities=38% Similarity=0.626 Sum_probs=0.5
Q ss_pred CCCCCCCCCCCccccc
Q psy10880 31 DDDDDDDDDDDDEVII 46 (166)
Q Consensus 31 ~~~~~~~~~~~~~~~~ 46 (166)
+|++++++++---..|
T Consensus 31 Dddddee~de~p~p~f 46 (101)
T PF09026_consen 31 DDDDDEEEDEVPVPEF 46 (101)
T ss_dssp ---------------H
T ss_pred ccccccccccccchhH
Confidence 3333334445555544
No 32
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=92.26 E-value=0.099 Score=30.25 Aligned_cols=24 Identities=21% Similarity=0.469 Sum_probs=13.4
Q ss_pred ccccceeeec-eeEEEecCCCCCCC
Q psy10880 71 EHKSAHSSML-YEYRVTCDPHYYGN 94 (166)
Q Consensus 71 ~~~~~~~~l~-~~~r~~C~~gyyG~ 94 (166)
.+.|..+.+. ..|+|.|+++|.|.
T Consensus 7 ~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 7 QNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp TTTEEEEEESTSEEEEEEBTTEEST
T ss_pred CCCeEEEeCCCCCEEeECCCCCccC
Confidence 3445555555 56666666666554
No 33
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=90.27 E-value=0.065 Score=32.14 Aligned_cols=15 Identities=20% Similarity=0.363 Sum_probs=12.4
Q ss_pred CCceeCCCCCccCCC
Q psy10880 149 TLFCVLEPSMQLYGN 163 (166)
Q Consensus 149 ~~~C~C~~G~~G~~~ 163 (166)
.+.|+|++||.|.|-
T Consensus 20 ~~~C~C~~Gy~GdG~ 34 (36)
T PF12947_consen 20 SYTCTCKPGYEGDGF 34 (36)
T ss_dssp SEEEEE-CEEECCST
T ss_pred CEEeECCCCCccCCc
Confidence 589999999999973
No 34
>smart00181 EGF Epidermal growth factor-like domain.
Probab=89.81 E-value=0.29 Score=27.80 Aligned_cols=11 Identities=45% Similarity=1.389 Sum_probs=7.9
Q ss_pred cCCCCCCC-CCC
Q psy10880 119 KCLPGWSG-DYC 129 (166)
Q Consensus 119 ~C~~GwtG-~~C 129 (166)
.|.+||+| ..|
T Consensus 23 ~C~~g~~g~~~C 34 (35)
T smart00181 23 SCPPGYTGDKRC 34 (35)
T ss_pred ECCCCCccCCcc
Confidence 77788877 555
No 35
>KOG1214|consensus
Probab=89.78 E-value=0.58 Score=45.95 Aligned_cols=83 Identities=19% Similarity=0.315 Sum_probs=48.1
Q ss_pred eeEEEecCCCCCCC--CCCC--CCCCCCCCCCcccCCC---CCCccCCCCCCCC--CCCc-----cccCC---CCCCCCC
Q psy10880 81 YEYRVTCDPHYYGN--GCAT--LCRPRDDSFGHYTCSH---TGDRKCLPGWSGD--YCTK-----AVQKL---SPTKALP 143 (166)
Q Consensus 81 ~~~r~~C~~gyyG~--~C~~--~C~p~~~~~ghy~C~~---~G~c~C~~GwtG~--~C~~-----~iC~~---~c~~c~n 143 (166)
..|.|.|.+||.|. .|.. .|.|.-.. -..+|-. ...|+|.|||+|. .|-. ..|.. .+..|.+
T Consensus 807 s~y~C~CLPGfsGDG~~c~dvDeC~psrCh-p~A~CyntpgsfsC~C~pGy~GDGf~CVP~~~~~T~C~~er~hpl~chg 885 (1289)
T KOG1214|consen 807 STYSCACLPGFSGDGHQCTDVDECSPSRCH-PAATCYNTPGSFSCRCQPGYYGDGFQCVPDTSSLTPCEQERFHPLQCHG 885 (1289)
T ss_pred ceEEEeecCCccCCccccccccccCccccC-CCceEecCCCcceeecccCccCCCceecCCCccCCccccccccceeecc
Confidence 57899999999865 4544 46654211 1223321 2467999999984 2321 11332 1333332
Q ss_pred --Ccc--C--CCCceeCCCCCccCCCC
Q psy10880 144 --NRT--S--RTLFCVLEPSMQLYGNC 164 (166)
Q Consensus 144 --G~C--~--~~~~C~C~~G~~G~~~~ 164 (166)
+.| + ..++|.|.++-.|+|+.
T Consensus 886 ~t~~~~~~Dp~~~e~p~~~~ppG~~~~ 912 (1289)
T KOG1214|consen 886 STGFCWCVDPDGHEVPGTQTPPGSTPP 912 (1289)
T ss_pred ccceeEeeCCCcccCCCCCCCCCCCCC
Confidence 222 2 23899999998888764
No 36
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=88.62 E-value=0.12 Score=32.30 Aligned_cols=22 Identities=41% Similarity=0.856 Sum_probs=17.1
Q ss_pred ccCCC-CCCccCCCCCCCCCCCc
Q psy10880 110 YTCSH-TGDRKCLPGWSGDYCTK 131 (166)
Q Consensus 110 y~C~~-~G~c~C~~GwtG~~C~~ 131 (166)
.+|++ +|+|+|.++|+|..|++
T Consensus 11 ~~C~~~~G~C~C~~~~~G~~C~~ 33 (49)
T PF00053_consen 11 QTCDPSTGQCVCKPGTTGPRCDQ 33 (49)
T ss_dssp SSEEETCEEESBSTTEESTTS-E
T ss_pred CcccCCCCEEeccccccCCcCcC
Confidence 46765 68889999999999985
No 37
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=87.46 E-value=0.58 Score=26.79 Aligned_cols=12 Identities=42% Similarity=1.290 Sum_probs=8.9
Q ss_pred cCCCCCC-CCCCC
Q psy10880 119 KCLPGWS-GDYCT 130 (166)
Q Consensus 119 ~C~~Gwt-G~~C~ 130 (166)
.|.+||+ |..|+
T Consensus 27 ~C~~g~~~g~~C~ 39 (39)
T smart00179 27 ECPPGYTDGRNCE 39 (39)
T ss_pred ECCCCCccCCcCC
Confidence 7888887 77763
No 38
>PF14812 PBP1_TM: Transmembrane domain of transglycosylase PBP1 at N-terminal; PDB: 3FWL_A 3VMA_A.
Probab=85.49 E-value=0.25 Score=35.03 Aligned_cols=7 Identities=14% Similarity=0.164 Sum_probs=0.0
Q ss_pred chhhhhh
Q psy10880 48 CKTLISR 54 (166)
Q Consensus 48 ~~~LIsR 54 (166)
++..|.|
T Consensus 49 eee~m~r 55 (81)
T PF14812_consen 49 EEEPMPR 55 (81)
T ss_dssp -------
T ss_pred hcccccc
Confidence 3444443
No 39
>KOG3130|consensus
Probab=84.86 E-value=0.54 Score=42.57 Aligned_cols=6 Identities=33% Similarity=0.323 Sum_probs=2.7
Q ss_pred ecCccc
Q psy10880 4 VDGEWW 9 (166)
Q Consensus 4 ~~~~~~ 9 (166)
|+|.++
T Consensus 265 v~~~ss 270 (514)
T KOG3130|consen 265 VNGSSS 270 (514)
T ss_pred ccCCCC
Confidence 444444
No 40
>KOG3607|consensus
Probab=84.65 E-value=0.6 Score=44.97 Aligned_cols=27 Identities=30% Similarity=0.755 Sum_probs=24.8
Q ss_pred CCCcccCCCCCCccCCCCCCCCCCCcc
Q psy10880 106 SFGHYTCSHTGDRKCLPGWSGDYCTKA 132 (166)
Q Consensus 106 ~~ghy~C~~~G~c~C~~GwtG~~C~~~ 132 (166)
+.+|++|+...+|+|.+||.+++|+..
T Consensus 632 C~g~GVCnn~~~ChC~~gwapp~C~~~ 658 (716)
T KOG3607|consen 632 CNGHGVCNNELNCHCEPGWAPPFCFIF 658 (716)
T ss_pred cCCCcccCCCcceeeCCCCCCCccccc
Confidence 568999999999999999999999874
No 41
>KOG3512|consensus
Probab=84.58 E-value=2.1 Score=39.62 Aligned_cols=54 Identities=20% Similarity=0.230 Sum_probs=34.2
Q ss_pred CcccCCC-CCCccCCCCCCCCCCCcc----------ccCCCC------CCCCCCccCCCCceeCCCCCccC
Q psy10880 108 GHYTCSH-TGDRKCLPGWSGDYCTKA----------VQKLSP------TKALPNRTSRTLFCVLEPSMQLY 161 (166)
Q Consensus 108 ghy~C~~-~G~c~C~~GwtG~~C~~~----------iC~~~c------~~c~nG~C~~~~~C~C~~G~~G~ 161 (166)
.|.+|+. +|.|.|.+|=+|..|+.- ++.+.. ..++++.--+..-+.|+++..|+
T Consensus 405 ~gktCNq~tGqCpCkeGvtG~tCnrCa~gyqqsrs~vapcik~p~~~~~~~~s~ve~qd~~s~Ck~~~~~~ 475 (592)
T KOG3512|consen 405 AGKTCNQTTGQCPCKEGVTGLTCNRCAPGYQQSRSPVAPCIKIPTDAPTLGSSGVEPQDQCSKCKASPGGK 475 (592)
T ss_pred ccccccccCCcccCCCCCcccccccccchhhcccCCCcCceecCCCCccccCCCCcchhccccCCCCCcce
Confidence 3678994 899999999999999741 222211 11111111234677999888765
No 42
>PF03153 TFIIA: Transcription factor IIA, alpha/beta subunit; InterPro: IPR004855 Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP [], and can dissociate HMGB1 already bound to TBP/TATA-box. Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2 []. This entry represents the precursor that yields both the alpha and beta subunits of TFIIA. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II []. ; GO: 0006367 transcription initiation from RNA polymerase II promoter, 0005672 transcription factor TFIIA complex; PDB: 1NVP_B 1YTF_B 1RM1_C 1NH2_B.
Probab=83.87 E-value=0.32 Score=42.63 Aligned_cols=10 Identities=40% Similarity=0.381 Sum_probs=0.0
Q ss_pred ecCcccccCC
Q psy10880 4 VDGEWWMADD 13 (166)
Q Consensus 4 ~~~~~~~~~~ 13 (166)
+||...++++
T Consensus 273 ~DG~~d~~~~ 282 (375)
T PF03153_consen 273 LDGAGDDSDD 282 (375)
T ss_dssp ----------
T ss_pred ccCCCCCccc
Confidence 5666544443
No 43
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=83.71 E-value=1 Score=25.22 Aligned_cols=11 Identities=45% Similarity=1.377 Sum_probs=8.0
Q ss_pred cCCCCCCCCCC
Q psy10880 119 KCLPGWSGDYC 129 (166)
Q Consensus 119 ~C~~GwtG~~C 129 (166)
.|.+||.|..|
T Consensus 27 ~C~~g~~g~~C 37 (38)
T cd00054 27 SCPPGYTGRNC 37 (38)
T ss_pred ECCCCCcCCcC
Confidence 67777777666
No 44
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=83.46 E-value=1.3 Score=24.32 Aligned_cols=11 Identities=55% Similarity=1.458 Sum_probs=7.6
Q ss_pred cCCCCCCCC-CC
Q psy10880 119 KCLPGWSGD-YC 129 (166)
Q Consensus 119 ~C~~GwtG~-~C 129 (166)
.|++||.|. .|
T Consensus 24 ~C~~g~~g~~~C 35 (36)
T cd00053 24 VCPPGYTGDRSC 35 (36)
T ss_pred ECCCCCcccCCc
Confidence 777777776 44
No 45
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=83.34 E-value=0.61 Score=25.65 Aligned_cols=13 Identities=31% Similarity=0.524 Sum_probs=10.4
Q ss_pred CCceeCCCCCccC
Q psy10880 149 TLFCVLEPSMQLY 161 (166)
Q Consensus 149 ~~~C~C~~G~~G~ 161 (166)
.|+|.|++||+-.
T Consensus 1 sy~C~C~~Gy~l~ 13 (24)
T PF12662_consen 1 SYTCSCPPGYQLS 13 (24)
T ss_pred CEEeeCCCCCcCC
Confidence 3789999999843
No 46
>KOG2652|consensus
Probab=83.21 E-value=0.93 Score=40.00 Aligned_cols=15 Identities=33% Similarity=0.499 Sum_probs=9.0
Q ss_pred eecCcccccCCCCCC
Q psy10880 3 MVDGEWWMADDDNND 17 (166)
Q Consensus 3 ~~~~~~~~~~~~~~~ 17 (166)
-|||...++++++++
T Consensus 252 Q~Dg~~~~~eE~e~E 266 (348)
T KOG2652|consen 252 QVDGTGDTSEEDENE 266 (348)
T ss_pred ecccccccccccccc
Confidence 467877766444444
No 47
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=80.20 E-value=0.7 Score=28.05 Aligned_cols=17 Identities=12% Similarity=0.266 Sum_probs=13.6
Q ss_pred CCccC---CCCceeCCCCCc
Q psy10880 143 PNRTS---RTLFCVLEPSMQ 159 (166)
Q Consensus 143 nG~C~---~~~~C~C~~G~~ 159 (166)
++.|. ..|+|.|++||+
T Consensus 15 ~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 15 NGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp TSEEEEETTEEEEEESTTEE
T ss_pred CCEEEcCCCCEEeeCCCCcE
Confidence 45676 359999999998
No 48
>PF03115 Astro_capsid: Astrovirus capsid protein precursor; InterPro: IPR004337 The astrovirus genome is apparently organised with nonstructural proteins encoded at the 5' end and structural proteins at the 3' end []. Proteins in this family are encoded by astrovirus ORF2, one of the three astrovirus ORFs (1a, 1b, 2). The proteins contain a viral RNA-dependent RNA polymerase motif []. The 87kDa precursor polyprotein undergoes an intracellular cleavage to form a 79kDa protein. Subsequently, extracellular trypsin cleavage yields the three proteins forming the infectious virion [].; PDB: 3QSQ_A 3TS3_D.
Probab=76.97 E-value=0.78 Score=44.64 Aligned_cols=10 Identities=30% Similarity=0.338 Sum_probs=0.0
Q ss_pred hhhhhccccc
Q psy10880 51 LISRLTTQRW 60 (166)
Q Consensus 51 LIsR~~~~~~ 60 (166)
|++-|..|++
T Consensus 713 L~nTLVNqGi 722 (787)
T PF03115_consen 713 LFNTLVNQGI 722 (787)
T ss_dssp ----------
T ss_pred HHHHHHHcCC
Confidence 4444444443
No 49
>KOG4260|consensus
Probab=76.30 E-value=3.1 Score=36.10 Aligned_cols=50 Identities=16% Similarity=0.146 Sum_probs=33.6
Q ss_pred ccCCCCCCccCCCCCCCCCCCccccCC-CCCCCC-CCccC------CCCceeCCCCCccC
Q psy10880 110 YTCSHTGDRKCLPGWSGDYCTKAVQKL-SPTKAL-PNRTS------RTLFCVLEPSMQLY 161 (166)
Q Consensus 110 y~C~~~G~c~C~~GwtG~~C~~~iC~~-~c~~c~-nG~C~------~~~~C~C~~G~~G~ 161 (166)
..|...=.--|++|-.|+.|.+ |.- +-.+|+ +|.|. ..++|.|.+||+|.
T Consensus 122 WlCvdqLkvCCp~gtyGpdCl~--Cpggser~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp 179 (350)
T KOG4260|consen 122 WLCVDQLKVCCPDGTYGPDCLQ--CPGGSERPCFGNGSCHGDGSREGSGKCKCETGYTGP 179 (350)
T ss_pred HhhhhhheeccCCCCcCCcccc--CCCCCcCCcCCCCcccCCCCCCCCCcccccCCCCCc
Confidence 4454444445899999999964 321 223444 46775 35899999999997
No 50
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=76.03 E-value=2.5 Score=25.22 Aligned_cols=12 Identities=17% Similarity=0.127 Sum_probs=8.9
Q ss_pred CCceeCCCCCcc
Q psy10880 149 TLFCVLEPSMQL 160 (166)
Q Consensus 149 ~~~C~C~~G~~G 160 (166)
.++|.||.||.-
T Consensus 17 ~~~C~CPeGyIl 28 (34)
T PF09064_consen 17 PGQCFCPEGYIL 28 (34)
T ss_pred CCceeCCCceEe
Confidence 467888888864
No 51
>KOG1836|consensus
Probab=75.99 E-value=4.4 Score=42.69 Aligned_cols=51 Identities=20% Similarity=0.351 Sum_probs=35.2
Q ss_pred cCCC-CCCccCCCCCCCCCCCcccc--------CCCCCCCC-CC----ccCC-CCceeCCCCCccC
Q psy10880 111 TCSH-TGDRKCLPGWSGDYCTKAVQ--------KLSPTKAL-PN----RTSR-TLFCVLEPSMQLY 161 (166)
Q Consensus 111 ~C~~-~G~c~C~~GwtG~~C~~~iC--------~~~c~~c~-nG----~C~~-~~~C~C~~G~~G~ 161 (166)
.|.. .|.|.|.+|=+|..|.+..= .+.++.|. +| .|.. .++|.|+++|.|.
T Consensus 953 ~c~~~tGqc~c~~gVtgqrc~qc~~~~~~~~~~gc~~c~c~~~Gs~~~qc~~~~G~c~c~~~~~g~ 1018 (1705)
T KOG1836|consen 953 DCDVGTGQCYCRPGVTGQRCDQCETYHFGFQTEGCGLCECDPLGSRGFQCDPEDGQCPCRPGFEGR 1018 (1705)
T ss_pred cccccCCceeeecCccccccCccccCcccccccCCcceecccCCcccceecccCCeeeecCCCCCc
Confidence 6775 79999999999999986420 01111111 12 4665 7999999999985
No 52
>PHA02887 EGF-like protein; Provisional
Probab=70.84 E-value=2.3 Score=32.33 Aligned_cols=17 Identities=24% Similarity=0.468 Sum_probs=13.8
Q ss_pred CCCccCCCCCCCCCCCc
Q psy10880 115 TGDRKCLPGWSGDYCTK 131 (166)
Q Consensus 115 ~G~c~C~~GwtG~~C~~ 131 (166)
.-.|+|.+||+|..|+.
T Consensus 107 epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 107 EKFCICNKGYTGIRCDE 123 (126)
T ss_pred CceeECCCCcccCCCCc
Confidence 35779999999999975
No 53
>KOG3512|consensus
Probab=65.93 E-value=21 Score=33.31 Aligned_cols=17 Identities=18% Similarity=0.550 Sum_probs=14.5
Q ss_pred eEEEecCCCCCCCCCCC
Q psy10880 82 EYRVTCDPHYYGNGCAT 98 (166)
Q Consensus 82 ~~r~~C~~gyyG~~C~~ 98 (166)
.+.|.|..+-.|+.|.+
T Consensus 294 ~ltCdC~HNTaGPdCgr 310 (592)
T KOG3512|consen 294 HLTCDCEHNTAGPDCGR 310 (592)
T ss_pred ceEEecccCCCCCCccc
Confidence 48899999999999964
No 54
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=61.02 E-value=1.5 Score=26.26 Aligned_cols=22 Identities=41% Similarity=1.250 Sum_probs=12.5
Q ss_pred CCCCCCCCCCCCCCcccCCCCCCccCCCCCC
Q psy10880 95 GCATLCRPRDDSFGHYTCSHTGDRKCLPGWS 125 (166)
Q Consensus 95 ~C~~~C~p~~~~~ghy~C~~~G~c~C~~Gwt 125 (166)
.|+.+|.+.. +.|+| .|++||+
T Consensus 7 gC~h~C~~~~---g~~~C------~C~~Gy~ 28 (36)
T PF14670_consen 7 GCSHICVNTP---GSYRC------SCPPGYK 28 (36)
T ss_dssp GSSSEEEEET---TSEEE------E-STTEE
T ss_pred CcCCCCccCC---CceEe------ECCCCCE
Confidence 4666666532 34666 7777775
No 55
>PF05285 SDA1: SDA1; InterPro: IPR007949 This domain consists of several SDA1 protein homologues. SDA1 is a Saccharomyces cerevisiae protein which is involved in the control of the actin cytoskeleton. The protein is essential for cell viability and is localised in the nucleus [].
Probab=56.59 E-value=6.9 Score=34.08 Aligned_cols=12 Identities=33% Similarity=0.445 Sum_probs=5.9
Q ss_pred hhhhcccccccC
Q psy10880 52 ISRLTTQRWLDV 63 (166)
Q Consensus 52 IsR~~~~~~l~~ 63 (166)
+..++..|.|.+
T Consensus 181 ~~~is~~rILT~ 192 (324)
T PF05285_consen 181 ASKISTTRILTP 192 (324)
T ss_pred hhHHhhccCCCH
Confidence 344555555543
No 56
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=52.17 E-value=12 Score=23.25 Aligned_cols=11 Identities=18% Similarity=0.120 Sum_probs=5.1
Q ss_pred CceeCCCCCcc
Q psy10880 150 LFCVLEPSMQL 160 (166)
Q Consensus 150 ~~C~C~~G~~G 160 (166)
++|.|++||.-
T Consensus 37 g~C~C~~g~~~ 47 (52)
T PF01683_consen 37 GRCQCPPGYVE 47 (52)
T ss_pred CEeECCCCCEe
Confidence 44455555443
No 57
>PF04147 Nop14: Nop14-like family ; InterPro: IPR007276 Emg1 and Nop14 are novel proteins whose interaction is required for the maturation of the 18S rRNA and for 40S ribosome production [].
Probab=50.76 E-value=12 Score=36.85 Aligned_cols=9 Identities=11% Similarity=0.578 Sum_probs=4.2
Q ss_pred eEEEecCCC
Q psy10880 82 EYRVTCDPH 90 (166)
Q Consensus 82 ~~r~~C~~g 90 (166)
-|...|+..
T Consensus 419 Pftf~~P~s 427 (840)
T PF04147_consen 419 PFTFPCPSS 427 (840)
T ss_pred CceecCCCC
Confidence 344445554
No 58
>PHA02887 EGF-like protein; Provisional
Probab=50.64 E-value=9.9 Score=28.93 Aligned_cols=18 Identities=28% Similarity=0.490 Sum_probs=15.4
Q ss_pred eeEEEecCCCCCCCCCCC
Q psy10880 81 YEYRVTCDPHYYGNGCAT 98 (166)
Q Consensus 81 ~~~r~~C~~gyyG~~C~~ 98 (166)
....|.|.+||.|..|..
T Consensus 106 ~epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 106 DEKFCICNKGYTGIRCDE 123 (126)
T ss_pred CCceeECCCCcccCCCCc
Confidence 456899999999999975
No 59
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=46.56 E-value=11 Score=29.22 Aligned_cols=18 Identities=28% Similarity=0.580 Sum_probs=14.7
Q ss_pred CCccCCCCCCCCCCCccc
Q psy10880 116 GDRKCLPGWSGDYCTKAV 133 (166)
Q Consensus 116 G~c~C~~GwtG~~C~~~i 133 (166)
-.|.|..||+|..|+...
T Consensus 67 ~~CrC~~GYtGeRCEh~d 84 (139)
T PHA03099 67 MYCRCSHGYTGIRCQHVV 84 (139)
T ss_pred ceeECCCCccccccccee
Confidence 455999999999998643
No 60
>KOG3509|consensus
Probab=45.31 E-value=38 Score=34.05 Aligned_cols=73 Identities=14% Similarity=0.114 Sum_probs=31.1
Q ss_pred EEecCCCCCCCCCCCCCCCCCCCCCcccCCCCCCccCCCCCCCCCCCccccCCCCCCCCCCccCCCCce-eCCCCCccC
Q psy10880 84 RVTCDPHYYGNGCATLCRPRDDSFGHYTCSHTGDRKCLPGWSGDYCTKAVQKLSPTKALPNRTSRTLFC-VLEPSMQLY 161 (166)
Q Consensus 84 r~~C~~gyyG~~C~~~C~p~~~~~ghy~C~~~G~c~C~~GwtG~~C~~~iC~~~c~~c~nG~C~~~~~C-~C~~G~~G~ 161 (166)
+|.|+++|.|..|+. |.+....+....|.......|.-+|....|....-. +..|++.. ..++| .|.+|+.|.
T Consensus 719 ~C~c~~g~~G~~ce~-c~e~~~ls~t~~~~~~~~~~c~~~~h~~~c~~~~~~--nt~~q~~~--~~~~~~~~~~g~~~d 792 (964)
T KOG3509|consen 719 QCQCPKGLVGTSCED-CAEGYTLSTTGGLYPGLCEDCECNSHISQCEDDLGY--NTDCQNNT--EGDRCELCSPGTYGD 792 (964)
T ss_pred ccccCccccCccccc-ccccccccccCCcCcccCcccccCCCcccccccccc--cccccccC--ccceeeecCCCcccc
Confidence 455555555555543 222111111122333344455555555555543211 11111100 12455 677777775
No 61
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=44.80 E-value=13 Score=28.79 Aligned_cols=26 Identities=15% Similarity=0.247 Sum_probs=19.9
Q ss_pred cceeeec--eeEEEecCCCCCCCCCCCC
Q psy10880 74 SAHSSML--YEYRVTCDPHYYGNGCATL 99 (166)
Q Consensus 74 ~~~~~l~--~~~r~~C~~gyyG~~C~~~ 99 (166)
|.+.-+. .++.|.|..||.|..|+.+
T Consensus 56 G~C~yI~dl~~~~CrC~~GYtGeRCEh~ 83 (139)
T PHA03099 56 GDCIHARDIDGMYCRCSHGYTGIRCQHV 83 (139)
T ss_pred CEEEeeccCCCceeECCCCcccccccce
Confidence 4454443 5778999999999999873
No 62
>PF04546 Sigma70_ner: Sigma-70, non-essential region; InterPro: IPR007631 The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma70 family members reveal four conserved regions that can be further divided into subregions eg. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding 'helix-turn-helix' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors [, ]. This domain is found in the primary vegetative sigma factor. Its function is unclear, and it can be removed without apparent loss of function [, ].; GO: 0003677 DNA binding, 0003700 sequence-specific DNA binding transcription factor activity, 0016987 sigma factor activity, 0006352 transcription initiation, DNA-dependent, 0006355 regulation of transcription, DNA-dependent; PDB: 1SIG_A 3IYD_F.
Probab=41.30 E-value=16 Score=29.57 Aligned_cols=6 Identities=17% Similarity=0.634 Sum_probs=2.4
Q ss_pred hhhhhh
Q psy10880 50 TLISRL 55 (166)
Q Consensus 50 ~LIsR~ 55 (166)
..+.|+
T Consensus 79 ~v~~~f 84 (211)
T PF04546_consen 79 EVLERF 84 (211)
T ss_dssp HHHHHH
T ss_pred HHHHHH
Confidence 344443
No 63
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=41.14 E-value=28 Score=24.95 Aligned_cols=10 Identities=20% Similarity=0.398 Sum_probs=5.5
Q ss_pred ceeCCCCCcc
Q psy10880 151 FCVLEPSMQL 160 (166)
Q Consensus 151 ~C~C~~G~~G 160 (166)
.|.|++||..
T Consensus 99 ~C~Cl~GF~P 108 (110)
T PF00954_consen 99 KCSCLPGFEP 108 (110)
T ss_pred ceECCCCcCC
Confidence 3666666543
No 64
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=40.36 E-value=31 Score=27.66 Aligned_cols=14 Identities=14% Similarity=0.387 Sum_probs=11.9
Q ss_pred CCceeCCCCCccCC
Q psy10880 149 TLFCVLEPSMQLYG 162 (166)
Q Consensus 149 ~~~C~C~~G~~G~~ 162 (166)
.|.|.|++||+...
T Consensus 207 ~~~c~c~~g~~~~~ 220 (224)
T cd01475 207 SYLCACTEGYALLE 220 (224)
T ss_pred CEEeECCCCccCCC
Confidence 49999999998754
No 65
>PF03153 TFIIA: Transcription factor IIA, alpha/beta subunit; InterPro: IPR004855 Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP [], and can dissociate HMGB1 already bound to TBP/TATA-box. Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2 []. This entry represents the precursor that yields both the alpha and beta subunits of TFIIA. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II []. ; GO: 0006367 transcription initiation from RNA polymerase II promoter, 0005672 transcription factor TFIIA complex; PDB: 1NVP_B 1YTF_B 1RM1_C 1NH2_B.
Probab=40.19 E-value=6.7 Score=34.29 Aligned_cols=6 Identities=33% Similarity=0.878 Sum_probs=2.8
Q ss_pred CCCCCc
Q psy10880 63 VGPSWT 68 (166)
Q Consensus 63 ~~~~W~ 68 (166)
+-..|+
T Consensus 344 ~k~~wk 349 (375)
T PF03153_consen 344 VKNKWK 349 (375)
T ss_dssp ETTEEE
T ss_pred ccceeE
Confidence 334564
No 66
>PF04281 Tom22: Mitochondrial import receptor subunit Tom22 ; InterPro: IPR005683 The mitochondrial protein translocase family, which is responsible for movement of nuclear encoded pre-proteins into mitochondria, is very complex with at least 19 components. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family represents the Tom22 proteins []. The N-terminal region of Tom22 has been shown to have chaperone-like activity, and the C-terminal region faces the intermembrane face []. ; GO: 0006886 intracellular protein transport, 0005741 mitochondrial outer membrane
Probab=33.91 E-value=31 Score=26.66 Aligned_cols=8 Identities=50% Similarity=0.663 Sum_probs=3.8
Q ss_pred hhhhhhhc
Q psy10880 49 KTLISRLT 56 (166)
Q Consensus 49 ~~LIsR~~ 56 (166)
+.|..|++
T Consensus 51 ETl~ERl~ 58 (137)
T PF04281_consen 51 ETLLERLW 58 (137)
T ss_pred ccHHHHHH
Confidence 34445554
No 67
>KOG2023|consensus
Probab=32.53 E-value=26 Score=34.14 Aligned_cols=8 Identities=25% Similarity=0.659 Sum_probs=5.0
Q ss_pred CCCCcccc
Q psy10880 64 GPSWTEDE 71 (166)
Q Consensus 64 ~~~W~~~~ 71 (166)
+.+|...+
T Consensus 404 ~~~W~vrE 411 (885)
T KOG2023|consen 404 SEEWKVRE 411 (885)
T ss_pred cchhhhhh
Confidence 47886544
No 68
>KOG3607|consensus
Probab=28.47 E-value=47 Score=32.28 Aligned_cols=24 Identities=21% Similarity=0.216 Sum_probs=19.3
Q ss_pred CC-CCccCCCCceeCCCCCccCCCCC
Q psy10880 141 AL-PNRTSRTLFCVLEPSMQLYGNCA 165 (166)
Q Consensus 141 c~-nG~C~~~~~C~C~~G~~G~~~~~ 165 (166)
|+ +|.|++.++|+|.+||.+. .|.
T Consensus 632 C~g~GVCnn~~~ChC~~gwapp-~C~ 656 (716)
T KOG3607|consen 632 CNGHGVCNNELNCHCEPGWAPP-FCF 656 (716)
T ss_pred cCCCcccCCCcceeeCCCCCCC-ccc
Confidence 44 4789999999999999886 443
No 69
>KOG0196|consensus
Probab=25.92 E-value=1.3e+02 Score=30.12 Aligned_cols=10 Identities=20% Similarity=0.238 Sum_probs=7.7
Q ss_pred CceeCCCCCc
Q psy10880 150 LFCVLEPSMQ 159 (166)
Q Consensus 150 ~~C~C~~G~~ 159 (166)
-.|.|..||.
T Consensus 308 ~~C~C~~gyy 317 (996)
T KOG0196|consen 308 TSCTCENGYY 317 (996)
T ss_pred CcccccCCcc
Confidence 5788888874
No 70
>smart00017 OSTEO Osteopontin. Osteopontin is an acidic phosphorylated glycoprotein of about 40 kDa which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite [1,2,3]. It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone
Probab=25.44 E-value=77 Score=27.10 Aligned_cols=13 Identities=23% Similarity=0.383 Sum_probs=5.8
Q ss_pred cccceeeeceeEE
Q psy10880 72 HKSAHSSMLYEYR 84 (166)
Q Consensus 72 ~~~~~~~l~~~~r 84 (166)
+.|.+-.+.|.+|
T Consensus 134 ~dGRGDSvaYglR 146 (287)
T smart00017 134 NDGRGDSVAYGLR 146 (287)
T ss_pred CCCCcccceehhh
Confidence 3444444444444
No 71
>KOG3516|consensus
Probab=24.32 E-value=52 Score=33.91 Aligned_cols=31 Identities=16% Similarity=0.298 Sum_probs=23.1
Q ss_pred ccCCCCCCCCC-CccCCC---CceeCC-CCCccCCCC
Q psy10880 133 VQKLSPTKALP-NRTSRT---LFCVLE-PSMQLYGNC 164 (166)
Q Consensus 133 iC~~~c~~c~n-G~C~~~---~~C~C~-~G~~G~~~~ 164 (166)
++.|.+.+|++ |.|.+. +.|.|. .||.|. .|
T Consensus 545 ~drClPN~CehgG~C~Qs~~~f~C~C~~TGY~Ga-tC 580 (1306)
T KOG3516|consen 545 SDRCLPNPCEHGGKCSQSWDDFECNCELTGYKGA-TC 580 (1306)
T ss_pred ccccCCccccCCCcccccccceeEeccccccccc-cc
Confidence 36677788887 578764 899999 888875 44
No 72
>KOG3509|consensus
Probab=23.90 E-value=1.1e+02 Score=30.80 Aligned_cols=60 Identities=23% Similarity=0.395 Sum_probs=36.4
Q ss_pred cccccceeeeceeEEEecCCCCCCCCCCC---CCCCCCCCCCcccCCC---CCCccCCCCCCCCCCC
Q psy10880 70 DEHKSAHSSMLYEYRVTCDPHYYGNGCAT---LCRPRDDSFGHYTCSH---TGDRKCLPGWSGDYCT 130 (166)
Q Consensus 70 ~~~~~~~~~l~~~~r~~C~~gyyG~~C~~---~C~p~~~~~ghy~C~~---~G~c~C~~GwtG~~C~ 130 (166)
.++.+......+..+|.|+++|+|..|.. .|.+.....--.+|.+ .+-+.|.|| .|..+.
T Consensus 414 ~~~~g~c~p~~~~~~c~c~~g~~G~~c~d~~~~~~~~~~g~y~~t~~~~~~~~~~~c~pg-~g~~~~ 479 (964)
T KOG3509|consen 414 CQHDGPCLQTLEGKQCLCPPGYTGDSCEDCMNGCDRSPNGSYLGTCVPIQGKRCEYCGPG-AGAPTA 479 (964)
T ss_pred CCCCccccccccccceeccccccCchhhccCccccccCCccccceEeccCCCcceeecCC-CCCccc
Confidence 34445566667888999999999999875 3332211111124443 244788888 666653
No 73
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=23.13 E-value=35 Score=25.23 Aligned_cols=9 Identities=44% Similarity=1.169 Sum_probs=5.7
Q ss_pred CCCCCCCCc
Q psy10880 123 GWSGDYCTK 131 (166)
Q Consensus 123 GwtG~~C~~ 131 (166)
.|.|+-|++
T Consensus 53 ~W~G~aCqK 61 (103)
T PF12955_consen 53 HWGGPACQK 61 (103)
T ss_pred eeccccccc
Confidence 566666654
No 74
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=21.57 E-value=24 Score=21.39 Aligned_cols=15 Identities=13% Similarity=0.073 Sum_probs=10.5
Q ss_pred CCceeCCCCCccCCC
Q psy10880 149 TLFCVLEPSMQLYGN 163 (166)
Q Consensus 149 ~~~C~C~~G~~G~~~ 163 (166)
...|.|.+||...|+
T Consensus 20 ~eecrCllgyk~~~~ 34 (37)
T PF12946_consen 20 SEECRCLLGYKKVGG 34 (37)
T ss_dssp EEEEEE-TTEEEETT
T ss_pred CEEEEeeCCccccCC
Confidence 468888888887664
Done!