Query psy3616
Match_columns 143
No_of_seqs 144 out of 1113
Neff 6.0
Searched_HMMs 46136
Date Fri Aug 16 21:08:27 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy3616.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/3616hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PHA03099 epidermal growth fact 99.2 1.8E-11 3.9E-16 90.9 4.6 68 3-71 50-130 (139)
2 KOG1219|consensus 98.8 1.8E-08 3.9E-13 100.1 7.9 33 3-36 3908-3940(4289)
3 PF00008 EGF: EGF-like domain 98.4 1.3E-07 2.9E-12 54.2 2.3 30 3-32 3-32 (32)
4 smart00179 EGF_CA Calcium-bind 98.0 7.6E-06 1.6E-10 47.1 3.4 30 4-34 9-39 (39)
5 cd00054 EGF_CA Calcium-binding 97.9 1.7E-05 3.6E-10 44.9 3.3 30 4-34 9-38 (38)
6 cd00053 EGF Epidermal growth f 97.8 3.4E-05 7.4E-10 42.9 3.3 30 4-34 6-36 (36)
7 smart00181 EGF Epidermal growt 97.4 0.00019 4E-09 40.6 3.1 29 4-34 6-35 (35)
8 PF07974 EGF_2: EGF-like domai 97.2 0.00062 1.3E-08 39.1 3.4 26 5-33 7-32 (32)
9 KOG1219|consensus 97.1 0.00051 1.1E-08 69.9 4.6 35 2-37 3945-3980(4289)
10 PF12661 hEGF: Human growth fa 97.0 0.00032 7E-09 32.7 1.1 13 21-33 1-13 (13)
11 PF12955 DUF3844: Domain of un 96.7 0.0052 1.1E-07 44.3 5.8 34 4-37 13-63 (103)
12 KOG4289|consensus 96.6 0.0015 3.2E-08 64.5 2.9 34 3-37 1244-1277(2531)
13 PF12273 RCR: Chitin synthesis 96.6 0.0019 4.1E-08 47.6 2.8 11 61-71 18-28 (130)
14 PF07645 EGF_CA: Calcium-bindi 96.4 0.0029 6.3E-08 37.9 2.3 25 4-29 10-34 (42)
15 KOG3607|consensus 96.3 0.0084 1.8E-07 55.8 6.0 32 2-37 628-659 (716)
16 PF01102 Glycophorin_A: Glycop 96.3 0.0042 9.1E-08 46.0 3.1 27 41-67 65-91 (122)
17 PF02009 Rifin_STEVOR: Rifin/s 96.2 0.0034 7.3E-08 52.9 2.6 24 54-77 268-291 (299)
18 PF02439 Adeno_E3_CR2: Adenovi 96.0 0.013 2.8E-07 35.1 3.8 16 47-62 10-25 (38)
19 TIGR01478 STEVOR variant surfa 95.5 0.016 3.4E-07 48.7 3.7 26 45-70 262-287 (295)
20 PTZ00370 STEVOR; Provisional 95.4 0.017 3.8E-07 48.4 3.7 26 45-70 258-283 (296)
21 PF15330 SIT: SHP2-interacting 95.3 0.049 1.1E-06 39.4 5.2 12 88-99 45-56 (107)
22 KOG4289|consensus 95.2 0.013 2.9E-07 58.1 2.8 35 3-37 1721-1756(2531)
23 PTZ00046 rifin; Provisional 95.2 0.015 3.2E-07 50.1 2.7 23 56-78 329-351 (358)
24 PF12947 EGF_3: EGF domain; I 95.2 0.014 3.1E-07 34.2 1.8 27 4-31 6-32 (36)
25 TIGR01477 RIFIN variant surfac 95.1 0.016 3.5E-07 49.8 2.7 23 56-78 324-346 (353)
26 PF12273 RCR: Chitin synthesis 94.8 0.021 4.6E-07 42.0 2.4 16 58-73 18-33 (130)
27 PF06679 DUF1180: Protein of u 94.2 0.11 2.4E-06 40.3 5.1 19 64-82 116-134 (163)
28 KOG3514|consensus 93.4 0.041 8.9E-07 53.4 1.7 33 4-37 629-662 (1591)
29 PF06365 CD34_antigen: CD34/Po 93.4 0.48 1E-05 38.0 7.5 20 81-100 143-166 (202)
30 PF02480 Herpes_gE: Alphaherpe 93.1 0.026 5.6E-07 49.8 0.0 18 46-63 358-375 (439)
31 PF02009 Rifin_STEVOR: Rifin/s 93.0 0.071 1.5E-06 44.9 2.5 31 45-75 262-292 (299)
32 PF05454 DAG1: Dystroglycan (D 93.0 0.028 6E-07 47.2 0.0 24 47-70 152-175 (290)
33 PF01034 Syndecan: Syndecan do 92.7 0.044 9.6E-07 36.3 0.7 26 43-68 12-37 (64)
34 PF04478 Mid2: Mid2 like cell 92.3 0.024 5.2E-07 43.6 -1.1 13 53-65 63-75 (154)
35 PF13908 Shisa: Wnt and FGF in 91.7 0.12 2.5E-06 39.8 2.1 6 42-47 80-85 (179)
36 PF05393 Hum_adeno_E3A: Human 91.6 0.26 5.6E-06 34.8 3.4 16 55-70 45-60 (94)
37 PF01299 Lamp: Lysosome-associ 91.1 0.13 2.8E-06 42.9 1.8 19 48-66 278-296 (306)
38 PF15050 SCIMP: SCIMP protein 91.0 0.56 1.2E-05 34.9 4.9 11 60-70 26-36 (133)
39 PF04863 EGF_alliinase: Alliin 90.8 0.085 1.8E-06 34.0 0.4 33 4-36 17-52 (56)
40 PHA03283 envelope glycoprotein 90.5 0.33 7.1E-06 43.9 3.9 38 44-81 403-440 (542)
41 PF07204 Orthoreo_P10: Orthore 90.2 0.18 3.9E-06 35.9 1.6 9 62-70 61-69 (98)
42 PTZ00382 Variant-specific surf 90.1 0.088 1.9E-06 37.3 -0.0 25 42-66 68-93 (96)
43 KOG3516|consensus 89.8 0.25 5.4E-06 48.4 2.7 32 5-37 962-994 (1306)
44 PF08693 SKG6: Transmembrane a 89.4 0.088 1.9E-06 31.8 -0.4 6 41-46 12-17 (40)
45 PF07213 DAP10: DAP10 membrane 88.9 0.51 1.1E-05 32.5 3.0 18 56-73 49-66 (79)
46 PF10873 DUF2668: Protein of u 88.8 1.5 3.2E-05 33.6 5.8 22 40-61 61-82 (155)
47 PF05568 ASFV_J13L: African sw 88.7 0.51 1.1E-05 36.5 3.3 13 47-59 36-48 (189)
48 KOG1225|consensus 88.7 0.32 6.9E-06 44.0 2.4 30 2-36 314-343 (525)
49 KOG3516|consensus 88.6 0.26 5.6E-06 48.3 1.9 33 3-36 550-583 (1306)
50 PF15298 AJAP1_PANP_C: AJAP1/P 88.4 0.69 1.5E-05 37.0 3.9 11 95-105 166-176 (205)
51 PTZ00046 rifin; Provisional 88.2 0.3 6.5E-06 42.2 1.9 33 43-75 319-351 (358)
52 PF08374 Protocadherin: Protoc 88.0 0.38 8.2E-06 39.0 2.3 18 40-57 37-55 (221)
53 PF12259 DUF3609: Protein of u 88.0 0.6 1.3E-05 40.3 3.6 28 42-69 299-326 (361)
54 PF12877 DUF3827: Domain of un 87.9 0.59 1.3E-05 43.3 3.7 27 41-68 270-297 (684)
55 PF00558 Vpu: Vpu protein; In 87.8 0.71 1.5E-05 32.0 3.2 12 54-65 16-27 (81)
56 PF11980 DUF3481: Domain of un 87.8 0.84 1.8E-05 31.9 3.5 23 48-70 24-46 (87)
57 PF02158 Neuregulin: Neureguli 87.8 0.16 3.5E-06 44.3 0.0 35 43-77 9-45 (404)
58 smart00051 DSL delta serrate l 87.6 0.42 9E-06 31.3 1.9 22 8-33 42-63 (63)
59 PF06697 DUF1191: Protein of u 87.6 0.44 9.6E-06 39.9 2.5 52 40-93 213-264 (278)
60 TIGR01477 RIFIN variant surfac 87.6 0.36 7.7E-06 41.7 2.0 32 44-75 315-346 (353)
61 PF15069 FAM163: FAM163 family 87.5 0.9 2E-05 34.6 3.9 21 39-59 5-25 (143)
62 PF13908 Shisa: Wnt and FGF in 87.4 0.44 9.6E-06 36.6 2.3 17 42-58 77-93 (179)
63 PF12662 cEGF: Complement Clr- 86.4 0.49 1.1E-05 25.4 1.4 17 19-35 1-21 (24)
64 PHA03281 envelope glycoprotein 86.3 1.1 2.3E-05 41.0 4.4 17 61-77 578-594 (642)
65 KOG1225|consensus 86.1 0.71 1.5E-05 41.9 3.1 32 1-37 282-313 (525)
66 PF15102 TMEM154: TMEM154 prot 85.8 0.42 9E-06 36.5 1.3 19 41-59 60-78 (146)
67 PF00053 Laminin_EGF: Laminin 85.7 0.59 1.3E-05 28.4 1.7 25 10-37 11-35 (49)
68 PF01102 Glycophorin_A: Glycop 85.5 1.2 2.6E-05 33.0 3.6 30 44-73 64-94 (122)
69 PF10577 UPF0560: Uncharacteri 85.5 0.94 2E-05 42.9 3.7 25 44-68 276-300 (807)
70 KOG1217|consensus 84.5 1 2.2E-05 37.6 3.2 28 5-33 279-306 (487)
71 KOG1217|consensus 84.1 0.86 1.9E-05 38.0 2.6 32 4-36 177-208 (487)
72 cd00055 EGF_Lam Laminin-type e 83.3 1.1 2.4E-05 27.4 2.3 24 11-37 13-36 (50)
73 PF14991 MLANA: Protein melan- 82.8 0.29 6.3E-06 36.0 -0.6 14 42-55 25-38 (118)
74 KOG3637|consensus 82.5 1.5 3.3E-05 42.7 3.8 27 40-66 979-1005(1030)
75 PF06667 PspB: Phage shock pro 81.5 2.9 6.3E-05 28.4 3.9 20 53-72 13-32 (75)
76 PF11669 WBP-1: WW domain-bind 81.5 4 8.6E-05 29.1 4.8 6 95-100 85-90 (102)
77 PF03302 VSP: Giardia variant- 81.0 1.5 3.3E-05 38.1 3.0 27 42-68 369-396 (397)
78 PF15330 SIT: SHP2-interacting 80.8 6.7 0.00014 28.3 5.9 6 87-92 49-54 (107)
79 KOG1226|consensus 80.4 1.4 3.1E-05 41.5 2.8 24 6-34 557-580 (783)
80 KOG3514|consensus 80.1 1.1 2.4E-05 44.0 2.0 33 4-37 1024-1057(1591)
81 KOG4260|consensus 79.7 1.3 2.8E-05 37.5 2.0 35 3-37 149-185 (350)
82 PF12191 stn_TNFRSF12A: Tumour 79.5 0.68 1.5E-05 34.6 0.3 8 61-68 97-104 (129)
83 KOG4482|consensus 79.4 3.9 8.4E-05 36.0 4.9 28 42-70 299-326 (449)
84 PF14670 FXa_inhibition: Coagu 78.9 1.6 3.4E-05 25.5 1.7 18 11-29 11-28 (36)
85 PF11884 DUF3404: Domain of un 78.1 2.3 5.1E-05 35.3 3.1 15 58-72 246-260 (262)
86 KOG1836|consensus 78.1 1.2 2.5E-05 45.5 1.6 33 5-37 781-815 (1705)
87 PF01528 Herpes_glycop: Herpes 77.9 3.1 6.8E-05 36.2 4.0 19 75-93 336-354 (374)
88 PF12946 EGF_MSP1_1: MSP1 EGF 77.7 1.8 3.9E-05 25.7 1.7 26 4-29 5-30 (37)
89 PHA03265 envelope glycoprotein 77.6 1.2 2.5E-05 38.8 1.2 22 47-69 355-376 (402)
90 PF05399 EVI2A: Ectropic viral 75.7 2.6 5.7E-05 34.2 2.7 24 28-51 116-141 (227)
91 PF12768 Rax2: Cortical protei 75.4 7 0.00015 32.6 5.3 21 41-61 231-251 (281)
92 PF15099 PIRT: Phosphoinositid 75.3 2 4.3E-05 32.2 1.7 7 60-66 101-107 (129)
93 TIGR02976 phageshock_pspB phag 74.9 5.5 0.00012 27.0 3.8 19 54-72 14-32 (75)
94 KOG1094|consensus 74.4 12 0.00026 35.2 6.8 10 58-67 406-415 (807)
95 PF02038 ATP1G1_PLM_MAT8: ATP1 74.1 3.4 7.4E-05 26.1 2.4 11 47-57 17-27 (50)
96 PF05568 ASFV_J13L: African sw 73.7 4 8.7E-05 31.6 3.2 11 55-65 41-51 (189)
97 KOG4699|consensus 73.1 4.4 9.6E-05 31.6 3.3 35 58-93 18-52 (180)
98 COG5538 SEC66 Endoplasmic reti 73.1 4.4 9.6E-05 31.6 3.3 35 58-93 18-52 (180)
99 PF14979 TMEM52: Transmembrane 72.9 4.6 0.0001 31.0 3.3 9 38-46 18-26 (154)
100 PTZ00370 STEVOR; Provisional 72.6 3.6 7.8E-05 34.8 2.9 24 46-69 262-285 (296)
101 smart00180 EGF_Lam Laminin-typ 72.4 3.6 7.7E-05 24.9 2.1 18 19-36 17-34 (46)
102 PRK06531 yajC preprotein trans 72.3 2 4.2E-05 31.5 1.1 17 66-82 20-36 (113)
103 PF15048 OSTbeta: Organic solu 72.2 6.6 0.00014 29.3 3.9 8 61-68 54-61 (125)
104 KOG3653|consensus 71.7 14 0.0003 33.6 6.5 22 6-27 97-123 (534)
105 PRK09458 pspB phage shock prot 71.7 6.2 0.00014 26.9 3.4 21 53-73 13-33 (75)
106 PF02439 Adeno_E3_CR2: Adenovi 70.9 5 0.00011 24.0 2.4 16 42-57 8-23 (38)
107 PF05545 FixQ: Cbb3-type cytoc 70.2 4.1 8.8E-05 24.9 2.1 15 56-70 23-37 (49)
108 PF15347 PAG: Phosphoprotein a 70.1 7.3 0.00016 34.2 4.3 20 42-61 16-35 (428)
109 PRK00523 hypothetical protein; 70.0 6.3 0.00014 26.7 3.1 9 65-73 25-33 (72)
110 PRK13664 hypothetical protein; 70.0 11 0.00024 24.6 4.1 16 106-122 45-60 (62)
111 PF14914 LRRC37AB_C: LRRC37A/B 68.3 6 0.00013 30.4 3.0 11 55-65 135-145 (154)
112 PF01034 Syndecan: Syndecan do 68.3 1.3 2.8E-05 29.4 -0.5 27 41-68 14-40 (64)
113 PF15099 PIRT: Phosphoinositid 67.1 3.1 6.7E-05 31.2 1.2 33 47-79 83-116 (129)
114 TIGR01478 STEVOR variant surfa 67.0 6.9 0.00015 33.1 3.4 20 49-68 269-288 (295)
115 PF05961 Chordopox_A13L: Chord 66.9 4.1 8.8E-05 27.3 1.6 18 96-113 48-65 (68)
116 PHA03049 IMV membrane protein; 66.5 3.8 8.2E-05 27.4 1.4 17 97-113 49-65 (68)
117 PF15117 UPF0697: Uncharacteri 66.3 3.4 7.3E-05 29.2 1.2 12 60-71 30-41 (99)
118 PF02480 Herpes_gE: Alphaherpe 66.1 1.9 4.2E-05 38.1 0.0 6 63-68 372-377 (439)
119 PF01299 Lamp: Lysosome-associ 66.0 7.2 0.00016 32.4 3.4 33 47-79 273-305 (306)
120 KOG0793|consensus 65.8 5.3 0.00012 37.9 2.7 32 52-83 615-646 (1004)
121 PF14584 DUF4446: Protein of u 64.9 7.1 0.00015 29.8 2.9 6 115-120 107-112 (151)
122 PF15069 FAM163: FAM163 family 64.8 7.5 0.00016 29.6 2.9 27 43-69 6-32 (143)
123 PF04689 S1FA: DNA binding pro 63.6 15 0.00032 24.5 3.8 28 41-68 14-41 (69)
124 PF15345 TMEM51: Transmembrane 63.4 8.2 0.00018 31.6 3.1 12 89-100 119-130 (233)
125 COG4736 CcoQ Cbb3-type cytochr 63.4 8.6 0.00019 25.1 2.6 6 61-66 26-31 (60)
126 PF15065 NCU-G1: Lysosomal tra 63.3 2.3 4.9E-05 36.7 -0.1 12 60-71 338-349 (350)
127 PHA03286 envelope glycoprotein 62.8 8 0.00017 34.7 3.2 7 87-93 443-449 (492)
128 PF11157 DUF2937: Protein of u 62.6 9.4 0.0002 29.5 3.2 24 42-65 136-159 (167)
129 PF05984 Cytomega_UL20A: Cytom 61.6 17 0.00036 25.6 4.0 13 106-118 52-64 (100)
130 PF07204 Orthoreo_P10: Orthore 61.4 4.6 0.0001 28.8 1.2 28 41-69 44-71 (98)
131 PRK01844 hypothetical protein; 58.0 15 0.00033 24.8 3.2 7 67-73 26-32 (72)
132 PF12877 DUF3827: Domain of un 57.3 8.8 0.00019 35.8 2.6 31 42-72 268-298 (684)
133 PF07423 DUF1510: Protein of u 57.1 7.4 0.00016 31.5 1.9 10 48-57 21-30 (217)
134 KOG1214|consensus 56.2 9.5 0.00021 37.0 2.6 27 4-31 833-859 (1289)
135 PF11359 gpUL132: Glycoprotein 56.1 38 0.00082 27.7 5.7 11 104-114 122-132 (235)
136 PF00558 Vpu: Vpu protein; In 53.4 13 0.00029 25.6 2.4 26 47-72 12-38 (81)
137 PF13974 YebO: YebO-like prote 53.3 12 0.00026 25.8 2.1 18 51-68 5-22 (80)
138 PF08374 Protocadherin: Protoc 53.2 16 0.00034 29.8 3.1 20 40-59 41-60 (221)
139 PF05084 GRA6: Granule antigen 53.1 18 0.0004 28.5 3.4 22 47-68 154-175 (215)
140 KOG1226|consensus 52.7 17 0.00036 34.6 3.6 25 41-65 715-739 (783)
141 PF03229 Alpha_GJ: Alphavirus 52.6 19 0.00042 26.7 3.2 27 42-68 85-111 (126)
142 PF00974 Rhabdo_glycop: Rhabdo 52.4 4.7 0.0001 36.3 0.0 15 58-72 472-486 (501)
143 PF05510 Sarcoglycan_2: Sarcog 52.3 9.2 0.0002 33.5 1.8 16 53-68 297-312 (386)
144 PF07253 Gypsy: Gypsy protein; 52.1 21 0.00046 32.1 4.0 17 54-70 429-445 (472)
145 PF15048 OSTbeta: Organic solu 52.0 16 0.00035 27.2 2.8 28 44-71 40-67 (125)
146 PF12191 stn_TNFRSF12A: Tumour 52.0 4.8 0.0001 30.2 0.0 32 42-73 81-113 (129)
147 PTZ00234 variable surface prot 51.7 9.2 0.0002 34.0 1.7 7 69-75 395-401 (433)
148 PF13994 PgaD: PgaD-like prote 51.2 18 0.0004 26.7 3.1 20 56-75 76-95 (138)
149 PF06084 Cytomega_TRL10: Cytom 51.0 11 0.00024 28.1 1.8 16 12-27 11-27 (150)
150 PF12768 Rax2: Cortical protei 50.1 26 0.00057 29.2 4.1 29 42-70 229-257 (281)
151 PF14991 MLANA: Protein melan- 49.9 2.7 5.8E-05 31.0 -1.6 29 41-69 20-49 (118)
152 PF00599 Flu_M2: Influenza Mat 49.9 3.7 7.9E-05 28.9 -0.8 25 33-57 19-43 (97)
153 COG3763 Uncharacterized protei 49.6 28 0.0006 23.5 3.4 9 65-73 24-32 (71)
154 PF10873 DUF2668: Protein of u 48.9 22 0.00047 27.4 3.1 40 27-67 52-91 (155)
155 KOG3488|consensus 48.5 26 0.00057 23.8 3.1 23 46-68 54-76 (81)
156 PTZ00382 Variant-specific surf 47.9 3.6 7.8E-05 29.0 -1.2 49 24-72 42-95 (96)
157 PRK11901 hypothetical protein; 47.6 11 0.00025 32.3 1.6 16 42-57 38-53 (327)
158 PHA03289 envelope glycoprotein 47.6 33 0.00072 29.6 4.3 9 85-93 313-321 (352)
159 PF11027 DUF2615: Protein of u 47.3 43 0.00094 24.1 4.3 13 59-71 66-78 (103)
160 PRK04778 septation ring format 47.0 20 0.00043 32.5 3.2 6 63-68 21-26 (569)
161 PF06679 DUF1180: Protein of u 46.9 26 0.00056 27.2 3.3 38 55-92 104-141 (163)
162 PTZ00045 apical membrane antig 46.9 34 0.00074 31.6 4.6 8 20-27 477-484 (595)
163 PHA03290 envelope glycoprotein 46.4 35 0.00075 29.6 4.3 18 40-57 273-290 (357)
164 PF02060 ISK_Channel: Slow vol 45.9 44 0.00095 25.1 4.3 16 58-73 59-74 (129)
165 TIGR02976 phageshock_pspB phag 44.6 33 0.00072 23.2 3.2 29 48-76 5-33 (75)
166 PF12911 OppC_N: N-terminal TM 44.6 29 0.00063 21.1 2.8 7 48-54 21-27 (56)
167 smart00274 FOLN Follistatin-N- 44.5 34 0.00073 18.5 2.7 22 2-23 2-24 (26)
168 PHA02669 hypothetical protein; 44.0 11 0.00025 29.6 1.0 19 65-83 32-50 (210)
169 KOG3054|consensus 43.8 30 0.00065 29.0 3.4 7 61-67 20-26 (299)
170 PF09064 Tme5_EGF_like: Thromb 43.7 13 0.00029 21.6 1.0 12 18-29 16-27 (34)
171 PRK01741 cell division protein 43.6 20 0.00043 30.9 2.4 18 44-61 7-24 (332)
172 PTZ00233 variable surface prot 43.6 23 0.0005 32.2 3.0 25 47-71 442-468 (509)
173 TIGR02736 cbb3_Q_epsi cytochro 43.3 26 0.00056 22.6 2.4 18 64-81 20-37 (56)
174 PF06247 Plasmod_Pvs28: Plasmo 43.0 17 0.00038 29.0 1.9 32 2-34 134-165 (197)
175 PF01708 Gemini_mov: Geminivir 42.2 35 0.00076 24.1 3.1 10 86-95 73-82 (91)
176 PHA02681 ORF089 virion membran 41.5 30 0.00066 24.2 2.7 19 96-114 47-65 (92)
177 PF06143 Baculo_11_kDa: Baculo 40.8 32 0.0007 23.9 2.7 11 47-57 41-51 (84)
178 PLN02745 Putative pectinestera 40.1 42 0.0009 31.1 4.1 8 53-60 36-43 (596)
179 PF09289 FOLN: Follistatin/Ost 39.9 30 0.00064 18.2 1.9 17 6-22 6-22 (22)
180 PF11044 TMEMspv1-c74-12: Plec 39.6 55 0.0012 20.3 3.3 11 46-56 8-18 (49)
181 KOG2052|consensus 39.5 15 0.00033 33.1 1.2 20 112-131 242-261 (513)
182 PRK10905 cell division protein 39.5 21 0.00045 30.7 1.9 13 45-57 3-15 (328)
183 PF12301 CD99L2: CD99 antigen 39.4 39 0.00084 26.3 3.3 6 46-51 120-125 (169)
184 PF05808 Podoplanin: Podoplani 39.4 9.9 0.00021 29.6 0.0 20 42-61 131-150 (162)
185 PF12259 DUF3609: Protein of u 38.9 18 0.00039 31.3 1.5 34 47-80 301-334 (361)
186 PF05283 MGC-24: Multi-glycosy 38.8 32 0.00068 27.3 2.8 14 46-59 164-177 (186)
187 COG1862 YajC Preprotein transl 38.3 23 0.0005 25.2 1.7 23 59-81 20-42 (97)
188 PF10265 DUF2217: Uncharacteri 38.0 52 0.0011 30.0 4.3 9 63-71 33-41 (514)
189 PF01414 DSL: Delta serrate li 37.7 22 0.00047 23.1 1.4 14 20-33 50-63 (63)
190 PF14828 Amnionless: Amnionles 37.4 32 0.00069 30.5 2.9 11 86-96 391-401 (437)
191 PRK08455 fliL flagellar basal 37.3 49 0.0011 25.8 3.6 18 48-65 25-42 (182)
192 COG3115 ZipA Cell division pro 37.1 38 0.00083 29.0 3.1 22 45-66 9-30 (324)
193 PF15183 MRAP: Melanocortin-2 36.8 49 0.0011 23.2 3.1 18 40-57 38-55 (90)
194 KOG1024|consensus 36.7 77 0.0017 28.7 5.1 10 88-97 238-247 (563)
195 PF12725 DUF3810: Protein of u 36.4 41 0.00088 28.4 3.3 6 95-100 82-87 (318)
196 PF02285 COX8: Cytochrome oxid 36.1 69 0.0015 19.6 3.4 25 48-72 19-43 (44)
197 PF03896 TRAP_alpha: Transloco 35.5 74 0.0016 26.7 4.6 11 117-127 252-262 (285)
198 PHA03240 envelope glycoprotein 35.3 44 0.00096 27.5 3.1 12 48-59 218-229 (258)
199 PF12301 CD99L2: CD99 antigen 35.3 50 0.0011 25.7 3.3 19 41-59 112-130 (169)
200 PHA03105 EEV glycoprotein; Pro 34.6 42 0.00091 26.3 2.8 13 56-68 18-30 (188)
201 PHA03281 envelope glycoprotein 34.6 62 0.0013 30.0 4.2 29 46-74 559-587 (642)
202 PF13980 UPF0370: Uncharacteri 34.1 90 0.002 20.5 3.8 15 106-121 44-58 (63)
203 KOG1025|consensus 33.8 33 0.00071 33.7 2.5 15 17-31 562-578 (1177)
204 PF08391 Ly49: Ly49-like prote 33.1 14 0.00031 27.2 0.0 27 42-68 7-34 (119)
205 KOG0994|consensus 32.7 23 0.0005 35.7 1.3 26 11-36 924-950 (1758)
206 TIGR01941 nqrF NADH:ubiquinone 32.5 41 0.00089 28.9 2.7 15 59-73 15-29 (405)
207 cd01328 FSL_SPARC Follistatin- 32.3 50 0.0011 22.9 2.6 23 5-27 6-28 (86)
208 PF15050 SCIMP: SCIMP protein 32.2 41 0.00088 25.2 2.3 21 48-68 18-38 (133)
209 PF05749 Rubella_E2: Rubella m 32.1 1.3E+02 0.0028 24.0 5.2 20 18-37 192-211 (267)
210 PF09802 Sec66: Preprotein tra 31.8 44 0.00096 26.5 2.6 43 49-93 10-53 (190)
211 PF09402 MSC: Man1-Src1p-C-ter 31.0 14 0.00031 30.7 -0.4 9 28-36 200-208 (334)
212 PF14979 TMEM52: Transmembrane 30.9 1.2E+02 0.0026 23.3 4.7 7 60-66 37-43 (154)
213 KOG4818|consensus 30.6 51 0.0011 28.8 2.9 12 48-59 335-346 (362)
214 PF04478 Mid2: Mid2 like cell 30.6 11 0.00023 29.2 -1.1 28 54-86 61-88 (154)
215 PF01561 Hanta_G2: Hantavirus 30.3 23 0.0005 31.7 0.8 8 17-24 430-437 (485)
216 PF11694 DUF3290: Protein of u 30.2 68 0.0015 24.4 3.3 6 115-120 125-130 (149)
217 PRK10884 SH3 domain-containing 30.1 30 0.00065 27.6 1.4 15 47-61 177-191 (206)
218 PRK14750 kdpF potassium-transp 29.6 92 0.002 17.4 2.9 12 46-57 6-17 (29)
219 COG5487 Small integral membran 29.3 1E+02 0.0022 19.6 3.4 18 49-66 33-50 (54)
220 PF01589 Alpha_E1_glycop: Alph 29.3 63 0.0014 29.2 3.3 23 44-66 479-501 (502)
221 COG4477 EzrA Negative regulato 28.9 49 0.0011 30.4 2.7 11 58-68 15-25 (570)
222 PF14851 FAM176: FAM176 family 28.7 32 0.00069 26.4 1.3 6 60-65 42-47 (153)
223 PF05454 DAG1: Dystroglycan (D 28.7 19 0.00041 30.4 0.0 10 118-127 256-265 (290)
224 PHA02902 putative IMV membrane 28.6 96 0.0021 20.7 3.3 17 95-111 48-64 (70)
225 PRK14584 hmsS hemin storage sy 28.5 68 0.0015 24.7 3.0 11 58-68 77-87 (153)
226 PF05297 Herpes_LMP1: Herpesvi 28.3 19 0.00042 30.9 0.0 9 28-36 13-21 (381)
227 PF07669 Eco57I: Eco57I restri 27.5 25 0.00055 24.5 0.5 12 89-100 4-15 (106)
228 KOG1214|consensus 27.3 57 0.0012 32.0 2.8 30 2-32 740-771 (1289)
229 PF04881 Adeno_GP19K: Adenovir 27.1 36 0.00078 25.7 1.3 9 60-68 121-129 (139)
230 PRK11056 hypothetical protein; 27.1 81 0.0017 23.4 3.1 12 60-71 104-115 (120)
231 KOG3637|consensus 26.7 65 0.0014 31.8 3.2 34 41-75 977-1010(1030)
232 KOG2767|consensus 26.7 55 0.0012 28.7 2.5 17 108-124 195-211 (400)
233 PF07297 DPM2: Dolichol phosph 26.7 1E+02 0.0022 21.1 3.3 7 62-68 68-74 (78)
234 PF14851 FAM176: FAM176 family 26.6 73 0.0016 24.5 2.9 15 47-61 32-46 (153)
235 PRK14585 pgaD putative PGA bio 26.4 79 0.0017 23.9 3.0 7 63-69 70-76 (137)
236 PHA03164 hypothetical protein; 26.4 69 0.0015 22.1 2.4 6 20-25 35-40 (88)
237 PF07271 Cytadhesin_P30: Cytad 26.4 51 0.0011 27.7 2.2 10 47-56 77-86 (279)
238 PHA03294 envelope glycoprotein 26.1 55 0.0012 31.6 2.6 28 40-67 803-830 (835)
239 TIGR00383 corA magnesium Mg(2+ 25.7 80 0.0017 25.8 3.2 6 62-67 309-314 (318)
240 TIGR02205 septum_zipA cell div 25.6 34 0.00074 28.7 1.0 9 49-57 7-15 (284)
241 PF10868 DUF2667: Protein of u 25.4 27 0.00059 24.6 0.3 19 8-26 59-79 (90)
242 PF03302 VSP: Giardia variant- 25.3 32 0.0007 29.9 0.8 31 42-72 365-396 (397)
243 PRK00269 zipA cell division pr 25.0 1E+02 0.0022 26.2 3.7 6 64-69 24-29 (293)
244 COG4059 MtrE Tetrahydromethano 24.6 67 0.0015 26.7 2.5 12 71-82 285-296 (304)
245 PF01683 EB: EB module; Inter 24.3 88 0.0019 18.7 2.5 10 20-29 37-46 (52)
246 cd00930 Cyt_c_Oxidase_VIII Cyt 23.9 1.6E+02 0.0035 17.9 3.5 23 49-71 20-42 (43)
247 PF10361 DUF2434: Protein of u 23.9 3.3E+02 0.0071 23.2 6.5 34 7-40 7-43 (296)
248 PF11743 DUF3301: Protein of u 23.2 68 0.0015 22.4 2.0 9 107-115 64-72 (97)
249 PF06295 DUF1043: Protein of u 23.0 83 0.0018 23.0 2.6 9 120-128 108-116 (128)
250 KOG4433|consensus 23.0 1.1E+02 0.0023 28.0 3.6 15 42-56 45-59 (526)
251 KOG1218|consensus 22.7 68 0.0015 25.7 2.3 19 18-36 160-178 (316)
252 PHA03099 epidermal growth fact 22.5 1.1E+02 0.0024 23.1 3.1 29 49-77 104-133 (139)
253 PF11118 DUF2627: Protein of u 22.4 1.1E+02 0.0023 21.0 2.8 24 47-70 45-68 (77)
254 PF15234 LAT: Linker for activ 22.1 4.3E+02 0.0092 21.4 6.5 15 55-69 18-32 (230)
255 PF15106 TMEM156: TMEM156 prot 22.1 1.1E+02 0.0023 25.0 3.2 6 73-78 208-213 (226)
256 PF07226 DUF1422: Protein of u 21.9 1.1E+02 0.0024 22.6 2.9 10 61-70 105-114 (117)
257 PF05478 Prominin: Prominin; 21.5 71 0.0015 30.3 2.4 7 60-66 112-118 (806)
258 PF02013 CBM_10: Cellulose or 21.3 15 0.00032 21.6 -1.4 14 111-124 17-30 (36)
259 PF15013 CCSMST1: CCSMST1 fami 21.2 64 0.0014 22.0 1.5 14 55-68 40-53 (77)
260 PRK10847 hypothetical protein; 21.1 1.3E+02 0.0028 23.7 3.5 6 66-71 206-211 (219)
261 PRK06073 NADH dehydrogenase su 20.9 2.4E+02 0.0052 20.7 4.7 7 93-99 47-53 (124)
262 PF05624 LSR: Lipolysis stimul 20.7 1.8E+02 0.0039 18.2 3.3 7 47-53 10-16 (49)
263 PF14155 DUF4307: Domain of un 20.6 1.3E+02 0.0029 21.4 3.2 16 78-93 39-54 (112)
264 PF01561 Hanta_G2: Hantavirus 20.6 39 0.00084 30.3 0.4 6 62-67 475-480 (485)
265 PF11884 DUF3404: Domain of un 20.6 93 0.002 26.0 2.6 18 58-75 243-260 (262)
266 KOG1631|consensus 20.5 1.7E+02 0.0036 24.3 4.0 10 117-126 230-239 (261)
267 PRK03427 cell division protein 20.5 1.1E+02 0.0025 26.4 3.2 11 58-68 20-30 (333)
268 PHA03270 envelope glycoprotein 20.4 23 0.00051 31.7 -1.0 16 42-57 432-447 (466)
269 PTZ00208 65 kDa invariant surf 20.3 79 0.0017 28.2 2.2 7 61-67 406-412 (436)
No 1
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=99.20 E-value=1.8e-11 Score=90.95 Aligned_cols=68 Identities=24% Similarity=0.465 Sum_probs=45.6
Q ss_pred CCCCCCCcEEEeC-CCCCCeeeecCCCcCCCcccccc------------ceeeehhHHHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 3 KGYCENKGTCVKD-ARGQPSCRCVGSFIGPHCAQKSE------------FAYIAGGIAATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 3 ~~~C~NgG~C~~~-~~~~~~C~C~~gy~G~rCe~~~~------------~~~ia~~i~~~Vl~lilIvllv~~~~~r~rr 69 (143)
+++|.|| +|... +...+.|+|+.||+|.|||...- .-+|++.++.+++++|++..++..++.+.||
T Consensus 50 ~~YClHG-~C~yI~dl~~~~CrC~~GYtGeRCEh~dLl~~~~~~k~n~~t~Yia~~~il~il~~i~is~~~~~~yr~~r~ 128 (139)
T PHA03099 50 DGYCLHG-DCIHARDIDGMYCRCSHGYTGIRCQHVVLVDYQRSEKPNTTTSYIPSPGIVLVLVGIIITCCLLSVYRFTRR 128 (139)
T ss_pred CCEeECC-EEEeeccCCCceeECCCCcccccccceeeeeeeccccccchhhhhhhhHHHHHHHHHHHHHHHHhhheeeec
Confidence 5899985 99988 67899999999999999998541 1256665555555455444444444444343
Q ss_pred cc
Q psy3616 70 RR 71 (143)
Q Consensus 70 ~k 71 (143)
+|
T Consensus 129 ~~ 130 (139)
T PHA03099 129 TK 130 (139)
T ss_pred cc
Confidence 33
No 2
>KOG1219|consensus
Probab=98.77 E-value=1.8e-08 Score=100.05 Aligned_cols=33 Identities=24% Similarity=0.639 Sum_probs=29.6
Q ss_pred CCCCCCCcEEEeCCCCCCeeeecCCCcCCCcccc
Q psy3616 3 KGYCENKGTCVKDARGQPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 3 ~~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~ 36 (143)
.+||++||+|... .+.+.|.|+.+|+|.|||..
T Consensus 3908 snPC~~GgtCip~-~n~f~CnC~~gyTG~~Ce~~ 3940 (4289)
T KOG1219|consen 3908 SNPCLTGGTCIPF-YNGFLCNCPNGYTGKRCEAR 3940 (4289)
T ss_pred CCCCCCCCEEEec-CCCeeEeCCCCccCceeecc
Confidence 4789999999988 88899999999999999976
No 3
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=98.43 E-value=1.3e-07 Score=54.19 Aligned_cols=30 Identities=30% Similarity=0.759 Sum_probs=26.6
Q ss_pred CCCCCCCcEEEeCCCCCCeeeecCCCcCCC
Q psy3616 3 KGYCENKGTCVKDARGQPSCRCVGSFIGPH 32 (143)
Q Consensus 3 ~~~C~NgG~C~~~~~~~~~C~C~~gy~G~r 32 (143)
.+||.|+|+|+....+.+.|.|++||+|++
T Consensus 3 ~~~C~n~g~C~~~~~~~y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 3 SNPCQNGGTCIDLPGGGYTCECPPGYTGKR 32 (32)
T ss_dssp TTSSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred CCcCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence 469999999999933899999999999985
No 4
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=98.00 E-value=7.6e-06 Score=47.10 Aligned_cols=30 Identities=37% Similarity=0.919 Sum_probs=27.6
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCc-CCCcc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFI-GPHCA 34 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~-G~rCe 34 (143)
++|.|+|+|+.. .+.+.|.|+.+|. |.+|+
T Consensus 9 ~~C~~~~~C~~~-~g~~~C~C~~g~~~g~~C~ 39 (39)
T smart00179 9 NPCQNGGTCVNT-VGSYRCECPPGYTDGRNCE 39 (39)
T ss_pred CCcCCCCEeECC-CCCeEeECCCCCccCCcCC
Confidence 589999999988 8889999999999 99996
No 5
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.88 E-value=1.7e-05 Score=44.88 Aligned_cols=30 Identities=37% Similarity=0.913 Sum_probs=27.4
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCcCCCcc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFIGPHCA 34 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe 34 (143)
.+|.|+|.|... .+.+.|.|+.+|.|.+|+
T Consensus 9 ~~C~~~~~C~~~-~~~~~C~C~~g~~g~~C~ 38 (38)
T cd00054 9 NPCQNGGTCVNT-VGSYRCSCPPGYTGRNCE 38 (38)
T ss_pred CCcCCCCEeECC-CCCeEeECCCCCcCCcCC
Confidence 589999999988 778999999999999986
No 6
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=97.77 E-value=3.4e-05 Score=42.87 Aligned_cols=30 Identities=37% Similarity=0.875 Sum_probs=27.4
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCcCC-Ccc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFIGP-HCA 34 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~G~-rCe 34 (143)
++|.|++.|+.. .+.+.|.|+.||.|. +|+
T Consensus 6 ~~C~~~~~C~~~-~~~~~C~C~~g~~g~~~C~ 36 (36)
T cd00053 6 NPCSNGGTCVNT-PGSYRCVCPPGYTGDRSCE 36 (36)
T ss_pred CCCCCCCEEecC-CCCeEeECCCCCcccCCcC
Confidence 789999999998 788999999999999 875
No 7
>smart00181 EGF Epidermal growth factor-like domain.
Probab=97.41 E-value=0.00019 Score=40.62 Aligned_cols=29 Identities=34% Similarity=0.948 Sum_probs=26.2
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCcC-CCcc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFIG-PHCA 34 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~G-~rCe 34 (143)
++|.|+ +|... .+.+.|.|+.||.| ..|+
T Consensus 6 ~~C~~~-~C~~~-~~~~~C~C~~g~~g~~~C~ 35 (35)
T smart00181 6 GPCSNG-TCINT-PGSYTCSCPPGYTGDKRCE 35 (35)
T ss_pred CCCCCC-EEECC-CCCeEeECCCCCccCCccC
Confidence 489988 99988 88999999999999 8885
No 8
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=97.16 E-value=0.00062 Score=39.13 Aligned_cols=26 Identities=38% Similarity=1.034 Sum_probs=23.1
Q ss_pred CCCCCcEEEeCCCCCCeeeecCCCcCCCc
Q psy3616 5 YCENKGTCVKDARGQPSCRCVGSFIGPHC 33 (143)
Q Consensus 5 ~C~NgG~C~~~~~~~~~C~C~~gy~G~rC 33 (143)
.|.|+|+|+.. ..+|.|.+||+|+.|
T Consensus 7 ~C~~~G~C~~~---~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 7 ICSGHGTCVSP---CGRCVCDSGYTGPDC 32 (32)
T ss_pred ccCCCCEEeCC---CCEEECCCCCcCCCC
Confidence 59999999954 469999999999987
No 9
>KOG1219|consensus
Probab=97.13 E-value=0.00051 Score=69.91 Aligned_cols=35 Identities=26% Similarity=0.814 Sum_probs=31.4
Q ss_pred CC-CCCCCCcEEEeCCCCCCeeeecCCCcCCCccccc
Q psy3616 2 CK-GYCENKGTCVKDARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 2 C~-~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
|. ++|.|||.|+.. .+.+.|.|.+||.|+.|....
T Consensus 3945 Cs~n~C~~gg~C~n~-~gsf~CncT~g~~gr~c~~~~ 3980 (4289)
T KOG1219|consen 3945 CSKNVCGTGGQCINI-PGSFHCNCTPGILGRTCCAEK 3980 (4289)
T ss_pred cccccccCCceeecc-CCceEeccChhHhcccCcccc
Confidence 44 789999999999 899999999999999998653
No 10
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=97.02 E-value=0.00032 Score=32.75 Aligned_cols=13 Identities=38% Similarity=1.354 Sum_probs=11.2
Q ss_pred eeeecCCCcCCCc
Q psy3616 21 SCRCVGSFIGPHC 33 (143)
Q Consensus 21 ~C~C~~gy~G~rC 33 (143)
.|.|++||+|++|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 5999999999998
No 11
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=96.71 E-value=0.0052 Score=44.34 Aligned_cols=34 Identities=32% Similarity=0.798 Sum_probs=27.0
Q ss_pred CCCCCCcEEEeCC----CCCCeeeecC-------------CCcCCCccccc
Q psy3616 4 GYCENKGTCVKDA----RGQPSCRCVG-------------SFIGPHCAQKS 37 (143)
Q Consensus 4 ~~C~NgG~C~~~~----~~~~~C~C~~-------------gy~G~rCe~~~ 37 (143)
+-|.++|.|+... ..=+.|.|.+ .|.|.-|+.+.
T Consensus 13 n~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKkD 63 (103)
T PF12955_consen 13 NNCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKGKTTHWGGPACQKKD 63 (103)
T ss_pred cCCCCCceEeeccCCCccceEEEEeeccccccccccCceeeeccccccccc
Confidence 5699999999871 2348999999 48899999753
No 12
>KOG4289|consensus
Probab=96.59 E-value=0.0015 Score=64.50 Aligned_cols=34 Identities=35% Similarity=0.871 Sum_probs=31.5
Q ss_pred CCCCCCCcEEEeCCCCCCeeeecCCCcCCCccccc
Q psy3616 3 KGYCENKGTCVKDARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 3 ~~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
.+||.|+|+|... .+.+.|.|.++|+|++||...
T Consensus 1244 s~pC~nng~C~sr-EggYtCeCrpg~tGehCEvs~ 1277 (2531)
T KOG4289|consen 1244 SGPCGNNGRCRSR-EGGYTCECRPGFTGEHCEVSA 1277 (2531)
T ss_pred cCCCCCCCceEEe-cCceeEEecCCccccceeeec
Confidence 4799999999999 999999999999999999854
No 13
>PF12273 RCR: Chitin synthesis regulation, resistance to Congo red; InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall []. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5 [].
Probab=96.58 E-value=0.0019 Score=47.63 Aligned_cols=11 Identities=36% Similarity=0.773 Sum_probs=4.6
Q ss_pred HHHHHHhhhcc
Q psy3616 61 WMICARSERRR 71 (143)
Q Consensus 61 ~~~~~r~rr~k 71 (143)
+++|++|||+|
T Consensus 18 ~~~~~~rRR~r 28 (130)
T PF12273_consen 18 LFYCHNRRRRR 28 (130)
T ss_pred HHHHHHHHHhh
Confidence 33444444433
No 14
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=96.36 E-value=0.0029 Score=37.88 Aligned_cols=25 Identities=36% Similarity=0.842 Sum_probs=23.2
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFI 29 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~ 29 (143)
+.|.+++.|+.. .+++.|.|++||.
T Consensus 10 ~~C~~~~~C~N~-~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 10 HNCPENGTCVNT-EGSYSCSCPPGYE 34 (42)
T ss_dssp SSSSTTSEEEEE-TTEEEEEESTTEE
T ss_pred CcCCCCCEEEcC-CCCEEeeCCCCcE
Confidence 469889999999 9999999999998
No 15
>KOG3607|consensus
Probab=96.30 E-value=0.0084 Score=55.82 Aligned_cols=32 Identities=25% Similarity=0.759 Sum_probs=27.6
Q ss_pred CCCCCCCCcEEEeCCCCCCeeeecCCCcCCCccccc
Q psy3616 2 CKGYCENKGTCVKDARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 2 C~~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
|..-|..+|+|... ..|+|..||.++.|+.+.
T Consensus 628 ~~~~C~g~GVCnn~----~~ChC~~gwapp~C~~~~ 659 (716)
T KOG3607|consen 628 CPTTCNGHGVCNNE----LNCHCEPGWAPPFCFIFG 659 (716)
T ss_pred cccccCCCcccCCC----cceeeCCCCCCCcccccc
Confidence 44558888999877 899999999999999875
No 16
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=96.26 E-value=0.0042 Score=46.04 Aligned_cols=27 Identities=26% Similarity=0.146 Sum_probs=16.2
Q ss_pred eeehhHHHHHHHHHHHHHHHHHHHHHh
Q psy3616 41 YIAGGIAATVVFLIIIALFVWMICARS 67 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~~~~~r~ 67 (143)
.|+++|.|+++.+|++++++++++.|+
T Consensus 65 ~i~~Ii~gv~aGvIg~Illi~y~irR~ 91 (122)
T PF01102_consen 65 AIIGIIFGVMAGVIGIILLISYCIRRL 91 (122)
T ss_dssp CHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred ceeehhHHHHHHHHHHHHHHHHHHHHH
Confidence 456666667776666666655555443
No 17
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=96.21 E-value=0.0034 Score=52.89 Aligned_cols=24 Identities=21% Similarity=0.443 Sum_probs=12.4
Q ss_pred HHHHHHHHHHHHHhhhcccccccc
Q psy3616 54 IIIALFVWMICARSERRREPKKLV 77 (143)
Q Consensus 54 ilIvllv~~~~~r~rr~kk~k~~~ 77 (143)
|++|++++++.+|+||+||.|+++
T Consensus 268 IVLIMvIIYLILRYRRKKKmkKKl 291 (299)
T PF02009_consen 268 IVLIMVIIYLILRYRRKKKMKKKL 291 (299)
T ss_pred HHHHHHHHHHHHHHHHHhhhhHHH
Confidence 333444555555666655555444
No 18
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host []. This region called CR1 (conserved region 1) [] is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=96.04 E-value=0.013 Score=35.06 Aligned_cols=16 Identities=31% Similarity=0.424 Sum_probs=6.9
Q ss_pred HHHHHHHHHHHHHHHH
Q psy3616 47 AATVVFLIIIALFVWM 62 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~ 62 (143)
+++++.+++|+++++.
T Consensus 10 v~V~vg~~iiii~~~~ 25 (38)
T PF02439_consen 10 VAVVVGMAIIIICMFY 25 (38)
T ss_pred HHHHHHHHHHHHHHHH
Confidence 3444444444444443
No 19
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=95.51 E-value=0.016 Score=48.67 Aligned_cols=26 Identities=31% Similarity=0.556 Sum_probs=17.2
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 45 GIAATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 45 ~i~~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
||+++|+++|.++++++.+|.+|||+
T Consensus 262 giaalvllil~vvliiLYiWlyrrRK 287 (295)
T TIGR01478 262 GIAALVLIILTVVLIILYIWLYRRRK 287 (295)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence 35666666666777777777766544
No 20
>PTZ00370 STEVOR; Provisional
Probab=95.42 E-value=0.017 Score=48.44 Aligned_cols=26 Identities=31% Similarity=0.566 Sum_probs=16.9
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 45 GIAATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 45 ~i~~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
||+++|++++.++++++++|.+|||+
T Consensus 258 giaalvllil~vvliilYiwlyrrRK 283 (296)
T PTZ00370 258 GIAALVLLILAVVLIILYIWLYRRRK 283 (296)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence 34666666666667777777766554
No 21
>PF15330 SIT: SHP2-interacting transmembrane adaptor protein, SIT
Probab=95.25 E-value=0.049 Score=39.42 Aligned_cols=12 Identities=17% Similarity=0.268 Sum_probs=7.2
Q ss_pred eeeecCCCCCcc
Q psy3616 88 VNFYYGGAPYAE 99 (143)
Q Consensus 88 ~N~~~g~ppy~e 99 (143)
-+..|||-.+..
T Consensus 45 ~~p~YgNL~~~q 56 (107)
T PF15330_consen 45 DDPCYGNLELQQ 56 (107)
T ss_pred CCcccccccccc
Confidence 467777655543
No 22
>KOG4289|consensus
Probab=95.23 E-value=0.013 Score=58.10 Aligned_cols=35 Identities=34% Similarity=0.850 Sum_probs=30.5
Q ss_pred CCCCCCCcEEEeC-CCCCCeeeecCCCcCCCccccc
Q psy3616 3 KGYCENKGTCVKD-ARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 3 ~~~C~NgG~C~~~-~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
-+||.|.|+|+.. ...++.|.|++||.|++||...
T Consensus 1721 lnpc~~~g~Cv~sp~a~GY~C~C~~g~~G~~Ce~~~ 1756 (2531)
T KOG4289|consen 1721 LNPCENQGTCVRSPGAHGYTCECPPGYTGPYCELRA 1756 (2531)
T ss_pred ccccccCceeecCCCCCceeEECCCcccCcchhhhc
Confidence 3799999999988 3458999999999999999864
No 23
>PTZ00046 rifin; Provisional
Probab=95.18 E-value=0.015 Score=50.13 Aligned_cols=23 Identities=17% Similarity=0.356 Sum_probs=14.5
Q ss_pred HHHHHHHHHHHhhhccccccccc
Q psy3616 56 IALFVWMICARSERRREPKKLVA 78 (143)
Q Consensus 56 Ivllv~~~~~r~rr~kk~k~~~~ 78 (143)
++++++++..|+||+||.|++++
T Consensus 329 LIMvIIYLILRYRRKKKMkKKLQ 351 (358)
T PTZ00046 329 LIMVIIYLILRYRRKKKMKKKLQ 351 (358)
T ss_pred HHHHHHHHHHHhhhcchhHHHHH
Confidence 34455566677777777776653
No 24
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=95.17 E-value=0.014 Score=34.21 Aligned_cols=27 Identities=30% Similarity=0.767 Sum_probs=21.2
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCcCC
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFIGP 31 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~G~ 31 (143)
..|.-+++|+.. .+.+.|.|.+||.|+
T Consensus 6 ~~C~~nA~C~~~-~~~~~C~C~~Gy~Gd 32 (36)
T PF12947_consen 6 GGCHPNATCTNT-GGSYTCTCKPGYEGD 32 (36)
T ss_dssp GGS-TTCEEEE--TTSEEEEE-CEEECC
T ss_pred CCCCCCcEeecC-CCCEEeECCCCCccC
Confidence 468888999999 779999999999986
No 25
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=95.10 E-value=0.016 Score=49.82 Aligned_cols=23 Identities=17% Similarity=0.356 Sum_probs=14.3
Q ss_pred HHHHHHHHHHHhhhccccccccc
Q psy3616 56 IALFVWMICARSERRREPKKLVA 78 (143)
Q Consensus 56 Ivllv~~~~~r~rr~kk~k~~~~ 78 (143)
++++++++..|+||+||.|++++
T Consensus 324 LIMvIIYLILRYRRKKKMkKKLQ 346 (353)
T TIGR01477 324 LIMVIIYLILRYRRKKKMKKKLQ 346 (353)
T ss_pred HHHHHHHHHHHhhhcchhHHHHH
Confidence 33455566667777777766653
No 26
>PF12273 RCR: Chitin synthesis regulation, resistance to Congo red; InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall []. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5 [].
Probab=94.83 E-value=0.021 Score=41.97 Aligned_cols=16 Identities=31% Similarity=0.370 Sum_probs=7.0
Q ss_pred HHHHHHHHHhhhcccc
Q psy3616 58 LFVWMICARSERRREP 73 (143)
Q Consensus 58 llv~~~~~r~rr~kk~ 73 (143)
++++.-..|+||..+.
T Consensus 18 ~~~~~~rRR~r~G~~P 33 (130)
T PF12273_consen 18 LFYCHNRRRRRRGLQP 33 (130)
T ss_pred HHHHHHHHHhhcCCCC
Confidence 3344444444444333
No 27
>PF06679 DUF1180: Protein of unknown function (DUF1180); InterPro: IPR009565 This entry consists of several hypothetical eukaryotic proteins thought to be membrane proteins. Their function is unknown.
Probab=94.23 E-value=0.11 Score=40.32 Aligned_cols=19 Identities=16% Similarity=0.134 Sum_probs=8.6
Q ss_pred HHHhhhccccccccccccC
Q psy3616 64 CARSERRREPKKLVAQTND 82 (143)
Q Consensus 64 ~~r~rr~kk~k~~~~~~~~ 82 (143)
.+|.||++++-++|.-+.+
T Consensus 116 ~~R~r~~~rktRkYgvl~~ 134 (163)
T PF06679_consen 116 TFRLRRRNRKTRKYGVLTT 134 (163)
T ss_pred HHhhccccccceeecccCC
Confidence 3444443333355655544
No 28
>KOG3514|consensus
Probab=93.37 E-value=0.041 Score=53.37 Aligned_cols=33 Identities=30% Similarity=0.809 Sum_probs=29.5
Q ss_pred CCCCCCcEEEeCCCCCCeeeecC-CCcCCCccccc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVG-SFIGPHCAQKS 37 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~-gy~G~rCe~~~ 37 (143)
+||+|+|+|... -..+.|-|.. +|.|+.||+..
T Consensus 629 nPC~N~g~C~eg-wNrfiCDCs~T~~~G~~CerE~ 662 (1591)
T KOG3514|consen 629 NPCQNGGKCSEG-WNRFICDCSGTGFEGRTCEREA 662 (1591)
T ss_pred CcccCCCCcccc-ccccccccccCcccCcccccee
Confidence 789999999988 7889999987 49999999854
No 29
>PF06365 CD34_antigen: CD34/Podocalyxin family; InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation [, ]. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion [].
Probab=93.36 E-value=0.48 Score=38.00 Aligned_cols=20 Identities=20% Similarity=0.189 Sum_probs=11.7
Q ss_pred cCCCCcceeeec---CCCC-Cccc
Q psy3616 81 NDQTGSQVNFYY---GGAP-YAES 100 (143)
Q Consensus 81 ~~~~gs~~N~~~---g~pp-y~e~ 100 (143)
..+||.|-|..+ +.+| -+|+
T Consensus 143 ~vEng~h~n~~l~v~~~~~E~qeK 166 (202)
T PF06365_consen 143 TVENGYHDNPTLSVAESQPEMQEK 166 (202)
T ss_pred ecccCccCCcccccCCCCcccccc
Confidence 356777777666 4433 5555
No 30
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation [].; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=93.14 E-value=0.026 Score=49.80 Aligned_cols=18 Identities=17% Similarity=0.767 Sum_probs=0.0
Q ss_pred HHHHHHHHHHHHHHHHHH
Q psy3616 46 IAATVVFLIIIALFVWMI 63 (143)
Q Consensus 46 i~~~Vl~lilIvllv~~~ 63 (143)
+++++++++|+++++|++
T Consensus 358 VlgvavlivVv~viv~vc 375 (439)
T PF02480_consen 358 VLGVAVLIVVVGVIVWVC 375 (439)
T ss_dssp ------------------
T ss_pred HHHHHHHHHHHHHHhhee
Confidence 334444444444444433
No 31
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=93.02 E-value=0.071 Score=44.95 Aligned_cols=31 Identities=16% Similarity=0.386 Sum_probs=18.0
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHhhhcccccc
Q psy3616 45 GIAATVVFLIIIALFVWMICARSERRREPKK 75 (143)
Q Consensus 45 ~i~~~Vl~lilIvllv~~~~~r~rr~kk~k~ 75 (143)
.++.+|++||+|++.+++.+.|+|+++|+.|
T Consensus 262 iiaIliIVLIMvIIYLILRYRRKKKmkKKlQ 292 (299)
T PF02009_consen 262 IIAILIIVLIMVIIYLILRYRRKKKMKKKLQ 292 (299)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhhHHHH
Confidence 3455666666555444445677777766544
No 32
>PF05454 DAG1: Dystroglycan (Dystrophin-associated glycoprotein 1); InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence suggesting that dystroglycan functions as part of a signal transduction pathway, because Grb2, a mediator of the Ras-related signal pathway, has been shown to interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of the dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear [].; PDB: 1EG4_P.
Probab=93.01 E-value=0.028 Score=47.24 Aligned_cols=24 Identities=29% Similarity=0.553 Sum_probs=0.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 47 AATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
+++|+++|||+.+|+++|+||||.
T Consensus 152 aVVI~~iLLIA~iIa~icyrrkR~ 175 (290)
T PF05454_consen 152 AVVIAAILLIAGIIACICYRRKRK 175 (290)
T ss_dssp ------------------------
T ss_pred HHHHHHHHHHHHHHHHHhhhhhhc
Confidence 344444445556667777775543
No 33
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionary conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=92.72 E-value=0.044 Score=36.33 Aligned_cols=26 Identities=31% Similarity=0.340 Sum_probs=0.4
Q ss_pred ehhHHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 43 AGGIAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 43 a~~i~~~Vl~lilIvllv~~~~~r~r 68 (143)
++.|+++|+.+++.+++++++.+|.|
T Consensus 12 aavIaG~Vvgll~ailLIlf~iyR~r 37 (64)
T PF01034_consen 12 AAVIAGGVVGLLFAILLILFLIYRMR 37 (64)
T ss_dssp -------------------------S
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 33345555555554444444445443
No 34
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway [].
Probab=92.33 E-value=0.024 Score=43.58 Aligned_cols=13 Identities=31% Similarity=0.695 Sum_probs=5.1
Q ss_pred HHHHHHHHHHHHH
Q psy3616 53 LIIIALFVWMICA 65 (143)
Q Consensus 53 lilIvllv~~~~~ 65 (143)
||+|++++|++|.
T Consensus 63 ll~il~lvf~~c~ 75 (154)
T PF04478_consen 63 LLGILALVFIFCI 75 (154)
T ss_pred HHHHHHhheeEEE
Confidence 3333344443433
No 35
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=91.74 E-value=0.12 Score=39.82 Aligned_cols=6 Identities=33% Similarity=0.490 Sum_probs=2.5
Q ss_pred eehhHH
Q psy3616 42 IAGGIA 47 (143)
Q Consensus 42 ia~~i~ 47 (143)
|+++|+
T Consensus 80 iivgvi 85 (179)
T PF13908_consen 80 IIVGVI 85 (179)
T ss_pred eeeehh
Confidence 444443
No 36
>PF05393 Hum_adeno_E3A: Human adenovirus early E3A glycoprotein; InterPro: IPR008652 This family consists of several early glycoproteins (E3A), from human adenovirus type 2.; GO: 0016021 integral to membrane
Probab=91.57 E-value=0.26 Score=34.76 Aligned_cols=16 Identities=25% Similarity=1.076 Sum_probs=9.9
Q ss_pred HHHHHHHHHHHHhhhc
Q psy3616 55 IIALFVWMICARSERR 70 (143)
Q Consensus 55 lIvllv~~~~~r~rr~ 70 (143)
++++++|++|+.+|||
T Consensus 45 il~VilwfvCC~kRkr 60 (94)
T PF05393_consen 45 ILLVILWFVCCKKRKR 60 (94)
T ss_pred HHHHHHHHHHHHHhhh
Confidence 4445567777766554
No 37
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) [] are integral membrane proteins, specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below. +-----+ +-----+ +-----+ +-----+ | | | | | | | | xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx +--------------------------++Hinge++--------------------------++TM++C+ In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin) [], a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region []. In a lamp-family protein from nematodes [] only the part C-terminal to the hinge is conserved. ; GO: 0016020 membrane
Probab=91.15 E-value=0.13 Score=42.90 Aligned_cols=19 Identities=5% Similarity=0.251 Sum_probs=7.1
Q ss_pred HHHHHHHHHHHHHHHHHHH
Q psy3616 48 ATVVFLIIIALFVWMICAR 66 (143)
Q Consensus 48 ~~Vl~lilIvllv~~~~~r 66 (143)
|++|++|||++||.+++.|
T Consensus 278 G~~La~lvlivLiaYli~R 296 (306)
T PF01299_consen 278 GAALAGLVLIVLIAYLIGR 296 (306)
T ss_pred HHHHHHHHHHHHHhheeEe
Confidence 3333333333333333333
No 38
>PF15050 SCIMP: SCIMP protein
Probab=90.98 E-value=0.56 Score=34.93 Aligned_cols=11 Identities=18% Similarity=0.220 Sum_probs=4.2
Q ss_pred HHHHHHHhhhc
Q psy3616 60 VWMICARSERR 70 (143)
Q Consensus 60 v~~~~~r~rr~ 70 (143)
++++.+|...|
T Consensus 26 IlyCvcR~~lR 36 (133)
T PF15050_consen 26 ILYCVCRWQLR 36 (133)
T ss_pred HHHHHHHHHHH
Confidence 33333443333
No 39
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=90.80 E-value=0.085 Score=34.03 Aligned_cols=33 Identities=33% Similarity=0.796 Sum_probs=20.4
Q ss_pred CCCCCCcEEEeC---CCCCCeeeecCCCcCCCcccc
Q psy3616 4 GYCENKGTCVKD---ARGQPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 4 ~~C~NgG~C~~~---~~~~~~C~C~~gy~G~rCe~~ 36 (143)
-+|.-+|....+ ..+.|.|.|..-|.|+.|+..
T Consensus 17 i~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~ 52 (56)
T PF04863_consen 17 ISCSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTL 52 (56)
T ss_dssp S--TTSEE--TTS-EETTEE--EE-TTEESTTS-EE
T ss_pred CCcCCCCeeeeccccccCCccccccCCcCCCCcccC
Confidence 367788888766 356799999999999999874
No 40
>PHA03283 envelope glycoprotein E; Provisional
Probab=90.47 E-value=0.33 Score=43.89 Aligned_cols=38 Identities=16% Similarity=0.209 Sum_probs=19.6
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc
Q psy3616 44 GGIAATVVFLIIIALFVWMICARSERRREPKKLVAQTN 81 (143)
Q Consensus 44 ~~i~~~Vl~lilIvllv~~~~~r~rr~kk~k~~~~~~~ 81 (143)
+++++.++++++++++||.+...+++++|..+-+-+|.
T Consensus 403 ~~~~~~~~~~~~~~l~vw~c~~~r~~~~~~y~ilnpf~ 440 (542)
T PHA03283 403 LLAIICTCAALLVALVVWGCILYRRSNRKPYEVLNPFE 440 (542)
T ss_pred HHHHHHHHHHHHHHHhhhheeeehhhcCCcccccCCCc
Confidence 33444444455555666644333555566665555443
No 41
>PF07204 Orthoreo_P10: Orthoreovirus membrane fusion protein p10; InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction [].
Probab=90.16 E-value=0.18 Score=35.89 Aligned_cols=9 Identities=22% Similarity=0.844 Sum_probs=4.1
Q ss_pred HHHHHhhhc
Q psy3616 62 MICARSERR 70 (143)
Q Consensus 62 ~~~~r~rr~ 70 (143)
++|+|.|++
T Consensus 61 v~CC~~K~K 69 (98)
T PF07204_consen 61 VCCCRAKHK 69 (98)
T ss_pred HHHhhhhhh
Confidence 344444444
No 42
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=90.06 E-value=0.088 Score=37.26 Aligned_cols=25 Identities=28% Similarity=0.533 Sum_probs=11.3
Q ss_pred eehhHHHHHHHH-HHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFL-IIIALFVWMICAR 66 (143)
Q Consensus 42 ia~~i~~~Vl~l-ilIvllv~~~~~r 66 (143)
|+++++++++++ +|+.+++|++.+|
T Consensus 68 iagi~vg~~~~v~~lv~~l~w~f~~r 93 (96)
T PTZ00382 68 IAGISVAVVAVVGGLVGFLCWWFVCR 93 (96)
T ss_pred EEEEEeehhhHHHHHHHHHhheeEEe
Confidence 444333333333 4444555655444
No 43
>KOG3516|consensus
Probab=89.78 E-value=0.25 Score=48.39 Aligned_cols=32 Identities=31% Similarity=0.804 Sum_probs=28.9
Q ss_pred CCCCCcEEEeCCCCCCeeeecCC-CcCCCccccc
Q psy3616 5 YCENKGTCVKDARGQPSCRCVGS-FIGPHCAQKS 37 (143)
Q Consensus 5 ~C~NgG~C~~~~~~~~~C~C~~g-y~G~rCe~~~ 37 (143)
+|+|||+|+.. -..+.|-|... |.|+.|....
T Consensus 962 ~C~NGG~Cver-y~gytCDCs~Tay~Gp~Cs~ei 994 (1306)
T KOG3516|consen 962 PCLNGGHCVER-YDGYTCDCSRTAYDGPFCSKEI 994 (1306)
T ss_pred cccCCCEEEEe-cCceeeccccCcCCCCcccccc
Confidence 69999999999 77899999987 9999999864
No 44
>PF08693 SKG6: Transmembrane alpha-helix domain; InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation [, ]. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins [, ]. The full-length AXL2 protein has a negative regulatory function in cytokinesis [].
Probab=89.44 E-value=0.088 Score=31.83 Aligned_cols=6 Identities=50% Similarity=0.589 Sum_probs=2.7
Q ss_pred eeehhH
Q psy3616 41 YIAGGI 46 (143)
Q Consensus 41 ~ia~~i 46 (143)
.|+.++
T Consensus 12 aIa~~V 17 (40)
T PF08693_consen 12 AIAVGV 17 (40)
T ss_pred EEEEEE
Confidence 444443
No 45
>PF07213 DAP10: DAP10 membrane protein; InterPro: IPR009861 This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells [].
Probab=88.91 E-value=0.51 Score=32.52 Aligned_cols=18 Identities=33% Similarity=0.617 Sum_probs=10.0
Q ss_pred HHHHHHHHHHHhhhcccc
Q psy3616 56 IALFVWMICARSERRREP 73 (143)
Q Consensus 56 Ivllv~~~~~r~rr~kk~ 73 (143)
+|+++.++|.|.++|+++
T Consensus 49 LIv~~vy~car~r~r~~~ 66 (79)
T PF07213_consen 49 LIVLVVYYCARPRRRPTQ 66 (79)
T ss_pred HHHHHHHhhcccccCCcc
Confidence 333455677776655444
No 46
>PF10873 DUF2668: Protein of unknown function (DUF2668); InterPro: IPR022640 Members in this family of proteins are annotated as cysteine and tyrosine-rich protein 1, however currently no function is known [].
Probab=88.84 E-value=1.5 Score=33.63 Aligned_cols=22 Identities=23% Similarity=0.212 Sum_probs=13.5
Q ss_pred eeeehhHHHHHHHHHHHHHHHH
Q psy3616 40 AYIAGGIAATVVFLIIIALFVW 61 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIvllv~ 61 (143)
..|++++.++|+++.+|+.+++
T Consensus 61 tAIaGIVfgiVfimgvva~i~i 82 (155)
T PF10873_consen 61 TAIAGIVFGIVFIMGVVAGIAI 82 (155)
T ss_pred ceeeeeehhhHHHHHHHHHHHH
Confidence 3566656677777776664433
No 47
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=88.74 E-value=0.51 Score=36.45 Aligned_cols=13 Identities=46% Similarity=0.708 Sum_probs=5.3
Q ss_pred HHHHHHHHHHHHH
Q psy3616 47 AATVVFLIIIALF 59 (143)
Q Consensus 47 ~~~Vl~lilIvll 59 (143)
+++|+++|+|+++
T Consensus 36 iaIvVliiiiivl 48 (189)
T PF05568_consen 36 IAIVVLIIIIIVL 48 (189)
T ss_pred HHHHHHHHHHHHH
Confidence 3444444444433
No 48
>KOG1225|consensus
Probab=88.65 E-value=0.32 Score=44.05 Aligned_cols=30 Identities=33% Similarity=0.979 Sum_probs=24.8
Q ss_pred CCCCCCCCcEEEeCCCCCCeeeecCCCcCCCcccc
Q psy3616 2 CKGYCENKGTCVKDARGQPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 2 C~~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~ 36 (143)
|...|.++|.|+ . ..|.|..||+|..|+..
T Consensus 314 cpadC~g~G~Ci-~----G~C~C~~Gy~G~~C~~~ 343 (525)
T KOG1225|consen 314 CPADCSGHGKCI-D----GECLCDEGYTGELCIQR 343 (525)
T ss_pred CCccCCCCCccc-C----CceEeCCCCcCCccccc
Confidence 557888888888 4 58999999999999885
No 49
>KOG3516|consensus
Probab=88.62 E-value=0.26 Score=48.30 Aligned_cols=33 Identities=24% Similarity=0.687 Sum_probs=29.5
Q ss_pred CCCCCCCcEEEeCCCCCCeeeec-CCCcCCCcccc
Q psy3616 3 KGYCENKGTCVKDARGQPSCRCV-GSFIGPHCAQK 36 (143)
Q Consensus 3 ~~~C~NgG~C~~~~~~~~~C~C~-~gy~G~rCe~~ 36 (143)
.+||+|||.|... -..+.|.|. .||.|..|...
T Consensus 550 PN~CehgG~C~Qs-~~~f~C~C~~TGY~GatCHts 583 (1306)
T KOG3516|consen 550 PNPCEHGGKCSQS-WDDFECNCELTGYKGATCHTS 583 (1306)
T ss_pred CccccCCCccccc-ccceeEeccccccccccccCC
Confidence 4799999999986 788999999 89999999864
No 50
>PF15298 AJAP1_PANP_C: AJAP1/PANP C-terminus
Probab=88.39 E-value=0.69 Score=37.04 Aligned_cols=11 Identities=36% Similarity=0.606 Sum_probs=8.5
Q ss_pred CCCccccCCCC
Q psy3616 95 APYAESVAPSH 105 (143)
Q Consensus 95 ppy~e~~~~~~ 105 (143)
.+|.|+..|+|
T Consensus 166 tayn~sl~csh 176 (205)
T PF15298_consen 166 TAYNDSLQCSH 176 (205)
T ss_pred cccCCCCCCCc
Confidence 37888888876
No 51
>PTZ00046 rifin; Provisional
Probab=88.24 E-value=0.3 Score=42.22 Aligned_cols=33 Identities=18% Similarity=0.398 Sum_probs=22.3
Q ss_pred ehhHHHHHHHHHHHHHHHHHHHHHhhhcccccc
Q psy3616 43 AGGIAATVVFLIIIALFVWMICARSERRREPKK 75 (143)
Q Consensus 43 a~~i~~~Vl~lilIvllv~~~~~r~rr~kk~k~ 75 (143)
+..|+.+|++||++++-+++-+.|+|++||+-|
T Consensus 319 aSiiAIvVIVLIMvIIYLILRYRRKKKMkKKLQ 351 (358)
T PTZ00046 319 ASIVAIVVIVLIMVIIYLILRYRRKKKMKKKLQ 351 (358)
T ss_pred HHHHHHHHHHHHHHHHHHHHHhhhcchhHHHHH
Confidence 333566777777777666667788888766543
No 52
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated [].
Probab=88.01 E-value=0.38 Score=38.99 Aligned_cols=18 Identities=22% Similarity=0.292 Sum_probs=7.3
Q ss_pred eeeehhH-HHHHHHHHHHH
Q psy3616 40 AYIAGGI-AATVVFLIIIA 57 (143)
Q Consensus 40 ~~ia~~i-~~~Vl~lilIv 57 (143)
+.|+++| +|++.++|+|+
T Consensus 37 ~~I~iaiVAG~~tVILVI~ 55 (221)
T PF08374_consen 37 VKIMIAIVAGIMTVILVIF 55 (221)
T ss_pred eeeeeeeecchhhhHHHHH
Confidence 3444444 33333333333
No 53
>PF12259 DUF3609: Protein of unknown function (DUF3609); InterPro: IPR022048 This domain family is found in eukaryotes and viruses, and is typically between 348 and 360 amino acids in length.
Probab=88.01 E-value=0.6 Score=40.34 Aligned_cols=28 Identities=14% Similarity=0.475 Sum_probs=14.3
Q ss_pred eehhHHHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 42 IAGGIAATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~~~~~r~rr 69 (143)
....|+++++++++++.++|+++..++|
T Consensus 299 ~~i~v~~~~vli~vl~~~~~~~~~~~~~ 326 (361)
T PF12259_consen 299 VHIAVCGAIVLIIVLISLAWLYRTFRRR 326 (361)
T ss_pred EEEehhHHHHHHHHHHHHHhheeehHHH
Confidence 3334455555555555566765544333
No 54
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas [, , ]. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=87.95 E-value=0.59 Score=43.31 Aligned_cols=27 Identities=26% Similarity=0.850 Sum_probs=12.2
Q ss_pred eeehhHHHHHHHHHH-HHHHHHHHHHHhh
Q psy3616 41 YIAGGIAATVVFLII-IALFVWMICARSE 68 (143)
Q Consensus 41 ~ia~~i~~~Vl~lil-Ivllv~~~~~r~r 68 (143)
||.+++++-|+++++ |+++.|.+| |++
T Consensus 270 WII~gVlvPv~vV~~Iiiil~~~LC-Rk~ 297 (684)
T PF12877_consen 270 WIIAGVLVPVLVVLLIIIILYWKLC-RKN 297 (684)
T ss_pred EEEehHhHHHHHHHHHHHHHHHHHh-ccc
Confidence 444444444444444 444445444 444
No 55
>PF00558 Vpu: Vpu protein; InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells [].; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=87.82 E-value=0.71 Score=31.95 Aligned_cols=12 Identities=50% Similarity=0.969 Sum_probs=5.8
Q ss_pred HHHHHHHHHHHH
Q psy3616 54 IIIALFVWMICA 65 (143)
Q Consensus 54 ilIvllv~~~~~ 65 (143)
++++++||.+.+
T Consensus 16 ~iiaIvvW~iv~ 27 (81)
T PF00558_consen 16 LIIAIVVWTIVY 27 (81)
T ss_dssp HHHHHHHHHHH-
T ss_pred HHHHHHHHHHHH
Confidence 334555666544
No 56
>PF11980 DUF3481: Domain of unknown function (DUF3481); InterPro: IPR022579 This domain of unknown function is located in the C terminus of the eukaryotic neuropilin receptor family of proteins. It is found in association with PF00754 from PFAM, PF00431 from PFAM and PF00629 from PFAM. There are two completely conserved residues (Y and E) that may be functionally important.
Probab=87.79 E-value=0.84 Score=31.90 Aligned_cols=23 Identities=22% Similarity=0.423 Sum_probs=13.9
Q ss_pred HHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 48 ATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 48 ~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
+++++++.+.+.+.++|+|.+..
T Consensus 24 ga~llL~~v~l~vvL~C~r~~~a 46 (87)
T PF11980_consen 24 GALLLLVAVCLGVVLYCHRFHWA 46 (87)
T ss_pred cHHHHHHHHHHHHHHhhhhhccc
Confidence 34444444555667788887764
No 57
>PF02158 Neuregulin: Neuregulin family; InterPro: IPR002154 Neuregulins are a sub-family of EGF-like molecules that have been shown to play multiple essential roles in vertebrate embryogenesis including: cardiac development, Schwann cell and oligodendrocyte differentiation, some aspects of neuronal development, as well as the formation of neuromuscular synapses [, ]. Included in the family are heregulin; neu differentiation factor; acetylcholine receptor synthesis stimulator; glial growth factor; and sensory and motor-neuron derived factor []. Multiple family members are generated by alternate splicing or by use of several cell type-specific transcription initiation sites. In general, they bind to and activate the erbB family of receptor tyrosine kinases (erbB2 (HER2), erbB3 (HER3), and erbB4 (HER4)), functioning both as heterodimers and homodimers. The transmembrane forms of neuregulin 1 (NRG1) are present within synaptic vesicles, including those containing glutamate []. After exocytosis, NRG1 is in the presynaptic membrane, where the ectodomain of NRG1 may be cleaved off. The ectodomain then migrates across the synaptic cleft and binds to and activates a member of the EGF-receptor family on the postsynaptic membrane. This has been shown to increase the expression of certain glutamate-receptor subunits. NRG1 appears to signal for glutamate-receptor subunit expression, localisation, and/or phosphorylation facilitating subsequent glutamate transmission. The NRG1 gene has been identified as a potential gene determining susceptibility to schizophrenia by a combination of genetic linkage and association approaches []. ; GO: 0005102 receptor binding, 0009790 embryo development; PDB: 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=87.76 E-value=0.16 Score=44.27 Aligned_cols=35 Identities=17% Similarity=0.342 Sum_probs=0.0
Q ss_pred ehhHHHHHHHHHHHH-HHHH-HHHHHhhhcccccccc
Q psy3616 43 AGGIAATVVFLIIIA-LFVW-MICARSERRREPKKLV 77 (143)
Q Consensus 43 a~~i~~~Vl~lilIv-llv~-~~~~r~rr~kk~k~~~ 77 (143)
+++|.++++.|+++. ++|+ ++|..||+|||.+..|
T Consensus 9 VLTITgIcvaLlVVGi~Cvv~aYCKTKKQRkklh~hL 45 (404)
T PF02158_consen 9 VLTITGICVALLVVGIVCVVDAYCKTKKQRKKLHEHL 45 (404)
T ss_dssp -------------------------------------
T ss_pred hhhhhhhhHHHHHHHHHHHHHHHHHhHHHHHHHHHHH
Confidence 344555555555544 4555 6666666655444333
No 58
>smart00051 DSL delta serrate ligand.
Probab=87.64 E-value=0.42 Score=31.29 Aligned_cols=22 Identities=27% Similarity=0.750 Sum_probs=16.4
Q ss_pred CCcEEEeCCCCCCeeeecCCCcCCCc
Q psy3616 8 NKGTCVKDARGQPSCRCVGSFIGPHC 33 (143)
Q Consensus 8 NgG~C~~~~~~~~~C~C~~gy~G~rC 33 (143)
++.+|... -.|.|.+||+|+.|
T Consensus 42 ~~~~Cd~~----G~~~C~~Gw~G~~C 63 (63)
T smart00051 42 GHYTCDEN----GNKGCLEGWMGPYC 63 (63)
T ss_pred CCccCCcC----CCEecCCCCcCCCC
Confidence 45566432 57889999999988
No 59
>PF06697 DUF1191: Protein of unknown function (DUF1191); InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=87.61 E-value=0.44 Score=39.91 Aligned_cols=52 Identities=21% Similarity=0.334 Sum_probs=21.1
Q ss_pred eeeehhHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCCCCcceeeecC
Q psy3616 40 AYIAGGIAATVVFLIIIALFVWMICARSERRREPKKLVAQTNDQTGSQVNFYYG 93 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIvllv~~~~~r~rr~kk~k~~~~~~~~~~gs~~N~~~g 93 (143)
|.+++++++.+++|+|+.++++ .+.|.+|+||..++..+...++-.+ ....|
T Consensus 213 W~iv~g~~~G~~~L~ll~~lv~-~~vr~krk~k~~eMEr~A~~gE~L~-~~~VG 264 (278)
T PF06697_consen 213 WKIVVGVVGGVVLLGLLSLLVA-MLVRYKRKKKIEEMERRAEEGEALQ-MSWVG 264 (278)
T ss_pred EEEEEEehHHHHHHHHHHHHHH-hhhhhhHHHHHHHHHHhhccCceee-eEEEc
Confidence 4455554444444444443332 2233333333333333333333222 45566
No 60
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=87.56 E-value=0.36 Score=41.70 Aligned_cols=32 Identities=16% Similarity=0.378 Sum_probs=21.8
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhcccccc
Q psy3616 44 GGIAATVVFLIIIALFVWMICARSERRREPKK 75 (143)
Q Consensus 44 ~~i~~~Vl~lilIvllv~~~~~r~rr~kk~k~ 75 (143)
..|+.+|++||++++-+++-+.|+|++||+-|
T Consensus 315 SiIAIvvIVLIMvIIYLILRYRRKKKMkKKLQ 346 (353)
T TIGR01477 315 SIIAILIIVLIMVIIYLILRYRRKKKMKKKLQ 346 (353)
T ss_pred HHHHHHHHHHHHHHHHHHHHhhhcchhHHHHH
Confidence 33566777777777666667788888766543
No 61
>PF15069 FAM163: FAM163 family
Probab=87.49 E-value=0.9 Score=34.58 Aligned_cols=21 Identities=52% Similarity=0.760 Sum_probs=12.7
Q ss_pred ceeeehhHHHHHHHHHHHHHH
Q psy3616 39 FAYIAGGIAATVVFLIIIALF 59 (143)
Q Consensus 39 ~~~ia~~i~~~Vl~lilIvll 59 (143)
.+.|++||.+.|++|.||+++
T Consensus 5 TvVItGgILAtVILLcIIaVL 25 (143)
T PF15069_consen 5 TVVITGGILATVILLCIIAVL 25 (143)
T ss_pred eEEEechHHHHHHHHHHHHHH
Confidence 356777776666655555543
No 62
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=87.39 E-value=0.44 Score=36.58 Aligned_cols=17 Identities=29% Similarity=0.395 Sum_probs=9.0
Q ss_pred eehhHHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIAL 58 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvl 58 (143)
+++.++++++++++|++
T Consensus 77 ~~~iivgvi~~Vi~Iv~ 93 (179)
T PF13908_consen 77 ITGIIVGVICGVIAIVV 93 (179)
T ss_pred eeeeeeehhhHHHHHHH
Confidence 44555556665555543
No 63
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=86.39 E-value=0.49 Score=25.45 Aligned_cols=17 Identities=24% Similarity=0.774 Sum_probs=13.3
Q ss_pred CCeeeecCCCc----CCCccc
Q psy3616 19 QPSCRCVGSFI----GPHCAQ 35 (143)
Q Consensus 19 ~~~C~C~~gy~----G~rCe~ 35 (143)
++.|.|++||. |..|+.
T Consensus 1 sy~C~C~~Gy~l~~d~~~C~D 21 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDGRSCED 21 (24)
T ss_pred CEEeeCCCCCcCCCCCCcccc
Confidence 47899999987 566764
No 64
>PHA03281 envelope glycoprotein E; Provisional
Probab=86.30 E-value=1.1 Score=41.01 Aligned_cols=17 Identities=12% Similarity=-0.175 Sum_probs=7.8
Q ss_pred HHHHHHhhhcccccccc
Q psy3616 61 WMICARSERRREPKKLV 77 (143)
Q Consensus 61 ~~~~~r~rr~kk~k~~~ 77 (143)
....+|+|++++.|..+
T Consensus 578 ~~~~~~~~~~~~~~~~~ 594 (642)
T PHA03281 578 TAKKFGHKAYRSDKAAY 594 (642)
T ss_pred HHHHhhhheeecccccc
Confidence 33445555444444333
No 65
>KOG1225|consensus
Probab=86.14 E-value=0.71 Score=41.85 Aligned_cols=32 Identities=31% Similarity=0.850 Sum_probs=22.5
Q ss_pred CCCCCCCCCcEEEeCCCCCCeeeecCCCcCCCccccc
Q psy3616 1 VCKGYCENKGTCVKDARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 1 ~C~~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
+|...|.++|.|+.. .|.|.++|+|..|+.+.
T Consensus 282 ~Cp~~cs~~g~~~~g-----~CiC~~g~~G~dCs~~~ 313 (525)
T KOG1225|consen 282 VCPVDCSGGGVCVDG-----ECICNPGYSGKDCSIRR 313 (525)
T ss_pred cCCcccCCCceecCC-----EeecCCCcccccccccc
Confidence 366667666655533 88888889998887654
No 66
>PF15102 TMEM154: TMEM154 protein family
Probab=85.78 E-value=0.42 Score=36.53 Aligned_cols=19 Identities=16% Similarity=0.226 Sum_probs=9.6
Q ss_pred eeehhHHHHHHHHHHHHHH
Q psy3616 41 YIAGGIAATVVFLIIIALF 59 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvll 59 (143)
+|++-.+++|++|++++++
T Consensus 60 mIlIP~VLLvlLLl~vV~l 78 (146)
T PF15102_consen 60 MILIPLVLLVLLLLSVVCL 78 (146)
T ss_pred EEeHHHHHHHHHHHHHHHh
Confidence 4555555555555544433
No 67
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=85.68 E-value=0.59 Score=28.41 Aligned_cols=25 Identities=32% Similarity=0.666 Sum_probs=18.7
Q ss_pred cEEEeCCCCCCeeeecCCCcCCCccccc
Q psy3616 10 GTCVKDARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 10 G~C~~~~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
.+|... ..+|.|..+|+|++|+.-.
T Consensus 11 ~~C~~~---~G~C~C~~~~~G~~C~~C~ 35 (49)
T PF00053_consen 11 QTCDPS---TGQCVCKPGTTGPRCDQCK 35 (49)
T ss_dssp SSEEET---CEEESBSTTEESTTS-EE-
T ss_pred CcccCC---CCEEeccccccCCcCcCCC
Confidence 377763 3799999999999999743
No 68
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=85.48 E-value=1.2 Score=32.97 Aligned_cols=30 Identities=13% Similarity=0.154 Sum_probs=17.0
Q ss_pred hhHHHHHHHHHH-HHHHHHHHHHHhhhcccc
Q psy3616 44 GGIAATVVFLII-IALFVWMICARSERRREP 73 (143)
Q Consensus 44 ~~i~~~Vl~lil-Ivllv~~~~~r~rr~kk~ 73 (143)
..++++++.++. |+++++++.+.-||++|+
T Consensus 64 ~~i~~Ii~gv~aGvIg~Illi~y~irR~~Kk 94 (122)
T PF01102_consen 64 PAIIGIIFGVMAGVIGIILLISYCIRRLRKK 94 (122)
T ss_dssp TCHHHHHHHHHHHHHHHHHHHHHHHHHHS--
T ss_pred cceeehhHHHHHHHHHHHHHHHHHHHHHhcc
Confidence 335666666665 445566677776666554
No 69
>PF10577 UPF0560: Uncharacterised protein family UPF0560; InterPro: IPR018890 This family of proteins has no known function.
Probab=85.46 E-value=0.94 Score=42.87 Aligned_cols=25 Identities=20% Similarity=0.337 Sum_probs=14.8
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 44 GGIAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 44 ~~i~~~Vl~lilIvllv~~~~~r~r 68 (143)
++|.|..++|+|++|+++++++|+|
T Consensus 276 l~ILG~~~livl~lL~vLl~yCrrk 300 (807)
T PF10577_consen 276 LAILGGTALIVLILLCVLLCYCRRK 300 (807)
T ss_pred HHHHHHHHHHHHHHHHHHHHhhhcc
Confidence 4455656666666666666655554
No 70
>KOG1217|consensus
Probab=84.52 E-value=1 Score=37.62 Aligned_cols=28 Identities=43% Similarity=0.925 Sum_probs=25.7
Q ss_pred CCCCCcEEEeCCCCCCeeeecCCCcCCCc
Q psy3616 5 YCENKGTCVKDARGQPSCRCVGSFIGPHC 33 (143)
Q Consensus 5 ~C~NgG~C~~~~~~~~~C~C~~gy~G~rC 33 (143)
+|.|+|+|... .+.+.|.|+++|.|..|
T Consensus 279 ~c~~~~~C~~~-~~~~~C~C~~g~~g~~~ 306 (487)
T KOG1217|consen 279 SCPNGGTCVNV-PGSYRCTCPPGFTGRLC 306 (487)
T ss_pred ccCCCCeeecC-CCcceeeCCCCCCCCCC
Confidence 49999999998 66699999999999999
No 71
>KOG1217|consensus
Probab=84.09 E-value=0.86 Score=38.02 Aligned_cols=32 Identities=34% Similarity=0.845 Sum_probs=28.5
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCcCCCcccc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe~~ 36 (143)
.+|.|++.|... .+.+.|.|+.+|.|..|+..
T Consensus 177 ~~c~~~~~C~~~-~~~~~C~c~~~~~~~~~~~~ 208 (487)
T KOG1217|consen 177 SPCQNGGTCVNT-GGSYLCSCPPGYTGSTCETT 208 (487)
T ss_pred CCcCCCcccccC-CCCeeEeCCCCccCCcCcCC
Confidence 459999999999 66799999999999999875
No 72
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=83.30 E-value=1.1 Score=27.44 Aligned_cols=24 Identities=25% Similarity=0.677 Sum_probs=18.6
Q ss_pred EEEeCCCCCCeeeecCCCcCCCccccc
Q psy3616 11 TCVKDARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 11 ~C~~~~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
.|.. ..-+|.|..+++|++|+.-.
T Consensus 13 ~C~~---~~G~C~C~~~~~G~~C~~C~ 36 (50)
T cd00055 13 QCDP---GTGQCECKPNTTGRRCDRCA 36 (50)
T ss_pred cccC---CCCEEeCCCcCCCCCCCCCC
Confidence 3654 34689999999999999643
No 73
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=82.78 E-value=0.29 Score=35.99 Aligned_cols=14 Identities=43% Similarity=0.482 Sum_probs=1.5
Q ss_pred eehhHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLII 55 (143)
Q Consensus 42 ia~~i~~~Vl~lil 55 (143)
.|+||+.++++|.+
T Consensus 25 EAaGIGiL~VILgi 38 (118)
T PF14991_consen 25 EAAGIGILIVILGI 38 (118)
T ss_dssp ---SSS--------
T ss_pred HhccceeHHHHHHH
Confidence 45555444333333
No 74
>KOG3637|consensus
Probab=82.50 E-value=1.5 Score=42.71 Aligned_cols=27 Identities=15% Similarity=0.282 Sum_probs=15.7
Q ss_pred eeeehhHHHHHHHHHHHHHHHHHHHHH
Q psy3616 40 AYIAGGIAATVVFLIIIALFVWMICAR 66 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIvllv~~~~~r 66 (143)
|.|.+++++.+|+|+||++++|=+-+-
T Consensus 979 wiIi~svl~GLLlL~llv~~LwK~GFF 1005 (1030)
T KOG3637|consen 979 WIIILSVLGGLLLLALLVLLLWKCGFF 1005 (1030)
T ss_pred eeehHHHHHHHHHHHHHHHHHHhcCcc
Confidence 455566666666666666666644333
No 75
>PF06667 PspB: Phage shock protein B; InterPro: IPR009554 This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages []. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one [].; GO: 0006355 regulation of transcription, DNA-dependent, 0009271 phage shock
Probab=81.52 E-value=2.9 Score=28.45 Aligned_cols=20 Identities=15% Similarity=0.478 Sum_probs=11.1
Q ss_pred HHHHHHHHHHHHHHhhhccc
Q psy3616 53 LIIIALFVWMICARSERRRE 72 (143)
Q Consensus 53 lilIvllv~~~~~r~rr~kk 72 (143)
.+++|+.+|++.++++|+|.
T Consensus 13 f~ifVap~WL~lHY~sk~~~ 32 (75)
T PF06667_consen 13 FMIFVAPIWLILHYRSKWKS 32 (75)
T ss_pred HHHHHHHHHHHHHHHHhccc
Confidence 33444556777776655433
No 76
>PF11669 WBP-1: WW domain-binding protein 1; InterPro: IPR021684 This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain [].
Probab=81.48 E-value=4 Score=29.10 Aligned_cols=6 Identities=33% Similarity=0.678 Sum_probs=3.3
Q ss_pred CCCccc
Q psy3616 95 APYAES 100 (143)
Q Consensus 95 ppy~e~ 100 (143)
|+|.|.
T Consensus 85 P~Y~ev 90 (102)
T PF11669_consen 85 PSYSEV 90 (102)
T ss_pred CCcHHh
Confidence 446664
No 77
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=81.01 E-value=1.5 Score=38.10 Aligned_cols=27 Identities=30% Similarity=0.524 Sum_probs=15.5
Q ss_pred eehh-HHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 42 IAGG-IAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 42 ia~~-i~~~Vl~lilIvllv~~~~~r~r 68 (143)
||.+ |+++|++-.||.+|.|++..|.|
T Consensus 369 IaGIsvavvvvVgglvGfLcWwf~crgk 396 (397)
T PF03302_consen 369 IAGISVAVVVVVGGLVGFLCWWFICRGK 396 (397)
T ss_pred eeeeeehhHHHHHHHHHHHhhheeeccc
Confidence 4443 34444444466678887777654
No 78
>PF15330 SIT: SHP2-interacting transmembrane adaptor protein, SIT
Probab=80.85 E-value=6.7 Score=28.30 Aligned_cols=6 Identities=17% Similarity=-0.103 Sum_probs=2.5
Q ss_pred ceeeec
Q psy3616 87 QVNFYY 92 (143)
Q Consensus 87 ~~N~~~ 92 (143)
-.|..+
T Consensus 49 YgNL~~ 54 (107)
T PF15330_consen 49 YGNLEL 54 (107)
T ss_pred cccccc
Confidence 334444
No 79
>KOG1226|consensus
Probab=80.45 E-value=1.4 Score=41.46 Aligned_cols=24 Identities=29% Similarity=0.872 Sum_probs=12.3
Q ss_pred CCCCcEEEeCCCCCCeeeecCCCcCCCcc
Q psy3616 6 CENKGTCVKDARGQPSCRCVGSFIGPHCA 34 (143)
Q Consensus 6 C~NgG~C~~~~~~~~~C~C~~gy~G~rCe 34 (143)
|.++|.|.=. +|.|.+||+|+.|+
T Consensus 557 C~g~G~C~CG-----~CvC~~GwtG~~C~ 580 (783)
T KOG1226|consen 557 CGGHGRCECG-----RCVCNPGWTGSACN 580 (783)
T ss_pred cCCCCeEeCC-----cEEcCCCCccCCCC
Confidence 5555555433 45555555555544
No 80
>KOG3514|consensus
Probab=80.10 E-value=1.1 Score=44.00 Aligned_cols=33 Identities=27% Similarity=0.842 Sum_probs=28.8
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCC-CcCCCccccc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGS-FIGPHCAQKS 37 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~g-y~G~rCe~~~ 37 (143)
+.|.|+|.|..+ -..+.|.|... |+|++|....
T Consensus 1024 ~acanhG~c~q~-w~~~~c~csmtS~~Gp~C~d~g 1057 (1591)
T KOG3514|consen 1024 DACANHGVCIQQ-WNGIACDCSMTSYSGPRCNDPG 1057 (1591)
T ss_pred hhhhccceeeee-ecceeeeccccccCCCccCCCc
Confidence 349999999999 77899999875 9999999864
No 81
>KOG4260|consensus
Probab=79.66 E-value=1.3 Score=37.54 Aligned_cols=35 Identities=29% Similarity=0.751 Sum_probs=29.1
Q ss_pred CCCCCCCcEEEeC--CCCCCeeeecCCCcCCCccccc
Q psy3616 3 KGYCENKGTCVKD--ARGQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 3 ~~~C~NgG~C~~~--~~~~~~C~C~~gy~G~rCe~~~ 37 (143)
..||...|.|.-+ -.+.-.|+|..||+|+.|..-.
T Consensus 149 er~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg 185 (350)
T KOG4260|consen 149 ERPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCG 185 (350)
T ss_pred cCCcCCCCcccCCCCCCCCCcccccCCCCCccccccc
Confidence 3578888999877 3567899999999999999753
No 82
>PF12191 stn_TNFRSF12A: Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain; InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved []. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration []. The receptor may also play a role in cell-matrix interactions [].; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=79.53 E-value=0.68 Score=34.62 Aligned_cols=8 Identities=13% Similarity=-0.068 Sum_probs=0.0
Q ss_pred HHHHHHhh
Q psy3616 61 WMICARSE 68 (143)
Q Consensus 61 ~~~~~r~r 68 (143)
.++.+||.
T Consensus 97 g~lv~rrc 104 (129)
T PF12191_consen 97 GFLVWRRC 104 (129)
T ss_dssp --------
T ss_pred HHHHHhhh
Confidence 33444433
No 83
>KOG4482|consensus
Probab=79.38 E-value=3.9 Score=36.01 Aligned_cols=28 Identities=29% Similarity=0.456 Sum_probs=13.5
Q ss_pred eehhHHHHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 42 IAGGIAATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
+++.|.++|++|++++|..+ .|+|+-.+
T Consensus 299 ~tfaIpl~Valll~~~La~i-mc~rrEg~ 326 (449)
T KOG4482|consen 299 HTFAIPLGVALLLVLALAYI-MCCRREGQ 326 (449)
T ss_pred HHHHHHHHHHHHHHHHHHHH-Hhhhhhcc
Confidence 44555555555544444444 44444433
No 84
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=78.88 E-value=1.6 Score=25.54 Aligned_cols=18 Identities=28% Similarity=0.750 Sum_probs=14.7
Q ss_pred EEEeCCCCCCeeeecCCCc
Q psy3616 11 TCVKDARGQPSCRCVGSFI 29 (143)
Q Consensus 11 ~C~~~~~~~~~C~C~~gy~ 29 (143)
.|+.. .++++|.|+.||.
T Consensus 11 ~C~~~-~g~~~C~C~~Gy~ 28 (36)
T PF14670_consen 11 ICVNT-PGSYRCSCPPGYK 28 (36)
T ss_dssp EEEEE-TTSEEEE-STTEE
T ss_pred CCccC-CCceEeECCCCCE
Confidence 67888 7889999999985
No 85
>PF11884 DUF3404: Domain of unknown function (DUF3404); InterPro: IPR021821 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 260 amino acids in length. This domain is found associated with PF02518 from PFAM, PF00512 from PFAM.
Probab=78.15 E-value=2.3 Score=35.35 Aligned_cols=15 Identities=27% Similarity=0.705 Sum_probs=7.0
Q ss_pred HHHHHHHHHhhhccc
Q psy3616 58 LFVWMICARSERRRE 72 (143)
Q Consensus 58 llv~~~~~r~rr~kk 72 (143)
++.|.++.++++||+
T Consensus 246 ~~gw~~y~~~~krre 260 (262)
T PF11884_consen 246 VLGWSLYRWNQKRRE 260 (262)
T ss_pred HHHHHHHHHHHHHHh
Confidence 334555554444443
No 86
>KOG1836|consensus
Probab=78.11 E-value=1.2 Score=45.54 Aligned_cols=33 Identities=24% Similarity=0.639 Sum_probs=28.9
Q ss_pred CCCCCcEEEeC-CCCCCeee-ecCCCcCCCccccc
Q psy3616 5 YCENKGTCVKD-ARGQPSCR-CVGSFIGPHCAQKS 37 (143)
Q Consensus 5 ~C~NgG~C~~~-~~~~~~C~-C~~gy~G~rCe~~~ 37 (143)
+|.|+|.|... +.....|+ |+.+|+|.||+...
T Consensus 781 ~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~c~ 815 (1705)
T KOG1836|consen 781 PCPNGGACGQTPEILEVVCKNCPPGYTGLRCEECA 815 (1705)
T ss_pred CCCCChhhcCcCcccceecCCCCCCCcccccccCC
Confidence 68899999877 46788999 99999999999865
No 87
>PF01528 Herpes_glycop: Herpesvirus glycoprotein M; InterPro: IPR000785 The Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) protein belongs to a family of sequences that groups together Human herpesvirus 1 (HHV-1) UL10, EHV-1 52, Human herpesvirus 3 (HHV-3) 50, Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) BBRF3, Human herpesvirus 1 (HHV-1) 39 and Human cytomegalovirus (HHV-5) UL100. Little is yet known about the properties of the protein. However, its amino acid sequence is highly hydrophobic, containing 8 putative membrane-spanning regions, and it is therefore believed to be either membrane-associated or transmembrane.; GO: 0016020 membrane
Probab=77.94 E-value=3.1 Score=36.23 Aligned_cols=19 Identities=11% Similarity=-0.027 Sum_probs=7.7
Q ss_pred cccccccCCCCcceeeecC
Q psy3616 75 KLVAQTNDQTGSQVNFYYG 93 (143)
Q Consensus 75 ~~~~~~~~~~gs~~N~~~g 93 (143)
.++.+..+....++-....
T Consensus 336 ~~y~~l~~~~~~~vk~~~~ 354 (374)
T PF01528_consen 336 TRYYPLVRTVRKRVKRYIR 354 (374)
T ss_pred hhhhhcccchHHHHHhhcc
Confidence 3444444433334443333
No 88
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=77.68 E-value=1.8 Score=25.69 Aligned_cols=26 Identities=23% Similarity=0.687 Sum_probs=18.8
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCc
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFI 29 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~ 29 (143)
..|.-+..|+..+.+..+|+|..||.
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk 30 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYK 30 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEE
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCcc
Confidence 35777889998867999999999986
No 89
>PHA03265 envelope glycoprotein D; Provisional
Probab=77.58 E-value=1.2 Score=38.79 Aligned_cols=22 Identities=14% Similarity=0.539 Sum_probs=9.7
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 47 AATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~~~~r~rr 69 (143)
+++++.|+++.+ |+++|+||||
T Consensus 355 g~~i~glv~vg~-il~~~~rr~k 376 (402)
T PHA03265 355 GLGIAGLVLVGV-ILYVCLRRKK 376 (402)
T ss_pred ccchhhhhhhhH-HHHHHhhhhh
Confidence 333333333333 3445555554
No 90
>PF05399 EVI2A: Ectropic viral integration site 2A protein (EVI2A); InterPro: IPR008608 This family contains several mammalian ectropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumours [, ].; GO: 0016021 integral to membrane
Probab=75.66 E-value=2.6 Score=34.18 Aligned_cols=24 Identities=21% Similarity=0.296 Sum_probs=12.2
Q ss_pred CcCCCccccc-cc-eeeehhHHHHHH
Q psy3616 28 FIGPHCAQKS-EF-AYIAGGIAATVV 51 (143)
Q Consensus 28 y~G~rCe~~~-~~-~~ia~~i~~~Vl 51 (143)
|.-+.||... .+ ..|.++|+++++
T Consensus 116 ~kk~~CEen~~K~amLIClIIIAVLf 141 (227)
T PF05399_consen 116 FKKEICEENNNKMAMLICLIIIAVLF 141 (227)
T ss_pred cchhhhhcCccchhHHHHHHHHHHHH
Confidence 5566788753 22 245554443333
No 91
>PF12768 Rax2: Cortical protein marker for cell polarity
Probab=75.42 E-value=7 Score=32.59 Aligned_cols=21 Identities=29% Similarity=0.423 Sum_probs=9.3
Q ss_pred eeehhHHHHHHHHHHHHHHHH
Q psy3616 41 YIAGGIAATVVFLIIIALFVW 61 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~ 61 (143)
.|.++++..+++||+++.+++
T Consensus 231 lIslAiALG~v~ll~l~Gii~ 251 (281)
T PF12768_consen 231 LISLAIALGTVFLLVLIGIIL 251 (281)
T ss_pred EEehHHHHHHHHHHHHHHHHH
Confidence 455555444444444443333
No 92
>PF15099 PIRT: Phosphoinositide-interacting protein family
Probab=75.25 E-value=2 Score=32.18 Aligned_cols=7 Identities=29% Similarity=0.620 Sum_probs=2.6
Q ss_pred HHHHHHH
Q psy3616 60 VWMICAR 66 (143)
Q Consensus 60 v~~~~~r 66 (143)
+|....|
T Consensus 101 cW~~~~r 107 (129)
T PF15099_consen 101 CWKPIIR 107 (129)
T ss_pred eehhhhH
Confidence 3433333
No 93
>TIGR02976 phageshock_pspB phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response.
Probab=74.90 E-value=5.5 Score=27.05 Aligned_cols=19 Identities=21% Similarity=0.560 Sum_probs=10.3
Q ss_pred HHHHHHHHHHHHHhhhccc
Q psy3616 54 IIIALFVWMICARSERRRE 72 (143)
Q Consensus 54 ilIvllv~~~~~r~rr~kk 72 (143)
+++++.+|++.++++|++.
T Consensus 14 ~ifVap~wl~lHY~~k~~~ 32 (75)
T TIGR02976 14 VIFVAPLWLILHYRSKRKT 32 (75)
T ss_pred HHHHHHHHHHHHHHhhhcc
Confidence 3344456777666555443
No 94
>KOG1094|consensus
Probab=74.44 E-value=12 Score=35.21 Aligned_cols=10 Identities=20% Similarity=0.205 Sum_probs=4.2
Q ss_pred HHHHHHHHHh
Q psy3616 58 LFVWMICARS 67 (143)
Q Consensus 58 llv~~~~~r~ 67 (143)
++|.++.+|+
T Consensus 406 ~ii~~~L~R~ 415 (807)
T KOG1094|consen 406 LIIALMLWRW 415 (807)
T ss_pred HHHHHHHHHH
Confidence 3344444443
No 95
>PF02038 ATP1G1_PLM_MAT8: ATP1G1/PLM/MAT8 family; InterPro: IPR000272 The FXYD protein family contains at least seven members in mammals []. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable. Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=74.14 E-value=3.4 Score=26.11 Aligned_cols=11 Identities=0% Similarity=0.072 Sum_probs=4.5
Q ss_pred HHHHHHHHHHH
Q psy3616 47 AATVVFLIIIA 57 (143)
Q Consensus 47 ~~~Vl~lilIv 57 (143)
+|+++..++.+
T Consensus 17 gGLi~A~vlfi 27 (50)
T PF02038_consen 17 GGLIFAGVLFI 27 (50)
T ss_dssp HHHHHHHHHHH
T ss_pred cchHHHHHHHH
Confidence 34444444433
No 96
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=73.73 E-value=4 Score=31.59 Aligned_cols=11 Identities=18% Similarity=0.468 Sum_probs=4.6
Q ss_pred HHHHHHHHHHH
Q psy3616 55 IIALFVWMICA 65 (143)
Q Consensus 55 lIvllv~~~~~ 65 (143)
+|+++++++.+
T Consensus 41 liiiiivli~l 51 (189)
T PF05568_consen 41 LIIIIIVLIYL 51 (189)
T ss_pred HHHHHHHHHHH
Confidence 33344444433
No 97
>KOG4699|consensus
Probab=73.15 E-value=4.4 Score=31.57 Aligned_cols=35 Identities=11% Similarity=0.148 Sum_probs=20.9
Q ss_pred HHHHHHHHHhhhccccccccccccCCCCcceeeecC
Q psy3616 58 LFVWMICARSERRREPKKLVAQTNDQTGSQVNFYYG 93 (143)
Q Consensus 58 llv~~~~~r~rr~kk~k~~~~~~~~~~gs~~N~~~g 93 (143)
+++++..+|+++.+|++...+.+=+.|- ..|++|-
T Consensus 18 ~v~~~~~~rK~kn~kk~p~~eewF~eN~-~~~vyF~ 52 (180)
T KOG4699|consen 18 TVILMFTFRKRKNRKKAPGDEEWFSENL-ELEVYFL 52 (180)
T ss_pred HHHHHHHHHHHHhhccCCCCccccCCch-hHHHHHH
Confidence 4555666666665565555555555564 4577765
No 98
>COG5538 SEC66 Endoplasmic reticulum translocation complex, subunit SEC66 [Cell motility and secretion]
Probab=73.15 E-value=4.4 Score=31.57 Aligned_cols=35 Identities=11% Similarity=0.148 Sum_probs=20.9
Q ss_pred HHHHHHHHHhhhccccccccccccCCCCcceeeecC
Q psy3616 58 LFVWMICARSERRREPKKLVAQTNDQTGSQVNFYYG 93 (143)
Q Consensus 58 llv~~~~~r~rr~kk~k~~~~~~~~~~gs~~N~~~g 93 (143)
+++++..+|+++.+|++...+.+=+.|- ..|++|-
T Consensus 18 ~v~~~~~~rK~kn~kk~p~~eewF~eN~-~~~vyF~ 52 (180)
T COG5538 18 TVILMFTFRKRKNRKKAPGDEEWFSENL-ELEVYFL 52 (180)
T ss_pred HHHHHHHHHHHHhhccCCCCccccCCch-hHHHHHH
Confidence 4555666666665565555555555564 4577765
No 99
>PF14979 TMEM52: Transmembrane 52
Probab=72.91 E-value=4.6 Score=31.01 Aligned_cols=9 Identities=22% Similarity=0.028 Sum_probs=4.8
Q ss_pred cceeeehhH
Q psy3616 38 EFAYIAGGI 46 (143)
Q Consensus 38 ~~~~ia~~i 46 (143)
..|+|++.+
T Consensus 18 ~LWyIwLil 26 (154)
T PF14979_consen 18 SLWYIWLIL 26 (154)
T ss_pred hhhHHHHHH
Confidence 345666643
No 100
>PTZ00370 STEVOR; Provisional
Probab=72.58 E-value=3.6 Score=34.78 Aligned_cols=24 Identities=17% Similarity=0.627 Sum_probs=14.7
Q ss_pred HHHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 46 IAATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 46 i~~~Vl~lilIvllv~~~~~r~rr 69 (143)
++.+++.++||+|-+|++..|++.
T Consensus 262 lvllil~vvliilYiwlyrrRK~s 285 (296)
T PTZ00370 262 LVLLILAVVLIILYIWLYRRRKNS 285 (296)
T ss_pred HHHHHHHHHHHHHHHHHHHhhcch
Confidence 344555566666667777776664
No 101
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=72.42 E-value=3.6 Score=24.86 Aligned_cols=18 Identities=22% Similarity=0.654 Sum_probs=15.7
Q ss_pred CCeeeecCCCcCCCcccc
Q psy3616 19 QPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 19 ~~~C~C~~gy~G~rCe~~ 36 (143)
.-+|.|.++++|++|+.-
T Consensus 17 ~G~C~C~~~~~G~~C~~C 34 (46)
T smart00180 17 TGQCECKPNVTGRRCDRC 34 (46)
T ss_pred CCEEECCCCCCCCCCCcC
Confidence 368999999999999964
No 102
>PRK06531 yajC preprotein translocase subunit YajC; Validated
Probab=72.31 E-value=2 Score=31.45 Aligned_cols=17 Identities=12% Similarity=0.306 Sum_probs=6.9
Q ss_pred HhhhccccccccccccC
Q psy3616 66 RSERRREPKKLVAQTND 82 (143)
Q Consensus 66 r~rr~kk~k~~~~~~~~ 82 (143)
.|.++|++|+...-...
T Consensus 20 iRPQkKr~Ke~~em~~s 36 (113)
T PRK06531 20 QRQQKKQAQERQNQLNA 36 (113)
T ss_pred echHHHHHHHHHHHHHh
Confidence 33444444443333333
No 103
>PF15048 OSTbeta: Organic solute transporter subunit beta protein
Probab=72.17 E-value=6.6 Score=29.27 Aligned_cols=8 Identities=0% Similarity=-0.155 Sum_probs=2.9
Q ss_pred HHHHHHhh
Q psy3616 61 WMICARSE 68 (143)
Q Consensus 61 ~~~~~r~r 68 (143)
+....+.+
T Consensus 54 Lgrsi~AN 61 (125)
T PF15048_consen 54 LGRSIQAN 61 (125)
T ss_pred HHHHhHhc
Confidence 33333333
No 104
>KOG3653|consensus
Probab=71.73 E-value=14 Score=33.59 Aligned_cols=22 Identities=32% Similarity=0.744 Sum_probs=12.4
Q ss_pred CCCCcEEEeC--C-CC--CCeeeecCC
Q psy3616 6 CENKGTCVKD--A-RG--QPSCRCVGS 27 (143)
Q Consensus 6 C~NgG~C~~~--~-~~--~~~C~C~~g 27 (143)
|.+.-.|+.. . .+ -+.|-|..+
T Consensus 97 c~~~~eCv~s~~~~~g~t~~~CcCs~~ 123 (534)
T KOG3653|consen 97 CEDSSECVVSAEPPPGQTLYFCCCSTD 123 (534)
T ss_pred cccccccccCCCCCCCCeEEEEecCCC
Confidence 4455567654 1 11 468888654
No 105
>PRK09458 pspB phage shock protein B; Provisional
Probab=71.72 E-value=6.2 Score=26.93 Aligned_cols=21 Identities=10% Similarity=0.425 Sum_probs=11.9
Q ss_pred HHHHHHHHHHHHHHhhhcccc
Q psy3616 53 LIIIALFVWMICARSERRREP 73 (143)
Q Consensus 53 lilIvllv~~~~~r~rr~kk~ 73 (143)
.+++|+-+|++.+.+.|+|..
T Consensus 13 F~ifVaPiWL~LHY~sk~~~~ 33 (75)
T PRK09458 13 FVLFVAPIWLWLHYRSKRQGS 33 (75)
T ss_pred HHHHHHHHHHHHhhcccccCC
Confidence 334445578777766655443
No 106
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host []. This region called CR1 (conserved region 1) [] is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=70.91 E-value=5 Score=23.98 Aligned_cols=16 Identities=19% Similarity=0.299 Sum_probs=6.3
Q ss_pred eehhHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIA 57 (143)
Q Consensus 42 ia~~i~~~Vl~lilIv 57 (143)
|.+++++.++++++.+
T Consensus 8 IIv~V~vg~~iiii~~ 23 (38)
T PF02439_consen 8 IIVAVVVGMAIIIICM 23 (38)
T ss_pred HHHHHHHHHHHHHHHH
Confidence 3344433333344433
No 107
>PF05545 FixQ: Cbb3-type cytochrome oxidase component FixQ; InterPro: IPR008621 This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon [].
Probab=70.17 E-value=4.1 Score=24.94 Aligned_cols=15 Identities=27% Similarity=0.791 Sum_probs=7.0
Q ss_pred HHHHHHHHHHHhhhc
Q psy3616 56 IALFVWMICARSERR 70 (143)
Q Consensus 56 Ivllv~~~~~r~rr~ 70 (143)
+.+++|.+..++|++
T Consensus 23 ~gi~~w~~~~~~k~~ 37 (49)
T PF05545_consen 23 IGIVIWAYRPRNKKR 37 (49)
T ss_pred HHHHHHHHcccchhh
Confidence 344455554444444
No 108
>PF15347 PAG: Phosphoprotein associated with glycosphingolipid-enriched
Probab=70.11 E-value=7.3 Score=34.24 Aligned_cols=20 Identities=20% Similarity=0.486 Sum_probs=7.9
Q ss_pred eehhHHHHHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIALFVW 61 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~ 61 (143)
+++|.+++|..++||.+||+
T Consensus 16 vlwgsLaav~~f~lis~Lif 35 (428)
T PF15347_consen 16 VLWGSLAAVTTFLLISFLIF 35 (428)
T ss_pred EeehHHHHHHHHHHHHHHHH
Confidence 33333344444444444333
No 109
>PRK00523 hypothetical protein; Provisional
Probab=70.05 E-value=6.3 Score=26.71 Aligned_cols=9 Identities=0% Similarity=0.003 Sum_probs=3.6
Q ss_pred HHhhhcccc
Q psy3616 65 ARSERRREP 73 (143)
Q Consensus 65 ~r~rr~kk~ 73 (143)
..|+..+|.
T Consensus 25 iark~~~k~ 33 (72)
T PRK00523 25 VSKKMFKKQ 33 (72)
T ss_pred HHHHHHHHH
Confidence 333444443
No 110
>PRK13664 hypothetical protein; Provisional
Probab=70.02 E-value=11 Score=24.61 Aligned_cols=16 Identities=38% Similarity=0.897 Sum_probs=10.6
Q ss_pred CCCcccccCCCCCCCCC
Q psy3616 106 HSTYAHYYDDEEDGWEM 122 (143)
Q Consensus 106 ~~~~~~~y~~~~d~~~~ 122 (143)
|+-|-.-.|| ||+|--
T Consensus 45 HRD~N~kWDd-eDdwPk 60 (62)
T PRK13664 45 HRDFNDKWDD-EDDWPK 60 (62)
T ss_pred Cccccccccc-cccCcc
Confidence 5677777777 456753
No 111
>PF14914 LRRC37AB_C: LRRC37A/B like protein 1 C-terminal domain
Probab=68.33 E-value=6 Score=30.43 Aligned_cols=11 Identities=36% Similarity=0.697 Sum_probs=4.3
Q ss_pred HHHHHHHHHHH
Q psy3616 55 IIALFVWMICA 65 (143)
Q Consensus 55 lIvllv~~~~~ 65 (143)
+|+++++.+|.
T Consensus 135 iii~CLiei~s 145 (154)
T PF14914_consen 135 IIIFCLIEICS 145 (154)
T ss_pred HHHHHHHHHHh
Confidence 33344443443
No 112
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and that its overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=68.31 E-value=1.3 Score=29.41 Aligned_cols=27 Identities=19% Similarity=0.497 Sum_probs=0.6
Q ss_pred eeehhHHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 41 YIAGGIAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~~~~~r~r 68 (143)
.|+++++++++.++||++++ ..+.+|.
T Consensus 14 vIaG~Vvgll~ailLIlf~i-yR~rkkd 40 (64)
T PF01034_consen 14 VIAGGVVGLLFAILLILFLI-YRMRKKD 40 (64)
T ss_dssp ------------------------S---
T ss_pred HHHHHHHHHHHHHHHHHHHH-HHHHhcC
Confidence 45555555555555554444 4445555
No 113
>PF15099 PIRT: Phosphoinositide-interacting protein family
Probab=67.13 E-value=3.1 Score=31.16 Aligned_cols=33 Identities=18% Similarity=0.277 Sum_probs=17.3
Q ss_pred HHHHHHHHHHHHHHH-HHHHHhhhcccccccccc
Q psy3616 47 AATVVFLIIIALFVW-MICARSERRREPKKLVAQ 79 (143)
Q Consensus 47 ~~~Vl~lilIvllv~-~~~~r~rr~kk~k~~~~~ 79 (143)
.|.+++-+-+.+++. .+|++-++|||+||+.+.
T Consensus 83 ~G~vlLs~GLmlL~~~alcW~~~~rkK~~kr~eS 116 (129)
T PF15099_consen 83 FGPVLLSLGLMLLACSALCWKPIIRKKKKKRRES 116 (129)
T ss_pred ehHHHHHHHHHHHHhhhheehhhhHhHHHHhhhh
Confidence 344444333333433 477777776666555443
No 114
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=66.97 E-value=6.9 Score=33.08 Aligned_cols=20 Identities=20% Similarity=0.745 Sum_probs=9.0
Q ss_pred HHHHHHHHHHHHHHHHHHhh
Q psy3616 49 TVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 49 ~Vl~lilIvllv~~~~~r~r 68 (143)
+++.++||+|-||++..|++
T Consensus 269 lil~vvliiLYiWlyrrRK~ 288 (295)
T TIGR01478 269 IILTVVLIILYIWLYRRRKK 288 (295)
T ss_pred HHHHHHHHHHHHHHHHhhcc
Confidence 33334444444455544444
No 115
>PF05961 Chordopox_A13L: Chordopoxvirus A13L protein; InterPro: IPR009236 This family consists of A13L proteins from the Chordopoxviruses. A13L or p8 is one of the three most abundant membrane proteins of the intracellular mature Vaccinia virus [].
Probab=66.86 E-value=4.1 Score=27.30 Aligned_cols=18 Identities=22% Similarity=0.648 Sum_probs=11.3
Q ss_pred CCccccCCCCCCCccccc
Q psy3616 96 PYAESVAPSHHSTYAHYY 113 (143)
Q Consensus 96 py~e~~~~~~~~~~~~~y 113 (143)
-|-.+..|.|.++|-.++
T Consensus 48 ~yVd~L~~~Hl~SfYkLF 65 (68)
T PF05961_consen 48 GYVDKLKPDHLSSFYKLF 65 (68)
T ss_pred hHHhccCHHHHHHHHHHh
Confidence 344566677877765554
No 116
>PHA03049 IMV membrane protein; Provisional
Probab=66.51 E-value=3.8 Score=27.40 Aligned_cols=17 Identities=18% Similarity=0.585 Sum_probs=10.1
Q ss_pred CccccCCCCCCCccccc
Q psy3616 97 YAESVAPSHHSTYAHYY 113 (143)
Q Consensus 97 y~e~~~~~~~~~~~~~y 113 (143)
|-.+..|+|.++|-.++
T Consensus 49 yvD~L~~~hl~SfyklF 65 (68)
T PHA03049 49 YVDKLKSSHLNSFYKLF 65 (68)
T ss_pred HHhhcCHHHHHHHHHHh
Confidence 33456667777665554
No 117
>PF15117 UPF0697: Uncharacterised protein family UPF0697
Probab=66.29 E-value=3.4 Score=29.16 Aligned_cols=12 Identities=25% Similarity=0.487 Sum_probs=5.9
Q ss_pred HHHHHHHhhhcc
Q psy3616 60 VWMICARSERRR 71 (143)
Q Consensus 60 v~~~~~r~rr~k 71 (143)
.++++.||+|||
T Consensus 30 ~l~~YarrNKrk 41 (99)
T PF15117_consen 30 GLFMYARRNKRK 41 (99)
T ss_pred HHHHhhhhcCce
Confidence 344556555443
No 118
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation [].; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=66.14 E-value=1.9 Score=38.11 Aligned_cols=6 Identities=0% Similarity=-0.092 Sum_probs=0.0
Q ss_pred HHHHhh
Q psy3616 63 ICARSE 68 (143)
Q Consensus 63 ~~~r~r 68 (143)
++.-.+
T Consensus 372 v~vc~~ 377 (439)
T PF02480_consen 372 VWVCLR 377 (439)
T ss_dssp ------
T ss_pred hheeee
Confidence 333333
No 119
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) [] are integral membrane proteins, specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below. +-----+ +-----+ +-----+ +-----+ | | | | | | | | xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx +--------------------------++Hinge++--------------------------++TM++C+ In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin) [] is a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge; a single lamp-like domain; a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region []. In a lamp-family protein from nematodes [] only the part C-terminal to the hinge is conserved. ; GO: 0016020 membrane
Probab=65.96 E-value=7.2 Score=32.41 Aligned_cols=33 Identities=15% Similarity=0.081 Sum_probs=17.6
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhcccccccccc
Q psy3616 47 AATVVFLIIIALFVWMICARSERRREPKKLVAQ 79 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~~~~r~rr~kk~k~~~~~ 79 (143)
+.+++.++|.+|+++++....-.|||.+..++.
T Consensus 273 vPIaVG~~La~lvlivLiaYli~Rrr~~~gYq~ 305 (306)
T PF01299_consen 273 VPIAVGAALAGLVLIVLIAYLIGRRRSRAGYQS 305 (306)
T ss_pred HHHHHHHHHHHHHHHHHHhheeEeccccccccc
Confidence 455555555555555555555555555555543
No 120
>KOG0793|consensus
Probab=65.79 E-value=5.3 Score=37.92 Aligned_cols=32 Identities=13% Similarity=0.239 Sum_probs=20.2
Q ss_pred HHHHHHHHHHHHHHHhhhccccccccccccCC
Q psy3616 52 FLIIIALFVWMICARSERRREPKKLVAQTNDQ 83 (143)
Q Consensus 52 ~lilIvllv~~~~~r~rr~kk~k~~~~~~~~~ 83 (143)
++.+|++++.++|.|+++++|.|.++.-.-.+
T Consensus 615 ~a~vLv~~a~~~~~R~h~r~rdke~l~~l~~d 646 (1004)
T KOG0793|consen 615 IAGVLVASALIYCLRHHSRHRDKEKLSGLGGD 646 (1004)
T ss_pred HHHHHHHHHHHHHHHHHHhhhhhHHhhccCCC
Confidence 33444455667788888887777666655443
No 121
>PF14584 DUF4446: Protein of unknown function (DUF4446)
Probab=64.91 E-value=7.1 Score=29.78 Aligned_cols=6 Identities=33% Similarity=0.706 Sum_probs=2.3
Q ss_pred CCCCCC
Q psy3616 115 DEEDGW 120 (143)
Q Consensus 115 ~~~d~~ 120 (143)
++.||.
T Consensus 107 ~~~nGv 112 (151)
T PF14584_consen 107 DNNNGV 112 (151)
T ss_pred CCCCEE
Confidence 334443
No 122
>PF15069 FAM163: FAM163 family
Probab=64.77 E-value=7.5 Score=29.63 Aligned_cols=27 Identities=7% Similarity=0.116 Sum_probs=17.3
Q ss_pred ehhHHHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 43 AGGIAATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 43 a~~i~~~Vl~lilIvllv~~~~~r~rr 69 (143)
+++.+++++.+|||.+++++++.|..+
T Consensus 6 vVItGgILAtVILLcIIaVLCYCRLQY 32 (143)
T PF15069_consen 6 VVITGGILATVILLCIIAVLCYCRLQY 32 (143)
T ss_pred EEEechHHHHHHHHHHHHHHHHHhhHH
Confidence 344466666666677777777777443
No 123
>PF04689 S1FA: DNA binding protein S1FA; InterPro: IPR006779 S1FA is an unusual small plant peptide of only 70 amino acids with a basic domain which contains a nuclear localization signal and a putative DNA binding helix. S1FA is highly conserved between dicotyledonous and monocotyledonous plants and may be a DNA-binding protein that specifically recognises the negative promoter element S1F [].; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=63.64 E-value=15 Score=24.53 Aligned_cols=28 Identities=14% Similarity=0.307 Sum_probs=13.7
Q ss_pred eeehhHHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 41 YIAGGIAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~~~~~r~r 68 (143)
.|.+.+++.++++.+|...+++.++++.
T Consensus 14 lIVLlvV~g~ll~flvGnyvlY~Yaqk~ 41 (69)
T PF04689_consen 14 LIVLLVVAGLLLVFLVGNYVLYVYAQKT 41 (69)
T ss_pred eEEeehHHHHHHHHHHHHHHHHHHHhhc
Confidence 3444444444444555555555555444
No 124
>PF15345 TMEM51: Transmembrane protein 51
Probab=63.41 E-value=8.2 Score=31.65 Aligned_cols=12 Identities=25% Similarity=0.188 Sum_probs=6.4
Q ss_pred eeecCCCCCccc
Q psy3616 89 NFYYGGAPYAES 100 (143)
Q Consensus 89 N~~~g~ppy~e~ 100 (143)
+-.|-.|.|-|.
T Consensus 119 ~s~y~vPSYEEv 130 (233)
T PF15345_consen 119 ASRYYVPSYEEV 130 (233)
T ss_pred cccccCCChHHH
Confidence 444555666554
No 125
>COG4736 CcoQ Cbb3-type cytochrome oxidase, subunit 3 [Posttranslational modification, protein turnover, chaperones]
Probab=63.40 E-value=8.6 Score=25.13 Aligned_cols=6 Identities=17% Similarity=0.113 Sum_probs=2.2
Q ss_pred HHHHHH
Q psy3616 61 WMICAR 66 (143)
Q Consensus 61 ~~~~~r 66 (143)
+++.+|
T Consensus 26 i~~ayr 31 (60)
T COG4736 26 IYFAYR 31 (60)
T ss_pred HHHHhc
Confidence 333333
No 126
>PF15065 NCU-G1: Lysosomal transcription factor, NCU-G1
Probab=63.34 E-value=2.3 Score=36.75 Aligned_cols=12 Identities=33% Similarity=0.728 Sum_probs=6.0
Q ss_pred HHHHHHHhhhcc
Q psy3616 60 VWMICARSERRR 71 (143)
Q Consensus 60 v~~~~~r~rr~k 71 (143)
.+++|.||+|+|
T Consensus 338 gl~v~~~r~r~~ 349 (350)
T PF15065_consen 338 GLYVCLRRRRKR 349 (350)
T ss_pred hheEEEeccccC
Confidence 344556555443
No 127
>PHA03286 envelope glycoprotein E; Provisional
Probab=62.79 E-value=8 Score=34.71 Aligned_cols=7 Identities=14% Similarity=-0.031 Sum_probs=3.3
Q ss_pred ceeeecC
Q psy3616 87 QVNFYYG 93 (143)
Q Consensus 87 ~~N~~~g 93 (143)
..|.-|.
T Consensus 443 ~~~~~~~ 449 (492)
T PHA03286 443 DFNSPLS 449 (492)
T ss_pred ccCCccc
Confidence 3454444
No 128
>PF11157 DUF2937: Protein of unknown function (DUF2937); InterPro: IPR022584 This family of proteins with unknown function appears to be found mainly in Proteobacteria.
Probab=62.56 E-value=9.4 Score=29.45 Aligned_cols=24 Identities=13% Similarity=0.307 Sum_probs=9.6
Q ss_pred eehhHHHHHHHHHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIALFVWMICA 65 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~~~~~ 65 (143)
++.++++++++.+++-++...+..
T Consensus 136 i~~g~vg~l~~~~l~~~l~~l~~~ 159 (167)
T PF11157_consen 136 IVFGLVGALLGALLVELLLGLLRR 159 (167)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH
Confidence 333344444444443333333333
No 129
>PF05984 Cytomega_UL20A: Cytomegalovirus UL20A protein; InterPro: IPR009245 This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein [].
Probab=61.59 E-value=17 Score=25.62 Aligned_cols=13 Identities=31% Similarity=0.176 Sum_probs=7.1
Q ss_pred CCCcccccCCCCC
Q psy3616 106 HSTYAHYYDDEED 118 (143)
Q Consensus 106 ~~~~~~~y~~~~d 118 (143)
.+.-+.+-|++||
T Consensus 52 lsqgGsttDg~Ed 64 (100)
T PF05984_consen 52 LSQGGSTTDGNED 64 (100)
T ss_pred EcCCCccCCCccc
Confidence 3445555566565
No 130
>PF07204 Orthoreo_P10: Orthoreovirus membrane fusion protein p10; InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction [].
Probab=61.39 E-value=4.6 Score=28.79 Aligned_cols=28 Identities=32% Similarity=0.529 Sum_probs=16.5
Q ss_pred eeehhHHHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 41 YIAGGIAATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~~~~~r~rr 69 (143)
+++.+ +++++++|||.++......+|..
T Consensus 44 yLA~G-GG~iLilIii~Lv~CC~~K~K~~ 71 (98)
T PF07204_consen 44 YLAAG-GGLILILIIIALVCCCRAKHKTS 71 (98)
T ss_pred Hhhcc-chhhhHHHHHHHHHHhhhhhhhH
Confidence 34444 45566666666666667666644
No 131
>PRK01844 hypothetical protein; Provisional
Probab=58.01 E-value=15 Score=24.84 Aligned_cols=7 Identities=0% Similarity=-0.301 Sum_probs=2.7
Q ss_pred hhhcccc
Q psy3616 67 SERRREP 73 (143)
Q Consensus 67 ~rr~kk~ 73 (143)
||..+|.
T Consensus 26 rk~~~k~ 32 (72)
T PRK01844 26 RKYMMNY 32 (72)
T ss_pred HHHHHHH
Confidence 3444333
No 132
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas [, , ]. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=57.29 E-value=8.8 Score=35.83 Aligned_cols=31 Identities=16% Similarity=0.210 Sum_probs=21.1
Q ss_pred eehhHHHHHHHHHHHHHHHHHHHHHhhhccc
Q psy3616 42 IAGGIAATVVFLIIIALFVWMICARSERRRE 72 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~~~~~r~rr~kk 72 (143)
-+.+|+|+++.+++++++|+++++.-.|++|
T Consensus 268 NlWII~gVlvPv~vV~~Iiiil~~~LCRk~K 298 (684)
T PF12877_consen 268 NLWIIAGVLVPVLVVLLIIIILYWKLCRKNK 298 (684)
T ss_pred CeEEEehHhHHHHHHHHHHHHHHHHHhcccc
Confidence 3444577777677777777778887776554
No 133
>PF07423 DUF1510: Protein of unknown function (DUF1510); InterPro: IPR009988 This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.
Probab=57.11 E-value=7.4 Score=31.49 Aligned_cols=10 Identities=50% Similarity=0.680 Sum_probs=4.0
Q ss_pred HHHHHHHHHH
Q psy3616 48 ATVVFLIIIA 57 (143)
Q Consensus 48 ~~Vl~lilIv 57 (143)
++|+|||||+
T Consensus 21 ~IV~lLIiiv 30 (217)
T PF07423_consen 21 GIVSLLIIIV 30 (217)
T ss_pred HHHHHHHHHH
Confidence 3444343333
No 134
>KOG1214|consensus
Probab=56.21 E-value=9.5 Score=37.04 Aligned_cols=27 Identities=33% Similarity=0.778 Sum_probs=23.7
Q ss_pred CCCCCCcEEEeCCCCCCeeeecCCCcCC
Q psy3616 4 GYCENKGTCVKDARGQPSCRCVGSFIGP 31 (143)
Q Consensus 4 ~~C~NgG~C~~~~~~~~~C~C~~gy~G~ 31 (143)
+.|.-.++|.++ .+++.|+|.+||.|+
T Consensus 833 srChp~A~Cynt-pgsfsC~C~pGy~GD 859 (1289)
T KOG1214|consen 833 SRCHPAATCYNT-PGSFSCRCQPGYYGD 859 (1289)
T ss_pred cccCCCceEecC-CCcceeecccCccCC
Confidence 457778999999 899999999999876
No 135
>PF11359 gpUL132: Glycoprotein UL132; InterPro: IPR021023 Glycoprotein UL132 is a low-abundance structural component of Human herpesvirus 5. The function of this protein is not fully understood.
Probab=56.15 E-value=38 Score=27.68 Aligned_cols=11 Identities=18% Similarity=0.295 Sum_probs=6.1
Q ss_pred CCCCCcccccC
Q psy3616 104 SHHSTYAHYYD 114 (143)
Q Consensus 104 ~~~~~~~~~y~ 114 (143)
...+.|...-+
T Consensus 122 ~sSs~YQRL~~ 132 (235)
T PF11359_consen 122 CSSSKYQRLEN 132 (235)
T ss_pred ccccccccccc
Confidence 44456766544
No 136
>PF00558 Vpu: Vpu protein; InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells.; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=53.37 E-value=13 Score=25.64 Aligned_cols=26 Identities=15% Similarity=0.388 Sum_probs=6.9
Q ss_pred HHHHHHHHHHH-HHHHHHHHHhhhccc
Q psy3616 47 AATVVFLIIIA-LFVWMICARSERRRE 72 (143)
Q Consensus 47 ~~~Vl~lilIv-llv~~~~~r~rr~kk 72 (143)
.++++++.+++ .+++.-+.+-+|+||
T Consensus 12 liv~~iiaIvvW~iv~ieYrk~~rqrk 38 (81)
T PF00558_consen 12 LIVALIIAIVVWTIVYIEYRKIKRQRK 38 (81)
T ss_dssp HHHHHHHHHHHHHHH------------
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence 34444555555 455544444343333
No 137
>PF13974 YebO: YebO-like protein
Probab=53.33 E-value=12 Score=25.84 Aligned_cols=18 Identities=28% Similarity=0.875 Sum_probs=10.1
Q ss_pred HHHHHHHHHHHHHHHHhh
Q psy3616 51 VFLIIIALFVWMICARSE 68 (143)
Q Consensus 51 l~lilIvllv~~~~~r~r 68 (143)
++++++.+++|++..|..
T Consensus 5 ~~~~lv~livWFFVnRaS 22 (80)
T PF13974_consen 5 VLVLLVGLIVWFFVNRAS 22 (80)
T ss_pred HHHHHHHHHHHHHHHHHH
Confidence 334445566777655544
No 138
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.
Probab=53.24 E-value=16 Score=29.78 Aligned_cols=20 Identities=25% Similarity=0.537 Sum_probs=11.6
Q ss_pred eeeehhHHHHHHHHHHHHHH
Q psy3616 40 AYIAGGIAATVVFLIIIALF 59 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIvll 59 (143)
+.|++|+++++|+|++.+++
T Consensus 41 iaiVAG~~tVILVI~i~v~v 60 (221)
T PF08374_consen 41 IAIVAGIMTVILVIFIVVLV 60 (221)
T ss_pred eeeecchhhhHHHHHHHHHH
Confidence 44555555666666666643
No 139
>PF05084 GRA6: Granule antigen protein (GRA6); InterPro: IPR008119 Toxoplasma gondii is an obligate intracellular apicomplexan protozoan parasite, with a complex lifestyle involving varied hosts. It has two phases of growth: an intestinal phase in feline hosts, and an extra-intestinal phase in other mammals. Oocysts from infected cats develop into tachyzoites, and eventually, bradyzoites and zoitocysts in the extraintestinal host. Transmission of the parasite occurs through contact with infected cats or raw/undercooked meat; in immunocompromised individuals, it can cause severe and often lethal toxoplasmosis. Acute infection in healthy humans can sometimes also cause tissue damage. The protozoan utilises a variety of secretory and antigenic proteins to invade a host and gain access to the intracellular environment. These originate from distinct organelles in the T. gondii cell termed micronemes, rhoptries, and dense granules. They are released at specific times during invasion to ensure the proteins are allocated to their correct target destinations. Dense granule antigens (GRAs) are released from the T. gondii tachyzoite while still encapsulated in a host vacuole. Gra6, one of these moieties, is associated with the parasitophorous vacuole. It possesses a hydrophobic central region flanked by two hydrophilic domains, and is present as a single copy gene in the Toxoplasma gondii genome. Gra6 shares a similar function with Gra2, in that it is rapidly targeted to a network of membranous tubules that connect with the vacuolar membrane. Indeed, these two proteins, together with Gra4, form a multimeric complex that stabilises the parasite within the vacuole.
Probab=53.07 E-value=18 Score=28.49 Aligned_cols=22 Identities=23% Similarity=0.600 Sum_probs=14.2
Q ss_pred HHHHHHHHHHHHHHHHHHHHhh
Q psy3616 47 AATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~~~~r~r 68 (143)
++.+++...+++|.|++..|+.
T Consensus 154 IG~~VlA~~VA~L~~~F~RR~~ 175 (215)
T PF05084_consen 154 IGAVVLAVSVAMLTWFFLRRTG 175 (215)
T ss_pred HHHHHHHHHHHHHHHHHHHhhc
Confidence 5666666777777776655443
No 140
>KOG1226|consensus
Probab=52.70 E-value=17 Score=34.64 Aligned_cols=25 Identities=24% Similarity=0.534 Sum_probs=12.4
Q ss_pred eeehhHHHHHHHHHHHHHHHHHHHH
Q psy3616 41 YIAGGIAATVVFLIIIALFVWMICA 65 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~~~~~ 65 (143)
.|.++++++++++.|+++++|-+..
T Consensus 715 ~i~lgvv~~ivligl~llliwkll~ 739 (783)
T KOG1226|consen 715 AIVLGVVAGIVLIGLALLLIWKLLT 739 (783)
T ss_pred eehHHHHHHHHHHHHHHHHHHHHhh
Confidence 3444444444445555556664433
No 141
>PF03229 Alpha_GJ: Alphavirus glycoprotein J; InterPro: IPR004913 The exact function of the herpesvirus glycoprotein J is unknown, but it appears to play a role in the inhibition of apoptosis of the host cell.; GO: 0019050 suppression by virus of host apoptosis
Probab=52.64 E-value=19 Score=26.70 Aligned_cols=27 Identities=11% Similarity=-0.041 Sum_probs=12.9
Q ss_pred eehhHHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 42 IAGGIAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~~~~~r~r 68 (143)
+...++|.+..|.|.++....+..|..
T Consensus 85 aLp~VIGGLcaL~LaamGA~~LLrR~c 111 (126)
T PF03229_consen 85 ALPLVIGGLCALTLAAMGAGALLRRCC 111 (126)
T ss_pred chhhhhhHHHHHHHHHHHHHHHHHHHH
Confidence 334455555555555554444434333
No 142
>PF00974 Rhabdo_glycop: Rhabdovirus spike glycoprotein; InterPro: IPR001903 Different families of ssRNA negative-strand viruses contain glycoproteins responsible for forming spikes on the surface of the virion. The glycoprotein spike is made up of a trimer of glycoproteins. These proteins are frequently abbreviated to G protein. The channel formed by the glycoprotein spike is thought to function in a similar manner to the Influenza virus M2 protein channel, thus allowing a signal to pass across the viral membrane to signal for viral uncoating.; GO: 0019031 viral envelope; PDB: 2CMZ_C 2J6J_A 3EGD_D.
Probab=52.43 E-value=4.7 Score=36.26 Aligned_cols=15 Identities=20% Similarity=0.401 Sum_probs=0.0
Q ss_pred HHHHHHHHHhhhccc
Q psy3616 58 LFVWMICARSERRRE 72 (143)
Q Consensus 58 llv~~~~~r~rr~kk 72 (143)
+.++..+.|+++.++
T Consensus 472 ~~cc~~~~r~~~~~~ 486 (501)
T PF00974_consen 472 IRCCCRCRRRRRPKR 486 (501)
T ss_dssp ---------------
T ss_pred HHHhhhhcccccccc
Confidence 334445565554443
No 143
>PF05510 Sarcoglycan_2: Sarcoglycan alpha/epsilon; InterPro: IPR008908 Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localisation of these proteins along the nonjunctional sarcolemma is not clear. This family contains alpha and epsilon members.; GO: 0016012 sarcoglycan complex
Probab=52.35 E-value=9.2 Score=33.53 Aligned_cols=16 Identities=31% Similarity=0.945 Sum_probs=6.6
Q ss_pred HHHHHHHHHHHHHHhh
Q psy3616 53 LIIIALFVWMICARSE 68 (143)
Q Consensus 53 lilIvllv~~~~~r~r 68 (143)
+||++++..+.|+|+-
T Consensus 297 llL~llLs~Imc~rRE 312 (386)
T PF05510_consen 297 LLLLLLLSYIMCCRRE 312 (386)
T ss_pred HHHHHHHHHHheechH
Confidence 3333334444444443
No 144
>PF07253 Gypsy: Gypsy protein; InterPro: IPR009882 This family consists of several Gypsy/Env proteins from Drosophila and Ceratitis fruit fly species. Gypsy is an endogenous retrovirus of Drosophila melanogaster. Phylogenetic studies suggest that occasional horizontal transfer events of gypsy occur between Drosophila species. Gypsy possesses infective properties associated with the products of the envelope gene that might be at the origin of these interspecies transfers.
Probab=52.11 E-value=21 Score=32.14 Aligned_cols=17 Identities=24% Similarity=0.555 Sum_probs=7.0
Q ss_pred HHHHHHHHHHHHHhhhc
Q psy3616 54 IIIALFVWMICARSERR 70 (143)
Q Consensus 54 ilIvllv~~~~~r~rr~ 70 (143)
.+++++++....|++|.
T Consensus 429 ~Ii~~i~~~~~~r~~r~ 445 (472)
T PF07253_consen 429 MIIIIIALILMLRKKRQ 445 (472)
T ss_pred HHHHHHHHHHHHHhhhh
Confidence 33334444444444443
No 145
>PF15048 OSTbeta: Organic solute transporter subunit beta protein
Probab=52.05 E-value=16 Score=27.20 Aligned_cols=28 Identities=11% Similarity=0.093 Sum_probs=15.4
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhcc
Q psy3616 44 GGIAATVVFLIIIALFVWMICARSERRR 71 (143)
Q Consensus 44 ~~i~~~Vl~lilIvllv~~~~~r~rr~k 71 (143)
++.+.+|++|-++++..-...-|+||++
T Consensus 40 L~Ls~vvlvi~~~LLgrsi~ANRnrK~~ 67 (125)
T PF15048_consen 40 LALSFVVLVISFFLLGRSIQANRNRKMQ 67 (125)
T ss_pred HHHHHHHHHHHHHHHHHHhHhccccccc
Confidence 3344555555555666555556666554
No 146
>PF12191 stn_TNFRSF12A: Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain; InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration. The receptor may also play a role in cell-matrix interactions.; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=52.05 E-value=4.8 Score=30.15 Aligned_cols=32 Identities=31% Similarity=0.696 Sum_probs=0.0
Q ss_pred eehhHHHHHHHHHHH-HHHHHHHHHHhhhcccc
Q psy3616 42 IAGGIAATVVFLIII-ALFVWMICARSERRREP 73 (143)
Q Consensus 42 ia~~i~~~Vl~lilI-vllv~~~~~r~rr~kk~ 73 (143)
|.+++.++++++.++ .+++|.-|.||++...+
T Consensus 81 i~~sal~v~lVl~llsg~lv~rrcrrr~~~ttP 113 (129)
T PF12191_consen 81 ILGSALSVVLVLALLSGFLVWRRCRRREKFTTP 113 (129)
T ss_dssp ---------------------------------
T ss_pred hhhhHHHHHHHHHHHHHHHHHhhhhccccCCCc
Confidence 444444444444443 35677777777655544
No 147
>PTZ00234 variable surface protein Vir12; Provisional
Probab=51.73 E-value=9.2 Score=33.96 Aligned_cols=7 Identities=14% Similarity=0.411 Sum_probs=2.9
Q ss_pred hcccccc
Q psy3616 69 RRREPKK 75 (143)
Q Consensus 69 r~kk~k~ 75 (143)
+|||+|+
T Consensus 395 krkrkk~ 401 (433)
T PTZ00234 395 KGEKKKR 401 (433)
T ss_pred chhhccc
Confidence 3444443
No 148
>PF13994 PgaD: PgaD-like protein
Probab=51.20 E-value=18 Score=26.68 Aligned_cols=20 Identities=20% Similarity=0.469 Sum_probs=9.6
Q ss_pred HHHHHHHHHHHhhhcccccc
Q psy3616 56 IALFVWMICARSERRREPKK 75 (143)
Q Consensus 56 Ivllv~~~~~r~rr~kk~k~ 75 (143)
+++++|..+-++|.+.++++
T Consensus 76 ~~Li~Wa~yn~~Rf~~~~rr 95 (138)
T PF13994_consen 76 VILILWAKYNRLRFRGRRRR 95 (138)
T ss_pred HHHHHHHHHHHHHhcchhhc
Confidence 34555655554444444333
No 149
>PF06084 Cytomega_TRL10: Cytomegalovirus TRL10 protein; InterPro: IPR009284 This family consists of several Cytomegalovirus TRL10 proteins. TRL10 represents a structural component of the virus particle and like the other HCMV envelope glycoproteins, is present in a disulphide-linked complex.
Probab=51.01 E-value=11 Score=28.05 Aligned_cols=16 Identities=19% Similarity=0.656 Sum_probs=10.3
Q ss_pred EEeC-CCCCCeeeecCC
Q psy3616 12 CVKD-ARGQPSCRCVGS 27 (143)
Q Consensus 12 C~~~-~~~~~~C~C~~g 27 (143)
|... ....-.|+|.+.
T Consensus 11 c~~~~~~t~l~ckc~~~ 27 (150)
T PF06084_consen 11 CTSKSENTHLTCKCSPW 27 (150)
T ss_pred eEeccCCeeEEEecCCC
Confidence 6554 344678888775
No 150
>PF12768 Rax2: Cortical protein marker for cell polarity
Probab=50.11 E-value=26 Score=29.17 Aligned_cols=29 Identities=17% Similarity=0.125 Sum_probs=14.1
Q ss_pred eehhHHHHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 42 IAGGIAATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
++++..++.+.+++++.++-++..|.+||
T Consensus 229 VVlIslAiALG~v~ll~l~Gii~~~~~r~ 257 (281)
T PF12768_consen 229 VVLISLAIALGTVFLLVLIGIILAYIRRR 257 (281)
T ss_pred EEEEehHHHHHHHHHHHHHHHHHHHHHhh
Confidence 33333444444444445555666665544
No 151
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=49.92 E-value=2.7 Score=30.98 Aligned_cols=29 Identities=21% Similarity=0.513 Sum_probs=1.7
Q ss_pred eeehh-HHHHHHHHHHHHHHHHHHHHHhhh
Q psy3616 41 YIAGG-IAATVVFLIIIALFVWMICARSER 69 (143)
Q Consensus 41 ~ia~~-i~~~Vl~lilIvllv~~~~~r~rr 69 (143)
+|.+- .+|+.++++|+++++++-||..||
T Consensus 20 yitAEEAaGIGiL~VILgiLLliGCWYckR 49 (118)
T PF14991_consen 20 YITAEEAAGIGILIVILGILLLIGCWYCKR 49 (118)
T ss_dssp --------SSS-------------------
T ss_pred eeeHHHhccceeHHHHHHHHHHHhheeeee
Confidence 44444 455555555555655544444343
No 152
>PF00599 Flu_M2: Influenza Matrix protein (M2); InterPro: IPR002089 This entry contains Influenza virus matrix protein 2. It is an integral membrane protein that is expressed on the infected cell surface and incorporated into virions where it is a minor component. The protein spans the viral membrane with an extracellular amino-terminus and a cytoplasmic carboxy-terminus. The transmembrane domain of the M2 protein forms the channel pore. The M2 protein, which forms a homotetramer, has H+ ion channel activity, which was found to be regulated by pH and may have a pivotal role in the biology of Influenza virus infection.; GO: 0015078 hydrogen ion transmembrane transporter activity, 0015992 proton transport, 0033644 host cell membrane, 0055036 virion membrane; PDB: 2L0J_A 2KWX_B 2KIH_A 2RLF_A 1MP6_A 2LJB_D 2LJC_A 2H95_B 1NYJ_B 3BKD_E ....
Probab=49.92 E-value=3.7 Score=28.90 Aligned_cols=25 Identities=24% Similarity=0.417 Sum_probs=9.9
Q ss_pred ccccccceeeehhHHHHHHHHHHHH
Q psy3616 33 CAQKSEFAYIAGGIAATVVFLIIIA 57 (143)
Q Consensus 33 Ce~~~~~~~ia~~i~~~Vl~lilIv 57 (143)
|....+...+++.|++++=+++-|+
T Consensus 19 c~~ssd~lv~aA~IiGILHLiLWI~ 43 (97)
T PF00599_consen 19 CSDSSDPLVIAANIIGILHLILWIL 43 (97)
T ss_dssp ----HHHHHHHHHHHHHHHHHHHHH
T ss_pred ecCCcchHHHHHHHHHHHHHHHHHH
Confidence 3333444445555555544433333
No 153
>COG3763 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=49.65 E-value=28 Score=23.52 Aligned_cols=9 Identities=0% Similarity=0.117 Sum_probs=3.6
Q ss_pred HHhhhcccc
Q psy3616 65 ARSERRREP 73 (143)
Q Consensus 65 ~r~rr~kk~ 73 (143)
..+|..+|.
T Consensus 24 iark~~~k~ 32 (71)
T COG3763 24 IARKQMKKQ 32 (71)
T ss_pred HHHHHHHHH
Confidence 333444433
No 154
>PF10873 DUF2668: Protein of unknown function (DUF2668); InterPro: IPR022640 Members in this family of proteins are annotated as cysteine and tyrosine-rich protein 1, however currently no function is known.
Probab=48.87 E-value=22 Score=27.37 Aligned_cols=40 Identities=28% Similarity=0.255 Sum_probs=28.0
Q ss_pred CCcCCCccccccceeeehhHHHHHHHHHHHHHHHHHHHHHh
Q psy3616 27 SFIGPHCAQKSEFAYIAGGIAATVVFLIIIALFVWMICARS 67 (143)
Q Consensus 27 gy~G~rCe~~~~~~~ia~~i~~~Vl~lilIvllv~~~~~r~ 67 (143)
.|.|+-+.- ...+.|++++..++.++..|++++.++....
T Consensus 52 ~yi~~~lsg-tAIaGIVfgiVfimgvva~i~icvCmc~kn~ 91 (155)
T PF10873_consen 52 AYIGDVLSG-TAIAGIVFGIVFIMGVVAGIAICVCMCMKNS 91 (155)
T ss_pred hhhcccccc-ceeeeeehhhHHHHHHHHHHHHHHhhhhhcC
Confidence 666666653 3677888888777777777777777665433
No 155
>KOG3488|consensus
Probab=48.47 E-value=26 Score=23.84 Aligned_cols=23 Identities=26% Similarity=0.527 Sum_probs=11.4
Q ss_pred HHHHHHHHHHHHHHHHHHHHHhh
Q psy3616 46 IAATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 46 i~~~Vl~lilIvllv~~~~~r~r 68 (143)
+++.++++.+|.+.+.++..+.+
T Consensus 54 vaagl~ll~lig~Fis~vMlKsk 76 (81)
T KOG3488|consen 54 VAAGLFLLCLIGTFISLVMLKSK 76 (81)
T ss_pred HHHHHHHHHHHHHHHHHHhhhcc
Confidence 34444444445555555555444
No 156
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=47.90 E-value=3.6 Score=28.99 Aligned_cols=49 Identities=20% Similarity=0.337 Sum_probs=22.7
Q ss_pred ecCCC--cCCCccccc-c-ceeeehhHHHHHHH-HHHHHHHHHHHHHHhhhccc
Q psy3616 24 CVGSF--IGPHCAQKS-E-FAYIAGGIAATVVF-LIIIALFVWMICARSERRRE 72 (143)
Q Consensus 24 C~~gy--~G~rCe~~~-~-~~~ia~~i~~~Vl~-lilIvllv~~~~~r~rr~kk 72 (143)
|..+| .+..|.... . .......|+++++. +++|.+++.++++...+|||
T Consensus 42 C~~GY~~~~~~Cv~~st~~~~ls~gaiagi~vg~~~~v~~lv~~l~w~f~~r~k 95 (96)
T PTZ00382 42 CNSGFSLDNGKCVSSGANRSGLSTGAIAGISVAVVAVVGGLVGFLCWWFVCRGK 95 (96)
T ss_pred CcCCcccCCCcccccccCCCCcccccEEEEEeehhhHHHHHHHHHhheeEEeec
Confidence 55664 344554322 1 12233445555554 44454565555554444443
No 157
>PRK11901 hypothetical protein; Reviewed
Probab=47.58 E-value=11 Score=32.26 Aligned_cols=16 Identities=44% Similarity=0.640 Sum_probs=7.7
Q ss_pred eehhHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIA 57 (143)
Q Consensus 42 ia~~i~~~Vl~lilIv 57 (143)
+.+||+++|||||||.
T Consensus 38 ~MiGiGilVLlLLIi~ 53 (327)
T PRK11901 38 MMIGIGILVLLLLIIA 53 (327)
T ss_pred HHHHHHHHHHHHHHHH
Confidence 4455544555444444
No 158
>PHA03289 envelope glycoprotein I; Provisional
Probab=47.56 E-value=33 Score=29.63 Aligned_cols=9 Identities=44% Similarity=0.567 Sum_probs=6.1
Q ss_pred CcceeeecC
Q psy3616 85 GSQVNFYYG 93 (143)
Q Consensus 85 gs~~N~~~g 93 (143)
.+++|--||
T Consensus 313 ~~~~~~~f~ 321 (352)
T PHA03289 313 NSAVNEKFG 321 (352)
T ss_pred hhhhhhhhc
Confidence 357787777
No 159
>PF11027 DUF2615: Protein of unknown function (DUF2615); InterPro: IPR020309 This entry represents a group of uncharacterised protein from the Metazoa, including CD034 (or C4orf34) and YQF4 (or C34C12.4).
Probab=47.34 E-value=43 Score=24.10 Aligned_cols=13 Identities=15% Similarity=0.071 Sum_probs=5.7
Q ss_pred HHHHHHHHhhhcc
Q psy3616 59 FVWMICARSERRR 71 (143)
Q Consensus 59 lv~~~~~r~rr~k 71 (143)
+++++.+|-++.|
T Consensus 66 A~~ly~~RP~s~R 78 (103)
T PF11027_consen 66 AMALYLLRPSSLR 78 (103)
T ss_pred HHHHHHcCchhhc
Confidence 3344455544433
No 160
>PRK04778 septation ring formation regulator EzrA; Provisional
Probab=46.97 E-value=20 Score=32.49 Aligned_cols=6 Identities=17% Similarity=0.318 Sum_probs=2.4
Q ss_pred HHHHhh
Q psy3616 63 ICARSE 68 (143)
Q Consensus 63 ~~~r~r 68 (143)
+++||+
T Consensus 21 ~~~rr~ 26 (569)
T PRK04778 21 LILRKR 26 (569)
T ss_pred HHHHHH
Confidence 334433
No 161
>PF06679 DUF1180: Protein of unknown function (DUF1180); InterPro: IPR009565 This entry consists of several hypothetical eukaryotic proteins thought to be membrane proteins. Their function is unknown.
Probab=46.93 E-value=26 Score=27.19 Aligned_cols=38 Identities=18% Similarity=0.115 Sum_probs=24.1
Q ss_pred HHHHHHHHHHHHhhhccccccccccccCCCCcceeeec
Q psy3616 55 IIALFVWMICARSERRREPKKLVAQTNDQTGSQVNFYY 92 (143)
Q Consensus 55 lIvllv~~~~~r~rr~kk~k~~~~~~~~~~gs~~N~~~ 92 (143)
+-+++++++.+|.-|.||++++..++.-.....-|.+.
T Consensus 104 ~s~l~i~yfvir~~R~r~~~rktRkYgvl~~~~~~~Em 141 (163)
T PF06679_consen 104 LSALAILYFVIRTFRLRRRNRKTRKYGVLTTRAENVEM 141 (163)
T ss_pred HHHHHHHHHHHHHHhhccccccceeecccCCCccccee
Confidence 33445566778877778877777777666654344333
No 162
>PTZ00045 apical membrane antigen 1; Provisional
Probab=46.93 E-value=34 Score=31.64 Aligned_cols=8 Identities=25% Similarity=0.347 Sum_probs=3.7
Q ss_pred CeeeecCC
Q psy3616 20 PSCRCVGS 27 (143)
Q Consensus 20 ~~C~C~~g 27 (143)
+.|.|...
T Consensus 477 yvc~~~~~ 484 (595)
T PTZ00045 477 YVCERVEK 484 (595)
T ss_pred eEeeeeec
Confidence 34555443
No 163
>PHA03290 envelope glycoprotein I; Provisional
Probab=46.39 E-value=35 Score=29.62 Aligned_cols=18 Identities=28% Similarity=0.355 Sum_probs=8.3
Q ss_pred eeeehhHHHHHHHHHHHH
Q psy3616 40 AYIAGGIAATVVFLIIIA 57 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIv 57 (143)
+.|++-+++++++|+.++
T Consensus 273 ~~ivipi~~~llilla~i 290 (357)
T PHA03290 273 FLIAIPITASLLIILAII 290 (357)
T ss_pred EEEEehHHHHHHHHHHHH
Confidence 355555544444444333
No 164
>PF02060 ISK_Channel: Slow voltage-gated potassium channel; InterPro: IPR000369 Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore.
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. Two types of beta subunit (KCNE and KCNAB) are presently known to associate with voltage-gated alpha subunits (Kv, KCNQ and eag-like). However, not all combinations of alpha and beta subunits are possible. The KCNE family of K+ channel subunits are membrane glycoproteins that possess a single transmembrane (TM) domain. They share no structural relationship with the alpha subunit proteins, which possess pore forming domains. The subunits appear to have a regulatory function, modulating the kinetics and voltage dependence of the alpha subunits of voltage-dependent K+ channels. KCNE subunits are formed from short polypeptides of ~130 amino acids, and are divided into five subfamilies: KCNE1 (MinK/IsK), KCNE2 (MiRP1), KCNE3 (MiRP2), KCNE4 (MiRP3) and KCNE1L (AMMECR2).; GO: 0005249 voltage-gated potassium channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2K21_A.
Probab=45.91 E-value=44 Score=25.07 Aligned_cols=16 Identities=19% Similarity=0.335 Sum_probs=6.8
Q ss_pred HHHHHHHHHhhhcccc
Q psy3616 58 LFVWMICARSERRREP 73 (143)
Q Consensus 58 llv~~~~~r~rr~kk~ 73 (143)
+.|.+-..|.||+.++
T Consensus 59 ~gImlsyvRSKK~E~s 74 (129)
T PF02060_consen 59 VGIMLSYVRSKKREHS 74 (129)
T ss_dssp HHHHHHHHHHHHH---
T ss_pred HHHHHHHHHHhhhccc
Confidence 3344455666655444
No 165
>TIGR02976 phageshock_pspB phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response.
Probab=44.62 E-value=33 Score=23.21 Aligned_cols=29 Identities=24% Similarity=0.205 Sum_probs=20.3
Q ss_pred HHHHHHHHHHHHHHHHHHHhhhccccccc
Q psy3616 48 ATVVFLIIIALFVWMICARSERRREPKKL 76 (143)
Q Consensus 48 ~~Vl~lilIvllv~~~~~r~rr~kk~k~~ 76 (143)
.+++.+++++++|..++.-..+++|.+..
T Consensus 5 fl~~Pliif~ifVap~wl~lHY~~k~~~~ 33 (75)
T TIGR02976 5 FLAIPLIIFVIFVAPLWLILHYRSKRKTA 33 (75)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhccC
Confidence 34555666667777888888888887654
No 166
>PF12911 OppC_N: N-terminal TM domain of oligopeptide transport permease C
Probab=44.58 E-value=29 Score=21.14 Aligned_cols=7 Identities=29% Similarity=0.733 Sum_probs=2.5
Q ss_pred HHHHHHH
Q psy3616 48 ATVVFLI 54 (143)
Q Consensus 48 ~~Vl~li 54 (143)
+++++++
T Consensus 21 gl~il~~ 27 (56)
T PF12911_consen 21 GLIILLI 27 (56)
T ss_pred HHHHHHH
Confidence 3333333
No 167
>smart00274 FOLN Follistatin-N-terminal domain-like. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence
Probab=44.52 E-value=34 Score=18.49 Aligned_cols=22 Identities=32% Similarity=0.797 Sum_probs=16.5
Q ss_pred CCCC-CCCCcEEEeCCCCCCeee
Q psy3616 2 CKGY-CENKGTCVKDARGQPSCR 23 (143)
Q Consensus 2 C~~~-C~NgG~C~~~~~~~~~C~ 23 (143)
|.+. |..|-.|..+..+.+.|.
T Consensus 2 C~~v~C~~G~~C~~d~~g~p~Cv 24 (26)
T smart00274 2 CRNVQCPFGKVCVVDKGGNARCV 24 (26)
T ss_pred CCCEECCCCCEEEeCCCCCEEEe
Confidence 4444 888999998657788884
No 168
>PHA02669 hypothetical protein; Provisional
Probab=43.96 E-value=11 Score=29.64 Aligned_cols=19 Identities=21% Similarity=0.281 Sum_probs=10.6
Q ss_pred HHhhhccccccccccccCC
Q psy3616 65 ARSERRREPKKLVAQTNDQ 83 (143)
Q Consensus 65 ~r~rr~kk~k~~~~~~~~~ 83 (143)
-|.+||.|-|+++.++..|
T Consensus 32 ERanKrsRvK~nMRkLatQ 50 (210)
T PHA02669 32 ERANKRSRVKANMRKLATQ 50 (210)
T ss_pred HHhhhHHHHHHHHHHHHHH
Confidence 3555555555566666554
No 169
>KOG3054|consensus
Probab=43.76 E-value=30 Score=28.97 Aligned_cols=7 Identities=14% Similarity=0.325 Sum_probs=2.7
Q ss_pred HHHHHHh
Q psy3616 61 WMICARS 67 (143)
Q Consensus 61 ~~~~~r~ 67 (143)
++++.|+
T Consensus 20 l~l~~r~ 26 (299)
T KOG3054|consen 20 LFLWKRR 26 (299)
T ss_pred HHHHHhh
Confidence 3333433
No 170
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=43.69 E-value=13 Score=21.60 Aligned_cols=12 Identities=25% Similarity=0.708 Sum_probs=9.9
Q ss_pred CCCeeeecCCCc
Q psy3616 18 GQPSCRCVGSFI 29 (143)
Q Consensus 18 ~~~~C~C~~gy~ 29 (143)
....|.|+.||.
T Consensus 16 ~~~~C~CPeGyI 27 (34)
T PF09064_consen 16 SPGQCFCPEGYI 27 (34)
T ss_pred CCCceeCCCceE
Confidence 456999999986
No 171
>PRK01741 cell division protein ZipA; Provisional
Probab=43.63 E-value=20 Score=30.92 Aligned_cols=18 Identities=17% Similarity=0.320 Sum_probs=8.3
Q ss_pred hhHHHHHHHHHHHHHHHH
Q psy3616 44 GGIAATVVFLIIIALFVW 61 (143)
Q Consensus 44 ~~i~~~Vl~lilIvllv~ 61 (143)
+.|.|++++++||+..+|
T Consensus 7 liILg~lal~~Lv~hgiW 24 (332)
T PRK01741 7 LIILGILALVALVAHGIW 24 (332)
T ss_pred HHHHHHHHHHHHHHhhhh
Confidence 334444444444444444
No 172
>PTZ00233 variable surface protein Vir18; Provisional
Probab=43.57 E-value=23 Score=32.19 Aligned_cols=25 Identities=12% Similarity=0.414 Sum_probs=13.9
Q ss_pred HHHHHHHHHHH--HHHHHHHHHhhhcc
Q psy3616 47 AATVVFLIIIA--LFVWMICARSERRR 71 (143)
Q Consensus 47 ~~~Vl~lilIv--llv~~~~~r~rr~k 71 (143)
+|+||||.||. .=+|.++.+|+|+|
T Consensus 442 mGIvLLLGLLFKyTPLWRvLTKknRKk 468 (509)
T PTZ00233 442 IGIALLLGLLFKYTPLWRVLTKKNRKK 468 (509)
T ss_pred hhHHHHHHHhhccchhHHhhhhccccc
Confidence 45555554444 33787777655443
No 173
>TIGR02736 cbb3_Q_epsi cytochrome c oxidase, cbb3-type, CcoQ subunit, epsilon-Proteobacterial. Members of this protein family are restricted to the epsilon branch of the Proteobacteria. All members are found in operons containing the other three structural subunits of the cbb3 type of cytochrome c oxidase. These small proteins show remote sequence similarity to the CcoQ subunit in other cytochrome c oxidase systems, so this family is assumed to represent the epsilonproteobacterial variant of CcoQ.
Probab=43.32 E-value=26 Score=22.61 Aligned_cols=18 Identities=0% Similarity=-0.058 Sum_probs=7.3
Q ss_pred HHHhhhcccccccccccc
Q psy3616 64 CARSERRREPKKLVAQTN 81 (143)
Q Consensus 64 ~~r~rr~kk~k~~~~~~~ 81 (143)
.+-.|+.|+.++.+++++
T Consensus 20 yhLYrsek~G~rdYEKY~ 37 (56)
T TIGR02736 20 YHLYRSQKKGERDYEKYA 37 (56)
T ss_pred HHhhhhhcccccCHHHHh
Confidence 333333444444444433
No 174
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=42.96 E-value=17 Score=29.03 Aligned_cols=32 Identities=25% Similarity=0.525 Sum_probs=20.4
Q ss_pred CCCCCCCCcEEEeCCCCCCeeeecCCCcCCCcc
Q psy3616 2 CKGYCENKGTCVKDARGQPSCRCVGSFIGPHCA 34 (143)
Q Consensus 2 C~~~C~NgG~C~~~~~~~~~C~C~~gy~G~rCe 34 (143)
|.--|.-+-.|... .+-+.|.|..+|.|.-=+
T Consensus 134 C~LKCk~nE~CK~~-~~~Y~C~~~~~~~~~~~~ 165 (197)
T PF06247_consen 134 CSLKCKENEECKLV-DGYYKCVCKEGFPGDGEG 165 (197)
T ss_dssp -----TTTEEEEEE-TTEEEEEE-TT-EEETTT
T ss_pred eeeecCCCcceeee-CcEEEeecCCCCCCCCCc
Confidence 55567667889988 888999999998766544
No 175
>PF01708 Gemini_mov: Geminivirus putative movement protein; InterPro: IPR002621 This family consists of putative movement proteins from Maize streak virus and Wheat dwarf virus.; GO: 0046740 spread of virus in host, cell to cell, 0016021 integral to membrane
Probab=42.18 E-value=35 Score=24.10 Aligned_cols=10 Identities=20% Similarity=0.365 Sum_probs=6.0
Q ss_pred cceeeecCCC
Q psy3616 86 SQVNFYYGGA 95 (143)
Q Consensus 86 s~~N~~~g~p 95 (143)
++-.+.||++
T Consensus 73 sTeEigFG~t 82 (91)
T PF01708_consen 73 STEEIGFGNT 82 (91)
T ss_pred ceeeeeeCCC
Confidence 3446778854
No 176
>PHA02681 ORF089 virion membrane protein; Provisional
Probab=41.50 E-value=30 Score=24.22 Aligned_cols=19 Identities=0% Similarity=0.218 Sum_probs=10.9
Q ss_pred CCccccCCCCCCCcccccC
Q psy3616 96 PYAESVAPSHHSTYAHYYD 114 (143)
Q Consensus 96 py~e~~~~~~~~~~~~~y~ 114 (143)
.|.++.-|.+.++|+.+..
T Consensus 47 ~F~D~lTpDQVrAlHRlvT 65 (92)
T PHA02681 47 SFEDKMTDDQVRAFHALVT 65 (92)
T ss_pred hhhccCCHHHHHHHHHHHh
Confidence 3445555566666766654
No 177
>PF06143 Baculo_11_kDa: Baculovirus 11 kDa family; InterPro: IPR009313 This is a family of uncharacterised Baculovirus proteins that are all about 11 kDa in size.
Probab=40.81 E-value=32 Score=23.91 Aligned_cols=11 Identities=36% Similarity=0.688 Sum_probs=4.1
Q ss_pred HHHHHHHHHHH
Q psy3616 47 AATVVFLIIIA 57 (143)
Q Consensus 47 ~~~Vl~lilIv 57 (143)
+++++++++++
T Consensus 41 c~~lVfVii~l 51 (84)
T PF06143_consen 41 CCFLVFVIIVL 51 (84)
T ss_pred HHHHHHHHHHH
Confidence 33333333333
No 178
>PLN02745 Putative pectinesterase/pectinesterase inhibitor
Probab=40.13 E-value=42 Score=31.06 Aligned_cols=8 Identities=25% Similarity=-0.018 Sum_probs=3.0
Q ss_pred HHHHHHHH
Q psy3616 53 LIIIALFV 60 (143)
Q Consensus 53 lilIvllv 60 (143)
+++|+..+
T Consensus 36 ~~~i~~~~ 43 (596)
T PLN02745 36 VAAVAGGV 43 (596)
T ss_pred HHHHHHHH
Confidence 33333333
No 179
>PF09289 FOLN: Follistatin/Osteonectin-like EGF domain; InterPro: IPR015369 This domain is predominantly found in osteonectin and follistatin. They adopt an EGF-like structure [, ]. Follistatin is involved in diverse activities from embryonic development to cell secretion. ; GO: 0005515 protein binding; PDB: 1LR7_A 1LR8_A 1LR9_A 2ARP_F 3B4V_H 2KCX_A 3SEK_C 2P6A_D 3HH2_C 2B0U_D ....
Probab=39.88 E-value=30 Score=18.23 Aligned_cols=17 Identities=35% Similarity=1.021 Sum_probs=12.6
Q ss_pred CCCCcEEEeCCCCCCee
Q psy3616 6 CENKGTCVKDARGQPSC 22 (143)
Q Consensus 6 C~NgG~C~~~~~~~~~C 22 (143)
|.-|-+|..+..+.|.|
T Consensus 6 Ck~GKvC~~d~~~~P~C 22 (22)
T PF09289_consen 6 CKRGKVCKVDEQGKPHC 22 (22)
T ss_dssp -BTTEEEEEETTTCEEE
T ss_pred cCCCCEeeeCCCCCcCC
Confidence 77788898865777877
No 180
>PF11044 TMEMspv1-c74-12: Plectrovirus spv1-c74 ORF 12 transmembrane protein; InterPro: IPR022743 This is a group of proteins expressed by Plectroviruses. The Plectroviruses are single-stranded DNA viruses belonging to the Inoviridae. This entry represents putative transmembrane proteins of unknown function.
Probab=39.62 E-value=55 Score=20.34 Aligned_cols=11 Identities=36% Similarity=0.449 Sum_probs=4.6
Q ss_pred HHHHHHHHHHH
Q psy3616 46 IAATVVFLIII 56 (143)
Q Consensus 46 i~~~Vl~lilI 56 (143)
|.++|++|.++
T Consensus 8 iFsvvIil~If 18 (49)
T PF11044_consen 8 IFSVVIILGIF 18 (49)
T ss_pred HHHHHHHHHHH
Confidence 33444444443
No 181
>KOG2052|consensus
Probab=39.50 E-value=15 Score=33.14 Aligned_cols=20 Identities=20% Similarity=0.331 Sum_probs=11.9
Q ss_pred ccCCCCCCCCCCcccccccc
Q psy3616 112 YYDDEEDGWEMPNFYNETYM 131 (143)
Q Consensus 112 ~y~~~~d~~~~~~~~~~~~~ 131 (143)
|+..||++|--++=-.+|-|
T Consensus 242 F~srdE~SWfrEtEIYqTvm 261 (513)
T KOG2052|consen 242 FSSRDERSWFRETEIYQTVM 261 (513)
T ss_pred ecccchhhhhhHHHHHHHHH
Confidence 56667777755554444444
No 182
>PRK10905 cell division protein DamX; Validated
Probab=39.46 E-value=21 Score=30.74 Aligned_cols=13 Identities=38% Similarity=0.784 Sum_probs=5.4
Q ss_pred hHHHHHHHHHHHH
Q psy3616 45 GIAATVVFLIIIA 57 (143)
Q Consensus 45 ~i~~~Vl~lilIv 57 (143)
||+++|||||||.
T Consensus 3 GiGilVLlLLIig 15 (328)
T PRK10905 3 GVGILVLLLLIIG 15 (328)
T ss_pred chhHHHHHHHHHH
Confidence 3333444444443
No 183
>PF12301 CD99L2: CD99 antigen like protein 2; InterPro: IPR022078 This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 237 amino acids in length. CD99L2 and CD99 are involved in trans-endothelial migration of neutrophils in vitro and in the recruitment of neutrophils into inflamed peritoneum.
Probab=39.38 E-value=39 Score=26.34 Aligned_cols=6 Identities=50% Similarity=0.700 Sum_probs=2.2
Q ss_pred HHHHHH
Q psy3616 46 IAATVV 51 (143)
Q Consensus 46 i~~~Vl 51 (143)
|+++|+
T Consensus 120 Ivsav~ 125 (169)
T PF12301_consen 120 IVSAVV 125 (169)
T ss_pred HHHHHH
Confidence 333333
No 184
>PF05808 Podoplanin: Podoplanin; InterPro: IPR008783 This family consists of several mammalian podoplanin-like proteins which are thought to control specifically the unique shape of podocytes [].; GO: 0016021 integral to membrane; PDB: 3IET_X.
Probab=39.36 E-value=9.9 Score=29.55 Aligned_cols=20 Identities=15% Similarity=0.212 Sum_probs=0.0
Q ss_pred eehhHHHHHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIALFVW 61 (143)
Q Consensus 42 ia~~i~~~Vl~lilIvllv~ 61 (143)
++.+|+++++.+.+|..+|+
T Consensus 131 LVGIIVGVLlaIG~igGIIi 150 (162)
T PF05808_consen 131 LVGIIVGVLLAIGFIGGIII 150 (162)
T ss_dssp --------------------
T ss_pred eeeehhhHHHHHHHHhheee
Confidence 33334444444444443333
No 185
>PF12259 DUF3609: Protein of unknown function (DUF3609); InterPro: IPR022048 This domain family is found in eukaryotes and viruses, and is typically between 348 and 360 amino acids in length.
Probab=38.89 E-value=18 Score=31.31 Aligned_cols=34 Identities=12% Similarity=0.147 Sum_probs=22.2
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhccccccccccc
Q psy3616 47 AATVVFLIIIALFVWMICARSERRREPKKLVAQT 80 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~~~~r~rr~kk~k~~~~~~ 80 (143)
+.+...+++|++++.+.|..++.+||+.+.-++.
T Consensus 301 i~v~~~~vli~vl~~~~~~~~~~~~~~~~~~~~p 334 (361)
T PF12259_consen 301 IAVCGAIVLIIVLISLAWLYRTFRRRQLRSAQNP 334 (361)
T ss_pred EehhHHHHHHHHHHHHHhheeehHHHHhhhccCC
Confidence 4555556666677778888877777776554444
No 186
>PF05283 MGC-24: Multi-glycosylated core protein 24 (MGC-24); InterPro: IPR007947 CD164 is a mucin-like receptor, or sialomucin, with specificity in receptor/ligand interactions that depends on the structural characteristics of the mucin-like receptor. Its functions include mediating, or regulating, haematopoietic progenitor cell adhesion and the negative regulation of their growth and/or differentiation. It exists in the native state as a disulphide-linked homodimer of two 80-85kDa subunits. It is usually expressed by CD34+ and CD34lo/- haematopoietic stem cells and associated microenvironmental cells. It contains, in its extracellular region, two mucin domains (I and II) linked by a non-mucin domain, which has been predicted to contain intra-disulphide bridges. This receptor may play a key role in haematopoiesis by facilitating the adhesion of human CD34+ cells to bone marrow stroma and by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell proliferation. These effects involve the CD164 class I and/or II epitopes recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These epitopes are carbohydrate-dependent and are located on the N-terminal mucin domain I [, ]. It has been found that murine MGC-24v and rat endolyn share significant sequence similarities with human CD164. However, CD164 lacks the consensus glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible that GAG-association is responsible for the high molecular weight of the epithelial-derived MGC-24 glycoprotein []. Genomic structure studies have placed CD164 within the mucin-subgroup that comprises multiple exons, and demonstrate the diverse chromosomal distribution of this family of molecules. Molecules with such multiple exons may have sophisticated regulatory mechanisms that involve not only post-translational modifications of the oligosaccharide side chains, but also differential exon usage.
Although differences in the intron and exon sizes are seen between the mouse and human genes, the predicted proteins are similar in size and structure, maintaining functionally important motifs that regulate cell proliferation or subcellular distribution []. CD164 is a gene whose expression depends on differential usage of polyadenylation sites within the 3'-UTR. The conserved distribution of the 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that (i) a mechanism may exist to regulate tissue-specific polyadenylation, and (ii) differences in polyadenylation are important for the expression and function of CD164 in different tissues. Two other aspects of the structure of CD164 are of particular interest. First, it shares one of several conserved features of a cytokine-binding pocket - in this respect, it is notable that evidence exists for a class of cell-surface sialomucin modulators that directly interact with growth factor receptors to regulate their response to physiological ligands. Second, its cytoplasmic tail contains a C-terminal YHTL motif found in many endocytic membrane proteins or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate the sorting of membrane proteins into transport vesicles from the plasma membrane to the endosomes, and between intracellular compartments.
Probab=38.82 E-value=32 Score=27.27 Aligned_cols=14 Identities=29% Similarity=0.453 Sum_probs=7.1
Q ss_pred HHHHHHHHHHHHHH
Q psy3616 46 IAATVVFLIIIALF 59 (143)
Q Consensus 46 i~~~Vl~lilIvll 59 (143)
|+|+||.|.|++++
T Consensus 164 iGGIVL~LGv~aI~ 177 (186)
T PF05283_consen 164 IGGIVLTLGVLAII 177 (186)
T ss_pred hhHHHHHHHHHHHH
Confidence 35555555555443
No 187
>COG1862 YajC Preprotein translocase subunit YajC [Intracellular trafficking and secretion]
Probab=38.29 E-value=23 Score=25.18 Aligned_cols=23 Identities=13% Similarity=0.127 Sum_probs=10.2
Q ss_pred HHHHHHHHhhhcccccccccccc
Q psy3616 59 FVWMICARSERRREPKKLVAQTN 81 (143)
Q Consensus 59 lv~~~~~r~rr~kk~k~~~~~~~ 81 (143)
+++++...|.+||++|+...-.+
T Consensus 20 ~ifyFli~RPQrKr~K~~~~ml~ 42 (97)
T COG1862 20 AIFYFLIIRPQRKRMKEHQELLN 42 (97)
T ss_pred HHHHHhhcCHHHHHHHHHHHHHH
Confidence 33444344444555554443343
No 188
>PF10265 DUF2217: Uncharacterized conserved protein (DUF2217); InterPro: IPR019392 This is a family of conserved proteins varying in length from 500-600 residues. Their function is not known.
Probab=38.00 E-value=52 Score=30.03 Aligned_cols=9 Identities=22% Similarity=0.412 Sum_probs=3.7
Q ss_pred HHHHhhhcc
Q psy3616 63 ICARSERRR 71 (143)
Q Consensus 63 ~~~r~rr~k 71 (143)
-.+||||+|
T Consensus 33 ~~lkRRr~k 41 (514)
T PF10265_consen 33 HYLKRRRRK 41 (514)
T ss_pred HHHHHhhcc
Confidence 334444433
No 189
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=37.73 E-value=22 Score=23.12 Aligned_cols=14 Identities=29% Similarity=0.869 Sum_probs=9.6
Q ss_pred CeeeecCCCcCCCc
Q psy3616 20 PSCRCVGSFIGPHC 33 (143)
Q Consensus 20 ~~C~C~~gy~G~rC 33 (143)
-.-.|.+||+|+.|
T Consensus 50 G~~~C~~Gw~G~~C 63 (63)
T PF01414_consen 50 GNKVCLPGWTGPNC 63 (63)
T ss_dssp --EEE-TTEESTTS
T ss_pred CCCCCCCCCcCCCC
Confidence 45578999999988
No 190
>PF14828 Amnionless: Amnionless
Probab=37.42 E-value=32 Score=30.51 Aligned_cols=11 Identities=9% Similarity=-0.076 Sum_probs=7.4
Q ss_pred cceeeecCCCC
Q psy3616 86 SQVNFYYGGAP 96 (143)
Q Consensus 86 s~~N~~~g~pp 96 (143)
...|.+|.+++
T Consensus 391 ~f~nprFD~~~ 401 (437)
T PF14828_consen 391 GFDNPRFDNPG 401 (437)
T ss_pred CCCCCccCCCc
Confidence 36677777655
No 191
>PRK08455 fliL flagellar basal body-associated protein FliL; Reviewed
Probab=37.31 E-value=49 Score=25.77 Aligned_cols=18 Identities=28% Similarity=0.429 Sum_probs=7.2
Q ss_pred HHHHHHHHHHHHHHHHHH
Q psy3616 48 ATVVFLIIIALFVWMICA 65 (143)
Q Consensus 48 ~~Vl~lilIvllv~~~~~ 65 (143)
++++++++++.+++++..
T Consensus 25 ~~~llll~~~G~~~~~~~ 42 (182)
T PRK08455 25 GVVVLLLLIVGVIAMLLM 42 (182)
T ss_pred HHHHHHHHHHHHHHHHHh
Confidence 333333333334444444
No 192
>COG3115 ZipA Cell division protein [Cell division and chromosome partitioning]
Probab=37.13 E-value=38 Score=28.98 Aligned_cols=22 Identities=9% Similarity=0.258 Sum_probs=11.9
Q ss_pred hHHHHHHHHHHHHHHHHHHHHH
Q psy3616 45 GIAATVVFLIIIALFVWMICAR 66 (143)
Q Consensus 45 ~i~~~Vl~lilIvllv~~~~~r 66 (143)
+|.|+++++.||+-.+|--...
T Consensus 9 IIvG~IAIiaLLvhGlWtsRkE 30 (324)
T COG3115 9 IIVGAIAIIALLVHGLWTSRKE 30 (324)
T ss_pred HHHHHHHHHHHHHhhhhhcchh
Confidence 3445555555555566774443
No 193
>PF15183 MRAP: Melanocortin-2 receptor accessory protein family
Probab=36.78 E-value=49 Score=23.18 Aligned_cols=18 Identities=11% Similarity=0.087 Sum_probs=8.6
Q ss_pred eeeehhHHHHHHHHHHHH
Q psy3616 40 AYIAGGIAATVVFLIIIA 57 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIv 57 (143)
+.|+..+..+++++++++
T Consensus 38 IVI~FWv~LA~FV~~lF~ 55 (90)
T PF15183_consen 38 IVIAFWVSLAAFVVFLFL 55 (90)
T ss_pred eehhHHHHHHHHHHHHHH
Confidence 345555555544444333
No 194
>KOG1024|consensus
Probab=36.73 E-value=77 Score=28.73 Aligned_cols=10 Identities=10% Similarity=-0.057 Sum_probs=4.1
Q ss_pred eeeecCCCCC
Q psy3616 88 VNFYYGGAPY 97 (143)
Q Consensus 88 ~N~~~g~ppy 97 (143)
.+-.|-.||.
T Consensus 238 ~st~y~~P~t 247 (563)
T KOG1024|consen 238 HSTIYVTPST 247 (563)
T ss_pred cCCcccCCCc
Confidence 3433434543
No 195
>PF12725 DUF3810: Protein of unknown function (DUF3810); InterPro: IPR024294 This family of bacterial proteins is functionally uncharacterised. Proteins in this family are typically between 333 and 377 amino acids in length and contain a conserved HEXXH sequence motif that is characteristic of metallopeptidases. This family may therefore belong to an as yet uncharacterised family of peptidase enzymes.
Probab=36.40 E-value=41 Score=28.43 Aligned_cols=6 Identities=33% Similarity=0.689 Sum_probs=2.6
Q ss_pred CCCccc
Q psy3616 95 APYAES 100 (143)
Q Consensus 95 ppy~e~ 100 (143)
+|..++
T Consensus 82 ~pl~~~ 87 (318)
T PF12725_consen 82 PPLSER 87 (318)
T ss_pred cCHHHH
Confidence 444443
No 196
>PF02285 COX8: Cytochrome oxidase c subunit VIII; InterPro: IPR003205 Cytochrome c oxidase (1.9.3.1 from EC) is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen []. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there is a variable number of small polypeptidic subunits. This family is composed of cytochrome c oxidase subunit VIII. ; GO: 0004129 cytochrome-c oxidase activity; PDB: 3AG3_Z 3ABM_M 1OCC_Z 3ASO_Z 3AG2_Z 3ABL_M 3AG4_M 3AG1_M 3ASN_M 1OCZ_M ....
Probab=36.08 E-value=69 Score=19.62 Aligned_cols=25 Identities=16% Similarity=0.393 Sum_probs=13.4
Q ss_pred HHHHHHHHHHHHHHHHHHHhhhccc
Q psy3616 48 ATVVFLIIIALFVWMICARSERRRE 72 (143)
Q Consensus 48 ~~Vl~lilIvllv~~~~~r~rr~kk 72 (143)
..++++-+++-..|++.+-..+||+
T Consensus 19 ltv~f~~~L~PagWVLshL~~YKk~ 43 (44)
T PF02285_consen 19 LTVCFVTFLGPAGWVLSHLESYKKR 43 (44)
T ss_dssp HHHHHHHHHHHHHHHHHTHHHHHT-
T ss_pred HHHHHHHHHhhHHHHHHHHHHhhcc
Confidence 3444444445557887776555543
No 197
>PF03896 TRAP_alpha: Translocon-associated protein (TRAP), alpha subunit; InterPro: IPR005595 The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane [].; GO: 0005783 endoplasmic reticulum
Probab=35.53 E-value=74 Score=26.72 Aligned_cols=11 Identities=18% Similarity=0.537 Sum_probs=6.9
Q ss_pred CCCCCCCcccc
Q psy3616 117 EDGWEMPNFYN 127 (143)
Q Consensus 117 ~d~~~~~~~~~ 127 (143)
|++|=+..-.+
T Consensus 252 D~eWIp~~~l~ 262 (285)
T PF03896_consen 252 DEEWIPKEHLN 262 (285)
T ss_pred CcccCCHHHhh
Confidence 55677666655
No 198
>PHA03240 envelope glycoprotein M; Provisional
Probab=35.32 E-value=44 Score=27.46 Aligned_cols=12 Identities=33% Similarity=0.863 Sum_probs=4.6
Q ss_pred HHHHHHHHHHHH
Q psy3616 48 ATVVFLIIIALF 59 (143)
Q Consensus 48 ~~Vl~lilIvll 59 (143)
.+|++++||||+
T Consensus 218 ilIIiIiIIIL~ 229 (258)
T PHA03240 218 IAIIIIIVIILF 229 (258)
T ss_pred HHHHHHHHHHHH
Confidence 333333333333
No 199
>PF12301 CD99L2: CD99 antigen like protein 2; InterPro: IPR022078 This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 237 amino acids in length. CD99L2 and CD99 are involved in trans-endothelial migration of neutrophils in vitro and in the recruitment of neutrophils into inflamed peritoneum.
Probab=35.30 E-value=50 Score=25.73 Aligned_cols=19 Identities=26% Similarity=0.212 Sum_probs=12.1
Q ss_pred eeehhHHHHHHHHHHHHHH
Q psy3616 41 YIAGGIAATVVFLIIIALF 59 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvll 59 (143)
..+..|+++|..+.+.++.
T Consensus 112 ~~~g~IaGIvsav~valvG 130 (169)
T PF12301_consen 112 AEAGTIAGIVSAVVVALVG 130 (169)
T ss_pred cccchhhhHHHHHHHHHHH
Confidence 4666677887766665543
No 200
>PHA03105 EEV glycoprotein; Provisional
Probab=34.60 E-value=42 Score=26.30 Aligned_cols=13 Identities=23% Similarity=0.861 Sum_probs=5.9
Q ss_pred HHHHHHHHHHHhh
Q psy3616 56 IALFVWMICARSE 68 (143)
Q Consensus 56 Ivllv~~~~~r~r 68 (143)
+++-++++|.+..
T Consensus 18 Li~Yll~i~K~~i 30 (188)
T PHA03105 18 LILYIFFICKNTI 30 (188)
T ss_pred HHHHHHHHHHHHH
Confidence 3333445555443
No 201
>PHA03281 envelope glycoprotein E; Provisional
Probab=34.59 E-value=62 Score=30.00 Aligned_cols=29 Identities=17% Similarity=0.321 Sum_probs=15.3
Q ss_pred HHHHHHHHHHHHHHHHHHHHHhhhccccc
Q psy3616 46 IAATVVFLIIIALFVWMICARSERRREPK 74 (143)
Q Consensus 46 i~~~Vl~lilIvllv~~~~~r~rr~kk~k 74 (143)
+.+.+.++.||++++..+|..++.|+|..
T Consensus 559 l~~~~a~~~ll~l~~~~~c~~~~~~~~~~ 587 (642)
T PHA03281 559 ITGGFAALALLCLAIALICTAKKFGHKAY 587 (642)
T ss_pred hhhhhHHHHHHHHHHHHHHHHHHhhhhee
Confidence 34444444445555566676666555443
No 202
>PF13980 UPF0370: Uncharacterised protein family (UPF0370)
Probab=34.07 E-value=90 Score=20.47 Aligned_cols=15 Identities=40% Similarity=0.818 Sum_probs=9.3
Q ss_pred CCCcccccCCCCCCCC
Q psy3616 106 HSTYAHYYDDEEDGWE 121 (143)
Q Consensus 106 ~~~~~~~y~~~~d~~~ 121 (143)
|+-+-.-.|| ||+|-
T Consensus 44 HRDnN~~WDd-eDDwP 58 (63)
T PF13980_consen 44 HRDNNAKWDD-EDDWP 58 (63)
T ss_pred CCcccccccc-ccccc
Confidence 4556666666 66774
No 203
>KOG1025|consensus
Probab=33.77 E-value=33 Score=33.68 Aligned_cols=15 Identities=40% Similarity=0.990 Sum_probs=7.1
Q ss_pred CCCCeee--ecCCCcCC
Q psy3616 17 RGQPSCR--CVGSFIGP 31 (143)
Q Consensus 17 ~~~~~C~--C~~gy~G~ 31 (143)
...+.|. ||.|-+|+
T Consensus 562 ~dgp~CV~~CP~G~~G~ 578 (1177)
T KOG1025|consen 562 RDGPHCVSDCPDGVTGP 578 (1177)
T ss_pred CCCcchhccCCCcccCC
Confidence 3345554 66554443
No 204
>PF08391 Ly49: Ly49-like protein, N-terminal region; InterPro: IPR013600 The sequences making up this entry are annotated as, or are similar to, Ly49 receptors (e.g. P20937 from SWISSPROT). These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g.Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function []. They are members of the C-type lectin receptor superfamily [], and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (IPR001304 from INTERPRO). ; PDB: 1QO3_D 3C8J_D 1P4L_D 3C8K_D 3G8K_B 1JA3_B 3CAD_A 3G8L_A.
Probab=33.14 E-value=14 Score=27.20 Aligned_cols=27 Identities=26% Similarity=0.377 Sum_probs=0.0
Q ss_pred eehhHHHHHHHHHH-HHHHHHHHHHHhh
Q psy3616 42 IAGGIAATVVFLII-IALFVWMICARSE 68 (143)
Q Consensus 42 ia~~i~~~Vl~lil-Ivllv~~~~~r~r 68 (143)
||++.+.+.+++++ +++++..+.....
T Consensus 7 iav~LGILCllLLvtv~vL~t~ifQ~~q 34 (119)
T PF08391_consen 7 IAVALGILCLLLLVTVAVLGTMIFQYSQ 34 (119)
T ss_dssp ----------------------------
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhH
Confidence 44444334444433 3344444544443
No 205
>KOG0994|consensus
Probab=32.66 E-value=23 Score=35.70 Aligned_cols=26 Identities=27% Similarity=0.657 Sum_probs=20.9
Q ss_pred EEEeC-CCCCCeeeecCCCcCCCcccc
Q psy3616 11 TCVKD-ARGQPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 11 ~C~~~-~~~~~~C~C~~gy~G~rCe~~ 36 (143)
.|..+ ....-.|+|.+||+|.||+.-
T Consensus 924 sC~~d~~t~~ivC~C~~GY~G~RCe~C 950 (1758)
T KOG0994|consen 924 SCYLDTRTQQIVCHCQEGYSGSRCEIC 950 (1758)
T ss_pred cccccccccceeeecccCccccchhhh
Confidence 46555 455789999999999999974
No 206
>TIGR01941 nqrF NADH:ubiquinone oxidoreductase, Na(+)-translocating, F subunit. This model represents the NqrF subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump.
Probab=32.51 E-value=41 Score=28.87 Aligned_cols=15 Identities=13% Similarity=0.078 Sum_probs=6.4
Q ss_pred HHHHHHHHhhhcccc
Q psy3616 59 FVWMICARSERRREP 73 (143)
Q Consensus 59 lv~~~~~r~rr~kk~ 73 (143)
+|+++.+.|+|.++.
T Consensus 15 ~~~~~~~~~~~~~~~ 29 (405)
T TIGR01941 15 LVVVILFAKSKLVSS 29 (405)
T ss_pred HHHHHHHHHhhcccc
Confidence 334444444444333
No 207
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=32.28 E-value=50 Score=22.89 Aligned_cols=23 Identities=26% Similarity=0.745 Sum_probs=19.4
Q ss_pred CCCCCcEEEeCCCCCCeeeecCC
Q psy3616 5 YCENKGTCVKDARGQPSCRCVGS 27 (143)
Q Consensus 5 ~C~NgG~C~~~~~~~~~C~C~~g 27 (143)
.|..|-+|+.+..+.+.|.|.+.
T Consensus 6 ~C~~G~~C~~d~~~~p~CvC~~~ 28 (86)
T cd01328 6 HCGAGKVCEVDDENTPKCVCIDP 28 (86)
T ss_pred CCCCCCEeeECCCCCeEEecCCc
Confidence 48889999987678999999864
No 208
>PF15050 SCIMP: SCIMP protein
Probab=32.16 E-value=41 Score=25.17 Aligned_cols=21 Identities=24% Similarity=0.082 Sum_probs=10.4
Q ss_pred HHHHHHHHHHHHHHHHHHHhh
Q psy3616 48 ATVVFLIIIALFVWMICARSE 68 (143)
Q Consensus 48 ~~Vl~lilIvllv~~~~~r~r 68 (143)
.+-++|.||+.+++..-.|.-
T Consensus 18 ~vS~~lglIlyCvcR~~lRqG 38 (133)
T PF15050_consen 18 LVSVVLGLILYCVCRWQLRQG 38 (133)
T ss_pred HHHHHHHHHHHHHHHHHHHcc
Confidence 333334455566665555443
No 209
>PF05749 Rubella_E2: Rubella membrane glycoprotein E2; InterPro: IPR008821 Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins []. This family is found together with IPR008819 from INTERPRO and IPR008820 from INTERPRO.; GO: 0016021 integral to membrane, 0019013 viral nucleocapsid
Probab=32.08 E-value=1.3e+02 Score=23.98 Aligned_cols=20 Identities=25% Similarity=0.463 Sum_probs=13.7
Q ss_pred CCCeeeecCCCcCCCccccc
Q psy3616 18 GQPSCRCVGSFIGPHCAQKS 37 (143)
Q Consensus 18 ~~~~C~C~~gy~G~rCe~~~ 37 (143)
..-.|+=.+..-|.||-...
T Consensus 192 dgwtcrgvpahpgtrcpelv 211 (267)
T PF05749_consen 192 DGWTCRGVPAHPGTRCPELV 211 (267)
T ss_pred CCceecCccCCCCCCChhhc
Confidence 34566666667788888754
No 210
>PF09802 Sec66: Preprotein translocase subunit Sec66; InterPro: IPR018624 Members of this family of proteins are a component of the heterotetrameric Sec62/63 complex composed of SEC62, SEC63, SEC66 and SEC72. The Sec62/63 complex associates with the Sec61 complex to form the Sec complex. Sec66 is involved in SRP-independent post-translational translocation across the endoplasmic reticulum and functions together with the Sec61 complex and KAR2 in a channel-forming translocon complex. Furthermore, Sec66 is also required for growth at elevated temperatures [, , , ].
Probab=31.77 E-value=44 Score=26.55 Aligned_cols=43 Identities=7% Similarity=0.098 Sum_probs=18.7
Q ss_pred HHHHHHHHHHH-HHHHHHHhhhccccccccccccCCCCcceeeecC
Q psy3616 49 TVVFLIIIALF-VWMICARSERRREPKKLVAQTNDQTGSQVNFYYG 93 (143)
Q Consensus 49 ~Vl~lilIvll-v~~~~~r~rr~kk~k~~~~~~~~~~gs~~N~~~g 93 (143)
++-+.+|++.+ ++-...|+|+.++. ..+.++=+.+- +-|+|+-
T Consensus 10 ~~Y~~vl~~sl~~Fs~~YRkr~~~~~-~~l~p~F~~~~-~rdiY~s 53 (190)
T PF09802_consen 10 LAYVAVLVGSLATFSSIYRKRKAAKS-ASLEPWFPEHL-QRDIYLS 53 (190)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhh-ccCCCCCCchh-HHHHHHH
Confidence 33333434333 33344444444343 34555544442 4565554
No 211
>PF09402 MSC: Man1-Src1p-C-terminal domain; InterPro: IPR018996 This entry represents the Inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope []. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad []. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. Has some overlapping function with emr-1.; GO: 0005639 integral to nuclear inner membrane; PDB: 2CH0_A.
Probab=31.00 E-value=14 Score=30.70 Aligned_cols=9 Identities=11% Similarity=0.615 Sum_probs=0.0
Q ss_pred CcCCCcccc
Q psy3616 28 FIGPHCAQK 36 (143)
Q Consensus 28 y~G~rCe~~ 36 (143)
+.+-.|...
T Consensus 200 ~lpl~C~~~ 208 (334)
T PF09402_consen 200 YLPLKCRLR 208 (334)
T ss_dssp ---------
T ss_pred ccccEEEEe
Confidence 455556443
No 212
>PF14979 TMEM52: Transmembrane 52
Probab=30.95 E-value=1.2e+02 Score=23.34 Aligned_cols=7 Identities=29% Similarity=0.577 Sum_probs=2.8
Q ss_pred HHHHHHH
Q psy3616 60 VWMICAR 66 (143)
Q Consensus 60 v~~~~~r 66 (143)
++..|+|
T Consensus 37 ~ta~C~r 43 (154)
T PF14979_consen 37 LTASCVR 43 (154)
T ss_pred HHHHHHH
Confidence 3334444
No 213
>KOG4818|consensus
Probab=30.60 E-value=51 Score=28.77 Aligned_cols=12 Identities=25% Similarity=0.462 Sum_probs=4.5
Q ss_pred HHHHHHHHHHHH
Q psy3616 48 ATVVFLIIIALF 59 (143)
Q Consensus 48 ~~Vl~lilIvll 59 (143)
+++..+++++++
T Consensus 335 ~~l~gl~~~vli 346 (362)
T KOG4818|consen 335 AILAGLVLVVLI 346 (362)
T ss_pred HHHHHHHHHHHH
Confidence 333333333333
No 214
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway [].
Probab=30.59 E-value=11 Score=29.16 Aligned_cols=28 Identities=18% Similarity=0.102 Sum_probs=12.2
Q ss_pred HHHHHHHHHHHHHhhhccccccccccccCCCCc
Q psy3616 54 IIIALFVWMICARSERRREPKKLVAQTNDQTGS 86 (143)
Q Consensus 54 ilIvllv~~~~~r~rr~kk~k~~~~~~~~~~gs 86 (143)
.||++++.++ +...+|+|++ .|.+-+|-
T Consensus 61 ~ill~il~lv-f~~c~r~kkt----dfidSdGk 88 (154)
T PF04478_consen 61 PILLGILALV-FIFCIRRKKT----DFIDSDGK 88 (154)
T ss_pred HHHHHHHHhh-eeEEEecccC----ccccCCCc
Confidence 4444444444 4444343333 24454543
No 215
>PF01561 Hanta_G2: Hantavirus glycoprotein G2; InterPro: IPR002532 The medium (M) genome segment of Hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins [], G1 and G2, as a polyprotein precursor. This entry represents the polyprotein region which forms the G2 glycoprotein.; GO: 0030683 evasion by virus of host immune response, 0044423 virion part
Probab=30.30 E-value=23 Score=31.68 Aligned_cols=8 Identities=38% Similarity=0.970 Sum_probs=3.5
Q ss_pred CCCCeeee
Q psy3616 17 RGQPSCRC 24 (143)
Q Consensus 17 ~~~~~C~C 24 (143)
.+.|.|..
T Consensus 430 Dgap~C~i 437 (485)
T PF01561_consen 430 DGAPECGI 437 (485)
T ss_pred CCCCCcce
Confidence 34444444
No 216
>PF11694 DUF3290: Protein of unknown function (DUF3290); InterPro: IPR021707 This family of proteins with unknown function appears to be restricted to Firmicutes.
Probab=30.22 E-value=68 Score=24.35 Aligned_cols=6 Identities=17% Similarity=0.833 Sum_probs=2.2
Q ss_pred CCCCCC
Q psy3616 115 DEEDGW 120 (143)
Q Consensus 115 ~~~d~~ 120 (143)
+|-+.|
T Consensus 125 ~d~~sY 130 (149)
T PF11694_consen 125 DDNNSY 130 (149)
T ss_pred CCCCeE
Confidence 333333
No 217
>PRK10884 SH3 domain-containing protein; Provisional
Probab=30.13 E-value=30 Score=27.63 Aligned_cols=15 Identities=7% Similarity=-0.006 Sum_probs=7.2
Q ss_pred HHHHHHHHHHHHHHH
Q psy3616 47 AATVVFLIIIALFVW 61 (143)
Q Consensus 47 ~~~Vl~lilIvllv~ 61 (143)
+|+|+++.+|+.+++
T Consensus 177 Gg~v~~~GlllGlil 191 (206)
T PRK10884 177 GGGVAGIGLLLGLLL 191 (206)
T ss_pred chHHHHHHHHHHHHh
Confidence 345555555544444
No 218
>PRK14750 kdpF potassium-transporting ATPase subunit F; Provisional
Probab=29.63 E-value=92 Score=17.45 Aligned_cols=12 Identities=8% Similarity=0.592 Sum_probs=5.2
Q ss_pred HHHHHHHHHHHH
Q psy3616 46 IAATVVFLIIIA 57 (143)
Q Consensus 46 i~~~Vl~lilIv 57 (143)
++++++++++++
T Consensus 6 i~g~llv~lLl~ 17 (29)
T PRK14750 6 VCGALLVLLLLG 17 (29)
T ss_pred HHHHHHHHHHHH
Confidence 344444444443
No 219
>COG5487 Small integral membrane protein [Function unknown]
Probab=29.28 E-value=1e+02 Score=19.61 Aligned_cols=18 Identities=11% Similarity=0.436 Sum_probs=7.1
Q ss_pred HHHHHHHHHHHHHHHHHH
Q psy3616 49 TVVFLIIIALFVWMICAR 66 (143)
Q Consensus 49 ~Vl~lilIvllv~~~~~r 66 (143)
+++.+.++++++..+.-+
T Consensus 33 IlF~i~~vlf~vsL~~g~ 50 (54)
T COG5487 33 ILFFIFLVLFLVSLFAGL 50 (54)
T ss_pred HHHHHHHHHHHHHHHHHH
Confidence 333333334444444433
No 220
>PF01589 Alpha_E1_glycop: Alphavirus E1 glycoprotein; InterPro: IPR002548 Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses []. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 (IPR000936 from INTERPRO) and E3 (IPR002533 from INTERPRO) causes a change in the viral surface. Together the E1, E2, and sometimes E3, glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike []. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. The alphaviral glycoprotein E1 is a class II viral fusion protein, which is structurally different from the class I fusion proteins found in influenza virus and HIV. The structure of the Semliki Forest virus revealed a structure that is similar to that of flaviviral glycoprotein E, with three structural domains in the same primary sequence arrangement []. This entry represents all three domains of the alphaviral E1 glycoprotein.; GO: 0004252 serine-type endopeptidase activity, 0019028 viral capsid, 0055036 virion membrane; PDB: 2YEW_L 1LD4_P 1Z8Y_K 3MUU_B 3N44_F 2XFB_F 3N42_F 2XFC_H 3N40_F 3N41_F ....
Probab=29.25 E-value=63 Score=29.19 Aligned_cols=23 Identities=22% Similarity=0.375 Sum_probs=9.3
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHH
Q psy3616 44 GGIAATVVFLIIIALFVWMICAR 66 (143)
Q Consensus 44 ~~i~~~Vl~lilIvllv~~~~~r 66 (143)
+|+.+++++.++|++++.++..+
T Consensus 479 GG~s~li~i~lii~~~V~~~~~t 501 (502)
T PF01589_consen 479 GGASSLIIIGLIILVCVTCVTFT 501 (502)
T ss_dssp SHHHHHHHHHHHHHHHHHHHHHH
T ss_pred cchHHHHHHHHHHHHhheEEEec
Confidence 33333333333344444444443
No 221
>COG4477 EzrA Negative regulator of septation ring formation [Cell division and chromosome partitioning]
Probab=28.94 E-value=49 Score=30.42 Aligned_cols=11 Identities=9% Similarity=0.018 Sum_probs=5.0
Q ss_pred HHHHHHHHHhh
Q psy3616 58 LFVWMICARSE 68 (143)
Q Consensus 58 llv~~~~~r~r 68 (143)
+.++.+..|||
T Consensus 15 ~~~~g~~lRkk 25 (570)
T COG4477 15 AYAVGYLLRKK 25 (570)
T ss_pred HHHHHHHHHHh
Confidence 33444445554
No 222
>PF14851 FAM176: FAM176 family
Probab=28.74 E-value=32 Score=26.42 Aligned_cols=6 Identities=33% Similarity=0.252 Sum_probs=2.3
Q ss_pred HHHHHH
Q psy3616 60 VWMICA 65 (143)
Q Consensus 60 v~~~~~ 65 (143)
|.-+.+
T Consensus 42 V~risc 47 (153)
T PF14851_consen 42 VIRISC 47 (153)
T ss_pred Hhhhee
Confidence 333333
No 223
>PF05454 DAG1: Dystroglycan (Dystrophin-associated glycoprotein 1); InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence that dystroglycan functions as part of a signal transduction pathway, because Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of the dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear [].; PDB: 1EG4_P.
Probab=28.67 E-value=19 Score=30.40 Aligned_cols=10 Identities=10% Similarity=0.321 Sum_probs=0.0
Q ss_pred CCCCCCcccc
Q psy3616 118 DGWEMPNFYN 127 (143)
Q Consensus 118 d~~~~~~~~~ 127 (143)
|.|++|....
T Consensus 256 ~~y~ppp~~~ 265 (290)
T PF05454_consen 256 PPYQPPPPFT 265 (290)
T ss_dssp ----------
T ss_pred CccCCCCCcc
Confidence 5588887765
No 224
>PHA02902 putative IMV membrane protein; Provisional
Probab=28.55 E-value=96 Score=20.72 Aligned_cols=17 Identities=12% Similarity=0.346 Sum_probs=7.3
Q ss_pred CCCccccCCCCCCCccc
Q psy3616 95 APYAESVAPSHHSTYAH 111 (143)
Q Consensus 95 ppy~e~~~~~~~~~~~~ 111 (143)
|.|.++.-|.+.++++.
T Consensus 48 ~~F~D~lTpDQirAlHr 64 (70)
T PHA02902 48 PLFKDSLTPDQIKALHR 64 (70)
T ss_pred chhhccCCHHHHHHHHH
Confidence 33444444444444433
No 225
>PRK14584 hmsS hemin storage system protein; Provisional
Probab=28.49 E-value=68 Score=24.68 Aligned_cols=11 Identities=18% Similarity=0.594 Sum_probs=4.3
Q ss_pred HHHHHHHHHhh
Q psy3616 58 LFVWMICARSE 68 (143)
Q Consensus 58 llv~~~~~r~r 68 (143)
+++|.-+-++|
T Consensus 77 LI~WA~YN~~R 87 (153)
T PRK14584 77 LIIWAKYNQVR 87 (153)
T ss_pred HHHHHHHHHHH
Confidence 33444333333
No 226
>PF05297 Herpes_LMP1: Herpesvirus latent membrane protein 1 (LMP1); InterPro: IPR007961 This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4). LMP1 of HHV-4 is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N terminus and a long cytoplasmic carboxy tail of 200 amino acids. HHV-4 virus latent membrane protein 1 (LMP1) is essential for HHV-4 mediated transformation and has been associated with several cases of malignancies. HHV-4-like viruses in Macaca fascicularis (Cynomolgus monkeys) have been associated with high lymphoma rates in immunosuppressed monkeys [].; GO: 0019087 transformation of host cell by virus, 0016021 integral to membrane; PDB: 1CZY_E 1ZMS_B.
Probab=28.34 E-value=19 Score=30.87 Aligned_cols=9 Identities=33% Similarity=0.552 Sum_probs=0.0
Q ss_pred CcCCCcccc
Q psy3616 28 FIGPHCAQK 36 (143)
Q Consensus 28 y~G~rCe~~ 36 (143)
-+-+||-..
T Consensus 13 ~r~pr~p~~ 21 (381)
T PF05297_consen 13 RRPPRCPQP 21 (381)
T ss_dssp ---------
T ss_pred CCCCCCCCc
Confidence 345666654
No 227
>PF07669 Eco57I: Eco57I restriction-modification methylase; InterPro: IPR011639 This entry contains restriction modification methylases, which in the case of endonuclease Eco57I is found adjacent to the DNA cleavage domain, which recognises asymmetric DNA sequence 5'-CTGAAG [, ]. The methylase causes specific methylation on A-5 on one strand, the other strand being methylated by the Eco57IB methylase []. ; GO: 0003677 DNA binding, 0003824 catalytic activity, 0006304 DNA modification
Probab=27.49 E-value=25 Score=24.54 Aligned_cols=12 Identities=25% Similarity=0.606 Sum_probs=10.5
Q ss_pred eeecCCCCCccc
Q psy3616 89 NFYYGGAPYAES 100 (143)
Q Consensus 89 N~~~g~ppy~e~ 100 (143)
++.+|+|||...
T Consensus 4 D~VIGNPPY~~~ 15 (106)
T PF07669_consen 4 DVVIGNPPYIKI 15 (106)
T ss_pred CEEEECCCChhh
Confidence 578999999996
No 228
>KOG1214|consensus
Probab=27.30 E-value=57 Score=32.02 Aligned_cols=30 Identities=20% Similarity=0.500 Sum_probs=24.9
Q ss_pred CCCCCCCCcEEEeCCCCCCeeeecCC--CcCCC
Q psy3616 2 CKGYCENKGTCVKDARGQPSCRCVGS--FIGPH 32 (143)
Q Consensus 2 C~~~C~NgG~C~~~~~~~~~C~C~~g--y~G~r 32 (143)
|..-|--+.+|+++ .+.++|+|..+ |+|++
T Consensus 740 ~~~~CGp~s~Cin~-pg~~rceC~~gy~F~dd~ 771 (1289)
T KOG1214|consen 740 GFHRCGPNSVCINL-PGSYRCECRSGYEFADDR 771 (1289)
T ss_pred CCCCCCCCceeecC-CCceeEEEeecceeccCC
Confidence 56678888999999 88899999988 66663
No 229
>PF04881 Adeno_GP19K: Adenovirus GP19K; InterPro: IPR006965 This 19 kDa glycoprotein binds the major histocompatibility (MHC) class I antigens in the endoplasmic reticulum (ER). The ER retention signal at the C terminus of Gp19K causes retention of the complex in the ER, preventing lysis of the cell by cytotoxic T-lymphocytes [].; GO: 0005537 mannose binding, 0050690 regulation of defense response to virus by virus
Probab=27.14 E-value=36 Score=25.69 Aligned_cols=9 Identities=11% Similarity=0.150 Sum_probs=3.4
Q ss_pred HHHHHHHhh
Q psy3616 60 VWMICARSE 68 (143)
Q Consensus 60 v~~~~~r~r 68 (143)
.|++-.|-|
T Consensus 121 ~~~i~~kpR 129 (139)
T PF04881_consen 121 HLLIKIKPR 129 (139)
T ss_pred hhheeeccc
Confidence 333333333
No 230
>PRK11056 hypothetical protein; Provisional
Probab=27.13 E-value=81 Score=23.38 Aligned_cols=12 Identities=25% Similarity=0.448 Sum_probs=4.7
Q ss_pred HHHHHHHhhhcc
Q psy3616 60 VWMICARSERRR 71 (143)
Q Consensus 60 v~~~~~r~rr~k 71 (143)
++++..|-+.+|
T Consensus 104 ~~Wi~~kl~~~~ 115 (120)
T PRK11056 104 VFWIGRKLRNRK 115 (120)
T ss_pred HHHHHHHHhccc
Confidence 333444444333
No 231
>KOG3637|consensus
Probab=26.73 E-value=65 Score=31.76 Aligned_cols=34 Identities=18% Similarity=0.092 Sum_probs=17.3
Q ss_pred eeehhHHHHHHHHHHHHHHHHHHHHHhhhcccccc
Q psy3616 41 YIAGGIAATVVFLIIIALFVWMICARSERRREPKK 75 (143)
Q Consensus 41 ~ia~~i~~~Vl~lilIvllv~~~~~r~rr~kk~k~ 75 (143)
.++++|+++++.|||++++++ +.++----||+++
T Consensus 977 p~wiIi~svl~GLLlL~llv~-~LwK~GFFKR~r~ 1010 (1030)
T KOG3637|consen 977 PLWIIILSVLGGLLLLALLVL-LLWKCGFFKRNRK 1010 (1030)
T ss_pred ceeeehHHHHHHHHHHHHHHH-HHHhcCccccCCC
Confidence 355555555555555555544 4444444455543
No 232
>KOG2767|consensus
Probab=26.70 E-value=55 Score=28.70 Aligned_cols=17 Identities=18% Similarity=0.552 Sum_probs=12.7
Q ss_pred CcccccCCCCCCCCCCc
Q psy3616 108 TYAHYYDDEEDGWEMPN 124 (143)
Q Consensus 108 ~~~~~y~~~~d~~~~~~ 124 (143)
+...-++||||+|++-+
T Consensus 195 ~~~t~e~~DDddW~~Dt 211 (400)
T KOG2767|consen 195 PLETAEEDDDDDWAVDT 211 (400)
T ss_pred Ccccccccccccccccc
Confidence 35567788888898765
No 233
>PF07297 DPM2: Dolichol phosphate-mannose biosynthesis regulatory protein (DPM2); InterPro: IPR009914 This family consists of several eukaryotic dolichol phosphate-mannose biosynthesis regulatory (DPM2) proteins. Biosynthesis of glycosylphosphatidylinositol and N-glycan precursor is dependent upon a mannosyl donor, dolichol phosphate-mannose (DPM). DPM2, an 84 amino acid membrane protein expressed in the endoplasmic reticulum (ER), makes a complex with DPM1 that is essential for the ER localisation and stable expression of DPM1. Moreover, DPM2 enhances binding of dolichol phosphate, a substrate of DPM synthase. Biosynthesis of DPM in mammalian cells is regulated by DPM2 [].; GO: 0009059 macromolecule biosynthetic process, 0030176 integral to endoplasmic reticulum membrane
Probab=26.69 E-value=1e+02 Score=21.09 Aligned_cols=7 Identities=14% Similarity=0.349 Sum_probs=2.7
Q ss_pred HHHHHhh
Q psy3616 62 MICARSE 68 (143)
Q Consensus 62 ~~~~r~r 68 (143)
++..+.+
T Consensus 68 ~vmik~~ 74 (78)
T PF07297_consen 68 YVMIKSK 74 (78)
T ss_pred HHHhhcc
Confidence 3344333
No 234
>PF14851 FAM176: FAM176 family
Probab=26.63 E-value=73 Score=24.47 Aligned_cols=15 Identities=13% Similarity=0.246 Sum_probs=6.5
Q ss_pred HHHHHHHHHHHHHHH
Q psy3616 47 AATVVFLIIIALFVW 61 (143)
Q Consensus 47 ~~~Vl~lilIvllv~ 61 (143)
+|+|+.|.++++=+.
T Consensus 32 ~GLlLtLcllV~ris 46 (153)
T PF14851_consen 32 AGLLLTLCLLVIRIS 46 (153)
T ss_pred HHHHHHHHHHHhhhe
Confidence 344444444454333
No 235
>PRK14585 pgaD putative PGA biosynthesis protein; Provisional
Probab=26.42 E-value=79 Score=23.95 Aligned_cols=7 Identities=14% Similarity=0.192 Sum_probs=2.9
Q ss_pred HHHHhhh
Q psy3616 63 ICARSER 69 (143)
Q Consensus 63 ~~~r~rr 69 (143)
.+.++++
T Consensus 70 ~WA~YNq 76 (137)
T PRK14585 70 VWALYNK 76 (137)
T ss_pred HHHHHHH
Confidence 3344443
No 236
>PHA03164 hypothetical protein; Provisional
Probab=26.36 E-value=69 Score=22.15 Aligned_cols=6 Identities=17% Similarity=0.157 Sum_probs=2.3
Q ss_pred Ceeeec
Q psy3616 20 PSCRCV 25 (143)
Q Consensus 20 ~~C~C~ 25 (143)
..|.-+
T Consensus 35 veclpP 40 (88)
T PHA03164 35 VECLPP 40 (88)
T ss_pred ceecCC
Confidence 344333
No 237
>PF07271 Cytadhesin_P30: Cytadhesin P30/P32; InterPro: IPR009896 This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localised on the tip organelle. It is thought that it is important in cytadherence and virulence [].; GO: 0007157 heterophilic cell-cell adhesion, 0009405 pathogenesis, 0016021 integral to membrane
Probab=26.36 E-value=51 Score=27.71 Aligned_cols=10 Identities=20% Similarity=0.401 Sum_probs=3.9
Q ss_pred HHHHHHHHHH
Q psy3616 47 AATVVFLIII 56 (143)
Q Consensus 47 ~~~Vl~lilI 56 (143)
+|+++++|||
T Consensus 77 ~G~~~v~liL 86 (279)
T PF07271_consen 77 AGLLAVALIL 86 (279)
T ss_pred hhHHHHHHHH
Confidence 3443333333
No 238
>PHA03294 envelope glycoprotein H; Provisional
Probab=26.08 E-value=55 Score=31.59 Aligned_cols=28 Identities=18% Similarity=0.457 Sum_probs=21.0
Q ss_pred eeeehhHHHHHHHHHHHHHHHHHHHHHh
Q psy3616 40 AYIAGGIAATVVFLIIIALFVWMICARS 67 (143)
Q Consensus 40 ~~ia~~i~~~Vl~lilIvllv~~~~~r~ 67 (143)
.+|+++++|+++.++++..+++|+|---
T Consensus 803 ~yi~aSv~G~~~a~~~~~~i~kmlc~~~ 830 (835)
T PHA03294 803 TYLAASVGGALLAVAILYGIAKMLCSNV 830 (835)
T ss_pred hhHHHHHHHHHHHHHHHHHHHHHHhcCC
Confidence 3677777888887777778888887543
No 239
>TIGR00383 corA magnesium Mg(2+) and cobalt Co(2+) transport protein (corA). The article in Microb Comp Genomics 1998;3(3):151-69 (Medline:98448512) discusses this family and suggests that some members may have functions other than Mg2+ transport.
Probab=25.68 E-value=80 Score=25.78 Aligned_cols=6 Identities=33% Similarity=0.473 Sum_probs=2.4
Q ss_pred HHHHHh
Q psy3616 62 MICARS 67 (143)
Q Consensus 62 ~~~~r~ 67 (143)
++++||
T Consensus 309 ~~~fkr 314 (318)
T TIGR00383 309 LIYFRR 314 (318)
T ss_pred HHHHHH
Confidence 334443
No 240
>TIGR02205 septum_zipA cell division protein ZipA. This model represents the full length of bacterial cell division protein ZipA. The N-terminal hydrophobic stretch is an uncleaved signal-anchor sequence. This is followed by an unconserved, variable length, low complexity region, and then a conserved C-terminal region of about 140 amino acids (see pfam04354) that interacts with the tubulin-like cell division protein FtsZ.
Probab=25.63 E-value=34 Score=28.72 Aligned_cols=9 Identities=33% Similarity=0.442 Sum_probs=3.8
Q ss_pred HHHHHHHHH
Q psy3616 49 TVVFLIIIA 57 (143)
Q Consensus 49 ~Vl~lilIv 57 (143)
+|+.+|+|+
T Consensus 7 IIvGaiaI~ 15 (284)
T TIGR02205 7 IIVGILAIA 15 (284)
T ss_pred HHHHHHHHH
Confidence 444444444
No 241
>PF10868 DUF2667: Protein of unknown function (DUF2667); InterPro: IPR022618 This family of proteins with unknown function appears to be restricted to Arabidopsis thaliana.
Probab=25.41 E-value=27 Score=24.61 Aligned_cols=19 Identities=32% Similarity=0.649 Sum_probs=13.0
Q ss_pred CCcEEEeC--CCCCCeeeecC
Q psy3616 8 NKGTCVKD--ARGQPSCRCVG 26 (143)
Q Consensus 8 NgG~C~~~--~~~~~~C~C~~ 26 (143)
+||.|... ..+.+.|-|-.
T Consensus 59 ~GG~C~~~~~~~~~~~C~Cc~ 79 (90)
T PF10868_consen 59 YGGQCVPVGPPPGDGVCYCCY 79 (90)
T ss_pred CCceeccCCCCCCCcEEEEec
Confidence 57888875 34567887644
No 242
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=25.31 E-value=32 Score=29.91 Aligned_cols=31 Identities=23% Similarity=0.311 Sum_probs=17.1
Q ss_pred eehhHHHHHHHHHH-HHHHHHHHHHHhhhccc
Q psy3616 42 IAGGIAATVVFLII-IALFVWMICARSERRRE 72 (143)
Q Consensus 42 ia~~i~~~Vl~lil-Ivllv~~~~~r~rr~kk 72 (143)
....|+++.+.+|| |..||-|++|+.--|.|
T Consensus 365 stgaIaGIsvavvvvVgglvGfLcWwf~crgk 396 (397)
T PF03302_consen 365 STGAIAGISVAVVVVVGGLVGFLCWWFICRGK 396 (397)
T ss_pred cccceeeeeehhHHHHHHHHHHHhhheeeccc
Confidence 34445555555444 44567777776654543
No 243
>PRK00269 zipA cell division protein ZipA; Reviewed
Probab=25.02 E-value=1e+02 Score=26.18 Aligned_cols=6 Identities=33% Similarity=0.091 Sum_probs=2.6
Q ss_pred HHHhhh
Q psy3616 64 CARSER 69 (143)
Q Consensus 64 ~~r~rr 69 (143)
-+||+|
T Consensus 24 ~~~r~r 29 (293)
T PRK00269 24 GWRRMR 29 (293)
T ss_pred HHHHHh
Confidence 344444
No 244
>COG4059 MtrE Tetrahydromethanopterin S-methyltransferase, subunit E [Coenzyme metabolism]
Probab=24.58 E-value=67 Score=26.66 Aligned_cols=12 Identities=0% Similarity=0.075 Sum_probs=6.4
Q ss_pred cccccccccccC
Q psy3616 71 REPKKLVAQTND 82 (143)
Q Consensus 71 kk~k~~~~~~~~ 82 (143)
+|.++.+.++.+
T Consensus 285 v~ARn~YGpY~e 296 (304)
T COG4059 285 VKARNAYGPYKE 296 (304)
T ss_pred hhhhhccCCccc
Confidence 444555665554
No 245
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=24.27 E-value=88 Score=18.71 Aligned_cols=10 Identities=20% Similarity=0.929 Sum_probs=7.9
Q ss_pred CeeeecCCCc
Q psy3616 20 PSCRCVGSFI 29 (143)
Q Consensus 20 ~~C~C~~gy~ 29 (143)
-.|.|+.||.
T Consensus 37 g~C~C~~g~~ 46 (52)
T PF01683_consen 37 GRCQCPPGYV 46 (52)
T ss_pred CEeECCCCCE
Confidence 4899999864
No 246
>cd00930 Cyt_c_Oxidase_VIII Cytochrome oxidase c subunit VIII. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIII is the smallest of the nuclear-encoded subunits. It exists in muscle-specific and non-muscle-specific isoforms that are differently expressed in different species, suggesting species-specific regulation of energy metabolism.
Probab=23.91 E-value=1.6e+02 Score=17.90 Aligned_cols=23 Identities=13% Similarity=0.366 Sum_probs=12.1
Q ss_pred HHHHHHHHHHHHHHHHHHhhhcc
Q psy3616 49 TVVFLIIIALFVWMICARSERRR 71 (143)
Q Consensus 49 ~Vl~lilIvllv~~~~~r~rr~k 71 (143)
.++++-+++.-.|++..-...||
T Consensus 20 ~~~f~~~L~p~gWVLshL~~YKk 42 (43)
T cd00930 20 SVFFTTFLLPAGWVLSHLENYKK 42 (43)
T ss_pred HHHHHHHHhhHHHHHHHHHHhcc
Confidence 34444444445677766555443
No 247
>PF10361 DUF2434: Protein of unknown function (DUF2434); InterPro: IPR018830 This entry represents a family of proteins conserved in fungi. Their function is not known.
Probab=23.85 E-value=3.3e+02 Score=23.21 Aligned_cols=34 Identities=21% Similarity=0.318 Sum_probs=21.2
Q ss_pred CCCcEEEeC-CCCCCeeeecCC-Cc-CCCccccccce
Q psy3616 7 ENKGTCVKD-ARGQPSCRCVGS-FI-GPHCAQKSEFA 40 (143)
Q Consensus 7 ~NgG~C~~~-~~~~~~C~C~~g-y~-G~rCe~~~~~~ 40 (143)
-||-.|... +.-.|.=.=+.| |. |..|..+.+.+
T Consensus 7 SNgs~C~L~f~~y~P~~~~~NGtfiN~TsCysPi~~i 43 (296)
T PF10361_consen 7 SNGSNCYLTFDPYTPTMVLSNGTFINGTSCYSPINPI 43 (296)
T ss_pred cCCCeEEEEcCCcCceEEcCCCcEEcCcccCCCCccc
Confidence 467778765 344455544444 44 99999887554
No 248
>PF11743 DUF3301: Protein of unknown function (DUF3301); InterPro: IPR021732 This family is conserved in Proteobacteria, but the function is not known.
Probab=23.15 E-value=68 Score=22.40 Aligned_cols=9 Identities=11% Similarity=0.397 Sum_probs=3.8
Q ss_pred CCcccccCC
Q psy3616 107 STYAHYYDD 115 (143)
Q Consensus 107 ~~~~~~y~~ 115 (143)
+.|.=.|..
T Consensus 64 r~y~FEFS~ 72 (97)
T PF11743_consen 64 RVYQFEFSS 72 (97)
T ss_pred EEEEEEEeC
Confidence 344444433
No 249
>PF06295 DUF1043: Protein of unknown function (DUF1043); InterPro: IPR009386 This entry consists of several hypothetical bacterial proteins of unknown function.
Probab=23.02 E-value=83 Score=22.98 Aligned_cols=9 Identities=33% Similarity=0.324 Sum_probs=3.9
Q ss_pred CCCCccccc
Q psy3616 120 WEMPNFYNE 128 (143)
Q Consensus 120 ~~~~~~~~~ 128 (143)
-++|.=|-+
T Consensus 108 ~~qPrDYa~ 116 (128)
T PF06295_consen 108 EEQPRDYAP 116 (128)
T ss_pred CCCCCCCCC
Confidence 344444433
No 250
>KOG4433|consensus
Probab=23.00 E-value=1.1e+02 Score=28.04 Aligned_cols=15 Identities=20% Similarity=0.339 Sum_probs=6.4
Q ss_pred eehhHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIII 56 (143)
Q Consensus 42 ia~~i~~~Vl~lilI 56 (143)
+.+.++++++++.||
T Consensus 45 lla~l~aa~l~l~Ll 59 (526)
T KOG4433|consen 45 LLAALAAACLGLSLL 59 (526)
T ss_pred HHHHHHHHHHHHHHH
Confidence 333344444444443
No 251
>KOG1218|consensus
Probab=22.74 E-value=68 Score=25.68 Aligned_cols=19 Identities=26% Similarity=0.781 Sum_probs=14.8
Q ss_pred CCCeeeecCCCcCCCcccc
Q psy3616 18 GQPSCRCVGSFIGPHCAQK 36 (143)
Q Consensus 18 ~~~~C~C~~gy~G~rCe~~ 36 (143)
....|.|.+||.|.+|+..
T Consensus 160 ~~~~c~c~~g~~g~~~~~~ 178 (316)
T KOG1218|consen 160 KNGICTCQPGFVGVFCVES 178 (316)
T ss_pred CCCceeccCCccccccccc
Confidence 3467888899888888865
No 252
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=22.50 E-value=1.1e+02 Score=23.15 Aligned_cols=29 Identities=10% Similarity=0.194 Sum_probs=15.1
Q ss_pred HHHHHHHHHH-HHHHHHHHhhhcccccccc
Q psy3616 49 TVVFLIIIAL-FVWMICARSERRREPKKLV 77 (143)
Q Consensus 49 ~Vl~lilIvl-lv~~~~~r~rr~kk~k~~~ 77 (143)
-+++++++.+ ++.....-.|+-||.|-.+
T Consensus 104 ~~il~il~~i~is~~~~~~yr~~r~~~~~~ 133 (139)
T PHA03099 104 PGIVLVLVGIIITCCLLSVYRFTRRTKLPL 133 (139)
T ss_pred hHHHHHHHHHHHHHHHHhhheeeecccCch
Confidence 3333333333 3445566666677776444
No 253
>PF11118 DUF2627: Protein of unknown function (DUF2627); InterPro: IPR020138 This entry represents uncharacterised membrane proteins with no known function.
Probab=22.41 E-value=1.1e+02 Score=21.02 Aligned_cols=24 Identities=25% Similarity=0.411 Sum_probs=12.2
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhc
Q psy3616 47 AATVVFLIIIALFVWMICARSERR 70 (143)
Q Consensus 47 ~~~Vl~lilIvllv~~~~~r~rr~ 70 (143)
++++++++-+.++.-++.+|-|||
T Consensus 45 ~G~~lf~~G~~Fi~GfI~~RDRKr 68 (77)
T PF11118_consen 45 AGLLLFAIGVGFIAGFILHRDRKR 68 (77)
T ss_pred HHHHHHHHHHHHHHhHhheeeccc
Confidence 444444444445555566665544
No 254
>PF15234 LAT: Linker for activation of T-cells
Probab=22.14 E-value=4.3e+02 Score=21.35 Aligned_cols=15 Identities=20% Similarity=0.709 Sum_probs=7.2
Q ss_pred HHHHHHHHHHHHhhh
Q psy3616 55 IIALFVWMICARSER 69 (143)
Q Consensus 55 lIvllv~~~~~r~rr 69 (143)
++++++..+|.|-|.
T Consensus 18 lla~LlmALCvrCRe 32 (230)
T PF15234_consen 18 LLAVLLMALCVRCRE 32 (230)
T ss_pred HHHHHHHHHHHHHhh
Confidence 333444455665443
No 255
>PF15106 TMEM156: TMEM156 protein family
Probab=22.08 E-value=1.1e+02 Score=25.03 Aligned_cols=6 Identities=0% Similarity=-0.169 Sum_probs=2.4
Q ss_pred cccccc
Q psy3616 73 PKKLVA 78 (143)
Q Consensus 73 ~k~~~~ 78 (143)
++.++.
T Consensus 208 q~hky~ 213 (226)
T PF15106_consen 208 QSHKYK 213 (226)
T ss_pred hhcCCC
Confidence 334443
No 256
>PF07226 DUF1422: Protein of unknown function (DUF1422); InterPro: IPR009867 This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.
Probab=21.86 E-value=1.1e+02 Score=22.57 Aligned_cols=10 Identities=20% Similarity=0.126 Sum_probs=3.9
Q ss_pred HHHHHHhhhc
Q psy3616 61 WMICARSERR 70 (143)
Q Consensus 61 ~~~~~r~rr~ 70 (143)
+++..|-+.+
T Consensus 105 ~Wi~~kl~~~ 114 (117)
T PF07226_consen 105 FWIGYKLGFR 114 (117)
T ss_pred HHHHHHHhhh
Confidence 3344443333
No 257
>PF05478 Prominin: Prominin; InterPro: IPR008795 The prominins are an emerging family of proteins that, among the multispan membrane proteins, display a novel topology. Mouse and Homo sapiens prominin and (Mus musculus) prominin-like 1 (PROML1) are predicted to contain five membrane spanning domains, with an N-terminal domain exposed to the extracellular space followed by four, alternating small cytoplasmic and large extracellular, loops and a cytoplasmic C-terminal domain []. The exact function of prominin is unknown although in humans defects in PROM1, the gene coding for prominin, cause retinal degeneration [].; GO: 0016021 integral to membrane
Probab=21.50 E-value=71 Score=30.32 Aligned_cols=7 Identities=29% Similarity=0.857 Sum_probs=3.9
Q ss_pred HHHHHHH
Q psy3616 60 VWMICAR 66 (143)
Q Consensus 60 v~~~~~r 66 (143)
++++|+|
T Consensus 112 ~~fCcCR 118 (806)
T PF05478_consen 112 LCFCCCR 118 (806)
T ss_pred HHHhccc
Confidence 4556554
No 258
>PF02013 CBM_10: Cellulose or protein binding domain; InterPro: IPR002883 This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. The recycling of photosynthetically fixed carbon in plant cell walls is a key microbial process. Enzyme systems that attack the plant cell wall contain noncatalytic carbohydrate-binding modules that mediate attachment to this composite structure and play a pivotal role in maximizing the hydrolytic process. In anaerobes, the degradation is carried out by a high molecular weight, multifunctional complex termed the cellulosome. This consists of a number of independent enzyme components, each of which contains a conserved 40-residue dockerin domain, which functions to bind the enzyme to a cohesin domain within the scaffoldin protein [, ]. In anaerobic bacteria that degrade plant cell walls, exemplified by Clostridium thermocellum, the dockerin domains of the catalytic polypeptides can bind equally well to any cohesin from the same organism. More recently, anaerobic fungi, typified by Piromyces equi, have been suggested to also synthesise a cellulosome complex, although the dockerin sequences of the bacterial and fungal enzymes are completely different []. For example, the fungal enzymes contain one, two or three copies of the dockerin sequence in tandem within the catalytic polypeptide. In contrast, all the C. thermocellum cellulosome catalytic components contain a single dockerin domain. The anaerobic bacterial dockerins are homologous to EF hands (calcium-binding motifs) and require calcium for activity whereas the fungal dockerin does not require calcium. Finally, whereas the interaction between cohesin and dockerin appears to be species specific in bacteria, there is almost no species specificity of binding within fungal species and no identified sites that distinguish different species. The structure of dockerin from P. equi contains two helical stretches and four short beta-strands which form an antiparallel sheet structure adjacent to an additional short twisted parallel strand. The N- and C-termini are adjacent to each other. Aerobic bacteria contain related regions; however, these appear to function as cellulose/carbohydrate binding domains.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 2J4M_A 2J4N_A 1E8R_A 1QLD_A 1E8P_A 1E8Q_A.
Probab=21.25 E-value=15 Score=21.57 Aligned_cols=14 Identities=36% Similarity=0.947 Sum_probs=9.6
Q ss_pred cccCCCCCCCCCCc
Q psy3616 111 HYYDDEEDGWEMPN 124 (143)
Q Consensus 111 ~~y~~~~d~~~~~~ 124 (143)
-+|+|++++|..+|
T Consensus 17 v~y~d~~g~WGvEN 30 (36)
T PF02013_consen 17 VVYTDDDGGWGVEN 30 (36)
T ss_dssp -SEEETTEEEEEET
T ss_pred eEEcCCCCCEeeEC
Confidence 37888888887544
No 259
>PF15013 CCSMST1: CCSMST1 family
Probab=21.16 E-value=64 Score=22.04 Aligned_cols=14 Identities=21% Similarity=0.539 Sum_probs=7.1
Q ss_pred HHHHHHHHHHHHhh
Q psy3616 55 IIALFVWMICARSE 68 (143)
Q Consensus 55 lIvllv~~~~~r~r 68 (143)
++++++|++..|..
T Consensus 40 l~~fliyFC~lReE 53 (77)
T PF15013_consen 40 LAAFLIYFCFLREE 53 (77)
T ss_pred HHHHHHHHhhcccc
Confidence 34445565655533
No 260
>PRK10847 hypothetical protein; Provisional
Probab=21.10 E-value=1.3e+02 Score=23.68 Aligned_cols=6 Identities=17% Similarity=0.130 Sum_probs=2.4
Q ss_pred Hhhhcc
Q psy3616 66 RSERRR 71 (143)
Q Consensus 66 r~rr~k 71 (143)
..-|++
T Consensus 206 ~~~r~~ 211 (219)
T PRK10847 206 EIWRHK 211 (219)
T ss_pred HHHHHH
Confidence 344433
No 261
>PRK06073 NADH dehydrogenase subunit A; Validated
Probab=20.93 E-value=2.4e+02 Score=20.69 Aligned_cols=7 Identities=29% Similarity=0.620 Sum_probs=2.9
Q ss_pred CCCCCcc
Q psy3616 93 GGAPYAE 99 (143)
Q Consensus 93 g~ppy~e 99 (143)
|.+|..+
T Consensus 47 G~~p~g~ 53 (124)
T PRK06073 47 GNPPTGP 53 (124)
T ss_pred CCCCCCC
Confidence 3344433
No 262
>PF05624 LSR: Lipolysis stimulated receptor (LSR); InterPro: IPR008664 This domain consists of mammalian LISCH7 protein homologues. LISCH7 is a liver-specific BHLH-ZIP transcription factor.
Probab=20.72 E-value=1.8e+02 Score=18.16 Aligned_cols=7 Identities=14% Similarity=0.453 Sum_probs=2.6
Q ss_pred HHHHHHH
Q psy3616 47 AATVVFL 53 (143)
Q Consensus 47 ~~~Vl~l 53 (143)
.|+++++
T Consensus 10 lg~~ll~ 16 (49)
T PF05624_consen 10 LGALLLL 16 (49)
T ss_pred HHHHHHH
Confidence 3333333
No 263
>PF14155 DUF4307: Domain of unknown function (DUF4307)
Probab=20.62 E-value=1.3e+02 Score=21.44 Aligned_cols=16 Identities=13% Similarity=0.054 Sum_probs=9.4
Q ss_pred ccccCCCCcceeeecC
Q psy3616 78 AQTNDQTGSQVNFYYG 93 (143)
Q Consensus 78 ~~~~~~~gs~~N~~~g 93 (143)
-.|...+++++.+.|-
T Consensus 39 ~gf~vv~d~~v~v~f~ 54 (112)
T PF14155_consen 39 IGFEVVDDSTVEVTFD 54 (112)
T ss_pred EEEEECCCCEEEEEEE
Confidence 3444445667777766
No 264
>PF01561 Hanta_G2: Hantavirus glycoprotein G2; InterPro: IPR002532 The medium (M) genome segment of Hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins [], G1 and G2, as a polyprotein precursor. This entry represents the polyprotein region which forms the G2 glycoprotein.; GO: 0030683 evasion by virus of host immune response, 0044423 virion part
Probab=20.60 E-value=39 Score=30.33 Aligned_cols=6 Identities=17% Similarity=0.678 Sum_probs=2.7
Q ss_pred HHHHHh
Q psy3616 62 MICARS 67 (143)
Q Consensus 62 ~~~~r~ 67 (143)
++|-+|
T Consensus 475 ~~cP~r 480 (485)
T PF01561_consen 475 FFCPVR 480 (485)
T ss_pred eeCcch
Confidence 344444
No 265
>PF11884 DUF3404: Domain of unknown function (DUF3404); InterPro: IPR021821 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 260 amino acids in length. This domain is found associated with PF02518 from PFAM, PF00512 from PFAM.
Probab=20.59 E-value=93 Score=25.98 Aligned_cols=18 Identities=0% Similarity=0.060 Sum_probs=8.7
Q ss_pred HHHHHHHHHhhhcccccc
Q psy3616 58 LFVWMICARSERRREPKK 75 (143)
Q Consensus 58 llv~~~~~r~rr~kk~k~ 75 (143)
+++++-+.-+++++|+++
T Consensus 243 i~l~~gw~~y~~~~krre 260 (262)
T PF11884_consen 243 ILLVLGWSLYRWNQKRRE 260 (262)
T ss_pred HHHHHHHHHHHHHHHHHh
Confidence 333444555555555543
No 266
>KOG1631|consensus
Probab=20.51 E-value=1.7e+02 Score=24.32 Aligned_cols=10 Identities=20% Similarity=0.414 Sum_probs=4.7
Q ss_pred CCCCCCCccc
Q psy3616 117 EDGWEMPNFY 126 (143)
Q Consensus 117 ~d~~~~~~~~ 126 (143)
||+|-+-+=+
T Consensus 230 d~eWip~~tl 239 (261)
T KOG1631|consen 230 DDEWIPGTTL 239 (261)
T ss_pred cccccccHhH
Confidence 4455544443
No 267
>PRK03427 cell division protein ZipA; Provisional
Probab=20.49 E-value=1.1e+02 Score=26.35 Aligned_cols=11 Identities=18% Similarity=0.199 Sum_probs=4.3
Q ss_pred HHHHHHHHHhh
Q psy3616 58 LFVWMICARSE 68 (143)
Q Consensus 58 llv~~~~~r~r 68 (143)
+|+--+|..||
T Consensus 20 lL~HGlWtsRK 30 (333)
T PRK03427 20 LLVHGFWTSRK 30 (333)
T ss_pred HHHHhhhhccc
Confidence 33333444333
No 268
>PHA03270 envelope glycoprotein C; Provisional
Probab=20.35 E-value=23 Score=31.68 Aligned_cols=16 Identities=44% Similarity=0.312 Sum_probs=6.9
Q ss_pred eehhHHHHHHHHHHHH
Q psy3616 42 IAGGIAATVVFLIIIA 57 (143)
Q Consensus 42 ia~~i~~~Vl~lilIv 57 (143)
++++|+.+++.+++++
T Consensus 432 ~~~~i~~~~~Aa~~l~ 447 (466)
T PHA03270 432 EWAGIAAGVLAAIGLA 447 (466)
T ss_pred ehhHHHHHHHHHHHHh
Confidence 3444444444444443
No 269
>PTZ00208 65 kDa invariant surface glycoprotein; Provisional
Probab=20.26 E-value=79 Score=28.17 Aligned_cols=7 Identities=14% Similarity=0.520 Sum_probs=3.1
Q ss_pred HHHHHHh
Q psy3616 61 WMICARS 67 (143)
Q Consensus 61 ~~~~~r~ 67 (143)
++++.||
T Consensus 406 ~~~~v~r 412 (436)
T PTZ00208 406 FFIMVKR 412 (436)
T ss_pred hheeeee
Confidence 4444443
Done!