Query psy4697
Match_columns 383
No_of_seqs 305 out of 2376
Neff 7.9
Searched_HMMs 46136
Date Fri Aug 16 23:32:21 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy4697.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/4697hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG4289|consensus 100.0 2.3E-39 4.9E-44 337.8 22.3 241 3-248 227-479 (2531)
2 KOG4289|consensus 100.0 2.1E-36 4.6E-41 315.7 25.0 243 3-253 332-586 (2531)
3 KOG1219|consensus 100.0 3.2E-33 6.9E-38 300.0 27.4 241 4-249 910-1165(4289)
4 KOG1219|consensus 100.0 5.2E-33 1.1E-37 298.4 27.2 236 4-248 2535-2782(4289)
5 cd00031 CA Cadherin repeat dom 100.0 1.6E-27 3.4E-32 214.7 26.9 185 49-237 1-198 (199)
6 cd00031 CA Cadherin repeat dom 99.8 9.7E-20 2.1E-24 163.9 19.2 132 3-134 60-199 (199)
7 PF00028 Cadherin: Cadherin do 99.6 7.1E-15 1.5E-19 116.5 13.0 84 50-133 1-93 (93)
8 KOG1834|consensus 99.6 4.2E-14 9.1E-19 141.3 18.9 196 33-236 21-242 (952)
9 PF00028 Cadherin: Cadherin do 99.5 1.3E-12 2.8E-17 103.5 13.2 87 147-237 1-93 (93)
10 smart00112 CA Cadherin repeats 99.4 7.7E-13 1.7E-17 101.5 9.4 70 71-140 2-79 (79)
11 KOG1834|consensus 99.3 1.2E-11 2.6E-16 124.0 14.0 128 4-133 102-243 (952)
12 smart00112 CA Cadherin repeats 98.9 5.6E-09 1.2E-13 79.9 9.0 74 171-245 1-79 (79)
13 PF08758 Cadherin_pro: Cadheri 97.8 0.00016 3.5E-09 56.8 8.8 86 41-130 2-88 (90)
14 PF08758 Cadherin_pro: Cadheri 97.4 0.0013 2.8E-08 51.7 8.8 80 139-225 3-82 (90)
15 PF08266 Cadherin_2: Cadherin- 97.4 0.00059 1.3E-08 52.9 6.4 57 50-107 3-66 (84)
16 TIGR01965 VCBS_repeat VCBS rep 95.9 0.063 1.4E-06 42.8 8.3 77 65-145 2-89 (99)
17 smart00736 CADG Dystroglycan-t 95.8 0.13 2.8E-06 40.8 9.8 67 69-137 24-96 (97)
18 PF15102 TMEM154: TMEM154 prot 94.5 0.053 1.2E-06 46.0 4.2 34 254-287 57-90 (146)
19 KOG0196|consensus 94.2 2.3 5.1E-05 45.7 16.4 110 103-225 403-524 (996)
20 smart00736 CADG Dystroglycan-t 94.2 0.77 1.7E-05 36.3 10.3 65 169-237 23-92 (97)
21 TIGR01965 VCBS_repeat VCBS rep 92.3 1.2 2.5E-05 35.6 8.3 72 166-244 2-84 (99)
22 PF08374 Protocadherin: Protoc 91.7 0.13 2.8E-06 46.4 2.5 37 250-286 34-70 (221)
23 KOG3597|consensus 90.1 13 0.00028 37.8 15.2 144 27-184 24-193 (442)
24 PF01102 Glycophorin_A: Glycop 89.6 0.17 3.7E-06 41.9 1.3 19 254-272 65-83 (122)
25 PF01034 Syndecan: Syndecan do 89.0 0.18 3.8E-06 36.6 0.8 13 278-290 33-45 (64)
26 PF08266 Cadherin_2: Cadherin- 88.5 1.2 2.6E-05 34.4 5.3 55 148-207 4-65 (84)
27 PF07495 Y_Y_Y: Y_Y_Y domain; 88.1 4.5 9.8E-05 29.0 8.0 56 178-236 9-65 (66)
28 TIGR00845 caca sodium/calcium 86.5 59 0.0013 36.2 20.8 138 37-184 394-567 (928)
29 PF02439 Adeno_E3_CR2: Adenovi 85.8 0.76 1.6E-05 29.7 2.3 9 257-265 6-14 (38)
30 PF12877 DUF3827: Domain of un 81.0 2 4.3E-05 44.9 4.4 35 253-287 268-302 (684)
31 PF05345 He_PIG: Putative Ig d 80.5 7.4 0.00016 26.7 5.8 37 186-222 11-48 (49)
32 PF02439 Adeno_E3_CR2: Adenovi 80.4 1.9 4.1E-05 27.9 2.5 25 252-276 4-28 (38)
33 KOG1094|consensus 80.4 2.9 6.3E-05 43.7 5.2 23 251-273 388-410 (807)
34 KOG4221|consensus 79.7 1.2E+02 0.0026 34.7 21.2 47 179-225 959-1008(1381)
35 PF10577 UPF0560: Uncharacteri 79.0 1.6 3.4E-05 46.9 2.9 31 254-284 273-303 (807)
36 TIGR03660 T1SS_rpt_143 T1SS-14 78.4 41 0.00088 28.5 12.3 53 96-152 69-125 (137)
37 PF15347 PAG: Phosphoprotein a 77.9 2.9 6.4E-05 40.8 4.2 36 251-286 12-47 (428)
38 PF02009 Rifin_STEVOR: Rifin/s 76.2 0.99 2.1E-05 43.4 0.5 30 254-284 256-286 (299)
39 PF04478 Mid2: Mid2 like cell 74.3 2.3 5.1E-05 36.4 2.2 11 275-285 70-80 (154)
40 PF12273 RCR: Chitin synthesis 73.4 3.1 6.8E-05 34.8 2.8 11 277-287 19-29 (130)
41 PF14575 EphA2_TM: Ephrin type 72.1 2.2 4.7E-05 32.2 1.4 27 258-284 2-28 (75)
42 PF13750 Big_3_3: Bacterial Ig 72.0 66 0.0014 27.9 16.1 121 9-133 15-148 (158)
43 PF05393 Hum_adeno_E3A: Human 70.6 3.3 7.2E-05 31.9 2.0 28 259-287 36-63 (94)
44 PF06024 DUF912: Nucleopolyhed 68.9 5 0.00011 32.1 2.9 30 256-285 64-93 (101)
45 PF01299 Lamp: Lysosome-associ 68.8 3.6 7.7E-05 39.8 2.4 20 253-272 270-289 (306)
46 PF13753 SWM_repeat: Putative 67.8 1.2E+02 0.0026 29.2 18.5 202 8-223 11-228 (317)
47 PF15298 AJAP1_PANP_C: AJAP1/P 66.7 14 0.00031 33.0 5.5 89 253-350 99-192 (205)
48 PF05083 LST1: LST-1 protein; 65.9 2.7 5.9E-05 30.9 0.7 23 278-300 19-41 (74)
49 PTZ00382 Variant-specific surf 64.8 4.3 9.3E-05 32.2 1.7 24 258-281 71-94 (96)
50 PF06365 CD34_antigen: CD34/Po 64.5 21 0.00046 32.3 6.3 7 327-333 158-164 (202)
51 TIGR01478 STEVOR variant surfa 62.3 5.4 0.00012 37.7 2.2 14 46-59 20-33 (295)
52 PF15330 SIT: SHP2-interacting 62.0 7.5 0.00016 31.5 2.7 31 260-290 3-33 (107)
53 PF02480 Herpes_gE: Alphaherpe 61.8 2.6 5.7E-05 42.8 0.0 16 125-140 182-197 (439)
54 PTZ00370 STEVOR; Provisional 61.6 5.7 0.00012 37.7 2.2 11 102-112 65-75 (296)
55 PF13750 Big_3_3: Bacterial Ig 59.9 1.2E+02 0.0025 26.3 15.3 121 108-236 14-147 (158)
56 KOG3597|consensus 59.9 94 0.002 31.7 10.6 59 124-186 24-83 (442)
57 PF12768 Rax2: Cortical protei 59.0 15 0.00032 35.1 4.6 11 255-265 229-239 (281)
58 PF15102 TMEM154: TMEM154 prot 58.7 12 0.00027 31.9 3.5 35 252-286 58-92 (146)
59 PTZ00046 rifin; Provisional 57.2 4.9 0.00011 39.4 1.0 30 254-284 315-345 (358)
60 PF05568 ASFV_J13L: African sw 57.0 6.4 0.00014 33.3 1.5 29 257-286 32-60 (189)
61 TIGR01477 RIFIN variant surfac 56.9 5.3 0.00011 39.1 1.1 30 254-284 310-340 (353)
62 PF07495 Y_Y_Y: Y_Y_Y domain; 55.7 45 0.00099 23.5 5.8 55 75-133 8-66 (66)
63 PF07204 Orthoreo_P10: Orthore 54.3 5.2 0.00011 31.3 0.5 8 277-284 62-69 (98)
64 PF02038 ATP1G1_PLM_MAT8: ATP1 54.0 11 0.00025 25.9 2.1 9 261-269 22-30 (50)
65 KOG4482|consensus 52.1 17 0.00036 35.8 3.6 40 251-290 292-332 (449)
66 PF11980 DUF3481: Domain of un 50.1 12 0.00027 28.6 1.9 30 253-282 15-44 (87)
67 PF12191 stn_TNFRSF12A: Tumour 49.5 7.1 0.00015 32.3 0.6 13 272-284 97-109 (129)
68 PF13753 SWM_repeat: Putative 47.4 2.3E+02 0.0049 27.2 10.9 108 108-223 11-124 (317)
69 PF02158 Neuregulin: Neureguli 47.2 6.3 0.00014 38.8 0.0 29 257-285 9-38 (404)
70 PF15069 FAM163: FAM163 family 45.9 58 0.0013 27.7 5.5 7 350-356 93-99 (143)
71 PF06697 DUF1191: Protein of u 45.4 34 0.00073 32.5 4.5 11 48-58 33-43 (278)
72 PF01034 Syndecan: Syndecan do 45.2 7 0.00015 28.4 -0.0 18 269-286 27-44 (64)
73 PF13908 Shisa: Wnt and FGF in 44.7 33 0.00071 30.2 4.2 19 254-272 79-97 (179)
74 cd00146 PKD polycystic kidney 43.8 1.3E+02 0.0028 22.1 7.7 62 167-235 18-80 (81)
75 PF15050 SCIMP: SCIMP protein 43.3 14 0.00031 30.2 1.5 11 255-265 9-19 (133)
76 KOG3513|consensus 41.7 5.8E+02 0.013 29.1 19.9 132 86-236 470-613 (1051)
77 PF01102 Glycophorin_A: Glycop 40.3 16 0.00035 30.3 1.4 27 247-273 61-87 (122)
78 PF15234 LAT: Linker for activ 39.9 1.3E+02 0.0028 26.8 6.8 9 305-313 53-61 (230)
79 PF05454 DAG1: Dystroglycan (D 39.4 9.8 0.00021 36.4 0.0 11 254-264 144-154 (290)
80 PF04906 Tweety: Tweety; Inte 38.9 34 0.00074 34.5 3.8 31 255-285 20-52 (406)
81 PF05895 DUF859: Siphovirus pr 38.8 3.4E+02 0.0075 29.0 11.2 117 10-130 299-433 (624)
82 PF14610 DUF4448: Protein of u 38.5 94 0.002 27.6 6.2 17 255-271 159-175 (189)
83 PF05510 Sarcoglycan_2: Sarcog 38.3 22 0.00048 35.4 2.3 20 271-290 301-320 (386)
84 KOG4433|consensus 36.5 22 0.00049 36.1 2.0 31 253-283 42-74 (526)
85 PHA03290 envelope glycoprotein 36.4 46 0.00099 32.3 3.9 45 94-138 127-171 (357)
86 cd05774 Ig_CEACAM_D1 First imm 36.2 1.1E+02 0.0024 24.4 5.6 34 188-221 61-94 (105)
87 PF11857 DUF3377: Domain of un 35.8 37 0.00081 25.5 2.5 20 254-273 30-49 (74)
88 PF12245 Big_3_2: Bacterial Ig 35.3 1.2E+02 0.0025 21.5 5.1 30 108-137 22-52 (60)
89 KOG1226|consensus 35.2 68 0.0015 34.7 5.3 13 255-267 713-725 (783)
90 PF14991 MLANA: Protein melan- 34.8 10 0.00023 30.8 -0.5 10 274-283 42-51 (118)
91 PF15048 OSTbeta: Organic solu 32.9 54 0.0012 27.2 3.3 20 254-273 36-55 (125)
92 cd05741 Ig_CEACAM_D1_like Firs 32.8 1.1E+02 0.0024 22.8 5.0 34 187-220 47-80 (92)
93 PF03302 VSP: Giardia variant- 32.2 38 0.00082 34.0 2.9 27 256-282 370-396 (397)
94 PHA03283 envelope glycoprotein 31.9 34 0.00073 35.3 2.4 31 255-285 401-431 (542)
95 PF13754 Big_3_4: Bacterial Ig 31.0 1.8E+02 0.0039 20.0 6.0 27 208-234 22-49 (54)
96 TIGR00845 caca sodium/calcium 30.8 4.8E+02 0.01 29.4 11.0 51 30-83 516-568 (928)
97 KOG3488|consensus 30.1 53 0.0012 24.3 2.5 31 255-285 49-79 (81)
98 PF10365 DUF2436: Domain of un 29.9 2.5E+02 0.0054 23.9 6.7 82 39-121 66-156 (161)
99 PHA03286 envelope glycoprotein 29.2 53 0.0011 33.3 3.2 11 207-217 317-327 (492)
100 PF07213 DAP10: DAP10 membrane 29.1 71 0.0015 24.3 3.1 22 264-285 45-66 (79)
101 cd00146 PKD polycystic kidney 28.0 1.1E+02 0.0025 22.4 4.3 29 103-131 51-80 (81)
102 TIGR00864 PCC polycystin catio 27.8 1E+03 0.022 30.5 13.6 110 105-236 1480-1591(2740)
103 PF07204 Orthoreo_P10: Orthore 26.9 26 0.00057 27.5 0.5 25 257-281 45-69 (98)
104 PF14979 TMEM52: Transmembrane 26.5 74 0.0016 27.2 3.1 10 274-283 40-50 (154)
105 PHA03281 envelope glycoprotein 25.7 72 0.0016 33.1 3.5 24 110-134 312-335 (642)
106 TIGR03778 VPDSG_CTERM VPDSG-CT 25.1 81 0.0018 18.7 2.2 11 266-276 10-20 (26)
107 PF13965 SID-1_RNA_chan: dsRNA 24.8 3.5E+02 0.0075 28.7 8.5 24 193-217 58-81 (570)
108 PRK14081 triple tyrosine motif 24.8 9.1E+02 0.02 26.2 21.1 189 8-224 63-270 (667)
109 smart00089 PKD Repeats in poly 24.5 1.8E+02 0.0038 21.2 4.8 28 206-236 51-78 (79)
110 cd05762 Ig8_MLCK Eighth immuno 24.1 3.4E+02 0.0073 20.9 11.1 37 99-137 59-95 (98)
111 COG4288 Uncharacterized protei 23.5 1.2E+02 0.0026 24.5 3.7 47 3-59 52-98 (124)
112 PHA03283 envelope glycoprotein 23.4 4.2E+02 0.0091 27.6 8.3 38 251-288 394-431 (542)
113 PF00558 Vpu: Vpu protein; In 23.3 78 0.0017 24.3 2.5 6 278-283 29-34 (81)
114 KOG3637|consensus 23.0 1E+02 0.0022 35.1 4.4 23 257-279 980-1002(1030)
115 PF11395 DUF2873: Protein of u 22.9 80 0.0017 20.3 2.0 9 264-272 17-25 (43)
116 TIGR03660 T1SS_rpt_143 T1SS-14 22.9 2.7E+02 0.0058 23.6 5.9 44 10-59 86-129 (137)
117 cd05775 Ig_SLAM-CD84_like_N N- 22.7 1.9E+02 0.0041 22.3 4.8 31 190-220 53-84 (97)
118 smart00089 PKD Repeats in poly 22.6 1.9E+02 0.0042 21.0 4.6 31 102-132 48-78 (79)
119 PF13584 BatD: Oxygen toleranc 22.0 8.5E+02 0.018 24.8 14.3 16 178-193 339-354 (484)
120 PHA03265 envelope glycoprotein 21.7 41 0.0009 32.9 0.9 51 40-91 42-93 (402)
121 PHA03099 epidermal growth fact 21.5 1E+02 0.0022 25.8 2.9 6 278-283 122-127 (139)
122 PF05399 EVI2A: Ectropic viral 21.1 41 0.00088 30.5 0.7 14 262-275 142-155 (227)
123 PF08391 Ly49: Ly49-like prote 20.8 33 0.00071 28.4 0.0 22 255-276 6-27 (119)
124 PLN03150 hypothetical protein; 20.5 88 0.0019 33.4 3.2 12 256-267 547-558 (623)
125 PF15065 NCU-G1: Lysosomal tra 20.4 51 0.0011 32.5 1.2 12 108-119 128-139 (350)
126 PF02124 Marek_A: Marek's dise 20.3 5.3E+02 0.012 23.5 7.6 20 211-230 152-171 (211)
No 1
>KOG4289|consensus
Probab=100.00 E-value=2.3e-39 Score=337.76 Aligned_cols=241 Identities=29% Similarity=0.374 Sum_probs=225.3
Q ss_pred CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEE
Q psy4697 3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKY 79 (383)
Q Consensus 3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Y 79 (383)
.|||.++.+.|.|+|.|++.|.++++++|+|.|.|.|||.|+|+|.+|.-++.||.+.|+.|++|+|+|.|. +++.|
T Consensus 227 lDREt~e~HvlrVtA~d~~~P~~SAtttv~V~V~D~nDhsPvFEq~~Y~e~lREn~evGy~vLtvrAtD~Dsp~Nani~Y 306 (2531)
T KOG4289|consen 227 LDRETKETHVLRVTAQDHGDPRRSATTTVTVLVLDTNDHSPVFEQDEYREELRENLEVGYEVLTVRATDGDSPPNANIRY 306 (2531)
T ss_pred hhhhhhheeEEEEEeeecCCCcccceeEEEEEEeecCCCCcccchhHHHHHHhhccccCceEEEEEeccCCCCCCCceEE
Confidence 489999999999999999999999999999999999999999999999999999999999999999999996 89999
Q ss_pred EEec-CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCCcccCCceeEEEecc
Q psy4697 80 WLSN-DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDPRFRYPQYELFLPHI 154 (383)
Q Consensus 80 si~~-~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~ 154 (383)
++.+ +..+.|.||+ +|.|.+..+||||+...|++.|+|.|.| ...++.|.|+|+|+|||+|+|....|.+.|.
T Consensus 307 rl~eg~~~~~f~in~rSGvI~T~a~lDRE~~~~y~L~VeAsDqG~~pgp~Ta~V~itV~D~NDNaPqFse~~Yvvqv~-- 384 (2531)
T KOG4289|consen 307 RLLEGNAKNVFEINPRSGVISTRAPLDREELESYQLDVEASDQGRPPGPRTAMVEITVEDENDNAPQFSEKRYVVQVR-- 384 (2531)
T ss_pred EecCCCccceeEEcCccceeeccCccCHHhhhheEEEEEeccCCCCCCCceEEEEEEEEecCCCCccccccceEEEec--
Confidence 9998 4778999997 9999999999999999999999999976 3459999999999999999999999999999
Q ss_pred CCCCCCCCceEEEEEeeeCCCCC--eEEEEEe-CCCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEE
Q psy4697 155 PLADLTPGSVIGKVEAADGDKGD--RVTLSLR-GPYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTGNPPRQASV 230 (383)
Q Consensus 155 ~~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~-~~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~ 230 (383)
|+..+++.|.+|+|+|.|.|. .+.|+|. |+..+.|.|+. +|++.+..+.+++...|.|.|.|+|+|.|++++++
T Consensus 385 --Edvt~~avvlrV~AtDrD~g~Ng~VHYsi~Sgn~~G~f~id~~tGel~vv~plD~e~~~ytl~IrAqDggrPpLsn~s 462 (2531)
T KOG4289|consen 385 --EDVTPPAVVLRVTATDRDKGTNGKVHYSIASGNGRGQFYIDSLTGELDVVEPLDFENSEYTLRIRAQDGGRPPLSNTS 462 (2531)
T ss_pred --ccCCCCceEEEEEecccCCCcCceEEEEeeccCccccEEEecccceEEEeccccccCCeeEEEEEcccCCCCCccCCC
Confidence 999999999999999999986 6999997 77889999997 99999887655555599999999999999999999
Q ss_pred EEEEEEeCCcccCccccc
Q psy4697 231 PAIMHFPEAIVQQASSKL 248 (383)
Q Consensus 231 tv~I~v~~~~~~~~p~~~ 248 (383)
-|.|+| -+.|+++|.|.
T Consensus 463 gl~iqV-lDINDhaPifv 479 (2531)
T KOG4289|consen 463 GLVIQV-LDINDHAPIFV 479 (2531)
T ss_pred ceEEEE-EecCCCCceeE
Confidence 999999 77888888764
No 2
>KOG4289|consensus
Probab=100.00 E-value=2.1e-36 Score=315.72 Aligned_cols=243 Identities=23% Similarity=0.353 Sum_probs=222.5
Q ss_pred CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEE
Q psy4697 3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKY 79 (383)
Q Consensus 3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Y 79 (383)
.|||..+.|+|.|.|.|.|.++.-.++.|.|.|.|+|||+|+|....|.+.|.||..+++.|++|+|+|.|. +.|.|
T Consensus 332 lDRE~~~~y~L~VeAsDqG~~pgp~Ta~V~itV~D~NDNaPqFse~~Yvvqv~Edvt~~avvlrV~AtDrD~g~Ng~VHY 411 (2531)
T KOG4289|consen 332 LDREELESYQLDVEASDQGRPPGPRTAMVEITVEDENDNAPQFSEKRYVVQVREDVTPPAVVLRVTATDRDKGTNGKVHY 411 (2531)
T ss_pred cCHHhhhheEEEEEeccCCCCCCCceEEEEEEEEecCCCCccccccceEEEecccCCCCceEEEEEecccCCCcCceEEE
Confidence 489999999999999999999887899999999999999999999999999999999999999999999995 89999
Q ss_pred EEec-CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCCcccCCceeEEEecc
Q psy4697 80 WLSN-DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDPRFRYPQYELFLPHI 154 (383)
Q Consensus 80 si~~-~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~ 154 (383)
+|.+ +..+.|.||. +|+|.+..+||+|.. .|.+.|.|.|+| ++.+.-+.|+|+|+|||+|.|...++..++.
T Consensus 412 si~Sgn~~G~f~id~~tGel~vv~plD~e~~-~ytl~IrAqDggrPpLsn~sgl~iqVlDINDhaPifvstpfq~tvl-- 488 (2531)
T KOG4289|consen 412 SIASGNGRGQFYIDSLTGELDVVEPLDFENS-EYTLRIRAQDGGRPPLSNTSGLVIQVLDINDHAPIFVSTPFQATVL-- 488 (2531)
T ss_pred EeeccCccccEEEecccceEEEeccccccCC-eeEEEEEcccCCCCCccCCCceEEEEEecCCCCceeEechhhhhhh--
Confidence 9987 6677899996 999999999999998 999999999987 6777777899999999999999999989999
Q ss_pred CCCCCCCCceEEEEEeeeCCCCC--eEEEEEeCCCCCCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEE
Q psy4697 155 PLADLTPGSVIGKVEAADGDKGD--RVTLSLRGPYEKMFSIND-SGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASV 230 (383)
Q Consensus 155 ~~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~~~~~~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~ 230 (383)
|+.+.|..+..+.|.|.|+|+ .+.|++.|- +.|.|+. +|.|...+ ++++....|.|.|.|+|+|.|++++.+
T Consensus 489 --Env~lg~~v~~vqaidadsg~na~l~y~laG~--~pf~I~~~SG~Itvtk~ldrEt~~~ysl~V~ard~gtp~l~tst 564 (2531)
T KOG4289|consen 489 --ENVPLGYLVCHVQAIDADSGENARLHYSLAGV--GPFQINNGSGWITVTKELDRETVEHYSLGVEARDHGTPPLSTST 564 (2531)
T ss_pred --hcccccceEEEEecccCCCCcccceeeeeccC--CCeeEecCCceEEEeecccccccceEEEEEEEcCCCCCcccccc
Confidence 999999999999999999997 589998754 3899997 99999766 488889999999999999999999999
Q ss_pred EEEEEEeCCcccCcccccCCCCc
Q psy4697 231 PAIMHFPEAIVQQASSKLNSGTS 253 (383)
Q Consensus 231 tv~I~v~~~~~~~~p~~~~~~~~ 253 (383)
.|.|.+ .++|++.|.|....+.
T Consensus 565 sI~Vtv-~dvndndP~Ft~~eyt 586 (2531)
T KOG4289|consen 565 SISVTV-LDVNDNDPTFTQKEYT 586 (2531)
T ss_pred eEEEEe-cccCCCCCccccCceE
Confidence 999999 7888887777544333
No 3
>KOG1219|consensus
Probab=100.00 E-value=3.2e-33 Score=300.03 Aligned_cols=241 Identities=22% Similarity=0.289 Sum_probs=224.3
Q ss_pred CcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCC--CCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeE
Q psy4697 4 EDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLR--ELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVK 78 (383)
Q Consensus 4 DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn--~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~ 78 (383)
|.|+.+-|+|.|.|.|+|.|.+++..++.|.+.|+|+| ||.|..-.-+..|.||+|.|+.++.+.|.|.|. +.++
T Consensus 910 Df~k~~fynLsv~a~d~g~p~lss~chl~Vevldv~enlhpp~F~~~v~e~~V~EnapiGT~vi~i~A~dedsgldg~l~ 989 (4289)
T KOG1219|consen 910 DFEKSDFYNLSVTAVDRGTPILSSICHLEVEVLDVNENLHPPEFISFVTEGHVLENAPIGTIVIRIQARDEDSGLDGELS 989 (4289)
T ss_pred ccccccceEEEEEEecCCCcceeeeEEEEEEEeccCCCCCCcchheeeeeeeEeecCCcceEEEEEEEecCCCCccceEE
Confidence 56999999999999999999999999999999999888 999998888999999999999999999999996 8999
Q ss_pred EEEec-CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCCcccCCceeEEEec
Q psy4697 79 YWLSN-DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDPRFRYPQYELFLPH 153 (383)
Q Consensus 79 Ysi~~-~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~ 153 (383)
|+|.. +..+.|+||+ +|.|++.+.||||....|.|+|.|+|.| +++.+.|.|.|+|+|||+|+|..+.|..+|.
T Consensus 990 Y~I~~gdg~g~FsId~~tG~irTl~~lDrE~ks~YwltveA~D~gt~~~ssv~~vyI~ieDvNDn~Pq~s~pvy~asI~- 1068 (4289)
T KOG1219|consen 990 YKIRTGDGDGIFSIDSTTGSIRTLKALDREKKSSYWLTVEAKDLGTVPLSSVCEVYIEIEDVNDNVPQFSSPVYYASIS- 1068 (4289)
T ss_pred EEEEcCCcceeEEecCCcceEeechhhchhhcceEEEEEEEEecCCCccccceeEEEEEEecCCCCcccCCceEeeeec-
Confidence 99987 6677899995 9999999999999999999999999987 7889999999999999999999999999999
Q ss_pred cCCCCCCCCceEEEEEeeeCCCC--CeEEEEEe-CCCCCCEEEcC-CCcEEE-eccCCCCcceEEEEEEEeeCCCCCcee
Q psy4697 154 IPLADLTPGSVIGKVEAADGDKG--DRVTLSLR-GPYEKMFSIND-SGHISI-VDLSALNTSTIQLVVVATDTGNPPRQA 228 (383)
Q Consensus 154 ~~~e~~~~g~~v~~v~A~D~D~g--~~i~ysi~-~~~~~~F~i~~-tG~i~l-~~~~~~~~~~y~L~V~a~D~g~p~~ss 228 (383)
|+++.+..|.++.|.|+|+. .++.|.|. |+..++|.|++ +|-|.+ ++++++.+.++.|.|.++|.|.|++.+
T Consensus 1069 ---enSp~~vsivq~ea~D~Dsssn~kLmykI~sGnyq~FF~Id~~TG~iTt~r~LDRE~qdEHiLeVTi~D~gep~l~s 1145 (4289)
T KOG1219|consen 1069 ---ENSPETVSIVQAEANDPDSSSNQKLMYKITSGNYQGFFQIDPETGLITTIRRLDREKQDEHILEVTIQDNGEPWLCS 1145 (4289)
T ss_pred ---cCCCCceEEEEeccCCCCcccCcceEEEEccCCccceEEEccccceeeeehhhcccccccceEEEEEecCCCCcccc
Confidence 99999999999999999954 38999997 88999999998 999984 556999999999999999999999999
Q ss_pred EEEEEEEEeCCcccCcccccC
Q psy4697 229 SVPAIMHFPEAIVQQASSKLN 249 (383)
Q Consensus 229 t~tv~I~v~~~~~~~~p~~~~ 249 (383)
.+.|.|.| .+.|++.|.|..
T Consensus 1146 ~~rviV~I-ldvNdnsp~Flq 1165 (4289)
T KOG1219|consen 1146 NQRVIVSI-LDVNDNSPRFLQ 1165 (4289)
T ss_pred ceEEEEEE-eeccCCchhhhh
Confidence 99999999 778888777653
No 4
>KOG1219|consensus
Probab=100.00 E-value=5.2e-33 Score=298.41 Aligned_cols=236 Identities=26% Similarity=0.364 Sum_probs=208.6
Q ss_pred CcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCCCeeEEEEe-
Q psy4697 4 EDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRDLRVKYWLS- 82 (383)
Q Consensus 4 DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~~~v~Ysi~- 82 (383)
|++....|.|.|+|+|.|.|++.+.++|.|+|.+..++.|+|+.+.|.|+|+|+.+.|+.|++|+|.|.|. .+-|++.
T Consensus 2535 ~~~en~tl~l~vkA~D~g~P~~~s~ttV~v~vl~e~v~lPrFSep~y~fsvpEDv~vG~~Ig~v~a~~a~~-~~i~~~v~ 2613 (4289)
T KOG1219|consen 2535 DGLENSTLHLFVKAIDDGKPRRRSNTTVIVTVLPEDVNLPRFSEPIYTFSVPEDVPVGEEIGQVSASDADE-HVIYSLVL 2613 (4289)
T ss_pred hcccCcEEEEEEEeccCCCCCcccceEEEEEecCcccCcccccCceEEEeccccCCCCCeeeEEeecccCC-ceEEEEEe
Confidence 67888999999999999999999999999999999999999999999999999999999999999999885 4455553
Q ss_pred c-----CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC-ceeEEEEEEEEeecCCCCCcccCCceeEEEeccC
Q psy4697 83 N-----DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL-MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIP 155 (383)
Q Consensus 83 ~-----~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~-~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~ 155 (383)
+ +....|++|. +|.|.+.++||+|..++|++.|.|++++ ..+.++|.|.|.|+|||+|+|..+.|.+.+.
T Consensus 2614 ~gt~Esn~d~~Fsvdr~TG~i~v~ksLD~E~kk~yqi~v~a~~~~~vva~tsv~vqVkDvNDNaPvFe~d~y~f~i~--- 2690 (4289)
T KOG1219|consen 2614 GGTPESNPDLPFSVDRNTGMIKVNKSLDHEKKKSYQIKVKATCGQWVVAETSVFVQVKDVNDNAPVFEKDPYLFIIE--- 2690 (4289)
T ss_pred CCCCCCCCCCceEEcCCCceEEeccccchhhhceEEEEEEeecCCceEEEEEEEEEeecccCCCccccCCceeEEEe---
Confidence 3 3445699995 9999999999999999999999999987 4889999999999999999999999999999
Q ss_pred CCCCCCCceEEEEEeeeCCCCC--eEEEEEeCCCCCCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEE
Q psy4697 156 LADLTPGSVIGKVEAADGDKGD--RVTLSLRGPYEKMFSIND-SGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVP 231 (383)
Q Consensus 156 ~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~~~~~~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~t 231 (383)
|+.+.|+.|.+++|.|.|+|. +++|++... ..+|.|++ +|+|.+.. ++.+.+..|.|.|.|+|+|.|+. .++
T Consensus 2691 -En~pvGtsV~qf~AsD~Ds~~nGqirysl~~~-v~yF~In~etGwlTt~~eld~ek~d~y~lkv~AtDhG~~ss--q~~ 2766 (4289)
T KOG1219|consen 2691 -ENSPVGTSVIQFHASDMDSGNNGQIRYSLTSP-VPYFAINPETGWLTTLFELDLEKQDLYSLKVVATDHGVPSS--QAT 2766 (4289)
T ss_pred -ccCCCCceEEEEEeeccCCCCCceEEEEEcCC-cceEEEcCCCCeeeehhhhccccCCceEEEEEEecCCcccc--cce
Confidence 999999999999999999986 799999854 44999997 99998654 56677999999999999999854 455
Q ss_pred EEEEEeCCcccCccccc
Q psy4697 232 AIMHFPEAIVQQASSKL 248 (383)
Q Consensus 232 v~I~v~~~~~~~~p~~~ 248 (383)
+.|.| .+.|+.+|.|.
T Consensus 2767 v~v~v-tDvndspprf~ 2782 (4289)
T KOG1219|consen 2767 VLVHV-TDVNDSPPRFQ 2782 (4289)
T ss_pred EEEEE-EecCCCcchhh
Confidence 55555 56777766654
No 5
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein; exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.96 E-value=1.6e-27 Score=214.73 Aligned_cols=185 Identities=36% Similarity=0.537 Sum_probs=167.2
Q ss_pred cEEEEEECCCCCCcEEEEEEeEeCCC---CeeEEEEecCCC-CCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC-c
Q psy4697 49 EYSVSALENLPVNYVLLTVTTNKPRD---LRVKYWLSNDYG-ERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL-M 122 (383)
Q Consensus 49 ~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Ysi~~~~~-~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~-~ 122 (383)
.|.+.|.||.++|+.++++.|.|+|. +.++|+|.+... .+|.|++ +|.|++.+.||||....|.|.|+|.|.| .
T Consensus 1 ~~~~~i~En~~~g~~v~~~~a~D~D~~~~~~~~y~i~~~~~~~~F~i~~~tG~l~~~~~lD~e~~~~~~l~v~a~D~g~~ 80 (199)
T cd00031 1 SYSVSVPENAPPGTVVGTVSATDPDSGENGRVTYSILGGNEDGLFSIDPNTGVITTTKPLDREEQSEYTLTVVASDGGGP 80 (199)
T ss_pred CeEEEEeCCCCCCCEEEEEEEECCCCCCCceEEEEEeCCCCcccEEEeCCCCEEEECCCCCCcCCceEEEEEEEEECCcC
Confidence 47899999999999999999999997 579999998554 7999997 8999999999999999999999999954 3
Q ss_pred --eeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCC--CeEEEEEeCCCC-CCEEEcC-C
Q psy4697 123 --TTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKG--DRVTLSLRGPYE-KMFSIND-S 196 (383)
Q Consensus 123 --~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g--~~i~ysi~~~~~-~~F~i~~-t 196 (383)
++...++|.|.|+|||+|.|..+.|.+.+. |+.++|+.++++.|+|+|.+ ..++|+|.+... .+|.|++ +
T Consensus 81 ~~~~~~~v~I~V~d~Nd~~P~~~~~~~~~~v~----e~~~~~~~i~~~~a~D~D~~~~~~~~y~l~~~~~~~~f~i~~~~ 156 (199)
T cd00031 81 PLSSTATVTVTVLDVNDNPPVFEQSSYEASVP----ENAPPGTVVGTVTATDADSGENAKLTYSILSGNDKELFSIDPNT 156 (199)
T ss_pred cceeEEEEEEEEccCCCCCCcccccceEEEEe----CCCCCCCEEEEEEEEcCCCCCCccEEEEEeCCCCCCEEEEeCCc
Confidence 388999999999999999999888999999 99999999999999999986 589999986544 7999998 9
Q ss_pred CcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEEe
Q psy4697 197 GHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVPAIMHFP 237 (383)
Q Consensus 197 G~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v~ 237 (383)
|.|.+.. .+++....|.|.|.|+|.+.|.++++++++|.+.
T Consensus 157 G~i~~~~~ld~e~~~~~~l~v~a~D~~~~~~~~~~~i~i~v~ 198 (199)
T cd00031 157 GIITLAKPLDREEKSSYELTVVATDGGGPPLSSTATVTVTVL 198 (199)
T ss_pred eEEEeCCccCCccCceEEEEEEEEECCCCCceeEEEEEEEEE
Confidence 9998875 4677777999999999999999999999999884
No 6
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein; exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.85 E-value=9.7e-20 Score=163.90 Aligned_cols=132 Identities=30% Similarity=0.332 Sum_probs=123.1
Q ss_pred CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEE
Q psy4697 3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKY 79 (383)
Q Consensus 3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Y 79 (383)
.|||..+.|.|.|+|.|.|.|.+++...|.|.|.|+|||+|.|....|.+.|.|+.++|+.++++.|+|+|. +.++|
T Consensus 60 lD~e~~~~~~l~v~a~D~g~~~~~~~~~v~I~V~d~Nd~~P~~~~~~~~~~v~e~~~~~~~i~~~~a~D~D~~~~~~~~y 139 (199)
T cd00031 60 LDREEQSEYTLTVVASDGGGPPLSSTATVTVTVLDVNDNPPVFEQSSYEASVPENAPPGTVVGTVTATDADSGENAKLTY 139 (199)
T ss_pred CCCcCCceEEEEEEEEECCcCcceeEEEEEEEEccCCCCCCcccccceEEEEeCCCCCCCEEEEEEEEcCCCCCCccEEE
Confidence 488999999999999999888888999999999999999999999999999999999999999999999996 89999
Q ss_pred EEecCCC-CCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEee
Q psy4697 80 WLSNDYG-ERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVN 134 (383)
Q Consensus 80 si~~~~~-~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~D 134 (383)
+|.+... .+|.|+. +|.|++.+.||||....|.+.|.|+|.+ +++.++++|.|.|
T Consensus 140 ~l~~~~~~~~f~i~~~~G~i~~~~~ld~e~~~~~~l~v~a~D~~~~~~~~~~~i~i~v~d 199 (199)
T cd00031 140 SILSGNDKELFSIDPNTGIITLAKPLDREEKSSYELTVVATDGGGPPLSSTATVTVTVLD 199 (199)
T ss_pred EEeCCCCCCEEEEeCCceEEEeCCccCCccCceEEEEEEEEECCCCCceeEEEEEEEEEC
Confidence 9998654 7999997 9999999999999999999999999974 7888889988875
No 7
>PF00028 Cadherin: Cadherin domain; InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and that modulate a wide variety of processes, including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats and the fifth of which contains 4 conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and calcium binding is essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.63 E-value=7.1e-15 Score=116.52 Aligned_cols=84 Identities=40% Similarity=0.539 Sum_probs=77.6
Q ss_pred EEEEEECCCCCCcEEEEEEeEeCCC---CeeEEEEecCC-CCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecC-C--
Q psy4697 50 YSVSALENLPVNYVLLTVTTNKPRD---LRVKYWLSNDY-GERFSISR-QGDISLMQCLDYETEDSYRFTVYATDT-L-- 121 (383)
Q Consensus 50 y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Ysi~~~~-~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~-~-- 121 (383)
|+++|+||.++|+.++++.|.|+|. +.+.|+|.+.. ..+|.|++ +|.|++.+.||||..+.|.|.|.|+|. +
T Consensus 1 Y~~~v~E~~~~g~~v~~v~a~D~D~~~n~~i~y~i~~~~~~~~F~I~~~tg~i~~~~~LD~E~~~~y~l~v~a~D~~~~~ 80 (93)
T PF00028_consen 1 YSFSVPENAPPGTVVGQVTATDPDSGPNSQITYSILGGNPDGLFSIDPNTGEISLKKPLDRETQSSYQLTVRATDSGGSP 80 (93)
T ss_dssp EEEEEETTGSTSSEEEEEEEEESSTSTTSSEEEEEEETTSTTSEEEETTTTEEEESSSSCTTTTSEEEEEEEEEETTTSS
T ss_pred CEEEEECCCCCCCEEEEEEEEeCCCCCCceEEEEEecCcccCceEEeeeeeccccceecCcccCCEEEEEEEEEECCCCC
Confidence 8999999999999999999999994 89999999854 78999997 999999999999999999999999998 4
Q ss_pred -ceeEEEEEEEEe
Q psy4697 122 -MTTSATVNISVV 133 (383)
Q Consensus 122 -~~s~~tV~I~V~ 133 (383)
++++++|.|+|+
T Consensus 81 ~~~~~~~V~I~V~ 93 (93)
T PF00028_consen 81 PLSSTATVTINVL 93 (93)
T ss_dssp EEEEEEEEEEEEE
T ss_pred CCEEEEEEEEEEC
Confidence 778888888874
No 8
>KOG1834|consensus
Probab=99.60 E-value=4.2e-14 Score=141.33 Aligned_cols=196 Identities=22% Similarity=0.265 Sum_probs=151.9
Q ss_pred EEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-----Ce-eEEEEecCCCCCEEE---cC---cccEEEc
Q psy4697 33 VSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-----LR-VKYWLSNDYGERFSI---SR---QGDISLM 100 (383)
Q Consensus 33 V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-----~~-v~Ysi~~~~~~~F~I---d~---tG~I~~~ 100 (383)
.....+|-+.|. -...|+.-|.||.-.-...--+.|-|.|. |+ .-|.|-+.+ -.|.+ |. .|.|+++
T Consensus 21 ~~aarankhkpw-ie~ey~gvV~Endntvll~Ppl~aLdkdaplr~ageiC~fklhgq~-vPFdavVvdK~TGegvlRaK 98 (952)
T KOG1834|consen 21 HHAARANKHKPW-IEEEYHGVVTENDNTVLLDPPLAALDKDAPLRYAGEICGFKLHGQP-VPFDAVVVDKYTGEGVLRAK 98 (952)
T ss_pred cccccccccCcc-cccceeEEEEeCCceEEeCCCeeeecCCCCcccccccceeEecCCC-CCceEEEEeccCCceEEeec
Confidence 334466777774 56789999999964323333567778774 33 346665532 24654 53 5679999
Q ss_pred ccCCcccCcEEEEEEEEecCC---------ceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEee
Q psy4697 101 QCLDYETEDSYRFTVYATDTL---------MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAA 171 (383)
Q Consensus 101 ~~LD~E~~~~y~~~V~A~D~~---------~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~ 171 (383)
.+||.|.++.|+|+|+|-|-| .+..++|+|+|.|+|+++|+|..+.|.+.|. |. +.-..|+++.|.
T Consensus 99 ~~lDCelqkeytf~iQAydCg~gpdgtn~kKShkatvhIrVkDvNe~AP~f~ep~Yka~V~----EG-K~yd~il~veAi 173 (952)
T KOG1834|consen 99 EPLDCELQKEYTFTIQAYDCGNGPDGTNTKKSHKATVHIRVKDVNEFAPVFKEPWYKAHVT----EG-KVYDSILRVEAI 173 (952)
T ss_pred CcccccccccceEEEEEEecCCCCCccccccccceEEEEEeccccccCchhcccceeeEEe----cc-eeeeeeEEEEee
Confidence 999999999999999999832 4668999999999999999999999999998 54 345678899999
Q ss_pred eCCCCC----eEEEEEeCCCCCCEEEcCCCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697 172 DGDKGD----RVTLSLRGPYEKMFSINDSGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVPAIMHF 236 (383)
Q Consensus 172 D~D~g~----~i~ysi~~~~~~~F~i~~tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v 236 (383)
|.|-+. -..|.|. +.+-.|.||..|.|+... +.+-....|.|+|+|.|.|......-+.|+|.|
T Consensus 174 D~DCspq~sqIC~YEI~-t~d~PFaIdn~G~irnTekLny~ke~~Y~ltVtAyDCg~kraa~d~lV~v~V 242 (952)
T KOG1834|consen 174 DKDCSPQYSQICEYEIT-TPDVPFAIDNDGNIRNTEKLNYTKEHQYKLTVTAYDCGKKRAASDSLVTVHV 242 (952)
T ss_pred cCCCCCcccceeEEEec-CCCCceEEcCCCccccccccccccceeEEEEEEEEecccccccCcceEEEEe
Confidence 999764 4789997 466689999999998655 455567899999999999987766667888888
No 9
>PF00028 Cadherin: Cadherin domain; InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.47 E-value=1.3e-12 Score=103.51 Aligned_cols=87 Identities=36% Similarity=0.630 Sum_probs=78.3
Q ss_pred eeEEEeccCCCCCCCCceEEEEEeeeCCCCC--eEEEEEeC-CCCCCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeC
Q psy4697 147 YELFLPHIPLADLTPGSVIGKVEAADGDKGD--RVTLSLRG-PYEKMFSIND-SGHISIVD-LSALNTSTIQLVVVATDT 221 (383)
Q Consensus 147 ~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~~-~~~~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~ 221 (383)
|.+.++ |+.++|+.++++.|.|+|.+. .+.|+|.+ +..++|.|++ +|.|.+.+ ++++....|.|.|.|+|.
T Consensus 1 Y~~~v~----E~~~~g~~v~~v~a~D~D~~~n~~i~y~i~~~~~~~~F~I~~~tg~i~~~~~LD~E~~~~y~l~v~a~D~ 76 (93)
T PF00028_consen 1 YSFSVP----ENAPPGTVVGQVTATDPDSGPNSQITYSILGGNPDGLFSIDPNTGEISLKKPLDRETQSSYQLTVRATDS 76 (93)
T ss_dssp EEEEEE----TTGSTSSEEEEEEEEESSTSTTSSEEEEEEETTSTTSEEEETTTTEEEESSSSCTTTTSEEEEEEEEEET
T ss_pred CEEEEE----CCCCCCCEEEEEEEEeCCCCCCceEEEEEecCcccCceEEeeeeeccccceecCcccCCEEEEEEEEEEC
Confidence 678899 999999999999999999765 69999984 4478999998 99999876 488889999999999999
Q ss_pred -CCCCceeEEEEEEEEe
Q psy4697 222 -GNPPRQASVPAIMHFP 237 (383)
Q Consensus 222 -g~p~~sst~tv~I~v~ 237 (383)
|.|+++++++|+|+|+
T Consensus 77 ~~~~~~~~~~~V~I~V~ 93 (93)
T PF00028_consen 77 GGSPPLSSTATVTINVL 93 (93)
T ss_dssp TTSSEEEEEEEEEEEEE
T ss_pred CCCCCCEEEEEEEEEEC
Confidence 8999999999999984
No 10
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=99.43 E-value=7.7e-13 Score=101.49 Aligned_cols=70 Identities=34% Similarity=0.478 Sum_probs=62.9
Q ss_pred eCCC---CeeEEEEecCCC-CCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCC
Q psy4697 71 KPRD---LRVKYWLSNDYG-ERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDP 140 (383)
Q Consensus 71 D~D~---~~v~Ysi~~~~~-~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P 140 (383)
|+|. +.++|+|..... .+|.|++ +|.|++.++||||....|.|.|.|.|.+ +++.++|.|+|.|+|||+|
T Consensus 2 D~D~g~n~~i~Y~i~~~~~~~~F~i~~~tg~i~~~~~LD~e~~~~y~l~v~a~D~~~~~~~~~~~v~I~V~D~Nd~~P 79 (79)
T smart00112 2 DADSGENGKVTYSILSGNEDGLFSIDPETGEITTTKPLDREEQPEYTLTVEATDGGGPPLSSTATVTVTVLDVNDNAP 79 (79)
T ss_pred CCCCCcCcEEEEEEecCCCCCEEEEeCCccEEEeCCccCeeCCCeEEEEEEEEECCCCCcccEEEEEEEEEECCCCCC
Confidence 5554 789999987554 8999996 9999999999999999999999999976 7899999999999999998
No 11
>KOG1834|consensus
Probab=99.35 E-value=1.2e-11 Score=124.02 Aligned_cols=128 Identities=20% Similarity=0.239 Sum_probs=108.2
Q ss_pred CcCCCCeEEEEEEEEECCCCC------ceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---
Q psy4697 4 EDDFLQPITLVVRAIQYDNQD------RYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD--- 74 (383)
Q Consensus 4 DrE~~~~y~l~V~a~D~g~~~------~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~--- 74 (383)
|=|.+..|+++|+|.|-|..+ .+.-++|.|+|.|+|+.+|+|..+.|.+.|.|.- .-..|++|.|.|.|=
T Consensus 102 DCelqkeytf~iQAydCg~gpdgtn~kKShkatvhIrVkDvNe~AP~f~ep~Yka~V~EGK-~yd~il~veAiD~DCspq 180 (952)
T KOG1834|consen 102 DCELQKEYTFTIQAYDCGNGPDGTNTKKSHKATVHIRVKDVNEFAPVFKEPWYKAHVTEGK-VYDSILRVEAIDKDCSPQ 180 (952)
T ss_pred cccccccceEEEEEEecCCCCCccccccccceEEEEEeccccccCchhcccceeeEEecce-eeeeeEEEEeecCCCCCc
Confidence 568899999999999977654 4556889999999999999999999999999984 467899999999993
Q ss_pred --CeeEEEEecCCCCCEEEcCcccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEe
Q psy4697 75 --LRVKYWLSNDYGERFSISRQGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVV 133 (383)
Q Consensus 75 --~~v~Ysi~~~~~~~F~Id~tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~ 133 (383)
.-..|.|.. +.-.|.||+.|.|+.+.+|.|.....|.|+|.|-|-| ..+.+.|+|.|.
T Consensus 181 ~sqIC~YEI~t-~d~PFaIdn~G~irnTekLny~ke~~Y~ltVtAyDCg~kraa~d~lV~v~Vk 243 (952)
T KOG1834|consen 181 YSQICEYEITT-PDVPFAIDNDGNIRNTEKLNYTKEHQYKLTVTAYDCGKKRAASDSLVTVHVK 243 (952)
T ss_pred ccceeEEEecC-CCCceEEcCCCccccccccccccceeEEEEEEEEecccccccCcceEEEEec
Confidence 446788886 5557999999999999999999999999999999965 223356676664
No 12
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=98.94 E-value=5.6e-09 Score=79.86 Aligned_cols=74 Identities=30% Similarity=0.471 Sum_probs=61.2
Q ss_pred eeCCCCC--eEEEEEeCCCC-CCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEEeCCcccCcc
Q psy4697 171 ADGDKGD--RVTLSLRGPYE-KMFSIND-SGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVPAIMHFPEAIVQQAS 245 (383)
Q Consensus 171 ~D~D~g~--~i~ysi~~~~~-~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v~~~~~~~~p 245 (383)
+|+|.|. .+.|+|.++.. .+|.|++ +|.+.+.+ ++++....|.|.|.|+|.|.|+++++++|.|.| .+.|+++|
T Consensus 1 ~D~D~g~n~~i~Y~i~~~~~~~~F~i~~~tg~i~~~~~LD~e~~~~y~l~v~a~D~~~~~~~~~~~v~I~V-~D~Nd~~P 79 (79)
T smart00112 1 TDADSGENGKVTYSILSGNEDGLFSIDPETGEITTTKPLDREEQPEYTLTVEATDGGGPPLSSTATVTVTV-LDVNDNAP 79 (79)
T ss_pred CCCCCCcCcEEEEEEecCCCCCEEEEeCCccEEEeCCccCeeCCCeEEEEEEEEECCCCCcccEEEEEEEE-EECCCCCC
Confidence 4788874 69999985443 8999997 89887664 577778999999999999999999999999999 66666554
No 13
>PF08758 Cadherin_pro: Cadherin prodomain like; InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=97.80 E-value=0.00016 Score=56.77 Aligned_cols=86 Identities=23% Similarity=0.256 Sum_probs=46.8
Q ss_pred CCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-CeeEEEEecCCCCCEEEcCcccEEEcccCCcccCcEEEEEEEEec
Q psy4697 41 RELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-LRVKYWLSNDYGERFSISRQGDISLMQCLDYETEDSYRFTVYATD 119 (383)
Q Consensus 41 n~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-~~v~Ysi~~~~~~~F~Id~tG~I~~~~~LD~E~~~~y~~~V~A~D 119 (383)
+-|-|.+..|.+.|+.+...|..|++|.-.|-.. ..+.|.-. ++ .|.|.++|.|++++++.....+ -.|.|.|.|
T Consensus 2 C~pGF~~~~~~~~Vp~~l~~g~~lg~V~f~dC~~~~~~~~~ss-Dp--dF~V~~DGsVy~~r~v~l~~~~-~~F~V~a~D 77 (90)
T PF08758_consen 2 CRPGFSQKKYTFEVPSNLEAGQPLGKVNFEDCTGRRRVIFESS-DP--DFRVLEDGSVYAKRPVQLSSEQ-RSFTVHAWD 77 (90)
T ss_dssp ---B--S-EEEE----SS-SS--EEE---B--SS---EEEE----S--EEEEETTTEEEEES--S-SSS--EEEEEEEEE
T ss_pred CcCCcccceEEEEcCchhhCCcEEEEEEeccCCCCCceEEecC-CC--CEEEcCCCeEEEeeeEecCCCc-eEEEEEEEC
Confidence 4588999999999999999999999999988865 56888643 22 6999999999999999886543 479999999
Q ss_pred CCceeEEEEEE
Q psy4697 120 TLMTTSATVNI 130 (383)
Q Consensus 120 ~~~~s~~tV~I 130 (383)
.......++.|
T Consensus 78 ~~~~~~~~v~V 88 (90)
T PF08758_consen 78 SQTQEQKEVKV 88 (90)
T ss_dssp TTTTEEEEEEE
T ss_pred CCCCeEEEEEE
Confidence 75333344443
No 14
>PF08758 Cadherin_pro: Cadherin prodomain like; InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=97.40 E-value=0.0013 Score=51.69 Aligned_cols=80 Identities=23% Similarity=0.353 Sum_probs=45.5
Q ss_pred CCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCCCeEEEEEeCCCCCCEEEcCCCcEEEeccCCCCcceEEEEEEE
Q psy4697 139 DPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKGDRVTLSLRGPYEKMFSINDSGHISIVDLSALNTSTIQLVVVA 218 (383)
Q Consensus 139 ~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~~i~ysi~~~~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a 218 (383)
.|=|....|.+.|+ .+...|+.|++|.-.|-.....+.|.-. +..|.|.++|.+++.+.-.+....-.+.|.|
T Consensus 3 ~pGF~~~~~~~~Vp----~~l~~g~~lg~V~f~dC~~~~~~~~~ss---DpdF~V~~DGsVy~~r~v~l~~~~~~F~V~a 75 (90)
T PF08758_consen 3 RPGFSQKKYTFEVP----SNLEAGQPLGKVNFEDCTGRRRVIFESS---DPDFRVLEDGSVYAKRPVQLSSEQRSFTVHA 75 (90)
T ss_dssp --B--S-EEEE--------SS-SS--EEE---B--SS---EEEE------SEEEEETTTEEEEES--S-SSS-EEEEEEE
T ss_pred cCCcccceEEEEcC----chhhCCcEEEEEEeccCCCCCceEEecC---CCCEEEcCCCeEEEeeeEecCCCceEEEEEE
Confidence 36788888999999 8899999999999999965567888754 3389999999999888756666667899999
Q ss_pred eeCCCCC
Q psy4697 219 TDTGNPP 225 (383)
Q Consensus 219 ~D~g~p~ 225 (383)
+|.....
T Consensus 76 ~D~~~~~ 82 (90)
T PF08758_consen 76 WDSQTQE 82 (90)
T ss_dssp EETTTTE
T ss_pred ECCCCCe
Confidence 9987644
No 15
>PF08266 Cadherin_2: Cadherin-like; InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=97.36 E-value=0.00059 Score=52.90 Aligned_cols=57 Identities=16% Similarity=0.253 Sum_probs=37.9
Q ss_pred EEEEEECCCCCCcEEEEEEeEeCCC-----CeeEEEEec-CCCCCEEEcC-cccEEEcccCCccc
Q psy4697 50 YSVSALENLPVNYVLLTVTTNKPRD-----LRVKYWLSN-DYGERFSISR-QGDISLMQCLDYET 107 (383)
Q Consensus 50 y~~~V~En~~~gt~v~~v~A~D~D~-----~~v~Ysi~~-~~~~~F~Id~-tG~I~~~~~LD~E~ 107 (383)
..++|+|..++|+.||.+ |.|... ....|.+.. ....+|.++. +|.|++...+|||.
T Consensus 3 i~YsV~EE~~~Gt~IGni-a~dL~l~~~~l~~~~~ri~s~~~~~~~~v~~~tG~L~v~~rIDRE~ 66 (84)
T PF08266_consen 3 IRYSVPEEMPPGTVIGNI-AKDLGLDPQSLSSRNFRIVSEGNSQYFRVNEKTGDLFVSERIDREE 66 (84)
T ss_dssp EEEEEESS--TT-EEEEC-CCCCT--HHHHCCTTBEEE-SSSS-SEEE-TTTSEEEESS--SCCC
T ss_pred eEEEeecCCCCCCEEEEh-HHhhCCCcccccccceEEeecCCcceeEecCCceeEEeCCccCHHH
Confidence 357899999999999999 445432 334566555 4567999996 99999999999998
No 16
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=95.86 E-value=0.063 Score=42.84 Aligned_cols=77 Identities=23% Similarity=0.185 Sum_probs=51.7
Q ss_pred EEEEeEeCCC-CeeEEEEec--CCCCCEEEcCcccEEEc--------ccCCcccCcEEEEEEEEecCCceeEEEEEEEEe
Q psy4697 65 LTVTTNKPRD-LRVKYWLSN--DYGERFSISRQGDISLM--------QCLDYETEDSYRFTVYATDTLMTTSATVNISVV 133 (383)
Q Consensus 65 ~~v~A~D~D~-~~v~Ysi~~--~~~~~F~Id~tG~I~~~--------~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~ 133 (383)
|++.++|+|. ....++... ...+.|.|+.+|..... +.|.--+...-.|+|.+.|+ ...+|.|.|.
T Consensus 2 G~Lt~sD~D~gd~~~~s~~~~~g~yGtlti~~~G~wtYtl~n~~~avq~L~~Ge~~tdsFtvtv~DG---tt~~vtItI~ 78 (99)
T TIGR01965 2 GQLTISDADAGQAHFIAQTDAAGQYGTFSIDADGQWTYQADNSQTAVQALKAGETLTDTFTVTSADG---TSQTVTITIT 78 (99)
T ss_pred CceEEeCCCCCCceEEecccccCCcEEEEECCCCcEEEEeCCCcHHHHhhcCCCEEEEEEEEEEeCC---CeEEEEEEEE
Confidence 4678888887 345555533 23345888777765432 23333344567888889997 3889999999
Q ss_pred ecCCCCCcccCC
Q psy4697 134 NVNDWDPRFRYP 145 (383)
Q Consensus 134 DvNDn~P~f~~~ 145 (383)
..|| +|++...
T Consensus 79 GtND-apvi~~~ 89 (99)
T TIGR01965 79 GAND-AAVIGGA 89 (99)
T ss_pred ccCC-CCEEecc
Confidence 9999 8877543
No 17
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=95.75 E-value=0.13 Score=40.82 Aligned_cols=67 Identities=25% Similarity=0.251 Sum_probs=51.3
Q ss_pred eEeCCCCeeEEEEec----CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC-ceeEEEEEEEEeecCC
Q psy4697 69 TNKPRDLRVKYWLSN----DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL-MTTSATVNISVVNVND 137 (383)
Q Consensus 69 A~D~D~~~v~Ysi~~----~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~-~~s~~tV~I~V~DvND 137 (383)
..|+|...++|++.. .-..|...|+ ++.++= .+.+.+ ...|.+.|.|+|+. .++...+.|.|.+.||
T Consensus 24 F~d~d~~~lty~~~~~~~~~lP~Wl~fd~~~~~~~G-tP~~~~-~g~~~i~v~a~D~~g~~~~~~f~i~V~~~~~ 96 (97)
T smart00736 24 FTDADGDTLTYSATLSDGSALPSWLSFDSDTGTLSG-TPTNSD-VGSLSLKVTATDSSGASASDTFTITVVNTND 96 (97)
T ss_pred eECCCCCeEEEEEEeCCCCCCCCeEEEeCCCCEEEE-ECCCCC-CcEEEEEEEEEECCCCEEEEEEEEEEeCCCC
Confidence 467777889999864 2256899986 777665 344433 46799999999975 8888899999999987
No 18
>PF15102 TMEM154: TMEM154 protein family
Probab=94.47 E-value=0.053 Score=46.00 Aligned_cols=34 Identities=26% Similarity=0.444 Sum_probs=16.6
Q ss_pred eeeehhhHHHHHHHHHHHHHHhhheeecCCCCCC
Q psy4697 254 STVLIILGVVLIVLGFVIILLILYIHKNKHTKNN 287 (383)
Q Consensus 254 ~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~ 287 (383)
+++++++..++++||++++++.++++||+|.|..
T Consensus 57 fiLmIlIP~VLLvlLLl~vV~lv~~~kRkr~K~~ 90 (146)
T PF15102_consen 57 FILMILIPLVLLVLLLLSVVCLVIYYKRKRTKQE 90 (146)
T ss_pred eEEEEeHHHHHHHHHHHHHHHheeEEeecccCCC
Confidence 3556666655554443333334444555544443
No 19
>KOG0196|consensus
Probab=94.23 E-value=2.3 Score=45.71 Aligned_cols=110 Identities=18% Similarity=0.140 Sum_probs=55.5
Q ss_pred CCcccCcEEEEEEEEecCC------ceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCC
Q psy4697 103 LDYETEDSYRFTVYATDTL------MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKG 176 (383)
Q Consensus 103 LD~E~~~~y~~~V~A~D~~------~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g 176 (383)
-|.+....|+|.|+|.++- ....+.|.|.. |.-+|.- ...+.+. ......+-..-.--|++.|
T Consensus 403 ~~L~ah~~YTFeV~AvNgVS~lsp~~~~~a~vnItt---~qa~ps~---V~~~r~~-----~~~~~sitlsW~~p~~png 471 (996)
T KOG0196|consen 403 SDLLAHTNYTFEVEAVNGVSDLSPFPRQFASVNITT---NQAAPSP---VSVLRQV-----SRTSDSITLSWSEPDQPNG 471 (996)
T ss_pred eccccccccEEEEEEeecccccCCCCCcceeEEeec---cccCCCc---cceEEEe-----eeccCceEEecCCCCCCCC
Confidence 3556677899999999862 22344555544 3333321 1112111 1111222122233344445
Q ss_pred CeEEEEEe----C-CCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCCCCC
Q psy4697 177 DRVTLSLR----G-PYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTGNPP 225 (383)
Q Consensus 177 ~~i~ysi~----~-~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~ 225 (383)
..+.|.+. . +...+..+.. .-...+..+ .....|-+.|.|.+..+..
T Consensus 472 ~ildYEvky~ek~~~e~~~~~~~t~~~~~ti~gL--~p~t~YvfqVRarT~aG~G 524 (996)
T KOG0196|consen 472 VILDYEVKYYEKDEDERSYSTLKTKTTTATITGL--KPGTVYVFQVRARTAAGYG 524 (996)
T ss_pred cceeEEEEEeeccccccceeEEecccceEEeecc--CCCcEEEEEEEEecccCCC
Confidence 55677764 1 3344444543 323334443 3357899999999875543
No 20
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=94.21 E-value=0.77 Score=36.27 Aligned_cols=65 Identities=25% Similarity=0.286 Sum_probs=46.3
Q ss_pred EeeeCCCCCeEEEEEeC----CCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEEEe
Q psy4697 169 EAADGDKGDRVTLSLRG----PYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMHFP 237 (383)
Q Consensus 169 ~A~D~D~g~~i~ysi~~----~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v~ 237 (383)
...|.| +..++|++.. ....|...++ ++.+.-. +...+.+.|.+.|.|+|..+ .++...+.|.|.
T Consensus 23 tF~d~d-~~~lty~~~~~~~~~lP~Wl~fd~~~~~~~Gt-P~~~~~g~~~i~v~a~D~~g--~~~~~~f~i~V~ 92 (97)
T smart00736 23 TFTDAD-GDTLTYSATLSDGSALPSWLSFDSDTGTLSGT-PTNSDVGSLSLKVTATDSSG--ASASDTFTITVV 92 (97)
T ss_pred ceECCC-CCeEEEEEEeCCCCCCCCeEEEeCCCCEEEEE-CCCCCCcEEEEEEEEEECCC--CEEEEEEEEEEe
Confidence 356888 7789999862 2345777886 7777543 33344678999999999876 567777888773
No 21
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=92.25 E-value=1.2 Score=35.63 Aligned_cols=72 Identities=22% Similarity=0.264 Sum_probs=46.6
Q ss_pred EEEEeeeCCCCCeEEEEEe--CCCCCCEEEcCCCcEEEec-c-----CCC---CcceEEEEEEEeeCCCCCceeEEEEEE
Q psy4697 166 GKVEAADGDKGDRVTLSLR--GPYEKMFSINDSGHISIVD-L-----SAL---NTSTIQLVVVATDTGNPPRQASVPAIM 234 (383)
Q Consensus 166 ~~v~A~D~D~g~~i~ysi~--~~~~~~F~i~~tG~i~l~~-~-----~~~---~~~~y~L~V~a~D~g~p~~sst~tv~I 234 (383)
+++.++|+|.|+...++.. ....+.|.|+.+|.-.-.. . ..+ +...-.++|.+.|+ .+.+|.|
T Consensus 2 G~Lt~sD~D~gd~~~~s~~~~~g~yGtlti~~~G~wtYtl~n~~~avq~L~~Ge~~tdsFtvtv~DG------tt~~vtI 75 (99)
T TIGR01965 2 GQLTISDADAGQAHFIAQTDAAGQYGTFSIDADGQWTYQADNSQTAVQALKAGETLTDTFTVTSADG------TSQTVTI 75 (99)
T ss_pred CceEEeCCCCCCceEEecccccCCcEEEEECCCCcEEEEeCCCcHHHHhhcCCCEEEEEEEEEEeCC------CeEEEEE
Confidence 3578999999988888874 2345678998887543211 1 112 23345788889995 2677788
Q ss_pred EEeCCcccCc
Q psy4697 235 HFPEAIVQQA 244 (383)
Q Consensus 235 ~v~~~~~~~~ 244 (383)
+| .+.|+.+
T Consensus 76 tI-~GtNDap 84 (99)
T TIGR01965 76 TI-TGANDAA 84 (99)
T ss_pred EE-EccCCCC
Confidence 77 5555543
No 22
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.
Probab=91.71 E-value=0.13 Score=46.39 Aligned_cols=37 Identities=11% Similarity=0.265 Sum_probs=23.8
Q ss_pred CCCceeeehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697 250 SGTSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKN 286 (383)
Q Consensus 250 ~~~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~ 286 (383)
..+..++++++|+++.+.|+|++..++++||++..+.
T Consensus 34 ~d~~~I~iaiVAG~~tVILVI~i~v~vR~CRq~~~k~ 70 (221)
T PF08374_consen 34 KDYVKIMIAIVAGIMTVILVIFIVVLVRYCRQSPHKK 70 (221)
T ss_pred ccceeeeeeeecchhhhHHHHHHHHHHHHHhhccccc
Confidence 3566777888888877666666655565577554443
No 23
>KOG3597|consensus
Probab=90.13 E-value=13 Score=37.78 Aligned_cols=144 Identities=14% Similarity=0.130 Sum_probs=85.0
Q ss_pred EEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC--CeeEEEEecCCCC---------------CE
Q psy4697 27 ALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD--LRVKYWLSNDYGE---------------RF 89 (383)
Q Consensus 27 s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~--~~v~Ysi~~~~~~---------------~F 89 (383)
.+....|.|..+||+|..+-...+.+-+.|+...-.....+.+.|+|. ..+.|++....+. -|
T Consensus 24 ~~~~~~i~v~pvndpp~~~~~~~~~l~~~~~~~k~l~~~~l~~~d~d~~~~~l~f~v~~t~~~~~~~~~~~~~g~~~~~F 103 (442)
T KOG3597|consen 24 QTDVLRIHVNPVNDPPSLIFPSGSLLVILEGGQKVLDPELLTAADPDSAPLPLEFQVLGTSSVPLPVLKFDVPGAPATEF 103 (442)
T ss_pred EEeeecccccccCCCcceeecccceEEeecCCceeccceEeeccCCCCCccceEEEEccCCCCCCccceeeccCCcccce
Confidence 344567889999999888888888888898866555556788899987 7788888773322 23
Q ss_pred EEc--CcccEEEcccCCccc--CcEEEEEEEEecCCceeEEEEEEEEeecCCCCCcccCC-ceeEEEeccCCCCCCCCce
Q psy4697 90 SIS--RQGDISLMQCLDYET--EDSYRFTVYATDTLMTTSATVNISVVNVNDWDPRFRYP-QYELFLPHIPLADLTPGSV 164 (383)
Q Consensus 90 ~Id--~tG~I~~~~~LD~E~--~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn~P~f~~~-~~~~~v~~~~~e~~~~g~~ 164 (383)
+-. ..|.+. +++.. .....++..++|+-..+. . .+.-.-...|.+... ...+.+. ......
T Consensus 104 s~~~v~~g~~~----yvh~g~el~~~~~~~~~SDg~~~S~-~---~i~~~~~~~~~~~~~~~~gL~v~------~gS~~~ 169 (442)
T KOG3597|consen 104 SYEEVEDGSLS----YVHSGTELRESELQLRVSDGLLVSE-R---AILKVEATGPAPHLARNTGLKVL------QGSTAP 169 (442)
T ss_pred EehHhhcCcee----EEecCcccccceEEEEeecceEeee-e---EEecccCCCcceeeecccceEEc------cCcccc
Confidence 222 133332 23333 567788888999875555 1 111122223332221 1122222 112222
Q ss_pred E--EEEEeeeCCCC-C-eEEEEEe
Q psy4697 165 I--GKVEAADGDKG-D-RVTLSLR 184 (383)
Q Consensus 165 v--~~v~A~D~D~g-~-~i~ysi~ 184 (383)
| ..+.+.|.|++ + .+.|.|.
T Consensus 170 IT~~~L~ved~d~~~d~~v~~~i~ 193 (442)
T KOG3597|consen 170 ITPSNLSVEDNDSSPDDEVRYDIT 193 (442)
T ss_pred ccHhHceeecCCCCCCcEEEEEec
Confidence 3 24788888854 3 6888885
No 24
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=89.62 E-value=0.17 Score=41.91 Aligned_cols=19 Identities=32% Similarity=0.660 Sum_probs=12.1
Q ss_pred eeeehhhHHHHHHHHHHHH
Q psy4697 254 STVLIILGVVLIVLGFVII 272 (383)
Q Consensus 254 ~~li~~l~~i~~lL~l~~~ 272 (383)
.++.|++|+++++++++++
T Consensus 65 ~i~~Ii~gv~aGvIg~Ill 83 (122)
T PF01102_consen 65 AIIGIIFGVMAGVIGIILL 83 (122)
T ss_dssp CHHHHHHHHHHHHHHHHHH
T ss_pred ceeehhHHHHHHHHHHHHH
Confidence 4556777777777665443
No 25
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=88.96 E-value=0.18 Score=36.59 Aligned_cols=13 Identities=8% Similarity=0.107 Sum_probs=1.6
Q ss_pred eeecCCCCCCCCC
Q psy4697 278 IHKNKHTKNNGPP 290 (383)
Q Consensus 278 ~~r~~~~~~~~~~ 290 (383)
..|.+||+..+|.
T Consensus 33 iyR~rkkdEGSY~ 45 (64)
T PF01034_consen 33 IYRMRKKDEGSYD 45 (64)
T ss_dssp ----S------SS
T ss_pred HHHHHhcCCCCcc
Confidence 4555556655554
No 26
>PF08266 Cadherin_2: Cadherin-like; InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=88.55 E-value=1.2 Score=34.43 Aligned_cols=55 Identities=20% Similarity=0.420 Sum_probs=32.8
Q ss_pred eEEEeccCCCCCCCCceEEEEEeeeCCCCC----eEEEEEe-CCCCCCEEEcC-CCcEEEecc-CCC
Q psy4697 148 ELFLPHIPLADLTPGSVIGKVEAADGDKGD----RVTLSLR-GPYEKMFSIND-SGHISIVDL-SAL 207 (383)
Q Consensus 148 ~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~----~i~ysi~-~~~~~~F~i~~-tG~i~l~~~-~~~ 207 (383)
..+|+ |..++|+.|+.| |.|.-... ...|.+. .....+|.++. +|.+++... +++
T Consensus 4 ~YsV~----EE~~~Gt~IGni-a~dL~l~~~~l~~~~~ri~s~~~~~~~~v~~~tG~L~v~~rIDRE 65 (84)
T PF08266_consen 4 RYSVP----EEMPPGTVIGNI-AKDLGLDPQSLSSRNFRIVSEGNSQYFRVNEKTGDLFVSERIDRE 65 (84)
T ss_dssp EEEEE----SS--TT-EEEEC-CCCCT--HHHHCCTTBEEE-SSSS-SEEE-TTTSEEEESS--SCC
T ss_pred EEEee----cCCCCCCEEEEh-HHhhCCCcccccccceEEeecCCcceeEecCCceeEEeCCccCHH
Confidence 46788 899999999998 55553211 2345544 34567999997 999998754 444
No 27
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=88.09 E-value=4.5 Score=28.98 Aligned_cols=56 Identities=18% Similarity=0.193 Sum_probs=33.1
Q ss_pred eEEEEEeCCCCCCEEEcCCC-cEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697 178 RVTLSLRGPYEKMFSINDSG-HISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMHF 236 (383)
Q Consensus 178 ~i~ysi~~~~~~~F~i~~tG-~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v 236 (383)
...|.|.|....+..+.... .+.... +..+.|.|.|.|.|..........++.|.|
T Consensus 9 ~Y~Y~l~g~d~~W~~~~~~~~~~~~~~---L~~G~Y~l~V~a~~~~~~~~~~~~~l~i~I 65 (66)
T PF07495_consen 9 RYRYRLEGFDDEWITLGSYSNSISYTN---LPPGKYTLEVRAKDNNGKWSSDEKSLTITI 65 (66)
T ss_dssp EEEEEEETTESSEEEESSTS-EEEEES-----SEEEEEEEEEEETTS-B-SS-EEEEEEE
T ss_pred EEEEEEECCCCeEEECCCCcEEEEEEe---CCCEEEEEEEEEECCCCCcCcccEEEEEEE
Confidence 45666776555455555544 554433 567999999999997665444335666655
No 28
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=86.52 E-value=59 Score=36.23 Aligned_cols=138 Identities=15% Similarity=0.227 Sum_probs=71.6
Q ss_pred eCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEEEEecC---CCCCEEEcCcccEEEc----------
Q psy4697 37 GTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKYWLSND---YGERFSISRQGDISLM---------- 100 (383)
Q Consensus 37 DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Ysi~~~---~~~~F~Id~tG~I~~~---------- 100 (383)
+.||.++.|....-+..|.||. |+.-.+|.=...|. -.|.|+..+. .+.-|.- .+|.|.-.
T Consensus 394 ~~dd~~s~i~Fe~~~Y~V~En~--GtV~VtV~R~GGdl~~tVsVdY~T~DGTA~AG~DY~~-~sGTLtF~PGEt~KtItV 470 (928)
T TIGR00845 394 EENDPVSKIFFEPGHYTCLENC--GTVALTVVRRGGDLTNTVYVDYRTEDGTANAGSDYEF-TEGTLVFKPGETQKEFRI 470 (928)
T ss_pred cccCCcceEEecCCeEEEeecC--cEEEEEEEEccCCCCceEEEEEEccCCccCCCCCccc-cCceEEECCCceEEEEEE
Confidence 3577777776666677888985 77666665443332 5578887651 1111111 13333221
Q ss_pred ccC---CcccCcEEEEEEEEecCC----------------ceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCC
Q psy4697 101 QCL---DYETEDSYRFTVYATDTL----------------MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTP 161 (383)
Q Consensus 101 ~~L---D~E~~~~y~~~V~A~D~~----------------~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~ 161 (383)
..+ -+|....|.+.+.--..+ +....+.+|+|.| ||++|.|....-...+. |+.
T Consensus 471 ~IIDDdi~E~DE~F~V~LSNp~~g~~~G~~~~~~~~~~A~Lg~ps~ATVTIlD-DD~aGIfsFe~~~~sV~----Es~-- 543 (928)
T TIGR00845 471 GIIDDDIFEEDEHFYVRLSNLRVGSEDGILEANHVSAVAQLASPNTATVTILD-DDHAGIFTFEEDVFHVS----ESI-- 543 (928)
T ss_pred EEccCCCCCCCceEEEEEeCCCCCCcccccccccccccceecCCceEEEEEec-CcccCcccccCceEEEE----cCC--
Confidence 112 234455555555332111 1223456677777 77899877665566777 654
Q ss_pred CceEEEEEeeeCCCCC-eEEEEEe
Q psy4697 162 GSVIGKVEAADGDKGD-RVTLSLR 184 (383)
Q Consensus 162 g~~v~~v~A~D~D~g~-~i~ysi~ 184 (383)
|..-.+|.-+-.-.|. .+.|.-.
T Consensus 544 G~vtvtV~RtsGa~G~VtV~Y~T~ 567 (928)
T TIGR00845 544 GIMEVKVLRTSGARGTVIVPYRTV 567 (928)
T ss_pred CEEEEEEEEcCCCCeeEEEEEEee
Confidence 4443343333222233 5667654
No 29
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region, called CR1 (conserved region 1), is found three times in the Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=85.78 E-value=0.76 Score=29.69 Aligned_cols=9 Identities=22% Similarity=0.516 Sum_probs=3.6
Q ss_pred ehhhHHHHH
Q psy4697 257 LIILGVVLI 265 (383)
Q Consensus 257 i~~l~~i~~ 265 (383)
++++++++.
T Consensus 6 IaIIv~V~v 14 (38)
T PF02439_consen 6 IAIIVAVVV 14 (38)
T ss_pred hhHHHHHHH
Confidence 334444443
No 30
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=81.02 E-value=2 Score=44.92 Aligned_cols=35 Identities=14% Similarity=0.198 Sum_probs=19.7
Q ss_pred ceeeehhhHHHHHHHHHHHHHHhhheeecCCCCCC
Q psy4697 253 SSTVLIILGVVLIVLGFVIILLILYIHKNKHTKNN 287 (383)
Q Consensus 253 ~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~ 287 (383)
..+||+++.+-++++++|+++|++.+||++|-+..
T Consensus 268 NlWII~gVlvPv~vV~~Iiiil~~~LCRk~K~eFq 302 (684)
T PF12877_consen 268 NLWIIAGVLVPVLVVLLIIIILYWKLCRKNKLEFQ 302 (684)
T ss_pred CeEEEehHhHHHHHHHHHHHHHHHHHhcccccCCC
Confidence 34444444444444444555567778888877643
No 31
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=80.46 E-value=7.4 Score=26.69 Aligned_cols=37 Identities=27% Similarity=0.298 Sum_probs=27.2
Q ss_pred CCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCC
Q psy4697 186 PYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTG 222 (383)
Q Consensus 186 ~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g 222 (383)
....+..+|+ +|.|.-........+.|.+.|.|+|..
T Consensus 11 ~LP~gLs~d~~tG~isGtp~~~~~~G~y~~~vtatd~~ 48 (49)
T PF05345_consen 11 GLPSGLSLDPSTGTISGTPTSSVQPGTYTFTVTATDGS 48 (49)
T ss_pred CCCCcEEEeCCCCEEEeecCCCccccEEEEEEEEEcCC
Confidence 3455788987 999975544333457999999999964
No 32
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region, called CR1 (conserved region 1), is found three times in the Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=80.43 E-value=1.9 Score=27.91 Aligned_cols=25 Identities=4% Similarity=0.191 Sum_probs=14.7
Q ss_pred CceeeehhhHHHHHHHHHHHHHHhh
Q psy4697 252 TSSTVLIILGVVLIVLGFVIILLIL 276 (383)
Q Consensus 252 ~~~~li~~l~~i~~lL~l~~~~l~~ 276 (383)
.+..++.++.+.+.++.+.++..+.
T Consensus 4 s~IaIIv~V~vg~~iiii~~~~YaC 28 (38)
T PF02439_consen 4 STIAIIVAVVVGMAIIIICMFYYAC 28 (38)
T ss_pred chhhHHHHHHHHHHHHHHHHHHHHH
Confidence 3456666666666666665554333
No 33
>KOG1094|consensus
Probab=80.41 E-value=2.9 Score=43.65 Aligned_cols=23 Identities=17% Similarity=0.573 Sum_probs=12.6
Q ss_pred CCceeeehhhHHHHHHHHHHHHH
Q psy4697 251 GTSSTVLIILGVVLIVLGFVIIL 273 (383)
Q Consensus 251 ~~~~~li~~l~~i~~lL~l~~~~ 273 (383)
+.+.++++++.+|.+++++++++
T Consensus 388 ~~t~~~~~~f~~if~iva~ii~~ 410 (807)
T KOG1094|consen 388 SPTAILIIIFVAIFLIVALIIAL 410 (807)
T ss_pred CCceehHHHHHHHHHHHHHHHHH
Confidence 34455666666666555554443
No 34
>KOG4221|consensus
Probab=79.73 E-value=1.2e+02 Score=34.74 Aligned_cols=47 Identities=15% Similarity=0.144 Sum_probs=28.4
Q ss_pred EEEEEeCC-CCCCEEEcC-CCcEEEecc-CCCCcceEEEEEEEeeCCCCC
Q psy4697 179 VTLSLRGP-YEKMFSIND-SGHISIVDL-SALNTSTIQLVVVATDTGNPP 225 (383)
Q Consensus 179 i~ysi~~~-~~~~F~i~~-tG~i~l~~~-~~~~~~~y~L~V~a~D~g~p~ 225 (383)
+-|+..++ ...-|++.. .|....... .......|.+.|.|+-..+|.
T Consensus 959 i~Ys~~~n~~~~dWt~~t~~g~~L~~~v~~l~p~t~yffkiQAr~~kG~g 1008 (1381)
T KOG4221|consen 959 IYYSTDGNTPEHDWTIETTAGAELSHQVPNLDPDTGYFFKIQARNEKGPG 1008 (1381)
T ss_pred EEEecCCCCchhhceeeecccchhhhccCCCCCCCceEEEEEeeccCCCC
Confidence 44555433 445688876 565543332 333466799999998876554
No 35
>PF10577 UPF0560: Uncharacterised protein family UPF0560; InterPro: IPR018890 This family of proteins has no known function.
Probab=79.00 E-value=1.6 Score=46.88 Aligned_cols=31 Identities=19% Similarity=0.356 Sum_probs=18.0
Q ss_pred eeeehhhHHHHHHHHHHHHHHhhheeecCCC
Q psy4697 254 STVLIILGVVLIVLGFVIILLILYIHKNKHT 284 (383)
Q Consensus 254 ~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~ 284 (383)
..+.++||+.++++++++.+|.++|+|++.|
T Consensus 273 ~fLl~ILG~~~livl~lL~vLl~yCrrkc~~ 303 (807)
T PF10577_consen 273 VFLLAILGGTALIVLILLCVLLCYCRRKCLK 303 (807)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhcccCC
Confidence 5567778877776665555554444444433
No 36
>TIGR03660 T1SS_rpt_143 T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion.
Probab=78.40 E-value=41 Score=28.53 Aligned_cols=53 Identities=28% Similarity=0.331 Sum_probs=36.1
Q ss_pred cEEEcccCCccc---CcEEEEEEEEecC-CceeEEEEEEEEeecCCCCCcccCCceeEEEe
Q psy4697 96 DISLMQCLDYET---EDSYRFTVYATDT-LMTTSATVNISVVNVNDWDPRFRYPQYELFLP 152 (383)
Q Consensus 96 ~I~~~~~LD~E~---~~~y~~~V~A~D~-~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~ 152 (383)
...+.++||+.. .-...|.|.|+|. |-.++.++.|.|.| | .|+..... .+.|.
T Consensus 69 tftL~~~lDH~~g~d~l~l~~~v~a~D~DGD~s~~~l~VtI~D--D-~P~~~~~~-~~~V~ 125 (137)
T TIGR03660 69 EFTLEGPLDHAAGSDELTLNFPIIATDFDGDTSSITLPVTIVD--D-VPTITDVD-ALTVD 125 (137)
T ss_pred EEEEcccccCCCCCceEEEeeeEEEEeCCCCccccEEEEEEEC--C-CCeecccc-ceEEe
Confidence 455678888843 4467889999985 44455688888887 6 57765543 35666
No 37
>PF15347 PAG: Phosphoprotein associated with glycosphingolipid-enriched
Probab=77.94 E-value=2.9 Score=40.84 Aligned_cols=36 Identities=8% Similarity=0.056 Sum_probs=27.1
Q ss_pred CCceeeehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697 251 GTSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKN 286 (383)
Q Consensus 251 ~~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~ 286 (383)
...+++++.|+++..||++.+|+|.+.-|.|.||..
T Consensus 12 q~qivlwgsLaav~~f~lis~LifLCsSC~reKK~~ 47 (428)
T PF15347_consen 12 QVQIVLWGSLAAVTTFLLISFLIFLCSSCDREKKPK 47 (428)
T ss_pred ceeEEeehHHHHHHHHHHHHHHHHHhhcccccccCC
Confidence 445788899999998888777766666777776654
No 38
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens.
Probab=76.16 E-value=0.99 Score=43.40 Aligned_cols=30 Identities=37% Similarity=0.502 Sum_probs=13.5
Q ss_pred eeeehhhHHHHH-HHHHHHHHHhhheeecCCC
Q psy4697 254 STVLIILGVVLI-VLGFVIILLILYIHKNKHT 284 (383)
Q Consensus 254 ~~li~~l~~i~~-lL~l~~~~l~~~~~r~~~~ 284 (383)
.++++++.+|++ +|+++++.|++ |+||+||
T Consensus 256 t~I~aSiiaIliIVLIMvIIYLIL-RYRRKKK 286 (299)
T PF02009_consen 256 TAIIASIIAILIIVLIMVIIYLIL-RYRRKKK 286 (299)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH-HHHHHhh
Confidence 334444444443 44444444444 4455443
No 39
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=74.30 E-value=2.3 Score=36.44 Aligned_cols=11 Identities=0% Similarity=0.178 Sum_probs=4.7
Q ss_pred hhheeecCCCC
Q psy4697 275 ILYIHKNKHTK 285 (383)
Q Consensus 275 ~~~~~r~~~~~ 285 (383)
++++|+|+||.
T Consensus 70 vf~~c~r~kkt 80 (154)
T PF04478_consen 70 VFIFCIRRKKT 80 (154)
T ss_pred heeEEEecccC
Confidence 33344444443
No 40
>PF12273 RCR: Chitin synthesis regulation, resistance to Congo red; InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this interaction, which occurs via the WW domains of Rsp5.
Probab=73.37 E-value=3.1 Score=34.78 Aligned_cols=11 Identities=9% Similarity=0.422 Sum_probs=5.6
Q ss_pred heeecCCCCCC
Q psy4697 277 YIHKNKHTKNN 287 (383)
Q Consensus 277 ~~~r~~~~~~~ 287 (383)
++|+.+||+++
T Consensus 19 ~~~~~rRR~r~ 29 (130)
T PF12273_consen 19 FYCHNRRRRRR 29 (130)
T ss_pred HHHHHHHHhhc
Confidence 34555555544
No 41
>PF14575 EphA2_TM: Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=72.08 E-value=2.2 Score=32.25 Aligned_cols=27 Identities=19% Similarity=0.413 Sum_probs=14.7
Q ss_pred hhhHHHHHHHHHHHHHHhhheeecCCC
Q psy4697 258 IILGVVLIVLGFVIILLILYIHKNKHT 284 (383)
Q Consensus 258 ~~l~~i~~lL~l~~~~l~~~~~r~~~~ 284 (383)
++.+++.+++++++++++..+|+|+++
T Consensus 2 ii~~~~~g~~~ll~~v~~~~~~~rr~~ 28 (75)
T PF14575_consen 2 IIASIIVGVLLLLVLVIIVIVCFRRCK 28 (75)
T ss_dssp HHHHHHHHHHHHHHHHHHHHCCCTT--
T ss_pred EEehHHHHHHHHHHhheeEEEEEeeEc
Confidence 344555567777666555555555544
No 42
>PF13750 Big_3_3: Bacterial Ig-like domain (group 3)
Probab=71.98 E-value=66 Score=27.87 Aligned_cols=121 Identities=19% Similarity=0.199 Sum_probs=63.3
Q ss_pred CeEEEEE-EEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCC-CcEEEEEEeEeCCC--CeeEEEEecC
Q psy4697 9 QPITLVV-RAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPV-NYVLLTVTTNKPRD--LRVKYWLSND 84 (383)
Q Consensus 9 ~~y~l~V-~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~-gt~v~~v~A~D~D~--~~v~Ysi~~~ 84 (383)
-.|.+++ .|.|..+..........+.+ ...||...- .....+..+... |..=..+.++|... .-.+.++.+.
T Consensus 15 G~Y~l~~~~a~D~agN~~~~~~~~~~~i---D~T~Ptisi-~~~~~~~~g~~v~~~~~i~i~~tD~~~~~~i~sv~l~Gg 90 (158)
T PF13750_consen 15 GSYTLTVVTATDAAGNTSTSTVSETFTI---DNTPPTISI-SDGASVANGSTVYGLVNISINVTDNSDDSKITSVSLTGG 90 (158)
T ss_pred ccEEEEEEEEEecCCCEEEEEEeeEEEE---cCCCCEEEE-ecCCccCCCccccceeeeEEEEEeCCCCceEEEEEEECC
Confidence 5699999 79996544333333223433 344777644 111122222221 22223466666543 2334555542
Q ss_pred C-CCCEEE--cC--cccEEE--cccC-CcccCcEEEEEEEEecC-CceeEEEEEEEEe
Q psy4697 85 Y-GERFSI--SR--QGDISL--MQCL-DYETEDSYRFTVYATDT-LMTTSATVNISVV 133 (383)
Q Consensus 85 ~-~~~F~I--d~--tG~I~~--~~~L-D~E~~~~y~~~V~A~D~-~~~s~~tV~I~V~ 133 (383)
+ .....+ .. .|...+ .+.| ..|....|+++|.|.|. |..++.++.....
T Consensus 91 ~~~d~v~ls~~~~~~~~~~~~yp~~fpsle~~~~YtLtV~a~D~aGN~~~~si~F~y~ 148 (158)
T PF13750_consen 91 PASDSVSLSWTNKGNGVYTLEYPRIFPSLEADDSYTLTVSATDKAGNQSTKSISFSYM 148 (158)
T ss_pred cccceEEEeeEeccCceEEeecccccCCcCCCCeEEEEEEEEecCCCEEEEEEEEEEe
Confidence 2 222222 22 343322 1222 34778899999999995 6777777776654
No 43
>PF05393 Hum_adeno_E3A: Human adenovirus early E3A glycoprotein; InterPro: IPR008652 This family consists of several early glycoproteins (E3A), from human adenovirus type 2.; GO: 0016021 integral to membrane
Probab=70.57 E-value=3.3 Score=31.94 Aligned_cols=28 Identities=14% Similarity=0.196 Sum_probs=13.4
Q ss_pred hhHHHHHHHHHHHHHHhhheeecCCCCCC
Q psy4697 259 ILGVVLIVLGFVIILLILYIHKNKHTKNN 287 (383)
Q Consensus 259 ~l~~i~~lL~l~~~~l~~~~~r~~~~~~~ 287 (383)
....++++++++++ +.+.||+.|||.++
T Consensus 36 ~~lvI~~iFil~Vi-lwfvCC~kRkrsRr 63 (94)
T PF05393_consen 36 WFLVICGIFILLVI-LWFVCCKKRKRSRR 63 (94)
T ss_pred hHHHHHHHHHHHHH-HHHHHHHHhhhccC
Confidence 35555555544333 34445555554443
No 44
>PF06024 DUF912: Nucleopolyhedrovirus protein of unknown function (DUF912); InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=68.90 E-value=5 Score=32.11 Aligned_cols=30 Identities=17% Similarity=0.297 Sum_probs=11.9
Q ss_pred eehhhHHHHHHHHHHHHHHhhheeecCCCC
Q psy4697 256 VLIILGVVLIVLGFVIILLILYIHKNKHTK 285 (383)
Q Consensus 256 li~~l~~i~~lL~l~~~~l~~~~~r~~~~~ 285 (383)
+++++.+++++|+++.++....+.|.+++.
T Consensus 64 ili~lls~v~IlVily~IyYFVILRer~~~ 93 (101)
T PF06024_consen 64 ILISLLSFVCILVILYAIYYFVILRERQKS 93 (101)
T ss_pred hHHHHHHHHHHHHHHhhheEEEEEeccccc
Confidence 333333333333333333333345544443
No 45
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. Schematically, the domain order is: luminal domain 1, hinge, luminal domain 2, TM, C. In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin), a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge; a single lamp-like domain; a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region. In a lamp-family protein from nematodes, only the part C-terminal to the hinge is conserved.; GO: 0016020 membrane
Probab=68.82 E-value=3.6 Score=39.75 Aligned_cols=20 Identities=30% Similarity=0.433 Sum_probs=12.0
Q ss_pred ceeeehhhHHHHHHHHHHHH
Q psy4697 253 SSTVLIILGVVLIVLGFVII 272 (383)
Q Consensus 253 ~~~li~~l~~i~~lL~l~~~ 272 (383)
...+-|++|++|+.|++++|
T Consensus 270 ~~~vPIaVG~~La~lvlivL 289 (306)
T PF01299_consen 270 SDLVPIAVGAALAGLVLIVL 289 (306)
T ss_pred cchHHHHHHHHHHHHHHHHH
Confidence 45556667777766655443
No 46
>PF13753 SWM_repeat: Putative flagellar system-associated repeat
Probab=67.78 E-value=1.2e+02 Score=29.16 Aligned_cols=202 Identities=18% Similarity=0.140 Sum_probs=88.9
Q ss_pred CCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCC------CCcEEEEEEeEeCCC-CeeEEE
Q psy4697 8 LQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLP------VNYVLLTVTTNKPRD-LRVKYW 80 (383)
Q Consensus 8 ~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~------~gt~v~~v~A~D~D~-~~v~Ys 80 (383)
-..|.+.+.++|..+.... .+..+.|.-. +|...-. .+.++.. ......+..+++.+. ..+.+.
T Consensus 11 d~~~~v~vt~tD~aGN~~~--~t~~~~vDt~---~P~v~i~----~~~~~~~~~~~~~~~~~t~s~tvs~~~~g~~v~v~ 81 (317)
T PF13753_consen 11 DGTYTVSVTVTDAAGNTST--ATQSITVDTT---APTVTIT----SIADDDIINGDEATNTVTFSGTVSGAEPGSTVTVT 81 (317)
T ss_pred CCcEEEEEEEEeCCCCeee--eeEEEEEecC---CCceeee----cccCCCccccceeeeeeEEEEEecCCCCCCEEEEE
Confidence 4679999999996644333 3344443322 5533221 1111111 122233444433333 446555
Q ss_pred EecCCCCCEEEcCcccEEEcc-cCCcccCcEEEEEEE-EecC-CceeEE-EEEEEEeecCCCCCcccCCceeE-EEeccC
Q psy4697 81 LSNDYGERFSISRQGDISLMQ-CLDYETEDSYRFTVY-ATDT-LMTTSA-TVNISVVNVNDWDPRFRYPQYEL-FLPHIP 155 (383)
Q Consensus 81 i~~~~~~~F~Id~tG~I~~~~-~LD~E~~~~y~~~V~-A~D~-~~~s~~-tV~I~V~DvNDn~P~f~~~~~~~-~v~~~~ 155 (383)
+... ..-+..+.+|...+.- +-+.-....|.+.+. ++|. |..+.+ ...+.|-..--.+|.+.-....- .+.
T Consensus 82 ~~g~-~~t~~~~~~G~ws~t~~~~~~l~~g~~ti~v~~~tD~aGN~~t~~s~~~~vDt~~~~~p~vti~~~~~~~~~--- 157 (317)
T PF13753_consen 82 INGT-TGTLTADADGNWSVTVTPSDDLPDGDYTITVTTVTDAAGNTSTAASQTFTVDTTAPTAPTVTITGISDDNII--- 157 (317)
T ss_pred ECCE-EEEEEEecCCcEEEeeccccccccCcceeEEEEEEccCCccccccccccccccccccccccceecccCCcee---
Confidence 5221 1123334466533321 111223458888998 9995 454444 45553333311245543321000 011
Q ss_pred CCCCCCCceEEEEEeeeCCCCCeEEEEEeCCCCCCEEEcCCCcEE--Eec--cCCCCcceEEEEEEEeeCCC
Q psy4697 156 LADLTPGSVIGKVEAADGDKGDRVTLSLRGPYEKMFSINDSGHIS--IVD--LSALNTSTIQLVVVATDTGN 223 (383)
Q Consensus 156 ~e~~~~g~~v~~v~A~D~D~g~~i~ysi~~~~~~~F~i~~tG~i~--l~~--~~~~~~~~y~L~V~a~D~g~ 223 (383)
..........+.-...+.+.++.+...+.|... .+.....|... ... ......+.|.+.+.++|..+
T Consensus 158 ~~~~~~~t~t~sg~v~~~~~~d~v~vt~~G~~~-~~~~~~~g~~t~~~~~~~~~~~~d~~~~v~v~~tD~AG 228 (317)
T PF13753_consen 158 NGAESTVTVTFSGTVTGFDAGDTVTVTINGTTY-TTTVGADGTWTVTVTPSDLAGLADGTYTVTVTVTDAAG 228 (317)
T ss_pred eccceeecccccccceeeeeceeEEEeeccccc-ceeecCCCcccccccccccccccCceEEEEEEeeeccc
Confidence 000001111122222345555556666643332 44454444222 111 12244568999999999743
No 47
>PF15298 AJAP1_PANP_C: AJAP1/PANP C-terminus
Probab=66.73 E-value=14 Score=32.99 Aligned_cols=89 Identities=11% Similarity=0.243 Sum_probs=42.1
Q ss_pred ceeeehhhHHHHHHHHHHHHHHhhheeecCCCCCCCCCCC-CCCCCCCcccCc-cCCCCceeecccCceEeeccccCccc
Q psy4697 253 SSTVLIILGVVLIVLGFVIILLILYIHKNKHTKNNGPPGS-SHSKNDSFLSNV-ILPEKHVNVVAIPKIQENPVFNGSQE 330 (383)
Q Consensus 253 ~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~~~~~~-~~~~~~~~~~~~-~~p~~~~~~~~~~~i~~~p~~~~~~~ 330 (383)
..++-|.+..|+.|..|+.-+++..||-+..+.++.+... ..+.|.+.++-- -.|.|.-++ ..+|--|++
T Consensus 99 h~~iTITvSlImViaAliTtlvlK~C~~~s~~~r~~s~qr~~~qqeeS~Q~Ltd~~p~~~ps~--------~diftayn~ 170 (205)
T PF15298_consen 99 HQIITITVSLIMVIAALITTLVLKNCCAQSQNRRRNSHQRKINQQEESCQNLTDFTPARVPSS--------VDIFTAYND 170 (205)
T ss_pred eEEEEEeeehhHHHHHhhhhhhhhhhhhhhcccCCCccccccccchhhccccccCCcccCccc--------eeEecccCC
Confidence 3444455555554444444445556666665555444322 222222222211 122222222 446788888
Q ss_pred ccc---cccCCCCCCcccCCeec
Q psy4697 331 ELQ---TQRGGTNSSIYTATVKK 350 (383)
Q Consensus 331 ~~~---~~~~~~~s~i~~~~~~~ 350 (383)
.+| ++-. +--++|...+++
T Consensus 171 sl~cshecvr-~~~~~y~~e~~~ 192 (205)
T PF15298_consen 171 SLQCSHECVR-TSVPVYTDETLH 192 (205)
T ss_pred CCCCCccccc-CCCCcccccccC
Confidence 888 5544 455666544444
No 48
>PF05083 LST1: LST-1 protein; InterPro: IPR007775 B144/LST1 is a gene encoded in the human major histocompatibility complex that produces multiple forms of alternatively spliced mRNA and encodes peptides fewer than 100 amino acids in length. B144/LST1 is strongly expressed in dendritic cells. Transfection of B144/LST1 into a variety of cells induces morphologic changes including the production of long, thin filopodia. It may have a role in modulating immune responses. It induces morphological changes including production of filopodia and microspikes when overexpressed in a variety of cell types and may be involved in dendritic cell maturation. Isoform 1 and isoform 2 have an inhibitory effect on lymphocyte proliferation.; GO: 0000902 cell morphogenesis, 0006955 immune response, 0016020 membrane
Probab=65.94 E-value=2.7 Score=30.90 Aligned_cols=23 Identities=4% Similarity=0.013 Sum_probs=8.9
Q ss_pred eeecCCCCCCCCCCCCCCCCCCc
Q psy4697 278 IHKNKHTKNNGPPGSSHSKNDSF 300 (383)
Q Consensus 278 ~~r~~~~~~~~~~~~~~~~~~~~ 300 (383)
..||.++-.+......++.+..|
T Consensus 19 lsrRvkrLErs~~~~~~eQE~hy 41 (74)
T PF05083_consen 19 LSRRVKRLERSWEQLSSEQELHY 41 (74)
T ss_pred HHhhhhhcccchhccccccchHH
Confidence 34444433333332233344444
No 49
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=64.83 E-value=4.3 Score=32.21 Aligned_cols=24 Identities=29% Similarity=0.427 Sum_probs=10.5
Q ss_pred hhhHHHHHHHHHHHHHHhhheeec
Q psy4697 258 IILGVVLIVLGFVIILLILYIHKN 281 (383)
Q Consensus 258 ~~l~~i~~lL~l~~~~l~~~~~r~ 281 (383)
|++++++++.+|+.+++.++++|+
T Consensus 71 i~vg~~~~v~~lv~~l~w~f~~r~ 94 (96)
T PTZ00382 71 ISVAVVAVVGGLVGFLCWWFVCRG 94 (96)
T ss_pred EEeehhhHHHHHHHHHhheeEEee
Confidence 444444444444444334444443
No 50
>PF06365 CD34_antigen: CD34/Podocalyxin family; InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family also contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion.
Probab=64.47 E-value=21 Score=32.28 Aligned_cols=7 Identities=57% Similarity=0.605 Sum_probs=2.8
Q ss_pred Ccccccc
Q psy4697 327 GSQEELQ 333 (383)
Q Consensus 327 ~~~~~~~ 333 (383)
+.+.++|
T Consensus 158 ~~~~E~q 164 (202)
T PF06365_consen 158 ESQPEMQ 164 (202)
T ss_pred CCCcccc
Confidence 3334444
No 51
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=62.30 E-value=5.4 Score=37.72 Aligned_cols=14 Identities=29% Similarity=0.479 Sum_probs=7.3
Q ss_pred ecCcEEEEEECCCC
Q psy4697 46 SKNEYSVSALENLP 59 (383)
Q Consensus 46 ~~~~y~~~V~En~~ 59 (383)
....|++++-.|..
T Consensus 20 ~n~~yn~~li~n~t 33 (295)
T TIGR01478 20 HNKKYNVSYIQNNT 33 (295)
T ss_pred hccccceecccCcc
Confidence 44556665555443
No 52
>PF15330 SIT: SHP2-interacting transmembrane adaptor protein, SIT
Probab=62.04 E-value=7.5 Score=31.51 Aligned_cols=31 Identities=16% Similarity=0.147 Sum_probs=19.6
Q ss_pred hHHHHHHHHHHHHHHhhheeecCCCCCCCCC
Q psy4697 260 LGVVLIVLGFVIILLILYIHKNKHTKNNGPP 290 (383)
Q Consensus 260 l~~i~~lL~l~~~~l~~~~~r~~~~~~~~~~ 290 (383)
|-+++++|+++++++-+..||.+|++.+.+.
T Consensus 3 Ll~il~llLll~l~asl~~wr~~~rq~k~~~ 33 (107)
T PF15330_consen 3 LLGILALLLLLSLAASLLAWRMKQRQKKAGQ 33 (107)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhccccC
Confidence 3445555555555566788888877765555
No 53
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation.; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=61.76 E-value=2.6 Score=42.84 Aligned_cols=16 Identities=6% Similarity=-0.052 Sum_probs=5.8
Q ss_pred EEEEEEEEeecCCCCC
Q psy4697 125 SATVNISVVNVNDWDP 140 (383)
Q Consensus 125 ~~tV~I~V~DvNDn~P 140 (383)
++++.+.-.++|...+
T Consensus 182 ~~~i~W~~~~~~~~C~ 197 (439)
T PF02480_consen 182 SLEIDWYYMPTDPSCA 197 (439)
T ss_dssp EEEEEEEEE---TT-S
T ss_pred eEEEEEEEecCCCCCc
Confidence 3444555555554444
No 54
>PTZ00370 STEVOR; Provisional
Probab=61.56 E-value=5.7 Score=37.65 Aligned_cols=11 Identities=18% Similarity=0.332 Sum_probs=5.2
Q ss_pred cCCcccCcEEE
Q psy4697 102 CLDYETEDSYR 112 (383)
Q Consensus 102 ~LD~E~~~~y~ 112 (383)
.||+|..+.|+
T Consensus 65 ~~n~eaikkyq 75 (296)
T PTZ00370 65 KMNEEAIKKYQ 75 (296)
T ss_pred HHhHHHhhhhh
Confidence 35555544443
No 55
>PF13750 Big_3_3: Bacterial Ig-like domain (group 3)
Probab=59.94 E-value=1.2e+02 Score=26.33 Aligned_cols=121 Identities=22% Similarity=0.278 Sum_probs=63.1
Q ss_pred CcEEEEEE-EEecC-CceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCC-CceEEEEEeeeCCCCCe-EEEEE
Q psy4697 108 EDSYRFTV-YATDT-LMTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTP-GSVIGKVEAADGDKGDR-VTLSL 183 (383)
Q Consensus 108 ~~~y~~~V-~A~D~-~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~-g~~v~~v~A~D~D~g~~-i~ysi 183 (383)
...|.+.+ .|.|. |.....++...+. ++..+|.+.- .....+. ..... |..=..+.++|...+.. -..++
T Consensus 14 dG~Y~l~~~~a~D~agN~~~~~~~~~~~-iD~T~Ptisi-~~~~~~~----~g~~v~~~~~i~i~~tD~~~~~~i~sv~l 87 (158)
T PF13750_consen 14 DGSYTLTVVTATDAAGNTSTSTVSETFT-IDNTPPTISI-SDGASVA----NGSTVYGLVNISINVTDNSDDSKITSVSL 87 (158)
T ss_pred CccEEEEEEEEEecCCCEEEEEEeeEEE-EcCCCCEEEE-ecCCccC----CCccccceeeeEEEEEeCCCCceEEEEEE
Confidence 45799999 79995 4555555543333 2444776643 0001111 11111 11224577777765543 45667
Q ss_pred eCCC-CCCEEE--cC--CCcEEEe--c--cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697 184 RGPY-EKMFSI--ND--SGHISIV--D--LSALNTSTIQLVVVATDTGNPPRQASVPAIMHF 236 (383)
Q Consensus 184 ~~~~-~~~F~i--~~--tG~i~l~--~--~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v 236 (383)
.|+. +..-.+ .. .|...+. . +.-+....|.|+|.|.|..+ ..++..+....
T Consensus 88 ~Gg~~~d~v~ls~~~~~~~~~~~~yp~~fpsle~~~~YtLtV~a~D~aG--N~~~~si~F~y 147 (158)
T PF13750_consen 88 TGGPASDSVSLSWTNKGNGVYTLEYPRIFPSLEADDSYTLTVSATDKAG--NQSTKSISFSY 147 (158)
T ss_pred ECCcccceEEEeeEeccCceEEeecccccCCcCCCCeEEEEEEEEecCC--CEEEEEEEEEE
Confidence 6432 222222 22 3333222 1 12245789999999999855 35555555544
No 56
>KOG3597|consensus
Probab=59.93 E-value=94 Score=31.67 Aligned_cols=59 Identities=19% Similarity=0.062 Sum_probs=39.7
Q ss_pred eEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCCC-eEEEEEeCC
Q psy4697 124 TSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKGD-RVTLSLRGP 186 (383)
Q Consensus 124 s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~-~i~ysi~~~ 186 (383)
-+...+|.|.-+||++..+....+.+-+. +....-..-..+.+.|+|++. .+.|++.+.
T Consensus 24 ~~~~~~i~v~pvndpp~~~~~~~~~l~~~----~~~~k~l~~~~l~~~d~d~~~~~l~f~v~~t 83 (442)
T KOG3597|consen 24 QTDVLRIHVNPVNDPPSLIFPSGSLLVIL----EGGQKVLDPELLTAADPDSAPLPLEFQVLGT 83 (442)
T ss_pred EEeeecccccccCCCcceeecccceEEee----cCCceeccceEeeccCCCCCccceEEEEccC
Confidence 55677899999999655555444455555 444333333568999999875 788888643
No 57
>PF12768 Rax2: Cortical protein marker for cell polarity
Probab=59.01 E-value=15 Score=35.12 Aligned_cols=11 Identities=45% Similarity=0.510 Sum_probs=5.1
Q ss_pred eeehhhHHHHH
Q psy4697 255 TVLIILGVVLI 265 (383)
Q Consensus 255 ~li~~l~~i~~ 265 (383)
.+.|.||+.++
T Consensus 229 VVlIslAiALG 239 (281)
T PF12768_consen 229 VVLISLAIALG 239 (281)
T ss_pred EEEEehHHHHH
Confidence 33445555444
No 58
>PF15102 TMEM154: TMEM154 protein family
Probab=58.73 E-value=12 Score=31.89 Aligned_cols=35 Identities=14% Similarity=0.275 Sum_probs=21.7
Q ss_pred CceeeehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697 252 TSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKN 286 (383)
Q Consensus 252 ~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~ 286 (383)
..+++|-.+..+++||++++++...+|+|.|+...
T Consensus 58 iLmIlIP~VLLvlLLl~vV~lv~~~kRkr~K~~~s 92 (146)
T PF15102_consen 58 ILMILIPLVLLVLLLLSVVCLVIYYKRKRTKQEPS 92 (146)
T ss_pred EEEEeHHHHHHHHHHHHHHHheeEEeecccCCCCc
Confidence 56666664445555554456666777888888643
No 59
>PTZ00046 rifin; Provisional
Probab=57.20 E-value=4.9 Score=39.44 Aligned_cols=30 Identities=33% Similarity=0.491 Sum_probs=13.7
Q ss_pred eeeehhhHHHH-HHHHHHHHHHhhheeecCCC
Q psy4697 254 STVLIILGVVL-IVLGFVIILLILYIHKNKHT 284 (383)
Q Consensus 254 ~~li~~l~~i~-~lL~l~~~~l~~~~~r~~~~ 284 (383)
-++++++.+|+ ++|+.+++.|++ |+||+||
T Consensus 315 taIiaSiiAIvVIVLIMvIIYLIL-RYRRKKK 345 (358)
T PTZ00046 315 TAIIASIVAIVVIVLIMVIIYLIL-RYRRKKK 345 (358)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH-Hhhhcch
Confidence 34444443333 344445554555 4555544
No 60
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=57.01 E-value=6.4 Score=33.34 Aligned_cols=29 Identities=24% Similarity=0.395 Sum_probs=12.3
Q ss_pred ehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697 257 LIILGVVLIVLGFVIILLILYIHKNKHTKN 286 (383)
Q Consensus 257 i~~l~~i~~lL~l~~~~l~~~~~r~~~~~~ 286 (383)
+++|.+|.++.+ ++++++.+|.+||||..
T Consensus 32 ~tILiaIvVlii-iiivli~lcssRKkKaa 60 (189)
T PF05568_consen 32 YTILIAIVVLII-IIIVLIYLCSSRKKKAA 60 (189)
T ss_pred HHHHHHHHHHHH-HHHHHHHHHhhhhHHHH
Confidence 334443333332 33334444555555543
No 61
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=56.86 E-value=5.3 Score=39.12 Aligned_cols=30 Identities=37% Similarity=0.496 Sum_probs=13.8
Q ss_pred eeeehhhHHHH-HHHHHHHHHHhhheeecCCC
Q psy4697 254 STVLIILGVVL-IVLGFVIILLILYIHKNKHT 284 (383)
Q Consensus 254 ~~li~~l~~i~-~lL~l~~~~l~~~~~r~~~~ 284 (383)
-++++++.+|+ ++|+.+++.|++ |.||+||
T Consensus 310 t~IiaSiIAIvvIVLIMvIIYLIL-RYRRKKK 340 (353)
T TIGR01477 310 TPIIASIIAILIIVLIMVIIYLIL-RYRRKKK 340 (353)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH-Hhhhcch
Confidence 33444433333 345555555555 4555543
No 62
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=55.72 E-value=45 Score=23.55 Aligned_cols=55 Identities=20% Similarity=0.310 Sum_probs=31.9
Q ss_pred CeeEEEEecCCCCCEEEcCcc-cEEEcccCCcccCcEEEEEEEEecCC--c-eeEEEEEEEEe
Q psy4697 75 LRVKYWLSNDYGERFSISRQG-DISLMQCLDYETEDSYRFTVYATDTL--M-TTSATVNISVV 133 (383)
Q Consensus 75 ~~v~Ysi~~~~~~~F~Id~tG-~I~~~~~LD~E~~~~y~~~V~A~D~~--~-~s~~tV~I~V~ 133 (383)
-...|.|.+-...|..+.... .+... .-....|+|.|.|.|.. . ....++.|.|+
T Consensus 8 ~~Y~Y~l~g~d~~W~~~~~~~~~~~~~----~L~~G~Y~l~V~a~~~~~~~~~~~~~l~i~I~ 66 (66)
T PF07495_consen 8 IRYRYRLEGFDDEWITLGSYSNSISYT----NLPPGKYTLEVRAKDNNGKWSSDEKSLTITIL 66 (66)
T ss_dssp EEEEEEEETTESSEEEESSTS-EEEEE----S--SEEEEEEEEEEETTS-B-SS-EEEEEEEE
T ss_pred eEEEEEEECCCCeEEECCCCcEEEEEE----eCCCEEEEEEEEEECCCCCcCcccEEEEEEEC
Confidence 345667776556677766433 33221 12357999999999953 2 22266776663
No 63
>PF07204 Orthoreo_P10: Orthoreovirus membrane fusion protein p10; InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction [].
Probab=54.30 E-value=5.2 Score=31.28 Aligned_cols=8 Identities=25% Similarity=0.368 Sum_probs=3.8
Q ss_pred heeecCCC
Q psy4697 277 YIHKNKHT 284 (383)
Q Consensus 277 ~~~r~~~~ 284 (383)
++||.|+|
T Consensus 62 ~CC~~K~K 69 (98)
T PF07204_consen 62 CCCRAKHK 69 (98)
T ss_pred HHhhhhhh
Confidence 34555544
No 64
>PF02038 ATP1G1_PLM_MAT8: ATP1G1/PLM/MAT8 family; InterPro: IPR000272 The FXYD protein family contains at least seven members in mammals []. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable. Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=54.01 E-value=11 Score=25.95 Aligned_cols=9 Identities=44% Similarity=0.752 Sum_probs=3.6
Q ss_pred HHHHHHHHH
Q psy4697 261 GVVLIVLGF 269 (383)
Q Consensus 261 ~~i~~lL~l 269 (383)
|++++++.+
T Consensus 22 A~vlfi~Gi 30 (50)
T PF02038_consen 22 AGVLFILGI 30 (50)
T ss_dssp HHHHHHHHH
T ss_pred HHHHHHHHH
Confidence 334444443
No 65
>KOG4482|consensus
Probab=52.11 E-value=17 Score=35.85 Aligned_cols=40 Identities=18% Similarity=0.168 Sum_probs=22.1
Q ss_pred CCceeeehhhHHHHHH-HHHHHHHHhhheeecCCCCCCCCC
Q psy4697 251 GTSSTVLIILGVVLIV-LGFVIILLILYIHKNKHTKNNGPP 290 (383)
Q Consensus 251 ~~~~~li~~l~~i~~l-L~l~~~~l~~~~~r~~~~~~~~~~ 290 (383)
.+...+..++|..+++ +++++++.+++||||++.+.+-..
T Consensus 292 dyy~df~~tfaIpl~Valll~~~La~imc~rrEg~~~rd~~ 332 (449)
T KOG4482|consen 292 DYYGDFLHTFAIPLGVALLLVLALAYIMCCRREGQKKRDDK 332 (449)
T ss_pred hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccc
Confidence 4445555566665553 233344445668888876654433
No 66
>PF11980 DUF3481: Domain of unknown function (DUF3481); InterPro: IPR022579 This domain of unknown function is located in the C terminus of the eukaryotic neuropilin receptor family of proteins. It is found in association with PF00754 from PFAM, PF00431 from PFAM and PF00629 from PFAM. There are two completely conserved residues (Y and E) that may be functionally important.
Probab=50.07 E-value=12 Score=28.62 Aligned_cols=30 Identities=27% Similarity=0.373 Sum_probs=17.2
Q ss_pred ceeeehhhHHHHHHHHHHHHHHhhheeecC
Q psy4697 253 SSTVLIILGVVLIVLGFVIILLILYIHKNK 282 (383)
Q Consensus 253 ~~~li~~l~~i~~lL~l~~~~l~~~~~r~~ 282 (383)
.++-|++.++.+++|+.+.+.+++.++|.+
T Consensus 15 ~~yyiiA~gga~llL~~v~l~vvL~C~r~~ 44 (87)
T PF11980_consen 15 YWYYIIAMGGALLLLVAVCLGVVLYCHRFH 44 (87)
T ss_pred eeeHHHhhccHHHHHHHHHHHHHHhhhhhc
Confidence 445566666666666666655555444443
No 67
>PF12191 stn_TNFRSF12A: Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain; InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved []. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration []. The receptor may also play a role in cell-matrix interactions [].; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=49.47 E-value=7.1 Score=32.33 Aligned_cols=13 Identities=8% Similarity=0.120 Sum_probs=0.0
Q ss_pred HHHhhheeecCCC
Q psy4697 272 ILLILYIHKNKHT 284 (383)
Q Consensus 272 ~~l~~~~~r~~~~ 284 (383)
.++++++|||++|
T Consensus 97 g~lv~rrcrrr~~ 109 (129)
T PF12191_consen 97 GFLVWRRCRRREK 109 (129)
T ss_dssp -------------
T ss_pred HHHHHhhhhcccc
Confidence 3444555555543
No 68
>PF13753 SWM_repeat: Putative flagellar system-associated repeat
Probab=47.40 E-value=2.3e+02 Score=27.17 Aligned_cols=108 Identities=20% Similarity=0.302 Sum_probs=57.0
Q ss_pred CcEEEEEEEEecC-CceeEEEEEEEEeecCCCCCcccCCce--eEEEeccCCCCCCCCceEEEEEeeeCCCCCeEEEEEe
Q psy4697 108 EDSYRFTVYATDT-LMTTSATVNISVVNVNDWDPRFRYPQY--ELFLPHIPLADLTPGSVIGKVEAADGDKGDRVTLSLR 184 (383)
Q Consensus 108 ~~~y~~~V~A~D~-~~~s~~tV~I~V~DvNDn~P~f~~~~~--~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~~i~ysi~ 184 (383)
...|.+.+.++|. |..+..+..+.|--. +|....... ...+. .............+.+.+.|..+.+.+.
T Consensus 11 d~~~~v~vt~tD~aGN~~~~t~~~~vDt~---~P~v~i~~~~~~~~~~----~~~~~~~~t~s~tvs~~~~g~~v~v~~~ 83 (317)
T PF13753_consen 11 DGTYTVSVTVTDAAGNTSTATQSITVDTT---APTVTITSIADDDIIN----GDEATNTVTFSGTVSGAEPGSTVTVTIN 83 (317)
T ss_pred CCcEEEEEEEEeCCCCeeeeeEEEEEecC---CCceeeecccCCCccc----cceeeeeeEEEEEecCCCCCCEEEEEEC
Confidence 4678999999995 555555555543222 664332210 00000 0111222345566667777777777763
Q ss_pred CCCCCCEEEcCCCcEEEe-cc-CCCCcceEEEEEE-EeeCCC
Q psy4697 185 GPYEKMFSINDSGHISIV-DL-SALNTSTIQLVVV-ATDTGN 223 (383)
Q Consensus 185 ~~~~~~F~i~~tG~i~l~-~~-~~~~~~~y~L~V~-a~D~g~ 223 (383)
+ ....+..+.+|.-... .. ..+..+.|.+.+. ++|..+
T Consensus 84 g-~~~t~~~~~~G~ws~t~~~~~~l~~g~~ti~v~~~tD~aG 124 (317)
T PF13753_consen 84 G-TTGTLTADADGNWSVTVTPSDDLPDGDYTITVTTVTDAAG 124 (317)
T ss_pred C-EEEEEEEecCCcEEEeeccccccccCcceeEEEEEEccCC
Confidence 2 2223344445642111 11 2466778999999 999754
No 69
>PF02158 Neuregulin: Neuregulin family; InterPro: IPR002154 Neuregulins are a sub-family of EGF-like molecules that have been shown to play multiple essential roles in vertebrate embryogenesis including: cardiac development, Schwann cell and oligodendrocyte differentiation, some aspects of neuronal development, as well as the formation of neuromuscular synapses [, ]. Included in the family are heregulin; neu differentiation factor; acetylcholine receptor synthesis stimulator; glial growth factor; and sensory and motor-neuron derived factor []. Multiple family members are generated by alternate splicing or by use of several cell type-specific transcription initiation sites. In general, they bind to and activate the erbB family of receptor tyrosine kinases (erbB2 (HER2), erbB3 (HER3), and erbB4 (HER4)), functioning both as heterodimers and homodimers. The transmembrane forms of neuregulin 1 (NRG1) are present within synaptic vesicles, including those containing glutamate []. After exocytosis, NRG1 is in the presynaptic membrane, where the ectodomain of NRG1 may be cleaved off. The ectodomain then migrates across the synaptic cleft and binds to and activates a member of the EGF-receptor family on the postsynaptic membrane. This has been shown to increase the expression of certain glutamate-receptor subunits. NRG1 appears to signal for glutamate-receptor subunit expression, localisation, and/or phosphorylation facilitating subsequent glutamate transmission. The NRG1 gene has been identified as a potential gene determining susceptibility to schizophrenia by a combination of genetic linkage and association approaches []. ; GO: 0005102 receptor binding, 0009790 embryo development; PDB: 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=47.24 E-value=6.3 Score=38.80 Aligned_cols=29 Identities=17% Similarity=0.243 Sum_probs=0.0
Q ss_pred ehhhHHHHHHHHHHHHHHhh-heeecCCCC
Q psy4697 257 LIILGVVLIVLGFVIILLIL-YIHKNKHTK 285 (383)
Q Consensus 257 i~~l~~i~~lL~l~~~~l~~-~~~r~~~~~ 285 (383)
|..+.|||+-||++.+++++ +.||.||.+
T Consensus 9 VLTITgIcvaLlVVGi~Cvv~aYCKTKKQR 38 (404)
T PF02158_consen 9 VLTITGICVALLVVGIVCVVDAYCKTKKQR 38 (404)
T ss_dssp ------------------------------
T ss_pred hhhhhhhhHHHHHHHHHHHHHHHHHhHHHH
Confidence 55667777766666666566 667666653
No 70
>PF15069 FAM163: FAM163 family
Probab=45.86 E-value=58 Score=27.71 Aligned_cols=7 Identities=14% Similarity=0.112 Sum_probs=3.0
Q ss_pred cccCCCC
Q psy4697 350 KTLSGKP 356 (383)
Q Consensus 350 ~~~s~~~ 356 (383)
.|.||.+
T Consensus 93 ~CptCS~ 99 (143)
T PF15069_consen 93 YCPTCSP 99 (143)
T ss_pred cCCCCCC
Confidence 3444433
No 71
>PF06697 DUF1191: Protein of unknown function (DUF1191); InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=45.43 E-value=34 Score=32.55 Aligned_cols=11 Identities=45% Similarity=0.401 Sum_probs=6.2
Q ss_pred CcEEEEEECCC
Q psy4697 48 NEYSVSALENL 58 (383)
Q Consensus 48 ~~y~~~V~En~ 58 (383)
..|.+.++.|.
T Consensus 33 ~~y~~~LP~nl 43 (278)
T PF06697_consen 33 ILYNVSLPSNL 43 (278)
T ss_pred ceeeeecCCcc
Confidence 34666666554
No 72
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=45.20 E-value=7 Score=28.43 Aligned_cols=18 Identities=22% Similarity=0.453 Sum_probs=1.1
Q ss_pred HHHHHHhhheeecCCCCC
Q psy4697 269 FVIILLILYIHKNKHTKN 286 (383)
Q Consensus 269 l~~~~l~~~~~r~~~~~~ 286 (383)
+++++++.+++|+....-
T Consensus 27 lLIlf~iyR~rkkdEGSY 44 (64)
T PF01034_consen 27 LLILFLIYRMRKKDEGSY 44 (64)
T ss_dssp ----------S------S
T ss_pred HHHHHHHHHHHhcCCCCc
Confidence 334455667666665543
No 73
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=44.71 E-value=33 Score=30.21 Aligned_cols=19 Identities=5% Similarity=0.263 Sum_probs=8.3
Q ss_pred eeeehhhHHHHHHHHHHHH
Q psy4697 254 STVLIILGVVLIVLGFVII 272 (383)
Q Consensus 254 ~~li~~l~~i~~lL~l~~~ 272 (383)
++++.++.++++++++|++
T Consensus 79 ~iivgvi~~Vi~Iv~~Iv~ 97 (179)
T PF13908_consen 79 GIIVGVICGVIAIVVLIVC 97 (179)
T ss_pred eeeeehhhHHHHHHHhHhh
Confidence 3444444444444444333
No 74
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=43.82 E-value=1.3e+02 Score=22.08 Aligned_cols=62 Identities=18% Similarity=0.246 Sum_probs=33.0
Q ss_pred EEEeeeCCCCCeEEEEEe-CCCCCCEEEcCCCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEE
Q psy4697 167 KVEAADGDKGDRVTLSLR-GPYEKMFSINDSGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMH 235 (383)
Q Consensus 167 ~v~A~D~D~g~~i~ysi~-~~~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~ 235 (383)
++.+.+.+.+....|.+. ++.. ....+. ..........+.|.+++.|+|... .+.+.++.|.
T Consensus 18 ~~~~~~~~~~~~~~~~W~fgdg~----~~~~~~-~~~~~~y~~~G~y~v~l~v~d~~g--~~~~~~~~V~ 80 (81)
T cd00146 18 TFSASDSSGGSIVSYKWDFGDGE----VSSSGE-PTVTHTYTKPGTYTVTLTVTNAVG--SSSTKTTTVV 80 (81)
T ss_pred EEEEEeCCCCCEEEEEEEeCCCC----ccccCC-CceEEEcCCCcEEEEEEEEEeCCC--CEEEEEEEEE
Confidence 556666655556677664 3320 111110 111123456789999999999754 3334344443
No 75
>PF15050 SCIMP: SCIMP protein
Probab=43.33 E-value=14 Score=30.24 Aligned_cols=11 Identities=9% Similarity=0.607 Sum_probs=5.0
Q ss_pred eeehhhHHHHH
Q psy4697 255 TVLIILGVVLI 265 (383)
Q Consensus 255 ~li~~l~~i~~ 265 (383)
.+|.+++.|+.
T Consensus 9 WiiLAVaII~v 19 (133)
T PF15050_consen 9 WIILAVAIILV 19 (133)
T ss_pred HHHHHHHHHHH
Confidence 34444554443
No 76
>KOG3513|consensus
Probab=41.69 E-value=5.8e+02 Score=29.09 Aligned_cols=132 Identities=20% Similarity=0.263 Sum_probs=73.0
Q ss_pred CCCEEEcCcccEEEcccCCcccCcEEEEEEEEecCCceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceE
Q psy4697 86 GERFSISRQGDISLMQCLDYETEDSYRFTVYATDTLMTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVI 165 (383)
Q Consensus 86 ~~~F~Id~tG~I~~~~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v 165 (383)
.+.+.|.++|.|..... -++....|...+ .+.--.+..+..+.|.| ++.+..+.... ....|..+
T Consensus 470 ~~r~~i~edGtL~I~n~-t~~DaG~YtC~A--~N~~G~a~~~~~L~Vkd----~tri~~~P~~~--------~v~~g~~v 534 (1051)
T KOG3513|consen 470 SGRIRILEDGTLEISNV-TRSDAGKYTCVA--ENKLGKAESTGNLIVKD----ATRITLAPSNT--------DVKVGESV 534 (1051)
T ss_pred CceEEECCCCcEEeccc-CcccCcEEEEEE--EcccCccceEEEEEEec----CceEEeccchh--------hhccCceE
Confidence 44577777888766443 355667777664 44333455566666665 67776543322 33345544
Q ss_pred -EEEEeeeCCCCC--eEEEEEeC------CCCCCEEEc-C--CCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEE
Q psy4697 166 -GKVEAADGDKGD--RVTLSLRG------PYEKMFSIN-D--SGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAI 233 (383)
Q Consensus 166 -~~v~A~D~D~g~--~i~ysi~~------~~~~~F~i~-~--tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~ 233 (383)
++..+. .|.-. .|.+++.| ....+|.++ . +|.+.++.....+.+.|...+... --+.++.+.+.
T Consensus 535 ~l~Ce~s-hD~~ld~~f~W~~nG~~id~~~~~~~~~~~~~~~~g~L~i~nv~l~~~G~Y~C~aqT~---~Ds~s~~A~l~ 610 (1051)
T KOG3513|consen 535 TLTCEAS-HDPSLDITFTWKKNGRPIDFNPDGDHFEINDGSDSGRLTIANVSLEDSGKYTCVAQTA---LDSASARADLL 610 (1051)
T ss_pred EEEeecc-cCCCcceEEEEEECCEEhhccCCCCceEEeCCcCccceEEEeeccccCceEEEEEEEe---ecchhcccceE
Confidence 333333 13333 45444433 233456654 2 578888777778889998776542 12345555555
Q ss_pred EEE
Q psy4697 234 MHF 236 (383)
Q Consensus 234 I~v 236 (383)
|.-
T Consensus 611 V~g 613 (1051)
T KOG3513|consen 611 VRG 613 (1051)
T ss_pred Eec
Confidence 544
No 77
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=40.34 E-value=16 Score=30.33 Aligned_cols=27 Identities=4% Similarity=0.025 Sum_probs=15.7
Q ss_pred ccCCCCceeeehhhHHHHHHHHHHHHH
Q psy4697 247 KLNSGTSSTVLIILGVVLIVLGFVIIL 273 (383)
Q Consensus 247 ~~~~~~~~~li~~l~~i~~lL~l~~~~ 273 (383)
|......++++.++++++++.+++..+
T Consensus 61 fs~~~i~~Ii~gv~aGvIg~Illi~y~ 87 (122)
T PF01102_consen 61 FSEPAIIGIIFGVMAGVIGIILLISYC 87 (122)
T ss_dssp SS-TCHHHHHHHHHHHHHHHHHHHHHH
T ss_pred ccccceeehhHHHHHHHHHHHHHHHHH
Confidence 444456666676666666655555554
No 78
>PF15234 LAT: Linker for activation of T-cells
Probab=39.85 E-value=1.3e+02 Score=26.81 Aligned_cols=9 Identities=11% Similarity=-0.005 Sum_probs=4.2
Q ss_pred cCCCCceee
Q psy4697 305 ILPEKHVNV 313 (383)
Q Consensus 305 ~~p~~~~~~ 313 (383)
|+|+.-+.+
T Consensus 53 k~p~t~~pw 61 (230)
T PF15234_consen 53 KRPQTLAPW 61 (230)
T ss_pred cCCCCCCCC
Confidence 555544333
No 79
>PF05454 DAG1: Dystroglycan (Dystrophin-associated glycoprotein 1); InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear [].; PDB: 1EG4_P.
Probab=39.42 E-value=9.8 Score=36.43 Aligned_cols=11 Identities=9% Similarity=0.166 Sum_probs=0.0
Q ss_pred eeeehhhHHHH
Q psy4697 254 STVLIILGVVL 264 (383)
Q Consensus 254 ~~li~~l~~i~ 264 (383)
.++.-.+-+++
T Consensus 144 ~yL~T~IpaVV 154 (290)
T PF05454_consen 144 DYLHTFIPAVV 154 (290)
T ss_dssp -----------
T ss_pred chHHHHHHHHH
Confidence 33444443333
No 80
>PF04906 Tweety: Tweety; InterPro: IPR006990 None of the members of the tweety (tty) family have been functionally characterised. However, they are considered to be transmembrane proteins with five potential membrane-spanning regions. A number of potential functions have been suggested on the basis of homology to the yeast FTR1 and FTH1 iron transporter proteins and the mammalian neurotensin receptors 1 and 2 in that they have similar hydrophobicity profiles, although there is no detectable sequence homology to the tweety-related proteins. It has been proposed that the tweety-related proteins could be involved in transport of iron or other divalent cations or alternatively that they may be membrane-bound receptors [].
Probab=38.92 E-value=34 Score=34.47 Aligned_cols=31 Identities=16% Similarity=0.265 Sum_probs=13.5
Q ss_pred eeehhhHHHHHHHHH--HHHHHhhheeecCCCC
Q psy4697 255 TVLIILGVVLIVLGF--VIILLILYIHKNKHTK 285 (383)
Q Consensus 255 ~li~~l~~i~~lL~l--~~~~l~~~~~r~~~~~ 285 (383)
.++++++++++.|.+ +++.++.++|+|++++
T Consensus 20 ~~la~v~~~~l~l~Ll~ll~yl~~~CC~r~~~~ 52 (406)
T PF04906_consen 20 LILASVAAACLALSLLFLLIYLICRCCCRRPRE 52 (406)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHhhCCCCCc
Confidence 344444444443322 2333444556655444
No 81
>PF05895 DUF859: Siphovirus protein of unknown function (DUF859); InterPro: IPR008577 This entry is represented by Streptococcus phage 7201, Orf39. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This family consists of several uncharacterised proteins from a number of the Siphoviruses as well as some bacterial proteins from Streptococcus species. Some of the members of this family are described as putative minor structural proteins.
Probab=38.85 E-value=3.4e+02 Score=29.04 Aligned_cols=117 Identities=14% Similarity=0.111 Sum_probs=0.0
Q ss_pred eEEEEEEEEECCCCCceE-EEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-----CeeEEEEec
Q psy4697 10 PITLVVRAIQYDNQDRYA-LATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-----LRVKYWLSN 83 (383)
Q Consensus 10 ~y~l~V~a~D~g~~~~~s-~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-----~~v~Ysi~~ 83 (383)
.+++++.++| +.++.+ ..+..|.|.+- ++|.+....+...-.++...-..-+.+...--.+ -.++|+...
T Consensus 299 ~~Ti~atVtD--SRGr~S~~~~~tItVl~Y--~~P~lsfsv~R~~~~~~~~~v~~~a~Iapl~v~g~qKN~~~lt~~~a~ 374 (624)
T PF05895_consen 299 SATIRATVTD--SRGRTSDPKTKTITVLEY--SPPTLSFSVYRCGSSGNTLTVTRNAKIAPLTVNGVQKNTMTLTFKVAP 374 (624)
T ss_pred eEEEEEEEEE--CCCccCCceEEEEEEEEc--CCCcEEEEEEEeCCCCcEEEEEEEEEEeEEEEcccccceEEEEEEEEE
Q ss_pred CCCCCEEEc--Ccc--------cEEEcccC--CcccCcEEEEEEEEecCCceeEEEEEE
Q psy4697 84 DYGERFSIS--RQG--------DISLMQCL--DYETEDSYRFTVYATDTLMTTSATVNI 130 (383)
Q Consensus 84 ~~~~~F~Id--~tG--------~I~~~~~L--D~E~~~~y~~~V~A~D~~~~s~~tV~I 130 (383)
-....|.+| ..+ .......| .|...+.|.+.+..+|.-.+...+..|
T Consensus 375 ~gt~~~t~d~~~a~~~~s~~s~~~~~~~~L~g~y~~~kSy~V~~~l~D~F~s~t~~~~V 433 (624)
T PF05895_consen 375 LGTGTFTTDNGSASGTWSSISELTNSSANLGGTYDAEKSYDVRGTLSDKFTSTTFTVTV 433 (624)
T ss_pred cCcceEEEEccccccceeeeeeecccceeeccccCCCceEEEEEEEEEEeeeEEEEEEc
No 82
>PF14610 DUF4448: Protein of unknown function (DUF4448)
Probab=38.45 E-value=94 Score=27.59 Aligned_cols=17 Identities=35% Similarity=0.667 Sum_probs=7.9
Q ss_pred eeehhhHHHHHHHHHHH
Q psy4697 255 TVLIILGVVLIVLGFVI 271 (383)
Q Consensus 255 ~li~~l~~i~~lL~l~~ 271 (383)
++.|+|-.+++++++++
T Consensus 159 ~laI~lPvvv~~~~~~~ 175 (189)
T PF14610_consen 159 ALAIALPVVVVVLALIM 175 (189)
T ss_pred eEEEEccHHHHHHHHHH
Confidence 44555554444444433
No 83
>PF05510 Sarcoglycan_2: Sarcoglycan alpha/epsilon; InterPro: IPR008908 Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localisation of these proteins along the nonjunctional sarcolemma is not clear []. This family contains alpha and epsilon members.; GO: 0016012 sarcoglycan complex
Probab=38.35 E-value=22 Score=35.39 Aligned_cols=20 Identities=10% Similarity=0.235 Sum_probs=11.9
Q ss_pred HHHHhhheeecCCCCCCCCC
Q psy4697 271 IILLILYIHKNKHTKNNGPP 290 (383)
Q Consensus 271 ~~~l~~~~~r~~~~~~~~~~ 290 (383)
+++.+++||||++.+.+-.+
T Consensus 301 llLs~Imc~rREG~~~rd~~ 320 (386)
T PF05510_consen 301 LLLSYIMCCRREGVKKRDSK 320 (386)
T ss_pred HHHHHHheechHHhhcchhc
Confidence 33345668888766555444
No 84
>KOG4433|consensus
Probab=36.50 E-value=22 Score=36.13 Aligned_cols=31 Identities=19% Similarity=0.254 Sum_probs=17.1
Q ss_pred ceeeehhhHHHHHHHHHHHHH--HhhheeecCC
Q psy4697 253 SSTVLIILGVVLIVLGFVIIL--LILYIHKNKH 283 (383)
Q Consensus 253 ~~~li~~l~~i~~lL~l~~~~--l~~~~~r~~~ 283 (383)
...+++++++.+++|.++.++ +++++|+|++
T Consensus 42 aL~lla~l~aa~l~l~Ll~ll~yli~~cC~Rr~ 74 (526)
T KOG4433|consen 42 ALLLLAALAAACLGLSLLFLLFYLICRCCCRRE 74 (526)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHcCCC
Confidence 356677777777755544443 3333444443
No 85
>PHA03290 envelope glycoprotein I; Provisional
Probab=36.37 E-value=46 Score=32.34 Aligned_cols=45 Identities=13% Similarity=0.065 Sum_probs=28.4
Q ss_pred cccEEEcccCCcccCcEEEEEEEEecCCceeEEEEEEEEeecCCC
Q psy4697 94 QGDISLMQCLDYETEDSYRFTVYATDTLMTTSATVNISVVNVNDW 138 (383)
Q Consensus 94 tG~I~~~~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn 138 (383)
+|.+...+.--.|....|.+.|..-+....+--.+.+.|.+..+|
T Consensus 127 ~~vLL~I~~P~~~DSGiY~LRV~Ldga~~sDvF~lsv~Vyp~g~~ 171 (357)
T PHA03290 127 AEIIFKINKPGIEDAGIYLLLVQLDHSRLFDGFFLGLNVYPAGDH 171 (357)
T ss_pred cceEEEeCCCCcccCeeEEEEEEeCCCcccceEEEEEEEecCCCC
Confidence 565555555556667788888888665555555556666555443
No 86
>cd05774 Ig_CEACAM_D1 First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM). IG_CEACAM_D1: immunoglobulin (Ig)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions, it is a cell adhesion molecule, and a signaling molecule that regulates the growth of tumor cells, it is an angiogenic factor, and is a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surface. This family corresponds to the D
Probab=36.24 E-value=1.1e+02 Score=24.44 Aligned_cols=34 Identities=18% Similarity=0.213 Sum_probs=27.3
Q ss_pred CCCEEEcCCCcEEEeccCCCCcceEEEEEEEeeC
Q psy4697 188 EKMFSINDSGHISIVDLSALNTSTIQLVVVATDT 221 (383)
Q Consensus 188 ~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D~ 221 (383)
.+...+.++|.|.+......+.+.|.+.+...+.
T Consensus 61 ~gR~~~~~ngSL~I~~v~~~D~G~Y~~~v~~~~~ 94 (105)
T cd05774 61 SGRETIYPNGSLLIQNVTQKDTGFYTLQTITTNF 94 (105)
T ss_pred CCcEEEeCCCcEEEecCCcccCEEEEEEEEeCCc
Confidence 4456677789999998888999999998876653
No 87
>PF11857 DUF3377: Domain of unknown function (DUF3377); InterPro: IPR021805 This domain is functionally uncharacterised and found at the C terminus of peptidases belonging to MEROPS peptidase family M10A, membrane-type matrix metallopeptidases (clan MA). ; GO: 0004222 metalloendopeptidase activity
Probab=35.85 E-value=37 Score=25.46 Aligned_cols=20 Identities=25% Similarity=0.519 Sum_probs=10.0
Q ss_pred eeeehhhHHHHHHHHHHHHH
Q psy4697 254 STVLIILGVVLIVLGFVIIL 273 (383)
Q Consensus 254 ~~li~~l~~i~~lL~l~~~~ 273 (383)
.++++.+-+++++.+++++.
T Consensus 30 ~avaVviPl~L~LCiLvl~y 49 (74)
T PF11857_consen 30 NAVAVVIPLVLLLCILVLIY 49 (74)
T ss_pred eEEEEeHHHHHHHHHHHHHH
Confidence 34445555555555554443
No 88
>PF12245 Big_3_2: Bacterial Ig-like domain (group 3); InterPro: IPR022038 This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT.
Probab=35.32 E-value=1.2e+02 Score=21.54 Aligned_cols=30 Identities=37% Similarity=0.444 Sum_probs=22.8
Q ss_pred CcEEEEEEEEecC-CceeEEEEEEEEeecCC
Q psy4697 108 EDSYRFTVYATDT-LMTTSATVNISVVNVND 137 (383)
Q Consensus 108 ~~~y~~~V~A~D~-~~~s~~tV~I~V~DvND 137 (383)
...|.+.+.|.|. |..+.......+.|..-
T Consensus 22 dg~yt~~v~a~D~AGN~~~~~~~~~i~d~~~ 52 (60)
T PF12245_consen 22 DGEYTLTVTATDKAGNTSSSTTQIVIVDNTA 52 (60)
T ss_pred CccEEEEEEEEECCCCEEEeeeEEEEEcCCC
Confidence 5689999999995 67777777777776553
No 89
>KOG1226|consensus
Probab=35.17 E-value=68 Score=34.69 Aligned_cols=13 Identities=38% Similarity=0.608 Sum_probs=5.9
Q ss_pred eeehhhHHHHHHH
Q psy4697 255 TVLIILGVVLIVL 267 (383)
Q Consensus 255 ~li~~l~~i~~lL 267 (383)
++.|.|+.+++++
T Consensus 713 ~~~i~lgvv~~iv 725 (783)
T KOG1226|consen 713 ILAIVLGVVAGIV 725 (783)
T ss_pred EeeehHHHHHHHH
Confidence 4444554444433
No 90
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=34.79 E-value=10 Score=30.77 Aligned_cols=10 Identities=10% Similarity=0.165 Sum_probs=0.0
Q ss_pred HhhheeecCC
Q psy4697 274 LILYIHKNKH 283 (383)
Q Consensus 274 l~~~~~r~~~ 283 (383)
+.++.|||+.
T Consensus 42 iGCWYckRRS 51 (118)
T PF14991_consen 42 IGCWYCKRRS 51 (118)
T ss_dssp ----------
T ss_pred Hhheeeeecc
Confidence 4555666654
No 91
>PF15048 OSTbeta: Organic solute transporter subunit beta protein
Probab=32.85 E-value=54 Score=27.19 Aligned_cols=20 Identities=20% Similarity=0.504 Sum_probs=10.8
Q ss_pred eeeehhhHHHHHHHHHHHHH
Q psy4697 254 STVLIILGVVLIVLGFVIIL 273 (383)
Q Consensus 254 ~~li~~l~~i~~lL~l~~~~ 273 (383)
-+-+.+|+++++++.++++.
T Consensus 36 NysiL~Ls~vvlvi~~~LLg 55 (125)
T PF15048_consen 36 NYSILALSFVVLVISFFLLG 55 (125)
T ss_pred chHHHHHHHHHHHHHHHHHH
Confidence 33455666666655554443
No 92
>cd05741 Ig_CEACAM_D1_like First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) and similar proteins. Ig_CEACAM_D1_like : immunoglobulin (IG)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily-like. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions, it is a cell adhesion molecule, and a signaling molecule that regulates the growth of tumor cells, it is an angiogenic factor, and is a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surf
Probab=32.80 E-value=1.1e+02 Score=22.85 Aligned_cols=34 Identities=24% Similarity=0.394 Sum_probs=27.4
Q ss_pred CCCCEEEcCCCcEEEeccCCCCcceEEEEEEEee
Q psy4697 187 YEKMFSINDSGHISIVDLSALNTSTIQLVVVATD 220 (383)
Q Consensus 187 ~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D 220 (383)
..+.+.++.+|.|.+......+.+.|.+.|.-..
T Consensus 47 ~~~R~~~~~~~sL~I~~l~~~DsG~Y~c~v~~~~ 80 (92)
T cd05741 47 YSGRETIYPNGSLLIQNLTKEDSGTYTLQIISTN 80 (92)
T ss_pred cCCeEEEcCCceEEEccCCchhcEEEEEEEEcCC
Confidence 3456777777999998888899999999887665
No 93
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=32.16 E-value=38 Score=34.03 Aligned_cols=27 Identities=33% Similarity=0.415 Sum_probs=18.6
Q ss_pred eehhhHHHHHHHHHHHHHHhhheeecC
Q psy4697 256 VLIILGVVLIVLGFVIILLILYIHKNK 282 (383)
Q Consensus 256 li~~l~~i~~lL~l~~~~l~~~~~r~~ 282 (383)
.-|++|+|+++..||-+|.+|++||.|
T Consensus 370 aGIsvavvvvVgglvGfLcWwf~crgk 396 (397)
T PF03302_consen 370 AGISVAVVVVVGGLVGFLCWWFICRGK 396 (397)
T ss_pred eeeeehhHHHHHHHHHHHhhheeeccc
Confidence 346666677777777777777777765
No 94
>PHA03283 envelope glycoprotein E; Provisional
Probab=31.93 E-value=34 Score=35.29 Aligned_cols=31 Identities=10% Similarity=0.118 Sum_probs=14.7
Q ss_pred eeehhhHHHHHHHHHHHHHHhhheeecCCCC
Q psy4697 255 TVLIILGVVLIVLGFVIILLILYIHKNKHTK 285 (383)
Q Consensus 255 ~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~ 285 (383)
.+++.++|.++++++.+++-++.+||+.+++
T Consensus 401 ~~~~~~~~~~~~~~~~l~vw~c~~~r~~~~~ 431 (542)
T PHA03283 401 AFLLAIICTCAALLVALVVWGCILYRRSNRK 431 (542)
T ss_pred hhHHHHHHHHHHHHHHHhhhheeeehhhcCC
Confidence 3455555555554444433334444444443
No 95
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=31.03 E-value=1.8e+02 Score=20.04 Aligned_cols=27 Identities=30% Similarity=0.365 Sum_probs=17.7
Q ss_pred CcceEEEEEEEeeCC-CCCceeEEEEEE
Q psy4697 208 NTSTIQLVVVATDTG-NPPRQASVPAIM 234 (383)
Q Consensus 208 ~~~~y~L~V~a~D~g-~p~~sst~tv~I 234 (383)
..+.|.+.+.|+|.. +....+...+.|
T Consensus 22 ~dG~y~itv~a~D~AGN~s~~~~~~~ti 49 (54)
T PF13754_consen 22 ADGTYTITVTATDAAGNTSTSSSVTFTI 49 (54)
T ss_pred CCccEEEEEEEEeCCCCCCCccceeEEE
Confidence 468899999999974 433333334444
No 96
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=30.85 E-value=4.8e+02 Score=29.36 Aligned_cols=51 Identities=18% Similarity=0.123 Sum_probs=30.7
Q ss_pred EEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEe-EeCCC-CeeEEEEec
Q psy4697 30 TLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTT-NKPRD-LRVKYWLSN 83 (383)
Q Consensus 30 ~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A-~D~D~-~~v~Ysi~~ 83 (383)
+.+|.|.| ||++|.|....-..+|.|+. |..-.+|.- .+.+. -.+.|...+
T Consensus 516 ~ATVTIlD-DD~aGIfsFe~~~~sV~Es~--G~vtvtV~RtsGa~G~VtV~Y~T~d 568 (928)
T TIGR00845 516 TATVTILD-DDHAGIFTFEEDVFHVSESI--GIMEVKVLRTSGARGTVIVPYRTVE 568 (928)
T ss_pred eEEEEEec-CcccCcccccCceEEEEcCC--CEEEEEEEEcCCCCeeEEEEEEeec
Confidence 44566677 78899887766677889984 554444432 22222 335576554
No 97
>KOG3488|consensus
Probab=30.10 E-value=53 Score=24.31 Aligned_cols=31 Identities=23% Similarity=0.318 Sum_probs=15.5
Q ss_pred eeehhhHHHHHHHHHHHHHHhhheeecCCCC
Q psy4697 255 TVLIILGVVLIVLGFVIILLILYIHKNKHTK 285 (383)
Q Consensus 255 ~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~ 285 (383)
++.+-+++.+++|.++..++.....|.+||+
T Consensus 49 Ai~iPvaagl~ll~lig~Fis~vMlKskkKK 79 (81)
T KOG3488|consen 49 AITIPVAAGLFLLCLIGTFISLVMLKSKKKK 79 (81)
T ss_pred HhhhHHHHHHHHHHHHHHHHHHHhhhccccc
Confidence 3445556656555555555444444444443
No 98
>PF10365 DUF2436: Domain of unknown function (DUF2436); InterPro: IPR018832 Gingipains R and K are endopeptidases with specificity for arginyl and lysyl bonds, respectively. Like other cysteine peptidases, they require reducing conditions for activity. They are maximally active at approximately neutral pH. Gingipains R and K are secreted by the bacterium Porphyromonas gingivalis (Bacteroides gingivalis). The bacterium is a major pathogen in periodontal disease, and the many ways in which the activities of the gingipains may contribute to the disease processes have been reviewed. These enzymes are also involved in the hemagglutinating activity of the organisms. This entry represents a central region found in gingipain K peptidases, active on lysyl bonds; they belong to the MEROPS peptidase family C25 (gingipain family, clan CD).
Probab=29.86 E-value=2.5e+02 Score=23.90 Aligned_cols=82 Identities=12% Similarity=0.073 Sum_probs=39.8
Q ss_pred CCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeC------CCCeeEEEEecC-CCCCEEEcCcccEEEcc--cCCcccCc
Q psy4697 39 SLRELQFSKNEYSVSALENLPVNYVLLTVTTNKP------RDLRVKYWLSND-YGERFSISRQGDISLMQ--CLDYETED 109 (383)
Q Consensus 39 Ndn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~------D~~~v~Ysi~~~-~~~~F~Id~tG~I~~~~--~LD~E~~~ 109 (383)
|+++|.=.-..|+-.|++|+-+...- +-...|. ..+..-|-|... +....+|-..|-=-..| -+-+|.-+
T Consensus 66 n~~~pa~ly~~FEYkiP~NADps~tp-q~mv~dG~~~i~IPaG~YDy~I~~P~~~~kiwIaGd~g~~~tr~dDy~fEAGK 144 (161)
T PF10365_consen 66 NCNVPANLYDPFEYKIPANADPSTTP-QNMVVDGEASIDIPAGTYDYCIAAPQPGGKIWIAGDGGDGPTRGDDYVFEAGK 144 (161)
T ss_pred CCCCChhhcccceEeccCCCCCccCc-ceEEecCceEEEecCceeEEEEecCCCCCeEEEecCCCCCCccccceEEecCC
Confidence 34455433355677788876543211 1111111 114445555552 33445553211000122 23457789
Q ss_pred EEEEEEEEecCC
Q psy4697 110 SYRFTVYATDTL 121 (383)
Q Consensus 110 ~y~~~V~A~D~~ 121 (383)
.|.|++.+...+
T Consensus 145 tY~ftm~~~g~g 156 (161)
T PF10365_consen 145 TYRFTMKRVGSG 156 (161)
T ss_pred EEEEEEEeccCC
Confidence 999999987654
No 99
>PHA03286 envelope glycoprotein E; Provisional
Probab=29.18 E-value=53 Score=33.29 Aligned_cols=11 Identities=9% Similarity=0.169 Sum_probs=5.1
Q ss_pred CCcceEEEEEE
Q psy4697 207 LNTSTIQLVVV 217 (383)
Q Consensus 207 ~~~~~y~L~V~ 217 (383)
...+.|-+.+.
T Consensus 317 s~SGLYVfVl~ 327 (492)
T PHA03286 317 TDAGLYVVVAL 327 (492)
T ss_pred ccCceEEEEEE
Confidence 34555544443
No 100
>PF07213 DAP10: DAP10 membrane protein; InterPro: IPR009861 This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells.
Probab=29.11 E-value=71 Score=24.32 Aligned_cols=22 Identities=14% Similarity=0.259 Sum_probs=12.5
Q ss_pred HHHHHHHHHHHhhheeecCCCC
Q psy4697 264 LIVLGFVIILLILYIHKNKHTK 285 (383)
Q Consensus 264 ~~lL~l~~~~l~~~~~r~~~~~ 285 (383)
++.|++++....+.+-|+++++
T Consensus 45 vlTLLIv~~vy~car~r~r~~~ 66 (79)
T PF07213_consen 45 VLTLLIVLVVYYCARPRRRPTQ 66 (79)
T ss_pred HHHHHHHHHHHhhcccccCCcc
Confidence 3445555666666666666544
No 101
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=27.99 E-value=1.1e+02 Score=22.37 Aligned_cols=29 Identities=17% Similarity=0.299 Sum_probs=21.3
Q ss_pred CCcccCcEEEEEEEEecC-CceeEEEEEEE
Q psy4697 103 LDYETEDSYRFTVYATDT-LMTTSATVNIS 131 (383)
Q Consensus 103 LD~E~~~~y~~~V~A~D~-~~~s~~tV~I~ 131 (383)
..|.....|.+++.++|. +.+...++.|.
T Consensus 51 ~~y~~~G~y~v~l~v~d~~g~~~~~~~~V~ 80 (81)
T cd00146 51 HTYTKPGTYTVTLTVTNAVGSSSTKTTTVV 80 (81)
T ss_pred EEcCCCcEEEEEEEEEeCCCCEEEEEEEEE
Confidence 456778899999999997 45555465554
No 102
>TIGR00864 PCC polycystin cation channel protein. Note: this model has been restricted to the amino half for technical reasons.
Probab=27.84 E-value=1e+03 Score=30.47 Aligned_cols=110 Identities=18% Similarity=0.211 Sum_probs=0.0
Q ss_pred cccCcEEEEEEEEecCCceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeee-CCCCCeEEEEE
Q psy4697 105 YETEDSYRFTVYATDTLMTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAAD-GDKGDRVTLSL 183 (383)
Q Consensus 105 ~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D-~D~g~~i~ysi 183 (383)
|.....|.+++.|+|..-.++.+..|.| ..+.-.+.+. .....-.+-..+..+| .+.|...+|++
T Consensus 1480 Y~~~GtYtVtLTvtN~~Gsst~T~~VtV----------~~pV~~~tin----as~~~vpl~~sV~Fta~~s~Gs~v~ysW 1545 (2740)
T TIGR00864 1480 FNSPGDFNIRLAAANEVGKNEATLNVAV----------KARVRGLTIN----ASLTNVPLNGSVHFEAHLDAGDDVRFSW 1545 (2740)
T ss_pred cCCCceEEEEEEEECCCCceEEEEEEEE----------eccccceEEc----CCCccccccceEEEEEEccCCCceeEEE
Q ss_pred e-CCCCCCEEEcCCCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697 184 R-GPYEKMFSINDSGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMHF 236 (383)
Q Consensus 184 ~-~~~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v 236 (383)
. ++......-+++-.-.-... +.|.+.+.|.+..+ +..+++.|.|
T Consensus 1546 dFGDg~ts~~~npt~~yTY~sp-----GtYtVtLTvtN~~G---s~~~T~~i~V 1591 (2740)
T TIGR00864 1546 ILCDHCTPIFGGNTIFYTFRSV-----GTFNIIVTAENDVG---AAQASIFLFV 1591 (2740)
T ss_pred EeCCCCccccCCCceEEeecCC-----ceEEEEEEEecCCC---ccceeEEEEE
No 103
>PF07204 Orthoreo_P10: Orthoreovirus membrane fusion protein p10; InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction.
Probab=26.86 E-value=26 Score=27.48 Aligned_cols=25 Identities=12% Similarity=0.195 Sum_probs=12.5
Q ss_pred ehhhHHHHHHHHHHHHHHhhheeec
Q psy4697 257 LIILGVVLIVLGFVIILLILYIHKN 281 (383)
Q Consensus 257 i~~l~~i~~lL~l~~~~l~~~~~r~ 281 (383)
+++-|+++++++++.+++..+.+++
T Consensus 45 LA~GGG~iLilIii~Lv~CC~~K~K 69 (98)
T PF07204_consen 45 LAAGGGLILILIIIALVCCCRAKHK 69 (98)
T ss_pred hhccchhhhHHHHHHHHHHhhhhhh
Confidence 3333555555544555555555444
No 104
>PF14979 TMEM52: Transmembrane 52
Probab=26.48 E-value=74 Score=27.20 Aligned_cols=10 Identities=10% Similarity=-0.097 Sum_probs=4.4
Q ss_pred Hhhh-eeecCC
Q psy4697 274 LILY-IHKNKH 283 (383)
Q Consensus 274 l~~~-~~r~~~ 283 (383)
+.++ +|.||+
T Consensus 40 ~C~rfCClrk~ 50 (154)
T PF14979_consen 40 SCVRFCCLRKQ 50 (154)
T ss_pred HHHHHHHhccc
Confidence 3344 444444
No 105
>PHA03281 envelope glycoprotein E; Provisional
Probab=25.71 E-value=72 Score=33.12 Aligned_cols=24 Identities=13% Similarity=0.076 Sum_probs=10.7
Q ss_pred EEEEEEEEecCCceeEEEEEEEEee
Q psy4697 110 SYRFTVYATDTLMTTSATVNISVVN 134 (383)
Q Consensus 110 ~y~~~V~A~D~~~~s~~tV~I~V~D 134 (383)
.|.+-.. -+.|.+..+.+.|.|.+
T Consensus 312 VYtly~r-g~~G~s~~svfLVtVkg 335 (642)
T PHA03281 312 VYIWNLQ-GSDGENMYATFLVKLKG 335 (642)
T ss_pred eEEEEec-CCCCcceeEEEEEEecC
Confidence 4444444 22233334455566654
No 106
>TIGR03778 VPDSG_CTERM VPDSG-CTERM exosortase interaction domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (PubMed:16930487). This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.
Probab=25.06 E-value=81 Score=18.71 Aligned_cols=11 Identities=27% Similarity=0.446 Sum_probs=3.9
Q ss_pred HHHHHHHHHhh
Q psy4697 266 VLGFVIILLIL 276 (383)
Q Consensus 266 lL~l~~~~l~~ 276 (383)
+|.+.+..++.
T Consensus 10 Ll~~~l~~l~~ 20 (26)
T TIGR03778 10 LLGLGLLGLLG 20 (26)
T ss_pred HHHHHHHHHHH
Confidence 33333333333
No 107
>PF13965 SID-1_RNA_chan: dsRNA-gated channel SID-1
Probab=24.84 E-value=3.5e+02 Score=28.69 Aligned_cols=24 Identities=17% Similarity=0.331 Sum_probs=10.0
Q ss_pred EcCCCcEEEeccCCCCcceEEEEEE
Q psy4697 193 INDSGHISIVDLSALNTSTIQLVVV 217 (383)
Q Consensus 193 i~~tG~i~l~~~~~~~~~~y~L~V~ 217 (383)
+...|.|.+.+.+ ...+.+.+.+.
T Consensus 58 ~T~~a~itv~r~~-f~~~~F~Vvvv 81 (570)
T PF13965_consen 58 MTKKAGITVQRKD-FPSGSFYVVVV 81 (570)
T ss_pred EeccccEEEEhhh-CCCCeEEEEEE
Confidence 3344555544332 22334444444
No 108
>PRK14081 triple tyrosine motif-containing protein; Provisional
Probab=24.84 E-value=9.1e+02 Score=26.15 Aligned_cols=189 Identities=13% Similarity=0.140 Sum_probs=89.1
Q ss_pred CCeEEEEEEEEECCCCC-ceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcE-EEEEEeEeCCCCeeEEEEecCC
Q psy4697 8 LQPITLVVRAIQYDNQD-RYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYV-LLTVTTNKPRDLRVKYWLSNDY 85 (383)
Q Consensus 8 ~~~y~l~V~a~D~g~~~-~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~-v~~v~A~D~D~~~v~Ysi~~~~ 85 (383)
.-.|.+.|+|.+..+.. .--.+.+...|.+.+. ..- ...... .+...+|.. +..|.+.. ..+-|.
T Consensus 63 ~GkY~imVq~K~~~S~~~fD~~~~~~~~v~~~~~---~~I-~~~~~~-~~~l~vGe~l~~~V~~~~---e~~LYK----- 129 (667)
T PRK14081 63 EGEYTIMVQAKKEDSNKPFDYVSKEDYVIGKAEE---KLI-KNIYLD-KDTLNVGEKIEIKVDSNK---EPLMYR----- 129 (667)
T ss_pred CccEEEEEEEecCCCCCCcceeEEEEEEEcccch---hhh-eeeEec-CccccCCCEEEEEEEecc---CcEEEE-----
Confidence 35688888888866542 2233444444444333 111 111111 222334543 23333322 224453
Q ss_pred CCCEEEcCcccEEEcc------cCCcc--cCcEEEEEEEEecCC----ceeEEEEEEEEeecCCCCCcccCCceeEEEec
Q psy4697 86 GERFSISRQGDISLMQ------CLDYE--TEDSYRFTVYATDTL----MTTSATVNISVVNVNDWDPRFRYPQYELFLPH 153 (383)
Q Consensus 86 ~~~F~Id~tG~I~~~~------~LD~E--~~~~y~~~V~A~D~~----~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~ 153 (383)
|.|+.+|.....+ .|.|- ....|.+.+.+.|.. .-..+.+...|....+ +.+.. +. ...
T Consensus 130 ---F~I~~~~~w~~iqDYst~n~lsyt~~~~G~Y~ll~~~Kd~~S~~~fDD~~~v~y~Vk~~~~--v~I~~--F~-~ln- 200 (667)
T PRK14081 130 ---YWIKEDNNWKLIKDYSTENSLSYTANKPGKYELLVECKRIDSTKDFDDFKKVKFKVKEIDK--VEITD--FK-CLN- 200 (667)
T ss_pred ---EEEcCCCcEEEEEecCCcceEEEEecCCCcEEEEEEEecCCCccccCcceEEEEEcccCcc--eEEEe--cc-ccC-
Confidence 3344444433332 22221 246899999999964 4456677776665543 22211 00 111
Q ss_pred cCCCCCCCC-ceEEEEEeeeCCCCCeEEEEEe-CCCCCCEEEcC---CCcEEEeccCCCCcceEEEEEEEeeCCCC
Q psy4697 154 IPLADLTPG-SVIGKVEAADGDKGDRVTLSLR-GPYEKMFSIND---SGHISIVDLSALNTSTIQLVVVATDTGNP 224 (383)
Q Consensus 154 ~~~e~~~~g-~~v~~v~A~D~D~g~~i~ysi~-~~~~~~F~i~~---tG~i~l~~~~~~~~~~y~L~V~a~D~g~p 224 (383)
...-.| .+.+.+.|... .|..+.|.+. -+..+.+.-+. +-.++... ....+.|.|.+.|.|....
T Consensus 201 ---s~~i~~~eI~f~~~a~~~-~g~~~LYKF~~i~~~G~~~~~qdYst~n~~~y~--~~~~G~Y~i~~~VKD~~S~ 270 (667)
T PRK14081 201 ---KELICDEELVFEVESVYE-EDRTILYKFVKIDSDGKQTCIQDYSTKNIVSYK--EKKSGDYKLLCLVKDMYSN 270 (667)
T ss_pred ---cceecCcEEEEEEEEEeC-CCceEEEEEEEECCCCCEEEecCccccceEEEE--eCCCccEEEEEEEeccCcc
Confidence 111122 33455556554 3556666643 12233444332 11222222 2457889999999998654
No 109
>smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.
Probab=24.48 E-value=1.8e+02 Score=21.17 Aligned_cols=28 Identities=14% Similarity=0.120 Sum_probs=20.8
Q ss_pred CCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697 206 ALNTSTIQLVVVATDTGNPPRQASVPAIMHF 236 (383)
Q Consensus 206 ~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v 236 (383)
....+.|.+.+.+.|..+ ++++++.|.|
T Consensus 51 y~~~G~y~v~l~v~n~~g---~~~~~~~i~v 78 (79)
T smart00089 51 YTKPGTYTVTLTVTNAVG---SASATVTVVV 78 (79)
T ss_pred eCCCcEEEEEEEEEcCCC---cEEEEEEEEE
Confidence 455789999999999866 5565666654
No 110
>cd05762 Ig8_MLCK Eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK). Ig8_MLCK: the eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK). MLCK is a key regulator of different forms of cell motility involving actin and myosin II. Agonist stimulation of smooth muscle cells increases cytosolic Ca2+, which binds calmodulin. This Ca2+-calmodulin complex in turn binds to and activates MLCK. Activated MLCK leads to the phosphorylation of the 20 kDa myosin regulatory light chain (RLC) of myosin II and the stimulation of actin-activated myosin MgATPase activity. MLCK is widely present in vertebrate tissues; it phosphorylates the 20 kDa RLC of both smooth and nonmuscle myosin II. Phosphorylation leads to the activation of the myosin motor domain and altered structural properties of myosin II. In smooth muscle MLCK it is involved in initiating contraction. In nonmuscle cells, MLCK may participate in cell division and cell motility; it has
Probab=24.06 E-value=3.4e+02 Score=20.90 Aligned_cols=37 Identities=27% Similarity=0.322 Sum_probs=24.7
Q ss_pred EcccCCcccCcEEEEEEEEecCCceeEEEEEEEEeecCC
Q psy4697 99 LMQCLDYETEDSYRFTVYATDTLMTTSATVNISVVNVND 137 (383)
Q Consensus 99 ~~~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvND 137 (383)
+.+...++....|.+.+ .+..-...+++.|.|.|.-+
T Consensus 59 ~I~~~~~~D~G~Ytc~a--~N~~G~~~~~~~l~V~~~P~ 95 (98)
T cd05762 59 TITEGQQEHCGCYTLEV--ENKLGSRQAQVNLTVVDKPD 95 (98)
T ss_pred EECCCChhhCEEEEEEE--EcCCCceeEEEEEEEecCCC
Confidence 34556666677777665 44444566788888888776
No 111
>COG4288 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=23.47 E-value=1.2e+02 Score=24.49 Aligned_cols=47 Identities=19% Similarity=0.162 Sum_probs=29.1
Q ss_pred CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCC
Q psy4697 3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLP 59 (383)
Q Consensus 3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~ 59 (383)
.|||....|.++|.|. .++.+...|.+|..+.-....|...+.-|.|
T Consensus 52 sDr~pvgpyevevaar----------rt~hlRfndL~dpe~iP~d~~yasviesnvP 98 (124)
T COG4288 52 SDREPVGPYEVEVAAR----------RTLHLRFNDLGDPEAIPKDTPYASVIESNVP 98 (124)
T ss_pred ccCCCCCceEEEeecc----------eeEEEEecccCCcccCCCCCchhhheecCCc
Confidence 4555555555555443 3567888899987766555666655555554
No 112
>PHA03283 envelope glycoprotein E; Provisional
Probab=23.36 E-value=4.2e+02 Score=27.61 Aligned_cols=38 Identities=11% Similarity=0.110 Sum_probs=29.2
Q ss_pred CCceeeehhhHHHHHHHHHHHHHHhhheeecCCCCCCC
Q psy4697 251 GTSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKNNG 288 (383)
Q Consensus 251 ~~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~~ 288 (383)
.|.-..+..++++..+++++++.|+++.|-+.++.++.
T Consensus 394 ~~~~~~l~~~~~~~~~~~~~~~~l~vw~c~~~r~~~~~ 431 (542)
T PHA03283 394 AWTRHYLAFLLAIICTCAALLVALVVWGCILYRRSNRK 431 (542)
T ss_pred ccccccchhHHHHHHHHHHHHHHHhhhheeeehhhcCC
Confidence 44566677788888888888888889889998766554
No 113
>PF00558 Vpu: Vpu protein; InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells.; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=23.28 E-value=78 Score=24.27 Aligned_cols=6 Identities=0% Similarity=-0.158 Sum_probs=0.0
Q ss_pred eeecCC
Q psy4697 278 IHKNKH 283 (383)
Q Consensus 278 ~~r~~~ 283 (383)
.||+.|
T Consensus 29 eYrk~~ 34 (81)
T PF00558_consen 29 EYRKIK 34 (81)
T ss_dssp ------
T ss_pred HHHHHH
Confidence 344333
No 114
>KOG3637|consensus
Probab=23.03 E-value=1e+02 Score=35.06 Aligned_cols=23 Identities=30% Similarity=0.492 Sum_probs=13.6
Q ss_pred ehhhHHHHHHHHHHHHHHhhhee
Q psy4697 257 LIILGVVLIVLGFVIILLILYIH 279 (383)
Q Consensus 257 i~~l~~i~~lL~l~~~~l~~~~~ 279 (383)
+|+++.+.+||+|++|++++++|
T Consensus 980 iIi~svl~GLLlL~llv~~LwK~ 1002 (1030)
T KOG3637|consen 980 IIILSVLGGLLLLALLVLLLWKC 1002 (1030)
T ss_pred eehHHHHHHHHHHHHHHHHHHhc
Confidence 45555555566666666666554
No 115
>PF11395 DUF2873: Protein of unknown function (DUF2873); InterPro: IPR021532 This entry is represented by the human SARS coronavirus, Orf7b; it is a family of uncharacterised viral proteins.
Probab=22.89 E-value=80 Score=20.30 Aligned_cols=9 Identities=22% Similarity=0.630 Sum_probs=3.6
Q ss_pred HHHHHHHHH
Q psy4697 264 LIVLGFVII 272 (383)
Q Consensus 264 ~~lL~l~~~ 272 (383)
+++|+++++
T Consensus 17 llflv~iml 25 (43)
T PF11395_consen 17 LLFLVIIML 25 (43)
T ss_pred HHHHHHHHH
Confidence 333444443
No 116
>TIGR03660 T1SS_rpt_143 T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion.
Probab=22.87 E-value=2.7e+02 Score=23.56 Aligned_cols=44 Identities=14% Similarity=0.142 Sum_probs=27.5
Q ss_pred eEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCC
Q psy4697 10 PITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLP 59 (383)
Q Consensus 10 ~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~ 59 (383)
...|.|.|+|..+.. +...+.|.|.| | .|.-.... .+.|.|+..
T Consensus 86 ~l~~~v~a~D~DGD~--s~~~l~VtI~D--D-~P~~~~~~-~~~V~E~~L 129 (137)
T TIGR03660 86 TLNFPIIATDFDGDT--SSITLPVTIVD--D-VPTITDVD-ALTVDEDDL 129 (137)
T ss_pred EEeeeEEEEeCCCCc--cccEEEEEEEC--C-CCeecccc-ceEEecccc
Confidence 467788899865333 23477788877 5 46654433 378888543
No 117
>cd05775 Ig_SLAM-CD84_like_N N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family, CD84_like. Ig_SLAM-CD84_like_N: The N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family, CD84_like. The SLAM family is a group of immune-cell specific receptors that can regulate both adaptive and innate immune responses. Members of this group include proteins such as CD84, SLAM (CD150), Ly-9 (CD229), NTB-A (ly-108, SLAM6), 19A (CRACC), and SLAMF9. The genes coding for the SLAM family are nested on chromosome 1, in humans at 1q23, and in mice at 1H2. The SLAM family is a subset of the CD2 family, which also includes CD2 and CD58 located on chromosome 1 at 1p13 in humans. In mice, CD2 is located on chromosome 3, and there is no CD58 homolog. The SLAM family proteins are organized as an extracellular domain with either two or four Ig-like domains, a single transmembrane segment, and a cytoplasmic region
Probab=22.68 E-value=1.9e+02 Score=22.27 Aligned_cols=31 Identities=6% Similarity=0.146 Sum_probs=25.1
Q ss_pred CEEEcC-CCcEEEeccCCCCcceEEEEEEEee
Q psy4697 190 MFSIND-SGHISIVDLSALNTSTIQLVVVATD 220 (383)
Q Consensus 190 ~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D 220 (383)
.+.++. ++.|.+......+.+.|.+.|...+
T Consensus 53 R~~~~~~~~sL~I~~~~~~DsG~Y~c~v~~~~ 84 (97)
T cd05775 53 RVNFSQNDYSLQISNLKMEDAGSYRAEINTKN 84 (97)
T ss_pred eEEecCCceeEEECCCchHHCEEEEEEEEcCC
Confidence 455665 6888888888888999999998776
No 118
>smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.
Probab=22.60 E-value=1.9e+02 Score=20.95 Aligned_cols=31 Identities=29% Similarity=0.419 Sum_probs=23.0
Q ss_pred cCCcccCcEEEEEEEEecCCceeEEEEEEEE
Q psy4697 102 CLDYETEDSYRFTVYATDTLMTTSATVNISV 132 (383)
Q Consensus 102 ~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V 132 (383)
..-|+....|.+++.+.|..-++++++.|.|
T Consensus 48 ~~~y~~~G~y~v~l~v~n~~g~~~~~~~i~v 78 (79)
T smart00089 48 THTYTKPGTYTVTLTVTNAVGSASATVTVVV 78 (79)
T ss_pred EEEeCCCcEEEEEEEEEcCCCcEEEEEEEEE
Confidence 4456778899999999986446666666665
No 119
>PF13584 BatD: Oxygen tolerance
Probab=21.99 E-value=8.5e+02 Score=24.76 Aligned_cols=16 Identities=13% Similarity=0.117 Sum_probs=9.1
Q ss_pred eEEEEEeCCCCCCEEE
Q psy4697 178 RVTLSLRGPYEKMFSI 193 (383)
Q Consensus 178 ~i~ysi~~~~~~~F~i 193 (383)
.++|.+.....+.|.|
T Consensus 339 ~~~~~~ip~~~G~~~l 354 (484)
T PF13584_consen 339 TFKYTLIPKKPGDFTL 354 (484)
T ss_pred EEEEEEEeCCCCeEEc
Confidence 4666666555555555
No 120
>PHA03265 envelope glycoprotein D; Provisional
Probab=21.69 E-value=41 Score=32.89 Aligned_cols=51 Identities=14% Similarity=0.148 Sum_probs=25.0
Q ss_pred CCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-CeeEEEEecCCCCCEEE
Q psy4697 40 LRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-LRVKYWLSNDYGERFSI 91 (383)
Q Consensus 40 dn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-~~v~Ysi~~~~~~~F~I 91 (383)
|.||.|+.+.|+..+..-.. +..|.+-.+.+.+. ..++|-+..++-+...+
T Consensus 42 ~~PP~~PPPRYNyt~~w~~~-~~~IPSPF~d~~~~~veVr~Vtst~pCgmvAL 93 (402)
T PHA03265 42 DRPKEFPPPRYNYTILTRYN-ATALASPFINDQVKNVDLRIVTATRPCEMIAL 93 (402)
T ss_pred CCCCCCCCCCCCceEEEeec-CCCCCCcccCCCCCceeeeeeeccCCcceEEE
Confidence 44788988888776553321 11222222233333 45555554444444444
No 121
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=21.47 E-value=1e+02 Score=25.82 Aligned_cols=6 Identities=0% Similarity=0.069 Sum_probs=2.4
Q ss_pred eeecCC
Q psy4697 278 IHKNKH 283 (383)
Q Consensus 278 ~~r~~~ 283 (383)
+||+-|
T Consensus 122 ~yr~~r 127 (139)
T PHA03099 122 VYRFTR 127 (139)
T ss_pred hheeee
Confidence 344433
No 122
>PF05399 EVI2A: Ecotropic viral integration site 2A protein (EVI2A); InterPro: IPR008608 This family contains several mammalian ecotropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown, although it is thought to be a membrane protein and may function as an oncogene in retrovirus-induced myeloid tumours.; GO: 0016021 integral to membrane
Probab=21.12 E-value=41 Score=30.53 Aligned_cols=14 Identities=14% Similarity=0.285 Sum_probs=5.1
Q ss_pred HHHHHHHHHHHHHh
Q psy4697 262 VVLIVLGFVIILLI 275 (383)
Q Consensus 262 ~i~~lL~l~~~~l~ 275 (383)
.||.+|+|-.++||
T Consensus 142 LICT~LfLSTVVLA 155 (227)
T PF05399_consen 142 LICTLLFLSTVVLA 155 (227)
T ss_pred HHHHHHHHHHHHHH
Confidence 33333333333333
No 123
>PF08391 Ly49: Ly49-like protein, N-terminal region; InterPro: IPR013600 The sequences making up this entry are annotated as, or are similar to, Ly49 receptors (e.g. P20937 from SWISSPROT). These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g. Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function. They are members of the C-type lectin receptor superfamily, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (IPR001304 from INTERPRO).; PDB: 1QO3_D 3C8J_D 1P4L_D 3C8K_D 3G8K_B 1JA3_B 3CAD_A 3G8L_A.
Probab=20.80 E-value=33 Score=28.36 Aligned_cols=22 Identities=18% Similarity=0.357 Sum_probs=0.0
Q ss_pred eeehhhHHHHHHHHHHHHHHhh
Q psy4697 255 TVLIILGVVLIVLGFVIILLIL 276 (383)
Q Consensus 255 ~li~~l~~i~~lL~l~~~~l~~ 276 (383)
.++++||.+|++|++.++.|+.
T Consensus 6 liav~LGILCllLLvtv~vL~t 27 (119)
T PF08391_consen 6 LIAVALGILCLLLLVTVAVLGT 27 (119)
T ss_dssp ----------------------
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 4577888888876665555554
No 124
>PLN03150 hypothetical protein; Provisional
Probab=20.51 E-value=88 Score=33.39 Aligned_cols=12 Identities=33% Similarity=0.501 Sum_probs=4.9
Q ss_pred eehhhHHHHHHH
Q psy4697 256 VLIILGVVLIVL 267 (383)
Q Consensus 256 li~~l~~i~~lL 267 (383)
+.++++++++++
T Consensus 547 i~~~~~~~~~~l 558 (623)
T PLN03150 547 IGIAFGVSVAFL 558 (623)
T ss_pred EEEEhHHHHHHH
Confidence 334444444333
No 125
>PF15065 NCU-G1: Lysosomal transcription factor, NCU-G1
Probab=20.36 E-value=51 Score=32.54 Aligned_cols=12 Identities=25% Similarity=0.274 Sum_probs=7.3
Q ss_pred CcEEEEEEEEec
Q psy4697 108 EDSYRFTVYATD 119 (383)
Q Consensus 108 ~~~y~~~V~A~D 119 (383)
.....|+++|.+
T Consensus 128 ngsi~~~~~af~ 139 (350)
T PF15065_consen 128 NGSIAFKLQAFS 139 (350)
T ss_pred CCeEEEEEEEec
Confidence 556666666655
No 126
>PF02124 Marek_A: Marek's disease glycoprotein A; InterPro: IPR001038 Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) glycoprotein 13 (EHV-1 gp13) has the characteristic features of a membrane-spanning protein: an N-terminal signal sequence; a hydrophobic membrane anchor region; a charged C-terminal cytoplasmic tail; and an exterior domain with nine potential N-glycosylation sites. EHV-1 gp13 is the structural homologue of the gC-like glycoproteins of the Human herpesvirus 1 (HHV-1) and Human herpesvirus 2 (HHV-2) (gC-1 and gC-2 respectively), Pseudorabies virus (strain Indiana-Funkhauser/Becker) (PRV) (gIII) and Human herpesvirus 3 (HHV-3) (gp66). Secretory glycoprotein GP57-65 precursor (glycoprotein A - GA) is similar to Herpesvirus glycoprotein C, and belongs to the immunoglobulin gene superfamily. GA is thought to play an immunoevasive role in the pathogenesis of Marek's disease. It is a candidate for causing the early-stage immunosuppression that occurs after MDHV infection.
Probab=20.30 E-value=5.3e+02 Score=23.53 Aligned_cols=20 Identities=5% Similarity=-0.054 Sum_probs=12.4
Q ss_pred eEEEEEEEeeCCCCCceeEE
Q psy4697 211 TIQLVVVATDTGNPPRQASV 230 (383)
Q Consensus 211 ~y~L~V~a~D~g~p~~sst~ 230 (383)
.|.-.+.=.-.+-|..+.+.
T Consensus 152 ~YtC~l~GYP~~~p~f~~~~ 171 (211)
T PF02124_consen 152 EYTCRLIGYPDILPVFEDTA 171 (211)
T ss_pred eEEEEEeeCCCCCCcccceE
Confidence 67777765555556655543
Done!