Query 047816
Match_columns 620
No_of_seqs 364 out of 1457
Neff 8.4
Searched_HMMs 46136
Date Fri Mar 29 03:27:31 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/047816.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/047816hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03146 aspartyl protease fam 100.0 1.4E-52 3.1E-57 451.7 37.8 333 81-430 80-430 (431)
2 PTZ00165 aspartyl protease; Pr 100.0 9.7E-53 2.1E-57 454.9 36.6 320 73-437 110-458 (482)
3 cd05478 pepsin_A Pepsin A, asp 100.0 1E-51 2.2E-56 430.9 30.8 298 83-425 8-317 (317)
4 cd05490 Cathepsin_D2 Cathepsin 100.0 2.1E-51 4.6E-56 430.1 31.1 300 83-425 4-325 (325)
5 KOG1339 Aspartyl protease [Pos 100.0 1.7E-50 3.6E-55 433.4 35.8 340 79-429 40-397 (398)
6 cd05486 Cathespin_E Cathepsin 100.0 5.4E-51 1.2E-55 425.3 29.0 296 86-425 1-316 (316)
7 cd05487 renin_like Renin stimu 100.0 1.4E-50 3E-55 423.9 31.5 301 83-426 6-326 (326)
8 PTZ00147 plasmepsin-1; Provisi 100.0 2.8E-50 6.1E-55 432.5 33.3 310 70-427 126-450 (453)
9 cd06098 phytepsin Phytepsin, a 100.0 2.7E-50 5.9E-55 419.9 31.7 290 82-425 7-317 (317)
10 cd06096 Plasmepsin_5 Plasmepsi 100.0 3.6E-50 7.8E-55 420.6 31.0 296 84-429 2-326 (326)
11 PTZ00013 plasmepsin 4 (PM4); P 100.0 8.2E-50 1.8E-54 428.0 33.5 309 69-427 124-449 (450)
12 cd05477 gastricsin Gastricsins 100.0 8.1E-50 1.7E-54 416.9 32.2 297 84-426 2-318 (318)
13 cd05488 Proteinase_A_fungi Fun 100.0 5E-50 1.1E-54 418.6 30.5 295 83-425 8-320 (320)
14 cd05485 Cathepsin_D_like Cathe 100.0 6E-50 1.3E-54 419.3 30.8 299 83-425 9-329 (329)
15 cd05472 cnd41_like Chloroplast 100.0 4.4E-48 9.4E-53 400.4 34.1 293 85-428 1-299 (299)
16 cd05473 beta_secretase_like Be 100.0 1.6E-47 3.4E-52 406.7 31.8 321 85-432 3-351 (364)
17 PF00026 Asp: Eukaryotic aspar 100.0 5.1E-47 1.1E-51 395.7 21.0 302 85-426 1-317 (317)
18 cd06097 Aspergillopepsin_like 100.0 2.8E-46 6E-51 382.8 25.3 265 86-425 1-278 (278)
19 cd05476 pepsin_A_like_plant Ch 100.0 1.4E-45 2.9E-50 375.0 29.3 255 85-428 1-265 (265)
20 cd05475 nucellin_like Nucellin 100.0 4.2E-45 9.1E-50 372.7 30.4 261 84-428 1-273 (273)
21 cd05474 SAP_like SAPs, pepsin- 100.0 3.2E-45 7E-50 378.5 28.7 272 85-426 2-295 (295)
22 cd05489 xylanase_inhibitor_I_l 100.0 1.3E-43 2.9E-48 373.9 32.6 318 92-426 2-361 (362)
23 cd05471 pepsin_like Pepsin-lik 100.0 5E-42 1.1E-46 352.0 28.9 269 86-425 1-283 (283)
24 PF14543 TAXi_N: Xylanase inhi 99.9 3.4E-25 7.4E-30 207.8 14.8 152 86-250 1-164 (164)
25 PF14541 TAXi_C: Xylanase inhi 99.9 1.3E-21 2.7E-26 183.4 14.9 155 269-425 1-161 (161)
26 cd05470 pepsin_retropepsin_lik 99.9 4.7E-21 1E-25 167.5 12.1 107 88-211 1-109 (109)
27 cd05483 retropepsin_like_bacte 97.9 3.1E-05 6.6E-10 65.3 6.9 93 85-213 2-94 (96)
28 PF01102 Glycophorin_A: Glycop 96.6 0.0023 5E-08 55.9 3.9 37 578-615 59-96 (122)
29 TIGR02281 clan_AA_DTGA clan AA 96.5 0.012 2.6E-07 52.1 8.0 94 83-212 9-102 (121)
30 PF13650 Asp_protease_2: Aspar 95.9 0.041 8.9E-07 45.3 7.9 89 88-212 1-89 (90)
31 PTZ00382 Variant-specific surf 95.4 0.0027 5.8E-08 53.5 -1.0 36 574-610 57-94 (96)
32 PF01034 Syndecan: Syndecan do 95.2 0.0058 1.2E-07 46.3 0.1 36 583-619 10-45 (64)
33 cd05479 RP_DDI RP_DDI; retrope 94.2 0.18 3.9E-06 44.8 7.4 27 397-423 98-124 (124)
34 cd05479 RP_DDI RP_DDI; retrope 94.2 0.19 4E-06 44.7 7.3 90 84-212 15-106 (124)
35 PF04478 Mid2: Mid2 like cell 93.2 0.011 2.3E-07 53.3 -2.3 30 583-612 50-79 (154)
36 PF03302 VSP: Giardia variant- 92.6 0.045 9.7E-07 58.8 0.9 37 574-611 358-396 (397)
37 PF01299 Lamp: Lysosome-associ 91.5 0.14 3.1E-06 53.0 3.1 34 583-619 271-304 (306)
38 TIGR01478 STEVOR variant surfa 90.5 0.31 6.6E-06 48.6 4.1 33 586-618 262-294 (295)
39 PF02009 Rifin_STEVOR: Rifin/s 89.2 0.34 7.3E-06 49.6 3.4 31 581-614 255-285 (299)
40 PF05454 DAG1: Dystroglycan (D 88.8 0.13 2.7E-06 52.2 0.0 91 478-571 3-102 (290)
41 PF08693 SKG6: Transmembrane a 88.6 0.041 8.9E-07 37.8 -2.4 8 605-612 32-39 (40)
42 PF02439 Adeno_E3_CR2: Adenovi 88.4 0.96 2.1E-05 30.6 3.9 29 585-613 6-34 (38)
43 cd05484 retropepsin_like_LTR_2 88.0 0.55 1.2E-05 39.0 3.4 28 86-115 1-28 (91)
44 cd06095 RP_RTVL_H_like Retrope 87.6 2.7 5.9E-05 34.5 7.3 27 89-117 2-28 (86)
45 PTZ00370 STEVOR; Provisional 87.5 0.42 9.1E-06 47.7 2.7 26 586-611 258-283 (296)
46 TIGR02281 clan_AA_DTGA clan AA 86.9 7.5 0.00016 34.2 10.1 37 266-316 8-44 (121)
47 PHA03286 envelope glycoprotein 85.8 1.4 3E-05 46.7 5.5 76 537-616 337-424 (492)
48 TIGR03698 clan_AA_DTGF clan AA 85.5 3.2 6.9E-05 35.7 6.9 24 398-421 84-107 (107)
49 PF08284 RVP_2: Retroviral asp 84.6 6.1 0.00013 35.6 8.6 28 398-425 104-131 (135)
50 COG3577 Predicted aspartyl pro 83.8 3 6.5E-05 39.8 6.3 79 73-180 95-173 (215)
51 PF05808 Podoplanin: Podoplani 83.7 0.34 7.4E-06 44.2 0.0 35 574-610 120-155 (162)
52 PF13975 gag-asp_proteas: gag- 82.7 2 4.4E-05 34.0 4.1 36 83-120 6-41 (72)
53 cd05484 retropepsin_like_LTR_2 81.5 7.3 0.00016 32.2 7.3 30 276-316 4-33 (91)
54 PTZ00046 rifin; Provisional 80.5 1.7 3.8E-05 45.2 3.7 30 586-615 316-345 (358)
55 TIGR01477 RIFIN variant surfac 79.9 1.8 4E-05 44.8 3.7 30 586-615 311-340 (353)
56 PF15176 LRR19-TM: Leucine-ric 76.2 4.9 0.00011 33.6 4.5 39 573-612 8-47 (102)
57 PF05393 Hum_adeno_E3A: Human 74.9 3.7 8.1E-05 33.3 3.3 25 596-620 43-68 (94)
58 PF00077 RVP: Retroviral aspar 74.6 4.1 8.9E-05 34.2 3.9 26 88-115 8-33 (100)
59 PF14575 EphA2_TM: Ephrin type 73.5 3.4 7.5E-05 33.0 2.9 27 584-612 2-28 (75)
60 PF12768 Rax2: Cortical protei 72.0 2.3 5E-05 43.4 1.9 36 577-612 221-259 (281)
61 PF06697 DUF1191: Protein of u 70.4 1.1 2.3E-05 45.1 -0.8 29 583-613 215-243 (278)
62 PF05568 ASFV_J13L: African sw 70.0 7 0.00015 34.8 4.2 40 573-613 20-59 (189)
63 PF13650 Asp_protease_2: Aspar 69.3 5.9 0.00013 32.1 3.6 29 277-316 3-31 (90)
64 PF11925 DUF3443: Protein of u 68.8 12 0.00026 39.1 6.3 107 88-214 26-149 (370)
65 PF12191 stn_TNFRSF12A: Tumour 67.7 1.6 3.5E-05 38.0 -0.2 17 599-615 92-108 (129)
66 PF06024 DUF912: Nucleopolyhed 67.6 4.1 8.9E-05 34.7 2.3 31 583-614 63-93 (101)
67 PF03229 Alpha_GJ: Alphavirus 67.4 5.3 0.00012 34.2 2.8 27 584-612 85-111 (126)
68 PF05545 FixQ: Cbb3-type cytoc 67.0 6.6 0.00014 28.5 2.9 23 592-614 15-37 (49)
69 PF15102 TMEM154: TMEM154 prot 66.5 2.4 5.2E-05 38.2 0.6 6 585-590 59-64 (146)
70 PF12384 Peptidase_A2B: Ty3 tr 65.7 19 0.00042 33.2 6.2 22 295-316 46-67 (177)
71 PF13975 gag-asp_proteas: gag- 65.6 8.9 0.00019 30.2 3.7 30 276-316 12-41 (72)
72 PHA03265 envelope glycoprotein 65.1 2.1 4.4E-05 43.9 -0.0 29 583-612 348-376 (402)
73 PF15065 NCU-G1: Lysosomal tra 63.6 7.3 0.00016 40.9 3.6 49 565-613 297-349 (350)
74 cd05482 HIV_retropepsin_like R 62.6 9.6 0.00021 31.5 3.5 25 89-115 2-26 (87)
75 PF02480 Herpes_gE: Alphaherpe 62.5 2.5 5.4E-05 46.0 0.0 23 358-380 170-192 (439)
76 PF02009 Rifin_STEVOR: Rifin/s 61.5 6.4 0.00014 40.4 2.7 24 595-618 269-293 (299)
77 PF07213 DAP10: DAP10 membrane 57.9 16 0.00035 29.3 3.8 19 579-598 30-49 (79)
78 cd05483 retropepsin_like_bacte 57.8 15 0.00033 29.9 4.1 30 276-316 6-35 (96)
79 PF06365 CD34_antigen: CD34/Po 56.6 8.2 0.00018 37.1 2.4 30 583-613 101-130 (202)
80 TIGR01167 LPXTG_anchor LPXTG-m 56.0 13 0.00028 24.4 2.6 10 603-612 24-33 (34)
81 PF12877 DUF3827: Domain of un 55.9 31 0.00067 38.7 6.9 81 523-611 207-297 (684)
82 cd06094 RP_Saci_like RP_Saci_l 55.0 58 0.0013 27.0 6.7 21 294-314 9-29 (89)
83 PF13703 PepSY_TM_2: PepSY-ass 54.9 19 0.00041 29.6 4.0 20 593-612 24-43 (88)
84 cd06095 RP_RTVL_H_like Retrope 53.4 16 0.00035 29.8 3.4 29 277-316 3-31 (86)
85 PF02160 Peptidase_A3: Caulifl 53.3 17 0.00037 34.9 3.9 28 397-425 90-117 (201)
86 PF01034 Syndecan: Syndecan do 53.0 5.1 0.00011 30.7 0.3 34 577-612 7-41 (64)
87 PF11014 DUF2852: Protein of u 48.0 29 0.00063 30.1 4.1 31 580-611 7-38 (115)
88 PF02529 PetG: Cytochrome B6-F 47.9 42 0.00091 22.6 3.9 26 583-609 5-30 (37)
89 TIGR01478 STEVOR variant surfa 47.6 16 0.00036 36.6 2.9 29 590-619 262-291 (295)
90 KOG4818 Lysosomal-associated m 47.0 15 0.00033 38.1 2.7 29 584-613 328-356 (362)
91 PF15099 PIRT: Phosphoinositid 46.6 11 0.00023 33.1 1.3 32 583-615 81-113 (129)
92 PF14986 DUF4514: Domain of un 46.5 19 0.00042 26.3 2.3 28 583-612 23-50 (61)
93 PTZ00370 STEVOR; Provisional 45.8 18 0.00039 36.5 2.9 26 590-615 258-284 (296)
94 PF11353 DUF3153: Protein of u 45.5 18 0.00038 35.2 2.9 46 566-613 162-208 (209)
95 PTZ00382 Variant-specific surf 45.2 3.1 6.8E-05 35.0 -2.1 33 583-615 63-95 (96)
96 PF13268 DUF4059: Protein of u 43.5 25 0.00054 27.5 2.7 23 595-617 17-39 (72)
97 PF08374 Protocadherin: Protoc 42.8 11 0.00024 36.2 1.0 26 583-611 39-64 (221)
98 KOG3540 Beta amyloid precursor 42.3 30 0.00064 37.3 4.0 57 554-613 518-576 (615)
99 PF00077 RVP: Retroviral aspar 42.2 20 0.00042 30.0 2.3 27 276-313 9-35 (100)
100 PF14575 EphA2_TM: Ephrin type 41.2 32 0.0007 27.5 3.2 23 593-615 6-28 (75)
101 PF12384 Peptidase_A2B: Ty3 tr 40.7 36 0.00077 31.5 3.7 29 87-115 34-62 (177)
102 PF09668 Asp_protease: Asparty 38.9 16 0.00034 32.4 1.2 36 85-122 24-59 (124)
103 TIGR03867 MprA_tail MprA prote 38.8 42 0.0009 21.0 2.6 19 593-611 8-26 (27)
104 PF13908 Shisa: Wnt and FGF in 38.0 16 0.00035 34.5 1.2 12 579-590 75-87 (179)
105 PRK09459 pspG phage shock prot 38.0 29 0.00064 27.4 2.4 21 599-619 53-73 (76)
106 PF05084 GRA6: Granule antigen 37.8 36 0.00079 31.0 3.3 13 601-613 164-176 (215)
107 TIGR03370 PEPCTERM_Roseo varia 36.9 41 0.0009 21.0 2.4 17 597-613 9-25 (26)
108 PHA03283 envelope glycoprotein 36.7 34 0.00074 37.4 3.5 25 595-619 410-434 (542)
109 PF01102 Glycophorin_A: Glycop 36.1 46 0.00099 29.4 3.6 30 583-614 69-98 (122)
110 CHL00008 petG cytochrome b6/f 35.5 73 0.0016 21.4 3.5 25 583-608 5-29 (37)
111 PHA03281 envelope glycoprotein 35.5 63 0.0014 35.5 5.2 20 537-556 505-524 (642)
112 COG3577 Predicted aspartyl pro 35.0 71 0.0015 30.8 4.9 36 266-315 102-137 (215)
113 TIGR03778 VPDSG_CTERM VPDSG-CT 34.9 46 0.00099 20.7 2.4 17 595-611 8-24 (26)
114 PF15176 LRR19-TM: Leucine-ric 34.5 41 0.0009 28.3 2.9 36 583-619 15-51 (102)
115 PRK00665 petG cytochrome b6-f 33.7 78 0.0017 21.3 3.4 25 583-608 5-29 (37)
116 cd05481 retropepsin_like_LTR_1 33.2 47 0.001 27.7 3.1 21 296-316 12-32 (93)
117 PF10577 UPF0560: Uncharacteri 33.1 51 0.0011 38.1 4.3 33 580-612 270-302 (807)
118 COG4736 CcoQ Cbb3-type cytochr 32.9 50 0.0011 25.2 2.9 23 589-612 13-35 (60)
119 cd01324 cbb3_Oxidase_CcoQ Cyto 32.4 53 0.0012 23.8 2.9 22 592-613 16-37 (48)
120 COG5550 Predicted aspartyl pro 32.3 32 0.0007 30.2 2.0 20 297-316 29-49 (125)
121 PRK10525 cytochrome o ubiquino 31.4 42 0.00091 34.8 3.1 29 592-620 51-80 (315)
122 TIGR02595 PEP_exosort PEP-CTER 31.1 56 0.0012 20.3 2.4 7 606-612 17-23 (26)
123 PF14654 Epiglycanin_C: Mucin, 30.6 87 0.0019 26.2 4.1 26 583-609 19-44 (106)
124 PF14828 Amnionless: Amnionles 30.5 1.7E+02 0.0037 32.0 7.6 61 471-532 231-291 (437)
125 PF09668 Asp_protease: Asparty 29.8 56 0.0012 28.9 3.2 29 276-315 28-56 (124)
126 PF14979 TMEM52: Transmembrane 29.8 1.1E+02 0.0023 27.7 4.9 36 578-613 15-51 (154)
127 PF02038 ATP1G1_PLM_MAT8: ATP1 29.8 75 0.0016 23.2 3.2 15 583-598 15-29 (50)
128 PF10661 EssA: WXG100 protein 29.7 60 0.0013 29.6 3.4 16 596-611 128-143 (145)
129 TIGR01433 CyoA cytochrome o ub 29.7 43 0.00094 33.0 2.7 17 597-613 44-60 (226)
130 PF06679 DUF1180: Protein of u 29.0 57 0.0012 30.3 3.2 7 612-618 123-129 (163)
131 TIGR03698 clan_AA_DTGF clan AA 29.0 1.6E+02 0.0034 25.2 5.8 64 88-179 2-70 (107)
132 PF14316 DUF4381: Domain of un 29.0 50 0.0011 30.1 2.8 11 602-612 36-46 (146)
133 PF05337 CSF-1: Macrophage col 28.9 19 0.0004 36.1 0.0 20 596-615 236-255 (285)
134 PF10873 DUF2668: Protein of u 28.9 33 0.00071 30.8 1.5 12 579-590 57-69 (155)
135 PF04689 S1FA: DNA binding pro 28.6 45 0.00096 25.5 1.9 36 576-612 6-42 (69)
136 TIGR03501 gamma_C_targ gammapr 28.3 62 0.0013 20.2 2.2 13 598-610 9-21 (26)
137 PF05283 MGC-24: Multi-glycosy 27.9 66 0.0014 30.6 3.5 17 592-608 166-182 (186)
138 PF06040 Adeno_E3: Adenovirus 27.8 62 0.0014 27.8 2.9 23 562-598 82-104 (127)
139 PRK14748 kdpF potassium-transp 27.5 1.2E+02 0.0026 19.3 3.3 21 586-608 4-24 (29)
140 PF09472 MtrF: Tetrahydrometha 27.1 86 0.0019 24.3 3.3 30 577-608 34-64 (64)
141 KOG1094 Discoidin domain recep 26.7 80 0.0017 35.5 4.2 13 96-108 53-65 (807)
142 PRK15348 type III secretion sy 26.7 79 0.0017 31.6 3.9 35 480-515 149-184 (249)
143 PF13706 PepSY_TM_3: PepSY-ass 25.2 1.3E+02 0.0029 20.3 3.7 22 583-605 9-30 (37)
144 KOG0860 Synaptobrevin/VAMP-lik 24.8 72 0.0016 27.7 2.8 15 576-590 85-99 (116)
145 PF14610 DUF4448: Protein of u 24.6 22 0.00047 34.0 -0.4 8 561-568 123-130 (189)
146 PF12301 CD99L2: CD99 antigen 24.0 77 0.0017 29.7 3.1 12 534-545 68-79 (169)
147 PF13172 PepSY_TM_1: PepSY-ass 24.0 1.2E+02 0.0025 20.0 3.2 26 577-603 2-29 (34)
148 PRK11486 flagellar biosynthesi 23.3 87 0.0019 27.7 3.1 20 592-611 24-43 (124)
149 PF11615 DUF3249: Protein of u 23.1 56 0.0012 23.3 1.5 25 534-558 11-35 (60)
150 TIGR03063 srtB_target sortase 22.8 1E+02 0.0022 19.8 2.5 7 604-610 22-28 (29)
151 PF01002 Flavi_NS2B: Flaviviru 22.8 86 0.0019 27.9 3.0 27 506-532 44-79 (128)
152 PF13179 DUF4006: Family of un 22.5 1.4E+02 0.003 23.2 3.6 19 601-619 29-47 (66)
153 PF11669 WBP-1: WW domain-bind 22.3 81 0.0018 26.9 2.7 15 597-611 32-46 (102)
154 PF13908 Shisa: Wnt and FGF in 22.3 41 0.00088 31.8 1.0 11 583-593 76-86 (179)
155 PF14991 MLANA: Protein melan- 22.2 20 0.00042 30.8 -1.1 18 593-610 33-50 (118)
156 PF01282 Ribosomal_S24e: Ribos 22.1 3.7E+02 0.008 21.9 6.5 43 490-532 12-56 (84)
157 PF02038 ATP1G1_PLM_MAT8: ATP1 22.1 1.2E+02 0.0025 22.2 3.0 25 586-611 13-37 (50)
158 cd05481 retropepsin_like_LTR_1 21.3 69 0.0015 26.6 2.0 23 90-114 3-26 (93)
159 PRK00523 hypothetical protein; 21.1 93 0.002 24.6 2.5 25 586-611 5-29 (72)
160 PF02480 Herpes_gE: Alphaherpe 20.8 33 0.00071 37.5 0.0 30 583-612 353-382 (439)
161 PF03597 CcoS: Cytochrome oxid 20.7 1.9E+02 0.004 20.7 3.8 12 596-607 12-23 (45)
162 PRK00972 tetrahydromethanopter 20.5 82 0.0018 31.2 2.6 9 611-619 284-292 (292)
163 PTZ00208 65 kDa invariant surf 20.3 92 0.002 33.0 3.1 6 338-343 237-242 (436)
164 PF11118 DUF2627: Protein of u 20.2 1.1E+02 0.0025 24.3 2.9 27 588-615 44-70 (77)
No 1
>PLN03146 aspartyl protease family protein; Provisional
Probab=100.00 E-value=1.4e-52 Score=451.69 Aligned_cols=333 Identities=28% Similarity=0.537 Sum_probs=269.7
Q ss_pred ccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-cc-------cCCCCC
Q 047816 81 LLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-CN-------CDRERA 152 (620)
Q Consensus 81 ~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c~-------c~~~~~ 152 (620)
.++++|+++|.||||+|++.|++||||+++||+|.+|..|..+.++.|||++|+||+.+.|.+. |. |... +
T Consensus 80 ~~~~~Y~v~i~iGTPpq~~~vi~DTGS~l~Wv~C~~C~~C~~~~~~~fdps~SST~~~~~C~s~~C~~~~~~~~c~~~-~ 158 (431)
T PLN03146 80 SNGGEYLMNISIGTPPVPILAIADTGSDLIWTQCKPCDDCYKQVSPLFDPKKSSTYKDVSCDSSQCQALGNQASCSDE-N 158 (431)
T ss_pred cCCccEEEEEEcCCCCceEEEEECCCCCcceEcCCCCcccccCCCCcccCCCCCCCcccCCCCcccccCCCCCCCCCC-C
Confidence 3567899999999999999999999999999999999999887778999999999999999864 74 5433 4
Q ss_pred cceeEEeeccCCceeEEEEEEEEEeCCCC--CCCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCccc
Q 047816 153 QCVYERKYAEMSSSSGVLGEDIISFGNES--DLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISD 230 (620)
Q Consensus 153 ~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~--~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~ 230 (620)
.|.|.+.|+||+.+.|.+++|+|+|++.. ..++.++.|||+....+.+. ...+||||||++..|++.||..+ +.+
T Consensus 159 ~c~y~i~Ygdgs~~~G~l~~Dtltlg~~~~~~~~v~~~~FGc~~~~~g~f~-~~~~GilGLG~~~~Sl~sql~~~--~~~ 235 (431)
T PLN03146 159 TCTYSYSYGDGSFTKGNLAVETLTIGSTSGRPVSFPGIVFGCGHNNGGTFD-EKGSGIVGLGGGPLSLISQLGSS--IGG 235 (431)
T ss_pred CCeeEEEeCCCCceeeEEEEEEEEeccCCCCcceeCCEEEeCCCCCCCCcc-CCCceeEecCCCCccHHHHhhHh--hCC
Confidence 69999999998888999999999998743 14578999999988766543 35799999999999999999763 557
Q ss_pred ceEEeecCCC---CCCceEEECCCCCCC--CceEeecCCC-CCCeeEEEEeEEEEccEEecCCCCcc--CCCCceEeecc
Q 047816 231 SFSLCYGGMD---VGGGAMVLGGISPPK--DMVFTHSDPV-RSPYYNIDLKVIHVAGKPLPLNPKVF--DGKHGTVLDSG 302 (620)
Q Consensus 231 ~FSl~l~~~~---~~~G~l~fGgiD~~~--~~~~~~~~~~-~~~~w~v~l~~i~v~g~~~~~~~~~~--~~~~~ailDSG 302 (620)
.||+||.+.. ...|.|+||+...-. .+.+++.... ...+|.|.|++|+|+++.+.++...+ .+...+|||||
T Consensus 236 ~FSycL~~~~~~~~~~g~l~fG~~~~~~~~~~~~tPl~~~~~~~~y~V~L~gIsVgg~~l~~~~~~~~~~~~g~~iiDSG 315 (431)
T PLN03146 236 KFSYCLVPLSSDSNGTSKINFGTNAIVSGSGVVSTPLVSKDPDTFYYLTLEAISVGSKKLPYTGSSKNGVEEGNIIIDSG 315 (431)
T ss_pred cEEEECCCCCCCCCCcceEEeCCccccCCCCceEcccccCCCCCeEEEeEEEEEECCEECcCCccccccCCCCcEEEeCC
Confidence 9999996422 347999999953221 2456654422 35789999999999999988766544 23457999999
Q ss_pred ceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecc
Q 047816 303 TTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSK 382 (620)
Q Consensus 303 tt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~ 382 (620)
|++++||+++|+++.+++.+++.... .. ......+.||.... ...+|+|+|+| +|.++.|++++|++....
T Consensus 316 Tt~t~Lp~~~y~~l~~~~~~~~~~~~-~~-~~~~~~~~C~~~~~------~~~~P~i~~~F-~Ga~~~l~~~~~~~~~~~ 386 (431)
T PLN03146 316 TTLTLLPSDFYSELESAVEEAIGGER-VS-DPQGLLSLCYSSTS------DIKLPIITAHF-TGADVKLQPLNTFVKVSE 386 (431)
T ss_pred ccceecCHHHHHHHHHHHHHHhcccc-CC-CCCCCCCccccCCC------CCCCCeEEEEE-CCCeeecCcceeEEEcCC
Confidence 99999999999999999988875321 11 11234578986321 13689999999 689999999999987643
Q ss_pred cCCeEEEEEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEecCCcc
Q 047816 383 VRGAYCLGIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCSE 430 (620)
Q Consensus 383 ~~~~~Cl~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~~ 430 (620)
+..|+++... .+.+|||+.|||++|++||++++|||||+++|+.
T Consensus 387 --~~~Cl~~~~~--~~~~IlG~~~q~~~~vvyDl~~~~igFa~~~C~~ 430 (431)
T PLN03146 387 --DLVCFAMIPT--SSIAIFGNLAQMNFLVGYDLESKTVSFKPTDCTK 430 (431)
T ss_pred --CcEEEEEecC--CCceEECeeeEeeEEEEEECCCCEEeeecCCcCc
Confidence 5689988754 3469999999999999999999999999999975
No 2
>PTZ00165 aspartyl protease; Provisional
Probab=100.00 E-value=9.7e-53 Score=454.89 Aligned_cols=320 Identities=24% Similarity=0.439 Sum_probs=253.4
Q ss_pred eeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCC--CCCCCCCCCCCCCCcccccccCcCCcccCCC
Q 047816 73 RMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEH--CGDHQDPKFEPDLSSTYQPVKCNLYCNCDRE 150 (620)
Q Consensus 73 ~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~--C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~ 150 (620)
..++.+. .|.+|+++|+||||||+|.|++||||+++||+|..|.. |..+ +.|||++|+||+...+..
T Consensus 110 ~~~l~n~--~d~~Y~~~I~IGTPpQ~f~Vv~DTGSS~lWVps~~C~~~~C~~~--~~yd~s~SSTy~~~~~~~------- 178 (482)
T PTZ00165 110 QQDLLNF--HNSQYFGEIQVGTPPKSFVVVFDTGSSNLWIPSKECKSGGCAPH--RKFDPKKSSTYTKLKLGD------- 178 (482)
T ss_pred ceecccc--cCCeEEEEEEeCCCCceEEEEEeCCCCCEEEEchhcCccccccc--CCCCccccCCcEecCCCC-------
Confidence 3444443 47899999999999999999999999999999999985 6554 799999999999843110
Q ss_pred CCcceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC---------chHH
Q 047816 151 RAQCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL---------SVVD 220 (620)
Q Consensus 151 ~~~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~---------s~~~ 220 (620)
....+.++|++ +++.|.+++|+|++|+ ++++++.||++..+++. +....+|||||||++.. ++++
T Consensus 179 -~~~~~~i~YGs-Gs~~G~l~~DtV~ig~---l~i~~q~FG~a~~~s~~~f~~~~~DGILGLg~~~~s~~s~~~~~p~~~ 253 (482)
T PTZ00165 179 -ESAETYIQYGT-GECVLALGKDTVKIGG---LKVKHQSIGLAIEESLHPFADLPFDGLVGLGFPDKDFKESKKALPIVD 253 (482)
T ss_pred -ccceEEEEeCC-CcEEEEEEEEEEEECC---EEEccEEEEEEEeccccccccccccceeecCCCcccccccCCCCCHHH
Confidence 11257799998 5678999999999998 78899999999987653 44557899999998753 5899
Q ss_pred HHHHcCCcc-cceEEeecCCCCCCceEEECCCCCCCC--c-eEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCc
Q 047816 221 QLVEKGVIS-DSFSLCYGGMDVGGGAMVLGGISPPKD--M-VFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHG 296 (620)
Q Consensus 221 ~L~~~g~I~-~~FSl~l~~~~~~~G~l~fGgiD~~~~--~-~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ 296 (620)
+|++||+|+ +.||+||++....+|+|+|||+|+.+. . ...+.+.....+|.|.+++|+|+++.+... .....
T Consensus 254 ~l~~qgli~~~~FS~yL~~~~~~~G~l~fGGiD~~~~~~~g~i~~~Pv~~~~yW~i~l~~i~vgg~~~~~~----~~~~~ 329 (482)
T PTZ00165 254 NIKKQNLLKRNIFSFYMSKDLNQPGSISFGSADPKYTLEGHKIWWFPVISTDYWEIEVVDILIDGKSLGFC----DRKCK 329 (482)
T ss_pred HHHHcCCcccceEEEEeccCCCCCCEEEeCCcCHHHcCCCCceEEEEccccceEEEEeCeEEECCEEeeec----CCceE
Confidence 999999998 899999987655689999999998653 1 233333446789999999999999876542 23457
Q ss_pred eEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCc-----EEEe
Q 047816 297 TVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQ-----KLLL 371 (620)
Q Consensus 297 ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~-----~~~l 371 (620)
+++||||+++++|.++++++.++++. ..+|.. . +.+|+|+|+| +|. +|.+
T Consensus 330 aIiDTGTSli~lP~~~~~~i~~~i~~---------------~~~C~~--------~-~~lP~itf~f-~g~~g~~v~~~l 384 (482)
T PTZ00165 330 AAIDTGSSLITGPSSVINPLLEKIPL---------------EEDCSN--------K-DSLPRISFVL-EDVNGRKIKFDM 384 (482)
T ss_pred EEEcCCCccEeCCHHHHHHHHHHcCC---------------cccccc--------c-ccCCceEEEE-CCCCCceEEEEE
Confidence 99999999999999999999888732 136732 2 5789999999 443 8999
Q ss_pred CCCCcEEEec--ccCCeEEE-EEEecC----CCCceeehHhhhceEEEEEeCCCCEEEEEecCCcc-ccccccc
Q 047816 372 APENYLFRHS--KVRGAYCL-GIFQNG----RDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCSE-LWERLHI 437 (620)
Q Consensus 372 ~~~~yi~~~~--~~~~~~Cl-~~~~~~----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~~-~~~~~~~ 437 (620)
+|++|+++.. ..++..|+ ++.... .++.||||++|||++|+|||++|+|||||+++|+. ..+..++
T Consensus 385 ~p~dYi~~~~~~~~~~~~C~~g~~~~d~~~~~g~~~ILGd~Flr~yy~VFD~~n~rIGfA~a~~~~~~~~~~~~ 458 (482)
T PTZ00165 385 DPEDYVIEEGDSEEQEHQCVIGIIPMDVPAPRGPLFVLGNNFIRKYYSIFDRDHMMVGLVPAKHDQSGPNFQEL 458 (482)
T ss_pred chHHeeeecccCCCCCCeEEEEEEECCCCCCCCceEEEchhhheeEEEEEeCCCCEEEEEeeccCCCCCcEEEe
Confidence 9999999742 23456896 454321 23579999999999999999999999999999876 3334444
No 3
>cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequence of the zymogens are self cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures, however their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which
Probab=100.00 E-value=1e-51 Score=430.89 Aligned_cols=298 Identities=27% Similarity=0.507 Sum_probs=248.2
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE 162 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d 162 (620)
+..|+++|.||||+|++.|++||||+++||+|..|..|....++.|+|++|+|++..+ +.+.+.|++
T Consensus 8 ~~~Y~~~i~vGtp~q~~~v~~DTGS~~~wv~~~~C~~~~c~~~~~f~~~~Sst~~~~~-------------~~~~~~yg~ 74 (317)
T cd05478 8 DMEYYGTISIGTPPQDFTVIFDTGSSNLWVPSVYCSSQACSNHNRFNPRQSSTYQSTG-------------QPLSIQYGT 74 (317)
T ss_pred CCEEEEEEEeCCCCcEEEEEEeCCCccEEEecCCCCcccccccCcCCCCCCcceeeCC-------------cEEEEEECC
Confidence 6789999999999999999999999999999999985333334799999999999865 589999998
Q ss_pred CCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcC-CCcceEEecCCCCC------chHHHHHHcCCcc-cceEE
Q 047816 163 MSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYS-QHADGIIGLGRGDL------SVVDQLVEKGVIS-DSFSL 234 (620)
Q Consensus 163 g~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~-~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~~FSl 234 (620)
|. +.|.+++|+|++|+ +.++++.|||+..+.+.+.. ...+||||||++.. +++++|+++|+|+ ++||+
T Consensus 75 gs-~~G~~~~D~v~ig~---~~i~~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~L~~~g~i~~~~FS~ 150 (317)
T cd05478 75 GS-MTGILGYDTVQVGG---ISDTNQIFGLSETEPGSFFYYAPFDGILGLAYPSIASSGATPVFDNMMSQGLVSQDLFSV 150 (317)
T ss_pred ce-EEEEEeeeEEEECC---EEECCEEEEEEEecCccccccccccceeeeccchhcccCCCCHHHHHHhCCCCCCCEEEE
Confidence 55 89999999999998 67789999999877665433 35799999998753 5899999999998 89999
Q ss_pred eecCCCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHH
Q 047816 235 CYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEA 311 (620)
Q Consensus 235 ~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~ 311 (620)
||.+.+..+|.|+|||+|++++ +.|++. ....+|.|.++++.|+++.+... .+..++|||||+++++|++
T Consensus 151 ~L~~~~~~~g~l~~Gg~d~~~~~g~l~~~p~--~~~~~w~v~l~~v~v~g~~~~~~-----~~~~~iiDTGts~~~lp~~ 223 (317)
T cd05478 151 YLSSNGQQGSVVTFGGIDPSYYTGSLNWVPV--TAETYWQITVDSVTINGQVVACS-----GGCQAIVDTGTSLLVGPSS 223 (317)
T ss_pred EeCCCCCCCeEEEEcccCHHHccCceEEEEC--CCCcEEEEEeeEEEECCEEEccC-----CCCEEEECCCchhhhCCHH
Confidence 9998665679999999999873 344443 45689999999999999987532 3457999999999999999
Q ss_pred HHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEE
Q 047816 312 AFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGI 391 (620)
Q Consensus 312 ~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~ 391 (620)
++++|.+++++... ..+.|..+|.. . ..+|.|+|+| +|+++.||+++|+.+. +..|+..
T Consensus 224 ~~~~l~~~~~~~~~-------~~~~~~~~C~~--------~-~~~P~~~f~f-~g~~~~i~~~~y~~~~----~~~C~~~ 282 (317)
T cd05478 224 DIANIQSDIGASQN-------QNGEMVVNCSS--------I-SSMPDVVFTI-NGVQYPLPPSAYILQD----QGSCTSG 282 (317)
T ss_pred HHHHHHHHhCCccc-------cCCcEEeCCcC--------c-ccCCcEEEEE-CCEEEEECHHHheecC----CCEEeEE
Confidence 99999998855321 23356678842 1 4689999999 8899999999999864 5689866
Q ss_pred EecCC-CCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 392 FQNGR-DPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 392 ~~~~~-~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
++..+ ...||||++|||++|+|||++|+||||||
T Consensus 283 ~~~~~~~~~~IlG~~fl~~~y~vfD~~~~~iG~A~ 317 (317)
T cd05478 283 FQSMGLGELWILGDVFIRQYYSVFDRANNKVGLAP 317 (317)
T ss_pred EEeCCCCCeEEechHHhcceEEEEeCCCCEEeecC
Confidence 65543 46799999999999999999999999996
No 4
>cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank
Probab=100.00 E-value=2.1e-51 Score=430.15 Aligned_cols=300 Identities=27% Similarity=0.487 Sum_probs=244.9
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC----CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE----HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER 158 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~----~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~ 158 (620)
|.+|+++|.||||+|++.|++||||+++||+|..|. .|..+ +.|+|++|+||+..+ +.|.+
T Consensus 4 ~~~Y~~~i~iGtP~q~~~v~~DTGSs~~Wv~~~~C~~~~~~C~~~--~~y~~~~SsT~~~~~-------------~~~~i 68 (325)
T cd05490 4 DAQYYGEIGIGTPPQTFTVVFDTGSSNLWVPSVHCSLLDIACWLH--HKYNSSKSSTYVKNG-------------TEFAI 68 (325)
T ss_pred CCEEEEEEEECCCCcEEEEEEeCCCccEEEEcCCCCCCCccccCc--CcCCcccCcceeeCC-------------cEEEE
Confidence 668999999999999999999999999999999997 46655 689999999998754 68999
Q ss_pred eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-c
Q 047816 159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-D 230 (620)
Q Consensus 159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~ 230 (620)
.|++| ++.|.+++|+|++|+ .+++++.||+++...+. +.....+||||||++.. +++++|++||+|+ +
T Consensus 69 ~Yg~G-~~~G~~~~D~v~~g~---~~~~~~~Fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~l~~~g~i~~~ 144 (325)
T cd05490 69 QYGSG-SLSGYLSQDTVSIGG---LQVEGQLFGEAVKQPGITFIAAKFDGILGMAYPRISVDGVTPVFDNIMAQKLVEQN 144 (325)
T ss_pred EECCc-EEEEEEeeeEEEECC---EEEcCEEEEEEeeccCCcccceeeeEEEecCCccccccCCCCHHHHHHhcCCCCCC
Confidence 99995 589999999999998 67889999999877653 33346799999998754 5889999999998 8
Q ss_pred ceEEeecCCC--CCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816 231 SFSLCYGGMD--VGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY 305 (620)
Q Consensus 231 ~FSl~l~~~~--~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~ 305 (620)
.||+||++.. ..+|+|+|||+|++++ +.|++ .....+|.|++++|.|+++.... .....++|||||++
T Consensus 145 ~FS~~L~~~~~~~~~G~l~~Gg~d~~~~~g~l~~~~--~~~~~~w~v~l~~i~vg~~~~~~-----~~~~~aiiDSGTt~ 217 (325)
T cd05490 145 VFSFYLNRDPDAQPGGELMLGGTDPKYYTGDLHYVN--VTRKAYWQIHMDQVDVGSGLTLC-----KGGCEAIVDTGTSL 217 (325)
T ss_pred EEEEEEeCCCCCCCCCEEEECccCHHHcCCceEEEE--cCcceEEEEEeeEEEECCeeeec-----CCCCEEEECCCCcc
Confidence 9999998642 2469999999999873 33443 34568999999999998764321 23457999999999
Q ss_pred eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCC
Q 047816 306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRG 385 (620)
Q Consensus 306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~ 385 (620)
+++|.+++++|.+++.+. ....+.|..+|.. ...+|+|+|+| +|+.|.|+|++|+++....+.
T Consensus 218 ~~~p~~~~~~l~~~~~~~-------~~~~~~~~~~C~~---------~~~~P~i~f~f-gg~~~~l~~~~y~~~~~~~~~ 280 (325)
T cd05490 218 ITGPVEEVRALQKAIGAV-------PLIQGEYMIDCEK---------IPTLPVISFSL-GGKVYPLTGEDYILKVSQRGT 280 (325)
T ss_pred ccCCHHHHHHHHHHhCCc-------cccCCCEEecccc---------cccCCCEEEEE-CCEEEEEChHHeEEeccCCCC
Confidence 999999999999888542 1123456778842 14689999999 899999999999997654445
Q ss_pred eEEEEEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 386 AYCLGIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 386 ~~Cl~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
..|+..+.. ..+..||||++|||++|+|||++++|||||+
T Consensus 281 ~~C~~~~~~~~~~~~~~~~~ilGd~flr~~y~vfD~~~~~IGfA~ 325 (325)
T cd05490 281 TICLSGFMGLDIPPPAGPLWILGDVFIGRYYTVFDRDNDRVGFAK 325 (325)
T ss_pred CEEeeEEEECCCCCCCCceEEEChHhheeeEEEEEcCCcEeeccC
Confidence 679765443 2245799999999999999999999999995
No 5
>KOG1339 consensus Aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.7e-50 Score=433.36 Aligned_cols=340 Identities=37% Similarity=0.685 Sum_probs=281.6
Q ss_pred CCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC-CCCCCCCCCCCCCCCcccccccCcCC-cc----cCCCCC
Q 047816 79 DLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE-HCGDHQDPKFEPDLSSTYQPVKCNLY-CN----CDRERA 152 (620)
Q Consensus 79 ~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~-~C~~~~~~~y~p~~SsT~~~~~c~~~-c~----c~~~~~ 152 (620)
....+++|+++|.||||||+|.|++||||+++||+|..|. .|..+.+..|+|++|+||+...|.+. |. |....+
T Consensus 40 ~~~~~~~Y~~~i~IGTPpq~f~v~~DTGS~~lWV~c~~c~~~C~~~~~~~f~p~~SSt~~~~~c~~~~c~~~~~~~~~~~ 119 (398)
T KOG1339|consen 40 SSYSSGEYYGNISIGTPPQSFTVVLDTGSDLLWVPCAPCSSACYSQHNPIFDPSASSTYKSVGCSSPRCKSLPQSCSPNS 119 (398)
T ss_pred ccccccccEEEEecCCCCeeeEEEEeCCCCceeeccccccccccccCCCccCccccccccccCCCCccccccccCcccCC
Confidence 3445678999999999999999999999999999999999 89875445699999999999999974 52 444567
Q ss_pred cceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcC-CCcceEEecCCCCCchHHHHHHcCCcccc
Q 047816 153 QCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYS-QHADGIIGLGRGDLSVVDQLVEKGVISDS 231 (620)
Q Consensus 153 ~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~-~~~dGIlGLg~~~~s~~~~L~~~g~I~~~ 231 (620)
.|.|.+.|+||+.+.|.+++|+|++++.+.+...++.|||+..+.+.+.. ...+||||||++.+++..|+...+...++
T Consensus 120 ~C~y~i~Ygd~~~~~G~l~~Dtv~~~~~~~~~~~~~~FGc~~~~~g~~~~~~~~dGIlGLg~~~~S~~~q~~~~~~~~~~ 199 (398)
T KOG1339|consen 120 SCPYSIQYGDGSSTSGYLATDTVTFGGTTSLPVPNQTFGCGTNNPGSFGLFAAFDGILGLGRGSLSVPSQLPSFYNAINV 199 (398)
T ss_pred cCceEEEeCCCCceeEEEEEEEEEEccccccccccEEEEeeecCccccccccccceEeecCCCCccceeecccccCCcee
Confidence 89999999999999999999999999853356678999999998765222 46899999999999999999988777679
Q ss_pred eEEeecCCCC---CCceEEECCCCCCCCce---EeecCCCCCCeeEEEEeEEEEccEEecCCCCccCC-CCceEeeccce
Q 047816 232 FSLCYGGMDV---GGGAMVLGGISPPKDMV---FTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDG-KHGTVLDSGTT 304 (620)
Q Consensus 232 FSl~l~~~~~---~~G~l~fGgiD~~~~~~---~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~-~~~ailDSGtt 304 (620)
||+||.+.+. .+|.|+||++|+.++.. |++.......+|.|.+++|.|+++. .+....+.. ..++++||||+
T Consensus 200 FS~cL~~~~~~~~~~G~i~fG~~d~~~~~~~l~~tPl~~~~~~~y~v~l~~I~vgg~~-~~~~~~~~~~~~~~iiDSGTs 278 (398)
T KOG1339|consen 200 FSYCLSSNGSPSSGGGSIIFGGVDSSHYTGSLTYTPLLSNPSTYYQVNLDGISVGGKR-PIGSSLFCTDGGGAIIDSGTS 278 (398)
T ss_pred EEEEeCCCCCCCCCCcEEEECCCcccCcCCceEEEeeccCCCccEEEEEeEEEECCcc-CCCcceEecCCCCEEEECCcc
Confidence 9999998653 47999999999998554 6665543335999999999999977 655555555 47899999999
Q ss_pred eeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccC
Q 047816 305 YAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVR 384 (620)
Q Consensus 305 ~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~ 384 (620)
+++||+++|++|.+++.+++. .......+.+.|+...... ..+|.|+|+|++|+.|.+++++|+++.....
T Consensus 279 ~t~lp~~~y~~i~~~~~~~~~----~~~~~~~~~~~C~~~~~~~-----~~~P~i~~~f~~g~~~~l~~~~y~~~~~~~~ 349 (398)
T KOG1339|consen 279 LTYLPTSAYNALREAIGAEVS----VVGTDGEYFVPCFSISTSG-----VKLPDITFHFGGGAVFSLPPKNYLVEVSDGG 349 (398)
T ss_pred eeeccHHHHHHHHHHHHhhee----ccccCCceeeecccCCCCc-----ccCCcEEEEECCCcEEEeCccceEEEECCCC
Confidence 999999999999999988741 0224456778998653221 3589999999559999999999999876532
Q ss_pred CeEEEEEEecCCC-CceeehHhhhceEEEEEeCC-CCEEEEEe--cCCc
Q 047816 385 GAYCLGIFQNGRD-PTTLLGGIIVRNTLVMYDRE-HSKIGFWK--TNCS 429 (620)
Q Consensus 385 ~~~Cl~~~~~~~~-~~~ILG~~fLr~~yvvfD~e-n~rIGfA~--~~c~ 429 (620)
.. |++.+...+. ..||||+.|+|+++++||.. ++|||||+ .+|.
T Consensus 350 ~~-Cl~~~~~~~~~~~~ilG~~~~~~~~~~~D~~~~~riGfa~~~~~c~ 397 (398)
T KOG1339|consen 350 GV-CLAFFNGMDSGPLWILGDVFQQNYLVVFDLGENSRVGFAPALTNCS 397 (398)
T ss_pred Cc-eeeEEecCCCCceEEEchHHhCCEEEEEeCCCCCEEEeccccccCC
Confidence 22 9998877644 48999999999999999999 99999999 6664
No 6
>cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family: it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalyt
Probab=100.00 E-value=5.4e-51 Score=425.27 Aligned_cols=296 Identities=28% Similarity=0.525 Sum_probs=243.6
Q ss_pred EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccC
Q 047816 86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEM 163 (620)
Q Consensus 86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg 163 (620)
|+++|+||||+|+++|++||||+++||+|..|. .|..+ +.|+|++|+|++..+ +.|++.|++|
T Consensus 1 Y~~~i~iGtP~Q~~~v~~DTGSs~~Wv~s~~C~~~~C~~~--~~y~~~~SsT~~~~~-------------~~~~i~Yg~g 65 (316)
T cd05486 1 YFGQISIGTPPQNFTVIFDTGSSNLWVPSIYCTSQACTKH--NRFQPSESSTYVSNG-------------EAFSIQYGTG 65 (316)
T ss_pred CeEEEEECCCCcEEEEEEcCCCccEEEecCCCCCcccCcc--ceECCCCCcccccCC-------------cEEEEEeCCc
Confidence 679999999999999999999999999999997 57655 789999999998865 6899999985
Q ss_pred CceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-cceEEe
Q 047816 164 SSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-DSFSLC 235 (620)
Q Consensus 164 ~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~~FSl~ 235 (620)
++.|.+++|+|++++ .++.++.||++..+.+. +.....+||||||++.. +++++|++||+|+ +.||+|
T Consensus 66 -~~~G~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~p~~~~l~~qg~i~~~~FS~~ 141 (316)
T cd05486 66 -SLTGIIGIDQVTVEG---ITVQNQQFAESVSEPGSTFQDSEFDGILGLAYPSLAVDGVTPVFDNMMAQNLVELPMFSVY 141 (316)
T ss_pred -EEEEEeeecEEEECC---EEEcCEEEEEeeccCcccccccccceEeccCchhhccCCCCCHHHHHHhcCCCCCCEEEEE
Confidence 689999999999998 67889999998776553 33456899999998764 4799999999998 899999
Q ss_pred ecCCC--CCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecH
Q 047816 236 YGGMD--VGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPE 310 (620)
Q Consensus 236 l~~~~--~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~ 310 (620)
|++.. ..+|+|+|||+|+++ .+.|+++ ....+|.|.+++|.|+++.+.. .....++|||||+++++|+
T Consensus 142 L~~~~~~~~~g~l~fGg~d~~~~~g~l~~~pi--~~~~~w~v~l~~i~v~g~~~~~-----~~~~~aiiDTGTs~~~lP~ 214 (316)
T cd05486 142 MSRNPNSADGGELVFGGFDTSRFSGQLNWVPV--TVQGYWQIQLDNIQVGGTVIFC-----SDGCQAIVDTGTSLITGPS 214 (316)
T ss_pred EccCCCCCCCcEEEEcccCHHHcccceEEEEC--CCceEEEEEeeEEEEecceEec-----CCCCEEEECCCcchhhcCH
Confidence 98642 357999999999987 3445543 4578999999999999987642 2235799999999999999
Q ss_pred HHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEE
Q 047816 311 AAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLG 390 (620)
Q Consensus 311 ~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~ 390 (620)
++++++.+++.+. ..++.|.++|.. .+.+|+|+|+| +|+.++|+|++|++.....++..|+.
T Consensus 215 ~~~~~l~~~~~~~--------~~~~~~~~~C~~---------~~~~p~i~f~f-~g~~~~l~~~~y~~~~~~~~~~~C~~ 276 (316)
T cd05486 215 GDIKQLQNYIGAT--------ATDGEYGVDCST---------LSLMPSVTFTI-NGIPYSLSPQAYTLEDQSDGGGYCSS 276 (316)
T ss_pred HHHHHHHHHhCCc--------ccCCcEEEeccc---------cccCCCEEEEE-CCEEEEeCHHHeEEecccCCCCEEee
Confidence 9999998877432 123446678842 14689999999 89999999999998753334568975
Q ss_pred EEecC-----CCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 391 IFQNG-----RDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 391 ~~~~~-----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
.++.. ..+.||||++|||++|+|||.+++|||||+
T Consensus 277 ~~~~~~~~~~~~~~~ILGd~flr~~y~vfD~~~~~IGfA~ 316 (316)
T cd05486 277 GFQGLDIPPPAGPLWILGDVFIRQYYSVFDRGNNRVGFAP 316 (316)
T ss_pred EEEECCCCCCCCCeEEEchHHhcceEEEEeCCCCEeeccC
Confidence 54432 235799999999999999999999999995
No 7
>cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate r
Probab=100.00 E-value=1.4e-50 Score=423.86 Aligned_cols=301 Identities=26% Similarity=0.504 Sum_probs=246.6
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCC----CCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEH----CGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER 158 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~----C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~ 158 (620)
+..|+++|+||||+|+++|++||||+++||++..|.. |..+ +.|+|++|+|++..+ |.|++
T Consensus 6 ~~~y~~~i~iGtP~q~~~v~~DTGSs~~Wv~~~~C~~~~~~c~~~--~~y~~~~SsT~~~~~-------------~~~~~ 70 (326)
T cd05487 6 DTQYYGEIGIGTPPQTFKVVFDTGSSNLWVPSSKCSPLYTACVTH--NLYDASDSSTYKENG-------------TEFTI 70 (326)
T ss_pred CCeEEEEEEECCCCcEEEEEEeCCccceEEccCCCcCcchhhccc--CcCCCCCCeeeeECC-------------EEEEE
Confidence 5689999999999999999999999999999988874 5544 799999999999865 68999
Q ss_pred eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccC-CCcCCCcceEEecCCCCC------chHHHHHHcCCcc-c
Q 047816 159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETG-DLYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-D 230 (620)
Q Consensus 159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~-~~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~ 230 (620)
.|++| ++.|.+++|+|++|+ ..+ ++.||++..... .+.....+||||||++.. +++++|++||+|+ +
T Consensus 71 ~Yg~g-~~~G~~~~D~v~~g~---~~~-~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~L~~qg~i~~~ 145 (326)
T cd05487 71 HYASG-TVKGFLSQDIVTVGG---IPV-TQMFGEVTALPAIPFMLAKFDGVLGMGYPKQAIGGVTPVFDNIMSQGVLKED 145 (326)
T ss_pred EeCCc-eEEEEEeeeEEEECC---EEe-eEEEEEEEeccCCccceeecceEEecCChhhcccCCCCHHHHHHhcCCCCCC
Confidence 99985 599999999999998 444 478999887542 233346899999998653 5899999999998 8
Q ss_pred ceEEeecCCC--CCCceEEECCCCCCCCc-eEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeee
Q 047816 231 SFSLCYGGMD--VGGGAMVLGGISPPKDM-VFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAY 307 (620)
Q Consensus 231 ~FSl~l~~~~--~~~G~l~fGgiD~~~~~-~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~ 307 (620)
.||+||.+.+ ...|.|+|||+|++++. .+.+++.....+|.|+++++.|+++.+... .+..++|||||++++
T Consensus 146 ~FS~~L~~~~~~~~~G~l~fGg~d~~~y~g~l~~~~~~~~~~w~v~l~~i~vg~~~~~~~-----~~~~aiiDSGts~~~ 220 (326)
T cd05487 146 VFSVYYSRDSSHSLGGEIVLGGSDPQHYQGDFHYINTSKTGFWQIQMKGVSVGSSTLLCE-----DGCTAVVDTGASFIS 220 (326)
T ss_pred EEEEEEeCCCCCCCCcEEEECCcChhhccCceEEEECCcCceEEEEecEEEECCEEEecC-----CCCEEEECCCccchh
Confidence 9999998653 35799999999998843 344444456789999999999999876432 235799999999999
Q ss_pred ecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeE
Q 047816 308 LPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAY 387 (620)
Q Consensus 308 LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~ 387 (620)
+|.++++++++++++... ...|..+|.. ...+|+|+|+| +|.+++|++++|+++....++..
T Consensus 221 lP~~~~~~l~~~~~~~~~--------~~~y~~~C~~---------~~~~P~i~f~f-gg~~~~v~~~~yi~~~~~~~~~~ 282 (326)
T cd05487 221 GPTSSISKLMEALGAKER--------LGDYVVKCNE---------VPTLPDISFHL-GGKEYTLSSSDYVLQDSDFSDKL 282 (326)
T ss_pred CcHHHHHHHHHHhCCccc--------CCCEEEeccc---------cCCCCCEEEEE-CCEEEEeCHHHhEEeccCCCCCE
Confidence 999999999998854321 3456778843 14689999999 88999999999999876545678
Q ss_pred EEEEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816 388 CLGIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT 426 (620)
Q Consensus 388 Cl~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~ 426 (620)
|+..++. ..++.||||++|||++|+|||++++|||||++
T Consensus 283 C~~~~~~~~~~~~~~~~~ilG~~flr~~y~vfD~~~~~IGfA~a 326 (326)
T cd05487 283 CTVAFHAMDIPPPTGPLWVLGATFIRKFYTEFDRQNNRIGFALA 326 (326)
T ss_pred EEEEEEeCCCCCCCCCeEEEehHHhhccEEEEeCCCCEEeeeeC
Confidence 8755443 12357999999999999999999999999985
No 8
>PTZ00147 plasmepsin-1; Provisional
Probab=100.00 E-value=2.8e-50 Score=432.51 Aligned_cols=310 Identities=24% Similarity=0.359 Sum_probs=246.6
Q ss_pred CceeeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCC
Q 047816 70 PNARMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDR 149 (620)
Q Consensus 70 ~~~~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~ 149 (620)
.+..+++.+.. +.+|+++|+||||+|++.|++||||+++||+|..|..|..+.++.|||++|+||+..+
T Consensus 126 ~~~~v~L~n~~--n~~Y~~~I~IGTP~Q~f~Vi~DTGSsdlWVps~~C~~~~C~~~~~yd~s~SsT~~~~~--------- 194 (453)
T PTZ00147 126 EFDNVELKDLA--NVMSYGEAKLGDNGQKFNFIFDTGSANLWVPSIKCTTEGCETKNLYDSSKSKTYEKDG--------- 194 (453)
T ss_pred CCCeeeccccC--CCEEEEEEEECCCCeEEEEEEeCCCCcEEEeecCCCcccccCCCccCCccCcceEECC---------
Confidence 44556666543 6789999999999999999999999999999999985443344799999999999865
Q ss_pred CCCcceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC---CcCCCcceEEecCCCCC------chHH
Q 047816 150 ERAQCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD---LYSQHADGIIGLGRGDL------SVVD 220 (620)
Q Consensus 150 ~~~~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~---~~~~~~dGIlGLg~~~~------s~~~ 220 (620)
+.|++.|++| ++.|.+++|+|++|+ .+++ ..|+++....+. +.....|||||||++.. +++.
T Consensus 195 ----~~f~i~Yg~G-svsG~~~~DtVtiG~---~~v~-~qF~~~~~~~~f~~~~~~~~~DGILGLG~~~~S~~~~~p~~~ 265 (453)
T PTZ00147 195 ----TKVEMNYVSG-TVSGFFSKDLVTIGN---LSVP-YKFIEVTDTNGFEPFYTESDFDGIFGLGWKDLSIGSVDPYVV 265 (453)
T ss_pred ----CEEEEEeCCC-CEEEEEEEEEEEECC---EEEE-EEEEEEEeccCcccccccccccceecccCCccccccCCCHHH
Confidence 5899999985 689999999999998 5555 578888765431 22346799999999764 4788
Q ss_pred HHHHcCCcc-cceEEeecCCCCCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCc
Q 047816 221 QLVEKGVIS-DSFSLCYGGMDVGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHG 296 (620)
Q Consensus 221 ~L~~~g~I~-~~FSl~l~~~~~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ 296 (620)
+|++||+|+ ++||+||++.+..+|.|+|||+|+++ ++.|++. ....+|.|.++ +.+++... ....
T Consensus 266 ~L~~qg~I~~~vFS~~L~~~~~~~G~L~fGGiD~~ky~G~l~y~pl--~~~~~W~V~l~-~~vg~~~~--------~~~~ 334 (453)
T PTZ00147 266 ELKNQNKIEQAVFTFYLPPEDKHKGYLTIGGIEERFYEGPLTYEKL--NHDLYWQVDLD-VHFGNVSS--------EKAN 334 (453)
T ss_pred HHHHcCCCCccEEEEEecCCCCCCeEEEECCcChhhcCCceEEEEc--CCCceEEEEEE-EEECCEec--------Ccee
Confidence 999999998 79999998766668999999999997 3445544 35679999998 47765431 2457
Q ss_pred eEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCc
Q 047816 297 TVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENY 376 (620)
Q Consensus 297 ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~y 376 (620)
++|||||+++++|+++++++.+++++... ...+.|.++|+. ..+|+|+|.| +|..++|+|++|
T Consensus 335 aIiDSGTsli~lP~~~~~ai~~~l~~~~~------~~~~~y~~~C~~----------~~lP~~~f~f-~g~~~~L~p~~y 397 (453)
T PTZ00147 335 VIVDSGTSVITVPTEFLNKFVESLDVFKV------PFLPLYVTTCNN----------TKLPTLEFRS-PNKVYTLEPEYY 397 (453)
T ss_pred EEECCCCchhcCCHHHHHHHHHHhCCeec------CCCCeEEEeCCC----------CCCCeEEEEE-CCEEEEECHHHh
Confidence 99999999999999999999998854211 122346678842 3689999999 789999999999
Q ss_pred EEEecccCCeEEEEEEec-C-CCCceeehHhhhceEEEEEeCCCCEEEEEecC
Q 047816 377 LFRHSKVRGAYCLGIFQN-G-RDPTTLLGGIIVRNTLVMYDREHSKIGFWKTN 427 (620)
Q Consensus 377 i~~~~~~~~~~Cl~~~~~-~-~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~ 427 (620)
+....+.+...|+..+.. . ..+.||||++|||++|+|||++++|||||+++
T Consensus 398 i~~~~~~~~~~C~~~i~~~~~~~~~~ILGd~FLr~~YtVFD~~n~rIGfA~a~ 450 (453)
T PTZ00147 398 LQPIEDIGSALCMLNIIPIDLEKNTFILGDPFMRKYFTVFDYDNHTVGFALAK 450 (453)
T ss_pred eeccccCCCcEEEEEEEECCCCCCCEEECHHHhccEEEEEECCCCEEEEEEec
Confidence 986544344679754433 2 23579999999999999999999999999986
No 9
>cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant-specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases. They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved fro
Probab=100.00 E-value=2.7e-50 Score=419.88 Aligned_cols=290 Identities=28% Similarity=0.495 Sum_probs=239.2
Q ss_pred cceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC---CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816 82 LNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE---HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER 158 (620)
Q Consensus 82 ~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~---~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~ 158 (620)
.+.+|+++|+||||+|++.|++||||+++||+|..|. .|..+ +.|+|++|+|++..+ ..+.+
T Consensus 7 ~~~~Y~~~i~iGtP~Q~~~v~~DTGSs~lWv~~~~C~~~~~C~~~--~~y~~~~SsT~~~~~-------------~~~~i 71 (317)
T cd06098 7 LDAQYFGEIGIGTPPQKFTVIFDTGSSNLWVPSSKCYFSIACYFH--SKYKSSKSSTYKKNG-------------TSASI 71 (317)
T ss_pred CCCEEEEEEEECCCCeEEEEEECCCccceEEecCCCCCCcccccc--CcCCcccCCCcccCC-------------CEEEE
Confidence 3678999999999999999999999999999999996 68766 799999999998865 57899
Q ss_pred eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-c
Q 047816 159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-D 230 (620)
Q Consensus 159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~ 230 (620)
.|++| ++.|.+++|+|++|+ .+++++.||++..+.+. +.....+||||||++.. +++.+|++||+|+ +
T Consensus 72 ~Yg~G-~~~G~~~~D~v~ig~---~~v~~~~f~~~~~~~~~~~~~~~~dGilGLg~~~~s~~~~~~~~~~l~~qg~i~~~ 147 (317)
T cd06098 72 QYGTG-SISGFFSQDSVTVGD---LVVKNQVFIEATKEPGLTFLLAKFDGILGLGFQEISVGKAVPVWYNMVEQGLVKEP 147 (317)
T ss_pred EcCCc-eEEEEEEeeEEEECC---EEECCEEEEEEEecCCccccccccceeccccccchhhcCCCCHHHHHHhcCCCCCC
Confidence 99985 589999999999998 67889999999876543 34456899999999754 4788999999998 8
Q ss_pred ceEEeecCCC--CCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816 231 SFSLCYGGMD--VGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY 305 (620)
Q Consensus 231 ~FSl~l~~~~--~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~ 305 (620)
.||+||++.. ..+|+|+|||+|++++ +.|+++ ....+|.|++++|.|+++.+.... ....++|||||++
T Consensus 148 ~FS~~L~~~~~~~~~G~l~fGg~d~~~~~g~l~~~pv--~~~~~w~v~l~~i~v~g~~~~~~~----~~~~aivDTGTs~ 221 (317)
T cd06098 148 VFSFWLNRNPDEEEGGELVFGGVDPKHFKGEHTYVPV--TRKGYWQFEMGDVLIGGKSTGFCA----GGCAAIADSGTSL 221 (317)
T ss_pred EEEEEEecCCCCCCCcEEEECccChhhcccceEEEec--CcCcEEEEEeCeEEECCEEeeecC----CCcEEEEecCCcc
Confidence 9999998642 3579999999999974 345544 356799999999999998765422 3457999999999
Q ss_pred eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCC
Q 047816 306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRG 385 (620)
Q Consensus 306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~ 385 (620)
+++|+++++++. +..+|... ..+|+|+|+| +|+++.|+|++|+++......
T Consensus 222 ~~lP~~~~~~i~-------------------~~~~C~~~---------~~~P~i~f~f-~g~~~~l~~~~yi~~~~~~~~ 272 (317)
T cd06098 222 LAGPTTIVTQIN-------------------SAVDCNSL---------SSMPNVSFTI-GGKTFELTPEQYILKVGEGAA 272 (317)
T ss_pred eeCCHHHHHhhh-------------------ccCCcccc---------ccCCcEEEEE-CCEEEEEChHHeEEeecCCCC
Confidence 999998776542 34578421 4689999999 889999999999987655445
Q ss_pred eEEEEEEecC-----CCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 386 AYCLGIFQNG-----RDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 386 ~~Cl~~~~~~-----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
..|+..++.. .++.||||++|||++|+|||++|+|||||+
T Consensus 273 ~~C~~~~~~~~~~~~~~~~~IlGd~Flr~~y~VfD~~~~~iGfA~ 317 (317)
T cd06098 273 AQCISGFTALDVPPPRGPLWILGDVFMGAYHTVFDYGNLRVGFAE 317 (317)
T ss_pred CEEeceEEECCCCCCCCCeEEechHHhcccEEEEeCCCCEEeecC
Confidence 6897544321 235799999999999999999999999995
No 10
>cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal l
Probab=100.00 E-value=3.6e-50 Score=420.60 Aligned_cols=296 Identities=30% Similarity=0.554 Sum_probs=240.3
Q ss_pred eeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-c----ccCCCCCcceeEE
Q 047816 84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-C----NCDRERAQCVYER 158 (620)
Q Consensus 84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c----~c~~~~~~~~~~~ 158 (620)
++|+++|.||||+|++.|+|||||+++||+|..|..|..+.++.|+|++|+|++.+.|++. | .|. .+.|.|.+
T Consensus 2 ~~Y~~~i~vGtP~Q~~~v~~DTGS~~~wv~~~~C~~c~~~~~~~y~~~~Sst~~~~~C~~~~c~~~~~~~--~~~~~~~i 79 (326)
T cd06096 2 AYYFIDIFIGNPPQKQSLILDTGSSSLSFPCSQCKNCGIHMEPPYNLNNSITSSILYCDCNKCCYCLSCL--NNKCEYSI 79 (326)
T ss_pred ceEEEEEEecCCCeEEEEEEeCCCCceEEecCCCCCcCCCCCCCcCcccccccccccCCCccccccCcCC--CCcCcEEE
Confidence 4799999999999999999999999999999999999887778999999999999999863 4 243 35699999
Q ss_pred eeccCCceeEEEEEEEEEeCCCCCC----CccceEEEEEEeccCCCcCCCcceEEecCCCCC----chHHHHHHcCCcc-
Q 047816 159 KYAEMSSSSGVLGEDIISFGNESDL----KPQRAVFGCENVETGDLYSQHADGIIGLGRGDL----SVVDQLVEKGVIS- 229 (620)
Q Consensus 159 ~Y~dg~~~~G~~~~D~v~lg~~~~~----~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~----s~~~~L~~~g~I~- 229 (620)
.|++|+.+.|.+++|+|+||+.... ...++.|||+..+.+.+.....+||||||+... +...+|.+++.+.
T Consensus 80 ~Y~~gs~~~G~~~~D~v~lg~~~~~~~~~~~~~~~fg~~~~~~~~~~~~~~~GilGLg~~~~~~~~~~~~~l~~~~~~~~ 159 (326)
T cd06096 80 SYSEGSSISGFYFSDFVSFESYLNSNSEKESFKKIFGCHTHETNLFLTQQATGILGLSLTKNNGLPTPIILLFTKRPKLK 159 (326)
T ss_pred EECCCCceeeEEEEEEEEeccCCCCccccccccEEeccCccccCcccccccceEEEccCCcccccCchhHHHHHhccccc
Confidence 9999878999999999999984310 112578999988877666667899999999764 2444566776653
Q ss_pred --cceEEeecCCCCCCceEEECCCCCCCC-------------ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCC
Q 047816 230 --DSFSLCYGGMDVGGGAMVLGGISPPKD-------------MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGK 294 (620)
Q Consensus 230 --~~FSl~l~~~~~~~G~l~fGgiD~~~~-------------~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~ 294 (620)
++||+||++ .+|.|+|||+|+.++ +.|++. ....+|.|.+++|+|+++.... .....
T Consensus 160 ~~~~FS~~l~~---~~G~l~~Gg~d~~~~~~~~~~~~~~~~~~~~~p~--~~~~~y~v~l~~i~vg~~~~~~---~~~~~ 231 (326)
T cd06096 160 KDKIFSICLSE---DGGELTIGGYDKDYTVRNSSIGNNKVSKIVWTPI--TRKYYYYVKLEGLSVYGTTSNS---GNTKG 231 (326)
T ss_pred CCceEEEEEcC---CCeEEEECccChhhhcccccccccccCCceEEec--cCCceEEEEEEEEEEcccccce---ecccC
Confidence 799999986 469999999998753 244443 3458999999999999875110 11345
Q ss_pred CceEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCC
Q 047816 295 HGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPE 374 (620)
Q Consensus 295 ~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~ 374 (620)
..++|||||++++||+++++++.+++ |+|+|+|++|+++.++|+
T Consensus 232 ~~aivDSGTs~~~lp~~~~~~l~~~~------------------------------------P~i~~~f~~g~~~~i~p~ 275 (326)
T cd06096 232 LGMLVDSGSTLSHFPEDLYNKINNFF------------------------------------PTITIIFENNLKIDWKPS 275 (326)
T ss_pred CCEEEeCCCCcccCCHHHHHHHHhhc------------------------------------CcEEEEEcCCcEEEECHH
Confidence 68999999999999999998877654 789999965899999999
Q ss_pred CcEEEecccCCeEEEEEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEecCCc
Q 047816 375 NYLFRHSKVRGAYCLGIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCS 429 (620)
Q Consensus 375 ~yi~~~~~~~~~~Cl~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~ 429 (620)
+|++.... ..+|+++. .. ++.+|||++|||++|+|||+|++|||||+++|.
T Consensus 276 ~y~~~~~~--~~c~~~~~-~~-~~~~ILG~~flr~~y~vFD~~~~riGfa~~~C~ 326 (326)
T cd06096 276 SYLYKKES--FWCKGGEK-SV-SNKPILGASFFKNKQIIFDLDNNRIGFVESNCP 326 (326)
T ss_pred HhccccCC--ceEEEEEe-cC-CCceEEChHHhcCcEEEEECcCCEEeeEcCCCC
Confidence 99987543 33555543 33 468999999999999999999999999999994
No 11
>PTZ00013 plasmepsin 4 (PM4); Provisional
Probab=100.00 E-value=8.2e-50 Score=428.04 Aligned_cols=309 Identities=25% Similarity=0.381 Sum_probs=244.9
Q ss_pred CCceeeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcc
Q 047816 69 HPNARMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCN 146 (620)
Q Consensus 69 ~~~~~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~ 146 (620)
..+..+++.+.. +.+|+++|+||||+|++.|++||||+++||+|..|. .|..+ +.|+|++|+|++..+
T Consensus 124 ~~~~~~~l~d~~--n~~Yy~~i~IGTP~Q~f~vi~DTGSsdlWV~s~~C~~~~C~~~--~~yd~s~SsT~~~~~------ 193 (450)
T PTZ00013 124 SENDVIELDDVA--NIMFYGEGEVGDNHQKFMLIFDTGSANLWVPSKKCDSIGCSIK--NLYDSSKSKSYEKDG------ 193 (450)
T ss_pred cCCCceeeeccC--CCEEEEEEEECCCCeEEEEEEeCCCCceEEecccCCccccccC--CCccCccCcccccCC------
Confidence 344556666543 668889999999999999999999999999999997 56655 799999999999865
Q ss_pred cCCCCCcceeEEeeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC---CcCCCcceEEecCCCCC------c
Q 047816 147 CDRERAQCVYERKYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD---LYSQHADGIIGLGRGDL------S 217 (620)
Q Consensus 147 c~~~~~~~~~~~~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~---~~~~~~dGIlGLg~~~~------s 217 (620)
+.+++.|++| ++.|.+++|+|++|+ +++. ..|+++....+. +....+|||||||++.. +
T Consensus 194 -------~~~~i~YG~G-sv~G~~~~Dtv~iG~---~~~~-~~f~~~~~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~~p 261 (450)
T PTZ00013 194 -------TKVDITYGSG-TVKGFFSKDLVTLGH---LSMP-YKFIEVTDTDDLEPIYSSSEFDGILGLGWKDLSIGSIDP 261 (450)
T ss_pred -------cEEEEEECCc-eEEEEEEEEEEEECC---EEEc-cEEEEEEeccccccceecccccceecccCCccccccCCC
Confidence 5899999985 599999999999998 4554 578887665321 22346799999999764 5
Q ss_pred hHHHHHHcCCcc-cceEEeecCCCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCC
Q 047816 218 VVDQLVEKGVIS-DSFSLCYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDG 293 (620)
Q Consensus 218 ~~~~L~~~g~I~-~~FSl~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~ 293 (620)
++++|++||+|+ ++||+||++.+..+|.|+|||+|++++ +.|+++ ....+|.|.++ +.++.... .
T Consensus 262 ~~~~L~~qg~I~~~vFS~~L~~~~~~~G~L~fGGiD~~~y~G~L~y~pv--~~~~yW~I~l~-v~~G~~~~--------~ 330 (450)
T PTZ00013 262 IVVELKNQNKIDNALFTFYLPVHDVHAGYLTIGGIEEKFYEGNITYEKL--NHDLYWQIDLD-VHFGKQTM--------Q 330 (450)
T ss_pred HHHHHHhccCcCCcEEEEEecCCCCCCCEEEECCcCccccccceEEEEc--CcCceEEEEEE-EEECceec--------c
Confidence 889999999998 799999987655689999999999973 445444 35679999998 66654322 2
Q ss_pred CCceEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCC
Q 047816 294 KHGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAP 373 (620)
Q Consensus 294 ~~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~ 373 (620)
+..+++||||+++++|+++++++++++..... ...+.|..+|+. +.+|+|+|+| +|.+++|+|
T Consensus 331 ~~~aIlDSGTSli~lP~~~~~~i~~~l~~~~~------~~~~~y~~~C~~----------~~lP~i~F~~-~g~~~~L~p 393 (450)
T PTZ00013 331 KANVIVDSGTTTITAPSEFLNKFFANLNVIKV------PFLPFYVTTCDN----------KEMPTLEFKS-ANNTYTLEP 393 (450)
T ss_pred ccceEECCCCccccCCHHHHHHHHHHhCCeec------CCCCeEEeecCC----------CCCCeEEEEE-CCEEEEECH
Confidence 35799999999999999999999988754311 122346678842 4689999999 789999999
Q ss_pred CCcEEEecccCCeEEEEEEec-C-CCCceeehHhhhceEEEEEeCCCCEEEEEecC
Q 047816 374 ENYLFRHSKVRGAYCLGIFQN-G-RDPTTLLGGIIVRNTLVMYDREHSKIGFWKTN 427 (620)
Q Consensus 374 ~~yi~~~~~~~~~~Cl~~~~~-~-~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~ 427 (620)
++|+......++..|+..+.. . .++.||||++|||++|+|||++++|||||+++
T Consensus 394 ~~Yi~~~~~~~~~~C~~~i~~~~~~~~~~ILGd~FLr~~Y~VFD~~n~rIGfA~a~ 449 (450)
T PTZ00013 394 EYYMNPLLDVDDTLCMITMLPVDIDDNTFILGDPFMRKYFTVFDYDKESVGFAIAK 449 (450)
T ss_pred HHheehhccCCCCeeEEEEEECCCCCCCEEECHHHhccEEEEEECCCCEEEEEEeC
Confidence 999976443345689644443 2 24689999999999999999999999999975
No 12
>cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in the gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40-residue prosequence. Its self-conversion to a mature enzyme is triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. The mature enzyme has a pseudo-2-fold symmetry axis that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an exten
Probab=100.00 E-value=8.1e-50 Score=416.92 Aligned_cols=297 Identities=26% Similarity=0.520 Sum_probs=245.2
Q ss_pred eeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeec
Q 047816 84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYA 161 (620)
Q Consensus 84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~ 161 (620)
..|+++|+||||+|++.|++||||+++||+|..|. .|..+ +.|+|++|+||+..+ |.|++.|+
T Consensus 2 ~~y~~~i~iGtP~q~~~v~~DTGS~~~wv~~~~C~~~~C~~~--~~f~~~~SsT~~~~~-------------~~~~~~Yg 66 (318)
T cd05477 2 MSYYGEISIGTPPQNFLVLFDTGSSNLWVPSVLCQSQACTNH--TKFNPSQSSTYSTNG-------------ETFSLQYG 66 (318)
T ss_pred cEEEEEEEECCCCcEEEEEEeCCCccEEEccCCCCCcccccc--CCCCcccCCCceECC-------------cEEEEEEC
Confidence 47999999999999999999999999999999998 46654 799999999999865 68999999
Q ss_pred cCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCC------CchHHHHHHcCCcc-cceE
Q 047816 162 EMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGD------LSVVDQLVEKGVIS-DSFS 233 (620)
Q Consensus 162 dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~------~s~~~~L~~~g~I~-~~FS 233 (620)
+| ++.|.+++|+|++|+ ..+.++.|||+....+. +.....+||||||++. .+++++|+++|.|+ ++||
T Consensus 67 ~G-s~~G~~~~D~i~~g~---~~i~~~~Fg~~~~~~~~~~~~~~~~GilGLg~~~~s~~~~~~~~~~L~~~g~i~~~~FS 142 (318)
T cd05477 67 SG-SLTGIFGYDTVTVQG---IIITNQEFGLSETEPGTNFVYAQFDGILGLAYPSISAGGATTVMQGMMQQNLLQAPIFS 142 (318)
T ss_pred Cc-EEEEEEEeeEEEECC---EEEcCEEEEEEEecccccccccceeeEeecCcccccccCCCCHHHHHHhcCCcCCCEEE
Confidence 95 589999999999998 67789999999876543 3334579999999853 46999999999998 8999
Q ss_pred EeecCCC-CCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeec
Q 047816 234 LCYGGMD-VGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLP 309 (620)
Q Consensus 234 l~l~~~~-~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP 309 (620)
+||++.. ..+|.|+|||+|++++ +.|+++ ....+|.|.+++|+|+++.+... ..+..++|||||+++++|
T Consensus 143 ~~L~~~~~~~~g~l~fGg~d~~~~~g~l~~~pv--~~~~~w~v~l~~i~v~g~~~~~~----~~~~~~iiDSGtt~~~lP 216 (318)
T cd05477 143 FYLSGQQGQQGGELVFGGVDNNLYTGQIYWTPV--TSETYWQIGIQGFQINGQATGWC----SQGCQAIVDTGTSLLTAP 216 (318)
T ss_pred EEEcCCCCCCCCEEEEcccCHHHcCCceEEEec--CCceEEEEEeeEEEECCEEeccc----CCCceeeECCCCccEECC
Confidence 9998742 3469999999999873 445543 45689999999999999876532 234579999999999999
Q ss_pred HHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEE
Q 047816 310 EAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCL 389 (620)
Q Consensus 310 ~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl 389 (620)
++++++|++++.++.. ..+.|..+|.. .+.+|+|+|+| +|+++.||+++|+.+. ...|+
T Consensus 217 ~~~~~~l~~~~~~~~~-------~~~~~~~~C~~---------~~~~p~l~~~f-~g~~~~v~~~~y~~~~----~~~C~ 275 (318)
T cd05477 217 QQVMSTLMQSIGAQQD-------QYGQYVVNCNN---------IQNLPTLTFTI-NGVSFPLPPSAYILQN----NGYCT 275 (318)
T ss_pred HHHHHHHHHHhCCccc-------cCCCEEEeCCc---------cccCCcEEEEE-CCEEEEECHHHeEecC----CCeEE
Confidence 9999999998865432 23456778842 14689999999 7899999999999864 34685
Q ss_pred -EEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816 390 -GIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT 426 (620)
Q Consensus 390 -~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~ 426 (620)
++... .++..||||++|||++|++||++++|||||++
T Consensus 276 ~~i~~~~~~~~~~~~~~ilG~~fl~~~y~vfD~~~~~ig~a~~ 318 (318)
T cd05477 276 VGIEPTYLPSQNGQPLWILGDVFLRQYYSVYDLGNNQVGFATA 318 (318)
T ss_pred EEEEecccCCCCCCceEEEcHHHhhheEEEEeCCCCEEeeeeC
Confidence 55432 12356999999999999999999999999985
No 13
>cd05488 Proteinase_A_fungi Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, where the enzyme is targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop atypically points directly into the S(1) pocket of the enzyme. Proteinase A preferentially hydro
Probab=100.00 E-value=5e-50 Score=418.62 Aligned_cols=295 Identities=27% Similarity=0.520 Sum_probs=244.2
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC--CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEee
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE--HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKY 160 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~--~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y 160 (620)
+.+|+++|+||||+|++.|++||||+++||+|..|. .|..+ +.|+|++|+|++..+ |.+.+.|
T Consensus 8 ~~~Y~~~i~iGtp~q~~~v~~DTGSs~~wv~~~~C~~~~C~~~--~~y~~~~Sst~~~~~-------------~~~~~~y 72 (320)
T cd05488 8 NAQYFTDITLGTPPQKFKVILDTGSSNLWVPSVKCGSIACFLH--SKYDSSASSTYKANG-------------TEFKIQY 72 (320)
T ss_pred CCEEEEEEEECCCCcEEEEEEecCCcceEEEcCCCCCcccCCc--ceECCCCCcceeeCC-------------CEEEEEE
Confidence 568999999999999999999999999999999997 57655 699999999999765 5899999
Q ss_pred ccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC------chHHHHHHcCCcc-cce
Q 047816 161 AEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL------SVVDQLVEKGVIS-DSF 232 (620)
Q Consensus 161 ~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~------s~~~~L~~~g~I~-~~F 232 (620)
++| +++|.+++|+|++++ +.++++.|||+..+.+. +.....+||||||++.. +.+.+|++||+|. +.|
T Consensus 73 ~~g-~~~G~~~~D~v~ig~---~~~~~~~f~~a~~~~g~~~~~~~~dGilGLg~~~~s~~~~~~~~~~l~~qg~i~~~~F 148 (320)
T cd05488 73 GSG-SLEGFVSQDTLSIGD---LTIKKQDFAEATSEPGLAFAFGKFDGILGLAYDTISVNKIVPPFYNMINQGLLDEPVF 148 (320)
T ss_pred CCc-eEEEEEEEeEEEECC---EEECCEEEEEEecCCCcceeeeeeceEEecCCccccccCCCCHHHHHHhcCCCCCCEE
Confidence 985 589999999999998 67789999999876553 22346799999999764 3567899999998 899
Q ss_pred EEeecCCCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeec
Q 047816 233 SLCYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLP 309 (620)
Q Consensus 233 Sl~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP 309 (620)
|+||++.+..+|.|+|||+|++++ +.|++. ....+|.|.+++|+|+++.+... +..++|||||++++||
T Consensus 149 S~~L~~~~~~~G~l~fGg~d~~~~~g~l~~~p~--~~~~~w~v~l~~i~vg~~~~~~~------~~~~ivDSGtt~~~lp 220 (320)
T cd05488 149 SFYLGSSEEDGGEATFGGIDESRFTGKITWLPV--RRKAYWEVELEKIGLGDEELELE------NTGAAIDTGTSLIALP 220 (320)
T ss_pred EEEecCCCCCCcEEEECCcCHHHcCCceEEEeC--CcCcEEEEEeCeEEECCEEeccC------CCeEEEcCCcccccCC
Confidence 999998655689999999999873 445543 35679999999999999877532 3479999999999999
Q ss_pred HHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEE
Q 047816 310 EAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCL 389 (620)
Q Consensus 310 ~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl 389 (620)
+++++++.+++.+.. .....|..+|.. . +.+|.|+|+| +|+++.||+++|+++. +..|+
T Consensus 221 ~~~~~~l~~~~~~~~-------~~~~~~~~~C~~--------~-~~~P~i~f~f-~g~~~~i~~~~y~~~~----~g~C~ 279 (320)
T cd05488 221 SDLAEMLNAEIGAKK-------SWNGQYTVDCSK--------V-DSLPDLTFNF-DGYNFTLGPFDYTLEV----SGSCI 279 (320)
T ss_pred HHHHHHHHHHhCCcc-------ccCCcEEeeccc--------c-ccCCCEEEEE-CCEEEEECHHHheecC----CCeEE
Confidence 999999988875432 123456677842 1 4689999999 7899999999999853 34698
Q ss_pred EEEecC-----CCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 390 GIFQNG-----RDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 390 ~~~~~~-----~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
..+... ..+.||||+.|||++|+|||++++|||||+
T Consensus 280 ~~~~~~~~~~~~~~~~ilG~~fl~~~y~vfD~~~~~iG~a~ 320 (320)
T cd05488 280 SAFTGMDFPEPVGPLAIVGDAFLRKYYSVYDLGNNAVGLAK 320 (320)
T ss_pred EEEEECcCCCCCCCeEEEchHHhhheEEEEeCCCCEEeecC
Confidence 665532 134799999999999999999999999996
No 14
>cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets an
Probab=100.00 E-value=6e-50 Score=419.27 Aligned_cols=299 Identities=25% Similarity=0.472 Sum_probs=245.7
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCC----CCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEE
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCE----HCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYER 158 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~----~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~ 158 (620)
+.+|+++|+||||+|++.|++||||+++||+|..|. .|..+ +.|+|++|+|++..+ +.|.+
T Consensus 9 ~~~Y~~~i~vGtP~q~~~v~~DTGSs~~Wv~~~~C~~~~~~c~~~--~~y~~~~Sst~~~~~-------------~~~~i 73 (329)
T cd05485 9 DAQYYGVITIGTPPQSFKVVFDTGSSNLWVPSKKCSWTNIACLLH--NKYDSTKSSTYKKNG-------------TEFAI 73 (329)
T ss_pred CCeEEEEEEECCCCcEEEEEEcCCCccEEEecCCCCCCCccccCC--CeECCcCCCCeEECC-------------eEEEE
Confidence 568999999999999999999999999999999997 46544 689999999999865 68999
Q ss_pred eeccCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCCc------hHHHHHHcCCcc-c
Q 047816 159 KYAEMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDLS------VVDQLVEKGVIS-D 230 (620)
Q Consensus 159 ~Y~dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~s------~~~~L~~~g~I~-~ 230 (620)
.|++| ++.|.+++|+|++|+ ..++++.||++..+.+. +.....+||||||++..+ ++.+|++||+|+ +
T Consensus 74 ~Y~~g-~~~G~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~~GilGLg~~~~s~~~~~p~~~~l~~qg~i~~~ 149 (329)
T cd05485 74 QYGSG-SLSGFLSTDTVSVGG---VSVKGQTFAEAINEPGLTFVAAKFDGILGMGYSSISVDGVVPVFYNMVNQKLVDAP 149 (329)
T ss_pred EECCc-eEEEEEecCcEEECC---EEECCEEEEEEEecCCccccccccceEEEcCCccccccCCCCHHHHHHhCCCCCCC
Confidence 99985 489999999999998 66789999999876542 334567999999997653 679999999998 8
Q ss_pred ceEEeecCCCC--CCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816 231 SFSLCYGGMDV--GGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY 305 (620)
Q Consensus 231 ~FSl~l~~~~~--~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~ 305 (620)
.||+||.+... .+|.|+|||+|++++ +.+++. ....+|.|.++++.++++.+. ..+..++|||||++
T Consensus 150 ~FS~~l~~~~~~~~~G~l~fGg~d~~~~~g~l~~~p~--~~~~~~~v~~~~i~v~~~~~~------~~~~~~iiDSGtt~ 221 (329)
T cd05485 150 VFSFYLNRDPSAKEGGELILGGSDPKHYTGNFTYLPV--TRKGYWQFKMDSVSVGEGEFC------SGGCQAIADTGTSL 221 (329)
T ss_pred EEEEEecCCCCCCCCcEEEEcccCHHHcccceEEEEc--CCceEEEEEeeEEEECCeeec------CCCcEEEEccCCcc
Confidence 99999986432 469999999999874 344443 457899999999999988654 23457999999999
Q ss_pred eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCC
Q 047816 306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRG 385 (620)
Q Consensus 306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~ 385 (620)
+++|++++++|.+++.+.. + ....|.++|.. .+.+|+|+|+| +|+++.|++++|+++....+.
T Consensus 222 ~~lP~~~~~~l~~~~~~~~-----~--~~~~~~~~C~~---------~~~~p~i~f~f-gg~~~~i~~~~yi~~~~~~~~ 284 (329)
T cd05485 222 IAGPVDEIEKLNNAIGAKP-----I--IGGEYMVNCSA---------IPSLPDITFVL-GGKSFSLTGKDYVLKVTQMGQ 284 (329)
T ss_pred eeCCHHHHHHHHHHhCCcc-----c--cCCcEEEeccc---------cccCCcEEEEE-CCEEeEEChHHeEEEecCCCC
Confidence 9999999999988875431 1 12356778842 14679999999 889999999999998765445
Q ss_pred eEEEEEEec-----CCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 386 AYCLGIFQN-----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 386 ~~Cl~~~~~-----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
..|+..+.. ..++.||||++|||++|+|||++++|||||+
T Consensus 285 ~~C~~~~~~~~~~~~~~~~~IlG~~fl~~~y~vFD~~~~~ig~a~ 329 (329)
T cd05485 285 TICLSGFMGIDIPPPAGPLWILGDVFIGKYYTEFDLGNNRVGFAT 329 (329)
T ss_pred CEEeeeEEECcCCCCCCCeEEEchHHhccceEEEeCCCCEEeecC
Confidence 689754442 2235799999999999999999999999985
No 15
>cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with a reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco. CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This fami
Probab=100.00 E-value=4.4e-48 Score=400.45 Aligned_cols=293 Identities=28% Similarity=0.513 Sum_probs=233.7
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS 164 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~ 164 (620)
+|+++|.||||||++.|++||||+++||+|..| | .|.++|++|+
T Consensus 1 ~Y~~~i~iGtP~q~~~v~~DTGSs~~Wv~c~~c-----------------------~-------------~~~i~Yg~Gs 44 (299)
T cd05472 1 EYVVTVGLGTPARDQTVIVDTGSDLTWVQCQPC-----------------------C-------------LYQVSYGDGS 44 (299)
T ss_pred CeEEEEecCCCCcceEEEecCCCCcccccCCCC-----------------------C-------------eeeeEeCCCc
Confidence 488999999999999999999999999987654 2 5889999987
Q ss_pred ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCcccceEEeecCCC-CCC
Q 047816 165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSLCYGGMD-VGG 243 (620)
Q Consensus 165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl~l~~~~-~~~ 243 (620)
.++|.+++|+|+||+. ..++++.|||+...++.+. ..+||||||++..+++.||..+ .+++||+||++.+ ..+
T Consensus 45 ~~~G~~~~D~v~ig~~--~~~~~~~Fg~~~~~~~~~~--~~~GilGLg~~~~s~~~ql~~~--~~~~FS~~L~~~~~~~~ 118 (299)
T cd05472 45 YTTGDLATDTLTLGSS--DVVPGFAFGCGHDNEGLFG--GAAGLLGLGRGKLSLPSQTASS--YGGVFSYCLPDRSSSSS 118 (299)
T ss_pred eEEEEEEEEEEEeCCC--CccCCEEEECCccCCCccC--CCCEEEECCCCcchHHHHhhHh--hcCceEEEccCCCCCCC
Confidence 7899999999999983 1678999999987765432 6899999999999999998765 4589999998754 467
Q ss_pred ceEEECCCCCCC-CceEeecCCC--CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHHHHHH
Q 047816 244 GAMVLGGISPPK-DMVFTHSDPV--RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAFKDAI 320 (620)
Q Consensus 244 G~l~fGgiD~~~-~~~~~~~~~~--~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i~~~l 320 (620)
|+|+|||+|++. .+.|+++... ...+|.|++++|+|+++.+...... .....++|||||++++||++++++|.+++
T Consensus 119 G~l~fGg~d~~~g~l~~~pv~~~~~~~~~y~v~l~~i~vg~~~~~~~~~~-~~~~~~ivDSGTt~~~lp~~~~~~l~~~l 197 (299)
T cd05472 119 GYLSFGAAASVPAGASFTPMLSNPRVPTFYYVGLTGISVGGRRLPIPPAS-FGAGGVIIDSGTVITRLPPSAYAALRDAF 197 (299)
T ss_pred ceEEeCCccccCCCceECCCccCCCCCCeEEEeeEEEEECCEECCCCccc-cCCCCeEEeCCCcceecCHHHHHHHHHHH
Confidence 999999999962 4556654332 2468999999999999987654321 23457999999999999999999999999
Q ss_pred HHHhhhcccccCCCCCCc-cccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEecC-CCC
Q 047816 321 MSELQSLKQIRGPDPNYN-DICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQNG-RDP 398 (620)
Q Consensus 321 ~~~~~~~~~~~~~~~~~~-~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~~~-~~~ 398 (620)
.++...... ....+. ..|+..... . ...+|+|+|+|++|.++.|++++|++... ..+..|+++.... ..+
T Consensus 198 ~~~~~~~~~---~~~~~~~~~C~~~~~~---~-~~~~P~i~f~f~~g~~~~l~~~~y~~~~~-~~~~~C~~~~~~~~~~~ 269 (299)
T cd05472 198 RAAMAAYPR---APGFSILDTCYDLSGF---R-SVSVPTVSLHFQGGADVELDASGVLYPVD-DSSQVCLAFAGTSDDGG 269 (299)
T ss_pred HHHhccCCC---CCCCCCCCccCcCCCC---c-CCccCCEEEEECCCCEEEeCcccEEEEec-CCCCEEEEEeCCCCCCC
Confidence 887642211 111222 358754221 1 25799999999658999999999998432 2457899877653 346
Q ss_pred ceeehHhhhceEEEEEeCCCCEEEEEecCC
Q 047816 399 TTLLGGIIVRNTLVMYDREHSKIGFWKTNC 428 (620)
Q Consensus 399 ~~ILG~~fLr~~yvvfD~en~rIGfA~~~c 428 (620)
.+|||+.|||++|+|||++++|||||+++|
T Consensus 270 ~~ilG~~fl~~~~vvfD~~~~~igfa~~~C 299 (299)
T cd05472 270 LSIIGNVQQQTFRVVYDVAGGRIGFAPGGC 299 (299)
T ss_pred CEEEchHHccceEEEEECCCCEEeEecCCC
Confidence 799999999999999999999999999999
No 16
>cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two
Probab=100.00 E-value=1.6e-47 Score=406.71 Aligned_cols=321 Identities=28% Similarity=0.422 Sum_probs=239.6
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS 164 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~ 164 (620)
.|+++|.||||+|++.|++||||+++||+|..|..| + +.|+|++|+||+..+ |.|++.|++|
T Consensus 3 ~Y~~~i~iGtP~Q~~~v~~DTGSs~lWv~~~~~~~~--~--~~f~~~~SsT~~~~~-------------~~~~i~Yg~G- 64 (364)
T cd05473 3 GYYIEMLIGTPPQKLNILVDTGSSNFAVAAAPHPFI--H--TYFHRELSSTYRDLG-------------KGVTVPYTQG- 64 (364)
T ss_pred ceEEEEEecCCCceEEEEEecCCcceEEEcCCCccc--c--ccCCchhCcCcccCC-------------ceEEEEECcc-
Confidence 478999999999999999999999999999887432 2 589999999999876 5899999985
Q ss_pred ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcC-CCcceEEecCCCCC--------chHHHHHHcCCcccceEEe
Q 047816 165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYS-QHADGIIGLGRGDL--------SVVDQLVEKGVISDSFSLC 235 (620)
Q Consensus 165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~-~~~dGIlGLg~~~~--------s~~~~L~~~g~I~~~FSl~ 235 (620)
++.|.+++|+|+||+..... ..+.|++.....+.+.. ...|||||||++.+ +++++|++|+.++++||+|
T Consensus 65 s~~G~~~~D~v~ig~~~~~~-~~~~~~~~~~~~~~~~~~~~~dGIlGLg~~~l~~~~~~~~~~~~~l~~q~~~~~~FS~~ 143 (364)
T cd05473 65 SWEGELGTDLVSIPKGPNVT-FRANIAAITESENFFLNGSNWEGILGLAYAELARPDSSVEPFFDSLVKQTGIPDVFSLQ 143 (364)
T ss_pred eEEEEEEEEEEEECCCCccc-eEEeeEEEeccccceecccccceeeeecccccccCCCCCCCHHHHHHhccCCccceEEE
Confidence 67999999999998631111 12334555544433332 35799999998754 5889999999988899998
Q ss_pred ecC---------CCCCCceEEECCCCCCCC---ceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccc
Q 047816 236 YGG---------MDVGGGAMVLGGISPPKD---MVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGT 303 (620)
Q Consensus 236 l~~---------~~~~~G~l~fGgiD~~~~---~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGt 303 (620)
|+. ....+|.|+|||+|++++ +.|+++ ....+|.|.+++|+|+++.+......+. ...++|||||
T Consensus 144 l~~~~~~~~~~~~~~~~g~l~fGg~D~~~~~g~l~~~p~--~~~~~~~v~l~~i~vg~~~~~~~~~~~~-~~~~ivDSGT 220 (364)
T cd05473 144 MCGAGLPVNGSASGTVGGSMVIGGIDPSLYKGDIWYTPI--REEWYYEVIILKLEVGGQSLNLDCKEYN-YDKAIVDSGT 220 (364)
T ss_pred ecccccccccccccCCCcEEEeCCcCHhhcCCCceEEec--CcceeEEEEEEEEEECCEeccccccccc-CccEEEeCCC
Confidence 853 123479999999999873 445544 3467999999999999998875433221 2369999999
Q ss_pred eeeeecHHHHHHHHHHHHHHhhhcccccCC-CCCCccccccCCCCCccccCCCCCeEEEEECCC-----cEEEeCCCCcE
Q 047816 304 TYAYLPEAAFLAFKDAIMSELQSLKQIRGP-DPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNG-----QKLLLAPENYL 377 (620)
Q Consensus 304 t~~~LP~~~~~~i~~~l~~~~~~~~~~~~~-~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g-----~~~~l~~~~yi 377 (620)
++++||+++++++.+++.++... ...... ...+..+|+..... ....+|+|+|+|+++ .++.|+|++|+
T Consensus 221 s~~~lp~~~~~~l~~~l~~~~~~-~~~~~~~~~~~~~~C~~~~~~----~~~~~P~i~~~f~g~~~~~~~~l~l~p~~Y~ 295 (364)
T cd05473 221 TNLRLPVKVFNAAVDAIKAASLI-EDFPDGFWLGSQLACWQKGTT----PWEIFPKISIYLRDENSSQSFRITILPQLYL 295 (364)
T ss_pred cceeCCHHHHHHHHHHHHhhccc-ccCCccccCcceeecccccCc----hHhhCCcEEEEEccCCCCceEEEEECHHHhh
Confidence 99999999999999999887531 111111 01234578643211 113689999999542 36899999999
Q ss_pred EEeccc-CCeEEEEEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEecCCcccc
Q 047816 378 FRHSKV-RGAYCLGIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNCSELW 432 (620)
Q Consensus 378 ~~~~~~-~~~~Cl~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c~~~~ 432 (620)
...... .+..|+.+......+.||||+.|||++|+|||++++|||||+++|.+.+
T Consensus 296 ~~~~~~~~~~~C~~~~~~~~~~~~ILG~~flr~~yvvfD~~~~rIGfa~~~C~~~~ 351 (364)
T cd05473 296 RPVEDHGTQLDCYKFAISQSTNGTVIGAVIMEGFYVVFDRANKRVGFAVSTCAEHD 351 (364)
T ss_pred hhhccCCCcceeeEEeeecCCCceEEeeeeEcceEEEEECCCCEEeeEeccccccc
Confidence 764321 2457975332223457999999999999999999999999999998743
No 17
>PF00026 Asp: Eukaryotic aspartyl protease The Prosite entry also includes Pfam:PF00077.; InterPro: IPR001461 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23.-) of vertebrate, fungal and retroviral origin have been characterised.
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human). More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap.
Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event. This family includes neither the retroviral nor the retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 1CZI_E 3CMS_A 1CMS_A 4CMS_A 1YG9_A 2NR6_A 3LIZ_A 1FLH_A 3UTL_A 1QRP_E ....
Probab=100.00 E-value=5.1e-47 Score=395.72 Aligned_cols=302 Identities=32% Similarity=0.563 Sum_probs=248.1
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCC-CCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHC-GDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEM 163 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C-~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg 163 (620)
.|+++|.||||+|+++|++||||+++||++..|..| .......|++++|+|++..+ +.+.+.|++|
T Consensus 1 ~Y~~~v~iGtp~q~~~~~iDTGS~~~wv~~~~c~~~~~~~~~~~y~~~~S~t~~~~~-------------~~~~~~y~~g 67 (317)
T PF00026_consen 1 QYYINVTIGTPPQTFRVLIDTGSSDTWVPSSNCNSCSSCASSGFYNPSKSSTFSNQG-------------KPFSISYGDG 67 (317)
T ss_dssp EEEEEEEETTTTEEEEEEEETTBSSEEEEBTTECSHTHHCTSC-BBGGGSTTEEEEE-------------EEEEEEETTE
T ss_pred CeEEEEEECCCCeEEEEEEecccceeeeceeccccccccccccccccccccccccce-------------eeeeeeccCc
Confidence 488999999999999999999999999999999876 33334799999999999875 5799999996
Q ss_pred CceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCC-------CCchHHHHHHcCCcc-cceEE
Q 047816 164 SSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRG-------DLSVVDQLVEKGVIS-DSFSL 234 (620)
Q Consensus 164 ~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~-------~~s~~~~L~~~g~I~-~~FSl 234 (620)
. ++|.+++|+|+|++ +.+.++.||++....+. +.....+||||||++ ..+++++|+++|+|+ ++||+
T Consensus 68 ~-~~G~~~~D~v~ig~---~~~~~~~f~~~~~~~~~~~~~~~~~GilGLg~~~~~~~~~~~~~~~~l~~~g~i~~~~fsl 143 (317)
T PF00026_consen 68 S-VSGNLVSDTVSIGG---LTIPNQTFGLADSYSGDPFSPIPFDGILGLGFPSLSSSSTYPTFLDQLVQQGLISSNVFSL 143 (317)
T ss_dssp E-EEEEEEEEEEEETT---EEEEEEEEEEEEEEESHHHHHSSSSEEEE-SSGGGSGGGTS-SHHHHHHHTTSSSSSEEEE
T ss_pred c-cccccccceEeeee---ccccccceeccccccccccccccccccccccCCcccccccCCcceecchhhccccccccce
Confidence 6 99999999999999 67789999999986443 234568999999974 247999999999998 89999
Q ss_pred eecCCCCCCceEEECCCCCCCCc-eEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHH
Q 047816 235 CYGGMDVGGGAMVLGGISPPKDM-VFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAF 313 (620)
Q Consensus 235 ~l~~~~~~~G~l~fGgiD~~~~~-~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~ 313 (620)
+|++.+...|.|+|||+|++++. ...+.+.....+|.+.+++|.++++.... .....++||||+++++||.+++
T Consensus 144 ~l~~~~~~~g~l~~Gg~d~~~~~g~~~~~~~~~~~~w~v~~~~i~i~~~~~~~-----~~~~~~~~Dtgt~~i~lp~~~~ 218 (317)
T PF00026_consen 144 YLNPSDSQNGSLTFGGYDPSKYDGDLVWVPLVSSGYWSVPLDSISIGGESVFS-----SSGQQAILDTGTSYIYLPRSIF 218 (317)
T ss_dssp EEESTTSSEEEEEESSEEGGGEESEEEEEEBSSTTTTEEEEEEEEETTEEEEE-----EEEEEEEEETTBSSEEEEHHHH
T ss_pred eeeecccccchheeeccccccccCceeccCccccccccccccccccccccccc-----ccceeeecccccccccccchhh
Confidence 99987666799999999999832 23333333788999999999999882221 2335699999999999999999
Q ss_pred HHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEe
Q 047816 314 LAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQ 393 (620)
Q Consensus 314 ~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~ 393 (620)
++|++++.+.... ..|..+|. .. +.+|.++|.| ++.+|.||+++|+.+........|+..+.
T Consensus 219 ~~i~~~l~~~~~~--------~~~~~~c~--------~~-~~~p~l~f~~-~~~~~~i~~~~~~~~~~~~~~~~C~~~i~ 280 (317)
T PF00026_consen 219 DAIIKALGGSYSD--------GVYSVPCN--------ST-DSLPDLTFTF-GGVTFTIPPSDYIFKIEDGNGGYCYLGIQ 280 (317)
T ss_dssp HHHHHHHTTEEEC--------SEEEEETT--------GG-GGSEEEEEEE-TTEEEEEEHHHHEEEESSTTSSEEEESEE
T ss_pred HHHHhhhcccccc--------eeEEEecc--------cc-cccceEEEee-CCEEEEecchHhcccccccccceeEeeee
Confidence 9999999765431 45677883 22 5689999999 79999999999999887655558865554
Q ss_pred c----CCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816 394 N----GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT 426 (620)
Q Consensus 394 ~----~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~ 426 (620)
. .....+|||.+|||++|++||+|++|||||+|
T Consensus 281 ~~~~~~~~~~~iLG~~fl~~~y~vfD~~~~~ig~A~a 317 (317)
T PF00026_consen 281 PMDSSDDSDDWILGSPFLRNYYVVFDYENNRIGFAQA 317 (317)
T ss_dssp EESSTTSSSEEEEEHHHHTTEEEEEETTTTEEEEEEE
T ss_pred cccccccCCceEecHHHhhceEEEEeCCCCEEEEecC
Confidence 4 44578999999999999999999999999986
No 18
>cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in the fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-
Probab=100.00 E-value=2.8e-46 Score=382.81 Aligned_cols=265 Identities=23% Similarity=0.357 Sum_probs=219.7
Q ss_pred EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCCc
Q 047816 86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMSS 165 (620)
Q Consensus 86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~~ 165 (620)
|+++|+||||+|++.|++||||+++||+|..|..|..+.+..|++++|+|++... .+.|.+.|++|+.
T Consensus 1 Y~~~i~vGtP~Q~~~v~~DTGS~~~wv~~~~c~~~~~~~~~~y~~~~Sst~~~~~------------~~~~~i~Y~~G~~ 68 (278)
T cd06097 1 YLTPVKIGTPPQTLNLDLDTGSSDLWVFSSETPAAQQGGHKLYDPSKSSTAKLLP------------GATWSISYGDGSS 68 (278)
T ss_pred CeeeEEECCCCcEEEEEEeCCCCceeEeeCCCCchhhccCCcCCCccCccceecC------------CcEEEEEeCCCCe
Confidence 6799999999999999999999999999999998887666789999999998753 3689999999878
Q ss_pred eeEEEEEEEEEeCCCCCCCccceEEEEEEeccCC-CcCCCcceEEecCCCCC---------chHHHHHHcCCcccceEEe
Q 047816 166 SSGVLGEDIISFGNESDLKPQRAVFGCENVETGD-LYSQHADGIIGLGRGDL---------SVVDQLVEKGVISDSFSLC 235 (620)
Q Consensus 166 ~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~-~~~~~~dGIlGLg~~~~---------s~~~~L~~~g~I~~~FSl~ 235 (620)
+.|.+++|+|+||+ .+++++.||+++...+. +.....+||||||++.. +++++|.+++. ++.||+|
T Consensus 69 ~~G~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~dGilGLg~~~~~~~~~~~~~~~~~~l~~~~~-~~~Fs~~ 144 (278)
T cd06097 69 ASGIVYTDTVSIGG---VEVPNQAIELATAVSASFFSDTASDGLLGLAFSSINTVQPPKQKTFFENALSSLD-APLFTAD 144 (278)
T ss_pred EEEEEEEEEEEECC---EEECCeEEEEEeecCccccccccccceeeeccccccccccCCCCCHHHHHHHhcc-CceEEEE
Confidence 99999999999998 67789999999987653 33457899999998643 47889999865 7899999
Q ss_pred ecCCCCCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHH
Q 047816 236 YGGMDVGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAA 312 (620)
Q Consensus 236 l~~~~~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~ 312 (620)
|.+ ...|+|+|||+|+++ .+.|++... ...+|.|++++|.|+++.... .....++|||||+++++|+++
T Consensus 145 l~~--~~~G~l~fGg~D~~~~~g~l~~~pi~~-~~~~w~v~l~~i~v~~~~~~~-----~~~~~~iiDSGTs~~~lP~~~ 216 (278)
T cd06097 145 LRK--AAPGFYTFGYIDESKYKGEISWTPVDN-SSGFWQFTSTSYTVGGDAPWS-----RSGFSAIADTGTTLILLPDAI 216 (278)
T ss_pred ecC--CCCcEEEEeccChHHcCCceEEEEccC-CCcEEEEEEeeEEECCcceee-----cCCceEEeecCCchhcCCHHH
Confidence 986 357999999999987 345555432 268999999999999874321 234679999999999999999
Q ss_pred HHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEE
Q 047816 313 FLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIF 392 (620)
Q Consensus 313 ~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~ 392 (620)
++++.+++.+.. .....+.|..+|. ..+|+|+|+|
T Consensus 217 ~~~l~~~l~g~~-----~~~~~~~~~~~C~-----------~~~P~i~f~~----------------------------- 251 (278)
T cd06097 217 VEAYYSQVPGAY-----YDSEYGGWVFPCD-----------TTLPDLSFAV----------------------------- 251 (278)
T ss_pred HHHHHHhCcCCc-----ccCCCCEEEEECC-----------CCCCCEEEEE-----------------------------
Confidence 999998883211 1123345778883 2389999988
Q ss_pred ecCCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 393 QNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 393 ~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
.||||++|||++|+|||++|+|||||+
T Consensus 252 ------~~ilGd~fl~~~y~vfD~~~~~ig~A~ 278 (278)
T cd06097 252 ------FSILGDVFLKAQYVVFDVGGPKLGFAP 278 (278)
T ss_pred ------EEEEcchhhCceeEEEcCCCceeeecC
Confidence 699999999999999999999999995
No 19
>cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides which
Probab=100.00 E-value=1.4e-45 Score=374.98 Aligned_cols=255 Identities=42% Similarity=0.755 Sum_probs=217.5
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS 164 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~ 164 (620)
.|+++|+||||+|++.|++||||+++||+| | .|.+.|+|++
T Consensus 1 ~Y~~~i~iGtP~q~~~v~~DTGSs~~wv~~--------------------------~-------------~~~~~Y~dg~ 41 (265)
T cd05476 1 EYLVTLSIGTPPQPFSLIVDTGSDLTWTQC--------------------------C-------------SYEYSYGDGS 41 (265)
T ss_pred CeEEEEecCCCCcceEEEecCCCCCEEEcC--------------------------C-------------ceEeEeCCCc
Confidence 388999999999999999999999999985 1 4789999989
Q ss_pred ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCcccceEEeecCC--CCC
Q 047816 165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSLCYGGM--DVG 242 (620)
Q Consensus 165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl~l~~~--~~~ 242 (620)
.++|.+++|+|+|++.. .++.++.|||+..+.+ +.....+||||||+...|++.||..++ ++||+||.+. ...
T Consensus 42 ~~~G~~~~D~v~~g~~~-~~~~~~~Fg~~~~~~~-~~~~~~~GIlGLg~~~~s~~~ql~~~~---~~Fs~~l~~~~~~~~ 116 (265)
T cd05476 42 STSGVLATETFTFGDSS-VSVPNVAFGCGTDNEG-GSFGGADGILGLGRGPLSLVSQLGSTG---NKFSYCLVPHDDTGG 116 (265)
T ss_pred eeeeeEEEEEEEecCCC-CccCCEEEEecccccC-CccCCCCEEEECCCCcccHHHHhhccc---CeeEEEccCCCCCCC
Confidence 99999999999999832 1678999999998876 555678999999999999999999888 7999999864 356
Q ss_pred CceEEECCCCCCC--CceEeecCCC--CCCeeEEEEeEEEEccEEecCCCCc----cCCCCceEeeccceeeeecHHHHH
Q 047816 243 GGAMVLGGISPPK--DMVFTHSDPV--RSPYYNIDLKVIHVAGKPLPLNPKV----FDGKHGTVLDSGTTYAYLPEAAFL 314 (620)
Q Consensus 243 ~G~l~fGgiD~~~--~~~~~~~~~~--~~~~w~v~l~~i~v~g~~~~~~~~~----~~~~~~ailDSGtt~~~LP~~~~~ 314 (620)
+|.|+|||+|+++ .+.|++.... ...+|.|++++|+|+++.+.++... ......++|||||++++||++++
T Consensus 117 ~G~l~fGg~d~~~~~~l~~~p~~~~~~~~~~~~v~l~~i~v~~~~~~~~~~~~~~~~~~~~~ai~DTGTs~~~lp~~~~- 195 (265)
T cd05476 117 SSPLILGDAADLGGSGVVYTPLVKNPANPTYYYVNLEGISVGGKRLPIPPSVFAIDSDGSGGTIIDSGTTLTYLPDPAY- 195 (265)
T ss_pred CCeEEECCcccccCCCceEeecccCCCCCCceEeeeEEEEECCEEecCCchhcccccCCCCcEEEeCCCcceEcCcccc-
Confidence 7999999999963 5566665442 3679999999999999987643221 13456799999999999999876
Q ss_pred HHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEec
Q 047816 315 AFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQN 394 (620)
Q Consensus 315 ~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~~ 394 (620)
|+|+|+|+++.++.+++++|+.... ++..|+++...
T Consensus 196 ------------------------------------------P~i~~~f~~~~~~~i~~~~y~~~~~--~~~~C~~~~~~ 231 (265)
T cd05476 196 ------------------------------------------PDLTLHFDGGADLELPPENYFVDVG--EGVVCLAILSS 231 (265)
T ss_pred ------------------------------------------CCEEEEECCCCEEEeCcccEEEECC--CCCEEEEEecC
Confidence 6899999658999999999998543 36789988876
Q ss_pred CCCCceeehHhhhceEEEEEeCCCCEEEEEecCC
Q 047816 395 GRDPTTLLGGIIVRNTLVMYDREHSKIGFWKTNC 428 (620)
Q Consensus 395 ~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c 428 (620)
...+.||||++|||++|++||++++|||||+++|
T Consensus 232 ~~~~~~ilG~~fl~~~~~vFD~~~~~iGfa~~~C 265 (265)
T cd05476 232 SSGGVSILGNIQQQNFLVEYDLENSRLGFAPADC 265 (265)
T ss_pred CCCCcEEEChhhcccEEEEEECCCCEEeeecCCC
Confidence 5567899999999999999999999999999999
No 20
>cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more d
Probab=100.00 E-value=4.2e-45 Score=372.75 Aligned_cols=261 Identities=34% Similarity=0.695 Sum_probs=213.4
Q ss_pred eeEEEEEEecCCCcEEEEEEeCCCCceeEeCC-CCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816 84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCA-TCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE 162 (620)
Q Consensus 84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~-~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d 162 (620)
|+|+++|.||||+|++.|++||||+++||+|. .|..| . |.|+++|+|
T Consensus 1 ~~Y~~~i~iGtP~q~~~v~~DTGS~~~Wv~c~~~c~~c-------------------~-------------c~~~i~Ygd 48 (273)
T cd05475 1 GYYYVTINIGNPPKPYFLDIDTGSDLTWLQCDAPCTGC-------------------Q-------------CDYEIEYAD 48 (273)
T ss_pred CceEEEEEcCCCCeeEEEEEccCCCceEEeCCCCCCCC-------------------c-------------CccEeEeCC
Confidence 47999999999999999999999999999984 57666 1 469999998
Q ss_pred CCceeEEEEEEEEEeCCCC-CCCccceEEEEEEeccCCC--cCCCcceEEecCCCCCchHHHHHHcCCcccceEEeecCC
Q 047816 163 MSSSSGVLGEDIISFGNES-DLKPQRAVFGCENVETGDL--YSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSLCYGGM 239 (620)
Q Consensus 163 g~~~~G~~~~D~v~lg~~~-~~~~~~~~fg~~~~~~~~~--~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl~l~~~ 239 (620)
++.+.|.+++|+|+++... ...+.++.|||+....+.+ .....+||||||++..++++||.++++|+++||+||.+
T Consensus 49 ~~~~~G~~~~D~v~~~~~~~~~~~~~~~Fgc~~~~~~~~~~~~~~~dGIlGLg~~~~s~~~ql~~~~~i~~~Fs~~l~~- 127 (273)
T cd05475 49 GGSSMGVLVTDIFSLKLTNGSRAKPRIAFGCGYDQQGPLLNPPPPTDGILGLGRGKISLPSQLASQGIIKNVIGHCLSS- 127 (273)
T ss_pred CCceEEEEEEEEEEEeecCCCcccCCEEEEeeeccCCcccCCCccCCEEEECCCCCCCHHHHHHhcCCcCceEEEEccC-
Confidence 8999999999999997531 1456789999998765432 23467999999999999999999999998899999986
Q ss_pred CCCCceEEECCCCCC-CCceEeecCCC-CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHHH
Q 047816 240 DVGGGAMVLGGISPP-KDMVFTHSDPV-RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAFK 317 (620)
Q Consensus 240 ~~~~G~l~fGgiD~~-~~~~~~~~~~~-~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i~ 317 (620)
..+|.|+||+.... ..+.|+++... ...+|.|++.+|+|+++... .....++|||||++++||++++
T Consensus 128 -~~~g~l~~G~~~~~~g~i~ytpl~~~~~~~~y~v~l~~i~vg~~~~~------~~~~~~ivDTGTt~t~lp~~~y---- 196 (273)
T cd05475 128 -NGGGFLFFGDDLVPSSGVTWTPMRRESQKKHYSPGPASLLFNGQPTG------GKGLEVVFDSGSSYTYFNAQAY---- 196 (273)
T ss_pred -CCCeEEEECCCCCCCCCeeecccccCCCCCeEEEeEeEEEECCEECc------CCCceEEEECCCceEEcCCccc----
Confidence 34689999854322 14556554321 24799999999999998543 2345799999999999999865
Q ss_pred HHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCC---cEEEeCCCCcEEEecccCCeEEEEEEec
Q 047816 318 DAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNG---QKLLLAPENYLFRHSKVRGAYCLGIFQN 394 (620)
Q Consensus 318 ~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g---~~~~l~~~~yi~~~~~~~~~~Cl~~~~~ 394 (620)
+|+|+|+|+++ ++++|++++|++... .+..|++++..
T Consensus 197 --------------------------------------~p~i~~~f~~~~~~~~~~l~~~~y~~~~~--~~~~Cl~~~~~ 236 (273)
T cd05475 197 --------------------------------------FKPLTLKFGKGWRTRLLEIPPENYLIISE--KGNVCLGILNG 236 (273)
T ss_pred --------------------------------------cccEEEEECCCCceeEEEeCCCceEEEcC--CCCEEEEEecC
Confidence 46899999543 799999999998754 35689998865
Q ss_pred CC---CCceeehHhhhceEEEEEeCCCCEEEEEecCC
Q 047816 395 GR---DPTTLLGGIIVRNTLVMYDREHSKIGFWKTNC 428 (620)
Q Consensus 395 ~~---~~~~ILG~~fLr~~yvvfD~en~rIGfA~~~c 428 (620)
.+ .+.||||+.|||++|+|||++++|||||+++|
T Consensus 237 ~~~~~~~~~ilG~~~l~~~~~vfD~~~~riGfa~~~C 273 (273)
T cd05475 237 SEIGLGNTNIIGDISMQGLMVIYDNEKQQIGWVRSDC 273 (273)
T ss_pred CCcCCCceEEECceEEEeeEEEEECcCCEeCcccCCC
Confidence 42 35799999999999999999999999999998
No 21
>cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. This family of aspartate proteases
Probab=100.00 E-value=3.2e-45 Score=378.47 Aligned_cols=272 Identities=25% Similarity=0.450 Sum_probs=225.7
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS 164 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~ 164 (620)
+|+++|.||||+|++.|++||||+++||+ .|++.|++|+
T Consensus 2 ~Y~~~i~iGtp~q~~~v~~DTgS~~~wv~-----------------------------------------~~~~~Y~~g~ 40 (295)
T cd05474 2 YYSAELSVGTPPQKVTVLLDTGSSDLWVP-----------------------------------------DFSISYGDGT 40 (295)
T ss_pred eEEEEEEECCCCcEEEEEEeCCCCcceee-----------------------------------------eeEEEeccCC
Confidence 68999999999999999999999999997 1778999989
Q ss_pred ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCCC-----------chHHHHHHcCCcc-cce
Q 047816 165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGDL-----------SVVDQLVEKGVIS-DSF 232 (620)
Q Consensus 165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~-----------s~~~~L~~~g~I~-~~F 232 (620)
.+.|.+++|+|++++ ..++++.|||++... ..+||||||++.. +|+++|++||+|+ ++|
T Consensus 41 ~~~G~~~~D~v~~g~---~~~~~~~fg~~~~~~------~~~GilGLg~~~~~~~~~~~~~~~s~~~~L~~~g~i~~~~F 111 (295)
T cd05474 41 SASGTWGTDTVSIGG---ATVKNLQFAVANSTS------SDVGVLGIGLPGNEATYGTGYTYPNFPIALKKQGLIKKNAY 111 (295)
T ss_pred cEEEEEEEEEEEECC---eEecceEEEEEecCC------CCcceeeECCCCCcccccCCCcCCCHHHHHHHCCcccceEE
Confidence 999999999999998 567899999998742 4799999999775 6999999999998 899
Q ss_pred EEeecCCCCCCceEEECCCCCCCC---ceEeecCCCC----CCeeEEEEeEEEEccEEecCCCCccCCCCceEeecccee
Q 047816 233 SLCYGGMDVGGGAMVLGGISPPKD---MVFTHSDPVR----SPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTY 305 (620)
Q Consensus 233 Sl~l~~~~~~~G~l~fGgiD~~~~---~~~~~~~~~~----~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~ 305 (620)
|+||++.+...|.|+|||+|+.++ +.+++..... ..+|.|.+++|.++++.+..+ .......++|||||++
T Consensus 112 sl~l~~~~~~~g~l~~Gg~d~~~~~g~~~~~p~~~~~~~~~~~~~~v~l~~i~v~~~~~~~~--~~~~~~~~iiDSGt~~ 189 (295)
T cd05474 112 SLYLNDLDASTGSILFGGVDTAKYSGDLVTLPIVNDNGGSEPSELSVTLSSISVNGSSGNTT--LLSKNLPALLDSGTTL 189 (295)
T ss_pred EEEeCCCCCCceeEEEeeeccceeeceeEEEeCcCcCCCCCceEEEEEEEEEEEEcCCCccc--ccCCCccEEECCCCcc
Confidence 999998655689999999998873 4555544432 278999999999999876532 1245568999999999
Q ss_pred eeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecc--c
Q 047816 306 AYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSK--V 383 (620)
Q Consensus 306 ~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~--~ 383 (620)
++||++++++|.+++.+.... ....|..+|+.. .. |+|+|+| +|.++.||+++|+++... .
T Consensus 190 ~~lP~~~~~~l~~~~~~~~~~------~~~~~~~~C~~~---------~~-p~i~f~f-~g~~~~i~~~~~~~~~~~~~~ 252 (295)
T cd05474 190 TYLPSDIVDAIAKQLGATYDS------DEGLYVVDCDAK---------DD-GSLTFNF-GGATISVPLSDLVLPASTDDG 252 (295)
T ss_pred EeCCHHHHHHHHHHhCCEEcC------CCcEEEEeCCCC---------CC-CEEEEEE-CCeEEEEEHHHhEeccccCCC
Confidence 999999999999999765431 234577888532 23 9999999 789999999999987642 2
Q ss_pred CCeEEE-EEEecCCCCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816 384 RGAYCL-GIFQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWKT 426 (620)
Q Consensus 384 ~~~~Cl-~~~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~~ 426 (620)
.+..|+ ++.... ++.||||++|||++|++||++++|||||++
T Consensus 253 ~~~~C~~~i~~~~-~~~~iLG~~fl~~~y~vfD~~~~~ig~a~a 295 (295)
T cd05474 253 GDGACYLGIQPST-SDYNILGDTFLRSAYVVYDLDNNEISLAQA 295 (295)
T ss_pred CCCCeEEEEEeCC-CCcEEeChHHhhcEEEEEECCCCEEEeecC
Confidence 345674 554433 478999999999999999999999999986
No 22
>cd05489 xylanase_inhibitor_I_like TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradation glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability
Probab=100.00 E-value=1.3e-43 Score=373.91 Aligned_cols=318 Identities=24% Similarity=0.437 Sum_probs=242.6
Q ss_pred ecCCCcE-EEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-cc------cC----------CCCCc
Q 047816 92 IGTPPQT-FALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-CN------CD----------RERAQ 153 (620)
Q Consensus 92 iGTP~Q~-~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c~------c~----------~~~~~ 153 (620)
+|||-.+ +.|++||||+++||+|.+ .+|+||..+.|.+. |. |. -..+.
T Consensus 2 ~~~~~~~~~~~~~DTGS~l~WvqC~~--------------~~sst~~~~~C~s~~C~~~~~~~~~~~~~~~~~~~c~~~~ 67 (362)
T cd05489 2 TITPLKGAVPLVLDLAGPLLWSTCDA--------------GHSSTYQTVPCSSSVCSLANRYHCPGTCGGAPGPGCGNNT 67 (362)
T ss_pred cccCccCCeeEEEECCCCceeeeCCC--------------CCcCCCCccCcCChhhccccccCCCccccCCCCCCCCCCc
Confidence 5788777 999999999999999864 34667777777753 52 11 01234
Q ss_pred ceeEEe-eccCCceeEEEEEEEEEeCCCCC-----CCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCC
Q 047816 154 CVYERK-YAEMSSSSGVLGEDIISFGNESD-----LKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGV 227 (620)
Q Consensus 154 ~~~~~~-Y~dg~~~~G~~~~D~v~lg~~~~-----~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~ 227 (620)
|.|... |++|+...|.+++|+|+|+.... .++.++.|||+............|||||||++.+|++.||..++.
T Consensus 68 C~y~~~~y~~gs~t~G~l~~Dtl~~~~~~g~~~~~~~~~~~~FGC~~~~~~~~~~~~~dGIlGLg~~~lSl~sql~~~~~ 147 (362)
T cd05489 68 CTAHPYNPVTGECATGDLTQDVLSANTTDGSNPLLVVIFNFVFSCAPSLLLKGLPPGAQGVAGLGRSPLSLPAQLASAFG 147 (362)
T ss_pred CeeEccccccCcEeeEEEEEEEEEecccCCCCcccceeCCEEEEcCCcccccCCccccccccccCCCccchHHHhhhhcC
Confidence 777654 77877999999999999975321 257899999998754322233489999999999999999998777
Q ss_pred cccceEEeecCCCCCCceEEECCCCCCC---------CceEeecCCC--CCCeeEEEEeEEEEccEEecCCCCcc----C
Q 047816 228 ISDSFSLCYGGMDVGGGAMVLGGISPPK---------DMVFTHSDPV--RSPYYNIDLKVIHVAGKPLPLNPKVF----D 292 (620)
Q Consensus 228 I~~~FSl~l~~~~~~~G~l~fGgiD~~~---------~~~~~~~~~~--~~~~w~v~l~~i~v~g~~~~~~~~~~----~ 292 (620)
++++||+||.+....+|.|+||+.+..+ .+.|++.... ...+|.|+|++|+|+++.+.+++..+ .
T Consensus 148 ~~~~FS~CL~~~~~~~g~l~fG~~~~~~~~~~~~~~~~~~~tPl~~~~~~~~~Y~v~l~~IsVg~~~l~~~~~~~~~~~~ 227 (362)
T cd05489 148 VARKFALCLPSSPGGPGVAIFGGGPYYLFPPPIDLSKSLSYTPLLTNPRKSGEYYIGVTSIAVNGHAVPLNPTLSANDRL 227 (362)
T ss_pred CCcceEEEeCCCCCCCeeEEECCCchhcccccccccCCccccccccCCCCCCceEEEEEEEEECCEECCCCchhcccccc
Confidence 6689999998754567999999998643 3455554332 34799999999999999987754432 2
Q ss_pred CCCceEeeccceeeeecHHHHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECC-CcEEEe
Q 047816 293 GKHGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGN-GQKLLL 371 (620)
Q Consensus 293 ~~~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~-g~~~~l 371 (620)
+...++|||||++++||+++|++|.+++.+++........ .....+.||......+......+|+|+|+|++ |.++.|
T Consensus 228 ~~~g~iiDSGTs~t~lp~~~y~~l~~a~~~~~~~~~~~~~-~~~~~~~C~~~~~~~~~~~~~~~P~it~~f~g~g~~~~l 306 (362)
T cd05489 228 GPGGVKLSTVVPYTVLRSDIYRAFTQAFAKATARIPRVPA-AAVFPELCYPASALGNTRLGYAVPAIDLVLDGGGVNWTI 306 (362)
T ss_pred CCCcEEEecCCceEEECHHHHHHHHHHHHHHhcccCcCCC-CCCCcCccccCCCcCCcccccccceEEEEEeCCCeEEEE
Confidence 3457999999999999999999999999988764322211 11223689876543333334689999999965 799999
Q ss_pred CCCCcEEEecccCCeEEEEEEecCC--CCceeehHhhhceEEEEEeCCCCEEEEEec
Q 047816 372 APENYLFRHSKVRGAYCLGIFQNGR--DPTTLLGGIIVRNTLVMYDREHSKIGFWKT 426 (620)
Q Consensus 372 ~~~~yi~~~~~~~~~~Cl~~~~~~~--~~~~ILG~~fLr~~yvvfD~en~rIGfA~~ 426 (620)
++++|+++..+ +..|+++..... .+.||||+.|||++|++||++++|||||+.
T Consensus 307 ~~~ny~~~~~~--~~~Cl~f~~~~~~~~~~~IlG~~~~~~~~vvyD~~~~riGfa~~ 361 (362)
T cd05489 307 FGANSMVQVKG--GVACLAFVDGGSEPRPAVVIGGHQMEDNLLVFDLEKSRLGFSSS 361 (362)
T ss_pred cCCceEEEcCC--CcEEEEEeeCCCCCCceEEEeeheecceEEEEECCCCEeecccC
Confidence 99999998653 568998876542 357999999999999999999999999964
No 23
>cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residu
Probab=100.00 E-value=5e-42 Score=352.04 Aligned_cols=269 Identities=36% Similarity=0.682 Sum_probs=223.7
Q ss_pred EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCC--CCCCCCcccccccCcCCcccCCCCCcceeEEeeccC
Q 047816 86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPK--FEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEM 163 (620)
Q Consensus 86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~--y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg 163 (620)
|+++|.||||+|++.|++||||+++||+|..|..|..+.... |++..|+++.... |.+++.|++
T Consensus 1 Y~~~i~iGtp~q~~~l~~DTGS~~~wv~~~~c~~~~~~~~~~~~~~~~~s~~~~~~~-------------~~~~~~Y~~- 66 (283)
T cd05471 1 YYGEITIGTPPQKFSVIFDTGSSLLWVPSSNCTSCSCQKHPRFKYDSSKSSTYKDTG-------------CTFSITYGD- 66 (283)
T ss_pred CEEEEEECCCCcEEEEEEeCCCCCEEEecCCCCccccccCCCCccCccCCceeecCC-------------CEEEEEECC-
Confidence 678999999999999999999999999999999887665444 7888888887654 689999998
Q ss_pred CceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCCCC------CchHHHHHHcCCcc-cceEEee
Q 047816 164 SSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGRGD------LSVVDQLVEKGVIS-DSFSLCY 236 (620)
Q Consensus 164 ~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~------~s~~~~L~~~g~I~-~~FSl~l 236 (620)
+.+.|.+++|+|++++ ..+.++.|||++.....+.....+||||||+.. .+++++|.++++|. +.||+|+
T Consensus 67 g~~~g~~~~D~v~~~~---~~~~~~~fg~~~~~~~~~~~~~~~GilGLg~~~~~~~~~~s~~~~l~~~~~i~~~~Fs~~l 143 (283)
T cd05471 67 GSVTGGLGTDTVTIGG---LTIPNQTFGCATSESGDFSSSGFDGILGLGFPSLSVDGVPSFFDQLKSQGLISSPVFSFYL 143 (283)
T ss_pred CeEEEEEEEeEEEECC---EEEeceEEEEEeccCCcccccccceEeecCCcccccccCCCHHHHHHHCCCCCCCEEEEEE
Confidence 6889999999999998 457899999999887644556789999999988 78999999999998 8999999
Q ss_pred cCC--CCCCceEEECCCCCCC---CceEeecCCCCCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHH
Q 047816 237 GGM--DVGGGAMVLGGISPPK---DMVFTHSDPVRSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEA 311 (620)
Q Consensus 237 ~~~--~~~~G~l~fGgiD~~~---~~~~~~~~~~~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~ 311 (620)
.+. ....|.|+|||+|+++ .+.+++.......+|.|.+++|.++++... .......++|||||++++||++
T Consensus 144 ~~~~~~~~~g~l~~Gg~d~~~~~~~~~~~p~~~~~~~~~~v~l~~i~v~~~~~~----~~~~~~~~iiDsGt~~~~lp~~ 219 (283)
T cd05471 144 GRDGDGGNGGELTFGGIDPSKYTGDLTYTPVVSNGPGYWQVPLDGISVGGKSVI----SSSGGGGAIVDSGTSLIYLPSS 219 (283)
T ss_pred cCCCCCCCCCEEEEcccCccccCCceEEEecCCCCCCEEEEEeCeEEECCceee----ecCCCcEEEEecCCCCEeCCHH
Confidence 975 3467999999999985 455665554347899999999999987411 1134568999999999999999
Q ss_pred HHHHHHHHHHHHhhhcccccCCCCCCccccccCCCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEE
Q 047816 312 AFLAFKDAIMSELQSLKQIRGPDPNYNDICFSGAPSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGI 391 (620)
Q Consensus 312 ~~~~i~~~l~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~ 391 (620)
+++++++++.+.... ....+...| .. .+.+|+|+|+|
T Consensus 220 ~~~~l~~~~~~~~~~------~~~~~~~~~--------~~-~~~~p~i~f~f---------------------------- 256 (283)
T cd05471 220 VYDAILKALGAAVSS------SDGGYGVDC--------SP-CDTLPDITFTF---------------------------- 256 (283)
T ss_pred HHHHHHHHhCCcccc------cCCcEEEeC--------cc-cCcCCCEEEEE----------------------------
Confidence 999999999776532 112223333 11 26899999999
Q ss_pred EecCCCCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 392 FQNGRDPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 392 ~~~~~~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
.+|||.+|||++|++||+++++||||+
T Consensus 257 -------~~ilG~~fl~~~y~vfD~~~~~igfa~ 283 (283)
T cd05471 257 -------LWILGDVFLRNYYTVFDLDNNRIGFAP 283 (283)
T ss_pred -------EEEccHhhhhheEEEEeCCCCEEeecC
Confidence 699999999999999999999999985
No 24
>PF14543 TAXi_N: Xylanase inhibitor N-terminal; PDB: 3HD8_A 3VLB_A 3VLA_A 3AUP_D 1T6G_A 1T6E_X 2B42_A.
Probab=99.93 E-value=3.4e-25 Score=207.79 Aligned_cols=152 Identities=43% Similarity=0.814 Sum_probs=123.4
Q ss_pred EEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCC-c--------ccCCCCCccee
Q 047816 86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLY-C--------NCDRERAQCVY 156 (620)
Q Consensus 86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~-c--------~c~~~~~~~~~ 156 (620)
|+++|.||||+|++.|++||||+.+|++| .++.|+|.+|+||+.+.|.+. | .|......|.|
T Consensus 1 Y~~~~~iGtP~~~~~lvvDtgs~l~W~~C---------~~~~f~~~~Sst~~~v~C~s~~C~~~~~~~~~~~~~~~~C~y 71 (164)
T PF14543_consen 1 YYVSVSIGTPPQPFSLVVDTGSDLTWVQC---------PDPPFDPSKSSTYRPVPCSSPQCSSAPSFCPCCCCSNNSCPY 71 (164)
T ss_dssp EEEEEECTCTTEEEEEEEETT-SSEEEET-------------STT-TTSSBEC-BTTSHHHHHCTSSBTCCTCESSEEEE
T ss_pred CEEEEEeCCCCceEEEEEECCCCceEEcC---------CCcccCCccCCcccccCCCCcchhhcccccccCCCCcCcccc
Confidence 78999999999999999999999999998 237999999999999999763 6 35556788999
Q ss_pred EEeeccCCceeEEEEEEEEEeCCCCC--CCccceEEEEEEeccCCCcCCCcceEEecCCCCCchHHHHHHcCCcccceEE
Q 047816 157 ERKYAEMSSSSGVLGEDIISFGNESD--LKPQRAVFGCENVETGDLYSQHADGIIGLGRGDLSVVDQLVEKGVISDSFSL 234 (620)
Q Consensus 157 ~~~Y~dg~~~~G~~~~D~v~lg~~~~--~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~~~~s~~~~L~~~g~I~~~FSl 234 (620)
.+.|+++..+.|.+++|+|+++.... ....++.|||+....+.+. ..+||||||++..||+.||.++ ..+.||+
T Consensus 72 ~~~y~~~s~~~G~l~~D~~~~~~~~~~~~~~~~~~FGC~~~~~g~~~--~~~GilGLg~~~~Sl~sQl~~~--~~~~FSy 147 (164)
T PF14543_consen 72 SQSYGDGSSSSGFLASDTLTFGSSSGGSNSVPDFIFGCATSNSGLFY--GADGILGLGRGPLSLPSQLASS--SGNKFSY 147 (164)
T ss_dssp EEEETTTEEEEEEEEEEEEEEEEESSSSEEEEEEEEEEE-GGGTSST--TEEEEEE-SSSTTSHHHHHHHH----SEEEE
T ss_pred eeecCCCccccCceEEEEEEecCCCCCCceeeeEEEEeeeccccCCc--CCCcccccCCCcccHHHHHHHh--cCCeEEE
Confidence 99999999999999999999987431 3467899999999886443 7899999999999999999998 5589999
Q ss_pred eecC-CCCCCceEEECC
Q 047816 235 CYGG-MDVGGGAMVLGG 250 (620)
Q Consensus 235 ~l~~-~~~~~G~l~fGg 250 (620)
||.+ .....|.|+||+
T Consensus 148 CL~~~~~~~~g~l~fG~ 164 (164)
T PF14543_consen 148 CLPSSSPSSSGFLSFGD 164 (164)
T ss_dssp EB-S-SSSSEEEEEECS
T ss_pred ECCCCCCCCCEEEEeCc
Confidence 9998 456779999996
No 25
>PF14541 TAXi_C: Xylanase inhibitor C-terminal; PDB: 3AUP_D 3HD8_A 1T6G_A 1T6E_X 2B42_A 3VLB_A 3VLA_A.
Probab=99.87 E-value=1.3e-21 Score=183.45 Aligned_cols=155 Identities=35% Similarity=0.663 Sum_probs=120.2
Q ss_pred eeEEEEeEEEEccEEecCCCCcc---CCCCceEeeccceeeeecHHHHHHHHHHHHHHhhhccccc-CCCCCCccccccC
Q 047816 269 YYNIDLKVIHVAGKPLPLNPKVF---DGKHGTVLDSGTTYAYLPEAAFLAFKDAIMSELQSLKQIR-GPDPNYNDICFSG 344 (620)
Q Consensus 269 ~w~v~l~~i~v~g~~~~~~~~~~---~~~~~ailDSGtt~~~LP~~~~~~i~~~l~~~~~~~~~~~-~~~~~~~~~C~~~ 344 (620)
+|.|+|++|+|+++.+.++...| ++...++|||||++++||+++|+++.+++.+++.....-. .........||..
T Consensus 1 ~Y~v~l~~Isvg~~~l~~~~~~~~~~~~~g~~iiDSGT~~T~L~~~~y~~l~~al~~~~~~~~~~~~~~~~~~~~~Cy~~ 80 (161)
T PF14541_consen 1 FYYVNLTGISVGGKRLPIPPSVFQLSDGSGGTIIDSGTTYTYLPPPVYDALVQALDAQMGAPGVSREAPPFSGFDLCYNL 80 (161)
T ss_dssp SEEEEEEEEEETTEEE---TTCSCETTSTCSEEE-SSSSSEEEEHHHHHHHHHHHHHHHHTCT--CEE---TT-S-EEEG
T ss_pred CccEEEEEEEECCEEecCChHHhhccCCCCCEEEECCCCccCCcHHHHHHHHHHHHHHhhhcccccccccCCCCCceeec
Confidence 48999999999999999988876 4567899999999999999999999999999987653110 1233556789987
Q ss_pred CCCCccccCCCCCeEEEEECCCcEEEeCCCCcEEEecccCCeEEEEEEec--CCCCceeehHhhhceEEEEEeCCCCEEE
Q 047816 345 APSDVSQLSDTFPAVEMAFGNGQKLLLAPENYLFRHSKVRGAYCLGIFQN--GRDPTTLLGGIIVRNTLVMYDREHSKIG 422 (620)
Q Consensus 345 ~~~~~~~~~~~~P~i~f~f~~g~~~~l~~~~yi~~~~~~~~~~Cl~~~~~--~~~~~~ILG~~fLr~~yvvfD~en~rIG 422 (620)
...........+|+|+|+|.+|.++++++++|++.... +..|+++... ..++..|||..+|++++++||++++|||
T Consensus 81 ~~~~~~~~~~~~P~i~l~F~~ga~l~l~~~~y~~~~~~--~~~Cla~~~~~~~~~~~~viG~~~~~~~~v~fDl~~~~ig 158 (161)
T PF14541_consen 81 SSFGVNRDWAKFPTITLHFEGGADLTLPPENYFVQVSP--GVFCLAFVPSDADDDGVSVIGNFQQQNYHVVFDLENGRIG 158 (161)
T ss_dssp GCS-EETTEESS--EEEEETTSEEEEE-HHHHEEEECT--TEEEESEEEETSTTSSSEEE-HHHCCTEEEEEETTTTEEE
T ss_pred cccccccccccCCeEEEEEeCCcceeeeccceeeeccC--CCEEEEEEccCCCCCCcEEECHHHhcCcEEEEECCCCEEE
Confidence 65323334478999999998899999999999998863 7899999888 5568899999999999999999999999
Q ss_pred EEe
Q 047816 423 FWK 425 (620)
Q Consensus 423 fA~ 425 (620)
|+|
T Consensus 159 F~~ 161 (161)
T PF14541_consen 159 FAP 161 (161)
T ss_dssp EEE
T ss_pred EeC
Confidence 986
No 26
>cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site
Probab=99.85 E-value=4.7e-21 Score=167.54 Aligned_cols=107 Identities=36% Similarity=0.658 Sum_probs=92.6
Q ss_pred EEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCC-CCCCCcccccccCcCCcccCCCCCcceeEEeeccCCce
Q 047816 88 TRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKF-EPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMSSS 166 (620)
Q Consensus 88 ~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y-~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~~~ 166 (620)
++|.||||+|++.|+|||||+++||+|..|..|..+....| +|+.|++++... |.|.+.|++| ++
T Consensus 1 ~~i~vGtP~q~~~~~~DTGSs~~Wv~~~~c~~~~~~~~~~~~~~~~sst~~~~~-------------~~~~~~Y~~g-~~ 66 (109)
T cd05470 1 IEIGIGTPPQTFNVLLDTGSSNLWVPSVDCQSLAIYSHSSYDDPSASSTYSDNG-------------CTFSITYGTG-SL 66 (109)
T ss_pred CEEEeCCCCceEEEEEeCCCCCEEEeCCCCCCcccccccccCCcCCCCCCCCCC-------------cEEEEEeCCC-eE
Confidence 47999999999999999999999999999997775545566 999999998865 6899999985 67
Q ss_pred eEEEEEEEEEeCCCCCCCccceEEEEEEeccCCC-cCCCcceEEec
Q 047816 167 SGVLGEDIISFGNESDLKPQRAVFGCENVETGDL-YSQHADGIIGL 211 (620)
Q Consensus 167 ~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~-~~~~~dGIlGL 211 (620)
.|.++.|+|+|++ ..+.++.|||+....+.+ .....+|||||
T Consensus 67 ~g~~~~D~v~ig~---~~~~~~~fg~~~~~~~~~~~~~~~~GilGL 109 (109)
T cd05470 67 SGGLSTDTVSIGD---IEVVGQAFGCATDEPGATFLPALFDGILGL 109 (109)
T ss_pred EEEEEEEEEEECC---EEECCEEEEEEEecCCccccccccccccCC
Confidence 8999999999998 667899999999887653 33568999998
No 27
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.90 E-value=3.1e-05 Score=65.27 Aligned_cols=93 Identities=15% Similarity=0.163 Sum_probs=64.5
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMS 164 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~ 164 (620)
.|++++.|| ++++++++|||++.+|+.......|... +. ......+...+|.
T Consensus 2 ~~~v~v~i~--~~~~~~llDTGa~~s~i~~~~~~~l~~~----~~----------------------~~~~~~~~~~~G~ 53 (96)
T cd05483 2 HFVVPVTIN--GQPVRFLLDTGASTTVISEELAERLGLP----LT----------------------LGGKVTVQTANGR 53 (96)
T ss_pred cEEEEEEEC--CEEEEEEEECCCCcEEcCHHHHHHcCCC----cc----------------------CCCcEEEEecCCC
Confidence 478999999 8999999999999999976432222210 00 0123456666666
Q ss_pred ceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecCC
Q 047816 165 SSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLGR 213 (620)
Q Consensus 165 ~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg~ 213 (620)
........+.+++|+ .+..++.+.+...... ..+||||+.+
T Consensus 54 ~~~~~~~~~~i~ig~---~~~~~~~~~v~d~~~~-----~~~gIlG~d~ 94 (96)
T cd05483 54 VRAARVRLDSLQIGG---ITLRNVPAVVLPGDAL-----GVDGLLGMDF 94 (96)
T ss_pred ccceEEEcceEEECC---cEEeccEEEEeCCccc-----CCceEeChHH
Confidence 666666789999998 6667777777654332 4799999863
No 28
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=96.58 E-value=0.0023 Score=55.94 Aligned_cols=37 Identities=19% Similarity=0.336 Sum_probs=25.3
Q ss_pred cchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816 578 TWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 578 ~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
..++. +|+||++| +++.++++.+++.|+|||+|+|..
T Consensus 59 h~fs~~~i~~Ii~g-v~aGvIg~Illi~y~irR~~Kk~~ 96 (122)
T PF01102_consen 59 HRFSEPAIIGIIFG-VMAGVIGIILLISYCIRRLRKKSS 96 (122)
T ss_dssp SSSS-TCHHHHHHH-HHHHHHHHHHHHHHHHHHHS----
T ss_pred cCccccceeehhHH-HHHHHHHHHHHHHHHHHHHhccCC
Confidence 36667 89999999 777777777777888888776654
No 29
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=96.49 E-value=0.012 Score=52.12 Aligned_cols=94 Identities=12% Similarity=0.179 Sum_probs=59.8
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE 162 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d 162 (620)
+|.|++++.|. ++++.++||||++.+-+....-.... .++.. . .....+.-+.
T Consensus 9 ~g~~~v~~~In--G~~~~flVDTGAs~t~is~~~A~~Lg------l~~~~------~-------------~~~~~~~ta~ 61 (121)
T TIGR02281 9 DGHFYATGRVN--GRNVRFLVDTGATSVALNEEDAQRLG------LDLNR------L-------------GYTVTVSTAN 61 (121)
T ss_pred CCeEEEEEEEC--CEEEEEEEECCCCcEEcCHHHHHHcC------CCccc------C-------------CceEEEEeCC
Confidence 67899999998 89999999999999987643211111 11110 0 0122333333
Q ss_pred CCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecC
Q 047816 163 MSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLG 212 (620)
Q Consensus 163 g~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg 212 (620)
|......+.-|.+.+|+ ....++.+.+..... ..+|+||+.
T Consensus 62 G~~~~~~~~l~~l~iG~---~~~~nv~~~v~~~~~------~~~~LLGm~ 102 (121)
T TIGR02281 62 GQIKAARVTLDRVAIGG---IVVNDVDAMVAEGGA------LSESLLGMS 102 (121)
T ss_pred CcEEEEEEEeCEEEECC---EEEeCcEEEEeCCCc------CCceEcCHH
Confidence 44344456789999999 677788877663221 137999986
No 30
>PF13650 Asp_protease_2: Aspartyl protease
Probab=95.88 E-value=0.041 Score=45.30 Aligned_cols=89 Identities=17% Similarity=0.241 Sum_probs=52.9
Q ss_pred EEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeeccCCcee
Q 047816 88 TRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAEMSSSS 167 (620)
Q Consensus 88 ~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~dg~~~~ 167 (620)
+++.|+ ++++++++|||++.+.+....+...... +... .....+.-.+|....
T Consensus 1 V~v~vn--g~~~~~liDTGa~~~~i~~~~~~~l~~~------~~~~-------------------~~~~~~~~~~g~~~~ 53 (90)
T PF13650_consen 1 VPVKVN--GKPVRFLIDTGASISVISRSLAKKLGLK------PRPK-------------------SVPISVSGAGGSVTV 53 (90)
T ss_pred CEEEEC--CEEEEEEEcCCCCcEEECHHHHHHcCCC------CcCC-------------------ceeEEEEeCCCCEEE
Confidence 367787 8999999999999887764332221111 0000 011223333344344
Q ss_pred EEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecC
Q 047816 168 GVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLG 212 (620)
Q Consensus 168 G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg 212 (620)
.....+.+++|+ .+..+..+-+.. .....+||||+-
T Consensus 54 ~~~~~~~i~ig~---~~~~~~~~~v~~------~~~~~~~iLG~d 89 (90)
T PF13650_consen 54 YRGRVDSITIGG---ITLKNVPFLVVD------LGDPIDGILGMD 89 (90)
T ss_pred EEEEEEEEEECC---EEEEeEEEEEEC------CCCCCEEEeCCc
Confidence 455667899998 556677766654 123578999974
No 31
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=95.45 E-value=0.0027 Score=53.48 Aligned_cols=36 Identities=22% Similarity=0.266 Sum_probs=23.4
Q ss_pred Cccccchhh-hhHHHHHHHHHHHHHHHHHH-HHHhhhhh
Q 047816 574 QVKRTWWQE-HFLMVVLAITIMMVVGLSVF-GILFILRR 610 (620)
Q Consensus 574 ~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~-~~~~~~r~ 610 (620)
+..+++++. +|+||++| +++++.+|.++ .|||++||
T Consensus 57 st~~~~ls~gaiagi~vg-~~~~v~~lv~~l~w~f~~r~ 94 (96)
T PTZ00382 57 GANRSGLSTGAIAGISVA-VVAVVGGLVGFLCWWFVCRG 94 (96)
T ss_pred ccCCCCcccccEEEEEee-hhhHHHHHHHHHhheeEEee
Confidence 455678888 99999998 55555455444 44455543
No 32
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionary conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=95.17 E-value=0.0058 Score=46.32 Aligned_cols=36 Identities=17% Similarity=0.288 Sum_probs=2.1
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccccCCCC
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQSVNSYK 619 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~~~~~ 619 (620)
.++|+++| +++.++..+++++++++|.|+|.+.+|.
T Consensus 10 vlaavIaG-~Vvgll~ailLIlf~iyR~rkkdEGSY~ 45 (64)
T PF01034_consen 10 VLAAVIAG-GVVGLLFAILLILFLIYRMRKKDEGSYD 45 (64)
T ss_dssp ----------------------------S------SS
T ss_pred HHHHHHHH-HHHHHHHHHHHHHHHHHHHHhcCCCCcc
Confidence 45566666 5555555666667888898988888884
No 33
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=94.22 E-value=0.18 Score=44.76 Aligned_cols=27 Identities=15% Similarity=0.228 Sum_probs=23.6
Q ss_pred CCceeehHhhhceEEEEEeCCCCEEEE
Q 047816 397 DPTTLLGGIIVRNTLVMYDREHSKIGF 423 (620)
Q Consensus 397 ~~~~ILG~~fLr~~yvvfD~en~rIGf 423 (620)
....|||..||+.+-.+.|+.+++|-+
T Consensus 98 ~~d~ILG~d~L~~~~~~ID~~~~~i~~ 124 (124)
T cd05479 98 DVDFLIGLDMLKRHQCVIDLKENVLRI 124 (124)
T ss_pred CcCEEecHHHHHhCCeEEECCCCEEEC
Confidence 446799999999999999999998853
No 34
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=94.15 E-value=0.19 Score=44.68 Aligned_cols=90 Identities=19% Similarity=0.196 Sum_probs=55.5
Q ss_pred eeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeE-Eeec-
Q 047816 84 GYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYE-RKYA- 161 (620)
Q Consensus 84 ~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~-~~Y~- 161 (620)
..+++++.|+ ++++.+++|||++.+++....+..|+... ... ..+. ...+
T Consensus 15 ~~~~v~~~In--g~~~~~LvDTGAs~s~Is~~~a~~lgl~~------~~~--------------------~~~~~~~~g~ 66 (124)
T cd05479 15 PMLYINVEIN--GVPVKAFVDSGAQMTIMSKACAEKCGLMR------LID--------------------KRFQGIAKGV 66 (124)
T ss_pred eEEEEEEEEC--CEEEEEEEeCCCceEEeCHHHHHHcCCcc------ccC--------------------cceEEEEecC
Confidence 3577899999 89999999999999998765444443320 000 0111 1222
Q ss_pred cCCceeEEEEEEEEEeCCCCCCCccceEEEEEEeccCCCcCCCcceEEecC
Q 047816 162 EMSSSSGVLGEDIISFGNESDLKPQRAVFGCENVETGDLYSQHADGIIGLG 212 (620)
Q Consensus 162 dg~~~~G~~~~D~v~lg~~~~~~~~~~~fg~~~~~~~~~~~~~~dGIlGLg 212 (620)
++....|..-.+.+.+++. .. ...|.+... ...|+|||+-
T Consensus 67 g~~~~~g~~~~~~l~i~~~---~~-~~~~~Vl~~-------~~~d~ILG~d 106 (124)
T cd05479 67 GTQKILGRIHLAQVKIGNL---FL-PCSFTVLED-------DDVDFLIGLD 106 (124)
T ss_pred CCcEEEeEEEEEEEEECCE---Ee-eeEEEEECC-------CCcCEEecHH
Confidence 2234566677788999983 22 245554421 1478999985
No 35
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway [].
Probab=93.22 E-value=0.011 Score=53.29 Aligned_cols=30 Identities=20% Similarity=0.455 Sum_probs=19.9
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
.++|+++|+++++++++++++.||..|+|+
T Consensus 50 IVIGvVVGVGg~ill~il~lvf~~c~r~kk 79 (154)
T PF04478_consen 50 IVIGVVVGVGGPILLGILALVFIFCIRRKK 79 (154)
T ss_pred EEEEEEecccHHHHHHHHHhheeEEEeccc
Confidence 699999997777776666655444444443
No 36
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=92.59 E-value=0.045 Score=58.78 Aligned_cols=37 Identities=19% Similarity=0.289 Sum_probs=27.9
Q ss_pred Cccccchhh-hhHHHHHHHHHHHHHHHH-HHHHHhhhhhh
Q 047816 574 QVKRTWWQE-HFLMVVLAITIMMVVGLS-VFGILFILRRR 611 (620)
Q Consensus 574 ~~~~~~~~~-~~~~i~~~~~~~~~~~l~-~~~~~~~~r~r 611 (620)
...|+++|+ +|+||+++ +|++|-+|+ +|.||||.|+|
T Consensus 358 ~~n~s~LstgaIaGIsva-vvvvVgglvGfLcWwf~crgk 396 (397)
T PF03302_consen 358 STNKSGLSTGAIAGISVA-VVVVVGGLVGFLCWWFICRGK 396 (397)
T ss_pred Ccccccccccceeeeeeh-hHHHHHHHHHHHhhheeeccc
Confidence 456789999 99999999 666666664 45577777765
No 37
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) [] are integral membrane proteins, specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below. +-----+ +-----+ +-----+ +-----+ | | | | | | | | xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx +--------------------------++Hinge++--------------------------++TM++C+ In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin) [] is a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge; a single lamp-like domain; a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region []. In a lamp-family protein from nematodes [] only the part C-terminal to the hinge is conserved. ; GO: 0016020 membrane
Probab=91.54 E-value=0.14 Score=53.04 Aligned_cols=34 Identities=21% Similarity=0.340 Sum_probs=25.0
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccccCCCC
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQSVNSYK 619 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~~~~~ 619 (620)
.+|-|++| ++.++++|++++.|+|.|||+++ -|+
T Consensus 271 ~~vPIaVG-~~La~lvlivLiaYli~Rrr~~~--gYq 304 (306)
T PF01299_consen 271 DLVPIAVG-AALAGLVLIVLIAYLIGRRRSRA--GYQ 304 (306)
T ss_pred chHHHHHH-HHHHHHHHHHHHhheeEeccccc--ccc
Confidence 46778888 55566777778889999988765 454
No 38
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=90.52 E-value=0.31 Score=48.56 Aligned_cols=33 Identities=18% Similarity=0.255 Sum_probs=17.8
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhhccccCCC
Q 047816 586 MVVLAITIMMVVGLSVFGILFILRRRRQSVNSY 618 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~~~~ 618 (620)
||++=+.++++|+|.++=+|+.|||++.-.|.+
T Consensus 262 giaalvllil~vvliiLYiWlyrrRK~swkhe~ 294 (295)
T TIGR01478 262 GIAALVLIILTVVLIILYIWLYRRRKKSWKHEC 294 (295)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhcccccccc
Confidence 777544556666665555555555544333443
No 39
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=89.19 E-value=0.34 Score=49.60 Aligned_cols=31 Identities=23% Similarity=0.414 Sum_probs=20.7
Q ss_pred hhhhHHHHHHHHHHHHHHHHHHHHHhhhhhhccc
Q 047816 581 QEHFLMVVLAITIMMVVGLSVFGILFILRRRRQS 614 (620)
Q Consensus 581 ~~~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~ 614 (620)
+.+|++.+ +++++++|..+++++|||.||++
T Consensus 255 ~t~I~aSi---iaIliIVLIMvIIYLILRYRRKK 285 (299)
T PF02009_consen 255 TTAIIASI---IAILIIVLIMVIIYLILRYRRKK 285 (299)
T ss_pred HHHHHHHH---HHHHHHHHHHHHHHHHHHHHHHh
Confidence 33444444 45566677778889999988854
No 40
>PF05454 DAG1: Dystroglycan (Dystrophin-associated glycoprotein 1); InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton: [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear [].; PDB: 1EG4_P.
Probab=88.78 E-value=0.13 Score=52.24 Aligned_cols=91 Identities=15% Similarity=0.266 Sum_probs=0.0
Q ss_pred EEEEEeecCCCCC---CchhHHHHHHhhccc-ccccceEEeeeeecCCceeeEEEEecCCCcccccHHHHHHHHHHHccC
Q 047816 478 FDMFLSINYSDLR---PHIPELADSIAQELD-VNTSQVHLLNFMSKGNNSFIAWAVFPSGSANYISNATALRIISRLAEH 553 (620)
Q Consensus 478 ~~~~~~~~~~~~~---~~~~~~~~~~~~~l~-~~~~qv~~~~~~~~g~~~~~~~~~~P~~~~~~f~~~~~~~i~~~~~~~ 553 (620)
|.+.|...+-.|. -....|.+-||..++ -+.+++.|.++. .+...+.|-=--- ....=...++.+++.+|...
T Consensus 3 F~~~l~~d~~~f~~dv~~ki~lVekLA~~~GD~nts~ItV~sIt--~gstiVtwtNnTL-p~~~CP~eeI~~L~~~L~~~ 79 (290)
T PF05454_consen 3 FSATLDIDYESFNNDVQRKILLVEKLARLFGDRNTSSITVRSIT--SGSTIVTWTNNTL-PTSPCPKEEIEKLRKRLVDD 79 (290)
T ss_dssp --------------------------------------------------------------------------------
T ss_pred eEEEEcCCHHHhhhhHHHHHHHHHHHHHHhCCCCCCeEEEEEec--CCCEEEEEEcCCC-CCCCCCHHHHHHHHHHHhcC
Confidence 4455544454442 223358888998888 567899999987 3444455521111 12223456677777777666
Q ss_pred ccCCC----CCCcce-eeeeeee
Q 047816 554 RVHIP----DTFGNY-KLLQWNI 571 (620)
Q Consensus 554 ~~~~~----~~fG~y-~l~~~~~ 571 (620)
+-.+. ..+||. .+.+.++
T Consensus 80 ~g~~~~~f~~am~pef~V~svsv 102 (290)
T PF05454_consen 80 DGKPSQEFVRAMGPEFKVKSVSV 102 (290)
T ss_dssp -----------------------
T ss_pred CCCcCHHHHHHhCCCCceeEEEE
Confidence 53322 456643 3444443
No 41
>PF08693 SKG6: Transmembrane alpha-helix domain; InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation [, ]. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins [, ]. The full-length AXL2 protein has a negative regulatory function in cytokinesis [].
Probab=88.60 E-value=0.041 Score=37.76 Aligned_cols=8 Identities=50% Similarity=0.899 Sum_probs=3.5
Q ss_pred Hhhhhhhc
Q 047816 605 LFILRRRR 612 (620)
Q Consensus 605 ~~~~r~r~ 612 (620)
+++||||+
T Consensus 32 l~~~~rR~ 39 (40)
T PF08693_consen 32 LFFWYRRK 39 (40)
T ss_pred hheEEecc
Confidence 34444443
No 42
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host []. This region called CR1 (conserved region 1) [] is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=88.36 E-value=0.96 Score=30.63 Aligned_cols=29 Identities=7% Similarity=0.182 Sum_probs=12.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 585 LMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 585 ~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
++|++|+++.+++..+.+..++..+||.+
T Consensus 6 IaIIv~V~vg~~iiii~~~~YaCcykk~~ 34 (38)
T PF02439_consen 6 IAIIVAVVVGMAIIIICMFYYACCYKKHR 34 (38)
T ss_pred hhHHHHHHHHHHHHHHHHHHHHHHHcccc
Confidence 45555523333333333334444554443
No 43
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=87.98 E-value=0.55 Score=39.01 Aligned_cols=28 Identities=25% Similarity=0.379 Sum_probs=24.7
Q ss_pred EEEEEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816 86 YTTRLWIGTPPQTFALIVDTGSTVTYVPCA 115 (620)
Q Consensus 86 Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~ 115 (620)
|++.+.|+ ++++.+++||||+.+++...
T Consensus 1 ~~~~~~In--g~~i~~lvDTGA~~svis~~ 28 (91)
T cd05484 1 KTVTLLVN--GKPLKFQLDTGSAITVISEK 28 (91)
T ss_pred CEEEEEEC--CEEEEEEEcCCcceEEeCHH
Confidence 35789999 99999999999999999754
No 44
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where
Probab=87.60 E-value=2.7 Score=34.45 Aligned_cols=27 Identities=19% Similarity=0.263 Sum_probs=22.0
Q ss_pred EEEecCCCcEEEEEEeCCCCceeEeCCCC
Q 047816 89 RLWIGTPPQTFALIVDTGSTVTYVPCATC 117 (620)
Q Consensus 89 ~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c 117 (620)
.+.|. ++++++++|||++.+-+....+
T Consensus 2 ~v~In--G~~~~fLvDTGA~~tii~~~~a 28 (86)
T cd06095 2 TITVE--GVPIVFLVDTGATHSVLKSDLG 28 (86)
T ss_pred EEEEC--CEEEEEEEECCCCeEEECHHHh
Confidence 45666 8999999999999999976543
No 45
>PTZ00370 STEVOR; Provisional
Probab=87.45 E-value=0.42 Score=47.72 Aligned_cols=26 Identities=19% Similarity=0.324 Sum_probs=14.7
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816 586 MVVLAITIMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~r~r 611 (620)
||++=+.++++|+|.++=+|+.|||+
T Consensus 258 giaalvllil~vvliilYiwlyrrRK 283 (296)
T PTZ00370 258 GIAALVLLILAVVLIILYIWLYRRRK 283 (296)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence 77754455555555555555555544
No 46
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=86.89 E-value=7.5 Score=34.23 Aligned_cols=37 Identities=19% Similarity=0.220 Sum_probs=27.5
Q ss_pred CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816 266 RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 266 ~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i 316 (620)
..++|.++ +.|+|+.. .+++|||.+.+.++++..+++
T Consensus 8 ~~g~~~v~---~~InG~~~-----------~flVDTGAs~t~is~~~A~~L 44 (121)
T TIGR02281 8 GDGHFYAT---GRVNGRNV-----------RFLVDTGATSVALNEEDAQRL 44 (121)
T ss_pred CCCeEEEE---EEECCEEE-----------EEEEECCCCcEEcCHHHHHHc
Confidence 34555444 56787754 489999999999999987663
No 47
>PHA03286 envelope glycoprotein E; Provisional
Probab=85.75 E-value=1.4 Score=46.73 Aligned_cols=76 Identities=18% Similarity=0.171 Sum_probs=44.3
Q ss_pred cccHHHHHHHHHHHccCccCCCCCCcceeeeeeeec-----CC------ccccchhh-hhHHHHHHHHHHHHHHHHHHHH
Q 047816 537 YISNATALRIISRLAEHRVHIPDTFGNYKLLQWNIE-----PQ------VKRTWWQE-HFLMVVLAITIMMVVGLSVFGI 604 (620)
Q Consensus 537 ~f~~~~~~~i~~~~~~~~~~~~~~fG~y~l~~~~~~-----~~------~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~ 604 (620)
+-=.||+.+++..+.+++-| .||+-.++.-+.. |. ..++-+.. .+..+++| ++++++.....+.
T Consensus 337 YtlvST~~~fvNVi~e~~~P---~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~s~~~~-~~~~~~~~~~~~~ 412 (492)
T PHA03286 337 YVYLSTLETILNVFEDVHKP---GFGYNAVSANDPGNFTAAPTHTIAFKEGPTVIYSLLVSSMAAG-AILVVLLFALCIA 412 (492)
T ss_pred EEEEehHHHhhhhhhhccCC---CCCCcccccCCccccccCCcchhhhccCCeEEHHHHHHHHHHH-HHHHHHHHHHHhH
Confidence 44568899999999998777 4886665543221 21 11222333 34456666 5555555555566
Q ss_pred HhhhhhhccccC
Q 047816 605 LFILRRRRQSVN 616 (620)
Q Consensus 605 ~~~~r~r~~~~~ 616 (620)
.+++|||+++..
T Consensus 413 ~~~~r~~~~r~~ 424 (492)
T PHA03286 413 GLYRRRRRHRTN 424 (492)
T ss_pred hHhhhhhhhhcc
Confidence 677766654433
No 48
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=85.45 E-value=3.2 Score=35.72 Aligned_cols=24 Identities=17% Similarity=0.282 Sum_probs=20.9
Q ss_pred CceeehHhhhceEEEEEeCCCCEE
Q 047816 398 PTTLLGGIIVRNTLVMYDREHSKI 421 (620)
Q Consensus 398 ~~~ILG~~fLr~~yvvfD~en~rI 421 (620)
+..+||..||+.+-++.|+.++++
T Consensus 84 ~~~LLG~~~L~~l~l~id~~~~~~ 107 (107)
T TIGR03698 84 DEPLLGTELLEGLGIVIDYRNQGL 107 (107)
T ss_pred CccEecHHHHhhCCEEEehhhCcC
Confidence 478999999999999999987753
No 49
>PF08284 RVP_2: Retroviral aspartyl protease; InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.
Probab=84.58 E-value=6.1 Score=35.56 Aligned_cols=28 Identities=14% Similarity=0.131 Sum_probs=25.4
Q ss_pred CceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 398 PTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 398 ~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
-..|||..+|+.+..+.|..+++|.|-.
T Consensus 104 ~DvILGm~WL~~~~~~IDw~~k~v~f~~ 131 (135)
T PF08284_consen 104 YDVILGMDWLKKHNPVIDWATKTVTFNS 131 (135)
T ss_pred eeeEeccchHHhCCCEEEccCCEEEEeC
Confidence 4589999999999999999999999864
No 50
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=83.78 E-value=3 Score=39.83 Aligned_cols=79 Identities=13% Similarity=0.149 Sum_probs=53.5
Q ss_pred eeeeccCCccceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCC
Q 047816 73 RMRLYDDLLLNGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERA 152 (620)
Q Consensus 73 ~~~l~~~~~~~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~ 152 (620)
.+.+..+ .+|.|.++..|- +|++..+||||-+.+-+...+... -.|+....
T Consensus 95 ~v~Lak~--~~GHF~a~~~VN--Gk~v~fLVDTGATsVal~~~dA~R------lGid~~~l------------------- 145 (215)
T COG3577 95 EVSLAKS--RDGHFEANGRVN--GKKVDFLVDTGATSVALNEEDARR------LGIDLNSL------------------- 145 (215)
T ss_pred EEEEEec--CCCcEEEEEEEC--CEEEEEEEecCcceeecCHHHHHH------hCCCcccc-------------------
Confidence 4555554 488999999998 999999999999998887543211 12333221
Q ss_pred cceeEEeeccCCceeEEEEEEEEEeCCC
Q 047816 153 QCVYERKYAEMSSSSGVLGEDIISFGNE 180 (620)
Q Consensus 153 ~~~~~~~Y~dg~~~~G~~~~D~v~lg~~ 180 (620)
..++.+.-++|..-...+-.|.|.||+.
T Consensus 146 ~y~~~v~TANG~~~AA~V~Ld~v~IG~I 173 (215)
T COG3577 146 DYTITVSTANGRARAAPVTLDRVQIGGI 173 (215)
T ss_pred CCceEEEccCCccccceEEeeeEEEccE
Confidence 1234555566555555677899999993
No 51
>PF05808 Podoplanin: Podoplanin; InterPro: IPR008783 This family consists of several mammalian podoplanin-like proteins which are thought to control specifically the unique shape of podocytes [].; GO: 0016021 integral to membrane; PDB: 3IET_X.
Probab=83.68 E-value=0.34 Score=44.19 Aligned_cols=35 Identities=11% Similarity=0.345 Sum_probs=0.0
Q ss_pred Cccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhh
Q 047816 574 QVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRR 610 (620)
Q Consensus 574 ~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~ 610 (620)
...|.++++ .+|||++| +++++++++-+++++.||
T Consensus 120 t~ek~GL~T~tLVGIIVG--VLlaIG~igGIIivvvRK 155 (162)
T PF05808_consen 120 TVEKDGLSTVTLVGIIVG--VLLAIGFIGGIIIVVVRK 155 (162)
T ss_dssp --------------------------------------
T ss_pred ccccCCcceeeeeeehhh--HHHHHHHHhheeeEEeeh
Confidence 356899999 99999998 555666666556666665
No 52
>PF13975 gag-asp_proteas: gag-polyprotein putative aspartyl protease
Probab=82.70 E-value=2 Score=33.97 Aligned_cols=36 Identities=22% Similarity=0.330 Sum_probs=29.9
Q ss_pred ceeEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCC
Q 047816 83 NGYYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHC 120 (620)
Q Consensus 83 ~~~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C 120 (620)
.+.+++++.|| ++.+..++|||++...|..+.+..+
T Consensus 6 ~g~~~v~~~I~--g~~~~alvDtGat~~fis~~~a~rL 41 (72)
T PF13975_consen 6 PGLMYVPVSIG--GVQVKALVDTGATHNFISESLAKRL 41 (72)
T ss_pred CCEEEEEEEEC--CEEEEEEEeCCCcceecCHHHHHHh
Confidence 46788999999 8999999999999998876554433
No 53
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=81.45 E-value=7.3 Score=32.17 Aligned_cols=30 Identities=30% Similarity=0.543 Sum_probs=24.7
Q ss_pred EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816 276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i 316 (620)
.+.|+|+.+. +++|||++.+.++++.+..+
T Consensus 4 ~~~Ing~~i~-----------~lvDTGA~~svis~~~~~~l 33 (91)
T cd05484 4 TLLVNGKPLK-----------FQLDTGSAITVISEKTWRKL 33 (91)
T ss_pred EEEECCEEEE-----------EEEcCCcceEEeCHHHHHHh
Confidence 4567887664 79999999999999998764
No 54
>PTZ00046 rifin; Provisional
Probab=80.45 E-value=1.7 Score=45.17 Aligned_cols=30 Identities=30% Similarity=0.462 Sum_probs=22.1
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816 586 MVVLAITIMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
+|++++..+++++|..++++||.|.||++.
T Consensus 316 aIiaSiiAIvVIVLIMvIIYLILRYRRKKK 345 (358)
T PTZ00046 316 AIIASIVAIVVIVLIMVIIYLILRYRRKKK 345 (358)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhhcch
Confidence 444454566777788888999999998653
No 55
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=79.94 E-value=1.8 Score=44.83 Aligned_cols=30 Identities=27% Similarity=0.446 Sum_probs=22.3
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816 586 MVVLAITIMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
+|++++..+++++|..++++||.|.||++.
T Consensus 311 ~IiaSiIAIvvIVLIMvIIYLILRYRRKKK 340 (353)
T TIGR01477 311 PIIASIIAILIIVLIMVIIYLILRYRRKKK 340 (353)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhhcch
Confidence 455555666777778888999999998653
No 56
>PF15176 LRR19-TM: Leucine-rich repeat family 19 TM domain
Probab=76.25 E-value=4.9 Score=33.62 Aligned_cols=39 Identities=8% Similarity=-0.003 Sum_probs=22.2
Q ss_pred CCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 573 PQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 573 ~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
+..++.+.+. .+|||+++ ++++-+.+.+++=+=+|||.+
T Consensus 8 ~~~~~~g~sW~~LVGVv~~-al~~SlLIalaaKC~~~~k~~ 47 (102)
T PF15176_consen 8 PGPGEGGRSWPFLVGVVVT-ALVTSLLIALAAKCPVWYKYL 47 (102)
T ss_pred CCCCCCCcccHhHHHHHHH-HHHHHHHHHHHHHhHHHHHHH
Confidence 3344556666 88999887 444444434444455566554
No 57
>PF05393 Hum_adeno_E3A: Human adenovirus early E3A glycoprotein; InterPro: IPR008652 This family consists of several early glycoproteins (E3A), from human adenovirus type 2.; GO: 0016021 integral to membrane
Probab=74.93 E-value=3.7 Score=33.27 Aligned_cols=25 Identities=24% Similarity=0.377 Sum_probs=12.3
Q ss_pred HHHHHHHHHHhhhhhhcccc-CCCCC
Q 047816 596 VVGLSVFGILFILRRRRQSV-NSYKP 620 (620)
Q Consensus 596 ~~~l~~~~~~~~~r~r~~~~-~~~~~ 620 (620)
+..|+.+.++..|+||+|++ -.|+|
T Consensus 43 iFil~VilwfvCC~kRkrsRrPIYrP 68 (94)
T PF05393_consen 43 IFILLVILWFVCCKKRKRSRRPIYRP 68 (94)
T ss_pred HHHHHHHHHHHHHHHhhhccCCcccc
Confidence 33334444444455555444 46776
No 58
>PF00077 RVP: Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026; InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=74.56 E-value=4.1 Score=34.18 Aligned_cols=26 Identities=19% Similarity=0.323 Sum_probs=22.6
Q ss_pred EEEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816 88 TRLWIGTPPQTFALIVDTGSTVTYVPCA 115 (620)
Q Consensus 88 ~~i~iGTP~Q~~~v~vDTGSs~~WV~~~ 115 (620)
.+|.|. ++++..++||||+.+-++..
T Consensus 8 i~v~i~--g~~i~~LlDTGA~vsiI~~~ 33 (100)
T PF00077_consen 8 ITVKIN--GKKIKALLDTGADVSIISEK 33 (100)
T ss_dssp EEEEET--TEEEEEEEETTBSSEEESSG
T ss_pred EEEeEC--CEEEEEEEecCCCcceeccc
Confidence 578888 88999999999999988764
No 59
>PF14575 EphA2_TM: Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=73.47 E-value=3.4 Score=33.05 Aligned_cols=27 Identities=11% Similarity=0.493 Sum_probs=10.3
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 584 FLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 584 ~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
|++++++ +++++++++++ +++++||++
T Consensus 2 ii~~~~~-g~~~ll~~v~~-~~~~~rr~~ 28 (75)
T PF14575_consen 2 IIASIIV-GVLLLLVLVII-VIVCFRRCK 28 (75)
T ss_dssp HHHHHHH-HHHHHHHHHHH-HHCCCTT--
T ss_pred EEehHHH-HHHHHHHhhee-EEEEEeeEc
Confidence 3443433 34444443333 344444443
No 60
>PF12768 Rax2: Cortical protein marker for cell polarity
Probab=71.95 E-value=2.3 Score=43.37 Aligned_cols=36 Identities=17% Similarity=0.291 Sum_probs=18.6
Q ss_pred ccchhh-hhHH--HHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 577 RTWWQE-HFLM--VVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 577 ~~~~~~-~~~~--i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
++.+++ .+|. ++++++++++++|+.+++..++|||+
T Consensus 221 ~~~l~~G~VVlIslAiALG~v~ll~l~Gii~~~~~r~~~ 259 (281)
T PF12768_consen 221 GKKLSRGFVVLISLAIALGTVFLLVLIGIILAYIRRRRQ 259 (281)
T ss_pred cccccceEEEEEehHHHHHHHHHHHHHHHHHHHHHhhhc
Confidence 355555 4444 44444455555555555555555544
No 61
>PF06697 DUF1191: Protein of unknown function (DUF1191); InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=70.45 E-value=1.1 Score=45.08 Aligned_cols=29 Identities=17% Similarity=0.256 Sum_probs=16.1
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
.++|+++| +++++.|+.+++++.+.||++
T Consensus 215 iv~g~~~G--~~~L~ll~~lv~~~vr~krk~ 243 (278)
T PF06697_consen 215 IVVGVVGG--VVLLGLLSLLVAMLVRYKRKK 243 (278)
T ss_pred EEEEehHH--HHHHHHHHHHHHhhhhhhHHH
Confidence 67777777 333444444555555555543
No 62
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=69.96 E-value=7 Score=34.82 Aligned_cols=40 Identities=15% Similarity=0.436 Sum_probs=26.1
Q ss_pred CCccccchhhhhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 573 PQVKRTWWQEHFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 573 ~~~~~~~~~~~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
+-...+-+++|+.-|++| .+++++.+..++.|+-+|||++
T Consensus 20 ~~~~psffsthm~tILia-IvVliiiiivli~lcssRKkKa 59 (189)
T PF05568_consen 20 PVTPPSFFSTHMYTILIA-IVVLIIIIIVLIYLCSSRKKKA 59 (189)
T ss_pred CCCCccHHHHHHHHHHHH-HHHHHHHHHHHHHHHhhhhHHH
Confidence 445566777888888888 5555555555556666666654
No 63
>PF13650 Asp_protease_2: Aspartyl protease
Probab=69.33 E-value=5.9 Score=32.12 Aligned_cols=29 Identities=21% Similarity=0.487 Sum_probs=23.3
Q ss_pred EEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816 277 IHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 277 i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i 316 (620)
+.|+|+.+ .+++|||++.+.+.++.++++
T Consensus 3 v~vng~~~-----------~~liDTGa~~~~i~~~~~~~l 31 (90)
T PF13650_consen 3 VKVNGKPV-----------RFLIDTGASISVISRSLAKKL 31 (90)
T ss_pred EEECCEEE-----------EEEEcCCCCcEEECHHHHHHc
Confidence 55677654 489999999999999988764
No 64
>PF11925 DUF3443: Protein of unknown function (DUF3443); InterPro: IPR021847 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 400 and 434 amino acids in length. This protein has two conserved sequence motifs: NPV and DNNG.
Probab=68.81 E-value=12 Score=39.12 Aligned_cols=107 Identities=21% Similarity=0.275 Sum_probs=55.0
Q ss_pred EEEEecCCC----cEE-EEEEeCCCCceeEeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816 88 TRLWIGTPP----QTF-ALIVDTGSTVTYVPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE 162 (620)
Q Consensus 88 ~~i~iGTP~----Q~~-~v~vDTGSs~~WV~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d 162 (620)
+.|+|=.|+ |.+ +|+|||||.-+=|..+.-..-.. ...-...+ .-..+. || ..|++
T Consensus 26 VsVtVC~PGts~CqTIdnvlVDTGS~GLRi~~sAl~~~l~---~~Lp~~t~-~g~~la---EC------------~~F~s 86 (370)
T PF11925_consen 26 VSVTVCAPGTSNCQTIDNVLVDTGSYGLRIFASALPSSLA---GSLPQQTG-GGAPLA---EC------------AQFAS 86 (370)
T ss_pred eEEEEeCCCCCCceeeCcEEEeccchhhhHHHhhhchhhh---ccCCcccC-CCcchh---hh------------hhccC
Confidence 566665543 666 89999999988776542210000 00111111 011110 12 34565
Q ss_pred CCceeEEEEEEEEEeCCCCCCCccceEEEE----------EEecc--CCCcCCCcceEEecCCC
Q 047816 163 MSSSSGVLGEDIISFGNESDLKPQRAVFGC----------ENVET--GDLYSQHADGIIGLGRG 214 (620)
Q Consensus 163 g~~~~G~~~~D~v~lg~~~~~~~~~~~fg~----------~~~~~--~~~~~~~~dGIlGLg~~ 214 (620)
+..=|-+.+-.|+|+++....++-|.++= ..... .........||||+|.-
T Consensus 87 -gytWGsVr~AdV~igge~A~~iPiQvI~D~~~~~~P~sC~~~g~~~~t~~~lgaNGILGIg~~ 149 (370)
T PF11925_consen 87 -GYTWGSVRTADVTIGGETASSIPIQVIGDSAAPSVPSSCSNSGASMNTVADLGANGILGIGPF 149 (370)
T ss_pred -cccccceEEEEEEEcCeeccccCEEEEcCCCCCCCCchhhcCCCCCCCcccccCceEEeecCC
Confidence 55567788899999986433344444432 11110 00113467899999973
No 65
>PF12191 stn_TNFRSF12A: Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain; InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved []. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration []. The receptor may also play a role in cell-matrix interactions [].; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=67.73 E-value=1.6 Score=38.01 Aligned_cols=17 Identities=24% Similarity=0.442 Sum_probs=0.0
Q ss_pred HHHHHHHhhhhhhcccc
Q 047816 599 LSVFGILFILRRRRQSV 615 (620)
Q Consensus 599 l~~~~~~~~~r~r~~~~ 615 (620)
|.++..+++|||.||++
T Consensus 92 l~llsg~lv~rrcrrr~ 108 (129)
T PF12191_consen 92 LALLSGFLVWRRCRRRE 108 (129)
T ss_dssp -----------------
T ss_pred HHHHHHHHHHhhhhccc
Confidence 33333455555554443
No 66
>PF06024 DUF912: Nucleopolyhedrovirus protein of unknown function (DUF912); InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=67.60 E-value=4.1 Score=34.69 Aligned_cols=31 Identities=29% Similarity=0.410 Sum_probs=17.5
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQS 614 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~ 614 (620)
.++.++++ .+.+++.|.++.-++|.|.|+++
T Consensus 63 iili~lls-~v~IlVily~IyYFVILRer~~~ 93 (101)
T PF06024_consen 63 IILISLLS-FVCILVILYAIYYFVILRERQKS 93 (101)
T ss_pred chHHHHHH-HHHHHHHHhhheEEEEEeccccc
Confidence 55555555 44444555555566677766644
No 67
>PF03229 Alpha_GJ: Alphavirus glycoprotein J; InterPro: IPR004913 The exact function of the herpesvirus glycoprotein J is unknown, but it appears to play a role in the inhibition of apoptosis of the host cell [].; GO: 0019050 suppression by virus of host apoptosis
Probab=67.37 E-value=5.3 Score=34.17 Aligned_cols=27 Identities=26% Similarity=0.294 Sum_probs=19.0
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 584 FLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 584 ~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
+++.|+| ..+++.|.+++...+.||++
T Consensus 85 aLp~VIG--GLcaL~LaamGA~~LLrR~c 111 (126)
T PF03229_consen 85 ALPLVIG--GLCALTLAAMGAGALLRRCC 111 (126)
T ss_pred chhhhhh--HHHHHHHHHHHHHHHHHHHH
Confidence 4588887 56677788888776666553
No 68
>PF05545 FixQ: Cbb3-type cytochrome oxidase component FixQ; InterPro: IPR008621 This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon [].
Probab=66.95 E-value=6.6 Score=28.50 Aligned_cols=23 Identities=17% Similarity=0.134 Sum_probs=14.9
Q ss_pred HHHHHHHHHHHHHHhhhhhhccc
Q 047816 592 TIMMVVGLSVFGILFILRRRRQS 614 (620)
Q Consensus 592 ~~~~~~~l~~~~~~~~~r~r~~~ 614 (620)
.+++.+...++++|++|+||+++
T Consensus 15 ~v~~~~~F~gi~~w~~~~~~k~~ 37 (49)
T PF05545_consen 15 TVLFFVFFIGIVIWAYRPRNKKR 37 (49)
T ss_pred HHHHHHHHHHHHHHHHcccchhh
Confidence 45555566677778887776543
No 69
>PF15102 TMEM154: TMEM154 protein family
Probab=66.47 E-value=2.4 Score=38.21 Aligned_cols=6 Identities=33% Similarity=0.905 Sum_probs=2.5
Q ss_pred HHHHHH
Q 047816 585 LMVVLA 590 (620)
Q Consensus 585 ~~i~~~ 590 (620)
+.|++.
T Consensus 59 LmIlIP 64 (146)
T PF15102_consen 59 LMILIP 64 (146)
T ss_pred EEEeHH
Confidence 344444
No 70
>PF12384 Peptidase_A2B: Ty3 transposon peptidase; InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=65.69 E-value=19 Score=33.23 Aligned_cols=22 Identities=14% Similarity=0.302 Sum_probs=18.9
Q ss_pred CceEeeccceeeeecHHHHHHH
Q 047816 295 HGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 295 ~~ailDSGtt~~~LP~~~~~~i 316 (620)
..++||||++..+.-.++.+.+
T Consensus 46 i~vLfDSGSPTSfIr~di~~kL 67 (177)
T PF12384_consen 46 IKVLFDSGSPTSFIRSDIVEKL 67 (177)
T ss_pred EEEEEeCCCccceeehhhHHhh
Confidence 3589999999999999887764
No 71
>PF13975 gag-asp_proteas: gag-polyprotein putative aspartyl protease
Probab=65.61 E-value=8.9 Score=30.25 Aligned_cols=30 Identities=17% Similarity=0.407 Sum_probs=24.1
Q ss_pred EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816 276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i 316 (620)
.+.|+|..+ .+++|||.+..+++.+.++.+
T Consensus 12 ~~~I~g~~~-----------~alvDtGat~~fis~~~a~rL 41 (72)
T PF13975_consen 12 PVSIGGVQV-----------KALVDTGATHNFISESLAKRL 41 (72)
T ss_pred EEEECCEEE-----------EEEEeCCCcceecCHHHHHHh
Confidence 355677655 389999999999999998775
No 72
>PHA03265 envelope glycoprotein D; Provisional
Probab=65.14 E-value=2.1 Score=43.92 Aligned_cols=29 Identities=17% Similarity=0.191 Sum_probs=18.2
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
-.+||++| +.++-+++..+++++.||||+
T Consensus 348 ~~~g~~ig-~~i~glv~vg~il~~~~rr~k 376 (402)
T PHA03265 348 TFVGISVG-LGIAGLVLVGVILYVCLRRKK 376 (402)
T ss_pred cccceEEc-cchhhhhhhhHHHHHHhhhhh
Confidence 57788887 444455555555666777663
No 73
>PF15065 NCU-G1: Lysosomal transcription factor, NCU-G1
Probab=63.59 E-value=7.3 Score=40.92 Aligned_cols=49 Identities=10% Similarity=0.218 Sum_probs=25.9
Q ss_pred eeeeeeec---CCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 565 KLLQWNIE---PQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 565 ~l~~~~~~---~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
+.+.|+.. -..+...+|. .|+.|++|.++-+++.|+..+.++++|+|+|
T Consensus 297 ~ylsWt~~~G~G~PP~d~~S~lvi~i~~vgLG~P~l~li~Ggl~v~~~r~r~~ 349 (350)
T PF15065_consen 297 NYLSWTFLIGYGSPPVDSFSPLVIMIMAVGLGVPLLLLILGGLYVCLRRRRKR 349 (350)
T ss_pred ceEEEEEecccCCCCccchhHHHHHHHHHHhhHHHHHHHHhhheEEEeccccC
Confidence 35566642 1245677888 5555667755554444444434444444444
No 74
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=62.64 E-value=9.6 Score=31.46 Aligned_cols=25 Identities=28% Similarity=0.498 Sum_probs=21.3
Q ss_pred EEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816 89 RLWIGTPPQTFALIVDTGSTVTYVPCA 115 (620)
Q Consensus 89 ~i~iGTP~Q~~~v~vDTGSs~~WV~~~ 115 (620)
.+.|+ +|.+..++|||++++-+...
T Consensus 2 ~~~i~--g~~~~~llDTGAd~Tvi~~~ 26 (87)
T cd05482 2 TLYIN--GKLFEGLLDTGADVSIIAEN 26 (87)
T ss_pred EEEEC--CEEEEEEEccCCCCeEEccc
Confidence 36677 89999999999999998753
No 75
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation [].; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=62.55 E-value=2.5 Score=45.98 Aligned_cols=23 Identities=9% Similarity=0.114 Sum_probs=9.4
Q ss_pred eEEEEECCCcEEEeCCCCcEEEe
Q 047816 358 AVEMAFGNGQKLLLAPENYLFRH 380 (620)
Q Consensus 358 ~i~f~f~~g~~~~l~~~~yi~~~ 380 (620)
.|...+.+...|.+.-+-|.++.
T Consensus 170 ~l~~~~~d~~~f~~~i~W~~~~~ 192 (439)
T PF02480_consen 170 HLQSEAHDDPPFSLEIDWYYMPT 192 (439)
T ss_dssp EEEEEESSS--EEEEEEEEEE--
T ss_pred EEEeccCCCCCeeEEEEEEEecC
Confidence 44444433355555555555554
No 76
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=61.52 E-value=6.4 Score=40.42 Aligned_cols=24 Identities=17% Similarity=0.368 Sum_probs=15.0
Q ss_pred HHHHHHHHHHHhhhhhhc-cccCCC
Q 047816 595 MVVGLSVFGILFILRRRR-QSVNSY 618 (620)
Q Consensus 595 ~~~~l~~~~~~~~~r~r~-~~~~~~ 618 (620)
+++.|.+..+|-.||||+ ++.++|
T Consensus 269 VLIMvIIYLILRYRRKKKmkKKlQY 293 (299)
T PF02009_consen 269 VLIMVIIYLILRYRRKKKMKKKLQY 293 (299)
T ss_pred HHHHHHHHHHHHHHHHhhhhHHHHH
Confidence 345556666789999765 444444
No 77
>PF07213 DAP10: DAP10 membrane protein; InterPro: IPR009861 This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells [].
Probab=57.93 E-value=16 Score=29.26 Aligned_cols=19 Identities=11% Similarity=-0.031 Sum_probs=12.0
Q ss_pred chhh-hhHHHHHHHHHHHHHH
Q 047816 579 WWQE-HFLMVVLAITIMMVVG 598 (620)
Q Consensus 579 ~~~~-~~~~i~~~~~~~~~~~ 598 (620)
.++. .++||++| =+++.+.
T Consensus 30 ~ls~g~LaGiV~~-D~vlTLL 49 (79)
T PF07213_consen 30 PLSPGLLAGIVAA-DAVLTLL 49 (79)
T ss_pred ccCHHHHHHHHHH-HHHHHHH
Confidence 4566 78899887 4444333
No 78
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacterial aspartate proteases is a subfamily of the retropepsin-like protease family, which includes enzymes from retroviruses and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, these bacterial aspartate proteases are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=57.83 E-value=15 Score=29.94 Aligned_cols=30 Identities=20% Similarity=0.427 Sum_probs=23.2
Q ss_pred EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816 276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i 316 (620)
.+.|+++.+ .+++|||++.+.++.+..+.+
T Consensus 6 ~v~i~~~~~-----------~~llDTGa~~s~i~~~~~~~l 35 (96)
T cd05483 6 PVTINGQPV-----------RFLLDTGASTTVISEELAERL 35 (96)
T ss_pred EEEECCEEE-----------EEEEECCCCcEEcCHHHHHHc
Confidence 456676654 489999999999999876654
No 79
>PF06365 CD34_antigen: CD34/Podocalyxin family; InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation [, ]. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion [].
Probab=56.64 E-value=8.2 Score=37.07 Aligned_cols=30 Identities=10% Similarity=0.134 Sum_probs=20.0
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
.+|++++. +.+++++++.++.|++|+||+.
T Consensus 101 ~lI~lv~~-g~~lLla~~~~~~Y~~~~Rrs~ 130 (202)
T PF06365_consen 101 TLIALVTS-GSFLLLAILLGAGYCCHQRRSW 130 (202)
T ss_pred EEEehHHh-hHHHHHHHHHHHHHHhhhhccC
Confidence 56665544 4556666666777888888864
No 80
>TIGR01167 LPXTG_anchor LPXTG-motif cell wall anchor domain. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region.
Probab=56.03 E-value=13 Score=24.36 Aligned_cols=10 Identities=30% Similarity=0.717 Sum_probs=4.8
Q ss_pred HHHhhhhhhc
Q 047816 603 GILFILRRRR 612 (620)
Q Consensus 603 ~~~~~~r~r~ 612 (620)
+.++++|||+
T Consensus 24 ~~~~~~~rk~ 33 (34)
T TIGR01167 24 GGLLLRKRKK 33 (34)
T ss_pred HHHHheeccc
Confidence 3455555543
No 81
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=55.88 E-value=31 Score=38.70 Aligned_cols=81 Identities=15% Similarity=0.152 Sum_probs=47.0
Q ss_pred eeeEEEEecCCCcccccHHHHHHHHHHHccCccCCCCCCc---------ceeeeeeeecCCccccchhh-hhHHHHHHHH
Q 047816 523 SFIAWAVFPSGSANYISNATALRIISRLAEHRVHIPDTFG---------NYKLLQWNIEPQVKRTWWQE-HFLMVVLAIT 592 (620)
Q Consensus 523 ~~~~~~~~P~~~~~~f~~~~~~~i~~~~~~~~~~~~~~fG---------~y~l~~~~~~~~~~~~~~~~-~~~~i~~~~~ 592 (620)
++|.--+-- .++.....++|-.+...+..+++.+- .| ||+... .|..+.-..+. +|+||++.
T Consensus 207 ~EL~YyV~~-~~G~pl~a~~AA~~Ln~ld~Q~~Al~--LGy~V~~~~AqPv~~~a---~P~~~s~~~NlWII~gVlvP-- 278 (684)
T PF12877_consen 207 VELTYYVEG-QNGKPLPAVTAAKDLNLLDSQRMALI--LGYRVQGIVAQPVEKQA---EPPAKSPPNNLWIIAGVLVP-- 278 (684)
T ss_pred eEEEEEEEc-CCCcCCcHHHHHHHHhccCHHHHHHh--cCceecccccccccccc---CCCCCCCCCCeEEEehHhHH--
Confidence 555555541 27888999999999999988886642 33 333332 24444444455 66888655
Q ss_pred HHHHHHHHHHHHHhhhhhh
Q 047816 593 IMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 593 ~~~~~~l~~~~~~~~~r~r 611 (620)
+++++.+.+++.|.++||.
T Consensus 279 v~vV~~Iiiil~~~LCRk~ 297 (684)
T PF12877_consen 279 VLVVLLIIIILYWKLCRKN 297 (684)
T ss_pred HHHHHHHHHHHHHHHhccc
Confidence 3333333444445555443
No 82
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=54.98 E-value=58 Score=26.95 Aligned_cols=21 Identities=19% Similarity=0.316 Sum_probs=17.3
Q ss_pred CCceEeeccceeeeecHHHHH
Q 047816 294 KHGTVLDSGTTYAYLPEAAFL 314 (620)
Q Consensus 294 ~~~ailDSGtt~~~LP~~~~~ 314 (620)
+...++|||.....+|....+
T Consensus 9 ~~~fLVDTGA~vSviP~~~~~ 29 (89)
T cd06094 9 GLRFLVDTGAAVSVLPASSTK 29 (89)
T ss_pred CcEEEEeCCCceEeecccccc
Confidence 346899999999999987654
No 83
>PF13703 PepSY_TM_2: PepSY-associated TM helix
Probab=54.93 E-value=19 Score=29.64 Aligned_cols=20 Identities=20% Similarity=0.368 Sum_probs=10.2
Q ss_pred HHHHHHHHHHHHHhhhhhhc
Q 047816 593 IMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 593 ~~~~~~l~~~~~~~~~r~r~ 612 (620)
..+.+.+++.|.+++|+|++
T Consensus 24 al~~l~~~isGl~l~~p~~~ 43 (88)
T PF13703_consen 24 ALLLLLLLISGLYLWWPRRW 43 (88)
T ss_pred HHHHHHHHHHHHHHhhHHhc
Confidence 33344444556666665443
No 84
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where
Probab=53.36 E-value=16 Score=29.81 Aligned_cols=29 Identities=28% Similarity=0.315 Sum_probs=23.7
Q ss_pred EEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHHH
Q 047816 277 IHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 277 i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~i 316 (620)
+.|||+.+. .++|||.+.+.++++..+.+
T Consensus 3 v~InG~~~~-----------fLvDTGA~~tii~~~~a~~~ 31 (86)
T cd06095 3 ITVEGVPIV-----------FLVDTGATHSVLKSDLGPKQ 31 (86)
T ss_pred EEECCEEEE-----------EEEECCCCeEEECHHHhhhc
Confidence 457777654 79999999999999988764
No 85
>PF02160 Peptidase_A3: Cauliflower mosaic virus peptidase (A3); InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contain an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein []. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco) []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=53.31 E-value=17 Score=34.92 Aligned_cols=28 Identities=25% Similarity=0.319 Sum_probs=20.5
Q ss_pred CCceeehHhhhceEEEEEeCCCCEEEEEe
Q 047816 397 DPTTLLGGIIVRNTLVMYDREHSKIGFWK 425 (620)
Q Consensus 397 ~~~~ILG~~fLr~~yvvfD~en~rIGfA~ 425 (620)
+-..|||..|+|.++=-.+.+ .+|-|-.
T Consensus 90 g~d~IlG~NF~r~y~Pfiq~~-~~I~f~~ 117 (201)
T PF02160_consen 90 GIDIILGNNFLRLYEPFIQTE-DRIQFHK 117 (201)
T ss_pred CCCEEecchHHHhcCCcEEEc-cEEEEEe
Confidence 456999999999887555554 4677764
No 86
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionary conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=53.04 E-value=5.1 Score=30.67 Aligned_cols=34 Identities=15% Similarity=0.135 Sum_probs=0.9
Q ss_pred ccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 577 RTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 577 ~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
++..-. .|+|+++| +++++.|+.+.++-+++|--
T Consensus 7 ~~~vlaavIaG~Vvg--ll~ailLIlf~iyR~rkkdE 41 (64)
T PF01034_consen 7 RSEVLAAVIAGGVVG--LLFAILLILFLIYRMRKKDE 41 (64)
T ss_dssp --------------------------------S----
T ss_pred cchHHHHHHHHHHHH--HHHHHHHHHHHHHHHHhcCC
Confidence 344445 88999997 55666777778888888863
No 87
>PF11014 DUF2852: Protein of unknown function (DUF2852); InterPro: IPR021273 This bacterial family of proteins has no known function.
Probab=48.03 E-value=29 Score=30.11 Aligned_cols=31 Identities=19% Similarity=0.227 Sum_probs=22.6
Q ss_pred hhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816 580 WQE-HFLMVVLAITIMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 580 ~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r 611 (620)
|.- .|+.+|+| .|+.-.+-++++.|.||.+|
T Consensus 7 ~~~a~Ia~mVlG-Fi~fWPlGla~Lay~iw~~r 38 (115)
T PF11014_consen 7 WKPAWIAAMVLG-FIVFWPLGLALLAYMIWGKR 38 (115)
T ss_pred CchHHHHHHHHH-HHHHHHHHHHHHHHHHHHHH
Confidence 344 68888999 66666666777778888766
No 88
>PF02529 PetG: Cytochrome B6-F complex subunit 5; InterPro: IPR003683 This family consists of cytochrome b6/f complex subunit 5 (PetG). The cytochrome bf complex, found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase []. The purified complex from the unicellular alga Chlamydomonas reinhardtii contains seven subunits; namely four high molecular weight subunits (cytochrome f, Rieske iron-sulphur protein, cytochrome b6, and subunit IV) and three approximately miniproteins (PetG, PetL, and PetX) []. Stoichiometry measurements are consistent with every subunit being present as two copies per b6/f dimer. The absence of PetG affects either the assembly or stability of the cytochrome bf complex in C. reinhardtii [].; GO: 0009512 cytochrome b6f complex; PDB: 1Q90_G 2ZT9_G 1VF5_G 2D2C_G 2E74_G 2E75_G 2E76_G.
Probab=47.94 E-value=42 Score=22.62 Aligned_cols=26 Identities=15% Similarity=0.103 Sum_probs=15.9
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhh
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILR 609 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r 609 (620)
...||++| .+.+.++-+.+..|+-.|
T Consensus 5 lL~GiVlG-li~vtl~Glfv~Ay~QY~ 30 (37)
T PF02529_consen 5 LLSGIVLG-LIPVTLAGLFVAAYLQYR 30 (37)
T ss_dssp HHHHHHHH-HHHHHHHHHHHHHHHHHC
T ss_pred hhhhHHHH-hHHHHHHHHHHHHHHHHh
Confidence 46799999 666555555554454444
No 89
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=47.57 E-value=16 Score=36.65 Aligned_cols=29 Identities=34% Similarity=0.509 Sum_probs=16.9
Q ss_pred HHHHHHHHHH-HHHHHHhhhhhhccccCCCC
Q 047816 590 AITIMMVVGL-SVFGILFILRRRRQSVNSYK 619 (620)
Q Consensus 590 ~~~~~~~~~l-~~~~~~~~~r~r~~~~~~~~ 619 (620)
|+++.++++| ++++++.||-+|||+. ++|
T Consensus 262 giaalvllil~vvliiLYiWlyrrRK~-swk 291 (295)
T TIGR01478 262 GIAALVLIILTVVLIILYIWLYRRRKK-SWK 291 (295)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhcc-ccc
Confidence 5444444444 3445678899888774 443
No 90
>KOG4818 consensus Lysosomal-associated membrane protein [General function prediction only]
Probab=47.03 E-value=15 Score=38.10 Aligned_cols=29 Identities=28% Similarity=0.333 Sum_probs=19.9
Q ss_pred hHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 584 FLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 584 ~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
++-|++| ++..++.+..++.++|.||||+
T Consensus 328 v~PivVg-~~l~gl~~~vliaylIgrr~~~ 356 (362)
T KOG4818|consen 328 VLPIAVG-AILAGLVLVVLIAYLIGRRRSH 356 (362)
T ss_pred ecchHHH-HHHHHHHHHHHHHhheeheecc
Confidence 4556777 6666777777788888755543
No 91
>PF15099 PIRT: Phosphoinositide-interacting protein family
Probab=46.55 E-value=11 Score=33.07 Aligned_cols=32 Identities=19% Similarity=0.249 Sum_probs=19.9
Q ss_pred hhHHHH-HHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816 583 HFLMVV-LAITIMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 583 ~~~~i~-~~~~~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
.++|.+ ++ +..++++.++++|..+.|||++++
T Consensus 81 ~~~G~vlLs-~GLmlL~~~alcW~~~~rkK~~kr 113 (129)
T PF15099_consen 81 SIFGPVLLS-LGLMLLACSALCWKPIIRKKKKKR 113 (129)
T ss_pred hhehHHHHH-HHHHHHHhhhheehhhhHhHHHHh
Confidence 355655 55 556666667777777777665443
No 92
>PF14986 DUF4514: Domain of unknown function (DUF4514)
Probab=46.51 E-value=19 Score=26.27 Aligned_cols=28 Identities=18% Similarity=0.266 Sum_probs=19.1
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
+++|.++|++| ..+.+++-++.|||.-.
T Consensus 23 a~IGtalGvai--sAgFLaLKicmIrkhlf 50 (61)
T PF14986_consen 23 AIIGTALGVAI--SAGFLALKICMIRKHLF 50 (61)
T ss_pred eeehhHHHHHH--HHHHHHHHHHHHHHhhc
Confidence 78888888333 44556777788877653
No 93
>PTZ00370 STEVOR; Provisional
Probab=45.77 E-value=18 Score=36.48 Aligned_cols=26 Identities=31% Similarity=0.423 Sum_probs=15.8
Q ss_pred HHHHHHHHHH-HHHHHHhhhhhhcccc
Q 047816 590 AITIMMVVGL-SVFGILFILRRRRQSV 615 (620)
Q Consensus 590 ~~~~~~~~~l-~~~~~~~~~r~r~~~~ 615 (620)
|+++.++++| ++++++.||-+|||+.
T Consensus 258 giaalvllil~vvliilYiwlyrrRK~ 284 (296)
T PTZ00370 258 GIAALVLLILAVVLIILYIWLYRRRKN 284 (296)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence 5433333333 3445778999998875
No 94
>PF11353 DUF3153: Protein of unknown function (DUF3153); InterPro: IPR021499 This family of proteins with unknown function appears to be restricted to Cyanobacteria. Some members are annotated as membrane proteins; however, this cannot be confirmed.
Probab=45.54 E-value=18 Score=35.23 Aligned_cols=46 Identities=17% Similarity=0.299 Sum_probs=22.6
Q ss_pred eeeeeecCCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 566 LLQWNIEPQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 566 l~~~~~~~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
.+.|++.| +....+.. .++-=.+| .+++++++++++.++++|+|++
T Consensus 162 ~l~W~L~p-Ge~N~L~~~~w~pn~lg-iG~v~I~~l~~~~~~l~~~r~~ 208 (209)
T PF11353_consen 162 QLTWKLQP-GEINHLEASFWVPNPLG-IGTVLIVLLILLGFLLRRRRLP 208 (209)
T ss_pred EEEEecCC-CceeEEEEEEEeccHHH-HHHHHHHHHHHHHHHHHHhhcC
Confidence 67788743 33333333 22222344 2333444455555677776654
No 95
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=45.24 E-value=3.1 Score=35.02 Aligned_cols=33 Identities=6% Similarity=-0.106 Sum_probs=23.5
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
.-.|.++|+++.+++++.+++.+++|.+.+|++
T Consensus 63 ls~gaiagi~vg~~~~v~~lv~~l~w~f~~r~k 95 (96)
T PTZ00382 63 LSTGAIAGISVAVVAVVGGLVGFLCWWFVCRGK 95 (96)
T ss_pred cccccEEEEEeehhhHHHHHHHHHhheeEEeec
Confidence 677777875666666667778888887776653
No 96
>PF13268 DUF4059: Protein of unknown function (DUF4059)
Probab=43.45 E-value=25 Score=27.53 Aligned_cols=23 Identities=26% Similarity=0.339 Sum_probs=12.9
Q ss_pred HHHHHHHHHHHhhhhhhccccCC
Q 047816 595 MVVGLSVFGILFILRRRRQSVNS 617 (620)
Q Consensus 595 ~~~~l~~~~~~~~~r~r~~~~~~ 617 (620)
.+..+.+-+.|..||.++++-++
T Consensus 17 ~i~V~~~~~~wi~~Ra~~~~DKT 39 (72)
T PF13268_consen 17 SILVLLVSGIWILWRALRKKDKT 39 (72)
T ss_pred HHHHHHHHHHHHHHHHHHcCCCc
Confidence 34444455567777766655443
No 97
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.
Probab=42.85 E-value=11 Score=36.18 Aligned_cols=26 Identities=12% Similarity=0.333 Sum_probs=13.0
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r 611 (620)
.++||+.| +++++|+ ++++.++|+.|
T Consensus 39 I~iaiVAG-~~tVILV--I~i~v~vR~CR 64 (221)
T PF08374_consen 39 IMIAIVAG-IMTVILV--IFIVVLVRYCR 64 (221)
T ss_pred eeeeeecc-hhhhHHH--HHHHHHHHHHh
Confidence 57777776 4443333 33334445344
No 98
>KOG3540 consensus Beta amyloid precursor protein [General function prediction only]
Probab=42.32 E-value=30 Score=37.28 Aligned_cols=57 Identities=14% Similarity=0.293 Sum_probs=35.7
Q ss_pred ccCCCCCC-cceeeeeeeecCCccccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 554 RVHIPDTF-GNYKLLQWNIEPQVKRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 554 ~~~~~~~f-G~y~l~~~~~~~~~~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
.++...-| ++|++-.-.+.|.+...+.|. +++|+.++ + ++++-++++.+++.|||+.
T Consensus 518 ev~~d~e~d~~~e~~r~~~~~~~ed~~~s~~av~gllv~-~--~~i~tvivisl~mlrkr~y 576 (615)
T KOG3540|consen 518 EVRVDAEFDEGAEFYRHDLLPQSEDVGRSASAVIGLLVS-A--VFIATVIVISLVMLRKRQY 576 (615)
T ss_pred hcccCCCCCCchhhhhhhhccccccccccHHHHHHHHHH-H--HHHHHHHHHHHHHHccccc
Confidence 34445445 788887777778888888899 99997654 1 1222233344556666653
No 99
>PF00077 RVP: Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026; InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=42.25 E-value=20 Score=29.97 Aligned_cols=27 Identities=22% Similarity=0.571 Sum_probs=21.1
Q ss_pred EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHH
Q 047816 276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAF 313 (620)
Q Consensus 276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~ 313 (620)
.+.++|+.+ .++||||+..+.++++.+
T Consensus 9 ~v~i~g~~i-----------~~LlDTGA~vsiI~~~~~ 35 (100)
T PF00077_consen 9 TVKINGKKI-----------KALLDTGADVSIISEKDW 35 (100)
T ss_dssp EEEETTEEE-----------EEEEETTBSSEEESSGGS
T ss_pred EEeECCEEE-----------EEEEecCCCcceeccccc
Confidence 456677755 489999999999998653
No 100
>PF14575 EphA2_TM: Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=41.18 E-value=32 Score=27.48 Aligned_cols=23 Identities=17% Similarity=0.193 Sum_probs=9.6
Q ss_pred HHHHHHHHHHHHHhhhhhhcccc
Q 047816 593 IMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 593 ~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
+++.+.++.+.++++.-.+||..
T Consensus 6 ~~~g~~~ll~~v~~~~~~~rr~~ 28 (75)
T PF14575_consen 6 IIVGVLLLLVLVIIVIVCFRRCK 28 (75)
T ss_dssp HHHHHHHHHHHHHHHHCCCTT--
T ss_pred HHHHHHHHHHhheeEEEEEeeEc
Confidence 33344444444555555554443
No 101
>PF12384 Peptidase_A2B: Ty3 transposon peptidase; InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=40.67 E-value=36 Score=31.53 Aligned_cols=29 Identities=14% Similarity=0.339 Sum_probs=23.8
Q ss_pred EEEEEecCCCcEEEEEEeCCCCceeEeCC
Q 047816 87 TTRLWIGTPPQTFALIVDTGSTVTYVPCA 115 (620)
Q Consensus 87 ~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~ 115 (620)
+..+.+++-+.++++++||||+.-.+...
T Consensus 34 T~~v~l~~~~t~i~vLfDSGSPTSfIr~d 62 (177)
T PF12384_consen 34 TAIVQLNCKGTPIKVLFDSGSPTSFIRSD 62 (177)
T ss_pred EEEEEEeecCcEEEEEEeCCCccceeehh
Confidence 34677777799999999999999888653
No 102
>PF09668 Asp_protease: Aspartyl protease; InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=38.92 E-value=16 Score=32.40 Aligned_cols=36 Identities=25% Similarity=0.291 Sum_probs=26.6
Q ss_pred eEEEEEEecCCCcEEEEEEeCCCCceeEeCCCCCCCCC
Q 047816 85 YYTTRLWIGTPPQTFALIVDTGSTVTYVPCATCEHCGD 122 (620)
Q Consensus 85 ~Y~~~i~iGTP~Q~~~v~vDTGSs~~WV~~~~c~~C~~ 122 (620)
..|++++|+ ++++...+|||...+-+..+-+..|+.
T Consensus 24 mLyI~~~in--g~~vkA~VDtGAQ~tims~~~a~r~gL 59 (124)
T PF09668_consen 24 MLYINCKIN--GVPVKAFVDTGAQSTIMSKSCAERCGL 59 (124)
T ss_dssp --EEEEEET--TEEEEEEEETT-SS-EEEHHHHHHTTG
T ss_pred eEEEEEEEC--CEEEEEEEeCCCCccccCHHHHHHcCC
Confidence 567899999 999999999999998887654556654
No 103
>TIGR03867 MprA_tail MprA protease C-terminal sorting domain. This model describes a protein C-terminal domain that occurs in species of the genus Ralstonia and is predicted to play a role in protein targeting. This sequence, though limited to members of the MprA serine in species distribution, resembles C-terminal sorting sequences of the sortase and exosortase systems, as well as a Shewanella-type C-terminal sequence modeled by TIGR03501. For all such cases, member proteins have homologs in other species with essentially full-length homology, save for the lack of the domain modeled here. All members of the present family are predicted serine proteases
Probab=38.79 E-value=42 Score=21.03 Aligned_cols=19 Identities=32% Similarity=0.296 Sum_probs=11.1
Q ss_pred HHHHHHHHHHHHHhhhhhh
Q 047816 593 IMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 593 ~~~~~~l~~~~~~~~~r~r 611 (620)
..++..|++++.+.+.|||
T Consensus 8 ~~~A~Lll~aG~~~~~rR~ 26 (27)
T TIGR03867 8 PWLAALLLAAGLLGFARRR 26 (27)
T ss_pred HHHHHHHHHHHhhhHHhhc
Confidence 3445556666666666655
No 104
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=38.02 E-value=16 Score=34.51 Aligned_cols=12 Identities=8% Similarity=0.232 Sum_probs=6.6
Q ss_pred chhh-hhHHHHHH
Q 047816 579 WWQE-HFLMVVLA 590 (620)
Q Consensus 579 ~~~~-~~~~i~~~ 590 (620)
++.. +++||++|
T Consensus 75 ~~~~~iivgvi~~ 87 (179)
T PF13908_consen 75 YFITGIIVGVICG 87 (179)
T ss_pred cceeeeeeehhhH
Confidence 3344 56666665
No 105
>PRK09459 pspG phage shock protein G; Reviewed
Probab=37.99 E-value=29 Score=27.45 Aligned_cols=21 Identities=24% Similarity=0.322 Sum_probs=13.8
Q ss_pred HHHHHHHhhhhhhccccCCCC
Q 047816 599 LSVFGILFILRRRRQSVNSYK 619 (620)
Q Consensus 599 l~~~~~~~~~r~r~~~~~~~~ 619 (620)
+.+++.|++|.+++++.+.|+
T Consensus 53 l~~v~vW~~r~~~~~~~~~y~ 73 (76)
T PRK09459 53 LAVVVVWVIRAIKAPKVPRYQ 73 (76)
T ss_pred HHHHHHHHHHHhhcccccccc
Confidence 355667888776766666664
No 106
>PF05084 GRA6: Granule antigen protein (GRA6); InterPro: IPR008119 Toxoplasma gondii is an obligate intracellular apicomplexan protozoan parasite, with a complex lifestyle involving varied hosts. It has two phases of growth: an intestinal phase in feline hosts, and an extra-intestinal phase in other mammals. Oocysts from infected cats develop into tachyzoites, and eventually, bradyzoites and zoitocysts in the extra-intestinal host. Transmission of the parasite occurs through contact with infected cats or raw/undercooked meat; in immunocompromised individuals, it can cause severe and often lethal toxoplasmosis. Acute infection in healthy humans can sometimes also cause tissue damage. The protozoan utilises a variety of secretory and antigenic proteins to invade a host and gain access to the intracellular environment. These originate from distinct organelles in the T. gondii cell termed micronemes, rhoptries, and dense granules. They are released at specific times during invasion to ensure the proteins are allocated to their correct target destinations. Dense granule antigens (GRAs) are released from the T. gondii tachyzoite while still encapsulated in a host vacuole. Gra6, one of these moieties, is associated with the parasitophorous vacuole. It possesses a hydrophobic central region flanked by two hydrophilic domains, and is present as a single copy gene in the Toxoplasma gondii genome. Gra6 shares a similar function with Gra2, in that it is rapidly targeted to a network of membranous tubules that connect with the vacuolar membrane. Indeed, these two proteins, together with Gra4, form a multimeric complex that stabilises the parasite within the vacuole.
Probab=37.77 E-value=36 Score=31.01 Aligned_cols=13 Identities=31% Similarity=0.455 Sum_probs=6.5
Q ss_pred HHHHHhhhhhhcc
Q 047816 601 VFGILFILRRRRQ 613 (620)
Q Consensus 601 ~~~~~~~~r~r~~ 613 (620)
++.+|++.|||.|
T Consensus 164 A~L~~~F~RR~~r 176 (215)
T PF05084_consen 164 AMLTWFFLRRTGR 176 (215)
T ss_pred HHHHHHHHHhhcc
Confidence 3344555555543
No 107
>TIGR03370 PEPCTERM_Roseo variant PEP-CTERM putative exosortase signal, Roseobacter type. A probable protein export sorting signal, PEP-CTERM, was described by Haft, et al. (PubMed:16930487). It is predicted to interact with a putative transpeptidase we designate exosortase. Most examples of this signal are recognized by model TIGR02595, but some unusual clades require different models. This model describes a variant with conserved motif VPLPA, rather than VPEP. This variant is found prominently in two members of the Rhodobacterales, namely Jannaschia sp. CCS1 and Roseobacter denitrificans OCh 114. One interesting member protein has a full-length duplication and therefore two copies of this putative sorting domain.
Probab=36.95 E-value=41 Score=20.95 Aligned_cols=17 Identities=41% Similarity=0.479 Sum_probs=9.0
Q ss_pred HHHHHHHHHhhhhhhcc
Q 047816 597 VGLSVFGILFILRRRRQ 613 (620)
Q Consensus 597 ~~l~~~~~~~~~r~r~~ 613 (620)
+.++.++.+.+.|||++
T Consensus 9 LLl~gLggl~~~rRRrk 25 (26)
T TIGR03370 9 LLLAGLGGLGAMRRRRR 25 (26)
T ss_pred HHHHHHHHHHHHHHhhc
Confidence 34455555556665543
No 108
>PHA03283 envelope glycoprotein E; Provisional
Probab=36.69 E-value=34 Score=37.41 Aligned_cols=25 Identities=16% Similarity=0.139 Sum_probs=14.9
Q ss_pred HHHHHHHHHHHhhhhhhccccCCCC
Q 047816 595 MVVGLSVFGILFILRRRRQSVNSYK 619 (620)
Q Consensus 595 ~~~~l~~~~~~~~~r~r~~~~~~~~ 619 (620)
++++|+++++|...|-|++.++.|+
T Consensus 410 ~~~~~~~l~vw~c~~~r~~~~~~y~ 434 (542)
T PHA03283 410 CAALLVALVVWGCILYRRSNRKPYE 434 (542)
T ss_pred HHHHHHHHhhhheeeehhhcCCccc
Confidence 3355666667766665555556664
No 109
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry, such as Glycophorin B and Glycophorin E, represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=36.09 E-value=46 Score=29.37 Aligned_cols=30 Identities=10% Similarity=0.165 Sum_probs=18.3
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhccc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRRQS 614 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~~ 614 (620)
.|+||++| ++++++|+++++.-.+||....
T Consensus 69 Ii~gv~aG--vIg~Illi~y~irR~~Kk~~~~ 98 (122)
T PF01102_consen 69 IIFGVMAG--VIGIILLISYCIRRLRKKSSSD 98 (122)
T ss_dssp HHHHHHHH--HHHHHHHHHHHHHHHS------
T ss_pred hhHHHHHH--HHHHHHHHHHHHHHHhccCCCC
Confidence 67888887 4556678888888777777543
No 110
>CHL00008 petG cytochrome b6/f complex subunit V
Probab=35.52 E-value=73 Score=21.39 Aligned_cols=25 Identities=12% Similarity=0.090 Sum_probs=14.6
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhh
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFIL 608 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~ 608 (620)
.+-||++| .+.+.++-+++..|+=.
T Consensus 5 lL~GiVLG-lipvTl~GlfvaAylQY 29 (37)
T CHL00008 5 LLFGIVLG-LIPITLAGLFVTAYLQY 29 (37)
T ss_pred hhhhHHHH-hHHHHHHHHHHHHHHHH
Confidence 35689999 66555554554444433
No 111
>PHA03281 envelope glycoprotein E; Provisional
Probab=35.45 E-value=63 Score=35.51 Aligned_cols=20 Identities=20% Similarity=0.305 Sum_probs=15.4
Q ss_pred cccHHHHHHHHHHHccCccC
Q 047816 537 YISNATALRIISRLAEHRVH 556 (620)
Q Consensus 537 ~f~~~~~~~i~~~~~~~~~~ 556 (620)
+-=.||+.+++..+.++.++
T Consensus 505 YtlvSTad~fvNvV~d~~~P 524 (642)
T PHA03281 505 YTVVSTIDHFVNAIEEHGFP 524 (642)
T ss_pred EEEEehHHhhhhhehhcCCC
Confidence 45567888899999988655
No 112
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=35.05 E-value=71 Score=30.78 Aligned_cols=36 Identities=25% Similarity=0.306 Sum_probs=28.4
Q ss_pred CCCeeEEEEeEEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHH
Q 047816 266 RSPYYNIDLKVIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLA 315 (620)
Q Consensus 266 ~~~~w~v~l~~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~ 315 (620)
.+++|.++ ..|||+.+. .++|||.|.+.++++..+.
T Consensus 102 ~~GHF~a~---~~VNGk~v~-----------fLVDTGATsVal~~~dA~R 137 (215)
T COG3577 102 RDGHFEAN---GRVNGKKVD-----------FLVDTGATSVALNEEDARR 137 (215)
T ss_pred CCCcEEEE---EEECCEEEE-----------EEEecCcceeecCHHHHHH
Confidence 55666654 578888765 7999999999999988765
No 113
>TIGR03778 VPDSG_CTERM VPDSG-CTERM exosortase interaction domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (PubMed:16930487). This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.
Probab=34.89 E-value=46 Score=20.73 Aligned_cols=17 Identities=29% Similarity=0.065 Sum_probs=7.9
Q ss_pred HHHHHHHHHHHhhhhhh
Q 047816 595 MVVGLSVFGILFILRRR 611 (620)
Q Consensus 595 ~~~~l~~~~~~~~~r~r 611 (620)
+++...++..++.+|||
T Consensus 8 ~~Ll~~~l~~l~~~rRr 24 (26)
T TIGR03778 8 LALLGLGLLGLLGLRRR 24 (26)
T ss_pred HHHHHHHHHHHHHHhhc
Confidence 33334444445555544
No 114
>PF15176 LRR19-TM: Leucine-rich repeat family 19 TM domain
Probab=34.49 E-value=41 Score=28.28 Aligned_cols=36 Identities=11% Similarity=0.066 Sum_probs=26.9
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc-cccCCCC
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRR-QSVNSYK 619 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~-~~~~~~~ 619 (620)
.-+...+| .++.++.+++++.++++=... +...+|+
T Consensus 15 ~sW~~LVG-Vv~~al~~SlLIalaaKC~~~~k~~~SY~ 51 (102)
T PF15176_consen 15 RSWPFLVG-VVVTALVTSLLIALAAKCPVWYKYLASYR 51 (102)
T ss_pred cccHhHHH-HHHHHHHHHHHHHHHHHhHHHHHHHhccc
Confidence 56778899 888888889998888877664 4455553
No 115
>PRK00665 petG cytochrome b6-f complex subunit PetG; Reviewed
Probab=33.69 E-value=78 Score=21.26 Aligned_cols=25 Identities=12% Similarity=-0.024 Sum_probs=14.5
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhh
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFIL 608 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~ 608 (620)
.+-||++| .+.+.++-+++..|+=.
T Consensus 5 lL~GiVLG-lipiTl~GlfvaAylQY 29 (37)
T PRK00665 5 LLCGIVLG-LIPVTLAGLFVAAWNQY 29 (37)
T ss_pred hhhhHHHH-hHHHHHHHHHHHHHHHH
Confidence 35689999 66555544444444433
No 116
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsins of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-terminal lobes, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=33.24 E-value=47 Score=27.66 Aligned_cols=21 Identities=29% Similarity=0.332 Sum_probs=18.6
Q ss_pred ceEeeccceeeeecHHHHHHH
Q 047816 296 GTVLDSGTTYAYLPEAAFLAF 316 (620)
Q Consensus 296 ~ailDSGtt~~~LP~~~~~~i 316 (620)
.+.+|||++...+|...++.+
T Consensus 12 ~~~vDtGA~vnllp~~~~~~l 32 (93)
T cd05481 12 KFQLDTGATCNVLPLRWLKSL 32 (93)
T ss_pred EEEEecCCEEEeccHHHHhhh
Confidence 589999999999999988764
No 117
>PF10577 UPF0560: Uncharacterised protein family UPF0560; InterPro: IPR018890 This family of proteins has no known function.
Probab=33.12 E-value=51 Score=38.11 Aligned_cols=33 Identities=21% Similarity=0.320 Sum_probs=19.9
Q ss_pred hhhhhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 580 WQEHFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 580 ~~~~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
-++.++..++|..++++++|+++.+|..+||..
T Consensus 270 YHT~fLl~ILG~~~livl~lL~vLl~yCrrkc~ 302 (807)
T PF10577_consen 270 YHTVFLLAILGGTALIVLILLCVLLCYCRRKCL 302 (807)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccC
Confidence 355444444443677777777777776666553
No 118
>COG4736 CcoQ Cbb3-type cytochrome oxidase, subunit 3 [Posttranslational modification, protein turnover, chaperones]
Probab=32.88 E-value=50 Score=25.16 Aligned_cols=23 Identities=9% Similarity=-0.008 Sum_probs=13.2
Q ss_pred HHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 589 LAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 589 ~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
.| .+++.+.+++++.|++|++|+
T Consensus 13 ~~-t~~~~l~fiavi~~ayr~~~K 35 (60)
T COG4736 13 WG-TIAFTLFFIAVIYFAYRPGKK 35 (60)
T ss_pred HH-HHHHHHHHHHHHHHHhcccch
Confidence 44 555555556666666665554
No 119
>cd01324 cbb3_Oxidase_CcoQ Cytochrome cbb oxidase CcoQ. Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Found exclusively in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I. Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center. ccoQ, the fourth subunit, is a single transmembrane helix protein. It has been shown to protect the core complex from proteolytic degradation by serine proteases. See cd00919, cd01322
Probab=32.43 E-value=53 Score=23.77 Aligned_cols=22 Identities=5% Similarity=0.004 Sum_probs=13.1
Q ss_pred HHHHHHHHHHHHHHhhhhhhcc
Q 047816 592 TIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 592 ~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
.+.+++.-+++++|.+|+++++
T Consensus 16 l~~~~~~Figiv~wa~~p~~k~ 37 (48)
T cd01324 16 LLYLALFFLGVVVWAFRPGRKK 37 (48)
T ss_pred HHHHHHHHHHHHHHHhCCCcch
Confidence 3344445566667777776654
No 120
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=32.28 E-value=32 Score=30.19 Aligned_cols=20 Identities=30% Similarity=0.511 Sum_probs=17.8
Q ss_pred eEeeccce-eeeecHHHHHHH
Q 047816 297 TVLDSGTT-YAYLPEAAFLAF 316 (620)
Q Consensus 297 ailDSGtt-~~~LP~~~~~~i 316 (620)
.++|||.+ ++.+|+++++++
T Consensus 29 ~LiDTGFtg~lvlp~~vaek~ 49 (125)
T COG5550 29 ELIDTGFTGYLVLPPQVAEKL 49 (125)
T ss_pred eEEecCCceeEEeCHHHHHhc
Confidence 48999999 999999998774
No 121
>PRK10525 cytochrome o ubiquinol oxidase subunit II; Provisional
Probab=31.40 E-value=42 Score=34.80 Aligned_cols=29 Identities=17% Similarity=0.392 Sum_probs=16.2
Q ss_pred HHHHHHHHHHHHHHhhhhhhcc-ccCCCCC
Q 047816 592 TIMMVVGLSVFGILFILRRRRQ-SVNSYKP 620 (620)
Q Consensus 592 ~~~~~~~l~~~~~~~~~r~r~~-~~~~~~~ 620 (620)
.+++++.+.++.++++||.|++ +...|.|
T Consensus 51 ~liv~i~V~~l~~~f~~ryR~~~~~a~y~p 80 (315)
T PRK10525 51 MLIVVIPAILMAVGFAWKYRASNKDAKYSP 80 (315)
T ss_pred HHhhHHHHHHHHheeEEEEecCCCcCCCCC
Confidence 3444444444566777777754 3356654
No 122
>TIGR02595 PEP_exosort PEP-CTERM putative exosortase interaction domain. This model describes a 25-residue domain that includes a near-invariant Pro-Glu-Pro (PEP) motif, a thirteen residue strongly hydrophobic sequence likely to span the membrane, and a five-residue strongly basic motif that often contains four Arg residues. In nearly every case, this motif is found within nine residues, and usually within five residues, of the extreme C-terminus of the protein. Proteins with this motif typically have signal sequences at the N-terminus. This region appears many times per genome or not at all, and co-occurs in genomes with a proposed protein-sorting integral membrane protein we designate exosortase (see TIGR02602). PEP-CTERM proteins frequently are poorly conserved, Ser/Thr-rich proteins and may become extensively modified proteinaceous constituents of extracellular material in bacterial biofilms.
Probab=31.13 E-value=56 Score=20.28 Aligned_cols=7 Identities=71% Similarity=1.312 Sum_probs=2.7
Q ss_pred hhhhhhc
Q 047816 606 FILRRRR 612 (620)
Q Consensus 606 ~~~r~r~ 612 (620)
+..|||+
T Consensus 17 ~~~rrrk 23 (26)
T TIGR02595 17 LLLRRRR 23 (26)
T ss_pred HHHhhcc
Confidence 3334343
No 123
>PF14654 Epiglycanin_C: Mucin, catalytic, TM and cytoplasmic tail region
Probab=30.63 E-value=87 Score=26.16 Aligned_cols=26 Identities=19% Similarity=0.369 Sum_probs=16.1
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhh
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILR 609 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r 609 (620)
.|.-|++. .++++++|++-..+.+|+
T Consensus 19 eIfLItLa-sVvvavGl~aGLfFcvR~ 44 (106)
T PF14654_consen 19 EIFLITLA-SVVVAVGLFAGLFFCVRN 44 (106)
T ss_pred HHHHHHHH-HHHHHHHHHHHHHHHhhh
Confidence 45556666 667777777655555544
No 124
>PF14828 Amnionless: Amnionless
Probab=30.48 E-value=1.7e+02 Score=31.99 Aligned_cols=61 Identities=13% Similarity=0.086 Sum_probs=32.3
Q ss_pred eEEEEEEEEEEEeecCCCCCCchhHHHHHHhhcccccccceEEeeeeecCCceeeEEEEecC
Q 047816 471 LQIGRITFDMFLSINYSDLRPHIPELADSIAQELDVNTSQVHLLNFMSKGNNSFIAWAVFPS 532 (620)
Q Consensus 471 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~qv~~~~~~~~g~~~~~~~~~~P~ 532 (620)
.|-+-+++.+. ...--|++.|...+.+.+..+=....-|.|++.....+..-.+++.|.=.
T Consensus 231 iCGa~v~~~~~-~~~~fdl~~~~~~l~~~~~~~~~~~~v~~~v~kv~~~~~~~~iQiVi~d~ 291 (437)
T PF14828_consen 231 ICGAIVTLEYS-CESTFDLQSYRQRLRHAFLELPQYDEVQMHVSKVWSDQSGNEIQIVITDR 291 (437)
T ss_pred hcceEEEEeec-CCccccHHHHHHHHHHHHhccccccceeEEEEEeecCCCCceEEEEEecC
Confidence 46565555544 34444566665555555544333344566666655554445555555443
No 125
>PF09668 Asp_protease: Aspartyl protease; InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23.-) of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This family of eukaryotic aspartyl proteases has a fold similar to retroviral proteases, which implies they function proteolytically during regulated protein turnover.; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=29.78 E-value=56 Score=28.92 Aligned_cols=29 Identities=14% Similarity=0.323 Sum_probs=22.5
Q ss_pred EEEEccEEecCCCCccCCCCceEeeccceeeeecHHHHHH
Q 047816 276 VIHVAGKPLPLNPKVFDGKHGTVLDSGTTYAYLPEAAFLA 315 (620)
Q Consensus 276 ~i~v~g~~~~~~~~~~~~~~~ailDSGtt~~~LP~~~~~~ 315 (620)
.+.+||+.+. |++|||+..+.++.+.+++
T Consensus 28 ~~~ing~~vk-----------A~VDtGAQ~tims~~~a~r 56 (124)
T PF09668_consen 28 NCKINGVPVK-----------AFVDTGAQSTIMSKSCAER 56 (124)
T ss_dssp EEEETTEEEE-----------EEEETT-SS-EEEHHHHHH
T ss_pred EEEECCEEEE-----------EEEeCCCCccccCHHHHHH
Confidence 4567888764 8999999999999998877
No 126
>PF14979 TMEM52: Transmembrane 52
Probab=29.77 E-value=1.1e+02 Score=27.72 Aligned_cols=36 Identities=17% Similarity=0.370 Sum_probs=17.1
Q ss_pred cchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhcc
Q 047816 578 TWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRRQ 613 (620)
Q Consensus 578 ~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~~ 613 (620)
.|.+- +|+-|++.++..++-++.+..+=|-|+||++
T Consensus 15 ~W~~LWyIwLill~~~llLLCG~ta~C~rfCClrk~~ 51 (154)
T PF14979_consen 15 RWSSLWYIWLILLIGFLLLLCGLTASCVRFCCLRKQA 51 (154)
T ss_pred ceehhhHHHHHHHHHHHHHHHHHHHHHHHHHHhcccc
Confidence 45555 4444443324444444444444445555653
No 127
>PF02038 ATP1G1_PLM_MAT8: ATP1G1/PLM/MAT8 family; InterPro: IPR000272 The FXYD protein family contains at least seven members in mammals. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable. Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=29.76 E-value=75 Score=23.21 Aligned_cols=15 Identities=13% Similarity=0.465 Sum_probs=6.6
Q ss_pred hhHHHHHHHHHHHHHH
Q 047816 583 HFLMVVLAITIMMVVG 598 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~ 598 (620)
-+.|.+.+ ++.++++
T Consensus 15 rigGLi~A-~vlfi~G 29 (50)
T PF02038_consen 15 RIGGLIFA-GVLFILG 29 (50)
T ss_dssp HHHHHHHH-HHHHHHH
T ss_pred hccchHHH-HHHHHHH
Confidence 35555544 3333333
No 128
>PF10661 EssA: WXG100 protein secretion system (Wss), protein EssA; InterPro: IPR018920 The Wss (WXG100 protein secretion system) in Staphylococcus aureus seems to be encoded by a locus of eight ORFs, called ess (eSAT-6 secretion system). This locus encodes, amongst several other proteins, EssA, a protein predicted to possess one transmembrane domain. Due to its predicted membrane location and its absolute requirement for WXG100 protein secretion, it has been speculated that EssA could form a secretion apparatus in conjunction with YukC and YukAB. Proteins homologous to EssA, YukC, EsaA and YukD were absent from mycobacteria. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This highly divergent protein family consists largely of a central region of highly polar low-complexity sequence containing occasional LF motifs in weak repeats about 17 residues in length, flanked by hydrophobic N- and C-terminal regions.
Probab=29.74 E-value=60 Score=29.59 Aligned_cols=16 Identities=25% Similarity=0.324 Sum_probs=11.0
Q ss_pred HHHHHHHHHHhhhhhh
Q 047816 596 VVGLSVFGILFILRRR 611 (620)
Q Consensus 596 ~~~l~~~~~~~~~r~r 611 (620)
+|++++.+++.+.||-
T Consensus 128 ~ll~i~~giy~~~r~~ 143 (145)
T PF10661_consen 128 ILLAICGGIYVVLRKV 143 (145)
T ss_pred HHHHHHHHHHHHHHHh
Confidence 5555667778888864
No 129
>TIGR01433 CyoA cytochrome o ubiquinol oxidase subunit II. This enzyme catalyzes the oxidation of ubiquinol with the concomitant reduction of molecular oxygen to water. This acts as the terminal electron acceptor in the respiratory chain. Subunit II is responsible for binding and oxidation of the ubiquinone substrate. This sequence is closely related to QoxA, which oxidizes quinol in gram positive bacteria but which is in complex with subunits which utilize cytochromes a in the reduction of molecular oxygen. Slightly more distantly related is subunit II of cytochrome c oxidase which uses cyt. c as the oxidant.
Probab=29.67 E-value=43 Score=32.99 Aligned_cols=17 Identities=12% Similarity=0.376 Sum_probs=8.8
Q ss_pred HHHHHHHHHhhhhhhcc
Q 047816 597 VGLSVFGILFILRRRRQ 613 (620)
Q Consensus 597 ~~l~~~~~~~~~r~r~~ 613 (620)
+...++.++++||.|++
T Consensus 44 v~v~~~~~~~~~r~r~~ 60 (226)
T TIGR01433 44 IPVILMTLFFAWKYRAT 60 (226)
T ss_pred HHHHHHHheeeEEEecc
Confidence 33344445666666543
No 130
>PF06679 DUF1180: Protein of unknown function (DUF1180); InterPro: IPR009565 This entry consists of several hypothetical eukaryotic proteins thought to be membrane proteins. Their function is unknown.
Probab=29.04 E-value=57 Score=30.31 Aligned_cols=7 Identities=14% Similarity=0.482 Sum_probs=3.7
Q ss_pred ccccCCC
Q 047816 612 RQSVNSY 618 (620)
Q Consensus 612 ~~~~~~~ 618 (620)
+|+.+.|
T Consensus 123 ~rktRkY 129 (163)
T PF06679_consen 123 NRKTRKY 129 (163)
T ss_pred cccceee
Confidence 4555555
No 131
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=29.03 E-value=1.6e+02 Score=25.17 Aligned_cols=64 Identities=14% Similarity=0.053 Sum_probs=38.2
Q ss_pred EEEEecCCCc----EEEEEEeCCCCcee-EeCCCCCCCCCCCCCCCCCCCCcccccccCcCCcccCCCCCcceeEEeecc
Q 047816 88 TRLWIGTPPQ----TFALIVDTGSTVTY-VPCATCEHCGDHQDPKFEPDLSSTYQPVKCNLYCNCDRERAQCVYERKYAE 162 (620)
Q Consensus 88 ~~i~iGTP~Q----~~~v~vDTGSs~~W-V~~~~c~~C~~~~~~~y~p~~SsT~~~~~c~~~c~c~~~~~~~~~~~~Y~d 162 (620)
+++.|..|.| ++..++|||.+..- ++...-.. -...+.. ...+.-++
T Consensus 2 ~~v~~~~p~~~~~~~v~~LVDTGat~~~~l~~~~a~~------lgl~~~~----------------------~~~~~tA~ 53 (107)
T TIGR03698 2 LDVELSNPKNPEFMEVRALVDTGFSGFLLVPPDIVNK------LGLPELD----------------------QRRVYLAD 53 (107)
T ss_pred EEEEEeCCCCCCceEEEEEEECCCCeEEecCHHHHHH------cCCCccc----------------------CcEEEecC
Confidence 5677877732 68999999998664 43221110 0111111 12344556
Q ss_pred CCceeEEEEEEEEEeCC
Q 047816 163 MSSSSGVLGEDIISFGN 179 (620)
Q Consensus 163 g~~~~G~~~~D~v~lg~ 179 (620)
|....-....++|.+++
T Consensus 54 G~~~~~~v~~~~v~igg 70 (107)
T TIGR03698 54 GREVLTDVAKASIIING 70 (107)
T ss_pred CcEEEEEEEEEEEEECC
Confidence 65666677889999998
No 132
>PF14316 DUF4381: Domain of unknown function (DUF4381)
Probab=28.99 E-value=50 Score=30.05 Aligned_cols=11 Identities=27% Similarity=0.338 Sum_probs=4.8
Q ss_pred HHHHhhhhhhc
Q 047816 602 FGILFILRRRR 612 (620)
Q Consensus 602 ~~~~~~~r~r~ 612 (620)
++++..+|+++
T Consensus 36 ~~~~~~~r~~~ 46 (146)
T PF14316_consen 36 LLLWRLWRRWR 46 (146)
T ss_pred HHHHHHHHHHH
Confidence 33444444443
No 133
>PF05337 CSF-1: Macrophage colony stimulating factor-1 (CSF-1); InterPro: IPR008001 Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose primary function is to regulate the survival, proliferation, differentiation, and function of cells of the mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell surface receptors for CSF-1. The protein has also been linked with male fertility, and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption.; GO: 0005125 cytokine activity, 0008083 growth factor activity, 0016021 integral to membrane; PDB: 3EJJ_A.
Probab=28.91 E-value=19 Score=36.12 Aligned_cols=20 Identities=40% Similarity=0.556 Sum_probs=0.0
Q ss_pred HHHHHHHHHHhhhhhhcccc
Q 047816 596 VVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 596 ~~~l~~~~~~~~~r~r~~~~ 615 (620)
+++|++++.++|.|+|||.+
T Consensus 236 ILVLLaVGGLLfYr~rrRs~ 255 (285)
T PF05337_consen 236 ILVLLAVGGLLFYRRRRRSH 255 (285)
T ss_dssp --------------------
T ss_pred hhhhhhccceeeeccccccc
Confidence 45567777777766665544
No 134
>PF10873 DUF2668: Protein of unknown function (DUF2668); InterPro: IPR022640 Members of this family of proteins are annotated as cysteine and tyrosine-rich protein 1; however, no function is currently known.
Probab=28.88 E-value=33 Score=30.82 Aligned_cols=12 Identities=8% Similarity=0.019 Sum_probs=9.0
Q ss_pred chhh-hhHHHHHH
Q 047816 579 WWQE-HFLMVVLA 590 (620)
Q Consensus 579 ~~~~-~~~~i~~~ 590 (620)
.++. +|+||+.|
T Consensus 57 ~lsgtAIaGIVfg 69 (155)
T PF10873_consen 57 VLSGTAIAGIVFG 69 (155)
T ss_pred ccccceeeeeehh
Confidence 3445 89999988
No 135
>PF04689 S1FA: DNA binding protein S1FA; InterPro: IPR006779 S1FA is an unusual small plant peptide of only 70 amino acids with a basic domain which contains a nuclear localization signal and a putative DNA binding helix. S1FA is highly conserved between dicotyledonous and monocotyledonous plants and may be a DNA-binding protein that specifically recognises the negative promoter element S1F.; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=28.63 E-value=45 Score=25.49 Aligned_cols=36 Identities=8% Similarity=0.190 Sum_probs=23.8
Q ss_pred cccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 576 KRTWWQE-HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 576 ~~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
..++++- .||-|++| +..+++.+--++.+..|||.-
T Consensus 6 ~~KGlnPGlIVLlvV~-g~ll~flvGnyvlY~Yaqk~l 42 (69)
T PF04689_consen 6 EAKGLNPGLIVLLVVA-GLLLVFLVGNYVLYVYAQKTL 42 (69)
T ss_pred cccCCCCCeEEeehHH-HHHHHHHHHHHHHHHHHhhcC
Confidence 3467777 78887776 555555555556778888763
No 136
>TIGR03501 gamma_C_targ gammaproteobacterial enzyme C-terminal transmembrane domain. This homology domain, largely restricted to a subset of the gamma proteobacteria that excludes the enterobacteria, is found at the extreme carboxyl-terminus of a diverse set of proteins, most of which are enzymes with conventional signal sequences and with hydrolytic activities: nucleases, proteases, agarases, etc. Species that have this domain at all typically have from two to fifteen proteins tagged with this domain at the C-terminus. The agarase AgaA from Vibrio sp. strain JT0107 is secreted into the medium, while the same protein heterologously expressed in E. coli is retained in the cell fraction. This suggests cleavage and release in species with this domain. Both this suggestion and the chemical structure of the domain (motif, hydrophobic predicted transmembrane helix, cluster of basic residues) closely parallel those of the LPXTG/sortase system and the PEP-CTERM/exosortase(EpsH) system.
Probab=28.32 E-value=62 Score=20.22 Aligned_cols=13 Identities=31% Similarity=0.342 Sum_probs=5.1
Q ss_pred HHHHHHHHhhhhh
Q 047816 598 GLSVFGILFILRR 610 (620)
Q Consensus 598 ~l~~~~~~~~~r~ 610 (620)
+|+.+..+.++||
T Consensus 9 ~LllL~~~~~rRr 21 (26)
T TIGR03501 9 SLLLLLLLGLRRR 21 (26)
T ss_pred HHHHHHHHHHHHh
Confidence 3333333444443
No 137
>PF05283 MGC-24: Multi-glycosylated core protein 24 (MGC-24); InterPro: IPR007947 CD164 is a mucin-like receptor, or sialomucin, with specificity in receptor/ligand interactions that depends on the structural characteristics of the mucin-like receptor. Its functions include mediating, or regulating, haematopoietic progenitor cell adhesion and the negative regulation of their growth and/or differentiation. It exists in the native state as a disulphide-linked homodimer of two 80-85kDa subunits. It is usually expressed by CD34+ and CD34lo/- haematopoietic stem cells and associated microenvironmental cells. It contains, in its extracellular region, two mucin domains (I and II) linked by a non-mucin domain, which has been predicted to contain intra-disulphide bridges. This receptor may play a key role in haematopoiesis by facilitating the adhesion of human CD34+ cells to bone marrow stroma and by negatively regulating CD34+ CD34lo/- haematopoietic progenitor cell proliferation. These effects involve the CD164 class I and/or II epitopes recognised by the monoclonal antibodies (mAbs) 105A5 and 103B2/9E10. These epitopes are carbohydrate-dependent and are located on the N-terminal mucin domain I [, ]. It has been found that murine MGC-24v and rat endolyn share significant sequence similarities with human CD164. However, CD164 lacks the consensus glycosaminoglycan (GAG)-attachment site found in MGC-24; it is possible that GAG-association is responsible for the high molecular weight of the epithelial-derived MGC-24 glycoprotein []. Genomic structure studies have placed CD164 within the mucin-subgroup that comprises multiple exons, and demonstrate the diverse chromosomal distribution of this family of molecules. Molecules with such multiple exons may have sophisticated regulatory mechanisms that involve not only post-translational modifications of the oligosaccharide side chains, but also differential exon usage.
Although differences in the intron and exon sizes are seen between the mouse and human genes, the predicted proteins are similar in size and structure, maintaining functionally important motifs that regulate cell proliferation or subcellular distribution []. CD164 is a gene whose expression depends on differential usage of polyadenylation sites within the 3'-UTR. The conserved distribution of the 3.2- and 1.2-kb CD164 transcripts between mouse and human suggests that (i) a mechanism may exist to regulate tissue-specific polyadenylation, and (ii) differences in polyadenylation are important for the expression and function of CD164 in different tissues. Two other aspects of the structure of CD164 are of particular interest. First, it shares one of several conserved features of a cytokine-binding pocket - in this respect, it is notable that evidence exists for a class of cell-surface sialomucin modulators that directly interact with growth factor receptors to regulate their response to physiological ligands. Second, its cytoplasmic tail contains a C-terminal YHTL motif found in many endocytic membrane proteins or receptors. These Tyr-based motifs bind to adaptor proteins, which mediate the sorting of membrane proteins into transport vesicles from the plasma membrane to the endosomes, and between intracellular compartments.
Probab=27.90 E-value=66 Score=30.61 Aligned_cols=17 Identities=12% Similarity=0.280 Sum_probs=12.4
Q ss_pred HHHHHHHHHHHHHHhhh
Q 047816 592 TIMMVVGLSVFGILFIL 608 (620)
Q Consensus 592 ~~~~~~~l~~~~~~~~~ 608 (620)
+|+|+++|++++.++++
T Consensus 166 GIVL~LGv~aI~ff~~K 182 (186)
T PF05283_consen 166 GIVLTLGVLAIIFFLYK 182 (186)
T ss_pred HHHHHHHHHHHHHHHhh
Confidence 78888998887655443
No 138
>PF06040 Adeno_E3: Adenovirus E3 protein; InterPro: IPR009266 This family consists of several Adenovirus E3 proteins. The E3 protein does not seem to be essential for virus replication in cultured cells suggesting that the protein may function in virus-host interactions [].
Probab=27.78 E-value=62 Score=27.79 Aligned_cols=23 Identities=17% Similarity=0.332 Sum_probs=15.7
Q ss_pred cceeeeeeeecCCccccchhhhhHHHHHHHHHHHHHH
Q 047816 562 GNYKLLQWNIEPQVKRTWWQEHFLMVVLAITIMMVVG 598 (620)
Q Consensus 562 G~y~l~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~ 598 (620)
+|||...+. ++||++| +.+++++
T Consensus 82 ~p~evvG~l-------------~LGvV~G-G~i~vLc 104 (127)
T PF06040_consen 82 SPWEVVGYL-------------ILGVVAG-GLIAVLC 104 (127)
T ss_pred CCeeeeehh-------------hHHHHhc-cHHHHHH
Confidence 677666543 7899998 6665553
No 139
>PRK14748 kdpF potassium-transporting ATPase subunit F; Provisional
Probab=27.48 E-value=1.2e+02 Score=19.31 Aligned_cols=21 Identities=14% Similarity=0.251 Sum_probs=11.0
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhh
Q 047816 586 MVVLAITIMMVVGLSVFGILFIL 608 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~ 608 (620)
++++| +++++.|+....+++.
T Consensus 4 ~vi~G--~ilv~lLlgYLvyALi 24 (29)
T PRK14748 4 GVITG--VLLVFLLLGYLVYALI 24 (29)
T ss_pred HHHHH--HHHHHHHHHHHHHHHh
Confidence 45555 4555555555555443
No 140
>PF09472 MtrF: Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF); InterPro: IPR013347 Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump []. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C-terminal region of what appears to be a fusion of the MtrA and MtrF proteins [, ]. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.; GO: 0030269 tetrahydromethanopterin S-methyltransferase activity, 0015948 methanogenesis, 0016020 membrane
Probab=27.13 E-value=86 Score=24.25 Aligned_cols=30 Identities=10% Similarity=0.107 Sum_probs=16.9
Q ss_pred ccchhh-hhHHHHHHHHHHHHHHHHHHHHHhhh
Q 047816 577 RTWWQE-HFLMVVLAITIMMVVGLSVFGILFIL 608 (620)
Q Consensus 577 ~~~~~~-~~~~i~~~~~~~~~~~l~~~~~~~~~ 608 (620)
.++.+. -+.|.++| ++++++|..+-.++.|
T Consensus 34 ~SGv~~~~~~GfaiG--~~~AlvLv~ip~~l~~ 64 (64)
T PF09472_consen 34 ESGVMATGIKGFAIG--FLFALVLVGIPILLMF 64 (64)
T ss_pred HHHHhhhhhHHHHHH--HHHHHHHHHHHHHHhC
Confidence 345555 78888887 4444444444444443
No 141
>KOG1094 consensus Discoidin domain receptor DDR1 [Signal transduction mechanisms]
Probab=26.74 E-value=80 Score=35.47 Aligned_cols=13 Identities=38% Similarity=0.470 Sum_probs=7.3
Q ss_pred CcEEEEEEeCCCC
Q 047816 96 PQTFALIVDTGST 108 (620)
Q Consensus 96 ~Q~~~v~vDTGSs 108 (620)
+|.-++..|-||.
T Consensus 53 ~~~~rl~se~g~G 65 (807)
T KOG1094|consen 53 PQHARLHSEDGSG 65 (807)
T ss_pred cccccccccCCCc
Confidence 5555566655544
No 142
>PRK15348 type III secretion system lipoprotein SsaJ; Provisional
Probab=26.65 E-value=79 Score=31.63 Aligned_cols=35 Identities=14% Similarity=0.230 Sum_probs=19.5
Q ss_pred EEEeecCCCCCCchhHHHHHHhhcc-cccccceEEee
Q 047816 480 MFLSINYSDLRPHIPELADSIAQEL-DVNTSQVHLLN 515 (620)
Q Consensus 480 ~~~~~~~~~~~~~~~~~~~~~~~~l-~~~~~qv~~~~ 515 (620)
++++|+ .+..+........+|+.. ++++..|.|..
T Consensus 149 I~~~~~-~~~~~~~v~I~~LVA~SV~gL~~enVTVvd 184 (249)
T PRK15348 149 IKYSPQ-VNMEAFRVKIKDLIEMSIPGLQYSKISILM 184 (249)
T ss_pred EEeCCC-CChHHHHHHHHHHHHHhcCCCCccceEEEe
Confidence 334555 344444336777777743 45666666654
No 143
>PF13706 PepSY_TM_3: PepSY-associated TM helix
Probab=25.21 E-value=1.3e+02 Score=20.26 Aligned_cols=22 Identities=18% Similarity=0.338 Sum_probs=14.1
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHH
Q 047816 583 HFLMVVLAITIMMVVGLSVFGIL 605 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~ 605 (620)
.++|+++| ...+++.++...+.
T Consensus 9 ~W~Gl~~g-~~l~~~~~tG~~~~ 30 (37)
T PF13706_consen 9 RWLGLILG-LLLFVIFLTGAVMV 30 (37)
T ss_pred HHHHHHHH-HHHHHHHHHhHHHH
Confidence 57889888 66656665554433
No 144
>KOG0860 consensus Synaptobrevin/VAMP-like protein [Intracellular trafficking, secretion, and vesicular transport]
Probab=24.80 E-value=72 Score=27.71 Aligned_cols=15 Identities=20% Similarity=0.950 Sum_probs=7.9
Q ss_pred cccchhhhhHHHHHH
Q 047816 576 KRTWWQEHFLMVVLA 590 (620)
Q Consensus 576 ~~~~~~~~~~~i~~~ 590 (620)
++-||...-+-+.+|
T Consensus 85 rk~wWkn~Km~~il~ 99 (116)
T KOG0860|consen 85 RKMWWKNCKMRIILG 99 (116)
T ss_pred HHHHHHHHHHHHHHH
Confidence 456777733333344
No 145
>PF14610 DUF4448: Protein of unknown function (DUF4448)
Probab=24.55 E-value=22 Score=33.96 Aligned_cols=8 Identities=25% Similarity=0.032 Sum_probs=3.5
Q ss_pred Ccceeeee
Q 047816 561 FGNYKLLQ 568 (620)
Q Consensus 561 fG~y~l~~ 568 (620)
-||--.+.
T Consensus 123 ~GP~V~~~ 130 (189)
T PF14610_consen 123 KGPTVSLT 130 (189)
T ss_pred cCCeEEee
Confidence 44444443
No 146
>PF12301 CD99L2: CD99 antigen like protein 2; InterPro: IPR022078 This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 237 amino acids in length. CD99L2 and CD99 are involved in trans-endothelial migration of neutrophils in vitro and in the recruitment of neutrophils into inflamed peritoneum.
Probab=24.04 E-value=77 Score=29.68 Aligned_cols=12 Identities=8% Similarity=0.122 Sum_probs=6.2
Q ss_pred CcccccHHHHHH
Q 047816 534 SANYISNATALR 545 (620)
Q Consensus 534 ~~~~f~~~~~~~ 545 (620)
.+-.|+.++...
T Consensus 68 ~gg~fsD~DL~D 79 (169)
T PF12301_consen 68 GGGGFSDSDLFD 79 (169)
T ss_pred CCCCcCcccccc
Confidence 344566555444
No 147
>PF13172 PepSY_TM_1: PepSY-associated TM helix
Probab=24.04 E-value=1.2e+02 Score=20.01 Aligned_cols=26 Identities=19% Similarity=0.535 Sum_probs=15.1
Q ss_pred ccchhh--hhHHHHHHHHHHHHHHHHHHH
Q 047816 577 RTWWQE--HFLMVVLAITIMMVVGLSVFG 603 (620)
Q Consensus 577 ~~~~~~--~~~~i~~~~~~~~~~~l~~~~ 603 (620)
++.+.+ .++|+..+ ...++++++.+.
T Consensus 2 r~~~~~~H~~~g~~~~-~~ll~~~lTG~~ 29 (34)
T PF13172_consen 2 RKFWRKIHRWLGLIAA-IFLLLLALTGAL 29 (34)
T ss_pred hHHHHHHHHHHHHHHH-HHHHHHHHHHHH
Confidence 344555 46677766 566666665543
No 148
>PRK11486 flagellar biosynthesis protein FliO; Provisional
Probab=23.31 E-value=87 Score=27.70 Aligned_cols=20 Identities=10% Similarity=0.240 Sum_probs=14.6
Q ss_pred HHHHHHHHHHHHHHhhhhhh
Q 047816 592 TIMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 592 ~~~~~~~l~~~~~~~~~r~r 611 (620)
+..++++|+++..|+++|..
T Consensus 24 ~L~lVl~lI~~~aWLlkR~~ 43 (124)
T PRK11486 24 ALIGIIALILAAAWLVKRLG 43 (124)
T ss_pred HHHHHHHHHHHHHHHHHHcC
Confidence 45667777777788888864
No 149
>PF11615 DUF3249: Protein of unknown function (DUF3249); InterPro: IPR021653 This family of proteins represents the gene product of the protein CAF4, the yeast protein YKR036c. This protein contains seven WD40 repeats in its C terminus. The function however is unknown []. ; PDB: 2PQR_D.
Probab=23.08 E-value=56 Score=23.31 Aligned_cols=25 Identities=40% Similarity=0.723 Sum_probs=12.1
Q ss_pred CcccccHHHHHHHHHHHccCccCCC
Q 047816 534 SANYISNATALRIISRLAEHRVHIP 558 (620)
Q Consensus 534 ~~~~f~~~~~~~i~~~~~~~~~~~~ 558 (620)
..++-...+..||..-|.++++|+|
T Consensus 11 qnnyadsattfrilahldeqryplp 35 (60)
T PF11615_consen 11 QNNYADSATTFRILAHLDEQRYPLP 35 (60)
T ss_dssp ------HHHHHHHHT---TTTS---
T ss_pred eccccchhhHHHHHHhhcccccCCC
Confidence 3455667889999999999999987
No 150
>TIGR03063 srtB_target sortase B cell surface sorting signal. Two different classes of sorting signal, both analogous to the sortase A signal LPXTG, may be recognized by the sortase SrtB. These are given as NXZTN and NPKXZ. Proteins sorted by this class of sortase are less common than the sortase A and LPXTG system. This model describes a number of cell surface protein C-terminal regions from Gram-positive bacteria that appear to be sortase B (SrtB) sorting signals.
Probab=22.79 E-value=1e+02 Score=19.80 Aligned_cols=7 Identities=29% Similarity=0.762 Sum_probs=3.5
Q ss_pred HHhhhhh
Q 047816 604 ILFILRR 610 (620)
Q Consensus 604 ~~~~~r~ 610 (620)
.+++||+
T Consensus 22 ~~Li~k~ 28 (29)
T TIGR03063 22 LFLIRKR 28 (29)
T ss_pred HHHhhcc
Confidence 4555443
No 151
>PF01002 Flavi_NS2B: Flavivirus non-structural protein NS2B; InterPro: IPR000487 Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All, but two, are cleaved by the NS2B-NS3 protease complex [, ].; GO: 0004252 serine-type endopeptidase activity, 0019012 virion; PDB: 2WV9_A 2FOM_A 2VBC_B 3U1I_C 3U1J_A 3LKW_A 3L6P_A 2GGV_A 3E90_C 2IJO_A ....
Probab=22.77 E-value=86 Score=27.91 Aligned_cols=27 Identities=7% Similarity=0.047 Sum_probs=13.1
Q ss_pred ccccceEEe---eeeec------CCceeeEEEEecC
Q 047816 506 VNTSQVHLL---NFMSK------GNNSFIAWAVFPS 532 (620)
Q Consensus 506 ~~~~qv~~~---~~~~~------g~~~~~~~~~~P~ 532 (620)
....|+.+. +++|+ |...+..+++-..
T Consensus 44 gks~~L~~E~ag~i~W~~ea~~sG~s~rldV~~d~~ 79 (128)
T PF01002_consen 44 GKSTDLWLEWAGDISWEEEAEISGGSVRLDVKLDDD 79 (128)
T ss_dssp ---SSEEEEEEE-S---TTHEEHSEEEEEEEEE-TT
T ss_pred cccCceEEEEEeccccCccchhcCCceEEEEEECCC
Confidence 345566665 77888 7777777777554
No 152
>PF13179 DUF4006: Family of unknown function (DUF4006)
Probab=22.54 E-value=1.4e+02 Score=23.24 Aligned_cols=19 Identities=26% Similarity=0.345 Sum_probs=7.7
Q ss_pred HHHHHhhhhhhccccCCCC
Q 047816 601 VFGILFILRRRRQSVNSYK 619 (620)
Q Consensus 601 ~~~~~~~~r~r~~~~~~~~ 619 (620)
.++.+.|.--+......|+
T Consensus 29 ~lt~~ai~~Qq~~At~~Y~ 47 (66)
T PF13179_consen 29 FLTYWAIKVQQEQATNPYK 47 (66)
T ss_pred HHHHHHHHHHHHHhcCCcc
Confidence 3334444333333444443
No 153
>PF11669 WBP-1: WW domain-binding protein 1; InterPro: IPR021684 This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain [].
Probab=22.32 E-value=81 Score=26.86 Aligned_cols=15 Identities=33% Similarity=0.142 Sum_probs=8.5
Q ss_pred HHHHHHHHHhhhhhh
Q 047816 597 VGLSVFGILFILRRR 611 (620)
Q Consensus 597 ~~l~~~~~~~~~r~r 611 (620)
+.|+.+.++-.||+|
T Consensus 32 ill~c~c~~~~~r~r 46 (102)
T PF11669_consen 32 ILLSCCCACRHRRRR 46 (102)
T ss_pred HHHHHHHHHHHHHHH
Confidence 334555566666654
No 154
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=22.32 E-value=41 Score=31.76 Aligned_cols=11 Identities=0% Similarity=0.184 Sum_probs=7.9
Q ss_pred hhHHHHHHHHH
Q 047816 583 HFLMVVLAITI 593 (620)
Q Consensus 583 ~~~~i~~~~~~ 593 (620)
.+++|++||++
T Consensus 76 ~~~~iivgvi~ 86 (179)
T PF13908_consen 76 FITGIIVGVIC 86 (179)
T ss_pred ceeeeeeehhh
Confidence 68889998433
No 155
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=22.17 E-value=20 Score=30.84 Aligned_cols=18 Identities=28% Similarity=0.425 Sum_probs=0.0
Q ss_pred HHHHHHHHHHHHHhhhhh
Q 047816 593 IMMVVGLSVFGILFILRR 610 (620)
Q Consensus 593 ~~~~~~l~~~~~~~~~r~ 610 (620)
++++.+|++++-|..+||
T Consensus 33 ~VILgiLLliGCWYckRR 50 (118)
T PF14991_consen 33 IVILGILLLIGCWYCKRR 50 (118)
T ss_dssp ------------------
T ss_pred HHHHHHHHHHhheeeeec
Confidence 333334444555555444
No 156
>PF01282 Ribosomal_S24e: Ribosomal protein S24e; InterPro: IPR001976 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. This family contains the S24e ribosomal proteins from eukaryotes and archaebacteria. These proteins have 101 to 148 amino acids.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 2V94_B 1YWX_A 2G1D_A 3IZ6_U 1XN9_A 2XZM_P 2XZN_P 3U5G_Y 3J16_D 3IZB_U ....
Probab=22.15 E-value=3.7e+02 Score=21.92 Aligned_cols=43 Identities=19% Similarity=0.318 Sum_probs=36.6
Q ss_pred CCchhHHHHHHhhcccccccceEEeeeeec--CCceeeEEEEecC
Q 047816 490 RPHIPELADSIAQELDVNTSQVHLLNFMSK--GNNSFIAWAVFPS 532 (620)
Q Consensus 490 ~~~~~~~~~~~~~~l~~~~~qv~~~~~~~~--g~~~~~~~~~~P~ 532 (620)
.|-..+..+-||..|+++..+|.|.++.-+ +........|+-+
T Consensus 12 Tpsr~ei~~klA~~~~~~~~~ivv~~~~t~fG~~~s~g~a~IYd~ 56 (84)
T PF01282_consen 12 TPSRKEIREKLAAMLNVDPDLIVVFGIKTEFGGGKSTGFAKIYDS 56 (84)
T ss_dssp S--HHHHHHHHHHHHTSTGCCEEEEEEEESSSSSEEEEEEEEESS
T ss_pred CCCHHHHHHHHHHHhCCCCCeEEEeccEecCCCceEEEEEEEeCC
Confidence 577889999999999999999999999988 5678888888875
No 157
>PF02038 ATP1G1_PLM_MAT8: ATP1G1/PLM/MAT8 family; InterPro: IPR000272 The FXYD protein family contains at least seven members in mammals []. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable. Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=22.08 E-value=1.2e+02 Score=22.23 Aligned_cols=25 Identities=24% Similarity=0.444 Sum_probs=16.1
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816 586 MVVLAITIMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~r~r 611 (620)
..=+| +.+++.+|.+++++++.-+|
T Consensus 13 tLrig-GLi~A~vlfi~Gi~iils~k 37 (50)
T PF02038_consen 13 TLRIG-GLIFAGVLFILGILIILSGK 37 (50)
T ss_dssp HHHHH-HHHHHHHHHHHHHHHHCTTH
T ss_pred Hhhcc-chHHHHHHHHHHHHHHHcCc
Confidence 34466 77777777777776665443
No 158
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=21.27 E-value=69 Score=26.64 Aligned_cols=23 Identities=30% Similarity=0.473 Sum_probs=18.3
Q ss_pred EEecCCC-cEEEEEEeCCCCceeEeC
Q 047816 90 LWIGTPP-QTFALIVDTGSTVTYVPC 114 (620)
Q Consensus 90 i~iGTP~-Q~~~v~vDTGSs~~WV~~ 114 (620)
+.|. + +.+++++|||++..-++-
T Consensus 3 ~~i~--g~~~v~~~vDtGA~vnllp~ 26 (93)
T cd05481 3 MKIN--GKQSVKFQLDTGATCNVLPL 26 (93)
T ss_pred eEeC--CceeEEEEEecCCEEEeccH
Confidence 4444 5 899999999999877764
No 159
>PRK00523 hypothetical protein; Provisional
Probab=21.06 E-value=93 Score=24.59 Aligned_cols=25 Identities=8% Similarity=-0.012 Sum_probs=11.5
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhh
Q 047816 586 MVVLAITIMMVVGLSVFGILFILRRR 611 (620)
Q Consensus 586 ~i~~~~~~~~~~~l~~~~~~~~~r~r 611 (620)
+++++ .+++++++-+++.+++-||.
T Consensus 5 ~l~I~-l~i~~li~G~~~Gffiark~ 29 (72)
T PRK00523 5 GLALG-LGIPLLIVGGIIGYFVSKKM 29 (72)
T ss_pred HHHHH-HHHHHHHHHHHHHHHHHHHH
Confidence 44444 23333333445556665554
No 160
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation [].; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=20.81 E-value=33 Score=37.45 Aligned_cols=30 Identities=33% Similarity=0.455 Sum_probs=0.0
Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhhhhhhc
Q 047816 583 HFLMVVLAITIMMVVGLSVFGILFILRRRR 612 (620)
Q Consensus 583 ~~~~i~~~~~~~~~~~l~~~~~~~~~r~r~ 612 (620)
.++++++|++++++++++++++++.+||||
T Consensus 353 ~~l~vVlgvavlivVv~viv~vc~~~rrrR 382 (439)
T PF02480_consen 353 ALLGVVLGVAVLIVVVGVIVWVCLRCRRRR 382 (439)
T ss_dssp ------------------------------
T ss_pred chHHHHHHHHHHHHHHHHHhheeeeehhcc
Confidence 566666663444444434433333333333
No 161
>PF03597 CcoS: Cytochrome oxidase maturation protein cbb3-type; InterPro: IPR004714 Cytochrome cbb3 oxidases are found almost exclusively in Proteobacteria, and represent a distinctive class of proton-pumping respiratory haem-copper oxidases (HCO) that lack many of the key structural features that contribute to the reaction cycle of the intensely studied mitochondrial cytochrome c oxidase (CcO). Expression of cytochrome cbb3 oxidase allows human pathogens to colonise anoxic tissues and agronomically important diazotrophs to sustain nitrogen fixation []. Genes encoding a cytochrome cbb3 oxidase were initially designated fixNOQP (ccoNOQP). The ccoNOQP operon is always found close to a second gene cluster, known as fixGHIS (ccoGHIS), whose expression is necessary for the assembly of a functional cbb3 oxidase. On the basis of their derived amino acid sequences, each of the four proteins encoded by the ccoGHIS operon is thought to be membrane-bound. It has been suggested that they may function in concert as a multi-subunit complex, possibly playing a role in the uptake and metabolism of copper required for the assembly of the binuclear centre of cytochrome cbb3 oxidase.
Probab=20.66 E-value=1.9e+02 Score=20.69 Aligned_cols=12 Identities=17% Similarity=0.492 Sum_probs=4.4
Q ss_pred HHHHHHHHHHhh
Q 047816 596 VVGLSVFGILFI 607 (620)
Q Consensus 596 ~~~l~~~~~~~~ 607 (620)
++++.+++.++|
T Consensus 12 ~l~~~~l~~f~W 23 (45)
T PF03597_consen 12 ILGLIALAAFLW 23 (45)
T ss_pred HHHHHHHHHHHH
Confidence 333333333333
No 162
>PRK00972 tetrahydromethanopterin S-methyltransferase subunit E; Provisional
Probab=20.53 E-value=82 Score=31.24 Aligned_cols=9 Identities=33% Similarity=0.608 Sum_probs=5.2
Q ss_pred hccccCCCC
Q 047816 611 RRQSVNSYK 619 (620)
Q Consensus 611 r~~~~~~~~ 619 (620)
.|++...||
T Consensus 284 aR~~yGpY~ 292 (292)
T PRK00972 284 ARKKYGPYK 292 (292)
T ss_pred HHhhcCCCC
Confidence 355666664
No 163
>PTZ00208 65 kDa invariant surface glycoprotein; Provisional
Probab=20.29 E-value=92 Score=33.04 Aligned_cols=6 Identities=17% Similarity=0.169 Sum_probs=3.0
Q ss_pred cccccc
Q 047816 338 NDICFS 343 (620)
Q Consensus 338 ~~~C~~ 343 (620)
...|..
T Consensus 237 ~~~C~~ 242 (436)
T PTZ00208 237 DMNCNI 242 (436)
T ss_pred Cccccc
Confidence 455643
No 164
>PF11118 DUF2627: Protein of unknown function (DUF2627); InterPro: IPR020138 This entry represents uncharacterised membrane proteins with no known function.
Probab=20.22 E-value=1.1e+02 Score=24.35 Aligned_cols=27 Identities=11% Similarity=0.402 Sum_probs=19.0
Q ss_pred HHHHHHHHHHHHHHHHHHhhhhhhcccc
Q 047816 588 VLAITIMMVVGLSVFGILFILRRRRQSV 615 (620)
Q Consensus 588 ~~~~~~~~~~~l~~~~~~~~~r~r~~~~ 615 (620)
++| .+..++++..++.|.+.|-|+|.+
T Consensus 44 l~G-~~lf~~G~~Fi~GfI~~RDRKrnk 70 (77)
T PF11118_consen 44 LAG-LLLFAIGVGFIAGFILHRDRKRNK 70 (77)
T ss_pred HHH-HHHHHHHHHHHHhHhheeeccccc
Confidence 344 666788888888888887776543
Done!