Query 000479
Match_columns 1470
No_of_seqs 844 out of 4670
Neff 5.7
Searched_HMMs 46136
Date Fri Mar 29 10:14:25 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/000479.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/000479hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1082 Histone H3 (Lys9) meth 100.0 3E-30 6.6E-35 305.8 13.1 211 1224-1470 52-279 (364)
2 KOG4442 Clathrin coat binding 99.9 5.4E-23 1.2E-27 247.3 7.4 105 1353-1470 92-200 (729)
3 KOG1141 Predicted histone meth 99.9 8.7E-23 1.9E-27 244.3 3.6 255 1158-1438 567-855 (1262)
4 KOG2462 C2H2-type Zn-finger pr 99.8 3.5E-22 7.6E-27 220.3 4.8 137 880-1068 129-265 (279)
5 KOG2462 C2H2-type Zn-finger pr 99.8 1.5E-21 3.2E-26 215.4 5.5 111 915-1071 130-240 (279)
6 KOG1074 Transcriptional repres 99.8 3.3E-21 7.1E-26 234.1 7.9 215 847-1071 605-932 (958)
7 KOG3608 Zn finger proteins [Ge 99.8 3.7E-21 8E-26 215.0 5.8 250 752-1059 136-399 (467)
8 KOG1079 Transcriptional repres 99.7 6.4E-19 1.4E-23 211.1 5.4 126 1296-1470 536-672 (739)
9 KOG3608 Zn finger proteins [Ge 99.7 3.6E-19 7.8E-24 199.3 2.7 188 848-1067 178-373 (467)
10 KOG3623 Homeobox transcription 99.7 7.4E-19 1.6E-23 209.9 5.2 81 983-1070 893-973 (1007)
11 PF05033 Pre-SET: Pre-SET moti 99.7 9E-18 2E-22 165.3 7.5 103 1236-1371 1-103 (103)
12 KOG1074 Transcriptional repres 99.7 6.1E-18 1.3E-22 206.0 6.2 113 953-1070 605-733 (958)
13 smart00468 PreSET N-terminal t 99.6 2.5E-16 5.4E-21 154.1 8.1 96 1234-1363 1-98 (98)
14 KOG3576 Ovo and related transc 99.4 5.6E-14 1.2E-18 148.4 0.0 119 915-1046 117-240 (267)
15 KOG1141 Predicted histone meth 99.4 5.5E-13 1.2E-17 161.5 7.9 172 1229-1420 872-1054(1262)
16 KOG1080 Histone H3 (Lys4) meth 99.4 2.8E-13 6E-18 174.3 5.0 79 1379-1470 866-946 (1005)
17 KOG3576 Ovo and related transc 99.3 6E-13 1.3E-17 140.7 1.9 88 843-942 113-200 (267)
18 KOG3623 Homeobox transcription 99.2 5.4E-12 1.2E-16 152.1 1.6 122 882-1041 211-332 (1007)
19 KOG1083 Putative transcription 99.2 2.7E-12 5.9E-17 159.7 -2.0 91 1367-1470 1165-1257(1306)
20 smart00317 SET SET (Su(var)3-9 99.0 3.8E-10 8.2E-15 111.2 7.2 78 1380-1470 1-80 (116)
21 PLN03086 PRLI-interacting fact 98.8 5.9E-09 1.3E-13 128.5 7.3 144 848-1041 408-564 (567)
22 PLN03086 PRLI-interacting fact 98.8 7.2E-09 1.6E-13 127.8 7.5 133 882-1057 408-552 (567)
23 KOG1085 Predicted methyltransf 98.8 6.2E-09 1.4E-13 115.4 6.2 86 1375-1470 252-340 (392)
24 PHA00733 hypothetical protein 98.5 5.5E-08 1.2E-12 100.1 3.7 63 972-1043 62-124 (128)
25 PHA00733 hypothetical protein 98.4 2.1E-07 4.6E-12 95.8 4.3 93 969-1068 25-121 (128)
26 KOG3993 Transcription factor ( 98.4 8.9E-08 1.9E-12 111.4 0.6 180 848-1042 268-483 (500)
27 PHA02768 hypothetical protein; 98.1 7E-07 1.5E-11 78.1 0.5 42 985-1034 6-47 (55)
28 PHA02768 hypothetical protein; 98.0 2.8E-06 6.1E-11 74.4 2.1 46 1018-1065 5-50 (55)
29 COG2940 Proteins containing SE 97.9 2E-06 4.3E-11 106.9 0.1 105 1355-1470 308-412 (480)
30 KOG3993 Transcription factor ( 97.8 7.8E-06 1.7E-10 95.7 1.8 171 881-1068 267-480 (500)
31 PF13465 zf-H2C2_2: Zinc-finge 97.7 1.6E-05 3.5E-10 59.6 1.4 24 972-995 2-25 (26)
32 PHA00732 hypothetical protein 97.4 0.00011 2.4E-09 69.6 3.2 47 984-1042 1-48 (79)
33 PHA00616 hypothetical protein 97.3 6.7E-05 1.5E-09 63.0 0.1 33 1018-1050 1-33 (44)
34 PF13465 zf-H2C2_2: Zinc-finge 97.2 0.00015 3.3E-09 54.4 1.7 25 999-1029 1-25 (26)
35 PF05605 zf-Di19: Drought indu 97.2 0.00028 6.1E-09 62.1 2.9 52 984-1042 2-53 (54)
36 PHA00732 hypothetical protein 97.0 0.00045 9.8E-09 65.6 2.8 47 1018-1070 1-48 (79)
37 PF05605 zf-Di19: Drought indu 97.0 0.00054 1.2E-08 60.3 3.0 52 847-905 2-53 (54)
38 PHA00616 hypothetical protein 96.7 0.00054 1.2E-08 57.7 0.9 34 984-1023 1-34 (44)
39 COG5189 SFP1 Putative transcri 96.7 0.00083 1.8E-08 76.6 2.3 57 982-1038 347-418 (423)
40 COG5189 SFP1 Putative transcri 96.2 0.0018 4E-08 73.9 1.6 71 844-935 346-418 (423)
41 PF00096 zf-C2H2: Zinc finger, 95.8 0.0025 5.5E-08 46.0 0.2 19 985-1003 1-19 (23)
42 PF12756 zf-C2H2_2: C2H2 type 95.7 0.0046 1E-07 59.8 1.7 71 917-1005 1-71 (100)
43 PF12756 zf-C2H2_2: C2H2 type 95.7 0.0041 8.8E-08 60.2 1.2 73 849-938 1-73 (100)
44 KOG2231 Predicted E3 ubiquitin 95.7 0.011 2.3E-07 75.1 4.8 146 881-1050 115-275 (669)
45 cd01395 HMT_MBD Methyl-CpG bin 95.4 0.0032 7E-08 56.6 -0.7 37 1184-1220 1-49 (60)
46 PF00096 zf-C2H2: Zinc finger, 95.2 0.0075 1.6E-07 43.5 0.9 23 1019-1041 1-23 (23)
47 KOG2231 Predicted E3 ubiquitin 95.0 0.026 5.7E-07 71.7 5.1 120 882-1042 100-236 (669)
48 PF13912 zf-C2H2_6: C2H2-type 94.6 0.015 3.2E-07 43.7 1.0 23 1019-1041 2-24 (27)
49 PF13894 zf-C2H2_4: C2H2-type 94.5 0.016 3.5E-07 41.5 1.1 19 884-902 3-21 (24)
50 COG5048 FOG: Zn-finger [Genera 94.5 0.035 7.6E-07 66.4 4.4 62 989-1054 393-454 (467)
51 PF13894 zf-C2H2_4: C2H2-type 93.8 0.034 7.4E-07 39.8 1.6 19 985-1003 1-19 (24)
52 PF13912 zf-C2H2_6: C2H2-type 93.6 0.034 7.3E-07 41.8 1.3 26 984-1010 1-26 (27)
53 KOG1146 Homeobox protein [Gene 93.5 0.041 8.9E-07 73.4 2.6 159 884-1069 439-641 (1406)
54 KOG1146 Homeobox protein [Gene 92.8 0.033 7.1E-07 74.3 0.1 151 849-1008 438-641 (1406)
55 PF09237 GAGA: GAGA factor; I 92.5 0.05 1.1E-06 47.2 0.8 31 1016-1046 22-52 (54)
56 smart00355 ZnF_C2H2 zinc finge 92.1 0.07 1.5E-06 38.6 1.1 23 985-1008 1-23 (26)
57 COG5236 Uncharacterized conser 91.8 0.095 2.1E-06 60.9 2.3 126 915-1066 151-301 (493)
58 COG5048 FOG: Zn-finger [Genera 91.8 0.13 2.9E-06 61.5 3.7 169 846-1037 288-464 (467)
59 PF01352 KRAB: KRAB box; Inte 91.3 0.059 1.3E-06 45.1 0.0 29 732-765 1-29 (41)
60 KOG1081 Transcription factor N 91.1 0.064 1.4E-06 66.8 0.0 89 1353-1470 287-378 (463)
61 PRK04860 hypothetical protein; 90.9 0.094 2E-06 56.5 1.0 37 984-1030 119-155 (160)
62 smart00355 ZnF_C2H2 zinc finge 90.6 0.16 3.5E-06 36.6 1.8 24 1019-1042 1-24 (26)
63 PRK04860 hypothetical protein; 90.6 0.12 2.7E-06 55.6 1.6 36 1018-1057 119-154 (160)
64 cd05162 PWWP The PWWP domain, 90.1 0.28 6.1E-06 47.3 3.4 60 157-220 6-66 (87)
65 smart00570 AWS associated with 90.0 0.14 3E-06 45.0 1.1 25 1353-1377 26-50 (51)
66 cd05840 SPBC215_ISWI_like The 88.7 0.32 7E-06 47.9 2.7 59 157-216 6-65 (93)
67 COG5236 Uncharacterized conser 88.3 0.38 8.3E-06 56.1 3.4 134 757-942 158-308 (493)
68 PF13909 zf-H2C2_5: C2H2-type 87.3 0.19 4.1E-06 36.8 0.1 22 1019-1041 1-22 (24)
69 PF09237 GAGA: GAGA factor; I 87.1 0.27 5.8E-06 42.9 1.0 29 880-908 23-51 (54)
70 KOG4173 Alpha-SNAP protein [In 86.1 0.16 3.4E-06 55.6 -1.2 91 880-1010 78-172 (253)
71 PF12874 zf-met: Zinc-finger o 85.8 0.26 5.7E-06 36.2 0.2 21 1019-1039 1-21 (25)
72 PF12874 zf-met: Zinc-finger o 84.6 0.4 8.8E-06 35.2 0.7 21 883-903 2-22 (25)
73 KOG2785 C2H2-type Zn-finger pr 84.2 0.87 1.9E-05 54.5 3.6 52 1016-1067 164-241 (390)
74 PF11722 zf-TRM13_CCCH: CCCH z 83.3 0.39 8.4E-06 38.0 0.2 29 533-561 2-30 (31)
75 PF13909 zf-H2C2_5: C2H2-type 83.2 0.51 1.1E-05 34.5 0.8 21 883-904 2-22 (24)
76 PF12171 zf-C2H2_jaz: Zinc-fin 82.5 0.77 1.7E-05 34.7 1.5 22 1019-1040 2-23 (27)
77 KOG2482 Predicted C2H2-type Zn 82.4 1.1 2.4E-05 52.7 3.4 51 1019-1069 280-357 (423)
78 KOG2785 C2H2-type Zn-finger pr 81.0 2.6 5.7E-05 50.6 5.9 56 983-1039 165-241 (390)
79 PF12171 zf-C2H2_jaz: Zinc-fin 80.6 0.8 1.7E-05 34.6 1.0 21 882-902 2-22 (27)
80 KOG4173 Alpha-SNAP protein [In 75.5 0.87 1.9E-05 50.1 -0.1 86 846-940 78-171 (253)
81 KOG2893 Zn finger protein [Gen 75.2 1.1 2.5E-05 50.0 0.7 46 987-1042 13-59 (341)
82 PF00856 SET: SET domain; Int 74.8 1.8 4E-05 44.3 2.1 31 1390-1420 1-31 (162)
83 KOG2482 Predicted C2H2-type Zn 74.1 2.2 4.8E-05 50.2 2.7 25 916-940 280-304 (423)
84 KOG2461 Transcription factor B 69.4 4.2 9E-05 50.1 3.7 75 1377-1468 26-105 (396)
85 cd05837 MSH6_like The PWWP dom 66.4 6.2 0.00013 40.1 3.6 63 157-219 8-71 (110)
86 smart00451 ZnF_U1 U1-like zinc 59.9 4 8.7E-05 32.3 0.8 21 1018-1038 3-23 (35)
87 PF13913 zf-C2HC_2: zinc-finge 59.4 5.9 0.00013 29.8 1.5 18 985-1003 3-20 (25)
88 smart00451 ZnF_U1 U1-like zinc 58.2 4.1 8.9E-05 32.2 0.5 24 881-904 3-26 (35)
89 smart00391 MBD Methyl-CpG bind 55.8 3.8 8.2E-05 39.2 -0.0 37 1184-1220 3-53 (77)
90 PF13913 zf-C2HC_2: zinc-finge 55.4 6.1 0.00013 29.7 1.0 17 884-901 5-21 (25)
91 COG4049 Uncharacterized protei 55.3 4.8 0.0001 35.9 0.5 32 978-1009 11-42 (65)
92 TIGR00622 ssl1 transcription f 53.8 21 0.00046 36.6 4.8 20 915-934 15-34 (112)
93 PF09986 DUF2225: Uncharacteri 53.6 4.4 9.6E-05 45.9 0.1 49 983-1031 4-61 (214)
94 KOG2893 Zn finger protein [Gen 52.2 6 0.00013 44.6 0.8 47 884-940 13-59 (341)
95 TIGR00622 ssl1 transcription f 51.9 21 0.00046 36.6 4.5 82 847-939 15-104 (112)
96 KOG1280 Uncharacterized conser 49.5 13 0.00029 44.3 3.0 27 847-873 79-105 (381)
97 KOG2186 Cell growth-regulating 43.0 12 0.00026 43.0 1.3 46 882-936 4-49 (276)
98 PHA00626 hypothetical protein 42.6 10 0.00023 33.9 0.6 14 1017-1030 22-35 (59)
99 cd00122 MBD MeCP2, MBD1, MBD2, 42.3 8.1 0.00018 35.2 -0.1 27 1194-1220 23-50 (62)
100 PF12013 DUF3505: Protein of u 41.7 27 0.00058 35.2 3.5 27 1017-1043 79-109 (109)
101 smart00293 PWWP domain with co 41.3 29 0.00063 31.7 3.3 56 157-215 6-62 (63)
102 COG4049 Uncharacterized protei 40.6 12 0.00026 33.5 0.7 33 840-872 10-42 (65)
103 PF00855 PWWP: PWWP domain; I 39.9 28 0.00061 33.1 3.2 56 157-219 6-62 (86)
104 cd01397 HAT_MBD Methyl-CpG bin 39.8 11 0.00023 35.9 0.3 26 1194-1219 23-49 (73)
105 KOG3813 Uncharacterized conser 38.2 15 0.00033 45.6 1.3 19 1299-1318 307-325 (640)
106 cd05838 WHSC1_related The PWWP 36.8 27 0.00058 34.7 2.6 54 158-214 7-61 (95)
107 cd00350 rubredoxin_like Rubred 34.9 24 0.00052 28.3 1.5 10 985-994 2-11 (33)
108 PF09538 FYDLN_acid: Protein o 33.5 25 0.00053 35.9 1.7 30 985-1031 10-39 (108)
109 KOG2461 Transcription factor B 32.7 63 0.0014 40.1 5.4 80 969-1054 316-395 (396)
110 PRK14890 putative Zn-ribbon RN 30.5 32 0.0007 31.4 1.7 32 983-1026 24-56 (59)
111 KOG2186 Cell growth-regulating 30.4 26 0.00056 40.4 1.4 51 847-906 3-53 (276)
112 cd00729 rubredoxin_SM Rubredox 29.8 31 0.00067 28.0 1.4 10 985-994 3-12 (34)
113 cd05839 BR140_related The PWWP 29.0 83 0.0018 32.4 4.6 61 157-217 6-80 (111)
114 TIGR00373 conserved hypothetic 27.8 39 0.00084 36.6 2.2 42 972-1028 97-138 (158)
115 PF02892 zf-BED: BED zinc fing 27.6 35 0.00075 28.7 1.4 27 982-1008 14-44 (45)
116 PF09986 DUF2225: Uncharacteri 27.5 25 0.00055 39.9 0.8 42 1016-1057 3-59 (214)
117 COG2888 Predicted Zn-ribbon RN 26.8 41 0.00089 30.7 1.7 32 983-1026 26-58 (61)
118 smart00531 TFIIE Transcription 26.8 48 0.001 35.4 2.6 39 980-1028 95-133 (147)
119 PF13891 zf-C3Hc3H: Potential 26.5 21 0.00046 33.0 -0.1 36 587-623 3-38 (65)
120 PF13719 zinc_ribbon_5: zinc-r 25.8 30 0.00065 28.5 0.7 31 986-1028 4-35 (37)
121 TIGR02098 MJ0042_CXXC MJ0042 f 25.2 32 0.0007 28.0 0.8 34 985-1029 3-36 (38)
122 smart00531 TFIIE Transcription 25.0 58 0.0013 34.8 2.9 39 843-891 95-133 (147)
123 PF09538 FYDLN_acid: Protein o 24.9 36 0.00077 34.8 1.2 14 915-928 26-39 (108)
124 TIGR00373 conserved hypothetic 24.7 48 0.001 36.0 2.2 42 835-891 97-138 (158)
125 PRK00464 nrdR transcriptional 24.7 34 0.00074 37.0 1.1 18 1018-1035 28-45 (154)
126 PF11722 zf-TRM13_CCCH: CCCH z 23.9 43 0.00094 26.8 1.2 21 589-609 11-31 (31)
127 PF12013 DUF3505: Protein of u 23.9 60 0.0013 32.7 2.6 24 916-939 81-108 (109)
128 COG1997 RPL43A Ribosomal prote 23.2 40 0.00087 33.1 1.1 32 983-1030 34-65 (89)
129 COG1198 PriA Primosomal protei 22.9 55 0.0012 43.6 2.6 33 142-174 16-49 (730)
130 PRK06266 transcription initiat 22.8 55 0.0012 36.2 2.2 33 981-1028 114-146 (178)
131 PF13717 zinc_ribbon_4: zinc-r 22.7 38 0.00082 27.8 0.7 33 985-1028 3-35 (36)
132 PF14353 CpXC: CpXC protein 22.5 26 0.00057 36.2 -0.3 26 1016-1041 36-61 (128)
133 cd05834 HDGF_related The PWWP 22.5 1.1E+02 0.0023 29.8 3.9 52 157-218 8-60 (83)
134 TIGR02300 FYDLN_acid conserved 22.4 56 0.0012 34.3 2.0 33 985-1034 10-42 (129)
135 KOG1280 Uncharacterized conser 21.3 52 0.0011 39.6 1.7 32 979-1010 74-105 (381)
136 KOG2593 Transcription initiati 21.2 62 0.0013 40.1 2.4 41 978-1027 122-162 (436)
137 KOG2593 Transcription initiati 21.1 57 0.0012 40.4 2.1 48 765-814 113-160 (436)
138 PRK06266 transcription initiat 21.0 60 0.0013 36.0 2.1 36 842-892 112-147 (178)
No 1
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=99.96 E-value=3e-30 Score=305.85 Aligned_cols=211 Identities=32% Similarity=0.494 Sum_probs=159.3
Q ss_pred CCcCCcCcEEEecCCCCCCCCCeeEeeCCCCcccccccCCCCCcccccCCCCCCCCEEccccCCCCCCCCcccCCCCCcc
Q 000479 1224 RKPLLRGTVLCDDISSGLESVPVACVVDDGLLETLCISADSSDSQKTRCSMPWESFTYVTKPLLDQSLDLDAESLQLGCA 1303 (1470)
Q Consensus 1224 ~~~~~~~~vi~~DIS~G~E~vPV~~vnd~D~~~~~~~~g~~~~~~~~~~~~P~~~F~YIt~ni~~~~~~ld~~~~~~gC~ 1303 (1470)
...+.+...+.+||+.|.|++||+.+|++|.. .| ..|+|++..++.+. .........||.
T Consensus 52 ~~~~~~~~~~~~d~~~~~e~~~v~~~n~id~~------------------~~-~~f~y~~~~~~~~~-~~~~~~~~~~c~ 111 (364)
T KOG1082|consen 52 DKDKLEAKSELEDIALGSENLPVPLVNRIDED------------------AP-LYFQYIATEIVDPG-ELSDCENSTGCR 111 (364)
T ss_pred cccccccccccccccCccccCceeeeeeccCC------------------cc-ccceeccccccCcc-ccccCccccCCC
Confidence 34566777899999999999999999999863 12 57999999988885 222244578999
Q ss_pred cCCCCccCC---CCCccccccccccccccccCCCCCCCcccCCCCc--eeeccCcceeccCCCCCCCCCCCCccccccce
Q 000479 1304 CANSTCFPE---TCDHVYLFDNDYEDAKDIDGKSVHGRFPYDQTGR--VILEEGYLIYECNHMCSCDRTCPNRVLQNGVR 1378 (1470)
Q Consensus 1304 C~~~~C~~~---~C~C~~l~~~~~~~~~~~~g~~~~g~~~Y~~~G~--l~~~~~~~IyECn~~C~C~~~C~NRvvQ~g~~ 1378 (1470)
|.+ .|... .|.|... +.+.++|..+|. .....+.+||||+..|+|+++|.|||+|+|++
T Consensus 112 C~~-~~~~~~~~~C~C~~~---------------n~~~~~~~~~~~~~~~~~~~~~i~EC~~~C~C~~~C~nRv~q~g~~ 175 (364)
T KOG1082|consen 112 CCS-SCSSVLPLTCLCERH---------------NGGLVAYTCDGDCGTLGKFKEPVFECSVACGCHPDCANRVVQKGLQ 175 (364)
T ss_pred ccC-CCCCCCCccccChHh---------------hCCccccccCCccccccccCccccccccCCCCCCcCcchhhccccc
Confidence 986 34332 2777532 345677777763 33456779999999999999999999999999
Q ss_pred eeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccC----CCCcEEEEeCcccccc--------ccc
Q 000479 1379 VKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGR----DGCGYMLNIGAHINDM--------GRL 1446 (1470)
Q Consensus 1379 ~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~----~~~~Ylf~l~~~~~~~--------~~~ 1446 (1470)
.+|+||+|..+|||||++++||+|+|||||+||+++..|+++|...+.. .+..++++.+...... ...
T Consensus 176 ~~leIfrt~~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 255 (364)
T KOG1082|consen 176 FHLEVFRTPEKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRTHLREYLDDDCDAYSIADREWVDESPVGNTFVAPSLP 255 (364)
T ss_pred cceEEEecCCceeeecccccccCCCeeEEEeeEecChHHhhhccccccccccccccchhhhccccccccccccccccccc
Confidence 9999999999999999999999999999999999999999988543221 1122333332211000 001
Q ss_pred cCCCCcEEEeccCCCCeeecccCC
Q 000479 1447 IEGQVRYVIDATKYGNVSRFINHR 1470 (1470)
Q Consensus 1447 ~~~~~~~~IDA~~~GNvaRFINHS 1470 (1470)
......++|||+.+|||+||||||
T Consensus 256 ~~~~~~~~ida~~~GNv~RfinHS 279 (364)
T KOG1082|consen 256 GGPGRELLIDAKPHGNVARFINHS 279 (364)
T ss_pred cCCCcceEEchhhcccccccccCC
Confidence 122468999999999999999998
No 2
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=99.87 E-value=5.4e-23 Score=247.30 Aligned_cols=105 Identities=45% Similarity=0.750 Sum_probs=97.1
Q ss_pred cceeccCC-CCC-CCCCCCCccccccceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCC-
Q 000479 1353 YLIYECNH-MCS-CDRTCPNRVLQNGVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDG- 1429 (1470)
Q Consensus 1353 ~~IyECn~-~C~-C~~~C~NRvvQ~g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~- 1429 (1470)
....||++ .|. |+..|.|+.+|+....+++||+|+.+||||||..+||+|+||.||+||||+.+|+++|...|+.++
T Consensus 92 ~t~iECs~~~C~~cg~~C~NQRFQkkqyA~vevF~Te~KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~d~~ 171 (729)
T KOG4442|consen 92 MTSIECSDRECPRCGVYCKNQRFQKKQYAKVEVFLTEKKGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAKDGI 171 (729)
T ss_pred hhhcccCCccCCCccccccchhhhhhccCceeEEEecCcccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHhcCC
Confidence 34669987 999 999999999999999999999999999999999999999999999999999999999999998875
Q ss_pred -CcEEEEeCccccccccccCCCCcEEEeccCCCCeeecccCC
Q 000479 1430 -CGYMLNIGAHINDMGRLIEGQVRYVIDATKYGNVSRFINHR 1470 (1470)
Q Consensus 1430 -~~Ylf~l~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHS 1470 (1470)
+.|+|.|.. .++||||.+||.|||||||
T Consensus 172 kh~Yfm~L~~-------------~e~IDAT~KGnlaRFiNHS 200 (729)
T KOG4442|consen 172 KHYYFMALQG-------------GEYIDATKKGNLARFINHS 200 (729)
T ss_pred ceEEEEEecC-------------CceecccccCcHHHhhcCC
Confidence 457776654 6899999999999999998
No 3
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=99.86 E-value=8.7e-23 Score=244.34 Aligned_cols=255 Identities=22% Similarity=0.272 Sum_probs=185.3
Q ss_pred cccceeeeccccCCCcccCCC--CCCCCCCC-CCcccC-----------CcccccccCCCCCCc-cccccceeee-ccCc
Q 000479 1158 IQVEWHREGFLCSNGCKIFKD--PHLPPHLE-PLPSVS-----------AGIRSSDSSDFVNNQ-WEVDECHCII-DSRH 1221 (1470)
Q Consensus 1158 iqv~wh~~~~~c~~g~~~~~~--~~~~~pl~-p~~~~~-----------~~~k~v~~~~p~~~~-w~~~e~~~~l-~~~~ 1221 (1470)
+-+-.|.|...|-+.-....+ ..+-.||+ |..+.| ...-.|.|-+|||.. +.|.|+..|| +.++
T Consensus 567 lsy~sh~cs~acl~~~~~~~~~~~~g~npl~lp~~~~F~r~~a~~rs~~~~~fhv~yktpcg~~lr~~~el~ryL~et~c 646 (1262)
T KOG1141|consen 567 LSYFSHKCSIACLNAAQIAIMVGQPGGNPLNLPYFLTFHRIRASHRSAYIRDFHVEYKTPCGMPLRMRIELYRYLVETRC 646 (1262)
T ss_pred ccccchhhHHHHHhccchhhhccCCCCCccccceEEEeeehhhhhhhhhhhcceeeccCCCccchHHHHHHHHHHHHhcC
Confidence 334467777777655544433 56677888 888888 233368999999988 8888866544 3321
Q ss_pred -------c---------CCCcCCcCcEEEecCCCCCCCCCeeEeeCCCCcccccccCCCCCcccccCCCCCCCCEEcccc
Q 000479 1222 -------L---------GRKPLLRGTVLCDDISSGLESVPVACVVDDGLLETLCISADSSDSQKTRCSMPWESFTYVTKP 1285 (1470)
Q Consensus 1222 -------~---------~~~~~~~~~vi~~DIS~G~E~vPV~~vnd~D~~~~~~~~g~~~~~~~~~~~~P~~~F~YIt~n 1285 (1470)
| +..++.++++.|-||++|+|.+||.++|+.|. .|++.|.|-.+.
T Consensus 647 ~flf~~~f~~~~yV~~~r~~~p~kp~~~~~Di~~g~e~vpis~~neids-------------------~~lpq~ay~K~~ 707 (1262)
T KOG1141|consen 647 KFLFVIGFDRAFYVVRHRAPNPLKPGNRCTDIPCGREHVPISEKNEIDS-------------------HRLPQAAYKKHM 707 (1262)
T ss_pred cEEEEeecccchheeecccCCCcCCcceeccccCCccccccceeecccC-------------------cCCccchhheee
Confidence 1 23456789999999999999999999999885 244689999888
Q ss_pred CCCCCCCCc-ccCCCCCcccCCCCccCCCCCccccccccccccccccCCCCCCCcccCCCCceeeccCcceeccCCCCCC
Q 000479 1286 LLDQSLDLD-AESLQLGCACANSTCFPETCDHVYLFDNDYEDAKDIDGKSVHGRFPYDQTGRVILEEGYLIYECNHMCSC 1364 (1470)
Q Consensus 1286 i~~~~~~ld-~~~~~~gC~C~~~~C~~~~C~C~~l~~~~~~~~~~~~g~~~~g~~~Y~~~G~l~~~~~~~IyECn~~C~C 1364 (1470)
|.+...-.. ...+..+|+|.+||-+...|.|.++....-......+-. ....+.|. |++-..+..+|||+..|+|
T Consensus 708 ip~~~nl~n~~~~fl~scdc~~gcid~~kcachQltvk~~~t~p~~~v~-~t~gykyK---Rl~e~~ptg~yEc~k~ckc 783 (1262)
T KOG1141|consen 708 IPTNNNLSNRRKDFLQSCDCPTGCIDSMKCACHQLTVKKKTTGPNQNVA-STNGYKYK---RLIEIRPTGPYECLKACKC 783 (1262)
T ss_pred ccCCCcccccChhhhhcCCCCcchhhhhhhhHHHHHHHhhccCCCcccc-cCcchhhH---HHHHhcCCCHHHHHHhhcc
Confidence 877653222 256889999999877778999988743221111100111 11223442 3444456689999999999
Q ss_pred CC-CCCCccccccceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCCCcEEEEeCc
Q 000479 1365 DR-TCPNRVLQNGVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDGCGYMLNIGA 1438 (1470)
Q Consensus 1365 ~~-~C~NRvvQ~g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~~~Ylf~l~~ 1438 (1470)
.+ .|.||++|+|.+++|++|+|..+|||+|++++|.+|.|||.|.|-+++.+-++.-+ |. .+..|+..||.
T Consensus 784 ~~~~C~nrmvqhg~qvRlq~fkt~~kGWg~rclddi~~g~fVciy~g~~l~~~~sdks~--~~-~~~~~~~~id~ 855 (1262)
T KOG1141|consen 784 CGPDCLNRMVQHGYQVRLQRFKTIHKGWGRRCLDDITGGNFVCIYPGGALLHQISDKSE--YI-HVTRSLLTIDC 855 (1262)
T ss_pred CcHHHHHHHhhcCceeEeeeccccccccceEeeeecCCceEEEEecchhhhhhhchhhh--hc-ccchhhhcccc
Confidence 87 59999999999999999999999999999999999999999999999988777554 22 13457766665
No 4
>KOG2462 consensus C2H2-type Zn-finger protein [Transcription]
Probab=99.85 E-value=3.5e-22 Score=220.26 Aligned_cols=137 Identities=18% Similarity=0.228 Sum_probs=91.1
Q ss_pred CccccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccC
Q 000479 880 RGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGED 959 (1470)
Q Consensus 880 Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~C 959 (1470)
..|+|+.|||.+.+.++|.+|.++|..... .+.+.|+.|+|.|.+...|+.|++ +|+
T Consensus 129 ~r~~c~eCgk~ysT~snLsrHkQ~H~~~~s---~ka~~C~~C~K~YvSmpALkMHir-TH~------------------- 185 (279)
T KOG2462|consen 129 PRYKCPECGKSYSTSSNLSRHKQTHRSLDS---KKAFSCKYCGKVYVSMPALKMHIR-THT------------------- 185 (279)
T ss_pred Cceeccccccccccccccchhhcccccccc---cccccCCCCCceeeehHHHhhHhh-ccC-------------------
Confidence 457777777777777777777777765432 246778888888887777777764 554
Q ss_pred CCcccccCChhHHhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccc
Q 000479 960 SPKKLELGYSASVENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPR 1039 (1470)
Q Consensus 960 p~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r 1039 (1470)
-+++|.+|||.|.+..-|+-| .|+||| ||||.|+.|+|+|..+++|+.||+
T Consensus 186 -----------------------l~c~C~iCGKaFSRPWLLQGH-iRTHTG-----EKPF~C~hC~kAFADRSNLRAHmQ 236 (279)
T KOG2462|consen 186 -----------------------LPCECGICGKAFSRPWLLQGH-IRTHTG-----EKPFSCPHCGKAFADRSNLRAHMQ 236 (279)
T ss_pred -----------------------CCcccccccccccchHHhhcc-cccccC-----CCCccCCcccchhcchHHHHHHHH
Confidence 345566666666666666666 666666 666666666666666666666666
Q ss_pred cccCCCccccCcCCCcCCChHHHHhhcCC
Q 000479 1040 FKKGLGAVSYRIRNRGAAGMKKRIQTLKP 1068 (1470)
Q Consensus 1040 ~Htgekpy~C~~C~ksF~~~~~L~kHkks 1068 (1470)
+|.+.|+|+|..|+|+|+.++-|.+|..+
T Consensus 237 THS~~K~~qC~~C~KsFsl~SyLnKH~ES 265 (279)
T KOG2462|consen 237 THSDVKKHQCPRCGKSFALKSYLNKHSES 265 (279)
T ss_pred hhcCCccccCcchhhHHHHHHHHHHhhhh
Confidence 66666666666666666666666666553
No 5
>KOG2462 consensus C2H2-type Zn-finger protein [Transcription]
Probab=99.83 E-value=1.5e-21 Score=215.37 Aligned_cols=111 Identities=23% Similarity=0.309 Sum_probs=98.3
Q ss_pred cccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhhcCCccccccCccCCcc
Q 000479 915 LQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSENLGSIRKFICRFCGLKF 994 (1470)
Q Consensus 915 pfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF 994 (1470)
.|+|+.|||.+.+.++|.+|.+ .|..- - ..+.+.|++|||.|
T Consensus 130 r~~c~eCgk~ysT~snLsrHkQ-~H~~~----------------~---------------------s~ka~~C~~C~K~Y 171 (279)
T KOG2462|consen 130 RYKCPECGKSYSTSSNLSRHKQ-THRSL----------------D---------------------SKKAFSCKYCGKVY 171 (279)
T ss_pred ceeccccccccccccccchhhc-ccccc----------------c---------------------ccccccCCCCCcee
Confidence 5788888888888888888843 55311 1 13679999999999
Q ss_pred CChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccCCCccccCcCCCcCCChHHHHhhcCCCCC
Q 000479 995 DLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRNRGAAGMKKRIQTLKPLAS 1071 (1470)
Q Consensus 995 ~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~ksF~~~~~L~kHkksh~~ 1071 (1470)
.+...|+.| .++|+- +++|.+|||.|.+..-|+-|+|+|+|||||.|..|+|+|+..+.|+.|+.+|..
T Consensus 172 vSmpALkMH-irTH~l-------~c~C~iCGKaFSRPWLLQGHiRTHTGEKPF~C~hC~kAFADRSNLRAHmQTHS~ 240 (279)
T KOG2462|consen 172 VSMPALKMH-IRTHTL-------PCECGICGKAFSRPWLLQGHIRTHTGEKPFSCPHCGKAFADRSNLRAHMQTHSD 240 (279)
T ss_pred eehHHHhhH-hhccCC-------CcccccccccccchHHhhcccccccCCCCccCCcccchhcchHHHHHHHHhhcC
Confidence 999999999 999986 899999999999999999999999999999999999999999999999999873
No 6
>KOG1074 consensus Transcriptional repressor SALM [Transcription]
Probab=99.83 E-value=3.3e-21 Score=234.12 Aligned_cols=215 Identities=15% Similarity=0.160 Sum_probs=163.5
Q ss_pred ccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchhhccccccc---CCCC
Q 000479 847 THKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCI---PCGS 923 (1470)
Q Consensus 847 pfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~---~CgK 923 (1470)
|-+|-+|-++...++.|+.| .++|++| |||+|.+||+.|.++.+|+.||-.|....... -.|.|+ +|-+
T Consensus 605 PNqCiiC~rVlSC~saLqmH-yrtHtGE-----RPFkCKiCgRAFtTkGNLkaH~~vHka~p~~R--~q~ScP~~~ic~~ 676 (958)
T KOG1074|consen 605 PNQCIICLRVLSCPSALQMH-YRTHTGE-----RPFKCKICGRAFTTKGNLKAHMSVHKAKPPAR--VQFSCPSTFICQK 676 (958)
T ss_pred ccceeeeeecccchhhhhhh-hhcccCc-----CccccccccchhccccchhhcccccccCcccc--ccccCCchhhhcc
Confidence 57899999999999999999 8999999 99999999999999999999999998765322 368999 9999
Q ss_pred CCCChhhhhhhhhhccccc-ccchhhhhccccccccCCCcccccCChhHHhhhhh----------------hcCCcc---
Q 000479 924 HFGNTEELWLHVQSVHAID-FKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSE----------------NLGSIR--- 983 (1470)
Q Consensus 924 ~F~sks~L~~H~k~~Hsge-f~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r----------------~HtgeK--- 983 (1470)
.|.+.-.|..|++ +|.+. .............-+.|+.+.+.|.....+..|+- .++.+.
T Consensus 677 kftn~V~lpQhIr-iH~~~~~s~g~~a~e~~~~adq~~~~qk~~~~a~~f~~~~se~~~~~s~~~~~~~~~t~t~~~~~t 755 (958)
T KOG1074|consen 677 KFTNAVTLPQHIR-IHLGGQISNGGTAAEGILAADQCSSCQKTFSDARSFSQQISEQPSPESEPDEQMDERTETEELDVT 755 (958)
T ss_pred cccccccccceEE-eecCCCCCCCcccccccchhcccchhhhcccccccchhhhhccCCcccCCcccccccccccccccC
Confidence 9999999999985 78743 10000001222223344444777765444444443 334444
Q ss_pred -ccccCccCCccCChhhHhHhhhhhc-----------------------cCCCCC-------------------------
Q 000479 984 -KFICRFCGLKFDLLPDLGRHHQAAH-----------------------MGPNLV------------------------- 1014 (1470)
Q Consensus 984 -pykC~~CGKsF~sks~LkrH~~rvH-----------------------tge~~~------------------------- 1014 (1470)
+..|..|+..+.....+..+ -..+ ++++..
T Consensus 756 p~~~e~~~~~~~~~e~~i~~~-g~te~asa~~~~vg~~s~~~~~~~~~~T~~k~~~~~~~~~~~~~~~v~~~pvl~~~~~ 834 (958)
T KOG1074|consen 756 PPPPENSCGRELEGEMAISVR-GSTEEASANLDEVGTVSAAGEAGEEDDTSEKPTQASSFPGEILAPSVNMDPVLWNQET 834 (958)
T ss_pred CCccccccccccCcccccccc-cchhhhhcChhhhcCccccchhhhhcccCCCCcccccCCCcCCccccccCchhhcccc
Confidence 67889999998877766665 2222 111000
Q ss_pred -----------------------------------------CCCCcccCCCCccCCChhhhhhccccccCCCccccCcCC
Q 000479 1015 -----------------------------------------NSRPHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRN 1053 (1470)
Q Consensus 1015 -----------------------------------------~eKpykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~ 1053 (1470)
......|.+|++.|...+.|..|||+|+++|||.|.+|+
T Consensus 835 ~~l~eg~~t~~n~~t~~~~~~sv~qs~~~p~l~p~l~~~~pvnn~h~C~vCgk~FsSSsALqiH~rTHtg~KPF~C~fC~ 914 (958)
T KOG1074|consen 835 SMLNEGLATKTNEITPEGPADSVIQSGGVPTLEPSLGRPGPVNNAHVCNVCGKQFSSSAALEIHMRTHTGPKPFFCHFCE 914 (958)
T ss_pred cccccccccccccccCCCcchhhhhhccccccCCCCCCCCcccchhhhccchhcccchHHHHHhhhcCCCCCCccchhhh
Confidence 023478999999999999999999999999999999999
Q ss_pred CcCCChHHHHhhcCCCCC
Q 000479 1054 RGAAGMKKRIQTLKPLAS 1071 (1470)
Q Consensus 1054 ksF~~~~~L~kHkksh~~ 1071 (1470)
++|.....|..|+.+|..
T Consensus 915 ~aFttrgnLKvHMgtH~w 932 (958)
T KOG1074|consen 915 EAFTTRGNLKVHMGTHMW 932 (958)
T ss_pred hhhhhhhhhhhhhccccc
Confidence 999999999999998873
No 7
>KOG3608 consensus Zn finger proteins [General function prediction only]
Probab=99.82 E-value=3.7e-21 Score=215.04 Aligned_cols=250 Identities=16% Similarity=0.200 Sum_probs=199.7
Q ss_pred hccchhhhhhhhCHHHHHhcccccccccccccccCccccccCChHhhhhhhhc-CCc-ccCCC--CC--CCCCccccccc
Q 000479 752 LHLHLACELFYKLLKSILSLRNPVPMEIQFQWALSEASKDAGIGEFLMKLVCC-EKE-RLSKT--WG--FDANENAHVSS 825 (1470)
Q Consensus 752 ~~Lc~~C~k~F~~~~sL~sH~rsH~~ek~~~~kC~eC~K~F~s~~~L~k~iHt-ek~-y~C~~--Cg--F~~~s~~~~~s 825 (1470)
.|+...|+..|.+...+..|.-.|..- | +.+..|. -. ++| +.|.. |- |..+..
T Consensus 136 ~C~WedCe~~F~s~~ef~dHV~~H~l~------c-eyd~~~~---------~~D~~pv~~C~W~~Ct~~~~~k~~----- 194 (467)
T KOG3608|consen 136 RCGWEDCEREFVSIVEFQDHVVKHALF------C-EYDIQKT---------PEDERPVTMCNWAMCTKHMGNKYR----- 194 (467)
T ss_pred ccChhhcCCcccCHHHHHHHHHHhhhh------h-hhhhhhC---------CCCCCceeeccchhhhhhhccHHH-----
Confidence 455678899999888888887666532 1 1111111 11 222 33432 22 444444
Q ss_pred ccccccchhhHHHhhccCCCCccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhc
Q 000479 826 SVVEDSAVLPLAIAGRSEDEKTHKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERH 905 (1470)
Q Consensus 826 ~~~e~s~~~L~~H~r~H~gekpfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh 905 (1470)
|.+|++.|+++|...|+.|+.-|.++..|-.|+++.-.-. ..+|.|..|.|.|.+...|..|+..|-
T Consensus 195 ---------LreH~r~Hs~eKvvACp~Cg~~F~~~tkl~DH~rRqt~l~----~n~fqC~~C~KrFaTeklL~~Hv~rHv 261 (467)
T KOG3608|consen 195 ---------LREHIRTHSNEKVVACPHCGELFRTKTKLFDHLRRQTELN----TNSFQCAQCFKRFATEKLLKSHVVRHV 261 (467)
T ss_pred ---------HHHHHHhcCCCeEEecchHHHHhccccHHHHHHHhhhhhc----CCchHHHHHHHHHhHHHHHHHHHHHhh
Confidence 9999999999999999999999999999999965543222 268999999999999999999999887
Q ss_pred cccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhhcCCcccc
Q 000479 906 HVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSENLGSIRKF 985 (1470)
Q Consensus 906 ~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpy 985 (1470)
. -|+|+.|+.+....++|..||+..|... ++|+|+.| ...|.+.+.|.+|..+|. +-.|
T Consensus 262 n--------~ykCplCdmtc~~~ssL~~H~r~rHs~d---------kpfKCd~C---d~~c~~esdL~kH~~~HS-~~~y 320 (467)
T KOG3608|consen 262 N--------CYKCPLCDMTCSSASSLTTHIRYRHSKD---------KPFKCDEC---DTRCVRESDLAKHVQVHS-KTVY 320 (467)
T ss_pred h--------cccccccccCCCChHHHHHHHHhhhccC---------CCccccch---hhhhccHHHHHHHHHhcc-ccce
Confidence 5 6899999999999999999999989876 99999999 999999999999999998 6789
Q ss_pred ccCc--cCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccCC------CccccCcCCCcCC
Q 000479 986 ICRF--CGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKGL------GAVSYRIRNRGAA 1057 (1470)
Q Consensus 986 kC~~--CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htge------kpy~C~~C~ksF~ 1057 (1470)
.|+. |..+|++...|++|.+.+|.|.+ +-+|.|..|++.|++-.+|..|++..++. +.|..+.|..+|.
T Consensus 321 ~C~h~~C~~s~r~~~q~~~H~~evhEg~n---p~~Y~CH~Cdr~ft~G~~L~~HL~kkH~f~~PsGh~RFtYk~~edG~m 397 (467)
T KOG3608|consen 321 QCEHPDCHYSVRTYTQMRRHFLEVHEGNN---PILYACHCCDRFFTSGKSLSAHLMKKHGFRLPSGHKRFTYKVDEDGFM 397 (467)
T ss_pred ecCCCCCcHHHHHHHHHHHHHHHhccCCC---CCceeeecchhhhccchhHHHHHHHhhcccCCCCCCceeeeeccCcee
Confidence 9988 99999999999999888887743 56899999999999999999998555444 5677777877775
Q ss_pred Ch
Q 000479 1058 GM 1059 (1470)
Q Consensus 1058 ~~ 1059 (1470)
++
T Consensus 398 RL 399 (467)
T KOG3608|consen 398 RL 399 (467)
T ss_pred ee
Confidence 43
No 8
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=99.75 E-value=6.4e-19 Score=211.07 Aligned_cols=126 Identities=29% Similarity=0.658 Sum_probs=110.9
Q ss_pred cCCCCCcccCCCCccCCCCCccccccccccccccccCCCCCCCcccCCCCceeeccCcceecc-CCCCCC----------
Q 000479 1296 ESLQLGCACANSTCFPETCDHVYLFDNDYEDAKDIDGKSVHGRFPYDQTGRVILEEGYLIYEC-NHMCSC---------- 1364 (1470)
Q Consensus 1296 ~~~~~gC~C~~~~C~~~~C~C~~l~~~~~~~~~~~~g~~~~g~~~Y~~~G~l~~~~~~~IyEC-n~~C~C---------- 1364 (1470)
.+.+.||.| .+.|....|+|+.. ..|| |+.|.+
T Consensus 536 ~nrF~GC~C-k~QC~tkqCpC~~A-----------------------------------~rECdPd~Cl~cg~~~~~d~~ 579 (739)
T KOG1079|consen 536 RNRFPGCRC-KAQCNTKQCPCYLA-----------------------------------VRECDPDVCLMCGNVDHFDSS 579 (739)
T ss_pred HhcCCCCCc-ccccccCcCchhhh-----------------------------------ccccCchHHhccCcccccccC
Confidence 456789999 46888888988532 3366 356644
Q ss_pred CCCCCCccccccceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCCCcEEEEeCccccccc
Q 000479 1365 DRTCPNRVLQNGVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDGCGYMLNIGAHINDMG 1444 (1470)
Q Consensus 1365 ~~~C~NRvvQ~g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~~~Ylf~l~~~~~~~~ 1444 (1470)
.-+|.|--+|++.+.++.|..+..-|||+|.++.+.+++||.||+||+|+.+||++|++.|+..+.+|||+|..
T Consensus 580 ~~~C~N~~l~~~~qkr~llapSdVaGwGlFlKe~v~KnefisEY~GE~IS~dEADrRGkiYDr~~cSflFnln~------ 653 (739)
T KOG1079|consen 580 KISCKNTNLQRGEQKRVLLAPSDVAGWGLFLKESVSKNEFISEYTGEIISHDEADRRGKIYDRYMCSFLFNLNN------ 653 (739)
T ss_pred ccccccchhhhhhhcceeechhhccccceeeccccCCCceeeeecceeccchhhhhcccccccccceeeeeccc------
Confidence 12799999999999999999999999999999999999999999999999999999999999999999999976
Q ss_pred cccCCCCcEEEeccCCCCeeecccCC
Q 000479 1445 RLIEGQVRYVIDATKYGNVSRFINHR 1470 (1470)
Q Consensus 1445 ~~~~~~~~~~IDA~~~GNvaRFINHS 1470 (1470)
.|+|||+++||.+||+|||
T Consensus 654 -------dyviDs~rkGnk~rFANHS 672 (739)
T KOG1079|consen 654 -------DYVIDSTRKGNKIRFANHS 672 (739)
T ss_pred -------cceEeeeeecchhhhccCC
Confidence 5999999999999999998
No 9
>KOG3608 consensus Zn finger proteins [General function prediction only]
Probab=99.74 E-value=3.6e-19 Score=199.26 Aligned_cols=188 Identities=19% Similarity=0.218 Sum_probs=168.3
Q ss_pred cccC--cCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCC
Q 000479 848 HKCK--ICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHF 925 (1470)
Q Consensus 848 fkC~--~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F 925 (1470)
+.|. .|-+.|.+++.|+.| .+.|+++ |...|+.||..|.++..|..|++..+.... .+|+|..|.|.|
T Consensus 178 ~~C~W~~Ct~~~~~k~~LreH-~r~Hs~e-----KvvACp~Cg~~F~~~tkl~DH~rRqt~l~~----n~fqC~~C~KrF 247 (467)
T KOG3608|consen 178 TMCNWAMCTKHMGNKYRLREH-IRTHSNE-----KVVACPHCGELFRTKTKLFDHLRRQTELNT----NSFQCAQCFKRF 247 (467)
T ss_pred eeccchhhhhhhccHHHHHHH-HHhcCCC-----eEEecchHHHHhccccHHHHHHHhhhhhcC----CchHHHHHHHHH
Confidence 5664 699999999999999 8999999 999999999999999999999988775442 389999999999
Q ss_pred CChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhh-cCCccccccCccCCccCChhhHhHhh
Q 000479 926 GNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSEN-LGSIRKFICRFCGLKFDLLPDLGRHH 1004 (1470)
Q Consensus 926 ~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~-HtgeKpykC~~CGKsF~sks~LkrH~ 1004 (1470)
.+...|..|+. .|. ..|+|..| ...++..++|..|++. |...|||+|+.|++.|.+.+.|.+|
T Consensus 248 aTeklL~~Hv~-rHv-----------n~ykCplC---dmtc~~~ssL~~H~r~rHs~dkpfKCd~Cd~~c~~esdL~kH- 311 (467)
T KOG3608|consen 248 ATEKLLKSHVV-RHV-----------NCYKCPLC---DMTCSSASSLTTHIRYRHSKDKPFKCDECDTRCVRESDLAKH- 311 (467)
T ss_pred hHHHHHHHHHH-Hhh-----------hccccccc---ccCCCChHHHHHHHHhhhccCCCccccchhhhhccHHHHHHH-
Confidence 99999999986 465 36888888 9999999999999975 8889999999999999999999999
Q ss_pred hhhccCCCCCCCCCcccCC--CCccCCChhhhhhcccccc-CC--CccccCcCCCcCCChHHHHhhcC
Q 000479 1005 QAAHMGPNLVNSRPHKKGI--RFYAYKLKSGRLSRPRFKK-GL--GAVSYRIRNRGAAGMKKRIQTLK 1067 (1470)
Q Consensus 1005 ~rvHtge~~~~eKpykC~i--CgKsF~~ks~L~~H~r~Ht-ge--kpy~C~~C~ksF~~~~~L~kHkk 1067 (1470)
..+|.. -.|+|.. |.++|+....|++|++.|+ |. -+|.|..|.+.|.+-.+|..|.+
T Consensus 312 ~~~HS~------~~y~C~h~~C~~s~r~~~q~~~H~~evhEg~np~~Y~CH~Cdr~ft~G~~L~~HL~ 373 (467)
T KOG3608|consen 312 VQVHSK------TVYQCEHPDCHYSVRTYTQMRRHFLEVHEGNNPILYACHCCDRFFTSGKSLSAHLM 373 (467)
T ss_pred HHhccc------cceecCCCCCcHHHHHHHHHHHHHHHhccCCCCCceeeecchhhhccchhHHHHHH
Confidence 679985 5899999 9999999999999996555 55 67999999999999999998865
No 10
>KOG3623 consensus Homeobox transcription factor SIP1 [Transcription]
Probab=99.74 E-value=7.4e-19 Score=209.88 Aligned_cols=81 Identities=21% Similarity=0.239 Sum_probs=76.9
Q ss_pred cccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccCCCccccCcCCCcCCChHHH
Q 000479 983 RKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRNRGAAGMKKR 1062 (1470)
Q Consensus 983 KpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~ksF~~~~~L 1062 (1470)
.+|.|+.|+|.|...+.|.|| +--|+| .|||+|.+|.|+|+++.+|..|+|.|.|+|||+|+.|+|.|+...+.
T Consensus 893 gmyaCDqCDK~FqKqSSLaRH-KYEHsG-----qRPyqC~iCkKAFKHKHHLtEHkRLHSGEKPfQCdKClKRFSHSGSY 966 (1007)
T KOG3623|consen 893 GMYACDQCDKAFQKQSSLARH-KYEHSG-----QRPYQCIICKKAFKHKHHLTEHKRLHSGEKPFQCDKCLKRFSHSGSY 966 (1007)
T ss_pred ccchHHHHHHHHHhhHHHHHh-hhhhcC-----CCCcccchhhHhhhhhhhhhhhhhhccCCCcchhhhhhhhcccccch
Confidence 469999999999999999999 999999 99999999999999999999999999999999999999999988888
Q ss_pred HhhcCCCC
Q 000479 1063 IQTLKPLA 1070 (1470)
Q Consensus 1063 ~kHkksh~ 1070 (1470)
.+|+. |.
T Consensus 967 SQHMN-HR 973 (1007)
T KOG3623|consen 967 SQHMN-HR 973 (1007)
T ss_pred Hhhhc-cc
Confidence 88877 65
No 11
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. 
The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=99.71 E-value=9e-18 Score=165.33 Aligned_cols=103 Identities=31% Similarity=0.621 Sum_probs=70.5
Q ss_pred cCCCCCCCCCeeEeeCCCCcccccccCCCCCcccccCCCCCCCCEEccccCCCCCCCCcccCCCCCcccCCCCccCCCCC
Q 000479 1236 DISSGLESVPVACVVDDGLLETLCISADSSDSQKTRCSMPWESFTYVTKPLLDQSLDLDAESLQLGCACANSTCFPETCD 1315 (1470)
Q Consensus 1236 DIS~G~E~vPV~~vnd~D~~~~~~~~g~~~~~~~~~~~~P~~~F~YIt~ni~~~~~~ld~~~~~~gC~C~~~~C~~~~C~ 1315 (1470)
|||.|+|++||+++|++|+. .||+.|+||+++++.+++......+..||+|.++|-.+.+|.
T Consensus 1 Dis~g~e~~pI~~~N~vd~~------------------~~p~~F~Yi~~~~~~~~~~~~~~~~~~~C~C~~~C~~~~~C~ 62 (103)
T PF05033_consen 1 DISRGKENVPIPVVNDVDDE------------------PPPPNFEYIPENIYGEGVPDIDPEFLQGCDCSGDCSNPSNCE 62 (103)
T ss_dssp -TTCTSSSS-EEEEESSSS--------------------SSTSSEE-SS-EESTTSS-TBGGGTS----SSSSTCTTTSH
T ss_pred CCCCCccCCCEEEEeCCCCC------------------CCCCCeEEeeeEEcCCCccccccccCccCccCCCCCCCCCCc
Confidence 89999999999999999964 344799999999999987533456788999986533778999
Q ss_pred ccccccccccccccccCCCCCCCcccCCCCceeeccCcceeccCCCCCCCCCCCCc
Q 000479 1316 HVYLFDNDYEDAKDIDGKSVHGRFPYDQTGRVILEEGYLIYECNHMCSCDRTCPNR 1371 (1470)
Q Consensus 1316 C~~l~~~~~~~~~~~~g~~~~g~~~Y~~~G~l~~~~~~~IyECn~~C~C~~~C~NR 1371 (1470)
|++++ ++.++|+.+|++......+|||||+.|+|+.+|+||
T Consensus 63 C~~~~---------------~~~~~Y~~~g~l~~~~~~~i~EC~~~C~C~~~C~NR 103 (103)
T PF05033_consen 63 CLQRN---------------GGIFAYDSNGRLRIPDKPPIFECNDNCGCSPSCRNR 103 (103)
T ss_dssp HHCCT---------------SSS-SB-TTSSBSSSSTSEEE---TTSSS-TTSTT-
T ss_pred Ccccc---------------CccccccCCCcCccCCCCeEEeCCCCCCCCCCCCCC
Confidence 97642 234689999998767889999999999999999998
No 12
>KOG1074 consensus Transcriptional repressor SALM [Transcription]
Probab=99.71 E-value=6.1e-18 Score=206.03 Aligned_cols=113 Identities=14% Similarity=0.082 Sum_probs=92.7
Q ss_pred ccccccCCCcccccCChhHHhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccC---CCCccCC
Q 000479 953 NQSVGEDSPKKLELGYSASVENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKG---IRFYAYK 1029 (1470)
Q Consensus 953 ~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~---iCgKsF~ 1029 (1470)
+-+|-+| .+...-.+.|+.|.|+|+|||||+|.+||+.|.++.+|+.| +.+|...... .-+|.|+ +|-+.|.
T Consensus 605 PNqCiiC---~rVlSC~saLqmHyrtHtGERPFkCKiCgRAFtTkGNLkaH-~~vHka~p~~-R~q~ScP~~~ic~~kft 679 (958)
T KOG1074|consen 605 PNQCIIC---LRVLSCPSALQMHYRTHTGERPFKCKICGRAFTTKGNLKAH-MSVHKAKPPA-RVQFSCPSTFICQKKFT 679 (958)
T ss_pred ccceeee---eecccchhhhhhhhhcccCcCccccccccchhccccchhhc-ccccccCccc-cccccCCchhhhccccc
Confidence 4567777 77777788999999999999999999999999999999999 8899875444 3678999 9999999
Q ss_pred ChhhhhhccccccCC-C------------ccccCcCCCcCCChHHHHhhcCCCC
Q 000479 1030 LKSGRLSRPRFKKGL-G------------AVSYRIRNRGAAGMKKRIQTLKPLA 1070 (1470)
Q Consensus 1030 ~ks~L~~H~r~Htge-k------------py~C~~C~ksF~~~~~L~kHkksh~ 1070 (1470)
....|..|+++|.+. . .-+|..|.+.|.....+-.+.--|.
T Consensus 680 n~V~lpQhIriH~~~~~s~g~~a~e~~~~adq~~~~qk~~~~a~~f~~~~se~~ 733 (958)
T KOG1074|consen 680 NAVTLPQHIRIHLGGQISNGGTAAEGILAADQCSSCQKTFSDARSFSQQISEQP 733 (958)
T ss_pred ccccccceEEeecCCCCCCCcccccccchhcccchhhhcccccccchhhhhccC
Confidence 999999999999843 1 2458999999987777777766444
No 13
>smart00468 PreSET N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.
Probab=99.65 E-value=2.5e-16 Score=154.06 Aligned_cols=96 Identities=34% Similarity=0.652 Sum_probs=78.7
Q ss_pred EecCCCCCCCCCeeEeeCCCCcccccccCCCCCcccccCCCCCCCCEEccccCCCCCCCC-cccCCCCCcccCCCCccCC
Q 000479 1234 CDDISSGLESVPVACVVDDGLLETLCISADSSDSQKTRCSMPWESFTYVTKPLLDQSLDL-DAESLQLGCACANSTCFPE 1312 (1470)
Q Consensus 1234 ~~DIS~G~E~vPV~~vnd~D~~~~~~~~g~~~~~~~~~~~~P~~~F~YIt~ni~~~~~~l-d~~~~~~gC~C~~~~C~~~ 1312 (1470)
+.|||+|+|++||++||++|++ .||++|+||++++++.++.+ ....+..||+|.+ .|.+.
T Consensus 1 ~~Dis~G~E~~pI~~vN~vD~~------------------~~p~~F~Yi~~~~~~~gv~~~~~~~~~~gC~C~~-~C~~~ 61 (98)
T smart00468 1 CLDISNGKENVPVPLVNEVDED------------------PPPPDFEYISEYIYGQGVPIDRSPSPLVGCSCSG-DCSSS 61 (98)
T ss_pred CccccCCccCCCcceEecCCCC------------------CCCCCcEECcceEcCCCcccccCCCCCCCCcCCC-CCCCC
Confidence 3699999999999999999974 23479999999999998753 3467889999997 57666
Q ss_pred C-CCccccccccccccccccCCCCCCCcccCCCCceeeccCcceeccCCCCC
Q 000479 1313 T-CDHVYLFDNDYEDAKDIDGKSVHGRFPYDQTGRVILEEGYLIYECNHMCS 1363 (1470)
Q Consensus 1313 ~-C~C~~l~~~~~~~~~~~~g~~~~g~~~Y~~~G~l~~~~~~~IyECn~~C~ 1363 (1470)
. |.|+++ .++.|+|+..+++++..+.+|||||+.|+
T Consensus 62 ~~C~C~~~---------------~~~~~~Y~~~~~~~~~~~~~IyECn~~C~ 98 (98)
T smart00468 62 NKCECARK---------------NGGEFAYELNGGLRLKRKPLIYECNSRCS 98 (98)
T ss_pred CcCCcHhh---------------cCCccCcccCCCEEeCCCCEEEcCCCCCC
Confidence 5 999753 24679997777777888999999999985
No 14
>KOG3576 consensus Ovo and related transcription factors [Transcription]
Probab=99.38 E-value=5.6e-14 Score=148.39 Aligned_cols=119 Identities=14% Similarity=0.167 Sum_probs=99.2
Q ss_pred cccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhhcCCccccccCccCCcc
Q 000479 915 LQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSENLGSIRKFICRFCGLKF 994 (1470)
Q Consensus 915 pfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF 994 (1470)
.|.|.+|+|.|.-...|.+|++ .|... +.+.|..| ++.|.....|++|+|+|+|.+||+|..|+|+|
T Consensus 117 ~ftCrvCgK~F~lQRmlnrh~k-ch~~v---------kr~lct~c---gkgfndtfdlkrh~rthtgvrpykc~~c~kaf 183 (267)
T KOG3576|consen 117 SFTCRVCGKKFGLQRMLNRHLK-CHSDV---------KRHLCTFC---GKGFNDTFDLKRHTRTHTGVRPYKCSLCEKAF 183 (267)
T ss_pred eeeeehhhhhhhHHHHHHHHhh-hccHH---------HHHHHhhc---cCcccchhhhhhhhccccCccccchhhhhHHH
Confidence 6888888888888888888875 67755 77888888 88899889999999999999999999999999
Q ss_pred CChhhHhHhhhhhccCCCCC-----CCCCcccCCCCccCCChhhhhhccccccCCCc
Q 000479 995 DLLPDLGRHHQAAHMGPNLV-----NSRPHKKGIRFYAYKLKSGRLSRPRFKKGLGA 1046 (1470)
Q Consensus 995 ~sks~LkrH~~rvHtge~~~-----~eKpykC~iCgKsF~~ks~L~~H~r~Htgekp 1046 (1470)
.++-.|..|.+++|.-...| ..+.|.|..||++-.....+..|++.|+...|
T Consensus 184 tqrcsleshl~kvhgv~~~yaykerr~kl~vcedcg~t~~~~e~~~~h~~~~hp~Sp 240 (267)
T KOG3576|consen 184 TQRCSLESHLKKVHGVQHQYAYKERRAKLYVCEDCGYTSERPEVYYLHLKLHHPFSP 240 (267)
T ss_pred HhhccHHHHHHHHcCchHHHHHHHhhhheeeecccCCCCCChhHHHHHHHhcCCCCH
Confidence 99999999988888754322 25678898898888888888888888775544
No 15
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=99.38 E-value=5.5e-13 Score=161.48 Aligned_cols=172 Identities=24% Similarity=0.409 Sum_probs=111.4
Q ss_pred cCcEEEecCCCCCCCCCeeEeeCCCCcccccccCCCCCcccccCCCCCCCCEEccccCCCCCCCCcccCCCCCcccCCCC
Q 000479 1229 RGTVLCDDISSGLESVPVACVVDDGLLETLCISADSSDSQKTRCSMPWESFTYVTKPLLDQSLDLDAESLQLGCACANST 1308 (1470)
Q Consensus 1229 ~~~vi~~DIS~G~E~vPV~~vnd~D~~~~~~~~g~~~~~~~~~~~~P~~~F~YIt~ni~~~~~~ld~~~~~~gC~C~~~~ 1308 (1470)
..++-.+|.+.|.+.+|||.||.+|....+.-++ +. -.|.|..... +.-....+..||+|.+.+
T Consensus 872 ~~g~d~~d~~~g~sg~~~p~~~~~d~~~~~~c~d----~~--------~~~~~~~~~~----~s~~~~~~~~~~s~d~hp 935 (1262)
T KOG1141|consen 872 DKGLDVADFSLGTSGIPIPLVNSVDNDEPPSCED----SK--------RRFQYNDQVD----ISSVSRDFCSGCSCDGHP 935 (1262)
T ss_pred ccccchhhhhccccCCCCccccccccCCCccccc----cc--------eeecccccch----hhhhccccccccccCCCC
Confidence 3445567999999999999999888643222111 00 1233333211 111235678899998755
Q ss_pred ccCCCCCcccccccccccc---ccccCCCCCCCcccCCCCceeeccCcceeccCCCCCCCCCCCCccccccceeeEE---
Q 000479 1309 CFPETCDHVYLFDNDYEDA---KDIDGKSVHGRFPYDQTGRVILEEGYLIYECNHMCSCDRTCPNRVLQNGVRVKLE--- 1382 (1470)
Q Consensus 1309 C~~~~C~C~~l~~~~~~~~---~~~~g~~~~g~~~Y~~~G~l~~~~~~~IyECn~~C~C~~~C~NRvvQ~g~~~~Le--- 1382 (1470)
-+-+.|.|.++.-...... ...+|...--.-.|+.+.. .....|||++.|.|...|.||++|.+.+++.+
T Consensus 936 ~d~~~~~~~~~~~~~~~~cpp~~s~d~~~~~~eS~~~~ns~----~~~~f~e~~~hss~~~~e~~~~v~~~~~~~me~~s 1011 (1262)
T KOG1141|consen 936 SDASKCECQQLSIEAMKRCPPNLSFDGHDELYESSEKQNSF----LKLFFFECNDHSSCHRKEYNRVVQNNIKYPMEVSS 1011 (1262)
T ss_pred cccCcccCCCCChhhhcCCCCccccCchhhhhhhhhhcchh----hhccceeccccchhcccccchhhhcCCccceeeee
Confidence 4557788876532222111 0111111111111222211 12357899999999999999999998877655
Q ss_pred -----EEeccCccccceecccccCCcEEEEeeccccCHHHHHH
Q 000479 1383 -----VFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNK 1420 (1470)
Q Consensus 1383 -----VfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~ 1420 (1470)
||+++..|||++...+||.-+|||+|+|...++.-+.+
T Consensus 1012 ~~~l~i~~~~~~~~~~~edtD~~~~~~~~~~~~~ppt~~l~~~ 1054 (1262)
T KOG1141|consen 1012 FNDLQIFKTAQSGWGVREDTDIPQSTFICTYVGAPPTDDLADE 1054 (1262)
T ss_pred cccccccccccccccccccccCCCCcccccccCCCCchhhHHH
Confidence 66777899999999999999999999999999887764
No 16
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=99.37 E-value=2.8e-13 Score=174.34 Aligned_cols=79 Identities=37% Similarity=0.605 Sum_probs=73.1
Q ss_pred eeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCC--CcEEEEeCccccccccccCCCCcEEEe
Q 000479 1379 VKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDG--CGYMLNIGAHINDMGRLIEGQVRYVID 1456 (1470)
Q Consensus 1379 ~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~--~~Ylf~l~~~~~~~~~~~~~~~~~~ID 1456 (1470)
..|...++..+||||||.++|.+|+||+||+||+|...-|+.|+..|...| ++|||.+|. .+|||
T Consensus 866 k~~~F~~s~iH~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~~Y~~~gi~~sYlfrid~-------------~~ViD 932 (1005)
T KOG1080|consen 866 KYVKFGRSGIHGWGLFAMENIAAGDMVIEYRGELVRSSIADLREARYERMGIGDSYLFRIDD-------------EVVVD 932 (1005)
T ss_pred hhhccccccccccceeeccCccccceEEEeeceehhhhHHHHHHHHHhccCcccceeeeccc-------------ceEEe
Confidence 347778899999999999999999999999999999999999999998764 789999986 59999
Q ss_pred ccCCCCeeecccCC
Q 000479 1457 ATKYGNVSRFINHR 1470 (1470)
Q Consensus 1457 A~~~GNvaRFINHS 1470 (1470)
||+.||+|||||||
T Consensus 933 Atk~gniAr~InHs 946 (1005)
T KOG1080|consen 933 ATKKGNIARFINHS 946 (1005)
T ss_pred ccccCchhheeecc
Confidence 99999999999998
No 17
>KOG3576 consensus Ovo and related transcription factors [Transcription]
Probab=99.30 E-value=6e-13 Score=140.73 Aligned_cols=88 Identities=23% Similarity=0.500 Sum_probs=82.7
Q ss_pred CCCCccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchhhcccccccCCC
Q 000479 843 EDEKTHKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCG 922 (1470)
Q Consensus 843 ~gekpfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~Cg 922 (1470)
.+...|.|.+|+|.|.-..-|.+| ++.|... +.|-|..|||.|...-.|++|+++|+|.+ ||+|..|+
T Consensus 113 sd~d~ftCrvCgK~F~lQRmlnrh-~kch~~v-----kr~lct~cgkgfndtfdlkrh~rthtgvr------pykc~~c~ 180 (267)
T KOG3576|consen 113 SDQDSFTCRVCGKKFGLQRMLNRH-LKCHSDV-----KRHLCTFCGKGFNDTFDLKRHTRTHTGVR------PYKCSLCE 180 (267)
T ss_pred CCCCeeeeehhhhhhhHHHHHHHH-hhhccHH-----HHHHHhhccCcccchhhhhhhhccccCcc------ccchhhhh
Confidence 445679999999999999999999 8999988 99999999999999999999999999998 99999999
Q ss_pred CCCCChhhhhhhhhhccccc
Q 000479 923 SHFGNTEELWLHVQSVHAID 942 (1470)
Q Consensus 923 K~F~sks~L~~H~k~~Hsge 942 (1470)
|.|.....|..|++.+|...
T Consensus 181 kaftqrcsleshl~kvhgv~ 200 (267)
T KOG3576|consen 181 KAFTQRCSLESHLKKVHGVQ 200 (267)
T ss_pred HHHHhhccHHHHHHHHcCch
Confidence 99999999999999999754
No 18
>KOG3623 consensus Homeobox transcription factor SIP1 [Transcription]
Probab=99.18 E-value=5.4e-12 Score=152.12 Aligned_cols=122 Identities=22% Similarity=0.341 Sum_probs=90.6
Q ss_pred cccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCC
Q 000479 882 YACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSP 961 (1470)
Q Consensus 882 y~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~ 961 (1470)
..|+.|.+.+.....|+.|++..|... +-.|.|..|..+|.....|.+||. .|..-
T Consensus 211 ltcpycdrgykrltslkeHikyrhekn----e~nfsC~lCsytFAyRtQLErhm~-~hkpg------------------- 266 (1007)
T KOG3623|consen 211 LTCPYCDRGYKRLTSLKEHIKYRHEKN----EPNFSCMLCSYTFAYRTQLERHMQ-LHKPG------------------- 266 (1007)
T ss_pred hcchhHHHHHHHHHHHHHHHHHHHhhC----CCCCcchhhhhhhhhHHHHHHHHH-hhcCC-------------------
Confidence 567777777777777777777766544 344777777777777777777765 45321
Q ss_pred cccccCChhHHhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccc
Q 000479 962 KKLELGYSASVENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFK 1041 (1470)
Q Consensus 962 ~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~H 1041 (1470)
+-. ..|.-.-...|.|+|.+|||.|+.+.+|+.| .|+|.| +|||.|+-|+|+|.+...+-.||-..
T Consensus 267 -~dq-------a~sltqsa~lRKFKCtECgKAFKfKHHLKEH-lRIHSG-----EKPfeCpnCkKRFSHSGSySSHmSSK 332 (1007)
T KOG3623|consen 267 -GDQ-------AISLTQSALLRKFKCTECGKAFKFKHHLKEH-LRIHSG-----EKPFECPNCKKRFSHSGSYSSHMSSK 332 (1007)
T ss_pred -Ccc-------cccccchhhhccccccccchhhhhHHHHHhh-heeecC-----CCCcCCcccccccccCCccccccccc
Confidence 000 0111111234789999999999999999999 999999 99999999999999999999999543
No 19
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=99.16 E-value=2.7e-12 Score=159.71 Aligned_cols=91 Identities=42% Similarity=0.690 Sum_probs=81.8
Q ss_pred CCCCccccc-cceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHH-hhccCCCCcEEEEeCccccccc
Q 000479 1367 TCPNRVLQN-GVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRR-SRYGRDGCGYMLNIGAHINDMG 1444 (1470)
Q Consensus 1367 ~C~NRvvQ~-g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~-~~y~~~~~~Ylf~l~~~~~~~~ 1444 (1470)
+|.|+.+|+ +.-.+|+||+...+||||++.++|++|+||+||+||||+.++++.|+ ..|..+.+.|+..++.
T Consensus 1165 ~c~nqrm~r~e~cp~L~v~~gp~~G~~v~tk~PikagtfI~EYvGeVit~ke~e~~mmtl~~~d~~~~cL~I~p------ 1238 (1306)
T KOG1083|consen 1165 SCSNQRMQRHEECPPLEVFRGPKKGWGVRTKEPIKAGTFIMEYVGEVITEKEFEPRMMTLYHNDDDHYCLVIDP------ 1238 (1306)
T ss_pred hhhhHHhhhhccCCCcceeccCCCCccccccccccccchHHHHHHHHHHHHhhcccccccCCCCCcccccccCc------
Confidence 388888885 56678999999999999999999999999999999999999998884 4488888899988876
Q ss_pred cccCCCCcEEEeccCCCCeeecccCC
Q 000479 1445 RLIEGQVRYVIDATKYGNVSRFINHR 1470 (1470)
Q Consensus 1445 ~~~~~~~~~~IDA~~~GNvaRFINHS 1470 (1470)
..+||+.++||.+||||||
T Consensus 1239 -------~l~id~~R~~n~~Rfinhs 1257 (1306)
T KOG1083|consen 1239 -------GLFIDIPRMGNGARFINHS 1257 (1306)
T ss_pred -------cccCChhhccccccccccc
Confidence 5899999999999999997
No 20
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.03 E-value=3.8e-10 Score=111.24 Aligned_cols=78 Identities=46% Similarity=0.863 Sum_probs=66.5
Q ss_pred eEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCC--CcEEEEeCccccccccccCCCCcEEEec
Q 000479 1380 KLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDG--CGYMLNIGAHINDMGRLIEGQVRYVIDA 1457 (1470)
Q Consensus 1380 ~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~--~~Ylf~l~~~~~~~~~~~~~~~~~~IDA 1457 (1470)
+++++++..+|+||+|..+|++|++|++|.|+++...++..+...+...+ ..|+|.... .++||+
T Consensus 1 ~~~~~~~~~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-------------~~~id~ 67 (116)
T smart00317 1 KLEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERSKAYDTDGADSFYLFEIDS-------------DLCIDA 67 (116)
T ss_pred CcEEEecCCCcEEEEECCccCCCCEEEEEEeEEECHHHHHHHHHHHHhcCCCCEEEEECCC-------------CEEEeC
Confidence 36788888999999999999999999999999999998887764444433 378887643 589999
Q ss_pred cCCCCeeecccCC
Q 000479 1458 TKYGNVSRFINHR 1470 (1470)
Q Consensus 1458 ~~~GNvaRFINHS 1470 (1470)
...||++||||||
T Consensus 68 ~~~~~~~~~iNHs 80 (116)
T smart00317 68 RRKGNIARFINHS 80 (116)
T ss_pred CccCcHHHeeCCC
Confidence 9999999999998
No 21
>PLN03086 PRLI-interacting factor K; Provisional
Probab=98.80 E-value=5.9e-09 Score=128.50 Aligned_cols=144 Identities=19% Similarity=0.328 Sum_probs=90.8
Q ss_pred cccCcCCccccChhHHhhhhhhccccchhcccCcccccc--ccccccChhhhhhhhhhhccccchhhcccccccCCCCCC
Q 000479 848 HKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAI--CLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHF 925 (1470)
Q Consensus 848 fkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~--CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F 925 (1470)
-.|+.|...... ..|..| ..... . ..-.|+. |+..|. +..+..| +.|+.|++.|
T Consensus 408 V~C~NC~~~i~l-~~l~lH-e~~C~-r-----~~V~Cp~~~Cg~v~~-r~el~~H---------------~~C~~Cgk~f 463 (567)
T PLN03086 408 VECRNCKHYIPS-RSIALH-EAYCS-R-----HNVVCPHDGCGIVLR-VEEAKNH---------------VHCEKCGQAF 463 (567)
T ss_pred EECCCCCCccch-hHHHHH-HhhCC-C-----cceeCCcccccceee-ccccccC---------------ccCCCCCCcc
Confidence 457777665543 345566 22221 1 2345764 777772 3334444 3578887777
Q ss_pred CChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhhcCCccccccCccCCccC----------
Q 000479 926 GNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSENLGSIRKFICRFCGLKFD---------- 995 (1470)
Q Consensus 926 ~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF~---------- 995 (1470)
. ...|..|++..| +++.|. | +..+ .+..|..|+.+|.+++++.|++|++.|.
T Consensus 464 ~-~s~LekH~~~~H------------kpv~Cp-C---g~~~-~R~~L~~H~~thCp~Kpi~C~fC~~~v~~g~~~~d~~d 525 (567)
T PLN03086 464 Q-QGEMEKHMKVFH------------EPLQCP-C---GVVL-EKEQMVQHQASTCPLRLITCRFCGDMVQAGGSAMDVRD 525 (567)
T ss_pred c-hHHHHHHHHhcC------------CCccCC-C---CCCc-chhHHHhhhhccCCCCceeCCCCCCccccCccccchhh
Confidence 5 566777765433 467776 6 5444 4577888888888888888888888884
Q ss_pred ChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhcc-ccc
Q 000479 996 LLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRP-RFK 1041 (1470)
Q Consensus 996 sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~-r~H 1041 (1470)
..+.|..| ..++ | .+++.|..||+.|..+ .|..|+ ..|
T Consensus 526 ~~s~Lt~H-E~~C-G-----~rt~~C~~Cgk~Vrlr-dm~~H~~~~h 564 (567)
T PLN03086 526 RLRGMSEH-ESIC-G-----SRTAPCDSCGRSVMLK-EMDIHQIAVH 564 (567)
T ss_pred hhhhHHHH-HHhc-C-----CcceEccccCCeeeeh-hHHHHHHHhh
Confidence 24578888 6655 5 6788888888777654 466666 444
No 22
>PLN03086 PRLI-interacting factor K; Provisional
Probab=98.79 E-value=7.2e-09 Score=127.76 Aligned_cols=133 Identities=16% Similarity=0.170 Sum_probs=92.9
Q ss_pred cccccccccccChhhhhhhhhhhccccchhhcccccccC--CCCCCCChhhhhhhhhhcccccccchhhhhccccccccC
Q 000479 882 YACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIP--CGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGED 959 (1470)
Q Consensus 882 y~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~--CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~C 959 (1470)
-.|+.|..... ..+|..|.....-. .-.|+. |+..|. +..+..| +.|..|
T Consensus 408 V~C~NC~~~i~-l~~l~lHe~~C~r~-------~V~Cp~~~Cg~v~~-r~el~~H-------------------~~C~~C 459 (567)
T PLN03086 408 VECRNCKHYIP-SRSIALHEAYCSRH-------NVVCPHDGCGIVLR-VEEAKNH-------------------VHCEKC 459 (567)
T ss_pred EECCCCCCccc-hhHHHHHHhhCCCc-------ceeCCcccccceee-ccccccC-------------------ccCCCC
Confidence 46888876554 44556776443322 346874 888773 3333333 346666
Q ss_pred CCcccccCChhHHhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCC---------
Q 000479 960 SPKKLELGYSASVENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKL--------- 1030 (1470)
Q Consensus 960 p~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~--------- 1030 (1470)
+..|. ...|..|+++|+ ++|.|+ ||+.| .+..|..| +++|.+ .+++.|.+|++.|..
T Consensus 460 ---gk~f~-~s~LekH~~~~H--kpv~Cp-Cg~~~-~R~~L~~H-~~thCp-----~Kpi~C~fC~~~v~~g~~~~d~~d 525 (567)
T PLN03086 460 ---GQAFQ-QGEMEKHMKVFH--EPLQCP-CGVVL-EKEQMVQH-QASTCP-----LRLITCRFCGDMVQAGGSAMDVRD 525 (567)
T ss_pred ---CCccc-hHHHHHHHHhcC--CCccCC-CCCCc-chhHHHhh-hhccCC-----CCceeCCCCCCccccCccccchhh
Confidence 77775 567888888875 788888 88755 56888888 788888 788888888888852
Q ss_pred -hhhhhhccccccCCCccccCcCCCcCC
Q 000479 1031 -KSGRLSRPRFKKGLGAVSYRIRNRGAA 1057 (1470)
Q Consensus 1031 -ks~L~~H~r~Htgekpy~C~~C~ksF~ 1057 (1470)
.+.|..|...+ |.+++.|..|++.+.
T Consensus 526 ~~s~Lt~HE~~C-G~rt~~C~~Cgk~Vr 552 (567)
T PLN03086 526 RLRGMSEHESIC-GSRTAPCDSCGRSVM 552 (567)
T ss_pred hhhhHHHHHHhc-CCcceEccccCCeee
Confidence 35788888775 888888888887775
No 23
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=98.79 E-value=6.2e-09 Score=115.44 Aligned_cols=86 Identities=35% Similarity=0.516 Sum_probs=69.4
Q ss_pred ccceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCC--CcEEEEeCccccccccccCCCCc
Q 000479 1375 NGVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDG--CGYMLNIGAHINDMGRLIEGQVR 1452 (1470)
Q Consensus 1375 ~g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~--~~Ylf~l~~~~~~~~~~~~~~~~ 1452 (1470)
.|..-.|.+..-..+|.||++...+.+|+||.||.|.+|...||..|++.|..+. ..|+|.+... ...
T Consensus 252 ~g~~egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdliei~eAk~rE~~Ya~De~~GcYMYyF~h~----------sk~ 321 (392)
T KOG1085|consen 252 KGTNEGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLIEISEAKVREEQYANDEEIGCYMYYFEHN----------SKK 321 (392)
T ss_pred hccccceeEEeeccccceeEeecccccCceEEEEecceeeechHHHHHHHhccCcccceEEEeeecc----------Cee
Confidence 3445555555555699999999999999999999999999999999999997653 3477766543 247
Q ss_pred EEEeccCCC-CeeecccCC
Q 000479 1453 YVIDATKYG-NVSRFINHR 1470 (1470)
Q Consensus 1453 ~~IDA~~~G-NvaRFINHS 1470 (1470)
|+|||+.-- -.+|.||||
T Consensus 322 yCiDAT~et~~lGRLINHS 340 (392)
T KOG1085|consen 322 YCIDATKETPWLGRLINHS 340 (392)
T ss_pred eeeecccccccchhhhccc
Confidence 999999864 569999998
No 24
>PHA00733 hypothetical protein
Probab=98.52 E-value=5.5e-08 Score=100.08 Aligned_cols=63 Identities=11% Similarity=0.151 Sum_probs=48.2
Q ss_pred HhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccC
Q 000479 972 VENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKG 1043 (1470)
Q Consensus 972 L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htg 1043 (1470)
|.+|+..| +.+||.|+.||+.|.+...|..| ++.|+. +|.|.+|++.|.....|..|+..+++
T Consensus 62 l~~~~~~~-~~kPy~C~~Cgk~Fss~s~L~~H-~r~h~~-------~~~C~~CgK~F~~~~sL~~H~~~~h~ 124 (128)
T PHA00733 62 LYKLLTSK-AVSPYVCPLCLMPFSSSVSLKQH-IRYTEH-------SKVCPVCGKEFRNTDSTLDHVCKKHN 124 (128)
T ss_pred HHhhcccC-CCCCccCCCCCCcCCCHHHHHHH-HhcCCc-------CccCCCCCCccCCHHHHHHHHHHhcC
Confidence 44454444 46888999999999999999888 666643 68899999999999999888866553
No 25
>PHA00733 hypothetical protein
Probab=98.40 E-value=2.1e-07 Score=95.79 Aligned_cols=93 Identities=4% Similarity=-0.048 Sum_probs=74.6
Q ss_pred hhHHhhhhhhcCCccccccCccCCccCChhhHhHh--h--hhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccCC
Q 000479 969 SASVENHSENLGSIRKFICRFCGLKFDLLPDLGRH--H--QAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKGL 1044 (1470)
Q Consensus 969 ~s~L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH--~--~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htge 1044 (1470)
...|..+-..-...+++.|.+|.+.|.....|..| + ...+.+ .+||.|+.|++.|.+...|..|++.| .
T Consensus 25 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~l~~~~~~~~-----~kPy~C~~Cgk~Fss~s~L~~H~r~h--~ 97 (128)
T PHA00733 25 LEELKRYHSLTPEQKRLIRAVVKTLIYNPQLLDESSYLYKLLTSKA-----VSPYVCPLCLMPFSSSVSLKQHIRYT--E 97 (128)
T ss_pred HHHhhhhhcCChhhhhHHHHHHhhhccChhhhcchHHHHhhcccCC-----CCCccCCCCCCcCCCHHHHHHHHhcC--C
Confidence 34555555444556889999999999988777766 1 112233 68999999999999999999999987 4
Q ss_pred CccccCcCCCcCCChHHHHhhcCC
Q 000479 1045 GAVSYRIRNRGAAGMKKRIQTLKP 1068 (1470)
Q Consensus 1045 kpy~C~~C~ksF~~~~~L~kHkks 1068 (1470)
++|.|..|++.|.....|..|+..
T Consensus 98 ~~~~C~~CgK~F~~~~sL~~H~~~ 121 (128)
T PHA00733 98 HSKVCPVCGKEFRNTDSTLDHVCK 121 (128)
T ss_pred cCccCCCCCCccCCHHHHHHHHHH
Confidence 689999999999999999999763
No 26
>KOG3993 consensus Transcription factor (contains Zn finger) [Transcription]
Probab=98.36 E-value=8.9e-08 Score=111.41 Aligned_cols=180 Identities=12% Similarity=0.083 Sum_probs=100.7
Q ss_pred cccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchh----------------
Q 000479 848 HKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVE---------------- 911 (1470)
Q Consensus 848 fkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e---------------- 911 (1470)
|.|..|...|.+...|.+| +-.-.-. -.|+|+.|+|.|....+|..|.|.|.......
T Consensus 268 yiCqLCK~kYeD~F~LAQH-rC~RIV~-----vEYrCPEC~KVFsCPANLASHRRWHKPR~eaa~a~~~P~k~~~~~rae 341 (500)
T KOG3993|consen 268 YICQLCKEKYEDAFALAQH-RCPRIVH-----VEYRCPECDKVFSCPANLASHRRWHKPRPEAAKAGSPPPKQAVETRAE 341 (500)
T ss_pred HHHHHHHHhhhhHHHHhhc-cCCeeEE-----eeecCCcccccccCchhhhhhhcccCCchhhhhcCCCChhhhhhhhhh
Confidence 7777777777777777777 3222111 34777777777777777777777775321110
Q ss_pred -----------hcccccccCCCCCCCChhhhhhhhhhcccccccchh-------hhhccccccccCCCcccccCChhHHh
Q 000479 912 -----------QCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSE-------VAQQHNQSVGEDSPKKLELGYSASVE 973 (1470)
Q Consensus 912 -----------~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~-------~~~~k~~~C~~Cp~~~k~F~s~s~L~ 973 (1470)
....|.|..|+|.|.++..|++|+...|........ ....-.+-|..| .-.+.....-.
T Consensus 342 ~~ea~rsg~dss~gi~~C~~C~KkFrRqAYLrKHqlthq~~~~~k~~a~~f~~s~~~~l~~~~~~~---a~h~~a~~~~g 418 (500)
T KOG3993|consen 342 VQEAERSGDDSSSGIFSCHTCGKKFRRQAYLRKHQLTHQRAPLAKEKAPKFLLSRVIPLMHFNQAV---ATHSSASDSHG 418 (500)
T ss_pred hhhccccCCcccCceeecHHhhhhhHHHHHHHHhHHhhhccccchhcccCcchhhccccccccccc---ccccccccccc
Confidence 123588999999999999999996654433200000 000001222222 11111111111
Q ss_pred hhhhhcCC-ccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhcc-cccc
Q 000479 974 NHSENLGS-IRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRP-RFKK 1042 (1470)
Q Consensus 974 ~H~r~Htg-eKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~-r~Ht 1042 (1470)
.|...+.+ .....|++||-.+.++..-..| .+.-.. +.-|.|.+|.-.|....+|.+|+ +-|.
T Consensus 419 ~~vl~~a~sael~~pp~~~~ppsss~~sgg~-~rlg~~-----~q~f~~ky~~atfyss~~ltrhin~~Hp 483 (500)
T KOG3993|consen 419 DEVLYVAGSAELELPPYDGSPPSSSGSSGGY-GRLGIA-----EQGFTCKYCPATFYSSPGLTRHINKCHP 483 (500)
T ss_pred cceeeeeccccccCCCCCCCCcccCCCCCcc-ccccch-----hhccccccchHhhhcCcchHhHhhhcCh
Confidence 11111111 1234578888777777666665 332222 56788888888888888888888 4454
No 27
>PHA02768 hypothetical protein; Provisional
Probab=98.11 E-value=7e-07 Score=78.07 Aligned_cols=42 Identities=14% Similarity=0.064 Sum_probs=21.5
Q ss_pred cccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhh
Q 000479 985 FICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGR 1034 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L 1034 (1470)
|+|+.||+.|.+.++|.+| +++|+. +|+|..|++.|.+.+.|
T Consensus 6 y~C~~CGK~Fs~~~~L~~H-~r~H~k-------~~kc~~C~k~f~~~s~l 47 (55)
T PHA02768 6 YECPICGEIYIKRKSMITH-LRKHNT-------NLKLSNCKRISLRTGEY 47 (55)
T ss_pred cCcchhCCeeccHHHHHHH-HHhcCC-------cccCCcccceeccccee
Confidence 4555555555555555555 555542 45555555555544444
No 28
>PHA02768 hypothetical protein; Provisional
Probab=98.00 E-value=2.8e-06 Score=74.37 Aligned_cols=46 Identities=11% Similarity=-0.036 Sum_probs=38.3
Q ss_pred CcccCCCCccCCChhhhhhccccccCCCccccCcCCCcCCChHHHHhh
Q 000479 1018 PHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRNRGAAGMKKRIQT 1065 (1470)
Q Consensus 1018 pykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~ksF~~~~~L~kH 1065 (1470)
-|+|+.||+.|.+.++|..||++|+ ++|+|..|++.|.+.+.|+.-
T Consensus 5 ~y~C~~CGK~Fs~~~~L~~H~r~H~--k~~kc~~C~k~f~~~s~l~~~ 50 (55)
T PHA02768 5 GYECPICGEIYIKRKSMITHLRKHN--TNLKLSNCKRISLRTGEYIEI 50 (55)
T ss_pred ccCcchhCCeeccHHHHHHHHHhcC--CcccCCcccceecccceeEEE
Confidence 4789999999999999999998888 788888999988877766543
No 29
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=97.93 E-value=2e-06 Score=106.94 Aligned_cols=105 Identities=29% Similarity=0.324 Sum_probs=81.7
Q ss_pred eeccCCCCCCCCCCCCccccccceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCCCcEEE
Q 000479 1355 IYECNHMCSCDRTCPNRVLQNGVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDGCGYML 1434 (1470)
Q Consensus 1355 IyECn~~C~C~~~C~NRvvQ~g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~~~Ylf 1434 (1470)
+.++...+.....+.|.........+..+..+..+||||+|++.|++|+||.+|.|+++...++..|...+...+..+.|
T Consensus 308 ~~~~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~fa~~~i~~~e~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 387 (480)
T COG2940 308 SDFSKSNVSKLKELLNSNGCKKRREPNVVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREENYDLLGNEFSF 387 (480)
T ss_pred cccccccCccccchhhhcccccccchhhhhhhcccccceeehhhccchHHHHHhcCcccchHHHHhhhccccccccccch
Confidence 33444444445567777667777788888889999999999999999999999999999999999888776444433333
Q ss_pred EeCccccccccccCCCCcEEEeccCCCCeeecccCC
Q 000479 1435 NIGAHINDMGRLIEGQVRYVIDATKYGNVSRFINHR 1470 (1470)
Q Consensus 1435 ~l~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHS 1470 (1470)
.+.. ...+++|+...|+++||||||
T Consensus 388 ~~~~-----------~~~~~~d~~~~g~~~r~~nHS 412 (480)
T COG2940 388 GLLE-----------DKDKVRDSQKAGDVARFINHS 412 (480)
T ss_pred hhcc-----------ccchhhhhhhcccccceeecC
Confidence 3322 126899999999999999998
No 30
>KOG3993 consensus Transcription factor (contains Zn finger) [Transcription]
Probab=97.79 E-value=7.8e-06 Score=95.70 Aligned_cols=171 Identities=18% Similarity=0.218 Sum_probs=113.8
Q ss_pred ccccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCC
Q 000479 881 GYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDS 960 (1470)
Q Consensus 881 py~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp 960 (1470)
-|.|..|...|...-.|.+|.-...-.. -|+|++|+|.|.-..+|..| ++-|........ -..-|
T Consensus 267 dyiCqLCK~kYeD~F~LAQHrC~RIV~v------EYrCPEC~KVFsCPANLASH-RRWHKPR~eaa~--------a~~~P 331 (500)
T KOG3993|consen 267 DYICQLCKEKYEDAFALAQHRCPRIVHV------EYRCPECDKVFSCPANLASH-RRWHKPRPEAAK--------AGSPP 331 (500)
T ss_pred HHHHHHHHHhhhhHHHHhhccCCeeEEe------eecCCcccccccCchhhhhh-hcccCCchhhhh--------cCCCC
Confidence 4999999999999999999975433322 59999999999999999999 568864310000 00000
Q ss_pred CcccccCChhHHhhhhhh--cCCccccccCccCCccCChhhHhHhhhhhccCCCCCC-----------------------
Q 000479 961 PKKLELGYSASVENHSEN--LGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVN----------------------- 1015 (1470)
Q Consensus 961 ~~~k~F~s~s~L~~H~r~--HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~----------------------- 1015 (1470)
.+.. -......++-.|. ...+..|.|.+|+|+|.+...|++| +.+|.......
T Consensus 332 ~k~~-~~~rae~~ea~rsg~dss~gi~~C~~C~KkFrRqAYLrKH-qlthq~~~~~k~~a~~f~~s~~~~l~~~~~~~a~ 409 (500)
T KOG3993|consen 332 PKQA-VETRAEVQEAERSGDDSSSGIFSCHTCGKKFRRQAYLRKH-QLTHQRAPLAKEKAPKFLLSRVIPLMHFNQAVAT 409 (500)
T ss_pred hhhh-hhhhhhhhhccccCCcccCceeecHHhhhhhHHHHHHHHh-HHhhhccccchhcccCcchhhccccccccccccc
Confidence 0000 0000011111111 1223479999999999999999999 66664311000
Q ss_pred ------------------CCCcccCCCCccCCChhhhhhccccccCCCccccCcCCCcCCChHHHHhhcCC
Q 000479 1016 ------------------SRPHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRNRGAAGMKKRIQTLKP 1068 (1470)
Q Consensus 1016 ------------------eKpykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~ksF~~~~~L~kHkks 1068 (1470)
.....|+.|+-.+.++..--.|.+.-..+..|.|++|.-+|.....|.+|...
T Consensus 410 h~~a~~~~g~~vl~~a~sael~~pp~~~~ppsss~~sgg~~rlg~~~q~f~~ky~~atfyss~~ltrhin~ 480 (500)
T KOG3993|consen 410 HSSASDSHGDEVLYVAGSAELELPPYDGSPPSSSGSSGGYGRLGIAEQGFTCKYCPATFYSSPGLTRHINK 480 (500)
T ss_pred ccccccccccceeeeeccccccCCCCCCCCcccCCCCCccccccchhhccccccchHhhhcCcchHhHhhh
Confidence 12345777887787777777777777777889999999999999999998763
No 31
>PF13465 zf-H2C2_2: Zinc-finger double domain; PDB: 2EN7_A 1TF6_A 1TF3_A 2ELT_A 2EOS_A 2EN2_A 2DMD_A 2WBS_A 2WBU_A 2EM5_A ....
Probab=97.68 E-value=1.6e-05 Score=59.61 Aligned_cols=24 Identities=21% Similarity=0.654 Sum_probs=15.0
Q ss_pred HhhhhhhcCCccccccCccCCccC
Q 000479 972 VENHSENLGSIRKFICRFCGLKFD 995 (1470)
Q Consensus 972 L~~H~r~HtgeKpykC~~CGKsF~ 995 (1470)
|..|+++|++++||+|++|+++|.
T Consensus 2 l~~H~~~H~~~k~~~C~~C~k~F~ 25 (26)
T PF13465_consen 2 LRRHMRTHTGEKPYKCPYCGKSFS 25 (26)
T ss_dssp HHHHHHHHSSSSSEEESSSSEEES
T ss_pred HHHHhhhcCCCCCCCCCCCcCeeC
Confidence 556666666666666666666664
No 32
>PHA00732 hypothetical protein
Probab=97.39 E-value=0.00011 Score=69.64 Aligned_cols=47 Identities=21% Similarity=0.218 Sum_probs=30.3
Q ss_pred ccccCccCCccCChhhHhHhhhh-hccCCCCCCCCCcccCCCCccCCChhhhhhcccccc
Q 000479 984 KFICRFCGLKFDLLPDLGRHHQA-AHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKK 1042 (1470)
Q Consensus 984 pykC~~CGKsF~sks~LkrH~~r-vHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Ht 1042 (1470)
||+|+.||+.|.+...|.+| ++ .|++ +.|+.|++.|. .|..|++.+.
T Consensus 1 py~C~~Cgk~F~s~s~Lk~H-~r~~H~~--------~~C~~CgKsF~---~l~~H~~~~~ 48 (79)
T PHA00732 1 MFKCPICGFTTVTLFALKQH-ARRNHTL--------TKCPVCNKSYR---RLNQHFYSQY 48 (79)
T ss_pred CccCCCCCCccCCHHHHHHH-hhcccCC--------CccCCCCCEeC---ChhhhhcccC
Confidence 46677777777777777777 44 3543 35777777776 4666665544
No 33
>PHA00616 hypothetical protein
Probab=97.26 E-value=6.7e-05 Score=62.96 Aligned_cols=33 Identities=3% Similarity=-0.224 Sum_probs=18.9
Q ss_pred CcccCCCCccCCChhhhhhccccccCCCccccC
Q 000479 1018 PHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYR 1050 (1470)
Q Consensus 1018 pykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~ 1050 (1470)
||+|+.||+.|..++.|.+|++.|+|++++.|+
T Consensus 1 pYqC~~CG~~F~~~s~l~~H~r~~hg~~~~~~~ 33 (44)
T PHA00616 1 MYQCLRCGGIFRKKKEVIEHLLSVHKQNKLTLE 33 (44)
T ss_pred CCccchhhHHHhhHHHHHHHHHHhcCCCcccee
Confidence 455555555555555555555555555555543
No 34
>PF13465 zf-H2C2_2: Zinc-finger double domain; PDB: 2EN7_A 1TF6_A 1TF3_A 2ELT_A 2EOS_A 2EN2_A 2DMD_A 2WBS_A 2WBU_A 2EM5_A ....
Probab=97.22 E-value=0.00015 Score=54.38 Aligned_cols=25 Identities=28% Similarity=0.402 Sum_probs=16.0
Q ss_pred hHhHhhhhhccCCCCCCCCCcccCCCCccCC
Q 000479 999 DLGRHHQAAHMGPNLVNSRPHKKGIRFYAYK 1029 (1470)
Q Consensus 999 ~LkrH~~rvHtge~~~~eKpykC~iCgKsF~ 1029 (1470)
+|.+| +++|+| ++||+|++|+++|.
T Consensus 1 ~l~~H-~~~H~~-----~k~~~C~~C~k~F~ 25 (26)
T PF13465_consen 1 NLRRH-MRTHTG-----EKPYKCPYCGKSFS 25 (26)
T ss_dssp HHHHH-HHHHSS-----SSSEEESSSSEEES
T ss_pred CHHHH-hhhcCC-----CCCCCCCCCcCeeC
Confidence 35666 556666 66666666666664
No 35
>PF05605 zf-Di19: Drought induced 19 protein (Di19), zinc-binding; InterPro: IPR008598 This entry consists of several drought induced 19 (Di19) like and RING finger 114 proteins. Di19 has been found to be strongly expressed in both the roots and leaves of Arabidopsis thaliana during progressive drought, whilst RING finger proteins are thought to play a role in spermatogenesis. The precise function is unknown.
Probab=97.16 E-value=0.00028 Score=62.08 Aligned_cols=52 Identities=17% Similarity=0.198 Sum_probs=42.6
Q ss_pred ccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhcccccc
Q 000479 984 KFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKK 1042 (1470)
Q Consensus 984 pykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Ht 1042 (1470)
.|.|++|++. .+...|..|....|..+ .+.+.|++|...+. .+|..|+..++
T Consensus 2 ~f~CP~C~~~-~~~~~L~~H~~~~H~~~----~~~v~CPiC~~~~~--~~l~~Hl~~~H 53 (54)
T PF05605_consen 2 SFTCPYCGKG-FSESSLVEHCEDEHRSE----SKNVVCPICSSRVT--DNLIRHLNSQH 53 (54)
T ss_pred CcCCCCCCCc-cCHHHHHHHHHhHCcCC----CCCccCCCchhhhh--hHHHHHHHHhc
Confidence 4899999994 55688999988999874 46799999998755 49999997665
No 36
>PHA00732 hypothetical protein
Probab=96.99 E-value=0.00045 Score=65.62 Aligned_cols=47 Identities=11% Similarity=-0.113 Sum_probs=38.8
Q ss_pred CcccCCCCccCCChhhhhhcccc-ccCCCccccCcCCCcCCChHHHHhhcCCCC
Q 000479 1018 PHKKGIRFYAYKLKSGRLSRPRF-KKGLGAVSYRIRNRGAAGMKKRIQTLKPLA 1070 (1470)
Q Consensus 1018 pykC~iCgKsF~~ks~L~~H~r~-Htgekpy~C~~C~ksF~~~~~L~kHkksh~ 1070 (1470)
||.|..|++.|.+...|..|++. |+ ++.|+.|++.|.+ +..|.++..
T Consensus 1 py~C~~Cgk~F~s~s~Lk~H~r~~H~---~~~C~~CgKsF~~---l~~H~~~~~ 48 (79)
T PHA00732 1 MFKCPICGFTTVTLFALKQHARRNHT---LTKCPVCNKSYRR---LNQHFYSQY 48 (79)
T ss_pred CccCCCCCCccCCHHHHHHHhhcccC---CCccCCCCCEeCC---hhhhhcccC
Confidence 68999999999999999999984 65 4689999999985 555655433
No 37
>PF05605 zf-Di19: Drought induced 19 protein (Di19), zinc-binding; InterPro: IPR008598 This entry consists of several drought induced 19 (Di19) like and RING finger 114 proteins. Di19 has been found to be strongly expressed in both the roots and leaves of Arabidopsis thaliana during progressive drought, whilst RING finger proteins are thought to play a role in spermatogenesis. The precise function is unknown.
Probab=96.99 E-value=0.00054 Score=60.34 Aligned_cols=52 Identities=27% Similarity=0.587 Sum_probs=33.1
Q ss_pred ccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhc
Q 000479 847 THKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERH 905 (1470)
Q Consensus 847 pfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh 905 (1470)
.|.||.|++ ..+...|..|....|..+. +.+.|++|...+. .+|..|+..+|
T Consensus 2 ~f~CP~C~~-~~~~~~L~~H~~~~H~~~~----~~v~CPiC~~~~~--~~l~~Hl~~~H 53 (54)
T PF05605_consen 2 SFTCPYCGK-GFSESSLVEHCEDEHRSES----KNVVCPICSSRVT--DNLIRHLNSQH 53 (54)
T ss_pred CcCCCCCCC-ccCHHHHHHHHHhHCcCCC----CCccCCCchhhhh--hHHHHHHHHhc
Confidence 367777777 4455667777766776652 4577777776544 36666766655
No 38
>PHA00616 hypothetical protein
Probab=96.72 E-value=0.00054 Score=57.67 Aligned_cols=34 Identities=15% Similarity=0.161 Sum_probs=29.8
Q ss_pred ccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCC
Q 000479 984 KFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGI 1023 (1470)
Q Consensus 984 pykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~i 1023 (1470)
||+|+.||+.|..+++|.+| .+.|+| ++++.|+.
T Consensus 1 pYqC~~CG~~F~~~s~l~~H-~r~~hg-----~~~~~~~~ 34 (44)
T PHA00616 1 MYQCLRCGGIFRKKKEVIEH-LLSVHK-----QNKLTLEY 34 (44)
T ss_pred CCccchhhHHHhhHHHHHHH-HHHhcC-----CCccceeE
Confidence 68999999999999999999 788888 78888864
No 39
>COG5189 SFP1 Putative transcriptional repressor regulating G2/M transition [Transcription / Cell division and chromosome partitioning]
Probab=96.69 E-value=0.00083 Score=76.58 Aligned_cols=57 Identities=19% Similarity=0.280 Sum_probs=44.2
Q ss_pred ccccccCc--cCCccCChhhHhHhhhhhccCCCC-------------CCCCCcccCCCCccCCChhhhhhcc
Q 000479 982 IRKFICRF--CGLKFDLLPDLGRHHQAAHMGPNL-------------VNSRPHKKGIRFYAYKLKSGRLSRP 1038 (1470)
Q Consensus 982 eKpykC~~--CGKsF~sks~LkrH~~rvHtge~~-------------~~eKpykC~iCgKsF~~ks~L~~H~ 1038 (1470)
+|||+|++ |.|+|+....|+-|.+.-|...++ .+.|||.|++|+|+|+....|+.|.
T Consensus 347 ~KpykCpV~gC~K~YknqnGLKYH~lhGH~~~~~~~~p~p~~~~~F~~~~KPYrCevC~KRYKNlNGLKYHr 418 (423)
T COG5189 347 GKPYKCPVEGCNKKYKNQNGLKYHMLHGHQNQKLHENPSPEKMNIFSAKDKPYRCEVCDKRYKNLNGLKYHR 418 (423)
T ss_pred CceecCCCCCchhhhccccchhhhhhccccCcccCCCCCccccccccccCCceeccccchhhccCccceecc
Confidence 49999975 999999999999995555533211 2368888888888888888888886
No 40
>COG5189 SFP1 Putative transcriptional repressor regulating G2/M transition [Transcription / Cell division and chromosome partitioning]
Probab=96.24 E-value=0.0018 Score=73.90 Aligned_cols=71 Identities=21% Similarity=0.378 Sum_probs=47.1
Q ss_pred CCCccccCc--CCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchhhcccccccCC
Q 000479 844 DEKTHKCKI--CSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPC 921 (1470)
Q Consensus 844 gekpfkC~~--CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~C 921 (1470)
++|||+|++ |.|.|++...|+.|+..-|... +...=+ .-..|.- +..+.|||+|++|
T Consensus 346 d~KpykCpV~gC~K~YknqnGLKYH~lhGH~~~-----~~~~~p----------~p~~~~~------F~~~~KPYrCevC 404 (423)
T COG5189 346 DGKPYKCPVEGCNKKYKNQNGLKYHMLHGHQNQ-----KLHENP----------SPEKMNI------FSAKDKPYRCEVC 404 (423)
T ss_pred cCceecCCCCCchhhhccccchhhhhhccccCc-----ccCCCC----------Ccccccc------ccccCCceecccc
Confidence 359999987 9999999999999965555332 211111 1111111 1112358999999
Q ss_pred CCCCCChhhhhhhh
Q 000479 922 GSHFGNTEELWLHV 935 (1470)
Q Consensus 922 gK~F~sks~L~~H~ 935 (1470)
+|.|++...|+.|+
T Consensus 405 ~KRYKNlNGLKYHr 418 (423)
T COG5189 405 DKRYKNLNGLKYHR 418 (423)
T ss_pred chhhccCccceecc
Confidence 99999988898884
No 41
>PF00096 zf-C2H2: Zinc finger, C2H2 type; InterPro: IPR007087 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter. This entry represents the classical C2H2 zinc finger domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding, 0005622 intracellular; PDB: 2D9H_A 2EPC_A 1SP1_A 1VA3_A 2WBT_B 2ELR_A 2YTP_A 2YTT_A 1VA1_A 2ELO_A ....
Probab=95.82 E-value=0.0025 Score=46.01 Aligned_cols=19 Identities=37% Similarity=0.880 Sum_probs=10.4
Q ss_pred cccCccCCccCChhhHhHh
Q 000479 985 FICRFCGLKFDLLPDLGRH 1003 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH 1003 (1470)
|+|+.|++.|.++..|.+|
T Consensus 1 y~C~~C~~~f~~~~~l~~H 19 (23)
T PF00096_consen 1 YKCPICGKSFSSKSNLKRH 19 (23)
T ss_dssp EEETTTTEEESSHHHHHHH
T ss_pred CCCCCCCCccCCHHHHHHH
Confidence 4455555555555555555
No 42
>PF12756 zf-C2H2_2: C2H2 type zinc-finger (2 copies); PDB: 2DMI_A.
Probab=95.75 E-value=0.0046 Score=59.80 Aligned_cols=71 Identities=23% Similarity=0.430 Sum_probs=17.0
Q ss_pred cccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhhcCCccccccCccCCccCC
Q 000479 917 QCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSENLGSIRKFICRFCGLKFDL 996 (1470)
Q Consensus 917 kC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC~~CGKsF~s 996 (1470)
+|..|+..|.+...|..||...|...+ + . ...+.....+..+.+... ...+.|..|++.|.+
T Consensus 1 ~C~~C~~~f~~~~~l~~H~~~~H~~~~---------~----~----~~~l~~~~~~~~~~~~~~-~~~~~C~~C~~~f~s 62 (100)
T PF12756_consen 1 QCLFCDESFSSVDDLLQHMKKKHGFDI---------P----D----QKYLVDPNRLLNYLRKKV-KESFRCPYCNKTFRS 62 (100)
T ss_dssp ------------------------------------------------------------------SSEEBSSSS-EESS
T ss_pred Ccccccccccccccccccccccccccc---------c----c----cccccccccccccccccc-CCCCCCCccCCCCcC
Confidence 488999999999999999887786330 0 0 111112223333332211 125888888888888
Q ss_pred hhhHhHhhh
Q 000479 997 LPDLGRHHQ 1005 (1470)
Q Consensus 997 ks~LkrH~~ 1005 (1470)
...|..|+.
T Consensus 63 ~~~l~~Hm~ 71 (100)
T PF12756_consen 63 REALQEHMR 71 (100)
T ss_dssp HHHHHHHHH
T ss_pred HHHHHHHHc
Confidence 888888843
No 43
>PF12756 zf-C2H2_2: C2H2 type zinc-finger (2 copies); PDB: 2DMI_A.
Probab=95.72 E-value=0.0041 Score=60.18 Aligned_cols=73 Identities=19% Similarity=0.303 Sum_probs=20.9
Q ss_pred ccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCCh
Q 000479 849 KCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNT 928 (1470)
Q Consensus 849 kC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sk 928 (1470)
+|..|+..|.+...|..|+...|.-. .+ ....+.....+..+++..... .+.|..|++.|.+.
T Consensus 1 ~C~~C~~~f~~~~~l~~H~~~~H~~~-----~~-----~~~~l~~~~~~~~~~~~~~~~-------~~~C~~C~~~f~s~ 63 (100)
T PF12756_consen 1 QCLFCDESFSSVDDLLQHMKKKHGFD-----IP-----DQKYLVDPNRLLNYLRKKVKE-------SFRCPYCNKTFRSR 63 (100)
T ss_dssp ----------------------------------------------------------S-------SEEBSSSS-EESSH
T ss_pred Cccccccccccccccccccccccccc-----cc-----cccccccccccccccccccCC-------CCCCCccCCCCcCH
Confidence 58999999999999999987788644 11 222233444454554433222 58899999999999
Q ss_pred hhhhhhhhhc
Q 000479 929 EELWLHVQSV 938 (1470)
Q Consensus 929 s~L~~H~k~~ 938 (1470)
..|..||+..
T Consensus 64 ~~l~~Hm~~~ 73 (100)
T PF12756_consen 64 EALQEHMRSK 73 (100)
T ss_dssp HHHHHHHHHT
T ss_pred HHHHHHHcCc
Confidence 9999999854
No 44
>KOG2231 consensus Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]
Probab=95.65 E-value=0.011 Score=75.08 Aligned_cols=146 Identities=19% Similarity=0.265 Sum_probs=70.7
Q ss_pred ccccccccccccChhhhhhhhhhhccccchhhcccccccCCCC---------CCCChhhhhhhhhhcccccccchhhhhc
Q 000479 881 GYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGS---------HFGNTEELWLHVQSVHAIDFKMSEVAQQ 951 (1470)
Q Consensus 881 py~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK---------~F~sks~L~~H~k~~Hsgef~~~~~~~~ 951 (1470)
.-.|.+| -.|.+...|+.|+...|. .+.|..|-. ...+...|..|++.--.++ .....
T Consensus 115 ~~~~~~c-~~~~s~~~Lk~H~~~~H~--------~~~c~lC~~~~kif~~e~k~Yt~~el~~h~~~gd~d~----~s~rG 181 (669)
T KOG2231|consen 115 KKECLHC-TEFKSVENLKNHMRDQHK--------LHLCSLCLQNLKIFINERKLYTRAELNLHLMFGDPDD----ESCRG 181 (669)
T ss_pred cCCCccc-cchhHHHHHHHHHHHhhh--------hhccccccccceeeeeeeehehHHHHHHHHhcCCCcc----ccccC
Confidence 4457777 666677777777765554 344554421 2234556666643211111 00000
Q ss_pred cccccccCCCcccccCChhHHhhhhhhcCCccccccCcc------CCccCChhhHhHhhhhhccCCCCCCCCCcccCCCC
Q 000479 952 HNQSVGEDSPKKLELGYSASVENHSENLGSIRKFICRFC------GLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRF 1025 (1470)
Q Consensus 952 k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC~~C------GKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCg 1025 (1470)
.-.|..| ...|-....|.+|++.++ |.|.+| +.-|.....|..|.+.-|-- -.+.-..+..+-
T Consensus 182 -hp~C~~C---~~~fld~~el~rH~~~~h----~~chfC~~~~~~neyy~~~~dLe~HfR~~Hfl---CE~~~C~~~~f~ 250 (669)
T KOG2231|consen 182 -HPLCKFC---HERFLDDDELYRHLRFDH----EFCHFCDYKTGQNEYYNDYDDLEEHFRKGHFL---CEEEFCRTKKFY 250 (669)
T ss_pred -Cccchhh---hhhhccHHHHHHhhccce----eheeecCcccccchhcccchHHHHHhhhcCcc---ccccccccceee
Confidence 1223333 666666667777776554 455555 34466666777774333321 000011112222
Q ss_pred ccCCChhhhhhccccccCCCccccC
Q 000479 1026 YAYKLKSGRLSRPRFKKGLGAVSYR 1050 (1470)
Q Consensus 1026 KsF~~ks~L~~H~r~Htgekpy~C~ 1050 (1470)
-.|.....|+.|.+.+.-++.|.|.
T Consensus 251 ~~~~~ei~lk~~~~~~~~e~~~~~~ 275 (669)
T KOG2231|consen 251 VAFELEIELKAHNRFIQHEKCYICR 275 (669)
T ss_pred ehhHHHHHHHhhccccchheeccCC
Confidence 2334445555555555555666664
No 45
>cd01395 HMT_MBD Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains a MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins.
Probab=95.39 E-value=0.0032 Score=56.65 Aligned_cols=37 Identities=14% Similarity=0.039 Sum_probs=32.2
Q ss_pred CCC-CCcccC----------CcccccccCCCCCCc-cccccceeeeccC
Q 000479 1184 HLE-PLPSVS----------AGIRSSDSSDFVNNQ-WEVDECHCIIDSR 1220 (1470)
Q Consensus 1184 pl~-p~~~~~----------~~~k~v~~~~p~~~~-w~~~e~~~~l~~~ 1220 (1470)
||+ |+++|| +.++.|+|++|||.. ++|.|+|.||...
T Consensus 1 PL~~Pll~gw~R~~~~~~~~~~k~~V~Y~aPCGr~Lr~~~EV~~YL~~t 49 (60)
T cd01395 1 PLHTPLLCGFQRMKYRARVGKVKKHVIYKAPCGRSLRNMSEVHRYLRET 49 (60)
T ss_pred CcccccccCeEEEEEeccCCCcccceEEECCcchhhhcHHHHHHHHHhc
Confidence 777 999999 257789999999999 9999999988765
No 46
>PF00096 zf-C2H2: Zinc finger, C2H2 type; InterPro: IPR007087 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter. This entry represents the classical C2H2 zinc finger domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding, 0005622 intracellular; PDB: 2D9H_A 2EPC_A 1SP1_A 1VA3_A 2WBT_B 2ELR_A 2YTP_A 2YTT_A 1VA1_A 2ELO_A ....
Probab=95.25 E-value=0.0075 Score=43.53 Aligned_cols=23 Identities=22% Similarity=-0.042 Sum_probs=20.8
Q ss_pred cccCCCCccCCChhhhhhccccc
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPRFK 1041 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r~H 1041 (1470)
|+|+.|++.|.++..|.+|++.|
T Consensus 1 y~C~~C~~~f~~~~~l~~H~~~H 23 (23)
T PF00096_consen 1 YKCPICGKSFSSKSNLKRHMRRH 23 (23)
T ss_dssp EEETTTTEEESSHHHHHHHHHHH
T ss_pred CCCCCCCCccCCHHHHHHHHhHC
Confidence 68999999999999999999765
No 47
>KOG2231 consensus Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]
Probab=94.95 E-value=0.026 Score=71.73 Aligned_cols=120 Identities=19% Similarity=0.180 Sum_probs=75.3
Q ss_pred cccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCC
Q 000479 882 YACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSP 961 (1470)
Q Consensus 882 y~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~ 961 (1470)
+.|.+|++.|.-... ...|..| -.|.+...|+.|+...| +.+.|..|-.
T Consensus 100 ~~C~~C~~~~~~~~~------------------~~~~~~c-~~~~s~~~Lk~H~~~~H------------~~~~c~lC~~ 148 (669)
T KOG2231|consen 100 HSCHICDRRFRALYN------------------KKECLHC-TEFKSVENLKNHMRDQH------------KLHLCSLCLQ 148 (669)
T ss_pred hhcCccccchhhhcc------------------cCCCccc-cchhHHHHHHHHHHHhh------------hhhccccccc
Confidence 468888877642211 2247788 77888888888887677 3566666644
Q ss_pred cccccC------ChhHHhhhhhhcC-Cccc----cccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCC------
Q 000479 962 KKLELG------YSASVENHSENLG-SIRK----FICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIR------ 1024 (1470)
Q Consensus 962 ~~k~F~------s~s~L~~H~r~Ht-geKp----ykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iC------ 1024 (1470)
..+.|. ....|..|+..-. +++. -.|..|...|.....|.+|+...| |.|.+|
T Consensus 149 ~~kif~~e~k~Yt~~el~~h~~~gd~d~~s~rGhp~C~~C~~~fld~~el~rH~~~~h----------~~chfC~~~~~~ 218 (669)
T KOG2231|consen 149 NLKIFINERKLYTRAELNLHLMFGDPDDESCRGHPLCKFCHERFLDDDELYRHLRFDH----------EFCHFCDYKTGQ 218 (669)
T ss_pred cceeeeeeeehehHHHHHHHHhcCCCccccccCCccchhhhhhhccHHHHHHhhccce----------eheeecCccccc
Confidence 443333 3566777765322 1222 358888888888888888833333 345555
Q ss_pred CccCCChhhhhhcccccc
Q 000479 1025 FYAYKLKSGRLSRPRFKK 1042 (1470)
Q Consensus 1025 gKsF~~ks~L~~H~r~Ht 1042 (1470)
+.-|..-..|..|.|.+|
T Consensus 219 neyy~~~~dLe~HfR~~H 236 (669)
T KOG2231|consen 219 NEYYNDYDDLEEHFRKGH 236 (669)
T ss_pred chhcccchHHHHHhhhcC
Confidence 344677777888877666
No 48
>PF13912 zf-C2H2_6: C2H2-type zinc finger; PDB: 1JN7_A 1FU9_A 2L1O_A 1NJQ_A 2EN8_A 2EMM_A 1FV5_A 1Y0J_B 2L6Z_B.
Probab=94.58 E-value=0.015 Score=43.68 Aligned_cols=23 Identities=9% Similarity=-0.159 Sum_probs=11.2
Q ss_pred cccCCCCccCCChhhhhhccccc
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPRFK 1041 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r~H 1041 (1470)
|.|..|++.|.+...|..|++.|
T Consensus 2 ~~C~~C~~~F~~~~~l~~H~~~h 24 (27)
T PF13912_consen 2 FECDECGKTFSSLSALREHKRSH 24 (27)
T ss_dssp EEETTTTEEESSHHHHHHHHCTT
T ss_pred CCCCccCCccCChhHHHHHhHHh
Confidence 44444444444444444444444
No 49
>PF13894 zf-C2H2_4: C2H2-type zinc finger; PDB: 2ELX_A 2EPP_A 2DLK_A 1X6H_A 2EOU_A 2EMB_A 2GQJ_A 2CSH_A 2WBT_B 2ELM_A ....
Probab=94.52 E-value=0.016 Score=41.52 Aligned_cols=19 Identities=42% Similarity=0.849 Sum_probs=6.6
Q ss_pred cccccccccChhhhhhhhh
Q 000479 884 CAICLDSFTNKKVLESHVQ 902 (1470)
Q Consensus 884 C~~CgKsF~sks~L~~H~r 902 (1470)
|++|++.|.+...|..|++
T Consensus 3 C~~C~~~~~~~~~l~~H~~ 21 (24)
T PF13894_consen 3 CPICGKSFRSKSELRQHMR 21 (24)
T ss_dssp -SSTS-EESSHHHHHHHHH
T ss_pred CcCCCCcCCcHHHHHHHHH
Confidence 3333333333333333333
No 50
>COG5048 FOG: Zn-finger [General function prediction only]
Probab=94.45 E-value=0.035 Score=66.43 Aligned_cols=62 Identities=11% Similarity=0.097 Sum_probs=42.2
Q ss_pred ccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccCCCccccCcCCC
Q 000479 989 FCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRNR 1054 (1470)
Q Consensus 989 ~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~k 1054 (1470)
.|-..+.....+..| ...|.... ...+.+..|.+.|.....|..|++.|....++.|..++.
T Consensus 393 ~~~~~~~~~~~~~~~-~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 454 (467)
T COG5048 393 SCIRNFKRDSNLSLH-IITHLSFR---PYNCKNPPCSKSFNRHYNLIPHKKIHTNHAPLLCSILKS 454 (467)
T ss_pred chhhhhccccccccc-cccccccC---CcCCCCCcchhhccCcccccccccccccCCceeeccccc
Confidence 366667777777777 66665511 235677778888888888888888888776666655544
No 51
>PF13894 zf-C2H2_4: C2H2-type zinc finger; PDB: 2ELX_A 2EPP_A 2DLK_A 1X6H_A 2EOU_A 2EMB_A 2GQJ_A 2CSH_A 2WBT_B 2ELM_A ....
Probab=93.81 E-value=0.034 Score=39.80 Aligned_cols=19 Identities=37% Similarity=0.857 Sum_probs=9.6
Q ss_pred cccCccCCccCChhhHhHh
Q 000479 985 FICRFCGLKFDLLPDLGRH 1003 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH 1003 (1470)
|.|++|++.|.+...|.+|
T Consensus 1 ~~C~~C~~~~~~~~~l~~H 19 (24)
T PF13894_consen 1 FQCPICGKSFRSKSELRQH 19 (24)
T ss_dssp EE-SSTS-EESSHHHHHHH
T ss_pred CCCcCCCCcCCcHHHHHHH
Confidence 4455555555555555555
No 52
>PF13912 zf-C2H2_6: C2H2-type zinc finger; PDB: 1JN7_A 1FU9_A 2L1O_A 1NJQ_A 2EN8_A 2EMM_A 1FV5_A 1Y0J_B 2L6Z_B.
Probab=93.64 E-value=0.034 Score=41.75 Aligned_cols=26 Identities=35% Similarity=0.686 Sum_probs=22.6
Q ss_pred ccccCccCCccCChhhHhHhhhhhccC
Q 000479 984 KFICRFCGLKFDLLPDLGRHHQAAHMG 1010 (1470)
Q Consensus 984 pykC~~CGKsF~sks~LkrH~~rvHtg 1010 (1470)
||+|..|++.|.+...|..| ++.|.+
T Consensus 1 ~~~C~~C~~~F~~~~~l~~H-~~~h~~ 26 (27)
T PF13912_consen 1 PFECDECGKTFSSLSALREH-KRSHCS 26 (27)
T ss_dssp SEEETTTTEEESSHHHHHHH-HCTTTT
T ss_pred CCCCCccCCccCChhHHHHH-hHHhcC
Confidence 68999999999999999999 777753
No 53
>KOG1146 consensus Homeobox protein [General function prediction only]
Probab=93.51 E-value=0.041 Score=73.41 Aligned_cols=159 Identities=14% Similarity=0.089 Sum_probs=101.5
Q ss_pred cccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcc
Q 000479 884 CAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKK 963 (1470)
Q Consensus 884 C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~ 963 (1470)
|..|+..+..+..+..|+..-+... +.|+|+.|+..|+....|-.||+..|..- .. ..|
T Consensus 439 ~~~~e~~~~s~r~~~~~t~~L~S~~-----kt~~cpkc~~~yk~a~~L~vhmRskhp~~---------~~---~~c---- 497 (1406)
T KOG1146|consen 439 LTKAEPLLESKRSLEGQTVVLHSFF-----KTLKCPKCNWHYKLAQTLGVHMRSKHPES---------QS---AYC---- 497 (1406)
T ss_pred ccchhhhhhhhcccccceeeeeccc-----ccccCCccchhhhhHHHhhhccccccccc---------ch---hHh----
Confidence 5556666666666766666544433 36788888888888888888887767532 00 222
Q ss_pred cccCChhHHhhhhhh------cCCccccccCccCCccCChhhHhHhhhhh-ccCC-------------------------
Q 000479 964 LELGYSASVENHSEN------LGSIRKFICRFCGLKFDLLPDLGRHHQAA-HMGP------------------------- 1011 (1470)
Q Consensus 964 k~F~s~s~L~~H~r~------HtgeKpykC~~CGKsF~sks~LkrH~~rv-Htge------------------------- 1011 (1470)
..-+.|.+. -.+.++|.|..|..+|..+..|..|++.. |..+
T Consensus 498 ------~~gq~~~~~arg~~~~~~~~p~~C~~C~~stttng~LsihlqS~~h~~~lee~~~~~g~~v~~~~~~v~s~~P~ 571 (1406)
T KOG1146|consen 498 ------KAGQNHPRLARGEVYRCPGKPYPCRACNYSTTTNGNLSIHLQSDLHRNELEEAEENAGEQVRLLPASVTSAVPE 571 (1406)
T ss_pred ------HhccccccccccccccCCCCcccceeeeeeeecchHHHHHHHHHhhHHHHHHHHhccccchhhhhhhhcccCcc
Confidence 011112211 12337888888999998888888885432 2110
Q ss_pred ----------C-CCCCCCcccCCCCccCCChhhhhhcc-ccccCCCccccCcCCCcCCChHHHHhhcCCC
Q 000479 1012 ----------N-LVNSRPHKKGIRFYAYKLKSGRLSRP-RFKKGLGAVSYRIRNRGAAGMKKRIQTLKPL 1069 (1470)
Q Consensus 1012 ----------~-~~~eKpykC~iCgKsF~~ks~L~~H~-r~Htgekpy~C~~C~ksF~~~~~L~kHkksh 1069 (1470)
. ....-.+.|.+|++--.-..+|+.|| ..|+-.-|.-|-.|+-.+..-..+..+.+-+
T Consensus 572 ~ag~~~~ags~~pktkP~~~C~vc~yetniarnlrihmtss~~s~~p~~~Lq~~it~~l~~~~~~~~~lp 641 (1406)
T KOG1146|consen 572 EAGLGPSAGSSGPKTKPSWRCEVCSYETNIARNLRIHMTASPSSSPPSLVLQQNITSSLASLLGGQGRLP 641 (1406)
T ss_pred cccCCCCCCCCCCCCCCCcchhhhcchhhhhhccccccccCCCCCChHHHhhhcchhhccccccCcCCCC
Confidence 0 11134689999999999999999999 4444444477777877777666666666643
No 54
>KOG1146 consensus Homeobox protein [General function prediction only]
Probab=92.76 E-value=0.033 Score=74.28 Aligned_cols=151 Identities=14% Similarity=0.098 Sum_probs=99.7
Q ss_pred ccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchh-----------------
Q 000479 849 KCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVE----------------- 911 (1470)
Q Consensus 849 kC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e----------------- 911 (1470)
.|..|+..+.+...+.-|+...|.-. +.|.|+.|+..|+....|..|||..|.+....
T Consensus 438 e~~~~e~~~~s~r~~~~~t~~L~S~~-----kt~~cpkc~~~yk~a~~L~vhmRskhp~~~~~~c~~gq~~~~~arg~~~ 512 (1406)
T KOG1146|consen 438 ELTKAEPLLESKRSLEGQTVVLHSFF-----KTLKCPKCNWHYKLAQTLGVHMRSKHPESQSAYCKAGQNHPRLARGEVY 512 (1406)
T ss_pred cccchhhhhhhhcccccceeeeeccc-----ccccCCccchhhhhHHHhhhcccccccccchhHhHhccccccccccccc
Confidence 35566777777777777766666655 88888888888888888888888854432110
Q ss_pred --hcccccccCCCCCCCChhhhhhhhhhc-cccc--------------------------------ccchhhhhcccccc
Q 000479 912 --QCMLQQCIPCGSHFGNTEELWLHVQSV-HAID--------------------------------FKMSEVAQQHNQSV 956 (1470)
Q Consensus 912 --~~kpfkC~~CgK~F~sks~L~~H~k~~-Hsge--------------------------------f~~~~~~~~k~~~C 956 (1470)
.-++|.|..|...+..+.+|..|++.. |..+ .........-.+.|
T Consensus 513 ~~~~~p~~C~~C~~stttng~LsihlqS~~h~~~lee~~~~~g~~v~~~~~~v~s~~P~~ag~~~~ags~~pktkP~~~C 592 (1406)
T KOG1146|consen 513 RCPGKPYPCRACNYSTTTNGNLSIHLQSDLHRNELEEAEENAGEQVRLLPASVTSAVPEEAGLGPSAGSSGPKTKPSWRC 592 (1406)
T ss_pred cCCCCcccceeeeeeeecchHHHHHHHHHhhHHHHHHHHhccccchhhhhhhhcccCcccccCCCCCCCCCCCCCCCcch
Confidence 126899999999999999999998743 3211 00000112224667
Q ss_pred ccCCCcccccCChhHHhhhhhh-cCCccccccCccCCccCChhhHhHhhhhhc
Q 000479 957 GEDSPKKLELGYSASVENHSEN-LGSIRKFICRFCGLKFDLLPDLGRHHQAAH 1008 (1470)
Q Consensus 957 ~~Cp~~~k~F~s~s~L~~H~r~-HtgeKpykC~~CGKsF~sks~LkrH~~rvH 1008 (1470)
..| +....-...|..|+.. |+-..|.-|..|+-.+.....+..| .+.|
T Consensus 593 ~vc---~yetniarnlrihmtss~~s~~p~~~Lq~~it~~l~~~~~~~-~~lp 641 (1406)
T KOG1146|consen 593 EVC---SYETNIARNLRIHMTASPSSSPPSLVLQQNITSSLASLLGGQ-GRLP 641 (1406)
T ss_pred hhh---cchhhhhhccccccccCCCCCChHHHhhhcchhhccccccCc-CCCC
Confidence 777 5555555567777643 4444457777888888777777777 6666
No 55
>PF09237 GAGA: GAGA factor; InterPro: IPR015318 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Members of this entry bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. 
The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 1YUI_A 1YUJ_A.
Probab=92.51 E-value=0.05 Score=47.21 Aligned_cols=31 Identities=6% Similarity=-0.108 Sum_probs=16.6
Q ss_pred CCCcccCCCCccCCChhhhhhccccccCCCc
Q 000479 1016 SRPHKKGIRFYAYKLKSGRLSRPRFKKGLGA 1046 (1470)
Q Consensus 1016 eKpykC~iCgKsF~~ks~L~~H~r~Htgekp 1046 (1470)
+.|..|++|+..+.+..+|++|+..+++.||
T Consensus 22 ~~PatCP~C~a~~~~srnLrRHle~~H~~k~ 52 (54)
T PF09237_consen 22 EQPATCPICGAVIRQSRNLRRHLEIRHFKKP 52 (54)
T ss_dssp S--EE-TTT--EESSHHHHHHHHHHHTTTS-
T ss_pred CCCCCCCcchhhccchhhHHHHHHHHhcccC
Confidence 5666777777777777777777766655554
No 56
>smart00355 ZnF_C2H2 zinc finger.
Probab=92.11 E-value=0.07 Score=38.57 Aligned_cols=23 Identities=30% Similarity=0.600 Sum_probs=12.3
Q ss_pred cccCccCCccCChhhHhHhhhhhc
Q 000479 985 FICRFCGLKFDLLPDLGRHHQAAH 1008 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH~~rvH 1008 (1470)
|+|+.|++.|.....|..| ++.|
T Consensus 1 ~~C~~C~~~f~~~~~l~~H-~~~H 23 (26)
T smart00355 1 YRCPECGKVFKSKSALKEH-MRTH 23 (26)
T ss_pred CCCCCCcchhCCHHHHHHH-HHHh
Confidence 3455555555555555555 3344
No 57
>COG5236 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=91.82 E-value=0.095 Score=60.90 Aligned_cols=126 Identities=19% Similarity=0.180 Sum_probs=79.2
Q ss_pred cccccC--CCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCC------hhHHhhhhhhcCCcccc-
Q 000479 915 LQQCIP--CGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGY------SASVENHSENLGSIRKF- 985 (1470)
Q Consensus 915 pfkC~~--CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s------~s~L~~H~r~HtgeKpy- 985 (1470)
.|.|+. |.........|+.|.+..|. .+.|.+|....+.|.. ++.|..|...-..+.-|
T Consensus 151 ~F~CP~skc~~~C~~~k~lk~H~K~~H~------------~~~C~~C~~nKk~F~~E~~lF~~~~Lr~H~~~G~~e~GFK 218 (493)
T COG5236 151 SFKCPKSKCHRRCGSLKELKKHYKAQHG------------FVLCSECIGNKKDFWNEIRLFRSSTLRDHKNGGLEEEGFK 218 (493)
T ss_pred HhcCCchhhhhhhhhHHHHHHHHHhhcC------------cEEhHhhhcCcccCccceeeeecccccccccCCccccCcC
Confidence 467764 66666667888889887774 6778888655555553 45666676443333222
Q ss_pred ---ccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCcc-------CCChhhhhhccccccCCCccccCc--C-
Q 000479 986 ---ICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYA-------YKLKSGRLSRPRFKKGLGAVSYRI--R- 1052 (1470)
Q Consensus 986 ---kC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKs-------F~~ks~L~~H~r~Htgekpy~C~~--C- 1052 (1470)
.|..|.+.|-.-..|.+|.+..|.. |.+|++. |+.-..|..|.+.-+ |.|.. |
T Consensus 219 GHP~C~FC~~~FYdDDEL~~HcR~~HE~----------ChICD~v~p~~~QYFK~Y~~Le~HF~~~h----y~ct~qtc~ 284 (493)
T COG5236 219 GHPLCIFCKIYFYDDDELRRHCRLRHEA----------CHICDMVGPIRYQYFKSYEDLEAHFRNAH----YCCTFQTCR 284 (493)
T ss_pred CCchhhhccceecChHHHHHHHHhhhhh----------hhhhhccCccchhhhhCHHHHHHHhhcCc----eEEEEEEEe
Confidence 5888888888888888885555543 6666654 667777777764322 44422 2
Q ss_pred ---CCcCCChHHHHhhc
Q 000479 1053 ---NRGAAGMKKRIQTL 1066 (1470)
Q Consensus 1053 ---~ksF~~~~~L~kHk 1066 (1470)
-..|.....|..|.
T Consensus 285 ~~k~~vf~~~~el~~h~ 301 (493)
T COG5236 285 VGKCYVFPYHTELLEHL 301 (493)
T ss_pred cCcEEEeccHHHHHHHH
Confidence 22455666666664
No 58
>COG5048 FOG: Zn-finger [General function prediction only]
Probab=91.78 E-value=0.13 Score=61.52 Aligned_cols=169 Identities=15% Similarity=0.202 Sum_probs=110.4
Q ss_pred CccccCcCCccccChhHHhhhhhh--ccccchhcccCccccc--cccccccChhhhhhhhhhhccccchhhcccccccC-
Q 000479 846 KTHKCKICSQVFLHDQELGVHWMD--NHKKEAQWLFRGYACA--ICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIP- 920 (1470)
Q Consensus 846 kpfkC~~CgK~F~s~s~L~~H~~~--~Ht~e~~~~~Kpy~C~--~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~- 920 (1470)
.++.|..|...|.....|..| .. .|..+. .+++.|+ .|++.|.....+..|...|.+.. ++.|..
T Consensus 288 ~~~~~~~~~~~~s~~~~l~~~-~~~~~h~~~~---~~~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~------~~~~~~~ 357 (467)
T COG5048 288 LPIKSKQCNISFSRSSPLTRH-LRSVNHSGES---LKPFSCPYSLCGKLFSRNDALKRHILLHTSIS------PAKEKLL 357 (467)
T ss_pred cCCCCccccCCcccccccccc-cccccccccc---CCceeeeccCCCccccccccccCCcccccCCC------ccccccc
Confidence 578999999999999999999 55 787762 2689999 79999999999999999999887 555644
Q ss_pred -CCCCCCChhhhhhhhhhcccccccchhhhhccccccccCCCcccccCChhHHhhhhhhcCCcc--ccccCccCCccCCh
Q 000479 921 -CGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVGEDSPKKLELGYSASVENHSENLGSIR--KFICRFCGLKFDLL 997 (1470)
Q Consensus 921 -CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~~Cp~~~k~F~s~s~L~~H~r~HtgeK--pykC~~CGKsF~sk 997 (1470)
|.+.+.....-..+.. .+... .......+.+..- .+...+.....+..|...|...+ .+.|..|.+.|...
T Consensus 358 ~~~~~~~~~~~~~~~~~-~~~~~----~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 431 (467)
T COG5048 358 NSSSKFSPLLNNEPPQS-LQQYK----DLKNDKKSETLSN-SCIRNFKRDSNLSLHIITHLSFRPYNCKNPPCSKSFNRH 431 (467)
T ss_pred cCccccccccCCCCccc-hhhcc----CccCCcccccccc-chhhhhccccccccccccccccCCcCCCCCcchhhccCc
Confidence 5554444333221111 01000 0001122222221 22455566667777777777665 56788999999999
Q ss_pred hhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhc
Q 000479 998 PDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSR 1037 (1470)
Q Consensus 998 s~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H 1037 (1470)
..|..| ++.|.. ..++.|..+. .|.....+..|
T Consensus 432 ~~~~~~-~~~~~~-----~~~~~~~~~~-~~~~~~~~~~~ 464 (467)
T COG5048 432 YNLIPH-KKIHTN-----HAPLLCSILK-SFRRDLDLSNH 464 (467)
T ss_pred cccccc-cccccc-----CCceeecccc-ccchhhhhhcc
Confidence 999999 888876 4455554443 44444444433
No 59
>PF01352 KRAB: KRAB box; InterPro: IPR001909 The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that is found in the N-terminal part of about one third of eukaryotic Krueppel-type C2H2 zinc finger proteins (ZFPs). It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box. The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin. KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.; GO: 0003676 nucleic acid binding, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 1V65_A.
Probab=91.33 E-value=0.059 Score=45.13 Aligned_cols=29 Identities=21% Similarity=0.164 Sum_probs=17.7
Q ss_pred cchhHHHHHHhhccCChhhhhccchhhhhhhhCH
Q 000479 732 IISKEVFLELLKDCCSLEQKLHLHLACELFYKLL 765 (1470)
Q Consensus 732 ltfkDV~v~flk~c~S~EEk~~Lc~~C~k~F~~~ 765 (1470)
++|+||+++| |+|||.+|.+.++.+|++.
T Consensus 1 Vtf~Dvav~f-----s~eEW~~L~~~Qk~ly~dv 29 (41)
T PF01352_consen 1 VTFEDVAVYF-----SQEEWELLDPAQKNLYRDV 29 (41)
T ss_dssp ------TT--------HHHHHTS-HHHHHHHHHH
T ss_pred CeEEEEEEEc-----ChhhcccccceecccchhH
Confidence 5899999999 9999999999999999874
No 60
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=91.11 E-value=0.064 Score=66.79 Aligned_cols=89 Identities=34% Similarity=0.527 Sum_probs=60.7
Q ss_pred cceecc-CCCCCCCCCCCCccccccceeeEEEEeccCccccceecccccCCcEEEEeeccccCHHHHHHHHhhccCCC--
Q 000479 1353 YLIYEC-NHMCSCDRTCPNRVLQNGVRVKLEVFKTENKGWAVRAGQAILRGTFVCEYIGEVLDELETNKRRSRYGRDG-- 1429 (1470)
Q Consensus 1353 ~~IyEC-n~~C~C~~~C~NRvvQ~g~~~~LeVfrT~~kGwGVra~~~I~~G~FI~EYvGEvIt~~ea~~R~~~y~~~~-- 1429 (1470)
...+|| +..|.+...|.|+..-...... -.+ .+..+|.+| +|++|+..+...|...-....
T Consensus 287 ~~~~~~~p~~~~~~~~~~~~~~sk~~~~e-------~~~---~~~~~~~k~------vg~~i~~~e~~~~~~~~~~~~~~ 350 (463)
T KOG1081|consen 287 MLAYEVHPKVCSAEERCHNQQFSKESYPE-------PQK---TAKADIRKG------VGEVIDDKECKARLQRVKESDLV 350 (463)
T ss_pred hhhhhhcccccccccccccchhhhhcccc-------cch---hhHHhhhcc------cCcccchhhheeehhhhhccchh
Confidence 345565 7889999899998665443332 222 788888888 999999999887764422221
Q ss_pred CcEEEEeCccccccccccCCCCcEEEeccCCCCeeecccCC
Q 000479 1430 CGYMLNIGAHINDMGRLIEGQVRYVIDATKYGNVSRFINHR 1470 (1470)
Q Consensus 1430 ~~Ylf~l~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHS 1470 (1470)
.-|+..+.. ..+||+..+||.+||+|||
T Consensus 351 ~~~~~~~e~-------------~~~id~~~~~n~sr~~nh~ 378 (463)
T KOG1081|consen 351 DFYMVFIQK-------------DRIIDAGPKGNYSRFLNHS 378 (463)
T ss_pred hhhhhhhhc-------------ccccccccccchhhhhccc
Confidence 223222222 1289999999999999997
No 61
>PRK04860 hypothetical protein; Provisional
Probab=90.86 E-value=0.094 Score=56.49 Aligned_cols=37 Identities=14% Similarity=0.143 Sum_probs=23.0
Q ss_pred ccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCC
Q 000479 984 KFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKL 1030 (1470)
Q Consensus 984 pykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ 1030 (1470)
+|.|. |++ ....+++| .++|++ +++|.|..|+..|..
T Consensus 119 ~Y~C~-C~~---~~~~~rrH-~ri~~g-----~~~YrC~~C~~~l~~ 155 (160)
T PRK04860 119 PYRCK-CQE---HQLTVRRH-NRVVRG-----EAVYRCRRCGETLVF 155 (160)
T ss_pred EEEcC-CCC---eeCHHHHH-HHHhcC-----CccEECCCCCceeEE
Confidence 56665 665 55556666 666666 566666666666544
No 62
>smart00355 ZnF_C2H2 zinc finger.
Probab=90.63 E-value=0.16 Score=36.64 Aligned_cols=24 Identities=17% Similarity=-0.100 Sum_probs=21.3
Q ss_pred cccCCCCccCCChhhhhhcccccc
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPRFKK 1042 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r~Ht 1042 (1470)
|.|..|++.|.....|..|++.|.
T Consensus 1 ~~C~~C~~~f~~~~~l~~H~~~H~ 24 (26)
T smart00355 1 YRCPECGKVFKSKSALKEHMRTHX 24 (26)
T ss_pred CCCCCCcchhCCHHHHHHHHHHhc
Confidence 679999999999999999998775
No 63
>PRK04860 hypothetical protein; Provisional
Probab=90.59 E-value=0.12 Score=55.63 Aligned_cols=36 Identities=11% Similarity=-0.039 Sum_probs=18.4
Q ss_pred CcccCCCCccCCChhhhhhccccccCCCccccCcCCCcCC
Q 000479 1018 PHKKGIRFYAYKLKSGRLSRPRFKKGLGAVSYRIRNRGAA 1057 (1470)
Q Consensus 1018 pykC~iCgKsF~~ks~L~~H~r~Htgekpy~C~~C~ksF~ 1057 (1470)
+|.|. |++ ....+++|.++|+++++|.|..|++.|.
T Consensus 119 ~Y~C~-C~~---~~~~~rrH~ri~~g~~~YrC~~C~~~l~ 154 (160)
T PRK04860 119 PYRCK-CQE---HQLTVRRHNRVVRGEAVYRCRRCGETLV 154 (160)
T ss_pred EEEcC-CCC---eeCHHHHHHHHhcCCccEECCCCCceeE
Confidence 45554 554 4444555555555555555555555443
No 64
>cd05162 PWWP The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors.
Probab=90.07 E-value=0.28 Score=47.28 Aligned_cols=60 Identities=18% Similarity=0.474 Sum_probs=47.7
Q ss_pred EEEEEecc-ccccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhccccccCCCc
Q 000479 157 ALWVKWRG-KWQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRSINEFPQ 220 (1470)
Q Consensus 157 ~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 220 (1470)
-+|+|.+| -|--|+-+...+.+... .+......|.|.||+ +++|.||+---|.+..++-.
T Consensus 6 lVwaK~~g~pwWPa~V~~~~~~~~~~---~~~~~~~~~~V~Ffg-~~~~~wv~~~~l~pf~~~~~ 66 (87)
T cd05162 6 LVWAKMKGYPWWPALVVDPPKDSKKA---KKKAKEGKVLVLFFG-DKTFAWVGAERLKPFTEHKE 66 (87)
T ss_pred EEEEeCCCCCCCCEEEccccccchhh---hccCCCCEEEEEEeC-CCcEEEeCccceeeccchHH
Confidence 48999999 78888888777776543 233345689999999 99999999999988887653
No 65
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=90.00 E-value=0.14 Score=44.97 Aligned_cols=25 Identities=32% Similarity=0.777 Sum_probs=22.4
Q ss_pred cceeccCCCCCCCCCCCCccccccc
Q 000479 1353 YLIYECNHMCSCDRTCPNRVLQNGV 1377 (1470)
Q Consensus 1353 ~~IyECn~~C~C~~~C~NRvvQ~g~ 1377 (1470)
.+.+||+..|+|+..|.||.+|+..
T Consensus 26 ~l~~EC~~~C~~G~~C~NqrFqk~~ 50 (51)
T smart00570 26 MLLIECSSDCPCGSYCSNQRFQKRQ 50 (51)
T ss_pred HHhhhcCCCCCCCcCccCcccccCc
Confidence 4678999999999999999999864
No 66
>cd05840 SPBC215_ISWI_like The PWWP domain is a component of the S. pombe hypothetical protein SPBC215, as well as ISWI complex protein 4. The ISWI (imitation switch) proteins are ATPases responsible for chromatin remodeling in eukaryotes, and SPBC215 is proposed to also bind chromatin. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.
Probab=88.66 E-value=0.32 Score=47.87 Aligned_cols=59 Identities=24% Similarity=0.430 Sum_probs=48.9
Q ss_pred EEEEEeccc-cccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhcccccc
Q 000479 157 ALWVKWRGK-WQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRSIN 216 (1470)
Q Consensus 157 ~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 216 (1470)
-+|.|-+|- |=-|+=|...+-|-.-|++++......|.|.||+. ++|.|++--.+.+..
T Consensus 6 lVwaK~~GyPwWPA~V~~~~~~p~~~l~~~~~~~~~~~~V~FFg~-~~~~Wv~~~~l~pl~ 65 (93)
T cd05840 6 RVLAKVKGFPAWPAIVVPEEMLPDSVLKGKKKKNKRTYPVMFFPD-GDYYWVPNKDLKPLT 65 (93)
T ss_pred EEEEeCCCCCCCCEEECChHHCCHHHHhcccCCCCCeEEEEEeCC-CcEEEEChhhcccCC
Confidence 389999994 66777777777888888888888899999999995 699999887777665
No 67
>COG5236 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=88.35 E-value=0.38 Score=56.13 Aligned_cols=134 Identities=24% Similarity=0.298 Sum_probs=84.1
Q ss_pred hhhhhhhCHHHHHhcccccccccccccccCccccccCChHhhhhhhhcCCcccCCCCCCCCCcccccccccccccchhhH
Q 000479 757 ACELFYKLLKSILSLRNPVPMEIQFQWALSEASKDAGIGEFLMKLVCCEKERLSKTWGFDANENAHVSSSVVEDSAVLPL 836 (1470)
Q Consensus 757 ~C~k~F~~~~sL~sH~rsH~~ek~~~~kC~eC~K~F~s~~~L~k~iHtek~y~C~~CgF~~~s~~~~~s~~~e~s~~~L~ 836 (1470)
.|.........|..|.+..++. +.|.+|-+ ..+.|.|.+=-|+.. .|.
T Consensus 158 kc~~~C~~~k~lk~H~K~~H~~----~~C~~C~~-------------nKk~F~~E~~lF~~~---------------~Lr 205 (493)
T COG5236 158 KCHRRCGSLKELKKHYKAQHGF----VLCSECIG-------------NKKDFWNEIRLFRSS---------------TLR 205 (493)
T ss_pred hhhhhhhhHHHHHHHHHhhcCc----EEhHhhhc-------------CcccCccceeeeecc---------------ccc
Confidence 5666666677777777754431 25777722 122233322113221 255
Q ss_pred HHhhccCCCCc----cccCcCCccccChhHHhhhhhhccccchhcccCccccccccc-------cccChhhhhhhhhhhc
Q 000479 837 AIAGRSEDEKT----HKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLD-------SFTNKKVLESHVQERH 905 (1470)
Q Consensus 837 ~H~r~H~gekp----fkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgK-------sF~sks~L~~H~r~Hh 905 (1470)
.|...-..+.- -.|..|...|.+...|..|++..|. .|.+|++ -|.+...|..|.+.-|
T Consensus 206 ~H~~~G~~e~GFKGHP~C~FC~~~FYdDDEL~~HcR~~HE----------~ChICD~v~p~~~QYFK~Y~~Le~HF~~~h 275 (493)
T COG5236 206 DHKNGGLEEEGFKGHPLCIFCKIYFYDDDELRRHCRLRHE----------ACHICDMVGPIRYQYFKSYEDLEAHFRNAH 275 (493)
T ss_pred ccccCCccccCcCCCchhhhccceecChHHHHHHHHhhhh----------hhhhhhccCccchhhhhCHHHHHHHhhcCc
Confidence 55544333322 3599999999999999999877773 3666665 3888889999977544
Q ss_pred cccchhhcccccccC--CC----CCCCChhhhhhhhhhccccc
Q 000479 906 HVQFVEQCMLQQCIP--CG----SHFGNTEELWLHVQSVHAID 942 (1470)
Q Consensus 906 ~ek~~e~~kpfkC~~--Cg----K~F~sks~L~~H~k~~Hsge 942 (1470)
|.|.. |- ..|.....|+.|+.+.|...
T Consensus 276 ----------y~ct~qtc~~~k~~vf~~~~el~~h~~~~h~~~ 308 (493)
T COG5236 276 ----------YCCTFQTCRVGKCYVFPYHTELLEHLTRFHKVN 308 (493)
T ss_pred ----------eEEEEEEEecCcEEEeccHHHHHHHHHHHhhcc
Confidence 33432 21 36889999999998888643
No 68
>PF13909 zf-H2C2_5: C2H2-type zinc-finger domain; PDB: 1X5W_A.
Probab=87.30 E-value=0.19 Score=36.82 Aligned_cols=22 Identities=18% Similarity=-0.055 Sum_probs=10.3
Q ss_pred cccCCCCccCCChhhhhhccccc
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPRFK 1041 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r~H 1041 (1470)
|+|+.|++... +..|.+|++.|
T Consensus 1 y~C~~C~y~t~-~~~l~~H~~~~ 22 (24)
T PF13909_consen 1 YKCPHCSYSTS-KSNLKRHLKRH 22 (24)
T ss_dssp EE-SSSS-EES-HHHHHHHHHHH
T ss_pred CCCCCCCCcCC-HHHHHHHHHhh
Confidence 44555555554 55555555443
No 69
>PF09237 GAGA: GAGA factor; InterPro: IPR015318 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Members of this entry bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. 
The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 1YUI_A 1YUJ_A.
Probab=87.15 E-value=0.27 Score=42.87 Aligned_cols=29 Identities=24% Similarity=0.500 Sum_probs=16.2
Q ss_pred CccccccccccccChhhhhhhhhhhcccc
Q 000479 880 RGYACAICLDSFTNKKVLESHVQERHHVQ 908 (1470)
Q Consensus 880 Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek 908 (1470)
.|..|++|+..+.+..+|++|+..+|+.+
T Consensus 23 ~PatCP~C~a~~~~srnLrRHle~~H~~k 51 (54)
T PF09237_consen 23 QPATCPICGAVIRQSRNLRRHLEIRHFKK 51 (54)
T ss_dssp --EE-TTT--EESSHHHHHHHHHHHTTTS
T ss_pred CCCCCCcchhhccchhhHHHHHHHHhccc
Confidence 56667777777777777777776666654
No 70
>KOG4173 consensus Alpha-SNAP protein [Intracellular trafficking, secretion, and vesicular transport]
Probab=86.07 E-value=0.16 Score=55.65 Aligned_cols=91 Identities=23% Similarity=0.325 Sum_probs=66.8
Q ss_pred Ccccccc--ccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhcccccccchhhhhccccccc
Q 000479 880 RGYACAI--CLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHAIDFKMSEVAQQHNQSVG 957 (1470)
Q Consensus 880 Kpy~C~~--CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hsgef~~~~~~~~k~~~C~ 957 (1470)
..|.|++ |...|........|..+.|+.. |..|.+.|.+...|..|+...|..-|.
T Consensus 78 ~~~~cqvagc~~~~d~lD~~E~hY~~~h~~s---------Cs~C~r~~Pt~hLLd~HI~E~HDs~Fq------------- 135 (253)
T KOG4173|consen 78 PAFACQVAGCCQVFDALDDYEHHYHTLHGNS---------CSFCKRAFPTGHLLDAHILEWHDSLFQ------------- 135 (253)
T ss_pred ccccccccchHHHHhhhhhHHHhhhhcccch---------hHHHHHhCCchhhhhHHHHHHHHHHHH-------------
Confidence 4477876 7778888888888887777764 999999999998888888766642111
Q ss_pred cCCCcccccCChhHHhhhhhhcCCcccccc--CccCCccCChhhHhHhhhhhccC
Q 000479 958 EDSPKKLELGYSASVENHSENLGSIRKFIC--RFCGLKFDLLPDLGRHHQAAHMG 1010 (1470)
Q Consensus 958 ~Cp~~~k~F~s~s~L~~H~r~HtgeKpykC--~~CGKsF~sks~LkrH~~rvHtg 1010 (1470)
..+-.|.-.|+| ..|+..|++....+.|+.+.|.=
T Consensus 136 ------------------a~veRG~dMy~ClvEgCt~KFkT~r~RkdH~I~~Hk~ 172 (253)
T KOG4173|consen 136 ------------------ALVERGQDMYQCLVEGCTEKFKTSRDRKDHMIRMHKY 172 (253)
T ss_pred ------------------HHHHcCccHHHHHHHhhhhhhhhhhhhhhHHHHhccC
Confidence 112234456888 57999999999999997788864
No 71
>PF12874 zf-met: Zinc-finger of C2H2 type; PDB: 1ZU1_A 2KVG_A.
Probab=85.78 E-value=0.26 Score=36.20 Aligned_cols=21 Identities=10% Similarity=-0.042 Sum_probs=10.7
Q ss_pred cccCCCCccCCChhhhhhccc
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPR 1039 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r 1039 (1470)
|.|.+|++.|.+...|..|++
T Consensus 1 ~~C~~C~~~f~s~~~~~~H~~ 21 (25)
T PF12874_consen 1 FYCDICNKSFSSENSLRQHLR 21 (25)
T ss_dssp EEETTTTEEESSHHHHHHHHT
T ss_pred CCCCCCCCCcCCHHHHHHHHC
Confidence 345555555555555555553
No 72
>PF12874 zf-met: Zinc-finger of C2H2 type; PDB: 1ZU1_A 2KVG_A.
Probab=84.62 E-value=0.4 Score=35.23 Aligned_cols=21 Identities=33% Similarity=0.773 Sum_probs=9.9
Q ss_pred ccccccccccChhhhhhhhhh
Q 000479 883 ACAICLDSFTNKKVLESHVQE 903 (1470)
Q Consensus 883 ~C~~CgKsF~sks~L~~H~r~ 903 (1470)
.|.+|++.|.+...|..|++.
T Consensus 2 ~C~~C~~~f~s~~~~~~H~~s 22 (25)
T PF12874_consen 2 YCDICNKSFSSENSLRQHLRS 22 (25)
T ss_dssp EETTTTEEESSHHHHHHHHTT
T ss_pred CCCCCCCCcCCHHHHHHHHCc
Confidence 344444444444444444443
No 73
>KOG2785 consensus C2H2-type Zn-finger protein [General function prediction only]
Probab=84.25 E-value=0.87 Score=54.50 Aligned_cols=52 Identities=12% Similarity=-0.051 Sum_probs=38.5
Q ss_pred CCCcccCCCCccCCChhhhhhccccccCC-----------------------CccccCcCC---CcCCChHHHHhhcC
Q 000479 1016 SRPHKKGIRFYAYKLKSGRLSRPRFKKGL-----------------------GAVSYRIRN---RGAAGMKKRIQTLK 1067 (1470)
Q Consensus 1016 eKpykC~iCgKsF~~ks~L~~H~r~Htge-----------------------kpy~C~~C~---ksF~~~~~L~kHkk 1067 (1470)
.-|-.|-+|++.|.+-..-..||..|+|. .-+.|-.|+ +.|.++....+|+.
T Consensus 164 ~~Pt~CLfC~~~~k~~e~~~~HM~~~HgffIPdreYL~D~~GLl~YLgeKV~~~~~CL~CN~~~~~f~sleavr~HM~ 241 (390)
T KOG2785|consen 164 LIPTDCLFCDKKSKSLEENLKHMFKEHGFFIPDREYLTDEKGLLKYLGEKVGIGFICLFCNELGRPFSSLEAVRAHMR 241 (390)
T ss_pred cCCcceeecCCCcccHHHHHHHHhhccCCcCCchHhhhchhHHHHHHHHHhccCceEEEeccccCcccccHHHHHHHh
Confidence 34567777777777777777777666665 346677787 88888888888876
No 74
>PF11722 zf-TRM13_CCCH: CCCH zinc finger in TRM13 protein; InterPro: IPR021721 This domain is found at the N terminus of TRM13 methyltransferase proteins. It is presumed to be a zinc binding domain. ; GO: 0008168 methyltransferase activity
Probab=83.27 E-value=0.39 Score=37.99 Aligned_cols=29 Identities=28% Similarity=0.619 Sum_probs=27.0
Q ss_pred cccchhhhhcCceeeEeecCCceEEEEec
Q 000479 533 RQCTAFIESKGRQCVRWANEGDVYCCVHL 561 (1470)
Q Consensus 533 ~~c~a~~~~kgrqc~r~a~~~~~ycc~h~ 561 (1470)
-+|.-||+.|.|.|.=.+..|..||--|+
T Consensus 2 ~~C~f~l~~K~R~C~m~~~~g~~fC~~H~ 30 (31)
T PF11722_consen 2 GRCEFFLPRKKRFCKMTRKPGSRFCGEHM 30 (31)
T ss_pred CcceEECCccccccCCeecCcCCccccCC
Confidence 37999999999999999999999999885
No 75
>PF13909 zf-H2C2_5: C2H2-type zinc-finger domain; PDB: 1X5W_A.
Probab=83.20 E-value=0.51 Score=34.52 Aligned_cols=21 Identities=29% Similarity=0.536 Sum_probs=7.7
Q ss_pred ccccccccccChhhhhhhhhhh
Q 000479 883 ACAICLDSFTNKKVLESHVQER 904 (1470)
Q Consensus 883 ~C~~CgKsF~sks~L~~H~r~H 904 (1470)
+|+.|+.... +..|.+|++.|
T Consensus 2 ~C~~C~y~t~-~~~l~~H~~~~ 22 (24)
T PF13909_consen 2 KCPHCSYSTS-KSNLKRHLKRH 22 (24)
T ss_dssp E-SSSS-EES-HHHHHHHHHHH
T ss_pred CCCCCCCcCC-HHHHHHHHHhh
Confidence 3444443333 33444444433
No 76
>PF12171 zf-C2H2_jaz: Zinc-finger double-stranded RNA-binding; InterPro: IPR022755 This zinc finger is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation. This entry represents the multiple-adjacent-C2H2 zinc finger, JAZ.; PDB: 4DGW_A 1ZR9_A.
Probab=82.47 E-value=0.77 Score=34.71 Aligned_cols=22 Identities=0% Similarity=-0.277 Sum_probs=14.2
Q ss_pred cccCCCCccCCChhhhhhcccc
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPRF 1040 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r~ 1040 (1470)
|.|..|++.|.+...|..|++.
T Consensus 2 ~~C~~C~k~f~~~~~~~~H~~s 23 (27)
T PF12171_consen 2 FYCDACDKYFSSENQLKQHMKS 23 (27)
T ss_dssp CBBTTTTBBBSSHHHHHCCTTS
T ss_pred CCcccCCCCcCCHHHHHHHHcc
Confidence 5566666666666666666643
No 77
>KOG2482 consensus Predicted C2H2-type Zn-finger protein [Transcription]
Probab=82.39 E-value=1.1 Score=52.68 Aligned_cols=51 Identities=2% Similarity=-0.263 Sum_probs=36.2
Q ss_pred cccCCCCccCCChhhhhhccccccCC---------------------------CccccCcCCCcCCChHHHHhhcCCC
Q 000479 1019 HKKGIRFYAYKLKSGRLSRPRFKKGL---------------------------GAVSYRIRNRGAAGMKKRIQTLKPL 1069 (1470)
Q Consensus 1019 ykC~iCgKsF~~ks~L~~H~r~Htge---------------------------kpy~C~~C~ksF~~~~~L~kHkksh 1069 (1470)
-.|-+|...+-....|..||+.-+.- +...|-.|.-.|-....|..|+-.+
T Consensus 280 v~CLfC~~~~en~~~l~eHmk~vHe~Dl~Ki~sd~~Ln~YqrvrviNyiRkq~~~~~c~~cd~~F~~e~~l~~hm~e~ 357 (423)
T KOG2482|consen 280 VVCLFCTNFYENPVFLFEHMKIVHEFDLLKIQSDYSLNFYQRVRVINYIRKQKKKSRCAECDLSFWKEPGLLIHMVED 357 (423)
T ss_pred eEEEeeccchhhHHHHHHHHHHHHHhhHHhhccccccchhhhhhHHHHHHHHhhccccccccccccCcchhhhhcccc
Confidence 47888888888888888888432211 2334677888888888888887643
No 78
>KOG2785 consensus C2H2-type Zn-finger protein [General function prediction only]
Probab=81.01 E-value=2.6 Score=50.60 Aligned_cols=56 Identities=14% Similarity=0.001 Sum_probs=43.9
Q ss_pred cccccCccCCccCChhhHhHhhhhhccCCCC------------------CCCCCcccCCCC---ccCCChhhhhhccc
Q 000479 983 RKFICRFCGLKFDLLPDLGRHHQAAHMGPNL------------------VNSRPHKKGIRF---YAYKLKSGRLSRPR 1039 (1470)
Q Consensus 983 KpykC~~CGKsF~sks~LkrH~~rvHtge~~------------------~~eKpykC~iCg---KsF~~ks~L~~H~r 1039 (1470)
-|-.|-+|++.|.+...-.+| +..|.|--. ....-|.|-.|+ +.|.+-...+.||.
T Consensus 165 ~Pt~CLfC~~~~k~~e~~~~H-M~~~HgffIPdreYL~D~~GLl~YLgeKV~~~~~CL~CN~~~~~f~sleavr~HM~ 241 (390)
T KOG2785|consen 165 IPTDCLFCDKKSKSLEENLKH-MFKEHGFFIPDREYLTDEKGLLKYLGEKVGIGFICLFCNELGRPFSSLEAVRAHMR 241 (390)
T ss_pred CCcceeecCCCcccHHHHHHH-HhhccCCcCCchHhhhchhHHHHHHHHHhccCceEEEeccccCcccccHHHHHHHh
Confidence 357799999999999999999 555555110 014578899999 99999999999993
No 79
>PF12171 zf-C2H2_jaz: Zinc-finger double-stranded RNA-binding; InterPro: IPR022755 This zinc finger is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus []. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation. This entry represents the multiple-adjacent-C2H2 zinc finger, JAZ. ; PDB: 4DGW_A 1ZR9_A.
Probab=80.63 E-value=0.8 Score=34.63 Aligned_cols=21 Identities=24% Similarity=0.695 Sum_probs=10.1
Q ss_pred cccccccccccChhhhhhhhh
Q 000479 882 YACAICLDSFTNKKVLESHVQ 902 (1470)
Q Consensus 882 y~C~~CgKsF~sks~L~~H~r 902 (1470)
|.|..|++.|.+...|..|++
T Consensus 2 ~~C~~C~k~f~~~~~~~~H~~ 22 (27)
T PF12171_consen 2 FYCDACDKYFSSENQLKQHMK 22 (27)
T ss_dssp CBBTTTTBBBSSHHHHHCCTT
T ss_pred CCcccCCCCcCCHHHHHHHHc
Confidence 344444444444444444444
No 80
>KOG4173 consensus Alpha-SNAP protein [Intracellular trafficking, secretion, and vesicular transport]
Probab=75.47 E-value=0.87 Score=50.13 Aligned_cols=86 Identities=24% Similarity=0.541 Sum_probs=66.5
Q ss_pred CccccCc--CCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhccccchh----hccccccc
Q 000479 846 KTHKCKI--CSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHHVQFVE----QCMLQQCI 919 (1470)
Q Consensus 846 kpfkC~~--CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e----~~kpfkC~ 919 (1470)
..|.|++ |.+.|........|.-..|+. .|..|.+.|.+...|..|+..-|..-+.. +.-.|+|-
T Consensus 78 ~~~~cqvagc~~~~d~lD~~E~hY~~~h~~---------sCs~C~r~~Pt~hLLd~HI~E~HDs~Fqa~veRG~dMy~Cl 148 (253)
T KOG4173|consen 78 PAFACQVAGCCQVFDALDDYEHHYHTLHGN---------SCSFCKRAFPTGHLLDAHILEWHDSLFQALVERGQDMYQCL 148 (253)
T ss_pred ccccccccchHHHHhhhhhHHHhhhhcccc---------hhHHHHHhCCchhhhhHHHHHHHHHHHHHHHHcCccHHHHH
Confidence 3478887 888999988888885555654 49999999999999999987665431110 11268995
Q ss_pred --CCCCCCCChhhhhhhhhhccc
Q 000479 920 --PCGSHFGNTEELWLHVQSVHA 940 (1470)
Q Consensus 920 --~CgK~F~sks~L~~H~k~~Hs 940 (1470)
.|+..|.+...-+.|+-..|.
T Consensus 149 vEgCt~KFkT~r~RkdH~I~~Hk 171 (253)
T KOG4173|consen 149 VEGCTEKFKTSRDRKDHMIRMHK 171 (253)
T ss_pred HHhhhhhhhhhhhhhhHHHHhcc
Confidence 599999999999999988885
No 81
>KOG2893 consensus Zn finger protein [General function prediction only]
Probab=75.18 E-value=1.1 Score=50.04 Aligned_cols=46 Identities=24% Similarity=0.247 Sum_probs=35.5
Q ss_pred cCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhcc-cccc
Q 000479 987 CRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRP-RFKK 1042 (1470)
Q Consensus 987 C~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~-r~Ht 1042 (1470)
|-+|++.|....-|.+| ++ .|.|+|.+|.|...+--.|..|- ++|.
T Consensus 13 cwycnrefddekiliqh-qk---------akhfkchichkkl~sgpglsihcmqvhk 59 (341)
T KOG2893|consen 13 CWYCNREFDDEKILIQH-QK---------AKHFKCHICHKKLFSGPGLSIHCMQVHK 59 (341)
T ss_pred eeecccccchhhhhhhh-hh---------hccceeeeehhhhccCCCceeehhhhhh
Confidence 88888888888888888 32 45688888888888888888773 5654
No 82
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9, human SET7 (also called SET9), which methylates H3 on lysine 4 and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A strictly conserved tyrosine in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=74.78 E-value=1.8 Score=44.30 Aligned_cols=31 Identities=19% Similarity=0.140 Sum_probs=26.6
Q ss_pred cccceecccccCCcEEEEeeccccCHHHHHH
Q 000479 1390 GWAVRAGQAILRGTFVCEYIGEVLDELETNK 1420 (1470)
Q Consensus 1390 GwGVra~~~I~~G~FI~EYvGEvIt~~ea~~ 1420 (1470)
|+||+|..+|++|++|+++.+.+|+..++..
T Consensus 1 GrGl~At~dI~~Ge~I~~p~~~~~~~~~~~~ 31 (162)
T PF00856_consen 1 GRGLFATRDIKAGEVILIPRPAILTPDEVSP 31 (162)
T ss_dssp SEEEEESS-B-TTEEEEEESEEEEEHHHHHC
T ss_pred CEEEEECccCCCCCEEEEECcceEEehhhhh
Confidence 8999999999999999999999999887754
No 83
>KOG2482 consensus Predicted C2H2-type Zn-finger protein [Transcription]
Probab=74.07 E-value=2.2 Score=50.22 Aligned_cols=25 Identities=28% Similarity=0.569 Sum_probs=14.7
Q ss_pred ccccCCCCCCCChhhhhhhhhhccc
Q 000479 916 QQCIPCGSHFGNTEELWLHVQSVHA 940 (1470)
Q Consensus 916 fkC~~CgK~F~sks~L~~H~k~~Hs 940 (1470)
..|-.|....-+...|..||+.+|.
T Consensus 280 v~CLfC~~~~en~~~l~eHmk~vHe 304 (423)
T KOG2482|consen 280 VVCLFCTNFYENPVFLFEHMKIVHE 304 (423)
T ss_pred eEEEeeccchhhHHHHHHHHHHHHH
Confidence 3566666655556666666665554
No 84
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=69.39 E-value=4.2 Score=50.09 Aligned_cols=75 Identities=24% Similarity=0.270 Sum_probs=51.7
Q ss_pred ceeeEEEEecc--CccccceecccccCCcEEEEeeccc-cCHHHHHHHHhhccCCCCcEEEEeCccccccccccCCCCcE
Q 000479 1377 VRVKLEVFKTE--NKGWAVRAGQAILRGTFVCEYIGEV-LDELETNKRRSRYGRDGCGYMLNIGAHINDMGRLIEGQVRY 1453 (1470)
Q Consensus 1377 ~~~~LeVfrT~--~kGwGVra~~~I~~G~FI~EYvGEv-It~~ea~~R~~~y~~~~~~Ylf~l~~~~~~~~~~~~~~~~~ 1453 (1470)
+...|.|..+. ..|.||.+...|++|+--+-|+||+ ++.++ ...+..|+..+-.. +..-+
T Consensus 26 LP~~l~i~~Ssv~~~~lgV~s~~~i~~G~~FGP~~G~~~~~~~~--------~~~n~~y~W~I~~~---------d~~~~ 88 (396)
T KOG2461|consen 26 LPPELRIKPSSVPVTGLGVWSNASILPGTSFGPFEGEIIASIDS--------KSANNRYMWEIFSS---------DNGYE 88 (396)
T ss_pred CCCceEeeccccCCccccccccccccCcccccCccCcccccccc--------ccccCcceEEEEeC---------CCceE
Confidence 55567777664 4789999999999999999999998 32211 01123455444321 11358
Q ss_pred EEeccC--CCCeeeccc
Q 000479 1454 VIDATK--YGNVSRFIN 1468 (1470)
Q Consensus 1454 ~IDA~~--~GNvaRFIN 1468 (1470)
+||++. .+|+.||+|
T Consensus 89 ~iDg~d~~~sNWmRYV~ 105 (396)
T KOG2461|consen 89 YIDGTDEEHSNWMRYVN 105 (396)
T ss_pred EeccCChhhcceeeeec
Confidence 999987 589999998
No 85
>cd05837 MSH6_like The PWWP domain is present in MSH6, a mismatch repair protein homologous to bacterial MutS. The PWWP domain of histone-lysine N-methyltransferase, also known as Nuclear SET domain-containing protein 3, is also included. Mutations in MSH6 have been linked to increased cancer susceptibility, particularly in hereditary nonpolyposis colorectal cancer in humans. The role of the PWWP domain in MSH6 is not clear; MSH6 orthologs found in S. cerevisiae, Caenorhabditis elegans and Arabidopsis thaliana lack the PWWP domain. Histone methyltransferases (HMTases) induce the posttranslational methylation of lysine residues in histones and play a role in apoptosis. In the HMTase Whistle, the PWWP domain is necessary for HMTase activity. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain pro
Probab=66.36 E-value=6.2 Score=40.12 Aligned_cols=63 Identities=17% Similarity=0.374 Sum_probs=45.8
Q ss_pred EEEEEeccc-cccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhccccccCCC
Q 000479 157 ALWVKWRGK-WQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRSINEFP 219 (1470)
Q Consensus 157 ~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 219 (1470)
-+|.|=+|- |--|+-+...+=|..+.+..+....+.|.|.||..+.+|.||.---+.++.+.-
T Consensus 8 lVWaK~~g~PwWPa~V~~~~~~~~~~~~~~~~~~~~~~~V~FFG~~~~~aWv~~~~l~pf~~~~ 71 (110)
T cd05837 8 LVWAKVSGYPWWPCMVCSDPLLGTYTKTKRNKRKPRQYHVQFFGDNPERAWISEKSLKPFKGSK 71 (110)
T ss_pred EEEEeCCCCCCCCEEEecccccchhhhhhhccCCCCeEEEEEcCCCCCEEEecHHHccccCCch
Confidence 479999884 666666654444444444445555689999999999999999988888877654
No 86
>smart00451 ZnF_U1 U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Probab=59.91 E-value=4 Score=32.29 Aligned_cols=21 Identities=0% Similarity=-0.236 Sum_probs=12.4
Q ss_pred CcccCCCCccCCChhhhhhcc
Q 000479 1018 PHKKGIRFYAYKLKSGRLSRP 1038 (1470)
Q Consensus 1018 pykC~iCgKsF~~ks~L~~H~ 1038 (1470)
+|.|.+|++.|.....+..|+
T Consensus 3 ~~~C~~C~~~~~~~~~~~~H~ 23 (35)
T smart00451 3 GFYCKLCNVTFTDEISVEAHL 23 (35)
T ss_pred CeEccccCCccCCHHHHHHHH
Confidence 455666666666665666655
No 87
>PF13913 zf-C2HC_2: zinc-finger of a C2HC-type
Probab=59.36 E-value=5.9 Score=29.80 Aligned_cols=18 Identities=39% Similarity=0.755 Sum_probs=13.6
Q ss_pred cccCccCCccCChhhHhHh
Q 000479 985 FICRFCGLKFDLLPDLGRH 1003 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH 1003 (1470)
..|+.||++| ....|.+|
T Consensus 3 ~~C~~CgR~F-~~~~l~~H 20 (25)
T PF13913_consen 3 VPCPICGRKF-NPDRLEKH 20 (25)
T ss_pred CcCCCCCCEE-CHHHHHHH
Confidence 4688888888 66777777
No 88
>smart00451 ZnF_U1 U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Probab=58.17 E-value=4.1 Score=32.22 Aligned_cols=24 Identities=29% Similarity=0.805 Sum_probs=13.5
Q ss_pred ccccccccccccChhhhhhhhhhh
Q 000479 881 GYACAICLDSFTNKKVLESHVQER 904 (1470)
Q Consensus 881 py~C~~CgKsF~sks~L~~H~r~H 904 (1470)
+|.|++|++.|.+...+..|++..
T Consensus 3 ~~~C~~C~~~~~~~~~~~~H~~gk 26 (35)
T smart00451 3 GFYCKLCNVTFTDEISVEAHLKGK 26 (35)
T ss_pred CeEccccCCccCCHHHHHHHHChH
Confidence 355666666666555555555543
No 89
>smart00391 MBD Methyl-CpG binding domain. Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain
Probab=55.83 E-value=3.8 Score=39.16 Aligned_cols=37 Identities=19% Similarity=0.106 Sum_probs=29.3
Q ss_pred CCC-CCcccC------------CcccccccCCCCCCc-cccccceeeeccC
Q 000479 1184 HLE-PLPSVS------------AGIRSSDSSDFVNNQ-WEVDECHCIIDSR 1220 (1470)
Q Consensus 1184 pl~-p~~~~~------------~~~k~v~~~~p~~~~-w~~~e~~~~l~~~ 1220 (1470)
|+. |++.|| .++..|.|.+|||.. +.+.|++.||...
T Consensus 3 ~~~~Plp~GW~R~~~~r~~g~~~~~~dV~Y~sP~GkklRs~~ev~~YL~~~ 53 (77)
T smart00391 3 PLRLPLPCGWRRETKQRKSGRSAGKFDVYYISPCGKKLRSKSELARYLHKN 53 (77)
T ss_pred cccCCCCCCcEEEEEEecCCCCCCcccEEEECCCCCeeeCHHHHHHHHHhC
Confidence 455 677788 145679999999999 9999999988753
No 90
>PF13913 zf-C2HC_2: zinc-finger of a C2HC-type
Probab=55.44 E-value=6.1 Score=29.73 Aligned_cols=17 Identities=47% Similarity=0.847 Sum_probs=7.6
Q ss_pred cccccccccChhhhhhhh
Q 000479 884 CAICLDSFTNKKVLESHV 901 (1470)
Q Consensus 884 C~~CgKsF~sks~L~~H~ 901 (1470)
|+.||+.| ....|..|+
T Consensus 5 C~~CgR~F-~~~~l~~H~ 21 (25)
T PF13913_consen 5 CPICGRKF-NPDRLEKHE 21 (25)
T ss_pred CCCCCCEE-CHHHHHHHH
Confidence 44444444 334444443
No 91
>COG4049 Uncharacterized protein containing archaeal-type C2H2 Zn-finger [General function prediction only]
Probab=55.26 E-value=4.8 Score=35.90 Aligned_cols=32 Identities=28% Similarity=0.385 Sum_probs=25.8
Q ss_pred hcCCccccccCccCCccCChhhHhHhhhhhcc
Q 000479 978 NLGSIRKFICRFCGLKFDLLPDLGRHHQAAHM 1009 (1470)
Q Consensus 978 ~HtgeKpykC~~CGKsF~sks~LkrH~~rvHt 1009 (1470)
.-.||--+.|+.||+.|....+..+|+.+.|.
T Consensus 11 ~RDGE~~lrCPRC~~~FR~~K~Y~RHVNKaH~ 42 (65)
T COG4049 11 DRDGEEFLRCPRCGMVFRRRKDYIRHVNKAHG 42 (65)
T ss_pred ccCCceeeeCCchhHHHHHhHHHHHHhhHHhh
Confidence 45677778899999999988888898777774
No 92
>TIGR00622 ssl1 transcription factor ssl1. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=53.83 E-value=21 Score=36.57 Aligned_cols=20 Identities=20% Similarity=0.238 Sum_probs=15.4
Q ss_pred cccccCCCCCCCChhhhhhh
Q 000479 915 LQQCIPCGSHFGNTEELWLH 934 (1470)
Q Consensus 915 pfkC~~CgK~F~sks~L~~H 934 (1470)
|..|+.|+-+......|.+.
T Consensus 15 P~~CpiCgLtLVss~HLARS 34 (112)
T TIGR00622 15 PVECPICGLTLILSTHLARS 34 (112)
T ss_pred CCcCCcCCCEEeccchHHHh
Confidence 66788888888777777766
No 93
>PF09986 DUF2225: Uncharacterized protein conserved in bacteria (DUF2225); InterPro: IPR018708 This conserved bacterial family has no known function.
Probab=53.60 E-value=4.4 Score=45.87 Aligned_cols=49 Identities=14% Similarity=0.133 Sum_probs=27.0
Q ss_pred cccccCccCCccCChhhHhHhhhhhccCCCCC----CCC-----CcccCCCCccCCCh
Q 000479 983 RKFICRFCGLKFDLLPDLGRHHQAAHMGPNLV----NSR-----PHKKGIRFYAYKLK 1031 (1470)
Q Consensus 983 KpykC~~CGKsF~sks~LkrH~~rvHtge~~~----~eK-----pykC~iCgKsF~~k 1031 (1470)
+.+.|++|++.|.++.-+....+..+....+. ... ...|+.||++|...
T Consensus 4 k~~~CPvC~~~F~~~~vrs~~~r~~~~d~D~~~~Y~~vnP~~Y~V~vCP~CgyA~~~~ 61 (214)
T PF09986_consen 4 KKITCPVCGKEFKTKKVRSGKIRVIRRDSDFCPRYKGVNPLFYEVWVCPHCGYAAFEE 61 (214)
T ss_pred CceECCCCCCeeeeeEEEcCCceEeeecCCCccccCCCCCeeeeEEECCCCCCccccc
Confidence 56788888888887765555522222221110 012 23677777776644
No 94
>KOG2893 consensus Zn finger protein [General function prediction only]
Probab=52.25 E-value=6 Score=44.58 Aligned_cols=47 Identities=28% Similarity=0.553 Sum_probs=22.8
Q ss_pred cccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhhhccc
Q 000479 884 CAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQSVHA 940 (1470)
Q Consensus 884 C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k~~Hs 940 (1470)
|-.|++.|....-|.+|++..| |+|.+|.|..-+--.|..|-..+|.
T Consensus 13 cwycnrefddekiliqhqkakh----------fkchichkkl~sgpglsihcmqvhk 59 (341)
T KOG2893|consen 13 CWYCNREFDDEKILIQHQKAKH----------FKCHICHKKLFSGPGLSIHCMQVHK 59 (341)
T ss_pred eeecccccchhhhhhhhhhhcc----------ceeeeehhhhccCCCceeehhhhhh
Confidence 4555555555555555544443 3455555544444444444333443
No 95
>TIGR00622 ssl1 transcription factor ssl1. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=51.90 E-value=21 Score=36.55 Aligned_cols=82 Identities=18% Similarity=0.335 Sum_probs=48.2
Q ss_pred ccccCcCCccccChhHHhhhhhhccccch-hcc-------cCccccccccccccChhhhhhhhhhhccccchhhcccccc
Q 000479 847 THKCKICSQVFLHDQELGVHWMDNHKKEA-QWL-------FRGYACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQC 918 (1470)
Q Consensus 847 pfkC~~CgK~F~s~s~L~~H~~~~Ht~e~-~~~-------~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC 918 (1470)
|-.|+.|+-..-...+|.+. ..|.-.. .+. ...-.|--|.+.|....... .++ ......|+|
T Consensus 15 P~~CpiCgLtLVss~HLARS--yHHLfPl~~f~ev~~~~~~~~~~C~~C~~~f~~~~~~~------~~~--~~~~~~y~C 84 (112)
T TIGR00622 15 PVECPICGLTLILSTHLARS--YHHLFPLKAFQEIPLEEYNGSRFCFGCQGPFPKPPVSP------FDE--LKDSHRYVC 84 (112)
T ss_pred CCcCCcCCCEEeccchHHHh--hhccCCCcccccccccccCCCCcccCcCCCCCCccccc------ccc--cccccceeC
Confidence 56677777777777777654 1231110 000 01124888999987654221 110 111237999
Q ss_pred cCCCCCCCChhhhhhhhhhcc
Q 000479 919 IPCGSHFGNTEELWLHVQSVH 939 (1470)
Q Consensus 919 ~~CgK~F~sks~L~~H~k~~H 939 (1470)
+.|...|-..-+.-.|. ..|
T Consensus 85 ~~C~~~FC~dCD~fiHe-~Lh 104 (112)
T TIGR00622 85 AVCKNVFCVDCDVFVHE-SLH 104 (112)
T ss_pred CCCCCccccccchhhhh-hcc
Confidence 99999999888888884 455
No 96
>KOG1280 consensus Uncharacterized conserved protein containing ZZ-type Zn-finger [General function prediction only]
Probab=49.53 E-value=13 Score=44.29 Aligned_cols=27 Identities=19% Similarity=0.597 Sum_probs=14.5
Q ss_pred ccccCcCCccccChhHHhhhhhhcccc
Q 000479 847 THKCKICSQVFLHDQELGVHWMDNHKK 873 (1470)
Q Consensus 847 pfkC~~CgK~F~s~s~L~~H~~~~Ht~ 873 (1470)
.|.|+.|++.=.+...|..|+...|..
T Consensus 79 SftCPyC~~~Gfte~~f~~Hv~s~Hpd 105 (381)
T KOG1280|consen 79 SFTCPYCGIMGFTERQFGTHVLSQHPE 105 (381)
T ss_pred cccCCcccccccchhHHHHHhhhcCcc
Confidence 455555555555555555555555543
No 97
>KOG2186 consensus Cell growth-regulating nucleolar protein [Cell cycle control, cell division, chromosome partitioning]
Probab=43.04 E-value=12 Score=42.96 Aligned_cols=46 Identities=26% Similarity=0.600 Sum_probs=30.5
Q ss_pred cccccccccccChhhhhhhhhhhccccchhhcccccccCCCCCCCChhhhhhhhh
Q 000479 882 YACAICLDSFTNKKVLESHVQERHHVQFVEQCMLQQCIPCGSHFGNTEELWLHVQ 936 (1470)
Q Consensus 882 y~C~~CgKsF~sks~L~~H~r~Hh~ek~~e~~kpfkC~~CgK~F~sks~L~~H~k 936 (1470)
|.|..||.... +..+.+|+-.-++. -|.|..|++.|.. .++..|.+
T Consensus 4 FtCnvCgEsvK-Kp~vekH~srCrn~-------~fSCIDC~k~F~~-~sYknH~k 49 (276)
T KOG2186|consen 4 FTCNVCGESVK-KPQVEKHMSRCRNA-------YFSCIDCGKTFER-VSYKNHTK 49 (276)
T ss_pred Eehhhhhhhcc-ccchHHHHHhccCC-------eeEEeeccccccc-chhhhhhh
Confidence 66777776654 34456677766664 4777777777776 66666754
No 98
>PHA00626 hypothetical protein
Probab=42.65 E-value=10 Score=33.95 Aligned_cols=14 Identities=7% Similarity=-0.429 Sum_probs=9.2
Q ss_pred CCcccCCCCccCCC
Q 000479 1017 RPHKKGIRFYAYKL 1030 (1470)
Q Consensus 1017 KpykC~iCgKsF~~ 1030 (1470)
..|+|+.||+.|+.
T Consensus 22 nrYkCkdCGY~ft~ 35 (59)
T PHA00626 22 DDYVCCDCGYNDSK 35 (59)
T ss_pred cceEcCCCCCeech
Confidence 46777777776654
No 99
>cd00122 MBD MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family.
Probab=42.27 E-value=8.1 Score=35.21 Aligned_cols=27 Identities=7% Similarity=-0.031 Sum_probs=23.0
Q ss_pred cccccccCCCCCCc-cccccceeeeccC
Q 000479 1194 GIRSSDSSDFVNNQ-WEVDECHCIIDSR 1220 (1470)
Q Consensus 1194 ~~k~v~~~~p~~~~-w~~~e~~~~l~~~ 1220 (1470)
++..|.|.+|+|.. +.+.|+..||..+
T Consensus 23 ~k~dv~Y~sP~Gk~~Rs~~ev~~yL~~~ 50 (62)
T cd00122 23 GKGDVYYYSPCGKKLRSKPEVARYLEKT 50 (62)
T ss_pred CcceEEEECCCCceecCHHHHHHHHHhC
Confidence 45579999999988 9999999988765
No 100
>PF12013 DUF3505: Protein of unknown function (DUF3505); InterPro: IPR022698 This family of proteins is functionally uncharacterised. This protein is found in eukaryotes. Proteins in this family are typically between 247 to 1018 amino acids in length. This region contains two segments that are likely to be C2H2 zinc binding domains.
Probab=41.68 E-value=27 Score=35.24 Aligned_cols=27 Identities=15% Similarity=-0.070 Sum_probs=22.6
Q ss_pred CCccc----CCCCccCCChhhhhhccccccC
Q 000479 1017 RPHKK----GIRFYAYKLKSGRLSRPRFKKG 1043 (1470)
Q Consensus 1017 KpykC----~iCgKsF~~ks~L~~H~r~Htg 1043 (1470)
.-|.| ..|++.+.+...+++|++.++|
T Consensus 79 ~G~~C~~~~~~C~y~~~~~~~m~~H~~~~Hg 109 (109)
T PF12013_consen 79 DGYRCQCDPPHCGYITRSKKTMRKHWRKEHG 109 (109)
T ss_pred CCeeeecCCCCCCcEeccHHHHHHHHHHhcC
Confidence 45889 8999999999999999877664
No 101
>smart00293 PWWP domain with conserved PWWP motif. conservation of Pro-Trp-Trp-Pro residues
Probab=41.30 E-value=29 Score=31.67 Aligned_cols=56 Identities=20% Similarity=0.433 Sum_probs=38.5
Q ss_pred EEEEEecc-ccccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhccccc
Q 000479 157 ALWVKWRG-KWQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRSI 215 (1470)
Q Consensus 157 ~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 215 (1470)
-+|.|=+| -|--|+-+...+-|...++ +.-....|.|.||.. .+|.|++--.+.++
T Consensus 6 lVwaK~~G~p~WPa~V~~~~~~~~~~~~--~~~~~~~~~V~Ffg~-~~~awv~~~~l~p~ 62 (63)
T smart00293 6 LVWAKMKGFPWWPALVVSPKETPDNIRK--RKRFENLYPVLFFGD-KDTAWISSSKLFPL 62 (63)
T ss_pred EEEEECCCCCCCCeEEcCcccCChhHhh--ccCCCCEEEEEEeCC-CCEEEECccceeeC
Confidence 37999999 7777777766665554332 334456788888875 55699987766654
No 102
>COG4049 Uncharacterized protein containing archaeal-type C2H2 Zn-finger [General function prediction only]
Probab=40.63 E-value=12 Score=33.50 Aligned_cols=33 Identities=21% Similarity=0.371 Sum_probs=22.8
Q ss_pred hccCCCCccccCcCCccccChhHHhhhhhhccc
Q 000479 840 GRSEDEKTHKCKICSQVFLHDQELGVHWMDNHK 872 (1470)
Q Consensus 840 r~H~gekpfkC~~CgK~F~s~s~L~~H~~~~Ht 872 (1470)
+...||..++|+-|++.|.......+|.-..|.
T Consensus 10 ~~RDGE~~lrCPRC~~~FR~~K~Y~RHVNKaH~ 42 (65)
T COG4049 10 RDRDGEEFLRCPRCGMVFRRRKDYIRHVNKAHG 42 (65)
T ss_pred eccCCceeeeCCchhHHHHHhHHHHHHhhHHhh
Confidence 344566677777777777777777777655553
No 103
>PF00855 PWWP: PWWP domain; InterPro: IPR000313 Upon characterisation of WHSC1, a gene mapping to the Wolf-Hirschhorn syndrome critical region and at its C terminus similar to the Drosophila melanogaster ASH1/trithorax group proteins, a novel protein domain designated PWWP domain was identified []. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. It is present in proteins of nuclear origin and plays a role in cell growth and differentiation. Due to its position, the composition of amino acids close to the PWWP motif and the pattern of other domains present it has been suggested that the domain is involved in protein-protein interactions [].; PDB: 3LYI_B 2L89_A 2NLU_A 1RI0_A 1KHC_A 3QKJ_C 2DAQ_A 1N27_A 3PFS_B 3QJ6_A ....
Probab=39.87 E-value=28 Score=33.05 Aligned_cols=56 Identities=23% Similarity=0.577 Sum_probs=38.4
Q ss_pred EEEEEecc-ccccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhccccccCCC
Q 000479 157 ALWVKWRG-KWQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRSINEFP 219 (1470)
Q Consensus 157 ~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 219 (1470)
-+|+|=+| -|=-|+=|...+.+- + ......|.|.||... +|.|++.-.|.+.+++-
T Consensus 6 lVWaK~~g~pwWPa~V~~~~~~~~-----~-~~~~~~~~V~Ffg~~-~~~wv~~~~i~~f~~~~ 62 (86)
T PF00855_consen 6 LVWAKLKGYPWWPARVCDPDEKSK-----K-KRKDGHVLVRFFGDN-DYAWVKPSNIKPFSEFK 62 (86)
T ss_dssp EEEEEETTSEEEEEEEEECCHCTS-----C-SSSSTEEEEEETTTT-EEEEEEGGGEEECCHHH
T ss_pred EEEEEeCCCCCCceEEeecccccc-----c-CCCCCEEEEEecCCC-CEEEECHHHhhChhhhH
Confidence 48999987 355666666664443 1 334466777777766 99999998888877544
No 104
>cd01397 HAT_MBD Methyl-CpG binding domains (MBD) present in putative chromatin remodelling factor such as BAZ2A; BAZ2A contains a MBD, DDT, PHD-type zinc finger and Bromo domain suggesting that BAZ2A might be associated with histone acetyltransferase (HAT) activity. The Drosophila melanogaster toutatis protein, a putative subunit of the chromatin-remodeling complex, and other such proteins in this group share a similar domain architecture with BAZ2A, as does the Caenorhabditis elegans flectin homolog.
Probab=39.77 E-value=11 Score=35.88 Aligned_cols=26 Identities=4% Similarity=-0.178 Sum_probs=21.9
Q ss_pred cccccccCCCCCCc-cccccceeeecc
Q 000479 1194 GIRSSDSSDFVNNQ-WEVDECHCIIDS 1219 (1470)
Q Consensus 1194 ~~k~v~~~~p~~~~-w~~~e~~~~l~~ 1219 (1470)
.+..|.|.+|||.. +.+.|++.||..
T Consensus 23 ~~~dV~Y~aPcGKklRs~~ev~~yL~~ 49 (73)
T cd01397 23 IQGEVAYYAPCGKKLRQYPEVIKYLSK 49 (73)
T ss_pred ccceEEEECCCCcccccHHHHHHHHHh
Confidence 34468899999999 999999988874
No 105
>KOG3813 consensus Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]
Probab=38.15 E-value=15 Score=45.65 Aligned_cols=19 Identities=42% Similarity=1.027 Sum_probs=16.6
Q ss_pred CCCcccCCCCccCCCCCccc
Q 000479 1299 QLGCACANSTCFPETCDHVY 1318 (1470)
Q Consensus 1299 ~~gC~C~~~~C~~~~C~C~~ 1318 (1470)
-+||+|. +-|+|++|+|.+
T Consensus 307 eCGCsCr-~~CdPETCaCSq 325 (640)
T KOG3813|consen 307 ECGCSCR-GVCDPETCACSQ 325 (640)
T ss_pred hhCCccc-ceeChhhcchhc
Confidence 5899999 599999999964
No 106
>cd05838 WHSC1_related The PWWP domain was first identified in the WHSC1 (Wolf-Hirschhorn syndrome candidate 1) protein, a protein implicated in Wolf-Hirschhorn syndrome (WHS). When translocated, WHSC1 plays a role in lymphoid multiple myeloma (MM) disease, also known as plasmacytoma. WHSC1 proteins typically contain two copies of the PWWP domain. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.
Probab=36.79 E-value=27 Score=34.67 Aligned_cols=54 Identities=26% Similarity=0.543 Sum_probs=34.1
Q ss_pred EEEEecc-ccccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhcccc
Q 000479 158 LWVKWRG-KWQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRS 214 (1470)
Q Consensus 158 ~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 214 (1470)
+|+|-+| -|=-|+-|-..+=|-..+..+ +....|.|.|| .|++|.|++--.|-+
T Consensus 7 VWaK~~g~pwWPa~V~~~~~~p~~~~~~~--~~~~~~~V~Ff-gs~~y~Wv~~~~l~p 61 (95)
T cd05838 7 VWAKLGNFRWWPAIICDPREVPPNIQVLR--HCIGEFCVMFF-GTHDYYWVHRGRVFP 61 (95)
T ss_pred EEEECCCCCCCCeEEcChhhcChhHhhcc--CCCCeEEEEEe-CCCCEEEeccccccc
Confidence 7999998 555666665543333222211 23356888888 589999999744443
No 107
>cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, a small electron transfer protein, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4) and is believed to be involved in electron transfer; the iron can also be replaced by cobalt, nickel or zinc. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=34.90 E-value=24 Score=28.26 Aligned_cols=10 Identities=30% Similarity=1.331 Sum_probs=5.1
Q ss_pred cccCccCCcc
Q 000479 985 FICRFCGLKF 994 (1470)
Q Consensus 985 ykC~~CGKsF 994 (1470)
|+|..||..+
T Consensus 2 ~~C~~CGy~y 11 (33)
T cd00350 2 YVCPVCGYIY 11 (33)
T ss_pred EECCCCCCEE
Confidence 4555555443
No 108
>PF09538 FYDLN_acid: Protein of unknown function (FYDLN_acid); InterPro: IPR012644 Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=33.48 E-value=25 Score=35.92 Aligned_cols=30 Identities=23% Similarity=0.222 Sum_probs=19.1
Q ss_pred cccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCCh
Q 000479 985 FICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLK 1031 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~k 1031 (1470)
..|+.||++|.-. + ..|..|+.||..|.-.
T Consensus 10 R~Cp~CG~kFYDL---n--------------k~PivCP~CG~~~~~~ 39 (108)
T PF09538_consen 10 RTCPSCGAKFYDL---N--------------KDPIVCPKCGTEFPPE 39 (108)
T ss_pred ccCCCCcchhccC---C--------------CCCccCCCCCCccCcc
Confidence 5677777777542 1 2366777777777655
No 109
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=32.65 E-value=63 Score=40.11 Aligned_cols=80 Identities=0% Similarity=-0.290 Sum_probs=54.5
Q ss_pred hhHHhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhhhhccccccCCCccc
Q 000479 969 SASVENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGRLSRPRFKKGLGAVS 1048 (1470)
Q Consensus 969 ~s~L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L~~H~r~Htgekpy~ 1048 (1470)
...+..|...|++..++-++++.+.+.....+..| ...|.+ +.++.+..+...+.....+..+..+|+....+.
T Consensus 316 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 389 (396)
T KOG2461|consen 316 QLVLDQSEVPATVSVWTGETIPVRTPAGQLIYTQS-HSMEVA-----EPTDMAPNQIWKIYHTGVLGFLIITTDESECNN 389 (396)
T ss_pred ccccccccccccccccCcCcccccccccccchhhh-hhcccC-----CCCcccccccccceeccccceeeeecccccccc
Confidence 34556666777777777777777777777777777 666766 556666666666666666666777777777776
Q ss_pred cCcCCC
Q 000479 1049 YRIRNR 1054 (1470)
Q Consensus 1049 C~~C~k 1054 (1470)
+..|.+
T Consensus 390 ~~~~~~ 395 (396)
T KOG2461|consen 390 MSFVCK 395 (396)
T ss_pred ccccCC
Confidence 666554
No 110
>PRK14890 putative Zn-ribbon RNA-binding protein; Provisional
Probab=30.46 E-value=32 Score=31.38 Aligned_cols=32 Identities=22% Similarity=0.285 Sum_probs=19.4
Q ss_pred cccccCccCCc-cCChhhHhHhhhhhccCCCCCCCCCcccCCCCc
Q 000479 983 RKFICRFCGLK-FDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFY 1026 (1470)
Q Consensus 983 KpykC~~CGKs-F~sks~LkrH~~rvHtge~~~~eKpykC~iCgK 1026 (1470)
-.|.|+.||+. -.+-..-+++ ..+|.|+.||.
T Consensus 24 ~~F~CPnCG~~~I~RC~~CRk~------------~~~Y~CP~CGF 56 (59)
T PRK14890 24 VKFLCPNCGEVIIYRCEKCRKQ------------SNPYTCPKCGF 56 (59)
T ss_pred CEeeCCCCCCeeEeechhHHhc------------CCceECCCCCC
Confidence 34777777776 3333333333 45888888875
No 111
>KOG2186 consensus Cell growth-regulating nucleolar protein [Cell cycle control, cell division, chromosome partitioning]
Probab=30.43 E-value=26 Score=40.38 Aligned_cols=51 Identities=16% Similarity=0.361 Sum_probs=35.1
Q ss_pred ccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccccChhhhhhhhhhhcc
Q 000479 847 THKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFTNKKVLESHVQERHH 906 (1470)
Q Consensus 847 pfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~sks~L~~H~r~Hh~ 906 (1470)
.|.|..||...+.+. +.+|+.+.|. .-|.|-.|++.|.. .....|.+--+.
T Consensus 3 ~FtCnvCgEsvKKp~-vekH~srCrn-------~~fSCIDC~k~F~~-~sYknH~kCITE 53 (276)
T KOG2186|consen 3 FFTCNVCGESVKKPQ-VEKHMSRCRN-------AYFSCIDCGKTFER-VSYKNHTKCITE 53 (276)
T ss_pred EEehhhhhhhccccc-hHHHHHhccC-------CeeEEeeccccccc-chhhhhhhhcch
Confidence 377888888776544 6668555554 24888888888887 667777664443
No 112
>cd00729 rubredoxin_SM Rubredoxin, Small Modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=29.83 E-value=31 Score=27.97 Aligned_cols=10 Identities=30% Similarity=1.109 Sum_probs=6.0
Q ss_pred cccCccCCcc
Q 000479 985 FICRFCGLKF 994 (1470)
Q Consensus 985 ykC~~CGKsF 994 (1470)
|+|..||..+
T Consensus 3 ~~C~~CG~i~ 12 (34)
T cd00729 3 WVCPVCGYIH 12 (34)
T ss_pred EECCCCCCEe
Confidence 5666666544
No 113
>cd05839 BR140_related The PWWP domain is found in the BR140 family, which includes peregrin and BR140-like proteins 1 and 2. BR140 is the only family to contain the PWWP domain at the C terminus, with PHD and bromo domains in the N-terminal region. In myeloid leukemias, BR140 is disrupted by chromosomal translocations, similar to translocations of WHSC1 in lymphoid multiple myeloma. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes.
Probab=29.00 E-value=83 Score=32.40 Aligned_cols=61 Identities=21% Similarity=0.421 Sum_probs=40.7
Q ss_pred EEEEEeccc-cccceeeeec----cC-----CCcccc----ccccCCCccEEEEEeccCCcchhhhhhccccccC
Q 000479 157 ALWVKWRGK-WQAGIRCARA----DW-----PLPTLK----AKPTHDRKKYFVIFFPHTRNYSWADMLLVRSINE 217 (1470)
Q Consensus 157 ~~~~~~~~~-~~~~~~~~~~----~~-----~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 217 (1470)
-||.|-+|- |.-|+-.-.. .. |++-|+ .+.-.+.+.|+|-||=.+++|.|++---+.+..+
T Consensus 6 lVwaK~~g~P~wPa~iidp~~~~~~~~~~~~p~~~l~~~~~~~~~~~~~~~lV~FFd~~~s~~Wv~~~~l~pl~~ 80 (111)
T cd05839 6 LVWAKCRGYPSYPALIIDPKMPRDGVFHNGVPPDVLTLGEARAQNADERLYLVLFFDNKRTWQWLPGDKLEPLGV 80 (111)
T ss_pred EeeeeecCCCCCCeEeeCCCCCCcccccCCCCchhhhHHHHHhccCCCcEEEEEEecCCCcceecCHHHCccccc
Confidence 379998883 6666554422 11 112222 2334678889999999999999999887776654
No 114
>TIGR00373 conserved hypothetical protein TIGR00373. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain.
Probab=27.76 E-value=39 Score=36.64 Aligned_cols=42 Identities=12% Similarity=-0.010 Sum_probs=29.1
Q ss_pred HhhhhhhcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccC
Q 000479 972 VENHSENLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAY 1028 (1470)
Q Consensus 972 L~~H~r~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF 1028 (1470)
|..-+.......-|.|+.|+..|+....+. .-|.|+.||...
T Consensus 97 lk~~l~~e~~~~~Y~Cp~c~~r~tf~eA~~---------------~~F~Cp~Cg~~L 138 (158)
T TIGR00373 97 LREKLEFETNNMFFICPNMCVRFTFNEAME---------------LNFTCPRCGAML 138 (158)
T ss_pred HHHHHhhccCCCeEECCCCCcEeeHHHHHH---------------cCCcCCCCCCEe
Confidence 333334445556789999999888877763 158999998763
No 115
>PF02892 zf-BED: BED zinc finger; InterPro: IPR003656 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents predicted BED-type zinc finger domains. The BED finger which was named after the Drosophila proteins BEAF and DREF, is found in one or more copies in cellular regulatory factors and transposases from plants, animals and fungi. 
The BED finger is an about 50 to 60 amino acid residues domain that contains a characteristic motif with two highly conserved aromatic positions, as well as a shared pattern of cysteines and histidines that is predicted to form a zinc finger. As diverse BED fingers are able to bind DNA, it has been suggested that DNA-binding is the general function of this domain []. Some proteins known to contain a BED domain include animal, plant and fungi AC1 and Hobo-like transposases; Caenorhabditis elegans Dpy-20 protein, a predicted cuticular gene transcriptional regulator; Drosophila BEAF (boundary element-associated factor), thought to be involved in chromatin insulation; Drosophila DREF, a transcriptional regulator for S-phase genes; and tobacco 3AF1 and tomato E4/E8-BP1, light- and ethylene-regulated DNA binding proteins that contain two BED fingers. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0003677 DNA binding; PDB: 2DJR_A 2CT5_A.
Probab=27.62 E-value=35 Score=28.69 Aligned_cols=27 Identities=30% Similarity=0.659 Sum_probs=14.8
Q ss_pred ccccccCccCCccCCh----hhHhHhhhhhc
Q 000479 982 IRKFICRFCGLKFDLL----PDLGRHHQAAH 1008 (1470)
Q Consensus 982 eKpykC~~CGKsF~sk----s~LkrH~~rvH 1008 (1470)
....+|.+|++.+... +.|.+|+++.|
T Consensus 14 ~~~a~C~~C~~~~~~~~~~ts~l~~HL~~~h 44 (45)
T PF02892_consen 14 KKKAKCKYCGKVIKYSSGGTSNLKRHLKKKH 44 (45)
T ss_dssp SS-EEETTTTEE-----SSTHHHHHHHHHTT
T ss_pred cCeEEeCCCCeEEeeCCCcHHHHHHhhhhhC
Confidence 3556788888777663 67777744554
No 116
>PF09986 DUF2225: Uncharacterized protein conserved in bacteria (DUF2225); InterPro: IPR018708 This conserved bacterial family has no known function.
Probab=27.54 E-value=25 Score=39.86 Aligned_cols=42 Identities=14% Similarity=0.035 Sum_probs=30.8
Q ss_pred CCCcccCCCCccCCChhhhhhcccc---cc-------CCCc-----cccCcCCCcCC
Q 000479 1016 SRPHKKGIRFYAYKLKSGRLSRPRF---KK-------GLGA-----VSYRIRNRGAA 1057 (1470)
Q Consensus 1016 eKpykC~iCgKsF~~ks~L~~H~r~---Ht-------gekp-----y~C~~C~ksF~ 1057 (1470)
.+.+.||+|++.|.++.-+....+. .. +..| ..|+.||-+|.
T Consensus 3 ~k~~~CPvC~~~F~~~~vrs~~~r~~~~d~D~~~~Y~~vnP~~Y~V~vCP~CgyA~~ 59 (214)
T PF09986_consen 3 DKKITCPVCGKEFKTKKVRSGKIRVIRRDSDFCPRYKGVNPLFYEVWVCPHCGYAAF 59 (214)
T ss_pred CCceECCCCCCeeeeeEEEcCCceEeeecCCCccccCCCCCeeeeEEECCCCCCccc
Confidence 5688999999999998877766643 22 2223 35999998875
No 117
>COG2888 Predicted Zn-ribbon RNA-binding protein with a function in translation [Translation, ribosomal structure and biogenesis]
Probab=26.82 E-value=41 Score=30.75 Aligned_cols=32 Identities=19% Similarity=0.132 Sum_probs=19.3
Q ss_pred cccccCccCCccCChh-hHhHhhhhhccCCCCCCCCCcccCCCCc
Q 000479 983 RKFICRFCGLKFDLLP-DLGRHHQAAHMGPNLVNSRPHKKGIRFY 1026 (1470)
Q Consensus 983 KpykC~~CGKsF~sks-~LkrH~~rvHtge~~~~eKpykC~iCgK 1026 (1470)
-.|.|+.||..-..+. .-++| ..+|.|+.||.
T Consensus 26 v~F~CPnCGe~~I~Rc~~CRk~------------g~~Y~Cp~CGF 58 (61)
T COG2888 26 VKFPCPNCGEVEIYRCAKCRKL------------GNPYRCPKCGF 58 (61)
T ss_pred eEeeCCCCCceeeehhhhHHHc------------CCceECCCcCc
Confidence 3578888885544332 22222 45888888884
No 118
>smart00531 TFIIE Transcription initiation factor IIE.
Probab=26.75 E-value=48 Score=35.40 Aligned_cols=39 Identities=13% Similarity=0.054 Sum_probs=24.4
Q ss_pred CCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccC
Q 000479 980 GSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAY 1028 (1470)
Q Consensus 980 tgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF 1028 (1470)
....-|.|+.|++.|.....+..- +. ...|.|+.||...
T Consensus 95 ~~~~~Y~Cp~C~~~y~~~ea~~~~----d~------~~~f~Cp~Cg~~l 133 (147)
T smart00531 95 TNNAYYKCPNCQSKYTFLEANQLL----DM------DGTFTCPRCGEEL 133 (147)
T ss_pred cCCcEEECcCCCCEeeHHHHHHhc----CC------CCcEECCCCCCEE
Confidence 344568888888888865443321 11 2248888888764
No 119
>PF13891 zf-C3Hc3H: Potential DNA-binding domain
Probab=26.50 E-value=21 Score=32.96 Aligned_cols=36 Identities=28% Similarity=0.393 Sum_probs=26.1
Q ss_pred eeccCcccccccCCCcccccCCCCCCCCCccCCCchh
Q 000479 587 TVLGTRCKHRALYGSSFCKKHRPRTDTGRILDSPDNT 623 (1470)
Q Consensus 587 ~~~g~~ckh~~~~~~~~ck~~~~~~~~~~~~~~~~~~ 623 (1470)
+..|+.|+.+++||+.||=+|-..- .++.||..+.-
T Consensus 3 ~~~~~~C~~~~lp~~~yC~~HIl~D-~~Q~Lf~~C~~ 38 (65)
T PF13891_consen 3 TYSGRGCSQPALPGSKYCIRHILED-PNQPLFKQCSY 38 (65)
T ss_pred CCCCCCcCcccCchhhHHHHHhccC-CCCCCcccCcC
Confidence 4578999999999999999987432 22555555443
No 120
>PF13719 zinc_ribbon_5: zinc-ribbon domain
Probab=25.82 E-value=30 Score=28.46 Aligned_cols=31 Identities=19% Similarity=0.256 Sum_probs=16.5
Q ss_pred ccCccCCccCChhh-HhHhhhhhccCCCCCCCCCcccCCCCccC
Q 000479 986 ICRFCGLKFDLLPD-LGRHHQAAHMGPNLVNSRPHKKGIRFYAY 1028 (1470)
Q Consensus 986 kC~~CGKsF~sks~-LkrH~~rvHtge~~~~eKpykC~iCgKsF 1028 (1470)
.|+.|+..|.-..+ |... .+..+|+.|+..|
T Consensus 4 ~CP~C~~~f~v~~~~l~~~------------~~~vrC~~C~~~f 35 (37)
T PF13719_consen 4 TCPNCQTRFRVPDDKLPAG------------GRKVRCPKCGHVF 35 (37)
T ss_pred ECCCCCceEEcCHHHcccC------------CcEEECCCCCcEe
Confidence 46666666655443 1111 3456666666655
No 121
>TIGR02098 MJ0042_CXXC MJ0042 family finger-like domain. This domain contains a CXXCX(19)CXXC motif suggestive of both zinc fingers and thioredoxin, usually found at the N-terminus of prokaryotic proteins. One partially characterized gene, agmX, is among a large set in Myxococcus whose interruption affects adventurous gliding motility.
Probab=25.23 E-value=32 Score=27.99 Aligned_cols=34 Identities=12% Similarity=0.122 Sum_probs=18.7
Q ss_pred cccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCC
Q 000479 985 FICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYK 1029 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~ 1029 (1470)
++|+.|+..|.-....... . .....|+.|+..|.
T Consensus 3 ~~CP~C~~~~~v~~~~~~~------~-----~~~v~C~~C~~~~~ 36 (38)
T TIGR02098 3 IQCPNCKTSFRVVDSQLGA------N-----GGKVRCGKCGHVWY 36 (38)
T ss_pred EECCCCCCEEEeCHHHcCC------C-----CCEEECCCCCCEEE
Confidence 4677777777655432211 1 22466777776653
No 122
>smart00531 TFIIE Transcription initiation factor IIE.
Probab=24.99 E-value=58 Score=34.78 Aligned_cols=39 Identities=13% Similarity=0.377 Sum_probs=24.6
Q ss_pred CCCCccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccc
Q 000479 843 EDEKTHKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSF 891 (1470)
Q Consensus 843 ~gekpfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF 891 (1470)
.....|.|+.|+..|.....+..- +. . ..|.|+.||...
T Consensus 95 ~~~~~Y~Cp~C~~~y~~~ea~~~~----d~-~-----~~f~Cp~Cg~~l 133 (147)
T smart00531 95 TNNAYYKCPNCQSKYTFLEANQLL----DM-D-----GTFTCPRCGEEL 133 (147)
T ss_pred cCCcEEECcCCCCEeeHHHHHHhc----CC-C-----CcEECCCCCCEE
Confidence 445568888888888865544321 11 2 348888888754
No 123
>PF09538 FYDLN_acid: Protein of unknown function (FYDLN_acid); InterPro: IPR012644 Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=24.93 E-value=36 Score=34.80 Aligned_cols=14 Identities=29% Similarity=0.643 Sum_probs=7.7
Q ss_pred cccccCCCCCCCCh
Q 000479 915 LQQCIPCGSHFGNT 928 (1470)
Q Consensus 915 pfkC~~CgK~F~sk 928 (1470)
|..|+.||..|.-.
T Consensus 26 PivCP~CG~~~~~~ 39 (108)
T PF09538_consen 26 PIVCPKCGTEFPPE 39 (108)
T ss_pred CccCCCCCCccCcc
Confidence 44566666555544
No 124
>TIGR00373 conserved hypothetical protein TIGR00373. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain.
Probab=24.75 E-value=48 Score=35.96 Aligned_cols=42 Identities=10% Similarity=0.173 Sum_probs=29.2
Q ss_pred hHHHhhccCCCCccccCcCCccccChhHHhhhhhhccccchhcccCccccccccccc
Q 000479 835 PLAIAGRSEDEKTHKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSF 891 (1470)
Q Consensus 835 L~~H~r~H~gekpfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF 891 (1470)
|..-+....+..-|.|+.|+..|+....+. ..|.|+.||...
T Consensus 97 lk~~l~~e~~~~~Y~Cp~c~~r~tf~eA~~---------------~~F~Cp~Cg~~L 138 (158)
T TIGR00373 97 LREKLEFETNNMFFICPNMCVRFTFNEAME---------------LNFTCPRCGAML 138 (158)
T ss_pred HHHHHhhccCCCeEECCCCCcEeeHHHHHH---------------cCCcCCCCCCEe
Confidence 333334455566789999998888777763 238899998764
No 125
>PRK00464 nrdR transcriptional regulator NrdR; Validated
Probab=24.66 E-value=34 Score=37.03 Aligned_cols=18 Identities=0% Similarity=-0.407 Sum_probs=11.9
Q ss_pred CcccCCCCccCCChhhhh
Q 000479 1018 PHKKGIRFYAYKLKSGRL 1035 (1470)
Q Consensus 1018 pykC~iCgKsF~~ks~L~ 1035 (1470)
.++|+.||++|.+-..+.
T Consensus 28 ~~~c~~c~~~f~~~e~~~ 45 (154)
T PRK00464 28 RRECLACGKRFTTFERVE 45 (154)
T ss_pred eeeccccCCcceEeEecc
Confidence 477777777777655443
No 126
>PF11722 zf-TRM13_CCCH: CCCH zinc finger in TRM13 protein; InterPro: IPR021721 This domain is found at the N terminus of TRM13 methyltransferase proteins. It is presumed to be a zinc binding domain. ; GO: 0008168 methyltransferase activity
Probab=23.94 E-value=43 Score=26.76 Aligned_cols=21 Identities=38% Similarity=0.634 Sum_probs=18.3
Q ss_pred ccCcccccccCCCcccccCCC
Q 000479 589 LGTRCKHRALYGSSFCKKHRP 609 (1470)
Q Consensus 589 ~g~~ckh~~~~~~~~ck~~~~ 609 (1470)
-.|.|+-...+|+.||..|.|
T Consensus 11 K~R~C~m~~~~g~~fC~~H~~ 31 (31)
T PF11722_consen 11 KKRFCKMTRKPGSRFCGEHMP 31 (31)
T ss_pred cccccCCeecCcCCccccCCC
Confidence 357899999999999999975
No 127
>PF12013 DUF3505: Protein of unknown function (DUF3505); InterPro: IPR022698 This family of proteins is functionally uncharacterised. This protein is found in eukaryotes. Proteins in this family are typically between 247 to 1018 amino acids in length. This region contains two segments that are likely to be C2H2 zinc binding domains.
Probab=23.86 E-value=60 Score=32.69 Aligned_cols=24 Identities=21% Similarity=0.400 Sum_probs=21.7
Q ss_pred ccc----cCCCCCCCChhhhhhhhhhcc
Q 000479 916 QQC----IPCGSHFGNTEELWLHVQSVH 939 (1470)
Q Consensus 916 fkC----~~CgK~F~sks~L~~H~k~~H 939 (1470)
|.| ..|+..+.+...+.+|++..|
T Consensus 81 ~~C~~~~~~C~y~~~~~~~m~~H~~~~H 108 (109)
T PF12013_consen 81 YRCQCDPPHCGYITRSKKTMRKHWRKEH 108 (109)
T ss_pred eeeecCCCCCCcEeccHHHHHHHHHHhc
Confidence 889 999999999999999988766
No 128
>COG1997 RPL43A Ribosomal protein L37AE/L43A [Translation, ribosomal structure and biogenesis]
Probab=23.20 E-value=40 Score=33.06 Aligned_cols=32 Identities=22% Similarity=0.278 Sum_probs=18.5
Q ss_pred cccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCC
Q 000479 983 RKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKL 1030 (1470)
Q Consensus 983 KpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ 1030 (1470)
.+|.|+.|++. .+ .++-+| -+.|..|++.|.-
T Consensus 34 ~~~~Cp~C~~~--------~V-kR~a~G-------IW~C~kCg~~fAG 65 (89)
T COG1997 34 AKHVCPFCGRT--------TV-KRIATG-------IWKCRKCGAKFAG 65 (89)
T ss_pred cCCcCCCCCCc--------ce-eeeccC-------eEEcCCCCCeecc
Confidence 35667777664 11 444444 5677777766653
No 129
>COG1198 PriA Primosomal protein N' (replication factor Y) - superfamily II helicase [DNA replication, recombination, and repair]
Probab=22.87 E-value=55 Score=43.62 Aligned_cols=33 Identities=24% Similarity=0.320 Sum_probs=24.8
Q ss_pred ccCCCCCccccc-CcEEEEEEeccccccceeeee
Q 000479 142 SSFSEPKWLEHD-ESVALWVKWRGKWQAGIRCAR 174 (1470)
Q Consensus 142 ~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~ 174 (1470)
.++.-|.-|+.+ -+..|+|.|+|+=..||=...
T Consensus 16 fdY~~p~~l~~~~~G~rV~VPfg~~~~~GiV~~~ 49 (730)
T COG1198 16 FDYLIPEGLEPDQPGSRVRVPFGGRLVVGIVVEL 49 (730)
T ss_pred ccccCCcccccCCCccEEEEEcCCceEEEEEEEe
Confidence 666667777774 568899999988788876554
No 130
>PRK06266 transcription initiation factor E subunit alpha; Validated
Probab=22.77 E-value=55 Score=36.23 Aligned_cols=33 Identities=12% Similarity=0.128 Sum_probs=21.3
Q ss_pred CccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccC
Q 000479 981 SIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAY 1028 (1470)
Q Consensus 981 geKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF 1028 (1470)
...-|.|+.|++.|+....+. .-|.|+.||...
T Consensus 114 ~~~~Y~Cp~C~~rytf~eA~~---------------~~F~Cp~Cg~~L 146 (178)
T PRK06266 114 NNMFFFCPNCHIRFTFDEAME---------------YGFRCPQCGEML 146 (178)
T ss_pred CCCEEECCCCCcEEeHHHHhh---------------cCCcCCCCCCCC
Confidence 345577877877777665542 247777777654
No 131
>PF13717 zinc_ribbon_4: zinc-ribbon domain
Probab=22.69 E-value=38 Score=27.79 Aligned_cols=33 Identities=12% Similarity=0.146 Sum_probs=17.6
Q ss_pred cccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccC
Q 000479 985 FICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAY 1028 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF 1028 (1470)
..|+.|+..|.-...... .. .+..+|+.|+..|
T Consensus 3 i~Cp~C~~~y~i~d~~ip------~~-----g~~v~C~~C~~~f 35 (36)
T PF13717_consen 3 ITCPNCQAKYEIDDEKIP------PK-----GRKVRCSKCGHVF 35 (36)
T ss_pred EECCCCCCEEeCCHHHCC------CC-----CcEEECCCCCCEe
Confidence 356666666665544211 11 3456666666655
No 132
>PF14353 CpXC: CpXC protein
Probab=22.49 E-value=26 Score=36.23 Aligned_cols=26 Identities=12% Similarity=-0.169 Sum_probs=18.9
Q ss_pred CCCcccCCCCccCCChhhhhhccccc
Q 000479 1016 SRPHKKGIRFYAYKLKSGRLSRPRFK 1041 (1470)
Q Consensus 1016 eKpykC~iCgKsF~~ks~L~~H~r~H 1041 (1470)
-..|.|+.||+.|.-...+..|-..|
T Consensus 36 l~~~~CP~Cg~~~~~~~p~lY~D~~~ 61 (128)
T PF14353_consen 36 LFSFTCPSCGHKFRLEYPLLYHDPEK 61 (128)
T ss_pred cCEEECCCCCCceecCCCEEEEcCCC
Confidence 34678888888888777777775443
No 133
>cd05834 HDGF_related The PWWP domain is an essential part of the Hepatoma Derived Growth Factor (HDGF) family of proteins, and is necessary for DNA binding by HDGF. This family of endogenous nuclear-targeted mitogens includes HRP (HDGF-related proteins 1, 2, 3, 4, or HPR1, HPR2, HPR3, HPR4, respectively) and lens epithelium-derived growth factor, LEDGF. Members of the HDGF family have been linked to human diseases, and HDGF is a prognostic factor in several types of cancer. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.
Probab=22.47 E-value=1.1e+02 Score=29.78 Aligned_cols=52 Identities=23% Similarity=0.233 Sum_probs=35.5
Q ss_pred EEEEEeccc-cccceeeeeccCCCccccccccCCCccEEEEEeccCCcchhhhhhccccccCC
Q 000479 157 ALWVKWRGK-WQAGIRCARADWPLPTLKAKPTHDRKKYFVIFFPHTRNYSWADMLLVRSINEF 218 (1470)
Q Consensus 157 ~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 218 (1470)
-+|.|=+|- |=-|+=|...+. +-..++|.|.||. |..|.||..-.+.++.++
T Consensus 8 lVwaK~kGyp~WPa~I~~~~~~---------~~~~~~~~V~FfG-t~~~a~v~~~~l~pf~~~ 60 (83)
T cd05834 8 LVFAKVKGYPAWPARVDEPEDW---------KPPGKKYPVYFFG-THETAFLKPEDLFPYTEN 60 (83)
T ss_pred EEEEecCCCCCCCEEEeccccc---------CCCCCEEEEEEeC-CCCEeEECHHHceecccc
Confidence 368887773 333444444332 2235789999999 789999998888888775
No 134
>TIGR02300 FYDLN_acid conserved hypothetical protein TIGR02300. Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=22.40 E-value=56 Score=34.27 Aligned_cols=33 Identities=24% Similarity=0.215 Sum_probs=20.2
Q ss_pred cccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCccCCChhhh
Q 000479 985 FICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYAYKLKSGR 1034 (1470)
Q Consensus 985 ykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKsF~~ks~L 1034 (1470)
..|+.||++|.-. + ..|..|+.||..|.....+
T Consensus 10 r~Cp~cg~kFYDL---n--------------k~p~vcP~cg~~~~~~~~~ 42 (129)
T TIGR02300 10 RICPNTGSKFYDL---N--------------RRPAVSPYTGEQFPPEEAL 42 (129)
T ss_pred ccCCCcCcccccc---C--------------CCCccCCCcCCccCcchhh
Confidence 5677777777542 1 2467777777776555333
No 135
>KOG1280 consensus Uncharacterized conserved protein containing ZZ-type Zn-finger [General function prediction only]
Probab=21.27 E-value=52 Score=39.59 Aligned_cols=32 Identities=22% Similarity=0.375 Sum_probs=27.1
Q ss_pred cCCccccccCccCCccCChhhHhHhhhhhccC
Q 000479 979 LGSIRKFICRFCGLKFDLLPDLGRHHQAAHMG 1010 (1470)
Q Consensus 979 HtgeKpykC~~CGKsF~sks~LkrH~~rvHtg 1010 (1470)
|-....|.|++|++.=.+...|..|+...|..
T Consensus 74 ~y~~qSftCPyC~~~Gfte~~f~~Hv~s~Hpd 105 (381)
T KOG1280|consen 74 HYDPQSFTCPYCGIMGFTERQFGTHVLSQHPE 105 (381)
T ss_pred ccccccccCCcccccccchhHHHHHhhhcCcc
Confidence 33445799999999999999999998899976
No 136
>KOG2593 consensus Transcription initiation factor IIE, alpha subunit [Transcription]
Probab=21.23 E-value=62 Score=40.13 Aligned_cols=41 Identities=12% Similarity=0.104 Sum_probs=26.9
Q ss_pred hcCCccccccCccCCccCChhhHhHhhhhhccCCCCCCCCCcccCCCCcc
Q 000479 978 NLGSIRKFICRFCGLKFDLLPDLGRHHQAAHMGPNLVNSRPHKKGIRFYA 1027 (1470)
Q Consensus 978 ~HtgeKpykC~~CGKsF~sks~LkrH~~rvHtge~~~~eKpykC~iCgKs 1027 (1470)
--+...-|.|+.|.++|.....|+- ...-+ -.|.|..|+-.
T Consensus 122 d~t~~~~Y~Cp~C~kkyt~Lea~~L--~~~~~-------~~F~C~~C~ge 162 (436)
T KOG2593|consen 122 DDTNVAGYVCPNCQKKYTSLEALQL--LDNET-------GEFHCENCGGE 162 (436)
T ss_pred hccccccccCCccccchhhhHHHHh--hcccC-------ceEEEecCCCc
Confidence 3444567889999999888776643 12222 37888888744
No 137
>KOG2593 consensus Transcription initiation factor IIE, alpha subunit [Transcription]
Probab=21.15 E-value=57 Score=40.41 Aligned_cols=48 Identities=15% Similarity=-0.060 Sum_probs=34.1
Q ss_pred HHHHHhcccccccccccccccCccccccCChHhhhhhhhcCCcccCCCCC
Q 000479 765 LKSILSLRNPVPMEIQFQWALSEASKDAGIGEFLMKLVCCEKERLSKTWG 814 (1470)
Q Consensus 765 ~~sL~sH~rsH~~ek~~~~kC~eC~K~F~s~~~L~k~iHtek~y~C~~Cg 814 (1470)
++.|..-++.-+....| .|+.|.+.|.....++........|.|..|+
T Consensus 113 ~krled~~~d~t~~~~Y--~Cp~C~kkyt~Lea~~L~~~~~~~F~C~~C~ 160 (436)
T KOG2593|consen 113 RKRLEDRLRDDTNVAGY--VCPNCQKKYTSLEALQLLDNETGEFHCENCG 160 (436)
T ss_pred HHHHHHHhhhccccccc--cCCccccchhhhHHHHhhcccCceEEEecCC
Confidence 44444444444555555 9999999999888877555556789999998
No 138
>PRK06266 transcription initiation factor E subunit alpha; Validated
Probab=21.00 E-value=60 Score=35.95 Aligned_cols=36 Identities=22% Similarity=0.546 Sum_probs=25.6
Q ss_pred cCCCCccccCcCCccccChhHHhhhhhhccccchhcccCcccccccccccc
Q 000479 842 SEDEKTHKCKICSQVFLHDQELGVHWMDNHKKEAQWLFRGYACAICLDSFT 892 (1470)
Q Consensus 842 H~gekpfkC~~CgK~F~s~s~L~~H~~~~Ht~e~~~~~Kpy~C~~CgKsF~ 892 (1470)
-....-|.|+.|+..|+....+. ..|.|+.||....
T Consensus 112 e~~~~~Y~Cp~C~~rytf~eA~~---------------~~F~Cp~Cg~~L~ 147 (178)
T PRK06266 112 EENNMFFFCPNCHIRFTFDEAME---------------YGFRCPQCGEMLE 147 (178)
T ss_pred ccCCCEEECCCCCcEEeHHHHhh---------------cCCcCCCCCCCCe
Confidence 34456688999998888776642 2488999987654
Done!