Query 028610
Match_columns 206
No_of_seqs 220 out of 1016
Neff 6.4
Searched_HMMs 46136
Date Fri Mar 29 14:12:01 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/028610.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/028610hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG3288 OTU-like cysteine prot 100.0 5.9E-68 1.3E-72 448.7 13.7 199 1-205 106-307 (307)
2 COG5539 Predicted cysteine pro 100.0 6.4E-37 1.4E-41 263.8 5.3 187 11-205 118-306 (306)
3 PF02338 OTU: OTU-like cystein 99.9 1.8E-26 4E-31 176.5 9.9 102 11-119 1-121 (121)
4 KOG2606 OTU (ovarian tumor)-li 99.9 1.9E-24 4.1E-29 187.3 10.1 122 4-126 158-299 (302)
5 PF10275 Peptidase_C65: Peptid 99.6 1.4E-14 3.1E-19 124.2 14.5 90 34-124 141-244 (244)
6 KOG3991 Uncharacterized conser 99.6 1E-14 2.2E-19 122.9 7.9 92 33-125 157-256 (256)
7 KOG2605 OTU (ovarian tumor)-li 99.5 2.9E-14 6.4E-19 128.9 3.6 120 5-125 218-344 (371)
8 COG5539 Predicted cysteine pro 98.9 2.7E-10 5.8E-15 99.2 1.9 116 5-122 171-304 (306)
9 PF05415 Peptidase_C36: Beet n 95.5 0.04 8.6E-07 40.7 5.3 77 10-105 3-83 (104)
10 PF02148 zf-UBP: Zn-finger in 87.5 0.28 6.1E-06 33.4 1.1 30 173-202 10-42 (63)
11 PF12874 zf-met: Zinc-finger o 85.9 0.34 7.4E-06 26.4 0.7 25 175-199 1-25 (25)
12 PF12756 zf-C2H2_2: C2H2 type 84.6 1 2.2E-05 32.0 2.9 30 174-203 50-79 (100)
13 PF00096 zf-C2H2: Zinc finger, 80.3 1.2 2.5E-05 23.7 1.4 21 176-196 2-22 (23)
14 PF12171 zf-C2H2_jaz: Zinc-fin 78.6 0.4 8.7E-06 27.0 -0.9 25 175-199 2-26 (27)
15 smart00290 ZnF_UBP Ubiquitin C 78.3 1.3 2.9E-05 28.2 1.4 29 174-202 11-42 (50)
16 PF13894 zf-C2H2_4: C2H2-type 74.6 2.6 5.6E-05 21.9 1.7 21 176-196 2-22 (24)
17 KOG0804 Cytoplasmic Zn-finger 72.0 2.2 4.7E-05 40.2 1.6 25 176-200 242-269 (493)
18 smart00355 ZnF_C2H2 zinc finge 71.8 2.9 6.2E-05 21.9 1.5 21 175-195 1-21 (26)
19 PF13912 zf-C2H2_6: C2H2-type 67.6 4.5 9.7E-05 22.2 1.7 21 175-195 2-22 (27)
20 PF05412 Peptidase_C33: Equine 65.5 4.4 9.6E-05 30.9 1.8 84 10-126 4-87 (108)
21 PRK10963 hypothetical protein; 64.8 6.1 0.00013 33.5 2.8 36 37-78 6-41 (223)
22 smart00451 ZnF_U1 U1-like zinc 59.9 5.3 0.00011 23.2 1.1 25 175-199 4-28 (35)
23 PHA03082 DNA-dependent RNA pol 59.8 5.1 0.00011 27.3 1.1 18 173-190 3-20 (63)
24 PF05864 Chordopox_RPO7: Chord 59.0 5.6 0.00012 27.1 1.2 18 173-190 3-20 (63)
25 PF13913 zf-C2HC_2: zinc-finge 53.7 11 0.00024 20.9 1.7 20 175-195 3-22 (25)
26 PHA00616 hypothetical protein 53.4 6 0.00013 25.4 0.6 29 175-203 2-31 (44)
27 COG4049 Uncharacterized protei 52.2 7.6 0.00017 26.4 1.0 27 170-196 13-39 (65)
28 PF05381 Peptidase_C21: Tymovi 51.5 81 0.0018 23.9 6.6 89 13-121 2-94 (104)
29 COG3426 Butyrate kinase [Energ 50.5 14 0.0003 33.3 2.6 60 17-84 274-341 (358)
30 cd00729 rubredoxin_SM Rubredox 41.8 13 0.00029 22.2 0.8 14 174-187 2-15 (34)
31 PF09237 GAGA: GAGA factor; I 41.0 24 0.00053 23.5 2.0 22 176-197 26-47 (54)
32 cd02669 Peptidase_C19M A subfa 38.7 16 0.00034 34.0 1.2 28 174-201 28-58 (440)
33 PF13909 zf-H2C2_5: C2H2-type 38.3 38 0.00081 17.9 2.3 21 175-196 1-21 (24)
34 PRK06266 transcription initiat 37.9 30 0.00065 28.5 2.6 50 136-186 80-129 (178)
35 PF04475 DUF555: Protein of un 37.5 44 0.00096 25.2 3.2 39 148-186 21-59 (102)
36 PF13465 zf-H2C2_2: Zinc-finge 36.4 18 0.00038 20.0 0.8 18 168-185 8-25 (26)
37 PF04877 Hairpins: HrpZ; Inte 35.6 33 0.00071 30.8 2.6 49 32-85 163-211 (308)
38 PHA02768 hypothetical protein; 34.6 34 0.00074 23.0 2.0 21 176-196 7-27 (55)
39 PF09082 DUF1922: Domain of un 33.6 15 0.00032 25.8 0.2 21 164-185 10-30 (68)
40 PF05148 Methyltransf_8: Hypot 33.4 1E+02 0.0023 26.4 5.3 71 16-95 13-102 (219)
41 COG3357 Predicted transcriptio 32.2 53 0.0012 24.5 2.9 36 149-186 35-70 (97)
42 PHA00732 hypothetical protein 31.5 42 0.00092 23.9 2.2 17 177-193 4-20 (79)
43 cd01675 RNR_III Class III ribo 31.2 67 0.0015 31.0 4.2 36 148-187 495-531 (555)
44 PRK09784 hypothetical protein; 29.7 28 0.00062 30.7 1.3 19 5-23 200-218 (417)
45 PF04959 ARS2: Arsenite-resist 29.4 42 0.00091 28.6 2.2 28 168-195 71-98 (214)
46 PF09494 Slx4: Slx4 endonuclea 29.4 1.8E+02 0.0038 19.6 5.0 42 37-79 3-48 (64)
47 TIGR00373 conserved hypothetic 29.3 48 0.001 26.6 2.5 50 135-185 71-120 (158)
48 PF13240 zinc_ribbon_2: zinc-r 29.2 33 0.0007 18.7 1.0 14 168-184 10-23 (23)
49 TIGR02934 nifT_nitrog probable 28.7 4.1 8.9E-05 28.5 -3.3 23 53-75 16-38 (67)
50 cd00350 rubredoxin_like Rubred 28.7 23 0.00049 20.9 0.4 14 175-188 2-15 (33)
51 PF07967 zf-C3HC: C3HC zinc fi 28.7 33 0.00072 26.5 1.4 23 165-187 34-56 (133)
52 COG2051 RPS27A Ribosomal prote 28.7 31 0.00067 24.1 1.1 15 171-185 35-49 (67)
53 smart00238 BIR Baculoviral inh 27.8 1.2E+02 0.0025 20.4 3.9 40 162-202 25-68 (71)
54 PF07368 DUF1487: Protein of u 26.9 1.2E+02 0.0025 26.0 4.5 43 4-46 6-56 (215)
55 KOG1247 Methionyl-tRNA synthet 26.8 35 0.00077 32.3 1.4 61 114-186 86-148 (567)
56 COG5134 Uncharacterized conser 26.8 41 0.00089 29.0 1.7 32 150-181 14-49 (272)
57 cd00022 BIR Baculoviral inhibi 26.6 1.2E+02 0.0026 20.1 3.8 40 162-202 23-66 (69)
58 COG2174 RPL34A Ribosomal prote 26.2 30 0.00065 25.7 0.7 14 175-188 35-48 (93)
59 KOG1790 60s ribosomal protein 25.6 24 0.00052 27.4 0.1 25 165-189 32-56 (121)
60 smart00531 TFIIE Transcription 25.4 96 0.0021 24.4 3.5 62 135-198 61-122 (147)
61 PF04340 DUF484: Protein of un 24.5 34 0.00073 28.8 0.8 16 37-52 9-24 (225)
62 PF13451 zf-trcl: Probable zin 24.5 67 0.0014 21.1 2.0 27 172-198 2-28 (49)
63 PF01199 Ribosomal_L34e: Ribos 23.9 38 0.00081 25.2 0.8 21 172-192 39-59 (94)
64 PRK05452 anaerobic nitric oxid 23.5 89 0.0019 29.5 3.5 51 137-189 373-440 (479)
65 PF12907 zf-met2: Zinc-binding 22.8 40 0.00087 21.1 0.7 23 174-196 1-26 (40)
66 COG1592 Rubrerythrin [Energy p 22.5 1E+02 0.0022 25.3 3.2 13 174-186 134-146 (166)
67 PF08209 Sgf11: Sgf11 (transcr 22.3 61 0.0013 19.4 1.4 19 173-191 3-21 (33)
68 PF03884 DUF329: Domain of unk 22.2 40 0.00086 22.8 0.6 13 173-185 1-13 (57)
69 PRK03922 hypothetical protein; 22.1 1E+02 0.0022 23.7 2.9 38 149-186 23-61 (113)
70 PF13717 zinc_ribbon_4: zinc-r 21.8 59 0.0013 19.6 1.3 14 171-184 22-35 (36)
71 PF12091 DUF3567: Protein of u 21.5 93 0.002 22.8 2.5 36 31-69 46-81 (85)
72 PLN02748 tRNA dimethylallyltra 21.3 51 0.0011 31.4 1.4 25 176-200 420-445 (468)
73 PF00356 LacI: Bacterial regul 21.2 1.9E+02 0.0041 18.3 3.7 27 19-45 14-40 (46)
74 PF06107 DUF951: Bacterial pro 21.2 53 0.0011 22.3 1.1 15 171-185 28-42 (57)
75 PF00653 BIR: Inhibitor of Apo 20.9 1.1E+02 0.0024 20.7 2.7 39 162-201 25-67 (70)
76 PF15412 Nse4-Nse3_bdg: Bindin 20.7 35 0.00076 22.6 0.1 36 144-179 2-37 (56)
77 PF10588 NADH-G_4Fe-4S_3: NADH 20.6 79 0.0017 19.6 1.7 18 72-89 21-38 (41)
78 PF14749 Acyl-CoA_ox_N: Acyl-c 20.2 2.3E+02 0.005 21.1 4.6 31 20-51 4-34 (125)
79 KOG2785 C2H2-type Zn-finger pr 20.2 95 0.0021 28.8 2.8 34 168-201 62-95 (390)
No 1
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=5.9e-68 Score=448.71 Aligned_cols=199 Identities=57% Similarity=1.022 Sum_probs=189.8
Q ss_pred CCCcEEEEEeCCCCchhhHHHHHHHhcCCCC-hHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCcccchHHHH
Q 028610 1 MEGIIVRRVIPSDNSCLFNAVGYVMEHDKNK-APELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELS 79 (206)
Q Consensus 1 ~~g~L~~~~ip~DGnCLFrAis~~l~g~~~~-~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ 79 (206)
|+|.|.+|+||+||||||+||+|.+.+.... ..+||++||+.+.+||+.|+++|||++..|||+||+++.+|||+|||+
T Consensus 106 ~~gvl~~~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n~eYc~WI~k~dsWGGaIEls 185 (307)
T KOG3288|consen 106 GEGVLSRRVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPNKEYCAWILKMDSWGGAIELS 185 (307)
T ss_pred ccceeEEEeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCcHHHHHHHccccccCceEEee
Confidence 5799999999999999999999999987543 469999999999999999999999999999999999999999999999
Q ss_pred HHHHHhCCcEEEEECCCCceeEeCCC--CCceEEEEEcCCcceeeeecCCCCCCCCCCeeeeeCCCCCchhHHHHHHHHH
Q 028610 80 ILADYYGREIAAYDIQTTRCDLYGQK--YSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQKGRTIGPAEDLALKL 157 (206)
Q Consensus 80 ala~~~~v~I~v~~~~~~~~~~fg~~--~~~~i~LlY~G~HYD~l~~~~~~~~~~~~d~t~f~~~d~~~~~~~~~~a~~l 157 (206)
|||++|+++|+|+|+++.++++||++ +..|++|+|+|+|||+|++.+. .|++.|.|+||.+| +.++.+|++|
T Consensus 186 ILS~~ygveI~vvDiqt~rid~fged~~~~~rv~llydGIHYD~l~m~~~--~~~~~~~tifp~~d----d~v~~~alqL 259 (307)
T KOG3288|consen 186 ILSDYYGVEICVVDIQTVRIDRFGEDKNFDNRVLLLYDGIHYDPLAMNEF--KPTDVDNTIFPVSD----DTVLTQALQL 259 (307)
T ss_pred eehhhhceeEEEEecceeeehhcCCCCCCCceEEEEecccccChhhhccC--CccCCccccccccc----chHHHHHHHH
Confidence 99999999999999999999999998 7999999999999999999976 67788999999999 6678999999
Q ss_pred HHHHhhCCCccccCCceeeccccCCcccCHHHHHHHHHhhCCCccccc
Q 028610 158 VKEQQRKKTYTDTANFTLRCGVCQIGVIGQKEAVEHAQATGHVNFQEY 205 (206)
Q Consensus 158 ~~~~~~~~~~t~t~~~~l~C~~c~~~~~g~~~a~~Ha~~tgH~~F~e~ 205 (206)
|+++|++||||||++|+|||.+|++.|.||++|++||++|||+||||+
T Consensus 260 a~~~k~~r~ytdt~~ftlRC~~Cq~glvGq~ea~eHA~~TGH~nFge~ 307 (307)
T KOG3288|consen 260 ASELKRTRYYTDTAKFTLRCMVCQMGLVGQKEAAEHAKATGHVNFGEY 307 (307)
T ss_pred HHHHHhcceeccccceEEEeeecccceeeHHHHHHHHHhcCCCccccC
Confidence 999999999999999999999999999999999999999999999996
No 2
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=6.4e-37 Score=263.83 Aligned_cols=187 Identities=30% Similarity=0.547 Sum_probs=166.8
Q ss_pred CCCCchhhHHHHHHHhcCCCChHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCccc-chHHHHHHHHHhCCcE
Q 028610 11 PSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWG-GAIELSILADYYGREI 89 (206)
Q Consensus 11 p~DGnCLFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WG-g~iEL~ala~~~~v~I 89 (206)
-+|++|+|++.++.++.. ...+||..|+..+.+|||.|++++++.|.-.|+.||.++..|| |+||+.++|+.+++.|
T Consensus 118 ~~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~i 195 (306)
T COG5539 118 QDDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVRI 195 (306)
T ss_pred CCchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcchHHHHHhhhccccCCCceEEEeEeccccceee
Confidence 367999999999999864 5789999999999999999999999999999999999999999 9999999999999999
Q ss_pred EEEECCCCceeEeCCC-CCceEEEEEcCCcceeeeecCCCCCCCCCCeeeeeCCCCCchhHHHHHHHHHHHHHhhCCCcc
Q 028610 90 AAYDIQTTRCDLYGQK-YSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYT 168 (206)
Q Consensus 90 ~v~~~~~~~~~~fg~~-~~~~i~LlY~G~HYD~l~~~~~~~~~~~~d~t~f~~~d~~~~~~~~~~a~~l~~~~~~~~~~t 168 (206)
++++++..+.++|++. +..++.++|+|+|||.....-.+ ..+..+.-.|+.+|- +.-.+++||+-|+..+|||
T Consensus 196 ~~Vdv~~~~~dr~~~~~~~q~~~i~f~g~hfD~~t~~m~~-~dt~~ne~~~~a~~g-----~~~ei~qLas~lk~~~~~~ 269 (306)
T COG5539 196 HVVDVDKDSEDRYNSHPYVQRISILFTGIHFDEETLAMVL-WDTYVNEVLFDASDG-----ITIEIQQLASLLKNPHYYT 269 (306)
T ss_pred eeeecchhHHhhccCChhhhhhhhhhcccccchhhhhcch-HHHHHhhhccccccc-----chHHHHHHHHHhcCceEEe
Confidence 9999998999999988 77899999999999999865332 223345566766662 4566777888889999999
Q ss_pred ccCCceeeccccCCcccCHHHHHHHHHhhCCCccccc
Q 028610 169 DTANFTLRCGVCQIGVIGQKEAVEHAQATGHVNFQEY 205 (206)
Q Consensus 169 ~t~~~~l~C~~c~~~~~g~~~a~~Ha~~tgH~~F~e~ 205 (206)
||+++++||+.||+.|.|++++.+||..|||+||+|.
T Consensus 270 nT~~~~ik~n~c~~~~~~e~~~~~Ha~a~GH~n~~~d 306 (306)
T COG5539 270 NTASPSIKCNICGTGFVGEKDYYAHALATGHYNFGED 306 (306)
T ss_pred ecCCceEEeeccccccchhhHHHHHHHhhcCccccCC
Confidence 9999999999999999999999999999999999973
No 3
>PF02338 OTU: OTU-like cysteine protease; InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65). None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity [].; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.94 E-value=1.8e-26 Score=176.52 Aligned_cols=102 Identities=29% Similarity=0.519 Sum_probs=85.9
Q ss_pred CCCCchhhHHHHHHHh----cCCCChHHHHHHHHHHHh-hChhhhhhhhcCCCHHHHHHhhCCCCcccchHHHHHHHHHh
Q 028610 11 PSDNSCLFNAVGYVME----HDKNKAPELRQVIAATVA-SDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYY 85 (206)
Q Consensus 11 p~DGnCLFrAis~~l~----g~~~~~~~lR~~v~~~i~-~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ala~~~ 85 (206)
||||||||||||++|+ |++..|.+||+.|+++|+ .|++.|.+.+.+. +|+++++|||++||+|+|++|
T Consensus 1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~~~~~~~~-------~~~~~~~Wg~~~el~a~a~~~ 73 (121)
T PF02338_consen 1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKFEEFLEGD-------KMSKPGTWGGEIELQALANVL 73 (121)
T ss_dssp -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHHHHHHHHH-------HHTSTTSHEEHHHHHHHHHHH
T ss_pred CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchhhhhhhhh-------hhccccccCcHHHHHHHHHHh
Confidence 8999999999999999 999999999999999999 9999995555443 999999999999999999999
Q ss_pred CCcEEEEECCCCce---eEeCC----C-CCceEEEEEc------CCcc
Q 028610 86 GREIAAYDIQTTRC---DLYGQ----K-YSERVMLIYD------GLHY 119 (206)
Q Consensus 86 ~v~I~v~~~~~~~~---~~fg~----~-~~~~i~LlY~------G~HY 119 (206)
+++|.|++...+.. ..+.. . ..+.+.|.|. |+||
T Consensus 74 ~~~I~v~~~~~~~~~~~~~~~~~~~~~~~~~~i~l~~~~~l~~~~~Hy 121 (121)
T PF02338_consen 74 NRPIIVYSSSDGDNVVFIKFTGKYPPLESPPPICLCYHGHLYYTGNHY 121 (121)
T ss_dssp TSEEEEECETTTBEEEEEEESCEESTTTTTTSEEEEEETEEEEETTEE
T ss_pred CCeEEEEEcCCCCccceeeecCccccCCCCCeEEEEEcCCccCCCCCC
Confidence 99999998766643 23322 2 4677777775 6898
No 4
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.91 E-value=1.9e-24 Score=187.26 Aligned_cols=122 Identities=24% Similarity=0.445 Sum_probs=104.9
Q ss_pred cEEEEEeCCCCchhhHHHHHHHhcC---CCChHHHHHHHHHHHhhChhhhhhhhcCC---------CHHHHHHhhCCCCc
Q 028610 4 IIVRRVIPSDNSCLFNAVGYVMEHD---KNKAPELRQVIAATVASDPVKYSEAFLGK---------SNQEYCSWIQDPEK 71 (206)
Q Consensus 4 ~L~~~~ip~DGnCLFrAis~~l~g~---~~~~~~lR~~v~~~i~~np~~y~e~~l~~---------~~~~Y~~~m~~~~~ 71 (206)
.|....||+||+|||+||++||.-. .-..+.||..+|+||++|.+.|..+++.. +|+.||+.|+++..
T Consensus 158 ~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df~pf~~~eet~d~~~~~~f~~Yc~eI~~t~~ 237 (302)
T KOG2606|consen 158 GLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDFLPFLLDEETGDSLGPEDFDKYCREIRNTAA 237 (302)
T ss_pred cCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHhhhHhcCccccccCCHHHHHHHHHHhhhhcc
Confidence 4788999999999999999999532 35679999999999999999996666632 49999999999999
Q ss_pred ccchHHHHHHHHHhCCcEEEEECCCCceeEeCCCC--CceEEEEE------cCCcceeeeecC
Q 028610 72 WGGAIELSILADYYGREIAAYDIQTTRCDLYGQKY--SERVMLIY------DGLHYDALAISP 126 (206)
Q Consensus 72 WGg~iEL~ala~~~~v~I~v~~~~~~~~~~fg~~~--~~~i~LlY------~G~HYD~l~~~~ 126 (206)
|||+|||.|||..|.+||.||... +.+..||+.+ .+++.|+| .|.||+++.+..
T Consensus 238 WGgelEL~AlShvL~~PI~Vy~~~-~p~~~~geey~kd~pL~lvY~rH~y~LGeHYNS~~~~~ 299 (302)
T KOG2606|consen 238 WGGELELKALSHVLQVPIEVYQAD-GPILEYGEEYGKDKPLILVYHRHAYGLGEHYNSVTPLK 299 (302)
T ss_pred ccchHHHHHHHHhhccCeEEeecC-CCceeechhhCCCCCeeeehHHhHHHHHhhhccccccc
Confidence 999999999999999999999855 5688999883 37888887 378999987653
No 5
>PF10275 Peptidase_C65: Peptidase C65 Otubain; InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour domain) in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif. ; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=99.61 E-value=1.4e-14 Score=124.18 Aligned_cols=90 Identities=24% Similarity=0.365 Sum_probs=68.9
Q ss_pred HHHHHHHHHHhhChhhhhhhhc-C---CCHHHHHH-hhCCCCcccchHHHHHHHHHhCCcEEEEECCCC---c---eeEe
Q 028610 34 ELRQVIAATVASDPVKYSEAFL-G---KSNQEYCS-WIQDPEKWGGAIELSILADYYGREIAAYDIQTT---R---CDLY 102 (206)
Q Consensus 34 ~lR~~v~~~i~~np~~y~e~~l-~---~~~~~Y~~-~m~~~~~WGg~iEL~ala~~~~v~I~v~~~~~~---~---~~~f 102 (206)
.||..++.||+.|++.| ++|+ + .++++||+ .+..++.=.+++.|.|||+.++++|.|+..+.+ . ...|
T Consensus 141 flRLlts~~l~~~~d~y-~~fi~~~~~~tve~~C~~~Vep~~~Ead~v~i~ALa~aL~v~i~v~yld~~~~~~~~~~~~~ 219 (244)
T PF10275_consen 141 FLRLLTSAYLKSNSDEY-EPFIDGLEYLTVEEFCSQEVEPMGKEADHVQIIALAQALGVPIRVEYLDRSVEGDEVNRHEF 219 (244)
T ss_dssp HHHHHHHHHHHHTHHHH-GGGSSTT--S-HHHHHHHHTSSTT--B-HHHHHHHHHHHT--EEEEESSSSGCSTTSEEEEE
T ss_pred HHHHHHHHHHHhhHHHH-hhhhcccccCCHHHHHHhhcccccccchhHHHHHHHHHhCCeEEEEEecCCCCCCccccccC
Confidence 59999999999999999 5555 4 78999997 589999999999999999999999999987632 1 3456
Q ss_pred CC---CCCceEEEEEcCCcceeeee
Q 028610 103 GQ---KYSERVMLIYDGLHYDALAI 124 (206)
Q Consensus 103 g~---~~~~~i~LlY~G~HYD~l~~ 124 (206)
.+ +..++|.|+|...|||.+++
T Consensus 220 ~~~~~~~~~~i~LLyrpgHYdIly~ 244 (244)
T PF10275_consen 220 PPDNESQEPQITLLYRPGHYDILYP 244 (244)
T ss_dssp S-SSTTSS-SEEEEEETBEEEEEEE
T ss_pred CCccCCCCCEEEEEEcCCccccccC
Confidence 42 25788999999889999985
No 6
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=99.55 E-value=1e-14 Score=122.94 Aligned_cols=92 Identities=20% Similarity=0.293 Sum_probs=76.1
Q ss_pred HHHHHHHHHHHhhChhhhhhhhc-C-CCHHHHHHh-hCCCCcccchHHHHHHHHHhCCcEEEEECCCCce-----eEeCC
Q 028610 33 PELRQVIAATVASDPVKYSEAFL-G-KSNQEYCSW-IQDPEKWGGAIELSILADYYGREIAAYDIQTTRC-----DLYGQ 104 (206)
Q Consensus 33 ~~lR~~v~~~i~~np~~y~e~~l-~-~~~~~Y~~~-m~~~~~WGg~iEL~ala~~~~v~I~v~~~~~~~~-----~~fg~ 104 (206)
..||..++.+|++|++.| ++|+ | ++..+||.. +.....-.|||+|.|||+.+++.|.|..++.+.- ..|.+
T Consensus 157 ~ylRLvtS~~ik~~adfy-~pFI~e~~tV~~fC~~eVEPm~kesdhi~I~ALs~Al~i~irVey~dr~~~~~~~hH~fpe 235 (256)
T KOG3991|consen 157 MYLRLVTSGFIKSNADFY-QPFIDEGMTVKAFCTQEVEPMYKESDHIHITALSQALGIRIRVEYVDRGSGDTVNHHDFPE 235 (256)
T ss_pred HHHHHHHHHHHhhChhhh-hccCCCCCcHHHHHHhhcchhhhccCceeHHHHHhhhCceEEEEEecCCCCCCCCCCcCcc
Confidence 469999999999999999 5565 3 689999995 7777788999999999999999999998764322 23434
Q ss_pred CCCceEEEEEcCCcceeeeec
Q 028610 105 KYSERVMLIYDGLHYDALAIS 125 (206)
Q Consensus 105 ~~~~~i~LlY~G~HYD~l~~~ 125 (206)
...++|+|+|...|||+|++.
T Consensus 236 ~s~P~I~LLYrpGHYdilY~~ 256 (256)
T KOG3991|consen 236 ASAPEIYLLYRPGHYDILYKK 256 (256)
T ss_pred ccCceEEEEecCCccccccCC
Confidence 467889999999999999863
No 7
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.46 E-value=2.9e-14 Score=128.94 Aligned_cols=120 Identities=19% Similarity=0.211 Sum_probs=95.0
Q ss_pred EEEEEeCCCCchhhHHHHHHHhcCCCChHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCcccchHHHHHHHH-
Q 028610 5 IVRRVIPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILAD- 83 (206)
Q Consensus 5 L~~~~ip~DGnCLFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ala~- 83 (206)
+..++|.+||||+|||+++|++++.+.|..+|+.+++++..+++.| +-++.+++.+|++.++.++.||.+||+||+|.
T Consensus 218 ~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~-~~~vt~~~~~y~k~kr~~~~~gnhie~Qa~a~~ 296 (371)
T KOG2605|consen 218 FEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFY-EDYVTEDFTSYIKRKRADGEPGNHIEQQAAADI 296 (371)
T ss_pred hhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhccccc-ccccccchhhcccccccCCCCcchHHHhhhhhh
Confidence 4567899999999999999999999999999999999999999999 56778889999999999999999999999995
Q ss_pred --HhCCcEEEEECCCCceeEeCCCCCceEE-EEE---cCCcceeeeec
Q 028610 84 --YYGREIAAYDIQTTRCDLYGQKYSERVM-LIY---DGLHYDALAIS 125 (206)
Q Consensus 84 --~~~v~I~v~~~~~~~~~~fg~~~~~~i~-LlY---~G~HYD~l~~~ 125 (206)
....++.+....+..+....+....++- +.| .-.||+.++..
T Consensus 297 ~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~~~n~~~~~h~~~~~~~ 344 (371)
T KOG2605|consen 297 YEEIEKPLNITSFKDTCYIQTPPAIEESVKMEKYNFWVEVHYNTARHS 344 (371)
T ss_pred hhhccccceeecccccceeccCcccccchhhhhhcccchhhhhhcccc
Confidence 4445555555555544444443222222 223 35699998875
No 8
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=98.94 E-value=2.7e-10 Score=99.24 Aligned_cols=116 Identities=16% Similarity=0.031 Sum_probs=85.3
Q ss_pred EEEEEeCCCCchhhHHHHHHHhc-----CCCChHHHHHHHHHHHhhChhhhhhhhcC------CCHHHHHHhhCCCCccc
Q 028610 5 IVRRVIPSDNSCLFNAVGYVMEH-----DKNKAPELRQVIAATVASDPVKYSEAFLG------KSNQEYCSWIQDPEKWG 73 (206)
Q Consensus 5 L~~~~ip~DGnCLFrAis~~l~g-----~~~~~~~lR~~v~~~i~~np~~y~e~~l~------~~~~~Y~~~m~~~~~WG 73 (206)
|+--.++|||+|+|-+||++|.- +.+..+.+|-.=..|.+.+.+.|+....+ .+|++|++.|+.+..||
T Consensus 171 i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f~g~hfD~~t~~m~~~dt~~ne~~~~a~~g 250 (306)
T COG5539 171 IVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILFTGIHFDEETLAMVLWDTYVNEVLFDASDG 250 (306)
T ss_pred hhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhhcccccchhhhhcchHHHHHhhhccccccc
Confidence 44457899999999999999963 22344677777777777777777432222 37999999999999999
Q ss_pred chHHHHHHHHHhCCcEEEEECCCCceeEeCCCCCceEE-EE-----E-cCCcceee
Q 028610 74 GAIELSILADYYGREIAAYDIQTTRCDLYGQKYSERVM-LI-----Y-DGLHYDAL 122 (206)
Q Consensus 74 g~iEL~ala~~~~v~I~v~~~~~~~~~~fg~~~~~~i~-Ll-----Y-~G~HYD~l 122 (206)
+.||+++||..|++++++++..+ .+++|++=...++. +. | .| ||+.+
T Consensus 251 ~~~ei~qLas~lk~~~~~~nT~~-~~ik~n~c~~~~~~e~~~~~Ha~a~G-H~n~~ 304 (306)
T COG5539 251 ITIEIQQLASLLKNPHYYTNTAS-PSIKCNICGTGFVGEKDYYAHALATG-HYNFG 304 (306)
T ss_pred chHHHHHHHHHhcCceEEeecCC-ceEEeeccccccchhhHHHHHHHhhc-Ccccc
Confidence 99999999999999999997553 36777653122221 11 2 46 99976
No 9
>PF05415 Peptidase_C36: Beet necrotic yellow vein furovirus-type papain-like endopeptidase; InterPro: IPR008746 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases corresponds to MEROPS peptidase family C36 (clan CA). The type example is beet necrotic yellow vein furovirus-type papain-like endopeptidase (beet necrotic yellow vein virus), which is involved in processing the viral polyprotein.
Probab=95.47 E-value=0.04 Score=40.75 Aligned_cols=77 Identities=13% Similarity=0.321 Sum_probs=53.1
Q ss_pred eCCCCchhhHHHHHHHhcCCCChHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCC--CCcccchHHHHHHHHHhCC
Q 028610 10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQD--PEKWGGAIELSILADYYGR 87 (206)
Q Consensus 10 ip~DGnCLFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~--~~~WGg~iEL~ala~~~~v 87 (206)
+..|||||--|||.+|.-+-+ .+-.-|+.|.. ..+.|+.|.++ |.+|-+- ..+|+.+++
T Consensus 3 ~sR~NNCLVVAis~~L~~T~e-------~l~~~M~An~~---------~i~~y~~W~r~~~~STW~DC---~mFA~~LkV 63 (104)
T PF05415_consen 3 ASRPNNCLVVAISECLGVTLE-------KLDNLMQANVS---------TIKKYHTWLRKKRPSTWDDC---RMFADALKV 63 (104)
T ss_pred ccCCCCeEeehHHHHhcchHH-------HHHHHHHhhHH---------HHHHHHHHHhcCCCCcHHHH---HHHHHhhee
Confidence 467999999999999985432 22234555533 26789999765 7799775 468999999
Q ss_pred cEEEEECC-CC-ceeEeCCC
Q 028610 88 EIAAYDIQ-TT-RCDLYGQK 105 (206)
Q Consensus 88 ~I~v~~~~-~~-~~~~fg~~ 105 (206)
.|.+--.. .+ .+.-|+++
T Consensus 64 sm~vkV~~~~~~~l~~~~d~ 83 (104)
T PF05415_consen 64 SMQVKVLSDKPYDLLYFVDG 83 (104)
T ss_pred EEEEEEcCCCCceeeEeecC
Confidence 98875443 33 24456665
No 10
>PF02148 zf-UBP: Zn-finger in ubiquitin-hydrolases and other protein; InterPro: IPR001607 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. 
The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP) [, ]. All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties. Some of the proteins containing a UBP zinc finger include: Homo sapiens (Human) deubiquitinating enzyme 13 (UBPD); Human deubiquitinating enzyme 5 (UBP5); Dictyostelium discoideum (Slime mold) deubiquitinating enzyme A (UBPA); Saccharomyces cerevisiae (Baker's yeast) deubiquitinating enzyme 8 (UBP8); Yeast deubiquitinating enzyme 14 (UBP14). More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding; PDB: 3GV4_A 3PHD_B 3C5K_A 2UZG_A 3IHP_B 2G43_B 2G45_D 2I50_A 3MHH_A 3MHS_A ....
Probab=87.52 E-value=0.28 Score=33.38 Aligned_cols=30 Identities=27% Similarity=0.313 Sum_probs=22.2
Q ss_pred ceeeccccCCcccCH---HHHHHHHHhhCCCcc
Q 028610 173 FTLRCGVCQIGVIGQ---KEAVEHAQATGHVNF 202 (206)
Q Consensus 173 ~~l~C~~c~~~~~g~---~~a~~Ha~~tgH~~F 202 (206)
-...|+.||+.+-|. .-|.+|+++|||.=|
T Consensus 10 ~lw~CL~Cg~~~C~~~~~~Ha~~H~~~~~H~l~ 42 (63)
T PF02148_consen 10 NLWLCLTCGYVGCGRYSNGHALKHYKETGHPLA 42 (63)
T ss_dssp SEEEETTTS-EEETTTSTSHHHHHHHHHT--EE
T ss_pred ceEEeCCCCcccccCCcCcHHHHhhcccCCeEE
Confidence 345799999999985 679999999999633
No 11
>PF12874 zf-met: Zinc-finger of C2H2 type; PDB: 1ZU1_A 2KVG_A.
Probab=85.94 E-value=0.34 Score=26.43 Aligned_cols=25 Identities=16% Similarity=0.504 Sum_probs=21.3
Q ss_pred eeccccCCcccCHHHHHHHHHhhCC
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQATGH 199 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~~tgH 199 (206)
..|.+|++.+.++...+.|-+...|
T Consensus 1 ~~C~~C~~~f~s~~~~~~H~~s~~H 25 (25)
T PF12874_consen 1 FYCDICNKSFSSENSLRQHLRSKKH 25 (25)
T ss_dssp EEETTTTEEESSHHHHHHHHTTHHH
T ss_pred CCCCCCCCCcCCHHHHHHHHCcCCC
Confidence 3699999999999999999876543
No 12
>PF12756 zf-C2H2_2: C2H2 type zinc-finger (2 copies); PDB: 2DMI_A.
Probab=84.62 E-value=1 Score=31.99 Aligned_cols=30 Identities=20% Similarity=0.411 Sum_probs=26.5
Q ss_pred eeeccccCCcccCHHHHHHHHHhhCCCccc
Q 028610 174 TLRCGVCQIGVIGQKEAVEHAQATGHVNFQ 203 (206)
Q Consensus 174 ~l~C~~c~~~~~g~~~a~~Ha~~tgH~~F~ 203 (206)
.++|..|++.+......+.|-...+|....
T Consensus 50 ~~~C~~C~~~f~s~~~l~~Hm~~~~H~~~~ 79 (100)
T PF12756_consen 50 SFRCPYCNKTFRSREALQEHMRSKHHKKRN 79 (100)
T ss_dssp SEEBSSSS-EESSHHHHHHHHHHTTTTC-S
T ss_pred CCCCCccCCCCcCHHHHHHHHcCccCCCcc
Confidence 689999999999999999999999998874
No 13
>PF00096 zf-C2H2: Zinc finger, C2H2 type; InterPro: IPR007087 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. 
The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino-terminal part of the helix binds the major groove in DNA-binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter. This entry represents the classical C2H2 zinc finger domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding, 0005622 intracellular; PDB: 2D9H_A 2EPC_A 1SP1_A 1VA3_A 2WBT_B 2ELR_A 2YTP_A 2YTT_A 1VA1_A 2ELO_A ....
Probab=80.26 E-value=1.2 Score=23.70 Aligned_cols=21 Identities=14% Similarity=0.468 Sum_probs=18.9
Q ss_pred eccccCCcccCHHHHHHHHHh
Q 028610 176 RCGVCQIGVIGQKEAVEHAQA 196 (206)
Q Consensus 176 ~C~~c~~~~~g~~~a~~Ha~~ 196 (206)
+|.+|++.+.....-..|-+.
T Consensus 2 ~C~~C~~~f~~~~~l~~H~~~ 22 (23)
T PF00096_consen 2 KCPICGKSFSSKSNLKRHMRR 22 (23)
T ss_dssp EETTTTEEESSHHHHHHHHHH
T ss_pred CCCCCCCccCCHHHHHHHHhH
Confidence 799999999999999999764
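The C2H2 pattern quoted in the PF00096 description above, #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], can be sketched as a regular expression. This is a minimal illustration, not part of the HHsearch output: the description does not say which residues are allowed at the '#' positions, so the sketch assumes any residue there, and the names C2H2_PATTERN, find_c2h2 and the example sequence are hypothetical.

```python
import re

# Assumption: '#' (fold-stabilising positions) is treated as "any residue",
# because the description does not list the allowed residues.
C2H2_PATTERN = re.compile(
    r"."        # '#'
    r"."        # X
    r"C"        # first conserved Cys
    r".{1,5}"   # X(1-5)
    r"C"        # second conserved Cys
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first conserved His
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final position: His or Cys
)

def find_c2h2(seq):
    """Return (start, end, match) tuples with 1-based inclusive coordinates."""
    return [(m.start() + 1, m.end(), m.group()) for m in C2H2_PATTERN.finditer(seq)]

# Synthetic finger-like sequence made up for illustration (not from any hit).
example = "YKCPECGKSFSQKSNLQKHQRTH"
print(find_c2h2(example))
```

Because the '#' positions are unconstrained here, the expression is more permissive than the profile HMM used for the hit and should only be read as an illustration of the pattern notation.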
No 14
>PF12171 zf-C2H2_jaz: Zinc-finger double-stranded RNA-binding; InterPro: IPR022755 This zinc finger is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation. This entry represents the multiple-adjacent-C2H2 zinc finger, JAZ.; PDB: 4DGW_A 1ZR9_A.
Probab=78.60 E-value=0.4 Score=26.97 Aligned_cols=25 Identities=16% Similarity=0.438 Sum_probs=20.7
Q ss_pred eeccccCCcccCHHHHHHHHHhhCC
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQATGH 199 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~~tgH 199 (206)
..|..|++.|.++.....|-++..|
T Consensus 2 ~~C~~C~k~f~~~~~~~~H~~sk~H 26 (27)
T PF12171_consen 2 FYCDACDKYFSSENQLKQHMKSKKH 26 (27)
T ss_dssp CBBTTTTBBBSSHHHHHCCTTSHHH
T ss_pred CCcccCCCCcCCHHHHHHHHccCCC
Confidence 3599999999999999998765544
No 15
>smart00290 ZnF_UBP Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger.
Probab=78.27 E-value=1.3 Score=28.19 Aligned_cols=29 Identities=31% Similarity=0.401 Sum_probs=22.2
Q ss_pred eeeccccCCcccCH---HHHHHHHHhhCCCcc
Q 028610 174 TLRCGVCQIGVIGQ---KEAVEHAQATGHVNF 202 (206)
Q Consensus 174 ~l~C~~c~~~~~g~---~~a~~Ha~~tgH~~F 202 (206)
...|+.|+...-|. .-+..|++.|||.=+
T Consensus 11 l~~CL~C~~~~c~~~~~~h~~~H~~~t~H~~~ 42 (50)
T smart00290 11 LWLCLTCGQVGCGRYQLGHALEHFEETGHPLV 42 (50)
T ss_pred eEEecCCCCcccCCCCCcHHHHHhhhhCCCEE
Confidence 44799998777643 459999999999643
No 16
>PF13894 zf-C2H2_4: C2H2-type zinc finger; PDB: 2ELX_A 2EPP_A 2DLK_A 1X6H_A 2EOU_A 2EMB_A 2GQJ_A 2CSH_A 2WBT_B 2ELM_A ....
Probab=74.57 E-value=2.6 Score=21.88 Aligned_cols=21 Identities=19% Similarity=0.517 Sum_probs=16.6
Q ss_pred eccccCCcccCHHHHHHHHHh
Q 028610 176 RCGVCQIGVIGQKEAVEHAQA 196 (206)
Q Consensus 176 ~C~~c~~~~~g~~~a~~Ha~~ 196 (206)
+|..|++.+....+-..|-..
T Consensus 2 ~C~~C~~~~~~~~~l~~H~~~ 22 (24)
T PF13894_consen 2 QCPICGKSFRSKSELRQHMRT 22 (24)
T ss_dssp E-SSTS-EESSHHHHHHHHHH
T ss_pred CCcCCCCcCCcHHHHHHHHHh
Confidence 699999999999999998653
No 17
>KOG0804 consensus Cytoplasmic Zn-finger protein BRAP2 (BRCA1 associated protein) [General function prediction only]
Probab=72.03 E-value=2.2 Score=40.18 Aligned_cols=25 Identities=32% Similarity=0.473 Sum_probs=21.0
Q ss_pred eccccCCcccC---HHHHHHHHHhhCCC
Q 028610 176 RCGVCQIGVIG---QKEAVEHAQATGHV 200 (206)
Q Consensus 176 ~C~~c~~~~~g---~~~a~~Ha~~tgH~ 200 (206)
.|.+||.+.-| +.-|++|++.|||+
T Consensus 242 icliCg~vgcgrY~eghA~rHweet~H~ 269 (493)
T KOG0804|consen 242 ICLICGNVGCGRYKEGHARRHWEETGHC 269 (493)
T ss_pred EEEEccceecccccchhHHHHHHhhcce
Confidence 67778777665 88999999999996
No 18
>smart00355 ZnF_C2H2 zinc finger.
Probab=71.84 E-value=2.9 Score=21.88 Aligned_cols=21 Identities=24% Similarity=0.403 Sum_probs=18.7
Q ss_pred eeccccCCcccCHHHHHHHHH
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQ 195 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~ 195 (206)
.+|..|++.+.+...-..|-.
T Consensus 1 ~~C~~C~~~f~~~~~l~~H~~ 21 (26)
T smart00355 1 YRCPECGKVFKSKSALKEHMR 21 (26)
T ss_pred CCCCCCcchhCCHHHHHHHHH
Confidence 369999999999999999976
No 19
>PF13912 zf-C2H2_6: C2H2-type zinc finger; PDB: 1JN7_A 1FU9_A 2L1O_A 1NJQ_A 2EN8_A 2EMM_A 1FV5_A 1Y0J_B 2L6Z_B.
Probab=67.62 E-value=4.5 Score=22.23 Aligned_cols=21 Identities=19% Similarity=0.384 Sum_probs=18.6
Q ss_pred eeccccCCcccCHHHHHHHHH
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQ 195 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~ 195 (206)
.+|..|++.|.....-.+|=+
T Consensus 2 ~~C~~C~~~F~~~~~l~~H~~ 22 (27)
T PF13912_consen 2 FECDECGKTFSSLSALREHKR 22 (27)
T ss_dssp EEETTTTEEESSHHHHHHHHC
T ss_pred CCCCccCCccCChhHHHHHhH
Confidence 589999999999999998863
No 20
>PF05412 Peptidase_C33: Equine arterivirus Nsp2-type cysteine proteinase; InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retains its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing.; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=65.54 E-value=4.4 Score=30.85 Aligned_cols=84 Identities=15% Similarity=0.244 Sum_probs=48.3
Q ss_pred eCCCCchhhHHHHHHHhcCCCChHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCcccchHHHHHHHHHhCCcE
Q 028610 10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREI 89 (206)
Q Consensus 10 ip~DGnCLFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ala~~~~v~I 89 (206)
=|+||+|=.|.||..+++-. . ..|.. .|-+.-+.+..|.++-.|.-+=..++.|.
T Consensus 4 PP~DG~CG~H~i~aI~n~m~---------------~--~~~t~--------~l~~~~r~~d~W~~dedl~~~iq~l~lPa 58 (108)
T PF05412_consen 4 PPGDGSCGWHCIAAIMNHMM---------------G--GEFTT--------PLPQRNRPSDDWADDEDLYQVIQSLRLPA 58 (108)
T ss_pred CCCCCchHHHHHHHHHHHhh---------------c--cCCCc--------cccccCCChHHccChHHHHHHHHHccCce
Confidence 38999999999998776421 1 01111 11222344567888777766666666666
Q ss_pred EEEECCCCceeEeCCCCCceEEEEEcCCcceeeeecC
Q 028610 90 AAYDIQTTRCDLYGQKYSERVMLIYDGLHYDALAISP 126 (206)
Q Consensus 90 ~v~~~~~~~~~~fg~~~~~~i~LlY~G~HYD~l~~~~ 126 (206)
.+..-..- .+-+-.|.-+|.|+.+-+...
T Consensus 59 t~~~~~~C--------p~ArYv~~l~~qHW~V~~~~g 87 (108)
T PF05412_consen 59 TLDRNGAC--------PHARYVLKLDGQHWEVSVRKG 87 (108)
T ss_pred eccCCCCC--------CCCEEEEEecCceEEEEEcCC
Confidence 55421110 223333447899998766553
No 21
>PRK10963 hypothetical protein; Provisional
Probab=64.80 E-value=6.1 Score=33.53 Aligned_cols=36 Identities=14% Similarity=0.294 Sum_probs=26.5
Q ss_pred HHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCcccchHHH
Q 028610 37 QVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIEL 78 (206)
Q Consensus 37 ~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL 78 (206)
+.|++|+++|||.|.+ -.+-+..|.-|..+||.|-|
T Consensus 6 ~~V~~yL~~~PdFf~~------h~~Ll~~L~lph~~~gaVSL 41 (223)
T PRK10963 6 RAVVDYLLQNPDFFIR------NARLVEQMRVPHPVRGTVSL 41 (223)
T ss_pred HHHHHHHHHCchHHhh------CHHHHHhccCCCCCCCeecH
Confidence 4799999999998843 34556677777777776554
No 22
>smart00451 ZnF_U1 U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Probab=59.90 E-value=5.3 Score=23.21 Aligned_cols=25 Identities=16% Similarity=0.481 Sum_probs=20.8
Q ss_pred eeccccCCcccCHHHHHHHHHhhCC
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQATGH 199 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~~tgH 199 (206)
..|..|++.|.+......|-..--|
T Consensus 4 ~~C~~C~~~~~~~~~~~~H~~gk~H 28 (35)
T smart00451 4 FYCKLCNVTFTDEISVEAHLKGKKH 28 (35)
T ss_pred eEccccCCccCCHHHHHHHHChHHH
Confidence 3599999999999999998766554
No 23
>PHA03082 DNA-dependent RNA polymerase subunit; Provisional
Probab=59.83 E-value=5.1 Score=27.29 Aligned_cols=18 Identities=22% Similarity=0.545 Sum_probs=15.6
Q ss_pred ceeeccccCCcccCHHHH
Q 028610 173 FTLRCGVCQIGVIGQKEA 190 (206)
Q Consensus 173 ~~l~C~~c~~~~~g~~~a 190 (206)
|.+.|+.||..+..++..
T Consensus 3 f~lVCsTCGrDlSeeRy~ 20 (63)
T PHA03082 3 FQLVCSTCGRDLSEERYR 20 (63)
T ss_pred eeeeecccCcchhHHHHH
Confidence 778999999999888764
No 24
>PF05864 Chordopox_RPO7: Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7); InterPro: IPR008448 DNA-directed RNA polymerases (EC 2.7.7.6; also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. 
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyses the transcription of DNA into RNA.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent
Probab=59.00 E-value=5.6 Score=27.10 Aligned_cols=18 Identities=22% Similarity=0.545 Sum_probs=15.7
Q ss_pred ceeeccccCCcccCHHHH
Q 028610 173 FTLRCGVCQIGVIGQKEA 190 (206)
Q Consensus 173 ~~l~C~~c~~~~~g~~~a 190 (206)
|.+.|+.||..+..++..
T Consensus 3 f~lvCSTCGrDlSeeRy~ 20 (63)
T PF05864_consen 3 FQLVCSTCGRDLSEERYR 20 (63)
T ss_pred eeeeecccCCcchHHHHH
Confidence 778999999999988764
No 25
>PF13913 zf-C2HC_2: zinc-finger of a C2HC-type
Probab=53.70 E-value=11 Score=20.89 Aligned_cols=20 Identities=15% Similarity=0.463 Sum_probs=15.8
Q ss_pred eeccccCCcccCHHHHHHHHH
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQ 195 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~ 195 (206)
+.|..||..| +......|..
T Consensus 3 ~~C~~CgR~F-~~~~l~~H~~ 22 (25)
T PF13913_consen 3 VPCPICGRKF-NPDRLEKHEK 22 (25)
T ss_pred CcCCCCCCEE-CHHHHHHHHH
Confidence 5799999999 6667777754
No 26
>PHA00616 hypothetical protein
Probab=53.38 E-value=6 Score=25.42 Aligned_cols=29 Identities=21% Similarity=0.272 Sum_probs=23.6
Q ss_pred eeccccCCcccCHHHHHHHHH-hhCCCccc
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQ-ATGHVNFQ 203 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~-~tgH~~F~ 203 (206)
.+|..||+.|.--.+-..|-. .|||..|.
T Consensus 2 YqC~~CG~~F~~~s~l~~H~r~~hg~~~~~ 31 (44)
T PHA00616 2 YQCLRCGGIFRKKKEVIEHLLSVHKQNKLT 31 (44)
T ss_pred CccchhhHHHhhHHHHHHHHHHhcCCCccc
Confidence 479999999999999999974 46666654
No 27
>COG4049 Uncharacterized protein containing archaeal-type C2H2 Zn-finger [General function prediction only]
Probab=52.15 E-value=7.6 Score=26.41 Aligned_cols=27 Identities=22% Similarity=0.413 Sum_probs=22.4
Q ss_pred cCCceeeccccCCcccCHHHHHHHHHh
Q 028610 170 TANFTLRCGVCQIGVIGQKEAVEHAQA 196 (206)
Q Consensus 170 t~~~~l~C~~c~~~~~g~~~a~~Ha~~ 196 (206)
-...-++|.-||.+|+.++.-..|-.+
T Consensus 13 DGE~~lrCPRC~~~FR~~K~Y~RHVNK 39 (65)
T COG4049 13 DGEEFLRCPRCGMVFRRRKDYIRHVNK 39 (65)
T ss_pred CCceeeeCCchhHHHHHhHHHHHHhhH
Confidence 344558999999999999999998754
No 28
>PF05381 Peptidase_C21: Tymovirus endopeptidase; InterPro: IPR008043 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retains its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This entry is found in cysteine peptidases belonging to the MEROPS peptidase family C21 (tymovirus endopeptidase family, clan CA). The type example is tymovirus endopeptidase (turnip yellow mosaic virus). The noncapsid protein expressed from ORF-206 of turnip yellow mosaic virus (TYMV) is autocatalytically processed by a papain-like protease, producing N-terminal 150 kDa and C-terminal 70 kDa proteins.; GO: 0003968 RNA-directed RNA polymerase activity, 0016032 viral reproduction
Probab=51.51 E-value=81 Score=23.95 Aligned_cols=89 Identities=19% Similarity=0.150 Sum_probs=53.0
Q ss_pred CCchhhHHHHHHHhcCCCChHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCcccchHHHHHHHHHhCCcEEEE
Q 028610 13 DNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREIAAY 92 (206)
Q Consensus 13 DGnCLFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ala~~~~v~I~v~ 92 (206)
..+||--|||.+..-+. .+|....... .||- +|.. ..|++-|== --.+.|||-.|+....+.
T Consensus 2 ~~~CLL~A~s~at~~~~---~~LW~~L~~~---lPDS----lL~n------~ei~~~GLS--TDhltaLa~~~~~~~~~h 63 (104)
T PF05381_consen 2 ALDCLLVAISQATSISP---ETLWATLCEI---LPDS----LLDN------PEIRTLGLS--TDHLTALAYRYHFQCTFH 63 (104)
T ss_pred CcceeHHhhhhhhCCCH---HHHHHHHHHh---Cchh----hcCc------hhhhhcCCc--HHHHHHHHHHHheEEEEE
Confidence 57899999999987543 3444433322 2331 2211 112221111 124679999999999888
Q ss_pred ECCCCceeEeCCC-CCceEEEEEc-C--Cccee
Q 028610 93 DIQTTRCDLYGQK-YSERVMLIYD-G--LHYDA 121 (206)
Q Consensus 93 ~~~~~~~~~fg~~-~~~~i~LlY~-G--~HYD~ 121 (206)
... .+..||-. ....+.|.|. | .||..
T Consensus 64 s~~--~~~~~Gi~~as~~~~I~ht~G~p~HFs~ 94 (104)
T PF05381_consen 64 SDH--GVLHYGIKDASTVFTITHTPGPPGHFSL 94 (104)
T ss_pred cCC--ceEEeecCCCceEEEEEeCCCCCCcccc
Confidence 533 47889976 4555556664 5 39998
No 29
>COG3426 Butyrate kinase [Energy production and conversion]
Probab=50.49 E-value=14 Score=33.35 Aligned_cols=60 Identities=23% Similarity=0.418 Sum_probs=40.3
Q ss_pred hhHHHHHHHhcCCCChHHHHHHHHHHHhhChhh--------hhhhhcCCCHHHHHHhhCCCCcccchHHHHHHHHH
Q 028610 17 LFNAVGYVMEHDKNKAPELRQVIAATVASDPVK--------YSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADY 84 (206)
Q Consensus 17 LFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~--------y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ala~~ 84 (206)
-|.|++||+.. ++= .++..+..+||- |.+.|+. -..+|++||..--.+.|..||.|||.-
T Consensus 274 ~~~AmayQVaK------eIG-~~savL~G~vDaIvLTGGiA~~~~f~~-~I~~~v~~iapv~v~PGE~EleALA~G 341 (358)
T COG3426 274 AYEAMAYQVAK------EIG-AMSAVLKGKVDAIVLTGGIAYEKLFVD-AIEDRVSWIAPVIVYPGEDELEALAEG 341 (358)
T ss_pred HHHHHHHHHHH------HHH-hhhhhcCCCCCEEEEecchhhHHHHHH-HHHHHHhhhcceEecCCchHHHHHHhh
Confidence 36677777653 222 233456667762 2233333 368899999999999999999999963
No 30
>cd00729 rubredoxin_SM Rubredoxin, Small Modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and is believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxidase activity.
Probab=41.76 E-value=13 Score=22.21 Aligned_cols=14 Identities=29% Similarity=0.465 Sum_probs=11.3
Q ss_pred eeeccccCCcccCH
Q 028610 174 TLRCGVCQIGVIGQ 187 (206)
Q Consensus 174 ~l~C~~c~~~~~g~ 187 (206)
..+|.+||++..|.
T Consensus 2 ~~~C~~CG~i~~g~ 15 (34)
T cd00729 2 VWVCPVCGYIHEGE 15 (34)
T ss_pred eEECCCCCCEeECC
Confidence 35899999998875
No 31
>PF09237 GAGA: GAGA factor; InterPro: IPR015318 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Members of this entry bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. 
The second basic region forms a helix that interacts in the major groove, recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 1YUI_A 1YUJ_A.
Probab=40.99 E-value=24 Score=23.53 Aligned_cols=22 Identities=14% Similarity=0.386 Sum_probs=16.9
Q ss_pred eccccCCcccCHHHHHHHHHhh
Q 028610 176 RCGVCQIGVIGQKEAVEHAQAT 197 (206)
Q Consensus 176 ~C~~c~~~~~g~~~a~~Ha~~t 197 (206)
.|.+|+..+.-++.-+.|-+.+
T Consensus 26 tCP~C~a~~~~srnLrRHle~~ 47 (54)
T PF09237_consen 26 TCPICGAVIRQSRNLRRHLEIR 47 (54)
T ss_dssp E-TTT--EESSHHHHHHHHHHH
T ss_pred CCCcchhhccchhhHHHHHHHH
Confidence 6999999999999999998754
No 32
>cd02669 Peptidase_C19M A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.
Probab=38.71 E-value=16 Score=34.03 Aligned_cols=28 Identities=21% Similarity=0.237 Sum_probs=21.8
Q ss_pred eeeccccCCccc---CHHHHHHHHHhhCCCc
Q 028610 174 TLRCGVCQIGVI---GQKEAVEHAQATGHVN 201 (206)
Q Consensus 174 ~l~C~~c~~~~~---g~~~a~~Ha~~tgH~~ 201 (206)
..-|.+||+.+. +..-|..|++.|||.=
T Consensus 28 ~~~CL~cg~~~~g~~~~~ha~~H~~~~~H~~ 58 (440)
T cd02669 28 VYACLVCGKYFQGRGKGSHAYTHSLEDNHHV 58 (440)
T ss_pred EEEEcccCCeecCCCCCcHHHHHhhccCCCE
Confidence 456999996654 3568999999999963
No 33
>PF13909 zf-H2C2_5: C2H2-type zinc-finger domain; PDB: 1X5W_A.
Probab=38.31 E-value=38 Score=17.92 Aligned_cols=21 Identities=14% Similarity=0.412 Sum_probs=15.5
Q ss_pred eeccccCCcccCHHHHHHHHHh
Q 028610 175 LRCGVCQIGVIGQKEAVEHAQA 196 (206)
Q Consensus 175 l~C~~c~~~~~g~~~a~~Ha~~ 196 (206)
.+|..|.+... ...-.+|-+.
T Consensus 1 y~C~~C~y~t~-~~~l~~H~~~ 21 (24)
T PF13909_consen 1 YKCPHCSYSTS-KSNLKRHLKR 21 (24)
T ss_dssp EE-SSSS-EES-HHHHHHHHHH
T ss_pred CCCCCCCCcCC-HHHHHHHHHh
Confidence 37999999998 8888888653
No 34
>PRK06266 transcription initiation factor E subunit alpha; Validated
Probab=37.86 E-value=30 Score=28.45 Aligned_cols=50 Identities=8% Similarity=0.098 Sum_probs=37.8
Q ss_pred eeeeeCCCCCchhHHHHHHHHHHHHHhhCCCccccCCceeeccccCCcccC
Q 028610 136 QTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 186 (206)
Q Consensus 136 ~t~f~~~d~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~c~~~~~g 186 (206)
.-.|..+.+.+.+.++....++.+.|+.+-.+.... .-..|..|+..+.=
T Consensus 80 ~y~w~l~~~~i~d~ik~~~~~~~~klk~~l~~e~~~-~~Y~Cp~C~~rytf 129 (178)
T PRK06266 80 TYTWKPELEKLPEIIKKKKMEELKKLKEQLEEEENN-MFFFCPNCHIRFTF 129 (178)
T ss_pred EEEEEeCHHHHHHHHHHHHHHHHHHHHHHhhhccCC-CEEECCCCCcEEeH
Confidence 568888888888999999999999998876665444 34678888766553
No 35
>PF04475 DUF555: Protein of unknown function (DUF555); InterPro: IPR007564 This is a family of uncharacterised, hypothetical archaeal proteins.
Probab=37.49 E-value=44 Score=25.19 Aligned_cols=39 Identities=10% Similarity=0.004 Sum_probs=30.5
Q ss_pred hHHHHHHHHHHHHHhhCCCccccCCceeeccccCCcccC
Q 028610 148 GPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 186 (206)
Q Consensus 148 ~~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~c~~~~~g 186 (206)
|++--+.-|..+.|+..-.|-+..--.+.|..||..|..
T Consensus 21 DAI~iAIseaGkrLn~~~~~VeIevG~~~cP~Cge~~~~ 59 (102)
T PF04475_consen 21 DAIGIAISEAGKRLNPDLDYVEIEVGDTICPKCGEELDS 59 (102)
T ss_pred HHHHHHHHHHHHhhCCCCCeEEEecCcccCCCCCCccCc
Confidence 344444557778889988899999999999999987754
No 36
>PF13465 zf-H2C2_2: Zinc-finger double domain; PDB: 2EN7_A 1TF6_A 1TF3_A 2ELT_A 2EOS_A 2EN2_A 2DMD_A 2WBS_A 2WBU_A 2EM5_A ....
Probab=36.42 E-value=18 Score=20.04 Aligned_cols=18 Identities=22% Similarity=0.412 Sum_probs=13.4
Q ss_pred cccCCceeeccccCCccc
Q 028610 168 TDTANFTLRCGVCQIGVI 185 (206)
Q Consensus 168 t~t~~~~l~C~~c~~~~~ 185 (206)
+-+.....+|..|++.|.
T Consensus 8 ~H~~~k~~~C~~C~k~F~ 25 (26)
T PF13465_consen 8 THTGEKPYKCPYCGKSFS 25 (26)
T ss_dssp HHSSSSSEEESSSSEEES
T ss_pred hcCCCCCCCCCCCcCeeC
Confidence 345666689999998764
No 37
>PF04877 Hairpins: HrpZ; InterPro: IPR006961 HrpZ (harpin elicitor) from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants. The entry also represents hairpinN, a virulence determinant that elicits lesion formation in Arabidopsis and tobacco and triggers systemic resistance in Arabidopsis.
Probab=35.58 E-value=33 Score=30.76 Aligned_cols=49 Identities=14% Similarity=0.137 Sum_probs=29.0
Q ss_pred hHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCCCcccchHHHHHHHHHh
Q 028610 32 APELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYY 85 (206)
Q Consensus 32 ~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~~~WGg~iEL~ala~~~ 85 (206)
-..|=++|++||-.||+.|-.+ -.++ +.+++. .+..=+.-|...+-..+
T Consensus 163 D~~lL~eIaqFMD~nPe~FgkP-d~~s---W~~eLk-eD~~L~~~E~~~F~~Al 211 (308)
T PF04877_consen 163 DMPLLKEIAQFMDQNPEQFGKP-DRKS---WADELK-EDNGLDKAETEQFQKAL 211 (308)
T ss_pred cHHHHHHHHHHHhcCHhhcCCC-CCch---HHHHhh-cCCCCCHHHHHHHHHHH
Confidence 3567789999999999999443 1122 334442 34443444555555444
No 38
>PHA02768 hypothetical protein; Provisional
Probab=34.57 E-value=34 Score=22.96 Aligned_cols=21 Identities=24% Similarity=0.502 Sum_probs=14.6
Q ss_pred eccccCCcccCHHHHHHHHHh
Q 028610 176 RCGVCQIGVIGQKEAVEHAQA 196 (206)
Q Consensus 176 ~C~~c~~~~~g~~~a~~Ha~~ 196 (206)
+|..||+.|.-...-..|-..
T Consensus 7 ~C~~CGK~Fs~~~~L~~H~r~ 27 (55)
T PHA02768 7 ECPICGEIYIKRKSMITHLRK 27 (55)
T ss_pred CcchhCCeeccHHHHHHHHHh
Confidence 677777777766666666555
No 39
>PF09082 DUF1922: Domain of unknown function (DUF1922); InterPro: IPR015166 Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown.; PDB: 1GH9_A.
Probab=33.62 E-value=15 Score=25.80 Aligned_cols=21 Identities=24% Similarity=0.477 Sum_probs=15.0
Q ss_pred CCCccccCCceeeccccCCccc
Q 028610 164 KKTYTDTANFTLRCGVCQIGVI 185 (206)
Q Consensus 164 ~~~~t~t~~~~l~C~~c~~~~~ 185 (206)
+.-|.+-...+-+| +||+.++
T Consensus 10 r~lya~e~~kTkkC-~CG~~l~ 30 (68)
T PF09082_consen 10 RYLYAKEGAKTKKC-VCGKTLK 30 (68)
T ss_dssp --EEEETT-SEEEE-TTTEEEE
T ss_pred CEEEecCCcceeEe-cCCCeee
Confidence 34577888889999 9998765
No 40
>PF05148 Methyltransf_8: Hypothetical methyltransferase; InterPro: IPR007823 This family consists of uncharacterised eukaryotic proteins which are related to S-adenosyl-L-methionine-dependent methyltransferases.; GO: 0008168 methyltransferase activity; PDB: 2ZFU_B.
Probab=33.43 E-value=1e+02 Score=26.39 Aligned_cols=71 Identities=13% Similarity=0.218 Sum_probs=40.5
Q ss_pred hhhHHHHHHHhcCCCChHHHHHHHHHHHhhChhhhhhhhc----------CCCHHHHHHhhCC-CCcc------cchHHH
Q 028610 16 CLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFL----------GKSNQEYCSWIQD-PEKW------GGAIEL 78 (206)
Q Consensus 16 CLFrAis~~l~g~~~~~~~lR~~v~~~i~~np~~y~e~~l----------~~~~~~Y~~~m~~-~~~W------Gg~iEL 78 (206)
-.||-|-.+||.+... ...+.++++|+.|...-- ..|++.+++|+++ |..| .|+-.
T Consensus 13 srFR~lNE~LYT~~s~------~A~~lf~~dP~~F~~YH~Gfr~Qv~~WP~nPvd~iI~~l~~~~~~~viaD~GCGdA~- 85 (219)
T PF05148_consen 13 SRFRWLNEQLYTTSSE------EALKLFQEDPELFDIYHEGFRQQVKKWPVNPVDVIIEWLKKRPKSLVIADFGCGDAK- 85 (219)
T ss_dssp HHHHHHHHHHHHS-HH------HHHHHHHH-HHHHHHHHHHHHHHHCTSSS-HHHHHHHHHCTS-TTS-EEEES-TT-H-
T ss_pred CchHHHHHhHhcCCHH------HHHHHHHhCHHHHHHHHHHHHHHHhcCCCCcHHHHHHHHHhcCCCEEEEECCCchHH-
Confidence 3699999999976432 455678999998843221 2588999999885 4455 23322
Q ss_pred HHHHHHh--CCcEEEEECC
Q 028610 79 SILADYY--GREIAAYDIQ 95 (206)
Q Consensus 79 ~ala~~~--~v~I~v~~~~ 95 (206)
||+.. +..|+.+|.-
T Consensus 86 --la~~~~~~~~V~SfDLv 102 (219)
T PF05148_consen 86 --LAKAVPNKHKVHSFDLV 102 (219)
T ss_dssp --HHHH--S---EEEEESS
T ss_pred --HHHhcccCceEEEeecc
Confidence 33333 3457777753
No 41
>COG3357 Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon [Transcription]
Probab=32.22 E-value=53 Score=24.47 Aligned_cols=36 Identities=19% Similarity=0.201 Sum_probs=23.5
Q ss_pred HHHHHHHHHHHHHhhCCCccccCCceeeccccCCcccC
Q 028610 149 PAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 186 (206)
Q Consensus 149 ~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~c~~~~~g 186 (206)
.++...+.+|+-|+++++---..- -+|..||+-|..
T Consensus 35 ~v~~~L~hiak~lkr~g~~Llv~P--a~CkkCGfef~~ 70 (97)
T COG3357 35 EVYDHLEHIAKSLKRKGKRLLVRP--ARCKKCGFEFRD 70 (97)
T ss_pred HHHHHHHHHHHHHHhCCceEEecC--hhhcccCccccc
Confidence 355666667777788776322211 189999998876
No 42
>PHA00732 hypothetical protein
Probab=31.52 E-value=42 Score=23.92 Aligned_cols=17 Identities=18% Similarity=0.497 Sum_probs=7.7
Q ss_pred ccccCCcccCHHHHHHH
Q 028610 177 CGVCQIGVIGQKEAVEH 193 (206)
Q Consensus 177 C~~c~~~~~g~~~a~~H 193 (206)
|..||+.+.....-+.|
T Consensus 4 C~~Cgk~F~s~s~Lk~H 20 (79)
T PHA00732 4 CPICGFTTVTLFALKQH 20 (79)
T ss_pred CCCCCCccCCHHHHHHH
Confidence 44444444444444444
No 43
>cd01675 RNR_III Class III ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). The class III enzyme from phage T4 consists of two subunits, this model covers the larger subunit w
Probab=31.18 E-value=67 Score=31.05 Aligned_cols=36 Identities=22% Similarity=0.104 Sum_probs=22.3
Q ss_pred hHHHHHHHHHHHHHhhCCC-ccccCCceeeccccCCcccCH
Q 028610 148 GPAEDLALKLVKEQQRKKT-YTDTANFTLRCGVCQIGVIGQ 187 (206)
Q Consensus 148 ~~~~~~a~~l~~~~~~~~~-~t~t~~~~l~C~~c~~~~~g~ 187 (206)
++++++.+..++ +..+| +++|..+ +|.+||+...|+
T Consensus 495 ~al~~lv~~a~~--~~~~y~~~~~p~~--~C~~CG~~~~~~ 531 (555)
T cd01675 495 EALEALVKKAAK--RGVIYFGINTPID--ICNDCGYIGEGE 531 (555)
T ss_pred HHHHHHHHHHHH--cCCceEEEecCCc--cCCCCCCCCcCC
Confidence 444444444433 33455 6777777 999999976543
No 44
>PRK09784 hypothetical protein; Provisional
Probab=29.70 E-value=28 Score=30.68 Aligned_cols=19 Identities=21% Similarity=0.364 Sum_probs=15.6
Q ss_pred EEEEEeCCCCchhhHHHHH
Q 028610 5 IVRRVIPSDNSCLFNAVGY 23 (206)
Q Consensus 5 L~~~~ip~DGnCLFrAis~ 23 (206)
|.-.+|.|||-||.|||--
T Consensus 200 lkyapvdgdgycllrailv 218 (417)
T PRK09784 200 LKYAPVDGDGYCLLRAILV 218 (417)
T ss_pred ceecccCCCchhHHHHHHH
Confidence 4556788999999999954
No 45
>PF04959 ARS2: Arsenite-resistance protein 2; InterPro: IPR007042 This entry represents Arsenite-resistance protein 2 (also known as Serrate RNA effector molecule homolog) which is thought to play a role in arsenite resistance [], although it does not directly confer arsenite resistance but rather modulates arsenic sensitivity []. Arsenite is a carcinogenic compound which can act as a comutagen by inhibiting DNA repair. It is also involved in cell cycle progression at S phase.; PDB: 3AX1_A.
Probab=29.41 E-value=42 Score=28.61 Aligned_cols=28 Identities=18% Similarity=0.266 Sum_probs=20.9
Q ss_pred cccCCceeeccccCCcccCHHHHHHHHH
Q 028610 168 TDTANFTLRCGVCQIGVIGQKEAVEHAQ 195 (206)
Q Consensus 168 t~t~~~~l~C~~c~~~~~g~~~a~~Ha~ 195 (206)
+....-.-+|..|+|.|+|..-+++|-.
T Consensus 71 ~e~~~~K~~C~lc~KlFkg~eFV~KHI~ 98 (214)
T PF04959_consen 71 KEEDEDKWRCPLCGKLFKGPEFVRKHIF 98 (214)
T ss_dssp -SSSSEEEEE-SSS-EESSHHHHHHHHH
T ss_pred HHHcCCEECCCCCCcccCChHHHHHHHh
Confidence 3456667799999999999999999954
No 46
>PF09494 Slx4: Slx4 endonuclease; InterPro: IPR018574 The Slx4 protein is a heteromeric structure-specific endonuclease found in fungi. Slx4 with Slx1 acts as a nuclease on branched DNA substrates, particularly simple-Y, 5'-flap, or replication fork structures by cleaving the strand bearing the 5' non-homologous arm at the branch junction and thus generating ligatable nicked products from 5'-flap or replication fork substrates [].
Probab=29.41 E-value=1.8e+02 Score=19.58 Aligned_cols=42 Identities=21% Similarity=0.449 Sum_probs=27.1
Q ss_pred HHHHHHHhhChhhhhhhhc-CC--CHHHHHHhhCCCCc-ccchHHHH
Q 028610 37 QVIAATVASDPVKYSEAFL-GK--SNQEYCSWIQDPEK-WGGAIELS 79 (206)
Q Consensus 37 ~~v~~~i~~np~~y~e~~l-~~--~~~~Y~~~m~~~~~-WGg~iEL~ 79 (206)
+.+.++|++||+.| +.+| -+ .+++..+|+...|- |.+.+...
T Consensus 3 ~~lt~~I~~~p~l~-ekIL~YePI~L~el~~~L~~~g~~~~~~~~~~ 48 (64)
T PF09494_consen 3 EALTKLIRSDPELY-EKILMYEPINLEELHAWLKASGIGFDRKVDPS 48 (64)
T ss_pred HHHHHHHHcCHHHH-HHHHcCCCccHHHHHHHHHHcCCCccceeCHH
Confidence 35667888999999 5555 33 46888888875553 44444333
No 47
>TIGR00373 conserved hypothetical protein TIGR00373. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain.
Probab=29.27 E-value=48 Score=26.64 Aligned_cols=50 Identities=6% Similarity=0.095 Sum_probs=34.2
Q ss_pred CeeeeeCCCCCchhHHHHHHHHHHHHHhhCCCccccCCceeeccccCCccc
Q 028610 135 DQTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVI 185 (206)
Q Consensus 135 d~t~f~~~d~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~c~~~~~ 185 (206)
-...|-++.+++.+.++....++.+.|++.-.+.... .-..|..|+..+.
T Consensus 71 ~~Y~w~i~~~~i~d~Ik~~~~~~~~~lk~~l~~e~~~-~~Y~Cp~c~~r~t 120 (158)
T TIGR00373 71 YEYTWRINYEKALDVLKRKLEETAKKLREKLEFETNN-MFFICPNMCVRFT 120 (158)
T ss_pred EEEEEEeCHHHHHHHHHHHHHHHHHHHHHHHhhccCC-CeEECCCCCcEee
Confidence 3566766667777888888888888887765554333 3357888875544
No 48
>PF13240 zinc_ribbon_2: zinc-ribbon domain
Probab=29.19 E-value=33 Score=18.73 Aligned_cols=14 Identities=36% Similarity=0.776 Sum_probs=8.7
Q ss_pred cccCCceeeccccCCcc
Q 028610 168 TDTANFTLRCGVCQIGV 184 (206)
Q Consensus 168 t~t~~~~l~C~~c~~~~ 184 (206)
.+.+.| |..||..|
T Consensus 10 ~~~~~f---C~~CG~~l 23 (23)
T PF13240_consen 10 EDDAKF---CPNCGTPL 23 (23)
T ss_pred CCcCcc---hhhhCCcC
Confidence 355566 77777654
No 49
>TIGR02934 nifT_nitrog probable nitrogen fixation protein FixT. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein.
Probab=28.74 E-value=4.1 Score=28.54 Aligned_cols=23 Identities=26% Similarity=0.351 Sum_probs=18.5
Q ss_pred hhcCCCHHHHHHhhCCCCcccch
Q 028610 53 AFLGKSNQEYCSWIQDPEKWGGA 75 (206)
Q Consensus 53 ~~l~~~~~~Y~~~m~~~~~WGg~ 75 (206)
.+--+++++=+-.|.+++.|||.
T Consensus 16 YvpKKDLEE~Vv~~e~~~~WGG~ 38 (67)
T TIGR02934 16 YVPKKDLEEVIVSVEKEELWGGW 38 (67)
T ss_pred EEECCcchhheeeeecCccccCE
Confidence 34456888888889999999996
No 50
>cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, a small electron transfer protein, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=28.72 E-value=23 Score=20.85 Aligned_cols=14 Identities=29% Similarity=0.506 Sum_probs=11.0
Q ss_pred eeccccCCcccCHH
Q 028610 175 LRCGVCQIGVIGQK 188 (206)
Q Consensus 175 l~C~~c~~~~~g~~ 188 (206)
.+|.+||++..+..
T Consensus 2 ~~C~~CGy~y~~~~ 15 (33)
T cd00350 2 YVCPVCGYIYDGEE 15 (33)
T ss_pred EECCCCCCEECCCc
Confidence 47999999877653
No 51
>PF07967 zf-C3HC: C3HC zinc finger-like ; InterPro: IPR012935 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) and other proteins. NIPA is thought to perform an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signalling events []. 
The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. The Schizosaccharomyces pombe protein containing this domain (O94506 from SWISSPROT) is involved in mRNA export from the nucleus []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding, 0005634 nucleus
Probab=28.68 E-value=33 Score=26.51 Aligned_cols=23 Identities=13% Similarity=0.330 Sum_probs=20.3
Q ss_pred CCccccCCceeeccccCCcccCH
Q 028610 165 KTYTDTANFTLRCGVCQIGVIGQ 187 (206)
Q Consensus 165 ~~~t~t~~~~l~C~~c~~~~~g~ 187 (206)
+.++++...+|+|..||..+.-.
T Consensus 34 ~GW~~~~~d~l~C~~C~~~l~~~ 56 (133)
T PF07967_consen 34 RGWICVSKDMLKCESCGARLCVK 56 (133)
T ss_pred cCCCcCCCCEEEeCCCCCEEEEe
Confidence 88999999999999999877654
No 52
>COG2051 RPS27A Ribosomal protein S27E [Translation, ribosomal structure and biogenesis]
Probab=28.67 E-value=31 Score=24.12 Aligned_cols=15 Identities=20% Similarity=0.505 Sum_probs=12.0
Q ss_pred CCceeeccccCCccc
Q 028610 171 ANFTLRCGVCQIGVI 185 (206)
Q Consensus 171 ~~~~l~C~~c~~~~~ 185 (206)
+++..+|..||..|.
T Consensus 35 ast~V~C~~CG~~l~ 49 (67)
T COG2051 35 ASTVVTCLICGTTLA 49 (67)
T ss_pred CceEEEecccccEEE
Confidence 456778999999885
No 53
>smart00238 BIR Baculoviral inhibition of apoptosis protein repeat. Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes.
Probab=27.77 E-value=1.2e+02 Score=20.36 Aligned_cols=40 Identities=20% Similarity=0.263 Sum_probs=28.0
Q ss_pred hhCCCccccCCceeeccccCCccc----CHHHHHHHHHhhCCCcc
Q 028610 162 QRKKTYTDTANFTLRCGVCQIGVI----GQKEAVEHAQATGHVNF 202 (206)
Q Consensus 162 ~~~~~~t~t~~~~l~C~~c~~~~~----g~~~a~~Ha~~tgH~~F 202 (206)
+..=||+.+ .-.++|-.|+..+. ++.-.++|+...-...|
T Consensus 25 ~~Gfyy~~~-~d~v~C~~C~~~l~~w~~~d~p~~~H~~~~p~C~f 68 (71)
T smart00238 25 EAGFYYTGV-GDEVKCFFCGGELDNWEPGDDPWEEHKKWSPNCPF 68 (71)
T ss_pred HcCCeECCC-CCEEEeCCCCCCcCCCCCCCCHHHHHhHhCcCCcC
Confidence 455667766 44699999999885 45567778776655555
No 54
>PF07368 DUF1487: Protein of unknown function (DUF1487); InterPro: IPR009961 This family consists of several uncharacterised proteins from Drosophila melanogaster. The function of this family is unknown.
Probab=26.88 E-value=1.2e+02 Score=26.01 Aligned_cols=43 Identities=9% Similarity=-0.011 Sum_probs=26.5
Q ss_pred cEEEEEeCCCCchhhHHHHHHHhcCC--------CChHHHHHHHHHHHhhC
Q 028610 4 IIVRRVIPSDNSCLFNAVGYVMEHDK--------NKAPELRQVIAATVASD 46 (206)
Q Consensus 4 ~L~~~~ip~DGnCLFrAis~~l~g~~--------~~~~~lR~~v~~~i~~n 46 (206)
.|.+.-=.||=||=-+.|...|..-- --+..+|+..++-|+++
T Consensus 6 ~lMIvfe~GDlnsA~~~L~~sl~~Pf~~~~VatVlVqEsireefi~rvr~~ 56 (215)
T PF07368_consen 6 QLMIVFEDGDLNSAMHYLLESLHNPFAPGAVATVLVQESIREEFIERVRSR 56 (215)
T ss_pred eEEEEEeCCCHHHHHHHHHHHHhCcccCCcEEEEEEeHHHHHHHHHHHHHh
Confidence 44445556788887777777775321 12356777777766654
No 55
>KOG1247 consensus Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]
Probab=26.84 E-value=35 Score=32.34 Aligned_cols=61 Identities=18% Similarity=0.284 Sum_probs=43.7
Q ss_pred EcCCcceeeeecCCCCCCCCCCeeeee--CCCCCchhHHHHHHHHHHHHHhhCCCccccCCceeeccccCCcccC
Q 028610 114 YDGLHYDALAISPFEGAPEEFDQTIFP--VQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 186 (206)
Q Consensus 114 Y~G~HYD~l~~~~~~~~~~~~d~t~f~--~~d~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~c~~~~~g 186 (206)
|.++||++----. .|...|. +.+.+. ..++.+..+|..++|+.-.+-..|.|.+|++-|..
T Consensus 86 yh~ihk~vy~Wf~-------IdfD~fgrtTT~~qT-----~i~Q~iF~kl~~ng~~se~tv~qLyC~vc~~flad 148 (567)
T KOG1247|consen 86 YHGIHKVVYDWFK-------IDFDEFGRTTTKTQT-----EICQDIFSKLYDNGYLSEQTVKQLYCEVCDTFLAD 148 (567)
T ss_pred cchhHHHHHHhhc-------ccccccCcccCcchh-----HHHHHHhhchhhcCCcccceeeeEEehhhcccccc
Confidence 7888888764432 2344444 333333 67778888889999999888899999999887753
No 56
>COG5134 Uncharacterized conserved protein [Function unknown]
Probab=26.79 E-value=41 Score=28.97 Aligned_cols=32 Identities=22% Similarity=0.397 Sum_probs=23.2
Q ss_pred HHHHHHHHHHHHhhCCCc----cccCCceeeccccC
Q 028610 150 AEDLALKLVKEQQRKKTY----TDTANFTLRCGVCQ 181 (206)
Q Consensus 150 ~~~~a~~l~~~~~~~~~~----t~t~~~~l~C~~c~ 181 (206)
|..++..-+++||.++.- .-.+-|+++|..|+
T Consensus 14 AqpL~~~~~~KlK~arprglSiRL~TPF~~RCL~C~ 49 (272)
T COG5134 14 AQPLAKRKFDKLKNARPRGLSIRLETPFPVRCLNCE 49 (272)
T ss_pred cchhHHHHHHHhcccCcccceEEeccCcceeecchh
Confidence 456677777777877653 34567899999995
No 57
>cd00022 BIR Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger.
Probab=26.61 E-value=1.2e+02 Score=20.14 Aligned_cols=40 Identities=18% Similarity=0.285 Sum_probs=27.4
Q ss_pred hhCCCccccCCceeeccccCCcccC----HHHHHHHHHhhCCCcc
Q 028610 162 QRKKTYTDTANFTLRCGVCQIGVIG----QKEAVEHAQATGHVNF 202 (206)
Q Consensus 162 ~~~~~~t~t~~~~l~C~~c~~~~~g----~~~a~~Ha~~tgH~~F 202 (206)
+..=||+.. .-.++|--|+..+.+ +.-.++|....-+..|
T Consensus 23 ~~Gfyy~~~-~d~v~C~~C~~~~~~w~~~d~p~~~H~~~~p~C~f 66 (69)
T cd00022 23 EAGFYYTGR-GDEVKCFFCGLELKNWEPGDDPWEEHKRWSPNCPF 66 (69)
T ss_pred HcCCeEcCC-CCEEEeCCCCCCccCCCCCCCHHHHHhHhCcCCcC
Confidence 455667655 456999999998874 5556778776555554
No 58
>COG2174 RPL34A Ribosomal protein L34E [Translation, ribosomal structure and biogenesis]
Probab=26.21 E-value=30 Score=25.67 Aligned_cols=14 Identities=21% Similarity=0.558 Sum_probs=11.7
Q ss_pred eeccccCCcccCHH
Q 028610 175 LRCGVCQIGVIGQK 188 (206)
Q Consensus 175 l~C~~c~~~~~g~~ 188 (206)
-+|.+||..|.|..
T Consensus 35 p~C~~cg~pL~Gi~ 48 (93)
T COG2174 35 PKCAICGRPLGGIP 48 (93)
T ss_pred CcccccCCccCCcc
Confidence 37999999999853
No 59
>KOG1790 consensus 60s ribosomal protein L34 [Translation, ribosomal structure and biogenesis]
Probab=25.64 E-value=24 Score=27.41 Aligned_cols=25 Identities=20% Similarity=0.436 Sum_probs=18.7
Q ss_pred CCccccCCceeeccccCCcccCHHH
Q 028610 165 KTYTDTANFTLRCGVCQIGVIGQKE 189 (206)
Q Consensus 165 ~~~t~t~~~~l~C~~c~~~~~g~~~ 189 (206)
++|+.-.+...+|.+|+..|.|-..
T Consensus 32 ~q~~kK~~~~pkc~~c~~~l~Gi~~ 56 (121)
T KOG1790|consen 32 YQYVKKKAKLPKCGDCGMRLQGIPA 56 (121)
T ss_pred hHhhHhhccCCCCCcCCcccCCCCC
Confidence 4566666666779999999998543
No 60
>smart00531 TFIIE Transcription initiation factor IIE.
Probab=25.40 E-value=96 Score=24.39 Aligned_cols=62 Identities=18% Similarity=0.101 Sum_probs=41.3
Q ss_pred CeeeeeCCCCCchhHHHHHHHHHHHHHhhCCCccccCCceeeccccCCcccCHHHHHHHHHhhC
Q 028610 135 DQTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIGQKEAVEHAQATG 198 (206)
Q Consensus 135 d~t~f~~~d~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~c~~~~~g~~~a~~Ha~~tg 198 (206)
-++.|-++-..+.+.+.-...++-+.|+++-.+. +.....+|+.|+..+.- .+|...-..+|
T Consensus 61 ~~~yw~i~y~~~~~vik~r~~~~~~~L~~~l~~e-~~~~~Y~Cp~C~~~y~~-~ea~~~~d~~~ 122 (147)
T smart00531 61 YRYYWYINYDTLLDVVKYKLDKMRKRLEDKLEDE-TNNAYYKCPNCQSKYTF-LEANQLLDMDG 122 (147)
T ss_pred EEEEEEecHHHHHHHHHHHHHHHHHHHHHHHhcc-cCCcEEECcCCCCEeeH-HHHHHhcCCCC
Confidence 3566777766666777777777877777766553 33557899999988874 45555422354
No 61
>PF04340 DUF484: Protein of unknown function, DUF484; InterPro: IPR007435 This family consists of several proteins of uncharacterised function.; PDB: 3E98_B.
Probab=24.54 E-value=34 Score=28.77 Aligned_cols=16 Identities=19% Similarity=0.287 Sum_probs=0.0
Q ss_pred HHHHHHHhhChhhhhh
Q 028610 37 QVIAATVASDPVKYSE 52 (206)
Q Consensus 37 ~~v~~~i~~np~~y~e 52 (206)
+.|++|+++|||.|.+
T Consensus 9 ~~V~~yL~~~PdFf~~ 24 (225)
T PF04340_consen 9 EDVAAYLRQHPDFFER 24 (225)
T ss_dssp ----------------
T ss_pred HHHHHHHHhCcHHHHh
Confidence 4789999999998843
No 62
>PF13451 zf-trcl: Probable zinc-binding domain
Probab=24.45 E-value=67 Score=21.07 Aligned_cols=27 Identities=19% Similarity=0.277 Sum_probs=19.6
Q ss_pred CceeeccccCCcccCHHHHHHHHHhhC
Q 028610 172 NFTLRCGVCQIGVIGQKEAVEHAQATG 198 (206)
Q Consensus 172 ~~~l~C~~c~~~~~g~~~a~~Ha~~tg 198 (206)
...|+|-+||..+.=....|+...+-|
T Consensus 2 Dk~l~C~dCg~~FvfTa~EQ~fy~eKg 28 (49)
T PF13451_consen 2 DKTLTCKDCGAEFVFTAGEQKFYAEKG 28 (49)
T ss_pred CeeEEcccCCCeEEEehhHHHHHHhcC
Confidence 357899999999986666666655544
No 63
>PF01199 Ribosomal_L34e: Ribosomal protein L34e; InterPro: IPR008195 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include, vertebrate L34, mosquito L31 [], plant L34 [], yeast putative ribosomal protein YIL052c and archaebacterial L34e.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 3IZR_i 3IZS_i 4A19_L 4A1D_L 4A18_L 4A1B_L.
Probab=23.86 E-value=38 Score=25.24 Aligned_cols=21 Identities=19% Similarity=0.282 Sum_probs=12.4
Q ss_pred CceeeccccCCcccCHHHHHH
Q 028610 172 NFTLRCGVCQIGVIGQKEAVE 192 (206)
Q Consensus 172 ~~~l~C~~c~~~~~g~~~a~~ 192 (206)
.-.-+|.+||..|.|-...+.
T Consensus 39 ~~~pkC~~cg~~L~Gi~~~rp 59 (94)
T PF01199_consen 39 PKKPKCGDCGKPLNGIPALRP 59 (94)
T ss_dssp TT--BSTSSS-BSSSS-SS-S
T ss_pred CCCCCcCccCCcccccccccH
Confidence 334579999999999754443
No 64
>PRK05452 anaerobic nitric oxide reductase flavorubredoxin; Provisional
Probab=23.50 E-value=89 Score=29.55 Aligned_cols=51 Identities=12% Similarity=0.140 Sum_probs=31.5
Q ss_pred eeeeCCCCCchhHHHHHHHHHHHHHhhCCC---ccc--------------cCCceeeccccCCcccCHHH
Q 028610 137 TIFPVQKGRTIGPAEDLALKLVKEQQRKKT---YTD--------------TANFTLRCGVCQIGVIGQKE 189 (206)
Q Consensus 137 t~f~~~d~~~~~~~~~~a~~l~~~~~~~~~---~t~--------------t~~~~l~C~~c~~~~~g~~~ 189 (206)
..|..++ +.++.+.+.+++||+.++ ... .|- ......+|..||++-..+..
T Consensus 373 ~~~~P~e-e~~~~~~~~g~~la~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c~~c~~~yd~~~g 440 (479)
T PRK05452 373 AKWRPDQ-DALELCREHGREIARQWA-LAPLPQSTVNTVVKEETSATTTADLGPRMQCSVCQWIYDPAKG 440 (479)
T ss_pred EEecCCH-HHHHHHHHHHHHHHHHHh-hCCccccccccccccccccccccCCCCeEEECCCCeEECCCCC
Confidence 3444444 345888888888887666 322 111 12344599999998876544
No 65
>PF12907 zf-met2: Zinc-binding
Probab=22.82 E-value=40 Score=21.13 Aligned_cols=23 Identities=22% Similarity=0.609 Sum_probs=18.3
Q ss_pred eeeccccC---CcccCHHHHHHHHHh
Q 028610 174 TLRCGVCQ---IGVIGQKEAVEHAQA 196 (206)
Q Consensus 174 ~l~C~~c~---~~~~g~~~a~~Ha~~ 196 (206)
.++|.+|- -.+..+..-.+|++.
T Consensus 1 ~i~C~iC~qtF~~t~~~~~L~eH~en 26 (40)
T PF12907_consen 1 NIICKICRQTFMQTTNEPQLKEHAEN 26 (40)
T ss_pred CcCcHHhhHHHHhcCCHHHHHHHHHc
Confidence 36899998 566788889999874
No 66
>COG1592 Rubrerythrin [Energy production and conversion]
Probab=22.54 E-value=1e+02 Score=25.30 Aligned_cols=13 Identities=31% Similarity=0.544 Sum_probs=12.0
Q ss_pred eeeccccCCcccC
Q 028610 174 TLRCGVCQIGVIG 186 (206)
Q Consensus 174 ~l~C~~c~~~~~g 186 (206)
..+|.+||++..|
T Consensus 134 ~~vC~vCGy~~~g 146 (166)
T COG1592 134 VWVCPVCGYTHEG 146 (166)
T ss_pred EEEcCCCCCcccC
Confidence 7899999999988
No 67
>PF08209 Sgf11: Sgf11 (transcriptional regulation protein); InterPro: IPR013246 The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae (Baker's yeast). The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation [].; PDB: 3M99_B 2LO2_A 3MHH_C 3MHS_C.
Probab=22.29 E-value=61 Score=19.44 Aligned_cols=19 Identities=21% Similarity=0.237 Sum_probs=14.5
Q ss_pred ceeeccccCCcccCHHHHH
Q 028610 173 FTLRCGVCQIGVIGQKEAV 191 (206)
Q Consensus 173 ~~l~C~~c~~~~~g~~~a~ 191 (206)
....|..|+..+...+-|.
T Consensus 3 ~~~~C~nC~R~v~a~RfA~ 21 (33)
T PF08209_consen 3 PYVECPNCGRPVAASRFAP 21 (33)
T ss_dssp -EEE-TTTSSEEEGGGHHH
T ss_pred CeEECCCCcCCcchhhhHH
Confidence 4568999999999888875
No 68
>PF03884 DUF329: Domain of unknown function (DUF329); InterPro: IPR005584 The biological function of these short proteins is unknown, but they contain four conserved cysteines, suggesting that they all bind zinc. YacG (Q5X8H6 from SWISSPROT) from Escherichia coli has been shown to bind zinc and contains the structural motifs typical of zinc-binding proteins []. The conserved four cysteine motif in these proteins (-C-X(2)-C-X(15)-C-X(3)-C-) is not found in other zinc-binding proteins with known structures.; GO: 0008270 zinc ion binding; PDB: 1LV3_A.
Probab=22.17 E-value=40 Score=22.79 Aligned_cols=13 Identities=31% Similarity=0.746 Sum_probs=7.2
Q ss_pred ceeeccccCCccc
Q 028610 173 FTLRCGVCQIGVI 185 (206)
Q Consensus 173 ~~l~C~~c~~~~~ 185 (206)
|+.+|.+||+...
T Consensus 1 m~v~CP~C~k~~~ 13 (57)
T PF03884_consen 1 MTVKCPICGKPVE 13 (57)
T ss_dssp -EEE-TTT--EEE
T ss_pred CcccCCCCCCeec
Confidence 5789999998764
No 69
>PRK03922 hypothetical protein; Provisional
Probab=22.11 E-value=1e+02 Score=23.66 Aligned_cols=38 Identities=11% Similarity=-0.017 Sum_probs=27.5
Q ss_pred HHHHHHHHHHHHHhh-CCCccccCCceeeccccCCcccC
Q 028610 149 PAEDLALKLVKEQQR-KKTYTDTANFTLRCGVCQIGVIG 186 (206)
Q Consensus 149 ~~~~~a~~l~~~~~~-~~~~t~t~~~~l~C~~c~~~~~g 186 (206)
++--+.-|..+.|+. .-.|-+..--...|..||..|..
T Consensus 23 AI~iAIseaGkrLn~~~l~yVeievG~~~cP~cge~~~~ 61 (113)
T PRK03922 23 AIGVAISEAGKRLNPEDLDYVEVEVGLTICPKCGEPFDS 61 (113)
T ss_pred HHHHHHHHHHhhcCcccCCeEEEecCcccCCCCCCcCCc
Confidence 333444466667777 67788888888899999987753
No 70
>PF13717 zinc_ribbon_4: zinc-ribbon domain
Probab=21.78 E-value=59 Score=19.56 Aligned_cols=14 Identities=21% Similarity=0.446 Sum_probs=10.9
Q ss_pred CCceeeccccCCcc
Q 028610 171 ANFTLRCGVCQIGV 184 (206)
Q Consensus 171 ~~~~l~C~~c~~~~ 184 (206)
.+..++|..||..+
T Consensus 22 ~g~~v~C~~C~~~f 35 (36)
T PF13717_consen 22 KGRKVRCSKCGHVF 35 (36)
T ss_pred CCcEEECCCCCCEe
Confidence 45578999998765
No 71
>PF12091 DUF3567: Protein of unknown function (DUF3567); InterPro: IPR021951 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are about 90 amino acids in length. This protein has a conserved EIVDK sequence motif.
Probab=21.45 E-value=93 Score=22.79 Aligned_cols=36 Identities=25% Similarity=0.301 Sum_probs=26.8
Q ss_pred ChHHHHHHHHHHHhhChhhhhhhhcCCCHHHHHHhhCCC
Q 028610 31 KAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDP 69 (206)
Q Consensus 31 ~~~~lR~~v~~~i~~np~~y~e~~l~~~~~~Y~~~m~~~ 69 (206)
-+...|+.|-.++.+.|..= =++.-+..|+.+|.+|
T Consensus 46 ~Ae~Fr~~V~~li~~~Pt~E---evDdfL~~y~~l~~qP 81 (85)
T PF12091_consen 46 WAEMFREDVQALIASEPTQE---EVDDFLGGYDALMQQP 81 (85)
T ss_pred HHHHHHHHHHHHHhcCCCHH---HHHHHHHHHHHHHhCC
Confidence 35789999999999999842 1133367899998876
No 72
>PLN02748 tRNA dimethylallyltransferase
Probab=21.25 E-value=51 Score=31.36 Aligned_cols=25 Identities=32% Similarity=0.516 Sum_probs=22.9
Q ss_pred eccccCC-cccCHHHHHHHHHhhCCC
Q 028610 176 RCGVCQI-GVIGQKEAVEHAQATGHV 200 (206)
Q Consensus 176 ~C~~c~~-~~~g~~~a~~Ha~~tgH~ 200 (206)
.|.+|++ .+.|+.+=+.|-++..|-
T Consensus 420 ~Ce~C~~~~~~G~~eW~~Hlksr~Hk 445 (468)
T PLN02748 420 VCEACGNKVLRGAHEWEQHKQGRGHR 445 (468)
T ss_pred cccCCCCcccCCHHHHHHHhcchHHH
Confidence 6999998 899999999999998884
No 73
>PF00356 LacI: Bacterial regulatory proteins, lacI family; InterPro: IPR000843 Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis of sequence similarity. One such family groups together a range of proteins, including ascG, ccpA, cytR, ebgR, fruR, galR, galS, lacI, malI, opnR, purF, rafR, rbtR and scrR [, ]. Within this family, the HTH motif is situated towards the N terminus.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 3KJX_C 1ZAY_A 1VPW_A 2PUA_A 1QQA_A 1PNR_A 1JFT_A 1QP4_A 2PUD_A 1JH9_A ....
Probab=21.20 E-value=1.9e+02 Score=18.32 Aligned_cols=27 Identities=19% Similarity=0.218 Sum_probs=22.7
Q ss_pred HHHHHHHhcCCCChHHHHHHHHHHHhh
Q 028610 19 NAVGYVMEHDKNKAPELRQVIAATVAS 45 (206)
Q Consensus 19 rAis~~l~g~~~~~~~lR~~v~~~i~~ 45 (206)
..+|..|.+...-..+.|+.|.+.+++
T Consensus 14 ~TVSr~ln~~~~vs~~tr~rI~~~a~~ 40 (46)
T PF00356_consen 14 STVSRVLNGPPRVSEETRERILEAAEE 40 (46)
T ss_dssp HHHHHHHTTCSSSTHHHHHHHHHHHHH
T ss_pred HHHHHHHhCCCCCCHHHHHHHHHHHHH
Confidence 468899999888889999999888765
No 74
>PF06107 DUF951: Bacterial protein of unknown function (DUF951); InterPro: IPR009296 This family consists of several short hypothetical bacterial proteins of unknown function.
Probab=21.20 E-value=53 Score=22.26 Aligned_cols=15 Identities=20% Similarity=0.616 Sum_probs=12.0
Q ss_pred CCceeeccccCCccc
Q 028610 171 ANFTLRCGVCQIGVI 185 (206)
Q Consensus 171 ~~~~l~C~~c~~~~~ 185 (206)
+.|.|||..||..+-
T Consensus 28 aDikikC~gCg~~im 42 (57)
T PF06107_consen 28 ADIKIKCLGCGRQIM 42 (57)
T ss_pred CcEEEEECCCCCEEE
Confidence 568899999997653
No 75
>PF00653 BIR: Inhibitor of Apoptosis domain; InterPro: IPR001370 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. The baculovirus inhibitor of apoptosis protein repeat (BIR) is a domain of tandem repeats separated by a variable length linker that seems to confer cell death-preventing activity [, ]. The BIR domains characterise the Inhibitor of Apoptosis (IAP) family of proteins (MEROPS proteinase inhibitor family I32, clan IV) that suppress apoptosis by interacting with and inhibiting the enzymatic activity of both initiator and effector caspases (MEROPS peptidase family C14, IPR002398 from INTERPRO). Several distinct mammalian IAPs including XIAP, c-IAP1, c-IAP2, and ML-IAP, have been identified, and they all exhibit antiapoptotic activity in cell culture. The functional unit in each IAP protein is the baculoviral IAP repeat (BIR), which contains approximately 80 amino acids folded around a zinc atom. Most mammalian IAPs have more than one BIR domain, with the different BIR domains performing distinct functions. 
For example, in XIAP, the third BIR domain (BIR3) potently inhibits the catalytic activity of caspase-9, whereas the linker sequences immediately preceding the second BIR domain (BIR2) selectively targets caspase-3 or -7. The first-recognised members of family MEROPS inhibitor family I32 were viral proteins that inhibited the apoptosis of infected cells: Cp-IAP from Cydia pomonella granulosis virus (CpGV) [] and Op-IAP from Orgyia pseudotsugata multicapsid polyhedrosis virus(OpMNPV) []. The discovery of homologous proteins in mammals followed soon after with the recognition that mutations in the gene for neuronal apoptosis inhibitory protein (NIAP) underlie spinal muscular atrophy []. The inhibitors in family I32 all possess one or more 80-residue domains known as BIR (baculovirus inhibitor repeat) domains and have accordingly been termed 'BIR-containing' or 'BIRC' proteins as well as IAP proteins. The mechanism of inhibition of caspases by the IAP proteins is complex, and reactive site residues cannot yet be identified with any confidence. Despite the conservation of the BIR or IAP (inhibitor of apoptosis) domains throughout the family it seems clear that other parts of the molecules also make essential contributions to inhibitory activity. Homologs of most components in the mammalian apoptotic pathway have been identified in fruit flies. The Drosophila Apaf-1, known as Dapaf-1, HAC-1 or Dark, shares significant sequence similarity with its mammalian counterpart, and is critically important for the activation of the Drosophila initiator caspase Dronc. Dronc, in turn, cleaves and activates the effector caspase DrICE. The Drosophila IAP, DIAP1, binds to and in-activates both DrICE and Dronc through its BIR1 and BIR2 domains. During apoptosis, the anti-death function of DIAP1 is countered by at least four pro-apoptotic proteins, Reaper, Hid, Grim, and sickle, through direct physical interactions. 
These four proteins represent the functional homologs of the mammalian protein Smac, and they all share a conserved IAP-binding motif at their N termini. The three proteins Reaper, Hid, and Grim are collectively referred to as the RHG proteins. Both XIAP and DIAP1 contain a RING domain at their C termini, and can act as E3 ubiquitin ligases. Indeed, both XIAP and DIAP1 have been shown to promote self-ubiquitination and degradation as well as to negatively regulate the target caspases. Nonetheless, important differences exist between XIAP and DIAP1. The primary function of XIAP is thought to be inhibition of the catalytic activities of caspases; to what extent the ubiquitinating activity of XIAP contributes to its function remains unclear. For DIAP1, however, the ubiquitinating activity appears to be essential for its function. Recently a Drosophila p53 protein has been identified that mediates apoptosis via a novel pathway involving the activation of the Reaper gene and subsequent inhibition of the inhibitors of apoptosis (IAPs). CIAP1, a major mammalian homologue of Drosophila IAPs, is irreversibly inhibited (cleaved) during p53-dependent apoptosis and this cleavage is mediated by a serine protease. Serine protease inhibitors that block CIAP1 cleavage inhibit p53-dependent apoptosis. Furthermore, activation of the p53 protein increases the transcription of the HTRA2 gene, which encodes a serine protease that interacts with CIAP1 and potentiates apoptosis. Therefore mammalian p53 protein activates apoptosis through a novel pathway functionally similar to that in Drosophila, which involves HTRA2 and subsequent inhibition of CIAP1 by cleavage.; GO: 0005622 intracellular; PDB: 3HL5_B 3UW5_A 3CM7_A 1G3F_A 1G73_C 3G76_G 3CM2_C 2VSL_A 2OPZ_B 3CLX_A ....
Probab=20.91 E-value=1.1e+02 Score=20.67 Aligned_cols=39 Identities=21% Similarity=0.269 Sum_probs=26.2
Q ss_pred hhCCCccccCCceeeccccCCccc----CHHHHHHHHHhhCCCc
Q 028610 162 QRKKTYTDTANFTLRCGVCQIGVI----GQKEAVEHAQATGHVN 201 (206)
Q Consensus 162 ~~~~~~t~t~~~~l~C~~c~~~~~----g~~~a~~Ha~~tgH~~ 201 (206)
++.=|||.+ ...++|-.||..+. ++.-.++|.+..-...
T Consensus 25 ~aGFyy~~~-~d~v~C~~C~~~l~~w~~~Ddp~~~H~~~sp~C~ 67 (70)
T PF00653_consen 25 RAGFYYTGT-GDRVRCFYCGLELDNWEPNDDPWEEHKRHSPNCP 67 (70)
T ss_dssp HTTEEEESS-TTEEEETTTTEEEES-STT--HHHHHHHHSTTBH
T ss_pred HCCCEEcCC-CCEEEEeccCCEEeCCCCCCCHHHHHHHHCcCCe
Confidence 445667766 78899999999885 4455677877554443
No 76
>PF15412 Nse4-Nse3_bdg: Binding domain of Nse4/EID3 to Nse3-MAGE
Probab=20.72 E-value=35 Score=22.64 Aligned_cols=36 Identities=19% Similarity=0.154 Sum_probs=27.8
Q ss_pred CCchhHHHHHHHHHHHHHhhCCCccccCCceeeccc
Q 028610 144 GRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGV 179 (206)
Q Consensus 144 ~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~l~C~~ 179 (206)
.+.|-.+-++|.+-|+.|+-..-..|+..|.-+|..
T Consensus 2 S~~Lv~aSdla~~ka~~lk~~~~~fd~deFv~~l~~ 37 (56)
T PF15412_consen 2 SRLLVLASDLAAEKARNLKFGGSGFDVDEFVSKLKT 37 (56)
T ss_pred cHHHHHHHHHHHHHHHHhccCCCccCHHHHHHHHHH
Confidence 344566778888888888999888899999666644
No 77
>PF10588 NADH-G_4Fe-4S_3: NADH-ubiquinone oxidoreductase-G iron-sulfur binding region; InterPro: IPR019574 NADH:ubiquinone oxidoreductase (complex I) (1.6.5.3 from EC) is a respiratory-chain enzyme that catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane (NADH + ubiquinone = NAD+ + ubiquinol). Complex I is a major source of reactive oxygen species (ROS) that are predominantly formed by electron transfer from FMNH(2). Complex I is found in bacteria, cyanobacteria (as a NADH-plastoquinone oxidoreductase), archaea, mitochondria, and in the hydrogenosome, a mitochondria-derived organelle. In general, the bacterial complex consists of 14 different subunits, while the mitochondrial complex contains homologues to these subunits in addition to approximately 31 additional proteins. Mitochondrial complex I, which is located in the inner mitochondrial membrane, is the largest multimeric respiratory enzyme in the mitochondria, consisting of more than 40 subunits, one FMN co-factor and eight FeS clusters. The assembly of mitochondrial complex I is an intricate process that requires the cooperation of the nuclear and mitochondrial genomes. Mitochondrial complex I can cycle between active and deactive forms that can be distinguished by the reactivity towards divalent cations and thiol-reactive agents. All redox prosthetic groups reside in the peripheral arm of the L-shaped structure. The NADH oxidation domain harbouring the FMN cofactor is connected via a chain of iron-sulphur clusters to the ubiquinone reduction site that is located in a large pocket formed by the PSST and 49kDa subunits of complex I. 
This entry describes the G subunit (one of 14 subunits, A to N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria while translocating protons, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This family does not contain related subunits from formate dehydrogenase complexes. This entry represents the iron-sulphur binding domain of the G subunit.; GO: 0016491 oxidoreductase activity, 0055114 oxidation-reduction process; PDB: 3M9S_C 2FUG_L 3IAS_L 2YBB_3 3IAM_3 3I9V_3.
Probab=20.60 E-value=79 Score=19.55 Aligned_cols=18 Identities=39% Similarity=0.361 Sum_probs=12.3
Q ss_pred ccchHHHHHHHHHhCCcE
Q 028610 72 WGGAIELSILADYYGREI 89 (206)
Q Consensus 72 WGg~iEL~ala~~~~v~I 89 (206)
=.|+.||+.++..|++.-
T Consensus 21 ~~G~CeLQ~~~~~~gv~~ 38 (41)
T PF10588_consen 21 KNGNCELQDLAYEYGVDE 38 (41)
T ss_dssp TGGG-HHHHHHHHH-S--
T ss_pred CCCCCHHHHHHHHhCCCc
Confidence 368999999999999864
No 78
>PF14749 Acyl-CoA_ox_N: Acyl-coenzyme A oxidase N-terminal; PDB: 2FON_A 1W07_B 1IS2_B 2DDH_A.
Probab=20.19 E-value=2.3e+02 Score=21.15 Aligned_cols=31 Identities=19% Similarity=0.304 Sum_probs=23.6
Q ss_pred HHHHHHhcCCCChHHHHHHHHHHHhhChhhhh
Q 028610 20 AVGYVMEHDKNKAPELRQVIAATVASDPVKYS 51 (206)
Q Consensus 20 Ais~~l~g~~~~~~~lR~~v~~~i~~np~~y~ 51 (206)
-++..|+|+.. ..+.|+.+.+.|.++|....
T Consensus 4 eLt~~l~Gg~~-~~~~rr~i~~~i~~dP~f~~ 34 (125)
T PF14749_consen 4 ELTNFLDGGEE-KLERRREIESLIESDPIFSK 34 (125)
T ss_dssp HHHHHHHSSHH-HHHHHHHHHHHHHT-GGG--
T ss_pred HHHHHHcCCHH-HHHHHHHHHHHHhhChhhhc
Confidence 47888998864 47899999999999998654
No 79
>KOG2785 consensus C2H2-type Zn-finger protein [General function prediction only]
Probab=20.16 E-value=95 Score=28.83 Aligned_cols=34 Identities=21% Similarity=0.400 Sum_probs=30.4
Q ss_pred cccCCceeeccccCCcccCHHHHHHHHHhhCCCc
Q 028610 168 TDTANFTLRCGVCQIGVIGQKEAVEHAQATGHVN 201 (206)
Q Consensus 168 t~t~~~~l~C~~c~~~~~g~~~a~~Ha~~tgH~~ 201 (206)
-+...+.+.|.+|.+.+..+..-..|.++--|..
T Consensus 62 ~e~~~~~~~c~~c~k~~~s~~a~~~hl~Sk~h~~ 95 (390)
T KOG2785|consen 62 LEEAESVVYCEACNKSFASPKAHENHLKSKKHVE 95 (390)
T ss_pred hhhcccceehHHhhccccChhhHHHHHHHhhcch
Confidence 3667889999999999999999999999988864
Done!