Query 028455
Match_columns 208
No_of_seqs 223 out of 971
Neff 6.7
Searched_HMMs 46136
Date Fri Mar 29 11:45:44 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/028455.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/028455hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG3288 OTU-like cysteine prot 100.0 2.9E-68 6.3E-73 448.7 13.8 201 1-207 106-307 (307)
2 COG5539 Predicted cysteine pro 100.0 1.4E-36 3.1E-41 260.7 5.3 189 10-207 117-306 (306)
3 PF02338 OTU: OTU-like cystein 99.9 1.2E-26 2.5E-31 177.5 8.4 104 11-121 1-121 (121)
4 KOG2606 OTU (ovarian tumor)-li 99.9 1.3E-25 2.9E-30 193.7 8.2 122 4-127 158-298 (302)
5 PF10275 Peptidase_C65: Peptid 99.6 6E-15 1.3E-19 126.4 12.4 92 34-126 141-244 (244)
6 KOG3991 Uncharacterized conser 99.6 4.3E-15 9.2E-20 124.7 6.9 94 33-127 157-256 (256)
7 KOG2605 OTU (ovarian tumor)-li 99.4 4.1E-14 8.9E-19 127.7 3.5 120 5-127 218-344 (371)
8 COG5539 Predicted cysteine pro 99.0 1.1E-10 2.3E-15 101.3 2.6 115 5-124 171-304 (306)
9 PF05415 Peptidase_C36: Beet n 96.1 0.017 3.8E-07 42.4 5.4 78 10-106 3-84 (104)
10 PF02148 zf-UBP: Zn-finger in 87.1 0.28 6.2E-06 33.3 0.9 30 175-204 10-42 (63)
11 PF12874 zf-met: Zinc-finger o 84.5 0.41 8.8E-06 26.0 0.6 25 177-201 1-25 (25)
12 PF12756 zf-C2H2_2: C2H2 type 83.8 1.1 2.4E-05 31.8 2.8 30 176-205 50-79 (100)
13 PF00096 zf-C2H2: Zinc finger, 79.1 1.2 2.7E-05 23.5 1.2 21 178-198 2-22 (23)
14 smart00290 ZnF_UBP Ubiquitin C 77.6 1.4 3.1E-05 28.0 1.4 29 176-204 11-42 (50)
15 PF12171 zf-C2H2_jaz: Zinc-fin 77.2 0.45 9.7E-06 26.7 -1.0 25 177-201 2-26 (27)
16 PF13894 zf-C2H2_4: C2H2-type 72.8 2.8 6.1E-05 21.6 1.6 21 178-198 2-22 (24)
17 KOG0804 Cytoplasmic Zn-finger 71.5 2.3 5E-05 39.9 1.6 25 178-202 242-269 (493)
18 smart00355 ZnF_C2H2 zinc finge 71.2 3 6.5E-05 21.7 1.5 21 177-197 1-21 (26)
19 PF05412 Peptidase_C33: Equine 67.8 3.6 7.8E-05 31.2 1.7 84 10-128 4-87 (108)
20 PF13912 zf-C2H2_6: C2H2-type 64.5 5.2 0.00011 21.9 1.6 21 177-197 2-22 (27)
21 PF05379 Peptidase_C23: Carlav 60.4 33 0.00072 25.0 5.6 16 14-29 3-18 (89)
22 smart00451 ZnF_U1 U1-like zinc 59.5 5.4 0.00012 23.1 1.1 25 177-201 4-28 (35)
23 PHA03082 DNA-dependent RNA pol 57.4 5.8 0.00013 26.8 1.1 19 174-192 2-20 (63)
24 PF05864 Chordopox_RPO7: Chord 57.1 6.1 0.00013 26.7 1.2 18 175-192 3-20 (63)
25 PRK10963 hypothetical protein; 54.0 10 0.00022 32.1 2.3 36 37-78 6-41 (223)
26 PF13913 zf-C2HC_2: zinc-finge 53.2 11 0.00025 20.8 1.7 21 176-197 2-22 (25)
27 COG4049 Uncharacterized protei 53.0 6.9 0.00015 26.5 0.9 26 173-198 14-39 (65)
28 PHA00616 hypothetical protein 52.7 7.6 0.00016 24.8 1.0 29 177-205 2-31 (44)
29 PF05381 Peptidase_C21: Tymovi 43.4 1.4E+02 0.0031 22.5 6.8 89 13-123 2-94 (104)
30 cd00729 rubredoxin_SM Rubredox 41.0 13 0.00028 22.2 0.7 14 176-189 2-15 (34)
31 PF09237 GAGA: GAGA factor; I 38.8 26 0.00056 23.3 1.9 22 178-199 26-47 (54)
32 cd02669 Peptidase_C19M A subfa 38.0 16 0.00035 33.9 1.2 32 173-204 25-59 (440)
33 PF04475 DUF555: Protein of un 38.0 42 0.0009 25.2 3.1 38 151-188 22-59 (102)
34 KOG1247 Methionyl-tRNA synthet 38.0 19 0.00041 34.0 1.6 61 116-188 86-148 (567)
35 PF13465 zf-H2C2_2: Zinc-finge 36.4 17 0.00037 20.0 0.7 18 170-187 8-25 (26)
36 COG3426 Butyrate kinase [Energ 36.3 53 0.0011 29.6 4.0 59 18-84 275-341 (358)
37 PHA02768 hypothetical protein; 36.1 32 0.00068 23.0 2.0 21 178-198 7-27 (55)
38 PF09082 DUF1922: Domain of un 35.9 13 0.00027 26.0 0.1 21 166-187 10-30 (68)
39 PF13909 zf-H2C2_5: C2H2-type 35.2 41 0.00089 17.7 2.1 21 177-198 1-21 (24)
40 PF07368 DUF1487: Protein of u 34.2 64 0.0014 27.5 4.1 66 71-145 144-213 (215)
41 COG3357 Predicted transcriptio 33.8 49 0.0011 24.5 2.9 37 150-188 34-70 (97)
42 cd01675 RNR_III Class III ribo 33.1 57 0.0012 31.5 4.1 36 150-189 495-531 (555)
43 PF04877 Hairpins: HrpZ; Inte 33.0 36 0.00078 30.4 2.4 50 31-85 162-211 (308)
44 TIGR02934 nifT_nitrog probable 31.3 4.3 9.3E-05 28.3 -2.8 44 44-103 6-50 (67)
45 PRK09784 hypothetical protein; 31.2 25 0.00055 30.9 1.2 20 5-24 200-219 (417)
46 PRK06266 transcription initiat 31.0 41 0.00088 27.6 2.4 49 138-187 80-128 (178)
47 PHA00732 hypothetical protein 30.8 44 0.00096 23.7 2.2 10 178-187 29-38 (79)
48 PF05148 Methyltransf_8: Hypot 30.2 1.3E+02 0.0028 25.8 5.2 72 16-97 13-104 (219)
49 COG2051 RPS27A Ribosomal prote 29.0 29 0.00063 24.1 1.0 15 173-187 35-49 (67)
50 PF13240 zinc_ribbon_2: zinc-r 29.0 33 0.00072 18.6 1.0 14 170-186 10-23 (23)
51 PF07967 zf-C3HC: C3HC zinc fi 28.8 30 0.00066 26.7 1.2 23 167-189 34-56 (133)
52 cd00350 rubredoxin_like Rubred 28.3 22 0.00048 20.8 0.2 14 177-190 2-15 (33)
53 PF04959 ARS2: Arsenite-resist 27.8 45 0.00098 28.3 2.2 28 170-197 71-98 (214)
54 KOG1790 60s ribosomal protein 26.8 23 0.00049 27.4 0.2 25 167-191 32-56 (121)
55 smart00238 BIR Baculoviral inh 26.8 1.2E+02 0.0027 20.2 3.9 40 164-204 25-68 (71)
56 PF13451 zf-trcl: Probable zin 26.7 54 0.0012 21.4 1.9 27 174-200 2-28 (49)
57 COG2174 RPL34A Ribosomal prote 26.3 30 0.00065 25.6 0.7 14 177-190 35-48 (93)
58 COG5134 Uncharacterized conser 25.1 46 0.001 28.5 1.7 32 152-183 14-49 (272)
59 cd00022 BIR Baculoviral inhibi 25.1 1.3E+02 0.0029 19.9 3.8 40 164-204 23-66 (69)
60 PRK13731 conjugal transfer sur 24.9 2E+02 0.0042 25.1 5.5 45 137-185 50-104 (243)
61 TIGR00373 conserved hypothetic 24.1 59 0.0013 26.0 2.1 49 138-187 72-120 (158)
62 PF05413 Peptidase_C34: Putati 23.2 67 0.0015 23.3 2.0 89 7-123 2-90 (92)
63 PRK03922 hypothetical protein; 21.8 1E+02 0.0022 23.6 2.8 38 150-187 21-60 (113)
64 PF15412 Nse4-Nse3_bdg: Bindin 21.8 31 0.00068 22.8 0.1 36 146-181 2-37 (56)
65 PF06107 DUF951: Bacterial pro 21.6 48 0.001 22.4 0.9 15 173-187 28-42 (57)
66 PF13717 zinc_ribbon_4: zinc-r 21.3 58 0.0013 19.5 1.2 14 173-186 22-35 (36)
67 PLN02748 tRNA dimethylallyltra 21.0 51 0.0011 31.3 1.3 26 177-202 419-445 (468)
68 PF03884 DUF329: Domain of unk 21.0 46 0.001 22.4 0.8 13 175-187 1-13 (57)
69 PF08209 Sgf11: Sgf11 (transcr 20.9 60 0.0013 19.4 1.2 19 175-193 3-21 (33)
70 PRK05452 anaerobic nitric oxid 20.6 1.1E+02 0.0023 29.0 3.4 53 139-192 373-441 (479)
71 PF00653 BIR: Inhibitor of Apo 20.3 1.2E+02 0.0025 20.5 2.7 39 164-203 25-67 (70)
72 PF08782 c-SKI_SMAD_bind: c-SK 20.2 38 0.00083 25.2 0.3 25 167-191 19-43 (96)
73 PF14300 DUF4375: Domain of un 20.1 55 0.0012 24.7 1.1 18 33-50 106-123 (123)
74 PF10571 UPF0547: Uncharacteri 20.1 51 0.0011 18.5 0.7 12 176-187 14-25 (26)
No 1
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=2.9e-68 Score=448.66 Aligned_cols=201 Identities=57% Similarity=1.023 Sum_probs=191.8
Q ss_pred CCCcEEEEEeCCCCchhhHHHHHHhhcCCCc-hHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHH
Q 028455 1 MEGIIVRRVIPSDNSCLFNAVGYVMEHDKNK-APELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELS 79 (208)
Q Consensus 1 ~~~~l~~~~ip~DGnCLFrAis~~l~~~~~~-~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~ 79 (208)
|+|.|.+|+||+||||||+||+|.+.+.... ..+||++||..+.+||+.|+++|||++..+||.||+++.+|||+|||+
T Consensus 106 ~~gvl~~~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n~eYc~WI~k~dsWGGaIEls 185 (307)
T KOG3288|consen 106 GEGVLSRRVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPNKEYCAWILKMDSWGGAIELS 185 (307)
T ss_pred ccceeEEEeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCcHHHHHHHccccccCceEEee
Confidence 5789999999999999999999999987543 469999999999999999999999999999999999999999999999
Q ss_pred HHHHhhCceEEEEECCCCceeEeCCCCCCCCeEEEEEcCccceeeecCCCCCCCCCCCeeeeeCCCCCcchHHHHHHHHH
Q 028455 80 ILADYYGREIAAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQKGRTIGPAEDLALKL 159 (208)
Q Consensus 80 als~~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~~~~~~~~~d~~~f~~~~~~~~~~~~~~a~~l 159 (208)
|||++|+++|+|+|+++.++++||+++++..|++|+|+|+|||++++++. .|.+.|.|+||.+| +.++.+|++|
T Consensus 186 ILS~~ygveI~vvDiqt~rid~fged~~~~~rv~llydGIHYD~l~m~~~--~~~~~~~tifp~~d----d~v~~~alqL 259 (307)
T KOG3288|consen 186 ILSDYYGVEICVVDIQTVRIDRFGEDKNFDNRVLLLYDGIHYDPLAMNEF--KPTDVDNTIFPVSD----DTVLTQALQL 259 (307)
T ss_pred eehhhhceeEEEEecceeeehhcCCCCCCCceEEEEecccccChhhhccC--CccCCccccccccc----chHHHHHHHH
Confidence 99999999999999999999999999999999999999999999999976 57778899999999 5678999999
Q ss_pred HHHHhhCCCccccCCceeeccccCCCcCCHHHHHHHHHhhCCCccccc
Q 028455 160 VKEQQRKKTYTDTANFTLRCGVCQIGVIGQKEAVEHAQATGHVNFQEY 207 (208)
Q Consensus 160 ~~~~~~~~~~t~t~~~~~~C~~c~~~~~g~~~a~~ha~~tgH~~F~e~ 207 (208)
|++||++||||||++|+|||.+|+..|+||++|++||++|||+||+|+
T Consensus 260 a~~~k~~r~ytdt~~ftlRC~~Cq~glvGq~ea~eHA~~TGH~nFge~ 307 (307)
T KOG3288|consen 260 ASELKRTRYYTDTAKFTLRCMVCQMGLVGQKEAAEHAKATGHVNFGEY 307 (307)
T ss_pred HHHHHhcceeccccceEEEeeecccceeeHHHHHHHHHhcCCCccccC
Confidence 999999999999999999999999999999999999999999999996
No 2
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.4e-36 Score=260.68 Aligned_cols=189 Identities=29% Similarity=0.518 Sum_probs=166.2
Q ss_pred eCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCccc-CHHHHHHHHHhhCce
Q 028455 10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWG-GAIELSILADYYGRE 88 (208)
Q Consensus 10 ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WG-G~iEL~als~~~~~~ 88 (208)
.-+|++|+|++.++.++.. ...+||.+|+..+.+|||.|+.++++.|.-.|+.||.++..|| |+||+.++|..+++.
T Consensus 117 ~~~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~ 194 (306)
T COG5539 117 GQDDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVR 194 (306)
T ss_pred CCCchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcchHHHHHhhhccccCCCceEEEeEecccccee
Confidence 3467999999999999864 6789999999999999999999999999999999999999999 999999999999999
Q ss_pred EEEEECCCCceeEeCCCCCCCCeEEEEEcCccceeeecCCCCCCCCCCCeeeeeCCCCCcchHHHHHHHHHHHHHhhCCC
Q 028455 89 IAAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQKGRTIGPAEDLALKLVKEQQRKKT 168 (208)
Q Consensus 89 I~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~~~~~~~~~d~~~f~~~~~~~~~~~~~~a~~l~~~~~~~~~ 168 (208)
|+++++.+.+.++|++.. +..++.++|+|+|||.....-.+ ..+..+.-.|+.+| .+...+++||+-|+..+|
T Consensus 195 i~~Vdv~~~~~dr~~~~~-~~q~~~i~f~g~hfD~~t~~m~~-~dt~~ne~~~~a~~-----g~~~ei~qLas~lk~~~~ 267 (306)
T COG5539 195 IHVVDVDKDSEDRYNSHP-YVQRISILFTGIHFDEETLAMVL-WDTYVNEVLFDASD-----GITIEIQQLASLLKNPHY 267 (306)
T ss_pred eeeeecchhHHhhccCCh-hhhhhhhhhcccccchhhhhcch-HHHHHhhhcccccc-----cchHHHHHHHHHhcCceE
Confidence 999999999999999887 67888899999999999865321 12223344555554 245667789999999999
Q ss_pred ccccCCceeeccccCCCcCCHHHHHHHHHhhCCCccccc
Q 028455 169 YTDTANFTLRCGVCQIGVIGQKEAVEHAQATGHVNFQEY 207 (208)
Q Consensus 169 ~t~t~~~~~~C~~c~~~~~g~~~a~~ha~~tgH~~F~e~ 207 (208)
||||+++++||+.||+.|.|++++-+||..|||+||+|.
T Consensus 268 ~~nT~~~~ik~n~c~~~~~~e~~~~~Ha~a~GH~n~~~d 306 (306)
T COG5539 268 YTNTASPSIKCNICGTGFVGEKDYYAHALATGHYNFGED 306 (306)
T ss_pred EeecCCceEEeeccccccchhhHHHHHHHhhcCccccCC
Confidence 999999999999999999999999999999999999973
No 3
>PF02338 OTU: OTU-like cysteine protease; InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65). None of these proteins has a known biochemical function, but low sequence similarity with the polyprotein regions of arteriviruses, together with conserved cysteine and histidine (and possibly aspartate) residues, suggests that those not yet recognised as peptidases could possess cysteine protease activity.; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.94 E-value=1.2e-26 Score=177.50 Aligned_cols=104 Identities=30% Similarity=0.544 Sum_probs=86.0
Q ss_pred CCCCchhhHHHHHHhh----cCCCchHHHHHHHHHHHh-cChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhh
Q 028455 11 PSDNSCLFNAVGYVME----HDKNKAPELRQVIAATVA-SDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYY 85 (208)
Q Consensus 11 p~DGnCLFrAis~~l~----~~~~~~~~lR~~va~~I~-~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~ 85 (208)
||||||||||||++|+ +++..|.+||+.|+++|+ .|++.| +.++... +|+++++|||++||+|||.+|
T Consensus 1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~-~~~~~~~------~~~~~~~Wg~~~el~a~a~~~ 73 (121)
T PF02338_consen 1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKF-EEFLEGD------KMSKPGTWGGEIELQALANVL 73 (121)
T ss_dssp -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHH-HHHHHHH------HHTSTTSHEEHHHHHHHHHHH
T ss_pred CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchh-hhhhhhh------hhccccccCcHHHHHHHHHHh
Confidence 8999999999999999 999999999999999999 999999 5555433 999999999999999999999
Q ss_pred CceEEEEECCCCceeE---eCC---CCCCCCeEEEEEc------Cccc
Q 028455 86 GREIAAYDIQTTRCDL---YGQ---EKKYSERVMLIYD------GLHY 121 (208)
Q Consensus 86 ~~~I~V~d~~~~~~~~---fg~---~~~~~~~i~llY~------G~HY 121 (208)
+++|+|++...+.... +.. ......++.+.|. |+||
T Consensus 74 ~~~I~v~~~~~~~~~~~~~~~~~~~~~~~~~~i~l~~~~~l~~~~~Hy 121 (121)
T PF02338_consen 74 NRPIIVYSSSDGDNVVFIKFTGKYPPLESPPPICLCYHGHLYYTGNHY 121 (121)
T ss_dssp TSEEEEECETTTBEEEEEEESCEESTTTTTTSEEEEEETEEEEETTEE
T ss_pred CCeEEEEEcCCCCccceeeecCccccCCCCCeEEEEEcCCccCCCCCC
Confidence 9999999886664322 222 2334567777775 6898
No 4
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.92 E-value=1.3e-25 Score=193.67 Aligned_cols=122 Identities=25% Similarity=0.428 Sum_probs=105.5
Q ss_pred cEEEEEeCCCCchhhHHHHHHhhcC---CCchHHHHHHHHHHHhcChhcchhhhcC----------CCHHHHHHHhCCCC
Q 028455 4 IIVRRVIPSDNSCLFNAVGYVMEHD---KNKAPELRQVIAATVASDPVKYSEAFLG----------KSNQEYCSWIQDPE 70 (208)
Q Consensus 4 ~l~~~~ip~DGnCLFrAis~~l~~~---~~~~~~lR~~va~~I~~np~~y~e~~l~----------~~~~eY~~~i~~~~ 70 (208)
.|....||+||+|||+||++||.-. ..+...||..+|+||++|.++| .+|+- .+|+.||+.|.++.
T Consensus 158 ~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df-~pf~~~eet~d~~~~~~f~~Yc~eI~~t~ 236 (302)
T KOG2606|consen 158 GLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDF-LPFLLDEETGDSLGPEDFDKYCREIRNTA 236 (302)
T ss_pred cCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHh-hhHhcCccccccCCHHHHHHHHHHhhhhc
Confidence 4788999999999999999999532 3577899999999999999999 56652 14999999999999
Q ss_pred cccCHHHHHHHHHhhCceEEEEECCCCceeEeCCCCCCCCeEEEEE------cCccceeeecC
Q 028455 71 KWGGAIELSILADYYGREIAAYDIQTTRCDLYGQEKKYSERVMLIY------DGLHYDALAIS 127 (208)
Q Consensus 71 ~WGG~iEL~als~~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~llY------~G~HYD~l~~~ 127 (208)
.|||+|||.|||..|++||.||..+.+ +..||+..+..+|++|+| .|.||+++.+.
T Consensus 237 ~WGgelEL~AlShvL~~PI~Vy~~~~p-~~~~geey~kd~pL~lvY~rH~y~LGeHYNS~~~~ 298 (302)
T KOG2606|consen 237 AWGGELELKALSHVLQVPIEVYQADGP-ILEYGEEYGKDKPLILVYHRHAYGLGEHYNSVTPL 298 (302)
T ss_pred cccchHHHHHHHHhhccCeEEeecCCC-ceeechhhCCCCCeeeehHHhHHHHHhhhcccccc
Confidence 999999999999999999999997755 889998876568888887 46799998764
No 5
>PF10275 Peptidase_C65: Peptidase C65 Otubain; InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour) domain, in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=99.61 E-value=6e-15 Score=126.38 Aligned_cols=92 Identities=23% Similarity=0.350 Sum_probs=69.6
Q ss_pred HHHHHHHHHHhcChhcchhhhcC----CCHHHHHH-HhCCCCcccCHHHHHHHHHhhCceEEEEECCCC------ceeEe
Q 028455 34 ELRQVIAATVASDPVKYSEAFLG----KSNQEYCS-WIQDPEKWGGAIELSILADYYGREIAAYDIQTT------RCDLY 102 (208)
Q Consensus 34 ~lR~~va~~I~~np~~y~e~~l~----~~~~eY~~-~i~~~~~WGG~iEL~als~~~~~~I~V~d~~~~------~~~~f 102 (208)
.||..++.||+.|++.| ++|+. .++++||+ .+...+.-.+++.|.|||++++++|.|+-++.. ....|
T Consensus 141 flRLlts~~l~~~~d~y-~~fi~~~~~~tve~~C~~~Vep~~~Ead~v~i~ALa~aL~v~i~v~yld~~~~~~~~~~~~~ 219 (244)
T PF10275_consen 141 FLRLLTSAYLKSNSDEY-EPFIDGLEYLTVEEFCSQEVEPMGKEADHVQIIALAQALGVPIRVEYLDRSVEGDEVNRHEF 219 (244)
T ss_dssp HHHHHHHHHHHHTHHHH-GGGSSTT--S-HHHHHHHHTSSTT--B-HHHHHHHHHHHT--EEEEESSSSGCSTTSEEEEE
T ss_pred HHHHHHHHHHHhhHHHH-hhhhcccccCCHHHHHHhhcccccccchhHHHHHHHHHhCCeEEEEEecCCCCCCccccccC
Confidence 58999999999999999 77875 78999997 577789999999999999999999999876632 23445
Q ss_pred CCC-CCCCCeEEEEEcCccceeeec
Q 028455 103 GQE-KKYSERVMLIYDGLHYDALAI 126 (208)
Q Consensus 103 g~~-~~~~~~i~llY~G~HYD~l~~ 126 (208)
.+. .+...+|.|||...|||++++
T Consensus 220 ~~~~~~~~~~i~LLyrpgHYdIly~ 244 (244)
T PF10275_consen 220 PPDNESQEPQITLLYRPGHYDILYP 244 (244)
T ss_dssp S-SSTTSS-SEEEEEETBEEEEEEE
T ss_pred CCccCCCCCEEEEEEcCCccccccC
Confidence 432 224678899999999999985
No 6
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=99.57 E-value=4.3e-15 Score=124.70 Aligned_cols=94 Identities=21% Similarity=0.319 Sum_probs=75.6
Q ss_pred HHHHHHHHHHHhcChhcchhhhcC--CCHHHHHHHhCCC-CcccCHHHHHHHHHhhCceEEEEECCCCceeEeCCC---C
Q 028455 33 PELRQVIAATVASDPVKYSEAFLG--KSNQEYCSWIQDP-EKWGGAIELSILADYYGREIAAYDIQTTRCDLYGQE---K 106 (208)
Q Consensus 33 ~~lR~~va~~I~~np~~y~e~~l~--~~~~eY~~~i~~~-~~WGG~iEL~als~~~~~~I~V~d~~~~~~~~fg~~---~ 106 (208)
..||..++.+|++|++.| ++|++ ++.++||..--.| ..-.|+|+|.|||+++++.|.|..++.+.....+.- .
T Consensus 157 ~ylRLvtS~~ik~~adfy-~pFI~e~~tV~~fC~~eVEPm~kesdhi~I~ALs~Al~i~irVey~dr~~~~~~~hH~fpe 235 (256)
T KOG3991|consen 157 MYLRLVTSGFIKSNADFY-QPFIDEGMTVKAFCTQEVEPMYKESDHIHITALSQALGIRIRVEYVDRGSGDTVNHHDFPE 235 (256)
T ss_pred HHHHHHHHHHHhhChhhh-hccCCCCCcHHHHHHhhcchhhhccCceeHHHHHhhhCceEEEEEecCCCCCCCCCCcCcc
Confidence 469999999999999999 88884 6999999975444 788999999999999999999988765432222211 2
Q ss_pred CCCCeEEEEEcCccceeeecC
Q 028455 107 KYSERVMLIYDGLHYDALAIS 127 (208)
Q Consensus 107 ~~~~~i~llY~G~HYD~l~~~ 127 (208)
+..++|.|||...|||+|+++
T Consensus 236 ~s~P~I~LLYrpGHYdilY~~ 256 (256)
T KOG3991|consen 236 ASAPEIYLLYRPGHYDILYKK 256 (256)
T ss_pred ccCceEEEEecCCccccccCC
Confidence 246789999999999999864
No 7
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.44 E-value=4.1e-14 Score=127.66 Aligned_cols=120 Identities=19% Similarity=0.222 Sum_probs=95.2
Q ss_pred EEEEEeCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHH-
Q 028455 5 IVRRVIPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILAD- 83 (208)
Q Consensus 5 l~~~~ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~- 83 (208)
+..+.|..||||+|||+++|++++.+.|..+|+.+++++..+++.| +-++.+++.+|++.++.++.||.+||++|+|.
T Consensus 218 ~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~-~~~vt~~~~~y~k~kr~~~~~gnhie~Qa~a~~ 296 (371)
T KOG2605|consen 218 FEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFY-EDYVTEDFTSYIKRKRADGEPGNHIEQQAAADI 296 (371)
T ss_pred hhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhccccc-ccccccchhhcccccccCCCCcchHHHhhhhhh
Confidence 4567899999999999999999999999999999999999999999 78889999999999999999999999999995
Q ss_pred --hhCceEEEEECCCCceeEeCCCCCCCCeEEEE-E---cCccceeeecC
Q 028455 84 --YYGREIAAYDIQTTRCDLYGQEKKYSERVMLI-Y---DGLHYDALAIS 127 (208)
Q Consensus 84 --~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~ll-Y---~G~HYD~l~~~ 127 (208)
....++.+....+..+..-.+.. ..++-.+ | .-.||+.++..
T Consensus 297 ~~~~~~~~~~~~~~~t~~~~~~~~~--~~~~~~~~~n~~~~~h~~~~~~~ 344 (371)
T KOG2605|consen 297 YEEIEKPLNITSFKDTCYIQTPPAI--EESVKMEKYNFWVEVHYNTARHS 344 (371)
T ss_pred hhhccccceeecccccceeccCccc--ccchhhhhhcccchhhhhhcccc
Confidence 45555555555555333332222 2222222 3 44699998875
No 8
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=99.03 E-value=1.1e-10 Score=101.34 Aligned_cols=115 Identities=16% Similarity=0.047 Sum_probs=88.6
Q ss_pred EEEEEeCCCCchhhHHHHHHhhcC-----CCchHHHHHHHHHHHhcChhcchhhhc-C------CCHHHHHHHhCCCCcc
Q 028455 5 IVRRVIPSDNSCLFNAVGYVMEHD-----KNKAPELRQVIAATVASDPVKYSEAFL-G------KSNQEYCSWIQDPEKW 72 (208)
Q Consensus 5 l~~~~ip~DGnCLFrAis~~l~~~-----~~~~~~lR~~va~~I~~np~~y~e~~l-~------~~~~eY~~~i~~~~~W 72 (208)
++--.++|||+|+|-+||++|.-. -+....+|-.=..|...+.+.| ..++ + .+|++|++.|+.+..|
T Consensus 171 i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f-~g~hfD~~t~~m~~~dt~~ne~~~~a~~ 249 (306)
T COG5539 171 IVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILF-TGIHFDEETLAMVLWDTYVNEVLFDASD 249 (306)
T ss_pred hhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhh-cccccchhhhhcchHHHHHhhhcccccc
Confidence 455678999999999999999632 2345777877778887777777 4443 1 3899999999999999
Q ss_pred cCHHHHHHHHHhhCceEEEEECCCCceeEeCCCCCCCCeEEEE--E-----cCccceee
Q 028455 73 GGAIELSILADYYGREIAAYDIQTTRCDLYGQEKKYSERVMLI--Y-----DGLHYDAL 124 (208)
Q Consensus 73 GG~iEL~als~~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~ll--Y-----~G~HYD~l 124 (208)
|+.||+++||..|++++++++.... +.+|++-. ..++.-+ | .| ||+.+
T Consensus 250 g~~~ei~qLas~lk~~~~~~nT~~~-~ik~n~c~--~~~~~e~~~~~Ha~a~G-H~n~~ 304 (306)
T COG5539 250 GITIEIQQLASLLKNPHYYTNTASP-SIKCNICG--TGFVGEKDYYAHALATG-HYNFG 304 (306)
T ss_pred cchHHHHHHHHHhcCceEEeecCCc-eEEeeccc--cccchhhHHHHHHHhhc-Ccccc
Confidence 9999999999999999999997776 66776643 2233211 2 56 99976
No 9
>PF05415 Peptidase_C36: Beet necrotic yellow vein furovirus-type papain-like endopeptidase; InterPro: IPR008746 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases corresponds to MEROPS peptidase family C36 (clan CA). The type example is beet necrotic yellow vein furovirus-type papain-like endopeptidase (beet necrotic yellow vein virus), which is involved in processing the viral polyprotein.
Probab=96.05 E-value=0.017 Score=42.37 Aligned_cols=78 Identities=13% Similarity=0.319 Sum_probs=53.8
Q ss_pred eCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCC--CCcccCHHHHHHHHHhhCc
Q 028455 10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQD--PEKWGGAIELSILADYYGR 87 (208)
Q Consensus 10 ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~--~~~WGG~iEL~als~~~~~ 87 (208)
+..|||||--|||.+|.-+.+ .+-+-|+.|.. +.+.||.|.++ |.+|-+- ..+|+.+++
T Consensus 3 ~sR~NNCLVVAis~~L~~T~e-------~l~~~M~An~~---------~i~~y~~W~r~~~~STW~DC---~mFA~~LkV 63 (104)
T PF05415_consen 3 ASRPNNCLVVAISECLGVTLE-------KLDNLMQANVS---------TIKKYHTWLRKKRPSTWDDC---RMFADALKV 63 (104)
T ss_pred ccCCCCeEeehHHHHhcchHH-------HHHHHHHhhHH---------HHHHHHHHHhcCCCCcHHHH---HHHHHhhee
Confidence 567999999999999985431 22334555433 36789999875 6899775 478999999
Q ss_pred eEEEEEC-CCC-ceeEeCCCC
Q 028455 88 EIAAYDI-QTT-RCDLYGQEK 106 (208)
Q Consensus 88 ~I~V~d~-~~~-~~~~fg~~~ 106 (208)
.|.+--. +++ ....|+++.
T Consensus 64 sm~vkV~~~~~~~l~~~~d~~ 84 (104)
T PF05415_consen 64 SMQVKVLSDKPYDLLYFVDGA 84 (104)
T ss_pred EEEEEEcCCCCceeeEeecCc
Confidence 9988543 333 344566554
No 10
>PF02148 zf-UBP: Zn-finger in ubiquitin-hydrolases and other protein; InterPro: IPR001607 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family.
The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties. Some of the proteins containing a UBP zinc finger include: Homo sapiens (Human) deubiquitinating enzyme 13 (UBPD); Human deubiquitinating enzyme 5 (UBP5); Dictyostelium discoideum (Slime mold) deubiquitinating enzyme A (UBPA); Saccharomyces cerevisiae (Baker's yeast) deubiquitinating enzyme 8 (UBP8); Yeast deubiquitinating enzyme 14 (UBP14). More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding; PDB: 3GV4_A 3PHD_B 3C5K_A 2UZG_A 3IHP_B 2G43_B 2G45_D 2I50_A 3MHH_A 3MHS_A ....
Probab=87.06 E-value=0.28 Score=33.26 Aligned_cols=30 Identities=27% Similarity=0.313 Sum_probs=22.2
Q ss_pred ceeeccccCCCcCCH---HHHHHHHHhhCCCcc
Q 028455 175 FTLRCGVCQIGVIGQ---KEAVEHAQATGHVNF 204 (208)
Q Consensus 175 ~~~~C~~c~~~~~g~---~~a~~ha~~tgH~~F 204 (208)
-...|+.||+.+-|. .-|.+|+++|||.=|
T Consensus 10 ~lw~CL~Cg~~~C~~~~~~Ha~~H~~~~~H~l~ 42 (63)
T PF02148_consen 10 NLWLCLTCGYVGCGRYSNGHALKHYKETGHPLA 42 (63)
T ss_dssp SEEEETTTS-EEETTTSTSHHHHHHHHHT--EE
T ss_pred ceEEeCCCCcccccCCcCcHHHHhhcccCCeEE
Confidence 345799999999985 679999999999633
No 11
>PF12874 zf-met: Zinc-finger of C2H2 type; PDB: 1ZU1_A 2KVG_A.
Probab=84.51 E-value=0.41 Score=26.01 Aligned_cols=25 Identities=16% Similarity=0.504 Sum_probs=21.2
Q ss_pred eeccccCCCcCCHHHHHHHHHhhCC
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQATGH 201 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~~tgH 201 (208)
..|..|++.+.++...+.|-+...|
T Consensus 1 ~~C~~C~~~f~s~~~~~~H~~s~~H 25 (25)
T PF12874_consen 1 FYCDICNKSFSSENSLRQHLRSKKH 25 (25)
T ss_dssp EEETTTTEEESSHHHHHHHHTTHHH
T ss_pred CCCCCCCCCcCCHHHHHHHHCcCCC
Confidence 3699999999999999999876543
No 12
>PF12756 zf-C2H2_2: C2H2 type zinc-finger (2 copies); PDB: 2DMI_A.
Probab=83.80 E-value=1.1 Score=31.77 Aligned_cols=30 Identities=20% Similarity=0.411 Sum_probs=26.5
Q ss_pred eeeccccCCCcCCHHHHHHHHHhhCCCccc
Q 028455 176 TLRCGVCQIGVIGQKEAVEHAQATGHVNFQ 205 (208)
Q Consensus 176 ~~~C~~c~~~~~g~~~a~~ha~~tgH~~F~ 205 (208)
..+|..|++.+.....-+.|-...+|....
T Consensus 50 ~~~C~~C~~~f~s~~~l~~Hm~~~~H~~~~ 79 (100)
T PF12756_consen 50 SFRCPYCNKTFRSREALQEHMRSKHHKKRN 79 (100)
T ss_dssp SEEBSSSS-EESSHHHHHHHHHHTTTTC-S
T ss_pred CCCCCccCCCCcCHHHHHHHHcCccCCCcc
Confidence 689999999999999999999999998874
No 13
>PF00096 zf-C2H2: Zinc finger, C2H2 type; InterPro: IPR007087 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger.
The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter []. This entry represents the classical C2H2 zinc finger domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding, 0005622 intracellular; PDB: 2D9H_A 2EPC_A 1SP1_A 1VA3_A 2WBT_B 2ELR_A 2YTP_A 2YTT_A 1VA1_A 2ELO_A ....
Probab=79.08 E-value=1.2 Score=23.48 Aligned_cols=21 Identities=14% Similarity=0.468 Sum_probs=18.8
Q ss_pred eccccCCCcCCHHHHHHHHHh
Q 028455 178 RCGVCQIGVIGQKEAVEHAQA 198 (208)
Q Consensus 178 ~C~~c~~~~~g~~~a~~ha~~ 198 (208)
+|.+|++.+.....-..|-+.
T Consensus 2 ~C~~C~~~f~~~~~l~~H~~~ 22 (23)
T PF00096_consen 2 KCPICGKSFSSKSNLKRHMRR 22 (23)
T ss_dssp EETTTTEEESSHHHHHHHHHH
T ss_pred CCCCCCCccCCHHHHHHHHhH
Confidence 799999999999999998764
No 14
>smart00290 ZnF_UBP Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger.
Probab=77.62 E-value=1.4 Score=28.00 Aligned_cols=29 Identities=31% Similarity=0.401 Sum_probs=22.2
Q ss_pred eeeccccCCCcCCH---HHHHHHHHhhCCCcc
Q 028455 176 TLRCGVCQIGVIGQ---KEAVEHAQATGHVNF 204 (208)
Q Consensus 176 ~~~C~~c~~~~~g~---~~a~~ha~~tgH~~F 204 (208)
.-.|+.|++..-|. .-+..|++.|||.=+
T Consensus 11 l~~CL~C~~~~c~~~~~~h~~~H~~~t~H~~~ 42 (50)
T smart00290 11 LWLCLTCGQVGCGRYQLGHALEHFEETGHPLV 42 (50)
T ss_pred eEEecCCCCcccCCCCCcHHHHHhhhhCCCEE
Confidence 44799998777643 459999999999643
No 15
>PF12171 zf-C2H2_jaz: Zinc-finger double-stranded RNA-binding; InterPro: IPR022755 This zinc finger is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus []. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation. This entry represents the multiple-adjacent-C2H2 zinc finger, JAZ. ; PDB: 4DGW_A 1ZR9_A.
Probab=77.23 E-value=0.45 Score=26.66 Aligned_cols=25 Identities=16% Similarity=0.438 Sum_probs=20.7
Q ss_pred eeccccCCCcCCHHHHHHHHHhhCC
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQATGH 201 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~~tgH 201 (208)
..|..|++.|.++.....|-++..|
T Consensus 2 ~~C~~C~k~f~~~~~~~~H~~sk~H 26 (27)
T PF12171_consen 2 FYCDACDKYFSSENQLKQHMKSKKH 26 (27)
T ss_dssp CBBTTTTBBBSSHHHHHCCTTSHHH
T ss_pred CCcccCCCCcCCHHHHHHHHccCCC
Confidence 3599999999999999988765544
No 16
>PF13894 zf-C2H2_4: C2H2-type zinc finger; PDB: 2ELX_A 2EPP_A 2DLK_A 1X6H_A 2EOU_A 2EMB_A 2GQJ_A 2CSH_A 2WBT_B 2ELM_A ....
Probab=72.78 E-value=2.8 Score=21.64 Aligned_cols=21 Identities=19% Similarity=0.517 Sum_probs=16.5
Q ss_pred eccccCCCcCCHHHHHHHHHh
Q 028455 178 RCGVCQIGVIGQKEAVEHAQA 198 (208)
Q Consensus 178 ~C~~c~~~~~g~~~a~~ha~~ 198 (208)
+|..|++.+....+-..|-..
T Consensus 2 ~C~~C~~~~~~~~~l~~H~~~ 22 (24)
T PF13894_consen 2 QCPICGKSFRSKSELRQHMRT 22 (24)
T ss_dssp E-SSTS-EESSHHHHHHHHHH
T ss_pred CCcCCCCcCCcHHHHHHHHHh
Confidence 699999999999999988653
No 17
>KOG0804 consensus Cytoplasmic Zn-finger protein BRAP2 (BRCA1 associated protein) [General function prediction only]
Probab=71.46 E-value=2.3 Score=39.87 Aligned_cols=25 Identities=32% Similarity=0.473 Sum_probs=20.9
Q ss_pred eccccCCCcCC---HHHHHHHHHhhCCC
Q 028455 178 RCGVCQIGVIG---QKEAVEHAQATGHV 202 (208)
Q Consensus 178 ~C~~c~~~~~g---~~~a~~ha~~tgH~ 202 (208)
.|.+||.+.-| +.-|++|++.|||+
T Consensus 242 icliCg~vgcgrY~eghA~rHweet~H~ 269 (493)
T KOG0804|consen 242 ICLICGNVGCGRYKEGHARRHWEETGHC 269 (493)
T ss_pred EEEEccceecccccchhHHHHHHhhcce
Confidence 67778777665 88999999999996
No 18
>smart00355 ZnF_C2H2 zinc finger.
Probab=71.24 E-value=3 Score=21.71 Aligned_cols=21 Identities=24% Similarity=0.403 Sum_probs=18.7
Q ss_pred eeccccCCCcCCHHHHHHHHH
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQ 197 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~ 197 (208)
.+|..|++.+.+...-..|-.
T Consensus 1 ~~C~~C~~~f~~~~~l~~H~~ 21 (26)
T smart00355 1 YRCPECGKVFKSKSALKEHMR 21 (26)
T ss_pred CCCCCCcchhCCHHHHHHHHH
Confidence 369999999999999999976
No 19
>PF05412 Peptidase_C33: Equine arterivirus Nsp2-type cysteine proteinase; InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing [].; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=67.77 E-value=3.6 Score=31.18 Aligned_cols=84 Identities=15% Similarity=0.255 Sum_probs=47.6
Q ss_pred eCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhCceE
Q 028455 10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREI 89 (208)
Q Consensus 10 ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~~~I 89 (208)
=|+||+|-.|.|+..+++-. . ..|.... -+.-+.+..|-++-.|.-+=..++.|.
T Consensus 4 PP~DG~CG~H~i~aI~n~m~---------------~--~~~t~~l--------~~~~r~~d~W~~dedl~~~iq~l~lPa 58 (108)
T PF05412_consen 4 PPGDGSCGWHCIAAIMNHMM---------------G--GEFTTPL--------PQRNRPSDDWADDEDLYQVIQSLRLPA 58 (108)
T ss_pred CCCCCchHHHHHHHHHHHhh---------------c--cCCCccc--------cccCCChHHccChHHHHHHHHHccCce
Confidence 38999999999998877421 1 1131111 112223456777766666656666666
Q ss_pred EEEECCCCceeEeCCCCCCCCeEEEEEcCccceeeecCC
Q 028455 90 AAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISP 128 (208)
Q Consensus 90 ~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~ 128 (208)
.+...... ..-+-++.-+|.|+.+-....
T Consensus 59 t~~~~~~C----------p~ArYv~~l~~qHW~V~~~~g 87 (108)
T PF05412_consen 59 TLDRNGAC----------PHARYVLKLDGQHWEVSVRKG 87 (108)
T ss_pred eccCCCCC----------CCCEEEEEecCceEEEEEcCC
Confidence 55432211 123334447888888766653
No 20
>PF13912 zf-C2H2_6: C2H2-type zinc finger; PDB: 1JN7_A 1FU9_A 2L1O_A 1NJQ_A 2EN8_A 2EMM_A 1FV5_A 1Y0J_B 2L6Z_B.
Probab=64.50 E-value=5.2 Score=21.86 Aligned_cols=21 Identities=19% Similarity=0.384 Sum_probs=18.5
Q ss_pred eeccccCCCcCCHHHHHHHHH
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQ 197 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~ 197 (208)
.+|..|++.|.....=.+|-+
T Consensus 2 ~~C~~C~~~F~~~~~l~~H~~ 22 (27)
T PF13912_consen 2 FECDECGKTFSSLSALREHKR 22 (27)
T ss_dssp EEETTTTEEESSHHHHHHHHC
T ss_pred CCCCccCCccCChhHHHHHhH
Confidence 589999999999999888863
No 21
>PF05379 Peptidase_C23: Carlavirus endopeptidase ; InterPro: IPR008041 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C23 (clan CA). The type example is Carlavirus (apple stem pitting virus) endopeptidase, which is thought to play a role in the post-translational cleavage of the high molecular weight primary translation products of the virus.; GO: 0003968 RNA-directed RNA polymerase activity, 0016817 hydrolase activity, acting on acid anhydrides
Probab=60.42 E-value=33 Score=24.99 Aligned_cols=16 Identities=19% Similarity=0.536 Sum_probs=14.0
Q ss_pred CchhhHHHHHHhhcCC
Q 028455 14 NSCLFNAVGYVMEHDK 29 (208)
Q Consensus 14 GnCLFrAis~~l~~~~ 29 (208)
|.|..||||.+|.+..
T Consensus 3 N~Cvi~AiA~aL~R~~ 18 (89)
T PF05379_consen 3 NGCVIRAIAEALGRRE 18 (89)
T ss_pred ccchhHHHHHHhCCCH
Confidence 7899999999998753
No 22
>smart00451 ZnF_U1 U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Probab=59.45 E-value=5.4 Score=23.08 Aligned_cols=25 Identities=16% Similarity=0.481 Sum_probs=20.7
Q ss_pred eeccccCCCcCCHHHHHHHHHhhCC
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQATGH 201 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~~tgH 201 (208)
..|..|++.|.+......|-..--|
T Consensus 4 ~~C~~C~~~~~~~~~~~~H~~gk~H 28 (35)
T smart00451 4 FYCKLCNVTFTDEISVEAHLKGKKH 28 (35)
T ss_pred eEccccCCccCCHHHHHHHHChHHH
Confidence 3599999999999999988766544
No 23
>PHA03082 DNA-dependent RNA polymerase subunit; Provisional
Probab=57.44 E-value=5.8 Score=26.85 Aligned_cols=19 Identities=21% Similarity=0.478 Sum_probs=16.0
Q ss_pred CceeeccccCCCcCCHHHH
Q 028455 174 NFTLRCGVCQIGVIGQKEA 192 (208)
Q Consensus 174 ~~~~~C~~c~~~~~g~~~a 192 (208)
.|.+.|+.||..+..++..
T Consensus 2 Vf~lVCsTCGrDlSeeRy~ 20 (63)
T PHA03082 2 VFQLVCSTCGRDLSEERYR 20 (63)
T ss_pred eeeeeecccCcchhHHHHH
Confidence 3778999999999888764
No 24
>PF05864 Chordopox_RPO7: Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7); InterPro: IPR008448 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. 
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyses the transcription of DNA into RNA [].; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent
Probab=57.13 E-value=6.1 Score=26.73 Aligned_cols=18 Identities=22% Similarity=0.545 Sum_probs=15.7
Q ss_pred ceeeccccCCCcCCHHHH
Q 028455 175 FTLRCGVCQIGVIGQKEA 192 (208)
Q Consensus 175 ~~~~C~~c~~~~~g~~~a 192 (208)
|.+.|+.||..+..++..
T Consensus 3 f~lvCSTCGrDlSeeRy~ 20 (63)
T PF05864_consen 3 FQLVCSTCGRDLSEERYR 20 (63)
T ss_pred eeeeecccCCcchHHHHH
Confidence 778999999999888764
No 25
>PRK10963 hypothetical protein; Provisional
Probab=54.03 E-value=10 Score=32.07 Aligned_cols=36 Identities=14% Similarity=0.294 Sum_probs=26.0
Q ss_pred HHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHH
Q 028455 37 QVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIEL 78 (208)
Q Consensus 37 ~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL 78 (208)
+.|++|+++|||.|.. ..+-+..|.-|...||.|-|
T Consensus 6 ~~V~~yL~~~PdFf~~------h~~Ll~~L~lph~~~gaVSL 41 (223)
T PRK10963 6 RAVVDYLLQNPDFFIR------NARLVEQMRVPHPVRGTVSL 41 (223)
T ss_pred HHHHHHHHHCchHHhh------CHHHHHhccCCCCCCCeecH
Confidence 5799999999998832 34556677777667776544
No 26
>PF13913 zf-C2HC_2: zinc-finger of a C2HC-type
Probab=53.18 E-value=11 Score=20.76 Aligned_cols=21 Identities=14% Similarity=0.420 Sum_probs=16.1
Q ss_pred eeeccccCCCcCCHHHHHHHHH
Q 028455 176 TLRCGVCQIGVIGQKEAVEHAQ 197 (208)
Q Consensus 176 ~~~C~~c~~~~~g~~~a~~ha~ 197 (208)
.+.|..||..| ++.....|..
T Consensus 2 l~~C~~CgR~F-~~~~l~~H~~ 22 (25)
T PF13913_consen 2 LVPCPICGRKF-NPDRLEKHEK 22 (25)
T ss_pred CCcCCCCCCEE-CHHHHHHHHH
Confidence 35799999999 6666777754
No 27
>COG4049 Uncharacterized protein containing archaeal-type C2H2 Zn-finger [General function prediction only]
Probab=53.03 E-value=6.9 Score=26.46 Aligned_cols=26 Identities=23% Similarity=0.429 Sum_probs=21.9
Q ss_pred CCceeeccccCCCcCCHHHHHHHHHh
Q 028455 173 ANFTLRCGVCQIGVIGQKEAVEHAQA 198 (208)
Q Consensus 173 ~~~~~~C~~c~~~~~g~~~a~~ha~~ 198 (208)
...-++|.-||.+|+.++.-..|-.+
T Consensus 14 GE~~lrCPRC~~~FR~~K~Y~RHVNK 39 (65)
T COG4049 14 GEEFLRCPRCGMVFRRRKDYIRHVNK 39 (65)
T ss_pred CceeeeCCchhHHHHHhHHHHHHhhH
Confidence 34457999999999999999999754
No 28
>PHA00616 hypothetical protein
Probab=52.66 E-value=7.6 Score=24.81 Aligned_cols=29 Identities=21% Similarity=0.272 Sum_probs=23.6
Q ss_pred eeccccCCCcCCHHHHHHHHH-hhCCCccc
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQ-ATGHVNFQ 205 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~-~tgH~~F~ 205 (208)
.+|..||++|.--.+-..|-. .|||..|.
T Consensus 2 YqC~~CG~~F~~~s~l~~H~r~~hg~~~~~ 31 (44)
T PHA00616 2 YQCLRCGGIFRKKKEVIEHLLSVHKQNKLT 31 (44)
T ss_pred CccchhhHHHhhHHHHHHHHHHhcCCCccc
Confidence 479999999999999999974 46666654
No 29
>PF05381 Peptidase_C21: Tymovirus endopeptidase; InterPro: IPR008043 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This entry is found in cysteine peptidases belonging to the MEROPS peptidase family C21 (tymovirus endopeptidase family, clan CA). The type example is tymovirus endopeptidase (turnip yellow mosaic virus). The noncapsid protein expressed from ORF-206 of turnip yellow mosaic virus (TYMV) is autocatalytically processed by a papain-like protease, producing N-terminal 150kDa and C-terminal 70kDa proteins.; GO: 0003968 RNA-directed RNA polymerase activity, 0016032 viral reproduction
Probab=43.41 E-value=1.4e+02 Score=22.53 Aligned_cols=89 Identities=19% Similarity=0.199 Sum_probs=51.3
Q ss_pred CCchhhHHHHHHhhcCCC-chHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhCceEEE
Q 028455 13 DNSCLFNAVGYVMEHDKN-KAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREIAA 91 (208)
Q Consensus 13 DGnCLFrAis~~l~~~~~-~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~~~I~V 91 (208)
..+||--|||.+..-+.+ -...|....=+=+..|++ ..-+|-+- -.+.|||-.|.....+
T Consensus 2 ~~~CLL~A~s~at~~~~~~LW~~L~~~lPDSlL~n~e---i~~~GLST----------------DhltaLa~~~~~~~~~ 62 (104)
T PF05381_consen 2 ALDCLLVAISQATSISPETLWATLCEILPDSLLDNPE---IRTLGLST----------------DHLTALAYRYHFQCTF 62 (104)
T ss_pred CcceeHHhhhhhhCCCHHHHHHHHHHhCchhhcCchh---hhhcCCcH----------------HHHHHHHHHHheEEEE
Confidence 478999999999875431 122233322233333333 11112211 2467999999999888
Q ss_pred EECCCCceeEeCCCCCCCCeEEEEE-cC--cccee
Q 028455 92 YDIQTTRCDLYGQEKKYSERVMLIY-DG--LHYDA 123 (208)
Q Consensus 92 ~d~~~~~~~~fg~~~~~~~~i~llY-~G--~HYD~ 123 (208)
.... .+..||-.+ ....+.+.+ +| .||..
T Consensus 63 hs~~--~~~~~Gi~~-as~~~~I~ht~G~p~HFs~ 94 (104)
T PF05381_consen 63 HSDH--GVLHYGIKD-ASTVFTITHTPGPPGHFSL 94 (104)
T ss_pred EcCC--ceEEeecCC-CceEEEEEeCCCCCCcccc
Confidence 8633 367888766 244444445 45 39998
No 30
>cd00729 rubredoxin_SM Rubredoxin, Small Modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=41.03 E-value=13 Score=22.18 Aligned_cols=14 Identities=29% Similarity=0.465 Sum_probs=11.3
Q ss_pred eeeccccCCCcCCH
Q 028455 176 TLRCGVCQIGVIGQ 189 (208)
Q Consensus 176 ~~~C~~c~~~~~g~ 189 (208)
.-+|.+||++..|.
T Consensus 2 ~~~C~~CG~i~~g~ 15 (34)
T cd00729 2 VWVCPVCGYIHEGE 15 (34)
T ss_pred eEECCCCCCEeECC
Confidence 35899999998775
No 31
>PF09237 GAGA: GAGA factor; InterPro: IPR015318 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Members of this entry bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. 
The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; PDB: 1YUI_A 1YUJ_A.
Probab=38.77 E-value=26 Score=23.26 Aligned_cols=22 Identities=14% Similarity=0.386 Sum_probs=16.9
Q ss_pred eccccCCCcCCHHHHHHHHHhh
Q 028455 178 RCGVCQIGVIGQKEAVEHAQAT 199 (208)
Q Consensus 178 ~C~~c~~~~~g~~~a~~ha~~t 199 (208)
.|.+|+..+.-++.-+.|-+.+
T Consensus 26 tCP~C~a~~~~srnLrRHle~~ 47 (54)
T PF09237_consen 26 TCPICGAVIRQSRNLRRHLEIR 47 (54)
T ss_dssp E-TTT--EESSHHHHHHHHHHH
T ss_pred CCCcchhhccchhhHHHHHHHH
Confidence 6999999999999999998754
No 32
>cd02669 Peptidase_C19M A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.
Probab=38.01 E-value=16 Score=33.90 Aligned_cols=32 Identities=22% Similarity=0.214 Sum_probs=23.2
Q ss_pred CCceeeccccCCCcC---CHHHHHHHHHhhCCCcc
Q 028455 173 ANFTLRCGVCQIGVI---GQKEAVEHAQATGHVNF 204 (208)
Q Consensus 173 ~~~~~~C~~c~~~~~---g~~~a~~ha~~tgH~~F 204 (208)
..-..-|.+||+.+. +..-|..|++.|||.=|
T Consensus 25 ~~n~~~CL~cg~~~~g~~~~~ha~~H~~~~~H~~~ 59 (440)
T cd02669 25 NLNVYACLVCGKYFQGRGKGSHAYTHSLEDNHHVF 59 (440)
T ss_pred CCcEEEEcccCCeecCCCCCcHHHHHhhccCCCEE
Confidence 333456999996655 34689999999999633
No 33
>PF04475 DUF555: Protein of unknown function (DUF555); InterPro: IPR007564 This is a family of uncharacterised, hypothetical archaeal proteins.
Probab=38.00 E-value=42 Score=25.22 Aligned_cols=38 Identities=11% Similarity=0.003 Sum_probs=29.8
Q ss_pred HHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcCC
Q 028455 151 PAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 188 (208)
Q Consensus 151 ~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~g 188 (208)
++--+.-|..+.|+..-.|-+...-.+.|..||..|..
T Consensus 22 AI~iAIseaGkrLn~~~~~VeIevG~~~cP~Cge~~~~ 59 (102)
T PF04475_consen 22 AIGIAISEAGKRLNPDLDYVEIEVGDTICPKCGEELDS 59 (102)
T ss_pred HHHHHHHHHHHhhCCCCCeEEEecCcccCCCCCCccCc
Confidence 33344457778889988899999999999999987754
No 34
>KOG1247 consensus Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]
Probab=37.98 E-value=19 Score=33.96 Aligned_cols=61 Identities=18% Similarity=0.284 Sum_probs=43.9
Q ss_pred EcCccceeeecCCCCCCCCCCCeeeee--CCCCCcchHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcCC
Q 028455 116 YDGLHYDALAISPFEGAPEEFDQTIFP--VQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 188 (208)
Q Consensus 116 Y~G~HYD~l~~~~~~~~~~~~d~~~f~--~~~~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~g 188 (208)
|+++||++.---.. |...|. +.+.+. ..++.+..+|..++|+.-.+-..|.|.+|++-|..
T Consensus 86 yh~ihk~vy~Wf~I-------dfD~fgrtTT~~qT-----~i~Q~iF~kl~~ng~~se~tv~qLyC~vc~~flad 148 (567)
T KOG1247|consen 86 YHGIHKVVYDWFKI-------DFDEFGRTTTKTQT-----EICQDIFSKLYDNGYLSEQTVKQLYCEVCDTFLAD 148 (567)
T ss_pred cchhHHHHHHhhcc-------cccccCcccCcchh-----HHHHHHhhchhhcCCcccceeeeEEehhhcccccc
Confidence 88999988754321 233554 333333 56778888889999999999999999999887763
No 35
>PF13465 zf-H2C2_2: Zinc-finger double domain; PDB: 2EN7_A 1TF6_A 1TF3_A 2ELT_A 2EOS_A 2EN2_A 2DMD_A 2WBS_A 2WBU_A 2EM5_A ....
Probab=36.45 E-value=17 Score=20.03 Aligned_cols=18 Identities=22% Similarity=0.412 Sum_probs=13.5
Q ss_pred cccCCceeeccccCCCcC
Q 028455 170 TDTANFTLRCGVCQIGVI 187 (208)
Q Consensus 170 t~t~~~~~~C~~c~~~~~ 187 (208)
+-+.....+|..|++.|.
T Consensus 8 ~H~~~k~~~C~~C~k~F~ 25 (26)
T PF13465_consen 8 THTGEKPYKCPYCGKSFS 25 (26)
T ss_dssp HHSSSSSEEESSSSEEES
T ss_pred hcCCCCCCCCCCCcCeeC
Confidence 345666789999998764
No 36
>COG3426 Butyrate kinase [Energy production and conversion]
Probab=36.29 E-value=53 Score=29.58 Aligned_cols=59 Identities=24% Similarity=0.432 Sum_probs=38.5
Q ss_pred hHHHHHHhhcCCCchHHHHHHHHHHHhcChhc--------chhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHh
Q 028455 18 FNAVGYVMEHDKNKAPELRQVIAATVASDPVK--------YSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADY 84 (208)
Q Consensus 18 FrAis~~l~~~~~~~~~lR~~va~~I~~np~~--------y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~ 84 (208)
|+|++||+.+ ++= .++..+..+||- |.+.|+. -..+|++||..--.+.|+.||.|||.=
T Consensus 275 ~~AmayQVaK------eIG-~~savL~G~vDaIvLTGGiA~~~~f~~-~I~~~v~~iapv~v~PGE~EleALA~G 341 (358)
T COG3426 275 YEAMAYQVAK------EIG-AMSAVLKGKVDAIVLTGGIAYEKLFVD-AIEDRVSWIAPVIVYPGEDELEALAEG 341 (358)
T ss_pred HHHHHHHHHH------HHH-hhhhhcCCCCCEEEEecchhhHHHHHH-HHHHHHhhhcceEecCCchHHHHHHhh
Confidence 5677777653 222 234456666662 2222222 467889999888899999999999863
No 37
>PHA02768 hypothetical protein; Provisional
Probab=36.09 E-value=32 Score=23.02 Aligned_cols=21 Identities=24% Similarity=0.502 Sum_probs=14.3
Q ss_pred eccccCCCcCCHHHHHHHHHh
Q 028455 178 RCGVCQIGVIGQKEAVEHAQA 198 (208)
Q Consensus 178 ~C~~c~~~~~g~~~a~~ha~~ 198 (208)
+|..||+.|.-...-..|-..
T Consensus 7 ~C~~CGK~Fs~~~~L~~H~r~ 27 (55)
T PHA02768 7 ECPICGEIYIKRKSMITHLRK 27 (55)
T ss_pred CcchhCCeeccHHHHHHHHHh
Confidence 677777777766666666554
No 38
>PF09082 DUF1922: Domain of unknown function (DUF1922); InterPro: IPR015166 Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown []. ; PDB: 1GH9_A.
Probab=35.91 E-value=13 Score=26.05 Aligned_cols=21 Identities=24% Similarity=0.477 Sum_probs=15.1
Q ss_pred CCCccccCCceeeccccCCCcC
Q 028455 166 KKTYTDTANFTLRCGVCQIGVI 187 (208)
Q Consensus 166 ~~~~t~t~~~~~~C~~c~~~~~ 187 (208)
+.-|.+-.+.+-+| +||+.++
T Consensus 10 r~lya~e~~kTkkC-~CG~~l~ 30 (68)
T PF09082_consen 10 RYLYAKEGAKTKKC-VCGKTLK 30 (68)
T ss_dssp --EEEETT-SEEEE-TTTEEEE
T ss_pred CEEEecCCcceeEe-cCCCeee
Confidence 34578888889999 9998765
No 39
>PF13909 zf-H2C2_5: C2H2-type zinc-finger domain; PDB: 1X5W_A.
Probab=35.21 E-value=41 Score=17.66 Aligned_cols=21 Identities=14% Similarity=0.412 Sum_probs=15.4
Q ss_pred eeccccCCCcCCHHHHHHHHHh
Q 028455 177 LRCGVCQIGVIGQKEAVEHAQA 198 (208)
Q Consensus 177 ~~C~~c~~~~~g~~~a~~ha~~ 198 (208)
.+|..|.+... ...-..|-+.
T Consensus 1 y~C~~C~y~t~-~~~l~~H~~~ 21 (24)
T PF13909_consen 1 YKCPHCSYSTS-KSNLKRHLKR 21 (24)
T ss_dssp EE-SSSS-EES-HHHHHHHHHH
T ss_pred CCCCCCCCcCC-HHHHHHHHHh
Confidence 37999999998 8888888653
No 40
>PF07368 DUF1487: Protein of unknown function (DUF1487); InterPro: IPR009961 This family consists of several uncharacterised proteins from Drosophila melanogaster. The function of this family is unknown.
Probab=34.16 E-value=64 Score=27.51 Aligned_cols=66 Identities=18% Similarity=0.286 Sum_probs=37.2
Q ss_pred cccCHHHH-HHHHHhhCceEEEEE---CCCCceeEeCCCCCCCCeEEEEEcCccceeeecCCCCCCCCCCCeeeeeCCC
Q 028455 71 KWGGAIEL-SILADYYGREIAAYD---IQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQK 145 (208)
Q Consensus 71 ~WGG~iEL-~als~~~~~~I~V~d---~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~~~~~~~~~d~~~f~~~~ 145 (208)
.|...++- --++..+++++.-++ +.-..+..+-. ..+...++.+|.||++|.... ..-+-|||..+
T Consensus 144 iW~ekla~~Yel~~~l~~~~f~iNC~~V~L~PI~~~~~---~~~~~v~i~~gyHYE~l~~~~------~~k~IVFP~~~ 213 (215)
T PF07368_consen 144 IWNEKLASAYELAARLPCDTFYINCFNVDLSPIMPFFA---ARKNDVLIANGYHYETLTIGG------KRKIIVFPIGT 213 (215)
T ss_pred EeCcHHHHHHHHHHhCCCCEEEEEeccCCchhhhhhhh---cCCceEEEECCeeEEEEEECC------eEEEEEEeccc
Confidence 57665542 234455555555544 22222333221 235667778999999998863 23457787654
No 41
>COG3357 Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon [Transcription]
Probab=33.80 E-value=49 Score=24.53 Aligned_cols=37 Identities=19% Similarity=0.186 Sum_probs=24.4
Q ss_pred hHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcCC
Q 028455 150 GPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG 188 (208)
Q Consensus 150 ~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~g 188 (208)
..+....+.+|+.|+.+++---..- -+|..||+-|+.
T Consensus 34 ~~v~~~L~hiak~lkr~g~~Llv~P--a~CkkCGfef~~ 70 (97)
T COG3357 34 KEVYDHLEHIAKSLKRKGKRLLVRP--ARCKKCGFEFRD 70 (97)
T ss_pred HHHHHHHHHHHHHHHhCCceEEecC--hhhcccCccccc
Confidence 3455666677777788876322211 179999999886
No 42
>cd01675 RNR_III Class III ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). The class III enzyme from phage T4 consists of two subunits, this model covers the larger subunit w
Probab=33.12 E-value=57 Score=31.49 Aligned_cols=36 Identities=22% Similarity=0.104 Sum_probs=22.0
Q ss_pred hHHHHHHHHHHHHHhhCCC-ccccCCceeeccccCCCcCCH
Q 028455 150 GPAEDLALKLVKEQQRKKT-YTDTANFTLRCGVCQIGVIGQ 189 (208)
Q Consensus 150 ~~~~~~a~~l~~~~~~~~~-~t~t~~~~~~C~~c~~~~~g~ 189 (208)
+++++..+..++ +...| +++|..+ +|.+||+...|+
T Consensus 495 ~al~~lv~~a~~--~~~~y~~~~~p~~--~C~~CG~~~~~~ 531 (555)
T cd01675 495 EALEALVKKAAK--RGVIYFGINTPID--ICNDCGYIGEGE 531 (555)
T ss_pred HHHHHHHHHHHH--cCCceEEEecCCc--cCCCCCCCCcCC
Confidence 334444444433 33455 7788777 999999976543
No 43
>PF04877 Hairpins: HrpZ; InterPro: IPR006961 HrpZ (harpin elicitor) from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants []. The entry also represents hairpinN which is a virulence determinant which elicits lesion formation in Arabidopsis and tobacco and triggers systemic resistance in Arabidopsis [].
Probab=32.98 E-value=36 Score=30.42 Aligned_cols=50 Identities=10% Similarity=0.156 Sum_probs=30.2
Q ss_pred chHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhh
Q 028455 31 KAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYY 85 (208)
Q Consensus 31 ~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~ 85 (208)
.-..|=++|++||-.||+.|-.| +...|.+++. ++..=..-|...+-+.+
T Consensus 162 ~D~~lL~eIaqFMD~nPe~FgkP----d~~sW~~eLk-eD~~L~~~E~~~F~~Al 211 (308)
T PF04877_consen 162 EDMPLLKEIAQFMDQNPEQFGKP----DRKSWADELK-EDNGLDKAETEQFQKAL 211 (308)
T ss_pred ccHHHHHHHHHHHhcCHhhcCCC----CCchHHHHhh-cCCCCCHHHHHHHHHHH
Confidence 34567889999999999999444 1222444452 44443444555554444
No 44
>TIGR02934 nifT_nitrog probable nitrogen fixation protein FixT. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein.
Probab=31.33 E-value=4.3 Score=28.28 Aligned_cols=44 Identities=16% Similarity=0.112 Sum_probs=28.4
Q ss_pred hcChhc-chhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhCceEEEEECCCCceeEeC
Q 028455 44 ASDPVK-YSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREIAAYDIQTTRCDLYG 103 (208)
Q Consensus 44 ~~np~~-y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~~~I~V~d~~~~~~~~fg 103 (208)
++|.+- ++..+--++.++=+-.+.+++.||| ++.+.+|....+.
T Consensus 6 R~~~~g~l~~YvpKKDLEE~Vv~~e~~~~WGG----------------~v~L~NGw~l~lp 50 (67)
T TIGR02934 6 RRNRAGELSAYVPKKDLEEVIVSVEKEELWGG----------------WVTLANGWRLELP 50 (67)
T ss_pred EeCCCCCEEEEEECCcchhheeeeecCccccC----------------EEEECCccEEEeC
Confidence 444443 4223334678888888889999999 5566677554443
No 45
>PRK09784 hypothetical protein; Provisional
Probab=31.16 E-value=25 Score=30.86 Aligned_cols=20 Identities=20% Similarity=0.376 Sum_probs=16.4
Q ss_pred EEEEEeCCCCchhhHHHHHH
Q 028455 5 IVRRVIPSDNSCLFNAVGYV 24 (208)
Q Consensus 5 l~~~~ip~DGnCLFrAis~~ 24 (208)
|+--+|.|||-||.|||--.
T Consensus 200 lkyapvdgdgycllrailvl 219 (417)
T PRK09784 200 LKYAPVDGDGYCLLRAILVL 219 (417)
T ss_pred ceecccCCCchhHHHHHHHh
Confidence 55678999999999999543
No 46
>PRK06266 transcription initiation factor E subunit alpha; Validated
Probab=31.04 E-value=41 Score=27.59 Aligned_cols=49 Identities=8% Similarity=0.135 Sum_probs=35.3
Q ss_pred eeeeeCCCCCcchHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcC
Q 028455 138 QTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVI 187 (208)
Q Consensus 138 ~~~f~~~~~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~ 187 (208)
.-.|..+.+.+.+.++....++.+.|+.+-.+.... .-..|..|+..+.
T Consensus 80 ~y~w~l~~~~i~d~ik~~~~~~~~klk~~l~~e~~~-~~Y~Cp~C~~ryt 128 (178)
T PRK06266 80 TYTWKPELEKLPEIIKKKKMEELKKLKEQLEEEENN-MFFFCPNCHIRFT 128 (178)
T ss_pred EEEEEeCHHHHHHHHHHHHHHHHHHHHHHhhhccCC-CEEECCCCCcEEe
Confidence 457777777777888888888888888876665444 3367877776654
No 47
>PHA00732 hypothetical protein
Probab=30.76 E-value=44 Score=23.74 Aligned_cols=10 Identities=30% Similarity=0.766 Sum_probs=5.3
Q ss_pred eccccCCCcC
Q 028455 178 RCGVCQIGVI 187 (208)
Q Consensus 178 ~C~~c~~~~~ 187 (208)
+|..||+.+.
T Consensus 29 ~C~~CgKsF~ 38 (79)
T PHA00732 29 KCPVCNKSYR 38 (79)
T ss_pred ccCCCCCEeC
Confidence 5555555554
No 48
>PF05148 Methyltransf_8: Hypothetical methyltransferase; InterPro: IPR007823 This family consists of uncharacterised eukaryotic proteins which are related to S-adenosyl-L-methionine-dependent methyltransferases.; GO: 0008168 methyltransferase activity; PDB: 2ZFU_B.
Probab=30.23 E-value=1.3e+02 Score=25.79 Aligned_cols=72 Identities=13% Similarity=0.278 Sum_probs=41.7
Q ss_pred hhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhc-----------CCCHHHHHHHhCC-CCcc------cCHHH
Q 028455 16 CLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFL-----------GKSNQEYCSWIQD-PEKW------GGAIE 77 (208)
Q Consensus 16 CLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l-----------~~~~~eY~~~i~~-~~~W------GG~iE 77 (208)
-.||-|-.+||.+.. ....+.++++|+.| +.+. ..|++.+++|+++ |..| .|+-.
T Consensus 13 srFR~lNE~LYT~~s------~~A~~lf~~dP~~F-~~YH~Gfr~Qv~~WP~nPvd~iI~~l~~~~~~~viaD~GCGdA~ 85 (219)
T PF05148_consen 13 SRFRWLNEQLYTTSS------EEALKLFQEDPELF-DIYHEGFRQQVKKWPVNPVDVIIEWLKKRPKSLVIADFGCGDAK 85 (219)
T ss_dssp HHHHHHHHHHHHS-H------HHHHHHHHH-HHHH-HHHHHHHHHHHCTSSS-HHHHHHHHHCTS-TTS-EEEES-TT-H
T ss_pred CchHHHHHhHhcCCH------HHHHHHHHhCHHHH-HHHHHHHHHHHhcCCCCcHHHHHHHHHhcCCCEEEEECCCchHH
Confidence 369999999996542 23456788999988 4332 3589999999985 4556 23333
Q ss_pred HHHHHHhh--CceEEEEECCCC
Q 028455 78 LSILADYY--GREIAAYDIQTT 97 (208)
Q Consensus 78 L~als~~~--~~~I~V~d~~~~ 97 (208)
||+.. +..|+.+|..+.
T Consensus 86 ---la~~~~~~~~V~SfDLva~ 104 (219)
T PF05148_consen 86 ---LAKAVPNKHKVHSFDLVAP 104 (219)
T ss_dssp ---HHHH--S---EEEEESS-S
T ss_pred ---HHHhcccCceEEEeeccCC
Confidence 33444 345788886553
No 49
>COG2051 RPS27A Ribosomal protein S27E [Translation, ribosomal structure and biogenesis]
Probab=28.98 E-value=29 Score=24.14 Aligned_cols=15 Identities=20% Similarity=0.505 Sum_probs=12.0
Q ss_pred CCceeeccccCCCcC
Q 028455 173 ANFTLRCGVCQIGVI 187 (208)
Q Consensus 173 ~~~~~~C~~c~~~~~ 187 (208)
+++.++|..||..|.
T Consensus 35 ast~V~C~~CG~~l~ 49 (67)
T COG2051 35 ASTVVTCLICGTTLA 49 (67)
T ss_pred CceEEEecccccEEE
Confidence 456778999999886
No 50
>PF13240 zinc_ribbon_2: zinc-ribbon domain
Probab=28.95 E-value=33 Score=18.58 Aligned_cols=14 Identities=36% Similarity=0.776 Sum_probs=8.6
Q ss_pred cccCCceeeccccCCCc
Q 028455 170 TDTANFTLRCGVCQIGV 186 (208)
Q Consensus 170 t~t~~~~~~C~~c~~~~ 186 (208)
.+.+.| |..||..|
T Consensus 10 ~~~~~f---C~~CG~~l 23 (23)
T PF13240_consen 10 EDDAKF---CPNCGTPL 23 (23)
T ss_pred CCcCcc---hhhhCCcC
Confidence 355556 77777654
No 51
>PF07967 zf-C3HC: C3HC zinc finger-like ; InterPro: IPR012935 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) and other proteins. NIPA is thought to perform an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signalling events []. 
The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. The Schizosaccharomyces pombe protein containing this domain (O94506 from SWISSPROT) is involved in mRNA export from the nucleus []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding, 0005634 nucleus
Probab=28.80 E-value=30 Score=26.65 Aligned_cols=23 Identities=13% Similarity=0.330 Sum_probs=20.4
Q ss_pred CCccccCCceeeccccCCCcCCH
Q 028455 167 KTYTDTANFTLRCGVCQIGVIGQ 189 (208)
Q Consensus 167 ~~~t~t~~~~~~C~~c~~~~~g~ 189 (208)
+.++++...+|+|..||..+.-.
T Consensus 34 ~GW~~~~~d~l~C~~C~~~l~~~ 56 (133)
T PF07967_consen 34 RGWICVSKDMLKCESCGARLCVK 56 (133)
T ss_pred cCCCcCCCCEEEeCCCCCEEEEe
Confidence 88999999999999999887654
No 52
>cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, a small electron transfer protein, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=28.25 E-value=22 Score=20.82 Aligned_cols=14 Identities=29% Similarity=0.506 Sum_probs=10.9
Q ss_pred eeccccCCCcCCHH
Q 028455 177 LRCGVCQIGVIGQK 190 (208)
Q Consensus 177 ~~C~~c~~~~~g~~ 190 (208)
-+|.+||++..+..
T Consensus 2 ~~C~~CGy~y~~~~ 15 (33)
T cd00350 2 YVCPVCGYIYDGEE 15 (33)
T ss_pred EECCCCCCEECCCc
Confidence 47999999877653
No 53
>PF04959 ARS2: Arsenite-resistance protein 2; InterPro: IPR007042 This entry represents Arsenite-resistance protein 2 (also known as Serrate RNA effector molecule homolog), which is thought to play a role in arsenite resistance, although it does not directly confer arsenite resistance but rather modulates arsenic sensitivity. Arsenite is a carcinogenic compound which can act as a comutagen by inhibiting DNA repair. It is also involved in cell cycle progression at S phase.; PDB: 3AX1_A.
Probab=27.83 E-value=45 Score=28.34 Aligned_cols=28 Identities=18% Similarity=0.266 Sum_probs=20.8
Q ss_pred cccCCceeeccccCCCcCCHHHHHHHHH
Q 028455 170 TDTANFTLRCGVCQIGVIGQKEAVEHAQ 197 (208)
Q Consensus 170 t~t~~~~~~C~~c~~~~~g~~~a~~ha~ 197 (208)
+....-.-+|..|+|.|+|..-+.+|-.
T Consensus 71 ~e~~~~K~~C~lc~KlFkg~eFV~KHI~ 98 (214)
T PF04959_consen 71 KEEDEDKWRCPLCGKLFKGPEFVRKHIF 98 (214)
T ss_dssp -SSSSEEEEE-SSS-EESSHHHHHHHHH
T ss_pred HHHcCCEECCCCCCcccCChHHHHHHHh
Confidence 3446667799999999999999999954
No 54
>KOG1790 consensus 60s ribosomal protein L34 [Translation, ribosomal structure and biogenesis]
Probab=26.84 E-value=23 Score=27.41 Aligned_cols=25 Identities=20% Similarity=0.436 Sum_probs=19.0
Q ss_pred CCccccCCceeeccccCCCcCCHHH
Q 028455 167 KTYTDTANFTLRCGVCQIGVIGQKE 191 (208)
Q Consensus 167 ~~~t~t~~~~~~C~~c~~~~~g~~~ 191 (208)
++|+...+...+|.+|+..|.|-..
T Consensus 32 ~q~~kK~~~~pkc~~c~~~l~Gi~~ 56 (121)
T KOG1790|consen 32 YQYVKKKAKLPKCGDCGMRLQGIPA 56 (121)
T ss_pred hHhhHhhccCCCCCcCCcccCCCCC
Confidence 4566666677789999999998543
No 55
>smart00238 BIR Baculoviral inhibition of apoptosis protein repeat. Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes.
Probab=26.76 E-value=1.2e+02 Score=20.18 Aligned_cols=40 Identities=20% Similarity=0.263 Sum_probs=28.1
Q ss_pred hhCCCccccCCceeeccccCCCcC----CHHHHHHHHHhhCCCcc
Q 028455 164 QRKKTYTDTANFTLRCGVCQIGVI----GQKEAVEHAQATGHVNF 204 (208)
Q Consensus 164 ~~~~~~t~t~~~~~~C~~c~~~~~----g~~~a~~ha~~tgH~~F 204 (208)
+..=|||.+ .-.++|-.|+..+. ++.-.++|+...-...|
T Consensus 25 ~~Gfyy~~~-~d~v~C~~C~~~l~~w~~~d~p~~~H~~~~p~C~f 68 (71)
T smart00238 25 EAGFYYTGV-GDEVKCFFCGGELDNWEPGDDPWEEHKKWSPNCPF 68 (71)
T ss_pred HcCCeECCC-CCEEEeCCCCCCcCCCCCCCCHHHHHhHhCcCCcC
Confidence 455677766 44599999999885 45557778776665555
No 56
>PF13451 zf-trcl: Probable zinc-binding domain
Probab=26.65 E-value=54 Score=21.38 Aligned_cols=27 Identities=19% Similarity=0.277 Sum_probs=19.8
Q ss_pred CceeeccccCCCcCCHHHHHHHHHhhC
Q 028455 174 NFTLRCGVCQIGVIGQKEAVEHAQATG 200 (208)
Q Consensus 174 ~~~~~C~~c~~~~~g~~~a~~ha~~tg 200 (208)
...|+|-+||..+.=....|+...+-|
T Consensus 2 Dk~l~C~dCg~~FvfTa~EQ~fy~eKg 28 (49)
T PF13451_consen 2 DKTLTCKDCGAEFVFTAGEQKFYAEKG 28 (49)
T ss_pred CeeEEcccCCCeEEEehhHHHHHHhcC
Confidence 357899999999986666666665544
No 57
>COG2174 RPL34A Ribosomal protein L34E [Translation, ribosomal structure and biogenesis]
Probab=26.35 E-value=30 Score=25.56 Aligned_cols=14 Identities=21% Similarity=0.558 Sum_probs=11.6
Q ss_pred eeccccCCCcCCHH
Q 028455 177 LRCGVCQIGVIGQK 190 (208)
Q Consensus 177 ~~C~~c~~~~~g~~ 190 (208)
-+|.+||..|.|..
T Consensus 35 p~C~~cg~pL~Gi~ 48 (93)
T COG2174 35 PKCAICGRPLGGIP 48 (93)
T ss_pred CcccccCCccCCcc
Confidence 37999999999853
No 58
>COG5134 Uncharacterized conserved protein [Function unknown]
Probab=25.13 E-value=46 Score=28.53 Aligned_cols=32 Identities=22% Similarity=0.397 Sum_probs=22.2
Q ss_pred HHHHHHHHHHHHhhCCCc----cccCCceeeccccC
Q 028455 152 AEDLALKLVKEQQRKKTY----TDTANFTLRCGVCQ 183 (208)
Q Consensus 152 ~~~~a~~l~~~~~~~~~~----t~t~~~~~~C~~c~ 183 (208)
|..++..-+++|+.++.- .-.+-|+++|..|+
T Consensus 14 AqpL~~~~~~KlK~arprglSiRL~TPF~~RCL~C~ 49 (272)
T COG5134 14 AQPLAKRKFDKLKNARPRGLSIRLETPFPVRCLNCE 49 (272)
T ss_pred cchhHHHHHHHhcccCcccceEEeccCcceeecchh
Confidence 345666777777777653 34567899999995
No 59
>cd00022 BIR Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger.
Probab=25.11 E-value=1.3e+02 Score=19.90 Aligned_cols=40 Identities=18% Similarity=0.285 Sum_probs=27.5
Q ss_pred hhCCCccccCCceeeccccCCCcCC----HHHHHHHHHhhCCCcc
Q 028455 164 QRKKTYTDTANFTLRCGVCQIGVIG----QKEAVEHAQATGHVNF 204 (208)
Q Consensus 164 ~~~~~~t~t~~~~~~C~~c~~~~~g----~~~a~~ha~~tgH~~F 204 (208)
+..=||+.. .-.++|--|+..+.+ +.-.++|....-+..|
T Consensus 23 ~~Gfyy~~~-~d~v~C~~C~~~~~~w~~~d~p~~~H~~~~p~C~f 66 (69)
T cd00022 23 EAGFYYTGR-GDEVKCFFCGLELKNWEPGDDPWEEHKRWSPNCPF 66 (69)
T ss_pred HcCCeEcCC-CCEEEeCCCCCCccCCCCCCCHHHHHhHhCcCCcC
Confidence 455667665 456999999988864 5556778776655554
No 60
>PRK13731 conjugal transfer surface exclusion protein TraT; Provisional
Probab=24.92 E-value=2e+02 Score=25.06 Aligned_cols=45 Identities=18% Similarity=0.228 Sum_probs=30.8
Q ss_pred Ceeee----eCCCCCcchHHHHHHHHHHHHHhhCCCcc----ccCCceeeccc--cCCC
Q 028455 137 DQTIF----PVQKGRTIGPAEDLALKLVKEQQRKKTYT----DTANFTLRCGV--CQIG 185 (208)
Q Consensus 137 d~~~f----~~~~~~~~~~~~~~a~~l~~~~~~~~~~t----~t~~~~~~C~~--c~~~ 185 (208)
++||| +++|.. +..+..++.+.|+.++|-- +.+.+.|.-++ |+|.
T Consensus 50 ~ktVyv~vrNTSd~~----~~~l~~~i~~~L~~kGY~iv~~P~~A~Y~lQaNVL~~~K~ 104 (243)
T PRK13731 50 ERTVFLQIKNTSDKD----MSGLQGKIADAVKAKGYQVVTSPDKAYYWIQANVLKADKM 104 (243)
T ss_pred CceEEEEEeeCCCcc----hHHHHHHHHHHHHhCCeEEecChhhceeeeeeeehhcccC
Confidence 46777 455633 3346678888889999843 56777788877 7776
No 61
>TIGR00373 conserved hypothetical protein TIGR00373. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain.
Probab=24.11 E-value=59 Score=26.02 Aligned_cols=49 Identities=6% Similarity=0.116 Sum_probs=31.4
Q ss_pred eeeeeCCCCCcchHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcC
Q 028455 138 QTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVI 187 (208)
Q Consensus 138 ~~~f~~~~~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~ 187 (208)
+..|-++.+++.+.++....++.+.|++.-.+.... .-..|..|+..+.
T Consensus 72 ~Y~w~i~~~~i~d~Ik~~~~~~~~~lk~~l~~e~~~-~~Y~Cp~c~~r~t 120 (158)
T TIGR00373 72 EYTWRINYEKALDVLKRKLEETAKKLREKLEFETNN-MFFICPNMCVRFT 120 (158)
T ss_pred EEEEEeCHHHHHHHHHHHHHHHHHHHHHHHhhccCC-CeEECCCCCcEee
Confidence 456656666666777888788888777765543333 3356877775544
No 62
>PF05413 Peptidase_C34: Putative closterovirus papain-like endopeptidase; InterPro: IPR008744 RNA-directed RNA polymerase (RdRp) (2.7.7.48 from EC) is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage [, ]. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear. The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists in the addition of a nucleotide tri-phosphate (NTP) to the 3'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product []. All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb []. Only the catalytic palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implied in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA []. The domain organisation [] and the 3D structure of the catalytic centre of a wide range of RdPp's, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues. 
There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage: Viruses containing positive-strand RNA or double-strand RNA, except retroviruses and Birnaviridae: viral RNA-directed RNA polymerases including all positive-strand RNA viruses with no DNA stage, double-strand RNA viruses, and the Cystoviridae, Reoviridae, Hypoviridae, Partitiviridae, Totiviridae families. Mononegavirales (negative-strand RNA viruses with non-segmented genomes). Negative-strand RNA viruses with segmented genomes, i.e. Orthomyxoviruses (including influenza A, B, and C viruses, Thogotoviruses, and the infectious salmon anemia virus), Arenaviruses, Bunyaviruses, Hantaviruses, Nairoviruses, Phleboviruses, Tenuiviruses and Tospoviruses. Birnaviridae family of dsRNA viruses. The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups: All positive-strand RNA eukaryotic viruses with no DNA stage. All RNA-containing bacteriophages -there are two families of RNA-containing bacteriophages: Leviviridae (positive ssRNA phages) and Cystoviridae (dsRNA phages). Reoviridae family of dsRNA viruses. This signature is found in the RNA-direct RNA polymerase of apple chlorotic leaf spot virus and cherry mottle virus.; GO: 0003723 RNA binding, 0003968 RNA-directed RNA polymerase activity, 0005524 ATP binding, 0019079 viral genome replication
Probab=23.23 E-value=67 Score=23.27 Aligned_cols=89 Identities=13% Similarity=0.197 Sum_probs=46.8
Q ss_pred EEEeCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhC
Q 028455 7 RRVIPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYG 86 (208)
Q Consensus 7 ~~~ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~ 86 (208)
++.|.|--.|||-|+|..+...++ +.|...|... + +-|++.|. .--.+.++++.|.
T Consensus 2 ~kFikGk~DClf~s~a~~I~Kkpe----------evm~~~phvl---------d---RCisNkGC--sidD~k~iC~~YE 57 (92)
T PF05413_consen 2 VKFIKGKYDCLFVSVAEIIHKKPE----------EVMMFLPHVL---------D---RCISNKGC--SIDDLKAICEKYE 57 (92)
T ss_pred cceeccccccHHHHHHHHHhcCHH----------HHHHhChHHH---------H---HHHhcCCC--CHHHHHHHHhhcE
Confidence 567888899999999887664432 1122222222 1 12222221 2224678888888
Q ss_pred ceEEEEECCCCceeEeCCCCCCCCeEEEEEcCcccee
Q 028455 87 REIAAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDA 123 (208)
Q Consensus 87 ~~I~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~ 123 (208)
+.|.+-- +-| ...-|.-. -+--.++..|+||..
T Consensus 58 iKveceG-DCG-lvE~Gs~G--l~~Gr~~LRGNHF~v 90 (92)
T PF05413_consen 58 IKVECEG-DCG-LVECGSIG--LPLGRMLLRGNHFSV 90 (92)
T ss_pred EeeEecC-ccc-eEEecCcc--Cchhheeecccceee
Confidence 7765521 122 33334322 111135578899864
No 63
>PRK03922 hypothetical protein; Provisional
Probab=21.76 E-value=1e+02 Score=23.56 Aligned_cols=38 Identities=13% Similarity=-0.017 Sum_probs=27.5
Q ss_pred hHHH-HHHHHHHHHHhh-CCCccccCCceeeccccCCCcC
Q 028455 150 GPAE-DLALKLVKEQQR-KKTYTDTANFTLRCGVCQIGVI 187 (208)
Q Consensus 150 ~~~~-~~a~~l~~~~~~-~~~~t~t~~~~~~C~~c~~~~~ 187 (208)
|.|+ -+.-|..+.|+. .-.|-+..--...|..||..|.
T Consensus 21 dDAI~iAIseaGkrLn~~~l~yVeievG~~~cP~cge~~~ 60 (113)
T PRK03922 21 DDAIGVAISEAGKRLNPEDLDYVEVEVGLTICPKCGEPFD 60 (113)
T ss_pred HHHHHHHHHHHHhhcCcccCCeEEEecCcccCCCCCCcCC
Confidence 4333 334466666777 6778888888899999998765
No 64
>PF15412 Nse4-Nse3_bdg: Binding domain of Nse4/EID3 to Nse3-MAGE
Probab=21.75 E-value=31 Score=22.77 Aligned_cols=36 Identities=19% Similarity=0.154 Sum_probs=28.2
Q ss_pred CCcchHHHHHHHHHHHHHhhCCCccccCCceeeccc
Q 028455 146 GRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGV 181 (208)
Q Consensus 146 ~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~ 181 (208)
.+.|-.+.++|.+-|+.|+-..-..|+..|.-+|-.
T Consensus 2 S~~Lv~aSdla~~ka~~lk~~~~~fd~deFv~~l~~ 37 (56)
T PF15412_consen 2 SRLLVLASDLAAEKARNLKFGGSGFDVDEFVSKLKT 37 (56)
T ss_pred cHHHHHHHHHHHHHHHHhccCCCccCHHHHHHHHHH
Confidence 345567788888888999999888899999666644
No 65
>PF06107 DUF951: Bacterial protein of unknown function (DUF951); InterPro: IPR009296 This family consists of several short hypothetical bacterial proteins of unknown function.
Probab=21.61 E-value=48 Score=22.37 Aligned_cols=15 Identities=20% Similarity=0.616 Sum_probs=12.0
Q ss_pred CCceeeccccCCCcC
Q 028455 173 ANFTLRCGVCQIGVI 187 (208)
Q Consensus 173 ~~~~~~C~~c~~~~~ 187 (208)
+.|.|||..||..+-
T Consensus 28 aDikikC~gCg~~im 42 (57)
T PF06107_consen 28 ADIKIKCLGCGRQIM 42 (57)
T ss_pred CcEEEEECCCCCEEE
Confidence 568899999997654
No 66
>PF13717 zinc_ribbon_4: zinc-ribbon domain
Probab=21.30 E-value=58 Score=19.49 Aligned_cols=14 Identities=21% Similarity=0.446 Sum_probs=10.9
Q ss_pred CCceeeccccCCCc
Q 028455 173 ANFTLRCGVCQIGV 186 (208)
Q Consensus 173 ~~~~~~C~~c~~~~ 186 (208)
.+..++|..||..+
T Consensus 22 ~g~~v~C~~C~~~f 35 (36)
T PF13717_consen 22 KGRKVRCSKCGHVF 35 (36)
T ss_pred CCcEEECCCCCCEe
Confidence 45578999999765
No 67
>PLN02748 tRNA dimethylallyltransferase
Probab=21.05 E-value=51 Score=31.27 Aligned_cols=26 Identities=31% Similarity=0.496 Sum_probs=23.2
Q ss_pred eeccccCC-CcCCHHHHHHHHHhhCCC
Q 028455 177 LRCGVCQI-GVIGQKEAVEHAQATGHV 202 (208)
Q Consensus 177 ~~C~~c~~-~~~g~~~a~~ha~~tgH~ 202 (208)
..|.+|++ .++|+.+=+.|-+...|-
T Consensus 419 ~~Ce~C~~~~~~G~~eW~~Hlksr~Hk 445 (468)
T PLN02748 419 YVCEACGNKVLRGAHEWEQHKQGRGHR 445 (468)
T ss_pred ccccCCCCcccCCHHHHHHHhcchHHH
Confidence 36999998 899999999999998884
No 68
>PF03884 DUF329: Domain of unknown function (DUF329); InterPro: IPR005584 The biological function of these short proteins is unknown, but they contain four conserved cysteines, suggesting that they all bind zinc. YacG (Q5X8H6 from SWISSPROT) from Escherichia coli has been shown to bind zinc and contains the structural motifs typical of zinc-binding proteins []. The conserved four cysteine motif in these proteins (-C-X(2)-C-X(15)-C-X(3)-C-) is not found in other zinc-binding proteins with known structures.; GO: 0008270 zinc ion binding; PDB: 1LV3_A.
Probab=21.01 E-value=46 Score=22.37 Aligned_cols=13 Identities=31% Similarity=0.746 Sum_probs=7.1
Q ss_pred ceeeccccCCCcC
Q 028455 175 FTLRCGVCQIGVI 187 (208)
Q Consensus 175 ~~~~C~~c~~~~~ 187 (208)
|+.+|.+||+...
T Consensus 1 m~v~CP~C~k~~~ 13 (57)
T PF03884_consen 1 MTVKCPICGKPVE 13 (57)
T ss_dssp -EEE-TTT--EEE
T ss_pred CcccCCCCCCeec
Confidence 5789999998754
No 69
>PF08209 Sgf11: Sgf11 (transcriptional regulation protein); InterPro: IPR013246 The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae (Baker's yeast). The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation [].; PDB: 3M99_B 2LO2_A 3MHH_C 3MHS_C.
Probab=20.91 E-value=60 Score=19.35 Aligned_cols=19 Identities=21% Similarity=0.237 Sum_probs=14.4
Q ss_pred ceeeccccCCCcCCHHHHH
Q 028455 175 FTLRCGVCQIGVIGQKEAV 193 (208)
Q Consensus 175 ~~~~C~~c~~~~~g~~~a~ 193 (208)
....|..|+..+...+-|+
T Consensus 3 ~~~~C~nC~R~v~a~RfA~ 21 (33)
T PF08209_consen 3 PYVECPNCGRPVAASRFAP 21 (33)
T ss_dssp -EEE-TTTSSEEEGGGHHH
T ss_pred CeEECCCCcCCcchhhhHH
Confidence 4568999999999888775
No 70
>PRK05452 anaerobic nitric oxide reductase flavorubredoxin; Provisional
Probab=20.62 E-value=1.1e+02 Score=28.98 Aligned_cols=53 Identities=11% Similarity=0.130 Sum_probs=31.8
Q ss_pred eeeeCCCCCcchHHHHHHHHHHHHHhhC--CCccc--------------cCCceeeccccCCCcCCHHHH
Q 028455 139 TIFPVQKGRTIGPAEDLALKLVKEQQRK--KTYTD--------------TANFTLRCGVCQIGVIGQKEA 192 (208)
Q Consensus 139 ~~f~~~~~~~~~~~~~~a~~l~~~~~~~--~~~t~--------------t~~~~~~C~~c~~~~~g~~~a 192 (208)
..|..++ +.++.+.+.+++|++.++.+ +|.|- ......+|..||++-..+..-
T Consensus 373 ~~~~P~e-e~~~~~~~~g~~la~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c~~c~~~yd~~~g~ 441 (479)
T PRK05452 373 AKWRPDQ-DALELCREHGREIARQWALAPLPQSTVNTVVKEETSATTTADLGPRMQCSVCQWIYDPAKGE 441 (479)
T ss_pred EEecCCH-HHHHHHHHHHHHHHHHHhhCCccccccccccccccccccccCCCCeEEECCCCeEECCCCCC
Confidence 3444444 44588888888888766622 11111 123445999999987765443
No 71
>PF00653 BIR: Inhibitor of Apoptosis domain; InterPro: IPR001370 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. The baculovirus inhibitor of apoptosis protein repeat (BIR) is a domain of tandem repeats separated by a variable length linker that seems to confer cell death-preventing activity [, ]. The BIR domains characterise the Inhibitor of Apoptosis (IAP) family of proteins (MEROPS proteinase inhibitor family I32, clan IV) that suppress apoptosis by interacting with and inhibiting the enzymatic activity of both initiator and effector caspases (MEROPS peptidase family C14, IPR002398 from INTERPRO). Several distinct mammalian IAPs including XIAP, c-IAP1, c-IAP2, and ML-IAP, have been identified, and they all exhibit antiapoptotic activity in cell culture. The functional unit in each IAP protein is the baculoviral IAP repeat (BIR), which contains approximately 80 amino acids folded around a zinc atom. Most mammalian IAPs have more than one BIR domain, with the different BIR domains performing distinct functions. 
For example, in XIAP, the third BIR domain (BIR3) potently inhibits the catalytic activity of caspase-9, whereas the linker sequences immediately preceding the second BIR domain (BIR2) selectively targets caspase-3 or -7. The first-recognised members of family MEROPS inhibitor family I32 were viral proteins that inhibited the apoptosis of infected cells: Cp-IAP from Cydia pomonella granulosis virus (CpGV) [] and Op-IAP from Orgyia pseudotsugata multicapsid polyhedrosis virus(OpMNPV) []. The discovery of homologous proteins in mammals followed soon after with the recognition that mutations in the gene for neuronal apoptosis inhibitory protein (NIAP) underlie spinal muscular atrophy []. The inhibitors in family I32 all possess one or more 80-residue domains known as BIR (baculovirus inhibitor repeat) domains and have accordingly been termed 'BIR-containing' or 'BIRC' proteins as well as IAP proteins. The mechanism of inhibition of caspases by the IAP proteins is complex, and reactive site residues cannot yet be identified with any confidence. Despite the conservation of the BIR or IAP (inhibitor of apoptosis) domains throughout the family it seems clear that other parts of the molecules also make essential contributions to inhibitory activity. Homologs of most components in the mammalian apoptotic pathway have been identified in fruit flies. The Drosophila Apaf-1, known as Dapaf-1, HAC-1 or Dark, shares significant sequence similarity with its mammalian counterpart, and is critically important for the activation of the Drosophila initiator caspase Dronc. Dronc, in turn, cleaves and activates the effector caspase DrICE. The Drosophila IAP, DIAP1, binds to and in-activates both DrICE and Dronc through its BIR1 and BIR2 domains. During apoptosis, the anti-death function of DIAP1 is countered by at least four pro-apoptotic proteins, Reaper, Hid, Grim, and sickle, through direct physical interactions. 
These four proteins represent the functional homologs of the mammalian protein Smac, and they all share a conserved IAP-binding motif at their N termini. The three proteins Reaper, Hid, and Grim are collectively referred to as the RHG proteins. Both XIAP and DIAP1 contain a RING domain at their C termini, and can act as E3 ubiquitin ligases. Indeed, both XIAP and DIAP1 have been shown to promote self-ubiquitination and degradation as well as to negatively regulate the target caspases. Nonetheless, important differences exist between XIAP and DIAP1. The primary function of XIAP is thought to be the inhibition of the catalytic activities of caspases; to what extent the ubiquitinating activity of XIAP contributes to its function remains unclear. For DIAP1, however, the ubiquitinating activity appears to be essential for its function. Recently a Drosophila p53 protein has been identified that mediates apoptosis via a novel pathway involving the activation of the Reaper gene and subsequent inhibition of the inhibitors of apoptosis (IAPs). CIAP1, a major mammalian homologue of Drosophila IAPs, is irreversibly inhibited (cleaved) during p53-dependent apoptosis and this cleavage is mediated by a serine protease. Serine protease inhibitors that block CIAP1 cleavage inhibit p53-dependent apoptosis. Furthermore, activation of the p53 protein increases the transcription of the HTRA2 gene, which encodes a serine protease that interacts with CIAP1 and potentiates apoptosis. Therefore mammalian p53 protein activates apoptosis through a novel pathway functionally similar to that in Drosophila, which involves HTRA2 and subsequent inhibition of CIAP1 by cleavage.; GO: 0005622 intracellular; PDB: 3HL5_B 3UW5_A 3CM7_A 1G3F_A 1G73_C 3G76_G 3CM2_C 2VSL_A 2OPZ_B 3CLX_A ....
Probab=20.28 E-value=1.2e+02 Score=20.50 Aligned_cols=39 Identities=21% Similarity=0.269 Sum_probs=26.2
Q ss_pred hhCCCccccCCceeeccccCCCcC----CHHHHHHHHHhhCCCc
Q 028455 164 QRKKTYTDTANFTLRCGVCQIGVI----GQKEAVEHAQATGHVN 203 (208)
Q Consensus 164 ~~~~~~t~t~~~~~~C~~c~~~~~----g~~~a~~ha~~tgH~~ 203 (208)
++.=|||.+ ...++|-.||..+. ++.-.++|.+..-...
T Consensus 25 ~aGFyy~~~-~d~v~C~~C~~~l~~w~~~Ddp~~~H~~~sp~C~ 67 (70)
T PF00653_consen 25 RAGFYYTGT-GDRVRCFYCGLELDNWEPNDDPWEEHKRHSPNCP 67 (70)
T ss_dssp HTTEEEESS-TTEEEETTTTEEEES-STT--HHHHHHHHSTTBH
T ss_pred HCCCEEcCC-CCEEEEeccCCEEeCCCCCCCHHHHHHHHCcCCe
Confidence 455667766 78899999999884 4455677877554443
No 72
>PF08782 c-SKI_SMAD_bind: c-SKI Smad4 binding domain; InterPro: IPR014890 c-SKI is an oncoprotein that inhibits TGF-beta signalling through interaction with Smad proteins. This protein binds to Smad4.; GO: 0005634 nucleus; PDB: 1MR1_C.
Probab=20.16 E-value=38 Score=25.22 Aligned_cols=25 Identities=20% Similarity=0.235 Sum_probs=13.4
Q ss_pred CCccccCCceeeccccCCCcCCHHH
Q 028455 167 KTYTDTANFTLRCGVCQIGVIGQKE 191 (208)
Q Consensus 167 ~~~t~t~~~~~~C~~c~~~~~g~~~ 191 (208)
..|+...+.-|+|.+|+..|.-++=
T Consensus 19 ~lY~~~~a~CI~C~~C~~~FsP~kF 43 (96)
T PF08782_consen 19 ELYSSPNAKCIECLECRGMFSPQKF 43 (96)
T ss_dssp GG--STT---EEETTT--EE-HHHH
T ss_pred hhcCCCCCCceEcccCCCEeCCcCE
Confidence 3588888999999999988876653
No 73
>PF14300 DUF4375: Domain of unknown function (DUF4375); PDB: 3VJZ_A.
Probab=20.12 E-value=55 Score=24.74 Aligned_cols=18 Identities=28% Similarity=0.549 Sum_probs=15.0
Q ss_pred HHHHHHHHHHHhcChhcc
Q 028455 33 PELRQVIAATVASDPVKY 50 (208)
Q Consensus 33 ~~lR~~va~~I~~np~~y 50 (208)
..+=..++.||++||+.|
T Consensus 106 e~~~~l~~~Yv~~h~~~F 123 (123)
T PF14300_consen 106 EDLTELLARYVREHPEKF 123 (123)
T ss_dssp HHHHHHHHHHHHHTHHHH
T ss_pred cHHHHHHHHHHHHCHhhC
Confidence 466778899999999976
No 74
>PF10571 UPF0547: Uncharacterised protein family UPF0547; InterPro: IPR018886 This domain may well be a type of zinc-finger as it carries two pairs of highly conserved cysteine residues though with no accompanying histidines. Several members are annotated as putative helicases.
Probab=20.06 E-value=51 Score=18.48 Aligned_cols=12 Identities=17% Similarity=0.293 Sum_probs=8.8
Q ss_pred eeeccccCCCcC
Q 028455 176 TLRCGVCQIGVI 187 (208)
Q Consensus 176 ~~~C~~c~~~~~ 187 (208)
..+|..||+.|.
T Consensus 14 ~~~Cp~CG~~F~ 25 (26)
T PF10571_consen 14 AKFCPHCGYDFE 25 (26)
T ss_pred cCcCCCCCCCCc
Confidence 346999998874
Done!