Query 027645
Match_columns 220
No_of_seqs 140 out of 156
Neff 3.6
Searched_HMMs 46136
Date Fri Mar 29 13:03:04 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/027645.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/027645hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF00397 WW: WW domain; Inter 99.1 1E-10 2.2E-15 73.6 3.1 31 63-93 1-31 (31)
2 smart00456 WW Domain with 2 co 98.9 1.6E-09 3.5E-14 67.1 3.4 32 63-95 1-32 (32)
3 cd00201 WW Two conserved trypt 98.7 8.5E-09 1.8E-13 62.9 3.3 31 64-95 1-31 (31)
4 KOG3259 Peptidyl-prolyl cis-tr 98.3 2.3E-07 5E-12 78.6 1.9 37 60-96 4-40 (163)
5 KOG1891 Proline binding protei 95.3 0.013 2.9E-07 53.2 2.9 37 59-96 90-126 (271)
6 COG5104 PRP40 Splicing factor 94.8 0.012 2.5E-07 57.9 1.0 32 65-97 15-46 (590)
7 KOG3209 WW domain-containing p 94.2 0.031 6.7E-07 57.6 2.7 35 61-96 221-255 (984)
8 KOG1891 Proline binding protei 93.1 0.057 1.2E-06 49.2 2.1 35 61-96 127-161 (271)
9 KOG0940 Ubiquitin protein liga 90.7 0.1 2.2E-06 49.3 1.0 31 66-97 117-147 (358)
10 PRK08351 DNA-directed RNA poly 88.9 0.18 4E-06 37.0 0.9 25 174-202 1-25 (61)
11 smart00391 MBD Methyl-CpG bind 87.0 0.57 1.2E-05 35.2 2.6 28 59-87 4-38 (77)
12 cd00122 MBD MeCP2, MBD1, MBD2, 84.8 1 2.2E-05 32.1 2.8 26 59-84 2-33 (62)
13 KOG0940 Ubiquitin protein liga 82.5 1.2 2.5E-05 42.3 3.1 46 52-97 49-98 (358)
14 PRK06393 rpoE DNA-directed RNA 81.7 0.55 1.2E-05 34.9 0.4 23 175-201 4-26 (64)
15 COG5104 PRP40 Splicing factor 80.7 0.76 1.6E-05 45.6 1.1 30 66-96 57-86 (590)
16 cd01397 HAT_MBD Methyl-CpG bin 80.1 1.5 3.2E-05 33.2 2.3 24 59-82 2-31 (73)
17 PF01429 MBD: Methyl-CpG bindi 77.8 2.2 4.7E-05 31.6 2.6 30 59-89 7-43 (77)
18 PRK00398 rpoP DNA-directed RNA 77.6 1.7 3.7E-05 29.1 1.8 31 174-204 1-33 (46)
19 KOG3552 FERM domain protein FR 61.0 2.3 4.9E-05 45.7 -0.6 34 61-95 18-51 (1298)
20 PRK13130 H/ACA RNA-protein com 59.8 8.4 0.00018 27.9 2.3 25 174-202 3-27 (56)
21 COG3357 Predicted transcriptio 57.4 4.6 9.9E-05 32.4 0.7 31 172-202 54-86 (97)
22 TIGR02098 MJ0042_CXXC MJ0042 f 56.9 6.9 0.00015 24.8 1.3 25 177-201 3-34 (38)
23 COG2260 Predicted Zn-ribbon RN 55.8 9.8 0.00021 28.1 2.1 24 175-202 4-27 (59)
24 PF09538 FYDLN_acid: Protein o 54.7 9.8 0.00021 30.5 2.2 28 177-205 10-39 (108)
25 PF12172 DUF35_N: Rubredoxin-l 53.3 9.9 0.00022 24.2 1.6 27 172-200 7-33 (37)
26 TIGR00100 hypA hydrogenase nic 51.6 12 0.00026 29.8 2.2 30 173-202 67-96 (115)
27 smart00661 RPOL9 RNA polymeras 51.4 9.2 0.0002 25.4 1.3 28 179-206 3-34 (52)
28 KOG0155 Transcription factor C 51.3 8.9 0.00019 38.9 1.7 35 61-96 109-144 (617)
29 PRK12380 hydrogenase nickel in 48.7 14 0.0003 29.4 2.2 30 173-202 67-96 (113)
30 PRK03681 hypA hydrogenase nick 48.3 15 0.00032 29.3 2.3 30 173-202 67-97 (114)
31 TIGR00155 pqiA_fam integral me 47.1 9.8 0.00021 36.4 1.3 26 176-202 215-240 (403)
32 cd01396 MeCP2_MBD MeCP2, MBD1, 46.5 18 0.0004 27.1 2.4 25 61-85 5-35 (77)
33 PF13719 zinc_ribbon_5: zinc-r 46.0 7.7 0.00017 25.2 0.3 24 177-200 3-33 (37)
34 PF10571 UPF0547: Uncharacteri 45.1 11 0.00024 23.2 0.9 21 179-201 3-23 (26)
35 PF10058 DUF2296: Predicted in 43.8 17 0.00036 25.8 1.7 32 170-201 16-53 (54)
36 cd04482 RPA2_OBF_like RPA2_OBF 43.3 12 0.00026 28.4 1.0 10 190-199 82-91 (91)
37 PF12065 DUF3545: Protein of u 42.8 11 0.00023 27.9 0.6 14 20-33 19-32 (59)
38 PF10164 DUF2367: Uncharacteri 42.6 23 0.00049 28.5 2.5 16 172-187 45-60 (98)
39 COG1096 Predicted RNA-binding 42.6 16 0.00035 32.3 1.8 32 170-201 143-174 (188)
40 COG1675 TFA1 Transcription ini 42.4 8.6 0.00019 33.4 0.1 32 175-206 112-146 (176)
41 PF03682 UPF0158: Uncharacteri 42.4 16 0.00035 30.8 1.8 17 66-82 21-37 (163)
42 PF07295 DUF1451: Protein of u 41.5 24 0.00051 29.7 2.6 30 172-201 108-139 (146)
43 PF01155 HypA: Hydrogenase exp 40.9 9.8 0.00021 30.1 0.2 30 173-202 67-96 (113)
44 cd07973 Spt4 Transcription elo 39.9 14 0.0003 29.3 0.9 26 177-202 4-31 (98)
45 PRK15103 paraquat-inducible me 39.4 15 0.00033 35.4 1.2 25 176-202 221-245 (419)
46 KOG0155 Transcription factor C 38.3 18 0.00039 36.8 1.6 32 64-96 11-42 (617)
47 PF10122 Mu-like_Com: Mu-like 37.8 14 0.00031 26.5 0.6 29 177-205 5-37 (51)
48 PRK00420 hypothetical protein; 37.5 15 0.00033 29.8 0.9 32 172-203 19-51 (112)
49 KOG4286 Dystrophin-like protei 37.5 11 0.00024 39.9 -0.0 31 61-94 350-380 (966)
50 COG4416 Com Mu-like prophage p 35.9 15 0.00032 27.1 0.4 11 192-202 24-34 (60)
51 PRK03824 hypA hydrogenase nick 35.3 36 0.00078 27.8 2.7 29 174-202 68-117 (135)
52 TIGR00373 conserved hypothetic 35.0 18 0.00038 30.3 0.9 29 175-203 108-139 (158)
53 COG2093 DNA-directed RNA polym 34.0 14 0.0003 27.7 0.1 26 174-201 2-27 (64)
54 KOG4334 Uncharacterized conser 33.7 33 0.00071 35.0 2.6 42 54-96 146-187 (650)
55 PF09003 Phage_integ_N: Bacter 33.3 23 0.00049 27.1 1.1 28 61-88 10-39 (75)
56 PRK05978 hypothetical protein; 33.1 32 0.0007 29.1 2.1 35 175-209 32-72 (148)
57 smart00659 RPOLCX RNA polymera 32.7 35 0.00077 23.2 1.9 27 176-202 2-29 (44)
58 PRK00564 hypA hydrogenase nick 32.3 25 0.00054 28.1 1.3 30 173-202 68-98 (117)
59 cd00350 rubredoxin_like Rubred 31.8 31 0.00067 21.7 1.4 23 178-200 3-25 (33)
60 KOG3209 WW domain-containing p 31.2 42 0.0009 35.7 2.9 77 15-96 215-301 (984)
61 PRK06266 transcription initiat 30.7 27 0.00058 29.9 1.3 30 175-204 116-148 (178)
62 COG5242 TFB4 RNA polymerase II 30.3 42 0.00091 31.2 2.5 27 175-203 259-285 (296)
63 cd04476 RPA1_DBD_C RPA1_DBD_C: 29.9 35 0.00075 27.8 1.7 30 173-202 31-61 (166)
64 PF14369 zf-RING_3: zinc-finge 29.6 47 0.001 21.5 2.0 21 179-199 5-28 (35)
65 COG0846 SIR2 NAD-dependent pro 29.4 38 0.00082 30.7 2.0 29 172-200 118-154 (250)
66 PF06943 zf-LSD1: LSD1 zinc fi 28.7 41 0.00088 20.9 1.5 23 179-201 1-25 (25)
67 PF11238 DUF3039: Protein of u 28.0 30 0.00065 25.5 0.9 30 172-201 24-53 (58)
68 KOG0150 Spliceosomal protein F 27.6 32 0.0007 32.9 1.3 22 75-96 160-181 (336)
69 PRK11032 hypothetical protein; 26.6 56 0.0012 28.1 2.5 30 172-201 120-151 (160)
70 KOG2846 Predicted membrane pro 26.1 32 0.0007 32.8 1.0 38 170-207 214-257 (328)
71 COG1867 TRM1 N2,N2-dimethylgua 26.0 36 0.00079 33.0 1.4 30 173-202 237-267 (380)
72 PF13248 zf-ribbon_3: zinc-rib 25.7 26 0.00057 21.0 0.3 20 178-199 4-23 (26)
73 smart00564 PQQ beta-propeller 25.4 51 0.0011 19.3 1.5 18 74-91 14-31 (33)
74 TIGR01384 TFS_arch transcripti 25.3 28 0.00062 26.4 0.4 27 178-204 2-28 (104)
75 PF02150 RNA_POL_M_15KD: RNA p 25.0 20 0.00043 23.3 -0.4 28 178-205 3-33 (35)
76 PF15232 DUF4585: Domain of un 25.0 48 0.001 25.6 1.6 26 70-95 11-38 (75)
77 PF08274 PhnA_Zn_Ribbon: PhnA 25.0 29 0.00063 22.2 0.4 11 192-202 2-12 (30)
78 TIGR02300 FYDLN_acid conserved 24.9 46 0.001 28.0 1.6 29 177-205 10-39 (129)
79 COG2995 PqiA Uncharacterized p 24.2 46 0.00099 32.8 1.7 31 172-202 14-48 (418)
80 PF11023 DUF2614: Protein of u 23.7 41 0.00089 27.8 1.1 32 174-206 67-99 (114)
81 COG0375 HybF Zn finger protein 22.6 43 0.00094 27.4 1.0 31 173-203 67-97 (115)
82 PF14435 SUKH-4: SUKH-4 immuni 22.2 59 0.0013 26.5 1.8 21 66-86 86-106 (179)
83 COG2995 PqiA Uncharacterized p 21.6 40 0.00087 33.2 0.8 25 175-200 219-243 (418)
84 PF15163 Meiosis_expr: Meiosis 21.4 44 0.00095 25.9 0.8 56 24-84 3-62 (77)
85 PF06677 Auto_anti-p27: Sjogre 21.1 53 0.0012 22.2 1.1 28 172-199 13-41 (41)
86 PF03604 DNA_RNApol_7kD: DNA d 20.3 49 0.0011 21.3 0.7 23 179-201 3-26 (32)
87 PF13717 zinc_ribbon_4: zinc-r 20.3 51 0.0011 21.3 0.8 12 178-189 4-15 (36)
88 PF10891 DUF2719: Protein of u 20.2 40 0.00088 26.3 0.4 11 194-204 24-34 (81)
No 1
>PF00397 WW: WW domain; InterPro: IPR001202 Synonym(s): Rsp5 or WWP domain The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes. A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organisation; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein, amongst others; GO: 0005515 protein binding; PDB: 2JXW_A 2DK1_A 2JOC_A 2JO9_A 1YIU_A 1O6W_A 2JMF_A 1TK7_A 2KYK_A 2L5F_A ....
Probab=99.07 E-value=1e-10 Score=73.62 Aligned_cols=31 Identities=35% Similarity=0.624 Sum_probs=29.3
Q ss_pred CCccceeceecCcccEEEEecCCCcccCCCC
Q 027645 63 LPLEWERCLDIQSGEIHFYNTRTHKKTSGDP 93 (220)
Q Consensus 63 LPsgWEq~LDlqSG~iYY~N~~T~~stw~dP 93 (220)
||.||+.+.|..+|++||+|+.||+++|++|
T Consensus 1 LP~gW~~~~~~~~g~~YY~N~~t~~s~W~~P 31 (31)
T PF00397_consen 1 LPPGWEEYFDPDSGRPYYYNHETGESQWERP 31 (31)
T ss_dssp SSTTEEEEEETTTSEEEEEETTTTEEESSST
T ss_pred CCcCCEEEEcCCCCCEEEEeCCCCCEEeCCC
Confidence 8999999999668999999999999999987
No 2
>smart00456 WW Domain with 2 conserved Trp (W) residues. Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.
Probab=98.88 E-value=1.6e-09 Score=67.10 Aligned_cols=32 Identities=34% Similarity=0.591 Sum_probs=30.1
Q ss_pred CCccceeceecCcccEEEEecCCCcccCCCCCC
Q 027645 63 LPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRG 95 (220)
Q Consensus 63 LPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~ 95 (220)
||.||+++.|.+ |++||+|+.|++++|++|+.
T Consensus 1 lp~gW~~~~~~~-g~~yy~n~~t~~s~W~~P~~ 32 (32)
T smart00456 1 LPPGWEERKDPD-GRPYYYNHETKETQWEKPRE 32 (32)
T ss_pred CCCCCEEEECCC-CCEEEEECCCCCEEcCCCCC
Confidence 799999999988 99999999999999999973
No 3
>cd00201 WW Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.
Probab=98.75 E-value=8.5e-09 Score=62.91 Aligned_cols=31 Identities=35% Similarity=0.687 Sum_probs=29.0
Q ss_pred CccceeceecCcccEEEEecCCCcccCCCCCC
Q 027645 64 PLEWERCLDIQSGEIHFYNTRTHKKTSGDPRG 95 (220)
Q Consensus 64 PsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~ 95 (220)
|.||+.+.|.. |++||+|+.|++++|++|+.
T Consensus 1 p~~W~~~~~~~-g~~yy~n~~t~~s~W~~P~~ 31 (31)
T cd00201 1 PPGWEERWDPD-GRVYYYNHNTKETQWEDPRE 31 (31)
T ss_pred CCCCEEEECCC-CCEEEEECCCCCEeCCCCCC
Confidence 78999999988 99999999999999999973
No 4
>KOG3259 consensus Peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]
Probab=98.32 E-value=2.3e-07 Score=78.59 Aligned_cols=37 Identities=32% Similarity=0.615 Sum_probs=34.6
Q ss_pred CCCCCccceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 60 ETPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 60 ~~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
+..||.+||++.+..+|+.||||+.|++++|+.|...
T Consensus 4 ~~~LP~~Wekr~Srs~gr~YyfN~~T~~SqWe~P~~t 40 (163)
T KOG3259|consen 4 EEKLPPGWEKRMSRSSGRPYYFNTETNESQWERPSGT 40 (163)
T ss_pred cccCCchhheeccccCCCcceeccccchhhccCCCcc
Confidence 3679999999999999999999999999999999875
No 5
>KOG1891 consensus Proline binding protein WW45 [General function prediction only]
Probab=95.31 E-value=0.013 Score=53.17 Aligned_cols=37 Identities=16% Similarity=0.321 Sum_probs=32.3
Q ss_pred cCCCCCccceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 59 LETPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 59 l~~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
-+.|||.||---.++. ||-||++|++.++-|.+|-+.
T Consensus 90 edlPLPpgWav~~T~~-grkYYIDHn~~tTHW~HPler 126 (271)
T KOG1891|consen 90 EDLPLPPGWAVEFTTE-GRKYYIDHNNRTTHWVHPLER 126 (271)
T ss_pred ccCCCCCCcceeeEec-CceeEeecCCCcccccChhhh
Confidence 4589999998777764 999999999999999999764
No 6
>COG5104 PRP40 Splicing factor [RNA processing and modification]
Probab=94.75 E-value=0.012 Score=57.93 Aligned_cols=32 Identities=34% Similarity=0.561 Sum_probs=26.8
Q ss_pred ccceeceecCcccEEEEecCCCcccCCCCCCCC
Q 027645 65 LEWERCLDIQSGEIHFYNTRTHKKTSGDPRGSP 97 (220)
Q Consensus 65 sgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~~ 97 (220)
+.|+.--+ .+|+|||+|..||+++|+.|.+..
T Consensus 15 s~w~e~k~-~dgRiYYYN~~T~kS~weKPkell 46 (590)
T COG5104 15 SEWEELKA-PDGRIYYYNKRTGKSSWEKPKELL 46 (590)
T ss_pred HHHHHhhC-CCCceEEEecccccccccChHHHh
Confidence 46887666 479999999999999999997653
No 7
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=94.23 E-value=0.031 Score=57.65 Aligned_cols=35 Identities=34% Similarity=0.633 Sum_probs=31.5
Q ss_pred CCCCccceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 61 TPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 61 ~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
.|||..||.... ..|++||++|.|+.++|.|||..
T Consensus 221 gplp~nwemayt-e~gevyfiDhntkttswLdprl~ 255 (984)
T KOG3209|consen 221 GPLPHNWEMAYT-EQGEVYFIDHNTKTTSWLDPRLT 255 (984)
T ss_pred CCCCccceEeEe-ecCeeEeeecccccceecChhhh
Confidence 569999999876 46999999999999999999964
No 8
>KOG1891 consensus Proline binding protein WW45 [General function prediction only]
Probab=93.10 E-value=0.057 Score=49.20 Aligned_cols=35 Identities=23% Similarity=0.481 Sum_probs=31.6
Q ss_pred CCCCccceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 61 TPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 61 ~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
-.||.||++..|..-| +||+|+.++++++++|+..
T Consensus 127 EgLppGW~rv~s~e~G-tyY~~~~~k~tQy~HPc~~ 161 (271)
T KOG1891|consen 127 EGLPPGWKRVFSPEKG-TYYYHEEMKRTQYEHPCIS 161 (271)
T ss_pred ccCCcchhhccccccc-eeeeecccchhhhcCCCCC
Confidence 5699999999998766 7999999999999999985
No 9
>KOG0940 consensus Ubiquitin protein ligase RSP5/NEDD4 [Posttranslational modification, protein turnover, chaperones]
Probab=90.71 E-value=0.1 Score=49.29 Aligned_cols=31 Identities=26% Similarity=0.396 Sum_probs=28.1
Q ss_pred cceeceecCcccEEEEecCCCcccCCCCCCCC
Q 027645 66 EWERCLDIQSGEIHFYNTRTHKKTSGDPRGSP 97 (220)
Q Consensus 66 gWEq~LDlqSG~iYY~N~~T~~stw~dPR~~~ 97 (220)
+|+++.|- +|+.||+|+..++++|-||+++.
T Consensus 117 ~~h~~~~~-~g~r~F~~~i~~ktt~ldd~e~~ 147 (358)
T KOG0940|consen 117 GWHMRFTD-TGQRPFYKHILKKTTTLDDREAV 147 (358)
T ss_pred ceeeEecC-CCceehhhhhhcCccccCchhhc
Confidence 69999874 68999999999999999999973
No 10
>PRK08351 DNA-directed RNA polymerase subunit E''; Validated
Probab=88.90 E-value=0.18 Score=37.04 Aligned_cols=25 Identities=32% Similarity=0.810 Sum_probs=19.4
Q ss_pred eeeeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 174 MVATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 174 mv~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
|..-+|.+|+..+ ....||+|.+..
T Consensus 1 M~~kAC~~C~~i~----~~~~CP~Cgs~~ 25 (61)
T PRK08351 1 MTEKACRHCHYIT----TEDRCPVCGSRD 25 (61)
T ss_pred CchhhhhhCCccc----CCCcCCCCcCCc
Confidence 4556999999877 455899999754
No 11
>smart00391 MBD Methyl-CpG binding domain. Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain
Probab=87.01 E-value=0.57 Score=35.20 Aligned_cols=28 Identities=25% Similarity=0.481 Sum_probs=20.5
Q ss_pred cCCCCCccceeceecC-------cccEEEEecCCCc
Q 027645 59 LETPLPLEWERCLDIQ-------SGEIHFYNTRTHK 87 (220)
Q Consensus 59 l~~pLPsgWEq~LDlq-------SG~iYY~N~~T~~ 87 (220)
++.|||.||+..+-+. .+.|||+.. +|+
T Consensus 4 ~~~Plp~GW~R~~~~r~~g~~~~~~dV~Y~sP-~Gk 38 (77)
T smart00391 4 LRLPLPCGWRRETKQRKSGRSAGKFDVYYISP-CGK 38 (77)
T ss_pred ccCCCCCCcEEEEEEecCCCCCCcccEEEECC-CCC
Confidence 4578999999988632 478999955 443
No 12
>cd00122 MBD MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family.
Probab=84.85 E-value=1 Score=32.08 Aligned_cols=26 Identities=38% Similarity=0.816 Sum_probs=20.1
Q ss_pred cCCCCCccceeceec------CcccEEEEecC
Q 027645 59 LETPLPLEWERCLDI------QSGEIHFYNTR 84 (220)
Q Consensus 59 l~~pLPsgWEq~LDl------qSG~iYY~N~~ 84 (220)
++.|||.||+..+-+ ..+.|||+...
T Consensus 2 l~~P~p~GW~R~~~~r~~g~~~k~dv~Y~sP~ 33 (62)
T cd00122 2 LRDPLPPGWKRELVIRKSGSAGKGDVYYYSPC 33 (62)
T ss_pred CCCCCCCCeEEEEEEcCCCCCCcceEEEECCC
Confidence 568999999999875 34578999664
No 13
>KOG0940 consensus Ubiquitin protein ligase RSP5/NEDD4 [Posttranslational modification, protein turnover, chaperones]
Probab=82.50 E-value=1.2 Score=42.31 Aligned_cols=46 Identities=22% Similarity=0.188 Sum_probs=38.4
Q ss_pred ccccccccC-CCCCccceeceecCcc--cEEEEecCCC-cccCCCCCCCC
Q 027645 52 IFDIELQLE-TPLPLEWERCLDIQSG--EIHFYNTRTH-KKTSGDPRGSP 97 (220)
Q Consensus 52 i~~iEL~l~-~pLPsgWEq~LDlqSG--~iYY~N~~T~-~stw~dPR~~~ 97 (220)
.|..|..++ ++||.+|+..|+.+-| ..||+|+.+. .++|.+|+...
T Consensus 49 ~~~~ee~ldy~glprewf~~lS~e~~~p~~~~~~~~~~~~tlq~~P~sg~ 98 (358)
T KOG0940|consen 49 EFKGEEGLDYGGLPREWFFLLSHEGFNPWYGLFQHSRKDYTLWLNPRSGV 98 (358)
T ss_pred ecccccccccCCCCcceeeeeccccCCcceeeeeecccccccccCCccCC
Confidence 347777766 8899999999998765 8889999998 59999999873
No 14
>PRK06393 rpoE DNA-directed RNA polymerase subunit E''; Validated
Probab=81.71 E-value=0.55 Score=34.94 Aligned_cols=23 Identities=30% Similarity=0.514 Sum_probs=18.9
Q ss_pred eeeeCccceeEeEeeCCCCCCCCCCcC
Q 027645 175 VATACMRCHMLVMLCKSSPTCPNCKFL 201 (220)
Q Consensus 175 v~~gC~~ClmyVM~~k~~p~CP~Ck~~ 201 (220)
...+|.+|+..+ .+..||.|.+-
T Consensus 4 ~~~AC~~C~~i~----~~~~Cp~Cgs~ 26 (64)
T PRK06393 4 QYRACKKCKRLT----PEKTCPVHGDE 26 (64)
T ss_pred hhhhHhhCCccc----CCCcCCCCCCC
Confidence 456899999888 46699999974
No 15
>COG5104 PRP40 Splicing factor [RNA processing and modification]
Probab=80.67 E-value=0.76 Score=45.64 Aligned_cols=30 Identities=23% Similarity=0.568 Sum_probs=26.3
Q ss_pred cceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 66 EWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 66 gWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
+|..|-+ ..|++||+|..|.++.|.-|-..
T Consensus 57 ~Wke~~T-adGkvyyyN~~TREs~W~iP~e~ 86 (590)
T COG5104 57 PWKECRT-ADGKVYYYNSITRESRWKIPPER 86 (590)
T ss_pred hHHHHhh-cCCceEEecCccccccccCChhh
Confidence 7999976 57999999999999999988653
No 16
>cd01397 HAT_MBD Methyl-CpG binding domains (MBD) present in putative chromatin remodelling factor such as BAZ2A; BAZ2A contains a MBD, DDT, PHD-type zinc finger and Bromo domain suggesting that BAZ2A might be associated with histone acetyltransferase (HAT) activity. The Drosophila melanogaster toutatis protein, a putative subunit of the chromatin-remodeling complex, and other such proteins in this group share a similar domain architecture with BAZ2A, as does the Caenorhabditis elegans flectin homolog.
Probab=80.10 E-value=1.5 Score=33.23 Aligned_cols=24 Identities=42% Similarity=0.844 Sum_probs=18.9
Q ss_pred cCCCCCccceeceec------CcccEEEEe
Q 027645 59 LETPLPLEWERCLDI------QSGEIHFYN 82 (220)
Q Consensus 59 l~~pLPsgWEq~LDl------qSG~iYY~N 82 (220)
+..|||.||+..+=+ ..|.|||+-
T Consensus 2 ~r~Pl~~GW~Re~vir~~~~~~~~dV~Y~a 31 (73)
T cd01397 2 LRVPLELGWRRETRIRGLGGRIQGEVAYYA 31 (73)
T ss_pred ccCCCCCCceeEEEeccCCCCccceEEEEC
Confidence 458999999998855 457899983
No 17
>PF01429 MBD: Methyl-CpG binding domain; InterPro: IPR001739 Methylation at CpG dinucleotide, the most common DNA modification in eukaryotes, has been correlated with gene silencing associated with various phenomena such as genomic imprinting, transposon and chromosome X inactivation, differentiation, and cancer. Effects of DNA methylation are mediated through proteins which bind to symmetrically methylated CpGs. Such proteins contain a specific domain of ~70 residues, the methyl-CpG-binding domain (MBD), which is linked to additional domains associated with chromatin, such as the bromodomain, the AT hook motif, the SET domain, or the PHD finger. MBD-containing proteins appear to act as structural proteins, which recruit a variety of histone deacetylase (HDAC) complexes and chromatin remodelling factors, leading to chromatin compaction and, consequently, to transcriptional repression. The MBD of MeCP2, MBD1, MBD2, MBD4 and BAZ2 mediates binding to DNA, in the case of MeCP2, MBD1 and MBD2 preferentially to methylated CpG. In the case of human MBD3 and SETDB1 the MBD has been shown to mediate protein-protein interactions. The MBD folds into an alpha/beta sandwich structure comprising a layer of twisted beta sheet, backed by another layer formed by the alpha1 helix and a hairpin loop at the C terminus. These layers are both amphipathic, with the alpha1 helix and the beta sheet lying parallel and the hydrophobic faces tightly packed against each other. The beta sheet is composed of two long inner strands (beta2 and beta3) sandwiched by two shorter outer strands (beta1 and beta4).; GO: 0003677 DNA binding, 0005634 nucleus; PDB: 2KY8_A 1UB1_A 1D9N_A 1IG4_A 1QK9_A 3C2I_A.
Probab=77.79 E-value=2.2 Score=31.58 Aligned_cols=30 Identities=30% Similarity=0.691 Sum_probs=20.4
Q ss_pred cCCCCCccceeceec-Cc------ccEEEEecCCCccc
Q 027645 59 LETPLPLEWERCLDI-QS------GEIHFYNTRTHKKT 89 (220)
Q Consensus 59 l~~pLPsgWEq~LDl-qS------G~iYY~N~~T~~st 89 (220)
+..+||.||+..+=. ++ +.+||+.. +|++-
T Consensus 7 ~~~~Lp~GW~re~~~R~~g~~~~~~dv~Y~sP-~Gk~~ 43 (77)
T PF01429_consen 7 LDPPLPDGWKREVVVRKSGSSAGKKDVYYYSP-CGKRF 43 (77)
T ss_dssp EBTTSTTT-EEEEEESSSSTTTTSEEEEEEET-TSEEE
T ss_pred ccCCCCCCCEEEEEEecCCCcCCceEEEEECC-CCCEE
Confidence 458999999877662 32 46799987 66543
No 18
>PRK00398 rpoP DNA-directed RNA polymerase subunit P; Provisional
Probab=77.60 E-value=1.7 Score=29.08 Aligned_cols=31 Identities=29% Similarity=0.493 Sum_probs=24.1
Q ss_pred eeeeeCccceeEeEeeCC--CCCCCCCCcCCCC
Q 027645 174 MVATACMRCHMLVMLCKS--SPTCPNCKFLHPP 204 (220)
Q Consensus 174 mv~~gC~~ClmyVM~~k~--~p~CP~Ck~~~p~ 204 (220)
|+..-|++|.--+-+... ..+||.|.+-...
T Consensus 1 ~~~y~C~~CG~~~~~~~~~~~~~Cp~CG~~~~~ 33 (46)
T PRK00398 1 MAEYKCARCGREVELDEYGTGVRCPYCGYRILF 33 (46)
T ss_pred CCEEECCCCCCEEEECCCCCceECCCCCCeEEE
Confidence 678889999987766654 5899999975543
No 19
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=61.04 E-value=2.3 Score=45.72 Aligned_cols=34 Identities=26% Similarity=0.465 Sum_probs=28.7
Q ss_pred CCCCccceeceecCcccEEEEecCCCcccCCCCCC
Q 027645 61 TPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRG 95 (220)
Q Consensus 61 ~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~ 95 (220)
..||++|+...|- .|+-||+|+.+..+++++|--
T Consensus 18 ~~v~~~~~r~~ds-k~r~~y~~~~~~~~~~~~~~~ 51 (1298)
T KOG3552|consen 18 EELSYGWERAIDS-KGRSYYINHLNKTTTYEAPEC 51 (1298)
T ss_pred cccchHHHHhhhc-ccchhHHhhcCCccCcCCCcc
Confidence 6799999999996 599999999998888877643
No 20
>PRK13130 H/ACA RNA-protein complex component Nop10p; Reviewed
Probab=59.76 E-value=8.4 Score=27.86 Aligned_cols=25 Identities=20% Similarity=0.541 Sum_probs=20.0
Q ss_pred eeeeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 174 MVATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 174 mv~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
|-...|+.|-.|-| ...||.|....
T Consensus 3 s~mr~C~~CgvYTL----k~~CP~CG~~t 27 (56)
T PRK13130 3 SKIRKCPKCGVYTL----KEICPVCGGKT 27 (56)
T ss_pred ccceECCCCCCEEc----cccCcCCCCCC
Confidence 34567999999999 67899999443
No 21
>COG3357 Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon [Transcription]
Probab=57.37 E-value=4.6 Score=32.37 Aligned_cols=31 Identities=23% Similarity=0.437 Sum_probs=21.9
Q ss_pred cceeeeeCccceeEeEeeC--CCCCCCCCCcCC
Q 027645 172 QEMVATACMRCHMLVMLCK--SSPTCPNCKFLH 202 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k--~~p~CP~Ck~~~ 202 (220)
=.|+-+.|..|-.-+-=-+ .--+||+||+--
T Consensus 54 Llv~Pa~CkkCGfef~~~~ik~pSRCP~CKSE~ 86 (97)
T COG3357 54 LLVRPARCKKCGFEFRDDKIKKPSRCPKCKSEW 86 (97)
T ss_pred EEecChhhcccCccccccccCCcccCCcchhhc
Confidence 4677889999986554323 336899999754
No 22
>TIGR02098 MJ0042_CXXC MJ0042 family finger-like domain. This domain contains a CXXCX(19)CXXC motif suggestive of both zinc fingers and thioredoxin, usually found at the N-terminus of prokaryotic proteins. One partially characterized gene, agmX, is among a large set in Myxococcus whose interruption affects adventurous gliding motility.
Probab=56.90 E-value=6.9 Score=24.84 Aligned_cols=25 Identities=20% Similarity=0.558 Sum_probs=19.0
Q ss_pred eeCccceeEeEeeCC-------CCCCCCCCcC
Q 027645 177 TACMRCHMLVMLCKS-------SPTCPNCKFL 201 (220)
Q Consensus 177 ~gC~~ClmyVM~~k~-------~p~CP~Ck~~ 201 (220)
+-||+|.--+.|... ..+||+|+..
T Consensus 3 ~~CP~C~~~~~v~~~~~~~~~~~v~C~~C~~~ 34 (38)
T TIGR02098 3 IQCPNCKTSFRVVDSQLGANGGKVRCGKCGHV 34 (38)
T ss_pred EECCCCCCEEEeCHHHcCCCCCEEECCCCCCE
Confidence 679999998888742 2689999853
No 23
>COG2260 Predicted Zn-ribbon RNA-binding protein [Translation, ribosomal structure and biogenesis]
Probab=55.81 E-value=9.8 Score=28.09 Aligned_cols=24 Identities=21% Similarity=0.538 Sum_probs=19.1
Q ss_pred eeeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 175 VATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 175 v~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
..--|+.|..|-|= .+||.|....
T Consensus 4 ~~rkC~~cg~YTLk----e~Cp~CG~~t 27 (59)
T COG2260 4 LIRKCPKCGRYTLK----EKCPVCGGDT 27 (59)
T ss_pred hhhcCcCCCceeec----ccCCCCCCcc
Confidence 44579999999885 7899999544
No 24
>PF09538 FYDLN_acid: Protein of unknown function (FYDLN_acid); InterPro: IPR012644 Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=54.68 E-value=9.8 Score=30.51 Aligned_cols=28 Identities=36% Similarity=0.863 Sum_probs=21.7
Q ss_pred eeCccceeEeE-eeCCCCC-CCCCCcCCCCC
Q 027645 177 TACMRCHMLVM-LCKSSPT-CPNCKFLHPPD 205 (220)
Q Consensus 177 ~gC~~ClmyVM-~~k~~p~-CP~Ck~~~p~~ 205 (220)
-.|+.|-.-+. |.| +|. ||+|...+++.
T Consensus 10 R~Cp~CG~kFYDLnk-~PivCP~CG~~~~~~ 39 (108)
T PF09538_consen 10 RTCPSCGAKFYDLNK-DPIVCPKCGTEFPPE 39 (108)
T ss_pred ccCCCCcchhccCCC-CCccCCCCCCccCcc
Confidence 35899987665 566 665 99999998876
No 25
>PF12172 DUF35_N: Rubredoxin-like zinc ribbon domain (DUF35_N); InterPro: IPR022002 This domain has no known function and is found in conserved hypothetical archaeal and bacterial proteins. The domain is duplicated in O53566 from SWISSPROT. The structure of a DUF35 representative reveals two long N-terminal helices followed by a rubredoxin-like zinc ribbon domain represented in this family and a C-terminal OB fold domain. Zinc is chelated by the four conserved cysteines in the alignment.; PDB: 3IRB_A.
Probab=53.30 E-value=9.9 Score=24.21 Aligned_cols=27 Identities=22% Similarity=0.698 Sum_probs=18.0
Q ss_pred cceeeeeCccceeEeEeeCCCCCCCCCCc
Q 027645 172 QEMVATACMRCHMLVMLCKSSPTCPNCKF 200 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k~~p~CP~Ck~ 200 (220)
..+++.-|..|-.++.-.+ +.||+|.+
T Consensus 7 ~~l~~~rC~~Cg~~~~pPr--~~Cp~C~s 33 (37)
T PF12172_consen 7 GRLLGQRCRDCGRVQFPPR--PVCPHCGS 33 (37)
T ss_dssp T-EEEEE-TTT--EEES----SEETTTT-
T ss_pred CEEEEEEcCCCCCEecCCC--cCCCCcCc
Confidence 5789999999999988877 89999964
No 26
>TIGR00100 hypA hydrogenase nickel insertion protein HypA. In H. pylori, a hypA mutant abolished hydrogenase activity and decreased urease activity. Nickel supplementation of the medium restored urease activity and partially restored hydrogenase activity. HypA is probably involved in inserting Ni into enzymes.
Probab=51.55 E-value=12 Score=29.79 Aligned_cols=30 Identities=27% Similarity=0.421 Sum_probs=23.8
Q ss_pred ceeeeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 173 EMVATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
.-+.+-|..|--++=+....-.||+|.+..
T Consensus 67 ~p~~~~C~~Cg~~~~~~~~~~~CP~Cgs~~ 96 (115)
T TIGR00100 67 EPVECECEDCSEEVSPEIDLYRCPKCHGIM 96 (115)
T ss_pred eCcEEEcccCCCEEecCCcCccCcCCcCCC
Confidence 446789999998877766677899999754
No 27
>smart00661 RPOL9 RNA polymerase subunit 9.
Probab=51.44 E-value=9.2 Score=25.40 Aligned_cols=28 Identities=21% Similarity=0.646 Sum_probs=18.1
Q ss_pred CccceeEeEeeCC----CCCCCCCCcCCCCCC
Q 027645 179 CMRCHMLVMLCKS----SPTCPNCKFLHPPDQ 206 (220)
Q Consensus 179 C~~ClmyVM~~k~----~p~CP~Ck~~~p~~~ 206 (220)
||.|.-.+..... .-.||.|.+.+...+
T Consensus 3 Cp~Cg~~l~~~~~~~~~~~vC~~Cg~~~~~~~ 34 (52)
T smart00661 3 CPKCGNMLIPKEGKEKRRFVCRKCGYEEPIEQ 34 (52)
T ss_pred CCCCCCccccccCCCCCEEECCcCCCeEECCC
Confidence 8888764444332 356999998776543
No 28
>KOG0155 consensus Transcription factor CA150 [Transcription]
Probab=51.28 E-value=8.9 Score=38.88 Aligned_cols=35 Identities=23% Similarity=0.426 Sum_probs=26.1
Q ss_pred CCCCc-cceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 61 TPLPL-EWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 61 ~pLPs-gWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
.++|- .|==- -...|++||+|..|..+.|+.|-+.
T Consensus 109 ~~ipgtdWcVV-wTgD~RvFFyNpktk~S~We~P~dl 144 (617)
T KOG0155|consen 109 KPIPGTDWCVV-WTGDNRVFFYNPKTKLSVWERPLDL 144 (617)
T ss_pred CCCCCCCeEEE-EeCCCceEEeCCccccccccCchhh
Confidence 44665 68322 2345899999999999999999764
No 29
>PRK12380 hydrogenase nickel incorporation protein HybF; Provisional
Probab=48.71 E-value=14 Score=29.35 Aligned_cols=30 Identities=20% Similarity=0.534 Sum_probs=23.2
Q ss_pred ceeeeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 173 EMVATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
.-+.+-|..|--.+-+....-.||+|.+..
T Consensus 67 vp~~~~C~~Cg~~~~~~~~~~~CP~Cgs~~ 96 (113)
T PRK12380 67 KPAQAWCWDCSQVVEIHQHDAQCPHCHGER 96 (113)
T ss_pred eCcEEEcccCCCEEecCCcCccCcCCCCCC
Confidence 446788999997777766666799999653
No 30
>PRK03681 hypA hydrogenase nickel incorporation protein; Validated
Probab=48.34 E-value=15 Score=29.27 Aligned_cols=30 Identities=23% Similarity=0.472 Sum_probs=23.0
Q ss_pred ceeeeeCccceeEeEeeCCC-CCCCCCCcCC
Q 027645 173 EMVATACMRCHMLVMLCKSS-PTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k~~-p~CP~Ck~~~ 202 (220)
.-+.+-|..|--++=+.... -.||+|.+..
T Consensus 67 ~p~~~~C~~Cg~~~~~~~~~~~~CP~Cgs~~ 97 (114)
T PRK03681 67 QEAECWCETCQQYVTLLTQRVRRCPQCHGDM 97 (114)
T ss_pred eCcEEEcccCCCeeecCCccCCcCcCcCCCC
Confidence 44678899999887776554 6699999754
No 31
>TIGR00155 pqiA_fam integral membrane protein, PqiA family. This family consists of uncharacterized predicted integral membrane proteins found, so far, only in the Proteobacteria. Of two members in E. coli, one is induced by paraquat and is designated PqiA, paraquat-inducible protein A.
Probab=47.13 E-value=9.8 Score=36.37 Aligned_cols=26 Identities=31% Similarity=0.782 Sum_probs=19.3
Q ss_pred eeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 176 ATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 176 ~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
+.+|+.|...+ -......||||+...
T Consensus 215 ~~~C~~Cd~~~-~~~~~a~CpRC~~~L 240 (403)
T TIGR00155 215 LRSCSACHTTI-LPAQEPVCPRCSTPL 240 (403)
T ss_pred CCcCCCCCCcc-CCCCCcCCcCCCCcc
Confidence 55799999954 234557899999765
No 32
>cd01396 MeCP2_MBD MeCP2, MBD1, MBD2, MBD3, and MBD4 are members of a protein family that share the methyl-CpG-binding domain (MBD). The MBD, consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1.
Probab=46.55 E-value=18 Score=27.13 Aligned_cols=25 Identities=20% Similarity=0.367 Sum_probs=17.6
Q ss_pred CCCCccceeceecCc------ccEEEEecCC
Q 027645 61 TPLPLEWERCLDIQS------GEIHFYNTRT 85 (220)
Q Consensus 61 ~pLPsgWEq~LDlqS------G~iYY~N~~T 85 (220)
..||.||+..+-+.. +.+||+....
T Consensus 5 ~~lp~GW~r~~~~R~~gs~~k~DvyY~sP~G 35 (77)
T cd01396 5 PRLPPGWKRELVPRKSGSAGKFDVYYISPTG 35 (77)
T ss_pred CCCCCCCEEEEEEecCCCCCcceEEEECCCC
Confidence 349999988775433 3589996653
No 33
>PF13719 zinc_ribbon_5: zinc-ribbon domain
Probab=46.04 E-value=7.7 Score=25.21 Aligned_cols=24 Identities=21% Similarity=0.635 Sum_probs=16.5
Q ss_pred eeCccceeEeEeeCC-------CCCCCCCCc
Q 027645 177 TACMRCHMLVMLCKS-------SPTCPNCKF 200 (220)
Q Consensus 177 ~gC~~ClmyVM~~k~-------~p~CP~Ck~ 200 (220)
+-||+|.--.-|... .-+||+|+.
T Consensus 3 i~CP~C~~~f~v~~~~l~~~~~~vrC~~C~~ 33 (37)
T PF13719_consen 3 ITCPNCQTRFRVPDDKLPAGGRKVRCPKCGH 33 (37)
T ss_pred EECCCCCceEEcCHHHcccCCcEEECCCCCc
Confidence 468888877776643 457888874
No 34
>PF10571 UPF0547: Uncharacterised protein family UPF0547; InterPro: IPR018886 This domain may well be a type of zinc-finger as it carries two pairs of highly conserved cysteine residues though with no accompanying histidines. Several members are annotated as putative helicases.
Probab=45.14 E-value=11 Score=23.19 Aligned_cols=21 Identities=33% Similarity=0.792 Sum_probs=13.9
Q ss_pred CccceeEeEeeCCCCCCCCCCcC
Q 027645 179 CMRCHMLVMLCKSSPTCPNCKFL 201 (220)
Q Consensus 179 C~~ClmyVM~~k~~p~CP~Ck~~ 201 (220)
||.|-.-| +.+.-.||.|.+.
T Consensus 3 CP~C~~~V--~~~~~~Cp~CG~~ 23 (26)
T PF10571_consen 3 CPECGAEV--PESAKFCPHCGYD 23 (26)
T ss_pred CCCCcCCc--hhhcCcCCCCCCC
Confidence 77776655 4556778888753
No 35
>PF10058 DUF2296: Predicted integral membrane metal-binding protein (DUF2296); InterPro: IPR019273 This domain, found mainly in the eukaryotic lunapark proteins, has no known function.
Probab=43.82 E-value=17 Score=25.84 Aligned_cols=32 Identities=25% Similarity=0.461 Sum_probs=22.6
Q ss_pred CccceeeeeCccceeEe-EeeC-----CCCCCCCCCcC
Q 027645 170 DQQEMVATACMRCHMLV-MLCK-----SSPTCPNCKFL 201 (220)
Q Consensus 170 ~~~~mv~~gC~~ClmyV-M~~k-----~~p~CP~Ck~~ 201 (220)
....+.|..|.+|++-- |..+ ..-+||.|+++
T Consensus 16 ~~~~r~aLIC~~C~~hNGla~~~~~~~i~y~C~~Cg~~ 53 (54)
T PF10058_consen 16 SPSNRYALICSKCFSHNGLAPKEEFEEIQYRCPYCGAL 53 (54)
T ss_pred cccCceeEECcccchhhcccccccCCceEEEcCCCCCc
Confidence 34688899999999754 2223 33579999875
No 36
>cd04482 RPA2_OBF_like RPA2_OBF_like: A subgroup of uncharacterized archaeal OB folds with similarity to the OB fold of the central ssDNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B; RPA2 DBD-D is a weak ssDNA-binding domain. RPA2 DBD-D is also involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. N-terminal to human RPA2 DBD-D is a domain containing all the known phosphorylation sites of RPA. Human RPA2 is phosphorylated in a cell cycle depende
Probab=43.27 E-value=12 Score=28.42 Aligned_cols=10 Identities=40% Similarity=1.388 Sum_probs=8.5
Q ss_pred CCCCCCCCCC
Q 027645 190 KSSPTCPNCK 199 (220)
Q Consensus 190 k~~p~CP~Ck 199 (220)
..+|+||+|+
T Consensus 82 ~~np~C~~C~ 91 (91)
T cd04482 82 RENPVCPKCG 91 (91)
T ss_pred EcCCcCCCCC
Confidence 4789999995
No 37
>PF12065 DUF3545: Protein of unknown function (DUF3545); InterPro: IPR021932 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 60 and 77 amino acids in length. This protein has two completely conserved residues (R and L) that may be functionally important.
Probab=42.81 E-value=11 Score=27.88 Aligned_cols=14 Identities=43% Similarity=0.420 Sum_probs=10.5
Q ss_pred Ccchhhhhhhhhhh
Q 027645 20 DNLSKKRKLEEIIQ 33 (220)
Q Consensus 20 ~~~skkrkwee~~~ 33 (220)
-...+||||-|++.
T Consensus 19 r~k~~KRKWREIEA 32 (59)
T PF12065_consen 19 RSKPKKRKWREIEA 32 (59)
T ss_pred cCCccchhHHHHHH
Confidence 34579999999663
No 38
>PF10164 DUF2367: Uncharacterized conserved protein (DUF2367); InterPro: IPR019317 This is a highly conserved set of proteins which contains three pairs of cysteine residues within a length of 42 amino acids and is rich in proline residues towards the N terminus. It includes a membrane protein that has been found to be highly expressed in the mouse brain and consequently, several members have been assigned as brain protein i3 (Bri3). Their function is unknown.
Probab=42.62 E-value=23 Score=28.54 Aligned_cols=16 Identities=31% Similarity=0.611 Sum_probs=13.5
Q ss_pred cceeeeeCccceeEeE
Q 027645 172 QEMVATACMRCHMLVM 187 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM 187 (220)
...|+.||++|..-+|
T Consensus 45 ~vvvvggCp~CrvG~l 60 (98)
T PF10164_consen 45 QVVVVGGCPACRVGVL 60 (98)
T ss_pred ceEEecCCCCCceeee
Confidence 4778899999998776
No 39
>COG1096 Predicted RNA-binding protein (consists of S1 domain and a Zn-ribbon domain) [Translation, ribosomal structure and biogenesis]
Probab=42.58 E-value=16 Score=32.29 Aligned_cols=32 Identities=22% Similarity=0.529 Sum_probs=24.9
Q ss_pred CccceeeeeCccceeEeEeeCCCCCCCCCCcC
Q 027645 170 DQQEMVATACMRCHMLVMLCKSSPTCPNCKFL 201 (220)
Q Consensus 170 ~~~~mv~~gC~~ClmyVM~~k~~p~CP~Ck~~ 201 (220)
+.-..|.|-|.+|---.+.-...-+||||...
T Consensus 143 ~dlGVI~A~CsrC~~~L~~~~~~l~Cp~Cg~t 174 (188)
T COG1096 143 NDLGVIYARCSRCRAPLVKKGNMLKCPNCGNT 174 (188)
T ss_pred CcceEEEEEccCCCcceEEcCcEEECCCCCCE
Confidence 34688999999998665554556999999854
No 40
>COG1675 TFA1 Transcription initiation factor IIE, alpha subunit [Transcription]
Probab=42.38 E-value=8.6 Score=33.41 Aligned_cols=32 Identities=22% Similarity=0.338 Sum_probs=24.5
Q ss_pred eeeeCccceeEeEeeC---CCCCCCCCCcCCCCCC
Q 027645 175 VATACMRCHMLVMLCK---SSPTCPNCKFLHPPDQ 206 (220)
Q Consensus 175 v~~gC~~ClmyVM~~k---~~p~CP~Ck~~~p~~~ 206 (220)
+--.|++|+|++-+-. ..-.||+|+..+--..
T Consensus 112 ~~y~C~~~~~r~sfdeA~~~~F~Cp~Cg~~L~~~d 146 (176)
T COG1675 112 NYYVCPNCHVKYSFDEAMELGFTCPKCGEDLEEYD 146 (176)
T ss_pred CceeCCCCCCcccHHHHHHhCCCCCCCCchhhhcc
Confidence 4456899999997664 5589999998875443
No 41
>PF03682 UPF0158: Uncharacterised protein family (UPF0158); InterPro: IPR005361 This is a small family of hypothetical bacterial proteins of unknown function.
Probab=42.37 E-value=16 Score=30.84 Aligned_cols=17 Identities=41% Similarity=0.726 Sum_probs=15.1
Q ss_pred cceeceecCcccEEEEe
Q 027645 66 EWERCLDIQSGEIHFYN 82 (220)
Q Consensus 66 gWEq~LDlqSG~iYY~N 82 (220)
+++-+||++||+|+|+.
T Consensus 21 e~~~yLD~~TGeI~~~~ 37 (163)
T PF03682_consen 21 EREYYLDLETGEIFYVS 37 (163)
T ss_pred cceEEEECCCCeEEEee
Confidence 67889999999999984
No 42
>PF07295 DUF1451: Protein of unknown function (DUF1451); InterPro: IPR009912 This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein. The function of this family is unknown.
Probab=41.49 E-value=24 Score=29.68 Aligned_cols=30 Identities=27% Similarity=0.530 Sum_probs=21.7
Q ss_pred cceeeeeCccceeEeEeeCC--CCCCCCCCcC
Q 027645 172 QEMVATACMRCHMLVMLCKS--SPTCPNCKFL 201 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k~--~p~CP~Ck~~ 201 (220)
.++=...|.+|.--+-+... -|.||+|...
T Consensus 108 ~g~G~l~C~~Cg~~~~~~~~~~l~~Cp~C~~~ 139 (146)
T PF07295_consen 108 VGPGTLVCENCGHEVELTHPERLPPCPKCGHT 139 (146)
T ss_pred ecCceEecccCCCEEEecCCCcCCCCCCCCCC
Confidence 44445669999877776654 4999999853
No 43
>PF01155 HypA: Hydrogenase expression/synthesis hypA family; InterPro: IPR000688 Bacterial membrane-bound nickel-dependent hydrogenases require a number of accessory proteins which are involved in their maturation. The exact role of these proteins is not yet clear, but some seem to be required for the incorporation of the nickel ions. One of these proteins is generally known as hypA. It is a protein of about 12 to 14 kDa that contains, in its C-terminal region, four conserved cysteines that form a zinc-finger like motif. Escherichia coli has two proteins that belong to this family, hypA and hybF. A homologue, MJ0214, has also been found in a number of archaeal species, including the genome of Methanocaldococcus jannaschii (Methanococcus jannaschii).; GO: 0016151 nickel ion binding, 0006464 protein modification process; PDB: 2KDX_A 3A44_D 3A43_B.
Probab=40.92 E-value=9.8 Score=30.06 Aligned_cols=30 Identities=17% Similarity=0.279 Sum_probs=22.0
Q ss_pred ceeeeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 173 EMVATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
.-+.+-|..|-..+-+....-.||+|++..
T Consensus 67 ~p~~~~C~~Cg~~~~~~~~~~~CP~Cgs~~ 96 (113)
T PF01155_consen 67 VPARARCRDCGHEFEPDEFDFSCPRCGSPD 96 (113)
T ss_dssp E--EEEETTTS-EEECHHCCHH-SSSSSS-
T ss_pred cCCcEECCCCCCEEecCCCCCCCcCCcCCC
Confidence 447789999999999888888899999753
No 44
>cd07973 Spt4 Transcription elongation factor Spt4. Spt4 is a transcription elongation factor. Three transcription-elongation factors Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. Spt4 functions entirely in the context of the Spt4-Spt5 heterodimer and it has been found only in a complex with Spt5 in yeast and human. Spt4 is a small protein that has a zinc finger at the N-terminus. Spt5 is a large protein that has several interesting structural features of an acidic N-terminus, a single NGN domain, five or six KOW domains, and a set of simple C-terminal repeats. Spt4 binds to Spt5 NGN domain. Unlike Spt5, Spt4 is not essential for viability in yeast, however Spt4 is critical for normal function of the Spt4-Spt5 compl
Probab=39.88 E-value=14 Score=29.32 Aligned_cols=26 Identities=31% Similarity=0.867 Sum_probs=17.1
Q ss_pred eeCccceeEeEeeC-CCCCCCCCC-cCC
Q 027645 177 TACMRCHMLVMLCK-SSPTCPNCK-FLH 202 (220)
Q Consensus 177 ~gC~~ClmyVM~~k-~~p~CP~Ck-~~~ 202 (220)
.+|.+|.+.+=..+ ....||||. +++
T Consensus 4 rAC~~C~~I~~~~qf~~~gCpnC~~~l~ 31 (98)
T cd07973 4 RACLLCSLIKTEDQFERDGCPNCEGYLD 31 (98)
T ss_pred chhccCCcccccccccCCCCCCCcchhc
Confidence 38999997663222 346899994 443
No 45
>PRK15103 paraquat-inducible membrane protein A; Provisional
Probab=39.41 E-value=15 Score=35.36 Aligned_cols=25 Identities=24% Similarity=0.711 Sum_probs=19.0
Q ss_pred eeeCccceeEeEeeCCCCCCCCCCcCC
Q 027645 176 ATACMRCHMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 176 ~~gC~~ClmyVM~~k~~p~CP~Ck~~~ 202 (220)
+.+|+.|...+ ....-.||||....
T Consensus 221 l~~C~~Cd~l~--~~~~a~CpRC~~~L 245 (419)
T PRK15103 221 LRSCSCCTAIL--PADQPVCPRCHTKG 245 (419)
T ss_pred CCcCCCCCCCC--CCCCCCCCCCCCcC
Confidence 55799999953 44556899999765
No 46
>KOG0155 consensus Transcription factor CA150 [Transcription]
Probab=38.30 E-value=18 Score=36.77 Aligned_cols=32 Identities=22% Similarity=0.321 Sum_probs=27.3
Q ss_pred CccceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 64 PLEWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 64 PsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
|++|-.+= ...|.-||+|..|...||..|...
T Consensus 11 ps~wtef~-ap~G~pyy~ns~t~~st~ekP~~l 42 (617)
T KOG0155|consen 11 PSGWTEFK-APDGIPYYWNSETLESTWEKPSFL 42 (617)
T ss_pred CCCCccCC-CCCCcceecccccccchhhCchhh
Confidence 48998774 457999999999999999999864
No 47
>PF10122 Mu-like_Com: Mu-like prophage protein Com; InterPro: IPR019294 Members of this entry belong to the Com family of proteins that act as translational regulators of mom.
Probab=37.76 E-value=14 Score=26.47 Aligned_cols=29 Identities=24% Similarity=0.595 Sum_probs=18.9
Q ss_pred eeCccceeEeEee----CCCCCCCCCCcCCCCC
Q 027645 177 TACMRCHMLVMLC----KSSPTCPNCKFLHPPD 205 (220)
Q Consensus 177 ~gC~~ClmyVM~~----k~~p~CP~Ck~~~p~~ 205 (220)
.-|.+|.=+.+-. ...-+||||+.+.-..
T Consensus 5 iRC~~CnklLa~~g~~~~leIKCpRC~tiN~~~ 37 (51)
T PF10122_consen 5 IRCGHCNKLLAKAGEVIELEIKCPRCKTINHVR 37 (51)
T ss_pred eeccchhHHHhhhcCccEEEEECCCCCccceEe
Confidence 4577776544432 3457999999887543
No 48
>PRK00420 hypothetical protein; Validated
Probab=37.55 E-value=15 Score=29.79 Aligned_cols=32 Identities=16% Similarity=0.319 Sum_probs=25.4
Q ss_pred cceeeeeCccceeEeEe-eCCCCCCCCCCcCCC
Q 027645 172 QEMVATACMRCHMLVML-CKSSPTCPNCKFLHP 203 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~-~k~~p~CP~Ck~~~p 203 (220)
..|....||.|-.-.+= -...-.||+|..++-
T Consensus 19 a~ml~~~CP~Cg~pLf~lk~g~~~Cp~Cg~~~~ 51 (112)
T PRK00420 19 AKMLSKHCPVCGLPLFELKDGEVVCPVHGKVYI 51 (112)
T ss_pred HHHccCCCCCCCCcceecCCCceECCCCCCeee
Confidence 67899999999965553 567789999997654
No 49
>KOG4286 consensus Dystrophin-like protein [Cell motility; Signal transduction mechanisms; Cytoskeleton]
Probab=37.54 E-value=11 Score=39.88 Aligned_cols=31 Identities=23% Similarity=0.420 Sum_probs=25.6
Q ss_pred CCCCccceeceecCcccEEEEecCCCcccCCCCC
Q 027645 61 TPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPR 94 (220)
Q Consensus 61 ~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR 94 (220)
..|| ||..+.. .---||+||.|.++.|++|.
T Consensus 350 vq~p--w~rais~-nkvpyyinh~~q~t~wdhp~ 380 (966)
T KOG4286|consen 350 VQGP--WERAISP-NKVPYYINHETQTTCWDHPK 380 (966)
T ss_pred Cccc--chhccCc-cccchhhcccchhhhccchH
Confidence 4444 9999887 45559999999999999997
No 50
>COG4416 Com Mu-like prophage protein Com [General function prediction only]
Probab=35.92 E-value=15 Score=27.15 Aligned_cols=11 Identities=36% Similarity=0.996 Sum_probs=8.6
Q ss_pred CCCCCCCCcCC
Q 027645 192 SPTCPNCKFLH 202 (220)
Q Consensus 192 ~p~CP~Ck~~~ 202 (220)
.-+|||||.+.
T Consensus 24 e~KCPrCK~vN 34 (60)
T COG4416 24 EKKCPRCKEVN 34 (60)
T ss_pred eecCCccceee
Confidence 36899999655
No 51
>PRK03824 hypA hydrogenase nickel incorporation protein; Provisional
Probab=35.35 E-value=36 Score=27.85 Aligned_cols=29 Identities=21% Similarity=0.361 Sum_probs=21.3
Q ss_pred eeeeeCccceeEeEee---------------------CCCCCCCCCCcCC
Q 027645 174 MVATACMRCHMLVMLC---------------------KSSPTCPNCKFLH 202 (220)
Q Consensus 174 mv~~gC~~ClmyVM~~---------------------k~~p~CP~Ck~~~ 202 (220)
-+.+-|+.|--.+-+. ...-.||+|.+..
T Consensus 68 p~~~~C~~CG~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~CP~Cgs~~ 117 (135)
T PRK03824 68 EAVLKCRNCGNEWSLKEVKESLDEEIREAIHFIPEVVHAFLKCPKCGSRD 117 (135)
T ss_pred ceEEECCCCCCEEecccccccccccccccccccccccccCcCCcCCCCCC
Confidence 3678899998666554 3456799999754
No 52
>TIGR00373 conserved hypothetical protein TIGR00373. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain.
Probab=35.02 E-value=18 Score=30.27 Aligned_cols=29 Identities=17% Similarity=0.335 Sum_probs=21.3
Q ss_pred eeeeCccceeEeEee---CCCCCCCCCCcCCC
Q 027645 175 VATACMRCHMLVMLC---KSSPTCPNCKFLHP 203 (220)
Q Consensus 175 v~~gC~~ClmyVM~~---k~~p~CP~Ck~~~p 203 (220)
.--.||+|+.-+-.. ..+-.||+|..+.-
T Consensus 108 ~~Y~Cp~c~~r~tf~eA~~~~F~Cp~Cg~~L~ 139 (158)
T TIGR00373 108 MFFICPNMCVRFTFNEAMELNFTCPRCGAMLD 139 (158)
T ss_pred CeEECCCCCcEeeHHHHHHcCCcCCCCCCEee
Confidence 446799999766433 36799999997654
No 53
>COG2093 DNA-directed RNA polymerase, subunit E'' [Transcription]
Probab=34.03 E-value=14 Score=27.72 Aligned_cols=26 Identities=27% Similarity=0.600 Sum_probs=18.7
Q ss_pred eeeeeCccceeEeEeeCCCCCCCCCCcC
Q 027645 174 MVATACMRCHMLVMLCKSSPTCPNCKFL 201 (220)
Q Consensus 174 mv~~gC~~ClmyVM~~k~~p~CP~Ck~~ 201 (220)
|.--+|.+|+-. +.....-||.|.+-
T Consensus 2 ~~~kAC~~Ck~l--~~~d~e~CP~Cgs~ 27 (64)
T COG2093 2 STEKACKNCKRL--TPEDTEICPVCGST 27 (64)
T ss_pred chhHHHhhcccc--CCCCCccCCCCCCc
Confidence 445689999854 35666779999964
No 54
>KOG4334 consensus Uncharacterized conserved protein, contains double-stranded RNA-binding motif and WW domain [General function prediction only]
Probab=33.68 E-value=33 Score=34.96 Aligned_cols=42 Identities=26% Similarity=0.324 Sum_probs=35.5
Q ss_pred ccccccCCCCCccceeceecCcccEEEEecCCCcccCCCCCCC
Q 027645 54 DIELQLETPLPLEWERCLDIQSGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 54 ~iEL~l~~pLPsgWEq~LDlqSG~iYY~N~~T~~stw~dPR~~ 96 (220)
.+.|+.--|||.||-+- .-+||.-.|+...|..-+|.+|-..
T Consensus 146 ~~~l~~~epLPeGW~~i-~HnSGmPvylHr~tRVvt~SrPYfl 187 (650)
T KOG4334|consen 146 RIDLDKSEPLPEGWTVI-SHNSGMPVYLHRFTRVVTHSRPYFL 187 (650)
T ss_pred hccCCCCCcCCCceEEE-eecCCCceEEeeeeeeEeccCceee
Confidence 56777779999999886 4579999999999999999988653
No 55
>PF09003 Phage_integ_N: Bacteriophage lambda integrase, N-terminal domain; InterPro: IPR015094 The amino terminal domain of bacteriophage lambda integrase folds into a three-stranded, antiparallel beta-sheet that packs against a C-terminal alpha-helix, adopting a fold that is structurally related to the three-stranded beta-sheet family of DNA-binding domains (which includes the GCC-box DNA-binding domain and the N-terminal domain of Tn916 integrase). This domain is responsible for high-affinity binding to each of the five DNA arm-type sites and is also a context-sensitive modulator of DNA cleavage.; GO: 0003677 DNA binding, 0008907 integrase activity, 0015074 DNA integration; PDB: 1Z1G_B 1Z1B_A 2WCC_3 1KJK_A.
Probab=33.32 E-value=23 Score=27.07 Aligned_cols=28 Identities=25% Similarity=0.191 Sum_probs=14.1
Q ss_pred CCCCccceeceecCcccEEEE--ecCCCcc
Q 027645 61 TPLPLEWERCLDIQSGEIHFY--NTRTHKK 88 (220)
Q Consensus 61 ~pLPsgWEq~LDlqSG~iYY~--N~~T~~s 88 (220)
.-||.+.-...|-.+|++||+ |..||+.
T Consensus 10 ~~lP~NLy~~~dkr~~k~Yy~Yr~P~tGk~ 39 (75)
T PF09003_consen 10 RDLPPNLYCRKDKRNGKGYYQYRNPITGKE 39 (75)
T ss_dssp GGS-TTEEEETT-----SEEEEE-TTTS-E
T ss_pred CCCCCCccccCCcCcceeEEEEecCCCCce
Confidence 457777777778788999995 5667654
No 56
>PRK05978 hypothetical protein; Provisional
Probab=33.08 E-value=32 Score=29.15 Aligned_cols=35 Identities=23% Similarity=0.456 Sum_probs=23.6
Q ss_pred eeeeCccceeEeE---eeCCCCCCCCCC---cCCCCCCCCC
Q 027645 175 VATACMRCHMLVM---LCKSSPTCPNCK---FLHPPDQSPP 209 (220)
Q Consensus 175 v~~gC~~ClmyVM---~~k~~p~CP~Ck---~~~p~~~~~~ 209 (220)
+..-||+|.=--| .-+.+++|+.|. ..++.++.|+
T Consensus 32 l~grCP~CG~G~LF~g~Lkv~~~C~~CG~~~~~~~a~DgpA 72 (148)
T PRK05978 32 FRGRCPACGEGKLFRAFLKPVDHCAACGEDFTHHRADDLPA 72 (148)
T ss_pred HcCcCCCCCCCcccccccccCCCccccCCccccCCccccCc
Confidence 3456999974444 457899999999 3344555554
No 57
>smart00659 RPOLCX RNA polymerase subunit CX. present in RNA polymerase I, II and III
Probab=32.67 E-value=35 Score=23.22 Aligned_cols=27 Identities=15% Similarity=0.329 Sum_probs=20.1
Q ss_pred eeeCccceeEeEeeC-CCCCCCCCCcCC
Q 027645 176 ATACMRCHMLVMLCK-SSPTCPNCKFLH 202 (220)
Q Consensus 176 ~~gC~~ClmyVM~~k-~~p~CP~Ck~~~ 202 (220)
.--|..|..-|-+.. ..-+||+|.+-+
T Consensus 2 ~Y~C~~Cg~~~~~~~~~~irC~~CG~rI 29 (44)
T smart00659 2 IYICGECGRENEIKSKDVVRCRECGYRI 29 (44)
T ss_pred EEECCCCCCEeecCCCCceECCCCCceE
Confidence 346889999887764 447899998654
No 58
>PRK00564 hypA hydrogenase nickel incorporation protein; Provisional
Probab=32.26 E-value=25 Score=28.08 Aligned_cols=30 Identities=20% Similarity=0.415 Sum_probs=22.0
Q ss_pred ceeeeeCccceeEeEeeC-CCCCCCCCCcCC
Q 027645 173 EMVATACMRCHMLVMLCK-SSPTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k-~~p~CP~Ck~~~ 202 (220)
.-+.+-|..|--++=+.. ...+||+|++..
T Consensus 68 vp~~~~C~~Cg~~~~~~~~~~~~CP~Cgs~~ 98 (117)
T PRK00564 68 EKVELECKDCSHVFKPNALDYGVCEKCHSKN 98 (117)
T ss_pred cCCEEEhhhCCCccccCCccCCcCcCCCCCc
Confidence 446788999997766654 345799999753
No 59
>cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, a small electron transfer protein, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=31.82 E-value=31 Score=21.73 Aligned_cols=23 Identities=22% Similarity=0.451 Sum_probs=15.0
Q ss_pred eCccceeEeEeeCCCCCCCCCCc
Q 027645 178 ACMRCHMLVMLCKSSPTCPNCKF 200 (220)
Q Consensus 178 gC~~ClmyVM~~k~~p~CP~Ck~ 200 (220)
.|..|-.-..-.+.+.+||.|+.
T Consensus 3 ~C~~CGy~y~~~~~~~~CP~Cg~ 25 (33)
T cd00350 3 VCPVCGYIYDGEEAPWVCPVCGA 25 (33)
T ss_pred ECCCCCCEECCCcCCCcCcCCCC
Confidence 46777644333446779999974
No 60
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=31.21 E-value=42 Score=35.74 Aligned_cols=77 Identities=18% Similarity=0.158 Sum_probs=47.3
Q ss_pred CccccCcchhhhhhhh--hhhcccccccCCCCCc--------CCcccccccccccCCCCCccceeceecCcccEEEEecC
Q 027645 15 ANKEFDNLSKKRKLEE--IIQGEGTFDKRSSKGV--------TTKSSIFDIELQLETPLPLEWERCLDIQSGEIHFYNTR 84 (220)
Q Consensus 15 ~~~e~~~~skkrkwee--~~~~~~~~~~~~~~~~--------~krk~i~~iEL~l~~pLPsgWEq~LDlqSG~iYY~N~~ 84 (220)
...+++++ -..+||- -.++|-+|.+.--|++ .+.|... ---+.-||+|||+--|+.-| +||+.|.
T Consensus 215 ~~e~~~gp-lp~nwemayte~gevyfiDhntkttswLdprl~kkaK~~e---eckd~elPygWeki~dpiYg-~yyvdHi 289 (984)
T KOG3209|consen 215 TQEDNLGP-LPHNWEMAYTEQGEVYFIDHNTKTTSWLDPRLTKKAKPPE---ECKDQELPYGWEKIEDPIYG-TYYVDHI 289 (984)
T ss_pred ccccccCC-CCccceEeEeecCeeEeeecccccceecChhhhcccCChh---hcccccccccccccCCccce-eEEeccc
Confidence 45667777 5567983 1366777775322332 1122221 12234499999999887644 6777788
Q ss_pred CCcccCCCCCCC
Q 027645 85 THKKTSGDPRGS 96 (220)
Q Consensus 85 T~~stw~dPR~~ 96 (220)
+-.+.++.|-..
T Consensus 290 N~~sq~enpvle 301 (984)
T KOG3209|consen 290 NRKSQYENPVLE 301 (984)
T ss_pred chhhhhccchhh
Confidence 888888877653
No 61
>PRK06266 transcription initiation factor E subunit alpha; Validated
Probab=30.67 E-value=27 Score=29.89 Aligned_cols=30 Identities=20% Similarity=0.470 Sum_probs=21.6
Q ss_pred eeeeCccceeEeEee---CCCCCCCCCCcCCCC
Q 027645 175 VATACMRCHMLVMLC---KSSPTCPNCKFLHPP 204 (220)
Q Consensus 175 v~~gC~~ClmyVM~~---k~~p~CP~Ck~~~p~ 204 (220)
.--.||+|+.-+-.- ..+-.||+|..+.-.
T Consensus 116 ~~Y~Cp~C~~rytf~eA~~~~F~Cp~Cg~~L~~ 148 (178)
T PRK06266 116 MFFFCPNCHIRFTFDEAMEYGFRCPQCGEMLEE 148 (178)
T ss_pred CEEECCCCCcEEeHHHHhhcCCcCCCCCCCCee
Confidence 456799998665432 367999999976643
No 62
>COG5242 TFB4 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair]
Probab=30.35 E-value=42 Score=31.24 Aligned_cols=27 Identities=26% Similarity=0.758 Sum_probs=21.2
Q ss_pred eeeeCccceeEeEeeCCCCCCCCCCcCCC
Q 027645 175 VATACMRCHMLVMLCKSSPTCPNCKFLHP 203 (220)
Q Consensus 175 v~~gC~~ClmyVM~~k~~p~CP~Ck~~~p 203 (220)
|--.|+-||.-+ ++-.|.|+.|++--.
T Consensus 259 ~GfvCsVCLsvf--c~p~~~C~~C~skF~ 285 (296)
T COG5242 259 LGFVCSVCLSVF--CRPVPVCKKCKSKFS 285 (296)
T ss_pred Eeeehhhhheee--cCCcCcCcccccccc
Confidence 444699998755 899999999996554
No 63
>cd04476 RPA1_DBD_C RPA1_DBD_C: A subfamily of OB folds corresponding to the C-terminal OB fold, the ssDNA-binding domain (DBD)-C, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). In addition to DBD-C, RPA1 contains three other OB folds: DBD-A, DBD-B, and RPA1N. The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in DNA binding and trimerization. It contains two structural insertions not found to date in other OB-folds: a zinc ribbon and a three-helix bundle. RPA1 DBD-C also contains a Cys4-type zinc-binding motif, which plays a role in the ssDNA binding fun
Probab=29.90 E-value=35 Score=27.84 Aligned_cols=30 Identities=20% Similarity=0.316 Sum_probs=23.4
Q ss_pred ceeeeeCccceeEeEeeC-CCCCCCCCCcCC
Q 027645 173 EMVATACMRCHMLVMLCK-SSPTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k-~~p~CP~Ck~~~ 202 (220)
.+.-.+|+.|.=-|.-.. ....|++|+..+
T Consensus 31 ~~~Y~aC~~C~kkv~~~~~~~~~C~~C~~~~ 61 (166)
T cd04476 31 NWWYPACPGCNKKVVEEGNGTYRCEKCNKSV 61 (166)
T ss_pred CeEEccccccCcccEeCCCCcEECCCCCCcC
Confidence 688899999988765333 568999998765
No 64
>PF14369 zf-RING_3: zinc-finger
Probab=29.63 E-value=47 Score=21.55 Aligned_cols=21 Identities=33% Similarity=0.985 Sum_probs=14.0
Q ss_pred CccceeEeEee--CCC-CCCCCCC
Q 027645 179 CMRCHMLVMLC--KSS-PTCPNCK 199 (220)
Q Consensus 179 C~~ClmyVM~~--k~~-p~CP~Ck 199 (220)
|=.|-+.|-+. ..+ ..||+|.
T Consensus 5 Ch~C~~~V~~~~~~~~~~~CP~C~ 28 (35)
T PF14369_consen 5 CHQCNRFVRIAPSPDSDVACPRCH 28 (35)
T ss_pred CccCCCEeEeCcCCCCCcCCcCCC
Confidence 66777777764 233 4499997
No 65
>COG0846 SIR2 NAD-dependent protein deacetylases, SIR2 family [Transcription]
Probab=29.36 E-value=38 Score=30.74 Aligned_cols=29 Identities=21% Similarity=0.580 Sum_probs=21.0
Q ss_pred cceeeeeCccce-eEe---E---eeC-CCCCCCCCCc
Q 027645 172 QEMVATACMRCH-MLV---M---LCK-SSPTCPNCKF 200 (220)
Q Consensus 172 ~~mv~~gC~~Cl-myV---M---~~k-~~p~CP~Ck~ 200 (220)
.++-.+.|..|+ .|. | +-. .-|+||+|..
T Consensus 118 Gsl~~~~C~~C~~~~~~~~~~~~~~~~~~p~C~~Cg~ 154 (250)
T COG0846 118 GSLKRVRCSKCGNQYYDEDVIKFIEDGLIPRCPKCGG 154 (250)
T ss_pred cceeeeEeCCCcCccchhhhhhhcccCCCCcCccCCC
Confidence 688999999996 333 1 212 2489999998
No 66
>PF06943 zf-LSD1: LSD1 zinc finger; InterPro: IPR005735 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This model describes a putative zinc finger domain found in three closely spaced copies in Arabidopsis protein LSD1 and in two copies in other proteins from the same species. The motif resembles CxxCRxxLMYxxGASxVxCxxC.
This domain may play a role in the regulation of transcription, via either repression of a prodeath pathway or activation of an antideath pathway, in response to signals emanating from cells undergoing pathogen-induced hypersensitive cell death. More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Probab=28.68 E-value=41 Score=20.87 Aligned_cols=23 Identities=30% Similarity=0.732 Sum_probs=17.5
Q ss_pred CccceeEeEeeC--CCCCCCCCCcC
Q 027645 179 CMRCHMLVMLCK--SSPTCPNCKFL 201 (220)
Q Consensus 179 C~~ClmyVM~~k--~~p~CP~Ck~~ 201 (220)
|-+|.++.|.+. .+-+|..|..+
T Consensus 1 C~~Cr~~L~yp~GA~sVrCa~C~~V 25 (25)
T PF06943_consen 1 CGGCRTLLMYPRGAPSVRCACCHTV 25 (25)
T ss_pred CCCCCceEEcCCCCCCeECCccCcC
Confidence 678888888887 44678888764
No 67
>PF11238 DUF3039: Protein of unknown function (DUF3039); InterPro: IPR021400 This family of proteins with unknown function appears to be restricted to Actinobacteria.
Probab=28.03 E-value=30 Score=25.48 Aligned_cols=30 Identities=27% Similarity=0.465 Sum_probs=19.2
Q ss_pred cceeeeeCccceeEeEeeCCCCCCCCCCcC
Q 027645 172 QEMVATACMRCHMLVMLCKSSPTCPNCKFL 201 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k~~p~CP~Ck~~ 201 (220)
..+|+|-|-.=..-.=-.+..|-||.||-+
T Consensus 24 G~pVvALCGk~wvp~rdp~~~PVCP~Ck~i 53 (58)
T PF11238_consen 24 GTPVVALCGKVWVPTRDPKPFPVCPECKEI 53 (58)
T ss_pred CceeEeeeCceeCCCCCCCCCCCCcCHHHH
Confidence 477888887532222223456999999854
No 68
>KOG0150 consensus Spliceosomal protein FBP21 [RNA processing and modification]
Probab=27.60 E-value=32 Score=32.87 Aligned_cols=22 Identities=36% Similarity=0.413 Sum_probs=19.9
Q ss_pred cccEEEEecCCCcccCCCCCCC
Q 027645 75 SGEIHFYNTRTHKKTSGDPRGS 96 (220)
Q Consensus 75 SG~iYY~N~~T~~stw~dPR~~ 96 (220)
+|-.||.|..|+.+.|..|+..
T Consensus 160 s~~~yy~n~~t~esvwk~P~~~ 181 (336)
T KOG0150|consen 160 SGPTYYSNKRTNESVWKPPRIS 181 (336)
T ss_pred CCCCcceecCCCccccCCCCcc
Confidence 5888999999999999999964
No 69
>PRK11032 hypothetical protein; Provisional
Probab=26.57 E-value=56 Score=28.07 Aligned_cols=30 Identities=23% Similarity=0.604 Sum_probs=20.2
Q ss_pred cceeeeeCccceeEeEeeC--CCCCCCCCCcC
Q 027645 172 QEMVATACMRCHMLVMLCK--SSPTCPNCKFL 201 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k--~~p~CP~Ck~~ 201 (220)
.++=...|.+|.--+-+.. .-|.||+|...
T Consensus 120 vg~G~LvC~~Cg~~~~~~~p~~i~pCp~C~~~ 151 (160)
T PRK11032 120 VGLGNLVCEKCHHHLAFYTPEVLPLCPKCGHD 151 (160)
T ss_pred eecceEEecCCCCEEEecCCCcCCCCCCCCCC
Confidence 3344456999976555544 55999999854
No 70
>KOG2846 consensus Predicted membrane protein [Function unknown]
Probab=26.09 E-value=32 Score=32.80 Aligned_cols=38 Identities=29% Similarity=0.623 Sum_probs=28.3
Q ss_pred CccceeeeeCccceeEe-EeeCC-----CCCCCCCCcCCCCCCC
Q 027645 170 DQQEMVATACMRCHMLV-MLCKS-----SPTCPNCKFLHPPDQS 207 (220)
Q Consensus 170 ~~~~mv~~gC~~ClmyV-M~~k~-----~p~CP~Ck~~~p~~~~ 207 (220)
+-..|-|.-|..|||-= |..+. .-.||.|+.+.|.-..
T Consensus 214 sP~~ryALIC~~C~~HNGla~~ee~~yi~F~C~~Cn~LN~~~k~ 257 (328)
T KOG2846|consen 214 SPNNRYALICSQCHHHNGLARKEEYEYITFRCPHCNALNPAKKS 257 (328)
T ss_pred CCcchhhhcchhhccccCcCChhhcCceEEECccccccCCCcCC
Confidence 44688999999999976 55442 2579999999874443
No 71
>COG1867 TRM1 N2,N2-dimethylguanosine tRNA methyltransferase [Translation, ribosomal structure and biogenesis]
Probab=25.97 E-value=36 Score=33.05 Aligned_cols=30 Identities=20% Similarity=0.411 Sum_probs=24.7
Q ss_pred ceeeeeCccc-eeEeEeeCCCCCCCCCCcCC
Q 027645 173 EMVATACMRC-HMLVMLCKSSPTCPNCKFLH 202 (220)
Q Consensus 173 ~mv~~gC~~C-lmyVM~~k~~p~CP~Ck~~~ 202 (220)
-+-..-|.+| ..+-+....+.+||+|...+
T Consensus 237 ~g~~~~c~~cg~~~~~~~~~~~~c~~Cg~~~ 267 (380)
T COG1867 237 LGYIYHCSRCGEIVGSFREVDEKCPHCGGKV 267 (380)
T ss_pred cCcEEEcccccceecccccccccCCcccccc
Confidence 4456889999 47888889999999999654
No 72
>PF13248 zf-ribbon_3: zinc-ribbon domain
Probab=25.70 E-value=26 Score=21.02 Aligned_cols=20 Identities=30% Similarity=0.760 Sum_probs=12.2
Q ss_pred eCccceeEeEeeCCCCCCCCCC
Q 027645 178 ACMRCHMLVMLCKSSPTCPNCK 199 (220)
Q Consensus 178 gC~~ClmyVM~~k~~p~CP~Ck 199 (220)
-|++|-- .+....--||+|.
T Consensus 4 ~Cp~Cg~--~~~~~~~fC~~CG 23 (26)
T PF13248_consen 4 FCPNCGA--EIDPDAKFCPNCG 23 (26)
T ss_pred CCcccCC--cCCcccccChhhC
Confidence 4666666 3455666677664
No 73
>smart00564 PQQ beta-propeller repeat. Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases.
Probab=25.44 E-value=51 Score=19.32 Aligned_cols=18 Identities=11% Similarity=0.100 Sum_probs=15.3
Q ss_pred CcccEEEEecCCCcccCC
Q 027645 74 QSGEIHFYNTRTHKKTSG 91 (220)
Q Consensus 74 qSG~iYY~N~~T~~stw~ 91 (220)
..|.+|-+|..||+..|.
T Consensus 14 ~~g~l~a~d~~~G~~~W~ 31 (33)
T smart00564 14 TDGTLYALDAKTGEILWT 31 (33)
T ss_pred CCCEEEEEEcccCcEEEE
Confidence 358899999999999885
No 74
>TIGR01384 TFS_arch transcription factor S, archaeal. There has been an apparent duplication event in the Halobacteriaceae lineage (Haloarcula, Haloferax, Haloquadratum, Halobacterium and Natronomonas). There appears to be a separate duplication in Methanosphaera stadtmanae.
Probab=25.27 E-value=28 Score=26.43 Aligned_cols=27 Identities=22% Similarity=0.586 Sum_probs=19.9
Q ss_pred eCccceeEeEeeCCCCCCCCCCcCCCC
Q 027645 178 ACMRCHMLVMLCKSSPTCPNCKFLHPP 204 (220)
Q Consensus 178 gC~~ClmyVM~~k~~p~CP~Ck~~~p~ 204 (220)
-|+.|............||+|.+....
T Consensus 2 fC~~Cg~~l~~~~~~~~C~~C~~~~~~ 28 (104)
T TIGR01384 2 FCPKCGSLMTPKNGVYVCPSCGYEKEK 28 (104)
T ss_pred CCcccCcccccCCCeEECcCCCCcccc
Confidence 388887766555567899999977553
No 75
>PF02150 RNA_POL_M_15KD: RNA polymerases M/15 Kd subunit; InterPro: IPR001529 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. It has recently been shown that small subunits of about 15 kDa, found in polymerase types I and II, are highly conserved. These proteins contain a probable zinc finger in their N-terminal region and a C-terminal zinc ribbon domain (see IPR001222 from INTERPRO).; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 3H0G_I 3M4O_I 3S14_I 2E2J_I 4A3J_I 3HOZ_I 1TWA_I 3S1Q_I 3S1N_I 1TWG_I ....
Probab=25.04 E-value=20 Score=23.25 Aligned_cols=28 Identities=21% Similarity=0.556 Sum_probs=14.9
Q ss_pred eCccceeEeEeeCCCCC---CCCCCcCCCCC
Q 027645 178 ACMRCHMLVMLCKSSPT---CPNCKFLHPPD 205 (220)
Q Consensus 178 gC~~ClmyVM~~k~~p~---CP~Ck~~~p~~ 205 (220)
-||.|--.....+.... |++|...++.+
T Consensus 3 FCp~C~nlL~p~~~~~~~~~C~~C~Y~~~~~ 33 (35)
T PF02150_consen 3 FCPECGNLLYPKEDKEKRVACRTCGYEEPIS 33 (35)
T ss_dssp BETTTTSBEEEEEETTTTEEESSSS-EEE-S
T ss_pred eCCCCCccceEcCCCccCcCCCCCCCccCCC
Confidence 37777433333333333 99998887754
No 76
>PF15232 DUF4585: Domain of unknown function (DUF4585)
Probab=25.04 E-value=48 Score=25.58 Aligned_cols=26 Identities=27% Similarity=0.264 Sum_probs=18.1
Q ss_pred ceecCcccEEEEecCCC--cccCCCCCC
Q 027645 70 CLDIQSGEIHFYNTRTH--KKTSGDPRG 95 (220)
Q Consensus 70 ~LDlqSG~iYY~N~~T~--~stw~dPR~ 95 (220)
-+|++||+-||+..--. .++.-||..
T Consensus 11 L~DP~SG~Yy~vd~P~Qp~~k~lfDPET 38 (75)
T PF15232_consen 11 LQDPESGQYYVVDAPVQPKTKTLFDPET 38 (75)
T ss_pred eecCCCCCEEEEecCCCcceeeeecCCC
Confidence 47999999999965443 455555544
No 77
>PF08274 PhnA_Zn_Ribbon: PhnA Zinc-Ribbon; InterPro: IPR013987 The PhnA protein family includes the uncharacterised Escherichia coli protein PhnA and its homologues. The E. coli phnA gene is part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. The protein is not related to the characterised phosphonoacetate hydrolase designated PhnA. This entry represents the N-terminal domain of PhnA, which is predicted to form a zinc-ribbon.; PDB: 2AKL_A.
Probab=24.97 E-value=29 Score=22.18 Aligned_cols=11 Identities=36% Similarity=0.736 Sum_probs=3.4
Q ss_pred CCCCCCCCcCC
Q 027645 192 SPTCPNCKFLH 202 (220)
Q Consensus 192 ~p~CP~Ck~~~ 202 (220)
-|+||.|.+-.
T Consensus 2 ~p~Cp~C~se~ 12 (30)
T PF08274_consen 2 LPKCPLCGSEY 12 (30)
T ss_dssp S---TTT----
T ss_pred CCCCCCCCCcc
Confidence 48999998654
No 78
>TIGR02300 FYDLN_acid conserved hypothetical protein TIGR02300. Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=24.94 E-value=46 Score=27.96 Aligned_cols=29 Identities=17% Similarity=0.292 Sum_probs=22.7
Q ss_pred eeCccceeEeE-eeCCCCCCCCCCcCCCCC
Q 027645 177 TACMRCHMLVM-LCKSSPTCPNCKFLHPPD 205 (220)
Q Consensus 177 ~gC~~ClmyVM-~~k~~p~CP~Ck~~~p~~ 205 (220)
-.|+.|---+. |.|.-..||+|...+++.
T Consensus 10 r~Cp~cg~kFYDLnk~p~vcP~cg~~~~~~ 39 (129)
T TIGR02300 10 RICPNTGSKFYDLNRRPAVSPYTGEQFPPE 39 (129)
T ss_pred ccCCCcCccccccCCCCccCCCcCCccCcc
Confidence 35899976554 678888999999988765
No 79
>COG2995 PqiA Uncharacterized paraquat-inducible protein A [Function unknown]
Probab=24.24 E-value=46 Score=32.77 Aligned_cols=31 Identities=32% Similarity=0.575 Sum_probs=23.9
Q ss_pred cceeeeeCccceeEeEeeC----CCCCCCCCCcCC
Q 027645 172 QEMVATACMRCHMLVMLCK----SSPTCPNCKFLH 202 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k----~~p~CP~Ck~~~ 202 (220)
..-=...|+.|-|.|=+.. ..--||||+-..
T Consensus 14 ~~~~~~~C~eCd~~~~~P~l~~~q~A~CPRC~~~l 48 (418)
T COG2995 14 PPGHLILCPECDMLVSLPRLDSGQSAYCPRCGHTL 48 (418)
T ss_pred CccceecCCCCCceeccccCCCCCcccCCCCCCcc
Confidence 3445689999999998875 447899999544
No 80
>PF11023 DUF2614: Protein of unknown function (DUF2614); InterPro: IPR020912 This entry describes proteins of unknown function, which are thought to be membrane proteins.; GO: 0005887 integral to plasma membrane
Probab=23.73 E-value=41 Score=27.76 Aligned_cols=32 Identities=28% Similarity=0.493 Sum_probs=24.6
Q ss_pred eeeeeCccceeEe-EeeCCCCCCCCCCcCCCCCC
Q 027645 174 MVATACMRCHMLV-MLCKSSPTCPNCKFLHPPDQ 206 (220)
Q Consensus 174 mv~~gC~~ClmyV-M~~k~~p~CP~Ck~~~p~~~ 206 (220)
-|.+-||.|.=.- |+-+.| .|+.|+.-.-.|+
T Consensus 67 av~V~CP~C~K~TKmLGr~D-~CM~C~~pLTLd~ 99 (114)
T PF11023_consen 67 AVQVECPNCGKQTKMLGRVD-ACMHCKEPLTLDP 99 (114)
T ss_pred ceeeECCCCCChHhhhchhh-ccCcCCCcCccCc
Confidence 3777899999877 677776 9999996555444
No 81
>COG0375 HybF Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) [General function prediction only]
Probab=22.59 E-value=43 Score=27.43 Aligned_cols=31 Identities=26% Similarity=0.489 Sum_probs=24.8
Q ss_pred ceeeeeCccceeEeEeeCCCCCCCCCCcCCC
Q 027645 173 EMVATACMRCHMLVMLCKSSPTCPNCKFLHP 203 (220)
Q Consensus 173 ~mv~~gC~~ClmyVM~~k~~p~CP~Ck~~~p 203 (220)
.-+..-|..|.-++-+-..+-.||+|.+...
T Consensus 67 ~p~~~~C~~C~~~~~~e~~~~~CP~C~s~~~ 97 (115)
T COG0375 67 EPAECWCLDCGQEVELEELDYRCPKCGSINL 97 (115)
T ss_pred eccEEEeccCCCeecchhheeECCCCCCCce
Confidence 3467789999888888888888999997654
No 82
>PF14435 SUKH-4: SUKH-4 immunity protein
Probab=22.22 E-value=59 Score=26.55 Aligned_cols=21 Identities=19% Similarity=0.309 Sum_probs=17.1
Q ss_pred cceeceecCcccEEEEecCCC
Q 027645 66 EWERCLDIQSGEIHFYNTRTH 86 (220)
Q Consensus 66 gWEq~LDlqSG~iYY~N~~T~ 86 (220)
+..=|||..||+|||++....
T Consensus 86 ~~~i~ld~~tG~V~~~~~~~~ 106 (179)
T PF14435_consen 86 GGSICLDPATGAVYALDPDEE 106 (179)
T ss_pred CCeEEEECCCCeEEEecCCcc
Confidence 555599999999999977764
No 83
>COG2995 PqiA Uncharacterized paraquat-inducible protein A [Function unknown]
Probab=21.59 E-value=40 Score=33.16 Aligned_cols=25 Identities=28% Similarity=0.830 Sum_probs=19.5
Q ss_pred eeeeCccceeEeEeeCCCCCCCCCCc
Q 027645 175 VATACMRCHMLVMLCKSSPTCPNCKF 200 (220)
Q Consensus 175 v~~gC~~ClmyVM~~k~~p~CP~Ck~ 200 (220)
-...|.+|+..-.- ++.+.||||..
T Consensus 219 ~~~~C~~C~~~~~~-~~~~~CpRC~~ 243 (418)
T COG2995 219 GLRSCLCCHYILPH-DAEPRCPRCGS 243 (418)
T ss_pred cceecccccccCCH-hhCCCCCCCCC
Confidence 46789999976554 48899999985
No 84
>PF15163 Meiosis_expr: Meiosis-expressed
Probab=21.36 E-value=44 Score=25.91 Aligned_cols=56 Identities=16% Similarity=0.255 Sum_probs=37.2
Q ss_pred hhhhhhhhhhcccccccCCCCCcCCcccccc--ccccc-CCCCCc-cceeceecCcccEEEEecC
Q 027645 24 KKRKLEEIIQGEGTFDKRSSKGVTTKSSIFD--IELQL-ETPLPL-EWERCLDIQSGEIHFYNTR 84 (220)
Q Consensus 24 kkrkwee~~~~~~~~~~~~~~~~~krk~i~~--iEL~l-~~pLPs-gWEq~LDlqSG~iYY~N~~ 84 (220)
|-++|.+ .+ |..|.-| .+|=|..|.. |+-.- ..-+|. |+-+.|-...|-.||+|..
T Consensus 3 RaK~Ws~-ev-E~~yRfQ---~AGyRDe~EY~~vk~~~~~erWp~~GfVkklqr~dg~f~Y~nk~ 62 (77)
T PF15163_consen 3 RAKKWSE-EV-ENAYRFQ---QAGYRDEIEYKQVKQVEEPERWPETGFVKKLQRKDGTFYYYNKK 62 (77)
T ss_pred cccccCH-HH-HHHHHHH---HcccccHHHHhhccccCccccccccCceeeEEecCceEEEecCc
Confidence 4567876 44 4555533 3566666653 33222 255888 9999999999999999864
No 85
>PF06677 Auto_anti-p27: Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27); InterPro: IPR009563 The proteins in this entry are functionally uncharacterised and include several proteins that characterise Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis.
Probab=21.09 E-value=53 Score=22.24 Aligned_cols=28 Identities=25% Similarity=0.465 Sum_probs=21.8
Q ss_pred cceeeeeCccceeEeEeeC-CCCCCCCCC
Q 027645 172 QEMVATACMRCHMLVMLCK-SSPTCPNCK 199 (220)
Q Consensus 172 ~~mv~~gC~~ClmyVM~~k-~~p~CP~Ck 199 (220)
..|...-|+.|-+-.|=.+ ..--||.|.
T Consensus 13 ~~ML~~~Cp~C~~PL~~~k~g~~~Cv~C~ 41 (41)
T PF06677_consen 13 WTMLDEHCPDCGTPLMRDKDGKIYCVSCG 41 (41)
T ss_pred HhHhcCccCCCCCeeEEecCCCEECCCCC
Confidence 5788999999987777633 557899884
No 86
>PF03604 DNA_RNApol_7kD: DNA directed RNA polymerase, 7 kDa subunit; InterPro: IPR006591 DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. Each class of RNA polymerase is assembled from 9 to 15 different polypeptides. Rbp10 (RNA polymerase CX) is a domain found in RNA polymerase subunit 10; present in RNA polymerase I, II and III.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 2PMZ_Z 3HKZ_X 2NVX_L 3S1Q_L 2JA6_L 3S17_L 3HOW_L 3HOV_L 3PO2_L 3HOZ_L ....
Probab=20.31 E-value=49 Score=21.32 Aligned_cols=23 Identities=30% Similarity=0.626 Sum_probs=14.7
Q ss_pred CccceeEeEeeCCC-CCCCCCCcC
Q 027645 179 CMRCHMLVMLCKSS-PTCPNCKFL 201 (220)
Q Consensus 179 C~~ClmyVM~~k~~-p~CP~Ck~~ 201 (220)
|..|..-|-+...+ -+||.|.+-
T Consensus 3 C~~Cg~~~~~~~~~~irC~~CG~R 26 (32)
T PF03604_consen 3 CGECGAEVELKPGDPIRCPECGHR 26 (32)
T ss_dssp ESSSSSSE-BSTSSTSSBSSSS-S
T ss_pred CCcCCCeeEcCCCCcEECCcCCCe
Confidence 67777777766544 578888753
No 87
>PF13717 zinc_ribbon_4: zinc-ribbon domain
Probab=20.27 E-value=51 Score=21.35 Aligned_cols=12 Identities=17% Similarity=0.456 Sum_probs=6.5
Q ss_pred eCccceeEeEee
Q 027645 178 ACMRCHMLVMLC 189 (220)
Q Consensus 178 gC~~ClmyVM~~ 189 (220)
.|++|..-.-|.
T Consensus 4 ~Cp~C~~~y~i~ 15 (36)
T PF13717_consen 4 TCPNCQAKYEID 15 (36)
T ss_pred ECCCCCCEEeCC
Confidence 455565555544
No 88
>PF10891 DUF2719: Protein of unknown function (DUF2719); InterPro: IPR020122 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf56; it is a family of uncharacterised viral proteins.
Probab=20.16 E-value=40 Score=26.31 Aligned_cols=11 Identities=45% Similarity=1.228 Sum_probs=8.4
Q ss_pred CCCCCCcCCCC
Q 027645 194 TCPNCKFLHPP 204 (220)
Q Consensus 194 ~CP~Ck~~~p~ 204 (220)
.||+|+|+-|.
T Consensus 24 ~C~~C~FVAPm 34 (81)
T PF10891_consen 24 YCPKCYFVAPM 34 (81)
T ss_pred Eccccceeccc
Confidence 58888888763
Done!