Query 041094
Match_columns 523
No_of_seqs 31 out of 33
Neff 2.1
Searched_HMMs 46136
Date Fri Mar 29 03:46:03 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/041094.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/041094hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF14237 DUF4339: Domain of un 99.4 9.8E-14 2.1E-18 101.4 3.3 45 243-288 1-45 (45)
2 cd00072 GYF GYF domain: contai 97.1 0.00087 1.9E-08 52.4 4.6 50 243-292 3-53 (57)
3 PF02213 GYF: GYF domain; Int 96.3 0.0022 4.7E-08 49.3 1.9 47 243-289 2-49 (57)
4 KOG1789 Endocytosis protein RM 96.1 0.0032 7E-08 72.6 2.7 79 237-317 951-1036(2235)
5 smart00444 GYF Contains conser 95.6 0.018 4E-07 44.9 4.2 50 243-292 2-51 (56)
6 COG4969 PilA Tfp pilus assembl 91.6 0.064 1.4E-06 47.7 0.5 59 353-424 3-63 (125)
7 KOG1862 GYF domain containing 73.7 3.1 6.7E-05 45.6 3.4 71 240-310 202-282 (673)
8 PRK10574 putative major pilin 66.8 1.5 3.3E-05 40.1 -0.6 40 378-423 18-57 (146)
9 PLN00122 serine/threonine prot 47.6 1.1E+02 0.0024 29.2 8.3 40 444-493 106-149 (170)
10 KOG1532 GTPase XAB1, interacts 45.7 1.3E+02 0.0027 32.2 8.9 78 438-515 194-300 (366)
11 PF09130 DUF1932: Domain of un 44.0 44 0.00096 26.9 4.4 40 458-497 4-44 (73)
12 TIGR03639 cas1_NMENI CRISPR-as 43.5 47 0.001 32.9 5.3 74 190-273 149-227 (278)
13 PF11740 KfrA_N: Plasmid repli 42.8 43 0.00093 28.1 4.3 21 434-454 32-52 (120)
14 TIGR03640 cas1_DVULG CRISPR-as 38.3 91 0.002 31.6 6.5 91 190-280 167-276 (340)
15 PF11221 Med21: Subunit 21 of 36.0 2.7E+02 0.0059 25.2 8.5 29 449-477 69-97 (144)
16 PTZ00384 choline kinase; Provi 35.7 41 0.00089 34.7 3.7 99 135-238 257-372 (383)
17 COG4968 PilE Tfp pilus assembl 32.8 13 0.00027 35.0 -0.4 39 378-422 19-57 (139)
18 TIGR03641 cas1_HMARI CRISPR-as 31.8 1E+02 0.0022 31.2 5.6 92 190-281 153-260 (322)
19 KOG2505 Ankyrin repeat protein 30.7 1.7E+02 0.0037 33.0 7.4 51 415-475 450-500 (591)
20 PF01473 CW_binding_1: Putativ 29.6 34 0.00075 21.4 1.3 14 239-252 5-18 (19)
21 TIGR01710 typeII_sec_gspG gene 28.2 28 0.00061 30.6 1.0 49 378-439 14-62 (134)
22 PF05491 RuvB_C: Holliday junc 26.9 69 0.0015 27.5 3.0 35 224-268 3-37 (76)
23 PF10777 YlaC: Inner membrane 25.6 22 0.00048 34.0 -0.1 10 421-430 97-106 (155)
24 KOG2636 Splicing factor 3a, su 24.7 58 0.0013 35.9 2.7 30 144-181 361-390 (497)
25 PF11214 Med2: Mediator comple 22.4 1.9E+02 0.0041 26.4 5.0 29 455-483 37-65 (105)
26 PRK09819 alpha-mannosidase; Pr 21.9 5.7E+02 0.012 29.3 9.7 86 420-508 263-374 (875)
27 PF08621 RPAP1_N: RPAP1-like, 20.6 1.4E+02 0.003 23.5 3.4 29 108-137 12-41 (49)
28 COG3012 Uncharacterized protei 20.5 48 0.001 31.7 1.0 20 236-255 114-133 (151)
No 1
>PF14237 DUF4339: Domain of unknown function (DUF4339)
Probab=99.42 E-value=9.8e-14 Score=101.36 Aligned_cols=45 Identities=40% Similarity=0.956 Sum_probs=43.6
Q ss_pred ceEeecCCCCCCCCCchHHHHHHHhcCcccCCcccccCCCCcchhh
Q 041094 243 GWYYKDRLGRTRGPLELITLKTAWGAGIIDKDTFIWGEDMDEWVPI 288 (523)
Q Consensus 243 ~WYYaDr~rq~RGPv~l~tLr~~w~~GIID~dTLVWgEGLdqWvPL 288 (523)
.|||.+ +|+++||+++++|++++++|.|+.+||||++||++|.||
T Consensus 1 ~Wy~~~-~g~~~GP~s~~el~~l~~~g~i~~~tlvw~~g~~~W~pl 45 (45)
T PF14237_consen 1 EWYYAR-NGQQQGPFSLEELRQLISSGEIDPDTLVWKEGMSDWKPL 45 (45)
T ss_pred CEEEeC-CCeEECCcCHHHHHHHHHcCCCCCCCeEeCCChhhceEC
Confidence 599999 999999999999999999999999999999999999986
No 2
>cd00072 GYF GYF domain: contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding and other proteins. Involved in signaling lymphocyte activity. Also present in other unrelated proteins (mainly unknown) derived from diverse eukaryotic species.
Probab=97.07 E-value=0.00087 Score=52.36 Aligned_cols=50 Identities=20% Similarity=0.418 Sum_probs=46.9
Q ss_pred ceEeecCCCCCCCCCchHHHHHHHhcCcccCCcccccC-CCCcchhhhhhh
Q 041094 243 GWYYKDRLGRTRGPLELITLKTAWGAGIIDKDTFIWGE-DMDEWVPIHMVY 292 (523)
Q Consensus 243 ~WYYaDr~rq~RGPv~l~tLr~~w~~GIID~dTLVWgE-GLdqWvPL~~V~ 292 (523)
.|||+|-.|+.|||-+..+|..=+.+|-...+-.|.+. .-.+|.||..+-
T Consensus 3 ~W~Y~d~~g~vqGPF~~~~M~~W~~~gyF~~~l~vr~~~~~~~f~~l~~~~ 53 (57)
T cd00072 3 QWFYKDPQGEIQGPFSASQMLQWYQAGYFPDGLQVRRLDNGGEFYTLGDIL 53 (57)
T ss_pred EEEEECCCCCCcCCcCHHHHHHHHHCCCCCCCeEEEECCCCCCcEEHHHHH
Confidence 59999999999999999999999999999999999999 557999998875
No 3
>PF02213 GYF: GYF domain; InterPro: IPR003169 The glycine-tyrosine-phenylalanine (GYF) domain is an around 60-amino acid domain which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important.; GO: 0005515 protein binding; PDB: 1SYX_F 1L2Z_A 1GYF_A 1WH2_A 3FMA_C 3K3V_A.
Probab=96.32 E-value=0.0022 Score=49.30 Aligned_cols=47 Identities=26% Similarity=0.499 Sum_probs=39.4
Q ss_pred ceEeecCCCCCCCCCchHHHHHHHhcCcccCCcccccCCCCcc-hhhh
Q 041094 243 GWYYKDRLGRTRGPLELITLKTAWGAGIIDKDTFIWGEDMDEW-VPIH 289 (523)
Q Consensus 243 ~WYYaDr~rq~RGPv~l~tLr~~w~~GIID~dTLVWgEGLdqW-vPL~ 289 (523)
.|||+|..|..+||-+..+|..=+.+|-...+..|++.+-.++ .|+.
T Consensus 2 ~W~Y~d~~g~~qGPf~~~~M~~W~~~gyF~~~l~vr~~~~~~~~~~~~ 49 (57)
T PF02213_consen 2 MWYYKDPDGNIQGPFSSEQMQAWYKQGYFPDDLQVRRVDDTQFIDPFG 49 (57)
T ss_dssp EEEEESTTS-EEEEEEHHHHHHHHHTTSSTTT-EEEETTSTTT--SSC
T ss_pred EeEEECCCCCcCCCcCHHHHHHHHHCCCCCCCcEEEEecCCCCcccch
Confidence 5999999999999999999999999999999999999976555 4443
No 4
>KOG1789 consensus Endocytosis protein RME-8, contains DnaJ domain [Intracellular trafficking, secretion, and vesicular transport; Posttranslational modification, protein turnover, chaperones]
Probab=96.12 E-value=0.0032 Score=72.64 Aligned_cols=79 Identities=22% Similarity=0.447 Sum_probs=64.9
Q ss_pred HHHhcCceEeecCCCCCCCCCchHHHHHHHhcCcccCCcccccCCCCcchhhhhhhhcc-------cccchhhhhhhhhH
Q 041094 237 FIMRSGGWYYKDRLGRTRGPLELITLKTAWGAGIIDKDTFIWGEDMDEWVPIHMVYGLE-------KAIATWEVRLGAAA 309 (523)
Q Consensus 237 ~imr~n~WYYaDr~rq~RGPv~l~tLr~~w~~GIID~dTLVWgEGLdqWvPL~~V~gLe-------p~I~T~EVr~aA~~ 309 (523)
|.|..-.|||.|..|..-||++.+.+|.+|..--||..|-+|--||++|--|+.|..+- |..+.-+| +-++
T Consensus 951 ~~~~~~ew~y~dk~~~~vgp~~~~~~~sl~s~k~i~~~s~~~a~gm~~w~~l~~i~~~rw~v~~~ipv~~~s~~--~~~~ 1028 (2235)
T KOG1789|consen 951 EQMAEEEWYYHDKDAKQVGPLSFEKMKSLYTEKTIFEKSQIWAAGMDKWMSLAAVPQFRWTVCQQIPVMNFTDL--SVLC 1028 (2235)
T ss_pred hhcCchhheeecCCccccCchhHHHHHHHhcccchhHHHHHHHhhhhHHHhhhhhhhhhhhhhhcccccCHHHH--HHHH
Confidence 66778899999999999999999999999999999999999999999999999999754 22233233 3444
Q ss_pred HHHHhhhh
Q 041094 310 TAFLHKLQ 317 (523)
Q Consensus 310 tAf~hKL~ 317 (523)
...||..-
T Consensus 1029 l~~L~~Mc 1036 (2235)
T KOG1789|consen 1029 LDTLLQMC 1036 (2235)
T ss_pred HHHHHHHH
Confidence 66666654
No 5
>smart00444 GYF Contains conserved Gly-Tyr-Phe residues. Proline-binding domain in CD2-binding protein.
Probab=95.61 E-value=0.018 Score=44.88 Aligned_cols=50 Identities=20% Similarity=0.296 Sum_probs=45.3
Q ss_pred ceEeecCCCCCCCCCchHHHHHHHhcCcccCCcccccCCCCcchhhhhhh
Q 041094 243 GWYYKDRLGRTRGPLELITLKTAWGAGIIDKDTFIWGEDMDEWVPIHMVY 292 (523)
Q Consensus 243 ~WYYaDr~rq~RGPv~l~tLr~~w~~GIID~dTLVWgEGLdqWvPL~~V~ 292 (523)
.|+|+|-.|+.+||-+..+|..=+.+|-...+=.|++.+-.+..||..+.
T Consensus 2 ~W~Y~d~~~~iqGPf~~~~M~~W~~~gyF~~~l~vr~~~~~~~~~l~~~~ 51 (56)
T smart00444 2 LWLYKDPDGEIQGPFTASQMSQWYQAGYFPDSLQIKRLNEPPYDTLGDLD 51 (56)
T ss_pred EEEEECCCCCEeCCcCHHHHHHHHHCCCCCCCeEEEEcCCCCCCcchhhh
Confidence 69999999999999999999999999999999999999888777666554
No 6
>COG4969 PilA Tfp pilus assembly protein, major pilin PilA [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=91.56 E-value=0.064 Score=47.74 Aligned_cols=59 Identities=12% Similarity=0.101 Sum_probs=48.1
Q ss_pred HHhCCcccc--eeccchhhhhhhccchhhhHhhhccCCCCCCchHHHHHHHhhcCCCCCcchhhHHHhhhhhhc
Q 041094 353 QANGGVWPG--VRIPSHALFLWASGSELTTMLEADHMPNKFIPKDLRLQLSKIIPGLRPWEVLSVEQAMDQITY 424 (523)
Q Consensus 353 ~~nggvwpg--v~ipsHalFLWAsGselt~iLaaiamPnkYi~y~~R~klAk~ip~LrP~evlsiEq~MD~ity 424 (523)
.++|+-|+. +.+. +++||++|++| .|+.|..|.+|..++....|.-....+-++..-.+
T Consensus 3 ~q~GFtLiElmivi~------------Ii~iLaaIaiP-~YQ~y~~k~~v~~al~~~~~~k~~ie~~~ls~~~~ 63 (125)
T COG4969 3 KQKGFTLIELMIVLA------------IIAILAAIAIP-LYQNYVARAQVMAALADITPGRTTVEIILLEPGAG 63 (125)
T ss_pred ccCcceehHHHHHHH------------HHHHHHHhhhh-HHHHHHHHHHHHHHHHhhccchhHHHHHHhccCCC
Confidence 356777877 5566 89999999999 99999999999999999999877666666665444
No 7
>KOG1862 consensus GYF domain containing proteins [General function prediction only]
Probab=73.68 E-value=3.1 Score=45.60 Aligned_cols=71 Identities=20% Similarity=0.254 Sum_probs=55.5
Q ss_pred hcCceEeecCCCCCCCCCchHHHHHHHhcCcccCCcccccCCCCc---chhhhhhhh-------cccccchhhhhhhhhH
Q 041094 240 RSGGWYYKDRLGRTRGPLELITLKTAWGAGIIDKDTFIWGEDMDE---WVPIHMVYG-------LEKAIATWEVRLGAAA 309 (523)
Q Consensus 240 r~n~WYYaDr~rq~RGPv~l~tLr~~w~~GIID~dTLVWgEGLdq---WvPL~~V~g-------Lep~I~T~EVr~aA~~ 309 (523)
..-+|||.|-.|+.|||.+..+|..=+.+|-.--+..|+...=.+ -.+|..+-+ .+.+|.+....+-.++
T Consensus 202 ~d~~~~Y~DP~g~iqGPf~~~~v~~W~~~GyF~~~l~vr~~e~~~~~~f~tl~~~~~~l~~~~~~~~p~~~~~~~~~~~~ 281 (673)
T KOG1862|consen 202 EELSWLYKDPQGQIQGPFSASDVLQWYEAGYFPDDLQVRLGENPERSIFQTLGEVMQLLKTRTGQEAPVPTQYSDSEFSN 281 (673)
T ss_pred cceeEEeeCCCCcccCCchHHHHHHHHhcCccCCCceeeeccCCccccceehhhhhhhcccccCccCCCcCccccchhhc
Confidence 456899999999999999999999999999998888888877777 666555543 2266777666665544
Q ss_pred H
Q 041094 310 T 310 (523)
Q Consensus 310 t 310 (523)
+
T Consensus 282 ~ 282 (673)
T KOG1862|consen 282 F 282 (673)
T ss_pred c
Confidence 4
No 8
>PRK10574 putative major pilin subunit; Provisional
Probab=66.84 E-value=1.5 Score=40.08 Aligned_cols=40 Identities=13% Similarity=0.262 Sum_probs=33.0
Q ss_pred hhhHhhhccCCCCCCchHHHHHHHhhcCCCCCcchhhHHHhhhhhh
Q 041094 378 LTTMLEADHMPNKFIPKDLRLQLSKIIPGLRPWEVLSVEQAMDQIT 423 (523)
Q Consensus 378 lt~iLaaiamPnkYi~y~~R~klAk~ip~LrP~evlsiEq~MD~it 423 (523)
+++||+++++| .|++|..|+++++++..+ .+++.+++.-.
T Consensus 18 IigILaaiaiP-~~~~~~~~a~~~~~~~~~-----~~~~~av~~~~ 57 (146)
T PRK10574 18 IIAILSAIGIP-AYQNYLQKAALTDMLQTF-----VPYKTAVELCA 57 (146)
T ss_pred HHHHHHHHHHH-HHHHHHHHHHHHHHHHHH-----HHHHHHHHHHH
Confidence 78999999999 999999999999988764 56666666543
No 9
>PLN00122 serine/threonine protein phosphatase 2A; Provisional
Probab=47.61 E-value=1.1e+02 Score=29.22 Aligned_cols=40 Identities=8% Similarity=0.334 Sum_probs=23.6
Q ss_pred hhhhHHHHHHH----HHHHHHHHHHHHHHhhcCCChHHHHHHHHhhHHHHHHHH
Q 041094 444 REWNVDILEFV----RVLDILRTGALNSLLEKVPGFNTIVERLKAENEAKEARR 493 (523)
Q Consensus 444 ~~wn~dv~~l~----~~~~~l~~~~~~~l~~~~pgfd~i~~kv~~d~~~r~~~~ 493 (523)
.|||+.|+-|. ++|-.+...+ ||.+..+.+++...+.++.
T Consensus 106 ~HWN~~V~~lt~nvlK~f~emD~~L----------F~ec~~~~ke~~~~~~~~~ 149 (170)
T PLN00122 106 GHWNQAVHGLTLNVRKMFSEMDPEL----------FEECLRKFEEDEAKAKEVE 149 (170)
T ss_pred HHhhHHHHHHHHHHHHHHHHhCHHH----------HHHHHHHHHHHHHHHHHHH
Confidence 69999998663 4444444444 4555566665544444433
No 10
>KOG1532 consensus GTPase XAB1, interacts with DNA repair protein XPA [Replication, recombination and repair]
Probab=45.65 E-value=1.3e+02 Score=32.19 Aligned_cols=78 Identities=18% Similarity=0.314 Sum_probs=45.8
Q ss_pred CCCchhhhhhHHHHHHHHHHHH--------HHH-------HHHHHHh------hcCCChHHHHHHHHhhHHHHHH-----
Q 041094 438 TGPPYIREWNVDILEFVRVLDI--------LRT-------GALNSLL------EKVPGFNTIVERLKAENEAKEA----- 491 (523)
Q Consensus 438 t~p~y~~~wn~dv~~l~~~~~~--------l~~-------~~~~~l~------~~~pgfd~i~~kv~~d~~~r~~----- 491 (523)
.++.|..+|-.|+.++-..+.. |.+ ..|+.|. -|=-|||..+..|.+-.+.-..
T Consensus 194 ~d~~fa~eWm~DfE~FqeAl~~~~~~y~s~l~~SmSL~leeFY~~lrtv~VSs~tG~G~ddf~~av~~~vdEy~~~ykp~ 273 (366)
T KOG1532|consen 194 SDSEFALEWMTDFEAFQEALNEAESSYMSNLTRSMSLMLEEFYRSLRTVGVSSVTGEGFDDFFTAVDESVDEYEEEYKPE 273 (366)
T ss_pred cccHHHHHHHHHHHHHHHHHHhhccchhHHhhhhHHHHHHHHHhhCceEEEecccCCcHHHHHHHHHHHHHHHHHHhhhH
Confidence 5678899999888776555543 222 2233332 2346999999988775544332
Q ss_pred ---HHHHHHHHHHHHHHHHhhHHHhhc
Q 041094 492 ---RREKRMEALIKEKQEKMDAMVKAK 515 (523)
Q Consensus 492 ---~~~~~~~~~~~~~~~~~~~~~~~~ 515 (523)
++.+||.++.+++++.++...|-.
T Consensus 274 ~Ek~k~~k~~~ee~~k~k~le~l~kdm 300 (366)
T KOG1532|consen 274 YEKKKAEKRLAEEERKKKQLEKLMKDM 300 (366)
T ss_pred HHHHHHHHHHHHHHhhhhhHHHHHhcc
Confidence 333444444555556666665543
No 11
>PF09130 DUF1932: Domain of unknown function (DUF1932); InterPro: IPR015814 This domain has been found in a number of eukaryotic and prokaryotic proteins, some of which are predicted to be 6-phosphogluconate dehydrogenase, NAD-binding proteins.; PDB: 3QSG_A 1I36_A 4EZB_A.
Probab=44.00 E-value=44 Score=26.87 Aligned_cols=40 Identities=30% Similarity=0.408 Sum_probs=28.7
Q ss_pred HHHHHHHHHHHhhcCCChH-HHHHHHHhhHHHHHHHHHHHH
Q 041094 458 DILRTGALNSLLEKVPGFN-TIVERLKAENEAKEARREKRM 497 (523)
Q Consensus 458 ~~l~~~~~~~l~~~~pgfd-~i~~kv~~d~~~r~~~~~~~~ 497 (523)
+++...|.+.|..++||++ .++++.-.....+..||-..+
T Consensus 4 ~Gv~~~ll~sl~~s~p~~~~~~~~~~v~~~~~hA~Rr~~EM 44 (73)
T PF09130_consen 4 YGVEDELLASLAESFPGLDWALAERLVPRMAPHAYRRAAEM 44 (73)
T ss_dssp TT-HHHHHHHHHHHSCCSCHHHHHHHHHHHHHHHHHHHHHH
T ss_pred cccHHHHHHHHHHHCCcchHHHHHHHcccchhhHHHHHHHH
Confidence 3567789999999999999 777776666665555554443
No 12
>TIGR03639 cas1_NMENI CRISPR-associated endonuclease Cas1, NMENI subtype. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 variant of the NMENI subtype of CRISPR/Cas system.
Probab=43.49 E-value=47 Score=32.89 Aligned_cols=74 Identities=22% Similarity=0.387 Sum_probs=55.1
Q ss_pred hchhHHHHHHHHhhhhhccccCCCCCCCcchhhhhhhhhhhHHHHHHHHHhcC-----ceEeecCCCCCCCCCchHHHHH
Q 041094 190 SDDKFWDFMKQFQFGLWGFRQRPYPPGRPIDVAQAIGYKRLEKRYYDFIMRSG-----GWYYKDRLGRTRGPLELITLKT 264 (523)
Q Consensus 190 ~~~kfwDF~kqf~fGlWg~rQRPyPp~rPidvaQaigyk~l~kry~d~imr~n-----~WYYaDr~rq~RGPv~l~tLr~ 264 (523)
+...+|.=+ |+- +|+-+| ||.-|+.++=++||.-|.......|...| +.+++++.|++ .+. -.|-+
T Consensus 149 aA~~Yf~~~----~~~-~f~R~~-p~~DpvNa~LsygY~iL~~~v~~al~~~GLdP~iGflH~~~~~r~--sLa-~DLmE 219 (278)
T TIGR03639 149 AAKLYFKTL----FGE-DFSRDD-DGEDPINAALNYGYAILRSAVARALVKSGLLPRLGIFHKSEYNPF--NLA-DDLME 219 (278)
T ss_pred HHHHHHHHH----ccC-CCccCC-CCCCchhhHHHHHHHHHHHHHHHHHHHcCCCcccccccCCCCCCc--chh-Hhhhh
Confidence 555666633 332 677566 99999999999999999999999999887 78888887776 332 34556
Q ss_pred HHhcCcccC
Q 041094 265 AWGAGIIDK 273 (523)
Q Consensus 265 ~w~~GIID~ 273 (523)
.|+- +||+
T Consensus 220 ~FRp-iVD~ 227 (278)
T TIGR03639 220 PFRP-LVDY 227 (278)
T ss_pred hhHH-HHHH
Confidence 6775 6654
No 13
>PF11740 KfrA_N: Plasmid replication region DNA-binding N-term; InterPro: IPR021104 The KfrA family of proteins are encoded on plasmids, generally in or near gene clusters involved in stable inheritance functions. These proteins are thought to form an all-helical structure, consisting of an N-terminal helix-turn-helix DNA binding domain and an extended coiled-coil tail. The best-characterised KfrA protein, encoded on the broad host-range Plasmid RK2, is a site-specific DNA-binding protein whose operator overlaps its own promoter. The DNA-binding domain is essential for function, while the coiled-coil domain is probably responsible for formation of multimers, and may provide an example of a bridge to host structures required for plasmid partitioning. This entry represents the N-terminal DNA-binding domain.
Probab=42.76 E-value=43 Score=28.11 Aligned_cols=21 Identities=29% Similarity=0.392 Sum_probs=18.5
Q ss_pred CCCCCCCchhhhhhHHHHHHH
Q 041094 434 GSHTTGPPYIREWNVDILEFV 454 (523)
Q Consensus 434 gs~tt~p~y~~~wn~dv~~l~ 454 (523)
||++|=-.|+++|........
T Consensus 32 GS~~ti~~~l~~w~~~~~~~~ 52 (120)
T PF11740_consen 32 GSMSTISKHLKEWREEREAQV 52 (120)
T ss_pred CCHHHHHHHHHHHHHhhhccc
Confidence 799999999999998877665
No 14
>TIGR03640 cas1_DVULG CRISPR-associated endonuclease Cas1, DVULG subtype. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the DVULG subtype of CRISPR/Cas system.
Probab=38.34 E-value=91 Score=31.57 Aligned_cols=91 Identities=20% Similarity=0.303 Sum_probs=68.6
Q ss_pred hchhHHHHHHHHhhh---hhccccCCC-CCCCcchhhhhhhhhhhHHHHHHHHHhcC-----ceEeecCCCCC-------
Q 041094 190 SDDKFWDFMKQFQFG---LWGFRQRPY-PPGRPIDVAQAIGYKRLEKRYYDFIMRSG-----GWYYKDRLGRT------- 253 (523)
Q Consensus 190 ~~~kfwDF~kqf~fG---lWg~rQRPy-Pp~rPidvaQaigyk~l~kry~d~imr~n-----~WYYaDr~rq~------- 253 (523)
+...+|.-+.+.+=. =|+|.-|-. ||.-|+.++=++||.-|.......|...| +.++.++.|++
T Consensus 167 aa~~Yf~~l~~~l~~~~~~f~F~~R~rrpp~DpvNalLsygYtlL~~~v~~ai~~~GLdP~iG~lH~~~~~r~sLa~DL~ 246 (340)
T TIGR03640 167 AARLYFAVFDHLLRQDRPAFRFDGRSRRPPLDPVNALLSFLYTLLTHDCRSALEGVGLDPAVGFLHRDRPGRPSLALDLM 246 (340)
T ss_pred HHHHHHHHHHHHHhcccccCccCCCCCCCCCChHHHHHHHHHHHHHHHHHHHHHHcCCCccceeccCCCCCCccHHHHHH
Confidence 455666665554411 277777755 89999999999999999999999999888 78999988876
Q ss_pred ---CCCCchHHHHHHHhcCcccCCcccccC
Q 041094 254 ---RGPLELITLKTAWGAGIIDKDTFIWGE 280 (523)
Q Consensus 254 ---RGPv~l~tLr~~w~~GIID~dTLVWgE 280 (523)
|-++-.-.+-.+...|+|..+.|...+
T Consensus 247 E~FRp~ivD~~V~~l~~~~~i~~~dF~~~~ 276 (340)
T TIGR03640 247 EEFRAVLADRLALSLINRGQLTAKDFEVRE 276 (340)
T ss_pred HHhhHHHHHHHHHHHHhcCCCCHHhCccCC
Confidence 223434456667788888888887765
No 15
>PF11221 Med21: Subunit 21 of Mediator complex; InterPro: IPR021384 The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. Med21 has been known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in Drosophila. The heterodimer of the two subunits Med7 and Med21 appears to act as a hinge between the middle and the tail regions of Mediator.; PDB: 1YKE_B 1YKH_B.
Probab=36.00 E-value=2.7e+02 Score=25.17 Aligned_cols=29 Identities=14% Similarity=0.306 Sum_probs=19.1
Q ss_pred HHHHHHHHHHHHHHHHHHHHhhcCCChHH
Q 041094 449 DILEFVRVLDILRTGALNSLLEKVPGFNT 477 (523)
Q Consensus 449 dv~~l~~~~~~l~~~~~~~l~~~~pgfd~ 477 (523)
+....|..--.+..+-.+.|-.++||++.
T Consensus 69 ~~~~elA~dIi~kakqIe~LIdsLPg~~~ 97 (144)
T PF11221_consen 69 ENIKELATDIIRKAKQIEYLIDSLPGIEV 97 (144)
T ss_dssp HHHHHHHHHHHHHHHHHHHHHHHSTTSSS
T ss_pred HHHHHHHHHHHHHHHHHHHHHHhCCCCCC
Confidence 33334444445566677889999999874
No 16
>PTZ00384 choline kinase; Provisional
Probab=35.72 E-value=41 Score=34.71 Aligned_cols=99 Identities=14% Similarity=0.302 Sum_probs=51.7
Q ss_pred HHHHHHHHHHhhHHHhh--hcCCCC---------ChhhHHHHhh--CCCCCCCCCCCcchhhhcchhhhchhHHHHHHHH
Q 041094 135 MARAEEIADYLNEVELK--KNDKPY---------RPEDKKLWRA--LPHVIGLDGRPMPRKAIKTKEESDDKFWDFMKQF 201 (523)
Q Consensus 135 ~~r~eei~~~~~e~El~--~N~~P~---------r~ED~~~W~~--lP~v~G~dGrPmpRKA~~~~~e~~~kfwDF~kqf 201 (523)
--++-.||.-.+|+--- ....|| ..|....|=+ |...-|.. .+|-....-.--.+-..|-.+-.+
T Consensus 257 n~~~fDLAn~f~E~~~~y~~~~~~~~~~~~~~~ps~e~~~~fi~~Yl~~~~~~~--~~~~~~~~~~l~~~v~~~~l~sh~ 334 (383)
T PTZ00384 257 NYVGWEIANFFVKLYIVYDPPTPPYFNSDDSLALSEEMKTIFVSVYLSQLLGKN--VLPSDDLVKEFLQSLEIHTLGVNL 334 (383)
T ss_pred CchHhHHHHHHHHHhcccCCCCCCccccCCCCCCCHHHHHHHHHHHHHHhcCCC--CCCcHHHHHHHHHHHHHHHHHHHH
Confidence 35778888888888742 223344 3444443421 12211211 112111001112334578899999
Q ss_pred hhhhhccccCCC---CCCCcchhhhhhhhhhhHH-HHHHHH
Q 041094 202 QFGLWGFRQRPY---PPGRPIDVAQAIGYKRLEK-RYYDFI 238 (523)
Q Consensus 202 ~fGlWg~rQRPy---Pp~rPidvaQaigyk~l~k-ry~d~i 238 (523)
||||||.=|--. +-+.+||. +||-.+.- ||+..+
T Consensus 335 ~W~lW~iIq~~~~~~~~~~~~~f---~~y~~~r~~~~~~~~ 372 (383)
T PTZ00384 335 FWTYWGIVMNDKPKNELSKPVKF---EAYAKFQYNLFKNNL 372 (383)
T ss_pred HHHHHHHHhCCCCcccccCCchH---HHHHHHHHHHHHHHH
Confidence 999999999432 23458898 66644322 444443
No 17
>COG4968 PilE Tfp pilus assembly protein PilE [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=32.79 E-value=13 Score=34.98 Aligned_cols=39 Identities=23% Similarity=0.232 Sum_probs=33.2
Q ss_pred hhhHhhhccCCCCCCchHHHHHHHhhcCCCCCcchhhHHHhhhhh
Q 041094 378 LTTMLEADHMPNKFIPKDLRLQLSKIIPGLRPWEVLSVEQAMDQI 422 (523)
Q Consensus 378 lt~iLaaiamPnkYi~y~~R~klAk~ip~LrP~evlsiEq~MD~i 422 (523)
+.+||+.|++| .|.-|-+|.+.+++-..| +.+-|.|.+-
T Consensus 19 Iv~ILa~IAyP-SY~~yv~rs~R~~a~A~L-----~~~a~~~Er~ 57 (139)
T COG4968 19 IVGILALIAYP-SYQNYVLRSRRSAAKAAL-----LENAQFMERY 57 (139)
T ss_pred HHHHHHHHHhH-hHHHHHHHHHHHHHHHHH-----HHHHHHHHHH
Confidence 78999999999 999999999999987765 5667777764
No 18
>TIGR03641 cas1_HMARI CRISPR-associated endonuclease Cas1, HMARI/TNEAP subtype. It describes Cas1 subgroup that includes Cas1 proteins of the related HMARI and TNEAP subtypes of CRISPR/Cas system.
Probab=31.79 E-value=1e+02 Score=31.19 Aligned_cols=92 Identities=12% Similarity=0.135 Sum_probs=65.4
Q ss_pred hchhHHHHHHHHhhhhhccccCC-CCCCCcchhhhhhhhhhhHHHHHHHHHhcC-----ceEeecCCCCC----------
Q 041094 190 SDDKFWDFMKQFQFGLWGFRQRP-YPPGRPIDVAQAIGYKRLEKRYYDFIMRSG-----GWYYKDRLGRT---------- 253 (523)
Q Consensus 190 ~~~kfwDF~kqf~fGlWg~rQRP-yPp~rPidvaQaigyk~l~kry~d~imr~n-----~WYYaDr~rq~---------- 253 (523)
+...+|.-+.+.+=.=|+|.-|- -||.-|+.++=+.||.-|.......|...| +.++.++.|++
T Consensus 153 aA~~Yf~~l~~~l~~~~~F~gR~rrp~~DpvNa~LsygY~iL~~~i~~al~~~GLdP~iG~lH~~~~gr~sLa~DL~E~f 232 (322)
T TIGR03641 153 IRKTYYSAFDEILPEGFRFEKRTRRPPKNELNALISFGNSLLYTTVLSEIYKTHLNPTISYLHEPSERRFSLALDIAEIF 232 (322)
T ss_pred HHHHHHHHHHHHhcccCcCCCCCCCCcCCHHHHHHHHHHHHHHHHHHHHHHHcCCCcceeeccCCCCCCccHHHHHHHHh
Confidence 44455554444331237777664 378999999999999999999999999887 78999988876
Q ss_pred CCCCchHHHHHHHhcCcccCCcccccCC
Q 041094 254 RGPLELITLKTAWGAGIIDKDTFIWGED 281 (523)
Q Consensus 254 RGPv~l~tLr~~w~~GIID~dTLVWgEG 281 (523)
|-++-...+..+...|+|..+-|.-.+|
T Consensus 233 Rp~~vD~~v~~l~~~~~i~~~dF~~~~~ 260 (322)
T TIGR03641 233 KPIIVDRLIFRLVNKKIIKEKHFEKDLN 260 (322)
T ss_pred hHHHHHHHHHHHHhcCCcCHHHccccCC
Confidence 4223333455666778888888765543
No 19
>KOG2505 consensus Ankyrin repeat protein [General function prediction only]
Probab=30.69 E-value=1.7e+02 Score=33.02 Aligned_cols=51 Identities=22% Similarity=0.212 Sum_probs=42.2
Q ss_pred HHHhhhhhhcCCccccccCCCCCCCCchhhhhhHHHHHHHHHHHHHHHHHHHHHhhcCCCh
Q 041094 415 VEQAMDQITYGGEWYREPLGSHTTGPPYIREWNVDILEFVRVLDILRTGALNSLLEKVPGF 475 (523)
Q Consensus 415 iEq~MD~ity~~eWYREplgs~tt~p~y~~~wn~dv~~l~~~~~~l~~~~~~~l~~~~pgf 475 (523)
+|..||--|-+| ..++||.-.=|+||+-.|.-++.|....+|--...||-=
T Consensus 450 Leeg~Dp~~kd~----------~Grtpy~ls~nkdVk~~F~a~~~l~es~~nW~~t~i~~P 500 (591)
T KOG2505|consen 450 LEEGCDPSTKDG----------AGRTPYSLSANKDVKSIFIARRVLNESFGNWARTHIPEP 500 (591)
T ss_pred HHhcCCchhccc----------CCCCcccccccHHHHHHHHHHHHhcccccchhhhcCCCc
Confidence 456677777766 357899888899999999999999999999888888753
No 20
>PF01473 CW_binding_1: Putative cell wall binding repeat; InterPro: IPR018337 The cell wall-binding repeat (CW) is an about 20 amino acid residue module, essentially found in two bacterial Gram-positive protein families; the choline binding proteins and glucosyltransferases (2.4.1.5 from EC). In choline-binding proteins cell wall binding repeats bind to choline moieties of both teichoic and lipoteichoic acids, two components peculiar to the cell surface of Gram-positive bacteria. In glucosyltransferases the region spanning the CW repeats is a glucan binding domain. Several crystal structures of CW have been solved. In the choline binding protein LytA, the repeats adopt a solenoid fold consisting exclusively of beta-hairpins that stack to form a left-handed superhelix with a boomerang-like shape. The choline groups bind between beta-hairpin 'steps' of the superhelix. In Cpl-1 CW repeats assemble in two sub-domains: an N-terminal superhelical moiety similar to the LytA one and a C-terminal beta-sheet involved in interactions with the lysozyme domain. Choline is bound between repeats 1 and 2, and, 2 and 3 of the superhelical sub-domain. Some proteins known to contain cell-wall binding repeats include: Pneumococcal N-acetylmuramoyl-L-alanine amidase (autolysin, lytA) (3.5.1.28 from EC). It is a surface-exposed enzyme that rules the self-destruction of pneumococcal cells through degradation of their peptidoglycan backbone. It mediates the release of toxic substances that damage the host tissues. Pneumococcal endo-beta-N-acetylglucosaminidase (lytB) (3.2.1.96 from EC). It plays an important role in cell wall degradation and cell separation. Pneumococcal teichoic acid phosphorylcholine esterase (pce or cbpE), a cell wall hydrolase important for cellular adhesion and colonisation. Lactobacillales glucosyltransferase. It catalyses the transfer of glucosyl units from the cleavage of sucrose to a growing chain of glucan. 
Clostridium difficile toxin A (tcdA) and toxin B (tcdb). They are the causative agents of the antibiotic-associated pseudomembranous colitis. They are intracellular acting toxins that reach their targets after receptor-mediated endocytosis. Clostridium acetobutylicum cspA protein. Siphoviridae bacteriophages N-acetylmuramoyl-L-alanine amidase. It lyses the bacterial host cell wall. Podoviridae lysozyme protein (cpl-1). It is capable of digesting the pneumococcal cell wall. The cell wall binding repeats are also known as the choline-binding repeats (ChBr) or the choline-binding domain (ChBD). ; PDB: 1GVM_C 2BML_B 1HCX_A 1OBA_A 1H09_A 2J8F_A 2IXU_A 2J8G_A 2IXV_A 2X8O_A ....
Probab=29.62 E-value=34 Score=21.42 Aligned_cols=14 Identities=43% Similarity=0.914 Sum_probs=11.1
Q ss_pred HhcCceEeecCCCC
Q 041094 239 MRSGGWYYKDRLGR 252 (523)
Q Consensus 239 mr~n~WYYaDr~rq 252 (523)
.-+|.|||-+..|.
T Consensus 5 ~~~~~wYy~~~~G~ 18 (19)
T PF01473_consen 5 QDNGNWYYFDSDGY 18 (19)
T ss_dssp EETTEEEEETTTSB
T ss_pred EECCEEEEeCCCcc
Confidence 34799999998874
No 21
>TIGR01710 typeII_sec_gspG general secretion pathway protein G. This model represents GspG, protein G of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=28.15 E-value=28 Score=30.65 Aligned_cols=49 Identities=20% Similarity=0.250 Sum_probs=36.3
Q ss_pred hhhHhhhccCCCCCCchHHHHHHHhhcCCCCCcchhhHHHhhhhhhcCCccccccCCCCCCC
Q 041094 378 LTTMLEADHMPNKFIPKDLRLQLSKIIPGLRPWEVLSVEQAMDQITYGGEWYREPLGSHTTG 439 (523)
Q Consensus 378 lt~iLaaiamPnkYi~y~~R~klAk~ip~LrP~evlsiEq~MD~ity~~eWYREplgs~tt~ 439 (523)
+++||+++++| .|.....|++...+. -++.++++|+++ |+-..|.|.+.
T Consensus 14 Iigil~~i~~p-~~~~~~~~a~~~~~~-----~~l~~i~~al~~-------y~~d~g~yP~~ 62 (134)
T TIGR01710 14 ILGLLAALVAP-KLFSQADKAKAQVAK-----AQIKALKNALDM-------YRLDNGRYPTE 62 (134)
T ss_pred HHHHHHHHHHH-HHHHHHHHHHHHHHH-----HHHHHHHHHHHH-------HHHHhCCCCCc
Confidence 67899999999 888888888887765 355677888875 44455666554
No 22
>PF05491 RuvB_C: Holliday junction DNA helicase ruvB C-terminus; InterPro: IPR008823 The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This group of sequences contain this signature which is located in the C-terminal region of the proteins; it is thought to be a helicase DNA-binding domain.; GO: 0003677 DNA binding, 0009378 four-way junction helicase activity, 0006281 DNA repair, 0006310 DNA recombination; PDB: 3PFI_B 1IXR_C 1HQC_B 1IXS_B 1IN8_A 1IN4_A 1IN5_A 1J7K_A 1IN6_A 1IN7_A.
Probab=26.93 E-value=69 Score=27.45 Aligned_cols=35 Identities=31% Similarity=0.604 Sum_probs=25.8
Q ss_pred hhhhhhhHHHHHHHHHhcCceEeecCCCCCCCCCchHHHHHHHhc
Q 041094 224 AIGYKRLEKRYYDFIMRSGGWYYKDRLGRTRGPLELITLKTAWGA 268 (523)
Q Consensus 224 aigyk~l~kry~d~imr~n~WYYaDr~rq~RGPv~l~tLr~~w~~ 268 (523)
..|...+|.||..+++.. =.-|||-++||.++...
T Consensus 3 ~~GLd~~D~~yL~~l~~~----------f~ggPvGl~tlA~~l~e 37 (76)
T PF05491_consen 3 ELGLDELDRRYLKTLIEN----------FKGGPVGLDTLAAALGE 37 (76)
T ss_dssp TTS-BHHHHHHHHHHHHC----------STTS-B-HHHHHHHTTS
T ss_pred cccCCHHHHHHHHHHHHH----------cCCCCeeHHHHHHHHCC
Confidence 379999999999999874 11399999999988753
No 23
>PF10777 YlaC: Inner membrane protein YlaC; InterPro: IPR019713 The extracytoplasmic function (ECF) sigma factors are small regulatory proteins that are quite divergent in sequence relative to most other sigma factors. YlaC, regulated by YlaA, is important in oxidative stress resistance. It contributes to hydrogen peroxide resistance in Bacillus subtilis.
Probab=25.56 E-value=22 Score=34.00 Aligned_cols=10 Identities=50% Similarity=1.491 Sum_probs=8.8
Q ss_pred hhhcCCcccc
Q 041094 421 QITYGGEWYR 430 (523)
Q Consensus 421 ~ity~~eWYR 430 (523)
++.||||||.
T Consensus 97 RVCYNGEWy~ 106 (155)
T PF10777_consen 97 RVCYNGEWYN 106 (155)
T ss_pred eEEEcceeee
Confidence 5899999985
No 24
>KOG2636 consensus Splicing factor 3a, subunit 3 [RNA processing and modification]
Probab=24.66 E-value=58 Score=35.89 Aligned_cols=30 Identities=33% Similarity=0.708 Sum_probs=20.7
Q ss_pred HhhHHHhhhcCCCCChhhHHHHhhCCCCCCCCCCCcch
Q 041094 144 YLNEVELKKNDKPYRPEDKKLWRALPHVIGLDGRPMPR 181 (523)
Q Consensus 144 ~~~e~El~~N~~P~r~ED~~~W~~lP~v~G~dGrPmpR 181 (523)
..++-+..+++.||.+- +|| .|.||.|||+
T Consensus 361 ~~~e~~~de~~~~ynp~------~lP--LGwDGkPiPy 390 (497)
T KOG2636|consen 361 SDEESDDDEEELIYNPK------NLP--LGWDGKPIPY 390 (497)
T ss_pred cccccccchhhccCCcc------cCC--CCCCCCcCch
Confidence 33445566677777763 344 6999999996
No 25
>PF11214 Med2: Mediator complex subunit 2; InterPro: IPR021017 The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. This family of mediator complex subunit 2 proteins is conserved in fungi. Cyclin-dependent kinase CDK8 or Srb10 interacts with and phosphorylates Med2. Post-translational modifications of Mediator subunits are important for regulation of gene expression.
Probab=22.37 E-value=1.9e+02 Score=26.38 Aligned_cols=29 Identities=21% Similarity=0.339 Sum_probs=22.9
Q ss_pred HHHHHHHHHHHHHHhhcCCChHHHHHHHH
Q 041094 455 RVLDILRTGALNSLLEKVPGFNTIVERLK 483 (523)
Q Consensus 455 ~~~~~l~~~~~~~l~~~~pgfd~i~~kv~ 483 (523)
.|+.+.++.+.+.|..+|-.|+.|++-+.
T Consensus 37 ~vi~G~n~~l~k~L~eki~~Fh~ILDd~~ 65 (105)
T PF11214_consen 37 NVITGFNNQLQKQLSEKIHKFHSILDDTE 65 (105)
T ss_pred hhhccccHHHHHHHHHHHHHHHHHHHHHH
Confidence 45666777888888899999999988654
No 26
>PRK09819 alpha-mannosidase; Provisional
Probab=21.91 E-value=5.7e+02 Score=29.30 Aligned_cols=86 Identities=20% Similarity=0.141 Sum_probs=55.0
Q ss_pred hhhhcCCcc----cccc-CCCCCCCCchhhhhhHHHHHHHH-HHHHHHHHH--------------------HHHHhhcCC
Q 041094 420 DQITYGGEW----YREP-LGSHTTGPPYIREWNVDILEFVR-VLDILRTGA--------------------LNSLLEKVP 473 (523)
Q Consensus 420 D~ity~~eW----YREp-lgs~tt~p~y~~~wn~dv~~l~~-~~~~l~~~~--------------------~~~l~~~~p 473 (523)
+.=++.||- |=+. =|+|||++ +++.+|+....++. +..-|++.. .++--..||
T Consensus 263 ~lp~~~GEl~~~~y~~~HrG~~TSr~-~iK~~nr~~E~~L~~~~E~l~~la~~~g~~yp~~~l~~~Wk~ll~nq~HD~i~ 341 (875)
T PRK09819 263 NLPTLKGEFIDGKYMRVHRSIFSTRM-DIKIANARIENKIVNVLEPLASIAYSLGFEYPHGLLEKIWKEMFKNHAHDSIG 341 (875)
T ss_pred CCCeeeeecCCCccccccCCccccHH-HHHHHHHHHHHHHHHHhchHHHHHHHcCCCCCHHHHHHHHHHHHHhcCCCccc
Confidence 345677887 5443 48888886 68889998887764 333332221 112223568
Q ss_pred ChHHHHHHHHhhHHHHHHHHHHHHHHHHHHHHHHh
Q 041094 474 GFNTIVERLKAENEAKEARREKRMEALIKEKQEKM 508 (523)
Q Consensus 474 gfd~i~~kv~~d~~~r~~~~~~~~~~~~~~~~~~~ 508 (523)
| +-.+.|..|...|-++-.+.-+..+.....++
T Consensus 342 G--~sid~V~~d~~~r~~~~~~~~~~l~~~~l~~l 374 (875)
T PRK09819 342 C--CCSDTVHRDIVARYKLAEDLADNLLDFYMRKI 374 (875)
T ss_pred c--cCChHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 8 66788899999888877766666666555554
No 27
>PF08621 RPAP1_N: RPAP1-like, N-terminal; InterPro: IPR013930 Inhibition of RNA polymerase II-associated protein 1 (RPAP1) synthesis in Saccharomyces cerevisiae (Baker's yeast) results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the N-terminal region of RPAP1 that is conserved from yeast to humans.
Probab=20.62 E-value=1.4e+02 Score=23.51 Aligned_cols=29 Identities=28% Similarity=0.509 Sum_probs=19.4
Q ss_pred CCCCCHHHHHHHHH-HHHHHhcChHHHHHHH
Q 041094 108 KENPSEEEIKENEE-WWKSFRESPVVQFMAR 137 (523)
Q Consensus 108 k~n~t~eE~~en~~-~~e~~~~s~~~~f~~r 137 (523)
..+||+|||.+..+ +.+.+ .--+++||.+
T Consensus 12 L~~MS~eEI~~er~eL~~~L-dP~li~~L~~ 41 (49)
T PF08621_consen 12 LASMSPEEIEEEREELLESL-DPKLIEFLKK 41 (49)
T ss_pred HHhCCHHHHHHHHHHHHHhC-CHHHHHHHHH
Confidence 35799999987765 44444 4456666654
No 28
>COG3012 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=20.52 E-value=48 Score=31.71 Aligned_cols=20 Identities=35% Similarity=0.715 Sum_probs=15.8
Q ss_pred HHHHhcCceEeecCCCCCCC
Q 041094 236 DFIMRSGGWYYKDRLGRTRG 255 (523)
Q Consensus 236 d~imr~n~WYYaDr~rq~RG 255 (523)
.|+-.+|.|||-|...-+.|
T Consensus 114 rFvk~ngrWyyiDgtv~~~g 133 (151)
T COG3012 114 RFVKINGRWYYIDGTVPPLG 133 (151)
T ss_pred hheEECCEEEEECCCCCccc
Confidence 57788899999998766444
Done!