Query 035666
Match_columns 152
No_of_seqs 103 out of 133
Neff 4.0
Searched_HMMs 46136
Date Fri Mar 29 05:11:19 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/035666.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/035666hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09478 CBM49: Carbohydrate b 96.7 0.012 2.6E-07 41.2 7.4 72 58-135 1-78 (80)
2 PLN02171 endoglucanase 85.1 2.3 5E-05 40.7 6.2 76 55-137 534-615 (629)
3 PF14016 DUF4232: Protein of u 57.9 67 0.0015 23.8 7.0 77 53-138 1-82 (131)
4 PF02933 CDC48_2: Cell divisio 52.2 12 0.00026 24.8 2.0 28 120-148 15-42 (64)
5 PF03330 DPBB_1: Rare lipoprot 51.9 12 0.00025 25.6 1.9 37 71-108 38-74 (78)
6 PF06682 DUF1183: Protein of u 50.4 64 0.0014 28.8 6.7 54 52-109 58-113 (318)
7 KOG3358 Uncharacterized secret 45.9 17 0.00037 30.6 2.4 40 96-137 61-101 (211)
8 cd00602 IPT_TF IPT domain of e 43.3 83 0.0018 23.4 5.5 72 56-135 29-100 (101)
9 PF10633 NPCBM_assoc: NPCBM-as 42.9 51 0.0011 22.2 4.0 24 69-94 5-28 (78)
10 PF06483 ChiC: Chitinase C; I 32.0 68 0.0015 26.6 3.8 48 88-142 36-86 (180)
11 PLN03024 Putative EG45-like do 28.5 36 0.00077 26.3 1.5 27 72-99 84-110 (125)
12 PLN02340 endoglucanase 25.1 64 0.0014 31.1 2.8 76 54-135 518-600 (614)
13 PF11395 DUF2873: Protein of u 21.7 96 0.0021 20.1 2.3 10 19-30 20-29 (43)
14 PF03293 Pox_RNA_pol: Poxvirus 21.7 1.8E+02 0.0038 23.8 4.3 47 88-134 95-141 (160)
15 PF09270 BTD: Beta-trefoil DNA 21.0 1E+02 0.0023 25.0 2.9 49 85-137 71-123 (158)
16 PF00856 SET: SET domain; Int 21.0 1.2E+02 0.0027 21.0 3.0 22 113-134 140-161 (162)
No 1
>PF09478 CBM49: Carbohydrate binding domain CBM49; InterPro: IPR019028 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This domain is found at the C terminus of cellulases and in vitro binding studies have shown that it binds to crystalline cellulose []. ; GO: 0030246 carbohydrate binding, 0005576 extracellular region
Probab=96.70 E-value=0.012 Score=41.24 Aligned_cols=72 Identities=21% Similarity=0.241 Sum_probs=52.8
Q ss_pred ceEEEeecC---CCCc--ceEEEEEEeCCcCCCCccceEEecCCccceeeeCCcceeeecCCceEE-eCCccCCCCCeeE
Q 035666 58 ISISQSKDS---TSGI--PQYIVQIVNTCVSGCAPSDIHLHCGWFASARVVNPRTFKRVSYDDCLV-NGGMPLRPSQVIR 131 (152)
Q Consensus 58 I~V~Q~~tg---~~G~--Pef~VtI~N~C~~~C~~s~V~L~C~gF~Sa~~VDP~lfrr~~~d~CLV-n~G~pI~~g~~Vs 131 (152)
|+|.|..++ .+|. .+|.|+|+|.+. =+++++++.-..+. -+--=+.+.+++..-+ ..-.+|.+|++.+
T Consensus 1 i~i~q~~~~sW~~~g~~y~qy~v~I~N~~~--~~I~~~~i~~~~l~----~~iW~l~~~~~~~y~lPs~~~~i~pg~s~~ 74 (80)
T PF09478_consen 1 ITITQTLVNSWTENGQTYTQYDVTITNNGS--KPIKSLKISIDNLY----GSIWGLDKVSGNTYTLPSYQPTIKPGQSFT 74 (80)
T ss_pred CEEEEEEEeEEEeCCEEEEEEEEEEEECCC--CeEEEEEEEECccc----hhheeEEeccCCEEECCccccccCCCCEEE
Confidence 678888864 4454 579999999996 89999999998765 1222222356677766 3344999999999
Q ss_pred EEec
Q 035666 132 FTYS 135 (152)
Q Consensus 132 F~YA 135 (152)
|-|-
T Consensus 75 FGYI 78 (80)
T PF09478_consen 75 FGYI 78 (80)
T ss_pred EEEE
Confidence 9984
No 2
>PLN02171 endoglucanase
Probab=85.15 E-value=2.3 Score=40.72 Aligned_cols=76 Identities=16% Similarity=0.208 Sum_probs=51.5
Q ss_pred CCCceEEEeecC-----CCCcceEEEEEEeCCcCCCCccceEEecCCccceeeeCCcceeeecCCceEEeCCc-cCCCCC
Q 035666 55 NRDISISQSKDS-----TSGIPQYIVQIVNTCVSGCAPSDIHLHCGWFASARVVNPRTFKRVSYDDCLVNGGM-PLRPSQ 128 (152)
Q Consensus 55 ~sDI~V~Q~~tg-----~~G~Pef~VtI~N~C~~~C~~s~V~L~C~gF~Sa~~VDP~lfrr~~~d~CLVn~G~-pI~~g~ 128 (152)
.+.|+|.|..++ ..+..+|+|+|+|++. .+++++++.=..+-. |=.=+.+ .++...+-+-. .|.+|+
T Consensus 534 ~~ei~i~q~v~~sW~~~g~~y~qy~v~I~N~s~--~~ik~i~i~~~~~~~----~iW~v~~-~~ngytlPs~~~sL~aG~ 606 (629)
T PLN02171 534 SSPIEIEQKATASWKAKGRTYYRYSTTVTNRSA--KTLKELHLGISKLYG----PLWGLTK-AGYGYVLPSWMPSLPAGK 606 (629)
T ss_pred cceeEEEEEEEEEEEcCCceEEEEEEEEEECCC--Cceeeeeeeeccccc----cchheee-cCCcccCchhhcccCCCC
Confidence 447999998874 3567889999999996 999999996544431 1111111 22334444433 788899
Q ss_pred eeEEEeccC
Q 035666 129 VIRFTYSNS 137 (152)
Q Consensus 129 ~VsF~YAws 137 (152)
..+|-|=+.
T Consensus 607 s~tFgyI~~ 615 (629)
T PLN02171 607 SLEFVYVHS 615 (629)
T ss_pred eeEEEeecC
Confidence 999999855
No 3
>PF14016 DUF4232: Protein of unknown function (DUF4232)
Probab=57.93 E-value=67 Score=23.76 Aligned_cols=77 Identities=17% Similarity=0.169 Sum_probs=49.4
Q ss_pred CCCCCceEEEeecC-CCCcceEEEEEEeCCcCCCCccceEEecCCccceeeeCCccee----eecCCceEEeCCccCCCC
Q 035666 53 CTNRDISISQSKDS-TSGIPQYIVQIVNTCVSGCAPSDIHLHCGWFASARVVNPRTFK----RVSYDDCLVNGGMPLRPS 127 (152)
Q Consensus 53 Cs~sDI~V~Q~~tg-~~G~Pef~VtI~N~C~~~C~~s~V~L~C~gF~Sa~~VDP~lfr----r~~~d~CLVn~G~pI~~g 127 (152)
|...|++++-.... ..|...+.|+++|+=..-|.+. ||..+..+|..=-+ ....+. -..--.|.+|
T Consensus 1 C~~~~L~~~~~~~~~~~g~~~~~l~~tN~s~~~C~l~-------G~P~v~~~~~~g~~~~~~~~~~~~--~~~~vtL~PG 71 (131)
T PF14016_consen 1 CTAADLSVTVGPVDAGAGQRHATLTFTNTSDTPCTLY-------GYPGVALVDADGAPLGVPAVREGP--PPRPVTLAPG 71 (131)
T ss_pred CCcccEEEEEecccCCCCccEEEEEEEECCCCcEEec-------cCCcEEEECCCCCcCCccccccCC--CCCcEEECCC
Confidence 88899999987653 6688899999999875557665 66666655442220 000111 1222357788
Q ss_pred CeeEEEeccCC
Q 035666 128 QVIRFTYSNSF 138 (152)
Q Consensus 128 ~~VsF~YAws~ 138 (152)
++..|.=.|..
T Consensus 72 ~sA~a~l~~~~ 82 (131)
T PF14016_consen 72 GSAYAGLRWSN 82 (131)
T ss_pred CEEEEEEEEec
Confidence 88888777754
No 4
>PF02933 CDC48_2: Cell division protein 48 (CDC48), domain 2; InterPro: IPR004201 This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C terminus. The VAT-N domain found in AAA ATPases (IPR003959 from INTERPRO) is a 185-residue substrate recognition domain [].; GO: 0005524 ATP binding; PDB: 1QDN_B 1QCS_A 1CR5_C 3QQ8_A 3HU2_A 3HU1_E 3HU3_A 3QWZ_A 3TIW_B 3QQ7_A ....
Probab=52.16 E-value=12 Score=24.76 Aligned_cols=28 Identities=21% Similarity=0.395 Sum_probs=22.7
Q ss_pred CCccCCCCCeeEEEeccCCcccceeeeee
Q 035666 120 GGMPLRPSQVIRFTYSNSFMYPIAFKSAK 148 (152)
Q Consensus 120 ~G~pI~~g~~VsF~YAws~~fpL~~~Sa~ 148 (152)
.|+|+..|+.|.|.+. ...++|.+.+..
T Consensus 15 ~~~pv~~Gd~i~~~~~-~~~~~~~V~~~~ 42 (64)
T PF02933_consen 15 EGRPVTKGDTIVFPFF-GQALPFKVVSTE 42 (64)
T ss_dssp TTEEEETT-EEEEEET-TEEEEEEEEEEC
T ss_pred cCCCccCCCEEEEEeC-CcEEEEEEEEEE
Confidence 4699999999999996 688898887753
No 5
>PF03330 DPBB_1: Rare lipoprotein A (RlpA)-like double-psi beta-barrel; InterPro: IPR009009 Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair [].; PDB: 1N10_B 3D30_A 2BH0_A 2HCZ_X.
Probab=51.89 E-value=12 Score=25.55 Aligned_cols=37 Identities=22% Similarity=0.463 Sum_probs=25.7
Q ss_pred ceEEEEEEeCCcCCCCccceEEecCCccceeeeCCcce
Q 035666 71 PQYIVQIVNTCVSGCAPSDIHLHCGWFASARVVNPRTF 108 (152)
Q Consensus 71 Pef~VtI~N~C~~~C~~s~V~L~C~gF~Sa~~VDP~lf 108 (152)
-.-.|+|+++|| +|+..++-|+=..|..--..|..++
T Consensus 38 ksV~v~V~D~Cp-~~~~~~lDLS~~aF~~la~~~~G~i 74 (78)
T PF03330_consen 38 KSVTVTVVDRCP-GCPPNHLDLSPAAFKALADPDAGVI 74 (78)
T ss_dssp CEEEEEEEEE-T-TSSSSEEEEEHHHHHHTBSTTCSSE
T ss_pred CeEEEEEEccCC-CCcCCEEEeCHHHHHHhCCCCceEE
Confidence 556799999998 7999988888777765433344443
No 6
>PF06682 DUF1183: Protein of unknown function (DUF1183); InterPro: IPR009567 This family consists of several eukaryotic proteins of around 360 residues in length. The function of this family is unknown.
Probab=50.45 E-value=64 Score=28.75 Aligned_cols=54 Identities=11% Similarity=0.186 Sum_probs=38.3
Q ss_pred CCCCCCceEEEeec-C-CCCcceEEEEEEeCCcCCCCccceEEecCCccceeeeCCccee
Q 035666 52 SCTNRDISISQSKD-S-TSGIPQYIVQIVNTCVSGCAPSDIHLHCGWFASARVVNPRTFK 109 (152)
Q Consensus 52 ~Cs~sDI~V~Q~~t-g-~~G~Pef~VtI~N~C~~~C~~s~V~L~C~gF~Sa~~VDP~lfr 109 (152)
.|..-.++|-|=.. | -+-+.||+=+-. =+..=-...|.|.|.|+.+.+ ||=|||
T Consensus 58 ~c~~~~p~vvqC~N~G~dg~dvqW~C~A~--Lp~~~klG~~~V~CEGY~~pd--DpyvLk 113 (318)
T PF06682_consen 58 GCDLYEPDVVQCTNQGYDGEDVQWECKAD--LPNEYKLGSTDVSCEGYDYPD--DPYVLK 113 (318)
T ss_pred cccccCcceEEEEecCCCCcccceEEeCC--CCcceeecceEEeeecccCCC--CceecC
Confidence 37777777777543 4 455777875432 123456778999999999976 999998
No 7
>KOG3358 consensus Uncharacterized secreted protein SDF2 (Stromal cell-derived factor 2), contains MIR domains [General function prediction only]
Probab=45.86 E-value=17 Score=30.64 Aligned_cols=40 Identities=28% Similarity=0.326 Sum_probs=31.9
Q ss_pred CccceeeeCC-cceeeecCCceEEeCCccCCCCCeeEEEeccC
Q 035666 96 WFASARVVNP-RTFKRVSYDDCLVNGGMPLRPSQVIRFTYSNS 137 (152)
Q Consensus 96 gF~Sa~~VDP-~lfrr~~~d~CLVn~G~pI~~g~~VsF~YAws 137 (152)
||..++.+|. -.+|+..+..| +-|.||.-|++||.+--.+
T Consensus 61 gv~~~dD~NSyW~Ik~~~~~~c--~rG~pikcG~~iRL~H~~T 101 (211)
T KOG3358|consen 61 GVEGVDDSNSYWRIKPVSGTTC--ERGDPIKCGQTIRLTHLKT 101 (211)
T ss_pred cccccccCcceEEEecCCCCcc--cCCCccccCCeEEEEEeec
Confidence 6666666676 56777888889 8999999999999987644
No 8
>cd00602 IPT_TF IPT domain of eukaryotic transcription factors NF-kappaB/Rel, nuclear factor of activated T cells (NFAT), and recombination signal J-kappa binding protein (RBP-Jkappa). The IPT domains in these proteins are involved in DNA binding. Most NF-kappaB/Rel proteins form homo- and heterodimers, while NFAT proteins are largely monomeric (with TonEBP being an exception). While the majority of sequence-specific DNA binding elements are found in the N-terminal domain, several are found in the IPT domain in loops adjacent to, and including, the linker region.
Probab=43.26 E-value=83 Score=23.36 Aligned_cols=72 Identities=19% Similarity=0.159 Sum_probs=51.8
Q ss_pred CCceEEEeecCCCCcceEEEEEEeCCcCCCCccceEEecCCccceeeeCCcceeeecCCceEEeCCccCCCCCeeEEEec
Q 035666 56 RDISISQSKDSTSGIPQYIVQIVNTCVSGCAPSDIHLHCGWFASARVVNPRTFKRVSYDDCLVNGGMPLRPSQVIRFTYS 135 (152)
Q Consensus 56 sDI~V~Q~~tg~~G~Pef~VtI~N~C~~~C~~s~V~L~C~gF~Sa~~VDP~lfrr~~~d~CLVn~G~pI~~g~~VsF~YA 135 (152)
.||.|.=...+. |...|+....=++. +||..|--|....--|+.+=+.+.-.--|++.-.... .++..|+|-
T Consensus 29 ~dikV~F~e~~~-g~~~WE~~~~f~~~------dv~q~aiv~~tP~y~~~~i~~pV~V~i~L~r~~~~~~-S~~~~FtY~ 100 (101)
T cd00602 29 PDIKVWFGEKGP-GETVWEAEAMFRQE------DVRQVAIVFKTPPYHNKWITRPVQVPIQLVRPDDRKR-SEPLTFTYT 100 (101)
T ss_pred CCCEEEEEecCC-CCCeEEEEEEECHH------HceEeEEEecCCCcCCCCccccEEEEEEEEeCCCCee-cCCcCeEEc
Confidence 588876554433 88899999988875 6677777777766667777666666678888844444 478999994
No 9
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=42.89 E-value=51 Score=22.20 Aligned_cols=24 Identities=13% Similarity=0.164 Sum_probs=15.2
Q ss_pred CcceEEEEEEeCCcCCCCccceEEec
Q 035666 69 GIPQYIVQIVNTCVSGCAPSDIHLHC 94 (152)
Q Consensus 69 G~Pef~VtI~N~C~~~C~~s~V~L~C 94 (152)
..-+++|+|.|... -+..++.|+-
T Consensus 5 ~~~~~~~tv~N~g~--~~~~~v~~~l 28 (78)
T PF10633_consen 5 ETVTVTLTVTNTGT--APLTNVSLSL 28 (78)
T ss_dssp EEEEEEEEEE--SS--S-BSS-EEEE
T ss_pred CEEEEEEEEEECCC--CceeeEEEEE
Confidence 34569999999995 6677888775
No 10
>PF06483 ChiC: Chitinase C; InterPro: IPR009470 This ~170 aa region is found C-terminal to the catalytic domain (IPR001223 from INTERPRO) found in members of glycoside hydrolase family 18.
Probab=31.98 E-value=68 Score=26.65 Aligned_cols=48 Identities=21% Similarity=0.317 Sum_probs=35.5
Q ss_pred cceEEecCCcc---ceeeeCCcceeeecCCceEEeCCccCCCCCeeEEEeccCCcccc
Q 035666 88 SDIHLHCGWFA---SARVVNPRTFKRVSYDDCLVNGGMPLRPSQVIRFTYSNSFMYPI 142 (152)
Q Consensus 88 s~V~L~C~gF~---Sa~~VDP~lfrr~~~d~CLVn~G~pI~~g~~VsF~YAws~~fpL 142 (152)
-||.+.=++|+ +-=||+|++- +- | |.++.|+.|..++|.|+-+.+-.+
T Consensus 36 ldv~v~~~gf~~GD~NYPI~Pkl~--iT-N----ns~~~iPGGt~~~FD~ptSa~~~~ 86 (180)
T PF06483_consen 36 LDVSVSFTGFKLGDSNYPINPKLT--IT-N----NSGQTIPGGTEFEFDYPTSAPDNA 86 (180)
T ss_pred EEEEEEeCCcccCCCCCCcCCcEE--EE-c----CCCcccCCccEEEEccccCCcccc
Confidence 37777778887 4458888764 21 2 788999999999999997766433
No 11
>PLN03024 Putative EG45-like domain containing protein 1; Provisional
Probab=28.47 E-value=36 Score=26.28 Aligned_cols=27 Identities=44% Similarity=0.707 Sum_probs=17.6
Q ss_pred eEEEEEEeCCcCCCCccceEEecCCccc
Q 035666 72 QYIVQIVNTCVSGCAPSDIHLHCGWFAS 99 (152)
Q Consensus 72 ef~VtI~N~C~~~C~~s~V~L~C~gF~S 99 (152)
.=.|+|+++||++|+ .++-|+=..|..
T Consensus 84 sV~V~VtD~CP~~C~-~~~DLS~~AF~~ 110 (125)
T PLN03024 84 SVTVKIVDHCPSGCA-STLDLSREAFAQ 110 (125)
T ss_pred eEEEEEEcCCCCCCC-CceEcCHHHHHH
Confidence 467899999986675 355554444443
No 12
>PLN02340 endoglucanase
Probab=25.07 E-value=64 Score=31.10 Aligned_cols=76 Identities=16% Similarity=0.150 Sum_probs=48.5
Q ss_pred CCCCceEEEeecC-----CCCcceEEEEEEeCCcCCCCccceEEecCCccc-eeeeCCcceeeecCCceEEeCC-ccCCC
Q 035666 54 TNRDISISQSKDS-----TSGIPQYIVQIVNTCVSGCAPSDIHLHCGWFAS-ARVVNPRTFKRVSYDDCLVNGG-MPLRP 126 (152)
Q Consensus 54 s~sDI~V~Q~~tg-----~~G~Pef~VtI~N~C~~~C~~s~V~L~C~gF~S-a~~VDP~lfrr~~~d~CLVn~G-~pI~~ 126 (152)
+.+++++.|.-+. ....-+|+|+|+|+|. =|++.+++.=..+-- .-.|.|.+ ..+.+.+-+= ..|.+
T Consensus 518 ~~~~~e~~~~~~~sw~~~g~~y~~~~v~i~N~s~--~pi~~l~~~~~~l~g~lwgl~~~~----~~~~y~~p~~~~tl~~ 591 (614)
T PLN02340 518 SGAPVEFVHSITNTWTAGGTTYYRHKVIIKNKSQ--KPITDLKLVIEDLSGPIWGLNPTK----EKNTYELPQWQKVLQP 591 (614)
T ss_pred CCCchhhhhhheeeeecCCceEEEEEEEEEeCCC--CCchhhhhhhhhcccchhcceecc----ccCCccCchhhhccCC
Confidence 4556677666553 4466789999999996 788888876644432 11233221 1233434333 57888
Q ss_pred CCeeEEEec
Q 035666 127 SQVIRFTYS 135 (152)
Q Consensus 127 g~~VsF~YA 135 (152)
|+.++|.|-
T Consensus 592 g~~~~f~yi 600 (614)
T PLN02340 592 GSQLSFVYV 600 (614)
T ss_pred CCeeEEEec
Confidence 999999998
No 13
>PF11395 DUF2873: Protein of unknown function (DUF2873); InterPro: IPR021532 This entry is represented by the human SARS coronavirus, Orf7b; it is a family of uncharacterised viral proteins.
Probab=21.72 E-value=96 Score=20.07 Aligned_cols=10 Identities=60% Similarity=1.105 Sum_probs=4.8
Q ss_pred HHHHHHHHhhhh
Q 035666 19 LMLMMLTIIIFC 30 (152)
Q Consensus 19 ~~~~~~~~~~~~ 30 (152)
+.+.|| |.||
T Consensus 20 lv~iml--iif~ 29 (43)
T PF11395_consen 20 LVIIML--IIFW 29 (43)
T ss_pred HHHHHH--HHHH
Confidence 334566 4444
No 14
>PF03293 Pox_RNA_pol: Poxvirus DNA-directed RNA polymerase, 18 kD subunit; InterPro: IPR004973 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. 
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. The Poxvirus DNA-directed RNA polymerase (2.7.7.6 from EC) catalyses DNA-template-directed extension of the 3'-end of an RNA strand by one nucleotide at a time. The enzyme consists of at least eight subunits; this is the 18 kDa subunit.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0019083 viral transcription
Probab=21.67 E-value=1.8e+02 Score=23.76 Aligned_cols=47 Identities=15% Similarity=0.279 Sum_probs=29.9
Q ss_pred cceEEecCCccceeeeCCcceeeecCCceEEeCCccCCCCCeeEEEe
Q 035666 88 SDIHLHCGWFASARVVNPRTFKRVSYDDCLVNGGMPLRPSQVIRFTY 134 (152)
Q Consensus 88 s~V~L~C~gF~Sa~~VDP~lfrr~~~d~CLVn~G~pI~~g~~VsF~Y 134 (152)
+||.+.|+..-=-..=|..-..--+..-|+++||..-..|..|+-.-
T Consensus 95 Sni~V~CgDLiCkl~rdsGtVSf~dsKYCfirNg~vY~ngs~Vsv~L 141 (160)
T PF03293_consen 95 SNITVQCGDLICKLSRDSGTVSFNDSKYCFIRNGVVYDNGSEVSVVL 141 (160)
T ss_pred CceEEEcCcEEEEeeccCCeEEecCceEEEEECCEEecCCCEEEEEe
Confidence 58999998765432223333221111239999999999998887654
No 15
>PF09270 BTD: Beta-trefoil DNA-binding domain; InterPro: IPR015350 This DNA-binding domain adopts a beta-trefoil fold, that is, a capped beta-barrel with internal pseudo threefold symmetry. In the DNA-binding protein LAG-1, it is also the site of mutually exclusive interactions with NotchIC (and the viral protein EBNA2) and corepressors (SMRT/N-Cor and CIR) []. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus; PDB: 2FO1_A 3NBN_A 2F8X_C 3V79_C 3IAG_C 3BRG_C.
Probab=21.05 E-value=1e+02 Score=24.96 Aligned_cols=49 Identities=16% Similarity=0.278 Sum_probs=35.5
Q ss_pred CCccceEEec--CCccceeeeCCcceeeecCCceEEeC--CccCCCCCeeEEEeccC
Q 035666 85 CAPSDIHLHC--GWFASARVVNPRTFKRVSYDDCLVNG--GMPLRPSQVIRFTYSNS 137 (152)
Q Consensus 85 C~~s~V~L~C--~gF~Sa~~VDP~lfrr~~~d~CLVn~--G~pI~~g~~VsF~YAws 137 (152)
|.=+-|+|-| .|.+| +|-++|+++.+.=++.. |+|+..-+.+.|+-..+
T Consensus 71 ~YGs~V~Lv~~~TGv~s----ppliIRKVdk~~~~ld~~~~ePVSQLhK~Afq~~d~ 123 (158)
T PF09270_consen 71 HYGSTVVLVCSVTGVSS----PPLIIRKVDKQQVVLDAASDEPVSQLHKCAFQMIDG 123 (158)
T ss_dssp BTTSEEEEEETTT-EBE----EEEEEEEEETTEEESSGGTTSB-BTTEEEEEEETTS
T ss_pred ecCCEEEEEECCCCccc----CceEEEEecCCceeecccccchhhhhheeeEEecCC
Confidence 4444688888 45555 78889999776555565 89999999999998873
No 16
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=21.01 E-value=1.2e+02 Score=20.98 Aligned_cols=22 Identities=14% Similarity=0.155 Sum_probs=16.9
Q ss_pred CCceEEeCCccCCCCCeeEEEe
Q 035666 113 YDDCLVNGGMPLRPSQVIRFTY 134 (152)
Q Consensus 113 ~d~CLVn~G~pI~~g~~VsF~Y 134 (152)
.+...+.-.++|.+|+.|...|
T Consensus 140 ~~~~~~~a~r~I~~GeEi~isY 161 (162)
T PF00856_consen 140 GGCLVVRATRDIKKGEEIFISY 161 (162)
T ss_dssp TTEEEEEESS-B-TTSBEEEES
T ss_pred cceEEEEECCccCCCCEEEEEE
Confidence 4456788899999999999998
Done!