Query 041740
Match_columns 126
No_of_seqs 102 out of 136
Neff 5.1
Searched_HMMs 46136
Date Fri Mar 29 10:09:28 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/041740.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/041740hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF09478 CBM49: Carbohydrate b 97.1 0.0045 9.7E-08 41.6 7.5 70 35-110 1-78 (80)
2 PLN02171 endoglucanase 91.0 0.95 2.1E-05 41.5 7.4 81 33-120 535-625 (629)
3 PF02933 CDC48_2: Cell divisio 72.5 3.1 6.7E-05 26.4 2.0 28 95-122 15-42 (64)
4 PF03330 DPBB_1: Rare lipoprot 68.2 5 0.00011 26.2 2.3 37 48-84 38-75 (78)
5 PF06682 DUF1183: Protein of u 54.5 28 0.00061 29.6 5.0 23 61-84 91-113 (318)
6 PLN02340 endoglucanase 47.9 15 0.00032 33.8 2.4 75 32-111 519-601 (614)
7 PF15281 Consortin_C: Consorti 38.2 20 0.00042 26.4 1.4 29 8-36 54-82 (113)
8 PRK15249 fimbrial chaperone pr 35.2 43 0.00094 27.0 3.1 91 4-104 7-113 (253)
9 PF07127 Nodulin_late: Late no 34.2 56 0.0012 20.2 2.9 8 3-10 1-8 (54)
10 PF05304 DUF728: Protein of un 30.4 17 0.00037 26.0 0.0 28 51-85 33-60 (103)
11 cd00602 IPT_TF IPT domain of e 29.3 2.1E+02 0.0045 20.2 5.9 72 32-110 28-100 (101)
12 PF14016 DUF4232: Protein of u 29.1 96 0.0021 21.9 3.8 70 30-111 1-80 (131)
13 TIGR01451 B_ant_repeat conserv 27.6 1.1E+02 0.0023 18.7 3.3 24 45-69 11-34 (53)
14 PF06483 ChiC: Chitinase C; I 26.9 1.2E+02 0.0026 24.0 4.1 46 62-114 34-84 (180)
15 PF04744 Monooxygenase_B: Mono 25.9 1.1E+02 0.0023 26.9 4.0 58 45-108 262-330 (381)
16 PF03293 Pox_RNA_pol: Poxvirus 24.0 1.3E+02 0.0029 23.1 3.8 48 63-110 94-142 (160)
17 PF10633 NPCBM_assoc: NPCBM-as 23.8 92 0.002 20.0 2.6 24 47-71 6-29 (78)
18 PF00856 SET: SET domain; Int 22.2 1.2E+02 0.0026 20.3 3.1 22 88-109 140-161 (162)
19 PF08626 TRAPPC9-Trs120: Trans 21.8 1.7E+02 0.0036 28.8 5.0 63 45-111 797-877 (1185)
20 PF01345 DUF11: Domain of unkn 21.2 1.5E+02 0.0033 18.7 3.3 25 45-70 40-64 (76)
21 PF11906 DUF3426: Protein of u 20.9 2.7E+02 0.0059 19.8 4.9 63 45-110 67-133 (149)
No 1
>PF09478 CBM49: Carbohydrate binding domain CBM49; InterPro: IPR019028 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classification of cellulose-binding domains was based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This domain is found at the C terminus of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose []. ; GO: 0030246 carbohydrate binding, 0005576 extracellular region
Probab=97.06 E-value=0.0045 Score=41.65 Aligned_cols=70 Identities=21% Similarity=0.299 Sum_probs=54.1
Q ss_pred CeEEEeecCC---CC--CCeEEEEEEeCCCCCCceeEEEecCCccc-c-ccCccceeeecCCeeEE-eCCccCCCCCeeE
Q 041740 35 PTLQQTQVGF---GS--PPTFMARVHNNCPMCPVINIHLKCGNFSQ-A-LVNPRLLKVISYNNCVV-NSGFPLSPLQTFS 106 (126)
Q Consensus 35 I~V~Q~~tg~---~g--~p~~~VtI~N~C~~C~~~~V~l~C~gF~s-~-~VdP~~fr~~~~~~CLv-n~G~pi~~g~~v~ 106 (126)
|+|.|..++. +| ..+|.|+|+|++. =+++++++.-..+.+ . .+|. .+++..-+ +.-.+|.+|++.+
T Consensus 1 i~i~q~~~~sW~~~g~~y~qy~v~I~N~~~-~~I~~~~i~~~~l~~~iW~l~~-----~~~~~y~lPs~~~~i~pg~s~~ 74 (80)
T PF09478_consen 1 ITITQTLVNSWTENGQTYTQYDVTITNNGS-KPIKSLKISIDNLYGSIWGLDK-----VSGNTYTLPSYQPTIKPGQSFT 74 (80)
T ss_pred CEEEEEEEeEEEeCCEEEEEEEEEEEECCC-CeEEEEEEEECccchhheeEEe-----ccCCEEECCccccccCCCCEEE
Confidence 6889998864 44 4679999999998 899999999997753 3 4443 45566766 3344999999999
Q ss_pred EEec
Q 041740 107 FNYS 110 (126)
Q Consensus 107 F~YA 110 (126)
|-|-
T Consensus 75 FGYI 78 (80)
T PF09478_consen 75 FGYI 78 (80)
T ss_pred EEEE
Confidence 9984
No 2
>PLN02171 endoglucanase
Probab=91.01 E-value=0.95 Score=41.46 Aligned_cols=81 Identities=15% Similarity=0.139 Sum_probs=55.6
Q ss_pred CCCeEEEeecCC-----CCCCeEEEEEEeCCCCCCceeEEEecCCcc-cc-ccCccceeeecCCeeEEeCCc-cCCCCCe
Q 041740 33 YSPTLQQTQVGF-----GSPPTFMARVHNNCPMCPVINIHLKCGNFS-QA-LVNPRLLKVISYNNCVVNSGF-PLSPLQT 104 (126)
Q Consensus 33 ~dI~V~Q~~tg~-----~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~-s~-~VdP~~fr~~~~~~CLvn~G~-pi~~g~~ 104 (126)
+.|+|+|..++. .+..+|+|+|+|++. .+++++++.=..+- +. .|. + .+ +...+-+-. .|.+|++
T Consensus 535 ~ei~i~q~v~~sW~~~g~~y~qy~v~I~N~s~-~~ik~i~i~~~~~~~~iW~v~----~-~~-ngytlPs~~~sL~aG~s 607 (629)
T PLN02171 535 SPIEIEQKATASWKAKGRTYYRYSTTVTNRSA-KTLKELHLGISKLYGPLWGLT----K-AG-YGYVLPSWMPSLPAGKS 607 (629)
T ss_pred ceeEEEEEEEEEEEcCCceEEEEEEEEEECCC-Cceeeeeeeeccccccchhee----e-cC-CcccCchhhcccCCCCe
Confidence 368899988863 348889999999999 99999999754443 33 342 1 12 223333332 7889999
Q ss_pred eEEEecC--CCceeeeee
Q 041740 105 FSFNYSH--PKYVMQPAT 120 (126)
Q Consensus 105 v~F~YA~--~~f~l~p~s 120 (126)
.+|-|=+ .+..+.+.+
T Consensus 608 ~tFgyI~~~~pA~~~v~~ 625 (629)
T PLN02171 608 LEFVYVHSASPADVWVSG 625 (629)
T ss_pred eEEEeecCCCCceEEEEE
Confidence 9999986 344455543
No 3
>PF02933 CDC48_2: Cell division protein 48 (CDC48), domain 2; InterPro: IPR004201 This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C terminus. The VAT-N domain found in AAA ATPases (IPR003959 from INTERPRO) is a 185-residue substrate recognition domain [].; GO: 0005524 ATP binding; PDB: 1QDN_B 1QCS_A 1CR5_C 3QQ8_A 3HU2_A 3HU1_E 3HU3_A 3QWZ_A 3TIW_B 3QQ7_A ....
Probab=72.55 E-value=3.1 Score=26.42 Aligned_cols=28 Identities=14% Similarity=0.205 Sum_probs=23.5
Q ss_pred CCccCCCCCeeEEEecCCCceeeeeeee
Q 041740 95 SGFPLSPLQTFSFNYSHPKYVMQPATWS 122 (126)
Q Consensus 95 ~G~pi~~g~~v~F~YA~~~f~l~p~ss~ 122 (126)
.|+|+..|+.|.|.+....++|.+++.+
T Consensus 15 ~~~pv~~Gd~i~~~~~~~~~~~~V~~~~ 42 (64)
T PF02933_consen 15 EGRPVTKGDTIVFPFFGQALPFKVVSTE 42 (64)
T ss_dssp TTEEEETT-EEEEEETTEEEEEEEEEEC
T ss_pred cCCCccCCCEEEEEeCCcEEEEEEEEEE
Confidence 4699999999999998888999988764
No 4
>PF03330 DPBB_1: Rare lipoprotein A (RlpA)-like double-psi beta-barrel; InterPro: IPR009009 Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair [].; PDB: 1N10_B 3D30_A 2BH0_A 2HCZ_X.
Probab=68.21 E-value=5 Score=26.21 Aligned_cols=37 Identities=19% Similarity=0.409 Sum_probs=27.4
Q ss_pred CeEEEEEEeCCCCCCceeEEEecCCcccc-ccCcccee
Q 041740 48 PTFMARVHNNCPMCPVINIHLKCGNFSQA-LVNPRLLK 84 (126)
Q Consensus 48 p~~~VtI~N~C~~C~~~~V~l~C~gF~s~-~VdP~~fr 84 (126)
-.-.|+|+++|+.|+-.++-|+=..|... ..|..++.
T Consensus 38 ksV~v~V~D~Cp~~~~~~lDLS~~aF~~la~~~~G~i~ 75 (78)
T PF03330_consen 38 KSVTVTVVDRCPGCPPNHLDLSPAAFKALADPDAGVIP 75 (78)
T ss_dssp CEEEEEEEEE-TTSSSSEEEEEHHHHHHTBSTTCSSEE
T ss_pred CeEEEEEEccCCCCcCCEEEeCHHHHHHhCCCCceEEE
Confidence 56779999999989999999987777764 44555444
No 5
>PF06682 DUF1183: Protein of unknown function (DUF1183); InterPro: IPR009567 This family consists of several eukaryotic proteins of around 360 residues in length. The function of this family is unknown.
Probab=54.50 E-value=28 Score=29.56 Aligned_cols=23 Identities=17% Similarity=0.394 Sum_probs=18.8
Q ss_pred CCceeEEEecCCccccccCcccee
Q 041740 61 CPVINIHLKCGNFSQALVNPRLLK 84 (126)
Q Consensus 61 C~~~~V~l~C~gF~s~~VdP~~fr 84 (126)
=....|.|.|.|+.... ||=|||
T Consensus 91 ~klG~~~V~CEGY~~pd-DpyvLk 113 (318)
T PF06682_consen 91 YKLGSTDVSCEGYDYPD-DPYVLK 113 (318)
T ss_pred eeecceEEeeecccCCC-CceecC
Confidence 34567899999999854 999998
No 6
>PLN02340 endoglucanase
Probab=47.95 E-value=15 Score=33.81 Aligned_cols=75 Identities=17% Similarity=0.164 Sum_probs=51.4
Q ss_pred CCCCeEEEeecCC-----CCCCeEEEEEEeCCCCCCceeEEEecCCccc-c-ccCccceeeecCCeeEEeCC-ccCCCCC
Q 041740 32 KYSPTLQQTQVGF-----GSPPTFMARVHNNCPMCPVINIHLKCGNFSQ-A-LVNPRLLKVISYNNCVVNSG-FPLSPLQ 103 (126)
Q Consensus 32 ~~dI~V~Q~~tg~-----~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~s-~-~VdP~~fr~~~~~~CLvn~G-~pi~~g~ 103 (126)
.+++.+.|.-+.. ...-+|+|+|+|++. =|.+.+++.=..+-. . .|.|.+=+ +...+-+= ..|.+|+
T Consensus 519 ~~~~e~~~~~~~sw~~~g~~y~~~~v~i~N~s~-~pi~~l~~~~~~l~g~lwgl~~~~~~----~~y~~p~~~~tl~~g~ 593 (614)
T PLN02340 519 GAPVEFVHSITNTWTAGGTTYYRHKVIIKNKSQ-KPITDLKLVIEDLSGPIWGLNPTKEK----NTYELPQWQKVLQPGS 593 (614)
T ss_pred CCchhhhhhheeeeecCCceEEEEEEEEEeCCC-CCchhhhhhhhhcccchhcceecccc----CCccCchhhhccCCCC
Confidence 4455676666542 347889999999999 899999988766554 3 55554322 22333333 4788899
Q ss_pred eeEEEecC
Q 041740 104 TFSFNYSH 111 (126)
Q Consensus 104 ~v~F~YA~ 111 (126)
.++|.|-.
T Consensus 594 ~~~f~yi~ 601 (614)
T PLN02340 594 QLSFVYVQ 601 (614)
T ss_pred eeEEEecc
Confidence 99999985
No 7
>PF15281 Consortin_C: Consortin C-terminus
Probab=38.23 E-value=20 Score=26.35 Aligned_cols=29 Identities=28% Similarity=0.246 Sum_probs=20.3
Q ss_pred HHHHHHHHHHHhhhhcccCCCCCCCCCCe
Q 041740 8 KLLLWCSCLTFASLLDQGKGEKCSKYSPT 36 (126)
Q Consensus 8 kll~~~~~l~l~~~~~~G~~~~Cs~~dI~ 36 (126)
.++++++|++.+.|..-|.+-.|+..|..
T Consensus 54 cl~L~LlclvTv~lS~gGTALYCt~gd~~ 82 (113)
T PF15281_consen 54 CLLLLLLCLVTVVLSVGGTALYCTFGDME 82 (113)
T ss_pred cHHHHHHHHHHHHHhccceEEEEecCCcc
Confidence 45556666666667778888789988764
No 8
>PRK15249 fimbrial chaperone protein StbB; Provisional
Probab=35.18 E-value=43 Score=27.02 Aligned_cols=91 Identities=14% Similarity=0.180 Sum_probs=48.4
Q ss_pred hhhHHHHHHHHHHHHhhhhcccCCCCCCCCCCeEEEeecCC-CCCCeEEEEEEeCCCCCCceeEEEecCCcc--------
Q 041740 4 HKAFKLLLWCSCLTFASLLDQGKGEKCSKYSPTLQQTQVGF-GSPPTFMARVHNNCPMCPVINIHLKCGNFS-------- 74 (126)
Q Consensus 4 ~~~~kll~~~~~l~l~~~~~~G~~~~Cs~~dI~V~Q~~tg~-~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~-------- 74 (126)
|+++|+|++.+++++.+. +....|.|..++.-- ++..+=+|+|.|+=. =+ .-|..+=..-.
T Consensus 7 ~~~~~~~~~~~~~~~~~~--------~a~A~l~l~~TRviy~~~~~~~sl~l~N~~~-~p-~LvQsWv~~~~~~~~p~~~ 76 (253)
T PRK15249 7 HSALYYLIVFLFLALPAT--------ASWASVTILGSRIIYPSTASSVDVQLKNNDA-IP-YIVQTWFDDGDMNTSPENS 76 (253)
T ss_pred hhHHHHHHHHHHHHhhhH--------hheeEEEeCceEEEEeCCCcceeEEEEcCCC-Cc-EEEEEEEeCCCCCCCcccc
Confidence 678898877654433221 223457777766532 466777888888654 11 12221111111
Q ss_pred -cc--ccCccceeeecCCeeE---EeCC-ccCCCCCe
Q 041740 75 -QA--LVNPRLLKVISYNNCV---VNSG-FPLSPLQT 104 (126)
Q Consensus 75 -s~--~VdP~~fr~~~~~~CL---vn~G-~pi~~g~~ 104 (126)
+. .|-|-+||..+++.=. +..| .+++....
T Consensus 77 ~~~pFivtPPlfrl~p~~~q~lRI~~~~~~~lP~DRE 113 (253)
T PRK15249 77 SAMPFIATPPVFRIQPKAGQVVRVIYNNTKKLPQDRE 113 (253)
T ss_pred ccCcEEEcCCeEEecCCCceEEEEEEcCCCCCCCCce
Confidence 12 4788899987665332 3344 36665544
No 9
>PF07127 Nodulin_late: Late nodulin protein; InterPro: IPR009810 This family consists of several plant-specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the protein's C terminus which may be involved in metal-binding [].; GO: 0046872 metal ion binding, 0009878 nodule morphogenesis
Probab=34.20 E-value=56 Score=20.16 Aligned_cols=8 Identities=25% Similarity=0.384 Sum_probs=6.1
Q ss_pred chhhHHHH
Q 041740 3 SHKAFKLL 10 (126)
Q Consensus 3 ~~~~~kll 10 (126)
|++++|+.
T Consensus 1 Ma~ilKFv 8 (54)
T PF07127_consen 1 MAKILKFV 8 (54)
T ss_pred CccchhhH
Confidence 77888864
No 10
>PF05304 DUF728: Protein of unknown function (DUF728); InterPro: IPR007968 This entry is represented by the Tobacco rattle virus, 16kDa protein; it is a family of uncharacterised viral proteins.
Probab=30.37 E-value=17 Score=26.03 Aligned_cols=28 Identities=32% Similarity=0.636 Sum_probs=21.5
Q ss_pred EEEEEeCCCCCCceeEEEecCCccccccCccceee
Q 041740 51 MARVHNNCPMCPVINIHLKCGNFSQALVNPRLLKV 85 (126)
Q Consensus 51 ~VtI~N~C~~C~~~~V~l~C~gF~s~~VdP~~fr~ 85 (126)
.|.|+-+|. | .+||||.-+.|+|.-|.+
T Consensus 33 ~~~v~RkC~-~------~NCGWf~~i~v~~~~~eV 60 (103)
T PF05304_consen 33 KVGVKRKCE-C------NNCGWFPAISVNDDTFEV 60 (103)
T ss_pred HhChhhhhh-c------cCCCceEEEEEeccEEee
Confidence 455666776 5 389999998889988876
No 11
>cd00602 IPT_TF IPT domain of eukaryotic transcription factors NF-kappaB/Rel, nuclear factor of activated Tcells (NFAT), and recombination signal J-kappa binding protein (RBP-Jkappa). The IPT domains in these proteins are involved in DNA binding. Most NF-kappaB/Rel proteins form homo- and heterodimers, while NFAT proteins are largely monomeric (with TonEBP being an exception). While the majority of sequence-specific DNA binding elements are found in the N-terminal domain, several are found in the IPT domain in loops adjacent to, and including, the linker region.
Probab=29.33 E-value=2.1e+02 Score=20.18 Aligned_cols=72 Identities=11% Similarity=0.064 Sum_probs=46.0
Q ss_pred CCCCeEEEeecCCCCCCeEEEEEEeCCCCCCceeEEEecCCcccc-ccCccceeeecCCeeEEeCCccCCCCCeeEEEec
Q 041740 32 KYSPTLQQTQVGFGSPPTFMARVHNNCPMCPVINIHLKCGNFSQA-LVNPRLLKVISYNNCVVNSGFPLSPLQTFSFNYS 110 (126)
Q Consensus 32 ~~dI~V~Q~~tg~~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~s~-~VdP~~fr~~~~~~CLvn~G~pi~~g~~v~F~YA 110 (126)
+.||.|.=...+. |...|+..-.=.+. +||..|--|... --|+.+=+...-..-|++.-... ..++..|+|-
T Consensus 28 k~dikV~F~e~~~-g~~~WE~~~~f~~~-----dv~q~aiv~~tP~y~~~~i~~pV~V~i~L~r~~~~~-~S~~~~FtY~ 100 (101)
T cd00602 28 KPDIKVWFGEKGP-GETVWEAEAMFRQE-----DVRQVAIVFKTPPYHNKWITRPVQVPIQLVRPDDRK-RSEPLTFTYT 100 (101)
T ss_pred CCCCEEEEEecCC-CCCeEEEEEEECHH-----HceEeEEEecCCCcCCCCccccEEEEEEEEeCCCCe-ecCCcCeEEc
Confidence 4588865444333 78899999998887 445555555554 23666655444456778763333 3478999994
No 12
>PF14016 DUF4232: Protein of unknown function (DUF4232)
Probab=29.14 E-value=96 Score=21.93 Aligned_cols=70 Identities=10% Similarity=0.113 Sum_probs=44.0
Q ss_pred CCCCCCeEEEeecC-CCCCCeEEEEEEeCCC-CCCceeEEEecCCcccc-c-------cCccceeeecCCeeEEeCCccC
Q 041740 30 CSKYSPTLQQTQVG-FGSPPTFMARVHNNCP-MCPVINIHLKCGNFSQA-L-------VNPRLLKVISYNNCVVNSGFPL 99 (126)
Q Consensus 30 Cs~~dI~V~Q~~tg-~~g~p~~~VtI~N~C~-~C~~~~V~l~C~gF~s~-~-------VdP~~fr~~~~~~CLvn~G~pi 99 (126)
|..+|++++-+... ..|...+.|+++|+=. .|.+. ||..+ . +.+..-+. + + -..--.|
T Consensus 1 C~~~~L~~~~~~~~~~~g~~~~~l~~tN~s~~~C~l~-------G~P~v~~~~~~g~~~~~~~~~~-~-~---~~~~vtL 68 (131)
T PF14016_consen 1 CTAADLSVTVGPVDAGAGQRHATLTFTNTSDTPCTLY-------GYPGVALVDADGAPLGVPAVRE-G-P---PPRPVTL 68 (131)
T ss_pred CCcccEEEEEecccCCCCccEEEEEEEECCCCcEEec-------cCCcEEEECCCCCcCCcccccc-C-C---CCCcEEE
Confidence 88899999887653 4677899999999764 47663 44432 2 23333332 1 1 1222346
Q ss_pred CCCCeeEEEecC
Q 041740 100 SPLQTFSFNYSH 111 (126)
Q Consensus 100 ~~g~~v~F~YA~ 111 (126)
.+|++..|.=.|
T Consensus 69 ~PG~sA~a~l~~ 80 (131)
T PF14016_consen 69 APGGSAYAGLRW 80 (131)
T ss_pred CCCCEEEEEEEE
Confidence 788888887776
No 13
>TIGR01451 B_ant_repeat conserved repeat domain. This model represents the conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis, and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydial outer membrane proteins.
Probab=27.56 E-value=1.1e+02 Score=18.74 Aligned_cols=24 Identities=29% Similarity=0.450 Sum_probs=17.9
Q ss_pred CCCCeEEEEEEeCCCCCCceeEEEe
Q 041740 45 GSPPTFMARVHNNCPMCPVINIHLK 69 (126)
Q Consensus 45 ~g~p~~~VtI~N~C~~C~~~~V~l~ 69 (126)
+..-+|+++|.|+-. =+..+|.+.
T Consensus 11 Gd~v~Yti~v~N~g~-~~a~~v~v~ 34 (53)
T TIGR01451 11 GDTITYTITVTNNGN-VPATNVVVT 34 (53)
T ss_pred CCEEEEEEEEEECCC-CceEeEEEE
Confidence 346789999999987 556666654
No 14
>PF06483 ChiC: Chitinase C; InterPro: IPR009470 This ~170 aa region is found at the C-terminal to the catalytic domain (IPR001223 from INTERPRO) found in members of glycoside hydrolase family 18.
Probab=26.93 E-value=1.2e+02 Score=23.99 Aligned_cols=46 Identities=22% Similarity=0.385 Sum_probs=35.0
Q ss_pred CceeEEEecCCccc----cccCccceeeecCCeeEEeCCccCCCCCeeEEEecC-CCc
Q 041740 62 PVINIHLKCGNFSQ----ALVNPRLLKVISYNNCVVNSGFPLSPLQTFSFNYSH-PKY 114 (126)
Q Consensus 62 ~~~~V~l~C~gF~s----~~VdP~~fr~~~~~~CLvn~G~pi~~g~~v~F~YA~-~~f 114 (126)
..-||.+.=+||.- .||+|++--- =|.++.|+.|..++|.|+- .+-
T Consensus 34 ~~ldv~v~~~gf~~GD~NYPI~Pkl~iT-------Nns~~~iPGGt~~~FD~ptSa~~ 84 (180)
T PF06483_consen 34 EALDVSVSFTGFKLGDSNYPINPKLTIT-------NNSGQTIPGGTEFEFDYPTSAPD 84 (180)
T ss_pred ceEEEEEEeCCcccCCCCCCcCCcEEEE-------cCCCcccCCccEEEEccccCCcc
Confidence 34588889999984 2899986432 3678999999999999986 543
No 15
>PF04744 Monooxygenase_B: Monooxygenase subunit B protein; InterPro: IPR006833 Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related []. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules []. These enzymes are composed of 3 subunits - A (IPR003393 from INTERPRO), B (IPR006833 from INTERPRO) and C (IPR006980 from INTERPRO) - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain[]. The soluble regions of these enzymes derive primarily from the B subunit. This subunit forms two antiparallel beta-barrel-like structures and contains the mono- and di- nuclear copper metal centres [].; PDB: 3CHX_E 3RFR_A 3RGB_A 1YEW_A.
Probab=25.93 E-value=1.1e+02 Score=26.93 Aligned_cols=58 Identities=26% Similarity=0.402 Sum_probs=32.4
Q ss_pred CCCCeEEEEEEeCCCCCCceeEEEecCCcccc---ccCccceeeecC--------CeeEEeCCccCCCCCeeEEE
Q 041740 45 GSPPTFMARVHNNCPMCPVINIHLKCGNFSQA---LVNPRLLKVISY--------NNCVVNSGFPLSPLQTFSFN 108 (126)
Q Consensus 45 ~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~s~---~VdP~~fr~~~~--------~~CLvn~G~pi~~g~~v~F~ 108 (126)
+.--++.++|+|+=. =| | .=+.|+++ -+||.+.+-..+ +.=.|.+-.||.+|++-+++
T Consensus 262 gR~l~~~l~VtN~g~-~p---v--~LgeF~tA~vrFln~~v~~~~~~~P~~l~A~~gL~vs~~~pI~PGETrtl~ 330 (381)
T PF04744_consen 262 GRTLTMTLTVTNNGD-SP---V--RLGEFNTANVRFLNPDVPTDDPDYPDELLAERGLSVSDNSPIAPGETRTLT 330 (381)
T ss_dssp SSEEEEEEEEEEESS-S----B--EEEEEESSS-EEE-TTT-SS-S---TTTEETT-EEES--S-B-TT-EEEEE
T ss_pred CcEEEEEEEEEcCCC-Cc---e--EeeeEEeccEEEeCcccccCCCCCchhhhccCcceeCCCCCcCCCceEEEE
Confidence 346679999999875 33 3 44788886 379998864321 12346677799999996654
No 16
>PF03293 Pox_RNA_pol: Poxvirus DNA-directed RNA polymerase, 18 kD subunit; InterPro: IPR004973 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3'direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. 
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. The Poxvirus DNA-directed RNA polymerase (2.7.7.6 from EC) catalyses DNA-template-directed extension of the 3'-end of an RNA strand by one nucleotide at a time. The enzyme consists of at least eight subunits, this is the 18 kDa subunit.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0019083 viral transcription
Probab=23.96 E-value=1.3e+02 Score=23.12 Aligned_cols=48 Identities=15% Similarity=0.242 Sum_probs=29.4
Q ss_pred ceeEEEecCCcccc-ccCccceeeecCCeeEEeCCccCCCCCeeEEEec
Q 041740 63 VINIHLKCGNFSQA-LVNPRLLKVISYNNCVVNSGFPLSPLQTFSFNYS 110 (126)
Q Consensus 63 ~~~V~l~C~gF~s~-~VdP~~fr~~~~~~CLvn~G~pi~~g~~v~F~YA 110 (126)
.+||.+.|+..-=- .=|..-......--|++.||..-..|..|+-.--
T Consensus 94 ESni~V~CgDLiCkl~rdsGtVSf~dsKYCfirNg~vY~ngs~Vsv~Lk 142 (160)
T PF03293_consen 94 ESNITVQCGDLICKLSRDSGTVSFNDSKYCFIRNGVVYDNGSEVSVVLK 142 (160)
T ss_pred cCceEEEcCcEEEEeeccCCeEEecCceEEEEECCEEecCCCEEEEEeh
Confidence 46788888764321 1122222221112699999999999999876543
No 17
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=23.81 E-value=92 Score=19.97 Aligned_cols=24 Identities=25% Similarity=0.326 Sum_probs=15.7
Q ss_pred CCeEEEEEEeCCCCCCceeEEEecC
Q 041740 47 PPTFMARVHNNCPMCPVINIHLKCG 71 (126)
Q Consensus 47 ~p~~~VtI~N~C~~C~~~~V~l~C~ 71 (126)
.-+++++|.|... -+..++.|+-.
T Consensus 6 ~~~~~~tv~N~g~-~~~~~v~~~l~ 29 (78)
T PF10633_consen 6 TVTVTLTVTNTGT-APLTNVSLSLS 29 (78)
T ss_dssp EEEEEEEEE--SS-S-BSS-EEEEE
T ss_pred EEEEEEEEEECCC-CceeeEEEEEe
Confidence 3469999999997 77788888763
No 18
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9, human SET7 (also called SET9), which methylates H3 on lysine 4 and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=22.19 E-value=1.2e+02 Score=20.32 Aligned_cols=22 Identities=14% Similarity=0.058 Sum_probs=16.8
Q ss_pred CCeeEEeCCccCCCCCeeEEEe
Q 041740 88 YNNCVVNSGFPLSPLQTFSFNY 109 (126)
Q Consensus 88 ~~~CLvn~G~pi~~g~~v~F~Y 109 (126)
.+...+...++|.+|+.|...|
T Consensus 140 ~~~~~~~a~r~I~~GeEi~isY 161 (162)
T PF00856_consen 140 GGCLVVRATRDIKKGEEIFISY 161 (162)
T ss_dssp TTEEEEEESS-B-TTSBEEEES
T ss_pred cceEEEEECCccCCCCEEEEEE
Confidence 3556788889999999999998
No 19
>PF08626 TRAPPC9-Trs120: Transport protein Trs120 or TRAPPC9, TRAPP II complex subunit; InterPro: IPR013935 The trafficking protein particle complex TRAPP is a multi-protein complex needed in the early stages of the secretory pathway. To date, two kinds of TRAPP complexes have been studied, TRAPP I and TRAPP II. These complexes differ in subunit composition []. TRAPP I binds vesicles derived from the endoplasmic reticulum bringing them closer to the acceptor membrane. Trs120 is a subunit specific to the TRAPP II complex [] along with Trs65p and Trs130p (TRAPPC10). It is suggested that Trs120p is required for the stability of the Trs130p subunit, suggesting that these two proteins might interact in some way []. It is likely that there is a complex function for TRAPP II in multiple pathways [].
Probab=21.81 E-value=1.7e+02 Score=28.77 Aligned_cols=63 Identities=21% Similarity=0.337 Sum_probs=40.9
Q ss_pred CC-CCeEEEEEEeCCCCCCceeEEEecCC-----ccccc------------cCccceeeecCCeeEEeCCccCCCCCeeE
Q 041740 45 GS-PPTFMARVHNNCPMCPVINIHLKCGN-----FSQAL------------VNPRLLKVISYNNCVVNSGFPLSPLQTFS 106 (126)
Q Consensus 45 ~g-~p~~~VtI~N~C~~C~~~~V~l~C~g-----F~s~~------------VdP~~fr~~~~~~CLvn~G~pi~~g~~v~ 106 (126)
.| .-+|+||+.|.-. ||+..+++.-.. ++.+. ++-.++++- ..-+.|.. +|.||++++
T Consensus 797 eGE~~~~~ItL~N~S~-~pvd~l~~sf~DS~~~~~~~~l~~k~l~~~e~yelE~~l~~~~--~~~i~~~~-~I~Pg~~~~ 872 (1185)
T PF08626_consen 797 EGEKQTFTITLRNTSS-VPVDFLSFSFQDSTIEPLQKALSNKDLSPDELYELEWQLFKLP--AFRILNKP-PIPPGESAT 872 (1185)
T ss_pred CCcEEEEEEEEEECCc-cccceEEEEEEeccHHHHhhhhhcccCChhhhhhhhhhhhcCc--ceeecccC-ccCCCCEEE
Confidence 45 7889999999997 999999999752 11110 111122221 12344545 999999999
Q ss_pred EEecC
Q 041740 107 FNYSH 111 (126)
Q Consensus 107 F~YA~ 111 (126)
|++--
T Consensus 873 ~~~~~ 877 (1185)
T PF08626_consen 873 FTVEV 877 (1185)
T ss_pred EEEEe
Confidence 98774
No 20
>PF01345 DUF11: Domain of unknown function DUF11; InterPro: IPR001434 This group of sequences is represented by a conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis (Streptococcus faecalis), and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydia trachomatis outer membrane proteins. In C. trachomatis, three cysteine-rich proteins (also believed to be lipoproteins), MOMP, OMP6 and OMP3, make up the extracellular matrix of the outer membrane []. They are involved in the essential structural integrity of both the elementary body (EB) and reticulate body (RB) phase. They are thought to be involved in porin formation and, as these bacteria lack the peptidoglycan layer common to most Gram-negative microbes, such proteins are highly important in the pathogenicity of the organism.; GO: 0005727 extrachromosomal circular DNA
Probab=21.24 E-value=1.5e+02 Score=18.69 Aligned_cols=25 Identities=28% Similarity=0.476 Sum_probs=20.4
Q ss_pred CCCCeEEEEEEeCCCCCCceeEEEec
Q 041740 45 GSPPTFMARVHNNCPMCPVINIHLKC 70 (126)
Q Consensus 45 ~g~p~~~VtI~N~C~~C~~~~V~l~C 70 (126)
+..-+|.++|.|.=. -+..||.|.-
T Consensus 40 Gd~v~ytitvtN~G~-~~a~nv~v~D 64 (76)
T PF01345_consen 40 GDTVTYTITVTNTGP-APATNVVVTD 64 (76)
T ss_pred CCEEEEEEEEEECCC-CeeEeEEEEE
Confidence 346789999999988 7788888764
No 21
>PF11906 DUF3426: Protein of unknown function (DUF3426); InterPro: IPR021834 This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 262 to 463 amino acids in length.
Probab=20.87 E-value=2.7e+02 Score=19.81 Aligned_cols=63 Identities=10% Similarity=0.084 Sum_probs=39.7
Q ss_pred CCCCeEEEEEEeCCC-CCCceeEEEecCCccccccCccceeeecCCeeEEeC---CccCCCCCeeEEEec
Q 041740 45 GSPPTFMARVHNNCP-MCPVINIHLKCGNFSQALVNPRLLKVISYNNCVVNS---GFPLSPLQTFSFNYS 110 (126)
Q Consensus 45 ~g~p~~~VtI~N~C~-~C~~~~V~l~C~gF~s~~VdP~~fr~~~~~~CLvn~---G~pi~~g~~v~F~YA 110 (126)
.+.-+.+.+|.|+=. .=+.-.|++.=.+=+..+|.-++|+. .++|..+ ...|++|+++.|+-.
T Consensus 67 ~~~l~v~g~i~N~~~~~~~~P~l~l~L~D~~g~~l~~r~~~P---~~yl~~~~~~~~~l~pg~~~~~~~~ 133 (149)
T PF11906_consen 67 PGVLVVSGTIRNRADFPQALPALELSLLDAQGQPLARRVFTP---ADYLPPGLAAQAGLPPGESVPFRLR 133 (149)
T ss_pred CCEEEEEEEEEeCCCCcccCceEEEEEECCCCCEEEEEEECh---HHhcccccccccccCCCCeEEEEEE
Confidence 345566669999986 24444555554443333555666654 4566654 678999999888643
Done!