Query         041740
Match_columns 126
No_of_seqs    102 out of 136
Neff          5.1 
Searched_HMMs 46136
Date          Fri Mar 29 10:09:28 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/041740.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/041740hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF09478 CBM49:  Carbohydrate b  97.1  0.0045 9.7E-08   41.6   7.5   70   35-110     1-78  (80)
  2 PLN02171 endoglucanase          91.0    0.95 2.1E-05   41.5   7.4   81   33-120   535-625 (629)
  3 PF02933 CDC48_2:  Cell divisio  72.5     3.1 6.7E-05   26.4   2.0   28   95-122    15-42  (64)
  4 PF03330 DPBB_1:  Rare lipoprot  68.2       5 0.00011   26.2   2.3   37   48-84     38-75  (78)
  5 PF06682 DUF1183:  Protein of u  54.5      28 0.00061   29.6   5.0   23   61-84     91-113 (318)
  6 PLN02340 endoglucanase          47.9      15 0.00032   33.8   2.4   75   32-111   519-601 (614)
  7 PF15281 Consortin_C:  Consorti  38.2      20 0.00042   26.4   1.4   29    8-36     54-82  (113)
  8 PRK15249 fimbrial chaperone pr  35.2      43 0.00094   27.0   3.1   91    4-104     7-113 (253)
  9 PF07127 Nodulin_late:  Late no  34.2      56  0.0012   20.2   2.9    8    3-10      1-8   (54)
 10 PF05304 DUF728:  Protein of un  30.4      17 0.00037   26.0   0.0   28   51-85     33-60  (103)
 11 cd00602 IPT_TF IPT domain of e  29.3 2.1E+02  0.0045   20.2   5.9   72   32-110    28-100 (101)
 12 PF14016 DUF4232:  Protein of u  29.1      96  0.0021   21.9   3.8   70   30-111     1-80  (131)
 13 TIGR01451 B_ant_repeat conserv  27.6 1.1E+02  0.0023   18.7   3.3   24   45-69     11-34  (53)
 14 PF06483 ChiC:  Chitinase C;  I  26.9 1.2E+02  0.0026   24.0   4.1   46   62-114    34-84  (180)
 15 PF04744 Monooxygenase_B:  Mono  25.9 1.1E+02  0.0023   26.9   4.0   58   45-108   262-330 (381)
 16 PF03293 Pox_RNA_pol:  Poxvirus  24.0 1.3E+02  0.0029   23.1   3.8   48   63-110    94-142 (160)
 17 PF10633 NPCBM_assoc:  NPCBM-as  23.8      92   0.002   20.0   2.6   24   47-71      6-29  (78)
 18 PF00856 SET:  SET domain;  Int  22.2 1.2E+02  0.0026   20.3   3.1   22   88-109   140-161 (162)
 19 PF08626 TRAPPC9-Trs120:  Trans  21.8 1.7E+02  0.0036   28.8   5.0   63   45-111   797-877 (1185)
 20 PF01345 DUF11:  Domain of unkn  21.2 1.5E+02  0.0033   18.7   3.3   25   45-70     40-64  (76)
 21 PF11906 DUF3426:  Protein of u  20.9 2.7E+02  0.0059   19.8   4.9   63   45-110    67-133 (149)

No 1  
>PF09478 CBM49:  Carbohydrate binding domain CBM49;  InterPro: IPR019028 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see [].  This domain is found at the C terminus of cellulases, and in vitro binding studies have shown that it binds to crystalline cellulose []. ; GO: 0030246 carbohydrate binding, 0005576 extracellular region
Probab=97.06  E-value=0.0045  Score=41.65  Aligned_cols=70  Identities=21%  Similarity=0.299  Sum_probs=54.1

Q ss_pred             CeEEEeecCC---CC--CCeEEEEEEeCCCCCCceeEEEecCCccc-c-ccCccceeeecCCeeEE-eCCccCCCCCeeE
Q 041740           35 PTLQQTQVGF---GS--PPTFMARVHNNCPMCPVINIHLKCGNFSQ-A-LVNPRLLKVISYNNCVV-NSGFPLSPLQTFS  106 (126)
Q Consensus        35 I~V~Q~~tg~---~g--~p~~~VtI~N~C~~C~~~~V~l~C~gF~s-~-~VdP~~fr~~~~~~CLv-n~G~pi~~g~~v~  106 (126)
                      |+|.|..++.   +|  ..+|.|+|+|++. =+++++++.-..+.+ . .+|.     .+++..-+ +.-.+|.+|++.+
T Consensus         1 i~i~q~~~~sW~~~g~~y~qy~v~I~N~~~-~~I~~~~i~~~~l~~~iW~l~~-----~~~~~y~lPs~~~~i~pg~s~~   74 (80)
T PF09478_consen    1 ITITQTLVNSWTENGQTYTQYDVTITNNGS-KPIKSLKISIDNLYGSIWGLDK-----VSGNTYTLPSYQPTIKPGQSFT   74 (80)
T ss_pred             CEEEEEEEeEEEeCCEEEEEEEEEEEECCC-CeEEEEEEEECccchhheeEEe-----ccCCEEECCccccccCCCCEEE
Confidence            6889998864   44  4679999999998 899999999997753 3 4443     45566766 3344999999999


Q ss_pred             EEec
Q 041740          107 FNYS  110 (126)
Q Consensus       107 F~YA  110 (126)
                      |-|-
T Consensus        75 FGYI   78 (80)
T PF09478_consen   75 FGYI   78 (80)
T ss_pred             EEEE
Confidence            9984


No 2  
>PLN02171 endoglucanase
Probab=91.01  E-value=0.95  Score=41.46  Aligned_cols=81  Identities=15%  Similarity=0.139  Sum_probs=55.6

Q ss_pred             CCCeEEEeecCC-----CCCCeEEEEEEeCCCCCCceeEEEecCCcc-cc-ccCccceeeecCCeeEEeCCc-cCCCCCe
Q 041740           33 YSPTLQQTQVGF-----GSPPTFMARVHNNCPMCPVINIHLKCGNFS-QA-LVNPRLLKVISYNNCVVNSGF-PLSPLQT  104 (126)
Q Consensus        33 ~dI~V~Q~~tg~-----~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~-s~-~VdP~~fr~~~~~~CLvn~G~-pi~~g~~  104 (126)
                      +.|+|+|..++.     .+..+|+|+|+|++. .+++++++.=..+- +. .|.    + .+ +...+-+-. .|.+|++
T Consensus       535 ~ei~i~q~v~~sW~~~g~~y~qy~v~I~N~s~-~~ik~i~i~~~~~~~~iW~v~----~-~~-ngytlPs~~~sL~aG~s  607 (629)
T PLN02171        535 SPIEIEQKATASWKAKGRTYYRYSTTVTNRSA-KTLKELHLGISKLYGPLWGLT----K-AG-YGYVLPSWMPSLPAGKS  607 (629)
T ss_pred             ceeEEEEEEEEEEEcCCceEEEEEEEEEECCC-Cceeeeeeeeccccccchhee----e-cC-CcccCchhhcccCCCCe
Confidence            368899988863     348889999999999 99999999754443 33 342    1 12 223333332 7889999


Q ss_pred             eEEEecC--CCceeeeee
Q 041740          105 FSFNYSH--PKYVMQPAT  120 (126)
Q Consensus       105 v~F~YA~--~~f~l~p~s  120 (126)
                      .+|-|=+  .+..+.+.+
T Consensus       608 ~tFgyI~~~~pA~~~v~~  625 (629)
T PLN02171        608 LEFVYVHSASPADVWVSG  625 (629)
T ss_pred             eEEEeecCCCCceEEEEE
Confidence            9999986  344455543


No 3  
>PF02933 CDC48_2:  Cell division protein 48 (CDC48), domain 2;  InterPro: IPR004201 This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C terminus. The VAT-N domain found in AAA ATPases (IPR003959 from INTERPRO) is a 185-residue substrate recognition domain [].; GO: 0005524 ATP binding; PDB: 1QDN_B 1QCS_A 1CR5_C 3QQ8_A 3HU2_A 3HU1_E 3HU3_A 3QWZ_A 3TIW_B 3QQ7_A ....
Probab=72.55  E-value=3.1  Score=26.42  Aligned_cols=28  Identities=14%  Similarity=0.205  Sum_probs=23.5

Q ss_pred             CCccCCCCCeeEEEecCCCceeeeeeee
Q 041740           95 SGFPLSPLQTFSFNYSHPKYVMQPATWS  122 (126)
Q Consensus        95 ~G~pi~~g~~v~F~YA~~~f~l~p~ss~  122 (126)
                      .|+|+..|+.|.|.+....++|.+++.+
T Consensus        15 ~~~pv~~Gd~i~~~~~~~~~~~~V~~~~   42 (64)
T PF02933_consen   15 EGRPVTKGDTIVFPFFGQALPFKVVSTE   42 (64)
T ss_dssp             TTEEEETT-EEEEEETTEEEEEEEEEEC
T ss_pred             cCCCccCCCEEEEEeCCcEEEEEEEEEE
Confidence            4699999999999998888999988764


No 4  
>PF03330 DPBB_1:  Rare lipoprotein A (RlpA)-like double-psi beta-barrel;  InterPro: IPR009009  Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair [].; PDB: 1N10_B 3D30_A 2BH0_A 2HCZ_X.
Probab=68.21  E-value=5  Score=26.21  Aligned_cols=37  Identities=19%  Similarity=0.409  Sum_probs=27.4

Q ss_pred             CeEEEEEEeCCCCCCceeEEEecCCcccc-ccCcccee
Q 041740           48 PTFMARVHNNCPMCPVINIHLKCGNFSQA-LVNPRLLK   84 (126)
Q Consensus        48 p~~~VtI~N~C~~C~~~~V~l~C~gF~s~-~VdP~~fr   84 (126)
                      -.-.|+|+++|+.|+-.++-|+=..|... ..|..++.
T Consensus        38 ksV~v~V~D~Cp~~~~~~lDLS~~aF~~la~~~~G~i~   75 (78)
T PF03330_consen   38 KSVTVTVVDRCPGCPPNHLDLSPAAFKALADPDAGVIP   75 (78)
T ss_dssp             CEEEEEEEEE-TTSSSSEEEEEHHHHHHTBSTTCSSEE
T ss_pred             CeEEEEEEccCCCCcCCEEEeCHHHHHHhCCCCceEEE
Confidence            56779999999989999999987777764 44555444


No 5  
>PF06682 DUF1183:  Protein of unknown function (DUF1183);  InterPro: IPR009567 This family consists of several eukaryotic proteins of around 360 residues in length. The function of this family is unknown.
Probab=54.50  E-value=28  Score=29.56  Aligned_cols=23  Identities=17%  Similarity=0.394  Sum_probs=18.8

Q ss_pred             CCceeEEEecCCccccccCcccee
Q 041740           61 CPVINIHLKCGNFSQALVNPRLLK   84 (126)
Q Consensus        61 C~~~~V~l~C~gF~s~~VdP~~fr   84 (126)
                      =....|.|.|.|+.... ||=|||
T Consensus        91 ~klG~~~V~CEGY~~pd-DpyvLk  113 (318)
T PF06682_consen   91 YKLGSTDVSCEGYDYPD-DPYVLK  113 (318)
T ss_pred             eeecceEEeeecccCCC-CceecC
Confidence            34567899999999854 999998


No 6  
>PLN02340 endoglucanase
Probab=47.95  E-value=15  Score=33.81  Aligned_cols=75  Identities=17%  Similarity=0.164  Sum_probs=51.4

Q ss_pred             CCCCeEEEeecCC-----CCCCeEEEEEEeCCCCCCceeEEEecCCccc-c-ccCccceeeecCCeeEEeCC-ccCCCCC
Q 041740           32 KYSPTLQQTQVGF-----GSPPTFMARVHNNCPMCPVINIHLKCGNFSQ-A-LVNPRLLKVISYNNCVVNSG-FPLSPLQ  103 (126)
Q Consensus        32 ~~dI~V~Q~~tg~-----~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~s-~-~VdP~~fr~~~~~~CLvn~G-~pi~~g~  103 (126)
                      .+++.+.|.-+..     ...-+|+|+|+|++. =|.+.+++.=..+-. . .|.|.+=+    +...+-+= ..|.+|+
T Consensus       519 ~~~~e~~~~~~~sw~~~g~~y~~~~v~i~N~s~-~pi~~l~~~~~~l~g~lwgl~~~~~~----~~y~~p~~~~tl~~g~  593 (614)
T PLN02340        519 GAPVEFVHSITNTWTAGGTTYYRHKVIIKNKSQ-KPITDLKLVIEDLSGPIWGLNPTKEK----NTYELPQWQKVLQPGS  593 (614)
T ss_pred             CCchhhhhhheeeeecCCceEEEEEEEEEeCCC-CCchhhhhhhhhcccchhcceecccc----CCccCchhhhccCCCC
Confidence            4455676666542     347889999999999 899999988766554 3 55554322    22333333 4788899


Q ss_pred             eeEEEecC
Q 041740          104 TFSFNYSH  111 (126)
Q Consensus       104 ~v~F~YA~  111 (126)
                      .++|.|-.
T Consensus       594 ~~~f~yi~  601 (614)
T PLN02340        594 QLSFVYVQ  601 (614)
T ss_pred             eeEEEecc
Confidence            99999985


No 7  
>PF15281 Consortin_C:  Consortin C-terminus
Probab=38.23  E-value=20  Score=26.35  Aligned_cols=29  Identities=28%  Similarity=0.246  Sum_probs=20.3

Q ss_pred             HHHHHHHHHHHhhhhcccCCCCCCCCCCe
Q 041740            8 KLLLWCSCLTFASLLDQGKGEKCSKYSPT   36 (126)
Q Consensus         8 kll~~~~~l~l~~~~~~G~~~~Cs~~dI~   36 (126)
                      .++++++|++.+.|..-|.+-.|+..|..
T Consensus        54 cl~L~LlclvTv~lS~gGTALYCt~gd~~   82 (113)
T PF15281_consen   54 CLLLLLLCLVTVVLSVGGTALYCTFGDME   82 (113)
T ss_pred             cHHHHHHHHHHHHHhccceEEEEecCCcc
Confidence            45556666666667778888789988764


No 8  
>PRK15249 fimbrial chaperone protein StbB; Provisional
Probab=35.18  E-value=43  Score=27.02  Aligned_cols=91  Identities=14%  Similarity=0.180  Sum_probs=48.4

Q ss_pred             hhhHHHHHHHHHHHHhhhhcccCCCCCCCCCCeEEEeecCC-CCCCeEEEEEEeCCCCCCceeEEEecCCcc--------
Q 041740            4 HKAFKLLLWCSCLTFASLLDQGKGEKCSKYSPTLQQTQVGF-GSPPTFMARVHNNCPMCPVINIHLKCGNFS--------   74 (126)
Q Consensus         4 ~~~~kll~~~~~l~l~~~~~~G~~~~Cs~~dI~V~Q~~tg~-~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~--------   74 (126)
                      |+++|+|++.+++++.+.        +....|.|..++.-- ++..+=+|+|.|+=. =+ .-|..+=..-.        
T Consensus         7 ~~~~~~~~~~~~~~~~~~--------~a~A~l~l~~TRviy~~~~~~~sl~l~N~~~-~p-~LvQsWv~~~~~~~~p~~~   76 (253)
T PRK15249          7 HSALYYLIVFLFLALPAT--------ASWASVTILGSRIIYPSTASSVDVQLKNNDA-IP-YIVQTWFDDGDMNTSPENS   76 (253)
T ss_pred             hhHHHHHHHHHHHHhhhH--------hheeEEEeCceEEEEeCCCcceeEEEEcCCC-Cc-EEEEEEEeCCCCCCCcccc
Confidence            678898877654433221        223457777766532 466777888888654 11 12221111111        


Q ss_pred             -cc--ccCccceeeecCCeeE---EeCC-ccCCCCCe
Q 041740           75 -QA--LVNPRLLKVISYNNCV---VNSG-FPLSPLQT  104 (126)
Q Consensus        75 -s~--~VdP~~fr~~~~~~CL---vn~G-~pi~~g~~  104 (126)
                       +.  .|-|-+||..+++.=.   +..| .+++....
T Consensus        77 ~~~pFivtPPlfrl~p~~~q~lRI~~~~~~~lP~DRE  113 (253)
T PRK15249         77 SAMPFIATPPVFRIQPKAGQVVRVIYNNTKKLPQDRE  113 (253)
T ss_pred             ccCcEEEcCCeEEecCCCceEEEEEEcCCCCCCCCce
Confidence             12  4788899987665332   3344 36665544


No 9  
>PF07127 Nodulin_late:  Late nodulin protein;  InterPro: IPR009810 This family consists of several plant-specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the protein's C terminus which may be involved in metal-binding [].; GO: 0046872 metal ion binding, 0009878 nodule morphogenesis
Probab=34.20  E-value=56  Score=20.16  Aligned_cols=8  Identities=25%  Similarity=0.384  Sum_probs=6.1

Q ss_pred             chhhHHHH
Q 041740            3 SHKAFKLL   10 (126)
Q Consensus         3 ~~~~~kll   10 (126)
                      |++++|+.
T Consensus         1 Ma~ilKFv    8 (54)
T PF07127_consen    1 MAKILKFV    8 (54)
T ss_pred             CccchhhH
Confidence            77888864


No 10 
>PF05304 DUF728:  Protein of unknown function (DUF728);  InterPro: IPR007968 This entry is represented by the Tobacco rattle virus, 16kDa protein; it is a family of uncharacterised viral proteins.
Probab=30.37  E-value=17  Score=26.03  Aligned_cols=28  Identities=32%  Similarity=0.636  Sum_probs=21.5

Q ss_pred             EEEEEeCCCCCCceeEEEecCCccccccCccceee
Q 041740           51 MARVHNNCPMCPVINIHLKCGNFSQALVNPRLLKV   85 (126)
Q Consensus        51 ~VtI~N~C~~C~~~~V~l~C~gF~s~~VdP~~fr~   85 (126)
                      .|.|+-+|. |      .+||||.-+.|+|.-|.+
T Consensus        33 ~~~v~RkC~-~------~NCGWf~~i~v~~~~~eV   60 (103)
T PF05304_consen   33 KVGVKRKCE-C------NNCGWFPAISVNDDTFEV   60 (103)
T ss_pred             HhChhhhhh-c------cCCCceEEEEEeccEEee
Confidence            455666776 5      389999998889988876


No 11 
>cd00602 IPT_TF IPT domain of eukaryotic transcription factors NF-kappaB/Rel, nuclear factor of activated T cells (NFAT), and recombination signal J-kappa binding protein (RBP-Jkappa). The IPT domains in these proteins are involved in DNA binding. Most NF-kappaB/Rel proteins form homo- and heterodimers, while NFAT proteins are largely monomeric (with TonEBP being an exception). While the majority of sequence-specific DNA binding elements are found in the N-terminal domain, several are found in the IPT domain in loops adjacent to, and including, the linker region.
Probab=29.33  E-value=2.1e+02  Score=20.18  Aligned_cols=72  Identities=11%  Similarity=0.064  Sum_probs=46.0

Q ss_pred             CCCCeEEEeecCCCCCCeEEEEEEeCCCCCCceeEEEecCCcccc-ccCccceeeecCCeeEEeCCccCCCCCeeEEEec
Q 041740           32 KYSPTLQQTQVGFGSPPTFMARVHNNCPMCPVINIHLKCGNFSQA-LVNPRLLKVISYNNCVVNSGFPLSPLQTFSFNYS  110 (126)
Q Consensus        32 ~~dI~V~Q~~tg~~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~s~-~VdP~~fr~~~~~~CLvn~G~pi~~g~~v~F~YA  110 (126)
                      +.||.|.=...+. |...|+..-.=.+.     +||..|--|... --|+.+=+...-..-|++.-... ..++..|+|-
T Consensus        28 k~dikV~F~e~~~-g~~~WE~~~~f~~~-----dv~q~aiv~~tP~y~~~~i~~pV~V~i~L~r~~~~~-~S~~~~FtY~  100 (101)
T cd00602          28 KPDIKVWFGEKGP-GETVWEAEAMFRQE-----DVRQVAIVFKTPPYHNKWITRPVQVPIQLVRPDDRK-RSEPLTFTYT  100 (101)
T ss_pred             CCCCEEEEEecCC-CCCeEEEEEEECHH-----HceEeEEEecCCCcCCCCccccEEEEEEEEeCCCCe-ecCCcCeEEc
Confidence            4588865444333 78899999998887     445555555554 23666655444456778763333 3478999994


No 12 
>PF14016 DUF4232:  Protein of unknown function (DUF4232)
Probab=29.14  E-value=96  Score=21.93  Aligned_cols=70  Identities=10%  Similarity=0.113  Sum_probs=44.0

Q ss_pred             CCCCCCeEEEeecC-CCCCCeEEEEEEeCCC-CCCceeEEEecCCcccc-c-------cCccceeeecCCeeEEeCCccC
Q 041740           30 CSKYSPTLQQTQVG-FGSPPTFMARVHNNCP-MCPVINIHLKCGNFSQA-L-------VNPRLLKVISYNNCVVNSGFPL   99 (126)
Q Consensus        30 Cs~~dI~V~Q~~tg-~~g~p~~~VtI~N~C~-~C~~~~V~l~C~gF~s~-~-------VdP~~fr~~~~~~CLvn~G~pi   99 (126)
                      |..+|++++-+... ..|...+.|+++|+=. .|.+.       ||..+ .       +.+..-+. + +   -..--.|
T Consensus         1 C~~~~L~~~~~~~~~~~g~~~~~l~~tN~s~~~C~l~-------G~P~v~~~~~~g~~~~~~~~~~-~-~---~~~~vtL   68 (131)
T PF14016_consen    1 CTAADLSVTVGPVDAGAGQRHATLTFTNTSDTPCTLY-------GYPGVALVDADGAPLGVPAVRE-G-P---PPRPVTL   68 (131)
T ss_pred             CCcccEEEEEecccCCCCccEEEEEEEECCCCcEEec-------cCCcEEEECCCCCcCCcccccc-C-C---CCCcEEE
Confidence            88899999887653 4677899999999764 47663       44432 2       23333332 1 1   1222346


Q ss_pred             CCCCeeEEEecC
Q 041740          100 SPLQTFSFNYSH  111 (126)
Q Consensus       100 ~~g~~v~F~YA~  111 (126)
                      .+|++..|.=.|
T Consensus        69 ~PG~sA~a~l~~   80 (131)
T PF14016_consen   69 APGGSAYAGLRW   80 (131)
T ss_pred             CCCCEEEEEEEE
Confidence            788888887776


No 13 
>TIGR01451 B_ant_repeat conserved repeat domain. This model represents the conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis, and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydial outer membrane proteins.
Probab=27.56  E-value=1.1e+02  Score=18.74  Aligned_cols=24  Identities=29%  Similarity=0.450  Sum_probs=17.9

Q ss_pred             CCCCeEEEEEEeCCCCCCceeEEEe
Q 041740           45 GSPPTFMARVHNNCPMCPVINIHLK   69 (126)
Q Consensus        45 ~g~p~~~VtI~N~C~~C~~~~V~l~   69 (126)
                      +..-+|+++|.|+-. =+..+|.+.
T Consensus        11 Gd~v~Yti~v~N~g~-~~a~~v~v~   34 (53)
T TIGR01451        11 GDTITYTITVTNNGN-VPATNVVVT   34 (53)
T ss_pred             CCEEEEEEEEEECCC-CceEeEEEE
Confidence            346789999999987 556666654


No 14 
>PF06483 ChiC:  Chitinase C;  InterPro: IPR009470 This ~170 aa region is found C-terminal to the catalytic domain (IPR001223 from INTERPRO) in members of glycoside hydrolase family 18.
Probab=26.93  E-value=1.2e+02  Score=23.99  Aligned_cols=46  Identities=22%  Similarity=0.385  Sum_probs=35.0

Q ss_pred             CceeEEEecCCccc----cccCccceeeecCCeeEEeCCccCCCCCeeEEEecC-CCc
Q 041740           62 PVINIHLKCGNFSQ----ALVNPRLLKVISYNNCVVNSGFPLSPLQTFSFNYSH-PKY  114 (126)
Q Consensus        62 ~~~~V~l~C~gF~s----~~VdP~~fr~~~~~~CLvn~G~pi~~g~~v~F~YA~-~~f  114 (126)
                      ..-||.+.=+||.-    .||+|++---       =|.++.|+.|..++|.|+- .+-
T Consensus        34 ~~ldv~v~~~gf~~GD~NYPI~Pkl~iT-------Nns~~~iPGGt~~~FD~ptSa~~   84 (180)
T PF06483_consen   34 EALDVSVSFTGFKLGDSNYPINPKLTIT-------NNSGQTIPGGTEFEFDYPTSAPD   84 (180)
T ss_pred             ceEEEEEEeCCcccCCCCCCcCCcEEEE-------cCCCcccCCccEEEEccccCCcc
Confidence            34588889999984    2899986432       3678999999999999986 543


No 15 
>PF04744 Monooxygenase_B:  Monooxygenase subunit B protein;  InterPro: IPR006833 Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related []. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules []. These enzymes are composed of 3 subunits - A (IPR003393 from INTERPRO), B (IPR006833 from INTERPRO) and C (IPR006980 from INTERPRO) - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain []. The soluble regions of these enzymes derive primarily from the B subunit. This subunit forms two antiparallel beta-barrel-like structures and contains the mono- and dinuclear copper metal centres [].; PDB: 3CHX_E 3RFR_A 3RGB_A 1YEW_A.
Probab=25.93  E-value=1.1e+02  Score=26.93  Aligned_cols=58  Identities=26%  Similarity=0.402  Sum_probs=32.4

Q ss_pred             CCCCeEEEEEEeCCCCCCceeEEEecCCcccc---ccCccceeeecC--------CeeEEeCCccCCCCCeeEEE
Q 041740           45 GSPPTFMARVHNNCPMCPVINIHLKCGNFSQA---LVNPRLLKVISY--------NNCVVNSGFPLSPLQTFSFN  108 (126)
Q Consensus        45 ~g~p~~~VtI~N~C~~C~~~~V~l~C~gF~s~---~VdP~~fr~~~~--------~~CLvn~G~pi~~g~~v~F~  108 (126)
                      +.--++.++|+|+=. =|   |  .=+.|+++   -+||.+.+-..+        +.=.|.+-.||.+|++-+++
T Consensus       262 gR~l~~~l~VtN~g~-~p---v--~LgeF~tA~vrFln~~v~~~~~~~P~~l~A~~gL~vs~~~pI~PGETrtl~  330 (381)
T PF04744_consen  262 GRTLTMTLTVTNNGD-SP---V--RLGEFNTANVRFLNPDVPTDDPDYPDELLAERGLSVSDNSPIAPGETRTLT  330 (381)
T ss_dssp             SSEEEEEEEEEEESS-S----B--EEEEEESSS-EEE-TTT-SS-S---TTTEETT-EEES--S-B-TT-EEEEE
T ss_pred             CcEEEEEEEEEcCCC-Cc---e--EeeeEEeccEEEeCcccccCCCCCchhhhccCcceeCCCCCcCCCceEEEE
Confidence            346679999999875 33   3  44788886   379998864321        12346677799999996654


No 16 
>PF03293 Pox_RNA_pol:  Poxvirus DNA-directed RNA polymerase, 18 kD subunit;  InterPro: IPR004973 DNA-directed RNA polymerases (2.7.7.6 from EC; also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:  RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors.  RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.   
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. The Poxvirus DNA-directed RNA polymerase (2.7.7.6 from EC) catalyses DNA-template-directed extension of the 3'-end of an RNA strand by one nucleotide at a time. The enzyme consists of at least eight subunits; this is the 18 kDa subunit.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0019083 viral transcription
Probab=23.96  E-value=1.3e+02  Score=23.12  Aligned_cols=48  Identities=15%  Similarity=0.242  Sum_probs=29.4

Q ss_pred             ceeEEEecCCcccc-ccCccceeeecCCeeEEeCCccCCCCCeeEEEec
Q 041740           63 VINIHLKCGNFSQA-LVNPRLLKVISYNNCVVNSGFPLSPLQTFSFNYS  110 (126)
Q Consensus        63 ~~~V~l~C~gF~s~-~VdP~~fr~~~~~~CLvn~G~pi~~g~~v~F~YA  110 (126)
                      .+||.+.|+..-=- .=|..-......--|++.||..-..|..|+-.--
T Consensus        94 ESni~V~CgDLiCkl~rdsGtVSf~dsKYCfirNg~vY~ngs~Vsv~Lk  142 (160)
T PF03293_consen   94 ESNITVQCGDLICKLSRDSGTVSFNDSKYCFIRNGVVYDNGSEVSVVLK  142 (160)
T ss_pred             cCceEEEcCcEEEEeeccCCeEEecCceEEEEECCEEecCCCEEEEEeh
Confidence            46788888764321 1122222221112699999999999999876543


No 17 
>PF10633 NPCBM_assoc:  NPCBM-associated, NEW3 domain of alpha-galactosidase;  InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=23.81  E-value=92  Score=19.97  Aligned_cols=24  Identities=25%  Similarity=0.326  Sum_probs=15.7

Q ss_pred             CCeEEEEEEeCCCCCCceeEEEecC
Q 041740           47 PPTFMARVHNNCPMCPVINIHLKCG   71 (126)
Q Consensus        47 ~p~~~VtI~N~C~~C~~~~V~l~C~   71 (126)
                      .-+++++|.|... -+..++.|+-.
T Consensus         6 ~~~~~~tv~N~g~-~~~~~v~~~l~   29 (78)
T PF10633_consen    6 TVTVTLTVTNTGT-APLTNVSLSLS   29 (78)
T ss_dssp             EEEEEEEEE--SS-S-BSS-EEEEE
T ss_pred             EEEEEEEEEECCC-CceeeEEEEEe
Confidence            3469999999997 77788888763


No 18 
>PF00856 SET:  SET domain;  InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although electron density maps revealed the location of the AdoMet or AdoHcy cofactor in all three studies, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities [].  The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=22.19  E-value=1.2e+02  Score=20.32  Aligned_cols=22  Identities=14%  Similarity=0.058  Sum_probs=16.8

Q ss_pred             CCeeEEeCCccCCCCCeeEEEe
Q 041740           88 YNNCVVNSGFPLSPLQTFSFNY  109 (126)
Q Consensus        88 ~~~CLvn~G~pi~~g~~v~F~Y  109 (126)
                      .+...+...++|.+|+.|...|
T Consensus       140 ~~~~~~~a~r~I~~GeEi~isY  161 (162)
T PF00856_consen  140 GGCLVVRATRDIKKGEEIFISY  161 (162)
T ss_dssp             TTEEEEEESS-B-TTSBEEEES
T ss_pred             cceEEEEECCccCCCCEEEEEE
Confidence            3556788889999999999998


No 19 
>PF08626 TRAPPC9-Trs120:  Transport protein Trs120 or TRAPPC9, TRAPP II complex subunit;  InterPro: IPR013935 The trafficking protein particle complex TRAPP is a multi-protein complex needed in the early stages of the secretory pathway. To date, two kinds of TRAPP complexes have been studied, TRAPP I and TRAPP II. These complexes differ in subunit composition []. TRAPP I binds vesicles derived from the endoplasmic reticulum, bringing them closer to the acceptor membrane. Trs120 is a subunit specific to the TRAPP II complex [] along with Trs65p and Trs130p (TRAPPC10). Trs120p appears to be required for the stability of the Trs130p subunit, suggesting that these two proteins might interact in some way []. It is likely that there is a complex function for TRAPP II in multiple pathways [].
Probab=21.81  E-value=1.7e+02  Score=28.77  Aligned_cols=63  Identities=21%  Similarity=0.337  Sum_probs=40.9

Q ss_pred             CC-CCeEEEEEEeCCCCCCceeEEEecCC-----ccccc------------cCccceeeecCCeeEEeCCccCCCCCeeE
Q 041740           45 GS-PPTFMARVHNNCPMCPVINIHLKCGN-----FSQAL------------VNPRLLKVISYNNCVVNSGFPLSPLQTFS  106 (126)
Q Consensus        45 ~g-~p~~~VtI~N~C~~C~~~~V~l~C~g-----F~s~~------------VdP~~fr~~~~~~CLvn~G~pi~~g~~v~  106 (126)
                      .| .-+|+||+.|.-. ||+..+++.-..     ++.+.            ++-.++++-  ..-+.|.. +|.||++++
T Consensus       797 eGE~~~~~ItL~N~S~-~pvd~l~~sf~DS~~~~~~~~l~~k~l~~~e~yelE~~l~~~~--~~~i~~~~-~I~Pg~~~~  872 (1185)
T PF08626_consen  797 EGEKQTFTITLRNTSS-VPVDFLSFSFQDSTIEPLQKALSNKDLSPDELYELEWQLFKLP--AFRILNKP-PIPPGESAT  872 (1185)
T ss_pred             CCcEEEEEEEEEECCc-cccceEEEEEEeccHHHHhhhhhcccCChhhhhhhhhhhhcCc--ceeecccC-ccCCCCEEE
Confidence            45 7889999999997 999999999752     11110            111122221  12344545 999999999


Q ss_pred             EEecC
Q 041740          107 FNYSH  111 (126)
Q Consensus       107 F~YA~  111 (126)
                      |++--
T Consensus       873 ~~~~~  877 (1185)
T PF08626_consen  873 FTVEV  877 (1185)
T ss_pred             EEEEe
Confidence            98774


No 20 
>PF01345 DUF11:  Domain of unknown function DUF11;  InterPro: IPR001434 This group of sequences is represented by a conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis (Streptococcus faecalis), and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydia trachomatis outer membrane proteins.  In C. trachomatis, three cysteine-rich proteins (also believed to be lipoproteins), MOMP, OMP6 and OMP3, make up the extracellular matrix of the outer membrane []. They are involved in the essential structural integrity of both the elementary body (EB) and reticulate body (RB) phases. They are thought to be involved in porin formation and, as these bacteria lack the peptidoglycan layer common to most Gram-negative microbes, such proteins are highly important in the pathogenicity of the organism.; GO: 0005727 extrachromosomal circular DNA
Probab=21.24  E-value=1.5e+02  Score=18.69  Aligned_cols=25  Identities=28%  Similarity=0.476  Sum_probs=20.4

Q ss_pred             CCCCeEEEEEEeCCCCCCceeEEEec
Q 041740           45 GSPPTFMARVHNNCPMCPVINIHLKC   70 (126)
Q Consensus        45 ~g~p~~~VtI~N~C~~C~~~~V~l~C   70 (126)
                      +..-+|.++|.|.=. -+..||.|.-
T Consensus        40 Gd~v~ytitvtN~G~-~~a~nv~v~D   64 (76)
T PF01345_consen   40 GDTVTYTITVTNTGP-APATNVVVTD   64 (76)
T ss_pred             CCEEEEEEEEEECCC-CeeEeEEEEE
Confidence            346789999999988 7788888764


No 21 
>PF11906 DUF3426:  Protein of unknown function (DUF3426);  InterPro: IPR021834  This family of proteins is functionally uncharacterised. It is found in bacteria. Proteins in this family are typically between 262 and 463 amino acids in length.
Probab=20.87  E-value=2.7e+02  Score=19.81  Aligned_cols=63  Identities=10%  Similarity=0.084  Sum_probs=39.7

Q ss_pred             CCCCeEEEEEEeCCC-CCCceeEEEecCCccccccCccceeeecCCeeEEeC---CccCCCCCeeEEEec
Q 041740           45 GSPPTFMARVHNNCP-MCPVINIHLKCGNFSQALVNPRLLKVISYNNCVVNS---GFPLSPLQTFSFNYS  110 (126)
Q Consensus        45 ~g~p~~~VtI~N~C~-~C~~~~V~l~C~gF~s~~VdP~~fr~~~~~~CLvn~---G~pi~~g~~v~F~YA  110 (126)
                      .+.-+.+.+|.|+=. .=+.-.|++.=.+=+..+|.-++|+.   .++|..+   ...|++|+++.|+-.
T Consensus        67 ~~~l~v~g~i~N~~~~~~~~P~l~l~L~D~~g~~l~~r~~~P---~~yl~~~~~~~~~l~pg~~~~~~~~  133 (149)
T PF11906_consen   67 PGVLVVSGTIRNRADFPQALPALELSLLDAQGQPLARRVFTP---ADYLPPGLAAQAGLPPGESVPFRLR  133 (149)
T ss_pred             CCEEEEEEEEEeCCCCcccCceEEEEEECCCCCEEEEEEECh---HHhcccccccccccCCCCeEEEEEE
Confidence            345566669999986 24444555554443333555666654   4566654   678999999888643


Done!