Query psy15440
Match_columns 79
No_of_seqs 107 out of 526
Neff 5.4
Searched_HMMs 46136
Date Fri Aug 16 21:10:31 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy15440.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/15440hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1110|consensus 99.9 4.2E-26 9.2E-31 160.8 5.5 72 2-74 101-172 (183)
2 KOG1108|consensus 99.8 1.2E-19 2.7E-24 133.2 2.6 73 2-75 105-181 (281)
3 PF00173 Cyt-b5: Cytochrome b5 96.7 0.0017 3.8E-08 38.2 2.7 33 2-55 43-75 (76)
4 PF12596 Tnp_P_element_C: 87kD 71.6 3.4 7.4E-05 27.2 2.0 29 29-57 73-102 (106)
5 PF09476 Pilus_CpaD: Pilus bio 57.7 23 0.0005 25.1 4.3 38 28-65 52-89 (203)
6 PF06299 DUF1045: Protein of u 50.9 15 0.00032 25.7 2.3 22 26-53 83-104 (160)
7 cd02395 SF1_like-KH Splicing f 49.6 12 0.00026 24.7 1.6 15 48-62 13-27 (120)
8 smart00674 CENPB Putative DNA- 47.7 10 0.00022 21.4 1.0 16 37-52 48-63 (66)
9 PF05603 DUF775: Protein of un 47.7 10 0.00023 27.0 1.2 16 35-50 181-196 (202)
10 KOG0537|consensus 47.2 5.7 0.00012 26.4 -0.2 13 2-14 48-60 (124)
11 TIGR03180 UraD_2 OHCU decarbox 47.1 24 0.00052 24.2 2.9 33 24-56 82-116 (158)
12 PRK13798 putative OHCU decarbo 46.8 24 0.00053 24.4 2.9 32 25-56 88-121 (166)
13 TIGR03164 UHCUDC OHCU decarbox 46.2 25 0.00055 24.0 2.9 32 25-56 83-116 (157)
14 PF14584 DUF4446: Protein of u 45.7 17 0.00038 24.9 2.0 32 4-40 118-149 (151)
15 KOG1588|consensus 42.9 11 0.00024 28.3 0.7 16 47-62 104-119 (259)
16 PF02065 Melibiase: Melibiase; 41.5 23 0.00049 27.7 2.3 26 22-47 309-334 (394)
17 PRK09750 hypothetical protein; 40.5 23 0.00049 21.4 1.7 25 48-72 3-27 (64)
18 TIGR03223 Phn_opern_protn puta 39.0 26 0.00057 25.7 2.2 21 27-53 144-164 (228)
19 TIGR02491 NrdG anaerobic ribon 36.6 61 0.0013 21.5 3.5 50 13-66 29-78 (154)
20 COG4892 Predicted heme/steroid 36.5 14 0.00031 23.1 0.4 13 47-59 67-79 (81)
21 TIGR01446 DnaD_dom DnaD and ph 36.3 43 0.00093 19.2 2.5 18 29-46 12-29 (73)
22 TIGR02522 pilus_cpaD pilus (Ca 35.6 46 0.00099 23.7 2.9 32 28-59 53-84 (198)
23 COG5547 Small integral membran 35.6 8.9 0.00019 23.0 -0.6 9 48-56 8-16 (62)
24 cd00412 pyrophosphatase Inorga 34.9 47 0.001 22.9 2.8 26 22-47 103-128 (155)
25 TIGR00636 PduO_Nterm ATP:cob(I 34.0 45 0.00097 23.3 2.6 35 29-63 76-110 (171)
26 PF06223 Phage_tail_T: Minor t 33.9 26 0.00056 23.0 1.3 19 28-49 16-34 (103)
27 PF00034 Cytochrom_C: Cytochro 32.6 55 0.0012 18.1 2.5 17 29-45 74-90 (91)
28 PF09349 OHCU_decarbox: OHCU d 32.4 28 0.00061 23.6 1.4 31 23-53 84-116 (159)
29 PF00719 Pyrophosphatase: Inor 32.3 39 0.00085 23.2 2.1 32 24-55 102-137 (156)
30 PF09568 RE_MjaI: MjaI restric 30.8 27 0.00058 24.8 1.1 15 36-50 32-46 (170)
31 PF12512 DUF3717: Protein of u 29.9 52 0.0011 20.2 2.1 19 25-43 51-69 (71)
32 PF14376 Haem_bd: Haem-binding 29.6 57 0.0012 21.7 2.5 17 28-44 119-135 (137)
33 PHA02057 ADP-ribosylation supe 28.4 21 0.00045 27.6 0.2 17 43-59 16-32 (319)
34 cd00086 homeodomain Homeodomai 27.2 1E+02 0.0022 16.3 2.9 21 28-50 5-25 (59)
35 PF00046 Homeobox: Homeobox do 26.6 97 0.0021 16.6 2.7 20 29-50 6-25 (57)
36 PRK01250 inorganic pyrophospha 26.1 76 0.0016 22.4 2.7 26 22-47 119-144 (176)
37 COG2096 cob(I)alamin adenosylt 26.0 97 0.0021 22.2 3.2 35 29-63 84-118 (184)
38 PF07901 DUF1672: Protein of u 25.9 57 0.0012 24.8 2.2 25 32-56 122-146 (277)
39 COG0133 TrpB Tryptophan syntha 25.7 49 0.0011 26.3 1.8 36 28-63 164-199 (396)
40 KOG4048|consensus 24.9 50 0.0011 23.8 1.6 29 25-53 107-135 (201)
41 PF03221 HTH_Tnp_Tc5: Tc5 tran 24.9 39 0.00085 18.6 0.9 16 37-52 48-63 (66)
42 PF03392 OS-D: Insect pheromon 24.1 61 0.0013 20.5 1.7 27 26-52 51-77 (95)
43 PF14409 Herpeto_peptide: Ribo 23.0 93 0.002 18.4 2.2 22 33-54 24-48 (58)
44 PRK02230 inorganic pyrophospha 23.0 88 0.0019 22.3 2.5 26 22-47 105-130 (184)
45 PRK13797 putative bifunctional 22.9 91 0.002 25.6 2.9 35 22-56 435-471 (516)
46 PRK05134 bifunctional 3-demeth 22.7 96 0.0021 21.2 2.7 26 27-52 1-26 (233)
47 PF06688 DUF1187: Protein of u 22.2 73 0.0016 19.1 1.7 26 49-74 1-26 (61)
48 PLN02373 soluble inorganic pyr 21.9 1E+02 0.0022 22.0 2.7 26 22-47 125-150 (188)
49 PF06297 PET: PET Domain; Int 21.9 1.1E+02 0.0024 19.9 2.7 31 27-57 74-104 (106)
50 PRK11789 N-acetyl-anhydromuran 21.9 79 0.0017 22.3 2.1 32 26-57 123-155 (185)
No 1
>KOG1110|consensus
Probab=99.92 E-value=4.2e-26 Score=160.83 Aligned_cols=72 Identities=39% Similarity=0.651 Sum_probs=69.8
Q ss_pred ceecccccchhhcCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHcccCceeeEEcCCCCCCCCCccccccchh
Q psy15440 2 QISRRDASRAFATFTVTPGKDEYDDLSDLNPAEWESVREWEEQLDSKYTYIGKLLKPGESHINYEDEEKGSIE 74 (79)
Q Consensus 2 ~FAGrDaSRalat~~~~~~~~~~~dls~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~~~pt~y~~ee~~k~~ 74 (79)
.||||||||+||+||++. .++++|+|||++.|+++|++|+.+|+.|||+||+|+.++++|..|+.++.++..
T Consensus 101 ~fAG~DASR~La~~s~d~-~d~~ddlsdL~a~e~eal~eWE~~fk~KY~~VG~L~~~~~e~~~~s~~~~~~~~ 172 (183)
T KOG1110|consen 101 LFAGKDASRGLAKMSFDL-SDETDDLSDLTAEELEALNEWETKFKAKYPVVGRLVKKGEENEEYSPEEDTKDA 172 (183)
T ss_pred hhcccchHHHHHhcccch-hhccccccccCHHHHHHHHHHHHHHhhcCceeEEeecCCcccccCCcccccccc
Confidence 599999999999999999 499999999999999999999999999999999999999999999999999876
No 2
>KOG1108|consensus
Probab=99.77 E-value=1.2e-19 Score=133.17 Aligned_cols=73 Identities=26% Similarity=0.450 Sum_probs=68.6
Q ss_pred ceecccccchhhcCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHcccCceeeEEcC----CCCCCCCCccccccchhh
Q psy15440 2 QISRRDASRAFATFTVTPGKDEYDDLSDLNPAEWESVREWEEQLDSKYTYIGKLLK----PGESHINYEDEEKGSIEE 75 (79)
Q Consensus 2 ~FAGrDaSRalat~~~~~~~~~~~dls~Lt~~el~~L~~W~~~f~~KYp~VG~l~~----~~~~pt~y~~ee~~k~~~ 75 (79)
.||||||||||++|+|.+. ...+++.+|++.|+.+|.+|..||.+.|++||+|++ +.|.||.|....+++.+.
T Consensus 105 hFaGRDASrAFvsGdf~e~-gl~d~v~gLs~dEllsi~dWrsFY~k~Y~~vGrv~gryYds~G~pT~~lt~v~a~~er 181 (281)
T KOG1108|consen 105 HFAGRDASRAFVSGDFEEP-GLADDVLGLSPDELLSIADWRSFYQKDYVYVGRVIGRYYDSKGAPTPYLTKVLALLER 181 (281)
T ss_pred cccccccchheecccCCCC-cchhhhccCCHHHHhhhhhhhhhhhcccceeeEEeeeeecCCCCCcHHHHHHHHHHHH
Confidence 6999999999999999975 999999999999999999999999999999999997 899999999999887654
No 3
>PF00173 Cyt-b5: Cytochrome b5-like Heme/Steroid binding domain This prints entry is a subset of the Pfam entry; InterPro: IPR001199 Cytochromes b5 are ubiquitous electron transport proteins found in animals, plants and yeasts []. The microsomal and mitochondrial variants are membrane-bound, while those from erythrocytes and other animal tissues are water-soluble [, ]. The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group []. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.; GO: 0020037 heme binding; PDB: 2I96_A 3KS0_A 1KBI_B 1KBJ_B 1LTD_A 1SZG_B 1SZF_A 1LDC_B 2OZ0_B 1LCO_A ....
Probab=96.68 E-value=0.0017 Score=38.18 Aligned_cols=33 Identities=36% Similarity=0.465 Sum_probs=27.9
Q ss_pred ceecccccchhhcCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHcccCceeeEE
Q psy15440 2 QISRRDASRAFATFTVTPGKDEYDDLSDLNPAEWESVREWEEQLDSKYTYIGKL 55 (79)
Q Consensus 2 ~FAGrDaSRalat~~~~~~~~~~~dls~Lt~~el~~L~~W~~~f~~KYp~VG~l 55 (79)
.+||+|+|.+| +......|..++..+|.+||+|
T Consensus 43 ~~aG~D~T~~f---------------------~~~~h~~~~~~~l~~~~~vG~l 75 (76)
T PF00173_consen 43 KYAGRDATDAF---------------------EEAFHSWWAEKCLEKYYKVGYL 75 (76)
T ss_dssp TTTTSBTHHHH---------------------HHHTHHHHHHHHHHGCGEEEEE
T ss_pred HhccccccHHH---------------------hhccCcHHHHHHccCCCEEEEe
Confidence 47999999999 4555668888999999999997
No 4
>PF12596 Tnp_P_element_C: 87kDa Transposase; InterPro: IPR022242 This domain family is found in eukaryotes, and is typically between 78 and 110 amino acids in length. The family is found in association with PF05485 from PFAM. There are two completely conserved residues (D and G) that may be functionally important. This family is an 87kDa transposase protein which catalyses both the precise and imprecise excision of a nonautonomous P transposable element.
Probab=71.62 E-value=3.4 Score=27.20 Aligned_cols=29 Identities=14% Similarity=0.338 Sum_probs=21.6
Q ss_pred CCCHHHHHHHHHHHH-HHcccCceeeEEcC
Q psy15440 29 DLNPAEWESVREWEE-QLDSKYTYIGKLLK 57 (79)
Q Consensus 29 ~Lt~~el~~L~~W~~-~f~~KYp~VG~l~~ 57 (79)
.+.+.-|+=+.+|+. +|+.|||.+|.+..
T Consensus 73 e~e~d~l~YiaGyVa~k~~~k~p~L~~~t~ 102 (106)
T PF12596_consen 73 EIEEDGLEYIAGYVAKKFRNKYPNLGDYTC 102 (106)
T ss_pred hhHHHHHHHHHHHHHHHHHhcCCchhheee
Confidence 355555666778887 79999999997653
No 5
>PF09476 Pilus_CpaD: Pilus biogenesis CpaD protein (pilus_cpaD); InterPro: IPR019027 Proteins in this entry consist of a pilus biogenesis protein, CpaD, from Caulobacter, and homologues in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function of the homologues is not known.
Probab=57.74 E-value=23 Score=25.14 Aligned_cols=38 Identities=13% Similarity=0.143 Sum_probs=31.3
Q ss_pred CCCCHHHHHHHHHHHHHHcccCceeeEEcCCCCCCCCC
Q psy15440 28 SDLNPAEWESVREWEEQLDSKYTYIGKLLKPGESHINY 65 (79)
Q Consensus 28 s~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~~~pt~y 65 (79)
.+|+..|+..|+.|.+.|...|.-.=.|..+.|.|+..
T Consensus 52 ~~Lt~~q~~~l~~f~~~~~~~~~~~v~i~~psg~~n~~ 89 (203)
T PF09476_consen 52 GGLTPSQRDRLRGFASRYGRRGGGRVTIDVPSGSANAR 89 (203)
T ss_pred CCCCHHHHHHHHHHHHHHhccCCCeEEEecCCCCcchH
Confidence 78999999999999999998877666666777776543
No 6
>PF06299 DUF1045: Protein of unknown function (DUF1045); InterPro: IPR009389 This family consists of several hypothetical proteins from Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown.
Probab=50.92 E-value=15 Score=25.71 Aligned_cols=22 Identities=23% Similarity=0.644 Sum_probs=18.9
Q ss_pred CCCCCCHHHHHHHHHHHHHHcccCceee
Q psy15440 26 DLSDLNPAEWESVREWEEQLDSKYTYIG 53 (79)
Q Consensus 26 dls~Lt~~el~~L~~W~~~f~~KYp~VG 53 (79)
.-.+||+.|...|..| -||||=
T Consensus 83 ~~~~Ls~~Q~~~L~rW------GYPYV~ 104 (160)
T PF06299_consen 83 RPAGLSPRQRANLERW------GYPYVM 104 (160)
T ss_pred CcccCCHHHHHHHHHh------CCCcee
Confidence 3478999999999999 788884
No 7
>cd02395 SF1_like-KH Splicing factor 1 (SF1) K homology RNA-binding domain (KH). Splicing factor 1 (SF1) specifically recognizes the intron branch point sequence (BPS) UACUAAC in the pre-mRNA transcripts during spliceosome assembly. We show that the KH-QUA2 region of SF1 defines an enlarged KH (hnRNP K) fold which is necessary and sufficient for BPS binding. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA.
Probab=49.56 E-value=12 Score=24.68 Aligned_cols=15 Identities=33% Similarity=0.806 Sum_probs=12.9
Q ss_pred cCceeeEEcCCCCCC
Q psy15440 48 KYTYIGKLLKPGESH 62 (79)
Q Consensus 48 KYp~VG~l~~~~~~p 62 (79)
.|.+||+|++|+|.-
T Consensus 13 ~~N~IG~IIGPgG~t 27 (120)
T cd02395 13 KYNFVGLILGPRGNT 27 (120)
T ss_pred CCCeeEEEECCCChH
Confidence 689999999988864
No 8
>smart00674 CENPB Putative DNA-binding domain in centromere protein B, mouse jerky and transposases.
Probab=47.75 E-value=10 Score=21.38 Aligned_cols=16 Identities=6% Similarity=0.385 Sum_probs=12.8
Q ss_pred HHHHHHHHHcccCcee
Q psy15440 37 SVREWEEQLDSKYTYI 52 (79)
Q Consensus 37 ~L~~W~~~f~~KYp~V 52 (79)
.=..|...|.++|+++
T Consensus 48 ~s~~Wl~rF~~Rh~~~ 63 (66)
T smart00674 48 ASNGWLTRFKKRHNIV 63 (66)
T ss_pred CCHHHHHHHHHHcCCc
Confidence 3357999999999865
No 9
>PF05603 DUF775: Protein of unknown function (DUF775); InterPro: IPR008493 This family consists of several eukaryotic proteins of unknown function.
Probab=47.72 E-value=10 Score=27.00 Aligned_cols=16 Identities=19% Similarity=0.460 Sum_probs=13.6
Q ss_pred HHHHHHHHHHHcccCc
Q psy15440 35 WESVREWEEQLDSKYT 50 (79)
Q Consensus 35 l~~L~~W~~~f~~KYp 50 (79)
+..|+.|.++|+.|+.
T Consensus 181 ~~~~~~W~~kFe~Kl~ 196 (202)
T PF05603_consen 181 LSVFDKWWEKFERKLR 196 (202)
T ss_pred HHHHHHHHHHHHHHHh
Confidence 3699999999998874
No 10
>KOG0537|consensus
Probab=47.16 E-value=5.7 Score=26.43 Aligned_cols=13 Identities=31% Similarity=0.570 Sum_probs=10.2
Q ss_pred ceecccccchhhc
Q psy15440 2 QISRRDASRAFAT 14 (79)
Q Consensus 2 ~FAGrDaSRalat 14 (79)
..||+|||.+|--
T Consensus 48 ~~AGkDaT~~F~~ 60 (124)
T KOG0537|consen 48 EYAGKDATEAFED 60 (124)
T ss_pred HHhchhhHHhccc
Confidence 3699999998843
No 11
>TIGR03180 UraD_2 OHCU decarboxylase. Previously thought to only proceed spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model. This model is a separate (but related) clade from that represented by TIGR03164. This model places a second homolog in Streptomyces species which (are not in the vicinity of other urate catabolism associated genes) below the trusted cutoff.
Probab=47.06 E-value=24 Score=24.21 Aligned_cols=33 Identities=6% Similarity=0.150 Sum_probs=26.1
Q ss_pred CCCCCCCCHHHHHHHHHHHHHHcccC--ceeeEEc
Q psy15440 24 YDDLSDLNPAEWESVREWEEQLDSKY--TYIGKLL 56 (79)
Q Consensus 24 ~~dls~Lt~~el~~L~~W~~~f~~KY--p~VG~l~ 56 (79)
...+..+++.+++.|..|-..|++|| |+|=.+.
T Consensus 82 Qagl~~~~~~~~~~L~~lN~~Y~~kFGfpFii~v~ 116 (158)
T TIGR03180 82 QAGVDGADEETRAALLEGNAAYEEKFGRIFLIRAA 116 (158)
T ss_pred HhcccCCCHHHHHHHHHHHHHHHHHCCCeEEEeeC
Confidence 34778899999999999999999875 5554443
No 12
>PRK13798 putative OHCU decarboxylase; Provisional
Probab=46.84 E-value=24 Score=24.41 Aligned_cols=32 Identities=9% Similarity=0.170 Sum_probs=25.7
Q ss_pred CCCCCCCHHHHHHHHHHHHHHcccC--ceeeEEc
Q psy15440 25 DDLSDLNPAEWESVREWEEQLDSKY--TYIGKLL 56 (79)
Q Consensus 25 ~dls~Lt~~el~~L~~W~~~f~~KY--p~VG~l~ 56 (79)
..+..|++.+.+.|..|-..|++|| |+|=.+.
T Consensus 88 ~gl~~l~~~~~~~l~~lN~~Y~~kFGfpFii~v~ 121 (166)
T PRK13798 88 AGVADADEAVMAALAAGNRAYEEKFGFVFLICAT 121 (166)
T ss_pred cccccCCHHHHHHHHHHHHHHHHhCCCeEEEeeC
Confidence 3688899999999999999999875 5554443
No 13
>TIGR03164 UHCUDC OHCU decarboxylase. Previously thought to only proceed spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model.
Probab=46.19 E-value=25 Score=24.04 Aligned_cols=32 Identities=13% Similarity=0.332 Sum_probs=25.3
Q ss_pred CCCCCCCHHHHHHHHHHHHHHcccC--ceeeEEc
Q psy15440 25 DDLSDLNPAEWESVREWEEQLDSKY--TYIGKLL 56 (79)
Q Consensus 25 ~dls~Lt~~el~~L~~W~~~f~~KY--p~VG~l~ 56 (79)
..+..+++.+++.|..|-..|++|| |+|=.+.
T Consensus 83 agl~~~~~~~~~~L~~lN~~Y~~kFGfpFvi~v~ 116 (157)
T TIGR03164 83 AGLDQLSQEEFARFTRLNNAYRARFGFPFIMAVK 116 (157)
T ss_pred ccccCCCHHHHHHHHHHHHHHHHHCCCeeEEeeC
Confidence 4578899999999999999999875 5554443
No 14
>PF14584 DUF4446: Protein of unknown function (DUF4446)
Probab=45.66 E-value=17 Score=24.89 Aligned_cols=32 Identities=25% Similarity=0.287 Sum_probs=24.7
Q ss_pred ecccccchhhcCCCCCCCCCCCCCCCCCHHHHHHHHH
Q psy15440 4 SRRDASRAFATFTVTPGKDEYDDLSDLNPAEWESVRE 40 (79)
Q Consensus 4 AGrDaSRalat~~~~~~~~~~~dls~Lt~~el~~L~~ 40 (79)
-|||.||..||.--.-. .--.|+++|.++|+.
T Consensus 118 ~~Re~s~~YaK~I~~G~-----S~~~LS~EE~eal~~ 149 (151)
T PF14584_consen 118 HSREESRTYAKPIVNGQ-----SSYPLSEEEKEALEK 149 (151)
T ss_pred ecCCCcEEEEEEecCCc-----ccccCCHHHHHHHHH
Confidence 48899999998877664 224599999999863
No 15
>KOG1588|consensus
Probab=42.88 E-value=11 Score=28.34 Aligned_cols=16 Identities=25% Similarity=0.810 Sum_probs=13.1
Q ss_pred ccCceeeEEcCCCCCC
Q psy15440 47 SKYTYIGKLLKPGESH 62 (79)
Q Consensus 47 ~KYp~VG~l~~~~~~p 62 (79)
-||.+||||++|-|.-
T Consensus 104 P~fNFVGRILGPrGnS 119 (259)
T KOG1588|consen 104 PKFNFVGRILGPRGNS 119 (259)
T ss_pred CCCccccccccCCcch
Confidence 4899999999987653
No 16
>PF02065 Melibiase: Melibiase; InterPro: IPR000111 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycosyl hydrolase family 27, family 31 and family 36 alpha-galactosidases form the glycosyl hydrolase clan GH-D (acc_GH from CAZY), a superfamily of alpha-galactosidases, alpha-N-acetylgalactosaminidases, and isomaltodextranases which are likely to share a common catalytic mechanism and structural topology. Alpha-galactosidase (3.2.1.22 from EC) (melibiase) [] catalyzes the hydrolysis of melibiose into galactose and glucose. In man, the deficiency of this enzyme is the cause of Fabry's disease (X-linked sphingolipidosis). Alpha-galactosidase is present in a variety of organisms. There is a considerable degree of similarity in the sequence of alpha-galactosidase from various eukaryotic species. Escherichia coli alpha-galactosidase (gene melA), which requires NAD and magnesium as cofactors, is not structurally related to the eukaryotic enzymes; by contrast, an Escherichia coli plasmid encoded alpha-galactosidase (gene rafA P16551 from SWISSPROT) [] contains a region of about 50 amino acids which is similar to a domain of the eukaryotic alpha-galactosidases. Alpha-N-acetylgalactosaminidase (3.2.1.49 from EC) [] catalyzes the hydrolysis of terminal non-reducing N-acetyl-D-galactosamine residues in N-acetyl-alpha-D- galactosaminides. In man, the deficiency of this enzyme is the cause of Schindler and Kanzaki diseases. 
The sequence of this enzyme is highly related to that of the eukaryotic alpha-galactosidases.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 1KTC_A 1KTB_A 1UAS_A 3H55_A 3H53_A 3IGU_B 3H54_A 3LRM_A 3LRL_A 3LRK_A ....
Probab=41.51 E-value=23 Score=27.72 Aligned_cols=26 Identities=27% Similarity=0.183 Sum_probs=21.9
Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHcc
Q psy15440 22 DEYDDLSDLNPAEWESVREWEEQLDS 47 (79)
Q Consensus 22 ~~~~dls~Lt~~el~~L~~W~~~f~~ 47 (79)
....++..|+++|++.+..|..+|++
T Consensus 309 g~e~dl~~ls~~e~~~~~~~ia~YK~ 334 (394)
T PF02065_consen 309 GLELDLTKLSEEELAAVKEQIAFYKS 334 (394)
T ss_dssp EEESTGCGS-HHHHHHHHHHHHHHHH
T ss_pred eeccCcccCCHHHHHHHHHHHHHHHh
Confidence 33468899999999999999999996
No 17
>PRK09750 hypothetical protein; Provisional
Probab=40.52 E-value=23 Score=21.38 Aligned_cols=25 Identities=20% Similarity=0.306 Sum_probs=19.9
Q ss_pred cCceeeEEcCCCCCCCCCccccccc
Q psy15440 48 KYTYIGKLLKPGESHINYEDEEKGS 72 (79)
Q Consensus 48 KYp~VG~l~~~~~~pt~y~~ee~~k 72 (79)
||.+-..+..|+|.|+...+=.+.|
T Consensus 3 kY~I~Ati~KpGg~P~~W~r~s~~~ 27 (64)
T PRK09750 3 MYKITATIEKEGGTPTNWTRYSKSK 27 (64)
T ss_pred eeEEEEEEECCCCCccceeEecCCc
Confidence 7899999999999999866544443
No 18
>TIGR03223 Phn_opern_protn putative phosphonate metabolism protein. This family of proteins is observed in the vicinity of other characterized genes involved in the catabolism of phosphonates via the C-P lyase system (GenProp0232); its function is unknown. These proteins are members of the somewhat broader pfam06299 model "Protein of unknown function (DUF1045)" which contains proteins found in a different genomic context as well.
Probab=39.03 E-value=26 Score=25.73 Aligned_cols=21 Identities=24% Similarity=0.615 Sum_probs=18.5
Q ss_pred CCCCCHHHHHHHHHHHHHHcccCceee
Q psy15440 27 LSDLNPAEWESVREWEEQLDSKYTYIG 53 (79)
Q Consensus 27 ls~Lt~~el~~L~~W~~~f~~KYp~VG 53 (79)
-.+||+.|...|..| -||||=
T Consensus 144 ~~~Ls~~Q~~~L~~W------GYPYV~ 164 (228)
T TIGR03223 144 PDQLTPRQRALLERW------GYPYVL 164 (228)
T ss_pred ccCCCHHHHHHHHHc------CCCcee
Confidence 478999999999999 788885
No 19
>TIGR02491 NrdG anaerobic ribonucleoside-triphosphate reductase activating protein. This enzyme is a member of the radical-SAM family (pfam04055) and utilizes S-adenosyl methionine, an iron-sulfur cluster and a reductant (dihydroflavodoxin) to produce a glycine-centered radical in the class III (anaerobic) ribonucleotide triphosphate reductase (NrdD, TIGR02487). The two components form an alpha-2/beta-2 heterodimer.
Probab=36.56 E-value=61 Score=21.53 Aligned_cols=50 Identities=8% Similarity=-0.036 Sum_probs=30.8
Q ss_pred hcCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHcccCceeeEEcCCCCCCCCCc
Q psy15440 13 ATFTVTPGKDEYDDLSDLNPAEWESVREWEEQLDSKYTYIGKLLKPGESHINYE 66 (79)
Q Consensus 13 at~~~~~~~~~~~dls~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~~~pt~y~ 66 (79)
..+|+.++......-..++.++.+.|-++...+ +.++.|.-.+|+|....
T Consensus 29 C~~C~n~~~~~~~~g~~~~~~~~~~i~~~l~~~----~~~~gVt~sGGEPllq~ 78 (154)
T TIGR02491 29 CEGCFNKETWNFNGGKEFTEALEKEIIRDLNDN----PLIDGLTLSGGDPLYPR 78 (154)
T ss_pred CcCCCcccccCCCCCCcCCHHHHHHHHHHHHhc----CCcCeEEEeChhhCCCC
Confidence 345665531111123468888887777776433 34677777899999754
No 20
>COG4892 Predicted heme/steroid binding protein [General function prediction only]
Probab=36.50 E-value=14 Score=23.08 Aligned_cols=13 Identities=23% Similarity=0.470 Sum_probs=10.6
Q ss_pred ccCceeeEEcCCC
Q psy15440 47 SKYTYIGKLLKPG 59 (79)
Q Consensus 47 ~KYp~VG~l~~~~ 59 (79)
..||+||.|+.+.
T Consensus 67 ~~~PvVG~L~k~~ 79 (81)
T COG4892 67 TSLPVVGALIKEK 79 (81)
T ss_pred hcCchhheeeccc
Confidence 4899999998753
No 21
>TIGR01446 DnaD_dom DnaD and phage-associated domain. This model represents the conserved domain of DnaD, part of Bacillus subtilis replication restart primosome, and of a number of phage-associated proteins. Members, both chromosomal or phage-associated, are found in the Bacillus/Clostridium group of Gram-positive bacteria.
Probab=36.30 E-value=43 Score=19.21 Aligned_cols=18 Identities=28% Similarity=0.623 Sum_probs=16.3
Q ss_pred CCCHHHHHHHHHHHHHHc
Q psy15440 29 DLNPAEWESVREWEEQLD 46 (79)
Q Consensus 29 ~Lt~~el~~L~~W~~~f~ 46 (79)
-|++.+++.|..|.+.|.
T Consensus 12 ~ls~~e~~~i~~~~~~~~ 29 (73)
T TIGR01446 12 MLSPFEMEDLKYWLDEFG 29 (73)
T ss_pred CCCHHHHHHHHHHHHHhC
Confidence 489999999999999875
No 22
>TIGR02522 pilus_cpaD pilus (Caulobacter type) biogenesis lipoprotein CpaD. This family consists of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function is not known.
Probab=35.65 E-value=46 Score=23.71 Aligned_cols=32 Identities=9% Similarity=0.142 Sum_probs=24.0
Q ss_pred CCCCHHHHHHHHHHHHHHcccCceeeEEcCCC
Q psy15440 28 SDLNPAEWESVREWEEQLDSKYTYIGKLLKPG 59 (79)
Q Consensus 28 s~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~ 59 (79)
.+|+..|+..|+.|.+.|...|--.=.|..+.
T Consensus 53 ~~Lt~~qr~~l~~f~~~~~~~g~~~i~i~~ps 84 (198)
T TIGR02522 53 RGLSASQQDQLLGFLKDWGRASAQTLTLIIPS 84 (198)
T ss_pred CCCCHHHHHHHHHHHHHHhhcCCceEEEecCh
Confidence 56999999999999999987665433343343
No 23
>COG5547 Small integral membrane protein [Function unknown]
Probab=35.62 E-value=8.9 Score=23.01 Aligned_cols=9 Identities=56% Similarity=0.951 Sum_probs=7.3
Q ss_pred cCceeeEEc
Q psy15440 48 KYTYIGKLL 56 (79)
Q Consensus 48 KYp~VG~l~ 56 (79)
|||+||-|+
T Consensus 8 kypIIgglv 16 (62)
T COG5547 8 KYPIIGGLV 16 (62)
T ss_pred ccchHHHHH
Confidence 899999654
No 24
>cd00412 pyrophosphatase Inorganic pyrophosphatase. These enzymes hydrolyze inorganic pyrophosphate (PPi) to two molecules of orthophosphates (Pi). The reaction requires bivalent cations. The enzymes in general exist as homooligomers.
Probab=34.93 E-value=47 Score=22.91 Aligned_cols=26 Identities=19% Similarity=0.457 Sum_probs=21.9
Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHcc
Q psy15440 22 DEYDDLSDLNPAEWESVREWEEQLDS 47 (79)
Q Consensus 22 ~~~~dls~Lt~~el~~L~~W~~~f~~ 47 (79)
....++++|.+..++.+.+|.+.|+.
T Consensus 103 ~~i~~l~Dl~~~~l~~I~~fF~~YK~ 128 (155)
T cd00412 103 SHINDISDVPPHLLDEIKHFFEHYKD 128 (155)
T ss_pred ccCCChHHCCHHHHHHHHHHHHHhcc
Confidence 34457889999999999999999983
No 25
>TIGR00636 PduO_Nterm ATP:cob(I)alamin adenosyltransferase. This model represents an ATP:cob(I)alamin adenosyltransferase family corresponding to the N-terminal half of Salmonella PduO, a 1,2-propanediol utilization protein that probably is bifunctional. PduO represents one of at least three families of ATP:corrinoid adenosyltransferase: others are CobA (which partially complements PduO) and EutT. It was not clear originally whether ATP:cob(I)alamin adenosyltransferase activity resides in the N-terminal region of PduO, modeled here, but this has now become clear from the characterization of MeaD from Methylobacterium extorquens.
Probab=33.96 E-value=45 Score=23.32 Aligned_cols=35 Identities=11% Similarity=0.197 Sum_probs=28.8
Q ss_pred CCCHHHHHHHHHHHHHHcccCceeeEEcCCCCCCC
Q psy15440 29 DLNPAEWESVREWEEQLDSKYTYIGKLLKPGESHI 63 (79)
Q Consensus 29 ~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~~~pt 63 (79)
.+++++.+.|+.|.+.|.+.-|-.-..+-|+|.|.
T Consensus 76 ~i~~~~v~~LE~~id~~~~~l~~l~~FiLPggs~~ 110 (171)
T TIGR00636 76 KITEEDVKWLEERIDQYRKELPPLKLFVLPGGTPA 110 (171)
T ss_pred CcCHHHHHHHHHHHHHHHhhCCCCCceeeCCCCHH
Confidence 59999999999999999988776666666777654
No 26
>PF06223 Phage_tail_T: Minor tail protein T; InterPro: IPR009350 This family represents the minor tail protein T of Lambda-like viruses and their prophage. The minor tail protein T is located at the distal end and is involved in the assembly of the initiator complex for tail polymerisation. The protein is essential for tail assembly but is not found in the mature virion [].
Probab=33.91 E-value=26 Score=23.01 Aligned_cols=19 Identities=16% Similarity=0.609 Sum_probs=13.3
Q ss_pred CCCCHHHHHHHHHHHHHHcccC
Q psy15440 28 SDLNPAEWESVREWEEQLDSKY 49 (79)
Q Consensus 28 s~Lt~~el~~L~~W~~~f~~KY 49 (79)
++++.. .+.+|..||.+.|
T Consensus 16 a~MSst---E~~eW~~ff~~~~ 34 (103)
T PF06223_consen 16 AEMSST---EYGEWADFFRKHY 34 (103)
T ss_pred HhcCHH---HHHHHHHHHHhCC
Confidence 556665 4678999997544
No 27
>PF00034 Cytochrom_C: Cytochrome c; InterPro: IPR003088 Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes. Ambler [] recognised four classes of cytC. Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cytC and prokaryotic 'short' cyt c2 exemplified by Rhodopila globiformis cyt c2; class IA includes 'long' cyt c2, such as Rhodospirillum rubrum cyt c2 and Aquaspirillum itersonii cyt c-550, which have several extra loops by comparison with class IB cytC.; GO: 0005506 iron ion binding, 0009055 electron carrier activity, 0020037 heme binding; PDB: 1YNR_B 2AI5_A 1AYG_A 3O5C_C 1YEA_A 3CXH_W 1YTC_A 1YEB_A 2YBB_Y 2B4Z_A ....
Probab=32.62 E-value=55 Score=18.08 Aligned_cols=17 Identities=18% Similarity=0.194 Sum_probs=14.6
Q ss_pred CCCHHHHHHHHHHHHHH
Q psy15440 29 DLNPAEWESVREWEEQL 45 (79)
Q Consensus 29 ~Lt~~el~~L~~W~~~f 45 (79)
.|+++|+..|-.|....
T Consensus 74 ~ls~~e~~~l~ayl~sl 90 (91)
T PF00034_consen 74 ILSDEEIADLAAYLRSL 90 (91)
T ss_dssp TSSHHHHHHHHHHHHHT
T ss_pred CCCHHHHHHHHHHHHHh
Confidence 69999999999998753
No 28
>PF09349 OHCU_decarbox: OHCU decarboxylase; InterPro: IPR018020 The proteins in this entry are OHCU decarboxylase, an enzyme of the purine catabolism that catalyses the conversion of OHCU into S(+)-allantoin []; it is the third step of the conversion of uric acid (a purine derivative) to allantoin. Step one is catalysed by urate oxidase (IPR002042 from INTERPRO) and step two is catalysed by hydroxyisourate hydrolase (IPR000895 from INTERPRO). ; PDB: 3O7I_B 3O7H_B 3O7J_A 3O7K_A 2Q37_A 2O70_B 2O73_C 2O74_C 2O8I_A.
Probab=32.40 E-value=28 Score=23.60 Aligned_cols=31 Identities=13% Similarity=0.320 Sum_probs=22.7
Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHcccC--ceee
Q psy15440 23 EYDDLSDLNPAEWESVREWEEQLDSKY--TYIG 53 (79)
Q Consensus 23 ~~~dls~Lt~~el~~L~~W~~~f~~KY--p~VG 53 (79)
....+..+++.++..|..+-..|+.|| |+|=
T Consensus 84 ~~agl~~~~~~~~~~L~~lN~~Y~~kFGf~Fvi 116 (159)
T PF09349_consen 84 ASAGLDSLDEEELAELAALNQAYEEKFGFPFVI 116 (159)
T ss_dssp HHCCTTSTHHHHHHHHHHHHHHHHHHHSS----
T ss_pred hhcccccCCHHHHHHHHHHHHHHHHHcCCceEe
Confidence 344567899999999999999999875 4443
No 29
>PF00719 Pyrophosphatase: Inorganic pyrophosphatase; InterPro: IPR008162 Inorganic pyrophosphatase (3.6.1.1 from EC) (PPase) [, ] is the enzyme responsible for the hydrolysis of pyrophosphate (PPi) which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacteria Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis. The sequences of PPases share some regions of similarities, among which is a region that contains three conserved aspartates that are involved in the binding of cations.; GO: 0000287 magnesium ion binding, 0004427 inorganic diphosphatase activity, 0006796 phosphate-containing compound metabolic process, 0005737 cytoplasm; PDB: 2UXS_A 1WCF_A 1SXV_A 3I4Q_A 2PRD_A 2IHP_B 1PYP_A 2IK7_A 2IK4_A 1E9G_A ....
Probab=32.27 E-value=39 Score=23.17 Aligned_cols=32 Identities=22% Similarity=0.427 Sum_probs=24.6
Q ss_pred CCCCCCCCHHHHHHHHHHHHHHcc----cCceeeEE
Q psy15440 24 YDDLSDLNPAEWESVREWEEQLDS----KYTYIGKL 55 (79)
Q Consensus 24 ~~dls~Lt~~el~~L~~W~~~f~~----KYp~VG~l 55 (79)
..++++|.+..++.+.+|...|+. |+-.+|..
T Consensus 102 i~dl~dl~~~~~~~i~~fF~~YK~l~~~k~~~~~~~ 137 (156)
T PF00719_consen 102 IKDLEDLPPHLLDEIEHFFRNYKDLEENKWVEVGGW 137 (156)
T ss_dssp HHSGGGSSHHHHHHHHHHHHHTTTTSTTEEEEEEEE
T ss_pred cCcHHHhChhHHHHHHHHHHHhcCcCCCCeEEeCCC
Confidence 446889999999999999999973 45554433
No 30
>PF09568 RE_MjaI: MjaI restriction endonuclease; InterPro: IPR019068 There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements [, ], as summarised below: Type I enzymes (3.1.21.3 from EC) cleave at sites remote from recognition site; require both ATP and S-adenosyl-L-methionine to function; multifunctional protein with both restriction and methylase (2.1.1.72 from EC) activities. Type II enzymes (3.1.21.4 from EC) cleave within or at short specific distances from recognition site; most require magnesium; single function (restriction) enzymes independent of methylase. Type III enzymes (3.1.21.5 from EC) cleave at sites a short distance from recognition site; require ATP (but do not hydrolyse it); S-adenosyl-L-methionine stimulates reaction but is not required; exist as part of a complex with a modification methylase (2.1.1.72 from EC). Type IV enzymes target methylated DNA. Type II restriction endonucleases (3.1.21.4 from EC) are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. 
Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin. However, there is still considerable diversity amongst restriction endonucleases. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone. This entry includes the MjaI (recognises CTAG but cleavage site unknown) restriction endonuclease.; GO: 0003677 DNA binding, 0009036 Type II site-specific deoxyribonuclease activity, 0009307 DNA restriction-modification system
Probab=30.82 E-value=27 Score=24.79 Aligned_cols=15 Identities=40% Similarity=0.784 Sum_probs=13.2
Q ss_pred HHHHHHHHHHcccCc
Q psy15440 36 ESVREWEEQLDSKYT 50 (79)
Q Consensus 36 ~~L~~W~~~f~~KYp 50 (79)
.++++|.+.|.++||
T Consensus 32 kt~eeWe~wY~~~~~ 46 (170)
T PF09568_consen 32 KTIEEWEEWYFEKYP 46 (170)
T ss_pred CCHHHHHHHHHhcCh
Confidence 489999999999875
No 31
>PF12512 DUF3717: Protein of unknown function (DUF3717) ; InterPro: IPR022191 This family of proteins is found in bacteria. Proteins in this family are typically between 75 and 117 amino acids in length. There is a conserved AIN sequence motif. There are two completely conserved residues (L and Y) that may be functionally important.
Probab=29.86 E-value=52 Score=20.17 Aligned_cols=19 Identities=21% Similarity=0.410 Sum_probs=16.7
Q ss_pred CCCCCCCHHHHHHHHHHHH
Q psy15440 25 DDLSDLNPAEWESVREWEE 43 (79)
Q Consensus 25 ~dls~Lt~~el~~L~~W~~ 43 (79)
-+.+.|++.+++++..|.+
T Consensus 51 v~~~~L~~~a~~A~~~w~~ 69 (71)
T PF12512_consen 51 VDEASLDPEARAAWLAWYD 69 (71)
T ss_pred CChhhCCHHHHHHHHHHHh
Confidence 4678999999999999975
No 32
>PF14376 Haem_bd: Haem-binding domain
Probab=29.61 E-value=57 Score=21.66 Aligned_cols=17 Identities=18% Similarity=0.468 Sum_probs=15.0
Q ss_pred CCCCHHHHHHHHHHHHH
Q psy15440 28 SDLNPAEWESVREWEEQ 44 (79)
Q Consensus 28 s~Lt~~el~~L~~W~~~ 44 (79)
+.||++|++.|.+|...
T Consensus 119 a~Ls~~ek~~Ll~Wi~~ 135 (137)
T PF14376_consen 119 AKLSEEEKQALLNWIKE 135 (137)
T ss_pred CCCCHHHHHHHHHHHHH
Confidence 66999999999999863
No 33
>PHA02057 ADP-ribosylation superfamily-like protein
Probab=28.39 E-value=21 Score=27.58 Aligned_cols=17 Identities=29% Similarity=0.233 Sum_probs=13.9
Q ss_pred HHHcccCceeeEEcCCC
Q psy15440 43 EQLDSKYTYIGKLLKPG 59 (79)
Q Consensus 43 ~~f~~KYp~VG~l~~~~ 59 (79)
.||++|||-+|+|..+.
T Consensus 16 ~~~k~~~ps~~~~~~~~ 32 (319)
T PHA02057 16 VYQKKKSPSIGKLSEEA 32 (319)
T ss_pred eeecccCCccccccccc
Confidence 47889999999997654
No 34
>cd00086 homeodomain Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.
Probab=27.22 E-value=1e+02 Score=16.30 Aligned_cols=21 Identities=19% Similarity=0.392 Sum_probs=15.8
Q ss_pred CCCCHHHHHHHHHHHHHHcccCc
Q psy15440 28 SDLNPAEWESVREWEEQLDSKYT 50 (79)
Q Consensus 28 s~Lt~~el~~L~~W~~~f~~KYp 50 (79)
..++..++..|..|... ..||
T Consensus 5 ~~~~~~~~~~Le~~f~~--~~~P 25 (59)
T cd00086 5 TRFTPEQLEELEKEFEK--NPYP 25 (59)
T ss_pred CcCCHHHHHHHHHHHHh--CCCC
Confidence 45788899999888776 4565
No 35
>PF00046 Homeobox: Homeobox domain not present here.; InterPro: IPR001356 The homeobox domain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well-conserved in many other animals, including vertebrates. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two alpha-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. The motif is very similar in sequence and structure in a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeotic proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereo-chemical requirement for glycine in the turn which is needed to avoid steric interference of the beta-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2DA3_A 1LFB_A 2LFB_A 2ECB_A 2DA5_A 3D1N_O 3A03_A 2XSD_C 3CMY_A 1AHD_P ....
Probab=26.56 E-value=97 Score=16.62 Aligned_cols=20 Identities=10% Similarity=0.341 Sum_probs=15.3
Q ss_pred CCCHHHHHHHHHHHHHHcccCc
Q psy15440 29 DLNPAEWESVREWEEQLDSKYT 50 (79)
Q Consensus 29 ~Lt~~el~~L~~W~~~f~~KYp 50 (79)
-+|+.|+..|..+... ..||
T Consensus 6 ~~t~~q~~~L~~~f~~--~~~p 25 (57)
T PF00046_consen 6 RFTKEQLKVLEEYFQE--NPYP 25 (57)
T ss_dssp SSSHHHHHHHHHHHHH--SSSC
T ss_pred CCCHHHHHHHHHHHHH--hccc
Confidence 4788999999888774 5555
No 36
>PRK01250 inorganic pyrophosphatase; Provisional
Probab=26.12 E-value=76 Score=22.36 Aligned_cols=26 Identities=19% Similarity=0.309 Sum_probs=21.7
Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHcc
Q psy15440 22 DEYDDLSDLNPAEWESVREWEEQLDS 47 (79)
Q Consensus 22 ~~~~dls~Lt~~el~~L~~W~~~f~~ 47 (79)
....++++|.+.-++.+.+|...|+.
T Consensus 119 ~~i~dl~dl~~~~l~eI~~fF~~YK~ 144 (176)
T PRK01250 119 DHIKDVNDLPELLKAQIKHFFEHYKD 144 (176)
T ss_pred cccCChHHCCHHHHHHHHHHHHHhcC
Confidence 34557888999999999999999973
No 37
>COG2096 cob(I)alamin adenosyltransferase [Coenzyme transport and metabolism]
Probab=25.97 E-value=97 Score=22.20 Aligned_cols=35 Identities=9% Similarity=0.229 Sum_probs=29.2
Q ss_pred CCCHHHHHHHHHHHHHHcccCceeeEEcCCCCCCC
Q psy15440 29 DLNPAEWESVREWEEQLDSKYTYIGKLLKPGESHI 63 (79)
Q Consensus 29 ~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~~~pt 63 (79)
.++++++.-|+.|...|...=|-+=..+-|+|.|.
T Consensus 84 ~i~~e~v~~LE~~id~y~~~l~~l~~FiLPGgs~~ 118 (184)
T COG2096 84 RITEEDVKRLEKRIDAYNAELPPLKSFVLPGGSPA 118 (184)
T ss_pred ccCHHHHHHHHHHHHHHHhcCCCcceeeccCCCHH
Confidence 58999999999999999988886666667777763
No 38
>PF07901 DUF1672: Protein of unknown function (DUF1672); InterPro: IPR012873 This family is composed of hypothetical bacterial proteins of unknown function.
Probab=25.92 E-value=57 Score=24.80 Aligned_cols=25 Identities=24% Similarity=0.450 Sum_probs=18.8
Q ss_pred HHHHHHHHHHHHHHcccCceeeEEc
Q psy15440 32 PAEWESVREWEEQLDSKYTYIGKLL 56 (79)
Q Consensus 32 ~~el~~L~~W~~~f~~KYp~VG~l~ 56 (79)
.++...|....+...+||+++|+=.
T Consensus 122 keefd~L~~f~~~~akky~~tG~Tk 146 (277)
T PF07901_consen 122 KEEFDNLDKFLEKNAKKYQYTGFTK 146 (277)
T ss_pred HHHHHHHHHHHHHHHHhcCCccccH
Confidence 3555677777777788999999653
No 39
>COG0133 TrpB Tryptophan synthase beta chain [Amino acid transport and metabolism]
Probab=25.67 E-value=49 Score=26.28 Aligned_cols=36 Identities=22% Similarity=0.351 Sum_probs=30.2
Q ss_pred CCCCHHHHHHHHHHHHHHcccCceeeEEcCCCCCCC
Q psy15440 28 SDLNPAEWESVREWEEQLDSKYTYIGKLLKPGESHI 63 (79)
Q Consensus 28 s~Lt~~el~~L~~W~~~f~~KYp~VG~l~~~~~~pt 63 (79)
..|...=-++|++|...++..|=++|.+++|.==|+
T Consensus 164 ~TLKDA~neAlRdWvtn~~~ThY~iGsa~GPHPyP~ 199 (396)
T COG0133 164 GTLKDAINEALRDWVTNVEDTHYLIGSAAGPHPYPT 199 (396)
T ss_pred chHHHHHHHHHHHHHhccccceEEEeeccCCCCchH
Confidence 347777778999999999999999999999765554
No 40
>KOG4048|consensus
Probab=24.93 E-value=50 Score=23.82 Aligned_cols=29 Identities=21% Similarity=0.238 Sum_probs=21.4
Q ss_pred CCCCCCCHHHHHHHHHHHHHHcccCceee
Q psy15440 25 DDLSDLNPAEWESVREWEEQLDSKYTYIG 53 (79)
Q Consensus 25 ~dls~Lt~~el~~L~~W~~~f~~KYp~VG 53 (79)
.|.++..-.--+.++.|..||+.+=|.|-
T Consensus 107 ~dFs~cP~nsdeEVnkWL~fyEM~ApL~c 135 (201)
T KOG4048|consen 107 ADFSSCPLNSDEEVNKWLHFYEMKAPLVC 135 (201)
T ss_pred cccCCCCCCCHHHHHHHHhHeeccCceee
Confidence 34555554555689999999999888765
No 41
>PF03221 HTH_Tnp_Tc5: Tc5 transposase DNA-binding domain; InterPro: IPR006600 This entry represents a DNA-binding helix-turn-helix domain found in the pogo family of transposable elements, the centromere protein Cenp-B, and yeast PDC2. There is extensive sequence similarity between Cenp-B and transposase proteins encoded by the pogo superfamily of transposable elements, which includes the human Tigger and Jerky elements. The HTH domain is composed of three alpha-helices, with the second and third helices, connected via a turn, comprising the helix-turn-helix motif. Helix 3 is termed the recognition helix as it binds the DNA major groove, as in other HTHs. This conserved DNA-binding domain is found in the following proteins: Cenp-B (major centromere autoantigen B or centromere protein B), which appears to organise arrays of centromere satellite DNA into a higher order structure that then directs centromere formation and kinetochore assembly in mammalian chromosomes. The N terminus of Cenp-B contains two DNA-binding HTH domains, which bind to adjacent major grooves of DNA: a psq-type HTH domain followed by a CenpB-type HTH domain, which together bind specifically to the Cenp-B box, which occurs in alpha-satellite DNA in human centromeres. Pogo family transposable elements, which include both Tigger and Jerky elements. Pogo contains two open reading frames flanked by inverted repeats. The N-terminal region of pogo transposase contains a Cenp-B-type HTH DNA-binding domain. Mammalian jerky protein, involved in epileptic seizures in mice. PDC2 (Pyruvate DeCarboxylase 2), which is a transcription factor required for the synthesis of the glycolytic enzyme pyruvate decarboxylase, required for high level expression of both the THI and the PDC genes. PDC2 may be important for a high basal level of PDC gene expression or play a positive role in the autoregulation control of PDC1 and PDC5.; PDB: 1HLV_A 1IUF_A.
Probab=24.86 E-value=39 Score=18.57 Aligned_cols=16 Identities=6% Similarity=0.351 Sum_probs=10.5
Q ss_pred HHHHHHHHHcccCcee
Q psy15440 37 SVREWEEQLDSKYTYI 52 (79)
Q Consensus 37 ~L~~W~~~f~~KYp~V 52 (79)
.=..|...|.++|.+.
T Consensus 48 ~s~~W~~~F~~Rh~i~ 63 (66)
T PF03221_consen 48 ASKGWLDRFKKRHGIK 63 (66)
T ss_dssp --CHHHHHHHHHTS--
T ss_pred cccHHHHHHHHHcCCC
Confidence 3357999999999764
No 42
>PF03392 OS-D: Insect pheromone-binding family, A10/OS-D; InterPro: IPR005055 A class of small (14-20 Kd) water-soluble proteins, called odorant binding proteins (OBPs), first discovered in the insect sensillar lymph but also in the mucus of vertebrates, is postulated to mediate the solubilisation of hydrophobic odorant molecules, and thereby to facilitate their transport to the receptor neurons. The product of a gene expressed in the olfactory system of Drosophila melanogaster (Fruit fly), OS-D, shares features common to vertebrate odorant-binding proteins, but has a primary structure unlike odorant-binding proteins. OS-D derivatives have subsequently been found in chemosensory organs of phylogenetically distinct insects, including cockroaches, phasmids and moths, suggesting that OS-D-like proteins are conserved in the insect phylum.; PDB: 1KX9_A 1N8U_A 1KX8_A 1K19_A 1N8V_A 2GVS_A 2JNT_A.
Probab=24.08 E-value=61 Score=20.47 Aligned_cols=27 Identities=19% Similarity=0.228 Sum_probs=21.1
Q ss_pred CCCCCCHHHHHHHHHHHHHHcccCcee
Q psy15440 26 DLSDLNPAEWESVREWEEQLDSKYTYI 52 (79)
Q Consensus 26 dls~Lt~~el~~L~~W~~~f~~KYp~V 52 (79)
..+..|+.|.+.+.....++..+||-.
T Consensus 51 ~C~KCt~kQK~~~~kv~~~l~~~~P~~ 77 (95)
T PF03392_consen 51 NCAKCTPKQKENARKVIKFLKKNYPDE 77 (95)
T ss_dssp TTTTS-HHHHHHHHHHHHHHHHHHHHH
T ss_pred cccCCCHHHHHHHHHHHHHHHHcCHHH
Confidence 567899999999998888888777643
No 43
>PF14409 Herpeto_peptide: Ribosomally synthesized peptide in Herpetosiphon
Probab=22.99 E-value=93 Score=18.44 Aligned_cols=22 Identities=14% Similarity=0.137 Sum_probs=15.6
Q ss_pred HHHHHHHHHHHHHcc---cCceeeE
Q psy15440 33 AEWESVREWEEQLDS---KYTYIGK 54 (79)
Q Consensus 33 ~el~~L~~W~~~f~~---KYp~VG~ 54 (79)
++-..+++|+..+.. -||.||=
T Consensus 24 ~eaa~~~~~V~~~~~~~G~~~~tgC 48 (58)
T PF14409_consen 24 EEAAEINDVVGCWKPRDGSYPVTGC 48 (58)
T ss_pred HHHHHHHHHHHhhhhccCCCCCCCC
Confidence 344678889988873 4888773
No 44
>PRK02230 inorganic pyrophosphatase; Provisional
Probab=22.98 E-value=88 Score=22.29 Aligned_cols=26 Identities=15% Similarity=0.242 Sum_probs=21.9
Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHcc
Q psy15440 22 DEYDDLSDLNPAEWESVREWEEQLDS 47 (79)
Q Consensus 22 ~~~~dls~Lt~~el~~L~~W~~~f~~ 47 (79)
....++++|.+..++.+.+|...|+.
T Consensus 105 ~~i~di~Dlp~~~l~~I~~fF~~YK~ 130 (184)
T PRK02230 105 DHINSLKDLPQHWLDEIEYFFSNYKN 130 (184)
T ss_pred hhcCChHHCCHHHHHHHHHHHHHhcC
Confidence 34557899999999999999999983
No 45
>PRK13797 putative bifunctional allantoicase/OHCU decarboxylase; Provisional
Probab=22.88 E-value=91 Score=25.62 Aligned_cols=35 Identities=3% Similarity=0.117 Sum_probs=27.7
Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHccc--CceeeEEc
Q psy15440 22 DEYDDLSDLNPAEWESVREWEEQLDSK--YTYIGKLL 56 (79)
Q Consensus 22 ~~~~dls~Lt~~el~~L~~W~~~f~~K--Yp~VG~l~ 56 (79)
....-++.|++++++.|..+-..|++| +|+|=.+.
T Consensus 435 ~EQAGL~~ls~~e~~~L~~lN~aY~eKFGFpFIIca~ 471 (516)
T PRK13797 435 REQAAMDQAAEDVRAAFARGNAAYEERFGFIFLVRAA 471 (516)
T ss_pred hhhhccccCCHHHHHHHHHHHHHHHHhCCCeEEEEEC
Confidence 344678899999999999999999987 55555444
No 46
>PRK05134 bifunctional 3-demethylubiquinone-9 3-methyltransferase/ 2-octaprenyl-6-hydroxy phenol methylase; Provisional
Probab=22.70 E-value=96 Score=21.20 Aligned_cols=26 Identities=15% Similarity=0.177 Sum_probs=21.2
Q ss_pred CCCCCHHHHHHHHHHHHHHcccCcee
Q psy15440 27 LSDLNPAEWESVREWEEQLDSKYTYI 52 (79)
Q Consensus 27 ls~Lt~~el~~L~~W~~~f~~KYp~V 52 (79)
++.+.++|.+.+..|.+.|...|-..
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 26 (233)
T PRK05134 1 MSNVDPAEIAKFSALAARWWDPNGEF 26 (233)
T ss_pred CCcccHHHHHHHHHHHHHHhccCCCc
Confidence 46789999999999999888777633
No 47
>PF06688 DUF1187: Protein of unknown function (DUF1187); InterPro: IPR009572 This family consists of several short, hypothetical bacterial proteins of around 62 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi. The function of this family is unknown.
Probab=22.16 E-value=73 Score=19.09 Aligned_cols=26 Identities=19% Similarity=0.552 Sum_probs=19.4
Q ss_pred CceeeEEcCCCCCCCCCccccccchh
Q psy15440 49 YTYIGKLLKPGESHINYEDEEKGSIE 74 (79)
Q Consensus 49 Yp~VG~l~~~~~~pt~y~~ee~~k~~ 74 (79)
|.+-..+..|+|.|+...+=.+.|+.
T Consensus 1 YkItAtI~KpG~~Pv~W~rys~~kmT 26 (61)
T PF06688_consen 1 YKITATIIKPGNTPVNWTRYSDSKMT 26 (61)
T ss_pred CceEEEEEcCCCCCeeeEEecCCccC
Confidence 67788899999999886655554443
No 48
>PLN02373 soluble inorganic pyrophosphatase
Probab=21.91 E-value=1e+02 Score=22.03 Aligned_cols=26 Identities=23% Similarity=0.526 Sum_probs=21.8
Q ss_pred CCCCCCCCCCHHHHHHHHHHHHHHcc
Q psy15440 22 DEYDDLSDLNPAEWESVREWEEQLDS 47 (79)
Q Consensus 22 ~~~~dls~Lt~~el~~L~~W~~~f~~ 47 (79)
....++++|.+.-++.+.+|...|+.
T Consensus 125 ~~i~dl~Dl~~~~l~~I~~fF~~YK~ 150 (188)
T PLN02373 125 RHYTDIKELPPHRLAEIRRFFEDYKK 150 (188)
T ss_pred ccCCChHHCCHHHHHHHHHHHHHhcc
Confidence 34556788999999999999999984
No 49
>PF06297 PET: PET Domain; InterPro: IPR010442 The PET domain is a ~110 amino acid motif in the N-terminal part of LIM domain proteins. The domain was described in Drosophila proteins involved in cell differentiation and is named after Prickle, Espinas and Testin. PET domain proteins contain about three zinc-binding LIM domains (see PDOC00382 from INTERPRO, IPR001781 from INTERPRO) and are found among metazoans. The PET domain has been suggested to play a role in protein-protein interactions with proteins involved in planar polarity signalling or organisation of the cytoskeleton. Some proteins known to contain a PET domain: Mammalian testin protein (Q9UGI8 from SWISSPROT), which may function as a tumour suppressor. Mammalian LIM domain only protein 6 (LMO6/Prickle3, O43900 from SWISSPROT). Fruit fly prickle (A1Z6W3 from SWISSPROT) and espinas (Q9U1I1 from SWISSPROT) proteins encoded by the tissue polarity gene prickle (pk), involved in the control of orientation of bristles and hairs. Mammalian prickle-like proteins 1 (Q96MT3 from SWISSPROT) and 2 (Q7Z3G6 from SWISSPROT).; GO: 0008270 zinc ion binding
Probab=21.90 E-value=1.1e+02 Score=19.92 Aligned_cols=31 Identities=13% Similarity=0.197 Sum_probs=26.6
Q ss_pred CCCCCHHHHHHHHHHHHHHcccCceeeEEcC
Q psy15440 27 LSDLNPAEWESVREWEEQLDSKYTYIGKLLK 57 (79)
Q Consensus 27 ls~Lt~~el~~L~~W~~~f~~KYp~VG~l~~ 57 (79)
+.+|+++|.+.|.......++..-=||.|..
T Consensus 74 C~~Lse~E~k~l~~F~~~rk~~alG~G~V~~ 104 (106)
T PF06297_consen 74 CHSLSEEEKKELEDFVKQRKEEALGVGTVKE 104 (106)
T ss_pred HhhCCHHHHHHHHHHHHHHHHHcCCeeEEEe
Confidence 5789999999999999999888777887753
No 50
>PRK11789 N-acetyl-anhydromuranmyl-L-alanine amidase; Provisional
Probab=21.86 E-value=79 Score=22.31 Aligned_cols=32 Identities=16% Similarity=0.334 Sum_probs=26.4
Q ss_pred CCCCCCHHHHHHHHHHHHHHcccCcee-eEEcC
Q psy15440 26 DLSDLNPAEWESVREWEEQLDSKYTYI-GKLLK 57 (79)
Q Consensus 26 dls~Lt~~el~~L~~W~~~f~~KYp~V-G~l~~ 57 (79)
+...+|+.|.++|.........+||+. .+|++
T Consensus 123 ~~~~~t~aQ~~aL~~L~~~L~~~y~i~~~~IvG 155 (185)
T PRK11789 123 DTLPFTDAQYQALAALTRALRAAYPIIAERITG 155 (185)
T ss_pred CCCCccHHHHHHHHHHHHHHHHHcCCCHHhEEe
Confidence 346799999999999999999999975 45554
Done!