RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy7233
(383 letters)
>gnl|CDD|233420 TIGR01452, PGP_euk, phosphoglycolate/pyridoxal phosphate
phosphatase family. PGP is an essential enzyme in the
glycolate salvage pathway in higher organisms
(photorespiration in plants). Phosphoglycolate results
from the oxidase activity of RubisCO in the Calvin cycle
when concentrations of carbon dioxide are low relative
to oxygen. In mammals, PGP is found in many tissues,
notably in red blood cells where P-glycolate is an
important activator of the hydrolysis of
2,3-bisphosphoglycerate, a major modifier of the oxygen
affinity of hemoglobin. Pyridoxal phosphate (PLP,
Vitamin B6) phosphatase is involved in the degradation
of PLP in mammals and is widely distributed in human
tissues including erythrocytes. The enzymes described
here are members of the Haloacid dehalogenase
superfamily of hydrolase enzymes (pfam00702). Unlike the
bacterial PGP equivalog (TIGR01449), which is a member
of class (subfamily) I, these enzymes are members of
class (subfamily) II. These two families have almost
certainly arisen from convergent evolution (although
these two ancestors may themselves have diverged from a
more distant HAD superfamily progenitor). The primary
seed sequence for this model comes from Chlamydomonas
reinhardtii, a photosynthetic alga. The enzyme has been
purified and characterized and these data are fully
consistent with the assignment of function as a PGPase
involved in photorespiration. The second seed, from Homo
sapiens chromosome 22, has been characterized as a
pyridoxal phosphatase. Biochemical characterization of
partially purified PGPs from various tissues including
red blood cells has been performed, while one gene for
PGP has been localized to chromosome 16p13.3. The
sequence used here maps to chromosome 22. There is
indeed a related gene on chromosome 16 (and it is
expressed, since ESTs are found) which shows 46%
identity and 59% positives by BLAST2 (E=1e-66). The
chromosome 16 gene is not in evidence in nraa but
translated from the genomic sequence would score 372.4
(E=7.9e-113) versus this model, well above trusted. The
third seed, from C. elegans, is only supported by
sequence similarity. This model is limited to eukaryotic
species including S. pombe and S. cerevisiae, although
several archaea score between the trusted and noise
cutoffs. This model is closely related to a family of
bacterial sequences including the E. coli NagD and B.
subtilis AraL genes which are characterized by the
ability to hydrolyze para-nitrophenylphosphate (pNPPases
or NPPases). The chlamydomonas PGPase d.
Length = 279
Score = 271 bits (695), Expect = 8e-90
Identities = 109/282 (38%), Positives = 164/282 (58%), Gaps = 5/282 (1%)
Query: 96 DTVLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNA 155
+ DCDGVLWL ++ GA ++++ L GK +VTNNSTK+R + +K LGFN
Sbjct: 3 QGFIFDCDGVLWLGERVVPGAPELLDRLARAGKAALFVTNNSTKSRAEYALKFARLGFNG 62
Query: 156 EPNEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMIPGRDLK 215
++ +A AA+ L++ D K Y++G G+ EL+ AGI G G +
Sbjct: 63 LAEQLFSSALCAARLLRQPPDAPKAVYVIGEEGLRAELDAAGIRLAGDPSAG--DGAAPR 120
Query: 216 TDHEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFPMGPHVTVP 275
+ L+ +VGAVVVG+D H S+ KL +A +L P LFVATN D P+ P
Sbjct: 121 GSGAFMKLEENVGAVVVGYDEHFSYAKLREACAHLREPGCLFVATNRDPWHPLSDGSRTP 180
Query: 276 GTGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNG 335
GTGS+VAA++T + R+P+V+GKPS + + E ++++P RTLM+GDR TDI G+ G
Sbjct: 181 GTGSLVAAIETASGRQPLVVGKPSPYMFECITENFSIDPARTLMVGDRLETDILFGHRCG 240
Query: 336 FQTLLVLTGDTTMEKAIAWSKSEDEEYKSRVADYYLSSLGDM 377
T+LVL+G + +E+A + + + V DY + SL D+
Sbjct: 241 MTTVLVLSGVSRLEEAQEYLAAGQHDL---VPDYVVESLADL 279
>gnl|CDD|178251 PLN02645, PLN02645, phosphoglycolate phosphatase.
Length = 311
Score = 245 bits (626), Expect = 6e-79
Identities = 117/298 (39%), Positives = 169/298 (56%), Gaps = 11/298 (3%)
Query: 83 LSGDKQKDFLNSFDTVLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTRE 142
L+ + + ++S +T + DCDGV+W ++LI G + ++ L+S+GKK+ +VTNNSTK+R
Sbjct: 16 LTLENADELIDSVETFIFDCDGVIWKGDKLIEGVPETLDMLRSMGKKLVFVTNNSTKSRA 75
Query: 143 QLIVKLKHLGFNAEPNEIIGTAYLAAQYLKK-HLDPKKKAYIVGSSGIADELNLAGIENF 201
Q K + LG N EI +++ AA YLK + KK Y++G GI +EL LAG +
Sbjct: 76 QYGKKFESLGLNVTEEEIFSSSFAAAAYLKSINFPKDKKVYVIGEEGILEELELAGFQYL 135
Query: 202 GVGPDVMIPGRDLKTDHEKLNLDPHVGAVVVGFDSHISFPKLMKAA-CYLTNPNTLFVAT 260
G GP+ +LK + D VGAVVVGFD +I++ K+ A C NP LF+AT
Sbjct: 136 G-GPEDGDKKIELKPG-FLMEHDKDVGAVVVGFDRYINYYKIQYATLCIRENPGCLFIAT 193
Query: 261 NTDESFPMGPHVTVPGTGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMI 320
N D + G GSMV A+K +REP+V+GKPS + YL K+ + + M+
Sbjct: 194 NRDAVTHLTDAQEWAGAGSMVGAIKGSTEREPLVVGKPSTFMMDYLANKFGIEKSQICMV 253
Query: 321 GDRGNTDIRLGYNNGFQTLLVLTGDTTMEKAIAWSKSEDEEYKSRVADYYLSSLGDML 378
GDR +TDI G N G +TLLVL+G T+ E K + D+Y S + D L
Sbjct: 254 GDRLDTDILFGQNGGCKTLLVLSGVTSESML------LSPENKIQ-PDFYTSKISDFL 304
>gnl|CDD|223720 COG0647, NagD, Predicted sugar phosphatases of the HAD superfamily
[Carbohydrate transport and metabolism].
Length = 269
Score = 214 bits (548), Expect = 8e-68
Identities = 97/294 (32%), Positives = 143/294 (48%), Gaps = 31/294 (10%)
Query: 89 KDFLNSFDTVLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKL 148
D ++ +D L D DGVL+ NE I GA + + LK+ GK + ++TNNST++RE + +L
Sbjct: 2 FDVMDKYDGFLFDLDGVLYRGNEAIPGAAEALKRLKAAGKPVIFLTNNSTRSREVVAARL 61
Query: 149 KHLGFN-AEPNEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDV 207
LG P++I+ + A YL K KK Y++G G+ +EL AG E
Sbjct: 62 SSLGGVDVTPDDIVTSGDATADYLAKQKPG-KKVYVIGEEGLKEELEGAGFELVD----- 115
Query: 208 MIPGRDLKTDHEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFP 267
+ E +D AVVVG D +++ + + A F+ATN D + P
Sbjct: 116 ---------EEEPARVD----AVVVGLDRTLTY-EKLAEALLAIAAGAPFIATNPDLTVP 161
Query: 268 MGPHVTVPGTGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTD 327
PG G++ A ++ REP VIGKPS I +EK L+ LM+GDR +TD
Sbjct: 162 T-ERGLRPGAGAIAALLEQATGREPTVIGKPSPAIYEAALEKLGLDRSEVLMVGDRLDTD 220
Query: 328 IRLGYNNGFQTLLVLTGDTTMEKAIAWSKSEDEEYKSRVADYYLSSLGDMLPFL 381
I G TLLVLTG + +ED + Y + SL +++ L
Sbjct: 221 ILGAKAAGLDTLLVLTGVS---------SAEDLDRAEVKPTYVVDSLAELITAL 265
>gnl|CDD|233422 TIGR01460, HAD-SF-IIA, Haloacid Dehalogenase Superfamily Class
(subfamily) IIA. This model represents one structural
subclass of the Haloacid Dehalogenase (HAD) superfamily
of aspartate-nucleophile hydrolases. The superfamily is
defined by the presence of three short catalytic motifs.
The classes are defined based on the location and the
observed or predicted fold of a so-called "capping
domain", or the absence of such a domain. Class I
consists of sequences in which the capping domain is
found in between the first and second catalytic motifs.
Class II consists of sequences in which the capping
domain is found between the second and third motifs.
Class III sequences have no capping domain in either of
these positions. The Class IIA capping domain is
predicted by PSI-PRED to consist of a mixed alpha-beta
fold with the basic pattern:
Helix-Helix-Helix-Sheet-Helix-Loop-Sheet-Helix-Sheet-
Helix. Presently, this subfamily encompasses a single
equivalog model (TIGR01452) for the eukaryotic
phosphoglycolate phosphatase, as well as four
hypothetical equivalogs covering closely related
sequences (TIGR01456 and TIGR01458 in eukaryotes,
TIGR01457 in gram positive bacteria and TIGR01459 in
gram negative bacteria). The Escherichia coli NagD gene
and the Bacillus subtilis AraL gene are members of this
subfamily but are not members of any of the
presently defined equivalogs within it. NagD is part of
the NAG operon responsible for N-acetylglucosamine
metabolism. The function of this gene is unknown. Genes
from several organisms have been annotated as NagD, or
NagD-like. However, without data on the presence of
other members of this pathway (such as in the case of
Yersinia pestis), these assignments should not be given
great weight. The AraL gene is similar: it is part of
the L-arabinose operon, but the function is unknown. A
gene from Halobacterium has been annotated as AraL, but
no other Ara operon genes have been annotated. Many of
the genes in this subfamily have been annotated as
"pNPPase", "4-nitrophenyl phosphatase" or "NPPase". These
all refer to the same activity versus a common lab test
compound used to determine phosphatase activity. There
is no evidence that this activity is physiologically
relevant [Unknown function, Enzymes of unknown
specificity].
Length = 236
Score = 183 bits (466), Expect = 5e-56
Identities = 86/249 (34%), Positives = 128/249 (51%), Gaps = 15/249 (6%)
Query: 98 VLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHL-GFNAE 156
L D DGVLWL ++ I GA + +N L++ GK + ++TNNS+++ E KL L G +
Sbjct: 1 FLFDIDGVLWLGHKPIPGAAEALNRLRAKGKPVVFLTNNSSRSEEDYAEKLSSLLGVDVS 60
Query: 157 PNEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMIPGRDLKT 216
P++II + + L++ +K Y++G + + L G N D
Sbjct: 61 PDQIITSGSVTKDLLRQRF-EGEKVYVIGVGELRESLEGLGFRN------------DFFD 107
Query: 217 DHEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFPMGPHVTVPG 276
D + L ++ AV+VG S S+ +L KAA L + F+A N D+ +G PG
Sbjct: 108 DIDHLAIEKIPAAVIVGEPSDFSYDELAKAAYLLAEGDVPFIAANRDDLVRLGDGRFRPG 167
Query: 277 TGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERT-LMIGDRGNTDIRLGYNNG 335
G++ A +K + REP V+GKPS I + PER +M+GD TDI N G
Sbjct: 168 AGAIAAGIKELSGREPTVVGKPSPAIYRAALNLLQARPERRDVMVGDNLRTDILGAKNAG 227
Query: 336 FQTLLVLTG 344
F TLLVLTG
Sbjct: 228 FDTLLVLTG 236
>gnl|CDD|205524 pfam13344, Hydrolase_6, Haloacid dehalogenase-like hydrolase. This
family is part of the HAD superfamily.
Length = 101
Score = 123 bits (312), Expect = 6e-35
Identities = 46/102 (45%), Positives = 67/102 (65%), Gaps = 1/102 (0%)
Query: 98 VLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEP 157
L D DGVLW E I GA + +N+L++ GK + +VTNNS+++REQ KL+ LGF+ +
Sbjct: 1 FLFDVDGVLWRGGEPIPGAAEALNALRAAGKPVVFVTNNSSRSREQYAKKLRKLGFDVDE 60
Query: 158 NEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIE 199
+E+I +A AA YLK+ KK ++GS G+ +EL AG E
Sbjct: 61 DEVITSATAAADYLKERK-FGKKVLVIGSEGLREELEEAGFE 101
>gnl|CDD|130524 TIGR01457, HAD-SF-IIA-hyp2, HAD-superfamily subfamily IIA
hydrolase, TIGR01457. This hypothetical equivalog is a
member of the Class IIA subfamily of the haloacid
dehalogenase superfamily of aspartate-nucleophile
hydrolases. The sequences modelled by this equivalog are
all gram positive (low-GC) bacteria. Sequences found in
this model are annotated variously as related to NagD or
4-nitrophenyl phosphatase, and this hypothetical
equivalog, of all of those within the Class IIA
subfamily, is most closely related to the E. coli NagD
enzyme and the PGP_euk equivalog (TIGR01452). However,
there is presently no evidence that this hypothetical
equivalog has the same function as either of those [Unknown
function, Enzymes of unknown specificity].
Length = 249
Score = 126 bits (319), Expect = 3e-34
Identities = 80/252 (31%), Positives = 114/252 (45%), Gaps = 27/252 (10%)
Query: 99 LTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEPN 158
L D DG ++ E I A + + +LK G +VTNNST+T EQ+ KL A
Sbjct: 5 LIDLDGTMYNGTEKIEEACEFVRTLKKRGVPYLFVTNNSTRTPEQVADKLVSFDIPA-TE 63
Query: 159 EIIGTAYLA-AQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMIPGRDLKTD 217
E + T +A AQY+ + Y++G G+ + + G+ G PD
Sbjct: 64 EQVFTTSMATAQYIAQ-QKKDASVYVIGEEGLREAIKENGLTFGGENPDY---------- 112
Query: 218 HEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFPMGPHVTVPGT 277
VVVG D I++ K A + N F++TN D + P + +PG
Sbjct: 113 ------------VVVGLDRSITYEKFAVACLAIRN-GARFISTNGDIAIPTERGL-LPGN 158
Query: 278 GSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQ 337
GS+ + + +PV IGKP +I + + E TLM+GD TDI G N G
Sbjct: 159 GSLTSVLTVSTGVKPVFIGKPESIIMEQAMRVLGTDVEETLMVGDNYATDIMAGINAGID 218
Query: 338 TLLVLTGDTTME 349
TLLV TG T E
Sbjct: 219 TLLVHTGVTKRE 230
>gnl|CDD|182466 PRK10444, PRK10444, UMP phosphatase; Provisional.
Length = 248
Score = 100 bits (251), Expect = 1e-24
Identities = 69/253 (27%), Positives = 117/253 (46%), Gaps = 31/253 (12%)
Query: 98 VLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEP 157
V+ D DGVL +N + GA + ++ + G + +TN ++T + L + G + P
Sbjct: 4 VICDIDGVLMHDNVAVPGAAEFLHRILDKGLPLVLLTNYPSQTGQDLANRFATAGVDV-P 62
Query: 158 NEIIGTAYLA-AQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMIPGRDLKT 216
+ + T+ +A A +L++ KKAY++G + EL AG + PD +I G
Sbjct: 63 DSVFYTSAMATADFLRRQ--EGKKAYVIGEGALIHELYKAGFTITDINPDFVIVG----- 115
Query: 217 DHEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFPMGPHVTVPG 276
+ N D +M A Y F+ATN D P
Sbjct: 116 ETRSYNWD------------------MMHKAAYFVANGARFIATNPDTHGRG----FYPA 153
Query: 277 TGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGF 336
G++ A ++ + R+P +GKPS I + K + E T+++GD TDI G+ G
Sbjct: 154 CGALCAGIEKISGRKPFYVGKPSPWIIRAALNKMQAHSEETVIVGDNLRTDILAGFQAGL 213
Query: 337 QTLLVLTGDTTME 349
+T+LVL+G +T++
Sbjct: 214 ETILVLSGVSTLD 226
>gnl|CDD|222003 pfam13242, Hydrolase_like, HAD-hyrolase-like.
Length = 74
Score = 68.1 bits (167), Expect = 1e-14
Identities = 31/83 (37%), Positives = 43/83 (51%), Gaps = 10/83 (12%)
Query: 294 VIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGDTTMEKAIA 353
V GKP+ + +E+ ++PE +MIGD +TDI G +T+LVLTG TT
Sbjct: 1 VCGKPNPGMLRAALERLGVDPEECVMIGDS-DTDILAARAAGIRTILVLTGVTT------ 53
Query: 354 WSKSEDEEYKSRVADYYLSSLGD 376
+ED E DY + SL D
Sbjct: 54 ---AEDLERAPGRPDYVVDSLAD 73
>gnl|CDD|119389 cd01427, HAD_like, Haloacid dehalogenase-like hydrolases. The
haloacid dehalogenase-like (HAD) superfamily includes
L-2-haloacid dehalogenase, epoxide hydrolase,
phosphoserine phosphatase, phosphomannomutase,
phosphoglycolate phosphatase, P-type ATPase, and many
others, all of which use a nucleophilic aspartate in
their phosphoryl transfer reaction. All members possess
a highly conserved alpha/beta core domain, and many also
possess a small cap domain, the fold and function of
which are variable. Members of this superfamily are
sometimes referred to as belonging to the DDDD
superfamily of phosphohydrolases.
Length = 139
Score = 63.9 bits (156), Expect = 2e-12
Identities = 33/123 (26%), Positives = 53/123 (43%), Gaps = 16/123 (13%)
Query: 98 VLTDCDGVLW---------LENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKL 148
VL D DG L E EL G + + LK G K+ TN S R +++ L
Sbjct: 2 VLFDLDGTLLDSEPGIAEIEELELYPGVKEALKELKEKGIKLALATNKS---RREVLELL 58
Query: 149 KHLGFNAEPNEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPD-- 206
+ LG + + +I + A Y K+ L + +G + LA ++ GV P+
Sbjct: 59 EELGLDDYFDPVITSNGAAIYYPKEGLFLGGGPFDIGKPNP--DKLLAALKLLGVDPEEV 116
Query: 207 VMI 209
+M+
Sbjct: 117 LMV 119
Score = 55.5 bits (134), Expect = 2e-09
Identities = 23/98 (23%), Positives = 33/98 (33%), Gaps = 13/98 (13%)
Query: 256 LFVATNT------------DESFPMGPHVTVPGTGSMVAAVKTGAQREPVVIGKPSKLIG 303
L +ATN P +T G P IGKP+
Sbjct: 43 LALATNKSRREVLELLEELGLDDYFDPVITSNGAAIYYPKEGLFLGGGPFDIGKPNPDKL 102
Query: 304 SYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLV 341
++ ++PE LM+GD N DI + G + V
Sbjct: 103 LAALKLLGVDPEEVLMVGDSLN-DIEMAKAAGGLGVAV 139
>gnl|CDD|162372 TIGR01458, HAD-SF-IIA-hyp3, HAD-superfamily subfamily IIA
hydrolase, TIGR01458. This hypothetical equivalog is a
member of the IIA subfamily (TIGR01460) of the haloacid
dehalogenase superfamily of aspartate-nucleophile
hydrolases. One sequence (GP|10716807) has been
annotated as a "phospholysine phosphohistidine inorganic
pyrophosphatase," probably in reference to studies on
similarly described (but unsequenced) enzymes from
bovine and rat tissues. However, the supporting
information for this annotation has never been published
[Unknown function, Enzymes of unknown specificity].
Length = 257
Score = 63.3 bits (154), Expect = 2e-11
Identities = 64/287 (22%), Positives = 111/287 (38%), Gaps = 49/287 (17%)
Query: 98 VLTDCDGVLWLE----NELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGF 153
VL D GVL++ + G+ + + L+ K+ +VTN + ++++ L+ +L+ LGF
Sbjct: 4 VLLDISGVLYISDAGGGTAVPGSQEAVKRLRGASVKVRFVTNTTKESKQDLLERLQRLGF 63
Query: 154 NAEPNEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMIPGRD 213
+ +E+ A A Q L++ + P +++ R
Sbjct: 64 DISEDEVFTPAPAARQLLEEK---------------------------QLRPMLLVDDRV 96
Query: 214 LKTDHEKLNLDPHVGAVVVGF-DSHISFPKLMKAACYLTN-PNTLFVATNTDESFPM--G 269
L DP+ VV+G H S+ L +A L + + +A + G
Sbjct: 97 LPDFDGIDTSDPN--CVVMGLAPEHFSYQILNQAFRLLLDGAKPVLIAIGKGRYYKRKDG 154
Query: 270 PHVTVPGTGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIR 329
+ G V A++ + V+GKPSK + PE +MIGD D+
Sbjct: 155 LAL---DVGPFVTALEYATDTKATVVGKPSKTFFLEALRATGCEPEEAVMIGDDCRDDVG 211
Query: 330 LGYNNGFQTLLVLTGDTTMEKAIAWSKSEDEEYKSRVADYYLSSLGD 376
+ G + + V TG + S DEE + D SL
Sbjct: 212 GAQDCGMRGIQVRTGK--------YRPS-DEEKINVPPDLTCDSLPH 249
>gnl|CDD|130526 TIGR01459, HAD-SF-IIA-hyp4, HAD-superfamily class IIA hydrolase,
TIGR01459. This hypothetical equivalog is a member of
the Class IIA subfamily of the haloacid dehalogenase
superfamily of aspartate-nucleophile hydrolases. The
sequences modelled by this equivalog are all gram
negative and primarily alpha proteobacteria. Only one
sequence has been annotated as other than
"hypothetical." That one, from Brucella, is annotated as
related to NagD, but only by sequence similarity and
should be treated with some skepticism. (See comments
for Class IIA subfamily model).
Length = 242
Score = 52.2 bits (125), Expect = 1e-07
Identities = 73/268 (27%), Positives = 106/268 (39%), Gaps = 42/268 (15%)
Query: 90 DFLNSFDTVLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLK 149
D +N +D L D GV+ N GA Q +N + + GK +++V+N+ L LK
Sbjct: 3 DLINDYDVFLLDLWGVIIDGNHTYPGAVQNLNKIIAQGKPVYFVSNSPRNIFS-LHKTLK 61
Query: 150 HLGFNAE-PNEIIGTAYLAAQYLKKHLDPKKKA-------YIVGSSGIADELNLAGI--- 198
LG NA+ P II + +A Q + L+ KK+ Y++G D +NL
Sbjct: 62 SLGINADLPEMIISSGEIAVQMI---LESKKRFDIRNGIIYLLG-HLENDIINLMQCYTT 117
Query: 199 --ENFGVGPDVMIPGRDLKTDHEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTL 256
EN + I +++EKL+LD FD + K NP+
Sbjct: 118 DDENKANASLITIYR----SENEKLDLDE--------FDELFAPIVARKIPNICANPDRG 165
Query: 257 FVATNTDESFPMGPHVTVPGTGSMVAAVKTGAQREPVVIGKPSKLI-GSYLIEKYNLNPE 315
+ G G A + + + GKP I L E N+
Sbjct: 166 INQHG----------IYRYGAG-YYAELIKQLGGKVIYSGKPYPAIFHKALKECSNIPKN 214
Query: 316 RTLMIGDRGNTDIRLGYNNGFQTLLVLT 343
R LM+GD TDI G T LVLT
Sbjct: 215 RMLMVGDSFYTDILGANRLGIDTALVLT 242
>gnl|CDD|223620 COG0546, Gph, Predicted phosphatases [General function prediction
only].
Length = 220
Score = 51.3 bits (123), Expect = 1e-07
Identities = 24/77 (31%), Positives = 33/77 (42%), Gaps = 12/77 (15%)
Query: 305 YLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGDTTMEKAIAWSKSEDEEYKS 364
L+EK L+PE LM+GD N DI G + V G + E E +
Sbjct: 153 LLLEKLGLDPEEALMVGDSLN-DILAAKAAGVPAVGVTWGYNSRE--------ELAQAG- 202
Query: 365 RVADYYLSSLGDMLPFL 381
AD + SL ++L L
Sbjct: 203 --ADVVIDSLAELLALL 217
Score = 30.5 bits (69), Expect = 1.2
Identities = 26/97 (26%), Positives = 43/97 (44%), Gaps = 13/97 (13%)
Query: 108 LENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIGTAYLA 167
LE+ L G +++ +LKS G K+ VTN + + L LK LG + I+G +
Sbjct: 86 LESRLFPGVKELLAALKSAGYKLGIVTNKPERELDIL---LKALGLADYFDVIVGGDDVP 142
Query: 168 AQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVG 204
K P + ++ + ++L L E VG
Sbjct: 143 P---PK---PDPEPLLL----LLEKLGLDPEEALMVG 169
>gnl|CDD|216069 pfam00702, Hydrolase, haloacid dehalogenase-like hydrolase. This
family is structurally different from the alpha/beta
hydrolase family (pfam00561). This family includes
L-2-haloacid dehalogenase, epoxide hydrolases and
phosphatases. The structure of the family consists of
two domains. One is an inserted four helix bundle, which
is the least well conserved region of the alignment,
between residues 16 and 96 of Pseudomonas sp.
(S)-2-haloacid dehalogenase 1. The rest of the fold is
composed of the core alpha/beta domain. Those members
with the characteristic DxD triad at the N-terminus are
probably phosphatidylglycerolphosphate (PGP)
phosphatases involved in cardiolipin biosynthesis in the
mitochondria.
Length = 187
Score = 44.6 bits (105), Expect = 2e-05
Identities = 39/235 (16%), Positives = 69/235 (29%), Gaps = 54/235 (22%)
Query: 95 FDTVLTDCDGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFN 154
V+ D DG L ++ A+ ++ + +LG I + +
Sbjct: 1 IKAVVFDLDGTLTDGEPVVPEAEALLEAAAALGVAIVIAAGENLTKEGR----------- 49
Query: 155 AEPNEIIGTAYLAAQYLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMIPGRDL 214
++ + +A E L + G ++ L
Sbjct: 50 -----------------------EELVRRLLLRALAGEELLEELLRAGATVVAVLDLVVL 86
Query: 215 KTDHEKLNLDPHVGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFPMGPHVTV 274
L P + K +K A L + T + +
Sbjct: 87 GLIALTDPLYPGAREAL----------KELKEAGI-----KLAILTGDNRLTANAIARLL 131
Query: 275 PGTGSMVAAVKTGAQREPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIR 329
++V+A G V +GKP I +E+ + PE LM+GD N DI
Sbjct: 132 GLFDALVSADLYG----LVGVGKPDPKIFELALEELGVKPEEVLMVGDGVN-DIP 181
>gnl|CDD|223319 COG0241, HisB, Histidinol phosphatase and related phosphatases
[Amino acid transport and metabolism].
Length = 181
Score = 43.0 bits (102), Expect = 6e-05
Identities = 17/48 (35%), Positives = 30/48 (62%), Gaps = 1/48 (2%)
Query: 297 KPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTG 344
KP + +++YN++ R+ ++GDR TD++ N G + +LVLTG
Sbjct: 105 KPKPGMLLSALKEYNIDLSRSYVVGDR-LTDLQAAENAGIKGVLVLTG 151
>gnl|CDD|225090 COG2179, COG2179, Predicted hydrolase of the HAD superfamily
[General function prediction only].
Length = 175
Score = 41.5 bits (98), Expect = 2e-04
Identities = 16/46 (34%), Positives = 25/46 (54%)
Query: 296 GKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLV 341
KP +++ NL PE +M+GD+ TD+ G G +T+LV
Sbjct: 92 KKPFGRAFRRALKEMNLPPEEVVMVGDQLFTDVLGGNRAGMRTILV 137
>gnl|CDD|162787 TIGR02253, CTE7, HAD superfamily (subfamily IA) hydrolase,
TIGR02253. This family of sequences from archaea and
metazoans includes the human uncharacterized protein
CTE7. Pyrococcus species appear to have three different
forms of this enzyme, so it is unclear whether all
members of this family have the same function. This
family is a member of the haloacid dehalogenase (HAD)
superfamily of hydrolases which are characterized by
three conserved sequence motifs. By virtue of an alpha
helical domain in-between the first and second conserved
motif, this family is a member of subfamily IA
(TIGR01549).
Length = 221
Score = 42.0 bits (99), Expect = 2e-04
Identities = 27/89 (30%), Positives = 44/89 (49%), Gaps = 10/89 (11%)
Query: 290 REPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGDTTME 349
E + KP I +++ + PE +M+GDR + DI+ N G +T+ + G +
Sbjct: 143 SEEEGVEKPHPKIFYAALKRLGVKPEEAVMVGDRLDKDIKGAKNLGMKTVWINQGKS--- 199
Query: 350 KAIAWSKSEDEEYKSRVADYYLSSLGDML 378
SK ED+ Y DY +SSL ++L
Sbjct: 200 -----SKMEDDVYPY--PDYEISSLRELL 221
>gnl|CDD|233517 TIGR01662, HAD-SF-IIIA, HAD-superfamily hydrolase, subfamily IIIA.
This subfamily falls within the Haloacid Dehalogenase
(HAD) superfamily of aspartate-nucleophile hydrolases.
The Class III subfamilies are characterized by the lack
of any domains located either between the first
and second conserved catalytic motifs (as in the Class I
subfamilies, TIGR01493, TIGR01509, TIGR01488 and
TIGR01494) or between the second and third conserved
catalytic motifs (as in the Class II subfamilies,
TIGR01460 and TIGR01484) of the superfamily domain. The
IIIA subfamily contains five major clades:
histidinol-phosphatase (TIGR01261) and
histidinol-phosphatase-related protein (TIGR00213) which
together form a subfamily (TIGR01656), DNA
3'-phosphatase (TIGR01663, TIGR01664), YqeG (TIGR01668)
and YrbI (TIGR01670). In the case of histidinol
phosphatase and PNK-3'-phosphatase, this model
represents a domain of a bifunctional system. In the
histidinol phosphatase HisB, a C-terminal domain is an
imidazoleglycerol-phosphate dehydratase which catalyzes
a related step in histidine biosynthesis. In
PNK-3'-phosphatase, N- and C-terminal domains constitute
the polynucleotide kinase and DNA-binding components of
the enzyme [Unknown function, Enzymes of unknown
specificity].
Length = 132
Score = 39.7 bits (93), Expect = 4e-04
Identities = 10/40 (25%), Positives = 23/40 (57%), Gaps = 1/40 (2%)
Query: 305 YLIEKYN-LNPERTLMIGDRGNTDIRLGYNNGFQTLLVLT 343
++++N ++PE ++ +GD+ TD++ G +LV
Sbjct: 93 EALKRFNEIDPEESVYVGDQDLTDLQAAKRAGLAFILVAP 132
>gnl|CDD|223943 COG1011, COG1011, Predicted hydrolase (HAD superfamily) [General
function prediction only].
Length = 229
Score = 40.7 bits (95), Expect = 5e-04
Identities = 19/63 (30%), Positives = 28/63 (44%)
Query: 291 EPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGDTTMEK 350
E V + KP I Y +EK + PE L +GD DI G +T+ + G +
Sbjct: 148 EDVGVAKPDPEIFEYALEKLGVPPEEALFVGDSLENDILGARALGMKTVWINRGGKPLPD 207
Query: 351 AIA 353
A+
Sbjct: 208 ALE 210
>gnl|CDD|233452 TIGR01533, lipo_e_P4, 5'-nucleotidase, lipoprotein e(P4) family.
This model represents a set of bacterial lipoproteins
belonging to a larger acid phosphatase family
(pfam03767), which in turn belongs to the haloacid
dehalogenase (HAD) superfamily of aspartate-dependent
hydrolases. Members are found on the outer membrane of
Gram-negative bacteria and the cytoplasmic membrane of
Gram-positive bacteria. Most members have classic
lipoprotein signal sequences. A critical role of this
5'-nucleotidase in Haemophilus influenzae is the
degradation of external riboside in order to allow
transport into the cell. An earlier suggested role in
hemin transport is no longer current. This enzyme may
also have other physiologically significant roles
[Transport and binding proteins, Other, Biosynthesis of
cofactors, prosthetic groups, and carriers, Pyridine
nucleotides].
Length = 266
Score = 39.0 bits (91), Expect = 0.002
Identities = 23/73 (31%), Positives = 32/73 (43%), Gaps = 10/73 (13%)
Query: 109 ENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIGTAYLAA 168
+ + ++GA +N S G KIFYV+N S K + + LK GF E +
Sbjct: 116 QAKPVAGALDFLNYANSKGVKIFYVSNRSEKEKAATLKNLKRFGFPQADEEHL------- 168
Query: 169 QYLKKHLDPKKKA 181
LKK D K
Sbjct: 169 -LLKK--DKSSKE 178
>gnl|CDD|233519 TIGR01668, YqeG_hyp_ppase, HAD superfamily (subfamily IIIA)
phosphatase, TIGR01668. This family of hypothetical
proteins is a member of the IIIA subfamily of the
haloacid dehalogenase (HAD) superfamily of hydrolases.
All characterized members of this subfamily (TIGR01662)
and most characterized members of the HAD superfamily
are phosphatases. HAD superfamily phosphatases contain
active site residues in several conserved catalytic
motifs, all of which are found conserved here. This
family consists of sequences from fungi, plants,
cyanobacteria, gram-positive bacteria and Deinococcus.
There is presently no characterization of any sequence
in this family.
Length = 170
Score = 35.1 bits (81), Expect = 0.029
Identities = 13/31 (41%), Positives = 18/31 (58%)
Query: 311 NLNPERTLMIGDRGNTDIRLGYNNGFQTLLV 341
L E+ ++GDR TD+ G NG T+LV
Sbjct: 105 GLTSEQVAVVGDRLFTDVMGGNRNGSYTILV 135
>gnl|CDD|217719 pfam03767, Acid_phosphat_B, HAD superfamily, subfamily IIIB (Acid
phosphatase). This family of proteins includes acid
phosphatases and a number of vegetative storage
proteins.
Length = 213
Score = 35.0 bits (81), Expect = 0.033
Identities = 16/42 (38%), Positives = 23/42 (54%)
Query: 113 ISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFN 154
+ GA ++ N L LG KIF+V+ S R + LK GF+
Sbjct: 105 LPGALELYNYLVELGVKIFFVSGRSEDLRAATVENLKKAGFH 146
>gnl|CDD|162788 TIGR02254, YjjG/YfnB, HAD superfamily (subfamily IA) hydrolase,
TIGR02254. This family consists of uncharacterized
proteobacterial and gram positive bacterial sequences
including YjjG from E. coli and YfnB from B. subtilis.
This family is a member of the haloacid dehalogenase
(HAD) superfamily of hydrolases which are characterized
by three conserved sequence motifs. By virtue of an
alpha helical domain in-between the first and second
conserved motif, this family is a member of subfamily IA
(TIGR01549). Most likely, these enzymes are
phosphatases.
Length = 224
Score = 34.8 bits (80), Expect = 0.046
Identities = 20/51 (39%), Positives = 25/51 (49%), Gaps = 1/51 (1%)
Query: 291 EPVVIGKPSKLIGSYLIEK-YNLNPERTLMIGDRGNTDIRLGYNNGFQTLL 340
E I KP K I +Y +E+ + E LMIGD DI+ G N G T
Sbjct: 146 EDAGIQKPDKEIFNYALERMPKFSKEEVLMIGDSLTADIKGGQNAGLDTCW 196
>gnl|CDD|237336 PRK13288, PRK13288, pyrophosphatase PpaX; Provisional.
Length = 214
Score = 34.2 bits (79), Expect = 0.071
Identities = 21/75 (28%), Positives = 30/75 (40%), Gaps = 12/75 (16%)
Query: 308 EKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGDTTMEKAIAWSKSEDEEYKSRVA 367
E PE LM+GD + DI G N G +T V AW+ E +
Sbjct: 149 ELLGAKPEEALMVGDNHH-DILAGKNAGTKTAGV-----------AWTIKGREYLEQYKP 196
Query: 368 DYYLSSLGDMLPFLS 382
D+ L + D+L +
Sbjct: 197 DFMLDKMSDLLAIVG 211
>gnl|CDD|180686 PRK06769, PRK06769, hypothetical protein; Validated.
Length = 173
Score = 33.5 bits (77), Expect = 0.082
Identities = 15/37 (40%), Positives = 20/37 (54%), Gaps = 1/37 (2%)
Query: 308 EKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTG 344
EK+ L+ + +IGDR TDI T+LV TG
Sbjct: 104 EKHGLDLTQCAVIGDRW-TDIVAAAKVNATTILVRTG 139
>gnl|CDD|222115 pfam13419, HAD_2, Haloacid dehalogenase-like hydrolase.
Length = 176
Score = 31.5 bits (72), Expect = 0.47
Identities = 14/51 (27%), Positives = 21/51 (41%), Gaps = 1/51 (1%)
Query: 291 EPVVIGKPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLV 341
+ V KP ++E+ L PE L I D D+ G +T+ V
Sbjct: 127 DDVGARKPDPEAYERVLERLGLPPEEILFIDDSPE-DLEAARAAGIKTVHV 176
Score = 28.1 bits (63), Expect = 6.1
Identities = 12/52 (23%), Positives = 23/52 (44%), Gaps = 3/52 (5%)
Query: 111 ELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIG 162
E +++ LK+ G K+ ++N S RE + L+ LG + +
Sbjct: 77 EPFPDVVELLRRLKAKGVKLVILSNGS---REAVERLLEKLGLLDLFDAVFT 125
>gnl|CDD|237310 PRK13222, PRK13222, phosphoglycolate phosphatase; Provisional.
Length = 226
Score = 31.7 bits (73), Expect = 0.49
Identities = 13/37 (35%), Positives = 18/37 (48%), Gaps = 1/37 (2%)
Query: 305 YLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLV 341
EK L+PE L +GD N DI+ G ++ V
Sbjct: 157 LACEKLGLDPEEMLFVGDSRN-DIQAARAAGCPSVGV 192
>gnl|CDD|181865 PRK09449, PRK09449, dUMP phosphatase; Provisional.
Length = 224
Score = 31.4 bits (72), Expect = 0.65
Identities = 20/50 (40%), Positives = 28/50 (56%), Gaps = 3/50 (6%)
Query: 291 EPVVIGKPSKLIGSYLIEKYNLNPERT--LMIGDRGNTDIRLGYNNGFQT 338
E V + KP I Y +E+ NP+R+ LM+GD ++DI G N G T
Sbjct: 144 EQVGVAKPDVAIFDYALEQMG-NPDRSRVLMVGDNLHSDILGGINAGIDT 192
>gnl|CDD|218618 pfam05508, Ran-binding, RanGTP-binding protein. The small Ras-like
GTPase Ran plays an essential role in the transport of
macromolecules in and out of the nucleus and has been
implicated in spindle and nuclear envelope formation
during mitosis in higher eukaryotes. The S. cerevisiae
ORF YGL164c encoding a novel RanGTP-binding protein,
termed Yrb30p was identified. The protein competes with
yeast RanBP1 (Yrb1p) for binding to the GTP-bound form
of yeast Ran (Gsp1p) and is, like Yrb1p, able to form
trimeric complexes with RanGTP and some of the
karyopherins.
Length = 302
Score = 31.1 bits (71), Expect = 0.96
Identities = 13/39 (33%), Positives = 21/39 (53%), Gaps = 1/39 (2%)
Query: 236 SHISFPKLMKAACYLTNPNTLFVATNTDESFPMGPHVTV 274
+ IS +L++A+ YLT NT F + +GP T+
Sbjct: 159 ATISPSRLLQASNYLTKNNTQFGGSPKSPV-QVGPTFTL 196
>gnl|CDD|218442 pfam05116, S6PP, Sucrose-6F-phosphate phosphohydrolase. This
family consists of Sucrose-6F-phosphate phosphohydrolase
proteins found in plants and cyanobacteria.
Sucrose-6(F)-phosphate phosphohydrolase catalyzes the
final step in the pathway of sucrose biosynthesis.
Length = 247
Score = 30.7 bits (70), Expect = 1.1
Identities = 14/26 (53%), Positives = 17/26 (65%), Gaps = 1/26 (3%)
Query: 305 YLIEKYNLNPERTLMIGDRGNTDIRL 330
YL +K+ L PE TL+ GD GN D L
Sbjct: 172 YLAKKWGLPPENTLVCGDSGN-DAEL 196
>gnl|CDD|233512 TIGR01656, Histidinol-ppas, histidinol-phosphate phosphatase family
domain. This domain is found in authentic
histidinol-phosphate phosphatases which are sometimes
found as stand-alone entities and sometimes as fusions
with imidazoleglycerol-phosphate dehydratase
(TIGR01261). Additionally, a family of proteins
including YaeD from E. coli (TIGR00213) and various
other proteins are closely related but may not have the
same substrate specificity. This domain is a member of
the haloacid-dehalogenase (HAD) superfamily of
aspartate-nucleophile hydrolases. This superfamily is
distinguished by the presence of three motifs: an
N-terminal motif containing the nucleophilic aspartate,
a central motif containing a conserved serine or
threonine, and a C-terminal motif containing a conserved
lysine (or arginine) and conserved aspartates. More
specifically, the domain modelled here is a member of
subfamily III of the HAD-superfamily by virtue of
lacking a "capping" domain in either of the two common
positions, between motifs 1 and 2, or between motifs 2
and 3.
Length = 147
Score = 29.7 bits (67), Expect = 1.2
Identities = 14/49 (28%), Positives = 27/49 (55%), Gaps = 3/49 (6%)
Query: 297 KPS-KLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTG 344
KP LI +++ ++ R+L++GDR D++ N G +L++ G
Sbjct: 101 KPKPGLILEA-LKRLGVDASRSLVVGDR-LRDLQAARNAGLAAVLLVDG 147
>gnl|CDD|184834 PRK14821, PRK14821, putative deoxyribonucleotide triphosphate
pyrophosphatase; Provisional.
Length = 184
Score = 29.9 bits (68), Expect = 1.4
Identities = 21/71 (29%), Positives = 31/71 (43%), Gaps = 14/71 (19%)
Query: 129 KIFYVTNNSTKTREQLIVKLKHLGFNAEPNEI------IGT----AYLAAQYLKKHLDPK 178
KI++ T N K E I+ LK LG E +I T A A+++ L+
Sbjct: 2 KIYFATGNKGKVEEAKII-LKPLGIEVEQIKIEYPEIQADTLEEVAAFGAKWVYNKLN-- 58
Query: 179 KKAYIVGSSGI 189
+ IV SG+
Sbjct: 59 -RPVIVEDSGL 68
>gnl|CDD|238285 cd00515, HAM1, NTPase/HAM1. This family consists of the HAM1
protein and pyrophosphate-releasing xanthosine/inosine
triphosphatase. HAM1 protects the cell against
mutagenesis by the base analog 6-N-hydroxylaminopurine
(HAP) in E. coli and S. cerevisiae. A Ham1-related
protein from Methanococcus jannaschii is a novel NTPase
that has been shown to hydrolyze nonstandard nucleotides
such as XTP to XMP and ITP to IMP, but not the standard
nucleotides, in the presence of Mg or Mn ions. The
enzyme exists as a homodimer. The HAM1 protein may be
acting as an NTPase by hydrolyzing the HAP triphosphate.
Length = 183
Score = 29.8 bits (68), Expect = 1.7
Identities = 17/76 (22%), Positives = 27/76 (35%), Gaps = 14/76 (18%)
Query: 130 IFYVTNNSTKTRE------QLIVKLKHLGFNAEPNEIIGT----AYLAAQYLKKHLDPKK 179
I + T N K +E +++ L + E T A L A+ + L
Sbjct: 1 IVFATGNKGKLKEFKEILAPFGIEVVSLKDIIDIEETGSTFEENALLKARAAAEAL---G 57
Query: 180 KAYIVGSSGIA-DELN 194
+ SG+ D LN
Sbjct: 58 LPVLADDSGLCVDALN 73
>gnl|CDD|213834 TIGR03597, GTPase_YqeH, ribosome biogenesis GTPase YqeH. This
family describes YqeH, a member of a larger family of
GTPases involved in ribosome biogenesis. Like YqlF, it
shows a cyclical permutation relative to GTPases EngA
(in which the GTPase domain is duplicated), Era, and
others. Members of this protein family are found in a
relatively small number of bacterial species, including
Bacillus subtilis but not Escherichia coli [Protein
synthesis, Other].
Length = 360
Score = 30.3 bits (69), Expect = 1.8
Identities = 29/137 (21%), Positives = 49/137 (35%), Gaps = 34/137 (24%)
Query: 76 KLINLSEL--SGDKQKDFLNSFDTVLTDCDGVLW------LENELISGADQVMNSLKSLG 127
+L + +E+ DFLN +++ +++ E LI + + G
Sbjct: 36 RLKHYNEIQDVELNDDDFLNLLNSLGDSNALIVYVVDIFDFEGSLIPELKRFVG-----G 90
Query: 128 KKIFYVTN---------NSTKTREQLIVKLKHLGFNAEPNEII--------GTAYLAAQY 170
+ V N N +K +E + + K LG P +II G L
Sbjct: 91 NPVLLVGNKIDLLPKSVNLSKIKEWMKKRAKELGLK--PVDIILVSAKKGNGIDELLD-- 146
Query: 171 LKKHLDPKKKAYIVGSS 187
K KK Y+VG +
Sbjct: 147 KIKKARNKKDVYVVGVT 163
>gnl|CDD|236354 PRK08942, PRK08942, D,D-heptose 1,7-bisphosphate phosphatase;
Validated.
Length = 181
Score = 29.4 bits (67), Expect = 2.0
Identities = 14/49 (28%), Positives = 24/49 (48%), Gaps = 3/49 (6%)
Query: 307 IEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGD--TTMEKAIA 353
E+ N++ + M+GD D++ G +LV TG TT+ + A
Sbjct: 113 AERLNIDLAGSPMVGDSLR-DLQAAAAAGVTPVLVRTGKGVTTLAEGAA 160
>gnl|CDD|130549 TIGR01485, SPP_plant-cyano, sucrose-6F-phosphate phosphohydrolase.
This model describes the sucrose phosphate
phosphohydrolase from plants and cyanobacteria (SPP).
SPP is a member of the Class IIB subfamily (TIGR01484)
of the Haloacid Dehalogenase (HAD) superfamily of
aspartate-nucleophile hydrolases. SPP catalyzes the
final step in the biosynthesis of sucrose, a critically
important molecule for plants. This model is limited to
plants and cyanobacteria. However, a closely related
group of sequences from bacteria and archaea (TIGR*****)
may prove to catalyze the same reaction. If so, the
SPP-subfamily model (TIGR01482) containing both of these
groups should be promoted to equivalog and the two
smaller models retired. Sucrose phosphate synthase
(SPS), the prior step in the biosynthesis of sucrose,
contains a domain which exhibits considerable similarity
to SPP albeit without conservation of the catalytic
residues. The catalytic machinery of the synthase
resides in another domain. It seems likely that the
phosphatase-like domain is involved in substrate
binding, possibly binding both substrates in a
"product-like" orientation prior to ligation by the
synthase catalytic domain.
Length = 249
Score = 29.4 bits (66), Expect = 2.9
Identities = 13/26 (50%), Positives = 18/26 (69%), Gaps = 1/26 (3%)
Query: 305 YLIEKYNLNPERTLMIGDRGNTDIRL 330
YL++K + P +TL+ GD GN DI L
Sbjct: 174 YLLQKLAMEPSQTLVCGDSGN-DIEL 198
>gnl|CDD|200170 TIGR02252, DREG-2, REG-2-like, HAD superfamily (subfamily IA)
hydrolase. This family of proteins includes
uncharacterized sequences from eukaryotes, cyanobacteria
and Leptospira as well as the DREG-2 protein from
Drosophila melanogaster which has been identified as a
rhythmically (diurnally) regulated gene. This family is
a member of the Haloacid Dehalogenase (HAD) superfamily
of aspartate-nucleophile hydrolases. The superfamily is
defined by the presence of three short catalytic motifs.
The subfamilies are defined based on the location and
the observed or predicted fold of a so-called 'capping
domain', or the absence of such a domain. This family is
a member of subfamily 1A in which the cap domain
consists of a predicted alpha helical bundle found in
between the first and second catalytic motifs. A
distinctive feature of this family is a conserved tandem
pair of tryptophan residues in the cap domain. The most
divergent sequences included within the scope of this
model are from plants and have "FW" at this position
instead. Most likely, these sequences, like the vast
majority of HAD sequences, represent phosphatase
enzymes.
Length = 203
Score = 29.2 bits (66), Expect = 3.0
Identities = 14/44 (31%), Positives = 21/44 (47%)
Query: 297 KPSKLIGSYLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLL 340
KP I +E+ ++PE L IGD D + G++ LL
Sbjct: 160 KPDPKIFQEALERAGISPEEALHIGDSLRNDYQGARAAGWRALL 203
>gnl|CDD|225531 COG2984, COG2984, ABC-type uncharacterized transport system,
periplasmic component [General function prediction
only].
Length = 322
Score = 29.2 bits (66), Expect = 3.1
Identities = 15/76 (19%), Positives = 25/76 (32%), Gaps = 8/76 (10%)
Query: 122 SLKSLGKK----IFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIGTAYLAAQYLKKHLDP 177
+LK G K + T Q+ +L P+ I+ A AAQ L
Sbjct: 54 ALKDAGYKNVKIDYQNAQGDLGTAAQIARQLVGDK----PDVIVAIATPAAQALVSATKT 109
Query: 178 KKKAYIVGSSGIADEL 193
+ + + +L
Sbjct: 110 IPVVFAAVTDPVGAKL 125
>gnl|CDD|187677 cd09619, CBM9_like_4, DOMON-like type 9 carbohydrate binding
module. Family 9 carbohydrate-binding modules (CBM9)
play a role in the microbial degradation of cellulose
and hemicellulose (materials found in plants). The
domain has previously been called cellulose-binding
domain. The polysaccharide binding sites of CBMs with
available 3D structure have been found to be either flat
surfaces with interactions formed by predominantly
aromatic residues (tryptophan and tyrosine), or extended
shallow grooves. CBM9 domains found in this
uncharacterized heterogeneous subfamily are often
located at the C-terminus of longer proteins and may
co-occur with various other domains.
Length = 187
Score = 28.9 bits (65), Expect = 3.3
Identities = 22/111 (19%), Positives = 27/111 (24%), Gaps = 27/111 (24%)
Query: 241 PKLMKAACYLTNPNTLFVATNTDESFPMGPHVTVPGTGSMVAAVKTGAQREPVVIGKPSK 300
P K A Y N L N + P G + T P G VA+
Sbjct: 81 PDGNKLATYTANDFQLGFRPNDETGQPEGWNHTTPAPGIRVASTPRY------------- 127
Query: 301 LIGSYLIE------------KYNLNPERTLMIGDRGNTDIRLGYNNGFQTL 339
G Y +E NL + I D R
Sbjct: 128 --GGYTVEAAIPWSTLGITPAANLLLGFDVAINDDDTGGTRDQQIAWNAKD 176
>gnl|CDD|223587 COG0513, SrmB, Superfamily II DNA and RNA helicases [DNA
replication, recombination, and repair / Transcription /
Translation, ribosomal structure and biogenesis].
Length = 513
Score = 29.4 bits (66), Expect = 3.8
Identities = 23/78 (29%), Positives = 33/78 (42%), Gaps = 19/78 (24%)
Query: 166 LAAQ------YLKKHLDPKKKAYIVGSSGIADELNLAGIENFGVGPDVMI--PGRDLKTD 217
LA Q L K+L + A + G I + IE G D+++ PGR L D
Sbjct: 111 LAVQIAEELRKLGKNLGGLRVAVVYGGVSIRKQ-----IEALKRGVDIVVATPGRLL--D 163
Query: 218 H---EKLNLDPHVGAVVV 232
KL+L V +V+
Sbjct: 164 LIKRGKLDLS-GVETLVL 180
>gnl|CDD|232982 TIGR00457, asnS, asparaginyl-tRNA synthetase. In a multiple
sequence alignment of representative asparaginyl-tRNA
synthetases (asnS), archaeal/eukaryotic type
aspartyl-tRNA synthetases (aspS_arch), and bacterial
type aspartyl-tRNA synthetases (aspS_bact), there is a
striking similarity between asnS and aspS_arch in gap
pattern and in sequence, and a striking divergence of
aspS_bact. Consequently, a separate model was built for
each of the three groups. This model, asnS, represents
asparaginyl-tRNA synthetases from the three domains of
life. Some species lack this enzyme and charge tRNA(asn)
by misacylation with Asp, followed by transamidation of
Asp to Asn [Protein synthesis, tRNA aminoacylation].
Length = 453
Score = 29.3 bits (66), Expect = 4.2
Identities = 20/82 (24%), Positives = 35/82 (42%), Gaps = 19/82 (23%)
Query: 211 GRDLKTDHEKLNLDPH-VGAVVVGFDSHISFPKLMKAACYLTNPNTLFVATNTDESFPMG 269
G DL+T+HE+ + + V V ++PK +KA ++ N D
Sbjct: 321 GDDLQTEHERFLAEEYFKPPVFV-----TNYPKDIKA---------FYMKLNDDGKTVAA 366
Query: 270 PHVTVPGTGSMVAAVKTGAQRE 291
+ PG G ++ G++RE
Sbjct: 367 MDLLAPGIGEIIG----GSERE 384
>gnl|CDD|233433 TIGR01482, SPP-subfamily, sucrose-phosphate phosphatase subfamily.
This model includes both the members of the SPP
equivalog model (TIGR01485), encompassing plants and
cyanobacteria, as well as those archaeal sequences which
are the closest relatives (TIGR01487). It remains to be
shown whether these archaeal sequences catalyze the same
reaction as SPP.
Length = 225
Score = 28.6 bits (64), Expect = 4.2
Identities = 20/71 (28%), Positives = 28/71 (39%), Gaps = 1/71 (1%)
Query: 305 YLIEKYNLNPERTLMIGDRGNTDIRLGYNNGFQTLLVLTGDTTMEKAIAWSKSEDEEYKS 364
L EK + P TL+ GD N DI L GF + E A ++S E +
Sbjct: 156 KLKEKLGIKPGETLVCGDSEN-DIDLFEVPGFGVAVANAQPELKEWADYVTESPYGEGGA 214
Query: 365 RVADYYLSSLG 375
L ++G
Sbjct: 215 EAIGEILQAIG 225
>gnl|CDD|137467 PRK09677, PRK09677, putative lipopolysaccharide biosynthesis
O-acetyl transferase WbbJ; Provisional.
Length = 192
Score = 28.3 bits (63), Expect = 5.3
Identities = 32/126 (25%), Positives = 53/126 (42%), Gaps = 16/126 (12%)
Query: 189 IADELNLAGIENFGVGPDVMIPGRDLKTDHEKLNLDPHVGAVVVGFDSHISFPKLMKAAC 248
+ D +++A IE+ +G D +I + TDH + S P L
Sbjct: 74 VNDYVHIACIESITIGRDTLIASKVFITDHNHGSFKH---------SDDFSSPNLPPDMR 124
Query: 249 YLTNPNTLFVATNT--DESFPMGPHVTVPGTGSMVAA--VKTGAQREPVVI-GKPSKLIG 303
L + + + + E+ + P V++ G G +V A V T + E VI G P+K+I
Sbjct: 125 TLES-SAVVIGQRVWIGENVTILPGVSI-GNGCIVGANSVVTKSIPENTVIAGNPAKIIK 182
Query: 304 SYLIEK 309
Y E
Sbjct: 183 KYNHET 188
>gnl|CDD|188367 TIGR03677, rpl7ae, 50S ribosomal protein L7Ae. This model
specifically identifies the archaeal version of the
large ribosomal complex L7 protein. The family is a
narrower version of the pfam01248 model which also
recognizes the L30 protein. Multifunctional RNA-binding
protein that recognizes the K-turn motif in ribosomal
RNA, box H/ACA, box C/D and box C'/D' sRNAs. Interacts
with protein L15e.
Length = 117
Score = 27.4 bits (61), Expect = 6.0
Identities = 23/73 (31%), Positives = 32/73 (43%), Gaps = 14/73 (19%)
Query: 129 KIFYVTNNSTKTREQLIVKLKHLGFNAEPNEII----------GTAYLAAQYLKKHLDPK 178
KI TN TK E+ I KL + + EP EI+ G Y+ Y+K D
Sbjct: 26 KIKKGTNEVTKAVERGIAKLVVIAEDVEPPEIVAHLPALCEEKGIPYI---YVKTKEDLG 82
Query: 179 KKAYI-VGSSGIA 190
A + VG++ A
Sbjct: 83 AAAGLEVGAAAAA 95
>gnl|CDD|233463 TIGR01549, HAD-SF-IA-v1, haloacid dehalogenase superfamily,
subfamily IA, variant 1 with third motif having Dx(3-4)D
or Dx(3-4)E. This model represents part of one
structural subfamily of the Haloacid Dehalogenase (HAD)
superfamily of aspartate-nucleophile hydrolases. The
superfamily is defined by the presence of three short
catalytic motifs. The subfamilies are defined based on
the location and the observed or predicted fold of a
so-called "capping domain", or the absence of such a
domain. Subfamily I consists of sequences in which the
capping domain is found in between the first and second
catalytic motifs. Subfamily II consists of sequences in
which the capping domain is found between the second and
third motifs. Subfamily III sequences have no capping
domain in either of these positions. The Subfamily IA and
IB capping domains are predicted by PSI-PRED to consist
of an alpha helical bundle. Subfamily I encompasses such
a wide region of sequence space (the sequences are
highly divergent) that modelling it with a single
representation is impossible, resulting in an overly
broad description which allows in many unrelated
sequences. Subfamily IA and IB are separated based on an
apparent phylogenetic bifurcation. Subfamily IA is still
too broad to model, but cannot be further subdivided
into large chunks based on phylogenetic trees. Of the
three motifs defining the HAD superfamily, the third has
three variant forms: (1) hhhhsDxxx(x)(D/E), (2)
hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to
a small amino acid and _h_ to a hydrophobic one. All
three of these variants are found in subfamily IA.
Individual models were made based on seeds exhibiting
only one of the variants each. Variant 1 (this model) is
found in the enzymes phosphoglycolate phosphatase
(TIGR01449) and enolase-phosphatase. These three variant
models (see also TIGR01493 and TIGR01509) were created
with the knowledge that there will be overlap among them
- this is by design and serves the purpose of
eliminating the overlap with models of more distantly
related HAD subfamilies caused by an overly broad single
model [Unknown function, Enzymes of unknown
specificity].
Length = 162
Score = 27.7 bits (62), Expect = 6.2
Identities = 14/48 (29%), Positives = 23/48 (47%), Gaps = 3/48 (6%)
Query: 115 GADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIG 162
GA ++ LK G K+ ++N S R Q ++ L+ G I+G
Sbjct: 75 GAADLLPRLKEAGIKLGIISNGS--LRAQKLL-LRKHGLGDYFELILG 119
>gnl|CDD|223494 COG0417, PolB, DNA polymerase elongation subunit (family B) [DNA
replication, recombination, and repair].
Length = 792
Score = 28.5 bits (64), Expect = 6.5
Identities = 29/137 (21%), Positives = 45/137 (32%), Gaps = 20/137 (14%)
Query: 127 GKKIFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIGTAYLAAQYLKKHLDPKKKAYIVGS 186
+ I + + + RE + L+ L F+ E G K D +K I+ S
Sbjct: 133 VEDIGSIHSLFLEHREDVRPPLRVLAFDIETLSEPG----------KFPDGEKDPIIMIS 182
Query: 187 S--GIADELNLAGIENFGVGPDVMI----PGRDLKTDHEKLNLDPHVGAVVVGFDSHI-S 239
L I G G V + + DP V+VG++
Sbjct: 183 YAIEAEGGLIEVFIYTSGEGFSVEVVISEAELLERFVELIREYDPD---VIVGYNGDNFD 239
Query: 240 FPKLMKAACYLTNPNTL 256
+P L + A L P L
Sbjct: 240 WPYLAERAERLGIPLRL 256
>gnl|CDD|232927 TIGR00338, serB, phosphoserine phosphatase SerB. Phosphoserine
phosphatase catalyzes the reaction 3-phospho-serine +
H2O = L-serine + phosphate. It catalyzes the last of
three steps in the biosynthesis of serine from
D-3-phosphoglycerate. Note that this enzyme acts on free
phosphoserine, not on phosphoserine residues of
phosphoproteins [Amino acid biosynthesis, Serine
family].
Length = 219
Score = 28.1 bits (63), Expect = 6.6
Identities = 8/20 (40%), Positives = 13/20 (65%)
Query: 306 LIEKYNLNPERTLMIGDRGN 325
L+ K ++PE T+ +GD N
Sbjct: 160 LLRKEGISPENTVAVGDGAN 179
>gnl|CDD|237857 PRK14902, PRK14902, 16S rRNA methyltransferase B; Provisional.
Length = 444
Score = 28.2 bits (64), Expect = 7.2
Identities = 17/65 (26%), Positives = 28/65 (43%), Gaps = 4/65 (6%)
Query: 107 WLENELIS--GADQVMNSLKSLGK--KIFYVTNNSTKTREQLIVKLKHLGFNAEPNEIIG 162
WL I G ++ L+SL + K N + E+LI KL+ G+ E + +
Sbjct: 152 WLVKRWIDQYGEEKAEKILESLNEPPKASIRVNTLKISVEELIEKLEEEGYEVEESLLSP 211
Query: 163 TAYLA 167
A +
Sbjct: 212 EALVI 216
>gnl|CDD|165316 PHA03017, PHA03017, hypothetical protein; Provisional.
Length = 228
Score = 28.0 bits (62), Expect = 7.2
Identities = 14/54 (25%), Positives = 25/54 (46%)
Query: 103 DGVLWLENELISGADQVMNSLKSLGKKIFYVTNNSTKTREQLIVKLKHLGFNAE 156
DG + N AD N +K K FY+ N++ + E +L++L + +
Sbjct: 47 DGFVIFRNGNGKSADDYNNIIKDKKCKGFYIINDNIQESEDAHFELENLDIDID 100
>gnl|CDD|117710 pfam09155, DUF1940, Domain of unknown function (DUF1940). Members
of this family adopt a secondary structure consisting of
six alpha helices, with four long helices (alpha1,
alpha2, alpha5, alpha6) forming a left-handed, antiparallel
alpha helical bundle. The function of this family of
Archaeal hypothetical proteins has not, as yet, been
defined.
Length = 143
Score = 27.6 bits (61), Expect = 7.9
Identities = 19/69 (27%), Positives = 32/69 (46%), Gaps = 3/69 (4%)
Query: 309 KYNLNPERTLMIGDRGNTDIRLGYNNGFQ-TLLVLTGDTTMEKAIAWSK--SEDEEYKSR 365
K N E+ L + DI+ NNG + +L + + ++ A+ W EDE +K
Sbjct: 62 KLNDFQEKRLNFTEESWYDIKEKMNNGNKWSLYMFLARSHLDLAVYWITDMEEDERFKDF 121
Query: 366 VADYYLSSL 374
V D ++ L
Sbjct: 122 VDDETINYL 130
>gnl|CDD|220999 pfam11144, DUF2920, Protein of unknown function (DUF2920). This
bacterial family of proteins has no known function.
Length = 403
Score = 28.0 bits (63), Expect = 8.5
Identities = 13/44 (29%), Positives = 20/44 (45%), Gaps = 7/44 (15%)
Query: 124 KSLGKKIFYVTNNS-------TKTREQLIVKLKHLGFNAEPNEI 160
+ YV+ +S K +E+L LK LGF+A + I
Sbjct: 288 SNYNPNTIYVSYHSIKDELAPAKDKEELYDILKELGFDATLHLI 331
>gnl|CDD|237575 PRK13977, PRK13977, myosin-cross-reactive antigen; Provisional.
Length = 576
Score = 28.3 bits (64), Expect = 8.7
Identities = 11/20 (55%), Positives = 14/20 (70%), Gaps = 4/20 (20%)
Query: 178 KKKAYIVGSSGIADELNLAG 197
KKAYI+G SG+A +LA
Sbjct: 22 NKKAYIIG-SGLA---SLAA 37
>gnl|CDD|237439 PRK13586, PRK13586,
1-(5-phosphoribosyl)-5-[(5-
phosphoribosylamino)methylideneamino]
imidazole-4-carboxamide isomerase; Provisional.
Length = 232
Score = 27.8 bits (62), Expect = 9.3
Identities = 26/101 (25%), Positives = 41/101 (40%), Gaps = 16/101 (15%)
Query: 45 KLEKLQELQQYFCHKFIALKCIVATSQTTVMKLINLSELSGDKQKDFLNSFDTVLTDCDG 104
+EK + L + + IV T+ ++ E+ ++ L S D D
Sbjct: 84 DIEKAKRLLSLDVNA-LVFSTIVFTNFNLFHDIVR--EIGSNR---VLVSIDY---DNTK 134
Query: 105 -VL---WLEN--ELISGADQVMNSLKSLGKKIFYVTNNSTK 139
VL W E E+I G +V N L+ LG Y++N T
Sbjct: 135 RVLIRGWKEKSMEVIDGIKKV-NELELLGIIFTYISNEGTT 174
>gnl|CDD|171522 PRK12468, flhB, flagellar biosynthesis protein FlhB; Reviewed.
Length = 386
Score = 28.1 bits (62), Expect = 9.9
Identities = 24/93 (25%), Positives = 43/93 (46%), Gaps = 6/93 (6%)
Query: 212 RDLKTDHEKLNLDPHVGAVVVGFDSHISFPKLM----KAACYLTNPNTLFVATNTDESFP 267
+D++ + + DPHV + ++ ++M KA +TNP VA +ES
Sbjct: 225 QDIRDEFKNQEGDPHVKGRIRQQQRAMARRRMMVDVPKADVIVTNPTHYAVALQYNESKM 284
Query: 268 MGPHVTVPGTGSMVAAVKT-GAQ-REPVVIGKP 298
P V G G++ ++ GA+ R P++ P
Sbjct: 285 SAPKVLAKGAGAVALRIRELGAEHRIPLLEAPP 317
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.318 0.135 0.392
Gapped
Lambda K H
0.267 0.0716 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 19,399,883
Number of extensions: 1890011
Number of successful extensions: 1777
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1746
Number of HSP's successfully gapped: 72
Length of query: 383
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 284
Effective length of database: 6,546,556
Effective search space: 1859221904
Effective search space used: 1859221904
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 60 (26.9 bits)