RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy17690
(545 letters)
>gnl|CDD|217328 pfam03031, NIF, NLI interacting factor-like phosphatase. This
family contains a number of NLI interacting factor
isoforms and also an N-terminal region of RNA
polymerase II CTD phosphatase and FCP1 serine
phosphatase. This region has been identified as the
minimal phosphatase domain.
Length = 153
Score = 127 bits (321), Expect = 1e-34
Identities = 60/168 (35%), Positives = 89/168 (52%), Gaps = 25/168 (14%)
Query: 122 YTLLLEFRDLLVHPEWTY----------NTGWRFKKRPFVDDFFETLNGSTTDRNNVPLF 171
TL+L+ + LVH + N G KKRP +D+F + L+ +
Sbjct: 1 KTLVLDLDETLVHSSFEPDLPFDFVLNFNHGVYVKKRPGLDEFLQELS---------KYY 51
Query: 172 EVVIFTSESGLSIAPILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDLLNRDLKKVIAV 231
E+VIFT+ S P+L+ LD + KYF +L+R+S F +VK+L LL RDL +V+ V
Sbjct: 52 EIVIFTASSKEYADPVLDKLDPKKKYFKHRLYRESCTF----YVKDLSLLGRDLSRVVIV 107
Query: 232 DWNTHSLSKNRENALIIPRWNGNDDDRTLVDLAVFLRTIAVNGVDDVR 279
D + S +N + IP + G+ DD L+ L FL+ +A VDDVR
Sbjct: 108 DNSPRSFLLQPDNGIPIPPFYGDPDDTELLKLLPFLKELA--KVDDVR 153
Score = 83.4 bits (207), Expect = 5e-19
Identities = 33/84 (39%), Positives = 50/84 (59%), Gaps = 4/84 (4%)
Query: 342 PILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDLLNRDLKKVIAVDWNTHSLSKNRENA 401
P+L+ LD + KYF +L+R+S F +VK+L LL RDL +V+ VD + S +N
Sbjct: 66 PVLDKLDPKKKYFKHRLYRESCTF----YVKDLSLLGRDLSRVVIVDNSPRSFLLQPDNG 121
Query: 402 LIIPRWNGNDDDRTLVDLAVFLRS 425
+ IP + G+ DD L+ L FL+
Sbjct: 122 IPIPPFYGDPDDTELLKLLPFLKE 145
Score = 44.9 bits (107), Expect = 2e-05
Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 19/69 (27%)
Query: 486 YTLLLEFRDLLVHPEWTY----------NTGWRFKKRPFVDDFFETLNGSTTDRNNVPLF 535
TL+L+ + LVH + N G KKRP +D+F + L+ +
Sbjct: 1 KTLVLDLDETLVHSSFEPDLPFDFVLNFNHGVYVKKRPGLDEFLQELS---------KYY 51
Query: 536 EVVIFTSES 544
E+VIFT+ S
Sbjct: 52 EIVIFTASS 60
>gnl|CDD|214729 smart00577, CPDc, catalytic domain of ctd-like phosphatases.
Length = 148
Score = 117 bits (296), Expect = 3e-31
Identities = 56/157 (35%), Positives = 78/157 (49%), Gaps = 27/157 (17%)
Query: 120 PPYTLLLEFRDLLVH------PEWTYNT------------GWRFKKRPFVDDFFETLNGS 161
TL+L+ + LVH EWT G KKRP VD+F + +
Sbjct: 1 KKKTLVLDLDETLVHSTHRSFKEWTNRDFIVPVLIDGHPHGVYVKKRPGVDEFLKRAS-- 58
Query: 162 TTDRNNVPLFEVVIFTSESGLSIAPILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDLL 221
LFE+V+FT+ + P+L+ LD + + Y +LFRD FV G +VK+L LL
Sbjct: 59 -------ELFELVVFTAGLRMYADPVLDLLDPKKYFGYRRLFRDECVFVKGKYVKDLSLL 111
Query: 222 NRDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDR 258
NRDL KVI +D + S + EN + I W G+ DD
Sbjct: 112 NRDLSKVIIIDDSPDSWPFHPENLIPIKPWFGDPDDT 148
Score = 76.1 bits (188), Expect = 2e-16
Identities = 33/73 (45%), Positives = 45/73 (61%)
Query: 342 PILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDLLNRDLKKVIAVDWNTHSLSKNRENA 401
P+L+ LD + + Y +LFRD FV G +VK+L LLNRDL KVI +D + S + EN
Sbjct: 76 PVLDLLDPKKYFGYRRLFRDECVFVKGKYVKDLSLLNRDLSKVIIIDDSPDSWPFHPENL 135
Query: 402 LIIPRWNGNDDDR 414
+ I W G+ DD
Sbjct: 136 IPIKPWFGDPDDT 148
Score = 47.6 bits (114), Expect = 1e-06
Identities = 23/79 (29%), Positives = 32/79 (40%), Gaps = 27/79 (34%)
Query: 484 PPYTLLLEFRDLLVH------PEWTYNT------------GWRFKKRPFVDDFFETLNGS 525
TL+L+ + LVH EWT G KKRP VD+F + +
Sbjct: 1 KKKTLVLDLDETLVHSTHRSFKEWTNRDFIVPVLIDGHPHGVYVKKRPGVDEFLKRAS-- 58
Query: 526 TTDRNNVPLFEVVIFTSES 544
LFE+V+FT+
Sbjct: 59 -------ELFELVVFTAGL 70
>gnl|CDD|233801 TIGR02251, HIF-SF_euk, Dullard-like phosphatase domain. This model
represents the putative phosphatase domain of a family
of eukaryotic proteins including "Dullard", and the NLI
interacting factor (NIF)-like phosphatases. This domain
is a member of the haloacid dehalogenase (HAD)
superfamily by virtue of a conserved set of three
catalytic motifs and a conserved fold as predicted by
PSIPRED. The third motif in this family is distinctive
(hhhhDNxPxxa) and apparently lacking the last aspartate.
This domain is classified as a "Class III" HAD, since
there is no large "cap" domain found between motifs 1
and 2 or motifs 2 and 3. This domain is related to
domains found in FCP1-like phosphatases (TIGR02250), and
together both are detected by the pfam03031.
Length = 162
Score = 85.0 bits (211), Expect = 2e-19
Identities = 44/122 (36%), Positives = 70/122 (57%), Gaps = 9/122 (7%)
Query: 146 KKRPFVDDFFETLNGSTTDRNNVPLFEVVIFTSESGLSIAPILEALDKENKYFYFKLFRD 205
KRP VD+F E ++ +E+VIFT+ P+L+ LD+ K +L+R+
Sbjct: 42 FKRPHVDEFLERVS---------KWYELVIFTASLEEYADPVLDILDRGGKVISRRLYRE 92
Query: 206 STEFVDGHHVKNLDLLNRDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDRTLVDLAV 265
S F +G +VK+L L+ +DL KVI +D + +S S +NA+ I W G+ +D L++L
Sbjct: 93 SCVFTNGKYVKDLSLVGKDLSKVIIIDNSPYSYSLQPDNAIPIKSWFGDPNDTELLNLIP 152
Query: 266 FL 267
FL
Sbjct: 153 FL 154
Score = 65.4 bits (160), Expect = 2e-12
Identities = 32/83 (38%), Positives = 52/83 (62%)
Query: 341 APILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDLLNRDLKKVIAVDWNTHSLSKNREN 400
P+L+ LD+ K +L+R+S F +G +VK+L L+ +DL KVI +D + +S S +N
Sbjct: 72 DPVLDILDRGGKVISRRLYRESCVFTNGKYVKDLSLVGKDLSKVIIIDNSPYSYSLQPDN 131
Query: 401 ALIIPRWNGNDDDRTLVDLAVFL 423
A+ I W G+ +D L++L FL
Sbjct: 132 AIPIKSWFGDPNDTELLNLIPFL 154
Score = 29.6 bits (67), Expect = 2.7
Identities = 12/33 (36%), Positives = 18/33 (54%), Gaps = 9/33 (27%)
Query: 510 KKRPFVDDFFETLNGSTTDRNNVPLFEVVIFTS 542
KRP VD+F E ++ +E+VIFT+
Sbjct: 42 FKRPHVDEFLERVS---------KWYELVIFTA 65
>gnl|CDD|227517 COG5190, FCP1, TFIIF-interacting CTD phosphatases, including
NLI-interacting factor [Transcription].
Length = 390
Score = 71.7 bits (176), Expect = 2e-13
Identities = 48/184 (26%), Positives = 79/184 (42%), Gaps = 12/184 (6%)
Query: 102 EPSREKLLPDPVPFPYYQPPYTLLLEFRDLLVHPEW-TYNTGWRFKKRPFVDDFFETLNG 160
S +K L + + D LV E KRP +D F L+
Sbjct: 208 STSPKKTLVLDLD-ETLVHSSFRYITLLDFLVKVEISLLQHLVYVSKRPELDYFLGKLSK 266
Query: 161 STTDRNNVPLFEVVIFTSESGLSIAPILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDL 220
+ E+V FT+ P+L+ LD +K F +LFR+S G ++K++
Sbjct: 267 ---------IHELVYFTASVKRYADPVLDILDS-DKVFSHRLFRESCVSYLGVYIKDISK 316
Query: 221 LNRDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDRTLVDLAVFLRTIAVNGVDDVRE 280
+ R L KVI +D + S + ENA+ I +W ++ D L++L FL + + DV
Sbjct: 317 IGRSLDKVIIIDNSPASYEFHPENAIPIEKWISDEHDDELLNLLPFLEDLPDRDLKDVSS 376
Query: 281 VMLY 284
++
Sbjct: 377 ILQS 380
Score = 56.7 bits (137), Expect = 1e-08
Identities = 55/232 (23%), Positives = 93/232 (40%), Gaps = 29/232 (12%)
Query: 200 FKLFRDSTEFVDGHHVKNLDLLNRDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDRT 259
F +++ + +L L R L + +D +SK+ + T
Sbjct: 167 FVAKSPFSKYESDKDIVDLPRLERKLSREAGIDTLEPPVSKSTSPKKTLVLDLDE----T 222
Query: 260 LVDLAVFLRTIAVNGVDDVREVMLYYSQFDDPIEAFNQNQIKLRSIAPILEALDKENKYF 319
LV + T+ D + +V +I L + + E YF
Sbjct: 223 LVHSSFRYITLL----DFLVKV-----------------EISLLQHL-VYVSKRPELDYF 260
Query: 320 YFKLFRDSTEFVEALYPPQSIA-PILEALDKENKYFYFKLFRDSTEFVDGHHVKNLDLLN 378
KL E V + A P+L+ LD +K F +LFR+S G ++K++ +
Sbjct: 261 LGKL-SKIHELVYFTASVKRYADPVLDILDS-DKVFSHRLFRESCVSYLGVYIKDISKIG 318
Query: 379 RDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDRTLVDLAVFLRSPPQKD 430
R L KVI +D + S + ENA+ I +W ++ D L++L FL P +D
Sbjct: 319 RSLDKVIIIDNSPASYEFHPENAIPIEKWISDEHDDELLNLLPFLEDLPDRD 370
>gnl|CDD|219133 pfam06682, DUF1183, Protein of unknown function (DUF1183). This
family consists of several eukaryotic proteins of around
360 residues in length. The function of this family is
unknown.
Length = 317
Score = 34.7 bits (80), Expect = 0.11
Identities = 22/68 (32%), Positives = 24/68 (35%), Gaps = 11/68 (16%)
Query: 10 PSSPLSPAPPTLKSS---PLSPSPPPTSSSEDDAKREAQ--WRSMKLGF-TVIGASTGAL 63
S P PP KSS P P P+S R Q W GF T G G
Sbjct: 208 GGSGPGPPPPGFKSSFPPPYGPGAGPSSGYGSGGTRSGQGGWGP---GFWT--GLGAGGA 262
Query: 64 LAYFNGNI 71
L Y G+
Sbjct: 263 LGYLFGSR 270
>gnl|CDD|131304 TIGR02250, FCP1_euk, FCP1-like phosphatase, phosphatase domain.
This model represents the phosphatase domain of the
human RNA polymerase II subunit A C-terminal domain
phosphatase (FCP1) and closely related phosphatases
from eukaryotes including plants, fungi and slime mold.
This domain is a member of the haloacid dehalogenase
(HAD) superfamily by virtue of a conserved set of three
catalytic motifs and a conserved fold as predicted by
PSIPRED. The third motif in this family is distinctive
(hhhhDDppphW). This domain is classified as a "Class
III" HAD, since there is no large "cap" domain found
between motifs 1 and 2 or motifs 2 and 3. This domain is
related to domains found in the human NLI interacting
factor-like phosphatases, and together both are detected
by the pfam03031.
Length = 156
Score = 33.4 bits (77), Expect = 0.12
Identities = 25/109 (22%), Positives = 47/109 (43%), Gaps = 14/109 (12%)
Query: 146 KKRPFVDDFFETLNGSTTDRNNVPLFEVVIFTSESGLSIAPILEALDKENKYFYFK-LFR 204
K RPF+ +F + + L+E+ ++T + I + +D + KYF + + R
Sbjct: 58 KLRPFLHEFLKEAS---------KLYEMHVYTMGTRAYAQAIAKLIDPDGKYFGDRIISR 108
Query: 205 DSTEFVDGHHVKNL-DLLNRDLKKVIAVDWNTHSLSKNRENALIIPRWN 252
D + H K+L L D V+ +D ++ N + I +N
Sbjct: 109 DESG---SPHTKSLLRLFPADESMVVIIDDREDVWPWHKRNLIQIEPYN 154
>gnl|CDD|215533 PLN02983, PLN02983, biotin carboxyl carrier protein of acetyl-CoA
carboxylase.
Length = 274
Score = 32.5 bits (74), Expect = 0.49
Identities = 10/35 (28%), Positives = 14/35 (40%), Gaps = 1/35 (2%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSS 36
+ PA + P AP + P SPPP +
Sbjct: 160 AMPPASPPAAQPAPSAPASS-PPPTPASPPPAKAP 193
Score = 31.7 bits (72), Expect = 0.78
Identities = 14/35 (40%), Positives = 15/35 (42%), Gaps = 4/35 (11%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSS 36
AQ PS+P S PPT P SP P S
Sbjct: 165 SPPAAQPAPSAPASSPPPT----PASPPPAKAPKS 195
Score = 29.0 bits (65), Expect = 6.3
Identities = 15/48 (31%), Positives = 18/48 (37%), Gaps = 8/48 (16%)
Query: 2 PISPAQSIPS--------SPLSPAPPTLKSSPLSPSPPPTSSSEDDAK 41
P PA + P SP S + SPPPT +S AK
Sbjct: 144 PPPPAPVVMMQPPPPHAMPPASPPAAQPAPSAPASSPPPTPASPPPAK 191
>gnl|CDD|221745 pfam12737, Mating_C, C-terminal domain of homeodomain 1. Mating in
fungi is controlled by the loci that determine the
mating type of an individual, and only individuals with
differing mating types can mate. Basidiomycete fungi
have evolved a unique mating system, termed tetrapolar
or bifactorial incompatibility, in which mating type is
determined by two unlinked loci; compatibility at both
loci is required for mating to occur. The multi-allelic
tetrapolar mating system is considered to be a novel
innovation that could have only evolved once, and is
thus unique to the mushroom fungi. This domain is
C-terminal to the homeodomain transcription factor
region.
Length = 418
Score = 32.1 bits (73), Expect = 0.80
Identities = 19/45 (42%), Positives = 22/45 (48%), Gaps = 5/45 (11%)
Query: 4 SPAQSIPSSPLSPAPPTLKSSPLSPSPPPT----SSSEDDAKREA 44
SPA S LSP+P L SP+ SP SSS D + EA
Sbjct: 77 SPALS-SERLLSPSPSVLDLSPVLASPQTGKRRRSSSPSDDEDEA 120
Score = 31.7 bits (72), Expect = 1.2
Identities = 12/46 (26%), Positives = 18/46 (39%), Gaps = 4/46 (8%)
Query: 2 PISPAQSIPSSPLSPAPPTL----KSSPLSPSPPPTSSSEDDAKRE 43
++ + P + + P SPSP SSE +AKR
Sbjct: 356 SPLDFSTLFNQPSPSPMASQSILAPAQPTSPSPVALPSSELEAKRR 401
>gnl|CDD|221040 pfam11235, Med25_SD1, Mediator complex subunit 25 synapsin 1.
The overall function of the full-length Med25 is
efficiently to coordinate the transcriptional
activation of RAR/RXR (retinoic acid receptor/retinoic
X receptor) in higher eukaryotic cells. Human Med25
consists of several domains with different binding
properties, the N-terminal, VWA, domain, this SD1 -
synapsin 1 - domain from residues 229-381, a PTOV(B) or
ACID domain from 395-545, an SD2 domain from residues
564-645 and a C-terminal NR box-containing domain
(646-650) from 646-747. The function of the SD
domains is unclear.
Length = 168
Score = 31.0 bits (69), Expect = 1.0
Identities = 9/30 (30%), Positives = 14/30 (46%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSPP 31
P+ Q + P + PP +P +P PP
Sbjct: 11 PLQSKQPVSLPPAAVLPPQSLPAPQNPLPP 40
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 31.8 bits (72), Expect = 1.2
Identities = 11/32 (34%), Positives = 15/32 (46%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPP 32
+ A +SP P PP + P +P PPP
Sbjct: 2812 LAPAAALPPAASPAGPLPPPTSAQPTAPPPPP 2843
Score = 31.4 bits (71), Expect = 1.5
Identities = 11/37 (29%), Positives = 11/37 (29%)
Query: 4 SPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDA 40
P P SPL P P SPSP
Sbjct: 2607 DPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPP 2643
Score = 31.1 bits (70), Expect = 1.9
Identities = 10/35 (28%), Positives = 15/35 (42%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSS 35
+ P + P+ PL P ++P P PP S
Sbjct: 2815 AAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPS 2849
Score = 30.7 bits (69), Expect = 2.5
Identities = 10/39 (25%), Positives = 18/39 (46%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDA 40
+P + + L+ AP L + P P P P +E++
Sbjct: 290 AAAPPDGVWGAALAGAPLALPAPPDPPPPAPAGDAEEED 328
Score = 30.3 bits (68), Expect = 3.6
Identities = 18/47 (38%), Positives = 23/47 (48%), Gaps = 5/47 (10%)
Query: 2 PISPAQSIPS----SPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREA 44
P PA +PS S PAPP + P +P+ P DDA R+A
Sbjct: 428 PPPPATPLPSAEPGSDDGPAPPP-ERQPPAPATEPAPDDPDDATRKA 473
Score = 29.9 bits (67), Expect = 4.1
Identities = 11/36 (30%), Positives = 16/36 (44%), Gaps = 2/36 (5%)
Query: 2 PISPAQSIPSSPLSPAPPTL--KSSPLSPSPPPTSS 35
P + P + + L +SP P PPPTS+
Sbjct: 2799 PSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSA 2834
Score = 29.5 bits (66), Expect = 6.3
Identities = 7/41 (17%), Positives = 15/41 (36%)
Query: 5 PAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREAQ 45
P + + +S + + P P PP + + + Q
Sbjct: 2882 PVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQ 2922
>gnl|CDD|235540 PRK05641, PRK05641, putative acetyl-CoA carboxylase biotin
carboxyl carrier protein subunit; Validated.
Length = 153
Score = 30.2 bits (68), Expect = 1.4
Identities = 11/38 (28%), Positives = 19/38 (50%), Gaps = 1/38 (2%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSED 38
P + +P P+ PT +P +P+P P S+ E+
Sbjct: 49 QEQVPTPAPAPAPAVPSAPT-PVAPAAPAPAPASAGEN 85
>gnl|CDD|234398 TIGR03921, T7SS_mycosin, type VII secretion-associated serine
protease mycosin. Members of this family are
subtilisin-related serine proteases, found strictly in
the Actinobacteria and associated with type VII
secretion operons. The designation mycosin is used for
members from Mycobacterium [Protein fate, Protein and
peptide secretion and trafficking, Protein fate, Protein
modification and repair].
Length = 350
Score = 31.1 bits (71), Expect = 1.5
Identities = 14/56 (25%), Positives = 19/56 (33%), Gaps = 12/56 (21%)
Query: 10 PSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREAQWRSMKLGFTVIGASTGALLA 65
PL PAP + P++ PP + R A + GA AL
Sbjct: 290 DGRPLRPAPAP--ARPVAAPAPPPPPDDTPRGRVA----------LWGAGLAALAV 333
>gnl|CDD|218549 pfam05308, Mito_fiss_reg, Mitochondrial fission regulator. In
eukaryotes, this family of proteins induces
mitochondrial fission.
Length = 248
Score = 30.5 bits (69), Expect = 1.9
Identities = 13/36 (36%), Positives = 15/36 (41%), Gaps = 1/36 (2%)
Query: 1 SPISPAQSIPSSPLSPAP-PTLKSSPLSPSPPPTSS 35
P S S P SP + P + P P PPP S
Sbjct: 162 VPSSSTTSFPISPPTEEPVLEVPPPPPPPPPPPPPS 197
Score = 29.7 bits (67), Expect = 3.0
Identities = 14/46 (30%), Positives = 19/46 (41%), Gaps = 4/46 (8%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSP--PPTSSSEDDAKREAQ 45
PISP P + P PP P P TS+ + +R+ Q
Sbjct: 171 PISPPTEEPVLEVPPPPPP--PPPPPPPSLQQSTSAIDLIKERKGQ 214
>gnl|CDD|232824 TIGR00099, Cof-subfamily, Cof subfamily of IIB subfamily of
haloacid dehalogenase superfamily. This subfamily of
sequences falls within the Class-IIB subfamily
(TIGR01484) of the Haloacid Dehalogenase superfamily of
aspartate-nucleophile hydrolases. The use of the name
"Cof" as an identifier here is arbitrary and refers to
the E. coli Cof protein. This subfamily is notable for
the large number of recent paralogs in many species.
Listeria, for instance, has 12, Clostridium, Lactococcus
and Streptococcus pneumoniae have 8 each, Enterococcus
and Salmonella have 7 each, and Bacillus subtilis,
Mycoplasma, Staphylococcus and E. coli have 6 each. This
high degree of gene duplication is limited to the gamma
proteobacteria and low-GC gram positive lineages. The
profusion of genes in this subfamily is not coupled with
a high degree of divergence, so it is impossible to
determine an accurate phylogeny at the equivalog level.
Considering the relationship of this subfamily to the
other known members of the HAD-IIB subfamily
(TIGR01484), sucrose and trehalose phosphatases and
phosphomannomutase, it seems a reasonable hypothesis
that these enzymes act on phosphorylated sugars.
Possibly the diversification of genes in this subfamily
represents the diverse sugars and polysaccharides that
various bacteria find in their biological niches. The
members of this subfamily are restricted almost
exclusively to bacteria (one sequence from S. pombe
scores above trusted, while another is between trusted
and noise). It is notable that no archaea are found in
this group, the closest relations to the archaea found
here being two Deinococcus sequences [Unknown function,
Enzymes of unknown specificity].
Length = 256
Score = 30.3 bits (69), Expect = 2.2
Identities = 13/86 (15%), Positives = 25/86 (29%), Gaps = 5/86 (5%)
Query: 308 ILEALDKENKYFYFKLFRDSTEFVEALYPPQSIAPILEALDKENKYFYFKLFRDSTEFVD 367
+ + F L E V+ Y P I IL + E +
Sbjct: 109 ASKNDPEYFTIFKKFLGEPKLEVVDIQYLPDDILKILLLFLDPEDLDLLIEALNKLELEE 168
Query: 368 GHHV-----KNLDLLNRDLKKVIAVD 388
V ++++ + + K A+
Sbjct: 169 NVSVVSSGPYSIEITAKGVSKGSALQ 194
>gnl|CDD|225689 COG3147, DedD, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 226
Score = 30.2 bits (68), Expect = 2.2
Identities = 12/51 (23%), Positives = 19/51 (37%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREAQWRSMKLG 52
P++ P P P + +P P P +E A Q ++LG
Sbjct: 107 PVAAQTPKPVKPPKQPPAGAVPAKPTPKPEPKPVAEPAAAPTGQAFVVQLG 157
>gnl|CDD|192243 pfam09287, CEP1-DNA_bind, CEP-1, DNA binding. Members of this
family of DNA-binding domains are found in the
transcription factor CEP-1. They adopt a beta sandwich
structure, with nine strands in two beta-sheets, in a
Greek-key topology.
Length = 198
Score = 30.0 bits (67), Expect = 2.3
Identities = 21/46 (45%), Positives = 23/46 (50%), Gaps = 7/46 (15%)
Query: 411 DDDRTLVDLAVFLRSPPQKDENGNII-HDEFMDLPIVQQYSKRIWK 455
D R + LAVFL DENGN I H L IV Y +R WK
Sbjct: 149 ADRRKRMCLAVFL-----DDENGNEILHAVIKQLLIV-AYPRRDWK 188
>gnl|CDD|217277 pfam02901, PFL, Pyruvate formate lyase.
Length = 646
Score = 30.7 bits (70), Expect = 2.5
Identities = 22/63 (34%), Positives = 31/63 (49%), Gaps = 12/63 (19%)
Query: 223 RDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDRTLVDLAVFLRTIAVNGVDDV-REV 281
R+L++ +A D+ + R L P++ GNDDDR VD IAV V+ EV
Sbjct: 573 RELEEALAADFEGEE--ELRRLLLDAPKY-GNDDDR--VD------EIAVEVVETFMDEV 621
Query: 282 MLY 284
Y
Sbjct: 622 RKY 624
Score = 30.3 bits (69), Expect = 3.1
Identities = 17/46 (36%), Positives = 27/46 (58%), Gaps = 6/46 (13%)
Query: 379 RDLKKVIAVDWNTHSLSKNRENALIIPRWNGNDDDRTLVD-LAVFL 423
R+L++ +A D+ + R L P++ GNDDDR VD +AV +
Sbjct: 573 RELEEALAADFEGEE--ELRRLLLDAPKY-GNDDDR--VDEIAVEV 613
>gnl|CDD|152960 pfam12526, DUF3729, Protein of unknown function (DUF3729). This
family of proteins is found in viruses. Proteins in
this family are typically between 145 and 1707 amino
acids in length. The family is found in association
with pfam01443, pfam01661, pfam05417, pfam01660,
pfam00978. There is a single completely conserved
residue L that may be functionally important.
Length = 115
Score = 28.9 bits (65), Expect = 2.7
Identities = 10/34 (29%), Positives = 11/34 (32%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTS 34
P PA P P SPL+P P
Sbjct: 65 PPSEPAAPPPDPEPPVPGPAGPPSPLAPPAPARK 98
>gnl|CDD|204405 pfam10146, zf-C4H2, Zinc finger-containing protein. This is a
family of proteins which appears to have a highly
conserved zinc finger domain at the C terminal end,
described as -C-X2-CH-X3-H-X5-C-X2-C-. The structure is
predicted to contain a coiled coil. Members are
annotated as being tumour-associated antigen HCA127 in
humans but this could not be confirmed.
Length = 215
Score = 29.8 bits (67), Expect = 2.9
Identities = 9/41 (21%), Positives = 15/41 (36%)
Query: 12 SPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREAQWRSMKLG 52
+SPA + PL P S+ +A A + +
Sbjct: 119 QKISPATSPVPPVPLPDPPAFPSTLPANAAAAAAAQQQRDV 159
>gnl|CDD|166942 PRK00404, tatB, sec-independent translocase; Provisional.
Length = 141
Score = 29.0 bits (65), Expect = 3.1
Identities = 7/37 (18%), Positives = 13/37 (35%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSE 37
+PA P +PA P ++ + S+
Sbjct: 97 QSPAPAVPTPPPTSTPAVPPAPAAAVPAPAAAPPPSD 133
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 30.5 bits (69), Expect = 3.2
Identities = 12/43 (27%), Positives = 19/43 (44%)
Query: 4 SPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREAQW 46
SP+ S P P P+ P + P P +S+ +R A+
Sbjct: 350 SPSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARA 392
Score = 29.8 bits (67), Expect = 5.3
Identities = 13/40 (32%), Positives = 17/40 (42%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDA 40
P S + S SP+P + S P SP +SSS
Sbjct: 283 GPASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSR 322
Score = 29.0 bits (65), Expect = 8.7
Identities = 15/45 (33%), Positives = 21/45 (46%), Gaps = 1/45 (2%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREAQ 45
SP A++ PSSP + PP+ + SP PP SS +
Sbjct: 178 SPEETARA-PSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPA 221
>gnl|CDD|215496 PLN02918, PLN02918, pyridoxine (pyridoxamine) 5'-phosphate
oxidase.
Length = 544
Score = 30.3 bits (68), Expect = 3.4
Identities = 12/26 (46%), Positives = 14/26 (53%)
Query: 7 QSIPSSPLSPAPPTLKSSPLSPSPPP 32
QS+ P+SP PP S SPSP
Sbjct: 15 QSLLPLPISPPPPHSSSLSSSPSPTQ 40
>gnl|CDD|112348 pfam03525, Meiotic_rec114, Meiotic recombination protein rec114.
Length = 328
Score = 29.6 bits (66), Expect = 4.2
Identities = 5/44 (11%), Positives = 16/44 (36%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREA 44
++ + + P P ++ +S S T+ + + +
Sbjct: 219 QDLNTPSATQTVLARPEPLIVQPLEVSQSLQNTTVCLPNTENQK 262
>gnl|CDD|233367 TIGR01349, PDHac_trf_mito, pyruvate dehydrogenase complex
dihydrolipoamide acetyltransferase, long form. This
model represents one of several closely related clades
of the dihydrolipoamide acetyltransferase subunit of the
pyruvate dehydrogenase complex. It includes sequences
from mitochondria and from alpha and beta branches of
the proteobacteria, as well as from some other bacteria.
Sequences from Gram-positive bacteria are not included.
The non-enzymatic homolog protein X, which serves as an
E3 component binding protein, falls within the clade
phylogenetically but is rejected by its low score
[Energy metabolism, Pyruvate dehydrogenase].
Length = 436
Score = 29.4 bits (66), Expect = 5.3
Identities = 14/36 (38%), Positives = 16/36 (44%)
Query: 2 PISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSE 37
P A + P S P+P K SP SP P S E
Sbjct: 100 PSEIAPTAPPSAPKPSPAPQKQSPEPSSPAPLSDKE 135
>gnl|CDD|221321 pfam11928, DUF3446, Domain of unknown function (DUF3446). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 80 to 99 amino acids in length. This domain is
found associated with pfam00096. This domain has a
single completely conserved residue P that may be
functionally important.
Length = 84
Score = 27.5 bits (61), Expect = 5.3
Identities = 14/39 (35%), Positives = 20/39 (51%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDD 39
S S + S S PLS + + SP+ + PP SS+ D
Sbjct: 42 SSSSSSSSSQSPPLSCSVHQSEPSPIYSAAPPYSSACGD 80
>gnl|CDD|197226 cd09128, PLDc_unchar1_2, Putative catalytic domain, repeat 2, of
uncharacterized phospholipase D-like proteins. Putative
catalytic domain, repeat 2, of uncharacterized
phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD
enzymes hydrolyze phospholipid phosphodiester bonds to
yield phosphatidic acid and a free polar head group.
They can also catalyze transphosphatidylation of
phospholipids to acceptor alcohols. Members of this
subfamily contain two HKD motifs (H-x-K-x(4)-D, where x
represents any amino acid residue) that characterize
the PLD superfamily. The two motifs may be part of the
active site and may be involved in phosphatidyl group
transfer.
Length = 142
Score = 28.4 bits (64), Expect = 5.5
Identities = 12/32 (37%), Positives = 15/32 (46%), Gaps = 10/32 (31%)
Query: 227 KVIAVD----------WNTHSLSKNRENALII 248
K I VD W+ +SL +NRE LI
Sbjct: 94 KGIVVDGKTALVGSENWSANSLDRNREVGLIF 125
Score = 28.4 bits (64), Expect = 5.5
Identities = 12/32 (37%), Positives = 15/32 (46%), Gaps = 10/32 (31%)
Query: 383 KVIAVD----------WNTHSLSKNRENALII 404
K I VD W+ +SL +NRE LI
Sbjct: 94 KGIVVDGKTALVGSENWSANSLDRNREVGLIF 125
>gnl|CDD|225552 COG3007, COG3007, Uncharacterized paraquat-inducible protein B
[Function unknown].
Length = 398
Score = 29.4 bits (66), Expect = 5.7
Identities = 11/37 (29%), Positives = 18/37 (48%)
Query: 425 SPPQKDENGNIIHDEFMDLPIVQQYSKRIWKQMVTYN 461
S Q D+ G + D++ P VQ + +W Q+ N
Sbjct: 320 SKIQLDDEGRLRMDDWELRPDVQDQVRELWDQVTNEN 356
>gnl|CDD|236507 PRK09424, pntA, NAD(P) transhydrogenase subunit alpha; Provisional.
Length = 509
Score = 29.4 bits (67), Expect = 5.9
Identities = 10/52 (19%), Positives = 21/52 (40%), Gaps = 2/52 (3%)
Query: 16 PAPPTLKSSPLSPSPPPTSSSEDDAKREAQWRSMKLGFTVIGASTGALLAYF 67
P PP ++ S + +++++ K+ A K + A+ LA
Sbjct: 369 PPPP-IQVSAAPAAAAAAPAAKEEEKKPASPWR-KYALMALAAALFGWLASV 418
>gnl|CDD|218332 pfam04929, Herpes_DNAp_acc, Herpes DNA replication accessory
factor. Replicative DNA polymerases are capable of
polymerising tens of thousands of nucleotides without
dissociating from their DNA templates. The high
processivity of these polymerases is dependent upon
accessory proteins that bind to the catalytic subunit of
the polymerase or to the substrate. The Epstein-Barr
virus (EBV) BMRF1 protein is an essential component of
the viral DNA polymerase and is absolutely required for
lytic virus replication. BMRF1 is also a transactivator.
This family is predicted to have a UL42 like structure.
Length = 381
Score = 29.3 bits (66), Expect = 6.1
Identities = 14/50 (28%), Positives = 17/50 (34%), Gaps = 7/50 (14%)
Query: 2 PISPAQSIPS------SPLSPAPPTLKSSPLSP-SPPPTSSSEDDAKREA 44
P S+ S SP+PP S S +PPP S S
Sbjct: 299 EPEPTGSVSDRPRHLSSDSSPSPPDTSDSDPSTETPPPASLSHSPPAAFE 348
>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
Length = 1463
Score = 29.4 bits (65), Expect = 7.4
Identities = 16/39 (41%), Positives = 20/39 (51%), Gaps = 3/39 (7%)
Query: 4 SPAQSIPSSPLSPAPPTLKSSPLSPS---PPPTSSSEDD 39
S AQ P++P APP + PLS S P P +DD
Sbjct: 752 SAAQESPANPWPRAPPCDEQEPLSVSPYGPEPDRPPDDD 790
>gnl|CDD|225950 COG3416, COG3416, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 233
Score = 28.7 bits (64), Expect = 7.8
Identities = 10/35 (28%), Positives = 17/35 (48%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSS 35
+P + P APP+ +SSP P+ P+ +
Sbjct: 95 QEPAPPANAPPPKEPAAPPSWRSSPAGPTTQPSPA 129
>gnl|CDD|216325 pfam01140, Gag_MA, Matrix protein (MA), p15. The matrix protein,
p15, is encoded by the gag gene. MA is involved in
pathogenicity.
Length = 129
Score = 27.9 bits (62), Expect = 7.9
Identities = 12/27 (44%), Positives = 13/27 (48%), Gaps = 1/27 (3%)
Query: 10 PSSPLSPAPPTLKSSPLSPSPPPTSSS 36
P L P+ SP SPS PP SS
Sbjct: 102 PPKVLLPSSTPKPVSP-SPSAPPRPSS 127
>gnl|CDD|234184 TIGR03362, VI_chp_7, type VI secretion-associated protein,
VC_A0119 family. This protein family is one of two
related families in type VI secretion systems that
contain an ImpA-related N-terminal domain (pfam06812)
[Protein fate, Protein and peptide secretion and
trafficking, Cellular processes, Pathogenesis].
Length = 301
Score = 28.9 bits (65), Expect = 7.9
Identities = 16/44 (36%), Positives = 23/44 (52%), Gaps = 3/44 (6%)
Query: 5 PAQSIPSSPLS-PAPPTLKSSPLSPSPPPTSSSEDDAKREAQWR 47
++P++P S PAP T ++P P PP S DD+ E R
Sbjct: 8 APAAVPTAPASAPAPATTAAAPQPPEPPA--SVVDDSSSERALR 49
>gnl|CDD|237864 PRK14950, PRK14950, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 585
Score = 29.0 bits (65), Expect = 8.5
Identities = 7/44 (15%), Positives = 19/44 (43%)
Query: 1 SPISPAQSIPSSPLSPAPPTLKSSPLSPSPPPTSSSEDDAKREA 44
+P + ++ ++ + P P +++ P PP + E+
Sbjct: 384 APSTRPKAAAAANIPPKEPVRETATPPPVPPRPVAPPVPHTPES 427
>gnl|CDD|197200 cd00138, PLDc_SF, Catalytic domain of phospholipase D superfamily
proteins. Catalytic domain of phospholipase D (PLD)
superfamily proteins. The PLD superfamily is composed of
a large and diverse group of proteins including plant,
mammalian and bacterial PLDs, bacterial cardiolipin (CL)
synthases, bacterial phosphatidylserine synthases (PSS),
eukaryotic phosphatidylglycerophosphate (PGP) synthase,
eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and
some bacterial endonucleases (Nuc and BfiI), among
others. PLD enzymes hydrolyze phospholipid
phosphodiester bonds to yield phosphatidic acid and a
free polar head group. They can also catalyze the
transphosphatidylation of phospholipids to acceptor
alcohols. The majority of members in this superfamily
contain a short conserved sequence motif (H-x-K-x(4)-D,
where x represents any amino acid residue), called the
HKD signature motif. There are varying expanded forms of
this motif in different family members. Some members
contain variant HKD motifs. Most PLD enzymes are
monomeric proteins with two HKD motif-containing
domains. Two HKD motifs from two domains form a single
active site. Some PLD enzymes have only one copy of the
HKD motif per subunit but form a functionally active
dimer, which has a single active site at the dimer
interface containing the two HKD motifs from both
subunits. Different PLD enzymes may have evolved through
domain fusion of a common catalytic core with separate
substrate recognition domains. Despite their various
catalytic functions and a very broad range of substrate
specificities, the diverse group of PLD enzymes can bind
to a phosphodiester moiety. Most of them are active as
bi-lobed monomers or dimers, and may possess similar
core structures for catalytic activity. They are
generally thought to utilize a common two-step ping-pong
catalytic mechanism, involving an enzyme-substrate
intermediate, to cleave phosphodiester bonds. The two
histidine residues from the two HKD motifs play key
roles in the catalysis. Upon substrate binding, a
histidine from one HKD motif could function as the
nucleophile, attacking the phosphodiester bond to create
a covalent phosphohistidine intermediate, while the
other histidine residue from the second HKD motif could
serve as a general acid, stabilizing the leaving group.
Length = 119
Score = 27.5 bits (61), Expect = 9.8
Identities = 19/122 (15%), Positives = 38/122 (31%), Gaps = 26/122 (21%)
Query: 306 APILEALDKENKY----FYFKLFRDSTEFVEALY--------------PPQSIAPILEA- 346
+LE L + F + ++AL P + A L A
Sbjct: 1 EALLELLKNAKESIFIATPNFSFNSADRLLKALLAAAERGVDVRLIIDKPPNAAGSLSAA 60
Query: 347 --LDKENKYFYFKLFRDSTEFVDGHHVKNLDLLNRDLKKVIA--VDWNTHSLSKNRENAL 402
+ + F + H K ++ D + + +T S ++NRE +
Sbjct: 61 LLEALLRAGVNVRSYVTPPHFFERLHAK---VVVIDGEVAYVGSANLSTASAAQNREAGV 117
Query: 403 II 404
++
Sbjct: 118 LV 119
Score = 27.5 bits (61), Expect = 10.0
Identities = 9/62 (14%), Positives = 22/62 (35%), Gaps = 5/62 (8%)
Query: 189 EALDKENKYFYFKLFRDSTEFVDGHHVKNLDLLNRDLKKVIA--VDWNTHSLSKNRENAL 246
+ + F + H K ++ D + + +T S ++NRE +
Sbjct: 61 LLEALLRAGVNVRSYVTPPHFFERLHAK---VVVIDGEVAYVGSANLSTASAAQNREAGV 117
Query: 247 II 248
++
Sbjct: 118 LV 119
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.320 0.138 0.422
Gapped
Lambda K H
0.267 0.0774 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 29,362,409
Number of extensions: 2980789
Number of successful extensions: 3809
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3731
Number of HSP's successfully gapped: 95
Length of query: 545
Length of database: 10,937,602
Length adjustment: 102
Effective length of query: 443
Effective length of database: 6,413,494
Effective search space: 2841177842
Effective search space used: 2841177842
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 61 (27.2 bits)