RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy5735
(247 letters)
>gnl|CDD|216155 pfam00856, SET, SET domain. SET domains are protein lysine
methyltransferase enzymes. SET domains appear to be
protein-protein interaction domains. It has been
demonstrated that SET domains mediate interactions with
a family of proteins that display similarity with
dual-specificity phosphatases (dsPTPases). A subset of
SET domains have been called PR domains. These domains
are divergent in sequence from other SET domains, but
also appear to mediate protein-protein interaction. The
SET domain consists of two regions known as SET-N and
SET-C. SET-C forms an unusual and conserved knot-like
structure of probably functional importance.
In addition to SET-N and SET-C, an insert region
(SET-I) and flanking regions of high structural
variability form part of the overall structure.
Length = 113
Score = 41.7 bits (98), Expect = 4e-05
Identities = 15/51 (29%), Positives = 23/51 (45%), Gaps = 4/51 (7%)
Query: 43 GAGIFPTLSMFNHSCEPN----IVRYFRGTMVYVNLCKNFKKGDQICENYG 89
G+ NHSCEPN V G + V ++ K G+++ +YG
Sbjct: 63 ATGLGNVARFINHSCEPNCEVRFVFVNGGDRIVVRALRDIKPGEELTIDYG 113
>gnl|CDD|214614 smart00317, SET, SET (Su(var)3-9, Enhancer-of-zeste, Trithorax)
domain. Putative methyl transferase, based on outlier
plant homologues.
Length = 124
Score = 35.8 bits (83), Expect = 0.005
Identities = 12/55 (21%), Positives = 23/55 (41%), Gaps = 4/55 (7%)
Query: 43 GAGIFPTLSMFNHSCEPN--IVRYFRGTMVYVNLC--KNFKKGDQICENYGPLYS 93
NHSCEPN ++ + + ++ K G+++ +YG Y+
Sbjct: 68 ARRKGNLARFINHSCEPNCELLFVEVNGDDRIVIFALRDIKPGEELTIDYGSDYA 122
>gnl|CDD|225491 COG2940, COG2940, Proteins containing SET domain [General function
prediction only].
Length = 480
Score = 36.7 bits (85), Expect = 0.009
Identities = 16/74 (21%), Positives = 28/74 (37%), Gaps = 8/74 (10%)
Query: 51 SMFNHSCEPNI--VRYFRG---TMVYVNLCKNFKKGDQICENYGPLYSQVRKTERQNTLK 105
NHSC PN + + ++ K G+++ +YGP E + L+
Sbjct: 407 RFINHSCTPNCEASPIEVNGIFKISIYAI-RDIKAGEELTYDYGPSLED--NRELKKLLE 463
Query: 106 SQYWFDCHCIACEH 119
++ C C H
Sbjct: 464 KRWGCACGEDRCSH 477
>gnl|CDD|226687 COG4235, COG4235, Cytochrome c biogenesis factor [Posttranslational
modification, protein turnover, chaperones].
Length = 287
Score = 36.2 bits (84), Expect = 0.010
Identities = 16/61 (26%), Positives = 24/61 (39%), Gaps = 4/61 (6%)
Query: 176 QDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLVPPYRDYILCQRSIQTCFLNLGQ 235
+ +L LA E G Y +A + L+ LL + P R I +RSI
Sbjct: 225 ANIRALSLLAFAAFEQGDYAEAAAAWQMLLDLLPADD--PRRSLI--ERSIARALAQRSA 280
Query: 236 K 236
+
Sbjct: 281 Q 281
>gnl|CDD|129583 TIGR00492, alr, alanine racemase. This enzyme interconverts
L-alanine and D-alanine. Its primary function is to
generate D-alanine for cell wall formation. With
D-alanine-D-alanine ligase, it makes up the D-alanine
branch of the peptidoglycan biosynthetic route. It is a
monomer with one pyridoxal phosphate per subunit. In E.
coli, the ortholog is duplicated so that a second
isozyme, DadX, is present. DadX, a paralog of the
biosynthetic Alr, is induced by D- or L-alanine and is
involved in catabolism [Cell envelope, Biosynthesis and
degradation of murein sacculus and peptidoglycan].
Length = 367
Score = 36.2 bits (84), Expect = 0.012
Identities = 23/97 (23%), Positives = 47/97 (48%), Gaps = 4/97 (4%)
Query: 124 FEEMQAAQDLRFRCETENCHNVVKVATNTTQFMIKCDKCDQFINIFKGLKNLQDTESLF- 182
E++QA ++ + E + +K+ T + +K D+ F+ + LK + E +F
Sbjct: 103 VEQLQALEEALLK-EPKRLKVHLKIDTGMNRLGVKPDEAALFVQKLRQLKKFLELEGIFS 161
Query: 183 RLAN-NYKENGLYEKALEKFTQLM-TLLDENLVPPYR 217
A + + G +K +E+F + L +N+ PP+R
Sbjct: 162 HFATADEPKTGTTQKQIERFNSFLEGLKQQNIEPPFR 198
>gnl|CDD|191825 pfam07719, TPR_2, Tetratricopeptide repeat. This Pfam entry
includes outlying Tetratricopeptide-like repeats (TPR)
that are not matched by pfam00515.
Length = 34
Score = 30.9 bits (71), Expect = 0.045
Identities = 12/33 (36%), Positives = 19/33 (57%)
Query: 179 ESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
E+L+ L Y + G YE+ALE + + + L N
Sbjct: 2 EALYNLGLAYYKLGDYEEALEAYEKALELDPNN 34
>gnl|CDD|222112 pfam13414, TPR_11, TPR repeat.
Length = 69
Score = 30.4 bits (69), Expect = 0.13
Identities = 13/61 (21%), Positives = 27/61 (44%), Gaps = 8/61 (13%)
Query: 176 QDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLVPPYRDYILCQRSIQTCFLNLGQ 235
+ E+L L N + G Y++A+E + + + L +N ++ +L LG+
Sbjct: 1 DNAEALKNLGNALFKLGDYDEAIEAYEKALELDPDNAE------AYYNLAL--AYLKLGK 52
Query: 236 K 236
Sbjct: 53 D 53
>gnl|CDD|221956 pfam13174, TPR_6, Tetratricopeptide repeat.
Length = 33
Score = 28.6 bits (65), Expect = 0.24
Identities = 8/26 (30%), Positives = 16/26 (61%)
Query: 179 ESLFRLANNYKENGLYEKALEKFTQL 204
++L++LA Y + G ++A E +L
Sbjct: 1 DALYKLALAYLKLGDTDEAKEALERL 26
>gnl|CDD|205602 pfam13424, TPR_12, Tetratricopeptide repeat.
Length = 78
Score = 29.3 bits (66), Expect = 0.42
Identities = 10/31 (32%), Positives = 16/31 (51%)
Query: 180 SLFRLANNYKENGLYEKALEKFTQLMTLLDE 210
+L LA + G Y++ALE + + L E
Sbjct: 7 ALNNLALVLRRLGDYDEALELLEKALELARE 37
>gnl|CDD|217657 pfam03648, Glyco_hydro_67N, Glycosyl hydrolase family 67
N-terminus. Alpha-glucuronidases, components of an
ensemble of enzymes central to the recycling of
photosynthetic biomass, remove the alpha-1,2 linked
4-O-methyl glucuronic acid from xylans. This family
represents the N-terminal region of
alpha-glucuronidase. The N-terminal domain forms a
two-layer sandwich, each layer being formed by a beta
sheet of five strands. A further two helices form part
of the interface with the central, catalytic, module
(pfam07488).
Length = 122
Score = 29.6 bits (67), Expect = 0.56
Identities = 11/34 (32%), Positives = 15/34 (44%)
Query: 4 ELEEFIGGLLLHQIQCLQFNCHEVADLVGTGESS 37
EL+ + G+L Q + LVGT E S
Sbjct: 39 ELQRGLKGMLGKTPQVSSEPPESSSILVGTLEES 72
>gnl|CDD|197478 smart00028, TPR, Tetratricopeptide repeats. Repeats present in 4
or more copies in proteins. Contain a minimum of 34
amino acids each and self-associate via a "knobs and
holes" mechanism.
Length = 34
Score = 27.0 bits (61), Expect = 1.1
Identities = 12/33 (36%), Positives = 20/33 (60%)
Query: 179 ESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
E+L+ L N Y + G Y++ALE + + + L N
Sbjct: 2 EALYNLGNAYLKLGDYDEALEYYEKALELDPNN 34
>gnl|CDD|238112 cd00189, TPR, Tetratricopeptide repeat domain; typically contains
34 amino acids
[WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-
X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found
in a variety of organisms including bacteria,
cyanobacteria, yeast, fungi, plants, and humans in
various subcellular locations; involved in a variety of
functions including protein-protein interactions, but
common features in the interaction partners have not
been defined; involved in chaperone, cell-cycle,
transcription, and protein transport complexes; the
number of TPR motifs varies among proteins (1,3-11,13
15,16,19); 5-6 tandem repeats generate a right-handed
helical structure with an amphipathic channel that is
thought to accommodate an alpha-helix of a target
protein; it has been proposed that TPR proteins
preferably interact with WD-40 repeat proteins, but in
many instances several TPR-proteins seem to aggregate to
multi-protein complexes; examples of TPR-proteins
include, Cdc16p, Cdc23p and Cdc27p components of the
cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal
targeting signals, the Tom70p co-receptor for
mitochondrial targeting signals, Ser/Thr phosphatase 5C
and the p110 subunit of O-GlcNAc transferase; three
copies of the repeat are present here.
Length = 100
Score = 27.7 bits (62), Expect = 1.9
Identities = 12/33 (36%), Positives = 20/33 (60%)
Query: 179 ESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
E+L L N Y + G Y++ALE + + + L +N
Sbjct: 1 EALLNLGNLYYKLGDYDEALEYYEKALELDPDN 33
Score = 27.0 bits (60), Expect = 4.4
Identities = 11/36 (30%), Positives = 21/36 (58%)
Query: 176 QDTESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
+ ++ + LA Y + G YE+ALE + + + L +N
Sbjct: 32 DNADAYYNLAAAYYKLGKYEEALEDYEKALELDPDN 67
>gnl|CDD|225504 COG2956, COG2956, Predicted N-acetylglucosaminyl transferase
[Carbohydrate transport and metabolism].
Length = 389
Score = 29.3 bits (66), Expect = 2.2
Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 6/49 (12%)
Query: 163 DQFINIFKGLKNLQDTE------SLFRLANNYKENGLYEKALEKFTQLM 205
D+ I I + L D +L +L +Y GL ++A + F QL+
Sbjct: 86 DRAIRIHQTLLESPDLTFEQRLLALQQLGRDYMAAGLLDRAEDIFNQLV 134
>gnl|CDD|220248 pfam09455, Cas_DxTHG, CRISPR-associated (Cas) DxTHG family. CRISPR
is a term for Clustered Regularly Interspaced Short
Palindromic Repeats. A number of protein families appear
only in association with these repeats and are
designated Cas (CRISPR associated) proteins. The family
describes Cas proteins of about 400 residues that
include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and
associated proteins are thought to be involved in the
evolution of host resistance. The exact molecular
function of this family is currently unknown.
Length = 370
Score = 29.0 bits (65), Expect = 2.7
Identities = 16/52 (30%), Positives = 24/52 (46%), Gaps = 8/52 (15%)
Query: 163 DQFINIFKGL-KNLQDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLV 213
++ I +K K+ + E L +L Y E GLY Q +TL E L+
Sbjct: 268 EKIIERYKKFAKDEESLEDLEKLIEWYLERGLY-------VQALTLAREWLI 312
>gnl|CDD|212673 cd10231, YegD_like, Escherichia coli YegD, a putative chaperone
protein, and related proteins. This bacterial subfamily
includes the uncharacterized Escherichia coli YegD. It
belongs to the heat shock protein 70 (HSP70) family of
chaperones that assist in protein folding and assembly
and can direct incompetent "client" proteins towards
degradation. Typically, HSP70s have a nucleotide-binding
domain (NBD) and a substrate-binding domain (SBD). The
nucleotide sits in a deep cleft formed between the two
lobes of the NBD. The two subdomains of each lobe change
conformation between ATP-bound, ADP-bound, and
nucleotide-free states. ATP binding opens up the
substrate-binding site; substrate-binding increases the
rate of ATP hydrolysis. YegD lacks the SBD. HSP70
chaperone activity is regulated by various
co-chaperones: J-domain proteins and nucleotide exchange
factors (NEFs). Some family members are not chaperones
but instead, function as NEFs for their Hsp70 partners,
other family members function as both chaperones and
NEFs.
Length = 415
Score = 28.7 bits (65), Expect = 3.4
Identities = 13/47 (27%), Positives = 21/47 (44%), Gaps = 10/47 (21%)
Query: 166 INIFKGLKNLQDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENL 212
IN K L++ L LA + E E +L+T+++E L
Sbjct: 263 INFLYTPKTLRE---LRELARDAVEP-------ELLERLITVIEEEL 299
>gnl|CDD|216869 pfam02085, Cytochrom_CIII, Class III cytochrome C family.
Length = 99
Score = 27.1 bits (60), Expect = 3.8
Identities = 8/43 (18%), Positives = 11/43 (25%), Gaps = 7/43 (16%)
Query: 114 CIACEHDWPLFEEMQAAQDLRFRCETENCHNVVKVATNTTQFM 156
C C H ++ C T CH + F
Sbjct: 29 CATCHHKVDGKGKIAK-------CSTAGCHATEDKDKDEKSFY 64
>gnl|CDD|143450 cd07132, ALDH_F3AB, Aldehyde dehydrogenase family 3 members A1, A2,
and B1 and related proteins. NAD(P)+-dependent,
aldehyde dehydrogenase, family 3 members A1 and B1
(ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde
dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3),
and similar sequences are included in this CD. Human
ALDH3A1 is a homodimer with a critical role in cellular
defense against oxidative stress; it catalyzes the
oxidation of various cellular membrane lipid-derived
aldehydes. Corneal crystalline ALDH3A1 protects the
cornea and underlying lens against UV-induced oxidative
stress. Human ALDH3A2, a microsomal homodimer, catalyzes
the oxidation of long-chain aliphatic aldehydes to fatty
acids. Human ALDH3B1 is highly expressed in the kidney
and liver and catalyzes the oxidation of various medium-
and long-chain saturated and unsaturated aliphatic
aldehydes.
Length = 443
Score = 28.3 bits (64), Expect = 4.1
Identities = 8/21 (38%), Positives = 12/21 (57%)
Query: 218 DYILCQRSIQTCFLNLGQKCL 238
DY+LC +Q F+ +K L
Sbjct: 244 DYVLCTPEVQEKFVEALKKTL 264
>gnl|CDD|201277 pfam00515, TPR_1, Tetratricopeptide repeat.
Length = 34
Score = 25.1 bits (56), Expect = 4.6
Identities = 9/25 (36%), Positives = 17/25 (68%)
Query: 179 ESLFRLANNYKENGLYEKALEKFTQ 203
++L+ L N Y + G Y++ALE + +
Sbjct: 2 KALYNLGNAYLKLGKYDEALEYYEK 26
>gnl|CDD|235228 PRK04155, PRK04155, chaperone protein HchA; Provisional.
Length = 287
Score = 27.7 bits (62), Expect = 6.4
Identities = 14/32 (43%), Positives = 15/32 (46%), Gaps = 2/32 (6%)
Query: 191 NGLYEKALEKFTQLMTLLD--ENLVPPYRDYI 220
G YEK KF Q L D NL+ P DY
Sbjct: 118 MGFYEKYKSKFKQPKKLADVVANLLAPDSDYA 149
>gnl|CDD|223533 COG0457, NrfG, FOG: TPR repeat [General function prediction only].
Length = 291
Score = 27.1 bits (58), Expect = 8.5
Identities = 11/53 (20%), Positives = 21/53 (39%)
Query: 173 KNLQDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLVPPYRDYILCQRS 225
+ + E+L L + G YE+ALE + + L ++ + L
Sbjct: 162 ELNELAEALLALGALLEALGRYEEALELLEKALKLNPDDDAEALLNLGLLYLK 214
>gnl|CDD|150830 pfam10216, ChpXY, CO2 hydration protein (ChpXY). This small family
of proteins includes paralogues ChpX and ChpY in
Synechococcus sp. PCC7942 and other cyanobacteria,
associated with distinct NAD(P)H dehydrogenase
complexes. These proteins collectively enable
light-dependent CO2 hydration and CO2 uptake; loss of
both blocks growth at low CO2 concentrations.
Length = 353
Score = 27.3 bits (61), Expect = 9.4
Identities = 16/45 (35%), Positives = 22/45 (48%), Gaps = 7/45 (15%)
Query: 43 GAGIFPTLSM--FNHSCEPNIVRYFRGTM-----VYVNLCKNFKK 80
GAGI PTL M H + Y+RG + V +C +F+K
Sbjct: 267 GAGIPPTLLMQDMYHHLPEYLHEYYRGHCRGEDDLRVQICISFQK 311
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.324 0.139 0.438
Gapped
Lambda K H
0.267 0.0717 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 11,851,985
Number of extensions: 1046816
Number of successful extensions: 1157
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1154
Number of HSP's successfully gapped: 33
Length of query: 247
Length of database: 10,937,602
Length adjustment: 94
Effective length of query: 153
Effective length of database: 6,768,326
Effective search space: 1035553878
Effective search space used: 1035553878
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.0 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (21.6 bits)
S2: 58 (26.1 bits)