RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy5735
         (247 letters)



>gnl|CDD|216155 pfam00856, SET, SET domain.  SET domains are protein lysine
           methyltransferase enzymes. SET domains appear to be
           protein-protein interaction domains. It has been
           demonstrated that SET domains mediate interactions with
           a family of proteins that display similarity with
           dual-specificity phosphatases (dsPTPases). A subset of
           SET domains have been called PR domains. These domains
           are divergent in sequence from other SET domains, but
           also appear to mediate protein-protein interaction. The
           SET domain consists of two regions known as SET-N and
           SET-C. SET-C forms an unusual and conserved knot-like
           structure of probably functional importance.
           Additionally to SET-N and SET-C, an insert region
           (SET-I) and flanking regions of high structural
           variability form part of the overall structure.
          Length = 113

 Score = 41.7 bits (98), Expect = 4e-05
 Identities = 15/51 (29%), Positives = 23/51 (45%), Gaps = 4/51 (7%)

Query: 43  GAGIFPTLSMFNHSCEPN----IVRYFRGTMVYVNLCKNFKKGDQICENYG 89
             G+       NHSCEPN     V    G  + V   ++ K G+++  +YG
Sbjct: 63  ATGLGNVARFINHSCEPNCEVRFVFVNGGDRIVVRALRDIKPGEELTIDYG 113


>gnl|CDD|214614 smart00317, SET, SET (Su(var)3-9, Enhancer-of-zeste, Trithorax)
           domain.  Putative methyl transferase, based on outlier
           plant homologues.
          Length = 124

 Score = 35.8 bits (83), Expect = 0.005
 Identities = 12/55 (21%), Positives = 23/55 (41%), Gaps = 4/55 (7%)

Query: 43  GAGIFPTLSMFNHSCEPN--IVRYFRGTMVYVNLC--KNFKKGDQICENYGPLYS 93
                      NHSCEPN  ++         + +   ++ K G+++  +YG  Y+
Sbjct: 68  ARRKGNLARFINHSCEPNCELLFVEVNGDDRIVIFALRDIKPGEELTIDYGSDYA 122


>gnl|CDD|225491 COG2940, COG2940, Proteins containing SET domain [General function
           prediction only].
          Length = 480

 Score = 36.7 bits (85), Expect = 0.009
 Identities = 16/74 (21%), Positives = 28/74 (37%), Gaps = 8/74 (10%)

Query: 51  SMFNHSCEPNI--VRYFRG---TMVYVNLCKNFKKGDQICENYGPLYSQVRKTERQNTLK 105
              NHSC PN             +    + ++ K G+++  +YGP        E +  L+
Sbjct: 407 RFINHSCTPNCEASPIEVNGIFKISIYAI-RDIKAGEELTYDYGPSLED--NRELKKLLE 463

Query: 106 SQYWFDCHCIACEH 119
            ++   C    C H
Sbjct: 464 KRWGCACGEDRCSH 477


>gnl|CDD|226687 COG4235, COG4235, Cytochrome c biogenesis factor [Posttranslational
           modification, protein turnover, chaperones].
          Length = 287

 Score = 36.2 bits (84), Expect = 0.010
 Identities = 16/61 (26%), Positives = 24/61 (39%), Gaps = 4/61 (6%)

Query: 176 QDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLVPPYRDYILCQRSIQTCFLNLGQ 235
            +  +L  LA    E G Y +A   +  L+ LL  +   P R  I  +RSI         
Sbjct: 225 ANIRALSLLAFAAFEQGDYAEAAAAWQMLLDLLPADD--PRRSLI--ERSIARALAQRSA 280

Query: 236 K 236
           +
Sbjct: 281 Q 281


>gnl|CDD|129583 TIGR00492, alr, alanine racemase.  This enzyme interconverts
           L-alanine and D-alanine. Its primary function is to
           generate D-alanine for cell wall formation. With
           D-alanine-D-alanine ligase, it makes up the D-alanine
           branch of the peptidoglycan biosynthetic route. It is a
           monomer with one pyridoxal phosphate per subunit. In E.
           coli, the ortholog is duplicated so that a second
           isozyme, DadX, is present. DadX, a paralog of the
           biosynthetic Alr, is induced by D- or L-alanine and is
           involved in catabolism [Cell envelope, Biosynthesis and
           degradation of murein sacculus and peptidoglycan].
          Length = 367

 Score = 36.2 bits (84), Expect = 0.012
 Identities = 23/97 (23%), Positives = 47/97 (48%), Gaps = 4/97 (4%)

Query: 124 FEEMQAAQDLRFRCETENCHNVVKVATNTTQFMIKCDKCDQFINIFKGLKNLQDTESLF- 182
            E++QA ++   + E +     +K+ T   +  +K D+   F+   + LK   + E +F 
Sbjct: 103 VEQLQALEEALLK-EPKRLKVHLKIDTGMNRLGVKPDEAALFVQKLRQLKKFLELEGIFS 161

Query: 183 RLAN-NYKENGLYEKALEKFTQLM-TLLDENLVPPYR 217
             A  +  + G  +K +E+F   +  L  +N+ PP+R
Sbjct: 162 HFATADEPKTGTTQKQIERFNSFLEGLKQQNIEPPFR 198


>gnl|CDD|191825 pfam07719, TPR_2, Tetratricopeptide repeat.  This Pfam entry
           includes outlying Tetratricopeptide-like repeats (TPR)
           that are not matched by pfam00515.
          Length = 34

 Score = 30.9 bits (71), Expect = 0.045
 Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 179 ESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
           E+L+ L   Y + G YE+ALE + + + L   N
Sbjct: 2   EALYNLGLAYYKLGDYEEALEAYEKALELDPNN 34


>gnl|CDD|222112 pfam13414, TPR_11, TPR repeat. 
          Length = 69

 Score = 30.4 bits (69), Expect = 0.13
 Identities = 13/61 (21%), Positives = 27/61 (44%), Gaps = 8/61 (13%)

Query: 176 QDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLVPPYRDYILCQRSIQTCFLNLGQ 235
            + E+L  L N   + G Y++A+E + + + L  +N             ++   +L LG+
Sbjct: 1   DNAEALKNLGNALFKLGDYDEAIEAYEKALELDPDNAE------AYYNLAL--AYLKLGK 52

Query: 236 K 236
            
Sbjct: 53  D 53


>gnl|CDD|221956 pfam13174, TPR_6, Tetratricopeptide repeat. 
          Length = 33

 Score = 28.6 bits (65), Expect = 0.24
 Identities = 8/26 (30%), Positives = 16/26 (61%)

Query: 179 ESLFRLANNYKENGLYEKALEKFTQL 204
           ++L++LA  Y + G  ++A E   +L
Sbjct: 1   DALYKLALAYLKLGDTDEAKEALERL 26


>gnl|CDD|205602 pfam13424, TPR_12, Tetratricopeptide repeat. 
          Length = 78

 Score = 29.3 bits (66), Expect = 0.42
 Identities = 10/31 (32%), Positives = 16/31 (51%)

Query: 180 SLFRLANNYKENGLYEKALEKFTQLMTLLDE 210
           +L  LA   +  G Y++ALE   + + L  E
Sbjct: 7   ALNNLALVLRRLGDYDEALELLEKALELARE 37


>gnl|CDD|217657 pfam03648, Glyco_hydro_67N, Glycosyl hydrolase family 67
          N-terminus.  Alpha-glucuronidases, components of an
          ensemble of enzymes central to the recycling of
          photosynthetic biomass, remove the alpha-1,2 linked
          4-O-methyl glucuronic acid from xylans. This family
          represents the N-terminal region of
          alpha-glucuronidase. The N-terminal domain forms a
          two-layer sandwich, each layer being formed by a beta
          sheet of five strands. A further two helices form part
          of the interface with the central, catalytic, module
          (pfam07488).
          Length = 122

 Score = 29.6 bits (67), Expect = 0.56
 Identities = 11/34 (32%), Positives = 15/34 (44%)

Query: 4  ELEEFIGGLLLHQIQCLQFNCHEVADLVGTGESS 37
          EL+  + G+L    Q         + LVGT E S
Sbjct: 39 ELQRGLKGMLGKTPQVSSEPPESSSILVGTLEES 72


>gnl|CDD|197478 smart00028, TPR, Tetratricopeptide repeats.  Repeats present in 4
           or more copies in proteins. Contain a minimum of 34
           amino acids each and self-associate via a "knobs and
           holes" mechanism.
          Length = 34

 Score = 27.0 bits (61), Expect = 1.1
 Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 179 ESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
           E+L+ L N Y + G Y++ALE + + + L   N
Sbjct: 2   EALYNLGNAYLKLGDYDEALEYYEKALELDPNN 34


>gnl|CDD|238112 cd00189, TPR, Tetratricopeptide repeat domain; typically contains
           34 amino acids
           [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-
           X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found
           in a variety of organisms including bacteria,
           cyanobacteria, yeast, fungi, plants, and humans in
           various subcellular locations; involved in a variety of
           functions including protein-protein interactions, but
           common features in the interaction partners have not
           been defined; involved in chaperone, cell-cycle,
           transcription, and protein transport complexes; the
           number of TPR motifs varies among proteins (1,3-11,13
           15,16,19); 5-6 tandem repeats generate a right-handed
           helical structure with an amphipathic channel that is
           thought to accommodate an alpha-helix of a target
           protein; it has been proposed that TPR proteins
           preferably interact with WD-40 repeat proteins, but in
           many instances several TPR-proteins seem to aggregate to
           multi-protein complexes; examples of TPR-proteins
           include, Cdc16p, Cdc23p and Cdc27p components of the
           cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal
           targeting signals, the Tom70p co-receptor for
           mitochondrial targeting signals, Ser/Thr phosphatase 5C
           and the p110 subunit of O-GlcNAc transferase; three
           copies of the repeat are present here.
          Length = 100

 Score = 27.7 bits (62), Expect = 1.9
 Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 179 ESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
           E+L  L N Y + G Y++ALE + + + L  +N
Sbjct: 1   EALLNLGNLYYKLGDYDEALEYYEKALELDPDN 33



 Score = 27.0 bits (60), Expect = 4.4
 Identities = 11/36 (30%), Positives = 21/36 (58%)

Query: 176 QDTESLFRLANNYKENGLYEKALEKFTQLMTLLDEN 211
            + ++ + LA  Y + G YE+ALE + + + L  +N
Sbjct: 32  DNADAYYNLAAAYYKLGKYEEALEDYEKALELDPDN 67


>gnl|CDD|225504 COG2956, COG2956, Predicted N-acetylglucosaminyl transferase
           [Carbohydrate transport and metabolism].
          Length = 389

 Score = 29.3 bits (66), Expect = 2.2
 Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 6/49 (12%)

Query: 163 DQFINIFKGLKNLQDTE------SLFRLANNYKENGLYEKALEKFTQLM 205
           D+ I I + L    D        +L +L  +Y   GL ++A + F QL+
Sbjct: 86  DRAIRIHQTLLESPDLTFEQRLLALQQLGRDYMAAGLLDRAEDIFNQLV 134


>gnl|CDD|220248 pfam09455, Cas_DxTHG, CRISPR-associated (Cas) DxTHG family.  CRISPR
           is a term for Clustered Regularly Interspaced Short
           Palindromic Repeats. A number of protein families appear
           only in association with these repeats and are
           designated Cas (CRISPR associated) proteins. The family
           describes Cas proteins of about 400 residues that
           include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and
           associated proteins are thought to be involved in the
           evolution of host resistance. The exact molecular
           function of this family is currently unknown.
          Length = 370

 Score = 29.0 bits (65), Expect = 2.7
 Identities = 16/52 (30%), Positives = 24/52 (46%), Gaps = 8/52 (15%)

Query: 163 DQFINIFKGL-KNLQDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLV 213
           ++ I  +K   K+ +  E L +L   Y E GLY        Q +TL  E L+
Sbjct: 268 EKIIERYKKFAKDEESLEDLEKLIEWYLERGLY-------VQALTLAREWLI 312


>gnl|CDD|212673 cd10231, YegD_like, Escherichia coli YegD, a putative chaperone
           protein, and related proteins.  This bacterial subfamily
           includes the uncharacterized Escherichia coli YegD. It
           belongs to the heat shock protein 70 (HSP70) family of
           chaperones that assist in protein folding and assembly
           and can direct incompetent "client" proteins towards
           degradation. Typically, HSP70s have a nucleotide-binding
           domain (NBD) and a substrate-binding domain (SBD). The
           nucleotide sits in a deep cleft formed between the two
           lobes of the NBD. The two subdomains of each lobe change
           conformation between ATP-bound, ADP-bound, and
           nucleotide-free states. ATP binding opens up the
           substrate-binding site; substrate-binding increases the
           rate of ATP hydrolysis. YegD lacks the SBD. HSP70
           chaperone activity is regulated by various
           co-chaperones: J-domain proteins and nucleotide exchange
           factors (NEFs). Some family members are not chaperones
           but instead, function as NEFs for their Hsp70 partners,
           other family members function as both chaperones and
           NEFs.
          Length = 415

 Score = 28.7 bits (65), Expect = 3.4
 Identities = 13/47 (27%), Positives = 21/47 (44%), Gaps = 10/47 (21%)

Query: 166 INIFKGLKNLQDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENL 212
           IN     K L++   L  LA +  E        E   +L+T+++E L
Sbjct: 263 INFLYTPKTLRE---LRELARDAVEP-------ELLERLITVIEEEL 299


>gnl|CDD|216869 pfam02085, Cytochrom_CIII, Class III cytochrome C family. 
          Length = 99

 Score = 27.1 bits (60), Expect = 3.8
 Identities = 8/43 (18%), Positives = 11/43 (25%), Gaps = 7/43 (16%)

Query: 114 CIACEHDWPLFEEMQAAQDLRFRCETENCHNVVKVATNTTQFM 156
           C  C H      ++         C T  CH       +   F 
Sbjct: 29  CATCHHKVDGKGKIAK-------CSTAGCHATEDKDKDEKSFY 64


>gnl|CDD|143450 cd07132, ALDH_F3AB, Aldehyde dehydrogenase family 3 members A1, A2,
           and B1 and related proteins.  NAD(P)+-dependent,
           aldehyde dehydrogenase, family 3 members A1 and B1
           (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde
           dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3),
           and similar sequences are included in this CD. Human
           ALDH3A1 is a homodimer with a critical role in cellular
           defense against oxidative stress; it catalyzes the
           oxidation of various cellular membrane lipid-derived
           aldehydes. Corneal crystalline ALDH3A1 protects the
           cornea and underlying lens against UV-induced oxidative
           stress. Human ALDH3A2, a microsomal homodimer, catalyzes
           the oxidation of long-chain aliphatic aldehydes to fatty
           acids. Human ALDH3B1 is highly expressed in the kidney
           and liver and catalyzes the oxidation of various medium-
           and long-chain saturated and unsaturated aliphatic
           aldehydes.
          Length = 443

 Score = 28.3 bits (64), Expect = 4.1
 Identities = 8/21 (38%), Positives = 12/21 (57%)

Query: 218 DYILCQRSIQTCFLNLGQKCL 238
           DY+LC   +Q  F+   +K L
Sbjct: 244 DYVLCTPEVQEKFVEALKKTL 264


>gnl|CDD|201277 pfam00515, TPR_1, Tetratricopeptide repeat. 
          Length = 34

 Score = 25.1 bits (56), Expect = 4.6
 Identities = 9/25 (36%), Positives = 17/25 (68%)

Query: 179 ESLFRLANNYKENGLYEKALEKFTQ 203
           ++L+ L N Y + G Y++ALE + +
Sbjct: 2   KALYNLGNAYLKLGKYDEALEYYEK 26


>gnl|CDD|235228 PRK04155, PRK04155, chaperone protein HchA; Provisional.
          Length = 287

 Score = 27.7 bits (62), Expect = 6.4
 Identities = 14/32 (43%), Positives = 15/32 (46%), Gaps = 2/32 (6%)

Query: 191 NGLYEKALEKFTQLMTLLD--ENLVPPYRDYI 220
            G YEK   KF Q   L D   NL+ P  DY 
Sbjct: 118 MGFYEKYKSKFKQPKKLADVVANLLAPDSDYA 149


>gnl|CDD|223533 COG0457, NrfG, FOG: TPR repeat [General function prediction only].
          Length = 291

 Score = 27.1 bits (58), Expect = 8.5
 Identities = 11/53 (20%), Positives = 21/53 (39%)

Query: 173 KNLQDTESLFRLANNYKENGLYEKALEKFTQLMTLLDENLVPPYRDYILCQRS 225
           +  +  E+L  L    +  G YE+ALE   + + L  ++      +  L    
Sbjct: 162 ELNELAEALLALGALLEALGRYEEALELLEKALKLNPDDDAEALLNLGLLYLK 214


>gnl|CDD|150830 pfam10216, ChpXY, CO2 hydration protein (ChpXY).  This small family
           of proteins includes paralogues ChpX and ChpY in
           Synechococcus sp. PCC7942 and other cyanobacteria,
           associated with distinct NAD(P)H dehydrogenase
           complexes. These proteins collectively enable
           light-dependent CO2 hydration and CO2 uptake; loss of
           both blocks growth at low CO2 concentrations.
          Length = 353

 Score = 27.3 bits (61), Expect = 9.4
 Identities = 16/45 (35%), Positives = 22/45 (48%), Gaps = 7/45 (15%)

Query: 43  GAGIFPTLSM--FNHSCEPNIVRYFRGTM-----VYVNLCKNFKK 80
           GAGI PTL M    H     +  Y+RG       + V +C +F+K
Sbjct: 267 GAGIPPTLLMQDMYHHLPEYLHEYYRGHCRGEDDLRVQICISFQK 311


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.324    0.139    0.438 

Gapped
Lambda     K      H
   0.267   0.0717    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 11,851,985
Number of extensions: 1046816
Number of successful extensions: 1157
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1154
Number of HSP's successfully gapped: 33
Length of query: 247
Length of database: 10,937,602
Length adjustment: 94
Effective length of query: 153
Effective length of database: 6,768,326
Effective search space: 1035553878
Effective search space used: 1035553878
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.0 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (21.6 bits)
S2: 58 (26.1 bits)