RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy7969
(502 letters)
>gnl|CDD|218393 pfam05033, Pre-SET, Pre-SET motif. This protein motif is a zinc
binding motif. It contains 9 conserved cysteines that
coordinate three zinc ions. It is thought that this
region plays a structural role in stabilising SET
domains.
Length = 103
Score = 115 bits (290), Expect = 4e-31
Identities = 51/94 (54%), Positives = 59/94 (62%), Gaps = 2/94 (2%)
Query: 240 PIYVINNVDLSCVPANFTHTNHNIPAEGV-IVNEEPIIWCECVDNCRDSSYC-CGQLNDS 297
PI V+N VDL P NFT+ N IP GV + E ++ C C D C DSS C C QLN
Sbjct: 10 PIPVVNEVDLEGPPPNFTYINEYIPGSGVSDIPNEFLVGCSCKDGCPDSSNCACLQLNGG 69
Query: 298 VTAYDENKRLRIGQGTPIYECNKNCKCNASCPNR 331
AYD+N RLR+ G PIYECN CKC+ SCPNR
Sbjct: 70 GFAYDKNGRLRVEPGPPIYECNSRCKCDPSCPNR 103
>gnl|CDD|214614 smart00317, SET, SET (Su(var)3-9, Enhancer-of-zeste, Trithorax)
domain. Putative methyl transferase, based on outlier
plant homologues.
Length = 124
Score = 113 bits (286), Expect = 3e-30
Identities = 50/130 (38%), Positives = 73/130 (56%), Gaps = 13/130 (10%)
Query: 339 IKLGIYKTYNDCGWGVQTLEDIPKGTYVTEYVGEILTYEAASLRDNQTYLFNLDFNGSTS 398
KL ++K+ GWGV+ EDIPKG ++ EYVGEI+T E A R D +G+ +
Sbjct: 1 NKLEVFKSPG-KGWGVRATEDIPKGEFIGEYVGEIITSEEAEERPKA-----YDTDGAKA 54
Query: 399 FVIDAYFNGSTSFVIDACNFGNISHFINHSCDPNLAVYAAYIQCLDPNLHRLPLFAIRDI 458
F F+ + IDA GN++ FINHSC+PN + + R+ +FA+RDI
Sbjct: 55 FY---LFDIDSDLCIDARRKGNLARFINHSCEPNCELLFVEVN----GDDRIVIFALRDI 107
Query: 459 QKGEQLSFSY 468
+ GE+L+ Y
Sbjct: 108 KPGEELTIDY 117
>gnl|CDD|216155 pfam00856, SET, SET domain. SET domains are protein lysine
methyltransferase enzymes. SET domains appear to be
protein-protein interaction domains. It has been
demonstrated that SET domains mediate interactions with
a family of proteins that display similarity with
dual-specificity phosphatases (dsPTPases). A subset of
SET domains have been called PR domains. These domains
are divergent in sequence from other SET domains, but
also appear to mediate protein-protein interaction. The
SET domain consists of two regions known as SET-N and
SET-C. SET-C forms an unusual and conserved knot-like
structure of probably functional importance.
Additionally to SET-N and SET-C, an insert region
(SET-I) and flanking regions of high structural
variability form part of the overall structure.
Length = 113
Score = 105 bits (264), Expect = 2e-27
Identities = 41/118 (34%), Positives = 58/118 (49%), Gaps = 6/118 (5%)
Query: 351 GWGVQTLEDIPKGTYVTEYVGEILTYEAASLRDNQTYLFNLDFNGSTSFVIDAYFNGSTS 410
G G+ DIPKG + EYVGE++T E A R+ L G S + +
Sbjct: 1 GRGLFATRDIPKGELIIEYVGELITPEEAEERELLYNKEEL--RGLLSDLELFLSRLDSE 58
Query: 411 FVIDACNFGNISHFINHSCDPNLAVYAAYIQCLDPNLHRLPLFAIRDIQKGEQLSFSY 468
+ IDA GN++ FINHSC+PN R+ + A+RDI+ GE+L+ Y
Sbjct: 59 YDIDATGLGNVARFINHSCEPN----CEVRFVFVNGGDRIVVRALRDIKPGEELTIDY 112
>gnl|CDD|225491 COG2940, COG2940, Proteins containing SET domain [General function
prediction only].
Length = 480
Score = 79.1 bits (195), Expect = 9e-16
Identities = 50/230 (21%), Positives = 83/230 (36%), Gaps = 23/230 (10%)
Query: 271 NEEPIIWCECVDNCRDSSYCCGQLNDSVTAYDENKRLRIGQGTPIYECNKNCKCNASCPN 330
+ P C +S +A + + T N N
Sbjct: 273 SLSPNFLEGCSPLLCSAS---------PSAINRISKSEEDSTTSSDFSKSNVSKLKELLN 323
Query: 331 RVIQLGTKIKLGIYKTYNDCGWGVQTLEDIPKGTYVTEYVGEILTYEAASLRDNQTYLFN 390
+ + + G+GV LE I KG ++ EY GEI+ + A R+
Sbjct: 324 SNGCKKRREPN-VVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREENYD--- 379
Query: 391 LDFNGSTSFVIDAYFNGSTSFVIDACNFGNISHFINHSCDPNLAVYAAYIQCLDPNLHRL 450
SF V D+ G+++ FINHSC PN A+ I+ + ++
Sbjct: 380 -LLGNEFSF----GLLEDKDKVRDSQKAGDVARFINHSCTPNC--EASPIE--VNGIFKI 430
Query: 451 PLFAIRDIQKGEQLSFSYYKSVTKEPTRPG-GSNKVKCKCEAKNCRGYLN 499
++AIRDI+ GE+L++ Y S+ + C C C ++
Sbjct: 431 SIYAIRDIKAGEELTYDYGPSLEDNRELKKLLEKRWGCACGEDRCSHTMS 480
>gnl|CDD|128744 smart00468, PreSET, N-terminal to some SET domains. A Cys-rich
putative Zn2+-binding domain that occurs N-terminal to
some SET domains. Function is unknown. Unpublished.
Length = 98
Score = 70.9 bits (174), Expect = 3e-15
Identities = 28/87 (32%), Positives = 44/87 (50%), Gaps = 3/87 (3%)
Query: 240 PIYVINNVDLSCVPANFTHTNHNIPAEGVI--VNEEPIIWCECVDNCRDSSYC-CGQLND 296
P+ ++N VD P +F + + I +GV + P++ C C +C S+ C C + N
Sbjct: 12 PVPLVNEVDEDPPPPDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSSSNKCECARKNG 71
Query: 297 SVTAYDENKRLRIGQGTPIYECNKNCK 323
AY+ N LR+ + IYECN C
Sbjct: 72 GEFAYELNGGLRLKRKPLIYECNSRCS 98
>gnl|CDD|214703 smart00508, PostSET, Cysteine-rich motif following a subset of SET
domains.
Length = 17
Score = 33.9 bits (79), Expect = 0.006
Identities = 9/17 (52%), Positives = 10/17 (58%)
Query: 483 NKVKCKCEAKNCRGYLN 499
K C C A NCRG+L
Sbjct: 1 KKQPCLCGAPNCRGFLG 17
>gnl|CDD|216301 pfam01105, EMP24_GP25L, emp24/gp25L/p24 family/GOLD. Members of
this family are implicated in bringing cargo forward
from the ER and binding to coat proteins by their
cytoplasmic domains. This domain corresponds closely to
the beta-strand rich GOLD domain described in. The GOLD
domain is always found combined with lipid- or
membrane-association domains.
Length = 178
Score = 32.6 bits (75), Expect = 0.23
Identities = 13/54 (24%), Positives = 29/54 (53%), Gaps = 6/54 (11%)
Query: 8 ICVSSSDYSIS------DVEQDTDVQEVEQKPDLNQVEIKLEEPEVKITSIKSE 55
C S+S + S D++ + +++ +K L+ +E +L++ E ++ IK E
Sbjct: 76 FCFSNSFSTFSSKTVSFDIKVGEEAKDIAKKEKLDPLEEELKKLEDQLNDIKRE 129
>gnl|CDD|234503 TIGR04215, choice_anch_A, choice-of-anchor A domain. This domain
may occur as essentially the full length of a protein,
except for an N-terminal sequence and a C-terminal
protein-sorting signal such as PEP-CTERM or LPXTG. Most
often, the putative surface protein is longer and
contains repetitive sequence regions. This is one of
very few domains for which both anchoring domains occur,
and designated choice-of-anchor A domain. The best
characterized member is Bacillus anthracis protein
BA0871, a collagen-binding protein with five CNA-family
protein B-type repeats toward the C-terminus and an
LPXTG cell wall attachment motif.
Length = 249
Score = 31.4 bits (72), Expect = 0.77
Identities = 12/56 (21%), Positives = 19/56 (33%), Gaps = 8/56 (14%)
Query: 377 EAASLRDNQTYLFNLDFNGSTSFVIDA---YFNGSTS-----FVIDACNFGNISHF 424
A LR L L NG+ + + G+ S F +DA + +
Sbjct: 100 AFAELRALSAALAGLAANGTVTVSGNGGGLTLTGTGSSGLNVFNVDAADLFGANEI 155
>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457). This is
a family of uncharacterized proteins.
Length = 449
Score = 30.3 bits (68), Expect = 2.5
Identities = 19/92 (20%), Positives = 29/92 (31%), Gaps = 7/92 (7%)
Query: 64 PEYTPRTRDDKTDREQA---SSNFFKPPRTRDDKTDREQASSNFFKKREWASLNSPPIYD 120
P P +D+ D + S + F + R + S KR SPP
Sbjct: 230 PTSDPEDEEDELDDVEEVIESDDHFFLDLDGERGRRRRKRSPRTSPKR----FRSPPPRK 285
Query: 121 FGKKSPKRSLLGSPSIEPKRSTKLNLGTKLGC 152
+SP+R + P RS +
Sbjct: 286 ARGRSPRRLIRSPPPPGRLRSPPPLHASDSPV 317
>gnl|CDD|235285 PRK04335, PRK04335, cell division protein ZipA; Provisional.
Length = 313
Score = 30.1 bits (68), Expect = 2.6
Identities = 13/83 (15%), Positives = 26/83 (31%)
Query: 1 MSDDDDVICVSSSDYSISDVEQDTDVQEVEQKPDLNQVEIKLEEPEVKITSIKSEYPTIT 60
+ DD + + E+ + + +P +VE ++EEP + + E
Sbjct: 88 LIDDPLFGGELEEEEDKFEQEEAPIPVQAQSQPQPEKVEPQVEEPRDEEVLEEPEPVAAK 147
Query: 61 NISPEYTPRTRDDKTDREQASSN 83
E P + E
Sbjct: 148 VPMAEVQPEEETEIEVDEPEEPK 170
>gnl|CDD|219683 pfam07986, TBCC, Tubulin binding cofactor C. Members of this
family are involved in the folding pathway of tubulins
and form a beta helix structure.
Length = 119
Score = 28.7 bits (65), Expect = 2.7
Identities = 22/76 (28%), Positives = 35/76 (46%), Gaps = 16/76 (21%)
Query: 281 VDNCRDSSYCCGQLNDSVTAYD-EN-------KRLRIGQGTPIYECNKNCKCNASCPNR- 331
+DNC+D + G ++ SV D EN ++LRI +C NC +R
Sbjct: 25 IDNCKDCTIILGPVSGSVFIRDCENCTIVVACRQLRIH------DC-TNCDFYLHTTSRP 77
Query: 332 VIQLGTKIKLGIYKTY 347
+I+ + I+ Y TY
Sbjct: 78 IIEDSSGIRFAPYNTY 93
>gnl|CDD|237224 PRK12842, PRK12842, putative succinate dehydrogenase; Reviewed.
Length = 574
Score = 30.4 bits (69), Expect = 2.7
Identities = 15/34 (44%), Positives = 22/34 (64%), Gaps = 2/34 (5%)
Query: 208 ITSFLY-DKRLI-QMENLKRYEMEINVTTGNAVA 239
+TSF+Y KRL +++L Y VT+GNA+A
Sbjct: 184 LTSFIYVAKRLATHLKDLALYRRGTQVTSGNALA 217
>gnl|CDD|199890 cd02860, E_set_Pullulanase, Early set domain associated with the
catalytic domain of pullulanase (also called dextrinase
and alpha-dextrin endo-1,6-alpha glucosidase). E or
"early" set domains are associated with the catalytic
domain of pullulanase at either the N-terminal or
C-terminal end, and in a few instances at both ends.
Pullulanase is an enzyme with activity similar to that
of isoamylase; it cleaves 1,6-alpha-glucosidic linkages
in pullulan, amylopectin, and glycogen, and in alpha-and
beta-amylase limit-dextrins of amylopectin and glycogen.
The E set domain of pullulanase may be related to the
immunoglobulin and/or fibronectin type III
superfamilies. These domains are associated with
different types of catalytic domains at either the
N-terminal or C-terminal end and may be involved in
homodimeric/tetrameric/dodecameric interactions. Members
of this family include members of the alpha amylase
family, sialidase, galactose oxidase, cellulase,
cellulose, hyaluronate lyase, chitobiase, and chitinase.
This domain is also a member of the CBM48 (Carbohydrate
Binding Module 48) family whose members include
maltooligosyl trehalose synthase, starch branching
enzyme, glycogen branching enzyme, glycogen debranching
enzyme, isoamylase, and the beta subunit of
AMP-activated protein kinase.
Length = 97
Score = 28.3 bits (64), Expect = 3.2
Identities = 19/84 (22%), Positives = 32/84 (38%), Gaps = 15/84 (17%)
Query: 337 TKIKLGIYKTYNDCG-WGVQTLEDIPKGTYVTEYVGEILTYEAASLRDNQTYLFNLDFNG 395
K+KL +Y +D ++ KG + V L + Y + + G
Sbjct: 22 QKVKLLLYDDGDDAKPAKTVPMKREEKGVWSVT-VDGDL--------KGKYYTYEVTVYG 72
Query: 396 STSFVIDAY-----FNGSTSFVID 414
T+ V+D Y NG S ++D
Sbjct: 73 ETNEVVDPYAKAVGVNGKRSVIVD 96
>gnl|CDD|163337 TIGR03586, PseI, pseudaminic acid synthase. Members of this family
are included within the larger pfam03102 (NeuB) family.
NeuB itself (TIGR03569) is involved in the biosynthesis
of neuraminic acid by the condensation of
phosphoenolpyruvate (PEP) with N-Acetyl-D-Mannosamine.
In an analogous reaction, this enzyme, PseI, condenses
PEP with 6-deoxy-beta-L-AltNAc4NAc to generate
pseudaminic acid.
Length = 327
Score = 29.6 bits (67), Expect = 4.0
Identities = 12/34 (35%), Positives = 18/34 (52%), Gaps = 5/34 (14%)
Query: 447 LHRLPLFAIRDIQKGEQLSFSYYKSVTKEPTRPG 480
R L+ ++DI+KGE + +SV RPG
Sbjct: 273 QFRRSLYVVKDIKKGETFTEENVRSV-----RPG 301
>gnl|CDD|180871 PRK07191, flgK, flagellar hook-associated protein FlgK; Validated.
Length = 456
Score = 29.8 bits (67), Expect = 4.1
Identities = 17/67 (25%), Positives = 33/67 (49%), Gaps = 8/67 (11%)
Query: 164 QLEAVNSVQNVDVQEINGHIRNFARNPQLIKTNKA------ELDYLREQLIT--SFLYDK 215
Q +++ ++ V++IN R+ A Q I N++ +L R+ I S L +
Sbjct: 153 QKKSIGQQRDATVKQINSLTRSIADYNQKILKNRSDGNNISDLLDQRDLQIKKLSGLIEV 212
Query: 216 RLIQMEN 222
R++Q E+
Sbjct: 213 RVVQQED 219
>gnl|CDD|147046 pfam04693, DDE_Tnp_2, Archaeal putative transposase ISC1217.
Length = 327
Score = 29.5 bits (66), Expect = 4.2
Identities = 14/50 (28%), Positives = 24/50 (48%), Gaps = 2/50 (4%)
Query: 359 DIPKGTYVTEYVGEILTYEAASL--RDNQTYLFNLDFNGSTSFVIDAYFN 406
+ P G Y+ EY+G + L +D + Y F+ D N + +I + N
Sbjct: 225 EFPPGEYLVEYLGTPIKLLVIDLYKKDGRRYFFSTDLNDTDEDIITTWEN 274
>gnl|CDD|219625 pfam07895, DUF1673, Protein of unknown function (DUF1673). This
family contains hypothetical proteins of unknown
function expressed by two archaeal species.
Length = 204
Score = 28.8 bits (65), Expect = 4.5
Identities = 10/40 (25%), Positives = 16/40 (40%), Gaps = 2/40 (5%)
Query: 207 LITSFLYDKRLIQMENLKRYEMEINVTTGNAVAPIYVINN 246
L+ +L +LI E K+ I +T Y+I
Sbjct: 165 LVLMWLVYFQLIYWE--KKNHKIIYITKEKGTKKSYIIGE 202
>gnl|CDD|165588 PHA03344, PHA03344, US22 family homolog; Provisional.
Length = 672
Score = 29.6 bits (66), Expect = 5.0
Identities = 22/96 (22%), Positives = 39/96 (40%), Gaps = 14/96 (14%)
Query: 42 LEEPEVKITSIKSEYPTITNISPEYTPRTRDDKTDREQASSNF------FKPPRTRDDKT 95
LEE + + P + Y P+ D E+A + F+ PRT D
Sbjct: 544 LEEAKRQFRHPAKGIPVCIVTAESYHPKG-----DPEEAFESLAEGRRNFRCPRTCDAPL 598
Query: 96 DREQASSNFFKKREWASLNSPPIYDFGKKSPKRSLL 131
D E+ F +R + + ++D GK+ +R ++
Sbjct: 599 DEERCRDRDFYQR---MMCAMDMFDGGKEESEREII 631
>gnl|CDD|205676 pfam13498, DUF4122, Domain of unknown function (DUF4122). Based on
Bacteroides thetaiotaomicron gene BT_2607, a putative
uncharacterized protein. As seen in gene expression
experiments
(http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231),
it appears to be upregulated in the presence of
host or vs when in culture.
Length = 220
Score = 28.7 bits (64), Expect = 5.7
Identities = 23/89 (25%), Positives = 35/89 (39%), Gaps = 10/89 (11%)
Query: 21 EQDTDVQEVEQKPDLNQVEIKLEEPEVKITSIKSEYPTITNISPEYTPRTRDDKTDREQA 80
E+D +EVE K L ++ + EE E + +E P + +SP TPR D E
Sbjct: 93 EEDVQEEEVECKLPLEEMRMLKEEQE----ELDAESPEVEAVSPVVTPR------DLENL 142
Query: 81 SSNFFKPPRTRDDKTDREQASSNFFKKRE 109
D+ +A+ RE
Sbjct: 143 GEVLTHLNEAGSDEEKSMRAARTLHSIRE 171
>gnl|CDD|226416 COG3900, COG3900, Predicted periplasmic protein [Function unknown].
Length = 262
Score = 28.7 bits (64), Expect = 6.3
Identities = 15/35 (42%), Positives = 19/35 (54%), Gaps = 2/35 (5%)
Query: 63 SPEYTPRTRDDKTDREQASSNF-FKPPRTRDDKTD 96
P+YT + K+ E SS+F FKPP T K D
Sbjct: 213 EPQYTVVFSNWKSGDEVPSSDFTFKPP-TDAVKVD 246
>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
bacterial type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. This family
represents the SMC protein of most bacteria. The smc
gene is often associated with scpB (TIGR00281) and scpA
genes, where scp stands for segregation and condensation
protein. SMC was shown (in Caulobacter crescentus) to be
induced early in S phase but present and bound to DNA
throughout the cell cycle [Cellular processes, Cell
division, DNA metabolism, Chromosome-associated
proteins].
Length = 1179
Score = 29.3 bits (66), Expect = 7.2
Identities = 17/72 (23%), Positives = 33/72 (45%), Gaps = 5/72 (6%)
Query: 20 VEQDTDVQEVEQKPD-----LNQVEIKLEEPEVKITSIKSEYPTITNISPEYTPRTRDDK 74
E + ++E+E K D L ++E KLEE + ++ S+++E + E R + +
Sbjct: 319 EELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEELEAELEELESRLEELE 378
Query: 75 TDREQASSNFFK 86
E S +
Sbjct: 379 EQLETLRSKVAQ 390
>gnl|CDD|239622 cd03564, ANTH_AP180_CALM, ANTH domain family; composed of adaptor
protein 180 (AP180), clathrin assembly lymphoid myeloid
leukemia protein (CALM) and similar proteins. A set of
proteins previously designated as harboring an ENTH
domain in fact contains a highly similar, yet unique
module referred to as an AP180 N-terminal homology
(ANTH) domain. AP180 and CALM play important roles in
clathrin-mediated endocytosis. AP180 is a brain-specific
clathrin-binding protein which stimulates clathrin
assembly during the recycling of synaptic vesicles. The
ANTH domain is structurally similar to the VHS domain
and is composed of a superhelix of eight alpha helices.
ANTH domains bind both inositol phospholipids and
proteins, and contribute to the nucleation and formation
of clathrin coats on membranes. ANTH-bearing proteins
have recently been shown to function with adaptor
protein-1 and GGA adaptors at the trans-Golgi network,
which suggests that the ANTH domain is a universal
component of the machinery for clathrin-mediated
membrane budding.
Length = 117
Score = 27.6 bits (62), Expect = 7.3
Identities = 9/48 (18%), Positives = 18/48 (37%), Gaps = 3/48 (6%)
Query: 395 GSTSFVIDAYFNGSTSFVIDACNFGNISHFINHSCDPNLAVYAAYIQC 442
G SF+ + ++ NF + S + + + YA Y+
Sbjct: 68 GHPSFLQELLSRRG---WLNLSNFLDKSSSLGYGYSAFIRAYARYLDE 112
>gnl|CDD|181231 PRK08114, PRK08114, cystathionine beta-lyase; Provisional.
Length = 395
Score = 28.5 bits (64), Expect = 10.0
Identities = 12/29 (41%), Positives = 19/29 (65%)
Query: 280 CVDNCRDSSYCCGQLNDSVTAYDENKRLR 308
C + R++SY GQ+ D+ TAY ++ LR
Sbjct: 229 CWEQLRENSYLMGQMVDADTAYMTSRGLR 257
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.315 0.133 0.397
Gapped
Lambda K H
0.267 0.0673 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 24,596,507
Number of extensions: 2325424
Number of successful extensions: 1822
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1806
Number of HSP's successfully gapped: 36
Length of query: 502
Length of database: 10,937,602
Length adjustment: 101
Effective length of query: 401
Effective length of database: 6,457,848
Effective search space: 2589597048
Effective search space used: 2589597048
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 61 (27.4 bits)
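
Note on the length statistics above: the effective lengths and the effective search space in the footer are consistent with the usual BLAST length-adjustment arithmetic (query length minus the adjustment; database length minus the adjustment once per database sequence; search space as the product of the two). The short Python sketch below reproduces the reported numbers from the footer values. The function name effective_search_space is hypothetical and is not part of any BLAST tool; the relation is assumed rather than taken from this report, and the sketch does not attempt to reproduce the per-hit Expect values.

```python
# Values copied from the statistics footer of this report.
QUERY_LENGTH = 502            # "Length of query"
DB_LETTERS = 10_937_602       # "Length of database"
DB_SEQUENCES = 44_354         # "Number of Sequences"
LENGTH_ADJUSTMENT = 101       # "Length adjustment"


def effective_search_space(query_len, db_len, n_seqs, adjustment):
    """Return (effective query length, effective db length, search space).

    Hypothetical helper: assumes the adjustment is subtracted once from
    the query and once per sequence from the database.
    """
    eff_query = query_len - adjustment
    eff_db = db_len - n_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


if __name__ == "__main__":
    eff_query, eff_db, space = effective_search_space(
        QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
    )
    print("Effective length of query:   ", eff_query)
    print("Effective length of database:", f"{eff_db:,}")
    print("Effective search space:      ", space)
```

Running the sketch should print 401, 6,457,848 and 2589597048, matching the "Effective length of query", "Effective length of database" and "Effective search space" lines in the footer.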