RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy7578
(280 letters)
>gnl|CDD|238037 cd00084, HMG-box, High Mobility Group (HMG)-box is found in a
variety of eukaryotic chromosomal proteins and
transcription factors. HMGs bind to the minor groove of
DNA and have been classified by DNA binding preferences.
Two phylogenetically distinct groups of Class I proteins
bind DNA in a sequence specific fashion and contain a
single HMG box. One group (SOX-TCF) includes
transcription factors, TCF-1, -3, -4; and also SRY and
LEF-1, which bind four-way DNA junctions and duplex DNA
targets. The second group (MATA) includes fungal mating
type gene products MC, MATA1 and Ste11. Class II and III
proteins (HMGB-UBF) bind DNA in a non-sequence specific
fashion and contain two or more tandem HMG boxes. Class
II members include non-histone chromosomal proteins,
HMG1 and HMG2, which bind to bent or distorted DNA such
as four-way DNA junctions, synthetic DNA cruciforms,
kinked cisplatin-modified DNA, DNA bulges, cross-overs
in supercoiled DNA, and can cause looping of linear DNA.
Class III members include nucleolar and mitochondrial
transcription factors, UBF and mtTF1, which bind
four-way DNA junctions.
Length = 66
Score = 57.2 bits (139), Expect = 4e-11
Identities = 17/60 (28%), Positives = 38/60 (63%)
Query: 78 TAYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKYTQKM 137
+AY L++++ R ++ NP + ++SK LGE+W ++ EK ++ +A++ +Y ++M
Sbjct: 6 SAYFLFSQEHRAEVKAENPGLSVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM 65
>gnl|CDD|197700 smart00398, HMG, high mobility group.
Length = 70
Score = 56.6 bits (137), Expect = 8e-11
Identities = 19/62 (30%), Positives = 40/62 (64%)
Query: 78 TAYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKYTQKM 137
+A+ML++++ R K+ NP++ +++SKKLGE W + EK ++ +A + +Y ++M
Sbjct: 7 SAFMLFSQENRAKIKAENPDLSNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEM 66
Query: 138 SK 139
+
Sbjct: 67 PE 68
>gnl|CDD|238686 cd01390, HMGB-UBF_HMG-box, HMGB-UBF_HMG-box, class II and III
members of the HMG-box superfamily of DNA-binding
proteins. These proteins bind the minor groove of DNA in
a non-sequence specific fashion and contain two or more
tandem HMG boxes. Class II members include non-histone
chromosomal proteins, HMG1 and HMG2, which bind to bent
or distorted DNA such as four-way DNA junctions,
synthetic DNA cruciforms, kinked cisplatin-modified DNA,
DNA bulges, cross-overs in supercoiled DNA, and can
cause looping of linear DNA. Class III members include
nucleolar and mitochondrial transcription factors, UBF
and mtTF1, which bind four-way DNA junctions.
Length = 66
Score = 56.1 bits (136), Expect = 1e-10
Identities = 20/61 (32%), Positives = 38/61 (62%)
Query: 78 TAYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKYTQKM 137
+AY L++++ R KL K NP+ ++V+K LGE W + EK ++ +A++ +Y ++M
Sbjct: 6 SAYFLFSQEQRPKLKKENPDASVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEM 65
Query: 138 S 138
Sbjct: 66 K 66
>gnl|CDD|189580 pfam00505, HMG_box, HMG (high mobility group) box.
Length = 69
Score = 54.2 bits (131), Expect = 5e-10
Identities = 18/61 (29%), Positives = 36/61 (59%)
Query: 78 TAYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKYTQKM 137
+A+ L++++ R KL NP + +++SK LGE W + EK ++ +A++ A+Y +
Sbjct: 6 SAFFLFSQEQRAKLKAENPGLKNAEISKILGEKWKNLSEEEKKPYEEKAEKEKARYEKAY 65
Query: 138 S 138
Sbjct: 66 P 66
>gnl|CDD|238684 cd01388, SOX-TCF_HMG-box, SOX-TCF_HMG-box, class I member of the
HMG-box superfamily of DNA-binding proteins. These
proteins contain a single HMG box, and bind the minor
groove of DNA in a highly sequence-specific manner.
Members include SRY and its homologs in insects and
vertebrates, and transcription factor-like proteins,
TCF-1, -3, -4, and LEF-1. They appear to bind the minor
groove of the A/T C A A A G/C-motif.
Length = 72
Score = 47.7 bits (114), Expect = 1e-07
Identities = 16/58 (27%), Positives = 33/58 (56%)
Query: 79 AYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKYTQK 136
A+ML++K+ R+K+++ P + +SK LG+ W + EK + +A +L + +
Sbjct: 8 AFMLFSKRHRRKVLQEYPLKENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKL 65
>gnl|CDD|185511 PTZ00199, PTZ00199, high mobility group protein; Provisional.
Length = 94
Score = 42.1 bits (99), Expect = 2e-05
Identities = 21/62 (33%), Positives = 36/62 (58%), Gaps = 2/62 (3%)
Query: 74 KARFTAYMLWAKQIRQKLIKSNPEM--DFSQVSKKLGELWHTVPFNEKYGWKRQADRLAA 131
K +AYM +AK+ R ++I NPE+ D + V K +GE W+ + EK ++++A
Sbjct: 24 KRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKV 83
Query: 132 KY 133
+Y
Sbjct: 84 RY 85
>gnl|CDD|227935 COG5648, NHP6B, Chromatin-associated proteins containing the HMG
domain [Chromatin structure and dynamics].
Length = 211
Score = 43.3 bits (102), Expect = 4e-05
Identities = 30/114 (26%), Positives = 48/114 (42%), Gaps = 10/114 (8%)
Query: 74 KARFTAYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKY 133
K +AY L++ + R ++ K NP++ F +V K L E W + EK + ++A+ +Y
Sbjct: 72 KRPLSAYFLYSAENRDEIRKENPKLTFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERY 131
Query: 134 TQKMSKAPAQKTKSTYTPHGRVGRPPLNKQTVEAVIETKPSPPAAPRVPLVKPT 187
Q+ K Y P E I K P +P LV+ T
Sbjct: 132 ---------QREKEEYNKKLPNKAPIGPFIENEPKIRPKVEGP-SPDKALVEET 175
>gnl|CDD|204115 pfam09011, DUF1898, Domain of unknown function (DUF1898). This
domain is predominantly found in Maelstrom homolog
proteins. It has no known function.
Length = 69
Score = 39.3 bits (92), Expect = 9e-05
Identities = 16/65 (24%), Positives = 33/65 (50%), Gaps = 1/65 (1%)
Query: 74 KARFTAYMLWAKQIRQKLIKSNPEM-DFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAK 132
KA+ AY + + +R +L + P++ ++ SK E W + EK ++ +A +
Sbjct: 5 KAKRNAYFFFVQTMRPELKREGPQVPGVAEFSKLCSEKWKAMSEEEKEKYEEKAREDKKR 64
Query: 133 YTQKM 137
Y ++M
Sbjct: 65 YDREM 69
>gnl|CDD|238685 cd01389, MATA_HMG-box, MATA_HMG-box, class I member of the HMG-box
superfamily of DNA-binding proteins. These proteins
contain a single HMG box, and bind the minor groove of
DNA in a highly sequence-specific manner. Members
include the fungal mating type gene products MC, MATA1
and Ste11.
Length = 77
Score = 36.5 bits (85), Expect = 0.001
Identities = 11/58 (18%), Positives = 30/58 (51%)
Query: 79 AYMLWAKQIRQKLIKSNPEMDFSQVSKKLGELWHTVPFNEKYGWKRQADRLAAKYTQK 136
A++L+ + +L NP + +++S+ +G +W + K +K A+ ++ ++
Sbjct: 8 AFILYRQDKHAQLKTENPGLTNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHARE 65
>gnl|CDD|220232 pfam09421, FRQ, Frequency clock protein. The frequency clock
protein is the central component of the frq-based
circadian negative feedback loop and regulates various
aspects of the circadian clock in Neurospora crassa.
This protein has been shown to interact with itself via
a coiled-coil.
Length = 989
Score = 31.5 bits (71), Expect = 0.48
Identities = 22/75 (29%), Positives = 29/75 (38%)
Query: 108 GELWHTVPFNEKYGWKRQADRLAAKYTQKMSKAPAQKTKSTYTPHGRVGRPPLNKQTVEA 167
G V EK K RL +T K+S ++T+ST TP PP+ Q E
Sbjct: 282 GLYPRHVVMTEKEKKKLVVRRLEQIFTGKISGRNVRRTQSTLTPSVDAALPPVQAQQQEG 341
Query: 168 VIETKPSPPAAPRVP 182
P PP+
Sbjct: 342 TQMAPPQPPSNFITN 356
>gnl|CDD|237803 PRK14724, PRK14724, DNA topoisomerase III; Provisional.
Length = 987
Score = 30.3 bits (68), Expect = 1.3
Identities = 28/139 (20%), Positives = 48/139 (34%), Gaps = 18/139 (12%)
Query: 115 PFNEKYGWKRQADRLAAKYTQKMSKAPAQKT--------KSTYTPHGRVGRPPLNKQTVE 166
F W ++A ++ ++ + SK P +KT + + V K+
Sbjct: 829 AFKAFLAWDKEAGKVNFEFEPRESKFPPRKTAAAKAGAASAAFGGTVAVKAAKPAKKAAA 888
Query: 167 AVIETKPSPPAAPRVPLVKPTLP--------ADLFKVTGTQPLDIAAHLRLLGDNLTIIG 218
+ K + PR K P A L V G +P+ ++ L D I
Sbjct: 889 KKVAAKTAAAKTPRKAAKKKAAPPAAGLKPSAALAAVIGAEPVARPEVIKKLWD--YIKA 946
Query: 219 ERLKDTQGRMAISGGMSLL 237
L+D + AI+ L
Sbjct: 947 NNLQDPADKRAINADAKLR 965
>gnl|CDD|240289 PTZ00144, PTZ00144, dihydrolipoamide succinyltransferase;
Provisional.
Length = 418
Score = 29.7 bits (67), Expect = 1.8
Identities = 25/106 (23%), Positives = 32/106 (30%), Gaps = 17/106 (16%)
Query: 129 LAAKYTQKMSKAPAQKTKSTYTPHGRVGRPPLNKQTVEAVIETKPSPPAAPRVPLVKPTL 188
A A TP +P T E +KP+PPAA + P P
Sbjct: 120 TGGAPPAAAPAAAAAAKAEKTTPE----KPKAAAPTPEPPAASKPTPPAAAKPPEPAPAA 175
Query: 189 PADLFKVTGTQPLDIAAHL-----RLLGDNLTIIGERLKDTQGRMA 229
V P + + R I ERLK +Q A
Sbjct: 176 KPPPTPVARADPRETRVPMSRMRQR--------IAERLKASQNTCA 213
>gnl|CDD|240325 PTZ00237, PTZ00237, acetyl-CoA synthetase; Provisional.
Length = 647
Score = 29.7 bits (67), Expect = 2.1
Identities = 16/34 (47%), Positives = 21/34 (61%), Gaps = 2/34 (5%)
Query: 64 EDDLSQESIEKARFTAYMLWAKQIRQKLIKSNPE 97
EDDL +IEK + T + K IR LIK++PE
Sbjct: 338 EDDL-WNTIEKHKVTHTLTLPKTIR-YLIKTDPE 369
>gnl|CDD|233055 TIGR00617, rpa1, replication factor-a protein 1 (rpa1). All
proteins in this family for which functions are known
are part of a multiprotein complex made up of homologs
of RPA1, RPA2 and RPA3 that bind ssDNA and function in
the recognition of DNA damage for nucleotide excision
repair. This family is based on the phylogenomic analysis
of JA Eisen (1999, Ph.D. Thesis, Stanford University)
[DNA metabolism, DNA replication, recombination, and
repair].
Length = 608
Score = 29.7 bits (67), Expect = 2.2
Identities = 24/137 (17%), Positives = 44/137 (32%), Gaps = 22/137 (16%)
Query: 58 DNLLIDEDDLSQESIEK-ARFTAYMLWAKQ----IRQKLIKSNPEMDFSQVSKKLGELWH 112
N L+ E +L + +I + +F + I +L PE+ +V K+G +
Sbjct: 63 LNPLVREGELQEGTIIRLTKFEVNTIGKDGRKVLIVYELEVVKPEL---KVRDKIG---N 116
Query: 113 TVPFNEKYGWKRQADRLAAKYTQKMSKAPAQKTKSTYTPHGRVGRPPLNKQTVEAVIETK 172
V + + + LA+K + P K P
Sbjct: 117 PVTYEKYLDSWHEEQVLASKPATNPANPPNAKAPKNEVASYNNAANPERG---------- 166
Query: 173 PSPPAAPRVPLVKPTLP 189
+ P AP + +P
Sbjct: 167 -NAPPAPNSGSTRRVMP 182
>gnl|CDD|233491 TIGR01610, phage_O_Nterm, phage replication protein O, N-terminal
domain. This model represents the N-terminal region of
the phage lambda replication protein O and homologous
regions of other phage proteins [DNA metabolism, DNA
replication, recombination, and repair, Mobile and
extrachromosomal element functions, Prophage functions].
Length = 95
Score = 27.1 bits (60), Expect = 3.4
Identities = 6/22 (27%), Positives = 12/22 (54%)
Query: 120 YGWKRQADRLAAKYTQKMSKAP 141
YGW ++ DR+ A +++
Sbjct: 39 YGWNKKQDRVTATVIAELTGLS 60
>gnl|CDD|211850 TIGR03610, RutC, pyrimidine utilization protein C. This protein is
observed in operons extremely similar to that
characterized in E. coli K-12 responsible for the import
and catabolism of pyrimidines, primarily uracil. This
protein is a member of the endoribonuclease L-PSP family
defined by pfam01042.
Length = 126
Score = 27.5 bits (61), Expect = 3.8
Identities = 16/47 (34%), Positives = 23/47 (48%), Gaps = 3/47 (6%)
Query: 166 EAVIETKPSPPAAPRVPLVKPTLPADLFKVTGTQPLDIAAHLRLLGD 212
+ +I S P AP VP TL + V+GT P D ++ +GD
Sbjct: 3 KVIIPAGTSKPLAPFVP---GTLADGVVYVSGTLPFDKDNNVVHVGD 46
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 28.8 bits (64), Expect = 3.8
Identities = 13/55 (23%), Positives = 21/55 (38%), Gaps = 4/55 (7%)
Query: 140 APAQKTKSTYTPHGRVGRPPLNKQTVEAVI----ETKPSPPAAPRVPLVKPTLPA 190
A+ P R+ RP +++ T + +P P AP P +P P
Sbjct: 2871 PAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPP 2925
>gnl|CDD|184285 PRK13733, PRK13733, conjugal transfer protein TraV; Provisional.
Length = 171
Score = 27.8 bits (62), Expect = 4.1
Identities = 16/77 (20%), Positives = 23/77 (29%), Gaps = 11/77 (14%)
Query: 125 QADRLAAKYTQKMSKAPAQKT---------KSTYTPHGRVGRPPLNKQTVEAVIETKPSP 175
QA+ A K Q PA + ++ P ++ V A E K
Sbjct: 41 QANEKAKKLEQSSDAKPAAASLPRLAEGNFRTMPVQTVTATTPSGSRPAVTATPEQKLLA 100
Query: 176 PAAPRV--PLVKPTLPA 190
P VK +P
Sbjct: 101 PRPLFTAAREVKTVVPV 117
>gnl|CDD|235581 PRK05728, PRK05728, DNA polymerase III subunit chi; Validated.
Length = 142
Score = 27.1 bits (61), Expect = 5.9
Identities = 10/42 (23%), Positives = 14/42 (33%)
Query: 151 PHGRVGRPPLNKQTVEAVIETKPSPPAAPRVPLVKPTLPADL 192
PHG G P Q V K + + + +PA
Sbjct: 59 PHGLAGEGPAAGQPVLLTWPGKRNANHRDLLINLDGAVPAFA 100
>gnl|CDD|222254 pfam13598, DUF4139, Domain of unknown function (DUF4139). This
family is usually found at the C-terminus of proteins.
Length = 264
Score = 27.9 bits (63), Expect = 5.9
Identities = 23/70 (32%), Positives = 28/70 (40%), Gaps = 10/70 (14%)
Query: 147 STYTPHGRVGRPPLNKQTVEAVIETK-------PSPPAAPRVPLVKPTLPADLFKVTGTQ 199
ST P P L+ V + + PS A RV L + TLPA+L V
Sbjct: 51 STARPGRGGSPPELSPWRVSVYVTFRIPGPVSVPSDGEAVRVTLAQETLPAELEYV--AV 108
Query: 200 P-LDIAAHLR 208
P LD A L
Sbjct: 109 PKLDPTAFLV 118
>gnl|CDD|197237 cd09139, PLDc_pPLD_like_1, Catalytic domain, repeat 1, of plant
phospholipase D and similar proteins. Catalytic domain,
repeat 1, of plant phospholipase D (PLD, EC 3.1.4.4) and
similar proteins. Plant PLDs have broad substrate
specificity and can hydrolyze the terminal
phosphodiester bond of several common membrane
phospholipids such as phosphatidylcholine (PC),
phosphatidylethanolamine (PE), phosphatidylglycerol
(PG), and phosphatidylserine (PS), with the formation of
phosphatidic acid and alcohols. Phosphatidic acid is an
essential compound involved in signal transduction. PLDs
also catalyze the transphosphatidylation of
phospholipids to acceptor alcohols, by which various
phospholipids can be synthesized. Most plant PLDs
possess a regulatory calcium-dependent
phospholipid-binding C2 domain in the N-terminus and
require calcium for activity, which is unique to plant
PLDs and is not present in animal or fungal PLDs. Like
other PLD enzymes, the monomer of plant PLDs consists of
two catalytic domains, each of which contains one copy
of the conserved HKD motif (H-x-K-x(4)-D, where x
represents any amino acid residue). Two HKD motifs from
two domains form a single active site. Plant PLDs may
utilize a common two-step ping-pong catalytic mechanism
involving an enzyme-substrate intermediate to cleave
phosphodiester bonds. The two histidine residues from
the two HKD motifs play key roles in the catalysis. Upon
substrate binding, a histidine residue from one HKD
motif could function as the nucleophile, attacking the
phosphodiester bond to create a covalent
phosphohistidine intermediate, while the other histidine
residue from the second HKD motif could serve as a
general acid, stabilizing the leaving group. This
subfamily includes two types of plant PLDs, alpha-type
and beta-type PLDs, which are derived from different
gene products and distinctly regulated. The zeta-type
PLD from Arabidopsis is not included in this subfamily.
Length = 176
Score = 27.0 bits (60), Expect = 8.3
Identities = 12/42 (28%), Positives = 20/42 (47%), Gaps = 1/42 (2%)
Query: 70 ESIEKARFTAYML-WAKQIRQKLIKSNPEMDFSQVSKKLGEL 110
++I A+ Y+ W+ LI+ + D + S LGEL
Sbjct: 16 DAICNAKHLIYIAGWSVNPEISLIRDSEREDPPKYSPTLGEL 57
>gnl|CDD|215160 PLN02286, PLN02286, arginine-tRNA ligase.
Length = 576
Score = 27.7 bits (62), Expect = 9.4
Identities = 11/26 (42%), Positives = 15/26 (57%)
Query: 10 LKEEVNISDISDSELYNSAQAVGSGA 35
L E S+ + EL +A+AVG GA
Sbjct: 404 LIERGKDSEWTPEELEQAAEAVGYGA 429
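Each hit above follows the same layout: a `>gnl|CDD|...` definition line, a `Length =` line, then one `Score = ... bits (...), Expect = ...` line per alignment. The following is a minimal Python sketch for extracting those fields from a report laid out like this one. The names `Hit` and `parse_hits` are hypothetical helpers, not part of RPS-BLAST, and the regular expressions are fitted only to the layout seen in this file; other BLAST versions or output formats may need different patterns.

```python
import re
from typing import List, NamedTuple

class Hit(NamedTuple):
    pssm_id: int      # numeric id after "gnl|CDD|"
    accession: str    # e.g. "cd00084", "pfam00505"
    length: int       # value of the "Length =" line
    bits: float       # bit score
    raw: int          # raw score in parentheses
    evalue: float     # value after "Expect ="

# Patterns assumed from the report layout above (illustrative only).
DEF_RE = re.compile(r"^>gnl\|CDD\|(\d+)\s+([^,\s]+),")
LEN_RE = re.compile(r"^Length = (\d+)")
SCORE_RE = re.compile(
    r"^Score\s*=\s*([\d.]+) bits \((\d+)\),\s*Expect\s*=\s*(\S+)"
)

def parse_hits(text: str) -> List[Hit]:
    """Return one Hit per 'Score =' line found in the report text."""
    hits: List[Hit] = []
    pssm_id = accession = length = None
    for line in text.splitlines():
        line = line.strip()
        m = DEF_RE.match(line)
        if m:
            # A new definition line starts a new hit.
            pssm_id, accession, length = int(m.group(1)), m.group(2), None
            continue
        m = LEN_RE.match(line)
        if m and pssm_id is not None and length is None:
            length = int(m.group(1))
            continue
        m = SCORE_RE.match(line)
        if m and pssm_id is not None and length is not None:
            expect = m.group(3).rstrip(",")
            if expect.startswith("e"):
                # Some reports print "e-100" with no leading digit.
                expect = "1" + expect
            hits.append(Hit(pssm_id, accession, length,
                            float(m.group(1)), int(m.group(2)),
                            float(expect)))
    return hits

# Two records copied from the report above.
sample = """\
>gnl|CDD|197700 smart00398, HMG, high mobility group.
Length = 70
Score = 56.6 bits (137), Expect = 8e-11
Identities = 19/62 (30%), Positives = 40/62 (64%)
>gnl|CDD|238685 cd01389, MATA_HMG-box, MATA_HMG-box, class I member of the HMG-box
Length = 77
Score = 36.5 bits (85), Expect = 0.001
"""

for hit in parse_hits(sample):
    print(hit.accession, hit.length, hit.bits, hit.raw, hit.evalue)
```

The footer lines such as `Length of query: 280` are deliberately not matched by `LEN_RE`, which requires the `Length = ` form used inside hit records.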
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.314 0.132 0.380
Gapped
Lambda K H
0.267 0.0727 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 14,310,093
Number of extensions: 1356210
Number of successful extensions: 987
Number of sequences better than 10.0: 1
Number of HSP's gapped: 982
Number of HSP's successfully gapped: 35
Length of query: 280
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 184
Effective length of database: 6,679,618
Effective search space: 1229049712
Effective search space used: 1229049712
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 58 (26.1 bits)
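The footer figures can be cross-checked with a little arithmetic. The sketch below, in Python, recomputes the effective lengths and search space from the printed length adjustment, and converts the raw cutoffs to bits. Two assumptions are made and should be treated as such: that the effective database length is the total letters minus the length adjustment times the number of sequences, and that X1 and S1 use the ungapped Lambda and K while X2, X3 and S2 use the gapped values. Both assumptions are consistent with the numbers printed above, but they are inferred here rather than stated by the report. The function names are illustrative.

```python
import math

# Values copied from the report footer.
query_len = 280
db_letters = 10_937_602
db_seqs = 44_354
length_adjustment = 96

ungapped = {"lam": 0.314, "k": 0.132}
gapped = {"lam": 0.267, "k": 0.0727}

# Effective lengths (assumed formula: subtract the adjustment once per sequence).
eff_query = query_len - length_adjustment
eff_db = db_letters - db_seqs * length_adjustment
search_space = eff_query * eff_db

def bit_score(raw: float, lam: float, k: float) -> float:
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)

def dropoff_bits(raw: float, lam: float) -> float:
    """Convert an X-dropoff value to bits: lambda * X / ln 2 (no K term)."""
    return lam * raw / math.log(2)

print("Effective length of query:   ", eff_query)
print("Effective length of database:", eff_db)
print("Effective search space:      ", search_space)

print("X1 bits:", dropoff_bits(16, ungapped["lam"]))
print("X2 bits:", dropoff_bits(38, gapped["lam"]))
print("X3 bits:", dropoff_bits(64, gapped["lam"]))
print("S1 bits:", bit_score(42, **ungapped))
print("S2 bits:", bit_score(58, **gapped))
```

The three integer results reproduce the footer exactly (184, 6,679,618 and 1,229,049,712). The bit values agree with the printed ones only to the precision of the three-digit Lambda and K shown in the report, so small differences in the last decimal place are expected.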