RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy4401
(147 letters)
>gnl|CDD|238016 cd00059, FH, Forkhead (FH), also known as a "winged helix". FH
is named for the Drosophila fork head protein, a
transcription factor which promotes terminal rather
than segmental development. This family of
transcription factor domains, which bind to B-DNA as
monomers, are also found in the Hepatocyte nuclear
factor (HNF) proteins, which provide tissue-specific
gene regulation. The structure contains 2 flexible
loops or "wings" in the C-terminal region, hence the
term winged helix.
Length = 78
Score = 73.4 bits (181), Expect = 3e-18
Identities = 26/32 (81%), Positives = 28/32 (87%)
Query: 4 NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD 35
NSIRHNLSLNKCFVKVPR D+PGKG +W LD
Sbjct: 47 NSIRHNLSLNKCFVKVPREPDEPGKGSYWTLD 78
>gnl|CDD|189470 pfam00250, Fork_head, Fork head domain.
Length = 96
Score = 72.3 bits (178), Expect = 2e-17
Identities = 26/32 (81%), Positives = 29/32 (90%)
Query: 4 NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD 35
NSIRHNLSLNKCF+KVPRS D+PGKG +W LD
Sbjct: 47 NSIRHNLSLNKCFIKVPRSPDKPGKGSYWTLD 78
>gnl|CDD|214627 smart00339, FH, FORKHEAD. FORKHEAD, also known as a "winged
helix".
Length = 89
Score = 67.3 bits (165), Expect = 1e-15
Identities = 25/32 (78%), Positives = 27/32 (84%)
Query: 4 NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD 35
NSIRHNLSLN CFVKVPR D+PGKG +W LD
Sbjct: 47 NSIRHNLSLNDCFVKVPREGDRPGKGSYWTLD 78
>gnl|CDD|227358 COG5025, COG5025, Transcription factor of the Forkhead/HNF3 family
[Transcription].
Length = 610
Score = 56.7 bits (137), Expect = 3e-10
Identities = 30/71 (42%), Positives = 36/71 (50%), Gaps = 4/71 (5%)
Query: 4 NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD----LQKLEEGGGRYRNVRRKSSDIQHR 59
NSIRHNLSLNK F KVPRS QPGKG FWK+D +K + R + + +
Sbjct: 383 NSIRHNLSLNKSFEKVPRSASQPGKGCFWKIDYSYIYEKESKRNPRSPKKSPSAHSVHQK 442
Query: 60 KQAHRPKPAPS 70
H S
Sbjct: 443 LSLHVNDLYQS 453
Score = 38.3 bits (89), Expect = 7e-04
Identities = 24/107 (22%), Positives = 37/107 (34%), Gaps = 6/107 (5%)
Query: 4 NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD----LQKLEEGGGRYRNVRRKSSDIQHR 59
NSIRHNLSLN F+K+ KG FW + Q L+ G ++ +
Sbjct: 132 NSIRHNLSLNDAFIKIEGRNGAKVKGHFWSIGPGHETQFLKSGLRLDGGGKQMMFTLP-- 189
Query: 60 KQAHRPKPAPSENLNTAYTNNNSIAPGENLNKLEDKLDDTLMESEIS 106
S + +N+S+ L+ L+
Sbjct: 190 SSTEIKITYSSTHSMPLLESNDSLNSNNERELLDIIKSSALIRIPAD 236
>gnl|CDD|237555 PRK13914, PRK13914, invasion associated secreted endopeptidase;
Provisional.
Length = 481
Score = 30.5 bits (68), Expect = 0.25
Identities = 14/39 (35%), Positives = 18/39 (46%), Gaps = 9/39 (23%)
Query: 61 QAHRPKPAPSENLN---------TAYTNNNSIAPGENLN 90
+A +P PAPS N N T N N+ P +N N
Sbjct: 300 EAAKPAPAPSTNTNANKTNTNTNTNTNNTNTSTPSKNTN 338
>gnl|CDD|234345 TIGR03755, conj_TIGR03755, integrating conjugative element protein,
PFL_4711 family. Members of this protein family are
found in genomic regions associated with conjugative
transfer and integrated TOL-like plasmids. The specific
function is unknown [Mobile and extrachromosomal element
functions, Plasmid functions].
Length = 418
Score = 28.8 bits (65), Expect = 1.1
Identities = 21/74 (28%), Positives = 37/74 (50%), Gaps = 11/74 (14%)
Query: 65 PKPAPSENLNTAYTNNNSIAPG--ENLNKLEDKLDDTLME---SEISMTDDIDEDLILSN 119
P ENL A + + I G E L + D+ L++ SEI++ D +++ L++
Sbjct: 277 ATPPTQENLAKASSPSLPITRGVIEALREDPDQ--SLLVQRLASEIALADTLEKALLMRR 334
Query: 120 ILLSGDSYWIHEPE 133
+LL+G + EP
Sbjct: 335 MLLTG----LQEPN 344
>gnl|CDD|223020 PHA03246, PHA03246, large tegument protein UL36; Provisional.
Length = 3095
Score = 27.6 bits (61), Expect = 3.3
Identities = 17/92 (18%), Positives = 27/92 (29%), Gaps = 5/92 (5%)
Query: 50 RRKSSDIQHRKQAHRPKPA-----PSENLNTAYTNNNSIAPGENLNKLEDKLDDTLMESE 104
SS H + PS + A TN N P + + ES
Sbjct: 341 YADSSPKLHSESTDLTPHEHGEYDPSTLVGGASTNINISDPPARTDCRRYSEGSVIHESV 400
Query: 105 ISMTDDIDEDLILSNILLSGDSYWIHEPEHIS 136
S +D+ E + S + H++
Sbjct: 401 DSHIEDVTEATSVVAAWSDAFSDISEDYSHLT 432
>gnl|CDD|216798 pfam01937, DUF89, Protein of unknown function DUF89. This family
has no known function.
Length = 315
Score = 26.5 bits (59), Expect = 5.9
Identities = 11/47 (23%), Positives = 22/47 (46%), Gaps = 2/47 (4%)
Query: 101 MESEISMTDDIDEDLILSNILLSGDSYWI--HEPEHISPDLLDSLLD 145
+ ++ +DE L L ++ SG +W + +SP+L + L
Sbjct: 220 LADHSALGAGLDELLKLGKLIDSGSDFWTPGIDFWEMSPELYEELSK 266
>gnl|CDD|237592 PRK14040, PRK14040, oxaloacetate decarboxylase; Provisional.
Length = 593
Score = 26.4 bits (59), Expect = 6.7
Identities = 10/19 (52%), Positives = 13/19 (68%)
Query: 34 LDLQKLEEGGGRYRNVRRK 52
LD+ KLEE +R VR+K
Sbjct: 259 LDILKLEEIAAYFREVRKK 277
>gnl|CDD|147011 pfam04645, DUF603, Protein of unknown function, DUF603. This
family includes several uncharacterized proteins from
Borrelia species.
Length = 181
Score = 26.1 bits (57), Expect = 6.8
Identities = 11/33 (33%), Positives = 16/33 (48%)
Query: 73 LNTAYTNNNSIAPGENLNKLEDKLDDTLMESEI 105
LN N E +N L+ +LD+ + E EI
Sbjct: 124 LNKKINKKNLSHVNEEINSLKLELDELIKECEI 156
>gnl|CDD|233959 TIGR02639, ClpA, ATP-dependent Clp protease ATP-binding subunit
clpA. [Protein fate, Degradation of proteins, peptides,
and glycopeptides].
Length = 730
Score = 26.5 bits (59), Expect = 7.3
Identities = 9/29 (31%), Positives = 18/29 (62%)
Query: 86 GENLNKLEDKLDDTLMESEISMTDDIDED 114
G ++ L +L+D L E+ + ++IDE+
Sbjct: 47 GGDVELLRKRLEDYLEENLPVIEEEIDEE 75
>gnl|CDD|219622 pfam07891, DUF1666, Protein of unknown function (DUF1666). These
sequences are derived from hypothetical plant proteins
of unknown function. The region in question is
approximately 250 residues long.
Length = 247
Score = 25.9 bits (57), Expect = 7.7
Identities = 10/41 (24%), Positives = 17/41 (41%), Gaps = 3/41 (7%)
Query: 33 KLDLQKLEEGGGRYRNVRRKSSDIQHRKQAHRPKPAPSENL 73
K DLQK E+ + + V R S I + + + +
Sbjct: 159 KTDLQKKEK---KLKEVLRSGSCILKKFKKNESESDEVLIF 196
>gnl|CDD|240767 cd12321, RRM1_TDP43, RNA recognition motif 1 in TAR DNA-binding
protein 43 (TDP-43) and similar proteins. This
subfamily corresponds to the RRM1 of TDP-43 (also
termed TARDBP), a ubiquitously expressed pathogenic
protein whose normal function and abnormal aggregation
are directly linked to the genetic disease cystic
fibrosis, and two neurodegenerative disorders:
frontotemporal lobar degeneration (FTLD) and
amyotrophic lateral sclerosis (ALS). TDP-43 binds both
DNA and RNA, and has been implicated in transcriptional
repression, pre-mRNA splicing and translational
regulation. TDP-43 is a dimeric protein with two RNA
recognition motifs (RRMs), also termed RBDs (RNA
binding domains) or RNPs (ribonucleoprotein domains),
and a C-terminal glycine-rich domain. The RRMs are
responsible for DNA and RNA binding; they bind to TAR
DNA and RNA sequences with UG-repeats. The glycine-rich
domain can interact with the hnRNP family proteins to
form the hnRNP-rich complex involved in splicing
inhibition. It is also essential for the cystic
fibrosis transmembrane conductance regulator (CFTR)
exon 9-skipping activity.
Length = 77
Score = 25.0 bits (55), Expect = 8.1
Identities = 10/23 (43%), Positives = 14/23 (60%)
Query: 1 MRVNSIRHNLSLNKCFVKVPRSK 23
++V S RH + C VK+P SK
Sbjct: 55 VKVLSQRHMIDGRWCDVKIPNSK 77
>gnl|CDD|237548 PRK13893, PRK13893, conjugal transfer protein TrbM; Provisional.
Length = 193
Score = 25.9 bits (57), Expect = 8.2
Identities = 12/43 (27%), Positives = 22/43 (51%), Gaps = 2/43 (4%)
Query: 19 VPRSKDQPGKGGFWKLDLQKLEEGGGRYRNVRRKSSDIQHRKQ 61
+PR P +GG+W ++ + + Y N R + D + R+Q
Sbjct: 149 LPRYVGTPERGGYW-VEARDYDRALAEY-NERIRREDEERRRQ 189
>gnl|CDD|221495 pfam12258, Microcephalin, Microcephalin protein. This family of
proteins is found in eukaryotes. Proteins in this family
are typically between 384 and 835 amino acids in length.
Microcephalin is involved in determining the size of the
brain in animals. It is a protein, which if expressed
homozygously causes the organism to have the condition
microcephaly. Organisms expressing the mutated form of
this protein in a homozygous manner develop a condition
called microcephaly - a drastically reduced brain mass
and volume. Microcephalin is predicted to contain three
BRCA1 C-terminal domains, the first of which is the
probable microcephaly mutation site.
Length = 391
Score = 26.0 bits (57), Expect = 8.5
Identities = 13/47 (27%), Positives = 21/47 (44%)
Query: 50 RRKSSDIQHRKQAHRPKPAPSENLNTAYTNNNSIAPGENLNKLEDKL 96
R KSS + ++ + +P E L ++ S P L K E +L
Sbjct: 124 RPKSSSAKRKRTSENSHSSPKERLKRKRSSGKSAMPRLQLWKSEGRL 170
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.313 0.134 0.393
Gapped
Lambda K H
0.267 0.0768 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 7,524,033
Number of extensions: 667573
Number of successful extensions: 462
Number of sequences better than 10.0: 1
Number of HSP's gapped: 462
Number of HSP's successfully gapped: 33
Length of query: 147
Length of database: 10,937,602
Length adjustment: 88
Effective length of query: 59
Effective length of database: 7,034,450
Effective search space: 415032550
Effective search space used: 415032550
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 54 (24.5 bits)