RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy723
(395 letters)
>gnl|CDD|238016 cd00059, FH, Forkhead (FH), also known as a "winged helix". FH is
named for the Drosophila fork head protein, a
transcription factor which promotes terminal rather than
segmental development. This family of transcription
factor domains, which bind to B-DNA as monomers, are
also found in the Hepatocyte nuclear factor (HNF)
proteins, which provide tissue-specific gene regulation.
The structure contains 2 flexible loops or "wings" in
the C-terminal region, hence the term winged helix.
Length = 78
Score = 141 bits (357), Expect = 1e-41
Identities = 46/78 (58%), Positives = 56/78 (71%), Gaps = 2/78 (2%)
Query: 98 KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
KP +SY LIAMAI SSPE +L LS+IY++I DN+ YFR GW+NSIRHNLSLN CF+
Sbjct: 1 KPPYSYSALIAMAIQSSPEKRLTLSEIYKWISDNFPYFRDAPAGWQNSIRHNLSLNKCFV 60
Query: 158 KAGRS--ANGKGHYWSIH 173
K R GKG YW++
Sbjct: 61 KVPREPDEPGKGSYWTLD 78
>gnl|CDD|214627 smart00339, FH, FORKHEAD. FORKHEAD, also known as a "winged
helix".
Length = 89
Score = 137 bits (348), Expect = 3e-40
Identities = 51/89 (57%), Positives = 66/89 (74%), Gaps = 2/89 (2%)
Query: 98 KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
KP +SYI LIAMAILSSP+ +L LS+IY++I DN+ Y+R GW+NSIRHNLSLNDCF+
Sbjct: 1 KPPYSYIALIAMAILSSPDKRLTLSEIYKWIEDNFPYYRENRAGWQNSIRHNLSLNDCFV 60
Query: 158 KAGRSA--NGKGHYWSIHPANVDDFKKGD 184
K R GKG YW++ PA + F+ G+
Sbjct: 61 KVPREGDRPGKGSYWTLDPAAENMFENGN 89
>gnl|CDD|189470 pfam00250, Fork_head, Fork head domain.
Length = 96
Score = 136 bits (345), Expect = 8e-40
Identities = 50/96 (52%), Positives = 66/96 (68%), Gaps = 2/96 (2%)
Query: 98 KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
KP +SYI LI MAI SPE L LS+IYQ+I+D + Y+R GW+NSIRHNLSLN CFI
Sbjct: 1 KPPYSYIALITMAIQQSPEKMLTLSEIYQWIMDLFPYYRQNKQGWQNSIRHNLSLNKCFI 60
Query: 158 KAGRSA--NGKGHYWSIHPANVDDFKKGDFRRRKAQ 191
K RS GKG YW++ P + + F+ G + +R+ +
Sbjct: 61 KVPRSPDKPGKGSYWTLDPESENMFENGKYLKRRKR 96
>gnl|CDD|227358 COG5025, COG5025, Transcription factor of the Forkhead/HNF3 family
[Transcription].
Length = 610
Score = 78.7 bits (194), Expect = 1e-15
Identities = 48/137 (35%), Positives = 67/137 (48%), Gaps = 6/137 (4%)
Query: 97 PKPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCF 156
KP SY I AILSSP K+ LS+IY +I N Y+R + W+NSIRHNLSLN F
Sbjct: 336 SKPAFSYANSITQAILSSPSGKMTLSEIYSWISSNLPYYRHKPTAWQNSIRHNLSLNKSF 395
Query: 157 IKAGRSAN--GKGHYWSIHPANVDDFKKGDFRRRKAQRKVRRHMGLSVDDDNDSNSPP-- 212
K RSA+ GKG +W I + ++K R ++ +K + N
Sbjct: 396 EKVPRSASQPGKGCFWKIDYS--YIYEKESKRNPRSPKKSPSAHSVHQKLSLHVNDLYQS 453
Query: 213 PLSPPLTFPNILFSSHP 229
P + + + +S P
Sbjct: 454 PATSDIASSSSQVNSQP 470
Score = 61.7 bits (150), Expect = 2e-10
Identities = 41/92 (44%), Positives = 52/92 (56%), Gaps = 2/92 (2%)
Query: 98 KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
P +SY +AIL+SP+ L LS IY +I + + Y+ W+NSIRHNLSLND FI
Sbjct: 86 VPPYSYATGRGLAILNSPDKPLTLSKIYTWIHNTFFYYAKVVSRWQNSIRHNLSLNDAFI 145
Query: 158 K--AGRSANGKGHYWSIHPANVDDFKKGDFRR 187
K A KGH+WSI P + F K R
Sbjct: 146 KIEGRNGAKVKGHFWSIGPGHETQFLKSGLRL 177
Score = 35.9 bits (83), Expect = 0.037
Identities = 13/75 (17%), Positives = 17/75 (22%), Gaps = 6/75 (8%)
Query: 100 QHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFIKA 159
LI + SS + + L + G N R N S
Sbjct: 224 IIKSSALIRIPADSSSNLDVSLGHHISQPSTHTPVLDNHSSGEENISRINNSSQID---- 279
Query: 160 GRSANGKGHYWSIHP 174
S SI
Sbjct: 280 --SPTPNYRMSSIDS 292
>gnl|CDD|236413 PRK09210, PRK09210, RNA polymerase sigma factor RpoD; Validated.
Length = 367
Score = 32.6 bits (75), Expect = 0.33
Identities = 17/55 (30%), Positives = 30/55 (54%), Gaps = 1/55 (1%)
Query: 312 KVKHAASIT-DEVFERLQPGEEDRNKSDEEIDAENEADIDVVNNNNDSESEKVQE 365
K K ++T DE+ E+L P E D ++ D+ + +A I +V+ + S +V E
Sbjct: 21 KGKKRGTLTYDEIAEKLIPFELDSDQIDDLYERLEDAGISIVDEEGNPSSAQVVE 75
>gnl|CDD|181765 PRK09294, PRK09294, acyltransferase PapA5; Provisional.
Length = 416
Score = 30.8 bits (70), Expect = 1.2
Identities = 18/77 (23%), Positives = 27/77 (35%), Gaps = 9/77 (11%)
Query: 14 SPPGSPADQNEPLGNATALVPTLDSHPLLPI-----EQYRIQLYNYAIQAERLRLSQQY- 67
+PP + + LG AT L ++ + R L + IQ L +
Sbjct: 268 TPPVAATEGTNLLGAATYLAEIGPDTDIVDLARAIAATLRADLADGVIQQSFLHFGTAFE 327
Query: 68 GTPYTNYQTPNVNRVMN 84
GTP P V + N
Sbjct: 328 GTPPGL---PPVVFITN 341
>gnl|CDD|220427 pfam09825, BPL_N, Biotin-protein ligase, N terminal. The function
of this structural domain is unknown. It is found to the
N terminus of the biotin protein ligase catalytic
domain.
Length = 364
Score = 30.8 bits (70), Expect = 1.3
Identities = 13/53 (24%), Positives = 24/53 (45%), Gaps = 6/53 (11%)
Query: 177 VDDFKKGDFRRRKAQRKVRRHMGLSVDDDNDSNSPPPLSPPLTFPNILFSSHP 229
+++ K + R R+ +GL V+DD + P L+P + S+P
Sbjct: 233 IEELKADEKARLVFLRECLTKLGLKVNDDTSEDGIPSLTP------LYLLSNP 279
>gnl|CDD|223520 COG0443, DnaK, Molecular chaperone [Posttranslational modification,
protein turnover, chaperones].
Length = 579
Score = 30.4 bits (69), Expect = 1.7
Identities = 23/87 (26%), Positives = 37/87 (42%), Gaps = 15/87 (17%)
Query: 267 PASDLENTGKRQFDVDSLLAPDHPASDLENTDARKKLKPTSSPQT-KVKHAASITDEVFE 325
PA + FD+D A+ + N A K T Q+ +K ++ ++DE E
Sbjct: 440 PAPRGVPQIEVTFDID--------ANGILNVTA--KDLGTGKEQSITIKASSGLSDEEIE 489
Query: 326 RLQPGEEDR----NKSDEEIDAENEAD 348
R+ E K E ++A NEA+
Sbjct: 490 RMVEDAEANAALDKKFRELVEARNEAE 516
>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
biogenesis [Translation, ribosomal structure and
biogenesis].
Length = 1077
Score = 30.5 bits (68), Expect = 1.7
Identities = 20/82 (24%), Positives = 39/82 (47%), Gaps = 7/82 (8%)
Query: 282 DSLLAPDHPASDLENTDARKKLKPTSSPQTKVKHAASITDEVFERLQPGEEDRNKSDEEI 341
D++ D +S+++N + + +PT + S DE L + D + SDE
Sbjct: 390 DAIDTVDRESSEIDNVGRKTRRQPTGKA---IAEETSREDE----LSFDDSDVSTSDENE 442
Query: 342 DAENEADIDVVNNNNDSESEKV 363
D + +NN ++S++E+V
Sbjct: 443 DVDFTGKKGAINNEDESDNEEV 464
>gnl|CDD|221184 pfam11718, CPSF73-100_C, Pre-mRNA 3'-end-processing endonuclease
polyadenylation factor C-term. This is the C-terminal
conserved region of the pre-mRNA 3'-end-processing of
the polyadenylation factor CPSF-73/CPSF-100 proteins.
The exact function of this domain is not known.
Length = 208
Score = 29.2 bits (66), Expect = 3.3
Identities = 17/92 (18%), Positives = 32/92 (34%), Gaps = 21/92 (22%)
Query: 305 PTSSPQTKVKHAASITDEVF-ERLQ--------------PGEEDRNKSDEEIDAENEADI 349
P S + +E F ERL+ + + +ADI
Sbjct: 117 PASVKLSSKICHKKSDEEEFIERLEMLLEAQFGDDCVPLKDKPKLPVILTVTIGKKKADI 176
Query: 350 DV----VNNNNDSESEKVQESLILQRYYQTIA 377
++ V ++S E+V+ L L+R + +
Sbjct: 177 NLETLKVECEDESLRERVE--LELKRLHGLVI 206
>gnl|CDD|217143 pfam02614, UxaC, Glucuronate isomerase. This is a family of
Glucuronate isomerases also known as D-glucuronate
isomerase, uronic isomerase, uronate isomerase, or
uronic acid isomerase, EC:5.3.1.12. This enzyme
catalyzes the reactions: D-glucuronate <=>
D-fructuronate and D-galacturonate <=> D-tagaturonate.
It is not however clear where the experimental evidence
for this functional assignment came from and thus this
family has no literature reference.
Length = 469
Score = 29.1 bits (66), Expect = 4.1
Identities = 23/91 (25%), Positives = 31/91 (34%), Gaps = 22/91 (24%)
Query: 253 KRQFDVDSLLAPD------HPASDLENTGK-------RQFDVDSLLAPDHPASDLENTDA 299
KR F + LL A+ L T ++ +V+ + D P DLE
Sbjct: 109 KRYFGITELLNEKTAEEIWERANALLATEAFSPRGLIKKSNVEVVCTTDDPIDDLE---Y 165
Query: 300 RKKLKPTSSPQTKV------KHAASITDEVF 324
K L S KV A +I E F
Sbjct: 166 HKALAEDESFSVKVLPTFRPDKALNIEREGF 196
>gnl|CDD|233634 TIGR01914, cas_Csa4, CRISPR-associated protein Cas8a2/Csa4, subtype
I-A/APERN. CRISPR loci appear to be mobile elements
with a wide host range. This model represents a protein
that tends to be found near CRISPR repeats. The species
range for this species, so far, is exclusively archaeal.
It is found so far in only four different species, and
includes two tandem genes in Pyrococcus furiosus DSM
3638. This subfamily is found in a CRISPR/Cas locus we
designate APERN, so the family is designated Csa4, for
CRISPR/Cas Subtype Protein 4 [Mobile and
extrachromosomal element functions, Other].
Length = 354
Score = 29.0 bits (65), Expect = 4.3
Identities = 16/92 (17%), Positives = 35/92 (38%), Gaps = 20/92 (21%)
Query: 47 YRIQLYNYAIQAERLRLSQQYGTPYTNYQTPNVNRVMNYFHPRFQISSEEPKPQHSYIGL 106
++ YA+ + Y +PY YQ + +V+ +P P + +
Sbjct: 157 IKVCPLCYAL----AWIGFHYYSPYIKYQKGDETKVVIL----------QPAPA-EEVDM 201
Query: 107 IAMAILSSPEMKLVLSDIYQYILDNYSYFRTR 138
I + +L K++ YIL ++ + +
Sbjct: 202 IELLLLKDLASKIM-----PYILRDFHFKISN 228
>gnl|CDD|182325 PRK10239, PRK10239,
2-amino-4-hydroxy-6-hydroxymethyldihyropteridine
pyrophosphokinase; Provisional.
Length = 159
Score = 28.2 bits (63), Expect = 5.0
Identities = 20/55 (36%), Positives = 28/55 (50%), Gaps = 6/55 (10%)
Query: 14 SPPGSPADQNEPLGNATALVPTLDSHPLLPIEQYRIQLYNYAIQAERLRLSQQYG 68
+PP P DQ + L A AL L LL Q RI+L Q R+R ++++G
Sbjct: 43 TPPLGPQDQPDYLNAAVALETALAPEELLNHTQ-RIEL-----QQGRVRKAERWG 91
>gnl|CDD|217527 pfam03387, Herpes_UL46, Herpesvirus UL46 protein.
Length = 443
Score = 28.5 bits (64), Expect = 6.7
Identities = 11/50 (22%), Positives = 19/50 (38%), Gaps = 4/50 (8%)
Query: 191 QRKVRRHMGLSVDDDNDSNSPPPLSPPL----TFPNILFSSHPFQCFPQM 236
+ ++ G V D +++ P +S L TF S PF+
Sbjct: 112 WKYLQASSGADVPDSPETDGPTQVSVVLLFYPTFGPKPLSKAPFKSKKDN 161
>gnl|CDD|146273 pfam03546, Treacle, Treacher Collins syndrome protein Treacle.
Length = 519
Score = 28.7 bits (63), Expect = 6.7
Identities = 23/90 (25%), Positives = 34/90 (37%), Gaps = 5/90 (5%)
Query: 258 VDSLLAPDHPASDLENTGKRQFDVDSLLAPDHPASDLENTDARKKLKPTSSPQTKVKHAA 317
S D E++ + + D D AP S + AR P P K A
Sbjct: 148 AGSAAVQVGKQEDSESSSEEESDSDGPGAPAQAKSSGKLLQARPASGPAKGPPQKAGPVA 207
Query: 318 SITDEVFERLQPGEEDRNKSDEEIDAENEA 347
+ + + G+ED S+E D+E EA
Sbjct: 208 TQV-----KAERGKEDSESSEESSDSEEEA 232
>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen. This
family consists of several Theileria P67 surface
antigens. A stage specific surface antigen of Theileria
parva, p67, is the basis for the development of an
anti-sporozoite vaccine for the control of East Coast
fever (ECF) in cattle. The antigen has been shown to
contain five distinct linear peptide sequences
recognised by sporozoite-neutralising murine monoclonal
antibodies.
Length = 727
Score = 28.5 bits (63), Expect = 7.2
Identities = 15/59 (25%), Positives = 26/59 (44%), Gaps = 1/59 (1%)
Query: 306 TSSPQTKVKHAASITDEVFERLQPGEEDRNKSDEEIDAENEADIDVVNNNNDSESEKVQ 364
T S Q V + + D E+ Q + + S+E+ D E D ++ + S+K Q
Sbjct: 86 TRSFQEPVSQESEVQDNT-EQNQDTKGSKTDSEEDDDDSEEEDNKSTSSKDGKGSKKTQ 143
>gnl|CDD|215893 pfam00389, 2-Hacid_dh, D-isomer specific 2-hydroxyacid
dehydrogenase, catalytic domain. This family represents
the largest portion of the catalytic domain of
2-hydroxyacid dehydrogenases as the NAD binding domain
is inserted within the structural domain.
Length = 312
Score = 28.4 bits (64), Expect = 7.4
Identities = 9/26 (34%), Positives = 13/26 (50%)
Query: 203 DDDNDSNSPPPLSPPLTFPNILFSSH 228
D + PP SP L PN++ + H
Sbjct: 253 LDVVEEEPPPVNSPLLDLPNVILTPH 278
>gnl|CDD|237124 PRK12518, PRK12518, RNA polymerase sigma factor; Provisional.
Length = 175
Score = 27.8 bits (62), Expect = 7.4
Identities = 14/35 (40%), Positives = 16/35 (45%), Gaps = 4/35 (11%)
Query: 184 DFRRRKAQRKVRRHMGLSVDDDNDSNSPPPLSPPL 218
D RR+ AQR R D ND S P +P L
Sbjct: 75 DARRQFAQRPSRIQ----DDSLNDQPSRPSDTPDL 105
>gnl|CDD|130370 TIGR01303, IMP_DH_rel_1, IMP dehydrogenase family protein. This
model represents a family of proteins, often annotated
as a putative IMP dehydrogenase, related to IMP
dehydrogenase and GMP reductase and restricted to the
high GC Gram-positive bacteria. All species in which a
member is found so far (Corynebacterium glutamicum,
Mycobacterium tuberculosis, Streptomyces coelicolor,
etc.) also have IMP dehydrogenase as described by
TIGRFAMs entry TIGR01302 [Unknown function, General].
Length = 475
Score = 28.3 bits (63), Expect = 7.4
Identities = 21/71 (29%), Positives = 24/71 (33%), Gaps = 6/71 (8%)
Query: 234 PQMLPPLGSTNTTSPCISRKRQFDVDSLLAPDHPASDLEN-TGKRQFDVDSLLAPDHPA- 291
PQ LP T + SR D LAP SD KR ++ D P
Sbjct: 73 PQDLPIPAVKQTVAFVKSRDLVLDTPITLAPHDTVSDAMALIHKRAHGAAVVILEDRPVG 132
Query: 292 ----SDLENTD 298
SDL D
Sbjct: 133 LVTDSDLLGVD 143
>gnl|CDD|222905 PHA02603, nrdC.11, hypothetical protein; Provisional.
Length = 330
Score = 28.2 bits (63), Expect = 9.1
Identities = 14/32 (43%), Positives = 22/32 (68%), Gaps = 1/32 (3%)
Query: 107 IAMAILSSPEMKLVL-SDIYQYILDNYSYFRT 137
+A+ +L +P+ K+V D++QYI DN S F T
Sbjct: 81 VALDMLHTPDDKVVKSPDVWQYIQDNRSRFYT 112
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.314 0.131 0.386
Gapped
Lambda K H
0.267 0.0768 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 20,290,055
Number of extensions: 1965033
Number of successful extensions: 1445
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1439
Number of HSP's successfully gapped: 34
Length of query: 395
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 296
Effective length of database: 6,546,556
Effective search space: 1937780576
Effective search space used: 1937780576
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 60 (26.8 bits)