RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy4401
         (147 letters)



>gnl|CDD|238016 cd00059, FH, Forkhead (FH), also known as a "winged helix".  FH
          is named for the Drosophila fork head protein, a
          transcription factor which promotes terminal rather
          than segmental development. This family of
          transcription factor domains, which bind to B-DNA as
          monomers, are also found in the Hepatocyte nuclear
          factor (HNF) proteins, which provide tissue-specific
          gene regulation. The structure contains 2 flexible
          loops or "wings" in the C-terminal region, hence the
          term winged helix.
          Length = 78

 Score = 73.4 bits (181), Expect = 3e-18
 Identities = 26/32 (81%), Positives = 28/32 (87%)

Query: 4  NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD 35
          NSIRHNLSLNKCFVKVPR  D+PGKG +W LD
Sbjct: 47 NSIRHNLSLNKCFVKVPREPDEPGKGSYWTLD 78


>gnl|CDD|189470 pfam00250, Fork_head, Fork head domain. 
          Length = 96

 Score = 72.3 bits (178), Expect = 2e-17
 Identities = 26/32 (81%), Positives = 29/32 (90%)

Query: 4  NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD 35
          NSIRHNLSLNKCF+KVPRS D+PGKG +W LD
Sbjct: 47 NSIRHNLSLNKCFIKVPRSPDKPGKGSYWTLD 78


>gnl|CDD|214627 smart00339, FH, FORKHEAD.  FORKHEAD, also known as a "winged
          helix".
          Length = 89

 Score = 67.3 bits (165), Expect = 1e-15
 Identities = 25/32 (78%), Positives = 27/32 (84%)

Query: 4  NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD 35
          NSIRHNLSLN CFVKVPR  D+PGKG +W LD
Sbjct: 47 NSIRHNLSLNDCFVKVPREGDRPGKGSYWTLD 78


>gnl|CDD|227358 COG5025, COG5025, Transcription factor of the Forkhead/HNF3 family
           [Transcription].
          Length = 610

 Score = 56.7 bits (137), Expect = 3e-10
 Identities = 30/71 (42%), Positives = 36/71 (50%), Gaps = 4/71 (5%)

Query: 4   NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD----LQKLEEGGGRYRNVRRKSSDIQHR 59
           NSIRHNLSLNK F KVPRS  QPGKG FWK+D     +K  +   R       +  +  +
Sbjct: 383 NSIRHNLSLNKSFEKVPRSASQPGKGCFWKIDYSYIYEKESKRNPRSPKKSPSAHSVHQK 442

Query: 60  KQAHRPKPAPS 70
              H      S
Sbjct: 443 LSLHVNDLYQS 453



 Score = 38.3 bits (89), Expect = 7e-04
 Identities = 24/107 (22%), Positives = 37/107 (34%), Gaps = 6/107 (5%)

Query: 4   NSIRHNLSLNKCFVKVPRSKDQPGKGGFWKLD----LQKLEEGGGRYRNVRRKSSDIQHR 59
           NSIRHNLSLN  F+K+        KG FW +      Q L+ G       ++    +   
Sbjct: 132 NSIRHNLSLNDAFIKIEGRNGAKVKGHFWSIGPGHETQFLKSGLRLDGGGKQMMFTLP-- 189

Query: 60  KQAHRPKPAPSENLNTAYTNNNSIAPGENLNKLEDKLDDTLMESEIS 106
                     S +      +N+S+        L+      L+     
Sbjct: 190 SSTEIKITYSSTHSMPLLESNDSLNSNNERELLDIIKSSALIRIPAD 236


>gnl|CDD|237555 PRK13914, PRK13914, invasion associated secreted endopeptidase;
           Provisional.
          Length = 481

 Score = 30.5 bits (68), Expect = 0.25
 Identities = 14/39 (35%), Positives = 18/39 (46%), Gaps = 9/39 (23%)

Query: 61  QAHRPKPAPSENLN---------TAYTNNNSIAPGENLN 90
           +A +P PAPS N N         T   N N+  P +N N
Sbjct: 300 EAAKPAPAPSTNTNANKTNTNTNTNTNNTNTSTPSKNTN 338


>gnl|CDD|234345 TIGR03755, conj_TIGR03755, integrating conjugative element protein,
           PFL_4711 family.  Members of this protein family are
           found in genomic regions associated with conjugative
           transfer and integrated TOL-like plasmids. The specific
           function is unknown [Mobile and extrachromosomal element
           functions, Plasmid functions].
          Length = 418

 Score = 28.8 bits (65), Expect = 1.1
 Identities = 21/74 (28%), Positives = 37/74 (50%), Gaps = 11/74 (14%)

Query: 65  PKPAPSENLNTAYTNNNSIAPG--ENLNKLEDKLDDTLME---SEISMTDDIDEDLILSN 119
             P   ENL  A + +  I  G  E L +  D+    L++   SEI++ D +++ L++  
Sbjct: 277 ATPPTQENLAKASSPSLPITRGVIEALREDPDQ--SLLVQRLASEIALADTLEKALLMRR 334

Query: 120 ILLSGDSYWIHEPE 133
           +LL+G    + EP 
Sbjct: 335 MLLTG----LQEPN 344


>gnl|CDD|223020 PHA03246, PHA03246, large tegument protein UL36; Provisional.
          Length = 3095

 Score = 27.6 bits (61), Expect = 3.3
 Identities = 17/92 (18%), Positives = 27/92 (29%), Gaps = 5/92 (5%)

Query: 50  RRKSSDIQHRKQAHRPKPA-----PSENLNTAYTNNNSIAPGENLNKLEDKLDDTLMESE 104
              SS   H +             PS  +  A TN N   P    +         + ES 
Sbjct: 341 YADSSPKLHSESTDLTPHEHGEYDPSTLVGGASTNINISDPPARTDCRRYSEGSVIHESV 400

Query: 105 ISMTDDIDEDLILSNILLSGDSYWIHEPEHIS 136
            S  +D+ E   +        S    +  H++
Sbjct: 401 DSHIEDVTEATSVVAAWSDAFSDISEDYSHLT 432


>gnl|CDD|216798 pfam01937, DUF89, Protein of unknown function DUF89.  This family
           has no known function.
          Length = 315

 Score = 26.5 bits (59), Expect = 5.9
 Identities = 11/47 (23%), Positives = 22/47 (46%), Gaps = 2/47 (4%)

Query: 101 MESEISMTDDIDEDLILSNILLSGDSYWI--HEPEHISPDLLDSLLD 145
           +    ++   +DE L L  ++ SG  +W    +   +SP+L + L  
Sbjct: 220 LADHSALGAGLDELLKLGKLIDSGSDFWTPGIDFWEMSPELYEELSK 266


>gnl|CDD|237592 PRK14040, PRK14040, oxaloacetate decarboxylase; Provisional.
          Length = 593

 Score = 26.4 bits (59), Expect = 6.7
 Identities = 10/19 (52%), Positives = 13/19 (68%)

Query: 34  LDLQKLEEGGGRYRNVRRK 52
           LD+ KLEE    +R VR+K
Sbjct: 259 LDILKLEEIAAYFREVRKK 277


>gnl|CDD|147011 pfam04645, DUF603, Protein of unknown function, DUF603.  This
           family includes several uncharacterized proteins from
           Borrelia species.
          Length = 181

 Score = 26.1 bits (57), Expect = 6.8
 Identities = 11/33 (33%), Positives = 16/33 (48%)

Query: 73  LNTAYTNNNSIAPGENLNKLEDKLDDTLMESEI 105
           LN      N     E +N L+ +LD+ + E EI
Sbjct: 124 LNKKINKKNLSHVNEEINSLKLELDELIKECEI 156


>gnl|CDD|233959 TIGR02639, ClpA, ATP-dependent Clp protease ATP-binding subunit
           clpA.  [Protein fate, Degradation of proteins, peptides,
           and glycopeptides].
          Length = 730

 Score = 26.5 bits (59), Expect = 7.3
 Identities = 9/29 (31%), Positives = 18/29 (62%)

Query: 86  GENLNKLEDKLDDTLMESEISMTDDIDED 114
           G ++  L  +L+D L E+   + ++IDE+
Sbjct: 47  GGDVELLRKRLEDYLEENLPVIEEEIDEE 75


>gnl|CDD|219622 pfam07891, DUF1666, Protein of unknown function (DUF1666).  These
           sequences are derived from hypothetical plant proteins
           of unknown function. The region in question is
           approximately 250 residues long.
          Length = 247

 Score = 25.9 bits (57), Expect = 7.7
 Identities = 10/41 (24%), Positives = 17/41 (41%), Gaps = 3/41 (7%)

Query: 33  KLDLQKLEEGGGRYRNVRRKSSDIQHRKQAHRPKPAPSENL 73
           K DLQK E+   + + V R  S I  + + +  +       
Sbjct: 159 KTDLQKKEK---KLKEVLRSGSCILKKFKKNESESDEVLIF 196


>gnl|CDD|240767 cd12321, RRM1_TDP43, RNA recognition motif 1 in TAR DNA-binding
          protein 43 (TDP-43) and similar proteins.  This
          subfamily corresponds to the RRM1 of TDP-43 (also
          termed TARDBP), a ubiquitously expressed pathogenic
          protein whose normal function and abnormal aggregation
          are directly linked to the genetic disease cystic
          fibrosis, and two neurodegenerative disorders:
          frontotemporal lobar degeneration (FTLD) and
          amyotrophic lateral sclerosis (ALS). TDP-43 binds both
          DNA and RNA, and has been implicated in transcriptional
          repression, pre-mRNA splicing and translational
          regulation. TDP-43 is a dimeric protein with two RNA
          recognition motifs (RRMs), also termed RBDs (RNA
          binding domains) or RNPs (ribonucleoprotein domains),
          and a C-terminal glycine-rich domain. The RRMs are
          responsible for DNA and RNA binding; they bind to TAR
          DNA and RNA sequences with UG-repeats. The glycine-rich
          domain can interact with the hnRNP family proteins to
          form the hnRNP-rich complex involved in splicing
          inhibition. It is also essential for the cystic
          fibrosis transmembrane conductance regulator (CFTR)
          exon 9-skipping activity.
          Length = 77

 Score = 25.0 bits (55), Expect = 8.1
 Identities = 10/23 (43%), Positives = 14/23 (60%)

Query: 1  MRVNSIRHNLSLNKCFVKVPRSK 23
          ++V S RH +    C VK+P SK
Sbjct: 55 VKVLSQRHMIDGRWCDVKIPNSK 77


>gnl|CDD|237548 PRK13893, PRK13893, conjugal transfer protein TrbM; Provisional.
          Length = 193

 Score = 25.9 bits (57), Expect = 8.2
 Identities = 12/43 (27%), Positives = 22/43 (51%), Gaps = 2/43 (4%)

Query: 19  VPRSKDQPGKGGFWKLDLQKLEEGGGRYRNVRRKSSDIQHRKQ 61
           +PR    P +GG+W ++ +  +     Y N R +  D + R+Q
Sbjct: 149 LPRYVGTPERGGYW-VEARDYDRALAEY-NERIRREDEERRRQ 189


>gnl|CDD|221495 pfam12258, Microcephalin, Microcephalin protein.  This family of
           proteins is found in eukaryotes. Proteins in this family
           are typically between 384 and 835 amino acids in length.
           Microcephalin is involved in determining the size of the
           brain in animals. It is a protein, which if expressed
           homozygously causes the organism to have the condition
           microcephaly. Organisms expressing the mutated form of
           this protein in a homozygous manner develop a condition
           called microcephaly - a drastically reduced brain mass
           and volume. Microcephalin is predicted to contain three
           BRCA1 C-terminal domains, the first of which is the
           probable microcephaly mutation site.
          Length = 391

 Score = 26.0 bits (57), Expect = 8.5
 Identities = 13/47 (27%), Positives = 21/47 (44%)

Query: 50  RRKSSDIQHRKQAHRPKPAPSENLNTAYTNNNSIAPGENLNKLEDKL 96
           R KSS  + ++ +     +P E L    ++  S  P   L K E +L
Sbjct: 124 RPKSSSAKRKRTSENSHSSPKERLKRKRSSGKSAMPRLQLWKSEGRL 170


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.313    0.134    0.393 

Gapped
Lambda     K      H
   0.267   0.0768    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 7,524,033
Number of extensions: 667573
Number of successful extensions: 462
Number of sequences better than 10.0: 1
Number of HSP's gapped: 462
Number of HSP's successfully gapped: 33
Length of query: 147
Length of database: 10,937,602
Length adjustment: 88
Effective length of query: 59
Effective length of database: 7,034,450
Effective search space: 415032550
Effective search space used: 415032550
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 54 (24.5 bits)