RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy723
         (395 letters)



>gnl|CDD|238016 cd00059, FH, Forkhead (FH), also known as a "winged helix".  FH is
           named for the Drosophila fork head protein, a
           transcription factor which promotes terminal rather than
           segmental development. This family of transcription
           factor domains, which bind to B-DNA as monomers, are
           also found in the Hepatocyte nuclear factor (HNF)
           proteins, which provide tissue-specific gene regulation.
           The structure contains 2 flexible loops or "wings" in
           the C-terminal region, hence the term winged helix.
          Length = 78

 Score =  141 bits (357), Expect = 1e-41
 Identities = 46/78 (58%), Positives = 56/78 (71%), Gaps = 2/78 (2%)

Query: 98  KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
           KP +SY  LIAMAI SSPE +L LS+IY++I DN+ YFR    GW+NSIRHNLSLN CF+
Sbjct: 1   KPPYSYSALIAMAIQSSPEKRLTLSEIYKWISDNFPYFRDAPAGWQNSIRHNLSLNKCFV 60

Query: 158 KAGRS--ANGKGHYWSIH 173
           K  R     GKG YW++ 
Sbjct: 61  KVPREPDEPGKGSYWTLD 78


>gnl|CDD|214627 smart00339, FH, FORKHEAD.  FORKHEAD, also known as a "winged
           helix".
          Length = 89

 Score =  137 bits (348), Expect = 3e-40
 Identities = 51/89 (57%), Positives = 66/89 (74%), Gaps = 2/89 (2%)

Query: 98  KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
           KP +SYI LIAMAILSSP+ +L LS+IY++I DN+ Y+R    GW+NSIRHNLSLNDCF+
Sbjct: 1   KPPYSYIALIAMAILSSPDKRLTLSEIYKWIEDNFPYYRENRAGWQNSIRHNLSLNDCFV 60

Query: 158 KAGRSA--NGKGHYWSIHPANVDDFKKGD 184
           K  R     GKG YW++ PA  + F+ G+
Sbjct: 61  KVPREGDRPGKGSYWTLDPAAENMFENGN 89


>gnl|CDD|189470 pfam00250, Fork_head, Fork head domain. 
          Length = 96

 Score =  136 bits (345), Expect = 8e-40
 Identities = 50/96 (52%), Positives = 66/96 (68%), Gaps = 2/96 (2%)

Query: 98  KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
           KP +SYI LI MAI  SPE  L LS+IYQ+I+D + Y+R    GW+NSIRHNLSLN CFI
Sbjct: 1   KPPYSYIALITMAIQQSPEKMLTLSEIYQWIMDLFPYYRQNKQGWQNSIRHNLSLNKCFI 60

Query: 158 KAGRSA--NGKGHYWSIHPANVDDFKKGDFRRRKAQ 191
           K  RS    GKG YW++ P + + F+ G + +R+ +
Sbjct: 61  KVPRSPDKPGKGSYWTLDPESENMFENGKYLKRRKR 96


>gnl|CDD|227358 COG5025, COG5025, Transcription factor of the Forkhead/HNF3 family
           [Transcription].
          Length = 610

 Score = 78.7 bits (194), Expect = 1e-15
 Identities = 48/137 (35%), Positives = 67/137 (48%), Gaps = 6/137 (4%)

Query: 97  PKPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCF 156
            KP  SY   I  AILSSP  K+ LS+IY +I  N  Y+R +   W+NSIRHNLSLN  F
Sbjct: 336 SKPAFSYANSITQAILSSPSGKMTLSEIYSWISSNLPYYRHKPTAWQNSIRHNLSLNKSF 395

Query: 157 IKAGRSAN--GKGHYWSIHPANVDDFKKGDFRRRKAQRKVRRHMGLSVDDDNDSNSPP-- 212
            K  RSA+  GKG +W I  +    ++K   R  ++ +K      +        N     
Sbjct: 396 EKVPRSASQPGKGCFWKIDYS--YIYEKESKRNPRSPKKSPSAHSVHQKLSLHVNDLYQS 453

Query: 213 PLSPPLTFPNILFSSHP 229
           P +  +   +   +S P
Sbjct: 454 PATSDIASSSSQVNSQP 470



 Score = 61.7 bits (150), Expect = 2e-10
 Identities = 41/92 (44%), Positives = 52/92 (56%), Gaps = 2/92 (2%)

Query: 98  KPQHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFI 157
            P +SY     +AIL+SP+  L LS IY +I + + Y+      W+NSIRHNLSLND FI
Sbjct: 86  VPPYSYATGRGLAILNSPDKPLTLSKIYTWIHNTFFYYAKVVSRWQNSIRHNLSLNDAFI 145

Query: 158 K--AGRSANGKGHYWSIHPANVDDFKKGDFRR 187
           K      A  KGH+WSI P +   F K   R 
Sbjct: 146 KIEGRNGAKVKGHFWSIGPGHETQFLKSGLRL 177



 Score = 35.9 bits (83), Expect = 0.037
 Identities = 13/75 (17%), Positives = 17/75 (22%), Gaps = 6/75 (8%)

Query: 100 QHSYIGLIAMAILSSPEMKLVLSDIYQYILDNYSYFRTRGPGWRNSIRHNLSLNDCFIKA 159
                 LI +   SS  + + L         +         G  N  R N S        
Sbjct: 224 IIKSSALIRIPADSSSNLDVSLGHHISQPSTHTPVLDNHSSGEENISRINNSSQID---- 279

Query: 160 GRSANGKGHYWSIHP 174
             S        SI  
Sbjct: 280 --SPTPNYRMSSIDS 292


>gnl|CDD|236413 PRK09210, PRK09210, RNA polymerase sigma factor RpoD; Validated.
          Length = 367

 Score = 32.6 bits (75), Expect = 0.33
 Identities = 17/55 (30%), Positives = 30/55 (54%), Gaps = 1/55 (1%)

Query: 312 KVKHAASIT-DEVFERLQPGEEDRNKSDEEIDAENEADIDVVNNNNDSESEKVQE 365
           K K   ++T DE+ E+L P E D ++ D+  +   +A I +V+   +  S +V E
Sbjct: 21  KGKKRGTLTYDEIAEKLIPFELDSDQIDDLYERLEDAGISIVDEEGNPSSAQVVE 75


>gnl|CDD|181765 PRK09294, PRK09294, acyltransferase PapA5; Provisional.
          Length = 416

 Score = 30.8 bits (70), Expect = 1.2
 Identities = 18/77 (23%), Positives = 27/77 (35%), Gaps = 9/77 (11%)

Query: 14  SPPGSPADQNEPLGNATALVPTLDSHPLLPI-----EQYRIQLYNYAIQAERLRLSQQY- 67
           +PP +  +    LG AT L        ++ +        R  L +  IQ   L     + 
Sbjct: 268 TPPVAATEGTNLLGAATYLAEIGPDTDIVDLARAIAATLRADLADGVIQQSFLHFGTAFE 327

Query: 68  GTPYTNYQTPNVNRVMN 84
           GTP      P V  + N
Sbjct: 328 GTPPGL---PPVVFITN 341


>gnl|CDD|220427 pfam09825, BPL_N, Biotin-protein ligase, N terminal.  The function
           of this structural domain is unknown. It is found to the
           N terminus of the biotin protein ligase catalytic
           domain.
          Length = 364

 Score = 30.8 bits (70), Expect = 1.3
 Identities = 13/53 (24%), Positives = 24/53 (45%), Gaps = 6/53 (11%)

Query: 177 VDDFKKGDFRRRKAQRKVRRHMGLSVDDDNDSNSPPPLSPPLTFPNILFSSHP 229
           +++ K  +  R    R+    +GL V+DD   +  P L+P      +   S+P
Sbjct: 233 IEELKADEKARLVFLRECLTKLGLKVNDDTSEDGIPSLTP------LYLLSNP 279


>gnl|CDD|223520 COG0443, DnaK, Molecular chaperone [Posttranslational modification,
           protein turnover, chaperones].
          Length = 579

 Score = 30.4 bits (69), Expect = 1.7
 Identities = 23/87 (26%), Positives = 37/87 (42%), Gaps = 15/87 (17%)

Query: 267 PASDLENTGKRQFDVDSLLAPDHPASDLENTDARKKLKPTSSPQT-KVKHAASITDEVFE 325
           PA       +  FD+D        A+ + N  A  K   T   Q+  +K ++ ++DE  E
Sbjct: 440 PAPRGVPQIEVTFDID--------ANGILNVTA--KDLGTGKEQSITIKASSGLSDEEIE 489

Query: 326 RLQPGEEDR----NKSDEEIDAENEAD 348
           R+    E       K  E ++A NEA+
Sbjct: 490 RMVEDAEANAALDKKFRELVEARNEAE 516


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
           biogenesis [Translation, ribosomal structure and
           biogenesis].
          Length = 1077

 Score = 30.5 bits (68), Expect = 1.7
 Identities = 20/82 (24%), Positives = 39/82 (47%), Gaps = 7/82 (8%)

Query: 282 DSLLAPDHPASDLENTDARKKLKPTSSPQTKVKHAASITDEVFERLQPGEEDRNKSDEEI 341
           D++   D  +S+++N   + + +PT      +    S  DE    L   + D + SDE  
Sbjct: 390 DAIDTVDRESSEIDNVGRKTRRQPTGKA---IAEETSREDE----LSFDDSDVSTSDENE 442

Query: 342 DAENEADIDVVNNNNDSESEKV 363
           D +       +NN ++S++E+V
Sbjct: 443 DVDFTGKKGAINNEDESDNEEV 464


>gnl|CDD|221184 pfam11718, CPSF73-100_C, Pre-mRNA 3'-end-processing endonuclease
           polyadenylation factor C-term.  This is the C-terminal
           conserved region of the pre-mRNA 3'-end-processing of
           the polyadenylation factor CPSF-73/CPSF-100 proteins.
           The exact function of this domain is not known.
          Length = 208

 Score = 29.2 bits (66), Expect = 3.3
 Identities = 17/92 (18%), Positives = 32/92 (34%), Gaps = 21/92 (22%)

Query: 305 PTSSPQTKVKHAASITDEVF-ERLQ--------------PGEEDRNKSDEEIDAENEADI 349
           P S   +         +E F ERL+                +            + +ADI
Sbjct: 117 PASVKLSSKICHKKSDEEEFIERLEMLLEAQFGDDCVPLKDKPKLPVILTVTIGKKKADI 176

Query: 350 DV----VNNNNDSESEKVQESLILQRYYQTIA 377
           ++    V   ++S  E+V+  L L+R +  + 
Sbjct: 177 NLETLKVECEDESLRERVE--LELKRLHGLVI 206


>gnl|CDD|217143 pfam02614, UxaC, Glucuronate isomerase.  This is a family of
           Glucuronate isomerases also known as D-glucuronate
           isomerase, uronic isomerase, uronate isomerase, or
           uronic acid isomerase, EC:5.3.1.12. This enzyme
           catalyzes the reactions: D-glucuronate <=>
           D-fructuronate and D-galacturonate <=> D-tagaturonate.
           It is not however clear where the experimental evidence
           for this functional assignment came from and thus this
           family has no literature reference.
          Length = 469

 Score = 29.1 bits (66), Expect = 4.1
 Identities = 23/91 (25%), Positives = 31/91 (34%), Gaps = 22/91 (24%)

Query: 253 KRQFDVDSLLAPD------HPASDLENTGK-------RQFDVDSLLAPDHPASDLENTDA 299
           KR F +  LL           A+ L  T         ++ +V+ +   D P  DLE    
Sbjct: 109 KRYFGITELLNEKTAEEIWERANALLATEAFSPRGLIKKSNVEVVCTTDDPIDDLE---Y 165

Query: 300 RKKLKPTSSPQTKV------KHAASITDEVF 324
            K L    S   KV        A +I  E F
Sbjct: 166 HKALAEDESFSVKVLPTFRPDKALNIEREGF 196


>gnl|CDD|233634 TIGR01914, cas_Csa4, CRISPR-associated protein Cas8a2/Csa4, subtype
           I-A/APERN.  CRISPR loci appear to be mobile elements
           with a wide host range. This model represents a protein
           that tends to be found near CRISPR repeats. The species
           range for this species, so far, is exclusively archaeal.
           It is found so far in only four different species, and
           includes two tandem genes in Pyrococcus furiosus DSM
           3638. This subfamily is found in a CRISPR/Cas locus we
           designate APERN, so the family is designated Csa4, for
           CRISPR/Cas Subtype Protein 4 [Mobile and
           extrachromosomal element functions, Other].
          Length = 354

 Score = 29.0 bits (65), Expect = 4.3
 Identities = 16/92 (17%), Positives = 35/92 (38%), Gaps = 20/92 (21%)

Query: 47  YRIQLYNYAIQAERLRLSQQYGTPYTNYQTPNVNRVMNYFHPRFQISSEEPKPQHSYIGL 106
            ++    YA+      +   Y +PY  YQ  +  +V+            +P P    + +
Sbjct: 157 IKVCPLCYAL----AWIGFHYYSPYIKYQKGDETKVVIL----------QPAPA-EEVDM 201

Query: 107 IAMAILSSPEMKLVLSDIYQYILDNYSYFRTR 138
           I + +L     K++      YIL ++ +  + 
Sbjct: 202 IELLLLKDLASKIM-----PYILRDFHFKISN 228


>gnl|CDD|182325 PRK10239, PRK10239,
          2-amino-4-hydroxy-6-hydroxymethyldihydropteridine
          pyrophosphokinase; Provisional.
          Length = 159

 Score = 28.2 bits (63), Expect = 5.0
 Identities = 20/55 (36%), Positives = 28/55 (50%), Gaps = 6/55 (10%)

Query: 14 SPPGSPADQNEPLGNATALVPTLDSHPLLPIEQYRIQLYNYAIQAERLRLSQQYG 68
          +PP  P DQ + L  A AL   L    LL   Q RI+L     Q  R+R ++++G
Sbjct: 43 TPPLGPQDQPDYLNAAVALETALAPEELLNHTQ-RIEL-----QQGRVRKAERWG 91


>gnl|CDD|217527 pfam03387, Herpes_UL46, Herpesvirus UL46 protein. 
          Length = 443

 Score = 28.5 bits (64), Expect = 6.7
 Identities = 11/50 (22%), Positives = 19/50 (38%), Gaps = 4/50 (8%)

Query: 191 QRKVRRHMGLSVDDDNDSNSPPPLSPPL----TFPNILFSSHPFQCFPQM 236
            + ++   G  V D  +++ P  +S  L    TF     S  PF+     
Sbjct: 112 WKYLQASSGADVPDSPETDGPTQVSVVLLFYPTFGPKPLSKAPFKSKKDN 161


>gnl|CDD|146273 pfam03546, Treacle, Treacher Collins syndrome protein Treacle. 
          Length = 519

 Score = 28.7 bits (63), Expect = 6.7
 Identities = 23/90 (25%), Positives = 34/90 (37%), Gaps = 5/90 (5%)

Query: 258 VDSLLAPDHPASDLENTGKRQFDVDSLLAPDHPASDLENTDARKKLKPTSSPQTKVKHAA 317
             S         D E++ + + D D   AP    S  +   AR    P   P  K    A
Sbjct: 148 AGSAAVQVGKQEDSESSSEEESDSDGPGAPAQAKSSGKLLQARPASGPAKGPPQKAGPVA 207

Query: 318 SITDEVFERLQPGEEDRNKSDEEIDAENEA 347
           +       + + G+ED   S+E  D+E EA
Sbjct: 208 TQV-----KAERGKEDSESSEESSDSEEEA 232


>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen.  This
           family consists of several Theileria P67 surface
           antigens. A stage specific surface antigen of Theileria
           parva, p67, is the basis for the development of an
           anti-sporozoite vaccine for the control of East Coast
           fever (ECF) in cattle. The antigen has been shown to
           contain five distinct linear peptide sequences
           recognised by sporozoite-neutralising murine monoclonal
           antibodies.
          Length = 727

 Score = 28.5 bits (63), Expect = 7.2
 Identities = 15/59 (25%), Positives = 26/59 (44%), Gaps = 1/59 (1%)

Query: 306 TSSPQTKVKHAASITDEVFERLQPGEEDRNKSDEEIDAENEADIDVVNNNNDSESEKVQ 364
           T S Q  V   + + D   E+ Q  +  +  S+E+ D   E D    ++ +   S+K Q
Sbjct: 86  TRSFQEPVSQESEVQDNT-EQNQDTKGSKTDSEEDDDDSEEEDNKSTSSKDGKGSKKTQ 143


>gnl|CDD|215893 pfam00389, 2-Hacid_dh, D-isomer specific 2-hydroxyacid
           dehydrogenase, catalytic domain.  This family represents
           the largest portion of the catalytic domain of
           2-hydroxyacid dehydrogenases as the NAD binding domain
           is inserted within the structural domain.
          Length = 312

 Score = 28.4 bits (64), Expect = 7.4
 Identities = 9/26 (34%), Positives = 13/26 (50%)

Query: 203 DDDNDSNSPPPLSPPLTFPNILFSSH 228
            D  +   PP  SP L  PN++ + H
Sbjct: 253 LDVVEEEPPPVNSPLLDLPNVILTPH 278


>gnl|CDD|237124 PRK12518, PRK12518, RNA polymerase sigma factor; Provisional.
          Length = 175

 Score = 27.8 bits (62), Expect = 7.4
 Identities = 14/35 (40%), Positives = 16/35 (45%), Gaps = 4/35 (11%)

Query: 184 DFRRRKAQRKVRRHMGLSVDDDNDSNSPPPLSPPL 218
           D RR+ AQR  R       D  ND  S P  +P L
Sbjct: 75  DARRQFAQRPSRIQ----DDSLNDQPSRPSDTPDL 105


>gnl|CDD|130370 TIGR01303, IMP_DH_rel_1, IMP dehydrogenase family protein.  This
           model represents a family of proteins, often annotated
           as a putative IMP dehydrogenase, related to IMP
           dehydrogenase and GMP reductase and restricted to the
           high GC Gram-positive bacteria. All species in which a
           member is found so far (Corynebacterium glutamicum,
           Mycobacterium tuberculosis, Streptomyces coelicolor,
           etc.) also have IMP dehydrogenase as described by
           TIGRFAMs entry TIGR01302 [Unknown function, General].
          Length = 475

 Score = 28.3 bits (63), Expect = 7.4
 Identities = 21/71 (29%), Positives = 24/71 (33%), Gaps = 6/71 (8%)

Query: 234 PQMLPPLGSTNTTSPCISRKRQFDVDSLLAPDHPASDLEN-TGKRQFDVDSLLAPDHPA- 291
           PQ LP      T +   SR    D    LAP    SD      KR      ++  D P  
Sbjct: 73  PQDLPIPAVKQTVAFVKSRDLVLDTPITLAPHDTVSDAMALIHKRAHGAAVVILEDRPVG 132

Query: 292 ----SDLENTD 298
               SDL   D
Sbjct: 133 LVTDSDLLGVD 143


>gnl|CDD|222905 PHA02603, nrdC.11, hypothetical protein; Provisional.
          Length = 330

 Score = 28.2 bits (63), Expect = 9.1
 Identities = 14/32 (43%), Positives = 22/32 (68%), Gaps = 1/32 (3%)

Query: 107 IAMAILSSPEMKLVL-SDIYQYILDNYSYFRT 137
           +A+ +L +P+ K+V   D++QYI DN S F T
Sbjct: 81  VALDMLHTPDDKVVKSPDVWQYIQDNRSRFYT 112


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.131    0.386 

Gapped
Lambda     K      H
   0.267   0.0768    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 20,290,055
Number of extensions: 1965033
Number of successful extensions: 1445
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1439
Number of HSP's successfully gapped: 34
Length of query: 395
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 296
Effective length of database: 6,546,556
Effective search space: 1937780576
Effective search space used: 1937780576
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 60 (26.8 bits)