RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy8149
         (182 letters)



>gnl|CDD|197842 smart00717, SANT, SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding
          domains. 
          Length = 49

 Score = 37.6 bits (88), Expect = 2e-04
 Identities = 16/49 (32%), Positives = 25/49 (51%), Gaps = 4/49 (8%)

Query: 31 GNKWTAEDIEMLKETVRKFG-DELVIISDRIKDRTISQIKS---NLKKK 75
            +WT E+ E+L E V+K+G +    I+  +  RT  Q +    NL K 
Sbjct: 1  KGEWTEEEDELLIELVKKYGKNNWEKIAKELPGRTAEQCRERWRNLLKP 49


>gnl|CDD|215818 pfam00249, Myb_DNA-binding, Myb-like DNA-binding domain.  This
          family contains the DNA binding domains from Myb
          proteins, as well as the SANT domain family.
          Length = 47

 Score = 35.2 bits (82), Expect = 0.001
 Identities = 14/40 (35%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 32 NKWTAEDIEMLKETVRKFG-DELVIISDRIKDRTISQIKS 70
            WT E+ E+L E V+K G      I+  +  RT +Q K+
Sbjct: 2  GPWTPEEDELLIEAVKKHGNGNWSKIAKHLPGRTDNQCKN 41


>gnl|CDD|238096 cd00167, SANT, 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding
          domains. Tandem copies of the domain bind telomeric DNA
          tandem repeats as part of the capping complex. Binding
          is sequence dependent for repeats which contain the G/C
          rich motif [C2-3 A (CA)1-6]. The domain is also found
          in regulatory transcriptional repressor complexes where
          it also binds DNA.
          Length = 45

 Score = 34.5 bits (80), Expect = 0.002
 Identities = 13/39 (33%), Positives = 20/39 (51%), Gaps = 1/39 (2%)

Query: 33 KWTAEDIEMLKETVRKFGDE-LVIISDRIKDRTISQIKS 70
           WT E+ E+L E V+K+G      I+  +  RT  Q + 
Sbjct: 1  PWTEEEDELLLEAVKKYGKNNWEKIAKELPGRTPKQCRE 39


>gnl|CDD|215119 PLN02187, PLN02187, rooty/superroot1.
          Length = 462

 Score = 37.0 bits (85), Expect = 0.003
 Identities = 16/40 (40%), Positives = 28/40 (70%), Gaps = 2/40 (5%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKDRTI 65
           P++P GN ++ + ++ + ET RK G  +++ISD + DRTI
Sbjct: 213 PNNPCGNVYSHDHLKKVAETARKLG--IMVISDEVYDRTI 250


>gnl|CDD|206092 pfam13921, Myb_DNA-bind_6, Myb-like DNA-binding domain.  This
          family contains the DNA binding domains from Myb
          proteins, as well as the SANT domain family.
          Length = 59

 Score = 32.3 bits (74), Expect = 0.017
 Identities = 14/59 (23%), Positives = 27/59 (45%), Gaps = 10/59 (16%)

Query: 34 WTAEDIEMLKETVRKFGDELVIISD-------RIKDRTISQIKSNLKKKAF---EDAGL 82
          WT E+ E L + V K+G++   I++         +DR   +++    +  +   ED  L
Sbjct: 1  WTEEEDEKLLKLVEKYGNDWKQIAEELGRTPSACRDRWRRKLRPKRSRGPWTKEEDQRL 59


>gnl|CDD|181036 PRK07568, PRK07568, aspartate aminotransferase; Provisional.
          Length = 397

 Score = 34.8 bits (81), Expect = 0.017
 Identities = 15/35 (42%), Positives = 22/35 (62%), Gaps = 2/35 (5%)

Query: 24  STPDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
           S P +P+G  +T E++EML E  +K    L +ISD
Sbjct: 169 SNPGNPTGVVYTKEELEMLAEIAKKHD--LFLISD 201


>gnl|CDD|212556 cd11658, SANT_DMAP1_like, SANT/myb-like domain of Human Dna
          Methyltransferase 1 Associated Protein 1-like.  These
          proteins are members of the SANT/myb group. SANT is
          named after 'SWI3, ADA2, N-CoR and TFIIIB', several
          factors that share this domain. The SANT domain
          resembles the 3 alpha-helix bundle of the DNA-binding
          Myb domains and is found in a diverse set of proteins.
          Length = 46

 Score = 31.2 bits (71), Expect = 0.032
 Identities = 14/45 (31%), Positives = 24/45 (53%), Gaps = 7/45 (15%)

Query: 34 WTAEDIEMLKETVRKFGDELVIISDRI---KDRTISQIKSNLKKK 75
          WT E+ + L + V++F     +I DR    K R++     +LK+K
Sbjct: 1  WTKEETDYLFDLVKRFDLRWNVILDRYPFQKGRSV----EDLKEK 41


>gnl|CDD|224090 COG1168, MalY, Bifunctional PLP-dependent enzyme with
           beta-cystathionase and maltose regulon repressor
           activities [Amino acid transport and metabolism].
          Length = 388

 Score = 33.8 bits (78), Expect = 0.038
 Identities = 12/35 (34%), Positives = 20/35 (57%), Gaps = 2/35 (5%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRI 60
           P +P+G  WT E++  + E   + G  + +ISD I
Sbjct: 167 PHNPTGRVWTKEELRKIAELCLRHG--VRVISDEI 199


>gnl|CDD|212558 cd11660, SANT_TRF, Telomere repeat binding factor-like
          DNA-binding domains of the SANT/myb-like family.  Human
          telomere repeat binding factors, TRF1 and TRF2,
          function as part of the 6 component shelterin complex.
          TRF2 binds DNA and recruits RAP1 (via binding to the
          RAP1 protein c-terminal (RCT)) and TIN2 in the
          protection of telomeres from DNA repair machinery.
          Metazoan shelterin consists of 3 DNA binding proteins
          (TRF2, TRF1, and POT1) and 3 recruited proteins that
          bind to one or more of these DNA-binding proteins
          (RAP1, TIN2, TPP1).  Schizosaccharomyces pombe TAZ1 is
          an ortholog and binds RAP1. Human TRF1 and TRF2 bind
          double-stranded DNA. hTRF2 consists of a basic
          N-terminus, a TRF homology domain, the RAP1 binding
          motif (RBM), the TIN2 binding motif (TBM) and a
          myb-like DNA binding domain, SANT, named after 'SWI3,
          ADA2, N-CoR and TFIIIB', several factors that share
          this domain. Tandem copies of the domain bind telomeric
          DNA tandem repeats as part of the capping complex. The
          single myb-like domain of TRF-type proteins is similar
          to the tandem myb_like domains found in yeast RAP1.
          Length = 50

 Score = 31.0 bits (71), Expect = 0.039
 Identities = 17/49 (34%), Positives = 25/49 (51%), Gaps = 7/49 (14%)

Query: 33 KWTAEDIEMLKETVRKFGD----ELVIISDRIKDRTISQIK---SNLKK 74
          KWT E+ E L E V K+G     +++     + +RT   +K    NLKK
Sbjct: 2  KWTDEEDEALVEGVEKYGVGNWAKILKDYFFVNNRTSVDLKDKWRNLKK 50


>gnl|CDD|224231 COG1312, UxuA, D-mannonate dehydratase [Carbohydrate transport
          and metabolism].
          Length = 362

 Score = 33.1 bits (76), Expect = 0.064
 Identities = 15/59 (25%), Positives = 22/59 (37%), Gaps = 11/59 (18%)

Query: 26 PDSPSGNKWTAEDIEMLKETVRKFGDEL-----VIISDRIK------DRTISQIKSNLK 73
             P+G  W  E+I   KE +   G        V + + IK      DR I   K  ++
Sbjct: 32 HHIPAGEVWPVEEILKRKEEIESAGLTWSVVESVPVHEDIKLGTPTRDRYIENYKQTIR 90


>gnl|CDD|237191 PRK12757, PRK12757, cell division protein FtsN; Provisional.
          Length = 256

 Score = 32.3 bits (74), Expect = 0.100
 Identities = 16/54 (29%), Positives = 19/54 (35%), Gaps = 3/54 (5%)

Query: 85  QQQVAPPQQIVQ---QVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQ 135
           +Q    P+  VQ   Q  QQQ       PQ V  P +         P PV  Q 
Sbjct: 109 EQTPQVPRSTVQIQQQAQQQQPPATTAQPQPVTPPRQTTAPVQPQTPAPVRTQP 162



 Score = 31.9 bits (73), Expect = 0.12
 Identities = 14/52 (26%), Positives = 16/52 (30%)

Query: 85  QQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQ 136
           +  V   QQ  QQ       Q        QT   V  QT A V    +A   
Sbjct: 116 RSTVQIQQQAQQQQPPATTAQPQPVTPPRQTTAPVQPQTPAPVRTQPAAPVT 167



 Score = 30.4 bits (69), Expect = 0.45
 Identities = 15/49 (30%), Positives = 16/49 (32%)

Query: 85  QQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSA 133
           Q Q   P     Q       +   AP   QTP  V  Q  A V   V A
Sbjct: 124 QAQQQQPPATTAQPQPVTPPRQTTAPVQPQTPAPVRTQPAAPVTQAVEA 172



 Score = 28.9 bits (65), Expect = 1.2
 Identities = 13/52 (25%), Positives = 16/52 (30%)

Query: 84  IQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQ 135
           IQQQ    Q        Q      Q    VQ  T    +T    PV  + + 
Sbjct: 121 IQQQAQQQQPPATTAQPQPVTPPRQTTAPVQPQTPAPVRTQPAAPVTQAVEA 172



 Score = 28.1 bits (63), Expect = 2.4
 Identities = 15/58 (25%), Positives = 20/58 (34%), Gaps = 1/58 (1%)

Query: 91  PQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGSHKSAEVT 148
             Q+ +  VQ Q Q   Q P       + V     T   PV  Q      +  +A VT
Sbjct: 111 TPQVPRSTVQIQQQAQQQQPPATTAQPQPVTPPRQTTA-PVQPQTPAPVRTQPAAPVT 167


>gnl|CDD|181642 PRK09082, PRK09082, methionine aminotransferase; Validated.
          Length = 386

 Score = 32.6 bits (75), Expect = 0.10
 Identities = 12/34 (35%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 25  TPDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
           TP +PSG  W+A D+  L + +   G ++ ++SD
Sbjct: 171 TPHNPSGTVWSAADMRALWQLIA--GTDIYVLSD 202


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 31.9 bits (73), Expect = 0.20
 Identities = 17/106 (16%), Positives = 38/106 (35%), Gaps = 3/106 (2%)

Query: 76  AFEDAGLPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQ 135
              +  +    +      +V+ V +   + VV A    +    V       +  PV+ Q 
Sbjct: 869 VVAEVPVAAAVEPVVSAPVVEAVAEVVEEPVVVAEPQPEEVVVVETTHPEVIAAPVTEQP 928

Query: 136 QLLSGSHKSAEVTLNMLNAHPESEVDVEGLPEEVKLQFDTTAQQVA 181
           Q+++   +S       +  H E  V+ +    +++   +T    VA
Sbjct: 929 QVIT---ESDVAVAQEVAEHAEPVVEPQDETADIEEAAETAEVVVA 971



 Score = 28.9 bits (65), Expect = 1.7
 Identities = 12/42 (28%), Positives = 20/42 (47%), Gaps = 1/42 (2%)

Query: 88  VAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPV 129
           V  PQ  VQ   Q++ ++V   P + + P     + V + PV
Sbjct: 847 VVRPQD-VQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSAPV 887


>gnl|CDD|226870 COG4464, CapC, Capsular polysaccharide biosynthesis protein
          [Carbohydrate transport and metabolism / Cell envelope
          biogenesis, outer membrane].
          Length = 254

 Score = 31.2 bits (71), Expect = 0.21
 Identities = 20/77 (25%), Positives = 35/77 (45%), Gaps = 7/77 (9%)

Query: 23 HSTPDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKDR---TISQIKSNLKK--KAF 77
          H  PD   G K   E + ML+E VR+   ++V  S  +  R    I ++K    +  +  
Sbjct: 7  HILPDIDDGPKSLEESLAMLREAVRQGVTKIVATSHHLHGRYENPIEKVKEKANQLNEIL 66

Query: 78 EDAGLPIQQQVAPPQQI 94
          +   + +  +V P Q+I
Sbjct: 67 KKEAIDL--KVLPGQEI 81


>gnl|CDD|213963 TIGR04350, C_S_lyase_PatB, putative C-S lyase.  Members of this
           subfamily are probable C-S lyases from a family of
           pyridoxal phosphate-dependent enzymes that tend to be
           (mis)annotated as probable aminotransferases. One member
           is PatB of Bacillus subtilis, a proven C-S-lyase.
           Another is the virulence factor cystalysin from
           Treponema denticola, whose hemolysin activity may stem
           from H2S production. Members of the seed alignment occur
           next to examples of the enzyme 5-histidylcysteine
           sulfoxide synthase, from ovothiol A biosynthesis, and
           would be expected to perform a C-S cleavage of
           5-histidylcysteine sulfoxide to leave
           1-methyl-4-mercaptohistidine (ovothiol A).
          Length = 384

 Score = 31.1 bits (71), Expect = 0.25
 Identities = 12/35 (34%), Positives = 19/35 (54%), Gaps = 2/35 (5%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRI 60
           P +P G  WT E++  L E   +    +V++SD I
Sbjct: 166 PHNPVGRVWTREELTRLAELCLRHN--VVVVSDEI 198


>gnl|CDD|215074 PLN00145, PLN00145, tyrosine/nicotianamine aminotransferase;
           Provisional.
          Length = 430

 Score = 31.3 bits (71), Expect = 0.26
 Identities = 12/37 (32%), Positives = 24/37 (64%), Gaps = 2/37 (5%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKD 62
           P++P G+ ++ E +  + ET RK G  +++I+D + D
Sbjct: 199 PNNPCGSVYSYEHLAKIAETARKLG--ILVIADEVYD 233


>gnl|CDD|236300 PRK08576, PRK08576, hypothetical protein; Provisional.
          Length = 438

 Score = 31.2 bits (71), Expect = 0.28
 Identities = 12/44 (27%), Positives = 23/44 (52%), Gaps = 1/44 (2%)

Query: 17  DLTLQLHSTPDSPSGNKW-TAEDIEMLKETVRKFGDELVIISDR 59
           D+ + +         N+W T   +E L+E +R+  D L+++ DR
Sbjct: 296 DVPMPIEKYGMPTHSNRWCTKLKVEALEEAIRELEDGLLVVGDR 339


>gnl|CDD|216477 pfam01397, Terpene_synth, Terpene synthase, N-terminal domain.
          It has been suggested that this gene family be
          designated tps (for terpene synthase). It has been
          split into six subgroups on the basis of phylogeny,
          called tpsa-tpsf. tpsa includes vetispiridiene
          synthase, 5-epi- aristolochene synthase, and
          (+)-delta-cadinene synthase. tpsb includes (-)-limonene
          synthase. tpsc includes kaurene synthase A. tpsd
          includes taxadiene synthase, pinene synthase, and
          myrcene synthase. tpse includes kaurene synthase B.
          tpsf includes linalool synthase.
          Length = 177

 Score = 30.2 bits (69), Expect = 0.37
 Identities = 11/27 (40%), Positives = 15/27 (55%)

Query: 22 LHSTPDSPSGNKWTAEDIEMLKETVRK 48
          L  T D  + ++   E +E LKE VRK
Sbjct: 6  LSYTADMQTEDEKCLERLESLKEEVRK 32


>gnl|CDD|215193 PLN02337, PLN02337, lipoxygenase.
          Length = 866

 Score = 30.8 bits (70), Expect = 0.38
 Identities = 19/51 (37%), Positives = 27/51 (52%), Gaps = 12/51 (23%)

Query: 20  LQLHSTP-------DSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKDR 63
           L  HS+        D+P   +WT+ D E L E  ++FG+ LV I +RI D 
Sbjct: 777 LSRHSSDEVYLGQRDTP---EWTS-DAEPL-EAFKRFGERLVEIENRIVDM 822


>gnl|CDD|235172 PRK03906, PRK03906, mannonate dehydratase; Provisional.
          Length = 385

 Score = 30.2 bits (69), Expect = 0.54
 Identities = 15/48 (31%), Positives = 20/48 (41%), Gaps = 8/48 (16%)

Query: 22 LHSTPDSPSGNKWTAEDIEMLKETVRKFGDEL-VI----ISDRIKDRT 64
          LH   D P G  W  E+I   K  +   G E  V+    + + IK  T
Sbjct: 31 LH---DIPVGEVWPVEEILARKAEIEAAGLEWSVVESVPVHEDIKTGT 75


>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
          Length = 333

 Score = 30.0 bits (68), Expect = 0.58
 Identities = 16/52 (30%), Positives = 19/52 (36%)

Query: 85  QQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQ 136
            Q   P QQ V   V    Q V  APQ  Q   +  +   A  P PV+    
Sbjct: 136 PQPEQPLQQPVSPQVAPAPQPVHSAPQPAQQAFQPAEPVAAPQPEPVAEPAP 187



 Score = 28.8 bits (65), Expect = 1.5
 Identities = 29/91 (31%), Positives = 36/91 (39%), Gaps = 10/91 (10%)

Query: 83  PIQQQVAP-PQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGS 141
            +  Q AP P Q   Q VQQ   Q  Q  Q +Q P   V   VA  P PV +  Q    +
Sbjct: 112 QVPPQHAPRPAQPAPQPVQQPAYQP-QPEQPLQQP---VSPQVAPAPQPVHSAPQPAQQA 167

Query: 142 HKSAEVTLNMLNAHPESEVDVEGLPEEVKLQ 172
            + AE       A P+ E   E  P   K +
Sbjct: 168 FQPAE-----PVAAPQPEPVAEPAPVMDKPK 193


>gnl|CDD|237863 PRK14949, PRK14949, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 944

 Score = 30.1 bits (68), Expect = 0.81
 Identities = 13/67 (19%), Positives = 19/67 (28%)

Query: 82  LPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGS 141
           LP  Q  +     VQ     + Q V  AP   +T           V    +        S
Sbjct: 376 LPEGQTPSALAAAVQAPHANEPQFVNAAPAEKKTALTEQTTAQQQVQAANAEAVAEADAS 435

Query: 142 HKSAEVT 148
            + A+  
Sbjct: 436 AEPADTV 442


>gnl|CDD|198284 cd10421, SH2_STAT5a, Src homology 2 (SH2) domain found in signal
           transducer and activator of transcription (STAT) 5a
           proteins.  STAT5 is a member of the STAT family of
           transcription factors.  Two highly related proteins,
           STAT5a and STAT5b are encoded by separate genes, but are
           90% identical at the amino acid level.  Both STAT5a and
           STAT5b are ubiquitously expressed and functionally
           interchangeable. Mice lacking either STAT5a or STAT5b
           have mild defects in prolactin dependent mammary
           differentiation or sexually dimorphic growth
           hormone-dependent effects, respectively. Mice lacking
           both STAT5a and STAT5b exhibit a perinatal lethal
           phenotype and have multiple defects, including anemia
           and a virtual absence of B and T lymphocytes. STAT
           proteins mediate the signaling of cytokines and a number
           of growth factors from the receptors of these
           extracellular signaling molecules to the cell nucleus.
           STATs are specifically phosphorylated by
           receptor-associated Janus kinases, receptor tyrosine
           kinases, or cytoplasmic tyrosine kinases. The
           phosphorylated STAT molecules dimerize by reciprocal
           binding of their SH2 domains to the phosphotyrosine
           residues. These dimeric STATs translocate into the
           nucleus, bind to specific DNA sequences, and regulate
           the transcription of their target genes.  However there
           are a number of unphosphorylated STATs that travel
           between the cytoplasm and nucleus and some STATs that
           exist as dimers in unstimulated cells that can exert
           biological functions independent of being activated.
           There are seven mammalian STAT family members which have
           been identified: STAT1, STAT2, STAT3, STAT4, STAT5
           (STAT5A and STAT5B), and STAT6. There are 6 conserved
           domains in STAT: N-terminal domain (NTD), coiled-coil
           domain (CCD), DNA-binding domain (DBD), alpha-helical
           linker domain (LD), SH2 domain, and transactivation
           domain (TAD). NTD is involved in dimerization of
           unphosphorylated STATs monomers and for the
           tetramerization between STAT1, STAT3, STAT4 and STAT5 on
           promoters with two or more tandem STAT binding sites.
           It also plays a role in promoting interactions with
           transcriptional co-activators such as CREB binding
           protein (CBP)/p300, as well as being important for
           nuclear import and deactivation of STATs involving
           tyrosine de-phosphorylation. CCD interacts with other
           proteins, such as IFN regulatory protein 9 (IRF-9/p48)
           with STAT1 and c-JUN with STAT3 and is also thought to
           participate in the negative regulation of these
           proteins. Distinct genes are bound to STATs via their
           DBD domain. This domain is also involved in nuclear
           translocation of activated STAT1 and STAT3
           phosphorylated dimers upon cytokine stimulation. LD
           links the DNA-binding and SH2 domains and is important
           for the transcriptional activation of STAT1 in response
           to IFN-gamma. It also plays a role in protein-protein
           interactions and has also been implicated in the
           constitutive nucleocytoplasmic shuttling of
           unphosphorylated STATs in resting cells.  The SH2 domain
           is necessary for receptor association and tyrosine
           phosphodimer formation. Residues within this domain may
           be particularly important for some cellular functions
           mediated by the STATs as well as residues adjacent to
           this domain.  The TAD interacts with several proteins,
           namely minichromosome maintenance complex component 5
           (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also
           contains a modulatory phosphorylation site that
           regulates STAT activity and is necessary for maximal
           transcription of a number of target genes. The conserved
           tyrosine residue present in the C-terminus is crucial
           for dimerization via interaction with the SH2 domain
           upon the interaction of the ligand with the receptor.
           STAT activation by tyrosine phosphorylation also
           determines nuclear import and retention, DNA binding to
           specific DNA elements in the promoters of responsive
           genes, and transcriptional activation of STAT dimers. In
           addition to the SH2 domain there is a coiled-coil
           domain, a DNA binding domain, and a transactivation
           domain in the STAT proteins. In general SH2 domains are
           involved in signal transduction.  They typically bind
           pTyr-containing ligands via two surface pockets, a pTyr
           and hydrophobic binding pocket, allowing proteins with
           SH2 domains to localize to tyrosine phosphorylated
           sites.
          Length = 140

 Score = 28.9 bits (64), Expect = 0.88
 Identities = 19/72 (26%), Positives = 28/72 (38%), Gaps = 7/72 (9%)

Query: 27  DSPSGNKWTAEDIEMLKETVRKFGDEL-------VIISDRIKDRTISQIKSNLKKKAFED 79
           DSP  N W  +       ++R   D L        +  DR KD   S+  + +  KA + 
Sbjct: 62  DSPDRNLWNLKPFTTRDFSIRSLADRLGDLNYLIYVFPDRPKDEVFSKYYTPVLAKAVDG 121

Query: 80  AGLPIQQQVAPP 91
              P  +QV P 
Sbjct: 122 YVKPQIKQVVPE 133


>gnl|CDD|165132 PHA02767, PHA02767, hypothetical protein; Provisional.
          Length = 101

 Score = 28.4 bits (63), Expect = 0.91
 Identities = 15/36 (41%), Positives = 20/36 (55%), Gaps = 9/36 (25%)

Query: 49 FGDELVIISDRIKDRTISQIKSNLKKKAFEDAGLPI 84
          F DE +I+          QIK N+KKK +E AG+ I
Sbjct: 34 FEDEEIILI---------QIKINMKKKEYEKAGMSI 60


>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
          Length = 1355

 Score = 29.7 bits (66), Expect = 0.95
 Identities = 17/51 (33%), Positives = 17/51 (33%)

Query: 85  QQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQ 135
           QQ VAP  Q  Q       Q   Q PQ    P    QQ    V      QQ
Sbjct: 782 QQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQ 832



 Score = 29.3 bits (65), Expect = 1.6
 Identities = 20/54 (37%), Positives = 21/54 (38%), Gaps = 3/54 (5%)

Query: 85  QQQVAPPQQIVQ---QVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQ 135
           QQ VAP  Q  Q    V  Q   Q  Q P   Q   +  QQ VA  P     QQ
Sbjct: 769 QQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQ 822



 Score = 27.4 bits (60), Expect = 6.6
 Identities = 22/72 (30%), Positives = 25/72 (34%), Gaps = 4/72 (5%)

Query: 86  QQVAPPQQIVQQVVQ----QQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGS 141
           QQ   PQ   QQ  Q    Q   Q  Q P   Q   +  QQ VA  P     QQ +    
Sbjct: 782 QQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQP 841

Query: 142 HKSAEVTLNMLN 153
             +    L M N
Sbjct: 842 QDTLLHPLLMRN 853


>gnl|CDD|237874 PRK14971, PRK14971, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 614

 Score = 29.4 bits (66), Expect = 1.0
 Identities = 13/64 (20%), Positives = 15/64 (23%), Gaps = 1/64 (1%)

Query: 86  QQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVA-TVPVPVSAQQQLLSGSHKS 144
           Q  A PQ            Q   A Q     +         TV V   A   +   S   
Sbjct: 385 QPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGTPPTVSVDPPAAVPVNPPSTAP 444

Query: 145 AEVT 148
             V 
Sbjct: 445 QAVR 448


>gnl|CDD|99734 cd00609, AAT_like, Aspartate aminotransferase family. This family
           belongs to pyridoxal phosphate (PLP)-dependent aspartate
           aminotransferase superfamily (fold I). Pyridoxal
           phosphate combines with an alpha-amino acid to form a
           compound called a Schiff base or aldimine intermediate,
           which depending on the reaction, is the substrate in
           four kinds of reactions (1) transamination (movement of
           amino groups), (2) racemization (redistribution of
           enantiomers), (3) decarboxylation (removing COOH
           groups), and (4) various side-chain reactions depending
           on the enzyme involved. Pyridoxal phosphate (PLP)
           dependent enzymes were previously classified into alpha,
           beta and gamma classes, based on the chemical
           characteristics (carbon atom involved) of the reaction
           they catalyzed. The availability of several structures
           allowed a comprehensive analysis of  the evolutionary
           classification of PLP dependent enzymes, and it was
           found that the functional classification did not always
           agree with the evolutionary history of these enzymes.
           The major groups in this CD correspond to Aspartate
           aminotransferase a, b and c, Tyrosine, Alanine,
           Aromatic-amino-acid, Glutamine phenylpyruvate,
           1-Aminocyclopropane-1-carboxylate synthase,
           Histidinol-phosphate, gene products of malY and cobC,
           Valine-pyruvate aminotransferase and Rhizopine
           catabolism regulatory protein.
          Length = 350

 Score = 29.2 bits (66), Expect = 1.1
 Identities = 13/34 (38%), Positives = 22/34 (64%), Gaps = 2/34 (5%)

Query: 25  TPDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
            P++P+G   + E++E L E  +K G  ++IISD
Sbjct: 140 NPNNPTGAVLSEEELEELAELAKKHG--ILIISD 171


>gnl|CDD|200370 TIGR04119, CXXX_matur, CXXX repeat peptide maturase.  This model
          describes a peptide maturase that works with, usually
          fused to, a radical SAM enzyme in a system that
          modifies peptides with multiple tandem repeats of CXXX
          sequences. This protein includes an iron-sulfur cluster
          binding region associated with peptide modification as
          described in domain model TIGR04085.
          Length = 210

 Score = 28.8 bits (65), Expect = 1.2
 Identities = 16/63 (25%), Positives = 25/63 (39%), Gaps = 22/63 (34%)

Query: 32 NKWTAEDIEMLKETVRKFGD------------ELVIISDRIKDRTISQIKSNLKKKAFED 79
          +KW   D+++ K  + K  D            E+ +I+DR+           LKK    D
Sbjct: 36 DKWNNFDLDLYKSQLAKISDYIADLYKKGIIKEINVITDRL----------MLKKMNNCD 85

Query: 80 AGL 82
          AG 
Sbjct: 86 AGE 88


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 29.4 bits (66), Expect = 1.2
 Identities = 17/58 (29%), Positives = 18/58 (31%), Gaps = 5/58 (8%)

Query: 82  LPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLS 139
           L   Q     QQ+     Q   QQ  Q PQ    P    Q T    P P   Q Q   
Sbjct: 243 LQQPQFPGLSQQMPPPPPQPPQQQ-QQPPQPQAQPPPQNQPT----PHPGLPQGQNAP 295



 Score = 29.0 bits (65), Expect = 1.5
 Identities = 17/60 (28%), Positives = 18/60 (30%), Gaps = 3/60 (5%)

Query: 82  LPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRV---VQQTVATVPVPVSAQQQLL 138
            P   Q  PP        QQQ  Q    P     PT      Q   A +P P   Q   L
Sbjct: 248 FPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPL 307



 Score = 29.0 bits (65), Expect = 1.8
 Identities = 18/78 (23%), Positives = 20/78 (25%), Gaps = 3/78 (3%)

Query: 80  AGLPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLS 139
              P Q   AP Q   Q  +  Q  Q     Q  Q P    Q        P   QQ    
Sbjct: 213 QVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQP 272

Query: 140 GSHKSAEVTLNMLNAHPE 157
            +        N    HP 
Sbjct: 273 QAQPPP---QNQPTPHPG 287



 Score = 29.0 bits (65), Expect = 1.9
 Identities = 10/41 (24%), Positives = 15/41 (36%)

Query: 83  PIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQT 123
             Q Q AP     Q  +    QQ     +  Q   ++VQ +
Sbjct: 288 LPQGQNAPLPPPQQPQLLPLVQQPQGQQRGPQFREQLVQLS 328



 Score = 27.4 bits (61), Expect = 4.8
 Identities = 12/62 (19%), Positives = 13/62 (20%)

Query: 82  LPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGS 141
           LP Q       Q      Q          Q  Q P    Q      P P     Q  +  
Sbjct: 236 LPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAP 295

Query: 142 HK 143
             
Sbjct: 296 LP 297


>gnl|CDD|173845 cd01156, IVD, Isovaleryl-CoA dehydrogenase.  Isovaleryl-CoA
          dehydrogenase (IVD) is an acyl-CoA dehydrogenase,
          which catalyzes the third step in leucine catabolism,
          the conversion of isovaleryl-CoA (3-methylbutyryl-CoA)
          into 3-methylcrotonyl-CoA. IVD is a homotetramer and
          has the greatest affinity for small branched chain
          substrates.
          Length = 376

 Score = 28.5 bits (64), Expect = 1.8
 Identities = 11/29 (37%), Positives = 21/29 (72%), Gaps = 2/29 (6%)

Query: 37 EDIEMLKETVRKFG-DELVIISDRIKDRT 64
          ++IEML+++VR+F   E+  ++ +I DR 
Sbjct: 4  DEIEMLRQSVREFAQKEIAPLAAKI-DRD 31


>gnl|CDD|236444 PRK09275, PRK09275, aspartate aminotransferase; Provisional.
          Length = 527

 Score = 28.7 bits (65), Expect = 1.9
 Identities = 9/33 (27%), Positives = 18/33 (54%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
           P +P     + E +E + + V +   +L+II+D
Sbjct: 250 PSNPPSVAMSDESLEKIADIVNEKRPDLMIITD 282


>gnl|CDD|162743 TIGR02172, Fb_sc_TIGR02172, Fibrobacter succinogenes paralogous
           family TIGR02172.  This model describes a paralogous
           family of five proteins, likely to be enzymes, in the
           rumen bacterium Fibrobacter succinogenes S85. Members
           show homology to proteins described by pfam01636, a
           phosphotransferase enzyme family associated with
           resistance to aminoglycoside antibiotics. However,
           members of this family score below the current trusted
           and noise cutoffs for pfam01636.
          Length = 226

 Score = 28.2 bits (63), Expect = 1.9
 Identities = 15/56 (26%), Positives = 25/56 (44%), Gaps = 7/56 (12%)

Query: 15  LADLTLQLHSTPDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKDRTISQIKS 70
            A++  +LHST    S  +         KE +RKF +E   +    K++  + IK 
Sbjct: 100 FAEMAKKLHSTKCDTSTFQSY-------KEKIRKFIEEKDFVPKDYKEKARAFIKE 148


>gnl|CDD|240336 PTZ00260, PTZ00260, dolichyl-phosphate beta-glucosyltransferase;
           Provisional.
          Length = 333

 Score = 28.6 bits (64), Expect = 1.9
 Identities = 14/54 (25%), Positives = 27/54 (50%), Gaps = 10/54 (18%)

Query: 41  MLKETVR----------KFGDELVIISDRIKDRTISQIKSNLKKKAFEDAGLPI 84
           MLKET++          KF  E++I++D  KD+T+   K   ++    +  + +
Sbjct: 88  MLKETIKYLESRSRKDPKFKYEIIIVNDGSKDKTLKVAKDFWRQNINPNIDIRL 141


>gnl|CDD|221503 pfam12275, DUF3616, Protein of unknown function (DUF3616).  This
           family of proteins is found in bacteria. Proteins in
           this family are typically between 335 and 392 amino
           acids in length. There is a conserved GLRGPV sequence
           motif.
          Length = 331

 Score = 28.5 bits (64), Expect = 2.1
 Identities = 15/44 (34%), Positives = 22/44 (50%), Gaps = 3/44 (6%)

Query: 125 ATVP--VPVSAQQQLLSGSHKSAEV-TLNMLNAHPESEVDVEGL 165
           AT+   V + A  ++  G HKS  +     L   P+ E D+EGL
Sbjct: 21  ATIERLVLLEAGGRVRFGGHKSFPLGDFFDLPGEPDKEADIEGL 64


>gnl|CDD|215756 pfam00155, Aminotran_1_2, Aminotransferase class I and II. 
          Length = 357

 Score = 28.4 bits (64), Expect = 2.1
 Identities = 10/37 (27%), Positives = 23/37 (62%), Gaps = 2/37 (5%)

Query: 22  LHSTPDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
           LH++P +P+G   T E++E L +  ++    ++++ D
Sbjct: 147 LHTSPHNPTGTVATLEELEKLLDLAKEHN--ILLLVD 181


>gnl|CDD|222095 pfam13388, DUF4106, Protein of unknown function (DUF4106).  This
           family of proteins are found in large numbers in the
           Trichomonas vaginalis proteome. The function of this
           protein is unknown.
          Length = 422

 Score = 28.5 bits (63), Expect = 2.2
 Identities = 14/47 (29%), Positives = 15/47 (31%)

Query: 90  PPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQ 136
           P QQ   Q   QQ      A Q  Q P +   Q          AQQ 
Sbjct: 203 PTQQPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQP 249



 Score = 26.6 bits (58), Expect = 8.3
 Identities = 18/62 (29%), Positives = 20/62 (32%), Gaps = 4/62 (6%)

Query: 85  QQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGSHKS 144
            QQ        Q  VQ   QQ     Q  Q P +  QQ   T   P     Q   G  +S
Sbjct: 204 TQQPTVQNPAQQPTVQNPAQQ--PQQQPQQQPVQPAQQ--PTPQNPAQQPPQTEQGHKRS 259

Query: 145 AE 146
            E
Sbjct: 260 RE 261


>gnl|CDD|172952 PRK14477, PRK14477, bifunctional nitrogenase molybdenum-cofactor
           biosynthesis protein NifE/NifN; Provisional.
          Length = 917

 Score = 28.6 bits (64), Expect = 2.2
 Identities = 11/28 (39%), Positives = 16/28 (57%)

Query: 31  GNKWTAEDIEMLKETVRKFGDELVIISD 58
           G   T  D+E +KE V  FG + V++ D
Sbjct: 652 GAHLTPADVEEIKEIVEAFGLDPVVVPD 679


>gnl|CDD|165711 PLN00143, PLN00143, tyrosine/nicotianamine aminotransferase;
           Provisional.
          Length = 409

 Score = 28.4 bits (63), Expect = 2.4
 Identities = 11/35 (31%), Positives = 22/35 (62%), Gaps = 2/35 (5%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRI 60
           P +P G+ ++ E +  + ET RK G  +++I+D +
Sbjct: 179 PGNPCGSVYSYEHLNKIAETARKLG--ILVIADEV 211


>gnl|CDD|183514 PRK12414, PRK12414, putative aminotransferase; Provisional.
          Length = 384

 Score = 28.2 bits (63), Expect = 2.6
 Identities = 12/36 (33%), Positives = 21/36 (58%), Gaps = 2/36 (5%)

Query: 25  TPDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRI 60
           TP +PS   ++A D+  L +  R    ++VI+SD +
Sbjct: 170 TPHNPSATVFSAADLARLAQLTR--NTDIVILSDEV 203


>gnl|CDD|202808 pfam03920, TLE_N, Groucho/TLE N-terminal Q-rich domain.  The
           N-terminal domain of the Groucho/TLE co-repressor
           proteins is involved in oligomerisation.
          Length = 134

 Score = 27.1 bits (60), Expect = 3.3
 Identities = 16/51 (31%), Positives = 23/51 (45%), Gaps = 6/51 (11%)

Query: 92  QQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGSH 142
            QI+  + Q+  QQV QA +      R  Q T+A +   +  Q Q    SH
Sbjct: 89  AQILPFLSQEHQQQVAQAVE------RAKQVTMAELNAIIGQQLQAQQLSH 133


>gnl|CDD|227446 COG5116, RPN2, 26S proteasome regulatory complex component
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 926

 Score = 28.0 bits (62), Expect = 3.4
 Identities = 11/49 (22%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 46  VRKFGDELVIISDRIKDRTISQIKSNLKKKAFEDAGLPIQQQVAPPQQI 94
           VRKF   +V++ DR     ++ I++  + K   D   P+         +
Sbjct: 876 VRKFKGGVVVLRDREPKEPVALIETVRQMK---DVNAPLPTPFKVDDNV 921


>gnl|CDD|223157 COG0079, HisC, Histidinol-phosphate/aromatic aminotransferase and
           cobyric acid decarboxylase [Amino acid transport and
           metabolism].
          Length = 356

 Score = 27.7 bits (62), Expect = 4.1
 Identities = 10/33 (30%), Positives = 18/33 (54%), Gaps = 2/33 (6%)

Query: 24  STPDSPSGNKWTAEDIEMLKETVRKFGDELVII 56
             P++P+G     E++  L E + + G  LV+I
Sbjct: 152 CNPNNPTGTLLPREELRALLEALPEGG--LVVI 182


>gnl|CDD|118654 pfam10126, Nit_Regul_Hom, Uncharacterized protein, homolog of
          nitrogen regulatory protein PII.  This domain, found in
          various hypothetical archaeal proteins, has no known
          function. It is distantly similar to the nitrogen
          regulatory protein PII.
          Length = 110

 Score = 26.6 bits (59), Expect = 4.5
 Identities = 11/40 (27%), Positives = 22/40 (55%)

Query: 36 AEDIEMLKETVRKFGDELVIISDRIKDRTISQIKSNLKKK 75
           ED EM  + +R   ++ V+I+  + +  + +I   LK+K
Sbjct: 50 REDPEMAIKAIRDLSEDAVMINTVVSEEKVEKIVELLKEK 89


>gnl|CDD|235285 PRK04335, PRK04335, cell division protein ZipA; Provisional.
          Length = 313

 Score = 27.5 bits (61), Expect = 4.6
 Identities = 26/129 (20%), Positives = 43/129 (33%), Gaps = 9/129 (6%)

Query: 12  FNKLADLTLQLHSTPDSPSGNKWTAEDIEMLKETVRK------FGDELVIISDRIKDRTI 65
           F       L + S  + P+  +  A+  E   E +RK      FG E       I D   
Sbjct: 35  FGDKPLGKLDVDSDDEQPAPERGFAQAPEDDFEIIRKERKEPDFGRENSFHDPLIDDPLF 94

Query: 66  SQIKSN-LKKKAFEDAGLPIQQQVAPPQQIVQQVVQQQHQQVV--QAPQMVQTPTRVVQQ 122
                    K   E+A +P+Q Q  P  + V+  V++   + V  +   +         Q
Sbjct: 95  GGELEEEEDKFEQEEAPIPVQAQSQPQPEKVEPQVEEPRDEEVLEEPEPVAAKVPMAEVQ 154

Query: 123 TVATVPVPV 131
                 + V
Sbjct: 155 PEEETEIEV 163


>gnl|CDD|237264 PRK13007, PRK13007, succinyl-diaminopimelate desuccinylase;
           Reviewed.
          Length = 352

 Score = 27.1 bits (61), Expect = 5.0
 Identities = 12/24 (50%), Positives = 15/24 (62%), Gaps = 1/24 (4%)

Query: 142 HKSAEVTLNMLNAHPESEVDVEGL 165
           HK+A V L  L A+   EV V+GL
Sbjct: 192 HKAAPV-LARLAAYEPREVVVDGL 214


>gnl|CDD|180720 PRK06836, PRK06836, aspartate aminotransferase; Provisional.
          Length = 394

 Score = 27.1 bits (61), Expect = 5.1
 Identities = 10/38 (26%), Positives = 25/38 (65%), Gaps = 4/38 (10%)

Query: 25  TPDSPSGNKWTAEDIE----MLKETVRKFGDELVIISD 58
           +P++P+G  ++ E ++    +L+E  +++G  + +ISD
Sbjct: 176 SPNNPTGVVYSEETLKALAALLEEKSKEYGRPIYLISD 213


>gnl|CDD|227448 COG5118, BDP1, Transcription initiation factor TFIIIB, Bdp1 subunit
           [Transcription].
          Length = 507

 Score = 27.4 bits (60), Expect = 5.2
 Identities = 11/44 (25%), Positives = 23/44 (52%)

Query: 33  KWTAEDIEMLKETVRKFGDELVIISDRIKDRTISQIKSNLKKKA 76
           +W+ ++IE   + +  +G +  +IS    +R   QIK+   K+ 
Sbjct: 367 RWSKKEIEKFYKALSIWGTDFSLISSLFPNRERKQIKAKFIKEE 410


>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421).  This
           family represents a conserved region approximately 350
           residues long within a number of plant proteins of
           unknown function.
          Length = 357

 Score = 27.2 bits (60), Expect = 5.7
 Identities = 15/45 (33%), Positives = 16/45 (35%)

Query: 84  IQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVP 128
            Q Q   PQ   Q   QQQ+Q   Q PQ  Q P    Q       
Sbjct: 134 QQPQAQQPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSG 178


>gnl|CDD|234275 TIGR03599, YloV, DAK2 domain fusion protein YloV.  This model
           describes a protein family that contains an N-terminal
           DAK2 domain (pfam02734), so named because of similarity
           to the dihydroxyacetone kinase family. The
           GTP-binding protein CgtA (a member of the obg family) is
           a bacterial GTPase associated with ribosome biogenesis,
           and it has a characteristic extension (TIGR03595) in
           certain lineages. The protein family described here was
           found, by the method of partial phylogenetic profiling,
           to have a phylogenetic distribution strongly correlated
           to that of TIGR03595. This correlation implies some form
           of functional coupling.
          Length = 530

 Score = 27.1 bits (61), Expect = 6.0
 Identities = 8/23 (34%), Positives = 13/23 (56%)

Query: 36  AEDIEMLKETVRKFGDELVIISD 58
             D E  ++ + K GD LV++ D
Sbjct: 236 KFDEEKFRKELEKLGDSLVVVGD 258


>gnl|CDD|234547 TIGR04330, cas_Cpf1, CRISPR-associated protein Cpf1, subtype PREFRAN.
             This family is the long protein of a novel CRISPR
            subtype, PREFRAN, which is most common in Prevotella and
            Francisella, although widely distributed. The PREFRAN
            type has Cas1, Cas2, and Cas4, but lacks the helicase
            Cas3 and endonuclease Cas3-HD.
          Length = 1287

 Score = 27.1 bits (60), Expect = 6.1
 Identities = 12/53 (22%), Positives = 18/53 (33%), Gaps = 2/53 (3%)

Query: 27   DSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKDRTISQIKSNLKKKAFED 79
            +    N W  E+I + +E      D  +   D    +    I     KK F D
Sbjct: 1144 NKDKNNYWDTEEINLTEEIKALLEDNAINYGDGNCIKE--AIAGKSDKKFFAD 1194


>gnl|CDD|237466 PRK13674, PRK13674, putative GTP cyclohydrolase; Provisional.
          Length = 271

 Score = 26.7 bits (60), Expect = 6.2
 Identities = 19/56 (33%), Positives = 21/56 (37%), Gaps = 3/56 (5%)

Query: 110 PQMVQTPTRVVQQTVATVPVPVSAQQQLLSGSHKSAEVTLNMLNAHPESEVDVEGL 165
           P  V T     Q TVA V + VS       G H S    L  L  H E E+    L
Sbjct: 34  PVRVDTRDGGTQTTVARVDLTVSLPAD-FKGIHMSRLYEL--LEEHAEQELSPASL 86


>gnl|CDD|233290 TIGR01141, hisC, histidinol-phosphate aminotransferase.  Alternate
           names: histidinol-phosphate transaminase; imidazole
           acetol-phosphate transaminase. Histidinol-phosphate
           aminotransferase is a pyridoxal-phosphate dependent
           enzyme [Amino acid biosynthesis, Histidine family].
          Length = 346

 Score = 26.8 bits (60), Expect = 6.3
 Identities = 11/32 (34%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 25  TPDSPSGNKWTAEDIEMLKETVRKFGDELVII 56
           +P++P+GN  +  DIE + E      D LV++
Sbjct: 150 SPNNPTGNLLSRSDIEAVLERTP--EDALVVV 179


>gnl|CDD|185309 PRK15411, rcsA, colanic acid capsular biosynthesis activation
           protein A; Provisional.
          Length = 207

 Score = 26.6 bits (59), Expect = 6.3
 Identities = 11/22 (50%), Positives = 16/22 (72%), Gaps = 2/22 (9%)

Query: 56  ISDR--IKDRTISQIKSNLKKK 75
           ISD+  IK +T+S  K N+K+K
Sbjct: 158 ISDQMNIKAKTVSSHKGNIKRK 179


>gnl|CDD|237722 PRK14476, PRK14476, nitrogenase molybdenum-cofactor biosynthesis
           protein NifN; Provisional.
          Length = 455

 Score = 27.1 bits (61), Expect = 6.4
 Identities = 11/24 (45%), Positives = 15/24 (62%)

Query: 35  TAEDIEMLKETVRKFGDELVIISD 58
           T  DIE L+E +  FG E +I+ D
Sbjct: 180 TPGDIEELREIIEAFGLEPIILPD 203


>gnl|CDD|163513 TIGR03801, asp_4_decarbox, aspartate 4-decarboxylase.  This enzyme,
           aspartate 4-decarboxylase (EC 4.1.1.12), removes the
           side-chain carboxylate from L-aspartate, converting it
           to L-alanine plus carbon dioxide. It is a PLP-dependent
           enzyme, homologous to aspartate aminotransferase (EC
           2.6.1.1) [Energy metabolism, Amino acids and amines].
          Length = 521

 Score = 26.9 bits (60), Expect = 6.6
 Identities = 9/33 (27%), Positives = 17/33 (51%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
           P +P     + E IE + + V     +L+I++D
Sbjct: 249 PSNPPSVAMSDESIEKIVDIVANDRPDLMILTD 281


>gnl|CDD|146221 pfam03465, eRF1_3, eRF1 domain 3.  The release factor eRF1
          terminates protein biosynthesis by recognising stop
          codons at the A site of the ribosome and stimulating
          peptidyl-tRNA bond hydrolysis at the peptidyl
          transferase centre. The crystal structure of human eRF1
          is known. The overall shape and dimensions of eRF1
          resemble a tRNA molecule with domains 1, 2, and 3 of
          eRF1 corresponding to the anticodon loop, aminoacyl
          acceptor stem, and T stem of a tRNA molecule,
          respectively. The position of the essential GGQ motif
          at an exposed tip of domain 2 suggests that the Gln
          residue coordinates a water molecule to mediate the
          hydrolytic activity at the peptidyl transferase centre.
          A conserved groove on domain 1, 80 A from the GGQ
          motif, is proposed to form the codon recognition site.
          This family also includes other proteins for which the
          precise molecular function is unknown. Many of them are
          from Archaebacteria. These proteins may also be
          involved in translation termination but this awaits
          experimental verification.
          Length = 100

 Score = 25.6 bits (57), Expect = 6.9
 Identities = 8/23 (34%), Positives = 12/23 (52%)

Query: 37 EDIEMLKETVRKFGDELVIISDR 59
            IE L E   + G ++ I+SD 
Sbjct: 57 NKIEWLVENAEESGGKVEIVSDE 79


>gnl|CDD|220309 pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex
           non-fungal.  The approx. 70 residue Med15 domain of the
           ARC-Mediator co-activator is a three-helix bundle with
           marked similarity to the KIX domain. The sterol
           regulatory element binding protein (SREBP) family of
           transcription activators use the ARC105 subunit to
           activate target genes in the regulation of cholesterol
           and fatty acid homeostasis. In addition, Med15 is a
           critical transducer of gene activation signals that
           control early metazoan development.
          Length = 768

 Score = 26.9 bits (59), Expect = 7.1
 Identities = 19/58 (32%), Positives = 22/58 (37%), Gaps = 3/58 (5%)

Query: 83  PIQQQVAPPQQIVQQVVQQQHQQVVQA---PQMVQTPTRVVQQTVATVPVPVSAQQQL 137
           P   Q   P    Q  +QQQ Q   Q    PQM Q      QQ +     P  AQ Q+
Sbjct: 194 PQMGQPGMPGGGGQGQMQQQGQPGGQQQQNPQMQQQLQNQQQQQMDQQQGPADAQAQM 251


>gnl|CDD|217220 pfam02771, Acyl-CoA_dh_N, Acyl-CoA dehydrogenase, N-terminal
          domain.  The N-terminal domain of Acyl-CoA
          dehydrogenase is an all-alpha domain.
          Length = 113

 Score = 25.9 bits (58), Expect = 7.2
 Identities = 8/16 (50%), Positives = 13/16 (81%)

Query: 37 EDIEMLKETVRKFGDE 52
          E+ E L++TVR+F +E
Sbjct: 2  EEQEALRDTVREFAEE 17


>gnl|CDD|234022 TIGR02813, omega_3_PfaA, polyketide-type polyunsaturated fatty acid
            synthase PfaA.  Members of the seed for this alignment
            are involved in omega-3 polyunsaturated fatty acid
            biosynthesis, such as the protein PfaA from the
            eicosapentaenoic acid biosynthesis operon in
            Photobacterium profundum strain SS9. PfaA is encoded
            together with PfaB, PfaC, and PfaD, and the functions of
            the individual polypeptides have not yet been described.
            More distant homologs of PfaA, also included within the
            reach of this model, appear to be involved in
            polyketide-like biosynthetic mechanisms of
            polyunsaturated fatty acid biosynthesis, an alternative
            to the more familiar iterated mechanism of chain
            extension and desaturation, and in most cases are encoded
            near genes for homologs of PfaB, PfaC, and/or PfaD.
          Length = 2582

 Score = 26.9 bits (59), Expect = 7.6
 Identities = 12/54 (22%), Positives = 18/54 (33%), Gaps = 8/54 (14%)

Query: 83   PIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPV---PVSA 133
            P+ Q           +       VV  P +   P + V   VA  PV   P++ 
Sbjct: 1137 PVVQVTISVAPAAPVLP-----AVVSPPVVSAAPAQSVATAVAMAPVAEVPIAV 1185


>gnl|CDD|221188 pfam11725, AvrE, Pathogenicity factor.  This family is secreted by
           gram-negative Gammaproteobacteria such as Pseudomonas
           syringae of tomato and the fire blight plant pathogen
           Erwinia amylovora, amongst others. It is an essential
           pathogenicity factor of approximately 198 kDa. Its
           injection into the host-plant is dependent upon the
           bacterial type III or Hrp secretion system. The family
           is long and carries a number of predicted functional
           regions, including an ERMS or endoplasmic reticulum
           membrane retention signal at both the C- and the
           N-termini, a leucine-zipper motif from residues 539-560,
           and a nuclear localisation signal at 1358-1361. This
           conserved AvrE-family of effectors is among the few that
           are required for full virulence of many phytopathogenic
           pseudomonads, erwinias and pantoeas.
          Length = 1771

 Score = 26.7 bits (59), Expect = 8.4
 Identities = 19/103 (18%), Positives = 36/103 (34%), Gaps = 3/103 (2%)

Query: 79  DAGLPIQQQVAPPQQIVQQVVQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLL 138
            A   +QQ    P Q     +  + ++  +    V   +   +Q  A  P  ++      
Sbjct: 23  GAPTGLQQSSESPTQRASHSLASEGKKNRKKMPKVFQKSSAPRQIQAAPPQALNPTA--- 79

Query: 139 SGSHKSAEVTLNMLNAHPESEVDVEGLPEEVKLQFDTTAQQVA 181
           +    S   TL  L A PE + + +        +  T ++ VA
Sbjct: 80  AAPQSSRGPTLRELLALPEDDGETQAPESSPSARRLTRSEGVA 122


>gnl|CDD|237361 PRK13355, PRK13355, bifunctional HTH-domain containing
           protein/aminotransferase; Provisional.
          Length = 517

 Score = 26.6 bits (59), Expect = 8.9
 Identities = 12/38 (31%), Positives = 23/38 (60%), Gaps = 2/38 (5%)

Query: 26  PDSPSGNKWTAEDIEMLKETVRKFGDELVIISDRIKDR 63
           P++P+G  +  E ++ + +  R+   +L+I SD I DR
Sbjct: 290 PNNPTGALYPREVLQQIVDIAREH--QLIIFSDEIYDR 325


>gnl|CDD|234977 PRK01741, PRK01741, cell division protein ZipA; Provisional.
          Length = 332

 Score = 26.6 bits (59), Expect = 9.0
 Identities = 15/125 (12%), Positives = 36/125 (28%), Gaps = 5/125 (4%)

Query: 44  ETVRKFGDELVIISDRIKDRTISQIKSNLKKKAF-----EDAGLPIQQQVAPPQQIVQQV 98
               +   E  +   +  ++++ +IK  L  +            PIQ      Q   Q  
Sbjct: 77  PQTEESESENEVQIQQEVEQSVDEIKITLPNQEPAYYMQNHRSEPIQPTQPQYQSPTQTN 136

Query: 99  VQQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQLLSGSHKSAEVTLNMLNAHPES 158
           V     +  Q+P +         + +      ++A+    +              A  E+
Sbjct: 137 VASMTIEETQSPNVPIEGINSSSEQLRVELAELAAEIYSDASHRVELAKNFMEPQAETEA 196

Query: 159 EVDVE 163
           + +  
Sbjct: 197 QPEAT 201


>gnl|CDD|182312 PRK10216, PRK10216, DNA-binding transcriptional regulator YidZ;
           Provisional.
          Length = 319

 Score = 26.3 bits (58), Expect = 9.0
 Identities = 10/37 (27%), Positives = 17/37 (45%)

Query: 100 QQQHQQVVQAPQMVQTPTRVVQQTVATVPVPVSAQQQ 136
           Q  H  +  AP+  Q   ++ Q  +  +P+P    QQ
Sbjct: 247 QPDHLLLATAPRYCQYYNQLHQLPLVALPLPFDESQQ 283


>gnl|CDD|233628 TIGR01900, dapE-gram_pos, succinyl-diaminopimelate desuccinylase.
           This model represents a clade of
           succinyl-diaminopimelate desuccinylases from
           actinobacteria (high-GC gram positives),
           delta-proteobacteria and aquificales and is based on the
           characterization of the enzyme from Corynebacterium
           glutamicum. This enzyme is involved in the biosynthesis
           of lysine, and is related to the enzyme acetylornithine
           deacetylase and other amidases and peptidases found
           within pfam01546. Other sequences included in the seed
           of this model were assessed to confirm that 1) the
           related genes DapC (succinyl-diaminopimelate
           transaminase) and DapD
           (2,3,4,5-tetrahydropyridine-2,6-dicarboxylate
           N-succinyltransferase) are also found in the genome, 2)
           each is found only once in those genomes, 3) the lysine
           biosynthesis pathway is complete and 4) the direct
           (TIGR03540 or TIGR03542) or acetylated (GenProp0787)
           aminotransferase pathways are absent in these genomes.
           Additionally, a number of the seed members are observed
           adjacent to either DapC or DapD (often as a divergon
           with a putative promoter site between them) [Amino acid
           biosynthesis, Aspartate family].
          Length = 351

 Score = 26.5 bits (59), Expect = 9.1
 Identities = 11/24 (45%), Positives = 15/24 (62%), Gaps = 1/24 (4%)

Query: 142 HKSAEVTLNMLNAHPESEVDVEGL 165
           HK+A + L  L A+   EV V+GL
Sbjct: 190 HKAAPI-LARLAAYEPREVTVDGL 212


>gnl|CDD|211345 cd02868, PseudoU_synth_hTruB2_like, Pseudouridine synthase,
           humanTRUB2_like.  This group consists of eukaryotic
           pseudouridine synthases similar to human TruB
           pseudouridine synthase homolog 2 (TRUB2). Pseudouridine
           synthases catalyze the isomerization of specific
           uridines in an RNA molecule to pseudouridines
           (5-ribosyluracil, psi).
          Length = 226

 Score = 26.2 bits (58), Expect = 9.4
 Identities = 12/35 (34%), Positives = 21/35 (60%), Gaps = 1/35 (2%)

Query: 55  IISDRIKDRTISQIKSNLKKKAFEDAGLPIQQQVA 89
           I  ++I +R ++ I+S  ++KAFE   +  Q Q A
Sbjct: 99  ITREKI-ERLLAVIQSGHQQKAFELCSVDDQSQQA 132


>gnl|CDD|181580 PRK08912, PRK08912, hypothetical protein; Provisional.
          Length = 387

 Score = 26.5 bits (59), Expect = 9.7
 Identities = 9/34 (26%), Positives = 18/34 (52%), Gaps = 2/34 (5%)

Query: 25  TPDSPSGNKWTAEDIEMLKETVRKFGDELVIISD 58
            P +P+G  +  E++ +L E  ++   + V I D
Sbjct: 167 NPLNPAGKVFPREELALLAEFCQRH--DAVAICD 198


>gnl|CDD|225241 COG2366, COG2366, Protein related to penicillin acylase [General
           function prediction only].
          Length = 768

 Score = 26.7 bits (59), Expect = 9.7
 Identities = 11/34 (32%), Positives = 16/34 (47%)

Query: 62  DRTISQIKSNLKKKAFEDAGLPIQQQVAPPQQIV 95
           D  ++ +  NL   +FED G   +    PPQ  V
Sbjct: 396 DEILACLLLNLAATSFEDFGNAAEYFAIPPQNWV 429


>gnl|CDD|238928 cd01966, Nitrogenase_NifN_1, Nitrogenase_nifN1: A subgroup of the
           NifN subunit of the NifEN complex: NifN forms an
           alpha2beta2 tetramer with NifE.  NifN and nifE are
           structurally homologous to nitrogenase MoFe protein beta
           and alpha subunits respectively.  NifEN participates in
           the synthesis of the iron-molybdenum cofactor (FeMoco)
           of the MoFe protein.  NifB-co (an iron and sulfur
           containing precursor of the FeMoco) from NifB is
           transferred to the NifEN complex where it is further
           processed to FeMoco. The nifEN bound precursor of FeMoco
           has been identified as a molybdenum-free, iron- and
           sulfur- containing analog of FeMoco. It has been
           suggested that this nifEN bound precursor also acts as a
           cofactor precursor in nitrogenase systems which require
           a cofactor other than FeMoco: i.e. iron-vanadium
           cofactor (FeVco) or iron only cofactor (FeFeco).
          Length = 417

 Score = 26.4 bits (59), Expect = 10.0
 Identities = 10/24 (41%), Positives = 15/24 (62%)

Query: 35  TAEDIEMLKETVRKFGDELVIISD 58
           T  D+E LK+ +  FG E +I+ D
Sbjct: 169 TPGDVEELKDIIEAFGLEPIILPD 192


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.312    0.126    0.342 

Gapped
Lambda     K      H
   0.267   0.0647    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 8,937,856
Number of extensions: 803864
Number of successful extensions: 1544
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1300
Number of HSP's successfully gapped: 182
Length of query: 182
Length of database: 10,937,602
Length adjustment: 91
Effective length of query: 91
Effective length of database: 6,901,388
Effective search space: 628026308
Effective search space used: 628026308
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 56 (25.5 bits)
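
Note on the statistics above: the following is a minimal sketch, not part of the RPS-BLAST output, showing how several of the summary figures relate to one another under the standard Karlin-Altschul conversions. The function names are hypothetical. It assumes the effective search space is the product of the effective query and database lengths, and that a raw score S converts to bits as (Lambda*S - ln K) / ln 2. Per-hit bit scores and Expect values in the report may differ slightly from this conversion, since RPS-BLAST can apply per-profile scaling.

```python
import math


def effective_search_space(eff_query_len: int, eff_db_len: int) -> int:
    """Product of the effective query and database lengths."""
    return eff_query_len * eff_db_len


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw score to bits: (Lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Values copied from the summary block of the report.
space = effective_search_space(91, 6_901_388)

# S1 uses the ungapped parameters, S2 the gapped ones.
s1_bits = round(bit_score(42, 0.312, 0.126), 1)
s2_bits = round(bit_score(56, 0.267, 0.0647), 1)

print(space)    # compare with "Effective search space: 628026308"
print(s1_bits)  # compare with "S1: 42 (21.9 bits)"
print(s2_bits)  # compare with "S2: 56 (25.5 bits)"
```
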