RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy4141
         (353 letters)



>gnl|CDD|197652 smart00322, KH, K homology RNA-binding domain. 
          Length = 68

 Score = 70.8 bits (174), Expect = 9e-16
 Identities = 27/66 (40%), Positives = 41/66 (62%), Gaps = 1/66 (1%)

Query: 81  VTIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYI-VTPSRDKEPVFEVTGAPDSVEIARQ 139
           VTIEV +P   VGL++G  G+TIK+I+ +T   I +     +E V E+TG P++VE A +
Sbjct: 3   VTIEVLIPADKVGLIIGKGGSTIKKIEEETGVKIDIPGPGSEERVVEITGPPENVEKAAE 62

Query: 140 EIESHI 145
            I   +
Sbjct: 63  LILEIL 68



 Score = 43.8 bits (104), Expect = 4e-06
 Identities = 17/49 (34%), Positives = 23/49 (46%), Gaps = 1/49 (2%)

Query: 4  ISRSGCKIKALRAKTNTYIKTPVRGE-EPVFVVTGRKEDVARAKREILS 51
          I + G  IK +  +T   I  P  G  E V  +TG  E+V +A   IL 
Sbjct: 18 IGKGGSTIKKIEEETGVKIDIPGPGSEERVVEITGPPENVEKAAELILE 66


>gnl|CDD|238053 cd00105, KH-I, K homology RNA-binding domain, type I.  KH binds
           single-stranded RNA or DNA. It is found in a wide
           variety of proteins including ribosomal proteins,
           transcription factors and post-transcriptional modifiers
           of mRNA. There are two different KH domains that belong
           to different protein folds, but they share a single KH
           motif. The KH motif is folded into a beta alpha alpha
           beta unit. In addition to the core, type II KH domains
           (e.g. ribosomal protein S3) include N-terminal extension
           and type I KH domains (e.g. hnRNP K) contain C-terminal
           extension.
          Length = 64

 Score = 68.0 bits (167), Expect = 9e-15
 Identities = 23/64 (35%), Positives = 36/64 (56%), Gaps = 3/64 (4%)

Query: 82  TIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRD---KEPVFEVTGAPDSVEIAR 138
           T  V VP  +VG ++G  G+TIK I+ +T   I  P      +E +  +TG P++VE A+
Sbjct: 1   TERVLVPSSLVGRIIGKGGSTIKEIREETGAKIKIPDSGSGSEERIVTITGTPEAVEKAK 60

Query: 139 QEIE 142
           + I 
Sbjct: 61  ELIL 64



 Score = 46.0 bits (110), Expect = 5e-07
 Identities = 20/50 (40%), Positives = 26/50 (52%), Gaps = 3/50 (6%)

Query: 4  ISRSGCKIKALRAKTNTYIKTPVRG---EEPVFVVTGRKEDVARAKREIL 50
          I + G  IK +R +T   IK P  G   EE +  +TG  E V +AK  IL
Sbjct: 15 IGKGGSTIKEIREETGAKIKIPDSGSGSEERIVTITGTPEAVEKAKELIL 64


>gnl|CDD|215657 pfam00013, KH_1, KH domain.  KH motifs bind RNA in vitro.
           Autoantibodies to Nova, a KH domain protein, cause
           paraneoplastic opsoclonus ataxia.
          Length = 59

 Score = 66.0 bits (162), Expect = 3e-14
 Identities = 19/60 (31%), Positives = 33/60 (55%), Gaps = 1/60 (1%)

Query: 82  TIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAPDSVEIARQEI 141
           T  + +P   VG ++G  G+ IK I+ +T   I  P  D++    ++G P+ VE A++ I
Sbjct: 1   TERILIPPDKVGRIIGKGGSNIKEIREETGVKIRIP-DDRDDTVTISGTPEQVEKAKELI 59



 Score = 42.2 bits (100), Expect = 1e-05
 Identities = 14/46 (30%), Positives = 22/46 (47%), Gaps = 1/46 (2%)

Query: 4  ISRSGCKIKALRAKTNTYIKTPVRGEEPVFVVTGRKEDVARAKREI 49
          I + G  IK +R +T   I+ P    +    ++G  E V +AK  I
Sbjct: 15 IGKGGSNIKEIREETGVKIRIP-DDRDDTVTISGTPEQVEKAKELI 59


>gnl|CDD|222454 pfam13920, zf-C3HC4_3, Zinc finger, C3HC4 type (RING finger). 
          Length = 49

 Score = 58.2 bits (141), Expect = 2e-11
 Identities = 17/48 (35%), Positives = 24/48 (50%), Gaps = 1/48 (2%)

Query: 300 SRQCYLCNDREVTHALIPCGHNFFCSECAERTCDFDRTCPMCRVPVNQ 347
              C +C +R      +PCGH   C ECA+R     + CP+CR P+  
Sbjct: 2   DDLCVICLERPRNVVFLPCGHLCLCEECAKR-LRSKKKCPICRQPIES 48


>gnl|CDD|239087 cd02394, vigilin_like_KH, K homology RNA-binding
           domain_vigilin_like.  The vigilin family is a large and
           extended family of multiple KH-domain proteins,
           including vigilin, also called high density lipoprotein
           binding protein (HBP), fungal Scp160 and bicaudal-C.
           Yeast Scp160p has been shown to bind RNA and to
           associate with both soluble and membrane-bound
           polyribosomes as a mRNP component. Bicaudal-C is a
           RNA-binding molecule believed to function in embryonic
           development at the post-transcriptional level. In
           general, KH binds single-stranded RNA or DNA. It is
           found in a wide variety of proteins including ribosomal
           proteins, transcription factors and post-transcriptional
           modifiers of mRNA.
          Length = 62

 Score = 56.4 bits (137), Expect = 9e-11
 Identities = 20/62 (32%), Positives = 33/62 (53%), Gaps = 1/62 (1%)

Query: 82  TIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYI-VTPSRDKEPVFEVTGAPDSVEIARQE 140
           T EV +P K+   ++G KG+ I++I  +T   I       K     +TG  ++VE A++E
Sbjct: 1   TEEVEIPKKLHRFIIGKKGSNIRKIMEETGVKIRFPDPGSKSDTITITGPKENVEKAKEE 60

Query: 141 IE 142
           I 
Sbjct: 61  IL 62



 Score = 41.0 bits (97), Expect = 3e-05
 Identities = 17/48 (35%), Positives = 26/48 (54%), Gaps = 1/48 (2%)

Query: 4  ISRSGCKIKALRAKTNTYIKTPVRGEEP-VFVVTGRKEDVARAKREIL 50
          I + G  I+ +  +T   I+ P  G +     +TG KE+V +AK EIL
Sbjct: 15 IGKKGSNIRKIMEETGVKIRFPDPGSKSDTITITGPKENVEKAKEEIL 62


>gnl|CDD|206094 pfam13923, zf-C3HC4_2, Zinc finger, C3HC4 type (RING finger). 
          Length = 45

 Score = 46.0 bits (109), Expect = 4e-07
 Identities = 17/44 (38%), Positives = 20/44 (45%), Gaps = 2/44 (4%)

Query: 300 SRQCYLCNDREV-THALIPCGHNFFCSECAERTCDFDRTCPMCR 342
             +C +C D       L PCGH  FC EC  R       CP+CR
Sbjct: 2   ELECPICLDLLRDPVVLTPCGH-VFCRECILRYLKKKSKCPICR 44


>gnl|CDD|222279 pfam13639, zf-RING_2, Ring finger domain. 
          Length = 46

 Score = 41.2 bits (97), Expect = 2e-05
 Identities = 16/47 (34%), Positives = 22/47 (46%), Gaps = 4/47 (8%)

Query: 301 RQCYLCNDREVTHALI---PCGHNFFCSECAERTCDFDRTCPMCRVP 344
            +C +C D       +   PCGH  F  EC ++      TCP+CR P
Sbjct: 1   DECPICLDEFEPGEEVVVLPCGH-VFHKECLDKWLRSSNTCPLCRAP 46


>gnl|CDD|238093 cd00162, RING, RING-finger (Really Interesting New Gene) domain, a
           specialized type of Zn-finger of 40 to 60 residues that
           binds two atoms of zinc; defined by the 'cross-brace'
           motif C-X2-C-X(9-39)-C-X(1-3)-
           H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved
           in mediating protein-protein interactions; identified in
           proteins with a wide range of functions such as viral
           replication, signal transduction, and development; has
           two variants, the C3HC4-type and a C3H2C3-type (RING-H2
           finger), which have different cysteine/histidine
           pattern; a subset of RINGs are associated with B-Boxes
           (C-X2-H-X7-C-X7-C-X2-C-H-X2-H).
          Length = 45

 Score = 40.1 bits (94), Expect = 4e-05
 Identities = 16/46 (34%), Positives = 24/46 (52%), Gaps = 3/46 (6%)

Query: 302 QCYLCNDR-EVTHALIPCGHNFFCSECAERTCD-FDRTCPMCRVPV 345
           +C +C +       L+PCGH F C  C ++       TCP+CR P+
Sbjct: 1   ECPICLEEFREPVVLLPCGHVF-CRSCIDKWLKSGKNTCPLCRTPI 45


>gnl|CDD|221895 pfam13014, KH_3, KH domain.  KH motifs bind RNA in vitro. This
           RNA-binding domain is required for the efficient
           anchoring of ASH1-mRNA to the distal tip of the daughter
           cell. ASH1 is a specific repressor of transcription that
           localizes asymmetrically to the daughter cell nucleus.
           RNA localisation is a widespread mechanism for achieving
           localised protein synthesis.
          Length = 42

 Score = 39.8 bits (94), Expect = 4e-05
 Identities = 14/41 (34%), Positives = 20/41 (48%), Gaps = 3/41 (7%)

Query: 92  VGLVVGPKGATIKRIQHQTNTYIVTPSR---DKEPVFEVTG 129
           VG ++G  G TIK I+ +T   I  P       E +  +TG
Sbjct: 2   VGAIIGKGGETIKEIREETGAKIQIPKPEPGSGERIVTITG 42



 Score = 26.8 bits (60), Expect = 2.2
 Identities = 11/37 (29%), Positives = 16/37 (43%), Gaps = 3/37 (8%)

Query: 4  ISRSGCKIKALRAKTNTYI---KTPVRGEEPVFVVTG 37
          I + G  IK +R +T   I   K      E +  +TG
Sbjct: 6  IGKGGETIKEIREETGAKIQIPKPEPGSGERIVTITG 42


>gnl|CDD|239089 cd02396, PCBP_like_KH, K homology RNA-binding domain, PCBP_like.
           Members of this group possess KH domains in a tandem
           arrangement. Most members, similar to the poly(C)
           binding proteins (PCBPs) and Nova, containing three KH
           domains, with the first and second domains, which are
           represented here, in tandem arrangement, followed by a
           large spacer region, with the third domain near the
           C-terminal end of the protein. The poly(C) binding
           proteins (PCBPs) can be divided into two groups, hnRNPs
           K/J and the alphaCPs, which share a triple KH domain
           configuration and  poly(C) binding specificity. They
           play roles in mRNA stabilization, translational
           activation, and translational silencing. Nova-1 and
           Nova-2 are nuclear RNA-binding proteins that regulate
           splicing. This group also contains plant proteins that
           seem to have two tandem repeat arrangements, like Hen4,
           a protein that plays a role in  AGAMOUS (AG) pre-mRNA
           processing, an important step in plant development. In
           general, KH binds single-stranded RNA or DNA. It is
           found in a wide variety of proteins including ribosomal
           proteins, transcription factors and post-transcriptional
           modifiers of mRNA.
          Length = 65

 Score = 38.2 bits (90), Expect = 3e-04
 Identities = 18/59 (30%), Positives = 27/59 (45%), Gaps = 4/59 (6%)

Query: 87  VPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDK----EPVFEVTGAPDSVEIARQEI 141
           VP    G ++G  G+TIK I+ +T   I           E V  ++G P +V+ A   I
Sbjct: 6   VPSSQAGSIIGKGGSTIKEIREETGAKIRVSKSVLPGSTERVVTISGKPSAVQKALLLI 64



 Score = 32.1 bits (74), Expect = 0.047
 Identities = 14/51 (27%), Positives = 22/51 (43%), Gaps = 4/51 (7%)

Query: 4  ISRSGCKIKALRAKTNTYIK----TPVRGEEPVFVVTGRKEDVARAKREIL 50
          I + G  IK +R +T   I+          E V  ++G+   V +A   IL
Sbjct: 15 IGKGGSTIKEIREETGAKIRVSKSVLPGSTERVVTISGKPSAVQKALLLIL 65


>gnl|CDD|239092 cd02409, KH-II, KH-II  (K homology RNA-binding domain, type II).
           KH binds single-stranded RNA or DNA. It is found in a
           wide variety of proteins including ribosomal proteins
           (e.g. ribosomal protein S3), transcription factors (e.g.
           NusA_K), and post-transcriptional modifiers of mRNA
           (e.g. hnRNP K). There are two different KH domains that
           belong to different protein folds, but they share a
           single KH motif. The KH motif is a
           beta-alpha-alpha-beta-beta unit that folds into an
           alpha-beta structure with a three stranded beta-sheet
           interrupted by two contiguous helices. In addition to
           their KH core domain, KH-II proteins have an N-terminal
           alpha helical extension while KH-I proteins have a
           C-terminal alpha helical extension.
          Length = 68

 Score = 36.5 bits (85), Expect = 0.001
 Identities = 10/34 (29%), Positives = 16/34 (47%)

Query: 81  VTIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYI 114
           + I + V     GLV+G KG  I+ +Q      +
Sbjct: 25  IEIIIVVARGQPGLVIGKKGQNIRALQKLLQKLL 58


>gnl|CDD|214546 smart00184, RING, Ring finger.  E3 ubiquitin-protein ligase
           activity is intrinsic to the RING domain of c-Cbl and is
           likely to be a general function of this domain; Various
           RING fingers exhibit binding activity towards E2
           ubiquitin-conjugating enzymes (Ubc's).
          Length = 40

 Score = 35.2 bits (81), Expect = 0.002
 Identities = 13/41 (31%), Positives = 21/41 (51%), Gaps = 3/41 (7%)

Query: 303 CYLCNDREVTHA-LIPCGHNFFCSECAERTCDFD-RTCPMC 341
           C +C +  +    ++PCGH F C  C  +  +    TCP+C
Sbjct: 1   CPICLEEYLKDPVILPCGHTF-CRSCIRKWLESGNNTCPIC 40


>gnl|CDD|233043 TIGR00599, rad18, DNA repair protein rad18.  All proteins in this
           family for which functions are known are involved in
           nucleotide excision repair. This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 397

 Score = 37.7 bits (87), Expect = 0.006
 Identities = 23/72 (31%), Positives = 30/72 (41%), Gaps = 4/72 (5%)

Query: 273 DASPVNPSSIWSYPPVSSTSPSGSISGSRQCYLCNDREVTHALIPCGHNFFCSECAERTC 332
           D   +  SS W   P+ S  P   +  S +C++C D      L  C H F CS C  R  
Sbjct: 2   DELDITDSSDWLTTPIPSLYP---LDTSLRCHICKDFFDVPVLTSCSHTF-CSLCIRRCL 57

Query: 333 DFDRTCPMCRVP 344
                CP+CR  
Sbjct: 58  SNQPKCPLCRAE 69


>gnl|CDD|224019 COG1094, COG1094, Predicted RNA-binding protein (contains KH
           domains) [General function prediction only].
          Length = 194

 Score = 36.9 bits (86), Expect = 0.007
 Identities = 16/54 (29%), Positives = 27/54 (50%), Gaps = 4/54 (7%)

Query: 90  KVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAPDSVEIARQEIES 143
           ++ G ++G +G T + I+  T  YI      K     + G  + VEIAR+ +E 
Sbjct: 111 RIKGRIIGREGKTRRAIEELTGVYISV--YGKT--VAIIGGFEQVEIAREAVEM 160


>gnl|CDD|239088 cd02395, SF1_like-KH, Splicing factor 1 (SF1) K homology
           RNA-binding domain (KH). Splicing factor 1 (SF1)
           specifically recognizes the intron branch point sequence
           (BPS) UACUAAC in the pre-mRNA transcripts during
           spliceosome assembly. We show that the KH-QUA2 region of
           SF1 defines an enlarged KH (hnRNP K) fold which is
           necessary and sufficient for BPS binding. KH binds
           single-stranded RNA or DNA. It is found in a wide
           variety of proteins including ribosomal proteins,
           transcription factors and post-transcriptional modifiers
           of mRNA.
          Length = 120

 Score = 35.3 bits (82), Expect = 0.011
 Identities = 22/88 (25%), Positives = 38/88 (43%), Gaps = 13/88 (14%)

Query: 82  TIEVRVP------YKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFE--VTGAPDS 133
           T +V +P      Y  VGL++GP+G T+K+++ +T   I    R K  + +         
Sbjct: 1   TEKVYIPVKQYPKYNFVGLILGPRGNTLKQLEKETGAKISI--RGKGSMKDGKKEEELRG 58

Query: 134 VEIARQEIESHIIRRTGSCVTPAEAVLN 161
            + A      H++    +  TP E  L 
Sbjct: 59  PKYAHLNEPLHVLI---TAETPPEEALA 83


>gnl|CDD|237494 PRK13763, PRK13763, putative RNA-processing protein; Provisional.
          Length = 180

 Score = 35.6 bits (83), Expect = 0.019
 Identities = 23/89 (25%), Positives = 42/89 (47%), Gaps = 7/89 (7%)

Query: 79  GHVTIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAPDSVEI-- 136
             +   V++P   +G+++G KG T K I+ +T   +   S   E + E T   D + +  
Sbjct: 1   MMMMEYVKIPKDRIGVLIGKKGETKKEIEERTGVKLEIDSETGEVIIEPTDGEDPLAVLK 60

Query: 137 ARQEIESHIIRRTGSCVTPAEAVLNGDDN 165
           AR      I++  G   +P +A+   DD+
Sbjct: 61  AR-----DIVKAIGRGFSPEKALRLLDDD 84



 Score = 29.5 bits (67), Expect = 1.7
 Identities = 42/156 (26%), Positives = 65/156 (41%), Gaps = 42/156 (26%)

Query: 1   MKKI-SRSGCKIKALRAKTNTYIKTPVRGEEPVFVVTGRKEDVARAKREILSA-ADHFSA 58
            K+I  R+G K++ + ++T   I  P  GE+P+         V +A R+I+ A    FS 
Sbjct: 25  KKEIEERTGVKLE-IDSETGEVIIEPTDGEDPL--------AVLKA-RDIVKAIGRGFSP 74

Query: 59  LRASRKSGALS--------PLSPPTGVPGHVTIEVRVPYKVVGLVVGPKGATIKRIQHQT 110
            +A R    L          LS     P  +        ++ G ++G  G T + I+  T
Sbjct: 75  EKALR---LLDDDYVLEVIDLSDYGDSPNALR-------RIKGRIIGEGGKTRRIIEELT 124

Query: 111 NTYIVTPSRDKEPVFEVT----GAPDSVEIARQEIE 142
              I         V+  T    G P+ VEIAR+ IE
Sbjct: 125 GVDIS--------VYGKTVAIIGDPEQVEIAREAIE 152


>gnl|CDD|211858 TIGR03665, arCOG04150, arCOG04150 universal archaeal KH domain
           protein.  This family of proteins is universal among the
           41 archaeal genomes analyzed in and is not observed
           outside of the archaea. The proteins contain a single KH
           domain (pfam00013) which is likely to confer the ability
           to bind RNA.
          Length = 172

 Score = 34.8 bits (81), Expect = 0.028
 Identities = 21/83 (25%), Positives = 39/83 (46%), Gaps = 6/83 (7%)

Query: 84  EVRVPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKE-PVFEVTGAPDSVEIARQEIE 142
            V++P   +G+++G  G T K I+ +T   +   S   E  + E    P +V  AR    
Sbjct: 1   YVKIPKDRIGVLIGKGGETKKEIEERTGVKLDIDSETGEVKIEEEDEDPLAVMKAR---- 56

Query: 143 SHIIRRTGSCVTPAEAVLNGDDN 165
             +++  G   +P +A+   DD+
Sbjct: 57  -EVVKAIGRGFSPEKALKLLDDD 78



 Score = 31.0 bits (71), Expect = 0.50
 Identities = 16/50 (32%), Positives = 24/50 (48%), Gaps = 4/50 (8%)

Query: 93  GLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAPDSVEIARQEIE 142
           G ++G  G T + I+  T   I      K     + G P+ V+IAR+ IE
Sbjct: 101 GRIIGEGGKTRRIIEELTGVSISV--YGKT--VGIIGDPEQVQIAREAIE 146


>gnl|CDD|236995 PRK11824, PRK11824, polynucleotide phosphorylase/polyadenylase;
           Provisional.
          Length = 693

 Score = 35.8 bits (84), Expect = 0.030
 Identities = 27/90 (30%), Positives = 39/90 (43%), Gaps = 19/90 (21%)

Query: 58  ALRASRKSGALSPLSPPTGVPGHVTIE-VRVPYKVVGLVVGPKGATIKRIQHQTNTYIVT 116
           A+   R    LSP +P         IE +++P   +  V+GP G TI+ I  +T   I  
Sbjct: 540 AISEPRAE--LSPYAP--------RIETIKIPPDKIRDVIGPGGKTIREITEETGAKI-- 587

Query: 117 PSRDKEPVFEVT-GAPD--SVEIARQEIES 143
              D E    V   A D  + E A++ IE 
Sbjct: 588 ---DIEDDGTVKIAATDGEAAEAAKERIEG 614


>gnl|CDD|224768 COG1855, COG1855, ATPase (PilT family) [General function prediction
           only].
          Length = 604

 Score = 35.8 bits (83), Expect = 0.031
 Identities = 26/83 (31%), Positives = 37/83 (44%), Gaps = 16/83 (19%)

Query: 77  VPGHVTIE--------VRVPYKVVGLVVGPKGATIKRIQHQTNTYI-VTPSRDKEPVFEV 127
           +PG V +E        V+VP K +  V+G  G  IK I+ +    I V P  ++E   +V
Sbjct: 474 LPGDVEVEVVGDGRAVVKVPEKYIPKVIGKGGKRIKEIEKKLGIKIDVKPLEEEEEGEKV 533

Query: 128 TGAPDSVEIARQEIESHIIRRTG 150
                 VEI   E   HI+   G
Sbjct: 534 P-----VEIE--EKGKHIVLYVG 549


>gnl|CDD|227719 COG5432, RAD18, RING-finger-containing E3 ubiquitin ligase [Signal
           transduction mechanisms].
          Length = 391

 Score = 35.4 bits (81), Expect = 0.037
 Identities = 15/42 (35%), Positives = 18/42 (42%), Gaps = 1/42 (2%)

Query: 303 CYLCNDREVTHALIPCGHNFFCSECAERTCDFDRTCPMCRVP 344
           C +C+ R        CGH  FCS C  R       CP+CR  
Sbjct: 28  CRICDCRISIPCETTCGHT-FCSLCIRRHLGTQPFCPVCRED 68


>gnl|CDD|131743 TIGR02696, pppGpp_PNP, guanosine pentaphosphate synthetase
           I/polynucleotide phosphorylase.  Sohlberg, et al present
           characterization of two proteins from Streptomyces
           coelicolor. The protein in this family was shown to have
           poly(A) polymerase activity and may be responsible for
           polyadenylating RNA in this species. Reference 2 showed
           that a nearly identical plasmid-encoded protein from
           Streptomyces antibioticus is a bifunctional enzyme that
           acts also as a guanosine pentaphosphate synthetase.
          Length = 719

 Score = 35.2 bits (81), Expect = 0.047
 Identities = 21/61 (34%), Positives = 29/61 (47%), Gaps = 2/61 (3%)

Query: 83  IEVRVPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAPDSVEIARQEIE 142
           I V++P   +G V+GPKG  I +IQ +T   I     D   V+       S E AR  I 
Sbjct: 580 ITVKIPVDKIGEVIGPKGKMINQIQDETGAEISI--EDDGTVYIGAADGPSAEAARAMIN 637

Query: 143 S 143
           +
Sbjct: 638 A 638


>gnl|CDD|239086 cd02393, PNPase_KH, Polynucleotide phosphorylase (PNPase) K
           homology RNA-binding domain (KH). PNPase is a
           polyribonucleotide nucleotidyl transferase that degrades
           mRNA in prokaryotes and plant chloroplasts. The
           C-terminal region of PNPase contains domains homologous
           to those in other RNA binding proteins: a KH domain and
           an S1 domain. KH domains bind single-stranded RNA and
           are found in a wide variety of proteins including
           ribosomal proteins, transcription factors and
           post-transcriptional modifiers of mRNA.
          Length = 61

 Score = 31.7 bits (73), Expect = 0.050
 Identities = 14/52 (26%), Positives = 26/52 (50%), Gaps = 4/52 (7%)

Query: 92  VGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAP-DSVEIARQEIE 142
           +  V+GP G TIK+I  +T   I     + +    +  +  ++ E A++ IE
Sbjct: 13  IRDVIGPGGKTIKKIIEETGVKIDI---EDDGTVYIAASDKEAAEKAKKMIE 61


>gnl|CDD|227568 COG5243, HRD1, HRD ubiquitin ligase complex, ER membrane component
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 491

 Score = 34.6 bits (79), Expect = 0.069
 Identities = 17/61 (27%), Positives = 26/61 (42%), Gaps = 14/61 (22%)

Query: 298 SGSRQCYLCND-------------REVTHALIPCGHNFFCSECAERTCDFDRTCPMCRVP 344
           +  R C +C D              ++T   +PCGH      C +   +  +TCP+CR P
Sbjct: 285 NSDRTCTICMDEMFHPDHEPLPRGLDMTPKRLPCGHILHL-HCLKNWLERQQTCPICRRP 343

Query: 345 V 345
           V
Sbjct: 344 V 344


>gnl|CDD|215715 pfam00097, zf-C3HC4, Zinc finger, C3HC4 type (RING finger).  The
           C3HC4 type zinc-finger (RING finger) is a cysteine-rich
           domain of 40 to 60 residues that coordinates two zinc
           ions, and has the consensus sequence:
           C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C
           where X is any amino acid. Many proteins containing a
           RING finger play a key role in the ubiquitination
           pathway.
          Length = 40

 Score = 29.7 bits (67), Expect = 0.16
 Identities = 14/41 (34%), Positives = 23/41 (56%), Gaps = 3/41 (7%)

Query: 303 CYLCNDR-EVTHALIPCGHNFFCSECAERTCDF-DRTCPMC 341
           C +C +  +    ++PCGH  FCS+C     +  + TCP+C
Sbjct: 1   CPICLEEPKDPVTILPCGHL-FCSKCILSWLESGNVTCPLC 40


>gnl|CDD|220401 pfam09786, CytochromB561_N, Cytochrome B561, N terminal.  Members
           of this family are found in the N terminal region of
           cytochrome B561, as well as in various other putative
           uncharacterized proteins.
          Length = 559

 Score = 33.6 bits (77), Expect = 0.16
 Identities = 24/105 (22%), Positives = 40/105 (38%), Gaps = 9/105 (8%)

Query: 207 NFNMPLSSSQMN-HHVFSGSSGCSSASSSSSSSACAPHSSTQLDLGSIWSGMSSLDKDEG 265
           +  M  S   +  H  FS S   S++ S   S +     S QL   +  +  SS  +   
Sbjct: 126 STPMNTSEPLVPGHSSFSDSPSRSASPSRKFSPSSTIQQSPQLTPSNKPASPSSSYQ--- 182

Query: 266 LGDSPSFDAS-PVNPSSIWSYPPVSSTSPSGSISGSRQCYLCNDR 309
              SPS+ +S     SS       SS   +   SG ++    +++
Sbjct: 183 ---SPSYSSSLGPVNSSGNRSNLRSSPW-ALRSSGDKKDITTDEK 223


>gnl|CDD|143417 cd07099, ALDH_DDALDH, Methylomonas sp.
           4,4'-diapolycopene-dialdehyde dehydrogenase-like.  The
           4,4'-diapolycopene-dialdehyde dehydrogenase (DDALDH)
           involved in C30 carotenoid synthesis in Methylomonas sp.
           strain 16a and other similar sequences are present in
           this CD. DDALDH converts 4,4'-diapolycopene-dialdehyde
           into 4,4'-diapolycopene-diacid.
          Length = 453

 Score = 33.0 bits (76), Expect = 0.24
 Identities = 22/67 (32%), Positives = 28/67 (41%), Gaps = 10/67 (14%)

Query: 36  TGRKEDVARAKREILSAADHFS--ALRASRKSGALSPLSPPTG--VPGHVTIEVRVPYKV 91
           TG+    A    E+L A +     A  A R    L+P   PTG  +P         PY V
Sbjct: 68  TGKPRADAGL--EVLLALEAIDWAARNAPR---VLAPRKVPTGLLMPNKKATVEYRPYGV 122

Query: 92  VGLVVGP 98
           VG V+ P
Sbjct: 123 VG-VISP 128


>gnl|CDD|221705 pfam12678, zf-rbx1, RING-H2 zinc finger.  There are 8 cysteine/
           histidine residues which are proposed to be the
           conserved residues involved in zinc binding. The
           protein, of which this domain is the conserved region,
           participates in diverse functions relevant to chromosome
           metabolism and cell cycle control.
          Length = 73

 Score = 30.1 bits (68), Expect = 0.28
 Identities = 11/26 (42%), Positives = 12/26 (46%), Gaps = 3/26 (11%)

Query: 318 CGHNF-FCSECAERTCDFDRTCPMCR 342
           CGH F     C  R      TCP+CR
Sbjct: 50  CGHAFHLH--CISRWLKTRNTCPLCR 73


>gnl|CDD|227861 COG5574, PEX10, RING-finger-containing E3 ubiquitin ligase
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 271

 Score = 32.2 bits (73), Expect = 0.34
 Identities = 15/46 (32%), Positives = 20/46 (43%), Gaps = 3/46 (6%)

Query: 299 GSRQCYLCNDREVTHALIPCGHNFFCSEC--AERTCDFDRTCPMCR 342
              +C+LC +     +  PCGH  FC  C     T      CP+CR
Sbjct: 214 ADYKCFLCLEEPEVPSCTPCGH-LFCLSCLLISWTKKKYEFCPLCR 258


>gnl|CDD|206613 pfam14447, Prok-RING_4, Prokaryotic RING finger family 4.  RING
           finger family domain found sporadically in bacteria. The
           finger is fused to an N-terminal alpha-helical domain,
           ROT/Trove-like repeats and a C-terminal TerD domain. The
           architecture suggests a possible role in an
           RNA-processing complex.
          Length = 55

 Score = 29.2 bits (66), Expect = 0.41
 Identities = 14/45 (31%), Positives = 17/45 (37%), Gaps = 7/45 (15%)

Query: 303 CYLCNDREVTHALIPCGHNFFCSECAERTCDFDR--TCPMCRVPV 345
           C  C        L+PCGH        + T D +R   CP C  P 
Sbjct: 10  CLFCGTVGTKGVLLPCGH-LIP----DGTFDGERYNGCPFCGTPF 49


>gnl|CDD|188567 TIGR04052, AZL_007920_fam, AZL_007920/MXAN_0976 family protein.
           Members of this rare protein family regularly occur next
           to a member of the MXAN_0977 subfamily (TIGR04039) of
           the di-heme cytochrome c peroxidase/MauG family
           (pfam03150). MauG itself (TIGR03791) is a protein
           modification enzyme responsible for the tryptophan
           tryptophylquinone (TTQ) modification involved in
           methylamine dehydrogenase activation. All members of
           this family have a motif of four spaced invariant Cys
           residues, while additional homologs outside the scope of
           this family lack the four Cys residues.
          Length = 206

 Score = 31.3 bits (71), Expect = 0.56
 Identities = 20/75 (26%), Positives = 34/75 (45%), Gaps = 16/75 (21%)

Query: 205 EFNFNMPLSSSQMNHHVFSGSSGCSSASSSSSSSACA-------------PHSST-QLDL 250
           + + N  +  S     V  GS+GC+ + +   SSAC              P+S   +LDL
Sbjct: 109 DVSPNASVGKST-GWVVHLGSTGCAGSPARGESSACTNPNRLPVTLPGFDPNSQKVELDL 167

Query: 251 GSIWSGMSSLDKDEG 265
            +++ G S+L  + G
Sbjct: 168 AALFEG-SNLGANPG 181


>gnl|CDD|222944 PHA02929, PHA02929, N1R/p28-like protein; Provisional.
          Length = 238

 Score = 31.3 bits (71), Expect = 0.59
 Identities = 12/27 (44%), Positives = 14/27 (51%), Gaps = 1/27 (3%)

Query: 318 CGHNFFCSECAERTCDFDRTCPMCRVP 344
           C H  FC EC +       TCP+CR P
Sbjct: 200 CNH-VFCIECIDIWKKEKNTCPVCRTP 225


>gnl|CDD|203707 pfam07650, KH_2, KH domain. 
          Length = 77

 Score = 29.0 bits (66), Expect = 0.69
 Identities = 8/29 (27%), Positives = 16/29 (55%), Gaps = 2/29 (6%)

Query: 81  VTIEVRVPYKVVGLVVGPKGATIKRIQHQ 109
           V + +R      G+V+G  G+ IK++  +
Sbjct: 27  VIVVIRTSQP--GIVIGKGGSNIKKLGKE 53


>gnl|CDD|131219 TIGR02164, torA, trimethylamine-N-oxide reductase TorA.  This
          very narrowly defined family represents TorA, part of a
          family of related molybdoenzymes that include biotin
          sulfoxide reductases, dimethyl sulfoxide reductases,
          and at least two different subfamilies of
          trimethylamine-N-oxide reductases. A single enzyme from
          the larger family may have more than one activity. TorA
          typically is located in the periplasm, has a Tat
          (twin-arginine translocation)-dependent signal
          sequence, and is encoded in a torCAD operon.
          Length = 822

 Score = 31.8 bits (72), Expect = 0.70
 Identities = 18/58 (31%), Positives = 25/58 (43%), Gaps = 10/58 (17%)

Query: 47 REILSAADHFSALRASRKSGALSPLSP------PT----GVPGHVTIEVRVPYKVVGL 94
           E  +   H+ A RA  K+G +  + P      PT    G+ G V    RV Y +V L
Sbjct: 38 DEWKTTGSHWGAFRAKVKNGKVVEVKPFELDKYPTEMINGIRGMVYNPSRVRYPMVRL 95


>gnl|CDD|237909 PRK15102, PRK15102, trimethylamine N-oxide reductase I catalytic
          subunit; Provisional.
          Length = 825

 Score = 31.6 bits (72), Expect = 0.71
 Identities = 19/62 (30%), Positives = 26/62 (41%), Gaps = 10/62 (16%)

Query: 43 ARAKREILSAADHFSALRASRKSGALSPLSP------PT----GVPGHVTIEVRVPYKVV 92
          A   +E +    H+ A RA  K+G      P      PT    G+ GHV    R+ Y +V
Sbjct: 37 AETTKEWILTGSHWGAFRAKVKNGRFVEAKPFELDKYPTKMINGIKGHVYNPSRIRYPMV 96

Query: 93 GL 94
           L
Sbjct: 97 RL 98


>gnl|CDD|184311 PRK13764, PRK13764, ATPase; Provisional.
          Length = 602

 Score = 31.3 bits (72), Expect = 0.85
 Identities = 29/132 (21%), Positives = 44/132 (33%), Gaps = 37/132 (28%)

Query: 28  GEE----PVFVVTGRKEDVARAKREILSAADHFSALRASRKSGALSPLSPPTGVPGHVTI 83
           GE+    PV     +      A++EI      +                    +PG V +
Sbjct: 436 GEQTVVVPVEEEEEKSPVWRLAEKEIEREIKRY--------------------LPGPVEV 475

Query: 84  E--------VRVPYKVVGLVVGPKGATIKRIQHQTNTYIVTPSRDKEPVFEVTGAPDSVE 135
           E        V VP K +  V+G  G  IK+I+ +    I     D+EP        +  E
Sbjct: 476 EVVSDNKAVVYVPEKDIPKVIGKGGKRIKKIEKKLGIDIDVRPLDEEP----GEEAEEGE 531

Query: 136 IARQEIES-HII 146
               E    H+I
Sbjct: 532 EVTVEETKKHVI 543


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 30.5 bits (69), Expect = 1.6
 Identities = 18/80 (22%), Positives = 26/80 (32%)

Query: 223 SGSSGCSSASSSSSSSACAPHSSTQLDLGSIWSGMSSLDKDEGLGDSPSFDASPVNPSSI 282
           S SS    + S S SS  +  + +     S  S             S S   + V+P   
Sbjct: 287 SSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPS 346

Query: 283 WSYPPVSSTSPSGSISGSRQ 302
            S  P  S  P  +   S +
Sbjct: 347 PSRSPSPSRPPPPADPSSPR 366



 Score = 29.4 bits (66), Expect = 3.8
 Identities = 22/90 (24%), Positives = 32/90 (35%), Gaps = 1/90 (1%)

Query: 223 SGSSGCSSASSSSSSSACAPHSSTQLDLGSIWSGMSSLDKDEGLG-DSPSFDASPVNPSS 281
           S     SS+SSS  SS+ +  SS++   G+  S   S  +        P  D S      
Sbjct: 310 SSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRP 369

Query: 282 IWSYPPVSSTSPSGSISGSRQCYLCNDREV 311
             S  P S  + +G  +  R       R  
Sbjct: 370 RPSRAPSSPAASAGRPTRRRARAAVAGRAR 399


>gnl|CDD|143397 cd07078, ALDH, NAD(P)+ dependent aldehyde dehydrogenase family.
           The aldehyde dehydrogenase family (ALDH) of NAD(P)+
           dependent enzymes, in general, oxidize a wide range of
           endogenous and exogenous aliphatic and aromatic
           aldehydes to their corresponding carboxylic acids and
           play an  important role in detoxification. Besides
           aldehyde detoxification, many ALDH isozymes possess
           multiple additional catalytic and non-catalytic
           functions such as participating in  metabolic pathways,
           or as  binding proteins, or as osmoregulants, to mention
           a few. The enzyme has three domains, a NAD(P)+
           cofactor-binding domain, a catalytic domain, and a
           bridging domain; and the active enzyme  is generally
           either homodimeric or homotetrameric. The catalytic
           mechanism is proposed to involve cofactor binding,
           resulting in a conformational change and activation of
           an invariant catalytic cysteine nucleophile. The
           cysteine and aldehyde substrate form an oxyanion
           thiohemiacetal intermediate resulting in hydride
           transfer to the cofactor and formation of a
           thioacylenzyme intermediate. Hydrolysis of the
           thioacylenzyme and release of the carboxylic acid
           product occurs, and in most cases, the reduced cofactor
           dissociates from the enzyme. The evolutionary
           phylogenetic tree of ALDHs appears to have an initial
           bifurcation between what has been characterized as the
           classical aldehyde dehydrogenases, the ALDH family
           (ALDH) and extended family members or aldehyde
           dehydrogenase-like (ALDH-like) proteins. The ALDH
           proteins are represented by enzymes which share a number
           of highly conserved residues necessary for catalysis and
           cofactor binding and they include such proteins as
           retinal dehydrogenase, 10-formyltetrahydrofolate
           dehydrogenase, non-phosphorylating glyceraldehyde
           3-phosphate dehydrogenase,
           delta(1)-pyrroline-5-carboxylate dehydrogenases,
           alpha-ketoglutaric semialdehyde dehydrogenase,
           alpha-aminoadipic semialdehyde dehydrogenase, coniferyl
           aldehyde dehydrogenase and succinate-semialdehyde
           dehydrogenase.  Included in this larger group are all
           human, Arabidopsis, Tortula, fungal, protozoan, and
           Drosophila ALDHs identified in families ALDH1 through
           ALDH22 with the exception of families ALDH18, ALDH19,
           and ALDH20 which are present in the ALDH-like group.
          Length = 432

 Score = 30.3 bits (69), Expect = 1.7
 Identities = 20/64 (31%), Positives = 27/64 (42%), Gaps = 8/64 (12%)

Query: 35  VTGRKEDVARAKREILSAAD--HFSALRASRKSGALSPLSPPTGVPGHVTIEVRVPYKVV 92
            TG+  + A    E+  AAD   + A  A R  G       P+  PG + I  R P  VV
Sbjct: 47  ETGKPIEEALG--EVARAADTFRYYAGLARRLHGE----VIPSPDPGELAIVRREPLGVV 100

Query: 93  GLVV 96
           G + 
Sbjct: 101 GAIT 104


>gnl|CDD|218222 pfam04710, Pellino, Pellino.  Pellino is involved in Toll-like
           signalling pathways, and associates with the kinase
           domain of the Pelle Ser/Thr kinase.
          Length = 416

 Score = 30.2 bits (68), Expect = 1.8
 Identities = 16/45 (35%), Positives = 18/45 (40%), Gaps = 12/45 (26%)

Query: 312 THALIPCGHNFFCSECAER----------TCDFDRTCPMCRVPVN 346
           THA +PCGH   CSE              T  F   CP C  P+ 
Sbjct: 359 THAFVPCGH--VCSEKTALYWAQIPLPHGTHAFHAACPFCATPLA 401


>gnl|CDD|239049 cd02134, NusA_KH, NusA_K homology RNA-binding domain (KH). NusA is
           an essential multifunctional transcription elongation
           factor that is universally conserved among prokaryotes
           and archaea. NusA anti-termination function plays an
           important role in the expression of ribosomal rrn
           operons. During transcription of many other genes,
           NusA-induced RNAP pausing provides a mechanism for
           synchronizing transcription and translation. The
           N-terminal RNAP-binding domain (NTD) is connected
           through a flexible hinge helix to three globular
           domains, S1, KH1 and KH2. The KH motif is a
           beta-alpha-alpha-beta-beta unit that folds into an
           alpha-beta structure with a three stranded beta-sheet
           interrupted by two contiguous helices.
          Length = 61

 Score = 27.5 bits (62), Expect = 1.9
 Identities = 8/33 (24%), Positives = 12/33 (36%)

Query: 82  TIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYI 114
              V VP   +GL +G  G  ++         I
Sbjct: 26  RARVVVPDDQLGLAIGKGGQNVRLASKLLGEKI 58


>gnl|CDD|226194 COG3668, ParE, Plasmid stabilization system protein [General
           function prediction only].
          Length = 98

 Score = 28.2 bits (63), Expect = 2.0
 Identities = 13/74 (17%), Positives = 25/74 (33%), Gaps = 6/74 (8%)

Query: 36  TGRKEDVARAKREILSAADHFSALRASRKSGALSPLSPPTGVPGHVTIEVRVPYKVVGLV 95
             R+   + A+R + +    F +L    + G          + G   I     + +    
Sbjct: 21  IARRFGPSAARRYVRALETAFESLAEFPEIG-----RSRDEIRGGRRIVPYGSHYIFYYR 75

Query: 96  VGPKGATIKRIQHQ 109
           VG +   I R+ H 
Sbjct: 76  VGGR-VLILRVLHG 88


>gnl|CDD|234271 TIGR03591, polynuc_phos, polyribonucleotide nucleotidyltransferase.
            Members of this protein family are polyribonucleotide
           nucleotidyltransferase, also called polynucleotide
           phosphorylase. Some members have been shown also to have
           additional functions as guanosine pentaphosphate
           synthetase and as poly(A) polymerase (see model
           TIGR02696 for an exception clade, within this family)
           [Transcription, Degradation of RNA].
          Length = 684

 Score = 30.2 bits (69), Expect = 2.0
 Identities = 13/37 (35%), Positives = 19/37 (51%), Gaps = 2/37 (5%)

Query: 78  PGHVTIEVRVPYKVVGLVVGPKGATIKRIQHQTNTYI 114
           P   TI++  P K+   V+GP G  I+ I  +T   I
Sbjct: 550 PRIETIKIN-PDKI-RDVIGPGGKVIREITEETGAKI 584


>gnl|CDD|227503 COG5176, MSL5, Splicing factor (branch point binding protein) [RNA
           processing and modification].
          Length = 269

 Score = 29.6 bits (66), Expect = 2.1
 Identities = 11/33 (33%), Positives = 20/33 (60%), Gaps = 2/33 (6%)

Query: 92  VGLVVGPKGATIKRIQHQTNT--YIVTPSRDKE 122
           VGL++GP+G+T+K+++  +     I      KE
Sbjct: 165 VGLLIGPRGSTLKQLERISRAKIAIRGSGSVKE 197


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 29.9 bits (67), Expect = 2.4
 Identities = 29/128 (22%), Positives = 43/128 (33%), Gaps = 6/128 (4%)

Query: 135 EIARQEIESHIIRRTGSCVTPAEAVLNGDDNSADLLASLCNSGLGSLGTILNYVNGTSGP 194
           +     I +H  +   S +   E V+    N +   +S  N    +     +   G+ G 
Sbjct: 269 QHHSIGINNHHSKHADSKLQTIE-VIENHSNKSRPSSSSTNGSKETTSNSSSAAAGSIGS 327

Query: 195 ASDSYGAGPGEFNFNMPLSSSQMNHHVFSGSSGCSSASSSSSSSACAP-HSSTQLDLGSI 253
            S            N    SS  +H   +GS   SS S + S    A   SS   D    
Sbjct: 328 KSSKSAKHSNRNKSN----SSPKSHSSANGSVPSSSVSDNESKQKRASKSSSGARDSKKD 383

Query: 254 WSGMSSLD 261
            SGMS+  
Sbjct: 384 ASGMSANG 391


>gnl|CDD|215562 PLN03078, PLN03078, Putative tRNA pseudouridine synthase;
           Provisional.
          Length = 513

 Score = 29.9 bits (67), Expect = 2.4
 Identities = 22/90 (24%), Positives = 31/90 (34%), Gaps = 8/90 (8%)

Query: 210 MPLSSSQMNHHVF---SGSSGCSSASSSSSSSACAPHSSTQLDLGSIWSGMSSLDKDEGL 266
           +P    Q N  V      S   SS+ S  +    +      L   SI SG S  +     
Sbjct: 251 LPGKHKQRNGAVSRRAKSSKEMSSSESEENHGEISEEDEEDLSFSSIPSGSSDEN----- 305

Query: 267 GDSPSFDASPVNPSSIWSYPPVSSTSPSGS 296
            D   F +S V   + W + P  +   S S
Sbjct: 306 EDILKFQSSQVQIRARWLHEPDETDRISAS 335


>gnl|CDD|221368 pfam11999, DUF3494, Protein of unknown function (DUF3494).  This
           family of proteins is functionally uncharacterized. This
           protein is found in bacteria, archaea and eukaryotes.
           Proteins in this family are typically between 243 to 678
           amino acids in length. This protein has a single
           completely conserved residue G that may be functionally
           important.
          Length = 196

 Score = 29.2 bits (66), Expect = 2.8
 Identities = 15/56 (26%), Positives = 21/56 (37%), Gaps = 2/56 (3%)

Query: 149 TGSCVT--PAEAVLNGDDNSADLLASLCNSGLGSLGTILNYVNGTSGPASDSYGAG 202
             + +T  P   V +G   +AD  A      +  L T  N   G + P     GAG
Sbjct: 25  AATAITGFPLGVVSSGTIYAADYAAPTATQAVSDLTTAYNDAAGRTTPDYTGLGAG 80


>gnl|CDD|227909 COG5622, COG5622, Protein required for attachment to host cells
           [Cell motility and secretion].
          Length = 139

 Score = 28.6 bits (64), Expect = 3.1
 Identities = 11/49 (22%), Positives = 18/49 (36%), Gaps = 2/49 (4%)

Query: 199 YGAGPGEFNFNMPLSSSQ--MNHHVFSGSSGCSSASSSSSSSACAPHSS 245
           +     +   N+P        N H   G+    S+SSS+  S+     S
Sbjct: 23  FRNQGDKATPNLPAKLVLDIDNDHHGRGARQSHSSSSSNPDSSREEEDS 71


>gnl|CDD|171842 PRK13023, PRK13023, bifunctional preprotein translocase subunit
           SecD/SecF; Reviewed.
          Length = 758

 Score = 29.6 bits (66), Expect = 3.3
 Identities = 19/77 (24%), Positives = 31/77 (40%), Gaps = 6/77 (7%)

Query: 116 TPSRDKEPVFEVTGAPDSVEIARQEIESHIIRRTGSCVTPAEAVLNGDDNSADLLASLCN 175
           TP  D E V+     P    + +  I       TG  +T A+A ++ DD    +  +L +
Sbjct: 139 TPPADSEIVYSFDDPPVGYLLKKTPI------LTGHDITDAKASISADDGQPVITLTLDD 192

Query: 176 SGLGSLGTILNYVNGTS 192
           +G   L  +    N  S
Sbjct: 193 NGRRRLADLTAQGNENS 209


>gnl|CDD|218608 pfam05495, zf-CHY, CHY zinc finger.  This family of domains are
           likely to bind to zinc ions. They contain many conserved
           cysteine and histidine residues. We have named this
           domain after the N-terminal motif CXHY. This domain can
           be found in isolation in some proteins, but is also
           often associated with pfam00097. One of the proteins in
           this family is a mitochondrial intermembrane space
           protein called Hot13. This protein is involved in the
           assembly of small TIM complexes.
          Length = 74

 Score = 27.3 bits (61), Expect = 3.4
 Identities = 13/50 (26%), Positives = 17/50 (34%), Gaps = 10/50 (20%)

Query: 303 CYLCNDREVTHALIPC-GHNFFCSEC------AERTCDFDR---TCPMCR 342
           C LC+D    H L         C  C       E  C  +     CP+C+
Sbjct: 22  CRLCHDELEDHPLDRWNVKAVLCGVCRTEQTVQEYNCGVEFADYYCPICK 71


>gnl|CDD|219133 pfam06682, DUF1183, Protein of unknown function (DUF1183).  This
           family consists of several eukaryotic proteins of around
           360 residues in length. The function of this family is
           unknown.
          Length = 317

 Score = 29.3 bits (66), Expect = 3.4
 Identities = 20/72 (27%), Positives = 30/72 (41%)

Query: 175 NSGLGSLGTILNYVNGTSGPASDSYGAGPGEFNFNMPLSSSQMNHHVFSGSSGCSSASSS 234
           +SG GS GT         G  +     G   + F    +++      +   S   S SSS
Sbjct: 234 SSGYGSGGTRSGQGGWGPGFWTGLGAGGALGYLFGSRRNNNSSYGRSYGSGSPSYSPSSS 293

Query: 235 SSSSACAPHSST 246
           S+SS+ +  SST
Sbjct: 294 SNSSSSSSSSST 305


>gnl|CDD|224106 COG1185, Pnp, Polyribonucleotide nucleotidyltransferase
           (polynucleotide phosphorylase) [Translation, ribosomal
           structure and biogenesis].
          Length = 692

 Score = 29.1 bits (66), Expect = 4.3
 Identities = 26/92 (28%), Positives = 40/92 (43%), Gaps = 23/92 (25%)

Query: 58  ALRASRKSGALSPLSPPTGVPGHVTIE-VRVPYKVVGLVVGPKGATIKRIQHQTNTYIVT 116
           A+   RK   LSP +P         IE +++    +  V+GP G TIK I  +T   I  
Sbjct: 538 AISEPRKE--LSPYAP--------RIETIKIDPDKIRDVIGPGGKTIKAITEETGVKI-- 585

Query: 117 PSRDKEP-----VFEVTGAPDSVEIARQEIES 143
              D E      +    G  +S + A++ IE+
Sbjct: 586 ---DIEDDGTVKIAASDG--ESAKKAKERIEA 612


>gnl|CDD|236873 PRK11186, PRK11186, carboxy-terminal protease; Provisional.
          Length = 667

 Score = 28.7 bits (65), Expect = 5.3
 Identities = 16/36 (44%), Positives = 22/36 (61%), Gaps = 6/36 (16%)

Query: 91  VVGLVVGPKGATIKRIQHQ-----TNTYIVTPSRDK 121
           VV L+ GPKG+ + R++       T T IVT +RDK
Sbjct: 301 VVALIKGPKGSKV-RLEILPAGKGTKTRIVTLTRDK 335


>gnl|CDD|177010 CHL00071, tufA, elongation factor Tu.
          Length = 409

 Score = 28.8 bits (65), Expect = 5.4
 Identities = 14/37 (37%), Positives = 17/37 (45%), Gaps = 8/37 (21%)

Query: 10  KIKALRAKTNTYIKTPVRGE--------EPVFVVTGR 38
           KI  L    ++YI TP R          E VF +TGR
Sbjct: 198 KIYNLMDAVDSYIPTPERDTDKPFLMAIEDVFSITGR 234


>gnl|CDD|223128 COG0050, TufB, GTPases - translation elongation factors
           [Translation, ribosomal structure and biogenesis].
          Length = 394

 Score = 28.4 bits (64), Expect = 6.8
 Identities = 18/54 (33%), Positives = 25/54 (46%), Gaps = 9/54 (16%)

Query: 10  KIKALRAKTNTYIKTPVRGE--------EPVFVVTGRKEDV-ARAKREILSAAD 54
           KI+ L    ++YI TP R          E VF ++GR   V  R +R IL   +
Sbjct: 188 KIEELMDAVDSYIPTPERDIDKPFLMPVEDVFSISGRGTVVTGRVERGILKVGE 241


>gnl|CDD|226801 COG4357, COG4357, Zinc finger domain containing protein (CHY type)
           [Function unknown].
          Length = 105

 Score = 26.7 bits (59), Expect = 8.0
 Identities = 18/57 (31%), Positives = 21/57 (36%), Gaps = 16/57 (28%)

Query: 303 CYLCNDREVTHALIPCGHNFF------CSEC------AE-RTCDFDRTCPMCRVPVN 346
           CY C+D    H   P G   F      C  C      AE   C    +CP C+ P N
Sbjct: 38  CYHCHDELEDHPFEPWGLQEFNPKAIICGVCRKLLTRAEYGMCG---SCPYCQSPFN 91


>gnl|CDD|239210 cd02844, PAZ_CAF_like, PAZ domain, CAF_like subfamily. CAF (for
           carpel factory) is a plant homolog of Dicer. CAF has
           been implicated in flower morphogenesis and in early
           Arabidopsis development and might function through
           posttranscriptional regulation of specific mRNA
           molecules. PAZ domains are named after the proteins
           Piwi, Argonaut, and Zwille. PAZ is found in two families
           of proteins that are essential components of
           RNA-mediated gene-silencing pathways, including RNA
           interference, the Piwi and Dicer families. PAZ functions
           as a nucleic-acid binding domain, with a strong
           preference for single-stranded nucleic acids (RNA or
           DNA) or RNA duplexes with single-stranded 3' overhangs.
           It has been suggested that the PAZ domain provides a
           unique mode for the recognition of the two 3'-terminal
           nucleotides in single-stranded nucleic acids and buries
           the 3' OH group, and that it might recognize
           characteristic 3' overhangs in siRNAs within RISC
           (RNA-induced silencing) and other complexes.
          Length = 135

 Score = 27.0 bits (60), Expect = 8.3
 Identities = 16/49 (32%), Positives = 19/49 (38%), Gaps = 10/49 (20%)

Query: 227 GCSSASSSSSSSACAPHS------STQLDLGSIWSGMSSLDKDEGLGDS 269
           G   A     S   APH+      S  LDL    +  SS    EGLG +
Sbjct: 24  GSFCACDLKGSVVTAPHNGRFYVISGILDL----NANSSFPGKEGLGYA 68


>gnl|CDD|143467 cd07149, ALDH_y4uC, Uncharacterized ALDH (y4uC) with similarity to
           Tortula ruralis aldehyde dehydrogenase ALDH21A1.
           Uncharacterized aldehyde dehydrogenase (ORF name y4uC)
           with sequence similarity to the moss Tortula ruralis
           aldehyde dehydrogenase ALDH21A1 (RNP123) believed to
           play an important role in the detoxification of
           aldehydes generated in response to desiccation- and
           salinity-stress, and similar sequences are included in
           this CD.
          Length = 453

 Score = 27.9 bits (63), Expect = 8.5
 Identities = 19/70 (27%), Positives = 28/70 (40%), Gaps = 14/70 (20%)

Query: 38  RKEDVAR------------AKREILSAAD--HFSALRASRKSGALSPLSPPTGVPGHVTI 83
           R+E+ AR            A++E+  A +    SA  A R +G   P     G  G +  
Sbjct: 59  RREEFARTIALEAGKPIKDARKEVDRAIETLRLSAEEAKRLAGETIPFDASPGGEGRIGF 118

Query: 84  EVRVPYKVVG 93
            +R P  VV 
Sbjct: 119 TIREPIGVVA 128


>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
          Length = 1463

 Score = 28.2 bits (62), Expect = 9.1
 Identities = 20/45 (44%), Positives = 25/45 (55%)

Query: 195  ASDSYGAGPGEFNFNMPLSSSQMNHHVFSGSSGCSSASSSSSSSA 239
            A    GAG      ++  SSS M+    S SS CSS+SSSS SS+
Sbjct: 1229 AQPHVGAGAMPPCPDLSESSSTMHSSSSSSSSSCSSSSSSSDSSS 1273


>gnl|CDD|112657 pfam03854, zf-P11, P-11 zinc finger. 
          Length = 50

 Score = 25.2 bits (55), Expect = 9.3
 Identities = 9/36 (25%), Positives = 17/36 (47%)

Query: 315 LIPCGHNFFCSECAERTCDFDRTCPMCRVPVNQAMR 350
           L+ C  ++ C  C +        CP+C+ P+   +R
Sbjct: 15  LVTCSDHYLCLRCLQLLLSVSERCPICKKPLPTKLR 50


>gnl|CDD|235957 PRK07193, fliF, flagellar MS-ring protein; Reviewed.
          Length = 552

 Score = 28.0 bits (63), Expect = 9.5
 Identities = 9/15 (60%), Positives = 10/15 (66%)

Query: 255 SGMSSLDKDEGLGDS 269
           SG+  LDKD  LG S
Sbjct: 113 SGLELLDKDSPLGTS 127


>gnl|CDD|220662 pfam10265, DUF2217, Uncharacterized conserved protein (DUF2217).
           This is a family of conserved proteins of from 500 - 600
           residues found from worms to humans. Its function is not
           known.
          Length = 515

 Score = 27.8 bits (62), Expect = 9.9
 Identities = 20/93 (21%), Positives = 30/93 (32%), Gaps = 7/93 (7%)

Query: 190 GTSGPASDSYGAGPGEFNFNMPLSSSQMNHHVFSGSSGCSSASSSSSSSACAPHSSTQLD 249
           GT  P S   G        +     +       S  S   S SS S +S    +SS+   
Sbjct: 51  GTRRPLSRKIGKCSSRRVRSPSSKPNDTLSGASSKLSSKHSGSSHSLASVSDRNSSSS-- 108

Query: 250 LGSIWSGMSSLDKDEGLGDSPSFDASPVNPSSI 282
            GS  +  S     E  G     + +   P ++
Sbjct: 109 -GSCANSGSW----EAAGMEEPINTTDTTPENL 136


>gnl|CDD|227827 COG5540, COG5540, RING-finger-containing ubiquitin ligase
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 374

 Score = 27.7 bits (61), Expect = 9.9
 Identities = 13/50 (26%), Positives = 21/50 (42%), Gaps = 9/50 (18%)

Query: 302 QCYLC------NDREVTHALIPCGHNFFCSECAERTCDFDRTCPMCRVPV 345
           +C +C      NDR     ++PC H F      +    +   CP+CR  +
Sbjct: 325 ECAICMSNFIKNDRLR---VLPCDHRFHVGCVDKWLLGYSNKCPVCRTAI 371


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.129    0.385 

Gapped
Lambda     K      H
   0.267   0.0737    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 17,253,066
Number of extensions: 1605028
Number of successful extensions: 1432
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1395
Number of HSP's successfully gapped: 96
Length of query: 353
Length of database: 10,937,602
Length adjustment: 98
Effective length of query: 255
Effective length of database: 6,590,910
Effective search space: 1680682050
Effective search space used: 1680682050
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 59 (26.5 bits)
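
The footer statistics above are linked by simple arithmetic. The sketch below is not part of the RPS-BLAST output. It recomputes the effective lengths and search space from the reported length adjustment, and converts the raw S and X values to bits with the usual Karlin-Altschul relation. The variable and function names are illustrative. Which Lambda/K pair applies to each parameter (ungapped for S1 and X1, gapped for S2, X2 and X3) is an assumption inferred from the numbers agreeing. Since the printed Lambda and K are rounded, agreement is only expected to one decimal place.

```python
import math

# Values copied from the statistics footer of this report.
query_len = 353
db_letters = 10_937_602
db_seqs = 44_354
length_adjustment = 98

# The length adjustment is subtracted once from the query
# and once per database sequence.
eff_query = query_len - length_adjustment
eff_db = db_letters - db_seqs * length_adjustment
search_space = eff_query * eff_db

# Lambda and K as printed (rounded) in the footer.
UNGAPPED = (0.314, 0.129)
GAPPED = (0.267, 0.0737)


def raw_to_bits(raw, lam, k):
    """Bit score from a raw score: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def xdrop_to_bits(raw, lam):
    """X-dropoff values are score differences, so K drops out."""
    return lam * raw / math.log(2)


s1_bits = raw_to_bits(42, *UNGAPPED)
s2_bits = raw_to_bits(59, *GAPPED)
x1_bits = xdrop_to_bits(16, UNGAPPED[0])
x2_bits = xdrop_to_bits(38, GAPPED[0])
x3_bits = xdrop_to_bits(64, GAPPED[0])

print(f"Effective length of query:    {eff_query}")
print(f"Effective length of database: {eff_db:,}")
print(f"Effective search space:       {search_space}")
print(f"S1: {s1_bits:.1f} bits, S2: {s2_bits:.1f} bits")
print(f"X1: {x1_bits:.1f}, X2: {x2_bits:.1f}, X3: {x3_bits:.1f} bits")
```

The computed values match the footer lines for the effective lengths, the effective search space, and the bit equivalents of S1, S2, X1, X2 and X3. The per-hit Expect values are not reproduced here, because they may include corrections that the footer does not show.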