RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy13382
         (237 letters)



>gnl|CDD|146145 pfam03357, Snf7, Snf7.  This family of proteins are involved in
           protein sorting and transport from the endosome to the
           vacuole/lysosome in eukaryotic cells. Vacuoles/lysosomes
           play an important role in the degradation of both lipids
           and cellular proteins. In order to perform this
           degradative function, vacuoles/lysosomes contain
           numerous hydrolases which have been transported in the
           form of inactive precursors via the biosynthetic pathway
           and are proteolytically activated upon delivery to the
           vacuole/lysosome. The delivery of transmembrane
           proteins, such as activated cell surface receptors to
           the lumen of the vacuole/lysosome, either for
           degradation/downregulation, or in the case of
           hydrolases, for proper localisation, requires the
           formation of multivesicular bodies (MVBs). These late
           endosomal structures are formed by invaginating and
           budding of the limiting membrane into the lumen of the
           compartment. During this process, a subset of the
           endosomal membrane proteins is sorted into the forming
           vesicles. Mature MVBs fuse with the vacuole/lysosome,
           thereby releasing cargo containing vesicles into its
           hydrolytic lumen for degradation. Endosomal proteins
           that are not sorted into the intralumenal MVB vesicles
           are either recycled back to the plasma membrane or Golgi
           complex, or remain in the limiting membrane of the MVB
           and are thereby transported to the limiting membrane of
           the vacuole/lysosome as a consequence of fusion.
           Therefore, the MVB sorting pathway plays a critical role
           in the decision between recycling and degradation of
           membrane proteins. A few archaeal sequences are also
           present within this family.
          Length = 169

 Score =  138 bits (350), Expect = 2e-41
 Identities = 58/167 (34%), Positives = 103/167 (61%), Gaps = 2/167 (1%)

Query: 17  KNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAKKF 76
           +   +L KA+R+LD++++ +E++ KKL A+IKK+AK+G  ++  I+ K   R  K   + 
Sbjct: 1   EAILSLRKAIRELDKKQESLEKKIKKLEAEIKKLAKKGNKDAALILLKQKKRYEKQLDQL 60

Query: 77  MMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRILQEFEKQSEIM 136
               AN++ V + I+  ++   +  AMKG ++AM+ MN+ M++ +I  ++ E E Q E  
Sbjct: 61  DGQLANLEQVRMAIENAKTNQEVLNAMKGGAKAMKAMNKNMDIDKIDDLMDEIEDQMEKA 120

Query: 137 DMKEEMMNDAMDDAMDADEDEDETNAVVTQVLDELGLQLGDQLASIP 183
           D   EM++D +DDA   +EDE+E +A +  +LDE+G +   +L S P
Sbjct: 121 DEISEMLSDTLDDA--DEEDEEELDAELDALLDEIGDEELVELPSAP 165


>gnl|CDD|227778 COG5491, VPS24, Conserved protein implicated in secretion [Cell
           motility and secretion].
          Length = 204

 Score = 66.0 bits (161), Expect = 3e-13
 Identities = 47/165 (28%), Positives = 79/165 (47%), Gaps = 8/165 (4%)

Query: 37  EQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAK--KFMMMKANIQAVSLKIQTLR 94
           E+Q KKL+ ++K+ AK+GQ+   +I  K   R R   +  K    ++ + A   ++Q+L 
Sbjct: 6   ERQAKKLVRELKQEAKKGQVLLNEIAKKAPNRRRLAEELYKLRKARSRLDASISRLQSLD 65

Query: 95  SQNAMAEAMKGCSRAMQNMNRQMN-LPQIQRILQEFEKQSEIMD---MKEEMMNDAMDDA 150
           +       M+  S  M      MN L  I+RI+Q FE Q   ++   ++ E M++ MD  
Sbjct: 66  TMLFEKVVMRQVSGDMAKAAMYMNELESIRRIMQLFETQFLALELVQLRLETMDELMDVV 125

Query: 151 MDADEDEDETNA--VVTQVLDELGLQLGDQLASIPDPASSMMSDK 193
           +     ED      +V +VL E+GL+L +   S+P       S  
Sbjct: 126 VGDPVLEDLEELDELVNKVLPEIGLELDESEQSLPANVVENGSVP 170


>gnl|CDD|240422 PTZ00446, PTZ00446, vacuolar sorting protein SNF7-like;
           Provisional.
          Length = 191

 Score = 33.5 bits (76), Expect = 0.044
 Identities = 22/147 (14%), Positives = 74/147 (50%)

Query: 24  KAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAKKFMMMKANI 83
           +A+  L++++  +E++ K+L  + K+  ++ QM + KI+ K      +  +  +  +  +
Sbjct: 34  EAIDALEKKQVQVEKKIKQLEIEAKQKVEQNQMSNAKILLKRKKLYEQEIENILNNRLTL 93

Query: 84  QAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRILQEFEKQSEIMDMKEEMM 143
           +   + ++ +        A+   +   + +N ++N  ++++I+   ++  +I +   + +
Sbjct: 94  EDNMINLENMHLHKIAVNALSYAANTHKKLNNEINTQKVEKIIDTIQENKDIQEEINQAL 153

Query: 144 NDAMDDAMDADEDEDETNAVVTQVLDE 170
           +  + + +D DE + E + +  Q ++E
Sbjct: 154 SFNLLNNVDDDEIDKELDLLKEQTMEE 180


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 33.3 bits (76), Expect = 0.12
 Identities = 19/69 (27%), Positives = 30/69 (43%), Gaps = 1/69 (1%)

Query: 7    RKITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDL 66
            +K TP+ +  K+  AL K +  LD+E    E+  +KL        + G  + V   A   
Sbjct: 1146 KKTTPKSLWLKDLDALEKELDKLDKEDAKAEEAREKLQRA-AARGESGAAKKVSRQAPKK 1204

Query: 67   VRTRKYAKK 75
               +K  KK
Sbjct: 1205 PAPKKTTKK 1213


>gnl|CDD|235152 PRK03705, PRK03705, glycogen debranching enzyme; Provisional.
          Length = 658

 Score = 32.3 bits (74), Expect = 0.19
 Identities = 13/28 (46%), Positives = 16/28 (57%), Gaps = 4/28 (14%)

Query: 203 GSGSNHSNNHGGGGGSTLSDADADLQAR 230
           G+ +N+SNNHG  G      AD DL  R
Sbjct: 468 GTNNNYSNNHGKEG----LGADLDLVER 491


>gnl|CDD|221784 pfam12810, Gly_rich, Glycine rich protein.  This family of proteins
           is greatly expanded in Trichomonas vaginalis. The
           proteins are composed of several glycine rich motifs
           interspersed through the sequence. Although many
           proteins have been annotated by similarity in the family,
           these annotations, given the biased composition of the
           sequences, are unlikely to be functionally
           relevant.
          Length = 248

 Score = 31.8 bits (73), Expect = 0.21
 Identities = 9/21 (42%), Positives = 14/21 (66%), Gaps = 3/21 (14%)

Query: 202 GGSGSNHSNNH---GGGGGST 219
           GG G N ++++   G GGG+T
Sbjct: 75  GGDGGNDNSSNDGSGSGGGAT 95


>gnl|CDD|240425 PTZ00464, PTZ00464, SNF-7-like protein; Provisional.
          Length = 211

 Score = 31.7 bits (72), Expect = 0.22
 Identities = 38/208 (18%), Positives = 80/208 (38%), Gaps = 12/208 (5%)

Query: 1   MEWLFGR-KITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESV 59
           M  LFG+   TP+  L    + +      +D     ++ +  KL   I++     Q    
Sbjct: 1   MNRLFGKKNKTPKPTLEDASKRIGGRSEVVDARINKIDAELMKLKEQIQRTRGMTQSRHK 60

Query: 60  KIMAKDLVRTRKYAKKFMMMKA---NIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQ 116
           +   + L + R Y  +  MM     N+  +    ++++      +AMK  ++ ++   ++
Sbjct: 61  QRAMQLLQQKRMYQNQQDMMMQQQFNMDQLQFTTESVKDTKVQVDAMKQAAKTLKKQFKK 120

Query: 117 MNLPQIQRILQEFEKQSEIMDMKEEMMNDAMDDAMDADEDEDETNAVVTQVLDELGLQLG 176
           +N+ +++ +  E     E     +E+M  A D   D DEDE          LD L   + 
Sbjct: 121 LNVDKVEDLQDELADLYEDTQEIQEIMGRAYDVPDDIDEDEMLGE------LDALDFDME 174

Query: 177 DQLASIPDPASSMMSDKGKTPVAIPGGS 204
            +  +     +  ++  G     +P   
Sbjct: 175 KEADA--SYLADALAVPGTKLPDVPTDE 200


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 31.6 bits (72), Expect = 0.42
 Identities = 20/182 (10%), Positives = 59/182 (32%), Gaps = 2/182 (1%)

Query: 7    RKITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDL 66
            +  TP++M  ++     +A+ + +  ++    +E++L +  K  A + +   +K   K  
Sbjct: 1122 KNTTPKDMWLEDLDKFEEALEEQEEVEEKEIAKEQRLKSKTKGKASKLRKPKLKKKEKKK 1181

Query: 67   VRTRKYAKKFMMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRIL 126
             ++     K   +  N + V    +         +         ++   Q   P+   + 
Sbjct: 1182 KKSSADKSKKASVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVK 1241

Query: 127  QEFEKQSEIMDMKEEMMNDAMDDAMDADEDEDETNAVVTQVLDELGLQLGDQLASIPDPA 186
            +   K++      E+    + DD     + ++         +         +     +  
Sbjct: 1242 RLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPK--RVSAVQYSPPPPSKRPDGESNGG 1299

Query: 187  SS 188
            S 
Sbjct: 1300 SK 1301


>gnl|CDD|213889 TIGR04059, Ald_deCOase, long-chain fatty aldehyde decarbonylase.
           This cyanobacterial family of fatty aldehyde
           decarbonylases acts on mainly C16 and C18 substrates to
           form hydrocarbons and carbon monoxide. Note that the
           corresponding EC number (4.1.99.5) dating from 1989
           refers to a nonorthologous Pisum sativum enzyme that
           acts on C18 and longer chains and attaches the overly
           narrow name octadecanal decarbonylase [Central
           intermediary metabolism, Other].
          Length = 220

 Score = 30.5 bits (69), Expect = 0.51
 Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 106 CSRAMQNMNRQMNLPQIQRILQEFEKQSEIMDM-KEEMMNDAM 147
               ++  NR+ NLP + R+L +    + ++ M KE ++ D M
Sbjct: 153 VKEELEEANRE-NLPLVWRMLNQVADDAAVLGMEKEALVEDFM 194


>gnl|CDD|173939 cd08180, PDD, 1,3-propanediol dehydrogenase (PPD) catalyzes the
           reduction of 3-hydroxypropionaldehyde (3-HPA) to
           1,3-propanediol in glycerol metabolism.  1,3-propanediol
           dehydrogenase (PPD) plays a role in glycerol metabolism
           of some bacteria in anaerobic conditions. In this
           degradation pathway, glycerol is converted in a two-step
           process to 1,3-propanediol (1,3-PD) which is then
           excreted into the extracellular medium. The first
           reaction involves the transformation of glycerol into
           3-hydroxypropionaldehyde (3-HPA) by a coenzyme
           B-12-dependent dehydratase. The second reaction involves
           the dismutation of the 3-hydroxypropionaldehyde (3-HPA)
           to 1,3-propanediol by the NADH-linked 1,3-propanediol
           dehydrogenase (PPD). The enzyme requires iron ion for its
           function.  Because many genes in this pathway are
           present in the pdu (propanediol utilisation) operon,
           they are also named pdu genes. PPD is a member of the
           iron-containing alcohol dehydrogenase superfamily. The
           PPD structure has a dehydroquinate synthase-like fold.
          Length = 332

 Score = 31.0 bits (71), Expect = 0.54
 Identities = 11/44 (25%), Positives = 24/44 (54%), Gaps = 4/44 (9%)

Query: 108 RAMQNMNRQMNLPQIQRILQEF-EKQSEIMDMKEEMMNDAMDDA 150
            A++ + +++N+P+    L+E    + E     +EM  +A+ DA
Sbjct: 273 EAIKQLKKKLNIPE---TLKELGVDKEEFEAAIDEMAENALKDA 313


>gnl|CDD|216284 pfam01074, Glyco_hydro_38, Glycosyl hydrolases family 38
          N-terminal domain.  Glycosyl hydrolases are key enzymes
          of carbohydrate metabolism.
          Length = 269

 Score = 30.7 bits (70), Expect = 0.59
 Identities = 18/69 (26%), Positives = 31/69 (44%), Gaps = 16/69 (23%)

Query: 1  MEWLFGRKITPEEMLRKNQRALNKAMRDLDREK------------QHMEQQEKKLIADIK 48
          + WL+    T +E  RK QR  +  ++ LDR              +   + + +L   IK
Sbjct: 12 VGWLW----TVDETRRKVQRTFSNVLKLLDRYPEFRFIQSEAQFYEWWWEDQPELFKKIK 67

Query: 49 KMAKEGQME 57
          K+  EG++E
Sbjct: 68 KLVAEGRLE 76


>gnl|CDD|151707 pfam11266, DUF3066, Protein of unknown function (DUF3066).  This
           family of proteins with unknown function appears to be
           restricted to Cyanobacteria.
          Length = 219

 Score = 30.5 bits (69), Expect = 0.63
 Identities = 11/43 (25%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 106 CSRAMQNMNRQMNLPQIQRILQEFEKQSEIMDM-KEEMMNDAM 147
               ++  NR+ NLP + ++L +    + ++ M KE ++ D M
Sbjct: 152 SKEELEEANRE-NLPLVWKMLNQVADDAAVLGMDKEALVEDFM 193


>gnl|CDD|237879 PRK14983, PRK14983, aldehyde decarbonylase; Provisional.
          Length = 231

 Score = 30.0 bits (68), Expect = 0.78
 Identities = 10/35 (28%), Positives = 21/35 (60%), Gaps = 2/35 (5%)

Query: 114 NRQMNLPQIQRILQEFEKQSEIMDM-KEEMMNDAM 147
           N++ NLP + ++L +    + ++ M KE ++ D M
Sbjct: 170 NKE-NLPLVWKMLNQVADDAAVLGMEKEALVEDFM 203


>gnl|CDD|226659 COG4196, COG4196, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 808

 Score = 30.3 bits (68), Expect = 0.94
 Identities = 18/75 (24%), Positives = 33/75 (44%), Gaps = 14/75 (18%)

Query: 148 DDAMDADE---------DEDETNAVVTQVLDELGLQLGDQLASIPDPASSMMSDKGKTPV 198
           +DA+ ADE         D++    V+  + D LGL +     +  DP   + ++     V
Sbjct: 125 NDALLADEWGAPPVDPVDDEAAYKVLAGIADGLGLPISQVRPAYEDPLERLAAE-----V 179

Query: 199 AIPGGSGSNHSNNHG 213
            +P G   + S++ G
Sbjct: 180 RLPAGDPVDPSDDRG 194


>gnl|CDD|217817 pfam03961, DUF342, Protein of unknown function (DUF342).  This
           family of bacterial proteins has no known function. The
           proteins are in the region of 500-600 amino acid
           residues in length.
          Length = 450

 Score = 29.5 bits (67), Expect = 1.5
 Identities = 14/84 (16%), Positives = 45/84 (53%), Gaps = 4/84 (4%)

Query: 18  NQRALNKAMRDLDREKQHMEQQEKKLIADIKKM--AKEGQMESVKI-MAKDLVRTRKYAK 74
           +   L + +++L+ E + +E++ +K+   +KK+     GQ+   K    + L+ T++   
Sbjct: 328 DFPELKEELKELEEELKELEEELEKIKKLLKKLPKKARGQLPPEKREQLEKLLETKEKLS 387

Query: 75  KFM-MMKANIQAVSLKIQTLRSQN 97
           + +  ++  ++ +  ++++L S+ 
Sbjct: 388 EELEELEEELKELKEELESLYSEG 411


>gnl|CDD|200948 pfam00038, Filament, Intermediate filament protein. 
          Length = 312

 Score = 29.1 bits (66), Expect = 1.7
 Identities = 25/131 (19%), Positives = 53/131 (40%), Gaps = 29/131 (22%)

Query: 22  LNKAMRDL--------DREKQHMEQQEKKLIADIKKMAKEG--QMESVKIMAKDLVRTRK 71
           L KA+ ++        ++ +Q  E+  K  + ++++ A      + S K    +L R   
Sbjct: 167 LTKALAEIRAQYEELAEKNRQEAEEWYKSKLEELQQAAARNGDALRSAKEEITELRRQ-- 224

Query: 72  YAKKFMMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRILQEFEK 131
                      IQ++ +++Q+L+ Q A  E     +   +    ++     Q  + E E 
Sbjct: 225 -----------IQSLEIELQSLKKQKASLERQ--LAELEERYELELA--DYQDTISELE- 268

Query: 132 QSEIMDMKEEM 142
             E+  +K EM
Sbjct: 269 -EELQQLKAEM 278


>gnl|CDD|220964 pfam11068, DUF2869, Protein of unknown function (DUF2869).  This
          bacterial family of proteins has no known function.
          Length = 131

 Score = 28.3 bits (64), Expect = 1.9
 Identities = 11/40 (27%), Positives = 22/40 (55%)

Query: 16 RKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQ 55
           + Q  L + +  L++E Q +E Q +K I +I+K + +  
Sbjct: 19 EELQAELQEQLTQLEQELQQLEFQGQKAIKEIRKQSAQQI 58


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 29.4 bits (67), Expect = 2.0
 Identities = 10/45 (22%), Positives = 28/45 (62%), Gaps = 3/45 (6%)

Query: 12  EEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIA---DIKKMAKE 53
           E+ L + +  L++ +  L++ ++ +E++EK+L     +++K  +E
Sbjct: 88  EKRLLQKEENLDRKLELLEKREEELEKKEKELEQKQQELEKKEEE 132


>gnl|CDD|217902 pfam04111, APG6, Autophagy protein Apg6.  In yeast, 15 Apg proteins
           coordinate the formation of autophagosomes. Autophagy is
           a bulk degradation process induced by starvation in
           eukaryotic cells. Apg6/Vps30p has two distinct functions
           in the autophagic process, either associated with the
           membrane or in a retrieval step of the carboxypeptidase
           Y sorting pathway.
          Length = 356

 Score = 29.1 bits (65), Expect = 2.1
 Identities = 12/59 (20%), Positives = 27/59 (45%), Gaps = 5/59 (8%)

Query: 7   RKITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIAD-----IKKMAKEGQMESVK 60
           R +   E L K    L+  + +L  EK+ +E +E + + +        +  E  ++S++
Sbjct: 81  RLLDELEELEKEDDDLDGELVELQEEKEQLENEELQYLREYNLFDRNNLQLEDNLQSLE 139


>gnl|CDD|233207 TIGR00955, 3a01204, The Eye Pigment Precursor Transporter (EPP)
           Family protein.  [Transport and binding proteins,
           Other].
          Length = 617

 Score = 29.2 bits (66), Expect = 2.1
 Identities = 31/165 (18%), Positives = 58/165 (35%), Gaps = 30/165 (18%)

Query: 15  LRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAK 74
            R  Q    K +    R     E+  K L+ ++  +AK G++  + +M          A 
Sbjct: 12  GRVAQDGSWKQLVSRLRGCFCRERPRKHLLKNVSGVAKPGEL--LAVMGS------SGAG 63

Query: 75  KFMMMK--ANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRI---LQEF 129
           K  +M   A      +K       N M                 ++  +++ I   +Q+ 
Sbjct: 64  KTTLMNALAFRSPKGVKGSGSVLLNGMP----------------IDAKEMRAISAYVQQD 107

Query: 130 EKQSEIMDMKEEMMNDAMDDAMDADEDEDETNAVVTQVLDELGLQ 174
           +     + ++E +M  A    M     + E    V +VL  LGL+
Sbjct: 108 DLFIPTLTVREHLMFQAHL-RMPRRVTKKEKRERVDEVLQALGLR 151


>gnl|CDD|114701 pfam05993, Reovirus_M2, Reovirus major virion structural protein
           Mu-1/Mu-1C (M2).  This family consists of several
           Reovirus major virion structural protein Mu-1/Mu-1C (M2)
           sequences. This family is thought to play a
           role in host cell membrane penetration.
          Length = 648

 Score = 29.0 bits (64), Expect = 2.7
 Identities = 14/75 (18%), Positives = 28/75 (37%), Gaps = 4/75 (5%)

Query: 6   GRKITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKD 65
           G    P        R++  A++ L + +  ++     L  DI        M+S+   A D
Sbjct: 121 GPVFIPTRQTMNLDRSIAAALKALAKWEIDLDTAMTLLPPDIPAGEASCNMKSLLAFADD 180

Query: 66  LVR----TRKYAKKF 76
           ++       +Y K+ 
Sbjct: 181 ILPDDNLCLRYPKEA 195


>gnl|CDD|189895 pfam01221, Dynein_light, Dynein light chain type 1. 
          Length = 86

 Score = 26.8 bits (60), Expect = 3.2
 Identities = 11/38 (28%), Positives = 20/38 (52%), Gaps = 3/38 (7%)

Query: 136 MDMKEEMMNDAMD---DAMDADEDEDETNAVVTQVLDE 170
            DM EEM  DA++   +A++    E +  A + +  D+
Sbjct: 8   ADMPEEMQEDAIECAAEALEKFNVEKDIAAHIKKEFDK 45


>gnl|CDD|220506 pfam09989, DUF2229, CoA enzyme activase uncharacterized domain
           (DUF2229).  Members of this family include various
           bacterial hypothetical proteins, as well as CoA enzyme
           activases. The exact function of this domain has not, as
           yet, been defined.
          Length = 218

 Score = 28.3 bits (64), Expect = 3.4
 Identities = 11/54 (20%), Positives = 30/54 (55%), Gaps = 4/54 (7%)

Query: 1   MEWLFGRKITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEG 54
            E      I+ EE+    ++A+ KA+ + +  K+ + ++ ++ +A +++  K+G
Sbjct: 133 YELGKKLGISKEEI----KKAVEKALEEQEAFKKDLRKKGEEALAYLEEEGKKG 182


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 28.2 bits (63), Expect = 3.6
 Identities = 18/100 (18%), Positives = 32/100 (32%), Gaps = 9/100 (9%)

Query: 12  EEMLRKNQR----ALNKAMRDL-DREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDL 66
           EE+     R     +N  +  L  REK+  E++E++  A IK ++   + E  +  A D 
Sbjct: 155 EELKELQSRRADVDVNSMLEALFRREKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDE 214

Query: 67  VRTRKYAKKFMMMKANIQAVSL----KIQTLRSQNAMAEA 102
                             + S      I    +       
Sbjct: 215 DSEDDEEDNDNTPSPKSGSSSPAKPTSILKKSAAKRSEAP 254


>gnl|CDD|220767 pfam10459, Peptidase_S46, Peptidase S46.  Dipeptidyl-peptidase 7
           (DPP-7) is the best characterized member of this family.
           It is a serine peptidase that is located on the cell
           surface and is predicted to have two N-terminal
           transmembrane domains.
          Length = 696

 Score = 28.3 bits (64), Expect = 3.6
 Identities = 27/163 (16%), Positives = 56/163 (34%), Gaps = 24/163 (14%)

Query: 17  KNQRALNKAMRDLD-REKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAKK 75
           KN   + + ++DLD   ++  + +E  L A +KK    G                KY   
Sbjct: 309 KNSIGMLEGLKDLDLLARK--QAREAALRAWVKKDPARG---------------AKYGDA 351

Query: 76  FMMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRILQEFEKQSEI 135
                  + A+  + + L  +    E        +    R +     +R   + E++   
Sbjct: 352 L----DELAALYAERRELARRYFYLEEAFRSGELLS-AARTLVRLAKEREKPDAEREPGY 406

Query: 136 MDMKEEMMNDAMDDAMDADEDEDETNAVVTQVLDELGLQLGDQ 178
            +     +   ++  +D   D +   AV+  +L+E    LG  
Sbjct: 407 QERDLPRLEQQLER-IDKPYDAEVDKAVLAAMLEEYRELLGAD 448


>gnl|CDD|227952 COG5665, NOT5, CCR4-NOT transcriptional regulation complex, NOT5
           subunit [Transcription].
          Length = 548

 Score = 28.5 bits (63), Expect = 3.8
 Identities = 31/187 (16%), Positives = 66/187 (35%), Gaps = 28/187 (14%)

Query: 3   WLFGRKITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIM 62
           WL    +  +++L  N+R +   M      ++ M+          K+ +KE       I 
Sbjct: 54  WLSKEDVKDKQVLMTNRRLIENGMERFKSVEKLMK---------TKQFSKEALTNPDIID 104

Query: 63  AKDLVRTRKYAKKFMMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQI 122
            K+L +           +  +  +   +  L+ Q    EA +   +  ++     NL  I
Sbjct: 105 PKELKK-----------RDQVLFIHDCLDELQKQLEQYEAQENEEQTERHEFHIANLENI 153

Query: 123 QRILQEFEKQSEIMDMKEEMMNDAMDDAMDADE-DEDETNAVVTQVLDELGLQLGDQLAS 181
            + LQ  E       M  E + +  DD     E ++D        + +++G ++    ++
Sbjct: 154 LKKLQNNE-------MDPEPVEEFQDDIKYYVENNDDPDFIEYDTIYEDMGCEIQPSSSN 206

Query: 182 IPDPASS 188
              P   
Sbjct: 207 NEAPKEG 213


>gnl|CDD|130146 TIGR01074, rep, ATP-dependent DNA helicase Rep.  Designed to
           identify rep members of the uvrD/rep subfamily [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 664

 Score = 28.2 bits (63), Expect = 4.3
 Identities = 15/106 (14%), Positives = 42/106 (39%), Gaps = 10/106 (9%)

Query: 68  RTRKYAKKFMMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMN-------RQMNLP 120
           R  +  ++F      I+ ++ + + + +  ++ E +   +   +          R  N+ 
Sbjct: 447 RGYESLQRFTDWLVEIRRLAERSEPIEAVRSLIEDIDYENWLYETSPSPKAAEMRMKNVN 506

Query: 121 QIQRILQEFEKQSEI---MDMKEEMMNDAMDDAMDADEDEDETNAV 163
            +    +E  +  E    M + + +    + D ++  EDE+E + V
Sbjct: 507 TLFSWFKEMLEGDEEDEPMTLTQVVTRLTLRDMLERGEDEEELDQV 552


>gnl|CDD|235773 PRK06291, PRK06291, aspartate kinase; Provisional.
          Length = 465

 Score = 28.0 bits (63), Expect = 5.0
 Identities = 6/44 (13%), Positives = 21/44 (47%), Gaps = 3/44 (6%)

Query: 131 KQSEIMDMKEEMMN---DAMDDAMDADEDEDETNAVVTQVLDEL 171
             +++ D   ++      A+++A+   +  +E +  +   ++EL
Sbjct: 61  DIAKVKDFIADLRERHYKAIEEAIKDPDIREEVSKTIDSRIEEL 104


>gnl|CDD|202724 pfam03685, UPF0147, Uncharacterized protein family (UPF0147).  This
           family of small proteins have no known function.
          Length = 85

 Score = 26.4 bits (59), Expect = 5.1
 Identities = 16/55 (29%), Positives = 27/55 (49%), Gaps = 13/55 (23%)

Query: 130 EKQSEIMDMKEEMMND---------AMDDAMDADEDEDETNAV----VTQVLDEL 171
           EK  + ++M + ++ND         A  DA  A  +E+E+ AV       +LDE+
Sbjct: 6   EKIKQAIEMLDRIINDTTVPRNIRRAATDAKAALLNEEESPAVRAATAISILDEI 60


>gnl|CDD|227512 COG5185, HEC1, Protein involved in chromosome segregation,
           interacts with SMC proteins [Cell division and
           chromosome partitioning].
          Length = 622

 Score = 28.0 bits (62), Expect = 5.2
 Identities = 17/147 (11%), Positives = 51/147 (34%), Gaps = 27/147 (18%)

Query: 17  KNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAKKF 76
           +    +++ ++ L  + + ++    K    +  M  + +              +++  K 
Sbjct: 288 QEAMKISQKIKTLREKWRALKSDSNKYENYVNAM--KQKS-------------QEWPGKL 332

Query: 77  MMMKANIQAVSLKIQTLRSQNAMAEAMKGCSRAMQNMNRQMNLPQIQRILQEFEKQSEIM 136
             +K+ I+    +I+ L+S               Q   + ++  Q + + QE EK +  +
Sbjct: 333 EKLKSEIELKEEEIKALQSNIDELHK--------QLRKQGISTEQFELMNQEREKLTREL 384

Query: 137 DM----KEEMMNDAMDDAMDADEDEDE 159
           D      +++        ++A      
Sbjct: 385 DKINIQSDKLTKSVKSRKLEAQGIFKS 411


>gnl|CDD|218556 pfam05327, RRN3, RNA polymerase I specific transcription initiation
           factor RRN3.  This family consists of several eukaryotic
           proteins which are homologous to the yeast RRN3 protein.
           RRN3 is one of the RRN genes specifically required for
           the transcription of rDNA by RNA polymerase I (Pol I) in
           Saccharomyces cerevisiae.
          Length = 554

 Score = 28.0 bits (63), Expect = 5.3
 Identities = 13/62 (20%), Positives = 29/62 (46%), Gaps = 2/62 (3%)

Query: 121 QIQRILQEFEKQSEIMDMKEEMMNDAMDDAMDADEDEDETNAVVTQVLDELGLQLGDQLA 180
           +IQ  L + + + E   + +E  +D  +D M   +D+DE  +           ++ ++L 
Sbjct: 217 EIQNELDDIDDEEEERVLADEDDDD--EDDMFDMDDDDEEESDPEVERTSTIKEVSEKLD 274

Query: 181 SI 182
           +I
Sbjct: 275 AI 276


>gnl|CDD|227780 COG5493, COG5493, Uncharacterized conserved protein containing a
           coiled-coil domain [Function unknown].
          Length = 231

 Score = 27.5 bits (61), Expect = 5.4
 Identities = 16/86 (18%), Positives = 36/86 (41%), Gaps = 5/86 (5%)

Query: 8   KITPEEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLV 67
           +I  E + +          +D++  ++  EQ++K+L AD K   ++ +     +      
Sbjct: 27  EILYEVLAKLTPWQQLATKQDVEELRKETEQRQKEL-ADEKLEVRKQKATKEDLKLLQ-- 83

Query: 68  RTRKYAKKFMMMKANIQAVSLKIQTL 93
             R   ++F   K +I+ +   I  L
Sbjct: 84  --RFQEEEFRATKEDIKRLETIITGL 107


>gnl|CDD|240197 cd05692, S1_RPS1_repeat_hs4, S1_RPS1_repeat_hs4: Ribosomal
          protein S1 (RPS1) domain. RPS1 is a component of the
          small ribosomal subunit thought to be involved in the
          recognition and binding of mRNA's during translation
          initiation. The bacterial RPS1 domain architecture
          consists of 4-6 tandem S1 domains. In some bacteria,
          the tandem S1 array is located C-terminal to a
          4-hydroxy-3-methylbut-2-enyl diphosphate reductase
          (HMBPP reductase) domain. While RPS1 is found primarily
          in bacteria, proteins with tandem RPS1-like domains
          have been identified in plants and humans, however
          these lack the N-terminal HMBPP reductase domain. This
          CD includes S1 repeat 4 (hs4) of the H. sapiens RPS1
          homolog. Autoantibodies to double-stranded DNA from
          patients with systemic lupus erythematosus cross-react
          with the human RPS1 homolog.
          Length = 69

 Score = 25.7 bits (57), Expect = 5.6
 Identities = 11/31 (35%), Positives = 18/31 (58%)

Query: 35 HMEQQEKKLIADIKKMAKEGQMESVKIMAKD 65
          H+ Q   K + D+K + KEG    VK+++ D
Sbjct: 29 HISQIAHKRVKDVKDVLKEGDKVKVKVLSID 59


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.  This
           family represents the C-terminus (approximately 300
           residues) of proteins that are involved as binding
           partners for Prp19 as part of the nuclear pore complex.
           The family in Drosophila is necessary for pre-mRNA
           splicing, and the human protein has been found in
           purifications of the spliceosome. In the past this
           family was thought, erroneously, to be associated with
           microfibrillin.
          Length = 277

 Score = 27.6 bits (61), Expect = 5.7
 Identities = 17/63 (26%), Positives = 37/63 (58%)

Query: 24  KAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAKKFMMMKANI 83
           K  R   +E++    +EK L  + K+ A+E + E++KI+ +++ +  +  K+  +++ANI
Sbjct: 44  KKDRITIQEREREAAKEKALEEEAKRKAEERKRETLKIVEEEVKKELELKKRNTLLEANI 103

Query: 84  QAV 86
             V
Sbjct: 104 DDV 106


>gnl|CDD|239237 cd02911, arch_FMN, Archeal FMN-binding domain. This family of
           archaeal proteins are part of the NAD(P)H-dependent
           flavin oxidoreductase (oxidored) FMN-binding family that
           reduce a range of alternative electron acceptors. Most
           use FAD/FMN as a cofactor and NAD(P)H as electron donor.
           Some contain 4Fe-4S cluster to transfer electron from
           FAD to FMN. The specific function of this group is
           unknown.
          Length = 233

 Score = 27.3 bits (61), Expect = 6.3
 Identities = 12/45 (26%), Positives = 19/45 (42%), Gaps = 4/45 (8%)

Query: 34  QHMEQQEKKLIADIKKMAKEGQMESVKIMA----KDLVRTRKYAK 74
           + + +  ++L   IK + + G   SVKI A     D    R   K
Sbjct: 119 EALLKDPERLSEFIKALKETGVPVSVKIRAGVDVDDEELARLIEK 163


>gnl|CDD|234892 PRK01037, trmD, tRNA (guanine-N(1)-)-methyltransferase/unknown
           domain fusion protein; Reviewed.
          Length = 357

 Score = 27.5 bits (61), Expect = 6.3
 Identities = 13/48 (27%), Positives = 19/48 (39%), Gaps = 5/48 (10%)

Query: 30  DREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRK-YAKKF 76
            R    +  QE       K      +  SV +  +DL R +K Y+K F
Sbjct: 226 GRSADCLFTQE----DLPKIEVFSPKTFSVVLEVQDLRRAKKFYSKMF 269


>gnl|CDD|224460 COG1543, COG1543, Uncharacterized conserved protein [Function
           unknown].
          Length = 504

 Score = 27.7 bits (62), Expect = 6.4
 Identities = 10/58 (17%), Positives = 26/58 (44%), Gaps = 3/58 (5%)

Query: 22  LNKAMRD---LDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVRTRKYAKKF 76
           L + + D    DR  +++E + +    + K+ A +   E  +    + +  R Y +++
Sbjct: 64  LMEMLADPYLQDRYLRYLEWKIELSEKEEKRYADQSLRELAEFYIPEFLNARGYWEQY 121


>gnl|CDD|212549 cd11711, GINS_A_Sld5, Alpha-helical domain of GINS complex
          protein Sld5.  Sld5 is a component of GINS tetrameric
          protein complex, and within the complex Sld5 interacts
          with Psf1 via its N-terminal A-domain, and with Psf2
          through a combination of the A and B domains. Sld5 in
          Drosophila is required for normal cell cycle
          progression and the maintenance of genomic integrity.
          GINS is a complex of four subunits (Sld5, Psf1, Psf2
          and Psf3) that is involved in both initiation and
          elongation stages of eukaryotic chromosome replication.
          Besides being essential for the maintenance of genomic
          integrity, GINS plays a central role in coordinating
          DNA replication with cell cycle checkpoints and is
          involved in cell growth. The  eukaryotic GINS subunits
          are homologous and homologs are also found in the
          archaea; the complex is not found in bacteria. The four
          subunits of the complex consist of two domains each,
          termed the alpha-helical (A) and beta-strand (B)
          domains. The A and B domains of Sld5/Psf1 are permuted
          with respect to Psf1/Psf3.
          Length = 119

 Score = 26.4 bits (59), Expect = 7.5
 Identities = 12/49 (24%), Positives = 26/49 (53%), Gaps = 7/49 (14%)

Query: 29 LDREKQHMEQQEKKLIA-------DIKKMAKEGQMESVKIMAKDLVRTR 70
          ++R  + +EQQE+ L         D++    E ++E ++ + +  +RTR
Sbjct: 26 VERVLEQIEQQEENLEELKASEKDDLRLSLHEMELERIRFLLRSYLRTR 74


>gnl|CDD|226889 COG4487, COG4487, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 438

 Score = 27.5 bits (61), Expect = 8.2
 Identities = 14/57 (24%), Positives = 23/57 (40%)

Query: 12  EEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMAKDLVR 68
           +   R       K    L+ E++  E+Q  +   D++    E Q ES   + K L R
Sbjct: 149 KNEERLKFENEKKLEESLELEREKFEEQLHEANLDLEFKENEEQRESKWAILKKLKR 205


>gnl|CDD|172947 PRK14472, PRK14472, F0F1 ATP synthase subunit B; Provisional.
          Length = 175

 Score = 26.7 bits (59), Expect = 8.7
 Identities = 17/52 (32%), Positives = 30/52 (57%), Gaps = 3/52 (5%)

Query: 12  EEMLRKNQRALNKAMRDLDREKQHMEQQEKKLIADIKKMAKEGQMESVKIMA 63
           E +LRKN+  L KA  + D+  +  ++  +KL A+I + A     E+ K++A
Sbjct: 69  EAILRKNRELLAKADAEADKIIREGKEYAEKLRAEITEKA---HTEAKKMIA 117


>gnl|CDD|223778 COG0706, YidC, Preprotein translocase subunit YidC [Intracellular
           trafficking and secretion].
          Length = 314

 Score = 27.0 bits (60), Expect = 8.8
 Identities = 13/41 (31%), Positives = 22/41 (53%), Gaps = 3/41 (7%)

Query: 103 MKGCSRAMQNMNRQMNLPQIQRILQEFEKQSEIMDMKEEMM 143
            +  +R+M  M  Q   P+I+ I QE  K ++    ++EMM
Sbjct: 129 SQKSTRSMAKM--QELQPKIKEI-QEKYKGTDKQKQQQEMM 166


>gnl|CDD|223450 COG0373, HemA, Glutamyl-tRNA reductase [Coenzyme metabolism].
          Length = 414

 Score = 27.2 bits (61), Expect = 8.9
 Identities = 20/81 (24%), Positives = 34/81 (41%), Gaps = 17/81 (20%)

Query: 1   MEWLFGRKITP------EEMLRKNQRALNKAMRDLDREKQHMEQQE-------KKLIAD- 46
           MEWL   ++ P      E+     +  L KA++ L   +   E  E        KL+   
Sbjct: 327 MEWLKKLEVVPTIRALREQAEDVREEELEKALKKLPNGEDEEEVLEKLARSLVNKLLHAP 386

Query: 47  ---IKKMAKEGQMESVKIMAK 64
              +K+ AKEG  E ++ + +
Sbjct: 387 TVRLKEAAKEGSEELLRALRE 407


>gnl|CDD|221173 pfam11702, DUF3295, Protein of unknown function (DUF3295).  This
           family is conserved in fungi but the function is not
           known.
          Length = 509

 Score = 27.2 bits (60), Expect = 9.7
 Identities = 9/42 (21%), Positives = 17/42 (40%)

Query: 118 NLPQIQRILQEFEKQSEIMDMKEEMMNDAMDDAMDADEDEDE 159
           +L   ++    F++Q       E   +D  D     ++D DE
Sbjct: 259 SLMSPRKKTASFKEQVVTRTFPERTSDDDEDAIETEEDDVDE 300


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.312    0.126    0.339 

Gapped
Lambda     K      H
   0.267   0.0788    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 11,977,272
Number of extensions: 1125854
Number of successful extensions: 1967
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1899
Number of HSP's successfully gapped: 189
Length of query: 237
Length of database: 10,937,602
Length adjustment: 94
Effective length of query: 143
Effective length of database: 6,768,326
Effective search space: 967870618
Effective search space used: 967870618
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 57 (25.6 bits)