RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy13263
         (577 letters)



>gnl|CDD|213253 cd03286, ABC_MSH6_euk, ATP-binding cassette domain of eukaryotic
           MutS6 homolog.  The MutS protein initiates DNA mismatch
           repair by recognizing mispaired and unpaired bases
           embedded in duplex DNA and activating endo- and
           exonucleases to remove the mismatch. Members of the MutS
           family possess C-terminal domain with a conserved ATPase
           activity that belongs to the ATP binding cassette (ABC)
           superfamily. MutS homologs (MSH) have been identified in
           most prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 218

 Score =  118 bits (297), Expect = 2e-30
 Identities = 46/97 (47%), Positives = 60/97 (61%), Gaps = 2/97 (2%)

Query: 390 RGTGTNDGCVIARVTLEKFLQ-IGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
           RGT T+DG  IA   LE  ++ + CLT+F+THYHS+     E   V   +M+      +D
Sbjct: 120 RGTSTHDGYAIAHAVLEYLVKKVKCLTLFSTHYHSLCDEFHEHGGVRLGHMACAVKNESD 179

Query: 449 G-IDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVK 484
             I  I FLYKLV GICPKS+G  VA +AGIP+ VV+
Sbjct: 180 PTIRDITFLYKLVAGICPKSYGLYVALMAGIPDGVVE 216


>gnl|CDD|197777 smart00534, MUTSac, ATPase domain of DNA mismatch repair MUTS
           family. 
          Length = 185

 Score =  113 bits (285), Expect = 4e-29
 Identities = 43/97 (44%), Positives = 59/97 (60%), Gaps = 6/97 (6%)

Query: 389 CRGTGTNDGCVIARVTLEKFL-QIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRN 447
            RGT T DG  IA   LE  L +IG  T+FATHYH + +     P V   +MS +E+   
Sbjct: 88  GRGTSTYDGLAIAAAILEYLLEKIGARTLFATHYHELTKLADNHPGVRNLHMSALEET-- 145

Query: 448 DGIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVK 484
              + I FLYKL PG+  KS+G  VA+LAG+P++V++
Sbjct: 146 ---ENITFLYKLKPGVAGKSYGIEVAKLAGLPKEVIE 179


>gnl|CDD|215944 pfam00488, MutS_V, MutS domain V.  This domain is found in proteins
           of the MutS family (DNA mismatch repair proteins) and is
           found associated with pfam01624, pfam05188, pfam05192
           and pfam05190. The mutS family of proteins is named
           after the Salmonella typhimurium MutS protein involved
           in mismatch repair; other members of the family included
           the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have
           various roles in DNA repair and recombination. Human MSH
           has been implicated in non-polyposis colorectal
           carcinoma (HNPCC) and is a mismatch binding protein. The
           aligned region corresponds with domain V of Thermus
           aquaticus MutS as characterized in, which contains a
           Walker A motif, and is structurally similar to the
           ATPase domain of ABC transporters.
          Length = 235

 Score =  102 bits (256), Expect = 1e-24
 Identities = 45/108 (41%), Positives = 60/108 (55%), Gaps = 6/108 (5%)

Query: 390 RGTGTNDGCVIARVTLEKFLQ-IGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
           RGT T DG  IA    E   + I   T+FATHYH + +   + P V   +M+ +E     
Sbjct: 133 RGTSTYDGLAIAWAVAEHLAEKIRARTLFATHYHELTKLAEKLPAVKNVHMAAVET---- 188

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEAR 496
               IVFLYK+ PG   KS+G +VAELAG+PE VV+    V  ++E R
Sbjct: 189 -NGDIVFLYKVKPGAADKSYGIHVAELAGLPESVVERAREVLAELEDR 235


>gnl|CDD|223327 COG0249, MutS, Mismatch repair ATPase (MutS family) [DNA
           replication, recombination, and repair].
          Length = 843

 Score =  106 bits (268), Expect = 4e-24
 Identities = 54/145 (37%), Positives = 80/145 (55%), Gaps = 12/145 (8%)

Query: 390 RGTGTNDGCVIARVTLEKFLQ-IGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
           RGT T DG  IA   LE   + IGC T+FATHYH +     + P V   +MS +E+    
Sbjct: 697 RGTSTYDGLAIAWAVLEYLHEKIGCRTLFATHYHELTELEEKLPQVKNYHMSAVEEG--- 753

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEARH-----NLRQLF 503
               I FLYK+ PGI  KS+G +VA+LAG+PE+V++    +  ++E         L Q  
Sbjct: 754 --GDITFLYKVKPGIADKSYGIHVAKLAGLPEEVIERAREILAELEKESRSSNLELNQKD 811

Query: 504 IHKFASLVKSGEKVDVEEL-QKALE 527
           +  F  ++K+ + +D +EL  +AL 
Sbjct: 812 LSLFPKVLKALKSLDPDELTPRALN 836



 Score = 66.5 bits (163), Expect = 2e-11
 Identities = 23/45 (51%), Positives = 28/45 (62%)

Query: 203 LKKQTPCMGQWWTIKSQNFDCVLFFKVGKFYELFHMDAVIGADEL 247
             K TP M Q+  IK+Q  D +LFF++G FYELF  DA I A  L
Sbjct: 3   KAKLTPMMQQYLEIKAQYPDTLLFFRMGDFYELFFEDAKIAARLL 47



 Score = 46.5 bits (111), Expect = 3e-05
 Identities = 27/110 (24%), Positives = 44/110 (40%), Gaps = 2/110 (1%)

Query: 279 FPDMSELLKYFENAFDHKEASSAGNIIPKAGVDKEYDEVMDEIKSIEKEIQTYLRTQCAH 338
              ++ELL+  E A +     +  + I K G + E DE+ D + + ++ I      +   
Sbjct: 398 LDYLAELLELLETAINEDPPLAVRDGIIKEGYNIELDELRDLLNNAKEWIAKLELEERER 457

Query: 339 FGCTVIYSEAQKKQKKYVLEVPSKYASKAKSNHQRVATKKKNVENYVTPE 388
            G          K   Y +EV    A     ++ R  T  KN E + TPE
Sbjct: 458 TGIK-SLKIKYNKVYGYYIEVTKSNAKLVPDDYIRRQT-LKNAERFTTPE 505


>gnl|CDD|213254 cd03287, ABC_MSH3_euk, ATP-binding cassette domain of eukaryotic
           MutS3 homolog.  The MutS protein initiates DNA mismatch
           repair by recognizing mispaired and unpaired bases
           embedded in duplex DNA and activating endo- and
           exonucleases to remove the mismatch. Members of the MutS
           family possess C-terminal domain with a conserved ATPase
           activity that belongs to the ATP binding cassette (ABC)
           superfamily. MutS homologs (MSH) have been identified in
           most prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 222

 Score = 98.7 bits (246), Expect = 2e-23
 Identities = 43/99 (43%), Positives = 61/99 (61%), Gaps = 5/99 (5%)

Query: 390 RGTGTNDGCVIARVTLEKFLQI-GCLTVFATHYHSVARRLRE-EPNVAFEYMSYIEDKRN 447
           RGT T+DG  IA  TL   L+   CL +F THY S+   LR  E ++   +MSY+E +++
Sbjct: 121 RGTSTHDGIAIAYATLHYLLEEKKCLVLFVTHYPSLGEILRRFEGSIRNYHMSYLESQKD 180

Query: 448 DGI---DTIVFLYKLVPGICPKSFGFNVAELAGIPEDVV 483
                  +I FLYKLV G+  +SFG NVA LAG+P+ ++
Sbjct: 181 FETSDSQSITFLYKLVRGLASRSFGLNVARLAGLPKSII 219


>gnl|CDD|213210 cd03243, ABC_MutS_homologs, ATP-binding cassette domain of MutS
           homologs.  The MutS protein initiates DNA mismatch
           repair by recognizing mispaired and unpaired bases
           embedded in duplex DNA and activating endo- and
           exonucleases to remove the mismatch. Members of the MutS
           family also possess a conserved ATPase activity that
           belongs to the ATP binding cassette (ABC) superfamily.
           MutS homologs (MSH) have been identified in most
           prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 202

 Score = 92.3 bits (230), Expect = 2e-21
 Identities = 38/108 (35%), Positives = 53/108 (49%), Gaps = 13/108 (12%)

Query: 379 KNVENYVTPE--------CRGTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLRE 430
           K + +  TP          RGT T +G  IA   LE  L+ GC T+FATH+H +A    +
Sbjct: 100 KEILSLATPRSLVLIDELGRGTSTAEGLAIAYAVLEHLLEKGCRTLFATHFHELADLPEQ 159

Query: 431 EPNVAFEYMSYIEDKRNDGIDTIVFLYKLVPGICPKSFGFNVAELAGI 478
            P V   +M  +          + F YKL+ GIC  S+   +AELAG+
Sbjct: 160 VPGVKNLHMEELITT-----GGLTFTYKLIDGICDPSYALQIAELAGL 202


>gnl|CDD|213251 cd03284, ABC_MutS1, ATP-binding cassette domain of MutS1 homolog.
           The MutS protein initiates DNA mismatch repair by
           recognizing mispaired and unpaired bases embedded in
           duplex DNA and activating endo- and exonucleases to
           remove the mismatch. Members of the MutS family possess
           C-terminal domain with a conserved ATPase activity that
           belongs to the ATP binding cassette (ABC) superfamily.
           MutS homologs (MSH) have been identified in most
           prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 216

 Score = 87.7 bits (218), Expect = 8e-20
 Identities = 38/96 (39%), Positives = 54/96 (56%), Gaps = 6/96 (6%)

Query: 390 RGTGTNDGCVIARVTLEKFL-QIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
           RGT T DG  IA   +E    +IG  T+FATHYH +     + P V   +++  E     
Sbjct: 120 RGTSTYDGLSIAWAIVEYLHEKIGAKTLFATHYHELTELEGKLPRVKNFHVAVKEKG--- 176

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVK 484
               +VFL+K+V G   KS+G  VA LAG+PE+V++
Sbjct: 177 --GGVVFLHKIVEGAADKSYGIEVARLAGLPEEVIE 210


>gnl|CDD|235444 PRK05399, PRK05399, DNA mismatch repair protein MutS; Provisional.
          Length = 854

 Score = 90.9 bits (227), Expect = 5e-19
 Identities = 56/159 (35%), Positives = 77/159 (48%), Gaps = 29/159 (18%)

Query: 390 RGTGTNDGCVIARVTLEKFL--QIGCLTVFATHYH---SVARRLREEPNVAFEYMSYIED 444
           RGT T DG  IA    E +L  +IG  T+FATHYH    +  +L   P V   +++  E 
Sbjct: 697 RGTSTYDGLSIAWAVAE-YLHDKIGAKTLFATHYHELTELEEKL---PGVKNVHVAVKEH 752

Query: 445 KRNDGIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEARHNLRQLFI 504
                   IVFL+K+VPG   KS+G +VA+LAG+P  V+K          AR  L QL  
Sbjct: 753 G-----GDIVFLHKVVPGAADKSYGIHVAKLAGLPASVIK---------RAREILAQL-- 796

Query: 505 HKFASLVKSGEKVDVEELQKALESVKSFESQTKKDLEDL 543
               S  +  +    EE Q +L +    ES   + L+ L
Sbjct: 797 ---ESASEKAKAASAEEDQLSLFAEPE-ESPLLEALKAL 831



 Score = 67.4 bits (166), Expect = 1e-11
 Identities = 21/45 (46%), Positives = 28/45 (62%)

Query: 203 LKKQTPCMGQWWTIKSQNFDCVLFFKVGKFYELFHMDAVIGADEL 247
           + K TP M Q+  IK+Q  D +LFF++G FYELF  DA   +  L
Sbjct: 5   MSKLTPMMQQYLEIKAQYPDALLFFRMGDFYELFFEDAKKASRLL 49



 Score = 33.5 bits (78), Expect = 0.36
 Identities = 27/103 (26%), Positives = 38/103 (36%), Gaps = 35/103 (33%)

Query: 302 GNIIPKAGVDKEYDE---VMDE----IKSIE-KEIQTYLRTQCAH--------FGCTVIY 345
           G +I   G D E DE   + D     +  +E +E +   RT  +         FG     
Sbjct: 421 GGVI-ADGYDAELDELRALSDNGKDWLAELEARERE---RTGISSLKVGYNKVFG----- 471

Query: 346 SEAQKKQKKYVLEVPSKYASKAKSNHQRVATKKKNVENYVTPE 388
                    Y +EV      K   ++ R  T K N E Y+TPE
Sbjct: 472 ---------YYIEVTKANLDKVPEDYIRRQTLK-NAERYITPE 504


>gnl|CDD|213252 cd03285, ABC_MSH2_euk, ATP-binding cassette domain of eukaryotic
           MutS2 homolog.  The MutS protein initiates DNA mismatch
           repair by recognizing mispaired and unpaired bases
           embedded in duplex DNA and activating endo- and
           exonucleases to remove the mismatch. Members of the MutS
           family possess C-terminal domain with a conserved ATPase
           activity that belongs to the ATP binding cassette (ABC)
           superfamily. MutS homologs (MSH) have been identified in
           most prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 222

 Score = 80.5 bits (199), Expect = 3e-17
 Identities = 40/106 (37%), Positives = 58/106 (54%), Gaps = 4/106 (3%)

Query: 390 RGTGTNDGCVIARVTLEKFL-QIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
           RGT T DG  +A    E    QI C  +FATH+H +     E PNV   +++ + D   D
Sbjct: 120 RGTSTYDGFGLAWAIAEYIATQIKCFCLFATHFHELTALADEVPNVKNLHVTALTD---D 176

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQME 494
              T+  LYK+  G C +SFG +VAELA  P++V++     A ++E
Sbjct: 177 ASRTLTMLYKVEKGACDQSFGIHVAELANFPKEVIEMAKQKALELE 222


>gnl|CDD|216613 pfam01624, MutS_I, MutS domain I.  This domain is found in proteins
           of the MutS family (DNA mismatch repair proteins) and is
           found associated with pfam00488, pfam05188, pfam05192
           and pfam05190. The MutS family of proteins is named
           after the Salmonella typhimurium MutS protein involved
           in mismatch repair; other members of the family included
           the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have
           various roles in DNA repair and recombination. Human MSH
           has been implicated in non-polyposis colorectal
           carcinoma (HNPCC) and is a mismatch binding protein. The
           aligned region corresponds with globular domain I, which
           is involved in DNA binding, in Thermus aquaticus MutS as
           characterized in.
          Length = 113

 Score = 75.7 bits (187), Expect = 1e-16
 Identities = 26/56 (46%), Positives = 33/56 (58%)

Query: 207 TPCMGQWWTIKSQNFDCVLFFKVGKFYELFHMDAVIGADELACSYMKESGCTGEST 262
           TP M Q+  +KS+  D VLFF+VG FYELF  DA I A EL  +     G +G+  
Sbjct: 1   TPMMRQYLELKSKYPDAVLFFRVGDFYELFGEDAEIAARELGITLTVRGGGSGKRI 56


>gnl|CDD|233259 TIGR01070, mutS1, DNA mismatch repair protein MutS.  [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 840

 Score = 74.8 bits (184), Expect = 5e-14
 Identities = 38/112 (33%), Positives = 58/112 (51%), Gaps = 6/112 (5%)

Query: 390 RGTGTNDGCVIARVTLEKFLQ-IGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
           RGT T DG  +A    E   + I   T+FATHY  +       P +   +++ +E     
Sbjct: 682 RGTSTYDGLALAWAIAEYLHEHIRAKTLFATHYFELTALEESLPGLKNVHVAALEHN--- 738

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEARHNLR 500
              TIVFL++++PG   KS+G  VA LAG+P++V+     +  Q+EAR    
Sbjct: 739 --GTIVFLHQVLPGPASKSYGLAVAALAGLPKEVIARARQILTQLEARSTES 788



 Score = 53.6 bits (129), Expect = 2e-07
 Identities = 21/56 (37%), Positives = 30/56 (53%)

Query: 207 TPCMGQWWTIKSQNFDCVLFFKVGKFYELFHMDAVIGADELACSYMKESGCTGEST 262
           TP M Q+  +K+++ D +LFF++G FYELF+ DA   A  L  S         E  
Sbjct: 2   TPMMQQYLKLKAEHPDALLFFRMGDFYELFYEDAKKAAQLLDISLTSRGQSADEPI 57



 Score = 35.5 bits (82), Expect = 0.075
 Identities = 35/142 (24%), Positives = 59/142 (41%), Gaps = 31/142 (21%)

Query: 262 TLLTQLCNYESQTPSGCFPDMSELLKYFENAFDHKEASSAGNIIPKAGVDKE-YDEVMDE 320
            LL +L     Q  +    D SELL+  E A       +   ++   G+ +E YDE +DE
Sbjct: 365 ALLEELEGPTLQALAAQIDDFSELLELLEAAL----IENPPLVVRDGGLIREGYDEELDE 420

Query: 321 IKSIEKEIQTYLRTQCAHFGCTVIYSEAQKKQKK--------------YVLEVPSKYASK 366
           +++  +E   YL              EA+++++               Y +EV       
Sbjct: 421 LRAASREGTDYLARL-----------EARERERTGIPTLKVGYNAVFGYYIEVTRGQLHL 469

Query: 367 AKSNHQRVATKKKNVENYVTPE 388
             ++++R  T  KN E Y+TPE
Sbjct: 470 VPAHYRRRQT-LKNAERYITPE 490


>gnl|CDD|213248 cd03281, ABC_MSH5_euk, ATP-binding cassette domain of eukaryotic
           MutS5 homolog.  The MutS protein initiates DNA mismatch
           repair by recognizing mispaired and unpaired bases
           embedded in duplex DNA and activating endo- and
           exonucleases to remove the mismatch. Members of the MutS
           family possess C-terminal domain with a conserved ATPase
           activity that belongs to the ATP binding cassette (ABC)
           superfamily. MutS homologs (MSH) have been identified in
           most prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 213

 Score = 63.9 bits (156), Expect = 2e-11
 Identities = 32/95 (33%), Positives = 48/95 (50%), Gaps = 6/95 (6%)

Query: 390 RGTGTNDGCVIARVTLEKFLQIG--C-LTVFATHYHSVARR--LREEPNVAFEYMS-YIE 443
           +GT T DG  +   T+E  L+ G  C   + +TH+H +  R  L E   + F  M   + 
Sbjct: 119 KGTDTEDGAGLLIATIEHLLKRGPECPRVIVSTHFHELFNRSLLPERLKIKFLTMEVLLN 178

Query: 444 DKRNDGIDTIVFLYKLVPGICPKSFGFNVAELAGI 478
                  + I +LY+LVPG+   SF  + A+LAGI
Sbjct: 179 PTSTSPNEDITYLYRLVPGLADTSFAIHCAKLAGI 213


>gnl|CDD|218487 pfam05190, MutS_IV, MutS family domain IV.  This domain is found in
           proteins of the MutS family (DNA mismatch repair
           proteins) and is found associated with pfam01624,
           pfam05188, pfam05192 and pfam00488. The mutS family of
           proteins is named after the Salmonella typhimurium MutS
           protein involved in mismatch repair; other members of
           the family included the eukaryotic MSH 1,2,3, 4,5 and 6
           proteins. These have various roles in DNA repair and
           recombination. Human MSH has been implicated in
           non-polyposis colorectal carcinoma (HNPCC) and is a
           mismatch binding protein. The aligned region corresponds
           in part with globular domain IV, which is involved in
           DNA binding, in Thermus aquaticus MutS as characterized
           in.
          Length = 92

 Score = 52.2 bits (126), Expect = 1e-08
 Identities = 25/83 (30%), Positives = 37/83 (44%), Gaps = 8/83 (9%)

Query: 309 GVDKEYDEVMDEIKSIEKEIQTYLRTQCAHFGCT---VIYSEAQKKQKKYVLEVPSKYAS 365
           G D+E DE+ D ++ +E E+   L  +    G     V Y+    K   Y +EV    A 
Sbjct: 1   GFDEELDELRDLLEELESELDELLAKERERLGIKSLKVGYN----KVFGYYIEVTRSEAK 56

Query: 366 KAKSNHQRVATKKKNVENYVTPE 388
           K   ++ R  T  KN   + TPE
Sbjct: 57  KVPKDYIRRQT-LKNGVRFTTPE 78


>gnl|CDD|218489 pfam05192, MutS_III, MutS domain III.  This domain is found in
           proteins of the MutS family (DNA mismatch repair
           proteins) and is found associated with pfam00488,
           pfam05188, pfam01624 and pfam05190. The MutS family of
           proteins is named after the Salmonella typhimurium MutS
           protein involved in mismatch repair; other members of
           the family included the eukaryotic MSH 1,2,3, 4,5 and 6
           proteins. These have various roles in DNA repair and
           recombination. Human MSH has been implicated in
           non-polyposis colorectal carcinoma (HNPCC) and is a
           mismatch binding protein. The aligned region corresponds
           with domain III, which is central to the structure of
           Thermus aquaticus MutS as characterized in.
          Length = 290

 Score = 52.0 bits (125), Expect = 3e-07
 Identities = 27/111 (24%), Positives = 46/111 (41%), Gaps = 2/111 (1%)

Query: 281 DMSELLKYFENAFD-HKEASSAGNIIPKAGVDKEYDEVMDEIKSIEKEIQTYLRTQCAHF 339
            + ELL+  E A D     S     + K G D E DE+   +  + +++   L  +    
Sbjct: 127 PLPELLELLERAIDEDPPLSLRDGGVIKDGYDPELDELRALLDELREKLAELLERERERT 186

Query: 340 GCTVIYSEAQKKQKKYVLEVPSKYASKAKSNHQRVATKKKNVENYVTPECR 390
           G   +     +    YV+EV +  A K   ++ R  +  KN   + TPE +
Sbjct: 187 GIKSLKVGYNRVFGYYVIEVKASKADKVPGDYIRR-STTKNAVRFTTPELK 236


>gnl|CDD|214710 smart00533, MUTSd, DNA-binding domain of DNA mismatch repair MUTS
           family. 
          Length = 308

 Score = 51.5 bits (124), Expect = 5e-07
 Identities = 32/108 (29%), Positives = 52/108 (48%), Gaps = 10/108 (9%)

Query: 284 ELLKYFENAFDHKEASSAGNIIPKAGVDKEYDEVMDEIKSIEKEIQTYLRTQCAHFGC-- 341
           ELL    N  D  E +  G +I K G D E DE+ ++++ +E+E++  L+ +    G   
Sbjct: 122 ELLLELLNDDDPLEVND-GGLI-KDGFDPELDELREKLEELEEELEELLKKEREELGIDS 179

Query: 342 -TVIYSEAQKKQKKYVLEVPSKYASKAKSNHQRVATKKKNVENYVTPE 388
             + Y+    K   Y +EV    A K   +  R ++  KN E + TPE
Sbjct: 180 LKLGYN----KVHGYYIEVTKSEAKKVPKDFIRRSS-LKNTERFTTPE 222


>gnl|CDD|213250 cd03283, ABC_MutS-like, ATP-binding cassette domain of MutS-like
           homolog.  The MutS protein initiates DNA mismatch repair
           by recognizing mispaired and unpaired bases embedded in
           duplex DNA and activating endo- and exonucleases to
           remove the mismatch. Members of the MutS family possess
           C-terminal domain with a conserved ATPase activity that
           belongs to the ATP binding cassette (ABC) superfamily.
           MutS homologs (MSH) have been identified in most
           prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 199

 Score = 50.0 bits (120), Expect = 6e-07
 Identities = 19/91 (20%), Positives = 37/91 (40%), Gaps = 7/91 (7%)

Query: 389 CRGTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLREEPNVAFEYMS-YIEDKRN 447
            +GT + +    +   L+       + + +TH   +A  L  +  V   +    I+D   
Sbjct: 115 FKGTNSRERQAASAAVLKFLKNKNTIGIISTHDLELADLLDLDSAVRNYHFREDIDD--- 171

Query: 448 DGIDTIVFLYKLVPGICPKSFGFNVAELAGI 478
              + ++F YKL PG+ P      + +  GI
Sbjct: 172 ---NKLIFDYKLKPGVSPTRNALRLMKKIGI 199


>gnl|CDD|213249 cd03282, ABC_MSH4_euk, ATP-binding cassette domain of eukaryotic
           MutS4 homolog.  The MutS protein initiates DNA mismatch
           repair by recognizing mispaired and unpaired bases
           embedded in duplex DNA and activating endo- and
           exonucleases to remove the mismatch. Members of the MutS
           family possess C-terminal domain with a conserved ATPase
           activity that belongs to the ATP binding cassette (ABC)
           superfamily. MutS homologs (MSH) have been identified in
           most prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 204

 Score = 43.5 bits (103), Expect = 9e-05
 Identities = 23/73 (31%), Positives = 31/73 (42%), Gaps = 4/73 (5%)

Query: 390 RGTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRNDG 449
           RGT + DG  I+   LE  ++      FATH+  +A  L  +  V   +M       N  
Sbjct: 119 RGTSSADGFAISLAILECLIKKESTVFFATHFRDIAAILGNKSCVVHLHMKAQSINSNG- 177

Query: 450 IDTIVFLYKLVPG 462
              I   YKLV G
Sbjct: 178 ---IEMAYKLVLG 187


>gnl|CDD|224114 COG1193, COG1193, Mismatch repair ATPase (MutS family) [DNA
           replication, recombination, and repair].
          Length = 753

 Score = 45.0 bits (107), Expect = 1e-04
 Identities = 38/158 (24%), Positives = 67/158 (42%), Gaps = 14/158 (8%)

Query: 389 CRGTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
             GT  ++G  +A   LE  L+     V  THY  +     E   V    M +       
Sbjct: 405 GSGTDPDEGAALAIAILEDLLEKPAKIVATTHYRELKALAAEREGVENASMEFDA----- 459

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEARHNLRQLFIHKFA 508
             +T+   Y+L+ G+  +S  F++A   G+PE +++   T   +      L +  I K  
Sbjct: 460 --ETLRPTYRLLEGVPGRSNAFDIALRLGLPEPIIEEAKT---EFGEEKELLEELIEKLE 514

Query: 509 SLVKSGEKVDVEELQKALESVKSFE---SQTKKDLEDL 543
            + K   + ++EE++K L+ V+      S  K  L +L
Sbjct: 515 EVRKE-LEEELEEVEKLLDEVELLTGANSGGKTSLLEL 551


>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein.  Function of MutS2 is
           unknown. It should not be considered a DNA mismatch
           repair protein. It is likely a DNA mismatch binding
           protein of unknown cellular function [DNA metabolism,
           Other].
          Length = 771

 Score = 43.3 bits (102), Expect = 3e-04
 Identities = 38/155 (24%), Positives = 70/155 (45%), Gaps = 11/155 (7%)

Query: 389 CRGTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
             GT  ++G  +A   LE  L+     +  THY  +  +     N   E  S + D    
Sbjct: 412 GAGTDPDEGSALAISILEYLLKQNAQVLITTHYKEL--KALMYNNEGVENASVLFD---- 465

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEARHNLRQLFIHKFA 508
             +T+   YKL+ GI  +S+ F +A+  GIP  +++   T             + I K +
Sbjct: 466 -EETLSPTYKLLKGIPGESYAFEIAQRYGIPHFIIEQAKT---FYGEFKEEINVLIEKLS 521

Query: 509 SLVKSGEKVDVEELQKALESVKSFESQTKKDLEDL 543
           +L K  E+   E L+K L+  +  + + ++++E+L
Sbjct: 522 ALEKELEQ-KNEHLEKLLKEQEKLKKELEQEMEEL 555


>gnl|CDD|213247 cd03280, ABC_MutS2, ATP-binding cassette domain of MutS2.  MutS2
           homologs in bacteria and eukaryotes. The MutS protein
           initiates DNA mismatch repair by recognizing mispaired
           and unpaired bases embedded in duplex DNA and activating
           endo- and exonucleases to remove the mismatch. Members
           of the MutS family also possess a conserved ATPase
           activity that belongs to the ATP binding cassette (ABC)
           superfamily. MutS homologs (MSH) have been identified in
           most prokaryotic and all eukaryotic organisms examined.
           Prokaryotes have two homologs (MutS1 and MutS2), whereas
           seven MSH proteins (MSH1 to MSH7) have been identified
           in eukaryotes. The homodimer MutS1 and heterodimers
           MSH2-MSH3 and MSH2-MSH6 are primarily involved in
           mitotic mismatch repair, whereas MSH4-MSH5 is involved
           in resolution of Holliday junctions during meiosis. All
           members of the MutS family contain the highly conserved
           Walker A/B ATPase domain, and many share a common
           mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and
           MSH4-MSH5 dimerize to form sliding clamps, and
           recognition of specific DNA structures or lesions
           results in ADP/ATP exchange.
          Length = 200

 Score = 39.5 bits (93), Expect = 0.002
 Identities = 21/90 (23%), Positives = 37/90 (41%), Gaps = 7/90 (7%)

Query: 389 CRGTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRND 448
             GT   +G  +A   LE+ L+ G L +  THY  +     +   V    M +       
Sbjct: 118 GSGTDPVEGAALAIAILEELLERGALVIATTHYGELKAYAYKREGVENASMEF------- 170

Query: 449 GIDTIVFLYKLVPGICPKSFGFNVAELAGI 478
             +T+   Y+L+ G+  +S    +A   G+
Sbjct: 171 DPETLKPTYRLLIGVPGRSNALEIARRLGL 200


>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein.  This family consists of
           AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
           retardation syndrome) nuclear proteins. These proteins
           have been linked to human diseases such as acute
           lymphoblastic leukaemia and mental retardation. The
           family also contains a Drosophila AF4 protein homologue
           Lilliputian which contains an AT-hook domain.
           Lilliputian represents a novel pair-rule gene that acts
           in cytoskeleton regulation, segmentation and
           morphogenesis in Drosophila.
          Length = 1154

 Score = 38.7 bits (90), Expect = 0.009
 Identities = 32/168 (19%), Positives = 55/168 (32%), Gaps = 13/168 (7%)

Query: 5   SKESPEKKGDSESST---PASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDS 61
            K+  EK+G  +SS       SK      S  +        K P     K+ KS  QS++
Sbjct: 471 IKQPMEKEGKVKSSGSQYHPESKEPPPKSSSKEKRRPRTAQKGPESGRGKQ-KSPAQSEA 529

Query: 62  EPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKR 121
            P    +  G +    P KA    E      E E        +  +          G+++
Sbjct: 530 PPQR--RTVGKKQPKKPEKASAGDERTGLRPESEPGTLPYGSSVQTPPDRPKAATKGSRK 587

Query: 122 GLSSKSGQPT-------KKPKLTAPSTPSTPSFPVSDTSETTPSTSGA 162
               K  + +       +K K  +   P +  F  +D+S +      +
Sbjct: 588 PSPRKEPKSSVPPAAEKRKYKSPSKIVPKSREFIETDSSSSDSPEDES 635



 Score = 33.7 bits (77), Expect = 0.30
 Identities = 36/172 (20%), Positives = 52/172 (30%), Gaps = 17/172 (9%)

Query: 4   DSKESPEKKGDSESSTPA---SSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSD 60
             K+S        S T +   SSKGK+  K+    E D   +K+ R +      S   S 
Sbjct: 724 AEKDSLSAPKKQTSKTASEKSSSKGKRKHKNDE--EADKIESKKQRLEEKSSSCSPSSSS 781

Query: 61  SEPDDMLQDNG------SEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTP 114
           S                 E+E +P  +   S S         +        SSS      
Sbjct: 782 SHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDTSSSSG---- 837

Query: 115 MKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQDWS 166
                    SS     T K + T     ST S     +S  TP+ + +    
Sbjct: 838 -PFSASSTKSSSKSSSTSKHRKTEGKGSST-SKEHKGSSGDTPNKASSFPVP 887


>gnl|CDD|236304 PRK08581, PRK08581, N-acetylmuramoyl-L-alanine amidase; Validated.
          Length = 619

 Score = 37.5 bits (87), Expect = 0.018
 Identities = 32/181 (17%), Positives = 58/181 (32%), Gaps = 26/181 (14%)

Query: 5   SKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSA----KRVKSAIQSD 60
           +    +K  D E+S   SSK    + +   S  D+   K     S+      +   I  +
Sbjct: 39  TSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKFSTIDSSTSDSNNIIDFIYKN 98

Query: 61  SEPDDMLQD---NGSEDEYVPP-----------------KAEVESESEHSSGEEELEESV 100
               ++ Q    N  +D Y                    +     +S + S +       
Sbjct: 99  LPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNSDISDYEQPRNSEKSTNDSNKNSDSSIK 158

Query: 101 EDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTP-SFPVSDTSETTPST 159
            D    SS+ +    +       ++K     K+P    P+ P+   S P SD +    S+
Sbjct: 159 NDTDTQSSKQDKADNQKAPSS-NNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTANQKSS 217

Query: 160 S 160
           S
Sbjct: 218 S 218


>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
           protein; Reviewed.
          Length = 782

 Score = 37.1 bits (87), Expect = 0.027
 Identities = 37/154 (24%), Positives = 63/154 (40%), Gaps = 31/154 (20%)

Query: 391 GTGTNDGCVIARVTLEKFLQIGCLTVFATHYHSVARRLREEPNVAFEYMSYIEDKRNDGI 450
           GT  ++G  +A   LE   + G   +  THY  +   +     V  E  S   D      
Sbjct: 419 GTDPDEGAALAISILEYLRKRGAKIIATTHYKELKALMYNREGV--ENASVEFD-----E 471

Query: 451 DTIVFLYKLVPGICPKSFGFNVAELAGIPEDVVKFGTTVAFQMEARHNLRQLFIHKFASL 510
           +T+   Y+L+ GI  KS  F +A+  G+PE++++         EA+  +           
Sbjct: 472 ETLRPTYRLLIGIPGKSNAFEIAKRLGLPENIIE---------EAKKLI----------- 511

Query: 511 VKSGEKVDVEELQKALESVKSFESQTK-KDLEDL 543
               +K  + EL  +LE     E + K ++ E L
Sbjct: 512 --GEDKEKLNELIASLEE-LERELEQKAEEAEAL 542


>gnl|CDD|114270 pfam05539, Pneumo_att_G, Pneumovirinae attachment membrane
           glycoprotein G. 
          Length = 408

 Score = 36.2 bits (83), Expect = 0.040
 Identities = 27/140 (19%), Positives = 44/140 (31%), Gaps = 12/140 (8%)

Query: 22  SSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKA 81
             K     K P  +   S  T  P   S     S +   S+P        + ++ +    
Sbjct: 158 RGKDVSCCKEPKTAVTTSKTTSWPTEVSHPTYPSQVTPQSQPATQGHQTATANQRLSSTE 217

Query: 82  EVESE--SEHSSGEEELE--ESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLT 137
            V ++  +  S+ E + E   S   P+ S      T  ++ +  G   +  Q        
Sbjct: 218 PVGTQGTTTSSNPEPQTEPPPSQRGPSGSPQHPPSTTSQDQSTTGDGQEHTQ-------- 269

Query: 138 APSTPSTPSFPVSDTSETTP 157
              TP   S   S  S  TP
Sbjct: 270 RRKTPPATSNRRSPHSTATP 289


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 34.7 bits (80), Expect = 0.12
 Identities = 21/125 (16%), Positives = 42/125 (33%), Gaps = 5/125 (4%)

Query: 36  EDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEE 95
           E+D  + K                D + +D  +DN +        +     +       +
Sbjct: 189 EEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSPKSGSSSPAKPTS----ILK 244

Query: 96  LEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSET 155
              +     PSSS+A+         R  +  S    KK    + S   + + P S++ +T
Sbjct: 245 KSAAKRSEAPSSSKAKKNSRGIPKPRD-ALSSLVVRKKAAPESTSQSPSSAEPTSESPQT 303

Query: 156 TPSTS 160
             ++S
Sbjct: 304 AGNSS 308


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 35.1 bits (81), Expect = 0.13
 Identities = 25/157 (15%), Positives = 43/157 (27%), Gaps = 5/157 (3%)

Query: 7   ESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDM 66
           ++     DS SS  +       ++ P        +  R    S     S+    +     
Sbjct: 231 DAGASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSS- 289

Query: 67  LQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRG-LSS 125
                      P                    S E  + S+S +  +        G   S
Sbjct: 290 -SPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPS 348

Query: 126 KSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGA 162
           +S  P++ P    PS+P     P    S  +P+ S  
Sbjct: 349 RSPSPSRPPPPADPSSPRKRPRPSRAPS--SPAASAG 383



 Score = 32.8 bits (75), Expect = 0.62
 Identities = 27/158 (17%), Positives = 46/158 (29%), Gaps = 3/158 (1%)

Query: 8   SPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDML 67
           SP  +    +   A+S       SP  +   SP     R  +     S+  S S      
Sbjct: 188 SPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGC 247

Query: 68  QDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPT--PSSSEAEVTPMKNGNKRGLSS 125
                 +  +P  A +   +           S        SSS  E +P  + +  G   
Sbjct: 248 GWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGP 307

Query: 126 KSGQPTKKPKLTAPSTPSTPS-FPVSDTSETTPSTSGA 162
               P      ++    S+ S    S++S     + G 
Sbjct: 308 APSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGP 345



 Score = 29.8 bits (67), Expect = 5.5
 Identities = 17/111 (15%), Positives = 28/111 (25%), Gaps = 5/111 (4%)

Query: 59  SDSEPDDMLQDNGSEDEYVPPKAE-VESESEHSSGEEELEESVEDPTPSSSEAEVTPMKN 117
                DD+L   GS+ + V   AE         +   +  E    P P            
Sbjct: 28  PGDAADDLLS--GSQGQLVSDSAELAAVTVVAGAAACDRFEPPTGPPPGPGTEAPANESR 85

Query: 118 GNK--RGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQDWS 166
                   +     P ++   T P   S    P +    + P +       
Sbjct: 86  STPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSE 136


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 34.8 bits (80), Expect = 0.13
 Identities = 38/152 (25%), Positives = 58/152 (38%), Gaps = 27/152 (17%)

Query: 4   DSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSD--S 61
              +  E K  S+      S+ +  SKS   +EDDS              +    S+  S
Sbjct: 156 KKSKKKEAKESSDKDDEEESESEDESKSEESAEDDS----------DDEEEEDSDSEDYS 205

Query: 62  EPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKR 121
           + D ML    S DE    +A   + +E  + E E +ES  + + S S ++          
Sbjct: 206 QYDGML--VDSSDEEEGEEAPSINYNE-DTSESESDESDSEISESRSVSD---------- 252

Query: 122 GLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTS 153
             S +S  P+KKPK    S+   PS      S
Sbjct: 253 --SEESSPPSKKPKEKKTSSTFLPSLMGGYFS 282


>gnl|CDD|220972 pfam11081, DUF2890, Protein of unknown function (DUF2890).  This
           family is conserved in dsDNA adenoviruses of
           vertebrates. The function is not known.
          Length = 172

 Score = 33.4 bits (76), Expect = 0.15
 Identities = 27/111 (24%), Positives = 45/111 (40%), Gaps = 13/111 (11%)

Query: 44  RPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELE-ESVED 102
            P+  +    K          D  +D  S+ E V    E   + E S  EE+ E E VE+
Sbjct: 2   PPKGNA----KKLKVRPPPTKDEEEDWDSQAEEVEEDEEEMEDWEDSLDEEDEEAEEVEE 57

Query: 103 PTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPK--------LTAPSTPSTP 145
            T +SS+A  +  K+ ++  +S     P ++P            P+T + P
Sbjct: 58  ETAASSKAPSSSSKSSSQETISIPPTPPARRPSRRWDQTGRFPNPTTGAKP 108


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains
           to be determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 34.5 bits (79), Expect = 0.15
 Identities = 28/148 (18%), Positives = 50/148 (33%), Gaps = 16/148 (10%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDD 65
           +E   +   S SS P+ +       SP+ +  +S   +              + + E ++
Sbjct: 407 QERERQGTSSRSSDPSKASSTSGE-SPSMASQESEEEESVEE----------EEEEEEEE 455

Query: 66  MLQDNGSEDEYVPPKAEVESESEHSSGEEELE---ESVEDPTPSSSEAEVTPMKNGNKRG 122
             ++  SE+E    + E E     +  EEE+E   E   D      +AE     +     
Sbjct: 456 EEEEQESEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEEDAERR--NSEMAGI 513

Query: 123 LSSKSGQPTKKPKLTAPSTPSTPSFPVS 150
                GQ  +   +   S    P  P S
Sbjct: 514 SRMSEGQQPRGSSVQPESPQEEPLQPES 541



 Score = 32.6 bits (74), Expect = 0.68
 Identities = 32/161 (19%), Positives = 52/161 (32%), Gaps = 7/161 (4%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDD 65
            E  E + + E     +  G +     +   D          +      + I   SE   
Sbjct: 462 SEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEEDAERRNSEMAGISRMSEGQ- 520

Query: 66  MLQDNGSEDEYVPPKAEVESES--EHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGL 123
                 S     P +  ++ ES    S GEE  EE + + +P SS  E+  +    +  +
Sbjct: 521 -QPRGSSVQPESPQEEPLQPESMDAESVGEESDEELLAEESPLSSHTELEGVATPVETKI 579

Query: 124 SSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQD 164
           SS    P   P   + S  +  +   S T     S    QD
Sbjct: 580 SSSRKLP---PPPVSTSLENDSATVTSTTRNGNVSPHTPQD 617


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 34.6 bits (80), Expect = 0.17
 Identities = 19/107 (17%), Positives = 38/107 (35%), Gaps = 7/107 (6%)

Query: 5    SKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPD 64
            SK    +       +  + K  K  K    S       K+  +K+A++ KS     +   
Sbjct: 1289 SKRPDGESNGGSKPSSPTKKKVK--KRLEGSLAALKKKKKSEKKTARKKKS----KTRVK 1342

Query: 65   DMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAE 111
                   S     P K + +S SE    + E+++S ++      + +
Sbjct: 1343 QASASQSSRLLRRPRKKKSDSSSE-DDDDSEVDDSEDEDDEDDEDDD 1388


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 33.8 bits (78), Expect = 0.23
 Identities = 19/117 (16%), Positives = 38/117 (32%), Gaps = 9/117 (7%)

Query: 4   DSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQS---- 59
           +S    +K   +  +  A +  KK  K    S   +       +         I      
Sbjct: 71  ESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQA 130

Query: 60  -----DSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAE 111
                D + DD+  D+  +D+      E + + +    +EE +E+ E    S  +  
Sbjct: 131 DDDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDF 187


>gnl|CDD|218601 pfam05477, SURF2, Surfeit locus protein 2 (SURF2).  Surfeit locus
           protein 2 is part of a group of at least six sequence
           unrelated genes (Surf-1 to Surf-6). The six Surfeit
           genes have been classified as housekeeping genes, being
           expressed in all tissue types tested and not containing
           a TATA box in their promoter region. The exact function
           of SURF2 is unknown.
          Length = 244

 Score = 32.6 bits (74), Expect = 0.41
 Identities = 26/103 (25%), Positives = 40/103 (38%), Gaps = 7/103 (6%)

Query: 34  KSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGE 93
           + ED     ++P R             S+ DD   ++   D Y P     E  +  + G+
Sbjct: 133 RREDQEDGVRQPGRTEKSGSDFWEPPSSDEDDSDSEDSMSDLYPP-----ELFTLKNPGK 187

Query: 94  EELEESVED-PTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPK 135
           E+  +  +D  T    E EV   +   KR    +SG  TKK K
Sbjct: 188 EQNGDEDDDFETDDEDEMEVESPELQQKRS-KKQSGSLTKKFK 229


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 33.0 bits (75), Expect = 0.50
 Identities = 35/163 (21%), Positives = 53/163 (32%), Gaps = 6/163 (3%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDD 65
           K SPE     E      S+  +  K+  +        K  + K  K       SDS  D 
Sbjct: 303 KLSPEIPAKPEIEQDEDSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDSGDDS 362

Query: 66  MLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSS 125
              D   ED      A    + +    EE ++ +   P  S         K+  KR    
Sbjct: 363 DDSDIDGEDSVSLVTA---KKQKEPKKEEPVDSNPSSPGNSGPARPSPESKDKGKRKA-- 417

Query: 126 KSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQDWSHN 168
            + + +K P         T + P S + ++TP T      S N
Sbjct: 418 -ANEVSKSPASVPAKKLKTENAPKSSSGKSTPQTFSGSKSSSN 459


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 33.0 bits (75), Expect = 0.53
 Identities = 33/185 (17%), Positives = 56/185 (30%), Gaps = 15/185 (8%)

Query: 8   SPEKKGDSESSTPASSKGK-----KTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSE 62
           S   K  SE+S+   +  K     + S         S         S K        + +
Sbjct: 210 SVTDKEKSEASSKGLTSTKELVPVQNSGGNHSLSKSSNSQTPELEYSEKGKDHHHSHNHQ 269

Query: 63  PDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPT-----PSSSEAEVTPMKN 117
              +  +N            +E    HS+       S           SS+ A     K+
Sbjct: 270 HHSIGINNHHSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKETTSNSSSAAAGSIGSKS 329

Query: 118 GNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSE----TTPSTSGAQDWSHNHYQFL 173
                 S+++   +  PK  + +  S PS  VSD        + S+SGA+D   +     
Sbjct: 330 SKSAKHSNRNKSNS-SPKSHSSANGSVPSSSVSDNESKQKRASKSSSGARDSKKDASGMS 388

Query: 174 HPDHI 178
               +
Sbjct: 389 ANGTV 393


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 32.9 bits (75), Expect = 0.53
 Identities = 34/168 (20%), Positives = 53/168 (31%), Gaps = 11/168 (6%)

Query: 6   KESPEKKGDSES---STPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSE 62
           KE P+ +   E      P   K K+  K   +  D     KR R ++  R K   +    
Sbjct: 125 KEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPP 184

Query: 63  PDDMLQDNGSED-----EYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKN 117
                     +      E V  K E    +E    EE+  +  E  T    E E      
Sbjct: 185 NKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSE 244

Query: 118 GNKRGLSSKS---GQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGA 162
            ++R  SS       P+     T  S+  T + P +     +   + A
Sbjct: 245 ISRRSSSSLKKPDPSPSMASPETRESSKRTETRPRTSLRPPSARPASA 292



 Score = 32.2 bits (73), Expect = 0.94
 Identities = 21/124 (16%), Positives = 41/124 (33%), Gaps = 3/124 (2%)

Query: 20  PASSKGKKTSKSPAKSEDDSPVTKRPRRK---SAKRVKSAIQSDSEPDDMLQDNGSEDEY 76
              SKG      PAK   +    +  + K     ++ K   +   EP D      ++++ 
Sbjct: 82  KGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKR 141

Query: 77  VPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKL 136
            P + E E E +     +  EE   +   + S  +  P K    +       +  ++   
Sbjct: 142 PPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQAAR 201

Query: 137 TAPS 140
            A  
Sbjct: 202 EAVK 205



 Score = 29.1 bits (65), Expect = 6.8
 Identities = 27/161 (16%), Positives = 49/161 (30%), Gaps = 6/161 (3%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSKSP-AKSEDDSPVTKRPR-RKSAKRVKSAIQSDSEP 63
           +E  EK+   E       K K+  K    K E       + + ++  K+V+     + E 
Sbjct: 105 EEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEK 164

Query: 64  D-DMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRG 122
             + ++      +   P  +     +    EEE +             E    +   K  
Sbjct: 165 KRERVRAKSRPKK---PPKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEE 221

Query: 123 LSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQ 163
              K  + T  P     S  S+     S +S   P  S + 
Sbjct: 222 DDGKDRETTTSPMEEDESRQSSEISRRSSSSLKKPDPSPSM 262


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 32.7 bits (74), Expect = 0.57
 Identities = 17/88 (19%), Positives = 30/88 (34%)

Query: 7   ESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDM 66
                       TP  ++ K  + S    E+ +    R RRK  K  +      SE +  
Sbjct: 101 NEDAALDKESKKTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVS 160

Query: 67  LQDNGSEDEYVPPKAEVESESEHSSGEE 94
             +       +  ++E E + E   GE+
Sbjct: 161 DVEESEFVTSLENESEEELDLEKDDGED 188


>gnl|CDD|177653 PLN00014, PLN00014, light-harvesting-like protein 3; Provisional.
          Length = 250

 Score = 32.1 bits (73), Expect = 0.64
 Identities = 22/124 (17%), Positives = 37/124 (29%), Gaps = 23/124 (18%)

Query: 77  VPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKL 136
           +  K+E    S   +  +    S   P+P          + G   G ++    P      
Sbjct: 29  LGAKSEGSLVSVTVASTDGGGISERKPSP--------LERGGTLEGEAAAGKDPGPAAAA 80

Query: 137 TAPSTPSTPSFP----VSDTSETTP-STSGAQDWSHNHYQFLHPDHILDADRRSPKHPDY 191
                 S   F      + T +       G  DW          D ++DA+    K  + 
Sbjct: 81  KTSLAVSVGKFEDPRWKNGTWDLNQFKKDGKTDW----------DAVIDAEVVRRKWLED 130

Query: 192 NPKT 195
           NP+T
Sbjct: 131 NPET 134


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 32.1 bits (73), Expect = 0.69
 Identities = 25/108 (23%), Positives = 37/108 (34%), Gaps = 28/108 (25%)

Query: 3   LDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSE 62
           L S +  E+ GD   S          + S    ++D P                 + D +
Sbjct: 203 LSSMDMAEELGDEPES----------ADSEDNEDEDDP-----------------KEDED 235

Query: 63  PDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEA 110
            DD  ++  S       +    S  E  SGE E  E+  D TP S +A
Sbjct: 236 -DDQGEEEESGSSDSLSEDSDASSEEMESGEMEAAEASADDTPDSDDA 282


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E.coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 32.6 bits (74), Expect = 0.71
 Identities = 24/118 (20%), Positives = 40/118 (33%), Gaps = 29/118 (24%)

Query: 3   LDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSE 62
           L S E  E+ GD   S                 +DD P                   + +
Sbjct: 194 LRSMELAEEMGDDTESEDEE-----------DGDDDQP-----------------TENEQ 225

Query: 63  PDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNK 120
            ++  +  G   E   P+    ++ E  SGEEE+ +S +D  P  S+ +      G +
Sbjct: 226 -EEQGEGEGEGQEGSAPQESEATDRESESGEEEMVQSDQDDLPDESDDDSETPGEGAR 282


>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1).  This family
           consists of several mammalian dentin matrix protein 1
           (DMP1) sequences. The dentin matrix acidic
           phosphoprotein 1 (DMP1) gene has been mapped to human
           chromosome 4q21. DMP1 is a bone and teeth specific
           protein initially identified from mineralised dentin.
           DMP1 is primarily localised in the nuclear compartment
           of undifferentiated osteoblasts. In the nucleus, DMP1
           acts as a transcriptional component for activation of
           osteoblast-specific genes like osteocalcin. During the
           early phase of osteoblast maturation, Ca(2+) surges into
           the nucleus from the cytoplasm, triggering the
           phosphorylation of DMP1 by a nuclear isoform of casein
           kinase II. This phosphorylated DMP1 is then exported out
           into the extracellular matrix, where it regulates
           nucleation of hydroxyapatite. DMP1 is a unique molecule
           that initiates osteoblast differentiation by
           transcription in the nucleus and orchestrates
           mineralised matrix formation extracellularly, at later
           stages of osteoblast maturation. The DMP1 gene has been
           found to be ectopically expressed in lung cancer
           although the reason for this is unknown.
          Length = 514

 Score = 32.3 bits (73), Expect = 0.73
 Identities = 40/158 (25%), Positives = 65/158 (41%), Gaps = 6/158 (3%)

Query: 7   ESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQ----SDSE 62
           +S E   + +  +  SS+          SE    V    R  +     S  +    S+S 
Sbjct: 320 QSQEDSQEVQDPSSESSQEADLPSQENSSESQEEVVSESRGDNPDNTTSHSEDQEDSESS 379

Query: 63  PDDMLQDNGSEDEYVPPKAEVESESEHS-SGEEELEESVEDPTPSSSEAEVTPMKNGNKR 121
            +D L D  S  E    + + +SES  S S  EE  ES ED   SS E   +   +   R
Sbjct: 380 EEDSL-DTPSSSESQSTEEQADSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESR 438

Query: 122 GLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPST 159
              S+S Q ++  +  + S  S+ S   S+++E+  S+
Sbjct: 439 SQESQSEQDSRSEEDDSDSQDSSRSKEDSNSTESASSS 476


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 32.3 bits (73), Expect = 0.80
 Identities = 21/109 (19%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKS---AKRVKSAIQSDSE 62
            E+  + G+ ES   A  +G+  +K   +SE + P  ++  ++     +  ++  + ++E
Sbjct: 653 TEAEGENGE-ESGGEAEQEGETETKGENESEGEIPAERKGEQEGEGEIEAKEADHKGETE 711

Query: 63  PDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAE 111
            +++  +  +E E    + E+E+  E    E+E E   E      +E +
Sbjct: 712 AEEVEHEGETEAEGTEDEGEIETGEEGEEVEDEGEGEAEGKHEVETEGD 760


>gnl|CDD|221185 pfam11719, Drc1-Sld2, DNA replication and checkpoint protein.
           Genome duplication is precisely regulated by
           cyclin-dependent kinases (CDKs), which bring about the
           onset of S phase by activating replication origins and
           then prevent relicensing of origins until mitosis is
           completed. The optimum sequence motif for CDK
           phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found
           to have at least 11 potential phosphorylation sites.
           Drc1 is required for DNA synthesis and S-M replication
           checkpoint control. Drc1 associates with Cdc2 and is
           phosphorylated at the onset of S phase when Cdc2 is
           activated. Thus Cdc2 promotes DNA replication by
           phosphorylating Drc1 and regulating its association with
           Cut5. Sld2 and Sld3 represent the minimal set of S-CDK
           substrates required for DNA replication.
          Length = 397

 Score = 32.1 bits (73), Expect = 0.85
 Identities = 36/173 (20%), Positives = 56/173 (32%), Gaps = 28/173 (16%)

Query: 11  KKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPD------ 64
           KK  S  +   S K +K S    +S+  +P  + P         SA++S S         
Sbjct: 45  KKLLSAKTIEPSPKKRKHSSPDGESQS-TPRKRIPSDVDPYDSPSALRSPSSLKTELGPT 103

Query: 65  -----------DMLQDNGSEDEYVPPKAEVESESEHSSGE--------EELEESVEDPTP 105
                      D+L  +   +    P     + S  S+          E L+   ED   
Sbjct: 104 PQRDGKVLSLFDLLSSSTPPES--TPSKRKLASSVASATPFSTPSKRRETLDAEDEDRPE 161

Query: 106 SSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPS 158
               +E TP+ +G K  L       + +     PS     +  VS TS    S
Sbjct: 162 YGPRSERTPLSSGKKVMLDLFFTPTSWRYSSETPSFLRRSNQDVSATSNPLNS 214


>gnl|CDD|180536 PRK06347, PRK06347, autolysin; Reviewed.
          Length = 592

 Score = 32.0 bits (72), Expect = 0.96
 Identities = 26/88 (29%), Positives = 41/88 (46%), Gaps = 3/88 (3%)

Query: 72  SEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPT 131
           S DE  P     +S   +++ E     + E+ T  + E + T  K   K   + +  QP 
Sbjct: 51  SADETAPADEASKSAEANTTKEAPATATPENTTEPTVEPKQTETKEQTK---TPEEKQPA 107

Query: 132 KKPKLTAPSTPSTPSFPVSDTSETTPST 159
            K    AP+ P+T S P + TS +TP+T
Sbjct: 108 AKQVEKAPAEPATVSNPDNATSSSTPAT 135


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 32.0 bits (73), Expect = 1.0
 Identities = 25/145 (17%), Positives = 46/145 (31%), Gaps = 5/145 (3%)

Query: 3   LDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSE 62
               E  E+  + E+  P+     +    P   E ++   K  +    +  K   +SD E
Sbjct: 398 RRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLKKENKNE-FKEKKESDEE 456

Query: 63  PDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDP--TPSSSEAEVTPMKNGNK 120
            +  L+D             ++   +    EEE E   E+P    +SS  +    ++  K
Sbjct: 457 EE--LEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPWLKTTSSVGKSAKKQDSKK 514

Query: 121 RGLSSKSGQPTKKPKLTAPSTPSTP 145
           +  S       K  K          
Sbjct: 515 KSSSKLDKAANKISKAAVKVKKKKK 539


>gnl|CDD|139494 PRK13335, PRK13335, superantigen-like protein; Reviewed.
          Length = 356

 Score = 31.6 bits (71), Expect = 1.0
 Identities = 27/139 (19%), Positives = 50/139 (35%), Gaps = 5/139 (3%)

Query: 28  TSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVESES 87
           T    A+    + V K P  K+ +     I + +        N  ++     +    +  
Sbjct: 25  TQSVKAEKIQSTKVDKVPTLKAERLAMINITAGANSATTQAANTRQERTPKLEKAPNTNE 84

Query: 88  EHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTK-KPKLTAPSTPSTPS 146
           E +S      E +  P     ++         K+  S  + + T  K K+T P + +TP 
Sbjct: 85  EKTSAS--KIEKISQPKQEEQKSLNISATPAPKQEQSQTTTESTTPKTKVTTPPSTNTPQ 142

Query: 147 FPVSDTSET--TPSTSGAQ 163
              S  S+T  +P+   AQ
Sbjct: 143 PMQSTKSDTPQSPTIKQAQ 161


>gnl|CDD|235285 PRK04335, PRK04335, cell division protein ZipA; Provisional.
          Length = 313

 Score = 31.7 bits (72), Expect = 1.1
 Identities = 20/108 (18%), Positives = 43/108 (39%), Gaps = 5/108 (4%)

Query: 9   PEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQ 68
           P  K D +S     +  +  +++P   EDD  + ++ R++     +++       D +  
Sbjct: 39  PLGKLDVDSDDEQPAPERGFAQAP---EDDFEIIRKERKEPDFGRENSFHDPLIDDPLFG 95

Query: 69  DNGSEDEYVPPKAEVE--SESEHSSGEEELEESVEDPTPSSSEAEVTP 114
               E+E    + E     +++     E++E  VE+P       E  P
Sbjct: 96  GELEEEEDKFEQEEAPIPVQAQSQPQPEKVEPQVEEPRDEEVLEEPEP 143


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 32.0 bits (72), Expect = 1.1
 Identities = 33/172 (19%), Positives = 62/172 (36%), Gaps = 38/172 (22%)

Query: 4   DSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQS---- 59
           +S + P KK   E+++P  S  K+  + PA   ++      P R +AK+ K+   S    
Sbjct: 60  ESTKKPNKKIKEEATSPLKS-TKRQREKPASDTEE------PERVTAKKSKTQELSRPNS 112

Query: 60  ------------------------DSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEE 95
                                    S+P D+ QDN S    +P   + ES+S+ S+ ++ 
Sbjct: 113 PSEGEGEGEGEGESSDSRSVNEEGSSDPKDIDQDNRSSSPSIPSPQDNESDSDSSAQQQL 172

Query: 96  LEES---VEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPST 144
           L+          P ++ A   P    + + +  +      +P          
Sbjct: 173 LQPQGPPSIQVPPGAALAPSAPPPTPSAQAVPPQGSPIAAQPAPQPQQPSPL 224



 Score = 30.4 bits (68), Expect = 3.2
 Identities = 42/159 (26%), Positives = 55/159 (34%), Gaps = 20/159 (12%)

Query: 13  GDSESSTPASSKGKKTSKSPAKSEDDSPV--TKRPRRKSAK------RVKSAIQSDSEPD 64
             S SS  + ++  K      K E  SP+  TKR R K A       RV +      E  
Sbjct: 49  AASTSSNDSKAESTKKPNKKIKEEATSPLKSTKRQREKPASDTEEPERVTAKKSKTQELS 108

Query: 65  DMLQDNGSEDEYVPPKAEVESESEHS-SGEEELEESVED---PTPSSSEAEVTPMKNGNK 120
                +  E E      E E ES  S S  EE     +D      SSS +  +P  N + 
Sbjct: 109 RPNSPSEGEGE-----GEGEGESSDSRSVNEEGSSDPKDIDQDNRSSSPSIPSPQDNESD 163

Query: 121 RGLSSKS--GQPTKKPKLTAPS-TPSTPSFPVSDTSETT 156
              S++    QP   P +  P      PS P    S   
Sbjct: 164 SDSSAQQQLLQPQGPPSIQVPPGAALAPSAPPPTPSAQA 202


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 31.8 bits (72), Expect = 1.2
 Identities = 23/155 (14%), Positives = 45/155 (29%), Gaps = 29/155 (18%)

Query: 8    SPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDML 67
             P  +   +   P      + ++   ++   S   +RPRR++A+    ++ S ++P    
Sbjct: 2646 VPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPP--- 2702

Query: 68   QDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKS 127
                      PP    E                  P    S   + P     ++   +  
Sbjct: 2703 ----------PPPPTPEPA----------------PHALVSATPLPPGPAAARQASPALP 2736

Query: 128  GQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGA 162
              P        P+TP  P+ P    +   P     
Sbjct: 2737 AAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAP 2771



 Score = 30.7 bits (69), Expect = 2.9
 Identities = 22/152 (14%), Positives = 38/152 (25%), Gaps = 7/152 (4%)

Query: 8    SPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDML 67
            SP          P  S     ++         P  +RPR   A    S  +         
Sbjct: 2615 SPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAA 2674

Query: 68   QDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKS 127
            Q +          A     S  S  +        +P P         + +         +
Sbjct: 2675 QASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAP-------HALVSATPLPPGPAA 2727

Query: 128  GQPTKKPKLTAPSTPSTPSFPVSDTSETTPST 159
             +        AP+ P+ P+ P +      P+ 
Sbjct: 2728 ARQASPALPAAPAPPAVPAGPATPGGPARPAR 2759


>gnl|CDD|148635 pfam07139, DUF1387, Protein of unknown function (DUF1387).  This
           family represents a conserved region approximately 300
           residues long within a number of hypothetical proteins
           of unknown function that seem to be restricted to
           mammals.
          Length = 301

 Score = 31.5 bits (71), Expect = 1.3
 Identities = 25/139 (17%), Positives = 50/139 (35%)

Query: 26  KKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVES 85
           KK  K  +K + ++P     + ++    ++A   + +  +    NGS D+     +  E 
Sbjct: 7   KKNKKKKSKPKPEAPAKSASKEETTPEEQAAPGDEKDEVNGFHANGSADDTESVDSLSEG 66

Query: 86  ESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTP 145
               S    E E    D  PS S +    + +   +     S   + KP  ++    +  
Sbjct: 67  LDSASLDAREPEAVTLDAPPSPSSSLTNGLSDLQSKLELQSSPHSSAKPHPSSDQHKNAK 126

Query: 146 SFPVSDTSETTPSTSGAQD 164
            +    +   TP+ S   D
Sbjct: 127 KYVSKPSQPVTPNNSAHHD 145


>gnl|CDD|221179 pfam11711, Tim54, Inner membrane protein import complex subunit
           Tim54.  Mitochondrial function depends on the import of
           hundreds of different proteins synthesised in the
           cytosol. Protein import is a multi-step pathway which
           includes the binding of precursor proteins to surface
           receptors, translocation of the precursor across one or
           both mitochondrial membranes, and folding and assembly
           of the imported protein inside the mitochondrion. Most
           precursor proteins carry amino-terminal targeting
           signals, called pre-sequences, and are imported into
           mitochondria via import complexes located in both the
           outer and the inner membrane (IM). The IM complex, TIM,
           is made up of at least two proteins which mediate
           translocation of proteins into the matrix by removing
           their signal peptide and another pair of proteins, Tim54
           and Tim22, that insert the polytopic proteins, that
           carry internal targeting information, into the inner
           membrane.
          Length = 377

 Score = 31.2 bits (71), Expect = 1.3
 Identities = 15/82 (18%), Positives = 25/82 (30%), Gaps = 9/82 (10%)

Query: 77  VPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKL 136
            PP+    +  E ++ E E+E +    +P+    E              +      KP +
Sbjct: 196 DPPEPPEPTVDE-AAPETEVEATPAAESPAEPAEETAETTPEETEDAPEEENNKPVKPPV 254

Query: 137 TAP--------STPSTPSFPVS 150
             P        S P  P  P  
Sbjct: 255 PKPYISPDEYPSAPLPPELPQL 276


>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
          Length = 943

 Score = 31.6 bits (71), Expect = 1.6
 Identities = 27/134 (20%), Positives = 46/134 (34%), Gaps = 17/134 (12%)

Query: 43  KRPRRKSAKRVKSAIQSDSE---------------PDDMLQDNGSEDEYVPPKAEVESES 87
           K+  +KS K++    + DS+               P       G E E+   K   E + 
Sbjct: 486 KKLIKKSKKKLAPIEEEDSDKHDEPPEGPEASGLPPKAPGDKEGEEGEHEDSKESDEPKE 545

Query: 88  EHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKL-TAPSTPSTPS 146
               GE +  E  + P P + E + + +   +K+    K  +  K P+    P  P +  
Sbjct: 546 GGKPGETKEGEVGKKPGP-AKEHKPSKIPTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQ 604

Query: 147 FPVSDTSETTPSTS 160
            P    S   P   
Sbjct: 605 RPTRPKSPKLPELL 618



 Score = 30.8 bits (69), Expect = 2.2
 Identities = 29/152 (19%), Positives = 57/152 (37%), Gaps = 16/152 (10%)

Query: 11  KKGDSESSTPASSKGKKTSKSPAKSEDDSPVT----KRPRRKSAKRVKSAIQSDSEPDDM 66
           K+ D  + T    +  K   SP++ ED  P       + R +      S    +S+   +
Sbjct: 778 KEEDIHAETGEPDEAMKRPDSPSEHEDKPPGDHPSLPKKRHRLDGLALSTTDLESDAGRI 837

Query: 67  LQD----------NGSEDEY--VPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTP 114
            +D          + S D+   V    E+ +E+     +++  E+ ++ T    E   + 
Sbjct: 838 AKDASGKIVKLKRSKSFDDLTTVEEAEEMGAEARKIVVDDDGTEADDEDTHPPEEKHKSE 897

Query: 115 MKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPS 146
           ++        SK  +P+K  K   P +   PS
Sbjct: 898 VRRRRPPKKPSKPKKPSKPKKPKKPDSAFIPS 929



 Score = 30.8 bits (69), Expect = 2.2
 Identities = 28/163 (17%), Positives = 50/163 (30%), Gaps = 7/163 (4%)

Query: 4   DSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEP 63
           D K   + +   +   P S++     KSP K  +   + K P+R  + +          P
Sbjct: 584 DPKHPKDPEEPKKPKRPRSAQRPTRPKSP-KLPELLDIPKSPKRPESPKSPKRPPPPQRP 642

Query: 64  DDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGN---- 119
               +  G +    P   +          +E+  +   D    S E + T + + +    
Sbjct: 643 SSPERPEGPKIIKSPKPPKSPKPPFDPKFKEKFYDDYLDAAAKSKETKTTVVLDESFESI 702

Query: 120 -KRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSE-TTPSTS 160
            K  L    G P   P+   P  P    FP     +       
Sbjct: 703 LKETLPETPGTPFTTPRPLPPKLPRDEEFPFEPIGDPDAEQPD 745



 Score = 30.8 bits (69), Expect = 2.3
 Identities = 32/155 (20%), Positives = 55/155 (35%), Gaps = 30/155 (19%)

Query: 4   DSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSP-----VTKRP----RRKSAKRVK 54
           DSKES E K   E   P  +K  +  K P  +++  P     ++K+P      K  K  +
Sbjct: 536 DSKESDEPK---EGGKPGETKEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDPKHPKDPE 592

Query: 55  SAIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDP-TPSSSEAEVT 113
              +                     +     +S       ++ +S + P +P S +    
Sbjct: 593 EPKKPKRP--------------RSAQRPTRPKSPKLPELLDIPKSPKRPESPKSPKRPPP 638

Query: 114 PMKNGN-KRGLSSKSGQPTKKPKLTAPSTPSTPSF 147
           P +  + +R    K  +  K PK  +P  P  P F
Sbjct: 639 PQRPSSPERPEGPKIIKSPKPPK--SPKPPFDPKF 671


>gnl|CDD|215018 smart01087, COG6, Conserved oligomeric complex COG6.  COG6 is a
           component of the conserved oligomeric golgi complex,
           which is composed of eight different subunits and is
           required for normal golgi morphology and localisation.
          Length = 598

 Score = 31.1 bits (71), Expect = 1.6
 Identities = 16/64 (25%), Positives = 31/64 (48%), Gaps = 13/64 (20%)

Query: 493 MEARHNLR-----------QLFIHKFASLVKSGEKV--DVEELQKALESVKSFESQTKKD 539
           +EAR NLR             F+ +F  + +  +++  DV++L  + +S+K   +  K  
Sbjct: 2   LEARRNLRSDLEKRLLKINGEFLSEFKPVAEQLQRLSEDVQKLNNSCDSMKDQLNTAKNQ 61

Query: 540 LEDL 543
            +DL
Sbjct: 62  TQDL 65


>gnl|CDD|220710 pfam10351, Apt1, Golgi-body localisation protein domain.  This is
           the C-terminus of a family of proteins conserved from
           plants to humans. The plant members are localised to the
           Golgi and appear to regulate membrane
           trafficking, as they are required for rapid vesicle
           accumulation at the tip of the pollen tube. The
           C-terminus probably contains the Golgi localisation
           signal and it is well-conserved.
          Length = 451

 Score = 31.1 bits (71), Expect = 1.7
 Identities = 15/66 (22%), Positives = 28/66 (42%), Gaps = 4/66 (6%)

Query: 3   LDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSE 62
             S     +   +  S+ +SS    +S   +KS D S    + +R S K  KS  +   +
Sbjct: 318 SSSSSGSRRSSSTSRSSSSSSSLLSSSSILSKSSDKS----KDKRFSLKLSKSEKEESDD 373

Query: 63  PDDMLQ 68
            ++M+ 
Sbjct: 374 LEEMIS 379


>gnl|CDD|220402 pfam09787, Golgin_A5, Golgin subfamily A member 5.  Members of this
           family of proteins are involved in maintaining Golgi
           structure. They stimulate the formation of Golgi stacks
           and ribbons, and are involved in intra-Golgi retrograde
           transport. Two main interactions have been
           characterized: one with RAB1A that has been activated by
           GTP-binding and another with isoform CASP of CUTL1.
          Length = 509

 Score = 31.0 bits (70), Expect = 1.9
 Identities = 26/113 (23%), Positives = 45/113 (39%), Gaps = 4/113 (3%)

Query: 16  ESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDE 75
            SS  A S+ +K +     S   SP +K+    +++ + S +   +        +  EDE
Sbjct: 51  ASSNKARSRSEKWNPDQPGSRVSSPSSKKDG--TSRSLSSQVDDLASAVSSQSSSDLEDE 108

Query: 76  YVPPKAEV-ESESEHSSGEEELEE-SVEDPTPSSSEAEVTPMKNGNKRGLSSK 126
               K  + E+  E    + ELE+   E     S E  +  ++ G  R L  K
Sbjct: 109 LAALKIRLQEAAQELRELKSELEDLRSERSRDLSDEESIKRLQRGAVRSLQDK 161


>gnl|CDD|227805 COG5518, COG5518, Bacteriophage capsid portal protein [General
           function prediction only].
          Length = 492

 Score = 31.1 bits (70), Expect = 1.9
 Identities = 20/64 (31%), Positives = 28/64 (43%)

Query: 499 LRQLFIHKFASLVKSGEKVDVEELQKALESVKSFESQTKKDLEDLPGGVAGAGTEWWPSP 558
           L  L I +    +K  + +D E +  AL    +  + +  DL DL G V G   E WP  
Sbjct: 403 LEVLPIQERRLALKGPDWIDPEVIAFALYPFITAGAVSPNDLRDLAGRVLGKTLEEWPEE 462

Query: 559 SNDP 562
            N P
Sbjct: 463 YNRP 466


>gnl|CDD|227931 COG5644, COG5644, Uncharacterized conserved protein [Function
           unknown].
          Length = 869

 Score = 31.2 bits (70), Expect = 2.0
 Identities = 14/58 (24%), Positives = 30/58 (51%), Gaps = 8/58 (13%)

Query: 60  DSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKN 117
            ++ +++L+ + S           +SESE S  E E+E S  D    +S++++  ++N
Sbjct: 147 ATDKENLLESDASSSN--------DSESEESDSESEIESSDSDHDDENSDSKLDNLRN 196


>gnl|CDD|234428 TIGR03979, His_Ser_Rich, His-Xaa-Ser repeat protein HxsA.  Members
           of this family share two defining regions. One is a
           histidine/serine-rich cluster, typically
           H-R-S-H-S-S-H-R-S-H-S-S-H. Members are always found in
           the context of a pair of radical SAM proteins, HxsB and
           HxsC, and a fourth protein HxsD. The system is predicted
           to perform peptide modifications, likely in the
           His-Xaa-Ser region, to produce some uncharacterized
           natural product.
          Length = 186

 Score = 30.2 bits (68), Expect = 2.1
 Identities = 10/40 (25%), Positives = 14/40 (35%), Gaps = 2/40 (5%)

Query: 121 RGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTS 160
            G +S    P   P  +     S  S P   T+   P +S
Sbjct: 80  SGDTSTYSYPVPSPSYSPSPGSSIQSLPS--TTGVRPQSS 117


>gnl|CDD|227578 COG5253, MSS4, Phosphatidylinositol-4-phosphate 5-kinase [Signal
           transduction mechanisms].
          Length = 612

 Score = 30.7 bits (69), Expect = 2.3
 Identities = 32/169 (18%), Positives = 58/169 (34%), Gaps = 13/169 (7%)

Query: 3   LDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSA------ 56
            +  E       S+  T +       SK      +   +    R+  A R++++      
Sbjct: 44  ANGNEYSPNNKVSKKDTFSDQLHDALSKEFTLERERDRLQLNKRKYQAIRLQTSTPIVEI 103

Query: 57  ---IQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHS-SGEEELEESVEDPTPSSSEAEV 112
               +   +P +  + +G+       K       EHS S      +   D  P SS ++ 
Sbjct: 104 FKNNKDAVDPPNHTRSSGNNLSNANVKTLSAPVGEHSRSNNPPNLDQNLDTEPESSISQW 163

Query: 113 TPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPS-FPVSDTSETTPSTS 160
             ++  N  G +  S QP++KP    P + S  S  P S  S     + 
Sbjct: 164 GELQL-NPSGKTLSS-QPSRKPTSENPKSESDNSKLPTSVNSPLPDKSL 210


>gnl|CDD|191309 pfam05597, Phasin, Poly(hydroxyalcanoate) granule associated
           protein (phasin).  Polyhydroxyalkanoates (PHAs) are
           storage polyesters synthesised by various bacteria as
           intracellular carbon and energy reserve material. PHAs
           are accumulated as water-insoluble inclusions within the
           cells. This family consists of the phasins PhaF and PhaI
           which act as transcriptional regulators of PHA
           biosynthesis genes. PhaF has been proposed to repress
           expression of the phaC1 gene and the phaIF operon.
          Length = 132

 Score = 29.6 bits (67), Expect = 2.3
 Identities = 11/43 (25%), Positives = 21/43 (48%)

Query: 507 FASLVKSGEKVDVEELQKALESVKSFESQTKKDLEDLPGGVAG 549
           F +LVK GE+++    + A E V++     K  + ++     G
Sbjct: 41  FEALVKEGEELEKRTRKLAEEQVEAVRESVKSRVSEVKDKAEG 83


>gnl|CDD|218370 pfam04997, RNA_pol_Rpb1_1, RNA polymerase Rpb1, domain 1.  RNA
           polymerases catalyze the DNA dependent polymerisation of
           RNA. Prokaryotes contain a single RNA polymerase
           compared to three in eukaryotes (not including
           mitochondrial and chloroplast polymerases). This
           domain, domain 1, represents the clamp domain, which is a
           mobile domain involved in positioning the DNA,
           maintenance of the transcription bubble and positioning
           of the nascent RNA strand.
          Length = 330

 Score = 30.3 bits (69), Expect = 2.4
 Identities = 19/91 (20%), Positives = 38/91 (41%), Gaps = 10/91 (10%)

Query: 326 KEIQTYLRTQCAHFGCTVIYSEAQKKQKKYVLEVPSKYASKAKSNHQRVAT--KKKNVEN 383
           K+I + LR  C     +++ +E+ K     V+  P    SK +   +++    KKK++ +
Sbjct: 87  KKILSILRCVC-KLCSSLLLNESVKYFFLKVVIDPKGKNSKKR--LKKINNLCKKKSICS 143

Query: 384 YVTPECRGTGTNDGC-----VIARVTLEKFL 409
               +  G    +GC      I++   E   
Sbjct: 144 KCGEDNGGLKAFEGCGKYQPKISKDGAEAIK 174


>gnl|CDD|233044 TIGR00600, rad2, DNA excision repair protein (rad2).  All proteins
           in this family for which functions are known are flap
           endonucleases that generate the 3' incision next to DNA
           damage as part of nucleotide excision repair. This
           family is related to many other flap endonuclease
           families including the fen1 family. This family is based
           on the phylogenomic analysis of JA Eisen (1999, Ph.D.
           Thesis, Stanford University) [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 1034

 Score = 31.0 bits (70), Expect = 2.4
 Identities = 30/160 (18%), Positives = 52/160 (32%), Gaps = 12/160 (7%)

Query: 23  SKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQD-NGSEDEYVPPKA 81
           +      K  + S DD      P +K+   + S I+ + +  D L    G         +
Sbjct: 402 ALDDDEDKKVSASSDDQ---ASPSKKTKMLLISRIEVEDDDLDYLDQGEGIPLMAALQLS 458

Query: 82  EVESESEH---SSGEEELEESVEDPTPSSSEAEVTPMKNGNK-----RGLSSKSGQPTKK 133
            V S+ E    +    E+  S  +  P + ++ +    N +        L  KS    ++
Sbjct: 459 SVNSKPEAVASTKIAREVTSSGHEAVPKAVQSLLLGATNDSPIPSEFTILDRKSELSIER 518

Query: 134 PKLTAPSTPSTPSFPVSDTSETTPSTSGAQDWSHNHYQFL 173
                 S    PS      +  T  T   Q  S +  QF 
Sbjct: 519 TVKPVSSEFGLPSQREDKLAIPTEGTQNLQGISDHPEQFE 558


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 30.1 bits (68), Expect = 2.5
 Identities = 17/90 (18%), Positives = 31/90 (34%), Gaps = 12/90 (13%)

Query: 7   ESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDM 66
               KK +  +   A       +   A +E D    +    K  +  + A  SD++  D 
Sbjct: 69  AHKSKKENKLAIEDADKS----TNLDASNEGDEDDDEEDEIKRKRIEEDARNSDADDSDS 124

Query: 67  LQDNGSEDEYVPPKAEVESESEHSSGEEEL 96
             D+ S D+        +S+ + S  E   
Sbjct: 125 SSDSDSSDD--------DSDDDDSEDETAA 146


>gnl|CDD|221825 pfam12877, DUF3827, Domain of unknown function (DUF3827).  This
           family contains the human KIAA1549 protein which has
           been found to be fused to the BRAF gene in many cases
           of pilocytic astrocytomas. The fusion is due mainly to a
           tandem duplication of 2 Mb at 7q34. Although nothing is
           known about the function of KIAA1549 protein, the BRAF
           protein is a well characterized oncoprotein. It is a
           serine/threonine protein kinase which is implicated in
           MAP/ERK signalling, a critical pathway for the
           regulation of cell division, differentiation and
           secretion.
          Length = 684

 Score = 30.3 bits (68), Expect = 3.4
 Identities = 28/156 (17%), Positives = 54/156 (34%), Gaps = 22/156 (14%)

Query: 8   SPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDML 67
           +P+ K   + S+    +  + S S   SE  S ++ R  R+ + R  +   S +      
Sbjct: 364 TPKSKSSQDGSSNKKRRRGRKSPSDGDSEGSSVISNRSSREKSGRPSTT-PSVTAQQKPT 422

Query: 68  QDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKS 127
           ++ G +    PP      + + SS    + E V+  +  SS+              S K 
Sbjct: 423 KEEGRKKP-APPSGT---DEQLSSA--SIFEHVDRLSRPSSDP---------YDRSSGKI 467

Query: 128 GQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQ 163
                +P       P+ P  P  + S    +    +
Sbjct: 468 QLIAMQP------MPAPPVPPRFEPSRDDRAAENGK 497


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 30.4 bits (68), Expect = 3.7
 Identities = 25/107 (23%), Positives = 42/107 (39%), Gaps = 6/107 (5%)

Query: 1    MYLDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSD 60
            + LD KE    K DS+          + +K  A +E D P+      +    +   IQ D
Sbjct: 3993 LKLDEKEGDVSK-DSDLEDMDMEAADE-NKEEADAEKDEPMQDEDPLEENNTLDEDIQQD 4050

Query: 61   SEPDDMLQDNGSEDEYVPPKAEVESESEHSSG---EEELEESVEDPT 104
             +  D+ +D+   +E    +   E+E     G   +EELE+      
Sbjct: 4051 -DFSDLAEDDEKMNEDGFEENVQENEESTEDGVKSDEELEQGEVPED 4096


>gnl|CDD|234706 PRK00269, zipA, cell division protein ZipA; Reviewed.
          Length = 293

 Score = 29.7 bits (67), Expect = 4.1
 Identities = 26/106 (24%), Positives = 40/106 (37%), Gaps = 17/106 (16%)

Query: 14  DSESSTP---ASSKGKKTSKSPAKSEDDSP-VTKRPR-----RKSAKRVKSAIQSDSEPD 64
           D E  +      ++   T K P   E D P ++ RPR      K+AK+ K    S+ +  
Sbjct: 48  DEEEGSAELLGPARVLDTHKEPQLDEHDLPSMSARPRERRRDTKTAKQQKRGRGSEPQQG 107

Query: 65  DMLQDNGSEDEYVPP-----KAEVESESEHSSGEEELEESVEDPTP 105
           D+   N   DE  P        +   +   S G E   E  ++  P
Sbjct: 108 DL---NLDLDEVEPALFSDRDDDFTPDKRKSKGREPRIEPPKELPP 150


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 30.2 bits (68), Expect = 4.4
 Identities = 21/121 (17%), Positives = 39/121 (32%), Gaps = 7/121 (5%)

Query: 20   PASSKGKKTSKSPAKSEDD-SPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVP 78
             A    KK  + PA +    +      +++    V+S  +  +E     +  G   E   
Sbjct: 1326 LAERLKKKGGRKPAAANKKAAKPPAAAKKRGPATVQSGQKLLTEMLKPAEAIGISPEKKV 1385

Query: 79   PKAEVESESEHSS------GEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTK 132
             K      ++ S          +  ES E+ + SSS  +     +   R   +   Q T 
Sbjct: 1386 RKMRASPFNKKSGSVLGRAATNKETESSENVSGSSSSEKDEIDVSAKPRPQRANRKQTTY 1445

Query: 133  K 133
             
Sbjct: 1446 V 1446



 Score = 29.4 bits (66), Expect = 6.2
 Identities = 19/67 (28%), Positives = 25/67 (37%), Gaps = 4/67 (5%)

Query: 8    SPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRP---RRKSAKRVKSAIQSDSEPD 64
            S   +  +   T  SS+    S S  K E D     RP    RK    V S  +S+S  D
Sbjct: 1399 SVLGRAATNKET-ESSENVSGSSSSEKDEIDVSAKPRPQRANRKQTTYVLSDSESESADD 1457

Query: 65   DMLQDNG 71
                D+ 
Sbjct: 1458 SDFDDDE 1464


>gnl|CDD|176015 cd04050, C2B_Synaptotagmin-like, C2 domain second repeat present
          in Synaptotagmin-like proteins.  Synaptotagmin is a
          membrane-trafficking protein characterized by a
          N-terminal transmembrane region, a linker, and 2
          C-terminal C2 domains. Previously all synaptotagmins
          were thought to be calcium sensors in the regulation of
          neurotransmitter release and hormone secretion, but it
          has been shown that not all of them bind calcium.  Of
          the 17 identified synaptotagmins only 8 bind calcium
          (1-3, 5-7, 9, 10).  The functions of the two C2 domains
          that bind calcium are: regulating the fusion step of
          synaptic vesicle exocytosis (C2A) and  binding to
          phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the
          absence of calcium ions and to phosphatidylinositol
          bisphosphate (PIP2) in their presence (C2B).  C2B also
          regulates the recycling step of synaptic vesicles.
          C2 domains fold into an 8-stranded beta-sandwich that
          can adopt 2 structural arrangements: Type I and Type
          II, distinguished by a circular permutation involving
          their N- and C-terminal beta strands. Many C2 domains
          are Ca2+-dependent membrane-targeting modules that bind
          a wide variety of substances including
          phospholipids, inositol polyphosphates, and
          intracellular proteins.  Most C2 domain proteins are
          either signal transduction enzymes that contain a
          single C2 domain, such as protein kinase C, or membrane
          trafficking proteins which contain at least two C2
          domains, such as synaptotagmin 1.  However, there are a
          few exceptions to this including RIM isoforms and some
          splice variants of piccolo/aczonin and intersectin
          which only have a single C2 domain.  C2 domains with a
          calcium binding region have negatively charged
          residues, primarily aspartates, that serve as ligands
          for calcium ions. This cd contains the second C2
          repeat, C2B, and has a type-I topology.
          Length = 105

 Score = 27.9 bits (63), Expect = 4.5
 Identities = 16/43 (37%), Positives = 22/43 (51%), Gaps = 2/43 (4%)

Query: 1  MYLDSKES-PEKKGDSESSTPAS-SKGKKTSKSPAKSEDDSPV 41
          +YLDS ++ P  K   E S     + GK T KS  K   ++PV
Sbjct: 4  VYLDSAKNLPLAKSTKEPSPYVELTVGKTTQKSKVKERTNNPV 46


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be transcription factors.
          Length = 238

 Score = 29.3 bits (66), Expect = 4.6
 Identities = 25/129 (19%), Positives = 45/129 (34%), Gaps = 11/129 (8%)

Query: 41  VTKRPRRKSA-KRVKSAIQSDSEPDDMLQDNGS------EDEYV----PPKAEVESESEH 89
              R RR +A  R+K  ++ + E D+             ++E+       + EV+S+ + 
Sbjct: 1   AATRARRSNAGNRMKKLLEEELEEDEFFWTYLLFEEEEDDEEFEIEEEEEEEEVDSDFDD 60

Query: 90  SSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPV 149
           S  +E   +  E+        E    K   K     +  +  KK   TA  +P   +   
Sbjct: 61  SEDDEPESDDEEEGEKELQREERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAPRP 120

Query: 150 SDTSETTPS 158
              SE    
Sbjct: 121 KKKSERISW 129


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta, the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 29.4 bits (66), Expect = 5.6
 Identities = 23/119 (19%), Positives = 43/119 (36%), Gaps = 19/119 (15%)

Query: 4   DSKESPEKKGDSESSTPASS--------KGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKS 55
             K +  K+   + S   SS        K +K   S +  +++S      R         
Sbjct: 201 SVKAASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDV------- 253

Query: 56  AIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGE--EELEESVEDPTPSSSEAEV 112
            ++ +S     L ++  EDE  P  +   S+SE  + E  +E  + ++       E E 
Sbjct: 254 ILEDESAEPTGLDEDEDEDE--PKPSGERSDSEEETEEKEKEKRKRLKKMMEDEDEDEE 310


>gnl|CDD|220684 pfam10310, DUF2413, Protein of unknown function (DUF2413).  This is
           a family of proteins conserved in fungi. The function is
           not known.
          Length = 436

 Score = 29.4 bits (66), Expect = 6.2
 Identities = 23/88 (26%), Positives = 31/88 (35%), Gaps = 8/88 (9%)

Query: 3   LDSKESPEKKGDSESSTPASSKG-----KKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAI 57
           LD  E  EK    +    AS  G     KK+SK    S   S       RKSA   +S  
Sbjct: 38  LDELEQSEKAKPPKKPKEASRPGTPRNPKKSSKPTESSAASSEEKPAKPRKSA---ESTR 94

Query: 58  QSDSEPDDMLQDNGSEDEYVPPKAEVES 85
            S  +      ++  E+E       + S
Sbjct: 95  SSHPKSKAPSTESEEEEEPEETPDPIAS 122


>gnl|CDD|113514 pfam04747, DUF612, Protein of unknown function, DUF612.  This
           family includes several uncharacterized proteins from
           Caenorhabditis elegans.
          Length = 517

 Score = 29.3 bits (64), Expect = 6.4
 Identities = 43/165 (26%), Positives = 70/165 (42%), Gaps = 20/165 (12%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSK--SPAKSEDDSPVTKRPRRKSAKRV---KSAIQSD 60
           K   EKK +       + K +KT K  +PA  E++  V K    +SA      K+   + 
Sbjct: 142 KLQAEKKKEKAVKAEKAEKAEKTKKASTPAPVEEEIVVKKVANDRSAAPAPEPKTPTNTP 201

Query: 61  SEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSE--AEVTPMKNG 118
           +EP + +Q+   +      K + +SESE ++    +E+ VE P   + E   +  P +  
Sbjct: 202 AEPAEQVQEITGKKN---KKNKKKSESEATAAPASVEQVVEQPKVVTEEPHQQAAPQEKK 258

Query: 119 NKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQ 163
           NK+           K K  + + P+    PV    ETTP  S  Q
Sbjct: 259 NKK----------NKRKSESENVPAASETPVEPVVETTPPASENQ 293



 Score = 28.9 bits (63), Expect = 9.4
 Identities = 29/140 (20%), Positives = 57/140 (40%), Gaps = 2/140 (1%)

Query: 6   KESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDD 65
           K    KK     +T A +  ++  + P    ++      P+ K  K+ K   +S++ P  
Sbjct: 215 KNKKNKKKSESEATAAPASVEQVVEQPKVVTEEPHQQAAPQEKKNKKNKRKSESENVPAA 274

Query: 66  MLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSS 125
                    E  PP +E + +++    + E E+ VE+P  + +     P  + N   L  
Sbjct: 275 SETPVEPVVETTPPASENQKKNKKDKKKSESEKVVEEPVQAEAPKSKKPTADDNMDFLDF 334

Query: 126 KSGQPTKKPKLTAPSTPSTP 145
            + +  ++PK     TP+ P
Sbjct: 335 VTAK--EEPKDEPAETPAAP 352


>gnl|CDD|223068 PHA03384, PHA03384, early DNA-binding protein E2A; Provisional.
          Length = 445

 Score = 29.3 bits (66), Expect = 6.4
 Identities = 19/90 (21%), Positives = 31/90 (34%), Gaps = 7/90 (7%)

Query: 25  GKKTSKSPAKSEDDSPVT----KRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPK 80
           G+ +S     S DDSP        P++K  +RV    + + E +  +   G      PP 
Sbjct: 3   GRGSSSDSPYSSDDSPSLEPPELPPKKKGRRRVSPVEEEEEEEEAEVVAVGFSY---PPV 59

Query: 81  AEVESESEHSSGEEELEESVEDPTPSSSEA 110
                +          EE   +   S+  A
Sbjct: 60  RISRGKDGKRPVRPLKEEKDSEKKASTEAA 89


>gnl|CDD|221745 pfam12737, Mating_C, C-terminal domain of homeodomain 1.  Mating in
           fungi is controlled by the loci that determine the
           mating type of an individual, and only individuals with
           differing mating types can mate. Basidiomycete fungi
           have evolved a unique mating system, termed tetrapolar
           or bifactorial incompatibility, in which mating type is
           determined by two unlinked loci; compatibility at both
           loci is required for mating to occur. The multi-allelic
           tetrapolar mating system is considered to be a novel
           innovation that could have only evolved once, and is
           thus unique to the mushroom fungi. This domain is
           C-terminal to the homeodomain transcription factor
           region.
          Length = 418

 Score = 29.0 bits (65), Expect = 6.8
 Identities = 34/151 (22%), Positives = 53/151 (35%), Gaps = 20/151 (13%)

Query: 8   SPEKKGDSESSTPASSKGKKTSKSPAKSEDD---SPVTKRPRRKSAKRVKSAIQSDSEPD 64
           SP       S   AS +  K  +S + S+D+      +KRPR         +I S S P 
Sbjct: 87  SPSPSVLDLSPVLASPQTGKRRRSSSPSDDEDEAERPSKRPRS-------DSISSSSSPA 139

Query: 65  DMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLS 124
                   E     P A  + E   +S       S+  P   +     T      KR LS
Sbjct: 140 KP-----PEACLPSPAASTQDELSEASAAPLPTPSLSPPHTPTD----TAPSGKRKRRLS 190

Query: 125 SKSGQP-TKKPKLTAPSTPSTPSFPVSDTSE 154
                P  K+P+ ++     +   P+  T++
Sbjct: 191 DGFQLPAPKRPQTSSRPQTVSDPLPLHATTD 221


>gnl|CDD|239091 cd02407, PTH2_family, Peptidyl-tRNA hydrolase, type 2 (PTH2)_like.
           Peptidyl-tRNA hydrolase activity releases tRNA from the
           premature translation termination product peptidyl-tRNA.
           Two structurally different enzymes have been reported to
           encode such activity, Pth present in bacteria and
           eukaryotes and Pth2 present in archaea and eukaryotes.
          Length = 115

 Score = 27.9 bits (63), Expect = 7.0
 Identities = 19/54 (35%), Positives = 22/54 (40%), Gaps = 18/54 (33%)

Query: 334 TQCAHFGCTVIYSEAQKK------------QKKYVLEVPS-----KYASKAKSN 370
            QCAH      Y +A K             QKK VL+VPS     + A KAK  
Sbjct: 20  AQCAH-AALAAYKKAMKDPPTLLRAWELEGQKKVVLKVPSEEELLELAKKAKEL 72


>gnl|CDD|233045 TIGR00601, rad23, UV excision repair protein Rad23.  All proteins
           in this family for which functions are known are
           components of a multiprotein complex used for targeting
           nucleotide excision repair to specific parts of the
           genome. In humans, Rad23 complexes with the XPC protein.
           This family is based on the phylogenomic analysis of JA
           Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 378

 Score = 29.1 bits (65), Expect = 7.0
 Identities = 11/42 (26%), Positives = 14/42 (33%)

Query: 118 GNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPST 159
              +  + K   P   P      TPS P+ P S  S    S 
Sbjct: 75  SKPKTGTGKVAPPAATPTSAPTPTPSPPASPASGMSAAPASA 116


>gnl|CDD|227925 COG5638, COG5638, Uncharacterized conserved protein [Function
           unknown].
          Length = 622

 Score = 29.4 bits (65), Expect = 7.0
 Identities = 20/60 (33%), Positives = 25/60 (41%), Gaps = 11/60 (18%)

Query: 74  DEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKK 133
           DEY P + E    +  SS E   EES E+     SE          K G   + G PTK+
Sbjct: 100 DEYDPARGEGIISTSESSDESR-EESEEEKANEISE----------KAGAVPEEGNPTKR 148


>gnl|CDD|234698 PRK00236, xerC, site-specific tyrosine recombinase XerC; Reviewed.
          Length = 297

 Score = 29.0 bits (66), Expect = 7.1
 Identities = 12/33 (36%), Positives = 17/33 (51%), Gaps = 4/33 (12%)

Query: 170 YQFLHPDHILDADR----RSPKHPDYNPKTLYV 198
           Y++L    +L A+     R+PK P   PK L V
Sbjct: 86  YRWLVRRGLLKANPAAGLRAPKIPKRLPKPLDV 118


>gnl|CDD|200550 cd10924, CE4_COG4878, Putative NodB-like catalytic domain of
           uncharacterized proteins found in bacteria.  The family
           corresponds to a group of uncharacterized bacterial
           proteins with high sequence similarity to the catalytic
           domain of the six-stranded barrel rhizobial NodB-like
           proteins, which remove N-linked or O-linked acetyl
           groups from cell wall polysaccharides and belong to the
           larger carbohydrate esterase 4 (CE4) superfamily.
          Length = 273

 Score = 28.8 bits (65), Expect = 7.1
 Identities = 10/19 (52%), Positives = 13/19 (68%), Gaps = 4/19 (21%)

Query: 165 WSHNHYQFLHPDHILDADR 183
            SH    F+HPD +LDA+R
Sbjct: 228 VSH----FVHPDDVLDAER 242


>gnl|CDD|237284 PRK13108, PRK13108, prolipoprotein diacylglyceryl transferase;
           Reviewed.
          Length = 460

 Score = 29.2 bits (65), Expect = 7.2
 Identities = 17/153 (11%), Positives = 39/153 (25%), Gaps = 15/153 (9%)

Query: 15  SESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSED 74
           + S+      G      P +  DD     +          +A       D   +   + +
Sbjct: 311 AASAVGPVGPG-----EPNQP-DDVAEAVKAEVAEVTDEVAAESVVQVADRDGESTPAVE 364

Query: 75  EYVPP---KAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLSSKSGQPT 131
           E       + +    +  +    +++       P    A  +   +       ++   P 
Sbjct: 365 ETSEADIEREQPGDLAGQAPAAHQVDAEAASAAPEEPAALASEAHDE------TEPEVPE 418

Query: 132 KKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQD 164
           K   +  P+ P   +          P     QD
Sbjct: 419 KAAPIPDPAKPDELAVAGPGDDPAEPDGIRRQD 451


>gnl|CDD|216648 pfam01690, PLRV_ORF5, Potato leaf roll virus readthrough protein.
           This family consists mainly of the potato leaf roll
           virus readthrough protein. This is generated via a
           readthrough of open reading frame 3 a coat protein
           allowing transcription of open reading frame 5 to give
           an extended coat protein with a large c-terminal
           addition or read through domain. The readthrough protein
           is thought to play a role in the circulative aphid
           transmission of potato leaf roll virus. Also in the
           family is open reading frame 6 from beet western yellows
           virus and potato leaf roll virus both luteovirus and an
           unknown protein from cucurbit aphid-borne yellows virus
           a closterovirus.
          Length = 460

 Score = 28.9 bits (65), Expect = 7.3
 Identities = 29/161 (18%), Positives = 50/161 (31%), Gaps = 12/161 (7%)

Query: 42  TKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVE 101
            +RP R     V S  +    P+   Q +  +    PP+ E  S++E+      ++E  +
Sbjct: 244 VRRPPRPGHLLVVSTPELVLPPEPDEQSSERQTFKTPPQPESSSDAENGLV-SLVDEDDK 302

Query: 102 DPTPSSSEAEVTPMKNGNKRGLSSKSGQPTKKPK-----LTAPSTPSTPSFPVSDTSETT 156
           +     SE++  P      R L+       + P      L     PS           T 
Sbjct: 303 EEVSRDSESDAPPDDTDLTRALAEYEAAAPEVPDAARTVLQGKEQPSPDPVESPGPDLTP 362

Query: 157 PSTSGAQDWSHNHYQFLHPDHILDADRRSPKHPDYNPKTLY 197
                      +     +      A+ R P   D  P T+ 
Sbjct: 363 GY------PKSDEVAGTYLGGGSVAEGRDPLEADPTPSTVL 397


>gnl|CDD|191602 pfam06752, E_Pc_C, Enhancer of Polycomb C-terminus.  This family
           represents the C-terminus of eukaryotic enhancer of
           polycomb proteins, which have roles in heterochromatin
           formation. This family contains several conserved
           motifs.
          Length = 230

 Score = 28.8 bits (64), Expect = 7.4
 Identities = 20/96 (20%), Positives = 33/96 (34%), Gaps = 9/96 (9%)

Query: 118 GNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPS--------TSGAQDWSHNH 169
           G  +GL             +  S+  +PS  +  +   T +         +  Q    N+
Sbjct: 89  GVYKGLHLTRSAVPTFLPSSGGSSAGSPSGSLVRSPGHTSTNHLVPALGPASPQVLPGNN 148

Query: 170 YQFLHPDHILDADRRSPKHPDYNPKTL-YVPPEFLK 204
                P H+      SP +  + P+TL  VP   LK
Sbjct: 149 ICLSVPSHLSTVSAVSPLNVRHIPRTLAPVPSSALK 184


>gnl|CDD|165564 PHA03309, PHA03309, transcriptional regulator ICP4; Provisional.
          Length = 2033

 Score = 29.4 bits (65), Expect = 7.6
 Identities = 32/151 (21%), Positives = 58/151 (38%), Gaps = 13/151 (8%)

Query: 12   KGDSESSTPASSKGKKTSKSPAKSEDDS--PVTKRPRRKSAKRVKSAIQSDSEPDDMLQD 69
            +  S SS+ +SS     S  P++S   S  P    PRR    R +S  + + +     + 
Sbjct: 1815 RSSSSSSSSSSSSSSSPSSRPSRSATPSLSPSPSPPRRAPVDRSRSGRRRERD-----RP 1869

Query: 70   NGSEDEYVPPKAEVESESEHSSGEEELEESVED------PTPSSSEAEVTPMKNGNKRGL 123
            + +   + P +      S   +   +   ++ED      P  + S A   P ++G +  +
Sbjct: 1870 SANPFRWAPRQRSRADHSPDGTAPGDAPLNLEDGPGRGRPIWTPSSATTLPSRSGPEDSV 1929

Query: 124  SSKSGQPTKKPKLTAPSTPSTPSFPVSDTSE 154
                 + +  P   APS   T     S+ SE
Sbjct: 1930 DETETEDSAPPARLAPSPLETSRAEDSEDSE 1960


>gnl|CDD|233367 TIGR01349, PDHac_trf_mito, pyruvate dehydrogenase complex
           dihydrolipoamide acetyltransferase, long form.  This
           model represents one of several closely related clades
           of the dihydrolipoamide acetyltransferase subunit of the
           pyruvate dehydrogenase complex. It includes sequences
           from mitochondria and from alpha and beta branches of
           the proteobacteria, as well as from some other bacteria.
           Sequences from Gram-positive bacteria are not included.
           The non-enzymatic homolog protein X, which serves as an
           E3 component binding protein, falls within the clade
           phylogenetically but is rejected by its low score
           [Energy metabolism, Pyruvate dehydrogenase].
          Length = 436

 Score = 29.0 bits (65), Expect = 7.8
 Identities = 13/52 (25%), Positives = 17/52 (32%), Gaps = 4/52 (7%)

Query: 109 EAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTS 160
           E      KN         S  P  KP   AP+ P +   P     + +P  S
Sbjct: 79  EDVADAFKNYK----LESSASPAPKPSEIAPTAPPSAPKPSPAPQKQSPEPS 126


>gnl|CDD|227091 COG4748, COG4748, Uncharacterized conserved protein [Function
           unknown].
          Length = 365

 Score = 29.0 bits (65), Expect = 8.1
 Identities = 21/64 (32%), Positives = 33/64 (51%), Gaps = 7/64 (10%)

Query: 39  SPVTKRPRR-----KSAKRVKSAIQSDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGE 93
           + + K         +   R+KSA +S+   D  ++DN  +DE V  K   ESES+  + E
Sbjct: 214 TDIVKNAFSQFINDRVNDRLKSAKKSEDTVDSGIKDNNIKDENV--KIFEESESDIITTE 271

Query: 94  EELE 97
           EE+E
Sbjct: 272 EEIE 275


>gnl|CDD|215456 PLN02850, PLN02850, aspartate-tRNA ligase.
          Length = 530

 Score = 28.9 bits (65), Expect = 8.1
 Identities = 21/86 (24%), Positives = 32/86 (37%), Gaps = 6/86 (6%)

Query: 16 ESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEPDDMLQDNGSEDE 75
           S       G+K SK  AK        K  + +     K+A  S  + DD L  N  +  
Sbjct: 1  SSQEAVEESGEKISKKAAKKA----AAKAEKLRREATAKAAAASLEDEDDPLASNYGDVP 56

Query: 76 YVPPKAEVESE--SEHSSGEEELEES 99
              +++V     ++ S   EEL  S
Sbjct: 57 LEELQSKVTGREWTDVSDLGEELAGS 82


>gnl|CDD|222483 pfam13971, MEI4-Rec24, Microtubule-binding domain of katanin.  This
           is the C-terminal domain of katanin - a heterodimeric
           microtubule-severing ATPase. Katanin is found localised
           at mitotic spindle bodies. This domain binds
           microtubules by interacting specifically with the
           N-terminal domain of p60 katanin.
          Length = 375

 Score = 28.7 bits (64), Expect = 8.6
 Identities = 23/111 (20%), Positives = 34/111 (30%), Gaps = 17/111 (15%)

Query: 75  EYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNGNKRGLS-SKSGQPTKK 133
           E +  K + + E       E LE    +P     E     +KN N  G + S +     K
Sbjct: 44  EALAKKLKDQDEGWKKK-AELLEA---EPLQLRQEL----LKNRNSAGCAKSGAKVFPAK 95

Query: 134 ------PKLTAPSTPSTPSFPVSDTSETTPSTSGA--QDWSHNHYQFLHPD 176
                        T      P    S+  P    A  ++   +H QFL   
Sbjct: 96  LLDQDPTSSENDETLLEELGPTPPNSQRVPKRPQADIENPFSSHMQFLQCL 146


>gnl|CDD|237632 PRK14164, PRK14164, heat shock protein GrpE; Provisional.
          Length = 218

 Score = 28.2 bits (63), Expect = 8.7
 Identities = 19/99 (19%), Positives = 39/99 (39%), Gaps = 9/99 (9%)

Query: 4   DSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDSEP 63
            S   P+  GD E++ P ++   +   +  ++     V +      A   +     +++ 
Sbjct: 3   TSNGMPDNPGDPENTDPEATSADRAEAAAEEAALAQGVPEDDPFGDAVEGEIDPDLEADL 62

Query: 64  DDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVED 102
           +D+L D            E+  + E S+ E +L E  ED
Sbjct: 63  EDLLDD---------VDPELADDGEASTVEAQLAERTED 92


>gnl|CDD|233787 TIGR02223, ftsN, cell division protein FtsN.  FtsN is a poorly
           conserved protein active in cell division in a number of
           Proteobacteria. The N-terminal 30 residue region tends
           to be Lys/Arg-rich, and is followed by a
           membrane-spanning region. This is followed by an acidic
           low-complexity region of variable length and a
           well-conserved C-terminal domain of two tandem regions
           matched by pfam05036 (Sporulation related repeat), found
           in several cell division and sporulation proteins. The
           role of FtsN as a suppressor for other cell division
           mutations is poorly understood; it may involve cell wall
           hydrolysis [Cellular processes, Cell division].
          Length = 298

 Score = 28.5 bits (63), Expect = 8.8
 Identities = 19/178 (10%), Positives = 46/178 (25%), Gaps = 17/178 (9%)

Query: 5   SKESPEKKGDSESSTPASSKGKKTSKSPAKSEDD--SPVTKRPRRKSAKRVKSAIQSDSE 62
           SK++ E +     +   + +         +        +  R    +     S      E
Sbjct: 51  SKQANEPETLQPKNQTENGETAADLPPKPEERWSYIEELEAREVLINDPEEPSNGGGVEE 110

Query: 63  PDDMLQDNGSE------DEYVPPKAEVESESEHSSGEEELEESVEDP---------TPSS 107
              +  +          D     K    + SE +   E  +++ E             + 
Sbjct: 111 SAQLTAEQRQLLEQMQADMRAAEKVLATAPSEQTVAVEARKQTAEKKPQKARTAEAQKTP 170

Query: 108 SEAEVTPMKNGNKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQDW 165
            E E    K    +       + T + +  +    + P    +D ++  P     +  
Sbjct: 171 VETEKIASKVKEAKQKQKALPKQTAETQSNSKPIETAPKADKADKTKPKPKEKAERAA 228


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 28.8 bits (65), Expect = 9.1
 Identities = 14/90 (15%), Positives = 31/90 (34%), Gaps = 9/90 (10%)

Query: 54  KSAIQSDSEPDDMLQDNGSE--DEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAE 111
           +     + E D +  ++  +  D+    + +V+   E    E+E  +  +D      E E
Sbjct: 318 QGEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEEDEDSDDEDDEEEEEEEKE 377

Query: 112 VTPMKNGNKRGLSSKSGQPTKKPKLTAPST 141
               K       S++S +         P +
Sbjct: 378 KKKKK-------SAESTRSELPFTFPCPKS 400


>gnl|CDD|223378 COG0301, ThiI, Thiamine biosynthesis ATP pyrophosphatase [Coenzyme
           metabolism].
          Length = 383

 Score = 28.8 bits (65), Expect = 9.4
 Identities = 31/117 (26%), Positives = 46/117 (39%), Gaps = 31/117 (26%)

Query: 248 ACSYMKESGC----TGESTLLTQLCNYESQTPSGCFPDMSELLKYFENAFDHKEASSAGN 303
           A    +E G     TGES  L Q+    SQT         E L+  ++        +   
Sbjct: 271 AEKLAEEFGAKAIVTGES--LGQV---ASQTL--------ENLRVIDSV-------TNTP 310

Query: 304 II-PKAGVDKEYDEVMDEIKSIEKEIQTYLRTQCAHFGCTVIYSEAQKKQKKYVLEV 359
           ++ P  G+DKE      EI  I + I TY  +      C VI++    K K  ++E 
Sbjct: 311 VLRPLIGLDKE------EIIEIARRIGTYEISIEPPEDCCVIFAPPTPKTKPKLIEA 361


>gnl|CDD|113839 pfam05084, GRA6, Granule antigen protein (GRA6).  This family
           contains the granule antigen protein GRA6 which is found
           in the parasitic protozoa Toxoplasma gondii and Neospora
           caninum. GRA6 protein plays an important role in the
           antigenicity and pathogenicity in these organisms.
          Length = 217

 Score = 28.2 bits (62), Expect = 9.6
 Identities = 28/122 (22%), Positives = 47/122 (38%), Gaps = 27/122 (22%)

Query: 2   YLDSKESPEKKGDSESSTPASSKGKKTSKSPAKSEDDSPVTKRPRRKSAKRVKSAIQSDS 61
              S E PE  G SE    +S+         A++++++               +A + D 
Sbjct: 56  TGSSGEPPEAVGTSEDYVNSSALAGGQDDGLAEADEEA---------------AAAEGDV 100

Query: 62  EPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEV--TPMKNGN 119
           +P  +L D            E  +E++  S EE +EE+ + P PS  +     TP K   
Sbjct: 101 DPFPVLIDAE----------EGAAEAQGPSLEERIEEADDAPKPSPVQEAQAKTPAKRQQ 150

Query: 120 KR 121
            R
Sbjct: 151 AR 152


>gnl|CDD|227952 COG5665, NOT5, CCR4-NOT transcriptional regulation complex, NOT5
           subunit [Transcription].
          Length = 548

 Score = 28.9 bits (64), Expect = 9.8
 Identities = 22/153 (14%), Positives = 42/153 (27%), Gaps = 11/153 (7%)

Query: 59  SDSEPDDMLQDNGSEDEYVPPKAEVESESEHSSGEEELEESVEDPTPSSSEAEVTPMKNG 118
           ++      ++ +  ++     KA     S        +   VE  + S S    TP    
Sbjct: 214 NNQTSLSSIRSSKKQERSPKKKAPQRDVSISDRATTPIAPGVESASQSISS---TPTPVS 270

Query: 119 NKRGLSSKSGQPTKKPKLTAPSTPSTPSFPVSDTSETTPSTSGAQDWSHNHYQFLHPDHI 178
               L +      K       ST  TP+  VS   + + + S  Q     ++     D I
Sbjct: 271 TDTPLHTVKDDSIK----FDNSTLGTPTTHVSMKKKESENDSEQQ----LNFPKDSTDEI 322

Query: 179 LDADRRSPKHPDYNPKTLYVPPEFLKKQTPCMG 211
               +   +        L+         +    
Sbjct: 323 RKTIQHDVETNAAFQNPLFNDELKWWLASKRYL 355


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.311    0.128    0.375 

Gapped
Lambda     K      H
   0.267   0.0724    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 28,391,621
Number of extensions: 2697572
Number of successful extensions: 3268
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3096
Number of HSP's successfully gapped: 226
Length of query: 577
Length of database: 10,937,602
Length adjustment: 102
Effective length of query: 475
Effective length of database: 6,413,494
Effective search space: 3046409650
Effective search space used: 3046409650
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.8 bits)
S2: 62 (27.7 bits)