RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy1544
         (1331 letters)



>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
            chain; Provisional.
          Length = 1033

 Score =  524 bits (1351), Expect = e-166
 Identities = 257/588 (43%), Positives = 367/588 (62%), Gaps = 35/588 (5%)

Query: 497  DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSIAHTVHEIVTEQASILVNGK 556
            D+ +   K K  G + +K   + ED+EY K   +             ++ + + I   GK
Sbjct: 116  DQSASAKKAKGRGRHASKLTEEEEDEEYLKEEEDGLGGSG----GTRLLVQPSCI--KGK 169

Query: 557  LKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIVPLS 616
            +++YQ+ GL W++ L+ N +NGILADEMGLGKT+QTI+L+ YL E + + GP +++ P S
Sbjct: 170  MRDYQLAGLNWLIRLYENGINGILADEMGLGKTLQTISLLGYLHEYRGITGPHMVVAPKS 229

Query: 617  TLSNWSLEFERWAPSVNVVAYKGSPHLRKTLQAQ-MKASKFNVLLTTYEYVIKDKGPLAK 675
            TL NW  E  R+ P +  V + G+P  R   + + + A KF+V +T++E  IK+K  L +
Sbjct: 230  TLGNWMNEIRRFCPVLRAVKFHGNPEERAHQREELLVAGKFDVCVTSFEMAIKEKTALKR 289

Query: 676  LHWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKLPELWALLNFLLPSI 735
              W+Y+IIDE HR+KN +  L+  +  F    +RLL+TGTPLQN L ELWALLNFLLP I
Sbjct: 290  FSWRYIIIDEAHRIKNENSLLSKTMRLF-STNYRLLITGTPLQNNLHELWALLNFLLPEI 348

Query: 736  FKSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFLLRRLKKEVESQLPDKV 795
            F S  TF++WF      +GE    N+++ +  +++LHKVLRPFLLRRLK +VE  LP K 
Sbjct: 349  FSSAETFDEWF----QISGE----NDQQEV--VQQLHKVLRPFLLRRLKSDVEKGLPPKK 398

Query: 796  EYIIKCDMSGLQKVLYRHMHTKGILLTDGSEKGKQGKGGAKALMNTIVQLRKLCNHPFMF 855
            E I+K  MS +QK  Y+ +  K + + +         G  K L+N  +QLRK CNHP++F
Sbjct: 399  ETILKVGMSQMQKQYYKALLQKDLDVVNAG-------GERKRLLNIAMQLRKCCNHPYLF 451

Query: 856  QNIEEKFSDHVGGSGIVSGPDLYRVSGKFELLDRILPKLKSTGHRVLLFCQMTQLMNILE 915
            Q  E        G    +G  L   SGK  LLD++LPKLK    RVL+F QMT+L++ILE
Sbjct: 452  QGAEP-------GPPYTTGEHLVENSGKMVLLDKLLPKLKERDSRVLIFSQMTRLLDILE 504

Query: 916  DYFSYRGFKYMRLDGTTKAEDRGDLLKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVII 975
            DY  YRG++Y R+DG T  EDR   +  FN P SE F+F+LSTRAGGLG+NL TAD VI+
Sbjct: 505  DYLMYRGYQYCRIDGNTGGEDRDASIDAFNKPGSEKFVFLLSTRAGGLGINLATADIVIL 564

Query: 976  FDSDWNPHQDLQAQDRAHRIGQKNEVRVLRLMTVNSVEERILAAARYKLNMDEKVIQAG- 1034
            +DSDWNP  DLQAQDRAHRIGQK EV+V R  T  ++EE+++  A  KL +D  VIQ G 
Sbjct: 565  YDSDWNPQVDLQAQDRAHRIGQKKEVQVFRFCTEYTIEEKVIERAYKKLALDALVIQQGR 624

Query: 1035 MFDQKSTGSERHQFLQTILHQDDEEDEEENAVPDDETVNQMLARSEEE 1082
            + +QK+   +  + LQ + +  +     +++   DE +++++A+ EE 
Sbjct: 625  LAEQKTVNKD--ELLQMVRYGAEMVFSSKDSTITDEDIDRIIAKGEEA 670


>gnl|CDD|215770 pfam00176, SNF2_N, SNF2 family N-terminal domain.  This domain is
           found in proteins involved in a variety of processes
           including transcription regulation (e.g., SNF2, STH1,
           brahma, MOT1), DNA repair (e.g. ERCC6, RAD16, RAD5), DNA
           recombination (e.g. RAD54), and chromatin unwinding
           (e.g. ISWI) as well as a variety of other proteins with
           little functional information (e.g. lodestar, ETL1).
          Length = 301

 Score =  389 bits (1000), Expect = e-125
 Identities = 147/302 (48%), Positives = 197/302 (65%), Gaps = 7/302 (2%)

Query: 560 YQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALI-TYLMEKKKVNGPFLIIVPLSTL 618
           YQ++G+ W++SL +N L GILADEMGLGKT+QTIAL+ TYL E K   GP L++ PLSTL
Sbjct: 1   YQLEGVNWLISLESNGLGGILADEMGLGKTLQTIALLATYLKEGKDRRGPTLVVCPLSTL 60

Query: 619 SNWSLEFERWAPSVNVVAYKGSPHLRKTLQAQM--KASKFNVLLTTYEYVIKDK---GPL 673
            NW  EFE+WAP++ VV Y G    R  L+  M  +   ++V++TTYE + KDK     L
Sbjct: 61  HNWLNEFEKWAPALRVVVYHGDGRERSKLRQSMAKRLDTYDVVITTYEVLRKDKKLLSLL 120

Query: 674 AKLHWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKLPELWALLNFLLP 733
            K+ W  +++DE HR+KN   KL   L       +RLLLTGTP+QN L ELWALLNFL P
Sbjct: 121 NKVEWDRVVLDEAHRLKNSKSKLYKALKKLK-TRNRLLLTGTPIQNNLEELWALLNFLRP 179

Query: 734 SIFKSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFLLRRLKKEVESQLPD 793
             F S   FE+WFN P A T +    N E+    I RLHK+L+PFLLRR K +VE  LP 
Sbjct: 180 GPFGSFKVFEEWFNIPIANTADNKNKNLEKGKEGINRLHKLLKPFLLRRTKDDVEKSLPP 239

Query: 794 KVEYIIKCDMSGLQKVLYRHMHTKGILLTDGSEKGKQGKGGAKALMNTIVQLRKLCNHPF 853
           K E+++ C++S  Q+ LY+ + TK  L    + +G +   G  +L+N I+QLRK+CNHP+
Sbjct: 240 KTEHVLYCNLSDEQRKLYKKLLTKRRLALSFAVEGGEKNVGIASLLNLIMQLRKICNHPY 299

Query: 854 MF 855
           +F
Sbjct: 300 LF 301


>gnl|CDD|223627 COG0553, HepA, Superfamily II DNA/RNA helicases, SNF2 family
            [Transcription / DNA replication, recombination, and
            repair].
          Length = 866

 Score =  353 bits (906), Expect = e-105
 Identities = 204/520 (39%), Positives = 290/520 (55%), Gaps = 38/520 (7%)

Query: 555  GKLKEYQIKGLEWMVSLFN-NNLNGILADEMGLGKTIQTIALITYLMEKKKVN-GPFLII 612
             +L+ YQ++G+ W+  L   N L GILAD+MGLGKT+QTIAL+  L+E  KV  GP LI+
Sbjct: 337  AELRPYQLEGVNWLSELLRSNLLGGILADDMGLGKTVQTIALLLSLLESIKVYLGPALIV 396

Query: 613  VPLSTLSNWSLEFERWAPSVN-VVAYKGSPHL----RKTLQAQMKASK---FNVLLTTYE 664
            VP S LSNW  EFE++AP +  V+ Y G        R+ L+  +K      F+V++TTYE
Sbjct: 397  VPASLLSNWKREFEKFAPDLRLVLVYHGEKSELDKKREALRDLLKLHLVIIFDVVITTYE 456

Query: 665  YVIK---DKGPLAKLHWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKL 721
             + +   D G L K+ W  +++DE HR+KN        L     A +RL LTGTPL+N+L
Sbjct: 457  LLRRFLVDHGGLKKIEWDRVVLDEAHRIKNDQSSEGKALQFL-KALNRLDLTGTPLENRL 515

Query: 722  PELWALLN-FLLPSIF-KSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFL 779
             ELW+LL  FL P +   S + F + F  P     E+     E   L I  L K+L PF+
Sbjct: 516  GELWSLLQEFLNPGLLGTSFAIFTRLFEKPIQA--EEDIGPLEARELGIELLRKLLSPFI 573

Query: 780  LRRLKKEVE--SQLPDKVEYIIKCDMSGLQKVLY-RHMHTKGILLTD------GSEKGKQ 830
            LRR K++VE   +LP K+E +++C++S  Q+ LY   +                     +
Sbjct: 574  LRRTKEDVEVLKELPPKIEKVLECELSEEQRELYEALLEGAEKNQQLLEDLEKADSDENR 633

Query: 831  GKGGAKALMNTIVQLRKLCNHPFMFQNIEEKFSDHVGGSGIVSGPDLYRV-------SGK 883
                   ++  + +LR++CNHP +     E   D +           Y          GK
Sbjct: 634  IGDSELNILALLTRLRQICNHPALVDEGLEATFDRIVLLLREDKDFDYLKKPLIQLSKGK 693

Query: 884  FELLDRIL-PKLKSTGH--RVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDL 940
             + LD +L  KL   GH  +VL+F Q T ++++LEDY    G KY+RLDG+T A+ R +L
Sbjct: 694  LQALDELLLDKLLEEGHYHKVLIFSQFTPVLDLLEDYLKALGIKYVRLDGSTPAKRRQEL 753

Query: 941  LKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIGQKNE 1000
            + +FNA D E  +F+LS +AGGLGLNL  ADTVI+FD  WNP  +LQA DRAHRIGQK  
Sbjct: 754  IDRFNA-DEEEKVFLLSLKAGGLGLNLTGADTVILFDPWWNPAVELQAIDRAHRIGQKRP 812

Query: 1001 VRVLRLMTVNSVEERILAAARYKLNMDEKVIQAGMFDQKS 1040
            V+V RL+T  ++EE+IL     K  + + +I A    + S
Sbjct: 813  VKVYRLITRGTIEEKILELQEKKQELLDSLIDAEGEKELS 852


>gnl|CDD|99947 cd05516, Bromo_SNF2L2, Bromodomain, SNF2L2-like subfamily, specific
            to animals. SNF2L2 (SNF2-alpha) or SWI/SNF-related
            matrix-associated actin-dependent regulator of chromatin
            subfamily A member 2 is a global transcriptional
            activator, which cooperates with nuclear hormone
            receptors to boost transcriptional activation.
            Bromodomains are 110 amino acid long domains, that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 107

 Score =  153 bits (389), Expect = 1e-43
 Identities = 64/107 (59%), Positives = 85/107 (79%)

Query: 1217 KLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDG 1276
            +L K + KI+ VVIKY DSDGR L+E FI+LPSRKELP+YYE+I +P+D KKI  RI + 
Sbjct: 1    ELTKKMNKIVDVVIKYKDSDGRQLAEVFIQLPSRKELPEYYELIRKPVDFKKIKERIRNH 60

Query: 1277 KYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRV 1323
            KY S+++L+KD   LC+NAQ +N E SLI+EDS+VL+SVF  ARQ++
Sbjct: 61   KYRSLEDLEKDVMLLCQNAQTFNLEGSLIYEDSIVLQSVFKSARQKI 107


>gnl|CDD|214692 smart00487, DEXDc, DEAD-like helicases superfamily. 
          Length = 201

 Score =  123 bits (310), Expect = 8e-32
 Identities = 51/200 (25%), Positives = 90/200 (45%), Gaps = 16/200 (8%)

Query: 556 KLKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQ-TIALITYLMEKKKVNGPFLIIVP 614
            L+ YQ + +E ++S      + ILA   G GKT+   +  +  L   K   G  L++VP
Sbjct: 8   PLRPYQKEAIEALLS---GLRDVILAAPTGSGKTLAALLPALEALKRGKG--GRVLVLVP 62

Query: 615 LSTL-SNWSLEFERWAPS--VNVVAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKD-- 669
              L   W+ E ++  PS  + VV   G    R+ L+ ++++ K ++L+TT   ++    
Sbjct: 63  TRELAEQWAEELKKLGPSLGLKVVGLYGGDSKREQLR-KLESGKTDILVTTPGRLLDLLE 121

Query: 670 KGPLAKLHWKYMIIDEGHRMKN--HHCKLTHILNTFYVAPHRLLLTGTPLQNKLPELWAL 727
              L+  +   +I+DE HR+ +     +L  +L         LLL+ TP +     L   
Sbjct: 122 NDKLSLSNVDLVILDEAHRLLDGGFGDQLEKLLKLLPKNVQLLLLSATPPEEIENLLELF 181

Query: 728 LN--FLLPSIFKSVSTFEQW 745
           LN    +   F  +   EQ+
Sbjct: 182 LNDPVFIDVGFTPLEPIEQF 201


>gnl|CDD|238034 cd00079, HELICc, Helicase superfamily c-terminal domain; associated
            with DEXDc-, DEAD-, and DEAH-box proteins, yeast
            initiation factor 4A, Ski2p, and Hepatitis C virus NS3
            helicases; this domain is found in a wide variety of
            helicases and helicase related proteins; may not be an
            autonomously folding unit, but an integral part of the
            helicase; 4 helicase superfamilies at present according
            to the organization of their signature motifs; all
            helicases share the ability to unwind nucleic acid
            duplexes with a distinct directional polarity; they
            utilize the free energy from nucleoside triphosphate
            hydrolysis to fuel their translocation along DNA,
            unwinding the duplex in the process.
          Length = 131

 Score =  113 bits (286), Expect = 2e-29
 Identities = 36/124 (29%), Positives = 58/124 (46%), Gaps = 3/124 (2%)

Query: 881  SGKFELLDRILPKLKSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDL 940
              K E L  +L +    G +VL+FC   ++++ L +     G K   L G    E+R ++
Sbjct: 11   DEKLEALLELLKEHLKKGGKVLIFCPSKKMLDELAELLRKPGIKVAALHGDGSQEEREEV 70

Query: 941  LKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIGQKNE 1000
            LK F   +    + +++T     G++L     VI +D  W+P   LQ   RA R GQK  
Sbjct: 71   LKDFREGE---IVVLVATDVIARGIDLPNVSVVINYDLPWSPSSYLQRIGRAGRAGQKGT 127

Query: 1001 VRVL 1004
              +L
Sbjct: 128  AILL 131


>gnl|CDD|238005 cd00046, DEXDc, DEAD-like helicases superfamily. A diverse family
           of proteins involved in ATP-dependent RNA or DNA
           unwinding. This domain contains the ATP-binding region.
          Length = 144

 Score =  108 bits (272), Expect = 2e-27
 Identities = 35/146 (23%), Positives = 60/146 (41%), Gaps = 9/146 (6%)

Query: 577 NGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIVPLSTLSNWSLEFERWAPS--VNV 634
           + +LA   G GKT+  +  I  L++  K  G  L++ P   L+N   E  +      + V
Sbjct: 2   DVLLAAPTGSGKTLAALLPILELLDSLK-GGQVLVLAPTRELANQVAERLKELFGEGIKV 60

Query: 635 VAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKDKGPLAKL--HWKYMIIDEGHRMKNH 692
               G        Q ++ + K ++++ T   ++ +   L         +I+DE HR+ N 
Sbjct: 61  GYLIGG--TSIKQQEKLLSGKTDIVVGTPGRLLDELERLKLSLKKLDLLILDEAHRLLNQ 118

Query: 693 HCKLT--HILNTFYVAPHRLLLTGTP 716
              L    IL         LLL+ TP
Sbjct: 119 GFGLLGLKILLKLPKDRQVLLLSATP 144


>gnl|CDD|99946 cd05515, Bromo_polybromo_V, Bromodomain, polybromo repeat V.
            Polybromo is a nuclear protein of unknown function, which
            contains 6 bromodomains. The human ortholog BAF180 is
            part of a SWI/SNF chromatin-remodeling complex, and it
            may carry out the functions of Yeast Rsc-1 and Rsc-2. It
            was shown that polybromo bromodomains bind to histone H3
            at specific acetyl-lysine positions. Bromodomains are
            found in many chromatin-associated proteins and in
            nuclear histone acetyltransferases. They interact
            specifically with acetylated lysine, but not all the
            bromodomains in polybromo may bind to acetyl-lysine.
          Length = 105

 Score =  106 bits (266), Expect = 4e-27
 Identities = 41/105 (39%), Positives = 68/105 (64%)

Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
            +++ L ++   V  YTD  GR LS  F++LPS+ E PDYY+VI +P+D++KI  +IE  +
Sbjct: 1    MQQKLWELYNAVKNYTDGRGRRLSLIFMRLPSKSEYPDYYDVIKKPIDMEKIRSKIEGNQ 60

Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQR 1322
            Y S+D++  DF  +  NA  YNE  S I++D++ L+ V  + ++ 
Sbjct: 61   YQSLDDMVSDFVLMFDNACKYNEPDSQIYKDALTLQKVLLETKRE 105


>gnl|CDD|197636 smart00297, BROMO, bromo domain. 
          Length = 107

 Score =  105 bits (265), Expect = 6e-27
 Identities = 46/107 (42%), Positives = 70/107 (65%), Gaps = 2/107 (1%)

Query: 1217 KLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDG 1276
            KL+K L+++++ V+   DS    LS PF+K  SRKE PDYY++I +PMD+K I  ++E+G
Sbjct: 3    KLQKKLQELLKAVLDKLDSHP--LSWPFLKPVSRKEAPDYYDIIKKPMDLKTIKKKLENG 60

Query: 1277 KYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRV 1323
            KYSSV+E   DF  +  NA+ YN   S +++D+  LE  F K  + +
Sbjct: 61   KYSSVEEFVADFNLMFSNARTYNGPDSEVYKDAKKLEKFFEKKLREL 107


>gnl|CDD|99922 cd04369, Bromodomain, Bromodomain. Bromodomains are found in many
            chromatin-associated proteins and in nuclear histone
            acetyltransferases. They interact specifically with
            acetylated lysine.
          Length = 99

 Score =  102 bits (255), Expect = 1e-25
 Identities = 36/105 (34%), Positives = 59/105 (56%), Gaps = 8/105 (7%)

Query: 1214 DQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRI 1273
             + KL+  L  + ++            SEPF++    KE PDYYEVI  PMD+  I  ++
Sbjct: 1    LKKKLRSLLDALKKLKRDL--------SEPFLEPVDPKEAPDYYEVIKNPMDLSTIKKKL 52

Query: 1274 EDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTK 1318
            ++G+Y S++E + D + +  NA+ YN   S I++D+  LE +F K
Sbjct: 53   KNGEYKSLEEFEADVRLIFSNAKTYNGPGSPIYKDAKKLEKLFEK 97


>gnl|CDD|99950 cd05519, Bromo_SNF2, Bromodomain, SNF2-like subfamily, specific to
            fungi. SNF2 is a yeast protein involved in
            transcriptional activation, it is the catalytic component
            of the SWI/SNF ATP-dependent chromatin remodeling
            complex. The protein is essential for the regulation of
            gene expression (both positive and negative) of a large
            number of genes. The SWI/SNF complex changes chromatin
            structure by altering DNA-histone contacts within the
            nucleosome, which results in a re-positioning of the
            nucleosome and facilitates or represses the binding of
            gene-specific transcription factors. Bromodomains are 110
            amino acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 103

 Score = 97.4 bits (243), Expect = 5e-24
 Identities = 42/102 (41%), Positives = 63/102 (61%)

Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
            LK  + +I   V+   D  GR LSE F++ PS+K  PDYY +I RP+ + +I  RIE   
Sbjct: 1    LKAAMLEIYDAVLNCEDETGRKLSELFLEKPSKKLYPDYYVIIKRPIALDQIKRRIEGRA 60

Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKA 1319
            Y S++E  +DF  +  NA+ YN+E S+++ED+V +E  F K 
Sbjct: 61   YKSLEEFLEDFHLMFANARTYNQEGSIVYEDAVEMEKAFKKK 102


>gnl|CDD|201125 pfam00271, Helicase_C, Helicase conserved C-terminal domain.  The
           Prosite family is restricted to DEAD/H helicases,
           whereas this domain family is found in a wide variety of
           helicases and helicase related proteins. It may be that
           this is not an autonomously folding unit, but an
           integral part of the helicase.
          Length = 78

 Score = 95.3 bits (238), Expect = 1e-23
 Identities = 25/81 (30%), Positives = 37/81 (45%), Gaps = 3/81 (3%)

Query: 916 DYFSYRGFKYMRLDGTTKAEDRGDLLKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVII 975
                 G K  RL G    E+R ++L+ F    S     +++T   G G++L   + VI 
Sbjct: 1   KLLRKPGIKVARLHGGLSQEEREEILEDFRNGKS---KVLVATDVAGRGIDLPDVNLVIN 57

Query: 976 FDSDWNPHQDLQAQDRAHRIG 996
           +D  WNP   +Q   RA R G
Sbjct: 58  YDLPWNPASYIQRIGRAGRAG 78


>gnl|CDD|197757 smart00490, HELICc, helicase superfamily c-terminal domain. 
          Length = 82

 Score = 94.6 bits (236), Expect = 3e-23
 Identities = 28/84 (33%), Positives = 39/84 (46%), Gaps = 3/84 (3%)

Query: 913 ILEDYFSYRGFKYMRLDGTTKAEDRGDLLKKFNAPDSEYFIFVLSTRAGGLGLNLQTADT 972
            L +     G K  RL G    E+R ++L KFN         +++T     GL+L   D 
Sbjct: 2   ELAELLKELGIKVARLHGGLSQEEREEILDKFNNGKI---KVLVATDVAERGLDLPGVDL 58

Query: 973 VIIFDSDWNPHQDLQAQDRAHRIG 996
           VII+D  W+P   +Q   RA R G
Sbjct: 59  VIIYDLPWSPASYIQRIGRAGRAG 82


>gnl|CDD|99949 cd05518, Bromo_polybromo_IV, Bromodomain, polybromo repeat IV.
            Polybromo is a nuclear protein of unknown function, which
            contains 6 bromodomains. The human ortholog BAF180 is
            part of a SWI/SNF chromatin-remodeling complex, and it
            may carry out the functions of Yeast Rsc-1 and Rsc-2. It
            was shown that polybromo bromodomains bind to histone H3
            at specific acetyl-lysine positions. Bromodomains are
            found in many chromatin-associated proteins and in
            nuclear histone acetyltransferases. They interact
            specifically with acetylated lysine, but not all the
            bromodomains in polybromo may bind to acetyl-lysine.
          Length = 103

 Score = 95.2 bits (237), Expect = 3e-23
 Identities = 42/103 (40%), Positives = 67/103 (65%)

Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
             KK +  +   V++Y +  GR L + F++ PS+K+ PDYY++I  P+D+K I   I + K
Sbjct: 1    RKKRMLALFLYVLEYREGSGRRLCDLFMEKPSKKDYPDYYKIILEPIDLKTIEHNIRNDK 60

Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKAR 1320
            Y++ +EL  DFK + RNA+ YNEE S ++ED+ +LE V  + R
Sbjct: 61   YATEEELMDDFKLMFRNARHYNEEGSQVYEDANILEKVLKEKR 103


>gnl|CDD|99948 cd05517, Bromo_polybromo_II, Bromodomain, polybromo repeat II.
            Polybromo is a nuclear protein of unknown function, which
            contains 6 bromodomains. The human ortholog BAF180 is
            part of a SWI/SNF chromatin-remodeling complex, and it
            may carry out the functions of Yeast Rsc-1 and Rsc-2. It
            was shown that polybromo bromodomains bind to histone H3
            at specific acetyl-lysine positions. Bromodomains are
            found in many chromatin-associated proteins and in
            nuclear histone acetyltransferases. They interact
            specifically with acetylated lysine, but not all the
            bromodomains in polybromo may bind to acetyl-lysine.
          Length = 103

 Score = 90.2 bits (224), Expect = 2e-21
 Identities = 40/103 (38%), Positives = 69/103 (66%)

Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
            LK+ L++++  V+  TD  GR++SE F KLPS+   PDYY VI  P+D+K I  RI+ G 
Sbjct: 1    LKQILEQLLEAVMTATDPSGRLISELFQKLPSKVLYPDYYAVIKEPIDLKTIAQRIQSGY 60

Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKAR 1320
            Y S+++++KD   + +NA+ +NE  S +++D+  ++ +FT  +
Sbjct: 61   YKSIEDMEKDLDLMVKNAKTFNEPGSQVYKDANAIKKIFTAKK 103


>gnl|CDD|99954 cd05524, Bromo_polybromo_I, Bromodomain, polybromo repeat I.
            Polybromo is a nuclear protein of unknown function, which
            contains 6 bromodomains. The human ortholog BAF180 is
            part of a SWI/SNF chromatin-remodeling complex, and it
            may carry out the functions of Yeast Rsc-1 and Rsc-2. It
            was shown that polybromo bromodomains bind to histone H3
            at specific acetyl-lysine positions. Bromodomains are
            found in many chromatin-associated proteins and in
            nuclear histone acetyltransferases. They interact
            specifically with acetylated lysine, but not all the
            bromodomains in polybromo may bind to acetyl-lysine.
          Length = 113

 Score = 83.2 bits (206), Expect = 6e-19
 Identities = 39/104 (37%), Positives = 61/104 (58%)

Query: 1225 IMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDEL 1284
            +   +  Y   DGR+L E FI++P R+  P+YYEV+  P+D+ KI  +++  +Y  VD+L
Sbjct: 10   LYDTIRNYKSEDGRILCESFIRVPKRRNEPEYYEVVSNPIDLLKIQQKLKTEEYDDVDDL 69

Query: 1285 QKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRVESGED 1328
              DF+ L  NA+ Y +  S  H+D+  L  +F  AR  V SG +
Sbjct: 70   TADFELLINNAKAYYKPDSPEHKDACKLWELFLSARNEVLSGGE 113


>gnl|CDD|214727 smart00573, HSA, domain in helicases and associated with SANT
           domains. 
          Length = 73

 Score = 81.3 bits (201), Expect = 1e-18
 Identities = 31/73 (42%), Positives = 49/73 (67%)

Query: 297 QKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQ 356
           QK+E ER+++Q     +  ++ H KDFKE H+   A   ++ KAVM+YH N EKE+++ +
Sbjct: 1   QKLEEERRRKQHWDHLLEEMIWHAKDFKEEHKWKIAAAKKMAKAVMDYHQNKEKEEERRE 60

Query: 357 ERIEKERMRRLMA 369
           E+ EK R+R+L A
Sbjct: 61  EKNEKRRLRKLAA 73


>gnl|CDD|215921 pfam00439, Bromodomain, Bromodomain.  Bromodomains are 110 amino acid
            long domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 84

 Score = 80.5 bits (199), Expect = 2e-18
 Identities = 32/77 (41%), Positives = 46/77 (59%)

Query: 1233 TDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLC 1292
             D     L+EPF++    +E PDYYEVI  PMD+  I  +++ GKY S+ E  KD + + 
Sbjct: 6    EDLMEHPLAEPFLEPVDPEEYPDYYEVIKEPMDLSTIRQKLKSGKYKSLAEFLKDVELIF 65

Query: 1293 RNAQIYNEELSLIHEDS 1309
             NA  YN E S I++D+
Sbjct: 66   SNAITYNGEDSDIYKDA 82


>gnl|CDD|227408 COG5076, COG5076, Transcription factor involved in chromatin
            remodeling, contains bromodomain [Chromatin structure and
            dynamics / Transcription].
          Length = 371

 Score = 87.2 bits (216), Expect = 6e-18
 Identities = 58/265 (21%), Positives = 97/265 (36%), Gaps = 31/265 (11%)

Query: 1057 DEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAER---RKEQGKKSRLIEVSELPDWL 1113
                    +V  +E  N++L     +  +    +A      K   +K       E     
Sbjct: 7    SYSQLGRPSVLKEEFGNELLRL--VDNDSSPFPNAPEEEGSKNLFQKQLKRMPKEY-ITS 63

Query: 1114 IKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE 1173
            I +D                L    G         T S  EK   +++     +D+    
Sbjct: 64   IVDDR----EPGSMANVNDDLENVGG--------ITYSPFEKNRPESLR----FDEIVFL 107

Query: 1174 EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYT 1233
              E V  +             +     +K  K +++    D        K I +   +  
Sbjct: 108  AIESVTPESGLGSLLMAHL--KTSVKKRKTPKIEDELLYADN-------KAIAKFKKQLF 158

Query: 1234 DSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCR 1293
              DGR LS  F+ LPS++E PDYYE+I  PMD+  I  ++++G+Y S +E   D   +  
Sbjct: 159  LRDGRFLSSIFLGLPSKREYPDYYEIIKSPMDLLTIQKKLKNGRYKSFEEFVSDLNLMFD 218

Query: 1294 NAQIYNEELSLIHEDSVVLESVFTK 1318
            N ++YN   S ++ D+  LE  F K
Sbjct: 219  NCKLYNGPDSSVYVDAKELEKYFLK 243



 Score = 31.7 bits (72), Expect = 2.4
 Identities = 19/89 (21%), Positives = 38/89 (42%), Gaps = 3/89 (3%)

Query: 1195 EEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELP 1254
            E+       +  +E       +      ++  R  +  T+S   V + PF++  S +E+P
Sbjct: 238  EKYFLKLIEEIPEEMLEL---SIKPGREEREERESVLITNSQAHVGAWPFLRPVSDEEVP 294

Query: 1255 DYYEVIDRPMDIKKILGRIEDGKYSSVDE 1283
            DYY+ I  PMD+     ++ +  Y   + 
Sbjct: 295  DYYKDIRDPMDLSTKELKLRNNYYRPEET 323


>gnl|CDD|99953 cd05522, Bromo_Rsc1_2_II, Bromodomain, repeat II in Rsc1/2_like
            subfamily, specific to fungi. Rsc1 and Rsc2 are
            components of the RSC complex (remodeling the structure
            of chromatin), are essential for transcriptional control,
            and have a specific domain architecture including two
            bromodomains. The RSC complex has also been linked to
            homologous recombination and nonhomologous end-joining
            repair of DNA double strand breaks. Bromodomains are 110
            amino acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 104

 Score = 79.6 bits (197), Expect = 1e-17
 Identities = 32/96 (33%), Positives = 54/96 (56%)

Query: 1223 KKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVD 1282
            K I++ + K  D +GR+L+  F KLP +   P+YY+ I  P+ +  I  +++  KY S D
Sbjct: 7    KNILKGLRKERDENGRLLTLHFEKLPDKAREPEYYQEISNPISLDDIKKKVKRRKYKSFD 66

Query: 1283 ELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTK 1318
            +   D   +  NA++YNE  S  ++D+V+LE     
Sbjct: 67   QFLNDLNLMFENAKLYNENDSQEYKDAVLLEKEARL 102


>gnl|CDD|99951 cd05520, Bromo_polybromo_III, Bromodomain, polybromo repeat III.
            Polybromo is a nuclear protein of unknown function, which
            contains 6 bromodomains. The human ortholog BAF180 is
            part of a SWI/SNF chromatin-remodeling complex, and it
            may carry out the functions of Yeast Rsc-1 and Rsc-2. It
            was shown that polybromo bromodomains bind to histone H3
            at specific acetyl-lysine positions. Bromodomains are
            found in many chromatin-associated proteins and in
            nuclear histone acetyltransferases. They interact
            specifically with acetylated lysine, but not all the
            bromodomains in polybromo may bind to acetyl-lysine.
          Length = 103

 Score = 76.2 bits (188), Expect = 1e-16
 Identities = 29/83 (34%), Positives = 59/83 (71%)

Query: 1233 TDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLC 1292
             ++ G++L+EPF+KLPS+++ PDYY+ I  P+ +++I  ++++G+Y +++EL+ D   + 
Sbjct: 16   RNNQGQLLAEPFLKLPSKRKYPDYYQEIKNPISLQQIRTKLKNGEYETLEELEADLNLMF 75

Query: 1293 RNAQIYNEELSLIHEDSVVLESV 1315
             NA+ YN   S I++D+  L+ +
Sbjct: 76   ENAKRYNVPNSRIYKDAEKLQKL 98


>gnl|CDD|99955 cd05525, Bromo_ASH1, Bromodomain; ASH1_like sub-family. ASH1 (absent,
            small, or homeotic 1) is a member of the trithorax-group
            in Drosophila melanogaster, an epigenetic transcriptional
            regulator of HOX genes. Drosophila ASH1 has been shown to
            methylate specific lysines in histones H3 and H4.
            Mammalian ASH1 has been shown to methylate histone H3.
            Bromodomains are 110 amino acid long domains, that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 106

 Score = 74.3 bits (183), Expect = 6e-16
 Identities = 40/106 (37%), Positives = 59/106 (55%)

Query: 1216 AKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIED 1275
            A+L + LK+I   +I Y DS+G+ L+ PFI LPS+K+ PDYYE I  P+D+  I  +I  
Sbjct: 1    ARLAQVLKEICDAIITYKDSNGQSLAIPFINLPSKKKNPDYYERITDPVDLSTIEKQILT 60

Query: 1276 GKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQ 1321
            G Y + +    D   + RNA+ Y    S I  D   L   + +A+ 
Sbjct: 61   GYYKTPEAFDSDMLKVFRNAEKYYGRKSPIGRDVCRLRKAYYQAKH 106


>gnl|CDD|219455 pfam07529, HSA, HSA.  This domain is predicted to bind DNA and is
           often found associated with helicases.
          Length = 73

 Score = 71.2 bits (175), Expect = 3e-15
 Identities = 22/73 (30%), Positives = 42/73 (57%)

Query: 297 QKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQ 356
           Q++E E++++      +  +    KDF+E  +   A+  +L +AV  YH   EKE+++ +
Sbjct: 1   QRLEEEQREKTHWDHLLEEMKWMSKDFREERKWKIAKAKKLARAVAQYHKYIEKEEQRRK 60

Query: 357 ERIEKERMRRLMA 369
           ER  K+R++ L A
Sbjct: 61  EREAKQRLKALKA 73


>gnl|CDD|99941 cd05509, Bromo_gcn5_like, Bromodomain; Gcn5_like subfamily. Gcn5p is
            a histone acetyltransferase (HAT) which mediates
            acetylation of histones at lysine residues; such
            acetylation is generally correlated with the activation
            of transcription. Bromodomains are 110 amino acid long
            domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 101

 Score = 68.7 bits (169), Expect = 6e-14
 Identities = 25/77 (32%), Positives = 46/77 (59%)

Query: 1243 PFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEEL 1302
            PF++   ++E PDYY+VI +PMD+  +  ++E+G Y +++E   D K +  N ++YN   
Sbjct: 21   PFLEPVDKEEAPDYYDVIKKPMDLSTMEEKLENGYYVTLEEFVADLKLIFDNCRLYNGPD 80

Query: 1303 SLIHEDSVVLESVFTKA 1319
            +  ++ +  LE  F K 
Sbjct: 81   TEYYKCANKLEKFFWKK 97


>gnl|CDD|99952 cd05521, Bromo_Rsc1_2_I, Bromodomain, repeat I in Rsc1/2_like
            subfamily, specific to fungi. Rsc1 and Rsc2 are
            components of the RSC complex (remodeling the structure
            of chromatin), are essential for transcriptional control,
            and have a specific domain architecture including two
            bromodomains. The RSC complex has also been linked to
            homologous recombination and nonhomologous end-joining
            repair of DNA double strand breaks. Bromodomains are 110
            amino acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 106

 Score = 59.6 bits (145), Expect = 1e-10
 Identities = 29/98 (29%), Positives = 53/98 (54%), Gaps = 2/98 (2%)

Query: 1217 KLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDG 1276
            KL K LK +   +    + +G  +   F  LP RK+ PDYY++I  P+ +  +  R+   
Sbjct: 1    KLSKKLKPLYDGIYTLKEENGIEIHPIFNVLPLRKDYPDYYKIIKNPLSLNTVKKRLP-- 58

Query: 1277 KYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLES 1314
             Y++  E   D   +  NA++YN + S+I++ +++LE 
Sbjct: 59   HYTNAQEFVNDLAQIPWNARLYNTKGSVIYKYALILEK 96


>gnl|CDD|99936 cd05504, Bromo_Acf1_like, Bromodomain; Acf1_like or BAZ1A_like
            subfamily. Bromo adjacent to zinc finger 1A (BAZ1A) was
            identified as a novel human bromodomain gene by cDNA
            library screening. The Drosophila homologue, Acf1, is
            part of the CHRAC (chromatin accessibility complex) and
            regulates ISWI-induced nucleosome remodeling.
            Bromodomains are 110 amino acid long domains, that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 115

 Score = 59.3 bits (144), Expect = 1e-10
 Identities = 25/78 (32%), Positives = 45/78 (57%)

Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
            S PF++  S+ E+PDYY++I +PMD+  I  ++  G+Y   +E   D + +  N  +YN 
Sbjct: 30   SWPFLRPVSKIEVPDYYDIIKKPMDLGTIKEKLNMGEYKLAEEFLSDIQLVFSNCFLYNP 89

Query: 1301 ELSLIHEDSVVLESVFTK 1318
            E + +++    L+  F K
Sbjct: 90   EHTSVYKAGTRLQRFFIK 107


>gnl|CDD|99945 cd05513, Bromo_brd7_like, Bromodomain, brd7_like subgroup. The BRD7
            gene encodes a nuclear protein that has been shown to
            inhibit cell growth and the progression of the cell cycle
            by regulating cell-cycle genes at the transcriptional
            level. BRD7 has been identified as a gene involved in
            nasopharyngeal carcinoma. The protein interacts with
            acetylated histone H3 via its bromodomain. Bromodomains
            are 110 amino acid long domains that are found in many
            chromatin associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 98

 Score = 55.5 bits (134), Expect = 2e-09
 Identities = 18/46 (39%), Positives = 27/46 (58%)

Query: 1254 PDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYN 1299
            P Y  +I  PMD   +  +I++  Y S++E + DFK +C NA  YN
Sbjct: 32   PGYSSIIKHPMDFSTMKEKIKNNDYQSIEEFKDDFKLMCENAMKYN 77


>gnl|CDD|203672 pfam07533, BRK, BRK domain.  The function of this domain is
           unknown. It is often found associated with helicases and
           transcription factors.
          Length = 45

 Score = 52.3 bits (126), Expect = 8e-09
 Identities = 16/45 (35%), Positives = 26/45 (57%)

Query: 446 SQLTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVAD 490
           S   +  + V    +GK L G+DAP    L++W+Q++PG+EV   
Sbjct: 1   SLDGEERVPVINRKTGKRLTGDDAPKLKDLERWLQENPGYEVDPR 45


>gnl|CDD|99937 cd05505, Bromo_WSTF_like, Bromodomain; Williams syndrome
            transcription factor-like subfamily (WSTF-like). The
            Williams-Beuren syndrome deletion transcript 9 is a
            putative transcriptional regulator. WSTF was found to
            play a role in vitamin D-mediated transcription as part
            of two chromatin remodeling complexes, WINAC and WICH.
            Bromodomains are 110 amino acid long domains, that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 97

 Score = 54.1 bits (130), Expect = 8e-09
 Identities = 26/76 (34%), Positives = 38/76 (50%), Gaps = 6/76 (7%)

Query: 1225 IMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDEL 1284
            I+  ++KY  S       PF +  +  E  DY +VI  PMD++ +  +   G YSSV E 
Sbjct: 8    ILSKILKYRFS------WPFREPVTADEAEDYKKVITNPMDLQTMQTKCSCGSYSSVQEF 61

Query: 1285 QKDFKTLCRNAQIYNE 1300
              D K +  NA+ Y E
Sbjct: 62   LDDMKLVFSNAEKYYE 77


>gnl|CDD|99942 cd05510, Bromo_SPT7_like, Bromodomain; SPT7_like subfamily. SPT7 is a
            yeast protein that functions as a component of the
            transcription regulatory histone acetylation (HAT)
            complexes SAGA, SALSA, and SLIK. SAGA is involved in the
            RNA polymerase II-dependent transcriptional regulation of
            about 10% of all yeast genes. The SPT7 bromodomain has
            been shown to weakly interact with acetylated histone H3,
            but not H4. The human representative of this subfamily is
            cat eye syndrome critical region protein 2 (CECR2).
            Bromodomains are 110 amino acid long domains, that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 112

 Score = 54.0 bits (130), Expect = 1e-08
 Identities = 22/63 (34%), Positives = 39/63 (61%)

Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
            S PF+   S++E PDYY++I +PMD+  +L ++++ +Y S  E   D   + +N  +YN 
Sbjct: 26   STPFLTKVSKREAPDYYDIIKKPMDLGTMLKKLKNLQYKSKAEFVDDLNLIWKNCLLYNS 85

Query: 1301 ELS 1303
            + S
Sbjct: 86   DPS 88


>gnl|CDD|99943 cd05511, Bromo_TFIID, Bromodomain, TFIID-like subfamily. Human
            TAFII250 (or TAF250) is the largest subunit of TFIID, a
            large multi-domain complex, which initiates the assembly
            of the transcription machinery. TAFII250 contains two
            bromodomains that specifically bind to acetylated histone
            H4. Bromodomains are 110 amino acid long domains, that
            are found in many chromatin associated proteins.
            Bromodomains can interact specifically with acetylated
            lysine.
          Length = 112

 Score = 53.8 bits (130), Expect = 1e-08
 Identities = 27/83 (32%), Positives = 46/83 (55%), Gaps = 11/83 (13%)

Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
            S PF    ++K++PDYY++I RPMD++ I  +I   KY S +E  +D + +  N+ +YN 
Sbjct: 18   SWPFHTPVNKKKVPDYYKIIKRPMDLQTIRKKISKHKYQSREEFLEDIELIVDNSVLYNG 77

Query: 1301 ELSLIHEDSVVLESVFTKARQRV 1323
                        +SV+TK  + +
Sbjct: 78   P-----------DSVYTKKAKEM 89


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 59.0 bits (143), Expect = 1e-08
 Identities = 43/209 (20%), Positives = 59/209 (28%), Gaps = 35/209 (16%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGS-----------GPPGSPGPSPGQAPGQN 49
            S+ S   +   P Q     +        P             G    P P P QAP   
Sbjct: 90  DSDLSQKTSTFSPCQSGYEAST------DPEYIPDLQPDPSLWGTAPKPEPQPPQAPESQ 143

Query: 50  PQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQV--QQLRFQIMAY 107
           PQ           + K   LEE     +  +      +         +  +Q  F     
Sbjct: 144 PQP-------QTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGP 196

Query: 108 RLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPN--QAQPMPLQ 165
                  P  PQ     Q +  + +P+  Q P      P+P    Q  P   Q Q   L 
Sbjct: 197 PEQPPGYPQPPQGHPE-QVQPQQFLPAPSQAPAQP---PLPPQLPQQPPPLQQPQFPGLS 252

Query: 166 QQPPPQPHQQQGHISSQIKQSKLTNIPKP 194
           QQ PP P Q       Q +  +    P P
Sbjct: 253 QQMPPPPPQPPQQ---QQQPPQPQAQPPP 278



 Score = 57.5 bits (139), Expect = 4e-08
 Identities = 41/209 (19%), Positives = 50/209 (23%), Gaps = 39/209 (18%)

Query: 8   PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
           P  P P QQ  P  +       P  GPP  P   P    G   Q        A      Q
Sbjct: 170 PQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQ 229

Query: 68  GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGK 127
                        +     +        Q   L  Q+          P  PQ        
Sbjct: 230 -----------PPLPPQLPQQPPPLQQPQFPGLSQQMPP------PPPQPPQQQ------ 266

Query: 128 RMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSK 187
                    Q PP     P P     P P   Q       PP QP         Q +Q  
Sbjct: 267 ---------QQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQRG 317

Query: 188 LTNIPKPEGLDPLIILQERENRVALNIER 216
                +   L        ++ R AL+ E 
Sbjct: 318 PQFREQLVQL-------SQQQREALSQEE 339



 Score = 41.3 bits (97), Expect = 0.004
 Identities = 29/180 (16%), Positives = 47/180 (26%), Gaps = 12/180 (6%)

Query: 29  APGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEI 88
                       +   +P Q+  E  T  +   D      L+ DP           +   
Sbjct: 85  PSVGPDSDLSQKTSTFSPCQSGYEASTDPEYIPD------LQPDPSLWGTAPKPEPQPPQ 138

Query: 89  KHAFTSAQVQQLRFQIMAYRLLA--RNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGP 146
                       +  +    + A  + +   PQL    Q  ++      P+       GP
Sbjct: 139 APESQPQPQTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQ--QVLPQGMPPRQAAFPQQGP 196

Query: 147 MPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQER 206
              PP  P P Q    P Q QP           +      +L   P P        L ++
Sbjct: 197 PEQPPGYPQPPQ--GHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQ 254



 Score = 32.8 bits (75), Expect = 1.3
 Identities = 19/92 (20%), Positives = 31/92 (33%), Gaps = 4/92 (4%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
                    PP PQ Q PP N        PG     +    P Q P   P       Q+ 
Sbjct: 260 PQPPQQQQQPPQPQAQPPPQNQPTPH---PGLPQGQNAPLPPPQQPQLLPLVQQPQGQQR 316

Query: 61  IDSMKEQGLEEDPRYQKLIEMKAN-RTEIKHA 91
               +EQ ++   + ++ +  +   R + +H 
Sbjct: 317 GPQFREQLVQLSQQQREALSQEEAKRAKRRHK 348



 Score = 30.9 bits (70), Expect = 5.4
 Identities = 14/59 (23%), Positives = 16/59 (27%), Gaps = 7/59 (11%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQ-------NPQENLTALQ 58
           +  P PPQ  Q P  + Q          P  P   P Q            PQ   T   
Sbjct: 228 AQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHP 286



 Score = 30.5 bits (69), Expect = 8.7
 Identities = 19/84 (22%), Positives = 28/84 (33%), Gaps = 14/84 (16%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAP------------GSGPP-GSPGPSPGQAPGQNPQEN 53
              PP  Q Q P L+    P                   PP   P P PG   GQN    
Sbjct: 238 QQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLP 297

Query: 54  LTALQRAIDSMK-EQGLEEDPRYQ 76
                + +  ++  QG +  P+++
Sbjct: 298 PPQQPQLLPLVQQPQGQQRGPQFR 321


>gnl|CDD|220309 pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex
           non-fungal.  The approx. 70 residue Med15 domain of the
           ARC-Mediator co-activator is a three-helix bundle with
           marked similarity to the KIX domain. The sterol
           regulatory element binding protein (SREBP) family of
           transcription activators use the ARC105 subunit to
           activate target genes in the regulation of cholesterol
           and fatty acid homeostasis. In addition, Med15 is a
           critical transducer of gene activation signals that
           control early metazoan development.
          Length = 768

 Score = 58.9 bits (142), Expect = 2e-08
 Identities = 56/202 (27%), Positives = 70/202 (34%), Gaps = 18/202 (8%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
           M        P   Q    P  + Q+P G  G G PG P   P Q PG  PQ    A+Q+ 
Sbjct: 275 MQQQPPQQQPQQSQLGMLPNQMQQMPGG--GQGGPGQPMGPPPQRPGAVPQ-GGQAVQQG 331

Query: 61  IDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQL 120
           + S  +Q L    +  KL  M+  +       T  Q QQ      A      NQ +    
Sbjct: 332 VMSAGQQQL----KQMKLRNMRGQQQ------TQQQQQQQGGNHPAAHQQQMNQQVGQGG 381

Query: 121 AMG-VQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ-QQGH 178
            M  +    ++G   G    PM    P  M    P+P            PPQP     G 
Sbjct: 382 QMVALGYLNIQGNQGGLGANPMQQGQPGMMSSPSPVPQVQTNQ--SMPQPPQPSVPSPGG 439

Query: 179 ISSQIKQSKLTN-IPKPEGLDP 199
             SQ  QS     IP P  L P
Sbjct: 440 PGSQPPQSVSGGMIPSPPALMP 461



 Score = 53.9 bits (129), Expect = 6e-07
 Identities = 47/202 (23%), Positives = 61/202 (30%), Gaps = 17/202 (8%)

Query: 1   MSNSSTSPNPPPPQQ-QQPPLNVGQLPMGAPGSGPPGSP-GPSPGQAPGQ--NPQENLTA 56
           MS  +         Q    P+N  Q   G    GP   P GP PG+  GQ        + 
Sbjct: 53  MSKKAAQQQVLQGGQGMPDPINALQNLTGQGTRGPQMGPMGPGPGRPMGQQMGGPGTASN 112

Query: 57  LQ-----RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLA 111
           L      R    M   G+      +        +       +S Q Q  +   M  +   
Sbjct: 113 LLQSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQSSGQPQSQQPNQMGPQQ-G 171

Query: 112 RNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ 171
           + Q     +  G QG      P G Q PP      MP    Q    Q      QQQ  PQ
Sbjct: 172 QAQGQAGGMNQGQQG------PVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGGQQQQNPQ 225

Query: 172 PHQQ-QGHISSQIKQSKLTNIP 192
             QQ Q     Q+ Q +     
Sbjct: 226 MQQQLQNQQQQQMDQQQGPADA 247



 Score = 51.2 bits (122), Expect = 3e-06
 Identities = 40/169 (23%), Positives = 49/169 (28%), Gaps = 10/169 (5%)

Query: 12  PPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ-GLE 70
           P  Q    +   Q   G P S  P   GP  GQA GQ    N             Q G  
Sbjct: 143 PGGQAGGMM---QQSSGQPQSQQPNQMGPQQGQAQGQAGGMNQGQQGPVGQQQPPQMGQP 199

Query: 71  EDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRME 130
             P      +M+           + Q+QQ        ++  +  P   Q  MG Q     
Sbjct: 200 GMPGGGGQGQMQQQGQPGGQQQQNPQMQQQLQNQQQQQMDQQQGPADAQAQMGQQ----- 254

Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHI 179
                  M P  + G     P Q  P Q QP   Q    P   QQ    
Sbjct: 255 -QQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQSQLGMLPNQMQQMPGG 302



 Score = 50.0 bits (119), Expect = 8e-06
 Identities = 43/199 (21%), Positives = 59/199 (29%), Gaps = 12/199 (6%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKE 66
           S  P   Q  Q     GQ    A G    G  GP   Q P Q  Q  +         M++
Sbjct: 155 SGQPQSQQPNQMGPQQGQAQGQAGGMNQ-GQQGPVGQQQPPQMGQPGMPGGGGQ-GQMQQ 212

Query: 67  QGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
           QG     + Q     +  + + +      Q      Q    +       + PQ   G Q 
Sbjct: 213 QGQPGGQQQQNPQMQQQLQNQQQQQMDQQQGPADA-QAQMGQQQQGQGGMQPQQMQGGQM 271

Query: 127 KR-MEGVPSGPQM---PPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ-----PHQQQG 177
           +  M+  P   Q        L   M   P        QPM    Q P          QQG
Sbjct: 272 QVPMQQQPPQQQPQQSQLGMLPNQMQQMPGGGQGGPGQPMGPPPQRPGAVPQGGQAVQQG 331

Query: 178 HISSQIKQSKLTNIPKPEG 196
            +S+  +Q K   +    G
Sbjct: 332 VMSAGQQQLKQMKLRNMRG 350



 Score = 44.6 bits (105), Expect = 3e-04
 Identities = 43/190 (22%), Positives = 52/190 (27%), Gaps = 31/190 (16%)

Query: 12  PPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEE 71
           P  QQQPP        G  G G     G   GQ   QNPQ       +    M +Q    
Sbjct: 187 PVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGGQ-QQQNPQMQQQLQNQQQQQMDQQQGPA 245

Query: 72  DPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQ------ 125
           D + Q   + +             Q+Q    Q        + QP   QL M         
Sbjct: 246 DAQAQMGQQQQGQGGMQPQQMQGGQMQVPMQQ-----QPPQQQPQQSQLGMLPNQMQQMP 300

Query: 126 --GKRMEGVPSGPQ------MPPMSLHGPMP-MPPSQP---------MPNQAQPMPLQQQ 167
             G+   G P GP       +P          M   Q          M  Q Q    QQQ
Sbjct: 301 GGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQQLKQMKLRNMRGQQQTQQ-QQQ 359

Query: 168 PPPQPHQQQG 177
                H    
Sbjct: 360 QQGGNHPAAH 369



 Score = 43.5 bits (102), Expect = 0.001
 Identities = 42/217 (19%), Positives = 59/217 (27%), Gaps = 46/217 (21%)

Query: 10  PPPPQQQQPPLNVGQLP---MGAPGSGPPGSPGPSPGQAPGQ-------NPQENLTALQR 59
           PP  Q QQ  L +       M   G G PG P   P Q PG          Q  ++A Q+
Sbjct: 279 PPQQQPQQSQLGMLPNQMQQMPGGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQ 338

Query: 60  AIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQV--QQLRFQIMAYRLLARNQ--- 114
            +  MK + +    + Q+  + +       H     Q   Q  +   + Y  +  NQ   
Sbjct: 339 QLKQMKLRNMRGQQQTQQQQQQQGGNHPAAHQQQMNQQVGQGGQMVALGYLNIQGNQGGL 398

Query: 115 ---------------PLTPQLAMGVQGKRMEGVPSGPQ----------------MPPMSL 143
                          P         Q       PS P                 +P    
Sbjct: 399 GANPMQQGQPGMMSSPSPVPQVQTNQSMPQPPQPSVPSPGGPGSQPPQSVSGGMIPSPPA 458

Query: 144 HGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHIS 180
             P P P     P   + +      P  P    G  S
Sbjct: 459 LMPSPSPQMSQSPASQRTIQQDMVSPGGPLNTPGQSS 495



 Score = 33.4 bits (76), Expect = 1.0
 Identities = 22/85 (25%), Positives = 35/85 (41%), Gaps = 12/85 (14%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN------LTALQRA 60
           SP+P   Q       + Q  +     GP  +PG S   +P  NPQE          L + 
Sbjct: 462 SPSPQMSQSPASQRTIQQDMVS--PGGPLNTPGQSSVNSPA-NPQEEQLYREKYKQLSKY 518

Query: 61  IDSMKE--QGLEEDP-RYQKLIEMK 82
           I+ ++     ++ D  R + L +MK
Sbjct: 519 IEPLRRMIAKIDNDEGRIKDLSKMK 543


>gnl|CDD|99957 cd05528, Bromo_AAA, Bromodomain; sub-family co-occurring with AAA
            domains. Bromodomains are 110 amino acid long domains,
            that are found in many chromatin associated proteins.
            Bromodomains can interact specifically with acetylated
            lysine. The structure (2DKW) in this alignment is an
            uncharacterized protein predicted from analysis of cDNA
            clones from human fetal liver.
          Length = 112

 Score = 53.5 bits (129), Expect = 2e-08
 Identities = 28/78 (35%), Positives = 44/78 (56%), Gaps = 2/78 (2%)

Query: 1222 LKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSV 1281
            L+  +R V+K   SD R     F K    +E+PDYYE+I +PMD++ IL +++  +Y + 
Sbjct: 4    LRLFLRDVLKRLASDKRFN--AFTKPVDEEEVPDYYEIIKQPMDLQTILQKLDTHQYLTA 61

Query: 1282 DELQKDFKTLCRNAQIYN 1299
             +  KD   +  NA  YN
Sbjct: 62   KDFLKDIDLIVTNALEYN 79


>gnl|CDD|99958 cd05529, Bromo_WDR9_I_like, Bromodomain; WDR9 repeat I_like
            subfamily. WDR9 is a human gene located in the Down
            Syndrome critical region-2 of chromosome 21. It encodes
            for a nuclear protein containing WD40 repeats and two
            bromodomains, which may function as a transcriptional
            regulator involved in chromatin remodeling and play a
            role in embryonic development. Bromodomains are 110 amino
            acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 128

 Score = 53.9 bits (130), Expect = 2e-08
 Identities = 29/115 (25%), Positives = 56/115 (48%), Gaps = 8/115 (6%)

Query: 1205 KEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFI-KLPSRKELPDYYEVIDRP 1263
             E+   R++++ +L   L K++        S    ++E F   +  R   PDY+  +  P
Sbjct: 16   WEQPHIRDEERERLISGLDKLLL-------SLQLEIAEYFEYPVDLRAWYPDYWNRVPVP 68

Query: 1264 MDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTK 1318
            MD++ I  R+E+  Y S++ L+ D + +  NA+ +NE  S I + +  L     +
Sbjct: 69   MDLETIRSRLENRYYRSLEALRHDVRLILSNAETFNEPNSEIAKKAKRLSDWLLR 123


>gnl|CDD|204086 pfam08880, QLQ, QLQ.  The QLQ domain is named after the conserved
           Gln, Leu, Gln motif. The QLQ domain is found at the
           N-terminus of SWI2/SNF2 protein, which has been shown to
           be involved in protein-protein interactions. This domain
           has thus been postulated to be involved in mediating
           protein interactions.
          Length = 37

 Score = 50.1 bits (121), Expect = 4e-08
 Identities = 18/36 (50%), Positives = 25/36 (69%)

Query: 91  AFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
            FT AQ+Q+L+ QI+AY+ LA NQP+ P L   +Q 
Sbjct: 2   PFTPAQLQELKAQILAYKYLAANQPVPPHLQQPIQK 37


>gnl|CDD|99929 cd05497, Bromo_Brdt_I_like, Bromodomain, Brdt_like subfamily, repeat
            I. Human Brdt is a testis-specific member of the BET
            subfamily of bromodomain proteins; the first bromodomain
            in Brdt has been shown to be essential for male germ cell
            differentiation. Bromodomains are 110 amino acid long
            domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 107

 Score = 52.0 bits (125), Expect = 4e-08
 Identities = 26/76 (34%), Positives = 40/76 (52%), Gaps = 7/76 (9%)

Query: 1252 ELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVV 1311
             LPDY+++I  PMD+  I  R+E+  Y S  E  +DF T+  N  IYN+       D VV
Sbjct: 36   NLPDYHKIIKTPMDLGTIKKRLENNYYWSASECIQDFNTMFTNCYIYNKP-----GDDVV 90

Query: 1312 L--ESVFTKARQRVES 1325
            L  +++     Q++  
Sbjct: 91   LMAQTLEKLFLQKLAQ 106


>gnl|CDD|99931 cd05499, Bromo_BDF1_2_II, Bromodomain. BDF1/BDF2 like subfamily,
            restricted to fungi, repeat II. BDF1 and BDF2 are yeast
            transcription factors involved in the expression of a
            wide range of genes, including snRNAs; they are required
            for sporulation and DNA repair and protect histone H4
            from deacetylation. Bromodomains are 110 amino acid long
            domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 102

 Score = 51.1 bits (123), Expect = 8e-08
 Identities = 26/99 (26%), Positives = 54/99 (54%), Gaps = 9/99 (9%)

Query: 1220 KTLKKIMRVVIKYTDSDGRVLSEPFIKL--PSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
            + LK++M+   K++       + PF+    P    +P+Y+ +I +PMD+  I  ++++G+
Sbjct: 7    EVLKELMKP--KHSA-----YNWPFLDPVDPVALNIPNYFSIIKKPMDLGTISKKLQNGQ 59

Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVF 1316
            Y S  E ++D + + +N   +N E + ++     LE VF
Sbjct: 60   YQSAKEFERDVRLIFKNCYTFNPEGTDVYMMGHQLEEVF 98


>gnl|CDD|99938 cd05506, Bromo_plant1, Bromodomain, uncharacterized subfamily
            specific to plants. Might function as a global
            transcription factor. Bromodomains are 110 amino acid
            long domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 99

 Score = 50.8 bits (122), Expect = 1e-07
 Identities = 25/80 (31%), Positives = 39/80 (48%), Gaps = 8/80 (10%)

Query: 1222 LKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSV 1281
            L+K+M          G V + P         LPDY+++I +PMD+  +  ++E G+YSS 
Sbjct: 9    LRKLM------KHKWGWVFNAPVD--VVALGLPDYFDIIKKPMDLGTVKKKLEKGEYSSP 60

Query: 1282 DELQKDFKTLCRNAQIYNEE 1301
            +E   D +    NA  YN  
Sbjct: 61   EEFAADVRLTFANAMRYNPP 80


>gnl|CDD|99930 cd05498, Bromo_Brdt_II_like, Bromodomain, Brdt_like subfamily, repeat
            II. Human Brdt is a testis-specific member of the BET
            subfamily of bromodomain proteins; the first bromodomain
            in Brdt has been shown to be essential for male germ cell
            differentiation. Bromodomains are 110 amino acid long
            domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 102

 Score = 50.4 bits (121), Expect = 1e-07
 Identities = 19/71 (26%), Positives = 34/71 (47%)

Query: 1248 PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHE 1307
            P    L DY+++I  PMD+  I  ++++ +Y+   E   D + +  N   YN     +H 
Sbjct: 30   PEALGLHDYHDIIKHPMDLSTIKKKLDNREYADAQEFAADVRLMFSNCYKYNPPDHPVHA 89

Query: 1308 DSVVLESVFTK 1318
             +  L+ VF  
Sbjct: 90   MARKLQDVFED 100


>gnl|CDD|99944 cd05512, Bromo_brd1_like, Bromodomain; brd1_like subfamily. BRD1 is a
            mammalian gene which encodes for a nuclear protein
            assumed to be a transcriptional regulator. BRD1 has been
            implicated with brain development and susceptibility to
            schizophrenia and bipolar affective disorder.
            Bromodomains are 110 amino acid long domains that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 98

 Score = 50.1 bits (120), Expect = 2e-07
 Identities = 18/63 (28%), Positives = 32/63 (50%), Gaps = 4/63 (6%)

Query: 1237 GRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQ 1296
              + SEP        E+PDY + I +PMD   +  ++E  +Y ++++ + DF  +  N  
Sbjct: 19   AEIFSEPV----DLSEVPDYLDHIKQPMDFSTMRKKLESQRYRTLEDFEADFNLIINNCL 74

Query: 1297 IYN 1299
             YN
Sbjct: 75   AYN 77


>gnl|CDD|197800 smart00592, BRK, domain in transcription and CHROMO domain
           helicases. 
          Length = 45

 Score = 48.1 bits (115), Expect = 2e-07
 Identities = 15/45 (33%), Positives = 26/45 (57%)

Query: 448 LTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSD 492
             +  + V    +GK L G+DAP A  L++W++++P +EV   S 
Sbjct: 1   DGEERVPVINRETGKKLTGDDAPKAKDLERWLEENPEYEVAPRSA 45


>gnl|CDD|99927 cd05495, Bromo_cbp_like, Bromodomain, cbp_like subfamily. Cbp (CREB
            binding protein or CREBBP) is an acetyltransferase acting
            on histone, which gives a specific tag for
            transcriptional activation and also acetylates
            non-histone proteins. CREBBP binds specifically to
            phosphorylated CREB protein and augments the activity of
            phosphorylated CREB to activate transcription of
            cAMP-responsive genes. Bromodomains are 110 amino acid
            long domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 108

 Score = 49.0 bits (117), Expect = 5e-07
 Identities = 22/78 (28%), Positives = 40/78 (51%), Gaps = 2/78 (2%)

Query: 1241 SEPFIKL--PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIY 1298
            S PF +   P    +PDY++++  PMD+  I  +++ G+Y    +   D   +  NA +Y
Sbjct: 22   SLPFRQPVDPKLLGIPDYFDIVKNPMDLSTIRRKLDTGQYQDPWQYVDDVWLMFDNAWLY 81

Query: 1299 NEELSLIHEDSVVLESVF 1316
            N + S +++    L  VF
Sbjct: 82   NRKTSRVYKYCTKLAEVF 99


>gnl|CDD|99935 cd05503, Bromo_BAZ2A_B_like, Bromodomain, BAZ2A/BAZ2B_like subfamily.
            Bromo adjacent to zinc finger 2A (BAZ2A) and 2B (BAZ2B)
            were identified as novel human bromodomain genes by cDNA
            library screening. BAZ2A is also known as Tip5
            (Transcription termination factor I-interacting protein
            5) and hWALp3. The proteins may play roles in
            transcriptional regulation. Human Tip5 is part of a
            complex termed NoRC (nucleolar remodeling complex), which
            induces nucleosome sliding and may play a role in the
            regulation of the rDNA locus. Bromodomains are 110 amino
            acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 97

 Score = 48.5 bits (116), Expect = 7e-07
 Identities = 21/76 (27%), Positives = 42/76 (55%)

Query: 1243 PFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEEL 1302
            PF++  + K +P Y ++I +PMD   I  ++E G+Y +++E  +D + +  N + +NE+ 
Sbjct: 20   PFLEPVNTKLVPGYRKIIKKPMDFSTIREKLESGQYKTLEEFAEDVRLVFDNCETFNEDD 79

Query: 1303 SLIHEDSVVLESVFTK 1318
            S +      +   F K
Sbjct: 80   SEVGRAGHNMRKFFEK 95


>gnl|CDD|99932 cd05500, Bromo_BDF1_2_I, Bromodomain. BDF1/BDF2 like subfamily,
            restricted to fungi, repeat I. BDF1 and BDF2 are yeast
            transcription factors involved in the expression of a
            wide range of genes, including snRNAs; they are required
            for sporulation and DNA repair and protect histone H4
            from deacetylation. Bromodomains are 110 amino acid long
            domains, that are found in many chromatin associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 103

 Score = 48.1 bits (115), Expect = 1e-06
 Identities = 19/71 (26%), Positives = 35/71 (49%)

Query: 1248 PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHE 1307
            P +  +P Y  +I +PMD+  I  +++   Y+SV+E   DF  +  N   +N     + +
Sbjct: 31   PVKLNIPHYPTIIKKPMDLGTIERKLKSNVYTSVEEFTADFNLMVDNCLTFNGPEHPVSQ 90

Query: 1308 DSVVLESVFTK 1318
                L++ F K
Sbjct: 91   MGKRLQAAFEK 101


>gnl|CDD|99928 cd05496, Bromo_WDR9_II, Bromodomain; WDR9 repeat II_like subfamily.
            WDR9 is a human gene located in the Down Syndrome
            critical region-2 of chromosome 21. It encodes for a
            nuclear protein containing WD40 repeats and two
            bromodomains, which may function as a transcriptional
            regulator involved in chromatin remodeling and play a
            role in embryonic development. Bromodomains are 110 amino
            acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 119

 Score = 46.3 bits (110), Expect = 6e-06
 Identities = 27/99 (27%), Positives = 49/99 (49%), Gaps = 7/99 (7%)

Query: 1219 KKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKY 1278
            KK  K+++ ++    DS      EPF +     + PDY ++ID PMD+  +   +  G Y
Sbjct: 7    KKQCKELVNLMWDCEDS------EPFRQPVDLLKYPDYRDIIDTPMDLGTVKETLFGGNY 60

Query: 1279 SSVDELQKDFKTLCRNAQIYN-EELSLIHEDSVVLESVF 1316
                E  KD + +  N++ Y   + S I+  ++ L ++F
Sbjct: 61   DDPMEFAKDVRLIFSNSKSYTPNKRSRIYSMTLRLSALF 99


>gnl|CDD|214931 smart00951, QLQ, QLQ is named after the conserved Gln, Leu, Gln
           motif.  QLQ is found at the N-terminus of SWI2/SNF2
           protein, which has been shown to be involved in
           protein-protein interactions. QLQ has been postulated to
           be involved in mediating protein interactions.
          Length = 36

 Score = 43.7 bits (104), Expect = 7e-06
 Identities = 19/33 (57%), Positives = 25/33 (75%), Gaps = 1/33 (3%)

Query: 91  AFTSAQVQQLRFQIMAYR-LLARNQPLTPQLAM 122
            FT AQ++ LR QI+AY+ LLARNQP+ P+L  
Sbjct: 2   PFTPAQLELLRAQILAYKYLLARNQPVPPELLQ 34


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 49.7 bits (118), Expect = 1e-05
 Identities = 45/203 (22%), Positives = 68/203 (33%), Gaps = 26/203 (12%)

Query: 4   SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSP---GQAPGQNPQENLTALQRA 60
           S  +  P P  QQ  PL++      AP   P   P P P    Q   Q   +      R 
Sbjct: 208 SPIAAQPAPQPQQPSPLSLIS----APSLHPQRLPSPHPPLQPQTASQQSPQPPAPSSRH 263

Query: 61  IDSMKEQGLEEDPR--YQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTP 118
             S         P    Q  + ++   +     F  AQ Q     + +      + P + 
Sbjct: 264 PQSSHHGPGPPMPHALQQGPVFLQHPSSNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQ 323

Query: 119 QLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGH 178
                 Q  R + +P  P MP +        PP+ P+P          Q P Q H+   H
Sbjct: 324 SALQPQQPPREQPLPPAPSMPHIK------PPPTTPIP----------QLPNQSHKHPPH 367

Query: 179 ISSQIKQSKL-TNIPKPEGLDPL 200
           +       ++ +N+P P  L PL
Sbjct: 368 LQGPSPFPQMPSNLPPPPALKPL 390



 Score = 47.4 bits (112), Expect = 5e-05
 Identities = 44/200 (22%), Positives = 62/200 (31%), Gaps = 32/200 (16%)

Query: 3   NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAID 62
           N S+SP+ P PQ  +   +           GPP    P PG A   +      + Q    
Sbjct: 147 NRSSSPSIPSPQDNESDSDSSAQQQLLQPQGPPSIQVP-PGAALAPSAPPPTPSAQAVPP 205

Query: 63  SMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAM 122
                  +  P+ Q+   +        H                 RL + + PL PQ A 
Sbjct: 206 QGSPIAAQPAPQPQQPSPLSLISAPSLHP---------------QRLPSPHPPLQPQTA- 249

Query: 123 GVQGKRMEGVPSGPQMPPMSLHGPMP------------MPPSQPMPNQAQPMPLQQQPP- 169
               +  +      + P  S HGP P            +      P Q   +   Q PP 
Sbjct: 250 --SQQSPQPPAPSSRHPQSSHHGPGPPMPHALQQGPVFLQHPSSNPPQPFGLAQSQVPPL 307

Query: 170 PQPHQQQGHISSQIKQSKLT 189
           P P Q Q H  +   QS L 
Sbjct: 308 PLPSQAQPHSHTPPSQSALQ 327



 Score = 36.6 bits (84), Expect = 0.11
 Identities = 49/205 (23%), Positives = 69/205 (33%), Gaps = 47/205 (22%)

Query: 1   MSNSSTSPNPPPPQQQQP-------------PLNVGQLPMGAPGSGPP---------GSP 38
            + S  SP PP P  + P              L  G + +  P S PP           P
Sbjct: 247 QTASQQSPQPPAPSSRHPQSSHHGPGPPMPHALQQGPVFLQHPSSNPPQPFGLAQSQVPP 306

Query: 39  GPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQ 98
            P P QA   +      +  +     +EQ L   P          +   IK   T+  + 
Sbjct: 307 LPLPSQAQPHSHTPPSQSALQPQQPPREQPLPPAP----------SMPHIKPPPTTP-IP 355

Query: 99  QLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMS---LHGP--MPMPPSQ 153
           QL  Q  +++     Q  +P   M         +P  P + P+S    H P     PP Q
Sbjct: 356 QLPNQ--SHKHPPHLQGPSPFPQMP------SNLPPPPALKPLSSLPTHHPPSAHPPPLQ 407

Query: 154 PMPNQAQPMPLQQQPPPQPHQQQGH 178
            MP Q+QP+      PP   Q Q  
Sbjct: 408 LMP-QSQPLQSVPAQPPVLTQSQSL 431



 Score = 33.9 bits (77), Expect = 0.76
 Identities = 70/320 (21%), Positives = 110/320 (34%), Gaps = 37/320 (11%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPS-PGQAPGQNPQENLTALQRA 60
           S S+  P  PP +Q  PP      P       PP +P P  P Q+    P     +    
Sbjct: 322 SQSALQPQQPPREQPLPPA-----PSMPHIKPPPTTPIPQLPNQSHKHPPHLQGPSPFPQ 376

Query: 61  IDSMKEQGLEEDPRYQKLIEMKANRTEIKHA---FTSAQVQQLRFQIMAYRLLARNQPLT 117
           + S     L   P  + L  +  +     H        Q Q L+       +L ++Q L 
Sbjct: 377 MPS----NLPPPPALKPLSSLPTHHPPSAHPPPLQLMPQSQPLQSVPAQPPVLTQSQSLP 432

Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLH--GPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQ 175
           P+ +         G+ SGP   P + H      +P   P P+     P      P     
Sbjct: 433 PKAS----THPHSGLHSGPPQSPFAQHPFTSGGLPAIGPPPSLPTSTP----AAPPRASS 484

Query: 176 QGHISSQIKQSKLTNIPKPEGLDPLIILQERENRVALNIERRIEELNGSLTSTLPEHLRV 235
                     S          L P+ I +E      L+     E       S  PE   V
Sbjct: 485 GSQPPGSALPSSGGCAGPGPPLPPIQIKEE-----PLDEAEEPESPPPPPRSPSPEPTVV 539

Query: 236 KAEIELRALKVLNFQRQLRAEVIACARRD---TTLETAVNVK----AYKRTKRQGLKEAR 288
                  A +   F + L     +CAR D   T L ++   K    A ++ KR+  ++AR
Sbjct: 540 NTPSH--ASQSARFYKHLDRGYNSCARTDLYFTPLASSKLAKKREEAVEKAKREAEQKAR 597

Query: 289 ATEKLEKQQKVEAERKKRQK 308
              + EK+++ E ER++ ++
Sbjct: 598 EEREREKEKEKERERERERE 617


>gnl|CDD|99934 cd05502, Bromo_tif1_like, Bromodomain; tif1_like subfamily. Tif1
            (transcription intermediary factor 1) is a member of the
            tripartite motif (TRIM) protein family, which is
            characterized by a particular domain architecture. It
            functions by recruiting coactivators and/or corepressors
            to modulate transcription. Vertebrate Tif1-gamma, also
            labeled E3 ubiquitin-protein ligase TRIM33, plays a role
            in the control of hematopoiesis. Its homologue in Xenopus
            laevis, Ectodermin, has been shown to function in
            germ-layer specification and control of cell growth
            during embryogenesis. Bromodomains are 110 amino acid
            long domains that are found in many chromatin-associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 109

 Score = 45.4 bits (108), Expect = 1e-05
 Identities = 28/80 (35%), Positives = 40/80 (50%), Gaps = 4/80 (5%)

Query: 1240 LSEPFIKLPSRKELPDYYEVIDRPMD---IKKILGRIEDGKYSSVDELQKDFKTLCRNAQ 1296
            LS PF   P    +P+YY++I  PMD   I+K L       YSS +E   D + + +N  
Sbjct: 21   LSLPF-HEPVSPSVPNYYKIIKTPMDLSLIRKKLQPKSPQHYSSPEEFVADVRLMFKNCY 79

Query: 1297 IYNEELSLIHEDSVVLESVF 1316
             +NEE S + +    LE  F
Sbjct: 80   KFNEEDSEVAQAGKELELFF 99


>gnl|CDD|221124 pfam11496, HDA2-3, Class II histone deacetylase complex subunits 2
            and 3.  This family of class II histone deacetylase
            complex subunits HDA2 and HDA3 is found in fungi. The
            member from S. pombe is referred to as Ccq1. These
            proteins associate with HDA1 to generate the activity of
            the HDA1 histone deacetylase complex. HDA1 interacts with
            itself and with the HDA2-HDA3 subcomplex to form a
            probable tetramer and these interactions are necessary
            for catalytic activity. The HDA1 histone deacetylase
            complex is responsible for the deacetylation of lysine
            residues on the N-terminal part of the core histones
            (H2A, H2B, H3 and H4). Histone deacetylation gives a tag
            for epigenetic repression and plays an important role in
            transcriptional regulation, cell cycle progression and
            developmental events. HDA2 and HDA3 have a conserved
            coiled-coil domain towards their C-terminus.
          Length = 279

 Score = 47.4 bits (113), Expect = 3e-05
 Identities = 38/178 (21%), Positives = 73/178 (41%), Gaps = 21/178 (11%)

Query: 881  SGKFELLDRILPKL--KSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRG 938
            SGKF +L+ ++  L        VL+  +  + ++++E     +G  Y RL G +  E+  
Sbjct: 94   SGKFLVLNDLINLLIRSERDLHVLIISRSVKTLDLVEALLLGKGLNYKRLSGESLYEE-- 151

Query: 939  DLLKKFNAPDSEYFIFVLSTRAGGL------GLNLQTADTVIIFDSDWNPH----QDLQA 988
            +            +I + ++   GL       L+    D +I FD   +      + L+ 
Sbjct: 152  NHKVSDKKGSLSLWIHLTTSD--GLTNTDSSLLSNYKFDLIISFDPSLDTSLPSIESLRT 209

Query: 989  QDRAHRIGQKNEVRVLRLMTVNSVEERILAAARYKLNMDEKVIQAGMFDQKSTGSERH 1046
            Q+R       N   ++RL+ VNS+E   L   +   N  + ++QA +  +   G    
Sbjct: 210  QNRRG-----NLTPIIRLVVVNSIEHVELCFPKKYPNRLDYLVQASVVLRDIVGDLPP 262


>gnl|CDD|99956 cd05526, Bromo_polybromo_VI, Bromodomain, polybromo repeat VI.
            Polybromo is a nuclear protein of unknown function, which
            contains 6 bromodomains. The human ortholog BAF180 is
            part of a SWI/SNF chromatin-remodeling complex, and it
            may carry out the functions of yeast Rsc-1 and Rsc-2. It
            was shown that polybromo bromodomains bind to histone H3
            at specific acetyl-lysine positions. Bromodomains are
            found in many chromatin-associated proteins and in
            nuclear histone acetyltransferases. They interact
            specifically with acetylated lysine, but not all the
            bromodomains in polybromo may bind to acetyl-lysine.
          Length = 110

 Score = 43.5 bits (103), Expect = 5e-05
 Identities = 26/106 (24%), Positives = 50/106 (47%), Gaps = 2/106 (1%)

Query: 1215 QAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIE 1274
            Q  +++ L  +   V+ + D +GR  S+   +LP         +    P+ +  I   ++
Sbjct: 1    QLLVQELLATLFVSVMNHQDEEGRCYSDSLAELPELAVDGVGPK--KIPLTLDIIKRNVD 58

Query: 1275 DGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKAR 1320
             G+Y  +D+ Q+D   +   A+  +   S I+ED+V L+  F K R
Sbjct: 59   KGRYRRLDKFQEDMFEVLERARRLSRTDSEIYEDAVELQQFFIKIR 104


>gnl|CDD|237871 PRK14965, PRK14965, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 576

 Score = 45.9 bits (109), Expect = 1e-04
 Identities = 33/133 (24%), Positives = 44/133 (33%), Gaps = 21/133 (15%)

Query: 45  APGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTS--AQVQQLRF 102
           A   + Q +LT L RA   M        PR   ++EM      +K A  +  A V +L  
Sbjct: 322 ADAADLQRHLTLLLRAEGEMA---HASFPRL--VLEM----ALLKMATLAPGAPVSELLD 372

Query: 103 QIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPM 162
           ++ A   L R  P  P  A G         P     PP         P +   P  A+P 
Sbjct: 373 RLEA---LERGAPAPPSAAWGAPTPAAPAAPPPAAAPP-------VPPAAPARPAAARPA 422

Query: 163 PLQQQPPPQPHQQ 175
           P    P       
Sbjct: 423 PAPAPPAAAAPPA 435


>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421).  This
           family represents a conserved region approximately 350
           residues long within a number of plant proteins of
           unknown function.
          Length = 357

 Score = 45.3 bits (107), Expect = 1e-04
 Identities = 44/191 (23%), Positives = 65/191 (34%), Gaps = 23/191 (12%)

Query: 4   SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPG-QNPQENLTALQRAID 62
           S   P+  PPQQ Q            P   PP  P P P Q P  Q PQ      Q    
Sbjct: 95  SHQYPSQLPPQQVQSVPQQPTPQQ-EPYYPPPSQPQPPPAQQPQAQQPQPPPQVPQ---- 149

Query: 63  SMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAM 122
              +Q  +  P+  +  +    + +     +    ++  +Q  +Y     N+PL   +AM
Sbjct: 150 ---QQQYQSPPQQPQYQQNPPPQAQSAPQVSGLYPEESPYQPQSY---PPNEPLPSSMAM 203

Query: 123 GVQGKRMEGVPS----GPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-----PQPH 173
             Q       PS    GP  P   ++G     P+   P+  QP P Q Q       P P 
Sbjct: 204 --QPPYSGAPPSQQFYGPPQPSPYMYGGPGGRPNSGFPSGQQPPPSQGQEGYGYSGPPPS 261

Query: 174 QQQGHISSQIK 184
           +      +   
Sbjct: 262 KGNHGSVASYA 272



 Score = 40.7 bits (95), Expect = 0.005
 Identities = 38/198 (19%), Positives = 52/198 (26%), Gaps = 23/198 (11%)

Query: 11  PPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLE 70
           PPP Q QPP    Q P       PP  P     Q+P Q PQ       +A  + +  GL 
Sbjct: 123 PPPSQPQPP--PAQQPQAQQPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSGLY 180

Query: 71  -EDPRYQKL----IEMKANRTEIKHAF-TSAQVQQL------RFQIMAYRLLARNQPLTP 118
            E+  YQ       E   +   ++  +  +   QQ          +        N     
Sbjct: 181 PEESPYQPQSYPPNEPLPSSMAMQPPYSGAPPSQQFYGPPQPSPYMYGGPGGRPNSGFPS 240

Query: 119 QLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ---------PMPNQAQPMPLQQQPP 169
                    +     SGP     +        P           P    A  +P      
Sbjct: 241 GQQPPPSQGQEGYGYSGPPPSKGNHGSVASYAPQGSSQSYSTAYPSLPAATVLPQALPMS 300

Query: 170 PQPHQQQGHISSQIKQSK 187
             P    G  S Q     
Sbjct: 301 SAPMSGGGSGSPQSGNRV 318



 Score = 37.6 bits (87), Expect = 0.043
 Identities = 31/121 (25%), Positives = 40/121 (33%), Gaps = 23/121 (19%)

Query: 110 LARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ---------------- 153
            A  Q +   L      +  +         P S   P  +PP Q                
Sbjct: 63  DAPLQQVNAALPPAPAPQSPQPDQQQQSQAPPSHQYPSQLPPQQVQSVPQQPTPQQEPYY 122

Query: 154 PMPNQAQPMPLQQ------QPPPQ-PHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQER 206
           P P+Q QP P QQ      QPPPQ P QQQ     Q  Q +    P+ +    +  L   
Sbjct: 123 PPPSQPQPPPAQQPQAQQPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSGLYPE 182

Query: 207 E 207
           E
Sbjct: 183 E 183



 Score = 34.5 bits (79), Expect = 0.37
 Identities = 43/193 (22%), Positives = 49/193 (25%), Gaps = 43/193 (22%)

Query: 5   STSPNPPPPQQQQPPLNVG---QLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
           ST   P P Q  +  L      Q+    P +  P SP P   Q   Q P  +    Q   
Sbjct: 46  STKQPPAPEQVAKHELADAPLQQVNAALPPAPAPQSPQPDQ-QQQSQAPPSHQYPSQLP- 103

Query: 62  DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLA 121
                                            +  QQ   Q   Y       P  PQ  
Sbjct: 104 ----------------------------PQQVQSVPQQPTPQQEPYY----PPPSQPQPP 131

Query: 122 MGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISS 181
              Q +        PQ PP          P Q    Q  P P Q Q  PQ        S 
Sbjct: 132 PAQQPQ-----AQQPQPPPQVPQQQQYQSPPQQPQYQQNPPP-QAQSAPQVSGLYPEESP 185

Query: 182 QIKQSKLTNIPKP 194
              QS   N P P
Sbjct: 186 YQPQSYPPNEPLP 198



 Score = 31.4 bits (71), Expect = 3.1
 Identities = 19/106 (17%), Positives = 32/106 (30%), Gaps = 11/106 (10%)

Query: 89  KHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG-----KRMEGVPSGPQMPPMSL 143
           K      Q +  + Q+ ++     ++  +  +    Q           +   P    ++ 
Sbjct: 14  KQEIAETQKELSKLQL-SHEEAQSSEAHSFHVDSTKQPPAPEQVAKHELADAPL-QQVNA 71

Query: 144 HGPMPMPPSQPMPNQAQPM--PLQQQPPPQ--PHQQQGHISSQIKQ 185
             P    P  P P+Q Q    P   Q P Q  P Q Q        Q
Sbjct: 72  ALPPAPAPQSPQPDQQQQSQAPPSHQYPSQLPPQQVQSVPQQPTPQ 117


>gnl|CDD|223989 COG1061, SSL2, DNA or RNA helicases of superfamily II
           [Transcription / DNA replication, recombination, and
           repair].
          Length = 442

 Score = 45.1 bits (107), Expect = 2e-04
 Identities = 53/278 (19%), Positives = 100/278 (35%), Gaps = 33/278 (11%)

Query: 553 VNGKLKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLII 612
              +L+ YQ + L+ +V        G++    G GKT+     I       ++    L++
Sbjct: 33  FEFELRPYQEEALDALVKNRRTERRGVIVLPTGAGKTVVAAEAI------AELKRSTLVL 86

Query: 613 VPLSTL-SNWSLEFERWA-PSVNVVAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKDK 670
           VP   L   W+   +++   +  +  Y G     K L          V + T + + + +
Sbjct: 87  VPTKELLDQWAEALKKFLLLNDEIGIYGGG---EKEL------EPAKVTVATVQTLARRQ 137

Query: 671 GPLAKL--HWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTP---LQNKLPELW 725
                L   +  +I DE H +         IL     A  RL LT TP      ++ +  
Sbjct: 138 LLDEFLGNEFGLIIFDEVHHLPAP--SYRRILELLSAAYPRLGLTATPEREDGGRIGD-- 193

Query: 726 ALLNFLLPSI---FKSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFLLRR 782
             L  L+  I          ++ + AP+     KV L E+E      +     R  L  R
Sbjct: 194 --LFDLIGPIVYEVSLKELIDEGYLAPYKYVEIKVTLTEDE-EREYAKESARFRELLRAR 250

Query: 783 LKKEVESQLPDKVEYIIKCDMSGLQKVLYRHMHTKGIL 820
                E++   ++    +  ++ ++ +L +H      L
Sbjct: 251 GTLRAENEAR-RIAIASERKIAAVRGLLLKHARGDKTL 287


>gnl|CDD|220950 pfam11029, DAZAP2, DAZ associated protein 2 (DAZAP2).  DAZ
           associated protein 2 has a highly conserved sequence
           throughout evolution including a conserved polyproline
           region and several SH2/SH3 binding sites. It occurs as a
           single copy gene with a four-exon organisation and is
           located on chromosome 12. It encodes a ubiquitously
           expressed protein and binds to DAZ and DAZL1 through DAZ
           repeats.
          Length = 136

 Score = 42.1 bits (99), Expect = 2e-04
 Identities = 13/51 (25%), Positives = 17/51 (33%), Gaps = 5/51 (9%)

Query: 131 GVPSGPQMPPMSLHGP-----MPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
            VP   QMP  S   P     +PM     +  Q+   P+   P   P    
Sbjct: 17  VVPPQAQMPQASAPYPGPSMYLPMAQVMAVGPQSSHPPMAYYPIGAPPPVY 67


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 44.9 bits (106), Expect = 4e-04
 Identities = 40/200 (20%), Positives = 55/200 (27%), Gaps = 21/200 (10%)

Query: 11   PPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLT--ALQRAIDSMKEQG 68
            PP     P    G      P +   G P P+P  AP   P   LT  A+    +S +   
Sbjct: 2741 PPAVPAGPATPGGPARPARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLP 2799

Query: 69   LEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMG---VQ 125
               DP       +         A  +  +                 P  P L +G     
Sbjct: 2800 SPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPP-GPPPPSLPLGGSVAP 2858

Query: 126  G---------KRMEGVPSGPQMPPMS-LHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQ 175
            G         +     P+ P  PP+  L  P     ++     A P P Q + PPQP   
Sbjct: 2859 GGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES---FALP-PDQPERPPQPQAP 2914

Query: 176  QGHISSQIKQSKLTNIPKPE 195
                            P P 
Sbjct: 2915 PPPQPQPQPPPPPQPQPPPP 2934



 Score = 41.8 bits (98), Expect = 0.003
 Identities = 38/191 (19%), Positives = 52/191 (27%), Gaps = 33/191 (17%)

Query: 4    SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDS 63
            +   P P   Q   PP   G  P   P  G     G    + P ++P     A  R    
Sbjct: 2825 AGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPAR---- 2880

Query: 64   MKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMG 123
                     P  ++L     +R+    +F     Q  R             P  PQ    
Sbjct: 2881 ---------PPVRRLARPAVSRS--TESFALPPDQPER-------------PPQPQAPPP 2916

Query: 124  VQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQI 183
             Q +     P  PQ PP     P P P     P    P    +     P    G +    
Sbjct: 2917 PQPQPQPPPPPQPQPPPP----PPPRPQPPLAP-TTDPAGAGEPSGAVPQPWLGALVPGR 2971

Query: 184  KQSKLTNIPKP 194
                   +P+P
Sbjct: 2972 VAVPRFRVPQP 2982



 Score = 38.0 bits (88), Expect = 0.046
 Identities = 24/74 (32%), Positives = 33/74 (44%), Gaps = 5/74 (6%)

Query: 4   SSTSPNPPPPQQQQPPLNVGQLPMGAPGS--GPPGSPGPSPGQAPGQNPQENLT--ALQR 59
           S  +P P P     PP     LP   PGS  GP   P   P  AP   P  +    A ++
Sbjct: 414 SVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPERQP-PAPATEPAPDDPDDATRK 472

Query: 60  AIDSMKEQGLEEDP 73
           A+D+++E+   E P
Sbjct: 473 ALDALRERRPPEPP 486



 Score = 36.5 bits (84), Expect = 0.13
 Identities = 35/165 (21%), Positives = 51/165 (30%), Gaps = 15/165 (9%)

Query: 12   PPQQQQPPLNV---GQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQG 68
            PPQ  +P   V   G     AP S  P          P  +P  N             + 
Sbjct: 2592 PPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPER 2651

Query: 69   LEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKR 128
              +DP   ++   +  R   + A  S+  Q+            R +   P +        
Sbjct: 2652 PRDDPAPGRVSRPRRARRLGRAAQASSPPQR-----------PRRRAARPTVGSLTSLAD 2700

Query: 129  MEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQP-MPLQQQPPPQP 172
                P  P+  P +L    P+PP      QA P +P    PP  P
Sbjct: 2701 PPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVP 2745



 Score = 33.8 bits (77), Expect = 0.97
 Identities = 29/175 (16%), Positives = 41/175 (23%), Gaps = 6/175 (3%)

Query: 7    SPNPP-----PPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
            +P PP      P ++     V  L         P  P   P             A     
Sbjct: 2768 APAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGP 2827

Query: 62   DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLA 121
                       P         +       A      ++   +  A +  A  +P   +LA
Sbjct: 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLA 2887

Query: 122  MGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQ-QPPPQPHQQ 175
                 +  E     P  P        P PP         P P     PPP+P   
Sbjct: 2888 RPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPP 2942



 Score = 31.4 bits (71), Expect = 4.2
 Identities = 29/170 (17%), Positives = 43/170 (25%), Gaps = 27/170 (15%)

Query: 4    SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDS 63
               +P P  P            P  A    P    G   G AP      +  A       
Sbjct: 2571 PRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPP--- 2627

Query: 64   MKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMG 123
                     P         AN  +     T    ++            R+ P   +++  
Sbjct: 2628 ------PPSPS------PAANEPDPHPPPTVPPPER-----------PRDDPAPGRVSRP 2664

Query: 124  VQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPH 173
             + +R  G  +    PP         P    + + A P P    P P PH
Sbjct: 2665 RRARR-LGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPH 2713


>gnl|CDD|99940 cd05508, Bromo_RACK7, Bromodomain, RACK7_like subfamily. RACK7 (also
            called human protein kinase C-binding protein) was
            identified as a potential tumor suppressor gene. It
            shares domain architecture with BS69/ZMYND11; both have
            been implicated in the regulation of cellular
            proliferation. Bromodomains are 110 amino acid long
            domains that are found in many chromatin-associated
            proteins. Bromodomains can interact specifically with
            acetylated lysine.
          Length = 99

 Score = 40.4 bits (95), Expect = 5e-04
 Identities = 20/59 (33%), Positives = 31/59 (52%)

Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYN 1299
            +EPF+K    ++ PDY + + +PMD+  +   +    Y S D    D K +  NA IYN
Sbjct: 20   AEPFLKPVDLEQFPDYAQYVFKPMDLSTLEKNVRKKAYGSTDAFLADAKWILHNAIIYN 78


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 44.0 bits (104), Expect = 7e-04
 Identities = 45/241 (18%), Positives = 100/241 (41%), Gaps = 24/241 (9%)

Query: 200 LIILQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIA 259
           LI L E E  +   +E ++E+L   L        +++ +             QL+ E+  
Sbjct: 517 LIELLELEEALKEELEEKLEKLENLLEELEELKEKLQLQ-------------QLKEELRQ 563

Query: 260 CARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQK-----HQEYIT 314
              R   L+  +      RT+++ L+E R   K  K++  E E +  Q        E   
Sbjct: 564 LEDRLQELKELLEELRLLRTRKEELEELRERLKELKKKLKELEERLSQLEELLQSLELSE 623

Query: 315 TVLQHCKDFKEYHRNNQARI---MRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAED 371
              +  ++ +E   +   ++     L + +       E++ ++ +  I +E  R    E 
Sbjct: 624 AENEL-EEAEEELESELEKLNLQAELEELLQAALEELEEKVEELEAEIRRELQRIENEEQ 682

Query: 372 EEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKE-HKMEQKKKQDEESKKRKQSVKQK 430
            E   + ++Q +++ L  L  + +E +  L ++ +   ++E +K + EE KK  + +++ 
Sbjct: 683 LEEKLEELEQLEEE-LEQLREELEELLKKLGEIEQLIEELESRKAELEELKKELEKLEKA 741

Query: 431 L 431
           L
Sbjct: 742 L 742



 Score = 41.7 bits (98), Expect = 0.003
 Identities = 42/268 (15%), Positives = 106/268 (39%), Gaps = 14/268 (5%)

Query: 196 GLDPLIILQERENRVALNIERRIEELNGSLTSTLPEHLR--------VKAEIELRALKVL 247
           GL+    L E    V    + +IEEL G L+  L +           +K   +L  ++  
Sbjct: 165 GLEKYEKLSELLKEVIKEAKAKIEELEGQLSELLEDIEDLLEALEEELKELKKLEEIQEE 224

Query: 248 NFQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQ 307
             + +L  E+ A   R   LE         + +   + E+   E L+ +++   E ++  
Sbjct: 225 QEEEELEQEIEALEERLAELEEEKERLEELKARLLEI-ESLELEALKIREEELRELERLL 283

Query: 308 KHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQK---KEQERIEKERM 364
           +  E     L+  +   E        +  L + +       +  ++   K +E++EK   
Sbjct: 284 EELEEKIERLEELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLES 343

Query: 365 RRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEH--KMEQKKKQDEESKK 422
                 +E+     + +++ K L   L + ++ +    + +K+    +++ K++  E   
Sbjct: 344 ELEELAEEKNELAKLLEERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELSA 403

Query: 423 RKQSVKQKLMDTDGKVTLDQDETSQLTD 450
             + ++++L + + ++   + E  +L +
Sbjct: 404 ALEEIQEELEELEKELEELERELEELEE 431



 Score = 32.8 bits (75), Expect = 1.8
 Identities = 23/156 (14%), Positives = 56/156 (35%), Gaps = 20/156 (12%)

Query: 1070 ETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKE 1129
            E     L   EEE    +     R + +  +  L E+ E    L++ +E +++   E  E
Sbjct: 477  ELYELELEELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKEELEEKLE 536

Query: 1130 EEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKK 1189
            + +                       E L+ + + ++    +EE  +     ++ K   +
Sbjct: 537  KLE--------------------NLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLE 576

Query: 1190 TEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
                        +  +E+ K+ +K   +L++ L ++
Sbjct: 577  ELRLLRTRKEELEELRERLKELKKKLKELEERLSQL 612



 Score = 31.3 bits (71), Expect = 4.4
 Identities = 52/326 (15%), Positives = 115/326 (35%), Gaps = 33/326 (10%)

Query: 216 RRIEELNGSLTSTLPEHLRVKAEIELRAL--------KVLNFQRQLRAEVIACARRDTTL 267
            RI +    +   + E L +  +   R++          L  + + R E++     D   
Sbjct: 110 ERIADGKKDVNEKIEELLGLDKDTFTRSVYLPQGEFDAFLKSKPKERKEIL-----DELF 164

Query: 268 ETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKR--QKHQEYITTVLQHCKDFKE 325
                 K  +  K    +     E+LE Q     E  +   +  +E +  +       K 
Sbjct: 165 GLEKYEKLSELLKEVIKEAKAKIEELEGQLSELLEDIEDLLEALEEELKELK------KL 218

Query: 326 YHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDK 385
                +     L + +         E ++E+ER+E+ + R L  E  E     I +++ +
Sbjct: 219 EEIQEEQEEEELEQEIEAL-EERLAELEEEKERLEELKARLLEIESLELEALKIREEELR 277

Query: 386 RLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDET 445
            L  LL + +E I  L +   E ++E+ +++ E  +   + +++ L   +   +L++   
Sbjct: 278 ELERLLEELEEKIERLEE--LEREIEELEEELEGLRALLEELEELL---EKLKSLEERLE 332

Query: 446 SQLTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKE 505
                +     E+      K E A L     + +++        + + E   E  ++ +E
Sbjct: 333 KLEEKLEKLESELEELAEEKNELAKLLEERLKELEER---LEELEKELEKALERLKQLEE 389

Query: 506 KTSGENENKEKNKGEDDEYNKNAMEE 531
                 E KE+         +   E 
Sbjct: 390 AIQ---ELKEELAELSAALEEIQEEL 412



 Score = 30.5 bits (69), Expect = 7.9
 Identities = 27/182 (14%), Positives = 75/182 (41%), Gaps = 5/182 (2%)

Query: 1056 DDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAE-----RRKEQGKKSRLIEVSELP 1110
            + E +  E  + + E   + L   +      + ++ E       + +  +  L E+ E  
Sbjct: 231  EQEIEALEERLAELEEEKERLEELKARLLEIESLELEALKIREEELRELERLLEELEEKI 290

Query: 1111 DWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDE 1170
            + L + + EIE+   E +     L       ++ +         +E L+ ++  +E   E
Sbjct: 291  ERLEELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLESELEELAE 350

Query: 1171 EEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVI 1230
            E+ E  ++  +R  +  ++ E+ ++E   + +R K+ E+  ++ + +L +    +  +  
Sbjct: 351  EKNELAKLLEERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELSAALEEIQE 410

Query: 1231 KY 1232
            + 
Sbjct: 411  EL 412


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 43.6 bits (102), Expect = 0.001
 Identities = 62/338 (18%), Positives = 127/338 (37%), Gaps = 29/338 (8%)

Query: 207  ENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAE---VIACARR 263
            E+   + I R+ E+   +  +   E  + KAE   +A +V   +   +AE       AR+
Sbjct: 1149 EDAKRVEIARKAEDARKAEEARKAEDAK-KAEAARKAEEVRKAEELRKAEDARKAEAARK 1207

Query: 264  DTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDF 323
                  A   +  +  K+     A A +K E+ +K   E KK ++         ++ ++ 
Sbjct: 1208 AEEERKAEEARKAEDAKK-----AEAVKKAEEAKKDAEEAKKAEE--------ERNNEEI 1254

Query: 324  KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKK 383
            +++     A   R   A+    A    E KK +E+ + +  ++  AE+++   +   + +
Sbjct: 1255 RKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKK--AEEKKKADEAKKKAE 1312

Query: 384  DKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQD 443
            + + A    +  E         K+ K E+ KK  E +K   ++   +    + K    + 
Sbjct: 1313 EAKKADEAKKKAEEAKKKADAAKK-KAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEK 1371

Query: 444  ETSQ-------LTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENE 496
            +  +               ++    K    ED   A  LK+        +      EE +
Sbjct: 1372 KKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKK 1431

Query: 497  --DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEA 532
              DE  +K++E    +   K+  + +  E  K   EEA
Sbjct: 1432 KADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEA 1469



 Score = 42.8 bits (100), Expect = 0.002
 Identities = 55/296 (18%), Positives = 121/296 (40%), Gaps = 37/296 (12%)

Query: 231  EHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARAT 290
            +  +  AE + +A +    +   +A+    A      + A   KA ++ K   LK+A   
Sbjct: 1500 DEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAK--KAEEKKKADELKKAEEL 1557

Query: 291  EKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEK 350
            +K E+++K E  +K     +E     L+  ++ K+        +M+L +        AE+
Sbjct: 1558 KKAEEKKKAEEAKKA----EEDKNMALRKAEEAKKAEEARIEEVMKLYEE--EKKMKAEE 1611

Query: 351  EQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEHKM 410
             +K E+ +I+ E +++      E  +K ++Q K K                 +  ++ K 
Sbjct: 1612 AKKAEEAKIKAEELKK-----AEEEKKKVEQLKKK-----------------EAEEKKKA 1649

Query: 411  EQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVREISSGKVLKGEDAP 470
            E+ KK +EE+K +     +K  +       D+ +  +        ++ +     + E+A 
Sbjct: 1650 EELKKAEEENKIKAAEEAKKAEE-------DKKKAEEAKKAEEDEKKAAEALKKEAEEAK 1702

Query: 471  LAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNK 526
             A  LK+   +           EE     +E++K++   + +  E+ K +++E  K
Sbjct: 1703 KAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKK 1758



 Score = 42.1 bits (98), Expect = 0.003
 Identities = 40/157 (25%), Positives = 72/157 (45%), Gaps = 25/157 (15%)

Query: 1077 ARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEE----K 1132
            A+  EE     +  A R+ E+ KK    E + + + +   +EE +  A EAK+ E    K
Sbjct: 1569 AKKAEE----DKNMALRKAEEAKK---AEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIK 1621

Query: 1133 ALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTED 1192
            A  + +   ++K+V+       +E  KA         EE ++ EE    +  +  KK E+
Sbjct: 1622 AEELKKAEEEKKKVEQLKKKEAEEKKKA---------EELKKAEEENKIKAAEEAKKAEE 1672

Query: 1193 DDEEPSTSKK-----RKKEKEKDREKDQAKLKKTLKK 1224
            D ++   +KK     +K  +   +E ++AK  + LKK
Sbjct: 1673 DKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKK 1709



 Score = 40.9 bits (95), Expect = 0.007
 Identities = 59/302 (19%), Positives = 123/302 (40%), Gaps = 30/302 (9%)

Query: 254  RAEVIACARRDTTLETAVNVKAYKRTKRQGL-KEARATEKLEKQQKVEAERKKRQKHQEY 312
            +AE    A      E     +  K+ K +   K   A  K E+ +K E E+KK ++ ++ 
Sbjct: 1582 KAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKK 1641

Query: 313  ITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE 372
                 +  ++ K+    N+ +     K        AE+ +K E++  +     +  AE+ 
Sbjct: 1642 EAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEA 1701

Query: 373  EGYRKL-IDQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKK-----QDEESKKRKQS 426
            +   +L   + ++K+ A  L + +E      +  K+   E KKK     +DEE KK+   
Sbjct: 1702 KKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAH 1761

Query: 427  VKQKLMDTDGKVT----------LDQDETSQLTDMHISVREISSGKVL---KGEDAPLAA 473
            +K++      ++           LD+++  +  ++   +++I          G++  L  
Sbjct: 1762 LKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFDNFANIIEGGKEGNLVI 1821

Query: 474  HLKQWIQDHPGWEVVADSDEENEDED----------SEKSKEKTSGENENKEKNKGEDDE 523
            +  + ++D    EV    + + E+ D          +E  ++     + NKEK+  EDDE
Sbjct: 1822 NDSKEMEDSAIKEVADSKNMQLEEADAFEKHKFNKNNENGEDGNKEADFNKEKDLKEDDE 1881

Query: 524  YN 525
              
Sbjct: 1882 EE 1883



 Score = 34.7 bits (79), Expect = 0.45
 Identities = 55/283 (19%), Positives = 117/283 (41%), Gaps = 19/283 (6%)

Query: 1058 EEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKED 1117
             E+++  A+   E   +      EE       + + + E+ KK+   E     + L K +
Sbjct: 1572 AEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAE--EAKIKAEELKKAE 1629

Query: 1118 EEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEE 1177
            EE ++     K+E +     +    +K  +       +E  KA +D  + ++ ++ EE+E
Sbjct: 1630 EEKKKVEQLKKKEAEEKK--KAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDE 1687

Query: 1178 VRS----KRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYT 1233
             ++    K++ +  KK E+  ++ +  KK+ +E +K  E+++ K ++  K+      K  
Sbjct: 1688 KKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEED--KKK 1745

Query: 1234 DSDGRVLSEPFIKLPSRKELPDYYEVIDRPMD---IKKILGRIEDGKYSSVDELQKDFKT 1290
              + +   E   K+   K+  +      R      I++ L   ++ +   VD+  KD   
Sbjct: 1746 AEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFD 1805

Query: 1291 LCRNAQIYNEELSLI------HEDSVVLESVFTKARQRVESGE 1327
               N     +E +L+       EDS + E   +K  Q  E+  
Sbjct: 1806 NFANIIEGGKEGNLVINDSKEMEDSAIKEVADSKNMQLEEADA 1848



 Score = 34.3 bits (78), Expect = 0.60
 Identities = 46/272 (16%), Positives = 109/272 (40%), Gaps = 34/272 (12%)

Query: 984  QDLQAQDRAHRIGQKNEVRVLRLMTVNSVEE-RILAAARYKLNMDEKVIQAGMFDQKSTG 1042
            ++L+  +   +  +  +    + M +   EE +    AR +  M     +  M  +++  
Sbjct: 1555 EELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKK 1614

Query: 1043 SERHQFLQTILHQDDEEDE--EENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKK 1100
            +E  +     L + +EE +  E+    + E   +     + E +   +   E +K +  K
Sbjct: 1615 AEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDK 1674

Query: 1101 SRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKA 1160
             +  E         K +E+         E++ A  + + + + K+ +       +E  KA
Sbjct: 1675 KKAEE-------AKKAEED---------EKKAAEALKKEAEEAKKAEELKKKEAEEKKKA 1718

Query: 1161 IDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEK------DREKD 1214
                     EE ++ EE    +  + +K+ E+D ++   +KK ++EK+K      + EK 
Sbjct: 1719 ---------EELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKK 1769

Query: 1215 QAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIK 1246
              +++K  + ++   +   D   R+  +  IK
Sbjct: 1770 AEEIRKEKEAVIEEELDEEDEKRRMEVDKKIK 1801



 Score = 33.6 bits (76), Expect = 1.0
 Identities = 33/142 (23%), Positives = 69/142 (48%), Gaps = 7/142 (4%)

Query: 1091 AERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTD 1150
            AE  K+  +  +  E ++  D   K+ EE ++ A EAK+  +A      +++ ++    D
Sbjct: 1466 AEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKAD 1525

Query: 1151 SLTEKEWLKAIDDGVEYDDEEEEEE----EEVR---SKRKGKRRKKTEDDDEEPSTSKKR 1203
               + E  K  D+  + +++++ +E    EE++    K+K +  KK E+D        + 
Sbjct: 1526 EAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEE 1585

Query: 1204 KKEKEKDREKDQAKLKKTLKKI 1225
             K+ E+ R ++  KL +  KK+
Sbjct: 1586 AKKAEEARIEEVMKLYEEEKKM 1607



 Score = 33.6 bits (76), Expect = 1.1
 Identities = 28/160 (17%), Positives = 64/160 (40%), Gaps = 13/160 (8%)

Query: 1078 RSEEEFQTYQ---RIDAERRKEQGKKS----------RLIEVSELPDWLIKEDEEIEQWA 1124
            R  EE +  +   + +A ++ E+ KK           R  E     +          Q A
Sbjct: 1212 RKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAA 1271

Query: 1125 FEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKG 1184
             +A+E  KA  + +   ++K  +   +  +K+  +A     E    +E +++   +K+K 
Sbjct: 1272 IKAEEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKA 1331

Query: 1185 KRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
               KK  ++ ++ + + K + E   D  +   +  +  +K
Sbjct: 1332 DAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEK 1371



 Score = 32.4 bits (73), Expect = 1.9
 Identities = 34/163 (20%), Positives = 68/163 (41%), Gaps = 16/163 (9%)

Query: 1077 ARSEEEFQTYQ---RIDAERRKEQGKKS----RLIEVSELPDWLIKEDEEIEQWAFEAKE 1129
            AR  E  +  +   + +  R+ E  KK+    +  E  +  +   K +EE         E
Sbjct: 1199 ARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFE 1258

Query: 1130 EEKALHMGRGSRQRK--QVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRR 1187
            E +  H  R     K  +    D L + E  K  D+  +   EE+++ +E + K +  ++
Sbjct: 1259 EARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKK--AEEKKKADEAKKKAEEAKK 1316

Query: 1188 -----KKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
                 KK E+  ++   +KK+ +E +K  E  +A+ +    + 
Sbjct: 1317 ADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEA 1359



 Score = 31.6 bits (71), Expect = 3.8
 Identities = 37/174 (21%), Positives = 76/174 (43%), Gaps = 7/174 (4%)

Query: 1056 DDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIK 1115
            ++ +   E A  + E        +EE+ +  ++   E +K+     +  E  +  D   K
Sbjct: 1339 EEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKK 1398

Query: 1116 EDEEIEQWAFEAKEEEKALHMGRGSRQR-KQVDYTDSLTEKEWLKAIDDGVEYDDEEEEE 1174
            + EE ++ A E K+   A      ++++ ++    D   +K       D  +   EE ++
Sbjct: 1399 KAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKK 1458

Query: 1175 EEEVRSKRKGKRRKKTEDDDEEPSTSKK----RKKEKEKDREKDQAKLKKTLKK 1224
             EE   K+K +  KK ++  ++   +KK    +KK +E  ++ D+AK     KK
Sbjct: 1459 AEEA--KKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKK 1510



 Score = 31.3 bits (70), Expect = 4.5
 Identities = 55/287 (19%), Positives = 102/287 (35%), Gaps = 17/287 (5%)

Query: 261  ARRDTTLETAVNVKAYKRTKR-QGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQH 319
            A++  T +     KA +  K+ +  ++A    K E  +K E  RK     +  I    + 
Sbjct: 1103 AKKTETGKAEEARKAEEAKKKAEDARKAEEARKAEDARKAEEARKAEDAKRVEIARKAED 1162

Query: 320  CKDFKEYHRNNQA-RIMRLNKAVMNYHAN----------AEKEQKKEQERIEKERMRRLM 368
             +  +E  +   A +     KA     A           AE  +K E+ER  +E  +   
Sbjct: 1163 ARKAEEARKAEDAKKAEAARKAEEVRKAEELRKAEDARKAEAARKAEEERKAEEARKAED 1222

Query: 369  AEDEEGYRKLIDQKKDKRLAFLLSQ--TDEYISNLTQMVKEHKMEQKKKQDEESKKRKQS 426
            A+  E  +K  + KKD   A    +   +E I    +    H   ++     E  ++   
Sbjct: 1223 AKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADE 1282

Query: 427  VKQKLMDTDGKVTLDQDETSQLTDMHISVREISSGKVLKG---EDAPLAAHLKQWIQDHP 483
            +K+             +E  +  +      E       K    E    A   K+  ++  
Sbjct: 1283 LKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAK 1342

Query: 484  GWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAME 530
                 A ++ E   +++E ++EK     + KE+ K + D   K A E
Sbjct: 1343 KAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEE 1389



 Score = 30.9 bits (69), Expect = 5.8
 Identities = 52/303 (17%), Positives = 111/303 (36%), Gaps = 17/303 (5%)

Query: 216  RRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLETAVNVKA 275
            ++ +E   +      +  + KAE   +A +      + + +  A  ++    + A     
Sbjct: 1290 KKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAK 1349

Query: 276  YKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIM 335
                +    +   A EK E  +K + E KK+    +      +   + K+    ++ +  
Sbjct: 1350 -AEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKAD 1408

Query: 336  RLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTD 395
             L KA        E ++K E+++   E  ++  AE+ +   +   + ++ + A    +  
Sbjct: 1409 ELKKAAAAKKKADEAKKKAEEKKKADEAKKK--AEEAKKADEAKKKAEEAKKAEEAKKKA 1466

Query: 396  EYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISV 455
            E      +  K  K E+ KK DE  KK +++ K+             DE  +  +     
Sbjct: 1467 EEAKKADEAKK--KAEEAKKADEAKKKAEEAKKKA------------DEAKKAAEAKKKA 1512

Query: 456  REISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKE 515
             E    +  K  D    A   +   +    E    +DE  + E+ +K++EK   E   K 
Sbjct: 1513 DEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKA 1572

Query: 516  KNK 518
            +  
Sbjct: 1573 EED 1575



 Score = 30.5 bits (68), Expect = 9.3
 Identities = 39/162 (24%), Positives = 77/162 (47%), Gaps = 13/162 (8%)

Query: 1076 LARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAK---EEEK 1132
            L ++EE+ +  +   AE +K+  +  +  E ++  D   K+ EE ++ A  AK   EE K
Sbjct: 1283 LKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAK 1342

Query: 1133 ALHMGRGSRQRKQVDYTDSLTEKEWL--KAIDDGVEYDDEEEEEEEEVR----SKRKGKR 1186
                   +      D  ++  EK     K  ++  +  D  +++ EE +    +K+K + 
Sbjct: 1343 KAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEE 1402

Query: 1187 RKKTEDDDEEPSTSKKR----KKEKEKDREKDQAKLKKTLKK 1224
             KK  D+ ++ + +KK+    KK+ E+ ++ D+AK K    K
Sbjct: 1403 DKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK 1444


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
            binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 43.4 bits (102), Expect = 0.001
 Identities = 43/197 (21%), Positives = 78/197 (39%), Gaps = 26/197 (13%)

Query: 1020 ARYKLNMDEKVIQAGMFDQKSTGSERHQFLQTILHQDDEEDEEENAVPDDETVNQMLARS 1079
            A +K   + + ++     + + G+E    ++T    ++ EDE E        V     R 
Sbjct: 704  ADHKGETEAEEVEHEGETE-AEGTEDEGEIETGEEGEEVEDEGEGEAEGKHEVETEGDRK 762

Query: 1080 EEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRG 1139
            E E +     + +  +++G      E+    D  +K DE  E       E E        
Sbjct: 763  ETEHEGETEAEGKEDEDEG------EIQAGEDGEMKGDEGAEGKVEHEGETEAGEKDEHE 816

Query: 1140 SRQRKQVDYTDSLTE--------------KEWLKAID-----DGVEYDDEEEEEEEEVRS 1180
             +   Q D T+   E              K+  K +D     DG + ++EEEEEEEE   
Sbjct: 817  GQSETQADDTEVKDETGEQELNAENQGEAKQDEKGVDGGGGSDGGDSEEEEEEEEEEEEE 876

Query: 1181 KRKGKRRKKTEDDDEEP 1197
            + + +  ++ E+++EEP
Sbjct: 877  EEEEEEEEEEEEENEEP 893



 Score = 31.5 bits (71), Expect = 3.7
 Identities = 47/184 (25%), Positives = 76/184 (41%), Gaps = 15/184 (8%)

Query: 1013 EERILAAARYKLNMDEKVIQA-GMFDQKSTGSERHQFLQTILHQDDEEDEEENAVPDDET 1071
            E  I      +   DE   +A G  + ++ G  +    +     + +EDE+E  +   E 
Sbjct: 729  EGEIETGEEGEEVEDEGEGEAEGKHEVETEGDRKETEHEGETEAEGKEDEDEGEIQAGED 788

Query: 1072 VNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSE-LPDWLIKEDEEIEQWAF----- 1125
              +M      E     +++ E   E G+K      SE   D    +DE  EQ        
Sbjct: 789  G-EMKGDEGAE----GKVEHEGETEAGEKDEHEGQSETQADDTEVKDETGEQELNAENQG 843

Query: 1126 EAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
            EAK++EK +  G GS      D  +   E+E  +  ++  E ++EEEEE EE  S    +
Sbjct: 844  EAKQDEKGVDGGGGS---DGGDSEEEEEEEEEEEEEEEEEEEEEEEEEENEEPLSLEWPE 900

Query: 1186 RRKK 1189
             R+K
Sbjct: 901  TRQK 904


>gnl|CDD|197891 smart00818, Amelogenin, Amelogenins, cell adhesion proteins, play a
           role in the biomineralisation of teeth.  They seem to
           regulate formation of crystallites during the secretory
           stage of tooth enamel development and are thought to
           play a major role in the structural organisation and
           mineralisation of developing enamel. The extracellular
           matrix of the developing enamel comprises two major
           classes of protein: the hydrophobic amelogenins and the
           acidic enamelins. Circular dichroism studies of porcine
           amelogenin have shown that the protein consists of 3
           discrete folding units: the N-terminal region appears to
           contain beta-strand structures, while the C-terminal
           region displays characteristics of a random coil
           conformation. Subsequent studies on the bovine protein
           have indicated the amelogenin structure to contain a
           repetitive beta-turn segment and a "beta-spiral" between
           Gln112 and Leu138, which sequester a (Pro, Leu, Gln)
           rich region. The beta-spiral offers a probable site for
           interactions with Ca2+ ions. Mutations in the human
           amelogenin gene (AMGX) cause X-linked hypoplastic
           amelogenesis imperfecta, a disease characterised by
           defective enamel. A 9bp deletion in exon 2 of AMGX
           results in the loss of codons for Ile5, Leu6, Phe7 and
           Ala8, and replacement by a new threonine codon,
           disrupting the 16-residue (Met1-Ala16) amelogenin signal
           peptide.
          Length = 165

 Score = 40.9 bits (96), Expect = 0.001
 Identities = 26/92 (28%), Positives = 32/92 (34%), Gaps = 7/92 (7%)

Query: 110 LARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP 169
           L   QP+ PQ  +      M  VP    M P   H P    P+Q         P Q Q P
Sbjct: 61  LPAQQPVVPQQPL------MP-VPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPPQPQQP 113

Query: 170 PQPHQQQGHISSQIKQSKLTNIPKPEGLDPLI 201
            QP      I     Q  L  +   + L PL+
Sbjct: 114 MQPQPPVHPIPPLPPQPPLPPMFPMQPLPPLL 145



 Score = 39.8 bits (93), Expect = 0.003
 Identities = 25/93 (26%), Positives = 32/93 (34%), Gaps = 3/93 (3%)

Query: 113 NQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-PQ 171
              L P   + V   +   VP  P MP    H   P    QP   Q    P Q QP  P 
Sbjct: 49  THTLQPHHHIPVLPAQQPVVPQQPLMPVPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPP 108

Query: 172 PHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQ 204
             QQ   +  Q     +  +P    L P+  +Q
Sbjct: 109 QPQQP--MQPQPPVHPIPPLPPQPPLPPMFPMQ 139



 Score = 32.8 bits (75), Expect = 0.55
 Identities = 21/82 (25%), Positives = 27/82 (32%), Gaps = 8/82 (9%)

Query: 90  HAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPS--GPQMPPMSLHGPM 147
           H+ T  Q  Q      A       QP  PQ     Q ++         P  P        
Sbjct: 80  HSMTPTQHHQPNLPQPA------QQPFQPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLP 133

Query: 148 PMPPSQPMPNQAQPMPLQQQPP 169
           PM P QP+P     +PL+  P 
Sbjct: 134 PMFPMQPLPPLLPDLPLEAWPA 155



 Score = 30.9 bits (70), Expect = 2.8
 Identities = 12/44 (27%), Positives = 12/44 (27%)

Query: 8   PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQ 51
           PN P P QQ       Q P       P     P P   P     
Sbjct: 90  PNLPQPAQQPFQPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLP 133


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit; Provisional.
          Length = 482

 Score = 41.8 bits (99), Expect = 0.002
 Identities = 16/60 (26%), Positives = 35/60 (58%)

Query: 1165 VEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
               +  E++ EEE + K+K     K ++++EE    KK ++++E++ E ++ K ++  KK
Sbjct: 414  KIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKK 473



 Score = 32.6 bits (75), Expect = 1.6
 Identities = 19/91 (20%), Positives = 41/91 (45%)

Query: 1134 LHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDD 1193
            LH  +   +R+ + +   + +     A       +  EEE E    SK+  K+ KK  + 
Sbjct: 359  LHTSKRKVRREVLPFLSIIFKHNPELAARLAAFLELTEEEIEFLTGSKKATKKIKKIVEK 418

Query: 1194 DEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
             E+    +K++K+K+    K + + ++  K+
Sbjct: 419  AEKKREEEKKEKKKKAFAGKKKEEEEEEEKE 449



 Score = 31.0 bits (71), Expect = 4.8
 Identities = 22/87 (25%), Positives = 44/87 (50%), Gaps = 6/87 (6%)

Query: 1116 EDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEE 1175
             +EEIE +   +K+  K +       ++K+ +      +KE  K    G + ++EEEEE+
Sbjct: 395  TEEEIE-FLTGSKKATKKIKKIVEKAEKKREE-----EKKEKKKKAFAGKKKEEEEEEEK 448

Query: 1176 EEVRSKRKGKRRKKTEDDDEEPSTSKK 1202
            E+   +++ +  +  E+ +EE    KK
Sbjct: 449  EKKEEEKEEEEEEAEEEKEEEEEKKKK 475



 Score = 30.3 bits (69), Expect = 8.4
 Identities = 19/87 (21%), Positives = 41/87 (47%), Gaps = 1/87 (1%)

Query: 1124 AFEAKEEEKALHMGRGSR-QRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR 1182
              E  EEE     G     ++ +     +  ++E  K       +  +++EEEEE   ++
Sbjct: 391  FLELTEEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEK 450

Query: 1183 KGKRRKKTEDDDEEPSTSKKRKKEKEK 1209
            K + +++ E++ EE    ++ KK+K+ 
Sbjct: 451  KEEEKEEEEEEAEEEKEEEEEKKKKQA 477


>gnl|CDD|99939 cd05507, Bromo_brd8_like, Bromodomain, brd8_like subgroup. In
            mammals, brd8 (bromodomain containing 8) interacts with
            the thyroid hormone receptor in a ligand-dependent
            fashion and enhances thyroid hormone-dependent activation
            from thyroid response elements. Brd8 is thought to be a
            nuclear receptor coactivator. Bromodomains are 110 amino
            acid long domains, that are found in many chromatin
            associated proteins. Bromodomains can interact
            specifically with acetylated lysine.
          Length = 104

 Score = 38.9 bits (91), Expect = 0.002
 Identities = 21/59 (35%), Positives = 32/59 (54%)

Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYN 1299
            +  F+K  +    P Y+ V+ RPMD+  I   IE+G   S  E Q+D   + +NA +YN
Sbjct: 21   ASVFLKPVTEDIAPGYHSVVYRPMDLSTIKKNIENGTIRSTAEFQRDVLLMFQNAIMYN 79


>gnl|CDD|130689 TIGR01628, PABP-1234, polyadenylate binding protein, human types 1,
           2, 3, 4 family.  These eukaryotic proteins recognize the
           poly-A of mRNA and consist of four tandem RNA
           recognition domains at the N-terminus (rrm: pfam00076)
           followed by a PABP-specific domain (pfam00658) at the
           C-terminus. The protein is involved in the transport of
           mRNAs from the nucleus to the cytoplasm. There are four
           paralogs in Homo sapiens which are expressed in testis
           (GP:11610605_PABP3 ), platelets (SP:Q13310_PABP4 ),
           broadly expressed (SP:P11940_PABP1) and of unknown
           tissue range (SP:Q15097_PABP2).
          Length = 562

 Score = 42.1 bits (99), Expect = 0.002
 Identities = 35/150 (23%), Positives = 48/150 (32%), Gaps = 23/150 (15%)

Query: 59  RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTP 118
           RA+  M  + L   P Y  L    A R E + A    Q  QL+ ++    + +       
Sbjct: 341 RAVTEMHGRMLGGKPLYVAL----AQRKEQRRAHLQDQFMQLQPRMRQLPMGSPMGGAMG 396

Query: 119 QLAMGVQGKRME------GVPSGPQMP----PMSLHGPM---PMPPSQPMPNQAQPMPLQ 165
           Q     QG + +      G P    MP    P     P    PM   +     AQ     
Sbjct: 397 QPPYYGQGPQQQFNGQPLGWPRMSMMPTPMGPGGPLRPNGLAPMNAVRAPSRNAQNAA-- 454

Query: 166 QQPPPQPH----QQQGHISSQIKQSKLTNI 191
           Q+PP QP       Q    SQ      +  
Sbjct: 455 QKPPMQPVMYPPNYQSLPLSQDLPQPQSTA 484



 Score = 32.5 bits (74), Expect = 1.8
 Identities = 16/93 (17%), Positives = 22/93 (23%), Gaps = 3/93 (3%)

Query: 95  AQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQP 154
              Q L +  M+        P  P    G+        PS             P+    P
Sbjct: 409 FNGQPLGWPRMSMMP-TPMGPGGPLRPNGLAPMNAVRAPSRNAQNAAQKPPMQPV-MYPP 466

Query: 155 MPNQAQPMPLQQQPPPQPHQQ-QGHISSQIKQS 186
                       QP     Q  Q    +Q+  S
Sbjct: 467 NYQSLPLSQDLPQPQSTASQGGQNKKLAQVLAS 499



 Score = 30.9 bits (70), Expect = 5.7
 Identities = 17/79 (21%), Positives = 21/79 (26%), Gaps = 12/79 (15%)

Query: 1   MSNSSTSPN--PPPPQQQQPPLN----VGQLPMGA----PG--SGPPGSPGPSPGQAPGQ 48
                  PN   P    + P  N      + PM      P   S P     P P     Q
Sbjct: 427 GPGGPLRPNGLAPMNAVRAPSRNAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQ 486

Query: 49  NPQENLTALQRAIDSMKEQ 67
             Q    A   A  + + Q
Sbjct: 487 GGQNKKLAQVLASATPQMQ 505



 Score = 30.5 bits (69), Expect = 7.6
 Identities = 25/124 (20%), Positives = 40/124 (32%), Gaps = 9/124 (7%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKE 66
           S  P P     P    G  PM A  +    +       A  + P + +            
Sbjct: 420 SMMPTPMGPGGPLRPNGLAPMNAVRAPSRNAQ-----NAAQKPPMQPVMYPPNYQSLPLS 474

Query: 67  QGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
           Q L   P+ Q          ++     SA  Q  + Q++  RL    + + P LA  + G
Sbjct: 475 QDL---PQPQSTASQGGQNKKLAQVLASATPQMQK-QVLGERLFPLVEAIEPALAAKITG 530

Query: 127 KRME 130
             +E
Sbjct: 531 MLLE 534


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
            proteins whose interaction is required for the maturation
            of the 18S rRNA and for 40S ribosome production.
          Length = 809

 Score = 41.5 bits (98), Expect = 0.003
 Identities = 34/155 (21%), Positives = 62/155 (40%), Gaps = 39/155 (25%)

Query: 1129 EEEKAL-HMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE-EEEEVRSKRKGKR 1186
            E+E  L H+G+            SL+E +    + D  ++DD++      + R+   G  
Sbjct: 111  EDEFVLTHLGQ------------SLSEIDKDDDVRDDDDFDDDDLGDLASDDRAAHFGGG 158

Query: 1187 RKKTEDDDEEP--------------STSKKRKKEKEKDREKDQA---KLKKTLKKIMRVV 1229
                ED++E+P              + SK  K E++K +E+D+    +L    K +M   
Sbjct: 159  EDDEEDEEEQPERKKSKKEVMKEVIAKSKFYKAERQKAKEEDEDLREELDDDFKDLM--- 215

Query: 1230 IKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPM 1264
                 S  R +  P     + +E  D Y+   R +
Sbjct: 216  -----SLLRTVKPPPKPPMTPEEKDDEYDQRVREL 245



 Score = 30.7 bits (70), Expect = 7.2
 Identities = 32/193 (16%), Positives = 77/193 (39%), Gaps = 25/193 (12%)

Query: 350 KEQKKEQERIEK--ERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKE 407
           K ++++++  E+  +  + LM+               +       + DEY   + ++  +
Sbjct: 195 KAKEEDEDLREELDDDFKDLMSLLRTVKPPPKPPMTPE------EKDDEYDQRVRELTFD 248

Query: 408 HKM---------EQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVREI 458
            +          E+  K  EE+++ K+   ++L    G+   D++E     D   S  ++
Sbjct: 249 RRAQPTDRTKTEEELAK--EEAERLKKLEAERLRRMRGEEEDDEEEE----DSKESADDL 302

Query: 459 SSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNK 518
                   +D       +    +    + V D DEE++D+D E+ +E     +E +++  
Sbjct: 303 DDEFEPDDDDNFGLG--QGEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEED 360

Query: 519 GEDDEYNKNAMEE 531
            + D+ +    EE
Sbjct: 361 EDSDDEDDEEEEE 373


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 41.2 bits (97), Expect = 0.004
 Identities = 39/250 (15%), Positives = 92/250 (36%), Gaps = 25/250 (10%)

Query: 203 LQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACAR 262
           L+E E  ++  +E  +EEL   L     E   +K+E+E    ++   Q +L         
Sbjct: 241 LEELEEELS-RLEEELEELQEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEE 299

Query: 263 RDTTLETAVNVKAYKRTKRQGLKEARA---TEKLEKQQKVEAERKKRQKHQEYITTVLQH 319
            +  +            + + L+E       +    ++++E      ++ ++ +  + + 
Sbjct: 300 LEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEA 359

Query: 320 CKDFKEYHRNN-----------QARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLM 368
            ++ +E                +  +  L   +       E E+ K +    +ER+ RL 
Sbjct: 360 KEELEEKLSALLEELEELFEALREELAELEAELAEI--RNELEELKREIESLEERLERLS 417

Query: 369 AEDEEGYRKLID--------QKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEES 420
              E+   +L +        Q + + L   L + +E +  L   +KE + E  + Q+E  
Sbjct: 418 ERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLKELERELAELQEELQ 477

Query: 421 KKRKQSVKQK 430
           +  K+    +
Sbjct: 478 RLEKELSSLE 487



 Score = 38.2 bits (89), Expect = 0.039
 Identities = 46/221 (20%), Positives = 104/221 (47%), Gaps = 13/221 (5%)

Query: 1059 EDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDE 1118
            E+E E    + E + + L   EEE ++ +   A+       K  + E+ E    L +E E
Sbjct: 743  EEELEELEEELEELQERLEELEEELESLEEALAKL------KEEIEELEEKRQALQEELE 796

Query: 1119 EIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTE-----KEWLKAIDDGVEYDDEEEE 1173
            E+E+   EA+    AL     S ++++      + E     +E  + +D+  E  +E E+
Sbjct: 797  ELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEK 856

Query: 1174 EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYT 1233
            E EE++ + +    +K E +DE     ++ K+E E++  + +++L +  ++I ++  +  
Sbjct: 857  ELEELKEELEELEAEKEELEDEL-KELEEEKEELEEELRELESELAELKEEIEKLRERLE 915

Query: 1234 DSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIE 1274
            + + ++           +EL + YE      ++++ + R+E
Sbjct: 916  ELEAKLERLEVELPELEEELEEEYE-DTLETELEREIERLE 955



 Score = 33.5 bits (77), Expect = 0.95
 Identities = 39/211 (18%), Positives = 82/211 (38%), Gaps = 19/211 (9%)

Query: 214  IERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLETAVNV 273
             ERR++ L   L S      R++ EIE    ++   + +L          +  LE     
Sbjct: 805  AERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEE 864

Query: 274  KAYKRTKRQGLKEARAT-----EKLEKQ-QKVEAERKK----RQKHQEYITTVLQHCKDF 323
                  +++ L++         E+LE++ +++E+E  +     +K +E +  +    +  
Sbjct: 865  LEELEAEKEELEDELKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERL 924

Query: 324  KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE-----EGYRKL 378
            +      +  +    +  +      E+E ++ +E IE      L A +E     E Y +L
Sbjct: 925  EVELPELEEELEEEYEDTL--ETELEREIERLEEEIEALGPVNLRAIEEYEEVEERYEEL 982

Query: 379  IDQKKD--KRLAFLLSQTDEYISNLTQMVKE 407
              Q++D  +    LL   +E      +  KE
Sbjct: 983  KSQREDLEEAKEKLLEVIEELDKEKRERFKE 1013



 Score = 32.4 bits (74), Expect = 2.1
 Identities = 49/293 (16%), Positives = 111/293 (37%), Gaps = 36/293 (12%)

Query: 1044 ERHQFLQTILHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRL 1103
               +    +    +   E E     +E ++++    EE     + ++   ++ +  KS L
Sbjct: 223  RELELALLLAKLKELRKELEEL---EEELSRLEEELEEL---QEELEEAEKEIEELKSEL 276

Query: 1104 IEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGR----GSRQRKQVDYTDSLTEK-EWL 1158
             E+ E  + L +E  E+++   E  E E +L   R     +   +  +  + L EK E L
Sbjct: 277  EELREELEELQEELLELKE-EIEELEGEISLLRERLEELENELEELEERLEELKEKIEAL 335

Query: 1159 KAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKL 1218
            K   +  E   EE E+      + K +  +K     EE     +  +E+  + E + A++
Sbjct: 336  KEELEERETLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREELAELEAELAEI 395

Query: 1219 KKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKY 1278
            +  L+++                        ++E+    E ++R  +  + L        
Sbjct: 396  RNELEEL------------------------KREIESLEERLERLSERLEDLKEELKELE 431

Query: 1279 SSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRVESGEDPDE 1331
            + ++ELQ + + L    +   E+L  + +    LE    + ++ ++  E    
Sbjct: 432  AELEELQTELEELNEELEELEEQLEELRDRLKELERELAELQEELQRLEKELS 484



 Score = 31.2 bits (71), Expect = 4.9
 Identities = 35/188 (18%), Positives = 80/188 (42%), Gaps = 20/188 (10%)

Query: 350 KEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVK--- 406
           +E +K+ E++E++      AE  E Y++L  + ++  LA LL++  E    L ++ +   
Sbjct: 196 EELEKQLEKLERQ------AEKAERYQELKAELRELELALLLAKLKELRKELEELEEELS 249

Query: 407 --EHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDM---------HISV 455
             E ++E+ +++ EE++K  + +K +L +   ++   Q+E  +L +           +  
Sbjct: 250 RLEEELEELQEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLLRE 309

Query: 456 REISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKE 515
           R       L+  +  L    ++        E      EE E   +E  + K   E +   
Sbjct: 310 RLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLSA 369

Query: 516 KNKGEDDE 523
             +  ++ 
Sbjct: 370 LLEELEEL 377



 Score = 30.8 bits (70), Expect = 6.2
 Identities = 32/152 (21%), Positives = 61/152 (40%), Gaps = 19/152 (12%)

Query: 1076 LARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALH 1135
            L R  E+ + YQ + AE R+ +     L ++ EL        +E+E    E +EE   L 
Sbjct: 205  LERQAEKAERYQELKAELRELELAL-LLAKLKEL-------RKELE----ELEEELSRLE 252

Query: 1136 MGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRK--KTEDD 1193
                  Q +         EKE  +   +  E  +E EE +EE+   ++       +    
Sbjct: 253  EELEELQEEL-----EEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLL 307

Query: 1194 DEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
             E     +   +E E+  E+ + K++   +++
Sbjct: 308  RERLEELENELEELEERLEELKEKIEALKEEL 339



 Score = 30.5 bits (69), Expect = 8.9
 Identities = 30/152 (19%), Positives = 61/152 (40%), Gaps = 10/152 (6%)

Query: 1076 LARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALH 1135
            L   EEE    +    +  +E   KS   E+  L D      EE+ +   E + + + L 
Sbjct: 669  LKELEEELAELEAQLEKLEEEL--KSLKNELRSLED----LLEELRRQLEELERQLEELK 722

Query: 1136 MGRGSRQRKQVDYTDSLTEKEWLKAIDDG--VEYDDEEEEEEEEVRSKRKGKRRKKTEDD 1193
                + + +       L E E      +    E  +  EE EEE+ S    +   K +++
Sbjct: 723  RELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELESLE--EALAKLKEE 780

Query: 1194 DEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
             EE    ++  +E+ ++ E++  + ++ L  +
Sbjct: 781  IEELEEKRQALQEELEELEEELEEAERRLDAL 812


>gnl|CDD|206063 pfam13892, DBINO, DNA-binding domain.  DBINO is a DNA-binding
           domain found on global transcription activator SNF2L1
           proteins and chromatin re-modelling proteins.
          Length = 140

 Score = 38.4 bits (90), Expect = 0.005
 Identities = 21/70 (30%), Positives = 42/70 (60%), Gaps = 4/70 (5%)

Query: 328 RNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRL 387
           ++ Q R  RL + ++ +    EKE+++ ++R EKE + +   E+E   R+   Q+  ++L
Sbjct: 61  KDTQLRAKRLMREMLLFWKKNEKEERELRKRAEKEALEQAKKEEEL--REAKRQQ--RKL 116

Query: 388 AFLLSQTDEY 397
            FL++QT+ Y
Sbjct: 117 NFLITQTELY 126


>gnl|CDD|148844 pfam07469, DUF1518, Domain of unknown function (DUF1518).  This
          domain, which is usually found tandemly repeated, is
          found in various receptor co-activating proteins.
          Length = 56

 Score = 36.4 bits (84), Expect = 0.005
 Identities = 16/42 (38%), Positives = 17/42 (40%), Gaps = 3/42 (7%)

Query: 12 PPQQQQPPLNVGQLPMGAPGSGPPGSPGP---SPGQAPGQNP 50
          PPQQ   P N G      P    P SP     SP   P Q+P
Sbjct: 15 PPQQFPYPPNYGMGQQPDPAFTSPFSPQSPMMSPRMGPSQSP 56


>gnl|CDD|221040 pfam11235, Med25_SD1, Mediator complex subunit 25 synapsin 1.  The
           overall function of the full-length Med25 is efficiently
           to coordinate the transcriptional activation of RAR/RXR
           (retinoic acid receptor/retinoic X receptor) in higher
           eukaryotic cells. Human Med25 consists of several
           domains with different binding properties, the
           N-terminal, VWA, domain, this SD1 - synapsin 1 - domain
           from residues 229-381, a PTOV(B) or ACID domain from
           395-545, an SD2 domain from residues 564-645 and a
           C-terminal NR box-containing domain (646-650) from
           646-747. The function of the SD domains is unclear.
          Length = 168

 Score = 38.7 bits (89), Expect = 0.006
 Identities = 40/193 (20%), Positives = 59/193 (30%), Gaps = 51/193 (26%)

Query: 5   STSPNPPPPQQQ--QPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLT------A 56
            + P P   +Q    PP       +       P +P P P   P     +N++      A
Sbjct: 6   GSVPGPLQSKQPVSLPPAA----VLPPQSLPAPQNPLP-PVTPPQMQVPQNVSLHAAHDA 60

Query: 57  LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPL 116
            Q+A+++ K Q      R+  +  ++            A +    F          +Q  
Sbjct: 61  AQKAVEAAKNQKQGLKNRFSPITPLQ-----------QAPIVGPPF----------SQAP 99

Query: 117 TPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP------- 169
            P L     G      PS P      +    P     P+  Q Q  P Q Q P       
Sbjct: 100 APVLP---PGPPGAPKPS-PASQLSLVTTVSPGSGLAPVLTQQQVPPQQPQQPSMVPTPA 155

Query: 170 ------PQPHQQQ 176
                 PQP QQQ
Sbjct: 156 LGGVQPPQPSQQQ 168



 Score = 35.2 bits (80), Expect = 0.10
 Identities = 15/43 (34%), Positives = 22/43 (51%), Gaps = 1/43 (2%)

Query: 135 GPQMP-PMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
           G  +P P+    P+ +PP+  +P Q+ P P    PP  P Q Q
Sbjct: 5   GGSVPGPLQSKQPVSLPPAAVLPPQSLPAPQNPLPPVTPPQMQ 47



 Score = 30.2 bits (67), Expect = 4.4
 Identities = 13/45 (28%), Positives = 15/45 (33%), Gaps = 1/45 (2%)

Query: 1   MSNSSTSPNPPPPQ-QQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ 44
           ++  S      P   QQQ P    Q P   P     G   P P Q
Sbjct: 122 VTTVSPGSGLAPVLTQQQVPPQQPQQPSMVPTPALGGVQPPQPSQ 166


>gnl|CDD|218292 pfam04851, ResIII, Type III restriction enzyme, res subunit. 
          Length = 100

 Score = 36.8 bits (86), Expect = 0.009
 Identities = 34/164 (20%), Positives = 48/164 (29%), Gaps = 71/164 (43%)

Query: 556 KLKEYQIKGLE-WMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIVP 614
           +L+ YQ + +E  +         G++    G GKT+   ALI  L + KK     L +VP
Sbjct: 3   ELRPYQEEAIERLL-----EKKRGLIVMATGSGKTLTAAALIARLAKGKK---KVLFVVP 54

Query: 615 LSTLSNWSLEFERWAPSVNVVAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKDKGPLA 674
                                        RK L  Q                        
Sbjct: 55  -----------------------------RKDLLEQ------------------------ 61

Query: 675 KLHWKYMIIDEGHRM--KNHHCKLTHILNTFYVAPHRLLLTGTP 716
                 +IIDE H    K  + K    +   +     L LT TP
Sbjct: 62  ---ALVIIIDEAHHSSAKTKYRK----ILEKFKPAFLLGLTATP 98


>gnl|CDD|165468 PHA03201, PHA03201, uracil DNA glycosylase; Provisional.
          Length = 318

 Score = 39.5 bits (92), Expect = 0.010
 Identities = 15/42 (35%), Positives = 20/42 (47%), Gaps = 1/42 (2%)

Query: 7  SPNPPPPQQQQPPLNVG-QLPMGAPGSGPPGSPGPSPGQAPG 47
          S +P PP++  PP     + P  +P   PP  PGP     PG
Sbjct: 6  SRSPSPPRRPSPPRPTPPRSPDASPEETPPSPPGPGAEPPPG 47


>gnl|CDD|222579 pfam14179, YppG, YppG-like protein.  The YppG-like protein family
           includes the B. subtilis YppG protein, which is
           functionally uncharacterized. This family of proteins is
           found in bacteria. Proteins in this family are typically
           between 115 and 181 amino acids in length. There are two
           completely conserved residues (F and G) that may be
           functionally important.
          Length = 110

 Score = 37.0 bits (86), Expect = 0.011
 Identities = 20/54 (37%), Positives = 20/54 (37%), Gaps = 2/54 (3%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQS 186
           P   QMPP     P        MP Q QP P Q     QP Q      SQ K S
Sbjct: 24  PYHQQMPPPPYS-PPQQQQGHFMPPQPQPYPKQSPQQQQPPQFSS-FLSQFKNS 75



 Score = 36.6 bits (85), Expect = 0.012
 Identities = 18/52 (34%), Positives = 18/52 (34%), Gaps = 5/52 (9%)

Query: 136 PQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPP--QPHQQQGHISSQIKQ 185
           P           P    Q  P Q QP   Q  PPP   P QQQGH      Q
Sbjct: 2   PYQQN---TNQYPPQNQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQ 50



 Score = 35.5 bits (82), Expect = 0.031
 Identities = 18/60 (30%), Positives = 18/60 (30%), Gaps = 6/60 (10%)

Query: 133 PSGPQMPP---MSLHGPMPMPPSQPMPNQ---AQPMPLQQQPPPQPHQQQGHISSQIKQS 186
           P   Q  P      H  MP PP  P   Q     P   Q  P   P QQQ    S     
Sbjct: 12  PQNQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQSPQQQQPPQFSSFLSQ 71



 Score = 33.9 bits (78), Expect = 0.10
 Identities = 16/44 (36%), Positives = 18/44 (40%), Gaps = 4/44 (9%)

Query: 129 MEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
            + +P  P  PP    G    P  QP P Q      QQQ PPQ 
Sbjct: 26  HQQMPPPPYSPPQQQQGHFMPPQPQPYPKQ----SPQQQQPPQF 65



 Score = 30.1 bits (68), Expect = 2.8
 Identities = 12/42 (28%), Positives = 12/42 (28%), Gaps = 1/42 (2%)

Query: 11 PPPQQQQPPLNVGQLPMGAPGSGPP-GSPGPSPGQAPGQNPQ 51
           P  QQ PP              PP   P P       Q PQ
Sbjct: 23 QPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQSPQQQQPPQ 64



 Score = 28.9 bits (65), Expect = 6.3
 Identities = 15/49 (30%), Positives = 16/49 (32%), Gaps = 5/49 (10%)

Query: 8  PNPPPPQQQQP-----PLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQ 51
               P QQQP     P      P    G   P  P P P Q+P Q   
Sbjct: 14 NQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQSPQQQQP 62



 Score = 28.5 bits (64), Expect = 8.1
 Identities = 14/58 (24%), Positives = 19/58 (32%), Gaps = 6/58 (10%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQP---MPLQQQPPPQPHQQQGHISSQIKQSK 187
              PQ      +   P     P P  + P         P PQP+ +Q   S Q +Q  
Sbjct: 9   QYPPQNQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQ---SPQQQQPP 63



 Score = 28.5 bits (64), Expect = 9.6
 Identities = 11/49 (22%), Positives = 11/49 (22%), Gaps = 4/49 (8%)

Query: 3  NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQ 51
                  PPP    P     Q     P    P  P  SP Q       
Sbjct: 22 QQPYHQQMPPPPYSPP---QQQQGHFMPPQPQP-YPKQSPQQQQPPQFS 66


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 39.3 bits (92), Expect = 0.012
 Identities = 30/196 (15%), Positives = 47/196 (23%), Gaps = 19/196 (9%)

Query: 4   SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPG-------SPGPSPGQAPGQNPQENLTA 56
                   PP   Q P  + Q P        P        +P  SP   P          
Sbjct: 48  PWDPSPQAPPPVAQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAGP 107

Query: 57  LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPL 116
               I    E G     +   ++      + +        +QQL+      R  A     
Sbjct: 108 AGPTI--QTEPGQLYPVQVPVMVTQNPANSPLDQPAQQRALQQLQ-----QRYGAPASGQ 160

Query: 117 TPQLAMGVQGKRMEGVPS--GPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ 174
            P      Q      +      + PP    G         +  +     L+Q+       
Sbjct: 161 LPSQQQSAQKNDESQLQQQPNGETPPQQTDGAGDDESEALVRLREADGTLEQRIKG---A 217

Query: 175 QQGHISSQIKQSKLTN 190
           + G     +KQ K   
Sbjct: 218 EGGGAMKVLKQPKKQA 233



 Score = 37.8 bits (88), Expect = 0.034
 Identities = 35/227 (15%), Positives = 56/227 (24%), Gaps = 34/227 (14%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPG------SPGPSPGQAPGQNPQENLTALQRA 60
           SP  PPP  Q P       P  A  + P G      +P  SP   P              
Sbjct: 52  SPQAPPPVAQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAGPAGPT 111

Query: 61  IDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQL 120
           I    E G                   +     +        Q  A       Q L  + 
Sbjct: 112 I--QTEPGQL----------YPVQVPVMVTQNPANSPLDQPAQQRA------LQQLQQRY 153

Query: 121 AMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHIS 180
                G+    +PS  Q               Q  PN   P            +    + 
Sbjct: 154 GAPASGQ----LPSQQQSA-----QKNDESQLQQQPNGETPPQQTDGAGDDESEALVRLR 204

Query: 181 SQIKQSKLTNIPKPEGLDPLIILQERENRVALNIERRIEELNGSLTS 227
                 +        G   + +L++ + +   +  R I +++G  + 
Sbjct: 205 EADGTLEQRIKGAEGGGA-MKVLKQPKKQAKSSKRRTIAQIDGIDSD 250



 Score = 32.8 bits (75), Expect = 1.2
 Identities = 14/53 (26%), Positives = 19/53 (35%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN 53
           M   + + +P     QQ  L   Q   GAP SG   S   S  +      Q+ 
Sbjct: 127 MVTQNPANSPLDQPAQQRALQQLQQRYGAPASGQLPSQQQSAQKNDESQLQQQ 179


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 39.0 bits (91), Expect = 0.017
 Identities = 21/54 (38%), Positives = 32/54 (59%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKT 1221
            D+EEEEE+EE + + +    K+ E D+EE    KK+K +K K+   +   L KT
Sbjct: 37   DEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTTEWELLNKT 90



 Score = 30.5 bits (69), Expect = 6.3
 Identities = 17/67 (25%), Positives = 31/67 (46%), Gaps = 12/67 (17%)

Query: 1166 EYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
            E + E  +EEEE             E ++++    K   KE+E D E+++ + KK  KK+
Sbjct: 30   EVEKEVPDEEEE------------EEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKV 77

Query: 1226 MRVVIKY 1232
                 ++
Sbjct: 78   KETTTEW 84


>gnl|CDD|222095 pfam13388, DUF4106, Protein of unknown function (DUF4106).  This
           family of proteins is found in large numbers in the
           Trichomonas vaginalis proteome. The function of this
           protein is unknown.
          Length = 422

 Score = 38.9 bits (90), Expect = 0.017
 Identities = 15/52 (28%), Positives = 17/52 (32%)

Query: 136 PQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSK 187
            Q P        P    Q  P Q    P QQ  P  P QQ        K+S+
Sbjct: 209 VQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQPPQTEQGHKRSR 260



 Score = 36.6 bits (84), Expect = 0.081
 Identities = 14/52 (26%), Positives = 18/52 (34%), Gaps = 1/52 (1%)

Query: 132 VPSGPQMPPMSLHGPMPMP-PSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQ 182
           V +  Q P +      P   P Q     AQ    Q      P  +QGH  S+
Sbjct: 209 VQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQPPQTEQGHKRSR 260



 Score = 33.5 bits (76), Expect = 0.86
 Identities = 18/65 (27%), Positives = 23/65 (35%), Gaps = 4/65 (6%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKE 66
            P    P QQ    N  Q P   P   P         Q P Q P +     +R+    +E
Sbjct: 206 QPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQPPQTEQGHKRS----RE 261

Query: 67  QGLEE 71
           QG +E
Sbjct: 262 QGNQE 266



 Score = 32.7 bits (74), Expect = 1.5
 Identities = 15/52 (28%), Positives = 16/52 (30%)

Query: 144 HGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIPKPE 195
           H   P P  QP        P  Q P  QP QQ      Q  Q      P  +
Sbjct: 197 HRHAPKPTQQPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQ 248



 Score = 31.2 bits (70), Expect = 3.8
 Identities = 16/62 (25%), Positives = 19/62 (30%)

Query: 115 PLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ 174
           P  P+      G R    P   Q P +      P   +     Q QP     QP  QP  
Sbjct: 183 PGLPKTFTSSHGHRHRHAPKPTQQPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTP 242

Query: 175 QQ 176
           Q 
Sbjct: 243 QN 244


>gnl|CDD|220441 pfam09849, DUF2076, Uncharacterized protein conserved in bacteria
           (DUF2076).  This domain, found in various hypothetical
           prokaryotic proteins, has no known function. The domain,
           however, is found in various periplasmic ligand-binding
           sensor proteins.
          Length = 234

 Score = 37.7 bits (88), Expect = 0.024
 Identities = 31/144 (21%), Positives = 44/144 (30%), Gaps = 27/144 (18%)

Query: 49  NPQENLTALQRAIDSMKEQ--GLEEDPRYQKLIEMKANRTEIKHAFTSAQ------VQQL 100
            PQE     ++ ID +  +    E  PR     + +A    I  A           VQ +
Sbjct: 2   TPQE-----RQLIDGLFSRLKQAEGAPR-----DAEAEA-LIAEALRRQPDAPYYLVQTI 50

Query: 101 RFQIMAY-RLLARNQPLTPQLAMGVQGKR------MEGVPSGPQMPPMSLHGPMPMPPSQ 153
             Q  A  +  AR + L  Q               M G    P+ PP +     P PP++
Sbjct: 51  LVQEAALKQANARIEELEAQAQHPQSQSSGGFLSGMFG-GGAPRPPPAAPAVQPPAPPAR 109

Query: 154 PMPNQAQPMPLQQQPPPQPHQQQG 177
           P      P        P   Q   
Sbjct: 110 PGWGSGGPSQQGAGQQPGYAQPGP 133



 Score = 35.4 bits (82), Expect = 0.12
 Identities = 14/50 (28%), Positives = 15/50 (30%), Gaps = 6/50 (12%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
               +  P P  P  Q P       P   PG G  G      GQ PG   
Sbjct: 87  FGGGAPRPPPAAPAVQPPA------PPARPGWGSGGPSQQGAGQQPGYAQ 130



 Score = 32.7 bits (75), Expect = 0.85
 Identities = 10/46 (21%), Positives = 11/46 (23%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPG 47
                 P  P  Q   PP   G    G    G    PG +      
Sbjct: 90  GAPRPPPAAPAVQPPAPPARPGWGSGGPSQQGAGQQPGYAQPGPGS 135


>gnl|CDD|215038 PLN00040, PLN00040, Protein MAK16 homolog; Provisional.
          Length = 233

 Score = 37.4 bits (87), Expect = 0.026
 Identities = 24/122 (19%), Positives = 46/122 (37%), Gaps = 6/122 (4%)

Query: 1086 YQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQ 1145
             Q +   R+     + +++     P  L+K +   E  A +A + EK++      R +  
Sbjct: 116  TQYLIRMRKLALKTREKIVTT---PRKLLKRERRRESKAQKAAQLEKSIEKELLERLKSG 172

Query: 1146 VDYTD--SLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKR 1203
              Y D  +   K + K ++     + EEE  + +     K K R   E + E+    K  
Sbjct: 173  T-YGDIYNFPSKSYNKVLEMEEVEEAEEELPKSDKNPNSKKKSRVHVEIEYEDEIEYKSL 231

Query: 1204 KK 1205
              
Sbjct: 232  MS 233


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part of
            a large ribonucleoprotein complex containing the U3
            snoRNA. Depletion of the Utp proteins impedes production
            of the 18S rRNA, indicating that they are part of the
            active pre-rRNA processing complex. This large RNP
            complex has been termed the small subunit (SSU)
            processome.
          Length = 728

 Score = 38.5 bits (90), Expect = 0.029
 Identities = 31/157 (19%), Positives = 65/157 (41%), Gaps = 31/157 (19%)

Query: 1058 EEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKED 1117
            EEDE+E++  ++E  +      +                + K  +L E  +      +E+
Sbjct: 325  EEDEDEDSDSEEEDEDDDEDDDD---------GENPWMLRKKLGKLKEGED-----DEEN 370

Query: 1118 EEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEE 1177
              +    F  + E          R++++            ++ +   +E ++E +EEE E
Sbjct: 371  SGLLSMKFMQRAEA---------RKKEE--------NDAEIEELRRELEGEEESDEEENE 413

Query: 1178 VRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
              SK+   RRK   ++ E+ + SKK KKE + + ++ 
Sbjct: 414  EPSKKNVGRRKFGPENGEKEAESKKLKKENKNEFKEK 450



 Score = 35.8 bits (83), Expect = 0.16
 Identities = 31/171 (18%), Positives = 58/171 (33%), Gaps = 9/171 (5%)

Query: 1054 HQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWL 1113
             Q  E  ++E    + E + + L   EE  +      ++  K  G++    E  E     
Sbjct: 379  MQRAEARKKEENDAEIEELRRELEGEEESDEEENEEPSK--KNVGRRKFGPENGEKEAES 436

Query: 1114 IKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE 1173
             K  +E +    E KE +          + +  +          L    +  + ++EEEE
Sbjct: 437  KKLKKENKNEFKEKKESD-------EEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEE 489

Query: 1174 EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
             +EE    +      K+    +    S  +  +      K   K+KK  KK
Sbjct: 490  LDEENPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKK 540



 Score = 33.9 bits (78), Expect = 0.66
 Identities = 37/193 (19%), Positives = 71/193 (36%), Gaps = 25/193 (12%)

Query: 342 MNYHANAEKEQKKEQERIEKERMRRLMAE----DEEGYRKLIDQKKDKRLAFLLSQTDEY 397
           M +   AE  +KKE+   E E +RR +      DEE   +   +   +R     +   E 
Sbjct: 376 MKFMQRAE-ARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEA 434

Query: 398 ISNLTQMVKEHKMEQKKK-------QDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTD 450
            S   +   +++ ++KK+       +DEE  K ++   + L  ++     +++E     +
Sbjct: 435 ESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEEN 494

Query: 451 MHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGE 510
                   S GK  K          KQ  +     ++  D       + + K K+K   E
Sbjct: 495 -PWLKTTSSVGKSAK----------KQDSKKKSSSKL--DKAANKISKAAVKVKKKKKKE 541

Query: 511 NENKEKNKGEDDE 523
                 +   D+E
Sbjct: 542 KSIDLDDDLIDEE 554


>gnl|CDD|189968 pfam01391, Collagen, Collagen triple helix repeat (20 copies).
          Members of this family belong to the collagen
          superfamily. Collagens are generally extracellular
          structural proteins involved in formation of connective
          tissue structure. The alignment contains 20 copies of
          the G-X-Y repeat that forms a triple helix. The first
          position of the repeat is glycine, the second and third
          positions can be any residue but are frequently proline
          and hydroxyproline. Collagens are post-translationally
          modified by proline hydroxylase to form the
          hydroxyproline residues. Defective hydroxylation is the
          cause of scurvy. Some members of the collagen
          superfamily are not involved in connective tissue
          structure but share the same triple helical structure.
          Length = 60

 Score = 34.0 bits (79), Expect = 0.031
 Identities = 23/46 (50%), Positives = 23/46 (50%), Gaps = 6/46 (13%)

Query: 8  PNPP-PPQQQQPPLNVGQL-PMGAPGS-GPPGSPGPSPGQA--PGQ 48
          P PP PP    PP   G   P G PG  GPPG PGP PG    PG 
Sbjct: 3  PGPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGP-PGPPGPPGP 47



 Score = 33.6 bits (78), Expect = 0.051
 Identities = 15/25 (60%), Positives = 15/25 (60%), Gaps = 4/25 (16%)

Query: 28 GAPGS-GPPGSPGPSPGQ--APGQN 49
          G PG  GPPG PGP PG   APG  
Sbjct: 37 GPPGPPGPPGPPGP-PGAPGAPGPP 60



 Score = 32.8 bits (76), Expect = 0.073
 Identities = 14/27 (51%), Positives = 14/27 (51%), Gaps = 2/27 (7%)

Query: 26 PMGAPGS-GPPGSPGPSPGQAPGQNPQ 51
          P G PG  GPPG PGP PG      P 
Sbjct: 5  PPGPPGPPGPPGPPGP-PGPPGPPGPP 30


>gnl|CDD|223587 COG0513, SrmB, Superfamily II DNA and RNA helicases [DNA
           replication, recombination, and repair / Transcription /
           Translation, ribosomal structure and biogenesis].
          Length = 513

 Score = 38.2 bits (89), Expect = 0.032
 Identities = 33/116 (28%), Positives = 52/116 (44%), Gaps = 11/116 (9%)

Query: 881 SGKFELLDRILPKLKSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDL 940
             K ELL ++L        RV++F +  +L+  L +    RGFK   L G    E+R   
Sbjct: 258 EEKLELLLKLLKDEDEG--RVIVFVRTKRLVEELAESLRKRGFKVAALHGDLPQEERDRA 315

Query: 941 LKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIG 996
           L+KF   +      +++T     GL++     VI +D   +P      +D  HRIG
Sbjct: 316 LEKFKDGELRV---LVATDVAARGLDIPDVSHVINYDLPLDP------EDYVHRIG 362


>gnl|CDD|234468 TIGR04095, dnd_restrict_1, DNA phosphorothioation system
           restriction enzyme.  The DNA phosphorothioate
           modification system dnd (DNA instability during
           electrophoresis) recently has been shown to provide a
           modification essential to a restriction system. This
           protein family was detected by Partial Phylogenetic
           Profiling as linked to dnd, and its members usually are
           clustered with the dndABCDE genes.
          Length = 451

 Score = 38.1 bits (89), Expect = 0.034
 Identities = 50/197 (25%), Positives = 86/197 (43%), Gaps = 29/197 (14%)

Query: 556 KLKEYQIKGL-EWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIV- 613
           +L++YQ + +  W    F NN  GIL    G GKT+  +A  + L EK    G  +++V 
Sbjct: 8   ELRDYQKEAIRAW----FKNNGRGILKMATGTGKTLTALAAASKLYEK---IGLLVLLVV 60

Query: 614 -PLSTL-SNWSLEFERWAPSVN-VVAYKGSPHLRKTLQAQM-----KASKFNVLLTTYEY 665
            P   L   W+ E E++   +N ++ Y+   + +  L   +        KF  ++TT   
Sbjct: 61  CPYQHLVDQWAREAEKF--GLNPILCYESVSNWQSELSTGLYNLNSGNQKFLAIITT-NA 117

Query: 666 VIKDKGPLAKLHW---KYMII-DEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKL 721
               K   ++L     K ++I DE H +      +   L        RL L+ TP ++  
Sbjct: 118 TFIGKNFQSQLRRFPGKTLLIGDEAHNLGAPR--IRESLPDN--IGFRLGLSATPERHFD 173

Query: 722 PE-LWALLNFLLPSIFK 737
            E   ALLN+    +++
Sbjct: 174 EEGTNALLNYFGKIVYE 190


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family are
            designated YL1. These proteins have been shown to be
            DNA-binding and may be transcription factors.
          Length = 238

 Score = 37.4 bits (87), Expect = 0.035
 Identities = 24/77 (31%), Positives = 34/77 (44%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKT 1221
            DD  E DDEEE E+E  R +R  K+++      +EP+  KK+K        K  A   K 
Sbjct: 63   DDEPESDDEEEGEKELQREERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAPRPKK 122

Query: 1222 LKKIMRVVIKYTDSDGR 1238
              + +       DS  R
Sbjct: 123  KSERISWAPTLLDSPRR 139


>gnl|CDD|215832 pfam00270, DEAD, DEAD/DEAH box helicase.  Members of this family
           include the DEAD and DEAH box helicases. Helicases are
           involved in unwinding nucleic acids. The DEAD box
           helicases are involved in various aspects of RNA
           metabolism, including nuclear transcription, pre-mRNA
           splicing, ribosome biogenesis, nucleocytoplasmic
           transport, translation, RNA decay and organellar gene
           expression.
          Length = 169

 Score = 36.5 bits (85), Expect = 0.039
 Identities = 30/143 (20%), Positives = 57/143 (39%), Gaps = 15/143 (10%)

Query: 585 GLGKTIQTIALITYL--MEKKKVNGPFLIIVPLSTLSNWSLE-FERWAPSVNV---VAYK 638
           G GKT+    L+  L  +  KK     L++ P   L+    E  ++    + +   +   
Sbjct: 24  GSGKTL--AFLLPILQALLPKKGGPQALVLAPTRELAEQIYEELKKLFKILGLRVALLTG 81

Query: 639 GSPHLRKTLQAQMKASKFNVLLTTYE---YVIKDKGPLAKLHWKYMIIDEGHRM--KNHH 693
           G+    K    ++K  K ++L+ T      +++        + K +++DE HR+      
Sbjct: 82  GTS--LKEQARKLKKGKADILVGTPGRLLDLLRRGKLKLLKNLKLLVLDEAHRLLDMGFG 139

Query: 694 CKLTHILNTFYVAPHRLLLTGTP 716
             L  IL+        LLL+ T 
Sbjct: 140 DDLEEILSRLPPDRQILLLSATL 162


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
            members of this family are all hypothetical eukaryotic
            proteins of unknown function. One member is described as
            being an adipocyte-specific protein, but no evidence of
            this was found.
          Length = 322

 Score = 37.2 bits (87), Expect = 0.047
 Identities = 24/76 (31%), Positives = 41/76 (53%), Gaps = 10/76 (13%)

Query: 1142 QRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSK 1201
              ++VD T    E++ LKA          EEE +EE + K++ K++++ E    + S  +
Sbjct: 257  VLRKVDKTREEEEEKILKA---------AEEERQEEAQEKKEEKKKEEREAKLAKLSPEE 307

Query: 1202 KRKKEKEKDREKDQAK 1217
            +RK E EK+R+K   K
Sbjct: 308  QRKLE-EKERKKQARK 322



 Score = 31.8 bits (73), Expect = 2.5
 Identities = 14/62 (22%), Positives = 34/62 (54%), Gaps = 6/62 (9%)

Query: 328 RNNQAR---IMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKD 384
           + ++ R     ++ KA         +E+K+E+++ E+E     ++ +E+  RKL ++K+ 
Sbjct: 260 KVDKTREEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQ--RKL-EEKER 316

Query: 385 KR 386
           K+
Sbjct: 317 KK 318



 Score = 29.9 bits (68), Expect = 8.6
 Identities = 16/60 (26%), Positives = 27/60 (45%), Gaps = 2/60 (3%)

Query: 251 RQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEA-RATEKLEKQQKVEA-ERKKRQK 308
            + R E      +    E     +  K  K++  +EA  A    E+Q+K+E  ERKK+ +
Sbjct: 262 DKTREEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQRKLEEKERKKQAR 321


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
            domain is functionally uncharacterized. This domain is
            found in eukaryotes. This presumed domain is typically
            between 156 and 174 amino acids in length. This domain is
            found associated with pfam07780, pfam01728.
          Length = 154

 Score = 35.7 bits (83), Expect = 0.051
 Identities = 18/73 (24%), Positives = 33/73 (45%), Gaps = 11/73 (15%)

Query: 1155 KEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
            ++ L       E ++EEE E EE             E++  +    K+  K K + R ++
Sbjct: 91   RKLLGLDKKEKEEEEEEEVEVEE-----------LDEEEQIDELLEKELAKLKREKRREN 139

Query: 1215 QAKLKKTLKKIMR 1227
            + K K+ LK+ M+
Sbjct: 140  ERKQKEILKEQMK 152



 Score = 32.6 bits (75), Expect = 0.68
 Identities = 17/67 (25%), Positives = 32/67 (47%), Gaps = 11/67 (16%)

Query: 1152 LTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
              EKE  +  +  VE  DEEE+ +E +  +    +R+K           ++  + K+K+ 
Sbjct: 98   KKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREK-----------RRENERKQKEI 146

Query: 1212 EKDQAKL 1218
             K+Q K+
Sbjct: 147  LKEQMKM 153


>gnl|CDD|221247 pfam11825, Nuc_recep-AF1, Nuclear/hormone receptor activator site
          AF-1.  Nuclear receptors (NRs) are a family of
          ligand-inducible transcription factors, and, like other
          transcription factors, they contain a distinct DNA
          binding domain that allows for target gene recognition
          and several activation domains that possess the ability
          to activate transcription. One of these activation
          domains is at the N-terminal, although there are two
          distinct motifs within this domain, between residues
          20-36 and between 74 and the end of this domain, which
          are the binding regions. One of the co-activators is
          TIF1beta, which appears to bind at the first motif.
          Length = 106

 Score = 34.8 bits (80), Expect = 0.055
 Identities = 15/52 (28%), Positives = 19/52 (36%), Gaps = 2/52 (3%)

Query: 2  SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN 53
               S    P      P +V    MG+P    P +PG   G   G +PQ N
Sbjct: 25 PMGPMSTLSSPINGLGSPYSVISSSMGSPSMSLPSTPGLGYG--TGSSPQIN 74


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
            This is a family of fungal proteins of unknown function.
          Length = 182

 Score = 36.2 bits (84), Expect = 0.056
 Identities = 23/76 (30%), Positives = 42/76 (55%), Gaps = 5/76 (6%)

Query: 1148 YTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
            YT++  +K+ L       E +  ++E EE+ + K K K+ KK +D D++    KK  K +
Sbjct: 57   YTEAKKKKKELAE-----EIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSE 111

Query: 1208 EKDREKDQAKLKKTLK 1223
            +KD ++ + KL+   K
Sbjct: 112  KKDEKEAEDKLEDLTK 127



 Score = 35.1 bits (81), Expect = 0.13
 Identities = 16/60 (26%), Positives = 32/60 (53%)

Query: 1166 EYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
            E  +E E+ ++E   K+K K +KK     ++    KK  K+ +K  +KD+ + +  L+ +
Sbjct: 66   ELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDL 125



 Score = 33.9 bits (78), Expect = 0.32
 Identities = 18/70 (25%), Positives = 35/70 (50%), Gaps = 3/70 (4%)

Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSK--KRKKEKEKDREKDQ 1215
             K + + +E   ++E EE++    +K K +KK + D ++    K  K +K+ EK+ E   
Sbjct: 64   KKELAEEIE-KVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKL 122

Query: 1216 AKLKKTLKKI 1225
              L K+  + 
Sbjct: 123  EDLTKSYSET 132



 Score = 30.4 bits (69), Expect = 3.6
 Identities = 10/53 (18%), Positives = 24/53 (45%)

Query: 492 DEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSIAHTVHEI 544
            ++   +  +K K+K   + ++K + K E +  +K      +Y     T+ E+
Sbjct: 87  KKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLSEL 139



 Score = 29.3 bits (66), Expect = 8.6
 Identities = 17/101 (16%), Positives = 43/101 (42%), Gaps = 9/101 (8%)

Query: 1126 EAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
            EAK+++K L      + +K+ +      +++W        +  D++++++++   K+  K
Sbjct: 59   EAKKKKKELA-EEIEKVKKEYE-----EKQKWKWKKKKSKKKKDKDKDKKDD---KKDDK 109

Query: 1186 RRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIM 1226
              KK E + E+      +   +      +    K  L K +
Sbjct: 110  SEKKDEKEAEDKLEDLTKSYSETLSTLSELKPRKYALHKDI 150



 Score = 29.3 bits (66), Expect = 9.3
 Identities = 17/58 (29%), Positives = 33/58 (56%)

Query: 1167 YDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
            YD E  E +++ +   +   + K E ++++    KK+K +K+KD++KD+   KK  K 
Sbjct: 53   YDAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKS 110


>gnl|CDD|219262 pfam07001, BAT2_N, BAT2 N-terminus.  This family represents the
           N-terminus (approximately 200 residues) of the
           proline-rich protein BAT2. BAT2 is similar to other
           proteins with large proline-rich domains, such as some
           nuclear proteins, collagens, elastin, and synapsin.
          Length = 189

 Score = 36.1 bits (83), Expect = 0.058
 Identities = 19/62 (30%), Positives = 25/62 (40%), Gaps = 1/62 (1%)

Query: 5   STSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGP-SPGQAPGQNPQENLTALQRAIDS 63
           +++ +PPPP Q   PL  G     A  S  PG+ G             E   +LQ A D 
Sbjct: 117 TSASSPPPPPQPATPLVPGGAKSWAVASAKPGAQGDGGRASQLSSFSHEEFPSLQAAGDQ 176

Query: 64  MK 65
            K
Sbjct: 177 DK 178


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
            related to Mpp10 (M phase phosphoprotein 10). The U3
            small nucleolar ribonucleoprotein (snoRNP) is required
            for three cleavage events that generate the mature 18S
            rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
            depletion of Mpp10, a U3 snoRNP-specific protein, halts
            18S rRNA production and impairs cleavage at the three U3
            snoRNP-dependent sites.
          Length = 613

 Score = 37.3 bits (86), Expect = 0.059
 Identities = 28/170 (16%), Positives = 62/170 (36%), Gaps = 12/170 (7%)

Query: 1053 LHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDW 1112
              ++  E        D   V+    + +E  +  +  +AE     G +    +  +    
Sbjct: 171  EEKESVEQATREKKFDKSGVDDKFFKLDEMNEFLEATEAEEEAALGDEDDFEDYFQDDSE 230

Query: 1113 LIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEE 1172
              K+DE+      E  +EE              ++Y D    KE  K  D G + + E++
Sbjct: 231  DGKDDEDFGSGEDEEDDEEGN------------IEYEDFFDPKEKDKKKDAGDDAELEDD 278

Query: 1173 EEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTL 1222
            E ++E   K    + ++ +++D+E    +  ++  E   +K +       
Sbjct: 279  EPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLE 328


>gnl|CDD|152960 pfam12526, DUF3729, Protein of unknown function (DUF3729).  This
          family of proteins is found in viruses. Proteins in
          this family are typically between 145 and 1707 amino
          acids in length. The family is found in association
          with pfam01443, pfam01661, pfam05417, pfam01660,
          pfam00978. There is a single completely conserved
          residue L that may be functionally important.
          Length = 115

 Score = 34.7 bits (80), Expect = 0.064
 Identities = 15/47 (31%), Positives = 19/47 (40%), Gaps = 5/47 (10%)

Query: 4  SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
          S+    PPP +   PP +        PG   P SP   P  AP + P
Sbjct: 58 SAVWVLPPPSEPAAPPPD---PEPPVPGPAGPPSPLAPP--APARKP 99


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
            eukaryotic proteins with undetermined function.
          Length = 321

 Score = 36.6 bits (85), Expect = 0.072
 Identities = 29/137 (21%), Positives = 54/137 (39%), Gaps = 21/137 (15%)

Query: 1110 PDWLIKEDEEIEQWAFEAKEEEKALHM--GRGSRQRKQVDYTDSLTEKEWLKAIDDGVEY 1167
             D L +E EE  +   E +    A+     R +  +++++  + L E + L++    V+ 
Sbjct: 110  ADKLDEEQEERVEKEREEELAGDAMKKLENRTADSKREMEVLERLEELKELQSRRADVDV 169

Query: 1168 D---------------DEEEEEEEEVRSKRKG----KRRKKTEDDDEEPSTSKKRKKEKE 1208
            +               +EEEE+E  ++S   G    + R++ +D+D E            
Sbjct: 170  NSMLEALFRREKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSP 229

Query: 1209 KDREKDQAKLKKTLKKI 1225
            K      AK    LKK 
Sbjct: 230  KSGSSSPAKPTSILKKS 246


>gnl|CDD|220252 pfam09468, RNase_H2-Ydr279, Ydr279p protein family (RNase H2 complex
            component).  RNases H are enzymes that specifically
            hydrolyse RNA when annealed to a complementary DNA and
            are present in all living organisms. In yeast RNase H2 is
            composed of a complex of three proteins (Rnh2Ap, Ydr279p
            and Ylr154p); this family represents the homologues of
            Ydr279p. It is not known whether non-yeast proteins in
            this family fulfil the same function.
          Length = 287

 Score = 36.5 bits (85), Expect = 0.074
 Identities = 17/78 (21%), Positives = 35/78 (44%), Gaps = 7/78 (8%)

Query: 1150 DSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTS---KKRKKE 1206
              L    + + +          E +  +   K   K++++TE+D E   +    K++ KE
Sbjct: 201  SYLPPDLYKELLK----SLLIPEFKPLDKYLKESKKKKRETEEDVEAAESRAEKKRKSKE 256

Query: 1207 KEKDREKDQAKLKKTLKK 1224
            + K ++  ++K  K LKK
Sbjct: 257  EIKKKKPKESKGVKALKK 274


>gnl|CDD|219358 pfam07271, Cytadhesin_P30, Cytadhesin P30/P32.  This family
           consists of several Mycoplasma species-specific
           Cytadhesin P32 and P30 proteins. P30 has been found to
           be membrane associated and localised on the tip
           organelle. It is thought that it is important in
           cytadherence and virulence.
          Length = 279

 Score = 36.2 bits (83), Expect = 0.093
 Identities = 32/125 (25%), Positives = 46/125 (36%), Gaps = 7/125 (5%)

Query: 57  LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFT-SAQVQQLRFQIMAYRLLARNQP 115
           LQR  +  ++Q +E DP  +   +       +  A     QVQ         R+  +   
Sbjct: 116 LQRISEQNEQQAIEIDPTEEVNTQEPTQPAGVNVANNPQPQVQPQFGPNPQQRINPQRFG 175

Query: 116 LTPQLAMGVQGKRMEGVPSGPQMPPMSLH---GPMP-MPPSQPMPNQAQPMPLQQQP--P 169
              Q  MG++    +  P  P MPP  +     PMP MPP          MP   +P   
Sbjct: 176 FPMQPNMGMRPGFNQMPPHMPGMPPNQMRPGFNPMPGMPPRPGFNQNPNMMPNMNRPGFR 235

Query: 170 PQPHQ 174
           PQP  
Sbjct: 236 PQPGG 240


>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
          Length = 619

 Score = 36.7 bits (86), Expect = 0.095
 Identities = 21/88 (23%), Positives = 36/88 (40%), Gaps = 11/88 (12%)

Query: 1147 DYTDSLTEKE-WLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKK 1205
            ++ D L   E  L+ + DG   D   EE+   V S+ +     + E+++E+ +       
Sbjct: 154  EWYDRLENGERRLRELIDG-FVDPNAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAAD 212

Query: 1206 EKE---------KDREKDQAKLKKTLKK 1224
            E E         K   K   KL+K  +K
Sbjct: 213  ESELPEKVLEKFKALAKQYKKLRKAQEK 240


>gnl|CDD|180801 PRK07033, PRK07033, hypothetical protein; Provisional.
          Length = 427

 Score = 36.6 bits (85), Expect = 0.096
 Identities = 17/48 (35%), Positives = 21/48 (43%)

Query: 3  NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
          N+S+ P         PP    + P  AP +G P  P P  G A G NP
Sbjct: 1  NTSSDPFSAGSGGFVPPNPGDRTPAAAPAAGAPFQPRPGRGAASGLNP 48



 Score = 31.2 bits (71), Expect = 4.6
 Identities = 9/46 (19%), Positives = 11/46 (23%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGH 178
            +          G +P  P    P  A       QP P      G 
Sbjct: 1   NTSSDPFSAGSGGFVPPNPGDRTPAAAPAAGAPFQPRPGRGAASGL 46


>gnl|CDD|215565 PLN03083, PLN03083, E3 UFM1-protein ligase 1 homolog; Provisional.
          Length = 803

 Score = 36.7 bits (85), Expect = 0.11
 Identities = 38/155 (24%), Positives = 64/155 (41%), Gaps = 22/155 (14%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMR 1227
            DDEE+  ++  ++++KG+ +      D +    K+  K +E     +    +  +KKI+ 
Sbjct: 440  DDEEDAPKKGKKNQKKGRDKSSKVPSDSKAGGKKESVKSQED--NNNIPPEEWVMKKILE 497

Query: 1228 VVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDI-------KKILGRIEDGKYSS 1280
             V    + DG       +K      L D+     RPM I       K +     + +   
Sbjct: 498  WVPDL-EEDGTEDPGSILKH-----LADHL----RPMLINSLKERRKALFTENAERRRRL 547

Query: 1281 VDELQKDFKTLCRNAQIYNEELSLIHED---SVVL 1312
            +D LQK       N Q+Y + L L  +D   SVVL
Sbjct: 548  LDNLQKKIDESFLNMQLYEKALDLFEDDQSTSVVL 582


>gnl|CDD|217298 pfam02948, Amelogenin, Amelogenin.  Amelogenins play a role in
           biomineralisation. They seem to regulate the formation
           of crystallites during the secretory stage of tooth
           enamel development and are thought to play a major role in the
           structural organisation and mineralisation of developing
           enamel. They are found in the extracellular matrix.
           Mutations in X-chromosomal amelogenin can cause
           Amelogenesis imperfecta.
          Length = 174

 Score = 34.9 bits (80), Expect = 0.11
 Identities = 14/73 (19%), Positives = 20/73 (27%), Gaps = 5/73 (6%)

Query: 132 VPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-----PQPHQQQGHISSQIKQS 186
           +P  PQMP         + P   +       P+   P      P   QQ           
Sbjct: 51  IPLSPQMPQQQQSAHPKLTPHHQLLILPPQQPMMPVPGHHPMVPMTGQQPHLQPPAQHPL 110

Query: 187 KLTNIPKPEGLDP 199
           + T    P+   P
Sbjct: 111 QPTYGQNPQPQQP 123



 Score = 33.0 bits (75), Expect = 0.61
 Identities = 20/65 (30%), Positives = 24/65 (36%), Gaps = 3/65 (4%)

Query: 133 PSGPQMP--PMSLHGPM-PMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLT 189
              PQ P  P+  H PM PM   QP        PLQ      P  QQ   +    Q +  
Sbjct: 76  ILPPQQPMMPVPGHHPMVPMTGQQPHLQPPAQHPLQPTYGQNPQPQQPTHTQPPVQPQQP 135

Query: 190 NIPKP 194
             P+P
Sbjct: 136 ADPQP 140



 Score = 32.2 bits (73), Expect = 1.0
 Identities = 21/88 (23%), Positives = 28/88 (31%), Gaps = 7/88 (7%)

Query: 90  HAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKR-MEGVPSGPQMPPMSLHGPMP 148
           H       QQ   Q  A        PL P      Q ++     P      P       P
Sbjct: 90  HPMVPMTGQQPHLQPPA------QHPLQPTYGQNPQPQQPTHTQPPVQPQQPADPQPGQP 143

Query: 149 MPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
           M P QP+P     +PL+  P     +Q+
Sbjct: 144 MFPMQPLPPLVPDLPLEPWPAADKTKQE 171


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1 protein
            homologues. SDA1 is a Saccharomyces cerevisiae protein
            which is involved in the control of the actin
            cytoskeleton. The protein is essential for cell viability
            and is localised in the nucleus.
          Length = 317

 Score = 35.8 bits (83), Expect = 0.13
 Identities = 12/50 (24%), Positives = 29/50 (58%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
            D  +E  D E+EEE++  +K+  +   +   +++E   +++ + E EK++
Sbjct: 124  DKEIESSDSEDEEEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEK 173


>gnl|CDD|184281 PRK13729, PRK13729, conjugal transfer pilus assembly protein TraB;
           Provisional.
          Length = 475

 Score = 36.0 bits (83), Expect = 0.14
 Identities = 34/198 (17%), Positives = 59/198 (29%), Gaps = 34/198 (17%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
           +S+   S N     +Q+P  ++  +                         +++ T   + 
Sbjct: 33  LSDVDMSGNGEAVAEQEPVPDMTGVV----------------DTTFDDKVRQHATTEMQV 76

Query: 61  IDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQL 120
             +  ++  EE  R   ++  +    + +          L  Q+ A   L  N    P  
Sbjct: 77  TAAQMQKQYEEIRRELDVLNKQRGDDQRRIEKLGQDNAALAEQVKA---LGAN----PVT 129

Query: 121 AMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHIS 180
           A G      E VP  P  PP     P P       P Q         PPP      G+  
Sbjct: 130 ATG------EPVPQMPASPPGPEGEPQPGNTPVSFPPQGSVAV----PPPTAF-YPGNGV 178

Query: 181 SQIKQSKLTNIPKPEGLD 198
           +   Q    ++P P  + 
Sbjct: 179 TPPPQVTYQSVPVPNRIQ 196


>gnl|CDD|223065 PHA03378, PHA03378, EBNA-3B; Provisional.
          Length = 991

 Score = 36.2 bits (83), Expect = 0.15
 Identities = 32/122 (26%), Positives = 44/122 (36%), Gaps = 12/122 (9%)

Query: 106 AYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHG-PMPMPPSQPMPNQAQPMPL 164
           A    A      P  A   + +     P G   PP +  G P P PP      QA P P 
Sbjct: 734 ARPPAAAPGRARPPAAAPGRARPPAAAP-GRARPPAAAPGAPTPQPPP-----QAPPAPQ 787

Query: 165 QQ---QPPPQPHQQQGHISSQI--KQSKLTNIPKPEGLDPLIILQERENRVALNIERRIE 219
           Q+    P PQP  Q G  S Q+  + +     P  + L  L+    +  R +L     +E
Sbjct: 788 QRPRGAPTPQPPPQAGPTSMQLMPRAAPGQQGPTKQILRQLLTGGVKRGRPSLKKPAALE 847

Query: 220 EL 221
             
Sbjct: 848 RQ 849



 Score = 36.2 bits (83), Expect = 0.16
 Identities = 45/233 (19%), Positives = 70/233 (30%), Gaps = 52/233 (22%)

Query: 5   STSPNPP-----PPQQQQP---PLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTA 56
           S +P PP      P+   P   P+ +  +PM  P    P +        P Q PQ  +T 
Sbjct: 603 SQTPEPPTTQSHIPETSAPRQWPMPLRPIPM-RPLRMQPITFNVLVFPTPHQPPQVEITP 661

Query: 57  LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLL---ARN 113
            +              P + ++  +    +             L  Q     +       
Sbjct: 662 YK--------------PTWTQIGHIPYQPSPTGAN------TMLPIQWAPGTMQPPPRAP 701

Query: 114 QPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHG--------PMPMPPSQPMPNQAQP---M 162
            P+ P  A   + +R      G   PP +  G        P    P    P +A+P    
Sbjct: 702 TPMRPPAAPPGRAQRPAAAT-GRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAA 760

Query: 163 PLQQQPP--------PQPHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQERE 207
           P + +PP        PQP  Q      Q  +   T  P P+     + L  R 
Sbjct: 761 PGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGPTSMQLMPRA 813



 Score = 32.7 bits (74), Expect = 1.6
 Identities = 16/60 (26%), Positives = 20/60 (33%), Gaps = 5/60 (8%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ-APGQNPQENLTALQR 59
               + +P    P    P     + P  APG   P  P  +PG   P   PQ      QR
Sbjct: 734 ARPPAAAPGRARPPAAAP--GRARPPAAAPGRARP--PAAAPGAPTPQPPPQAPPAPQQR 789



 Score = 30.4 bits (68), Expect = 7.4
 Identities = 20/54 (37%), Positives = 22/54 (40%), Gaps = 6/54 (11%)

Query: 1   MSNSSTSPNPPPPQQQ-QPPLNVGQLPMGAPGSGPPGSPGP-----SPGQAPGQ 48
               + +P  P PQ   Q P    Q P GAP   PP   GP      P  APGQ
Sbjct: 764 ARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGPTSMQLMPRAAPGQ 817


>gnl|CDD|221404 pfam12067, Sox_C_TAD, Sox C-terminal transactivation domain.  This
           domain is found at the C-terminus of the Sox family of
           transcription factors. It is found associated with
           pfam00505. It binds to the Armadillo repeats (pfam00514)
           in Catenin beta-1 (CTNNB1), which is involved in
           transcriptional regulation. It functions as a
           transactivating domain (TAD).
          Length = 197

 Score = 35.1 bits (81), Expect = 0.15
 Identities = 20/89 (22%), Positives = 28/89 (31%), Gaps = 20/89 (22%)

Query: 113 NQPLTPQLAM-GVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPM-------PNQAQPMPL 164
           N    PQ A      ++M    + PQ  P      M  P    M       P  A+  P+
Sbjct: 56  NSSYAPQNAHAPALLRQMAVTENIPQGSPA--PSIMGCPTPPQMYYGQMYVPECAKHHPV 113

Query: 165 ---QQQPPPQ-------PHQQQGHISSQI 183
              Q  PPP+          QQ  +   +
Sbjct: 114 QLGQLSPPPESQHLDTLDQLQQAELLGDV 142



 Score = 30.1 bits (68), Expect = 6.5
 Identities = 12/57 (21%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-PQPHQQQGHISSQIKQS 186
           G+P+ P+M P+      P   S P     Q MP          +     +  Q+  +
Sbjct: 21  GLPT-PEMSPLDALESEPAFFSPPCQEDCQMMPYGYNSSYAPQNAHAPALLRQMAVT 76


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 36.2 bits (84), Expect = 0.15
 Identities = 19/71 (26%), Positives = 30/71 (42%), Gaps = 2/71 (2%)

Query: 1157 WLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQA 1216
            WL+ +D   E  +E+EE EE+     K +R K             K KK+++K ++    
Sbjct: 1130 WLEDLDKFEEALEEQEEVEEK--EIAKEQRLKSKTKGKASKLRKPKLKKKEKKKKKSSAD 1187

Query: 1217 KLKKTLKKIMR 1227
            K KK       
Sbjct: 1188 KSKKASVVGNS 1198



 Score = 31.9 bits (73), Expect = 3.3
 Identities = 20/87 (22%), Positives = 39/87 (44%), Gaps = 5/87 (5%)

Query: 1155 KEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
             ++ +A+++  E +++E  +E+ ++SK KGK  K       +P   KK KK+K+   +K 
Sbjct: 1135 DKFEEALEEQEEVEEKEIAKEQRLKSKTKGKASKL-----RKPKLKKKEKKKKKSSADKS 1189

Query: 1215 QAKLKKTLKKIMRVVIKYTDSDGRVLS 1241
            +        K +    K    D     
Sbjct: 1190 KKASVVGNSKRVDSDEKRKLDDKPDNK 1216


>gnl|CDD|219753 pfam08226, DUF1720, Domain of unknown function (DUF1720).  This
           domain is found in different combinations with cortical
           patch components EF hand, SH3 and ENTH and is therefore
           likely to be involved in cytoskeletal processes. This
           family contains many hypothetical proteins.
          Length = 73

 Score = 32.1 bits (73), Expect = 0.21
 Identities = 14/68 (20%), Positives = 18/68 (26%), Gaps = 10/68 (14%)

Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMP---PSQPMPNQAQP------MPLQQQP 168
           PQ       ++ +    GP + P       P P     Q    Q Q          Q   
Sbjct: 3   PQQTGYQPPQQQQPQQQGP-LQPQPTGFMQPQPTGFGQQQQGLQPQQTGFQPQAGQQMPT 61

Query: 169 PPQPHQQQ 176
              P Q Q
Sbjct: 62  GTGPLQPQ 69



 Score = 29.4 bits (66), Expect = 2.2
 Identities = 15/59 (25%), Positives = 18/59 (30%), Gaps = 12/59 (20%)

Query: 114 QPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMP----MPPSQPMPNQAQPMPLQQQP 168
            PL PQ    +Q +     P+G       L    P      P           PLQ QP
Sbjct: 20  GPLQPQPTGFMQPQ-----PTGFGQQQQGL---QPQQTGFQPQAGQQMPTGTGPLQPQP 70


>gnl|CDD|220708 pfam10349, WWbp, WW-domain ligand protein.  The WWbp domain is
           characterized by several short PY and PT-like motifs of
           the PPPPY form. These appear to bind directly to the WW
           domains of WWP1 and WWP2 and other such diverse proteins
           as dystrophin and YAP (Yes-associated protein). This is
           the WW-domain binding protein WWbp via PY and PY_like
           motifs. The presence of a phosphotyrosine residue in the
           pWBP-1 peptide abolishes WW domain binding which
           suggests a potential regulatory role for tyrosine
           phosphorylation in modulating WW domain-ligand
           interactions. Given the likelihood that WWP1 and WWP2
           function as E3 ubiquitin-protein ligases, it is possible
           that initial substrate-specific recognition occurs via
           WW domain-substrate protein interaction followed by
           ubiquitin transfer and subsequent proteolysis. This
           domain lies just downstream of the GRAM (pfam02893) in
           many members.
          Length = 111

 Score = 33.1 bits (76), Expect = 0.24
 Identities = 15/66 (22%), Positives = 17/66 (25%), Gaps = 5/66 (7%)

Query: 110 LARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP 169
             R QP+      G        V   P  P          PP  P P      P     P
Sbjct: 34  AQRAQPV--SRESGYYPPPGAYVHLEPL-PAY--GQYAAPPPYGPPPPYYPAPPGVYPTP 88

Query: 170 PQPHQQ 175
           P P+  
Sbjct: 89  PPPNSG 94



 Score = 32.8 bits (75), Expect = 0.29
 Identities = 11/47 (23%), Positives = 14/47 (29%), Gaps = 2/47 (4%)

Query: 134 SGPQMPPMSLHGPMPMPP--SQPMPNQAQPMPLQQQPPPQPHQQQGH 178
           SG   PP +     P+P       P    P P     PP  +     
Sbjct: 44  SGYYPPPGAYVHLEPLPAYGQYAAPPPYGPPPPYYPAPPGVYPTPPP 90



 Score = 32.0 bits (73), Expect = 0.49
 Identities = 20/82 (24%), Positives = 26/82 (31%), Gaps = 5/82 (6%)

Query: 95  AQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQP 154
            + Q +  +   Y        L P  A G         P     PP     P  + P+ P
Sbjct: 35  QRAQPVSRESGYYPPPGAYVHLEPLPAYG-----QYAAPPPYGPPPPYYPAPPGVYPTPP 89

Query: 155 MPNQAQPMPLQQQPPPQPHQQQ 176
            PN       Q+ PPP P   Q
Sbjct: 90  PPNSGYMADPQEPPPPYPGPPQ 111



 Score = 31.2 bits (71), Expect = 0.94
 Identities = 14/57 (24%), Positives = 16/57 (28%), Gaps = 1/57 (1%)

Query: 1  MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSG-PPGSPGPSPGQAPGQNPQENLTA 56
          +S  S    PP       PL         P  G PP      PG  P   P  +   
Sbjct: 40 VSRESGYYPPPGAYVHLEPLPAYGQYAAPPPYGPPPPYYPAPPGVYPTPPPPNSGYM 96



 Score = 30.1 bits (68), Expect = 2.7
 Identities = 11/36 (30%), Positives = 11/36 (30%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSP 42
              P PP     P       M  P   PP  PGP  
Sbjct: 76  PYYPAPPGVYPTPPPPNSGYMADPQEPPPPYPGPPQ 111


>gnl|CDD|203444 pfam06424, PRP1_N, PRP1 splicing factor, N-terminal.  This domain is
            specific to the N-terminal part of the prp1 splicing
            factor, which is involved in mRNA splicing (and possibly
            also poly(A)+ RNA nuclear export and cell cycle
            progression). This domain is specific to the N terminus
            of the RNA splicing factor encoded by prp1. It is
            involved in mRNA splicing and possibly also poly(A)+
            RNA nuclear export and cell cycle progression.
          Length = 131

 Score = 33.4 bits (77), Expect = 0.24
 Identities = 18/64 (28%), Positives = 34/64 (53%), Gaps = 8/64 (12%)

Query: 1166 EYDDEEEEEE---EEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQ-AKLKKT 1221
            +YDDE+EE +   E +  +   +R+K+ E  ++E    +  K  +E  + + Q A LK+ 
Sbjct: 55   KYDDEDEEADRIYESIDERMDERRKKRREQKEKE----EIEKYREENPKIQQQFADLKRN 110

Query: 1222 LKKI 1225
            L  +
Sbjct: 111  LATV 114


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
            cell cycle arrest and pre-mRNA splicing. It has been
            shown to be a component of U4/U6 x U5 tri-snRNP complex
            in human, Schizosaccharomyces pombe and Saccharomyces
            cerevisiae. SART-1 is a known tumour antigen in a range
            of cancers recognised by T cells.
          Length = 603

 Score = 35.5 bits (82), Expect = 0.24
 Identities = 39/192 (20%), Positives = 72/192 (37%), Gaps = 22/192 (11%)

Query: 1057 DEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKE 1116
            DE  E   ++  +    +   + E   +     + +  +E       +E+S + +   KE
Sbjct: 394  DETSEFVRSLQKEPLEEKPENKDESVEEISDAEEDDEDEEDEDGDGDVEMSAVDNDEEKE 453

Query: 1117 DEEIEQWAFEAKEEEKALHMGRGS-----RQRKQVDYTDSLTE-KEWLK-AIDDGVEYDD 1169
            +E+ E       EEE  +  G  +     + R  +       E +E+LK      +  + 
Sbjct: 454  EEDKEAIPSTILEEEPTVGGGLAAALKLLKSRGILKKNQLERERREFLKEKERLKLLAEI 513

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVV 1229
             E  E E  R+  K  R    E   EE +  +  +++KE+  + D             V 
Sbjct: 514  RERIERERDRNDGKYSRMSARER--EEYARPENDQRDKEEAYKPD-------------VK 558

Query: 1230 IKYTDSDGRVLS 1241
            +KY D  GR L+
Sbjct: 559  LKYVDEFGRELT 570



 Score = 32.4 bits (74), Expect = 1.6
 Identities = 25/98 (25%), Positives = 45/98 (45%), Gaps = 13/98 (13%)

Query: 1150 DSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKK-------TEDDDEEPSTS-- 1200
            D+   + W K  ++       EE  E+  +++ K +R  K        EDDD++  T   
Sbjct: 35   DAAAYENWKKRQEEAEAKRKREELREKIAKAREKRERNSKLGGIKTLGEDDDDDDDTKAW 94

Query: 1201 --KKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSD 1236
              K +K++K+K+ E+ +A L    +K      +YT  D
Sbjct: 95   LKKSKKRQKKKEAERKKALLLDEKEK--ERAAEYTSED 130


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 35.3 bits (82), Expect = 0.26
 Identities = 31/193 (16%), Positives = 41/193 (21%), Gaps = 44/193 (22%)

Query: 8   PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
             PP P    PP    + P  A  + P     P+P  A     + +              
Sbjct: 598 EGPPAPASSGPPEEAAR-P--AAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHP 654

Query: 68  GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGK 127
                P                    S                    P  P         
Sbjct: 655 KHVAVPDA------------------SDGGDG------WPAKAGGAAPAAP--------- 681

Query: 128 RMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQ-AQPMPLQQQPPPQPHQQQGHISSQIKQS 186
                   P   P +   P    P+QP P   A P   Q   P     Q    +S    +
Sbjct: 682 -------PPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPA 734

Query: 187 KLTNIPKPEGLDP 199
               +P P   D 
Sbjct: 735 ADDPVPLPPEPDD 747



 Score = 32.7 bits (75), Expect = 1.8
 Identities = 11/43 (25%), Positives = 13/43 (30%)

Query: 10  PPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQE 52
            P P         G  P   P + P   P P+P  AP      
Sbjct: 438 APAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAP 480



 Score = 31.9 bits (73), Expect = 2.9
 Identities = 12/65 (18%), Positives = 14/65 (21%)

Query: 108 RLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQ 167
           RL  R        A                  P +        P+     Q  P P    
Sbjct: 380 RLERRLGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAP 439

Query: 168 PPPQP 172
            PP P
Sbjct: 440 APPSP 444



 Score = 31.1 bits (71), Expect = 5.0
 Identities = 9/46 (19%), Positives = 11/46 (23%)

Query: 7   SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQE 52
               PP      P      P  A       +P P+    P   P  
Sbjct: 437 PAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAP 482



 Score = 30.7 bits (70), Expect = 6.7
 Identities = 14/68 (20%), Positives = 18/68 (26%), Gaps = 1/68 (1%)

Query: 10  PPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAP-GQNPQENLTALQRAIDSMKEQG 68
           PPP      P          P   P  +P       P  Q PQ    A   +  +     
Sbjct: 681 PPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVP 740

Query: 69  LEEDPRYQ 76
           L  +P   
Sbjct: 741 LPPEPDDP 748



 Score = 30.3 bits (69), Expect = 8.1
 Identities = 13/51 (25%), Positives = 15/51 (29%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQE 52
            N+     P PP    P       P  AP      +P P    AP   P  
Sbjct: 446 GNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAA 496


>gnl|CDD|99926 cd05494, Bromodomain_1, Bromodomain; uncharacterized subfamily.
            Bromodomains are found in many chromatin-associated
            proteins and in nuclear histone acetyltransferases. They
            interact specifically with acetylated lysine.
          Length = 114

 Score = 32.8 bits (75), Expect = 0.28
 Identities = 15/49 (30%), Positives = 25/49 (51%), Gaps = 2/49 (4%)

Query: 1241 SEPFIKL--PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKD 1287
            + PF++   P R+  PDY +VI RPM     +  I +     +++LQ  
Sbjct: 21   AWPFLEPVNPPRRGAPDYRDVIKRPMSFGTKVNNIVETGARDLEDLQIV 69


>gnl|CDD|227512 COG5185, HEC1, Protein involved in chromosome segregation, interacts
            with SMC proteins [Cell division and chromosome
            partitioning].
          Length = 622

 Score = 35.0 bits (80), Expect = 0.30
 Identities = 36/166 (21%), Positives = 64/166 (38%), Gaps = 19/166 (11%)

Query: 1067 PDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFE 1126
             + E   Q L    E+F      D    K Q     L E        I+E  +I Q    
Sbjct: 249  DNYEPSEQELKLGFEKFVHIINTDIANLKTQND--NLYE-------KIQEAMKISQKIKT 299

Query: 1127 AKEEEKALHMGRGSRQRKQVDYTDSLTEK--EWLKAIDDGVEYDDEEEEEEEEVRSKRKG 1184
             +E+ +AL     S   K  +Y +++ +K  EW   ++      + +EEE + ++S    
Sbjct: 300  LREKWRALK----SDSNKYENYVNAMKQKSQEWPGKLEKLKSEIELKEEEIKALQSNIDE 355

Query: 1185 KRRKKTEDDDEEPSTSKKRKKEKEK-DREKDQAKLKKTLKKIMRVV 1229
               K+           +   +E+EK  RE D+  ++    K+ + V
Sbjct: 356  -LHKQLRKQGISTEQFELMNQEREKLTRELDKINIQSD--KLTKSV 398


>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein.  The YqfQ-like protein family
            includes the B. subtilis YqfQ protein, also known as
            VrrA, which is functionally uncharacterized. This family
            of proteins is found in bacteria. Proteins in this family
            are typically between 146 and 237 amino acids in length.
            There are two conserved sequence motifs: QYGP and PKLY.
          Length = 155

 Score = 33.6 bits (77), Expect = 0.32
 Identities = 16/62 (25%), Positives = 28/62 (45%)

Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAK 1217
            L + DD  E  +EE  +E E     + K   K +   E P    +++K K + ++   +K
Sbjct: 91   LSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSK 150

Query: 1218 LK 1219
             K
Sbjct: 151  PK 152



 Score = 31.3 bits (71), Expect = 1.9
 Identities = 18/56 (32%), Positives = 30/56 (53%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLK 1223
            DDEEEE EEE   + + +   +T+ + +E    +  K + EK++ K + K  K  K
Sbjct: 95   DDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSK 150


>gnl|CDD|206039 pfam13868, Trichoplein, Tumour suppressor, Mitostatin.  Trichoplein,
           or mitostatin, was first defined as a meiosis-specific
           nuclear structural protein. It has since been linked
           with mitochondrial movement. It is associated with the
           mitochondrial outer membrane, and over-expression leads
           to reduction in mitochondrial motility whereas lack of
           it enhances mitochondrial movement. The activity appears
           to be mediated through binding the mitochondria to the
           actin intermediate filaments (IFs).
          Length = 349

 Score = 34.5 bits (80), Expect = 0.33
 Identities = 25/112 (22%), Positives = 54/112 (48%), Gaps = 1/112 (0%)

Query: 277 KRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMR 336
           +R ++Q L+ AR  +  EK+++++ ER + +  +E +    Q   +  E     + R+ R
Sbjct: 228 RRRQKQELQRAREEQIEEKEERLQEERAEEEAERERMLEK-QAEDEELEQENAEKRRMKR 286

Query: 337 LNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLA 388
           L           EKE+++  ER E+      + E+E   +  I++++ + L 
Sbjct: 287 LEHRRELEQQIEEKEERRAAEREEELEEGERLREEEAERQARIEEERQRLLK 338


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 35.4 bits (81), Expect = 0.33
 Identities = 51/255 (20%), Positives = 91/255 (35%), Gaps = 51/255 (20%)

Query: 1055 QDDEEDEEENAVPDDETVNQMLARSEE----EFQTYQRID-------------AERRKEQ 1097
            + +E+  EEN   ++E+    +   EE    E    Q ID             AE  +E 
Sbjct: 4061 KMNEDGFEENVQENEESTEDGVKSDEELEQGEVPEDQAIDNHPKMDAKSTFASAEADEEN 4120

Query: 1098 GKKSRLIEVSEL-----------PDWLIKEDEEIEQWAFEAKEEEKALHMGRGS-----R 1141
              K  + E  EL            D   ++ +E      EA  E    +   G      +
Sbjct: 4121 TDKGIVGENEELGEEDGVRGNGTADGEFEQVQEDTSTPKEAMSEADRQYQSLGDHLREWQ 4180

Query: 1142 QRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRK-KTEDDDEEPSTS 1200
            Q  ++   + LTE +      D  E+   +E+EEE++++    ++ + K+ D DE  + +
Sbjct: 4181 QANRIHEWEDLTESQ--SQAFDDSEFMHVKEDEEEDLQALGNAEKDQIKSIDRDESANQN 4238

Query: 1201 KKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVI 1260
                       ++      K L             DG+ +S+  IK      LP  +  I
Sbjct: 4239 PDSMNSTNIAEDEADEVGDKQL------------QDGQDISD--IKQTGEDTLPTEFGSI 4284

Query: 1261 DRPMDIKKILGRIED 1275
            ++   +   L   ED
Sbjct: 4285 NQSEKV-FELSEDED 4298



 Score = 35.0 bits (80), Expect = 0.38
 Identities = 32/145 (22%), Positives = 55/145 (37%), Gaps = 18/145 (12%)

Query: 395  DEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETS----QLTD 450
            DE         ++   EQ    +E     K+   + L D D +   D++E S       +
Sbjct: 3909 DEPNEEDLLETEQKSNEQSAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSDDVGIDDE 3968

Query: 451  MHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGE 510
            +   ++E +S    + ED  L   LK             D  E +  +DS+         
Sbjct: 3969 IQPDIQENNSQPPPENEDLDLPEDLK------------LDEKEGDVSKDSDLEDMDMEAA 4016

Query: 511  NENKEKNKGEDDE--YNKNAMEEAT 533
            +ENKE+   E DE   +++ +EE  
Sbjct: 4017 DENKEEADAEKDEPMQDEDPLEENN 4041


>gnl|CDD|222449 pfam13908, Shisa, Wnt and FGF inhibitory regulator.  Shisa is a
           transcription factor-type molecule that physically
           interacts with immature forms of the Wnt receptor
           Frizzled and the FGF receptor within the endoplasmic
           reticulum to inhibit their post-translational maturation
           and trafficking to the cell surface.
          Length = 177

 Score = 33.6 bits (77), Expect = 0.33
 Identities = 23/71 (32%), Positives = 28/71 (39%), Gaps = 11/71 (15%)

Query: 112 RNQPLTPQ-LAMGVQGKRM----EGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQ 166
             +P+  +  +  VQ   +       PS P       H   PMPP   MP  A P  L Q
Sbjct: 109 PQRPVMTRATSTTVQTTPLPQPPSTAPSYPGPQYQGYH---PMPPQPGMP--APPYSL-Q 162

Query: 167 QPPPQPHQQQG 177
            PPP   Q QG
Sbjct: 163 YPPPGLLQPQG 173



 Score = 31.3 bits (71), Expect = 2.1
 Identities = 11/47 (23%), Positives = 14/47 (29%)

Query: 8   PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENL 54
               P   +     V   P+  P S  P  PGP         PQ  +
Sbjct: 108 RPQRPVMTRATSTTVQTTPLPQPPSTAPSYPGPQYQGYHPMPPQPGM 154


>gnl|CDD|111993 pfam03157, Glutenin_hmw, High molecular weight glutenin subunit.
           Members of this family include high molecular weight
           subunits of glutenin. This group of gluten proteins is
           thought to be largely responsible for the elastic
           properties of gluten, and hence, doughs. Indeed,
           glutenin high molecular weight subunits are classified
           as elastomeric proteins, because the glutenin network
           can withstand significant deformations without breaking,
           and return to the original conformation when the stress
           is removed. Elastomeric proteins differ considerably in
           amino acid sequence, but they are all polymers whose
           subunits consist of elastomeric domains, composed of
           repeated motifs, and non-elastic domains that mediate
           cross-linking between the subunits. The elastomeric
           domain motifs are all rich in glycine residues in
           addition to other hydrophobic residues. High molecular
           weight glutenin subunits have an extensive central
           elastomeric domain, flanked by two terminal non-elastic
           domains that form disulphide cross-links. The central
           elastomeric domain is characterized by the following
           three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It
           possesses overlapping beta-turns within and between the
           repeated motifs, and assumes a regular helical secondary
           structure with a diameter of approx. 1.9 nm and a pitch
           of approx. 1.5 nm.
          Length = 779

 Score = 35.1 bits (79), Expect = 0.34
 Identities = 51/200 (25%), Positives = 67/200 (33%), Gaps = 28/200 (14%)

Query: 12  PPQQQQP--------PLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP--QENLTALQRAI 61
           P Q QQP        P +  Q   G PG  P  S  P+  Q PGQ    Q+     Q  I
Sbjct: 288 PAQGQQPGQGQPGHYPASPQQPGQGQPGHYPASSQQPTQSQEPGQGQQGQQVGQGQQAQI 347

Query: 62  DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTS----AQVQQLRFQIMAYRLLARNQPLT 117
            +  +Q  +  P +     ++    +  H  TS     Q QQ+     +       QP  
Sbjct: 348 PAQGQQPGQGQPGHYPASPLQQGPGQPGHYLTSLQQLGQGQQIGQLQQSAPGQKGQQPGQ 407

Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQP--------- 168
            Q     Q  +  G     Q P     G  P    Q  P Q Q     QQP         
Sbjct: 408 GQQPGQGQQGQQPGQGEQEQQPGQGQPGYYPTSLQQ--PGQGQQPGQWQQPGQGQPGYYP 465

Query: 169 --PPQPHQ-QQGHISSQIKQ 185
               QP Q Q GH  + ++Q
Sbjct: 466 TSLLQPGQGQPGHDPASLQQ 485



 Score = 33.9 bits (76), Expect = 0.77
 Identities = 46/202 (22%), Positives = 62/202 (30%), Gaps = 15/202 (7%)

Query: 6   TSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ-------------APGQNPQE 52
           TSP   P Q QQP        +G    G     G  PGQ               GQ  Q+
Sbjct: 147 TSPQHQPGQLQQPAQGQQGQQIGQGQQGQQPEQGQQPGQGQQGQQPGQGQQPGQGQQGQQ 206

Query: 53  NLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLAR 112
                Q       +Q  +  P +      +  + +  H   S Q      Q    +  A+
Sbjct: 207 LGQGQQGYYPGQLQQSGQGQPGHYPTSLQQLGQGQQGHYLASPQQPGQGQQPGQLQQPAQ 266

Query: 113 NQPLTPQLAMGVQGKRMEG-VPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ 171
            Q            +  +G  P+  Q P     G  P  P QP   Q    P   Q P Q
Sbjct: 267 GQQPEQGQQGQQPAQGQQGHQPAQGQQPGQGQPGHYPASPQQPGQGQPGHYPASSQQPTQ 326

Query: 172 PHQQ-QGHISSQIKQSKLTNIP 192
             +  QG    Q+ Q +   IP
Sbjct: 327 SQEPGQGQQGQQVGQGQQAQIP 348



 Score = 31.6 bits (70), Expect = 4.0
 Identities = 48/177 (27%), Positives = 62/177 (35%), Gaps = 5/177 (2%)

Query: 12  PPQQQQPPLNVGQLPMGAPGSGP-PGSPGPSPGQA-PGQNPQENLTALQRAIDSMKEQGL 69
           P Q+ Q P    Q   G  G  P  G     PGQ  PG  P       Q       +Q  
Sbjct: 398 PGQKGQQPGQGQQPGQGQQGQQPGQGEQEQQPGQGQPGYYPTSLQQPGQGQQPGQWQQPG 457

Query: 70  EEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRM 129
           +  P Y     ++  + +  H   S Q      Q    +  A+ QP   QLA G QG++ 
Sbjct: 458 QGQPGYYPTSLLQPGQGQPGHDPASLQQPGQGQQPGQLQQPAQGQP-GQQLAQGQQGQQP 516

Query: 130 EGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ-QQGHISSQIKQ 185
             V  G Q         +        P Q Q  P Q +   QP Q QQG    Q +Q
Sbjct: 517 AQVQQGQQPAQGQQGQQLGQGQQGQQPGQGQ-HPAQGEQGQQPGQGQQGQQPGQGQQ 572


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 34.7 bits (80), Expect = 0.41
 Identities = 47/206 (22%), Positives = 87/206 (42%), Gaps = 18/206 (8%)

Query: 1089 IDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRG-SRQRKQVD 1147
            +  E RKE  ++    E+  +   L + +E+  +   E +E EK L       + ++  +
Sbjct: 445  LTEEHRKELLEEYTA-ELKRIEKELKEIEEKERKLRKELRELEKVLKKESELIKLKELAE 503

Query: 1148 YTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKG--KRRKKTEDDDEEPSTSKKRKK 1205
                L EK  LK  +  +E  +++ EE E+++ K        K  + + E+    KK+  
Sbjct: 504  QLKELEEK--LKKYN--LEELEKKAEEYEKLKEKLIKLKGEIKSLKKELEKLEELKKKLA 559

Query: 1206 EKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMD 1265
            E EK  ++ + +L + LK++  +  +  +     L          KEL  +Y       D
Sbjct: 560  ELEKKLDELEEELAELLKELEELGFESVEELEERL----------KELEPFYNEYLELKD 609

Query: 1266 IKKILGRIEDGKYSSVDELQKDFKTL 1291
             +K L R E       +EL K F+ L
Sbjct: 610  AEKELEREEKELKKLEEELDKAFEEL 635



 Score = 34.3 bits (79), Expect = 0.59
 Identities = 39/229 (17%), Positives = 90/229 (39%), Gaps = 21/229 (9%)

Query: 213 NIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLE-TAV 271
           NIE  I+E    L   L E   + +E+            +LR E+    +    LE    
Sbjct: 190 NIEELIKEKEKELEEVLREINEISSEL-----------PELREELEKLEKEVKELEELKE 238

Query: 272 NVKAYKRTKRQGLKEARATE--------KLEKQQKVEAERKKRQKHQEYITTVLQHCKDF 323
            ++  ++         R  E        ++E+ +K   E +++ K  + +    +     
Sbjct: 239 EIEELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEKVKELKELKEKAEEYIKL 298

Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKE-RMRRLMAEDEEGYRKLIDQK 382
            E++      +  + K +          +++ +E  EKE R+  L  + +E  ++L + +
Sbjct: 299 SEFYEEYLDELREIEKRLSRLEEEINGIEERIKELEEKEERLEELKKKLKELEKRLEELE 358

Query: 383 KDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKL 431
           +   L        E +  L + +     E+ +K+ EE +K K+ +++++
Sbjct: 359 ERHELYEEAKAKKEELERLKKRLTGLTPEKLEKELEELEKAKEEIEEEI 407



 Score = 32.3 bits (74), Expect = 2.1
 Identities = 33/155 (21%), Positives = 65/155 (41%), Gaps = 11/155 (7%)

Query: 1075 MLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKAL 1134
             +   ++E +  +    E ++ + K    I++SE  +  + E  EIE+     +EE   +
Sbjct: 267  RIEELKKEIEELEEKVKELKELKEKAEEYIKLSEFYEEYLDELREIEKRLSRLEEEINGI 326

Query: 1135 HMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEE----EEVRSKRKGKRRKKT 1190
                   + +  +  +     E LK     +E   EE EE     EE ++K++   R K 
Sbjct: 327  -------EERIKELEEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKK 379

Query: 1191 EDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
                  P   +K  +E EK +E+ + ++ K   +I
Sbjct: 380  RLTGLTPEKLEKELEELEKAKEEIEEEISKITARI 414



 Score = 32.3 bits (74), Expect = 2.3
 Identities = 35/175 (20%), Positives = 85/175 (48%), Gaps = 32/175 (18%)

Query: 291 EKLEKQQKVEAERKKRQKHQEYIT-------TVLQHCKDFKEYHRNNQARIMRLNKAVMN 343
           E  E+ +K+E E K+ ++ +E I        ++    +  +E  R  + RI         
Sbjct: 218 ELREELEKLEKEVKELEELKEEIEELEKELESLEGSKRKLEEKIRELEERI--------- 268

Query: 344 YHANAEKEQKKEQERIEKERMRRL--MAEDEEGYRKLID-----QKKDKRLAFLLSQTDE 396
                  E+ K++    +E+++ L  + E  E Y KL +       + + +   LS+ +E
Sbjct: 269 -------EELKKEIEELEEKVKELKELKEKAEEYIKLSEFYEEYLDELREIEKRLSRLEE 321

Query: 397 YISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDM 451
            I+ + + +KE  +E+K+++ EE KK+ + ++++L + + +  L ++  ++  ++
Sbjct: 322 EINGIEERIKE--LEEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEEL 374


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
            family of proteins is involved in 60S ribosomal
            biogenesis. They are specifically involved in the
            processing beyond the 27S stage of 25S rRNA maturation.
            This family contains sequences that bear similarity to
            the glioma tumour suppressor candidate region gene 2
            protein (p60). This protein has been found to interact
            with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 34.3 bits (79), Expect = 0.44
 Identities = 23/105 (21%), Positives = 45/105 (42%), Gaps = 12/105 (11%)

Query: 1126 EAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
            E K E+K   + R   ++ +        E   L  + +G+  + +++ EEE         
Sbjct: 206  EVKAEKKRQELERVEEKKLEK----MAPEASRLDEMSEGLLEESDDDGEEESDDESAWEG 261

Query: 1186 RRKKTEDDDEEPSTSKKRKKEKEKDREK------DQAKLKKTLKK 1224
               ++E +        KRK + ++++EK       +AK +K LKK
Sbjct: 262  --FESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQLKK 304


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 34.5 bits (79), Expect = 0.45
 Identities = 27/136 (19%), Positives = 59/136 (43%), Gaps = 26/136 (19%)

Query: 398 ISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVRE 457
           +S L +++ ++ M+Q   ++EE +KR++  +Q                   T    S   
Sbjct: 381 VSRLEEVISKYAMKQDDTEEEERRKRQERERQG------------------TSSRSSDPS 422

Query: 458 ISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKN 517
            +S       ++P  A      Q+    E V + +EE E+E+ E+ + +     + +E+ 
Sbjct: 423 KASST---SGESPSMAS-----QESEEEESVEEEEEEEEEEEEEEQESEEEEGEDEEEEE 474

Query: 518 KGEDDEYNKNAMEEAT 533
           + E D  ++  ME ++
Sbjct: 475 EVEADNGSEEEMEGSS 490


>gnl|CDD|218188 pfam04641, Rtf2, Replication termination factor 2.  It is vital for
            effective cell-replication that replication is not
            stalled at any point by, for instance, damaged bases.
            Rtf2 stabilizes the replication fork stalled at the
            site-specific replication barrier RTS1 by preventing
            replication restart until completion of DNA synthesis by
            a converging replication fork initiated at a flanking
            origin. The RTS1 element terminates replication forks
            that are moving in the cen2-distal direction while
            allowing forks moving in the cen2-proximal direction to
            pass through the region. Rtf2 contains a C2HC2 motif
            related to the C3HC4 RING-finger motif, and would appear
            to fold up, creating a RING finger-like structure but
            forming only one functional Zn2+ ion-binding site.
          Length = 254

 Score = 33.9 bits (78), Expect = 0.46
 Identities = 15/65 (23%), Positives = 22/65 (33%)

Query: 1172 EEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIK 1231
            EEE  + + K+K K+ KK          +       E      Q    K LKK   +   
Sbjct: 177  EEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAKKLKKKRSIAPD 236

Query: 1232 YTDSD 1236
               S+
Sbjct: 237  NEKSE 241


>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3 subunit. 
            This is a family of proteins which are subunits of the
            eukaryotic translation initiation factor 3 (eIF3). In
            yeast it is called Hcr1. The Saccharomyces cerevisiae
            protein eIF3j (HCR1) has been shown to be required for
            processing of 20S pre-rRNA and binds to 18S rRNA and eIF3
            subunits Rpg1p and Prt1p.
          Length = 242

 Score = 33.9 bits (78), Expect = 0.46
 Identities = 11/39 (28%), Positives = 21/39 (53%)

Query: 1172 EEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKD 1210
            EE+E+  R K +   R+  ED  E+    K R ++ +++
Sbjct: 67   EEKEKAKREKEEKGLRELEEDTPEDELAEKLRLRKLQEE 105



 Score = 30.8 bits (70), Expect = 4.2
 Identities = 17/58 (29%), Positives = 31/58 (53%), Gaps = 8/58 (13%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
            D+EE+EE+EE ++K   K + K           K + +EKEK + + + K  + L++ 
Sbjct: 38   DEEEDEEKEEEKAKVAAKAKAKKA--------LKAKIEEKEKAKREKEEKGLRELEED 87



 Score = 30.4 bits (69), Expect = 5.0
 Identities = 19/70 (27%), Positives = 33/70 (47%), Gaps = 4/70 (5%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRK----GKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAK 1217
            DD V+   +EEE+EE+   K K     K +K  +   EE   +K+ K+EK     ++   
Sbjct: 30   DDDVKDSWDEEEDEEKEEEKAKVAAKAKAKKALKAKIEEKEKAKREKEEKGLRELEEDTP 89

Query: 1218 LKKTLKKIMR 1227
              +  +K+  
Sbjct: 90   EDELAEKLRL 99


>gnl|CDD|234090 TIGR03021, pilP_fam, type IV pilus biogenesis protein PilP.
           Members of this protein family are found in type IV
           pilus biogenesis loci and include proteins designated
           PilP [Cell envelope, Surface structures].
          Length = 119

 Score = 32.4 bits (74), Expect = 0.46
 Identities = 14/67 (20%), Positives = 22/67 (32%), Gaps = 6/67 (8%)

Query: 93  TSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPS 152
           T  Q++ L+ +               +    ++     G   GP MP  S  G  PM  +
Sbjct: 3   TVGQLEALQSETALLEAQLARA----KAQNELEEAERGGQVGGPGMPFTS--GVPPMALT 56

Query: 153 QPMPNQA 159
              P  A
Sbjct: 57  GANPTSA 63


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
            subunit (TFIIF-alpha).  Transcription initiation factor
            IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
            II-associating protein 74 (RAP74) is the large subunit of
            transcription factor IIF (TFIIF), which is essential for
            accurate initiation and stimulates elongation by RNA
            polymerase II.
          Length = 528

 Score = 34.2 bits (78), Expect = 0.48
 Identities = 19/85 (22%), Positives = 32/85 (37%), Gaps = 1/85 (1%)

Query: 1140 SRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPST 1199
             R++K  +    L   +  K        +DEE E E+    + K  + K  E DDE+   
Sbjct: 169  KRRKKTANGF-QLMMMKAAKNGPAAFGDEDEETEGEKGGGGRGKDLKIKDLEGDDEDDGD 227

Query: 1200 SKKRKKEKEKDREKDQAKLKKTLKK 1224
               +  E   + +  + K K    K
Sbjct: 228  ESDKGGEDGDEEKSKKKKKKLAKNK 252



 Score = 33.0 bits (75), Expect = 1.1
 Identities = 32/165 (19%), Positives = 64/165 (38%), Gaps = 22/165 (13%)

Query: 1090 DAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEE---EKALHMGRGSR----- 1141
            + E  K  G + + +++ +L      + +E ++   +  EE   +K   + +  +     
Sbjct: 199  ETEGEKGGGGRGKDLKIKDLEGDDEDDGDESDKGGEDGDEEKSKKKKKKLAKNKKKLDDD 258

Query: 1142 QRKQVDYTDSLTEKEWLKAIDDGVEYD--------DEEEEEEEEVRSKRKGKRR--KKTE 1191
            ++ +    D   E +     D+G E D          + EE E+  S     +   ++ E
Sbjct: 259  KKGKRGGDDDADEYDSDDGDDEGREEDYISDSSASGNDPEEREDKLSPEIPAKPEIEQDE 318

Query: 1192 DDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSD 1236
            D +E     ++ K E+E    K   KLKK   K   +    +DS 
Sbjct: 319  DSEES----EEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDSG 359


>gnl|CDD|218108 pfam04487, CITED, CITED.  CITED, CBP/p300-interacting
           transactivator with ED-rich tail, are characterized by a
           conserved 32-amino acid sequence at the C-terminus.
           CITED proteins do not bind DNA directly and are thought
           to function as transcriptional co-activators.
          Length = 206

 Score = 33.3 bits (76), Expect = 0.50
 Identities = 20/76 (26%), Positives = 22/76 (28%), Gaps = 13/76 (17%)

Query: 1   MSNSSTSPNPPPPQQQQP--------PLNV---GQLPM--GAPGSGPPGSPGPSPGQAPG 47
           M N S+ P P                 LN    G      G PG G P    P  GQ PG
Sbjct: 81  MFNPSSKPQPFMLVPGPQLMASMQLQKLNTQYQGHAGAPAGHPGGGGPQQFRPGAGQPPG 140

Query: 48  QNPQENLTALQRAIDS 63
                        ID+
Sbjct: 141 MQHMPAPALPPNVIDT 156


>gnl|CDD|217453 pfam03251, Tymo_45kd_70kd, Tymovirus 45/70Kd protein.  Tymoviruses
           are single stranded RNA viruses. This family includes a
           protein of unknown function that has been named based on
           its molecular weight. Tymoviruses such as the ononis
           yellow mosaic tymovirus encode only three proteins. Of
           these, two are overlapping: this protein overlaps a larger
           ORF that is thought to be the polymerase.
          Length = 458

 Score = 34.3 bits (79), Expect = 0.51
 Identities = 11/52 (21%), Positives = 15/52 (28%), Gaps = 4/52 (7%)

Query: 4   SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGP----PGSPGPSPGQAPGQNPQ 51
            +T P+PP P   + P +                 P  P  S G  P     
Sbjct: 249 HTTRPSPPRPAFSRSPSSPLSPLPRPSTRRGLLPNPRLPRASRGHLPPPTSS 300


>gnl|CDD|185594 PTZ00395, PTZ00395, Sec24-related protein; Provisional.
          Length = 1560

 Score = 34.3 bits (78), Expect = 0.52
 Identities = 15/42 (35%), Positives = 24/42 (57%)

Query: 480 QDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGED 521
           +DHP         E++++E  E S  + S ENEN+  +KGE+
Sbjct: 550 EDHPEGGTNRQKYEQSDEESVESSSSENSSENENEVTDKGEE 591


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
            found at the N terminus of SMC proteins. The SMC
            (structural maintenance of chromosomes) superfamily
            proteins have ATP-binding domains at the N- and
            C-termini, and two extended coiled-coil domains separated
            by a hinge in the middle. The eukaryotic SMC proteins
            form two kinds of heterodimers: the SMC1/SMC3 and the
            SMC2/SMC4 types. These heterodimers constitute an
            essential part of higher order complexes, which are
            involved in chromatin and DNA dynamics. This family also
            includes the RecF and RecN proteins that are involved in
            DNA metabolism and recombination.
          Length = 1162

 Score = 34.2 bits (78), Expect = 0.56
 Identities = 32/142 (22%), Positives = 60/142 (42%), Gaps = 3/142 (2%)

Query: 1087 QRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQW--AFEAKEEEKALHMGRGSRQRK 1144
            Q   A    +  +K  L E + L    +K +EE           E+E+     +   + +
Sbjct: 206  QAKKALEYYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEE 265

Query: 1145 QVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR-KGKRRKKTEDDDEEPSTSKKR 1203
            ++        KE  K      E      +EEEE++S+  K +RRK  +++  + S  + +
Sbjct: 266  EILAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELK 325

Query: 1204 KKEKEKDREKDQAKLKKTLKKI 1225
            K EKE  +EK++ +  +   K 
Sbjct: 326  KLEKELKKEKEEIEELEKELKE 347



 Score = 33.8 bits (77), Expect = 0.80
 Identities = 40/237 (16%), Positives = 91/237 (38%), Gaps = 14/237 (5%)

Query: 990  DRAHRIGQKNEVRVLRLMTVNSVEERILAAARYKLNMDEKVIQAGMFDQKSTGSERHQFL 1049
            ++A     +++ R   +  +    E        K           + +  +  SE    L
Sbjct: 610  NKATLEADEDDKRAKVVEGILKDTELTKLLESAKAKESGLRKGVSLEEGLAEKSELKASL 669

Query: 1050 QTILHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAE-RRKEQGKKSRLIEVSE 1108
              +  +   E E +     +   N++L R EE  +  QRI  E ++ +  K+  L +  +
Sbjct: 670  SELTKELLAEQELQEKAESELAKNEILRRQEEIKKKEQRIKEELKKLKLEKEELLADKVQ 729

Query: 1109 LPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYD 1168
                 I E+ ++ +   + KEEE+     R  ++ ++ + ++   +++           +
Sbjct: 730  EAQDKINEELKLLEQKIKEKEEEEEKS--RLKKEEEEEEKSELSLKEK-----------E 776

Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
              EEEE+ E     + K  K    ++E  +  ++ K+E E   E+     ++   K 
Sbjct: 777  LAEEEEKTEKLKVEEEKEEKLKAQEEELRALEEELKEEAELLEEEQLLIEQEEKIKE 833



 Score = 33.4 bits (76), Expect = 0.95
 Identities = 44/229 (19%), Positives = 83/229 (36%), Gaps = 12/229 (5%)

Query: 1053 LHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRL-IEVSELPD 1111
            L   +EE+ +   +  +        + +E  +  ++++ E +KE+ +   L  E+ EL  
Sbjct: 291  LLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEI 350

Query: 1112 WLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEE 1171
                E+EE EQ     K +EK         Q ++        E E L +     E + E 
Sbjct: 351  KREAEEEEEEQ---LEKLQEKLE-------QLEEELLAKKKLESERLSSAAKLKEEELEL 400

Query: 1172 EEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIK 1231
            + EEE+          ++ +   EE     K  +E E+  E  Q KL +  K+ +     
Sbjct: 401  KNEEEKEAKLLLELSEQEEDLLKEEKKEELKIVEELEESLETKQGKLTE-EKEELEKQAL 459

Query: 1232 YTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSS 1280
                D   L +    L   K +    ++    +  K      ++ K   
Sbjct: 460  KLLKDKLELKKSEDLLKETKLVKLLEQLELLLLRQKLEEASQKESKARE 508



 Score = 33.4 bits (76), Expect = 1.1
 Identities = 41/292 (14%), Positives = 89/292 (30%), Gaps = 39/292 (13%)

Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKK 383
           K+  R  +      N A +       K Q+ + +   K+ +     +++    +  +   
Sbjct: 171 KKKERLKKLIEETENLAELIIDLEELKLQELKLKEQAKKALEYYQLKEKL-ELEEENLLY 229

Query: 384 DKRLAFLLSQTDEYISNLT-------QMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDG 436
              L     + D     L           +E + E++       + +++  ++KL + + 
Sbjct: 230 LDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQEEEL 289

Query: 437 KVTLDQDETSQLTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENE 496
           K+   ++E  +   + +  R++   + LK                         S++E +
Sbjct: 290 KLLAKEEEELKSELLKLERRKVDDEEKLKE------------------------SEKELK 325

Query: 497 DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSIAHTVHEIVTEQASILVNGK 556
             + E  KEK   E   KE  + E     +   EE               E+  +     
Sbjct: 326 KLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLE---KLQEKLEQLEEELLAKKKL 382

Query: 557 LKEY---QIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKV 605
             E      K  E  + L N          + L +  + +       E K V
Sbjct: 383 ESERLSSAAKLKEEELELKNEEEK-EAKLLLELSEQEEDLLKEEKKEELKIV 433


>gnl|CDD|100796 PRK01156, PRK01156, chromosome segregation protein; Provisional.
          Length = 895

 Score = 34.1 bits (78), Expect = 0.59
 Identities = 40/259 (15%), Positives = 97/259 (37%), Gaps = 28/259 (10%)

Query: 203 LQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACAR 262
           L+     +  NI+++I +   S + TL E  R+  E         N +  L       + 
Sbjct: 192 LKSSNLELE-NIKKQIADDEKSHSITLKEIERLSIEYNNAMDDYNNLKSALNE---LSSL 247

Query: 263 RDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKD 322
            D        +K  +      L++    ++LE++   +       K++ YI    ++  D
Sbjct: 248 EDMKNRYESEIKTAESDLSMELEKNNYYKELEERHM-KIINDPVYKNRNYINDYFKYKND 306

Query: 323 FKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQK 382
            +   +     +  ++  +  YHA  +K    +++  +  + +    +            
Sbjct: 307 IENKKQ----ILSNIDAEINKYHAIIKKLSVLQKDYNDYIKKKSRYDD------------ 350

Query: 383 KDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQ 442
               L   + + + Y  +    +K   +E  KK+ EE  K  + +   + +      +D 
Sbjct: 351 ----LNNQILELEGYEMDYNSYLKS--IESLKKKIEEYSKNIERMSAFISEILKIQEIDP 404

Query: 443 DE-TSQLTDMHISVREISS 460
           D    +L ++++ +++ISS
Sbjct: 405 DAIKKELNEINVKLQDISS 423


>gnl|CDD|215590 PLN03123, PLN03123, poly [ADP-ribose] polymerase; Provisional.
          Length = 981

 Score = 34.0 bits (78), Expect = 0.60
 Identities = 17/79 (21%), Positives = 37/79 (46%), Gaps = 4/79 (5%)

Query: 1150 DSLT--EKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
            D+L+  ++E +  +      + +EE+ EE  +  +KG +RKK    D++   +K  +   
Sbjct: 169  DTLSDSDQEAVLPLVKKSPSEAKEEKAEERKQESKKGAKRKKDASGDDKSKKAKTDRDVS 228

Query: 1208 EKD--REKDQAKLKKTLKK 1224
                  +K  + L+  L+ 
Sbjct: 229  TSTAASQKKSSDLESKLEA 247


>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1.  All
           proteins in this family for which functions are known
           are cyclin dependent protein kinases that are components
           of TFIIH, a complex that is involved in nucleotide
           excision repair and transcription initiation. Also known
           as MAT1 (menage a trois 1). This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 309

 Score = 33.6 bits (77), Expect = 0.61
 Identities = 25/104 (24%), Positives = 50/104 (48%), Gaps = 7/104 (6%)

Query: 321 KDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLID 380
           K  + Y + N+  I + NK         E E+  E E+ E+E+ RRL+ + EE  +++  
Sbjct: 120 KKIETYQKENKDVIQK-NKEKST-REQEELEEALEFEKEEEEQ-RRLLLQKEEEEQQMNK 176

Query: 381 QKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRK 424
           +K  + L   L  +    +   +++ +HK +   K + + +K K
Sbjct: 177 RKNKQALLDELETSTLPAA---ELIAQHK-KNSVKLEMQVEKPK 216


>gnl|CDD|148139 pfam06346, Drf_FH1, Formin Homology Region 1.  This region is found
           in some of the Diaphanous related formins (Drfs). It
           consists of low complexity repeats of around 12
           residues.
          Length = 160

 Score = 32.6 bits (74), Expect = 0.63
 Identities = 16/47 (34%), Positives = 17/47 (36%), Gaps = 7/47 (14%)

Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMPNQA---QPMPLQQQPPPQPHQ 174
            VP  P +P     GP   PP  P P       P P    PPP P  
Sbjct: 108 AVPPPPPLPG----GPGVPPPPPPFPGAPGIPPPPPGMGSPPPPPFG 150


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 33.8 bits (78), Expect = 0.65
 Identities = 22/135 (16%), Positives = 49/135 (36%), Gaps = 1/135 (0%)

Query: 1088 RIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVD 1147
             I  ++ K   K +     ++      + D   +     A +++  L+  +      Q D
Sbjct: 73   DIPKKKTKTAAKAAAAKAPAKK-KLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQAD 131

Query: 1148 YTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
              D   + + L   D   + DDE+++E+++          KK   + E+ S       ++
Sbjct: 132  DDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVWDE 191

Query: 1208 EKDREKDQAKLKKTL 1222
            +      QA+    L
Sbjct: 192  DDSEALRQARKDAKL 206


>gnl|CDD|218621 pfam05518, Totivirus_coat, Totivirus coat protein. 
          Length = 753

 Score = 34.0 bits (78), Expect = 0.67
 Identities = 12/46 (26%), Positives = 15/46 (32%), Gaps = 1/46 (2%)

Query: 8   PNPPPPQQQQPPLNVGQLPMG-APGSGPPGSPGPSPGQAPGQNPQE 52
             P  P+   PP   G LP      +    +P  S   A    P E
Sbjct: 693 RAPQAPRPGGPPGGGGGLPPPPDLPAAAGPAPCGSSLIASPTAPPE 738


>gnl|CDD|165431 PHA03160, PHA03160, hypothetical protein; Provisional.
          Length = 499

 Score = 33.9 bits (77), Expect = 0.67
 Identities = 36/164 (21%), Positives = 56/164 (34%), Gaps = 23/164 (14%)

Query: 38  PGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSA-- 95
           PG S       N  +N++ LQ  +  +K+  + +  R             I H F++   
Sbjct: 342 PGESSLYKDVLNLTKNISQLQDDLKDLKQAAINQPNRI------------IPHHFSNPYS 389

Query: 96  --QVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ 153
                   F+   Y     +  L P LA   Q   +   P   Q  PM      P P   
Sbjct: 390 FDPGHAPFFRYAPYGAPKNDHHLLPPLACSQQ---LPMQPLHVQQAPMQAPHVAPPPMQP 446

Query: 154 PMPNQAQPMPLQQQP---PPQPHQQQG-HISSQIKQSKLTNIPK 193
           P   Q + +P         P+P  Q+  HI +   Q  ++ I K
Sbjct: 447 PHVQQPRVLPSTDGASNEAPKPSAQEPVHIDASFAQDPVSKIQK 490



 Score = 31.2 bits (70), Expect = 4.6
 Identities = 22/73 (30%), Positives = 31/73 (42%), Gaps = 7/73 (9%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPM--PLQQQPPPQ-PHQQQGHISSQIKQSKLT 189
                +PP++    +PM   QP+  Q  PM  P    PP Q PH QQ  +      +   
Sbjct: 408 NDHHLLPPLACSQQLPM---QPLHVQQAPMQAPHVAPPPMQPPHVQQPRVLPSTDGAS-N 463

Query: 190 NIPKPEGLDPLII 202
             PKP   +P+ I
Sbjct: 464 EAPKPSAQEPVHI 476


>gnl|CDD|237015 PRK11901, PRK11901, hypothetical protein; Reviewed.
          Length = 327

 Score = 33.5 bits (77), Expect = 0.68
 Identities = 29/176 (16%), Positives = 41/176 (23%), Gaps = 38/176 (21%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPG--SGPPGSPGPSPGQAPGQNPQENLTALQ 58
           +S+ + S               G      P   S PP SP P+    P           Q
Sbjct: 87  LSSGNQSSPSAANNTSDGHDASGVKNTAPPQDISAPPISPTPTQAAPPQTP-----NGQQ 141

Query: 59  RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQ--PL 116
           R                   IE+  N   I  A +    QQ +    +          P 
Sbjct: 142 R-------------------IELPGN---ISDALSQ---QQGQVNAASQNAQGNTSTLPT 176

Query: 117 TPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
            P      +G ++         PP       P          A P         +P
Sbjct: 177 APATVAPSKGAKVPATAETHPTPPQKPATKKPAVNHHKTATVAVP----PATSGKP 228


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
            biogenesis [Translation, ribosomal structure and
            biogenesis].
          Length = 1077

 Score = 33.9 bits (77), Expect = 0.69
 Identities = 26/156 (16%), Positives = 55/156 (35%), Gaps = 9/156 (5%)

Query: 1058 EEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKED 1117
            EE   E+ +  D++       +E+   T ++       E   +    EV+   D    E 
Sbjct: 420  EETSREDELSFDDSDVSTSDENEDVDFTGKKGAINNEDESDNE----EVAFDSDSQFDES 475

Query: 1118 EEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKE----WLKAIDDGVEYDDEEEE 1173
            E   +W      +      G+  R  +++ Y +SL+ +E    +        E D   ++
Sbjct: 476  EGNLRWKEGLASKLAYSQSGKRGRNIQKIFYDESLSPEECIEEYKGESAKSSESDLVVQD 535

Query: 1174 EEEE-VRSKRKGKRRKKTEDDDEEPSTSKKRKKEKE 1208
            E E+     +       +  +    S  ++ KK+  
Sbjct: 536  EPEDFFDVSKVANESISSNHEKLMESEFEELKKKWS 571


>gnl|CDD|213398 cd12191, gal11_coact, Gal11 coactivator domain.  Gal11/MED15 acts
           in the general regulation of GAL structural genes and is
           required for full expression of several genes in this
           pathway, including GALs 1, 7, and 10 in Saccharomyces
           cerevisiae. GAL11 function is dependent on GCN4
           functionality and binds GCN4 in a degenerate manner with
           multiple orientations found at the GCN4-Gal11 interface.
          Length = 90

 Score = 31.2 bits (71), Expect = 0.69
 Identities = 13/39 (33%), Positives = 15/39 (38%), Gaps = 1/39 (2%)

Query: 155 MPNQAQPMPLQQQPPPQPHQQQGHI-SSQIKQSKLTNIP 192
              Q QP   QQQ  PQ  Q    + +  I    L  IP
Sbjct: 1   PQQQQQPQQQQQQQMPQNPQLVNMMDNMPIPPQLLAKIP 39


>gnl|CDD|221042 pfam11244, Med25_NR-box, Mediator complex subunit 25 C-terminal
          NR box-containing.  The overall function of the
          full-length Med25 is to efficiently coordinate the
          transcriptional activation of RAR/RXR (retinoic acid
          receptor/retinoic X receptor) in higher eukaryotic
          cells. Human Med25 consists of several domains with
          different binding properties, the N-terminal, VWA,
          domain, an SD1 - synapsin 1 - domain from residues
          229-381, a PTOV(B) or ACID domain from 395-545, an SD2
          domain from residues 564-645 and this C-terminal NR
          box-containing domain (646-650) from C69-747. The NR
          box of MED25 is critical for its recruitment to the
          promoter, probably through an interaction with pre
          bound RAR.
          Length = 89

 Score = 31.2 bits (70), Expect = 0.72
 Identities = 15/51 (29%), Positives = 16/51 (31%), Gaps = 8/51 (15%)

Query: 9  NPPPPQQQQPPLNVGQLPMGAPGSGPPGSP----GPSPGQ----APGQNPQ 51
               Q         QLPM  P   P        GP  GQ    A G+ PQ
Sbjct: 11 AAMQQQAMGQQQQGHQLPMPGPAQFPLQQLQQMRGPGGGQMSMQAGGRAPQ 61



 Score = 28.9 bits (64), Expect = 5.0
 Identities = 13/45 (28%), Positives = 14/45 (31%)

Query: 14 QQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQ 58
          QQ QP L   Q             P P P Q P Q  Q+      
Sbjct: 4  QQGQPGLAAMQQQAMGQQQQGHQLPMPGPAQFPLQQLQQMRGPGG 48



 Score = 28.1 bits (62), Expect = 8.6
 Identities = 18/59 (30%), Positives = 21/59 (35%)

Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
            Q AMG Q +  +    GP   P+     M  P    M  QA     QQ    QP   Q
Sbjct: 14  QQQAMGQQQQGHQLPMPGPAQFPLQQLQQMRGPGGGQMSMQAGGRAPQQMHALQPLLGQ 72


>gnl|CDD|233048 TIGR00605, rad4, DNA repair protein rad4.  All proteins in this
            family for which functions are known are involved in
            targeting nucleotide excision repair to specific regions
            of the genome. This family is based on the phylogenomic
            analysis of JA Eisen (1999, Ph.D. Thesis, Stanford
            University) [DNA metabolism, DNA replication,
            recombination, and repair].
          Length = 713

 Score = 33.7 bits (77), Expect = 0.74
 Identities = 34/138 (24%), Positives = 56/138 (40%), Gaps = 19/138 (13%)

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVV 1229
            E E+E E+    R+ K R++ E       + ++RKK  +    +   ++       +   
Sbjct: 17   ENEKEAEKQPKSRRRKVRRENE------PSLRRRKKRFKTGLNELPHEVV------LMCN 64

Query: 1230 IKYTDSDGRVLSEPFI-----KLPSRKELPDYYEVID--RPMDIKKILGRIEDGKYSSVD 1282
            +  T SD RV+S P       ++PSR+E  D  E  D      + +      + K SS  
Sbjct: 65   LDSTHSDDRVVSVPDSLSVSEEIPSREEDYDSREFEDVYLSNLVAEFETISVEIKPSSKA 124

Query: 1283 ELQKDFKTLCRNAQIYNE 1300
            E   D +TL RN      
Sbjct: 125  ESDDDAETLSRNVCSNEA 142


>gnl|CDD|224124 COG1203, COG1203, CRISPR-associated helicase Cas3 [Defense
           mechanisms].
          Length = 733

 Score = 34.0 bits (78), Expect = 0.74
 Identities = 26/147 (17%), Positives = 51/147 (34%), Gaps = 9/147 (6%)

Query: 481 DHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSI--- 537
               +++ +   E++   D E   +          +    D +  +N  E A   +    
Sbjct: 116 HLARYQLSSLISEKSFLADWEGLSDSLFRFFFRLLEKM--DIKDTRNFTELAKQEARLLK 173

Query: 538 ---AHTVHEIVTEQASILVNGKLKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIA 594
                        +    +  +  E Q K LE ++ L   +L  +L    G GKT  ++ 
Sbjct: 174 PLLLLLSAIARINKFKSFIEHEGYELQEKALELILRLEKRSLLVVLEAPTGYGKTEASLI 233

Query: 595 LITYLMEKKKVNGPFLIIV-PLSTLSN 620
           L   L+++K      +I V P  T+  
Sbjct: 234 LALALLDEKIKLKSRVIYVLPFRTIIE 260


>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
          Length = 880

 Score = 33.9 bits (78), Expect = 0.77
 Identities = 44/172 (25%), Positives = 84/172 (48%), Gaps = 12/172 (6%)

Query: 1059 EDEEENAVPDDETVNQMLARSEEEFQTYQRID--AERRKEQG-----KKSRLIEVSELPD 1111
            E E E+   + E V + L R+E+  +   RI+   ERR++       ++  + E  E  +
Sbjct: 481  EAELEDLEEEVEEVEERLERAEDLVEAEDRIERLEERREDLEELIAERRETIEEKRERAE 540

Query: 1112 WLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDS-LTE-KEWLKAIDDGVEYDD 1169
             L +   E+E  A E K E  A         R++V   +S L E KE +++++       
Sbjct: 541  ELRERAAELEAEA-EEKREAAAEAEEEAEEAREEVAELNSKLAELKERIESLERIRTLLA 599

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPST-SKKRKKEKEKDREKDQAKLKK 1220
               + E+E+   R+ KR    E +DE     ++KR++++E + E D+A++++
Sbjct: 600  AIADAEDEIERLRE-KREALAELNDERRERLAEKRERKRELEAEFDEARIEE 650


>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
            subunit [Translation, ribosomal structure and
            biogenesis].
          Length = 591

 Score = 33.5 bits (76), Expect = 0.77
 Identities = 15/65 (23%), Positives = 33/65 (50%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKT 1221
                   +E+++++EE++++++ +   +     E     K   K K K R+ D+ + +K 
Sbjct: 484  RYEHVAGEEDDDDDEELQAQKELELEAQGIKYSETSEADKDVNKSKNKKRKVDEEEEEKK 543

Query: 1222 LKKIM 1226
            LK IM
Sbjct: 544  LKMIM 548


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 33.7 bits (78), Expect = 0.78
 Identities = 36/168 (21%), Positives = 74/168 (44%), Gaps = 34/168 (20%)

Query: 277 KRTKRQGLKEA-RATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIM 335
           +   ++ ++EA +  E L+K+  +EA+ +  +   E    + +   + +   R    R  
Sbjct: 31  EELAKRIIEEAKKEAETLKKEALLEAKEEVHKLRAELERELKERRNELQRLERRLLQREE 90

Query: 336 RLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTD 395
            L++ + +     E  +KKE+E   KE+    + E EE   +LI +++ + L        
Sbjct: 91  TLDRKMESLDKKEENLEKKEKELSNKEKN---LDEKEEELEELIAEQR-EEL-------- 138

Query: 396 EYISNLTQ---------------------MVKEHKMEQKKKQDEESKK 422
           E IS LTQ                     ++KE + E K++ D+++K+
Sbjct: 139 ERISGLTQEEAKEILLEEVEEEARHEAAKLIKEIEEEAKEEADKKAKE 186


>gnl|CDD|99933 cd05501, Bromo_SP100C_like, Bromodomain, SP100C_like subfamily. The
            SP100C protein is a splice variant of SP100, a major
            component of PML-SP100 nuclear bodies (NBs), which are
            poorly understood. It is covalently modified by SUMO-1
            and may play a role in processes at the chromatin level.
            Bromodomains are 110 amino acid long domains, that are
            found in many chromatin associated proteins. Bromodomains
            can interact specifically with acetylated lysine.
          Length = 102

 Score = 31.2 bits (71), Expect = 0.79
 Identities = 14/58 (24%), Positives = 29/58 (50%), Gaps = 2/58 (3%)

Query: 1244 FIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEE 1301
            FI  P      DY + I  PM + K+  R+ +  Y +V+   +D + +  N +++ ++
Sbjct: 23   FISKPYYIR--DYCQGIKEPMWLNKVKERLNERVYHTVEGFVRDMRLIFHNHKLFYKD 78


>gnl|CDD|236498 PRK09401, PRK09401, reverse gyrase; Reviewed.
          Length = 1176

 Score = 33.8 bits (78), Expect = 0.80
 Identities = 27/108 (25%), Positives = 48/108 (44%), Gaps = 12/108 (11%)

Query: 585 GLGKTIQTIALITYLMEK-KKVNGPFLIIVPLSTLSNWSLE-FERWAPSVNVVAYKGSPH 642
           G+GKT   + +  YL +K KK      II P   L    +E  E++   V         H
Sbjct: 105 GVGKTTFGLVMSLYLAKKGKKS----YIIFPTRLLVEQVVEKLEKFGEKVGCGVKILYYH 160

Query: 643 --LRKTLQAQMKAS----KFNVLLTTYEYVIKDKGPLAKLHWKYMIID 684
             L+K  + +         F++L+TT +++ K+   L K  + ++ +D
Sbjct: 161 SSLKKKEKEEFLERLKEGDFDILVTTSQFLSKNFDELPKKKFDFVFVD 208


>gnl|CDD|219321 pfam07174, FAP, Fibronectin-attachment protein (FAP).  This
          family contains bacterial fibronectin-attachment
          proteins (FAP). Family members are rich in alanine and
          proline, are approximately 300 long, and seem to be
          restricted to mycobacteria. These proteins contain a
          fibronectin-binding motif that allows mycobacteria to
          bind to fibronectin in the extracellular matrix.
          Length = 297

 Score = 33.3 bits (76), Expect = 0.81
 Identities = 11/44 (25%), Positives = 12/44 (27%)

Query: 7  SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
          S     P    PP         AP    P +  P P   P   P
Sbjct: 49 STAAAAPAPAAPPPPPPPAAPPAPQPDDPNAAPPPPPADPNAPP 92



 Score = 32.6 bits (74), Expect = 1.3
 Identities = 13/43 (30%), Positives = 16/43 (37%), Gaps = 5/43 (11%)

Query: 4  SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAP 46
           ST+   P P    PP      P  AP +  P  P  +P   P
Sbjct: 48 PSTAAAAPAPAAPPPPP-----PPAAPPAPQPDDPNAAPPPPP 85



 Score = 31.4 bits (71), Expect = 3.0
 Identities = 11/41 (26%), Positives = 11/41 (26%), Gaps = 1/41 (2%)

Query: 132 VPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
               P   P     P   P  QP    A P P    P   P
Sbjct: 53  AAPAPA-APPPPPPPAAPPAPQPDDPNAAPPPPPADPNAPP 92


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
           consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 33.4 bits (76), Expect = 0.82
 Identities = 31/142 (21%), Positives = 58/142 (40%), Gaps = 14/142 (9%)

Query: 284 LKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMR-LNKAVM 342
           LKE  + E  ++ Q+++ E  K+Q   +          DF + + + Q   +R   +   
Sbjct: 203 LKERESQEDAKRAQQLKEELDKKQIDADKAQQKA----DFAQDNADKQRDEVRQKQQEAK 258

Query: 343 NYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLT 402
           N    A+    KE +++ + + R +     E       +K D+     L   D    +L 
Sbjct: 259 NLPKPADTSSPKEDKQVAENQKREIEKAQIEI------KKNDEEA---LKAKDHKAFDLK 309

Query: 403 QMVKEHKMEQKKKQDEESKKRK 424
           Q  K  + E + K+ E  KKR+
Sbjct: 310 QESKASEKEAEDKELEAQKKRE 331


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
            RPA34.5.  This is a family of proteins conserved from
            yeasts to human. Subunit A34.5 of RNA polymerase I is a
            non-essential subunit which is thought to help Pol I
            overcome topological constraints imposed on ribosomal DNA
            during the process of transcription.
          Length = 193

 Score = 32.8 bits (75), Expect = 0.83
 Identities = 17/62 (27%), Positives = 27/62 (43%), Gaps = 1/62 (1%)

Query: 1159 KAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKL 1218
                 G       E   E   S+++   + + E + EE    +K+KK KE  +EK + K 
Sbjct: 114  FPTGYGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKK-KEVKKEKKEKKD 172

Query: 1219 KK 1220
            KK
Sbjct: 173  KK 174



 Score = 32.4 bits (74), Expect = 0.91
 Identities = 16/53 (30%), Positives = 26/53 (49%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKK 1220
               + E+E EV  + K +++KK E   E+     K++K  E    K + K KK
Sbjct: 139  TTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKK 191



 Score = 32.4 bits (74), Expect = 0.99
 Identities = 19/61 (31%), Positives = 28/61 (45%), Gaps = 7/61 (11%)

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPS------TSKKRKKEKEKDREKDQAKLKKTLK 1223
            E   E E    +   K  K+ E ++EE          KK KKEK KD+++   + K + K
Sbjct: 127  ELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEK-KDKKEKMVEPKGSKK 185

Query: 1224 K 1224
            K
Sbjct: 186  K 186



 Score = 29.7 bits (67), Expect = 6.8
 Identities = 17/66 (25%), Positives = 29/66 (43%), Gaps = 5/66 (7%)

Query: 1151 SLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR-----KGKRRKKTEDDDEEPSTSKKRKK 1205
              +E E  +         + E EEEE+   K+     K K+ KK + +        K+KK
Sbjct: 128  LGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKK 187

Query: 1206 EKEKDR 1211
            +K+K +
Sbjct: 188  KKKKKK 193


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 33.7 bits (77), Expect = 0.83
 Identities = 24/96 (25%), Positives = 41/96 (42%), Gaps = 4/96 (4%)

Query: 1188 KKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKL 1247
             + E         +++KK++EK +EK+  KLK   K+    +     SDG  + +   K 
Sbjct: 6    SEAEKKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKK 65

Query: 1248 PSRK----ELPDYYEVIDRPMDIKKILGRIEDGKYS 1279
              ++    E P+ +   D P   KK L      +YS
Sbjct: 66   SRKRDVEDENPEDFIDPDTPFGQKKRLSSQMAKQYS 101


>gnl|CDD|151322 pfam10873, DUF2668, Protein of unknown function (DUF2668).  Members
           in this family of proteins are annotated as Cysteine and
           tyrosine-rich protein 1; however, no function is currently
           known.
          Length = 154

 Score = 32.1 bits (73), Expect = 0.85
 Identities = 14/48 (29%), Positives = 16/48 (33%), Gaps = 9/48 (18%)

Query: 3   NSSTSPNPPPPQQQQ---------PPLNVGQLPMGAPGSGPPGSPGPS 41
           N+ + P  PPP             PP         A  S PP  PG S
Sbjct: 105 NAISYPMAPPPYTYDHEMEYPTDLPPPYSPAPQASAQRSPPPPYPGNS 152


>gnl|CDD|233366 TIGR01348, PDHac_trf_long, pyruvate dehydrogenase complex
           dihydrolipoamide acetyltransferase, long form.  This
           model describes a subset of pyruvate dehydrogenase
           complex dihydrolipoamide acetyltransferase specifically
           close by both phylogenetic and per cent identity (UPGMA)
           trees. Members of this set include two or three copies
           of the lipoyl-binding domain. E. coli AceF is a member
           of this model, while mitochondrial and some other
           bacterial forms belong to a separate model [Energy
           metabolism, Pyruvate dehydrogenase].
          Length = 546

 Score = 33.7 bits (77), Expect = 0.85
 Identities = 15/49 (30%), Positives = 18/49 (36%), Gaps = 2/49 (4%)

Query: 5   STSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSP--GPSPGQAPGQNPQ 51
           ST    P P   QP           P + P  +    P+P QA  QNP 
Sbjct: 194 STPATAPAPASAQPAAQSPAATQPEPAAAPAAAKAQAPAPQQAGTQNPA 242


>gnl|CDD|220431 pfam09831, DUF2058, Uncharacterized protein conserved in bacteria
            (DUF2058).  This domain, found in various prokaryotic
            proteins, has no known function.
          Length = 177

 Score = 32.6 bits (75), Expect = 0.86
 Identities = 16/49 (32%), Positives = 28/49 (57%), Gaps = 1/49 (2%)

Query: 1177 EVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK-EKDREKDQAKLKKTLKK 1224
            E R +RK  R+   + DDE    +++ K EK E+DRE ++ +  +  +K
Sbjct: 23   EKRKQRKQARKGADDGDDELKQAAEEAKAEKAERDRELNRQRQAEAEQK 71


>gnl|CDD|235585 PRK05733, PRK05733, single-stranded DNA-binding protein;
           Provisional.
          Length = 172

 Score = 32.2 bits (73), Expect = 0.93
 Identities = 14/47 (29%), Positives = 14/47 (29%), Gaps = 5/47 (10%)

Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMP-----NQAQPMPLQQQPPPQP 172
           G P G               P Q         Q Q  P  QQP PQP
Sbjct: 113 GRPQGDDQGGQGGGNYNQSAPRQQAQRPQQAAQQQSRPAPQQPAPQP 159


>gnl|CDD|218328 pfam04921, XAP5, XAP5, circadian clock regulator.  This protein is
            found in a wide range of eukaryotes. It is a nuclear
            protein and is suggested to be DNA binding. In plants,
            this family is essential for correct circadian clock
            functioning by acting as a light-quality regulator
            coordinating the activities of blue and red light
            signalling pathways during plant growth - inhibiting
            growth in red light but promoting growth in blue light.
          Length = 233

 Score = 32.7 bits (75), Expect = 0.96
 Identities = 17/66 (25%), Positives = 35/66 (53%), Gaps = 10/66 (15%)

Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK----------EKDREKDQAKL 1218
             +++EEE+E   + + K  K++ + DE      K+K  K          +K RE+ +A+L
Sbjct: 14   GDDDEEEDEDEGEDEKKVPKESSEPDEANVNPNKKKIGKNPSVDTSFLPDKAREEKEAEL 73

Query: 1219 KKTLKK 1224
            ++ L++
Sbjct: 74   REELRE 79


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 33.2 bits (76), Expect = 1.00
 Identities = 16/55 (29%), Positives = 28/55 (50%)

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
             EEE +    S +K K+  K  +  ++    KK+KKEK++ + + + KL     K
Sbjct: 43   SEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKTPK 97



 Score = 32.8 bits (75), Expect = 1.2
 Identities = 16/56 (28%), Positives = 30/56 (53%)

Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
                EEE +V +    K +K+ ++++ +  + KK+KK+KEK   K + + K   K 
Sbjct: 40   STFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKT 95


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 33.5 bits (77), Expect = 1.1
 Identities = 38/221 (17%), Positives = 79/221 (35%), Gaps = 10/221 (4%)

Query: 204 QERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRA--EVIACA 261
           +E+       +E  +  L   + +   E   ++A IE     +   +  L      ++ +
Sbjct: 732 EEKLKERLEELEEDLSSLEQEIENVKSELKELEARIEELEEDLHKLEEALNDLEARLSHS 791

Query: 262 RRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCK 321
           R          ++         L+E           + E   K+ Q+ QE    + +  K
Sbjct: 792 RIPEIQAELSKLEEEVSRIEARLREIEQKLN-RLTLEKEYLEKEIQELQEQRIDLKEQIK 850

Query: 322 DFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKER--MRRLMAEDEEGYRKLI 379
             ++   N   +   L + +      A ++ +     ++KER  +   + E E    +L 
Sbjct: 851 SIEKEIENLNGKKEELEEELEE-LEAALRDLESRLGDLKKERDELEAQLRELERKIEELE 909

Query: 380 DQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEES 420
            Q + KR    LS+    +  L + + E  +E  K +DEE 
Sbjct: 910 AQIEKKRK--RLSELKAKLEALEEELSE--IEDPKGEDEEI 946



 Score = 32.7 bits (75), Expect = 1.8
 Identities = 49/291 (16%), Positives = 125/291 (42%), Gaps = 17/291 (5%)

Query: 249 FQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQK 308
           F R   AE+     R   L+  ++    +  + +   +  + E  +  +K+    K+ ++
Sbjct: 668 FSRSEPAELQRLRERLEGLKRELSSLQSELRRIENRLDELSQELSDASRKIGEIEKEIEQ 727

Query: 309 HQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLM 368
            ++    + +  ++ +E   + +  I   N         A  E+ +E     +E +  L 
Sbjct: 728 LEQEEEKLKERLEELEEDLSSLEQEI--ENVKSELKELEARIEELEEDLHKLEEALNDLE 785

Query: 369 AEDEEGYRKLIDQKKDK------RLAFLLSQTDEYISNLTQMVK--EHKMEQKKKQDEES 420
           A         I  +  K      R+   L + ++ ++ LT   +  E ++++ ++Q  + 
Sbjct: 786 ARLSHSRIPEIQAELSKLEEEVSRIEARLREIEQKLNRLTLEKEYLEKEIQELQEQRIDL 845

Query: 421 KKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVREISSGKV-LKGEDAPLAAHLKQWI 479
           K++ +S+++++ + +GK    ++   +L ++  ++R++ S    LK E   L A L++ +
Sbjct: 846 KEQIKSIEKEIENLNGKK---EELEEELEELEAALRDLESRLGDLKKERDELEAQLRE-L 901

Query: 480 QDHPGWEVVADSDEENE-DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAM 529
           +     E+ A  +++ +   + +   E    E    E  KGED+E  +  +
Sbjct: 902 ERKIE-ELEAQIEKKRKRLSELKAKLEALEEELSEIEDPKGEDEEIPEEEL 951



 Score = 32.3 bits (74), Expect = 2.2
 Identities = 76/328 (23%), Positives = 127/328 (38%), Gaps = 25/328 (7%)

Query: 1012 VEERILAAARYKLNMDEKVIQAGMFDQKSTGSERHQFLQTILHQDDEE----DEEENAVP 1067
            VEE I    R  L +DEK  Q     ++   +ER+Q L     ++ E      E+E    
Sbjct: 182  VEENI---ERLDLIIDEKRQQLERLRREREKAERYQALLKEK-REYEGYELLKEKEALER 237

Query: 1068 DDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEA 1127
              E + + LA  EEE +      +E  K   +  +L+E        IK+  E EQ   + 
Sbjct: 238  QKEAIERQLASLEEELEKLTEEISELEKRLEEIEQLLEELNKK---IKDLGEEEQLRVKE 294

Query: 1128 KEEEKALHMGRGSRQRKQVDYTDSL--TEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
            K  E  L     S +R   +    L   E+   K   +  +   E EE E E+  +RK +
Sbjct: 295  KIGE--LEAEIASLERSIAEKERELEDAEERLAKLEAEIDKLLAEIEELEREIEEERKRR 352

Query: 1186 RRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFI 1245
             +   E  + +      R + +E D  K+ A+ +  LK     + K       +  E   
Sbjct: 353  DKLTEEYAELKEELEDLRAELEEVD--KEFAETRDELKDYREKLEKLKREINELKRELDR 410

Query: 1246 KLPSRKELPDYYEVIDRPMDIKKILGRI---EDGKYSSVDELQKDFKTLCRNAQI---YN 1299
                 + L +  E+ D    I  I  +I   E+ K     E++K    L + A     Y 
Sbjct: 411  LQEELQRLSE--ELADLNAAIAGIEAKINELEEEKEDKALEIKKQEWKLEQLAADLSKYE 468

Query: 1300 EELSLIHEDSVVLESVFTKARQRVESGE 1327
            +EL  + E+   +E   +K ++ +   E
Sbjct: 469  QELYDLKEEYDRVEKELSKLQRELAEAE 496


>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
          Length = 434

 Score = 32.8 bits (75), Expect = 1.1
 Identities = 14/71 (19%), Positives = 35/71 (49%), Gaps = 2/71 (2%)

Query: 1144 KQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKR 1203
            K+ +      +K  L+     +    +E +  EE  +K K ++ K+ E++ ++   + + 
Sbjct: 366  KRQELLKEYNKK--LQDYTKKLGEVKDETDASEEAEAKAKEEKLKQEENEKKQKEQADED 423

Query: 1204 KKEKEKDREKD 1214
            K++++KD  K 
Sbjct: 424  KEKRQKDERKK 434


>gnl|CDD|218146 pfam04554, Extensin_2, Extensin-like region. 
          Length = 57

 Score = 29.7 bits (67), Expect = 1.1
 Identities = 11/41 (26%), Positives = 18/41 (43%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPH 173
           P     PP   +   P PP +    ++ P P+ + PPP  +
Sbjct: 9   PVKQYSPPPPYYYKSPPPPVKSPVYKSPPPPVYKSPPPPKY 49


>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168).  This
            family consists of several hypothetical eukaryotic
            proteins of unknown function.
          Length = 142

 Score = 31.6 bits (72), Expect = 1.1
 Identities = 22/58 (37%), Positives = 36/58 (62%), Gaps = 5/58 (8%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
              ++E E+EE + KR+ K+RK     DEE +  K+ K++K+K ++K + K KK  KK 
Sbjct: 58   KWKKETEDEEFQQKREEKKRK-----DEEKTAKKRAKRQKKKQKKKKKKKAKKGNKKE 110


>gnl|CDD|221440 pfam12144, Med12-PQL, Eukaryotic Mediator 12 catenin-binding
           domain.  This domain is found in eukaryotes, and is
           typically between 325 and 354 amino acids in length.
           Both development and carcinogenesis are driven by signal
           transduction within the canonical Wnt/beta-catenin
           pathway through both programmed and unprogrammed changes
           in gene transcription. Beta-catenin physically and
           functionally targets this PQL (proline-, glutamine-,
           leucine-rich) region of the Med12 subunit of Mediator to
           activate transcription. The beta-catenin transactivation
           domain binds directly to isolated Med12 and intact
           Mediator both in vitro and in vivo, and Mediator is
           recruited to Wnt-responsive genes in a
           beta-catenin-dependent manner.
          Length = 204

 Score = 32.3 bits (73), Expect = 1.2
 Identities = 48/210 (22%), Positives = 65/210 (30%), Gaps = 45/210 (21%)

Query: 10  PPPPQQQQPP--LNVGQLPMGAPGSG---PPGSPGPSPGQAPGQNPQE------NLTALQ 58
           PP   Q  P   L  GQ  M         PPG PG  P   P +NP        N T + 
Sbjct: 1   PPELMQNAPYGRLPYGQQAMNMYTQNQPLPPGGPGLEPPYRPARNPMNKMPVRPNYTGMM 60

Query: 59  RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTP 118
             +       +  + +Y             K      Q Q LR Q+              
Sbjct: 61  PGMQGNMPTVMGLEKQYS---------MGFKPQPNMPQGQILRQQLQV------------ 99

Query: 119 QLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGH 178
           +    + G+++       QM P   +    M  SQ   +    M +QQ P          
Sbjct: 100 KQNQSMIGQQIR------QMTPNQPYT--SMQASQGYTSYGSHMGMQQHPSQTGGMVPSS 151

Query: 179 ISSQIKQSK--LTNIPKPEGLDPLIILQER 206
             SQ  Q     TN   P  +DP   LQ+R
Sbjct: 152 YGSQNFQGTHPATN---PTVVDPHRQLQQR 178


>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR).  This family
            consists of several bovine specific leukaemia virus
            receptors which are thought to function as transmembrane
            proteins, although their exact function is unknown.
          Length = 561

 Score = 33.1 bits (75), Expect = 1.2
 Identities = 25/103 (24%), Positives = 39/103 (37%), Gaps = 6/103 (5%)

Query: 1172 EEEEEEVRSKRKGKRRKKTEDDD------EEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
            + E+  V+  R  +  K  E  D      +     KK KKEKEK+R+KD+ K  +  K +
Sbjct: 169  DSEKLPVQKHRNAETSKSPEKGDVPAVEKKSKKPKKKEKKEKEKERDKDKKKEVEGFKSL 228

Query: 1226 MRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKK 1268
            +  +     S   V       L +           D P D + 
Sbjct: 229  LLALDDSPASAASVAEADEASLANTVSGTAPDSEPDEPKDAEA 271


>gnl|CDD|223079 PHA03419, PHA03419, E4 protein; Provisional.
          Length = 200

 Score = 32.2 bits (73), Expect = 1.2
 Identities = 16/52 (30%), Positives = 17/52 (32%), Gaps = 8/52 (15%)

Query: 11  PPPQQQQPPLNVGQLPMGAPGSGPPGSP--------GPSPGQAPGQNPQENL 54
           P    +      G  P   P    P  P        GPSPG  PG   QE L
Sbjct: 112 PDQGPEAKGEGEGHEPEDPPPEDTPPPPGGEGEVEGGPSPGPGPGPLDQEGL 163


>gnl|CDD|221827 pfam12881, NUT_N, NUT protein N terminus.  This family includes the
           NUT protein. The gene encoding the NUT protein (Nuclear
           Testis protein) is found fused to BRD3 or BRD4 genes, in
           some aggressive types of carcinoma, due to chromosomal
           translocations. Proteins of the BRD family contain two
           bromodomains that bind transcriptionally active
           chromatin through associations with acetylated histones
           H3 and H4. Such proteins are crucial for the regulation
           of cell cycle progression. On the other hand, little is
           known about NUT protein. NUT is known to have a Nuclear
           Export Sequence (NES) as well as a Nuclear Localization
           Signal (NLS), both located towards the C-terminal end of
           the protein. A fused NUT-GFP protein showed either
           cytoplasmic or nuclear localization, suggesting that it
           is subject to nuclear/cytoplasmic shuttling. Consistent
           with this possibility, treatment with leptomycin B, an
           inhibitor of CRM1-dependent nuclear export, resulted in
           re-distribution of NUT-GFP to the nucleus. Inspection of
           NUT revealed a C-terminal sequence similar to known
           nuclear export sequences (NES) which are often regulated
           by phosphorylation.
          Length = 328

 Score = 32.6 bits (74), Expect = 1.3
 Identities = 33/131 (25%), Positives = 49/131 (37%), Gaps = 26/131 (19%)

Query: 8   PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
           P PPPP  Q  P+    + +     GP G+ G     A  +    +         S K +
Sbjct: 156 PPPPPPVAQLVPI----VSLENAWPGPQGATGEGGPAAIQKPSPGD--------YSSKPK 203

Query: 68  GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLR-FQIMAYRLLARNQPLTPQLAMGVQG 126
            + E+ R  +  +  A R    H   S  V+ L  F I   R LAR +P      M ++ 
Sbjct: 204 SVYENFRRWQHYKTLARR----HLPQSPDVEALSCFLIPVLRSLARRKP-----TMTLE- 253

Query: 127 KRMEGVPSGPQ 137
              EG+    Q
Sbjct: 254 ---EGLWRALQ 261


>gnl|CDD|240578 cd12951, RRP7_Rrp7A, RRP7 domain ribosomal RNA-processing protein 7
            homolog A (Rrp7A) and similar proteins.  The family
            corresponds to the RRP7 domain of Rrp7A, also termed
            gastric cancer antigen Zg14, and similar proteins which
            are yeast ribosomal RNA-processing protein 7 (Rrp7p)
            homologs mainly found in Metazoans. The cellular function
            of Rrp7A remains unclear currently. Rrp7A harbors an
            N-terminal RNA recognition motif (RRM), also termed RBD
            (RNA binding domain) or RNP (ribonucleoprotein domain),
            and a C-terminal RRP7 domain.
          Length = 129

 Score = 31.1 bits (71), Expect = 1.4
 Identities = 21/64 (32%), Positives = 31/64 (48%), Gaps = 13/64 (20%)

Query: 1159 KAIDDGVE-YDDEEEEEEEEVRS------------KRKGKRRKKTEDDDEEPSTSKKRKK 1205
              ID+ +E YD EEEEE+EE                +KG+R K    +      ++K KK
Sbjct: 21   SEIDEYMEEYDKEEEEEKEEKEKEAEPDEDGWVTVTKKGRRPKTARKESVAAKAAEKEKK 80

Query: 1206 EKEK 1209
            +K+K
Sbjct: 81   KKKK 84


>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region.  The myc family belongs
            to the basic helix-loop-helix leucine zipper class of
            transcription factors, see pfam00010. Myc forms a
            heterodimer with Max, and this complex regulates cell
            growth through direct activation of genes involved in
            cell replication. Mutations in the C-terminal 20 residues
            of this domain cause unique changes in the induction of
            apoptosis, transformation, and G2 arrest.
          Length = 329

 Score = 32.6 bits (74), Expect = 1.4
 Identities = 14/44 (31%), Positives = 20/44 (45%), Gaps = 2/44 (4%)

Query: 1162 DDGVEYDDEEEEEEEE--VRSKRKGKRRKKTEDDDEEPSTSKKR 1203
             +  E ++EEEEEEEE  V +  K +     +    E  T   R
Sbjct: 228  SEEDEEEEEEEEEEEEIDVVTVEKRRSSSNRKASTSESITVPSR 271


>gnl|CDD|178748 PLN03209, PLN03209, translocon at the inner envelope of chloroplast
           subunit 62; Provisional.
          Length = 576

 Score = 32.6 bits (74), Expect = 1.5
 Identities = 19/62 (30%), Positives = 28/62 (45%), Gaps = 8/62 (12%)

Query: 115 PLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMP-PSQPMPNQAQPMPLQQQPPPQPH 173
           PLTP        + +  +PS    P  S     P P P++P+  +A P P  ++ PPQP 
Sbjct: 312 PLTPM------EELLAKIPSQRVPPKESDAADGPKPVPTKPVTPEA-PSPPIEEEPPQPK 364

Query: 174 QQ 175
             
Sbjct: 365 AV 366


>gnl|CDD|221654 pfam12589, WBS_methylT, Methyltransferase involved in Williams-Beuren
            syndrome.  This domain family is found in eukaryotes, and
            is typically between 72 and 83 amino acids in length. The
            family is found in association with pfam08241. This
            family is made up of S-adenosylmethionine-dependent
            methyltransferases. The proteins are deleted in
            Williams-Beuren syndrome (WBS), a complex developmental
            disorder with multisystemic manifestations including
            supravalvular aortic stenosis (SVAS) and a specific
            cognitive phenotype.
          Length = 85

 Score = 30.0 bits (68), Expect = 1.6
 Identities = 18/67 (26%), Positives = 30/67 (44%), Gaps = 16/67 (23%)

Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDRE----- 1212
            L         +D+E+ +  +VR   +  RRKK           KK+KK K+K +E     
Sbjct: 12   LPNGLGEEGEEDDEQIDASKVRRISQRNRRKK-----------KKKKKLKKKSKEWILRK 60

Query: 1213 KDQAKLK 1219
            K+Q + +
Sbjct: 61   KEQMRRR 67


>gnl|CDD|218734 pfam05758, Ycf1, Ycf1.  The chloroplast genomes of most higher plants
            contain two giant open reading frames designated ycf1 and
            ycf2. Although the function of Ycf1 is unknown, it is
            known to be an essential gene.
          Length = 832

 Score = 32.7 bits (75), Expect = 1.7
 Identities = 15/55 (27%), Positives = 31/55 (56%)

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
            EE + E E  S+ KG ++++    +E+PS   + K++ +K  + D+ ++ K  K 
Sbjct: 237  EETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDPDKTEDLDKLEILKEKKD 291



 Score = 30.7 bits (70), Expect = 7.3
 Identities = 27/105 (25%), Positives = 43/105 (40%), Gaps = 12/105 (11%)

Query: 1079 SEEEFQTYQRIDAERRKEQGKKSRLIEVS-ELPDWLIKEDEEIEQWAFEAKEEEKALHMG 1137
             +E F      D   +K   K   + E+S ++P W  K  +E+EQ   + +E     H  
Sbjct: 488  YQEFFNII-TTDPNDQKINKKSIGIEEISKKVPRWSYKLIDELEQDEGDNEENPPEDHDI 546

Query: 1138 RGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR 1182
            R SR+ K+V      T+           +    E+E+E EV   R
Sbjct: 547  R-SRKAKRVVI---FTDN------KKNTDNTTNEDEKEREVALIR 581


>gnl|CDD|221143 pfam11593, Med3, Mediator complex subunit 3 fungal.  Mediator is a
           large complex of up to 33 proteins that is conserved
           from plants to fungi to humans - the number and
           representation of individual subunits varying with
           species. It is arranged into four different sections, a
           core, a head, a tail and a kinase-activity part, and the
           number of subunits within each of these is what varies
           with species. Overall, Mediator regulates the
           transcriptional activity of RNA polymerase II but it
           would appear that each of the four different sections
           has a slightly different function. Mediator subunit
           Hrs1/Med3 is a physical target for Cyc8-Tup1, a yeast
           transcriptional co-repressor.
          Length = 381

 Score = 32.3 bits (73), Expect = 1.7
 Identities = 27/163 (16%), Positives = 48/163 (29%), Gaps = 8/163 (4%)

Query: 26  PMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANR 85
           P  A       +  P+     G        +   A      Q     PR  K     A  
Sbjct: 150 PAAAKVLKANAASAPNTTTGVGSAATTAAISATTATTPTTTQKKPRKPRQTKKTGPAAAA 209

Query: 86  TEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHG 145
                A   AQ Q   +  M    + +N  +  Q+        M+ +     + P +   
Sbjct: 210 K--AQASAQAQAQASAYNQMGSLGVPQNTSMLAQIPNPTP--LMQLL---NGVSPNNAMA 262

Query: 146 PMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKL 188
             P+    PM N  Q           P    G++++Q +++ +
Sbjct: 263 S-PLNNMSPMRNLNQMGNQNNGGQMTPSANNGNMNNQSRENSM 304


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 32.3 bits (73), Expect = 1.7
 Identities = 31/177 (17%), Positives = 63/177 (35%), Gaps = 13/177 (7%)

Query: 1055 QDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLI 1114
            +  +E+     V     VN   +  +EE +T    +A   +    +          + L 
Sbjct: 21   RQKQEEGSLGQVTTQVEVNSQNSVPDEESKTSTDDEAALLERLA-RREERRDERFSEALE 79

Query: 1115 KEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE- 1173
            ++ E       ++  E         SR+ ++    ++ T +E  K        + EE E 
Sbjct: 80   RQKEFKPTSTDQSLSE--------PSRRMQEDSGAENETVEEEEKEESREEREEVEETEG 131

Query: 1174 ---EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMR 1227
                E++   +   + +K+ ++ + E     KR   +E + E    KLK T     R
Sbjct: 132  VTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSR 188



 Score = 30.8 bits (69), Expect = 5.1
 Identities = 28/174 (16%), Positives = 70/174 (40%), Gaps = 8/174 (4%)

Query: 263 RDTTLETAVNVKAYKRTKRQ--GLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHC 320
           R+   ET    K+ ++   +     +    E   ++++        + + E++T  L+H 
Sbjct: 123 REEVEETEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHT 182

Query: 321 KDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE------EG 374
           ++         A++    +         E   + E+ + ++E  R+++ E+E      E 
Sbjct: 183 ENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEA 242

Query: 375 YRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVK 428
            RK  ++++ +RL   + +     +   Q V E  + + KK  +    +  S+K
Sbjct: 243 DRKSREEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSEDKKPFKCFTPKGSSLK 296


>gnl|CDD|227400 COG5068, ARG80, Regulator of arginine metabolism and related MADS
           box-containing transcription factors [Transcription].
          Length = 412

 Score = 32.3 bits (73), Expect = 1.8
 Identities = 24/192 (12%), Positives = 42/192 (21%), Gaps = 29/192 (15%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ-------APGQNPQENL 54
              +T       +  +    +      AP S     P  S           P  + Q N 
Sbjct: 135 HTFTTPKLESVVKSLEGKSLIQSPCSNAP-SDSSEEPSSSASFSVDPNDNNPMGSFQHNG 193

Query: 55  TALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQ 114
           +     I     Q  +               +  K   T         +  A  ++    
Sbjct: 194 SPQTNFIPLQNPQTQQYQQH-----------SSRKDHPTVPHSNTNNGRPPAKFMIPELH 242

Query: 115 PLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ-PH 173
                L         +                  + P      Q  P  L   PP +  H
Sbjct: 243 SSHSTL---------DLPSDFISDSGFPNQSSTSIFPLDSAIIQITPPHLPNNPPQENRH 293

Query: 174 QQQGHISSQIKQ 185
           +   + SS + +
Sbjct: 294 ELYSNDSSMVSE 305


>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
          Length = 1355

 Score = 32.7 bits (74), Expect = 1.8
 Identities = 37/183 (20%), Positives = 56/183 (30%), Gaps = 13/183 (7%)

Query: 10  PPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGL 69
           P  P  Q PP+    +P   P       PGP  G+ P   P       Q        Q  
Sbjct: 336 PVEPVTQTPPVASVDVPPAQPTVAWQPVPGPQTGE-PVIAPAPEGYPQQ-------SQYA 387

Query: 70  EEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRM 129
           +   +Y + ++      +  +A  + Q  Q  +   A    A+     P     V G   
Sbjct: 388 QPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAW 447

Query: 130 EGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLT 189
           +         P S +        Q     A   PL QQP P   Q        ++++K  
Sbjct: 448 QAEEQQSTFAPQSTY-----QTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEETKPA 502

Query: 190 NIP 192
             P
Sbjct: 503 RPP 505



 Score = 32.4 bits (73), Expect = 2.4
 Identities = 24/107 (22%), Positives = 34/107 (31%), Gaps = 9/107 (8%)

Query: 102 FQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQP 161
           F+    + L  + P  P     V+  +    P  PQ        P+   P    P Q   
Sbjct: 727 FEFSPMKALLDDGPHEPLFTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVA 786

Query: 162 MPLQQQPPPQPH---------QQQGHISSQIKQSKLTNIPKPEGLDP 199
              Q Q P QP          QQ      Q +Q +    P+P+   P
Sbjct: 787 PQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQP 833



 Score = 30.4 bits (68), Expect = 8.6
 Identities = 23/89 (25%), Positives = 30/89 (33%), Gaps = 7/89 (7%)

Query: 114 QPLTPQLAMGVQGKRMEGVPSG-----PQMP-PMSLHGPMPMPPSQPMPNQAQPM-PLQQ 166
           QP  P        +  + V        PQ P         P  P  P P   QP  P+  
Sbjct: 754 QPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP 813

Query: 167 QPPPQPHQQQGHISSQIKQSKLTNIPKPE 195
           QP  Q  QQ      Q +Q +    P+P+
Sbjct: 814 QPQYQQPQQPVAPQPQYQQPQQPVAPQPQ 842


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 32.7 bits (75), Expect = 1.8
 Identities = 25/114 (21%), Positives = 46/114 (40%), Gaps = 21/114 (18%)

Query: 1055 QDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLI 1114
            Q  E  E + A   ++      AR+++E Q   R + +RR+   K+    E         
Sbjct: 649  QTAETRESQQAEVTEK------ARTQDEQQQAPRRERQRRRNDEKRQAQQEA-------- 694

Query: 1115 KEDEEIEQWAFEAKEEEKALH-MGRGSRQRKQ----VDYTDSLTEKEWLKAIDD 1163
            K     EQ   E ++EE+      R  R+++Q    V    S+ E+     +++
Sbjct: 695  KALNVEEQSVQETEQEERVQQVQPR--RKQRQLNQKVRIEQSVAEEAVAPVVEE 746


>gnl|CDD|177614 PHA03377, PHA03377, EBNA-3C; Provisional.
          Length = 1000

 Score = 32.7 bits (74), Expect = 1.8
 Identities = 25/125 (20%), Positives = 46/125 (36%), Gaps = 14/125 (11%)

Query: 61  IDSMKEQGLEEDP-RYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARN------ 113
            D+  E   +ED  R    +E++++  E+ +   + +  Q R  +   R+  R       
Sbjct: 362 GDATSETSSDEDTGRQGSDVELESSDDELPYIDPNMEPVQQRPVMFVSRVPWRKPRTLPW 421

Query: 114 -----QPLTPQLAM-GVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQ 167
                 P+   L     +    E   S P+ P  S    +P+ P+   P +     +  Q
Sbjct: 422 PTPKTHPVKRTLVKTSGRSDEAEQAQSTPERPGPSDQPSVPVEPAHLTPVE-HTTVILHQ 480

Query: 168 PPPQP 172
           PP  P
Sbjct: 481 PPQSP 485


>gnl|CDD|227268 COG4932, COG4932, Predicted outer membrane protein [Cell envelope
            biogenesis, outer membrane].
          Length = 1531

 Score = 32.5 bits (74), Expect = 1.9
 Identities = 22/90 (24%), Positives = 30/90 (33%), Gaps = 13/90 (14%)

Query: 1144 KQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRR-------KKTEDDDEE 1196
            K+   T SL          +G+   D     + EV+ K  G            T    EE
Sbjct: 61   KETGKTISLNIPS------EGLTTTDSLLVGDYEVKEKSAGLGTTLDEATYNVTLALKEE 114

Query: 1197 PSTSKKRKKEKEKDREKDQAKLKKTLKKIM 1226
              TS   K ++EK         KK LK  +
Sbjct: 115  VITSTSTKTQEEKTEIVTPEPSKKKLKAEI 144


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
            family consists of several Plasmodium falciparum SPAM
            (secreted polymorphic antigen associated with merozoites)
            proteins. Variation among SPAM alleles is the result of
            deletions and amino acid substitutions in non-repetitive
            sequences within and flanking the alanine heptad-repeat
            domain. Heptad repeats in which the a and d position
            contain hydrophobic residues generate amphipathic
            alpha-helices which give rise to helical bundles or
            coiled-coil structures in proteins. SPAM is an example of
            a P. falciparum antigen in which a repetitive sequence
            has features characteristic of a well-defined structural
            element.
          Length = 164

 Score = 31.0 bits (70), Expect = 2.1
 Identities = 35/139 (25%), Positives = 59/139 (42%), Gaps = 20/139 (14%)

Query: 1096 EQGKKSRLIEVSELPDW----LIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDS 1151
            E+ K   L+E  ++  W    +IKE+E+++    E  EEE+              +  + 
Sbjct: 15   EEKKDENLLEHVKITSWDKEDIIKENEDVKDEKQEDDEEEEE-------------EDEEE 61

Query: 1152 LTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
            + E E    I+D  E  ++EEEEEE+       K  +K   +D   ST     +      
Sbjct: 62   IEEPE---DIEDEEEIVEDEEEEEEDEEDNVDLKDIEKKNINDIFNSTQDDNAQNLISKN 118

Query: 1212 EKDQAKLKKTLKKIMRVVI 1230
             K   K KKT + I++ + 
Sbjct: 119  YKKNEKSKKTAEDIVKTLF 137


>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
           protein; Reviewed.
          Length = 782

 Score = 32.1 bits (74), Expect = 2.2
 Identities = 43/254 (16%), Positives = 86/254 (33%), Gaps = 51/254 (20%)

Query: 196 GLDPLIILQEREN--RVALNIERRIEELNGSLTSTLPEHLRVKAEIELRAL-KVLNFQRQ 252
           GL   II + ++        +   I  L         E L  + E +      +L    +
Sbjct: 498 GLPENIIEEAKKLIGEDKEKLNELIASL---------EELERELEQKAEEAEALLKEAEK 548

Query: 253 LRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEY 312
           L+ E+          E    ++  +    +  ++  A + +++ +K   E  K  +  + 
Sbjct: 549 LKEEL---------EEKKEKLQEEEDKLLEEAEK-EAQQAIKEAKKEADEIIKELRQLQK 598

Query: 313 ITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE 372
                    +  E  +       RLNKA        +K+++K++E    + ++       
Sbjct: 599 GGYASVKAHELIEARK-------RLNKANEKKEKKKKKQKEKQEELKVGDEVK------- 644

Query: 373 EGYRKLIDQKKDKRLAFLLSQTDEYISNLT-QM----VKEHKMEQKKKQDEESKKRKQSV 427
             Y  L  QK       +LS  D+       Q     +K    + +K Q  + KK+K+  
Sbjct: 645 --YLSL-GQK-----GEVLSIPDD--KEAIVQAGIMKMKVPLSDLEKIQKPKKKKKKKPK 694

Query: 428 KQKLMDTDGKVTLD 441
             K       + LD
Sbjct: 695 TVKPKPRTVSLELD 708


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
           bacterial type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. This family
           represents the SMC protein of most bacteria. The smc
           gene is often associated with scpB (TIGR00281) and scpA
           genes, where scp stands for segregation and condensation
           protein. SMC was shown (in Caulobacter crescentus) to be
           induced early in S phase but present and bound to DNA
           throughout the cell cycle [Cellular processes, Cell
           division, DNA metabolism, Chromosome-associated
           proteins].
          Length = 1179

 Score = 32.3 bits (74), Expect = 2.3
 Identities = 35/246 (14%), Positives = 82/246 (33%), Gaps = 38/246 (15%)

Query: 203 LQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACAR 262
              + ++    +E  IEEL   L     E    +AEIE    ++   + +L         
Sbjct: 748 RIAQLSKELTELEAEIEELEERLEEAEEELAEAEAEIEELEAQIEQLKEEL--------- 798

Query: 263 RDTTLETAVNV--KAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHC 320
               L  A++                  A    E+ + +E      ++  E         
Sbjct: 799 --KALREALDELRAELTLLNE------EAANLRERLESLERRIAATERRLE--------- 841

Query: 321 KDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKK-EQERIEKERMRRLMAEDEEGYRKLI 379
            D +E        I  L   +       E+ + + E    E+  +   +A       +L 
Sbjct: 842 -DLEEQIEELSEDIESLAAEIEELEELIEELESELEALLNERASLEEALALLRSELEELS 900

Query: 380 DQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVT 439
           ++ ++       S+    +  L + + + ++       E  + R  +++++L + +  +T
Sbjct: 901 EELRELESK--RSELRRELEELREKLAQLELRL-----EGLEVRIDNLQERLSE-EYSLT 952

Query: 440 LDQDET 445
           L++ E 
Sbjct: 953 LEEAEA 958


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
           approximately 300 residues, found in plants and
           vertebrates. They contain a highly conserved DDRGK
           motif.
          Length = 189

 Score = 31.2 bits (71), Expect = 2.3
 Identities = 25/122 (20%), Positives = 50/122 (40%), Gaps = 14/122 (11%)

Query: 277 KRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMR 336
           K   ++  K      + ++++  E ER++R+K +E      +  ++ +E     +    R
Sbjct: 2   KIGAKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEER 61

Query: 337 LNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDE 396
             +         E++ +KEQE  E E+++     +EEG         D+    LL     
Sbjct: 62  KER---------EEQARKEQE--EYEKLKSSFVVEEEG---TDKLSADEESNELLEDFIN 107

Query: 397 YI 398
           YI
Sbjct: 108 YI 109



 Score = 30.1 bits (68), Expect = 6.0
 Identities = 19/57 (33%), Positives = 35/57 (61%), Gaps = 1/57 (1%)

Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEP-STSKKRKKEKEKDREKDQAKLKKTLKKI 1225
            E EEEE E R K + KR  + ++++E      KK+++E+ K+RE+   K ++  +K+
Sbjct: 22   EAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKEREEQARKEQEEYEKL 78


>gnl|CDD|172884 PRK14408, PRK14408, membrane protein; Provisional.
          Length = 257

 Score = 31.5 bits (71), Expect = 2.5
 Identities = 19/70 (27%), Positives = 37/70 (52%), Gaps = 9/70 (12%)

Query: 549 ASILVNGKLKEYQIK-------GLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLME 601
           ASI+++   K+  ++       G+  M+ +    L GIL   + + K I T++L TY++ 
Sbjct: 30  ASIILSLVFKKQDVRLFASKNAGMTNMIRVHGKKL-GILTLFLDIIKPITTVSL-TYIIY 87

Query: 602 KKKVNGPFLI 611
           K  ++ PF +
Sbjct: 88  KYALDAPFDL 97


>gnl|CDD|240273 PTZ00110, PTZ00110, helicase; Provisional.
          Length = 545

 Score = 32.1 bits (73), Expect = 2.5
 Identities = 29/111 (26%), Positives = 51/111 (45%), Gaps = 9/111 (8%)

Query: 887 LDRILPKLKSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDLLKKFNA 946
           L  +L ++   G ++L+F +  +  + L       G+  + + G  K E+R  +L +F  
Sbjct: 366 LKMLLQRIMRDGDKILIFVETKKGADFLTKELRLDGWPALCIHGDKKQEERTWVLNEFKT 425

Query: 947 PDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIGQ 997
             S   I   +T     GL+++    VI FD    P+   Q +D  HRIG+
Sbjct: 426 GKSPIMI---ATDVASRGLDVKDVKYVINFDF---PN---QIEDYVHRIGR 467


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 32.1 bits (73), Expect = 2.5
 Identities = 11/45 (24%), Positives = 17/45 (37%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAP 46
           S+SSTS +    +              +P   PP +   SP + P
Sbjct: 325 SSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRP 369


>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
            selection and in elongation by RNA polymerase II
            [Transcription].
          Length = 521

 Score = 31.9 bits (72), Expect = 2.6
 Identities = 38/206 (18%), Positives = 80/206 (38%), Gaps = 15/206 (7%)

Query: 1027 DEKVIQAGMFDQKSTGSER--HQFLQTILHQDDEEDEEENAVPDDETVNQMLARSEEE-F 1083
            DE +  AG+ D     + +  H  L  +L    +ED  EN    +E+  +   +SEEE F
Sbjct: 6    DELLALAGIDDSDVASNRKRAHDDLDDVLSSSSDEDNNENVDYAEESGGEGNEKSEEEKF 65

Query: 1084 QTYQRIDAERRKEQGKKSRLIEVSELP-DWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQ 1142
            +   R++  + K++  +++++ ++E+  + ++ E EE      E +E    +     S  
Sbjct: 66   KNPYRLEG-KFKDEADRAKIMAMTEIERESILFEREEEISKLMERRELAIRMEQQHRSSG 124

Query: 1143 RKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTED---------- 1192
                  +                  + ++  E EE     +    ++ +D          
Sbjct: 125  CTDTRRSTRYEPLTSAAEEKKKKLLELKKTREREERLYSERHIELQRFKDYKELEESEQG 184

Query: 1193 DDEEPSTSKKRKKEKEKDREKDQAKL 1218
              EE + S   +  ++  R  D A+L
Sbjct: 185  LQEEYTPSYAEEAVEDISRTDDFAEL 210


>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
          Length = 943

 Score = 32.0 bits (72), Expect = 2.9
 Identities = 16/46 (34%), Positives = 25/46 (54%), Gaps = 1/46 (2%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
            DDG E DDE+    EE + K + +RR+  +   +    SK +K +K
Sbjct: 877  DDGTEADDEDTHPPEE-KHKSEVRRRRPPKKPSKPKKPSKPKKPKK 921


>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
          Length = 333

 Score = 31.5 bits (72), Expect = 3.0
 Identities = 12/44 (27%), Positives = 15/44 (34%), Gaps = 1/44 (2%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
              P+ P      P+  P  QP P Q    P+  Q  P P    
Sbjct: 116 QHAPR-PAQPAPQPVQQPAYQPQPEQPLQQPVSPQVAPAPQPVH 158



 Score = 30.0 bits (68), Expect = 8.2
 Identities = 8/44 (18%), Positives = 9/44 (20%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
           P     P  +     P P     P          Q P  P    
Sbjct: 109 PEAQVPPQHAPRPAQPAPQPVQQPAYQPQPEQPLQQPVSPQVAP 152


>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
           and chromosome partitioning].
          Length = 420

 Score = 31.6 bits (72), Expect = 3.0
 Identities = 20/104 (19%), Positives = 37/104 (35%), Gaps = 5/104 (4%)

Query: 266 TLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKE 325
           TL+    V+A    ++  L     +E+  +Q K+    ++R+K    + + L   +   E
Sbjct: 169 TLKQLAAVRAEIAAEQAELTTLL-SEQRAQQAKLAQLLEERKKTLAQLNSELSADQKKLE 227

Query: 326 YHRNNQAR----IMRLNKAVMNYHANAEKEQKKEQERIEKERMR 365
             R N++R    I     A       A   +         E  R
Sbjct: 228 ELRANESRLKNEIASAEAAAAKAREAAAAAEAAAARARAAEAKR 271


>gnl|CDD|237756 PRK14559, PRK14559, putative protein serine/threonine phosphatase;
           Provisional.
          Length = 645

 Score = 32.0 bits (73), Expect = 3.0
 Identities = 28/140 (20%), Positives = 39/140 (27%), Gaps = 25/140 (17%)

Query: 33  GPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAF 92
             P S     G+A  Q+      +      S     L+   RYQ L   +   T   H  
Sbjct: 62  ASPNSEVLESGEATQQSESSLTPSSSPLYGSY----LDPGQRYQLLASSEEIPTAAAHTE 117

Query: 93  TSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPS 152
              +V                QPL P        + +         P       +P    
Sbjct: 118 LQGRVLDC-------------QPLQPSPL-----EALLEQLEDLLNPLADPTEVLPTLLW 159

Query: 153 QPM--PNQAQP-MPLQQQPP 169
           Q +  P  A P + LQ Q P
Sbjct: 160 QQLGIPALAIPYLALQDQFP 179


>gnl|CDD|132720 cd02584, RNAP_II_Rpb1_C, Largest subunit (Rpb1) of Eukaryotic RNA
           polymerase II (RNAP II), C-terminal domain.  RNA
           polymerase II (RNAP II) is a large multi-subunit complex
           responsible for the synthesis of mRNA. RNAP II consists
           of a 10-subunit core enzyme and a peripheral heterodimer
           of two subunits. The largest core subunit (Rpb1) of
           yeast RNAP II is the best characterized member of this
           family. Structure studies suggest that RNAP complexes
           from different organisms share a crab-claw-shape
           structure. In yeast, Rpb1 and Rpb2, the largest and the
           second largest subunits, each makes up one clamp, one
           jaw, and part of the cleft. Rpb1 interacts with Rpb2 to
           form the DNA entry and RNA exit channels in addition to
           the catalytic center of RNA synthesis. The C-terminal
           domain of Rpb1 makes up part of the foot and jaw
           structures.
          Length = 410

 Score = 31.8 bits (73), Expect = 3.0
 Identities = 19/83 (22%), Positives = 39/83 (46%), Gaps = 4/83 (4%)

Query: 356 QERIEKERMR-RLMAEDEEGYRKLIDQKKDKRL-AFLLSQTDEYISNLTQMVKEHKMEQK 413
            +  EK  +R R++ +DEE      D    K++ + +LS          + V   +  +K
Sbjct: 194 DDNAEKLVIRIRIINDDEEKEEDSEDDVFLKKIESNMLSDMTLKGIEGIRKVFIREENKK 253

Query: 414 KKQDEESKKRKQSVKQKLMDTDG 436
           K   E  + +K+  ++ +++TDG
Sbjct: 254 KVDIETGEFKKR--EEWVLETDG 274


>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin.  Nucleoplasmins are also
            known as chromatin decondensation proteins. They bind to
            core histones and transfer DNA to them in a reaction that
            requires ATP. This is thought to play a role in the
            assembly of regular nucleosomal arrays.
          Length = 146

 Score = 30.4 bits (69), Expect = 3.1
 Identities = 11/44 (25%), Positives = 22/44 (50%), Gaps = 13/44 (29%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKK 1205
            DD  + ++E++EE+++             ED+ EE  +  K+ K
Sbjct: 116  DDEEDEEEEDDEEDDD-------------EDESEEEESPVKKVK 146



 Score = 29.2 bits (66), Expect = 8.5
 Identities = 16/50 (32%), Positives = 24/50 (48%), Gaps = 9/50 (18%)

Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
            L A ++    DDEE+EEEE+          ++ +D+DE        KK K
Sbjct: 106  LVASEEDESDDDEEDEEEED---------DEEDDDEDESEEEESPVKKVK 146


>gnl|CDD|129705 TIGR00618, sbcc, exonuclease SbcC.  All proteins in this family for
           which functions are known are part of an exonuclease
           complex with sbcD homologs. This complex is involved in
           the initiation of recombination to regulate the levels
           of palindromic sequences in DNA. This family is based on
           the phylogenomic analysis of JA Eisen (1999, Ph.D.
           Thesis, Stanford University) [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 1042

 Score = 31.9 bits (72), Expect = 3.2
 Identities = 34/228 (14%), Positives = 79/228 (34%), Gaps = 16/228 (7%)

Query: 209 RVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLE 268
            + L + +     +      LP+ L    ++ L+ ++  + + QL       A+  T L 
Sbjct: 650 ALQLTLTQERVREHALSIRVLPKELLASRQLALQKMQ--SEKEQLTYWKEMLAQCQTLLR 707

Query: 269 TAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHR 328
                      +   ++ A ++   +   + +A  +  ++      TVL+   +   +  
Sbjct: 708 ELETHIEEYDREFNEIENASSSLGSDLAAREDALNQSLKELMHQARTVLKARTE--AHFN 765

Query: 329 NNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAE------DEEGYRKLIDQK 382
           NN+     L       H  AE +        +   ++ L AE       +E    L  + 
Sbjct: 766 NNEEVTAALQTGAELSHLAAEIQFFNRLREEDTHLLKTLEAEIGQEIPSDEDILNLQCET 825

Query: 383 KDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQK 430
             +     LS+ +E  + L       ++  +  + EE  K+   + Q+
Sbjct: 826 LVQEEEQFLSRLEEKSATL------GEITHQLLKYEECSKQLAQLTQE 867


>gnl|CDD|220093 pfam09030, Creb_binding, Creb binding.  The Creb binding domain
           assumes a structure comprising three alpha-helices
           which pack in a bundle, exposing a hydrophobic groove
           between alpha-1 and alpha-3 within which complementary
           domains found in the protein 'activator for thyroid
           hormone and retinoid receptors' (ACTR) can dock. Docking
           of these domains is required for the recruitment of RNA
           polymerase II and the basal transcription machinery.
          Length = 104

 Score = 29.7 bits (66), Expect = 3.2
 Identities = 16/62 (25%), Positives = 22/62 (35%), Gaps = 2/62 (3%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIP 192
           P GP      +   MP P  Q +   A   P  +    QP   +G +S    Q  L  + 
Sbjct: 7   PQGPLPQQQQMQPGMPRPVMQMVAQHAVAGP--RPGLVQPGISRGIVSPNALQDLLRTLK 64

Query: 193 KP 194
            P
Sbjct: 65  SP 66


>gnl|CDD|221868 pfam12938, M_domain, M domain of GW182. 
          Length = 238

 Score = 31.0 bits (70), Expect = 3.2
 Identities = 21/122 (17%), Positives = 33/122 (27%), Gaps = 31/122 (25%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
            N ++  +       +     G  P      G  G+ GP P    G              
Sbjct: 53  PNLASLSSLTSQGLGKIL--SGLQPPPLGNGGGSGAGGPGPVGGGGGPGVAPN------- 103

Query: 62  DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMA----YRLLARNQPLT 117
                            I+  A   +         VQQ++  +       ++L  NQPL 
Sbjct: 104 ----------------NIQPNAQAQQPSTQQLRMLVQQIQMAVQKGYLNPQIL--NQPLA 145

Query: 118 PQ 119
           PQ
Sbjct: 146 PQ 147


>gnl|CDD|237791 PRK14701, PRK14701, reverse gyrase; Provisional.
          Length = 1638

 Score = 31.8 bits (72), Expect = 3.3
 Identities = 21/108 (19%), Positives = 50/108 (46%), Gaps = 10/108 (9%)

Query: 585 GLGKTIQTIALITYLMEKKKVNGPFLIIVPLSTLSNWSLE-----FERWAPSVNVVAYKG 639
           G+GK+     +  +L  K K      II+P + L   ++E      E+    V +V Y  
Sbjct: 104 GMGKSTFGAFIALFLALKGKKC---YIILPTTLLVKQTVEKIESFCEKANLDVRLVYYHS 160

Query: 640 --SPHLRKTLQAQMKASKFNVLLTTYEYVIKDKGPLAKLHWKYMIIDE 685
                 ++    +++   F++L+TT +++ ++   +  L + ++ +D+
Sbjct: 161 NLRKKEKEEFLERIENGDFDILVTTAQFLARNFPEMKHLKFDFIFVDD 208


>gnl|CDD|180481 PRK06231, PRK06231, F0F1 ATP synthase subunit B; Validated.
          Length = 205

 Score = 31.0 bits (70), Expect = 3.3
 Identities = 18/64 (28%), Positives = 33/64 (51%), Gaps = 6/64 (9%)

Query: 338 NKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEY 397
           N  +       EKE+++ +E+++KE +   M   EE  +K +D++ D +L       DE+
Sbjct: 143 NLIIFQARQEIEKERRELKEQLQKESVELAMLAAEELIKKKVDREDDDKL------VDEF 196

Query: 398 ISNL 401
           I  L
Sbjct: 197 IREL 200


>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
           Provisional.
          Length = 2849

 Score = 31.6 bits (71), Expect = 3.6
 Identities = 19/53 (35%), Positives = 28/53 (52%), Gaps = 1/53 (1%)

Query: 483 PGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYY 535
           P    V D D+E+EDED +  ++    E E +E+ KG DDE  ++   E   Y
Sbjct: 147 PRDNFVIDDDDEDEDEDDDDEEDDEEEEEE-EEEIKGFDDEDEEDEGGEDFTY 198


>gnl|CDD|118696 pfam10168, Nup88, Nuclear pore component.  Nup88 can be divided
           into two structural domains; the N-terminal two-thirds
           of the protein has no obvious structural motifs but is
           the region for binding to Nup98, one of the components
           of the nuclear pore. The C-terminal end is a predicted
           coiled-coil domain. Nup88 is overexpressed in tumour
           cells.
          Length = 717

 Score = 31.4 bits (71), Expect = 3.7
 Identities = 21/129 (16%), Positives = 57/129 (44%), Gaps = 14/129 (10%)

Query: 267 LETAVNVKAYKRTKRQGLKEARATEK-----LEKQQKVEAERKKRQKHQEYITTVLQHCK 321
           L  A  V   +   +  L       +     L+K++++E  +  R++ ++ ++   +   
Sbjct: 541 LSRATQVFREQYLLKHDLAREEFQRRVKLLQLQKEKQLEDIQDCREE-RKSLSERAEKLA 599

Query: 322 DFKEYHRNNQARIMRLNKAVM-NYHAN------AEKEQKKEQERIEKERMRRLMAEDEEG 374
           +  E  + NQ  ++   K ++ + ++       +E++  KE +RI K+ ++ L    ++ 
Sbjct: 600 EKFEEAKYNQELLVNRCKRLLQSANSQLPVLSDSERDMSKELQRINKQ-LQHLANGIKQV 658

Query: 375 YRKLIDQKK 383
            +K   Q+ 
Sbjct: 659 KKKKNYQRY 667


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This protein,
            which interacts with both microtubules and TRAF3 (tumour
            necrosis factor receptor-associated factor 3), is
            conserved from worms to humans. The N-terminal region is
            the microtubule binding domain and is well-conserved; the
            C-terminal 100 residues, also well-conserved, constitute
            the coiled-coil region which binds to TRAF3. The central
            region of the protein is rich in lysine and glutamic acid
            and carries KKE motifs which may also be necessary for
            tubulin-binding, but this region is the least
            well-conserved.
          Length = 506

 Score = 31.4 bits (71), Expect = 3.8
 Identities = 11/53 (20%), Positives = 29/53 (54%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKK 1220
              +E+ +EE    K K + ++K    ++E    KK ++ ++++ EK + +++ 
Sbjct: 119  KKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRA 171



 Score = 30.6 bits (69), Expect = 5.8
 Identities = 24/93 (25%), Positives = 43/93 (46%), Gaps = 6/93 (6%)

Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR---EKDQAKLKKTLKKI 1225
              E  +EEE   ++  + +KK ++  +E    +K K+E ++ R   EK++ K KK  +  
Sbjct: 99   KNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPR 158

Query: 1226 MRVVIKYTDSD---GRVLSEPFIKLPSRKELPD 1255
             R   K  +      R    P  K P++K+ P 
Sbjct: 159  DREEEKKRERVRAKSRPKKPPKKKPPNKKKEPP 191


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
            required for initiation of DNA replication in S.
            cerevisiae, forming a complex with MCM5/CDC46. Homologues
            of CDC45 have been identified in human, mouse and smut
            fungus among others.
          Length = 583

 Score = 31.5 bits (72), Expect = 3.8
 Identities = 15/52 (28%), Positives = 29/52 (55%)

Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREK 1213
             +  + DDEE +EE+E  SK +       +DDD++ +T ++  + + + RE 
Sbjct: 123  LEEDDDDDEESDEEDEESSKSEDDEDDDDDDDDDDIATRERSLERRRRRREW 174



 Score = 30.3 bits (69), Expect = 7.5
 Identities = 17/60 (28%), Positives = 27/60 (45%), Gaps = 9/60 (15%)

Query: 1161 IDDGVEYDDEEEEEEEEVRSKRKGKRRKK-----TEDDDEEPSTSKKRKKEKEKDREKDQ 1215
             DDG    D EEE ++E R     +  ++      E D+E+  +SK    E + D + D 
Sbjct: 101  FDDG----DIEEELQDEPRYDDAYRDLEEDDDDDEESDEEDEESSKSEDDEDDDDDDDDD 156


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
            envelope biogenesis, outer membrane].
          Length = 387

 Score = 31.1 bits (70), Expect = 3.8
 Identities = 32/166 (19%), Positives = 67/166 (40%), Gaps = 10/166 (6%)

Query: 1059 EDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDE 1118
            + ++ +A   ++   +   +  EE +  Q  + ER K+  +K RL            E+ 
Sbjct: 68   QSQQSSAKKGEQQRKKKEEQVAEELKPKQAAEQERLKQL-EKERL---KAQEQQKQAEEA 123

Query: 1119 EIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEV 1178
            E +    + ++EE+A       +++ +     +  E   LKA  +  +  +E  +  EE 
Sbjct: 124  EKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEA 183

Query: 1179 RSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
            ++K +    KK      +     K   EK K   + +AK +K  + 
Sbjct: 184  KAKAEAAAAKK------KAEAEAKAAAEKAKAEAEAKAKAEKKAEA 223



 Score = 29.9 bits (67), Expect = 9.9
 Identities = 30/142 (21%), Positives = 57/142 (40%), Gaps = 8/142 (5%)

Query: 289 ATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANA 348
             ++  + Q  ++  KK +  Q+      Q  ++ K      Q R+ +L K  +      
Sbjct: 60  VVQQYGRIQSQQSSAKKGE--QQRKKKEEQVAEELKPKQAAEQERLKQLEKERL-----K 112

Query: 349 EKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEH 408
            +EQ+K+ E  EK+       ++E+  +   +QKK K  A       E          + 
Sbjct: 113 AQEQQKQAEEAEKQAQLEQKQQEEQARKAAAEQKK-KAEAAKAKAAAEAAKLKAAAEAKK 171

Query: 409 KMEQKKKQDEESKKRKQSVKQK 430
           K E+  K  EE+K + ++   K
Sbjct: 172 KAEEAAKAAEEAKAKAEAAAAK 193


>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal.  This
            domain is found to the N-terminus of bacterial signal
            peptidases of the S49 family (pfam01343).
          Length = 154

 Score = 30.2 bits (69), Expect = 4.0
 Identities = 18/60 (30%), Positives = 29/60 (48%), Gaps = 10/60 (16%)

Query: 1166 EYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
            EY D +E  E  +  K++ K  +K E         KK +K K K  EK +AK ++   ++
Sbjct: 50   EYKDLKESLEAALLDKKELKAWEKAE---------KKAEKAKAKA-EKKKAKKEEPKPRL 99


>gnl|CDD|219897 pfam08549, SWI-SNF_Ssr4, Fungal domain of unknown function
           (DUF1750).  This is a fungal domain of unknown function.
          Length = 669

 Score = 31.5 bits (71), Expect = 4.1
 Identities = 19/60 (31%), Positives = 24/60 (40%), Gaps = 13/60 (21%)

Query: 132 VPSGPQMPPMSLH---GPMP--MPPSQPMPNQAQPMPLQQQPPP----QPHQQQGHISSQ 182
           +P  PQM   S++   GP P  M   QP      P P     PP         +GH +SQ
Sbjct: 202 IPLPPQMAGQSMYQPPGPYPNAMVGRQPFY----PQPGAVAGPPKRRGGHKAPRGHRASQ 257


>gnl|CDD|222374 pfam13779, DUF4175, Domain of unknown function (DUF4175). 
          Length = 820

 Score = 31.4 bits (72), Expect = 4.2
 Identities = 34/172 (19%), Positives = 47/172 (27%), Gaps = 35/172 (20%)

Query: 8   PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
                 Q QQ     GQ   G  G    G      GQ  GQ  Q +L   Q+A+     +
Sbjct: 617 AQRGEQQGQQGQGGQGQGQPGQQGQQGQGQQQGQQGQG-GQGGQGSLAERQQALRDELGR 675

Query: 68  GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIM--AYRLLARNQPLTPQLAMGVQ 125
                P         A       A   A       + M  A   L +        A+  Q
Sbjct: 676 QRGGLPGMGGEAGEAARD-----ALGRAG------RAMGGAEEALGQGD---LAEAVDRQ 721

Query: 126 GKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQG 177
           G+ +E +  G                ++ +         Q Q      QQQG
Sbjct: 722 GRALEALREG----------------ARALGEAMAQQ--QGQQQGGQGQQQG 755


>gnl|CDD|227701 COG5414, COG5414, TATA-binding protein-associated factor
            [Transcription].
          Length = 392

 Score = 31.2 bits (70), Expect = 4.2
 Identities = 22/104 (21%), Positives = 42/104 (40%), Gaps = 4/104 (3%)

Query: 1113 LIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEE 1172
            L KE +  E+   E   EE  L +G    + K+V   D   ++E ++  +   E    + 
Sbjct: 257  LKKEKQGAEEEGEEGMSEED-LDVGAAEIENKEVSEGDKEQQQEEVENAEAHKEEVQSDR 315

Query: 1173 EEEEEVRSK---RKGKRRKKTEDDDEEPSTSKKRKKEKEKDREK 1213
             +E     +      +  + TE   +E +  +K  +EK +  E 
Sbjct: 316  PDEIGEEKEEDDENEENERHTELLADELNELEKGIEEKRRQMES 359


>gnl|CDD|233158 TIGR00865, bcl-2, apoptosis regulator.  The Bcl-2 (Bcl-2) Family
           (TC 1.A.21) The Bcl-2 family consists of the apoptosis
           regulator, Bcl-X, and its homologues. Bcl-X is a
           dominant regulator of programmed cell death in mammalian
           cells. The long form (Bcl-X(L)) displays cell death
           repressor activity, but the short isoform (Bcl-X(S)) and
           the b-isoform (Bcl-Xb) promote cell death. Bcl-X(L),
           Bcl-X(S) and Bcl-Xb are three isoforms derived by
           alternative RNA splicing. Bcl-X(S) forms heterodimers
           with Bcl-2. Homologues of Bcl-X include the Bax (rat;
           192 aas; spQ63690) and Bak (mouse; 208 aas; spO08734)
           proteins which also influence apoptosis. Using isolated
           mitochondria, recombinant Bax and Bak have been shown to
           induce Dy loss, swelling and cytochrome c release. All
           of these changes are dependent on Ca2+ and are prevented
           by cyclosporin A and bongkrekic acid, both of which are
           known to close permeability transition pores
           (megachannels). Coimmunoprecipitation studies revealed
           that Bax and Bak interact with VDAC to form permeability
           transition pores. Thus, even though they can form
           channels in artificial membranes at acidic pH,
           proapoptotic Bcl-2 family proteins (including Bax and
           Bak) probably induce the mitochondrial permeability
           transition and cytochrome c release by interacting with
           permeability transition pores, the most important
           component for pore formation of which is VDAC [Regulatory
           functions, Other].
          Length = 213

 Score = 30.6 bits (69), Expect = 4.4
 Identities = 8/16 (50%), Positives = 11/16 (68%)

Query: 471 LAAHLKQWIQDHPGWE 486
           L  HL  WIQ++ GW+
Sbjct: 155 LNEHLHPWIQENGGWD 170


>gnl|CDD|225087 COG2176, PolC, DNA polymerase III, alpha subunit (gram-positive type)
            [DNA replication, recombination, and repair].
          Length = 1444

 Score = 31.5 bits (72), Expect = 4.4
 Identities = 32/127 (25%), Positives = 52/127 (40%), Gaps = 12/127 (9%)

Query: 1049 LQTILHQDDEEDEEE--------NAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKK 1100
            L      +D  +E+E        N   +      + A  + + ++ +    +     G+K
Sbjct: 161  LLIEFEVNDISEEQEFEKFEEAINEEVEKAAQEALEAEKKLKAESPKVEKPKP-LFDGQK 219

Query: 1101 SRLIEVSELPDWLIKEDEEIEQWAFEA---KEEEKALHMGRGSRQRKQVDYTDSLTEKEW 1157
             R I+ +E    LIK +EE  +   E    K E K L  GR     K  DYT SL  K++
Sbjct: 220  GRKIKSTEEIKPLIKINEEETRVKVEGYIFKIEIKELKSGRTLLNIKVTDYTSSLILKKF 279

Query: 1158 LKAIDDG 1164
            L+  +D 
Sbjct: 280  LRDEEDE 286


>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing.  This is a family of
            proteins that are involved in rRNA processing. In a
            localisation study they were found to localise to the
            nucleus and nucleolus. The family also includes other
            metazoa members from plants to mammals where the protein
            has been named BR22 and is associated with TTF-1, thyroid
            transcription factor 1. In the lungs, the family binds
            TTF-1 to form a complex which influences the expression
            of the key lung surfactant protein-B (SP-B) and -C
            (SP-C), the small hydrophobic surfactant proteins that
            maintain surface tension in alveoli.
          Length = 150

 Score = 29.9 bits (67), Expect = 4.5
 Identities = 27/86 (31%), Positives = 51/86 (59%), Gaps = 3/86 (3%)

Query: 1153 TEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDRE 1212
             +KE+LK ++       E+E  E++V+S ++ ++ +K +  DE+   +K+RK+E+   RE
Sbjct: 34   LKKEYLKLLEKEGYAVPEKESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQ---RE 90

Query: 1213 KDQAKLKKTLKKIMRVVIKYTDSDGR 1238
            K+ AK +K L+KI     K  + + R
Sbjct: 91   KELAKRQKELEKIELSKKKQKERERR 116


>gnl|CDD|227504 COG5177, COG5177, Uncharacterized conserved protein [Function
            unknown].
          Length = 769

 Score = 31.2 bits (70), Expect = 4.6
 Identities = 42/205 (20%), Positives = 83/205 (40%), Gaps = 24/205 (11%)

Query: 1028 EKVIQAGMFDQKSTG------SERHQFLQTILHQDDEEDEEENAVPD-DETVNQMLARSE 1080
             K+I  G ++Q          ++    LQT+   +   D  +   P+ ++  +      E
Sbjct: 295  NKIIVNGQYEQTIREIFADRATKLELDLQTVFESNMNRDTLDEYAPEGEDLRSDYDEDFE 354

Query: 1081 EEFQTYQRIDA----ERRKEQGKKSRLIEVSEL--PDWLIKEDEEIEQWAFEAKEEEKAL 1134
             +  T  RID       R++  KK+ + + +      W   E+EE  Q       +E++ 
Sbjct: 355  YDGLTTVRIDDHGFLPGREQTSKKAAVPKGTSFYQAKWAEDEEEEDGQCN-----DEEST 409

Query: 1135 HMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDD 1194
                     K+ D  +   ++E   AIDD   +++   EEEE    ++  + R   ++D 
Sbjct: 410  MSAIDDDDPKENDNEEVAGDEES--AIDDNEGFEELSPEEEE----RQLREFRDMEKEDR 463

Query: 1195 EEPSTSKKRKKEKEKDREKDQAKLK 1219
            E P  ++ +  E   +R K+   L+
Sbjct: 464  EFPDEAELQPSESAIERYKEYRGLR 488


>gnl|CDD|214661 smart00435, TOPEUc, DNA Topoisomerase I (eukaryota).  DNA
            Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia
            virus topoisomerase, Variola virus topoisomerase, Shope
            fibroma virus topoisomerase.
          Length = 391

 Score = 30.8 bits (70), Expect = 4.9
 Identities = 15/67 (22%), Positives = 37/67 (55%), Gaps = 6/67 (8%)

Query: 1171 EEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDRE-KDQAKLKKTLKKIMRVV 1229
              E   +++ K K K  +     D E   ++ ++K+KEK +E K + ++++  ++I ++ 
Sbjct: 302  LFEMISDLKRKLKSKFER-----DNEKLDAEVKEKKKEKKKEEKKKKQIERLEERIEKLE 356

Query: 1230 IKYTDSD 1236
            ++ TD +
Sbjct: 357  VQATDKE 363


>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
           component YidC; Validated.
          Length = 429

 Score = 31.0 bits (70), Expect = 4.9
 Identities = 13/38 (34%), Positives = 21/38 (55%), Gaps = 1/38 (2%)

Query: 277 KRTKRQGLKEARATEKLEKQQKVEAERK-KRQKHQEYI 313
           K+T+     EA+A +K   Q++  AER+  R+  QE  
Sbjct: 333 KKTRTAEKNEAKARKKEIAQKRRAAEREINREARQERA 370


>gnl|CDD|221818 pfam12868, DUF3824, Domain of unknown function (DUF3824).  This
          is a repeating domain found in fungal proteins. It is
          proline-rich, and the function is not known.
          Length = 135

 Score = 29.5 bits (66), Expect = 5.1
 Identities = 20/53 (37%), Positives = 25/53 (47%), Gaps = 2/53 (3%)

Query: 2  SNSSTSPNPPPPQQQQPPLNVGQ-LPMGAPGSGPPGS-PGPSPGQAPGQNPQE 52
          S+S   P  P P    PP++  +  P       PPGS P P PG  PG NP +
Sbjct: 47 SDSYEEPYDPTPYPPSPPVSDPRYYPNSNYFPPPPGSTPVPPPGPQPGYNPAD 99


>gnl|CDD|149453 pfam08397, IMD, IRSp53/MIM homology domain.  The N-terminal
           predicted helical stretch of the insulin receptor
           tyrosine kinase substrate p53 (IRSp53) is an
           evolutionary conserved F-actin bundling domain involved
           in filopodium formation. The domain has been named IMD
           after the IRSp53 and missing in metastasis (MIM)
           proteins in which it occurs. Filopodium-inducing IMD
           activity is regulated by Cdc42 and Rac1 and is
           SH3-independent.
          Length = 218

 Score = 30.5 bits (69), Expect = 5.1
 Identities = 18/70 (25%), Positives = 35/70 (50%), Gaps = 6/70 (8%)

Query: 362 ERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESK 421
              R + ++ EE ++   D+     +  L  +T+        + K+++ E KKK+DE   
Sbjct: 65  MVHRSINSKLEEFFKAFHDE----LINPLEKKTELDKKYANALDKDYQTEYKKKRDE--L 118

Query: 422 KRKQSVKQKL 431
           ++KQS  +KL
Sbjct: 119 EKKQSDLKKL 128


>gnl|CDD|224272 COG1353, COG1353, Predicted CRISPR-associated polymerase [Defense
            mechanisms].
          Length = 799

 Score = 30.9 bits (70), Expect = 5.2
 Identities = 31/171 (18%), Positives = 57/171 (33%), Gaps = 22/171 (12%)

Query: 1147 DYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK------RRKKTEDDDEEPSTS 1200
            D+   ++ K   KA+     Y +   E+                            P+T 
Sbjct: 254  DFIYEVSSKGASKALRGRSFYIELLTEDIVNRIISELNLTRANILFEGGGHFYLLLPNTE 313

Query: 1201 KKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLS-EPFIKLPSRKELPDYYEV 1259
            + RK  +E ++E     L + L +    V  Y +     L+ + F +   RK L    E 
Sbjct: 314  EVRKILEEIEKE-----LNEWLIEQNFKVDLYLELAWVELTLKDFYRRWFRKHLEKVSE- 367

Query: 1260 IDRPMDIKKILGR--IEDGKYSSVDELQKDFK---TLCRNAQIYNEELSLI 1305
                +  +K L R  +E G      EL    +   ++C N +   +    +
Sbjct: 368  ----LPSRKKLRRFEVELGILFPRYELDGPGERTCSVCGNKRAKGDSEKEM 414


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 31.0 bits (70), Expect = 5.3
 Identities = 27/146 (18%), Positives = 59/146 (40%), Gaps = 10/146 (6%)

Query: 1142 QRKQVDYTDSL---TEKE-WLK---AIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDD 1194
            + K     + L   T K  WLK   A++  ++  D+E+ + EE R K +    +      
Sbjct: 1135 RDKLNIEVEDLKKTTPKSLWLKDLDALEKELDKLDKEDAKAEEAREKLQRAAARGESGAA 1194

Query: 1195 EEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELP 1254
            ++ S    +K   +K  +K      +T ++        T++   V+ +P  +  ++K+ P
Sbjct: 1195 KKVSRQAPKKPAPKKTTKKASE--SETTEETYGSSAMETENVAEVV-KPKGRAGAKKKAP 1251

Query: 1255 DYYEVIDRPMDIKKILGRIEDGKYSS 1280
               +  +   +I  +  R+      S
Sbjct: 1252 AAAKEKEEEDEILDLKDRLAAYNLDS 1277


>gnl|CDD|218191 pfam04652, DUF605, Vta1 like.  Vta1 (VPS20-associated protein 1) is
           a positive regulator of Vps4. Vps4 is an ATPase that is
           required in the multivesicular body (MVB) sorting
           pathway to dissociate the endosomal sorting complex
           required for transport (ESCRT). Vta1 promotes correct
           assembly of Vps4 and stimulates its ATPase activity
           through its conserved Vta1/SBP1/LIP5 region.
          Length = 315

 Score = 30.8 bits (70), Expect = 5.4
 Identities = 15/53 (28%), Positives = 17/53 (32%), Gaps = 3/53 (5%)

Query: 1   MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN 53
            S+SS  P P   Q   PP +          S PPG   P P       P   
Sbjct: 207 PSDSSLPPAPSSFQSDTPPPSPESPT---NPSPPPGPAAPPPPPVQQVPPLST 256


>gnl|CDD|221012 pfam11169, DUF2956, Protein of unknown function (DUF2956).  This
           family of proteins with unknown function appears to be
           restricted to Gammaproteobacteria.
          Length = 103

 Score = 28.8 bits (65), Expect = 5.4
 Identities = 14/39 (35%), Positives = 18/39 (46%)

Query: 407 EHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDET 445
           E+K +QK K  E  K RKQ +K K          D  E+
Sbjct: 38  EYKKQQKAKAREADKARKQQLKAKQRQAANDDEEDTIES 76


>gnl|CDD|216368 pfam01213, CAP_N, Adenylate cyclase associated (CAP) N terminal. 
          Length = 313

 Score = 30.6 bits (69), Expect = 5.4
 Identities = 16/65 (24%), Positives = 23/65 (35%), Gaps = 5/65 (7%)

Query: 3   NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQ-NPQENLTA-LQRA 60
           +SS    PPPP    PP       +               G    + N  E +T+ L++ 
Sbjct: 226 SSSAPSAPPPPPPPPPPSVP---TISNSVESASSDSKGGRGAVFAELNKGEGITSGLKKV 282

Query: 61  IDSMK 65
            D MK
Sbjct: 283 TDDMK 287


>gnl|CDD|132062 TIGR03017, EpsF, chain length determinant protein EpsF.  Sequences
           in this family of proteins are members of the chain
           length determinant family (pfam02706) which includes the
           wzc protein from E. coli. This family of proteins is
           homologous to the EpsF protein of the methanolan
           biosynthesis operon of Methylobacillus species strain
           12S. The distribution of this protein appears to be
           restricted to a subset of exopolysaccharide operons
           containing a syntenic grouping of genes including a
           variant of the EpsH exosortase protein. Exosortase has
           been proposed to be involved in the targeting and
           processing of proteins containing the PEP-CTERM domain
           to the exopolysaccharide layer.
          Length = 444

 Score = 30.5 bits (69), Expect = 6.0
 Identities = 34/137 (24%), Positives = 54/137 (39%), Gaps = 26/137 (18%)

Query: 186 SKLTNIPKPEGLDPLI---ILQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELR 242
           SK       + L  +I   I+Q  +  +A   E ++ EL+  L    P++ R +AEI   
Sbjct: 236 SKEGGSSGKDALPEVIANPIIQNLKTDIA-RAESKLAELSQRLGPNHPQYKRAQAEIN-- 292

Query: 243 ALKVLNFQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAE 302
                + + QL AE+              +V    R  +Q  +EA   E LE Q+    E
Sbjct: 293 -----SLKSQLNAEIKKVTS---------SVGTNSRILKQ--REAELREALENQKAKVLE 336

Query: 303 RKKRQKHQEYITTVLQH 319
             +    Q    +VLQ 
Sbjct: 337 LNR----QRDEMSVLQR 349


>gnl|CDD|130324 TIGR01257, rim_protein, retinal-specific rim ABC transporter.  This
            model describes the photoreceptor protein (rim protein)
            in eukaryotes. It is a member of the ABC transporter
            superfamily. Rim protein is a membrane glycoprotein that
            is localized in the photoreceptor outer segment discs.
            Mutations at its genetic locus are implicated in
            recessive Stargardt's disease [Transport and binding
            proteins, Other].
          Length = 2272

 Score = 31.1 bits (70), Expect = 6.0
 Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 1/55 (1%)

Query: 118  PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
               A G Q KR       P   P    G  P       P Q    P + QPPP+P
Sbjct: 1284 SLFAGGAQQKRENANLRHPCSGPTEKAGQTPQASHTCSPGQPAAHP-EGQPPPEP 1337


>gnl|CDD|220600 pfam10147, CR6_interact, Growth arrest and DNA-damage-inducible
           proteins-interacting protein 1.  Members of this family
           of proteins act as negative regulators of G1 to S cell
           cycle phase progression by inhibiting cyclin-dependent
           kinases. Inhibitory effects are additive with GADD45
           proteins but occur also in the absence of GADD45
           proteins. Furthermore, they act as a repressor of the
           orphan nuclear receptor NR4A1 by inhibiting AB
           domain-mediated transcriptional activity.
          Length = 217

 Score = 30.2 bits (68), Expect = 6.2
 Identities = 19/79 (24%), Positives = 39/79 (49%), Gaps = 15/79 (18%)

Query: 347 NAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVK 406
            A+K +++++ R  KER  RL+AE  E +   +D +  +                 +M++
Sbjct: 141 RAQKRKREQKARAAKERKERLVAEAREHFGYWVDPRDPR---------------FQEMLQ 185

Query: 407 EHKMEQKKKQDEESKKRKQ 425
           + + E+KKK  E  ++ K+
Sbjct: 186 QKEKEEKKKVKEAKRREKE 204


>gnl|CDD|177464 PHA02682, PHA02682, ORF080 virion core protein; Provisional.
          Length = 280

 Score = 30.2 bits (67), Expect = 6.2
 Identities = 16/40 (40%), Positives = 20/40 (50%), Gaps = 7/40 (17%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQ-PPPQ 171
           PS  Q PP       P+P  +P P  A+P+ L  Q PPP 
Sbjct: 142 PSTRQCPPAP-----PLPTPKPAP-AAKPIFLHNQLPPPD 175


>gnl|CDD|115072 pfam06391, MAT1, CDK-activating kinase assembly factor MAT1.  MAT1
           is an assembly/targeting factor for cyclin-dependent
           kinase-activating kinase (CAK), which interacts with the
           transcription factor TFIIH. The domain found to the
           N-terminal side of this domain is a C3HC4 RING finger.
          Length = 200

 Score = 30.1 bits (68), Expect = 6.4
 Identities = 24/109 (22%), Positives = 50/109 (45%), Gaps = 5/109 (4%)

Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKK 383
            +Y + N+  IMR NK  +      E EQ  E+E+  KE  R  + ++E+  +   ++ K
Sbjct: 71  DQYEKENKDSIMR-NKRRLTREQ-EELEQALEEEKEMKEEKRLHLQKEEQEQKMAKEKDK 128

Query: 384 DKRLAFLLSQTDEYISNLTQMVKEH--KMEQKKKQDEESKKRKQSVKQK 430
            + +   L  ++   + +    K+   ++E + ++ E  K+   S   K
Sbjct: 129 -QEIIDELETSNLPANVIIAQHKKQSKQLESQVEKLERKKRVTFSTGIK 176


>gnl|CDD|235206 PRK04031, PRK04031, DNA primase; Provisional.
          Length = 408

 Score = 30.6 bits (70), Expect = 6.4
 Identities = 16/94 (17%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 1152 LTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
            LT+KE  KA+ + V  +   EE  ++ +   +  + ++ + + E     +  +KE     
Sbjct: 251  LTKKEIAKALRNKVPVEQYLEELGKKAQKAAEKVKEEEEKPEKEPAEQPEPEEKEPAPVP 310

Query: 1212 EKDQAKLKKTLKKIM---RVVIKYTDSDGRVLSE 1242
             + +  +++ +K++       +   D +  V+ E
Sbjct: 311  AEKEETVREHIKELKGTLEARL--LDENWNVIKE 342


>gnl|CDD|237862 PRK14948, PRK14948, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 620

 Score = 30.7 bits (70), Expect = 6.4
 Identities = 18/72 (25%), Positives = 27/72 (37%), Gaps = 2/72 (2%)

Query: 2   SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
            ++S +   PPP Q+ PP      P+  P +  P    P P   P      +    Q   
Sbjct: 515 GSASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPP--PPTATQASSNAPAQIPA 572

Query: 62  DSMKEQGLEEDP 73
           DS     + E+P
Sbjct: 573 DSSPPPPIPEEP 584


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 30.8 bits (69), Expect = 6.5
 Identities = 14/59 (23%), Positives = 25/59 (42%), Gaps = 11/59 (18%)

Query: 1165 VEYDDEEEEEEEEVRSKRKGKRRKKT-------EDDDEEPSTSKKRKKEKEKDREKDQA 1216
            V  D   ++E +    K   + R+K        E++  E    K+RK +K  +  +DQ 
Sbjct: 100  VNEDAALDKESK----KTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMDEDVEDQG 154



 Score = 30.4 bits (68), Expect = 8.1
 Identities = 16/67 (23%), Positives = 34/67 (50%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMR 1227
            D EEE+ E++VR +RK K+  +  +D    S     ++ +     +++++ +  L+K   
Sbjct: 127  DVEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVSDVEESEFVTSLENESEEELDLEKDDG 186

Query: 1228 VVIKYTD 1234
              I +T 
Sbjct: 187  EDISHTY 193


>gnl|CDD|236912 PRK11448, hsdR, type I restriction enzyme EcoKI subunit R;
           Provisional.
          Length = 1123

 Score = 30.7 bits (70), Expect = 6.7
 Identities = 24/102 (23%), Positives = 37/102 (36%), Gaps = 22/102 (21%)

Query: 286 EARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYH 345
           E +A EK + Q   EA++++    +              E     Q    +L +      
Sbjct: 159 ELQAREKAQSQALAEAQQQELVALEGLA----------AELEEKQQELEAQLEQL----- 203

Query: 346 ANAEKEQKKEQERIEKERMRRLMAE-----DEEGYRKLIDQK 382
              EK  +  QER +K +     A       EE  R LIDQ+
Sbjct: 204 --QEKAAETSQERKQKRKEITDQAAKRLELSEEETRILIDQQ 243



 Score = 30.3 bits (69), Expect = 8.2
 Identities = 22/106 (20%), Positives = 50/106 (47%), Gaps = 6/106 (5%)

Query: 346 ANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMV 405
            N     ++E   ++++    L A ++   + L + ++ + +A L     E      ++ 
Sbjct: 141 ENLLHALQQEVLTLKQQL--ELQAREKAQSQALAEAQQQELVA-LEGLAAELEEKQQEL- 196

Query: 406 KEHKMEQKKKQDEE-SKKRKQSVKQKLMDTDGKVTLDQDETSQLTD 450
            E ++EQ +++  E S++RKQ  K+       ++ L ++ET  L D
Sbjct: 197 -EAQLEQLQEKAAETSQERKQKRKEITDQAAKRLELSEEETRILID 241


>gnl|CDD|165245 PHA02934, PHA02934, Hypothetical protein; Provisional.
          Length = 253

 Score = 30.0 bits (67), Expect = 7.2
 Identities = 14/27 (51%), Positives = 17/27 (62%)

Query: 573 NNNLNGILADEMGLGKTIQTIALITYL 599
           NN +N IL D  GLG  + TI+ IT L
Sbjct: 158 NNEVNTILMDNKGLGVRLATISFITEL 184


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
            family consists of several hypothetical bacterial
            proteins of around 200 residues in length. The function
            of this family is unknown.
          Length = 214

 Score = 29.7 bits (67), Expect = 7.2
 Identities = 14/56 (25%), Positives = 29/56 (51%), Gaps = 1/56 (1%)

Query: 1159 KAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
            K  DD    + EE +EEE+  +  + K  K   + ++E S  +  ++E E+  +++
Sbjct: 52   KKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEEN-EEEDEESSDEN 106


>gnl|CDD|213341 cd05392, RasGAP_Neurofibromin_like, Ras-GTPase Activating Domain of
            proteins similar to neurofibromin.  Neurofibromin-like
            proteins include the Saccharomyces cerevisiae RasGAP
            proteins Ira1 and Ira2, the closest homolog of
            neurofibromin, which is responsible for the human
            autosomal dominant disease neurofibromatosis type I
            (NF1). The RasGAP Ira1/2 proteins are negative regulators
            of the Ras-cAMP signaling pathway and conserved from
            yeast to human. In yeast Ras proteins are activated by
            GEFs, and inhibited by two GAPs, Ira1 and Ira2. Ras
            proteins activate the cAMP/protein kinase A (PKA)
            pathway, which controls metabolism, stress resistance,
            growth, and meiosis. Recent studies showed that the kelch
            proteins Gpb1 and Gpb2 inhibit Ras activity via
            association with Ira1 and Ira2. Gpb1/2 bind to a
            conserved C-terminal domain of Ira1/2, and loss of Gpb1/2
            results in a destabilization of Ira1 and Ira2, leading to
            elevated levels of Ras2-GTP and uninhibited cAMP-PKA
            signaling. Since the Gpb1/2 binding domain on Ira1/2 is
            conserved in the human neurofibromin protein, the studies
            suggest that an analogous signaling mechanism may
            contribute to the neoplastic development of NF1.
          Length = 317

 Score = 30.3 bits (69), Expect = 7.2
 Identities = 21/89 (23%), Positives = 32/89 (35%), Gaps = 20/89 (22%)

Query: 1227 RVVIKYTDSDG-----RVLSEPFIKLPSRKELPDYYEVIDRPMDI----------KKILG 1271
            R++  Y  S G     +VL     ++    +  DY+EV     D            K   
Sbjct: 79   RLLTLYAKSVGNKYLRKVLRPLLTEI---VDNKDYFEVEKIKPDDENLEENADLLMKYAQ 135

Query: 1272 RIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
             + D    SVD+L   F+ +C    IY  
Sbjct: 136  MLLDSITDSVDQLPPSFRYIC--NTIYES 162


>gnl|CDD|226193 COG3667, PcoB, Uncharacterized protein involved in copper
           resistance [Inorganic ion transport and metabolism].
          Length = 321

 Score = 30.2 bits (68), Expect = 7.3
 Identities = 8/45 (17%), Positives = 10/45 (22%), Gaps = 1/45 (2%)

Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQG 177
           P   +  PM       MP      +   P P    P         
Sbjct: 32  PHAHEHAPMD-APHPAMPGMDHHAHSKMPGPEMAAPQMDHGAMPH 75


>gnl|CDD|240653 cd12176, PGDH_3, Phosphoglycerate dehydrogenases, NAD-binding and
           catalytic domains.  Phosphoglycerate dehydrogenases
           (PGDHs) catalyze the initial step in the biosynthesis of
           L-serine from D-3-phosphoglycerate. PGDHs come in 3
           distinct structural forms, with this first group being
           related to 2-hydroxy acid dehydrogenases, sharing
           structural similarity to formate and glycerate
           dehydrogenases. PGDH in E. coli and Mycobacterium
           tuberculosis form tetramers, with subunits containing a
           Rossmann-fold NAD binding domain. Formate/glycerate and
           related dehydrogenases of the D-specific 2-hydroxyacid
           dehydrogenase superfamily include groups such as formate
           dehydrogenase, glycerate dehydrogenase, L-alanine
           dehydrogenase, and S-adenosylhomocysteine hydrolase.
           Despite often low sequence identity, these proteins
           typically have a characteristic arrangement of 2 similar
           subdomains of the alpha/beta Rossmann fold NAD+ binding
           form. The NAD+ binding domain is inserted within the
           linear sequence of the mostly N-terminal catalytic
           domain, which has a similar domain structure to the
           internal NAD binding domain. Structurally, these domains
           are connected by extended alpha helices and create a
           cleft in which NAD is bound, primarily to the C-terminal
           portion of the 2nd (internal) domain. Some related
           proteins have similar structural subdomain but with a
           tandem arrangement of the catalytic and NAD-binding
           subdomains in the linear sequence.
          Length = 304

 Score = 30.2 bits (69), Expect = 7.3
 Identities = 13/26 (50%), Positives = 16/26 (61%)

Query: 746 FNAPFATTGEKVELNEEETILIIRRL 771
           FNAPF+ T    EL   E I++ RRL
Sbjct: 91  FNAPFSNTRSVAELVIGEIIMLARRL 116


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
            the C subunit of DNA polymerase delta. It carries the
            essential residues for binding to the Pol1 subunit of
            polymerase alpha, from residues 293-332, which are
            characterized by the motif D--G--VT, referred to as the
            DPIM motif. The first 160 residues of the protein form
            the minimal domain for binding to the B subunit, Cdc1, of
            polymerase delta, the final 10 C-terminal residues,
            362-372, being the DNA sliding clamp, PCNA, binding
            motif.
          Length = 427

 Score = 30.2 bits (68), Expect = 7.5
 Identities = 27/100 (27%), Positives = 42/100 (42%), Gaps = 26/100 (26%)

Query: 1165 VEYDDEEEEEEEEVRSKRKGKRRKKTEDDD-------------------EEPSTSKKRKK 1205
             E  D EEE EE+ + KR  KR KK  +D+                   EEP      KK
Sbjct: 278  GERSDSEEETEEKEKEKR--KRLKKMMEDEDEDEEMEIVPESPVEEEESEEPEPPPLPKK 335

Query: 1206 EKEKDREKDQAKLKKTLKKIMRVVIK---YTDSDGRVLSE 1242
            E+EK+         +   +  R V+K   + D +G ++++
Sbjct: 336  EEEKEEVTVSPDGGRRRGR--RRVMKKKTFKDEEGYLVTK 373


>gnl|CDD|225689 COG3147, DedD, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 226

 Score = 29.9 bits (67), Expect = 7.9
 Identities = 20/134 (14%), Positives = 37/134 (27%), Gaps = 5/134 (3%)

Query: 94  SAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ 153
             Q  +   +++   +     P  P   +  +  +  G  +   + P  +  P       
Sbjct: 46  KPQGDRDEPRVLPAVVQVVALPTQPPEGVAQEI-QDAGDAAAASVDPQPVAQPPVESTPA 104

Query: 154 PMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQERENRVALN 213
            +P  AQ     + P   P        +   + K    P         ++Q      AL 
Sbjct: 105 GVPVAAQTPKPVKPPKQPPAGAVPAKPTPKPEPKPVAEPAAAPTGQAFVVQ----LGALK 160

Query: 214 IERRIEELNGSLTS 227
              R  EL   L  
Sbjct: 161 NADRANELVAKLRG 174


>gnl|CDD|225657 COG3115, ZipA, Cell division protein [Cell division and chromosome
           partitioning].
          Length = 324

 Score = 30.2 bits (68), Expect = 7.9
 Identities = 18/127 (14%), Positives = 27/127 (21%), Gaps = 14/127 (11%)

Query: 67  QGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
              E +   Q                             A+      QP  P L      
Sbjct: 75  FTQEHEAARQSPQHQYQPEYASAQIKIPVPQPPQISDPPAH-----PQPTQPALDQ---- 125

Query: 127 KRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPM--PLQQQPPPQPHQQQGHISSQIK 184
              E  P   + P +    P P P     P  A     P   +   QP +         +
Sbjct: 126 ---EQPPEEARQPVLPQEAPAPQPVHSAAPQPAVQTVQPAVPEQQVQPEEVVEPAPEVKR 182

Query: 185 QSKLTNI 191
             +   +
Sbjct: 183 PPRKDTV 189


>gnl|CDD|148208 pfam06465, DUF1087, Domain of Unknown Function (DUF1087).  Members of
            this family are found in various chromatin remodelling
            factors and transposases. Their exact function is, as
            yet, unknown.
          Length = 66

 Score = 27.5 bits (61), Expect = 8.1
 Identities = 10/32 (31%), Positives = 16/32 (50%)

Query: 1125 FEAKEEEKALHMGRGSRQRKQVDYTDSLTEKE 1156
            +E    E+   +G+G R RKQV+Y +      
Sbjct: 35   YEQLRAEEEKALGKGKRSRKQVNYAEEDDIDG 66


>gnl|CDD|225603 COG3061, OapA, Cell envelope opacity-associated protein A [Cell
           envelope biogenesis, outer membrane].
          Length = 242

 Score = 29.9 bits (67), Expect = 8.3
 Identities = 19/82 (23%), Positives = 29/82 (35%), Gaps = 7/82 (8%)

Query: 112 RNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ 171
            N P   Q+A+  Q    +  P    MP      P P+    P P+    MP        
Sbjct: 9   DNPPAQNQMAV-EQMIEEQDAPQAETMPGNFEAKP-PLAEVWPAPDNNVFMPPL-----P 61

Query: 172 PHQQQGHISSQIKQSKLTNIPK 193
           P  ++G I + I      ++P 
Sbjct: 62  PMHRRGIIVAPIMLVAQAHLPS 83


>gnl|CDD|236138 PRK07994, PRK07994, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 647

 Score = 30.2 bits (69), Expect = 8.4
 Identities = 22/92 (23%), Positives = 30/92 (32%), Gaps = 16/92 (17%)

Query: 94  SAQVQQLRFQIMAYRLLARNQ-PLTPQLAMGVQGK--RM----------EGVPSGPQMPP 140
             +  QL +Q +   L+ R   PL P   MGV+    RM          E         P
Sbjct: 321 PPEDVQLYYQTL---LIGRKDLPLAPDRRMGVEMTLLRMLAFHPAAPLPEPEVPPQSAAP 377

Query: 141 MSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
            +       P +   P QA  +P      PQ 
Sbjct: 378 AASAQATAAPTAAVAPPQAPAVPPPPASAPQQ 409


>gnl|CDD|223526 COG0449, GlmS, Glucosamine 6-phosphate synthetase, contains
            amidotransferase and phosphosugar isomerase domains [Cell
            envelope biogenesis, outer membrane].
          Length = 597

 Score = 30.2 bits (69), Expect = 8.5
 Identities = 11/56 (19%), Positives = 25/56 (44%), Gaps = 8/56 (14%)

Query: 1087 QRIDAERRKEQGKKSRLIEVSELPDWL---IKEDEEIEQWAFEAKEEEKALHMGRG 1139
              I  E  +       + E+ +LP+ +   +  +E+I++ A    + +    +GRG
Sbjct: 414  GTISEEEERSL-----IKELQKLPNHIPKVLAAEEKIKELAKRLADAKDFFFLGRG 464


>gnl|CDD|220603 pfam10152, DUF2360, Predicted coiled-coil domain-containing protein
           (DUF2360).  This is the conserved 140 amino acid region
           of a family of proteins conserved from nematodes to
           humans. One C. elegans member is annotated as a
           Daf-16-dependent longevity protein 1 but this could not
           be confirmed. The function is unknown.
          Length = 147

 Score = 28.9 bits (65), Expect = 8.6
 Identities = 19/79 (24%), Positives = 29/79 (36%), Gaps = 23/79 (29%)

Query: 3   NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAID 62
            ++  P PPPP + +           A    PP +P   P +   + P EN         
Sbjct: 66  ITNGGPPPPPPARAE-----------AASPPPPEAPAEPPAEPEPEAPAENTVT------ 108

Query: 63  SMKEQGLEEDPRYQKLIEM 81
                 + +DPRY K  +M
Sbjct: 109 ------VAKDPRYAKYFKM 121


>gnl|CDD|235319 PRK04914, PRK04914, ATP-dependent helicase HepA; Validated.
          Length = 956

 Score = 30.2 bits (69), Expect = 8.6
 Identities = 10/11 (90%), Positives = 11/11 (100%)

Query: 580 LADEMGLGKTI 590
           LADE+GLGKTI
Sbjct: 174 LADEVGLGKTI 184


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 30.4 bits (69), Expect = 8.8
 Identities = 17/107 (15%), Positives = 45/107 (42%), Gaps = 12/107 (11%)

Query: 287 ARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKD-------FKEYHRNNQARIMRLNK 339
           A A  K++++++   +    ++ +  IT   +  K         +E +   +  +  L +
Sbjct: 391 AEALSKVKEEERPREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKR 450

Query: 340 AVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGY---RKLIDQKK 383
            +      +E E+ + + R +  + R + A D       ++L ++KK
Sbjct: 451 EIEKLE--SELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKK 495


>gnl|CDD|233045 TIGR00601, rad23, UV excision repair protein Rad23.  All proteins
           in this family for which functions are known are
           components of a multiprotein complex used for targeting
           nucleotide excision repair to specific parts of the
           genome. In humans, Rad23 complexes with the XPC protein.
           This family is based on the phylogenomic analysis of JA
           Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 378

 Score = 30.2 bits (68), Expect = 8.8
 Identities = 12/57 (21%), Positives = 16/57 (28%), Gaps = 9/57 (15%)

Query: 4   SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
            +    PP       P          P   PP SP      AP    +E   + + A
Sbjct: 80  GTGKVAPPAATPTSAP---------TPTPSPPASPASGMSAAPASAVEEKSPSEESA 127


>gnl|CDD|233467 TIGR01554, major_cap_HK97, phage major capsid protein, HK97 family.
            This model family represents the major capsid protein
            component of the heads (capsids) of bacteriophage HK97,
            phi-105, P27, and related phage. This model represents
            one of several analogous families lacking detectable
            sequence similarity. The gene encoding this component is
            typically located in an operon encoding the small and
            large terminase subunits, the portal protein and the
            prohead or maturation protease [Mobile and
            extrachromosomal element functions, Prophage functions].
          Length = 384

 Score = 30.0 bits (68), Expect = 9.0
 Identities = 15/93 (16%), Positives = 35/93 (37%), Gaps = 3/93 (3%)

Query: 1150 DSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEK 1209
              LTE E L   ++     D  +EE +++ ++     R +   D+ E   +   +    +
Sbjct: 16   RKLTEDEKLAEAEEEKAEYDALKEEIDKLDAEID---RLEELLDELEAKPAASGEGGGGE 72

Query: 1210 DREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSE 1242
            + E++        +  +R        + + LS 
Sbjct: 73   EEEEEAKAEAAEFRAYLRGGDDALAEERKALST 105


>gnl|CDD|240576 cd12932, RRP7_like, RRP7 domain ribosomal RNA-processing protein 7
            (Rrp7p), ribosomal RNA-processing protein 7 homolog A
            (Rrp7A), and similar proteins.  This CD corresponds to
            the RRP7 domain of Rrp7p and Rrp7A. Rrp7p is encoded by
            YCL031C gene from Saccharomyces cerevisiae. It is an
            essential yeast protein involved in pre-rRNA processing
            and ribosome assembly, and is speculated to be required
            for correct assembly of rpS27 into the pre-ribosomal
            particle. Rrp7A, also termed gastric cancer antigen Zg14,
            is the Rrp7p homolog mainly found in Metazoans. The
            cellular function of Rrp7A remains unclear currently.
            Both Rrp7p and Rrp7A harbor an N-terminal RNA recognition
            motif (RRM), also termed RBD (RNA binding domain) or RNP
            (ribonucleoprotein domain), and a C-terminal RRP7 domain.
          Length = 118

 Score = 28.4 bits (64), Expect = 9.1
 Identities = 29/103 (28%), Positives = 48/103 (46%), Gaps = 25/103 (24%)

Query: 1147 DYTDSLTEKEWLKA-IDDGVE-YDDEEEEEEEE----------------VRSKRKGKRRK 1188
            +Y  S  +   L++ +D+ +E +D  EEEE+EE                 R  RKGK  +
Sbjct: 8    EYKRSRPDPAELQSEVDEYMEEFDKREEEEKEEAKEARNEPDEDGFVTVTRGGRKGKTAR 67

Query: 1189 KTEDDDEEPSTSKKRKKEKEKD-------REKDQAKLKKTLKK 1224
            +   + +     KK+KK+KE +       REK + +L +  KK
Sbjct: 68   EEAVEAKAKEKEKKKKKKKELEDFYRFQIREKKKEELAELRKK 110


>gnl|CDD|128795 smart00521, CBF, CCAAT-Binding transcription Factor. 
          Length = 62

 Score = 27.4 bits (61), Expect = 9.5
 Identities = 19/44 (43%), Positives = 23/44 (52%), Gaps = 9/44 (20%)

Query: 271 VNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKK-----RQKH 309
           VN K Y R  R+  ++ARA  KLE Q K+  ERK      R  H
Sbjct: 8   VNAKQYHRILRR--RQARA--KLEAQGKLPKERKPYLHESRHLH 47


>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62. 
          Length = 217

 Score = 29.4 bits (66), Expect = 9.6
 Identities = 14/62 (22%), Positives = 23/62 (37%), Gaps = 7/62 (11%)

Query: 1171 EEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKK-------EKEKDREKDQAKLKKTLK 1223
            E E+ +  + K   +   K    D+     K   K       +K   +EK + K K   K
Sbjct: 17   ESEKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAERVKKLHSQEKKEEKKKPKKK 76

Query: 1224 KI 1225
            K+
Sbjct: 77   KV 78


>gnl|CDD|150091 pfam09310, PD-C2-AF1, POU domain, class 2, associating factor 1.
           Members of this family are transcriptional coactivators
           that specifically associate with either OCT1 or OCT2,
           through recognition of their POU domains. They are
           essential for the response of B-cells to antigens and
           required for the formation of germinal centres.
          Length = 264

 Score = 29.8 bits (66), Expect = 9.8
 Identities = 12/50 (24%), Positives = 20/50 (40%), Gaps = 6/50 (12%)

Query: 114 QPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMP 163
            PL+   A  +Q +        PQ  P+      P+P  +P P + +  P
Sbjct: 187 PPLSACPANTLQYQPASSTLPAPQFLPL------PIPIPEPAPQEEEDAP 230


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of the
            RNA polymerase II associated Paf1 complex. The Paf1
            complex functions during the elongation phase of
            transcription in conjunction with Spt4-Spt5 and
            Spt16-Pob3i.
          Length = 431

 Score = 30.1 bits (68), Expect = 9.8
 Identities = 12/46 (26%), Positives = 24/46 (52%), Gaps = 3/46 (6%)

Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREK 1213
            D++E+EEEE+   + +    ++ ED +EE S S++    +      
Sbjct: 371  DEDEDEEEEQRSDEHEE---EEGEDSEEEGSQSREDGSSESSSDVG 413


>gnl|CDD|217752 pfam03833, PolC_DP2, DNA polymerase II large subunit DP2. 
          Length = 852

 Score = 30.1 bits (68), Expect = 10.0
 Identities = 16/45 (35%), Positives = 23/45 (51%), Gaps = 4/45 (8%)

Query: 1143 RKQVDYTDSLTEK--EWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
             K + YTD L  +  +WLK +    E  D+E+ EE+   SK   K
Sbjct: 248  PKILKYTDKLGIEGWDWLKDLSKKKE--DKEDTEEKVAVSKPSDK 290


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.313    0.131    0.370 

Gapped
Lambda     K      H
   0.267   0.0724    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 68,798,036
Number of extensions: 7088560
Number of successful extensions: 16542
Number of sequences better than 10.0: 1
Number of HSP's gapped: 13221
Number of HSP's successfully gapped: 1051
Length of query: 1331
Length of database: 10,937,602
Length adjustment: 109
Effective length of query: 1222
Effective length of database: 6,103,016
Effective search space: 7457885552
Effective search space used: 7457885552
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 65 (28.8 bits)