RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy2964
         (208 letters)



>gnl|CDD|233176 TIGR00898, 2A0119, cation transport protein.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 505

 Score = 98.9 bits (247), Expect = 3e-24
 Identities = 45/149 (30%), Positives = 84/149 (56%), Gaps = 1/149 (0%)

Query: 60  NWVCDGSSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKD 119
           + VC+ +  + +T+S FF+G LLG F+  +++DR+GR   +L S +V+ +   LT FS +
Sbjct: 120 DLVCEDAWKVDLTQSCFFVGVLLGSFVFGYLSDRFGRKKVLLLSTLVTAVSGVLTAFSPN 179

Query: 120 VVLFSLSRFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYY 179
             +F + R L G+G    ++   ++  E +  K R    T    +F+++  V LP +AY+
Sbjct: 180 YTVFLVFRLLVGMGIGGIWVQAVVLNTEFLPKKQRAIVGTLIQ-VFFSLGLVLLPLVAYF 238

Query: 180 LADWQWISVITIFPLIVGLIVAIFTPESA 208
           + DW+W+ +    P  +  +++ F PES 
Sbjct: 239 IPDWRWLQLAVSLPTFLFFLLSWFVPESP 267


>gnl|CDD|119392 cd06174, MFS, The Major Facilitator Superfamily (MFS) is a large
           and diverse group of secondary transporters that
           includes uniporters, symporters, and antiporters. MFS
           proteins facilitate the transport across cytoplasmic or
           internal membranes of a variety of substrates including
           ions, sugar phosphates, drugs, neurotransmitters,
           nucleosides, amino acids, and peptides. They do so using
           the electrochemical potential of the transported
           substrates. Uniporters transport a single substrate,
           while symporters and antiporters transport two
           substrates in the same or in opposite directions,
           respectively, across membranes. MFS proteins are
           typically 400 to 600 amino acids in length, and the
           majority contain 12 transmembrane alpha helices (TMs)
           connected by hydrophilic loops. The N- and C-terminal
           halves of these proteins display weak similarity and may
           be the result of a gene duplication/fusion event. Based
           on kinetic studies and the structures of a few bacterial
           superfamily members, GlpT (glycerol-3-phosphate
           transporter), LacY (lactose permease), and EmrD
           (multidrug transporter), MFS proteins are thought to
           function through a single substrate binding site,
           alternating-access mechanism involving a rocker-switch
           type of movement. Bacterial members function primarily
           for nutrient uptake, and as drug-efflux pumps to confer
           antibiotic resistance. Some MFS proteins have medical
           significance in humans such as the glucose transporter
           Glut4, which is impaired in type II diabetes, and
           glucose-6-phosphate transporter (G6PT), which causes
           glycogen storage disease when mutated.
          Length = 352

 Score = 78.1 bits (193), Expect = 3e-17
 Identities = 34/130 (26%), Positives = 61/130 (46%), Gaps = 1/130 (0%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           S F LG  LG  +  +++DR+GR   +L   ++  LG  L  F+  + L  + RFL G+G
Sbjct: 41  SAFSLGYALGSLLAGYLSDRFGRRRVLLLGLLLFALGSLLLAFASSLWLLLVGRFLLGLG 100

Query: 134 HFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLADWQWISVITIFP 193
               +     ++ E   PK R  A+   F   + +  +  P +   LA+      + +  
Sbjct: 101 GGALYPAAAALIAEWFPPKERGRALGL-FSAGFGLGALLGPLLGGLLAESLGWRWLFLIL 159

Query: 194 LIVGLIVAIF 203
            I+GL++A+ 
Sbjct: 160 AILGLLLALL 169



 Score = 60.8 bits (148), Expect = 4e-11
 Identities = 32/132 (24%), Positives = 56/132 (42%), Gaps = 2/132 (1%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVL-GSHVVSFLGVALTPFSKDVVLFSLSRFLTGV 132
           S+F LG +LG  +   ++DR GR   +L    +++ LG+ L   +  + L  ++  L G 
Sbjct: 218 SLFGLGGILGALLGGLLSDRLGRRRLLLLIGLLLAALGLLLLALAPSLALLLVALLLLGF 277

Query: 133 GHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLADWQWISVITIF 192
           G   AF     +  E   P+ R  A    F  F ++     P +A  L D      + + 
Sbjct: 278 GLGFAFPALLTLASELAPPEARGTASGL-FNTFGSLGGALGPLLAGLLLDTGGYGGVFLI 336

Query: 193 PLIVGLIVAIFT 204
              + L+ A+  
Sbjct: 337 LAALALLAALLL 348



 Score = 38.1 bits (89), Expect = 0.002
 Identities = 26/139 (18%), Positives = 58/139 (41%), Gaps = 2/139 (1%)

Query: 69  LAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFS-KDVVLFSLSR 127
           L +  + F LG+LLG  +   +A+  G     L   ++  L   L  F  + ++L +L+ 
Sbjct: 125 LGLFSAGFGLGALLGPLLGGLLAESLGWRWLFLILAILGLLLALLLLFLLRLLLLLALAF 184

Query: 128 FLTGVGHFNAFIFYYIIVLECVG-PKWRTFAMTFPFLIFYTVSEVALPWIAYYLADWQWI 186
           FL   G++    +  + + E +G        +   F +   +  +    ++  L   + +
Sbjct: 185 FLLSFGYYGLLTYLPLYLQEVLGLSAAEAGLLLSLFGLGGILGALLGGLLSDRLGRRRLL 244

Query: 187 SVITIFPLIVGLIVAIFTP 205
            +I +    +GL++    P
Sbjct: 245 LLIGLLLAALGLLLLALAP 263


>gnl|CDD|233175 TIGR00895, 2A0115, benzoate transport.  [Transport and binding
           proteins, Carbohydrates, organic alcohols, and acids].
          Length = 398

 Score = 68.2 bits (167), Expect = 1e-13
 Identities = 35/140 (25%), Positives = 57/140 (40%), Gaps = 7/140 (5%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           S   +G   G      +ADR GR   +L S ++  +   L   + +V    + RFL G+G
Sbjct: 59  SAGLIGMAFGALFFGPLADRIGRRRVLLWSILLFSVFTLLCALATNVTQLLILRFLAGLG 118

Query: 134 HFNAFIFYYIIVLECVGPKWRTFAMTFPFL---IFYTVSEVALPW-IAYYLADWQWISVI 189
                     +V E    ++R  A+   F    I   V      W I  +   W+ +  +
Sbjct: 119 LGGLMPNLNALVSEYAPKRFRGTAVGLMFCGYPIGAAVGGFLAGWLIPVF--GWRSLFYV 176

Query: 190 -TIFPLIVGLIVAIFTPESA 208
             I PL++ L++  F PES 
Sbjct: 177 GGIAPLLLLLLLMRFLPESI 196



 Score = 29.6 bits (67), Expect = 1.00
 Identities = 20/52 (38%), Positives = 29/52 (55%), Gaps = 4/52 (7%)

Query: 75  IFFLGSLLGGFILSWVADRYG-RITAV--LGSHVVSFLGVALTPFSKDVVLF 123
           +F  G ++G  I  W+ADR G R+TA+  L   V + L V  T FS  ++L 
Sbjct: 293 LFNFGGVIGSIIFGWLADRLGPRVTALLLLLGAVFAVL-VGSTLFSPTLLLL 343


>gnl|CDD|219516 pfam07690, MFS_1, Major Facilitator Superfamily. 
          Length = 346

 Score = 64.4 bits (157), Expect = 2e-12
 Identities = 28/133 (21%), Positives = 55/133 (41%), Gaps = 4/133 (3%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           + F LG  L   +   ++DR+GR   +L   ++  LG+ L  F+  + L  + R L G+G
Sbjct: 39  TAFSLGYALAQPLAGRLSDRFGRRRVLLIGLLLFALGLLLLLFASSLWLLLVLRVLQGLG 98

Query: 134 HFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLAD---WQWISVIT 190
               F     ++ +   P+ R  A+       + +     P +   LA    W+   +I 
Sbjct: 99  GGALFPAAAALIADWFPPEERGRALGL-LSAGFGLGAALGPLLGGLLASLFGWRAAFLIL 157

Query: 191 IFPLIVGLIVAIF 203
               ++  ++A  
Sbjct: 158 AILALLAAVLAAL 170



 Score = 42.0 bits (99), Expect = 8e-05
 Identities = 19/90 (21%), Positives = 37/90 (41%), Gaps = 3/90 (3%)

Query: 74  SIFFLGSLLGGFILSWVADRYG---RITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLT 130
            +  L   +G  +L  ++DR G   R+   L   +++ LG+AL   ++  +   ++  L 
Sbjct: 244 GLAGLLGAIGRLLLGRLSDRLGRRRRLLLALLLLILAALGLALLSLTESSLWLLVALLLL 303

Query: 131 GVGHFNAFIFYYIIVLECVGPKWRTFAMTF 160
           G G    F     +V +    + R  A   
Sbjct: 304 GFGAGLVFPALNALVSDLAPKEERGTASGL 333



 Score = 27.8 bits (62), Expect = 4.3
 Identities = 24/164 (14%), Positives = 46/164 (28%), Gaps = 31/164 (18%)

Query: 69  LAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTP------------- 115
           L +  + F LG+ LG  +   +A  +G   A L   +++ L   L               
Sbjct: 123 LGLLSAGFGLGAALGPLLGGLLASLFGWRAAFLILAILALLAAVLAALLLPRPPPESKRP 182

Query: 116 ----------------FSKDVVLFSLSRFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMT 159
                             +D VL+ L   L     F A + Y  +  E +G         
Sbjct: 183 KPAEEAPAPLVPAWKLLLRDPVLWLLLALLLFGFAFFALLTYLPLYQEVLG--LSALLAG 240

Query: 160 FPFLIFYTVSEVALPWIAYYLADWQWISVITIFPLIVGLIVAIF 203
               +   +  +    +            + +  L++ L     
Sbjct: 241 LLLGLAGLLGAIGRLLLGRLSDRLGRRRRLLLALLLLILAALGL 284


>gnl|CDD|215702 pfam00083, Sugar_tr, Sugar (and other) transporter. 
          Length = 449

 Score = 54.6 bits (132), Expect = 6e-09
 Identities = 37/151 (24%), Positives = 61/151 (40%), Gaps = 11/151 (7%)

Query: 66  SSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKD--VVLF 123
           +    +  SIF +G L+G      + DR+GR  ++L  +V+  +G  L  F+K     + 
Sbjct: 45  TVLSGLIVSIFSVGCLIGSLFAGKLGDRFGRKKSLLIGNVLFVIGALLQGFAKGKSFYML 104

Query: 124 SLSRFLTG--VGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYT---VSEVALPWIAY 178
            + R + G  VG  +  +  YI   E    K R    +   L       V+ +    +  
Sbjct: 105 IVGRVIVGLGVGGISVLVPMYIS--EIAPKKLRGALGSLYQLGITFGILVAAIIGLGLNK 162

Query: 179 YLADWQWI--SVITIFPLIVGLIVAIFTPES 207
           Y     W     +   P I+ LI  +F PES
Sbjct: 163 YSNSDGWRIPLGLQFVPAILLLIGLLFLPES 193


>gnl|CDD|236927 PRK11551, PRK11551, putative 3-hydroxyphenylpropionic transporter
           MhpT; Provisional.
          Length = 406

 Score = 48.8 bits (117), Expect = 6e-07
 Identities = 40/145 (27%), Positives = 65/145 (44%), Gaps = 17/145 (11%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGV--ALTPFSKDVVLFSLSRFLTG 131
           S   LG L G  +   +ADR GR   ++ S  V+  G+    T  + D     ++R LTG
Sbjct: 57  SAGILGLLPGALLGGRLADRIGRKRILIVS--VALFGLFSLATAQAWDFPSLLVARLLTG 114

Query: 132 VGHFNAFIFYYIIVLECVGPKWRTFAMTF-----PF--LIFYTVSEVALPWIAYYLADWQ 184
           VG   A      +  E VGP+ R  A++      PF   +    S + +  +A   A W+
Sbjct: 115 VGLGGALPNLIALTSEAVGPRLRGTAVSLMYCGVPFGGAL---ASVIGV--LAAGDAAWR 169

Query: 185 WISVI-TIFPLIVGLIVAIFTPESA 208
            I  +  + PL++  ++  + PES 
Sbjct: 170 HIFYVGGVGPLLLVPLLMRWLPESR 194



 Score = 28.4 bits (64), Expect = 2.7
 Identities = 12/57 (21%), Positives = 21/57 (36%)

Query: 76  FFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGV 132
           F +G  LG  ++  + DR      VL  +      +A    +       L+ F  G+
Sbjct: 264 FNIGGALGSLLIGALMDRLRPRRVVLLIYAGILASLAALAAAPSFAGMLLAGFAAGL 320


>gnl|CDD|223553 COG0477, ProP, Permeases of the major facilitator superfamily
           [Carbohydrate transport and metabolism / Amino acid
           transport and metabolism / Inorganic ion transport and
           metabolism / General function prediction only].
          Length = 338

 Score = 46.2 bits (108), Expect = 3e-06
 Identities = 29/138 (21%), Positives = 52/138 (37%), Gaps = 8/138 (5%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPF--SKDVVLFSLSRFLTG 131
           S FFLG  +G  +   + DRYGR   ++   ++  LG  L     +  + L  + R L G
Sbjct: 46  SAFFLGYAIGSLLAGPLGDRYGRRKVLIIGLLLFLLGTLLLALAPNVGLALLLILRLLQG 105

Query: 132 VGHFNAFIFYYIIVLECVGP-KWRTFAMTFPFLIFYTVSEVALPWIAYYLAD-----WQW 185
           +G          ++ E       R  A+    L    +     P +A  L       W+ 
Sbjct: 106 LGGGGLLPVASALLSEWFPEATERGLAVGLVTLGAGALGLALGPLLAGLLLGALLWGWRA 165

Query: 186 ISVITIFPLIVGLIVAIF 203
             ++     ++ LI+ + 
Sbjct: 166 AFLLAALLGLLLLILVLL 183


>gnl|CDD|233172 TIGR00891, 2A0112, putative sialic acid transporter.  [Transport
           and binding proteins, Carbohydrates, organic alcohols,
           and acids].
          Length = 405

 Score = 45.6 bits (108), Expect = 6e-06
 Identities = 27/129 (20%), Positives = 48/129 (37%), Gaps = 4/129 (3%)

Query: 83  GGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVGHFNAFIFYY 142
           G  +     DRYGR   ++ S V+   G     F+   +   ++R + G+G    +    
Sbjct: 63  GALMFGLWGDRYGRRLPMVTSIVLFSAGTLACGFAPGYITMFIARLVIGIGMGGEYGSSA 122

Query: 143 IIVLECVGPKWRTFAMTFPFLIF---YTVSEVALPWIAYYLAD-WQWISVITIFPLIVGL 198
             V+E      R  A       +     V+      +     D W+ +  I+I P+I  L
Sbjct: 123 AYVIESWPKHLRNKASGLLISGYAVGAVVAAQVYSLVVPVWGDGWRALFFISILPIIFAL 182

Query: 199 IVAIFTPES 207
            +    PE+
Sbjct: 183 WLRKNIPEA 191


>gnl|CDD|130366 TIGR01299, synapt_SV2, synaptic vesicle protein SV2.  This model
           describes a tightly conserved subfamily of the larger
           family of sugar (and other) transporters described by
           PFAM model pfam00083. Members of this subfamily include
           closely related forms SV2A and SV2B of synaptic vesicle
           protein from vertebrates and a more distantly related
           homolog (below trusted cutoff) from Drosophila
           melanogaster. Members are predicted to have two sets of
           six transmembrane helices.
          Length = 742

 Score = 42.3 bits (99), Expect = 9e-05
 Identities = 37/164 (22%), Positives = 59/164 (35%), Gaps = 21/164 (12%)

Query: 62  VCDGSSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVV 121
           +C   S   +   I +LG ++G F    +AD+ GR   +L    V+      + F +   
Sbjct: 197 LCIPDSGKGMLGLIVYLGMMVGAFFWGGLADKLGRKQCLLICLSVNGFFAFFSSFVQGYG 256

Query: 122 LFSLSRFLTGVGHFNAF--IFYY---IIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWI 176
            F   R L+G G   A   +F Y    +  E  G       M   F +   +   A+ W 
Sbjct: 257 FFLFCRLLSGFGIGGAIPIVFSYFAEFLAQEKRGEHLSWLCM---FWMIGGIYAAAMAWA 313

Query: 177 -------------AYYLADWQWISVITIFPLIVGLIVAIFTPES 207
                        AY    W+   ++  FP +  +    F PES
Sbjct: 314 IIPHYGWSFQMGSAYQFHSWRVFVIVCAFPCVFAIGALTFMPES 357


>gnl|CDD|225371 COG2814, AraJ, Arabinose efflux permease [Carbohydrate transport
           and metabolism].
          Length = 394

 Score = 41.8 bits (99), Expect = 1e-04
 Identities = 30/141 (21%), Positives = 56/141 (39%), Gaps = 9/141 (6%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           + + LG  LG  +L+ +  R  R   +LG   +  +   L+  +    +  L+R L G+ 
Sbjct: 55  TAYALGVALGAPLLALLTGRLERRRLLLGLLALFIVSNLLSALAPSFAVLLLARALAGLA 114

Query: 134 HFNAFIFYYIIVLECVGPKWRTFAMTFPFLIF--YTVSEVALPWIAYYLAD---WQWI-S 187
           H   +     +    V P  R  A+    L+F   T++ V    +  +L     W+    
Sbjct: 115 HGVFWSIAAALAARLVPPGKRGRALA---LVFTGLTLATVLGVPLGTFLGQLFGWRATFL 171

Query: 188 VITIFPLIVGLIVAIFTPESA 208
            I +  L+  L++    P S 
Sbjct: 172 AIAVLALLALLLLWKLLPPSE 192



 Score = 28.0 bits (63), Expect = 3.6
 Identities = 20/60 (33%), Positives = 27/60 (45%), Gaps = 8/60 (13%)

Query: 76  FFLGSLLGGFILSWVADRYGRIT--AVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
            F+G+LLGG     +ADR  R    A L    ++ L +  T  S  + L  L  FL G  
Sbjct: 260 GFIGNLLGG----RLADRGPRRALIAALLLLALALLALTFTGASPALALALL--FLWGFA 313


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 42.0 bits (98), Expect = 1e-04
 Identities = 19/63 (30%), Positives = 35/63 (55%), Gaps = 5/63 (7%)

Query: 2   FDRGNCNC-----YGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
            DRG  +C     Y   +  SK   +++E  E+ K+E +++  E++ER +E ++ R R R
Sbjct: 555 LDRGYNSCARTDLYFTPLASSKLAKKREEAVEKAKREAEQKAREEREREKEKEKERERER 614

Query: 57  RRD 59
            R+
Sbjct: 615 ERE 617



 Score = 37.4 bits (86), Expect = 0.004
 Identities = 15/49 (30%), Positives = 26/49 (53%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGS 66
           KRE   ++ + E +++ +EE E +KE+ +E +R R R   R       S
Sbjct: 580 KREEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSS 628



 Score = 35.8 bits (82), Expect = 0.015
 Identities = 18/42 (42%), Positives = 25/42 (59%), Gaps = 5/42 (11%)

Query: 14  VPKSKREGRQK--EEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           V K+KRE  QK  EE E  K+++KE E   +ER  E +R  +
Sbjct: 585 VEKAKREAEQKAREEREREKEKEKERE---REREREAERAAK 623


>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355).  This
          family of proteins is found in bacteria and viruses.
          Proteins in this family are typically between 180 and
          214 amino acids in length.
          Length = 125

 Score = 39.5 bits (93), Expect = 2e-04
 Identities = 15/50 (30%), Positives = 26/50 (52%), Gaps = 5/50 (10%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEK-----KERNEELDRRRRRRRRRDN 60
          K+K E +Q+E++ E +K  K   EEK     ++  +EL+       RR+ 
Sbjct: 21 KAKWEKKQEEKKSEAEKLAKMSAEEKAEYELEKLEKELEELEAELARREL 70



 Score = 32.2 bits (74), Expect = 0.067
 Identities = 10/29 (34%), Positives = 17/29 (58%)

Query: 26 EEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           EE+ + E ++ E+E +E   EL RR  +
Sbjct: 43 AEEKAEYELEKLEKELEELEAELARRELK 71



 Score = 29.5 bits (67), Expect = 0.60
 Identities = 15/50 (30%), Positives = 24/50 (48%), Gaps = 8/50 (16%)

Query: 16 KSKREGRQKEEE------EEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          + K+E ++ E E       E K E + E+ EK+   EEL+    RR  + 
Sbjct: 25 EKKQEEKKSEAEKLAKMSAEEKAEYELEKLEKE--LEELEAELARRELKA 72


>gnl|CDD|233174 TIGR00893, 2A0114, D-galactonate transporter.  [Transport and
           binding proteins, Carbohydrates, organic alcohols, and
           acids].
          Length = 399

 Score = 40.8 bits (96), Expect = 2e-04
 Identities = 30/139 (21%), Positives = 52/139 (37%), Gaps = 7/139 (5%)

Query: 73  RSIFFLGSLLGGFILSWVADRYG-RITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTG 131
            S F  G ++G F   W+ DR+G R T  +   +       L  F+   V   + R L G
Sbjct: 35  FSAFSWGYVVGQFPGGWLLDRFGARKTLAVFIVIWGVF-TGLQAFAGAYVSLYILRVLLG 93

Query: 132 VGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLA---DWQWISV 188
                 F    +IV        R  A++        +  +    +  ++     WQW  +
Sbjct: 94  AAEAPFFPGIILIVASWFPASERATAVSIFNSAQG-LGGIIGGPLVGWILIHFSWQWAFI 152

Query: 189 IT-IFPLIVGLIVAIFTPE 206
           I  +  +I G++   F P+
Sbjct: 153 IEGVLGIIWGVLWLKFIPD 171


>gnl|CDD|233165 TIGR00879, SP, MFS transporter, sugar porter (SP) family.  This
           model represents the sugar porter subfamily of the major
           facilitator superfamily (pfam00083) [Transport and
           binding proteins, Carbohydrates, organic alcohols, and
           acids].
          Length = 481

 Score = 40.8 bits (96), Expect = 3e-04
 Identities = 22/79 (27%), Positives = 37/79 (46%), Gaps = 3/79 (3%)

Query: 58  RDNWVCDGSSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVAL---T 114
             N     SS   +  SIF +G  +G     W++DR+GR  ++L   ++  +G  L    
Sbjct: 62  SANSDSYSSSLWGLVVSIFLVGGFIGALFAGWLSDRFGRKKSLLIIALLFVIGAILMGLA 121

Query: 115 PFSKDVVLFSLSRFLTGVG 133
            F+  V +  + R L G+G
Sbjct: 122 AFALSVEMLIVGRVLLGIG 140


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 39.3 bits (91), Expect = 7e-04
 Identities = 21/46 (45%), Positives = 28/46 (60%), Gaps = 3/46 (6%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR--RRRRRRRD 59
           K KRE R+K  EEE ++ +K+EE ++K R EE  RR      RRR 
Sbjct: 220 KKKREERRKVLEEE-EQRRKQEEADRKSREEEEKRRLKEEIERRRA 264



 Score = 32.7 bits (74), Expect = 0.13
 Identities = 14/41 (34%), Positives = 23/41 (56%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K + +Q+E   E ++ KK+ EE +K   EE  RR++    R
Sbjct: 204 KLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADR 244



 Score = 31.2 bits (70), Expect = 0.33
 Identities = 16/65 (24%), Positives = 30/65 (46%), Gaps = 4/65 (6%)

Query: 15  PKSKR---EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAI 71
            + K    E +++++E E  ++ +EEEE+++ + E   RR     +R     DG S    
Sbjct: 225 ERRKVLEEEEQRRKQE-EADRKSREEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSEDKK 283

Query: 72  TRSIF 76
               F
Sbjct: 284 PFKCF 288



 Score = 29.2 bits (65), Expect = 1.7
 Identities = 12/32 (37%), Positives = 19/32 (59%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           S  E  ++ +E+ G + +  EEEEK+E  EE 
Sbjct: 92  SLSEPSRRMQEDSGAENETVEEEEKEESREER 123



 Score = 29.2 bits (65), Expect = 1.7
 Identities = 11/37 (29%), Positives = 19/37 (51%), Gaps = 5/37 (13%)

Query: 19  REGRQKEEEEEG-----KKEKKEEEEEKKERNEELDR 50
           RE R++ EE EG     +K    + EE ++  +E + 
Sbjct: 120 REEREEVEETEGVTKSEQKNDWRDAEECQKEEKEPEP 156



 Score = 28.5 bits (63), Expect = 2.8
 Identities = 16/65 (24%), Positives = 30/65 (46%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITRSIFFL 78
            EG  K E++   ++ +E ++E+KE   E + + +R    +N     +  L  T + F  
Sbjct: 129 TEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSR 188

Query: 79  GSLLG 83
           G   G
Sbjct: 189 GGAEG 193



 Score = 28.1 bits (62), Expect = 3.5
 Identities = 10/42 (23%), Positives = 18/42 (42%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
             E  +KEE  E ++E +E E   K   +   R     ++ +
Sbjct: 110 TVEEEEKEESREEREEVEETEGVTKSEQKNDWRDAEECQKEE 151



 Score = 27.3 bits (60), Expect = 6.3
 Identities = 11/49 (22%), Positives = 19/49 (38%), Gaps = 5/49 (10%)

Query: 8  NCYGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
              L     +R  R++E  +E   E  E ++E K  + +       RR
Sbjct: 56 EAALL-----ERLARREERRDERFSEALERQKEFKPTSTDQSLSEPSRR 99



 Score = 26.9 bits (59), Expect = 9.7
 Identities = 11/43 (25%), Positives = 19/43 (44%), Gaps = 1/43 (2%)

Query: 16  KSKREGRQK-EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           + K E R++ EE EE +   K E++      EE  +  +    
Sbjct: 114 EEKEESREEREEVEETEGVTKSEQKNDWRDAEECQKEEKEPEP 156


>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
           subunit.  This is a family of proteins which are
           subunits of the eukaryotic translation initiation factor
           3 (eIF3). In yeast it is called Hcr1. The Saccharomyces
           cerevisiae protein eIF3j (HCR1) has been shown to be
           required for processing of 20S pre-rRNA and binds to 18S
           rRNA and eIF3 subunits Rpg1p and Prt1p.
          Length = 242

 Score = 38.9 bits (91), Expect = 7e-04
 Identities = 14/45 (31%), Positives = 25/45 (55%), Gaps = 1/45 (2%)

Query: 15  PKSKREGRQK-EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            K+K+  + K EE+E+ K+EK+E+   + E +   D    + R R
Sbjct: 56  AKAKKALKAKIEEKEKAKREKEEKGLRELEEDTPEDELAEKLRLR 100



 Score = 34.2 bits (79), Expect = 0.027
 Identities = 8/44 (18%), Positives = 23/44 (52%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            K   + + +E+E+  ++++++   E +E   E +   + R R+
Sbjct: 58  AKKALKAKIEEKEKAKREKEEKGLRELEEDTPEDELAEKLRLRK 101


>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207).  This
           family is found in eukaryotes; it has several conserved
           tryptophan residues. The function is not known.
          Length = 261

 Score = 38.9 bits (91), Expect = 8e-04
 Identities = 13/32 (40%), Positives = 26/32 (81%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           K K++ +++EEE   +++K++EEEE+K++ EE
Sbjct: 191 KLKQQQQKREEERRKQRKKQQEEEERKQKAEE 222



 Score = 32.8 bits (75), Expect = 0.098
 Identities = 11/45 (24%), Positives = 26/45 (57%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWV 62
             +  Q++++ +   E+K+++E +KER E   R+R  + + + W 
Sbjct: 94  SAKQAQRQKKLQKLLEEKQKQEREKEREEAELRQRLAKEKYEEWC 138



 Score = 30.5 bits (69), Expect = 0.47
 Identities = 10/32 (31%), Positives = 19/32 (59%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
             +++  QK  EE+ K+E+++E EE + R   
Sbjct: 98  AQRQKKLQKLLEEKQKQEREKEREEAELRQRL 129



 Score = 30.1 bits (68), Expect = 0.61
 Identities = 14/51 (27%), Positives = 30/51 (58%), Gaps = 4/51 (7%)

Query: 15  PKSKREGRQKEEEEEGKK----EKKEEEEEKKERNEELDRRRRRRRRRDNW 61
             S+ E +++ +E E KK    ++K EEE +K+R ++ +   R+++  + W
Sbjct: 174 NVSQEEAKKRLQEWELKKLKQQQQKREEERRKQRKKQQEEEERKQKAEEAW 224



 Score = 29.7 bits (67), Expect = 0.86
 Identities = 11/37 (29%), Positives = 22/37 (59%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
             E +QK+E E+ ++E +  +   KE+ EE  R++ +
Sbjct: 107 LLEEKQKQEREKEREEAELRQRLAKEKYEEWCRQKAQ 143



 Score = 29.7 bits (67), Expect = 0.91
 Identities = 9/45 (20%), Positives = 21/45 (46%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K  +  RQ +E  E     K+ + +KK +    +++++ R +  
Sbjct: 76  LKEVKLERQAQEAYENWLSAKQAQRQKKLQKLLEEKQKQEREKER 120


>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168).  This
          family consists of several hypothetical eukaryotic
          proteins of unknown function.
          Length = 142

 Score = 38.1 bits (89), Expect = 8e-04
 Identities = 12/40 (30%), Positives = 25/40 (62%), Gaps = 3/40 (7%)

Query: 22 RQKEEEEEGKKEKKEEE-EEKKERNEELD--RRRRRRRRR 58
          R +  +E+ KKE ++EE ++K+E  +  D  +  ++R +R
Sbjct: 51 RLELMDEKWKKETEDEEFQQKREEKKRKDEEKTAKKRAKR 90



 Score = 26.2 bits (58), Expect = 9.3
 Identities = 12/34 (35%), Positives = 23/34 (67%), Gaps = 2/34 (5%)

Query: 18  KREGRQKEEE--EEGKKEKKEEEEEKKERNEELD 49
           KR  RQK+++  ++ KK KK  ++E+KE ++  +
Sbjct: 86  KRAKRQKKKQKKKKKKKAKKGNKKEEKEGSKSSE 119


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 38.7 bits (90), Expect = 0.001
 Identities = 11/46 (23%), Positives = 29/46 (63%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            + K + + KEE+++ K++ KEE +++K + E  ++R  + + ++ 
Sbjct: 105 EEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEK 150



 Score = 37.6 bits (87), Expect = 0.004
 Identities = 12/49 (24%), Positives = 29/49 (59%), Gaps = 3/49 (6%)

Query: 15  PKSKRE---GRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            +S +E    +++ +EE+ KK++K +EE K  + +E  + +R  + ++ 
Sbjct: 100 NESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEK 148



 Score = 36.4 bits (84), Expect = 0.008
 Identities = 12/45 (26%), Positives = 27/45 (60%), Gaps = 2/45 (4%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           +  ++ + KEE +E +  K++E+E++K+  E   R R   ++R+ 
Sbjct: 126 EEPKDRKPKEEAKEKRPPKEKEKEKEKKVEE--PRDREEEKKRER 168



 Score = 35.6 bits (82), Expect = 0.013
 Identities = 12/47 (25%), Positives = 26/47 (55%), Gaps = 2/47 (4%)

Query: 15  PKSKREGRQKEEEEEGK--KEKKEEEEEKKERNEELDRRRRRRRRRD 59
           PK K + ++K+ EE     +EKK E    K R ++  +++   ++++
Sbjct: 143 PKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKE 189



 Score = 35.2 bits (81), Expect = 0.016
 Identities = 12/44 (27%), Positives = 26/44 (59%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
              K +   KE+    +KEK++E++ ++ R+ E +++R R R +
Sbjct: 129 KDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAK 172



 Score = 33.7 bits (77), Expect = 0.056
 Identities = 14/43 (32%), Positives = 26/43 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K KR  ++KE+E+E K E+  + EE+K+R     + R ++  +
Sbjct: 138 KEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPK 180



 Score = 33.7 bits (77), Expect = 0.059
 Identities = 12/36 (33%), Positives = 24/36 (66%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNE 46
           G   P +K +  ++ + E GK+E+KE+E+ K+E+ +
Sbjct: 84  GSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKK 119



 Score = 32.2 bits (73), Expect = 0.16
 Identities = 9/40 (22%), Positives = 21/40 (52%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
            + +    KE + E  KE+++E+E+ KE  ++   + +  
Sbjct: 88  PAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEE 127



 Score = 32.2 bits (73), Expect = 0.20
 Identities = 14/42 (33%), Positives = 26/42 (61%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           PK + + ++  +E+E +KEKK EE   +E  ++ +R R + R
Sbjct: 133 PKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSR 174



 Score = 31.8 bits (72), Expect = 0.21
 Identities = 17/44 (38%), Positives = 27/44 (61%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            K K E ++K   +E +KEK+++ EE ++R EE  R R R + R
Sbjct: 131 RKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSR 174



 Score = 30.6 bits (69), Expect = 0.56
 Identities = 11/50 (22%), Positives = 23/50 (46%), Gaps = 6/50 (12%)

Query: 16  KSKREGRQKEEEEEGKKEKKEE------EEEKKERNEELDRRRRRRRRRD 59
           K K+    ++ EEE K+E+           +KK  N++ +     ++R+ 
Sbjct: 150 KEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQA 199



 Score = 28.7 bits (64), Expect = 2.5
 Identities = 7/42 (16%), Positives = 23/42 (54%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           V K   +G   + +   + + +  +EE+KE+ +  + +++++
Sbjct: 80  VEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKK 121



 Score = 28.7 bits (64), Expect = 2.6
 Identities = 13/55 (23%), Positives = 25/55 (45%), Gaps = 11/55 (20%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKE-----------RNEELDRRRRRRRRRD 59
            +K    +  +EEE +KE+ +EE++KK+             EE   +R  + +  
Sbjct: 94  PAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEK 148



 Score = 28.3 bits (63), Expect = 3.0
 Identities = 10/33 (30%), Positives = 16/33 (48%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
            + K+  R + +    K  KK+   +KKE  EE
Sbjct: 161 EEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEE 193



 Score = 27.9 bits (62), Expect = 3.7
 Identities = 10/46 (21%), Positives = 21/46 (45%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            KS+ +   K++    KKE  EEE++++   E +  +       + 
Sbjct: 171 AKSRPKKPPKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEE 216


>gnl|CDD|237051 PRK12307, PRK12307, putative sialic acid transporter; Provisional.
          Length = 426

 Score = 38.4 bits (89), Expect = 0.001
 Identities = 19/57 (33%), Positives = 34/57 (59%)

Query: 77  FLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           F+G   GG +   +AD++GR   ++ S V   +G  L+  +  V++ +LSRF+ G+G
Sbjct: 63  FIGRPFGGALFGLLADKFGRKPLMMWSIVAYSVGTGLSGLASGVIMLTLSRFIVGMG 119


>gnl|CDD|206039 pfam13868, Trichoplein, Tumour suppressor, Mitostatin.  Trichoplein
           or mitostatin, was first defined as a meiosis-specific
           nuclear structural protein. It has since been linked
           with mitochondrial movement. It is associated with the
           mitochondrial outer membrane, and over-expression leads
           to reduction in mitochondrial motility whereas lack of
           it enhances mitochondrial movement. The activity appears
           to be mediated through binding the mitochondria to the
           actin intermediate filaments (IFs).
          Length = 349

 Score = 38.4 bits (90), Expect = 0.002
 Identities = 12/44 (27%), Positives = 21/44 (47%), Gaps = 5/44 (11%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERN-----EELDRRRRRRRRRDN 60
           R+K E EE ++ ++ E +E+KER       + +     R   D 
Sbjct: 160 REKAEREEEREAERRERKEEKEREVARLRAQQEEAEDEREELDE 203



 Score = 34.5 bits (80), Expect = 0.030
 Identities = 16/48 (33%), Positives = 24/48 (50%), Gaps = 5/48 (10%)

Query: 16  KSKREGRQKEEEEEGK-----KEKKEEEEEKKERNEELDRRRRRRRRR 58
           + K E +++E EEE K     +EK E EEE++    E    + R   R
Sbjct: 139 ERKEEEKEREREEELKILEYQREKAEREEEREAERRERKEEKEREVAR 186



 Score = 34.1 bits (79), Expect = 0.032
 Identities = 17/50 (34%), Positives = 27/50 (54%), Gaps = 9/50 (18%)

Query: 18  KREGRQKEEEEEGKKEKKEEE---------EEKKERNEELDRRRRRRRRR 58
           +R+ RQKE+EE  K+ ++++E         EEK+ER +E        R R
Sbjct: 214 ERKERQKEKEEAEKRRRQKQELQRAREEQIEEKEERLQEERAEEEAERER 263



 Score = 33.7 bits (78), Expect = 0.049
 Identities = 15/41 (36%), Positives = 25/41 (60%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          K+  + +E+EEE + ++  EEE  K   EE +R R+R+  R
Sbjct: 29 KKRIKAEEKEEERRIDEMMEEERLKALAEEEERERKRKEER 69



 Score = 33.3 bits (77), Expect = 0.064
 Identities = 15/46 (32%), Positives = 27/46 (58%), Gaps = 8/46 (17%)

Query: 22  RQKEEEEEGKKEKKE--------EEEEKKERNEELDRRRRRRRRRD 59
           R ++EE E ++E+ +        EE E+KER +E +   +RRR++ 
Sbjct: 188 RAQQEEAEDEREELDELRADLYQEEYERKERQKEKEEAEKRRRQKQ 233



 Score = 33.3 bits (77), Expect = 0.074
 Identities = 13/46 (28%), Positives = 28/46 (60%), Gaps = 8/46 (17%)

Query: 19  REGRQKEEEEEGKKEKKE--------EEEEKKERNEELDRRRRRRR 56
           RE R++EE EE  +E+++        +EE++ E  E+ +++++ R 
Sbjct: 83  REKRRQEEYEERLQEREQMDEIIERIQEEDEAEAQEKREKQKKLRE 128



 Score = 32.2 bits (74), Expect = 0.17
 Identities = 14/42 (33%), Positives = 25/42 (59%), Gaps = 1/42 (2%)

Query: 18  KREGRQKE-EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K E  Q+E  EEE ++E+  E++ + E  E+ +  +RR +R 
Sbjct: 246 KEERLQEERAEEEAERERMLEKQAEDEELEQENAEKRRMKRL 287



 Score = 31.4 bits (72), Expect = 0.25
 Identities = 13/46 (28%), Positives = 25/46 (54%), Gaps = 5/46 (10%)

Query: 18  KREGRQKEEEEEGKKEKK-----EEEEEKKERNEELDRRRRRRRRR 58
           + E +++E+E E ++E K      E+ E++E  E   R R+  + R
Sbjct: 137 RIERKEEEKEREREEELKILEYQREKAEREEEREAERRERKEEKER 182



 Score = 31.0 bits (71), Expect = 0.33
 Identities = 14/34 (41%), Positives = 22/34 (64%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           +Q EE+EE +  ++EEE E+ ER  E +  R+ R
Sbjct: 295 QQIEEKEERRAAEREEELEEGERLREEEAERQAR 328



 Score = 31.0 bits (71), Expect = 0.39
 Identities = 17/48 (35%), Positives = 21/48 (43%), Gaps = 11/48 (22%)

Query: 22  RQKEEEEEGKKEKKEEEEE-----------KKERNEELDRRRRRRRRR 58
              EE  E K+E+KE E E           K ER EE +  RR R+  
Sbjct: 132 EFNEERIERKEEEKEREREEELKILEYQREKAEREEEREAERRERKEE 179



 Score = 30.3 bits (69), Expect = 0.62
 Identities = 15/47 (31%), Positives = 24/47 (51%), Gaps = 6/47 (12%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKER-NEELDRRR-----RRRRRRD 59
           R    +EE E  +++K++EE EK+ R  +EL R R      +  R  
Sbjct: 205 RADLYQEEYERKERQKEKEEAEKRRRQKQELQRAREEQIEEKEERLQ 251



 Score = 30.3 bits (69), Expect = 0.63
 Identities = 15/41 (36%), Positives = 21/41 (51%), Gaps = 5/41 (12%)

Query: 23 QKEEEEEGKKEKKEEEEEK-----KERNEELDRRRRRRRRR 58
          Q EE++  K E+KEEE        +ER + L     R R+R
Sbjct: 25 QIEEKKRIKAEEKEEERRIDEMMEEERLKALAEEEERERKR 65



 Score = 30.3 bits (69), Expect = 0.64
 Identities = 16/46 (34%), Positives = 27/46 (58%), Gaps = 6/46 (13%)

Query: 19  REGRQKEEEEEGKKEKKEEEEE------KKERNEELDRRRRRRRRR 58
           RE + +E+EE  ++E+ EEE E      K+  +EEL++    +RR 
Sbjct: 239 REEQIEEKEERLQEERAEEEAERERMLEKQAEDEELEQENAEKRRM 284



 Score = 30.3 bits (69), Expect = 0.64
 Identities = 13/45 (28%), Positives = 25/45 (55%), Gaps = 4/45 (8%)

Query: 16  KSKREGRQKEEEEEGKKE--KKEEEEEKKERNEELDRRRRRRRRR 58
           K +R   ++ EEE  ++   +K+ E+E+ E+  E   +RR +R  
Sbjct: 246 KEERLQEERAEEEAERERMLEKQAEDEELEQ--ENAEKRRMKRLE 288



 Score = 30.3 bits (69), Expect = 0.73
 Identities = 14/37 (37%), Positives = 20/37 (54%), Gaps = 1/37 (2%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           K E R  E EEE  +E +   EE+ ER   ++  R+R
Sbjct: 300 KEERRAAEREEE-LEEGERLREEEAERQARIEEERQR 335



 Score = 30.3 bits (69), Expect = 0.74
 Identities = 13/39 (33%), Positives = 24/39 (61%), Gaps = 3/39 (7%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERN---EELDRRRRRRRR 57
          R K   EE ++E+K +EE ++ R    E+++ R +RR+ 
Sbjct: 51 RLKALAEEEERERKRKEERREGRAVLQEQIEEREKRRQE 89



 Score = 30.3 bits (69), Expect = 0.76
 Identities = 19/62 (30%), Positives = 31/62 (50%), Gaps = 18/62 (29%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEE--------EEKKERNEELDRRR---------RRRRR 57
            K++RE  ++ E  E +KE+KE E        EE ++  EELD  R         R+ R+
Sbjct: 161 EKAEREEEREAERRE-RKEEKEREVARLRAQQEEAEDEREELDELRADLYQEEYERKERQ 219

Query: 58  RD 59
           ++
Sbjct: 220 KE 221



 Score = 29.5 bits (67), Expect = 1.1
 Identities = 14/46 (30%), Positives = 24/46 (52%), Gaps = 8/46 (17%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKER----NEELDRRRR----RRRRRD 59
            Q EE+EE  +E++ EEE ++ER      E +   +    +RR + 
Sbjct: 241 EQIEEKEERLQEERAEEEAERERMLEKQAEDEELEQENAEKRRMKR 286



 Score = 29.5 bits (67), Expect = 1.1
 Identities = 17/55 (30%), Positives = 27/55 (49%), Gaps = 12/55 (21%)

Query: 16  KSKREGRQKEEE--EEGKKEKKEEEEEKK----------ERNEELDRRRRRRRRR 58
           + +REGR   +E  EE +K ++EE EE+           ER +E D    + +R 
Sbjct: 67  EERREGRAVLQEQIEEREKRRQEEYEERLQEREQMDEIIERIQEEDEAEAQEKRE 121



 Score = 29.1 bits (66), Expect = 1.6
 Identities = 12/35 (34%), Positives = 21/35 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
           + +R   ++EE EEG++ ++EE E +    EE  R
Sbjct: 301 EERRAAEREEELEEGERLREEEAERQARIEEERQR 335



 Score = 29.1 bits (66), Expect = 1.6
 Identities = 14/38 (36%), Positives = 22/38 (57%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
          K E R+ +E  E ++ K   EEE++ER  + +RR  R 
Sbjct: 37 KEEERRIDEMMEEERLKALAEEEERERKRKEERREGRA 74



 Score = 28.0 bits (63), Expect = 3.9
 Identities = 11/46 (23%), Positives = 22/46 (47%), Gaps = 7/46 (15%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKER-------NEELDRRRRRRRRRDN 60
             +  +EE + E +E+ E++K+        NEE   R+   + R+ 
Sbjct: 104 IIERIQEEDEAEAQEKREKQKKLREEIDEFNEERIERKEEEKERER 149



 Score = 27.2 bits (61), Expect = 6.5
 Identities = 10/39 (25%), Positives = 19/39 (48%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            + +E+ E +K+ +EE +E  E   E     + R R + 
Sbjct: 114 AEAQEKREKQKKLREEIDEFNEERIERKEEEKEREREEE 152



 Score = 26.8 bits (60), Expect = 7.8
 Identities = 8/29 (27%), Positives = 14/29 (48%), Gaps = 2/29 (6%)

Query: 31  KKEKKEEE--EEKKERNEELDRRRRRRRR 57
           K+E++  E  EE +E     +    R+ R
Sbjct: 300 KEERRAAEREEELEEGERLREEEAERQAR 328



 Score = 26.8 bits (60), Expect = 8.2
 Identities = 7/40 (17%), Positives = 27/40 (67%), Gaps = 1/40 (2%)

Query: 19  REGRQKEEEEEGKKEKKEEEEE-KKERNEELDRRRRRRRR 57
            E +++++E+E  ++++ +++E ++ R E+++ +  R + 
Sbjct: 213 YERKERQKEKEEAEKRRRQKQELQRAREEQIEEKEERLQE 252


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta, the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 37.5 bits (87), Expect = 0.003
 Identities = 21/46 (45%), Positives = 26/46 (56%), Gaps = 2/46 (4%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           IVP+S  E  ++ EE E     K+EEE K+E     D  RRR RRR
Sbjct: 313 IVPESPVE-EEESEEPEPPPLPKKEEE-KEEVTVSPDGGRRRGRRR 356



 Score = 28.6 bits (64), Expect = 2.5
 Identities = 10/34 (29%), Positives = 16/34 (47%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
                 R   EEE  +KEK++ +  KK   +E +
Sbjct: 274 PKPSGERSDSEEETEEKEKEKRKRLKKMMEDEDE 307



 Score = 28.3 bits (63), Expect = 2.9
 Identities = 8/44 (18%), Positives = 19/44 (43%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           + + +       +E+E + E K   E      E  ++ + +R+R
Sbjct: 254 ILEDESAEPTGLDEDEDEDEPKPSGERSDSEEETEEKEKEKRKR 297


>gnl|CDD|233166 TIGR00880, 2_A_01_02, Multidrug resistance protein. 
          Length = 141

 Score = 36.1 bits (84), Expect = 0.003
 Identities = 31/142 (21%), Positives = 57/142 (40%), Gaps = 9/142 (6%)

Query: 74  SIFFLGSLLGGFILS----WVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFL 129
            +   G  LG  I S     + DR+GR   +L    +  L  A+   S ++ +  ++RFL
Sbjct: 1   GLLLAGYALGQLIYSPLSGLLTDRFGRKPVLLVGLFIFVLSTAMFALSSNITVLIIARFL 60

Query: 130 TGVGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLAD---WQWI 186
            G G   A +    ++ +   P+ R  A+         +  +  P +   LA    W+  
Sbjct: 61  QGFGAAFALVAGAALIADIYPPEERGVALGL-MSAGIALGPLLGPPLGGVLAQFLGWRAP 119

Query: 187 SVI-TIFPLIVGLIVAIFTPES 207
            +   I  L   +++A   PE+
Sbjct: 120 FLFLAILALAAFILLAFLLPET 141


>gnl|CDD|205206 pfam13025, DUF3886, Protein of unknown function (DUF3886).  This
          family of proteins is functionally uncharacterized.
          This family of proteins is found in bacteria. Proteins
          in this family are approximately 90 amino acids in
          length. There are two completely conserved L residues
          that may be functionally important.
          Length = 70

 Score = 34.6 bits (80), Expect = 0.003
 Identities = 12/30 (40%), Positives = 18/30 (60%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          KKE K EEE+++E  E   R  R+ R ++ 
Sbjct: 26 KKELKAEEEKREEEEEARKREERKEREKNK 55



 Score = 33.1 bits (76), Expect = 0.015
 Identities = 13/30 (43%), Positives = 19/30 (63%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          KK++ + EEEK+E  EE  +R  R+ R  N
Sbjct: 25 KKKELKAEEEKREEEEEARKREERKEREKN 54



 Score = 31.9 bits (73), Expect = 0.029
 Identities = 11/32 (34%), Positives = 22/32 (68%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
          K+K++  + EEE+  ++E+  + EE+KER + 
Sbjct: 23 KAKKKELKAEEEKREEEEEARKREERKEREKN 54



 Score = 31.2 bits (71), Expect = 0.062
 Identities = 11/26 (42%), Positives = 19/26 (73%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEK 41
          K++ E R++EEE   ++E+KE E+ K
Sbjct: 30 KAEEEKREEEEEARKREERKEREKNK 55



 Score = 30.8 bits (70), Expect = 0.091
 Identities = 14/31 (45%), Positives = 21/31 (67%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
          K E  ++EEEEE +K ++ +E EK +  EEL
Sbjct: 30 KAEEEKREEEEEARKREERKEREKNKSFEEL 60



 Score = 30.4 bits (69), Expect = 0.12
 Identities = 12/32 (37%), Positives = 21/32 (65%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
            E +++EEEE  K+E+++E E+ K   E L+
Sbjct: 31 AEEEKREEEEEARKREERKEREKNKSFEELLN 62



 Score = 29.2 bits (66), Expect = 0.27
 Identities = 10/27 (37%), Positives = 18/27 (66%)

Query: 32 KEKKEEEEEKKERNEELDRRRRRRRRR 58
          K KK+E + ++E+ EE +  R+R  R+
Sbjct: 23 KAKKKELKAEEEKREEEEEARKREERK 49


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
           RPA34.5.  This is a family of proteins conserved from
           yeasts to human. Subunit A34.5 of RNA polymerase I is a
           non-essential subunit which is thought to help Pol I
           overcome topological constraints imposed on ribosomal
           DNA during the process of transcription.
          Length = 193

 Score = 36.6 bits (85), Expect = 0.004
 Identities = 11/42 (26%), Positives = 30/42 (71%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           + K+E ++K+E ++ KKEKK+++E+  E      ++++++++
Sbjct: 152 EEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKKKK 193



 Score = 32.8 bits (75), Expect = 0.065
 Identities = 12/47 (25%), Positives = 30/47 (63%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           V K      ++++E++ KKE K+E++EKK++ E++   +  ++++  
Sbjct: 143 VEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKKKK 189



 Score = 32.8 bits (75), Expect = 0.071
 Identities = 11/44 (25%), Positives = 32/44 (72%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
              + E ++K++++E KKEKKE++++K++  E    ++++++++
Sbjct: 148 EVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKK 191



 Score = 32.4 bits (74), Expect = 0.085
 Identities = 15/36 (41%), Positives = 26/36 (72%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
            + +  ++ E EEE KKEKK+++E KKE+ E+ D++
Sbjct: 139 TTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKK 174



 Score = 32.0 bits (73), Expect = 0.11
 Identities = 8/50 (16%), Positives = 29/50 (58%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
             +  +S+   ++   + E + E +EEE+++K++ +E+ + ++ ++ +  
Sbjct: 126 SELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKE 175


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
          approximately 300 residues, found in plants and
          vertebrates. They contain a highly conserved DDRGK
          motif.
          Length = 189

 Score = 36.6 bits (85), Expect = 0.004
 Identities = 13/40 (32%), Positives = 28/40 (70%)

Query: 19 REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           E ++ EE+ EG+++++EE EE++E+ +E + R+ R  + 
Sbjct: 29 EERKKLEEKREGERKEEEELEEEREKKKEEEERKEREEQA 68



 Score = 34.7 bits (80), Expect = 0.019
 Identities = 15/44 (34%), Positives = 27/44 (61%), Gaps = 1/44 (2%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
          +R+ R+ EEEE  +++K EE+ E  ER EE +    R ++++  
Sbjct: 17 RRQQREAEEEEREERKKLEEKRE-GERKEEEELEEEREKKKEEE 59



 Score = 34.7 bits (80), Expect = 0.019
 Identities = 17/45 (37%), Positives = 30/45 (66%), Gaps = 2/45 (4%)

Query: 18 KREGRQK--EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          +RE R+K  E+ E  +KE++E EEE++++ EE +R+ R  + R  
Sbjct: 27 EREERKKLEEKREGERKEEEELEEEREKKKEEEERKEREEQARKE 71



 Score = 30.1 bits (68), Expect = 0.65
 Identities = 21/36 (58%), Positives = 29/36 (80%), Gaps = 1/36 (2%)

Query: 16 KSKREGRQKEEEE-EGKKEKKEEEEEKKERNEELDR 50
          + KREG +KEEEE E ++EKK+EEEE+KER E+  +
Sbjct: 35 EEKREGERKEEEELEEEREKKKEEEERKEREEQARK 70



 Score = 29.3 bits (66), Expect = 1.2
 Identities = 12/34 (35%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 16 KSKREGRQKEEEEEGKKE--KKEEEEEKKERNEE 47
            ++E  + EEE E KKE  +++E EE+  + +E
Sbjct: 40 GERKEEEELEEEREKKKEEEERKEREEQARKEQE 73



 Score = 28.1 bits (63), Expect = 2.7
 Identities = 9/25 (36%), Positives = 18/25 (72%)

Query: 19 REGRQKEEEEEGKKEKKEEEEEKKE 43
           E  +K+EEEE K+ +++  +E++E
Sbjct: 50 EEREKKKEEEERKEREEQARKEQEE 74



 Score = 27.7 bits (62), Expect = 3.6
 Identities = 11/43 (25%), Positives = 21/43 (48%), Gaps = 6/43 (13%)

Query: 22 RQKEEEEEGKKEKKE------EEEEKKERNEELDRRRRRRRRR 58
          R K EE++ +++++E      EE +K E   E +R+       
Sbjct: 8  RAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEE 50



 Score = 27.4 bits (61), Expect = 4.2
 Identities = 7/33 (21%), Positives = 18/33 (54%)

Query: 28 EEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
             K+ K EE++ ++++ E  +  R  R++ + 
Sbjct: 4  GAKKRAKLEEKQARRQQREAEEEEREERKKLEE 36



 Score = 27.4 bits (61), Expect = 4.7
 Identities = 13/35 (37%), Positives = 23/35 (65%), Gaps = 3/35 (8%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
          +   E R+K++EEE   E+KE EE+ ++  EE ++
Sbjct: 46 EELEEEREKKKEEE---ERKEREEQARKEQEEYEK 77


>gnl|CDD|182486 PRK10473, PRK10473, multidrug efflux system protein MdtL;
           Provisional.
          Length = 392

 Score = 36.9 bits (86), Expect = 0.004
 Identities = 19/69 (27%), Positives = 28/69 (40%), Gaps = 8/69 (11%)

Query: 69  LAITRSIFFLGS----LLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFS 124
           L I  S++  G     L  G     +ADR GR    +    +  +   L   ++   LF 
Sbjct: 40  LHIAFSVYLAGMAAAMLFAG----KIADRSGRKPVAIPGAALFIIASLLCSLAETSSLFL 95

Query: 125 LSRFLTGVG 133
             RFL G+G
Sbjct: 96  AGRFLQGIG 104


>gnl|CDD|205480 pfam13300, DUF4078, Domain of unknown function (DUF4078).  This
          family is found from fungi to humans, but its exact
          function is not known.
          Length = 88

 Score = 34.9 bits (81), Expect = 0.004
 Identities = 18/42 (42%), Positives = 27/42 (64%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          SK E  +KE+ EE +K ++E E E+KER E  ++R+R    R
Sbjct: 39 SKDEEERKEQMEELEKAREETERERKEREERKEKRKRAIEER 80



 Score = 31.5 bits (72), Expect = 0.065
 Identities = 11/34 (32%), Positives = 20/34 (58%)

Query: 26 EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          +EEE K++ +E E+ ++E   E   R  R+ +R 
Sbjct: 41 DEEERKEQMEELEKAREETERERKEREERKEKRK 74



 Score = 26.9 bits (60), Expect = 2.5
 Identities = 10/41 (24%), Positives = 22/41 (53%), Gaps = 5/41 (12%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          ++   + E E + ++E+KE+ +   E     +RR++   RR
Sbjct: 53 EKAREETERERKEREERKEKRKRAIE-----ERRKKIEERR 88


>gnl|CDD|234311 TIGR03685, L12P_arch, 50S ribosomal protein L12P.  This model
          represents the L12P protein of the large (50S) subunit
          of the archaeal ribosome.
          Length = 105

 Score = 35.0 bits (81), Expect = 0.005
 Identities = 11/25 (44%), Positives = 17/25 (68%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEE 47
               EE ++E++EEEEE++E  EE
Sbjct: 71 AAAAAEEEEEEEEEEEEEEEESEEE 95



 Score = 31.2 bits (71), Expect = 0.11
 Identities = 10/26 (38%), Positives = 17/26 (65%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEE 40
            +     ++EEEEE ++E++EE EE
Sbjct: 69 AAAAAAAEEEEEEEEEEEEEEEESEE 94



 Score = 29.6 bits (67), Expect = 0.42
 Identities = 10/21 (47%), Positives = 16/21 (76%)

Query: 20 EGRQKEEEEEGKKEKKEEEEE 40
             ++EEEEE ++E++E EEE
Sbjct: 75 AEEEEEEEEEEEEEEEESEEE 95


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
          RNase Y, an endoribonuclease. The member from Bacillus
          subtilis, YmdA, has been shown to be involved in
          turnover of yitJ riboswitch [Transcription, Degradation
          of RNA].
          Length = 514

 Score = 36.8 bits (86), Expect = 0.005
 Identities = 19/52 (36%), Positives = 27/52 (51%), Gaps = 6/52 (11%)

Query: 13 IVPKSKREGRQKEEE------EEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          I+ ++K+E    ++E      EE  K + E E E KER  EL R  RR  +R
Sbjct: 37 IIEEAKKEAETLKKEALLEAKEEVHKLRAELERELKERRNELQRLERRLLQR 88


>gnl|CDD|217450 pfam03247, Prothymosin, Prothymosin/parathymosin family.
          Prothymosin alpha and parathymosin are two ubiquitous
          small acidic nuclear proteins that are thought to be
          involved in cell cycle progression, proliferation, and
          cell differentiation.
          Length = 106

 Score = 34.9 bits (80), Expect = 0.006
 Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
          E    E++EE + E +EEE E++E  E    +R
Sbjct: 54 EEEVDEDDEEEEGEGEEEEGEEEEETEGATGKR 86



 Score = 29.9 bits (67), Expect = 0.32
 Identities = 13/33 (39%), Positives = 17/33 (51%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
            +EG  + EEEE   E  EEEE + E  E  +
Sbjct: 43 GAQEGDDEMEEEEEVDEDDEEEEGEGEEEEGEE 75


>gnl|CDD|165026 PHA02644, PHA02644, hypothetical protein; Provisional.
          Length = 112

 Score = 35.0 bits (79), Expect = 0.006
 Identities = 17/46 (36%), Positives = 27/46 (58%), Gaps = 1/46 (2%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKK-ERNEELDRRRRRRRRRDN 60
          K+  E  +KE E+E K EK E E+EKK E+ E  D ++  +   ++
Sbjct: 53 KTTHEHIKKENEDEKKPEKPENEDEKKPEKPENEDEKKPEKPENED 98


>gnl|CDD|223809 COG0738, FucP, Fucose permease [Carbohydrate transport and
           metabolism].
          Length = 422

 Score = 36.5 bits (85), Expect = 0.006
 Identities = 34/138 (24%), Positives = 60/138 (43%), Gaps = 19/138 (13%)

Query: 74  SIFFLGSLLGGFILSWV-----ADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRF 128
           S F++G ++G FI S +      ++Y    A++   ++  L VAL      VV       
Sbjct: 279 SFFWVGFMVGRFIGSALMSRIKPEKYLAFYALIA--ILLLLAVALIG---GVVALY---A 330

Query: 129 LTGVGHFNAFIF--YYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLADWQWI 186
           L  +G FN+ +F   + + L+ +G      ++    L+   V    +P +   +AD   I
Sbjct: 331 LFLIGLFNSIMFPTIFSLALKNLG---EHTSVGSGLLVMAIVGGAIIPPLQGVIADMFGI 387

Query: 187 SVIT-IFPLIVGLIVAIF 203
            +   I PL+  L V  F
Sbjct: 388 QLTFLIVPLLCYLYVLFF 405



 Score = 29.2 bits (66), Expect = 1.7
 Identities = 20/123 (16%), Positives = 37/123 (30%), Gaps = 22/123 (17%)

Query: 76  FFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVAL---TPFSKDVVLFSLSRFLTGV 132
           FF G  +       +  + G    ++   ++  +G AL      SK    F ++ F+   
Sbjct: 57  FFGGYFIMSLPAGLLIKKLGYKAGIVLGLLLYAVGAALFWPAASSKSYGFFLVALFILAS 116

Query: 133 GHFNAFIFYYIIVLECVG---------PKWRTFAMTFPFLIFYTVSEVALPWIAYYLADW 183
           G         I +LE            P+   F +      F  +  +  P +   L   
Sbjct: 117 G---------IGLLETAANPYVTLLGKPESAAFRLNL-AQAFNGLGAILGPLLGSSLILS 166

Query: 184 QWI 186
              
Sbjct: 167 GVA 169


>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
          hydrophilic C-term.  This domain is a hydrophilic
          region found at the C-terminus of plant and metazoan
          pre-mRNA-splicing factor 38 proteins. The function is
          not known.
          Length = 97

 Score = 34.4 bits (79), Expect = 0.007
 Identities = 14/40 (35%), Positives = 21/40 (52%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          +  ++ EEEE  +E + + E   +R     RRR RRR R 
Sbjct: 11 DEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRR 50



 Score = 30.9 bits (70), Expect = 0.14
 Identities = 18/44 (40%), Positives = 22/44 (50%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           S  E    EEEE  ++E  EE   K ER+ +  RR  RRR R 
Sbjct: 3  VSALEEDLDEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRR 46



 Score = 28.6 bits (64), Expect = 0.88
 Identities = 9/45 (20%), Positives = 18/45 (40%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           +   +  ++      ++  +  +  +K R    DR R R R RD
Sbjct: 29 AERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRD 73



 Score = 27.8 bits (62), Expect = 1.5
 Identities = 7/37 (18%), Positives = 15/37 (40%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          R        +  +++   +++ R  + DR R R R  
Sbjct: 38 RSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74



 Score = 27.4 bits (61), Expect = 2.1
 Identities = 7/45 (15%), Positives = 15/45 (33%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
             +     +       + +K   + ++ R +    R R R  RD
Sbjct: 32 DVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRD 76



 Score = 27.4 bits (61), Expect = 2.4
 Identities = 16/43 (37%), Positives = 22/43 (51%), Gaps = 5/43 (11%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKE-----RNEELDRRRRRRRRRD 59
          R    EE+  +E++ EEEE  E        ++DR RR  RRR 
Sbjct: 2  RVSALEEDLDEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRT 44



 Score = 26.3 bits (58), Expect = 4.7
 Identities = 8/44 (18%), Positives = 16/44 (36%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            +R+  +K       +++    +      +  DR R R R R 
Sbjct: 48 SRRRKRSRKRRRRRRDRDRARYRDRDDRDRDRYDRSRSRSRSRS 91


>gnl|CDD|218771 pfam05835, Synaphin, Synaphin protein.  This family consists of
          several eukaryotic synaphin 1 and 2 proteins.
          Synaphin/complexin is a cytosolic protein that
          preferentially binds to syntaxin within the SNARE
          complex. Synaphin promotes SNAREs to form precomplexes
          that oligomerise into higher order structures. A
          peptide from the central, syntaxin binding domain of
          synaphin competitively inhibits these two proteins from
          interacting and prevents SNARE complexes from
          oligomerising. It is thought that oligomerisation of
          SNARE complexes into a higher order structure creates a
          SNARE scaffold for efficient, regulated fusion of
          synaptic vesicles. Synaphin promotes neuronal
          exocytosis by promoting interaction between the
          complementary syntaxin and synaptobrevin transmembrane
          regions that reside in opposing membranes prior to
          fusion.
          Length = 139

 Score = 35.3 bits (81), Expect = 0.008
 Identities = 16/44 (36%), Positives = 25/44 (56%), Gaps = 2/44 (4%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKE--RNEELDRRRRRRRRRDNW 61
          E   +EE+EE ++  +E EEE+K   R  E +R   R+  RD +
Sbjct: 29 ESDAEEEDEEIQEALREAEEERKAKHRKMEEEREVMRQGIRDKY 72



 Score = 32.9 bits (75), Expect = 0.049
 Identities = 9/36 (25%), Positives = 21/36 (58%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          KE+E +    ++E+EE ++   E  + R+ + R+ +
Sbjct: 23 KEDEGDESDAEEEDEEIQEALREAEEERKAKHRKME 58


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 36.7 bits (84), Expect = 0.008
 Identities = 11/42 (26%), Positives = 25/42 (59%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
            K+  E ++ EE+++  +E K+ EE++K+  E L +     ++
Sbjct: 1662 KAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKK 1703



 Score = 34.7 bits (79), Expect = 0.034
 Identities = 16/45 (35%), Positives = 24/45 (53%), Gaps = 2/45 (4%)

Query: 16   KSKREGRQKEEEEEGKK--EKKEEEEEKKERNEELDRRRRRRRRR 58
              K E  +K+E EE KK  E K+ EEE K + EE  +     +++
Sbjct: 1701 AKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKK 1745



 Score = 34.7 bits (79), Expect = 0.034
 Identities = 16/45 (35%), Positives = 25/45 (55%), Gaps = 2/45 (4%)

Query: 16   KSKREGRQKEEEEEGKK--EKKEEEEEKKERNEELDRRRRRRRRR 58
            K K E  +K+E EE KK  E K+ EEE K +  E  ++    +++
Sbjct: 1632 KKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKK 1676



 Score = 34.0 bits (77), Expect = 0.054
 Identities = 14/44 (31%), Positives = 26/44 (59%), Gaps = 1/44 (2%)

Query: 16   KSKREGRQKEEEEEGKK-EKKEEEEEKKERNEELDRRRRRRRRR 58
            K K E  +K EE+E K  E  ++E E+ ++ EEL ++    +++
Sbjct: 1674 KKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKK 1717



 Score = 33.6 bits (76), Expect = 0.081
 Identities = 13/43 (30%), Positives = 27/43 (62%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            K K E  +K EE + K E+ ++ EE+K++ E+L ++    +++
Sbjct: 1606 KMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKK 1648



 Score = 32.4 bits (73), Expect = 0.17
 Identities = 17/51 (33%), Positives = 28/51 (54%), Gaps = 7/51 (13%)

Query: 16   KSKREGRQKEEEEEGKKE--KKEEEE-----EKKERNEELDRRRRRRRRRD 59
            K   E ++KE EE+ K E  KK EEE     E+ ++  E D+++    ++D
Sbjct: 1702 KKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKD 1752



 Score = 31.6 bits (71), Expect = 0.28
 Identities = 13/46 (28%), Positives = 22/46 (47%), Gaps = 2/46 (4%)

Query: 16   KSKREGRQKEEEEEG--KKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K K E  +K EEE      E+ ++ EE K++ EE  +     ++  
Sbjct: 1646 KKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAA 1691



 Score = 31.3 bits (70), Expect = 0.37
 Identities = 18/45 (40%), Positives = 24/45 (53%), Gaps = 3/45 (6%)

Query: 16   KSKREGRQKEEEEEGKKE--KKEEEEEKKERNEELDRRRRRRRRR 58
            K   E  +KE EE  K E  KK+E EEKK + EEL +     + +
Sbjct: 1688 KKAAEALKKEAEEAKKAEELKKKEAEEKK-KAEELKKAEEENKIK 1731



 Score = 31.3 bits (70), Expect = 0.38
 Identities = 16/44 (36%), Positives = 24/44 (54%), Gaps = 2/44 (4%)

Query: 16   KSKREGRQKEEEEEGKK--EKKEEEEEKKERNEELDRRRRRRRR 57
            K K E  +K EEE   K  E K+E EE K++ EE  +    +++
Sbjct: 1715 KKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKK 1758



 Score = 30.5 bits (68), Expect = 0.70
 Identities = 9/44 (20%), Positives = 25/44 (56%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K   E ++K EE++ K ++ ++    K++ +E  ++   +++ D
Sbjct: 1391 KKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKAD 1434



 Score = 30.5 bits (68), Expect = 0.73
 Identities = 16/32 (50%), Positives = 21/32 (65%), Gaps = 2/32 (6%)

Query: 14   VPKSKREGRQKEEEEEGKK--EKKEEEEEKKE 43
              K K E  +KE EE+ KK  E K++EEEKK+
Sbjct: 1727 ENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKK 1758



 Score = 30.5 bits (68), Expect = 0.76
 Identities = 12/44 (27%), Positives = 24/44 (54%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K   E ++K EE +   E K++ EE K++ +E  +    +++ D
Sbjct: 1470 KKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKAD 1513



 Score = 30.5 bits (68), Expect = 0.79
 Identities = 10/44 (22%), Positives = 22/44 (50%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K   E ++K EE +   E K++ EE K++ +   ++    ++  
Sbjct: 1302 KKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAA 1345



 Score = 30.1 bits (67), Expect = 0.98
 Identities = 12/44 (27%), Positives = 25/44 (56%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K     ++K EE++   E K++ EE K++ +EL +    +++ D
Sbjct: 1378 KKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKAD 1421



 Score = 29.7 bits (66), Expect = 1.4
 Identities = 11/44 (25%), Positives = 24/44 (54%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K K E  +K +E + K E+ ++ +E K++ EE  ++    ++  
Sbjct: 1463 KKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAA 1506



 Score = 29.3 bits (65), Expect = 1.5
 Identities = 8/44 (18%), Positives = 24/44 (54%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K K +  +K    + K ++ +++ E+K++ +E  ++    ++ D
Sbjct: 1404 KKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKAD 1447



 Score = 29.3 bits (65), Expect = 1.7
 Identities = 8/45 (17%), Positives = 26/45 (57%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            K K +  +K+ EE+ K ++ +++ E+ ++ +E  ++    ++ + 
Sbjct: 1417 KKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEE 1461



 Score = 29.3 bits (65), Expect = 1.8
 Identities = 10/53 (18%), Positives = 21/53 (39%), Gaps = 9/53 (16%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEE---------EEKKERNEELDRRRRRRRRRD 59
            K   E  + E E    + +  EE         EE K++ +   ++   +++ D
Sbjct: 1342 KKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKAD 1394



 Score = 29.0 bits (64), Expect = 2.1
 Identities = 10/44 (22%), Positives = 24/44 (54%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K   E ++ EE+++  + KK EE +K E  ++ +  ++    ++
Sbjct: 1534 KKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKN 1577



 Score = 29.0 bits (64), Expect = 2.3
 Identities = 9/41 (21%), Positives = 22/41 (53%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
            K   E ++K EE + K ++ ++  E K++ +E  +    ++
Sbjct: 1483 KKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKK 1523



 Score = 28.6 bits (63), Expect = 2.8
 Identities = 16/44 (36%), Positives = 27/44 (61%), Gaps = 1/44 (2%)

Query: 16   KSKREGRQKEEEEEGKKEK-KEEEEEKKERNEELDRRRRRRRRR 58
            K K E  +K EEE+ K E+ K++E E+K++ EEL +     + +
Sbjct: 1619 KIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIK 1662



 Score = 28.6 bits (63), Expect = 2.9
 Identities = 9/44 (20%), Positives = 24/44 (54%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
              K +  +K+ EE  K E+ +++ E+ ++ +E  ++    ++ D
Sbjct: 1443 AKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKAD 1486



 Score = 28.6 bits (63), Expect = 3.3
 Identities = 23/67 (34%), Positives = 30/67 (44%), Gaps = 11/67 (16%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERN----------EELDRR-RRRRRRRDNWVCD 64
            K K E  +K+EEE+ K    ++EEEKK             EELD    +RR   D  + D
Sbjct: 1743 KKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKD 1802

Query: 65   GSSNLAI 71
               N A 
Sbjct: 1803 IFDNFAN 1809



 Score = 28.6 bits (63), Expect = 3.5
 Identities = 10/44 (22%), Positives = 25/44 (56%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            K K +  +K+ EE  K ++ +++ E+ ++ EE  ++    ++ D
Sbjct: 1430 KKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKAD 1473



 Score = 27.8 bits (61), Expect = 5.5
 Identities = 8/42 (19%), Positives = 25/42 (59%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
            K++ + ++ EE ++ ++++K+  E  K+  EE  +    +++
Sbjct: 1669 KAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKK 1710



 Score = 27.4 bits (60), Expect = 7.4
 Identities = 10/34 (29%), Positives = 22/34 (64%)

Query: 18   KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
            K E ++K +E + K E+ ++ +E K++ EE  ++
Sbjct: 1297 KAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKK 1330



 Score = 27.4 bits (60), Expect = 7.9
 Identities = 9/45 (20%), Positives = 24/45 (53%), Gaps = 4/45 (8%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEE----EKKERNEELDRRRRRRR 56
            K   E ++K +E +  +E K+ +E    E+ ++ +E  +   +++
Sbjct: 1503 KKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKK 1547



 Score = 27.4 bits (60), Expect = 7.9
 Identities = 12/43 (27%), Positives = 24/43 (55%), Gaps = 2/43 (4%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKE--RNEELDRRRRRRR 56
            K   E ++ +E ++ ++ KK EE++K E  +  E D+    R+
Sbjct: 1540 KKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRK 1582



 Score = 27.0 bits (59), Expect = 8.5
 Identities = 14/46 (30%), Positives = 26/46 (56%), Gaps = 2/46 (4%)

Query: 16   KSKREGRQKEEEEEGKK--EKKEEEEEKKERNEELDRRRRRRRRRD 59
            + K E  +K++EE  KK    K++ EEKK+ +E   +    +++ D
Sbjct: 1363 EEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKAD 1408



 Score = 27.0 bits (59), Expect = 9.6
 Identities = 8/43 (18%), Positives = 24/43 (55%)

Query: 16   KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
              K E  +K+ EE  K ++ +++ E+ ++ +E  ++    +++
Sbjct: 1456 AKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKK 1498


>gnl|CDD|220376 pfam09745, DUF2040, Coiled-coil domain-containing protein 55
           (DUF2040).  This entry is a conserved domain of
           approximately 130 residues of proteins conserved from
           fungi to humans. The proteins do contain a coiled-coil
           domain, but the function is unknown.
          Length = 128

 Score = 34.7 bits (80), Expect = 0.008
 Identities = 11/27 (40%), Positives = 17/27 (62%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKK 42
           K + E  +K EEEE ++E+ EEE +  
Sbjct: 96  KKQLEENRKLEEEEKEREELEEENDVT 122



 Score = 29.7 bits (67), Expect = 0.53
 Identities = 16/54 (29%), Positives = 25/54 (46%), Gaps = 14/54 (25%)

Query: 16  KSKREGRQKEEEEEGKKEK-------------KEEEEEKKERNEELDRRRRRRR 56
           K ++E R+KE +E   KEK             ++ EEE+KER E  +     + 
Sbjct: 72  KLQKE-REKEGDEFADKEKFVTSAYKKQLEENRKLEEEEKEREELEEENDVTKG 124


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 36.5 bits (84), Expect = 0.009
 Identities = 16/32 (50%), Positives = 25/32 (78%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           S+ E  ++EEEEE ++E++EEEEE++E  E L
Sbjct: 863 SEEEEEEEEEEEEEEEEEEEEEEEEEENEEPL 894



 Score = 35.7 bits (82), Expect = 0.014
 Identities = 15/27 (55%), Positives = 22/27 (81%)

Query: 21  GRQKEEEEEGKKEKKEEEEEKKERNEE 47
           G  +EEEEE ++E++EEEEE++E  EE
Sbjct: 861 GDSEEEEEEEEEEEEEEEEEEEEEEEE 887



 Score = 35.4 bits (81), Expect = 0.021
 Identities = 15/28 (53%), Positives = 22/28 (78%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           +G   EEEEE ++E++EEEEE++E  EE
Sbjct: 859 DGGDSEEEEEEEEEEEEEEEEEEEEEEE 886



 Score = 34.6 bits (79), Expect = 0.038
 Identities = 14/28 (50%), Positives = 23/28 (82%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           +  ++EEEEE ++E++EEEEE++E  EE
Sbjct: 862 DSEEEEEEEEEEEEEEEEEEEEEEEEEE 889



 Score = 33.0 bits (75), Expect = 0.10
 Identities = 15/38 (39%), Positives = 26/38 (68%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           E  ++EEEEE ++E++EEEEE+ E    L+    R+++
Sbjct: 868 EEEEEEEEEEEEEEEEEEEEEENEEPLSLEWPETRQKQ 905



 Score = 28.4 bits (63), Expect = 3.6
 Identities = 11/32 (34%), Positives = 17/32 (53%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           K   +G       +G   ++EEEEE++E  EE
Sbjct: 846 KQDEKGVDGGGGSDGGDSEEEEEEEEEEEEEE 877


>gnl|CDD|219905 pfam08564, CDC37_C, Cdc37 C terminal domain.  Cdc37 is a protein
          required for the activity of numerous eukaryotic
          protein kinases. This domain corresponds to the C
          terminal domain whose function is unclear. It is found
          C terminal to the Hsp90 chaperone (Heat shock protein
          90) binding domain pfam08565 and the N terminal kinase
          binding domain of Cdc37 pfam03234.
          Length = 89

 Score = 33.9 bits (78), Expect = 0.010
 Identities = 15/41 (36%), Positives = 25/41 (60%)

Query: 7  CNCYGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
          C   GL VP +K EG ++ +E E +  ++E E+E+ E  +E
Sbjct: 49 CIDSGLWVPNAKIEGEKEFKELEEEYNEEEAEKEEIEEEDE 89


>gnl|CDD|100110 cd05832, Ribosomal_L12p, Ribosomal protein L12p. This subfamily
          includes archaeal L12p, the protein that is
          functionally equivalent to L7/L12 in bacteria and the
          P1 and P2 proteins in eukaryotes. L12p is homologous to
          P1 and P2 but is not homologous to bacterial L7/L12. It
          is located in the L12 stalk, with proteins L10, L11,
          and 23S rRNA. L12p is the only protein in the ribosome
          to occur as multimers, always appearing as sets of
          dimers. Recent data indicate that most archaeal species
          contain six copies of L12p (three homodimers), while
          eukaryotes have four copies (two heterodimers), and
          bacteria may have four or six copies (two or three
          homodimers), depending on the species. The organization
          of proteins within the stalk has been characterized
          primarily in bacteria, where L7/L12 forms either two or
          three homodimers and each homodimer binds to the
          extended C-terminal helix of L10. L7/L12 is attached to
          the ribosome through L10 and is the only ribosomal
          protein that does not directly interact with rRNA.
          Archaeal L12p is believed to function in a similar
          fashion. However, hybrid ribosomes containing the large
          subunit from E. coli with an archaeal stalk are able to
          bind archaeal and eukaryotic elongation factors but not
          bacterial elongation factors. In several mesophilic and
          thermophilic archaeal species, the binding of 23S rRNA
          to protein L11 and to the L10/L12p pentameric complex
          was found to be temperature-dependent and cooperative.
          Length = 106

 Score = 34.0 bits (78), Expect = 0.010
 Identities = 12/26 (46%), Positives = 20/26 (76%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEEL 48
           +E+ EE ++EKK+EEE+++E  E L
Sbjct: 74 AEEKAEEKEEEKKKEEEKEEEEEEAL 99



 Score = 33.6 bits (77), Expect = 0.016
 Identities = 10/25 (40%), Positives = 18/25 (72%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEE 47
             EE+  +KE+++++EE+KE  EE
Sbjct: 72 AAAEEKAEEKEEEKKKEEEKEEEEE 96



 Score = 32.5 bits (74), Expect = 0.043
 Identities = 12/26 (46%), Positives = 19/26 (73%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEE 40
            +  E  +++EEE+ K+E+KEEEEE
Sbjct: 71 AAAAEEKAEEKEEEKKKEEEKEEEEE 96



 Score = 30.9 bits (70), Expect = 0.14
 Identities = 12/27 (44%), Positives = 19/27 (70%)

Query: 14 VPKSKREGRQKEEEEEGKKEKKEEEEE 40
             +  E + +E+EEE KKE+++EEEE
Sbjct: 69 AAAAAAEEKAEEKEEEKKKEEEKEEEE 95



 Score = 29.4 bits (66), Expect = 0.56
 Identities = 11/26 (42%), Positives = 20/26 (76%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEE 40
            ++ +  +KEEE++ ++EK+EEEEE
Sbjct: 72 AAAEEKAEEKEEEKKKEEEKEEEEEE 97



 Score = 28.6 bits (64), Expect = 0.82
 Identities = 11/23 (47%), Positives = 15/23 (65%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
              E K E+KEEE++K+E  EE
Sbjct: 71 AAAAEEKAEEKEEEKKKEEEKEE 93


>gnl|CDD|115071 pfam06390, NESP55, Neuroendocrine-specific golgi protein P55
           (NESP55).  This family consists of several mammalian
           neuroendocrine-specific golgi protein P55 (NESP55)
           sequences. NESP55 is a novel member of the chromogranin
           family and is a soluble, acidic, heat-stable secretory
           protein that is expressed exclusively in endocrine and
           nervous tissues, although less widely than
           chromogranins.
          Length = 261

 Score = 35.6 bits (81), Expect = 0.012
 Identities = 19/49 (38%), Positives = 30/49 (61%), Gaps = 4/49 (8%)

Query: 15  PKSKREGRQKEEEEEGK--KEKKEEEEEKKERNEELDRRRRRR--RRRD 59
           P+S REG + E     K  ++ +EEEEEK+E  ++  R + ++  RRRD
Sbjct: 196 PESAREGEEPERGPLDKDPRDPEEEEEEKEEEKQQPHRCKPKKPARRRD 244


>gnl|CDD|233171 TIGR00890, 2A0111, oxalate/formate antiporter family transporter.
           This subfamily belongs to the major facilitator family.
           Members include the oxalate/formate antiporter of
           Oxalobacter formigenes, where one substrate is
           decarboxylated in the cytosol into the other to consume
           a proton and drive an ion gradient [Transport and
           binding proteins, Carbohydrates, organic alcohols, and
           acids].
          Length = 377

 Score = 35.5 bits (82), Expect = 0.013
 Identities = 31/147 (21%), Positives = 55/147 (37%), Gaps = 14/147 (9%)

Query: 65  GSSNLAITRSIFFLGSLLGGFILSWVADRYG-RITAVLGSHVVSFLGVALTPFSKDVVLF 123
           G + +AI  ++  +G  +   +   +AD++G R  A+LG  ++  LG      +  +   
Sbjct: 36  GVTAVAIWFTLLLIGLAMSMPVGGLLADKFGPRAVAMLGG-ILYGLGFTFYAIADSLAAL 94

Query: 124 SLSRFLTGVGHFNAFIFYYIIVLECVGPKW----RTFAMTFPFLIFYTVSEVALPWIAYY 179
            L+  L   G     I Y I +   V  KW    R  A       +   S +  P I   
Sbjct: 95  YLTYGLASAG---VGIAYGIALNTAV--KWFPDKRGLASGIIIGGYGLGSFILSPLITSV 149

Query: 180 LADWQW---ISVITIFPLIVGLIVAIF 203
           +           + I  L+V ++ A  
Sbjct: 150 INLEGVPAAFIYMGIIFLLVIVLGAFL 176



 Score = 35.5 bits (82), Expect = 0.013
 Identities = 19/91 (20%), Positives = 34/91 (37%), Gaps = 2/91 (2%)

Query: 64  DGSSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDV--V 121
                L +  SI  + +  G   L  ++D+ GR   +     +S +G+A   F   +  V
Sbjct: 237 LSDGFLVLAVSISSIFNGGGRPFLGALSDKIGRQKTMSIVFGISAVGMAAMLFIPMLNDV 296

Query: 122 LFSLSRFLTGVGHFNAFIFYYIIVLECVGPK 152
           LF  +  L           +  +V +  GP 
Sbjct: 297 LFLATVALVFFTWGGTISLFPSLVSDIFGPA 327


>gnl|CDD|237171 PRK12678, PRK12678, transcription termination factor Rho;
           Provisional.
          Length = 672

 Score = 35.6 bits (83), Expect = 0.013
 Identities = 14/45 (31%), Positives = 22/45 (48%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            ++ R G  ++  E G++   E   +  ER EE +R  RRRR   
Sbjct: 134 GEAARRGAARKAGEGGEQPATEARADAAERTEEEERDERRRRGDR 178



 Score = 34.1 bits (79), Expect = 0.041
 Identities = 10/45 (22%), Positives = 24/45 (53%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           + +R G +++ + E ++ ++   EE+    ++ DRR RR +    
Sbjct: 171 ERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRR 215



 Score = 33.3 bits (77), Expect = 0.089
 Identities = 10/42 (23%), Positives = 20/42 (47%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
               ++ E E G++ ++EE     +  +  DRR +  RR + 
Sbjct: 177 DREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRREER 218



 Score = 32.6 bits (75), Expect = 0.14
 Identities = 13/43 (30%), Positives = 18/43 (41%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K+   G Q   E      ++ EEEE+ ER    DR  R+    
Sbjct: 144 KAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAE 186



 Score = 32.2 bits (74), Expect = 0.17
 Identities = 11/44 (25%), Positives = 22/44 (50%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
             R+G  ++  +  ++  + EE  +++  +   RRRRR RR   
Sbjct: 196 RGRDGDDRDRRDRREQGDRREERGRRDGGDRRGRRRRRDRRDAR 239



 Score = 31.8 bits (73), Expect = 0.21
 Identities = 11/45 (24%), Positives = 21/45 (46%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           +     R+ + E+   + ++ E   ++ER  + D R RR RR   
Sbjct: 168 ERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQG 212



 Score = 31.8 bits (73), Expect = 0.25
 Identities = 11/44 (25%), Positives = 18/44 (40%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            +R+ R  +  E+      ++ E +  R     R R RR RR  
Sbjct: 234 DRRDARGDDNREDRGDRDGDDGEGRGGRRGRRFRDRDRRGRRGG 277



 Score = 31.0 bits (71), Expect = 0.40
 Identities = 14/44 (31%), Positives = 28/44 (63%), Gaps = 2/44 (4%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKER--NEELDRRRRRRRRRDN 60
           RE R ++ ++  +++++E+ + ++ER   +  DRR RRRRR   
Sbjct: 193 REERGRDGDDRDRRDRREQGDRREERGRRDGGDRRGRRRRRDRR 236



 Score = 30.6 bits (70), Expect = 0.60
 Identities = 10/44 (22%), Positives = 22/44 (50%), Gaps = 1/44 (2%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            + E  + E     ++ +  ++ ++++R E+   RR  R RRD 
Sbjct: 181 RQAEAERGERGRREERGRDGDDRDRRDRREQ-GDRREERGRRDG 223



 Score = 29.1 bits (66), Expect = 1.7
 Identities = 6/42 (14%), Positives = 17/42 (40%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
               ++E+ +  ++  + +  +++ R    DRR  R      
Sbjct: 204 DRRDRREQGDRREERGRRDGGDRRGRRRRRDRRDARGDDNRE 245



 Score = 29.1 bits (66), Expect = 1.8
 Identities = 8/42 (19%), Positives = 20/42 (47%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
               + E  E G++E++  + + ++R +  ++  RR  R   
Sbjct: 180 DRQAEAERGERGRREERGRDGDDRDRRDRREQGDRREERGRR 221



 Score = 26.8 bits (60), Expect = 9.0
 Identities = 11/44 (25%), Positives = 16/44 (36%), Gaps = 2/44 (4%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERN--EELDRRRRRRRRRDN 60
           R GR++  +    +     E+         E    RR RR RD 
Sbjct: 226 RRGRRRRRDRRDARGDDNREDRGDRDGDDGEGRGGRRGRRFRDR 269


>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex.  This
          entry is characterized by proteins with alternating
          conserved and low-complexity regions. Bud13 together
          with Snu17p and a newly identified factor,
          Pml1p/Ylr016c, form a novel trimeric complex called
          the RES complex, the pre-mRNA retention and splicing
          complex. Subunits of this complex are not essential for
          viability of yeasts but they are required for efficient
          splicing in vitro and in vivo. Furthermore,
          inactivation of this complex causes pre-mRNA leakage
          from the nucleus. Bud13 contains a unique,
          phylogenetically conserved C-terminal region of unknown
          function.
          Length = 141

 Score = 34.2 bits (79), Expect = 0.015
 Identities = 13/36 (36%), Positives = 25/36 (69%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          ++EE+E  K+EK+ +EE++KE  + L ++  R +R 
Sbjct: 17 KREEKEREKEEKERKEEKEKEWGKGLVQKEEREKRL 52



 Score = 28.8 bits (65), Expect = 1.2
 Identities = 17/43 (39%), Positives = 31/43 (72%), Gaps = 8/43 (18%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEE--------EEKKERNEELDR 50
          + KRE +++E+EE+ +KE+KE+E        EE+++R EEL++
Sbjct: 15 EEKREEKEREKEEKERKEEKEKEWGKGLVQKEEREKRLEELEK 57


>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing.  This is a family of
           proteins that are involved in rRNA processing. In a
           localisation study they were found to localise to the
           nucleus and nucleolus. The family also includes other
           metazoa members from plants to mammals where the protein
           has been named BR22 and is associated with TTF-1,
           thyroid transcription factor 1. In the lungs, the family
           binds TTF-1 to form a complex which influences the
           expression of the key lung surfactant protein-B (SP-B)
           and -C (SP-C), the small hydrophobic surfactant proteins
           that maintain surface tension in alveoli.
          Length = 150

 Score = 34.5 bits (79), Expect = 0.016
 Identities = 13/43 (30%), Positives = 27/43 (62%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K +   ++K E+ E +  K+++E EK E +++  + R RRR++
Sbjct: 77  KKEIAKQRKREQREKELAKRQKELEKIELSKKKQKERERRRKK 119



 Score = 34.5 bits (79), Expect = 0.016
 Identities = 14/51 (27%), Positives = 28/51 (54%), Gaps = 5/51 (9%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE-----LDRRRRRRRRRDN 60
            K K+   +KE  ++ K+E++E+E  K+++  E       +++ R RRR  
Sbjct: 69  EKKKKLDEKKEIAKQRKREQREKELAKRQKELEKIELSKKKQKERERRRKK 119



 Score = 33.3 bits (76), Expect = 0.036
 Identities = 13/43 (30%), Positives = 30/43 (69%), Gaps = 1/43 (2%)

Query: 17  SKREGRQKEEEEEGKKEKKEEE-EEKKERNEELDRRRRRRRRR 58
           +K+  R++ E+E  K++K+ E+ E  K++ +E +RRR++  ++
Sbjct: 81  AKQRKREQREKELAKRQKELEKIELSKKKQKERERRRKKLTKK 123



 Score = 31.4 bits (71), Expect = 0.19
 Identities = 18/57 (31%), Positives = 29/57 (50%), Gaps = 9/57 (15%)

Query: 11  GLIVP-KSKREGRQKEEEE----EGKK---EKKEEEEEKK-ERNEELDRRRRRRRRR 58
           G  VP K   E + K  +E    E KK   EKKE  +++K E+ E+   +R++   +
Sbjct: 46  GYAVPEKESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKRQKELEK 102


>gnl|CDD|206063 pfam13892, DBINO, DNA-binding domain.  DBINO is a DNA-binding
           domain found on global transcription activator SNF2L1
           proteins and chromatin re-modelling proteins.
          Length = 140

 Score = 34.2 bits (79), Expect = 0.017
 Identities = 18/57 (31%), Positives = 29/57 (50%), Gaps = 11/57 (19%)

Query: 13  IVPKSKREGRQ---------KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
              ++KR  R+         KEE E  K+ +KE  E+ K+  EE  R  +R++R+ N
Sbjct: 63  TQLRAKRLMREMLLFWKKNEKEERELRKRAEKEALEQAKK--EEELREAKRQQRKLN 117


>gnl|CDD|235795 PRK06402, rpl12p, 50S ribosomal protein L12P; Reviewed.
          Length = 106

 Score = 33.4 bits (77), Expect = 0.017
 Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
                +K+++EEEEE+KE +EE
Sbjct: 73 AAAAAEEKKEEEEEEEEKEESEE 95



 Score = 32.6 bits (75), Expect = 0.032
 Identities = 13/24 (54%), Positives = 16/24 (66%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEEL 48
              E KKE++EEEEEK+E  EE 
Sbjct: 74 AAAAEEKKEEEEEEEEKEESEEEA 97



 Score = 32.2 bits (74), Expect = 0.054
 Identities = 8/21 (38%), Positives = 15/21 (71%)

Query: 23 QKEEEEEGKKEKKEEEEEKKE 43
              EE+ ++E++EEE+E+ E
Sbjct: 74 AAAAEEKKEEEEEEEEKEESE 94



 Score = 31.1 bits (71), Expect = 0.11
 Identities = 8/21 (38%), Positives = 16/21 (76%)

Query: 23 QKEEEEEGKKEKKEEEEEKKE 43
             EE++ ++E++EE+EE +E
Sbjct: 75 AAAEEKKEEEEEEEEKEESEE 95



 Score = 30.7 bits (70), Expect = 0.17
 Identities = 11/21 (52%), Positives = 16/21 (76%)

Query: 20 EGRQKEEEEEGKKEKKEEEEE 40
             +K+EEEE ++EK+E EEE
Sbjct: 76 AAEEKKEEEEEEEEKEESEEE 96



 Score = 29.9 bits (68), Expect = 0.32
 Identities = 7/23 (30%), Positives = 14/23 (60%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
                +++K+EEEEE+++   E
Sbjct: 72 AAAAAAEEKKEEEEEEEEKEESE 94



 Score = 28.8 bits (65), Expect = 0.82
 Identities = 8/20 (40%), Positives = 17/20 (85%)

Query: 24 KEEEEEGKKEKKEEEEEKKE 43
           EE++E ++E++E+EE ++E
Sbjct: 77 AEEKKEEEEEEEEKEESEEE 96



 Score = 27.6 bits (62), Expect = 1.8
 Identities = 10/23 (43%), Positives = 14/23 (60%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
                 +EKKEEEEE++E+ E 
Sbjct: 71 AAAAAAAEEKKEEEEEEEEKEES 93


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 34.5 bits (80), Expect = 0.026
 Identities = 12/37 (32%), Positives = 22/37 (59%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           R++EEE+  K  ++E +EE +E+ EE  +  R  +  
Sbjct: 265 REEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLA 301



 Score = 31.1 bits (71), Expect = 0.40
 Identities = 16/49 (32%), Positives = 31/49 (63%), Gaps = 8/49 (16%)

Query: 16  KSKREGRQ---KEEEEEGKKEKKEEEE-----EKKERNEELDRRRRRRR 56
           K+  E RQ   +E++EE KKE++E +      E++ + EE +R+++ R+
Sbjct: 274 KAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQRKLEEKERKKQARK 322



 Score = 26.8 bits (60), Expect = 8.1
 Identities = 8/41 (19%), Positives = 22/41 (53%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
             E  +K ++   ++E+K  +  ++ER EE   ++  +++ 
Sbjct: 254 SPEVLRKVDKTREEEEEKILKAAEEERQEEAQEKKEEKKKE 294


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 34.8 bits (81), Expect = 0.026
 Identities = 15/52 (28%), Positives = 28/52 (53%), Gaps = 6/52 (11%)

Query: 13 IVPKSKREGRQKEEE------EEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          I+ ++K+E    ++E      EE  K + E E+E +ER  EL +  +R  ++
Sbjct: 43 ILEEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNELQKLEKRLLQK 94



 Score = 28.2 bits (64), Expect = 3.2
 Identities = 23/63 (36%), Positives = 28/63 (44%), Gaps = 14/63 (22%)

Query: 11 GLIVPKSKREGRQKEEEEEGKK----EKKEEEEEKKER----NEELDRRRR------RRR 56
          G  V K   E + KE EEE K+     KKE E  KKE      EE+ + R       R R
Sbjct: 21 GYFVRKKIAEAKIKEAEEEAKRILEEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRER 80

Query: 57 RRD 59
          R +
Sbjct: 81 RNE 83


>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1.  All
           proteins in this family for which functions are known
           are cyclin dependent protein kinases that are components
           of TFIIH, a complex that is involved in nucleotide
           excision repair and transcription initiation. Also known
           as MAT1 (menage a trois 1). This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 309

 Score = 34.4 bits (79), Expect = 0.027
 Identities = 12/48 (25%), Positives = 30/48 (62%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           +I    ++  R++EE EE  + +KEEEE+++   ++ +  ++  +R++
Sbjct: 132 VIQKNKEKSTREQEELEEALEFEKEEEEQRRLLLQKEEEEQQMNKRKN 179



 Score = 27.1 bits (60), Expect = 7.9
 Identities = 11/28 (39%), Positives = 19/28 (67%), Gaps = 3/28 (10%)

Query: 23  QKEEEEEGKKEKKEEEEE---KKERNEE 47
           +KEEEE+ +   ++EEEE    K +N++
Sbjct: 154 EKEEEEQRRLLLQKEEEEQQMNKRKNKQ 181


>gnl|CDD|219761 pfam08243, SPT2, SPT2 chromatin protein.  This family includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 116

 Score = 32.9 bits (75), Expect = 0.029
 Identities = 8/36 (22%), Positives = 22/36 (61%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            ++EE    +  + E+E +  R EE ++R+++++ +
Sbjct: 81  IQKEERRSARMARLEDERELAREEEEEKRKKKKKNK 116



 Score = 28.3 bits (63), Expect = 1.2
 Identities = 11/30 (36%), Positives = 21/30 (70%), Gaps = 1/30 (3%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERN 45
           +S R  R ++E E   +E++EE+ +KK++N
Sbjct: 87  RSARMARLEDEREL-AREEEEEKRKKKKKN 115



 Score = 26.8 bits (59), Expect = 4.9
 Identities = 7/38 (18%), Positives = 20/38 (52%)

Query: 21  GRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
             + ++EE         E+E++   EE + +R+++++ 
Sbjct: 78  FMEIQKEERRSARMARLEDERELAREEEEEKRKKKKKN 115


>gnl|CDD|221408 pfam12072, DUF3552, Domain of unknown function (DUF3552).  This
          presumed domain is functionally uncharacterized. This
          domain is found in bacteria, archaea and eukaryotes.
          This domain is about 200 amino acids in length. This
          domain is found associated with pfam00013, pfam01966.
          This domain has a single completely conserved residue A
          that may be functionally important.
          Length = 201

 Score = 33.7 bits (78), Expect = 0.033
 Identities = 17/54 (31%), Positives = 29/54 (53%), Gaps = 6/54 (11%)

Query: 13 IVPKSKREGRQKEEE------EEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          I+ ++K+E    ++E      EE  K + E E E KER  EL R+ +R  +++ 
Sbjct: 39 IIEEAKKEAEALKKEALLEAKEEIHKLRAEAERELKERRNELQRQEKRLLQKEE 92



 Score = 27.5 bits (62), Expect = 3.6
 Identities = 17/47 (36%), Positives = 27/47 (57%), Gaps = 7/47 (14%)

Query: 17  SKREGR--QKEE---EEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            ++E R  QKEE    ++   EKKEE  E+KE  +EL  R+++   +
Sbjct: 81  QRQEKRLLQKEETLDRKDESLEKKEESLEEKE--KELAARQQQLEEK 125


>gnl|CDD|222060 pfam13347, MFS_2, MFS/sugar transport protein.  This family is part
           of the major facilitator superfamily of membrane
           transport proteins.
          Length = 425

 Score = 34.5 bits (80), Expect = 0.033
 Identities = 16/63 (25%), Positives = 31/63 (49%), Gaps = 2/63 (3%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPF--SKDVVLFSLSRFLTG 131
            I  + ++LG  +  W+A R+G+    L   +++ +G+ L  F     + LF +   L G
Sbjct: 264 LIGTIAAILGAPLWPWLAKRFGKKRTFLLGMLLAAIGLVLLFFLPPGSLWLFLVLVVLAG 323

Query: 132 VGH 134
           +G 
Sbjct: 324 IGL 326


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins are involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 34.3 bits (79), Expect = 0.035
 Identities = 14/52 (26%), Positives = 30/52 (57%), Gaps = 10/52 (19%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEK-----KERNEELDRR-----RRRRRRRD 59
            +E R+KE E E K+EK+ +++       KE  +E+ ++     R++ +R++
Sbjct: 284 NKEKRRKELEREAKEEKQLKKKLAQLARLKEIAKEVAQKEKARARKKEQRKE 335



 Score = 31.6 bits (72), Expect = 0.27
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 23  QKEEEEEGKKEKKEEEEEKKER 44
           Q+E E+E K EKK +E E+ E 
Sbjct: 200 QEEYEKEVKAEKKRQELERVEE 221



 Score = 29.3 bits (66), Expect = 1.3
 Identities = 11/47 (23%), Positives = 20/47 (42%), Gaps = 3/47 (6%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKK---ERNEELDRRRRRRRRR 58
            +S  E   +  E E +   K    ++K   +RN+E  R+   R  +
Sbjct: 251 EESDDESAWEGFESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAK 297



 Score = 27.0 bits (60), Expect = 7.6
 Identities = 12/38 (31%), Positives = 25/38 (65%), Gaps = 3/38 (7%)

Query: 16  KSKREGR-QKEEEEEGKKEKKEEEEEK--KERNEELDR 50
           + KR+ + Q+ +E+  K+ ++E +EEK  K++  +L R
Sbjct: 274 RPKRKTKAQRNKEKRRKELEREAKEEKQLKKKLAQLAR 311


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
          splicing factor.  These splicing factors consist of an
          N-terminal arginine-rich low complexity domain followed
          by three tandem RNA recognition motifs (pfam00076). The
          well-characterized members of this family are
          auxiliary components of the U2 small nuclear
          ribonucleoprotein splicing factor (U2AF). These
          proteins are closely related to the CC1-like subfamily
          of splicing factors (TIGR01622). Members of this
          subfamily are found in plants, metazoa and fungi.
          Length = 509

 Score = 34.5 bits (79), Expect = 0.036
 Identities = 10/30 (33%), Positives = 15/30 (50%)

Query: 32 KEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
          +E   E E+ + R+ +    R RRR RD  
Sbjct: 3  EEPDREREKSRGRDRDRSSERPRRRSRDRS 32



 Score = 31.8 bits (72), Expect = 0.24
 Identities = 10/49 (20%), Positives = 23/49 (46%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCD 64
          + +   R++E+     +++  E   ++ R+    R R RR R  ++  D
Sbjct: 1  RDEEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYRED 49



 Score = 31.0 bits (70), Expect = 0.38
 Identities = 9/43 (20%), Positives = 18/43 (41%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          +S    R++  +    +++     E+  R +   R RRR   R
Sbjct: 19 RSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSR 61



 Score = 31.0 bits (70), Expect = 0.49
 Identities = 13/47 (27%), Positives = 23/47 (48%), Gaps = 6/47 (12%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEELDRRRRR----RRRRDNWVCDGS 66
          ++EE + ++EK    +  ++R+ E  RRR R     R R     + S
Sbjct: 1  RDEEPDREREKSRGRD--RDRSSERPRRRSRDRSRFRDRHRRSRERS 45


>gnl|CDD|214818 smart00784, SPT2, SPT2 chromatin protein.  This entry includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 106

 Score = 32.3 bits (74), Expect = 0.039
 Identities = 10/34 (29%), Positives = 20/34 (58%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           EEE    +  + E+ E++   +E +R +R R+R+
Sbjct: 73  EEERRSARLARLEDREEERLEKEEEREKRARKRK 106



 Score = 30.0 bits (68), Expect = 0.31
 Identities = 13/29 (44%), Positives = 19/29 (65%), Gaps = 2/29 (6%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKER 44
           +S R  R ++ EEE  + +KEEE EK+ R
Sbjct: 77  RSARLARLEDREEE--RLEKEEEREKRAR 103



 Score = 29.3 bits (66), Expect = 0.56
 Identities = 11/35 (31%), Positives = 20/35 (57%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           Q+EE    +  + E+ EE++   EE   +R R+R+
Sbjct: 72  QEEERRSARLARLEDREEERLEKEEEREKRARKRK 106



 Score = 28.1 bits (63), Expect = 1.6
 Identities = 9/33 (27%), Positives = 18/33 (54%)

Query: 26  EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           +EEE +  +    E+++E   E +  R +R R+
Sbjct: 72  QEEERRSARLARLEDREEERLEKEEEREKRARK 104


>gnl|CDD|218215 pfam04696, Pinin_SDK_memA, pinin/SDK/memA/ protein conserved
          region.  Members of this family have very varied
          localisations within the eukaryotic cell. pinin is
          known to localise at the desmosomes and is implicated
          in anchoring intermediate filaments to the desmosomal
          plaque. SDK2/3 is a dynamically localised nuclear
          protein thought to be involved in modulation of
          alternative pre-mRNA splicing. memA is a tumour marker
          preferentially expressed in human melanoma cell lines.
          A common feature of the members of this family is that
          they may all participate in regulating protein-protein
          interactions.
          Length = 131

 Score = 32.8 bits (75), Expect = 0.040
 Identities = 18/51 (35%), Positives = 30/51 (58%), Gaps = 8/51 (15%)

Query: 16 KSKREGRQKEEEEEGKK--EKKEEEEEKKERNEELDRRRR-----RRRRRD 59
          K  +E  +   +E+ +   E+K EE+EK+ER EEL + +R     RRR++ 
Sbjct: 22 KFSQEESRLTSKEKRRAEIEQKLEEQEKQER-EELRKEKRELFEERRRKQL 71



 Score = 31.3 bits (71), Expect = 0.15
 Identities = 15/42 (35%), Positives = 27/42 (64%)

Query: 14 VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
             SK + R + E++  ++EK+E EE +KE+ E  + RRR++
Sbjct: 29 RLTSKEKRRAEIEQKLEEQEKQEREELRKEKRELFEERRRKQ 70



 Score = 29.8 bits (67), Expect = 0.49
 Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 22 RQKEEEEEGKKEKKEEE-EEKKERNEELDRRRRRRRRRDN 60
            +EE     KEK+  E E+K E  E+ +R   R+ +R+ 
Sbjct: 23 FSQEESRLTSKEKRRAEIEQKLEEQEKQEREELRKEKREL 62



 Score = 28.6 bits (64), Expect = 1.1
 Identities = 14/31 (45%), Positives = 22/31 (70%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
          K E ++K+E EE +KEK+E  EE++ +  EL
Sbjct: 43 KLEEQEKQEREELRKEKRELFEERRRKQLEL 73


>gnl|CDD|146486 pfam03879, Cgr1, Cgr1 family.  Members of this family are
          coiled-coil proteins that are involved in pre-rRNA
          processing.
          Length = 105

 Score = 32.4 bits (74), Expect = 0.043
 Identities = 12/54 (22%), Positives = 25/54 (46%), Gaps = 15/54 (27%)

Query: 19 REGRQKEEEEEGKKEKKEEEEEKKERNEE---------------LDRRRRRRRR 57
          RE   K+E+E  ++ + +  +E++   EE               ++R +RR +R
Sbjct: 46 REKELKDEKEAERQRRIQAIKERRAAKEEKERYEKMAAKMHAKKVERLKRREKR 99



 Score = 30.1 bits (68), Expect = 0.30
 Identities = 10/38 (26%), Positives = 22/38 (57%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
          KR  ++ E++    +EK+ ++E++ ER   +   + RR
Sbjct: 32 KRMEKRLEQQAIKAREKELKDEKEAERQRRIQAIKERR 69



 Score = 27.4 bits (61), Expect = 2.5
 Identities = 14/47 (29%), Positives = 23/47 (48%), Gaps = 3/47 (6%)

Query: 15 PKSKREG-RQKEEEEEGKKEKKEEEEEKKERNEELDRRR--RRRRRR 58
          PKSK     ++ E+   ++  K  E+E K+  E   +RR    + RR
Sbjct: 23 PKSKLTSWEKRMEKRLEQQAIKAREKELKDEKEAERQRRIQAIKERR 69


>gnl|CDD|221121 pfam11489, DUF3210, Protein of unknown function (DUF3210).  This is
           a family of proteins conserved in yeasts. The function
           is not known. The Schizosaccharomyces pombe member is
           SPBC18E5.07 and the Saccharomyces cerevisiae member is
           AIM21.
          Length = 671

 Score = 34.1 bits (78), Expect = 0.043
 Identities = 17/40 (42%), Positives = 23/40 (57%), Gaps = 3/40 (7%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR---RR 58
           +  E  +E  KEKKEE+E+ KE+    D R+ R R   RR
Sbjct: 603 KVLESPKEPSKEKKEEDEDTKEKAPLSDARKGRARGPARR 642


>gnl|CDD|218684 pfam05672, MAP7, MAP7 (E-MAP-115) family.  The organisation of
          microtubules varies with the cell type and is
          presumably controlled by tissue-specific
          microtubule-associated proteins (MAPs). The 115-kDa
          epithelial MAP (E-MAP-115/MAP7) has been identified as
          a microtubule-stabilising protein predominantly
          expressed in cell lines of epithelial origin. The
          binding of this microtubule associated protein is
          nucleotide independent.
          Length = 171

 Score = 33.1 bits (75), Expect = 0.051
 Identities = 16/48 (33%), Positives = 31/48 (64%), Gaps = 3/48 (6%)

Query: 13 IVPKSKREGR-QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          ++ + +R+ R Q+E+EE+ ++E+  EE+++ ER E   R    R RR+
Sbjct: 27 LLAEKRRQAREQREQEEQERREQ--EEQDRLEREELKRRAAEERLRRE 72



 Score = 32.8 bits (74), Expect = 0.064
 Identities = 14/45 (31%), Positives = 28/45 (62%), Gaps = 1/45 (2%)

Query: 16 KSKREGRQKEEEEEGKKEK-KEEEEEKKERNEELDRRRRRRRRRD 59
          + ++E R++EE++  ++E+ K    E++ R EE  RR+   R R+
Sbjct: 41 QEEQERREQEEQDRLEREELKRRAAEERLRREEEARRQEEERARE 85



 Score = 31.6 bits (71), Expect = 0.16
 Identities = 17/44 (38%), Positives = 26/44 (59%), Gaps = 3/44 (6%)

Query: 19 REGRQKEEEEEGKKE---KKEEEEEKKERNEELDRRRRRRRRRD 59
          RE R++EE+E  ++E   + E EE K+   EE  RR    RR++
Sbjct: 36 REQREQEEQERREQEEQDRLEREELKRRAAEERLRREEEARRQE 79



 Score = 31.2 bits (70), Expect = 0.25
 Identities = 14/46 (30%), Positives = 25/46 (54%), Gaps = 4/46 (8%)

Query: 16 KSKREGRQKEEEEEGKKEKKE----EEEEKKERNEELDRRRRRRRR 57
          + + E  ++E+EE+ + E++E      EE+  R EE  R+   R R
Sbjct: 39 REQEEQERREQEEQDRLEREELKRRAAEERLRREEEARRQEEERAR 84



 Score = 30.1 bits (67), Expect = 0.53
 Identities = 15/42 (35%), Positives = 26/42 (61%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           K K E  +K+E+EE ++ +K++EE +    EE +R R  R +
Sbjct: 91  KRKAEEEEKQEQEEQERIQKQKEEAEARAREEAERMRLEREK 132



 Score = 30.1 bits (67), Expect = 0.61
 Identities = 15/44 (34%), Positives = 24/44 (54%), Gaps = 3/44 (6%)

Query: 16  KSKREGRQKEEEEEGKKEKKEE---EEEKKERNEELDRRRRRRR 56
           + +R  +QKEE E   +E+ E    E EK  +  E +R  R++R
Sbjct: 104 EQERIQKQKEEAEARAREEAERMRLEREKHFQQIEQERLERKKR 147



 Score = 28.1 bits (62), Expect = 2.2
 Identities = 13/42 (30%), Positives = 25/42 (59%), Gaps = 3/42 (7%)

Query: 20  EGRQKEEEEEGKKE---KKEEEEEKKERNEELDRRRRRRRRR 58
           E R + EEE  ++E    +E+EE+ K + EE +++ +  + R
Sbjct: 66  EERLRREEEARRQEEERAREKEEKAKRKAEEEEKQEQEEQER 107



 Score = 28.1 bits (62), Expect = 2.7
 Identities = 17/49 (34%), Positives = 25/49 (51%), Gaps = 8/49 (16%)

Query: 18  KREGRQKEEEEEGKKEKKEE----EEEKKER----NEELDRRRRRRRRR 58
           ++E  +  E+EE  K K EE    E+E++ER     EE + R R    R
Sbjct: 77  RQEEERAREKEEKAKRKAEEEEKQEQEEQERIQKQKEEAEARAREEAER 125



 Score = 27.8 bits (61), Expect = 2.9
 Identities = 15/46 (32%), Positives = 26/46 (56%), Gaps = 7/46 (15%)

Query: 16  KSKREGRQKEEEEEGKKEKK-------EEEEEKKERNEELDRRRRR 54
           K + E R +EE E  + E++       +E  E+K+R EE+ +R R+
Sbjct: 112 KEEAEARAREEAERMRLEREKHFQQIEQERLERKKRLEEIMKRTRK 157



 Score = 27.0 bits (59), Expect = 5.2
 Identities = 12/43 (27%), Positives = 26/43 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           + +RE   + +EEE  +EK+E+ + K E  E+ ++  + R ++
Sbjct: 68  RLRREEEARRQEEERAREKEEKAKRKAEEEEKQEQEEQERIQK 110



 Score = 27.0 bits (59), Expect = 6.7
 Identities = 14/44 (31%), Positives = 26/44 (59%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           K ++  R+ EEEE+ ++E++E  +++KE  E   R    R R +
Sbjct: 86  KEEKAKRKAEEEEKQEQEEQERIQKQKEEAEARAREEAERMRLE 129



 Score = 26.6 bits (58), Expect = 6.9
 Identities = 11/53 (20%), Positives = 29/53 (54%), Gaps = 9/53 (16%)

Query: 16  KSKREGRQKEE------EEEGKKE---KKEEEEEKKERNEELDRRRRRRRRRD 59
           + +++  ++EE      EE  ++E   +++EEE  +E+ E+  R+     +++
Sbjct: 49  QEEQDRLEREELKRRAAEERLRREEEARRQEEERAREKEEKAKRKAEEEEKQE 101


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 33.2 bits (76), Expect = 0.056
 Identities = 12/30 (40%), Positives = 19/30 (63%), Gaps = 2/30 (6%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERN 45
           K K+E  +++E EE  +EK  EEE+ +E  
Sbjct: 153 KIKKERAEEKEREE--EEKAAEEEKAREEE 180



 Score = 30.8 bits (70), Expect = 0.35
 Identities = 10/25 (40%), Positives = 17/25 (68%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNE 46
           +++ EE+E ++E+K  EEEK    E
Sbjct: 156 KERAEEKEREEEEKAAEEEKAREEE 180



 Score = 29.7 bits (67), Expect = 0.88
 Identities = 15/50 (30%), Positives = 27/50 (54%), Gaps = 1/50 (2%)

Query: 26  EEEEGKKEKKEEEE-EKKERNEELDRRRRRRRRRDNWVCDGSSNLAITRS 74
           E E+ KKE+ EE+E E++E+  E ++ R       N + + S +  + R 
Sbjct: 150 ELEKIKKERAEEKEREEEEKAAEEEKAREEEILTGNPLLNTSGDFKVKRR 199



 Score = 28.5 bits (64), Expect = 2.4
 Identities = 14/28 (50%), Positives = 20/28 (71%), Gaps = 4/28 (14%)

Query: 24  KEEEEEGKKEKKEEE---EEKKERNEEL 48
           K+E  E +KE++EEE   EE+K R EE+
Sbjct: 155 KKERAE-EKEREEEEKAAEEEKAREEEI 181


>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
           component YidC; Validated.
          Length = 429

 Score = 33.7 bits (77), Expect = 0.058
 Identities = 12/44 (27%), Positives = 19/44 (43%), Gaps = 9/44 (20%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELDRRR---------RRRRRR 58
             E+ E K  KKE  ++++    E++R           R R RR
Sbjct: 337 TAEKNEAKARKKEIAQKRRAAEREINREARQERAAAMARARARR 380



 Score = 29.0 bits (65), Expect = 2.0
 Identities = 15/50 (30%), Positives = 23/50 (46%), Gaps = 2/50 (4%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            +I P    E   +  E   KK +  E+ E K R +E+ ++RR   R  N
Sbjct: 315 MIITPWRAPELHAENAEI--KKTRTAEKNEAKARKKEIAQKRRAAEREIN 362


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
            biogenesis [Translation, ribosomal structure and
            biogenesis].
          Length = 1077

 Score = 33.9 bits (77), Expect = 0.059
 Identities = 16/46 (34%), Positives = 30/46 (65%), Gaps = 4/46 (8%)

Query: 16   KSKREGRQKEEEEE-GKKEKKEEEEEKKERNE---ELDRRRRRRRR 57
            K + E  Q+ +EEE GKKEK+ E+  +K  ++   E+ ++R +++R
Sbjct: 1032 KERMESLQRAKEEEIGKKEKEREQRIRKTIHDNYKEMAKKRLKKKR 1077



 Score = 28.6 bits (63), Expect = 3.0
 Identities = 14/53 (26%), Positives = 27/53 (50%), Gaps = 5/53 (9%)

Query: 12   LIVPKSKRE-----GRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            L VP   RE      R  +E  + ++EK+  E  ++ + EE+ ++ + R +R 
Sbjct: 1005 LPVPPECREKHEIKDRIVKERIKDQEEKERMESLQRAKEEEIGKKEKEREQRI 1057



 Score = 27.8 bits (61), Expect = 5.7
 Identities = 17/61 (27%), Positives = 29/61 (47%), Gaps = 3/61 (4%)

Query: 13   IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAIT 72
             + K + + ++++E  E  +  KEEE  KKE+  E   +R R+   DN+       L   
Sbjct: 1020 RIVKERIKDQEEKERMESLQRAKEEEIGKKEKERE---QRIRKTIHDNYKEMAKKRLKKK 1076

Query: 73   R 73
            R
Sbjct: 1077 R 1077


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
           domain is functionally uncharacterized. This domain is
           found in eukaryotes. This presumed domain is typically
           between 156 and 174 amino acids in length. This domain is
           found associated with pfam07780, pfam01728.
          Length = 154

 Score = 32.6 bits (75), Expect = 0.064
 Identities = 15/45 (33%), Positives = 26/45 (57%), Gaps = 5/45 (11%)

Query: 21  GRQKEEEEEGKKEKKEEEEEKKERN-----EELDRRRRRRRRRDN 60
           G  K+E+EE ++E+ E EE  +E       E+   + +R +RR+N
Sbjct: 95  GLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRREN 139



 Score = 32.3 bits (74), Expect = 0.076
 Identities = 13/50 (26%), Positives = 24/50 (48%), Gaps = 8/50 (16%)

Query: 16  KSKREGRQKEEEEEGKKEK--------KEEEEEKKERNEELDRRRRRRRR 57
           K K E  ++E E E   E+        KE  + K+E+  E +R+++   +
Sbjct: 99  KEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERKQKEILK 148



 Score = 30.3 bits (69), Expect = 0.41
 Identities = 11/41 (26%), Positives = 24/41 (58%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           K ++E  ++EE E  + +++E+ +E  E+     +R +RR 
Sbjct: 98  KKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRE 138



 Score = 29.9 bits (68), Expect = 0.49
 Identities = 7/34 (20%), Positives = 17/34 (50%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           + +E  +KE  + + EK+  NE   +   + + +
Sbjct: 119 QIDELLEKELAKLKREKRRENERKQKEILKEQMK 152



 Score = 29.6 bits (67), Expect = 0.61
 Identities = 15/44 (34%), Positives = 26/44 (59%), Gaps = 3/44 (6%)

Query: 19  REGRQKEEEEEGKKE---KKEEEEEKKERNEELDRRRRRRRRRD 59
            + ++KEEEEE + E     EEE+  +   +EL + +R +RR +
Sbjct: 96  LDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRREN 139



 Score = 29.6 bits (67), Expect = 0.71
 Identities = 6/38 (15%), Positives = 20/38 (52%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            +EE+ +   EK+  + ++++R E   +++   + +  
Sbjct: 115 DEEEQIDELLEKELAKLKREKRRENERKQKEILKEQMK 152



 Score = 27.3 bits (61), Expect = 3.7
 Identities = 8/34 (23%), Positives = 18/34 (52%), Gaps = 1/34 (2%)

Query: 23  QKEEEEEGKKEKKEEEEE-KKERNEELDRRRRRR 55
            +  E+E  K K+E+  E ++++ E L  + +  
Sbjct: 121 DELLEKELAKLKREKRRENERKQKEILKEQMKML 154


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 33.6 bits (77), Expect = 0.065
 Identities = 16/25 (64%), Positives = 21/25 (84%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELD 49
          EEEEE K+EKKEEEE+  ++ EE+D
Sbjct: 38 EEEEEEKEEKKEEEEKTTDKEEEVD 62



 Score = 31.3 bits (71), Expect = 0.39
 Identities = 12/27 (44%), Positives = 20/27 (74%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNE 46
          E  +K+EEEE   +K+EE +E++E+ E
Sbjct: 43 EKEEKKEEEEKTTDKEEEVDEEEEKEE 69



 Score = 30.5 bits (69), Expect = 0.56
 Identities = 12/42 (28%), Positives = 26/42 (61%)

Query: 14 VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
          VP  + E  ++E++EE +K   +EEE  +E  +E  +++ ++
Sbjct: 35 VPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKK 76


>gnl|CDD|224259 COG1340, COG1340, Uncharacterized archaeal coiled-coil protein
          [Function unknown].
          Length = 294

 Score = 33.1 bits (76), Expect = 0.069
 Identities = 12/36 (33%), Positives = 19/36 (52%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           E E + K+ K+E EE K++R+E          +RD
Sbjct: 9  DELELKRKQLKEEIEELKEKRDELRKEASELAEKRD 44



 Score = 29.3 bits (66), Expect = 1.4
 Identities = 12/32 (37%), Positives = 21/32 (65%), Gaps = 1/32 (3%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           ++  +++E+ EE  KE+ EE  EK +R E+L 
Sbjct: 251 EKAAKRREKREE-LKERAEEIYEKFKRGEKLT 281



 Score = 27.3 bits (61), Expect = 5.7
 Identities = 10/41 (24%), Positives = 14/41 (34%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           K KR             E++ E  EKK++   L     R  
Sbjct: 96  KEKRNEFNLGGRSIKSLEREIERLEKKQQTSVLTPEEEREL 136


>gnl|CDD|225133 COG2223, NarK, Nitrate/nitrite transporter [Inorganic ion transport
           and metabolism].
          Length = 417

 Score = 33.0 bits (76), Expect = 0.079
 Identities = 21/73 (28%), Positives = 31/73 (42%), Gaps = 14/73 (19%)

Query: 77  FLGSLLGGFILS---WVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           FL  L+G        W++DR G      G  V   + V +      +    LS FLTG G
Sbjct: 261 FLFPLIGALARPLGGWLSDRIG------GRRVTLAVFVGMA-----LAAALLSLFLTGFG 309

Query: 134 HFNAFIFYYIIVL 146
           H  +F+ +  + L
Sbjct: 310 HGGSFVVFVAVFL 322


>gnl|CDD|185300 PRK15402, PRK15402, multidrug efflux system translocase MdfA;
           Provisional.
          Length = 406

 Score = 33.0 bits (76), Expect = 0.080
 Identities = 18/57 (31%), Positives = 26/57 (45%), Gaps = 4/57 (7%)

Query: 81  LLGGFILSW----VADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           L GG  L W    ++DR GR   +L       L       ++ +  F+L RFL G+G
Sbjct: 58  LAGGMFLQWLLGPLSDRIGRRPVMLAGVAFFILTCLAILLAQSIEQFTLLRFLQGIG 114


>gnl|CDD|206034 pfam13863, DUF4200, Domain of unknown function (DUF4200).  This
          family is found in eukaryotes. It is a coiled-coil
          domain of unknown function.
          Length = 126

 Score = 31.8 bits (73), Expect = 0.081
 Identities = 16/40 (40%), Positives = 23/40 (57%), Gaps = 2/40 (5%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
          +R   +K EEE  KK +KE+EEE KE   EL+  +    +
Sbjct: 62 RRRAEKKAEEE--KKLRKEKEEEIKELKAELEELKAEIEK 99



 Score = 29.1 bits (66), Expect = 0.75
 Identities = 15/39 (38%), Positives = 24/39 (61%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
          K +R  ++ EEE++ +KEK+EE +E K   EEL     +
Sbjct: 61 KRRRAEKKAEEEKKLRKEKEEEIKELKAELEELKAEIEK 99



 Score = 26.0 bits (58), Expect = 9.5
 Identities = 11/44 (25%), Positives = 24/44 (54%), Gaps = 1/44 (2%)

Query: 16 KSKREGRQKEEEEEGKKEK-KEEEEEKKERNEELDRRRRRRRRR 58
          + +RE  + +   + K+E+ +  EE  K+R EEL+++    +  
Sbjct: 4  EKRREMEEVQLALDAKREEFERREELLKQREEELEKKEEELQES 47


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 33.0 bits (76), Expect = 0.085
 Identities = 10/35 (28%), Positives = 21/35 (60%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           EE+EE  K + +E+++  + ++++  R R   RR 
Sbjct: 135 EEDEESSKSEDDEDDDDDDDDDDIATRERSLERRR 169



 Score = 27.3 bits (61), Expect = 7.9
 Identities = 10/39 (25%), Positives = 21/39 (53%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           E  + E++E+   +  +++   +ER+ E  RRRR    +
Sbjct: 139 ESSKSEDDEDDDDDDDDDDIATRERSLERRRRRREWEEK 177


>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
          Length = 651

 Score = 33.1 bits (76), Expect = 0.091
 Identities = 10/37 (27%), Positives = 23/37 (62%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           +RE  +KE  +E K+ +K +++E+K++ E     + +
Sbjct: 554 QREKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAK 590



 Score = 30.8 bits (70), Expect = 0.47
 Identities = 13/37 (35%), Positives = 25/37 (67%), Gaps = 1/37 (2%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
           + K E    +E++  +K KK+EE++KKE  E+L++ +
Sbjct: 555 REKEEKEALKEQKRLRKLKKQEEKKKKEL-EKLEKAK 590



 Score = 29.6 bits (67), Expect = 1.3
 Identities = 9/36 (25%), Positives = 23/36 (63%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           R+KEE+E  K++K+  + +K+E  ++ +  +  + +
Sbjct: 555 REKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAK 590



 Score = 28.8 bits (65), Expect = 2.3
 Identities = 9/36 (25%), Positives = 21/36 (58%), Gaps = 2/36 (5%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           Q+E+EE  K+  KE++  +K + +E  +++   +  
Sbjct: 554 QREKEE--KEALKEQKRLRKLKKQEEKKKKELEKLE 587



 Score = 26.9 bits (60), Expect = 8.6
 Identities = 10/44 (22%), Positives = 23/44 (52%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
              K E ++++EE+E  KE+K   + KK+  ++     +  + +
Sbjct: 547 LDDKEELQREKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAK 590


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 33.1 bits (76), Expect = 0.092
 Identities = 12/42 (28%), Positives = 21/42 (50%), Gaps = 2/42 (4%)

Query: 21  GRQKEEEEEGKKEKKEEEEEKKE--RNEELDRRRRRRRRRDN 60
           G ++ + +E    K E + E+++  R    + RR R  RRD 
Sbjct: 582 GGEETKPQEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDT 623



 Score = 31.5 bits (72), Expect = 0.34
 Identities = 12/44 (27%), Positives = 24/44 (54%), Gaps = 3/44 (6%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR---RRR 58
           +R  RQ +++    +E ++ E  +K R ++  ++  RR   RRR
Sbjct: 640 RRNRRQAQQQTAETRESQQAEVTEKARTQDEQQQAPRRERQRRR 683



 Score = 29.6 bits (67), Expect = 1.3
 Identities = 12/45 (26%), Positives = 24/45 (53%), Gaps = 6/45 (13%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            +   E R+ ++ E  +K + ++E+++  R E      R+RRR D
Sbjct: 647 QQQTAETRESQQAEVTEKARTQDEQQQAPRRE------RQRRRND 685



 Score = 29.2 bits (66), Expect = 1.7
 Identities = 12/49 (24%), Positives = 22/49 (44%), Gaps = 8/49 (16%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEK--------KERNEELDRRRRRRRRRDN 60
           E + +E+     + K E ++++        ++RNE  D R  R RR   
Sbjct: 585 ETKPQEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRDNRTRREGR 633



 Score = 28.5 bits (64), Expect = 3.2
 Identities = 13/49 (26%), Positives = 30/49 (61%), Gaps = 8/49 (16%)

Query: 15  PKSKREGRQKEEEEEGKKEKK--------EEEEEKKERNEELDRRRRRR 55
           P+ +R+ R+ +E+ + ++E K         +E E++ER +++  RR++R
Sbjct: 675 PRRERQRRRNDEKRQAQQEAKALNVEEQSVQETEQEERVQQVQPRRKQR 723


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 33.1 bits (76), Expect = 0.094
 Identities = 16/47 (34%), Positives = 25/47 (53%), Gaps = 6/47 (12%)

Query: 16  KSK------REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           KSK      ++ + KEE +E ++  K + E   E  E+L+RRR   R
Sbjct: 227 KSKKYHRVHKKEKLKEELKEFEELVKADPEAALEELEKLERRRAEER 273



 Score = 29.3 bits (66), Expect = 1.5
 Identities = 11/29 (37%), Positives = 14/29 (48%), Gaps = 1/29 (3%)

Query: 31  KKEKKEEEEEKKERN-EELDRRRRRRRRR 58
           KK    EE E K+ + EE   RR   R+ 
Sbjct: 181 KKLTPFEELELKKLSPEEAKARRAELRKM 209


>gnl|CDD|182225 PRK10077, xylE, D-xylose transporter XylE; Provisional.
          Length = 479

 Score = 33.1 bits (76), Expect = 0.098
 Identities = 38/169 (22%), Positives = 68/169 (40%), Gaps = 30/169 (17%)

Query: 65  GSSNLAITRSIFFLGSLLGGFILSWVADRYGR-----ITAVLGSHVVSFLGVA-----LT 114
            +S L    +   +G ++GG +  + ++R+GR     I AVL    +S LG A      T
Sbjct: 53  ANSLLGFCVASALIGCIIGGALGGYCSNRFGRRDSLKIAAVL--FFISALGSAWPEFGFT 110

Query: 115 PFSKD----VVLFSLSRFLTGVGHFNAFIFYYIIVLECVGPK-------WRTFAMTFPFL 163
               D    V  F + R + G+G   A +   + + E            +  FA+ F  L
Sbjct: 111 SIGPDNTGYVPEFVIYRIIGGIGVGLASMLSPMYIAEIAPAHIRGKLVSFNQFAIIFGQL 170

Query: 164 IFYTVSEV-----ALPWIAYYLADWQWISVITIFPLIVGLIVAIFTPES 207
           + Y V+          W+      W+++      P ++ L++  F PE+
Sbjct: 171 VVYFVNYFIARSGDASWLNTD--GWRYMFASEAIPALLFLMLLYFVPET 217


>gnl|CDD|203454 pfam06455, NADH5_C, NADH dehydrogenase subunit 5 C-terminus.  This
           family represents the C-terminal region of several NADH
           dehydrogenase subunit 5 proteins and is found in
           conjunction with pfam00361 and pfam00662.
          Length = 181

 Score = 32.1 bits (74), Expect = 0.098
 Identities = 23/99 (23%), Positives = 37/99 (37%), Gaps = 23/99 (23%)

Query: 67  SNLAITRSIFFL--GSLLGGFILSW---------VADRYGRITAVLGSHVVSFLGVAL-- 113
           +N  +   +  L  GS++GG +LSW             Y ++ A+L    V+ LG+ L  
Sbjct: 25  NNPIMINPMKRLAIGSIIGGSLLSWLIFPKPPMITMPLYLKLLALL----VTILGLLLGL 80

Query: 114 ------TPFSKDVVLFSLSRFLTGVGHFNAFIFYYIIVL 146
                     K  + F+L  F   +  F       I  L
Sbjct: 81  ELSNLSNKQLKKSLNFNLHSFSGSMWFFPNLSHRLIPKL 119


>gnl|CDD|192481 pfam10197, Cir_N, N-terminal domain of CBF1 interacting
          co-repressor CIR.  This is a 45 residue conserved
          region at the N-terminal end of a family of proteins
          referred to as CIRs (CBF1-interacting co-repressors).
          CBF1 (centromere-binding factor 1) acts as a
          transcription factor that causes repression by binding
          specifically to GTGGGAA motifs in responsive promoters,
          and it requires CIR as a co-repressor. CIR binds to
          histone deacetylase and to SAP30 and serves as a linker
          between CBF1 and the histone deacetylase complex.
          Length = 37

 Score = 29.5 bits (67), Expect = 0.10
 Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
          E E++  +E+K+ EE +KE  EE
Sbjct: 15 EAEQKALEEQKKIEELRKEIEEE 37


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S. cerevisiae and H. sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 32.4 bits (74), Expect = 0.10
 Identities = 11/34 (32%), Positives = 23/34 (67%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           + K +  + E+ +E  ++ +EEEEE++E +E+ D
Sbjct: 157 EKKLKELEAEDVDEEDEKDEEEEEEEEEEDEDFD 190



 Score = 29.0 bits (65), Expect = 1.4
 Identities = 9/38 (23%), Positives = 20/38 (52%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
                   +  ++ E E+  +E +++EEE++E  EE +
Sbjct: 150 DEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEEEEDE 187



 Score = 28.2 bits (63), Expect = 2.6
 Identities = 12/34 (35%), Positives = 21/34 (61%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
             + E    +EE+E  +E++EEEEE+ E  ++ D
Sbjct: 160 LKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDDD 193


>gnl|CDD|237035 PRK12280, rplW, 50S ribosomal protein L23; Reviewed.
          Length = 158

 Score = 32.0 bits (73), Expect = 0.11
 Identities = 11/30 (36%), Positives = 22/30 (73%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKER 44
            K ++E  ++ EE+E  K KKE++E+K+++
Sbjct: 96  EKEQKEVSKETEEKEAIKAKKEKKEKKEKK 125



 Score = 29.3 bits (66), Expect = 0.80
 Identities = 10/29 (34%), Positives = 19/29 (65%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKE 43
            + +++   KE EE+   + K+E++EKKE
Sbjct: 95  SEKEQKEVSKETEEKEAIKAKKEKKEKKE 123



 Score = 27.8 bits (62), Expect = 2.7
 Identities = 9/32 (28%), Positives = 21/32 (65%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
           E  ++++E   + E+KE  + KKE+ E+ +++
Sbjct: 94  ESEKEQKEVSKETEEKEAIKAKKEKKEKKEKK 125


>gnl|CDD|221250 pfam11831, Myb_Cef, pre-mRNA splicing factor component.  This
           family is a region of the Myb-Related Cdc5p/Cef1
           proteins, in fungi, and is part of the pre-mRNA splicing
           factor complex.
          Length = 363

 Score = 32.7 bits (75), Expect = 0.11
 Identities = 16/52 (30%), Positives = 26/52 (50%), Gaps = 8/52 (15%)

Query: 15  PKSKRE-----GRQKEEEEEGKKEKKEE---EEEKKERNEELDRRRRRRRRR 58
           PK K E       ++EEE E  +E+ EE   + + ++R  E  + +   RRR
Sbjct: 139 PKPKNEFELELPEEEEEEPEEMEEELEEDAADRDARKRAAEEAKEQEELRRR 190


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
           This is a family of fungal proteins of unknown function.
          Length = 182

 Score = 32.0 bits (73), Expect = 0.11
 Identities = 10/38 (26%), Positives = 27/38 (71%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           KSK++  + +++++ KK+ K E++++KE  ++L+   +
Sbjct: 90  KSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTK 127



 Score = 31.6 bits (72), Expect = 0.16
 Identities = 8/36 (22%), Positives = 20/36 (55%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
            K K +   K++++  KK++KE E++ ++  +    
Sbjct: 96  DKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSE 131



 Score = 30.0 bits (68), Expect = 0.54
 Identities = 11/35 (31%), Positives = 25/35 (71%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
           K K+  ++K+++++ K +KK+++ EKK+  E  D+
Sbjct: 87  KKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDK 121



 Score = 29.3 bits (66), Expect = 0.90
 Identities = 13/39 (33%), Positives = 23/39 (58%), Gaps = 2/39 (5%)

Query: 13 IVPKSKREGRQKEEE--EEGKKEKKEEEEEKKERNEELD 49
          I      E ++K++E  EE +K KKE EE++K + ++  
Sbjct: 52 IYDAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKK 90



 Score = 27.7 bits (62), Expect = 3.7
 Identities = 8/34 (23%), Positives = 24/34 (70%), Gaps = 2/34 (5%)

Query: 16  KSKREGRQ--KEEEEEGKKEKKEEEEEKKERNEE 47
           K + E +Q  K ++++ KK+K +++++K ++ ++
Sbjct: 75  KKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDD 108



 Score = 26.6 bits (59), Expect = 7.9
 Identities = 7/34 (20%), Positives = 25/34 (73%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
             K++ ++K+++++ KK+ K++++ +K+  +E +
Sbjct: 86  WKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAE 119


>gnl|CDD|191022 pfam04538, BEX, Brain expressed X-linked like family.  This is a
          family of transcription elongation factors which
          includes those referred to as Bex proteins as well as
          those named TCEAL7. Bex1 was shown to be a novel link
          between neurotrophin signalling, the cell cycle, and
          neuronal differentiation, suggesting it might function
          by coordinating internal cellular states with the
          ability of cells to respond to external signals. TCEAL7
          has been shown negatively to regulate the NF-kappaB
          pathway, hence being important in ovarian cancer as it
          one of the genes frequently downregulated in this
          cancer. A closely related protein, TFIIS/TCEA, found in
          pfam07500 is involved in transcription elongation and
          transcript fidelity. TFIIS/TCEA promotes 3'
          endoribonuclease activity of RNA polymerase II (pol II)
          and allows pol II to bypass transcript pause or
          'arrest' during elongation process. It is thus possible
          that BEX is also acting in this way.
          Length = 97

 Score = 30.8 bits (70), Expect = 0.12
 Identities = 15/41 (36%), Positives = 22/41 (53%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
          K  +E   K E E  ++E+K   EE + +  E + RRR RR
Sbjct: 3  KPCKENEGKPESEPKEEEEKRPLEEGEGKKPEGNFRRRLRR 43


>gnl|CDD|110514 pfam01517, HDV_ag, Hepatitis delta virus delta antigen.  The
           hepatitis delta virus (HDV) encodes a single protein,
           the hepatitis delta antigen (HDAg). The central region
           of this protein has been shown to bind RNA. Several
           interactions are also mediated by a coiled-coil region
           at the N terminus of the protein.
          Length = 194

 Score = 32.1 bits (73), Expect = 0.13
 Identities = 14/35 (40%), Positives = 20/35 (57%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           E ++K+    GK   +EEEEE +   EE + R RR
Sbjct: 108 ENKKKQLSSGGKHLSREEEEELRRLTEEDEERERR 142



 Score = 26.4 bits (58), Expect = 9.5
 Identities = 13/33 (39%), Positives = 17/33 (51%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           +K++   G K    EEEE+  R  E D  R RR
Sbjct: 110 KKKQLSSGGKHLSREEEEELRRLTEEDEERERR 142


>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR).  This
           family consists of several bovine specific leukaemia
           virus receptors which are thought to function as
           transmembrane proteins, although their exact function is
           unknown.
          Length = 561

 Score = 32.3 bits (73), Expect = 0.14
 Identities = 12/34 (35%), Positives = 20/34 (58%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           K + E R ++  E+ K+EKK+ E+EK+ R     
Sbjct: 83  KLEEERRHRQRLEKDKREKKKREKEKRGRRRHHS 116



 Score = 31.6 bits (71), Expect = 0.33
 Identities = 11/33 (33%), Positives = 21/33 (63%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           K EEE   +++ E+++ +K++ E+  R RRR  
Sbjct: 83  KLEEERRHRQRLEKDKREKKKREKEKRGRRRHH 115



 Score = 30.8 bits (69), Expect = 0.46
 Identities = 14/51 (27%), Positives = 28/51 (54%), Gaps = 1/51 (1%)

Query: 15  PKSKREGRQKEEEEEGKK-EKKEEEEEKKERNEELDRRRRRRRRRDNWVCD 64
            KS  +G     E++ KK +KKE++E++KER+++  +     +     + D
Sbjct: 184 SKSPEKGDVPAVEKKSKKPKKKEKKEKEKERDKDKKKEVEGFKSLLLALDD 234



 Score = 29.3 bits (65), Expect = 1.4
 Identities = 9/36 (25%), Positives = 24/36 (66%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           VP  +++ ++ +++E+ +KEK+ ++++KKE      
Sbjct: 192 VPAVEKKSKKPKKKEKKEKEKERDKDKKKEVEGFKS 227



 Score = 28.9 bits (64), Expect = 1.9
 Identities = 13/50 (26%), Positives = 27/50 (54%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGS 66
           ++ E  +K  + + KK++KE+EE+KK++     R        +  V +G+
Sbjct: 269 AEAEETKKSPKHKKKKQRKEKEEKKKKKKHHHHRCHHSDGGAEQPVQNGA 318



 Score = 28.1 bits (62), Expect = 4.1
 Identities = 10/33 (30%), Positives = 23/33 (69%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           + +R  RQ+ E+++ +K+K+E+E+  + R+  L
Sbjct: 85  EEERRHRQRLEKDKREKKKREKEKRGRRRHHSL 117


>gnl|CDD|197400 cd10164, ClassIIa_HDAC5_Gln-rich-N, Glutamine-rich N-terminal
          helical domain of HDAC5, a Class IIa histone
          deacetylase.  This family consists of the
          glutamine-rich domain of histone deacetylase 5 (HDAC5).
          It belongs to a superfamily that consists of the
          glutamine-rich N-terminal helical extension to certain
          Class IIa histone deacetylases (HDACs), including
          HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This
          domain confers responsiveness to calcium signals and
          mediates interactions with transcription factors and
          cofactors, and it is able to repress transcription
          independently of the HDAC C-terminal, zinc-dependent
          catalytic domain. It has many intra- and inter-helical
          interactions which are possibly involved in reversible
          assembly and disassembly of proteins. HDACs regulate
          diverse cellular processes through enzymatic
          deacetylation of histone as well as non-histone
          proteins, in particular deacetylating
          N(6)-acetyl-lysine residues.
          Length = 97

 Score = 30.9 bits (69), Expect = 0.15
 Identities = 11/26 (42%), Positives = 21/26 (80%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEE 47
          RQ+E E++ K+E++ +EE +K+R E+
Sbjct: 70 RQQELEQQRKREQQRQEELEKQRLEQ 95


>gnl|CDD|233099 TIGR00710, efflux_Bcr_CflA, drug resistance transporter, Bcr/CflA
           subfamily.  This subfamily of drug efflux proteins, a
           part of the major facilitator family, is predicted to
           have 12 membrane-spanning regions. Members with known
           activity include Bcr (bicyclomycin resistance protein)
           in E. coli, Flor (chloramphenicol and florfenicol
           resistance) in Salmonella typhimurium DT104, and CmlA
           (chloramphenicol resistance) in Pseudomonas sp. plasmid
           R1033.
          Length = 385

 Score = 32.4 bits (74), Expect = 0.15
 Identities = 15/62 (24%), Positives = 27/62 (43%)

Query: 72  TRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTG 131
           T +++ LG   G  +   ++DRYGR   +L    +  L       S ++    + RF+  
Sbjct: 45  TLTLYLLGFAAGQLLWGPLSDRYGRRPVLLLGLFIFALSSLGLALSNNIETLLVLRFVQA 104

Query: 132 VG 133
            G
Sbjct: 105 FG 106


>gnl|CDD|219355 pfam07267, Nucleo_P87, Nucleopolyhedrovirus capsid protein P87.
           This family consists of several Nucleopolyhedrovirus
           capsid protein P87 sequences. P87 is expressed late in
           infection and concentrated in infected cell nuclei.
          Length = 606

 Score = 32.5 bits (74), Expect = 0.15
 Identities = 12/47 (25%), Positives = 25/47 (53%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
             P  KR+ R+     E   ++ E++ ++ E + E +R+RRR   ++
Sbjct: 352 SGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRRREEDKN 398


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
          membrane complex of itself with TolQ and TolR to the
          outer membrane complex of TolB and OprL (also called
          Pal). Most of the length of the protein consists of
          low-complexity sequence that may differ in both length
          and composition from one species to another,
          complicating efforts to discriminate TolA (the most
          divergent gene in the tol-pal system) from paralogs
          such as TonB. Selection of members of the seed
          alignment and criteria for setting scoring cutoffs are
          based largely on conserved operon structure. //The Tol-Pal
          complex is required for maintaining outer membrane
          integrity. Also involved in transport (uptake) of
          colicins and filamentous DNA, and implicated in
          pathogenesis. Transport is energized by the proton
          motive force. TolA is an inner membrane protein that
          interacts with periplasmic TolB and with outer membrane
          porins ompC, phoE and lamB [Transport and binding
          proteins, Other, Cellular processes, Pathogenesis].
          Length = 346

 Score = 32.1 bits (73), Expect = 0.16
 Identities = 12/43 (27%), Positives = 26/43 (60%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           + +++   K+E+E  KK +++ EE +K+R  E  R++   +R
Sbjct: 55 IQQQKKPAAKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQR 97



 Score = 30.2 bits (68), Expect = 0.71
 Identities = 13/49 (26%), Positives = 22/49 (44%), Gaps = 5/49 (10%)

Query: 15  PKSKREGRQKEEEEEGKK-----EKKEEEEEKKERNEELDRRRRRRRRR 58
            K   E + K E E  KK     +K+ EEE K +   E  ++    +++
Sbjct: 126 AKQAAEAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKK 174



 Score = 29.4 bits (66), Expect = 1.2
 Identities = 13/37 (35%), Positives = 19/37 (51%), Gaps = 4/37 (10%)

Query: 15  PKSKREGRQKEEEEEGKKE----KKEEEEEKKERNEE 47
            K+K E +++ EEE   K     KK+  E KK+   E
Sbjct: 142 KKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKKAEAE 178



 Score = 29.0 bits (65), Expect = 1.5
 Identities = 7/32 (21%), Positives = 18/32 (56%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
            +++  +Q E+  +  +EK+++ EE K +   
Sbjct: 99  AAEKAAKQAEQAAKQAEEKQKQAEEAKAKQAA 130


>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19. 
           Med19 represents a family of conserved proteins which
           are members of the multi-protein co-activator Mediator
           complex. Mediator is required for activation of RNA
           polymerase II transcription by DNA binding
           transactivators.
          Length = 178

 Score = 31.7 bits (72), Expect = 0.16
 Identities = 10/33 (30%), Positives = 24/33 (72%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           K  E++ KK+K E+++E+K++ +E  ++++R  
Sbjct: 138 KGHEKKHKKKKHEDDKERKKKKKEKKKKKKRHS 170



 Score = 29.8 bits (67), Expect = 0.74
 Identities = 10/27 (37%), Positives = 23/27 (85%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKER 44
           K+  ++K E+++ +K+KK+E+++KK+R
Sbjct: 142 KKHKKKKHEDDKERKKKKKEKKKKKKR 168


>gnl|CDD|219621 pfam07890, Rrp15p, Rrp15p.  Rrp15p is required for the formation
          of 60S ribosomal subunits.
          Length = 132

 Score = 31.2 bits (71), Expect = 0.16
 Identities = 11/40 (27%), Positives = 25/40 (62%)

Query: 13 IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
          I+ +SK+  + K++ +  K EKK + + + E+ + L++ R
Sbjct: 23 ILSRSKKLLKAKKKLKSEKLEKKAKRQLRAEKRQALEKGR 62



 Score = 26.6 bits (59), Expect = 6.5
 Identities = 15/49 (30%), Positives = 26/49 (53%), Gaps = 4/49 (8%)

Query: 14 VPKSKREG----RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          +P SKR+     R K+  +  KK K E+ E+K +R    ++R+   + R
Sbjct: 14 LPASKRKDPILSRSKKLLKAKKKLKSEKLEKKAKRQLRAEKRQALEKGR 62


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 32.2 bits (74), Expect = 0.17
 Identities = 18/34 (52%), Positives = 25/34 (73%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           K    G++KEEEEE +KEKKEEE+E++E   E +
Sbjct: 432 KKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEE 465



 Score = 31.4 bits (72), Expect = 0.31
 Identities = 13/32 (40%), Positives = 19/32 (59%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           K   E  +K+ EEE K++KK+    KK+  EE
Sbjct: 413 KKIVEKAEKKREEEKKEKKKKAFAGKKKEEEE 444



 Score = 29.5 bits (67), Expect = 1.1
 Identities = 13/35 (37%), Positives = 20/35 (57%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
            + K   +  +K EEE+ +K+KK    +KKE  EE
Sbjct: 411 KIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEE 445



 Score = 29.5 bits (67), Expect = 1.1
 Identities = 13/38 (34%), Positives = 24/38 (63%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
             + K+E ++K    + K+E++EEE+EKKE  +E +  
Sbjct: 423 REEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEE 460



 Score = 29.5 bits (67), Expect = 1.3
 Identities = 10/32 (31%), Positives = 23/32 (71%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           K + E +++++++    +KKEEEEE+++  +E
Sbjct: 421 KKREEEKKEKKKKAFAGKKKEEEEEEEKEKKE 452



 Score = 29.1 bits (66), Expect = 2.0
 Identities = 14/34 (41%), Positives = 24/34 (70%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           K + E ++K+++    K+K+EEEEE+KE+ EE  
Sbjct: 422 KREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEK 455



 Score = 28.7 bits (65), Expect = 2.6
 Identities = 13/45 (28%), Positives = 27/45 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           K K    +K+EEEE ++++K+EEE+++E  E  + +     ++  
Sbjct: 431 KKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKKKK 475



 Score = 28.0 bits (63), Expect = 4.0
 Identities = 10/42 (23%), Positives = 20/42 (47%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           L   K   +  +K  E+  KK ++E++E+KK+      +   
Sbjct: 402 LTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEE 443



 Score = 27.6 bits (62), Expect = 4.8
 Identities = 12/34 (35%), Positives = 25/34 (73%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
           KRE  +KE++++    KK+EEEE++E+ ++ + +
Sbjct: 422 KREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEK 455



 Score = 27.2 bits (61), Expect = 6.4
 Identities = 15/34 (44%), Positives = 24/34 (70%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
             K K+    K++EEE ++EK+++EEEK+E  EE
Sbjct: 428 KEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEE 461


>gnl|CDD|224969 COG2058, RPP1A, Ribosomal protein L12E/L44/L45/RPP1/RPP2
           [Translation, ribosomal structure and biogenesis].
          Length = 109

 Score = 30.8 bits (70), Expect = 0.18
 Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELD 49
            E      E +EEE+E++   E  D
Sbjct: 77  AEAAAEADEAEEEEKEEEAEEESDD 101



 Score = 30.4 bits (69), Expect = 0.25
 Identities = 9/26 (34%), Positives = 15/26 (57%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELD 49
              E   + ++ EEEE+++E  EE D
Sbjct: 75  AGAEAAAEADEAEEEEKEEEAEEESD 100



 Score = 28.5 bits (64), Expect = 0.95
 Identities = 9/24 (37%), Positives = 12/24 (50%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEE 47
              E   +  + EEEEK+E  EE
Sbjct: 74 AAGAEAAAEADEAEEEEKEEEAEE 97



 Score = 28.1 bits (63), Expect = 1.4
 Identities = 8/25 (32%), Positives = 14/25 (56%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELD 49
           E   E  + ++EE+EE+ E   + D
Sbjct: 78  EAAAEADEAEEEEKEEEAEEESDDD 102



 Score = 27.8 bits (62), Expect = 2.1
 Identities = 8/24 (33%), Positives = 16/24 (66%), Gaps = 1/24 (4%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEE 40
           +  E  + EEEE+ ++E +EE ++
Sbjct: 79  AAAEADEAEEEEK-EEEAEEESDD 101



 Score = 27.8 bits (62), Expect = 2.1
 Identities = 9/25 (36%), Positives = 10/25 (40%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEE 47
                E   E  E EEE+KE   E
Sbjct: 72 AAAAGAEAAAEADEAEEEEKEEEAE 96



 Score = 27.4 bits (61), Expect = 2.3
 Identities = 7/27 (25%), Positives = 18/27 (66%)

Query: 21  GRQKEEEEEGKKEKKEEEEEKKERNEE 47
           G +   E +  +E+++EEE ++E +++
Sbjct: 76  GAEAAAEADEAEEEEKEEEAEEESDDD 102



 Score = 27.4 bits (61), Expect = 2.4
 Identities = 8/29 (27%), Positives = 15/29 (51%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKE 43
            +        E +E ++E+KEEE E++ 
Sbjct: 71 AAAAAGAEAAAEADEAEEEEKEEEAEEES 99


>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
          Length = 330

 Score = 31.8 bits (73), Expect = 0.20
 Identities = 10/21 (47%), Positives = 14/21 (66%)

Query: 23  QKEEEEEGKKEKKEEEEEKKE 43
               EEE ++E++EEEEE  E
Sbjct: 298 AAAAEEEEEEEEEEEEEEPSE 318



 Score = 31.0 bits (71), Expect = 0.39
 Identities = 10/33 (30%), Positives = 18/33 (54%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           K     + +    E ++E++EEEEE++   EE 
Sbjct: 289 KEVLSAQAQAAAAEEEEEEEEEEEEEEPSEEEA 321



 Score = 27.1 bits (61), Expect = 6.5
 Identities = 10/24 (41%), Positives = 15/24 (62%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEE 38
             +  E  ++EEEEE ++E  EEE
Sbjct: 297 QAAAAEEEEEEEEEEEEEEPSEEE 320


>gnl|CDD|219111 pfam06625, DUF1151, Protein of unknown function (DUF1151).  This
          family consists of several hypothetical eukaryotic
          proteins of unknown function.
          Length = 122

 Score = 30.5 bits (69), Expect = 0.23
 Identities = 16/48 (33%), Positives = 29/48 (60%), Gaps = 4/48 (8%)

Query: 14 VPKSKREGR---QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          V   K E +   +K + ++  KE+KEEEE K+ ++ EL+R   +R ++
Sbjct: 46 VLNQKPELQRVLEKRKRDQVLKEQKEEEEAKRLQS-ELERELMKRAQK 92


>gnl|CDD|221491 pfam12254, DNA_pol_alpha_N, DNA polymerase alpha subunit p180 N
          terminal.  This domain family is found in eukaryotes,
          and is approximately 70 amino acids in length. The
          family is found in association with pfam00136,
          pfam08996, pfam03104. This family is the N terminal of
          DNA polymerase alpha subunit p180 protein. The N
          terminal contains the catalytic region of the alpha
          subunit.
          Length = 67

 Score = 29.6 bits (67), Expect = 0.23
 Identities = 12/43 (27%), Positives = 24/43 (55%), Gaps = 7/43 (16%)

Query: 29 EGKKEKKEEEEEKKERN-------EELDRRRRRRRRRDNWVCD 64
          EG K++ +E E ++E++       EE  +  R+R   D+++ D
Sbjct: 9  EGGKKRLDEYEVEEEKDIYDEVDEEEYRKIVRQRLLNDDFIVD 51


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 31.0 bits (70), Expect = 0.24
 Identities = 8/24 (33%), Positives = 15/24 (62%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNE 46
           ++EEEE+ +     ++ EKK  N+
Sbjct: 78  EEEEEEDEEDNVDLKDIEKKNIND 101



 Score = 29.8 bits (67), Expect = 0.62
 Identities = 9/25 (36%), Positives = 19/25 (76%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELD 49
          E++E+ ++E++E+EEE +E  +  D
Sbjct: 46 EKQEDDEEEEEEDEEEIEEPEDIED 70



 Score = 29.4 bits (66), Expect = 0.89
 Identities = 10/26 (38%), Positives = 21/26 (80%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEELD 49
          K+E++E  +E++EE+EE+ E  E+++
Sbjct: 44 KDEKQEDDEEEEEEDEEEIEEPEDIE 69



 Score = 29.4 bits (66), Expect = 0.96
 Identities = 13/37 (35%), Positives = 20/37 (54%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
          E  +  E+EE   E +EEEEE +E N +L    ++  
Sbjct: 63 EEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDIEKKNI 99



 Score = 27.9 bits (62), Expect = 2.8
 Identities = 10/38 (26%), Positives = 22/38 (57%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           ++ E+ E ++E  E+EEE++E  E+    +   ++  N
Sbjct: 63  EEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDIEKKNIN 100



 Score = 27.5 bits (61), Expect = 3.3
 Identities = 10/25 (40%), Positives = 18/25 (72%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNE 46
           +K+E++E ++E+ EEE E+ E  E
Sbjct: 45 DEKQEDDEEEEEEDEEEIEEPEDIE 69



 Score = 26.7 bits (59), Expect = 5.8
 Identities = 11/23 (47%), Positives = 19/23 (82%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
          +E E+ K EK+E++EE++E +EE
Sbjct: 38 KENEDVKDEKQEDDEEEEEEDEE 60


>gnl|CDD|240274 PTZ00112, PTZ00112, origin recognition complex 1 protein;
           Provisional.
          Length = 1164

 Score = 31.9 bits (72), Expect = 0.24
 Identities = 12/45 (26%), Positives = 24/45 (53%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           K+K + +  +++ +G K+ K   E+ K +N   D R  R   ++N
Sbjct: 225 KNKEKDKNIKKDRDGDKQTKRNSEKSKVQNSHFDVRILRSYTKEN 269


>gnl|CDD|215914 pfam00428, Ribosomal_60s, 60s Acidic ribosomal protein.  This
          family includes archaebacterial L12, eukaryotic P0, P1
          and P2.
          Length = 88

 Score = 29.9 bits (68), Expect = 0.24
 Identities = 9/17 (52%), Positives = 13/17 (76%)

Query: 24 KEEEEEGKKEKKEEEEE 40
             EEE K+E++EEEE+
Sbjct: 64 AAAEEEKKEEEEEEEED 80



 Score = 29.9 bits (68), Expect = 0.25
 Identities = 9/24 (37%), Positives = 15/24 (62%), Gaps = 2/24 (8%)

Query: 26 EEEEGKKEKKEEEEEKKERNEELD 49
                +E+K+EEEE++E  E+ D
Sbjct: 61 AAAAAAEEEKKEEEEEEE--EDDD 82



 Score = 28.0 bits (63), Expect = 1.1
 Identities = 7/17 (41%), Positives = 14/17 (82%)

Query: 24 KEEEEEGKKEKKEEEEE 40
            EEE+ ++E++EEE++
Sbjct: 65 AAEEEKKEEEEEEEEDD 81



 Score = 28.0 bits (63), Expect = 1.2
 Identities = 9/20 (45%), Positives = 13/20 (65%)

Query: 24 KEEEEEGKKEKKEEEEEKKE 43
                 ++EKKEEEEE++E
Sbjct: 60 AAAAAAAEEEKKEEEEEEEE 79



 Score = 26.4 bits (59), Expect = 3.6
 Identities = 6/19 (31%), Positives = 13/19 (68%)

Query: 25 EEEEEGKKEKKEEEEEKKE 43
                +++K+EEEEE+++
Sbjct: 62 AAAAAEEEKKEEEEEEEED 80


>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II).  Bone
           sialoprotein (BSP) is a major structural protein of the
           bone matrix that is specifically expressed by
           fully-differentiated osteoblasts. The expression of bone
           sialoprotein (BSP) is normally restricted to mineralised
           connective tissues of bones and teeth where it has been
           associated with mineral crystal formation. However, it
           has been found that ectopic expression of BSP occurs in
           various lesions, including oral and extraoral
           carcinomas, in which it has been associated with the
           formation of microcrystalline deposits and the
           metastasis of cancer cells to bone.
          Length = 291

 Score = 31.6 bits (71), Expect = 0.25
 Identities = 15/37 (40%), Positives = 23/37 (62%), Gaps = 2/37 (5%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           G    K+ +E    E+EEE  +E++EEEE + E NE+
Sbjct: 123 GNAGKKATKEDESDEDEEE--EEEEEEEEAEVEENEQ 157



 Score = 30.4 bits (68), Expect = 0.62
 Identities = 15/45 (33%), Positives = 24/45 (53%)

Query: 5   GNCNCYGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           GN     L +PK      +K  +E+   E +EEEEE++E   E++
Sbjct: 109 GNIGLAALQLPKKAGNAGKKATKEDESDEDEEEEEEEEEEEAEVE 153



 Score = 28.5 bits (63), Expect = 2.5
 Identities = 11/33 (33%), Positives = 17/33 (51%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
          S+  G     EEEG++E   EEE  ++ +   D
Sbjct: 51 SEENGDGDSSEEEGEEETSNEEENNEDSDGNED 83


>gnl|CDD|219939 pfam08619, Nha1_C, Alkali metal cation/H+ antiporter Nha1 C
          terminus.  The C terminus of the plasma membrane Nha1
          antiporter plays an important role in the immediate
          cell response to hypo-osmotic shock which prevents an
          excessive loss of ions and water. This domain is found
          with pfam00999.
          Length = 430

 Score = 31.7 bits (72), Expect = 0.26
 Identities = 7/31 (22%), Positives = 15/31 (48%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
           +  +++++  +        RRRRR+RR   
Sbjct: 55 LRRVRKKKKGSRAGRRASSLRRRRRQRRKEP 85


>gnl|CDD|219978 pfam08701, GN3L_Grn1, GNL3L/Grn1 putative GTPase.  Grn1 (yeast)
          and GNL3L (human) are putative GTPases which are
          required for growth and play a role in processing of
          nucleolar pre-rRNA. This family contains a potential
          nuclear localisation signal.
          Length = 80

 Score = 29.5 bits (67), Expect = 0.26
 Identities = 15/36 (41%), Positives = 22/36 (61%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          KEE  E  +EKK ++EE+KER +E  +  R   R+ 
Sbjct: 44 KEEILEEIEEKKRKQEEEKERRKEARKAERAEARKR 79



 Score = 28.8 bits (65), Expect = 0.47
 Identities = 9/23 (39%), Positives = 16/23 (69%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKER 44
          +Q+EE+E  K+ +K E  E ++R
Sbjct: 57 KQEEEKERRKEARKAERAEARKR 79



 Score = 28.4 bits (64), Expect = 0.86
 Identities = 8/30 (26%), Positives = 17/30 (56%), Gaps = 2/30 (6%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
          E ++K+EEE  K+ +KE  + ++    +  
Sbjct: 53 EKKRKQEEE--KERRKEARKAERAEARKRG 80


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
           envelope biogenesis, outer membrane].
          Length = 387

 Score = 31.5 bits (71), Expect = 0.27
 Identities = 12/43 (27%), Positives = 26/43 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           +S++   +K E++  KKE++  EE K ++  E +R ++  + R
Sbjct: 68  QSQQSSAKKGEQQRKKKEEQVAEELKPKQAAEQERLKQLEKER 110


>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777).  This
          is a family of eukaryotic proteins of unknown function.
          Some of the proteins in this family are putative
          nucleic acid binding proteins.
          Length = 158

 Score = 31.0 bits (70), Expect = 0.27
 Identities = 10/44 (22%), Positives = 21/44 (47%), Gaps = 2/44 (4%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEK--KERNEELDRRRRRRRR 57
          +S+R GR +  +   ++ ++    E+  + R+      R RR R
Sbjct: 11 RSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSR 54


>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
          Length = 434

 Score = 31.3 bits (71), Expect = 0.29
 Identities = 12/34 (35%), Positives = 25/34 (73%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           K+K E  ++EE E+ +KE+ +E++EK++++E   
Sbjct: 401 KAKEEKLKQEENEKKQKEQADEDKEKRQKDERKK 434



 Score = 30.9 bits (70), Expect = 0.47
 Identities = 13/44 (29%), Positives = 25/44 (56%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           + K E    EE E   KE+K ++EE +++ +E     + +R++D
Sbjct: 387 EVKDETDASEEAEAKAKEEKLKQEENEKKQKEQADEDKEKRQKD 430



 Score = 30.9 bits (70), Expect = 0.51
 Identities = 13/35 (37%), Positives = 25/35 (71%), Gaps = 3/35 (8%)

Query: 20  EGRQKEEE---EEGKKEKKEEEEEKKERNEELDRR 51
           E + KEE+   EE +K++KE+ +E KE+ ++ +R+
Sbjct: 399 EAKAKEEKLKQEENEKKQKEQADEDKEKRQKDERK 433



 Score = 28.6 bits (64), Expect = 2.3
 Identities = 13/45 (28%), Positives = 26/45 (57%), Gaps = 2/45 (4%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
             +  E   K +EE+ K+E  E E+++KE+ +E   +R++  R+ 
Sbjct: 392 TDASEEAEAKAKEEKLKQE--ENEKKQKEQADEDKEKRQKDERKK 434



 Score = 28.2 bits (63), Expect = 3.3
 Identities = 12/32 (37%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 18  KREGRQKEE--EEEGKKEKKEEEEEKKERNEE 47
           K+ G  K+E    E  + K +EE+ K+E NE+
Sbjct: 383 KKLGEVKDETDASEEAEAKAKEEKLKQEENEK 414



 Score = 27.1 bits (60), Expect = 8.9
 Identities = 8/30 (26%), Positives = 14/30 (46%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           KR+   KE  ++ +   K+  E K E +  
Sbjct: 366 KRQELLKEYNKKLQDYTKKLGEVKDETDAS 395



 Score = 27.1 bits (60), Expect = 8.9
 Identities = 11/45 (24%), Positives = 24/45 (53%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           V        + E + + +K K+EE E+K++   + D+ +R++  R
Sbjct: 388 VKDETDASEEAEAKAKEEKLKQEENEKKQKEQADEDKEKRQKDER 432


>gnl|CDD|221324 pfam11933, DUF3451, Domain of unknown function (DUF3451).  This
          presumed domain is functionally uncharacterized. This
          domain is found in eukaryotes. This domain is typically
          between 199 to 238 amino acids in length. This domain
          is found associated with pfam06512, pfam00520. This
          domain has a conserved ADD sequence motif.
          Length = 222

 Score = 30.9 bits (70), Expect = 0.29
 Identities = 11/42 (26%), Positives = 20/42 (47%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           R  + K+EE     E+ + E+E K  + +  +R R   R+ 
Sbjct: 16 NRNDKNKKEEHSIGSEEGDSEKEPKSESADGRKRCRFLLRKT 57


>gnl|CDD|221028 pfam11208, DUF2992, Protein of unknown function (DUF2992).  This
           bacterial family of proteins has no known function.
           However, the cis-regulatory yjdF motif, just upstream
           from the gene encoding the proteins for this family, is
           a small non-coding RNA, Rfam:RF01764. The yjdF motif is
           found in many Firmicutes, including Bacillus subtilis.
           In most cases, it resides in potential 5' UTRs of
           homologues of the yjdF gene whose function is unknown.
           However, in Streptococcus thermophilus, a yjdF RNA motif
           is associated with an operon whose protein products
           synthesise nicotinamide adenine dinucleotide (NAD+).
           Also, the S. thermophilus yjdF RNA lacks typical yjdF
           motif consensus features downstream of and including the
           P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the
           S. thermophilus RNAs might sense a distinct compound
           that structurally resembles the ligand bound by other
           yjdF RNAs. On the other hand, perhaps these RNAs have an
           alternative solution forming a similar binding site, as
           is observed with some SAM riboswitches.
          Length = 132

 Score = 30.3 bits (69), Expect = 0.31
 Identities = 13/35 (37%), Positives = 24/35 (68%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K E E  K+EKK+  +EKKE  +E  R+ ++++++
Sbjct: 92  KLEHERNKQEKKKRSKEKKEEEKERKRQLKQQKKK 126



 Score = 27.6 bits (62), Expect = 2.9
 Identities = 13/34 (38%), Positives = 25/34 (73%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           R K+E+++  KEKKEEE+E+K + ++  ++ + R
Sbjct: 97  RNKQEKKKRSKEKKEEEKERKRQLKQQKKKAKHR 130


>gnl|CDD|217206 pfam02731, SKIP_SNW, SKIP/SNW domain.  This domain is found in
           chromatin proteins.
          Length = 158

 Score = 30.7 bits (70), Expect = 0.33
 Identities = 12/40 (30%), Positives = 25/40 (62%), Gaps = 2/40 (5%)

Query: 19  REGRQ--KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           R+ R+  ++  E  ++  ++E++EK+E+  EL +R R  R
Sbjct: 119 RKAREEVRQRAELQRQLAEKEKQEKEEKLRELAQRAREER 158


>gnl|CDD|198151 smart01083, Cir_N, N-terminal domain of CBF1 interacting
          co-repressor CIR.  This is a 45 residue conserved
          region at the N-terminal end of a family of proteins
          referred to as CIRs (CBF1-interacting co-repressors).
          CBF1 (centromere-binding factor 1) acts as a
          transcription factor that causes repression by binding
          specifically to GTGGGAA motifs in responsive promoters,
          and it requires CIR as a co-repressor. CIR binds to
          histone deacetylase and to SAP30 and serves as a linker
          between CBF1 and the histone deacetylase complex.
          Length = 37

 Score = 28.3 bits (64), Expect = 0.34
 Identities = 9/23 (39%), Positives = 17/23 (73%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEE 47
          + E++ ++EKK+ EE +KE  +E
Sbjct: 15 KAEQKAEEEKKKIEERRKEIEKE 37



 Score = 25.2 bits (56), Expect = 4.2
 Identities = 8/36 (22%), Positives = 24/36 (66%), Gaps = 3/36 (8%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
          P +K+  ++  + E+   + +EE+++ +ER +E+++
Sbjct: 4  PGNKKNQKRVWKAEQ---KAEEEKKKIEERRKEIEK 36


>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
           (TAF4) is one of several TAFs that bind TBP and is
           involved in forming Transcription Factor IID (TFIID)
           complex.  The TATA Binding Protein (TBP) Associated
           Factor 4 (TAF4) is one of several TAFs that bind TBP and
           are involved in forming the Transcription Factor IID
           (TFIID) complex. TFIID is one of seven General
           Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE,
           TFIIF, and TFIIH) that are involved in accurate
           initiation of transcription by RNA polymerase II in
           eukaryotes. TFIID plays an important role in the
           recognition of promoter DNA and assembly of the
           pre-initiation complex. TFIID complex is composed of the
           TBP and at least 13 TAFs. TAFs from various species were
           originally named by their predicted molecular weight or
           their electrophoretic mobility in polyacrylamide gels. A
           new, unified nomenclature for the pol II TAFs has been
           suggested to show the relationship between TAF orthologs
           and paralogs. Several hypotheses are proposed for TAFs
           functions such as serving as activator-binding sites,
           core-promoter recognition or a role in essential
           catalytic activity. Each TAF, with the help of a
           specific activator, is required only for the expression
           of a subset of genes and is not universally involved in
           transcription as are GTFs. In yeast and human cells,
           TAFs have been found as components of other complexes
           besides TFIID.   Several TAFs interact via histone-fold
           (HFD) motifs; HFD is the interaction motif involved in
           heterodimerization of the core histones and their
           assembly into nucleosome octamers. The minimal HFD
           contains three alpha-helices linked by two loops and is
           found in core histones, TAFs and many other
           transcription factors. TFIID has a histone octamer-like
           substructure. TAF4 domain interacts with TAF12 and makes
           a novel histone-like heterodimer that binds DNA and has
           a core promoter function of a subset of genes.
          Length = 212

 Score = 30.8 bits (70), Expect = 0.35
 Identities = 11/28 (39%), Positives = 15/28 (53%)

Query: 33  EKKEEEEEKKERNEELDRRRRRRRRRDN 60
           E+ E EEE+K   EE +R  R  + R  
Sbjct: 120 EQLEREEEEKRDEEERERLLRAAKSRSE 147



 Score = 30.4 bits (69), Expect = 0.54
 Identities = 9/36 (25%), Positives = 20/36 (55%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           Q E EEE K++++E E   +      ++ R +++ +
Sbjct: 121 QLEREEEEKRDEEERERLLRAAKSRSEQSRLKQKAK 156


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 31.2 bits (71), Expect = 0.37
 Identities = 14/42 (33%), Positives = 23/42 (54%), Gaps = 3/42 (7%)

Query: 14 VPKSKREGRQKEEE---EEGKKEKKEEEEEKKERNEELDRRR 52
            K K+E +  E +   E+ KK+KKE++E K E   +L  + 
Sbjct: 54 TKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKT 95


>gnl|CDD|184957 PRK14995, PRK14995, methyl viologen resistance protein SmvA;
           Provisional.
          Length = 495

 Score = 31.2 bits (71), Expect = 0.38
 Identities = 15/50 (30%), Positives = 22/50 (44%), Gaps = 3/50 (6%)

Query: 76  FFLGSLLGGFILSWVADRYG-RITAVLGSHV--VSFLGVALTPFSKDVVL 122
             + S   G I   +  R G R+ A  G  +  +SF G+A+T FS     
Sbjct: 304 VMVASGFSGPIAGILVSRLGLRLVATGGMALSALSFYGLAMTDFSTQQWQ 353


>gnl|CDD|219589 pfam07808, RED_N, RED-like protein N-terminal region.  This
          family contains sequences that are similar to the
          N-terminal region of Red protein. This and related
          proteins contain a RED repeat which consists of a
          number of RE and RD sequence elements. The region in
          question has several conserved NLS sequences and a
          putative trimeric coiled-coil region, suggesting that
          these proteins are expressed in the nucleus. The
          function of Red protein is unknown, but efficient
          sequestration to nuclear bodies suggests that its
          expression may be tightly regulated or that the protein
          self-aggregates extremely efficiently.
          Length = 238

 Score = 31.0 bits (70), Expect = 0.39
 Identities = 12/34 (35%), Positives = 16/34 (47%), Gaps = 4/34 (11%)

Query: 31 KKEKKE----EEEEKKERNEELDRRRRRRRRRDN 60
          KK+KK     ++EE  E+      R R R RR  
Sbjct: 1  KKKKKYAYLRKQEENAEKEINPKYRDRARERRKG 34


>gnl|CDD|227701 COG5414, COG5414, TATA-binding protein-associated factor
           [Transcription].
          Length = 392

 Score = 30.8 bits (69), Expect = 0.41
 Identities = 11/37 (29%), Positives = 19/37 (51%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           KEE +  + ++  EE+E+ + NEE +R         N
Sbjct: 308 KEEVQSDRPDEIGEEKEEDDENEENERHTELLADELN 344



 Score = 30.8 bits (69), Expect = 0.45
 Identities = 11/48 (22%), Positives = 20/48 (41%), Gaps = 3/48 (6%)

Query: 12  LIVPKSKREGRQKE---EEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
             V +  +E +Q+E    E   ++ + +  +E  E  EE D      R
Sbjct: 287 KEVSEGDKEQQQEEVENAEAHKEEVQSDRPDEIGEEKEEDDENEENER 334



 Score = 29.7 bits (66), Expect = 1.1
 Identities = 11/60 (18%), Positives = 27/60 (45%), Gaps = 2/60 (3%)

Query: 16  KSKREGRQKEEE--EEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITR 73
           + ++E  +  E   EE + ++ +E  E+KE ++E +   R      + + +    +   R
Sbjct: 295 EQQQEEVENAEAHKEEVQSDRPDEIGEEKEEDDENEENERHTELLADELNELEKGIEEKR 354


>gnl|CDD|215009 smart01069, CDC37_C, Cdc37 C terminal domain.  Cdc37 is a protein
          required for the activity of numerous eukaryotic
          protein kinases. This domain corresponds to the C
          terminal domain whose function is unclear. It is found
          C terminal to the Hsp90 chaperone (heat shock protein
          90) binding domain pfam08565 and the N terminal kinase
          binding domain of Cdc37.
          Length = 93

 Score = 29.3 bits (66), Expect = 0.41
 Identities = 15/44 (34%), Positives = 23/44 (52%), Gaps = 4/44 (9%)

Query: 7  CNCYGLI-VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
          C   GL  VP +  +     E +E +++ + EEE +KE  EE D
Sbjct: 52 CIESGLWGVPNAIEDE---TEFKELQEQYEVEEEAEKEDEEEED 92


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 31.1 bits (70), Expect = 0.41
 Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 28  EEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           EE  K+ K  E+ K E +     R RR+++R
Sbjct: 538 EEIYKKMKAIEKSKTELDRTDKNRERRKKKR 568


>gnl|CDD|221857 pfam12923, RRP7, Ribosomal RNA-processing protein 7 (RRP7).  RRP7
           is an essential protein in yeast that is involved in
           pre-rRNA processing and ribosome assembly. It is
           speculated to be required for correct assembly of rpS27
           into the pre-ribosomal particle.
          Length = 131

 Score = 29.9 bits (68), Expect = 0.42
 Identities = 17/63 (26%), Positives = 30/63 (47%), Gaps = 18/63 (28%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEE--------EEKKERNEEL------DRRR----RR 54
              ++K    +++ +E+ KK+KKE E        E+KKE   EL      D++R    + 
Sbjct: 65  GASRNKAAEERRKLKEKKKKKKKELENFYRFQIREKKKEELAELRKKFEEDKKRIEQLKA 124

Query: 55  RRR 57
            R+
Sbjct: 125 ARK 127


>gnl|CDD|236080 PRK07734, motB, flagellar motor protein MotB; Reviewed.
          Length = 259

 Score = 30.9 bits (70), Expect = 0.43
 Identities = 13/34 (38%), Positives = 20/34 (58%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           P+ ++E      E E  K+K+E E +KK+  EEL
Sbjct: 74  PEDEKELSASSLEAEQAKKKEEAEAKKKKEMEEL 107


>gnl|CDD|218380 pfam05010, TACC, Transforming acidic coiled-coil-containing protein
           (TACC).  This family contains the proteins TACC 1, 2 and
           3, which are found concentrated in the centrosomes of
           eukaryotes and may play a conserved role
           in organising centrosomal microtubules. The human TACC
           proteins have been linked to cancer and TACC2 has been
           identified as a possible tumour suppressor (AZU-1). The
           functional homologue (Alp7) in Schizosaccharomyces pombe
           has been shown to be required for organisation of
           bipolar spindles.
          Length = 207

 Score = 30.5 bits (69), Expect = 0.43
 Identities = 16/40 (40%), Positives = 20/40 (50%), Gaps = 3/40 (7%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           KR  + KE  E     KK EE  KK   E LDR ++  +R
Sbjct: 97  KRYEKYKEVIE---GYKKNEETLKKCAQEYLDRLKKEEQR 133


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 31.0 bits (70), Expect = 0.44
 Identities = 10/34 (29%), Positives = 17/34 (50%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
             K +  + E E+E K E+  E++   E  E+ D
Sbjct: 167 SDKDDEEESESEDESKSEESAEDDSDDEEEEDSD 200



 Score = 27.5 bits (61), Expect = 6.4
 Identities = 10/31 (32%), Positives = 20/31 (64%)

Query: 31  KKEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
           KKE+++E++E++ R  E + R+ +R   D  
Sbjct: 331 KKEREKEQKEREGRQSEWEARQAKREGGDAK 361


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 31.2 bits (71), Expect = 0.45
 Identities = 13/39 (33%), Positives = 20/39 (51%), Gaps = 1/39 (2%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
               EG  K + EE  +E +E  EE K+  EEL+ + + 
Sbjct: 247 LESLEGS-KRKLEEKIRELEERIEELKKEIEELEEKVKE 284



 Score = 28.5 bits (64), Expect = 2.7
 Identities = 18/42 (42%), Positives = 26/42 (61%), Gaps = 1/42 (2%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            KR  R  EEE  G +E+ +E EEK+ER EEL ++ +   +R
Sbjct: 313 EKRLSRL-EEEINGIEERIKELEEKEERLEELKKKLKELEKR 353



 Score = 28.1 bits (63), Expect = 3.5
 Identities = 11/37 (29%), Positives = 18/37 (48%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            +  EE E  +++ +E EE KE  EEL++        
Sbjct: 217 PELREELEKLEKEVKELEELKEEIEELEKELESLEGS 253



 Score = 27.3 bits (61), Expect = 6.6
 Identities = 12/47 (25%), Positives = 20/47 (42%), Gaps = 4/47 (8%)

Query: 16  KSKREGRQKE----EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K +  G   E    E EE +K K+E EEE  +    +   ++  +  
Sbjct: 378 KKRLTGLTPEKLEKELEELEKAKEEIEEEISKITARIGELKKEIKEL 424


>gnl|CDD|233209 TIGR00958, 3a01208, Conjugate Transporter-2 (CT2) Family protein.
           [Transport and binding proteins, Other].
          Length = 711

 Score = 30.8 bits (70), Expect = 0.45
 Identities = 9/38 (23%), Positives = 15/38 (39%)

Query: 146 LECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLADW 183
           L   G  W      F FL   ++ E+ +P+    + D 
Sbjct: 153 LGLSGRDWPWLISAFVFLTLSSLGEMFIPFYTGRVIDT 190


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
          are designated YL1. These proteins have been shown to
          be DNA-binding and may be a transcription factor.
          Length = 238

 Score = 30.4 bits (69), Expect = 0.47
 Identities = 10/34 (29%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          E++E    +  EEE EK+ + EE  ++++R + +
Sbjct: 62 EDDEPESDD--EEEGEKELQREERLKKKKRVKTK 93



 Score = 29.7 bits (67), Expect = 0.89
 Identities = 7/42 (16%), Positives = 13/42 (30%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
            K      +  +    + +KK E         +  RR+  R 
Sbjct: 103 KKKDPTAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRS 144



 Score = 27.3 bits (61), Expect = 5.7
 Identities = 6/32 (18%), Positives = 20/32 (62%)

Query: 27 EEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          + E  + + ++EEE ++  +  +R ++++R +
Sbjct: 60 DSEDDEPESDDEEEGEKELQREERLKKKKRVK 91



 Score = 27.0 bits (60), Expect = 7.1
 Identities = 11/28 (39%), Positives = 20/28 (71%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKER 44
          S+ +  + ++EEEG+KE + EE  KK++
Sbjct: 61 SEDDEPESDDEEEGEKELQREERLKKKK 88


>gnl|CDD|185616 PTZ00436, PTZ00436, 60S ribosomal protein L19-like protein;
           Provisional.
          Length = 357

 Score = 30.7 bits (68), Expect = 0.48
 Identities = 17/49 (34%), Positives = 27/49 (55%), Gaps = 3/49 (6%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR--RRRRRRDN 60
           V   K++ RQ  E+   K+  K+E+   K R +EL +R   R R RR++
Sbjct: 145 VKNEKKKERQLAEQLAAKR-LKDEQHRHKARKQELRKREKDRERARRED 192


>gnl|CDD|221333 pfam11942, Spt5_N, Spt5 transcription elongation factor, acidic
          N-terminal.  This is the very acidic N-terminal region
          of the early transcription elongation factor Spt5. The
          Spt5-Spt4 complex regulates early transcription
          elongation by RNA polymerase II and has an imputed role
          in pre-mRNA processing via its physical association
          with mRNA capping enzymes. The actual function of this
          N-terminal domain is not known although it is
          dispensable for binding to Spt4.
          Length = 92

 Score = 28.9 bits (65), Expect = 0.50
 Identities = 12/35 (34%), Positives = 19/35 (54%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          EEEEE + + ++  +E +  +E      RR RR D
Sbjct: 13 EEEEEEEDDLEDLSDEDEFIDEAEAEDDRRHRRLD 47



 Score = 28.6 bits (64), Expect = 0.74
 Identities = 10/36 (27%), Positives = 16/36 (44%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          EEEEE ++E   E+   ++   +       RR R  
Sbjct: 11 EEEEEEEEEDDLEDLSDEDEFIDEAEAEDDRRHRRL 46



 Score = 28.2 bits (63), Expect = 1.1
 Identities = 15/36 (41%), Positives = 20/36 (55%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          EEEE+  ++  +E+E   E   E DRR RR  RR  
Sbjct: 16 EEEEDDLEDLSDEDEFIDEAEAEDDRRHRRLDRRRE 51


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 30.8 bits (70), Expect = 0.51
 Identities = 16/51 (31%), Positives = 26/51 (50%), Gaps = 6/51 (11%)

Query: 16  KSKREGRQKEEEE-EGKKEKKEEEEEKKE-RNEELDRR----RRRRRRRDN 60
           K + +  +KE E   GKKE+ EEE E+ E    +L+ R    ++ R   + 
Sbjct: 846 KEQIKSIEKEIENLNGKKEELEEELEELEAALRDLESRLGDLKKERDELEA 896



 Score = 28.5 bits (64), Expect = 2.8
 Identities = 9/41 (21%), Positives = 16/41 (39%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
                + E  E ++ ++  E  K+E +      RR   R D
Sbjct: 665 GILFSRSEPAELQRLRERLEGLKRELSSLQSELRRIENRLD 705



 Score = 27.0 bits (60), Expect = 9.0
 Identities = 13/41 (31%), Positives = 21/41 (51%), Gaps = 1/41 (2%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRR-RRRRRRRDNW 61
           R+ E+ EE   + + E ++     EEL+R     R+RRD  
Sbjct: 315 RELEDAEERLAKLEAEIDKLLAEIEELEREIEEERKRRDKL 355


>gnl|CDD|185618 PTZ00438, PTZ00438, gamete antigen 27/25-like protein; Provisional.
          Length = 374

 Score = 30.8 bits (69), Expect = 0.51
 Identities = 17/37 (45%), Positives = 21/37 (56%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           IV   +  G QKEEEE+   E+ EE EE +   EE D
Sbjct: 96  IVKNEEERGTQKEEEEDEDVEEIEEVEEVEVVEEEYD 132


>gnl|CDD|221791 pfam12822, DUF3816, Protein of unknown function (DUF3816).  This
           family of proteins is functionally uncharacterized but
           is likely to consist of membrane transporters. This family of
           proteins is found in bacteria and archaea. Proteins in
           this family are typically between 177 and 208 amino
           acids in length. A subset of this family is associated
           with the TM1506 proteins. In this context, transport
           through the channel is predicted to be regulated by the
           TM1506 protein by either regulating redox potential or
           modification of substrates.
          Length = 168

 Score = 29.9 bits (68), Expect = 0.53
 Identities = 30/149 (20%), Positives = 45/149 (30%), Gaps = 27/149 (18%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
               +  LLG F+L  VA   G +  +L S +   L           +   L R L G+ 
Sbjct: 27  DFSHIPVLLGAFLLGPVA---GALIGLLTSLLSFLLFGGGPFALVGPLANFLPRILFGL- 82

Query: 134 HFNAFIFYYIIVLECVGPKWR------------TFAMTFPFL-----IFYTVSEVALPWI 176
                I   I        K R            T   T   L     ++     + +  I
Sbjct: 83  -----IAGLIYKKLRKKTKKRAVLAIILGTILGTLVATLLNLGLILPLYAKFLGMPISAI 137

Query: 177 AYYLADWQWISVITIFPLIVGLIVAIFTP 205
              L     +  +  F LI G+I +I   
Sbjct: 138 VGALF-AAVLLPVLPFNLIKGIIASIIVY 165


>gnl|CDD|152107 pfam11671, Apis_Csd, Complementary sex determiner protein.  This
          family of proteins represents the complementary sex
          determiner in the honeybee. In the honeybee, the
          mechanism of sex determination depends on the csd gene
          which produces an SR-type protein. Males are hemizygous
          or homozygous while females are heterozygous for the csd gene.
          Heterozygosity generates an active protein which
          initiates female development.
          Length = 146

 Score = 30.1 bits (67), Expect = 0.53
 Identities = 15/52 (28%), Positives = 24/52 (46%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNL 69
          KR  R +E E++  K +    E ++   E    R  R R R++ +    SNL
Sbjct: 4  KRYSRSREREQKSYKNENSYREYRETSRERSRDRTERERSREHKIISSLSNL 55


>gnl|CDD|220431 pfam09831, DUF2058, Uncharacterized protein conserved in bacteria
          (DUF2058).  This domain, found in various prokaryotic
          proteins, has no known function.
          Length = 177

 Score = 29.9 bits (68), Expect = 0.54
 Identities = 14/37 (37%), Positives = 23/37 (62%), Gaps = 1/37 (2%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
          K  R+G    ++E  K+  +E + EK ER+ EL+R+R
Sbjct: 29 KQARKGADDGDDEL-KQAAEEAKAEKAERDRELNRQR 64


>gnl|CDD|219547 pfam07741, BRF1, Brf1-like TBP-binding domain.  This region
          covers both the Brf homology II and III regions. This
          region is involved in binding TATA binding protein.
          Length = 95

 Score = 29.1 bits (66), Expect = 0.54
 Identities = 8/34 (23%), Positives = 19/34 (55%)

Query: 26 EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          EE+E K+ K++ +E      ++  R+ +++R   
Sbjct: 30 EEQEEKELKQKADEGNNSGKKKKKRKAKKKRDEA 63



 Score = 26.8 bits (60), Expect = 3.7
 Identities = 3/30 (10%), Positives = 19/30 (63%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          ++++++E ++K +      +++++R+ +  
Sbjct: 30 EEQEEKELKQKADEGNNSGKKKKKRKAKKK 59



 Score = 26.4 bits (59), Expect = 5.0
 Identities = 9/38 (23%), Positives = 23/38 (60%), Gaps = 5/38 (13%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          +++EE+E K++  E     K++     ++R+ +++RD 
Sbjct: 30 EEQEEKELKQKADEGNNSGKKK-----KKRKAKKKRDE 62


>gnl|CDD|216337 pfam01159, Ribosomal_L6e, Ribosomal protein L6e. 
          Length = 108

 Score = 29.1 bits (66), Expect = 0.55
 Identities = 13/36 (36%), Positives = 22/36 (61%), Gaps = 2/36 (5%)

Query: 16 KSKREGRQKEEEE--EGKKEKKEEEEEKKERNEELD 49
          + K++ ++K E E    KKEKKE  E++K   + +D
Sbjct: 39 REKKKKKKKSEGEFFAEKKEKKEVSEQRKADQKAVD 74


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 30.5 bits (69), Expect = 0.55
 Identities = 13/42 (30%), Positives = 19/42 (45%), Gaps = 5/42 (11%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            R  +++EEEEE      E+E   K  +   +    RRR  D
Sbjct: 177 FRREKKEEEEEEE-----EDEALIKSLSFGPETEEDRRRADD 213



 Score = 30.1 bits (68), Expect = 0.65
 Identities = 11/33 (33%), Positives = 20/33 (60%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           +EE+E + EK+ EEE   +  ++L+ R    +R
Sbjct: 114 DEEQEERVEKEREEELAGDAMKKLENRTADSKR 146



 Score = 28.9 bits (65), Expect = 1.6
 Identities = 13/35 (37%), Positives = 19/35 (54%), Gaps = 6/35 (17%)

Query: 24  KEEEEEGKKEKKEEEEE------KKERNEELDRRR 52
           K +EE+ ++ +KE EEE      KK  N   D +R
Sbjct: 112 KLDEEQEERVEKEREEELAGDAMKKLENRTADSKR 146



 Score = 27.4 bits (61), Expect = 5.9
 Identities = 12/39 (30%), Positives = 18/39 (46%), Gaps = 7/39 (17%)

Query: 23  QKEEEEEGKKEKKEEEEEKKER-------NEELDRRRRR 54
           +     E K+E++EEEE++            E DRRR  
Sbjct: 174 EALFRREKKEEEEEEEEDEALIKSLSFGPETEEDRRRAD 212



 Score = 27.0 bits (60), Expect = 7.3
 Identities = 16/47 (34%), Positives = 22/47 (46%), Gaps = 8/47 (17%)

Query: 18  KREGRQKEEEEEG----KKEKKEE----EEEKKERNEELDRRRRRRR 56
           +R  +++EEE  G    K E +      E E  ER EEL   + RR 
Sbjct: 119 ERVEKEREEELAGDAMKKLENRTADSKREMEVLERLEELKELQSRRA 165


>gnl|CDD|132364 TIGR03321, alt_F1F0_F0_B, alternate F1F0 ATPase, F0 subunit B.  A
          small number of taxonomically diverse prokaryotic
          species, including Methanosarcina barkeri, have what
          appears to be a second ATP synthase, in addition to the
          normal F1F0 ATPase in bacteria and A1A0 ATPase in
          archaea. These enzymes use ion gradients to synthesize
          ATP, and in principle may run in either direction.
          This model represents the F0 subunit B of this apparent
          second ATP synthase.
          Length = 246

 Score = 30.4 bits (69), Expect = 0.56
 Identities = 11/27 (40%), Positives = 20/27 (74%)

Query: 26 EEEEGKKEKKEEEEEKKERNEELDRRR 52
          + +  K+E ++E  E +E+NEELD++R
Sbjct: 47 DADTKKREAEQERREYEEKNEELDQQR 73


>gnl|CDD|235396 PRK05299, rpsB, 30S ribosomal protein S2; Provisional.
          Length = 258

 Score = 30.1 bits (69), Expect = 0.62
 Identities = 13/28 (46%), Positives = 19/28 (67%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           EGRQ    E  ++E++E EEE++E  EE
Sbjct: 223 EGRQGRLAEAAEEEEEEAEEEEEEEEEE 250


>gnl|CDD|237629 PRK14160, PRK14160, heat shock protein GrpE; Provisional.
          Length = 211

 Score = 30.1 bits (68), Expect = 0.63
 Identities = 10/25 (40%), Positives = 12/25 (48%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEE 47
            E EE  K+E  E+ EE  E   E
Sbjct: 33 DLEFEEIEKEEIIEDSEESNEVKIE 57



 Score = 27.4 bits (61), Expect = 4.5
 Identities = 12/31 (38%), Positives = 19/31 (61%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
          +K E + KEE+ E ++ +KEE  E  E + E
Sbjct: 23 NKEEDKGKEEDLEFEEIEKEEIIEDSEESNE 53



 Score = 27.0 bits (60), Expect = 6.0
 Identities = 10/28 (35%), Positives = 18/28 (64%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEE 47
          E   KEE++  +++ + EE EK+E  E+
Sbjct: 20 ENENKEEDKGKEEDLEFEEIEKEEIIED 47



 Score = 26.6 bits (59), Expect = 8.0
 Identities = 11/34 (32%), Positives = 20/34 (58%)

Query: 14 VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
          + +   +  + +EE++GK+E  E EE +KE   E
Sbjct: 13 MEEDCCKENENKEEDKGKEEDLEFEEIEKEEIIE 46



 Score = 26.6 bits (59), Expect = 8.4
 Identities = 9/26 (34%), Positives = 15/26 (57%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEELD 49
           E  EE   ++ E +EE K + E+L+
Sbjct: 10 HENMEEDCCKENENKEEDKGKEEDLE 35


>gnl|CDD|202833 pfam03962, Mnd1, Mnd1 family.  This family of proteins includes
           MND1 from S. cerevisiae. The mnd1 protein forms a
           complex with hop2 to promote homologous chromosome
           pairing and meiotic double-strand break repair.
          Length = 188

 Score = 29.9 bits (68), Expect = 0.64
 Identities = 13/38 (34%), Positives = 18/38 (47%), Gaps = 4/38 (10%)

Query: 19  REGRQKEEEEEGKKEKKEE----EEEKKERNEELDRRR 52
           R  + K+E EE K+   E     E+ KK R E  +R  
Sbjct: 70  RLEKLKKELEELKQRIAELQAQIEKLKKGREETEERTE 107


>gnl|CDD|100109 cd05831, Ribosomal_P1, Ribosomal protein P1. This subfamily
          represents the eukaryotic large ribosomal protein P1.
          Eukaryotic P1 and P2 are functionally equivalent to the
          bacterial protein L7/L12, but are not homologous to
          L7/L12. P1 is located in the L12 stalk, with proteins
          P2, P0, L11, and 28S rRNA. P1 and P2 are the only
          proteins in the ribosome to occur as multimers, always
          appearing as sets of heterodimers. Recent data indicate
          that eukaryotes have four copies (two heterodimers),
          while most archaeal species contain six copies of L12p
          (three homodimers) and bacteria may have four or six
          copies (two or three homodimers), depending on the
          species. Experiments using S. cerevisiae P1 and P2
          indicate that P1 proteins are positioned more
          internally with limited reactivity in the C-terminal
          domains, while P2 proteins seem to be more externally
          located and are more likely to interact with other
          cellular components. In lower eukaryotes, P1 and P2 are
          further subdivided into P1A, P1B, P2A, and P2B, which
          form P1A/P2B and P1B/P2A heterodimers. Some plant
          species have a third P-protein, called P3, which is not
          homologous to P1 and P2. In humans, P1 and P2 are
          strongly autoimmunogenic. They play a significant role
          in the etiology and pathogenesis of systemic lupus
          erythematosus (SLE). In addition, the ribosome-inactivating
          protein trichosanthin (TCS) interacts with human P0,
          P1, and P2, with its primary binding site located in
          the C-terminal region of P2. TCS inactivates the
          ribosome by depurinating a specific adenine in the
          sarcin-ricin loop of 28S rRNA.
          Length = 103

 Score = 28.8 bits (65), Expect = 0.64
 Identities = 10/18 (55%), Positives = 13/18 (72%)

Query: 26 EEEEGKKEKKEEEEEKKE 43
            E  K+EKKEEEEE+ +
Sbjct: 78 AAEAKKEEKKEEEEEESD 95



 Score = 27.3 bits (61), Expect = 2.5
 Identities = 8/21 (38%), Positives = 13/21 (61%)

Query: 20 EGRQKEEEEEGKKEKKEEEEE 40
               E ++E KKE++EEE +
Sbjct: 75 AAAAAEAKKEEKKEEEEEESD 95


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 30.5 bits (69), Expect = 0.66
 Identities = 11/45 (24%), Positives = 18/45 (40%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            ++ E  ++E EE+     +E EE  +   EEL           N
Sbjct: 353 LAELEEAKEELEEKLSALLEELEELFEALREELAELEAELAEIRN 397



 Score = 28.1 bits (63), Expect = 3.6
 Identities = 10/35 (28%), Positives = 21/35 (60%)

Query: 24   KEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            +E  EE K ++++ EE K++  E ++   + +R R
Sbjct: 976  EERYEELKSQREDLEEAKEKLLEVIEELDKEKRER 1010



 Score = 27.4 bits (61), Expect = 6.5
 Identities = 14/42 (33%), Positives = 24/42 (57%), Gaps = 4/42 (9%)

Query: 24  KEEEEEGKKEKKEEEEEKKERNEELDRRRRR----RRRRDNW 61
           KEE EE +++++  +EE +E  EEL+   RR     R  ++ 
Sbjct: 778 KEEIEELEEKRQALQEELEELEEELEEAERRLDALERELESL 819



 Score = 27.0 bits (60), Expect = 9.8
 Identities = 11/49 (22%), Positives = 21/49 (42%), Gaps = 4/49 (8%)

Query: 16  KSKREGRQKEEEEEGKKEK----KEEEEEKKERNEELDRRRRRRRRRDN 60
           + +R+  + E + E  K +    +EE E+ + R EEL+           
Sbjct: 706 ELRRQLEELERQLEELKRELAALEEELEQLQSRLEELEEELEELEEELE 754


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 30.3 bits (68), Expect = 0.67
 Identities = 7/43 (16%), Positives = 21/43 (48%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           P+ +++   +E EEE  +E+    ++ K+  +   ++    + 
Sbjct: 312 PEIEQDEDSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKD 354



 Score = 29.5 bits (66), Expect = 1.4
 Identities = 14/44 (31%), Positives = 23/44 (52%), Gaps = 2/44 (4%)

Query: 18  KREGRQKEEEEEGKKEKKEEE--EEKKERNEELDRRRRRRRRRD 59
           K E  Q E+ EE ++EK EEE    KK +  +  + ++    +D
Sbjct: 311 KPEIEQDEDSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKD 354



 Score = 28.8 bits (64), Expect = 2.4
 Identities = 8/38 (21%), Positives = 21/38 (55%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
                +  E+ +E K +KK+++  K ++  + D++ +R
Sbjct: 226 GDESDKGGEDGDEEKSKKKKKKLAKNKKKLDDDKKGKR 263



 Score = 27.6 bits (61), Expect = 4.7
 Identities = 10/36 (27%), Positives = 19/36 (52%), Gaps = 1/36 (2%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEK-KERNEELD 49
           P+ + +    E   + + E+ E+ EE  +E+NEE  
Sbjct: 297 PEEREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEG 332


>gnl|CDD|218899 pfam06102, DUF947, Domain of unknown function (DUF947).  Family of
           eukaryotic proteins with unknown function.
          Length = 168

 Score = 29.6 bits (67), Expect = 0.69
 Identities = 12/58 (20%), Positives = 31/58 (53%), Gaps = 17/58 (29%)

Query: 16  KSKREGRQKEEE--EEGKK---------------EKKEEEEEKKERNEELDRRRRRRR 56
           +  +E +++E+E  +EGKK               +K +E ++ K+ ++ L+++R++  
Sbjct: 106 EILKEHKKQEKELIKEGKKPYYLKKSEIKKLVLKKKFDELKKSKQLDKALEKKRKKNA 163


>gnl|CDD|236586 PRK09605, PRK09605, bifunctional UGMP family
           protein/serine/threonine protein kinase; Validated.
          Length = 535

 Score = 30.2 bits (69), Expect = 0.69
 Identities = 14/39 (35%), Positives = 18/39 (46%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           E   K+ E  G+    +E   K  R+ ELD R R  R R
Sbjct: 346 EADIKKGEYLGRDAVIKERVPKGYRHPELDERLRTERTR 384


>gnl|CDD|237799 PRK14715, PRK14715, DNA polymerase II large subunit; Provisional.
          Length = 1627

 Score = 30.6 bits (69), Expect = 0.69
 Identities = 23/76 (30%), Positives = 33/76 (43%), Gaps = 7/76 (9%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGS--SNLAITRSIFFLGS 80
           +KEE++E K E+ + EE  +E  EE          + N   +      +   R +F   S
Sbjct: 282 KKEEKDEEKSEEVKTEEVDEEFEEEEKGFYYELYEKVNIEANKKFIKEVIAGRPVFAHPS 341

Query: 81  LLGGFILSWVADRYGR 96
             GGF L     RYGR
Sbjct: 342 TNGGFRL-----RYGR 352


>gnl|CDD|222571 pfam14153, Spore_coat_CotO, Spore coat protein CotO.  Bacillus
           spores are protected by a protein shell consisting of
           over 50 different polypeptides, known as the coat. This
           family of proteins has an important morphogenetic role
           in coat assembly; it is involved in the assembly of at
           least 5 different coat proteins including CotB, CotG,
           CotS, CotSA and CotW. It is likely to act at a late
           stage of coat assembly.
          Length = 185

 Score = 29.8 bits (67), Expect = 0.69
 Identities = 13/62 (20%), Positives = 33/62 (53%), Gaps = 9/62 (14%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITRSIFFLG 79
           +  +++E+EE  +E+++EEE +  + +E+   +R++  ++         + +   I FL 
Sbjct: 77  DIAEQQEKEEIAQEEEKEEEAEDVKQQEVFSFKRKKPFKE---------MNLEEKIDFLA 127

Query: 80  SL 81
            L
Sbjct: 128 HL 129



 Score = 26.7 bits (59), Expect = 7.1
 Identities = 12/43 (27%), Positives = 25/43 (58%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K   E  +++  E+ +KE+  +EEEK+E  E++ ++     +R
Sbjct: 68  KEAGEPEREDIAEQQEKEEIAQEEEKEEEAEDVKQQEVFSFKR 110



 Score = 26.3 bits (58), Expect = 9.8
 Identities = 9/44 (20%), Positives = 23/44 (52%)

Query: 10  YGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
            G    +   E ++KEE  + +++++E E+ K++      R++ 
Sbjct: 70  AGEPEREDIAEQQEKEEIAQEEEKEEEAEDVKQQEVFSFKRKKP 113


>gnl|CDD|233467 TIGR01554, major_cap_HK97, phage major capsid protein, HK97
          family.  This model family represents the major capsid
          protein component of the heads (capsids) of
          bacteriophage HK97, phi-105, P27, and related phage.
          This model represents one of several analogous families
          lacking detectable sequence similarity. The gene
          encoding this component is typically located in an
          operon encoding the small and large terminase subunits,
          the portal protein and the prohead or maturation
          protease [Mobile and extrachromosomal element
          functions, Prophage functions].
          Length = 384

 Score = 30.0 bits (68), Expect = 0.72
 Identities = 11/42 (26%), Positives = 17/42 (40%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
          +   E  +  E EE K E    +EE  + + E+DR       
Sbjct: 16 RKLTEDEKLAEAEEEKAEYDALKEEIDKLDAEIDRLEELLDE 57


>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
           subunit G; Reviewed.
          Length = 197

 Score = 29.6 bits (66), Expect = 0.73
 Identities = 15/47 (31%), Positives = 27/47 (57%), Gaps = 1/47 (2%)

Query: 12  LIVPKSKRE-GRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           LI+ + + E  RQ+ EE E + E +  EE+  ER ++ ++ R R  +
Sbjct: 121 LIIRQEQIEKARQEREELEERMEWERREEKIDEREDQEEQEREREEQ 167


>gnl|CDD|114359 pfam05631, DUF791, Protein of unknown function (DUF791).  This
           family consists of several eukaryotic proteins of
           unknown function.
          Length = 354

 Score = 30.3 bits (68), Expect = 0.77
 Identities = 20/65 (30%), Positives = 35/65 (53%), Gaps = 4/65 (6%)

Query: 77  FLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVGH-- 134
           F  S+L G I+  +AD+ GR  A L ++ + ++   +T  S +  +  + RFL G+    
Sbjct: 79  FGSSMLFGTIVGSLADKQGRKRACL-TYCILYILSCITKHSPNYKVLMIGRFLGGIATSL 137

Query: 135 -FNAF 138
            F+AF
Sbjct: 138 LFSAF 142


>gnl|CDD|235151 PRK03699, PRK03699, putative transporter; Provisional.
          Length = 394

 Score = 30.3 bits (69), Expect = 0.78
 Identities = 36/140 (25%), Positives = 61/140 (43%), Gaps = 19/140 (13%)

Query: 74  SIF-FL--GSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLT 130
           + F FL  G L+  F+ +W+ +       ++    +  L VA   FS  + LFS++ F+ 
Sbjct: 46  NTFTFLNAGILISIFLNAWLMEIIPLKRQLIFGFALMILAVAGLMFSHSLALFSIAMFVL 105

Query: 131 G-VGHFNAFIFYYIIVLECVGPKWRTFAMTFPFL-IFYTVSEVALPWIAYYL----ADWQ 184
           G V      I  ++I     G   +       F   F++++ +  P IA YL     +W 
Sbjct: 106 GVVSGITMSIGTFLITHVYEG---KQRGSRLLFTDSFFSMAGMIFPIIAAYLLARSIEWY 162

Query: 185 WISVITIFPLIVGLI-VAIF 203
           W+         +GL+ VAIF
Sbjct: 163 WVY------ACIGLVYVAIF 176


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.
          This family represents the C-terminus (approximately
          300 residues) of proteins that are involved as binding
          partners for Prp19 as part of the nuclear pore complex.
          The family in Drosophila is necessary for pre-mRNA
          splicing, and the human protein has been found in
          purifications of the spliceosome. In the past this
          family was thought, erroneously, to be associated with
          microfibrillin.
          Length = 277

 Score = 29.9 bits (67), Expect = 0.78
 Identities = 13/32 (40%), Positives = 22/32 (68%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
          + E  + EEE+E  +E++EE EE++E + E D
Sbjct: 1  ETEVLELEEEDESGEEEEEESEEEEETDSEDD 32



 Score = 26.4 bits (58), Expect = 10.0
 Identities = 16/42 (38%), Positives = 23/42 (54%), Gaps = 4/42 (9%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           KR  R +EE EE ++EK E E+ +    EE    RR   R++
Sbjct: 126 KRIKRDREEREEMEREKAEIEKMRNMTEEE----RRAELRKN 163


>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
          Length = 1021

 Score = 30.5 bits (68), Expect = 0.80
 Identities = 13/44 (29%), Positives = 26/44 (59%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           + +R  R++ E E  ++E+ E +  +++R + L+R R  R  RD
Sbjct: 475 RMERIERERLERERLERERLERDRLERDRLDRLERERVDRLERD 518



 Score = 29.3 bits (65), Expect = 1.5
 Identities = 21/68 (30%), Positives = 29/68 (42%), Gaps = 13/68 (19%)

Query: 5   GNCNC------YGLIVPKSKREGRQKEEEEEGKK-------EKKEEEEEKKERNEELDRR 51
           G  +C      YG  V K   E  + E+E   +K       EKK  E  ++E  E L+R 
Sbjct: 415 GATHCHAVNGHYGGRVDKDHAERARIEKENAHRKALEMKILEKKRIERLEREERERLERE 474

Query: 52  RRRRRRRD 59
           R  R  R+
Sbjct: 475 RMERIERE 482



 Score = 29.3 bits (65), Expect = 1.9
 Identities = 17/49 (34%), Positives = 27/49 (55%), Gaps = 2/49 (4%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR--RRRRRRD 59
           I+ K + E  ++EE E  ++E+ E  E ++   E L+R R  R R  RD
Sbjct: 454 ILEKKRIERLEREERERLERERMERIERERLERERLERERLERDRLERD 502



 Score = 28.5 bits (63), Expect = 3.2
 Identities = 14/51 (27%), Positives = 28/51 (54%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSN 68
           +RE  ++E  E  + E+   +  ++ER + L+R R  + RR+++   G  N
Sbjct: 485 ERERLERERLERDRLERDRLDRLERERVDRLERDRLEKARRNSYFLKGMEN 535



 Score = 27.8 bits (61), Expect = 4.7
 Identities = 15/44 (34%), Positives = 23/44 (52%), Gaps = 2/44 (4%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKER--NEELDRRRRRRRRRD 59
           + E  ++E  E  + E+ E E  ++ER   E L+R R  R R D
Sbjct: 462 RLEREERERLERERMERIERERLERERLERERLERDRLERDRLD 505


>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
           subunit [Translation, ribosomal structure and
           biogenesis].
          Length = 591

 Score = 30.0 bits (67), Expect = 0.80
 Identities = 17/60 (28%), Positives = 30/60 (50%), Gaps = 15/60 (25%)

Query: 14  VPKSKREGRQKEEEEEGKKEK---------------KEEEEEKKERNEELDRRRRRRRRR 58
           V KSK + R+ +EEEE KK K               K    +K+E+ E L +++++  ++
Sbjct: 525 VNKSKNKKRKVDEEEEEKKLKMIMMSNKQKKLYKKMKYSNAKKEEQAENLKKKKKQIAKQ 584


>gnl|CDD|220413 pfam09805, Nop25, Nucleolar protein 12 (25kDa).  Members of this
          family of proteins are part of the yeast nuclear pore
          complex-associated pre-60S ribosomal subunit. The
          family functions as a highly conserved exonuclease that
          is required for the 5'-end maturation of 5.8S and 25S
          rRNAs, demonstrating that 5'-end processing also has a
          redundant pathway. Nop25 binds late pre-60S ribosomes,
          accompanying them from the nucleolus to the nuclear
          periphery; and there is evidence for both physical and
          functional links between late 60S subunit processing
          and export.
          Length = 134

 Score = 29.2 bits (66), Expect = 0.81
 Identities = 17/46 (36%), Positives = 31/46 (67%), Gaps = 4/46 (8%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKK----ERNEELDRRRRRRRR 57
          K K++ R+K +EE  +KE++E  EE+K    ER +EL+++ + R+ 
Sbjct: 29 KRKQQRRKKAQEEAKEKEREERIEERKRIREERKQELEKQLKERKE 74



 Score = 28.1 bits (63), Expect = 2.0
 Identities = 7/29 (24%), Positives = 19/29 (65%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          K++++++ +E+ +  E  +R   R+R R+
Sbjct: 31 KQQRRKKAQEEAKEKEREERIEERKRIRE 59



 Score = 27.7 bits (62), Expect = 2.3
 Identities = 11/31 (35%), Positives = 23/31 (74%), Gaps = 2/31 (6%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRR--RRRD 59
          +++K +EE ++KER E ++ R+R R  R+++
Sbjct: 34 RRKKAQEEAKEKEREERIEERKRIREERKQE 64



 Score = 27.3 bits (61), Expect = 3.5
 Identities = 12/33 (36%), Positives = 21/33 (63%), Gaps = 4/33 (12%)

Query: 31 KKEKKE--EEEEKKERNEELDRRRRRRRRRDNW 61
          +K+ +E  +E+E++ER EE  R+R R  R+   
Sbjct: 35 RKKAQEEAKEKEREERIEE--RKRIREERKQEL 65


>gnl|CDD|225381 COG2825, HlpA, Outer membrane protein [Cell envelope biogenesis,
           outer membrane].
          Length = 170

 Score = 29.3 bits (66), Expect = 0.83
 Identities = 14/40 (35%), Positives = 16/40 (40%), Gaps = 1/40 (2%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
             R K E E  KKEK      KK++  E D  RR      
Sbjct: 84  SDRAKAEAEI-KKEKLVNAFNKKQQEYEKDLNRREAEEEQ 122


>gnl|CDD|179614 PRK03633, PRK03633, putative MFS family transporter protein;
           Provisional.
          Length = 381

 Score = 30.0 bits (68), Expect = 0.84
 Identities = 14/60 (23%), Positives = 26/60 (43%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           S +F G+L+G  +  +V  R G   +   + ++   G A          +   RF+ G+G
Sbjct: 48  SSYFTGNLVGTLLAGYVIKRIGFNRSYYLASLIFAAGCAGLGLMVGFWSWLAWRFVAGIG 107


>gnl|CDD|221756 pfam12757, DUF3812, Protein of unknown function (DUF3812).  This is
           a family of fungal proteins whose function is not known.
          Length = 126

 Score = 29.2 bits (66), Expect = 0.84
 Identities = 11/37 (29%), Positives = 21/37 (56%), Gaps = 1/37 (2%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
            +R   Q+  +EE KK  +EE + + E  +E +R ++
Sbjct: 90  DERAEAQRARDEE-KKLDEEEAKRQHEEAKEREREKK 125



 Score = 28.0 bits (63), Expect = 1.7
 Identities = 11/25 (44%), Positives = 16/25 (64%), Gaps = 1/25 (4%)

Query: 20  EGRQKEEEEEGKKEK-KEEEEEKKE 43
           E +  EEE + + E+ KE E EKK+
Sbjct: 102 EKKLDEEEAKRQHEEAKEREREKKK 126



 Score = 27.6 bits (62), Expect = 2.8
 Identities = 8/27 (29%), Positives = 14/27 (51%)

Query: 33  EKKEEEEEKKERNEELDRRRRRRRRRD 59
            ++  +EEKK   EE  R+    + R+
Sbjct: 95  AQRARDEEKKLDEEEAKRQHEEAKERE 121


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 30.0 bits (68), Expect = 0.88
 Identities = 10/33 (30%), Positives = 18/33 (54%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
             +    +  EE+E +  ++EE E +KE+  EL
Sbjct: 145 AKEDSDEELSEEDEEEAAEEEEAEAEKEKASEL 177



 Score = 27.3 bits (61), Expect = 5.9
 Identities = 10/38 (26%), Positives = 20/38 (52%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           + K E  +K +E+  ++  +E+EEE  E  E    + +
Sbjct: 136 EEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEK 173



 Score = 26.5 bits (59), Expect = 9.8
 Identities = 8/25 (32%), Positives = 16/25 (64%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEE 47
            +EE++E  K+ KE+ +E+    +E
Sbjct: 134 DEEEKDEAAKKAKEDSDEELSEEDE 158


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
           consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 30.0 bits (67), Expect = 0.89
 Identities = 11/46 (23%), Positives = 24/46 (52%), Gaps = 9/46 (19%)

Query: 16  KSKREGRQKEEE---------EEGKKEKKEEEEEKKERNEELDRRR 52
           K++ E ++ +EE          + K+E K  E+E +++  E  ++R
Sbjct: 285 KAQIEIKKNDEEALKAKDHKAFDLKQESKASEKEAEDKELEAQKKR 330



 Score = 29.2 bits (65), Expect = 1.9
 Identities = 12/41 (29%), Positives = 19/41 (46%), Gaps = 9/41 (21%)

Query: 16  KSKREGRQKEEEEE---------GKKEKKEEEEEKKERNEE 47
           K + E RQK++E +           KE K+  E +K   E+
Sbjct: 245 KQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEK 285


>gnl|CDD|182964 PRK11102, PRK11102, bicyclomycin/multidrug efflux system;
           Provisional.
          Length = 377

 Score = 29.9 bits (68), Expect = 0.90
 Identities = 36/145 (24%), Positives = 64/145 (44%), Gaps = 13/145 (8%)

Query: 71  ITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLT 130
           +T S + LG  +G      +AD +GR   +LG  +V  L       ++ +      RFL 
Sbjct: 30  MTLSAYILGFAIGQLFYGPMADSFGRKPVILGGTLVFALAAVACALAQTIDQLIYMRFLH 89

Query: 131 GVGHFNAFIFYYII--VLECVGPKWRTFA--MTFPFLIFYTVSEVALPWIAYYLADW-QW 185
           G     A     +I  ++  + PK   F+  M+F  L+  T++ +  P I  +L  W  W
Sbjct: 90  G---LAAAAASVVINALMRDMFPK-EEFSRMMSFVTLVM-TIAPLLAPIIGGWLLVWFSW 144

Query: 186 IS---VITIFPLIVGLIVAIFTPES 207
            +   V+ +  ++   +V  F PE+
Sbjct: 145 HAIFWVLALAAILAAALVFFFIPET 169


>gnl|CDD|217787 pfam03910, Adeno_PV, Adenovirus minor core protein PV. 
          Length = 336

 Score = 29.8 bits (67), Expect = 0.90
 Identities = 12/43 (27%), Positives = 22/43 (51%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
          P+  ++  +  + +  KK KK EE+++ +   E  R    RRR
Sbjct: 21 PRPVKDEAKPRKIKRVKKRKKREEKDELDDEVEFVRSFAPRRR 63


>gnl|CDD|217940 pfam04177, TAP42, TAP42-like family.  The TOR signalling pathway
           activates a cell-growth program in response to
           nutrients. TIP41 (pfam04176) interacts with TAP42 and
           negatively regulates the TOR signaling pathway.
          Length = 335

 Score = 30.0 bits (68), Expect = 0.93
 Identities = 10/44 (22%), Positives = 18/44 (40%), Gaps = 1/44 (2%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
            + G   +   E      EEEE+ ++  E+ D    + R  D +
Sbjct: 288 MKRGGVPQGGGE-AAASAEEEEDDEDDEEDDDEETLKARAWDEF 330


>gnl|CDD|115072 pfam06391, MAT1, CDK-activating kinase assembly factor MAT1.  MAT1
           is an assembly/targeting factor for cyclin-dependent
           kinase-activating kinase (CAK), which interacts with the
           transcription factor TFIIH. The domain found to the
           N-terminal side of this domain is a C3HC4 RING finger.
          Length = 200

 Score = 29.3 bits (66), Expect = 0.98
 Identities = 13/48 (27%), Positives = 30/48 (62%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            I+   +R  R++EE E+  +E+KE +EEK+   ++ ++ ++  + +D
Sbjct: 80  SIMRNKRRLTREQEELEQALEEEKEMKEEKRLHLQKEEQEQKMAKEKD 127


>gnl|CDD|221389 pfam12037, DUF3523, Domain of unknown function (DUF3523).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 257 and 277 amino acids in length. This domain is
           found associated with pfam00004. This domain has a
           conserved LER sequence motif.
          Length = 276

 Score = 29.7 bits (67), Expect = 1.0
 Identities = 21/94 (22%), Positives = 37/94 (39%), Gaps = 17/94 (18%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEE---KKER-NEELDRRRRRRR---RRDNWVCDGSSN 68
           + +    + E E E  + K E E     K+ER NE+++R   + +    R+  +      
Sbjct: 160 RRETIEEEAELERENIRAKIEAEARGRAKEERENEDINREMLKLKANEERETVL------ 213

Query: 69  LAITRSIFFLGSLLGGFILSWVADRYGRITAVLG 102
                SI    S +GG   + + D+      V G
Sbjct: 214 ----ESIKTTFSHIGGGFRALLTDKSKLTMTVGG 243


>gnl|CDD|218482 pfam05178, Kri1, KRI1-like family.  The yeast member of this
          family (Kri1p) is found to be required for 40S ribosome
          biogenesis in the nucleolus.
          Length = 99

 Score = 28.4 bits (64), Expect = 1.1
 Identities = 11/24 (45%), Positives = 16/24 (66%)

Query: 34 KKEEEEEKKERNEELDRRRRRRRR 57
          K+ +EEEK +R EEL R +  +R 
Sbjct: 1  KERKEEEKAQREEELKRLKNLKRE 24



 Score = 27.6 bits (62), Expect = 2.1
 Identities = 10/29 (34%), Positives = 19/29 (65%), Gaps = 1/29 (3%)

Query: 24 KEEEEEGKKEKKEE-EEEKKERNEELDRR 51
          KE +EE K +++EE +  K  + EE++ +
Sbjct: 1  KERKEEEKAQREEELKRLKNLKREEIEEK 29


>gnl|CDD|135898 PRK06397, PRK06397, V-type ATP synthase subunit H; Validated.
          Length = 111

 Score = 28.4 bits (63), Expect = 1.1
 Identities = 12/40 (30%), Positives = 18/40 (45%), Gaps = 4/40 (10%)

Query: 19 REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          +E + K EE+     KK EEE     N  L   R+   ++
Sbjct: 38 KEAKSKYEEKA----KKTEEESLNMYNAALMEARKEAEKK 73



 Score = 26.8 bits (59), Expect = 4.7
 Identities = 11/24 (45%), Positives = 15/24 (62%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEE 47
          K E+E   KE K + EEK ++ EE
Sbjct: 30 KNEQENEIKEAKSKYEEKAKKTEE 53


>gnl|CDD|185582 PTZ00373, PTZ00373, 60S Acidic ribosomal protein P2; Provisional.
          Length = 112

 Score = 28.3 bits (63), Expect = 1.1
 Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 26  EEEEGKKEKKEEEEEKKERNEELD 49
                K E K+EE++++E  EE D
Sbjct: 82  ATAGAKAEAKKEEKKEEEEEEEDD 105



 Score = 27.6 bits (61), Expect = 2.7
 Identities = 7/23 (30%), Positives = 13/23 (56%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEE 47
                  + KKEE++E++E  E+
Sbjct: 82  ATAGAKAEAKKEEKKEEEEEEED 104


>gnl|CDD|217935 pfam04158, Sof1, Sof1-like domain.  Sof1 is essential for cell
          growth and is a component of the nucleolar rRNA
          processing machinery.
          Length = 88

 Score = 28.0 bits (63), Expect = 1.1
 Identities = 16/61 (26%), Positives = 22/61 (36%), Gaps = 18/61 (29%)

Query: 15 PKSKREGRQKEEEEEGKKEK-KEEEE----------------EKKERNEELDRRRRRRRR 57
            S RE RQ  E  E  KEK K   E                 +K + E  + ++R+   
Sbjct: 9  VLSPRE-RQALEYNEALKEKYKHMPEIKRIARHRHVPKAIKKAQKIKREMKEAKKRKEEN 67

Query: 58 R 58
          R
Sbjct: 68 R 68


>gnl|CDD|191249 pfam05279, Asp-B-Hydro_N, Aspartyl beta-hydroxylase N-terminal
           region.  This family includes the N-terminal regions of
           the junctin, junctate and aspartyl beta-hydroxylase
           proteins. Junctate is an integral ER/SR membrane calcium
           binding protein, which comes from an alternatively
           spliced form of the same gene that generates aspartyl
           beta-hydroxylase and junctin. Aspartyl beta-hydroxylase
           catalyzes the post-translational hydroxylation of
           aspartic acid or asparagine residues contained within
           epidermal growth factor (EGF) domains of proteins.
          Length = 240

 Score = 29.5 bits (66), Expect = 1.2
 Identities = 8/41 (19%), Positives = 18/41 (43%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           +V K + +G  KE + +  K    E+ + ++   E  +   
Sbjct: 117 VVSKQEEDGPGKEPQLDEDKFLLAEDSDDRQETLEAGKVHE 157



 Score = 26.8 bits (59), Expect = 7.9
 Identities = 8/39 (20%), Positives = 20/39 (51%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
           +     +    ++EE+G  ++ + +E+K    E+ D R+
Sbjct: 109 LQSLLEKIVVSKQEEDGPGKEPQLDEDKFLLAEDSDDRQ 147


>gnl|CDD|236081 PRK07735, PRK07735, NADH dehydrogenase subunit C; Validated.
          Length = 430

 Score = 29.6 bits (66), Expect = 1.2
 Identities = 13/38 (34%), Positives = 20/38 (52%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           K KREG ++  EEE  K K +     K +   L +++R
Sbjct: 76  KQKREGTEEVTEEEKAKAKAKAAAAAKAKAAALAKQKR 113



 Score = 28.4 bits (63), Expect = 2.8
 Identities = 12/38 (31%), Positives = 19/38 (50%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           K KREG ++  EEE    K +     K +   L +++R
Sbjct: 110 KQKREGTEEVTEEEKAAAKAKAAAAAKAKAAALAKQKR 147


>gnl|CDD|233170 TIGR00886, 2A0108, nitrite extrusion protein (nitrite facilitator).
            [Transport and binding proteins, Anions].
          Length = 366

 Score = 29.6 bits (67), Expect = 1.2
 Identities = 20/68 (29%), Positives = 29/68 (42%), Gaps = 15/68 (22%)

Query: 77  FLGSL---LGGFILSWVADRYGR--------ITAVLGSHVVSFLGVALTPFSKDVVLFSL 125
            LGSL   LGG I    +DR G         +   +G+ +V    V+    +  +VLF  
Sbjct: 272 LLGSLARPLGGAI----SDRLGGARKLLMSFLGVAMGAFLVVLGLVSPLSLAVFIVLFVA 327

Query: 126 SRFLTGVG 133
             F +G G
Sbjct: 328 LFFFSGAG 335


>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family.  This model
          represents a subfamily of RNA splicing factors
          including the Pad-1 protein (N. crassa), CAPER (M.
          musculus) and CC1.3 (H.sapiens). These proteins are
          characterized by an N-terminal arginine-rich, low
          complexity domain followed by three (or in the case of
          4 H. sapiens paralogs, two) RNA recognition domains
          (rrm: pfam00706). These splicing factors are closely
          related to the U2AF splicing factor family (TIGR01642).
          A homologous gene from Plasmodium falciparum was
          identified in the course of the analysis of that genome
          at TIGR and was included in the seed.
          Length = 457

 Score = 29.5 bits (66), Expect = 1.3
 Identities = 9/54 (16%), Positives = 18/54 (33%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITRSI 75
          R +E        ++ ++  ++ R     R R RRRR  ++              
Sbjct: 3  RDRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPN 56


>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region.  The myc family
           belongs to the basic helix-loop-helix leucine zipper
           class of transcription factors, see pfam00010. Myc forms
           a heterodimer with Max, and this complex regulates cell
           growth through direct activation of genes involved in
           cell replication. Mutations in the C-terminal 20
           residues of this domain cause unique changes in the
           induction of apoptosis, transformation, and G2 arrest.
          Length = 329

 Score = 29.5 bits (66), Expect = 1.3
 Identities = 13/37 (35%), Positives = 20/37 (54%), Gaps = 4/37 (10%)

Query: 27  EEEGKKEKKEEEEEKKERNEELD----RRRRRRRRRD 59
             + + E+ EEEEE++E  EE+D     +RR    R 
Sbjct: 223 GSDSESEEDEEEEEEEEEEEEIDVVTVEKRRSSSNRK 259


>gnl|CDD|219913 pfam08576, DUF1764, Eukaryotic protein of unknown function
          (DUF1764).  This is a family of eukaryotic proteins of
          unknown function. This family contains many
          hypothetical proteins.
          Length = 98

 Score = 28.2 bits (63), Expect = 1.3
 Identities = 8/43 (18%), Positives = 24/43 (55%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          K +++ +++  +    K  K+ +++ K+++E  +     +RRR
Sbjct: 22 KKRKKKKKRTAKTARPKATKKGQKKDKKKDEFPEFPEESKRRR 64



 Score = 27.0 bits (60), Expect = 2.6
 Identities = 10/41 (24%), Positives = 22/41 (53%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           K K++   K    +  K+ ++++++K E  E  +  +RRR
Sbjct: 24 RKKKKKRTAKTARPKATKKGQKKDKKKDEFPEFPEESKRRR 64



 Score = 26.7 bits (59), Expect = 3.8
 Identities = 11/45 (24%), Positives = 22/45 (48%), Gaps = 7/45 (15%)

Query: 15 PKSKREGRQKEEEEEGK-------KEKKEEEEEKKERNEELDRRR 52
             KR+ ++K   +  +       ++K ++++E  E  EE  RRR
Sbjct: 20 NIKKRKKKKKRTAKTARPKATKKGQKKDKKKDEFPEFPEESKRRR 64


>gnl|CDD|240341 PTZ00272, PTZ00272, heat shock protein 83 kDa (Hsp83); Provisional.
          Length = 701

 Score = 29.6 bits (66), Expect = 1.3
 Identities = 14/47 (29%), Positives = 28/47 (59%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           L+V K+  +    E+EE+ KK  ++ EE K E  +E D  ++++ ++
Sbjct: 205 LMVEKTTEKEVTDEDEEDTKKADEDGEEPKVEEVKEGDEGKKKKTKK 251


>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal.
          This domain is found to the N-terminus of bacterial
          signal peptidases of the S49 family (pfam01343).
          Length = 154

 Score = 28.6 bits (65), Expect = 1.3
 Identities = 9/37 (24%), Positives = 21/37 (56%), Gaps = 2/37 (5%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           K+E   K  E+  KK +K + + +K++ ++ + + R
Sbjct: 64 DKKE--LKAWEKAEKKAEKAKAKAEKKKAKKEEPKPR 98


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
           found at the N terminus of SMC proteins. The SMC
           (structural maintenance of chromosomes) superfamily
           proteins have ATP-binding domains at the N- and
           C-termini, and two extended coiled-coil domains
           separated by a hinge in the middle. The eukaryotic SMC
           proteins form two kinds of heterodimers: the SMC1/SMC3
           and the SMC2/SMC4 types. These heterodimers constitute
           an essential part of higher order complexes, which are
           involved in chromatin and DNA dynamics. This family also
           includes the RecF and RecN proteins that are involved in
           DNA metabolism and recombination.
          Length = 1162

 Score = 29.6 bits (66), Expect = 1.3
 Identities = 15/41 (36%), Positives = 24/41 (58%), Gaps = 1/41 (2%)

Query: 16  KSKREGRQKEEEE-EGKKEKKEEEEEKKERNEELDRRRRRR 55
           K + E  +KE +E E K+E +EEEEE+ E+ +E   +    
Sbjct: 335 KEEIEELEKELKELEIKREAEEEEEEQLEKLQEKLEQLEEE 375



 Score = 29.2 bits (65), Expect = 1.8
 Identities = 3/43 (6%), Positives = 14/43 (32%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            ++     +E + +  K K++ ++  +    +           
Sbjct: 186 LAELIIDLEELKLQELKLKEQAKKALEYYQLKEKLELEEENLL 228



 Score = 29.2 bits (65), Expect = 1.9
 Identities = 12/45 (26%), Positives = 22/45 (48%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
              +  ++ +EEE+ KK ++EE +   +  EEL     +  RR  
Sbjct: 267 ILAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKV 311



 Score = 28.4 bits (63), Expect = 3.7
 Identities = 16/51 (31%), Positives = 23/51 (45%), Gaps = 1/51 (1%)

Query: 11   GLIVPKSKREGRQKEEEEEGKKEKKEE-EEEKKERNEELDRRRRRRRRRDN 60
            G +   +  E  +KEE     + KKE  EEEKKE   E+     +R +   
Sbjct: 974  GNVNLMAIAEFEEKEERYNKDELKKERLEEEKKELLREIIEETCQRFKEFL 1024



 Score = 26.9 bits (59), Expect = 9.6
 Identities = 8/34 (23%), Positives = 15/34 (44%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           + +EEE     K+EEE + +    E  +     +
Sbjct: 283 KLQEEELKLLAKEEEELKSELLKLERRKVDDEEK 316


>gnl|CDD|151656 pfam11214, Med2, Mediator complex subunit 2.  This family of
          mediator complex subunit 2 proteins is conserved in
          fungi. Cyclin-dependent kinase CDK8 or Srb10 interacts
          with and phosphorylates Med2. Post-translational
          modifications of Mediator subunits are important for
          regulation of gene expression.
          Length = 99

 Score = 28.2 bits (63), Expect = 1.4
 Identities = 13/22 (59%), Positives = 19/22 (86%), Gaps = 2/22 (9%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEK 41
          E ++K+EEEE  ++KKEEEE+K
Sbjct: 80 ENKKKQEEEE--RKKKEEEEKK 99


>gnl|CDD|217080 pfam02517, Abi, CAAX protease self-immunity.  Members of this
           family are probably proteases (after an isoprenyl group
           is attached to the Cys residue in the C-terminal CAAX
           motif of a protein to attach it to the membrane, the AAX
           tripeptide being removed by one of the CAAX prenyl
           proteases). The family contains the CAAX prenyl
           protease. The proteins contain a highly conserved
           Glu-Glu motif at the amino end of the alignment. The
           alignment also contains two histidine residues that may
           be involved in zinc binding. While they are involved in
           membrane anchoring of proteins in eukaryotes, little is
           known about their function in prokaryotes. In some known
           bacteriocin loci, Abi genes have been found downstream
           of bacteriocin structural genes where they are probably
           involved in self-immunity. Investigation of the
           bacteriocin-like loci in the Gram positive bacteria
           locus from Lactobacillus sakei 23K confirmed that the
           bacteriocin-like genes (sak23Kalphabeta) exhibited
           antimicrobial activity when expressed in a heterologous
           host and that the associated Abi gene (sak23Ki)
           conferred immunity against the cognate bacteriocin.
           Interestingly, the immunity genes from three similar
           systems conferred a high degree of cross-immunity
           against each other's bacteriocins, suggesting the
           recognition of a common receptor. Site-directed
           mutagenesis demonstrated that the conserved motifs
           constituting the putative proteolytic active site of the
           Abi proteins are essential for the immunity function of
           Sak23Ki - thus a new concept in self-immunity.
          Length = 93

 Score = 27.9 bits (63), Expect = 1.4
 Identities = 10/32 (31%), Positives = 17/32 (53%)

Query: 75  IFFLGSLLGGFILSWVADRYGRITAVLGSHVV 106
           + FL + L G +L W+  R G + A +  H +
Sbjct: 58  LAFLSAFLLGLVLGWLYLRTGSLWAAILLHAL 89


>gnl|CDD|191634 pfam06886, TPX2, Targeting protein for Xklp2 (TPX2).  This family
          represents a conserved region approximately 60 residues
          long within the eukaryotic targeting protein for Xklp2
          (TPX2). Xklp2 is a kinesin-like protein localised on
          centrosomes throughout the cell cycle and on spindle
          pole microtubules during metaphase. In Xenopus, it has
          been shown that Xklp2 protein is required for
          centrosome separation and maintenance of spindle
          bi-polarity. TPX2 is a microtubule-associated protein
          that mediates the binding of the C-terminal domain of
          Xklp2 to microtubules. It is phosphorylated during
          mitosis in a microtubule-dependent way.
          Length = 57

 Score = 27.0 bits (60), Expect = 1.4
 Identities = 10/34 (29%), Positives = 21/34 (61%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
          + EE+E+  + +KEE E +++  EE   ++ R+ 
Sbjct: 16 KLEEKEKALEAEKEEAEARQKEEEEEAIKQLRKE 49



 Score = 26.6 bits (59), Expect = 2.1
 Identities = 14/31 (45%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNE 46
          K K    +KEE E  +KE +EEE  K+ R E
Sbjct: 20 KEKALEAEKEEAEARQKE-EEEEAIKQLRKE 49


>gnl|CDD|227931 COG5644, COG5644, Uncharacterized conserved protein [Function
           unknown].
          Length = 869

 Score = 29.7 bits (66), Expect = 1.4
 Identities = 12/40 (30%), Positives = 19/40 (47%), Gaps = 2/40 (5%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
              G++K  E  G    ++  E  +  N  LDR+ RR +R
Sbjct: 609 ILHGQKKRAE--GAVVFEKPLEATENFNPWLDRKMRRIKR 646


>gnl|CDD|218218 pfam04702, Vicilin_N, Vicilin N terminal region.  This region is
          found in plant seed storage proteins, N-terminal to the
          Cupin domain (pfam00190). In Macadamia integrifolia,
          this region is processed into peptides of approximately
          50 amino acids containing a
          C-X-X-X-C-(10-12)X-C-X-X-X-C motif. These peptides
          exhibit antimicrobial activity in vitro.
          Length = 147

 Score = 28.5 bits (63), Expect = 1.4
 Identities = 12/41 (29%), Positives = 28/41 (68%), Gaps = 1/41 (2%)

Query: 19 REGRQKEE-EEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          R  R++++ E   K++ KEE++++++R E+  RR  + ++R
Sbjct: 25 RGQREQQQCERRCKEQYKEEQQQQRQREEDPQRRYEQCQQR 65



 Score = 27.8 bits (61), Expect = 2.6
 Identities = 12/47 (25%), Positives = 26/47 (55%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGS 66
          E R KE+ +E ++++++ EE+ + R E+  +R ++   R    C   
Sbjct: 34 ERRCKEQYKEEQQQQRQREEDPQRRYEQCQQRCQQHEPRHRPTCQQR 80


>gnl|CDD|149849 pfam08912, Rho_Binding, Rho Binding.  Rho Binding Domain is
          responsible for the recognition and binding of Rho
          binding domain-containing proteins (such as ROCK) to
          Rho, resulting in activation of the GTPase which in
          turn modulates the phosphorylation of various
          signalling proteins. This domain is within an
          amphipathic alpha-helical coiled-coil and interacts
          with Rho through predominantly hydrophobic
          interactions.
          Length = 68

 Score = 27.5 bits (61), Expect = 1.4
 Identities = 11/24 (45%), Positives = 15/24 (62%)

Query: 24 KEEEEEGKKEKKEEEEEKKERNEE 47
           E+EE   K KK +EE +K + EE
Sbjct: 9  NEKEELNNKLKKAQEELQKLKEEE 32


>gnl|CDD|111875 pfam03032, Brevenin, Brevenin/esculentin/gaegurin/rugosin family.
           This family contains a number of defence peptides
          secreted from the skin of amphibians, including the
          opiate-like dermorphins and deltorphins, and the
          antimicrobial dermoseptins and temporins. The alignment
          for this family includes the signal peptide.
          Length = 46

 Score = 26.5 bits (59), Expect = 1.4
 Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKK 42
          E R+ EEE E ++E +E+ E K+
Sbjct: 24 EKREDEEENEDEEEGEEQSEVKR 46



 Score = 25.8 bits (57), Expect = 3.0
 Identities = 9/23 (39%), Positives = 16/23 (69%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKK 42
          E ++++EEE   +E+ EE+ E K
Sbjct: 23 EEKREDEEENEDEEEGEEQSEVK 45



 Score = 24.6 bits (54), Expect = 8.4
 Identities = 11/32 (34%), Positives = 21/32 (65%)

Query: 13 IVPKSKREGRQKEEEEEGKKEKKEEEEEKKER 44
          +V  S  E  ++E+EEE + E++ EE+ + +R
Sbjct: 15 LVSLSLCEEEKREDEEENEDEEEGEEQSEVKR 46


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved
          in cell cycle arrest and pre-mRNA splicing. It has been
          shown to be a component of U4/U6 x U5 tri-snRNP complex
          in human, Schizosaccharomyces pombe and Saccharomyces
          cerevisiae. SART-1 is a known tumour antigen in a range
          of cancers recognised by T cells.
          Length = 603

 Score = 29.3 bits (66), Expect = 1.5
 Identities = 18/57 (31%), Positives = 27/57 (47%), Gaps = 2/57 (3%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR-RRRRRRRRDNWVCDGSSNLA 70
          P S +E R     E  KK ++EE E K++R E  ++  + R +R  N    G   L 
Sbjct: 27 PGSTKESRDAAAYENWKK-RQEEAEAKRKREELREKIAKAREKRERNSKLGGIKTLG 82



 Score = 29.3 bits (66), Expect = 1.6
 Identities = 14/35 (40%), Positives = 20/35 (57%), Gaps = 2/35 (5%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
           KSK+  RQK++E E KK    +E+EK+   E    
Sbjct: 97  KSKK--RQKKKEAERKKALLLDEKEKERAAEYTSE 129



 Score = 27.4 bits (61), Expect = 6.3
 Identities = 10/53 (18%), Positives = 20/53 (37%), Gaps = 5/53 (9%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERN-----EELDRRRRRRRRR 58
           G++        R++  +E+ + +   E  E+ ER       +  R   R R  
Sbjct: 486 GILKKNQLERERREFLKEKERLKLLAEIRERIERERDRNDGKYSRMSAREREE 538


>gnl|CDD|201951 pfam01749, IBB, Importin beta binding domain.  This family
          consists of the importin alpha (karyopherin alpha),
          importin beta (karyopherin beta) binding domain. The
          domain mediates formation of the importin alpha beta
          complex; required for classical NLS import of proteins
          into the nucleus, through the nuclear pore complex and
          across the nuclear envelope. Also in the alignment is
          the NLS of importin alpha which overlaps with the IBB
          domain.
          Length = 97

 Score = 27.7 bits (62), Expect = 1.5
 Identities = 13/33 (39%), Positives = 19/33 (57%), Gaps = 3/33 (9%)

Query: 20 EGRQKEEEE--EGKKEKKEEEEEKKERNEELDR 50
          E R++ EE   E +K K+EE+  K+ RN  L  
Sbjct: 22 EMRRRREEVGVELRKNKREEQLLKR-RNVGLPP 53


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 29.0 bits (65), Expect = 1.5
 Identities = 10/33 (30%), Positives = 14/33 (42%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           PK   +  Q EEEE G  +   E+ +      E
Sbjct: 230 PKEDEDDDQGEEEESGSSDSLSEDSDASSEEME 262


>gnl|CDD|179668 PRK03893, PRK03893, putative sialic acid transporter; Provisional.
          Length = 496

 Score = 29.3 bits (66), Expect = 1.5
 Identities = 21/73 (28%), Positives = 31/73 (42%), Gaps = 18/73 (24%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGV- 132
           S  F+    GG +L  + DRYGR  A++ S                +VLFS+     G  
Sbjct: 62  SAAFISRWFGGLLLGAMGDRYGRRLAMVIS----------------IVLFSVGTLACGFA 105

Query: 133 -GHFNAFIFYYII 144
            G++  FI   +I
Sbjct: 106 PGYWTLFIARLVI 118


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 28.5 bits (64), Expect = 1.5
 Identities = 19/65 (29%), Positives = 34/65 (52%), Gaps = 15/65 (23%)

Query: 15  PKSKR-EGRQKEEEE------EGKKEKKEEEEEKKERNE----ELDRRR----RRRRRRD 59
            +SKR EGR+K +E+      + KKE+ E+E+EK+ER      E ++ R    +++    
Sbjct: 57  AESKRPEGRKKAKEKLRRDKLKAKKEEAEKEKEKEERFMKALAEAEKERAELEKKKAEAK 116

Query: 60  NWVCD 64
               +
Sbjct: 117 LMKEE 121



 Score = 28.1 bits (63), Expect = 2.2
 Identities = 11/45 (24%), Positives = 19/45 (42%), Gaps = 3/45 (6%)

Query: 8   NCYGLIVPKSKREGRQKEEEEEGKKEKKEEEEEK---KERNEELD 49
                   + ++E R  +   E +KE+ E E++K   K   EE  
Sbjct: 79  AKKEEAEKEKEKEERFMKALAEAEKERAELEKKKAEAKLMKEEKK 123


>gnl|CDD|217476 pfam03286, Pox_Ag35, Pox virus Ag35 surface protein. 
          Length = 198

 Score = 29.0 bits (65), Expect = 1.6
 Identities = 13/31 (41%), Positives = 21/31 (67%), Gaps = 3/31 (9%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNE 46
          KSK++ ++K  EEE   +K E +++K E NE
Sbjct: 67 KSKKKDKEKLTEEE---KKPESDDDKTEENE 94


>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6.  The surfeit locus
          protein SURF-6 is shown to be a component of the
          nucleolar matrix and has a strong binding capacity for
          nucleic acids.
          Length = 206

 Score = 28.8 bits (65), Expect = 1.6
 Identities = 8/33 (24%), Positives = 22/33 (66%), Gaps = 2/33 (6%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           + +R  R++E+ +  KK+K++E ++K++  + 
Sbjct: 8  LEQRR--RKREQRKARKKQKRKEAKKKEDAQKS 38



 Score = 28.4 bits (64), Expect = 2.4
 Identities = 8/32 (25%), Positives = 21/32 (65%)

Query: 29  EGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           E +K+K ++E ++++   E  +  R+++R +N
Sbjct: 152 EKQKKKSKKEWKERKEKVEKKKAERQKKREEN 183



 Score = 28.0 bits (63), Expect = 2.8
 Identities = 9/28 (32%), Positives = 20/28 (71%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELD 49
          R+K E+ + +K++K +E +KKE  ++ +
Sbjct: 12 RRKREQRKARKKQKRKEAKKKEDAQKSE 39



 Score = 27.3 bits (61), Expect = 4.4
 Identities = 6/34 (17%), Positives = 18/34 (52%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
          R+   E+  +K ++ +  +K++R E   +   ++
Sbjct: 4  REALLEQRRRKREQRKARKKQKRKEAKKKEDAQK 37



 Score = 27.3 bits (61), Expect = 4.8
 Identities = 10/45 (22%), Positives = 28/45 (62%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          + +R+ R+K++ +E KK++  ++ E +E   E ++ +++    +N
Sbjct: 15 REQRKARKKQKRKEAKKKEDAQKSEAEEVKNEENKSKKKAAPIEN 59



 Score = 27.3 bits (61), Expect = 5.1
 Identities = 12/45 (26%), Positives = 30/45 (66%), Gaps = 4/45 (8%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEK----KERNEELDRRRRRRRRR 58
           ++E ++K+ ++E K+ K++ E++K    K+R E L +R+  ++ +
Sbjct: 150 RKEKQKKKSKKEWKERKEKVEKKKAERQKKREENLKKRKDDKKNK 194



 Score = 27.3 bits (61), Expect = 5.7
 Identities = 15/44 (34%), Positives = 29/44 (65%), Gaps = 3/44 (6%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR-RRRRDN 60
           KR+ +QK++ ++  KE+KE+ E+KK   E   +R    ++R+D+
Sbjct: 149 KRKEKQKKKSKKEWKERKEKVEKKKA--ERQKKREENLKKRKDD 190



 Score = 26.5 bits (59), Expect = 8.9
 Identities = 5/38 (13%), Positives = 21/38 (55%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
            +E   E ++ K+E+ + +K++  +  +++   ++ +
Sbjct: 2  SSREALLEQRRRKREQRKARKKQKRKEAKKKEDAQKSE 39


>gnl|CDD|219947 pfam08639, SLD3, DNA replication regulator SLD3.  The SLD3 DNA
           replication regulator is required for loading and
           maintenance of Cdc45 on chromatin during DNA
           replication.
          Length = 437

 Score = 29.4 bits (66), Expect = 1.6
 Identities = 12/33 (36%), Positives = 23/33 (69%)

Query: 26  EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           E +   K +KE +++  +R++ LDR +RR+RR+
Sbjct: 119 ELDRFTKFEKEYKKKLLKRSQNLDRSKRRKRRK 151


>gnl|CDD|203444 pfam06424, PRP1_N, PRP1 splicing factor, N-terminal.  This domain
          is specific to the N-terminal part of the prp1 splicing
          factor, which is involved in mRNA splicing (and
          possibly also poly(A)+ RNA nuclear export and cell
          cycle progression). This domain is specific to the N
          terminus of the RNA splicing factor encoded by prp1. It
          is involved in mRNA splicing and possibly also
          poly(A)+ RNA nuclear export and cell cycle
          progression.
          Length = 131

 Score = 28.4 bits (64), Expect = 1.6
 Identities = 13/48 (27%), Positives = 24/48 (50%), Gaps = 5/48 (10%)

Query: 18 KREGRQKEEEEEG---KKEKKEEEEEKKERNEELDRR--RRRRRRRDN 60
          +   R ++ + EG     +  +E+EE     E +D R   RR++RR+ 
Sbjct: 37 EDPKRYQDGDNEGLFSDGKYDDEDEEADRIYESIDERMDERRKKRREQ 84


>gnl|CDD|224559 COG1645, COG1645, Uncharacterized Zn-finger containing protein
          [General function prediction only].
          Length = 131

 Score = 28.2 bits (63), Expect = 1.6
 Identities = 10/32 (31%), Positives = 15/32 (46%)

Query: 29 EGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            ++   EEEEE+ E   +   RR R    D+
Sbjct: 51 GYREVVVEEEEEEVEAEVQEQLRRSRPELPDD 82


>gnl|CDD|221668 pfam12619, MCM2_N, Mini-chromosome maintenance protein 2.  This
           domain family is found in eukaryotes, and is typically
           between 138 and 153 amino acids in length. The family is
           found in association with pfam00493. Mini-chromosome
           maintenance (MCM) proteins are essential for DNA
           replication. These proteins use ATPase activity to
           perform this function.
          Length = 145

 Score = 28.4 bits (64), Expect = 1.6
 Identities = 6/22 (27%), Positives = 15/22 (68%)

Query: 37  EEEEKKERNEELDRRRRRRRRR 58
           ++++  + + +L  + RRRRR+
Sbjct: 92  DDDDDDDGDFDLTAQPRRRRRQ 113


>gnl|CDD|217384 pfam03137, OATP, Organic Anion Transporter Polypeptide (OATP)
           family.  This family consists of several eukaryotic
           Organic-Anion-Transporting Polypeptides (OATPs). Several
           have been identified mostly in human and rat. Different
           OATPs vary in tissue distribution and substrate
           specificity. Since the numbering of different OATPs in
           particular species was based originally on the order of
           discovery, similarly numbered OATPs in humans and rats
           did not necessarily correspond in function, tissue
           distribution and substrate specificity (in spite of the
           name, some OATPs also transport organic cations and
           neutral molecules). Thus, Tamai et al. initiated the
           current scheme of using digits for rat OATPs and letters
           for human ones. Prostaglandin transporter (PGT) proteins
           are also considered to be OATP family members. In
           addition, the methotrexate transporter OATK is closely
           related to OATPs. This family also includes several
           predicted proteins from Caenorhabditis elegans and
           Drosophila melanogaster. This similarity was not
           previously noted. Note: Members of this family are
           described (in the Swiss-Prot database) as belonging to
           the SLC21 family of transporters.
          Length = 582

 Score = 29.2 bits (66), Expect = 1.7
 Identities = 13/45 (28%), Positives = 26/45 (57%), Gaps = 4/45 (8%)

Query: 120 VVLFSLSRFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLI 164
           ++L ++  F+     F + I  Y+IVL CV P+ ++ A+   +L+
Sbjct: 498 LILMAILSFI----GFLSAIPLYMIVLRCVPPEEKSLALGVQWLL 538


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 29.3 bits (66), Expect = 1.7
 Identities = 14/43 (32%), Positives = 21/43 (48%), Gaps = 3/43 (6%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERN---EELDRRRRRRRR 57
           K E  +K E+ E   E+ EE +EK +     EEL +   R + 
Sbjct: 528 KEELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQE 570


>gnl|CDD|151665 pfam11223, DUF3020, Protein of unknown function (DUF3020).  This
          family of fungal proteins is conserved towards the
          C-terminus of HMG domains. The function is not known.
          Length = 49

 Score = 26.4 bits (58), Expect = 1.8
 Identities = 9/28 (32%), Positives = 17/28 (60%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRR 58
          ++ KK+  E   ERN++ D R R +++ 
Sbjct: 2  RERKKKWREANSERNKDNDLRSRVKKKA 29


>gnl|CDD|169428 PRK08404, PRK08404, V-type ATP synthase subunit H; Validated.
          Length = 103

 Score = 27.8 bits (62), Expect = 1.8
 Identities = 19/45 (42%), Positives = 26/45 (57%), Gaps = 10/45 (22%)

Query: 13 IVPKSKREGRQKEEE------EEGK----KEKKEEEEEKKERNEE 47
          I+ K+K E ++ EEE      EE +    K+KKE EEE K+  EE
Sbjct: 29 IIRKAKEEAKKIEEEIIKKAEEEAQKLIEKKKKEGEEEAKKILEE 73


>gnl|CDD|100108 cd04411, Ribosomal_P1_P2_L12p, Ribosomal protein P1, P2, and
          L12p. Ribosomal proteins P1 and P2 are the eukaryotic
          proteins that are functionally equivalent to bacterial
          L7/L12. L12p is the archaeal homolog. Unlike other
          ribosomal proteins, the archaeal L12p and eukaryotic P1
          and P2 do not share sequence similarity with their
          bacterial counterparts. They are part of the ribosomal
          stalk (called the L7/L12 stalk in bacteria), along with
          28S rRNA and the proteins L11 and P0 in eukaryotes (23S
          rRNA, L11, and L10e in archaea). In bacterial
          ribosomes, L7/L12 homodimers bind the extended
          C-terminal helix of L10 to anchor the L7/L12 molecules
          to the ribosome. Eukaryotic P1/P2 heterodimers and
          archaeal L12p homodimers are believed to bind the L10
          equivalent proteins, eukaryotic P0 and archaeal L10e,
          in a similar fashion. P1 and P2 (L12p, L7/L12) are the
          only proteins in the ribosome to occur as multimers,
          always appearing as sets of dimers. Recent data
          indicate that most archaeal species contain six copies
          of L12p (three homodimers), while eukaryotes have two
          copies each of P1 and P2 (two heterodimers). Bacteria
          may have four or six copies (two or three homodimers),
          depending on the species. As in bacteria, the stalk is
          crucial for binding of initiation, elongation, and
          release factors in eukaryotes and archaea.
          Length = 105

 Score = 27.6 bits (61), Expect = 1.8
 Identities = 9/20 (45%), Positives = 14/20 (70%)

Query: 21 GRQKEEEEEGKKEKKEEEEE 40
              E+ EE K+E++EEE+E
Sbjct: 79 AEPAEKAEEAKEEEEEEEDE 98


>gnl|CDD|162098 TIGR00900, 2A0121, H+ Antiporter protein.  [Transport and binding
           proteins, Cations and iron carrying compounds].
          Length = 365

 Score = 28.8 bits (65), Expect = 1.9
 Identities = 16/54 (29%), Positives = 28/54 (51%), Gaps = 1/54 (1%)

Query: 64  DGS-SNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPF 116
            GS S L++      L  ++   I   +ADRY R   ++G+ ++  + VA+ PF
Sbjct: 30  TGSASVLSLAALAGMLPYVVLSPIAGALADRYDRKKVMIGADLIRAVLVAVLPF 83


>gnl|CDD|223551 COG0475, KefB, Kef-type K+ transport systems, membrane components
           [Inorganic ion transport and metabolism].
          Length = 397

 Score = 28.7 bits (65), Expect = 1.9
 Identities = 11/85 (12%), Positives = 26/85 (30%), Gaps = 12/85 (14%)

Query: 120 VVLFSLSRFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYY 179
           ++L  +     G      FI   ++ +                 +   +    LP +   
Sbjct: 166 LLLAIVPALAGGGSGSVGFILGLLLAI------------LAFLALLLLLGRYLLPPLFRR 213

Query: 180 LADWQWISVITIFPLIVGLIVAIFT 204
           +A  +   +  +F L++ L  A   
Sbjct: 214 VAKTESSELFILFVLLLVLGAAYLA 238


>gnl|CDD|218636 pfam05557, MAD, Mitotic checkpoint protein.  This family consists
           of several eukaryotic mitotic checkpoint (Mitotic arrest
           deficient or MAD) proteins. The mitotic spindle
           checkpoint monitors proper attachment of the bipolar
           spindle to the kinetochores of aligned sister chromatids
           and causes a cell cycle arrest in prometaphase when
           failures occur. Multiple components of the mitotic
           spindle checkpoint have been identified in yeast and
           higher eukaryotes. In S.cerevisiae, the existence of a
           Mad1-dependent complex containing Mad2, Mad3, Bub3 and
           Cdc20 has been demonstrated.
          Length = 722

 Score = 29.1 bits (65), Expect = 2.0
 Identities = 14/41 (34%), Positives = 20/41 (48%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           +P+ +RE     EE    +  KE+ E  KE  E+L  R  R
Sbjct: 259 IPELERELAALREENRKLRSMKEDNELLKEELEDLQSRLER 299


>gnl|CDD|216682 pfam01757, Acyl_transf_3, Acyltransferase family.  This family
           includes a range of acyltransferase enzymes. This domain
           is found in many as yet uncharacterized C. elegans
           proteins and it is approximately 300 amino acids long.
          Length = 326

 Score = 28.7 bits (64), Expect = 2.0
 Identities = 20/132 (15%), Positives = 51/132 (38%), Gaps = 5/132 (3%)

Query: 77  FLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVGHFN 136
           FL +L   ++L  +  R  R    L   ++  L + L+     ++L  L   +  +    
Sbjct: 127 FLPALFVFYLLLPLLLRLLRKLKKLLLLLLLALLLLLSLLYILILLVGLPPTVLNLLIGL 186

Query: 137 AFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLA-----DWQWISVITI 191
              F    +L     + R+  +    +I   ++ +AL  +  +L        +     ++
Sbjct: 187 LPFFLLGALLARYRKRIRSKRLLLLIVILLALALLALILLLLFLFGLVYLAPELYGYFSL 246

Query: 192 FPLIVGLIVAIF 203
             L++G+++ + 
Sbjct: 247 LLLLLGVLLLLL 258



 Score = 27.5 bits (61), Expect = 4.7
 Identities = 20/132 (15%), Positives = 45/132 (34%), Gaps = 6/132 (4%)

Query: 76  FFLGSLLGGFILSWVADR-YGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVGH 134
           F LG+LL  +     + R    I  +L   +++ + + L  F    +   L  + + +  
Sbjct: 190 FLLGALLARYRKRIRSKRLLLLIVILLALALLALILLLLFLFGLVYLAPELYGYFSLLLL 249

Query: 135 FNAFIFYYIIVLECVGPKWRTFAMTFPF----LIFYTVSEVALPWIAYYLADWQWISVIT 190
               +   ++ L  +        +        L  Y +    L  +   L     +  I 
Sbjct: 250 LLGVLLLLLLAL-LLANLRSLKRLLKYLGKYSLGIYLIHPPILLLLTKLLLLLPPLGPIL 308

Query: 191 IFPLIVGLIVAI 202
           +F L + L + +
Sbjct: 309 LFLLALVLTLLV 320


>gnl|CDD|151173 pfam10669, Phage_Gp23, Protein gp23 (Bacteriophage A118).  This
          is the highly conserved family of the major tail
          subunit protein.
          Length = 121

 Score = 27.8 bits (61), Expect = 2.0
 Identities = 14/47 (29%), Positives = 25/47 (53%)

Query: 13 IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          IV    +E R K E E  K++K+ +EE  K  +   +++R   ++ D
Sbjct: 43 IVRVEMKEERDKMETEREKRDKESKEERDKFISTMNEQQRLMDKQND 89


>gnl|CDD|239286 cd02988, Phd_like_VIAF, Phosducin (Phd)-like family, Viral
          inhibitor of apoptosis (IAP)-associated factor (VIAF)
          subfamily; VIAF is a Phd-like protein that functions in
          caspase activation during apoptosis. It was identified
          as an IAP binding protein through a screen of a human
          B-cell library using a prototype IAP. VIAF lacks a
          consensus IAP binding motif and while it does not
          function as an IAP antagonist, it still plays a
          regulatory role in the complete activation of caspases.
          VIAF itself is a substrate for IAP-mediated
          ubiquitination, suggesting that it may be a target of
          IAPs in the prevention of cell death. The similarity of
          VIAF to Phd points to a potential role distinct from
          apoptosis regulation. Phd functions as a cytosolic
          regulator of G protein by specifically binding to G
          protein betagamma (Gbg)-subunits. The C-terminal domain
          of Phd adopts a thioredoxin fold, but it does not
          contain a CXXC motif. Phd interacts with G protein beta
          mostly through the N-terminal helical domain.
          Length = 192

 Score = 28.4 bits (64), Expect = 2.1
 Identities = 10/34 (29%), Positives = 18/34 (52%), Gaps = 1/34 (2%)

Query: 23 QKEEEEEGKKEKK-EEEEEKKERNEELDRRRRRR 55
            E   E K   + +EE +++E +  L+  RR+R
Sbjct: 35 AHENALEKKLLDELDEELDEEEDDRFLEEYRRKR 68


>gnl|CDD|178635 PLN03086, PLN03086, PRLI-interacting factor K; Provisional.
          Length = 567

 Score = 28.7 bits (64), Expect = 2.2
 Identities = 14/49 (28%), Positives = 25/49 (51%), Gaps = 5/49 (10%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR-----RRRRRRD 59
          +  RE  ++E+ E  ++ K + E E+K + E   +R      +R RR D
Sbjct: 6  RRAREKLEREQRERKQRAKLKLERERKAKEEAAKQREAIEAAQRSRRLD 54


>gnl|CDD|182502 PRK10504, PRK10504, putative transporter; Provisional.
          Length = 471

 Score = 28.9 bits (65), Expect = 2.2
 Identities = 39/125 (31%), Positives = 53/125 (42%), Gaps = 19/125 (15%)

Query: 89  WVADRYGRITAVLGSHVVSF-LGVALTPFSKDVVLFSLSRFLTGVGHFNAFIFYYIIVLE 147
           W+ADR G +  +  + +V F LG      S  +    L+R L GVG         + V++
Sbjct: 67  WLADRVG-VRNIFFTAIVLFTLGSLFCALSGTLNELLLARVLQGVGGAMMVPVGRLTVMK 125

Query: 148 CVGPKWRTFAMTF--------PFLIFYTVSEVALPWIAYYLADWQWISVITIFPLIVGLI 199
            V  +    AMTF        P L        AL  +    A W WI +I I P  VG+I
Sbjct: 126 IVPREQYMAAMTFVTLPGQVGPLL------GPALGGLLVEYASWHWIFLINI-P--VGII 176

Query: 200 VAIFT 204
            AI T
Sbjct: 177 GAIAT 181


>gnl|CDD|238356 cd00660, Topoisomer_IB_N, Topoisomer_IB_N: N-terminal DNA binding
           fragment found in eukaryotic DNA topoisomerase (topo) IB
           proteins similar to the monomeric yeast and human topo I
           and heterodimeric topo I from Leishmania donovani. Topo
           I enzymes are divided into:  topo type IA (bacterial)
           and type IB (eukaryotic). Topo I relaxes superhelical
           tension in duplex DNA by creating a single-strand nick;
           the broken strand can then rotate around the unbroken
           strand to remove DNA supercoils, and the nick is
           religated, liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit re-ligation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  In addition to differences in
           structure and some biochemical properties,
           Trypanosomatid parasite topo I differ from human topo I
           in their sensitivity to CPTs and other classical topo I
           inhibitors. Trypanosomatid topos I play putative roles
           in organizing the kinetoplast DNA network unique to
           these parasites.  This family may represent more than
           one structural domain.
          Length = 215

 Score = 28.4 bits (64), Expect = 2.3
 Identities = 10/25 (40%), Positives = 17/25 (68%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEE 40
           K K++   KEE++  K+EK++ EE 
Sbjct: 98  KEKKKAMSKEEKKAIKEEKEKLEEP 122


>gnl|CDD|221893 pfam13010, pRN1_helical, Primase helical domain.  This alpha
          helical domain is found in a set of bacterial plasmid
          replication proteins. The domain is found to the
          C-terminus of the primase/polymerase domain. Mutants of
          this domain are defective in template binding,
          dinucleotide formation and conformation change prior to
          DNA extension.
          Length = 135

 Score = 27.9 bits (62), Expect = 2.3
 Identities = 10/27 (37%), Positives = 19/27 (70%)

Query: 29 EGKKEKKEEEEEKKERNEELDRRRRRR 55
          EGKKE+++ EE+ ++  EE+ +  R +
Sbjct: 1  EGKKEEEDSEEDFEKLKEEMAKYDRFK 27


>gnl|CDD|218744 pfam05781, MRVI1, MRVI1 protein.  This family consists of mammalian
           MRVI1 proteins which are related to the
           lymphoid-restricted membrane protein (JAW1) and the IP3
           receptor associated cGMP kinase substrates A and B
           (IRAGA and IRAGB). The function of MRVI1 is unknown
           although mutations in the Mrvi1 gene induce myeloid
           leukaemia by altering the expression of a gene important
           for myeloid cell growth and/or differentiation so it has
           been speculated that Mrvi1 is a tumour suppressor gene.
           IRAG is very similar in sequence to MRVI1 and is an
           essential NO/cGKI-dependent regulator of IP3-induced
           calcium release. Activation of cGKI decreases
           IP3-stimulated elevations in intracellular calcium,
           induces smooth muscle relaxation and contributes to the
           antiproliferative and pro-apoptotic effects of NO/cGMP.
           Jaw1 is a member of a class of proteins with
           COOH-terminal hydrophobic membrane anchors and is
           structurally similar to proteins involved in vesicle
           targeting and fusion. This suggests that the function
           and/or the structure of the ER in lymphocytes may be
           modified by lymphoid-restricted resident ER proteins.
          Length = 538

 Score = 28.8 bits (64), Expect = 2.4
 Identities = 11/38 (28%), Positives = 20/38 (52%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
           K  ++ R+ E EE  ++ +K    E+    EE D+ +R
Sbjct: 421 KKLQDLREPEGEEAVERTRKPSLSEEVAETEEWDKEQR 458


>gnl|CDD|153340 cd07656, F-BAR_srGAP, The F-BAR (FES-CIP4 Homology and
           Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase
           Activating Proteins.  F-BAR domains are dimerization
           modules that bind and bend membranes and are found in
           proteins involved in membrane dynamics and actin
           reorganization. Slit-Robo GTPase Activating Proteins
           (srGAPs) are Rho GAPs that interact with Robo1, the
           transmembrane receptor of Slit proteins. Slit proteins
           are secreted proteins that control axon guidance and the
           migration of neurons and leukocytes. Vertebrates contain
           three isoforms of srGAPs, all of which are expressed
           during embryonic and early development in the nervous
           system but with different localization and timing.
           srGAPs contain an N-terminal F-BAR domain, a Rho GAP
           domain, and a C-terminal SH3 domain. F-BAR domains form
           banana-shaped dimers with a positively-charged concave
           surface that binds to negatively-charged lipid
           membranes. They can induce membrane deformation in the
           form of long tubules.
          Length = 241

 Score = 28.5 bits (64), Expect = 2.4
 Identities = 10/33 (30%), Positives = 20/33 (60%)

Query: 26  EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           +  E K ++ E++EEK+E++ E    R R  ++
Sbjct: 148 KSAERKLKEAEKQEEKQEQSPEKKLERSRSSKK 180


>gnl|CDD|106886 PHA00094, VI, minor coat protein.
          Length = 112

 Score = 27.5 bits (61), Expect = 2.4
 Identities = 16/56 (28%), Positives = 24/56 (42%), Gaps = 1/56 (1%)

Query: 69  LAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFS 124
           L I     FLG+L    ++ + A  + R  A     +  F+G+ L   S  V L S
Sbjct: 5   LGIPALARFLGTLAAN-LIGYFAKFFTRGIARNALAISLFIGLILGLNSALVALLS 59


>gnl|CDD|227238 COG4901, COG4901, Ribosomal protein S25 [Translation, ribosomal
          structure and biogenesis].
          Length = 107

 Score = 27.6 bits (61), Expect = 2.4
 Identities = 12/52 (23%), Positives = 25/52 (48%), Gaps = 6/52 (11%)

Query: 14 VPKSKREGRQKEEEE------EGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           PKS+    +K E+       + KK  K++++E+  R   +D     + R++
Sbjct: 2  APKSQLSKEKKAEKAKAGTAKDKKKWSKKKKKEEARRAVTVDEELLDKIRKE 53


>gnl|CDD|107164 PHA02277, PHA02277, hypothetical protein.
          Length = 150

 Score = 27.8 bits (61), Expect = 2.5
 Identities = 11/26 (42%), Positives = 20/26 (76%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKE 43
           KRE   KE++EE +++++EE ++K E
Sbjct: 109 KREAYLKEKQEELRQKQQEEAQKKTE 134


>gnl|CDD|177433 PHA02608, 67, prohead core protein; Provisional.
          Length = 80

 Score = 27.1 bits (60), Expect = 2.5
 Identities = 5/30 (16%), Positives = 21/30 (70%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
          EG + E++++ + +  +++++ K+ +++ D
Sbjct: 46 EGEEPEDDDDDEDDDDDDDKDDKDDDDDDD 75


>gnl|CDD|182234 PRK10091, PRK10091, MFS transport protein AraJ; Provisional.
          Length = 382

 Score = 28.5 bits (64), Expect = 2.5
 Identities = 19/79 (24%), Positives = 36/79 (45%), Gaps = 1/79 (1%)

Query: 74  SIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLSRFLTGVG 133
           S + LG ++G  I++  + RY     +L    +  +G A+   S   ++ ++ R ++G  
Sbjct: 45  SYYALGVVVGAPIIALFSSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFP 104

Query: 134 HFNAFIFYYIIVLECVGPK 152
           H  AF     IVL  +   
Sbjct: 105 H-GAFFGVGAIVLSKIIKP 122


>gnl|CDD|233167 TIGR00881, 2A0104, phosphoglycerate transporter family protein.
           [Transport and binding proteins, Carbohydrates, organic
           alcohols, and acids].
          Length = 379

 Score = 28.5 bits (64), Expect = 2.6
 Identities = 31/158 (19%), Positives = 57/158 (36%), Gaps = 40/158 (25%)

Query: 66  SSNLAITRSIFFLGSLLGGFILSWVADRYGR----ITAVLGSHVVS-FLGVALTPFSKDV 120
            ++L +  S F +   +  F++  V+DR          ++   +V+ F G     FS  +
Sbjct: 29  KTDLGLLLSSFSIAYGISKFVMGSVSDRSNPRVFLPIGLILCAIVNLFFG-----FSTSL 83

Query: 121 VLFSLSRFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMTFP------FLIFYTVSE---- 170
            + +    L G+  F    +          P  RT    F       ++ F+  S     
Sbjct: 84  WVMAALWALNGI--FQGMGW---------PPCGRTVTKWFSRSERGTWVSFWNCSHNVGG 132

Query: 171 -----VALPWIAYYLADWQWISVITIFPLIVGLIVAIF 203
                + L  IA     W W   + I P I+ +IV++ 
Sbjct: 133 GLLPPLVLFGIAELY-SWHW---VFIVPGIIAIIVSLI 166


>gnl|CDD|220897 pfam10883, DUF2681, Protein of unknown function (DUF2681).  This
          family of proteins is found in bacteria. Proteins in
          this family are typically between 81 and 117 amino
          acids in length.
          Length = 87

 Score = 27.0 bits (60), Expect = 2.7
 Identities = 16/45 (35%), Positives = 24/45 (53%), Gaps = 6/45 (13%)

Query: 16 KSKREGRQ-KEEEEEGKKEKKEEEEE-----KKERNEELDRRRRR 54
          K++RE R+ + E E+   EK   E E      +++NEE  RR  R
Sbjct: 27 KAQRENRKLQAENEQLATEKAVAETEVKNAKVRQKNEENTRRLSR 71


>gnl|CDD|214395 CHL00204, ycf1, Ycf1; Provisional.
          Length = 1832

 Score = 28.5 bits (64), Expect = 2.8
 Identities = 10/27 (37%), Positives = 19/27 (70%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNE 46
           +  +++ +++ KKEKK+EEE K+E   
Sbjct: 739 DSVEEKTKKKKKKEKKKEEEYKREEKA 765



 Score = 27.4 bits (61), Expect = 7.9
 Identities = 11/25 (44%), Positives = 19/25 (76%)

Query: 26  EEEEGKKEKKEEEEEKKERNEELDR 50
           EE+  KK+KKE+++E++ + EE  R
Sbjct: 742 EEKTKKKKKKEKKKEEEYKREEKAR 766


>gnl|CDD|218223 pfam04712, Radial_spoke, Radial spokehead-like protein.  This
           family includes the radial spoke head proteins RSP4 and
           RSP6 from Chlamydomonas reinhardtii, and several
           eukaryotic homologues, including mammalian RSHL1, the
           protein product of a familial ciliary dyskinesia
           candidate gene.
          Length = 481

 Score = 28.5 bits (64), Expect = 2.8
 Identities = 14/25 (56%), Positives = 19/25 (76%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEE 47
           QK+EEEE + E++EEEEE+ E  E 
Sbjct: 346 QKDEEEEQEDEEEEEEEEEPEEPEP 370



 Score = 28.5 bits (64), Expect = 2.9
 Identities = 13/34 (38%), Positives = 22/34 (64%), Gaps = 1/34 (2%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           P+ K E  ++E+EEE ++E++E EE + E    L
Sbjct: 344 PEQKDEEEEQEDEEE-EEEEEEPEEPEPEEGPPL 376



 Score = 28.1 bits (63), Expect = 3.9
 Identities = 11/38 (28%), Positives = 22/38 (57%), Gaps = 3/38 (7%)

Query: 13  IVPKSKREGRQK---EEEEEGKKEKKEEEEEKKERNEE 47
            V     +GR      E+++ ++E+++EEEE++E   E
Sbjct: 329 HVQHILPQGRCTWVNPEQKDEEEEQEDEEEEEEEEEPE 366



 Score = 26.9 bits (60), Expect = 8.1
 Identities = 11/40 (27%), Positives = 17/40 (42%)

Query: 5   GNCNCYGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKER 44
           G C          + E   +EEEEE ++ ++ E EE    
Sbjct: 337 GRCTWVNPEQKDEEEEQEDEEEEEEEEEPEEPEPEEGPPL 376


>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34.  This family represents
           herpes virus protein U79 and cytomegalovirus early
           phosphoprotein P34 (UL112).
          Length = 238

 Score = 28.3 bits (63), Expect = 2.8
 Identities = 11/41 (26%), Positives = 27/41 (65%), Gaps = 3/41 (7%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKE---RNEELDRRRRRRRRRD 59
           R+K ++E  K+  K++E+ + E   +++E  R+++  +RR+
Sbjct: 152 RKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRKKQEEKRRN 192


>gnl|CDD|187765 cd09325, TDT_C4-dicarb_trans, C4-dicarboxylate transporters of the
           Tellurite-resistance/Dicarboxylate Transporter (TDT)
           family.  This subfamily contains bacterial
           C4-dicarboxylate transporters, which are part of the
           Tellurite-resistance/Dicarboxylate Transporter (TDT)
           family. It includes Tellurite resistance protein tehA;
           the tehA gene encodes an integral membrane protein that
           has been shown to have efflux activity of quaternary
           ammonium compounds. TehA protein of Escherichia coli
           functions as a tellurite-resistance uptake permease.
          Length = 293

 Score = 28.3 bits (64), Expect = 2.8
 Identities = 9/47 (19%), Positives = 23/47 (48%)

Query: 157 AMTFPFLIFYTVSEVALPWIAYYLADWQWISVITIFPLIVGLIVAIF 203
           A TFP +I  T  +    ++A Y      +  + +F +++  ++ ++
Sbjct: 241 AFTFPLVISATALKKTSTYLASYGLYLPILKYLALFEILIATVIVLY 287


>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
           complex aNOP56 subunit; Provisional.
          Length = 414

 Score = 28.4 bits (64), Expect = 2.8
 Identities = 11/30 (36%), Positives = 21/30 (70%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKER 44
           PK KRE ++ ++ ++ KK KK+ ++ KK+ 
Sbjct: 383 PKKKREEKKPQKRKKKKKRKKKGKKRKKKG 412



 Score = 26.9 bits (60), Expect = 8.0
 Identities = 11/43 (25%), Positives = 26/43 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           K +   R +E +E+  K  K++ EEKK +  +  ++R+++ ++
Sbjct: 365 KEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKK 407


>gnl|CDD|240520 cd13156, KOW_RPL6, KOW motif of Ribosomal Protein L6.  RPL6
           contains KOW motif that has an extra ribosomal role as
           an oncogenic. KOW domain is known as an RNA-binding
           motif that is shared so far among some families of
           ribosomal proteins, the essential bacterial
           transcriptional elongation factor NusG, the eukaryotic
           chromatin elongation factor Spt5, the higher eukaryotic
           KIN17 proteins and Mtr4.
          Length = 152

 Score = 27.5 bits (62), Expect = 2.9
 Identities = 12/36 (33%), Positives = 21/36 (58%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
            K K++ +++ E  E KK+K    EE+KE  + +D 
Sbjct: 84  KKKKKKKKKEGEFFEEKKKKYVVSEERKEDQKAVDA 119


>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1.  This domain
           family is found in eukaryotes, and is approximately 40
           amino acids in length. The family is found in
           association with pfam07719, pfam00515. There is a single
           completely conserved residue L that may be functionally
           important. NARP1 is the mammalian homologue of a yeast
           N-terminal acetyltransferase that regulates entry into
           the G(0) phase of the cell cycle.
          Length = 516

 Score = 28.4 bits (64), Expect = 2.9
 Identities = 8/32 (25%), Positives = 18/32 (56%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           K +R+  +K E+EE +K   +++ E   +  +
Sbjct: 416 KKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAK 447



 Score = 27.6 bits (62), Expect = 6.2
 Identities = 8/34 (23%), Positives = 18/34 (52%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           K ++  ++ E+EE  K   K++ E   ++ +  D
Sbjct: 417 KQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKGPD 450


>gnl|CDD|198139 smart01071, CDC37_N, Cdc37 N terminal kinase binding.  Cdc37 is a
           molecular chaperone required for the activity of
           numerous eukaryotic protein kinases. This domain
           corresponds to the N terminal domain which binds
           predominantly to protein kinases and is found N-terminal
           to the Hsp (heat shock protein) 90-binding domain.
           Expression of a construct consisting of only the
           N-terminal domain of Schizosaccharomyces pombe Cdc37 results
           in cellular viability. This indicates that interactions
           with the cochaperone Hsp90 may not be essential for
           Cdc37 function.
          Length = 154

 Score = 27.8 bits (62), Expect = 3.0
 Identities = 11/39 (28%), Positives = 19/39 (48%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
           +KE EE     +   EE KK R++    ++  R++ D  
Sbjct: 96  KKELEEANGDSEGLLEELKKHRDKLKKEQKELRKKLDEL 134



 Score = 26.6 bits (59), Expect = 7.8
 Identities = 9/24 (37%), Positives = 16/24 (66%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERN 45
           R K ++E+ +  KK +E EK+E+ 
Sbjct: 117 RDKLKKEQKELRKKLDELEKEEKK 140


>gnl|CDD|226692 COG4241, COG4241, Predicted membrane protein [Function unknown].
          Length = 314

 Score = 28.2 bits (63), Expect = 3.0
 Identities = 18/77 (23%), Positives = 35/77 (45%), Gaps = 9/77 (11%)

Query: 136 NAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEV--------ALPWIAYYLADWQWIS 187
           N+F+F+Y+IV+ C+   W+       + I++    V         L  I +YL      +
Sbjct: 215 NSFLFWYLIVV-CLELLWKYENGQVTYSIYWNFLMVLGLLLAIQGLSVIFFYLKAKGLPN 273

Query: 188 VITIFPLIVGLIVAIFT 204
            + +  LI+G+I+    
Sbjct: 274 AVIVLILILGIILTPLL 290


>gnl|CDD|221490 pfam12253, CAF1A, Chromatin assembly factor 1 subunit A.  The
          CAF-1 or chromatin assembly factor-1 consists of three
          subunits, and this is the first, or A. The A domain is
          uniquely required for the progression of S phase in
          mouse cells, independent of its ability to promote
          histone deposition but dependent on its ability to
          interact with HP1 - heterochromatin protein 1-rich
          heterochromatin domains next to centromeres that are
          crucial for chromosome segregation during mitosis. This
          HP1-CAF-1 interaction module functions as a built-in
          replication control for heterochromatin, which, like a
          control barrier, has an impact on S-phase progression
          in addition to DNA-based checkpoints.
          Length = 76

 Score = 26.4 bits (59), Expect = 3.1
 Identities = 11/26 (42%), Positives = 22/26 (84%), Gaps = 1/26 (3%)

Query: 25 EEEEEGKK-EKKEEEEEKKERNEELD 49
          EEEEEG+  E ++EE+E+++ ++++D
Sbjct: 49 EEEEEGEDLESEDEEDEEEDDDDDMD 74


>gnl|CDD|205163 pfam12958, DUF3847, Protein of unknown function (DUF3847).  A
          family of uncharacterized proteins found by clustering
          human gut metagenomic sequences.
          Length = 86

 Score = 26.6 bits (59), Expect = 3.2
 Identities = 14/42 (33%), Positives = 26/42 (61%), Gaps = 2/42 (4%)

Query: 17 SKREGRQKEEEEEGKKEKKEEEEEKKERNEE--LDRRRRRRR 56
           K E  ++E E+  KK +KE+ +EK+ +N+   L++  R+ R
Sbjct: 1  KKLEQLRQEIEKAEKKLRKEQHKEKRLQNQLKKLEKGERKER 42


>gnl|CDD|205692 pfam13514, AAA_27, AAA domain.  This domain is found in a number of
           double-strand DNA break proteins. This domain contains a
           P-loop motif.
          Length = 1118

 Score = 28.6 bits (64), Expect = 3.2
 Identities = 10/38 (26%), Positives = 16/38 (42%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
            +  RQ+EE    + E+ E+E E  E      R   + 
Sbjct: 556 LQSLRQQEEAARRRLEQLEKELEVLELALAALREAWQA 593


>gnl|CDD|189539 pfam00424, REV, REV protein (anti-repression trans-activator
          protein). 
          Length = 90

 Score = 26.6 bits (59), Expect = 3.2
 Identities = 7/12 (58%), Positives = 7/12 (58%)

Query: 50 RRRRRRRRRDNW 61
          RR RRRR R   
Sbjct: 35 RRNRRRRWRQRQ 46


>gnl|CDD|240577 cd12950, RRP7_Rrp7p, RRP7 domain ribosomal RNA-processing protein 7
           (Rrp7p) and similar proteins.  This CD corresponds to
           the RRP7 domain of Rrp7p. Rrp7p is encoded by YCL031C
           gene from Saccharomyces cerevisiae. It is an essential
           yeast protein involved in pre-rRNA processing and
           ribosome assembly. Rrp7p contains an N-terminal RNA
           recognition motif (RRM), also termed RBD (RNA binding
           domain) or RNP (ribonucleoprotein domain), and a
           C-terminal RRP7 domain.
          Length = 128

 Score = 27.2 bits (61), Expect = 3.2
 Identities = 13/51 (25%), Positives = 31/51 (60%), Gaps = 3/51 (5%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD---RRRRRRRRRD 59
            +V   ++     EE  +  +E+K+E+E+KK++ +EL+   R + R ++++
Sbjct: 55  TVVRGGRKGPAAGEEAGKAAEEEKKEKEKKKKKKKELEDFYRFQLREKKKE 105


>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
          Length = 880

 Score = 28.5 bits (64), Expect = 3.2
 Identities = 19/48 (39%), Positives = 27/48 (56%), Gaps = 5/48 (10%)

Query: 18  KREGRQK-EEEEEGKKEKKEEEEEKKERNEEL----DRRRRRRRRRDN 60
            RE  ++ E E E  +E+ EE EE+ ER E+L    DR  R   RR++
Sbjct: 473 DRERVEELEAELEDLEEEVEEVEERLERAEDLVEAEDRIERLEERRED 520


>gnl|CDD|223898 COG0828, RpsU, Ribosomal protein S21 [Translation, ribosomal
          structure and biogenesis].
          Length = 67

 Score = 26.1 bits (58), Expect = 3.2
 Identities = 12/45 (26%), Positives = 25/45 (55%), Gaps = 2/45 (4%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          K ++EG  +E +E    EK  E+ ++K+      +R+ +R R++ 
Sbjct: 22 KVEKEGILREMKEREFYEKPSEKRKRKK--AAARKRKFKRLRKEQ 64


>gnl|CDD|216292 pfam01086, Clathrin_lg_ch, Clathrin light chain. 
          Length = 225

 Score = 27.8 bits (62), Expect = 3.2
 Identities = 10/31 (32%), Positives = 18/31 (58%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           RE R    EE  +  +K++EE  ++  +E+D
Sbjct: 118 RERRDLRIEERDEASEKKKEELIEKAQKEID 148


>gnl|CDD|115279 pfam06609, TRI12, Fungal trichothecene efflux pump (TRI12).  This
           family consists of several fungal specific trichothecene
           efflux pump proteins. Many of the genes involved in
           trichothecene toxin biosynthesis in Fusarium
           sporotrichioides are present within a gene cluster. It
           has been suggested that TRI12 may play a role in F.
           sporotrichioides self-protection against trichothecenes.
          Length = 598

 Score = 28.5 bits (63), Expect = 3.3
 Identities = 35/155 (22%), Positives = 65/155 (41%), Gaps = 23/155 (14%)

Query: 66  SSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGSHVVSFLG--VALTPFSKDVVLF 123
           S N  +  +++ +G  +   ++  + DR+GR   V+ +H++  +G  V  T    + +L 
Sbjct: 77  SENQGLFSTLWTMGQAVSILMMGRLTDRFGRRPFVIATHIIGLVGAIVGCTANKFNTLLA 136

Query: 124 SLSRFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVS------EVALPWIA 177
           +++      G   A   +   + E +  K +       FL    VS        A P+  
Sbjct: 137 AMTLLGVAAGPAGASPLF---IGELMSNKTK-------FLGLLIVSAPTIAMNGAGPYFG 186

Query: 178 YYLA---DWQWISVITIF--PLIVGLIVAIFTPES 207
             LA   +W+WI  I I    + V LI+  + P S
Sbjct: 187 QRLAIQGNWRWIFYIYIIMSAIAVLLIIIWYHPPS 221


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 28.3 bits (63), Expect = 3.3
 Identities = 11/34 (32%), Positives = 24/34 (70%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
           + +    ++EEEEE ++E+++E EE++  +EE +
Sbjct: 440 EEEESVEEEEEEEEEEEEEEQESEEEEGEDEEEE 473


>gnl|CDD|240364 PTZ00332, PTZ00332, paraflagellar rod protein; Provisional.
          Length = 589

 Score = 28.4 bits (63), Expect = 3.3
 Identities = 11/37 (29%), Positives = 22/37 (59%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           KR    KE+ E   +E ++ +EE   + ++L+R+ +R
Sbjct: 299 KRYATNKEKSERFIRENEDRQEEAWNKIQDLERQLQR 335


>gnl|CDD|222977 PHA03089, PHA03089, late transcription factor VLTF-4;
          Provisional.
          Length = 191

 Score = 27.9 bits (62), Expect = 3.3
 Identities = 14/34 (41%), Positives = 21/34 (61%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
          K K   R+K+  ++ KK+KKE+EE  +   EEL 
Sbjct: 53 KKKTTPRKKKTTKKTKKKKKEKEEVPELAAEELS 86



 Score = 27.5 bits (61), Expect = 4.0
 Identities = 9/36 (25%), Positives = 20/36 (55%), Gaps = 1/36 (2%)

Query: 14 VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELD 49
            K+K++ ++KEE  E    ++  + E+ E N++  
Sbjct: 64 TKKTKKKKKEKEEVPE-LAAEELSDSEENEENDKKV 98


>gnl|CDD|216249 pfam01025, GrpE, GrpE. 
          Length = 165

 Score = 27.6 bits (62), Expect = 3.3
 Identities = 16/53 (30%), Positives = 27/53 (50%), Gaps = 8/53 (15%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR--------RRRRRRRRDN 60
          + + E   ++EEE  ++E +E EEE +E  + L R        R+R  R R+ 
Sbjct: 2  EKEEEEELEDEEEALEEELEELEEEIEELKDRLLRLLAEFENYRKRTEREREE 54


>gnl|CDD|218467 pfam05147, LANC_like, Lanthionine synthetase C-like protein.
           Lanthionines are thioether bridges that are putatively
           generated by dehydration of Ser and Thr residues
           followed by addition of cysteine residues within the
           peptide. This family contains the lanthionine synthetase
           C-like proteins 1 and 2 which are related to the
           bacterial lanthionine synthetase components C (LanC).
           LANCL1 (P40 seven-transmembrane-domain protein) and
           LANCL2 (testes-specific adriamycin sensitivity protein)
           are thought to be peptide-modifying enzyme components in
           eukaryotic cells. Both proteins are produced in large
           quantities in the brain and testes and may have a role in
           the immune surveillance of these organs. Lanthionines
           are found in lantibiotics, which are peptide-derived,
           post-translationally modified antimicrobials produced by
           several bacterial strains. This region contains seven
           internal repeats.
          Length = 352

 Score = 28.1 bits (63), Expect = 3.5
 Identities = 15/65 (23%), Positives = 21/65 (32%), Gaps = 3/65 (4%)

Query: 16  KSKREGRQKEEEEE-GKKEKKEEEEEKKERNEE-LDRRRRRRRRRDNWVCDGSSNLAITR 73
               +G   E+  E  KK    E   K   +    D R   R RR  W C G+  + +  
Sbjct: 176 LLYLKGTGNEKLLELIKKALNYELSLKFPDSGNWPDSRGNERDRRVAW-CHGAPGILLAL 234

Query: 74  SIFFL 78
                
Sbjct: 235 LKAAK 239


>gnl|CDD|218258 pfam04774, HABP4_PAI-RBP1, Hyaluronan / mRNA binding family.
          This family includes the HABP4 family of
          hyaluronan-binding proteins, and the PAI-1 mRNA-binding
          protein, PAI-RBP1. HABP4 has been observed to bind
          hyaluronan (a glucosaminoglycan), but it is not known
          whether this is its primary role in vivo. It has also
          been observed to bind RNA, but with a lower affinity
          than that for hyaluronan. PAI-1 mRNA-binding protein
          specifically binds the mRNA of type-1 plasminogen
          activator inhibitor (PAI-1), and is thought to be
          involved in regulation of mRNA stability. However, in
          both cases, the sequence motifs predicted to be
          important for ligand binding are not conserved
          throughout the family, so it is not known whether
          members of this family share a common function.
          Length = 106

 Score = 27.0 bits (60), Expect = 3.5
 Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 20 EGRQKEEEEEGKKEKKEEEEEKKE 43
          E +  EEE   +   +EEE E++E
Sbjct: 52 EKQAVEEEANKEGVVEEEEVEEEE 75


>gnl|CDD|239285 cd02987, Phd_like_Phd, Phosducin (Phd)-like family, Phd
          subfamily; Phd is a cytosolic regulator of G protein
          functions. It specifically binds G protein betagamma
          (Gbg)-subunits with high affinity, resulting in the
          solubilization of Gbg from the plasma membrane. This
          impedes the formation of a functional G protein trimer
          (G protein alphabetagamma), thereby inhibiting G
          protein-mediated signal transduction. Phd also inhibits
          the GTPase activity of G protein alpha. Phd can be
          phosphorylated by protein kinase A and G
          protein-coupled receptor kinase 2, leading to its
          inactivation. Phd was originally isolated from the
          retina, where it is highly expressed and has been
          implicated to play an important role in light
          adaptation. It is also found in the pineal gland,
          liver, spleen, striated muscle and the brain. The
          C-terminal domain of Phd adopts a thioredoxin fold, but
          it does not contain a CXXC motif. Phd interacts with G
          protein beta mostly through the N-terminal helical
          domain.
          Length = 175

 Score = 27.6 bits (62), Expect = 3.6
 Identities = 8/29 (27%), Positives = 18/29 (62%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           KE ++E+++  E  EE  ++ R +R ++
Sbjct: 22 LKESEQEDDDDDEDKEEFLQQYREQRMQE 50


>gnl|CDD|223783 COG0711, AtpF, F0F1-type ATP synthase, subunit b [Energy production
           and conversion].
          Length = 161

 Score = 27.6 bits (62), Expect = 3.6
 Identities = 15/43 (34%), Positives = 23/43 (53%), Gaps = 3/43 (6%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKER-NEELDRRRRR 54
           I+ ++K+E  Q  EE   K E +EE E  KE    E++  + R
Sbjct: 77  IIEQAKKEAEQIAEEI--KAEAEEELERIKEAAEAEIEAEKER 117


>gnl|CDD|219658 pfam07950, DUF1691, Protein of unknown function (DUF1691).  This
          family of fungal proteins is uncharacterized. Each
          protein contains two copies of this region.
          Length = 109

 Score = 26.9 bits (60), Expect = 3.6
 Identities = 15/34 (44%), Positives = 22/34 (64%), Gaps = 1/34 (2%)

Query: 48 LDRRRRRRRRRDNWVCDGSSNLAITRSIFFLGSL 81
          L RR RR+RR+  WV +G S + +  S++ LG L
Sbjct: 64 LGRRSRRKRRKGWWVINGISAIGLL-SLWLLGGL 96


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 28.1 bits (63), Expect = 3.6
 Identities = 9/43 (20%), Positives = 24/43 (55%), Gaps = 2/43 (4%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           +S+ E  ++E  ++ +K++  E   +  R E L++    +++R
Sbjct: 456 ESELERFRREVRDKVRKDR--EIRARDRRIERLEKELEEKKKR 496



 Score = 28.1 bits (63), Expect = 3.9
 Identities = 16/56 (28%), Positives = 22/56 (39%), Gaps = 11/56 (19%)

Query: 14  VPKSKREGRQKEEEE---EGKKEKKEEE--------EEKKERNEELDRRRRRRRRR 58
             + +RE    E+     E   E+ EEE        EE K   E+L+    R RR 
Sbjct: 410 EEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKREIEKLESELERFRRE 465


>gnl|CDD|226889 COG4487, COG4487, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 438

 Score = 28.2 bits (63), Expect = 3.7
 Identities = 12/31 (38%), Positives = 18/31 (58%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNE 46
           + KRE  + EE  + + EKK EE  + ER +
Sbjct: 142 EKKRENNKNEERLKFENEKKLEESLELEREK 172


>gnl|CDD|149048 pfam07768, PVL_ORF50, PVL ORF-50-like family.  This is a family
          of sequences found in both bacteria and bacteriophages.
          This region is approximately 130 residues long and in
          some cases is found as part of the PVL
          (Panton-Valentine leukocidin) group of genes, which
          encode a member of the leukocidin group of bacterial
          toxins that kill leukocytes by creation of pores in the
          cell membrane. PVL appears to be a virulence factor
          associated with a number of human diseases.
          Length = 118

 Score = 27.1 bits (60), Expect = 3.8
 Identities = 15/40 (37%), Positives = 20/40 (50%), Gaps = 1/40 (2%)

Query: 19 REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            G + +E  E KK +  E+E K ER  E  RR+    RR
Sbjct: 46 PYGMRLKEYREIKKSENIEQERK-ERELERKRRKEAELRR 84


>gnl|CDD|227468 COG5139, COG5139, Uncharacterized conserved protein [Function
          unknown].
          Length = 397

 Score = 28.1 bits (62), Expect = 3.8
 Identities = 11/45 (24%), Positives = 19/45 (42%)

Query: 15 PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
          P+   +G   + ++    EK EE    +  N E   R+R+    D
Sbjct: 43 PQETSKGTSNDTKDPDNGEKNEEAAIDENSNVEAAERKRKHISTD 87


>gnl|CDD|232957 TIGR00398, metG, methionyl-tRNA synthetase.  The methionyl-tRNA
           synthetase (metG) is a class I amino acyl-tRNA ligase.
           This model appears to recognize the methionyl-tRNA
           synthetase of every species, including eukaryotic
           cytosolic and mitochondrial forms. The UPGMA difference
           tree calculated after search and alignment according to
           this model shows an unusual deep split between two
           families of MetG. One family contains forms from the
           Archaea, yeast cytosol, spirochetes, and E. coli, among
           others. The other family includes forms from yeast
           mitochondrion, Synechocystis sp., Bacillus subtilis, the
           Mycoplasmas, Aquifex aeolicus, and Helicobacter pylori.
           The E. coli enzyme is homodimeric, although monomeric
           forms can be prepared that are fully active. Activity of
           this enzyme in bacteria includes aminoacylation of
           fMet-tRNA with Met; subsequent formylation of the Met to
           fMet is catalyzed by a separate enzyme. Note that the
           protein from Aquifex aeolicus is split into an alpha
           (large) and beta (small) subunit; this model does not
           include the C-terminal region corresponding to the beta
           chain [Protein synthesis, tRNA aminoacylation].
          Length = 530

 Score = 28.1 bits (63), Expect = 3.9
 Identities = 20/74 (27%), Positives = 32/74 (43%), Gaps = 6/74 (8%)

Query: 12  LIVPKSKREGRQKEEEEEG------KKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDG 65
           LI P+ K  G + E  +           +KE EE  ++  E        + +  NW+  G
Sbjct: 164 LINPRCKICGAKPELRDSEHYFFRLSAFEKELEEWIRKNPESGSPASNVKNKAQNWLKGG 223

Query: 66  SSNLAITRSIFFLG 79
             +LAITR + + G
Sbjct: 224 LKDLAITRDLVYWG 237


>gnl|CDD|236153 PRK08116, PRK08116, hypothetical protein; Validated.
          Length = 268

 Score = 27.7 bits (62), Expect = 3.9
 Identities = 17/54 (31%), Positives = 26/54 (48%), Gaps = 7/54 (12%)

Query: 23 QKEEEEEGKKEKKEEEEEKKERNEELDR-----RRRRRRRRDNWVCDGSSNLAI 71
          ++E EE   KE++EE  EK+ R E L        + R    +N++ D  S  A 
Sbjct: 46 EREAEEA--KEREEENREKQRRIERLKSNSLLDEKFRNSTFENFLFDKGSEKAY 97


>gnl|CDD|149180 pfam07960, CBP4, CBP4.  The CBP4 in S. cerevisiae is essential for
           the expression and activity of ubiquinol-cytochrome c
           reductase. This family appears to be fungal specific.
          Length = 128

 Score = 27.0 bits (60), Expect = 3.9
 Identities = 13/32 (40%), Positives = 19/32 (59%), Gaps = 4/32 (12%)

Query: 16  KSKREGRQKEEE----EEGKKEKKEEEEEKKE 43
           K+K E  QKEE     EE ++ + + EE +KE
Sbjct: 97  KTKAEEAQKEELERIREELEEARAQSEEMRKE 128


>gnl|CDD|219956 pfam08658, Rad54_N, Rad54 N terminal.  This is the N terminal of
           the DNA repair protein Rad54.
          Length = 191

 Score = 27.7 bits (62), Expect = 4.0
 Identities = 10/29 (34%), Positives = 17/29 (58%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
           + + K EEE+ +K+++ EE E K  N   
Sbjct: 142 DDKPKIEEEKAEKDQEPEESETKLSNGPK 170


>gnl|CDD|204935 pfam12474, PKK, Polo kinase kinase.  This domain family is found in
           eukaryotes, and is approximately 140 amino acids in
           length. The family is found in association with
           pfam00069. Polo-like kinase 1 (Plx1) is essential during
           mitosis for the activation of Cdc25C, for spindle
           assembly, and for cyclin B degradation. This family is
           Polo kinase kinase (PKK) which phosphorylates Polo
           kinase and Polo-like kinase to activate them. PKK is a
           serine/threonine kinase.
          Length = 142

 Score = 27.3 bits (61), Expect = 4.2
 Identities = 10/26 (38%), Positives = 19/26 (73%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEE 47
           R +E+E++  K +KEE+E+K ++ E 
Sbjct: 88  RFQEQEKKRMKAEKEEQEQKHQKQER 113


>gnl|CDD|233042 TIGR00598, rad14, DNA repair protein.  All proteins in this family
           for which functions are known are used for the
           recognition of DNA damage as part of nucleotide excision
           repair. This family is based on the phylogenomic
           analysis of JA Eisen (1999, Ph.D. Thesis, Stanford
           University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 172

 Score = 27.4 bits (61), Expect = 4.3
 Identities = 16/41 (39%), Positives = 22/41 (53%), Gaps = 6/41 (14%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWV 62
            +KE  EE K+E KE++ EKK +  EL    RR  R   + 
Sbjct: 98  EEKERREESKEEMKEKKFEKKLK--EL----RRAVRSSEYT 132



 Score = 27.1 bits (60), Expect = 5.8
 Identities = 13/31 (41%), Positives = 20/31 (64%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           +EE+E ++E KEE +EKK   +  + RR  R
Sbjct: 97  DEEKERREESKEEMKEKKFEKKLKELRRAVR 127


>gnl|CDD|182163 PRK09952, PRK09952, shikimate transporter; Provisional.
          Length = 438

 Score = 27.8 bits (62), Expect = 4.3
 Identities = 14/38 (36%), Positives = 22/38 (57%), Gaps = 7/38 (18%)

Query: 66  SSNLAITRSIFF-LGSLLGGF------ILSWVADRYGR 96
           + NL + R +F  +G L+GG         +W+ADR+GR
Sbjct: 278 TQNLGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGR 315


>gnl|CDD|203738 pfam07716, bZIP_2, Basic region leucine zipper. 
          Length = 54

 Score = 25.7 bits (57), Expect = 4.5
 Identities = 8/27 (29%), Positives = 16/27 (59%)

Query: 34 KKEEEEEKKERNEELDRRRRRRRRRDN 60
          K +E  +++ RN E  RR R ++++  
Sbjct: 1  KDDEYRDRRRRNNEAARRSREKKKQRE 27


>gnl|CDD|220818 pfam10595, UPF0564, Uncharacterized protein family UPF0564.  This
           family of proteins has no known function. However, one
           of the members is annotated as an EF-hand family
           protein.
          Length = 349

 Score = 27.8 bits (62), Expect = 4.5
 Identities = 12/38 (31%), Positives = 23/38 (60%)

Query: 17  SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
                 +KE+E + ++EKKE E+ KK++ E   + ++R
Sbjct: 276 KFLRTERKEKEAKEQQEKKELEQRKKKKKEMAPKVKQR 313



 Score = 26.7 bits (59), Expect = 9.4
 Identities = 10/46 (21%), Positives = 29/46 (63%), Gaps = 4/46 (8%)

Query: 14 VPK----SKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
          VP+    +KRE +++E++   + + +EE  + +++ EE + +++ +
Sbjct: 5  VPQPFQMTKREKKKREKKSIRQSKLEEELNKLEKKEEEAECKKKFK 50


>gnl|CDD|218734 pfam05758, Ycf1, Ycf1.  The chloroplast genomes of most higher
           plants contain two giant open reading frames designated
           ycf1 and ycf2. Although the function of Ycf1 is unknown,
           it is known to be an essential gene.
          Length = 832

 Score = 28.0 bits (63), Expect = 4.5
 Identities = 10/35 (28%), Positives = 14/35 (40%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
              K  +E  + EE EE    + E   E K   +E
Sbjct: 221 FFTKKLKETSETEEREEETDVEIETTSETKGTKQE 255


>gnl|CDD|204032 pfam08703, PLC-beta_C, PLC-beta C terminal.  This domain
          corresponds to the alpha helical C terminal domain of
          phospholipase C beta.
          Length = 181

 Score = 27.5 bits (61), Expect = 4.6
 Identities = 13/25 (52%), Positives = 18/25 (72%)

Query: 31 KKEKKEEEEEKKERNEELDRRRRRR 55
          KK K+  E+EKKE  ++LDR+R  R
Sbjct: 55 KKLKEISEKEKKELKKKLDRKRLER 79


>gnl|CDD|202427 pfam02841, GBP_C, Guanylate-binding protein, C-terminal domain.
           Transcription of the anti-viral guanylate-binding
           protein (GBP) is induced by interferon-gamma during
           macrophage induction. This family contains GBP1 and
           GBP2, both GTPases capable of binding GTP, GDP and GMP.
          Length = 297

 Score = 27.6 bits (62), Expect = 4.7
 Identities = 6/32 (18%), Positives = 18/32 (56%)

Query: 23  QKEEEEEGKKEKKEEEEEKKERNEELDRRRRR 54
           ++ + E  + E++   E++KE  + ++ + R 
Sbjct: 209 ERAKAEAAEAEQELLREKQKEEEQMMEAQERS 240


>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
           This is a family of fungal proteins whose function is
           unknown.
          Length = 130

 Score = 26.9 bits (60), Expect = 4.8
 Identities = 8/28 (28%), Positives = 15/28 (53%), Gaps = 1/28 (3%)

Query: 18  KREGRQKEEEEEGK-KEKKEEEEEKKER 44
            R  R++ +E   K K  ++  E+K +R
Sbjct: 96  LRLRRERTKERAEKEKRTRKNREKKFKR 123


>gnl|CDD|216985 pfam02349, MSG, Major surface glycoprotein.  This is a novel
          repeat in Pneumocystis carinii Major surface
          glycoprotein (MSG); some members of the alignment have
          up to nine repeats of this family, the repeats
          containing several conserved cysteines. The MSG of P.
          carinii is an important protein in host-pathogen
          interactions. Surface glycoprotein A from Pneumocystis
          carinii is a main target for the host immune system,
          this protein is implicated in the attachment of
          Pneumocystis carinii to the host alveolar epithelial
          cells, alveolar macrophages, host surfactant and
          possibly accounts in part for the hypoxia seen in
          Pneumocystis carinii pneumonia (PCP).
          Length = 81

 Score = 26.2 bits (58), Expect = 4.8
 Identities = 14/45 (31%), Positives = 21/45 (46%), Gaps = 1/45 (2%)

Query: 8  NCYGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
           CY L       E   KE + E KK+KK + +  KE+  +L +  
Sbjct: 36 KCYKLKKDLKLEELLLKELKGELKKKKKCK-KALKEKCTKLKKES 79


>gnl|CDD|220464 pfam09903, DUF2130, Uncharacterized protein conserved in bacteria
          (DUF2130).  This domain, found in various hypothetical
          prokaryotic proteins, has no known function.
          Length = 267

 Score = 27.6 bits (62), Expect = 4.9
 Identities = 11/27 (40%), Positives = 20/27 (74%)

Query: 27 EEEGKKEKKEEEEEKKERNEELDRRRR 53
          EE+ K +KKE+EE+ K+  E+++  +R
Sbjct: 1  EEKLKLKKKEKEEQIKDLQEQIEELKR 27


>gnl|CDD|223562 COG0488, Uup, ATPase components of ABC transporters with duplicated
           ATPase domains [General function prediction only].
          Length = 530

 Score = 27.6 bits (62), Expect = 4.9
 Identities = 10/37 (27%), Positives = 17/37 (45%), Gaps = 2/37 (5%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
            QK E     +E    E+++KE  +E +  RR +   
Sbjct: 240 EQKAERLR--QEAAAYEKQQKELAKEQEWIRRGKAAA 274


>gnl|CDD|218532 pfam05276, SH3BP5, SH3 domain-binding protein 5 (SH3BP5).  This
           family consists of several eukaryotic SH3 domain-binding
           protein 5 or c-Jun N-terminal kinase (JNK)-interacting
           proteins (SH3BP5 or Sab). Sab binds to and serves as a
           substrate for JNK in vitro, and has been found to
           interact with the Src homology 3 (SH3) domain of
           Bruton's tyrosine kinase (Btk). Inspection of the
           sequence of Sab reveals the presence of two putative
           mitogen-activated protein kinase interaction motifs
           (KIMs) similar to that found in the JNK docking domain
           of the c-Jun transcription factor, and four potential
           serine-proline JNK phosphorylation sites in the
           C-terminal half of the molecule.
          Length = 240

 Score = 27.5 bits (61), Expect = 5.0
 Identities = 9/36 (25%), Positives = 13/36 (36%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
            E    +KE   EE E       L +  +  +R  N
Sbjct: 135 SERTRAEKEHASEEAELLVAELRLRQLEKILKRAIN 170


>gnl|CDD|227474 COG5145, RAD14, DNA excision repair protein [DNA replication,
           recombination, and repair].
          Length = 292

 Score = 27.7 bits (61), Expect = 5.0
 Identities = 12/38 (31%), Positives = 20/38 (52%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           R+K+  E+ K ++KE++ EKK +      R     R D
Sbjct: 216 REKQRREKMKDDRKEKKLEKKIKELRRKTRTSNYSRMD 253



 Score = 26.9 bits (59), Expect = 8.1
 Identities = 9/37 (24%), Positives = 20/37 (54%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           G ++ + K E   +   +EG+   +E++E K+ R + 
Sbjct: 71  GYLLEEKKVEDLMENAPQEGEFFAEEQDERKEVREDA 107


>gnl|CDD|219627 pfam07897, DUF1675, Protein of unknown function (DUF1675).  The
           members of this family are sequences derived from
           hypothetical plant proteins of unknown function. One
           member of this family is annotated as a putative
           RNA-binding protein, but no evidence was found to
           support this.
          Length = 283

 Score = 27.6 bits (61), Expect = 5.1
 Identities = 14/50 (28%), Positives = 22/50 (44%), Gaps = 6/50 (12%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITRS 74
           E EEE +K K+ +      R  E  R+R  +    N V +G    +I  +
Sbjct: 67  ETEEEWRKRKEMQSL----RRLEAKRKRSEKEY--NGVSNGDDMDSINAA 110


>gnl|CDD|224720 COG1807, ArnT, 4-amino-4-deoxy-L-arabinose transferase and related
           glycosyltransferases of PMT family [Cell envelope
           biogenesis, outer membrane].
          Length = 535

 Score = 27.8 bits (62), Expect = 5.1
 Identities = 24/142 (16%), Positives = 51/142 (35%), Gaps = 9/142 (6%)

Query: 65  GSSNLAITRSIFFLGSLLGGFILSWVADR-YGRITAVLGSHVVSFLGVALTPFSKDVVLF 123
              N    R    L   L   ++ W+A R +GR+ A+L + ++    +      +  +L 
Sbjct: 79  FGVNEWSARLPSALAGALTALLVYWLAKRLFGRLAALLAALILLLTPLFFL-IGRLALLD 137

Query: 124 SLSRFLTGVGHFNAFIFYYIIVLECVGPKWRT---FAMTFPFLIFYTVSEVALPWIAYYL 180
           +   F   +    A    Y+ +      KW      A+   FL     + +    +   L
Sbjct: 138 AALAFFLTL----ALALLYLALRARGKLKWLLLLGLALGLGFLTKGPGALLLPLILLLLL 193

Query: 181 ADWQWISVITIFPLIVGLIVAI 202
              +   ++    L +GL++ +
Sbjct: 194 LAPRLRRLLRDLRLWLGLLLGL 215


>gnl|CDD|218738 pfam05766, NinG, Bacteriophage Lambda NinG protein.  NinG or Rap
          is involved in recombination. Rap (recombination adept
          with plasmid) increases lambda-by-plasmid recombination
          catalyzed by Escherichia coli's RecBCD pathway.
          Length = 188

 Score = 27.3 bits (61), Expect = 5.1
 Identities = 12/31 (38%), Positives = 18/31 (58%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
          R+K +E++ K E + E  E K R E+L  R 
Sbjct: 36 REKAQEKKRKAEAQAERRELKARKEKLKTRS 66


>gnl|CDD|224287 COG1368, MdoB, Phosphoglycerol transferase and related proteins,
           alkaline phosphatase superfamily [Cell envelope
           biogenesis, outer membrane].
          Length = 650

 Score = 27.8 bits (62), Expect = 5.2
 Identities = 23/156 (14%), Positives = 52/156 (33%), Gaps = 26/156 (16%)

Query: 72  TRSIFFLGSLLGGF------ILSWVADRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSL 125
            + +    +LL  F          +      +T  L  + + F+   L  F    +LFSL
Sbjct: 1   MKWLNKTLALLSLFFPIFLGFGILLVFLLWLLT--LLIYFLGFVLPILLLFQGLRLLFSL 58

Query: 126 -----------SRFLTGVGHFNAFIFY------YIIVLECVGPKWRTFAMTFPFLIFYTV 168
                           GV   N F          +++L+ +  ++    +T P  +    
Sbjct: 59  PILFIVSLLLLLLLFKGVDALNIFRLILALLISILLILDILFYRFFIDFLTIPNALLIED 118

Query: 169 SEVALPWIAYYLADWQWISVITIFPLIVGLIVAIFT 204
             +     +  L+      ++ +  LI+ +++ +F 
Sbjct: 119 FNLGKLGFSA-LSLLYPEDILFVVDLILLILLLVFY 153


>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
          Length = 1036

 Score = 27.9 bits (62), Expect = 5.3
 Identities = 10/35 (28%), Positives = 21/35 (60%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRR 56
           RQ EE+   ++EK   E ++ +   E+++RR + +
Sbjct: 270 RQAEEQRRREEEKAAMEADRAQAKAEVEKRREKLQ 304


>gnl|CDD|204414 pfam10211, Ax_dynein_light, Axonemal dynein light chain.  Axonemal
           dynein light chain proteins play a dynamic role in
           flagellar and cilia motility. Eukaryotic cilia and
           flagella are complex organelles consisting of a core
           structure, the axoneme, which is composed of nine
           microtubule doublets forming a cylinder that surrounds a
           pair of central singlet microtubules. This
           ultra-structural arrangement seems to be one of the most
           stable micro-tubular assemblies known and is responsible
           for the flagellar and ciliary movement of a large number
           of organisms ranging from protozoan to mammals. This
           light chain interacts directly with the N-terminal half
           of the heavy chains.
          Length = 189

 Score = 27.2 bits (61), Expect = 5.4
 Identities = 11/31 (35%), Positives = 21/31 (67%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRR 52
           R+  + E+GK E ++E ++ +E  EEL++R 
Sbjct: 113 RKALQAEQGKSELEQEIKKLEEEKEELEKRV 143



 Score = 27.2 bits (61), Expect = 5.8
 Identities = 10/36 (27%), Positives = 19/36 (52%), Gaps = 1/36 (2%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR 53
             E   K E  E K+E++E + E+K   +E+   ++
Sbjct: 143 VAELEAKLEAIE-KREEEERQIEEKRHADEIAFLKK 177


>gnl|CDD|233048 TIGR00605, rad4, DNA repair protein rad4.  All proteins in this
          family for which functions are known are involved in
          targeting nucleotide excision repair to specific
          regions of the genome. This family is based on the
          phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
          Stanford University) [DNA metabolism, DNA replication,
          recombination, and repair].
          Length = 713

 Score = 27.5 bits (61), Expect = 5.4
 Identities = 12/39 (30%), Positives = 21/39 (53%)

Query: 22 RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
          R+ E E+E +K+ K    + +  NE   RRR++R +   
Sbjct: 14 RKVENEKEAEKQPKSRRRKVRRENEPSLRRRKKRFKTGL 52


>gnl|CDD|227448 COG5118, BDP1, Transcription initiation factor TFIIIB, Bdp1 subunit
           [Transcription].
          Length = 507

 Score = 27.8 bits (61), Expect = 5.6
 Identities = 10/43 (23%), Positives = 15/43 (34%), Gaps = 2/43 (4%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEE-KKERNEELDRRRRRRR 56
              KR    K  E     E  +  +      N   DRR R+++
Sbjct: 256 KLEKRR-HVKFLEGSNTHEMDQLLKHFLDNSNFRQDRRSRKKK 297


>gnl|CDD|217943 pfam04180, LTV, Low temperature viability protein.  The
           low-temperature viability protein LTV1 is involved in
           ribosome biogenesis 40S subunit production.
          Length = 426

 Score = 27.7 bits (61), Expect = 5.6
 Identities = 11/27 (40%), Positives = 18/27 (66%)

Query: 31  KKEKKEEEEEKKERNEELDRRRRRRRR 57
           K +K+E++ ++K   EELD+ RR   R
Sbjct: 397 KAKKREDKRKRKSALEELDKERRVLGR 423


>gnl|CDD|204918 pfam12436, USP7, Ubiquitin-specific protease 7.  This domain
          family is found in eukaryotes, and is approximately 40
          amino acids in length. The family is found in
          association with pfam00443, pfam00917. USP7 regulates
          the turnover of p53.
          Length = 35

 Score = 24.8 bits (55), Expect = 5.7
 Identities = 7/17 (41%), Positives = 12/17 (70%)

Query: 31 KKEKKEEEEEKKERNEE 47
          ++E++E E  +KER E 
Sbjct: 14 EEEREERERRRKEREEA 30


>gnl|CDD|214016 cd12923, iSH2_PI3K_IA_R, Inter-Src homology 2 (iSH2) helical domain
           of Class IA Phosphoinositide 3-kinase Regulatory
           subunits.  PI3Ks catalyze the transfer of the
           gamma-phosphoryl group from ATP to the 3-hydroxyl of the
           inositol ring of D-myo-phosphatidylinositol (PtdIns) or
           its derivatives. They play an important role in a
           variety of fundamental cellular processes, including
           cell motility, the Ras pathway, vesicle trafficking and
           secretion, immune cell activation, and apoptosis. They
           are classified according to their substrate specificity,
           regulation, and domain structure. Class IA PI3Ks are
           heterodimers of a p110 catalytic (C) subunit and a
           p85-related regulatory (R) subunit. The R subunit
           down-regulates PI3K basal activity, stabilizes the C
           subunit, and plays a role in the activation downstream
           of tyrosine kinases. All R subunits contain two SH2
           domains that flank an intervening helical domain (iSH2),
           which binds to the N-terminal adaptor-binding domain
           (ABD) of the catalytic subunit. In vertebrates, there
           are three genes (PIK3R1, PIK3R2, and PIK3R3) that encode
           for different Class IA PI3K R subunits.
          Length = 152

 Score = 26.8 bits (60), Expect = 5.8
 Identities = 11/43 (25%), Positives = 23/43 (53%), Gaps = 8/43 (18%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRRR--------RRRRRRD 59
           +E EE K++ +E+  ++   N EL+R          + R+++D
Sbjct: 89  KELEESKEQLEEDLRKQVAYNRELEREMNSLKPELMQLRKQKD 131


>gnl|CDD|225368 COG2811, NtpF, Archaeal/vacuolar-type H+-ATPase subunit H [Energy
          production and conversion].
          Length = 108

 Score = 26.2 bits (58), Expect = 5.8
 Identities = 12/43 (27%), Positives = 20/43 (46%)

Query: 16 KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          K + E   KE  EE ++  +E EEE ++  +E+    R     
Sbjct: 27 KEEAEQIIKEAREEAREIIEEAEEEAEKLAQEILEEAREEAEE 69


>gnl|CDD|213306 cd05940, FATP_FACS, Fatty acid transport proteins (FATP) play dual
           roles as fatty acid transporters and its activation
           enzymes.  Fatty acid transport protein (FATP) transports
           long-chain or very-long-chain fatty acids across the
           plasma membrane. FATPs also have fatty acid CoA
           synthetase activity, thus playing dual roles as fatty
           acid transporters and its activation enzymes. At least
           five copies of FATPs are identified in mammalian cells.
           This family also includes prokaryotic FATPs. FATPs are
           the key players in the trafficking of exogenous fatty
           acids into the cell and in intracellular fatty acid
           homeostasis.
          Length = 444

 Score = 27.6 bits (62), Expect = 5.8
 Identities = 8/21 (38%), Positives = 11/21 (52%)

Query: 123 FSLSRFLTGVGHFNAFIFYYI 143
           FS S+F   V  + A  F Y+
Sbjct: 157 FSASQFWPDVRRYGATAFQYV 177


>gnl|CDD|220714 pfam10357, Kin17_mid, Domain of Kin17 curved DNA-binding protein.
           Kin17_mid is the conserved central 169 residue region of
           a family of Kin17 proteins. Towards the N-terminal end
           there is a zinc-finger domain, and in human and mouse
           members there is a RecA-like domain further downstream.
           The Kin17 protein in humans forms intra-nuclear foci
           during cell proliferation and is re-distributed in the
           nucleoplasm during the cell cycle.
          Length = 127

 Score = 26.4 bits (59), Expect = 5.9
 Identities = 9/25 (36%), Positives = 15/25 (60%)

Query: 32  KEKKEEEEEKKERNEELDRRRRRRR 56
             K++EE  KKE+ E+ D  R ++ 
Sbjct: 97  ALKRQEELRKKEKQEKTDEEREQKL 121


>gnl|CDD|153373 cd07361, MEMO_like, Memo (mediator of ErbB2-driven cell motility)
           is co-precipitated with the C terminus of ErbB2, a
           protein involved in cell motility.  This subfamily is
           composed of Memo (mediator of ErbB2-driven cell
           motility) and similar proteins. Memo is a protein that
           is co-precipitated with the C terminus of ErbB2, a
           protein involved in cell motility. It is required for
           the ErbB2-driven cell mobility and is found in protein
           complexes with cofilin, ErbB2 and PLCgamma1. However,
           Memo is not homologous to any known signaling proteins,
           and its function in ErbB2 signaling is not known.
           Structural studies show that Memo binds directly to a
           specific ErbB2-derived phosphopeptide. Memo is
           homologous to class III nonheme iron-dependent extradiol
           dioxygenases, however, no metal binding or enzymatic
           activity can be detected for Memo. This subfamily also
           contains a few members containing a C-terminal
           AMMECR1-like domain. The AMMECR1 protein was proposed to
           be a regulatory factor that is potentially involved in
           the development of AMME contiguous gene deletion
           syndrome.
          Length = 266

 Score = 27.2 bits (61), Expect = 5.9
 Identities = 11/28 (39%), Positives = 15/28 (53%), Gaps = 6/28 (21%)

Query: 170 EVALPWIAYYLADWQWISVITIFPLIVG 197
           EV LP++ Y L D        I P++VG
Sbjct: 128 EVQLPFLQYLLPD------FKIVPILVG 149


>gnl|CDD|147845 pfam05914, RIB43A, RIB43A.  This family consists of several
           RIB43A-like eukaryotic proteins. Ciliary and flagellar
           microtubules contain a specialised set of
           protofilaments, termed ribbons, that are composed of
           tubulin and several associated proteins. RIB43A was
           first characterized in the unicellular biflagellate,
           Chlamydomonas reinhardtii although highly related
           sequences are present in several higher eukaryotes
           including humans. The function of this protein is
           unknown although the structure of RIB43A and its
           association with the specialised protofilament ribbons
           and with basal bodies is relevant to the proposed role
           of ribbons in forming and stabilising doublet and
           triplet microtubules and in organising their
           three-dimensional structure. Human RIB43A homologues
           could represent a structural requirement in centriole
           replication in dividing cells.
          Length = 379

 Score = 27.3 bits (61), Expect = 5.9
 Identities = 13/61 (21%), Positives = 29/61 (47%), Gaps = 12/61 (19%)

Query: 9   CYGLIVPKSKREGRQKEEE--EEGKKEKKEEEEEKKE----RNE------ELDRRRRRRR 56
            +  + P+     R+ +E+  +E ++ ++EE+  +KE              L+R+ RR R
Sbjct: 273 RWKGMSPEQLAAIRKGQEQQLQEKERRREEEQLREKEWDRQAINQARAAVLLERQERRLR 332

Query: 57  R 57
           +
Sbjct: 333 K 333


>gnl|CDD|235322 PRK04950, PRK04950, ProP expression regulator; Provisional.
          Length = 213

 Score = 27.2 bits (61), Expect = 6.0
 Identities = 8/44 (18%), Positives = 25/44 (56%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           +++R  +Q ++ E   +++K    E+K + +   ++R+ R ++ 
Sbjct: 111 QAQRAEQQAKKREAAGEKEKAPRRERKPKPKAPRKKRKPRAQKP 154


>gnl|CDD|214459 MTH00211, ND5, NADH dehydrogenase subunit 5; Provisional.
          Length = 597

 Score = 27.2 bits (61), Expect = 6.3
 Identities = 25/146 (17%), Positives = 48/146 (32%), Gaps = 20/146 (13%)

Query: 77  FLGSLLGGFILSWVA----------DRYGRITAVLGSHVVSFLGVALTPFSKDVVLFSLS 126
            LGS+L GFI++                 ++  +L + V   LG+ L   +   +  S  
Sbjct: 452 ALGSILAGFIITSNLPPEKPEILSLPFLLKLMPLLVTIVGLLLGLELVNLTHKQLKTSTL 511

Query: 127 RFLTGVGHFNAFIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVALPWIAYYLADWQWI 186
            F   +G++   I   +  L     +  T +     L +Y         +          
Sbjct: 512 SFSNMLGYYPPIIHRLLPKLSLKWGQ--TLSTHLMDLGWYEKLGPKGLGVNQKKLIKLVT 569

Query: 187 SV--------ITIFPLIVGLIVAIFT 204
           S         + +  L + LI+ +F 
Sbjct: 570 SPQQGLIKSYLALLLLSLLLILLLFL 595


>gnl|CDD|217536 pfam03403, PAF-AH_p_II, Platelet-activating factor acetylhydrolase,
           isoform II.  Platelet-activating factor acetylhydrolase
           (PAF-AH) is a subfamily of phospholipases A2,
           responsible for inactivation of platelet-activating
           factor through cleavage of an acetyl group. Three known
           PAF-AHs are the brain heterotrimeric PAF-AH Ib, whose
           catalytic beta and gamma subunits are aligned in
           pfam02266, the extracellular, plasma PAF-AH (pPAF-AH),
           and the intracellular PAF-AH isoform II (PAF-AH II).
           This family aligns pPAF-AH and PAF-AH II, whose
           similarity was previously noted.
          Length = 372

 Score = 27.4 bits (61), Expect = 6.3
 Identities = 9/35 (25%), Positives = 14/35 (40%), Gaps = 5/35 (14%)

Query: 22  RQKEEEEEGKKEK-----KEEEEEKKERNEELDRR 51
           + K   E  +           EEE   RNE++ +R
Sbjct: 147 KDKNAAEVEEPSWIYLRDLNAEEEFHIRNEQVGQR 181


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3.
          Length = 431

 Score = 27.4 bits (61), Expect = 6.3
 Identities = 10/27 (37%), Positives = 15/27 (55%)

Query: 25  EEEEEGKKEKKEEEEEKKERNEELDRR 51
           +EEEE + ++ EEEE +    E    R
Sbjct: 375 DEEEEQRSDEHEEEEGEDSEEEGSQSR 401



 Score = 27.4 bits (61), Expect = 6.6
 Identities = 7/32 (21%), Positives = 18/32 (56%)

Query: 20  EGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
                E+E+E ++++ +E EE++  + E +  
Sbjct: 367 FEEVDEDEDEEEEQRSDEHEEEEGEDSEEEGS 398


>gnl|CDD|236278 PRK08506, PRK08506, replicative DNA helicase; Provisional.
          Length = 472

 Score = 27.3 bits (61), Expect = 6.4
 Identities = 10/30 (33%), Positives = 19/30 (63%), Gaps = 1/30 (3%)

Query: 19  REGRQKEEEEEGKKE-KKEEEEEKKERNEE 47
           +E  +KE+E++ KKE K+E     + ++ E
Sbjct: 391 KEREEKEKEKKAKKEGKEERRIHFQNKSIE 420


>gnl|CDD|234218 TIGR03459, crt_membr, carotene biosynthesis associated membrane
           protein.  This model represents a family of hydrophobic
           and presumed membrane proteins found in the
           Actinobacteria. The genes encoding these proteins are
           syntenically associated with (found proximal to) genes
           of carotene biosynthesis usually including phytoene
           synthase (crtB), phytoene dehydrogenase (crtI) and
           geranylgeranyl pyrophosphate synthase (ispA).
          Length = 470

 Score = 27.4 bits (61), Expect = 6.6
 Identities = 35/163 (21%), Positives = 55/163 (33%), Gaps = 30/163 (18%)

Query: 55  RRRRDNWVCDGSSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVLGS-HVVSFLGVAL 113
            RR   +V      + +  ++F L +   G    W       +TA+ G+  V++ L  AL
Sbjct: 286 WRRVGTFVAAALVGVLVFVAVFALITAAAGVGWGW-------LTALSGNSKVINPL--AL 336

Query: 114 TPFSKDVVLFSLSRFLTGVGHFNAF------IFYYIIVLECVGPKWR--------TFAMT 159
                 V+    S F   +  FNA       I   I++L  V   W              
Sbjct: 337 PSLVASVIEPVGSLFNDDL-DFNAVVDVIRPISMVIMLLGLVATWWLFRHDERAAVTGTA 395

Query: 160 FPFLIFYTVSEVALPWIAYYLADWQWISVITIFPLIVGLIVAI 202
             + I    + V LPW  YY      +++   F     LI   
Sbjct: 396 AAYAIAVVFNPVTLPW--YYTWP---LALAATFAQSRWLIYLT 433


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
          family consists of several hypothetical bacterial
          proteins of around 200 residues in length. The function
          of this family is unknown.
          Length = 214

 Score = 27.0 bits (60), Expect = 6.7
 Identities = 11/40 (27%), Positives = 19/40 (47%), Gaps = 2/40 (5%)

Query: 10 YGLIVP--KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
          Y L  P   S +    ++E ++   ++  E EE KE  +E
Sbjct: 32 YQLFFPSSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKE 71



 Score = 27.0 bits (60), Expect = 7.4
 Identities = 11/32 (34%), Positives = 22/32 (68%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
             + +G  ++E+EE ++E +EE+EE  + NE+
Sbjct: 77  DKEDKGDAEKEDEESEEENEEEDEESSDENEK 108


>gnl|CDD|220388 pfam09766, FimP, Fms-interacting protein.  This entry carries part
           of the crucial 144 N-terminal residues of the FmiP
           protein, which is essential for the binding of the
           protein to the cytoplasmic domain of activated
           Fms-molecules in M-CSF induced haematopoietic
           differentiation of macrophages. The C-terminus contains
           a putative nuclear localisation sequence and a leucine
           zipper which suggest further, as yet unknown, nuclear
           functions. The level of FMIP expression might form a
           threshold that determines whether cells differentiate
           into macrophages or into granulocytes.
          Length = 352

 Score = 27.3 bits (61), Expect = 6.7
 Identities = 7/33 (21%), Positives = 20/33 (60%)

Query: 27  EEEGKKEKKEEEEEKKERNEELDRRRRRRRRRD 59
           E++ +  ++ ++++     EE  +R+++RR  D
Sbjct: 206 EQQKESSEQSQDDDSDSDEEEEQKRQKKRRSTD 238


>gnl|CDD|218693 pfam05687, DUF822, Plant protein of unknown function (DUF822).
          This family consists of the N terminal regions of
          several plant proteins of unknown function.
          Length = 151

 Score = 26.7 bits (59), Expect = 6.8
 Identities = 13/43 (30%), Positives = 18/43 (41%), Gaps = 17/43 (39%)

Query: 34 KKEEEEEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITRSIF 76
          +K   +E+ E N    +RR RRRR            AI   I+
Sbjct: 5  RKPTWKER-ENN----KRRERRRR------------AIAAKIY 30


>gnl|CDD|153299 cd07615, BAR_Endophilin_A3, The Bin/Amphiphysin/Rvs (BAR) domain of
           Endophilin-A3.  BAR domains are dimerization, lipid
           binding and curvature sensing modules found in many
           different proteins with diverse functions. Endophilins
           are accessory proteins localized at synapses that
           interact with the endocytic proteins, dynamin and
           synaptojanin. They are essential for synaptic vesicle
           formation from the plasma membrane. They interact with
           voltage-gated calcium channels, thus linking vesicle
           endocytosis to calcium regulation. They also play roles
           in virus budding, mitochondrial morphology maintenance,
           receptor-mediated endocytosis inhibition, and endosomal
           sorting. Endophilins contain an N-terminal N-BAR domain
           (BAR domain with an additional N-terminal amphipathic
           helix), followed by a variable region containing proline
           clusters, and a C-terminal SH3 domain. They are
           classified into two types, A and B. Endophilin-A
           proteins are enriched in the brain and play multiple
           roles in receptor-mediated endocytosis. Endophilin-A3
           (or endophilin-3) is also referred to as SH3P13 (SH3
           domain containing protein 13) or SH3GL3 (SH3 domain
           containing Grb2-like protein 3). It regulates
           Arp2/3-dependent actin filament assembly during
           endocytosis. It binds N-WASP through its SH3 domain and
           enhances the ability of N-WASP to activate the Arp2/3
           complex. Endophilin-A3 co-localizes with the vesicular
           glutamate transporter 1 (VGLUT1), and may play an
           important role in the synaptic release of glutamate.
          Length = 223

 Score = 26.9 bits (59), Expect = 6.8
 Identities = 13/30 (43%), Positives = 18/30 (60%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERN 45
           K KR+G+  +EE     EK EE +E  ER+
Sbjct: 147 KKKRQGKIPDEEIRQAVEKFEESKELAERS 176


>gnl|CDD|216381 pfam01237, Oxysterol_BP, Oxysterol-binding protein. 
          Length = 335

 Score = 27.2 bits (61), Expect = 6.9
 Identities = 14/33 (42%), Positives = 17/33 (51%), Gaps = 3/33 (9%)

Query: 28  EEGKKEKKEEEEEKKERNEELDRRRRRRRRRDN 60
           EEG  ++ EEE   K R EE  R RR+ R    
Sbjct: 276 EEGDYDEAEEE---KLRLEEKQRERRKEREEKG 305


>gnl|CDD|240579 cd12931, eNOPS_SF, NOPS domain, including C-terminal helical
          extension region, in the p54nrb/PSF/PSP1 family.  All
          members in this family contain a DBHS domain (for
          Drosophila behavior, human splicing), which comprises
          two conserved RNA recognition motifs (RRM1 and RRM2),
          also termed RBDs (RNA binding domains) or RNPs
          (ribonucleoprotein domains), and a charged
          protein-protein interaction NOPS (NONA and PSP1) domain
          with a long helical C-terminal extension. The NOPS
          domain specifically binds to RRM2 domain of the partner
          DBHS protein via a substantial interaction surface. Its
          highly conserved C-terminal residues are critical for
          functional DBHS dimerization while the highly conserved
          C-terminal helical extension, forming a right-handed
          antiparallel heterodimeric coiled-coil, is essential
          for localization of these proteins to subnuclear
          bodies. PSF has an additional large N-terminal domain
          that differentiates it from other family members. The
          p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and
          DNA-binding protein (p54nrb), polypyrimidine
          tract-binding protein (PTB)-associated-splicing factor
          (PSF) and paraspeckle protein 1 (PSP1), which are
          ubiquitously expressed and are well conserved in
          vertebrates. p54nrb, also termed NONO or NMT55, is a
          multi-functional protein involved in numerous nuclear
          processes including transcriptional regulation,
          splicing, DNA unwinding, nuclear retention of
          hyperedited double-stranded RNA, viral RNA processing,
          control of cell proliferation, and circadian rhythm
          maintenance. PSF, also termed POMp100, is also a
          multi-functional protein that binds RNA,
          single-stranded DNA (ssDNA), double-stranded DNA
          (dsDNA) and many factors, and mediates diverse
          activities in the cell. PSP1, also termed PSPC1, is a
          novel nucleolar factor that accumulates within a new
          nucleoplasmic compartment, termed paraspeckles, and
          diffusely distributes in the nucleoplasm. The cellular
          function of PSP1 remains unknown currently. The family
          also includes some p54nrb/PSF/PSP1 homologs from
          invertebrate species. For instance, the Drosophila
          melanogaster gene no-on-transient A (nonA) encoding
          puff-specific protein Bj6 (also termed NONA) and
          Chironomus tentans hrp65 gene encoding protein Hrp65.
          D. melanogaster NONA is involved in eye development and
          behavior and may play a role in circadian rhythm
          maintenance, similar to vertebrate p54nrb. C. tentans
          Hrp65 is a component of nuclear fibers associated with
          ribonucleoprotein particles in transit from the gene to
          the nuclear pore.
          Length = 90

 Score = 25.7 bits (57), Expect = 7.0
 Identities = 10/33 (30%), Positives = 19/33 (57%), Gaps = 1/33 (3%)

Query: 26 EEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          E E G++ K   E EK ++ E+L++  +  R +
Sbjct: 46 EYEFGQRWKALYELEK-QQREQLEKELKEAREK 77


>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein.  This is a family of
           fungal and plant proteins and contains many hypothetical
           proteins. VID27 is a cytoplasmic protein that plays a
           potential role in vacuolar protein degradation.
          Length = 794

 Score = 27.4 bits (61), Expect = 7.0
 Identities = 22/85 (25%), Positives = 41/85 (48%), Gaps = 13/85 (15%)

Query: 10  YGLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKER-----NEELDRRRRRRRRRDNWVCD 64
           +  +  +     R  EEEE+ ++E++EE+E++        +EE +      +  D+   D
Sbjct: 376 FSALEIEDANTERDDEEEED-EEEEEEEDEDEGPSKEHSDDEEFEEDDVESKYEDS---D 431

Query: 65  GSSNLAI----TRSIFFLGSLLGGF 85
           G+S+LA+     RS    G  +G F
Sbjct: 432 GNSSLAVGYKNDRSYVVRGDKIGVF 456


>gnl|CDD|218896 pfam06098, Radial_spoke_3, Radial spoke protein 3.  This family
           consists of several radial spoke protein 3 (RSP3)
           sequences. Eukaryotic cilia and flagella present in
           diverse types of cells perform motile, sensory, and
           developmental functions in organisms from protists to
           humans. They are centred by precisely organised,
           microtubule-based structures, the axonemes. The axoneme
           consists of two central singlet microtubules, called the
           central pair, and nine outer doublet microtubules. These
           structures are well-conserved during evolution. The
           outer doublet microtubules, each composed of A and B
           sub-fibres, are connected to each other by nexin links,
           while the central pair is held at the centre of the
           axoneme by radial spokes. The radial spokes are T-shaped
           structures extending from the A-tubule of each outer
           doublet microtubule to the centre of the axoneme. Radial
           spoke protein 3 (RSP3), is present at the proximal end
           of the spoke stalk and helps in anchoring the radial
           spoke to the outer doublet. It is thought that radial
           spokes regulate the activity of inner arm dynein through
           protein phosphorylation and dephosphorylation.
          Length = 288

 Score = 26.9 bits (60), Expect = 7.1
 Identities = 11/48 (22%), Positives = 24/48 (50%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           G  + ++  E  ++EE  E ++++++ E+ +     E  R     RRR
Sbjct: 138 GKTLEQALLEVLEEEELAELRQQQRQFEQRRNAELAETQRLEEAERRR 185



 Score = 26.9 bits (60), Expect = 8.1
 Identities = 13/34 (38%), Positives = 23/34 (67%)

Query: 18  KREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
           + E R++EE+E  KK+ KE ++ +KE  E++  R
Sbjct: 180 EAERRRREEKERRKKQDKERKQREKETAEKIAAR 213


>gnl|CDD|227466 COG5137, COG5137, Histone chaperone involved in gene silencing
           [Transcription / Chromatin structure and dynamics].
          Length = 279

 Score = 26.9 bits (59), Expect = 7.1
 Identities = 9/33 (27%), Positives = 17/33 (51%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNE 46
            P    E  ++ EE +G++E+++EE       E
Sbjct: 171 QPDVDNEEEERLEESDGREEEEDEEVGSDSYGE 203


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 27.3 bits (61), Expect = 7.2
 Identities = 13/36 (36%), Positives = 18/36 (50%), Gaps = 8/36 (22%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRR 57
           R K EEE       +EE E+ +   +L+  R RR R
Sbjct: 256 RTKTEEE-----LAKEEAERLK---KLEAERLRRMR 283


>gnl|CDD|177283 PHA00451, PHA00451, protein kinase.
          Length = 362

 Score = 27.1 bits (60), Expect = 7.2
 Identities = 10/43 (23%), Positives = 14/43 (32%)

Query: 19  REGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNW 61
           R+   K  +   K   +  +E    R E    RR   R R   
Sbjct: 266 RKAAMKRRKRNRKLRARNAKELAAMRMEANQIRRNEPRARMLM 308


>gnl|CDD|224141 COG1220, HslU, ATP-dependent protease HslVU (ClpYQ), ATPase subunit
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 444

 Score = 27.2 bits (61), Expect = 7.2
 Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 2/40 (5%)

Query: 12  LIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRR 51
            +VP +K    Q E ++E       E+  KK R  ELD +
Sbjct: 134 ALVPPAKNFWGQSENKQESSA--TREKFRKKLREGELDDK 171


>gnl|CDD|225514 COG2966, COG2966, Uncharacterized conserved protein [Function
           unknown].
          Length = 250

 Score = 26.9 bits (60), Expect = 7.2
 Identities = 21/89 (23%), Positives = 32/89 (35%), Gaps = 15/89 (16%)

Query: 39  EEKKERNEELDRRRRRRRRRDNWVCDGSSNLAITR----------SIFFLGSLLGGFILS 88
           EE  ++ +E+ ++  R  R    +  G +  A               FF G L  GF+L 
Sbjct: 107 EEAHKKLDEIQKQPLRYSRWLVLLMAGLAAAAFALLFGGGWLDFLIAFFAGLL--GFLLR 164

Query: 89  WVADRYGRIT---AVLGSHVVSFLGVALT 114
               R G       VL S + S + V   
Sbjct: 165 QYLSRKGNPDFFFEVLASFIASIVAVLFG 193


>gnl|CDD|181784 PRK09334, PRK09334, 30S ribosomal protein S25e; Provisional.
          Length = 86

 Score = 25.8 bits (57), Expect = 7.4
 Identities = 11/41 (26%), Positives = 22/41 (53%), Gaps = 4/41 (9%)

Query: 26 EEEEGKKEKKEEEEEKKERN----EELDRRRRRRRRRDNWV 62
          ++++ KKE  + E+E K       EEL +R  +  +++  V
Sbjct: 2  QKKKKKKEGSKTEKEIKSTIITLDEELLKRVAKEVKKEKIV 42


>gnl|CDD|129965 TIGR00887, 2A0109, phosphate:H+ symporter.  This model represents
           the phosphate uptake symporter subfamily of the major
           facilitator superfamily (pfam00083) [Transport and
           binding proteins, Anions].
          Length = 502

 Score = 27.0 bits (60), Expect = 7.4
 Identities = 25/96 (26%), Positives = 40/96 (41%), Gaps = 16/96 (16%)

Query: 77  FLGSLLGGFILSWVADRYGR-----ITAVLGSHVVSFLGVALTPFSKDVVLFSLS---RF 128
            +G+L G     W+AD+ GR     +  ++   +++ +   L+P S    + +     RF
Sbjct: 66  SIGTLAGQLFFGWLADKLGRKRVYGMELII--MIIATVASGLSPGSSPKSVMATLCFWRF 123

Query: 129 LTGVGHFNAFIFYYIIVLECVGPKWR------TFAM 158
             GVG    +    II  E    KWR       FAM
Sbjct: 124 WLGVGIGGDYPLSAIITSEFATKKWRGAMMAAVFAM 159


>gnl|CDD|236485 PRK09368, PRK09368, gas vesicle synthesis-like protein; Reviewed.
          Length = 140

 Score = 26.6 bits (59), Expect = 7.5
 Identities = 15/45 (33%), Positives = 22/45 (48%), Gaps = 2/45 (4%)

Query: 16  KSKR--EGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
           KSK    G  +   E   + +  +EEE++ R     R+  RRRRR
Sbjct: 95  KSKGALTGAAETASEALGQGRGSDEEEERRRERPRPRKAPRRRRR 139


>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
           selection and in elongation by RNA polymerase II
           [Transcription].
          Length = 521

 Score = 27.3 bits (60), Expect = 7.5
 Identities = 7/35 (20%), Positives = 13/35 (37%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDR 50
             + E  ++  +EE      EE  E   R ++   
Sbjct: 175 YKELEESEQGLQEEYTPSYAEEAVEDISRTDDFAE 209


>gnl|CDD|147685 pfam05663, DUF809, Protein of unknown function (DUF809).  This
           family consists of several proteins of unknown function
           from Raphanus sativus (Radish) and Brassica napus (Rape).
          Length = 138

 Score = 26.3 bits (57), Expect = 7.6
 Identities = 14/31 (45%), Positives = 21/31 (67%)

Query: 16  KSKREGRQKEEEEEGKKEKKEEEEEKKERNE 46
           + K+EG+ + E +E KKE K E E K+E+ E
Sbjct: 101 EEKKEGKGEIEGKEEKKEGKGEIEGKEEKKE 131



 Score = 26.3 bits (57), Expect = 8.5
 Identities = 17/43 (39%), Positives = 26/43 (60%), Gaps = 1/43 (2%)

Query: 13  IVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRR 55
           + P  K E   KEE++EGK E  E +EEKKE   E++ +  ++
Sbjct: 89  VSPIIKGEIEGKEEKKEGKGE-IEGKEEKKEGKGEIEGKEEKK 130


>gnl|CDD|184861 PRK14859, tatA, twin arginine translocase protein A; Provisional.
          Length = 63

 Score = 25.1 bits (55), Expect = 7.6
 Identities = 7/24 (29%), Positives = 16/24 (66%)

Query: 18 KREGRQKEEEEEGKKEKKEEEEEK 41
          K+   +KEE E    +K++++++K
Sbjct: 40 KKATSEKEEIEIKPTKKEDKKKKK 63


>gnl|CDD|234175 TIGR03348, VI_IcmF, type VI secretion protein IcmF.  Members of
           this protein family are IcmF homologs and tend to be
           associated with type VI secretion systems [Cellular
           processes, Pathogenesis].
          Length = 1169

 Score = 27.3 bits (61), Expect = 8.0
 Identities = 17/54 (31%), Positives = 25/54 (46%), Gaps = 8/54 (14%)

Query: 48  LDRRRRRRRRRDNWVCDGSSNLAITRSIFFLGSLLGGFILSWVADRYGRITAVL 101
           L+RR  RRRR        ++ LA          LLG + LS++A+R   +  V 
Sbjct: 419 LNRRAERRRRWLRRGAYAAAALA-------ALGLLGLWSLSYLANR-DYLDEVR 464


>gnl|CDD|218302 pfam04873, EIN3, Ethylene insensitive 3.  Ethylene insensitive 3
           (EIN3) proteins are a family of plant DNA-binding
           proteins that regulate transcription in response to the
           gaseous plant hormone ethylene, and are essential for
           ethylene-mediated responses including the triple
           response, cell growth inhibition, and accelerated
           senescence.
          Length = 332

 Score = 26.9 bits (60), Expect = 8.1
 Identities = 12/49 (24%), Positives = 19/49 (38%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRRDNWVC 63
           PK   E  QKEE +  K+ K +  +            +R ++  D   C
Sbjct: 250 PKVTLECGQKEENQGKKESKIKHVQAVHTTAGFPVVCQRDKKPGDYAKC 298


>gnl|CDD|237810 PRK14764, PRK14764, lipoprotein signal peptidase; Provisional.
          Length = 209

 Score = 26.5 bits (59), Expect = 8.1
 Identities = 16/34 (47%), Positives = 19/34 (55%), Gaps = 1/34 (2%)

Query: 78  LGSLLGGFILSWVADRYGRITAVLGSHVVSFLGV 111
           LG +LGG  L  + DR+ R    L  HVV FL V
Sbjct: 126 LGLILGG-ALGNLVDRFFRAPGPLRGHVVDFLSV 158


>gnl|CDD|223742 COG0670, COG0670, Integral membrane protein, interacts with FtsH
           [General function prediction only].
          Length = 233

 Score = 26.9 bits (60), Expect = 8.1
 Identities = 18/73 (24%), Positives = 30/73 (41%), Gaps = 10/73 (13%)

Query: 66  SSNLAITRSIFFLGSLLGGFILSWVADRY------GRITAVLGSHVVSFLGVALTPFS-- 117
           + +      +FF+ + L G  LS +   Y        I A  G   + F  ++L  ++  
Sbjct: 80  NKSSPTALILFFVYTALVGLTLSPILLVYAAISGGDAIAAAFGITALVFGALSLYGYTTK 139

Query: 118 KDVVLFSLSRFLT 130
           +D  L SL  FL 
Sbjct: 140 RD--LSSLGSFLF 150


>gnl|CDD|237869 PRK14962, PRK14962, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 472

 Score = 27.0 bits (60), Expect = 8.2
 Identities = 11/34 (32%), Positives = 20/34 (58%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
            + K + +++ + +E K+E  E E+  KE  EEL
Sbjct: 353 VQQKEKKKEESKAKEEKQEDIEFEKRFKELMEEL 386


>gnl|CDD|224194 COG1275, TehA, Tellurite resistance protein and related permeases
           [Inorganic ion transport and metabolism].
          Length = 329

 Score = 26.9 bits (60), Expect = 8.3
 Identities = 23/103 (22%), Positives = 40/103 (38%), Gaps = 10/103 (9%)

Query: 106 VSFLGVAL--TPFSKDVVLFSLSRFLTGVGHFNAFIFYYIIVLECVGPKWRT---FAMTF 160
           +  +GV L     S   + F L   L G G      F  +++L  +     +   +A TF
Sbjct: 211 IGLVGVGLLLIVNSGPSLTFVL--ILWGFGLLF-LFFALLLLLRVLLRLPFSPSWWAFTF 267

Query: 161 PFLIFYTVSEVALPWIAYYLADWQWISVITIFPLIVGLIVAIF 203
           P +I  T +      I   +  + ++ +I    LI   IV + 
Sbjct: 268 PLVILATSALELGKSIGIGV--FHYLGLILGTFLIFIWIVLLV 308


>gnl|CDD|223940 COG1008, NuoM, NADH:ubiquinone oxidoreductase subunit 4 (chain M)
           [Energy production and conversion].
          Length = 497

 Score = 26.8 bits (60), Expect = 8.5
 Identities = 5/22 (22%), Positives = 12/22 (54%)

Query: 182 DWQWISVITIFPLIVGLIVAIF 203
            +  +S++   PLI  L++ + 
Sbjct: 2   SFPLLSLLIFLPLIGALLILLI 23


>gnl|CDD|233197 TIGR00937, 2A51, chromate transporter, chromate ion transporter
           (CHR) family.  Members of this family probably act as
           chromate transporters, and are found in Pseudomonas
           aeruginosa, Alcaligenes eutrophus, Vibrio cholerae,
           Bacillus subtilis, cyanobacteria and archaea. The
           protein reduces chromate accumulation and is essential
           for chromate resistance. Cutoffs for this model have now
           been lowered, compared to a previous version, giving the
           model a scope more similar to that of pfam02417. Members
           of the original, more narrowly defined family score
           above 500.00 bits [Transport and binding proteins,
           Anions].
          Length = 368

 Score = 27.0 bits (60), Expect = 8.5
 Identities = 32/146 (21%), Positives = 52/146 (35%), Gaps = 23/146 (15%)

Query: 76  FFLGSLLGGFILSWVADRYGRITAV------LGSHVVSFLGVALTPFSKDVV-------- 121
           F L S L    L+W    YG + AV      L + V++ +  A+    K +V        
Sbjct: 74  FTLPSFLLVVALAWAYVHYGSLPAVGAWFYGLQAAVIALIAQAVWKLGKKLVGPDRLLWG 133

Query: 122 --LFSLSRFLTGVGHFNAFIFYYIIVLECVGPKW-RTFAMTFPFLIFYTVSEVALPWIAY 178
             L +    +     +   +   I+VL  +G +             +  V+ +AL     
Sbjct: 134 IALVTALGTILWPSEW-IQLLLGILVL-VLGWRRPPAKIPKVWLRQYALVAFLALGLALL 191

Query: 179 ----YLADWQWISVITIFPLIVGLIV 200
                LAD    SV+ IF    G +V
Sbjct: 192 ALLLPLADSSLASVLGIFFYKAGALV 217


>gnl|CDD|212164 cd11650, AT4G37440_like, Uncharacterized protein domain conserved
           in plants.  This domain contains an extensive protein
           sequence fragment that appears conserved in a number of
           plant proteins, including the gene product of
           Arabidopsis thaliana locus AT4G37440, which has been
           identified in transcriptional profiling as expressed at
           different levels in white cabbage cultivars.
          Length = 253

 Score = 26.6 bits (59), Expect = 8.7
 Identities = 12/40 (30%), Positives = 22/40 (55%), Gaps = 2/40 (5%)

Query: 22  RQKEEEEEGKKEKKEEEEEKKERNEELDRR--RRRRRRRD 59
           ++KE + EG K + E  +      E  +R+  +RR+R+R 
Sbjct: 122 QEKELQLEGTKAEGENSKSAPFSGERHERKVMKRRKRKRV 161


>gnl|CDD|220222 pfam09403, FadA, Adhesion protein FadA.  FadA (Fusobacterium
          adhesin A) is an adhesin which forms two alpha helices.
          Length = 126

 Score = 26.0 bits (57), Expect = 8.7
 Identities = 10/34 (29%), Positives = 20/34 (58%)

Query: 25 EEEEEGKKEKKEEEEEKKERNEELDRRRRRRRRR 58
          ++EE   +E+K+E+E  ++  +EL  R+  R   
Sbjct: 41 QKEEARFEEEKQEKETAEKEVQELKERQLGREEL 74


>gnl|CDD|144972 pfam01576, Myosin_tail_1, Myosin tail.  The myosin molecule is a
           multi-subunit complex made up of two heavy chains and
           four light chains it is a fundamental contractile
           protein found in all eukaryote cell types. This family
           consists of the coiled-coil myosin heavy chain tail
           region. The coiled-coil is composed of the tail from two
           molecules of myosin. These can then assemble into the
           macromolecular thick filament. The coiled-coil region
           provides the structural backbone the thick filament.
          Length = 859

 Score = 26.9 bits (60), Expect = 8.8
 Identities = 12/34 (35%), Positives = 18/34 (52%), Gaps = 1/34 (2%)

Query: 16  KSKREGRQKEEE-EEGKKEKKEEEEEKKERNEEL 48
           K + EG  + EE EE KK+  ++  E +E  E  
Sbjct: 307 KFESEGALRAEELEELKKKLNQKISELEEAAEAA 340


>gnl|CDD|237178 PRK12705, PRK12705, hypothetical protein; Provisional.
          Length = 508

 Score = 27.0 bits (60), Expect = 8.8
 Identities = 12/56 (21%), Positives = 23/56 (41%), Gaps = 11/56 (19%)

Query: 14  VPKSKREGR---QKEEEEEGK-------KEKKEEEEEK-KERNEELDRRRRRRRRR 58
             + +RE     QKEE+ + +       + + EE E+    R  EL+   ++    
Sbjct: 76  REELQREEERLVQKEEQLDARAEKLDNLENQLEEREKALSARELELEELEKQLDNE 131


>gnl|CDD|219248 pfam06978, POP1, Ribonucleases P/MRP protein subunit POP1.  This
           family represents a conserved region approximately 150
           residues long located towards the N-terminus of the POP1
           subunit that is common to both the RNase MRP and RNase P
           ribonucleoproteins (EC:3.1.26.5). These RNA-containing
           enzymes generate mature tRNA molecules by cleaving their
           5' ends.
          Length = 158

 Score = 26.1 bits (58), Expect = 8.9
 Identities = 14/56 (25%), Positives = 22/56 (39%), Gaps = 9/56 (16%)

Query: 14  VPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEELDRRRR--------RRRRRDNW 61
           VPK  R+ R K E  +     K   ++  +R   L   R         +R++R  W
Sbjct: 47  VPKRLRK-RAKREMAKDNTPTKLSRKKPSKRLLRLALARPPNLSRKYRKRQKRKKW 101


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E.coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 26.8 bits (59), Expect = 9.0
 Identities = 10/33 (30%), Positives = 15/33 (45%)

Query: 15  PKSKREGRQKEEEEEGKKEKKEEEEEKKERNEE 47
           P    +  Q E E EG++    +E E  +R  E
Sbjct: 220 PTENEQEEQGEGEGEGQEGSAPQESEATDRESE 252


>gnl|CDD|233168 TIGR00883, 2A0106, metabolite-proton symporter.  This model
          represents the metabolite:H+ symport subfamily of the
          major facilitator superfamily (pfam00083), including
          citrate-H+ symporters, dicarboxylate:H+ symporters, the
          proline/glycine-betaine transporter ProP, etc
          [Transport and binding proteins, Unknown substrate].
          Length = 394

 Score = 26.9 bits (60), Expect = 9.0
 Identities = 8/20 (40%), Positives = 9/20 (45%)

Query: 77 FLGSLLGGFILSWVADRYGR 96
          FL   LG  +     DR GR
Sbjct: 45 FLARPLGAIVFGHFGDRIGR 64


>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
           Provisional.
          Length = 2849

 Score = 26.9 bits (59), Expect = 9.0
 Identities = 6/22 (27%), Positives = 16/22 (72%)

Query: 26  EEEEGKKEKKEEEEEKKERNEE 47
           ++E+  ++  +EE++++E  EE
Sbjct: 156 DDEDEDEDDDDEEDDEEEEEEE 177


>gnl|CDD|167649 PRK03963, PRK03963, V-type ATP synthase subunit E; Provisional.
          Length = 198

 Score = 26.6 bits (59), Expect = 9.3
 Identities = 14/57 (24%), Positives = 28/57 (49%), Gaps = 10/57 (17%)

Query: 12 LIVPKSKREGRQK------EEEEEGKKEKKEEEEEKKERNEELDRRRRR----RRRR 58
          LI+ +  RE  QK      E ++E +K K+E  +  + + E + R+ +      ++R
Sbjct: 6  LIIQEINREAEQKIEYILEEAQKEAEKIKEEARKRAESKAEWILRKAKTQAELEKQR 62


>gnl|CDD|237205 PRK12792, flhA, flagellar biosynthesis protein FlhA; Reviewed.
          Length = 694

 Score = 27.0 bits (60), Expect = 9.3
 Identities = 7/38 (18%), Positives = 16/38 (42%)

Query: 11  GLIVPKSKREGRQKEEEEEGKKEKKEEEEEKKERNEEL 48
              +P+ +      E  +  ++E+  + E K    E+L
Sbjct: 320 AYTIPRRRAARAAAEAAKVKREEESAQAEAKDSVKEQL 357


>gnl|CDD|150313 pfam09605, Trep_Strep, Hypothetical bacterial integral membrane
           protein (Trep_Strep).  This family consists of strongly
           hydrophobic proteins about 190 amino acids in length
           with a strongly basic motif near the C-terminus. It is
           found in rather few species, but in paralogous families
           of 12 members in the oral pathogenic spirochaete
           Treponema denticola and 2 in Streptococcus pneumoniae
           R6.
          Length = 186

 Score = 26.4 bits (59), Expect = 9.7
 Identities = 12/68 (17%), Positives = 22/68 (32%), Gaps = 12/68 (17%)

Query: 138 FIFYYIIVLECVGPKWRTFAMTFPFLIFYTVSEVAL---PWIAYYLADWQWISVITIFPL 194
           F+  +I  +              P  +    +  AL         +A       ITI  +
Sbjct: 16  FVIVFIGGM---------LGAINPVFMLLAPAITALLGGIIFMLLVAKVPKFGAITIMGI 66

Query: 195 IVGLIVAI 202
           I+GL+  +
Sbjct: 67  IIGLLFFL 74


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.326    0.142    0.448 

Gapped
Lambda     K      H
   0.267   0.0812    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 11,492,410
Number of extensions: 1168034
Number of successful extensions: 10161
Number of sequences better than 10.0: 1
Number of HSP's gapped: 8412
Number of HSP's successfully gapped: 1032
Length of query: 208
Length of database: 10,937,602
Length adjustment: 93
Effective length of query: 115
Effective length of database: 6,812,680
Effective search space: 783458200
Effective search space used: 783458200
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (21.6 bits)
S2: 57 (25.6 bits)