RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy9196
         (362 letters)



>gnl|CDD|240223 cd05797, Ribosomal_L10, Ribosomal protein L10 family, L10
           subfamily; composed of bacterial 50S ribosomal protein
           and eukaryotic mitochondrial 39S ribosomal protein, L10.
           L10 occupies the L7/L12 stalk of the ribosome. The
           N-terminal domain (NTD) of L10 interacts with L11
           protein and forms the base of the L7/L12 stalk, while
           the extended C-terminal helix binds to two or three
           dimers of the NTD of L7/L12 (L7 and L12 are identical
           except for an acetylated N-terminus). The L7/L12 stalk
           is known to contain the binding site for elongation
           factors G and Tu (EF-G and EF-Tu, respectively);
           however, there is disagreement as to whether or not L10
           is involved in forming the binding site. The stalk is
           believed to be associated with GTPase activities in
           protein synthesis. In a neuroblastoma cell line, L10 has
           been shown to interact with the SH3 domain of Src and to
           activate the binding of the Nck1 adaptor protein with
           skeletal proteins such as the Wiskott-Aldrich Syndrome
           Protein (WASP) and the WASP-interacting protein (WIP).
           These bacteria and eukaryotic sequences have no
           additional C-terminal domain, present in other
           eukaryotic and archaeal orthologs.
          Length = 157

 Score = 59.1 bits (144), Expect = 1e-10
 Identities = 34/159 (21%), Positives = 65/159 (40%), Gaps = 19/159 (11%)

Query: 81  ILAKEILARFEESQMIAIVHRSSM-LAEVN------REVKVVFKRVGMTMLDRYGRATIE 133
            +  E+  + +E++ + +     + +A++       RE  V  K V  T+         +
Sbjct: 7   EIVAELKEKLKEAKSVVVADYRGLTVAQLTELRKELREAGVKLKVVKNTLA--------K 58

Query: 134 KALTNTKYISIMPLFKVSEAIIVSPEAKVD---VLLKTLKKTPQLTLMATIVDNRLLSKT 190
           +AL  T +  +  L K   AI  S E  V    VL    K+  +L +   +V+ ++L   
Sbjct: 59  RALEGTGFEDLDDLLKGPTAIAFSEEDPVAAAKVLKDFAKENKKLEIKGGVVEGKVLDAE 118

Query: 191 ETEMYAT-TNLATQQALLVQTISSVATSLTSQLNHHSTT 228
           E +  A   +     A L+  + + AT L   LN  ++ 
Sbjct: 119 EVKALAKLPSREELLAQLLGLLQAPATKLVRVLNAPASK 157


>gnl|CDD|238222 cd00379, Ribosomal_L10_P0, Ribosomal protein L10 family; composed
           of the large subunit ribosomal protein called L10 in
           bacteria, P0 in eukaryotes, and L10e in archaea, as well
           as uncharacterized P0-like eukaryotic proteins. In all
           three kingdoms, L10 forms a tight complex with multiple
           copies of the small acidic protein L12(e). This complex
           forms a stalk structure on the large subunit of the
           ribosome. The N-terminal domain (NTD) of L10 interacts
           with L11 protein and forms the base of the L7/L12 stalk,
           while the extended C-terminal helix binds to two or
           three dimers of the NTD of L7/L12 (L7 and L12 are
           identical except for an acetylated N-terminus). The
           L7/L12 stalk is known to contain the binding site for
           elongation factors G and Tu (EF-G and EF-Tu,
           respectively); however, there is disagreement as to
           whether or not L10 is involved in forming the binding
           site. The stalk is believed to be associated with GTPase
           activities in protein synthesis. In a neuroblastoma cell
           line, L10 has been shown to interact with the SH3 domain
           of Src and to activate the binding of the Nck1 adaptor
           protein with skeletal proteins such as the
           Wiskott-Aldrich Syndrome Protein (WASP) and the
           WASP-interacting protein (WIP). Some eukaryotic P0
           sequences have an additional C-terminal domain
           homologous with acidic proteins P1 and P2.
          Length = 155

 Score = 51.0 bits (123), Expect = 6e-08
 Identities = 29/152 (19%), Positives = 60/152 (39%), Gaps = 7/152 (4%)

Query: 82  LAKEILARFEESQMIAIVHRSSMLAEVNREVKVVFKRVGMTMLDRYGRAT-IEKALTNTK 140
           L +E+    ++ + + +V    +      E++   +  G  +  + G+ T + +AL  T 
Sbjct: 6   LVEELKELLKKYKSVVVVDYRGLTVAQLTELRKELRESGAKL--KVGKNTLMRRALKGTG 63

Query: 141 YISIMPLFKVSEAIIVSPEAKVDV--LLKTLKKT-PQLTLMATIVDNRLLSKTETEMYAT 197
           +  + PL K   A+  + E  V+V  +LK   K   +L     +V  ++L        A 
Sbjct: 64  FEELKPLLKGPTALAFTNEDPVEVAKVLKDFAKENKKLFAKGGVVAGKVLDPAGVTALAK 123

Query: 198 T-NLATQQALLVQTISSVATSLTSQLNHHSTT 228
             +     A+L+  + +    L   LN     
Sbjct: 124 LPSREELLAMLIGLLKAPIAKLARLLNALGIG 155


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 44.0 bits (104), Expect = 1e-04
 Identities = 33/148 (22%), Positives = 54/148 (36%), Gaps = 16/148 (10%)

Query: 231 SYLDQHSGTGGEG---------GSSASEALSGSETSGGSEGEATSNEASGESQESAKAAS 281
           S      G G E           +  +     S  +G S     ++ +S   + S   + 
Sbjct: 241 SSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSP 300

Query: 282 SSDNQGEATSSNEASGE---SQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAA 338
           SS   G A SS  AS     S+ES S+  +SSS+S   +A     S + S + S+    A
Sbjct: 301 SSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPA 360

Query: 339 PGD----KVTDGKETPSTSDTDGDKSSR 362
                  +    +   S + + G  + R
Sbjct: 361 DPSSPRKRPRPSRAPSSPAASAGRPTRR 388



 Score = 39.0 bits (91), Expect = 0.004
 Identities = 23/122 (18%), Positives = 43/122 (35%), Gaps = 5/122 (4%)

Query: 237 SGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEAS 296
                     +S     S +S   E   + + +S  S  +  +  +S +   +  S+ +S
Sbjct: 269 IWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSS 328

Query: 297 GESQESDSAKAASS-----SQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPS 351
             S    S  AA S     S+SP+ S P      ++     + S A      + G+ T  
Sbjct: 329 TSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRR 388

Query: 352 TS 353
            +
Sbjct: 389 RA 390



 Score = 28.2 bits (63), Expect = 7.9
 Identities = 22/121 (18%), Positives = 32/121 (26%), Gaps = 7/121 (5%)

Query: 239 TGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQG-EATSSNEASG 297
             G      +EA +    S  +   +T   AS   + S      S       T    +  
Sbjct: 68  PTGPPPGPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPP 127

Query: 298 ESQESDSAKAASSSQSP-----TESAPGGEVS-KAASATDSQSSEAAPGDKVTDGKETPS 351
            S   D ++      SP           G      AS   S    A P     +    PS
Sbjct: 128 PSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEETARAPS 187

Query: 352 T 352
           +
Sbjct: 188 S 188


>gnl|CDD|223003 PHA03169, PHA03169, hypothetical protein; Provisional.
          Length = 413

 Score = 40.7 bits (95), Expect = 0.001
 Identities = 22/129 (17%), Positives = 37/129 (28%), Gaps = 12/129 (9%)

Query: 237 SGTGGEGGSSASEALSGSETSGG--------SEGEATSNEASGESQESAKAASSSDNQGE 288
           S T    GS+   A   S  +          S     S  +     E A   S + +  +
Sbjct: 102 SPTPSPSGSAEELASGLSPENTSGSSPESPASHSPPPSPPSHPGPHEPAPPESHNPSPNQ 161

Query: 289 ATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQS---SEAAPGDKVTD 345
             SS       ++S       +S+   +S    +     S+   QS       P      
Sbjct: 162 QPSS-FLQPSHEDSPEEPEPPTSEPEPDSPGPPQSETPTSSPPPQSPPDEPGEPQSPTPQ 220

Query: 346 GKETPSTSD 354
              +P+T  
Sbjct: 221 QAPSPNTQQ 229



 Score = 39.6 bits (92), Expect = 0.002
 Identities = 16/112 (14%), Positives = 29/112 (25%), Gaps = 6/112 (5%)

Query: 236 HSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQ-----GEAT 290
           HS              +  E+   S  +  S+      ++S +      ++         
Sbjct: 134 HSPPPSPPSHPGPHEPAPPESHNPSPNQQPSSFLQPSHEDSPEEPEPPTSEPEPDSPGPP 193

Query: 291 SSNEASGESQESDSAKAASSSQSPT-ESAPGGEVSKAASATDSQSSEAAPGD 341
            S   +               QSPT + AP     +A    D  +     G 
Sbjct: 194 QSETPTSSPPPQSPPDEPGEPQSPTPQQAPSPNTQQAVEHEDEPTEPEREGP 245



 Score = 37.6 bits (87), Expect = 0.008
 Identities = 26/133 (19%), Positives = 45/133 (33%), Gaps = 5/133 (3%)

Query: 234 DQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSN 293
            +  G GG  GS +    S + +  GS  E  S  +   +  S+  + +S +   +  S+
Sbjct: 84  KEERGQGGPSGSGSESVGSPTPSPSGSAEELASGLSPENTSGSSPESPASHSPPPSPPSH 143

Query: 294 EASGESQESDSAKAASSSQSPTESAPGGEVS-----KAASATDSQSSEAAPGDKVTDGKE 348
               E    +S   + + Q  +   P  E S        S  +  S      +  T    
Sbjct: 144 PGPHEPAPPESHNPSPNQQPSSFLQPSHEDSPEEPEPPTSEPEPDSPGPPQSETPTSSPP 203

Query: 349 TPSTSDTDGDKSS 361
             S  D  G+  S
Sbjct: 204 PQSPPDEPGEPQS 216



 Score = 34.6 bits (79), Expect = 0.083
 Identities = 25/118 (21%), Positives = 35/118 (29%), Gaps = 5/118 (4%)

Query: 233 LDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSS 292
           L   + +G    S AS +   S  S     E    E+   S        SS  Q     S
Sbjct: 118 LSPENTSGSSPESPASHSPPPSPPSHPGPHEPAPPESHNPSPNQ---QPSSFLQPSHEDS 174

Query: 293 NEAS--GESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKE 348
            E      S+    +     S++PT S P            S + + AP        E
Sbjct: 175 PEEPEPPTSEPEPDSPGPPQSETPTSSPPPQSPPDEPGEPQSPTPQQAPSPNTQQAVE 232



 Score = 31.1 bits (70), Expect = 0.87
 Identities = 24/107 (22%), Positives = 37/107 (34%), Gaps = 5/107 (4%)

Query: 233 LDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSS 292
            +Q          +A E+  G +   G  G + S   S  S  +   + S++      S 
Sbjct: 62  AEQGHRQTESDTETAEESRHGEKEERGQGGPSGSGSESVGS-PTPSPSGSAEELASGLSP 120

Query: 293 NEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAP 339
              SG S ES     AS S  P+  +  G    A   + + S    P
Sbjct: 121 ENTSGSSPESP----ASHSPPPSPPSHPGPHEPAPPESHNPSPNQQP 163


>gnl|CDD|237284 PRK13108, PRK13108, prolipoprotein diacylglyceryl transferase;
           Reviewed.
          Length = 460

 Score = 40.3 bits (94), Expect = 0.001
 Identities = 20/122 (16%), Positives = 35/122 (28%)

Query: 237 SGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEAS 296
           +       +SA   +   E +   +          E  +   A S          S  A 
Sbjct: 304 AAAAVASAASAVGPVGPGEPNQPDDVAEAVKAEVAEVTDEVAAESVVQVADRDGESTPAV 363

Query: 297 GESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTD 356
            E+ E+D  +      +    A     ++AASA   + +  A         E P  +   
Sbjct: 364 EETSEADIEREQPGDLAGQAPAAHQVDAEAASAAPEEPAALASEAHDETEPEVPEKAAPI 423

Query: 357 GD 358
            D
Sbjct: 424 PD 425



 Score = 39.2 bits (91), Expect = 0.003
 Identities = 22/118 (18%), Positives = 41/118 (34%), Gaps = 11/118 (9%)

Query: 245 SSASEALSGSETSGGSEGEATSN---EASGESQESAKAASSSDNQGEATSSNEASGESQE 301
           S    A    E++   E  + ++   E  G+    A AA   D +  + +  E +  + E
Sbjct: 348 SVVQVADRDGESTPAVEETSEADIEREQPGDLAGQAPAAHQVDAEAASAAPEEPAALASE 407

Query: 302 SDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDK 359
              A   +  + P ++AP         A   + + A PGD   +        D    +
Sbjct: 408 ---AHDETEPEVPEKAAP-----IPDPAKPDELAVAGPGDDPAEPDGIRRQDDFSSRR 457



 Score = 33.0 bits (75), Expect = 0.23
 Identities = 26/124 (20%), Positives = 40/124 (32%), Gaps = 10/124 (8%)

Query: 244 GSSASEALSGSE--TSGGSEGEATSNEASGESQESAKAASSSD----NQGEATSSNEASG 297
           G  A  AL GSE       E E     A+  +  +A A         NQ +  +    + 
Sbjct: 278 GREAPGALRGSEYVVDEALEREPAELAAAAVAS-AASAVGPVGPGEPNQPDDVAEAVKAE 336

Query: 298 ESQESDSAKAASSSQSPT---ESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSD 354
            ++ +D   A S  Q      ES P  E +  A     Q  + A         +  + S 
Sbjct: 337 VAEVTDEVAAESVVQVADRDGESTPAVEETSEADIEREQPGDLAGQAPAAHQVDAEAASA 396

Query: 355 TDGD 358
              +
Sbjct: 397 APEE 400



 Score = 29.9 bits (67), Expect = 2.4
 Identities = 20/112 (17%), Positives = 33/112 (29%), Gaps = 1/112 (0%)

Query: 250 ALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAAS 309
           A  G E  G   G     + + E + +  AA++  +   A       GE  + D    A 
Sbjct: 275 APKGREAPGALRGSEYVVDEALEREPAELAAAAVASAASAVGPV-GPGEPNQPDDVAEAV 333

Query: 310 SSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDKSS 361
            ++    +      S    A     S  A  +      E     D  G   +
Sbjct: 334 KAEVAEVTDEVAAESVVQVADRDGESTPAVEETSEADIEREQPGDLAGQAPA 385


>gnl|CDD|234632 PRK00099, rplJ, 50S ribosomal protein L10; Reviewed.
          Length = 172

 Score = 37.8 bits (89), Expect = 0.002
 Identities = 38/168 (22%), Positives = 67/168 (39%), Gaps = 27/168 (16%)

Query: 85  EILARFEESQMIAIVHRSSM-LAEVN------REVKVVFKRVGMTMLDRYGRATIEKALT 137
           E+  + +++Q   +     + +A++       RE  V +K V  T+  R        AL 
Sbjct: 12  ELAEKLKKAQSAVVADYRGLTVAQMTELRKKLREAGVEYKVVKNTLARR--------ALE 63

Query: 138 NTKYISIMPLFKVSEAIIVSPE-----AKVDVLLKTLKKTPQLTLMATIVDNRLLSKTET 192
            T +  +  L K   AI  S E     AKV  L    K   +L +    ++ ++L   E 
Sbjct: 64  GTGFEGLDDLLKGPTAIAFSYEDPVAAAKV--LKDFAKDNKKLEIKGGAIEGKVLDAEEV 121

Query: 193 EMYATTNLATQQ---ALLVQTISSVATSLTSQLNHHSTTLVSYLDQHS 237
           +  A   L +++   A L+  + + AT L   LN   + L   L   +
Sbjct: 122 KALAK--LPSREELLAKLLGVLQAPATKLAGVLNAPPSKLARVLKALA 167


>gnl|CDD|223044 PHA03325, PHA03325, nuclear-egress-membrane-like protein;
           Provisional.
          Length = 418

 Score = 39.1 bits (91), Expect = 0.003
 Identities = 27/121 (22%), Positives = 43/121 (35%), Gaps = 8/121 (6%)

Query: 234 DQHSGTGGEGGSSA-SEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSS 292
            +HS       S               +  E      + E++E A+ A+S+ ++G +++ 
Sbjct: 299 SEHSDPEPLPASLPPPPVRRPRVKHPEAGKEEPDGARNAEAKEPAQPATSTSSKGSSSAQ 358

Query: 293 NEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPST 352
           N+ SG +    S  AASS     E    G      +     S    P   VT   E PS 
Sbjct: 359 NKDSGSTGPGSSLAAASS---FLEDDDFGSPPLDLTT----SLRHMPSPSVTSAPEPPSI 411

Query: 353 S 353
            
Sbjct: 412 P 412


>gnl|CDD|236304 PRK08581, PRK08581, N-acetylmuramoyl-L-alanine amidase; Validated.
          Length = 619

 Score = 38.2 bits (89), Expect = 0.006
 Identities = 23/134 (17%), Positives = 46/134 (34%), Gaps = 1/134 (0%)

Query: 223 NHHSTTLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASS 282
           N   T +   L ++        ++  + L     S  S+ E   N     +  +  + SS
Sbjct: 98  NLPQTNINQLLTKNKYDDNYSLTTLIQNLFNLN-SDISDYEQPRNSEKSTNDSNKNSDSS 156

Query: 283 SDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDK 342
             N  +  SS +   ++Q++ S+     S S  +           S +   S + A    
Sbjct: 157 IKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTANQKS 216

Query: 343 VTDGKETPSTSDTD 356
            +   ++ S S  D
Sbjct: 217 SSKDNQSMSDSALD 230



 Score = 34.4 bits (79), Expect = 0.10
 Identities = 14/81 (17%), Positives = 28/81 (34%), Gaps = 5/81 (6%)

Query: 281 SSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPG 340
           S         +S+ +     ++ S+K   +      S+     +   S ++ Q +   P 
Sbjct: 142 SEKSTNDSNKNSDSSIKNDTDTQSSKQDKADNQKAPSS----NNTKPSTSNKQPNSPKPT 197

Query: 341 DKVTDGKETPSTSDTDGDKSS 361
                    P++ DT   KSS
Sbjct: 198 QPNQSNS-QPASDDTANQKSS 217



 Score = 29.8 bits (67), Expect = 2.5
 Identities = 12/73 (16%), Positives = 20/73 (27%)

Query: 246 SASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSA 305
            A +    S     S     SN+       S+K    +DN   +   N     S    S 
Sbjct: 26  YADDPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKFSTIDSST 85

Query: 306 KAASSSQSPTESA 318
             +++        
Sbjct: 86  SDSNNIIDFIYKN 98



 Score = 29.8 bits (67), Expect = 2.6
 Identities = 22/139 (15%), Positives = 43/139 (30%), Gaps = 3/139 (2%)

Query: 226 STTLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDN 285
           STTLV           +     S A + S  S  S  + TS + S +  + A   ++S+ 
Sbjct: 12  STTLVLPTLTSPTAYADDPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQ 71

Query: 286 QGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAP---GDK 342
                  +     + +S++         P  +              S ++        + 
Sbjct: 72  DNNDKKFSTIDSSTSDSNNIIDFIYKNLPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNS 131

Query: 343 VTDGKETPSTSDTDGDKSS 361
                E P  S+   + S+
Sbjct: 132 DISDYEQPRNSEKSTNDSN 150


>gnl|CDD|164795 PHA00370, III, attachment protein.
          Length = 297

 Score = 37.6 bits (87), Expect = 0.008
 Identities = 24/120 (20%), Positives = 46/120 (38%), Gaps = 9/120 (7%)

Query: 237 SGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEAS 296
           SG G  GGS    +  G    G +    T         +  K A++  N+   T  N+ +
Sbjct: 109 SGGGDTGGSGGGGSDGGGSEGGSTGKSLTKEGVGAGDFDYPKMANA--NKDALTEDNDQN 166

Query: 297 GESQESDSAKAASSSQSPTESAPGG--EVSKAASATDSQSSEAAPG----DKVTDGKETP 350
              +++D  +   +S S +++  G    V         +S + A      D++ +G  +P
Sbjct: 167 ALQKDAD-EQLDKASASVSDAISGFMRGVGGLVDNGGGESGQFAGSNSEMDQLGEGDGSP 225



 Score = 30.3 bits (68), Expect = 1.6
 Identities = 24/112 (21%), Positives = 45/112 (40%), Gaps = 6/112 (5%)

Query: 237 SGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQG-EATSSNEA 295
            G GG  G   S+  +G +T GG+ G  +    +G S         S+      + + E 
Sbjct: 83  DGDGGGTGEGGSD--TGGDTGGGNTGGGSGGGDTGGSGGGGSDGGGSEGGSTGKSLTKEG 140

Query: 296 SGESQESDSAKAASSSQ-SPTESAPGGEVSKAASATDSQSSEAAPGDKVTDG 346
            G + + D  K A++++ + TE      + K A       + A+  D ++  
Sbjct: 141 VG-AGDFDYPKMANANKDALTEDNDQNALQKDADEQ-LDKASASVSDAISGF 190


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 37.2 bits (86), Expect = 0.014
 Identities = 28/144 (19%), Positives = 46/144 (31%), Gaps = 8/144 (5%)

Query: 220 SQLNHHSTTLVSYL-DQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAK 278
              NHHS    S L            S  S + +            T++ +S  +  S  
Sbjct: 274 GINNHHSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKE-------TTSNSSSAAAGSIG 326

Query: 279 AASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAA 338
           + SS   +    + + +S +S  S +    SSS S  ES        ++ A DS+   + 
Sbjct: 327 SKSSKSAKHSNRNKSNSSPKSHSSANGSVPSSSVSDNESKQKRASKSSSGARDSKKDASG 386

Query: 339 PGDKVTDGKETPSTSDTDGDKSSR 362
                T     P    +      R
Sbjct: 387 MSANGTVENCIPENKISTPSAIER 410


>gnl|CDD|222790 PHA00430, PHA00430, tail fiber protein.
          Length = 568

 Score = 36.8 bits (85), Expect = 0.018
 Identities = 22/113 (19%), Positives = 46/113 (40%), Gaps = 10/113 (8%)

Query: 260 SEGEATSNEASGESQES-AKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESA 318
           +E +   N+A   + ES A A ++   + EA  SN  +   +    +  +S   +  ++ 
Sbjct: 173 NEADRARNQAERFNNESGASATNTKQWRSEADGSNSEANRFKGYADSMTSSVEAAKGQAE 232

Query: 319 P--------GGEVSKAA-SATDSQSSEAAPGDKVTDGKETPSTSDTDGDKSSR 362
                    G   +KAA SA+ + +SE    +  T    + + +    D++  
Sbjct: 233 SSSKEANTAGDYATKAAASASAAHASEVNAANSATAAATSANRAKQQADRAKT 285



 Score = 34.9 bits (80), Expect = 0.062
 Identities = 23/103 (22%), Positives = 41/103 (39%), Gaps = 5/103 (4%)

Query: 235 QHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNE 294
           Q      E G+SA+        + GS  EA   +   +S  S+  A+    +  +  +N 
Sbjct: 181 QAERFNNESGASATNTKQWRSEADGSNSEANRFKGYADSMTSSVEAAKGQAESSSKEANT 240

Query: 295 ASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEA 337
           A   +      KAA+S+ +   S      S  A+AT +  ++ 
Sbjct: 241 AGDYAT-----KAAASASAAHASEVNAANSATAAATSANRAKQ 278



 Score = 31.4 bits (71), Expect = 0.88
 Identities = 27/137 (19%), Positives = 57/137 (41%), Gaps = 17/137 (12%)

Query: 201 ATQQALLVQTISSVATSLTSQLNHHSTTLVSYLDQHSGTGGEGGSSASEALSGSETSGGS 260
           A  QA      S  + + T Q             +  G+  E       A S + +   +
Sbjct: 178 ARNQAERFNNESGASATNTKQWR----------SEADGSNSEANRFKGYADSMTSSVEAA 227

Query: 261 EGEAT-----SNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPT 315
           +G+A      +N A   + ++A +AS++ +  E  ++N A+  +  ++ AK   + ++ T
Sbjct: 228 KGQAESSSKEANTAGDYATKAAASASAA-HASEVNAANSATAAATSANRAK-QQADRAKT 285

Query: 316 ESAPGGEVSKAASATDS 332
           E+   G ++  A A + 
Sbjct: 286 EADKLGNMNGFAGAIEK 302


>gnl|CDD|148679 pfam07218, RAP1, Rhoptry-associated protein 1 (RAP-1).  This family
           consists of several rhoptry-associated protein 1 (RAP-1)
           sequences which appear to be specific to Plasmodium
           falciparum.
          Length = 790

 Score = 36.6 bits (84), Expect = 0.022
 Identities = 35/130 (26%), Positives = 49/130 (37%), Gaps = 9/130 (6%)

Query: 230 VSYLDQHSG--TGGEGGSSASEALSGSET-SGGSEGEATSNEASGESQESAKAASSSDNQ 286
            S+L+  +    G    +  SE    S+   G S   + S  A  E  +S        N 
Sbjct: 67  ESFLENKASKDDGNINLTDTSENGDASKKGHGKSRVRSASAAAILEEDDSKDDMEFKANP 126

Query: 287 GEAT-----SSNEASGESQESDSAKAASS-SQSPTESAPGGEVSKAASATDSQSSEAAPG 340
            EA        N+  G +  SD    AS+ S S + S  G   S   SATDS  + A+  
Sbjct: 127 NEAGKPGKPKGNQGEGLASSSDGKSKASAKSGSKSASKHGESNSSDESATDSGKASASVA 186

Query: 341 DKVTDGKETP 350
             V   +E P
Sbjct: 187 GIVGADEEAP 196



 Score = 29.3 bits (65), Expect = 3.7
 Identities = 23/108 (21%), Positives = 39/108 (36%), Gaps = 2/108 (1%)

Query: 237 SGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEAS 296
            G      +SA+  L   ++    E +A  NEA        K          +   ++AS
Sbjct: 97  HGKSRVRSASAAAILEEDDSKDDMEFKANPNEAGKPG--KPKGNQGEGLASSSDGKSKAS 154

Query: 297 GESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVT 344
            +S    ++K   S+ S   +   G+ S + +       EA P  K T
Sbjct: 155 AKSGSKSASKHGESNSSDESATDSGKASASVAGIVGADEEAPPAPKNT 202


>gnl|CDD|216860 pfam02063, MARCKS, MARCKS family. 
          Length = 296

 Score = 35.2 bits (80), Expect = 0.043
 Identities = 31/119 (26%), Positives = 49/119 (41%), Gaps = 11/119 (9%)

Query: 235 QHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEA----- 289
           + +G G E   +A+E     E +  +  EA S E +    E A AA +    GE      
Sbjct: 158 KEAGEGAEAEGAAAEKEGAKEEAAAAAPEAGSGEEAAAPGEEAGAAGAEGEAGEEPAADA 217

Query: 290 ------TSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDK 342
                     EA+ E  +++ AKAA   ++  + A     S AA    +   EAAP ++
Sbjct: 218 EPEQPEAKPEEAAPEKPQAEEAKAAEEQKAEEKPAEEAGASSAAQEAPAAEQEAAPAEE 276


>gnl|CDD|236776 PRK10856, PRK10856, cytoskeletal protein RodZ; Provisional.
          Length = 331

 Score = 34.6 bits (80), Expect = 0.067
 Identities = 20/113 (17%), Positives = 45/113 (39%), Gaps = 2/113 (1%)

Query: 209 QTISSVATSLTSQLNHHSTTLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNE 268
           + I+++A   +++L+ +S   V  LD  + T      + +  +  + T+  +   AT+  
Sbjct: 141 EEITTMADQSSAELSQNSGQSVP-LDTSTTTDPATTPAPAAPVDTTPTNSQTPAVATAPA 199

Query: 269 ASGESQESAKAASSSDNQGEATSSNE-ASGESQESDSAKAASSSQSPTESAPG 320
            + + Q++A  A S  N   A +    A      +       +  S   + P 
Sbjct: 200 PAVDPQQNAVVAPSQANVDTAATPAPAAPATPDGAAPLPTDQAGVSTPAADPN 252



 Score = 30.8 bits (70), Expect = 1.0
 Identities = 16/110 (14%), Positives = 32/110 (29%), Gaps = 1/110 (0%)

Query: 229 LVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGE 288
           + +  DQ S    +     S  L  S T+  +   A +         S   A ++     
Sbjct: 143 ITTMADQSSAELSQNSGQ-SVPLDTSTTTDPATTPAPAAPVDTTPTNSQTPAVATAPAPA 201

Query: 289 ATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAA 338
                 A     +++   AA+ + +   +  G        A  S  +   
Sbjct: 202 VDPQQNAVVAPSQANVDTAATPAPAAPATPDGAAPLPTDQAGVSTPAADP 251


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 35.0 bits (80), Expect = 0.079
 Identities = 26/129 (20%), Positives = 45/129 (34%), Gaps = 10/129 (7%)

Query: 235 QHSG--TGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSS 292
           +H+G  TG EG     E  + +E   G E      E  GE++   +  S  +   E    
Sbjct: 639 EHTGERTGEEG-----ERPTEAEGENGEESG-GEAEQEGETETKGENESEGEIPAERKGE 692

Query: 293 NEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAP--GDKVTDGKETP 350
            E  GE +  ++     +     E     E        + ++ E      D+     E  
Sbjct: 693 QEGEGEIEAKEADHKGETEAEEVEHEGETEAEGTEDEGEIETGEEGEEVEDEGEGEAEGK 752

Query: 351 STSDTDGDK 359
              +T+GD+
Sbjct: 753 HEVETEGDR 761



 Score = 31.5 bits (71), Expect = 0.79
 Identities = 24/117 (20%), Positives = 39/117 (33%), Gaps = 3/117 (2%)

Query: 240 GGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGES 299
           G     +  +     E  G +E E   +E  GE Q         D   E    +E   E+
Sbjct: 751 GKHEVETEGDRKET-EHEGETEAEGKEDEDEGEIQAGEDGEMKGDEGAEGKVEHEGETEA 809

Query: 300 QESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTD 356
            E D  +  S +Q+           +  +A +    EA   +K  DG       D++
Sbjct: 810 GEKDEHEGQSETQADDTEVKDETGEQELNAEN--QGEAKQDEKGVDGGGGSDGGDSE 864



 Score = 28.0 bits (62), Expect = 9.0
 Identities = 30/128 (23%), Positives = 44/128 (34%), Gaps = 7/128 (5%)

Query: 239 TGGEGGSSASEALSGSETSGGS-----EGEATSNEASGESQESAKAASSSD-NQGEATSS 292
             GEG   A EA    ET         E EA   E  GE +   +     D  +GEA   
Sbjct: 693 QEGEGEIEAKEADHKGETEAEEVEHEGETEAEGTEDEGEIETGEEGEEVEDEGEGEAEGK 752

Query: 293 NEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPST 352
           +E   E    ++     +     E    GE+ +A    + +  E A G    +G+     
Sbjct: 753 HEVETEGDRKETEHEGETEAEGKEDEDEGEI-QAGEDGEMKGDEGAEGKVEHEGETEAGE 811

Query: 353 SDTDGDKS 360
            D    +S
Sbjct: 812 KDEHEGQS 819


>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein.  This family consists of
           AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
           retardation syndrome) nuclear proteins. These proteins
           have been linked to human diseases such as acute
           lymphoblastic leukaemia and mental retardation. The
           family also contains a Drosophila AF4 protein homologue
           Lilliputian which contains an AT-hook domain.
           Lilliputian represents a novel pair-rule gene that acts
           in cytoskeleton regulation, segmentation and
           morphogenesis in Drosophila.
          Length = 1154

 Score = 34.5 bits (79), Expect = 0.098
 Identities = 21/73 (28%), Positives = 38/73 (52%), Gaps = 6/73 (8%)

Query: 265 TSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVS 324
           +S+E S E Q + K  S +      T  +  S   + + S+  +SSS S +ES+ G +  
Sbjct: 363 SSSEDSDEEQATEKPPSRN------TPPSAPSSNPEPAASSSGSSSSSSGSESSSGSDSE 416

Query: 325 KAASATDSQSSEA 337
             +S++DS+ +E 
Sbjct: 417 SESSSSDSEENEP 429


>gnl|CDD|226365 COG3846, TrbL, Type IV secretory pathway, TrbL components
           [Intracellular trafficking and secretion].
          Length = 452

 Score = 34.3 bits (79), Expect = 0.10
 Identities = 25/128 (19%), Positives = 44/128 (34%), Gaps = 9/128 (7%)

Query: 235 QHSGTGGEGGSS-ASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSN 293
             + +    G+S AS A S   +     G      A+G          S   +    +  
Sbjct: 321 SLASSVTALGTSMASAAASAFASGRKGSGSGAFGTAAGV-----GDVKSPGAKAAMRTLG 375

Query: 294 EASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQ---SSEAAPGDKVTDGKETP 350
            A+G++  S ++    + +S   SA G      A+   +    +   A G  V  G  T 
Sbjct: 376 RAAGDTGVSVASGVGQAPKSAGGSAAGKSAVAKATGVQAAPGWARRMASGQNVFAGASTA 435

Query: 351 STSDTDGD 358
           + +   GD
Sbjct: 436 AAAVRGGD 443


>gnl|CDD|179316 PRK01655, spxA, transcriptional regulator Spx; Reviewed.
          Length = 131

 Score = 32.0 bits (73), Expect = 0.18
 Identities = 23/69 (33%), Positives = 33/69 (47%), Gaps = 7/69 (10%)

Query: 49  YLLPSHEKCQKARARQWLMERK-GLNEENIFQNILA----KEILARFEESQMIAIVHRSS 103
           +  PS   C+KA+A  WL E      E NIF + L     K+IL   E+     I  RS 
Sbjct: 5   FTSPSCTSCRKAKA--WLEEHDIPFTERNIFSSPLTIDEIKQILRMTEDGTDEIISTRSK 62

Query: 104 MLAEVNREV 112
           +  ++N +V
Sbjct: 63  VFQKLNVDV 71


>gnl|CDD|233697 TIGR02040, PpsR-CrtJ, transcriptional regulator PpsR.  This model
           represents the transcriptional regulator PpsR which is
           strictly associated with photosynthetic proteobacteria
           and found in photosynthetic operons. PpsR has been
           reported to be a repressor. These proteins contain a
           Helix-Turn_Helix motif of the "fis" type (pfam02954)
           [Energy metabolism, Photosynthesis].
          Length = 442

 Score = 32.9 bits (75), Expect = 0.28
 Identities = 17/62 (27%), Positives = 31/62 (50%)

Query: 158 PEAKVDVLLKTLKKTPQLTLMATIVDNRLLSKTETEMYATTNLATQQALLVQTISSVATS 217
               + VLL  +++T Q+ L AT +     ++TE E+ A      ++ L+V  I  ++  
Sbjct: 305 GGVDLRVLLSNVRRTGQVRLYATTLTGEFGAQTEVEISAAWVDQGERPLIVLVIRDISRR 364

Query: 218 LT 219
           LT
Sbjct: 365 LT 366


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E.coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 32.6 bits (74), Expect = 0.31
 Identities = 17/90 (18%), Positives = 28/90 (31%), Gaps = 2/90 (2%)

Query: 268 EASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAA 327
           E  G+  ES       D+Q       E      E     A   S++    +  GE  +  
Sbjct: 201 EEMGDDTESEDEEDGDDDQPTENEQEEQGEGEGEGQEGSAPQESEATDRESESGE-EEMV 259

Query: 328 SATDSQSSEAAPGDKVTDGK-ETPSTSDTD 356
            +      + +  D  T G+   P+   T 
Sbjct: 260 QSDQDDLPDESDDDSETPGEGARPARPFTS 289



 Score = 29.1 bits (65), Expect = 4.3
 Identities = 18/70 (25%), Positives = 22/70 (31%), Gaps = 1/70 (1%)

Query: 234 DQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSN 293
           D    T  E          G E S   E EAT  E S   +E    +   D   E+   +
Sbjct: 216 DDDQPTENEQEEQGEGEGEGQEGSAPQESEATDRE-SESGEEEMVQSDQDDLPDESDDDS 274

Query: 294 EASGESQESD 303
           E  GE     
Sbjct: 275 ETPGEGARPA 284



 Score = 28.8 bits (64), Expect = 6.0
 Identities = 17/96 (17%), Positives = 33/96 (34%), Gaps = 13/96 (13%)

Query: 265 TSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVS 324
            S E + E  +  ++    D   +  + NE   + +     +  S+ Q            
Sbjct: 195 RSMELAEEMGDDTESEDEEDGDDDQPTENEQEEQGEGEGEGQEGSAPQ-----------E 243

Query: 325 KAASATDSQSSEAAPGDKVTDG--KETPSTSDTDGD 358
             A+  +S+S E        D    E+   S+T G+
Sbjct: 244 SEATDRESESGEEEMVQSDQDDLPDESDDDSETPGE 279



 Score = 28.4 bits (63), Expect = 6.4
 Identities = 17/76 (22%), Positives = 31/76 (40%), Gaps = 5/76 (6%)

Query: 255 ETSGGSEGEATSNEAS---GESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSS 311
           +   G + + T NE         E  + ++  ++  EAT     SGE +   S +     
Sbjct: 211 DEEDGDDDQPTENEQEEQGEGEGEGQEGSAPQES--EATDRESESGEEEMVQSDQDDLPD 268

Query: 312 QSPTESAPGGEVSKAA 327
           +S  +S   GE ++ A
Sbjct: 269 ESDDDSETPGEGARPA 284


>gnl|CDD|237537 PRK13875, PRK13875, conjugal transfer protein TrbL; Provisional.
          Length = 440

 Score = 32.6 bits (75), Expect = 0.31
 Identities = 27/105 (25%), Positives = 40/105 (38%), Gaps = 5/105 (4%)

Query: 238 GTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSD---NQGEATSSNE 294
           G G      A+ A  G   + G    A S  A+G S  +  AA           A +S  
Sbjct: 297 GGGAAAAGGAAAAARGGAAAAGGASSAYSAGAAGGSGAAGVAAGLGGVARAGASAAASPL 356

Query: 295 ASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAP 339
               S+ ++S K  SS ++   S  GG    AA+A    ++   P
Sbjct: 357 RRAASRAAESMK--SSFRAGARSTGGGAGGAAAAAAAGAAAAGPP 399



 Score = 29.1 bits (66), Expect = 3.5
 Identities = 24/103 (23%), Positives = 44/103 (42%), Gaps = 10/103 (9%)

Query: 237 SGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEAS 296
           +     GG++A+   S + ++G + G   +  A+G     A+A         A +S    
Sbjct: 306 AAAAARGGAAAAGGASSAYSAGAAGGSGAAGVAAGLGG-VARAG------ASAAASPLRR 358

Query: 297 GESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAP 339
             S+ ++S K   SS      + GG    AA+A  + ++ A P
Sbjct: 359 AASRAAESMK---SSFRAGARSTGGGAGGAAAAAAAGAAAAGP 398


>gnl|CDD|224363 COG1446, COG1446, Asparaginase [Amino acid transport and
           metabolism].
          Length = 307

 Score = 31.9 bits (73), Expect = 0.43
 Identities = 13/57 (22%), Positives = 21/57 (36%), Gaps = 4/57 (7%)

Query: 75  ENIFQNILAKEILARFEESQMIAIVHRSSMLAEVNREVKVVFKRVGMTMLDRYGRAT 131
           E I +N LA +I AR      +      +    V   +K +    G+  +D  G   
Sbjct: 231 EVIIRNALAFDIAARVRYGLSLDA----ACERVVEEALKALGGDGGLIAVDAKGNVA 283


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 32.3 bits (73), Expect = 0.53
 Identities = 22/128 (17%), Positives = 39/128 (30%), Gaps = 8/128 (6%)

Query: 241  GEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQ 300
             E  ++  E +   + S  +E +   NE   E  E+ +    S   G  +      GE  
Sbjct: 4037 LEENNTLDEDIQQDDFSDLAEDDEKMNEDGFE--ENVQENEESTEDGVKSDEELEQGEVP 4094

Query: 301  ESDS------AKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSD 354
            E  +        A S+  S        +        +    +   G+   DG+      D
Sbjct: 4095 EDQAIDNHPKMDAKSTFASAEADEENTDKGIVGENEELGEEDGVRGNGTADGEFEQVQED 4154

Query: 355  TDGDKSSR 362
            T   K + 
Sbjct: 4155 TSTPKEAM 4162


>gnl|CDD|237019 PRK11907, PRK11907, bifunctional 2',3'-cyclic nucleotide
           2'-phosphodiesterase/3'-nucleotidase precursor protein;
           Reviewed.
          Length = 814

 Score = 31.7 bits (72), Expect = 0.71
 Identities = 16/81 (19%), Positives = 32/81 (39%), Gaps = 1/81 (1%)

Query: 247 ASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDN-QGEATSSNEASGESQESDSA 305
           A E ++ +  +     + T  E+    +        +     EA SS+E +  S  +  A
Sbjct: 28  AEEIVTTTPATSTEAEQTTPVESDATEEADNTETPVAATTAAEAPSSSETAETSDPTSEA 87

Query: 306 KAASSSQSPTESAPGGEVSKA 326
              ++S++ T +    E SK 
Sbjct: 88  TDTTTSEARTVTPAATETSKP 108


>gnl|CDD|219927 pfam08601, PAP1, Transcription factor PAP1.  The transcription
           factor Pap1 regulates antioxidant-gene transcription in
           response to H2O2. This region is cysteine rich.
           Alkylation of cysteine residues following treatment with
           a cysteine alkylating agent can mask the accessibility
           of the nuclear exporter Crm1, triggering nuclear
           accumulation and Pap1 dependent transcriptional
           expression.
          Length = 344

 Score = 31.5 bits (71), Expect = 0.75
 Identities = 22/140 (15%), Positives = 52/140 (37%), Gaps = 14/140 (10%)

Query: 186 LLSKTETEMYATTNLATQQALLVQTISSVATSLTSQLNHHSTTLVSYLDQHSGTGGEGGS 245
           LL+ TE+ + +  N            S  + ++ ++ N+ +    +     S     G  
Sbjct: 60  LLNSTESNVSSPNNNPNGYT------SPSSAAMNNKSNNRAVDPSANASAASTNSPNGLQ 113

Query: 246 SASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGES-----Q 300
           S++   + ++ S      + S+ + G + +   +  +S      +    AS  +      
Sbjct: 114 SSATQYNSNDNSSSD---SPSSGSDGFTNQLLSSLGTSPEPSTESPPQLASVNNFAAIRN 170

Query: 301 ESDSAKAASSSQSPTESAPG 320
            ++S     S+ S T + PG
Sbjct: 171 NAESNSNVPSAASSTPNIPG 190


>gnl|CDD|237030 PRK12270, kgd, alpha-ketoglutarate decarboxylase; Reviewed.
          Length = 1228

 Score = 31.4 bits (72), Expect = 0.78
 Identities = 14/79 (17%), Positives = 29/79 (36%)

Query: 275 ESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQS 334
            S  A +++     A +S  A+  + ++ +A A +   +   +AP    + AA+A    +
Sbjct: 39  GSTAAPTAAAAAAAAAASAPAAAPAAKAPAAPAPAPPAAAAPAAPPKPAAAAAAAAAPAA 98

Query: 335 SEAAPGDKVTDGKETPSTS 353
             AA               
Sbjct: 99  PPAAAAAAAPAAAAVEDEV 117


>gnl|CDD|227052 COG4708, COG4708, Predicted membrane protein [Function unknown].
          Length = 169

 Score = 30.1 bits (68), Expect = 0.93
 Identities = 10/41 (24%), Positives = 20/41 (48%)

Query: 115 VFKRVGMTMLDRYGRATIEKALTNTKYISIMPLFKVSEAII 155
           +F  +G+ +  +Y +  +   + N  +I    LF +S  II
Sbjct: 84  IFLSLGVILFSKYSKDYLFNGIINKAFIFFSILFSISMFII 124


>gnl|CDD|237367 PRK13371, PRK13371, 4-hydroxy-3-methylbut-2-enyl diphosphate
           reductase; Provisional.
          Length = 387

 Score = 30.7 bits (70), Expect = 1.1
 Identities = 15/63 (23%), Positives = 31/63 (49%), Gaps = 14/63 (22%)

Query: 83  AKEILARFEES-----------QMIAIVHRSSMLAEVNREVKVVFKRVGMTMLDRYGRAT 131
            +E L RF ++           + + + ++++ML     E+  +F+R   TML +YG A 
Sbjct: 203 REEFLERFAKAYSPGFDPDRDLERVGVANQTTMLKSETEEIGKLFER---TMLRKYGPAN 259

Query: 132 IEK 134
           + +
Sbjct: 260 LNE 262


>gnl|CDD|114645 pfam05934, MCLC, Mid-1-related chloride channel (MCLC).  This
           family consists of several mid-1-related chloride
           channels. mid-1-related chloride channel (MCLC) proteins
           function as a chloride channel when incorporated in the
           planar lipid bilayer.
          Length = 577

 Score = 30.9 bits (69), Expect = 1.1
 Identities = 20/65 (30%), Positives = 26/65 (40%)

Query: 257 SGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTE 316
           +GG +GE T  E S E  + AK    S N     +    + E  +  S  A S  Q  T 
Sbjct: 498 TGGIKGEGTGEELSQEDHQIAKPIKESGNDERGNTEGPEAAEKAQLKSEAAGSPDQGSTY 557

Query: 317 SAPGG 321
           S   G
Sbjct: 558 SPARG 562


>gnl|CDD|237011 PRK11892, PRK11892, pyruvate dehydrogenase subunit beta;
           Provisional.
          Length = 464

 Score = 31.0 bits (71), Expect = 1.1
 Identities = 12/54 (22%), Positives = 24/54 (44%)

Query: 297 GESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETP 350
           GES     A  A+++++   +      + A  A  + ++ AAP  +V    + P
Sbjct: 81  GESASDAGAAPAAAAEAAAAAPAAAAAAAAKKAAPAPAAPAAPAAEVAADPDIP 134


>gnl|CDD|185641 PTZ00462, PTZ00462, Serine-repeat antigen protein; Provisional.
          Length = 1004

 Score = 30.8 bits (69), Expect = 1.2
 Identities = 15/75 (20%), Positives = 27/75 (36%), Gaps = 2/75 (2%)

Query: 234 DQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSN 293
           D      G    +    +  S   G    E++     G+ ++  K     D    + S N
Sbjct: 44  DNAGNIDGSPIGNLDANIHAS--FGADPKESSGANLPGKKEKKKKEIRGHDIMSNSDSQN 101

Query: 294 EASGESQESDSAKAA 308
            +S E Q++   K+A
Sbjct: 102 SSSIEKQDNIQIKSA 116


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 30.7 bits (70), Expect = 1.3
 Identities = 25/179 (13%), Positives = 50/179 (27%), Gaps = 17/179 (9%)

Query: 174 QLTLMATIVDNRLLSKTETEMYATTNLATQQALLV---------QTISSVAT---SLTSQ 221
           +LT  A +V++ L     TEM   T+      LL               +      L  +
Sbjct: 330 ELTRAADVVNDGL-----TEMRGATSPRLLLELLCARMLLPSASDDERGLLARLERLERR 384

Query: 222 LNHHSTTLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAAS 281
           L              S       ++ + A +    +      A    A   +   A  + 
Sbjct: 385 LGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPSP 444

Query: 282 SSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPG 340
           + +       S   +       +   A++ +     AP    + A +A  +  +  A  
Sbjct: 445 AGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAP 503



 Score = 29.6 bits (67), Expect = 3.6
 Identities = 21/131 (16%), Positives = 42/131 (32%), Gaps = 9/131 (6%)

Query: 233 LDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSS 292
           L++  G  G  G+ A+ A S +     +   A    A+     +A  A ++  Q     +
Sbjct: 381 LERRLGVAGGAGAPAAAAPSAA----AAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPA 436

Query: 293 NEASGESQESDSAKAASSS-----QSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGK 347
              +  S   ++    + S         + AP    +   +A  + +  AAP        
Sbjct: 437 PAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAA 496

Query: 348 ETPSTSDTDGD 358
                +    D
Sbjct: 497 PAAPAAPAGAD 507


>gnl|CDD|240430 PTZ00473, PTZ00473, Plasmodium Vir superfamily; Provisional.
          Length = 420

 Score = 30.6 bits (69), Expect = 1.5
 Identities = 24/109 (22%), Positives = 33/109 (30%), Gaps = 16/109 (14%)

Query: 252 SGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSS 311
             +   GG     +    S ES       SS+   G +       G SQ +DS     S 
Sbjct: 317 PYNANYGGQFNSRSGRTGSSESIRGFTYDSSTTYGGSSY------GTSQ-TDSTSTYGSR 369

Query: 312 QSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDKS 360
            +   S  GG  S   S     S         T    +  +SD+ G   
Sbjct: 370 STFDSSTGGGSQSGGGSTYGGSS---------TFDGSSRGSSDSFGVSY 409


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 30.7 bits (69), Expect = 1.6
 Identities = 20/113 (17%), Positives = 35/113 (30%), Gaps = 1/113 (0%)

Query: 241 GEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQ 300
               S  S +++  E+      E    E   E +E  ++        E     EA   S+
Sbjct: 424 ASSTSGESPSMASQESEEEESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADNGSE 483

Query: 301 ESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTS 353
           E +   ++       E     E   +  A  S+ SE       +   E+P   
Sbjct: 484 E-EMEGSSEGDGDGEEPEEDAERRNSEMAGISRMSEGQQPRGSSVQPESPQEE 535


>gnl|CDD|189000 cd08662, M13, Peptidase family M13 includes neprilysin,
           endothelin-converting enzyme I.  M13 family of
           metallopeptidases includes neprilysin (neutral
           endopeptidase, NEP, enkephalinase, CD10, CALLA, EC
           3.4.24.11), endothelin-converting enzyme I (ECE-1, EC
           3.4.24.71), erythrocyte surface antigen KELL (ECE-3),
           phosphate-regulating gene on the X chromosome (PHEX),
           soluble secreted endopeptidase (SEP), and damage-induced
           neuronal endopeptidase (DINE)/X-converting enzyme (XCE).
           These proteins consist of a short N-terminal cytoplasmic
           domain, a single transmembrane helix, and a larger
           C-terminal extracellular domain containing the active
           site. Proteins in this family fulfill a broad range of
           physiological roles due to the greater variation in the
           S2' subsite allowing substrate specificity. NEP is
           expressed in a variety of tissues including kidney and
           brain, and is involved in many physiological and
           pathological processes, including blood pressure and
           inflammatory response. It degrades a wide array of
           substrates such as substance P, enkephalins,
           cholecystokinin, neurotensin and somatostatin.  It is an
           important enzyme in the regulation of amyloid-beta
           (Abeta) protein that forms amyloid plaques that are
           associated with Alzheimer's disease (AD). ECE-1 catalyzes
           the final rate-limiting step in the biosynthesis of
           endothelins via post-translational conversion of the
           biologically inactive big endothelins. Like NEP, it also
           hydrolyses bradykinin, substance P, neurotensin and
           Abeta.  Endothelin-1 overproduction has been implicated
           in various diseases, including stroke, asthma,
           hypertension, and cardiac and renal failure. Kell is a
           homolog of NEP and constitutes a major antigen on human
           erythrocytes; it preferentially cleaves big endothelin-3
           to produce bioactive endothelin-3, but is also known to
           cleave substance P and neurokinin A. PHEX forms a
           complex interaction with fibroblast growth factor 23
           (FGF23) and matrix extracellular phosphoglycoprotein,
           causing bone mineralization. A loss-of-function mutation
           in PHEX disrupts this interaction leading to
           hypophosphatemic rickets; X-linked hypophosphatemic
           (XLH) rickets is the most common form of metabolic
           rickets. ECEL1 is a brain metalloprotease that plays a
           critical role in the nervous regulation of the
           respiratory system, while DINE (damage induced neuronal
           endopeptidase) is abundantly expressed in the
           hypothalamus and its expression responds to nerve injury
           as well. Thus, the majority of these M13 proteases are prime
           therapeutic targets for selective inhibition.
          Length = 611

 Score = 30.3 bits (69), Expect = 1.6
 Identities = 40/194 (20%), Positives = 68/194 (35%), Gaps = 36/194 (18%)

Query: 4   VNLARQFSRTCVLERRINTRRPRPPCYEKALF---------------IEISKPKFPPNPE 48
               + F R+C+    I     +P      LF               + + +P       
Sbjct: 60  EQKIKDFYRSCMDTEAIEALGLKP--LLPLLFGLGVSPDLKNSSRNILYLDQPGLGLPDR 117

Query: 49  --YLLPSHEKCQKA---RARQWLMERKGLNEENIFQNILAKEILARFEESQMIAIVHRSS 103
             YL    +K + A      + L    G +EE+     LA+E+LA FE    +A +  S 
Sbjct: 118 DYYLDEKSKKIRAAYKAYLAKLL-VLAGEDEEDAEA--LAEEVLA-FETE--LAKISWSE 171

Query: 104 MLAEVNREVKVVFKRVGMTMLDRYGRATIEKALTNTKYISIMPLFKVSEAIIVSPEAKVD 163
              E  R+ +  +  + +  L +         +    Y+  + L    E +IV+    + 
Sbjct: 172 ---EERRDPEKTYNPMTLAELQKLA-----PGIDWKAYLEALGLPSEDEKVIVTQPDYLK 223

Query: 164 VLLKTLKKTPQLTL 177
            L K L  TP  TL
Sbjct: 224 KLNKLLASTPLRTL 237


>gnl|CDD|184923 PRK14959, PRK14959, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 624

 Score = 30.4 bits (68), Expect = 1.7
 Identities = 23/119 (19%), Positives = 34/119 (28%), Gaps = 11/119 (9%)

Query: 240 GGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGES 299
            G G S+ S    GS   G + G A +    G       A ++       T S+ A    
Sbjct: 374 SGGGASAPS----GSAAEGPASGGAATIPTPGTQGPQGTAPAAG-----MTPSSAAPATP 424

Query: 300 QESD--SAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTD 356
             S   S +       P     G     A    ++     AP    +     P+  D  
Sbjct: 425 APSAAPSPRVPWDDAPPAPPRSGIPPRPAPRMPEASPVPGAPDSVASASDAPPTLGDPS 483


>gnl|CDD|185274 PRK15376, PRK15376, pathogenicity island 1 effector protein SipA;
           Provisional.
          Length = 670

 Score = 30.4 bits (68), Expect = 1.8
 Identities = 30/160 (18%), Positives = 58/160 (36%), Gaps = 12/160 (7%)

Query: 203 QQALLVQTISSVATSLTSQLNHHSTTLVSYLDQHSGTGGEGGSSA-SEALSGS-ETSGGS 260
            ++      S+V+         HS + V      + T     + A    ++G  + +  +
Sbjct: 335 GESHHSTNSSNVS---------HSHSRVDSTTHQTETAHSASTGAIDHGIAGKIDVTAHA 385

Query: 261 EGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPG 320
             EA +N AS ES++     S     GE TS +E  G + +S   K   ++    +    
Sbjct: 386 TAEAVTN-ASSESKDGKVVTSEKGTTGETTSFDEVDGVTSKSIIGKPVQATVHGVDDNKQ 444

Query: 321 GEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDKS 360
              +         +S+ A  + V        T+   G+K+
Sbjct: 445 QSQTAEIVNVKPLASQLAGVENVKTDTLQSDTTVITGNKA 484


>gnl|CDD|220634 pfam10220, DUF2146, Uncharacterized conserved protein (DUF2146).
           This is a family of proteins conserved from plants to
           humans. In Dictyostelium it is annotated as Mss11p but
           this could not be confirmed. Mss11p is required for the
           activation of pseudo-hyphal and invasive growth by
           Ste12p in yeast.
          Length = 890

 Score = 30.2 bits (68), Expect = 1.8
 Identities = 18/83 (21%), Positives = 31/83 (37%)

Query: 238 GTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASG 297
            T  E   S    L GS TS     +   + AS     S ++ + +     + +  E + 
Sbjct: 573 ETDQEQPESLEPQLQGSSTSPSDASDLNFSTASSSEASSEESDNYARPTSRSGTDEEEAS 632

Query: 298 ESQESDSAKAASSSQSPTESAPG 320
           ++      +A +   S TE  PG
Sbjct: 633 KTAREKRPQALARQPSTTEYLPG 655



 Score = 29.8 bits (67), Expect = 2.6
 Identities = 19/93 (20%), Positives = 37/93 (39%)

Query: 247 ASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAK 306
           A++  + S ++     +A    A  E+ +    +     QG +TS ++AS  +  + S+ 
Sbjct: 548 AADFENNSLSAAKKMEQAEDELADEETDQEQPESLEPQLQGSSTSPSDASDLNFSTASSS 607

Query: 307 AASSSQSPTESAPGGEVSKAASATDSQSSEAAP 339
            ASS +S   + P              + E  P
Sbjct: 608 EASSEESDNYARPTSRSGTDEEEASKTAREKRP 640


>gnl|CDD|223023 PHA03249, PHA03249, DNA packaging tegument protein UL25;
           Provisional.
          Length = 653

 Score = 30.4 bits (68), Expect = 2.0
 Identities = 24/128 (18%), Positives = 36/128 (28%), Gaps = 8/128 (6%)

Query: 223 NHHSTTLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASS 282
           N  S  +VS  D  S    E G  A     G                   +  +   A++
Sbjct: 58  NKSSFEVVSETDSGSEAEAERGRRAG---MGGRNKATKPSRRNKTTQCRPTSLALATAAT 114

Query: 283 SDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPG-D 341
                 +  S + S       S  + S      E   GG+ S       +QS   A    
Sbjct: 115 MPATPSSGKSPKVSSPP----SIPSLSEEDEGAERNSGGDDSSHTDNESTQSQPEADDEP 170

Query: 342 KVTDGKET 349
            + +G E 
Sbjct: 171 DLAEGHEF 178


>gnl|CDD|177447 PHA02664, PHA02664, hypothetical protein; Provisional.
          Length = 534

 Score = 30.0 bits (67), Expect = 2.0
 Identities = 18/100 (18%), Positives = 36/100 (36%)

Query: 262 GEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGG 321
           G A +  A+ E   +    S      E  ++  A+  +  +D    A +     +     
Sbjct: 385 GAAAAMIAAAERAANGARGSPMAAPEEGRAAAAAAAANAPADQDVEAEAHDEFDQDPGAP 444

Query: 322 EVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDKSS 361
             +  A + +    E   GD+  DG++   +S +    SS
Sbjct: 445 AHADRADSDEDDMDEQESGDERADGEDDSDSSYSYSTTSS 484


>gnl|CDD|221526 pfam12316, Dsh_C, Segment polarity protein dishevelled (Dsh) C
           terminal.  This domain family is found in eukaryotes,
           and is typically between 177 and 207 amino acids in
           length. The family is found in association with
           pfam00778, pfam02377, pfam00610, pfam00595. The segment
           polarity gene dishevelled (dsh) is required for pattern
           formation of the embryonic segments. It is involved in
           the determination of body organisation through the
           Wingless pathway (analogous to the Wnt-1 pathway).
          Length = 202

 Score = 29.5 bits (66), Expect = 2.1
 Identities = 26/105 (24%), Positives = 40/105 (38%), Gaps = 10/105 (9%)

Query: 236 HSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEA 295
           +S   G  GS  SE   GS +SG +    +  E S  +        S  +  E+  ++ +
Sbjct: 59  YSYGSGSAGSQHSE---GSRSSGSNR---SDGERSRAADGREGGRKSGGSGSESEHTSRS 112

Query: 296 SGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPG 340
                    A +  S   P+E +    V  + S   S SS  APG
Sbjct: 113 GSRRSGGRRAPSERSGPPPSEGS----VRSSLSHPSSHSSYGAPG 153


>gnl|CDD|183558 PRK12495, PRK12495, hypothetical protein; Provisional.
          Length = 226

 Score = 29.5 bits (66), Expect = 2.1
 Identities = 21/111 (18%), Positives = 42/111 (37%), Gaps = 11/111 (9%)

Query: 259 GSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESA 318
            +E  A  ++A   ++ +A + + S    +  +   A  E+ +  +   ASS+ +  E+A
Sbjct: 68  VTEDGAAGDDAGDGAEATAPSDAGSQASPDDDAQPAAEAEAADQSAPPEASSTSATDEAA 127

Query: 319 --PGGEVSKAASATDSQSSEAA---------PGDKVTDGKETPSTSDTDGD 358
             P    +     T   +++ A             V+    TPST D    
Sbjct: 128 TDPPATAAARDGPTPDPTAQPATPDERRSPRQRPPVSGEPPTPSTPDAHVA 178



 Score = 29.1 bits (65), Expect = 3.3
 Identities = 16/82 (19%), Positives = 29/82 (35%), Gaps = 3/82 (3%)

Query: 281 SSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPG 340
            + D       S+  S  S + D+  AA +  +   + P    + + SATD  +++    
Sbjct: 77  DAGDGAEATAPSDAGSQASPDDDAQPAAEAEAADQSAPPE---ASSTSATDEAATDPPAT 133

Query: 341 DKVTDGKETPSTSDTDGDKSSR 362
               DG     T+        R
Sbjct: 134 AAARDGPTPDPTAQPATPDERR 155


>gnl|CDD|237057 PRK12323, PRK12323, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 700

 Score = 29.8 bits (67), Expect = 2.3
 Identities = 23/118 (19%), Positives = 43/118 (36%), Gaps = 3/118 (2%)

Query: 245 SSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDS 304
           S A EAL+ +  +           A   +   A AA              A+  +  + +
Sbjct: 427 SPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAAR---PAAAGPRPVAAAAAAAPARA 483

Query: 305 AKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDKSSR 362
           A AA+ + +  +  P  E+    ++      +AAP   V +    P+T+D D    + 
Sbjct: 484 APAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETL 541


>gnl|CDD|130057 TIGR00984, 3a0801s03tim44, mitochondrial import inner membrane,
           translocase subunit.  The mitochondrial protein
           translocase (MPT) family, which brings nuclearly encoded
           preproteins into mitochondria, is very complex with 19
           currently identified protein constituents. These proteins
           include several chaperone proteins, four proteins of the
           outer membrane translocase (Tom) import receptor, five
           proteins of the Tom channel complex, five proteins of
           the inner membrane translocase (Tim) and three "motor"
           proteins. This family is specific for the Tim proteins
           [Transport and binding proteins, Amino acids, peptides
           and amines].
          Length = 378

 Score = 29.8 bits (67), Expect = 2.4
 Identities = 19/81 (23%), Positives = 31/81 (38%), Gaps = 4/81 (4%)

Query: 286 QGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAA--PGDKV 343
           +     S+E  G++           +    ES  G ++ KA + T   ++E      + V
Sbjct: 44  ESGTLKSSEVVGKTLGKLGDTMKKMAHKAWESELGKKMKKAGAETAKTAAEHVDKSAEPV 103

Query: 344 TDGK--ETPSTSDTDGDKSSR 362
            D    +  S S  DG  SSR
Sbjct: 104 RDTAVYKHVSQSMKDGKDSSR 124


>gnl|CDD|233367 TIGR01349, PDHac_trf_mito, pyruvate dehydrogenase complex
           dihydrolipoamide acetyltransferase, long form.  This
           model represents one of several closely related clades
           of the dihydrolipoamide acetyltransferase subunit of the
           pyruvate dehydrogenase complex. It includes sequences
           from mitochondria and from alpha and beta branches of
           the proteobacteria, as well as from some other bacteria.
           Sequences from Gram-positive bacteria are not included.
           The non-enzymatic homolog protein X, which serves as an
           E3 component binding protein, falls within the clade
           phylogenetically but is rejected by its low score
           [Energy metabolism, Pyruvate dehydrogenase].
          Length = 436

 Score = 29.8 bits (67), Expect = 2.4
 Identities = 9/50 (18%), Positives = 21/50 (42%)

Query: 294 EASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKV 343
           E+S       S  A ++  S  + +P  +      ++ +  S+   GD++
Sbjct: 91  ESSASPAPKPSEIAPTAPPSAPKPSPAPQKQSPEPSSPAPLSDKESGDRI 140


>gnl|CDD|224346 COG1429, CobN, Cobalamin biosynthesis protein CobN and related
            Mg-chelatases [Coenzyme metabolism].
          Length = 1388

 Score = 30.1 bits (68), Expect = 2.6
 Identities = 15/81 (18%), Positives = 28/81 (34%), Gaps = 1/81 (1%)

Query: 239  TGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGE 298
            T     + AS      E+ G +     S+ +S     S  A S +D+ G +  +  +   
Sbjct: 1287 TRYAAFAPASATPGAPESVGTTAVSTASSASSATVTGSD-AGSGADSTGPSLGAAGSVTG 1345

Query: 299  SQESDSAKAASSSQSPTESAP 319
            + E       + S S +    
Sbjct: 1346 AGEGYEMTKEAVSGSESTGMS 1366



 Score = 28.9 bits (65), Expect = 5.0
 Identities = 27/136 (19%), Positives = 46/136 (33%), Gaps = 10/136 (7%)

Query: 228  TLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQG 287
            TL    + ++    E G +      G+           S     E  E+A  A++     
Sbjct: 1233 TLRLLAEVYAELVAENGVACCHHTCGNPALDEWVLGLVSVPGVPELVEAATYAATRYAAF 1292

Query: 288  EATSSNEASGES--QESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTD 345
               S+   + ES    + S  +++SS + T S         A +T      A       +
Sbjct: 1293 APASATPGAPESVGTTAVSTASSASSATVTGSD----AGSGADSTGPSLGAAGSVTGAGE 1348

Query: 346  G----KETPSTSDTDG 357
            G    KE  S S++ G
Sbjct: 1349 GYEMTKEAVSGSESTG 1364


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 29.4 bits (66), Expect = 2.8
 Identities = 14/70 (20%), Positives = 28/70 (40%), Gaps = 1/70 (1%)

Query: 268 EASGESQESAKAASSSD-NQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKA 326
           E  G+  ESA +  + D +  +    ++   E +   S   +  S + +E    GE+  A
Sbjct: 210 EELGDEPESADSEDNEDEDDPKEDEDDDQGEEEESGSSDSLSEDSDASSEEMESGEMEAA 269

Query: 327 ASATDSQSSE 336
            ++ D     
Sbjct: 270 EASADDTPDS 279


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 29.7 bits (66), Expect = 3.0
 Identities = 23/93 (24%), Positives = 39/93 (41%), Gaps = 7/93 (7%)

Query: 252 SGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAAS-- 309
           S  E  G  EGE++ + +  E   S       DN+  ++S +  S +  ESDS  +A   
Sbjct: 114 SEGEGEGEGEGESSDSRSVNEEGSSDPKDIDQDNR--SSSPSIPSPQDNESDSDSSAQQQ 171

Query: 310 ---SSQSPTESAPGGEVSKAASATDSQSSEAAP 339
                  P+   P G     ++   + S++A P
Sbjct: 172 LLQPQGPPSIQVPPGAALAPSAPPPTPSAQAVP 204


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 29.4 bits (66), Expect = 3.0
 Identities = 32/136 (23%), Positives = 49/136 (36%), Gaps = 10/136 (7%)

Query: 234 DQHSGTGGEGGSSASEALSGSETS---GGSEGEATSNEASGESQESAKAASSSDNQGEAT 290
            + S    +   S SE  S SE S      + E   +++   SQ       SSD +    
Sbjct: 163 AKESSDKDDEEESESEDESKSEESAEDDSDDEEEEDSDSEDYSQYDGMLVDSSDEEEGEE 222

Query: 291 SSN------EASGESQESDSAKAASSSQSPT-ESAPGGEVSKAASATDSQSSEAAPGDKV 343
           + +       +  ES ESDS  + S S S + ES+P  +  K    + +       G   
Sbjct: 223 APSINYNEDTSESESDESDSEISESRSVSDSEESSPPSKKPKEKKTSSTFLPSLMGGYFS 282

Query: 344 TDGKETPSTSDTDGDK 359
               E     D D D+
Sbjct: 283 GSEDEDDDDEDIDPDQ 298


>gnl|CDD|233034 TIGR00584, mug, mismatch-specific thymine-DNA glycosylase (mug).
           All proteins in this family for which functions are
           known are G-T or G-U mismatch glycosylases that function
           in base excision repair. This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University). Used 2pf model [DNA metabolism,
           DNA replication, recombination, and repair].
          Length = 328

 Score = 29.3 bits (65), Expect = 3.4
 Identities = 22/90 (24%), Positives = 32/90 (35%), Gaps = 13/90 (14%)

Query: 275 ESAKAASSSDNQGEATSSN------------EASGESQESDSAKAASSSQSPTESAPGGE 322
           E+    + +DN  EA  +                 E +E+    A + +Q P+E AP   
Sbjct: 1   EAEDTGTKNDNSSEANLTVKKQQRSADAPNMALVEEQEETSGVPAKAPTQEPSEEAPKFR 60

Query: 323 VSKAASATDSQSSEAA-PGDKVTDGKETPS 351
             K  S    +  E   P D    GK T S
Sbjct: 61  KRKPRSNEPYRPVEPKKPSDSKKSGKSTKS 90


>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen.  This
           family consists of several Theileria P67 surface
           antigens. A stage specific surface antigen of Theileria
           parva, p67, is the basis for the development of an
           anti-sporozoite vaccine for the control of East Coast
           fever (ECF) in cattle. The antigen has been shown to
           contain five distinct linear peptide sequences
           recognised by sporozoite-neutralising murine monoclonal
           antibodies.
          Length = 727

 Score = 29.6 bits (66), Expect = 3.4
 Identities = 20/105 (19%), Positives = 34/105 (32%), Gaps = 3/105 (2%)

Query: 240 GGEGGSSASEALSGSETSGGSEGEATSNEASGESQESA--KAASSSDNQGEATSSNEASG 297
            G GG      +    ++   EG+   ++                    G++TSS   + 
Sbjct: 222 VGVGGLGGVPGVGILASNTSREGQTQDDQERDGDGRVIEPGVGLPGVRVGDSTSSPSTTR 281

Query: 298 ESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSE-AAPGD 341
            S  + +   ASS  S          +    +TDS S    +PG 
Sbjct: 282 PSGSTTTTTPASSGPSAPGGPGSSSRNAVTRSTDSISGPIPSPGA 326


>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1).  This family
           consists of several mammalian dentin matrix protein 1
           (DMP1) sequences. The dentin matrix acidic
           phosphoprotein 1 (DMP1) gene has been mapped to human
           chromosome 4q21. DMP1 is a bone and teeth specific
           protein initially identified from mineralised dentin.
           DMP1 is primarily localised in the nuclear compartment
           of undifferentiated osteoblasts. In the nucleus, DMP1
           acts as a transcriptional component for activation of
           osteoblast-specific genes like osteocalcin. During the
           early phase of osteoblast maturation, Ca(2+) surges into
           the nucleus from the cytoplasm, triggering the
           phosphorylation of DMP1 by a nuclear isoform of casein
           kinase II. This phosphorylated DMP1 is then exported out
           into the extracellular matrix, where it regulates
           nucleation of hydroxyapatite. DMP1 is a unique molecule
           that initiates osteoblast differentiation by
           transcription in the nucleus and orchestrates
           mineralised matrix formation extracellularly, at later
           stages of osteoblast maturation. The DMP1 gene has been
           found to be ectopically expressed in lung cancer
           although the reason for this is unknown.
          Length = 514

 Score = 29.2 bits (65), Expect = 3.7
 Identities = 47/179 (26%), Positives = 80/179 (44%), Gaps = 15/179 (8%)

Query: 188 SKTETEMYATTNLATQQALLVQTISSVAT---SLTSQLNHHSTTLVSYLDQHSGTGGEGG 244
           S++E++  +  N + + +  VQ  SS ++    L SQ N  S +    + +  G   +  
Sbjct: 308 SRSESQEDSEENQSQEDSQEVQDPSSESSQEADLPSQENS-SESQEEVVSESRGDNPDNT 366

Query: 245 SSASEALSGSETSGG------SEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGE 298
           +S SE    SE+S        S  E+ S E   +S+ +   +SS ++       N +S E
Sbjct: 367 TSHSEDQEDSESSEEDSLDTPSSSESQSTEEQADSESNESLSSSEESPESTEDENSSSQE 426

Query: 299 SQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDG 357
             +S SA   S SQ   ES    +       +DSQ S  +  D  ++  E+ S+S+ DG
Sbjct: 427 GLQSHSASTESRSQ---ESQSEQDSRSEEDDSDSQDSSRSKED--SNSTESASSSEEDG 480


>gnl|CDD|223479 COG0402, SsnA, Cytosine deaminase and related metal-dependent
           hydrolases [Nucleotide transport and metabolism /
           General function prediction only].
          Length = 421

 Score = 29.3 bits (66), Expect = 3.7
 Identities = 10/49 (20%), Positives = 19/49 (38%), Gaps = 4/49 (8%)

Query: 82  LAKEILARFEESQMIAIVHRSSMLAEVNREVKVVFKRVGMTMLDRYGRA 130
           L + +     +  +   +H    LAE   EV+ V +  G   ++R    
Sbjct: 199 LLESLDELARKYGLPVHIH----LAETLDEVERVLEPYGARPVERLDLL 243


>gnl|CDD|226920 COG4547, CobT, Cobalamin biosynthesis protein CobT
           (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole
           phosphoribosyltransferase) [Coenzyme metabolism].
          Length = 620

 Score = 29.0 bits (65), Expect = 4.7
 Identities = 19/94 (20%), Positives = 37/94 (39%), Gaps = 3/94 (3%)

Query: 265 TSNEASGES-QESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEV 323
            + E   +  +E A      D+Q +    +EA  E  E         +++       GE+
Sbjct: 218 MAEETGDDGIEEDADEEDGDDDQPDNNEDSEAGREESEGSDESEEDEAEATDGEGEEGEM 277

Query: 324 SKAASATDSQSSEAAPG-DKVTDGKETPSTSDTD 356
             A ++ DS+S E+    +   +    P+T  T+
Sbjct: 278 DAAEASEDSESDESDEDTETPGEDAR-PATPFTE 310


>gnl|CDD|184543 PRK14157, PRK14157, heat shock protein GrpE; Provisional.
          Length = 227

 Score = 28.4 bits (63), Expect = 4.8
 Identities = 23/77 (29%), Positives = 38/77 (49%), Gaps = 2/77 (2%)

Query: 242 EGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQE 301
           E    A++A SG++ S  S GE   + A  ++   A AA ++   GEA ++ E +GE+Q 
Sbjct: 20  EAQGQAAQASSGADASAES-GEQQDSAAQADANAGADAAPAAAE-GEAKAAAEKTGEAQS 77

Query: 302 SDSAKAASSSQSPTESA 318
                     Q+  E+A
Sbjct: 78  DSDDTLTPLGQAKKEAA 94


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
           family consists of several hypothetical bacterial
           proteins of around 200 residues in length. The function
           of this family is unknown.
          Length = 214

 Score = 28.5 bits (64), Expect = 5.1
 Identities = 14/77 (18%), Positives = 28/77 (36%), Gaps = 3/77 (3%)

Query: 254 SETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQS 313
            E     + E    E   E +E  K A++S+++ +   + +   ES+E +  +   SS  
Sbjct: 49  QEAKKSDDQETAEIE---EVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEEDEESSDE 105

Query: 314 PTESAPGGEVSKAASAT 330
             +       S      
Sbjct: 106 NEKETEEKTESNVEKEI 122


>gnl|CDD|235906 PRK07003, PRK07003, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 830

 Score = 29.0 bits (65), Expect = 5.3
 Identities = 14/84 (16%), Positives = 31/84 (36%)

Query: 256 TSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPT 315
           T GG+ G       +G        A+++       +    +G +  + + KAA+++ +  
Sbjct: 363 TGGGAPGGGVPARVAGAVPAPGARAAAAVGASAVPAVTAVTGAAGAALAPKAAAAAAATR 422

Query: 316 ESAPGGEVSKAASATDSQSSEAAP 339
             AP    +  A+A     +    
Sbjct: 423 AEAPPAAPAPPATADRGDDAADGD 446


>gnl|CDD|225711 COG3170, FimV, Tfp pilus assembly protein FimV [Cell motility and
           secretion / Intracellular trafficking and secretion].
          Length = 755

 Score = 28.7 bits (64), Expect = 5.4
 Identities = 21/70 (30%), Positives = 35/70 (50%), Gaps = 2/70 (2%)

Query: 272 ESQESAKAASSSDNQGE-ATSSNE-ASGESQESDSAKAASSSQSPTESAPGGEVSKAASA 329
           ++   A A ++ D   E AT+ +  A   S ES  A+  S   +    AP GE+++A SA
Sbjct: 321 KAAPLAAAQAALDAPAETATAPSAPAPQVSAESSPAQPGSYLLAAPGDAPLGELAQAQSA 380

Query: 330 TDSQSSEAAP 339
            +  + E+ P
Sbjct: 381 RERLAEESVP 390


>gnl|CDD|236571 PRK09565, PRK09565, hypothetical protein; Reviewed.
          Length = 533

 Score = 28.6 bits (64), Expect = 6.1
 Identities = 18/102 (17%), Positives = 31/102 (30%), Gaps = 7/102 (6%)

Query: 209 QTISSVATSLTSQLNHHSTTLVSYLDQHSGTGGEGGSSASEALSGSETSGGSEGEATSNE 268
           + + +      +      T       +H   GG  G    E   G E SGG  G    N 
Sbjct: 259 ERVPTATPEDAADGGTGGTHDAEEFGEHGHHGGHPGGEDGEHPHGHEDSGGHHGSGGDNF 318

Query: 269 ASGESQ-------ESAKAASSSDNQGEATSSNEASGESQESD 303
              +         +  +  S  + +GE   +   +G+    D
Sbjct: 319 DHYDRHVATTVRADGGREDSDEEIRGELADAGVYAGQPHGED 360


>gnl|CDD|115751 pfam07117, DUF1373, Protein of unknown function (DUF1373).  This
           family consists of several hypothetical proteins which
           seem to be specific to Oryzias latipes (Japanese
           ricefish). Members of this family are typically around
           200 residues in length. The function of this family is
           unknown.
          Length = 210

 Score = 28.2 bits (62), Expect = 6.2
 Identities = 20/54 (37%), Positives = 28/54 (51%), Gaps = 1/54 (1%)

Query: 235 QHSGTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGE 288
           Q SG G  G   +S + SGS+ S GS+G  +    S    E    +SSSD++ E
Sbjct: 95  QSSGYGSYGSVDSSYSGSGSQQS-GSQGAQSGAPGSQHQVEQESWSSSSDDEDE 147


>gnl|CDD|234371 TIGR03839, termin_org_P1, adhesin P1.  Members of this protein
           family are the major adhesin of the Mycoplasma terminal
           organelle. The protein is called adhesin P1, cytadhesin
           P1, P140, attachment protein, and MgPa, with locus names
           MG191 in Mycoplasma genitalium and MPN141 in M.
           pneumoniae. A conserved C-terminal region is shared by
           additional paralogs in M. pneumoniae and M.
           gallisepticum, as well as by the member of this family
           [Cell envelope, Surface structures, Cellular processes,
           Pathogenesis].
          Length = 1425

 Score = 28.6 bits (63), Expect = 6.5
 Identities = 18/87 (20%), Positives = 29/87 (33%), Gaps = 11/87 (12%)

Query: 245 SSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSS--------NEAS 296
            +AS A SGS +S    G      A    +E  + +SS        +         N  S
Sbjct: 238 GTASSAGSGSSSSAAGGGAVAPTAAKALKREVEEGSSSGMGTMLPKNDTAETPIKYNSDS 297

Query: 297 GES---QESDSAKAASSSQSPTESAPG 320
           G+    +    +  +S S +     P 
Sbjct: 298 GKIVKLKALLDSTESSESINGGRWRPW 324


>gnl|CDD|221581 pfam12446, DUF3682, Protein of unknown function (DUF3682).  This
           domain family is found in eukaryotes, and is typically
           between 125 and 136 amino acids in length.
          Length = 133

 Score = 27.5 bits (61), Expect = 6.6
 Identities = 29/89 (32%), Positives = 35/89 (39%), Gaps = 12/89 (13%)

Query: 238 GTGGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEAS- 296
           G G  G SS S A +     G       +  A G       +A SS   GEA SS   S 
Sbjct: 4   GDGTGGVSSGSSAPAPPAGPGPGPNAPPAPAAPG----VDSSAGSSG--GEAGSSGSNSS 57

Query: 297 ---GESQESD--SAKAASSSQSPTESAPG 320
              G+S   D   A AA+ + SP E   G
Sbjct: 58  NTTGDSSTGDQSPAAAAAHNSSPPEGPAG 86


>gnl|CDD|165587 PHA03343, PHA03343, US22 family homolog; Provisional.
          Length = 578

 Score = 28.5 bits (63), Expect = 6.9
 Identities = 14/76 (18%), Positives = 26/76 (34%)

Query: 240 GGEGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGES 299
                    E  +G +    +   + S     E+ E A AA+++     A   ++  GE 
Sbjct: 114 AAAAEKGEEEEEAGDKIEPPAVRGSVSPIRGHETVEGAAAAATAATSAAACDGDDGGGED 173

Query: 300 QESDSAKAASSSQSPT 315
              D  K    + +P 
Sbjct: 174 GGEDGGKGIGGACAPP 189


>gnl|CDD|221427 pfam12114, Period_C, Period protein 2/3C-terminal region.  This
           domain is found in eukaryotes. This domain is typically
           between 164 and 200 amino acids in length. This domain is
           found associated with pfam08447.
          Length = 190

 Score = 27.8 bits (62), Expect = 7.2
 Identities = 17/64 (26%), Positives = 28/64 (43%)

Query: 242 EGGSSASEALSGSETSGGSEGEATSNEASGESQESAKAASSSDNQGEATSSNEASGESQE 301
           +  S    A SGS +SG     + SN +S  S  S+K   S D   E +   + + ++ E
Sbjct: 19  DSQSGTGSAASGSGSSGSGSLGSGSNGSSSGSSNSSKYFGSIDMSSENSHKAKKTQDTHE 78

Query: 302 SDSA 305
            +  
Sbjct: 79  EEQF 82


>gnl|CDD|185037 PRK15079, PRK15079, oligopeptide ABC transporter ATP-binding
           protein OppF; Provisional.
          Length = 331

 Score = 28.1 bits (63), Expect = 7.7
 Identities = 37/137 (27%), Positives = 61/137 (44%), Gaps = 24/137 (17%)

Query: 59  KARARQWLMERKGLNEENIFQNILAKEILARFEESQMIA----IVHRSSMLAEVNREVKV 114
             +  +W   R  +  + IFQ+ LA  +  R    ++IA      H      EV   VK 
Sbjct: 87  GMKDDEWRAVRSDI--QMIFQDPLA-SLNPRMTIGEIIAEPLRTYHPKLSRQEVKDRVKA 143

Query: 115 VFKRVGM--TMLDRY---------GRATIEKALTNTKYISIM--PLFKVSEAIIVSPEAK 161
           +  +VG+   +++RY          R  I +AL     + I   P   VS A+ VS +A+
Sbjct: 144 MMLKVGLLPNLINRYPHEFSGGQCQRIGIARALILEPKLIICDEP---VS-ALDVSIQAQ 199

Query: 162 VDVLLKTLKKTPQLTLM 178
           V  LL+ L++   L+L+
Sbjct: 200 VVNLLQQLQREMGLSLI 216


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 28.3 bits (63), Expect = 8.4
 Identities = 18/64 (28%), Positives = 25/64 (39%), Gaps = 3/64 (4%)

Query: 277  AKAASSSDNQGEATSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSE 336
            A   S      E     EA G S E    K  +S   P     G  + +AA+  +++SSE
Sbjct: 1358 ATVQSGQKLLTEMLKPAEAIGISPEKKVRKMRAS---PFNKKSGSVLGRAATNKETESSE 1414

Query: 337  AAPG 340
               G
Sbjct: 1415 NVSG 1418


>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II).  Bone
           sialoprotein (BSP) is a major structural protein of the
           bone matrix that is specifically expressed by
           fully-differentiated osteoblasts. The expression of bone
           sialoprotein (BSP) is normally restricted to mineralised
           connective tissues of bones and teeth where it has been
           associated with mineral crystal formation. However, it
           has been found that ectopic expression of BSP occurs in
           various lesions, including oral and extraoral
           carcinomas, in which it has been associated with the
           formation of microcrystalline deposits and the
           metastasis of cancer cells to bone.
          Length = 291

 Score = 27.7 bits (61), Expect = 9.0
 Identities = 29/132 (21%), Positives = 52/132 (39%), Gaps = 12/132 (9%)

Query: 242 EGGSSASEALSGSETSGGSEGEATSNEASG--ESQESAKAASSSDNQ----------GEA 289
           + GS +SE     ++S     E TSNE     +S  +    + ++N           G+A
Sbjct: 45  QSGSDSSEENGDGDSSEEEGEEETSNEEENNEDSDGNEDEEAEAENTTLSTVTLGYGGDA 104

Query: 290 TSSNEASGESQESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKET 349
           T      G +      KA ++ +  T+     E  +     + + +E    ++ T+G  T
Sbjct: 105 TPGTGNIGLAALQLPKKAGNAGKKATKEDESDEDEEEEEEEEEEEAEVEENEQGTNGTST 164

Query: 350 PSTSDTDGDKSS 361
            ST    G+ SS
Sbjct: 165 NSTEVDHGNGSS 176


>gnl|CDD|220856 pfam10714, LEA_6, Late embryogenesis abundant protein 18.  This is
           a family of late embryogenesis-abundant proteins. There
           is high accumulation of this protein in dry seeds, and
           in the roots of full-grown plants in response to
           dehydration and ABA (abscisic acid application)
           treatments. This LEA protein disappears after
           germination. It accumulates in growing regions of well
           irrigated hypocotyls and meristems suggesting a role in
           seedling growth resumption on rehydration. As a group
           the LEA proteins are highly hydrophilic, contain a high
           percentage of glycine residues, lack Cys and Trp
           residues and do not coagulate upon exposure to high
           temperature, and for these reasons are considered to be
           members of a group of proteins called hydrophilins.
           Expression of the protein is negatively regulated during
           etiolating growth, particularly in roots, in contrast to
           its expression patterns during normal growth.
          Length = 77

 Score = 26.0 bits (57), Expect = 9.8
 Identities = 12/40 (30%), Positives = 20/40 (50%)

Query: 300 QESDSAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAP 339
           QE    +   ++ +PT S        AASATD+++ +  P
Sbjct: 38  QEPKPGRGGGATDAPTPSGAAVSSDAAASATDAKNRKGVP 77


>gnl|CDD|233045 TIGR00601, rad23, UV excision repair protein Rad23.  All proteins
           in this family for which functions are known are
           components of a multiprotein complex used for targeting
           nucleotide excision repair to specific parts of the
           genome. In humans, Rad23 complexes with the XPC protein.
           This family is based on the phylogenomic analysis of JA
           Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 378

 Score = 27.9 bits (62), Expect = 9.8
 Identities = 8/58 (13%), Positives = 16/58 (27%)

Query: 304 SAKAASSSQSPTESAPGGEVSKAASATDSQSSEAAPGDKVTDGKETPSTSDTDGDKSS 361
            A  A++  S     P    S A+  + + +S         +     +         S
Sbjct: 84  VAPPAATPTSAPTPTPSPPASPASGMSAAPASAVEEKSPSEESATATAPESPSTSVPS 141


>gnl|CDD|236126 PRK07899, rpsA, 30S ribosomal protein S1; Reviewed.
          Length = 486

 Score = 27.7 bits (62), Expect = 10.0
 Identities = 12/41 (29%), Positives = 21/41 (51%)

Query: 268 EASGESQESAKAASSSDNQGEATSSNEASGESQESDSAKAA 308
           +A+     +A A+S S +   ++SS   +G +  SD   AA
Sbjct: 437 KAAAAEAAAAAASSESSSSASSSSSEAEAGGTLASDEQLAA 477


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.306    0.120    0.317 

Gapped
Lambda     K      H
   0.267   0.0587    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 16,519,656
Number of extensions: 1516098
Number of successful extensions: 1979
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1749
Number of HSP's successfully gapped: 259
Length of query: 362
Length of database: 10,937,602
Length adjustment: 98
Effective length of query: 264
Effective length of database: 6,590,910
Effective search space: 1740000240
Effective search space used: 1740000240
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.6 bits)
S2: 60 (27.2 bits)