RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy10214
         (415 letters)



>gnl|CDD|192780 pfam11594, Med28, Mediator complex subunit 28.  Mediator is a large
           complex of up to 33 proteins that is conserved from
           plants to fungi to humans - the number and
           representation of individual subunits varying with
           species. It is arranged into four different sections, a
           core, a head, a tail and a kinase-activity part, and the
           number of subunits within each of these is what varies
           with species. Overall, Mediator regulates the
           transcriptional activity of RNA polymerase II but it
           would appear that each of the four different sections
           has a slightly different function. Subunit Med28 of the
           Mediator may function as a scaffolding protein within
           Mediator by maintaining the stability of a submodule
           within the head module, and components of this submodule
           act together in a gene-regulatory programme to suppress
           smooth muscle cell differentiation. Thus, mammalian
           Mediator subunit Med28 functions as a repressor of
           smooth muscle-cell differentiation, which could have
           implications for disorders associated with abnormalities
           in smooth muscle cell growth and differentiation,
           including atherosclerosis, asthma, hypertension, and
           smooth muscle tumours.
          Length = 106

 Score =  115 bits (289), Expect = 2e-31
 Identities = 41/99 (41%), Positives = 60/99 (60%), Gaps = 3/99 (3%)

Query: 175 EIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKR 234
           EI+  +DQ + KFLD+ARQ E FFLQKR  LS  KP+  +KE+   L+ ++ RK++L  +
Sbjct: 1   EIRNYVDQLSQKFLDIARQKETFFLQKRNELSVFKPKKTLKEEAQKLKEEMQRKDQLQTK 60

Query: 235 HYDKIAVWQNLLSDLQGWAKSPA---HQGSTSSASGTTP 270
           H  KI  W+NLL+D +   K      ++G    A  +TP
Sbjct: 61  HDSKIDYWENLLTDAEDVYKVRDEVPNEGRQRIAELSTP 99



 Score =  105 bits (263), Expect = 9e-28
 Identities = 36/82 (43%), Positives = 52/82 (63%), Gaps = 6/82 (7%)

Query: 74  EIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLARK 133
           EI+  +DQ + KFLD+ARQ E FFLQKR  LS  KP+  +KE      +   L+ ++ RK
Sbjct: 1   EIRNYVDQLSQKFLDIARQKETFFLQKRNELSVFKPKKTLKE------EAQKLKEEMQRK 54

Query: 134 EELIKRHYDKIAVWQNLLSDLQ 155
           ++L  +H  KI  W+NLL+D +
Sbjct: 55  DQLQTKHDSKIDYWENLLTDAE 76


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 46.3 bits (110), Expect = 3e-05
 Identities = 24/131 (18%), Positives = 32/131 (24%), Gaps = 4/131 (3%)

Query: 265 ASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQI 324
             G   P     +       +  P         P        P+    QQ Q   + QQ+
Sbjct: 200 PPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPL----QQPQFPGLSQQM 255

Query: 325 HMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLG 384
                Q        P      P        PG P+G     PPP        +  P    
Sbjct: 256 PPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQ 315

Query: 385 PGGMGPGGLLQ 395
            G      L+Q
Sbjct: 316 RGPQFREQLVQ 326



 Score = 40.5 bits (95), Expect = 0.001
 Identities = 29/134 (21%), Positives = 34/134 (25%), Gaps = 7/134 (5%)

Query: 267 GTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHM 326
              P    P    P  +A     P       P  P    + +          Q   Q  +
Sbjct: 173 PQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPL 232

Query: 327 QHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG---NL 383
                Q   P   P  P G S  M    P  P+       PP P A P     P     L
Sbjct: 233 PPQLPQQPPPLQQPQFP-GLSQQMP---PPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGL 288

Query: 384 GPGGMGPGGLLQGP 397
             G   P    Q P
Sbjct: 289 PQGQNAPLPPPQQP 302



 Score = 38.2 bits (89), Expect = 0.008
 Identities = 19/93 (20%), Positives = 21/93 (22%), Gaps = 1/93 (1%)

Query: 305 MQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNA 364
                   +QQ     Q    +            P  GP     G      G P      
Sbjct: 158 EVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQ 217

Query: 365 GPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
              P PS  P     P  L P    P    Q P
Sbjct: 218 QFLPAPSQAPAQPPLPPQL-PQQPPPLQQPQFP 249



 Score = 36.7 bits (85), Expect = 0.026
 Identities = 19/107 (17%), Positives = 24/107 (22%), Gaps = 7/107 (6%)

Query: 270 PPNSTPTQSGPGISAMG-GPLPGMMGGMAP-IVPGSTMQPMSGMPQQQQQVQMQQQIHMQ 327
                P Q       +     PG+   M P        Q     PQ Q   Q Q   H  
Sbjct: 228 AQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPH-P 286

Query: 328 HMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGP 374
            + Q    P  PP  P         +     +  G            
Sbjct: 287 GLPQGQNAPLPPPQQPQLLPL----VQQPQGQQRGPQFREQLVQLSQ 329



 Score = 30.5 bits (69), Expect = 2.0
 Identities = 16/76 (21%), Positives = 22/76 (28%)

Query: 256 PAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQ 315
           P  Q          PP + PT           PLP         +         G   ++
Sbjct: 263 PQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQRGPQFRE 322

Query: 316 QQVQMQQQIHMQHMQQ 331
           Q VQ+ QQ      Q+
Sbjct: 323 QLVQLSQQQREALSQE 338


>gnl|CDD|219971 pfam08690, GET2, GET complex subunit GET2.  This family corresponds
           to the GET complex subunit GET2. The GET complex is
           involved in the retrieval of ER resident proteins from
           the Golgi.
          Length = 298

 Score = 43.6 bits (103), Expect = 9e-05
 Identities = 23/121 (19%), Positives = 39/121 (32%), Gaps = 6/121 (4%)

Query: 259 QGSTSSASGTTPPNSTPT-QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQ 317
           QGS+      +  ++ P   +G   SA     P +   +  I P    +  S       +
Sbjct: 34  QGSSVKLVSKSVLDAKPEDNTGSTTSAHDQSTPEIQDILEAIDP-PKDESESPAENIDPE 92

Query: 318 VQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGM 377
           V+M QQ+     + Q  G G         ++ +  M      G G     P  +  P   
Sbjct: 93  VEMFQQL----AKMQQQGNGSDNPPADDSTADLFSMLLQMGGGDGPDSESPASAQEPQEA 148

Query: 378 G 378
            
Sbjct: 149 P 149


>gnl|CDD|238103 cd00176, SPEC, Spectrin repeats, found in several proteins involved
           in cytoskeletal structure; family members include
           spectrin, alpha-actinin and dystrophin; the spectrin
           repeat forms a three helix bundle with the second helix
           interrupted by proline in some sequences; the repeats
           are independent folding units; tandem repeats are found
           in differing numbers and arrange in an antiparallel
           manner to form dimers; the repeats are defined by a
           characteristic tryptophan (W) residue in helix A and a
           leucine (L) at the carboxyl end of helix C and separated
           by a linker of 5 residues; two copies of the repeat are
           present here.
          Length = 213

 Score = 42.0 bits (99), Expect = 2e-04
 Identities = 40/177 (22%), Positives = 77/177 (43%), Gaps = 25/177 (14%)

Query: 85  KFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKI 144
           +FL  A ++EA+  +K  LLS+      ++ V  + K    L  +LA  EE ++   +++
Sbjct: 4   QFLRDADELEAWLSEKEELLSSTDYGDDLESVEALLKKHEALEAELAAHEERVEAL-NEL 62

Query: 145 AVWQNLLS-------DLQSCLQVLTKE-DEVSTTLEKDEIKLEIDQATLKFLDLARQMEA 196
              + L+        ++Q  L+ L +  +E+    E+   +LE      +F   A  +E 
Sbjct: 63  G--EQLIEEGHPDAEEIQERLEELNQRWEELRELAEERRQRLEEALDLQQFFRDADDLEQ 120

Query: 197 FFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKRH---YDKIAVWQNLLSDLQ 250
           +  +K   L++            DL  DL   EEL+K+H    +++   +  L  L 
Sbjct: 121 WLEEKEAALASE-----------DLGKDLESVEELLKKHKELEEELEAHEPRLKSLN 166


>gnl|CDD|224259 COG1340, COG1340, Uncharacterized archaeal coiled-coil protein
           [Function unknown].
          Length = 294

 Score = 42.0 bits (99), Expect = 4e-04
 Identities = 36/189 (19%), Positives = 79/189 (41%), Gaps = 37/189 (19%)

Query: 63  DGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKD 122
           +  +L       ++ EI++                L+K+   S L PE   +E  +V + 
Sbjct: 100 NEFNLGGRSIKSLEREIER----------------LEKKQQTSVLTPE---EERELV-QK 139

Query: 123 IVDLRHDLARKEELIKRH------YDKIAVWQNLLSDLQSCLQVLTKE-----DEVSTTL 171
           I +LR +L   ++ ++ +        +I   +    ++   +Q L  E     +E+    
Sbjct: 140 IKELRKELEDAKKALEENEKLKELKAEIDELKKKAREIHEKIQELANEAQEYHEEMIKLF 199

Query: 172 EK-DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELI-VKEDIVDLRHDLARKE 229
           E+ DE++ E D+   +F++L+++++    +       L+ EL  +++ I  LR      +
Sbjct: 200 EEADELRKEADELHEEFVELSKKID----ELHEEFRNLQNELRELEKKIKALRAKEKAAK 255

Query: 230 ELIKRHYDK 238
              KR   K
Sbjct: 256 RREKREELK 264


>gnl|CDD|224495 COG1579, COG1579, Zn-ribbon protein, possibly nucleic acid-binding
           [General function prediction only].
          Length = 239

 Score = 41.2 bits (97), Expect = 4e-04
 Identities = 27/115 (23%), Positives = 46/115 (40%), Gaps = 4/115 (3%)

Query: 126 LRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATL 185
           L  +  R E  IK     +   +  L  L   L+ L  E E     +  +++ EI +   
Sbjct: 15  LDLEKDRLEPRIKEIRKALKKAKAELEALNKALEALEIELE-DLENQVSQLESEIQEIRE 73

Query: 186 KFLDLARQMEAFFLQKRFLLSALKPEL-IVKEDIVDLRHDLARKEELIKRHYDKI 239
           +      ++ A   ++     AL  E+ I KE I  L  +LA   E I++   +I
Sbjct: 74  RIKRAEEKLSAVKDEREL--RALNIEIQIAKERINSLEDELAELMEEIEKLEKEI 126



 Score = 38.9 bits (91), Expect = 0.003
 Identities = 22/137 (16%), Positives = 51/137 (37%), Gaps = 19/137 (13%)

Query: 71  EKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDL 130
           +  +++ EI +   +      ++ A   ++     AL  E+ + +     + I  L  +L
Sbjct: 60  QVSQLESEIQEIRERIKRAEEKLSAVKDEREL--RALNIEIQIAK-----ERINSLEDEL 112

Query: 131 ARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEV----------STTLEKDEIKLEI 180
           A   E I++   +I   +  L  L+  L       E             + +++E+K ++
Sbjct: 113 AELMEEIEKLEKEIEDLKERLERLEKNLAEAEARLEEEVAEIREEGQELSSKREELKEKL 172

Query: 181 DQATLKFLDLARQMEAF 197
           D   L   +  R  +  
Sbjct: 173 DPELLSEYE--RIRKNK 187


>gnl|CDD|220309 pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex
           non-fungal.  The approx. 70 residue Med15 domain of the
           ARC-Mediator co-activator is a three-helix bundle with
           marked similarity to the KIX domain. The sterol
           regulatory element binding protein (SREBP) family of
           transcription activators use the ARC105 subunit to
           activate target genes in the regulation of cholesterol
           and fatty acid homeostasis. In addition, Med15 is a
           critical transducer of gene activation signals that
           control early metazoan development.
          Length = 768

 Score = 41.1 bits (96), Expect = 0.001
 Identities = 26/119 (21%), Positives = 30/119 (25%), Gaps = 1/119 (0%)

Query: 277 QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGP 336
             G          P M     P   G       G P  QQQ   Q Q  +Q+ QQQ M  
Sbjct: 181 NQGQQGPVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGGQQQQNPQMQQQLQNQQQQQMDQ 240

Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQ 395
              P+              G        G    P         P      GM P  + Q
Sbjct: 241 QQGPADAQAQMGQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQ-SQLGMLPNQMQQ 298



 Score = 40.8 bits (95), Expect = 0.001
 Identities = 28/111 (25%), Positives = 34/111 (30%), Gaps = 4/111 (3%)

Query: 255 SPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQ 314
                G   +AS      +   Q   G + MG   P  M  +  + PG     M      
Sbjct: 100 MGQQMGGPGTASNLLQSLNVRGQMPMGAAGMG---PHQMSRVGTMQPGGQAGGMMQQSSG 156

Query: 315 QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAG 365
           Q Q Q   Q+  Q  Q QG    G   G  GP         G P   G  G
Sbjct: 157 QPQSQQPNQMGPQQGQAQGQAG-GMNQGQQGPVGQQQPPQMGQPGMPGGGG 206



 Score = 40.0 bits (93), Expect = 0.002
 Identities = 28/150 (18%), Positives = 36/150 (24%), Gaps = 12/150 (8%)

Query: 257 AHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGM-MGGMAPIVPGSTMQPMSGMPQQQ 315
           A Q          P      Q+  G    G  +  M  G   P   G  M          
Sbjct: 57  AAQQQVLQGGQGMPDPINALQNLTGQGTRGPQMGPMGPGPGRP--MGQQMGGPGTASNLL 114

Query: 316 QQVQMQQQIHMQH-----MQQQGMGPGGPPSGPGGPSSGMMFMG----PGGPRGGGNAGP 366
           Q + ++ Q+ M        Q   +G   P    GG             P           
Sbjct: 115 QSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQSSGQPQSQQPNQMGPQQGQAQ 174

Query: 367 PPFPSAGPGGMGGPGNLGPGGMGPGGLLQG 396
                   G  G  G   P  MG  G+  G
Sbjct: 175 GQAGGMNQGQQGPVGQQQPPQMGQPGMPGG 204



 Score = 38.4 bits (89), Expect = 0.007
 Identities = 24/91 (26%), Positives = 26/91 (28%), Gaps = 5/91 (5%)

Query: 277 QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGP 336
           Q G      GG  P  M G    VP     P     Q Q  +   Q   MQ M   G   
Sbjct: 250 QMGQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQSQLGMLPNQ---MQQMPGGGQ-- 304

Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP 367
           GGP    G P      +  GG          
Sbjct: 305 GGPGQPMGPPPQRPGAVPQGGQAVQQGVMSA 335



 Score = 37.7 bits (87), Expect = 0.011
 Identities = 30/99 (30%), Positives = 33/99 (33%), Gaps = 16/99 (16%)

Query: 302 GSTMQPMSGMPQQQQ-----QVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
           G   Q   GM  QQ      QV MQQQ   Q  QQ  +G              M  M  G
Sbjct: 252 GQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQSQLGMLPNQ---------MQQMPGG 302

Query: 357 GPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQ 395
           G  G G    P  P   PG +   G     G+   G  Q
Sbjct: 303 GQGGPGQPMGP--PPQRPGAVPQGGQAVQQGVMSAGQQQ 339



 Score = 36.1 bits (83), Expect = 0.031
 Identities = 31/144 (21%), Positives = 41/144 (28%), Gaps = 13/144 (9%)

Query: 256 PAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQ 315
           P  Q    S  G  P N      G G    G P+         +  G        M   Q
Sbjct: 279 PPQQQPQQSQLGMLP-NQMQQMPGGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQ 337

Query: 316 QQVQMQQQIHMQHM-----QQQGMGPGGPPSGPGGPSSGMMFMGPGG----PRGGGNAGP 366
           QQ++  +  +M+       QQQ  G   P +           +G GG           G 
Sbjct: 338 QQLKQMKLRNMRGQQQTQQQQQQQGGNHPAAHQQQM---NQQVGQGGQMVALGYLNIQGN 394

Query: 367 PPFPSAGPGGMGGPGNLGPGGMGP 390
                A P   G PG +      P
Sbjct: 395 QGGLGANPMQQGQPGMMSSPSPVP 418



 Score = 35.7 bits (82), Expect = 0.048
 Identities = 34/161 (21%), Positives = 45/161 (27%), Gaps = 18/161 (11%)

Query: 258 HQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQ 317
             G         PP   P     G  A+   +          +    M+      QQQQQ
Sbjct: 301 GGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQQLKQMKLRNMRGQQQTQQQQQQ 360

Query: 318 VQMQQQIHMQHMQQQGMGPGGPPSGPGGPSS-----------------GMMFMGPGGPRG 360
                    Q    Q +G GG     G  +                  GMM      P+ 
Sbjct: 361 QGGNHPAAHQQQMNQQVGQGGQMVALGYLNIQGNQGGLGANPMQQGQPGMMSSPSPVPQV 420

Query: 361 GGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPLAYL 401
             N   P  P       GGPG+  P     GG++  P A +
Sbjct: 421 QTNQSMPQPPQPSVPSPGGPGSQ-PPQSVSGGMIPSPPALM 460



 Score = 35.7 bits (82), Expect = 0.049
 Identities = 37/177 (20%), Positives = 43/177 (24%), Gaps = 35/177 (19%)

Query: 250 QGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAM--------GGPLPGMMGGMAPIVP 301
            G         S    S           +G G   M        GG   GMM   +   P
Sbjct: 100 MGQQMGGPGTASNLLQSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQSSG-QP 158

Query: 302 GSTM-----------QPMSGMPQQQQQVQMQQQIHMQ--------HMQQQGMGPGGPPSG 342
            S             Q  +G   Q QQ  + QQ   Q           Q  M   G P G
Sbjct: 159 QSQQPNQMGPQQGQAQGQAGGMNQGQQGPVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGG 218

Query: 343 P--GGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
                P            +     GP     A      G    G GGM P  +  G 
Sbjct: 219 QQQQNPQMQQQLQNQQQQQMDQQQGP-----ADAQAQMGQQQQGQGGMQPQQMQGGQ 270



 Score = 33.8 bits (77), Expect = 0.19
 Identities = 27/113 (23%), Positives = 32/113 (28%), Gaps = 9/113 (7%)

Query: 271 PNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQ 330
            N    Q G G + M    PGMM   +P      +Q    MPQ  Q              
Sbjct: 389 LNIQGNQGGLGANPMQQGQPGMMSSPSP---VPQVQTNQSMPQPPQPSVPSPGGPGSQPP 445

Query: 331 QQGMGPGGPPSGPGGPS-SGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGN 382
           Q   G   P      PS S  M   P   R            +  G +  PG 
Sbjct: 446 QSVSGGMIPSPPALMPSPSPQMSQSPASQR-----TIQQDMVSPGGPLNTPGQ 493



 Score = 33.8 bits (77), Expect = 0.19
 Identities = 27/97 (27%), Positives = 29/97 (29%), Gaps = 15/97 (15%)

Query: 314 QQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGP------- 366
            QQQV    Q     +       G    GP     G M  GPG P G    GP       
Sbjct: 58  AQQQVLQGGQGMPDPINALQNLTGQGTRGPQM---GPMGPGPGRPMGQQMGGPGTASNLL 114

Query: 367 -----PPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPL 398
                      G  GMG       G M PGG   G +
Sbjct: 115 QSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMM 151



 Score = 32.3 bits (73), Expect = 0.51
 Identities = 28/96 (29%), Positives = 35/96 (36%), Gaps = 14/96 (14%)

Query: 254 KSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPL----PGMMGGMAPIVPGSTMQPMS 309
             P    S S         S P    P + + GGP       + GGM P        P +
Sbjct: 406 GQPGMMSSPSPVPQVQTNQSMPQPPQPSVPSPGGPGSQPPQSVSGGMIP-------SPPA 458

Query: 310 GMPQQQQQVQMQQQIHMQH-MQQQGMGPGGPPSGPG 344
            MP      QM Q    Q  +QQ  + PGGP + PG
Sbjct: 459 LMPSPSP--QMSQSPASQRTIQQDMVSPGGPLNTPG 492



 Score = 32.3 bits (73), Expect = 0.58
 Identities = 23/123 (18%), Positives = 28/123 (22%)

Query: 270 PPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHM 329
            P     Q   G       L  +       +  + M P         Q   Q    MQ  
Sbjct: 95  GPGRPMGQQMGGPGTASNLLQSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQS 154

Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
             Q         GP    +     G    + G      P     PG  GG G       G
Sbjct: 155 SGQPQSQQPNQMGPQQGQAQGQAGGMNQGQQGPVGQQQPPQMGQPGMPGGGGQGQMQQQG 214

Query: 390 PGG 392
             G
Sbjct: 215 QPG 217



 Score = 30.7 bits (69), Expect = 1.6
 Identities = 17/116 (14%), Positives = 26/116 (22%), Gaps = 1/116 (0%)

Query: 243 QNLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPG 302
             +   LQ   +    Q    + +          Q G     M G    +     P    
Sbjct: 224 PQMQQQLQNQQQQQMDQQQGPADAQAQMGQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQ 283

Query: 303 STMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGP 358
                +  +P Q QQ+    Q                    GG +     M  G  
Sbjct: 284 PQQSQLGMLPNQMQQMPGGGQ-GGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQ 338


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 40.1 bits (94), Expect = 0.002
 Identities = 36/209 (17%), Positives = 86/209 (41%), Gaps = 40/209 (19%)

Query: 71  EKDEIKLEIDQATLKFLDLARQ-----MEAFFLQKRFLLSALKPELIVKEVNMVTKDIVD 125
           E +E++ E+++   + L+L  +      E   L++R      + E + + +  + + I  
Sbjct: 275 ELEELREELEELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEA 334

Query: 126 LRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVSTTLEKD---------- 174
           L+ +L  +E L++     +A  +    +L+  L  L +E +E+   L ++          
Sbjct: 335 LKEELEERETLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREELAELEAELAE 394

Query: 175 ------EIKLEIDQATLKFLDLARQMEAF----------FLQKRFLLSALKPELI----- 213
                 E+K EI+    +   L+ ++E              + +  L  L  EL      
Sbjct: 395 IRNELEELKREIESLEERLERLSERLEDLKEELKELEAELEELQTELEELNEELEELEEQ 454

Query: 214 ---VKEDIVDLRHDLARKEELIKRHYDKI 239
              +++ + +L  +LA  +E ++R   ++
Sbjct: 455 LEELRDRLKELERELAELQEELQRLEKEL 483


>gnl|CDD|219791 pfam08317, Spc7, Spc7 kinetochore protein.  This domain is found in
           cell division proteins which are required for
           kinetochore-spindle association.
          Length = 321

 Score = 38.9 bits (91), Expect = 0.003
 Identities = 24/133 (18%), Positives = 48/133 (36%), Gaps = 19/133 (14%)

Query: 120 TKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLE 179
            + +  L+  L    E +KR  + +    NL++ ++  L+                +K E
Sbjct: 142 MQLLEGLKEGLEENLEGMKRDEELLNKDLNLINSIKPKLRKK-----------LQALKEE 190

Query: 180 IDQATLKFLDLARQMEAFFLQKRFLLSALKPELI-VKEDIVDLRHDLARKEELIKRHYDK 238
           I    L+   LA ++      +   L   + EL  +   I + R  L   ++ ++     
Sbjct: 191 IAS--LR--QLADELNLCDPLE---LEKARQELRSLSVKISEKRKQLEELQQELQELTIA 243

Query: 239 IAVWQNLLSDLQG 251
           I    N  S+L  
Sbjct: 244 IEALTNKKSELLE 256



 Score = 33.1 bits (76), Expect = 0.24
 Identities = 25/133 (18%), Positives = 47/133 (35%), Gaps = 17/133 (12%)

Query: 60  GLPDG--RSLSPLEKDEIKLEIDQATLKFLDLARQMEAFF--LQKRF-LLSALKPEL--- 111
           GL +G   +L  +++DE  L  D   +    +  ++      L++    L  L  EL   
Sbjct: 147 GLKEGLEENLEGMKRDEELLNKDLNLIN--SIKPKLRKKLQALKEEIASLRQLADELNLC 204

Query: 112 -------IVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE 164
                    +E+  ++  I + R  L   ++ ++     I    N  S+L   +    K 
Sbjct: 205 DPLELEKARQELRSLSVKISEKRKQLEELQQELQELTIAIEALTNKKSELLEEIAEAEKI 264

Query: 165 DEVSTTLEKDEIK 177
            E        EI 
Sbjct: 265 REECRGWSAKEIS 277


>gnl|CDD|227361 COG5028, COG5028, Vesicle coat complex COPII, subunit SEC24/subunit
           SFB2/subunit SFB3 [Intracellular trafficking and
           secretion].
          Length = 861

 Score = 39.0 bits (91), Expect = 0.004
 Identities = 19/92 (20%), Positives = 22/92 (23%), Gaps = 1/92 (1%)

Query: 277 QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGP 336
           Q   G ++            A    G    P    P  QQQ + Q       M   G   
Sbjct: 15  QVHTGAASSKKS-ARPHRAYANFSAGQMGMPPYTTPPLQQQSRRQIDQAATAMHNTGANN 73

Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPP 368
             P        S   F  P G        P P
Sbjct: 74  PAPSVMSPAFQSQQKFSSPYGGSMADGTAPKP 105


>gnl|CDD|218108 pfam04487, CITED, CITED.  CITED, CBP/p300-interacting
           transactivator with ED-rich tail, are characterized by a
           conserved 32-amino acid sequence at the C-terminus.
           CITED proteins do not bind DNA directly and are thought
           to function as transcriptional co-activators.
          Length = 206

 Score = 37.6 bits (87), Expect = 0.005
 Identities = 27/115 (23%), Positives = 33/115 (28%), Gaps = 13/115 (11%)

Query: 278 SGPGISAMGGPLPGMMGGMAPIVPGSTMQP--MSGMPQQQQQVQMQQQIH-MQHMQQQGM 334
            G G+ A G P   M G M    P  +M    M     + Q   +      M  MQ Q +
Sbjct: 49  PGGGMDASGRPRSAMSGPMGGGHPHQSMPAYMMFNPSSKPQPFMLVPGPQLMASMQLQKL 108

Query: 335 GPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGG-MGGPGNLGPGGM 388
                         G      G P GGG     P     PG        L P  +
Sbjct: 109 N---------TQYQGHAGAPAGHPGGGGPQQFRPGAGQPPGMQHMPAPALPPNVI 154


>gnl|CDD|218704 pfam05701, DUF827, Plant protein of unknown function (DUF827).
           This family consists of several plant proteins of
           unknown function. Several sequences in this family are
           described as being "myosin heavy chain-like".
          Length = 484

 Score = 38.4 bits (89), Expect = 0.006
 Identities = 29/137 (21%), Positives = 61/137 (44%), Gaps = 12/137 (8%)

Query: 107 LKPELIVKEVNMV--------TKDIV-DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSC 157
           LK EL V E   +        TK  V DL+  L + E+  ++      + +    +L+  
Sbjct: 48  LKKELEVAEKEKLQVLKELESTKRTVEDLKLKLEKAEKEEQQAKQDSELAKLRAEELEQG 107

Query: 158 LQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKED 217
           +Q L  E  ++ T E D +K E+ +   ++  L  + +A   +    + A K   + ++ 
Sbjct: 108 IQELEVERYITATAELDSVKEELRKIRQEYDALVEERDAALKRAEEAICASK---VNEKK 164

Query: 218 IVDLRHDLARKEELIKR 234
           + +L  ++   +E ++R
Sbjct: 165 VEELTKEIIAMKESLER 181


>gnl|CDD|227507 COG5180, PBP1, Protein interacting with poly(A)-binding protein
           [RNA processing and modification].
          Length = 654

 Score = 38.2 bits (88), Expect = 0.007
 Identities = 36/136 (26%), Positives = 46/136 (33%), Gaps = 18/136 (13%)

Query: 259 QGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSG-MPQQQQQ 317
           Q    ++ G   P   P         MG P+ G      P++ G     M   MP Q Q 
Sbjct: 498 QQRQLNSMGNAVPGMNPAMGMNMGGMMGFPMGGPSASPNPMMNGFAAGSMGMYMPFQPQP 557

Query: 318 VQMQQQIHMQHMQQQGMGPGGPPSGPGGPS----SGMMFMGPGGPRGGGNAGPPPFPSAG 373
           +       M  +    MG  G   G G  S    +G M  GPG P G             
Sbjct: 558 MFYHPSPQMMPV----MGSNGAEEGGGNISPHVPAGFMAAGPGAPMGA---------FGY 604

Query: 374 PGGMGGPGNLGPGGMG 389
           PGG+   G +G G  G
Sbjct: 605 PGGIPFQGMMGSGPSG 620


>gnl|CDD|218350 pfam04959, ARS2, Arsenite-resistance protein 2.  Arsenite is a
           carcinogenic compound which can act as a co-mutagen by
           inhibiting DNA repair. Arsenite-resistance protein 2 is
           thought to play a role in arsenite resistance.
          Length = 211

 Score = 37.5 bits (87), Expect = 0.007
 Identities = 24/71 (33%), Positives = 27/71 (38%), Gaps = 5/71 (7%)

Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGP-----GNLGPG 386
            G+ PG P   P  P + M +  P  P  G   G PPFP    GG  G      G  G  
Sbjct: 141 GGLAPGLPGYPPQTPQALMPYGQPRPPMMGYGRGGPPFPPNQYGGGRGNYDEFRGQGGYY 200

Query: 387 GMGPGGLLQGP 397
           G      L GP
Sbjct: 201 GKPRNRDLDGP 211



 Score = 27.8 bits (62), Expect = 8.1
 Identities = 12/37 (32%), Positives = 14/37 (37%)

Query: 6   PGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 42
           P    GG   G P +P   P  +   G   P  MG G
Sbjct: 136 PKPDPGGLAPGLPGYPPQTPQALMPYGQPRPPMMGYG 172



 Score = 27.8 bits (62), Expect = 8.1
 Identities = 12/37 (32%), Positives = 14/37 (37%)

Query: 355 PGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 391
           P    GG   G P +P   P  +   G   P  MG G
Sbjct: 136 PKPDPGGLAPGLPGYPPQTPQALMPYGQPRPPMMGYG 172


>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421).  This
           family represents a conserved region approximately 350
           residues long within a number of plant proteins of
           unknown function.
          Length = 357

 Score = 37.6 bits (87), Expect = 0.008
 Identities = 28/123 (22%), Positives = 35/123 (28%), Gaps = 14/123 (11%)

Query: 270 PPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHM 329
                 + + P     G P P       P        P SG P  QQ    Q Q      
Sbjct: 202 AMQPPYSGAPPSQQFYGPPQPSPYMYGGPGGR-----PNSGFPSGQQPPPSQGQ------ 250

Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
             +G G  GPP    G    +    P G     +   P  P+A       P +  P   G
Sbjct: 251 --EGYGYSGPPP-SKGNHGSVASYAPQGSSQSYSTAYPSLPAATVLPQALPMSSAPMSGG 307

Query: 390 PGG 392
             G
Sbjct: 308 GSG 310


>gnl|CDD|130689 TIGR01628, PABP-1234, polyadenylate binding protein, human types 1,
           2, 3, 4 family.  These eukaryotic proteins recognize the
           poly-A of mRNA and consist of four tandem RNA
           recognition domains at the N-terminus (rrm: pfam00076)
           followed by a PABP-specific domain (pfam00658) at the
           C-terminus. The protein is involved in the transport of
           mRNA's from the nucleus to the cytoplasm. There are four
           paralogs in Homo sapiens which are expressed in testis
           (GP:11610605_PABP3), platelets (SP:Q13310_PABP4),
           broadly expressed (SP:P11940_PABP1) and of unknown
           tissue range (SP:Q15097_PABP2).
          Length = 562

 Score = 37.1 bits (86), Expect = 0.015
 Identities = 20/95 (21%), Positives = 22/95 (23%), Gaps = 17/95 (17%)

Query: 283 SAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSG 342
             MG P+ G MG           QP       QQQ   Q            M     P G
Sbjct: 385 LPMGSPMGGAMG-----------QPPYYGQGPQQQFNGQPLGW------PRMSMMPTPMG 427

Query: 343 PGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGM 377
           PGGP            R                 +
Sbjct: 428 PGGPLRPNGLAPMNAVRAPSRNAQNAAQKPPMQPV 462



 Score = 32.1 bits (73), Expect = 0.63
 Identities = 10/80 (12%), Positives = 17/80 (21%), Gaps = 3/80 (3%)

Query: 279 GPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGG 338
           G G        P     M+ +   + M P     +      M          Q       
Sbjct: 402 GQGPQQQFNGQPLGWPRMSMM--PTPMGPGGP-LRPNGLAPMNAVRAPSRNAQNAAQKPP 458

Query: 339 PPSGPGGPSSGMMFMGPGGP 358
                  P+   + +    P
Sbjct: 459 MQPVMYPPNYQSLPLSQDLP 478



 Score = 30.5 bits (69), Expect = 1.8
 Identities = 24/89 (26%), Positives = 33/89 (37%), Gaps = 10/89 (11%)

Query: 315 QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSG---MMFMGPG---GPRGGGNAGPPP 368
           Q++ Q +  +  Q MQ Q      P   P G + G       GP      +  G      
Sbjct: 362 QRKEQRRAHLQDQFMQLQPRMRQLPMGSPMGGAMGQPPYYGQGPQQQFNGQPLGWPRMSM 421

Query: 369 FPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
            P   P G GGP  L P G+ P   ++ P
Sbjct: 422 MP--TPMGPGGP--LRPNGLAPMNAVRAP 446



 Score = 30.2 bits (68), Expect = 2.3
 Identities = 17/84 (20%), Positives = 18/84 (21%), Gaps = 12/84 (14%)

Query: 307 PMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGP 366
           PM G   Q            Q   Q    P                MGPGGP      G 
Sbjct: 390 PMGGAMGQPPYYGQGP--QQQFNGQPLGWPRMSMMPTP--------MGPGGPLRP--NGL 437

Query: 367 PPFPSAGPGGMGGPGNLGPGGMGP 390
            P  +                M P
Sbjct: 438 APMNAVRAPSRNAQNAAQKPPMQP 461


>gnl|CDD|216868 pfam02084, Bindin, Bindin. 
          Length = 239

 Score = 36.4 bits (84), Expect = 0.015
 Identities = 26/66 (39%), Positives = 28/66 (42%), Gaps = 2/66 (3%)

Query: 327 QHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGN-AGPPPFPSAGPGGMGGPGNLGP 385
           Q M  Q MG G  P+       G    G GGP GGG   G       GP G GG G+ GP
Sbjct: 9   QAMNPQ-MGGGNYPAPGQPAQQGYANQGMGGPVGGGGGPGAGGGAPGGPVGGGGGGSGGP 67

Query: 386 GGMGPG 391
            G G  
Sbjct: 68  PGGGEV 73



 Score = 31.8 bits (72), Expect = 0.52
 Identities = 21/87 (24%), Positives = 23/87 (26%), Gaps = 8/87 (9%)

Query: 307 PMSGMPQQQ-QQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAG 365
            M   PQ    Q+            QQG    G     GG        G  G  GG   G
Sbjct: 3   NMQNYPQAMNPQMGGGNYPAPGQPAQQGYANQGMGGPVGG-------GGGPGAGGGAPGG 55

Query: 366 PPPFPSAGPGGMGGPGNLGPGGMGPGG 392
           P      G GG  G G +         
Sbjct: 56  PVGGGGGGSGGPPGGGEVAGEAEDAMS 82


>gnl|CDD|219837 pfam08430, Fork_head_N, Forkhead N-terminal region.  The region
           described in this family is found towards the N-terminus
           of various eukaryotic fork head/HNF-3-related
           transcription factors (which contain the pfam00250
           domain). These proteins play key roles in embryogenesis,
           maintenance of differentiated cell states, and
           tumorigenesis.
          Length = 137

 Score = 35.3 bits (81), Expect = 0.017
 Identities = 21/100 (21%), Positives = 28/100 (28%), Gaps = 2/100 (2%)

Query: 283 SAMGGPLPGMMGGMAPIVPGSTM-QPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPS 341
           S++ G +   M  M    P +T     +                M      GM PG   +
Sbjct: 11  SSVSGGMVYSMNSMNTYGPMNTSQGSANSSMNMSGYAGPGAMNGMSSSSMNGMSPGYGGA 70

Query: 342 GPGGPSSGMMFMGPG-GPRGGGNAGPPPFPSAGPGGMGGP 380
           G      GM  MG    P G   A  P    +G       
Sbjct: 71  GSPMGMMGMSSMGTSLSPSGTMGAMGPMPAGSGGSLSPNM 110



 Score = 32.6 bits (74), Expect = 0.13
 Identities = 26/116 (22%), Positives = 36/116 (31%), Gaps = 17/116 (14%)

Query: 260 GSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQ 319
            S +S +   P N++   +   ++  G   PG M GM+     S+M  MS          
Sbjct: 19  YSMNSMNTYGPMNTSQGSANSSMNMSGYAGPGAMNGMSS----SSMNGMSPGY------- 67

Query: 320 MQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPG 375
                        GM           PS  M  MGP     GG+  P    S    
Sbjct: 68  ------GGAGSPMGMMGMSSMGTSLSPSGTMGAMGPMPAGSGGSLSPNMSMSRASS 117



 Score = 31.9 bits (72), Expect = 0.25
 Identities = 16/95 (16%), Positives = 25/95 (26%), Gaps = 6/95 (6%)

Query: 260 GSTSSASGTTPPNSTPTQSGPGISAMG-GPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQV 318
            +TS  S  +  N +       ++ M    + GM  G         M  MS M      +
Sbjct: 30  MNTSQGSANSSMNMSGYAGPGAMNGMSSSSMNGMSPGYGGAGSPMGMMGMSSMG---TSL 86

Query: 319 QMQQQIHMQHMQQQGMGPGG--PPSGPGGPSSGMM 351
                +        G G       S     S   +
Sbjct: 87  SPSGTMGAMGPMPAGSGGSLSPNMSMSRASSQNNL 121


>gnl|CDD|240419 PTZ00440, PTZ00440, reticulocyte binding protein 2-like protein;
            Provisional.
          Length = 2722

 Score = 37.1 bits (86), Expect = 0.018
 Identities = 29/137 (21%), Positives = 60/137 (43%), Gaps = 27/137 (19%)

Query: 49   LAYLEKTTSNIGLPDGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALK 108
            L Y +K+  NI   DG  L  L+K++ + E  ++ +  L++   +    L K+       
Sbjct: 965  LEYYDKSKENINGNDGTHLEKLDKEKDEWEHFKSEIDKLNVNYNI----LNKKI------ 1014

Query: 109  PELIVKE----VNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE 164
             +LI K+    + ++ K I +   ++   EE + ++        +LL  +++ L      
Sbjct: 1015 DDLIKKQHDDIIELIDKLIKEKGKEI---EEKVDQYI-------SLLEKMKTKLSSFHFN 1064

Query: 165  DEVSTT---LEKDEIKL 178
             ++        K+EIKL
Sbjct: 1065 IDIKKYKNPKIKEEIKL 1081


>gnl|CDD|222878 PHA02562, 46, endonuclease subunit; Provisional.
          Length = 562

 Score = 36.9 bits (86), Expect = 0.019
 Identities = 40/170 (23%), Positives = 68/170 (40%), Gaps = 25/170 (14%)

Query: 71  EKDEIKLEIDQATLKFLDLARQME----AF--FLQKRFLLSALKPELIVKEVNMVTKDIV 124
           E   IK EI++ T + L+L   +E    A          + + K E   K + M  K  V
Sbjct: 228 EAKTIKAEIEELTDELLNLVMDIEDPSAALNKLNTAAAKIKS-KIEQFQKVIKMYEKGGV 286

Query: 125 --DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQ 182
                  ++   + I +  DK+   Q+ L  L       T  DE+       EI  E ++
Sbjct: 287 CPTCTQQISEGPDRITKIKDKLKELQHSLEKLD------TAIDELE------EIMDEFNE 334

Query: 183 ATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLA-RKEEL 231
            + K L+L  ++      K+ L++ +     VK  I +L+ +     EEL
Sbjct: 335 QSKKLLELKNKISTN---KQSLITLVDKAKKVKAAIEELQAEFVDNAEEL 381



 Score = 30.0 bits (68), Expect = 2.5
 Identities = 14/89 (15%), Positives = 42/89 (47%), Gaps = 7/89 (7%)

Query: 110 ELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVST 169
           E I+ E N  +K +++L++ ++  ++ +    DK    +  + +LQ+  + +   +E++ 
Sbjct: 326 EEIMDEFNEQSKKLLELKNKISTNKQSLITLVDKAKKVKAAIEELQA--EFVDNAEELA- 382

Query: 170 TLEKDEIKLEIDQATLK----FLDLARQM 194
            L+ +  K+   ++ L        +   +
Sbjct: 383 KLQDELDKIVKTKSELVKEKYHRGIVTDL 411


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 36.6 bits (85), Expect = 0.020
 Identities = 28/130 (21%), Positives = 39/130 (30%), Gaps = 12/130 (9%)

Query: 256 PAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGM----APIVPGSTMQPM--S 309
           P  Q   +  +G    ++TPT S          LP    G      P        P+  +
Sbjct: 70  PPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAGPAGPTIQTEPGQLYPVQVPVMVT 129

Query: 310 GMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMG--PGGPRGGGNAGPP 367
             P      Q  QQ  +Q +QQ      G P+    PS             +   N   P
Sbjct: 130 QNPANSPLDQPAQQRALQQLQQ----RYGAPASGQLPSQQQSAQKNDESQLQQQPNGETP 185

Query: 368 PFPSAGPGGM 377
           P  + G G  
Sbjct: 186 PQQTDGAGDD 195



 Score = 32.4 bits (74), Expect = 0.40
 Identities = 20/92 (21%), Positives = 23/92 (25%), Gaps = 9/92 (9%)

Query: 297 APIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
           AP       QP+   P  Q    +      QH    G     PP+    P       GP 
Sbjct: 55  APPPVAQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALP------AGPA 108

Query: 357 GPR---GGGNAGPPPFPSAGPGGMGGPGNLGP 385
           GP      G   P   P              P
Sbjct: 109 GPTIQTEPGQLYPVQVPVMVTQNPANSPLDQP 140


>gnl|CDD|240291 PTZ00146, PTZ00146, fibrillarin; Provisional.
          Length = 293

 Score = 35.9 bits (83), Expect = 0.027
 Identities = 21/49 (42%), Positives = 21/49 (42%), Gaps = 2/49 (4%)

Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG 381
           G G GG   G GG   G    G GG RGGG          G GG GG G
Sbjct: 7   GGGRGGGRGGGGGGGRG--GGGRGGGRGGGRGRGRGGGGGGRGGGGGGG 53



 Score = 35.1 bits (81), Expect = 0.048
 Identities = 18/39 (46%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 5  GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 43
          G GG RGGG  G       G GG GG    G GG G G 
Sbjct: 9  GRGGGRGGGGGGGRGGGGRG-GGRGGGRGRGRGGGGGGR 46



 Score = 35.1 bits (81), Expect = 0.048
 Identities = 18/39 (46%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 354 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
           G GG RGGG  G       G GG GG    G GG G G 
Sbjct: 9   GRGGGRGGGGGGGRGGGGRG-GGRGGGRGRGRGGGGGGR 46



 Score = 32.8 bits (75), Expect = 0.26
 Identities = 24/60 (40%), Positives = 24/60 (40%), Gaps = 4/60 (6%)

Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
           GMG G      GG   G    G GG  GGG  G          G GG G  G GG GPG 
Sbjct: 1   GMGGGFGGGRGGGRGGG----GGGGRGGGGRGGGRGGGRGRGRGGGGGGRGGGGGGGPGK 56



 Score = 30.1 bits (68), Expect = 1.8
 Identities = 18/42 (42%), Positives = 18/42 (42%), Gaps = 2/42 (4%)

Query: 3  FMGPGGPRGGGNAGPPPFPSAGPGGMG-GPGNLGPGGMGPGG 43
           MG GG  GG   G       G GG G G G  G  G G GG
Sbjct: 1  GMG-GGFGGGRGGGRGGGGGGGRGGGGRGGGRGGGRGRGRGG 41



 Score = 29.7 bits (67), Expect = 2.8
 Identities = 17/46 (36%), Positives = 18/46 (39%), Gaps = 7/46 (15%)

Query: 352 FMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
            MG G   G G          G GG GG G  G GG   GG  +G 
Sbjct: 1   GMGGGFGGGRGGGR-------GGGGGGGRGGGGRGGGRGGGRGRGR 39



 Score = 29.3 bits (66), Expect = 3.4
 Identities = 17/39 (43%), Positives = 17/39 (43%)

Query: 5  GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 43
          G GG  GGG  G          G GG G  G GG GPG 
Sbjct: 18 GGGGRGGGGRGGGRGGGRGRGRGGGGGGRGGGGGGGPGK 56


>gnl|CDD|133051 cd06429, GT8_like_1, GT8_like_1 represents a subfamily of GT8 with
           unknown function.  A subfamily of glycosyltransferase
           family 8 with unknown function: Glycosyltransferase
           family 8 comprises enzymes with a number of known
           activities: lipopolysaccharide galactosyltransferase,
           lipopolysaccharide glucosyltransferase 1, glycogenin
           glucosyltransferase and inositol
           1-alpha-galactosyltransferase. It is classified as a
           retaining glycosyltransferase, based on the relative
           anomeric stereochemistry of the substrate and product in
           the reaction catalyzed.
          Length = 257

 Score = 35.4 bits (82), Expect = 0.038
 Identities = 34/160 (21%), Positives = 56/160 (35%), Gaps = 30/160 (18%)

Query: 138 KRHYDKIAVWQNLLSDLQSCLQVLTKEDEV-STTLEKDEIKLEIDQATLKFLDLARQMEA 196
            ++Y  +  W +L     + ++VL  +D      ++ D +     +A    L   R+ E 
Sbjct: 38  NQNYGAMRSWFDLNPLKIATVKVLNFDDFKLLGKVKVDSLMQLESEADTSNLK-QRKPEY 96

Query: 197 FFL--QKRFLLSALKPEL---IVKEDIVDLRHDLARKEELIKRHYD-KIA-----VW--- 242
             L    RF L  L P+L   I  +D V ++ DL    EL        +A      W   
Sbjct: 97  ISLLNFARFYLPELFPKLEKVIYLDDDVVVQKDL---TELWNTDLGGGVAGAVETSWNPG 153

Query: 243 -----------QNLLSDLQGWAKSPAHQGSTSSASGTTPP 271
                      QN+    + W +    +  T     T PP
Sbjct: 154 VNVVNLTEWRRQNVTETYEKWMELNQEEEVTLWKLITLPP 193


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
           bacterial type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. This family
           represents the SMC protein of most bacteria. The smc
           gene is often associated with scpB (TIGR00281) and scpA
           genes, where scp stands for segregation and condensation
           protein. SMC was shown (in Caulobacter crescentus) to be
           induced early in S phase but present and bound to DNA
           throughout the cell cycle [Cellular processes, Cell
           division, DNA metabolism, Chromosome-associated
           proteins].
          Length = 1179

 Score = 35.8 bits (83), Expect = 0.043
 Identities = 33/186 (17%), Positives = 74/186 (39%), Gaps = 32/186 (17%)

Query: 64  GRSLSPLE------------KDEIK-LEIDQATLKFLDLARQMEAFFLQKRFLLSALKPE 110
            R L  LE            K E++ LE+    L+  +L  ++E                
Sbjct: 199 ERQLKSLERQAEKAERYKELKAELRELELALLVLRLEELREELEELQ------------- 245

Query: 111 LIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVST 169
              +E+    +++ +L  +L   EE ++    +++  +  + +LQ  L  L  E   +  
Sbjct: 246 ---EELKEAEEELEELTAELQELEEKLEELRLEVSELEEEIEELQKELYALANEISRLEQ 302

Query: 170 TLEKDEIKLEIDQATLKFLDLAR-QMEAFFLQKRFLLSALKPEL-IVKEDIVDLRHDLAR 227
             +    +L   +  L+ L+    ++E+   +    L+ L+ +L  +KE++  L  +L  
Sbjct: 303 QKQILRERLANLERQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEE 362

Query: 228 KEELIK 233
            E  ++
Sbjct: 363 LEAELE 368


>gnl|CDD|214826 smart00806, AIP3, Actin interacting protein 3.  Aip3p/Bud6p is a
           regulator of cell and cytoskeletal polarity in
           Saccharomyces cerevisiae that was previously identified
           as an actin-interacting protein. Actin-interacting
           protein 3 (Aip3p) localizes at the cell cortex where
           cytoskeleton assembly must be achieved to execute
           polarized cell growth, and deletion of AIP3 causes gross
           defects in cell and cytoskeletal polarity. Aip3p
           localization is mediated by the secretory pathway;
           mutations in early- or late-acting components of the
           secretory apparatus lead to Aip3p mislocalization.
          Length = 426

 Score = 35.4 bits (82), Expect = 0.053
 Identities = 31/129 (24%), Positives = 56/129 (43%), Gaps = 14/129 (10%)

Query: 64  GRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDI 123
            R+     K ++  + D    K  DL   +EA  L+K      ++P    K++  V K++
Sbjct: 204 NRAYVESSKKKLSEDSDSLLTKVDDLQDIIEA--LRKDVAQRGVRPSK--KQLETVQKEL 259

Query: 124 VDLRHDLARKEELIKR---HYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLE- 179
              R +L + EE I      + KI  W+  L  +    Q LT ++++   L++D  K E 
Sbjct: 260 ETARKELKKMEEYIDIEKPIWKKI--WEAELDKVCEEQQFLTLQEDLIADLKEDLEKAEE 317

Query: 180 ----IDQAT 184
               ++Q  
Sbjct: 318 TFDLVEQCC 326



 Score = 30.0 bits (68), Expect = 2.3
 Identities = 30/142 (21%), Positives = 59/142 (41%), Gaps = 30/142 (21%)

Query: 121 KDIVDLRHDLARKEELIKR-HYDKIAVWQNLLSDLQSCLQVLTKEDEVSTT--------- 170
            ++  L+ +LA    ++++ H       +  + D+   L+ + K    S +         
Sbjct: 155 AELKSLQRELA----VLRQTHNSFFTEIKESIKDI---LEKIDKFKSSSLSASGSSNRAY 207

Query: 171 LEKDEIKLEIDQATL--KFLDLARQMEAFFLQKRFLLSALKPEL----IVKEDIVDLRHD 224
           +E  + KL  D  +L  K  DL   +EA  L+K      ++P       V++++   R +
Sbjct: 208 VESSKKKLSEDSDSLLTKVDDLQDIIEA--LRKDVAQRGVRPSKKQLETVQKELETARKE 265

Query: 225 LARKEELIKR---HYDKIAVWQ 243
           L + EE I      + KI  W+
Sbjct: 266 LKKMEEYIDIEKPIWKKI--WE 285


>gnl|CDD|222374 pfam13779, DUF4175, Domain of unknown function (DUF4175). 
          Length = 820

 Score = 35.3 bits (82), Expect = 0.068
 Identities = 14/45 (31%), Positives = 14/45 (31%), Gaps = 3/45 (6%)

Query: 301 PGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGG 345
                Q   G   Q Q  Q  QQ      QQQG    G   G G 
Sbjct: 620 GEQQGQQGQGGQGQGQPGQQGQQ---GQGQQQGQQGQGGQGGQGS 661



 Score = 34.9 bits (81), Expect = 0.095
 Identities = 19/108 (17%), Positives = 25/108 (23%), Gaps = 36/108 (33%)

Query: 305 MQPMSGMPQQQQQVQMQQQIH-----MQHMQ----------QQGMGPGGPPSGPGGPSSG 349
           +Q       Q  Q +MQQ +      ++  Q          Q+                 
Sbjct: 573 LQV--TQGGQGGQSEMQQAMEGLGETLREQQGLSDETFRDLQEQFNAQRGEQQGQQ---- 626

Query: 350 MMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
               G GG   G            PG  G  G     G    G   G 
Sbjct: 627 ----GQGGQGQGQ-----------PGQQGQQGQGQQQGQQGQGGQGGQ 659



 Score = 29.5 bits (67), Expect = 4.1
 Identities = 10/48 (20%), Positives = 12/48 (25%)

Query: 310 GMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGG 357
               Q+ + Q QQ    Q   Q G           G        G G 
Sbjct: 614 QFNAQRGEQQGQQGQGGQGQGQPGQQGQQGQGQQQGQQGQGGQGGQGS 661


>gnl|CDD|240227 PTZ00009, PTZ00009, heat shock 70 kDa protein; Provisional.
          Length = 653

 Score = 35.2 bits (81), Expect = 0.072
 Identities = 12/35 (34%), Positives = 13/35 (37%), Gaps = 1/35 (2%)

Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGP 366
           Q  G G P   PGG   GM         G  + GP
Sbjct: 614 QAAGGGMPGGMPGGMPGGMPGGAGPAGAGASS-GP 647



 Score = 30.1 bits (68), Expect = 2.3
 Identities = 12/41 (29%), Positives = 12/41 (29%), Gaps = 2/41 (4%)

Query: 1   MMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 41
           M  M      G     P   P   PGG G  G       GP
Sbjct: 609 MTKMYQAAGGGMPGGMPGGMPGGMPGGAGPAG--AGASSGP 647



 Score = 30.1 bits (68), Expect = 2.3
 Identities = 12/41 (29%), Positives = 12/41 (29%), Gaps = 2/41 (4%)

Query: 350 MMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 390
           M  M      G     P   P   PGG G  G       GP
Sbjct: 609 MTKMYQAAGGGMPGGMPGGMPGGMPGGAGPAG--AGASSGP 647



 Score = 29.4 bits (66), Expect = 4.4
 Identities = 13/41 (31%), Positives = 14/41 (34%), Gaps = 6/41 (14%)

Query: 2   MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 42
           M+   GG   GG       P   PGGM G       G   G
Sbjct: 612 MYQAAGGGMPGG------MPGGMPGGMPGGAGPAGAGASSG 646



 Score = 29.4 bits (66), Expect = 4.4
 Identities = 13/41 (31%), Positives = 14/41 (34%), Gaps = 6/41 (14%)

Query: 351 MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 391
           M+   GG   GG       P   PGGM G       G   G
Sbjct: 612 MYQAAGGGMPGG------MPGGMPGGMPGGAGPAGAGASSG 646


>gnl|CDD|144972 pfam01576, Myosin_tail_1, Myosin tail.  The myosin molecule is a
           multi-subunit complex made up of two heavy chains and
           four light chains; it is a fundamental contractile
           protein found in all eukaryote cell types. This family
           consists of the coiled-coil myosin heavy chain tail
           region. The coiled-coil is composed of the tail from two
           molecules of myosin. These can then assemble into the
           macromolecular thick filament. The coiled-coil region
           provides the structural backbone of the thick filament.
          Length = 859

 Score = 35.0 bits (81), Expect = 0.085
 Identities = 38/151 (25%), Positives = 63/151 (41%), Gaps = 25/151 (16%)

Query: 118 MVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVL-TKEDEVSTTLEKDE- 175
            + +   +L + L RKE  + +   K+   Q L++ LQ  ++ L  +  E+   LE +  
Sbjct: 1   ELERQKRELENQLYRKESELSQLSSKLEDEQALVAQLQKKIKELEARIRELEEELEAERA 60

Query: 176 --IKLEIDQATLKFLDLARQMEAFFLQKRFL----LSALKPELIVKED--IVDLRHDL-- 225
              K E  +A     DL+R++E   L +R       +A + EL  K +  +  LR DL  
Sbjct: 61  ARAKAEKARA-----DLSRELEE--LSERLEEAGGATAAQIELNKKREAELAKLRKDLEE 113

Query: 226 ------ARKEELIKRHYDKIAVWQNLLSDLQ 250
                      L K+H D I      +  LQ
Sbjct: 114 ANLQHEEALATLRKKHQDAINELSEQIEQLQ 144


>gnl|CDD|218621 pfam05518, Totivirus_coat, Totivirus coat protein. 
          Length = 753

 Score = 34.8 bits (80), Expect = 0.086
 Identities = 25/121 (20%), Positives = 29/121 (23%), Gaps = 20/121 (16%)

Query: 275 PTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIH-----MQHM 329
                P +     P P    G     PG       GMP    +     +         H 
Sbjct: 631 IISGFPPVFKTALPRPDYNRGGEAGGPGVPGPVPVGMPAHTARPSRVARGDPVRPTAHHA 690

Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
                        PGGP           P GGG   PPP   A  G      +L      
Sbjct: 691 AL----RAPQAPRPGGP-----------PGGGGGLPPPPDLPAAAGPAPCGSSLIASPTA 735

Query: 390 P 390
           P
Sbjct: 736 P 736



 Score = 32.8 bits (75), Expect = 0.39
 Identities = 12/37 (32%), Positives = 13/37 (35%)

Query: 5   GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 41
             G P GGG   PPP   A  G      +L      P
Sbjct: 700 PGGPPGGGGGLPPPPDLPAAAGPAPCGSSLIASPTAP 736



 Score = 31.3 bits (71), Expect = 1.2
 Identities = 20/111 (18%), Positives = 25/111 (22%), Gaps = 4/111 (3%)

Query: 271 PNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQ 330
           P+        G    G    GM    A     +   P+   P          Q       
Sbjct: 646 PDYNRGGEAGGPGVPGPVPVGMPAHTARPSRVARGDPVR--PTAHHAALRAPQAPRP--G 701

Query: 331 QQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG 381
               G GG P  P  P++               A P P P       G   
Sbjct: 702 GPPGGGGGLPPPPDLPAAAGPAPCGSSLIASPTAPPEPEPPGAEQADGAEN 752


>gnl|CDD|227568 COG5243, HRD1, HRD ubiquitin ligase complex, ER membrane component
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 491

 Score = 34.6 bits (79), Expect = 0.091
 Identities = 24/68 (35%), Positives = 31/68 (45%), Gaps = 9/68 (13%)

Query: 243 QNLLSDLQGWAKSPAHQ-GSTSSASGTTPPNSTPTQSGPGISAMGGPL--------PGMM 293
           Q+L S + GW   P       S ++ TT P++TPT   P  S  GGP         P   
Sbjct: 412 QDLSSVIPGWTMLPIPGTRRISQSTSTTNPSATPTTGDPSNSTYGGPQTFPNSGNNPNFN 471

Query: 294 GGMAPIVP 301
            G+A IVP
Sbjct: 472 RGIAGIVP 479


>gnl|CDD|219133 pfam06682, DUF1183, Protein of unknown function (DUF1183).  This
           family consists of several eukaryotic proteins of around
           360 residues in length. The function of this family is
           unknown.
          Length = 317

 Score = 33.9 bits (78), Expect = 0.11
 Identities = 21/67 (31%), Positives = 23/67 (34%)

Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
              G+  G  P   G    G    G GG  G G   PPP   +      GPG     G G
Sbjct: 179 SCGGVRGGPRPERAGYGGGGGGGGGGGGGGGSGPGPPPPGFKSSFPPPYGPGAGPSSGYG 238

Query: 390 PGGLLQG 396
            GG   G
Sbjct: 239 SGGTRSG 245



 Score = 28.9 bits (65), Expect = 5.1
 Identities = 14/36 (38%), Positives = 16/36 (44%)

Query: 2   MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPG 37
            F+  GG RGG       +   G GG GG G  G G
Sbjct: 176 FFLSCGGVRGGPRPERAGYGGGGGGGGGGGGGGGSG 211



 Score = 28.9 bits (65), Expect = 5.1
 Identities = 14/36 (38%), Positives = 16/36 (44%)

Query: 351 MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPG 386
            F+  GG RGG       +   G GG GG G  G G
Sbjct: 176 FFLSCGGVRGGPRPERAGYGGGGGGGGGGGGGGGSG 211


>gnl|CDD|215618 PLN03184, PLN03184, chloroplast Hsp70; Provisional.
          Length = 673

 Score = 34.4 bits (79), Expect = 0.12
 Identities = 19/60 (31%), Positives = 24/60 (40%), Gaps = 5/60 (8%)

Query: 299 IVPGSTMQPMSGMPQQQQQV-QMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGG 357
           I  GST +    M    Q+V Q+ Q ++     Q G G  GP  G    SS     G  G
Sbjct: 608 IASGSTQKMKDAMAALNQEVMQIGQSLY----NQPGAGGAGPAPGGEAGSSSSSSSGGDG 663


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 34.5 bits (79), Expect = 0.13
 Identities = 26/143 (18%), Positives = 38/143 (26%), Gaps = 6/143 (4%)

Query: 255  SPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLP--GMMGGMAPIVPGSTMQPMSGMP 312
             P    +  +A    P    P+    G  A GG +         A         P+  + 
Sbjct: 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLA 2887

Query: 313  QQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
            +       +         ++   P  PP     P           P       PP  P+ 
Sbjct: 2888 RPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTT 2947

Query: 373  GPGGMGGPGNLGP----GGMGPG 391
             P G G P    P    G + PG
Sbjct: 2948 DPAGAGEPSGAVPQPWLGALVPG 2970


>gnl|CDD|214710 smart00533, MUTSd, DNA-binding domain of DNA mismatch repair MUTS
           family. 
          Length = 308

 Score = 33.0 bits (76), Expect = 0.21
 Identities = 29/187 (15%), Positives = 68/187 (36%), Gaps = 27/187 (14%)

Query: 66  SLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKR-FLLSALKPELIVKEVNMVTKDIV 124
                   ++ L +  +     ++ + +E+        LL  +   L+     ++   + 
Sbjct: 72  ERGRASPRDL-LRLYDSLEGLKEIRQLLESLDGPLLGLLLKVILEPLLELLELLLEL-LN 129

Query: 125 DLRHDLARKEELIKRHYD-KIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQA 183
           D          LIK  +D ++   +  L +L         E+E+   L+K+  +L ID  
Sbjct: 130 DDDPLEVNDGGLIKDGFDPELDELREKLEEL---------EEELEELLKKEREELGID-- 178

Query: 184 TLKFLDLARQMEAFFLQKRFLLSALKPELIVK-----------EDIVDLRHDLARKEELI 232
           +LK L   +    +    +     +  + I +            ++ +L ++L   +E I
Sbjct: 179 SLK-LGYNKVHGYYIEVTKSEAKKVPKDFIRRSSLKNTERFTTPELKELENELLEAKEEI 237

Query: 233 KRHYDKI 239
           +R   +I
Sbjct: 238 ERLEKEI 244



 Score = 31.1 bits (71), Expect = 0.94
 Identities = 35/188 (18%), Positives = 59/188 (31%), Gaps = 55/188 (29%)

Query: 67  LSPL-EKDEIKLEIDQATLKFLDLARQ-MEAFFLQKRFLLSALKPELIVKEVNMVTKDIV 124
           L PL +  EI               R       ++   L   L+  L         K I 
Sbjct: 25  LQPLLDLKEIN-------------ERLDAVEELVENPELRQKLRQLL---------KRIP 62

Query: 125 DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEK-DEIKLEIDQA 183
           DL       E L+ R        +   +  +  L++         +LE   EI+  ++  
Sbjct: 63  DL-------ERLLSR-------IERGRASPRDLLRLYD-------SLEGLKEIRQLLESL 101

Query: 184 TLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKRHYD-KIAVW 242
               L L        L K  L   L+   ++ E + D          LIK  +D ++   
Sbjct: 102 DGPLLGL--------LLKVILEPLLELLELLLELLNDDDPLEVNDGGLIKDGFDPELDEL 153

Query: 243 QNLLSDLQ 250
           +  L +L+
Sbjct: 154 REKLEELE 161


>gnl|CDD|116042 pfam07421, Pro-NT_NN, Neurotensin/neuromedin N precursor.  This
           family contains the precursor of bacterial
           neurotensin/neuromedin N (approximately 170 residues
           long). This is the common precursor of two biologically
           active related peptides, neurotensin and neuromedin N.
           It undergoes tissue-specific processing leading to the
           formation in some tissues and cancer cell lines of large
           peptides ending with the neurotensin or neuromedin N
           sequence.
          Length = 169

 Score = 32.3 bits (73), Expect = 0.21
 Identities = 35/123 (28%), Positives = 54/123 (43%), Gaps = 17/123 (13%)

Query: 121 KDIVDLRHDLARKEELIKRHYDKIA-----VWQNLLSDLQSCLQVLTKEDEVSTTLEKDE 175
           +D+  L  DL     L   H  KI+      W+  L ++ S +  L  + E +  +  D+
Sbjct: 27  EDVRALEADL-----LTNMHTSKISKASPPSWKMTLLNVCSLINNLNSQAEEAGEMHDDD 81

Query: 176 I----KLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEEL 231
           +    KL      L    L   +  F LQK     A +   I++EDI+D  +D   KEE+
Sbjct: 82  LVTKRKLP---LVLDGFSLEAMLTIFQLQKICRSRAFQHWEIIQEDILDAGNDKNEKEEV 138

Query: 232 IKR 234
           IKR
Sbjct: 139 IKR 141


>gnl|CDD|221930 pfam13135, DUF3947, Protein of unknown function (DUF3947).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria. Proteins in
           this family are approximately 80 amino acids in length.
          Length = 76

 Score = 30.6 bits (69), Expect = 0.23
 Identities = 16/40 (40%), Positives = 20/40 (50%), Gaps = 7/40 (17%)

Query: 303 STMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSG 342
           ST+Q +       Q +QMQQQ+    MQQQG  P  P   
Sbjct: 22  STIQAV------HQAMQMQQQMQ-PAMQQQGQQPYYPSVE 54


>gnl|CDD|236802 PRK10942, PRK10942, serine endoprotease; Provisional.
          Length = 473

 Score = 33.2 bits (76), Expect = 0.24
 Identities = 28/100 (28%), Positives = 37/100 (37%), Gaps = 7/100 (7%)

Query: 261 STSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPI-VPGSTMQPMSGMPQQQQQVQ 319
           S  SA+     ++T  Q  P ++ M   L  +M  +  I V GST      MP+Q QQ  
Sbjct: 19  SPLSATAAETSSATTAQQMPSLAPM---LEKVMPSVVSINVEGSTTVNTPRMPRQFQQFF 75

Query: 320 MQQQIHMQH---MQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
                  Q     Q      GG     GG     M +G G
Sbjct: 76  GDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSG 115


>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
           Mitofilin controls mitochondrial cristae morphology.
           Mitofilin is enriched in the narrow space between the
           inner boundary and the outer membranes, where it forms a
           homotypic interaction and assembles into a large
           multimeric protein complex. The first 78 amino acids
           contain a typical amino-terminal-cleavable mitochondrial
           presequence rich in positively charged and hydroxylated
           residues and a membrane anchor domain. In addition, it
           has three centrally located coiled coil domains.
          Length = 493

 Score = 33.1 bits (76), Expect = 0.29
 Identities = 22/129 (17%), Positives = 48/129 (37%), Gaps = 20/129 (15%)

Query: 119 VTKDIVDLRHDLARKEELIKRHYDKIAVW--QNLLSDL-------QSCLQVLTKEDEVST 169
           + ++++         +EL+    D I      NL  DL       +  L  L+K+     
Sbjct: 124 LLEELLKETASDPVVQELVSIFNDLIDSIKEDNLKDDLESLIASAKEELDQLSKKLAELK 183

Query: 170 TLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKE 229
             E++E++  + +   + L    +           L A   E         LR +  R++
Sbjct: 184 AEEEEELERALKEKREELLSKLEE----------ELLARL-ESKEAALEKQLRLEFEREK 232

Query: 230 ELIKRHYDK 238
           E +++ Y++
Sbjct: 233 EELRKKYEE 241


>gnl|CDD|179382 PRK02195, PRK02195, V-type ATP synthase subunit D; Provisional.
          Length = 201

 Score = 32.2 bits (74), Expect = 0.30
 Identities = 32/165 (19%), Positives = 59/165 (35%), Gaps = 46/165 (27%)

Query: 70  LEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLS-ALKPELIVKEVNMVTKDIVDLRH 128
           L K+ +K +  Q  LK L            +R+L +  LK   +  EV     +  +L  
Sbjct: 7   LTKNSLKKQKKQ--LKML------------ERYLPTLKLKKAQLQAEVRRAKAEAAELE- 51

Query: 129 DLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEK---------DEIKLE 179
                ++L +     I+++   L   +  ++V     +V    E          D I+ E
Sbjct: 52  --QEYQKLRQAIEAWISLFSEPLYFDEDLIKV----KKVEKDYENIAGVEVPILDSIEFE 105

Query: 180 ------------IDQATLKFLDLAR-QMEAFFLQKR--FLLSALK 209
                       +D       +L + ++EA  LQ+R   L   L+
Sbjct: 106 IIEYSLLNTPIWVDTGIELLKELVQLKIEAEVLQERLLLLEEELR 150


>gnl|CDD|192930 pfam12066, DUF3546, Domain of unknown function (DUF3546).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 93 and 114 amino acids in length. This domain has
           two completely conserved Y residues that may be
           functionally important.
          Length = 110

 Score = 31.2 bits (71), Expect = 0.32
 Identities = 17/96 (17%), Positives = 38/96 (39%), Gaps = 22/96 (22%)

Query: 160 VLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKR---FLLSALKPELIVKE 216
           + +++D++S      E +    +   +F    +Q++ FF Q +   +      PE + K 
Sbjct: 6   LESQDDDISPA----EAEKRYQEYKTEFR--RKQLQDFFDQHKDEEWFREKYHPEELAK- 58

Query: 217 DIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQGW 252
              + R +L +          ++ V+  LL    G 
Sbjct: 59  -RREERRELRKN---------RLNVFLELLES--GT 82


>gnl|CDD|173957 cd08198, DHQS-like2, Dehydroquinate synthase (DHQS)-like. DHQS
           catalyzes the conversion of DAHP to DHQ in the shikimate
           pathway for aromatic compound synthesis.
           Dehydroquinate synthase-like proteins. Dehydroquinate
           synthase (DHQS) catalyzes the conversion of
           3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to
           dehydroquinate (DHQ) in the second step of the shikimate
           pathway. This pathway involves seven sequential
           enzymatic steps in the conversion of erythrose
           4-phosphate and phosphoenolpyruvate into chorismate for
           subsequent synthesis of aromatic compounds. The activity
           of DHQS requires NAD as cofactor. Proteins of this
           family share sequence similarity and functional motifs
           with dehydroquinate synthase, but their specific
           function has not been characterized.
          Length = 369

 Score = 32.6 bits (75), Expect = 0.35
 Identities = 23/89 (25%), Positives = 34/89 (38%), Gaps = 8/89 (8%)

Query: 180 IDQATLKFL-DLARQMEAFFLQKRFLLSALKPELIV------KEDIVDLRHDLARKEEL- 231
           ID    +    LA  ++A+       L  + P  IV      K D   +    A      
Sbjct: 37  IDSGVAQANPQLASDIQAYAAAHADALRLVAPPHIVPGGEACKNDPDLVEALHAAINRHG 96

Query: 232 IKRHYDKIAVWQNLLSDLQGWAKSPAHQG 260
           I RH   IA+    + D  G+A + AH+G
Sbjct: 97  IDRHSYVIAIGGGAVLDAVGYAAATAHRG 125


>gnl|CDD|205922 pfam13748, ABC_membrane_3, ABC transporter transmembrane region.
           This family represents a unit of six transmembrane
           helices.
          Length = 237

 Score = 32.2 bits (74), Expect = 0.35
 Identities = 15/61 (24%), Positives = 27/61 (44%), Gaps = 12/61 (19%)

Query: 97  FLQKRFLL-SALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQ 155
           F ++   L   L   L  KEV ++ +          RK   ++RHY  ++  +  LSD +
Sbjct: 158 FARRNERLNGRLNNRL-EKEVGLIER----------RKPSALRRHYRALSRLRIRLSDRE 206

Query: 156 S 156
           +
Sbjct: 207 A 207


>gnl|CDD|217899 pfam04108, APG17, Autophagy protein Apg17.  Apg17 is required for
           activating Apg1 protein kinases.
          Length = 408

 Score = 32.7 bits (75), Expect = 0.38
 Identities = 26/161 (16%), Positives = 54/161 (33%), Gaps = 34/161 (21%)

Query: 121 KDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSC-------LQVLTKED-------- 165
            ++  L H+LA   E +  H+D+                     L+VL  +         
Sbjct: 195 SELNSLEHELADLLESLTNHFDQCVTAVKHTEGDPLDDAEYDELLEVLKNDAAELPDVVK 254

Query: 166 EVSTTLEKDEIKLEIDQATLKFLDLARQMEAFF---------LQK------RFLLSALKP 210
           E+ T +  DEI+    +          ++E            L+K      R+L      
Sbjct: 255 ELHTVI--DEIENNEKRVKKFLSSHMSKIEELHSATKELLEELEKYKERLPRYLAIFADI 312

Query: 211 ELIVKEDIVDLRHDLARKEELIKRHYDK-IAVWQNLLSDLQ 250
             + ++    ++  +    EL    YD  +  ++ LL +++
Sbjct: 313 RALWEDFKEPIQQYIQELSEL-CEFYDNFLNSYKGLLLEVE 352


>gnl|CDD|237015 PRK11901, PRK11901, hypothetical protein; Reviewed.
          Length = 327

 Score = 32.3 bits (74), Expect = 0.38
 Identities = 25/96 (26%), Positives = 34/96 (35%), Gaps = 16/96 (16%)

Query: 247 SDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQ 306
           S+  G  K+    GS+S +SG     S    +  G  A G            I    +  
Sbjct: 70  SNNAGAEKNIDLSGSSSLSSGNQSSPSAANNTSDGHDASG---VKNTAPPQDI----SAP 122

Query: 307 PMSGMPQQQQQVQM---QQQIHMQH------MQQQG 333
           P+S  P Q    Q    QQ+I +         QQQG
Sbjct: 123 PISPTPTQAAPPQTPNGQQRIELPGNISDALSQQQG 158


>gnl|CDD|197874 smart00787, Spc7, Spc7 kinetochore protein.  This domain is found
           in cell division proteins which are required for
           kinetochore-spindle association.
          Length = 312

 Score = 32.3 bits (74), Expect = 0.40
 Identities = 21/128 (16%), Positives = 50/128 (39%), Gaps = 3/128 (2%)

Query: 125 DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQAT 184
            L+  L    E +K  Y  +     LL+ ++   ++  ++D +   L +   +LE +   
Sbjct: 144 GLKEGLDENLEGLKEDYKLLMKELELLNSIK--PKLRDRKDALEEELRQ-LKQLEDELED 200

Query: 185 LKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKRHYDKIAVWQN 244
               +L R  E      + ++  +K    ++E++ +L   +            +IA  + 
Sbjct: 201 CDPTELDRAKEKLKKLLQEIMIKVKKLEELEEELQELESKIEDLTNKKSELNTEIAEAEK 260

Query: 245 LLSDLQGW 252
            L   +G+
Sbjct: 261 KLEQCRGF 268


>gnl|CDD|191111 pfam04849, HAP1_N, HAP1 N-terminal conserved region.  This family
           represents an N-terminal conserved region found in
           several huntingtin-associated protein 1 (HAP1)
           homologues. HAP1 binds to huntingtin in a polyglutamine
           repeat-length-dependent manner. However, its possible
           role in the pathogenesis of Huntington's disease is
           unclear. This family also includes a similar N-terminal
           conserved region from hypothetical protein products of
           ALS2CR3 genes found in the human juvenile amyotrophic
           lateral sclerosis critical region 2q33-2q34.
          Length = 307

 Score = 32.1 bits (73), Expect = 0.41
 Identities = 42/179 (23%), Positives = 65/179 (36%), Gaps = 55/179 (30%)

Query: 108 KPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEV 167
           K E + +++     +I+ LRH+L  K+EL                     LQ  +  DE 
Sbjct: 99  KNEKLEEQLGKARDEILQLRHELNLKDEL---------------------LQFYSDADEE 137

Query: 168 STTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLL------------SALKPELIVK 215
           S   E  E      Q +        Q+EA  LQ++  L            S LK E +  
Sbjct: 138 SED-ESSESTPLRPQESSSSSHGCFQLEA--LQEKLKLLEEENEHLRSEASHLKTETVTY 194

Query: 216 ED-------------------IVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQGWAKS 255
           E+                   I  L  +LA+K E ++R  ++I    + + DLQ   KS
Sbjct: 195 EEKEQQLVNDCVKQLREANDQIASLSEELAKKTEDLERQQEEITHLLSQIVDLQKKCKS 253


>gnl|CDD|218112 pfam04497, Pox_E2-like, Poxviridae protein.  This family of
           proteins is restricted to Poxviridae. It contains a
           number of differently named uncharacterized proteins.
          Length = 727

 Score = 32.8 bits (75), Expect = 0.43
 Identities = 11/46 (23%), Positives = 19/46 (41%), Gaps = 1/46 (2%)

Query: 100 KRFLLSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIA 145
           +   L      ++V    M  +DIVD   D    + L+K+  D + 
Sbjct: 344 RIKSLPIHSRLVMVMCEEMGYEDIVDFL-DNLDVDTLVKKGADPLT 388


>gnl|CDD|236598 PRK09631, PRK09631, DNA topoisomerase IV subunit A; Provisional.
          Length = 635

 Score = 32.4 bits (74), Expect = 0.50
 Identities = 32/150 (21%), Positives = 60/150 (40%), Gaps = 57/150 (38%)

Query: 135 ELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQM 194
           ++IK H +           LQ   +VL  E E    LE+ ++  +I          A+ +
Sbjct: 311 DIIKFHAEH----------LQ---KVLKMELE----LERAKLLEKI---------FAKTL 344

Query: 195 EAFFLQKRF----------------LLSALKP---EL---IVKEDIVDL---------RH 223
           E  F+++R                 +LS LKP   EL   + +EDI +L           
Sbjct: 345 EQIFIEERIYKRIETISSEEDVISIVLSELKPFKEELSRDVTEEDIENLLKIPIRRISLF 404

Query: 224 DLARKEELIKRHYDKIAVWQNLLSDLQGWA 253
           D+ + ++ I+    ++   +  L  ++G+A
Sbjct: 405 DIDKNQKEIRILNKELKSVEKNLKSIKGYA 434


>gnl|CDD|219420 pfam07466, DUF1517, Protein of unknown function (DUF1517).  This
           family consists of several hypothetical glycine rich
           plant and bacterial proteins of around 300 residues in
           length. The function of this family is unknown.
          Length = 280

 Score = 31.9 bits (73), Expect = 0.56
 Identities = 18/49 (36%), Positives = 18/49 (36%), Gaps = 2/49 (4%)

Query: 333 GMGPGGPPSG--PGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGG 379
           G G    PS       SS     G  G  GGG   P   P  G GG GG
Sbjct: 9   GGGSFRAPSRSSSSPRSSSPGGGGYYGSPGGGFGFPFLIPFFGFGGGGG 57



 Score = 28.4 bits (64), Expect = 6.4
 Identities = 17/52 (32%), Positives = 19/52 (36%), Gaps = 2/52 (3%)

Query: 344 GGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP-GGMGPGGLL 394
           GG S          PR     G   + S G GG G P  +   G  G GGL 
Sbjct: 9   GGGSFRAPSRSSSSPRSSSPGGGGYYGSPG-GGFGFPFLIPFFGFGGGGGLF 59


>gnl|CDD|221868 pfam12938, M_domain, M domain of GW182. 
          Length = 238

 Score = 31.8 bits (72), Expect = 0.58
 Identities = 24/110 (21%), Positives = 31/110 (28%), Gaps = 15/110 (13%)

Query: 294 GGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGM--- 350
            GM    P    +  SG           Q     ++   G GPGG   G     + +   
Sbjct: 3   SGMGFAGPFGGDRFPSG-GSSVNSPPFSQNNLPNNLGGGGGGPGGGGGGNNPNLASLSSL 61

Query: 351 -----------MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
                      +   P G  GG  AG P     G G    P N+ P    
Sbjct: 62  TSQGLGKILSGLQPPPLGNGGGSGAGGPGPVGGGGGPGVAPNNIQPNAQA 111



 Score = 27.9 bits (62), Expect = 8.1
 Identities = 12/37 (32%), Positives = 13/37 (35%)

Query: 4   MGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 40
             P G  GG  AG P     G G    P N+ P    
Sbjct: 75  PPPLGNGGGSGAGGPGPVGGGGGPGVAPNNIQPNAQA 111


>gnl|CDD|232933 TIGR00348, hsdR, type I site-specific deoxyribonuclease, HsdR
           family.  This gene is part of the type I restriction and
           modification system which is composed of three
           polypeptides R (restriction endonuclease), M
           (modification) and S (specificity). This group of
           enzymes recognize specific short DNA sequences and have
           an absolute requirement for ATP (or dATP) and
           S-adenosyl-L-methionine. They also catalyse the
           reactions of EC 2.1.1.72 and EC 2.1.1.73, with similar
           site specificity. (J. Mol. Biol. 271 (3), 342-348
           (1997)). Members of this family are assumed to differ
           from each other in DNA site specificity [DNA metabolism,
           Restriction/modification].
          Length = 667

 Score = 32.0 bits (73), Expect = 0.66
 Identities = 25/107 (23%), Positives = 45/107 (42%), Gaps = 11/107 (10%)

Query: 48  PLAYLEKTTSN-IGLPDGRSLS--PLE---KDEIKLEID-QATLK-FLDLARQMEAFFLQ 99
           P+   ++ TS       GR L    +    +D + ++ID +  L       ++++AFF +
Sbjct: 402 PIFKKDRDTSLTFAYVFGRYLHRYFITDAIRDGLTVKIDYEDRLPEDHLDKKKLDAFFDE 461

Query: 100 KRFLLS-ALKP--ELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDK 143
              LL   ++   +  +KE    TK I+     L    + I  HY K
Sbjct: 462 IFELLPERIREITKESLKEKLQKTKKILFNEDRLESIAKDIAEHYAK 508


>gnl|CDD|153280 cd07596, BAR_SNX, The Bin/Amphiphysin/Rvs (BAR) domain of Sorting
           Nexins.  BAR domains are dimerization, lipid binding and
           curvature sensing modules found in many different
           proteins with diverse functions. Sorting nexins (SNXs)
           are Phox homology (PX) domain containing proteins that
           are involved in regulating membrane traffic and protein
           sorting in the endosomal system. SNXs differ from each
           other in their lipid-binding specificity, subcellular
           localization and specific function in the endocytic
           pathway. A subset of SNXs also contain BAR domains. The
           PX-BAR structural unit determines the specific membrane
           targeting of SNXs. BAR domains form dimers that bind to
           membranes, induce membrane bending and curvature, and
           may also be involved in protein-protein interactions.
          Length = 218

 Score = 31.2 bits (71), Expect = 0.69
 Identities = 15/94 (15%), Positives = 38/94 (40%), Gaps = 11/94 (11%)

Query: 112 IVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVSTT 170
            +  +  + KD+   +  L + +        K+   +  L + +S L+   K  +E+S  
Sbjct: 115 ALLTLQSLKKDLASKKAQLEKLKAAPGIKPAKVEELEEELEEAESALEEARKRYEEISER 174

Query: 171 LEKDEIKLEID-----QATLK-----FLDLARQM 194
           L+++  +   +     +A LK      +  A ++
Sbjct: 175 LKEELKRFHEERARDLKAALKEFARLQVQYAEKI 208


>gnl|CDD|218116 pfam04503, SSDP, Single-stranded DNA binding protein, SSDP.  This
           is a family of eukaryotic single-stranded DNA binding
           proteins with specificity to a pyrimidine-rich element
           found in the promoter region of the alpha2(I) collagen
           gene.
          Length = 293

 Score = 31.6 bits (71), Expect = 0.70
 Identities = 43/135 (31%), Positives = 53/135 (39%), Gaps = 12/135 (8%)

Query: 261 STSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQM 320
           S     G  PP   P Q   G+    G  P + GGM P V       M G    Q+    
Sbjct: 73  SPRYPGGPRPPLRMPNQPPGGVP---GSQPLLPGGMDPTVRQQGHPNMGG--PMQRMTPP 127

Query: 321 QQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPR---GGGNAGPPPFPSAGPGGM 377
           +    +   Q  G G   PP+   GP+   M MGPG  R      +A   P+ S+ PG  
Sbjct: 128 RGMKSLDGPQNYGGGMRPPPNSLLGPAMPGMNMGPGLGRPWPNPISANSIPYSSSSPGEY 187

Query: 378 GGPGNLGPGGMGPGG 392
            GP    PGG GP G
Sbjct: 188 TGP----PGGGGPPG 198


>gnl|CDD|219419 pfam07462, MSP1_C, Merozoite surface protein 1 (MSP1) C-terminus.
           This family represents the C-terminal region of
           merozoite surface protein 1 (MSP1), which is found in a
           number of Plasmodium species. MSP-1 is a 200-kDa protein
           expressed on the surface of the P. vivax merozoite.
           MSP-1 of Plasmodium species is synthesised as a
           high-molecular-weight precursor and then processed into
           several fragments. At the time of red cell invasion by
           the merozoite, only the 19-kDa C-terminal fragment
           (MSP-119), which contains two epidermal growth
           factor-like domains, remains on the surface. Antibodies
           against MSP-119 inhibit merozoite entry into red cells,
           and immunisation with MSP-119 protects monkeys from
           challenging infections. Hence, MSP-119 is considered a
           promising vaccine candidate.
          Length = 574

 Score = 31.8 bits (72), Expect = 0.72
 Identities = 59/293 (20%), Positives = 104/293 (35%), Gaps = 57/293 (19%)

Query: 67  LSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDL 126
           L+  + + + LEI         L   ++  +   R+    LK E +  +   + +  + +
Sbjct: 51  LTETKVNALYLEIAH-------LKELLQHSY--DRYYKYKLKLERLYNKKEQIGQSKMQI 101

Query: 127 RHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLT------KEDE---VSTTLEKDEIK 177
           +     KE L KR        +N L++    L   +      +E E   V  TL+  +I 
Sbjct: 102 KKLTLLKERLEKR--------KNSLNNPFYVLSNFSNFFNKKREAEKQEVENTLKNTDIL 153

Query: 178 LEIDQATLKFL-----------DLARQMEAFFL--QKRFLLSALKPEL-----IVKEDIV 219
           L+  +A +K+            +++ Q E  +L  +K  +LS L+  L     + KE I 
Sbjct: 154 LKYYKARVKYYTGEAVPLKTLSEVSIQREDNYLNLEKFRVLSRLEGRLKKNINLGKEKIS 213

Query: 220 ----DLRHDLARKEELIKR-------HYDKIAVWQNLLSDLQGWAKSPAHQGSTSSASGT 268
                L H     +ELIK        + +        L   +        Q      +  
Sbjct: 214 YLSSGLHHVFTELKELIKNKNYTGNTNPENNPEVNEALEQYKELLPKGTTQ-EAKVTTVV 272

Query: 269 TPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQ 321
           TPP +    S   +   G           P   GS + P +   + QQ VQ+Q
Sbjct: 273 TPPQADAAPSPLSVRPAGSSGSASGSTQIP-TSGSVLGPGAAATELQQVVQLQ 324


>gnl|CDD|236092 PRK07772, PRK07772, single-stranded DNA-binding protein;
           Provisional.
          Length = 186

 Score = 30.8 bits (70), Expect = 0.75
 Identities = 18/56 (32%), Positives = 20/56 (35%)

Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
           GG   G GG   G    G GG  GGG   P    +        P +  P   G GG
Sbjct: 124 GGGGGGGGGFGGGGGGSGGGGGGGGGGGAPGGGGAQASAPADDPWSSAPASGGFGG 179



 Score = 28.8 bits (65), Expect = 3.2
 Identities = 17/51 (33%), Positives = 20/51 (39%), Gaps = 1/51 (1%)

Query: 335 GPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP 385
           G GG   G GG   G    G GG +        P+ SA   G  G G+  P
Sbjct: 135 GGGGGSGGGGGGGGGGGAPGGGGAQASA-PADDPWSSAPASGGFGGGDDEP 184


>gnl|CDD|217789 pfam03915, AIP3, Actin interacting protein 3. 
          Length = 424

 Score = 31.5 bits (72), Expect = 0.79
 Identities = 32/122 (26%), Positives = 56/122 (45%), Gaps = 14/122 (11%)

Query: 72  KDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLA 131
           K ++  + D    K  DL   +EA  L+K      ++P    K++  V K+I     +L 
Sbjct: 208 KKKLSEDSDSLLTKVDDLQDIIEA--LRKDVAQRGVRPGP--KQLETVQKEIQKAEKELK 263

Query: 132 RKEELIKR---HYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLE-----IDQA 183
           + EE IKR    + KI  W++ L  +    Q LT ++++   L+ D  K E     ++Q 
Sbjct: 264 KMEEYIKREKPVWKKI--WESELDKVCEEQQFLTLQEDLIADLQDDLEKAEETFDLVEQC 321

Query: 184 TL 185
           + 
Sbjct: 322 SE 323


>gnl|CDD|221143 pfam11593, Med3, Mediator complex subunit 3 fungal.  Mediator is a
           large complex of up to 33 proteins that is conserved
           from plants to fungi to humans - the number and
           representation of individual subunits varying with
           species. It is arranged into four different sections, a
           core, a head, a tail and a kinase-activity part, and the
           number of subunits within each of these is what varies
           with species. Overall, Mediator regulates the
           transcriptional activity of RNA polymerase II but it
           would appear that each of the four different sections
           has a slightly different function. Mediator subunit
           Hrs1/Med3 is a physical target for Cyc8-Tup1, a yeast
           transcriptional co-repressor.
          Length = 381

 Score = 31.5 bits (71), Expect = 0.83
 Identities = 20/101 (19%), Positives = 33/101 (32%), Gaps = 13/101 (12%)

Query: 250 QGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAM------GGPLPGMMGGMAPI---- 299
            G A +   Q S  + +  +  N   +   P  ++M        PL  ++ G++P     
Sbjct: 203 TGPAAAAKAQASAQAQAQASAYNQMGSLGVPQNTSMLAQIPNPTPLMQLLNGVSPNNAMA 262

Query: 300 VPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPP 340
            P + M PM  + Q   Q    Q   M      G       
Sbjct: 263 SPLNNMSPMRNLNQMGNQNNGGQ---MTPSANNGNMNNQSR 300


>gnl|CDD|236941 PRK11634, PRK11634, ATP-dependent RNA helicase DeaD; Provisional.
          Length = 629

 Score = 31.4 bits (71), Expect = 0.91
 Identities = 22/88 (25%), Positives = 30/88 (34%), Gaps = 2/88 (2%)

Query: 303 STMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGG 362
           ST++   GMP +  Q   + +I  + M  Q +G   P +G      G  F G    R GG
Sbjct: 529 STIELPKGMPGEVLQHFTRTRILNKPMNMQLLGDAQPHTGGERRGGGRGFGGER--REGG 586

Query: 363 NAGPPPFPSAGPGGMGGPGNLGPGGMGP 390
                     G G           G  P
Sbjct: 587 RNFSGERREGGRGDGRRFSGERREGRAP 614


>gnl|CDD|240614 cd12794, Hsm3_like, Hsm3 is a yeast Proteasome chaperone of the
           19S regulatory particle and related proteins.  This
           group contains proteins related to the Hsm3 protein
           (Yeast Proteasome Interacting Protein) of Saccharomyces
           cerevisiae. S. cerevisiae Hsm3 is a chaperone of
           regulatory particles involved in proteasome assembly.
           The 26S Proteasome is a large, 2.5 MDa complex comprised
           of at least 33 subunits, and relies on chaperones to
           facilitate correct assembly. The proteasome contains a
           cylindrical 20S core particle and 1-2 19S regulatory
           particles, comprised of AAA-ATPase and non-ATPase
           subunits. The proteasome acts in ubiquitin-dependent
           proteolysis. The 19S RP targets and opens the
           ubiquitin-tagged substrate and releases ubiquitin. Hsm3
           acts as a 19S chaperone, binding to the C-terminal
           domain of Rpt1 (one of the 6 ATPase subunits of the 19S
           regulatory particle). Hsm3 has a C-shape composed of
           11 HEAT repeats. Mutations in the Hsm3-Rpt interface
           disrupt formation of the 26 S Proteasome complex.
          Length = 455

 Score = 31.5 bits (72), Expect = 0.93
 Identities = 17/74 (22%), Positives = 33/74 (44%), Gaps = 5/74 (6%)

Query: 148 QNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSA 207
           QNLL  L + L+       ++  ++K    L +   T   +DL + + A    K  LL  
Sbjct: 2   QNLLDHLNTALETDPLPPVINKLIDK--CSLNLKTITSLPVDLKQLLPAI---KSILLDN 56

Query: 208 LKPELIVKEDIVDL 221
              E++  + +++L
Sbjct: 57  ESYEILDYDLLLEL 70


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 31.5 bits (72), Expect = 0.96
 Identities = 18/124 (14%), Positives = 24/124 (19%), Gaps = 3/124 (2%)

Query: 253 AKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGG-MAPIVPGSTMQPMSGM 311
            K  A   ++    G          + P  +          G   A   P     P +G 
Sbjct: 654 PKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAG- 712

Query: 312 PQQQQQVQMQQQIHMQHMQQQGMGPGGP-PSGPGGPSSGMMFMGPGGPRGGGNAGPPPFP 370
                  Q  Q                P P  P  P           P         P  
Sbjct: 713 QADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAA 772

Query: 371 SAGP 374
           +  P
Sbjct: 773 APPP 776


>gnl|CDD|214832 smart00817, Amelin, Ameloblastin precursor (Amelin).  This family
           consists of several mammalian Ameloblastin precursor
           (Amelin) proteins. Matrix proteins of tooth enamel
           consist mainly of amelogenin but also of non-amelogenin
           proteins, which, although their volumetric percentage is
           low, have an important role in enamel mineralisation.
           One of the non-amelogenin proteins is ameloblastin, also
           known as amelin and sheathlin. Ameloblastin (AMBN) is
           one of the enamel sheath proteins which is thought to
           have a role in determining the prismatic structure of
           growing enamel crystals.
          Length = 411

 Score = 31.4 bits (71), Expect = 0.98
 Identities = 37/151 (24%), Positives = 49/151 (32%), Gaps = 10/151 (6%)

Query: 240 AVWQNLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMM--GGMA 297
           A+  N  +  +   + P H G         P    P Q  P        LP         
Sbjct: 121 ALPTNQATPQKNGPQPPMHLGQPPLQQAELPM--IPPQVAPSDKPPQTELPLYDFADPQN 178

Query: 298 PIV-PGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
           P++   + +     MPQ +QQ       +M +    G    G P+  G  SS  M  G G
Sbjct: 179 PLLFQIAHLMSRGPMPQNKQQHLYPGLFYMSY----GANQLGAPARLGAMSSEEMTGGRG 234

Query: 357 GPRGGGNAGPPPFPSAGPGGMGGPGNLGPGG 387
            P   G A  P      PG  G P N    G
Sbjct: 235 APHAYG-ALFPGLGGMRPGLRGMPQNPAMQG 264


>gnl|CDD|236382 PRK09111, PRK09111, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 598

 Score = 31.4 bits (72), Expect = 1.1
 Identities = 14/76 (18%), Positives = 15/76 (19%)

Query: 327 QHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPG 386
           Q       G GG P G GG           G      A   P  +             P 
Sbjct: 388 QEGPPSPGGGGGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAAALAAVPDAAAAAAAPP 447

Query: 387 GMGPGGLLQGPLAYLE 402
                      L   E
Sbjct: 448 APAAAPQPAVRLNSFE 463



 Score = 29.1 bits (66), Expect = 5.8
 Identities = 11/42 (26%), Positives = 11/42 (26%)

Query: 7   GGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 48
           G P  GG  G PP     PG        G     P       
Sbjct: 390 GPPSPGGGGGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAA 431



 Score = 28.3 bits (64), Expect = 9.8
 Identities = 11/42 (26%), Positives = 12/42 (28%)

Query: 5   GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQ 46
            PGG  GG   G     +       G     P   GP   L 
Sbjct: 393 SPGGGGGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAAALA 434


>gnl|CDD|220368 pfam09730, BicD, Microtubule-associated protein Bicaudal-D.  BicD
           proteins consist of three coiled-coil domains and are
           involved in dynein-mediated minus end-directed transport
           from the Golgi apparatus to the endoplasmic reticulum
           (ER). For full functioning they bind with GSK-3beta
           pfam05350 to maintain the anchoring of microtubules to
           the centrosome. It appears that amino-acid residues
           437-617 of BicD and the kinase activity of GSK-3 are
           necessary for the formation of a complex between BicD
           and GSK-3beta in intact cells.
          Length = 711

 Score = 31.3 bits (71), Expect = 1.2
 Identities = 27/125 (21%), Positives = 51/125 (40%), Gaps = 25/125 (20%)

Query: 114 KEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEK 173
           KE   + + I++L+ +L +            A   N+ ++ +    +  +  E +  LE 
Sbjct: 28  KEAYYLQR-ILELQAELKQLR----------AELSNVQAENERLSSLSQELKEENEMLEL 76

Query: 174 DEIKLEIDQATLKFLDLARQM--------EAFFLQKRFLLSALKPELIVKEDIVDLRHDL 225
              +L  +    KF + AR +        E   LQK   +S L+   +  E    L+H++
Sbjct: 77  QRGRLRDEIKEYKFRE-ARLLQDYSELEEENISLQK--QVSVLRQSQVEFE---GLKHEI 130

Query: 226 ARKEE 230
            R EE
Sbjct: 131 RRLEE 135


>gnl|CDD|236722 PRK10590, PRK10590, ATP-dependent RNA helicase RhlE; Provisional.
          Length = 456

 Score = 30.9 bits (70), Expect = 1.3
 Identities = 11/52 (21%), Positives = 12/52 (23%)

Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG 381
           QQ+G G  G   G G           G           P    G     G  
Sbjct: 392 QQRGGGGRGQGGGRGQQQGQPRRGEGGAKSASAKPAEKPSRRLGDAKPAGEQ 443


>gnl|CDD|227623 COG5307, COG5307, SEC7 domain proteins [General function prediction
           only].
          Length = 1024

 Score = 31.2 bits (71), Expect = 1.3
 Identities = 19/121 (15%), Positives = 34/121 (28%), Gaps = 6/121 (4%)

Query: 151 LSDLQSCLQVLTKEDEVSTTLEK----DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLS 206
           +        VL K  E  +        +       Q + +FL+   ++  F  Q   LL 
Sbjct: 703 IKSESKISNVLFKNSEGLSPDLNKTLLESALDSKSQLSSRFLE-IEELSDFGFQIALLLP 761

Query: 207 ALKPELIVKEDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQGWAKSPAHQGSTSSAS 266
                  V   +      +   + L+      IA  + +         S   +   SS S
Sbjct: 762 FEYSV-EVSLVVAVKELVIGCSDNLLTEAASSIASGKTIFEISAYEDLSSTLRYILSSLS 820

Query: 267 G 267
            
Sbjct: 821 N 821



 Score = 30.9 bits (70), Expect = 1.6
 Identities = 18/144 (12%), Positives = 40/144 (27%), Gaps = 14/144 (9%)

Query: 67  LSPLEK----DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKD 122
           LSP       +       Q + +FL+   ++  F  Q   LL              V+  
Sbjct: 720 LSPDLNKTLLESALDSKSQLSSRFLE-IEELSDFGFQIALLLPFEYSV-------EVSLV 771

Query: 123 IVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQ 182
           +      +   + L+      IA  + +           T    +S+    + +  + + 
Sbjct: 772 VAVKELVIGCSDNLLTEAASSIASGKTIFEISAYEDLSSTLRYILSSLSNDELVLSQENL 831

Query: 183 ATLKFLDLARQMEAFFLQKRFLLS 206
                L  ++       +   L  
Sbjct: 832 --FIELLSSKNEGKQNDKNLELRL 853


>gnl|CDD|226959 COG4594, FecB, ABC-type Fe3+-citrate transport system, periplasmic
           component [Inorganic ion transport and metabolism].
          Length = 310

 Score = 30.5 bits (69), Expect = 1.3
 Identities = 42/206 (20%), Positives = 75/206 (36%), Gaps = 58/206 (28%)

Query: 104 LSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTK 163
           +SALKP+LI+ + +   K +      +A    L  R+ D    +Q  +   ++  + + K
Sbjct: 107 ISALKPDLIIADSSR-HKKVYKELKKIAPTIALKSRNED----YQENIDSFKTIAKAVGK 161

Query: 164 EDEVSTTLEK-----DEIKLEIDQATLKFL---DLARQMEAF---FLQKRFL-------- 204
           E E+   L K      EIK ++ + T         A Q           + L        
Sbjct: 162 EKEMEKRLAKHKKKIAEIKKKLPKGTNSLAIGVSRATQFNLHTEESYTGQLLTQLGYQVP 221

Query: 205 ----------------LSALKPELIVKEDIVDLRHDLARKEELIKRHYDKIAVWQNL--- 245
                           L+A+ P++++      L      ++E I R ++K A+W+ L   
Sbjct: 222 AASSDGGPYMSVGLEQLAAINPDVMI------LATY---RDESIVRKWEKNALWKKLKAV 272

Query: 246 ------LSDLQGWAKSPAHQGSTSSA 265
                   D   WA+S     + S A
Sbjct: 273 KNGQVYDVDRNTWARSRGIDAAESMA 298


>gnl|CDD|115579 pfam06933, SSP160, Special lobe-specific silk protein SSP160.  This
           family consists of several special lobe-specific silk
           protein SSP160 sequences which appear to be specific to
           Chironomus (Midge) species.
          Length = 758

 Score = 30.9 bits (69), Expect = 1.3
 Identities = 10/40 (25%), Positives = 22/40 (55%)

Query: 239 IAVWQNLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQS 278
           +A W+ +L+ L+ +A   A   STS+++ T+   +    +
Sbjct: 263 VAEWEAILAALEAFANGSASANSTSNSNSTSNSTTNSNST 302


>gnl|CDD|197891 smart00818, Amelogenin, Amelogenins, cell adhesion proteins, play a
           role in the biomineralisation of teeth.  They seem to
           regulate formation of crystallites during the secretory
           stage of tooth enamel development and are thought to
           play a major role in the structural organisation and
           mineralisation of developing enamel. The extracellular
           matrix of the developing enamel comprises two major
           classes of protein: the hydrophobic amelogenins and the
           acidic enamelins. Circular dichroism studies of porcine
           amelogenin have shown that the protein consists of 3
           discrete folding units: the N-terminal region appears to
           contain beta-strand structures, while the C-terminal
           region displays characteristics of a random coil
           conformation. Subsequent studies on the bovine protein
           have indicated the amelogenin structure to contain a
           repetitive beta-turn segment and a "beta-spiral" between
           Gln112 and Leu138, which sequester a (Pro, Leu, Gln)
           rich region. The beta-spiral offers a probable site for
           interactions with Ca2+ ions. Mutations in the human
           amelogenin gene (AMGX) cause X-linked hypoplastic
           amelogenesis imperfecta, a disease characterised by
           defective enamel. A 9bp deletion in exon 2 of AMGX
           results in the loss of codons for Ile5, Leu6, Phe7 and
           Ala8, and replacement by a new threonine codon,
           disrupting the 16-residue (Met1-Ala16) amelogenin signal
           peptide.
          Length = 165

 Score = 30.1 bits (68), Expect = 1.3
 Identities = 21/68 (30%), Positives = 24/68 (35%), Gaps = 2/68 (2%)

Query: 293 MGGMAPIVPGSTMQPMSGMPQQQ--QQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGM 350
           + G   + P    QP    P QQ  Q   +Q     Q MQ Q      PP  P  P   M
Sbjct: 76  VPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLPPM 135

Query: 351 MFMGPGGP 358
             M P  P
Sbjct: 136 FPMQPLPP 143


>gnl|CDD|237539 PRK13878, PRK13878, conjugal transfer relaxase TraI; Provisional.
          Length = 746

 Score = 30.9 bits (70), Expect = 1.5
 Identities = 16/91 (17%), Positives = 27/91 (29%), Gaps = 12/91 (13%)

Query: 310 GMPQQQQQVQMQQQIHMQHMQQQ-----GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNA 364
            +  ++Q +  ++  H +  + +     G G GGP   P               R G   
Sbjct: 501 ALESRRQALLNKENTHERTERPEHRGRTGRGRGGPGQRPAADQHA--AGAAAVARAGDGR 558

Query: 365 GPPPFPSAGPGGMGGPG-----NLGPGGMGP 390
                      G+   G     N+G  G  P
Sbjct: 559 PAAGRGDRAGAGVHAAGVHRKPNVGRIGRKP 589


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 30.5 bits (70), Expect = 1.6
 Identities = 19/79 (24%), Positives = 37/79 (46%), Gaps = 4/79 (5%)

Query: 173 KDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHD-LARKEEL 231
           K E  LE   A  +   L  + E    ++R  L  L+  L+ KE+ +D + + L ++EE 
Sbjct: 55  KKEALLE---AKEEIHKLRNEFEKELRERRNELQKLEKRLLQKEENLDRKLELLEKREEE 111

Query: 232 IKRHYDKIAVWQNLLSDLQ 250
           +++   ++   Q  L   +
Sbjct: 112 LEKKEKELEQKQQELEKKE 130


>gnl|CDD|227606 COG5281, COG5281, Phage-related minor tail protein [Function
           unknown].
          Length = 833

 Score = 30.8 bits (69), Expect = 1.7
 Identities = 47/293 (16%), Positives = 68/293 (23%), Gaps = 52/293 (17%)

Query: 151 LSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKP 210
           L  L +  Q +             +  L   +     L    +      QK  LL   K 
Sbjct: 491 LKALLAFQQQIADLSGAKEKASDQKSLLWKAEEQYALLKEEAKQRQLQEQK-ALLEHKKE 549

Query: 211 ELIVKEDIVDLRHDLARKEELIKRHY----------------DKIAVWQNLLSDLQ-GWA 253
            L     + +L    A + EL  +                     A     L++L   W+
Sbjct: 550 TLEYTSQLAELLDQQADRFELSAQAAGSQKERGSDLYREALAQNAAALNKALNELAAYWS 609

Query: 254 KSPAHQGS----TSSASGTTPPNSTPTQSG------PGISAMGGPLPGMMGGMAPIVPGS 303
                QG       SA      ++T   S            M                  
Sbjct: 610 ALDLLQGDWKAGALSALANYRDSATDVASQAAQLFTNAFDGMANNAAKFATTGKLSFKSF 669

Query: 304 TMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGP----------------- 346
           T   +S +     Q  +Q  + +      G   GG  +  G                   
Sbjct: 670 TRSVLSDLAGILLQAALQIIVGLVGSAFGGALSGGGSASTGAGSVFHFAAGGVYGSGGLP 729

Query: 347 -------SSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
                  SS  +F    G    G AGP        G  G  G     G G   
Sbjct: 730 EYAGGVVSSPTVFTKAAGLGLMGEAGPEAILPLDRGSDGKLGVAAGMGGGGAA 782


>gnl|CDD|233255 TIGR01061, parC_Gpos, DNA topoisomerase IV, A subunit,
           Gram-positive.  Operationally, topoisomerase IV is a
           type II topoisomerase required for the decatenation of
           chromosome segregation. Not every bacterium has both a
           topo II and a topo IV. The topo IV families of the
           Gram-positive bacteria and the Gram-negative bacteria
           appear not to represent a single clade among the type II
           topoisomerases, and are represented by separate models
           for this reason [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 738

 Score = 30.5 bits (69), Expect = 1.8
 Identities = 26/144 (18%), Positives = 59/144 (40%), Gaps = 30/144 (20%)

Query: 65  RSLSPLEKDEIKLEIDQATLKFLDLARQMEAFF------------LQKRFLLSALKPELI 112
           RS   LEK   +LEI +  +K + +  ++                L   F  +  + E I
Sbjct: 357 RSKYELEKASKRLEIVEGLIKAISIIDEIIKLIRSSEDKSDAKENLIDNFKFTENQAEAI 416

Query: 113 V--KEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE------ 164
           V  +   +   DI +L+ +   + EL      KI   + +++  ++  ++L K+      
Sbjct: 417 VSLRLYRLTNTDIFELKEE---QNEL----EKKIISLEQIIASEKARNKLLKKQLEEYKK 469

Query: 165 ---DEVSTTLEKDEIKLEIDQATL 185
               +  + +E    +++I+++ L
Sbjct: 470 QFAQQRRSQIEDFINQIKINESEL 493


>gnl|CDD|224264 COG1345, FliD, Flagellar capping protein [Cell motility and
           secretion].
          Length = 483

 Score = 30.5 bits (69), Expect = 1.8
 Identities = 12/55 (21%), Positives = 25/55 (45%), Gaps = 3/55 (5%)

Query: 112 IVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDE 166
           + K++  + KDI  L   L   EE  K  ++      ++++ + S    LT++  
Sbjct: 427 LNKQIKSLDKDIKSLDKRLEAAEERYKTQFNT---LDDMMTQMNSQSSYLTQQLV 478


>gnl|CDD|226074 COG3544, COG3544, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 190

 Score = 29.8 bits (67), Expect = 1.9
 Identities = 11/44 (25%), Positives = 14/44 (31%), Gaps = 3/44 (6%)

Query: 312 PQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGP 355
               Q  ++ Q   M+       G GG P        GMM M  
Sbjct: 80  IILAQNQEIAQ---MKTWLATWGGKGGEPPSLEMAKMGMMEMHE 120


>gnl|CDD|218636 pfam05557, MAD, Mitotic checkpoint protein.  This family consists
           of several eukaryotic mitotic checkpoint (Mitotic arrest
           deficient or MAD) proteins. The mitotic spindle
           checkpoint monitors proper attachment of the bipolar
           spindle to the kinetochores of aligned sister chromatids
           and causes a cell cycle arrest in prometaphase when
           failures occur. Multiple components of the mitotic
           spindle checkpoint have been identified in yeast and
           higher eukaryotes. In S. cerevisiae, the existence of a
           Mad1-dependent complex containing Mad2, Mad3, Bub3 and
           Cdc20 has been demonstrated.
          Length = 722

 Score = 30.3 bits (68), Expect = 2.0
 Identities = 43/206 (20%), Positives = 84/206 (40%), Gaps = 23/206 (11%)

Query: 71  EKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKD--IVDLRH 128
           E   +K ++D  +LK   L  + E    + +  +S +K +L   +      D  +  L  
Sbjct: 136 EAKLLKDKLDAESLK---LQNEKEDQLKEAKESISRIKNDLSEMQCRAQNADTELKLLES 192

Query: 129 DLARKEELIKRHYDKIAVWQNLLSDLQSCL------QVLTKEDEVSTTLEKDEIKLEIDQ 182
           +L    E ++    ++A  +  L  L S         V  K  E      + + ++    
Sbjct: 193 ELEELREQLEECQKELAEAEKKLQSLTSEQASSADNSVKIKHLEEELKRYEQDAEVVKSM 252

Query: 183 AT--LKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKE---------EL 231
               L+  +L R++ A   + R L S  +   ++KE++ DL+  L R E         EL
Sbjct: 253 KEQLLQIPELERELAALREENRKLRSMKEDNELLKEELEDLQSRLERFEKMREKLADLEL 312

Query: 232 IKRHYD-KIAVWQNLLSDLQGWAKSP 256
            K   + ++  W++LL D+    ++P
Sbjct: 313 EKEKLENELKSWKSLLQDIGLNLRTP 338


>gnl|CDD|219934 pfam08614, ATG16, Autophagy protein 16 (ATG16).  Autophagy is a
           ubiquitous intracellular degradation system for
           eukaryotic cells. During autophagy, cytoplasmic
           components are enclosed in autophagosomes and delivered
           to lysosomes/vacuoles. ATG16 (also known as Apg16) has
           been shown to bind to Apg5 and is required for the
           function of the Apg12p-Apg5p conjugate in the yeast
           autophagy pathway.
          Length = 194

 Score = 29.8 bits (67), Expect = 2.0
 Identities = 33/152 (21%), Positives = 59/152 (38%), Gaps = 11/152 (7%)

Query: 44  LLQGPLAYLEKTTSNIGLPDGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFL 103
           L           +S+       S + + + E KL   +  L  L   ++ E   L +R L
Sbjct: 43  LQAEKYEQQSSHSSSPSADGPGSDAAIAEMEQKLAKLREELTEL-HKKRGE---LAQRLL 98

Query: 104 LSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTK 163
           L   + E + +E+  + K I +LR ++   E  I+   +++   +     LQ  L  L  
Sbjct: 99  LLNDELEQLRREIQQLEKTIAELRSEITSLETEIRDLREELQEKEKDNETLQDELISLNI 158

Query: 164 EDEVSTTLEKDEIKLEIDQATLKFLDLARQME 195
           E      LE+   KL+ +   L    + R M 
Sbjct: 159 ELNA---LEEKLRKLQKENQEL----VERWMA 183


>gnl|CDD|220441 pfam09849, DUF2076, Uncharacterized protein conserved in bacteria
           (DUF2076).  This domain, found in various hypothetical
           prokaryotic proteins, has no known function. The domain,
           however, is found in various periplasmic ligand-binding
           sensor proteins.
          Length = 234

 Score = 30.0 bits (68), Expect = 2.1
 Identities = 26/94 (27%), Positives = 32/94 (34%), Gaps = 11/94 (11%)

Query: 313 QQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
           Q+    Q   +I  + ++ Q   P    SG  G  SGM   G   P     A  PP P A
Sbjct: 53  QEAALKQANARI--EELEAQAQHPQSQSSG--GFLSGMFGGGAPRPPPAAPAVQPPAPPA 108

Query: 373 GPGGMGGPGNLGPGGMGP-------GGLLQGPLA 399
            PG   G  +    G  P       G  L G   
Sbjct: 109 RPGWGSGGPSQQGAGQQPGYAQPGPGSFLGGAAQ 142


>gnl|CDD|234252 TIGR03545, TIGR03545, TIGR03545 family protein.  This model
           represents a relatively rare but broadly distributed
           uncharacterized protein family, distributed in 1-2
           percent of bacterial genomes, all of which have outer
           membranes. In many of these genomes, it is part of a
           two-gene pair.
          Length = 555

 Score = 30.1 bits (68), Expect = 2.3
 Identities = 20/87 (22%), Positives = 36/87 (41%), Gaps = 7/87 (8%)

Query: 61  LPDGRSLSPLEKDEIKLE-IDQATLK-FLDLARQMEAFFLQKRFLLSALKPELIVKEVNM 118
           LP+ + L   +K   +LE I +  +K  L+L +  E F   K         + I    N 
Sbjct: 187 LPNKQDLEEYKK---RLEAIKKKDIKNPLELQKIKEEF--DKLKKEGKADKQKIKSAKND 241

Query: 119 VTKDIVDLRHDLARKEELIKRHYDKIA 145
           +  D   L+ DLA  ++  +    ++ 
Sbjct: 242 LQNDKKQLKADLAELKKAPQNDLKRLE 268


>gnl|CDD|237783 PRK14667, uvrC, excinuclease ABC subunit C; Provisional.
          Length = 567

 Score = 30.1 bits (68), Expect = 2.3
 Identities = 27/98 (27%), Positives = 45/98 (45%), Gaps = 12/98 (12%)

Query: 144 IAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRF 203
           I V   L  +++     L K++E+  T +  EI L+ +    K   L R  EA     RF
Sbjct: 446 IEVRDRLGLNIKVF--SLAKKEEILYTEDGKEIPLKENPILYKVFGLIRD-EA----HRF 498

Query: 204 LLSALKPELIVKEDIVDLRHDLA----RKEELIKRHYD 237
            LS  + +L  KE + D+   +      K+E+I R++ 
Sbjct: 499 ALSYNR-KLREKEGLKDILDKIKGIGEVKKEIIYRNFK 535


>gnl|CDD|163064 TIGR02894, DNA_bind_RsfA, transcription factor, RsfA family.  In a
           subset of endospore-forming members of the Firmicutes,
           members of this protein family are found, several to a
           genome. Two very strongly conserved sequence regions
           are separated by a highly variable linker region. Much
           of the linker region was excised from the seed alignment
           for this model. A characterized member is the
           prespore-specific transcription factor RsfA from Bacillus
           subtilis, previously called YwfN, which is controlled by
           sigma factor F and seems to fine-tune expression of some
           genes in the sigma-F regulon. A paralog in Bacillus
           subtilis is designated YlbO [Regulatory functions, DNA
           interactions, Cellular processes, Sporulation and
           germination].
          Length = 161

 Score = 29.3 bits (66), Expect = 2.3
 Identities = 11/53 (20%), Positives = 24/53 (45%), Gaps = 9/53 (16%)

Query: 142 DKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQM 194
           ++    Q    +L+  L+ L +     +T+E+D       Q  +  +D AR++
Sbjct: 111 NQNESLQKRNEELEKELEKLRQR---LSTIEEDY------QTLIDIMDRARKL 154


>gnl|CDD|214660 smart00434, TOP4c, DNA Topoisomerase IV.  Bacterial DNA
           topoisomerase IV, GyrA, ParC.
          Length = 444

 Score = 30.2 bits (69), Expect = 2.4
 Identities = 24/119 (20%), Positives = 49/119 (41%), Gaps = 19/119 (15%)

Query: 76  KLEIDQATLKFLDLARQMEAFFLQKR--FLLSALKPELIVK-EVNMVTKDIVD-----LR 127
           K  + +   +FLD   +       +R  +LL  L+ E +   E   +   I+D     +R
Sbjct: 321 KYNLKEILKEFLDHRLE----VYTRRKEYLLGKLEAERLHILEGLFIALSIIDEIIVLIR 376

Query: 128 HDLARKEELIKRHYDKIAV----WQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQ 182
                 +E  ++  ++  +       +L D++  L+ LTK +      E  E++ EI+ 
Sbjct: 377 SSKDLAKEAKEKLMERFELSEIQADAIL-DMR--LRRLTKLEVEKLEKELKELEKEIED 432


>gnl|CDD|238163 cd00261, AAI_SS, AAI_SS: Alpha-Amylase Inhibitors (AAIs) and Seed
           Storage (SS) Protein subfamily; composed of cereal-type
           AAIs and SS proteins. They are mainly present in the
           seeds of a variety of plants. AAIs play an important
           role in the natural defenses of plants against insects
           and pathogens such as fungi, bacteria and viruses. AAIs
           impede the digestion of plant starch and proteins by
           inhibiting digestive alpha-amylases and proteinases.
           Also included in this subfamily are SS proteins such as
           2S albumin, gamma-gliadin, napin, and prolamin. These
           AAIs and SS proteins are also known allergens in humans.
          Length = 110

 Score = 28.5 bits (64), Expect = 2.4
 Identities = 14/43 (32%), Positives = 19/43 (44%), Gaps = 5/43 (11%)

Query: 313 QQQQQVQMQQQIHM-----QHMQQQGMGPGGPPSGPGGPSSGM 350
           QQQ Q   QQ         ++++QQ  G GGPP  P      +
Sbjct: 1   QQQCQPGQQQPQQPLNSCREYLRQQCSGVGGPPVWPQQSCEVL 43


>gnl|CDD|218745 pfam05783, DLIC, Dynein light intermediate chain (DLIC).  This
           family consists of several eukaryotic dynein light
           intermediate chain proteins. The light intermediate
           chains (LICs) of cytoplasmic dynein consist of multiple
           isoforms, which undergo post-translational modification
           to produce a large number of species. DLIC1 is known to
           be involved in assembly, organisation, and function of
           centrosomes and mitotic spindles when bound to
           pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2
           that may play a role in maintaining Golgi organisation
           by binding cytoplasmic dynein 2 to its Golgi-associated
           cargo.
          Length = 490

 Score = 30.2 bits (68), Expect = 2.4
 Identities = 27/98 (27%), Positives = 33/98 (33%), Gaps = 19/98 (19%)

Query: 321 QQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGP 380
           QQ +  +       G   P   PGG            PR    +GP    S  P  M   
Sbjct: 357 QQSLLAKQPATPTRGVESPARSPGG-----------SPRTTNRSGPRNVASVSP--MTSV 403

Query: 381 GNLGPGGMGPGGLLQGPLA-----YLEKTTSNIGLPDG 413
             + P  M PG   +G LA      L K T + G P G
Sbjct: 404 KKIDP-NMKPGAASEGVLANFFNSLLSKKTGSPGSPGG 440


>gnl|CDD|193258 pfam12782, Innate_immun, Invertebrate innate immunity transcript
           family.  The immune response of the purple sea urchin
           appears to be more complex than previously believed in
           that it uses immune-related gene families homologous to
           vertebrate Toll-like and NOD/NALP-like receptor families
           as well as C-type lectins and a rudimentary complement
           system. In addition, the species also produces this
           unusual family of mRNAs, also known as 185/333, which is
           strongly upregulated in response to pathogen challenge.
          Length = 312

 Score = 30.0 bits (67), Expect = 2.4
 Identities = 26/68 (38%), Positives = 27/68 (39%), Gaps = 14/68 (20%)

Query: 333 GMGPGGPPSGPGGPSSGMMFMGP-------------GGPRGGGNAGPPPFPSAGPGGMGG 379
           GM  GGP    GGP  G  F GP             GGP GG     P F  + P G GG
Sbjct: 48  GMQMGGPRQD-GGPMGGRRFDGPGSGAPQMDGRRQNGGPMGGRRFDGPRFGGSRPDGAGG 106

Query: 380 PGNLGPGG 387
               G GG
Sbjct: 107 RPFFGQGG 114



 Score = 28.1 bits (62), Expect = 10.0
 Identities = 42/121 (34%), Positives = 44/121 (36%), Gaps = 16/121 (13%)

Query: 280 PGISAMGGPLP--GMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPG 337
           PG   MGGP    G MGG     PGS    M G  Q            M   +  G   G
Sbjct: 46  PGGMQMGGPRQDGGPMGGRRFDGPGSGAPQMDGRRQNGGP--------MGGRRFDGPRFG 97

Query: 338 GP-PSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQG 396
           G  P G GG      F G GG RG G          G  G+GGPG     G    G  QG
Sbjct: 98  GSRPDGAGGRP----FFGQGGRRGDGEEETDAAQQIG-DGLGGPGQFDGPGRRHHGHRQG 152

Query: 397 P 397
           P
Sbjct: 153 P 153


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 30.0 bits (68), Expect = 2.6
 Identities = 24/124 (19%), Positives = 52/124 (41%), Gaps = 13/124 (10%)

Query: 117 NMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVSTTLEK-- 173
             + K+I +L       EE ++         +  L DL+S L  L KE DE+   L +  
Sbjct: 850 KSIEKEIENLNGKKEELEEELEEL-------EAALRDLESRLGDLKKERDELEAQLRELE 902

Query: 174 ---DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEE 230
              +E++ +I++   +  +L  ++EA   +   +      +  + E+ + L    A  + 
Sbjct: 903 RKIEELEAQIEKKRKRLSELKAKLEALEEELSEIEDPKGEDEEIPEEELSLEDVQAELQR 962

Query: 231 LIKR 234
           + + 
Sbjct: 963 VEEE 966


>gnl|CDD|236729 PRK10636, PRK10636, putative ABC transporter ATP-binding protein;
           Provisional.
          Length = 638

 Score = 30.1 bits (68), Expect = 2.6
 Identities = 16/73 (21%), Positives = 39/73 (53%), Gaps = 3/73 (4%)

Query: 126 LRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKD-EIKLEIDQAT 184
           LR ++AR E+ +++   ++A  +  L D  S L   +++ E++  L++    K  +++  
Sbjct: 561 LRKEIARLEKEMEKLNAQLAQAEEKLGD--SELYDQSRKAELTACLQQQASAKSGLEECE 618

Query: 185 LKFLDLARQMEAF 197
           + +L+   Q+E  
Sbjct: 619 MAWLEAQEQLEQM 631


>gnl|CDD|227938 COG5651, COG5651, PPE-repeat proteins [Cell motility and
           secretion].
          Length = 490

 Score = 29.9 bits (67), Expect = 2.7
 Identities = 26/152 (17%), Positives = 38/152 (25%), Gaps = 29/152 (19%)

Query: 263 SSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQ 322
              + +    +    +G   +A+GG   G         PG  +                 
Sbjct: 339 LGVANSGSAAAPFGIAGANQAALGGANSGAGNFGLGNNPGGGLGGKPLG----------- 387

Query: 323 QIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPR----GGGNAGPPPFPSAGPGGMG 378
                     G G GG     G  ++G    G         G  NAG     +A   G G
Sbjct: 388 ----------GTGNGGI-GASGIGNTGYGNSGIANAGLSNAGSNNAGGENAGNANNTGGG 436

Query: 379 GPGNLGPGGMGPGGLLQGPLAYLEKTTSNIGL 410
             G    G    G        +    + N G 
Sbjct: 437 NVGLWNAGDFNAGAA---GTGFTNNGSYNTGF 465



 Score = 29.9 bits (67), Expect = 2.8
 Identities = 22/79 (27%), Positives = 26/79 (32%), Gaps = 3/79 (3%)

Query: 335 GPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAG--PGGMGGPGNLGPGGMGPGG 392
             G   SG      G+         GG N+G   F       GG+GG    G G  G G 
Sbjct: 338 NLGVANSGSAAAPFGIAGANQAAL-GGANSGAGNFGLGNNPGGGLGGKPLGGTGNGGIGA 396

Query: 393 LLQGPLAYLEKTTSNIGLP 411
              G   Y     +N GL 
Sbjct: 397 SGIGNTGYGNSGIANAGLS 415



 Score = 29.5 bits (66), Expect = 3.5
 Identities = 25/154 (16%), Positives = 33/154 (21%), Gaps = 6/154 (3%)

Query: 253 AKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMM--GGMAPIVPGSTMQPMSG 310
             + A  G+  S +      S    +    S        +   GG A       +     
Sbjct: 287 GLAAAGTGNIGSGNAVDSGGSALVGAIGQTSQATANAGSVNATGGAAAGSGNLGVANSGS 346

Query: 311 MPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFP 370
                      Q          G G  G  + PGG   G    G  G  G G +G     
Sbjct: 347 AAAPFGIAGANQAALGG--ANSGAGNFGLGNNPGGGLGGKPLGG-TGNGGIGASGIGNTG 403

Query: 371 SA-GPGGMGGPGNLGPGGMGPGGLLQGPLAYLEK 403
                    G  N G    G              
Sbjct: 404 YGNSGIANAGLSNAGSNNAGGENAGNANNTGGGN 437


>gnl|CDD|224273 COG1354, scpA, Rec8/ScpA/Scc1-like protein (kleisin family)
           [Replication, recombination, and repair].
          Length = 248

 Score = 29.6 bits (67), Expect = 2.8
 Identities = 37/234 (15%), Positives = 82/234 (35%), Gaps = 44/234 (18%)

Query: 46  QGPLAYLEKTTSNIGLPDGRSLSPLEKDEIKL-EID--QATLKFLDLARQMEAFFLQK-- 100
           +GPL  L              L  + K +I   +I   + T ++L    +++   L+   
Sbjct: 12  EGPLDLL--------------LHLIRKGKIDPWDIPIVELTDQYLAYIEELKKLDLEVAA 57

Query: 101 RFLLSA-----LKPELIVKEVNMVTKDIV--DLRHDLARKEELIKRHYDKIAVWQNLLSD 153
            +L+ A     +K  +++ +     +D    + R +L  + E  +R+ +   +   L  +
Sbjct: 58  DYLVMAAILLRIKSRMLLPKEEEEAEDEELEEPRDELVARLEEYERYKEAAELLAELEEE 117

Query: 154 LQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELI 213
            +    V +K        ++     EI    L F    + +      K+  L  ++  ++
Sbjct: 118 RR---DVFSKIKPEIKIKKERRPVEEISLIDL-FRAYQKILR---RVKQEELVEIERIVL 170

Query: 214 VKEDIVDLRHDLARKEELIKRHYDKIAVWQNL-LSDLQGWAKSPAHQGSTSSAS 266
            +  +          EE ++    ++     L  SDL    +      ST  A 
Sbjct: 171 EELSV----------EEQLEELLARLEARGVLRFSDLFSPEERKDEVVSTFLAL 214


>gnl|CDD|222095 pfam13388, DUF4106, Protein of unknown function (DUF4106).  This
           family of proteins is found in large numbers in the
           Trichomonas vaginalis proteome. The function of this
           protein is unknown.
          Length = 422

 Score = 29.6 bits (66), Expect = 2.8
 Identities = 29/92 (31%), Positives = 32/92 (34%), Gaps = 15/92 (16%)

Query: 260 GSTSSASGT-TPPNSTPTQSGPGI-----SAMG-----GPLPGMMGGMAPIVPGSTMQPM 308
           G+   ASGT  PPN       PG+     S+ G      P P       P V     QP 
Sbjct: 162 GTYILASGTYIPPNPPREAPAPGLPKTFTSSHGHRHRHAPKPTQQ----PTVQNPAQQPT 217

Query: 309 SGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPP 340
              P QQ Q Q QQQ      Q     P   P
Sbjct: 218 VQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQP 249


>gnl|CDD|226808 COG4371, COG4371, Predicted membrane protein [Function unknown].
          Length = 334

 Score = 29.5 bits (66), Expect = 2.9
 Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 5/55 (9%)

Query: 344 GGPSSGMMFMGPGGPRGGGNAGPPP-----FPSAGPGGMGGPGNLGPGGMGPGGL 393
           GG   G  F  P G   G + G P            GG G P  +  GG G G  
Sbjct: 50  GGRIGGGSFRAPSGYSRGYSGGGPSGGGYSGGGYSGGGFGFPFIIPGGGGGGGFG 104



 Score = 28.7 bits (64), Expect = 5.0
 Identities = 22/54 (40%), Positives = 23/54 (42%), Gaps = 9/54 (16%)

Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP------PFPSAGPGGMGGPGNLG 384
           GG    P G S G      GGP GGG +G         FP   PGG GG G  G
Sbjct: 55  GGSFRAPSGYSRGY---SGGGPSGGGYSGGGYSGGGFGFPFIIPGGGGGGGFGG 105


>gnl|CDD|220161 pfam09273, Rubis-subs-bind, Rubisco LSMT substrate-binding.
           Members of this family adopt a multihelical structure,
           with an irregular array of long and short alpha-helices.
           They allow binding of the protein to substrate, such as
           the N-terminal tails of histones H3 and H4 and the large
           subunit of the Rubisco holoenzyme complex.
          Length = 128

 Score = 28.5 bits (64), Expect = 2.9
 Identities = 11/37 (29%), Positives = 14/37 (37%), Gaps = 6/37 (16%)

Query: 144 IAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEI 180
               Q L    +  L       E  TTLE+DE  L+ 
Sbjct: 77  EKALQFLEKLCKLLL------SEYPTTLEEDEALLKK 107


>gnl|CDD|180777 PRK06958, PRK06958, single-stranded DNA-binding protein;
           Provisional.
          Length = 182

 Score = 29.0 bits (65), Expect = 3.0
 Identities = 27/79 (34%), Positives = 29/79 (36%), Gaps = 6/79 (7%)

Query: 314 QQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAG 373
           Q  Q +   +I    MQ  G G GG   G GG   G    G GG  GG         S  
Sbjct: 90  QDGQDRYSTEIVADQMQMLG-GRGGSGGGGGGGDEGGYGGGGGGGGGGYGGE-----SRS 143

Query: 374 PGGMGGPGNLGPGGMGPGG 392
            GG G     G GG G G 
Sbjct: 144 GGGGGRASGGGGGGAGGGA 162


>gnl|CDD|222706 pfam14356, DUF4403, Domain of unknown function (DUF4403).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria. Proteins in
           this family are typically between 455 and 518 amino
           acids in length. There is a single completely conserved
           residue W that may be functionally important.
          Length = 425

 Score = 29.5 bits (67), Expect = 3.1
 Identities = 17/75 (22%), Positives = 32/75 (42%), Gaps = 8/75 (10%)

Query: 61  LPDGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVT 120
           LP+ + L  +  D  ++ +  A + + +L R +      K F LS    ++ VK V++  
Sbjct: 236 LPNLKILPSIS-DGFRINL-PADIPYAELNRLLNRQLAGKTFPLSG-GRKVTVKSVSVYG 292

Query: 121 KD-----IVDLRHDL 130
                   VD+   L
Sbjct: 293 SGDRLVIAVDVDGSL 307


>gnl|CDD|218635 pfam05556, Calsarcin, Calcineurin-binding protein (Calsarcin).
           This family consists of several mammalian
           calcineurin-binding proteins. The calcium- and
           calmodulin-dependent protein phosphatase calcineurin has
           been implicated in the transduction of signals that
           control the hypertrophy of cardiac muscle and slow fibre
           gene expression in skeletal muscle. Calsarcin-1 and
           calsarcin-2 are expressed in developing cardiac and
           skeletal muscle during embryogenesis, but calsarcin-1 is
           expressed specifically in adult cardiac and slow-twitch
           skeletal muscle, whereas calsarcin-2 is restricted to
           fast skeletal muscle. Calsarcins represent a novel
           family of sarcomeric proteins that link calcineurin with
           the contractile apparatus, thereby potentially coupling
           muscle activity to calcineurin activation. Calsarcin-3,
           is expressed specifically in skeletal muscle and is
           enriched in fast-twitch muscle fibres. Like calsarcin-1
           and calsarcin-2, calsarcin-3 interacts with calcineurin,
           and the Z-disc proteins alpha-actinin, gamma-filamin,
           and telethonin.
          Length = 273

 Score = 29.4 bits (66), Expect = 3.3
 Identities = 15/66 (22%), Positives = 22/66 (33%), Gaps = 9/66 (13%)

Query: 326 MQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP 385
             ++Q+     GG     GG S G +  G     G          +   G +  P  + P
Sbjct: 86  KDNLQKFVPSQGGQ----GGNSEGSIPQGDSHQPGQT-----QPNTPDLGSVYNPEAIAP 136

Query: 386 GGMGPG 391
           G  GP 
Sbjct: 137 GYGGPL 142


>gnl|CDD|233432 TIGR01480, copper_res_A, copper-resistance protein, CopA family.
           This model represents the CopA copper resistance protein
           family. CopA is related to laccase (benzenediol:oxygen
           oxidoreductase) and L-ascorbate oxidase, both
           copper-containing enzymes. Most members have a typical
           TAT (twin-arginine translocation) signal sequence with
           an Arg-Arg pair. Twin-arginine translocation is observed
           for a large number of periplasmic proteins that cross
           the inner membrane with metal-containing cofactors
           already bound. The combination of copper-binding sites
           and TAT translocation motif suggests a mechanism of
           resistance by packaging and export [Cellular processes,
           Detoxification, Transport and binding proteins, Cations
           and iron carrying compounds].
          Length = 587

 Score = 29.8 bits (67), Expect = 3.3
 Identities = 21/113 (18%), Positives = 32/113 (28%), Gaps = 7/113 (6%)

Query: 304 TMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGN 363
           T+    G+      +  +  + M+ M   GM       G       M  M PG       
Sbjct: 346 TLAVRLGLTAPVPALDPRPLLTMKDMGMGGMH-----HGMDHSKMSMGGM-PGMDMSMRA 399

Query: 364 AGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPLAYLEKTTSNIGLPD-GRR 415
               P   +       P +     + P   +   +         IGL D GRR
Sbjct: 400 QSNAPMDHSQMAMDASPKHPASEPLNPLVDMIVDMPMDRMDDPGIGLRDNGRR 452


>gnl|CDD|221510 pfam12288, CsoS2_M, Carboxysome shell peptide mid-region.  This
           domain family is found in bacteria and eukaryotes, and
           is approximately 430 amino acids in length. This family
           is annotated frequently as a carboxysome shell peptide,
           however, there is little published evidence to confirm this.
          Length = 424

 Score = 29.7 bits (67), Expect = 3.4
 Identities = 21/94 (22%), Positives = 35/94 (37%), Gaps = 13/94 (13%)

Query: 269 TPPNSTPTQSGPGISAMGGPLPGMMGGMAPIV----PGS----TMQPMSGMPQQQQQVQM 320
           + P     + G  ++  G  + G   G +  V    PG+    T  P +G+ Q QQ   +
Sbjct: 307 SRPKPEAAKVGFSLTNKGQKVSGTRTGRSEGVTGDEPGTCKAVTGTPYAGLEQAQQFCSV 366

Query: 321 QQQIHMQHMQQQGMGPGGPP-----SGPGGPSSG 349
                ++    +  G  GP       G GG  +G
Sbjct: 367 DAVNEIKVRTPRRAGTPGPRLTGQQPGIGGVMTG 400


>gnl|CDD|213398 cd12191, gal11_coact, gal11 coactivator domain.  Gal11/MED15 acts
           in the general regulation of GAL structural genes and is
           required for full expression of several genes in this
           pathway, including GALs 1, 7, and 10 in Saccharomyces
           cerevisiae. GAL11 function is dependent on GCN4
           functionality and binds GCN4 in a degenerate manner with
           multiple orientations found at the GCN4-Gal11 interface.
          Length = 90

 Score = 27.7 bits (62), Expect = 3.5
 Identities = 10/26 (38%), Positives = 13/26 (50%), Gaps = 4/26 (15%)

Query: 311 MPQQQQQV----QMQQQIHMQHMQQQ 332
            PQ  +Q+    Q   Q+ MQ  QQQ
Sbjct: 61  PPQAMEQIKEVQQTHFQLLMQRRQQQ 86


>gnl|CDD|218161 pfam04589, RFX1_trans_act, RFX1 transcription activation region.
           The RFX family is a family of winged-helix DNA binding
           proteins. RFX1 is a regulatory factor essential for
           expression of MHC class II genes. This region is
           found N-terminal to the RFX DNA binding region
           (pfam02257) in some mammalian RFX proteins, and is
           thought to activate transcription when associated with
           DNA. Deletion analysis has identified the region 233-351
           in human RFX1 as being required for maximal activation.
          Length = 150

 Score = 28.4 bits (63), Expect = 3.9
 Identities = 17/73 (23%), Positives = 26/73 (35%), Gaps = 1/73 (1%)

Query: 261 STSSASGTTPPNSTPTQ-SGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQ 319
             +S  G+  P S   Q S P  + +       +        G  +Q +     QQ   Q
Sbjct: 1   MQTSEGGSDSPASVALQTSVPAQAPVPASQQRSVVQATSQTKGGPVQQLPVHRVQQVPQQ 60

Query: 320 MQQQIHMQHMQQQ 332
           +QQ  H+   Q Q
Sbjct: 61  VQQVQHVYPAQVQ 73


>gnl|CDD|197548 smart00157, PRP, Major prion protein.  The prion protein is a major
           component of scrapie-associated fibrils in
           Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler
           syndrome and bovine spongiform encephalopathy.
          Length = 218

 Score = 29.1 bits (65), Expect = 3.9
 Identities = 19/51 (37%), Positives = 19/51 (37%), Gaps = 6/51 (11%)

Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGN 382
           QG G G P  G  G   G    G  G   GG  G P     G G  GG  N
Sbjct: 31  QGGGWGQPHGGGWGQPHG----GGWGQPHGGGWGQP--HGGGWGQGGGTHN 75



 Score = 28.7 bits (64), Expect = 5.1
 Identities = 20/60 (33%), Positives = 20/60 (33%), Gaps = 13/60 (21%)

Query: 336 PGG---PPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
           PGG   PP G G           G P GGG   P       P G G     G G    GG
Sbjct: 23  PGGNRYPPQGGGW----------GQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGG 72


>gnl|CDD|220871 pfam10759, DUF2587, Protein of unknown function (DUF2587).  This is
           a bacterial family of proteins with no known function.
          Length = 168

 Score = 28.5 bits (64), Expect = 4.0
 Identities = 12/44 (27%), Positives = 20/44 (45%), Gaps = 5/44 (11%)

Query: 319 QMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGG 362
           QM  +  ++ M+++ + PG   + PG P         GGP  G 
Sbjct: 126 QMAARAQLEQMRRRALPPGVGIAPPGQPQ-----GARGGPPPGT 164


>gnl|CDD|214360 CHL00094, dnaK, heat shock protein 70.
          Length = 621

 Score = 29.3 bits (66), Expect = 4.1
 Identities = 29/123 (23%), Positives = 50/123 (40%), Gaps = 20/123 (16%)

Query: 68  SPLEKDEIK---------LEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNM 118
           S L KDE++            D+   + +DL  Q E+   Q    L  LK ++  ++   
Sbjct: 500 STLPKDEVERMVKEAEKNAAEDKEKREKIDLKNQAESLCYQAEKQLKELKDKISEEKKEK 559

Query: 119 VTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKL 178
           +   I  LR      + L   +Y+ I   ++LL +LQ  L  + K  EV ++    +   
Sbjct: 560 IENLIKKLR------QALQNDNYESI---KSLLEELQKALMEIGK--EVYSSTSTTDPAS 608

Query: 179 EID 181
             D
Sbjct: 609 NDD 611


>gnl|CDD|218237 pfam04738, Lant_dehyd_C, Lantibiotic dehydratase, C terminus.
           Lantibiotics are ribosomally synthesised antimicrobial
           agents derived from ribosomally synthesised peptides.
           They are produced by bacteria of the Firmicutes phylum,
           and include mutacin, subtilin, and nisin. Lantibiotic
           peptides contain thioether bridges termed lanthionines
           that are thought to be generated by dehydration of
           serine and threonine residues followed by addition of
           cysteine residues. This family constitutes the
           C-terminus of the enzyme proposed to catalyze the
           dehydration step.
          Length = 500

 Score = 29.3 bits (66), Expect = 4.6
 Identities = 18/86 (20%), Positives = 38/86 (44%), Gaps = 8/86 (9%)

Query: 180 IDQATLKFLDL-ARQMEAF---FLQKRFLLSALKPELIVKEDIVDLRHDLARKE-ELIKR 234
           ++    +F D  A ++E +    ++K FL+S L+P   V + +  L   L   +      
Sbjct: 7   VELLATEFPDAPAEKVEEYLAKLIEKGFLISELRPPSTVADPLDYLIEKLEALDVPEANE 66

Query: 235 HYDKIAVWQNLLSDLQGWAKSPAHQG 260
               +   Q L+++   +A+ P  +G
Sbjct: 67  LLAALREIQKLIAE---YAELPIGEG 89


>gnl|CDD|220915 pfam10961, DUF2763, Protein of unknown function (DUF2763).  This
           eukaryotic family of proteins has no known function.
          Length = 91

 Score = 27.4 bits (61), Expect = 4.8
 Identities = 12/35 (34%), Positives = 13/35 (37%), Gaps = 7/35 (20%)

Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP 367
           G G  GPP G          MG  G  GG +  P 
Sbjct: 61  GRGGPGPPGGGRR-------MGRIGGGGGPSRPPM 88


>gnl|CDD|188414 TIGR03899, TIGR03899, TIGR03899 family protein.  Members of this
           protein family are conserved hypothetical proteins with
           a limited species distribution within the
           Gammaproteobacteria. It is common in the genera Vibrio
           and Shewanella, and in this resembles the C-terminal
           domain and putative protein sorting motif TIGR03501.
           This model, by design, does not extend to all
           homologs, but rather represents a particular clade.
          Length = 250

 Score = 28.7 bits (65), Expect = 4.9
 Identities = 6/14 (42%), Positives = 9/14 (64%)

Query: 318 VQMQQQIHMQHMQQ 331
            QM + IH + MQ+
Sbjct: 67  FQMAEDIHNRSMQE 80


>gnl|CDD|226513 COG4026, COG4026, Uncharacterized protein containing TOPRIM domain,
           potential nuclease [General function prediction only].
          Length = 290

 Score = 28.7 bits (64), Expect = 5.0
 Identities = 28/153 (18%), Positives = 69/153 (45%), Gaps = 22/153 (14%)

Query: 45  LQGPLAYLEKTTSNIGLPDGRSLSPLEKDEIKLEIDQATLK--FLDLARQMEAFFLQKRF 102
           L+G + ++E+    + +P G  +  ++ + ++ E+  A ++     L R  E   L++ +
Sbjct: 82  LRGMVGHIER----MKIPIGHDVEHIDVELVRKELKNALVRAGLKTLQRVPEYMDLKEDY 137

Query: 103 L--------LSALKPELIVK------EVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQ 148
                    L   K EL+ +      E   V + +  L  + +R EE++K+   ++   +
Sbjct: 138 EELKEKLEELQKEKEELLKELEELEAEYEEVQERLKRLEVENSRLEEMLKKLPGEVYDLK 197

Query: 149 NLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEID 181
               +L+  +++   E+E+ + L K+ + L   
Sbjct: 198 KRWDELEPGVELP--EEELISDLVKETLNLAPK 228


>gnl|CDD|234354 TIGR03789, pdsO, proteobacterial sortase system OmpA family
           protein.  A newly defined histidine kinase (TIGR03785)
           and response regulator (TIGR03787) gene pair occurs
           exclusively in Proteobacteria, mostly of marine origin,
           nearly all of which contain a subfamily 6 sortase
           (TIGR03784) and its single dedicated target protein
           (TIGR03788) adjacent to the sortase. This protein
           family shows up only in those species with the
           histidine kinase/response regulator gene pair, and often
           adjacent to that pair. It belongs to the OmpA protein
           family (pfam00691). Its function is unknown. We assign
           the gene symbol pdsO, for Proteobacterial Dedicated
           Sortase system OmpA family protein.
          Length = 239

 Score = 28.6 bits (64), Expect = 5.1
 Identities = 18/95 (18%), Positives = 38/95 (40%), Gaps = 17/95 (17%)

Query: 255 SPAHQGSTSSASGTTPPNSTPTQS---------GPGI---SAMGGPLPGMMGGMAPIVPG 302
           S     +  +      P  T  Q          G G    + +GGP+  ++GG+   + G
Sbjct: 15  SSVAATTYQNQPHLQTPQETSQQEADQEALIGLGSGALLGALVGGPVGAIIGGITGGLIG 74

Query: 303 STM---QPMSGMPQQQQQVQM--QQQIHMQHMQQQ 332
             +   +    + QQ+QQ+    Q+Q  ++ ++ +
Sbjct: 75  QAVNNDEQQQHIAQQRQQMVALTQKQQALEQLEAE 109


>gnl|CDD|215969 pfam00521, DNA_topoisoIV, DNA gyrase/topoisomerase IV, subunit A. 
          Length = 427

 Score = 29.0 bits (66), Expect = 5.2
 Identities = 23/89 (25%), Positives = 40/89 (44%), Gaps = 16/89 (17%)

Query: 98  LQKRF-LLSALKPEL-IVKEVNMV---TKDIVDLRHDLARKEELIKRHYDKIAVWQNLLS 152
           L++R  +L  L   L  +  V  V   + D+   + +L   EEL +   D +        
Sbjct: 332 LEERLHILEGLLKALNKIDFVIEVIRGSIDLKKAKKELI--EELSEIQADYLL------- 382

Query: 153 DLQSCLQVLTKEDEVSTTLEKDEIKLEID 181
           D++  L+ LTKE+      E +E++ EI 
Sbjct: 383 DMR--LRRLTKEEIEKLEKEIEELEKEIA 409


>gnl|CDD|233508 TIGR01649, hnRNP-L_PTB, hnRNP-L/PTB/hephaestus splicing factor
           family.  Included in this family of heterogeneous
           ribonucleoproteins are PTB (polypyrimidine tract binding
           protein) and hnRNP-L. These proteins contain four RNA
           recognition motifs (rrm: pfam00067).
          Length = 481

 Score = 29.0 bits (65), Expect = 5.5
 Identities = 19/88 (21%), Positives = 25/88 (28%), Gaps = 5/88 (5%)

Query: 298 PIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMG--PGGPPSGPGGPSSGMMFMGP 355
           P +PG   +        +Q+       H       G     G      GG   G     P
Sbjct: 191 PDLPGR--RDPGLDQTHRQRQPALLGQHPSSYGHDGYSSHGGPLAPLAGGDRMGPPHGPP 248

Query: 356 GGPRGGGNAGPPPFPSAGPG-GMGGPGN 382
              R    A P     +  G   GGPG+
Sbjct: 249 SRYRPAYEAAPLAPAISSYGPAGGGPGS 276



 Score = 28.2 bits (63), Expect = 8.0
 Identities = 18/83 (21%), Positives = 25/83 (30%), Gaps = 3/83 (3%)

Query: 315 QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGP 374
           ++   + Q    +     G  P            G +    GG R G   GPP       
Sbjct: 196 RRDPGLDQTHRQRQPALLGQHPSSYGHDGYSSHGGPLAPLAGGDRMGPPHGPPSRYRPAY 255

Query: 375 GGMGGPGNL---GPGGMGPGGLL 394
                   +   GP G GPG +L
Sbjct: 256 EAAPLAPAISSYGPAGGGPGSVL 278


>gnl|CDD|227911 COG5624, TAF61, Transcription initiation factor TFIID, subunit
           TAF12 (also component of histone acetyltransferase SAGA)
           [Transcription].
          Length = 505

 Score = 28.9 bits (64), Expect = 5.5
 Identities = 20/82 (24%), Positives = 31/82 (37%), Gaps = 6/82 (7%)

Query: 292 MMGGMAPIVPG-STMQPMSGMPQQQQ--QVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSS 348
           M GG+  +  G S  + +   PQ QQ  +  +  Q    H  ++    G PP      S+
Sbjct: 214 MAGGVYGVHDGRSKRRLVDRYPQFQQGQKQVLSPQQRFLHGMERYEASGMPPPAEWAGSN 273

Query: 349 GMMFMG---PGGPRGGGNAGPP 367
           G+  +       PRG      P
Sbjct: 274 GLHVLPGRREEVPRGIFRCPSP 295


>gnl|CDD|184281 PRK13729, PRK13729, conjugal transfer pilus assembly protein TraB;
           Provisional.
          Length = 475

 Score = 29.0 bits (65), Expect = 5.6
 Identities = 22/89 (24%), Positives = 31/89 (34%), Gaps = 8/89 (8%)

Query: 315 QQQVQMQQQIH-----MQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPP- 368
           +Q+   Q++I         + +Q    G  P    G     M   P GP G    G  P 
Sbjct: 97  KQRGDDQRRIEKLGQDNAALAEQVKALGANPVTATGEPVPQMPASPPGPEGEPQPGNTPV 156

Query: 369 -FPSAGPGGMGGPGNLGPG-GMGPGGLLQ 395
            FP  G   +  P    PG G+ P   + 
Sbjct: 157 SFPPQGSVAVPPPTAFYPGNGVTPPPQVT 185


>gnl|CDD|182398 PRK10350, PRK10350, hypothetical protein; Provisional.
          Length = 145

 Score = 28.1 bits (62), Expect = 5.7
 Identities = 15/73 (20%), Positives = 21/73 (28%)

Query: 314 QQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAG 373
           QQQ +Q Q   + Q +QQ   G           ++G M      P    N          
Sbjct: 65  QQQHLQNQINNNSQRVQQGQPGNNPARQQMLPNTNGGMLNSNRNPDSSLNQQHMLPERRN 124

Query: 374 PGGMGGPGNLGPG 386
              +  P    P 
Sbjct: 125 GDMLNQPSTPQPD 137


>gnl|CDD|226711 COG4260, COG4260, Membrane protease subunit, stomatin/prohibitin
           family [Amino acid transport and metabolism].
          Length = 345

 Score = 28.7 bits (64), Expect = 6.0
 Identities = 10/38 (26%), Positives = 11/38 (28%)

Query: 286 GGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQ 323
           GG      G    +  G  M        Q  Q Q Q Q
Sbjct: 262 GGAAGTFAGMAMGMQMGQGMMESLRTSLQGNQGQAQAQ 299


>gnl|CDD|221321 pfam11928, DUF3446, Domain of unknown function (DUF3446).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 80 to 99 amino acids in length. This domain is
           found associated with pfam00096. This domain has a
           single completely conserved residue P that may be
           functionally important.
          Length = 84

 Score = 26.7 bits (59), Expect = 6.5
 Identities = 14/62 (22%), Positives = 24/62 (38%), Gaps = 4/62 (6%)

Query: 244 NLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSG----PGISAMGGPLPGMMGGMAPI 299
           +L+S L G +  P     +SS+S ++  + +P  S        S +    P        I
Sbjct: 22  SLVSGLVGMSNPPPSSSPSSSSSSSSSSSQSPPLSCSVHQSEPSPIYSAAPPYSSACGDI 81

Query: 300 VP 301
            P
Sbjct: 82  YP 83


>gnl|CDD|147458 pfam05268, GP38, Phage tail fibre adhesin Gp38.  This family
           contains several Gp38 proteins from T-even-like phages.
           Gp38, together with a second phage protein, gp57,
           catalyzes the organisation of gp37 but is absent from
           the phage particle. Gp37 is responsible for receptor
           recognition.
          Length = 261

 Score = 28.2 bits (63), Expect = 6.5
 Identities = 24/73 (32%), Positives = 24/73 (32%), Gaps = 22/73 (30%)

Query: 335 GPGGPPSGPGGPSSGMM------FMGPGGPRG---------GGNAGPPPFPSAGPGGMGG 379
           G GG P G GG S   M         PGG  G         GGN G         GG   
Sbjct: 175 GGGGRPFGAGGKSGSHMSGGNASLTAPGGGSGTGSAYGGGNGGNVG-------AAGGRAW 227

Query: 380 PGNLGPGGMGPGG 392
            GN    G G  G
Sbjct: 228 GGNGYEYGGGAAG 240


>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen.  This
           family consists of several Theileria P67 surface
           antigens. A stage specific surface antigen of Theileria
           parva, p67, is the basis for the development of an
           anti-sporozoite vaccine for the control of East Coast
           fever (ECF) in cattle. The antigen has been shown to
           contain five distinct linear peptide sequences
           recognised by sporozoite-neutralising murine monoclonal
           antibodies.
          Length = 727

 Score = 28.9 bits (64), Expect = 6.8
 Identities = 20/63 (31%), Positives = 23/63 (36%), Gaps = 4/63 (6%)

Query: 335 GPGGPPS-GPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP-GGMGPGG 392
            P  P    P  PS+  +   P   +G G AG    PSA  G   G G      G    G
Sbjct: 638 VPSDPTKVTPTQPSN--LPQVPTSGQGNGTAGGEQPPSAPNGTGNGEGGKDLKEGEKKEG 695

Query: 393 LLQ 395
           L Q
Sbjct: 696 LFQ 698


>gnl|CDD|151935 pfam11498, Activator_LAG-3, Transcriptional activator LAG-3.  The
           C.elegans Notch pathway, involved in the control of
           growth, differentiation and patterning in animal
           development, relies on either of the receptors GLP-1 or
           LIN-12. Both these receptors promote signalling by the
           recruitment of LAG-3 to target promoters, where it then
           acts as a transcriptional activator. LAG-3 works as a
           ternary complex together with the DNA binding protein,
           LAG-1.
          Length = 476

 Score = 28.8 bits (63), Expect = 6.9
 Identities = 19/57 (33%), Positives = 23/57 (40%), Gaps = 8/57 (14%)

Query: 305 MQPMSGMPQQQQQVQMQQQIHMQHMQQQGMG------PGGPPSGPGG--PSSGMMFM 353
           M+    +  QQQQ Q  QQ   QH Q    G      P G P+   G  P+ G   M
Sbjct: 410 MRLQEQIQHQQQQAQHHQQAQQQHQQPAQHGQMGYGIPNGYPAHMHGHAPAYGAHHM 466



 Score = 28.4 bits (62), Expect = 8.5
 Identities = 13/33 (39%), Positives = 16/33 (48%)

Query: 306 QPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGG 338
           Q       QQQQ+ +QQQ  M  +QQ     GG
Sbjct: 358 QQQQQQEHQQQQMLLQQQQQMHQLQQHHQMNGG 390


>gnl|CDD|182745 PRK10803, PRK10803, tol-pal system protein YbgF; Provisional.
          Length = 263

 Score = 28.2 bits (63), Expect = 7.2
 Identities = 16/55 (29%), Positives = 19/55 (34%), Gaps = 1/55 (1%)

Query: 313 QQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP 367
           Q  Q V+ Q+QI    +     G     S  G  S       P    G  NAG P
Sbjct: 83  QLNQVVERQKQI-YLQIDSLSSGGAAAQSTSGDQSGAAASATPAADAGTANAGAP 136


>gnl|CDD|221581 pfam12446, DUF3682, Protein of unknown function (DUF3682).  This
           domain family is found in eukaryotes, and is typically
           between 125 and 136 amino acids in length.
          Length = 133

 Score = 27.5 bits (61), Expect = 7.3
 Identities = 15/40 (37%), Positives = 16/40 (40%), Gaps = 6/40 (15%)

Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
           G G GG  SG   P+       P GP  G NA P P    
Sbjct: 4   GDGTGGVSSGSSAPA------PPAGPGPGPNAPPAPAAPG 37


>gnl|CDD|233667 TIGR01982, UbiB, 2-polyprenylphenol 6-hydroxylase.  This model
           represents the enzyme (UbiB) which catalyzes the first
           hydroxylation step in the ubiquinone biosynthetic
           pathway in bacteria. It is believed that the reaction is
           2-polyprenylphenol -> 6-hydroxy-2-polyprenylphenol. This
           model finds hits primarily in the proteobacteria. The
           gene is also known as AarF in certain species
           [Biosynthesis of cofactors, prosthetic groups, and
           carriers, Menaquinone and ubiquinone].
          Length = 437

 Score = 28.4 bits (64), Expect = 8.7
 Identities = 18/63 (28%), Positives = 25/63 (39%), Gaps = 4/63 (6%)

Query: 74  EIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLARK 133
            I+  I         LAR +E      R     L+P  +VKE     +  +DLR + A  
Sbjct: 152 GIEKTIAADIALLYRLARIVERLSPDSR----RLRPTEVVKEFEKTLRRELDLRREAANA 207

Query: 134 EEL 136
            EL
Sbjct: 208 SEL 210


>gnl|CDD|222449 pfam13908, Shisa, Wnt and FGF inhibitory regulator.  Shisa is a
           transcription factor-type molecule that physically
           interacts with immature forms of the Wnt receptor
           Frizzled and the FGF receptor within the endoplasmic
           reticulum to inhibit their post-translational maturation
           and trafficking to the cell surface.
          Length = 177

 Score = 27.9 bits (62), Expect = 8.7
 Identities = 12/59 (20%), Positives = 13/59 (22%), Gaps = 6/59 (10%)

Query: 319 QMQQQIHMQHMQQQGMGPGGPPSG-PGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGG 376
            +Q     Q        PG    G    P    M   P          PPP      G 
Sbjct: 121 TVQTTPLPQPPSTAPSYPGPQYQGYHPMPPQPGMPAPP-----YSLQYPPPGLLQPQGP 174


>gnl|CDD|227690 COG5403, COG5403, Uncharacterized conserved protein [Function
           unknown].
          Length = 285

 Score = 28.0 bits (62), Expect = 9.1
 Identities = 24/91 (26%), Positives = 31/91 (34%), Gaps = 3/91 (3%)

Query: 308 MSGMPQQ--QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAG 365
            +G+ QQ   Q + M   + M  + +Q     G   G  G  +      P G  GGG   
Sbjct: 111 RAGIDQQIAMQMLPMVASLIMGGLFKQTTAQMGQMGGNMGGQNPGGMSLPQGMGGGGGGA 170

Query: 366 PPPFPSAGPGG-MGGPGNLGPGGMGPGGLLQ 395
             P      GG    P      GM  GG  Q
Sbjct: 171 LGPILGPQLGGPADNPLGSVLQGMFGGGQAQ 201


>gnl|CDD|221759 pfam12764, Gly-rich_Ago1, Glycine-rich region of argonaut.  This
           domain is often found at the very N-terminal of
           argonaut-like proteins.
          Length = 102

 Score = 26.9 bits (59), Expect = 9.3
 Identities = 20/41 (48%), Positives = 21/41 (51%), Gaps = 7/41 (17%)

Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
           QG G GGPP   G         G GG RGGG+ G PP PS 
Sbjct: 7   QGRGRGGPPQQGGRG-------GGGGGRGGGSTGGPPRPSV 40


>gnl|CDD|164795 PHA00370, III, attachment protein.
          Length = 297

 Score = 28.0 bits (62), Expect = 9.3
 Identities = 24/81 (29%), Positives = 30/81 (37%), Gaps = 6/81 (7%)

Query: 5   GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPLAYLEKTTSNIGLPDG 64
             GG  GGGN G       G GG    G+ G G  G G         L K     G  D 
Sbjct: 95  DTGGDTGGGNTG------GGSGGGDTGGSGGGGSDGGGSEGGSTGKSLTKEGVGAGDFDY 148

Query: 65  RSLSPLEKDEIKLEIDQATLK 85
             ++   KD +  + DQ  L+
Sbjct: 149 PKMANANKDALTEDNDQNALQ 169


>gnl|CDD|223029 PHA03264, PHA03264, envelope glycoprotein D; Provisional.
          Length = 416

 Score = 28.0 bits (62), Expect = 9.7
 Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 1/55 (1%)

Query: 336 PGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 390
           P G       P  G +  G  G R  G  G  P P+   G  GG    GP    P
Sbjct: 277 PPGDDRPEAKPEPGPVEDGAPG-RETGGEGEGPEPAGRDGAAGGEPKPGPPRPAP 330


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.138    0.415 

Gapped
Lambda     K      H
   0.267   0.0647    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 22,430,036
Number of extensions: 2305659
Number of successful extensions: 4496
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3340
Number of HSP's successfully gapped: 436
Length of query: 415
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 316
Effective length of database: 6,546,556
Effective search space: 2068711696
Effective search space used: 2068711696
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 60 (27.1 bits)