RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy8304
         (593 letters)



>gnl|CDD|215716 pfam00100, Zona_pellucida, Zona pellucida-like domain. 
          Length = 252

 Score =  123 bits (309), Expect = 9e-32
 Identities = 50/272 (18%), Positives = 93/272 (34%), Gaps = 38/272 (13%)

Query: 168 KCEKNSMKVFISFDKPFF-GIVFSKGHYSNVNCVHLPAGLGRTSANFEIGIHACGTSGNT 226
           +C +++M V +S D  F  G+  S  H  + +C   P     T A F   + +CGT+   
Sbjct: 1   QCTEDTMTVVVSKDLTFTPGLYLSSLHLGDPSCR--PVETNSTHAIFRFPLSSCGTTRQR 58

Query: 227 ENGLYGYGADAGSGTYFENIIVIQYDPQVQEV----WDQARKLRCTWHDQYEKSVTFRPF 282
                        G  +EN IV Q    V        D+   + C+    Y         
Sbjct: 59  TGD----------GIVYENEIVSQPSVSVGPPITRDSDRRLPVSCS----YTSRSVSVSS 104

Query: 283 PVDMLDVVRADFAGDNVG-CWMQIQVGKGPWASEV-SGLVKIGQTMTMVLAIKDDDSKFD 340
              +  V     +    G     +++      +      +++G  + +   +        
Sbjct: 105 VAVLPTVPPVSPSPSGEGSLTFSLRLYTDDSYTSDYPVTIELGDPLYVEATVSVLPRSDP 164

Query: 341 --MLVRNCMAHDGKRAP----IQLVDQRGCVTRSKLMSRFTKIKNFGASASVLSYAHFQA 394
             + + +C A  G          L+ + GC       S    + +    +       F+A
Sbjct: 165 LVLFLDSCWATPGPDPDSSPRYDLI-ENGCPVDGDSTS---TLSHPVGESHTA-RFSFKA 219

Query: 395 FKFP-DSMEVHFQCTIQICR---YQCPDQCSS 422
           F+FP DS +V+  C++++C      C   CS 
Sbjct: 220 FRFPGDSSQVYIHCSVKVCDKSDPSCKPTCSR 251


>gnl|CDD|214579 smart00241, ZP, Zona pellucida (ZP) domain.  ZP proteins are
           responsible for sperm-adhesion to the zona pellucida. ZP
           domains are also present in multidomain transmembrane
           proteins such as glycoprotein GP2, uromodulin and
           TGF-beta receptor type III (betaglycan).
          Length = 252

 Score = 75.1 bits (185), Expect = 4e-15
 Identities = 50/264 (18%), Positives = 87/264 (32%), Gaps = 29/264 (10%)

Query: 168 KCEKNSMKVFISFDKPFFGIVFSKGHY-SNVNCVHLPAGLGRTSANFEIGIHACGTSGNT 226
           +C ++ M V +S D  F G +  KG    + +C            +FE+ ++ CGT    
Sbjct: 1   QCGEDQMVVSVSTDLLFPGGINVKGLTLGDPSCRPQFTDATSAFVSFEVPLNGCGTR-RQ 59

Query: 227 ENGLYGYGADAGSGTYFENIIVIQYDPQ--VQEVWDQARKLRCTWHDQYEKSVTFRPFPV 284
            N           G  + N +V+       +      A   +C +    E         V
Sbjct: 60  VNP---------DGIVYSNTLVVSPFHPGFITRDDRAAYHFQCFYP---ENEKVSLNLDV 107

Query: 285 DMLD-VVRADFAGDNVGCWMQIQVGKGPWASEV-SGLVKIGQTMTMVLAIK-DDDSKFDM 341
             +     +  +   + C  ++      + S   S    +G  +         DD    +
Sbjct: 108 STIPPTELSSVSEGPLTCSYRLYKD-DSFGSPYQSADYVLGDPVYHEWECDGADDPPLGL 166

Query: 342 LVRNCMAHDGKRA----PIQLVDQRGCVTRSKLMSRFTKIKNFGASASVLSYAHFQAFKF 397
           LV NC A  G          ++D  GC     L S       + ++    +    + FKF
Sbjct: 167 LVDNCYATPGPDPSSGPKYFIID-NGCPVDGYLDS----TIPYNSNPLHRARFSVKVFKF 221

Query: 398 PDSMEVHFQCTIQICRYQCPDQCS 421
            D   V+F C I++C       C 
Sbjct: 222 ADRSLVYFHCQIRLCDKDDGSSCD 245


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 53.0 bits (127), Expect = 4e-07
 Identities = 33/140 (23%), Positives = 38/140 (27%), Gaps = 30/140 (21%)

Query: 51   ASETSSDQESQQSAPKHGYVDR--PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVY 108
            +     D   Q + P+    DR  P  P  P   PP  H     P    P  N       
Sbjct: 2584 SRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPP 2643

Query: 109  HGPPPPPPPLSAAKPP-----------------------PVQPEAMDKSG-----YGPPP 140
               PPP  P     P                        P +  A    G       PPP
Sbjct: 2644 PTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPP 2703

Query: 141  PPPLLAPAPDPWPNAVADTP 160
            PPP   PAP    +A    P
Sbjct: 2704 PPPTPEPAPHALVSATPLPP 2723



 Score = 50.7 bits (121), Expect = 2e-06
 Identities = 30/96 (31%), Positives = 32/96 (33%), Gaps = 7/96 (7%)

Query: 64   APKHGYVDRPPP--PPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAA 121
            AP      RPP   P A   AP RP P  R       +           P  PP P +  
Sbjct: 2857 APGGDVRRRPPSRSPAAKPAAPARP-PVRRLARPAVSRSTESFALPPDQPERPPQPQAPP 2915

Query: 122  KPPPVQPEAMDKSGYGPPPPPPL----LAPAPDPWP 153
             P P            PPPPPP     LAP  DP  
Sbjct: 2916 PPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAG 2951



 Score = 48.0 bits (114), Expect = 1e-05
 Identities = 31/123 (25%), Positives = 42/123 (34%), Gaps = 21/123 (17%)

Query: 35   PKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPS 94
            P ++   + A          +    S+ +       D+P  PP P  APP P P  + P 
Sbjct: 2867 PSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQP-QAPPPPQPQPQPPP 2925

Query: 95   HGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPN 154
              +PQ           PPPPPPP      PP+ P         P          P PW  
Sbjct: 2926 PPQPQ-----------PPPPPPPRP---QPPLAPTT------DPAGAGEPSGAVPQPWLG 2965

Query: 155  AVA 157
            A+ 
Sbjct: 2966 ALV 2968



 Score = 45.7 bits (108), Expect = 7e-05
 Identities = 23/84 (27%), Positives = 26/84 (30%)

Query: 72   RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM 131
             PPPPP      P    +      G          +   P PP  P   A P      A 
Sbjct: 2700 DPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPAR 2759

Query: 132  DKSGYGPPPPPPLLAPAPDPWPNA 155
              +  GPP P P  APA  P    
Sbjct: 2760 PPTTAGPPAPAPPAAPAAGPPRRL 2783



 Score = 44.2 bits (104), Expect = 2e-04
 Identities = 23/112 (20%), Positives = 29/112 (25%), Gaps = 6/112 (5%)

Query: 54   TSSDQESQQSAPKHGYVDRPPPPP-APIVAPPRPHPNGRHPS----HGKPQFNHKLGGVY 108
                + +Q S+P      R   P    + +   P P    P               G   
Sbjct: 2668 RRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAA 2727

Query: 109  HGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
                 P  P + A P      A    G   P  PP  A  P P P A     
Sbjct: 2728 ARQASPALPAAPAPPAVPAGPATP-GGPARPARPPTTAGPPAPAPPAAPAAG 2778



 Score = 41.8 bits (98), Expect = 0.001
 Identities = 27/101 (26%), Positives = 29/101 (28%), Gaps = 8/101 (7%)

Query: 71   DRPPPPPAPIVAPPRPHPNGRH------PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPP 124
            DR  PPP P   P  P    R       P   +P+      G   GP PP P       P
Sbjct: 2565 DRSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAP 2624

Query: 125  PVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSL 165
               P     S     P P      P P        P  VS 
Sbjct: 2625 D--PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSR 2663



 Score = 41.5 bits (97), Expect = 0.001
 Identities = 27/124 (21%), Positives = 34/124 (27%), Gaps = 21/124 (16%)

Query: 58   QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP--PP 115
              ++Q++P       PP  PA    P  P    R P+   P           GPP     
Sbjct: 2726 AAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTR 2785

Query: 116  PPLS-------------------AAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAV 156
            P ++                   AA   P        S  GP PPP    P   P P   
Sbjct: 2786 PAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGP 2845

Query: 157  ADTP 160
                
Sbjct: 2846 PPPS 2849



 Score = 40.3 bits (94), Expect = 0.003
 Identities = 33/135 (24%), Positives = 37/135 (27%), Gaps = 19/135 (14%)

Query: 44   AESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRP-----HPNG----RHPS 94
            A      A+   +   +    P        PPPP     P  P      P G    R PS
Sbjct: 2809 AAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPS 2868

Query: 95   HGKPQF----NHKLGGVYHGPPPPPPPLSAAKPP-----PVQPEAMDKSGYGPPPPPPLL 145
                                P       S A PP     P QP+A       P PPPP  
Sbjct: 2869 RSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQ 2928

Query: 146  APAPDPWPNAVADTP 160
             P P P P      P
Sbjct: 2929 -PQPPPPPPPRPQPP 2942



 Score = 40.3 bits (94), Expect = 0.004
 Identities = 28/122 (22%), Positives = 37/122 (30%), Gaps = 18/122 (14%)

Query: 35  PKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPS 94
               + R    +    A     D +++ +AP    V  P P P P  APP P        
Sbjct: 380 LPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPAPTPVPASAPPPPATPLPSAE 439

Query: 95  HGKPQFNHKLGGVYHGPPPPPP--------PLSAAKPPPVQPEAMDKSGYGPPPPPPLLA 146
            G             GP PPP           +   P     +A+D      PP PP   
Sbjct: 440 PGSDD----------GPAPPPERQPPAPATEPAPDDPDDATRKALDALRERRPPEPPGAD 489

Query: 147 PA 148
            A
Sbjct: 490 LA 491



 Score = 38.8 bits (90), Expect = 0.010
 Identities = 33/152 (21%), Positives = 39/152 (25%), Gaps = 28/152 (18%)

Query: 26   PNVSKVEE---QPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVA 82
            P VS+  E    P     R                    Q  P      RP PP AP   
Sbjct: 2889 PAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTD 2948

Query: 83   P-PRPHPNGRHPSHGKPQFNHKLGGVYHGP----PPPPPPLSAAKPPPVQPEAMDKSGYG 137
            P     P+G  P    P     + G    P    P P P   A                 
Sbjct: 2949 PAGAGEPSGAVPQ---PWLGALVPGRVAVPRFRVPQPAPSREAPASS------------- 2992

Query: 138  PPPPPPLLAPAPDPWPNAVA----DTPKIVSL 165
             PP           W +++A      P  VSL
Sbjct: 2993 TPPLTGHSLSRVSSWASSLALHEETDPPPVSL 3024



 Score = 38.4 bits (89), Expect = 0.015
 Identities = 30/98 (30%), Positives = 33/98 (33%), Gaps = 4/98 (4%)

Query: 63  SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAK 122
           SA +H       P      A     P  R P  G  Q           P P P P+ A+ 
Sbjct: 369 SAGRHHPKRASLPTRKRRSARHAATPFARGPG-GDDQTRPAAPVPASVPTPAPTPVPASA 427

Query: 123 PPPVQ---PEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
           PPP     P A   S  GP PPP    PAP   P    
Sbjct: 428 PPPPATPLPSAEPGSDDGPAPPPERQPPAPATEPAPDD 465



 Score = 37.2 bits (86), Expect = 0.033
 Identities = 28/162 (17%), Positives = 37/162 (22%), Gaps = 24/162 (14%)

Query: 20   AEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYV--------- 70
             +    P      + P     R             + +       P  G +         
Sbjct: 2917 PQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVPR 2976

Query: 71   DRPPPPPAPIVAP-PRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPP---- 125
             R P P     AP     P   H       +   L  ++    PPP  L     PP    
Sbjct: 2977 FRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSLA-LHEETDPPPVSLKQTLWPPDDTE 3035

Query: 126  ---------VQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVAD 158
                        E  D     P PP P    A +P P     
Sbjct: 3036 DSDADSLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEA 3077



 Score = 36.5 bits (84), Expect = 0.055
 Identities = 23/80 (28%), Positives = 27/80 (33%), Gaps = 13/80 (16%)

Query: 71   DRPPPP--PAPIVAPPRPHPNGRHPS-----HGKPQFNHKLGGVYHGPPPPPPPLSAAKP 123
            D PP P   AP + P  P     HP       G  +      G       PPPPL  A P
Sbjct: 2507 DAPPAPSRLAPAILPDEPVGEPVHPRMLTWIRGLEELASDDAG------DPPPPLPPAAP 2560

Query: 124  PPVQPEAMDKSGYGPPPPPP 143
            P     ++      P P  P
Sbjct: 2561 PAAPDRSVPPPRPAPRPSEP 2580



 Score = 34.9 bits (80), Expect = 0.14
 Identities = 27/110 (24%), Positives = 36/110 (32%), Gaps = 24/110 (21%)

Query: 55   SSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPP 114
            +S  ES++S P       P  PPA ++AP    P    P+   P            PP  
Sbjct: 2789 ASLSESRESLPS---PWDPADPPAAVLAPAAALPPAASPAGPLP------------PPTS 2833

Query: 115  PPPLSAAKPPPVQPEAMDKSGY---------GPPPPPPLLAPAPDPWPNA 155
              P +   PP   P ++   G           PP   P   PA    P  
Sbjct: 2834 AQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPV 2883



 Score = 34.5 bits (79), Expect = 0.21
 Identities = 27/111 (24%), Positives = 30/111 (27%), Gaps = 24/111 (21%)

Query: 71  DRPPPPPAPI------------VAPPRPHPNG---------RHPSHGKPQFNHKLGGVYH 109
           D PPP PA              V  P P P           R P+   P     L    H
Sbjct: 314 DPPPPAPAGDAEEEDDEDGAMEVVSPLPRPRQHYPLGFPKRRRPTWTPPSSLEDLSAGRH 373

Query: 110 GPPPPPPPLSAAKPPP--VQPEAMDKSGYGPP-PPPPLLAPAPDPWPNAVA 157
            P     P    +       P A    G     P  P+ A  P P P  V 
Sbjct: 374 HPKRASLPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPAPTPVP 424



 Score = 34.1 bits (78), Expect = 0.25
 Identities = 28/103 (27%), Positives = 29/103 (28%), Gaps = 29/103 (28%)

Query: 65   PKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP- 123
            P      RP     P  A   P P G  P                 PP P     A  P 
Sbjct: 2475 PGAPVYRRPAEARFPFAAGAAPDPGGGGPPDPD------------APPAPSRLAPAILPD 2522

Query: 124  ----PPVQPE-----------AMDKSGYGPPPPPPLLAPAPDP 151
                 PV P            A D +G  PPPP P  AP   P
Sbjct: 2523 EPVGEPVHPRMLTWIRGLEELASDDAG-DPPPPLPPAAPPAAP 2564



 Score = 32.6 bits (74), Expect = 0.76
 Identities = 16/64 (25%), Positives = 18/64 (28%), Gaps = 3/64 (4%)

Query: 71  DRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP--PPPPLSAAKPPPVQP 128
           DR P         P P P    P+      +   G    G P   P PP      P    
Sbjct: 266 DRAPETARG-ATGPPPPPEAAAPNGAAAPPDGVWGAALAGAPLALPAPPDPPPPAPAGDA 324

Query: 129 EAMD 132
           E  D
Sbjct: 325 EEED 328



 Score = 31.4 bits (71), Expect = 1.7
 Identities = 29/101 (28%), Positives = 31/101 (30%), Gaps = 24/101 (23%)

Query: 63  SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPP---PPLS 119
           S P  G +  P PPP       R     R  +               GPPPPP    P  
Sbjct: 245 SHPLRGDIAAPAPPPVVGEGADRAPETARGAT---------------GPPPPPEAAAPNG 289

Query: 120 AAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           AA PP          G      P  L   PDP P A A   
Sbjct: 290 AAAPPD------GVWGAALAGAPLALPAPPDPPPPAPAGDA 324



 Score = 30.7 bits (69), Expect = 2.8
 Identities = 22/89 (24%), Positives = 26/89 (29%), Gaps = 17/89 (19%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM 131
           R  P  A +     P    R   H    F    GG     P  P P  A+ P P      
Sbjct: 372 RHHPKRASL-----PTRKRRSARHAATPFARGPGGDDQTRPAAPVP--ASVPTPAPTPVP 424

Query: 132 DKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
                   PPPP       P P+A   + 
Sbjct: 425 -----ASAPPPP-----ATPLPSAEPGSD 443


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 50.5 bits (121), Expect = 2e-06
 Identities = 27/135 (20%), Positives = 35/135 (25%), Gaps = 10/135 (7%)

Query: 20  AEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAP 79
           A K E       E QP+         +   ++ E    Q  Q+           P PP  
Sbjct: 129 APKPEPQPPQAPESQPQPQT-----PAQKMLSLEEVEAQLQQRQQAPQ-----LPQPPQQ 178

Query: 80  IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPP 139
           ++    P      P  G P+          G P    P      P   P         P 
Sbjct: 179 VLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQ 238

Query: 140 PPPPLLAPAPDPWPN 154
            PPPL  P       
Sbjct: 239 QPPPLQQPQFPGLSQ 253



 Score = 42.1 bits (99), Expect = 8e-04
 Identities = 26/98 (26%), Positives = 32/98 (32%), Gaps = 9/98 (9%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
              Q        V      PAP  AP +P    + P    P    +  G+    PPPPP 
Sbjct: 202 GYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQ 261

Query: 118 LSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
               +  P QP+A          PPP   P P P    
Sbjct: 262 PPQQQQQPPQPQAQ---------PPPQNQPTPHPGLPQ 290



 Score = 41.3 bits (97), Expect = 0.002
 Identities = 24/80 (30%), Positives = 29/80 (36%), Gaps = 9/80 (11%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPP-----PPLSAAKPPPVQ 127
           P P  AP   P  P    + P   +PQF      +   PP PP     PP   A+PPP  
Sbjct: 221 PAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQN 280

Query: 128 PEAMD----KSGYGPPPPPP 143
                    +    P PPP 
Sbjct: 281 QPTPHPGLPQGQNAPLPPPQ 300



 Score = 38.2 bits (89), Expect = 0.015
 Identities = 29/143 (20%), Positives = 39/143 (27%), Gaps = 10/143 (6%)

Query: 16  GSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPP 75
               +   +     K        Q+  +A +  +   +   D     +APK      P P
Sbjct: 82  PGAPSVGPDSDLSQKTS-TFSPCQSGYEASTDPEYIPDLQPDPSLWGTAPKPE----PQP 136

Query: 76  PPAPIVAPPRPHPNGRHPS-HGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
           P AP   P    P  +  S         +       P PP   L    PP          
Sbjct: 137 PQAPESQPQPQTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFP---- 192

Query: 135 GYGPPPPPPLLAPAPDPWPNAVA 157
             GPP  PP     P   P  V 
Sbjct: 193 QQGPPEQPPGYPQPPQGHPEQVQ 215



 Score = 32.4 bits (74), Expect = 0.74
 Identities = 18/123 (14%), Positives = 26/123 (21%)

Query: 33  EQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRH 92
            +      +      +QV  +      SQ  A        P  PP               
Sbjct: 197 PEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQMP 256

Query: 93  PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPW 152
           P   +P    +        PPP    +     P    A       P   P +  P     
Sbjct: 257 PPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQR 316

Query: 153 PNA 155
              
Sbjct: 317 GPQ 319


>gnl|CDD|237865 PRK14951, PRK14951, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 618

 Score = 45.9 bits (109), Expect = 6e-05
 Identities = 18/116 (15%), Positives = 27/116 (23%), Gaps = 15/116 (12%)

Query: 55  SSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPP 114
           ++  E+   A K          PA               +                   P
Sbjct: 367 AAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAA------------SAP 414

Query: 115 PPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCE 170
             P +AA P PV   A          P    A             P+ V++ V+  
Sbjct: 415 AAPPAAAPPAPVAAPAAAA---PAAAPAAAPAAVALAPAPPAQAAPETVAIPVRVA 467



 Score = 40.9 bits (96), Expect = 0.002
 Identities = 24/128 (18%), Positives = 30/128 (23%), Gaps = 20/128 (15%)

Query: 33  EQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRH 92
           E    A       +     +   +   S  +AP        PP P    A   P      
Sbjct: 385 EAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAP----PAAAPPAPVAAPAAAAPAAAPAA 440

Query: 93  PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPW 152
                             PP    P + A P  V PE        P       APA  P 
Sbjct: 441 APAAVALAPA--------PPAQAAPETVAIPVRVAPE--------PAVASAAPAPAAAPA 484

Query: 153 PNAVADTP 160
              +  T 
Sbjct: 485 AARLTPTE 492



 Score = 35.5 bits (82), Expect = 0.097
 Identities = 23/88 (26%), Positives = 25/88 (28%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
            P   AP  AP         P+                 PP P    AA  P   P A  
Sbjct: 383 RPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAP 442

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTP 160
            +    P PP   AP     P  VA  P
Sbjct: 443 AAVALAPAPPAQAAPETVAIPVRVAPEP 470



 Score = 29.7 bits (67), Expect = 5.6
 Identities = 13/74 (17%), Positives = 15/74 (20%), Gaps = 13/74 (17%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPH--PNGRHPSHGKPQFNHKLGGVYHGPPPPP 115
             +  +AP          P  P  A P     P    P                   P P
Sbjct: 431 AAAPAAAPAAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVAS-----------AAPAP 479

Query: 116 PPLSAAKPPPVQPE 129
               AA       E
Sbjct: 480 AAAPAAARLTPTEE 493


>gnl|CDD|219321 pfam07174, FAP, Fibronectin-attachment protein (FAP).  This family
           contains bacterial fibronectin-attachment proteins
           (FAP). Family members are rich in alanine and proline,
           are approximately 300 residues long, and seem to be
           restricted to mycobacteria. These proteins contain a
           fibronectin-binding motif that allows mycobacteria to
           bind to fibronectin in the extracellular matrix.
          Length = 297

 Score = 44.9 bits (106), Expect = 7e-05
 Identities = 23/50 (46%), Positives = 24/50 (48%), Gaps = 9/50 (18%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           PPPPPP  +AA P P  P         PPPPPP   PAP P     A  P
Sbjct: 43  PPPPPPSTAAAAPAPAAP---------PPPPPPAAPPAPQPDDPNAAPPP 83



 Score = 44.9 bits (106), Expect = 7e-05
 Identities = 23/84 (27%), Positives = 26/84 (30%), Gaps = 16/84 (19%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PPPPP+   A P P      P                 P  PP P          P   D
Sbjct: 44  PPPPPSTAAAAPAPAAPPPPP----------------PPAAPPAPQPDDPNAAPPPPPAD 87

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAV 156
            +   PPP  P   P P P P  +
Sbjct: 88  PNAPPPPPVDPNAPPPPAPEPGRI 111



 Score = 43.0 bits (101), Expect = 2e-04
 Identities = 27/91 (29%), Positives = 29/91 (31%), Gaps = 27/91 (29%)

Query: 63  SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAK 122
           + P     D  PPPP P  A   P P                       PPPPPP +A  
Sbjct: 32  ALPATANADPAPPPPPPSTAAAAPAPA---------------------APPPPPPPAAPP 70

Query: 123 PPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
            P              PPPPP    AP P P
Sbjct: 71  APQPDDPN------AAPPPPPADPNAPPPPP 95



 Score = 38.3 bits (89), Expect = 0.008
 Identities = 24/89 (26%), Positives = 26/89 (29%), Gaps = 27/89 (30%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PPPPP    A                            P P  PP       P  P+  D
Sbjct: 43  PPPPPPSTAAAA--------------------------PAPAAPPPPPPPAAPPAPQPDD 76

Query: 133 KSGYGPPPPPPLLAPAPDPW-PNAVADTP 160
            +   PPPP    AP P P  PNA     
Sbjct: 77  PNAAPPPPPADPNAPPPPPVDPNAPPPPA 105



 Score = 34.1 bits (78), Expect = 0.15
 Identities = 22/65 (33%), Positives = 27/65 (41%), Gaps = 9/65 (13%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PPPPPA   AP    PN   P               + PPPPP   +A  PP  +P  +D
Sbjct: 62  PPPPPAAPPAPQPDDPNAAPPPPPADP---------NAPPPPPVDPNAPPPPAPEPGRID 112

Query: 133 KSGYG 137
            +  G
Sbjct: 113 NAVGG 117



 Score = 32.6 bits (74), Expect = 0.52
 Identities = 14/57 (24%), Positives = 16/57 (28%), Gaps = 12/57 (21%)

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           L    +  P PPPP  +                 P P  P   P P   P    D P
Sbjct: 33  LPATANADPAPPPPPPSTAAAA------------PAPAAPPPPPPPAAPPAPQPDDP 77


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 45.2 bits (107), Expect = 1e-04
 Identities = 27/124 (21%), Positives = 38/124 (30%), Gaps = 8/124 (6%)

Query: 46  STDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRP------HPNGRHPSHGKPQ 99
            +D       +      +  +      PPP P                 +   P+    +
Sbjct: 44  VSDSAELAAVTVVAGAAACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASPARE 103

Query: 100 FNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA--MDKSGYGPPPPPPLLAPAPDPWPNAVA 157
            +    G     PPPP P  A+ PP   P+   M +    P PPP    PA    P AVA
Sbjct: 104 GSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVA 163

Query: 158 DTPK 161
               
Sbjct: 164 SDAA 167



 Score = 44.4 bits (105), Expect = 2e-04
 Identities = 21/118 (17%), Positives = 31/118 (26%), Gaps = 4/118 (3%)

Query: 38  AQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPP----RPHPNGRHP 93
           A  R          +  ++ + + +   +    D       P   P      P       
Sbjct: 794 AAFRRPGRLRRSGPAADAASRTASKRKSRSHTPDGGSESSGPARPPGAAARPPPARSSES 853

Query: 94  SHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
           S  KP          +G   P PP   A+P    P     +      P P   PAP  
Sbjct: 854 SKSKPAAAGGRARGKNGRRRPRPPEPRARPGAAAPPKAAAAAPPAGAPAPRPRPAPRV 911



 Score = 42.9 bits (101), Expect = 5e-04
 Identities = 31/149 (20%), Positives = 44/149 (29%), Gaps = 24/149 (16%)

Query: 26  PNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPR 85
           P+       P   +    + S+     E+SS   S  S    G    P P P+   +P R
Sbjct: 300 PSSPGSGPAPSSPRASSSSSSS----RESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSR 355

Query: 86  PHPNGRHPSHGKPQFNHKLGGVYHGPPP-------------------PPPPLSAAKPPPV 126
           P P     S  K     +                                   A +P P 
Sbjct: 356 PPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRARRRDATGRFPAGRPRPS 415

Query: 127 QPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
             +A   SG       PLL P+ +PWP +
Sbjct: 416 PLDAGAASG-AFYARYPLLTPSGEPWPGS 443



 Score = 41.3 bits (97), Expect = 0.002
 Identities = 29/165 (17%), Positives = 34/165 (20%), Gaps = 38/165 (23%)

Query: 20  AEKTEVPNVSKVEEQPKQ--AQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPP 77
            E    P      E P          + ST   AS       +            PPPP 
Sbjct: 65  FEPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSS----PDPPPPT 120

Query: 78  APIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPP--------------------- 116
            P  +PP              +     G      PP                        
Sbjct: 121 PPPASPPPSPAPDLSE---MLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLS 177

Query: 117 --------PLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
                   P S    PP        S   P    P+ A A  P P
Sbjct: 178 SPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAP 222



 Score = 36.7 bits (85), Expect = 0.042
 Identities = 28/86 (32%), Positives = 33/86 (38%), Gaps = 6/86 (6%)

Query: 64  APKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP 123
            P      RP P P   + P  P P G  P  G  +      G  H P P    L+A  P
Sbjct: 896 PPAGAPAPRPRPAPRVKLGP-MP-PGGPDPRGGFRRVPP---GDLHTPAPSAAALAAYCP 950

Query: 124 PPVQPEAMDKSGYGPPPPPPLLAPAP 149
           P V  E +D   + P P  P LA  P
Sbjct: 951 PEVVAELVDHPLF-PEPWRPALAFDP 975



 Score = 35.5 bits (82), Expect = 0.10
 Identities = 24/124 (19%), Positives = 32/124 (25%), Gaps = 1/124 (0%)

Query: 29  SKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHP 88
            K         +     +    A+       S +S+                   PRP  
Sbjct: 819 RKSRSHTPDGGSESSGPARPPGAAARPPPARSSESSKSKPAAAGGRARGKNGRRRPRPPE 878

Query: 89  NGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD-KSGYGPPPPPPLLAP 147
               P    P            P P P P    K  P+ P   D + G+   PP  L  P
Sbjct: 879 PRARPGAAAPPKAAAAAPPAGAPAPRPRPAPRVKLGPMPPGGPDPRGGFRRVPPGDLHTP 938

Query: 148 APDP 151
           AP  
Sbjct: 939 APSA 942



 Score = 34.8 bits (80), Expect = 0.18
 Identities = 17/130 (13%), Positives = 32/130 (24%), Gaps = 10/130 (7%)

Query: 41  REDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQF 100
              ++S+   +S      E++   P+   +  P                G   S   P+ 
Sbjct: 234 ASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRE 293

Query: 101 NHK--LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS--------GYGPPPPPPLLAPAPD 150
                        P P  P +++     +  +   +        G    P P        
Sbjct: 294 RSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSP 353

Query: 151 PWPNAVADTP 160
             P   AD  
Sbjct: 354 SRPPPPADPS 363



 Score = 33.2 bits (76), Expect = 0.53
 Identities = 23/90 (25%), Positives = 27/90 (30%), Gaps = 11/90 (12%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
             PP  P  + P    + R P    P             P P P  SAA           
Sbjct: 187 SSPPAEPPPSTPPAAASPRPPRRSSPI------SASASSPAPAPGRSAADDAGASSSDSS 240

Query: 133 KS-----GYGPPPPPPLLAPAPDPWPNAVA 157
            S     G+GP    PL  PAP   P  + 
Sbjct: 241 SSESSGCGWGPENECPLPRPAPITLPTRIW 270



 Score = 32.5 bits (74), Expect = 0.83
 Identities = 32/145 (22%), Positives = 42/145 (28%), Gaps = 27/145 (18%)

Query: 43  DAESTDQVASETSSDQESQQSAPKH--GYVDRPPPPPAPIVAPPRP----HPNGRHPSHG 96
              S    A  +SSD  S +S+        + P P PAPI  P R       NG     G
Sbjct: 224 PGRSAADDAGASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPG 283

Query: 97  KPQFNHKLGGVYHGPPPPPPPLSAAKPPPV-------------------QPEAMDKSGYG 137
            P  +          P P  P S   P                        E+   +   
Sbjct: 284 -PASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVS 342

Query: 138 PPPPP-PLLAPAPDPWPNAVADTPK 161
           P P P    +P+  P P   +   K
Sbjct: 343 PGPSPSRSPSPSRPPPPADPSSPRK 367



 Score = 29.8 bits (67), Expect = 6.2
 Identities = 17/86 (19%), Positives = 22/86 (25%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
            P     + P      R  S  K + +   GG     P  PP  +A  PP    E+    
Sbjct: 798 RPGRLRRSGPAADAASRTASKRKSRSHTPDGGSESSGPARPPGAAARPPPARSSESSKSK 857

Query: 135 GYGPPPPPPLLAPAPDPWPNAVADTP 160
                           P P      P
Sbjct: 858 PAAAGGRARGKNGRRRPRPPEPRARP 883


>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
          Length = 333

 Score = 44.3 bits (105), Expect = 1e-04
 Identities = 31/137 (22%), Positives = 45/137 (32%), Gaps = 16/137 (11%)

Query: 28  VSKVEEQPKQAQNREDAESTDQVASE-TSSDQESQQSAPKHGYVDRPPPP-PAPIVAPPR 85
           V +V   P  AQ  E A  + Q   +   +  + +Q   +      PP   P P    P+
Sbjct: 68  VHRVNHAPANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPRPAQPAPQ 127

Query: 86  PHPNGRH-PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPL 144
           P     + P   +P        V     P P P+ +A  P  Q            P  P+
Sbjct: 128 PVQQPAYQPQPEQPL----QQPVSPQVAPAPQPVHSAPQPAQQAFQ---------PAEPV 174

Query: 145 LAPAPDPWPNAVADTPK 161
            AP P+P         K
Sbjct: 175 AAPQPEPVAEPAPVMDK 191



 Score = 32.7 bits (75), Expect = 0.44
 Identities = 17/113 (15%), Positives = 27/113 (23%), Gaps = 1/113 (0%)

Query: 18  QFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPP 77
              +    P  +  + +    Q  E                +  Q        ++P   P
Sbjct: 86  PSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPRPAQPAPQPVQQPAYQPQPEQPLQQP 145

Query: 78  APIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA 130
                 P P P    P   +  F      V    P P    +     P + EA
Sbjct: 146 VSPQVAPAPQPVHSAPQPAQQAF-QPAEPVAAPQPEPVAEPAPVMDKPKRKEA 197


>gnl|CDD|148139 pfam06346, Drf_FH1, Formin Homology Region 1.  This region is found
           in some of the Diaphanous related formins (Drfs). It
           consists of low complexity repeats of around 12
           residues.
          Length = 160

 Score = 41.5 bits (97), Expect = 3e-04
 Identities = 40/103 (38%), Positives = 43/103 (41%), Gaps = 14/103 (13%)

Query: 65  PKHGYVDRPPPPPAPIVA---PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAA 121
           P  G    PPPPP P VA   PP P P         P     L G    PPPPP P  A 
Sbjct: 41  PLPGGAAIPPPPPLPGVAGIPPPPPLPGATAIPPPPP-----LPGAAGIPPPPPLPGGAG 95

Query: 122 KPPPVQPEAMDKSGYGPPPPP----PLLAPAPDPWPNAVADTP 160
            PPP  P  +      PPPPP    P + P P P+P A    P
Sbjct: 96  IPPP--PPPLPGGAAVPPPPPLPGGPGVPPPPPPFPGAPGIPP 136



 Score = 37.2 bits (86), Expect = 0.007
 Identities = 33/92 (35%), Positives = 34/92 (36%), Gaps = 21/92 (22%)

Query: 65  PKHGYVDRPPPPPAP---IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAA 121
           P  G    PPPPP P    + PP P P G                   G PPPPPPL   
Sbjct: 65  PLPGATAIPPPPPLPGAAGIPPPPPLPGGA------------------GIPPPPPPLPGG 106

Query: 122 KPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
              P  P      G  PPPPP   AP   P P
Sbjct: 107 AAVPPPPPLPGGPGVPPPPPPFPGAPGIPPPP 138



 Score = 35.3 bits (81), Expect = 0.034
 Identities = 32/93 (34%), Positives = 32/93 (34%), Gaps = 13/93 (13%)

Query: 65  PKHGYVDRPPPPPAP----IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSA 120
           P  G    PPPPP P    I  PP P P G       P      G     PPPP P    
Sbjct: 77  PLPGAAGIPPPPPLPGGAGIPPPPPPLPGGAAVPPPPPLPG---GPGVPPPPPPFPGAPG 133

Query: 121 AKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
             PPP         G  PPPP     PA    P
Sbjct: 134 IPPPPPGM------GSPPPPPFGFGVPAAPVLP 160



 Score = 34.1 bits (78), Expect = 0.085
 Identities = 33/93 (35%), Positives = 35/93 (37%), Gaps = 16/93 (17%)

Query: 73  PPPPPAP---IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP---------PPPPLSA 120
           PPPPP P    + PP P P G       P      GG    PPP         PPPPL  
Sbjct: 1   PPPPPLPGGVGIPPPPPLPGGVCIPPPPPL----PGGTGIPPPPPLPGGAAIPPPPPLPG 56

Query: 121 AKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
               P  P     +   PPPP P  A  P P P
Sbjct: 57  VAGIPPPPPLPGATAIPPPPPLPGAAGIPPPPP 89



 Score = 32.6 bits (74), Expect = 0.30
 Identities = 37/103 (35%), Positives = 40/103 (38%), Gaps = 19/103 (18%)

Query: 73  PPPPPAP---IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPP---------PPPLSA 120
           PPPPP P    + PP P P G       P      GG    PPPP         PPPL  
Sbjct: 13  PPPPPLPGGVCIPPPPPLPGGTGIPPPPPL----PGGAAIPPPPPLPGVAGIPPPPPLPG 68

Query: 121 AKPPPVQPEAMDKSGYGPPPPPPL---LAPAPDPWPNAVADTP 160
           A   P  P     +G  PPPP P    + P P P P   A  P
Sbjct: 69  ATAIPPPPPLPGAAGIPPPPPLPGGAGIPPPPPPLPGGAAVPP 111



 Score = 30.7 bits (69), Expect = 1.1
 Identities = 25/60 (41%), Positives = 25/60 (41%), Gaps = 9/60 (15%)

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLA----PAPDPWPNAVADTP 160
           GGV   PPPP P      PPP  P      G G PPPPPL      P P P P      P
Sbjct: 8   GGVGIPPPPPLPGGVCIPPPPPLP-----GGTGIPPPPPLPGGAAIPPPPPLPGVAGIPP 62


>gnl|CDD|237864 PRK14950, PRK14950, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 585

 Score = 42.9 bits (101), Expect = 4e-04
 Identities = 23/92 (25%), Positives = 32/92 (34%), Gaps = 12/92 (13%)

Query: 77  PAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGY 136
           P P   P +P      P    P            P   P   +AA  PP +P   + +  
Sbjct: 362 PVPAPQPAKPTAAAPSPVRPTP-----------APSTRPKAAAAANIPPKEPVR-ETATP 409

Query: 137 GPPPPPPLLAPAPDPWPNAVADTPKIVSLDVK 168
            P PP P+  P P    +A   T   + +D K
Sbjct: 410 PPVPPRPVAPPVPHTPESAPKLTRAAIPVDEK 441



 Score = 41.7 bits (98), Expect = 0.001
 Identities = 25/110 (22%), Positives = 30/110 (27%), Gaps = 19/110 (17%)

Query: 63  SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAK 122
             P      RP    A  + P  P      P    P+           PP P  P SA K
Sbjct: 379 VRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRPV--------APPVPHTPESAPK 430

Query: 123 PPPVQPEAMDKSGYGPPPPPP-----------LLAPAPDPWPNAVADTPK 161
                    +K  Y PP PP            +L      W   + D P 
Sbjct: 431 LTRAAIPVDEKPKYTPPAPPKEEEKALIADGDVLEQLEAIWKQILRDVPP 480



 Score = 40.9 bits (96), Expect = 0.002
 Identities = 20/91 (21%), Positives = 22/91 (24%), Gaps = 10/91 (10%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
           P PAP   P         P     +           PPP PP   A   P     A   +
Sbjct: 381 PTPAPSTRPKAAAAANIPPKEPVRETAT--------PPPVPPRPVAPPVPHTPESAPKLT 432

Query: 135 GYGPPPP--PPLLAPAPDPWPNAVADTPKIV 163
               P    P    PAP             V
Sbjct: 433 RAAIPVDEKPKYTPPAPPKEEEKALIADGDV 463



 Score = 30.5 bits (69), Expect = 2.5
 Identities = 19/101 (18%), Positives = 24/101 (23%), Gaps = 14/101 (13%)

Query: 25  VPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQ-QSAPKHGYVDRP---PPPPAPI 80
            P                 +      A+     +E   ++A       RP   P P  P 
Sbjct: 367 QPAKPTAAAPSPVRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRPVAPPVPHTPE 426

Query: 81  VAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAA 121
            AP         P   KP         Y  P PP     A 
Sbjct: 427 SAPKL--TRAAIPVDEKP--------KYTPPAPPKEEEKAL 457


>gnl|CDD|218191 pfam04652, DUF605, Vta1 like.  Vta1 (VPS20-associated protein 1) is
           a positive regulator of Vps4. Vps4 is an ATPase that is
           required in the multivesicular body (MVB) sorting
           pathway to dissociate the endosomal sorting complex
           required for transport (ESCRT). Vta1 promotes correct
           assembly of Vps4 and stimulates its ATPase activity
           through its conserved Vta1/SBP1/LIP5 region.
          Length = 315

 Score = 42.4 bits (100), Expect = 5e-04
 Identities = 26/118 (22%), Positives = 34/118 (28%), Gaps = 9/118 (7%)

Query: 32  EEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGR 91
           +     + N    E  D  ++  S    S    P        P  P+    PP P     
Sbjct: 162 DVATTNSDNSFPGEDADPASASPSDPPSSSPGVPSFPSPPEDPSSPSDSSLPPAPSS--- 218

Query: 92  HPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAP 149
                 P  + +       PP P  P     PPP   +    S   P PP     PAP
Sbjct: 219 -FQSDTPPPSPESPTNPSPPPGPAAP-----PPPPVQQVPPLSTAKPTPPSASATPAP 270



 Score = 42.0 bits (99), Expect = 6e-04
 Identities = 20/119 (16%), Positives = 31/119 (26%), Gaps = 18/119 (15%)

Query: 42  EDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFN 101
           EDA+     +  +   +++  ++          P   P  +P  P           P   
Sbjct: 159 EDADVATTNSDNSFPGEDADPASA--------SPSDPPSSSPGVPSFPSPPEDPSSPS-- 208

Query: 102 HKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
                     PP P    +  PPP        +   PP P     P     P      P
Sbjct: 209 ------DSSLPPAPSSFQSDTPPPSPES--PTNPSPPPGPAAPPPPPVQQVPPLSTAKP 259



 Score = 38.9 bits (91), Expect = 0.006
 Identities = 19/112 (16%), Positives = 29/112 (25%), Gaps = 11/112 (9%)

Query: 40  NREDAESTDQVASETSSDQESQQSAPKHGYVDR---PPPPPAPIVAPPRPHPNGRHPSHG 96
           + + A ++      +S    S  S P+          PP P+   +   P       +  
Sbjct: 176 DADPASASPSDPPSSSPGVPSFPSPPEDPSSPSDSSLPPAPSSFQSDTPPPSPESPTNPS 235

Query: 97  KPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA 148
            P             PPPPP          +P     S    P     L   
Sbjct: 236 PPPGPA--------APPPPPVQQVPPLSTAKPTPPSASATPAPIGGITLDDD 279



 Score = 32.7 bits (75), Expect = 0.44
 Identities = 18/132 (13%), Positives = 29/132 (21%), Gaps = 23/132 (17%)

Query: 29  SKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHP 88
           +    +     +   ++         S     +  +        P P       PP    
Sbjct: 170 NSFPGEDADPASASPSDPPSSSPGVPSFPSPPEDPSSPSDSSLPPAPSSFQSDTPPPSPE 229

Query: 89  NGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA 148
           +                     P  P PP   A PPP   + +       P PP   A A
Sbjct: 230 S---------------------PTNPSPPPGPAAPPPPPVQQVPPLSTAKPTPPS--ASA 266

Query: 149 PDPWPNAVADTP 160
                  +    
Sbjct: 267 TPAPIGGITLDD 278



 Score = 31.2 bits (71), Expect = 1.6
 Identities = 17/100 (17%), Positives = 25/100 (25%), Gaps = 1/100 (1%)

Query: 35  PKQAQNREDAESTDQVASETS-SDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHP 93
           P    +      +     E   S  +S        +    PPP       P P P    P
Sbjct: 184 PSDPPSSSPGVPSFPSPPEDPSSPSDSSLPPAPSSFQSDTPPPSPESPTNPSPPPGPAAP 243

Query: 94  SHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK 133
                Q    L      PP      +      +  +A+ K
Sbjct: 244 PPPPVQQVPPLSTAKPTPPSASATPAPIGGITLDDDAIAK 283



 Score = 30.4 bits (69), Expect = 2.7
 Identities = 15/90 (16%), Positives = 19/90 (21%), Gaps = 3/90 (3%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM 131
             P P  P+             S                P  PP         P  PE  
Sbjct: 146 EDPNPGPPLDEEDEDADVATTNSDN-SFPGEDADPASASPSDPPSSSPGVPSFPSPPEDP 204

Query: 132 DKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
                   PP P  +      P    ++P 
Sbjct: 205 SSPSDSSLPPAP--SSFQSDTPPPSPESPT 232


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 42.7 bits (101), Expect = 6e-04
 Identities = 19/137 (13%), Positives = 30/137 (21%), Gaps = 9/137 (6%)

Query: 37  QAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHG 96
            A       +    A    +     Q AP             P   PP+       PS  
Sbjct: 675 GAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPA 734

Query: 97  ------KPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPD 150
                  P            P  PPPP + A                P     +      
Sbjct: 735 ADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAP---PPSPPSEEEEMAEDDAP 791

Query: 151 PWPNAVADTPKIVSLDV 167
              +      + V++++
Sbjct: 792 SMDDEDRRDAEEVAMEL 808



 Score = 41.9 bits (99), Expect = 0.001
 Identities = 24/118 (20%), Positives = 32/118 (27%), Gaps = 5/118 (4%)

Query: 43  DAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNH 102
                    S  ++   +  +         P    AP      P P         P    
Sbjct: 391 AGAPAAAAPSAAAAAPAAAPAPAA----AAPAAAAAP-APAAAPQPAPAPAPAPAPPSPA 445

Query: 103 KLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
                   P PPP    +A+P P    A + +    P PP   APA  P   A    P
Sbjct: 446 GNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAP 503



 Score = 41.1 bits (97), Expect = 0.002
 Identities = 21/116 (18%), Positives = 31/116 (26%), Gaps = 4/116 (3%)

Query: 44  AESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAP--PRPHPNGRHPSHGKPQFN 101
           +   ++ A   +    +  +AP        P   +   AP    P  + +H +       
Sbjct: 606 SGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDG 665

Query: 102 HKLGGVYHGPPPP--PPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
                   G   P  PPP  A   P     A        P   P    A DP    
Sbjct: 666 GDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQP 721



 Score = 40.4 bits (95), Expect = 0.003
 Identities = 24/105 (22%), Positives = 31/105 (29%), Gaps = 6/105 (5%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
             +  S+       RP  P AP  AP  P P G   +  +       G       P    
Sbjct: 600 PPAPASSGPPEEAARPAAPAAP-AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVA 658

Query: 118 LSAA--KPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           +  A         +A   +   PPP P   APA    P   A   
Sbjct: 659 VPDASDGGDGWPAKAGGAAPAAPPPAP---APAAPAAPAGAAPAQ 700



 Score = 40.0 bits (94), Expect = 0.004
 Identities = 24/129 (18%), Positives = 38/129 (29%), Gaps = 4/129 (3%)

Query: 35  PKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPS 94
               ++    +++D      +    +  +AP        P  PA   AP +P P      
Sbjct: 651 EHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAG-AAPAQPAPAPAATP 709

Query: 95  HGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK---SGYGPPPPPPLLAPAPDP 151
                 +              P  +A  P P+ PE  D    +G    PPPP        
Sbjct: 710 PAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAA 769

Query: 152 WPNAVADTP 160
              A   +P
Sbjct: 770 PAAAPPPSP 778



 Score = 39.2 bits (92), Expect = 0.006
 Identities = 23/147 (15%), Positives = 30/147 (20%), Gaps = 6/147 (4%)

Query: 20  AEKTEVPNVSKVEEQPKQAQNREDAESTDQVASE------TSSDQESQQSAPKHGYVDRP 73
                        E   +     DA                ++   +   A         
Sbjct: 638 EASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAA 697

Query: 74  PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK 133
           P  PAP  A   P      P+   PQ              P P       PP    A  +
Sbjct: 698 PAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQ 757

Query: 134 SGYGPPPPPPLLAPAPDPWPNAVADTP 160
               P P P     A  P      +  
Sbjct: 758 PPPPPAPAPAAAPAAAPPPSPPSEEEE 784



 Score = 38.8 bits (91), Expect = 0.009
 Identities = 28/137 (20%), Positives = 36/137 (26%), Gaps = 14/137 (10%)

Query: 34  QPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHP 93
            P  A     A      A+  ++   +  +AP+      P P PAP          G   
Sbjct: 398 APSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQP--APAPAPAPAPPSPAGNAPAGGAPS 455

Query: 94  SHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP-- 151
                            P P   P   A P P  P A   +     P  P      D   
Sbjct: 456 PPPAAA-----PSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADDAA 510

Query: 152 -----WPNAVADTPKIV 163
                WP  +A  PK  
Sbjct: 511 TLRERWPEILAAVPKRS 527



 Score = 33.8 bits (78), Expect = 0.27
 Identities = 24/119 (20%), Positives = 31/119 (26%), Gaps = 3/119 (2%)

Query: 43  DAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNH 102
             E     AS    ++ ++ +AP        P P     A P        P    P+  H
Sbjct: 596 GGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGA-AAAPAEASAAPAPGVAAPE--H 652

Query: 103 KLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
               V             AK     P A   +     P  P  A    P P   A  P 
Sbjct: 653 HPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPA 711



 Score = 30.0 bits (68), Expect = 4.2
 Identities = 15/57 (26%), Positives = 18/57 (31%), Gaps = 1/57 (1%)

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           + G    P    P  +AA P      A   +      P P  AP P P P      P
Sbjct: 387 VAGGAGAPAAAAPSAAAAAPAAAPAPAAA-APAAAAAPAPAAAPQPAPAPAPAPAPP 442


>gnl|CDD|237057 PRK12323, PRK12323, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 700

 Score = 41.8 bits (98), Expect = 9e-04
 Identities = 19/96 (19%), Positives = 25/96 (26%), Gaps = 10/96 (10%)

Query: 71  DRPPPPPAPIVAPPRPHPNGRHPSHGKP---------QFNHKLGGVYHGPPPPPPPL-SA 120
             P  P A   A          P+   P         Q + +  G    P P P    +A
Sbjct: 401 APPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAA 460

Query: 121 AKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAV 156
           A  P         +     P     A AP P  +  
Sbjct: 461 AARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDP 496



 Score = 41.0 bits (96), Expect = 0.002
 Identities = 22/88 (25%), Positives = 24/88 (27%), Gaps = 4/88 (4%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
             P  A         P  R P+        +      G  P P P  AA P      A  
Sbjct: 408 AAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAA 467

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTP 160
               GP P     A AP     A A  P
Sbjct: 468 ----GPRPVAAAAAAAPARAAPAAAPAP 491



 Score = 39.5 bits (92), Expect = 0.005
 Identities = 30/144 (20%), Positives = 38/144 (26%), Gaps = 23/144 (15%)

Query: 20  AEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAP 79
           A        +     P  A       +     S       + + A   G    P P PAP
Sbjct: 395 AAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAP 454

Query: 80  IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK-SGYGP 138
                 P    R  +                 P P    +AA P    P A    +   P
Sbjct: 455 ---AAAPAAAARPAA---------------AGPRPVAAAAAAAPARAAPAAAPAPADDDP 496

Query: 139 PP----PPPLLAPAPDPWPNAVAD 158
           PP    PP   +PAP     A A 
Sbjct: 497 PPWEELPPEFASPAPAQPDAAPAG 520



 Score = 35.6 bits (82), Expect = 0.090
 Identities = 26/160 (16%), Positives = 35/160 (21%), Gaps = 19/160 (11%)

Query: 11  LQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKH--G 68
           L    G            +    QP  A     A +    A   +       +A      
Sbjct: 361 LAFRPGQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVA 420

Query: 69  YVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQP 128
                  P    +A  R            P            P   P    AA+P    P
Sbjct: 421 AAPARRSPAPEALAAARQASARGPGGAPAPA---------PAPAAAPAA--AARPAAAGP 469

Query: 129 EAMDKSGYGPPP---PPPLLAPAPD---PWPNAVADTPKI 162
             +  +    P    P    APA D   PW     +    
Sbjct: 470 RPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASP 509



 Score = 34.5 bits (79), Expect = 0.16
 Identities = 16/55 (29%), Positives = 16/55 (29%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVS 164
           GP        A   P     A        PP  P  APA      AVA  P   S
Sbjct: 373 GPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRS 427



 Score = 32.2 bits (73), Expect = 0.96
 Identities = 19/123 (15%), Positives = 21/123 (17%), Gaps = 24/123 (19%)

Query: 62  QSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHP---------------SHGKPQFNHKLGG 106
                        P  A   A P P  +   P                   P        
Sbjct: 467 AGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESI 526

Query: 107 VYHGPPPPPPPLSAAKPPPVQPEAMDKSG----YGPPPPPPLLAPAPDP-----WPNAVA 157
                  P        P P    A   +        P PP   A          WP   A
Sbjct: 527 PDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMFDGDWPALAA 586

Query: 158 DTP 160
             P
Sbjct: 587 RLP 589



 Score = 29.8 bits (67), Expect = 5.5
 Identities = 14/55 (25%), Positives = 17/55 (30%), Gaps = 6/55 (10%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSL 165
           P P     +AA P P  P A       P   P   A A          +P   +L
Sbjct: 385 PAPAAAAPAAAAPAPAAPPAA------PAAAPAAAAAARAVAAAPARRSPAPEAL 433


>gnl|CDD|218621 pfam05518, Totivirus_coat, Totivirus coat protein. 
          Length = 753

 Score = 41.3 bits (97), Expect = 0.001
 Identities = 24/83 (28%), Positives = 26/83 (31%), Gaps = 10/83 (12%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
            P  P    A  R     R             GG   G  PPPP L AA  P     ++ 
Sbjct: 681 DPVRPTAHHAALRAPQAPRPGGPPG-------GG---GGLPPPPDLPAAAGPAPCGSSLI 730

Query: 133 KSGYGPPPPPPLLAPAPDPWPNA 155
            S   PP P P  A   D   N 
Sbjct: 731 ASPTAPPEPEPPGAEQADGAENQ 753



 Score = 40.6 bits (95), Expect = 0.003
 Identities = 26/98 (26%), Positives = 31/98 (31%), Gaps = 13/98 (13%)

Query: 76  PPAPIVAPPRP--HPNGRHPSHGKPQFNHKLGG--------VYHGPPPPPPPLSAAKPPP 125
           PP    A PRP  +  G     G P                V  G P  P    AA   P
Sbjct: 636 PPVFKTALPRPDYNRGGEAGGPGVPGPVPVGMPAHTARPSRVARGDPVRPTAHHAALRAP 695

Query: 126 VQPE---AMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
             P         G  PPPP    A  P P  +++  +P
Sbjct: 696 QAPRPGGPPGGGGGLPPPPDLPAAAGPAPCGSSLIASP 733


>gnl|CDD|237030 PRK12270, kgd, alpha-ketoglutarate decarboxylase; Reviewed.
          Length = 1228

 Score = 40.6 bits (96), Expect = 0.003
 Identities = 28/140 (20%), Positives = 34/140 (24%), Gaps = 25/140 (17%)

Query: 26  PNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPR 85
            N   VEE  +Q     D  S D    E  +D               P    AP  A   
Sbjct: 6   QNEWLVEEMYQQYL--ADPNSVDPSWREFFADY-------------GPGSTAAPTAAAAA 50

Query: 86  PHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLL 145
                  P+                 P  P P   A   P  P     +      P    
Sbjct: 51  AAAAASAPAAAPAA----------KAPAAPAPAPPAAAAPAAPPKPAAAAAAAAAPAAPP 100

Query: 146 APAPDPWPNAVADTPKIVSL 165
           A A    P A A   ++  L
Sbjct: 101 AAAAAAAPAAAAVEDEVTPL 120


>gnl|CDD|237871 PRK14965, PRK14965, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 576

 Score = 40.1 bits (94), Expect = 0.003
 Identities = 18/86 (20%), Positives = 19/86 (22%), Gaps = 10/86 (11%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM 131
             P PP+     P P      P    P              P P P   A   P    A 
Sbjct: 380 GAPAPPSAAWGAPTPAAPAAPPPAAAPPVPPAAPARPAAARPAPAPAPPAAAAPPARSAD 439

Query: 132 DKSGYGPPPPPPLLAPAPDPWPNAVA 157
                           A D W   VA
Sbjct: 440 P----AAAA------SAGDRWRAFVA 455



 Score = 37.8 bits (88), Expect = 0.014
 Identities = 14/50 (28%), Positives = 16/50 (32%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           P  P  P  AA PP         +   P P P   A A  P  +A     
Sbjct: 394 PAAPAAPPPAAAPPVPPAAPARPAAARPAPAPAPPAAAAPPARSADPAAA 443



 Score = 29.7 bits (67), Expect = 4.9
 Identities = 20/53 (37%), Positives = 22/53 (41%), Gaps = 5/53 (9%)

Query: 113 PPPPPLSAAKPPPVQPEAMDKSGYGPPPPP----PLLA-PAPDPWPNAVADTP 160
           P PP  +   P P  P A   +   P PP     P  A PAP P P A A  P
Sbjct: 382 PAPPSAAWGAPTPAAPAAPPPAAAPPVPPAAPARPAAARPAPAPAPPAAAAPP 434


>gnl|CDD|218146 pfam04554, Extensin_2, Extensin-like region. 
          Length = 57

 Score = 35.9 bits (83), Expect = 0.003
 Identities = 23/83 (27%), Positives = 28/83 (33%), Gaps = 27/83 (32%)

Query: 69  YVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQP 128
           Y  + PPPP    +PP P+                    Y  PPPP        PPP   
Sbjct: 1   YHYKSPPPPVKQYSPPPPY-------------------YYKSPPPPVKSPVYKSPPP--- 38

Query: 129 EAMDKSGYGPPPPPPLLAPAPDP 151
                  Y  PPPP  +  +P P
Sbjct: 39  -----PVYKSPPPPKYVYKSPPP 56



 Score = 26.7 bits (59), Expect = 6.6
 Identities = 19/81 (23%), Positives = 21/81 (25%), Gaps = 27/81 (33%)

Query: 62  QSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAA 121
           +S P       PPPP      PP                            PPPP   + 
Sbjct: 4   KSPPPPVKQYSPPPPYYYKSPPPPVKS-------------------PVYKSPPPPVYKSP 44

Query: 122 KPPPVQPEAMDKSGYGPPPPP 142
            PP           Y  PPPP
Sbjct: 45  PPPKYV--------YKSPPPP 57


>gnl|CDD|220708 pfam10349, WWbp, WW-domain ligand protein.  The WWbp domain is
           characterized by several short PY and PT-like motifs of
           the PPPPY form. These appear to bind directly to the WW
           domains of WWP1 and WWP2 and other such diverse proteins
           as dystrophin and YAP (Yes-associated protein). This is
           the WW-domain binding protein WWbp via PY and PY_like
           motifs. The presence of a phosphotyrosine residue in the
           pWBP-1 peptide abolishes WW domain binding which
           suggests a potential regulatory role for tyrosine
           phosphorylation in modulating WW domain-ligand
           interactions. Given the likelihood that WWP1 and WWP2
           function as E3 ubiquitin-protein ligases, it is possible
           that initial substrate-specific recognition occurs via
           WW domain-substrate protein interaction followed by
           ubiquitin transfer and subsequent proteolysis. This
           domain lies just downstream of the GRAM (pfam02893) in
           many members.
          Length = 111

 Score = 37.4 bits (87), Expect = 0.003
 Identities = 22/86 (25%), Positives = 25/86 (29%), Gaps = 21/86 (24%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
           +ES    P   YV   P P     A P P+                       P  P PP
Sbjct: 42  RESGYYPPPGAYVHLEPLPAYGQYAAPPPYG-------------------PPPPYYPAPP 82

Query: 118 LSAAKPPPVQPEAMDKSGYGPPPPPP 143
                PPP     M  +    PPPP 
Sbjct: 83  GVYPTPPPPNSGYM--ADPQEPPPPY 106



 Score = 27.4 bits (61), Expect = 9.8
 Identities = 13/40 (32%), Positives = 15/40 (37%)

Query: 109 HGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA 148
            G  PPP      +P P   +      YGPPPP     P 
Sbjct: 44  SGYYPPPGAYVHLEPLPAYGQYAAPPPYGPPPPYYPAPPG 83


>gnl|CDD|220603 pfam10152, DUF2360, Predicted coiled-coil domain-containing protein
           (DUF2360).  This is the conserved 140 amino acid region
           of a family of proteins conserved from nematodes to
           humans. One C. elegans member is annotated as a
           Daf-16-dependent longevity protein 1 but this could not
           be confirmed. The function is unknown.
          Length = 147

 Score = 38.1 bits (89), Expect = 0.003
 Identities = 23/74 (31%), Positives = 28/74 (37%), Gaps = 15/74 (20%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP------KIV 163
           GPPPPPP  + A  PP              PP      AP      VA  P      K++
Sbjct: 70  GPPPPPPARAEAASPPPP-------EAPAEPPAEPEPEAPAENTVTVAKDPRYAKYFKML 122

Query: 164 SLDVKCE--KNSMK 175
            L V  +  KN M+
Sbjct: 123 KLGVPAQAVKNKMQ 136



 Score = 29.7 bits (67), Expect = 2.5
 Identities = 15/76 (19%), Positives = 17/76 (22%), Gaps = 16/76 (21%)

Query: 51  ASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHG 110
             + +        A        PPPPPA   A   P P                      
Sbjct: 50  LEDVTVQTTPPPPASAITNGGPPPPPPARAEAASPPPPEAPAEP---------------- 93

Query: 111 PPPPPPPLSAAKPPPV 126
           P  P P   A     V
Sbjct: 94  PAEPEPEAPAENTVTV 109


>gnl|CDD|217469 pfam03276, Gag_spuma, Spumavirus gag protein. 
          Length = 582

 Score = 39.9 bits (93), Expect = 0.003
 Identities = 24/88 (27%), Positives = 29/88 (32%), Gaps = 10/88 (11%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PPPP +    PP        PS      N      ++   P P P   + PP   P    
Sbjct: 188 PPPPSSLPGLPPGSSS--LAPSASSTPGNRLPRVSFNPFLPGPSPAQPSAPPASIP---- 241

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTP 160
                PP PP +   AP P P      P
Sbjct: 242 ----APPIPPVIQYVAPPPVPPPQPIIP 265



 Score = 34.9 bits (80), Expect = 0.14
 Identities = 16/80 (20%), Positives = 24/80 (30%), Gaps = 8/80 (10%)

Query: 72  RPPPPPA--PIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE 129
            PP      P  +          P +  P+ +         P  P  P ++   PP+ P 
Sbjct: 189 PPPSSLPGLPPGSSSLAPSASSTPGNRLPRVSFNPFLPGPSPAQPSAPPASIPAPPIPPV 248

Query: 130 AMDKSGYGPPPPPPLLAPAP 149
                      PPP+  P P
Sbjct: 249 IQ------YVAPPPVPPPQP 262


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 40.1 bits (93), Expect = 0.004
 Identities = 27/103 (26%), Positives = 39/103 (37%), Gaps = 10/103 (9%)

Query: 60  SQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLS 119
            Q    +H +     P   P  + P   P     +    Q          G   P PPL 
Sbjct: 448 PQSPFAQHPFTSGGLPAIGPPPSLPTSTPAAPPRASSGSQPPGSALPSSGGCAGPGPPL- 506

Query: 120 AAKPPPVQ--PEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
               PP+Q   E +D++     PPPP  +P+P+P    V +TP
Sbjct: 507 ----PPIQIKEEPLDEAEEPESPPPPPRSPSPEP---TVVNTP 542



 Score = 33.9 bits (77), Expect = 0.25
 Identities = 32/122 (26%), Positives = 44/122 (36%), Gaps = 4/122 (3%)

Query: 34  QPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPP-PPAPIVAPPRPHPNGRH 92
           QP      +         ++  S     QSA +     R  P PPAP +   +P P    
Sbjct: 295 QPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQPQQPPREQPLPPAPSMPHIKPPPTTPI 354

Query: 93  PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPP-VQPEAMDKSGYGPPPPPPLLAPAPDP 151
           P       +HK      GP P P   S   PPP ++P +   + + P   PP L   P  
Sbjct: 355 PQLPNQ--SHKHPPHLQGPSPFPQMPSNLPPPPALKPLSSLPTHHPPSAHPPPLQLMPQS 412

Query: 152 WP 153
            P
Sbjct: 413 QP 414



 Score = 33.5 bits (76), Expect = 0.40
 Identities = 27/133 (20%), Positives = 38/133 (28%), Gaps = 36/133 (27%)

Query: 74  PPPPA--PIVAPPRPHPNGRHP--------------------------SHGKPQFNHKLG 105
           PPPPA  P+ + P  HP   HP                          S       H   
Sbjct: 382 PPPPALKPLSSLPTHHPPSAHPPPLQLMPQSQPLQSVPAQPPVLTQSQSLPPKASTHPHS 441

Query: 106 GVYHGPPPPP---PPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPD-----PWPNAVA 157
           G++ GPP  P    P ++   P + P     +     PP       P             
Sbjct: 442 GLHSGPPQSPFAQHPFTSGGLPAIGPPPSLPTSTPAAPPRASSGSQPPGSALPSSGGCAG 501

Query: 158 DTPKIVSLDVKCE 170
             P +  + +K E
Sbjct: 502 PGPPLPPIQIKEE 514



 Score = 32.0 bits (72), Expect = 1.2
 Identities = 25/113 (22%), Positives = 37/113 (32%), Gaps = 13/113 (11%)

Query: 46  STDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRH-PSHGKPQFNHKL 104
           S     S++ S  + Q   P+     + PP  A   + P P P+ +  P  G P      
Sbjct: 156 SPQDNESDSDSSAQQQLLQPQGPPSIQVPPGAALAPSAPPPTPSAQAVPPQGSPI----- 210

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPP--PLLAPAPDPWPNA 155
                   P P P   +    +   ++       P PP  P  A    P P A
Sbjct: 211 -----AAQPAPQPQQPSPLSLISAPSLHPQRLPSPHPPLQPQTASQQSPQPPA 258



 Score = 31.6 bits (71), Expect = 1.5
 Identities = 28/109 (25%), Positives = 35/109 (32%), Gaps = 14/109 (12%)

Query: 61  QQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSA 120
             +   H      P PP       +  P    PS   PQ +H      HGP PP P    
Sbjct: 227 ISAPSLHPQRLPSPHPPLQPQTASQQSPQPPAPSSRHPQSSH------HGPGPPMPHALQ 280

Query: 121 AKPP--------PVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
             P         P QP  + +S   P P P    P     P+  A  P+
Sbjct: 281 QGPVFLQHPSSNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQPQ 329



 Score = 30.8 bits (69), Expect = 2.4
 Identities = 31/142 (21%), Positives = 41/142 (28%), Gaps = 19/142 (13%)

Query: 34  QPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRP------- 86
           QP+ A  +             SS        P H     P     P   PP+P       
Sbjct: 245 QPQTASQQSPQPPAPSSRHPQSSHHGPGPPMP-HALQQGPVFLQHPSSNPPQPFGLAQSQ 303

Query: 87  HPNGRHPSHGKPQFNHKLGGVYHGPP--------PPPPPLSAAKPPPVQPEAMDKSGYGP 138
            P    PS  +P  +         P         PP P +   KPPP  P     +    
Sbjct: 304 VPPLPLPSQAQPHSHTPPSQSALQPQQPPREQPLPPAPSMPHIKPPPTTPIPQLPNQSHK 363

Query: 139 PPPPPLLAPAPDPWPNAVADTP 160
            PP       P P+P   ++ P
Sbjct: 364 HPPH---LQGPSPFPQMPSNLP 382



 Score = 30.8 bits (69), Expect = 2.5
 Identities = 37/158 (23%), Positives = 57/158 (36%), Gaps = 14/158 (8%)

Query: 8   STQLQITTGSQFAEKTE--VPNVSKVEE--QPKQAQNREDAESTDQVASETSSDQESQQS 63
           ST+ Q    +   E+ E      SK +E  +P      E     +  +S++ S  E   S
Sbjct: 79  STKRQREKPASDTEEPERVTAKKSKTQELSRPNSPSEGEGEGEGEGESSDSRSVNEEGSS 138

Query: 64  APKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPP--------PPP 115
            PK   +D+     +P +  P+ + +    S  +     +       PP        PPP
Sbjct: 139 DPKD--IDQDNRSSSPSIPSPQDNESDSDSSAQQQLLQPQGPPSIQVPPGAALAPSAPPP 196

Query: 116 PPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
            P + A PP   P A   +     P P  L  AP   P
Sbjct: 197 TPSAQAVPPQGSPIAAQPAPQPQQPSPLSLISAPSLHP 234


>gnl|CDD|223065 PHA03378, PHA03378, EBNA-3B; Provisional.
          Length = 991

 Score = 39.7 bits (92), Expect = 0.004
 Identities = 25/95 (26%), Positives = 26/95 (27%), Gaps = 7/95 (7%)

Query: 72  RPPPPPAPIVA--PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE 129
              PP AP      P        P    P              PP      A+PP   P 
Sbjct: 703 PMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPG 762

Query: 130 AMDKSGYGP----PPPPPLLAPAPDPWPNAVADTP 160
                   P    P PPP   PAP   P   A TP
Sbjct: 763 RARPPAAAPGAPTPQPPPQAPPAPQQRPRG-APTP 796



 Score = 37.4 bits (86), Expect = 0.022
 Identities = 22/87 (25%), Positives = 23/87 (26%), Gaps = 5/87 (5%)

Query: 74  PPPPAPIVA-PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PPP AP    PP   P                        PP      A+PP   P    
Sbjct: 696 PPPRAPTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRAR 755

Query: 133 KSGYGPPPPPPLL----APAPDPWPNA 155
                P    P      AP P P P A
Sbjct: 756 PPAAAPGRARPPAAAPGAPTPQPPPQA 782



 Score = 33.9 bits (77), Expect = 0.29
 Identities = 28/116 (24%), Positives = 38/116 (32%), Gaps = 13/116 (11%)

Query: 57  DQESQQSAPKHGYVDR--PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPP 114
           D ES + A      D+  P P   P+   P   P     +   P +      V H    P
Sbjct: 547 DIESDEPASTEPVHDQLLPAPGLGPLQIQPLTSPTTSQLASSAPSYAQTPWPVPHPSQTP 606

Query: 115 PPPLSAAKPP----------PVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
            PP + +  P          P++P  M      P     L+ P P   P  V  TP
Sbjct: 607 EPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPH-QPPQVEITP 661



 Score = 33.5 bits (76), Expect = 0.43
 Identities = 27/104 (25%), Positives = 29/104 (27%), Gaps = 12/104 (11%)

Query: 65  PKHGYVDRPPPPPAPIVA-PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP 123
           P        PP  AP  A PP   P    P    P                PP  +   P
Sbjct: 717 PAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAA--PGRARPPAAAPGAP 774

Query: 124 PPVQPEAMDKSGYGPPPP--PPLLAPAPDPWPNAVADTPKIVSL 165
            P  P         PP P   P  AP P P P A   T   +  
Sbjct: 775 TPQPPPQ------APPAPQQRPRGAPTPQPPPQAGP-TSMQLMP 811


>gnl|CDD|237605 PRK14086, dnaA, chromosomal replication initiation protein;
           Provisional.
          Length = 617

 Score = 39.4 bits (92), Expect = 0.005
 Identities = 22/99 (22%), Positives = 32/99 (32%), Gaps = 18/99 (18%)

Query: 73  PPPPPAPIVA-PPRPHPNGR-HPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA 130
           PPPP A   + P  P P  R +  +G P+ + +       PP  P         P  P  
Sbjct: 97  PPPPHARRTSEPELPRPGRRPYEGYGGPRADDR-------PPGLPRQDQLPTARPAYPAY 149

Query: 131 MDKSGYGPPPPPP---------LLAPAPDPWPNAVADTP 160
             +   G  P            L  P   P+ +  +  P
Sbjct: 150 QQRPEPGAWPRAADDYGWQQQRLGFPPRAPYASPASYAP 188



 Score = 36.3 bits (84), Expect = 0.053
 Identities = 18/129 (13%), Positives = 31/129 (24%), Gaps = 24/129 (18%)

Query: 20  AEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAP 79
           A +T  P + +   +P +      A+              ++ + P      +  P P  
Sbjct: 102 ARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPA----YQQRPEPGA 157

Query: 80  IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ-----PEAMDKS 134
                  +   +                  G PP  P  S A   P Q     P    + 
Sbjct: 158 WPRAADDYGWQQQR---------------LGFPPRAPYASPASYAPEQERDREPYDAGRP 202

Query: 135 GYGPPPPPP 143
            Y       
Sbjct: 203 EYDQRRRDY 211



 Score = 32.5 bits (74), Expect = 0.69
 Identities = 26/125 (20%), Positives = 31/125 (24%), Gaps = 18/125 (14%)

Query: 32  EEQPKQAQNREDAESTDQVASETSSDQES-QQSAPKHGYVDRPPPPPAPIVAPPR----- 85
           ++Q      R    S    A E   D+E      P++    R    P P    PR     
Sbjct: 168 QQQRLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRPRRDRTD 227

Query: 86  ---PHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPP 142
              P P   H   G        G         P         P    A      GP  P 
Sbjct: 228 RPEPPPGAGHVHRG--------GPGPPERDDAPVV-PIRPSAPGPLAAQPAPAPGPGEPT 278

Query: 143 PLLAP 147
             L P
Sbjct: 279 ARLNP 283



 Score = 29.4 bits (66), Expect = 6.3
 Identities = 18/60 (30%), Positives = 21/60 (35%), Gaps = 8/60 (13%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKS--GYGPP---PPPPLLAP---APDPWPNAVADTPK 161
           G P PPPP +     P  P    +   GYG P     PP L      P   P   A   +
Sbjct: 93  GEPAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQR 152


>gnl|CDD|237862 PRK14948, PRK14948, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 620

 Score = 39.2 bits (92), Expect = 0.006
 Identities = 22/87 (25%), Positives = 27/87 (31%), Gaps = 2/87 (2%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
              A    PP+  P    P+   P            PPPPPP  + A          D S
Sbjct: 518 SNTAKTPPPPQKSPP--PPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSS 575

Query: 135 GYGPPPPPPLLAPAPDPWPNAVADTPK 161
              P P  P  +P  D  P  +    K
Sbjct: 576 PPPPIPEEPTPSPTKDSSPEEIDKAAK 602



 Score = 37.2 bits (87), Expect = 0.024
 Identities = 22/97 (22%), Positives = 28/97 (28%), Gaps = 14/97 (14%)

Query: 59  ESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPL 118
           ESQ  +              P      P P    P   +P            P PPPPP 
Sbjct: 511 ESQSGSA------SNTAKTPPPPQKSPPPPA-PTPPLPQPTATA------PPPTPPPPPP 557

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAP-DPWPN 154
           +A +     P  +      PPP P    P+P      
Sbjct: 558 TATQASSNAPAQIPADSSPPPPIPEEPTPSPTKDSSP 594



 Score = 34.6 bits (80), Expect = 0.15
 Identities = 20/84 (23%), Positives = 29/84 (34%)

Query: 51  ASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHG 110
           AS T+      Q +P       P P P     PP P P     +        ++      
Sbjct: 517 ASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSSP 576

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKS 134
           PPP P   + +      PE +DK+
Sbjct: 577 PPPIPEEPTPSPTKDSSPEEIDKA 600



 Score = 30.3 bits (69), Expect = 3.1
 Identities = 19/89 (21%), Positives = 22/89 (24%), Gaps = 13/89 (14%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           P    + I     P      P+   P            P P  P     K     P    
Sbjct: 361 PSAFISEIANASAPANPTPAPNPSPP------------PAPIQPSAPKTKQAATTPSPPP 408

Query: 133 KSGYGPPPPPPL-LAPAPDPWPNAVADTP 160
                P P P     P+P P  NA    P
Sbjct: 409 AKASPPIPVPAEPTEPSPTPPANAANAPP 437



 Score = 29.5 bits (67), Expect = 5.1
 Identities = 17/77 (22%), Positives = 24/77 (31%), Gaps = 6/77 (7%)

Query: 49  QVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVY 108
           ++A+ ++    +    P        P  P    A   P P     S   P          
Sbjct: 367 EIANASAPANPTPAPNPSPPPAPIQPSAPKTKQAATTPSPPPAKASPPIPVPAEPT---- 422

Query: 109 HGPPPPPPPLSAAKPPP 125
              P P PP +AA  PP
Sbjct: 423 --EPSPTPPANAANAPP 437



 Score = 29.2 bits (66), Expect = 7.3
 Identities = 13/54 (24%), Positives = 18/54 (33%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVS 164
           P  P P  + + PP     +  K+      P P  A A  P P     T    +
Sbjct: 374 PANPTPAPNPSPPPAPIQPSAPKTKQAATTPSPPPAKASPPIPVPAEPTEPSPT 427



 Score = 29.2 bits (66), Expect = 8.7
 Identities = 17/52 (32%), Positives = 20/52 (38%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKI 162
           PP   PP  A  PP  QP A       PPPPP     + +      AD+   
Sbjct: 526 PPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSSPP 577


>gnl|CDD|236776 PRK10856, PRK10856, cytoskeletal protein RodZ; Provisional.
          Length = 331

 Score = 38.1 bits (89), Expect = 0.010
 Identities = 29/131 (22%), Positives = 35/131 (26%), Gaps = 13/131 (9%)

Query: 34  QPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHP 93
           Q  +AQ  E     DQ ++E S     Q           P   PAP         N + P
Sbjct: 134 QNHKAQQEEITTMADQSSAELS-QNSGQSVPLDTSTTTDPATTPAPAAPVDTTPTNSQTP 192

Query: 94  SHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           +                P P   P   A   P Q      +   P  P      AP P  
Sbjct: 193 AVATA------------PAPAVDPQQNAVVAPSQANVDTAATPAPAAPATPDGAAPLPTD 240

Query: 154 NAVADTPKIVS 164
            A   TP    
Sbjct: 241 QAGVSTPAADP 251


>gnl|CDD|223029 PHA03264, PHA03264, envelope glycoprotein D; Provisional.
          Length = 416

 Score = 38.1 bits (88), Expect = 0.011
 Identities = 26/90 (28%), Positives = 34/90 (37%), Gaps = 15/90 (16%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
           PPPAP    P P  + R  +  +P      G V       P   +  +    +P   D +
Sbjct: 266 PPPAPSGGSPAPPGDDRPEAKPEP------GPV---EDGAPGRETGGEGEGPEPAGRDGA 316

Query: 135 GYGPPPPPPLLAPAPDP-----WPNAVADT 159
             G P P P   PAPD      WP+  A T
Sbjct: 317 AGGEPKPGPP-RPAPDADRPEGWPSLEAIT 345


>gnl|CDD|236138 PRK07994, PRK07994, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 647

 Score = 37.9 bits (89), Expect = 0.014
 Identities = 24/138 (17%), Positives = 36/138 (26%), Gaps = 11/138 (7%)

Query: 35  PKQAQNREDA--ESTDQVASETSSDQESQQSAPKHGYVDRPPPPPA-PIVAPPRPHPNGR 91
           P       +   +S    AS  ++   +   AP       PPP  A              
Sbjct: 361 PAAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTS 420

Query: 92  HPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
                + Q     G           P +A++  PV       +   P P     APA   
Sbjct: 421 QLLAARQQLQRAQGAT---KAKKSEPAAASRARPVNSALERLASVRPAPSALEKAPAKKE 477

Query: 152 -----WPNAVADTPKIVS 164
                  N V    + V+
Sbjct: 478 AYRWKATNPVEVKKEPVA 495


>gnl|CDD|221818 pfam12868, DUF3824, Domain of unknown function (DUF3824).  This is
           a repeating domain found in fungal proteins. It is
           proline-rich, and the function is not known.
          Length = 135

 Score = 36.0 bits (83), Expect = 0.016
 Identities = 27/100 (27%), Positives = 38/100 (38%), Gaps = 15/100 (15%)

Query: 55  SSDQESQQSAPKH-GYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP 113
            +++E ++    H  Y D    P  P   PP P  +        P  N+           
Sbjct: 31  KAERERERYRHDHDAYSDSYEEPYDPTPYPPSPPVSD---PRYYPNSNYF---------- 77

Query: 114 PPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           PPPP S   PPP      + + Y PPPP  +  P   P+P
Sbjct: 78  PPPPGSTPVPPPGPQPGYNPADY-PPPPGAVPPPQNYPYP 116



 Score = 34.5 bits (79), Expect = 0.042
 Identities = 29/112 (25%), Positives = 42/112 (37%), Gaps = 10/112 (8%)

Query: 40  NREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQ 99
            R + +  ++       D ++   + +  Y D  P PP+P V+ PR +PN  +       
Sbjct: 25  QRRERKKAERERERYRHDHDAYSDSYEEPY-DPTPYPPSPPVSDPRYYPNSNY------- 76

Query: 100 FNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPP--LLAPAP 149
           F    G     PP P P  + A  PP          Y  PP P     AP P
Sbjct: 77  FPPPPGSTPVPPPGPQPGYNPADYPPPPGAVPPPQNYPYPPGPGQDPYAPRP 128


>gnl|CDD|215533 PLN02983, PLN02983, biotin carboxyl carrier protein of acetyl-CoA
           carboxylase.
          Length = 274

 Score = 37.1 bits (86), Expect = 0.016
 Identities = 24/80 (30%), Positives = 25/80 (31%), Gaps = 32/80 (40%)

Query: 74  PPPPAPIVA--PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM 131
           PPPPAP+V   PP PH                       PP  PP    A   P      
Sbjct: 144 PPPPAPVVMMQPPPPHAM---------------------PPASPPAAQPAPSAPASS--- 179

Query: 132 DKSGYGPPPPPPLLAPAPDP 151
                 PPP P    PA  P
Sbjct: 180 ------PPPTPASPPPAKAP 193



 Score = 32.5 bits (74), Expect = 0.56
 Identities = 18/67 (26%), Positives = 21/67 (31%), Gaps = 17/67 (25%)

Query: 61  QQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSA 120
            Q  P    V   PPPP   + P  P      PS                 P   PP + 
Sbjct: 142 PQPPPPAPVVMMQPPPPHA-MPPASPPAAQPAPS----------------APASSPPPTP 184

Query: 121 AKPPPVQ 127
           A PPP +
Sbjct: 185 ASPPPAK 191



 Score = 30.6 bits (69), Expect = 1.9
 Identities = 16/50 (32%), Positives = 18/50 (36%), Gaps = 7/50 (14%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           P PPPP       PP            PP  PP   PAP    ++   TP
Sbjct: 142 PQPPPPAPVVMMQPPPPHAM-------PPASPPAAQPAPSAPASSPPPTP 184



 Score = 30.2 bits (68), Expect = 3.1
 Identities = 16/45 (35%), Positives = 16/45 (35%), Gaps = 4/45 (8%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
             PPP P   PP   P   HP    P      G  Y  P P  PP
Sbjct: 178 SSPPPTPASPPPAKAPKSSHPPLKSPM----AGTFYRSPAPGEPP 218


>gnl|CDD|184927 PRK14963, PRK14963, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 504

 Score = 37.5 bits (87), Expect = 0.017
 Identities = 24/108 (22%), Positives = 31/108 (28%), Gaps = 7/108 (6%)

Query: 55  SSDQESQQSAPKHGY--VDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPP 112
            SD  S + A  H    +   P      VAPP P P     +    + N     V     
Sbjct: 325 RSDALSLELALLHALLALGGAPSEGVAAVAPPAPAP-----ADLTQRLNRLEKEVRSLRS 379

Query: 113 PPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
            P    +AA  P    +   +    P P     AP       A A   
Sbjct: 380 APTAAATAAGAPLPDFDPRPRGPPAPEPARSAEAPPLVAPAAAPAGLA 427


>gnl|CDD|237866 PRK14952, PRK14952, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 584

 Score = 37.6 bits (87), Expect = 0.019
 Identities = 21/55 (38%), Positives = 23/55 (41%), Gaps = 11/55 (20%)

Query: 105 GGVYHGPPPPPPPLSAAKP-PPVQPEAMDKSGYGPPPPPPLLAPAPDPW-PNAVA 157
             + H  P   P  SAA P P  QP          P P P+LAP P    PNA A
Sbjct: 389 ANLLHNAPQAAPAPSAAAPEPKHQP---------APEPRPVLAPTPASGEPNAAA 434



 Score = 35.2 bits (81), Expect = 0.11
 Identities = 18/70 (25%), Positives = 22/70 (31%), Gaps = 21/70 (30%)

Query: 61  QQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSA 120
             S P +   + P   PAP  A P P                      H P P P P+ A
Sbjct: 384 DMSIPANLLHNAPQAAPAPSAAAPEPK---------------------HQPAPEPRPVLA 422

Query: 121 AKPPPVQPEA 130
             P   +P A
Sbjct: 423 PTPASGEPNA 432


>gnl|CDD|166942 PRK00404, tatB, sec-independent translocase; Provisional.
          Length = 141

 Score = 35.6 bits (82), Expect = 0.019
 Identities = 18/56 (32%), Positives = 19/56 (33%), Gaps = 5/56 (8%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQP 128
           PP PP P V PP        P+   P             P    P  AA PPP  P
Sbjct: 84  PPAPPEP-VTPPTAQS----PAPAVPTPPPTSTPAVPPAPAAAVPAPAAAPPPSDP 134



 Score = 32.5 bits (74), Expect = 0.26
 Identities = 18/50 (36%), Positives = 21/50 (42%), Gaps = 4/50 (8%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           PP PP P++   PP  Q  A       PP   P + PAP     A A  P
Sbjct: 84  PPAPPEPVT---PPTAQSPA-PAVPTPPPTSTPAVPPAPAAAVPAPAAAP 129



 Score = 30.6 bits (69), Expect = 0.94
 Identities = 18/50 (36%), Positives = 20/50 (40%), Gaps = 1/50 (2%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           PP P  P +A  P P  P     S    PP P    PAP   P   +D P
Sbjct: 87  PPEPVTPPTAQSPAPAVPTPPPTSTPAVPPAPAAAVPAPAAAP-PPSDPP 135



 Score = 30.6 bits (69), Expect = 0.98
 Identities = 18/72 (25%), Positives = 22/72 (30%), Gaps = 10/72 (13%)

Query: 78  APIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYG 137
           AP+  P  P P     +                 P PPP  + A PP         +   
Sbjct: 80  APLTPPAPPEPVTPPTAQSPAP----------AVPTPPPTSTPAVPPAPAAAVPAPAAAP 129

Query: 138 PPPPPPLLAPAP 149
           PP  PP    AP
Sbjct: 130 PPSDPPQPPRAP 141



 Score = 30.2 bits (68), Expect = 1.3
 Identities = 14/56 (25%), Positives = 17/56 (30%), Gaps = 4/56 (7%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPP----PLLAPAPDPWPNAVADTPK 161
            P PP P        P            P  PP     + APA  P P+     P+
Sbjct: 84  PPAPPEPVTPPTAQSPAPAVPTPPPTSTPAVPPAPAAAVPAPAAAPPPSDPPQPPR 139


>gnl|CDD|227505 COG5178, PRP8, U5 snRNP spliceosome subunit [RNA processing and
           modification].
          Length = 2365

 Score = 37.7 bits (87), Expect = 0.024
 Identities = 22/79 (27%), Positives = 29/79 (36%), Gaps = 18/79 (22%)

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIV 163
           +  +  G PPPPPP     PP          G+ PP  PP   P P P  N    + K +
Sbjct: 1   MASLPPGNPPPPPP-----PP----------GFEPPSQPP---PPPPPGVNVKKRSRKQL 42

Query: 164 SLDVKCEKNSMKVFISFDK 182
           S+      +S     S   
Sbjct: 43  SIVGDILGHSGNPIYSLRV 61



 Score = 31.5 bits (71), Expect = 1.6
 Identities = 17/57 (29%), Positives = 20/57 (35%), Gaps = 23/57 (40%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE 129
           PPPPP P   PP            +P            PPPPPP ++  K    Q  
Sbjct: 10  PPPPPPPGFEPP-----------SQP------------PPPPPPGVNVKKRSRKQLS 43


>gnl|CDD|223003 PHA03169, PHA03169, hypothetical protein; Provisional.
          Length = 413

 Score = 36.5 bits (84), Expect = 0.033
 Identities = 37/163 (22%), Positives = 52/163 (31%), Gaps = 18/163 (11%)

Query: 7   VSTQLQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSD--------- 57
           V+ Q    T S      E    S+  E+ ++ Q       ++ V S T S          
Sbjct: 61  VAEQGHRQTESDTETAEE----SRHGEKEERGQGGPSGSGSESVGSPTPSPSGSAEELAS 116

Query: 58  ----QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYH-GPP 112
               + +  S+P+      PPP P     P  P P   H      Q +  L   +   P 
Sbjct: 117 GLSPENTSGSSPESPASHSPPPSPPSHPGPHEPAPPESHNPSPNQQPSSFLQPSHEDSPE 176

Query: 113 PPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
            P PP S  +P    P   +     PPP  P   P     P  
Sbjct: 177 EPEPPTSEPEPDSPGPPQSETPTSSPPPQSPPDEPGEPQSPTP 219


>gnl|CDD|223066 PHA03379, PHA03379, EBNA-3A; Provisional.
          Length = 935

 Score = 37.0 bits (85), Expect = 0.034
 Identities = 24/104 (23%), Positives = 34/104 (32%), Gaps = 13/104 (12%)

Query: 59  ESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPL 118
           +S ++A  HG    P PPP   + P   H           Q +     V   PP P   L
Sbjct: 429 QSLETATSHGSAQVPEPPPVHDLEPGPLHD----------QHSMAPCPVAQLPPGPLQDL 478

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKI 162
               P    P  +        P P    P   PW  +++  P +
Sbjct: 479 E---PGDQLPGVVQDGRPACAPVPAPAGPIVRPWEASLSQVPGV 519


>gnl|CDD|114709 pfam06003, SMN, Survival motor neuron protein (SMN).  This family
           consists of several eukaryotic survival motor neuron
           (SMN) proteins. The Survival of Motor Neurons (SMN)
           protein, the product of the spinal muscular
           atrophy-determining gene, is part of a large
           macromolecular complex (SMN complex) that functions in
           the assembly of spliceosomal small nuclear
           ribonucleoproteins (snRNPs). The SMN complex functions
           as a specificity factor essential for the efficient
           assembly of Sm proteins on U snRNAs and likely protects
           cells from illicit, and potentially deleterious,
           non-specific binding of Sm proteins to RNAs.
          Length = 264

 Score = 36.1 bits (83), Expect = 0.037
 Identities = 36/130 (27%), Positives = 46/130 (35%), Gaps = 32/130 (24%)

Query: 26  PNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPP--PPAPIVAP 83
           P     E+  K+A   E   STD+  S+ SS    +  +  +  +  P P  P  P   P
Sbjct: 121 PPPDVDEDALKEANVNETESSTDE--SDRSS-HSHEVRSKSNFPMGPPSPWNPRFPPGPP 177

Query: 84  PRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPP---P 140
           P P   GRH                  P   PP LS   PPP           GPP   P
Sbjct: 178 PPPPGFGRHGE---------------KPSGWPPFLSGW-PPPFPL--------GPPMIPP 213

Query: 141 PPPLLAPAPD 150
           PPP+     +
Sbjct: 214 PPPMSPDFGE 223



 Score = 30.7 bits (69), Expect = 2.0
 Identities = 31/112 (27%), Positives = 38/112 (33%), Gaps = 33/112 (29%)

Query: 42  EDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFN 101
           EDA     V    SS  ES +S+  H    +   P  P             PS   P+F 
Sbjct: 127 EDALKEANVNETESSTDESDRSSHSHEVRSKSNFPMGP-------------PSPWNPRF- 172

Query: 102 HKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
                    PP PPPP          P    + G  P   PP L+  P P+P
Sbjct: 173 ---------PPGPPPP----------PPGFGRHGEKPSGWPPFLSGWPPPFP 205


>gnl|CDD|219805 pfam08347, CTNNB1_binding, N-terminal CTNNB1 binding.  This region
           tends to appear at the N-terminus of proteins also
           containing DNA-binding HMG (high mobility group) boxes
           (pfam00505) and appears to bind the armadillo repeat of
           CTNNB1 (beta-catenin), forming a stable complex.
           Signaling by Wnt through TCF/LCF is involved in
           developmental patterning, induction of neural tissues,
           cell fate decisions and stem cell differentiation.
           Isoforms of HMG T-cell factors lacking the N-terminal
           CTNNB1-binding domain cannot fulfill their role as
           transcriptional activators in T-cell differentiation.
          Length = 200

 Score = 35.6 bits (82), Expect = 0.037
 Identities = 29/105 (27%), Positives = 35/105 (33%), Gaps = 9/105 (8%)

Query: 65  PKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPL------ 118
              GY   PP    P +  P P+      S   P  N            P  PL      
Sbjct: 91  QDMGYYKGPPYSGYPFLMLPNPYLPNGSLSPLPPSSNKVPVVQPPHHVHPLTPLITYSNE 150

Query: 119 --SAAKPPPVQPEAMD-KSGYGPPPPPPLLAPAPDPWPNAVADTP 160
             S   PPP  P  +D K+G   PP PP ++P     P  V   P
Sbjct: 151 HFSPGTPPPHLPYDVDPKTGIPRPPHPPDISPFYPLSPGGVGQIP 195


>gnl|CDD|223033 PHA03291, PHA03291, envelope glycoprotein I; Provisional.
          Length = 401

 Score = 36.5 bits (84), Expect = 0.038
 Identities = 24/101 (23%), Positives = 29/101 (28%), Gaps = 19/101 (18%)

Query: 55  SSDQESQQSAPKHG----YVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHG 110
           S D     SAP+ G    +V   P P     A P   P    PS                
Sbjct: 185 SCDPALPLSAPRLGPADVFVPATPRPTPRTTASPETTPT---PSTTTS------------ 229

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
           PP    P  +      Q     ++   P PP P    AP  
Sbjct: 230 PPSTTIPAPSTTIAAPQAGTTPEAEGTPAPPTPGGGEAPPA 270



 Score = 29.5 bits (66), Expect = 5.1
 Identities = 16/62 (25%), Positives = 20/62 (32%), Gaps = 2/62 (3%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
           P P+   +PP         +   PQ          G P PP P     PP     A + S
Sbjct: 222 PTPSTTTSPPSTTIPAPSTTIAAPQAGTT--PEAEGTPAPPTPGGGEAPPANATPAPEAS 279

Query: 135 GY 136
            Y
Sbjct: 280 RY 281


>gnl|CDD|177871 PLN02226, PLN02226, 2-oxoglutarate dehydrogenase E2 component.
          Length = 463

 Score = 36.3 bits (83), Expect = 0.040
 Identities = 21/89 (23%), Positives = 34/89 (38%), Gaps = 10/89 (11%)

Query: 45  ESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKL 104
           E   +VA  + S+  + Q  P     +   P P+P     +       P   KP+     
Sbjct: 157 EPGTKVAIISKSEDAASQVTPSQKIPETTDPKPSPPAEDKQKPKVESAPVAEKPK----- 211

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDK 133
                 P  PPPP  +AK P + P+  ++
Sbjct: 212 -----APSSPPPPKQSAKEPQLPPKERER 235



 Score = 29.7 bits (66), Expect = 4.9
 Identities = 12/40 (30%), Positives = 17/40 (42%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
           P P PP    + P V+   + +    P  PPP    A +P
Sbjct: 187 PKPSPPAEDKQKPKVESAPVAEKPKAPSSPPPPKQSAKEP 226


>gnl|CDD|237538 PRK13876, PRK13876, conjugal transfer coupling protein TraG;
           Provisional.
          Length = 663

 Score = 36.4 bits (85), Expect = 0.047
 Identities = 14/38 (36%), Positives = 17/38 (44%), Gaps = 2/38 (5%)

Query: 114 PPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
           PPP L+    PP +P+  D SG   P  P   A A   
Sbjct: 555 PPPDLAPPAGPPARPD--DWSGLPIPAVPAPAAAAAAD 590


>gnl|CDD|235904 PRK06995, flhF, flagellar biosynthesis regulator FlhF; Validated.
          Length = 484

 Score = 36.1 bits (84), Expect = 0.048
 Identities = 23/142 (16%), Positives = 37/142 (26%), Gaps = 23/142 (16%)

Query: 43  DAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAP-IVAPPRPHPNGRHPSHGKPQFN 101
            A +    A+  ++      +         P   PAP +V   +     R     +    
Sbjct: 49  AALAPPAAAAPAAAQPPPAAAPAAVSRPAAPAAEPAPWLVEHAKRLTAQREQLVARAA-- 106

Query: 102 HKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADT-- 159
                    P  P     AA       E   +            AP P    +A A    
Sbjct: 107 --------APAAPEAQAPAAPAERAAAENAAR----RLARAAAAAPRPRVPADAAAAVAD 154

Query: 160 ------PKIVSLDVKCEKNSMK 175
                  +IV+  V  E  S++
Sbjct: 155 AVKARIERIVNDTVMQELRSLR 176


>gnl|CDD|235906 PRK07003, PRK07003, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 830

 Score = 36.4 bits (84), Expect = 0.050
 Identities = 29/133 (21%), Positives = 41/133 (30%), Gaps = 27/133 (20%)

Query: 55  SSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP- 113
           SSD+ ++ +A       +P   PA    P  P    + P+                    
Sbjct: 564 SSDRGARAAAAA-----KPAAAPAAAPKPAAPRVAVQVPTPRARAATGDAPPNGAARAEQ 618

Query: 114 ------PPPPLSAAKPPPVQPEAMDKSGYGPP-----------PPPPLLAPAPDPWPNAV 156
                  PPP     P    P + D+ G+G P           P    +AP P   P   
Sbjct: 619 AAESRGAPPPWEDIPPDDYVPLSADE-GFGGPDDGFVPVFDSGPDDVRVAPKPADAPAPP 677

Query: 157 ADT---PKIVSLD 166
            DT   P  + LD
Sbjct: 678 VDTRPLPPAIPLD 690



 Score = 34.4 bits (79), Expect = 0.19
 Identities = 17/100 (17%), Positives = 24/100 (24%), Gaps = 11/100 (11%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP--PPPPLS-------AAK 122
           R          P      G   +   P+            PP  P PP +       A  
Sbjct: 386 RAAAAVGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATADRGDDAADG 445

Query: 123 PPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA--DTP 160
             PV  +A  ++            P  D    +    D P
Sbjct: 446 DAPVPAKANARASADSRCDERDAQPPADSGSASAPASDAP 485



 Score = 31.7 bits (72), Expect = 1.1
 Identities = 15/97 (15%), Positives = 19/97 (19%)

Query: 59  ESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPL 118
            +                 AP  A          P            G        P P 
Sbjct: 392 GASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATADRGDDAADGDAPVPA 451

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
            A           ++    P       APA D  P+A
Sbjct: 452 KANARASADSRCDERDAQPPADSGSASAPASDAPPDA 488



 Score = 31.7 bits (72), Expect = 1.3
 Identities = 21/139 (15%), Positives = 29/139 (20%), Gaps = 24/139 (17%)

Query: 24  EVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAP 83
                      P    +R D  +         ++  +   +       +PP       AP
Sbjct: 421 TRAEAPPAAPAPPATADRGDDAADGDAPVPAKANARASADSRCDERDAQPPADSGSASAP 480

Query: 84  PRPHPNGRHPSHGKPQFNHKLGGVYHGPPPP--PPPLSAAKPPPVQPEAMDKSGYGPPPP 141
               P                      P     P P +AA          D         
Sbjct: 481 ASDAP----------------------PDAAFEPAPRAAAPSAATPAAVPDARAPAAASR 518

Query: 142 PPLLAPAPDPWPNAVADTP 160
               A A  P P A   TP
Sbjct: 519 EDAPAAAAPPAPEARPPTP 537



 Score = 29.4 bits (66), Expect = 7.1
 Identities = 12/53 (22%), Positives = 14/53 (26%), Gaps = 1/53 (1%)

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
           GG   G  P     +   P      A+  S   P       A      P A A
Sbjct: 365 GGAPGGGVPARVAGAVPAPGARAAAAVGASA-VPAVTAVTGAAGAALAPKAAA 416


>gnl|CDD|165468 PHA03201, PHA03201, uracil DNA glycosylase; Provisional.
          Length = 318

 Score = 35.6 bits (82), Expect = 0.051
 Identities = 25/92 (27%), Positives = 30/92 (32%), Gaps = 17/92 (18%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPP---VQP 128
           R P PP     P  P P         P+             PP PP   A+PPP     P
Sbjct: 7   RSPSPPRR---PSPPRPTPPRSPDASPE-----------ETPPSPPGPGAEPPPGRAAGP 52

Query: 129 EAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
            A  +   G P      + AP   P  + D P
Sbjct: 53  AAPRRRPRGCPAGVTFSSSAPPRPPLGLDDAP 84



 Score = 32.6 bits (74), Expect = 0.56
 Identities = 18/64 (28%), Positives = 24/64 (37%), Gaps = 1/64 (1%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PP PP P   PP     G      +P+     G  +    PP PPL     P   P  +D
Sbjct: 34  PPSPPGPGAEPPPGRAAGPAAPRRRPR-GCPAGVTFSSSAPPRPPLGLDDAPAATPPPLD 92

Query: 133 KSGY 136
            + +
Sbjct: 93  WTEF 96


>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
          Length = 1355

 Score = 36.2 bits (83), Expect = 0.053
 Identities = 15/43 (34%), Positives = 23/43 (53%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           P    PP+++   PP QP    +   GP    P++APAP+ +P
Sbjct: 339 PVTQTPPVASVDVPPAQPTVAWQPVPGPQTGEPVIAPAPEGYP 381



 Score = 35.8 bits (82), Expect = 0.083
 Identities = 32/133 (24%), Positives = 47/133 (35%), Gaps = 15/133 (11%)

Query: 10  QLQITTGSQFAEKTEVPNVSKVEEQPKQ--AQNREDAESTDQVASETSSDQESQQSAPKH 67
           Q  +    Q+ +  +        +QP+Q  A   +  +    VA +    Q  Q  AP+ 
Sbjct: 756 QQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQP 815

Query: 68  GYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFN--HKL------GGVYHGPPPPPPPLS 119
            Y       P   VAP   +   + P   +PQ    H L          H P  P P L 
Sbjct: 816 QY-----QQPQQPVAPQPQYQQPQQPVAPQPQDTLLHPLLMRNGDSRPLHKPTTPLPSLD 870

Query: 120 AAKPPPVQPEAMD 132
              PPP + E +D
Sbjct: 871 LLTPPPSEVEPVD 883



 Score = 33.9 bits (77), Expect = 0.31
 Identities = 25/134 (18%), Positives = 41/134 (30%), Gaps = 4/134 (2%)

Query: 12  QITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQ--VASETSSDQESQQSAPKHGY 69
           Q   G Q  E    P      +Q + AQ         Q  V  +      + +   +  Y
Sbjct: 361 QPVPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPY 420

Query: 70  VDRPPPPPA--PIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ 127
               P  PA  P  AP    P   +    + Q +       +         +A +P   Q
Sbjct: 421 YAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQ 480

Query: 128 PEAMDKSGYGPPPP 141
           P+ +++     P P
Sbjct: 481 PQPVEQQPVVEPEP 494



 Score = 33.5 bits (76), Expect = 0.36
 Identities = 20/93 (21%), Positives = 32/93 (34%), Gaps = 1/93 (1%)

Query: 26  PNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPR 85
           P V  V++  +    ++  +   Q  +     Q+ QQ         +P  P AP     +
Sbjct: 747 PIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQ 806

Query: 86  PH-PNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
           P  P    P + +PQ        Y  P  P  P
Sbjct: 807 PQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP 839



 Score = 32.4 bits (73), Expect = 0.88
 Identities = 30/143 (20%), Positives = 47/143 (32%), Gaps = 18/143 (12%)

Query: 20  AEKTEVPNVSKVEEQP--KQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPP 77
            +  + P  +   EQP    A   E+ +ST    S   ++Q  QQ A +     +P P  
Sbjct: 426 EQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQPVE 485

Query: 78  APIVAPPRPHPNGRHPS-----------HGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPV 126
              V  P P      P+             + +   +L   Y   P P       +P P+
Sbjct: 486 QQPVVEPEPVVEETKPARPPLYYFEEVEEKRAREREQLAAWYQPIPEP-----VKEPEPI 540

Query: 127 QPEAMDKSGYGPPPPPPLLAPAP 149
           +      S    PP     A +P
Sbjct: 541 KSSLKAPSVAAVPPVEAAAAVSP 563



 Score = 31.2 bits (70), Expect = 2.0
 Identities = 24/152 (15%), Positives = 39/152 (25%), Gaps = 5/152 (3%)

Query: 31  VEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNG 90
           V  QP       +              Q +Q +   +  + +P  P  P  AP    P  
Sbjct: 358 VAWQPVPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQ 417

Query: 91  RHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPP-----VQPEAMDKSGYGPPPPPPLL 145
           +      P+   +       P  P    +            Q     +  Y  P     L
Sbjct: 418 QPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPL 477

Query: 146 APAPDPWPNAVADTPKIVSLDVKCEKNSMKVF 177
              P P        P+ V  + K  +  +  F
Sbjct: 478 YQQPQPVEQQPVVEPEPVVEETKPARPPLYYF 509


>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421).  This
           family represents a conserved region approximately 350
           residues long within a number of plant proteins of
           unknown function.
          Length = 357

 Score = 35.7 bits (82), Expect = 0.054
 Identities = 33/133 (24%), Positives = 45/133 (33%), Gaps = 16/133 (12%)

Query: 28  VSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRP- 86
           +S  E Q  +A +    +ST Q  +     +     AP        PP PAP    P   
Sbjct: 29  LSHEEAQSSEAHSFH-VDSTKQPPAPEQVAKHELADAPLQQVNAALPPAPAPQSPQPDQQ 87

Query: 87  -----HPNGRHPSHGKPQFNHKLGGVYHGPPPP--PPPLSAAKPPPVQPEAMDKSGYGPP 139
                 P+ ++PS   PQ    +         P  PPP     PP  QP+A         
Sbjct: 88  QQSQAPPSHQYPSQLPPQQVQSVPQQPTPQQEPYYPPPSQPQPPPAQQPQA-------QQ 140

Query: 140 PPPPLLAPAPDPW 152
           P PP   P    +
Sbjct: 141 PQPPPQVPQQQQY 153



 Score = 35.7 bits (82), Expect = 0.060
 Identities = 28/136 (20%), Positives = 45/136 (33%), Gaps = 3/136 (2%)

Query: 18  QFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPP 77
           Q     +VP   + +  P+Q Q +++     Q A + S     +       Y   PP  P
Sbjct: 140 QPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSGLYPEESPYQPQSY---PPNEP 196

Query: 78  APIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYG 137
            P     +P  +G  PS            +Y GP   P     +   P   +  +  GY 
Sbjct: 197 LPSSMAMQPPYSGAPPSQQFYGPPQPSPYMYGGPGGRPNSGFPSGQQPPPSQGQEGYGYS 256

Query: 138 PPPPPPLLAPAPDPWP 153
            PPP      +   + 
Sbjct: 257 GPPPSKGNHGSVASYA 272



 Score = 35.3 bits (81), Expect = 0.080
 Identities = 32/135 (23%), Positives = 43/135 (31%), Gaps = 7/135 (5%)

Query: 26  PNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQ-----SAPKHGYVDRPPPPPAPI 80
           P   +   Q    Q         Q     +   ++QQ       P+      PP  P   
Sbjct: 104 PQQVQSVPQQPTPQQEPYYPPPSQPQPPPAQQPQAQQPQPPPQVPQQQQYQSPPQQPQYQ 163

Query: 81  VAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPP 140
             PP    +    S   P+ +      Y  PP  P P S A  PP       +  YGPP 
Sbjct: 164 QNPPPQAQSAPQVSGLYPEESPYQPQSY--PPNEPLPSSMAMQPPYSGAPPSQQFYGPPQ 221

Query: 141 PPPLLAPAPDPWPNA 155
           P P +   P   PN+
Sbjct: 222 PSPYMYGGPGGRPNS 236



 Score = 33.4 bits (76), Expect = 0.31
 Identities = 25/124 (20%), Positives = 39/124 (31%), Gaps = 7/124 (5%)

Query: 42  EDAESTDQVASETSSDQESQQSAPKHGYVDRP-PPPPAPIVAPPRPHPNGRHPSHGKPQF 100
           E+A+S++  +    S ++         +     P        PP P P    P     Q 
Sbjct: 32  EEAQSSEAHSFHVDSTKQPPAPEQVAKHELADAPLQQVNAALPPAPAPQSPQP---DQQQ 88

Query: 101 NHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
             +    +  P   PP    + P    P+      Y PPP  P   PA  P        P
Sbjct: 89  QSQAPPSHQYPSQLPPQQVQSVPQQPTPQQEP---YYPPPSQPQPPPAQQPQAQQPQPPP 145

Query: 161 KIVS 164
           ++  
Sbjct: 146 QVPQ 149


>gnl|CDD|197891 smart00818, Amelogenin, Amelogenins, cell adhesion proteins, play a
           role in the biomineralisation of teeth.  They seem to
           regulate formation of crystallites during the secretory
           stage of tooth enamel development and are thought to
           play a major role in the structural organisation and
           mineralisation of developing enamel. The extracellular
           matrix of the developing enamel comprises two major
           classes of protein: the hydrophobic amelogenins and the
           acidic enamelins. Circular dichroism studies of porcine
           amelogenin have shown that the protein consists of 3
           discrete folding units: the N-terminal region appears to
           contain beta-strand structures, while the C-terminal
           region displays characteristics of a random coil
           conformation. Subsequent studies on the bovine protein
           have indicated the amelogenin structure to contain a
           repetitive beta-turn segment and a "beta-spiral" between
           Gln112 and Leu138, which sequester a (Pro, Leu, Gln)
           rich region. The beta-spiral offers a probable site for
           interactions with Ca2+ ions. Mutations in the human
           amelogenin gene (AMGX) cause X-linked hypoplastic
           amelogenesis imperfecta, a disease characterised by
           defective enamel. A 9bp deletion in exon 2 of AMGX
           results in the loss of codons for Ile5, Leu6, Phe7 and
           Ala8, and replacement by a new threonine codon,
           disrupting the 16-residue (Met1-Ala16) amelogenin signal
           peptide.
          Length = 165

 Score = 34.8 bits (80), Expect = 0.059
 Identities = 23/89 (25%), Positives = 31/89 (34%), Gaps = 8/89 (8%)

Query: 73  PPPPPAPIVAPPRPH---PNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP----PP 125
           P  P  P++  P  H   P   H  +               PP P  P+    P    PP
Sbjct: 66  PVVPQQPLMPVPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPPQPQQPMQPQPPVHPIPP 125

Query: 126 VQPEAMDKSGYGPPPPPPLLAPAP-DPWP 153
           + P+      +   P PPLL   P + WP
Sbjct: 126 LPPQPPLPPMFPMQPLPPLLPDLPLEAWP 154


>gnl|CDD|220840 pfam10667, DUF2486, Protein of unknown function (DUF2486).  This
           family is made up of members from various Burkholderia
           spp. The function is unknown.
          Length = 245

 Score = 34.9 bits (80), Expect = 0.086
 Identities = 22/125 (17%), Positives = 27/125 (21%), Gaps = 15/125 (12%)

Query: 44  AESTDQVASETSSDQESQQSAPKHGYVDRP-PPPPAPIVAPP-------RPHPNGRHPSH 95
               +Q AS        + +A        P P P  P VA P        P       + 
Sbjct: 48  VPGAEQAASAAPVHAAREATADPEFVAVEPVPTPHVPAVALPGDTDAPAEPGAAPHVVAE 107

Query: 96  GKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
                   L        P  PP  A                    PP     +P     A
Sbjct: 108 RAAAMQAPLPSALAADDPQAPPAGATAADAGDAAP-------DATPPAAGDASPPAAAQA 160

Query: 156 VADTP 160
            A   
Sbjct: 161 AASAA 165


>gnl|CDD|234938 PRK01297, PRK01297, ATP-dependent RNA helicase RhlB; Provisional.
          Length = 475

 Score = 35.3 bits (81), Expect = 0.088
 Identities = 17/75 (22%), Positives = 21/75 (28%), Gaps = 15/75 (20%)

Query: 68  GYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ 127
           G  +   P PAP      P      P   K             P       +AA P   +
Sbjct: 10  GKGEAEQPAPAPPSPAAAP----APPPPAKTA----------APATKAAAPAAAAPRAEK 55

Query: 128 PEAMDKSGYGPPPPP 142
           P+  DK      P P
Sbjct: 56  PKK-DKPRRERKPKP 69



 Score = 32.6 bits (74), Expect = 0.57
 Identities = 18/86 (20%), Positives = 25/86 (29%), Gaps = 11/86 (12%)

Query: 56  SDQESQQSAP--KHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP 113
            + E    AP         PPP      A     P    P   KP+ +            
Sbjct: 12  GEAEQPAPAPPSPAAAPAPPPPAKTAAPATKAAAPAAAAPRAEKPKKDKPR------RER 65

Query: 114 PPPPLSAAKPPP--VQPEAMDKSGYG 137
            P P S  K     V+P+   K+ + 
Sbjct: 66  KPKPASLWKLEDFVVEPQEG-KTRFH 90



 Score = 32.2 bits (73), Expect = 0.86
 Identities = 14/51 (27%), Positives = 15/51 (29%), Gaps = 9/51 (17%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
            P P PP  AA P P  P              P    A        A+ PK
Sbjct: 16  QPAPAPPSPAAAPAPPPP---------AKTAAPATKAAAPAAAAPRAEKPK 57



 Score = 29.9 bits (67), Expect = 4.0
 Identities = 13/41 (31%), Positives = 15/41 (36%), Gaps = 1/41 (2%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLL-APAPD 150
           P PP P  + A PPP +  A       P    P    P  D
Sbjct: 19  PAPPSPAAAPAPPPPAKTAAPATKAAAPAAAAPRAEKPKKD 59



 Score = 29.1 bits (65), Expect = 8.1
 Identities = 16/83 (19%), Positives = 18/83 (21%), Gaps = 27/83 (32%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM 131
             P PP+P  AP  P P                          P     A  P       
Sbjct: 17  PAPAPPSPAAAPAPPPPA---------------------KTAAPAT--KAAAPAAAAPRA 53

Query: 132 DKSGYGPPPPPPLLAPAPDPWPN 154
           +K    P    P     P P   
Sbjct: 54  EK----PKKDKPRRERKPKPASL 72


>gnl|CDD|221170 pfam11696, DUF3292, Protein of unknown function (DUF3292).  This
           eukaryotic family of proteins has no known function.
          Length = 641

 Score = 35.1 bits (81), Expect = 0.10
 Identities = 25/106 (23%), Positives = 36/106 (33%), Gaps = 5/106 (4%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEK 171
           P PPPP S+ +  P  P ++D            +  A  P P A          + K E 
Sbjct: 404 PLPPPPSSSLRKAPSSPASIDHKQLNLGASEEEIDQAAAPEPEAPV-----AEAEEKAEP 458

Query: 172 NSMKVFISFDKPFFGIVFSKGHYSNVNCVHLPAGLGRTSANFEIGI 217
              K + S    FF    + G  S +      A  G   A   +G+
Sbjct: 459 PKKKGWGSRILGFFKGTTATGIESKLAVDRARAAAGSEHAKNRLGV 504


>gnl|CDD|220950 pfam11029, DAZAP2, DAZ associated protein 2 (DAZAP2).  DAZ
           associated protein 2 has a highly conserved sequence
           throughout evolution including a conserved polyproline
           region and several SH2/SH3 binding sites. It occurs as a
           single copy gene with a four-exon organisation and is
           located on chromosome 12. It encodes a ubiquitously
           expressed protein and binds to DAZ and DAZL1 through DAZ
           repeats.
          Length = 136

 Score = 33.3 bits (76), Expect = 0.12
 Identities = 24/94 (25%), Positives = 29/94 (30%), Gaps = 14/94 (14%)

Query: 60  SQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLS 119
                 +      P P P+  +   +    G   SH    +          PPP  PP S
Sbjct: 17  VVPPQAQMPQASAPYPGPSMYLPMAQVMAVGPQSSHPPMAYYP-----IGAPPPVYPPGS 71

Query: 120 AAKPPPVQ----PEAMDKSGYGP--PPPPPLLAP 147
                 VQ      A    G G   PPPPP  AP
Sbjct: 72  T---VLVQGGFDAGARFGPGTGSSIPPPPPGCAP 102



 Score = 27.9 bits (62), Expect = 7.8
 Identities = 12/44 (27%), Positives = 14/44 (31%), Gaps = 2/44 (4%)

Query: 112 PPPPPPLSAAKPPP--VQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           P  PP  S    P   V P+A       P P P +  P      
Sbjct: 2   PDAPPAYSELYQPSYVVPPQAQMPQASAPYPGPSMYLPMAQVMA 45


>gnl|CDD|225689 COG3147, DedD, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 226

 Score = 34.1 bits (78), Expect = 0.14
 Identities = 21/91 (23%), Positives = 27/91 (29%), Gaps = 7/91 (7%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGK-------PQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ 127
           P    +VA P   P G                    +        P   P++A  P PV+
Sbjct: 58  PAVVQVVALPTQPPEGVAQEIQDAGDAAAASVDPQPVAQPPVESTPAGVPVAAQTPKPVK 117

Query: 128 PEAMDKSGYGPPPPPPLLAPAPDPWPNAVAD 158
           P     +G  P  P P   P P   P A   
Sbjct: 118 PPKQPPAGAVPAKPTPKPEPKPVAEPAAAPT 148


>gnl|CDD|233927 TIGR02557, HpaP, type III secretion protein HpaP.  This family of
           genes is always found in type III secretion operons,
           although its function in the processes of secretion and
           virulence is unclear. Hpa stands for Hrp-associated
           gene, where Hrp stands for hypersensitivity response and
           virulence.
          Length = 201

 Score = 33.7 bits (77), Expect = 0.15
 Identities = 15/49 (30%), Positives = 18/49 (36%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           P  P   +  + P  Q    D   Y PPP P    P  +  P   ADT 
Sbjct: 12  PADPARPARRRTPLAQLRRRDALAYAPPPRPEPPPPCDEDRPEPRADTR 60



 Score = 30.6 bits (69), Expect = 1.4
 Identities = 17/83 (20%), Positives = 19/83 (22%), Gaps = 19/83 (22%)

Query: 73  PPPPPA---PIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE 129
            P  PA     +A  R            P             P PPPP    +P P    
Sbjct: 14  DPARPARRRTPLAQLRRRD--ALAYAPPP------------RPEPPPPCDEDRPEPRADT 59

Query: 130 AMDKSGYGPP--PPPPLLAPAPD 150
                    P    P      PD
Sbjct: 60  RASDPPPEAPTDADPAQPPDDPD 82


>gnl|CDD|237011 PRK11892, PRK11892, pyruvate dehydrogenase subunit beta;
           Provisional.
          Length = 464

 Score = 34.5 bits (80), Expect = 0.15
 Identities = 7/43 (16%), Positives = 11/43 (25%), Gaps = 1/43 (2%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
                   + A       +    +   P  P   +A  PD  P
Sbjct: 93  AAAEAAAAAPAAAAAAAAKKAAPAPAAPAAPAAEVAADPD-IP 134



 Score = 31.8 bits (73), Expect = 1.1
 Identities = 12/50 (24%), Positives = 14/50 (28%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
             P     +AA  P     A  K     P  P   A      P+  A T 
Sbjct: 89  AAPAAAAEAAAAAPAAAAAAAAKKAAPAPAAPAAPAAEVAADPDIPAGTE 138



 Score = 29.9 bits (68), Expect = 4.3
 Identities = 9/47 (19%), Positives = 15/47 (31%)

Query: 42  EDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHP 88
           E+ ES     +  ++  E+  +AP                AP  P  
Sbjct: 79  EEGESASDAGAAPAAAAEAAAAAPAAAAAAAAKKAAPAPAAPAAPAA 125



 Score = 29.5 bits (67), Expect = 5.0
 Identities = 9/53 (16%), Positives = 11/53 (20%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKI 162
           G        + A        A   +             AP      VA  P I
Sbjct: 81  GESASDAGAAPAAAAEAAAAAPAAAAAAAAKKAAPAPAAPAAPAAEVAADPDI 133


>gnl|CDD|237082 PRK12373, PRK12373, NADH dehydrogenase subunit E; Provisional.
          Length = 400

 Score = 34.4 bits (79), Expect = 0.15
 Identities = 29/165 (17%), Positives = 38/165 (23%), Gaps = 31/165 (18%)

Query: 13  ITTGSQFAEKTEVPN---VSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGY 69
           +  G Q       P     S  EE  K   N   A + D   +    D            
Sbjct: 177 VKPGPQIGRYASEPAGGLTSLTEEAGKARYNASKALAEDIGDTVKRIDGTEVPLLAPWQG 236

Query: 70  VDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE 129
              P PP     A P                                  +A K P   P+
Sbjct: 237 DAAPVPPSEA--ARP--------------------------KSADAETNAALKTPATAPK 268

Query: 130 AMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEKNSM 174
           A  K+   P   P     A +P P   A      +     +K   
Sbjct: 269 AAAKNAKAPEAQPVSGTAAAEPAPKEAAKAAAAAAKPALEDKPRP 313



 Score = 30.2 bits (68), Expect = 3.6
 Identities = 16/72 (22%), Positives = 24/72 (33%), Gaps = 2/72 (2%)

Query: 114 PPPPLSAAKPPP--VQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEK 171
           P PP  AA+P     +  A  K+    P      A AP+  P +     +    +     
Sbjct: 240 PVPPSEAARPKSADAETNAALKTPATAPKAAAKNAKAPEAQPVSGTAAAEPAPKEAAKAA 299

Query: 172 NSMKVFISFDKP 183
            +       DKP
Sbjct: 300 AAAAKPALEDKP 311


>gnl|CDD|236940 PRK11633, PRK11633, cell division protein DedD; Provisional.
          Length = 226

 Score = 33.8 bits (78), Expect = 0.17
 Identities = 12/41 (29%), Positives = 14/41 (34%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
             P P P+   KP PV+              PP   P P P
Sbjct: 97  VEPEPAPVEPPKPKPVEKPKPKPKPQQKVEAPPAPKPEPKP 137



 Score = 33.1 bits (76), Expect = 0.31
 Identities = 23/94 (24%), Positives = 32/94 (34%), Gaps = 7/94 (7%)

Query: 33  EQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRH 92
            Q    Q  E A    +     +   +    AP +  V+  P P    V PP+P P  + 
Sbjct: 60  TQALPTQPPEGAAEAVRAGDAAAPSLDPATVAPPNTPVEPEPAP----VEPPKPKPVEKP 115

Query: 93  PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPV 126
               KPQ   +       P P P P+   K  P 
Sbjct: 116 KPKPKPQ---QKVEAPPAPKPEPKPVVEEKAAPT 146



 Score = 30.4 bits (69), Expect = 2.1
 Identities = 28/99 (28%), Positives = 32/99 (32%), Gaps = 18/99 (18%)

Query: 65  PKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPP---PPLSAA 121
           PK G  DR  P   P      P      P  G  +       V  G    P   P   A 
Sbjct: 45  PKPG--DRDEPDMMPAATQALP----TQPPEGAAE------AVRAGDAAAPSLDPATVAP 92

Query: 122 KPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
              PV+PE        PP P P+  P P P P    + P
Sbjct: 93  PNTPVEPEPAPVE---PPKPKPVEKPKPKPKPQQKVEAP 128



 Score = 30.4 bits (69), Expect = 2.1
 Identities = 18/81 (22%), Positives = 25/81 (30%), Gaps = 12/81 (14%)

Query: 72  RPPPPPAPIVAPPR-PHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA 130
           R     AP + P     PN        P             PP P P+   KP P   + 
Sbjct: 76  RAGDAAAPSLDPATVAPPNTPVEPEPAPV-----------EPPKPKPVEKPKPKPKPQQK 124

Query: 131 MDKSGYGPPPPPPLLAPAPDP 151
           ++      P P P++     P
Sbjct: 125 VEAPPAPKPEPKPVVEEKAAP 145


>gnl|CDD|222997 PHA03132, PHA03132, thymidine kinase; Provisional.
          Length = 580

 Score = 34.4 bits (79), Expect = 0.19
 Identities = 16/94 (17%), Positives = 23/94 (24%), Gaps = 3/94 (3%)

Query: 52  SETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPR-PHPNGRHPSHGKPQFNHKLGGVYHG 110
             TS D +      + G         + I   PR P    +           +       
Sbjct: 45  EATSEDDDDLYPPRETGSG--GGVATSTIYTVPRPPRGPEQTLDKPDSLPASRELPPGPT 102

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPL 144
           P PP     A+ P         +  Y    P  L
Sbjct: 103 PVPPGGFRGASSPRLGADSTSPRFLYQVNFPVIL 136


>gnl|CDD|114270 pfam05539, Pneumo_att_G, Pneumovirinae attachment membrane
           glycoprotein G. 
          Length = 408

 Score = 33.9 bits (77), Expect = 0.22
 Identities = 30/136 (22%), Positives = 42/136 (30%), Gaps = 24/136 (17%)

Query: 8   STQLQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKH 67
           + Q   +T     + T   +  + + +P  +Q      S     S TS DQ +     +H
Sbjct: 209 ANQRLSSTEPVGTQGTTTSSNPEPQTEPPPSQRGPSG-SPQHPPSTTSQDQSTTGDGQEH 267

Query: 68  GYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ 127
               R   PPA          N R P                  PPP         P  +
Sbjct: 268 --TQRRKTPPAT--------SNRRSPHS-------------TATPPPTTKRQETGRPTPR 304

Query: 128 PEAMDKSGYGPPPPPP 143
           P A  +SG  PP   P
Sbjct: 305 PTATTQSGSSPPHSSP 320


>gnl|CDD|144451 pfam00859, CTF_NFI, CTF/NF-I family transcription modulation
           region. 
          Length = 295

 Score = 33.9 bits (77), Expect = 0.22
 Identities = 25/96 (26%), Positives = 36/96 (37%), Gaps = 13/96 (13%)

Query: 72  RPPPPPAPI------VAPPRP-----HPNGRHPSHGKPQFNHK--LGGVYHGPPPPPPPL 118
            P P P+P+      + P +P     H   R+P H  PQ   K  +  V           
Sbjct: 149 SPHPTPSPLHFPTSPILPQQPSSYFPHTAIRYPPHLHPQDPLKEFVQLVCDPSSQQAGQP 208

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPN 154
           + +    V    +      PPPPPP+  P P P P+
Sbjct: 209 NGSGQGKVPNHFLPTPMLAPPPPPPMARPVPLPMPD 244



 Score = 29.7 bits (66), Expect = 4.9
 Identities = 27/100 (27%), Positives = 34/100 (34%), Gaps = 22/100 (22%)

Query: 49  QVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVY 108
           Q+  + SS Q  Q +    G V     P  P++APP P P  R                 
Sbjct: 195 QLVCDPSSQQAGQPNGSGQGKVPNHFLP-TPMLAPPPPPPMAR----------------- 236

Query: 109 HGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA 148
               P P P+   KPP    E    S   P    P  +PA
Sbjct: 237 ----PVPLPMPDTKPPTTSTEGGATSPTSPTYSTPSTSPA 272


>gnl|CDD|227665 COG5373, COG5373, Predicted membrane protein [Function unknown].
          Length = 931

 Score = 34.0 bits (78), Expect = 0.24
 Identities = 14/45 (31%), Positives = 17/45 (37%), Gaps = 3/45 (6%)

Query: 113 PPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
           PPP P + A+       A   S    P   P  APA    P+  A
Sbjct: 78  PPPVPPAPAQEGEAPA-AEQPSAVPAPSAAP--APAEPVEPSLAA 119



 Score = 30.2 bits (68), Expect = 3.8
 Identities = 11/54 (20%), Positives = 14/54 (25%)

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
             G             AA+  P+   A   +    PPP P         P A  
Sbjct: 43  AAGPVAKAAEQMAAPEAAEAAPLPAAAESIASPEVPPPVPPAPAQEGEAPAAEQ 96


>gnl|CDD|223044 PHA03325, PHA03325, nuclear-egress-membrane-like protein;
           Provisional.
          Length = 418

 Score = 33.7 bits (77), Expect = 0.24
 Identities = 35/177 (19%), Positives = 56/177 (31%), Gaps = 31/177 (17%)

Query: 4   YKIVSTQLQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQS 63
           Y+++  Q+   T S F   + +P  +      +    R  A  T    ++ + D  S+ S
Sbjct: 248 YRLLF-QIGQLTSSAFMLNSSLPTSAPKRRSRRAGAMRAAAGET----ADLADDDGSEHS 302

Query: 64  APKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP 123
            P         P PA +  PP   P  +HP  GK + +         P  P    S+   
Sbjct: 303 DP--------EPLPASLPPPPVRRPRVKHPEAGKEEPDGARNAEAKEPAQPATSTSSKGS 354

Query: 124 PPVQPEAMDKSGYG------------------PPPPPPLLAPAPDPWPNAVADTPKI 162
              Q +    +G G                  P      L   P P   +  + P I
Sbjct: 355 SSAQNKDSGSTGPGSSLAAASSFLEDDDFGSPPLDLTTSLRHMPSPSVTSAPEPPSI 411


>gnl|CDD|214832 smart00817, Amelin, Ameloblastin precursor (Amelin).  This family
           consists of several mammalian Ameloblastin precursor
           (Amelin) proteins. Matrix proteins of tooth enamel
           consist mainly of amelogenin but also of non-amelogenin
           proteins, which, although their volumetric percentage is
           low, have an important role in enamel mineralisation.
           One of the non-amelogenin proteins is ameloblastin, also
           known as amelin and sheathlin. Ameloblastin (AMBN) is
           one of the enamel sheath proteins which is thought to
           have a role in determining the prismatic structure of
           growing enamel crystals.
          Length = 411

 Score = 33.7 bits (77), Expect = 0.25
 Identities = 20/77 (25%), Positives = 26/77 (33%), Gaps = 12/77 (15%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPP-- 115
           Q+ + S P H       PPP P     +P   G  P                  P PP  
Sbjct: 87  QQYEYSLPVH-------PPPLPSQPSLQPQQPGLKPFLQPTALPTNQATPQKNGPQPPMH 139

Query: 116 ---PPLSAAKPPPVQPE 129
              PPL  A+ P + P+
Sbjct: 140 LGQPPLQQAELPMIPPQ 156


>gnl|CDD|216368 pfam01213, CAP_N, Adenylate cyclase associated (CAP) N terminal. 
          Length = 313

 Score = 33.6 bits (77), Expect = 0.25
 Identities = 12/45 (26%), Positives = 17/45 (37%), Gaps = 12/45 (26%)

Query: 116 PPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           P +S++ P              PPPPPP   P+     N+V    
Sbjct: 223 PAVSSSAPSA------------PPPPPPPPPPSVPTISNSVESAS 255


>gnl|CDD|218397 pfam05044, Prox1, Homeobox prospero-like protein (PROX1).  The
           homeobox gene Prox1 is expressed in a subpopulation of
           endothelial cells that, after budding from veins, gives
           rise to the mammalian lymphatic system. Prox1 has been
           found to be an early specific marker for the developing
           liver and pancreas in the mammalian foregut endoderm.
           This family contains an atypical homeobox domain.
          Length = 908

 Score = 33.9 bits (77), Expect = 0.25
 Identities = 22/82 (26%), Positives = 28/82 (34%), Gaps = 9/82 (10%)

Query: 73  PPPPPAPIVAPPRPHPNG-RHP-----SHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPV 126
            P  P+ + A    HP   RHP     +   P  +     V+ G P   P L A   P  
Sbjct: 617 QPLHPSSLSASMGFHPPPFRHPFPLPLTVAIPNPSLHQSEVFMGYPFQSPHLGA---PSG 673

Query: 127 QPEAMDKSGYGPPPPPPLLAPA 148
            P   D+     P P   L P 
Sbjct: 674 SPPGKDRDSPDLPRPTTSLHPK 695



 Score = 29.3 bits (65), Expect = 7.6
 Identities = 23/110 (20%), Positives = 32/110 (29%), Gaps = 10/110 (9%)

Query: 41  REDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPI-----VAPPRPHPNGRHPSH 95
           R  A       +  +  Q    S+        PPP   P      VA P P  +      
Sbjct: 600 RILALRDAVGPAAGTHHQPLHPSSLSASMGFHPPPFRHPFPLPLTVAIPNPSLHQSEVFM 659

Query: 96  GKPQFNHKLGGVYHGPPPPP---PPLSAAKPPPVQPEAMDKSGYGPPPPP 142
           G P  +  LG     PP      P L   +P       +  + + P   P
Sbjct: 660 GYPFQSPHLGAPSGSPPGKDRDSPDLP--RPTTSLHPKLLSAHHHPGSSP 707


>gnl|CDD|113398 pfam04625, DEC-1_N, DEC-1 protein, N-terminal region.  The
           defective chorion-1 gene (dec-1) in Drosophila encodes
           follicle cell proteins necessary for proper eggshell
           assembly. Multiple products of the dec-1 gene are formed
           by alternative RNA splicing and proteolytic processing.
           Cleavage products include S80 (80 kDa) which is
           incorporated into the eggshell, and further proteolysis
           of S80 gives S60 (60 kDa).
          Length = 407

 Score = 33.6 bits (76), Expect = 0.26
 Identities = 28/114 (24%), Positives = 35/114 (30%), Gaps = 15/114 (13%)

Query: 70  VDRPPPPPAPIVA----PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP--LSAAKP 123
           V R  PPP P+      P  P           PQ    L  +   P  P  P  L AA P
Sbjct: 52  VARQNPPPNPLGQLMNWPALPQDFQLPSMDLGPQVGSFLAQL---PAMPSMPGLLGAAAP 108

Query: 124 PPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEKNSMKVF 177
            P        +      PP   APA D     + D  +   L     +N+    
Sbjct: 109 VP------APAPAPAAAPPAAPAPAADTPAAPIPDAVQPAILGQAALQNAFTFL 156


>gnl|CDD|225657 COG3115, ZipA, Cell division protein [Cell division and chromosome
           partitioning].
          Length = 324

 Score = 33.7 bits (77), Expect = 0.27
 Identities = 21/100 (21%), Positives = 28/100 (28%), Gaps = 22/100 (22%)

Query: 26  PNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPR 85
           P +S     P+  Q   D E   + A +    QE+    P H      P P    V    
Sbjct: 107 PQISDPPAHPQPTQPALDQEQPPEEARQPVLPQEAPAPQPVH---SAAPQPAVQTVQ--- 160

Query: 86  PHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPP 125
                  P+  + Q           P     P    K PP
Sbjct: 161 -------PAVPEQQVQ---------PEEVVEPAPEVKRPP 184



 Score = 30.6 bits (69), Expect = 2.2
 Identities = 20/109 (18%), Positives = 25/109 (22%), Gaps = 18/109 (16%)

Query: 70  VDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGG------VYHGPP----------- 112
             R   P   +              +  PQF  +           + P            
Sbjct: 45  SKRDDDPYDEVADDEGVGEVRVVRKNEAPQFTQEHEAARQSPQHQYQPEYASAQIKIPVP 104

Query: 113 -PPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
            PP      A P P QP    +        P L   AP P P   A   
Sbjct: 105 QPPQISDPPAHPQPTQPALDQEQPPEEARQPVLPQEAPAPQPVHSAAPQ 153


>gnl|CDD|219406 pfam07420, DUF1509, Protein of unknown function (DUF1509).  This
           family consists of several uncharacterized viral
           proteins from the Marek's disease-like viruses. Members
           of this family are typically around 400 residues in
           length. The function of this family is unknown.
          Length = 377

 Score = 33.5 bits (76), Expect = 0.28
 Identities = 33/163 (20%), Positives = 44/163 (26%), Gaps = 28/163 (17%)

Query: 4   YKIVSTQLQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQS 63
           Y  V T   I   ++          S  ++ P  ++    A     V          Q  
Sbjct: 115 YSSVMTWTPIPCFAEVPVFPRPYQSSGDDDGPSTSRGSGVARVRPTVI---------QHR 165

Query: 64  APKHGYVD----RPPP-----------PPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVY 108
             K    D    RP P           P A    PP+P P+G++ S   P     L  V 
Sbjct: 166 VDKTRPSDYENHRPRPFAMANPSWVDEPDAAAQRPPQPGPSGQNRSPRTP----TLSNVR 221

Query: 109 HGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
               P       A  PP             P       P+P P
Sbjct: 222 VLDAPVATNRGEAPSPPRTDTLDPDPAIAGPSRAVNRTPSPRP 264


>gnl|CDD|177328 PHA01929, PHA01929, putative scaffolding protein.
          Length = 306

 Score = 33.5 bits (76), Expect = 0.28
 Identities = 26/88 (29%), Positives = 31/88 (35%), Gaps = 9/88 (10%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           P P P P++ P  P   G+    G PQ             P P P SA  P  VQ     
Sbjct: 25  PTPQPNPVIQPQAPVQPGQP---GAPQQLAI-----PTQQPQPVPTSAMTPHVVQQAPAQ 76

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTP 160
            +   PP     L  A +  P   A TP
Sbjct: 77  PAPAAPPAAGAALPEALEVPPPP-AFTP 103


>gnl|CDD|221526 pfam12316, Dsh_C, Segment polarity protein dishevelled (Dsh) C
           terminal.  This domain family is found in eukaryotes,
           and is typically between 177 and 207 amino acids in
           length. The family is found in association with
           pfam00778, pfam02377, pfam00610, pfam00595. The segment
           polarity gene dishevelled (dsh) is required for pattern
           formation of the embryonic segments. It is involved in
           the determination of body organisation through the
           Wingless pathway (analogous to the Wnt-1 pathway).
          Length = 202

 Score = 33.0 bits (75), Expect = 0.30
 Identities = 35/131 (26%), Positives = 50/131 (38%), Gaps = 17/131 (12%)

Query: 14  TTGSQFAEKTEVPNVSK-VEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDR 72
           + GSQ +E +     ++   E+ + A  RE    +    SE  S+  S+  + + G    
Sbjct: 65  SAGSQHSEGSRSSGSNRSDGERSRAADGREGGRKSGGSGSE--SEHTSRSGSRRSGGRRA 122

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           P     P  +      +  HPS       H   G    PPP  PP+   KPPP  P    
Sbjct: 123 PSERSGPPPSEGSVRSSLSHPSS------HSSYGAPGVPPPYNPPMLMMKPPPPSP---- 172

Query: 133 KSGYGPPPPPP 143
               GPP  PP
Sbjct: 173 ----GPPGAPP 179


>gnl|CDD|226193 COG3667, PcoB, Uncharacterized protein involved in copper
           resistance [Inorganic ion transport and metabolism].
          Length = 321

 Score = 33.3 bits (76), Expect = 0.31
 Identities = 21/85 (24%), Positives = 31/85 (36%), Gaps = 8/85 (9%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLG-GVYHGPPPPPP 116
           + +Q  A +H  +D P P           H     P    PQ +H     + H PPP P 
Sbjct: 28  EAAQPHAHEHAPMDAPHPAMPG--MDHHAHSKMPGPEMAAPQMDHGAMPHMDHAPPPIPT 85

Query: 117 PLSA--AKPPPVQPEAMDKSGYGPP 139
             +A  ++ P     A  +    PP
Sbjct: 86  QHAAERSRSP---ASAAARVAAFPP 107


>gnl|CDD|189937 pfam01310, Adeno_PVIII, Adenovirus hexon associated protein,
           protein VIII.  See pfam01065. This family represents
           Hexon.
          Length = 215

 Score = 32.8 bits (75), Expect = 0.31
 Identities = 23/87 (26%), Positives = 29/87 (33%), Gaps = 14/87 (16%)

Query: 74  PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYH----------GPPPPPPPLSAAKP 123
           P PP   V  PRP    R  ++   Q     GG              PP  PP  S  +P
Sbjct: 80  PGPPPTTVYLPRPEALERVMTNSGAQL---AGGSTLNRGEGRIQLAEPPVGPPSRSRLRP 136

Query: 124 PPV-QPEAMDKSGYGPPPPPPLLAPAP 149
               Q     +S + P     LL  +P
Sbjct: 137 DGWFQLGGGGRSSFTPTQAYLLLQESP 163


>gnl|CDD|217453 pfam03251, Tymo_45kd_70kd, Tymovirus 45/70Kd protein.  Tymoviruses
           are single stranded RNA viruses. This family includes a
           protein of unknown function that has been named based on
           its molecular weight. Tymoviruses such as the ononis
           yellow mosaic tymovirus encode only three proteins. Of
           these, two are overlapping; this protein overlaps a larger
           ORF that is thought to be the polymerase.
          Length = 458

 Score = 33.5 bits (77), Expect = 0.31
 Identities = 21/95 (22%), Positives = 28/95 (29%), Gaps = 14/95 (14%)

Query: 65  PKHGYVDRPPPPPAPIV------APPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPL 118
           P      +  P  A         + P PHP    PSH     + K   +     P P   
Sbjct: 371 PVPPPKVQALPLTALAPLVRHSPSIPLPHPPSALPSHVGAS-SSKHHRLPPSVLPGPRLS 429

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           S +  P +           P  PPP  +P     P
Sbjct: 430 SPSPSPSLPT-------RRPGTPPPPASPPTPSPP 457



 Score = 28.9 bits (65), Expect = 8.0
 Identities = 14/81 (17%), Positives = 20/81 (24%), Gaps = 4/81 (4%)

Query: 71  DRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA 130
             PPP    +   P P P  +           +       P PP    S       +   
Sbjct: 358 QLPPPTKRRLRLLPVPPPKVQALPLTALAPLVRHSPSIPLPHPPSALPSHVGASSSKHHR 417

Query: 131 MDKSGYGPPPPPPLLAPAPDP 151
           +  S      P P L+     
Sbjct: 418 LPPS----VLPGPRLSSPSPS 434


>gnl|CDD|220260 pfam09483, HpaP, Type III secretion protein (HpaP).  This entry
           represents proteins encoded by genes which are always
           found in type III secretion operons, although their
           function in the processes of secretion and virulence is
           unclear. Hpa stands for Hrp-associated gene, where Hrp
           stands for hypersensitivity response and virulence. See
           also PMID:18584024.
          Length = 185

 Score = 32.8 bits (75), Expect = 0.32
 Identities = 19/77 (24%), Positives = 24/77 (31%), Gaps = 2/77 (2%)

Query: 82  APPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPP 141
           AP    P     S    +   +      GPP  P P   A+    +     +    PP P
Sbjct: 1   APAGAAPRAARRSFDYARLMRRAARA--GPPGTPAPPGPAEDAHPEFPERPRDAPAPPAP 58

Query: 142 PPLLAPAPDPWPNAVAD 158
           P       DP P A A 
Sbjct: 59  PRATDGDRDPQPLADAL 75


>gnl|CDD|222828 PHA01732, PHA01732, proline-rich protein.
          Length = 94

 Score = 31.2 bits (70), Expect = 0.35
 Identities = 11/34 (32%), Positives = 15/34 (44%)

Query: 138 PPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEK 171
           P P PP   P P P P A    P +  ++ +  K
Sbjct: 13  PAPLPPAPVPPPPPAPPAPVPEPTVKPVNAEAPK 46



 Score = 28.2 bits (62), Expect = 3.5
 Identities = 22/53 (41%), Positives = 24/53 (45%), Gaps = 15/53 (28%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP-WPNAVADTPKI 162
           PP PP PL  A   PV           PPPPP   AP P+P      A+ PKI
Sbjct: 9   PPEPPAPLPPA---PV-----------PPPPPAPPAPVPEPTVKPVNAEAPKI 47


>gnl|CDD|152960 pfam12526, DUF3729, Protein of unknown function (DUF3729).  This
           family of proteins is found in viruses. Proteins in this
           family are typically between 145 and 1707 amino acids in
           length. The family is found in association with
           pfam01443, pfam01661, pfam05417, pfam01660, pfam00978.
           There is a single completely conserved residue L that
           may be functionally important.
          Length = 115

 Score = 31.6 bits (72), Expect = 0.38
 Identities = 20/88 (22%), Positives = 25/88 (28%), Gaps = 22/88 (25%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
             PP +     P P                 +G         PPP+SA    P   E   
Sbjct: 29  FSPPESAHPDDPPP-----------------VGDPRPPVVDTPPPVSAVWVLPPPSE--- 68

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTP 160
                PPP P    P P   P+ +A   
Sbjct: 69  --PAAPPPDPEPPVPGPAGPPSPLAPPA 94


>gnl|CDD|184918 PRK14954, PRK14954, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 620

 Score = 33.4 bits (76), Expect = 0.38
 Identities = 19/62 (30%), Positives = 24/62 (38%), Gaps = 10/62 (16%)

Query: 112 PPPPPPLSAAKPPPVQPEA---------MDKSGYGPPPPPPLLAPAPDPWPNAVADTPKI 162
           P P  P     P P +PEA            S   P   PP+   AP P P+  A  P+ 
Sbjct: 395 PEPDLPQPDRHPGPAKPEAPGARPAELPSPASAPTPEQQPPVARSAPLP-PSPQASAPRN 453

Query: 163 VS 164
           V+
Sbjct: 454 VA 455



 Score = 29.9 bits (67), Expect = 5.1
 Identities = 13/66 (19%), Positives = 13/66 (19%), Gaps = 3/66 (4%)

Query: 63  SAPKHGYVDRPPPPPAPIVAPPR-PHPNGRHP-SHGKPQFNHKLGGVYHGPPPPP-PPLS 119
           S         P  P       P  P   G  P     P                P PP  
Sbjct: 387 SPDVKKKAPEPDLPQPDRHPGPAKPEAPGARPAELPSPASAPTPEQQPPVARSAPLPPSP 446

Query: 120 AAKPPP 125
            A  P 
Sbjct: 447 QASAPR 452


>gnl|CDD|183558 PRK12495, PRK12495, hypothetical protein; Provisional.
          Length = 226

 Score = 32.5 bits (74), Expect = 0.42
 Identities = 25/128 (19%), Positives = 31/128 (24%), Gaps = 13/128 (10%)

Query: 33  EQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPH-PNGR 91
           +QP         ++ D   +   SD  SQ S       D  P   A       P   +  
Sbjct: 65  QQPVTEDGAAGDDAGDGAEATAPSDAGSQASPDD----DAQPAAEAEAADQSAPPEASST 120

Query: 92  HPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
             +               GP P P    A       P           PP     P P  
Sbjct: 121 SATDEAATDPPATAAARDGPTPDPTAQPATPDERRSP--------RQRPPVSGEPPTPST 172

Query: 152 WPNAVADT 159
               VA T
Sbjct: 173 PDAHVAGT 180


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 32.8 bits (75), Expect = 0.53
 Identities = 22/94 (23%), Positives = 28/94 (29%), Gaps = 13/94 (13%)

Query: 51  ASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHG 110
            +E   D   Q   P           P P+  PP        P+  + Q N        G
Sbjct: 44  VAEFPWDPSPQAPPP-------VAQLPQPLPQPPPTQALQALPAGDQQQHNT-----PTG 91

Query: 111 PPPPPPPLSAAKPP-PVQPEAMDKSGYGPPPPPP 143
            P   PP + A P  P  P    + G   P   P
Sbjct: 92  SPAANPPATFALPAGPAGPTIQTEPGQLYPVQVP 125


>gnl|CDD|219404 pfam07415, Herpes_LMP2, Gammaherpesvirus latent membrane protein
           (LMP2) protein.  This family consists of several
           Gammaherpesvirus latent membrane protein (LMP2)
           proteins. Epstein-Barr virus is a human Gammaherpesvirus
           that infects and establishes latency in B lymphocytes in
           vivo. The latent membrane protein 2 (LMP2) gene is
           expressed in latently infected B cells and encodes two
           protein isoforms, LMP2A and LMP2B, that are identical
           except for an additional N-terminal 119 aa cytoplasmic
           domain which is present in the LMP2A isoform. LMP2A is
           thought to play a key role in either the establishment
           or the maintenance of latency and/or the reactivation of
           productive infection from the latent state. The
           significance of LMP2B and its role in pathogenesis
           remain unclear.
          Length = 489

 Score = 32.9 bits (75), Expect = 0.53
 Identities = 15/69 (21%), Positives = 18/69 (26%), Gaps = 9/69 (13%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
               P P V      P+ R P +G                    PL    P        +
Sbjct: 38  SWDRPGPPVPEDYDAPSHRPPPYGGS---------NGDRHGGYQPLGQQDPSLYAGLGQN 88

Query: 133 KSGYGPPPP 141
             G  PPPP
Sbjct: 89  GGGGLPPPP 97


>gnl|CDD|222095 pfam13388, DUF4106, Protein of unknown function (DUF4106).  This
           family of proteins is found in large numbers in the
           Trichomonas vaginalis proteome. The function of this
           protein is unknown.
          Length = 422

 Score = 32.7 bits (74), Expect = 0.54
 Identities = 18/70 (25%), Positives = 27/70 (38%), Gaps = 5/70 (7%)

Query: 67  HGYVDR--PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPP 124
           HG+  R  P P   P V  P   P  ++P+    Q   +       P   P P + A+ P
Sbjct: 193 HGHRHRHAPKPTQQPTVQNPAQQPTVQNPA---QQPQQQPQQQPVQPAQQPTPQNPAQQP 249

Query: 125 PVQPEAMDKS 134
           P   +   +S
Sbjct: 250 PQTEQGHKRS 259



 Score = 32.7 bits (74), Expect = 0.56
 Identities = 22/90 (24%), Positives = 26/90 (28%), Gaps = 11/90 (12%)

Query: 75  PPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP---PPPPLSAAKPPPVQPEAM 131
           PP  P  AP    P     SHG     H+     H P P   P     A +P    P   
Sbjct: 173 PPNPPREAPAPGLPKTFTSSHGH---RHR-----HAPKPTQQPTVQNPAQQPTVQNPAQQ 224

Query: 132 DKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
            +      P  P   P P         T +
Sbjct: 225 PQQQPQQQPVQPAQQPTPQNPAQQPPQTEQ 254



 Score = 31.6 bits (71), Expect = 1.4
 Identities = 19/82 (23%), Positives = 21/82 (25%), Gaps = 10/82 (12%)

Query: 73  PPPPPAPIVAPPRPHP-----NGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ 127
           PP PP    AP  P         RH    KP     +       P   P +      P Q
Sbjct: 173 PPNPPREAPAPGLPKTFTSSHGHRHRHAPKPTQQPTVQ-----NPAQQPTVQNPAQQPQQ 227

Query: 128 PEAMDKSGYGPPPPPPLLAPAP 149
                       P P   A  P
Sbjct: 228 QPQQQPVQPAQQPTPQNPAQQP 249


>gnl|CDD|99944 cd05512, Bromo_brd1_like, Bromodomain; brd1_like subfamily. BRD1 is
           a mammalian gene which encodes for a nuclear protein
           assumed to be a transcriptional regulator. BRD1 has been
           implicated in brain development and susceptibility to
           schizophrenia and bipolar affective disorder.
           Bromodomains are 110 amino acid long domains that are
           found in many chromatin associated proteins.
           Bromodomains can interact specifically with acetylated
           lysine.
          Length = 98

 Score = 30.4 bits (69), Expect = 0.57
 Identities = 11/32 (34%), Positives = 21/32 (65%), Gaps = 5/32 (15%)

Query: 339 FDMLVRNCMAHDGK-----RAPIQLVDQRGCV 365
           F++++ NC+A++ K     RA ++L DQ G +
Sbjct: 66  FNLIINNCLAYNAKDTIFYRAAVRLRDQGGAI 97


>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein.  This family consists of
           AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
           retardation syndrome) nuclear proteins. These proteins
           have been linked to human diseases such as acute
           lymphoblastic leukaemia and mental retardation. The
           family also contains a Drosophila AF4 protein homologue
           Lilliputian which contains an AT-hook domain.
           Lilliputian represents a novel pair-rule gene that acts
           in cytoskeleton regulation, segmentation and
           morphogenesis in Drosophila.
          Length = 1154

 Score = 33.0 bits (75), Expect = 0.58
 Identities = 36/157 (22%), Positives = 51/157 (32%), Gaps = 24/157 (15%)

Query: 16  GSQFAEKTEVPNVSKVEEQPKQ---AQNREDAESTDQVASETSSDQESQQSAPKHGYVDR 72
             Q  EK    N             A +   + S+    S + SD ES+ S+      + 
Sbjct: 370 EEQATEKPPSRNTPPSAPSSNPEPAASSSGSSSSSSGSESSSGSDSESESSS-SDSEENE 428

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           PP   +     P P P    PS  K Q ++ L  V   P    P  S +  PP++     
Sbjct: 429 PPRTAS-----PEPEP----PSTNKWQLDNWLNKV--NPHKVSPAESVSSNPPIKQPMEK 477

Query: 133 K-------SGYGPPP--PPPLLAPAPDPWPNAVADTP 160
           +       S Y P    PPP  +      P      P
Sbjct: 478 EGKVKSSGSQYHPESKEPPPKSSSKEKRRPRTAQKGP 514


>gnl|CDD|218566 pfam05349, GATA-N, GATA-type transcription activator, N-terminal.
           GATA transcription factors mediate cell differentiation
           in a diverse range of tissues. Mutations are often
           associated with certain congenital human disorders. The
           six classical vertebrate GATA proteins, GATA-1 to
           GATA-6, are highly homologous and have two tandem zinc
           fingers. The classical GATA transcription factors
           function as transcription activators. In lower metazoans
           GATA proteins carry a single canonical zinc finger. This
           family represents the N-terminal domain of the family of
           GATA transcription activators.
          Length = 177

 Score = 31.7 bits (72), Expect = 0.60
 Identities = 16/68 (23%), Positives = 25/68 (36%), Gaps = 1/68 (1%)

Query: 93  PSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPW 152
            +HG+  ++H  GG  H     P  +   + P + P      G         LA  P  W
Sbjct: 9   ANHGQAAYDHDSGGFLHSAASSPVYVPTTRVPSMLPSLPYLQGCEASQQAHALAAHPG-W 67

Query: 153 PNAVADTP 160
             A A++ 
Sbjct: 68  SQAAAESS 75


>gnl|CDD|140276 PTZ00249, PTZ00249, variable surface protein Vir28; Provisional.
          Length = 516

 Score = 32.7 bits (74), Expect = 0.61
 Identities = 31/140 (22%), Positives = 49/140 (35%), Gaps = 16/140 (11%)

Query: 35  PKQAQNREDAESTDQVASETSSDQESQQSAPK-HGYVDRPPPPPAPIVAPPRPHPNGRHP 93
           P++ Q    A +  +++ E    +    S+P  HG       PP P V+   P  +GRHP
Sbjct: 218 PREEQKAVTAHAHRRISGEARPPKHISFSSPHAHGRPPVETRPPNP-VSVSSPQAHGRHP 276

Query: 94  --SHGKPQF---NHKLGGVYHGPPPPPPPLSAAKPP------PVQPEAMDKSGYGP---P 139
             +H  P     + K         P P  +S               E+   S       P
Sbjct: 277 GETHTPPLVTVPSSKAHDRNPVQTPTPTSVSGYSSQAKGLEKQAGGESERTSSVPSEQFP 336

Query: 140 PPPPLLAPAPDPWPNAVADT 159
            P P+L P     P   +++
Sbjct: 337 LPLPVLLPLGQSGPLESSES 356


>gnl|CDD|240289 PTZ00144, PTZ00144, dihydrolipoamide succinyltransferase;
           Provisional.
          Length = 418

 Score = 32.7 bits (75), Expect = 0.61
 Identities = 19/69 (27%), Positives = 21/69 (30%), Gaps = 5/69 (7%)

Query: 96  GKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
           G P      GG    P   P   +AAK     PE         P P P  A  P P   A
Sbjct: 112 GAPLSEIDTGGA--PPAAAPAAAAAAKAEKTTPEKPKA---AAPTPEPPAASKPTPPAAA 166

Query: 156 VADTPKIVS 164
               P   +
Sbjct: 167 KPPEPAPAA 175



 Score = 28.9 bits (65), Expect = 7.8
 Identities = 11/62 (17%), Positives = 12/62 (19%), Gaps = 6/62 (9%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCE 170
                 P       P            P PP     P P     A    P  V+     E
Sbjct: 136 KAEKTTPEKPKAAAPTPEPPAASK---PTPPAAAKPPEP---APAAKPPPTPVARADPRE 189

Query: 171 KN 172
             
Sbjct: 190 TR 191


>gnl|CDD|171499 PRK12438, PRK12438, hypothetical protein; Provisional.
          Length = 991

 Score = 32.5 bits (74), Expect = 0.67
 Identities = 16/49 (32%), Positives = 19/49 (38%), Gaps = 4/49 (8%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVAD 158
           G     PP  A  P P Q     ++   P  PP      PD  P AVA+
Sbjct: 908 GDAASAPPPGAGPPAPPQAVPPPRTTQPPAAPP----RGPDVPPAAVAE 952



 Score = 31.0 bits (70), Expect = 2.0
 Identities = 11/37 (29%), Positives = 13/37 (35%)

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
             A  P     +    G GPP PP  + P     P A
Sbjct: 901 RVATAPGGDAASAPPPGAGPPAPPQAVPPPRTTQPPA 937


>gnl|CDD|184281 PRK13729, PRK13729, conjugal transfer pilus assembly protein TraB;
           Provisional.
          Length = 475

 Score = 32.5 bits (74), Expect = 0.72
 Identities = 27/140 (19%), Positives = 47/140 (33%), Gaps = 21/140 (15%)

Query: 7   VSTQLQITTG---SQFAE-KTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQ 62
            +T++Q+T      Q+ E + E+  ++K     ++  ++   E   Q  +  +   ++  
Sbjct: 70  ATTEMQVTAAQMQKQYEEIRRELDVLNK-----QRGDDQRRIEKLGQDNAALAEQVKALG 124

Query: 63  SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAK 122
           + P           P P +    P P G       P      G V   PP    P +   
Sbjct: 125 ANPVT-----ATGEPVPQMPASPPGPEGEPQPGNTPVSFPPQGSVAVPPPTAFYPGNGVT 179

Query: 123 PPPVQPEAMDKSGYGPPPPP 142
           PPP          Y   P P
Sbjct: 180 PPPQ-------VTYQSVPVP 192


>gnl|CDD|236382 PRK09111, PRK09111, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 598

 Score = 32.6 bits (75), Expect = 0.75
 Identities = 11/51 (21%), Positives = 13/51 (25%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           G  PP    +   P          +      P   LA  PD    A A   
Sbjct: 398 GGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAAALAAVPDAAAAAAAPPA 448



 Score = 29.5 bits (67), Expect = 6.0
 Identities = 21/93 (22%), Positives = 25/93 (26%), Gaps = 16/93 (17%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           P P       P      G   +   P               P   L+A   P     A  
Sbjct: 392 PSPGGGGGGPPGGGGAPGAPAAAAAPGA----AAAAPAAGGPAAALAAV--PDAAAAA-- 443

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSL 165
                  PP P  AP P    N+  D   IV+L
Sbjct: 444 -----AAPPAPAAAPQPAVRLNSFED---IVAL 468


>gnl|CDD|225499 COG2948, VirB10, Type IV secretory pathway, VirB10 components
           [Intracellular trafficking and secretion].
          Length = 360

 Score = 32.1 bits (73), Expect = 0.75
 Identities = 11/46 (23%), Positives = 13/46 (28%), Gaps = 1/46 (2%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPP-PPLLAPAPDPWPNA 155
            PP PPPL      PV P+   +     P         A       
Sbjct: 75  DPPLPPPLPVDLGAPVLPDQQVEEAKDQPRRLRAAELAATSGSRVE 120



 Score = 31.7 bits (72), Expect = 1.00
 Identities = 13/47 (27%), Positives = 15/47 (31%), Gaps = 3/47 (6%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAV 156
                PP       PP+ P   D      PPP P+   AP      V
Sbjct: 53  INNTQPPSNVERGTPPLPPLPDD---PPLPPPLPVDLGAPVLPDQQV 96



 Score = 30.1 bits (68), Expect = 3.4
 Identities = 13/50 (26%), Positives = 16/50 (32%), Gaps = 4/50 (8%)

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           L G               +    QP +  + G  P PP P   P P P P
Sbjct: 38  LVGFALIALQGE----KKRINNTQPPSNVERGTPPLPPLPDDPPLPPPLP 83


>gnl|CDD|237182 PRK12727, PRK12727, flagellar biosynthesis regulator FlhF;
           Provisional.
          Length = 559

 Score = 32.3 bits (73), Expect = 0.77
 Identities = 35/179 (19%), Positives = 47/179 (26%), Gaps = 37/179 (20%)

Query: 25  VPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSA---PKHGYVDRPPPPPAPIV 81
            P        P  A  +  A +        +S  E   +A    +   V R  P  AP+ 
Sbjct: 72  APQAPTKPAAPVHAPLKLSANANMSQRQRVASAAEDMIAAMALRQPVSVPRQAPAAAPVR 131

Query: 82  APPRPHPNGRHPSHG-----KPQFNHKLGGV----------------------YHGPPPP 114
           A   P P  +  +H       P+  H L  V                          P P
Sbjct: 132 AASIPSPAAQALAHAAAVRTAPRQEHALSAVPEQLFADFLTTAPVPRAPVQAPVVAAPAP 191

Query: 115 PPPLSAAKPPPVQPEAMDKSGYGPP------PPPPLLAPAPDPWPNAVADTPKIVSLDV 167
            P ++AA          D               P +L PA  P P  VA         V
Sbjct: 192 VPAIAAALAAHAAYAQDDDEQLDDDGFDLDDALPQILPPAALP-PIVVAPAAPAALAAV 249



 Score = 31.5 bits (71), Expect = 1.5
 Identities = 20/107 (18%), Positives = 29/107 (27%), Gaps = 5/107 (4%)

Query: 38  AQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGK 97
           A  +E A S   V  +  +D  +    P+          PAP+ A           +   
Sbjct: 152 APRQEHALSA--VPEQLFADFLTTAPVPRAPVQAPVVAAPAPVPAIAAALAAHAAYAQDD 209

Query: 98  PQFNHKLGGVYHGPPPP---PPPLSAAKPPPVQPEAMDKSGYGPPPP 141
            +     G       P    P  L      P  P A+       P P
Sbjct: 210 DEQLDDDGFDLDDALPQILPPAALPPIVVAPAAPAALAAVAAAAPAP 256


>gnl|CDD|240415 PTZ00429, PTZ00429, beta-adaptin; Provisional.
          Length = 746

 Score = 32.6 bits (74), Expect = 0.77
 Identities = 27/104 (25%), Positives = 30/104 (28%), Gaps = 16/104 (15%)

Query: 53  ETSSDQESQQSAPKHGYVDRPPPPPA-----PIVAPPRPHPNGRHP-SHGKPQFNHK--L 104
               D     S P  G  D  P P A      I           HP + G     H   L
Sbjct: 613 TEDDDAVELPSTPSMGTQDGSPAPSAAPAGYDIFEFAGDGTGAPHPVASGSNGAQHADPL 672

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA 148
           G ++ G P        A  P  Q      SG   P  PP  A A
Sbjct: 673 GDLFSGLPST----VGASSPAFQAA----SGSQAPASPPTAASA 708


>gnl|CDD|165555 PHA03298, PHA03298, envelope glycoprotein L; Provisional.
          Length = 167

 Score = 31.3 bits (70), Expect = 0.81
 Identities = 9/36 (25%), Positives = 16/36 (44%), Gaps = 1/36 (2%)

Query: 220 CGTSGNTENGLYGYGADA-GSGTYFENIIVIQYDPQ 254
           C TS   +     +G +    G  FE+++ I + P 
Sbjct: 56  CMTSFEHDAAARIFGPEYLPGGDMFEDLLTIIFKPL 91


>gnl|CDD|165539 PHA03282, PHA03282, envelope glycoprotein E; Provisional.
          Length = 540

 Score = 32.2 bits (73), Expect = 0.82
 Identities = 18/65 (27%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 113 PPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEKN 172
           P P P      PP   +  D++G    P PP L P P P P+ +A+   +  + V     
Sbjct: 162 PRPVPTPPGGTPPPDDDEGDEAGAPATPAPP-LHPPPAPHPHPIAEVAHVRGVTVSLRTQ 220

Query: 173 SMKVF 177
           +  +F
Sbjct: 221 TAILF 225



 Score = 29.5 bits (66), Expect = 6.3
 Identities = 21/66 (31%), Positives = 23/66 (34%), Gaps = 8/66 (12%)

Query: 70  VDRPPPPPAP-----IVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPP 124
           V R P   A      +V  PRP P    P  G P  +   G     P  P PPL     P
Sbjct: 143 VSRAPNSTAARAVVFLVVGPRPVPT---PPGGTPPPDDDEGDEAGAPATPAPPLHPPPAP 199

Query: 125 PVQPEA 130
              P A
Sbjct: 200 HPHPIA 205


>gnl|CDD|219056 pfam06485, DUF1092, Protein of unknown function (DUF1092).  This
           family consists of several hypothetical proteins of
           unknown function all from photosynthetic organisms
           including plants and cyanobacteria.
          Length = 270

 Score = 31.8 bits (73), Expect = 0.82
 Identities = 11/32 (34%), Positives = 17/32 (53%), Gaps = 1/32 (3%)

Query: 58  QESQQSAPKH-GYVDRPPPPPAPIVAPPRPHP 88
           +  ++  P+  GY+   PPP A    PP+P P
Sbjct: 102 EREEEVYPQEPGYMALAPPPVALDKPPPQPLP 133


>gnl|CDD|223880 COG0810, TonB, Periplasmic protein TonB, links inner and outer
           membranes [Cell envelope biogenesis, outer membrane].
          Length = 244

 Score = 31.7 bits (72), Expect = 0.87
 Identities = 23/92 (25%), Positives = 24/92 (26%), Gaps = 14/92 (15%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
            P PP     PP P P          +            P  P P    KP P       
Sbjct: 66  QPKPPTEPETPPEPTPPKPKEKPKPEK-----------KPKKPKPKPKPKPKPKPKVKPQ 114

Query: 133 KSGYGPPPPPPLLAPAPDPWPNAVADTPKIVS 164
                PP      APA    PN  A  P   S
Sbjct: 115 PKPKKPPSKTAAKAPAA---PNQPARPPSAAS 143



 Score = 28.6 bits (64), Expect = 9.4
 Identities = 17/63 (26%), Positives = 21/63 (33%), Gaps = 11/63 (17%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP-----NAVADTPKIVSL 165
           P PP  P +  +P P +P+        P P      P P P P       V   PK    
Sbjct: 67  PKPPTEPETPPEPTPPKPKEK------PKPEKKPKKPKPKPKPKPKPKPKVKPQPKPKKP 120

Query: 166 DVK 168
             K
Sbjct: 121 PSK 123


>gnl|CDD|177646 PHA03418, PHA03418, hypothetical E4 protein; Provisional.
          Length = 230

 Score = 31.6 bits (71), Expect = 0.90
 Identities = 28/104 (26%), Positives = 34/104 (32%), Gaps = 15/104 (14%)

Query: 65  PKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPP 124
           P+      P PPP P   P  P P  +   H KP    + GG         P  + A   
Sbjct: 44  PQEDPDKNPSPPPDP---PLTPRPPAQPNGHNKPPVTKQPGGEGTEEDHQAPLAADADDD 100

Query: 125 PVQPEAMDKSGYGPPPPPPLLAP------------APDPWPNAV 156
           P   +      +GP P    LAP             PDP P A 
Sbjct: 101 PRPGKRSKADEHGPAPGRAALAPFKLDLDQDPLHGDPDPPPGAT 144



 Score = 31.2 bits (70), Expect = 1.2
 Identities = 32/128 (25%), Positives = 46/128 (35%), Gaps = 14/128 (10%)

Query: 27  NVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRP 86
           N   V +QP      ED ++        ++D +      K    D   P P      P  
Sbjct: 72  NKPPVTKQPGGEGTEEDHQAP------LAADADDDPRPGKRSKADEHGPAPGRAALAPFK 125

Query: 87  HPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLA 146
               + P HG P  +   G        PP     ++PP  + E   +   G PPP P   
Sbjct: 126 LDLDQDPLHGDP--DPPPGATGGQGEEPPEGGEESQPPLGEGEGAVE---GHPPPLP--- 177

Query: 147 PAPDPWPN 154
           PAP+P P+
Sbjct: 178 PAPEPKPH 185


>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
          Length = 943

 Score = 32.4 bits (73), Expect = 0.90
 Identities = 32/150 (21%), Positives = 54/150 (36%), Gaps = 24/150 (16%)

Query: 23  TEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVA 82
           +++P +SK  E PK  ++ +D E   +     S+ + ++  +PK   +   P  P    +
Sbjct: 570 SKIPTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629

Query: 83  PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPP- 141
           P  P                        PPPP  P S  +P   +     K    P PP 
Sbjct: 630 PKSP----------------------KRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPF 667

Query: 142 -PPLLAPAPDPWPNAVADTPKIVSLDVKCE 170
            P       D + +A A + +  +  V  E
Sbjct: 668 DPKFKEKFYDDYLDAAAKSKETKTTVVLDE 697


>gnl|CDD|234797 PRK00575, tatA, twin arginine translocase protein A; Provisional.
          Length = 92

 Score = 29.7 bits (67), Expect = 0.96
 Identities = 8/43 (18%), Positives = 11/43 (25%), Gaps = 7/43 (16%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
                 P   A P PVQ +          P       + +  P
Sbjct: 56  AAAAQAPYQVATPTPVQSQR-------VDPAAASGQDSTEARP 91


>gnl|CDD|221188 pfam11725, AvrE, Pathogenicity factor.  This family is secreted by
           gram-negative Gammaproteobacteria such as Pseudomonas
           syringae of tomato and the fire blight plant pathogen
           Erwinia amylovora, amongst others. It is an essential
           pathogenicity factor of approximately 198 kDa. Its
           injection into the host-plant is dependent upon the
           bacterial type III or Hrp secretion system. The family
           is long and carries a number of predicted functional
           regions, including an ERMS or endoplasmic reticulum
           membrane retention signal at both the C- and the
           N-termini, a leucine-zipper motif from residues 539-560,
           and a nuclear localisation signal at 1358-1361. This
           conserved AvrE-family of effectors is among the few that
           are required for full virulence of many phytopathogenic
           pseudomonads, erwinias and pantoeas.
          Length = 1771

 Score = 32.1 bits (73), Expect = 0.97
 Identities = 14/74 (18%), Positives = 20/74 (27%)

Query: 13  ITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDR 72
           I   S  + +  V        +         A   D    +      S Q++       R
Sbjct: 147 ILNKSSSSRRPPVSKEEGTSSKMPATALASAALFKDDEIRQEVDAARSDQASQSRLSRSR 206

Query: 73  PPPPPAPIVAPPRP 86
             PP  P  A PR 
Sbjct: 207 GNPPAIPPDAAPRQ 220


>gnl|CDD|177464 PHA02682, PHA02682, ORF080 virion core protein; Provisional.
          Length = 280

 Score = 31.8 bits (71), Expect = 1.0
 Identities = 26/86 (30%), Positives = 29/86 (33%), Gaps = 19/86 (22%)

Query: 74  PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAK--PPPVQPEAM 131
           P   AP  A P   P    P+   P            P P  PP +A    PP V P   
Sbjct: 86  PACAAPAPACPACAPAAPAPAVTCP-----------APAPACPPATAPTCPPPAVCPAPA 134

Query: 132 DKSGYGP------PPPPPLLAPAPDP 151
             +   P      PP PPL  P P P
Sbjct: 135 RPAPACPPSTRQCPPAPPLPTPKPAP 160



 Score = 31.4 bits (70), Expect = 1.1
 Identities = 23/82 (28%), Positives = 27/82 (32%), Gaps = 8/82 (9%)

Query: 74  PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK 133
           PP  AP   PP   P    P+   P    +         PP PPL   KP P        
Sbjct: 117 PPATAPTCPPPAVCPAPARPAPACPPSTRQC--------PPAPPLPTPKPAPAAKPIFLH 168

Query: 134 SGYGPPPPPPLLAPAPDPWPNA 155
           +   PP  P    P  +  P A
Sbjct: 169 NQLPPPDYPAASCPTIETAPAA 190


>gnl|CDD|218181 pfam04621, ETS_PEA3_N, PEA3 subfamily ETS-domain transcription
           factor N terminal domain.  The N terminus of the PEA3
           transcription factors is implicated in transactivation
           and in inhibition of DNA binding. Transactivation is
           potentiated by activation of the Ras/MAP kinase and
           protein kinase A signalling cascades. The N terminal
           region contains conserved MAP kinase phosphorylation
           sites.
          Length = 336

 Score = 31.8 bits (72), Expect = 1.1
 Identities = 27/123 (21%), Positives = 39/123 (31%), Gaps = 13/123 (10%)

Query: 27  NVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSA-PKHGYVDRPPPPPAPI----- 80
           + S  + +P          ST              Q + P       PP P  P+     
Sbjct: 118 SYSAYDRKPASGFKPPTPPSTPCSPVNPQETVRQLQPSGPLSN--SSPPSPHTPLPNQSP 175

Query: 81  VAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPP 140
           + PP   P+  +PS    +F  +L       PPPP   S    PP   +  +      P 
Sbjct: 176 LPPPMSSPDSSYPSE--HRFQRQLSEPCLPFPPPPGRGSRDGRPPYHRQMSEPL---VPY 230

Query: 141 PPP 143
           PP 
Sbjct: 231 PPQ 233


>gnl|CDD|237874 PRK14971, PRK14971, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 614

 Score = 31.7 bits (72), Expect = 1.3
 Identities = 18/96 (18%), Positives = 21/96 (21%), Gaps = 11/96 (11%)

Query: 73  PPPPPAPIV---APPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE 129
           P     P+    A           S    Q              P  P SA +P    P 
Sbjct: 375 PKQHIKPVFTQPAAAPQPSAAAAASPSPSQ--------SSAAAQPSAPQSATQPAGTPPT 426

Query: 130 AMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSL 165
                    P  PP  AP          +    VS 
Sbjct: 427 VSVDPPAAVPVNPPSTAPQAVRPAQFKEEKKIPVSK 462


>gnl|CDD|225711 COG3170, FimV, Tfp pilus assembly protein FimV [Cell motility and
           secretion / Intracellular trafficking and secretion].
          Length = 755

 Score = 31.8 bits (72), Expect = 1.3
 Identities = 21/57 (36%), Positives = 23/57 (40%), Gaps = 2/57 (3%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA--PDPWPNAVADTPKIVSLD 166
           PP   P SA    P  P     +   PPPPP    P   P P P A  DT  + S D
Sbjct: 141 PPGYSPKSALSAEPSHPVPAPAAASAPPPPPRAARPVRQPAPAPAAPGDTYTVRSGD 197


>gnl|CDD|233692 TIGR02031, BchD-ChlD, magnesium chelatase ATPase subunit D.  This
           model represents one of two ATPase subunits of the
           trimeric magnesium chelatase responsible for insertion
           of magnesium ion into protoporphyrin IX. This is an
           essential step in the biosynthesis of both chlorophyll
           and bacteriochlorophyll. This subunit is found in green
           plants, photosynthetic algae, cyanobacteria and other
           photosynthetic bacteria. Unlike subunit I (TIGR02030),
           this subunit is not found in archaea [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Chlorophyll
           and bacteriochlorophyll].
          Length = 589

 Score = 31.7 bits (72), Expect = 1.3
 Identities = 14/44 (31%), Positives = 16/44 (36%), Gaps = 12/44 (27%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPN 154
           PPPPPPP     P P +PE         P  P    P      +
Sbjct: 273 PPPPPPP-----PEPPEPEE-------EPDEPDQTDPDDGEETD 304



 Score = 29.4 bits (66), Expect = 6.2
 Identities = 9/33 (27%), Positives = 11/33 (33%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPP 143
           PPPPP P    + P    +     G      P 
Sbjct: 276 PPPPPEPPEPEEEPDEPDQTDPDDGEETDQIPE 308


>gnl|CDD|236333 PRK08691, PRK08691, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 709

 Score = 31.6 bits (71), Expect = 1.4
 Identities = 25/106 (23%), Positives = 38/106 (35%), Gaps = 3/106 (2%)

Query: 12  QITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVD 71
           Q++       +T+ P      E P QA   ++A  T+  A E  ++       P +   D
Sbjct: 470 QVSKNKAADNETDAPLSEVPSENPIQATPNDEAVETETFAHEAPAEPFYGYGFPDN---D 526

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
            PP   A I  P   H      + G      + GG+     P  PP
Sbjct: 527 CPPEDGAEIPPPDWEHAAPADTAGGGADEEAEAGGIGGNNTPSAPP 572


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 31.4 bits (71), Expect = 1.4
 Identities = 24/109 (22%), Positives = 38/109 (34%), Gaps = 11/109 (10%)

Query: 17  SQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPP 76
           ++ A K +       EE+ K+  + +D E+T     E  S Q S+ S      + +P P 
Sbjct: 200 AREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSSSSLKKPDPS 259

Query: 77  PAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPP 125
           P+              P   +     +        PP   P SA   PP
Sbjct: 260 PSMA-----------SPETRESSKRTETRPRTSLRPPSARPASARPAPP 297


>gnl|CDD|217310 pfam02993, MCPVI, Minor capsid protein VI.  This minor capsid
           protein may act as a link between the external capsid
           and the internal DNA-protein core. The C-terminal 11
           residues may function as a protease cofactor leading to
           enzyme activation.
          Length = 238

 Score = 30.9 bits (70), Expect = 1.4
 Identities = 25/134 (18%), Positives = 37/134 (27%), Gaps = 23/134 (17%)

Query: 34  QPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHP 93
           +    +  E     ++ A      QE   + P      RP P    ++ P  P P    P
Sbjct: 99  EKDLEKLLEKVLGEEEPA-----PQEETVADPIQALQPRPRPDVEEVLVPAAPEP----P 149

Query: 94  SHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPE-AMDKSGYGPPPPPPLLAPAPDPW 152
           S+ +               P P P+            A+D       PP P   P   P 
Sbjct: 150 SYEETI------------KPGPAPVEEPVDSMAIAVPAIDTPVTLELPPAPQPPPPVVPQ 197

Query: 153 P-NAVADTPKIVSL 165
           P   V      +  
Sbjct: 198 PSTMVVHRRSRIKR 211


>gnl|CDD|223041 PHA03321, PHA03321, tegument protein VP11/12; Provisional.
          Length = 694

 Score = 31.5 bits (71), Expect = 1.4
 Identities = 26/104 (25%), Positives = 33/104 (31%), Gaps = 18/104 (17%)

Query: 71  DRPPPPPAPIVAPPRPH----PNGRHPSHGKPQFNHKLGGVYHGP-PPPPPPLSAAKPPP 125
           + PPPPP        P        +      P++   LG +   P    PPP  AA P P
Sbjct: 440 NDPPPPPRARPGST-PACARRARAQRARDAGPEYVDPLGALRRLPAGAAPPPEPAAAPSP 498

Query: 126 VQPEAMDKSGYGPPPPPP----------LLAPAPDPWPNAVADT 159
                  + G GPP  PP             P     P  + D 
Sbjct: 499 --ATYYTRMGGGPPRLPPRNRATETLRPDWGPPAAAPPEQMEDP 540


>gnl|CDD|235540 PRK05641, PRK05641, putative acetyl-CoA carboxylase biotin carboxyl
           carrier protein subunit; Validated.
          Length = 153

 Score = 30.2 bits (68), Expect = 1.5
 Identities = 13/53 (24%), Positives = 17/53 (32%), Gaps = 10/53 (18%)

Query: 106 GVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVAD 158
           G+         P  A  P P  P A          P P+   AP P P +  +
Sbjct: 42  GIDLSAVQEQVPTPAPAPAPAVPSA----------PTPVAPAAPAPAPASAGE 84


>gnl|CDD|237191 PRK12757, PRK12757, cell division protein FtsN; Provisional.
          Length = 256

 Score = 30.8 bits (70), Expect = 1.5
 Identities = 19/112 (16%), Positives = 29/112 (25%), Gaps = 32/112 (28%)

Query: 23  TEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVA 82
           +EVP   +  + P+     +      Q A +      + Q  P       PP      V 
Sbjct: 103 SEVPYNEQTPQVPRSTVQIQ------QQAQQQQPPATTAQPQPV-----TPPRQTTAPVQ 151

Query: 83  PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKS 134
           P  P P                         P  P++ A   P      +K 
Sbjct: 152 PQTPAPV---------------------RTQPAAPVTQAVEAPKVEAEKEKE 182


>gnl|CDD|183854 PRK13042, PRK13042, superantigen-like protein; Reviewed.
          Length = 291

 Score = 31.1 bits (70), Expect = 1.6
 Identities = 27/104 (25%), Positives = 38/104 (36%), Gaps = 22/104 (21%)

Query: 46  STDQVASET---SSDQESQQSAPKHGYVDRPPP------PPAPIVAPPRPHPNGRHPSHG 96
           +T Q A+ T   S+  E+ QS P    V+ P        PP+  V  P+  PN   PS  
Sbjct: 23  TTTQAANATTPSSTKVEAPQSTPPSTKVEAPQSKPNATTPPSTKVEAPQQTPNATTPSST 82

Query: 97  KPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPP 140
           K +              P  P +   P  + P+  D   Y   P
Sbjct: 83  KVE-------------TPQSPTTKQVPTEINPKFKDLRAYYTKP 113


>gnl|CDD|177614 PHA03377, PHA03377, EBNA-3C; Provisional.
          Length = 1000

 Score = 31.6 bits (71), Expect = 1.6
 Identities = 21/83 (25%), Positives = 28/83 (33%), Gaps = 17/83 (20%)

Query: 44  AESTDQVASETSSDQESQQSAPKH-GYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNH 102
            + T    S  S + E  QS P+  G  D+P  P    V P    P              
Sbjct: 429 VKRTLVKTSGRSDEAEQAQSTPERPGPSDQPSVP----VEPAHLTPVEHTTV-------- 476

Query: 103 KLGGVYHGPPPPPPPLSAAKPPP 125
               + H PP  PP ++    PP
Sbjct: 477 ----ILHQPPQSPPTVAIKPAPP 495



 Score = 30.8 bits (69), Expect = 2.5
 Identities = 29/145 (20%), Positives = 44/145 (30%), Gaps = 22/145 (15%)

Query: 8   STQLQITTGSQFAEKTEVPNVSKV------EEQPKQAQNREDAESTDQVASE---TSSDQ 58
           S++ Q  T S     + +P+V  +        QP +  +      T  ++ E      D 
Sbjct: 667 SSRRQPATQSTPPRPSWLPSVFVLPSVDAGRAQPSEESHLSSMSPTQPISHEEQPRYEDP 726

Query: 59  ESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPL 118
           +       H     PP   AP            +  H +PQ        Y  P PP  P 
Sbjct: 727 DDPLDLSLHPDQAPPPSHQAP------------YSGHEEPQAQQAPYPGYWEPRPPQAPY 774

Query: 119 SAAKPPPVQPEAMDK-SGYGPPPPP 142
              + P  Q   +    GY  P   
Sbjct: 775 LGYQEPQAQGVQVSSYPGYAGPWGL 799



 Score = 29.6 bits (66), Expect = 6.1
 Identities = 28/146 (19%), Positives = 42/146 (28%), Gaps = 20/146 (13%)

Query: 32  EEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVA------PPR 85
           EEQP+     ED +    ++            AP  G+ + P    AP         P  
Sbjct: 718 EEQPRY----EDPDDPLDLSLHPDQAPPPSHQAPYSGHEE-PQAQQAPYPGYWEPRPPQA 772

Query: 86  PHPNGRHPSHGKPQFNHKLGGVYHGPPPPPP-------PLSAAKPPPVQPEAMDKSGYGP 138
           P+   + P     Q +   G  Y GP              +     P            P
Sbjct: 773 PYLGYQEPQAQGVQVSSYPG--YAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPWAPRP 830

Query: 139 PPPPPLLAPAPDPWPNAVADTPKIVS 164
           P  PP    +     + V+  P + S
Sbjct: 831 PHLPPQWDGSAGHGQDQVSQFPHLQS 856



 Score = 29.3 bits (65), Expect = 7.4
 Identities = 21/92 (22%), Positives = 30/92 (32%), Gaps = 9/92 (9%)

Query: 65  PKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPP 124
           P HG+   P  P  P + PP+   +  H      QF H           PP    +  P 
Sbjct: 817 PGHGHPQGPWAPRPPHL-PPQWDGSAGHGQDQVSQFPH-----LQSETGPPRLQLSQVPQ 870

Query: 125 PVQPEAMDKSGYGP---PPPPPLLAPAPDPWP 153
               + +  S       P P   + P P  +P
Sbjct: 871 LPYSQTLVSSSAPSWSSPQPRAPIRPIPTRFP 902



 Score = 28.9 bits (64), Expect = 9.1
 Identities = 26/149 (17%), Positives = 44/149 (29%), Gaps = 10/149 (6%)

Query: 21  EKTEVPNVSKVEEQPKQAQNREDA-----ESTDQVASETSSDQESQQSAP-----KHGYV 70
           +  + P    ++  P  ++ R  A     +   +V    ++++E   + P     K    
Sbjct: 480 QPPQSPPTVAIKPAPPPSRRRRGACVVYDDDIIEVIDVETTEEEESVTQPAKPHRKVQDG 539

Query: 71  DRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA 130
            +          PP+  P+ R P    P               P        PP   P  
Sbjct: 540 FQRSGRRQKRATPPKVSPSDRGPPKASPPVMAPPSTGPRVMATPSTGPRDMAPPSTGPRQ 599

Query: 131 MDKSGYGPPPPPPLLAPAPDPWPNAVADT 159
             K   GPP   P     P   P  +A +
Sbjct: 600 QAKCKDGPPASGPHEKQPPSSAPRDMAPS 628


>gnl|CDD|151322 pfam10873, DUF2668, Protein of unknown function (DUF2668).  Members
           in this family of proteins are annotated as Cysteine and
           tyrosine-rich protein 1, however currently no function
           is known.
          Length = 154

 Score = 30.2 bits (68), Expect = 1.6
 Identities = 15/45 (33%), Positives = 16/45 (35%), Gaps = 13/45 (28%)

Query: 108 YHGPPPP---------PPPLSAAKPPPVQPEAMDKSGYGPPPPPP 143
           Y   PPP         P  L    PPP  P     +   PPPP P
Sbjct: 109 YPMAPPPYTYDHEMEYPTDL----PPPYSPAPQASAQRSPPPPYP 149


>gnl|CDD|114603 pfam05887, Trypan_PARP, Procyclic acidic repetitive protein (PARP).
            This family consists of several Trypanosoma brucei
           procyclic acidic repetitive protein (PARP) like
           sequences. The procyclic acidic repetitive protein
           (parp) genes of Trypanosoma brucei encode a small family
           of abundant surface proteins whose expression is
           restricted to the procyclic form of the parasite. They
           are found at two unlinked loci, parpA and parpB;
           transcription of both loci is developmentally regulated.
          Length = 145

 Score = 30.3 bits (67), Expect = 1.6
 Identities = 10/44 (22%), Positives = 13/44 (29%), Gaps = 2/44 (4%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
            P          +P     E  +    G   P P   P P+P P
Sbjct: 71  EPEEEGEE--EPEPEEEGEEEPEPEETGEEEPEPEPEPEPEPEP 112



 Score = 29.5 bits (65), Expect = 2.5
 Identities = 16/83 (19%), Positives = 21/83 (25%), Gaps = 18/83 (21%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
           P   P        P P        +P+                 P    +P     E  +
Sbjct: 60  PDDEPEE---EEEPEPEEEGEEEPEPE-----------EEGEEEP----EPEETGEEEPE 101

Query: 133 KSGYGPPPPPPLLAPAPDPWPNA 155
                 P P P   P P+P P A
Sbjct: 102 PEPEPEPEPEPEPEPEPEPEPGA 124


>gnl|CDD|219061 pfam06495, Transformer, Fruit fly transformer protein.  This family
           consists of transformer proteins from several Drosophila
           species and also from Ceratitis capitata (Mediterranean
           fruit fly). The transformer locus (tra) produces an RNA
           processing protein that alternatively splices the
           doublesex pre-mRNA in the sex determination hierarchy of
           Drosophila melanogaster.
          Length = 182

 Score = 30.4 bits (68), Expect = 1.6
 Identities = 22/76 (28%), Positives = 30/76 (39%), Gaps = 1/76 (1%)

Query: 54  TSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP 113
           TSS +  ++S  +  Y   P     P+  P   +P         PQF+   G   +G  P
Sbjct: 93  TSSTERRRRSRSRSRYSRTPRIITVPVPVPAADYPYAYGWPPPAPQFSGMQGAFPYGMLP 152

Query: 114 -PPPPLSAAKPPPVQP 128
            P PP  A  P P  P
Sbjct: 153 RPVPPYFAPYPRPPAP 168


>gnl|CDD|237000 PRK11855, PRK11855, dihydrolipoamide acetyltransferase; Reviewed.
          Length = 547

 Score = 31.3 bits (72), Expect = 1.7
 Identities = 10/48 (20%), Positives = 10/48 (20%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
              P      AA  P     A           P   APA    P    
Sbjct: 196 AAAPAAAAAPAAAAPAAAAAAAPAPAPAAAAAPAAAAPAAAAAPGKAP 243


>gnl|CDD|221501 pfam12273, RCR, Chitin synthesis regulation, resistance to Congo
           red.  RCR proteins are ER membrane proteins that
           regulate chitin deposition in fungal cell walls.
           Although chitin, a linear polymer of beta-1,4-linked
           N-acetylglucosamine, constitutes only 2% of the cell
           wall it plays a vital role in the overall protection of
           the cell wall against stress, noxious chemicals and
           osmotic pressure changes. Congo red is a cell
           wall-disrupting benzidine-type dye extensively used in
           many cell wall mutant studies that specifically targets
           chitin in yeast cells and inhibits growth. RCR proteins
           render the yeasts resistant to Congo red by diminishing
           the content of chitin in the cell wall. RCR proteins
           probably regulate chitin synthase III and interact
           directly with ubiquitin ligase Rsp5, and the VPEY motif
           is necessary for this, via interaction with the WW
           domains of Rsp5.
          Length = 124

 Score = 29.8 bits (67), Expect = 1.7
 Identities = 16/91 (17%), Positives = 24/91 (26%), Gaps = 22/91 (24%)

Query: 58  QESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGV--------YH 109
            + Q +    G   +  PP          +  G +   G+   N K   +        Y 
Sbjct: 48  SQQQYNNTPQGRPPQYVPPYT---ETANENDLGYYDGQGEFHPNPKAPAIELQPPPNAYE 104

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPP 140
                P      +PP            GPPP
Sbjct: 105 RGTRSPTTDDEYQPPA-----------GPPP 124


>gnl|CDD|215646 pfam00001, 7tm_1, 7 transmembrane receptor (rhodopsin family).
           This family contains, amongst other G-protein-coupled
           receptors (GPCRs), members of the opsin family, which
           have been considered to be typical members of the
           rhodopsin superfamily. They share several motifs, mainly
           the seven transmembrane helices, GPCRs of the rhodopsin
           superfamily. All opsins bind a chromophore, such as
           11-cis-retinal. The function of most opsins other than
           the photoisomerases is split into two steps: light
           absorption and G-protein activation. Photoisomerases, on
           the other hand, are not coupled to G-proteins - they are
           thought to generate and supply the chromophore that is
           used by visual opsins.
          Length = 251

 Score = 30.7 bits (70), Expect = 1.8
 Identities = 19/69 (27%), Positives = 29/69 (42%), Gaps = 3/69 (4%)

Query: 505 DSNSTSQTMIFPTNMEDNSGMICMTTIGFSATLFVLLGILVVSCLVSACLCIRLRPFSNK 564
           + N T+  + FP      S  +  T +GF   +  LL ILV   L+   L  R R  +++
Sbjct: 124 EGNVTTCLIDFPEESTKRSYTLLSTLLGF---VLPLLVILVCYTLILRTLRKRARSGASQ 180

Query: 565 TSQKTMSVS 573
              K  S  
Sbjct: 181 ARAKRSSSK 189


>gnl|CDD|223079 PHA03419, PHA03419, E4 protein; Provisional.
          Length = 200

 Score = 30.3 bits (68), Expect = 1.8
 Identities = 19/92 (20%), Positives = 22/92 (23%), Gaps = 16/92 (17%)

Query: 68  GYVDRPPPPPAPIVAP-PRPHPNGRHPSH---------------GKPQFNHKLGGVYHGP 111
           GY   PP  P P   P P P   G  P                 GK +   K        
Sbjct: 48  GYPFCPPTTPHPSSQPPPCPPSPGHPPQTNDTHEKDLALQPPPGGKKKEKKKKETEKPAQ 107

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPP 143
               P          +    +       PPPP
Sbjct: 108 GGEKPDQGPEAKGEGEGHEPEDPPPEDTPPPP 139


>gnl|CDD|200219 TIGR02927, SucB_Actino, 2-oxoglutarate dehydrogenase, E2 component,
           dihydrolipoamide succinyltransferase.  This model
           represents an Actinobacterial clade of E2 enzyme, a
           component of the 2-oxoglutarate dehydrogenase complex
           involved in the TCA cycle. These proteins have multiple
           domains including the catalytic domain (pfam00198), one
           or two biotin domains (pfam00364) and an E3-component
           binding domain (pfam02817).
          Length = 579

 Score = 31.1 bits (70), Expect = 1.9
 Identities = 8/47 (17%), Positives = 13/47 (27%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
           P     P  +       P+   ++ +  P PP            A A
Sbjct: 210 PAEEEAPAPSEAGSEPAPDPAARAPHAAPDPPAPAPAPAKTAAPAAA 256



 Score = 30.4 bits (68), Expect = 2.9
 Identities = 18/72 (25%), Positives = 22/72 (30%), Gaps = 12/72 (16%)

Query: 74  PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK 133
              PAP  A   P P+    +                P PP P  + AK       A   
Sbjct: 213 EEAPAPSEAGSEPAPDPAARAPHAA------------PDPPAPAPAPAKTAAPAAAAPVS 260

Query: 134 SGYGPPPPPPLL 145
           SG   P   PL+
Sbjct: 261 SGDSGPYVTPLV 272



 Score = 29.6 bits (66), Expect = 5.0
 Identities = 14/87 (16%), Positives = 19/87 (21%), Gaps = 20/87 (22%)

Query: 78  APIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYG 137
           A              PS    +           P P P   +    P             
Sbjct: 203 ANAAPAEPAEEEAPAPSEAGSE-----------PAPDPAARAPHAAPDPPA--------- 242

Query: 138 PPPPPPLLAPAPDPWPNAVADTPKIVS 164
           P P P   A      P +  D+   V+
Sbjct: 243 PAPAPAKTAAPAAAAPVSSGDSGPYVT 269


>gnl|CDD|172376 PRK13855, PRK13855, type IV secretion system protein VirB10;
           Provisional.
          Length = 376

 Score = 31.0 bits (70), Expect = 1.9
 Identities = 10/40 (25%), Positives = 13/40 (32%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
           P PP  + A    P  P  +D     P     +   AP  
Sbjct: 59  PAPPSTMIATNTKPFHPAPIDVPPDPPAAQEAVQPTAPPS 98


>gnl|CDD|148271 pfam06566, Chon_Sulph_att, Chondroitin sulphate attachment domain. 
           This family represents the chondroitin sulphate
           attachment domain of vertebrate neural transmembrane
           proteoglycans that contain EGF modules. Evidence has
           been accumulated to support the idea that neural
           proteoglycans are involved in various cellular events
           including mitogenesis, differentiation, axonal outgrowth
           and synaptogenesis. This domain contains several
           potential sites of chondroitin sulphate attachment, as
           well as potential sites of N-linked glycosylation.
          Length = 253

 Score = 30.7 bits (69), Expect = 1.9
 Identities = 16/55 (29%), Positives = 21/55 (38%), Gaps = 4/55 (7%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLD 166
           P P   L  + P    PEA + S     PP P     P   P    ++P  V L+
Sbjct: 104 PTPDEALGNSNPSLALPEATEASN----PPSPGPGDKPSLLPELPKESPVEVWLN 154


>gnl|CDD|237940 PRK15313, PRK15313, autotransport protein MisL; Provisional.
          Length = 955

 Score = 31.3 bits (70), Expect = 2.0
 Identities = 20/64 (31%), Positives = 28/64 (43%), Gaps = 5/64 (7%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAV--ADTPKIVSLDVKC 169
           P  P P+    P PV P+ +D     P P  P++   PDP    +  +DTP I     + 
Sbjct: 581 PVDPDPVDPVIPDPVIPDPVDPDPVDPEPVDPVI---PDPTIPDIGQSDTPPITEHQFRP 637

Query: 170 EKNS 173
           E  S
Sbjct: 638 EVGS 641



 Score = 29.8 bits (66), Expect = 4.7
 Identities = 14/42 (33%), Positives = 19/42 (45%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           P  P P+    P PV P+ +D     P  P P++    DP P
Sbjct: 563 PIIPDPVDPVIPDPVIPDPVDPDPVDPVIPDPVIPDPVDPDP 604


>gnl|CDD|221371 pfam12004, DUF3498, Domain of unknown function (DUF3498).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 433 and 538 amino acids in length. This domain is
           found associated with pfam00616, pfam00168. This domain
           has two conserved sequence motifs: DLQ and PLSFQNP.
          Length = 489

 Score = 30.8 bits (69), Expect = 2.0
 Identities = 12/57 (21%), Positives = 18/57 (31%)

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           L    H P  P    +  +    QP       +    PP LL+  P       ++ P
Sbjct: 241 LPDRQHQPALPRQNSAGPQRRVDQPSPPGGGSHRGRIPPSLLSSLPSEGSMLSSEWP 297


>gnl|CDD|219914 pfam08577, PI31_Prot_C, PI31 proteasome regulator.  PI31 is a
           cellular regulator of proteasome formation and of
           proteasome-mediated antigen processing.
          Length = 68

 Score = 28.1 bits (63), Expect = 2.1
 Identities = 17/47 (36%), Positives = 20/47 (42%), Gaps = 1/47 (2%)

Query: 72  RPPPPPAPIVAPPRPHPN-GRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
            PP P  P+ AP   +   G      +P F    GG   GPPP  PP
Sbjct: 13  GPPDPFDPLPAPLGGNGQGGMIFDPNRPGFGPPRGGGGDGPPPGVPP 59


>gnl|CDD|236768 PRK10819, PRK10819, transport protein TonB; Provisional.
          Length = 246

 Score = 30.4 bits (69), Expect = 2.2
 Identities = 18/59 (30%), Positives = 24/59 (40%), Gaps = 1/59 (1%)

Query: 111 PPPPPPPLSAAKP-PPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVK 168
             PPP P+   +P P   PE   ++    P P P   P P P P  V    +    +VK
Sbjct: 64  VQPPPEPVVEPEPEPEPIPEPPKEAPVVIPKPEPKPKPKPKPKPKPVKKVEEQPKREVK 122



 Score = 30.0 bits (68), Expect = 2.6
 Identities = 33/139 (23%), Positives = 43/139 (30%), Gaps = 11/139 (7%)

Query: 28  VSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPH 87
           V +V E P  AQ      S   VA      +  Q   P    V  P P P PI  PP+  
Sbjct: 35  VHQVIELPAPAQ----PISVTMVAPAD--LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEA 88

Query: 88  PNGRHPSHGKPQFN--HKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLL 145
           P        KP+     K   V      P   +   +P P  P                 
Sbjct: 89  PVVIPKPEPKPKPKPKPKPKPVKKVEEQPKREVKPVEPRPASPFENTAPARPTSSTATAA 148

Query: 146 APAPDPWPNAVADTPKIVS 164
           A  P     +V+  P+ +S
Sbjct: 149 ASKP---VTSVSSGPRALS 164



 Score = 29.3 bits (66), Expect = 5.0
 Identities = 17/65 (26%), Positives = 24/65 (36%), Gaps = 3/65 (4%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEK 171
           P P  P+S      V P  ++      PPP P++ P P+P P         V +     K
Sbjct: 42  PAPAQPISVTM---VAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIPKPEPK 98

Query: 172 NSMKV 176
              K 
Sbjct: 99  PKPKP 103


>gnl|CDD|217409 pfam03178, CPSF_A, CPSF A subunit region.  This family includes a
           region that lies towards the C-terminus of the cleavage
           and polyadenylation specificity factor (CPSF) A (160
           kDa) subunit. CPSF is involved in mRNA polyadenylation
           and binds the AAUAAA conserved sequence in pre-mRNA.
           CPSF has also been found to be necessary for splicing of
           single-intron pre-mRNAs. The function of the aligned
           region is unknown but may be involved in RNA/DNA
           binding.
          Length = 318

 Score = 30.6 bits (70), Expect = 2.2
 Identities = 14/39 (35%), Positives = 20/39 (51%), Gaps = 7/39 (17%)

Query: 234 GADAGSGTYFENIIVIQYDPQVQEVWDQARKL--RCTWH 270
           GAD      F N+ V++YDP+  E  D   +L  R  +H
Sbjct: 187 GADK-----FGNLHVLRYDPEAPESLDGDPRLLHRAEFH 220


>gnl|CDD|223032 PHA03283, PHA03283, envelope glycoprotein E; Provisional.
          Length = 542

 Score = 30.7 bits (69), Expect = 2.2
 Identities = 21/92 (22%), Positives = 29/92 (31%), Gaps = 9/92 (9%)

Query: 4   YKIVSTQLQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQS 63
           + +V+  LQ T    F   T  P ++   E+      R             SSD   + S
Sbjct: 105 FNVVNGSLQRTKDVYFVNGTVFPILT--PERSVLQIQRATPNIAGVYTLHVSSDGMMKHS 162

Query: 64  A------PKHGYVDRPPPPPAPIVAPPRPHPN 89
                          PP P  P V P R H +
Sbjct: 163 VFFVTVKKPPKQPQTPPAPLVPQV-PARHHTD 193


>gnl|CDD|235309 PRK04596, minC, septum formation inhibitor; Reviewed.
          Length = 248

 Score = 30.3 bits (68), Expect = 2.3
 Identities = 14/32 (43%), Positives = 16/32 (50%), Gaps = 3/32 (9%)

Query: 106 GVYHGPPPPPPPLSAAKPPPV---QPEAMDKS 134
            V   PPPPPPP  A   PPV    P  M ++
Sbjct: 116 AVSPPPPPPPPPARAEPAPPVARPAPGRMQRT 147


>gnl|CDD|237378 PRK13406, bchD, magnesium chelatase subunit D; Provisional.
          Length = 584

 Score = 30.8 bits (70), Expect = 2.4
 Identities = 16/47 (34%), Positives = 17/47 (36%), Gaps = 11/47 (23%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPP-----------PPPLLA 146
           PPPPPPP      PP   E  D +                 PP LLA
Sbjct: 269 PPPPPPPPEDDDDPPEDEEEQDDAEDRALEEIVLEAVRAALPPDLLA 315



 Score = 30.4 bits (69), Expect = 3.6
 Identities = 15/43 (34%), Positives = 15/43 (34%), Gaps = 4/43 (9%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWP 153
           P PP PP     PPP  PE  D     PP        A D   
Sbjct: 259 PAPPQPPEEEPPPPPPPPEDDDD----PPEDEEEQDDAEDRAL 297


>gnl|CDD|220309 pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex
           non-fungal.  The approx. 70 residue Med15 domain of the
           ARC-Mediator co-activator is a three-helix bundle with
           marked similarity to the KIX domain. The sterol
           regulatory element binding protein (SREBP) family of
           transcription activators use the ARC105 subunit to
           activate target genes in the regulation of cholesterol
           and fatty acid homeostasis. In addition, Med15 is a
           critical transducer of gene activation signals that
           control early metazoan development.
          Length = 768

 Score = 30.7 bits (69), Expect = 2.5
 Identities = 21/117 (17%), Positives = 29/117 (24%), Gaps = 11/117 (9%)

Query: 31  VEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPP-----PPPAPIVAPP- 84
                +Q  N++  +    VA    + Q +Q     +      P     P P P V    
Sbjct: 365 HPAAHQQQMNQQVGQGGQMVALGYLNIQGNQGGLGANPMQQGQPGMMSSPSPVPQVQTNQ 424

Query: 85  --RPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPP 139
                P    PS G P                P P +    P  Q      S     
Sbjct: 425 SMPQPPQPSVPSPGGPGSQP---PQSVSGGMIPSPPALMPSPSPQMSQSPASQRTIQ 478



 Score = 29.2 bits (65), Expect = 7.5
 Identities = 25/132 (18%), Positives = 37/132 (28%), Gaps = 11/132 (8%)

Query: 33  EQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRH 92
            Q +  Q +       Q   +    Q     A     +++       +VA    +  G  
Sbjct: 336 GQQQLKQMKLRNMRGQQQTQQQQQQQGGNHPAAHQQQMNQQVGQGGQMVALGYLNIQGNQ 395

Query: 93  PSHGKPQFNHKLGGVYHGPPPP---------PPPLSAAKPPPVQPEAMDKSGY--GPPPP 141
              G         G+   P P          P P   + P P  P +        G  P 
Sbjct: 396 GGLGANPMQQGQPGMMSSPSPVPQVQTNQSMPQPPQPSVPSPGGPGSQPPQSVSGGMIPS 455

Query: 142 PPLLAPAPDPWP 153
           PP L P+P P  
Sbjct: 456 PPALMPSPSPQM 467


>gnl|CDD|165431 PHA03160, PHA03160, hypothetical protein; Provisional.
          Length = 499

 Score = 30.4 bits (68), Expect = 2.6
 Identities = 16/62 (25%), Positives = 22/62 (35%), Gaps = 5/62 (8%)

Query: 98  PQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA-----MDKSGYGPPPPPPLLAPAPDPW 152
           P F +   G         PPL+ ++  P+QP       M      PPP  P     P   
Sbjct: 396 PFFRYAPYGAPKNDHHLLPPLACSQQLPMQPLHVQQAPMQAPHVAPPPMQPPHVQQPRVL 455

Query: 153 PN 154
           P+
Sbjct: 456 PS 457


>gnl|CDD|218549 pfam05308, Mito_fiss_reg, Mitochondrial fission regulator.  In
           eukaryotes, this family of proteins induces
           mitochondrial fission.
          Length = 248

 Score = 30.1 bits (68), Expect = 2.7
 Identities = 12/40 (30%), Positives = 12/40 (30%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAP 149
                 P  S    P   P         PPPPPP   P P
Sbjct: 157 SSDESVPSSSTTSFPISPPTEEPVLEVPPPPPPPPPPPPP 196


>gnl|CDD|218970 pfam06278, DUF1032, Protein of unknown function (DUF1032).  This
           family consists of several conserved eukaryotic proteins
           of unknown function.
          Length = 565

 Score = 30.7 bits (69), Expect = 2.8
 Identities = 18/85 (21%), Positives = 27/85 (31%), Gaps = 12/85 (14%)

Query: 77  PAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAM--DKS 134
           P P++   +  P G   + G   +  +  G     PP    +      P +      +  
Sbjct: 184 PVPVLQFSQEEPEGPAANGG---YEEEAEGGAEPLPPEDHEVEVEPAEPRERHQSPIEPR 240

Query: 135 GYGP-------PPPPPLLAPAPDPW 152
            Y         P PP  L   PDPW
Sbjct: 241 RYRLRERVQEAPEPPSRLKETPDPW 265


>gnl|CDD|217461 pfam03261, CDK5_activator, Cyclin-dependent kinase 5 activator
           protein. 
          Length = 314

 Score = 30.2 bits (68), Expect = 3.0
 Identities = 11/50 (22%), Positives = 15/50 (30%), Gaps = 6/50 (12%)

Query: 114 PPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIV 163
            P P     PPP       K+    P        AP     +   +P+ V
Sbjct: 117 EPSPGQPPAPPPSV--LSGKNANCIPSQK----NAPSIAITSTGGSPRRV 160


>gnl|CDD|237863 PRK14949, PRK14949, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 944

 Score = 30.5 bits (69), Expect = 3.0
 Identities = 26/155 (16%), Positives = 36/155 (23%), Gaps = 10/155 (6%)

Query: 5   KIVSTQLQITTGSQFAEKTEVPNV-SKVEEQPKQAQNREDAESTDQVASETSSDQESQQS 63
           K  S   +  T    A    +    S  +     A    D +              +  S
Sbjct: 637 KKSSADRKPKTPPSRAPPASLSKPASSPDASQTSASFDLDPDFELATHQSVPEAALASGS 696

Query: 64  APKHGYV----DRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLG----GVYHGPPPPP 115
           AP    V    DRPP   AP VA     PN         +                    
Sbjct: 697 APAPPPVPDPYDRPPWEEAPEVASANDGPN-NAAEGNLSESVEDASNSELQAVEQQATHQ 755

Query: 116 PPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPD 150
           P + A    P    A+ ++          L     
Sbjct: 756 PQVQAEAQSPASTTALTQTSSEVQDTELNLVLLSS 790


>gnl|CDD|218597 pfam05466, BASP1, Brain acid soluble protein 1 (BASP1 protein).
           This family consists of several brain acid soluble
           protein 1 (BASP1) or neuronal axonal membrane protein
           NAP-22. The BASP1 is a neuron enriched Ca(2+)-dependent
           calmodulin-binding protein of unknown function.
          Length = 233

 Score = 29.8 bits (66), Expect = 3.1
 Identities = 29/135 (21%), Positives = 42/135 (31%), Gaps = 22/135 (16%)

Query: 32  EEQPKQAQNREDAESTDQV--ASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPN 89
           E  PK+ +  + A  T +V  A E   D+++Q +A K    +      A     P+  P 
Sbjct: 33  EGTPKENEEAQAAAETTEVKEAKEEKPDKDAQDTANKTEEKEGEKEAAAAKEEAPKAEPE 92

Query: 90  GRHP-SHGKPQFNHKLGGVYHGPPPPPPPLS--AAKPPPVQPEAMDKSGYGPPPPPPLLA 146
                +  K +           PP    P    AA P P         G  P        
Sbjct: 93  KTEGAAEAKAE-----------PPKASDPEQEPAAAPGPA------AGGEAPKASEASSQ 135

Query: 147 PAPDPWPNAVADTPK 161
           PA    P    +  K
Sbjct: 136 PAESAAPAKEEEKSK 150


>gnl|CDD|236692 PRK10431, PRK10431, N-acetylmuramoyl-l-alanine amidase II;
           Provisional.
          Length = 445

 Score = 30.2 bits (68), Expect = 3.1
 Identities = 12/33 (36%), Positives = 13/33 (39%), Gaps = 10/33 (30%)

Query: 71  DRPPPPP----------APIVAPPRPHPNGRHP 93
           D PPPPP           P V  PR     R+P
Sbjct: 128 DVPPPPPPPPVVAKRVETPAVVAPRVSEPARNP 160


>gnl|CDD|233508 TIGR01649, hnRNP-L_PTB, hnRNP-L/PTB/hephaestus splicing factor
           family.  Included in this family of heterogeneous
           ribonucleoproteins are PTB (polypyrimidine tract binding
           protein) and hnRNP-L. These proteins contain four RNA
           recognition motifs (rrm: pfam00076).
          Length = 481

 Score = 30.2 bits (68), Expect = 3.2
 Identities = 23/129 (17%), Positives = 37/129 (28%), Gaps = 17/129 (13%)

Query: 76  PPAPIVAPPRPHPN--GRHPSHGKPQFNHKLGGVYHGP--PPPPPPLSAAKPPPVQPEAM 131
           P  P    P        R P+      +      Y     P  P        PP  P + 
Sbjct: 191 PDLPGRRDPGLDQTHRQRQPALLGQHPSSYGHDGYSSHGGPLAPLAGGDRMGPPHGPPSR 250

Query: 132 DKSGYGPPPPPPLLA---PAPDPWPNAVADTPKIVSLDVKCEK---------NSMKVFIS 179
            +  Y   P  P ++   PA    P +V     +    V C++         N  +V   
Sbjct: 251 YRPAYEAAPLAPAISSYGPAGGG-PGSVLMVSGLHQEKVNCDRLFNLFCVYGNVERVKFM 309

Query: 180 FDKPFFGIV 188
            +K    ++
Sbjct: 310 KNKKETALI 318


>gnl|CDD|117486 pfam08919, F_actin_bind, F-actin binding.  The F-actin binding
           domain forms a compact bundle of four antiparallel
           alpha-helices, which are arranged in a left-handed
           topology. Binding of F-actin to the F-actin binding
           domain may result in cytoplasmic retention and
           subcellular distribution of the protein, as well as
           possible inhibition of protein function.
          Length = 179

 Score = 29.7 bits (66), Expect = 3.3
 Identities = 18/56 (32%), Positives = 20/56 (35%), Gaps = 11/56 (19%)

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           G     PP  P P S AKP              PP P PL + +P P   A    P
Sbjct: 2   GLKKPVPPAVPKPQSTAKPVGT-----------PPSPVPLPSTSPSPSKMANGTQP 46


>gnl|CDD|184923 PRK14959, PRK14959, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 624

 Score = 30.4 bits (68), Expect = 3.3
 Identities = 26/96 (27%), Positives = 34/96 (35%), Gaps = 7/96 (7%)

Query: 54  TSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPP 113
           T   Q  Q +AP  G     P   AP    P   P+ R P    P    + G     PP 
Sbjct: 399 TPGTQGPQGTAPAAGMT---PSSAAPATPAPSAAPSPRVPWDDAPPAPPRSGI----PPR 451

Query: 114 PPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAP 149
           P P +  A P P  P+++  +   PP        A 
Sbjct: 452 PAPRMPEASPVPGAPDSVASASDAPPTLGDPSDTAE 487


>gnl|CDD|219358 pfam07271, Cytadhesin_P30, Cytadhesin P30/P32.  This family
           consists of several Mycoplasma species specific
           Cytadhesin P32 and P30 proteins. P30 has been found to
           be membrane associated and localised on the tip
           organelle. It is thought that it is important in
           cytadherence and virulence.
          Length = 279

 Score = 30.0 bits (67), Expect = 3.4
 Identities = 32/160 (20%), Positives = 52/160 (32%), Gaps = 44/160 (27%)

Query: 30  KVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPP-------PPPAPIVA 82
           ++ EQ ++   + + ++ +   +E  + QE  Q A  +   +  P       P P   + 
Sbjct: 111 QMAEQLQRISEQNEQQAIEIDPTEEVNTQEPTQPAGVNVANNPQPQVQPQFGPNPQQRIN 170

Query: 83  PPR-------------------PHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP 123
           P R                   PH  G  P+  +P FN         P P  PP      
Sbjct: 171 PQRFGFPMQPNMGMRPGFNQMPPHMPGMPPNQMRPGFN---------PMPGMPPRPGFNQ 221

Query: 124 PPVQPEAMDKSGYGPPP---------PPPLLAPAPDPWPN 154
            P     M++ G+ P P           P +   P   PN
Sbjct: 222 NPNMMPNMNRPGFRPQPGGFNHPGTPMGPNMQQRPGFNPN 261


>gnl|CDD|235899 PRK06975, PRK06975, bifunctional uroporphyrinogen-III
           synthetase/uroporphyrin-III C-methyltransferase;
           Reviewed.
          Length = 656

 Score = 30.5 bits (69), Expect = 3.5
 Identities = 14/50 (28%), Positives = 16/50 (32%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
                P +AA  P    +  D       P     APAP P P A    P 
Sbjct: 267 DAAAQPATAAPAPSRMTDTNDSKSVTSQPAAAAAAPAPPPNPPATPPEPP 316


>gnl|CDD|237782 PRK14666, uvrC, excinuclease ABC subunit C; Provisional.
          Length = 694

 Score = 30.2 bits (68), Expect = 3.5
 Identities = 16/67 (23%), Positives = 17/67 (25%), Gaps = 7/67 (10%)

Query: 71  DRPPPPPAP-----IVAPPRPHPNG--RHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKP 123
           D P  P AP      V P           P          L  V H  P    P  AA  
Sbjct: 341 DTPLLPDAPEGSSDPVVPVAAATPVDASLPDVRTGTAPTSLANVSHADPAVAQPTQAATL 400

Query: 124 PPVQPEA 130
               P+ 
Sbjct: 401 AGAAPKG 407



 Score = 29.9 bits (67), Expect = 5.1
 Identities = 20/96 (20%), Positives = 22/96 (22%), Gaps = 7/96 (7%)

Query: 76  PPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDK-- 133
            P  IV P  P   GR      P       G+    P  P     +   PV P A     
Sbjct: 307 IPPRIVVPWLPDTEGREGDDLAPTAVCTDAGLLPDTPLLPDAPEGS-SDPVVPVAAATPV 365

Query: 134 ----SGYGPPPPPPLLAPAPDPWPNAVADTPKIVSL 165
                       P  LA      P     T      
Sbjct: 366 DASLPDVRTGTAPTSLANVSHADPAVAQPTQAATLA 401


>gnl|CDD|224203 COG1284, COG1284, Uncharacterized conserved protein [Function
           unknown].
          Length = 289

 Score = 29.9 bits (68), Expect = 3.6
 Identities = 10/55 (18%), Positives = 22/55 (40%), Gaps = 9/55 (16%)

Query: 488 NKIIKVVSTGDLTFALDDSNSTSQTMIFPTNMEDNSGMICMTTIGFSATLFVLLG 542
           +K+I +V  G          ++S+ +I  +  E+    + +  +G   T     G
Sbjct: 192 SKVIDIVQEGL---------NSSKVVIIISKKEEEIAALILEELGRGVTYLDGEG 237


>gnl|CDD|236797 PRK10927, PRK10927, essential cell division protein FtsN;
           Provisional.
          Length = 319

 Score = 30.0 bits (67), Expect = 3.7
 Identities = 22/88 (25%), Positives = 34/88 (38%), Gaps = 7/88 (7%)

Query: 18  QFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVD--RPPP 75
           Q AE+  +   S+  EQ  Q Q R     T Q A   +  ++S+ ++ +  Y D  + P 
Sbjct: 161 QLAEQQRLAQQSRTTEQSWQQQTR-----TSQAAPVQAQPRQSKPASTQQPYQDLLQTPA 215

Query: 76  PPAPIVAPPRPHPNGRHPSHGKPQFNHK 103
                  P +  P  R     KP    K
Sbjct: 216 HTTAQSKPQQAAPVTRAADAPKPTAEKK 243


>gnl|CDD|218839 pfam05983, Med7, MED7 protein.  This family consists of several
           eukaryotic proteins which are homologues of the yeast
           MED7 protein. Activation of gene transcription in
           metazoans is a multi-step process that is triggered by
           factors that recognise transcriptional enhancer sites in
           DNA. These factors work with co-activators such as MED7
           to direct transcriptional initiation by the RNA
           polymerase II apparatus.
          Length = 161

 Score = 29.2 bits (66), Expect = 3.7
 Identities = 11/37 (29%), Positives = 14/37 (37%), Gaps = 6/37 (16%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAP 147
            PPPPP         ++        + PPPPPP    
Sbjct: 6   YPPPPPYYKLFTDENLEL------RFLPPPPPPTEGS 36


>gnl|CDD|219133 pfam06682, DUF1183, Protein of unknown function (DUF1183).  This
           family consists of several eukaryotic proteins of around
           360 residues in length. The function of this family is
           unknown.
          Length = 317

 Score = 29.7 bits (67), Expect = 4.1
 Identities = 22/55 (40%), Positives = 24/55 (43%), Gaps = 2/55 (3%)

Query: 83  PPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYG 137
            PRP   G     G        GG   GP PPPP   ++ PPP  P A   SGYG
Sbjct: 186 GPRPERAGY--GGGGGGGGGGGGGGGSGPGPPPPGFKSSFPPPYGPGAGPSSGYG 238


>gnl|CDD|218883 pfam06075, DUF936, Plant protein of unknown function (DUF936).
           This family consists of several hypothetical proteins
           from Arabidopsis thaliana and Oryza sativa. The function
           of this family is unknown.
          Length = 564

 Score = 29.8 bits (67), Expect = 4.2
 Identities = 25/109 (22%), Positives = 31/109 (28%), Gaps = 7/109 (6%)

Query: 69  YVDR-PPPPPAPIVAPPRPHPNGRHPSHGKPQ---FNHKLGG--VYHGPPPPPPPLSAAK 122
           YVDR  P  P P++   RP P GRHP  G P+       L                S+A 
Sbjct: 83  YVDRLEPGSPVPVLRGIRPVP-GRHPCVGNPEDLVAADSLAFFSDAVIQVIKRKKASSAP 141

Query: 123 PPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEK 171
                  +   +     P      P      N    TP  V        
Sbjct: 142 RRGSWDSSSKSASIDSSPTVIGPRPRSFSELNLTDRTPAKVRSSRSELG 190


>gnl|CDD|130706 TIGR01645, half-pint, poly-U binding splicing factor, half-pint
           family.  The proteins represented by this model contain
           three RNA recognition motifs (rrm: pfam00076) and have
           been characterized as poly-pyrimidine tract binding
           proteins associated with RNA splicing factors. In the
           case of PUF60 (GP|6176532), in complex with p54, and in
           the presence of U2AF, facilitates association of U2
           snRNP with pre-mRNA.
          Length = 612

 Score = 30.0 bits (67), Expect = 4.4
 Identities = 18/48 (37%), Positives = 20/48 (41%), Gaps = 2/48 (4%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADT 159
               PPL  A P  V+P  M      P PPP L  P+    P  VA T
Sbjct: 355 AEEVPPLPQAAPAVVKPGPM--EIPTPVPPPGLAIPSLVAPPGLVAPT 400


>gnl|CDD|218232 pfam04731, Caudal_act, Caudal like protein activation region.  This
           family consists of the amino termini of proteins
           belonging to the caudal-related homeobox protein family.
           This region is thought to mediate transcription
           activation. The level of activation caused by mouse Cdx2
           is affected by phosphorylation at serine 60 via the
           mitogen-activated protein kinase pathway. Caudal family
           proteins are involved in the transcriptional regulation
           of multiple genes expressed in the intestinal
           epithelium, and are important in differentiation and
           maintenance of the intestinal epithelial lining. Caudal
           proteins always have a homeobox DNA binding domain
           (pfam00046).
          Length = 135

 Score = 28.6 bits (64), Expect = 4.5
 Identities = 21/74 (28%), Positives = 24/74 (32%), Gaps = 6/74 (8%)

Query: 83  PPRPHPNGRHPSH--GKPQFNHKLGGVYHGPP---PPPPPLSAAKPPPVQPEAMDKSGYG 137
             R         +    PQ+    GG +H P     P    S A   P  P   D S YG
Sbjct: 5   SVRHSGLNLGAQNFVSAPQYPD-YGGYHHVPGMNLDPHGQPSGAWGSPYGPPREDWSAYG 63

Query: 138 PPPPPPLLAPAPDP 151
           P P P   A    P
Sbjct: 64  PGPGPSATAATGSP 77


>gnl|CDD|225629 COG3087, FtsN, Cell division protein [Cell division and chromosome
           partitioning].
          Length = 264

 Score = 29.4 bits (66), Expect = 4.9
 Identities = 11/68 (16%), Positives = 19/68 (27%), Gaps = 5/68 (7%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCE 170
                   +  +  PV+P+          P P   APAP+P   A              +
Sbjct: 131 SQKAQSQATTVQTQPVKPKPRP-----EKPQPVAPAPAPEPVEKAPKAEAAPPPKPKAED 185

Query: 171 KNSMKVFI 178
               +  +
Sbjct: 186 AAETRYML 193


>gnl|CDD|114474 pfam05750, Rubella_Capsid, Rubella capsid protein.  Rubella virus
           is an enveloped positive-strand RNA virus of the family
           Togaviridae. Virions are composed of three structural
           proteins: a capsid and two membrane-spanning
           glycoproteins, E2 and E1. During virus assembly, the
           capsid interacts with genomic RNA to form nucleocapsids.
           It has been discovered that capsid phosphorylation
           serves to negatively regulate binding of viral genomic
           RNA. This may delay the initiation of nucleocapsid
           assembly until sufficient amounts of virus glycoproteins
           accumulate at the budding site and/or prevent
           non-specific binding to cellular RNA when levels of
           genomic RNA are low. It follows that at a late stage in
           replication, the capsid may undergo dephosphorylation
           before nucleocapsid assembly occurs.
          Length = 300

 Score = 29.5 bits (65), Expect = 5.0
 Identities = 19/58 (32%), Positives = 26/58 (44%), Gaps = 10/58 (17%)

Query: 110 GPPPP---------PPPLSAAKPPPVQPEAMD-KSGYGPPPPPPLLAPAPDPWPNAVA 157
            PPPP          P    ++ PP QP+    ++G G   P P L P  +P+  AVA
Sbjct: 77  APPPPEERQESRSQTPAPKPSRAPPQQPQPPRMQTGRGGSAPRPELGPPTNPFQAAVA 134


>gnl|CDD|176558 cd08621, PI-PLCXDc_like_2, Catalytic domain of uncharacterized
           hypothetical proteins similar to eukaryotic
           phosphatidylinositol-specific phospholipase C, X domain
           containing proteins.  This subfamily corresponds to the
           catalytic domain present in a group of uncharacterized
           hypothetical proteins found in bacteria and fungi, which
           are similar to eukaryotic phosphatidylinositol-specific
           phospholipase C, X domain containing proteins
           (PI-PLCXD). The typical eukaryotic
           phosphoinositide-specific phospholipase C (PI-PLC, EC
           3.1.4.11) has a multidomain organization that consists
           of a PLC catalytic core domain, and various regulatory
           domains. The catalytic core domain is assembled from two
           highly conserved X- and Y-regions split by a divergent
           linker sequence. In contrast, eukaryotic PI-PLCXDs
           contain a single TIM-barrel type catalytic domain, X
           domain, and are more closely related to bacterial
           PI-PLCs, which participate in Ca2+-independent PI
           metabolism, hydrolyzing the membrane lipid
           phosphatidylinositol (PI) to produce phosphorylated
           myo-inositol and diacylglycerol (DAG). Although the
           biological function of eukaryotic PI-PLCXDs still
           remains unclear, it may be distinct from that of typical
           eukaryotic PI-PLCs.
          Length = 300

 Score = 29.3 bits (66), Expect = 5.0
 Identities = 9/33 (27%), Positives = 13/33 (39%)

Query: 226 TENGLYGYGADAGSGTYFENIIVIQYDPQVQEV 258
               L+    DA S   F N++ + Y     EV
Sbjct: 260 ANPALFWKLVDAMSPWSFPNVVYVDYLGNFGEV 292


>gnl|CDD|215496 PLN02918, PLN02918, pyridoxine (pyridoxamine) 5'-phosphate oxidase.
          Length = 544

 Score = 29.5 bits (66), Expect = 5.1
 Identities = 14/66 (21%), Positives = 15/66 (22%), Gaps = 2/66 (3%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMD 132
               P PI   P P  +    S   P             PP    L      P    AM 
Sbjct: 15  QSLLPLPI--SPPPPHSSSLSSSPSPTQRFLTPSQGSRLPPRRRALCTKSQDPRWRRAMA 72

Query: 133 KSGYGP 138
                P
Sbjct: 73  SLAVIP 78


>gnl|CDD|179334 PRK01770, PRK01770, sec-independent translocase; Provisional.
          Length = 171

 Score = 29.0 bits (65), Expect = 5.2
 Identities = 14/71 (19%), Positives = 26/71 (36%), Gaps = 8/71 (11%)

Query: 18  QFAEKT----EVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRP 73
           Q AE         +  K  ++     N    ++       T +  ++Q S+P+     +P
Sbjct: 87  QAAESMKRSYAANDPEKASDEAHTIHNPVVKDNEAAHEGVTPAAAQTQASSPEQ----KP 142

Query: 74  PPPPAPIVAPP 84
              P P+V P 
Sbjct: 143 ETTPEPVVKPA 153


>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region.  The myc family
           belongs to the basic helix-loop-helix leucine zipper
           class of transcription factors, see pfam00010. Myc forms
           a heterodimer with Max, and this complex regulates cell
           growth through direct activation of genes involved in
           cell replication. Mutations in the C-terminal 20
           residues of this domain cause unique changes in the
           induction of apoptosis, transformation, and G2 arrest.
          Length = 329

 Score = 29.5 bits (66), Expect = 5.3
 Identities = 26/117 (22%), Positives = 38/117 (32%), Gaps = 11/117 (9%)

Query: 44  AESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHK 103
           A   ++V SE  +   S Q+A K G    P   PAP  +P                +   
Sbjct: 122 AAKLEKVVSEKLA---SLQAARKEG---LPSDSPAPAPSPRGRPHPASGSGRLSASYLQD 175

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
           L         P    S   P P+  E    S    P P   L   P+   ++ +D+ 
Sbjct: 176 LSTSASECIDP----SVVFPYPL-NERSKSSKVASPTPRLGLRTPPNSSSSSGSDSE 227


>gnl|CDD|221745 pfam12737, Mating_C, C-terminal domain of homeodomain 1.  Mating in
           fungi is controlled by the loci that determine the
           mating type of an individual, and only individuals with
           differing mating types can mate. Basidiomycete fungi
           have evolved a unique mating system, termed tetrapolar
           or bifactorial incompatibility, in which mating type is
           determined by two unlinked loci; compatibility at both
           loci is required for mating to occur. The multi-allelic
           tetrapolar mating system is considered to be a novel
           innovation that could have only evolved once, and is
           thus unique to the mushroom fungi. This domain is
           C-terminal to the homeodomain transcription factor
           region.
          Length = 418

 Score = 29.4 bits (66), Expect = 5.5
 Identities = 23/129 (17%), Positives = 33/129 (25%), Gaps = 22/129 (17%)

Query: 52  SETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGR-------HPSHGKPQFNHKL 104
           + + S   +       G   R       + AP RP  + R        P H    ++   
Sbjct: 167 TPSLSPPHTPTDTAPSGKRKRRLSDGFQLPAPKRPQTSSRPQTVSDPLPLHATTDWDTWF 226

Query: 105 GGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPL------LAPAPDPWPNAVAD 158
                    P   L+   PPPV       S + P    PL          P   P A+  
Sbjct: 227 QA--TVSSSPSLLLTGDIPPPV-------SVFAPDDSTPLDISLFNFPLIPLLPPEALDL 277

Query: 159 TPKIVSLDV 167
                    
Sbjct: 278 PAPTAVSSS 286


>gnl|CDD|220596 pfam10138, Tellurium_res, Tellurium resistance protein.  Members of
           this family confer resistance to the metalloid element
           tellurium and its salts.
          Length = 98

 Score = 27.7 bits (62), Expect = 5.6
 Identities = 8/20 (40%), Positives = 8/20 (40%)

Query: 111 PPPPPPPLSAAKPPPVQPEA 130
             P PPP  A   P   P A
Sbjct: 2   AAPVPPPAPAPPAPAPPPAA 21


>gnl|CDD|216078 pfam00716, Peptidase_S21, Assemblin (Peptidase family S21). 
          Length = 326

 Score = 29.3 bits (66), Expect = 5.6
 Identities = 19/93 (20%), Positives = 21/93 (22%), Gaps = 21/93 (22%)

Query: 73  PPPPPAPIVAP---PRPHPNGRHPSHGKPQFNHKLGGVYHGPPP------PPP-----PL 118
           P   P+    P   P   P       G                       P         
Sbjct: 235 PGTAPSFDATPSVSPSGQPLSPAAPPGTSSVAGTALSASPAALFGDMVYVPLDAYNQLLA 294

Query: 119 SAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDP 151
             A   P  P+       GP PP  L  PAP P
Sbjct: 295 GQAFNQPPDPQ-------GPAPPAELAPPAPAP 320


>gnl|CDD|215130 PLN02217, PLN02217, probable pectinesterase/pectinesterase
           inhibitor.
          Length = 670

 Score = 29.7 bits (66), Expect = 5.7
 Identities = 20/87 (22%), Positives = 32/87 (36%), Gaps = 12/87 (13%)

Query: 44  AESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHK 103
           A S    +S++ S   +  ++P  G++  PP  P+ IV+P    P               
Sbjct: 578 ASSNTTFSSDSPSTVVAPSTSPPAGHLGSPPATPSKIVSPSTSPPASH------------ 625

Query: 104 LGGVYHGPPPPPPPLSAAKPPPVQPEA 130
           LG     P  P   +  A      PE+
Sbjct: 626 LGSPSTTPSSPESSIKVASTETASPES 652


>gnl|CDD|217298 pfam02948, Amelogenin, Amelogenin.  Amelogenins play a role in
           biomineralisation. They seem to regulate the formation
           of crystallites during the secretory stage of tooth
           enamel development. They are thought to play a major role in the
           structural organisation and mineralisation of developing
           enamel. They are found in the extracellular matrix.
           Mutations in X-chromosomal amelogenin can cause
           Amelogenesis imperfecta.
          Length = 174

 Score = 28.7 bits (64), Expect = 6.1
 Identities = 29/112 (25%), Positives = 37/112 (33%), Gaps = 16/112 (14%)

Query: 60  SQQSAPKHGYVDR---------PPPPPAPIVAPPRPH---PNGRHPSHGKPQFNHKLGGV 107
           S Q   +                 PP  P++  P  H   P      H +P   H L   
Sbjct: 54  SPQMPQQQQSAHPKLTPHHQLLILPPQQPMMPVPGHHPMVPMTGQQPHLQPPAQHPLQPT 113

Query: 108 YHGPPPPPPPLSAAKPPPVQPEAMDKSG---YGPPPPPPLLAPAP-DPWPNA 155
           Y   P P  P     P   Q  A  + G   +   P PPL+   P +PWP A
Sbjct: 114 YGQNPQPQQPTHTQPPVQPQQPADPQPGQPMFPMQPLPPLVPDLPLEPWPAA 165


>gnl|CDD|219053 pfam06482, Endostatin, Collagenase NC10 and Endostatin.  NC10
           stands for Non-helical region 10 and is taken from human
           COL15A1. A mutation in this region in human COL18A1 is
           associated with an increased risk of prostate cancer.
           This domain is cleaved from the precursor and forms
           endostatin. Endostatin is a key tumour suppressor and
           has been used highly successfully to treat cancer. It is
           a potent angiogenesis inhibitor. Endostatin also binds a
           zinc ion near the N-terminus; this is likely to be of
           structural rather than functional importance according
           to.
          Length = 291

 Score = 29.3 bits (66), Expect = 6.2
 Identities = 11/55 (20%), Positives = 15/55 (27%), Gaps = 6/55 (10%)

Query: 63  SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPP 117
                    +PPP      + PR  P+  H    +           H  P PP  
Sbjct: 53  VPLPGTTATQPPPVVLTPWSDPRL-PDPPHLPDPQTHSAT-----AHRNPHPPLN 101


>gnl|CDD|234398 TIGR03921, T7SS_mycosin, type VII secretion-associated serine
           protease mycosin.  Members of this family are
           subtilisin-related serine proteases, found strictly in
           the Actinobacteria and associated with type VII
           secretion operons. The designation mycosin is used for
           members from Mycobacterium [Protein fate, Protein and
           peptide secretion and trafficking, Protein fate, Protein
           modification and repair].
          Length = 350

 Score = 29.2 bits (66), Expect = 6.3
 Identities = 12/35 (34%), Positives = 16/35 (45%), Gaps = 1/35 (2%)

Query: 60  SQQSAPKHGYVDRPPPPPAP-IVAPPRPHPNGRHP 93
           + +  P+ G   RP P PA  + AP  P P    P
Sbjct: 283 TGELPPEDGRPLRPAPAPARPVAAPAPPPPPDDTP 317


>gnl|CDD|218332 pfam04929, Herpes_DNAp_acc, Herpes DNA replication accessory
           factor.  Replicative DNA polymerases are capable of
           polymerising tens of thousands of nucleotides without
           dissociating from their DNA templates. The high
           processivity of these polymerases is dependent upon
           accessory proteins that bind to the catalytic subunit of
           the polymerase or to the substrate. The Epstein-Barr
           virus (EBV) BMRF1 protein is an essential component of
           the viral DNA polymerase and is absolutely required for
           lytic virus replication. BMRF1 is also a transactivator.
           This family is predicted to have a UL42 like structure.
          Length = 381

 Score = 29.3 bits (66), Expect = 6.3
 Identities = 10/44 (22%), Positives = 16/44 (36%)

Query: 43  DAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRP 86
           +A+S+ +          S    P+H   D  P PP    + P  
Sbjct: 288 NADSSVEANGVEPEPTGSVSDRPRHLSSDSSPSPPDTSDSDPST 331


>gnl|CDD|234818 PRK00708, PRK00708, sec-independent translocase; Provisional.
          Length = 209

 Score = 29.0 bits (65), Expect = 6.4
 Identities = 13/53 (24%), Positives = 16/53 (30%), Gaps = 6/53 (11%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVS 164
              P  ++    P   PE        P  P P  APA        A  PK  +
Sbjct: 108 ENKPAEVTTPVEPMGLPET------PPAVPVPAPAPAVAAAAAQAAAAPKAPA 154



 Score = 28.6 bits (64), Expect = 7.1
 Identities = 21/94 (22%), Positives = 26/94 (27%), Gaps = 16/94 (17%)

Query: 70  VDRP--PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQ 127
           +  P     PA +  P            G P    +       P P P   +AA      
Sbjct: 102 MSEPATENKPAEVTTPV--------EPMGLP----ETPPAVPVPAPAPAVAAAAAQAAAA 149

Query: 128 PEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPK 161
           P+A  K     P P    AP P       A   K
Sbjct: 150 PKAPAKPRAKSPRPAAKAAPKPT--ETITAKKAK 181


>gnl|CDD|173552 PTZ00359, PTZ00359, hypothetical protein; Provisional.
          Length = 443

 Score = 29.4 bits (66), Expect = 6.5
 Identities = 9/30 (30%), Positives = 15/30 (50%), Gaps = 1/30 (3%)

Query: 525 MICMTTIGFSATLFVLLGILVVSCLVSACL 554
           + CM+       +F+ L   VV  L++ CL
Sbjct: 246 LCCMSFTKCDG-VFIFLTGTVVGILITVCL 274


>gnl|CDD|219569 pfam07777, MFMR, G-box binding protein MFMR.  This region is found
           N-terminal to the pfam00170 transcription factor
           domain. It is between 150 and 200 amino acids in length.
           The N-terminal half is rather rich in proline residues
           and has been termed the PRD (proline rich domain),
           whereas the C-terminal half is more polar and has been
           called the MFMR (multifunctional mosaic region). It has
           been suggested that this family is composed of three
           sub-families called A, B and C, classified according to
           motif composition. It has been suggested that some of
           these motifs may be involved in mediating
           protein-protein interactions. The MFMR region contains a
           nuclear localisation signal in bZIP opaque and GBF-2.
           The MFMR also contains a transregulatory activity in
           TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention
           signals.
          Length = 189

 Score = 28.7 bits (64), Expect = 6.6
 Identities = 20/79 (25%), Positives = 27/79 (34%), Gaps = 13/79 (16%)

Query: 74  PPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEA-MD 132
           P   +P  +     P                   Y+GP PPPP  +++     QP   M 
Sbjct: 9   PSKSSPKTSVQEDTPTP-TVYPDWSAMQ-----AYYGPRPPPPYFNSSVASSPQPHPYM- 61

Query: 133 KSGYGPPPP--PPLLAPAP 149
              +GP  P  PP   P P
Sbjct: 62  ---WGPQQPMMPPYGTPPP 77


>gnl|CDD|233366 TIGR01348, PDHac_trf_long, pyruvate dehydrogenase complex
           dihydrolipoamide acetyltransferase, long form.  This
           model describes a subset of pyruvate dehydrogenase
           complex dihydrolipoamide acetyltransferase specifically
           close by both phylogenetic and per cent identity (UPGMA)
           trees. Members of this set include two or three copies
           of the lipoyl-binding domain. E. coli AceF is a member
           of this model, while mitochondrial and some other
           bacterial forms belong to a separate model [Energy
           metabolism, Pyruvate dehydrogenase].
          Length = 546

 Score = 29.1 bits (65), Expect = 7.1
 Identities = 13/41 (31%), Positives = 14/41 (34%), Gaps = 1/41 (2%)

Query: 110 GPPPPPPPLSAAKPPPVQ-PEAMDKSGYGPPPPPPLLAPAP 149
           G  P   P  A+  P  Q P A        P      APAP
Sbjct: 193 GSTPATAPAPASAQPAAQSPAATQPEPAAAPAAAKAQAPAP 233


>gnl|CDD|236507 PRK09424, pntA, NAD(P) transhydrogenase subunit alpha; Provisional.
          Length = 509

 Score = 29.0 bits (66), Expect = 7.2
 Identities = 12/46 (26%), Positives = 13/46 (28%), Gaps = 11/46 (23%)

Query: 112 PPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVA 157
           PPPP  +SAA   P    A         P        P       A
Sbjct: 369 PPPPIQVSAA---PAAAAA--------APAAKEEEKKPASPWRKYA 403


>gnl|CDD|215045 PLN00064, PLN00064, photosystem II protein Psb27; Provisional.
          Length = 166

 Score = 28.4 bits (63), Expect = 7.2
 Identities = 12/32 (37%), Positives = 15/32 (46%)

Query: 63 SAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPS 94
          S PK   + +PP   A  V+PP P P   H  
Sbjct: 4  SKPKPLSLIKPPTATAAAVSPPLPPPRRNHLL 35


>gnl|CDD|218350 pfam04959, ARS2, Arsenite-resistance protein 2.  Arsenite is a
           carcinogenic compound which can act as a co-mutagen by
           inhibiting DNA repair. Arsenite-resistance protein 2 is
           thought to play a role in arsenite resistance.
          Length = 211

 Score = 28.6 bits (64), Expect = 7.5
 Identities = 12/33 (36%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 111 PPPPPPPLSAAKPPPVQPEA-MDKSGYGPPPPP 142
           PP P P   A   P   P+       YG P PP
Sbjct: 135 PPKPDPGGLAPGLPGYPPQTPQALMPYGQPRPP 167



 Score = 28.6 bits (64), Expect = 8.1
 Identities = 18/62 (29%), Positives = 20/62 (32%), Gaps = 3/62 (4%)

Query: 72  RPPPPPAPIVAPPRPHPNGRHPSHG--KPQFNHKLGGVYHGPPPPPPPLSAAKP-PPVQP 128
           RP  P    + PP+P P G  P      PQ    L       PP         P PP Q 
Sbjct: 124 RPALPEIKPLQPPKPDPGGLAPGLPGYPPQTPQALMPYGQPRPPMMGYGRGGPPFPPNQY 183

Query: 129 EA 130
             
Sbjct: 184 GG 185


>gnl|CDD|237802 PRK14723, flhF, flagellar biosynthesis regulator FlhF; Provisional.
          Length = 767

 Score = 29.4 bits (66), Expect = 7.6
 Identities = 6/41 (14%), Positives = 10/41 (24%)

Query: 110 GPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPD 150
                    +   P P+    +  +        P   PAP 
Sbjct: 50  AVAASAQAYAPPAPAPLPAALVAPAPAAASIAAPAAVPAPG 90



 Score = 29.0 bits (65), Expect = 9.1
 Identities = 13/56 (23%), Positives = 16/56 (28%), Gaps = 4/56 (7%)

Query: 120 AAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIVSLDVKCEKNSMK 175
           AA      P A        P      APA        A        D++ E  SM+
Sbjct: 52  AASAQAYAPPAPA----PLPAALVAPAPAAASIAAPAAVPAPGAIGDLRGELQSMR 103


>gnl|CDD|221179 pfam11711, Tim54, Inner membrane protein import complex subunit
           Tim54.  Mitochondrial function depends on the import of
           hundreds of different proteins synthesised in the
           cytosol. Protein import is a multi-step pathway which
           includes the binding of precursor proteins to surface
           receptors, translocation of the precursor across one or
           both mitochondrial membranes, and folding and assembly
           of the imported protein inside the mitochondrion. Most
           precursor proteins carry amino-terminal targeting
           signals, called pre-sequences, and are imported into
           mitochondria via import complexes located in both the
           outer and the inner membrane (IM). The IM complex, TIM,
           is made up of at least two proteins which mediate
           translocation of proteins into the matrix by removing
           their signal peptide and another pair of proteins, Tim54
           and Tim22, that insert the polytopic proteins, that
           carry internal targeting information, into the inner
           membrane.
          Length = 377

 Score = 28.9 bits (65), Expect = 7.7
 Identities = 17/78 (21%), Positives = 29/78 (37%), Gaps = 11/78 (14%)

Query: 6   IVSTQLQITTGSQFAEKTEVPNVSKVEEQPKQAQNREDAESTDQVASETSSDQESQQSAP 65
           +   +    T  + A +TEV      E   + A+     E+ +    ET    E + + P
Sbjct: 195 LDPPEPPEPTVDEAAPETEVEATPAAESPAEPAE-----ETAETTPEETEDAPEEENNKP 249

Query: 66  KHGYVDRPPPPPAPIVAP 83
                   PP P P ++P
Sbjct: 250 VK------PPVPKPYISP 261


>gnl|CDD|235357 PRK05177, minC, septum formation inhibitor; Reviewed.
          Length = 239

 Score = 28.8 bits (65), Expect = 8.0
 Identities = 13/53 (24%), Positives = 15/53 (28%), Gaps = 9/53 (16%)

Query: 111 PPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTPKIV 163
            P  PP L   +P               P   P   P     P A A  P +V
Sbjct: 92  GPGMPPALVGGRPAGDVE---------IPEEEPAAPPPAPAAPEAPAAVPSLV 135


>gnl|CDD|215544 PLN03029, PLN03029, type-a response regulator protein; Provisional.
          Length = 222

 Score = 28.5 bits (63), Expect = 8.1
 Identities = 17/72 (23%), Positives = 29/72 (40%), Gaps = 11/72 (15%)

Query: 37  QAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRH---- 92
           +++N++      Q   E S  Q  +Q  P      +P   P P   P +P+ N R     
Sbjct: 149 KSKNQKQENQEKQEKLEESEIQSEKQEQP----SQQPQSQPQPQQQPQQPNNNKRKAMEE 204

Query: 93  ---PSHGKPQFN 101
              P   +P++N
Sbjct: 205 GLSPDRTRPRYN 216


>gnl|CDD|237081 PRK12372, PRK12372, ribonuclease III; Reviewed.
          Length = 413

 Score = 28.7 bits (64), Expect = 8.4
 Identities = 29/166 (17%), Positives = 47/166 (28%), Gaps = 33/166 (19%)

Query: 16  GSQFAEKTEVPNVSKVEE-------QPKQAQNREDAESTDQVASETSSDQESQQ---SAP 65
            S+  E   VP V  V+E       + K+     +A +     + T++     +    AP
Sbjct: 239 ASKHVEPEIVPGVKGVQEALDLRSPERKERAAAREARAAAAAPAATAAAAAPAEEPAVAP 298

Query: 66  ----KHGYVD-------RPPPPPAPIVAPPRPHPNGRHPSHGKPQFNHKLGGVYHGPPPP 114
               +  +V+       R   P A   A  +P            +            P  
Sbjct: 299 MAAIRAAHVETAADKGERAAKPAAADKAADKPADRPDAAEKAAEK------------PAE 346

Query: 115 PPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADTP 160
             P +A KP     +    S   P       A  P    +A A   
Sbjct: 347 AAPRAADKPAGQAADPASSSADKPGASADAAARTPARARDAAAPDA 392


>gnl|CDD|221009 pfam11162, DUF2946, Protein of unknown function (DUF2946).  This
           family of proteins has no known function.
          Length = 119

 Score = 27.4 bits (61), Expect = 8.7
 Identities = 13/47 (27%), Positives = 14/47 (29%), Gaps = 7/47 (14%)

Query: 109 HGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPAPDPWPNA 155
              P  P  L           A        PPPPP   P    WP+A
Sbjct: 74  AHTPALPAALPLLLALVRLAAA-------VPPPPPASLPPRSRWPSA 113


>gnl|CDD|222010 pfam13254, DUF4045, Domain of unknown function (DUF4045).  This
           presumed domain is functionally uncharacterized. This
           domain family is found in bacteria and eukaryotes, and
           is typically between 384 and 430 amino acids in length.
          Length = 414

 Score = 28.7 bits (64), Expect = 9.0
 Identities = 26/114 (22%), Positives = 32/114 (28%), Gaps = 12/114 (10%)

Query: 43  DAESTD--QVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSHGKPQF 100
           D+E T   +   ETSS    + SAPK      P  P                     P+ 
Sbjct: 232 DSEKTKPEKPQQETSSMDTEKSSAPKPRETLDPKSPEKAPPIDTTEEELK------SPEA 285

Query: 101 NHKLGGVYHGPPPPPPPLS----AAKPPPVQPEAMDKSGYGPPPPPPLLAPAPD 150
           + K           P  LS    A  P P+            P P P   P  D
Sbjct: 286 SPKESEEASARKRSPSLLSPSPKAESPKPLASPGKSPRDPLSPRPKPQSPPVND 339


>gnl|CDD|218108 pfam04487, CITED, CITED.  CITED, CBP/p300-interacting
           transactivator with ED-rich tail, are characterized by a
           conserved 32-amino acid sequence at the C-terminus.
           CITED proteins do not bind DNA directly and are thought
           to function as transcriptional co-activators.
          Length = 206

 Score = 28.3 bits (63), Expect = 9.1
 Identities = 18/95 (18%), Positives = 23/95 (24%), Gaps = 15/95 (15%)

Query: 73  PPPPPAPIVAPPRPHPNGRHPS---HGKPQFN-----HKLGGVYHGPPPPPPPLSAAKPP 124
              P   + A    +P+ +         PQ        KL   Y G    P         
Sbjct: 69  GGHPHQSMPAYMMFNPSSKPQPFMLVPGPQLMASMQLQKLNTQYQGHAGAPA--GHPGGG 126

Query: 125 PVQPEAMDKSGYGPPPPPPLLAPAPDPWPNAVADT 159
             Q         G   PP +        P  V DT
Sbjct: 127 GPQQFRP-----GAGQPPGMQHMPAPALPPNVIDT 156


>gnl|CDD|182486 PRK10473, PRK10473, multidrug efflux system protein MdtL;
           Provisional.
          Length = 392

 Score = 28.8 bits (65), Expect = 9.3
 Identities = 11/41 (26%), Positives = 20/41 (48%), Gaps = 1/41 (2%)

Query: 528 MTTIGFSATLFVLLGILVVSCLVSACLCIRLRPFSNKTSQK 568
              +G SA   +L+GIL+   +VS  L + + P     + +
Sbjct: 347 AAVLGISA-WNMLIGILIACSIVSLLLILFVAPGRPVAAHE 386


>gnl|CDD|234630 PRK00095, mutL, DNA mismatch repair protein; Reviewed.
          Length = 617

 Score = 28.6 bits (65), Expect = 9.7
 Identities = 16/113 (14%), Positives = 24/113 (21%), Gaps = 3/113 (2%)

Query: 36  KQAQNREDAESTDQVASETSSDQESQQSAPKHGYVDRPPPPPAPIVAPPRPHPNGRHPSH 95
            Q+     A   +QV      +    Q  P +     PP        P +   +    S 
Sbjct: 326 AQSGLIPAAAGANQVLEPAEPEPLPLQQTPLYASGSSPPASSPSSAPPEQSEESQEESSA 385

Query: 96  GKPQFNHKLGGVYHGPPPPPPPLSAAKPPPVQPEAMDKSGYGPPPPPPLLAPA 148
            K                      AA   P       ++       P   A  
Sbjct: 386 EKNPLQPNA---SQSEAAAAASAEAAAAAPAAAPEPAEAAEEADSFPLGYALG 435


>gnl|CDD|235757 PRK06260, PRK06260, threonine synthase; Validated.
          Length = 397

 Score = 28.7 bits (65), Expect = 10.0
 Identities = 16/39 (41%), Positives = 21/39 (53%), Gaps = 3/39 (7%)

Query: 206 LGRTSANFEIGIH--ACGTSGNTENGLYGYGADAGSGTY 242
           +G T A  E+G+   AC ++GNT   L  Y A AG   Y
Sbjct: 105 VGVTKAL-ELGVKTVACASTGNTSASLAAYAARAGLKCY 142


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.317    0.133    0.409 

Gapped
Lambda     K      H
   0.267   0.0647    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 29,456,213
Number of extensions: 2848794
Number of successful extensions: 8075
Number of sequences better than 10.0: 1
Number of HSP's gapped: 5294
Number of HSP's successfully gapped: 499
Length of query: 593
Length of database: 10,937,602
Length adjustment: 102
Effective length of query: 491
Effective length of database: 6,413,494
Effective search space: 3149025554
Effective search space used: 3149025554
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 62 (27.8 bits)