RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11920
         (884 letters)



>gnl|CDD|240016 cd04658, Piwi_piwi-like_Euk, Piwi_piwi-like_Euk: PIWI domain,
           Piwi-like subfamily found in eukaryotes. This domain is
           found in Piwi and closely related proteins, where it is
           believed to perform a crucial role in germline cells,
           via RNA silencing. RNA silencing refers to a group of
           related gene-silencing mechanisms mediated by short RNA
           molecules, including siRNAs, miRNAs, and
           heterochromatin-related guide RNAs. The mechanism in
           Piwi is believed to be similar to that in Argonaute, the
           central component of the RNA-induced silencing complex
           (RISC). The PIWI domain is the C-terminal portion of
           Argonaute and consists of two subdomains, one of which
           provides the 5' anchoring of the guide RNA and the
           other, the catalytic site for slicing.
          Length = 448

 Score =  501 bits (1291), Expect = e-170
 Identities = 205/459 (44%), Positives = 293/459 (63%), Gaps = 16/459 (3%)

Query: 414 QVMKSIASFTRVDPNQKLQAISKYINNVNNNKETSELLKGWGLTLNKSMETLNARILPVE 473
            +MK +A  T+++P ++   I ++I  +  N    ELLK WG+ L+ +   +  R+LP E
Sbjct: 1   NLMKELAEHTKLNPKERYDTIRQFIQRIQKNPSVQELLKKWGIELDSNPLKIQGRVLPPE 60

Query: 474 KIYMGNNFVAPGSQEADWNRQVGTNPALTVVNFDQWVLIHIRRDQRNADNFLNCLNRNSN 533
           +I MGN FV   +  ADW R++   P    VN + WVLI+  RDQR A++FL  L + + 
Sbjct: 61  QIIMGNVFV-YANSNADWKREIRNQPLYDAVNLNNWVLIYPSRDQREAESFLQTLKQVAG 119

Query: 534 AIGIRVKKPQVIALQEEQTLSYLTALK-SMRSDTQFVVIIFNAPRTDRYQAVKKYCCCER 592
            +GI++  P++I +++++  +Y+ ALK + RSD Q VVII    + D Y A+KK+CC E 
Sbjct: 120 PMGIQISPPKIIKVKDDRIETYIRALKDAFRSDPQLVVIILPGNKKDLYDAIKKFCCVEC 179

Query: 593 PIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSVQIPY---DCAMVIGIDVYHE 649
           P+PSQVI SRT+ ++  ++SI  KIALQIN KLGG  W+V+IP       M++GIDVYH+
Sbjct: 180 PVPSQVITSRTLKKKKNLRSIASKIALQINAKLGGIPWTVEIPPFILKNTMIVGIDVYHD 239

Query: 650 GVGSQGQNIVGLVASTNKDFTTYYSQAVIQRRGQE-ITDSIAQPFKQALDRFIQANSVPP 708
            +  +   +VG VAS NK  T ++S+ + Q RGQE I DS+ +  K+AL  + + N   P
Sbjct: 240 TITKKKS-VVGFVASLNKSITKWFSKYISQVRGQEEIIDSLGKSMKKALKAYKKENKKLP 298

Query: 709 KQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTTLPSCSYAPKITAIIVQKRINTKIF 768
            +I I+RDGV DGQL  V   E+ Q ++ +        S +Y+PK+  I+V KRINT+ F
Sbjct: 299 SRIIIYRDGVGDGQLKKVKEYEVPQIKKAIKQY-----SENYSPKLAYIVVNKRINTRFF 353

Query: 769 QLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVSQHVRQGTVTPSHYIVLRNDNNVKV 828
                    N +N P G+V+D  +T+    DFFLVSQ VRQGTVTP+HY VL +   +K 
Sbjct: 354 N----QGGNNFSNPPPGTVVDSEITKPEWYDFFLVSQSVRQGTVTPTHYNVLYDTTGLKP 409

Query: 829 DHLQRLSYKLCHLYYNWPGTIRVPAVCQYAHRIAYLTGM 867
           DHLQRL+YKLCHLYYNW G+IRVPA CQYAH++A+L G 
Sbjct: 410 DHLQRLTYKLCHLYYNWSGSIRVPAPCQYAHKLAFLVGQ 448


>gnl|CDD|214930 smart00950, Piwi, This domain is found in the protein Piwi and its
           relatives.  The function of this domain is the dsRNA
           guided hydrolysis of ssRNA. Determination of the crystal
           structure of Argonaute reveals that PIWI is an RNase H
           domain, and identifies Argonaute as Slicer, the enzyme
           that cleaves mRNA in the RNAi RISC complex. In
           addition, Mg+2 dependence and production of 3'-OH and 5'
           phosphate products are shared characteristics of RNaseH
           and RISC. The PIWI domain core has a tertiary structure
           belonging to the RNase H family of enzymes. RNase H fold
           proteins all have a five-stranded mixed beta-sheet
           surrounded by helices. By analogy to RNase H enzymes
           which cleave single-stranded RNA guided by the DNA
           strand in an RNA/DNA hybrid, the PIWI domain can be
           inferred to cleave single-stranded RNA, for example
           mRNA, guided by double stranded siRNA.
          Length = 301

 Score =  297 bits (762), Expect = 8e-94
 Identities = 105/311 (33%), Positives = 159/311 (51%), Gaps = 19/311 (6%)

Query: 568 FVVIIFNAPRTDRYQAVKKYCCCERPIPSQVINSRTISRE---DKMKSIVMKIALQINCK 624
            VVI+    +TD Y  +KKY   +  +P+Q + ++T+ +     K+K  +  +AL+IN K
Sbjct: 2   IVVILPGEKKTDLYHEIKKYLETKLGVPTQCVQAKTLDKVSKRRKLKQYLTNVALKINAK 61

Query: 625 LGGSLWSV---QIPYDCAMVIGIDVYHEGVGSQGQNIVGLVASTNKDFTTYYSQAVIQ-R 680
           LGG  W +    IP    ++IGIDV H   G  G  +   VA+       Y S    Q  
Sbjct: 62  LGGINWVLDVPPIPLKPTLIIGIDVSHPSAGKGGS-VAPSVAAFVAS-GNYLSGNFYQAF 119

Query: 681 RGQEITDSIAQPFKQALDRFIQAN-SVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVD 739
             ++ +  + +  ++AL ++ ++N    P +I ++RDGVS+GQ   V   E+   + I  
Sbjct: 120 VREQGSRQLKEILREALKKYYKSNRKRLPDRIVVYRDGVSEGQFKQVLEYEV---KAIKK 176

Query: 740 TIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSD 799
                 P   Y PK+T I+VQKR +T+ F      +     N P G+V+D  +T     D
Sbjct: 177 ACKELGPD--YKPKLTVIVVQKRHHTRFF----PEDGNGRVNVPPGTVVDSVITSPEWYD 230

Query: 800 FFLVSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVCQYAH 859
           F+LVS    QGT  P+HY VL ++ N+  D LQRL+YKLCHLYY     + +PA   YAH
Sbjct: 231 FYLVSHAGLQGTARPTHYTVLYDEGNLDPDELQRLTYKLCHLYYRSTRPVSLPAPVYYAH 290

Query: 860 RIAYLTGMHLQ 870
            +A      L 
Sbjct: 291 LLAKRARQLLH 301


>gnl|CDD|216915 pfam02171, Piwi, Piwi domain.  This domain is found in the protein
           Piwi and its relatives. The function of this domain is
           the dsRNA guided hydrolysis of ssRNA. Determination of
           the crystal structure of Argonaute reveals that PIWI is
           an RNase H domain, and identifies Argonaute as Slicer,
           the enzyme that cleaves mRNA in the RNAi RISC complex.
           In addition, Mg+2 dependence and production of 3'-OH and
           5' phosphate products are shared characteristics of
           RNaseH and RISC. The PIWI domain core has a tertiary
           structure belonging to the RNase H family of enzymes.
           RNase H fold proteins all have a five-stranded mixed
           beta-sheet surrounded by helices. By analogy to RNase H
           enzymes which cleave single-stranded RNA guided by the
           DNA strand in an RNA/DNA hybrid, the PIWI domain can be
           inferred to cleave single-stranded RNA, for example
           mRNA, guided by double stranded siRNA.
          Length = 296

 Score =  269 bits (689), Expect = 2e-83
 Identities = 103/305 (33%), Positives = 161/305 (52%), Gaps = 12/305 (3%)

Query: 567 QFVVIIFNAPRTDRYQAVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLG 626
             VVI+      D Y ++KKY   E  IP+Q I  +T  +  K    +  + L+IN KLG
Sbjct: 1   LIVVIL-PDKNKDNYHSIKKYLETELGIPTQCIRLKTALKRTKP-QTLTNVLLKINVKLG 58

Query: 627 GS-LWSVQIPYDCAMVIGIDVYHEGVG-SQGQNIVGLVASTNKDFTTYYSQAVIQRRGQE 684
           G   W V+IP    ++IG D+ H   G     ++VG VAS +K    Y      Q  GQE
Sbjct: 59  GLNYWIVEIPPKIDVIIGFDISHGTGGTDDNPSVVGFVASMDKHPQKYAGGVRYQASGQE 118

Query: 685 ITDSIAQPFKQALDRFIQANSVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTT 744
           + + + +   ++L  F ++    P++I ++RDGVS+GQ   V   E++Q ++    +   
Sbjct: 119 LIEPLKEIILESLRSFYKSRKKKPERIIVYRDGVSEGQFPQVLNYEVNQIKEACKEL--- 175

Query: 745 LPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVS 804
                Y PK+T I+VQKR +T+ F    +G+R    N P G+V+D  +T     DF+L S
Sbjct: 176 --GEGYRPKLTVIVVQKRHHTRFFA---SGKRDGAQNPPPGTVVDDVITSPEYYDFYLCS 230

Query: 805 QHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVCQYAHRIAYL 864
              RQGTV P+ Y VL ++  +  + LQ+L+YKLC++Y      + +PA   YAH++A  
Sbjct: 231 HAGRQGTVKPTKYTVLYDEIGLSPEELQQLTYKLCYMYQRVFRPVSLPAPVYYAHKLAKR 290

Query: 865 TGMHL 869
              +L
Sbjct: 291 GRNNL 295


>gnl|CDD|240015 cd04657, Piwi_ago-like, Piwi_ago-like: PIWI domain, Argonaute-like
           subfamily. Argonaute is the central component of the
           RNA-induced silencing complex (RISC) and related
           complexes. The PIWI domain is the C-terminal portion of
           Argonaute and consists of two subdomains, one of which
           provides the 5' anchoring of the guide RNA and the
           other, the catalytic site for slicing.
          Length = 426

 Score =  244 bits (624), Expect = 2e-72
 Identities = 120/433 (27%), Positives = 191/433 (44%), Gaps = 35/433 (8%)

Query: 451 LKGWGLTLNKSMETLNARILPVEKI-YMGNNFVAPGSQEADWN------RQVGTNPALTV 503
           LK +G++++K M T+  R+LP  K+ Y  ++   P      WN       + G   +  V
Sbjct: 3   LKEFGISVSKEMITVPGRVLPPPKLKYGDSSKTVPPRN-GSWNLRGKKFLEGGPIRSWAV 61

Query: 504 VNFDQWVLIHIRRDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMR 563
           +NF         R      NF++ L +     GI       IA  E +       LK  +
Sbjct: 62  LNFAGPRRSREERADLR--NFVDQLVKTVIGAGIN--ITTAIASVEGRVEELFAKLKQAK 117

Query: 564 SD-TQFVVIIFNAPRTDRYQAVKKYCCCERPIPSQVINSRTISREDK---MKSIVMKIAL 619
            +  Q V++I     +D Y  +K+    E  I +Q + ++ ++++       ++ +KI  
Sbjct: 118 GEGPQLVLVILPKKDSDIYGRIKRLADTELGIHTQCVLAKKVTKKGNPQYFANVALKI-- 175

Query: 620 QINCKLGGSLWSVQIPYDCA------MVIGIDVYHEGVGSQGQ--NIVGLVASTNKDFTT 671
             N KLGG   S++            MV+G DV H   G      +I  +VAS +     
Sbjct: 176 --NLKLGGINHSLEPDIRPLLTKEPTMVLGADVTHPSPGDPAGAPSIAAVVASVDWHLAQ 233

Query: 672 YYSQAVIQRRGQEITDSIAQPFKQALDRFIQANSVPPKQIFIFRDGVSDGQLDSVSRVEI 731
           Y +   +Q   QEI D +    ++ L  F +A    P++I  +RDGVS+GQ   V   E+
Sbjct: 234 YPASVRLQSHRQEIIDDLESMVRELLRAFKKATGKLPERIIYYRDGVSEGQFAQVLNEEL 293

Query: 732 DQYQQIVDTIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLA-NAPSGSVLDH 790
              ++    +        Y PKIT I+VQKR +T+ F      +      N P G+V+D 
Sbjct: 294 PAIRKACAKLY-----PGYKPKITFIVVQKRHHTRFF-PTDEDDADGKNGNVPPGTVVDR 347

Query: 791 TVTRKTLSDFFLVSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIR 850
            +T     DF+L S    QGT  P+HY VL ++     D LQ L+Y LC+ Y     ++ 
Sbjct: 348 GITHPREFDFYLCSHAGIQGTARPTHYHVLWDEIGFTADELQTLTYNLCYTYARCTRSVS 407

Query: 851 VPAVCQYAHRIAY 863
           +P    YAH  A 
Sbjct: 408 IPPPAYYAHLAAA 420


>gnl|CDD|239208 cd02826, Piwi-like, Piwi-like: PIWI domain. Domain found in
           proteins involved in RNA silencing. RNA silencing refers
           to a group of related gene-silencing mechanisms mediated
           by short RNA molecules, including siRNAs, miRNAs, and
           heterochromatin-related guide RNAs. The central
           component of the RNA-induced silencing complex (RISC)
           and related complexes is Argonaute. The PIWI domain is
           the C-terminal portion of Argonaute and consists of two
           subdomains, one of which provides the 5' anchoring of
           the guide RNA and the other, the catalytic site for
           slicing. This domain is also found in closely related
           proteins, including the Piwi subfamily, where it is
           believed to perform a crucial role in germline cells,
           via a similar mechanism.
          Length = 393

 Score =  225 bits (576), Expect = 5e-66
 Identities = 109/414 (26%), Positives = 178/414 (42%), Gaps = 42/414 (10%)

Query: 463 ETLNARILPVEKIYMGNNFVAPGSQEADWNRQVGTN--PALTVVNFDQWVLIHIRRDQ-R 519
             L  R+LP  +I   N           + R +G    PA      +   +I  R ++  
Sbjct: 3   LILKGRVLPKPQILFKNK----------FLRNIGPFEKPAKI---TNPVAVIAFRNEEVD 49

Query: 520 NADNFLNCL-NRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRS-DTQFVVIIFNAPR 577
           +    L     +    I   +     I           +  K+      Q V+ I    +
Sbjct: 50  DLVKRLADACRQLGMKIK-EIPIVSWIEDLNNSFKDLKSVFKNAIKAGVQLVIFILKEKK 108

Query: 578 TDRYQAVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSVQIP-- 635
              +  +K+    +  IPSQVI  +T  +  ++K  +  +  ++N KLGG  + +  P  
Sbjct: 109 PPLHDEIKRLEA-KSDIPSQVIQLKTAKKMRRLKQTLDNLLRKVNSKLGGINYILDSPVK 167

Query: 636 -YDCAMVIGIDVYHEGVGSQGQN--IVGLVAST-NKDFTTYYSQAVIQR--RGQEITDSI 689
            +   + IG DV H    +       VG  A+  N  F   +      R  + Q++ +  
Sbjct: 168 LFKSDIFIGFDVSHPDRRTVNGGPSAVGFAANLSNHTFLGGFLYVQPSREVKLQDLGEV- 226

Query: 690 AQPFKQALDRFIQAN-SVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTTLPSC 748
               K+ LD F ++     P++I I+RDGVS+G+   V        ++I+        S 
Sbjct: 227 ---IKKCLDGFKKSTGEGLPEKIVIYRDGVSEGEFKRVKEE----VEEIIKEACEIEES- 278

Query: 749 SYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVSQHVR 808
            Y PK+  I+VQKR NT+ F      +   + N   G+V+DHT+T   LS+F+L S   R
Sbjct: 279 -YRPKLVIIVVQKRHNTRFFP---NEKNGGVQNPEPGTVVDHTITSPGLSEFYLASHVAR 334

Query: 809 QGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVCQYAHRIA 862
           QGTV P+ Y V+ ND N  ++ L+ L+Y LC  + N    I +PA   YAH++A
Sbjct: 335 QGTVKPTKYTVVFNDKNWSLNELEILTYILCLTHQNVYSPISLPAPLYYAHKLA 388


>gnl|CDD|198017 smart00949, PAZ, This domain is named PAZ after the proteins Piwi
           Argonaut and Zwille.  This domain is found in two
           families of proteins that are involved in
           post-transcriptional gene silencing. These are the Piwi
           family and the Dicer family, that includes the Carpel
           factory protein. The function of the domains is unknown
           but has been suggested to mediate complex formation
           between proteins of the Piwi and Dicer families by
           hetero-dimerisation. The three-dimensional structure of
           this domain has been solved. The PAZ domain is composed
           of two subdomains. One subdomain is similar to the OB
           fold, albeit with a different topology. The OB-fold is
           well known as a single-stranded nucleic acid binding
           fold. The second subdomain is composed of a beta-hairpin
           followed by an alpha-helix. The PAZ domain shows
           low-affinity nucleic acid binding and appears to
           interact with the 3' ends of single-stranded regions of
           RNA in the cleft between the two subdomains. PAZ can
           bind the characteristic two-base 3' overhangs of siRNAs,
           indicating that although PAZ may not be a primary
           nucleic acid binding site in Dicer or RISC, it may
           contribute to the specific and productive incorporation
           of siRNAs and miRNAs into the RNAi pathway.
          Length = 138

 Score =  152 bits (386), Expect = 4e-43
 Identities = 65/140 (46%), Positives = 88/140 (62%), Gaps = 6/140 (4%)

Query: 290 TCLDLI-DELKEKFGGNFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR-G 347
           T LD +     +    NF +R ++ L G IVLTRYNN+TYRID+ID+   P STF K  G
Sbjct: 2   TVLDFMRQLPSQGNRSNFQDRCAKDLKGLIVLTRYNNKTYRIDDIDWNLAPKSTFEKSDG 61

Query: 348 EPKSYVDYYREAYNIEIRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGLSD 407
              ++V+YY++ YNI IRD +QP+L++R KR+  +    E         PEL F+TGL+D
Sbjct: 62  SEITFVEYYKQKYNITIRDPNQPLLVSRPKRRRNQNGKGEPVLLP----PELCFITGLTD 117

Query: 408 AMRNDFQVMKSIASFTRVDP 427
            MR DF +MKSIA  TR+ P
Sbjct: 118 RMRKDFMLMKSIADRTRLSP 137


>gnl|CDD|239211 cd02845, PAZ_piwi_like, PAZ domain,  Piwi_like subfamily. In
           multi-cellular organisms, the Piwi protein appears to be
           essential for the maintenance of germline stem cells. In
           the Drosophila male germline, Piwi was shown to be
           involved in the silencing of retrotransposons in the
           male gametes. The Piwi proteins share their domain
           architecture with other members of the argonaute family.
           The PAZ domain has been named after the proteins Piwi,
           Argonaut, and Zwille. PAZ is found in two families of
           proteins that are essential components of RNA-mediated
           gene-silencing pathways, including RNA interference, the
           Piwi and Dicer families. PAZ functions as a nucleic acid
           binding domain, with a strong preference for
           single-stranded nucleic acids (RNA or DNA) or RNA
           duplexes with single-stranded 3' overhangs. It has been
           suggested that the PAZ domain provides a unique mode for
           the recognition of the two 3'-terminal nucleotides in
           single-stranded nucleic acids and buries the 3' OH
           group, and that it might recognize characteristic 3'
           overhangs in siRNAs within RISC (RNA-induced silencing)
           and other complexes.
          Length = 117

 Score =  140 bits (354), Expect = 7e-39
 Identities = 58/121 (47%), Positives = 82/121 (67%), Gaps = 6/121 (4%)

Query: 288 TSTCLDLIDELKEK-FGGNFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR 346
           ++T LD + +L  +     F E   + LIG IVLTRYNN+TYRID+IDF +TP+STF K 
Sbjct: 1   STTVLDRMHKLYRQETDERFREECEKELIGSIVLTRYNNKTYRIDDIDFDKTPLSTFKKS 60

Query: 347 -GEPKSYVDYYREAYNIEIRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGL 405
            G   ++V+YY++ YNIEI D +QP+L++R KR+  RG   E  +    ++PEL FLTGL
Sbjct: 61  DGTEITFVEYYKKQYNIEITDLNQPLLVSRPKRRDPRGGEKEPIY----LIPELCFLTGL 116

Query: 406 S 406
           +
Sbjct: 117 T 117


>gnl|CDD|215631 PLN03202, PLN03202, protein argonaute; Provisional.
          Length = 900

 Score =  125 bits (316), Expect = 2e-29
 Identities = 124/484 (25%), Positives = 212/484 (43%), Gaps = 84/484 (17%)

Query: 423 TRVDPNQKLQAISKYINNVNNNKETSELLKGWGLTLNKSMETLNARILPVEKIYMGNNF- 481
           +R  P ++++ ++  + + N + +   +L+  G++++     +  R+LP  K+ +GN   
Sbjct: 404 SRQKPQERMKVLTDALKSSNYDADP--MLRSCGISISSQFTQVEGRVLPAPKLKVGNGED 461

Query: 482 VAPGSQEADWNRQVGTNPA----LTVVNFDQWVLIHIRRDQRN-ADNFLNCLNRNSNAIG 536
             P +   ++N +    P       VVNF        R D R+   + + C        G
Sbjct: 462 FFPRNGRWNFNNKKLVEPTKIERWAVVNFSA------RCDIRHLVRDLIKCGEMK----G 511

Query: 537 IRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVI---IFNAPR-----------TDRYQ 582
           I ++ P    + EE    +  A   +R +  F  I   +   P+           +D Y 
Sbjct: 512 INIEPP--FDVFEENP-QFRRAPPPVRVEKMFEQIQSKLPGPPQFLLCILPERKNSDIYG 568

Query: 583 AVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLGG--SLWSV-------- 632
             KK    E  I +Q I    ++ +      +  + L+IN KLGG  SL ++        
Sbjct: 569 PWKKKNLSEFGIVTQCIAPTRVNDQ-----YLTNVLLKINAKLGGLNSLLAIEHSPSIPL 623

Query: 633 --QIPYDCAMVIGIDVYHEGVGSQGQ----NIVGLVASTNKDFTTYYSQAV-IQRRGQEI 685
             ++P    +++G+DV H   GS GQ    +I  +V+S      + Y  +V  Q    E+
Sbjct: 624 VSKVP---TIILGMDVSH---GSPGQSDVPSIAAVVSSRQWPLISRYRASVRTQSPKVEM 677

Query: 686 TDSIAQPFKQA----------LDRFIQANSVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQ 735
            DS+ +P              LD +  +    P+QI IFRDGVS+ Q + V  +E+DQ  
Sbjct: 678 IDSLFKPVGDKDDDGIIRELLLDFYTSSGKRKPEQIIIFRDGVSESQFNQVLNIELDQII 737

Query: 736 QIVDTIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRK 795
           +    +       S++PK T I+ QK  +TK FQ  S    P+  N P G+V+D+ +   
Sbjct: 738 EACKFL-----DESWSPKFTVIVAQKNHHTKFFQAGS----PD--NVPPGTVVDNKICHP 786

Query: 796 TLSDFFLVSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVC 855
             +DF++ +     GT  P+HY VL ++     D LQ L + L ++Y      I V A  
Sbjct: 787 RNNDFYMCAHAGMIGTTRPTHYHVLLDEIGFSADDLQELVHSLSYVYQRSTTAISVVAPV 846

Query: 856 QYAH 859
            YAH
Sbjct: 847 CYAH 850



 Score = 33.9 bits (78), Expect = 0.47
 Identities = 30/115 (26%), Positives = 51/115 (44%), Gaps = 14/115 (12%)

Query: 82  TSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRG--ELGQPIKVMVNYIDLSVKE 139
               P P    P +    +++      SKP   P+  RG    GQ I+++ N+  +SV  
Sbjct: 1   KDALPPPPPVVPPNVVPIKLEPTK-KPSKPKRLPMARRGFGSKGQKIQLLTNHFKVSVNN 59

Query: 140 GSGM-YEYEVKF----NPPIDSRGIRNRLINSL-----NDLLGQYKTFDG-MNLF 183
             G  + Y V        P+D +GI  ++I+ +     +DL G+   +DG  +LF
Sbjct: 60  PDGHFFHYSVSLTYEDGRPVDGKGIGRKVIDKVQETYSSDLAGKDFAYDGEKSLF 114


>gnl|CDD|216914 pfam02170, PAZ, PAZ domain.  This domain is named PAZ after the
           proteins Piwi Argonaut and Zwille. This domain is found
           in two families of proteins that are involved in
           post-transcriptional gene silencing. These are the Piwi
           family and the Dicer family, that includes the Carpel
           factory protein. The function of the domains is unknown
           but has been suggested to mediate complex formation
           between proteins of the Piwi and Dicer families by
           hetero-dimerisation. The three-dimensional structure of
           this domain has been solved. The PAZ domain is composed
           of two subdomains. One subdomain is similar to the OB
           fold, albeit with a different topology. The OB-fold is
           well known as a single-stranded nucleic acid binding
           fold. The second subdomain is composed of a beta-hairpin
           followed by an alpha-helix. The PAZ domain shows
           low-affinity nucleic acid binding and appears to
           interact with the 3' ends of single-stranded regions of
           RNA in the cleft between the two subdomains. PAZ can
           bind the characteristic two-base 3' overhangs of siRNAs,
           indicating that although PAZ may not be a primary
           nucleic acid binding site in Dicer or RISC, it may
           contribute to the specific and productive incorporation
           of siRNAs and miRNAs into the RNAi pathway.
          Length = 114

 Score = 98.1 bits (245), Expect = 3e-24
 Identities = 41/107 (38%), Positives = 60/107 (56%), Gaps = 13/107 (12%)

Query: 305 NFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR-GEPKSYVDYYREAYNIE 363
           +F E+ ++AL G IV T YNN+TYRID I +  TP STF  + G   S  +Y++E YNI 
Sbjct: 20  DFREKFTKALKGLIVETTYNNRTYRIDGITWDPTPNSTFPLKDGGEISVAEYFKEKYNIT 79

Query: 364 IRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGLSDAMR 410
           ++  + P+L+  V RK +             + PEL F+TG    M+
Sbjct: 80  LKYPNLPLLV--VGRKKK----------PNYLPPELCFITGGQRYMK 114


>gnl|CDD|239207 cd02825, PAZ, PAZ domain, named PAZ after the proteins Piwi
           Argonaut and Zwille. PAZ is found in two families of
           proteins that are essential components of RNA-mediated
           gene-silencing pathways, including RNA interference, the
           piwi and Dicer families. PAZ functions as a nucleic-acid
           binding domain, with a strong preference for
           single-stranded nucleic acids (RNA or DNA) or RNA
           duplexes with single-stranded 3' overhangs. It has been
           suggested that the PAZ domain provides a unique mode for
           the recognition of the two 3'-terminal nucleotides in
           single-stranded nucleic acids and buries the 3' OH
           group, and that it might recognize characteristic 3'
           overhangs in siRNAs within RISC (RNA-induced silencing)
           and other complexes. This parent model also contains
           structures of an archaeal PAZ domain.
          Length = 115

 Score = 51.7 bits (124), Expect = 5e-08
 Identities = 26/106 (24%), Positives = 46/106 (43%), Gaps = 9/106 (8%)

Query: 279 IDTSCRVLRTSTCLDLIDELKEKFGGNFMERLSQALIGEIVLTRYN--NQTYRIDEIDFK 336
           I+T C+  +         E+      +  E  ++ L G  V   +N  N+ YR D     
Sbjct: 5   IETMCKFPK-------DREIDTPLLDSPREEFTKELKGLKVEDTHNPLNRVYRPDGETRL 57

Query: 337 QTPMSTFTKRGEPKSYVDYYREAYNIEIRDKSQPMLITRVKRKTRR 382
           + P       G+  ++ DY++E YN+ + D +QP+LI +   K   
Sbjct: 58  KAPSQLKHSDGKEITFADYFKERYNLTLTDLNQPLLIVKFSSKKSY 103


>gnl|CDD|240017 cd04659, Piwi_piwi-like_ProArk, Piwi_piwi-like_ProArk: PIWI domain,
           Piwi-like subfamily found in Archaea and Bacteria. RNA
           silencing refers to a group of related gene-silencing
           mechanisms mediated by short RNA molecules, including
           siRNAs, miRNAs, and heterochromatin-related guide RNAs.
           The central component of the RNA-induced silencing
           complex (RISC) and related complexes is Argonaute. The
           PIWI domain is the C-terminal portion of Argonaute and
           consists of two subdomains, one of which provides the 5'
           anchoring of the guide RNA and the other, the catalytic
           site for slicing. This domain is also found in closely
           related proteins, including the Piwi subfamily, where it
           is believed to perform a crucial role in germline cells,
           via a similar mechanism.
          Length = 404

 Score = 53.9 bits (130), Expect = 2e-07
 Identities = 65/366 (17%), Positives = 116/366 (31%), Gaps = 47/366 (12%)

Query: 523 NFLNCLNRNSNAIGIRVKKPQVIALQE-EQTLSYLTA----LKSMRSDTQFVVIIFNAPR 577
            F      N NA+G        + L    Q  + + A    L         V+++     
Sbjct: 63  KFPGFGGGNKNALGKNKISVFRLDLNRSAQAEAIIEAVDLALSESSQGVDVVIVVLPEDL 122

Query: 578 TDRYQAVKKYCCCER-----PIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSV 632
            +  +    Y   +       IP+Q +   T+     +  +   +AL +  KLGG  W +
Sbjct: 123 KELPEEFDLYDRLKAKLLRLGIPTQFVREDTLKNRQDLAYVAWNLALALYAKLGGIPWKL 182

Query: 633 ---QIPYDCAMVIGIDVYHEGVGSQGQNIVGLVASTNKDFTTYY---SQAVIQRRGQEIT 686
                P D    IGI       G     + G     + D        +        +   
Sbjct: 183 DADSDPADL--YIGIGFARSRDGE--VRVTGCAQVFDSDGLGLILRGAPIEEPTEDRSPA 238

Query: 687 DSIAQPFKQALDRFIQ-ANSVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTTL 745
           D      K+ L+ + +      PK++ + +DG         +  EI+  ++ ++      
Sbjct: 239 DLK-DLLKRVLEGYRESHRGRDPKRLVLHKDG-------RFTDEEIEGLKEALE------ 284

Query: 746 PSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVL---DHTVTRKTLSDFFL 802
                  K+  + V K    ++F     G  PN      G+ +   D      T      
Sbjct: 285 ---ELGIKVDLVEVIKSGPHRLF---RFGTYPNGFPPRRGTYVKLSDDEGLLWTHGSVPK 338

Query: 803 VSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWP-GTIRVPAVCQYAHRI 861
            +     G  TP   ++ R+  N  ++ L      L  L +N      R+P    YA R+
Sbjct: 339 YNT--YPGMGTPRPLLLRRHSGNTDLEQLASQILGLTKLNWNSFQFYSRLPVTIHYADRV 396

Query: 862 AYLTGM 867
           A L   
Sbjct: 397 AKLLKR 402


>gnl|CDD|239212 cd02846, PAZ_argonaute_like, PAZ domain, argonaute_like subfamily.
           Argonaute is part of the RNA-induced silencing complex
           (RISC), and is an endonuclease that plays a key role in
           the RNA interference pathway. The PAZ domain has been
           named after the proteins Piwi, Argonaut, and Zwille. PAZ
           is found in two families of proteins that are essential
           components of RNA-mediated gene-silencing pathways,
           including RNA interference, the Piwi and Dicer families.
           PAZ functions as a nucleic acid binding domain, with a
           strong preference for single-stranded nucleic acids (RNA
           or DNA) or RNA duplexes with single-stranded 3'
           overhangs. It has been suggested that the PAZ domain
           provides a unique mode for the recognition of the two
           3'-terminal nucleotides in single-stranded nucleic acids
           and buries the 3' OH group, and that it might recognize
           characteristic 3' overhangs in siRNAs within RISC
           (RNA-induced silencing) and other complexes.
          Length = 114

 Score = 43.8 bits (104), Expect = 3e-05
 Identities = 28/101 (27%), Positives = 44/101 (43%), Gaps = 15/101 (14%)

Query: 294 LIDELKEKFGGNF--------MERLSQALIGEIVLTRYNNQT---YRIDEIDFKQTPMST 342
           +I+ LKE  G +           +L +AL G  V   +   T   Y+I  +  +     T
Sbjct: 4   VIEFLKEFLGFDTPLGLSDNDRRKLKKALKGLKVEVTHRGNTNRKYKIKGLSAEPASQQT 63

Query: 343 FTKRGEPK--SYVDYYREAYNIEIRDKSQPMLITRVKRKTR 381
           F  +   K  S  DY++E YNI ++  + P L   V RK +
Sbjct: 64  FELKDGEKEISVADYFKEKYNIRLKYPNLPCLQ--VGRKGK 102


>gnl|CDD|239210 cd02844, PAZ_CAF_like, PAZ domain, CAF_like subfamily. CAF (for
           carpel factory) is a plant homolog of Dicer. CAF has
           been implicated in flower morphogenesis and in early
           Arabidopsis development and might function through
           posttranscriptional regulation of specific mRNA
           molecules. PAZ domains are named after the proteins
           Piwi, Argonaut, and Zwille. PAZ is found in two families
           of proteins that are essential components of
           RNA-mediated gene-silencing pathways, including RNA
           interference, the Piwi and Dicer families. PAZ functions
           as a nucleic-acid binding domain, with a strong
           preference for single-stranded nucleic acids (RNA or
           DNA) or RNA duplexes with single-stranded 3' overhangs.
           It has been suggested that the PAZ domain provides a
           unique mode for the recognition of the two 3'-terminal
           nucleotides in single-stranded nucleic acids and buries
           the 3' OH group, and that it might recognize
           characteristic 3' overhangs in siRNAs within RISC
           (RNA-induced silencing) and other complexes.
          Length = 135

 Score = 42.8 bits (101), Expect = 9e-05
 Identities = 24/99 (24%), Positives = 41/99 (41%), Gaps = 14/99 (14%)

Query: 314 LIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR--GEPKSYVDYYREAYNIEIRDKSQPM 371
           L G +V   +N + Y I  I       S+F  +      +Y +Y++E Y I +   +QP+
Sbjct: 31  LKGSVVTAPHNGRFYVISGI-LDLNANSSFPGKEGLGYATYAEYFKEKYGIVLNHPNQPL 89

Query: 372 LITR---------VKRKTRRGTNVEESH--YIAAIVPEL 399
           L  +           R   +G + E+    Y   + PEL
Sbjct: 90  LKGKQIFNLHNLLHNRFEEKGESEEKEKDRYFVELPPEL 128


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 39.9 bits (93), Expect = 0.008
 Identities = 19/86 (22%), Positives = 32/86 (37%), Gaps = 18/86 (20%)

Query: 30   LPTTSTDDTTGSHAPGIPSPSTS---SPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTP 86
            L   +      S A  +P P+++   +P     P P  +               P  G+ 
Sbjct: 2812 LAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSL---------------PLGGSV 2856

Query: 87   GPGGDAPSSPPTQQMKALSISKSKPP 112
             PGGD    PP++   A   + ++PP
Sbjct: 2857 APGGDVRRRPPSRSPAAKPAAPARPP 2882



 Score = 33.8 bits (77), Expect = 0.59
 Identities = 20/104 (19%), Positives = 30/104 (28%), Gaps = 19/104 (18%)

Query: 22  EEEKRPGSLPTTSTDDTTGSHAP------GIPSPSTSSPSQASEPAPKHIGRGRLLQQLL 75
               +  SLPT        +  P      G      ++P  AS P P            +
Sbjct: 372 RHHPKRASLPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPAPTP--------V 423

Query: 76  AKGVVPTSGTP-----GPGGDAPSSPPTQQMKALSISKSKPPSE 114
                P   TP         D P+ PP +Q  A +   +    +
Sbjct: 424 PASAPPPPATPLPSAEPGSDDGPAPPPERQPPAPATEPAPDDPD 467



 Score = 33.4 bits (76), Expect = 0.68
 Identities = 15/91 (16%), Positives = 25/91 (27%), Gaps = 14/91 (15%)

Query: 24   EKRPGSLPTTSTDDTTG--SHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVP 81
             +RP + P ++           P  P+P +  P     P P                  P
Sbjct: 2586 ARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPS------------P 2633

Query: 82   TSGTPGPGGDAPSSPPTQQMKALSISKSKPP 112
             +  P P       PP +     +  +   P
Sbjct: 2634 AANEPDPHPPPTVPPPERPRDDPAPGRVSRP 2664



 Score = 33.4 bits (76), Expect = 0.76
 Identities = 28/105 (26%), Positives = 35/105 (33%), Gaps = 12/105 (11%)

Query: 18  KRREEEEKRPGSLPTTSTDD-TTGSHAPGIPSPST----SSPSQASEPAPKHIGRGRLLQ 72
           KRR     RP   P +S +D + G H P   S  T    S+   A+  A    G  +   
Sbjct: 353 KRR-----RPTWTPPSSLEDLSAGRHHPKRASLPTRKRRSARHAATPFARGPGGDDQTRP 407

Query: 73  QLLAKGVVPTSGTPGPGGDAPSSP--PTQQMKALSISKSKPPSEP 115
                  VPT         AP  P  P    +  S     PP E 
Sbjct: 408 AAPVPASVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPER 452



 Score = 32.2 bits (73), Expect = 1.6
 Identities = 18/99 (18%), Positives = 28/99 (28%), Gaps = 11/99 (11%)

Query: 24   EKRPGSLPTTSTD-------DTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLA 76
              RP   PTT+                  +  P+ +S S++ E  P             A
Sbjct: 2754 PARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPS----PWDPADPPA 2809

Query: 77   KGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
              + P +  P     A   PP    +  +      P  P
Sbjct: 2810 AVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPP 2848



 Score = 29.9 bits (67), Expect = 7.2
 Identities = 14/39 (35%), Positives = 18/39 (46%), Gaps = 3/39 (7%)

Query: 84   GTPGPGGDA---PSSPPTQQMKALSISKSKPPSEPVYFR 119
              P PGG     P +PP     A +I   +P  EPV+ R
Sbjct: 2494 AAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPR 2532


>gnl|CDD|218116 pfam04503, SSDP, Single-stranded DNA binding protein, SSDP.  This
           is a family of eukaryotic single-stranded DNA binding
           proteins with specificity to a pyrimidine-rich element
           found in the promoter region of the alpha2(I) collagen
           gene.
          Length = 293

 Score = 38.5 bits (89), Expect = 0.011
 Identities = 24/96 (25%), Positives = 34/96 (35%), Gaps = 15/96 (15%)

Query: 27  PGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPT---S 83
              +  T++        PG P P    P+Q     P         Q LL  G+ PT    
Sbjct: 60  TPQMQNTTSQPFMSPRYPGGPRPPLRMPNQPPGGVPGS-------QPLLPGGMDPTVRQQ 112

Query: 84  GTPGPGGDAPSSPPTQQMKALSISKS-----KPPSE 114
           G P  GG      P + MK+L   ++     +PP  
Sbjct: 113 GHPNMGGPMQRMTPPRGMKSLDGPQNYGGGMRPPPN 148


>gnl|CDD|239209 cd02843, PAZ_dicer_like, PAZ domain, dicer_like subfamily. Dicer is
           an RNAse involved in cleaving dsRNA in the RNA
           interference pathway. It generates dsRNAs which are
           approximately 20 bp long (siRNAs), which in turn target
           hydrolysis of homologous RNAs. PAZ domains are named
           after the proteins Piwi Argonaut and Zwille. PAZ is
           found in two families of proteins that are essential
           components of RNA-mediated gene-silencing pathways,
           including RNA interference, the piwi and Dicer families.
           PAZ functions as a nucleic-acid binding domain, with a
           strong preference for single-stranded nucleic acids (RNA
           or DNA) or RNA duplexes with single-stranded 3'
           overhangs. It has been suggested that the PAZ domain
           provides a unique mode for the recognition of the two
           3'-terminal nucleotides in single-stranded nucleic acids
           and buries the 3' OH group, and that it might recognize
           characteristic 3' overhangs in siRNAs within RISC
           (RNA-induced silencing) and other complexes.
          Length = 122

 Score = 34.0 bits (78), Expect = 0.090
 Identities = 17/59 (28%), Positives = 34/59 (57%), Gaps = 5/59 (8%)

Query: 318 IVLTRYNN----QTYRIDEIDFKQTPMSTFTKRGEPKSYVDYYREAYNIEIRDKSQPML 372
           +V+  Y N    Q + + EI     P+S F    E +++ +YY++ Y ++I++ +QP+L
Sbjct: 45  VVMPWYRNFDQPQYFYVAEICTDLRPLSKFPG-PEYETFEEYYKKKYKLDIQNLNQPLL 102


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 35.9 bits (83), Expect = 0.11
 Identities = 18/95 (18%), Positives = 25/95 (26%), Gaps = 4/95 (4%)

Query: 25  KRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSG 84
             P S     +    G  +P  P P+    S    PAP      R +             
Sbjct: 95  LAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPA 154

Query: 85  TPGPGGDAPSSPPT--QQMKALSI--SKSKPPSEP 115
                    S   +  Q    LS     ++ PS P
Sbjct: 155 AGASPAAVASDAASSRQAALPLSSPEETARAPSSP 189



 Score = 33.2 bits (76), Expect = 0.84
 Identities = 16/98 (16%), Positives = 23/98 (23%), Gaps = 4/98 (4%)

Query: 19  RREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKG 78
            RE     PG         T    +P        S       +P             +  
Sbjct: 101 AREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPA 160

Query: 79  VVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPV 116
            V  S        A      ++      + S PP+EP 
Sbjct: 161 AVA-SDAASSRQAALPLSSPEETAR---APSSPPAEPP 194



 Score = 31.7 bits (72), Expect = 2.3
 Identities = 17/103 (16%), Positives = 28/103 (27%), Gaps = 17/103 (16%)

Query: 17  AKRREEEEKRPGSLPTTST----DDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQ 72
            +        PGS P  S+      ++ S      S S+SS S            G    
Sbjct: 293 ERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVS----PGPSPS 348

Query: 73  QLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
           +         S +P            ++    S + S P +  
Sbjct: 349 R---------SPSPSRPPPPADPSSPRKRPRPSRAPSSPAASA 382



 Score = 31.3 bits (71), Expect = 2.8
 Identities = 21/99 (21%), Positives = 28/99 (28%)

Query: 17  AKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLA 76
           A  R      P S   +S     G  A      S+S  S +                  A
Sbjct: 202 ASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCGWGPENECPLPRPA 261

Query: 77  KGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
              +PT      G + PSS P     + S  +  P   P
Sbjct: 262 PITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSP 300


>gnl|CDD|221745 pfam12737, Mating_C, C-terminal domain of homeodomain 1.  Mating in
           fungi is controlled by the loci that determine the
           mating type of an individual, and only individuals with
           differing mating types can mate. Basidiomycete fungi
           have evolved a unique mating system, termed tetrapolar
           or bifactorial incompatibility, in which mating type is
           determined by two unlinked loci; compatibility at both
           loci is required for mating to occur. The multi-allelic
           tetrapolar mating system is considered to be a novel
           innovation that could have only evolved once, and is
           thus unique to the mushroom fungi. This domain is
           C-terminal to the homeodomain transcription factor
           region.
          Length = 418

 Score = 35.2 bits (81), Expect = 0.13
 Identities = 22/93 (23%), Positives = 31/93 (33%), Gaps = 15/93 (16%)

Query: 16  EAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLL 75
           +    E   KRP S   +S+          +PSP+ S+  + SE +        L    L
Sbjct: 116 DEDEAERPSKRPRSDSISSSSSPAKPPEACLPSPAASTQDELSEASA-----APLPTPSL 170

Query: 76  AKGVVPTSGTP----------GPGGDAPSSPPT 98
           +    PT   P          G    AP  P T
Sbjct: 171 SPPHTPTDTAPSGKRKRRLSDGFQLPAPKRPQT 203


>gnl|CDD|218902 pfam06121, DUF959, Domain of Unknown Function (DUF959).  This
           N-terminal domain is not expressed in the 'Short'
           isoform of Collagen A.
          Length = 202

 Score = 34.4 bits (78), Expect = 0.15
 Identities = 22/72 (30%), Positives = 31/72 (43%), Gaps = 7/72 (9%)

Query: 28  GSLPTTSTDDTT--GSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGT 85
            S P  ST+ TT      PG    ST+  S  SE   + + +G   +Q +  G V T+ T
Sbjct: 44  ASTPVQSTESTTTHVVPRPGETEESTTPAS--SEEPKEIVEKG---KQNVVPGTVATTPT 98

Query: 86  PGPGGDAPSSPP 97
             P     +S P
Sbjct: 99  VTPVAMDVASSP 110


>gnl|CDD|220233 pfam09422, WTX, WTX protein.  The WTX protein is found to be
           inactivated in one third of Wilms tumours. The WTX
           protein is functionally uncharacterized.
          Length = 467

 Score = 34.9 bits (80), Expect = 0.20
 Identities = 28/129 (21%), Positives = 39/129 (30%), Gaps = 18/129 (13%)

Query: 3   GRGGRGAALKQILEAKRREEEEKRPGSLPTTSTDDTTGSHAPG----IPSPSTSSPSQAS 58
           GR  RG  LK +  + R   ++K           +   +  P     +PS  T+S     
Sbjct: 76  GRPKRG--LKGLFSSMRWHRKDKSNK-------AEQEEAKEPEGGLILPSSLTASLECVK 126

Query: 59  EPAPKHIGRGRLLQQLLAKG-VVPTSGTPGPGGDAPSSP-PTQQMKALSISKSKPPSEPV 116
           E  P+         Q    G V P S    PG        P +     S  +   P    
Sbjct: 127 EETPRAAREENAPPQDADGGKVSPASALEAPGTSCVECGDPAKPGSESSAFEDPGPGLGA 186

Query: 117 YFRGELGQP 125
                L QP
Sbjct: 187 ---DSLCQP 192


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 34.2 bits (79), Expect = 0.33
 Identities = 25/123 (20%), Positives = 36/123 (29%), Gaps = 12/123 (9%)

Query: 3   GRGGRGAALKQILEAKRREEEEKRPGSLPTTSTDDT----TGSHAPGIPSPSTSSPSQAS 58
             G  G        +    EE  RP +    +          + AP   S + +    A 
Sbjct: 591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAP 650

Query: 59  EPAPKH-----IGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPS 113
           E  PKH        G       A G  P +  P P   AP++P        +     P +
Sbjct: 651 EHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPA---PAPAA 707

Query: 114 EPV 116
            P 
Sbjct: 708 TPP 710


>gnl|CDD|147982 pfam06112, Herpes_capsid, Gammaherpesvirus capsid protein.  This
           family consists of several Gammaherpesvirus capsid
           proteins. The exact function of this family is unknown.
          Length = 148

 Score = 32.9 bits (75), Expect = 0.34
 Identities = 19/97 (19%), Positives = 31/97 (31%), Gaps = 15/97 (15%)

Query: 7   RGAALKQILEAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPS-PSTSSPSQASEPAPKHI 65
            G   K+ L+A R    +         S   ++ S  PG  +  S SS S  S   P   
Sbjct: 66  HGIRRKKHLQALRGAGPQTSSSIGSALSASSSSASGVPGGANQLSGSSGSALS-SGP--- 121

Query: 66  GRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMK 102
                           +S +    G   ++P + + K
Sbjct: 122 ----------GSLSSSSSLSGSGAGAGDTAPSSSKKK 148


>gnl|CDD|218191 pfam04652, DUF605, Vta1 like.  Vta1 (VPS20-associated protein 1) is
           a positive regulator of Vps4. Vps4 is an ATPase that is
           required in the multivesicular body (MVB) sorting
           pathway to dissociate the endosomal sorting complex
           required for transport (ESCRT). Vta1 promotes correct
           assembly of Vps4 and stimulates its ATPase activity
           through its conserved Vta1/SBP1/LIP5 region.
          Length = 315

 Score = 33.5 bits (77), Expect = 0.45
 Identities = 27/93 (29%), Positives = 35/93 (37%), Gaps = 8/93 (8%)

Query: 27  PGSLPTTSTDDTTGS-HAPGIPSPSTSSPSQASE---PAPKHIGRGRLLQQLLAKGVVPT 82
                 + +D  + S   P  PSP     S +     PAP                  PT
Sbjct: 177 ADPASASPSDPPSSSPGVPSFPSPPEDPSSPSDSSLPPAPSSFQS----DTPPPSPESPT 232

Query: 83  SGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
           + +P PG  AP  PP QQ+  LS +K  PPS  
Sbjct: 233 NPSPPPGPAAPPPPPVQQVPPLSTAKPTPPSAS 265



 Score = 30.4 bits (69), Expect = 4.3
 Identities = 21/99 (21%), Positives = 33/99 (33%), Gaps = 12/99 (12%)

Query: 21  EEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVV 80
           EE+E     + TT++D++         S S S P  +S   P             +    
Sbjct: 156 EEDED--ADVATTNSDNSFPGEDADPASASPSDPPSSSPGVP----------SFPSPPED 203

Query: 81  PTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFR 119
           P+S +      APSS  +        S + P   P    
Sbjct: 204 PSSPSDSSLPPAPSSFQSDTPPPSPESPTNPSPPPGPAA 242


>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein.  This family consists of
           AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
           retardation syndrome) nuclear proteins. These proteins
           have been linked to human diseases such as acute
           lymphoblastic leukaemia and mental retardation. The
           family also contains a Drosophila AF4 protein homologue
           Lilliputian which contains an AT-hook domain.
           Lilliputian represents a novel pair-rule gene that acts
           in cytoskeleton regulation, segmentation and
           morphogenesis in Drosophila.
          Length = 1154

 Score = 33.7 bits (77), Expect = 0.60
 Identities = 21/99 (21%), Positives = 34/99 (34%), Gaps = 9/99 (9%)

Query: 20  REEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGV 79
              + + P        +DT+ S  P   S + SS   +S    +         +   KG 
Sbjct: 814 SSPKPEHPSRKRPRRQEDTSSSSGPFSASSTKSSSKSSSTSKHR---------KTEGKGS 864

Query: 80  VPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYF 118
             +    G  GD P+   +  +  LS   SKP    + F
Sbjct: 865 STSKEHKGSSGDTPNKASSFPVPPLSNGSSKPRRPKLVF 903


>gnl|CDD|222010 pfam13254, DUF4045, Domain of unknown function (DUF4045).  This
           presumed domain is functionally uncharacterized. This
           domain family is found in bacteria and eukaryotes, and
           is typically between 384 and 430 amino acids in length.
          Length = 414

 Score = 32.9 bits (75), Expect = 0.70
 Identities = 25/107 (23%), Positives = 36/107 (33%), Gaps = 17/107 (15%)

Query: 22  EEEKRPGSLPTTSTDDTTGSHAPGIPS-----PSTSSPSQASEPAPKHIGRGRLLQQLLA 76
           + EK     P  + D  +   AP I +      S  +  + SE A        LL     
Sbjct: 249 DTEKSSAPKPRETLDPKSPEKAPPIDTTEEELKSPEASPKESEEASARKRSPSLL----- 303

Query: 77  KGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVY-FRGEL 122
                 S +P      P + P +  +     + KP S PV  FR  L
Sbjct: 304 ------SPSPKAESPKPLASPGKSPRDPLSPRPKPQSPPVNDFRANL 344



 Score = 31.0 bits (70), Expect = 2.7
 Identities = 20/84 (23%), Positives = 29/84 (34%), Gaps = 17/84 (20%)

Query: 20  REEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGV 79
            EEE K P + P  S + +    +P + SPS  + S     +P    R  L  +      
Sbjct: 276 TEEELKSPEASPKESEEASARKRSPSLLSPSPKAESPKPLASPGKSPRDPLSPRP----- 330

Query: 80  VPTSGTPGPGGDAPSSPPTQQMKA 103
                        P SPP    +A
Sbjct: 331 ------------KPQSPPVNDFRA 342


>gnl|CDD|146151 pfam03363, Herpes_LP, Herpesvirus leader protein. 
          Length = 177

 Score = 32.2 bits (73), Expect = 0.72
 Identities = 27/114 (23%), Positives = 41/114 (35%), Gaps = 22/114 (19%)

Query: 20  REEEEKRPGSLPTTSTDDTTGSHAPGIPSPS----------------TSSPSQASEPAPK 63
            EEEE   G  P+    D + +  P  P P                 + SP+      P+
Sbjct: 54  EEEEEVVSGP-PSGPRGDPSEAPGPSRPGPPGLGPEGPFGQLLRRRRSPSPTGGDPEGPR 112

Query: 64  HIGRGRLLQQLLAKGVVPTSGTP-GPGGDAPSSPPTQQMKALSISKSKPPSEPV 116
            + R  LL++         SG+P GP G           + L+ S  +P  +PV
Sbjct: 113 RVRRRVLLEEEEE----VVSGSPSGPQGPLIQPAARSWREWLARSGPRPEPQPV 162


>gnl|CDD|233432 TIGR01480, copper_res_A, copper-resistance protein, CopA family.
           This model represents the CopA copper resistance protein
           family. CopA is related to laccase (benzenediol:oxygen
           oxidoreductase) and L-ascorbate oxidase, both
           copper-containing enzymes. Most members have a typical
           TAT (twin-arginine translocation) signal sequence with
           an Arg-Arg pair. Twin-arginine translocation is observed
           for a large number of periplasmic proteins that cross
           the inner membrane with metal-containing cofactors
           already bound. The combination of copper-binding sites
           and TAT translocation motif suggests a mechanism of
           resistance by packaging and export [Cellular processes,
           Detoxification, Transport and binding proteins, Cations
           and iron carrying compounds].
          Length = 587

 Score = 32.9 bits (75), Expect = 0.91
 Identities = 15/74 (20%), Positives = 20/74 (27%), Gaps = 10/74 (13%)

Query: 60  PAPKHIGRGRLLQQLLAKGVVP-TSGTPGPGGDAPSSPPTQQMKA----LSISKSKPPSE 114
                  R R LQ L + G                S  P   +      L+I ++     
Sbjct: 1   SVMTAFDRRRFLQGLASGGAAAGLGLWATAAWAERSPLPESVLSGTEFDLTIGET----- 55

Query: 115 PVYFRGELGQPIKV 128
            V F G     I V
Sbjct: 56  MVNFTGRARPAITV 69


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 32.7 bits (74), Expect = 1.1
 Identities = 25/103 (24%), Positives = 35/103 (33%), Gaps = 9/103 (8%)

Query: 27  PGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVV----PT 82
           P  LP+        + +   P P   S      P   H G G  +   L +G V    P+
Sbjct: 234 PQRLPSPHPPLQPQTASQQSPQPPAPSSRH---PQSSHHGPGPPMPHALQQGPVFLQHPS 290

Query: 83  SGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRGELGQP 125
           S  P P G A S  P   + + +   S  P      +    QP
Sbjct: 291 SNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQP--QQP 331


>gnl|CDD|224348 COG1431, COG1431, Argonaute homolog, implicated in RNA metabolism
           [Translation, ribosomal structure and biogenesis].
          Length = 685

 Score = 32.2 bits (73), Expect = 1.6
 Identities = 62/371 (16%), Positives = 116/371 (31%), Gaps = 61/371 (16%)

Query: 525 LNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVIIFNAP--RTDRYQ 582
           +    +NSN I  +V+   +    +   +     L  +  +     +          +Y 
Sbjct: 365 VVYGFKNSNGIDWKVEGLTLHVAGKRPKMK--DDLTKIIKEIDVEELKKQEMYKDDVKYA 422

Query: 583 AVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSV---QIPYDCA 639
            +K+    +  IPSQVI      +  K       +A +   K  G  +       P D  
Sbjct: 423 ILKRL---DETIPSQVILDPNNRKPYK--GTKTNLASKRYLKTLGQPYLKRNGLGPVD-- 475

Query: 640 MVIGIDVYHEGVGSQGQNIVGLVASTNKDFTTYYSQAVIQRRGQEITDSIAQPFKQALDR 699
            ++G+DV      S+G   V    S       + S+  ++     +T ++ +  + +   
Sbjct: 476 AIVGLDV---SRVSEGNWTVEGCTSC------FVSEGGLEEYYHTVTPALGERLETSGRY 526

Query: 700 FIQANSVPPK---QIFIFRDG-VSDGQLDSVSRVEIDQYQQIVDTIMTTLPSCSYAPKIT 755
             + N    +    I   RDG +  G++ +V                             
Sbjct: 527 LEKMNWRGFESRNLIVTLRDGKLVAGEIAAVKEY-----------------GGELGSNPE 569

Query: 756 AIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVSQHVRQGTVTPS 815
              + K  N       +       A         H       S        VR+GT  P 
Sbjct: 570 VNRILK--NNPWV--FAIEGEIWGAFVRLDGSTVH----LCCSP----YNPVRRGTPRP- 616

Query: 816 HYIVLRNDNNVKVDHLQRLSYKLCHLYYN--WPGTIRVPAVCQYAHRIAYLTGMHLQRLP 873
             I LR  +      L  L + L  + Y+       R+PA   YA + + L    +   P
Sbjct: 617 --IALRRRDGKLDGELIGLVHDLTAMNYSNPSGTWSRLPAPVHYADKASKLARYGVSIGP 674

Query: 874 SDVLSDKLFYL 884
            D +S++ + +
Sbjct: 675 GDPVSERPYPV 685


>gnl|CDD|234547 TIGR04330, cas_Cpf1, CRISPR-associated protein Cpf1, subtype
           PREFRAN.  This family is the long protein of a novel
           CRISPR subtype, PREFRAN, which is most common in
           Prevotella and Francisella, although widely distributed.
           The PREFRAN type has Cas1, Cas2, and Cas4, but lacks the
           helicase Cas3 and endonuclease Cas3-HD.
          Length = 1287

 Score = 32.1 bits (73), Expect = 1.6
 Identities = 30/166 (18%), Positives = 60/166 (36%), Gaps = 17/166 (10%)

Query: 304 GNFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKRGEPKSYVDYYREAYNIE 363
           G+F   ++ + + EI    Y N       ID     +    K       ++ Y   YN +
Sbjct: 218 GDFELYVAVSELDEIFSLDYYNNVLSQSGIDSYNAIIGGIMKNDAKIKGLNEYINLYNQK 277

Query: 364 IRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGLSDAMRNDFQVMKSIASFT 423
           I+DK   +   +   K               I+ +    + L D   +D +V+K+I  F 
Sbjct: 278 IKDKKLELPKLKQLHKQ--------------ILSDREAKSFLPDMFEDDSEVVKAIKEFY 323

Query: 424 RVDPNQKLQAISKYINNVNNNKETSELLKGWGLTLNKSMETLNARI 469
                Q    +   +  +    +  + LKG  +  +  + TL+ ++
Sbjct: 324 EQTLEQG--NVIGKLKTLLEKLDKLD-LKGIYIRNDNQLTTLSQQV 366


>gnl|CDD|237871 PRK14965, PRK14965, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 576

 Score = 32.0 bits (73), Expect = 1.7
 Identities = 28/138 (20%), Positives = 45/138 (32%), Gaps = 8/138 (5%)

Query: 7   RGAALK---QILEAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPK 63
           + A L     + E   R E  +R    P ++        AP  P P+ + P   + PA  
Sbjct: 357 KMATLAPGAPVSELLDRLEALERGAPAPPSAAWGAPTPAAPAAPPPAAAPPVPPAAPARP 416

Query: 64  HIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRGELG 123
              R             P + +  P   A S+    +     +   KP         E G
Sbjct: 417 AAARPA-PAPAPPAAAAPPARSADP-AAAASAGDRWRAFVAFVKGKKPALGASL---EQG 471

Query: 124 QPIKVMVNYIDLSVKEGS 141
            P+ V    +++   EGS
Sbjct: 472 SPLGVSAGLLEIGFPEGS 489


>gnl|CDD|236733 PRK10672, PRK10672, rare lipoprotein A; Provisional.
          Length = 361

 Score = 31.6 bits (72), Expect = 2.0
 Identities = 17/75 (22%), Positives = 24/75 (32%), Gaps = 13/75 (17%)

Query: 26  RPGSLPTTSTDDTTG-----SHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVV 80
            P S  T  ++D TG     S   G P+       + SEP P             A    
Sbjct: 220 LPVSNSTLKSEDPTGAPVTSSGFLGAPTTLAPGVLEGSEPTPTAPS--------SAPATA 271

Query: 81  PTSGTPGPGGDAPSS 95
           P +  P     + S+
Sbjct: 272 PAAAAPQAAATSSSA 286


>gnl|CDD|202276 pfam02541, Ppx-GppA, Ppx/GppA phosphatase family.  This family
           consists of the N-terminal region of exopolyphosphatase
           (Ppx) EC:3.6.1.11 and guanosine pentaphosphate
           phospho-hydrolase (GppA) EC:3.6.1.40.
          Length = 285

 Score = 31.1 bits (71), Expect = 2.4
 Identities = 19/59 (32%), Positives = 29/59 (49%), Gaps = 6/59 (10%)

Query: 513 HIRRDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVI 571
              RD  NAD FL    R     G+ V   ++I+ +EE  L YL  + ++ S  + +VI
Sbjct: 65  SALRDAVNADEFLA---RVKKETGLPV---EIISGEEEARLIYLGVVSTLPSKGRGLVI 117


>gnl|CDD|202558 pfam03159, XRN_N, XRN 5'-3' exonuclease N-terminus.  This family
           aligns residues towards the N-terminus of several
           proteins with multiple functions. The members of this
           family all appear to possess 5'-3' exonuclease activity
           EC:3.1.11.-. Thus, the aligned region may be necessary
           for 5' to 3' exonuclease function. The family also
           contains several Xrn1 and Xrn2 proteins. The 5'-3'
           exoribonucleases Xrn1p and Xrn2p/Rat1p function in the
           degradation and processing of several classes of RNA in
           Saccharomyces cerevisiae. Xrn1p is the main enzyme
           catalyzing cytoplasmic mRNA degradation in multiple
           decay pathways, whereas Xrn2p/Rat1p functions in the
           processing of rRNAs and small nucleolar RNAs (snoRNAs)
           in the nucleus.
          Length = 237

 Score = 30.8 bits (70), Expect = 2.6
 Identities = 32/110 (29%), Positives = 44/110 (40%), Gaps = 21/110 (19%)

Query: 219 ENLMFYNILFRKIAFLLSMVQFKDCLY---D---PRSKLLIPQYKLEVWPGFVTAIDEYE 272
           E+ MF  I F  I  L ++V+ +  LY   D   PR+K+   Q        F  A D  E
Sbjct: 56  EDEMFVAI-FEYIDRLFNIVRPRKLLYMAIDGVAPRAKM-NQQRSRR----FRAAKDAKE 109

Query: 273 GGLKLQIDTSCRVLRTSTCLDLIDELKEKFGGN-------FMERLSQALI 315
              + + +     L T          KEKF  N       FM RL++AL 
Sbjct: 110 KEAEAEEN--REELETEGIKLPEKVEKEKFDSNCITPGTPFMARLAKALR 157


>gnl|CDD|219741 pfam08193, INO80_Ies4, INO80 complex subunit Ies4.  The INO80
           ATPase is a member of the SNF2 family of ATPases and
           functions as an integral component of a multisubunit
           ATP-dependent chromatin remodelling complex. This family
           of proteins corresponds to the fungal Ies4 subunit of
           INO80.
          Length = 228

 Score = 30.7 bits (69), Expect = 2.8
 Identities = 19/75 (25%), Positives = 26/75 (34%), Gaps = 10/75 (13%)

Query: 24  EKRPGSLPTTSTDDT-------TGSHAPGIPSPSTSS---PSQASEPAPKHIGRGRLLQQ 73
           E  P   P +S  D          S A   P+  TS+   P +   P PK   +    Q 
Sbjct: 43  EDEPSDSPASSAADPPVPSSVDNASDAASTPAAGTSATDTPRRKGGPGPKKGEKRSAGQG 102

Query: 74  LLAKGVVPTSGTPGP 88
             ++      G PGP
Sbjct: 103 TTSETTSKPRGKPGP 117


>gnl|CDD|223326 COG0248, GppA, Exopolyphosphatase [Nucleotide transport and
           metabolism / Inorganic ion transport and metabolism].
          Length = 492

 Score = 31.1 bits (71), Expect = 2.9
 Identities = 17/56 (30%), Positives = 26/56 (46%), Gaps = 6/56 (10%)

Query: 516 RDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVI 571
           RD  N D FL    R    +G+ +   +VI+ +EE  L YL    ++      +VI
Sbjct: 85  RDAPNGDEFLA---RVEKELGLPI---EVISGEEEARLIYLGVASTLPRKGDGLVI 134


>gnl|CDD|221759 pfam12764, Gly-rich_Ago1, Glycine-rich region of argonaut.  This
           domain is often found at the very N-terminal of
           argonaut-like proteins.
          Length = 102

 Score = 29.2 bits (65), Expect = 3.0
 Identities = 20/85 (23%), Positives = 27/85 (31%), Gaps = 14/85 (16%)

Query: 40  GSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPP-- 97
           G    G      S+        P+      L Q   A      S  P P   + SS P  
Sbjct: 19  GRGGGGGGRGGGSTGGPPRPSVPE------LHQATQAPYQAAVSTQPAPSEASSSSQPPE 72

Query: 98  ------TQQMKALSISKSKPPSEPV 116
                 TQQ + LSI +    S+ +
Sbjct: 73  SSSLQVTQQFQQLSIQQEASSSQAI 97


>gnl|CDD|204078 pfam08832, SRC-1, Steroid receptor coactivator.  This domain is
          found in steroid/nuclear receptor coactivators and
          contains two LXXLL motifs that are involved in receptor
          binding. The family includes SRC-1/NcoA-1, NcoA-2/TIF2,
          pCIP/ACTR/GRIP-1/AIB1.
          Length = 78

 Score = 28.7 bits (64), Expect = 3.1
 Identities = 20/71 (28%), Positives = 29/71 (40%), Gaps = 7/71 (9%)

Query: 11 LKQILEAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRL 70
          L Q+L  K    E   P  + ++ TD        G+ S +   PS  S    KH    ++
Sbjct: 6  LLQLLTTKTEPLE---PPLMASSDTDCKDSLGVTGVSSSTGGCPSSHSSLKEKH----KI 58

Query: 71 LQQLLAKGVVP 81
          L +LL  G  P
Sbjct: 59 LHRLLQNGSSP 69


>gnl|CDD|139494 PRK13335, PRK13335, superantigen-like protein; Reviewed.
          Length = 356

 Score = 30.5 bits (68), Expect = 4.5
 Identities = 27/131 (20%), Positives = 41/131 (31%), Gaps = 25/131 (19%)

Query: 17  AKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLA 76
           A  R+E   +    P T+ + T+ S    I  P        +  A       +   Q   
Sbjct: 66  ANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEEQKSLNISATP--APKQEQSQTTT 123

Query: 77  KGVVPT--------SGTPGP----GGDAPSSPPTQQ--------MKALSISKSKPPSEPV 116
           +   P         + TP P      D P SP  +Q         + L    +KP  E  
Sbjct: 124 ESTTPKTKVTTPPSTNTPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFE-- 181

Query: 117 YFRGELGQPIK 127
            F  + G  +K
Sbjct: 182 -FEKQFGFLLK 191


>gnl|CDD|177464 PHA02682, PHA02682, ORF080 virion core protein; Provisional.
          Length = 280

 Score = 30.2 bits (67), Expect = 5.1
 Identities = 21/91 (23%), Positives = 35/91 (38%), Gaps = 15/91 (16%)

Query: 39  TGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPT 98
             + AP  P+P+ + P+ A    P             A    P +  P P   AP+ PP+
Sbjct: 95  CPACAPAAPAPAVTCPAPAPACPPA-----------TAPTCPPPAVCPAPARPAPACPPS 143

Query: 99  QQM----KALSISKSKPPSEPVYFRGELGQP 125
            +       L   K  P ++P++   +L  P
Sbjct: 144 TRQCPPAPPLPTPKPAPAAKPIFLHNQLPPP 174


>gnl|CDD|220093 pfam09030, Creb_binding, Creb binding.  The Creb binding domain
           assumes a structure comprising of three alpha-helices
           which pack in a bundle, exposing a hydrophobic groove
           between alpha-1 and alpha-3 within which complementary
           domains found in the protein 'activator for thyroid
           hormone and retinoid receptors' (ACTR) can dock. Docking
           of these domains is required for the recruitment of RNA
           polymerase II and the basal transcription machinery.
          Length = 104

 Score = 28.5 bits (63), Expect = 5.5
 Identities = 21/74 (28%), Positives = 31/74 (41%), Gaps = 15/74 (20%)

Query: 44  PGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGD------APSSPP 97
           PG+P P     +Q +   P+          L+  G+     +P    D      +PSSP 
Sbjct: 19  PGMPRPVMQMVAQHAVAGPR--------PGLVQPGISRGIVSPNALQDLLRTLKSPSSP- 69

Query: 98  TQQMKALSISKSKP 111
            QQ + L+I KS P
Sbjct: 70  QQQQQVLNILKSNP 83


>gnl|CDD|151482 pfam11035, SnAPC_2_like, Small nuclear RNA activating complex
           subunit 2-like.  This family of proteins is SnAPC
           subunit 2-like. SnAPC allows the transcription of human
           small nuclear RNA genes to occur by recognition of the
           proximal sequence element.
          Length = 344

 Score = 30.0 bits (67), Expect = 5.7
 Identities = 15/54 (27%), Positives = 18/54 (33%), Gaps = 7/54 (12%)

Query: 28  GSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVP 81
           G LP  + D   GS  P           QAS  A +  G         A G+ P
Sbjct: 279 GKLPPGTEDGGAGSTGP-------EETDQASPQASEPAGSSEPRSAWQAAGICP 325


>gnl|CDD|139048 PRK12538, PRK12538, RNA polymerase sigma factor; Provisional.
          Length = 233

 Score = 29.8 bits (67), Expect = 5.8
 Identities = 22/76 (28%), Positives = 36/76 (47%), Gaps = 18/76 (23%)

Query: 718 VSDGQLDSVSRVEIDQYQQIVDTIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERP 777
           V+DG+ D+VS +E ++   +++  M  LP             Q+RI      +LS  E  
Sbjct: 145 VADGKPDAVSVIERNELSDLLEAAMQRLPE------------QQRIAV----ILSYHE-- 186

Query: 778 NLANAPSGSVLDHTVT 793
           N++N     V+D TV 
Sbjct: 187 NMSNGEIAEVMDTTVA 202


>gnl|CDD|221825 pfam12877, DUF3827, Domain of unknown function (DUF3827).  This
           family contains the human KIAA1549 protein which has
           been found to be fused to the BRAF gene in many cases
           of pilocytic astrocytomas. The fusion is due mainly to a
           tandem duplication of 2 Mb at 7q34. Although nothing is
           known about the function of KIAA1549 protein, the BRAF
           protein is a well characterized oncoprotein. It is a
           serine/threonine protein kinase which is implicated in
           MAP/ERK signalling, a critical pathway for the
           regulation of cell division, differentiation and
           secretion.
          Length = 684

 Score = 29.9 bits (67), Expect = 6.5
 Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 7/73 (9%)

Query: 48  SPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSIS 107
             S  S   ++  + +  GR      + A+     +   G    AP S   +Q+ + SI 
Sbjct: 389 GDSEGSSVISNRSSREKSGRPSTTPSVTAQQKP--TKEEGRKKPAPPSGTDEQLSSASIF 446

Query: 108 K-----SKPPSEP 115
           +     S+P S+P
Sbjct: 447 EHVDRLSRPSSDP 459


>gnl|CDD|223061 PHA03369, PHA03369, capsid maturational protease; Provisional.
          Length = 663

 Score = 30.0 bits (67), Expect = 6.5
 Identities = 23/87 (26%), Positives = 30/87 (34%), Gaps = 2/87 (2%)

Query: 31  PTTSTDDTTGSHAPGIP-SPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPG 89
            T   D        GIP S    SP  A  P P+  G   L+     +    TS  P P 
Sbjct: 375 HTGPADRQRPQRPDGIPYSVPARSPMTAYPPVPQFCGDPGLVSPYNPQSPG-TSYGPEPV 433

Query: 90  GDAPSSPPTQQMKALSISKSKPPSEPV 116
           G  P  P    +  +S++    P  P 
Sbjct: 434 GPVPPQPTNPYVMPISMANMVYPGHPQ 460


>gnl|CDD|218408 pfam05062, RICH, RICH domain.  This presumed domain is about 85
           residues in length and very rich in charged residues,
           hence the name RICH (Rich In CHarged residues). It is
           found in secreted proteins such as PspC, SpsA, and IgA
           FC receptor from Streptococcus agalactiae. This domain
           could be involved in bacterial adherence or cell wall
           binding.
          Length = 81

 Score = 27.7 bits (62), Expect = 7.0
 Identities = 9/27 (33%), Positives = 15/27 (55%), Gaps = 1/27 (3%)

Query: 425 VDPNQKLQAI-SKYINNVNNNKETSEL 450
           V   +KL AI ++Y+  ++  K   EL
Sbjct: 35  VALIKKLSAIKTEYLYELDVLKTKVEL 61


>gnl|CDD|221371 pfam12004, DUF3498, Domain of unknown function (DUF3498).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 433 to 538 amino acids in length. This domain is
           found associated with pfam00616, pfam00168. This domain
           has two conserved sequence motifs: DLQ and PLSFQNP.
          Length = 489

 Score = 29.7 bits (66), Expect = 7.3
 Identities = 26/121 (21%), Positives = 42/121 (34%), Gaps = 16/121 (13%)

Query: 21  EEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGR-------LLQQ 73
           EE  +R               H P +P  +++ P +  +      G          LL  
Sbjct: 225 EEFTRRSTDFTRRQLSLPDRQHQPALPRQNSAGPQRRVDQPSPPGGGSHRGRIPPSLLSS 284

Query: 74  LLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRGELGQPIKVMVNYI 133
           L ++G + +S  P  G     + P QQ    S S      E     G L QP  V ++ +
Sbjct: 285 LPSEGSMLSSEWPQSG-----ARPRQQ----SSSSKGDSPELRPAAGHLQQPSPVNMSAL 335

Query: 134 D 134
           +
Sbjct: 336 E 336


>gnl|CDD|220582 pfam10117, McrBC, McrBC 5-methylcytosine restriction system
           component.  Members of this family of bacterial proteins
           modify the specificity of mcrB restriction by expanding
           the range of modified sequences restricted.
          Length = 317

 Score = 29.5 bits (67), Expect = 8.4
 Identities = 18/89 (20%), Positives = 30/89 (33%), Gaps = 8/89 (8%)

Query: 505 NFDQWVLIHIRRDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRS 564
           N        +R D+   DN LN L +  +A+   +K  +         L  L  L     
Sbjct: 109 NPGHKHKFFVRYDEFTEDNPLNRLLK--SALERLLKLTRSSENLRL--LRELLFLLDEVP 164

Query: 565 DTQFVVIIFNAPRTDR----YQAVKKYCC 589
           D++     F   R +R    Y+ +     
Sbjct: 165 DSKISAKDFQKWRLNRLNARYELLLPLAR 193


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.135    0.393 

Gapped
Lambda     K      H
   0.267   0.0717    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 44,914,723
Number of extensions: 4436124
Number of successful extensions: 3984
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3912
Number of HSP's successfully gapped: 68
Length of query: 884
Length of database: 10,937,602
Length adjustment: 105
Effective length of query: 779
Effective length of database: 6,280,432
Effective search space: 4892456528
Effective search space used: 4892456528
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 63 (28.1 bits)