RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11920
(884 letters)
>gnl|CDD|240016 cd04658, Piwi_piwi-like_Euk, Piwi_piwi-like_Euk: PIWI domain,
Piwi-like subfamily found in eukaryotes. This domain is
found in Piwi and closely related proteins, where it is
believed to perform a crucial role in germline cells,
via RNA silencing. RNA silencing refers to a group of
related gene-silencing mechanisms mediated by short RNA
molecules, including siRNAs, miRNAs, and
heterochromatin-related guide RNAs. The mechanism in
Piwi is believed to be similar to that in Argonaute, the
central component of the RNA-induced silencing complex
(RISC). The PIWI domain is the C-terminal portion of
Argonaute and consists of two subdomains, one of which
provides the 5' anchoring of the guide RNA and the
other, the catalytic site for slicing.
Length = 448
Score = 501 bits (1291), Expect = e-170
Identities = 205/459 (44%), Positives = 293/459 (63%), Gaps = 16/459 (3%)
Query: 414 QVMKSIASFTRVDPNQKLQAISKYINNVNNNKETSELLKGWGLTLNKSMETLNARILPVE 473
+MK +A T+++P ++ I ++I + N ELLK WG+ L+ + + R+LP E
Sbjct: 1 NLMKELAEHTKLNPKERYDTIRQFIQRIQKNPSVQELLKKWGIELDSNPLKIQGRVLPPE 60
Query: 474 KIYMGNNFVAPGSQEADWNRQVGTNPALTVVNFDQWVLIHIRRDQRNADNFLNCLNRNSN 533
+I MGN FV + ADW R++ P VN + WVLI+ RDQR A++FL L + +
Sbjct: 61 QIIMGNVFV-YANSNADWKREIRNQPLYDAVNLNNWVLIYPSRDQREAESFLQTLKQVAG 119
Query: 534 AIGIRVKKPQVIALQEEQTLSYLTALK-SMRSDTQFVVIIFNAPRTDRYQAVKKYCCCER 592
+GI++ P++I +++++ +Y+ ALK + RSD Q VVII + D Y A+KK+CC E
Sbjct: 120 PMGIQISPPKIIKVKDDRIETYIRALKDAFRSDPQLVVIILPGNKKDLYDAIKKFCCVEC 179
Query: 593 PIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSVQIPY---DCAMVIGIDVYHE 649
P+PSQVI SRT+ ++ ++SI KIALQIN KLGG W+V+IP M++GIDVYH+
Sbjct: 180 PVPSQVITSRTLKKKKNLRSIASKIALQINAKLGGIPWTVEIPPFILKNTMIVGIDVYHD 239
Query: 650 GVGSQGQNIVGLVASTNKDFTTYYSQAVIQRRGQE-ITDSIAQPFKQALDRFIQANSVPP 708
+ + +VG VAS NK T ++S+ + Q RGQE I DS+ + K+AL + + N P
Sbjct: 240 TITKKKS-VVGFVASLNKSITKWFSKYISQVRGQEEIIDSLGKSMKKALKAYKKENKKLP 298
Query: 709 KQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTTLPSCSYAPKITAIIVQKRINTKIF 768
+I I+RDGV DGQL V E+ Q ++ + S +Y+PK+ I+V KRINT+ F
Sbjct: 299 SRIIIYRDGVGDGQLKKVKEYEVPQIKKAIKQY-----SENYSPKLAYIVVNKRINTRFF 353
Query: 769 QLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVSQHVRQGTVTPSHYIVLRNDNNVKV 828
N +N P G+V+D +T+ DFFLVSQ VRQGTVTP+HY VL + +K
Sbjct: 354 N----QGGNNFSNPPPGTVVDSEITKPEWYDFFLVSQSVRQGTVTPTHYNVLYDTTGLKP 409
Query: 829 DHLQRLSYKLCHLYYNWPGTIRVPAVCQYAHRIAYLTGM 867
DHLQRL+YKLCHLYYNW G+IRVPA CQYAH++A+L G
Sbjct: 410 DHLQRLTYKLCHLYYNWSGSIRVPAPCQYAHKLAFLVGQ 448
>gnl|CDD|214930 smart00950, Piwi, This domain is found in the protein Piwi and its
relatives. The function of this domain is the dsRNA
guided hydrolysis of ssRNA. Determination of the crystal
structure of Argonaute reveals that PIWI is an RNase H
domain, and identifies Argonaute as Slicer, the enzyme
that cleaves mRNA in the RNAi RISC complex. In
addition, Mg+2 dependence and production of 3'-OH and 5'
phosphate products are shared characteristics of RNaseH
and RISC. The PIWI domain core has a tertiary structure
belonging to the RNase H family of enzymes. RNase H fold
proteins all have a five-stranded mixed beta-sheet
surrounded by helices. By analogy to RNase H enzymes
which cleave single-stranded RNA guided by the DNA
strand in an RNA/DNA hybrid, the PIWI domain can be
inferred to cleave single-stranded RNA, for example
mRNA, guided by double stranded siRNA.
Length = 301
Score = 297 bits (762), Expect = 8e-94
Identities = 105/311 (33%), Positives = 159/311 (51%), Gaps = 19/311 (6%)
Query: 568 FVVIIFNAPRTDRYQAVKKYCCCERPIPSQVINSRTISRE---DKMKSIVMKIALQINCK 624
VVI+ +TD Y +KKY + +P+Q + ++T+ + K+K + +AL+IN K
Sbjct: 2 IVVILPGEKKTDLYHEIKKYLETKLGVPTQCVQAKTLDKVSKRRKLKQYLTNVALKINAK 61
Query: 625 LGGSLWSV---QIPYDCAMVIGIDVYHEGVGSQGQNIVGLVASTNKDFTTYYSQAVIQ-R 680
LGG W + IP ++IGIDV H G G + VA+ Y S Q
Sbjct: 62 LGGINWVLDVPPIPLKPTLIIGIDVSHPSAGKGGS-VAPSVAAFVAS-GNYLSGNFYQAF 119
Query: 681 RGQEITDSIAQPFKQALDRFIQAN-SVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVD 739
++ + + + ++AL ++ ++N P +I ++RDGVS+GQ V E+ + I
Sbjct: 120 VREQGSRQLKEILREALKKYYKSNRKRLPDRIVVYRDGVSEGQFKQVLEYEV---KAIKK 176
Query: 740 TIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSD 799
P Y PK+T I+VQKR +T+ F + N P G+V+D +T D
Sbjct: 177 ACKELGPD--YKPKLTVIVVQKRHHTRFF----PEDGNGRVNVPPGTVVDSVITSPEWYD 230
Query: 800 FFLVSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVCQYAH 859
F+LVS QGT P+HY VL ++ N+ D LQRL+YKLCHLYY + +PA YAH
Sbjct: 231 FYLVSHAGLQGTARPTHYTVLYDEGNLDPDELQRLTYKLCHLYYRSTRPVSLPAPVYYAH 290
Query: 860 RIAYLTGMHLQ 870
+A L
Sbjct: 291 LLAKRARQLLH 301
>gnl|CDD|216915 pfam02171, Piwi, Piwi domain. This domain is found in the protein
Piwi and its relatives. The function of this domain is
the dsRNA guided hydrolysis of ssRNA. Determination of
the crystal structure of Argonaute reveals that PIWI is
an RNase H domain, and identifies Argonaute as Slicer,
the enzyme that cleaves mRNA in the RNAi RISC complex.
In addition, Mg+2 dependence and production of 3'-OH and
5' phosphate products are shared characteristics of
RNaseH and RISC. The PIWI domain core has a tertiary
structure belonging to the RNase H family of enzymes.
RNase H fold proteins all have a five-stranded mixed
beta-sheet surrounded by helices. By analogy to RNase H
enzymes which cleave single-stranded RNA guided by the
DNA strand in an RNA/DNA hybrid, the PIWI domain can be
inferred to cleave single-stranded RNA, for example
mRNA, guided by double stranded siRNA.
Length = 296
Score = 269 bits (689), Expect = 2e-83
Identities = 103/305 (33%), Positives = 161/305 (52%), Gaps = 12/305 (3%)
Query: 567 QFVVIIFNAPRTDRYQAVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLG 626
VVI+ D Y ++KKY E IP+Q I +T + K + + L+IN KLG
Sbjct: 1 LIVVIL-PDKNKDNYHSIKKYLETELGIPTQCIRLKTALKRTKP-QTLTNVLLKINVKLG 58
Query: 627 GS-LWSVQIPYDCAMVIGIDVYHEGVG-SQGQNIVGLVASTNKDFTTYYSQAVIQRRGQE 684
G W V+IP ++IG D+ H G ++VG VAS +K Y Q GQE
Sbjct: 59 GLNYWIVEIPPKIDVIIGFDISHGTGGTDDNPSVVGFVASMDKHPQKYAGGVRYQASGQE 118
Query: 685 ITDSIAQPFKQALDRFIQANSVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTT 744
+ + + + ++L F ++ P++I ++RDGVS+GQ V E++Q ++ +
Sbjct: 119 LIEPLKEIILESLRSFYKSRKKKPERIIVYRDGVSEGQFPQVLNYEVNQIKEACKEL--- 175
Query: 745 LPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVS 804
Y PK+T I+VQKR +T+ F +G+R N P G+V+D +T DF+L S
Sbjct: 176 --GEGYRPKLTVIVVQKRHHTRFFA---SGKRDGAQNPPPGTVVDDVITSPEYYDFYLCS 230
Query: 805 QHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVCQYAHRIAYL 864
RQGTV P+ Y VL ++ + + LQ+L+YKLC++Y + +PA YAH++A
Sbjct: 231 HAGRQGTVKPTKYTVLYDEIGLSPEELQQLTYKLCYMYQRVFRPVSLPAPVYYAHKLAKR 290
Query: 865 TGMHL 869
+L
Sbjct: 291 GRNNL 295
>gnl|CDD|240015 cd04657, Piwi_ago-like, Piwi_ago-like: PIWI domain, Argonaute-like
subfamily. Argonaute is the central component of the
RNA-induced silencing complex (RISC) and related
complexes. The PIWI domain is the C-terminal portion of
Argonaute and consists of two subdomains, one of which
provides the 5' anchoring of the guide RNA and the
other, the catalytic site for slicing.
Length = 426
Score = 244 bits (624), Expect = 2e-72
Identities = 120/433 (27%), Positives = 191/433 (44%), Gaps = 35/433 (8%)
Query: 451 LKGWGLTLNKSMETLNARILPVEKI-YMGNNFVAPGSQEADWN------RQVGTNPALTV 503
LK +G++++K M T+ R+LP K+ Y ++ P WN + G + V
Sbjct: 3 LKEFGISVSKEMITVPGRVLPPPKLKYGDSSKTVPPRN-GSWNLRGKKFLEGGPIRSWAV 61
Query: 504 VNFDQWVLIHIRRDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMR 563
+NF R NF++ L + GI IA E + LK +
Sbjct: 62 LNFAGPRRSREERADLR--NFVDQLVKTVIGAGIN--ITTAIASVEGRVEELFAKLKQAK 117
Query: 564 SD-TQFVVIIFNAPRTDRYQAVKKYCCCERPIPSQVINSRTISREDK---MKSIVMKIAL 619
+ Q V++I +D Y +K+ E I +Q + ++ ++++ ++ +KI
Sbjct: 118 GEGPQLVLVILPKKDSDIYGRIKRLADTELGIHTQCVLAKKVTKKGNPQYFANVALKI-- 175
Query: 620 QINCKLGGSLWSVQIPYDCA------MVIGIDVYHEGVGSQGQ--NIVGLVASTNKDFTT 671
N KLGG S++ MV+G DV H G +I +VAS +
Sbjct: 176 --NLKLGGINHSLEPDIRPLLTKEPTMVLGADVTHPSPGDPAGAPSIAAVVASVDWHLAQ 233
Query: 672 YYSQAVIQRRGQEITDSIAQPFKQALDRFIQANSVPPKQIFIFRDGVSDGQLDSVSRVEI 731
Y + +Q QEI D + ++ L F +A P++I +RDGVS+GQ V E+
Sbjct: 234 YPASVRLQSHRQEIIDDLESMVRELLRAFKKATGKLPERIIYYRDGVSEGQFAQVLNEEL 293
Query: 732 DQYQQIVDTIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLA-NAPSGSVLDH 790
++ + Y PKIT I+VQKR +T+ F + N P G+V+D
Sbjct: 294 PAIRKACAKLY-----PGYKPKITFIVVQKRHHTRFF-PTDEDDADGKNGNVPPGTVVDR 347
Query: 791 TVTRKTLSDFFLVSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIR 850
+T DF+L S QGT P+HY VL ++ D LQ L+Y LC+ Y ++
Sbjct: 348 GITHPREFDFYLCSHAGIQGTARPTHYHVLWDEIGFTADELQTLTYNLCYTYARCTRSVS 407
Query: 851 VPAVCQYAHRIAY 863
+P YAH A
Sbjct: 408 IPPPAYYAHLAAA 420
>gnl|CDD|239208 cd02826, Piwi-like, Piwi-like: PIWI domain. Domain found in
proteins involved in RNA silencing. RNA silencing refers
to a group of related gene-silencing mechanisms mediated
by short RNA molecules, including siRNAs, miRNAs, and
heterochromatin-related guide RNAs. The central
component of the RNA-induced silencing complex (RISC)
and related complexes is Argonaute. The PIWI domain is
the C-terminal portion of Argonaute and consists of two
subdomains, one of which provides the 5' anchoring of
the guide RNA and the other, the catalytic site for
slicing. This domain is also found in closely related
proteins, including the Piwi subfamily, where it is
believed to perform a crucial role in germline cells,
via a similar mechanism.
Length = 393
Score = 225 bits (576), Expect = 5e-66
Identities = 109/414 (26%), Positives = 178/414 (42%), Gaps = 42/414 (10%)
Query: 463 ETLNARILPVEKIYMGNNFVAPGSQEADWNRQVGTN--PALTVVNFDQWVLIHIRRDQ-R 519
L R+LP +I N + R +G PA + +I R ++
Sbjct: 3 LILKGRVLPKPQILFKNK----------FLRNIGPFEKPAKI---TNPVAVIAFRNEEVD 49
Query: 520 NADNFLNCL-NRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRS-DTQFVVIIFNAPR 577
+ L + I + I + K+ Q V+ I +
Sbjct: 50 DLVKRLADACRQLGMKIK-EIPIVSWIEDLNNSFKDLKSVFKNAIKAGVQLVIFILKEKK 108
Query: 578 TDRYQAVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSVQIP-- 635
+ +K+ + IPSQVI +T + ++K + + ++N KLGG + + P
Sbjct: 109 PPLHDEIKRLEA-KSDIPSQVIQLKTAKKMRRLKQTLDNLLRKVNSKLGGINYILDSPVK 167
Query: 636 -YDCAMVIGIDVYHEGVGSQGQN--IVGLVAST-NKDFTTYYSQAVIQR--RGQEITDSI 689
+ + IG DV H + VG A+ N F + R + Q++ +
Sbjct: 168 LFKSDIFIGFDVSHPDRRTVNGGPSAVGFAANLSNHTFLGGFLYVQPSREVKLQDLGEV- 226
Query: 690 AQPFKQALDRFIQAN-SVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTTLPSC 748
K+ LD F ++ P++I I+RDGVS+G+ V ++I+ S
Sbjct: 227 ---IKKCLDGFKKSTGEGLPEKIVIYRDGVSEGEFKRVKEE----VEEIIKEACEIEES- 278
Query: 749 SYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVSQHVR 808
Y PK+ I+VQKR NT+ F + + N G+V+DHT+T LS+F+L S R
Sbjct: 279 -YRPKLVIIVVQKRHNTRFFP---NEKNGGVQNPEPGTVVDHTITSPGLSEFYLASHVAR 334
Query: 809 QGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVCQYAHRIA 862
QGTV P+ Y V+ ND N ++ L+ L+Y LC + N I +PA YAH++A
Sbjct: 335 QGTVKPTKYTVVFNDKNWSLNELEILTYILCLTHQNVYSPISLPAPLYYAHKLA 388
>gnl|CDD|198017 smart00949, PAZ, This domain is named PAZ after the proteins Piwi
Argonaut and Zwille. This domain is found in two
families of proteins that are involved in
post-transcriptional gene silencing. These are the Piwi
family and the Dicer family, that includes the Carpel
factory protein. The function of the domains is unknown
but has been suggested to mediate complex formation
between proteins of the Piwi and Dicer families by
hetero-dimerisation. The three-dimensional structure of
this domain has been solved. The PAZ domain is composed
of two subdomains. One subdomain is similar to the OB
fold, albeit with a different topology. The OB-fold is
well known as a single-stranded nucleic acid binding
fold. The second subdomain is composed of a beta-hairpin
followed by an alpha-helix. The PAZ domain shows
low-affinity nucleic acid binding and appears to
interact with the 3' ends of single-stranded regions of
RNA in the cleft between the two subdomains. PAZ can
bind the characteristic two-base 3' overhangs of siRNAs,
indicating that although PAZ may not be a primary
nucleic acid binding site in Dicer or RISC, it may
contribute to the specific and productive incorporation
of siRNAs and miRNAs into the RNAi pathway.
Length = 138
Score = 152 bits (386), Expect = 4e-43
Identities = 65/140 (46%), Positives = 88/140 (62%), Gaps = 6/140 (4%)
Query: 290 TCLDLI-DELKEKFGGNFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR-G 347
T LD + + NF +R ++ L G IVLTRYNN+TYRID+ID+ P STF K G
Sbjct: 2 TVLDFMRQLPSQGNRSNFQDRCAKDLKGLIVLTRYNNKTYRIDDIDWNLAPKSTFEKSDG 61
Query: 348 EPKSYVDYYREAYNIEIRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGLSD 407
++V+YY++ YNI IRD +QP+L++R KR+ + E PEL F+TGL+D
Sbjct: 62 SEITFVEYYKQKYNITIRDPNQPLLVSRPKRRRNQNGKGEPVLLP----PELCFITGLTD 117
Query: 408 AMRNDFQVMKSIASFTRVDP 427
MR DF +MKSIA TR+ P
Sbjct: 118 RMRKDFMLMKSIADRTRLSP 137
>gnl|CDD|239211 cd02845, PAZ_piwi_like, PAZ domain, Piwi_like subfamily. In
multi-cellular organisms, the Piwi protein appears to be
essential for the maintenance of germline stem cells. In
the Drosophila male germline, Piwi was shown to be
involved in the silencing of retrotransposons in the
male gametes. The Piwi proteins share their domain
architecture with other members of the argonaute family.
The PAZ domain has been named after the proteins Piwi,
Argonaut, and Zwille. PAZ is found in two families of
proteins that are essential components of RNA-mediated
gene-silencing pathways, including RNA interference, the
Piwi and Dicer families. PAZ functions as a nucleic acid
binding domain, with a strong preference for
single-stranded nucleic acids (RNA or DNA) or RNA
duplexes with single-stranded 3' overhangs. It has been
suggested that the PAZ domain provides a unique mode for
the recognition of the two 3'-terminal nucleotides in
single-stranded nucleic acids and buries the 3' OH
group, and that it might recognize characteristic 3'
overhangs in siRNAs within RISC (RNA-induced silencing)
and other complexes.
Length = 117
Score = 140 bits (354), Expect = 7e-39
Identities = 58/121 (47%), Positives = 82/121 (67%), Gaps = 6/121 (4%)
Query: 288 TSTCLDLIDELKEK-FGGNFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR 346
++T LD + +L + F E + LIG IVLTRYNN+TYRID+IDF +TP+STF K
Sbjct: 1 STTVLDRMHKLYRQETDERFREECEKELIGSIVLTRYNNKTYRIDDIDFDKTPLSTFKKS 60
Query: 347 -GEPKSYVDYYREAYNIEIRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGL 405
G ++V+YY++ YNIEI D +QP+L++R KR+ RG E + ++PEL FLTGL
Sbjct: 61 DGTEITFVEYYKKQYNIEITDLNQPLLVSRPKRRDPRGGEKEPIY----LIPELCFLTGL 116
Query: 406 S 406
+
Sbjct: 117 T 117
>gnl|CDD|215631 PLN03202, PLN03202, protein argonaute; Provisional.
Length = 900
Score = 125 bits (316), Expect = 2e-29
Identities = 124/484 (25%), Positives = 212/484 (43%), Gaps = 84/484 (17%)
Query: 423 TRVDPNQKLQAISKYINNVNNNKETSELLKGWGLTLNKSMETLNARILPVEKIYMGNNF- 481
+R P ++++ ++ + + N + + +L+ G++++ + R+LP K+ +GN
Sbjct: 404 SRQKPQERMKVLTDALKSSNYDADP--MLRSCGISISSQFTQVEGRVLPAPKLKVGNGED 461
Query: 482 VAPGSQEADWNRQVGTNPA----LTVVNFDQWVLIHIRRDQRN-ADNFLNCLNRNSNAIG 536
P + ++N + P VVNF R D R+ + + C G
Sbjct: 462 FFPRNGRWNFNNKKLVEPTKIERWAVVNFSA------RCDIRHLVRDLIKCGEMK----G 511
Query: 537 IRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVI---IFNAPR-----------TDRYQ 582
I ++ P + EE + A +R + F I + P+ +D Y
Sbjct: 512 INIEPP--FDVFEENP-QFRRAPPPVRVEKMFEQIQSKLPGPPQFLLCILPERKNSDIYG 568
Query: 583 AVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLGG--SLWSV-------- 632
KK E I +Q I ++ + + + L+IN KLGG SL ++
Sbjct: 569 PWKKKNLSEFGIVTQCIAPTRVNDQ-----YLTNVLLKINAKLGGLNSLLAIEHSPSIPL 623
Query: 633 --QIPYDCAMVIGIDVYHEGVGSQGQ----NIVGLVASTNKDFTTYYSQAV-IQRRGQEI 685
++P +++G+DV H GS GQ +I +V+S + Y +V Q E+
Sbjct: 624 VSKVP---TIILGMDVSH---GSPGQSDVPSIAAVVSSRQWPLISRYRASVRTQSPKVEM 677
Query: 686 TDSIAQPFKQA----------LDRFIQANSVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQ 735
DS+ +P LD + + P+QI IFRDGVS+ Q + V +E+DQ
Sbjct: 678 IDSLFKPVGDKDDDGIIRELLLDFYTSSGKRKPEQIIIFRDGVSESQFNQVLNIELDQII 737
Query: 736 QIVDTIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRK 795
+ + S++PK T I+ QK +TK FQ S P+ N P G+V+D+ +
Sbjct: 738 EACKFL-----DESWSPKFTVIVAQKNHHTKFFQAGS----PD--NVPPGTVVDNKICHP 786
Query: 796 TLSDFFLVSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWPGTIRVPAVC 855
+DF++ + GT P+HY VL ++ D LQ L + L ++Y I V A
Sbjct: 787 RNNDFYMCAHAGMIGTTRPTHYHVLLDEIGFSADDLQELVHSLSYVYQRSTTAISVVAPV 846
Query: 856 QYAH 859
YAH
Sbjct: 847 CYAH 850
Score = 33.9 bits (78), Expect = 0.47
Identities = 30/115 (26%), Positives = 51/115 (44%), Gaps = 14/115 (12%)
Query: 82 TSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRG--ELGQPIKVMVNYIDLSVKE 139
P P P + +++ SKP P+ RG GQ I+++ N+ +SV
Sbjct: 1 KDALPPPPPVVPPNVVPIKLEPTK-KPSKPKRLPMARRGFGSKGQKIQLLTNHFKVSVNN 59
Query: 140 GSGM-YEYEVKF----NPPIDSRGIRNRLINSL-----NDLLGQYKTFDG-MNLF 183
G + Y V P+D +GI ++I+ + +DL G+ +DG +LF
Sbjct: 60 PDGHFFHYSVSLTYEDGRPVDGKGIGRKVIDKVQETYSSDLAGKDFAYDGEKSLF 114
>gnl|CDD|216914 pfam02170, PAZ, PAZ domain. This domain is named PAZ after the
proteins Piwi Argonaut and Zwille. This domain is found
in two families of proteins that are involved in
post-transcriptional gene silencing. These are the Piwi
family and the Dicer family, that includes the Carpel
factory protein. The function of the domains is unknown
but has been suggested to mediate complex formation
between proteins of the Piwi and Dicer families by
hetero-dimerisation. The three-dimensional structure of
this domain has been solved. The PAZ domain is composed
of two subdomains. One subdomain is similar to the OB
fold, albeit with a different topology. The OB-fold is
well known as a single-stranded nucleic acid binding
fold. The second subdomain is composed of a beta-hairpin
followed by an alpha-helix. The PAZ domain shows
low-affinity nucleic acid binding and appears to
interact with the 3' ends of single-stranded regions of
RNA in the cleft between the two subdomains. PAZ can
bind the characteristic two-base 3' overhangs of siRNAs,
indicating that although PAZ may not be a primary
nucleic acid binding site in Dicer or RISC, it may
contribute to the specific and productive incorporation
of siRNAs and miRNAs into the RNAi pathway.
Length = 114
Score = 98.1 bits (245), Expect = 3e-24
Identities = 41/107 (38%), Positives = 60/107 (56%), Gaps = 13/107 (12%)
Query: 305 NFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR-GEPKSYVDYYREAYNIE 363
+F E+ ++AL G IV T YNN+TYRID I + TP STF + G S +Y++E YNI
Sbjct: 20 DFREKFTKALKGLIVETTYNNRTYRIDGITWDPTPNSTFPLKDGGEISVAEYFKEKYNIT 79
Query: 364 IRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGLSDAMR 410
++ + P+L+ V RK + + PEL F+TG M+
Sbjct: 80 LKYPNLPLLV--VGRKKK----------PNYLPPELCFITGGQRYMK 114
>gnl|CDD|239207 cd02825, PAZ, PAZ domain, named PAZ after the proteins Piwi
Argonaut and Zwille. PAZ is found in two families of
proteins that are essential components of RNA-mediated
gene-silencing pathways, including RNA interference, the
piwi and Dicer families. PAZ functions as a nucleic-acid
binding domain, with a strong preference for
single-stranded nucleic acids (RNA or DNA) or RNA
duplexes with single-stranded 3' overhangs. It has been
suggested that the PAZ domain provides a unique mode for
the recognition of the two 3'-terminal nucleotides in
single-stranded nucleic acids and buries the 3' OH
group, and that it might recognize characteristic 3'
overhangs in siRNAs within RISC (RNA-induced silencing)
and other complexes. This parent model also contains
structures of an archaeal PAZ domain.
Length = 115
Score = 51.7 bits (124), Expect = 5e-08
Identities = 26/106 (24%), Positives = 46/106 (43%), Gaps = 9/106 (8%)
Query: 279 IDTSCRVLRTSTCLDLIDELKEKFGGNFMERLSQALIGEIVLTRYN--NQTYRIDEIDFK 336
I+T C+ + E+ + E ++ L G V +N N+ YR D
Sbjct: 5 IETMCKFPK-------DREIDTPLLDSPREEFTKELKGLKVEDTHNPLNRVYRPDGETRL 57
Query: 337 QTPMSTFTKRGEPKSYVDYYREAYNIEIRDKSQPMLITRVKRKTRR 382
+ P G+ ++ DY++E YN+ + D +QP+LI + K
Sbjct: 58 KAPSQLKHSDGKEITFADYFKERYNLTLTDLNQPLLIVKFSSKKSY 103
>gnl|CDD|240017 cd04659, Piwi_piwi-like_ProArk, Piwi_piwi-like_ProArk: PIWI domain,
Piwi-like subfamily found in Archaea and Bacteria. RNA
silencing refers to a group of related gene-silencing
mechanisms mediated by short RNA molecules, including
siRNAs, miRNAs, and heterochromatin-related guide RNAs.
The central component of the RNA-induced silencing
complex (RISC) and related complexes is Argonaute. The
PIWI domain is the C-terminal portion of Argonaute and
consists of two subdomains, one of which provides the 5'
anchoring of the guide RNA and the other, the catalytic
site for slicing. This domain is also found in closely
related proteins, including the Piwi subfamily, where it
is believed to perform a crucial role in germline cells,
via a similar mechanism.
Length = 404
Score = 53.9 bits (130), Expect = 2e-07
Identities = 65/366 (17%), Positives = 116/366 (31%), Gaps = 47/366 (12%)
Query: 523 NFLNCLNRNSNAIGIRVKKPQVIALQE-EQTLSYLTA----LKSMRSDTQFVVIIFNAPR 577
F N NA+G + L Q + + A L V+++
Sbjct: 63 KFPGFGGGNKNALGKNKISVFRLDLNRSAQAEAIIEAVDLALSESSQGVDVVIVVLPEDL 122
Query: 578 TDRYQAVKKYCCCER-----PIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSV 632
+ + Y + IP+Q + T+ + + +AL + KLGG W +
Sbjct: 123 KELPEEFDLYDRLKAKLLRLGIPTQFVREDTLKNRQDLAYVAWNLALALYAKLGGIPWKL 182
Query: 633 ---QIPYDCAMVIGIDVYHEGVGSQGQNIVGLVASTNKDFTTYY---SQAVIQRRGQEIT 686
P D IGI G + G + D + +
Sbjct: 183 DADSDPADL--YIGIGFARSRDGE--VRVTGCAQVFDSDGLGLILRGAPIEEPTEDRSPA 238
Query: 687 DSIAQPFKQALDRFIQ-ANSVPPKQIFIFRDGVSDGQLDSVSRVEIDQYQQIVDTIMTTL 745
D K+ L+ + + PK++ + +DG + EI+ ++ ++
Sbjct: 239 DLK-DLLKRVLEGYRESHRGRDPKRLVLHKDG-------RFTDEEIEGLKEALE------ 284
Query: 746 PSCSYAPKITAIIVQKRINTKIFQLLSAGERPNLANAPSGSVL---DHTVTRKTLSDFFL 802
K+ + V K ++F G PN G+ + D T
Sbjct: 285 ---ELGIKVDLVEVIKSGPHRLF---RFGTYPNGFPPRRGTYVKLSDDEGLLWTHGSVPK 338
Query: 803 VSQHVRQGTVTPSHYIVLRNDNNVKVDHLQRLSYKLCHLYYNWP-GTIRVPAVCQYAHRI 861
+ G TP ++ R+ N ++ L L L +N R+P YA R+
Sbjct: 339 YNT--YPGMGTPRPLLLRRHSGNTDLEQLASQILGLTKLNWNSFQFYSRLPVTIHYADRV 396
Query: 862 AYLTGM 867
A L
Sbjct: 397 AKLLKR 402
>gnl|CDD|239212 cd02846, PAZ_argonaute_like, PAZ domain, argonaute_like subfamily.
Argonaute is part of the RNA-induced silencing complex
(RISC), and is an endonuclease that plays a key role in
the RNA interference pathway. The PAZ domain has been
named after the proteins Piwi, Argonaut, and Zwille. PAZ
is found in two families of proteins that are essential
components of RNA-mediated gene-silencing pathways,
including RNA interference, the Piwi and Dicer families.
PAZ functions as a nucleic acid binding domain, with a
strong preference for single-stranded nucleic acids (RNA
or DNA) or RNA duplexes with single-stranded 3'
overhangs. It has been suggested that the PAZ domain
provides a unique mode for the recognition of the two
3'-terminal nucleotides in single-stranded nucleic acids
and buries the 3' OH group, and that it might recognize
characteristic 3' overhangs in siRNAs within RISC
(RNA-induced silencing) and other complexes.
Length = 114
Score = 43.8 bits (104), Expect = 3e-05
Identities = 28/101 (27%), Positives = 44/101 (43%), Gaps = 15/101 (14%)
Query: 294 LIDELKEKFGGNF--------MERLSQALIGEIVLTRYNNQT---YRIDEIDFKQTPMST 342
+I+ LKE G + +L +AL G V + T Y+I + + T
Sbjct: 4 VIEFLKEFLGFDTPLGLSDNDRRKLKKALKGLKVEVTHRGNTNRKYKIKGLSAEPASQQT 63
Query: 343 FTKRGEPK--SYVDYYREAYNIEIRDKSQPMLITRVKRKTR 381
F + K S DY++E YNI ++ + P L V RK +
Sbjct: 64 FELKDGEKEISVADYFKEKYNIRLKYPNLPCLQ--VGRKGK 102
>gnl|CDD|239210 cd02844, PAZ_CAF_like, PAZ domain, CAF_like subfamily. CAF (for
carpel factory) is a plant homolog of Dicer. CAF has
been implicated in flower morphogenesis and in early
Arabidopsis development and might function through
posttranscriptional regulation of specific mRNA
molecules. PAZ domains are named after the proteins
Piwi, Argonaut, and Zwille. PAZ is found in two families
of proteins that are essential components of
RNA-mediated gene-silencing pathways, including RNA
interference, the Piwi and Dicer families. PAZ functions
as a nucleic-acid binding domain, with a strong
preference for single-stranded nucleic acids (RNA or
DNA) or RNA duplexes with single-stranded 3' overhangs.
It has been suggested that the PAZ domain provides a
unique mode for the recognition of the two 3'-terminal
nucleotides in single-stranded nucleic acids and buries
the 3' OH group, and that it might recognize
characteristic 3' overhangs in siRNAs within RISC
(RNA-induced silencing) and other complexes.
Length = 135
Score = 42.8 bits (101), Expect = 9e-05
Identities = 24/99 (24%), Positives = 41/99 (41%), Gaps = 14/99 (14%)
Query: 314 LIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKR--GEPKSYVDYYREAYNIEIRDKSQPM 371
L G +V +N + Y I I S+F + +Y +Y++E Y I + +QP+
Sbjct: 31 LKGSVVTAPHNGRFYVISGI-LDLNANSSFPGKEGLGYATYAEYFKEKYGIVLNHPNQPL 89
Query: 372 LITR---------VKRKTRRGTNVEESH--YIAAIVPEL 399
L + R +G + E+ Y + PEL
Sbjct: 90 LKGKQIFNLHNLLHNRFEEKGESEEKEKDRYFVELPPEL 128
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 39.9 bits (93), Expect = 0.008
Identities = 19/86 (22%), Positives = 32/86 (37%), Gaps = 18/86 (20%)
Query: 30 LPTTSTDDTTGSHAPGIPSPSTS---SPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTP 86
L + S A +P P+++ +P P P + P G+
Sbjct: 2812 LAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSL---------------PLGGSV 2856
Query: 87 GPGGDAPSSPPTQQMKALSISKSKPP 112
PGGD PP++ A + ++PP
Sbjct: 2857 APGGDVRRRPPSRSPAAKPAAPARPP 2882
Score = 33.8 bits (77), Expect = 0.59
Identities = 20/104 (19%), Positives = 30/104 (28%), Gaps = 19/104 (18%)
Query: 22 EEEKRPGSLPTTSTDDTTGSHAP------GIPSPSTSSPSQASEPAPKHIGRGRLLQQLL 75
+ SLPT + P G ++P AS P P +
Sbjct: 372 RHHPKRASLPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPAPTP--------V 423
Query: 76 AKGVVPTSGTP-----GPGGDAPSSPPTQQMKALSISKSKPPSE 114
P TP D P+ PP +Q A + + +
Sbjct: 424 PASAPPPPATPLPSAEPGSDDGPAPPPERQPPAPATEPAPDDPD 467
Score = 33.4 bits (76), Expect = 0.68
Identities = 15/91 (16%), Positives = 25/91 (27%), Gaps = 14/91 (15%)
Query: 24 EKRPGSLPTTSTDDTTG--SHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVP 81
+RP + P ++ P P+P + P P P P
Sbjct: 2586 ARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPS------------P 2633
Query: 82 TSGTPGPGGDAPSSPPTQQMKALSISKSKPP 112
+ P P PP + + + P
Sbjct: 2634 AANEPDPHPPPTVPPPERPRDDPAPGRVSRP 2664
Score = 33.4 bits (76), Expect = 0.76
Identities = 28/105 (26%), Positives = 35/105 (33%), Gaps = 12/105 (11%)
Query: 18 KRREEEEKRPGSLPTTSTDD-TTGSHAPGIPSPST----SSPSQASEPAPKHIGRGRLLQ 72
KRR RP P +S +D + G H P S T S+ A+ A G +
Sbjct: 353 KRR-----RPTWTPPSSLEDLSAGRHHPKRASLPTRKRRSARHAATPFARGPGGDDQTRP 407
Query: 73 QLLAKGVVPTSGTPGPGGDAPSSP--PTQQMKALSISKSKPPSEP 115
VPT AP P P + S PP E
Sbjct: 408 AAPVPASVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPER 452
Score = 32.2 bits (73), Expect = 1.6
Identities = 18/99 (18%), Positives = 28/99 (28%), Gaps = 11/99 (11%)
Query: 24 EKRPGSLPTTSTD-------DTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLA 76
RP PTT+ + P+ +S S++ E P A
Sbjct: 2754 PARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPS----PWDPADPPA 2809
Query: 77 KGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
+ P + P A PP + + P P
Sbjct: 2810 AVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPP 2848
Score = 29.9 bits (67), Expect = 7.2
Identities = 14/39 (35%), Positives = 18/39 (46%), Gaps = 3/39 (7%)
Query: 84 GTPGPGGDA---PSSPPTQQMKALSISKSKPPSEPVYFR 119
P PGG P +PP A +I +P EPV+ R
Sbjct: 2494 AAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPR 2532
>gnl|CDD|218116 pfam04503, SSDP, Single-stranded DNA binding protein, SSDP. This
is a family of eukaryotic single-stranded DNA binding
proteins with specificity to a pyrimidine-rich element
found in the promoter region of the alpha2(I) collagen
gene.
Length = 293
Score = 38.5 bits (89), Expect = 0.011
Identities = 24/96 (25%), Positives = 34/96 (35%), Gaps = 15/96 (15%)
Query: 27 PGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPT---S 83
+ T++ PG P P P+Q P Q LL G+ PT
Sbjct: 60 TPQMQNTTSQPFMSPRYPGGPRPPLRMPNQPPGGVPGS-------QPLLPGGMDPTVRQQ 112
Query: 84 GTPGPGGDAPSSPPTQQMKALSISKS-----KPPSE 114
G P GG P + MK+L ++ +PP
Sbjct: 113 GHPNMGGPMQRMTPPRGMKSLDGPQNYGGGMRPPPN 148
>gnl|CDD|239209 cd02843, PAZ_dicer_like, PAZ domain, dicer_like subfamily. Dicer is
an RNAse involved in cleaving dsRNA in the RNA
interference pathway. It generates dsRNAs which are
approximately 20 bp long (siRNAs), which in turn target
hydrolysis of homologous RNAs. PAZ domains are named
after the proteins Piwi Argonaut and Zwille. PAZ is
found in two families of proteins that are essential
components of RNA-mediated gene-silencing pathways,
including RNA interference, the piwi and Dicer families.
PAZ functions as a nucleic-acid binding domain, with a
strong preference for single-stranded nucleic acids (RNA
or DNA) or RNA duplexes with single-stranded 3'
overhangs. It has been suggested that the PAZ domain
provides a unique mode for the recognition of the two
3'-terminal nucleotides in single-stranded nucleic acids
and buries the 3' OH group, and that it might recognize
characteristic 3' overhangs in siRNAs within RISC
(RNA-induced silencing) and other complexes.
Length = 122
Score = 34.0 bits (78), Expect = 0.090
Identities = 17/59 (28%), Positives = 34/59 (57%), Gaps = 5/59 (8%)
Query: 318 IVLTRYNN----QTYRIDEIDFKQTPMSTFTKRGEPKSYVDYYREAYNIEIRDKSQPML 372
+V+ Y N Q + + EI P+S F E +++ +YY++ Y ++I++ +QP+L
Sbjct: 45 VVMPWYRNFDQPQYFYVAEICTDLRPLSKFPG-PEYETFEEYYKKKYKLDIQNLNQPLL 102
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 35.9 bits (83), Expect = 0.11
Identities = 18/95 (18%), Positives = 25/95 (26%), Gaps = 4/95 (4%)
Query: 25 KRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSG 84
P S + G +P P P+ S PAP R +
Sbjct: 95 LAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPA 154
Query: 85 TPGPGGDAPSSPPT--QQMKALSI--SKSKPPSEP 115
S + Q LS ++ PS P
Sbjct: 155 AGASPAAVASDAASSRQAALPLSSPEETARAPSSP 189
Score = 33.2 bits (76), Expect = 0.84
Identities = 16/98 (16%), Positives = 23/98 (23%), Gaps = 4/98 (4%)
Query: 19 RREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKG 78
RE PG T +P S +P +
Sbjct: 101 AREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPA 160
Query: 79 VVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPV 116
V S A ++ + S PP+EP
Sbjct: 161 AVA-SDAASSRQAALPLSSPEETAR---APSSPPAEPP 194
Score = 31.7 bits (72), Expect = 2.3
Identities = 17/103 (16%), Positives = 28/103 (27%), Gaps = 17/103 (16%)
Query: 17 AKRREEEEKRPGSLPTTST----DDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQ 72
+ PGS P S+ ++ S S S+SS S G
Sbjct: 293 ERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVS----PGPSPS 348
Query: 73 QLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
+ S +P ++ S + S P +
Sbjct: 349 R---------SPSPSRPPPPADPSSPRKRPRPSRAPSSPAASA 382
Score = 31.3 bits (71), Expect = 2.8
Identities = 21/99 (21%), Positives = 28/99 (28%)
Query: 17 AKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLA 76
A R P S +S G A S+S S + A
Sbjct: 202 ASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCGWGPENECPLPRPA 261
Query: 77 KGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
+PT G + PSS P + S + P P
Sbjct: 262 PITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSP 300
>gnl|CDD|221745 pfam12737, Mating_C, C-terminal domain of homeodomain 1. Mating in
fungi is controlled by the loci that determine the
mating type of an individual, and only individuals with
differing mating types can mate. Basidiomycete fungi
have evolved a unique mating system, termed tetrapolar
or bifactorial incompatibility, in which mating type is
determined by two unlinked loci; compatibility at both
loci is required for mating to occur. The multi-allelic
tetrapolar mating system is considered to be a novel
innovation that could have only evolved once, and is
thus unique to the mushroom fungi. This domain is
C-terminal to the homeodomain transcription factor
region.
Length = 418
Score = 35.2 bits (81), Expect = 0.13
Identities = 22/93 (23%), Positives = 31/93 (33%), Gaps = 15/93 (16%)
Query: 16 EAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLL 75
+ E KRP S +S+ +PSP+ S+ + SE + L L
Sbjct: 116 DEDEAERPSKRPRSDSISSSSSPAKPPEACLPSPAASTQDELSEASA-----APLPTPSL 170
Query: 76 AKGVVPTSGTP----------GPGGDAPSSPPT 98
+ PT P G AP P T
Sbjct: 171 SPPHTPTDTAPSGKRKRRLSDGFQLPAPKRPQT 203
>gnl|CDD|218902 pfam06121, DUF959, Domain of Unknown Function (DUF959). This
N-terminal domain is not expressed in the 'Short'
isoform of Collagen A.
Length = 202
Score = 34.4 bits (78), Expect = 0.15
Identities = 22/72 (30%), Positives = 31/72 (43%), Gaps = 7/72 (9%)
Query: 28 GSLPTTSTDDTT--GSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGT 85
S P ST+ TT PG ST+ S SE + + +G +Q + G V T+ T
Sbjct: 44 ASTPVQSTESTTTHVVPRPGETEESTTPAS--SEEPKEIVEKG---KQNVVPGTVATTPT 98
Query: 86 PGPGGDAPSSPP 97
P +S P
Sbjct: 99 VTPVAMDVASSP 110
>gnl|CDD|220233 pfam09422, WTX, WTX protein. The WTX protein is found to be
inactivated in one third of Wilms tumours. The WTX
protein is functionally uncharacterized.
Length = 467
Score = 34.9 bits (80), Expect = 0.20
Identities = 28/129 (21%), Positives = 39/129 (30%), Gaps = 18/129 (13%)
Query: 3 GRGGRGAALKQILEAKRREEEEKRPGSLPTTSTDDTTGSHAPG----IPSPSTSSPSQAS 58
GR RG LK + + R ++K + + P +PS T+S
Sbjct: 76 GRPKRG--LKGLFSSMRWHRKDKSNK-------AEQEEAKEPEGGLILPSSLTASLECVK 126
Query: 59 EPAPKHIGRGRLLQQLLAKG-VVPTSGTPGPGGDAPSSP-PTQQMKALSISKSKPPSEPV 116
E P+ Q G V P S PG P + S + P
Sbjct: 127 EETPRAAREENAPPQDADGGKVSPASALEAPGTSCVECGDPAKPGSESSAFEDPGPGLGA 186
Query: 117 YFRGELGQP 125
L QP
Sbjct: 187 ---DSLCQP 192
>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
Validated.
Length = 824
Score = 34.2 bits (79), Expect = 0.33
Identities = 25/123 (20%), Positives = 36/123 (29%), Gaps = 12/123 (9%)
Query: 3 GRGGRGAALKQILEAKRREEEEKRPGSLPTTSTDDT----TGSHAPGIPSPSTSSPSQAS 58
G G + EE RP + + + AP S + + A
Sbjct: 591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAP 650
Query: 59 EPAPKH-----IGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPS 113
E PKH G A G P + P P AP++P + P +
Sbjct: 651 EHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPA---PAPAA 707
Query: 114 EPV 116
P
Sbjct: 708 TPP 710
>gnl|CDD|147982 pfam06112, Herpes_capsid, Gammaherpesvirus capsid protein. This
family consists of several Gammaherpesvirus capsid
proteins. The exact function of this family is unknown.
Length = 148
Score = 32.9 bits (75), Expect = 0.34
Identities = 19/97 (19%), Positives = 31/97 (31%), Gaps = 15/97 (15%)
Query: 7 RGAALKQILEAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPS-PSTSSPSQASEPAPKHI 65
G K+ L+A R + S ++ S PG + S SS S S P
Sbjct: 66 HGIRRKKHLQALRGAGPQTSSSIGSALSASSSSASGVPGGANQLSGSSGSALS-SGP--- 121
Query: 66 GRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMK 102
+S + G ++P + + K
Sbjct: 122 ----------GSLSSSSSLSGSGAGAGDTAPSSSKKK 148
>gnl|CDD|218191 pfam04652, DUF605, Vta1 like. Vta1 (VPS20-associated protein 1) is
a positive regulator of Vps4. Vps4 is an ATPase that is
required in the multivesicular body (MVB) sorting
pathway to dissociate the endosomal sorting complex
required for transport (ESCRT). Vta1 promotes correct
assembly of Vps4 and stimulates its ATPase activity
through its conserved Vta1/SBP1/LIP5 region.
Length = 315
Score = 33.5 bits (77), Expect = 0.45
Identities = 27/93 (29%), Positives = 35/93 (37%), Gaps = 8/93 (8%)
Query: 27 PGSLPTTSTDDTTGS-HAPGIPSPSTSSPSQASE---PAPKHIGRGRLLQQLLAKGVVPT 82
+ +D + S P PSP S + PAP PT
Sbjct: 177 ADPASASPSDPPSSSPGVPSFPSPPEDPSSPSDSSLPPAPSSFQS----DTPPPSPESPT 232
Query: 83 SGTPGPGGDAPSSPPTQQMKALSISKSKPPSEP 115
+ +P PG AP PP QQ+ LS +K PPS
Sbjct: 233 NPSPPPGPAAPPPPPVQQVPPLSTAKPTPPSAS 265
Score = 30.4 bits (69), Expect = 4.3
Identities = 21/99 (21%), Positives = 33/99 (33%), Gaps = 12/99 (12%)
Query: 21 EEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVV 80
EE+E + TT++D++ S S S P +S P +
Sbjct: 156 EEDED--ADVATTNSDNSFPGEDADPASASPSDPPSSSPGVP----------SFPSPPED 203
Query: 81 PTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFR 119
P+S + APSS + S + P P
Sbjct: 204 PSSPSDSSLPPAPSSFQSDTPPPSPESPTNPSPPPGPAA 242
>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein. This family consists of
AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
retardation syndrome) nuclear proteins. These proteins
have been linked to human diseases such as acute
lymphoblastic leukaemia and mental retardation. The
family also contains a Drosophila AF4 protein homologue
Lilliputian which contains an AT-hook domain.
Lilliputian represents a novel pair-rule gene that acts
in cytoskeleton regulation, segmentation and
morphogenesis in Drosophila.
Length = 1154
Score = 33.7 bits (77), Expect = 0.60
Identities = 21/99 (21%), Positives = 34/99 (34%), Gaps = 9/99 (9%)
Query: 20 REEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGV 79
+ + P +DT+ S P S + SS +S + + KG
Sbjct: 814 SSPKPEHPSRKRPRRQEDTSSSSGPFSASSTKSSSKSSSTSKHR---------KTEGKGS 864
Query: 80 VPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYF 118
+ G GD P+ + + LS SKP + F
Sbjct: 865 STSKEHKGSSGDTPNKASSFPVPPLSNGSSKPRRPKLVF 903
>gnl|CDD|222010 pfam13254, DUF4045, Domain of unknown function (DUF4045). This
presumed domain is functionally uncharacterized. This
domain family is found in bacteria and eukaryotes, and
is typically between 384 and 430 amino acids in length.
Length = 414
Score = 32.9 bits (75), Expect = 0.70
Identities = 25/107 (23%), Positives = 36/107 (33%), Gaps = 17/107 (15%)
Query: 22 EEEKRPGSLPTTSTDDTTGSHAPGIPS-----PSTSSPSQASEPAPKHIGRGRLLQQLLA 76
+ EK P + D + AP I + S + + SE A LL
Sbjct: 249 DTEKSSAPKPRETLDPKSPEKAPPIDTTEEELKSPEASPKESEEASARKRSPSLL----- 303
Query: 77 KGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVY-FRGEL 122
S +P P + P + + + KP S PV FR L
Sbjct: 304 ------SPSPKAESPKPLASPGKSPRDPLSPRPKPQSPPVNDFRANL 344
Score = 31.0 bits (70), Expect = 2.7
Identities = 20/84 (23%), Positives = 29/84 (34%), Gaps = 17/84 (20%)
Query: 20 REEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGV 79
EEE K P + P S + + +P + SPS + S +P R L +
Sbjct: 276 TEEELKSPEASPKESEEASARKRSPSLLSPSPKAESPKPLASPGKSPRDPLSPRP----- 330
Query: 80 VPTSGTPGPGGDAPSSPPTQQMKA 103
P SPP +A
Sbjct: 331 ------------KPQSPPVNDFRA 342
>gnl|CDD|146151 pfam03363, Herpes_LP, Herpesvirus leader protein.
Length = 177
Score = 32.2 bits (73), Expect = 0.72
Identities = 27/114 (23%), Positives = 41/114 (35%), Gaps = 22/114 (19%)
Query: 20 REEEEKRPGSLPTTSTDDTTGSHAPGIPSPS----------------TSSPSQASEPAPK 63
EEEE G P+ D + + P P P + SP+ P+
Sbjct: 54 EEEEEVVSGP-PSGPRGDPSEAPGPSRPGPPGLGPEGPFGQLLRRRRSPSPTGGDPEGPR 112
Query: 64 HIGRGRLLQQLLAKGVVPTSGTP-GPGGDAPSSPPTQQMKALSISKSKPPSEPV 116
+ R LL++ SG+P GP G + L+ S +P +PV
Sbjct: 113 RVRRRVLLEEEEE----VVSGSPSGPQGPLIQPAARSWREWLARSGPRPEPQPV 162
>gnl|CDD|233432 TIGR01480, copper_res_A, copper-resistance protein, CopA family.
This model represents the CopA copper resistance protein
family. CopA is related to laccase (benzenediol:oxygen
oxidoreductase) and L-ascorbate oxidase, both
copper-containing enzymes. Most members have a typical
TAT (twin-arginine translocation) signal sequence with
an Arg-Arg pair. Twin-arginine translocation is observed
for a large number of periplasmic proteins that cross
the inner membrane with metal-containing cofactors
already bound. The combination of copper-binding sites
and TAT translocation motif suggests a mechanism of
resistance by packaging and export [Cellular processes,
Detoxification, Transport and binding proteins, Cations
and iron carrying compounds].
Length = 587
Score = 32.9 bits (75), Expect = 0.91
Identities = 15/74 (20%), Positives = 20/74 (27%), Gaps = 10/74 (13%)
Query: 60 PAPKHIGRGRLLQQLLAKGVVP-TSGTPGPGGDAPSSPPTQQMKA----LSISKSKPPSE 114
R R LQ L + G S P + L+I ++
Sbjct: 1 SVMTAFDRRRFLQGLASGGAAAGLGLWATAAWAERSPLPESVLSGTEFDLTIGET----- 55
Query: 115 PVYFRGELGQPIKV 128
V F G I V
Sbjct: 56 MVNFTGRARPAITV 69
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 32.7 bits (74), Expect = 1.1
Identities = 25/103 (24%), Positives = 35/103 (33%), Gaps = 9/103 (8%)
Query: 27 PGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVV----PT 82
P LP+ + + P P S P H G G + L +G V P+
Sbjct: 234 PQRLPSPHPPLQPQTASQQSPQPPAPSSRH---PQSSHHGPGPPMPHALQQGPVFLQHPS 290
Query: 83 SGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRGELGQP 125
S P P G A S P + + + S P + QP
Sbjct: 291 SNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQP--QQP 331
>gnl|CDD|224348 COG1431, COG1431, Argonaute homolog, implicated in RNA metabolism
[Translation, ribosomal structure and biogenesis].
Length = 685
Score = 32.2 bits (73), Expect = 1.6
Identities = 62/371 (16%), Positives = 116/371 (31%), Gaps = 61/371 (16%)
Query: 525 LNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVIIFNAP--RTDRYQ 582
+ +NSN I +V+ + + + L + + + +Y
Sbjct: 365 VVYGFKNSNGIDWKVEGLTLHVAGKRPKMK--DDLTKIIKEIDVEELKKQEMYKDDVKYA 422
Query: 583 AVKKYCCCERPIPSQVINSRTISREDKMKSIVMKIALQINCKLGGSLWSV---QIPYDCA 639
+K+ + IPSQVI + K +A + K G + P D
Sbjct: 423 ILKRL---DETIPSQVILDPNNRKPYK--GTKTNLASKRYLKTLGQPYLKRNGLGPVD-- 475
Query: 640 MVIGIDVYHEGVGSQGQNIVGLVASTNKDFTTYYSQAVIQRRGQEITDSIAQPFKQALDR 699
++G+DV S+G V S + S+ ++ +T ++ + + +
Sbjct: 476 AIVGLDV---SRVSEGNWTVEGCTSC------FVSEGGLEEYYHTVTPALGERLETSGRY 526
Query: 700 FIQANSVPPK---QIFIFRDG-VSDGQLDSVSRVEIDQYQQIVDTIMTTLPSCSYAPKIT 755
+ N + I RDG + G++ +V
Sbjct: 527 LEKMNWRGFESRNLIVTLRDGKLVAGEIAAVKEY-----------------GGELGSNPE 569
Query: 756 AIIVQKRINTKIFQLLSAGERPNLANAPSGSVLDHTVTRKTLSDFFLVSQHVRQGTVTPS 815
+ K N + A H S VR+GT P
Sbjct: 570 VNRILK--NNPWV--FAIEGEIWGAFVRLDGSTVH----LCCSP----YNPVRRGTPRP- 616
Query: 816 HYIVLRNDNNVKVDHLQRLSYKLCHLYYN--WPGTIRVPAVCQYAHRIAYLTGMHLQRLP 873
I LR + L L + L + Y+ R+PA YA + + L + P
Sbjct: 617 --IALRRRDGKLDGELIGLVHDLTAMNYSNPSGTWSRLPAPVHYADKASKLARYGVSIGP 674
Query: 874 SDVLSDKLFYL 884
D +S++ + +
Sbjct: 675 GDPVSERPYPV 685
>gnl|CDD|234547 TIGR04330, cas_Cpf1, CRISPR-associated protein Cpf1, subtype
PREFRAN. This family is the long protein of a novel
CRISPR subtype, PREFRAN, which is most common in
Prevotella and Francisella, although widely distributed.
The PREFRAN type has Cas1, Cas2, and Cas4, but lacks the
helicase Cas3 and endonuclease Cas3-HD.
Length = 1287
Score = 32.1 bits (73), Expect = 1.6
Identities = 30/166 (18%), Positives = 60/166 (36%), Gaps = 17/166 (10%)
Query: 304 GNFMERLSQALIGEIVLTRYNNQTYRIDEIDFKQTPMSTFTKRGEPKSYVDYYREAYNIE 363
G+F ++ + + EI Y N ID + K ++ Y YN +
Sbjct: 218 GDFELYVAVSELDEIFSLDYYNNVLSQSGIDSYNAIIGGIMKNDAKIKGLNEYINLYNQK 277
Query: 364 IRDKSQPMLITRVKRKTRRGTNVEESHYIAAIVPELAFLTGLSDAMRNDFQVMKSIASFT 423
I+DK + + K I+ + + L D +D +V+K+I F
Sbjct: 278 IKDKKLELPKLKQLHKQ--------------ILSDREAKSFLPDMFEDDSEVVKAIKEFY 323
Query: 424 RVDPNQKLQAISKYINNVNNNKETSELLKGWGLTLNKSMETLNARI 469
Q + + + + + LKG + + + TL+ ++
Sbjct: 324 EQTLEQG--NVIGKLKTLLEKLDKLD-LKGIYIRNDNQLTTLSQQV 366
>gnl|CDD|237871 PRK14965, PRK14965, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 576
Score = 32.0 bits (73), Expect = 1.7
Identities = 28/138 (20%), Positives = 45/138 (32%), Gaps = 8/138 (5%)
Query: 7 RGAALK---QILEAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPK 63
+ A L + E R E +R P ++ AP P P+ + P + PA
Sbjct: 357 KMATLAPGAPVSELLDRLEALERGAPAPPSAAWGAPTPAAPAAPPPAAAPPVPPAAPARP 416
Query: 64 HIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRGELG 123
R P + + P A S+ + + KP E G
Sbjct: 417 AAARPA-PAPAPPAAAAPPARSADP-AAAASAGDRWRAFVAFVKGKKPALGASL---EQG 471
Query: 124 QPIKVMVNYIDLSVKEGS 141
P+ V +++ EGS
Sbjct: 472 SPLGVSAGLLEIGFPEGS 489
>gnl|CDD|236733 PRK10672, PRK10672, rare lipoprotein A; Provisional.
Length = 361
Score = 31.6 bits (72), Expect = 2.0
Identities = 17/75 (22%), Positives = 24/75 (32%), Gaps = 13/75 (17%)
Query: 26 RPGSLPTTSTDDTTG-----SHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVV 80
P S T ++D TG S G P+ + SEP P A
Sbjct: 220 LPVSNSTLKSEDPTGAPVTSSGFLGAPTTLAPGVLEGSEPTPTAPS--------SAPATA 271
Query: 81 PTSGTPGPGGDAPSS 95
P + P + S+
Sbjct: 272 PAAAAPQAAATSSSA 286
>gnl|CDD|202276 pfam02541, Ppx-GppA, Ppx/GppA phosphatase family. This family
consists of the N-terminal region of exopolyphosphatase
(Ppx) EC:3.6.1.11 and guanosine pentaphosphate
phospho-hydrolase (GppA) EC:3.6.1.40.
Length = 285
Score = 31.1 bits (71), Expect = 2.4
Identities = 19/59 (32%), Positives = 29/59 (49%), Gaps = 6/59 (10%)
Query: 513 HIRRDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVI 571
RD NAD FL R G+ V ++I+ +EE L YL + ++ S + +VI
Sbjct: 65 SALRDAVNADEFLA---RVKKETGLPV---EIISGEEEARLIYLGVVSTLPSKGRGLVI 117
>gnl|CDD|202558 pfam03159, XRN_N, XRN 5'-3' exonuclease N-terminus. This family
aligns residues towards the N-terminus of several
proteins with multiple functions. The members of this
family all appear to possess 5'-3' exonuclease activity
EC:3.1.11.-. Thus, the aligned region may be necessary
for 5' to 3' exonuclease function. The family also
contains several Xrn1 and Xrn2 proteins. The 5'-3'
exoribonucleases Xrn1p and Xrn2p/Rat1p function in the
degradation and processing of several classes of RNA in
Saccharomyces cerevisiae. Xrn1p is the main enzyme
catalyzing cytoplasmic mRNA degradation in multiple
decay pathways, whereas Xrn2p/Rat1p functions in the
processing of rRNAs and small nucleolar RNAs (snoRNAs)
in the nucleus.
Length = 237
Score = 30.8 bits (70), Expect = 2.6
Identities = 32/110 (29%), Positives = 44/110 (40%), Gaps = 21/110 (19%)
Query: 219 ENLMFYNILFRKIAFLLSMVQFKDCLY---D---PRSKLLIPQYKLEVWPGFVTAIDEYE 272
E+ MF I F I L ++V+ + LY D PR+K+ Q F A D E
Sbjct: 56 EDEMFVAI-FEYIDRLFNIVRPRKLLYMAIDGVAPRAKM-NQQRSRR----FRAAKDAKE 109
Query: 273 GGLKLQIDTSCRVLRTSTCLDLIDELKEKFGGN-------FMERLSQALI 315
+ + + L T KEKF N FM RL++AL
Sbjct: 110 KEAEAEEN--REELETEGIKLPEKVEKEKFDSNCITPGTPFMARLAKALR 157
>gnl|CDD|219741 pfam08193, INO80_Ies4, INO80 complex subunit Ies4. The INO80
ATPase is a member of the SNF2 family of ATPases and
functions as an integral component of a multisubunit
ATP-dependent chromatin remodelling complex. This family
of proteins corresponds to the fungal Ies4 subunit of
INO80.
Length = 228
Score = 30.7 bits (69), Expect = 2.8
Identities = 19/75 (25%), Positives = 26/75 (34%), Gaps = 10/75 (13%)
Query: 24 EKRPGSLPTTSTDDT-------TGSHAPGIPSPSTSS---PSQASEPAPKHIGRGRLLQQ 73
E P P +S D S A P+ TS+ P + P PK + Q
Sbjct: 43 EDEPSDSPASSAADPPVPSSVDNASDAASTPAAGTSATDTPRRKGGPGPKKGEKRSAGQG 102
Query: 74 LLAKGVVPTSGTPGP 88
++ G PGP
Sbjct: 103 TTSETTSKPRGKPGP 117
>gnl|CDD|223326 COG0248, GppA, Exopolyphosphatase [Nucleotide transport and
metabolism / Inorganic ion transport and metabolism].
Length = 492
Score = 31.1 bits (71), Expect = 2.9
Identities = 17/56 (30%), Positives = 26/56 (46%), Gaps = 6/56 (10%)
Query: 516 RDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRSDTQFVVI 571
RD N D FL R +G+ + +VI+ +EE L YL ++ +VI
Sbjct: 85 RDAPNGDEFLA---RVEKELGLPI---EVISGEEEARLIYLGVASTLPRKGDGLVI 134
>gnl|CDD|221759 pfam12764, Gly-rich_Ago1, Glycine-rich region of argonaut. This
domain is often found at the very N-terminus of
argonaut-like proteins.
Length = 102
Score = 29.2 bits (65), Expect = 3.0
Identities = 20/85 (23%), Positives = 27/85 (31%), Gaps = 14/85 (16%)
Query: 40 GSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPP-- 97
G G S+ P+ L Q A S P P + SS P
Sbjct: 19 GRGGGGGGRGGGSTGGPPRPSVPE------LHQATQAPYQAAVSTQPAPSEASSSSQPPE 72
Query: 98 ------TQQMKALSISKSKPPSEPV 116
TQQ + LSI + S+ +
Sbjct: 73 SSSLQVTQQFQQLSIQQEASSSQAI 97
>gnl|CDD|204078 pfam08832, SRC-1, Steroid receptor coactivator. This domain is
found in steroid/nuclear receptor coactivators and
contains two LXXLL motifs that are involved in receptor
binding. The family includes SRC-1/NcoA-1, NcoA-2/TIF2,
pCIP/ACTR/GRIP-1/AIB1.
Length = 78
Score = 28.7 bits (64), Expect = 3.1
Identities = 20/71 (28%), Positives = 29/71 (40%), Gaps = 7/71 (9%)
Query: 11 LKQILEAKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRL 70
L Q+L K E P + ++ TD G+ S + PS S KH ++
Sbjct: 6 LLQLLTTKTEPLE---PPLMASSDTDCKDSLGVTGVSSSTGGCPSSHSSLKEKH----KI 58
Query: 71 LQQLLAKGVVP 81
L +LL G P
Sbjct: 59 LHRLLQNGSSP 69
>gnl|CDD|139494 PRK13335, PRK13335, superantigen-like protein; Reviewed.
Length = 356
Score = 30.5 bits (68), Expect = 4.5
Identities = 27/131 (20%), Positives = 41/131 (31%), Gaps = 25/131 (19%)
Query: 17 AKRREEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLA 76
A R+E + P T+ + T+ S I P + A + Q
Sbjct: 66 ANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEEQKSLNISATP--APKQEQSQTTT 123
Query: 77 KGVVPT--------SGTPGP----GGDAPSSPPTQQ--------MKALSISKSKPPSEPV 116
+ P + TP P D P SP +Q + L +KP E
Sbjct: 124 ESTTPKTKVTTPPSTNTPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFE-- 181
Query: 117 YFRGELGQPIK 127
F + G +K
Sbjct: 182 -FEKQFGFLLK 191
>gnl|CDD|177464 PHA02682, PHA02682, ORF080 virion core protein; Provisional.
Length = 280
Score = 30.2 bits (67), Expect = 5.1
Identities = 21/91 (23%), Positives = 35/91 (38%), Gaps = 15/91 (16%)
Query: 39 TGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPT 98
+ AP P+P+ + P+ A P A P + P P AP+ PP+
Sbjct: 95 CPACAPAAPAPAVTCPAPAPACPPA-----------TAPTCPPPAVCPAPARPAPACPPS 143
Query: 99 QQM----KALSISKSKPPSEPVYFRGELGQP 125
+ L K P ++P++ +L P
Sbjct: 144 TRQCPPAPPLPTPKPAPAAKPIFLHNQLPPP 174
>gnl|CDD|220093 pfam09030, Creb_binding, Creb binding. The Creb binding domain
assumes a structure comprising three alpha-helices
which pack in a bundle, exposing a hydrophobic groove
between alpha-1 and alpha-3 within which complementary
domains found in the protein 'activator for thyroid
hormone and retinoid receptors' (ACTR) can dock. Docking
of these domains is required for the recruitment of RNA
polymerase II and the basal transcription machinery.
Length = 104
Score = 28.5 bits (63), Expect = 5.5
Identities = 21/74 (28%), Positives = 31/74 (41%), Gaps = 15/74 (20%)
Query: 44 PGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGD------APSSPP 97
PG+P P +Q + P+ L+ G+ +P D +PSSP
Sbjct: 19 PGMPRPVMQMVAQHAVAGPR--------PGLVQPGISRGIVSPNALQDLLRTLKSPSSP- 69
Query: 98 TQQMKALSISKSKP 111
QQ + L+I KS P
Sbjct: 70 QQQQQVLNILKSNP 83
>gnl|CDD|151482 pfam11035, SnAPC_2_like, Small nuclear RNA activating complex
subunit 2-like. This family of proteins is SnAPC
subunit 2-like. SnAPC allows the transcription of human
small nuclear RNA genes to occur by recognition of the
proximal sequence element.
Length = 344
Score = 30.0 bits (67), Expect = 5.7
Identities = 15/54 (27%), Positives = 18/54 (33%), Gaps = 7/54 (12%)
Query: 28 GSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVP 81
G LP + D GS P QAS A + G A G+ P
Sbjct: 279 GKLPPGTEDGGAGSTGP-------EETDQASPQASEPAGSSEPRSAWQAAGICP 325
>gnl|CDD|139048 PRK12538, PRK12538, RNA polymerase sigma factor; Provisional.
Length = 233
Score = 29.8 bits (67), Expect = 5.8
Identities = 22/76 (28%), Positives = 36/76 (47%), Gaps = 18/76 (23%)
Query: 718 VSDGQLDSVSRVEIDQYQQIVDTIMTTLPSCSYAPKITAIIVQKRINTKIFQLLSAGERP 777
V+DG+ D+VS +E ++ +++ M LP Q+RI +LS E
Sbjct: 145 VADGKPDAVSVIERNELSDLLEAAMQRLPE------------QQRIAV----ILSYHE-- 186
Query: 778 NLANAPSGSVLDHTVT 793
N++N V+D TV
Sbjct: 187 NMSNGEIAEVMDTTVA 202
>gnl|CDD|221825 pfam12877, DUF3827, Domain of unknown function (DUF3827). This
family contains the human KIAA1549 protein which has
been found to be fused to the BRAF gene in many cases
of pilocytic astrocytomas. The fusion is due mainly to a
tandem duplication of 2 Mb at 7q34. Although nothing is
known about the function of KIAA1549 protein, the BRAF
protein is a well characterized oncoprotein. It is a
serine/threonine protein kinase which is implicated in
MAP/ERK signalling, a critical pathway for the
regulation of cell division, differentiation and
secretion.
Length = 684
Score = 29.9 bits (67), Expect = 6.5
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 7/73 (9%)
Query: 48 SPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSIS 107
S S ++ + + GR + A+ + G AP S +Q+ + SI
Sbjct: 389 GDSEGSSVISNRSSREKSGRPSTTPSVTAQQKP--TKEEGRKKPAPPSGTDEQLSSASIF 446
Query: 108 K-----SKPPSEP 115
+ S+P S+P
Sbjct: 447 EHVDRLSRPSSDP 459
>gnl|CDD|223061 PHA03369, PHA03369, capsid maturational protease; Provisional.
Length = 663
Score = 30.0 bits (67), Expect = 6.5
Identities = 23/87 (26%), Positives = 30/87 (34%), Gaps = 2/87 (2%)
Query: 31 PTTSTDDTTGSHAPGIP-SPSTSSPSQASEPAPKHIGRGRLLQQLLAKGVVPTSGTPGPG 89
T D GIP S SP A P P+ G L+ + TS P P
Sbjct: 375 HTGPADRQRPQRPDGIPYSVPARSPMTAYPPVPQFCGDPGLVSPYNPQSPG-TSYGPEPV 433
Query: 90 GDAPSSPPTQQMKALSISKSKPPSEPV 116
G P P + +S++ P P
Sbjct: 434 GPVPPQPTNPYVMPISMANMVYPGHPQ 460
>gnl|CDD|218408 pfam05062, RICH, RICH domain. This presumed domain is about 85
residues in length and very rich in charged residues,
hence the name RICH (Rich In CHarged residues). It is
found in secreted proteins such as PspC, SpsA, and IgA
FC receptor from Streptococcus agalactiae. This domain
could be involved in bacterial adherence or cell wall
binding.
Length = 81
Score = 27.7 bits (62), Expect = 7.0
Identities = 9/27 (33%), Positives = 15/27 (55%), Gaps = 1/27 (3%)
Query: 425 VDPNQKLQAI-SKYINNVNNNKETSEL 450
V +KL AI ++Y+ ++ K EL
Sbjct: 35 VALIKKLSAIKTEYLYELDVLKTKVEL 61
>gnl|CDD|221371 pfam12004, DUF3498, Domain of unknown function (DUF3498). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 433 and 538 amino acids in length. This domain is
found associated with pfam00616, pfam00168. This domain
has two conserved sequence motifs: DLQ and PLSFQNP.
Length = 489
Score = 29.7 bits (66), Expect = 7.3
Identities = 26/121 (21%), Positives = 42/121 (34%), Gaps = 16/121 (13%)
Query: 21 EEEEKRPGSLPTTSTDDTTGSHAPGIPSPSTSSPSQASEPAPKHIGRGR-------LLQQ 73
EE +R H P +P +++ P + + G LL
Sbjct: 225 EEFTRRSTDFTRRQLSLPDRQHQPALPRQNSAGPQRRVDQPSPPGGGSHRGRIPPSLLSS 284
Query: 74 LLAKGVVPTSGTPGPGGDAPSSPPTQQMKALSISKSKPPSEPVYFRGELGQPIKVMVNYI 133
L ++G + +S P G + P QQ S S E G L QP V ++ +
Sbjct: 285 LPSEGSMLSSEWPQSG-----ARPRQQ----SSSSKGDSPELRPAAGHLQQPSPVNMSAL 335
Query: 134 D 134
+
Sbjct: 336 E 336
>gnl|CDD|220582 pfam10117, McrBC, McrBC 5-methylcytosine restriction system
component. Members of this family of bacterial proteins
modify the specificity of mcrB restriction by expanding
the range of modified sequences restricted.
Length = 317
Score = 29.5 bits (67), Expect = 8.4
Identities = 18/89 (20%), Positives = 30/89 (33%), Gaps = 8/89 (8%)
Query: 505 NFDQWVLIHIRRDQRNADNFLNCLNRNSNAIGIRVKKPQVIALQEEQTLSYLTALKSMRS 564
N +R D+ DN LN L + +A+ +K + L L L
Sbjct: 109 NPGHKHKFFVRYDEFTEDNPLNRLLK--SALERLLKLTRSSENLRL--LRELLFLLDEVP 164
Query: 565 DTQFVVIIFNAPRTDR----YQAVKKYCC 589
D++ F R +R Y+ +
Sbjct: 165 DSKISAKDFQKWRLNRLNARYELLLPLAR 193
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.135 0.393
Gapped
Lambda K H
0.267 0.0717 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 44,914,723
Number of extensions: 4436124
Number of successful extensions: 3984
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3912
Number of HSP's successfully gapped: 68
Length of query: 884
Length of database: 10,937,602
Length adjustment: 105
Effective length of query: 779
Effective length of database: 6,280,432
Effective search space: 4892456528
Effective search space used: 4892456528
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 63 (28.1 bits)
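
The figures in the statistics footer above can be cross-checked with a short sketch. It assumes the standard Karlin-Altschul conversions: an X-dropoff in bits is raw * Lambda / ln 2, a score cutoff in bits is (Lambda * raw - ln K) / ln 2, and the effective database length subtracts the length adjustment once per sequence. The function names are hypothetical and are not part of BLAST. Per-hit bit scores are not checked here and may not follow from these footer parameters exactly.

```python
import math

# Parameters copied from the statistics footer of this report.
LAMBDA_UNGAPPED, K_UNGAPPED = 0.319, 0.135
LAMBDA_GAPPED, K_GAPPED = 0.267, 0.0717


def xdrop_bits(raw, lam):
    """Convert a raw X-dropoff value to bits (no K term is applied)."""
    return raw * lam / math.log(2)


def score_bits(raw, lam, k):
    """Convert a raw score cutoff to a bit score."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_lengths(query_len, db_len, n_seqs, adjustment):
    """Return (effective query length, effective db length, search space)."""
    eff_query = query_len - adjustment
    eff_db = db_len - n_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


if __name__ == "__main__":
    print(effective_lengths(884, 10_937_602, 44_354, 105))
    print("X1", round(xdrop_bits(16, LAMBDA_UNGAPPED), 1))
    print("X2", round(xdrop_bits(38, LAMBDA_GAPPED), 1))
    print("X3", round(xdrop_bits(64, LAMBDA_GAPPED), 1))
    print("S1", round(score_bits(41, LAMBDA_UNGAPPED, K_UNGAPPED), 1))
    print("S2", round(score_bits(63, LAMBDA_GAPPED, K_GAPPED), 1))
```

X1 and S1 use the ungapped parameters; X2, X3 and S2 use the gapped ones, which is the pairing that reproduces the bit values printed in the footer.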