RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy15165
(606 letters)
>gnl|CDD|222150 pfam13465, zf-H2C2_2, Zinc-finger double domain.
Length = 26
Score = 40.5 bits (95), Expect = 3e-05
Identities = 17/26 (65%), Positives = 19/26 (73%)
Query: 384 YLRRHMRVHTNEKPYKCKDCGAAFNH 409
LRRHMR HT EKPYKC CG +F+
Sbjct: 1 NLRRHMRTHTGEKPYKCPVCGKSFSS 26
Score = 34.3 bits (79), Expect = 0.005
Identities = 18/25 (72%), Positives = 20/25 (80%), Gaps = 1/25 (4%)
Query: 547 YLKRHMRTHTNEKPYKC-VCGLGFN 570
L+RHMRTHT EKPYKC VCG F+
Sbjct: 1 NLRRHMRTHTGEKPYKCPVCGKSFS 25
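The "Score = ..." and "Identities = ..." lines in each alignment block follow a fixed layout, so they can be pulled into numbers for filtering or sorting. The sketch below is a minimal illustration, not an official parser: the function names are hypothetical, and the handling of a bare "e-105" style E-value is an assumption about legacy BLAST formatting that does not occur in this report. The percentages in this report appear to be truncated rather than rounded (for example, 11/23 is shown as 47%), which the sketch can be used to check.

```python
import re

# Patterns for the two statistics lines printed under each hit.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_score_line(line):
    """Return bit score, raw score and E-value from a 'Score = ...' line, or None."""
    m = SCORE_RE.search(line)
    if m is None:
        return None
    evalue_text = m.group(3)
    # Assumption: very small E-values may be printed without a mantissa ("e-105").
    if evalue_text.lower().startswith("e"):
        evalue_text = "1" + evalue_text
    return {"bits": float(m.group(1)), "raw": int(m.group(2)), "evalue": float(evalue_text)}


def parse_fraction_line(line):
    """Return {name: (numerator, denominator, printed_percent)} from an 'Identities = ...' line."""
    return {
        name.lower(): (int(num), int(den), int(pct))
        for name, num, den, pct in FRACTION_RE.findall(line)
    }


def percent_truncated(numerator, denominator):
    """Integer percentage, truncated toward zero as the report appears to do."""
    return numerator * 100 // denominator
```

For instance, parse_score_line("Score = 34.3 bits (79), Expect = 0.005") yields the bit score 34.3, raw score 79 and E-value 0.005 of the second pfam13465 alignment above.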
>gnl|CDD|197676 smart00355, ZnF_C2H2, zinc finger.
Length = 23
Score = 30.9 bits (70), Expect = 0.091
Identities = 11/23 (47%), Positives = 12/23 (52%)
Query: 370 YICEYCHKEFTFYNYLRRHMRVH 392
Y C C K F + LR HMR H
Sbjct: 1 YRCPECGKVFKSKSALREHMRTH 23
Score = 30.5 bits (69), Expect = 0.14
Identities = 10/23 (43%), Positives = 12/23 (52%)
Query: 533 FVCEYCNKEFTFLQYLKRHMRTH 555
+ C C K F L+ HMRTH
Sbjct: 1 YRCPECGKVFKSKSALREHMRTH 23
Score = 28.2 bits (63), Expect = 0.72
Identities = 9/21 (42%), Positives = 14/21 (66%)
Query: 338 HVCPHCGKKFTRKAELQLHIK 358
+ CP CGK F K+ L+ H++
Sbjct: 1 YRCPECGKVFKSKSALREHMR 21
Score = 25.9 bits (57), Expect = 4.7
Identities = 8/21 (38%), Positives = 12/21 (57%)
Query: 501 LQCPHCPKTFPRKTELSNHIK 521
+CP C K F K+ L H++
Sbjct: 1 YRCPECGKVFKSKSALREHMR 21
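Each hit starts with a ">gnl|CDD|..." line that carries a numeric identifier, an accession, a short name and the start of the description. The following is a minimal sketch for splitting that line into fields, assuming the comma-separated layout seen in this report; the function name and the dictionary keys (including "pssm_id" for the number after "gnl|CDD|") are labels chosen here for illustration. The description often wraps onto the following lines, which this sketch does not join.

```python
import re

# >gnl|CDD|<number> <accession>, <short name>, <description...>
HEADER_RE = re.compile(r"^>gnl\|CDD\|(\d+)\s+([^,\s]+),\s*([^,]+),\s*(.*)$")


def parse_hit_header(line):
    """Split the first line of a hit into its fields, or return None if it is not a header."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    return {
        "pssm_id": int(m.group(1)),      # number following "gnl|CDD|"
        "accession": m.group(2),         # e.g. smart00355, pfam13465, COG5236
        "short_name": m.group(3).strip(),
        "description": m.group(4).strip(),  # first line of the description only
    }
```

Applied to the smart00355 header above, this returns the accession "smart00355" and the short name "ZnF_C2H2".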
>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27. This protein forms
the C subunit of DNA polymerase delta. It carries the
essential residues for binding to the Pol1 subunit of
polymerase alpha, from residues 293-332, which are
characterized by the motif D--G--VT, referred to as the
DPIM motif. The first 160 residues of the protein form
the minimal domain for binding to the B subunit, Cdc1,
of polymerase delta, the final 10 C-terminal residues,
362-372, being the DNA sliding clamp, PCNA, binding
motif.
Length = 427
Score = 35.2 bits (81), Expect = 0.093
Identities = 34/193 (17%), Positives = 71/193 (36%), Gaps = 13/193 (6%)
Query: 16 PSRRRSLGVESSVPKLKIKTETDDKSWEDKSLLEPEIKIKVEQGQATTSDETEEDDNTRQ 75
+ +S S KS++ PE+K+K + TS ET + +
Sbjct: 140 GVGLPPVAPAASPALKPTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTSKETTTEKTEGK 199
Query: 76 TSFKTIISKYVG-------YNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDA 128
TS K K + T +K+ K+ A ++ ++ ++ R ++
Sbjct: 200 TSVKAASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDVILEDES 259
Query: 129 PVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFAS 188
+ E+ ++ + E+ +K ++E+R+++ + D D
Sbjct: 260 AEPTGLDE----DEDEDEPKPSGERSDSEEETEEK--EKEKRKRLKKMMEDEDEDEEMEI 313
Query: 189 LDEMSSEEEEEED 201
+ E EEEE E+
Sbjct: 314 VPESPVEEEESEE 326
>gnl|CDD|200998 pfam00096, zf-C2H2, Zinc finger, C2H2 type. The C2H2 zinc finger
is the classical zinc finger domain. The two conserved
cysteines and histidines co-ordinate a zinc ion. The
following pattern describes the zinc finger.
#-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can
be any amino acid, and numbers in brackets indicate the
number of residues. The positions marked # are those
that are important for the stable fold of the zinc
finger. The final position can be either his or cys. The
C2H2 zinc finger is composed of two short beta strands
followed by an alpha helix. The amino terminal part of
the helix binds the major groove in DNA binding zinc
fingers. The accepted consensus binding sequence for Sp1
is usually defined by the asymmetric hexanucleotide core
GGGCGG but this sequence does not include, among others,
the GAG (=CTC) repeat that constitutes a high-affinity
site for Sp1 binding to the wt1 promoter.
Length = 22
Score = 30.8 bits (70), Expect = 0.11
Identities = 11/22 (50%), Positives = 13/22 (59%)
Query: 534 VCEYCNKEFTFLQYLKRHMRTH 555
C C K F+ LKRH+RTH
Sbjct: 1 KCPDCGKSFSRKSNLKRHLRTH 22
Score = 29.2 bits (66), Expect = 0.37
Identities = 9/22 (40%), Positives = 13/22 (59%)
Query: 371 ICEYCHKEFTFYNYLRRHMRVH 392
C C K F+ + L+RH+R H
Sbjct: 1 KCPDCGKSFSRKSNLKRHLRTH 22
Score = 28.9 bits (65), Expect = 0.48
Identities = 10/20 (50%), Positives = 15/20 (75%)
Query: 339 VCPHCGKKFTRKAELQLHIK 358
CP CGK F+RK+ L+ H++
Sbjct: 1 KCPDCGKSFSRKSNLKRHLR 20
Score = 27.7 bits (62), Expect = 1.1
Identities = 9/20 (45%), Positives = 14/20 (70%)
Query: 502 QCPHCPKTFPRKTELSNHIK 521
+CP C K+F RK+ L H++
Sbjct: 1 KCPDCGKSFSRKSNLKRHLR 20
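The pattern quoted in the pfam00096 description, #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], can be translated literally into a regular expression to scan a sequence for candidate C2H2 fingers. This is only a rough sketch under stated assumptions: the positions marked # are treated as "any residue" because the description does not say which residues are allowed there, and the names C2H2_RE and find_c2h2 are hypothetical. A pattern this permissive is far less specific than the profile search that produced the hits above, so matches are candidates only.

```python
import re

# Literal translation of the pattern from the pfam00096 description.
C2H2_RE = re.compile(
    r"."        # '#': fold-stabilising position (assumed: any residue)
    r"."        # X
    r"C"        # first zinc-coordinating cysteine
    r".{1,5}"   # X(1-5)
    r"C"        # second cysteine
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first histidine
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final position: His or Cys
)


def find_c2h2(seq, offset=1):
    """Return (start, end, text) for non-overlapping matches.

    Coordinates are inclusive; `offset` is the sequence position of seq[0].
    """
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in C2H2_RE.finditer(seq)
    ]
```

For example, find_c2h2("YICEYCHKEFTFYNYLRRHMRVH", offset=370) reports a single match spanning 370 to 392, the same query range as the smart00355 alignment earlier in this report.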
>gnl|CDD|227561 COG5236, COG5236, Uncharacterized conserved protein, contains RING
Zn-finger [General function prediction only].
Length = 493
Score = 34.6 bits (79), Expect = 0.14
Identities = 22/97 (22%), Positives = 35/97 (36%), Gaps = 8/97 (8%)
Query: 474 QKQICEICCAEVYHINGHIKDKHSGFF-LQCPHCPKTFP------RKTELSNHIKGIHMK 526
K C C + + H K +H +C K F R + L +H G +
Sbjct: 155 PKSKCHRRCGSLKELKKHYKAQHGFVLCSECIGNKKDFWNEIRLFRSSTLRDHKNGGLEE 214
Query: 527 HELRQTFVCEYCNKEFTFLQYLKRHMRTHTNEKPYKC 563
+ +C +C F L+RH R +E + C
Sbjct: 215 EGFKGHPLCIFCKIYFYDDDELRRHCRLR-HEACHIC 250
>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
Length = 1388
Score = 34.6 bits (80), Expect = 0.19
Identities = 29/208 (13%), Positives = 67/208 (32%), Gaps = 10/208 (4%)
Query: 4 DDWNNLDDHVGKPSRRRSLGVESSVPKLKIKTETDDKSWEDKSLLEPEIKIKVEQGQATT 63
+ D K S + S K K+ + D+K + + + + + + ++
Sbjct: 1180 KKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSS 1239
Query: 64 SDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASS-----GSDTPKKRR 118
+ N S + + + + +A S D
Sbjct: 1240 VKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGG 1299
Query: 119 GRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEE-----RRRKI 173
+ S + R A +++++ K +++K+ + + A + R RK
Sbjct: 1300 SKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKK 1359
Query: 174 DLDKKPSDMDNVFASLDEMSSEEEEEED 201
D D D+ E +E++E+D
Sbjct: 1360 KSDSSSEDDDDSEVDDSEDEDDEDDEDD 1387
>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457). This is
a family of uncharacterized proteins.
Length = 449
Score = 33.8 bits (77), Expect = 0.26
Identities = 31/150 (20%), Positives = 60/150 (40%), Gaps = 6/150 (4%)
Query: 60 QATTSDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRG 119
+A +D +E D+ + F + + L T+ ++++R +S+ S +K+R
Sbjct: 104 EAGFADSDDESDDGSEYVFWAPGTTTAATSPRKLETMRRKSRRRTSDSSADSLNERKQRR 163
Query: 120 RSRSKGRDAPVSRKRNRRAKTPS-QEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKK 178
+ + R + R TP + TD ++ + + E RR
Sbjct: 164 KWKRPRRSPI--KPPKIRPGTPELPDSTDFVCGTLDEDRPLEAAYKSCMEARRLSKQVVI 221
Query: 179 PSDMDNVFASLDEMSSEEEEEEDWDEIHLG 208
P D+D F + D E+EE+E D +
Sbjct: 222 PQDIDPSFPTSD---PEDEEDELDDVEEVI 248
Score = 29.6 bits (66), Expect = 6.0
Identities = 15/52 (28%), Positives = 29/52 (55%)
Query: 155 KNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEIH 206
+N +K+ K A+EE + D D++ D D+ D+ ++E++ED D+
Sbjct: 35 ENAIRKLGKEAEEEAMEEEDDDEEDDDDDDDEDEDDDDDDDDEDDEDEDDDD 86
>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
Length = 520
Score = 34.0 bits (79), Expect = 0.26
Identities = 17/69 (24%), Positives = 41/69 (59%), Gaps = 6/69 (8%)
Query: 134 RNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERR---RKIDLDKKPSDMDNVFASLD 190
R RR + E+ ++LQ+++NL++K++ + K E ++ +L++K +++ L+
Sbjct: 78 RERRNELQKLEK---RLLQKEENLDRKLELLEKREEELEKKEKELEQKQQELEKKEEELE 134
Query: 191 EMSSEEEEE 199
E+ E+ +E
Sbjct: 135 ELIEEQLQE 143
>gnl|CDD|227674 COG5384, Mpp10, U3 small nucleolar ribonucleoprotein component
[Translation, ribosomal structure and biogenesis].
Length = 569
Score = 33.5 bits (76), Expect = 0.32
Identities = 13/56 (23%), Positives = 21/56 (37%)
Query: 159 KKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEIHLGQFSPYE 214
KK + + ++D ++ S MD V L +E E E S +E
Sbjct: 223 KKHSDVKDPKEDEELDEEEHDSAMDKVKLDLFADEEDEPNAEGVGEASDKNLSSFE 278
>gnl|CDD|220661 pfam10263, SprT-like, SprT-like family. This family represents a
domain found in eukaryotes and prokaryotes. The domain
contains a characteristic motif of the zinc
metallopeptidases. This family includes the bacterial
SprT protein.
Length = 153
Score = 32.4 bits (74), Expect = 0.35
Identities = 18/98 (18%), Positives = 27/98 (27%), Gaps = 24/98 (24%)
Query: 321 EVVHLAIHKKHSHSGQYHVCPHCGKKFTRKAEL--QLHIKGIHLKHQLEKT--------- 369
E+ H A+ G H G +F H+ +
Sbjct: 65 EMCHAALFLLFGGRGYPH-----GDEFKALMAQVGGAGPLEPTTTHRFDIEVVSGRKRYI 119
Query: 370 YICEYCHKEFTFYNYLRRHMRVHTNEKPYKCKDCGAAF 407
Y C C + + +RRH Y+C CG
Sbjct: 120 YRCGSCGQLYPRKRRIRRHK--------YRCGRCGGKL 149
>gnl|CDD|206065 pfam13894, zf-C2H2_4, C2H2-type zinc finger. This family contains
a number of divergent C2H2 type zinc fingers.
Length = 24
Score = 28.8 bits (64), Expect = 0.47
Identities = 11/23 (47%), Positives = 13/23 (56%)
Query: 533 FVCEYCNKEFTFLQYLKRHMRTH 555
F C C K F+ LKRH+R H
Sbjct: 1 FKCPLCGKSFSSKDALKRHLRKH 23
Score = 28.4 bits (63), Expect = 0.82
Identities = 9/23 (39%), Positives = 14/23 (60%)
Query: 370 YICEYCHKEFTFYNYLRRHMRVH 392
+ C C K F+ + L+RH+R H
Sbjct: 1 FKCPLCGKSFSSKDALKRHLRKH 23
Score = 28.0 bits (62), Expect = 0.96
Identities = 9/23 (39%), Positives = 13/23 (56%)
Query: 502 QCPHCPKTFPRKTELSNHIKGIH 524
+CP C K+F K L H++ H
Sbjct: 2 KCPLCGKSFSSKDALKRHLRKHH 24
Score = 28.0 bits (62), Expect = 1.1
Identities = 10/24 (41%), Positives = 14/24 (58%)
Query: 338 HVCPHCGKKFTRKAELQLHIKGIH 361
CP CGK F+ K L+ H++ H
Sbjct: 1 FKCPLCGKSFSSKDALKRHLRKHH 24
Score = 26.1 bits (57), Expect = 4.4
Identities = 10/24 (41%), Positives = 14/24 (58%)
Query: 249 FKCRVCDWKLNSYDKLLRHIKSDH 272
FKC +C +S D L RH++ H
Sbjct: 1 FKCPLCGKSFSSKDALKRHLRKHH 24
>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777). This is
a family of eukaryotic proteins of unknown function.
Some of the proteins in this family are putative nucleic
acid binding proteins.
Length = 158
Score = 31.8 bits (72), Expect = 0.48
Identities = 19/92 (20%), Positives = 47/92 (51%), Gaps = 7/92 (7%)
Query: 99 RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRK-------RNRRAKTPSQEETDAKIL 151
R+ R +R ++ R R RS+ R+ R+ R+RR+++P + + ++
Sbjct: 7 RSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSP 66
Query: 152 QQQKNLEKKIKKMAKEERRRKIDLDKKPSDMD 183
++++ +++ K A+E ++R+ K D++
Sbjct: 67 SRRRDRKRERDKDAREPKKRERQKLIKEEDLE 98
Score = 28.3 bits (63), Expect = 6.6
Identities = 19/72 (26%), Positives = 37/72 (51%), Gaps = 4/72 (5%)
Query: 97 DKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPS----QEETDAKILQ 152
+R++ + S S +P++ R RSRS R R+R++ A+ P Q+ + L+
Sbjct: 39 RRRSRSRSPHRSRRSRSPRRHRSRSRSPSRRRDRKRERDKDAREPKKRERQKLIKEEDLE 98
Query: 153 QQKNLEKKIKKM 164
+ + E ++ KM
Sbjct: 99 GKSDEEVEMMKM 110
>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
Length = 1068
Score = 33.1 bits (76), Expect = 0.48
Identities = 23/90 (25%), Positives = 47/90 (52%), Gaps = 7/90 (7%)
Query: 117 RRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKE-----ERRR 171
R R+R +GR+ +RNRR Q+ + + QQ + EK + ++ ER+R
Sbjct: 624 RDNRTRREGRENREENRRNRRQAQ--QQTAETRESQQAEVTEKARTQDEQQQAPRRERQR 681
Query: 172 KIDLDKKPSDMDNVFASLDEMSSEEEEEED 201
+ + +K+ + + +++E S +E E+E+
Sbjct: 682 RRNDEKRQAQQEAKALNVEEQSVQETEQEE 711
>gnl|CDD|213729 TIGR02605, CxxC_CxxC_SSSS, putative regulatory protein, FmdB
family. This model represents a region of about 50
amino acids found in a number of small proteins in a
wide range of bacteria. The region begins usually with
the initiator Met and contains two CxxC motifs separated
by 17 amino acids. One member of this family has been
noted as a putative regulatory protein, designated FmdB
(SP:Q50229, PMID:8841393 ). Most members of this family
have a C-terminal region containing highly degenerate
sequence, such as
SSTSESTKSSGSSGSSGSSESKASGSTEKSTSSTTAAAAV in
Mycobacterium tuberculosis and
VAVGGSAPAPSPAPRAGGGGGGCCGGGCCG in Streptomyces
avermitilis. These low complexity regions, which are not
included in the model, resemble low-complexity
C-terminal regions of some heterocycle-containing
bacteriocin precursors [Regulatory functions, DNA
interactions].
Length = 52
Score = 29.6 bits (67), Expect = 0.49
Identities = 7/26 (26%), Positives = 13/26 (50%)
Query: 398 YKCKDCGAAFNHNVSLKNHKNSSCPK 423
Y+C CG F + + ++CP+
Sbjct: 6 YRCTACGHRFEVLQKMSDDPLATCPE 31
>gnl|CDD|112562 pfam03753, HHV6-IE, Human herpesvirus 6 immediate early protein.
The proteins in this family are poorly characterized,
but an investigation has indicated that the immediate
early protein is required for the down-regulation of MHC
class I expression in dendritic cells. Human herpesvirus
6 immediate early protein is also referred to as U90.
Length = 993
Score = 33.1 bits (75), Expect = 0.57
Identities = 33/150 (22%), Positives = 54/150 (36%), Gaps = 17/150 (11%)
Query: 37 TDDKSWEDKSLL--EPEIKIKVEQGQATTSDETEEDDNTRQTSFKTII----------SK 84
D S +S EI ++ ++ T F +
Sbjct: 585 ETDHSAPYESESDNNDEIDYIASVDSGNRTNNIHMNNTNENTPFSKSGKSPPEVTPSKTF 644
Query: 85 YVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQE 144
Y D++T K KRTA+ + G T K ++ +S S D V + S++
Sbjct: 645 YKRDKKKDISTNRKVKKRTAKRKTVGYKTDKSKKIKSDSLPTDTNVI-----VISSESED 699
Query: 145 ETDAKILQQQKNLEKKIKKMAKEERRRKID 174
E D + ++ L+KKIK K E + D
Sbjct: 700 EEDGFNIIKKSQLKKKIKSELKSESSSESD 729
>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
Transcription initiation factor IIA (TFIIA) is a
heterotrimer, the three subunits being known as alpha,
beta, and gamma, in order of molecular weight. The N and
C-terminal domains of the gamma subunit are represented
in pfam02268 and pfam02751, respectively. This family
represents the precursor that yields both the alpha and
beta subunits. The TFIIA heterotrimer is an essential
general transcription initiation factor for the
expression of genes transcribed by RNA polymerase II.
Together with TFIID, TFIIA binds to the promoter region;
this is the first step in the formation of a
pre-initiation complex (PIC). Binding of the rest of the
transcription machinery follows this step. After
initiation, the PIC does not completely dissociate from
the promoter. Some components, including TFIIA, remain
attached and re-initiate a subsequent round of
transcription.
Length = 332
Score = 32.4 bits (74), Expect = 0.60
Identities = 16/71 (22%), Positives = 25/71 (35%), Gaps = 2/71 (2%)
Query: 136 RRAKTPSQEETDAKILQQQKNLE--KKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMS 193
R + E K + ++ K+ KK AK +RR I D S D+
Sbjct: 202 RLREADGTLEQRIKGAEGGGAMKVLKQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDD 261
Query: 194 SEEEEEEDWDE 204
+ E + D
Sbjct: 262 EDAIESDLDDS 272
>gnl|CDD|217373 pfam03115, Astro_capsid, Astrovirus capsid protein precursor. This
product is encoded by astrovirus ORF2, one of the three
astrovirus ORFs (1a, 1b, 2). The 87kD precursor protein
undergoes an intracellular cleavage to form a 79kD
protein. Subsequently, extracellular trypsin cleavage
yields the three proteins forming the infectious virion.
Length = 787
Score = 32.8 bits (75), Expect = 0.67
Identities = 14/47 (29%), Positives = 21/47 (44%), Gaps = 9/47 (19%)
Query: 116 KRRGRSRSKGRDAPV---------SRKRNRRAKTPSQEETDAKILQQ 153
K R RS+S+GR V R++N R K S + + +Q
Sbjct: 22 KSRARSQSRGRGRSVKITVNSRNKGRRQNGRNKYQSNQRVRNIVNKQ 68
>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
domain. This domain is found in a number of different
types of plant proteins including NAM-like proteins.
Length = 147
Score = 31.2 bits (71), Expect = 0.72
Identities = 21/113 (18%), Positives = 46/113 (40%), Gaps = 14/113 (12%)
Query: 98 KRAKRTARAASSGSDT---------PKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDA 148
K+ R+ SS T + R +GR ++++ RR K +++E
Sbjct: 28 ASKKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKK--AKEKLRRDKLKAKKEEAE 85
Query: 149 KILQQQKNLEKKIKKMAKEE---RRRKIDLDKKPSDMDNVFASLDEMSSEEEE 198
K ++++ K + + KE ++K + + +FA +S E+ +
Sbjct: 86 KEKEKEERFMKALAEAEKERAELEKKKAEAKLMKEEKKIMFADTSSLSPEQRQ 138
>gnl|CDD|227701 COG5414, COG5414, TATA-binding protein-associated factor
[Transcription].
Length = 392
Score = 32.4 bits (73), Expect = 0.75
Identities = 29/141 (20%), Positives = 53/141 (37%), Gaps = 15/141 (10%)
Query: 72 NTRQTSFKTIISK----YVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSR---SK 124
R F+ SK V +DDL D +A+ + + ++ R S +
Sbjct: 187 YVRARRFRKKSSKIEIEEVEKKVDDLLEKDMKAESVSVVLKDEKELARQERVSSWENFKE 246
Query: 125 GRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDN 184
P+SR ++ K ++EE + + E+ + A E +++ K +
Sbjct: 247 EPGEPLSRPALKKEKQGAEEEGEEGMS------EEDLDVGAAEIENKEVSEGDKEQQQEE 300
Query: 185 VFASLDEMSSEEEEEEDWDEI 205
V E EE + + DEI
Sbjct: 301 VEN--AEAHKEEVQSDRPDEI 319
>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
and chromosome partitioning].
Length = 420
Score = 32.4 bits (74), Expect = 0.75
Identities = 16/80 (20%), Positives = 36/80 (45%), Gaps = 3/80 (3%)
Query: 122 RSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSD 181
+ K ++ + + Q++ AK+ +Q K+LE +I + + DL K
Sbjct: 39 QLKQIQKEIAALEKKIRE---QQDQRAKLEKQLKSLETEIASLEAQLIETADDLKKLRKQ 95
Query: 182 MDNVFASLDEMSSEEEEEED 201
+ ++ A L+ + +E E+
Sbjct: 96 IADLNARLNALEVQEREQRR 115
>gnl|CDD|222425 pfam13865, FoP_duplication, C-terminal duplication domain of Friend
of PRMT1. Fop, or Friend of Prmt1, proteins are
conserved from fungi and plants to vertebrates. There is
little that is actually conserved except for this
C-terminal LDXXLDAYM region (where X is any amino acid).
The Fop proteins themselves are nuclear proteins
localised to regions with low levels of DAPI, with a
punctate/speckle-like distribution. Fop is a
chromatin-associated protein and it colocalises with
facultative heterochromatin. It is critical for
oestrogen-dependent gene activation.
Length = 76
Score = 29.7 bits (67), Expect = 0.80
Identities = 18/92 (19%), Positives = 34/92 (36%), Gaps = 19/92 (20%)
Query: 103 TARAASSGSDTPKKRRGRSRSKGRDAPVSRKRN--RRAKTPSQEETDAKILQQQKNLEKK 160
R S G + RG R + R + + + K ++E+ DA+ L++
Sbjct: 2 GGRKGSRGGKFRPRGRGARRGRRRGRGGRKGKGGAAKPKPKTREDLDAE-------LDQY 54
Query: 161 IKKMAKEERRRKIDLDKKPSDMDNVFASLDEM 192
+ + LD +D+D + DE
Sbjct: 55 MSTTKSK-------LD---ADLDAYMSKKDEK 76
Score = 28.6 bits (64), Expect = 2.1
Identities = 18/86 (20%), Positives = 24/86 (27%), Gaps = 22/86 (25%)
Query: 108 SSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKE 167
S G + + R R +G R R R K K K +E
Sbjct: 1 SGGRKGSRGGKFRPRGRGARRGRRRGRGGRKG---------------KGGAAKPKPKTRE 45
Query: 168 ERRRKIDLDKKPSD-MDNVFASLDEM 192
DLD + M + LD
Sbjct: 46 ------DLDAELDQYMSTTKSKLDAD 65
>gnl|CDD|221408 pfam12072, DUF3552, Domain of unknown function (DUF3552). This
presumed domain is functionally uncharacterized. This
domain is found in bacteria, archaea and eukaryotes.
This domain is about 200 amino acids in length. This
domain is found associated with pfam00013, pfam01966.
This domain has a single completely conserved residue A
that may be functionally important.
Length = 201
Score = 31.4 bits (72), Expect = 0.88
Identities = 15/69 (21%), Positives = 39/69 (56%), Gaps = 6/69 (8%)
Query: 134 RNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEER---RRKIDLDKKPSDMDNVFASLD 190
+ RR + QE+ ++LQ+++ L++K + + K+E ++ +L + ++ L+
Sbjct: 74 KERRNELQRQEK---RLLQKEETLDRKDESLEKKEESLEEKEKELAARQQQLEEKEEELE 130
Query: 191 EMSSEEEEE 199
E+ E+++E
Sbjct: 131 ELIEEQQQE 139
>gnl|CDD|218517 pfam05236, TAF4, Transcription initiation factor TFIID component
TAF4 family. This region of similarity is found in
Transcription initiation factor TFIID component TAF4.
Length = 255
Score = 31.6 bits (72), Expect = 0.89
Identities = 19/72 (26%), Positives = 32/72 (44%), Gaps = 4/72 (5%)
Query: 130 VSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERR--RKIDLDKKPSDMDNVFA 187
+SR R K+ E + + +Q + L +K K+ EERR R+ +L + + +
Sbjct: 91 LSRHRRDGIKSDPNYEIRSDVRRQLRFLAQKQKEE--EERRVERRRELGLEDPEQLRLKQ 148
Query: 188 SLDEMSSEEEEE 199
E E EE
Sbjct: 149 KAKEEQKAESEE 160
>gnl|CDD|220362 pfam09723, CxxC_CxxC_SSSS, Zinc ribbon domain. This entry
represents a region of about 41 amino acids found in a
number of small proteins in a wide range of bacteria.
The region usually begins with the initiator Met and
contains two CxxC motifs separated by 17 amino acids.
One protein in this entry has been noted as a putative
regulatory protein, designated FmdB. Most proteins in
this entry have a C-terminal region containing highly
degenerate sequence.
Length = 42
Score = 28.3 bits (64), Expect = 1.1
Identities = 8/26 (30%), Positives = 15/26 (57%)
Query: 398 YKCKDCGAAFNHNVSLKNHKNSSCPK 423
Y+C+DCG F + + ++CP+
Sbjct: 6 YRCEDCGHTFEVLQKISDDPLATCPE 31
>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
unknown].
Length = 652
Score = 32.0 bits (73), Expect = 1.1
Identities = 17/96 (17%), Positives = 40/96 (41%), Gaps = 8/96 (8%)
Query: 114 PKKRRGRSRSKGRDAPVSRKR--NRRAKTPSQEET----DAKILQQQKNLEKKIKKMA-- 165
P+++ G + R+ V KR EE ++ + ++ +EK ++
Sbjct: 403 PREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKREIEKLESELERF 462
Query: 166 KEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEED 201
+ E R K+ D++ D L++ E+++ +
Sbjct: 463 RREVRDKVRKDREIRARDRRIERLEKELEEKKKRVE 498
>gnl|CDD|236394 PRK09169, PRK09169, hypothetical protein; Validated.
Length = 2316
Score = 32.0 bits (73), Expect = 1.3
Identities = 20/116 (17%), Positives = 36/116 (31%), Gaps = 25/116 (21%)
Query: 99 RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRA---------KTPSQEETDAK 149
+ P R R R + DAP R + +E T +
Sbjct: 2 GPAHAPHKRRRDAAAPADPRPRRRPRLGDAPAPRTARADSGATPRGRPRAGADREPTSEQ 61
Query: 150 ILQQQKNLEK---------------KIKKMAKEERRRKIDLDKKPS-DMDNVFASL 189
+ ++ L++ ++ + ++ R RK+D D DNV A
Sbjct: 62 LRDYERWLDRAAAGQLDAQREQQCARLWFLVQQARARKVDPDFCLDLARDNVLAQR 117
>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y. Members of this family are
RNase Y, an endoribonuclease. The member from Bacillus
subtilis, YmdA, has been shown to be involved in
turnover of yitJ riboswitch [Transcription, Degradation
of RNA].
Length = 514
Score = 31.4 bits (72), Expect = 1.3
Identities = 19/68 (27%), Positives = 37/68 (54%), Gaps = 7/68 (10%)
Query: 134 RNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMS 193
+ RR + E ++LQ+++ L++K++ + K+E +L+KK ++ N +LDE
Sbjct: 72 KERRNELQRLER---RLLQREETLDRKMESLDKKEE----NLEKKEKELSNKEKNLDEKE 124
Query: 194 SEEEEEED 201
E EE
Sbjct: 125 EELEELIA 132
>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
This is a family of fungal proteins whose function is
unknown.
Length = 130
Score = 29.9 bits (68), Expect = 1.4
Identities = 21/72 (29%), Positives = 32/72 (44%), Gaps = 8/72 (11%)
Query: 96 IDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
+ K + + + KKR+ R K R A R RR +T + E + + +K
Sbjct: 64 LKKHNAKVEKELLREKEKKKKRK-RPGKKRRIA----LRLRRERTKERAEKEKRT---RK 115
Query: 156 NLEKKIKKMAKE 167
N EKK K+ KE
Sbjct: 116 NREKKFKRRQKE 127
>gnl|CDD|217503 pfam03344, Daxx, Daxx Family. The Daxx protein (also known as the
Fas-binding protein) is thought to play a role in
apoptosis, but the precise role played by Daxx remains to be
determined. Daxx forms a complex with Axin.
Length = 715
Score = 31.4 bits (71), Expect = 1.5
Identities = 21/121 (17%), Positives = 50/121 (41%), Gaps = 5/121 (4%)
Query: 81 IISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKT 140
+ISKY D ++R KR R S ++ S ++P ++ ++
Sbjct: 387 VISKYA--MKQDDTEEEERRKRQERERQGTSSRSS-DPSKASSTSGESPS--MASQESEE 441
Query: 141 PSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEE 200
E + + ++++ E++ ++ E+ + +++ + + S + EE EE
Sbjct: 442 EESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEE 501
Query: 201 D 201
D
Sbjct: 502 D 502
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 31.4 bits (71), Expect = 1.6
Identities = 18/106 (16%), Positives = 44/106 (41%), Gaps = 2/106 (1%)
Query: 96 IDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
++K + A + + PK G+ K ++ K+ ++ K P +E D K ++ K
Sbjct: 80 VEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEK-PKEEPKDRKPKEEAK 138
Query: 156 NLEKKIKKMAKEERRRKIDLDK-KPSDMDNVFASLDEMSSEEEEEE 200
+K ++E++ + D+ + + V A +++
Sbjct: 139 EKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPP 184
>gnl|CDD|227381 COG5048, COG5048, FOG: Zn-finger [General function prediction
only].
Length = 467
Score = 31.2 bits (70), Expect = 1.6
Identities = 17/50 (34%), Positives = 27/50 (54%), Gaps = 1/50 (2%)
Query: 532 TFVCEYCNKEFTFLQYLKRHMRTHTNEKPYKC-VCGLGFNFNVSLKNHKQ 580
C C F+ L++L RH+R+HT EKP +C G +F+ L+ +
Sbjct: 33 PDSCPNCTDSFSRLEHLTRHIRSHTGEKPSQCSYSGCDKSFSRPLELSRH 82
Score = 31.2 bits (70), Expect = 1.6
Identities = 21/93 (22%), Positives = 38/93 (40%), Gaps = 2/93 (2%)
Query: 330 KHSHSGQYHVCPHCGKKFTRKAELQLHIKGIHLKHQLEKTYIC--EYCHKEFTFYNYLRR 387
C F+R + L H++ ++ + K + C C K F+ + L+R
Sbjct: 282 SEKGFSLPIKSKQCNISFSRSSPLTRHLRSVNHSGESLKPFSCPYSLCGKLFSRNDALKR 341
Query: 388 HMRVHTNEKPYKCKDCGAAFNHNVSLKNHKNSS 420
H+ +HT+ P K K ++ + L N S
Sbjct: 342 HILLHTSISPAKEKLLNSSSKFSPLLNNEPPQS 374
Score = 30.0 bits (67), Expect = 3.7
Identities = 22/74 (29%), Positives = 33/74 (44%), Gaps = 4/74 (5%)
Query: 495 KHSGFFLQCPH--CPKTFPRKTELSNHIKGIHMKHELRQTFVC--EYCNKEFTFLQYLKR 550
GF L C +F R + L+ H++ ++ E + F C C K F+ LKR
Sbjct: 282 SEKGFSLPIKSKQCNISFSRSSPLTRHLRSVNHSGESLKPFSCPYSLCGKLFSRNDALKR 341
Query: 551 HMRTHTNEKPYKCV 564
H+ HT+ P K
Sbjct: 342 HILLHTSISPAKEK 355
>gnl|CDD|220297 pfam09581, Spore_III_AF, Stage III sporulation protein AF
(Spore_III_AF). This family represents the stage III
sporulation protein AF (Spore_III_AF) of the bacterial
endospore formation program, which exists in some but
not all members of the Firmicutes (formerly called
low-GC Gram-positives). The C-terminal region of these
proteins is poorly conserved.
Length = 185
Score = 30.3 bits (69), Expect = 1.7
Identities = 12/27 (44%), Positives = 17/27 (62%), Gaps = 1/27 (3%)
Query: 143 QEETDAKILQQ-QKNLEKKIKKMAKEE 168
Q A IL++ K LEK+++K KEE
Sbjct: 73 QASQRAYILEEYAKQLEKQVEKKLKEE 99
>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family. This model
represents a subfamily of RNA splicing factors including
the Pad-1 protein (N. crassa), CAPER (M. musculus) and
CC1.3 (H. sapiens). These proteins are characterized by
an N-terminal arginine-rich, low complexity domain
followed by three (or in the case of 4 H. sapiens
paralogs, two) RNA recognition domains (rrm: pfam00706).
These splicing factors are closely related to the U2AF
splicing factor family (TIGR01642). A homologous gene
from Plasmodium falciparum was identified in the course
of the analysis of that genome at TIGR and was included
in the seed.
Length = 457
Score = 31.0 bits (70), Expect = 1.9
Identities = 11/80 (13%), Positives = 29/80 (36%), Gaps = 5/80 (6%)
Query: 97 DKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPS----QEETDAKILQ 152
+ + S + + R R R + RD R+ R+++P+ +
Sbjct: 12 NDTRRSDKGRERSRRRSRSRDRSR-RRRDRDYYRGRRGRSRSRSPNRYYRPRGDRSYRRD 70
Query: 153 QQKNLEKKIKKMAKEERRRK 172
+++ + + + ER +
Sbjct: 71 DRRSGRNTKEPLTEAERDDR 90
>gnl|CDD|222911 PHA02616, PHA02616, VP2/VP3; Provisional.
Length = 259
Score = 30.8 bits (69), Expect = 2.1
Identities = 16/67 (23%), Positives = 27/67 (40%), Gaps = 2/67 (2%)
Query: 68 EEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRD 127
E + + + + + K + +K++ + S + KKRRG RS GR
Sbjct: 194 ELNKDIYKIPTQAVKRKQDELHPVSPTKKAALSKKSKWTGTKSSQSSKKRRG--RSTGRS 251
Query: 128 APVSRKR 134
V R R
Sbjct: 252 TTVRRNR 258
>gnl|CDD|109943 pfam00906, Hepatitis_core, Hepatitis core antigen. The core
antigen of hepatitis viruses possesses a carboxyl
terminus rich in arginine. On this basis it was
predicted that the core antigen would bind DNA. There is
some experimental evidence to support this.
Length = 182
Score = 30.2 bits (68), Expect = 2.1
Identities = 21/69 (30%), Positives = 28/69 (40%), Gaps = 12/69 (17%)
Query: 77 SFKTIIS---KYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRK 133
SF I Y N L+T+ + R S TP RR RS+S R
Sbjct: 120 SFGVWIRTPPAYRPPNAPILSTLPETIVVRRRGRSPRRRTPSPRRRRSQSPRR------- 172
Query: 134 RNRRAKTPS 142
RR+++PS
Sbjct: 173 --RRSQSPS 179
>gnl|CDD|232905 TIGR00284, TIGR00284, dihydropteroate synthase-related protein.
This protein has been found so far only in the Archaea,
and in particular in those archaea that lack a
bacterial-type dihydropteroate synthase. The central
region of this protein shows considerable homology to
the amino-terminal half of dihydropteroate synthases,
while the carboxyl-terminal region shows homology to the
small, uncharacterized protein slr0651 of Synechocystis
PCC6803 [Unknown function, General].
Length = 499
Score = 31.0 bits (70), Expect = 2.2
Identities = 15/62 (24%), Positives = 28/62 (45%), Gaps = 4/62 (6%)
Query: 142 SQEETDAKILQQQKNLEKKIKKMAKEE---RRRKIDLDKKPSDMDNVFASLDEMSSEEEE 198
S EE +++ + K LE+ K+ + E R + + KP + V A + +E+
Sbjct: 109 STEEPADEVVLEIKKLEEYTSKIEEREADFRIGSLKIPLKPPPL-RVVAEIPPTVAEDGI 167
Query: 199 EE 200
E
Sbjct: 168 EG 169
>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
splicing factor. These splicing factors consist of an
N-terminal arginine-rich low complexity domain followed
by three tandem RNA recognition motifs (pfam00076). The
well-characterized members of this family are auxiliary
components of the U2 small nuclear ribonucleoprotein
splicing factor (U2AF). These proteins are closely
related to the CC1-like subfamily of splicing factors
(TIGR01622). Members of this subfamily are found in
plants, metazoa and fungi.
Length = 509
Score = 31.0 bits (70), Expect = 2.2
Identities = 17/106 (16%), Positives = 37/106 (34%), Gaps = 17/106 (16%)
Query: 105 RAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDA--KILQQQKNLEKKIK 162
R+ + RR R RS D+ +R +++P + + + + + ++
Sbjct: 27 RSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRSLRYSSVRRSRDRPRRRSRSVR 86
Query: 163 KMAKEERR---------------RKIDLDKKPSDMDNVFASLDEMS 193
+ + RR ++ D KP + V A + S
Sbjct: 87 SIEQHRRRLRDRSPSNQWRKDDKKRSLWDIKPPGYELVTADQAKAS 132
>gnl|CDD|217502 pfam03343, SART-1, SART-1 family. SART-1 is a protein involved in
cell cycle arrest and pre-mRNA splicing. It has been
shown to be a component of U4/U6 x U5 tri-snRNP complex
in human, Schizosaccharomyces pombe and Saccharomyces
cerevisiae. SART-1 is a known tumour antigen in a range
of cancers recognised by T cells.
Length = 603
Score = 30.9 bits (70), Expect = 2.4
Identities = 34/153 (22%), Positives = 66/153 (43%), Gaps = 14/153 (9%)
Query: 55 KVEQGQATTSDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGS--- 111
+E + + ++DD + ++I+SKY D+ K+ SG
Sbjct: 182 NLELKKKKPDYDPDDDDKFNK---RSILSKY-----DEEIEGKKKKSDNLFTLDSGGSTD 233
Query: 112 DTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRR 171
D +K+R + K + VS + +TP+ + D + + K + K KK K++RR+
Sbjct: 234 DEAEKKRQEVKKKLKINNVSLDDDS-TETPASDYYDVSEMVKFK--KPKKKKKKKKKRRK 290
Query: 172 KIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDE 204
+D D+ + + + +S + EEE E
Sbjct: 291 DLDEDELEPEAEGLGSSDSGSRKDVEEENARLE 323
>gnl|CDD|221755 pfam12756, zf-C2H2_2, C2H2 type zinc-finger (2 copies). This
family contains two copies of a C2H2-like zinc finger
domain.
Length = 100
Score = 28.8 bits (65), Expect = 2.4
Identities = 17/78 (21%), Positives = 33/78 (42%), Gaps = 8/78 (10%)
Query: 281 CYHCGYYSKNRSTLKNHVRVEHGENQAKRKEKKICDICSAEVVHLAIHKKHSHSGQYHVC 340
C C + S H+ HG +R + + D+ +++ K H + + C
Sbjct: 2 CLFCNHTSDTVEENLEHMFKSHGFFIPER--EYLVDL--EGLLNYLREKIH----EGNEC 53
Query: 341 PHCGKKFTRKAELQLHIK 358
+CGK+F L+ H++
Sbjct: 54 LYCGKQFKSLEALRQHMR 71
Score = 28.4 bits (64), Expect = 3.2
Identities = 11/30 (36%), Positives = 18/30 (60%)
Query: 361 HLKHQLEKTYICEYCHKEFTFYNYLRRHMR 390
+L+ ++ + C YC K+F LR+HMR
Sbjct: 42 YLREKIHEGNECLYCGKQFKSLEALRQHMR 71
>gnl|CDD|234471 TIGR04104, cxxc_20_cxxc, cxxc_20_cxxc protein. This small,
uncommon, poorly conserved protein is found primarily in
the Firmicutes. It features a pair of CxxC motifs
separated by about 20 amino acids, followed by a highly
hydrophobic region of about 45 amino acids. It has no
conserved gene neighborhood, and its function is
unknown.
Length = 94
Score = 28.5 bits (64), Expect = 3.0
Identities = 10/35 (28%), Positives = 20/35 (57%), Gaps = 3/35 (8%)
Query: 371 ICEYCHKEFTFYNYLRRHMRVHTNEKPYKCKDCGA 405
IC+ C+++F++ L+ + +P KC +CG
Sbjct: 2 ICKNCNEKFSYKELLKSLFSL---YRPIKCPNCGT 33
>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
Length = 509
Score = 30.3 bits (69), Expect = 3.1
Identities = 19/136 (13%), Positives = 40/136 (29%), Gaps = 11/136 (8%)
Query: 73 TRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSR 132
+ S I +++ + K+T + +
Sbjct: 25 AKSKSKGFITK-------EEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATESDIPKK 77
Query: 133 KRNRRAKTPSQEETDAK----ILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFAS 188
K AK + + K L K EKK ++ D+D D+
Sbjct: 78 KTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQADDDDDDD 137
Query: 189 LDEMSSEEEEEEDWDE 204
D+ +++ ++D D+
Sbjct: 138 DDDDLDDDDIDDDDDD 153
Score = 30.3 bits (69), Expect = 3.1
Identities = 21/108 (19%), Positives = 35/108 (32%), Gaps = 5/108 (4%)
Query: 97 DKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKN 156
K+ K A+AA++ + KK + S +K D +L Q
Sbjct: 76 KKKTKTAAKAAAAKAPAKKKLKDELDSS---KKAEKKNALDKDDDLNYVKDIDVLNQAD- 131
Query: 157 LEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDE 204
+ + ID D D D D +EE++E +
Sbjct: 132 -DDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKEL 178
Score = 30.0 bits (68), Expect = 4.5
Identities = 25/178 (14%), Positives = 54/178 (30%), Gaps = 12/178 (6%)
Query: 34 KTETDDKSWEDKSLLEPE--IKIKVEQGQATTSD--ETEEDDNTRQTSFKTI-ISKYVGY 88
+ + + E+++ + + +G T + E E + I
Sbjct: 4 ASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMV 63
Query: 89 NIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEET-D 147
D T K+ + A+ + + + + + + + K+N K D
Sbjct: 64 KDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKD 123
Query: 148 AKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEI 205
+L Q + + ID D D D D ++E+E+ E
Sbjct: 124 IDVLNQAD--DDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDV----DDEDEEKKEA 175
>gnl|CDD|197903 smart00834, CxxC_CXXC_SSSS, Putative regulatory protein.
CxxC_CXXC_SSSS represents a region of about 41 amino
acids found in a number of small proteins in a wide
range of bacteria. The region usually begins with the
initiator Met and contains two CxxC motifs separated by
17 amino acids. One protein in this entry has been noted
as a putative regulatory protein, designated FmdB. Most
proteins in this entry have a C-terminal region
containing highly degenerate sequence.
Length = 41
Score = 26.7 bits (60), Expect = 3.3
Identities = 8/26 (30%), Positives = 15/26 (57%)
Query: 398 YKCKDCGAAFNHNVSLKNHKNSSCPK 423
Y+C+DCG F + + ++CP+
Sbjct: 6 YRCEDCGHTFEVLQKISDDPLTTCPE 31
>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62.
Length = 217
Score = 29.8 bits (67), Expect = 3.4
Identities = 14/81 (17%), Positives = 31/81 (38%), Gaps = 4/81 (4%)
Query: 99 RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLE 158
RAKR RA S K + + + +++ + + +K
Sbjct: 8 RAKRVVRALES----EKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAERVKKLHS 63
Query: 159 KKIKKMAKEERRRKIDLDKKP 179
++ K+ K+ +++K+ L P
Sbjct: 64 QEKKEEKKKPKKKKVPLQVNP 84
>gnl|CDD|237619 PRK14135, recX, recombination regulator RecX; Provisional.
Length = 263
Score = 29.8 bits (68), Expect = 3.7
Identities = 16/64 (25%), Positives = 35/64 (54%), Gaps = 2/64 (3%)
Query: 142 SQEETDAKILQQQKNLEKKIKKMAKEERRRKI--DLDKKPSDMDNVFASLDEMSSEEEEE 199
++E+ + + L KK +K+ + ++KI L K + + A+L+E+ E++EE
Sbjct: 150 TEEDQIEVAQKLAEKLLKKYQKLPFKALKQKIIQSLLTKGFSYEVIKAALEELDLEQDEE 209
Query: 200 EDWD 203
E+ +
Sbjct: 210 EEQE 213
>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain. This domain is
found at the N terminus of SMC proteins. The SMC
(structural maintenance of chromosomes) superfamily
proteins have ATP-binding domains at the N- and
C-termini, and two extended coiled-coil domains
separated by a hinge in the middle. The eukaryotic SMC
proteins form two kinds of heterodimers: the SMC1/SMC3
and the SMC2/SMC4 types. These heterodimers constitute
an essential part of higher order complexes, which are
involved in chromatin and DNA dynamics. This family also
includes the RecF and RecN proteins that are involved in
DNA metabolism and recombination.
Length = 1162
Score = 30.3 bits (68), Expect = 3.8
Identities = 26/187 (13%), Positives = 74/187 (39%), Gaps = 9/187 (4%)
Query: 22 LGVESSVPKLKIKTETDDKSWEDKSLLEPEIKIKVEQGQATTSDETEEDDNTRQTSFKTI 81
+++S+ +L + + + E + +I Q + ++ +++ + K
Sbjct: 663 SELKASLSELTKELLAEQELQEKAESELAKNEILRRQEEIKKKEQRIKEELKKLKLEKEE 722
Query: 82 ISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRK-RNRRAKT 140
+ D + + + ++ +SR K + + + + K
Sbjct: 723 L------LADKVQEAQDKINEELKLLEQKIKEKEEEEEKSRLKKEEEEEEKSELSLKEKE 776
Query: 141 PSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEE 200
++EE + L+ ++ E+K+K +E R + +L ++ ++ L E+ +EE
Sbjct: 777 LAEEEEKTEKLKVEEEKEEKLKAQEEELRALEEELKEEAELLEE--EQLLIEQEEKIKEE 834
Query: 201 DWDEIHL 207
+ +E+ L
Sbjct: 835 ELEELAL 841
>gnl|CDD|177301 PHA00733, PHA00733, hypothetical protein.
Length = 128
Score = 28.7 bits (64), Expect = 3.8
Identities = 14/43 (32%), Positives = 20/43 (46%), Gaps = 6/43 (13%)
Query: 338 HVCPHCGKKFTRKAELQLHIKGIHLKHQLEKTYICEYCHKEFT 380
+VCP C F+ L+ HI+ E + +C C KEF
Sbjct: 74 YVCPLCLMPFSSSVSLKQHIR------YTEHSKVCPVCGKEFR 110
Score = 28.7 bits (64), Expect = 4.4
Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 2/48 (4%)
Query: 369 TYICEYCHKEFTFYNYLRRHMRVHTNEKPYKCKDCGAAFNHNVSLKNH 416
Y+C C F+ L++H+R + K C CG F + S +H
Sbjct: 73 PYVCPLCLMPFSSSVSLKQHIRYTEHSK--VCPVCGKEFRNTDSTLDH 118
>gnl|CDD|218561 pfam05340, DUF740, Protein of unknown function (DUF740). This
family consists of several uncharacterized plant
proteins of unknown function.
Length = 565
Score = 30.0 bits (67), Expect = 3.9
Identities = 27/105 (25%), Positives = 41/105 (39%), Gaps = 10/105 (9%)
Query: 104 ARAASSGSDTPKKRRGRSRSKGRDAPVSR--KRNRRAKTPSQEETDAKILQQ--QKNLEK 159
++S G P+ RR +S S R+A S + RR+ T + ++NL
Sbjct: 60 KPSSSGGGFFPELRRTKSFSAKRNAGFSGADEPQRRSCDVRSRSTLWSLFHDDDEENLPS 119
Query: 160 KIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDE 204
I + R KP D V +E+ EE+EE E
Sbjct: 120 SIAPPEIDPEPR------KPIVPDLVLEEEEEVEMEEDEEYYEKE 158
>gnl|CDD|172341 PRK13808, PRK13808, adenylate kinase; Provisional.
Length = 333
Score = 29.9 bits (67), Expect = 4.1
Identities = 12/69 (17%), Positives = 24/69 (34%)
Query: 98 KRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNL 157
K AK +AA + K + + ++K+ + + ++
Sbjct: 262 KAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKATAKAPKRGAK 321
Query: 158 EKKIKKMAK 166
KK KK+ K
Sbjct: 322 GKKAKKVTK 330
Score = 29.5 bits (66), Expect = 5.6
Identities = 25/137 (18%), Positives = 53/137 (38%), Gaps = 23/137 (16%)
Query: 57 EQGQATTSDETEEDDNTRQTSFKTIISKYVGY--------NIDDLNTIDK---------- 98
+G+ +D+T E R S++ V Y +D + TID+
Sbjct: 133 ARGEEVRADDTPEVLAKRLASYRAQTEPLVHYYSEKRKLLTVDGMMTIDEVTREIGRVLA 192
Query: 99 --RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKR---NRRAKTPSQEETDAKILQQ 153
A +AA + + ++ +++K VS+K+ + + + +
Sbjct: 193 AVGAANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKA 252
Query: 154 QKNLEKKIKKMAKEERR 170
+K +K +KK AK ++
Sbjct: 253 KKTAKKALKKAAKAVKK 269
>gnl|CDD|223430 COG0353, RecR, Recombinational DNA repair protein (RecF pathway)
[DNA replication, recombination, and repair].
Length = 198
Score = 29.5 bits (67), Expect = 4.1
Identities = 16/43 (37%), Positives = 17/43 (39%), Gaps = 13/43 (30%)
Query: 310 KEKKICDICSAE--------VVH-----LAIHKKHSHSGQYHV 339
E CDICS E VV LA+ K G YHV
Sbjct: 64 TESDPCDICSDESRDKSQLCVVEEPKDVLALEKTGEFRGLYHV 106
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 30.3 bits (68), Expect = 4.2
Identities = 17/65 (26%), Positives = 36/65 (55%), Gaps = 1/65 (1%)
Query: 138 AKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLD-KKPSDMDNVFASLDEMSSEE 196
K ++EE + K +++K EK++KK+ ++ K L ++ SD NV ++ S +
Sbjct: 10 KKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKKSRKR 69
Query: 197 EEEED 201
+ E++
Sbjct: 70 DVEDE 74
>gnl|CDD|234616 PRK00076, recR, recombination protein RecR; Reviewed.
Length = 196
Score = 29.3 bits (67), Expect = 4.3
Identities = 14/42 (33%), Positives = 18/42 (42%), Gaps = 13/42 (30%)
Query: 311 EKKICDICSAE--------VVH-----LAIHKKHSHSGQYHV 339
E+ C+ICS VV LAI + + G YHV
Sbjct: 64 EQDPCEICSDPRRDQSLICVVESPADVLAIERTGEYRGLYHV 105
>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
Rpc31. RNA polymerase III contains seventeen subunits
in yeasts and in human cells. Twelve of these are akin
to RNA polymerase I or II and the other five are RNA pol
III-specific, and form the functionally distinct groups
(i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
Rpc34 and Rpc82 form a cluster of enzyme-specific
subunits that contribute to transcription initiation in
S. cerevisiae and H. sapiens. There is evidence that these
subunits are anchored at or near the N-terminal Zn-fold
of Rpc1, itself prolonged by a highly conserved but RNA
polymerase III-specific domain.
Length = 221
Score = 29.3 bits (66), Expect = 4.3
Identities = 20/86 (23%), Positives = 41/86 (47%), Gaps = 15/86 (17%)
Query: 119 GRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKK 178
G ++ G+ +S+ + + E + I ++ LEKK+K++ E+ + + D+
Sbjct: 121 GINKKAGKKLALSKFKRKV---GLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDE- 176
Query: 179 PSDMDNVFASLDEMSSEEEEEEDWDE 204
+E EEEE+ED+D+
Sbjct: 177 -----------EEEEEEEEEDEDFDD 191
>gnl|CDD|177753 PLN00149, PLN00149, potassium transporter; Provisional.
Length = 779
Score = 30.2 bits (68), Expect = 4.4
Identities = 24/98 (24%), Positives = 40/98 (40%), Gaps = 7/98 (7%)
Query: 54 IKVEQGQATTSDETEEDDNTRQTSFKTIISKYVGYNI--DDLNTIDKRAKRTARAASSGS 111
I+ E+ + + E EE ++ R T T + G + DD + + R S
Sbjct: 626 IRSEKPEPNGAPENEEGEDERMTVVGTCSTHLEGIQLREDDSDKQEPAGTSELREIRS-- 683
Query: 112 DTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAK 149
P R + R + P S K +R A+ QE +A+
Sbjct: 684 --PPVSRPKKRVRFV-VPESPKIDRGAREELQELMEAR 718
>gnl|CDD|218029 pfam04328, DUF466, Protein of unknown function (DUF466). Small
bacterial protein of unknown function. Structural
modelling suggests this domain may bind nucleic acids.
Length = 64
Score = 27.2 bits (61), Expect = 4.5
Identities = 6/18 (33%), Positives = 11/18 (61%)
Query: 426 SYETYLKHLKTNHHGYEV 443
YE Y++H++ +H V
Sbjct: 24 DYEKYVEHMRRHHPDKPV 41
>gnl|CDD|226202 COG3677, COG3677, Transposase and inactivated derivatives [DNA
replication, recombination, and repair].
Length = 129
Score = 28.5 bits (64), Expect = 4.7
Identities = 10/43 (23%), Positives = 14/43 (32%), Gaps = 2/43 (4%)
Query: 308 KRKEKKICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKFTRK 350
+ K C C + V Q + C CG FT +
Sbjct: 26 MQITKVNCPRCKSSNVV--KIGGIRRGHQRYKCKSCGSTFTVE 66
>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
Members of this family are bacterial proteins with a
conserved motif [KR]FYDLN, sometimes flanked by a pair
of CXXC motifs, followed by a long region of low
complexity sequence in which roughly half the residues
are Asp and Glu, including multiple runs of five or more
acidic residues. The function of members of this family
is unknown.
Length = 104
Score = 28.0 bits (63), Expect = 4.8
Identities = 7/13 (53%), Positives = 8/13 (61%)
Query: 335 GQYHVCPHCGKKF 347
G CP CGK+F
Sbjct: 7 GTKRTCPTCGKRF 19
>gnl|CDD|227812 COG5525, COG5525, Phage terminase, large subunit GpA [Replication,
recombination and repair].
Length = 611
Score = 29.7 bits (67), Expect = 5.2
Identities = 11/48 (22%), Positives = 20/48 (41%), Gaps = 1/48 (2%)
Query: 332 SHSGQYHV-CPHCGKKFTRKAELQLHIKGIHLKHQLEKTYICEYCHKE 378
+++V CPHCG++ K + +G+ CE+C
Sbjct: 221 GDQRRFYVPCPHCGEEQQLKFGEKSGPRGLKDTPAEAAFIQCEHCGCV 268
>gnl|CDD|233208 TIGR00956, 3a01205, Pleiotropic Drug Resistance (PDR) Family
protein. [Transport and binding proteins, Other].
Length = 1394
Score = 29.7 bits (67), Expect = 5.2
Identities = 14/79 (17%), Positives = 28/79 (35%), Gaps = 12/79 (15%)
Query: 150 ILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEIHLGQ 209
IL ++ K+ KK + K D++ V S D ++ ++ D
Sbjct: 702 ILVFRRGSLKRAKKAGETSASNKNDIEAGE-----VLGSTDLTDESDDVNDEKDM----- 751
Query: 210 FSPYEFKCRVCDWKLNSYD 228
E + W+ +Y+
Sbjct: 752 --EKESGEDIFHWRNLTYE 768
>gnl|CDD|232888 TIGR00233, trpS, tryptophanyl-tRNA synthetase. This model
represents tryptophanyl-tRNA synthetase. Some members of
the family have a pfam00458 domain amino-terminal to the
region described by this model [Protein synthesis, tRNA
aminoacylation].
Length = 327
Score = 29.6 bits (67), Expect = 5.3
Identities = 11/66 (16%), Positives = 30/66 (45%), Gaps = 4/66 (6%)
Query: 155 KNLEKKIKKMAKEERRRKIDLDKKP---SDMDNVF-ASLDEMSSEEEEEEDWDEIHLGQF 210
K ++KKI+K A + R + ++ ++ ++ + +++ +E ++ G+
Sbjct: 208 KQIKKKIRKAATDGGRVTLFEHREKPGVPNLLVIYQYLSFFLIDDDKLKEIYEAYKSGKL 267
Query: 211 SPYEFK 216
E K
Sbjct: 268 GYGECK 273
>gnl|CDD|192632 pfam10571, UPF0547, Uncharacterized protein family UPF0547. This
domain contains a zinc-ribbon motif.
Length = 26
Score = 25.6 bits (57), Expect = 5.9
Identities = 12/35 (34%), Positives = 14/35 (40%), Gaps = 11/35 (31%)
Query: 313 KICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKF 347
K C C AEV +CPHCG +F
Sbjct: 1 KTCPECGAEVPL-----------AAKICPHCGYEF 24
>gnl|CDD|221185 pfam11719, Drc1-Sld2, DNA replication and checkpoint protein.
Genome duplication is precisely regulated by
cyclin-dependent kinases (CDKs), which bring about the
onset of S phase by activating replication origins and
then prevent relicensing of origins until mitosis is
completed. The optimum sequence motif for CDK
phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found
to have at least 11 potential phosphorylation sites.
Drc1 is required for DNA synthesis and S-M replication
checkpoint control. Drc1 associates with Cdc2 and is
phosphorylated at the onset of S phase when Cdc2 is
activated. Thus Cdc2 promotes DNA replication by
phosphorylating Drc1 and regulating its association with
Cut5. Sld2 and Sld3 represent the minimal set of S-CDK
substrates required for DNA replication.
Length = 397
Score = 29.4 bits (66), Expect = 5.9
Identities = 27/145 (18%), Positives = 47/145 (32%), Gaps = 29/145 (20%)
Query: 63 TSDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKK-RRGRS 121
DE + R S +T +S +D T S+TP RR
Sbjct: 154 AEDEDRPEYGPR--SERTPLSSGKKVMLDLFFTPTSW--------RYSSETPSFLRRSNQ 203
Query: 122 RSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSD 181
P++ +PS L+ Q+ + K + ++ +EE
Sbjct: 204 DVSATSNPLNSAEPDFGVSPSP-------LRPQRPVGKGLSELVQEEES----------- 245
Query: 182 MDNVFASLDEMSSEEEEEEDWDEIH 206
+D+ L E+ +EE +E
Sbjct: 246 IDDELDVLREIEAEEAGIGPIEEEV 270
>gnl|CDD|202114 pfam02114, Phosducin, Phosducin.
Length = 245
Score = 29.3 bits (65), Expect = 5.9
Identities = 7/75 (9%), Positives = 27/75 (36%), Gaps = 7/75 (9%)
Query: 106 AASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMA 165
+ + G++ G ++ R + ++ + + ++ +++M+
Sbjct: 2 EKAKSQSLEEDFEGQASHTGPKGVINDWRKFKLESEDSDS-------VAHSKKEILRQMS 54
Query: 166 KEERRRKIDLDKKPS 180
+ R D ++ S
Sbjct: 55 SPQSRDDKDSKERFS 69
>gnl|CDD|206083 pfam13912, zf-C2H2_6, C2H2-type zinc finger.
Length = 27
Score = 25.6 bits (57), Expect = 6.5
Identities = 7/24 (29%), Positives = 10/24 (41%)
Query: 369 TYICEYCHKEFTFYNYLRRHMRVH 392
+ C C K F+ L H + H
Sbjct: 1 VHTCGVCGKTFSSLQALGGHKKSH 24
>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
subunit (TFIIF-alpha). Transcription initiation factor
IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
II-associating protein 74 (RAP74) is the large subunit
of transcription factor IIF (TFIIF), which is essential
for accurate initiation and stimulates elongation by RNA
polymerase II.
Length = 528
Score = 29.5 bits (66), Expect = 6.7
Identities = 36/150 (24%), Positives = 53/150 (35%), Gaps = 19/150 (12%)
Query: 43 EDKSLLEPEIKIKVEQGQATTSDETEEDDNTRQTSF-------KTIISKYVGYNIDD-LN 94
E + L PEI K E Q S+E+EE+ N + K + K G + DD +
Sbjct: 299 EREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDS 358
Query: 95 TIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQ 154
D S T KK++ + + D+ S N PS E D
Sbjct: 359 GDDSDDSDIDGEDSVSLVTAKKQKEPKKEEPVDSNPSSPGNSGPARPSPESKD------- 411
Query: 155 KNLEKKIKKMAKEERRRKIDLDKKPSDMDN 184
K +K A E + + K +N
Sbjct: 412 ----KGKRKAANEVSKSPASVPAKKLKTEN 437
>gnl|CDD|215521 PLN02967, PLN02967, kinase.
Length = 581
Score = 29.2 bits (65), Expect = 7.4
Identities = 23/103 (22%), Positives = 42/103 (40%), Gaps = 20/103 (19%)
Query: 108 SSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKE 167
++ KK R+R R+A S + + K +++ +K+KKM ++
Sbjct: 104 AALDKESKKTPRRTR-------------RKAAAASSDVEEEKT-EKKVRKRRKVKKMDED 149
Query: 168 ERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWD-EIHLGQ 209
+ S++ +V S S E E EE+ D E G+
Sbjct: 150 V-----EDQGSESEVSDVEESEFVTSLENESEEELDLEKDDGE 187
>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein. The proteins in this family
are designated YL1. These proteins have been shown to be
DNA-binding and may be a transcription factor.
Length = 238
Score = 28.9 bits (65), Expect = 7.7
Identities = 18/85 (21%), Positives = 37/85 (43%), Gaps = 5/85 (5%)
Query: 98 KRAKRTARAASSGSD---TPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQ 154
K+ K+ AA S PKK+ R R+++ R+ T + +A + +
Sbjct: 101 KKKKKDPTAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSST--VQNKEATHERLK 158
Query: 155 KNLEKKIKKMAKEERRRKIDLDKKP 179
+ ++ K AK +R++ +K+
Sbjct: 159 EREIRRKKIQAKARKRKEKKKEKEL 183
>gnl|CDD|219715 pfam08070, DTHCT, DTHCT (NUC029) region. The DTHCT region is the
C-terminal part of DNA gyrases B / topoisomerase IV /
HATPase proteins. This region is composed of quite low
complexity sequence.
Length = 95
Score = 27.2 bits (60), Expect = 7.8
Identities = 23/95 (24%), Positives = 35/95 (36%), Gaps = 19/95 (20%)
Query: 98 KRAKRTARAASSGSDT--PKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
K KR + S S+ PKK + + + + PS +
Sbjct: 5 KGKKRETVNSDSDSEAGVPKK-----------PAPPKGKGSKKRKPSSSDES------DS 47
Query: 156 NLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLD 190
N KK+ K A ++ +K D D PSD D+ A
Sbjct: 48 NFGKKVSKSATSKKSKKGDDDDFPSDFDSAVAPRA 82
>gnl|CDD|177288 PHA00616, PHA00616, hypothetical protein.
Length = 44
Score = 25.9 bits (57), Expect = 8.0
Identities = 9/28 (32%), Positives = 16/28 (57%)
Query: 502 QCPHCPKTFPRKTELSNHIKGIHMKHEL 529
QC C F +K E+ H+ +H +++L
Sbjct: 3 QCLRCGGIFRKKKEVIEHLLSVHKQNKL 30
>gnl|CDD|179886 PRK04860, PRK04860, hypothetical protein; Provisional.
Length = 160
Score = 28.3 bits (64), Expect = 8.3
Identities = 10/21 (47%), Positives = 13/21 (61%)
Query: 385 LRRHMRVHTNEKPYKCKDCGA 405
+RRH RV E Y+C+ CG
Sbjct: 131 VRRHNRVVRGEAVYRCRRCGE 151
>gnl|CDD|227516 COG5189, SFP1, Putative transcriptional repressor regulating G2/M
transition [Transcription / Cell division and chromosome
partitioning].
Length = 423
Score = 28.9 bits (64), Expect = 8.3
Identities = 24/96 (25%), Positives = 38/96 (39%), Gaps = 21/96 (21%)
Query: 346 KFTRKAELQLHIKGIHLKHQLEKTYICEY--CHKEFTFYNYLRRHM-------RVHTN-- 394
K E + LK + K Y C C+K++ N L+ HM ++H N
Sbjct: 326 KLAHGGERNIDTPSRMLKVKDGKPYKCPVEGCNKKYKNQNGLKYHMLHGHQNQKLHENPS 385
Query: 395 ----------EKPYKCKDCGAAFNHNVSLKNHKNSS 420
+KPY+C+ C + + LK H+ S
Sbjct: 386 PEKMNIFSAKDKPYRCEVCDKRYKNLNGLKYHRKHS 421
>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein. TolA couples the inner
membrane complex of itself with TolQ and TolR to the
outer membrane complex of TolB and OprL (also called
Pal). Most of the length of the protein consists of
low-complexity sequence that may differ in both length
and composition from one species to another,
complicating efforts to discriminate TolA (the most
divergent gene in the tol-pal system) from paralogs such
as TonB. Selection of members of the seed alignment and
criteria for setting scoring cutoffs are based largely on
conserved operon structure. The Tol-Pal complex is
required for maintaining outer membrane integrity. Also
involved in transport (uptake) of colicins and
filamentous DNA, and implicated in pathogenesis.
Transport is energized by the proton motive force. TolA
is an inner membrane protein that interacts with
periplasmic TolB and with outer membrane porins ompC,
phoE and lamB [Transport and binding proteins, Other,
Cellular processes, Pathogenesis].
Length = 346
Score = 29.0 bits (65), Expect = 8.4
Identities = 14/84 (16%), Positives = 33/84 (39%), Gaps = 6/84 (7%)
Query: 96 IDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
RAA + R+ + +++ + AK +++ A+ + ++
Sbjct: 75 QQAEEAEKQRAAE------QARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKAKQ 128
Query: 156 NLEKKIKKMAKEERRRKIDLDKKP 179
E K K A+ E++ K + K+
Sbjct: 129 AAEAKAKAEAEAEKKAKEEAKKQA 152
>gnl|CDD|149172 pfam07948, Nairovirus_M, Nairovirus M polyprotein-like. The
sequences in this family are similar to the Dugbe virus
M polyprotein precursor, which includes glycoproteins G1
and G2. Both are thought to be inserted in the membrane
of the Golgi complex of the infected host cell, and G1
is known to have a role in infection of vertebrate
hosts.
Length = 645
Score = 29.0 bits (65), Expect = 8.5
Identities = 9/36 (25%), Positives = 17/36 (47%)
Query: 313 KICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKFT 348
+ C C V+ + H + Y++CP+C + T
Sbjct: 495 QTCTKCEQTPVNAIDAEMHDLNCSYNICPYCASRLT 530
>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
Length = 651
Score = 29.2 bits (66), Expect = 8.6
Identities = 14/67 (20%), Positives = 32/67 (47%), Gaps = 8/67 (11%)
Query: 129 PVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDM----DN 184
++ +R K + + K L++ K E+K KK ++ + KI P++ ++
Sbjct: 547 LDDKEELQREKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAKIP----PAEFFKRQED 602
Query: 185 VFASLDE 191
+++ DE
Sbjct: 603 KYSAFDE 609
>gnl|CDD|215056 PLN00104, PLN00104, MYST-like histone acetyltransferase;
Provisional.
Length = 450
Score = 29.0 bits (65), Expect = 9.0
Identities = 12/34 (35%), Positives = 18/34 (52%)
Query: 331 HSHSGQYHVCPHCGKKFTRKAELQLHIKGIHLKH 364
++ + + C C K RK +LQ H+K LKH
Sbjct: 192 YNDCSKLYFCEFCLKFMKRKEQLQRHMKKCDLKH 225
>gnl|CDD|153328 cd07644, I-BAR_IMD_BAIAP2L2, Inverse (I)-BAR, also known as the
IRSp53/MIM homology Domain (IMD), of Brain-specific
Angiogenesis Inhibitor 1-Associated Protein 2-Like 2.
The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs
(I-BAR) domain, is a dimerization and lipid-binding
module that bends membranes and induces membrane
protrusions. This group is composed of uncharacterized
proteins known as BAIAP2L2 (Brain-specific Angiogenesis
Inhibitor 1-Associated Protein 2-Like 2). They contain
an N-terminal IMD, an SH3 domain, and a WASP homology 2
(WH2) actin-binding motif at the C-terminus. The related
proteins, BAIAP2L1 and IRSp53, function as regulators of
membrane dynamics and the actin cytoskeleton. The IMD
domain binds and bundles actin filaments, binds
membranes and produces membrane protrusions, and
interacts with the small GTPase Rac.
Length = 215
Score = 28.3 bits (63), Expect = 9.4
Identities = 11/65 (16%), Positives = 31/65 (47%)
Query: 140 TPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEE 199
S+ + + + NLEK + ++ + ER+R ++ + +++ + S+ E +
Sbjct: 111 EDSRRVYELEYRHRAANLEKCMSELWRMERQRDRNVREMKENVNRLRQSMQAFLKESQRA 170
Query: 200 EDWDE 204
+ +E
Sbjct: 171 AELEE 175
>gnl|CDD|227579 COG5254, ARV1, Predicted membrane protein [Function unknown].
Length = 239
Score = 28.3 bits (63), Expect = 9.7
Identities = 15/49 (30%), Positives = 20/49 (40%), Gaps = 1/49 (2%)
Query: 314 ICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKFTRKAELQLHIKGIHL 362
+C C + V L S Q CP C +K + EL +K I L
Sbjct: 2 VCIECGSRVDSLYTRYSTSAI-QLSRCPSCNRKMDKYFELDGVLKLIDL 49
>gnl|CDD|218790 pfam05876, Terminase_GpA, Phage terminase large subunit (GpA).
This family consists of several phage terminase large
subunit proteins as well as related sequences from
several bacterial species. The DNA packaging enzyme of
bacteriophage lambda, terminase, is a heteromultimer
composed of a small subunit, gpNu1, and a large subunit,
gpA, products of the Nu1 and A genes, respectively.
Terminase is involved in the site-specific binding and
cutting of the DNA in the initial stages of packaging.
It is now known that gpA is actively involved in late
stages of packaging, including DNA translocation, and
that this enzyme contains separate functional domains
for its early and late packaging activities.
Length = 552
Score = 28.7 bits (65), Expect = 9.8
Identities = 11/48 (22%), Positives = 19/48 (39%), Gaps = 10/48 (20%)
Query: 337 YHV-CPHCGKKFTRKAELQLHIKGIHLKHQLEKT---YICEYCHKEFT 380
Y+V CPHCG++ +L + + Y+C +C
Sbjct: 199 YYVPCPHCGEEQ------ELRWERLKWDKGEAPETARYVCPHCGCVIE 240
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.318 0.132 0.412
Gapped
Lambda K H
0.267 0.0804 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 29,500,922
Number of extensions: 2774914
Number of successful extensions: 4674
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4531
Number of HSP's successfully gapped: 246
Length of query: 606
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 503
Effective length of database: 6,369,140
Effective search space: 3203677420
Effective search space used: 3203677420
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 62 (27.5 bits)
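
The footer statistics above can be cross-checked against one another. The sketch below is not part of the RPS-BLAST output; it assumes the usual Karlin-Altschul normalisation, bits = (lambda * raw - ln K) / ln 2 for scores and bits = lambda * raw / ln 2 for drop-off values, and it assumes that X1 and S1 use the ungapped parameters while X2, X3 and S2 use the gapped ones, which is inferred only from the fact that the numbers then agree with the report. All function and constant names are illustrative. Expect values are not recomputed: in this report the same bit score (30.3) appears with several different Expect values, so they evidently depend on more than the footer constants.

```python
import math

# Values copied from the statistics footer of this report.
QUERY_LEN = 606
DB_LETTERS = 10_937_602
DB_SEQS = 44_354
LENGTH_ADJUSTMENT = 103

# (lambda, K) pairs from the footer. Which pair applies to which
# threshold is an assumption, chosen because it reproduces the report.
UNGAPPED = (0.318, 0.132)
GAPPED = (0.267, 0.0804)


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space)."""
    eff_query = query_len - adjustment
    # The length adjustment is subtracted once per database sequence.
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw, lam, k):
    """Convert a raw score to bits: (lambda * raw - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def dropoff_bits(raw, lam):
    """Convert a raw X drop-off value to bits: lambda * raw / ln 2."""
    return lam * raw / math.log(2)


eff_query, eff_db, space = effective_search_space(
    QUERY_LEN, DB_LETTERS, DB_SEQS, LENGTH_ADJUSTMENT
)
print("effective query length:", eff_query)
print("effective database length:", eff_db)
print("effective search space:", space)

print("X1 bits:", round(dropoff_bits(16, UNGAPPED[0]), 1))
print("X2 bits:", round(dropoff_bits(38, GAPPED[0]), 1))
print("X3 bits:", round(dropoff_bits(64, GAPPED[0]), 1))
print("S1 bits:", round(bit_score(41, *UNGAPPED), 1))
print("S2 bits:", round(bit_score(62, *GAPPED), 1))
```

Under these assumptions the computed lengths and search space match the footer exactly, and the bit values round to the figures shown in parentheses on the X1 to S2 lines.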