RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy5825
(596 letters)
>gnl|CDD|187547 cd05236, FAR-N_SDR_e, fatty acyl CoA reductases (FARs), extended
(e) SDRs. SDRs are Rossmann-fold NAD(P)H-binding
proteins, many of which may function as fatty acyl CoA
reductases (FAR), acting on medium and long chain fatty
acids, and have been reported to be involved in diverse
processes such as biosynthesis of insect pheromones,
plant cuticular wax production, and mammalian wax
biosynthesis. In Arabidopsis thaliana, proteins with
this particular architecture have also been identified
as the MALE STERILITY 2 (MS2) gene product, which is
implicated in male gametogenesis. Mutations in MS2
inhibit the synthesis of exine (sporopollenin),
rendering plants unable to reduce pollen wall fatty
acids to corresponding alcohols. This N-terminal domain
shares the catalytic triad (but not the upstream Asn)
and characteristic NADP-binding motif of the extended
SDR family. Extended SDRs are distinct from classical
SDRs. In addition to the Rossmann fold (alpha/beta
folding pattern with a central beta-sheet) core region
typical of all SDRs, extended SDRs have a less conserved
C-terminal extension of approximately 100 amino acids.
Extended SDRs are a diverse collection of proteins, and
include isomerases, epimerases, oxidoreductases, and
lyases; they typically have a TGXXGXXG cofactor binding
motif. SDRs are a functionally diverse family of
oxidoreductases that have a single domain with a
structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 320
Score = 278 bits (712), Expect = 4e-89
Identities = 113/297 (38%), Positives = 167/297 (56%), Gaps = 16/297 (5%)
Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGI 293
P++G IY+L+R K ++ +ERL E KD+LFDR +N K+ I GD+S+P+LG+
Sbjct: 25 PDIGKIYLLIRGKSGQSAEERLRELLKDKLFDRGRNLNPLFE-SKIVPIEGDLSEPNLGL 83
Query: 294 SSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVST 353
S D Q + +++IIH AA++ FDE + +A ++N+ T LL+LA RC +LKA +HVST
Sbjct: 84 SDEDLQTLIEEVNIIIHCAATVTFDERLDEALSINVLGTLRLLELAKRCKKLKAFVHVST 143
Query: 354 LYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGE 413
Y + R+ I+E+ YPP E L +++ + ELE + L GG + N+Y+FTKA+ E
Sbjct: 144 AYVNGDRQLIEEKVYPPPADPEKLIDILELMDDLELERATPKLLGG-HPNTYTFTKALAE 202
Query: 414 SVVEKYLYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAALGLIHTFYAKHDKK 473
+V K LPL +VRPSIV +T KEP GW +N GP G G++ T A +
Sbjct: 203 RLVLKERGNLPLVIVRPSIVGATLKEPFPGWIDNFNGPDGLFLAYGKGILRTMNADPNAV 262
Query: 474 CDLIPVDVATNMMLGVVWKTALDHGHVAPPASLVAPIPRTDPPVYNLSISSSYPITW 530
D+IPVDV N +L + + + VY+ S P TW
Sbjct: 263 ADIIPVDVVANALLAAAAYSGVR--------------KPRELEVYHCGSSDVNPFTW 305
>gnl|CDD|219687 pfam07993, NAD_binding_4, Male sterility protein. This family
represents the C-terminal region of the male sterility
protein in a number of Arabidopsis and Drosophila species. A
sequence-related jojoba acyl CoA reductase is also
included.
Length = 245
Score = 190 bits (486), Expect = 9e-57
Identities = 80/250 (32%), Positives = 120/250 (48%), Gaps = 31/250 (12%)
Query: 239 IYILLRSKKNKTVQERLAEQF-KDELFDRLKNEQADILQRKVHIISGDISQPSLGISSHD 297
IY L+R+K ++ ERL ++ K LFDRLK ++ ++GD+S+P+LG+S D
Sbjct: 25 IYCLVRAKDGESALERLRQELLKYGLFDRLK------ALERIIPVAGDLSEPNLGLSDED 78
Query: 298 QQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYTH 357
Q + + VIIH AA++ F E D N+ TRE+L LA + + HVST Y +
Sbjct: 79 FQELAEEVDVIIHNAATVNFVEPYSDLRATNVLGTREVLRLAKQMKK-LPFHHVSTAYVN 137
Query: 358 SYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVE 417
R + EE + E ++L G N Y+ +K + E +V
Sbjct: 138 GERGGLLEE-----------------KPYKLDEDEPALLGG--LPNGYTQSKWLAEQLVR 178
Query: 418 KYLYKLPLAMVRPSIVVSTWKEPIVGWSNNLY-GPGGAAAGAALGLIHTFYAKHDKKCDL 476
+ LP+ + RPSI+ E GW N GP G GA LG++ D + DL
Sbjct: 179 EAAGGLPVVIYRPSIITG---ESRTGWINGDDFGPRGLLGGAGLGVLPDILGDPDARLDL 235
Query: 477 IPVDVATNMM 486
+PVD N +
Sbjct: 236 VPVDYVANAI 245
>gnl|CDD|215538 PLN02996, PLN02996, fatty acyl-CoA reductase.
Length = 491
Score = 130 bits (329), Expect = 2e-32
Identities = 83/289 (28%), Positives = 133/289 (46%), Gaps = 41/289 (14%)
Query: 234 PEVGGIYILLRSKKNKTVQERL-AEQFKDELFDRLKNEQAD----ILQRKVHIISGDISQ 288
P V +Y+LLR+ K+ +RL E +LF L+ + + ++ KV + GDIS
Sbjct: 36 PNVKKLYLLLRASDAKSATQRLHDEVIGKDLFKVLREKLGENLNSLISEKVTPVPGDISY 95
Query: 289 PSLGIS-SHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKA 347
LG+ S+ ++ + I ++++ AA+ FDE A +N +L+ A +C ++K
Sbjct: 96 DDLGVKDSNLREEMWKEIDIVVNLAATTNFDERYDVALGINTLGALNVLNFAKKCVKVKM 155
Query: 348 ILHVSTLYTHSYRE--------------------DIQEEFYPPLFSYEDLAHVMQTTNQE 387
+LHVST Y + DI EE E L + +
Sbjct: 156 LLHVSTAYVCGEKSGLILEKPFHMGETLNGNRKLDINEE---KKLVKEKLKE-LNEQDAS 211
Query: 388 ELEILSSM---------LFGGIYNNSYSFTKAIGESVVEKYLYKLPLAMVRPSIVVSTWK 438
E EI +M L G + N+Y FTKA+GE ++ + LPL ++RP+++ ST+K
Sbjct: 212 EEEITQAMKDLGMERAKLHG--WPNTYVFTKAMGEMLLGNFKENLPLVIIRPTMITSTYK 269
Query: 439 EPIVGWSNNLYGPGGAAAGAALGLIHTFYAKHDKKCDLIPVDVATNMML 487
EP GW L G G + F A + D+IP D+ N M+
Sbjct: 270 EPFPGWIEGLRTIDSVIVGYGKGKLTCFLADPNSVLDVIPADMVVNAMI 318
>gnl|CDD|215279 PLN02503, PLN02503, fatty acyl-CoA reductase 2.
Length = 605
Score = 109 bits (273), Expect = 5e-25
Identities = 89/315 (28%), Positives = 143/315 (45%), Gaps = 59/315 (18%)
Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKD-ELFDRL-----KNEQADILQRKVHIISGDIS 287
P+VG IY+L+++K + ERL + D ELF L K+ Q+ +L + V ++ G++
Sbjct: 144 PDVGKIYLLIKAKDKEAAIERLKNEVIDAELFKCLQETHGKSYQSFMLSKLVPVV-GNVC 202
Query: 288 QPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKA 347
+ +LG+ I + VII++AA+ FDE A +N + L+ A +C +LK
Sbjct: 203 ESNLGLEPDLADEIAKEVDVIINSAANTTFDERYDVAIDINTRGPCHLMSFAKKCKKLKL 262
Query: 348 ILHVSTLYTHSYRE-------------------------------DIQEEFYPPLFSYE- 375
L VST Y + R+ DI+ E L S
Sbjct: 263 FLQVSTAYVNGQRQGRIMEKPFRMGDCIARELGISNSLPHNRPALDIEAEIKLALDSKRH 322
Query: 376 -----DLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEKYLYKLPLAMVRP 430
A M+ +L + + L+G + ++Y FTKA+GE V+ +P+ ++RP
Sbjct: 323 GFQSNSFAQKMK-----DLGLERAKLYG--WQDTYVFTKAMGEMVINSMRGDIPVVIIRP 375
Query: 431 SIVVSTWKEPIVGW--SNNLYGPGGAAAGAALGLIHTFYAKHDKKCDLIPVDVATNMMLG 488
S++ STWK+P GW N + P G G + F A + D++P D+ N L
Sbjct: 376 SVIESTWKDPFPGWMEGNRMMDPIVLYYGK--GQLTGFLADPNGVLDVVPADMVVNATLA 433
Query: 489 VVWKTALDHGHVAPP 503
+ K HG A P
Sbjct: 434 AMAK----HGGAAKP 444
>gnl|CDD|187573 cd05263, MupV_like_SDR_e, Pseudomonas fluorescens MupV-like,
extended (e) SDRs. This subgroup of extended SDR family
domains has the characteristic active site tetrad and a
well-conserved NAD(P)-binding motif. This subgroup is
not well characterized; its members are annotated as
having a variety of putative functions. One
characterized member is Pseudomonas fluorescens MupV, a
protein involved in the biosynthesis of mupirocin, a
polyketide-derived antibiotic. Extended SDRs are
distinct from classical SDRs. In addition to the
Rossmann fold (alpha/beta folding pattern with a central
beta-sheet) core region typical of all SDRs, extended
SDRs have a less conserved C-terminal extension of
approximately 100 amino acids. Extended SDRs are a
diverse collection of proteins, and include isomerases,
epimerases, oxidoreductases, and lyases; they typically
have a TGXXGXXG cofactor binding motif. SDRs are a
functionally diverse family of oxidoreductases that have
a single domain with a structurally conserved Rossmann
fold, an NAD(P)(H)-binding region, and a structurally
diverse C-terminal region. Sequence identity between
different SDR enzymes is typically in the 15-30% range;
they catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 293
Score = 98.2 bits (245), Expect = 1e-22
Identities = 61/250 (24%), Positives = 99/250 (39%), Gaps = 45/250 (18%)
Query: 239 IYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGISSHDQ 298
+ +L+RS+ ER+ E+A + +V ++ GD++QP+LG+S+
Sbjct: 25 VLVLVRSESLGEAHERI--------------EEAGLEADRVRVLEGDLTQPNLGLSAAAS 70
Query: 299 QFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYTHS 358
+ + + +IH AAS F +DA+ NI T +L+LA R + +VST Y
Sbjct: 71 RELAGKVDHVIHCAASYDFQAPNEDAWRTNIDGTEHVLELAARLDI-QRFHYVSTAYVAG 129
Query: 359 YREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEK 418
RE E + N Y +KA E +V
Sbjct: 130 NREGNIRE----------TELNPGQN----------------FKNPYEQSKAEAEQLVRA 163
Query: 419 YLYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAA-LGLIHTFYAKHDKKCDLI 477
++PL + RPSIVV K G + G A LG + +L+
Sbjct: 164 AATQIPLTVYRPSIVVGDSK---TGRIEKIDGLYELLNLLAKLGRWLPMPGNKGARLNLV 220
Query: 478 PVDVATNMML 487
PVD + ++
Sbjct: 221 PVDYVADAIV 230
>gnl|CDD|187546 cd05235, SDR_e1, extended (e) SDRs, subgroup 1. This family
consists of an SDR module of multidomain proteins
identified as putative polyketide synthases, fatty acid
synthases (FAS), and nonribosomal peptide synthases,
among others. However, unlike the usual ketoreductase
modules of FAS and polyketide synthase, these domains
are related to the extended SDRs, and have canonical
NAD(P)-binding motifs and an active site tetrad.
Extended SDRs are distinct from classical SDRs. In
addition to the Rossmann fold (alpha/beta folding
pattern with a central beta-sheet) core region typical
of all SDRs, extended SDRs have a less conserved
C-terminal extension of approximately 100 amino acids.
Extended SDRs are a diverse collection of proteins, and
include isomerases, epimerases, oxidoreductases, and
lyases; they typically have a TGXXGXXG cofactor binding
motif. SDRs are a functionally diverse family of
oxidoreductases that have a single domain with a
structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 290
Score = 71.9 bits (177), Expect = 7e-14
Identities = 56/208 (26%), Positives = 90/208 (43%), Gaps = 37/208 (17%)
Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGI 293
V IY L+R+K + ERL + K E L +E ++ ++ GD+S+P+LG+
Sbjct: 23 KNVSKIYCLVRAKDEEAALERLIDNLK-EYGLNLWDELE---LSRIKVVVGDLSKPNLGL 78
Query: 294 SSHDQQFIQHHIHVIIHAAASL----RFDELIQDAFTLNIQATRELLDLATRCSQLKAIL 349
S D Q + + VIIH A++ ++EL N+ T+ELL LA +LK +
Sbjct: 79 SDDDYQELAEEVDVIIHNGANVNWVYPYEEL----KPANVLGTKELLKLAAT-GKLKPLH 133
Query: 350 HVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIY-NNSYSFT 408
VSTL S + N + E ML N Y +
Sbjct: 134 FVSTLSVFS----------------------AEEYNALDDEESDDMLESQNGLPNGYIQS 171
Query: 409 KAIGESVVEKYL-YKLPLAMVRPSIVVS 435
K + E ++ + LP+A++RP +
Sbjct: 172 KWVAEKLLREAANRGLPVAIIRPGNIFG 199
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 70.3 bits (172), Expect = 1e-12
Identities = 42/210 (20%), Positives = 87/210 (41%), Gaps = 7/210 (3%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
++K A+ S + +K+ K + +++EKEKE+ +EK K KE+
Sbjct: 68 ESKLSSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEE 127
Query: 80 EKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS-HKHKDKDRERDKDEKKEQKESK 138
KD+ +E + K EKEK+KEKK ++ + + K ++R R K K+ + K
Sbjct: 128 PKDRKPKEEAKEKRPPK-----EKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKK 182
Query: 139 SSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHK 198
+K K+ + + + P ++ + K++E +S ++ +
Sbjct: 183 PPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPM-EEDESRQ 241
Query: 199 HKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
+ + + K + S + S+
Sbjct: 242 SSEISRRSSSSLKKPDPSPSMASPETRESS 271
Score = 61.1 bits (148), Expect = 8e-10
Identities = 37/144 (25%), Positives = 72/144 (50%), Gaps = 1/144 (0%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
K++ K+K KEKEKEK+ K+ ++ +EK++++V +K + +K K K + +KE
Sbjct: 132 KPKEEAKEKRPPKEKEKEKE-KKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEP 190
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
+E+K ++ K K E D +E++E++E + ++S ++ + S IS
Sbjct: 191 PEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSS 250
Query: 166 PAPTPTQKSPVKTKEKEKEKESST 189
+ SP + +E T
Sbjct: 251 SSLKKPDPSPSMASPETRESSKRT 274
Score = 54.1 bits (130), Expect = 1e-07
Identities = 37/229 (16%), Positives = 77/229 (33%), Gaps = 9/229 (3%)
Query: 16 PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVS 75
P K K++ P ++++K ++R + K + KK +K
Sbjct: 128 PKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKK 187
Query: 76 SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK---------EKSHKHKDKDRER 126
+ E++K +E + KP+E +E++KE+ D K E+ + + R
Sbjct: 188 KEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISR 247
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
++ + S + +SK + + PP P + +P + K KE
Sbjct: 248 RSSSSLKKPDPSPSMASPETRESSKRTETRPRTSLRPPSARPASARPAPPRVKRKEIVTV 307
Query: 187 SSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPE 235
+ + + E ++ AG + E
Sbjct: 308 LQDAQGVGKIVSNVILEGKKSEDEDDENFVVEAAAQAPDIVAGGEDEAE 356
Score = 45.3 bits (107), Expect = 7e-05
Identities = 26/161 (16%), Positives = 63/161 (39%), Gaps = 18/161 (11%)
Query: 71 KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
K A S ++ ++ K K +++ K + K+++++ + K++ +++ +
Sbjct: 65 KCAESKLSSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKP 124
Query: 131 KKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTT 190
K+E K+ K + PP +K + +++E+EK+
Sbjct: 125 KEEPKDRKPKEEAKEKR---------------PPKEKEKEKEKKVEEPRDREEEKKRERV 169
Query: 191 HDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPK 231
K K KK KE + K++++ + + K
Sbjct: 170 RAKSRPKKPPKKKP---PNKKKEPPEEEKQRQAAREAVKGK 207
Score = 35.2 bits (81), Expect = 0.090
Identities = 37/188 (19%), Positives = 73/188 (38%), Gaps = 37/188 (19%)
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
S ++ K + S K K K+ K +S K ++K++E+ K+EKK++KE
Sbjct: 72 SSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKP------- 124
Query: 146 SSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKH 205
++ P K KE+ KE + K K KK ++
Sbjct: 125 --------------------------KEEPKDRKPKEEAKEKR-PPKEKEKEKEKKVEEP 157
Query: 206 GDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFD 265
D+ K+++ + + K K PK P K+ + +E + + ++ +
Sbjct: 158 RDREEEKKRE-RVRAKSRPKKP--PKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVN 214
Query: 266 RLKNEQAD 273
+ ++ D
Sbjct: 215 EEREKEED 222
>gnl|CDD|223528 COG0451, WcaG, Nucleoside-diphosphate-sugar epimerases [Cell
envelope biogenesis, outer membrane / Carbohydrate
transport and metabolism].
Length = 314
Score = 64.2 bits (156), Expect = 4e-11
Identities = 61/299 (20%), Positives = 100/299 (33%), Gaps = 72/299 (24%)
Query: 253 ERLAEQFKD-ELFDRLKNEQADILQRKVHIISGDISQPSLGISSHDQQFIQHHIHVIIHA 311
ERL D DRL + D L V + D++ L + + +IH
Sbjct: 18 ERLLAAGHDVRGLDRL-RDGLDPLLSGVEFVVLDLTDRDL-----VDELAKGVPDAVIHL 71
Query: 312 AASLRFDELI----QDAFTLNIQATRELLDLATRCSQLKAILHVST---LYTHSYREDIQ 364
AA + + +N+ T LL+ A R + +K + S+ +Y I
Sbjct: 72 AAQSSVPDSNASDPAEFLDVNVDGTLNLLEAA-RAAGVKRFVFASSVSVVYGDPPPLPID 130
Query: 365 EEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEKY--LYK 422
E+ PP N Y +K E ++ Y LY
Sbjct: 131 EDLGPPRP-----------------------------LNPYGVSKLAAEQLLRAYARLYG 161
Query: 423 LPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAALGLIHTFYAKHDKKCDLIPVDVA 482
LP+ ++RP N+YGPG + G++ F + K +I +
Sbjct: 162 LPVVILRPF---------------NVYGPGD-KPDLSSGVVSAFIRQLLKGEPIIVIGGD 205
Query: 483 TNMMLGVVWKTALDHGHVAPPASLVAP-IPRTDPPVYNLSISSSYPITWLEYMNSVQAA 540
+ D +V A + + D V+N+ S + IT E +V A
Sbjct: 206 GSQ--------TRDFVYVDDVADALLLALENPDGGVFNIG-SGTAEITVRELAEAVAEA 255
>gnl|CDD|233557 TIGR01746, Thioester-redct, thioester reductase domain. This model
includes the terminal domain from the fungal alpha
aminoadipate reductase enzyme (also known as
aminoadipate semialdehyde dehydrogenase) which is
involved in the biosynthesis of lysine, as well as the
reductase-containing component of the myxochelin
biosynthetic gene cluster, MxcG. The mechanism of
reduction involves activation of the substrate by
adenylation and transfer to a covalently-linked
pantetheine cofactor as a thioester. This thioester is
then reduced to give an aldehyde (thus releasing the
product) and a regenerated pantetheine thiol. (In
myxochelin biosynthesis this aldehyde is further reduced
to an alcohol or converted to an amine by an
aminotransferase.) This is a fundamentally different
reaction from that of beta-ketoreductase domains of polyketide
synthases, which act at a carbonyl two carbons removed
from the thioester and form an alcohol as a product.
This domain is invariably found at the C-terminus of the
proteins which contain it (presumably because it results
in the release of the product). The majority of hits to
this model are non-ribosomal peptide synthetases in
which this domain is similarly located proximal to a
thiolation domain (pfam00550). In some cases this domain
is found at the end of a polyketide synthetase enzyme,
but is unlike ketoreductase domains which are found
before the thiolase domains. Exceptions to this observed
relationship with the thiolase domain include three
proteins which consist of stand-alone reductase domains
(GP|466833 from M. leprae, GP|435954 from Anabaena and
OMNI|NTL02SC1199 from Strep. coelicolor) and one protein
(OMNI|NTL01NS2636 from Nostoc) which contains N-terminal
homology with a small group of hypothetical proteins but
no evidence of a thiolation domain next to the putative
reductase domain. Below the noise cutoff to this model
are proteins containing more distantly related
ketoreductase and dehydratase/epimerase domains. It has
been suggested that a NADP-binding motif can be found in
the N-terminal portion of this domain that may form a
Rossmann-type fold.
Length = 367
Score = 60.9 bits (148), Expect = 6e-10
Identities = 41/205 (20%), Positives = 81/205 (39%), Gaps = 40/205 (19%)
Query: 232 CYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSL 291
+ L+R+ + ERL E + D+ + ++ +++GD+S+P L
Sbjct: 21 RRSTQAKVICLVRAASEEHAMERLREALRSYRL-----WHEDLARERIEVVAGDLSEPRL 75
Query: 292 GISSHDQQFIQHHIHVIIHAAASLRF----DELIQDAFTLNIQATRELLDLATRCSQLKA 347
G+S + + + ++ I+H A + + EL N+ TRE+L LA + K
Sbjct: 76 GLSDAEWERLAENVDTIVHNGALVNWVYPYSELRGA----NVLGTREVLRLAAS-GRAKP 130
Query: 348 ILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYN-NSYS 406
+ +VST+ + T E+ ++ Y+
Sbjct: 131 LHYVSTI-------SVGAAIDLS-------------TVTEDDATVT----PPPGLAGGYA 166
Query: 407 FTKAIGESVVEKY-LYKLPLAMVRP 430
+K + E +V + LP+ +VRP
Sbjct: 167 QSKWVAELLVREASDRGLPVTIVRP 191
>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
Length = 1388
Score = 61.6 bits (150), Expect = 8e-10
Identities = 45/198 (22%), Positives = 82/198 (41%), Gaps = 2/198 (1%)
Query: 34 TSSSTSNPTNSSSSKKDKKDKDRD-KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
S +S + K K ++ K+K+ +K +SK + D+ + +
Sbjct: 1155 EQRLKSKTKGKASKLRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPD 1214
Query: 93 ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
K S S++E +E+K K +KS + K ++ + + E + SS +
Sbjct: 1215 NKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNA 1274
Query: 153 PASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPK 212
P S + PPPP+ P +S +K K+ + S KKK K KT K
Sbjct: 1275 PKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARK 1334
Query: 213 EKDAKSKEKESHKSSAGP 230
+K +K++ K++ S +
Sbjct: 1335 KK-SKTRVKQASASQSSR 1351
Score = 46.2 bits (110), Expect = 5e-05
Identities = 28/205 (13%), Positives = 61/205 (29%), Gaps = 2/205 (0%)
Query: 55 DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
+ +EKE K+ + K K+ + + K K++++K+ + S + K
Sbjct: 1145 EEVEEKEIAKEQRLKSKTKGKASKLRK-PKLKKKEKKKKKSSADKSKKASVVGNSKRVDS 1203
Query: 115 KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKS 174
+ D + K + + +S + + S
Sbjct: 1204 DEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSD 1263
Query: 175 PVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA-KSKEKESHKSSAGPKCY 233
+ + K K + ++S K+ K K K K+ + S
Sbjct: 1264 DLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKK 1323
Query: 234 PEVGGIYILLRSKKNKTVQERLAEQ 258
+ + K V++ A Q
Sbjct: 1324 KKKSEKKTARKKKSKTRVKQASASQ 1348
Score = 45.8 bits (109), Expect = 6e-05
Identities = 33/194 (17%), Positives = 63/194 (32%), Gaps = 11/194 (5%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSS-SSKKDKKDKDRDKEKEK 62
SV +S + + D+ S+ + + K+ K + K
Sbjct: 1193 SVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSK 1252
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
+D ++ S SKE + + + S P S + + K K
Sbjct: 1253 SSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKK 1312
Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ---LISHPPPPAPTPTQKSPVKTK 179
E K++K+S+ + S + AS SQ L+ P +K +
Sbjct: 1313 RLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPR-------KKKSDSSS 1365
Query: 180 EKEKEKESSTTHDK 193
E + + E + D+
Sbjct: 1366 EDDDDSEVDDSEDE 1379
Score = 45.4 bits (108), Expect = 8e-05
Identities = 23/137 (16%), Positives = 49/137 (35%), Gaps = 9/137 (6%)
Query: 1 MAYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEK 60
+ + +A + S+ + +S + KK KK
Sbjct: 1261 SSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRL-EGSLA 1319
Query: 61 EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
+KK K + K+A K SK + ++ S + S + +K+K D +
Sbjct: 1320 ALKKKKKSEKKTA--------RKKKSKTRVKQASASQSSRLLRRPRKKKSDSSSEDDDDS 1371
Query: 121 DKDRERDKDEKKEQKES 137
+ D D+D++ ++ +
Sbjct: 1372 EVDDSEDEDDEDDEDDD 1388
Score = 45.0 bits (107), Expect = 1e-04
Identities = 38/179 (21%), Positives = 72/179 (40%), Gaps = 4/179 (2%)
Query: 47 SKKDKKDKDRDKEKEKEKKD---KEKDK-SAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
++ +KK+K+ +K K KD ++ DK +++E ++ +++R +SK K +S+
Sbjct: 1109 AELEKKEKELEKLKNTTPKDMWLEDLDKFEEALEEQEEVEEKEIAKEQRLKSKTKGKASK 1168
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
K K KK +K+K DK ++ ++ +S K+ N K +SGS
Sbjct: 1169 LRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDD 1228
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
+K +SS +D+ S K+ K + S
Sbjct: 1229 EEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPP 1287
Score = 41.6 bits (98), Expect = 0.001
Identities = 25/149 (16%), Positives = 47/149 (31%), Gaps = 12/149 (8%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTST--SSSTSNPTNS-SSSKKDKKDKDRDKEKE 61
K SS S N K S S+ S P N+ + ++
Sbjct: 1233 TKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRP 1292
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH---- 117
+ + S+ + K+ +K S +K+ K ++ ++ K+K K + + S
Sbjct: 1293 DGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRL 1352
Query: 118 -----KHKDKDRERDKDEKKEQKESKSSS 141
K K D D+ +
Sbjct: 1353 LRRPRKKKSDSSSEDDDDSEVDDSEDEDD 1381
>gnl|CDD|236304 PRK08581, PRK08581, N-acetylmuramoyl-L-alanine amidase; Validated.
Length = 619
Score = 60.6 bits (147), Expect = 1e-09
Identities = 38/238 (15%), Positives = 79/238 (33%), Gaps = 21/238 (8%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
K S++ +++H S N D+ S S T + +N S+ DKK D
Sbjct: 30 PQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNN-NTSNQDNNDKKFSTIDSSTSDS 88
Query: 64 KKDKEKDKSAVSSKEKEKDK-----------------VSSKEKERKESKPKESSSEKEKK 106
+ + + + + + + + +S +
Sbjct: 89 NNIIDFIYKNLPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNSDISDYEQPRNSEKSTND 148
Query: 107 KEKKDKKEKSHKHKDKDRERDK-DEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
K + + ++DK D +K + + + NS +P +Q + P
Sbjct: 149 SNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQ-SNSQPA 207
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHK-KKDKHGDKTNPKEKDAKSKEKE 222
T QKS K + + + D++S+ K +KD K + + +K +
Sbjct: 208 SDDTANQKSSSKDNQSMSDSALDSILDQYSEDAKKTQKDYASQSKKDKTETSNTKNPQ 265
Score = 42.1 bits (99), Expect = 8e-04
Identities = 23/200 (11%), Positives = 68/200 (34%), Gaps = 3/200 (1%)
Query: 29 IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
+P+ ++ ++ ++ S+ K + ++ KD S + K + +
Sbjct: 17 LPTLTSPTAYADDPQKDSTAKTTSHDSKKSNDDETSKD---TSSKDTDKADNNNTSNQDN 73
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
++K S S+S+ + K + D+ + ++S
Sbjct: 74 NDKKFSTIDSSTSDSNNIIDFIYKNLPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNSDI 133
Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDK 208
+ E S+ ++ + K+ T+ +++K + + K +K +
Sbjct: 134 SDYEQPRNSEKSTNDSNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNS 193
Query: 209 TNPKEKDAKSKEKESHKSSA 228
P + + + + S ++
Sbjct: 194 PKPTQPNQSNSQPASDDTAN 213
Score = 35.1 bits (81), Expect = 0.12
Identities = 34/159 (21%), Positives = 63/159 (39%), Gaps = 8/159 (5%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
S+K+ + + S+ N+ SS STS+ N + + D ++
Sbjct: 156 SIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTANQK 215
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK-----EKSHK 118
K+ + S+ + D+ S E +K K S S+K+K + K + K
Sbjct: 216 SSSKDNQSMSDSALDSILDQYS--EDAKKTQKDYASQSKKDKTETSNTKNPQLPTQDELK 273
Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
HK K + +++ Q ++S+S + S SGS
Sbjct: 274 HKSKPAQSFENDVN-QSNTRSTSLFETGPSLSNNDDSGS 311
>gnl|CDD|212494 cd08946, SDR_e, extended (e) SDRs. Extended SDRs are distinct from
classical SDRs. In addition to the Rossmann fold
(alpha/beta folding pattern with a central beta-sheet)
core region typical of all SDRs, extended SDRs have a
less conserved C-terminal extension of approximately 100
amino acids. Extended SDRs are a diverse collection of
proteins, and include isomerases, epimerases,
oxidoreductases, and lyases; they typically have a
TGXXGXXG cofactor binding motif. SDRs are a functionally
diverse family of oxidoreductases that have a single
domain with a structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 200
Score = 57.3 bits (139), Expect = 2e-09
Identities = 33/178 (18%), Positives = 59/178 (33%), Gaps = 53/178 (29%)
Query: 307 VIIHAAASLRFDELIQDA---FTLNIQATRELLDLATRCSQLKAILHVSTLYTHSYREDI 363
V++H AA + + F N+ T LL+ A + +K ++ S+ + E +
Sbjct: 33 VVVHLAALVGVPASWDNPDEDFETNVVGTLNLLEAARKAG-VKRFVYASSASVYGSPEGL 91
Query: 364 QEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEKYL--Y 421
EE P + Y +K E ++ Y Y
Sbjct: 92 PEEEETPPRP----------------------------LSPYGVSKLAAEHLLRSYGESY 123
Query: 422 KLPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAALGLIHTFY--AKHDKKCDLI 477
LP+ ++R + N+YGPG G+++ F A K +
Sbjct: 124 GLPVVILRLA---------------NVYGPGQRPRLD--GVVNDFIRRALEGKPLTVF 164
>gnl|CDD|225857 COG3320, COG3320, Putative dehydrogenase domain of multifunctional
non-ribosomal peptide synthetases and related enzymes
[Secondary metabolites biosynthesis, transport, and
catabolism].
Length = 382
Score = 56.2 bits (136), Expect = 2e-08
Identities = 42/198 (21%), Positives = 81/198 (40%), Gaps = 27/198 (13%)
Query: 239 IYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGISSHDQ 298
+ L+R++ ++ RL + F L ++ +V +++GD+++P LG+S
Sbjct: 28 VICLVRAQSDEAALARLEKTF------DLYRHWDELSADRVEVVAGDLAEPDLGLSERTW 81
Query: 299 QFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYTHS 358
Q + ++ +IIH AA + + N+ T E+L LA + K + +VS++
Sbjct: 82 QELAENVDLIIHNAALVNHVFPYSELRGANVLGTAEVLRLAAT-GKPKPLHYVSSI---- 136
Query: 359 YREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVV-E 417
+ E Y + EI + G Y +K + E +V E
Sbjct: 137 ---SVGETEYY------------SNFTVDFDEISPTRNVGQGLAGGYGRSKWVAEKLVRE 181
Query: 418 KYLYKLPLAMVRPSIVVS 435
LP+ + RP +
Sbjct: 182 AGDRGLPVTIFRPGYITG 199
>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
Provisional.
Length = 482
Score = 55.7 bits (135), Expect = 4e-08
Identities = 26/75 (34%), Positives = 49/75 (65%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
T S + K K EK++E++ KEK K A + K+KE+++ KEK+ +E + +E +
Sbjct: 403 TGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEA 462
Query: 102 EKEKKKEKKDKKEKS 116
E+EK++E++ KK+++
Sbjct: 463 EEEKEEEEEKKKKQA 477
Score = 48.8 bits (117), Expect = 5e-06
Identities = 21/82 (25%), Positives = 52/82 (63%), Gaps = 3/82 (3%)
Query: 57 DKEKE---KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
++E E KK +K K V EK++++ ++K++ + K+ E+E+K++K+++K
Sbjct: 396 EEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEK 455
Query: 114 EKSHKHKDKDRERDKDEKKEQK 135
E+ + ++++E ++++KK+Q
Sbjct: 456 EEEEEEAEEEKEEEEEKKKKQA 477
Score = 48.8 bits (117), Expect = 6e-06
Identities = 21/76 (27%), Positives = 44/76 (57%), Gaps = 5/76 (6%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
T +K +K R++EK+++KK K +E+EK+K KE ++ + +E +
Sbjct: 408 ATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEK-----KEEEKEEEEEEA 462
Query: 101 SEKEKKKEKKDKKEKS 116
E+++++E+K KK+ +
Sbjct: 463 EEEKEEEEEKKKKQAT 478
Score = 43.8 bits (104), Expect = 2e-04
Identities = 20/82 (24%), Positives = 45/82 (54%), Gaps = 3/82 (3%)
Query: 61 EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
E+E + K A +K K V EK+R+E K ++ KK++++++E+ K +
Sbjct: 396 EEEIEFLTGSKKA---TKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKE 452
Query: 121 DKDRERDKDEKKEQKESKSSSK 142
++ E +++ ++E++E + K
Sbjct: 453 EEKEEEEEEAEEEKEEEEEKKK 474
Score = 43.4 bits (103), Expect = 3e-04
Identities = 15/77 (19%), Positives = 41/77 (53%)
Query: 66 DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
+E+ + SK+ K EK K+ + ++ +K+ KK ++E+ + + K+ E
Sbjct: 395 TEEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEE 454
Query: 126 RDKDEKKEQKESKSSSK 142
++++E++ ++E + +
Sbjct: 455 KEEEEEEAEEEKEEEEE 471
Score = 40.7 bits (96), Expect = 0.002
Identities = 20/81 (24%), Positives = 46/81 (56%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
E+E + ++ +K K+ K +EK++++EKK+KK+K+ K K+ E +++++K+++E
Sbjct: 396 EEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEK 455
Query: 138 KSSSKIVSSSHNSKEPASGSQ 158
+ + +E Q
Sbjct: 456 EEEEEEAEEEKEEEEEKKKKQ 476
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 54.0 bits (129), Expect = 2e-07
Identities = 37/179 (20%), Positives = 78/179 (43%), Gaps = 2/179 (1%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
+ + E E ++ E + +K+ D K +E+K++ + +E++KKK
Sbjct: 1348 AKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKA 1407
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
+ KK + K K + ++ +EKK+ E+K ++ + +K+ A ++ A
Sbjct: 1408 DELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAE 1467
Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+ K K +E +K K ++ KK D+ K+K ++K+ E K +
Sbjct: 1468 EAKKADEAKKKAEEAKKADEAK--KKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKA 1524
Score = 53.2 bits (127), Expect = 4e-07
Identities = 35/182 (19%), Positives = 77/182 (42%), Gaps = 14/182 (7%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE-KERKESKPKESSSEKEKKKEK 109
KK +++ K E +KK +E K+ + K+ E+ K + K++ E K + + K + +
Sbjct: 1296 KKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAA 1355
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
D+ E + + + ++ ++ KK+ +K ++ + +K+ A +
Sbjct: 1356 ADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDK----------- 1404
Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
+ +K K+K K K + K ++ K +AK K +E+ K+
Sbjct: 1405 -KKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK-KADEAKKKAEEAKKAEEA 1462
Query: 230 PK 231
K
Sbjct: 1463 KK 1464
Score = 53.2 bits (127), Expect = 4e-07
Identities = 37/186 (19%), Positives = 85/186 (45%), Gaps = 6/186 (3%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+ +K K D+ + K +E +K D+ K K+ +E K + K++ E K + + K +
Sbjct: 1298 AEEKKKADEAKKKAEEAKKADEAKKKA------EEAKKKADAAKKKAEEAKKAAEAAKAE 1351
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
+ D+ E + + + ++ ++ KK+ +K ++ + +K+ A + +
Sbjct: 1352 AEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELK 1411
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
A +K+ K+ E++K++ K + K + K + K ++AK K +E+ K
Sbjct: 1412 KAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKK 1471
Query: 226 SSAGPK 231
+ K
Sbjct: 1472 ADEAKK 1477
Score = 51.7 bits (123), Expect = 9e-07
Identities = 38/185 (20%), Positives = 78/185 (42%), Gaps = 2/185 (1%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
++KK D K+ E++KK E K A +K+ ++ K ++E ++K K+ + E +K E
Sbjct: 1287 EEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAE 1346
Query: 109 KKDKKEKSHKHKDKDRERDK--DEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
+ ++ + + E EKK+++ K + + K+ +
Sbjct: 1347 AAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKK 1406
Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
A + + K K E +K++ K ++ K D+ K ++AK E+ K+
Sbjct: 1407 ADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKA 1466
Query: 227 SAGPK 231
K
Sbjct: 1467 EEAKK 1471
Score = 51.7 bits (123), Expect = 1e-06
Identities = 42/175 (24%), Positives = 79/175 (45%), Gaps = 10/175 (5%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
KK ++ + E+ KK +E+ K K+KE ++ E+ +K + + + +E KK ++
Sbjct: 1613 KKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEE 1672
Query: 111 DKKEKSHKHKDKDRERDKDE--KKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
DKK+ K ++ E+ E KKE +E+K + ++ K+ A +
Sbjct: 1673 DKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELK--------KA 1724
Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
K + +KE E++ + + KKK H K K+ + KEKE+
Sbjct: 1725 EEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEA 1779
Score = 51.7 bits (123), Expect = 1e-06
Identities = 36/193 (18%), Positives = 79/193 (40%), Gaps = 11/193 (5%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
+ ++KK ++ + E K + + D++ + ++ E + +E ++K K+ + EK
Sbjct: 1331 ADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEK 1390
Query: 104 EK-----KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+K KK ++DKK K + ++ KK+ E+K ++ + +K+ A ++
Sbjct: 1391 KKADEAKKKAEEDKK------KADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK 1444
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
A + K K +E +K K + K ++ K +AK
Sbjct: 1445 KADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKK 1504
Query: 219 KEKESHKSSAGPK 231
+ K+ K
Sbjct: 1505 AAEAKKKADEAKK 1517
Score = 51.3 bits (122), Expect = 2e-06
Identities = 44/220 (20%), Positives = 96/220 (43%), Gaps = 4/220 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ ++ + ++K++E K K + ++K+ D+ K +E K+ + + KKK
Sbjct: 1360 EAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKK 1419
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
+ KK+ K K + ++ +E K+ E+K ++ + +K+ A ++ A
Sbjct: 1420 ADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKA 1479
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+ K K +E +K++ + + KKK K +K ++K+ E K +
Sbjct: 1480 EEAKKADEAKKKAEEAKKKAD---EAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKA 1536
Query: 228 AGPKCYPEVGGIYILLRSKKNKTVQE-RLAEQFKDELFDR 266
K E L ++++ K +E + AE+ K D+
Sbjct: 1537 DEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDK 1576
Score = 50.9 bits (121), Expect = 2e-06
Identities = 42/183 (22%), Positives = 82/183 (44%), Gaps = 3/183 (1%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEK--EKDKVSSKEKERKESKPKESSSEKEKKKE 108
KK ++ K E +KK +E K A ++K+K E K + K E+ E+ + +EK +
Sbjct: 1309 KKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEA 1368
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
+ KKE++ K D +++ +++KK + K + + + K+ A+ + A
Sbjct: 1369 AEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAE 1428
Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
+ K K +E +K K ++ K ++ K +AK K +E+ K+
Sbjct: 1429 EKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAK-KADEAKKKAEEAKKADE 1487
Query: 229 GPK 231
K
Sbjct: 1488 AKK 1490
Score = 47.1 bits (111), Expect = 3e-05
Identities = 27/149 (18%), Positives = 76/149 (51%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
KK +++ +E +K +++K K+ + K +E +K +++ +++ + K++ K+K+
Sbjct: 1653 KKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEA 1712
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
E+K K E+ K +++++ + ++ KKE +E K ++ K+ + +
Sbjct: 1713 EEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEE 1772
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSK 196
+++ ++ + E++++ DK K
Sbjct: 1773 IRKEKEAVIEEELDEEDEKRRMEVDKKIK 1801
Score = 45.5 bits (107), Expect = 8e-05
Identities = 45/216 (20%), Positives = 94/216 (43%), Gaps = 3/216 (1%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-KPKESSSEKEK 105
++ KK D K+K +EKK ++ K +K+ D++ +K++ + K+ + EK+K
Sbjct: 1373 KEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKK 1432
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK-EPASGSQLISHPP 164
E K K E++ K + ++ ++ +K E+ + K+ + K E A +
Sbjct: 1433 ADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKA 1492
Query: 165 PPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESH 224
A ++ + K+K E+ + + KK ++ K+ + K K E
Sbjct: 1493 EEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELK 1552
Query: 225 KSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFK 260
K+ K E +++++K + R AE+ K
Sbjct: 1553 KAEELKKA-EEKKKAEEAKKAEEDKNMALRKAEEAK 1587
Score = 45.5 bits (107), Expect = 8e-05
Identities = 47/255 (18%), Positives = 102/255 (40%), Gaps = 17/255 (6%)
Query: 22 KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
+++++ I + ++ K ++ K + +K +EKK ++ K A K+ ++
Sbjct: 1247 EERNNEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKKAEEKKKADE 1306
Query: 82 DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD----KDRERDKDEKKEQKES 137
K ++E +K + K+ + E +KK + KK + K + E DE + +E
Sbjct: 1307 AKKKAEEA-KKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEK 1365
Query: 138 KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKH 197
+++ K A+ + A + K K +E +K++ +
Sbjct: 1366 AEAAEKKKEEAKKKADAAKKK--------AEEKKKADEAKKKAEEDKKKADELKKAAAAK 1417
Query: 198 KHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAE 257
K + K + K +AK K +E+ K+ K E K K + + A+
Sbjct: 1418 KKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKA----EEAKKKAEEAKKAD 1473
Query: 258 QFKDELFDRLKNEQA 272
+ K + + K ++A
Sbjct: 1474 EAKKKAEEAKKADEA 1488
Score = 44.0 bits (103), Expect = 2e-04
Identities = 43/241 (17%), Positives = 90/241 (37%), Gaps = 19/241 (7%)
Query: 43 NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
+ ++ K D K+K +E K ++ K +K+ D+ + +K++ + + E
Sbjct: 1461 EAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEE 1520
Query: 103 KEKKKE--KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP------- 153
+K E K ++ +K+ + K + ++ DE K+ +E K + + + K
Sbjct: 1521 AKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMAL 1580
Query: 154 --ASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNP 211
A ++ + K K +E +K K +++ K ++
Sbjct: 1581 RKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKK 1640
Query: 212 KEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
KE + K K +E K+ K K ++ AE+ K D K +
Sbjct: 1641 KEAEEKKKAEELKKAEEENKIKAA--------EEAKKAEEDKKKAEEAKKAEEDEKKAAE 1692
Query: 272 A 272
A
Sbjct: 1693 A 1693
Score = 44.0 bits (103), Expect = 3e-04
Identities = 37/174 (21%), Positives = 81/174 (46%), Gaps = 3/174 (1%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
+D K + K+ E+ KKD E+ K A +E+ +++ E+ R + ++ K ++
Sbjct: 1221 EDAKKAEAVKKAEEAKKDAEEAKKA--EEERNNEEIRKFEEARMAHFARRQAAIKAEEAR 1278
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPA-SGSQLISHPPPPA 167
K D+ +K+ + K D + +EKK+ E+K ++ + +K+ A + A
Sbjct: 1279 KADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKA 1338
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
+ + E E + + ++ ++ KKK++ K + +K A+ K+K
Sbjct: 1339 EEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKK 1392
Score = 43.2 bits (101), Expect = 4e-04
Identities = 51/226 (22%), Positives = 91/226 (40%), Gaps = 28/226 (12%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+ +K K D+ + K +E +K D+ K K+ + K +E K E+ +K + K+ + E +K
Sbjct: 1427 AEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKK--KAEEAKKADEAKKKAEEAKK 1484
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
E K K E++ K D+ ++ + +KK + K+ + E A +
Sbjct: 1485 ADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKAD------- 1537
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
+K+ K K E +K + K + K DK K ++K+ E +
Sbjct: 1538 ----EAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEAR 1593
Query: 226 SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
K Y E KK K AE+ K ++K E+
Sbjct: 1594 IEEVMKLYEE---------EKKMK------AEEAKKAEEAKIKAEE 1624
Score = 41.7 bits (97), Expect = 0.001
Identities = 47/215 (21%), Positives = 93/215 (43%), Gaps = 2/215 (0%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K ++K K + +K +E K E+ K A +K+ E+DK + K + K +E+ E+ K
Sbjct: 1541 KAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKL 1600
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
+++KK K+ + K + + K E+ ++ E + ++E +L
Sbjct: 1601 YEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENK 1660
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+++ K E++K+K + + K + K + K ++ K KE E K +
Sbjct: 1661 IKAAEEA--KKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKA 1718
Query: 228 AGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDE 262
K E I K+ + +++ E KDE
Sbjct: 1719 EELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDE 1753
Score = 38.2 bits (88), Expect = 0.015
Identities = 50/228 (21%), Positives = 93/228 (40%), Gaps = 18/228 (7%)
Query: 48 KKDKKDKDRDKEKEKE-KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
K D+ K + +K E KK +EK K+ K +E K K+K + K +E + +K
Sbjct: 1523 KADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRK 1582
Query: 107 KEKKDKKEKSH-KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
E+ K E++ + K E +K K E+ + +KI E ++
Sbjct: 1583 AEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKI------KAEELKKAEEEKKKVE 1636
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
+ K +E +K +E + K ++ K ++ E+D K + K
Sbjct: 1637 QLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKK 1696
Query: 226 SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQF-KDELFDRLKNEQA 272
+ K E+ KK + +++ AE+ K E +++K E+A
Sbjct: 1697 EAEEAKKAEEL---------KKKEAEEKKKAEELKKAEEENKIKAEEA 1735
Score = 38.2 bits (88), Expect = 0.017
Identities = 40/203 (19%), Positives = 92/203 (45%), Gaps = 5/203 (2%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
+++ K + A+ + + + +++ KK ++ K +E+ K E+ K KE
Sbjct: 1685 EDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAK-----KEA 1739
Query: 80 EKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
E+DK ++E ++ E + K+ + K+++++K ++ K + ++ ++DEK+ + K
Sbjct: 1740 EEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKK 1799
Query: 140 SSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKH 199
I + N E L+ + K +K + E+ + K +K+
Sbjct: 1800 IKDIFDNFANIIEGGKEGNLVINDSKEMEDSAIKEVADSKNMQLEEADAFEKHKFNKNNE 1859
Query: 200 KKKDKHGDKTNPKEKDAKSKEKE 222
+D + + KEKD K ++E
Sbjct: 1860 NGEDGNKEADFNKEKDLKEDDEE 1882
Score = 37.4 bits (86), Expect = 0.022
Identities = 45/226 (19%), Positives = 93/226 (41%), Gaps = 16/226 (7%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
KKD + K+ E+E+ ++E K + + ++ + E + +E++KK ++
Sbjct: 1236 KKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEA 1295
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKES-KSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
K E+ K K + ++ +E K+ E+ K + + + +K+ A ++ +
Sbjct: 1296 KKAEE--KKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAE 1353
Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK---DKHGDKTNPKEKDAKSKEKESHKS 226
+EK + E K KKK K D+ K ++ K K E K+
Sbjct: 1354 AAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKA 1413
Query: 227 SAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQA 272
+A K E K K +++ A++ K + + K ++A
Sbjct: 1414 AAAKKKADEA----------KKKAEEKKKADEAKKKAEEAKKADEA 1449
Score = 36.3 bits (83), Expect = 0.065
Identities = 31/185 (16%), Positives = 76/185 (41%), Gaps = 4/185 (2%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE---KDKVSSKEKERKESKPKESSSEKE 104
++++K ++ K ++ +K + K +E ++ + E+ RK + + + +
Sbjct: 1209 EEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARR 1268
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPP 164
+ K ++ K+ + K + ++ DE K+ +E K + + + +K+ +
Sbjct: 1269 QAAIKAEEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAK 1328
Query: 165 PPAPTPTQKSPVKTKEKE-KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
A +K+ K E + E+ D+ + K + K K+K +K+K
Sbjct: 1329 KKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAE 1388
Query: 224 HKSSA 228
K A
Sbjct: 1389 EKKKA 1393
Score = 31.6 bits (71), Expect = 1.4
Identities = 45/241 (18%), Positives = 96/241 (39%), Gaps = 24/241 (9%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE-------SSS 101
++ + + K E +K ++ K+ + K ++ K + K + K +E +
Sbjct: 1143 EEARKAEDAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEVRKAEELRKAEDARKA 1202
Query: 102 EKEKKKEKKDKKEKSHKHKDKDR----ERDKDEKKEQKESKSSSKIVSSSHNSK-EPASG 156
E +K E++ K E++ K +D + ++ ++ KK+ +E+K + + ++ K E A
Sbjct: 1203 EAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARM 1262
Query: 157 SQLISHPPPPAPTPTQKSP--VKTKEKEKEKESSTTHDKHSKHKHKKK---DKHGDKTNP 211
+ +K+ K +EK+K E+ +K + KKK K D+
Sbjct: 1263 AHFARRQAAIKAEEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKK 1322
Query: 212 KEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
K ++AK K + K + K E + + + + K E+
Sbjct: 1323 KAEEAKKKADAAKKKAEEAKKAAEA-------AKAEAEAAADEAEAAEEKAEAAEKKKEE 1375
Query: 272 A 272
A
Sbjct: 1376 A 1376
Score = 31.6 bits (71), Expect = 1.5
Identities = 25/117 (21%), Positives = 49/117 (41%), Gaps = 8/117 (6%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
S S+ K+ K+ E+ + + +K+ + ++ K+ +KEK+ KE +E
Sbjct: 1824 SKEMEDSAIKEVADSKNMQLEEADAFEKHKFNKNNENGEDGNKEADFNKEKDLKEDDEEE 1883
Query: 99 SSSEKEKKKEKKDKKE--------KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSS 147
E +K KD E + D + DKDE ++ ++ +I+ S
Sbjct: 1884 IEEADEIEKIDKDDIEREIPNNNMAGKNNDIIDDKLDKDEYIKRDAEETREEIIKIS 1940
Score = 30.1 bits (67), Expect = 4.3
Identities = 40/222 (18%), Positives = 90/222 (40%), Gaps = 11/222 (4%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK-EKEKDKVSSKEKERKESKPKE 98
P+ K+D D+ E+ E+ K + K E+ + +K+K K +E
Sbjct: 1073 KPSYKDFDFDAKEDNRADEATEEAFGKAEEAKKTETGKAEEARKAEEAKKKAEDARKAEE 1132
Query: 99 SSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+ ++ +K ++ +K + K + R+ + K E+ +K ++ ++E +
Sbjct: 1133 ARKAEDARKAEEARKAEDAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEVRKAEE 1192
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
L A + + E+E++ E + + K + KK + K ++AK
Sbjct: 1193 L-----RKAEDARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKK---DAEEAKK 1244
Query: 219 KEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFK 260
E+E +++ + + E + R K + R A++ K
Sbjct: 1245 AEEE--RNNEEIRKFEEARMAHFARRQAAIKAEEARKADELK 1284
>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27. This protein forms
the C subunit of DNA polymerase delta. It carries the
essential residues for binding to the Pol1 subunit of
polymerase alpha, from residues 293-332, which are
characterized by the motif D--G--VT, referred to as the
DPIM motif. The first 160 residues of the protein form
the minimal domain for binding to the B subunit, Cdc1,
of polymerase delta; the final 10 C-terminal residues,
362-372, form the binding motif for the DNA sliding
clamp, PCNA.
Length = 427
Score = 52.1 bits (125), Expect = 4e-07
Identities = 30/196 (15%), Positives = 74/196 (37%), Gaps = 6/196 (3%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
S S K +D+S +T+ T T+ ++ + + K
Sbjct: 167 PPKSIMSPEVKVKSAKKTQDTS---KETTTEKTEGKTSVKAASLKRNPPKKSNIMSSFFK 223
Query: 66 DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
K K+K + K S+E+ K E S + ++ + +++ ++
Sbjct: 224 KKTKEKKEKKEASESTVKEESEEESGKRDVILEDESAEPTGLDEDEDEDEPKPSGERSDS 283
Query: 126 RDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEK 185
++ E+KE+++ K K++ +E + P + + P K++E+++
Sbjct: 284 EEETEEKEKEKRKRLKKMMEDEDEDEEMEIVPES---PVEEEESEEPEPPPLPKKEEEKE 340
Query: 186 ESSTTHDKHSKHKHKK 201
E + + D + ++
Sbjct: 341 EVTVSPDGGRRRGRRR 356
Score = 45.6 bits (108), Expect = 5e-05
Identities = 40/222 (18%), Positives = 70/222 (31%), Gaps = 21/222 (9%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
+ + K ++ S+ S + K KK +D KE E
Sbjct: 135 VKRRTGVGLPPVAPAASPALKPTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTSKETTTE 194
Query: 64 KKDKEKDKSAVSSKEKEKDKV-------SSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
K + + A S K K K KE+KE K S+ KE+ +E+ K++
Sbjct: 195 KTEGKTSVKAASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDVI 254
Query: 117 HKHK-----DKDRERDKDEKKEQKESKSSSKIVSSSHNSKE-----PASGS----QLISH 162
+ + D + D+DE K E S + K ++
Sbjct: 255 LEDESAEPTGLDEDEDEDEPKPSGERSDSEEETEEKEKEKRKRLKKMMEDEDEDEEMEIV 314
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
P P + P +KE+E + + + +
Sbjct: 315 PESPVEEEESEEPEPPPLPKKEEEKEEVTVSPDGGRRRGRRR 356
Score = 40.6 bits (95), Expect = 0.002
Identities = 36/213 (16%), Positives = 62/213 (29%), Gaps = 25/213 (11%)
Query: 24 KDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK 83
KDS+ + N N S + + + K+ V+ K
Sbjct: 96 KDSNVLYDVDYDILKENLHNCSKNSLEYGKQAGPITNPNVKRRTGVGLPPVAPAASPALK 155
Query: 84 VSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
++ K PK S + K K K ++ S K+ E+ E K+S K
Sbjct: 156 PTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTS-----------KETTTEKTEGKTSVKA 204
Query: 144 VSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKH---- 199
S N + ++ +++ T ++E E+ES
Sbjct: 205 ASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDVILEDESAEPTG 264
Query: 200 ----------KKKDKHGDKTNPKEKDAKSKEKE 222
K + D E+ K K K
Sbjct: 265 LDEDEDEDEPKPSGERSDSEEETEEKEKEKRKR 297
Score = 38.7 bits (90), Expect = 0.008
Identities = 29/181 (16%), Positives = 66/181 (36%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
K +S S+ S ++ +D ++ + + K + D ++E E+++
Sbjct: 232 KKEASESTVKEESEEESGKRDVILEDESAEPTGLDEDEDEDEPKPSGERSDSEEETEEKE 291
Query: 65 KDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
K+K K + E E +++ + E + E K++++K+E + R
Sbjct: 292 KEKRKRLKKMMEDEDEDEEMEIVPESPVEEEESEEPEPPPLPKKEEEKEEVTVSPDGGRR 351
Query: 125 ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKE 184
+ K++ +V+ E S + P P P + + +K K
Sbjct: 352 RGRRRVMKKKTFKDEEGYLVTKKVYEWESFSEDEAEPPPTKPKPKVSTPAVPAAAKKPKA 411
Query: 185 K 185
Sbjct: 412 P 412
Score = 29.4 bits (66), Expect = 6.7
Identities = 20/96 (20%), Positives = 37/96 (38%)
Query: 136 ESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHS 195
+ ++S + + N K P+S P +K+ +KE EK T K +
Sbjct: 146 VAPAASPALKPTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTSKETTTEKTEGKTSVKAA 205
Query: 196 KHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPK 231
K K ++ +K K K+++ S + K
Sbjct: 206 SLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVK 241
>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR). This
family consists of several bovine-specific leukaemia
virus receptors which are thought to function as
transmembrane proteins, although their exact function is
unknown.
Length = 561
Score = 52.0 bits (124), Expect = 6e-07
Identities = 47/181 (25%), Positives = 76/181 (41%), Gaps = 10/181 (5%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
P N+ S +D KD + DK S +K ++ +SK E+ + E
Sbjct: 139 PENALPSDEDDKDPNDPYRALDIDLDKPLADSEKLPVQKHRNAETSKSPEKGDVPAVEKK 198
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
S+K KKKEKK+K+ +ERDKD+KKE + KS + S S + +
Sbjct: 199 SKKPKKKEKKEKE----------KERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEA 248
Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKE 220
S + T P + K+ E E+ + K K + +K++K K + + S
Sbjct: 249 SLANTVSGTAPDSEPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKKHHHHRCHHSDG 308
Query: 221 K 221
Sbjct: 309 G 309
Score = 40.8 bits (95), Expect = 0.002
Identities = 33/126 (26%), Positives = 64/126 (50%), Gaps = 8/126 (6%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES------KPKESS 100
KK+KK+K+++++K+K KK+ E KS + + + +S + + S S
Sbjct: 203 KKKEKKEKEKERDKDK-KKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPDS 261
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
E K + ++ +KS KHK K + ++K+EKK++K+ + S +++P +
Sbjct: 262 EPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKK-HHHHRCHHSDGGAEQPVQNGAVE 320
Query: 161 SHPPPP 166
P PP
Sbjct: 321 EEPLPP 326
Score = 32.7 bits (74), Expect = 0.57
Identities = 34/191 (17%), Positives = 70/191 (36%), Gaps = 8/191 (4%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS 146
+E+ ++ K+ +K+++KEK+ ++ + D + + + + + S
Sbjct: 86 EERRHRQRLEKDKREKKKREKEKRGRRRHHSLGTESDEDIAPAQMVDIVTEEMPENALPS 145
Query: 147 SHNSKEPASGSQL--ISHPPPPAPT---PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK 201
+ K+P + I P A + P QK K EK +K SK KK
Sbjct: 146 DEDDKDPNDPYRALDIDLDKPLADSEKLPVQKHRNAETSKSPEKGDVPAVEKKSKKPKKK 205
Query: 202 KDKHGDKTNPKEKDAKSKEKESHK---SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQ 258
+ K +K K+K + + +S + L + + T + ++
Sbjct: 206 EKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPDSEPDE 265
Query: 259 FKDELFDRLKN 269
KD + K
Sbjct: 266 PKDAEAEETKK 276
>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein. This protein is found to be part
of a large ribonucleoprotein complex containing the U3
snoRNA. Depletion of the Utp proteins impedes production
of the 18S rRNA, indicating that they are part of the
active pre-rRNA processing complex. This large RNP
complex has been termed the small subunit (SSU)
processome.
Length = 728
Score = 51.6 bits (124), Expect = 8e-07
Identities = 38/182 (20%), Positives = 71/182 (39%), Gaps = 16/182 (8%)
Query: 43 NSSSSKKDKKDKD---RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
+ + KK++ D + +E E E++ E++ S K + K + E++ K
Sbjct: 381 RAEARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLK 440
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQL 159
K + KEKK+ E+ +++ + +K K K S+ + K +E
Sbjct: 441 KENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEE------- 493
Query: 160 ISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
P T K+++ +K+SS+ DK + K K K KEK
Sbjct: 494 -----NPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVK-KKKKKEKSIDLD 547
Query: 220 EK 221
+
Sbjct: 548 DD 549
Score = 43.5 bits (103), Expect = 3e-04
Identities = 29/145 (20%), Positives = 50/145 (34%), Gaps = 1/145 (0%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
S + PS + + S + + K+KK+ D ++E E E++
Sbjct: 406 ESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEA 465
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
K + + K EK + +E+E E P + K K + K D+
Sbjct: 466 KVEKVANKLLKRSEKAQKEEEEEELDEENP-WLKTTSSVGKSAKKQDSKKKSSSKLDKAA 524
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSK 151
+K K K K K S +
Sbjct: 525 NKISKAAVKVKKKKKKEKSIDLDDD 549
Score = 39.7 bits (93), Expect = 0.005
Identities = 23/143 (16%), Positives = 57/143 (39%), Gaps = 7/143 (4%)
Query: 22 KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
+ K S N + K+ K+ + ++ +++E+ EK + + + ++
Sbjct: 422 RRKFGPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKA 481
Query: 82 DKVSSKEKERKE-------SKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
K +E+ +E S +S+ +++ KK+ K +K+ K + K +KK++
Sbjct: 482 QKEEEEEELDEENPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKE 541
Query: 135 KESKSSSKIVSSSHNSKEPASGS 157
K ++ + K
Sbjct: 542 KSIDLDDDLIDEEDSIKLDVDDE 564
Score = 39.3 bits (92), Expect = 0.007
Identities = 37/218 (16%), Positives = 78/218 (35%), Gaps = 26/218 (11%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAV--------SSKEKEKDKV 84
S S + + R K + ++ + +++ S + + K+++
Sbjct: 332 SDSEEEDEDDDEDDDDGENPWMLRKKLGKLKEGEDDEENSGLLSMKFMQRAEARKKEEND 391
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
+ E+ R+E + +E S E+E ++ K + K ++ E++ + KK +KE+K+ K
Sbjct: 392 AEIEELRRELEGEEESDEEENEEPSKKNVGRR-KFGPENGEKEAESKKLKKENKNEFKEK 450
Query: 145 SSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESS-----------TTHDK 193
S +E L K ++++ +KE+E T+
Sbjct: 451 KESDEEEE------LEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPWLKTTSSVG 504
Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPK 231
S K K K K + + K K
Sbjct: 505 KSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEK 542
Score = 34.6 bits (80), Expect = 0.15
Identities = 21/126 (16%), Positives = 40/126 (31%), Gaps = 3/126 (2%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
+ S + + A S ++ K K
Sbjct: 449 EKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPWLKTTSSVGKSAK 508
Query: 66 DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
++ K + S +K +K+S + K+ K KE S + + ++ S K D E
Sbjct: 509 KQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEKSIDLDDDLIDEE---DSIKLDVDDEE 565
Query: 126 RDKDEK 131
+ DE+
Sbjct: 566 DEDDEE 571
Score = 33.1 bits (76), Expect = 0.42
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 17/188 (9%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK-EKEKDKVSSKEKERKESKPKESSSE 102
S + + D + + E + E D ++ + K K K+ +E S +E
Sbjct: 324 SEEDEDEDSDSEEEDEDDDEDDDDGENPWMLRKKLGKLKEGEDDEENSGLLSMKFMQRAE 383
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
KK+E + E+ + + + E D++E +E + + + KE S
Sbjct: 384 ARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEAES------- 436
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
+K + K + KEK+ S ++ + K +K +K + + A+ +E+E
Sbjct: 437 ---------KKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEE 487
Query: 223 SHKSSAGP 230
P
Sbjct: 488 EELDEENP 495
Score = 33.1 bits (76), Expect = 0.45
Identities = 29/144 (20%), Positives = 53/144 (36%), Gaps = 31/144 (21%)
Query: 10 SSSSAHPSPHKNKDKDSSAIP-STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK--- 65
S ++ K K+K S + + D++ K+K+ K+
Sbjct: 528 SKAAVKVKKKKKKEKSIDLDDDLIDEEDSIKLDVDDEEDEDDEELPFLFKQKDLIKEAFA 587
Query: 66 ------DKEKDKSAVSSKEKEKDK--------------VSSKEKERKESKPKESSSEKEK 105
+ EK+K V +E K+ + ++K+RK + + E K
Sbjct: 588 GDDVVAEFEKEKKEVIEEEDPKEIDLTLPGWGSWAGDGIKKRKKKRKRKRRFLTKIEGVK 647
Query: 106 KKEKKDKK-------EKSHKHKDK 122
K+++KDKK EK +K K
Sbjct: 648 KEKRKDKKLKNVIINEKRNKKAAK 671
Score = 32.7 bits (75), Expect = 0.60
Identities = 22/126 (17%), Positives = 44/126 (34%), Gaps = 8/126 (6%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
+ S K+ + + + + +K + ++ +K
Sbjct: 424 KFGPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQK 483
Query: 65 KDKEKD--------KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+++E++ K+ S + K + S K+ K K S+ K +KK KKEKS
Sbjct: 484 EEEEEELDEENPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEKS 543
Query: 117 HKHKDK 122
D
Sbjct: 544 IDLDDD 549
>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
subunit (TFIIF-alpha). Transcription initiation factor
IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
II-associating protein 74 (RAP74) is the large subunit
of transcription factor IIF (TFIIF), which is essential
for accurate initiation and stimulates elongation by RNA
polymerase II.
Length = 528
Score = 51.1 bits (122), Expect = 1e-06
Identities = 43/182 (23%), Positives = 64/182 (35%), Gaps = 13/182 (7%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K K+ E E+ + EK K KD E + ES ++EK K
Sbjct: 183 MKAAKNGPAAFGDEDEETEGEKGGGGRGKDLKIKDLEGDDEDDGDESDKGGEDGDEEKSK 242
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK---------IVSSSHNSKEPASGSQ 158
+KK K K+ K D D++ + + E S I SS + +P
Sbjct: 243 KKKKKLAKNKKKLDDDKKGKRGGDDDADEYDSDDGDDEGREEDYISDSSASGNDPEERED 302
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
+S P P Q +E E+E + SK K K G K + D+ S
Sbjct: 303 KLSPEIPAKPEIEQDE----DSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDS 358
Query: 219 KE 220
+
Sbjct: 359 GD 360
Score = 48.4 bits (115), Expect = 7e-06
Identities = 47/247 (19%), Positives = 81/247 (32%), Gaps = 25/247 (10%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDK-EKEKEK 64
K + N KK K+ D D E + +
Sbjct: 217 DLEGDDEDDGDESDKGGEDGDEEKSKKKKKKLAKNKKKLDDDKKGKRGGDDDADEYDSDD 276
Query: 65 KDKE-------KDKSAVSSKEKEKDKVSSKEKERK--------------ESKPKESSSEK 103
D E D SA + +E++ S E K E +E K
Sbjct: 277 GDDEGREEDYISDSSASGNDPEEREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEGGLSK 336
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
+ KK KK K +K+ KD D + + S S + + KEP + S+P
Sbjct: 337 KGKKLKKLKGKKNGLDKDDSDSGDDSDDSDIDGEDSVSLVTAK--KQKEPKKEEPVDSNP 394
Query: 164 PPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK-KDKHGDKTNPKEKDAKSKEKE 222
P + + ++K+K K K ++ + KK K ++ K++ + ++
Sbjct: 395 SSPGNSGPARPSPESKDKGKRKAANEVSKSPASVPAKKLKTENAPKSSSGKSTPQTFSGS 454
Query: 223 SHKSSAG 229
S+A
Sbjct: 455 KSSSNAA 461
>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510). This
family consists of several hypothetical bacterial
proteins of around 200 residues in length. The function
of this family is unknown.
Length = 214
Score = 47.4 bits (113), Expect = 5e-06
Identities = 21/87 (24%), Positives = 42/87 (48%), Gaps = 4/87 (4%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
D+ E+E +K D ++ KE+EK+ +S++KE K KE +E+ +E+
Sbjct: 39 SPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEE 98
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKE 136
++ +++ +K E +KE
Sbjct: 99 DEESSD----ENEKETEEKTESNVEKE 121
Score = 44.7 bits (106), Expect = 4e-05
Identities = 21/84 (25%), Positives = 40/84 (47%)
Query: 21 NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE 80
+ D +A S T K+++ + + E +++K D EK+ + +E
Sbjct: 38 SSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEE 97
Query: 81 KDKVSSKEKERKESKPKESSSEKE 104
+D+ SS E E++ + ES+ EKE
Sbjct: 98 EDEESSDENEKETEEKTESNVEKE 121
Score = 44.3 bits (105), Expect = 5e-05
Identities = 22/120 (18%), Positives = 49/120 (40%), Gaps = 6/120 (5%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
S+ S + + K D ++ +E ++E+K+ A +S++KE + KE E
Sbjct: 38 SSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKE------AANSEDKEDKGDAEKEDEES 91
Query: 93 ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
E + +E E + EK+ +++ + + ++ + S S + E
Sbjct: 92 EEENEEEDEESSDENEKETEEKTESNVEKEITNPSWKPVGTEQTGPHAMTFDSGSQDWNE 151
Score = 44.0 bits (104), Expect = 7e-05
Identities = 23/84 (27%), Positives = 38/84 (45%), Gaps = 1/84 (1%)
Query: 31 STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
S S ++ S ++ + + KE+EKE + E DK EKE ++ + +E
Sbjct: 39 SPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSE-DKEDKGDAEKEDEESEEENEE 97
Query: 91 RKESKPKESSSEKEKKKEKKDKKE 114
E E+ E E+K E +KE
Sbjct: 98 EDEESSDENEKETEEKTESNVEKE 121
Score = 36.6 bits (85), Expect = 0.020
Identities = 15/70 (21%), Positives = 33/70 (47%)
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
+ +E+K + E ++ K+++KE ++ +D+ + E +E +E SS
Sbjct: 46 ADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEEDEESSDE 105
Query: 149 NSKEPASGSQ 158
N KE ++
Sbjct: 106 NEKETEEKTE 115
Score = 32.4 bits (74), Expect = 0.49
Identities = 20/86 (23%), Positives = 34/86 (39%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
SS S +A K D NS + + D+E E+E
Sbjct: 36 FPSSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEEN 95
Query: 65 KDKEKDKSAVSSKEKEKDKVSSKEKE 90
++++++ S + KE E+ S+ EKE
Sbjct: 96 EEEDEESSDENEKETEEKTESNVEKE 121
Score = 29.3 bits (66), Expect = 4.3
Identities = 14/71 (19%), Positives = 26/71 (36%), Gaps = 7/71 (9%)
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
SS + + + S+ ++ E ++ KE+ ++E E KE K
Sbjct: 38 SSPSDQAAADEQEAKKSDDQETAEIEEVKEE-------EKEAANSEDKEDKGDAEKEDEE 90
Query: 145 SSSHNSKEPAS 155
S N +E
Sbjct: 91 SEEENEEEDEE 101
>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain. This domain is
found at the N terminus of SMC proteins. The SMC
(structural maintenance of chromosomes) superfamily
proteins have ATP-binding domains at the N- and
C-termini, and two extended coiled-coil domains
separated by a hinge in the middle. The eukaryotic SMC
proteins form two kinds of heterodimers: the SMC1/SMC3
and the SMC2/SMC4 types. These heterodimers constitute
an essential part of higher order complexes, which are
involved in chromatin and DNA dynamics. This family also
includes the RecF and RecN proteins that are involved in
DNA metabolism and recombination.
Length = 1162
Score = 49.2 bits (117), Expect = 5e-06
Identities = 34/231 (14%), Positives = 84/231 (36%), Gaps = 31/231 (13%)
Query: 43 NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
+ + ++ +KE+E + +++K K+ +++++ KE +E K + E
Sbjct: 248 RDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLE 307
Query: 103 --KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
K +EK + EK K +K+ +++K+E +E ++ +I + +E
Sbjct: 308 RRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEE-------- 359
Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKE 220
Q ++ K ++ E+E S+ ++ K ++ K +
Sbjct: 360 ----------EQLEKLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAK 409
Query: 221 KESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
S L ++ K + + E + + K +
Sbjct: 410 LLLELSE-----------QEEDLLKEEKKEELKIVEELEESLETKQGKLTE 449
Score = 38.0 bits (88), Expect = 0.014
Identities = 38/241 (15%), Positives = 87/241 (36%), Gaps = 24/241 (9%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
S++ +K K+R +K E+ + + + K ++ KE+ +K + + + E +
Sbjct: 166 SREKRKKKER-LKKLIEETENLAELIIDLEELK-LQELKLKEQAKKALEYYQLKEKLELE 223
Query: 107 KEK--KDKKEKSHKHKDKDR------ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+E K ++ + E+++ E +Q+ K + +KE +
Sbjct: 224 EENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKK 283
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
L Q+ +K KE+E+ S + ++ + K+ + K
Sbjct: 284 L------------QEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLE-KE 330
Query: 219 KEKESHKSSAGPKCYPEVGGIYILLRSKK-NKTVQERLAEQFKDELFDRLKNEQADILQR 277
+KE + K E+ ++ + EQ ++EL + K E +
Sbjct: 331 LKKEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEKLEQLEEELLAKKKLESERLSSA 390
Query: 278 K 278
Sbjct: 391 A 391
Score = 37.6 bits (87), Expect = 0.019
Identities = 29/192 (15%), Positives = 72/192 (37%), Gaps = 20/192 (10%)
Query: 34 TSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
S + K +K+ KEKE+ ++ +++ K +E E+++ EK +++
Sbjct: 309 RKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEK 368
Query: 94 SKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP 153
+ E +KK E + + +++ ++++EK+ + + S + K
Sbjct: 369 LEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLLLELSEQEEDLLKEEK-- 426
Query: 154 ASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE 213
++ +E E+ E+ K + +K+ K +
Sbjct: 427 ------------------KEELKIVEELEESLETKQGKLTEEKEELEKQALKLLKDKLEL 468
Query: 214 KDAKSKEKESHK 225
K ++ KE+
Sbjct: 469 KKSEDLLKETKL 480
Score = 37.6 bits (87), Expect = 0.021
Identities = 39/229 (17%), Positives = 81/229 (35%), Gaps = 17/229 (7%)
Query: 54 KDRDKEKEKEKKDKEK-DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
K KE+ KK E E+E K +E +++++E +
Sbjct: 198 LQELKLKEQAKKALEYYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESS 257
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
K++ K ++ + K+ K+E+KE K + +E S+L+ +
Sbjct: 258 KQELEKEEEILAQVLKENKEEEKEKKLQEEE-LKLLAKEEEELKSELL-----------K 305
Query: 173 KSPVKTKEKEKEKESSTTHDKHSKH----KHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
K ++EK KES K K K + ++ + + K +E+E
Sbjct: 306 LERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLEKL 365
Query: 229 GPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQR 277
K + + + + ++ + EL + + E +L+
Sbjct: 366 QEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLLLEL 414
Score = 36.5 bits (84), Expect = 0.053
Identities = 37/241 (15%), Positives = 83/241 (34%), Gaps = 34/241 (14%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
KK K +K+ + ++ + ++ ++K K+K +EK R + + +E +
Sbjct: 712 ELKKLKLEKEELLADKVQEAQDKINEELKLLEQKIKEKEEEEEKSRLKKEEEEEEKSELS 771
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
KEK+ +E+ K K E +++ K Q+E + +
Sbjct: 772 LKEKELAEEEEKTEKLKVEEEKEEKLKAQEEELRALEE---------------------- 809
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
K + E+E+ K + + + KE+ K E
Sbjct: 810 -----ELKEEAELLEEEQLLIEQEEKIKEEELEELALEL-------KEEQKLEKLAEEEL 857
Query: 226 SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGD 285
+ E +LL+ ++ + + + + K+E K E + Q+ + +
Sbjct: 858 ERLEEEITKEELLQELLLKEEELEEQKLKDELESKEEKEKEEKKELEEESQKDNLLEEKE 917
Query: 286 I 286
Sbjct: 918 N 918
Score = 30.3 bits (68), Expect = 3.8
Identities = 22/127 (17%), Positives = 46/127 (36%), Gaps = 12/127 (9%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+K+ ++ +KE+E+E+ + E E KE + + +KE+ +
Sbjct: 946 ADEKEKEEDNKEEEEERNKRLLLAKEELGNVNLMAI---AEFEEKEERYNKDELKKERLE 1002
Query: 108 EKKDKKEK---SHKHKDKDRERDKDEKKEQKESKSSSKIVSSS------HNSKEPASGSQ 158
E+K + + + + + +K + +S +P SG
Sbjct: 1003 EEKKELLREIIEETCQRFKEFLELFVSINRGLNKVFFYLELGGSAELRLEDSDDPFSGGI 1062
Query: 159 LISHPPP 165
IS PP
Sbjct: 1063 EISARPP 1069
Score = 29.9 bits (67), Expect = 4.6
Identities = 16/100 (16%), Positives = 40/100 (40%), Gaps = 6/100 (6%)
Query: 43 NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK----- 97
+ + + E+ ++ EK+K + +E+E+ + +
Sbjct: 923 RIAEEAIILLKYESEPEELLLEEADEKEKEEDNKEEEEERNKRLLLAKEELGNVNLMAIA 982
Query: 98 ESSSEKEKKKEKKDKKEK-SHKHKDKDRERDKDEKKEQKE 136
E ++E+ + + KKE+ + K+ RE ++ + KE
Sbjct: 983 EFEEKEERYNKDELKKERLEEEKKELLREIIEETCQRFKE 1022
>gnl|CDD|187557 cd05246, dTDP_GD_SDR_e, dTDP-D-glucose 4,6-dehydratase, extended
(e) SDRs. This subgroup contains dTDP-D-glucose
4,6-dehydratase and related proteins, members of the
extended-SDR family, with the characteristic Rossmann
fold core region, active site tetrad and NAD(P)-binding
motif. dTDP-D-glucose 4,6-dehydratase is closely related
to other sugar epimerases of the SDR family.
dTDP-D-glucose 4,6-dehydratase catalyzes the second of
four steps in the dTDP-L-rhamnose pathway (the
dehydration of dTDP-D-glucose to
dTDP-4-keto-6-deoxy-D-glucose) in the synthesis of
L-rhamnose, a cell wall component of some pathogenic
bacteria. In many gram negative bacteria, L-rhamnose is
an important constituent of lipopolysaccharide
O-antigen. The larger N-terminal portion of
dTDP-D-glucose 4,6-dehydratase forms a Rossmann fold
NAD-binding domain, while the C-terminus binds the sugar
substrate. Extended SDRs are distinct from classical
SDRs. In addition to the Rossmann fold (alpha/beta
folding pattern with a central beta-sheet) core region
typical of all SDRs, extended SDRs have a less conserved
C-terminal extension of approximately 100 amino acids.
Extended SDRs are a diverse collection of proteins, and
include isomerases, epimerases, oxidoreductases, and
lyases; they typically have a TGXXGXXG cofactor binding
motif. SDRs are a functionally diverse family of
oxidoreductases that have a single domain with a
structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 315
Score = 48.3 bits (116), Expect = 6e-06
Identities = 43/176 (24%), Positives = 62/176 (35%), Gaps = 52/176 (29%)
Query: 282 ISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAF---TLNIQATRELLDL 338
+ GDI L D+ F + I +IH AA D I D N+ T LL+
Sbjct: 56 VKGDICDAEL----VDRLFEEEKIDAVIHFAAESHVDRSISDPEPFIRTNVLGTYTLLEA 111
Query: 339 ATRCSQLKAILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFG 398
A + + +H+ST +E Y L + E L+
Sbjct: 112 ARKYGVKR-FVHIST-----------DEVYGDLLDDGEF---------TETSPLAP---- 146
Query: 399 GIYNNSYSFTKAIGESVVEKYL--YKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPG 452
+ YS +KA + +V Y Y LP+ + R SNN YGP
Sbjct: 147 ---TSPYSASKAAADLLVRAYHRTYGLPVVITRC--------------SNN-YGPY 184
>gnl|CDD|235962 PRK07201, PRK07201, short chain dehydrogenase; Provisional.
Length = 657
Score = 48.8 bits (117), Expect = 7e-06
Identities = 39/142 (27%), Positives = 60/142 (42%), Gaps = 31/142 (21%)
Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGI 293
+++L+R + RL DR V + GD+++P LG+
Sbjct: 24 RREATVHVLVR----RQSLSRLEALAAYWGADR------------VVPLVGDLTEPGLGL 67
Query: 294 SSHDQQFIQHHIHVIIHAAA--SLRFDELIQDAFTLNIQATRELLDLATRCSQLKAIL-- 349
S D + I ++H AA L DE Q A N+ TR +++LA R L+A
Sbjct: 68 SEADIAELG-DIDHVVHLAAIYDLTADEEAQRA--ANVDGTRNVVELAER---LQAATFH 121
Query: 350 HVST-----LYTHSYREDIQEE 366
HVS+ Y +RED +E
Sbjct: 122 HVSSIAVAGDYEGVFREDDFDE 143
>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
splicing factor. These splicing factors consist of an
N-terminal arginine-rich low complexity domain followed
by three tandem RNA recognition motifs (pfam00076). The
well-characterized members of this family are auxiliary
components of the U2 small nuclear ribonucleoprotein
splicing factor (U2AF). These proteins are closely
related to the CC1-like subfamily of splicing factors
(TIGR01622). Members of this subfamily are found in
plants, metazoa and fungi.
Length = 509
Score = 48.4 bits (115), Expect = 8e-06
Identities = 19/122 (15%), Positives = 57/122 (46%), Gaps = 11/122 (9%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER--------KESKPKESSSEK 103
+D++ D+E+EK + +++D+S+ + + +D+ +++ R ++S+P++
Sbjct: 1 RDEEPDREREK-SRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYD 59
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
+ + + + +D+ R R + + ++ + + S S+ ++ L
Sbjct: 60 SRSP-RSLRYSSVRRSRDRPRRRSRSVRSIEQH-RRRLRDRSPSNQWRKDDKKRSLWDIK 117
Query: 164 PP 165
PP
Sbjct: 118 PP 119
Score = 41.4 bits (97), Expect = 0.001
Identities = 19/132 (14%), Positives = 48/132 (36%), Gaps = 12/132 (9%)
Query: 81 KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
+D+ +E+E+ + ++ SSE+ +++ + + + + ++R +D + + S
Sbjct: 1 RDEEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDS 60
Query: 141 SKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHK 200
S S S + V++ E+ + + + + K
Sbjct: 61 RSP-RSLRYSSVRRSRD----------RPRRRSRSVRSIEQHRRRLRDRS-PSNQWRKDD 108
Query: 201 KKDKHGDKTNPK 212
KK D P
Sbjct: 109 KKRSLWDIKPPG 120
Score = 31.0 bits (70), Expect = 2.0
Identities = 22/167 (13%), Positives = 57/167 (34%), Gaps = 16/167 (9%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDR-DKEKEKEKKDKEKDKSAVSSKE 78
+ D++ S+ P S + +D+ R +E+ + + +D+ S+
Sbjct: 3 EEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRS 62
Query: 79 KEKDKVS----SKEKERKESKP-----------KESSSEKEKKKEKKDKKEKSHKHKDKD 123
+ S S+++ R+ S+ ++ S + +K+ K + K +
Sbjct: 63 PRSLRYSSVRRSRDRPRRRSRSVRSIEQHRRRLRDRSPSNQWRKDDKKRSLWDIKPPGYE 122
Query: 124 RERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTP 170
K Q S + + + ++ + +I+ P
Sbjct: 123 LVTADQAKASQVFSVPGTAPRPAMTDPEKLLAEGSIITPLPVLPYQQ 169
>gnl|CDD|235401 PRK05306, infB, translation initiation factor IF-2; Validated.
Length = 746
Score = 47.9 bits (115), Expect = 1e-05
Identities = 19/159 (11%), Positives = 59/159 (37%), Gaps = 1/159 (0%)
Query: 47 SKKDKKDKDRDKEKEKEKK-DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+K+ EK KE + + S V +E K++ + +E +++ +E+++ + +
Sbjct: 10 AKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAE 69
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
++ K + + + + + E +++ +++ K + + P
Sbjct: 70 EEAKAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKKGPKPK 129
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
+ + + K + + + K KK+
Sbjct: 130 KKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKP 168
Score = 47.5 bits (114), Expect = 2e-05
Identities = 26/178 (14%), Positives = 56/178 (31%), Gaps = 13/178 (7%)
Query: 53 DKDRDKEKEKEKKDKEKD-----KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K R E KE K+ K + V +E ++E+K + K + +
Sbjct: 2 SKVRVYELAKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAE 61
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
E + + + +E E + ++ + ++
Sbjct: 62 EAAAAEAEEEAKAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAE--------AAARR 113
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
P + + K K K+K+ + K K + + + + K K+K + K
Sbjct: 114 PKAKKAAKKKKGPKPKKKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKPTEK 171
Score = 44.8 bits (107), Expect = 1e-04
Identities = 23/166 (13%), Positives = 63/166 (37%), Gaps = 1/166 (0%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSS-KEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ + EK K+ + KS S+ +E+E K +K + +E+K + + + +
Sbjct: 10 AKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAE 69
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
E+ + + ++ E + + ++ + + + + A + P P
Sbjct: 70 EEAKAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKKGPKPK 129
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE 213
++ + ++ K + + + KKK + + P+E
Sbjct: 130 KKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKPTEKIPRE 175
Score = 38.7 bits (91), Expect = 0.008
Identities = 30/236 (12%), Positives = 70/236 (29%), Gaps = 45/236 (19%)
Query: 49 KDKKDKDRDKEKEKEKKD-KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ + KE K+ EK K + V +E ++E+K + K + +
Sbjct: 2 SKVRVYELAKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAE 61
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
E + + + +E E + + PA
Sbjct: 62 EAAAAEAEEEAKAEAAAAAPAEEAAEAAAAAEA----------------------AARPA 99
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+ + K K+++ K K KKK ++ K + +
Sbjct: 100 EDEAARPAEAAARRPKAKKAAK---KKKGPKPKKKKPKRKAARGGKRGKGGKGRRRRRGR 156
Query: 228 AGPKCYPEVGGIYILLRSKKNKTVQERLAEQFK-------DELFDRLKNEQADILQ 276
+ + KK + E++ + EL +++ + A++++
Sbjct: 157 RRRR------------KKKKKQKPTEKIPREVVIPETITVAELAEKMAVKAAEVIK 200
Score = 38.7 bits (91), Expect = 0.009
Identities = 25/162 (15%), Positives = 50/162 (30%), Gaps = 20/162 (12%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
K ++E KE+ +E ++ A + E+ + +E + + + + E
Sbjct: 30 EVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAEEEAKAEAAAAAPAEEAAEAA 89
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
+ + + E K +K +K P P
Sbjct: 90 AAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKKG--------------------PKPK 129
Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDK 208
P +K+ K + K + + K KKK K +K
Sbjct: 130 KKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKPTEK 171
>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
This is a family of fungal proteins of unknown function.
Length = 182
Score = 44.7 bits (106), Expect = 3e-05
Identities = 25/65 (38%), Positives = 38/65 (58%), Gaps = 2/65 (3%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
KK K+ +E EK KK+ E+ + K+K K K K+K+ K+ K+ SEK+ +KE +
Sbjct: 62 KKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKK-KDKDKD-KKDDKKDDKSEKKDEKEAE 119
Query: 111 DKKEK 115
DK E
Sbjct: 120 DKLED 124
Score = 43.1 bits (102), Expect = 1e-04
Identities = 23/77 (29%), Positives = 37/77 (48%), Gaps = 3/77 (3%)
Query: 45 SSSKKDKKDKDR---DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
KK+ ++K + K+K K+KKDK+KDK +K + K + +++ E K S
Sbjct: 72 EKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSE 131
Query: 102 EKEKKKEKKDKKEKSHK 118
E K +K HK
Sbjct: 132 TLSTLSELKPRKYALHK 148
Score = 42.0 bits (99), Expect = 2e-04
Identities = 23/85 (27%), Positives = 41/85 (48%), Gaps = 6/85 (7%)
Query: 70 DKSAVSSKEKEK---DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
D +K+K+K +++ +KE +E + + +K KKK+ KDK +K K DK
Sbjct: 54 DAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKS--- 110
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSK 151
+K ++KE ++ S S
Sbjct: 111 EKKDEKEAEDKLEDLTKSYSETLST 135
Score = 41.2 bits (97), Expect = 5e-04
Identities = 24/86 (27%), Positives = 44/86 (51%), Gaps = 9/86 (10%)
Query: 58 KEKEKEKKDKE-KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+ E +KK KE ++ KE E+ + K K +K+ K+ +K+KK +KKD K
Sbjct: 56 EYTEAKKKKKELAEEIEKVKKEYEEKQ---KWKWKKKKSKKKKDKDKDKKDDKKDDKS-- 110
Query: 117 HKHKDKDRERDKDEKKEQKESKSSSK 142
+ KD + +D+ ++ +S S +
Sbjct: 111 ---EKKDEKEAEDKLEDLTKSYSETL 133
Score = 33.9 bits (78), Expect = 0.12
Identities = 25/95 (26%), Positives = 42/95 (44%), Gaps = 8/95 (8%)
Query: 84 VSSKEKERKESKPKESSSEKEKKK---EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
+ E + K KE + E EK K E+K K + K K +++DKD+K ++K+ KS
Sbjct: 52 IYDAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSE 111
Query: 141 SKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSP 175
K + + E + S T ++ P
Sbjct: 112 KKDEKEAEDKLEDLTKS-----YSETLSTLSELKP 141
Score = 31.6 bits (72), Expect = 0.64
Identities = 13/44 (29%), Positives = 20/44 (45%), Gaps = 1/44 (2%)
Query: 43 NSSSSKKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVS 85
K K DK DK EK+ EK+ ++K + S + +S
Sbjct: 94 KKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLS 137
>gnl|CDD|240274 PTZ00112, PTZ00112, origin recognition complex 1 protein;
Provisional.
Length = 1164
Score = 46.9 bits (111), Expect = 3e-05
Identities = 35/146 (23%), Positives = 57/146 (39%), Gaps = 24/146 (16%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD----- 57
+++ SSSSSS + + + S+ TS + + SS K K+
Sbjct: 120 HNLDSSSSSSISSSL-------TNISFFSSPTSIYSCLSNSLSSKHSPKVIKENQSTHVN 172
Query: 58 -----KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
+ KE +K+ K + DK+ + K + KEK KEK
Sbjct: 173 ISSDNSPRNKEISNKQLKKQTNVTHTTCYDKMRRSPRNTSTIKNNTNDKNKEKNKEKD-- 230
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESK 138
K+ KDR+ DK K+ ++SK
Sbjct: 231 -----KNIKKDRDGDKQTKRNSEKSK 251
Score = 46.1 bits (109), Expect = 5e-05
Identities = 48/232 (20%), Positives = 88/232 (37%), Gaps = 20/232 (8%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
YS S+S SS SP K+ S+ + SS ++P N S K K +
Sbjct: 147 YSCLSNSLSSKH--SPKVIKENQSTHV----NISSDNSPRNKEISNKQLKKQTNVTHTTC 200
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
K + ++ + K DK K KE+ ++ K+ +K+ K+ +K + + H D
Sbjct: 201 YDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNIKKDRDGDKQTKR-NSEKSKVQNSHFDV 259
Query: 123 ------DRERDKDEKKEQKESKSSSKIVSSSHN-SKEPASGSQLISHPPPPAPTPTQKSP 175
+E KDEK +SS + S K+ S P
Sbjct: 260 RILRSYTKENKKDEKNVVSGIRSSVLLKRKSQCLRKDSYVYSNHQKKAKTGDPKNIIHRN 319
Query: 176 VKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+ + SS+ H ++ ++ + ++P +K +K + K++
Sbjct: 320 NGSSNSNNDDTSSSNHLGSNRISNR------NPSSPYKKQTTTKHTNNTKNN 365
Score = 46.1 bits (109), Expect = 5e-05
Identities = 56/269 (20%), Positives = 108/269 (40%), Gaps = 32/269 (11%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE--- 63
SS +S +K K ++ +T +P N+S+ K + DK+++K KEK+
Sbjct: 174 SSDNSPRNKEISNKQLKKQTNVTHTTCYDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNI 233
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK-----------KKEKKDK 112
KKD++ DK + EK K + S + S KE+ +++ K++ +
Sbjct: 234 KKDRDGDKQTKRNSEKSKVQNSHFDVRILRSYTKENKKDEKNVVSGIRSSVLLKRKSQCL 293
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
++ S+ + + ++ + K + S ++ S GS IS+ P +P
Sbjct: 294 RKDSYVYSNHQKKAKTGDPKNIIHRNNGSSNSNNDDTSSSNHLGSNRISNRNPSSP---- 349
Query: 173 KSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKC 232
+K+++T H ++K+ K K K N + + K +SS P
Sbjct: 350 ----------YKKQTTTKHTNNTKNNKYNKTKTTQKFNHPLRHHATINK---RSSMLP-M 395
Query: 233 YPEVGGIYILLRSKKNKTVQERLAEQFKD 261
+ G + E +A+ KD
Sbjct: 396 SEQKGRGASEKSEYIKEFTMEEVAKLTKD 424
Score = 30.7 bits (69), Expect = 2.5
Identities = 34/177 (19%), Positives = 63/177 (35%), Gaps = 20/177 (11%)
Query: 55 DRDKEKEKEKKDKEKDKSAVSSKEKEK---DKVSSKEKERKESKPKESSSEKEKKKEKKD 111
D ++ + K+ + + + + +KEK D SS + SS +
Sbjct: 93 DLNERSKTPIKNNDNVTTPIKANKKEKHNLDSSSSSSISSSLTNISFFSSPTSIYSCLSN 152
Query: 112 KKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPT 171
H K KE++S+ +SS ++ + ++ + T
Sbjct: 153 SLSSKHS------------PKVIKENQSTHVNISSDNSPRNKEISNKQLKKQTNVTHTTC 200
Query: 172 QKSPVKTKEKEKEKESSTTHDKHSKHKHK----KKDKHGDKTNPKEKDAKSKEKESH 224
++ +++T K+K K KKD+ GDK K KSK + SH
Sbjct: 201 YDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNIKKDRDGDKQT-KRNSEKSKVQNSH 256
>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
Length = 413
Score = 44.7 bits (106), Expect = 9e-05
Identities = 30/132 (22%), Positives = 51/132 (38%), Gaps = 8/132 (6%)
Query: 12 SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
S + K+K + S I S S + S + K+K+++K ++ K K
Sbjct: 10 SFFSGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKK 69
Query: 72 SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
S K+K++ K E E K + K+ K K+K K K + + K
Sbjct: 70 SEKKKKKKKEKKEPKSEGETKL--------GFKTPKKSKKTKKKPPKPKPNEDVDNAFNK 121
Query: 132 KEQKESKSSSKI 143
+ KS+ I
Sbjct: 122 IAELAEKSNVYI 133
Score = 41.6 bits (98), Expect = 0.001
Identities = 31/121 (25%), Positives = 55/121 (45%), Gaps = 8/121 (6%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPST--STSSSTSNPTNSSSSKKDKK------DKDRD 57
S ++ + P + + ST S N ++S+KKDKK K +
Sbjct: 11 FFSGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKS 70
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
++K+K+KK+K++ KS +K K SK+ ++K KPK + + + EKS+
Sbjct: 71 EKKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSN 130
Query: 118 K 118
Sbjct: 131 V 131
Score = 40.9 bits (96), Expect = 0.001
Identities = 27/122 (22%), Positives = 45/122 (36%), Gaps = 3/122 (2%)
Query: 26 SSAIPSTSTSSSTSNP-TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV 84
S S S +N K+ ++E + +KDK + E +K
Sbjct: 12 FSGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSE 71
Query: 85 SSKE-KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE-RDKDEKKEQKESKSSSK 142
K+ K+ K+ E ++ K KK KK K K K E D K + ++ S+
Sbjct: 72 KKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSNV 131
Query: 143 IV 144
+
Sbjct: 132 YI 133
Score = 39.7 bits (93), Expect = 0.003
Identities = 29/119 (24%), Positives = 44/119 (36%), Gaps = 6/119 (5%)
Query: 68 EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
+K K S V SKE S+ + + KK+KK+ K K K + +++
Sbjct: 17 QKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKK 76
Query: 128 KDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
K EKKE K + + SK+ PP P P + + EK
Sbjct: 77 KKEKKEPKSEGETKLGFKTPKKSKKTK------KKPPKPKPNEDVDNAFNKIAELAEKS 129
Score = 33.2 bits (76), Expect = 0.38
Identities = 18/111 (16%), Positives = 36/111 (32%), Gaps = 7/111 (6%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
YS S + +++ + S KK KK + ++ + E
Sbjct: 28 YSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEG 87
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
E K K K+ K K + K ++ +++ K + +K
Sbjct: 88 ETKLGFK-------TPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSNV 131
Score = 33.2 bits (76), Expect = 0.43
Identities = 26/132 (19%), Positives = 49/132 (37%), Gaps = 8/132 (6%)
Query: 31 STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
S +T S P + S K+ +E+ K S K+++K+ S K+ E
Sbjct: 13 SGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEEN-KVATTSTKKDKKEDKNNESKKKSE 71
Query: 91 RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNS 150
+K+ K K++KKE K + E K + + +K + + +
Sbjct: 72 KKKKK-------KKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAE 124
Query: 151 KEPASGSQLISH 162
S + +H
Sbjct: 125 LAEKSNVYIGAH 136
Score = 32.4 bits (74), Expect = 0.71
Identities = 16/65 (24%), Positives = 25/65 (38%)
Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
S K K+++K ES +K K K +KK+ + K K++ K
Sbjct: 46 ENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKK 105
Query: 227 SAGPK 231
PK
Sbjct: 106 PPKPK 110
>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein. This family
consists of several Borrelia P83/P100 antigen proteins.
Length = 489
Score = 44.6 bits (105), Expect = 1e-04
Identities = 27/95 (28%), Positives = 46/95 (48%), Gaps = 5/95 (5%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+D DK RD+ ++K+++ K K A +S KE +V+ +K E E KK
Sbjct: 239 AQDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEI-----KKN 293
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
+++ K K HK D +E EK+ + + + K
Sbjct: 294 DEEALKAKDHKAFDLKQESKASEKEAEDKELEAQK 328
Score = 41.5 bits (97), Expect = 0.001
Identities = 31/104 (29%), Positives = 58/104 (55%), Gaps = 9/104 (8%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-KPKESSSEKEKK 106
+ + ++ DK++ K ++K A + +K++D+V K++E K KP ++SS KE K
Sbjct: 214 RAQQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDK 273
Query: 107 K---EKKDKKEKSH---KHKDKDRERDKDEKKE--QKESKSSSK 142
+ +K + EK+ K D++ + KD K ++ESK+S K
Sbjct: 274 QVAENQKREIEKAQIEIKKNDEEALKAKDHKAFDLKQESKASEK 317
Score = 41.1 bits (96), Expect = 0.001
Identities = 24/111 (21%), Positives = 49/111 (44%), Gaps = 3/111 (2%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
+SS K+ K ++++E EK E K+ + + + K ++E K S+ + E E
Sbjct: 266 TSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDHKAFDLKQESKASEKEAEDKELE 325
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS---KIVSSSHNSKE 152
+K+++ E K K + + ++ +S + K+V N E
Sbjct: 326 AQKKREPVAEDLQKTKPQVEAQPTSLNEDAIDSSNPVYGLKVVDPITNLSE 376
Score = 39.2 bits (91), Expect = 0.006
Identities = 30/182 (16%), Positives = 71/182 (39%), Gaps = 4/182 (2%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
K + +E +K + KE+E + + + ++ KE K+ + +++
Sbjct: 180 KKVVEALREDNEKGVNFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQKADFA 239
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS--SHNSKEPASGSQLISHPPPPAPTP 170
++ + K +D+ R++ ++ K K + +SS + N K +Q+
Sbjct: 240 QDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALK 299
Query: 171 TQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGD--KTNPKEKDAKSKEKESHKSSA 228
+ ++E + DK + + K++ D KT P+ + + E S+
Sbjct: 300 AKDHKAFDLKQESKASEKEAEDKELEAQKKREPVAEDLQKTKPQVEAQPTSLNEDAIDSS 359
Query: 229 GP 230
P
Sbjct: 360 NP 361
Score = 39.2 bits (91), Expect = 0.006
Identities = 33/191 (17%), Positives = 68/191 (35%), Gaps = 39/191 (20%)
Query: 43 NSSSSKKDKKDKDRDKEKEKEKKDK--EKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
N S D + E +E +K + KE+E S+E ++ + KE
Sbjct: 168 NVSDVDTDSISDKKVVEALREDNEKGVNFRRDMTDLKERE-----SQEDAKRAQQLKEEL 222
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
+K+ +K +++ + D++RD+ +K+Q+
Sbjct: 223 DKKQIDADKA-QQKADFAQDNADKQRDEVRQKQQEAKNL--------------------- 260
Query: 161 SHPPPPAPTPTQKSPVKTK---EKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAK 217
P P T + +++E E + K + + K H K ++++K
Sbjct: 261 -----PKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDH--KAFDLKQESK 313
Query: 218 SKEKESHKSSA 228
+ EKE+
Sbjct: 314 ASEKEAEDKEL 324
>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
RPA34.5. This is a family of proteins conserved from
yeasts to human. Subunit A34.5 of RNA polymerase I is a
non-essential subunit which is thought to help Pol I
overcome topological constraints imposed on ribosomal
DNA during the process of transcription.
Length = 193
Score = 42.8 bits (101), Expect = 1e-04
Identities = 23/61 (37%), Positives = 33/61 (54%), Gaps = 2/61 (3%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
S KE E E +E K K+ +KE KKEKK+KK+K K + + K +KK++
Sbjct: 135 SEKETTAKVEKEAEVEEEEKKEKKK--KKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKKK 192
Query: 135 K 135
K
Sbjct: 193 K 193
Score = 40.9 bits (96), Expect = 6e-04
Identities = 20/65 (30%), Positives = 34/65 (52%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
+ + + EKE KE+ E+E+KKEKK KKE + K+K +++K + + +
Sbjct: 126 SELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKK 185
Query: 138 KSSSK 142
K K
Sbjct: 186 KKKKK 190
Score = 40.9 bits (96), Expect = 7e-04
Identities = 27/72 (37%), Positives = 40/72 (55%), Gaps = 7/72 (9%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S S+ +K+ EKE E +++EK KEK+K K KEK+ K+ K ++ K
Sbjct: 129 GSESETSEKETTAKVEKEAEVEEEEK-------KEKKKKKEVKKEKKEKKDKKEKMVEPK 181
Query: 104 EKKKEKKDKKEK 115
KK+KK KK+K
Sbjct: 182 GSKKKKKKKKKK 193
Score = 38.5 bits (90), Expect = 0.004
Identities = 22/86 (25%), Positives = 43/86 (50%), Gaps = 10/86 (11%)
Query: 25 DSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV 84
+ S S S + ++ K +K+ + ++E++KEKK K++ K K+ +K+K+
Sbjct: 118 YGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKM 177
Query: 85 SSKEKERKESKPKESSSEKEKKKEKK 110
+ +K K+KKK+KK
Sbjct: 178 VEPKGSKK----------KKKKKKKK 193
Score = 34.7 bits (80), Expect = 0.070
Identities = 22/89 (24%), Positives = 43/89 (48%), Gaps = 10/89 (11%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
PT + + + E +++ + +K A +E++K+K K+KE K
Sbjct: 115 PTGYGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEK--KKKKEVK-------- 164
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
EK++KK+KK+K + K K +++ K
Sbjct: 165 KEKKEKKDKKEKMVEPKGSKKKKKKKKKK 193
Score = 34.7 bits (80), Expect = 0.075
Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 1/75 (1%)
Query: 68 EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
D + + + ++ E +KEKKK+K+ KKEK K KDK +
Sbjct: 120 APDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEK-KDKKEKMV 178
Query: 128 KDEKKEQKESKSSSK 142
+ + ++K+ K K
Sbjct: 179 EPKGSKKKKKKKKKK 193
Score = 32.8 bits (75), Expect = 0.29
Identities = 18/73 (24%), Positives = 31/73 (42%)
Query: 11 SSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKD 70
+ P + + ++ T+ K+ KK K+ KEK+++K KEK
Sbjct: 118 YGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKM 177
Query: 71 KSAVSSKEKEKDK 83
SK+K+K K
Sbjct: 178 VEPKGSKKKKKKK 190
Score = 31.2 bits (71), Expect = 1.0
Identities = 19/69 (27%), Positives = 32/69 (46%), Gaps = 3/69 (4%)
Query: 87 KEKERKESKPKESSSEKEKKKEK--K-DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
S+ + S E K EK + +++EK K K K+ +++K EKK++KE K
Sbjct: 123 GPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKG 182
Query: 144 VSSSHNSKE 152
K+
Sbjct: 183 SKKKKKKKK 191
Score = 30.1 bits (68), Expect = 2.6
Identities = 16/68 (23%), Positives = 36/68 (52%), Gaps = 1/68 (1%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+ + ++ S+ + ++ + E E ++E+K +K+K K K+++ KD+K++
Sbjct: 119 GAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKK-KKEVKKEKKEKKDKKEKM 177
Query: 135 KESKSSSK 142
E K S K
Sbjct: 178 VEPKGSKK 185
Score = 29.7 bits (67), Expect = 3.1
Identities = 14/63 (22%), Positives = 28/63 (44%)
Query: 165 PPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESH 224
PP+ ++ + + K ++ + ++ K K KKK+ +K K+K K E +
Sbjct: 124 PPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGS 183
Query: 225 KSS 227
K
Sbjct: 184 KKK 186
Score = 28.1 bits (63), Expect = 9.8
Identities = 15/65 (23%), Positives = 25/65 (38%)
Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
AP E +++ ++ + + +KK+K K KEK K +KE
Sbjct: 120 APDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVE 179
Query: 227 SAGPK 231
G K
Sbjct: 180 PKGSK 184
>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
Length = 509
Score = 44.2 bits (105), Expect = 1e-04
Identities = 26/171 (15%), Positives = 52/171 (30%), Gaps = 14/171 (8%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIP--STSTSSSTSNPTNSSSSKKDKKDKDRDKE 59
+ + A S K ++ + S + + KK K
Sbjct: 29 SKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATESDIPKKKTKTAAKAAAA 88
Query: 60 KEKEKK------DKEKDKSAVSSKEKEKDKVSSKEKERKESKPK------ESSSEKEKKK 107
K KK D K ++ +K+ D K+ + + + +
Sbjct: 89 KAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQADDDDDDDDDDDLDDDDID 148
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+ D ++ D D + + +EKKE KE + S + + + Q
Sbjct: 149 DDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVWDEDDSEALRQ 199
Score = 40.7 bits (96), Expect = 0.002
Identities = 19/124 (15%), Positives = 41/124 (33%), Gaps = 1/124 (0%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
K A + S S K++ ++ + K+K ++ D+ +
Sbjct: 3 TASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGM 62
Query: 80 EKDKVSSKE-KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
KD + E K+ + + K KK K++ K +++ D+ + K
Sbjct: 63 VKDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVK 122
Query: 139 SSSK 142
Sbjct: 123 DIDV 126
Score = 39.2 bits (92), Expect = 0.006
Identities = 17/126 (13%), Positives = 37/126 (29%), Gaps = 5/126 (3%)
Query: 30 PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE-----KDKV 84
ST + K K K + ++E K+ + K + + V
Sbjct: 4 ASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMV 63
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
+ + PK+ + K K +K K + ++ + + K+ +
Sbjct: 64 KDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKD 123
Query: 145 SSSHNS 150
N
Sbjct: 124 IDVLNQ 129
>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein. TolA couples the inner
membrane complex of itself with TolQ and TolR to the
outer membrane complex of TolB and OprL (also called
Pal). Most of the length of the protein consists of
low-complexity sequence that may differ in both length
and composition from one species to another,
complicating efforts to discriminate TolA (the most
divergent gene in the tol-pal system) from paralogs such
as TonB. Selection of members of the seed alignment and
criteria for setting scoring cutoffs are based largely
on conserved operon structure. The Tol-Pal complex is
required for maintaining outer membrane integrity. Also
involved in transport (uptake) of colicins and
filamentous DNA, and implicated in pathogenesis.
Transport is energized by the proton motive force. TolA
is an inner membrane protein that interacts with
periplasmic TolB and with outer membrane porins ompC,
phoE and lamB [Transport and binding proteins, Other,
Cellular processes, Pathogenesis].
Length = 346
Score = 43.3 bits (102), Expect = 2e-04
Identities = 18/91 (19%), Positives = 49/91 (53%), Gaps = 1/91 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ ++ +K R E+ ++K+ +++ + ++K+ E+ ++EK+++ + K E K
Sbjct: 76 QAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKA-KQAAEAKA 134
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ + + EK K + K + ++ + K E+K
Sbjct: 135 KAEAEAEKKAKEEAKKQAEEEAKAKAAAEAK 165
Score = 37.9 bits (88), Expect = 0.013
Identities = 23/93 (24%), Positives = 39/93 (41%), Gaps = 4/93 (4%)
Query: 48 KKDKKDKDRDKEKEKEKKDK-EKDKSAVSSKEKEKDKVSSKEKERKESKPKES---SSEK 103
++ + K+ E+ K EK K A +K K+ + +K + E K KE +E+
Sbjct: 95 EQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEE 154
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E K + + +K K E + K E K
Sbjct: 155 EAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKA 187
Score = 36.0 bits (83), Expect = 0.043
Identities = 21/88 (23%), Positives = 36/88 (40%), Gaps = 3/88 (3%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
K + E E EKK KE+ A E+E ++ E ++K ++ K+ + + K K
Sbjct: 127 KQAAEAKAKAEAEAEKKAKEE---AKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKA 183
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKE 136
+ K K+ + K K E
Sbjct: 184 EAKAKAKAEEAKAKAEAAKAKAAAEAAA 211
Score = 35.2 bits (81), Expect = 0.087
Identities = 22/135 (16%), Positives = 56/135 (41%), Gaps = 11/135 (8%)
Query: 8 SSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDK 67
S S P P + + + + +N KK+++R K+ E++ ++
Sbjct: 21 GSLYHSVKPEPGGGGEIIQAVLVDPGAVAQQANRIQQQKKPAAKKEQERQKKLEQQAEEA 80
Query: 68 EKDKSA-------VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
EK ++A + + + E+ K+++ K+ +E+ K K+ + K
Sbjct: 81 EKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKAKQAAEAK----AKA 136
Query: 121 DKDRERDKDEKKEQK 135
+ + E+ E+ +++
Sbjct: 137 EAEAEKKAKEEAKKQ 151
Score = 32.5 bits (74), Expect = 0.61
Identities = 22/90 (24%), Positives = 40/90 (44%), Gaps = 1/90 (1%)
Query: 48 KKDKKDKDRD-KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
K K + + + K KE+ KK E++ A ++ E +K +K+K E+K K + K K
Sbjct: 132 AKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKAKAKA 191
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
+E K K E + + + +
Sbjct: 192 EEAKAKAEAAKAKAAAEAAAKAEAEAAAAA 221
Score = 32.1 bits (73), Expect = 0.83
Identities = 24/88 (27%), Positives = 47/88 (53%), Gaps = 1/88 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K ++K + E+ K K+ E K+ ++ ++K K +K++ +E+K K ++ K+K
Sbjct: 111 AKQAEEKQKQAEEAKAKQAAEA-KAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAA 169
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
E K K E K K + + + K E+ + K
Sbjct: 170 EAKKKAEAEAKAKAEAKAKAKAEEAKAK 197
Score = 30.6 bits (69), Expect = 2.7
Identities = 19/88 (21%), Positives = 35/88 (39%), Gaps = 3/88 (3%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
KK K++ + E+E + K + K + K K ++ K + E+K K + E + K
Sbjct: 142 KKAKEEAKKQAEEEAKAKAAAEAKKK---AAEAKKKAEAEAKAKAEAKAKAKAEEAKAKA 198
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
E K + + E E +
Sbjct: 199 EAAKAKAAAEAAAKAEAEAAAAAAAEAE 226
Score = 29.4 bits (66), Expect = 5.6
Identities = 17/87 (19%), Positives = 38/87 (43%), Gaps = 4/87 (4%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK-ERKESKPKESSSEKEKKKEK 109
K+ ++ K K + K+ ++ ++ + K K +K K + +E+K K +E K K
Sbjct: 150 KQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKAKAKAEEAKAK---AEAAKAKAA 206
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKE 136
+ K+ + + K ++ E
Sbjct: 207 AEAAAKAEAEAAAAAAAEAERKADEAE 233
Score = 29.4 bits (66), Expect = 6.0
Identities = 12/61 (19%), Positives = 24/61 (39%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K K + + + E + K K ++ A + K K + K E+ ++ + K
Sbjct: 171 AKKKAEAEAKAKAEAKAKAKAEEAKAKAEAAKAKAAAEAAAKAEAEAAAAAAAEAERKAD 230
Query: 108 E 108
E
Sbjct: 231 E 231
>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein. This family consists of
AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
retardation syndrome) nuclear proteins. These proteins
have been linked to human diseases such as acute
lymphoblastic leukaemia and mental retardation. The
family also contains a Drosophila AF4 protein homologue
Lilliputian which contains an AT-hook domain.
Lilliputian represents a novel pair-rule gene that acts
in cytoskeleton regulation, segmentation and
morphogenesis in Drosophila.
Length = 1154
Score = 43.8 bits (103), Expect = 3e-04
Identities = 51/252 (20%), Positives = 82/252 (32%), Gaps = 34/252 (13%)
Query: 4 SVKSSSSSSSAHPS-------PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDR 56
S S S + P +K+ S A T S+ + + SK R
Sbjct: 625 SSSSDSPEDESLPPSSQSPGNTESSKE--SCASLRTPVCRSSVG-SQNDLSKDRLLSPMR 681
Query: 57 DKEKEKEKKDKEKDK-----------SAVSSKEKEKDKVSSKEKERKESKP-KESSSEKE 104
+ E +D E+ S + +K ++ S P K++S
Sbjct: 682 ETELLSPLRDSEERYSLWVKIDLDLLSRIPGHPYKKGVPPKPAEKDSLSAPKKQTSKTAS 741
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP- 163
+K K K+ KHK+ + + KK++ E KSSS SSS + +S +
Sbjct: 742 EKSSSKGKR----KHKNDEEADKIESKKQRLEEKSSSCSPSSSSSHHHSSSNKESRKSSR 797
Query: 164 -------PPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
P P+ + SP K S K K++ K
Sbjct: 798 NKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDTSSSSGPFSASSTKSSSKSSSTSKHR 857
Query: 217 KSKEKESHKSSA 228
K++ K S S
Sbjct: 858 KTEGKGSSTSKE 869
Score = 43.0 bits (101), Expect = 4e-04
Identities = 48/188 (25%), Positives = 70/188 (37%), Gaps = 30/188 (15%)
Query: 12 SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
P K +KDS + P TS + S ++S +K K D++ DK + K+++ +EK
Sbjct: 714 PYKKGVPPKPAEKDSLSAPKKQTSKTASEKSSSKGKRKHKNDEEADKIESKKQRLEEKSS 773
Query: 72 S-----AVSSKEKEKDKVSSKEKERKESK----PKESSSEKEKKKEKKDKKEK------- 115
S + S +K S K KE + P S K E +K
Sbjct: 774 SCSPSSSSSHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDTS 833
Query: 116 ---------SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
S K K K K E K S +S + SS ++ AS S P PP
Sbjct: 834 SSSGPFSASSTKSSSKSSSTSKHRKTEGKGSSTSKEHKGSSGDTPNKAS-----SFPVPP 888
Query: 167 APTPTQKS 174
+ K
Sbjct: 889 LSNGSSKP 896
Score = 38.7 bits (90), Expect = 0.010
Identities = 50/248 (20%), Positives = 82/248 (33%), Gaps = 27/248 (10%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKD-KKDKDRDKEKEKEKK 65
SSS S + K +++ +S ++ + SSSS + D E E
Sbjct: 363 SSSEDSDEEQATEKPPSRNTPPSAPSSNPEPAASSSGSSSSSSGSESSSGSDSESESSSS 422
Query: 66 DKEKDK-SAVSSKEKEK------------DKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
D E+++ +S E E +KV+ + ES ++ +KE K K
Sbjct: 423 DSEENEPPRTASPEPEPPSTNKWQLDNWLNKVNPHKVSPAESVSSNPPIKQPMEKEGKVK 482
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS-------SSHNSKEPASGSQLISHPP- 164
S H + K KE++ +++ K S S+ P + P
Sbjct: 483 SSGSQYHPESKEPPPKSSSKEKRRPRTAQKGPESGRGKQKSPAQSEAPPQRRTVGKKQPK 542
Query: 165 -PPAPTPTQKSPVKTKEKE----KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
P + + E E S T K K K + PK +
Sbjct: 543 KPEKASAGDERTGLRPESEPGTLPYGSSVQTPPDRPKAATKGSRKPSPRKEPKSSVPPAA 602
Query: 220 EKESHKSS 227
EK +KS
Sbjct: 603 EKRKYKSP 610
Score = 35.7 bits (82), Expect = 0.082
Identities = 28/105 (26%), Positives = 43/105 (40%), Gaps = 1/105 (0%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
SS S SS+ H + +K+S +P++ SS K + K +++
Sbjct: 773 SSCSPSSSSSHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDT 832
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
+S K K SS K RK ++ K SS+ KE K D
Sbjct: 833 SSSSGPFSASSTKSSSKSSSTSKHRK-TEGKGSSTSKEHKGSSGD 876
Score = 31.4 bits (71), Expect = 1.7
Identities = 44/182 (24%), Positives = 67/182 (36%), Gaps = 23/182 (12%)
Query: 54 KDRDKEKEKEKKDKEKDKSA-VSSKEKEK---DKVSSKEKERKESKPKESSSEKEKKKEK 109
KK K S SSK K K D+ + K + +K+ ++SSS
Sbjct: 723 PAEKDSLSAPKKQTSKTASEKSSSKGKRKHKNDEEADKIESKKQRLEEKSSSCSPSSSS- 781
Query: 110 KDKKEKSHKHKDKDRE-RDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
SH H ++E R KE++ S S +SSS S +P S+ P
Sbjct: 782 ------SHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSS--SPKPEHPSR--KRPRRQED 831
Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
T + P S++ + KH+K + G T+ KE S + + SS
Sbjct: 832 TSSSSGP-----FSASSTKSSSKSSSTS-KHRKTEGKGSSTS-KEHKGSSGDTPNKASSF 884
Query: 229 GP 230
Sbjct: 885 PV 886
>gnl|CDD|113514 pfam04747, DUF612, Protein of unknown function, DUF612. This
family includes several uncharacterized proteins from
Caenorhabditis elegans.
Length = 517
Score = 42.0 bits (97), Expect = 8e-04
Identities = 43/179 (24%), Positives = 82/179 (45%), Gaps = 1/179 (0%)
Query: 16 PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVS 75
P+ ++ K++ A + S KK +K +D E E++ K+ +
Sbjct: 44 PNSINDQRKEAFASLELTEQPQQVEKVKKSEKKKAQKQIAKDHEAEQKVNAKKAAEKEAR 103
Query: 76 SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD-KKEKSHKHKDKDRERDKDEKKEQ 134
E E K +++E+E K+ K ++ +KE++K++ D KK ++ K K+K + +K EK E+
Sbjct: 104 RAEAEAKKRAAQEEEHKQWKAEQERIQKEQEKKEADLKKLQAEKKKEKAVKAEKAEKAEK 163
Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
+ S+ V K+ A+ P P PT T P + ++ K++ K
Sbjct: 164 TKKASTPAPVEEEIVVKKVANDRSAAPAPEPKTPTNTPAEPAEQVQEITGKKNKKNKKK 222
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 42.4 bits (99), Expect = 8e-04
Identities = 50/193 (25%), Positives = 88/193 (45%), Gaps = 26/193 (13%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKK-DKKDKDRDKEKEKEK 64
K ++S SP N+D+ SS S S +S++SN + + S+KK +KK K+ E
Sbjct: 22 KKQTASPDGRASP-TNEDQRSSGRNSPSAASTSSNDSKAESTKKPNKKIKE---EATSPL 77
Query: 65 KDKEKDKSAVSSKEKEKDKVSSKEKERKE----SKPKESSSEKEKKKEKKDKKEKSHKHK 120
K ++ + +S +E ++V++K+ + +E + P E E E + E D + + +
Sbjct: 78 KSTKRQREKPASDTEEPERVTAKKSKTQELSRPNSPSEGEGEGEGEGESSDSRSVNEEGS 137
Query: 121 DKDRERDKDEKK--------EQKESKSSSKIVSSSHNSKEPAS-----GSQLISHPPPPA 167
++ D+D + + ES S S + P S G+ L PPP
Sbjct: 138 SDPKDIDQDNRSSSPSIPSPQDNESDSDSSAQQQLLQPQGPPSIQVPPGAALAPSAPPPT 197
Query: 168 PT----PTQKSPV 176
P+ P Q SP+
Sbjct: 198 PSAQAVPPQGSPI 210
Score = 36.2 bits (83), Expect = 0.055
Identities = 23/80 (28%), Positives = 45/80 (56%), Gaps = 8/80 (10%)
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
K +K +E + EK K++ ++ +E+ + K+K++ER E++ ++E++ ++K SSSH S+
Sbjct: 576 KLAKKREEAVEKAKREAEQKAREEREREKEKEKER---EREREREAERAAKASSSSHESR 632
Query: 152 E-----PASGSQLISHPPPP 166
S PPP
Sbjct: 633 MSEPQLSGPAHMRPSFEPPP 652
Score = 32.0 bits (72), Expect = 1.0
Identities = 25/85 (29%), Positives = 47/85 (55%), Gaps = 5/85 (5%)
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
SSK +++E +++ E E+K ++ ++EK K K+++RER+++ ++ K S SS +
Sbjct: 574 SSKLAKKREEAVEKAKREAEQKAREEREREKE-KEKEREREREREAERAAKASSSSHE-- 630
Query: 145 SSSHNSKEPASGSQLISHPPPPAPT 169
S S+ SG + P PT
Sbjct: 631 --SRMSEPQLSGPAHMRPSFEPPPT 653
Score = 30.4 bits (68), Expect = 3.4
Identities = 24/63 (38%), Positives = 36/63 (57%), Gaps = 4/63 (6%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK-ESKPKESSSEK 103
+SSK KK R++ EK K++ E+ +EKEK+K +E+ER+ E K SSS
Sbjct: 573 ASSKLAKK---REEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSSH 629
Query: 104 EKK 106
E +
Sbjct: 630 ESR 632
>gnl|CDD|215521 PLN02967, PLN02967, kinase.
Length = 581
Score = 42.0 bits (98), Expect = 8e-04
Identities = 29/110 (26%), Positives = 51/110 (46%), Gaps = 3/110 (2%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSA-VSSKEKEKDKVSSKEKERKESKPKESSSE- 102
S K + K K+ E + ++ S V +++ DK S K R K +SS+
Sbjct: 69 SKKKPTRSVKRATKKTVVEISEPLEEGSELVVNEDAALDKESKKTPRRTRRKAAAASSDV 128
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
+E+K EKK +K + K K + D+ + E + + S + S + S+E
Sbjct: 129 EEEKTEKKVRKRRKVK-KMDEDVEDQGSESEVSDVEESEFVTSLENESEE 177
Score = 35.0 bits (80), Expect = 0.12
Identities = 33/148 (22%), Positives = 57/148 (38%), Gaps = 9/148 (6%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDK-------KDK 54
A S K S+ + P +N S P+ S +T S ++ +D
Sbjct: 46 AGSRKKIESALAVDEEPDEN-GAVSKKKPTRSVKRATKKTVVEISEPLEEGSELVVNEDA 104
Query: 55 DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
DKE +K + + +A SS +E+ K RK K E ++ + E D +E
Sbjct: 105 ALDKESKKTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVSDVEE 164
Query: 115 KSHKHK-DKDRERDKDEKKEQKESKSSS 141
+ + E + D +K+ E S +
Sbjct: 165 SEFVTSLENESEEELDLEKDDGEDISHT 192
Score = 29.2 bits (65), Expect = 6.5
Identities = 23/124 (18%), Positives = 41/124 (33%), Gaps = 26/124 (20%)
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
+++ ES E K K +S K K + E E+
Sbjct: 49 RKKIESALAVDEEPDENGAVSKKKPTRSVKRATKKTVVEISEPLEE-------------- 94
Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK-----EKEKESSTTHDKHSKHKHKKKD 203
GS+L+ + ++K+P +T+ K +E T + K KK D
Sbjct: 95 -------GSELVVNEDAALDKESKKTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMD 147
Query: 204 KHGD 207
+ +
Sbjct: 148 EDVE 151
>gnl|CDD|220102 pfam09073, BUD22, BUD22. BUD22 has been shown in yeast to be a
nuclear protein involved in bud-site selection. It plays
a role in positioning the proximal bud pole signal. More
recently it has been shown to be involved in ribosome
biogenesis.
Length = 424
Score = 41.7 bits (98), Expect = 9e-04
Identities = 28/141 (19%), Positives = 67/141 (47%), Gaps = 18/141 (12%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
+ + +K K K+ KS +K++ K SS + + +ES+ ++ S +E ++ D +E+
Sbjct: 142 IETKAKKGKAKKKTKKS-----KKKEAKESSDKDDEEESESEDESKSEESAEDDSDDEEE 196
Query: 116 SHKHKDKDRERDK-------DEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
+ + D +E+ E+ S + ++ S S + + + S +
Sbjct: 197 EDSDSEDYSQYDGMLVDSSDEEEGEEAPSINYNEDTSESESDESDSEIS------ESRSV 250
Query: 169 TPTQKSPVKTKEKEKEKESST 189
+ +++S +K+ +++K SST
Sbjct: 251 SDSEESSPPSKKPKEKKTSST 271
Score = 29.8 bits (67), Expect = 4.9
Identities = 27/173 (15%), Positives = 52/173 (30%), Gaps = 26/173 (15%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
S N +S+ + + D + E ++ S S K KEK S+
Sbjct: 225 SINYNEDTSESESDESD-SEISESRSVSDSEESSPPSKKPKEKKTSSTFLPSLMGGYFSG 283
Query: 99 SSSEKEKKKEKKDKKEKSHKHKDKDR---------------ERDKDEKKEQKESKSSSKI 143
S E + ++ + K K+R K KKE+++ + +
Sbjct: 284 SEDEDDDDEDIDPDQVVKKPVKRKNRRGQRARQAIWEKKYGSGAKHVKKEREKEQKEREG 343
Query: 144 VSSSHNSKEPA----------SGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
S +++ + S P + K K+ +K
Sbjct: 344 RQSEWEARQAKREGGDAKAGRAAEPTGSRTQQKGDRPKRGEKKKPKKPSVDKP 396
Score = 29.0 bits (65), Expect = 7.5
Identities = 23/116 (19%), Positives = 46/116 (39%), Gaps = 18/116 (15%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSA--------VSSKEKEKDK---------VSS 86
+ SKK + + DK+ E+E + +++ KS +E + V S
Sbjct: 155 TKKSKKKEAKESSDKDDEEESESEDESKSEESAEDDSDDEEEEDSDSEDYSQYDGMLVDS 214
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE-KKEQKESKSSS 141
++E E P + +E + E + + + + + K+ KE K+SS
Sbjct: 215 SDEEEGEEAPSINYNEDTSESESDESDSEISESRSVSDSEESSPPSKKPKEKKTSS 270
>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
Length = 1465
Score = 41.8 bits (98), Expect = 0.001
Identities = 32/218 (14%), Positives = 74/218 (33%), Gaps = 6/218 (2%)
Query: 16 PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS---KKDKKDKDRDKEKEKEKKDKEKDKS 72
P+P K K S + + T S++ T + + K + + ++K++E +
Sbjct: 1205 PAPKKTTKKASESETTEETYGSSAMETENVAEVVKPKGRAGAKKKAPAAAKEKEEEDEIL 1264
Query: 73 AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
+ + + S+ + K + ++ + KK D D + D +
Sbjct: 1265 DLKDRLAAYNLDSAPAQSAKMEETVKAVPARRAAARKK-PLASVSVISDSDDDDDDFAVE 1323
Query: 133 EQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHD 192
+ K + + A+ + PA + + E K E+
Sbjct: 1324 VSLAERLKKKGGRKPAAANKKAAKPPAAAKKRGPATVQSGQ--KLLTEMLKPAEAIGISP 1381
Query: 193 KHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGP 230
+ K + + + + A +KE ES ++ +G
Sbjct: 1382 EKKVRKMRASPFNKKSGSVLGRAATNKETESSENVSGS 1419
Score = 31.8 bits (72), Expect = 1.3
Identities = 34/186 (18%), Positives = 62/186 (33%), Gaps = 13/186 (6%)
Query: 55 DRDK-EKEKEKKDKEKDKSA----VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
DRDK E E K KS + + EKE DK+ ++ + +E++ K + +
Sbjct: 1134 DRDKLNIEVEDLKKTTPKSLWLKDLDALEKELDKLDKEDAKAEEAREKLQRAAARGESGA 1193
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
K + K ++ K + + ++ + N E P
Sbjct: 1194 AKKVSRQAPKKPAPKKTTKKASESETTEETYGSSAMETENVAEVV--------KPKGRAG 1245
Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
+K+P KEKE+E E D+ + + K K ++ + K
Sbjct: 1246 AKKKAPAAAKEKEEEDEILDLKDRLAAYNLDSAPAQSAKMEETVKAVPARRAAARKKPLA 1305
Query: 230 PKCYPE 235
Sbjct: 1306 SVSVIS 1311
Score = 31.4 bits (71), Expect = 1.6
Identities = 33/191 (17%), Positives = 67/191 (35%), Gaps = 5/191 (2%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE 61
AY++ S+ + S+ K A + + ++ + S S D D +
Sbjct: 1272 AYNLDSAPAQSAKMEET----VKAVPARRAAARKKPLASVSVISDSDDDDDDFAVEVSLA 1327
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
+ K K K A ++K+ K ++K K + E K + K
Sbjct: 1328 ERLKKKGGRKPAAANKKAAKPPAAAK-KRGPATVQSGQKLLTEMLKPAEAIGISPEKKVR 1386
Query: 122 KDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK 181
K R ++K ++++ + S + +S S+ P P + + +T
Sbjct: 1387 KMRASPFNKKSGSVLGRAATNKETESSENVSGSSSSEKDEIDVSAKPRPQRANRKQTTYV 1446
Query: 182 EKEKESSTTHD 192
+ ES + D
Sbjct: 1447 LSDSESESADD 1457
>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein. This entry is a highly
conserved protein present in eukaryotes.
Length = 680
Score = 41.4 bits (97), Expect = 0.001
Identities = 43/274 (15%), Positives = 86/274 (31%), Gaps = 13/274 (4%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
SS +S + S S++S T S K +
Sbjct: 220 SSKGLTSTKELVPVQNSGGNH-SLSKSSNSQTPELEYSEKGKDHHHSHNHQHHSIGINNH 278
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKES--KPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
K + + + S+K + S KE++S + S K +R
Sbjct: 279 HSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKETTSNSSSAAAGSIGSKSSKSAKHSNR 338
Query: 125 ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP---PPPAPTPTQKSPVKTKEK 181
+ K + S S S N + S+ S A + V+
Sbjct: 339 NKSNSSPKSHSSANGSVPSSSVSDNESKQKRASKSSSGARDSKKDASGMSANGTVENCIP 398
Query: 182 EKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGI-- 239
E + +T + + K + ++ +++ + + S +S ++G +
Sbjct: 399 ENK---ISTPSAIERLEQDIKKLQAELQQARQNESELRNQISLLTSLERSLKSDLGQLKK 455
Query: 240 -YILLRSKKNKTVQERLAE-QFKDELFDRLKNEQ 271
+L++K N V + + Q + RLK+E
Sbjct: 456 ENDMLQTKLNSMVSAKQKDKQSMQSMEKRLKSEA 489
Score = 36.1 bits (83), Expect = 0.059
Identities = 22/92 (23%), Positives = 40/92 (43%), Gaps = 4/92 (4%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKE----KDKSAVSSKEKEKDKVSSKE 88
+ + + S ++ + + KK+ + K S VS+K+K+K + S E
Sbjct: 423 QQARQNESELRNQISLLTSLERSLKSDLGQLKKENDMLQTKLNSMVSAKQKDKQSMQSME 482
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
K K ++EK+ +EKK KKE+
Sbjct: 483 KRLKSEADSRVNAEKQLAEEKKRKKEEEETAA 514
Score = 29.5 bits (66), Expect = 5.6
Identities = 17/104 (16%), Positives = 36/104 (34%), Gaps = 12/104 (11%)
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
DK++ + + +S+K + NS S S + ++P ++ +
Sbjct: 213 DKEKSEASSKGLTSTKELVPVQNSGGNHSLS----------KSSNSQTPELEYSEKGKDH 262
Query: 187 SSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGP 230
+ H H + H + K + + E S+KS
Sbjct: 263 HHSH--NHQHHSIGINNHHSKHADSKLQTIEVIENHSNKSRPSS 304
Score = 28.7 bits (64), Expect = 9.8
Identities = 30/157 (19%), Positives = 52/157 (33%), Gaps = 11/157 (7%)
Query: 73 AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
+S +KEK + SSK + +S + S+ + +K +
Sbjct: 208 TLSVTDKEKSEASSKGLTSTKELVPVQNS-----GGNHSLSKSSNSQTPELEYSEKGKDH 262
Query: 133 EQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHD 192
+ I ++H+SK S Q I + S KE SS+
Sbjct: 263 HHSHNHQHHSIGINNHHSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKETTSNSSSAAA 322
Query: 193 KHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
K K KH + ++ + +SH S+ G
Sbjct: 323 GSIGSKSSKSAKHSN------RNKSNSSPKSHSSANG 353
>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family. This model
represents a subfamily of RNA splicing factors including
the Pad-1 protein (N. crassa), CAPER (M. musculus) and
CC1.3 (H. sapiens). These proteins are characterized by
an N-terminal arginine-rich, low complexity domain
followed by three (or in the case of 4 H. sapiens
paralogs, two) RNA recognition domains (rrm: pfam00706).
These splicing factors are closely related to the U2AF
splicing factor family (TIGR01642). A homologous gene
from Plasmodium falciparum was identified in the course
of the analysis of that genome at TIGR and was included
in the seed.
Length = 457
Score = 41.4 bits (97), Expect = 0.001
Identities = 14/84 (16%), Positives = 38/84 (45%), Gaps = 3/84 (3%)
Query: 56 RDKEKEKEK-KDKEKDKSAVSSKEKEKDK-VSSKEKER-KESKPKESSSEKEKKKEKKDK 112
RD+E+ + + + DK S+ + + + S + ++R + S + + + +
Sbjct: 3 RDRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYRPR 62
Query: 113 KEKSHKHKDKDRERDKDEKKEQKE 136
++S++ D+ R+ E + E
Sbjct: 63 GDRSYRRDDRRSGRNTKEPLTEAE 86
Score = 38.7 bits (90), Expect = 0.009
Identities = 9/93 (9%), Positives = 36/93 (38%), Gaps = 8/93 (8%)
Query: 45 SSSKKDKKDKDRDKEKEKE-KKDKEKDKSAVSSKEKEKDKVSSKE-KERKESKPKESSSE 102
+ ++ R +K +E + + + + S + +++D + + R S +
Sbjct: 4 DRERGRLRNDTRRSDKGRERSRRRSRSRDR-SRRRRDRDYYRGRRGRSRSRSPNRYYRPR 62
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
++ + D+ + +E + +++ +
Sbjct: 63 GDRSYRRDDR-----RSGRNTKEPLTEAERDDR 90
Score = 34.5 bits (79), Expect = 0.14
Identities = 8/57 (14%), Positives = 26/57 (45%)
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
+E+ R + + S +E+ + + +++S + +D+D R + + + +
Sbjct: 4 DRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYR 60
Score = 32.9 bits (75), Expect = 0.46
Identities = 18/112 (16%), Positives = 41/112 (36%), Gaps = 13/112 (11%)
Query: 19 HKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
+N + S + S +D+ + RD++ + ++ + + +S
Sbjct: 10 LRNDTRRSDK---------GRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYR 60
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
D+ R + + ++ E + E+ D+ + K RERD E
Sbjct: 61 PRGDRSY----RRDDRRSGRNTKEPLTEAERDDRTVFVLQLALKARERDLYE 108
Score = 30.2 bits (68), Expect = 3.5
Identities = 12/110 (10%), Positives = 36/110 (32%), Gaps = 26/110 (23%)
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
+D++ + + R D+ +E+ +S S+ S ++ G
Sbjct: 2 YRDRERGRLR----NDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGR----------- 46
Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK-HGDKTNPKEKDAK 217
+ + + + + + +++ D+ G T +A+
Sbjct: 47 ----------RGRSRSRSPNRYYRPRGDRSYRRDDRRSGRNTKEPLTEAE 86
>gnl|CDD|234229 TIGR03490, Mycoplas_LppA, mycoides cluster lipoprotein, LppA/P72
family. Members of this protein family occur in
Mycoplasma mycoides, Mycoplasma hyopneumoniae, and
related Mycoplasmas in small paralogous families that
may also include truncated forms and/or pseudogenes.
Members are predicted lipoproteins with a conserved
signal peptidase II processing and lipid attachment
site. Note that the name for certain characterized
members, p72, reflects an anomalous apparent molecular
weight, given a theoretical MW of about 61 kDa.
Length = 541
Score = 41.4 bits (97), Expect = 0.001
Identities = 31/106 (29%), Positives = 50/106 (47%), Gaps = 6/106 (5%)
Query: 29 IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
I S S S S T SS+SK+ +K + + K K+ D + + E + S
Sbjct: 13 ISSISFLSVVSCSTTSSNSKQPEKKPEIKPNENTPKIPKKPD----NKEPSENNNNKSNN 68
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+ + E P SS+ EKK + KE+ K KD+ ++ DK + +Q
Sbjct: 69 ENKDEENP--SSTNPEKKPDPSKNKEEIEKPKDEPKKPDKKPQADQ 112
Score = 36.4 bits (84), Expect = 0.041
Identities = 26/110 (23%), Positives = 44/110 (40%), Gaps = 4/110 (3%)
Query: 26 SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS 85
S ST++S+S + K K KE + +KS +K++E +
Sbjct: 20 SVVSCSTTSSNSKQPEKKPEIKPNENTPKIPKKPDNKEPSENNNNKSNNENKDEENPSST 79
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
+ EK+ SK KE + + + +K DKK D+ D+ K
Sbjct: 80 NPEKKPDPSKNKEEIEKPKDEPKKPDKKP----QADQPNNVHADQPNNNK 125
>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
component YidC; Validated.
Length = 429
Score = 41.4 bits (97), Expect = 0.001
Identities = 16/90 (17%), Positives = 36/90 (40%), Gaps = 5/90 (5%)
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK-----IVSSSHNSKEPASGSQ 158
EK + K KKE + K + +RE +++ ++E+ + + ++ + + + +
Sbjct: 339 EKNEAKARKKEIAQKRRAAEREINREARQERAAAMARARARRAAVKAKKKGLIDASPNED 398
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESS 188
S +P Q T E +E
Sbjct: 399 TPSENEESKGSPPQVEATTTAEPNREPSQE 428
Score = 31.0 bits (70), Expect = 1.7
Identities = 21/100 (21%), Positives = 40/100 (40%), Gaps = 12/100 (12%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
K R EK + K K+ + +K + + +E R+ ++ + + +
Sbjct: 334 KTRTAEKNEAKARKK--------EIAQKRRAAEREINREA---RQERAAAMARARARRAA 382
Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP 153
K+ K D ++D E +ESK S V + + EP
Sbjct: 383 VKAKKKGLIDASPNEDTPSENEESKGSPPQV-EATTTAEP 421
>gnl|CDD|216461 pfam01370, Epimerase, NAD dependent epimerase/dehydratase family.
This family of proteins utilise NAD as a cofactor. The
proteins in this family use nucleotide-sugar substrates
for a variety of chemical reactions.
Length = 233
Score = 40.3 bits (95), Expect = 0.001
Identities = 31/158 (19%), Positives = 46/158 (29%), Gaps = 49/158 (31%)
Query: 300 FIQHHIHVIIHAAASLRFDELIQDAFTL---NIQATRELLDLATRCSQLKAILHVSTLYT 356
+ +IH AA +D N+ T LL+ A R K + S+
Sbjct: 59 LAEVQPDAVIHLAAQSGVGASFEDPADFIRANVLGTLRLLEAARRAGV-KRFVFASS--- 114
Query: 357 HSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVV 416
E Y + + I G + Y+ K E +V
Sbjct: 115 --------SEVYGDV---------------ADPPITEDTPLG--PLSPYAAAKLAAERLV 149
Query: 417 EKY--LYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPG 452
E Y Y L ++R N+YGPG
Sbjct: 150 EAYARAYGLRAVILRLF---------------NVYGPG 172
>gnl|CDD|240370 PTZ00342, PTZ00342, acyl-CoA synthetase; Provisional.
Length = 746
Score = 41.2 bits (97), Expect = 0.002
Identities = 27/103 (26%), Positives = 42/103 (40%), Gaps = 9/103 (8%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
++EK + ++KE++K S E E P E +KEK ++ KD KEK+
Sbjct: 219 NKEEKNNGSNVNNNGNKNNKEEQKGNDLSNELEDISLGPLEY--DKEKLEKIKDLKEKAK 276
Query: 118 KHKDKDRERDKDEKKEQKESKSS-------SKIVSSSHNSKEP 153
K D K + K + IV +S S +P
Sbjct: 277 KLGISIILFDDMTKNKTTNYKIQNEDPDFITSIVYTSGTSGKP 319
>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
subunit G; Reviewed.
Length = 197
Score = 39.3 bits (91), Expect = 0.002
Identities = 27/97 (27%), Positives = 52/97 (53%), Gaps = 11/97 (11%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
RD+ + +K D +K KS + +E+ EK R+E + E E E+++EK D++E
Sbjct: 104 RDQLRSVKKDDIKKKKSLIIRQEQ-------IEKARQEREELEERMEWERREEKIDERE- 155
Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
++++RER++ +EQ + S +I+ + E
Sbjct: 156 --DQEEQEREREEQTIEEQSDD-SEHEIIEQDESETE 189
>gnl|CDD|172341 PRK13808, PRK13808, adenylate kinase; Provisional.
Length = 333
Score = 40.3 bits (94), Expect = 0.002
Identities = 25/139 (17%), Positives = 53/139 (38%), Gaps = 6/139 (4%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
+++ ++ P+ K S+ S + S ++ K + KK
Sbjct: 196 AANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKK----KAAKTAVSAKKAAKTAAKAAKK 251
Query: 66 DKEKDKSAVSSKEKEKDKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
K+ K A+ K K + K + K +K +++ + K +KK K+ + K K
Sbjct: 252 AKKTAKKALKKAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKA 311
Query: 124 RERDKDEKKEQKESKSSSK 142
+ + K++K +K
Sbjct: 312 TAKAPKRGAKGKKAKKVTK 330
Score = 39.9 bits (93), Expect = 0.003
Identities = 32/144 (22%), Positives = 59/144 (40%), Gaps = 6/144 (4%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
++++KK K +K + VS K+ K VS+K+ + +K + + +
Sbjct: 196 AANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKAKKT 255
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
KK KK K K + K K K +K +K + K+ A+GS+ +
Sbjct: 256 AKKALKKAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAK--AKKKAGKKAAAGSKAKA-- 311
Query: 164 PPPAPTPTQKSPVKTKEKEKEKES 187
A P + + K +K +K +
Sbjct: 312 --TAKAPKRGAKGKKAKKVTKKRA 333
Score = 36.4 bits (84), Expect = 0.032
Identities = 28/131 (21%), Positives = 55/131 (41%), Gaps = 1/131 (0%)
Query: 27 SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSA-VSSKEKEKDKVS 85
+A+ + + + P S +KK +K +KK + SA ++K K
Sbjct: 192 AAVGAANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKK 251
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
+K+ +K K + +K KK K + + + + K +KK K++ + SK +
Sbjct: 252 AKKTAKKALKKAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKA 311
Query: 146 SSHNSKEPASG 156
++ K A G
Sbjct: 312 TAKAPKRGAKG 322
Score = 31.0 bits (70), Expect = 1.9
Identities = 27/124 (21%), Positives = 45/124 (36%), Gaps = 4/124 (3%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
+ +S+ S K K +A + S + ++ K K K + K
Sbjct: 209 KSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKAKKTAKKALKKAAKAVK 268
Query: 65 KDKEK-DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK---KEKSHKHK 120
K +K K+A + + K K +K++ K ++ K K K K K K K
Sbjct: 269 KAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKATAKAPKRGAKGKKAKKV 328
Query: 121 DKDR 124
K R
Sbjct: 329 TKKR 332
>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777). This is
a family of eukaryotic proteins of unknown function.
Some of the proteins in this family are putative nucleic
acid binding proteins.
Length = 158
Score = 38.7 bits (90), Expect = 0.002
Identities = 24/109 (22%), Positives = 48/109 (44%), Gaps = 5/109 (4%)
Query: 35 SSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
S S S + + +DR + + + + +E+D+ S S+ R S
Sbjct: 2 GRSRSRSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRS 61
Query: 95 KPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
S +++++K +++K + + K RER K K+E E KS ++
Sbjct: 62 ----RSRSPSRRRDRKRERDKDAR-EPKKRERQKLIKEEDLEGKSDEEV 105
Score = 31.4 bits (71), Expect = 0.72
Identities = 14/94 (14%), Positives = 40/94 (42%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
++ + S S S S ++D++ + R + + ++ + + S+
Sbjct: 7 RSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSP 66
Query: 80 EKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+ + +E+++ +PK+ +K K+E + K
Sbjct: 67 SRRRDRKRERDKDAREPKKRERQKLIKEEDLEGK 100
Score = 30.6 bits (69), Expect = 1.2
Identities = 15/100 (15%), Positives = 40/100 (40%), Gaps = 5/100 (5%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+ R + + ++ S +E+ + S+ +ER + S S ++ + ++
Sbjct: 3 RSRSRSPRRSRRRGRSR----SRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRR 58
Query: 114 EKSH-KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
+S + + R+R ++ K+ +E K + E
Sbjct: 59 HRSRSRSPSRRRDRKRERDKDAREPKKRERQKLIKEEDLE 98
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 40.3 bits (94), Expect = 0.003
Identities = 25/106 (23%), Positives = 51/106 (48%), Gaps = 6/106 (5%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S + + +K ++E E++KK +EK +KEKE K+ + +KE K + +S+
Sbjct: 2 SRTESEAEKKILTEEELERKKKKEEK------AKEKELKKLKAAQKEAKAKLQAQQASDG 55
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
+K +KK + +D++ E D + K S ++ ++
Sbjct: 56 TNVPKKSEKKSRKRDVEDENPEDFIDPDTPFGQKKRLSSQMAKQYS 101
Score = 39.9 bits (93), Expect = 0.004
Identities = 20/89 (22%), Positives = 36/89 (40%), Gaps = 2/89 (2%)
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
+ E E+K +E E++KKKE+K K+++ K K +E + +Q ++
Sbjct: 4 TESEAEKKILTEEEL--ERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKK 61
Query: 145 SSSHNSKEPASGSQLISHPPPPAPTPTQK 173
S + K P P +K
Sbjct: 62 SEKKSRKRDVEDENPEDFIDPDTPFGQKK 90
Score = 31.8 bits (72), Expect = 1.5
Identities = 17/90 (18%), Positives = 38/90 (42%)
Query: 35 SSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
S + S ++++ + K + +EK KEK+ K+ + +K K + + +S +
Sbjct: 2 SRTESEAEKKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKK 61
Query: 95 KPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
K+S + + +D + K R
Sbjct: 62 SEKKSRKRDVEDENPEDFIDPDTPFGQKKR 91
>gnl|CDD|177089 CHL00189, infB, translation initiation factor 2; Provisional.
Length = 742
Score = 40.2 bits (94), Expect = 0.003
Identities = 28/133 (21%), Positives = 44/133 (33%), Gaps = 8/133 (6%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKD--RDKEKEKEKKDKE---KDKSAV 74
++ KDS + + K KD + K K+K+K K+ D
Sbjct: 36 ESDIKDSLLNLDINKKLHEKLDKKNKKFNKTDDLKDSKKTKLKQKKKIKKKLHIDDDYDN 95
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
K K + + + +EK KKK +K K K KDE +
Sbjct: 96 FFDSKNNSKQFAGPLAISLMRKPKPKTEKLKKKITVNKST---NKKKKKVLSSKDELIKY 152
Query: 135 KESKSSSKIVSSS 147
+K S + S
Sbjct: 153 DNNKPKSISIHSP 165
Score = 35.6 bits (82), Expect = 0.080
Identities = 31/168 (18%), Positives = 52/168 (30%), Gaps = 34/168 (20%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE--KKKEKKDKKE 114
K K + + S + +K ++ ++K K ++ K+ K K K+ KK
Sbjct: 24 KNLKHSSYKIRLESDIKDSLLNLDINKKLHEKLDKKNKKFNKTDDLKDSKKTKLKQKKKI 83
Query: 115 KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKS 174
K H D D + D K K+ I S
Sbjct: 84 KKKLHIDDDYDNFFDSKNNSKQFAGPLAI------------------------------S 113
Query: 175 PVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
++ + + EK K +KKK K + K +K K
Sbjct: 114 LMRKPKPKTEKLKKKITVN--KSTNKKKKKVLSSKDELIKYDNNKPKS 159
>gnl|CDD|218538 pfam05285, SDA1, SDA1. This family consists of several SDA1
protein homologues. SDA1 is a Saccharomyces cerevisiae
protein which is involved in the control of the actin
cytoskeleton. The protein is essential for cell
viability and is localised in the nucleus.
Length = 317
Score = 39.3 bits (92), Expect = 0.004
Identities = 43/183 (23%), Positives = 64/183 (34%), Gaps = 39/183 (21%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE----- 104
DK+ + D E E+EK + K S +E ++ +E + KE +SE
Sbjct: 124 DKEIESSDSEDEEEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASELATTRIL 183
Query: 105 ------KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
K +E + +K K + RDKD +S E
Sbjct: 184 TPADFAKIQELRLEKGVDKALGGKLKRRDKDA---------------PERHSDELVDADD 228
Query: 159 LISHPPPPAPTPTQKSPVKTKEK--EKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
+ P +K TKE+ KE +K K KK + TN KEK A
Sbjct: 229 IE------GPAKKKKQ---TKEERIATAKEGREDREKFGSRKGKKDKEGKSTTN-KEK-A 277
Query: 217 KSK 219
+ K
Sbjct: 278 RKK 280
Score = 29.2 bits (66), Expect = 6.9
Identities = 19/100 (19%), Positives = 48/100 (48%), Gaps = 2/100 (2%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKS--AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
K K+++ + KE E+ + + D +E E + + + K ESS ++++
Sbjct: 77 KWKEEERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDDEGEWIDVESDKEIESSDSEDEE 136
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS 146
++ + K+ ++ E D++E E++E+++ + S
Sbjct: 137 EKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASE 176
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 39.2 bits (92), Expect = 0.004
Identities = 24/71 (33%), Positives = 43/71 (60%), Gaps = 6/71 (8%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
S K DK R++E+EK K E+++ + ++KE+ K+KE +E+K + S E++
Sbjct: 254 SPEVLRKVDKTREEEEEKILKAAEEERQEEAQEKKEE-----KKKEEREAKLAKLSPEEQ 308
Query: 105 KKKE-KKDKKE 114
+K E K+ KK+
Sbjct: 309 RKLEEKERKKQ 319
Score = 32.2 bits (74), Expect = 0.64
Identities = 17/64 (26%), Positives = 36/64 (56%), Gaps = 13/64 (20%)
Query: 48 KKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+K K + ++ E+ +EKK+++K + +E + K+S +E+ + E EKE+K
Sbjct: 270 EKILKAAEEERQEEAQEKKEEKKKEE----REAKLAKLSPEEQRKLE--------EKERK 317
Query: 107 KEKK 110
K+ +
Sbjct: 318 KQAR 321
Score = 31.5 bits (72), Expect = 1.4
Identities = 20/64 (31%), Positives = 35/64 (54%), Gaps = 3/64 (4%)
Query: 75 SSKEKEK-DKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
S + K DK +E+E K ++ + +EKK+EKK ++ ++ K E+ K E+
Sbjct: 254 SPEVLRKVDKTREEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQRKLEE 313
Query: 132 KEQK 135
KE+K
Sbjct: 314 KERK 317
Score = 28.8 bits (65), Expect = 8.1
Identities = 14/62 (22%), Positives = 36/62 (58%), Gaps = 1/62 (1%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
+K +++ K + E + +E + EK K+++KK+++E + +R +EK+ +K++
Sbjct: 262 DKTREEEEEKILKAAEEERQEEAQEK-KEEKKKEEREAKLAKLSPEEQRKLEEKERKKQA 320
Query: 138 KS 139
+
Sbjct: 321 RK 322
>gnl|CDD|226096 COG3566, COG3566, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 379
Score = 39.5 bits (92), Expect = 0.004
Identities = 16/102 (15%), Positives = 41/102 (40%), Gaps = 4/102 (3%)
Query: 48 KKDKKDKDRDKEK-EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+ D E+ + + +S + +KV + EK+ ++ K S + K
Sbjct: 184 PITTRRIGVDGISLSLEETKASEVEHLSASLKTATEKVDALEKDLHAAQAKLDSGQALTK 243
Query: 107 KE---KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
+E KK + K+ + D+D + +++++++
Sbjct: 244 EELDAKKAELSKALAALEAANAADEDPQDRDAAVEAAARLMG 285
>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355). This
family of proteins is found in bacteria and viruses.
Proteins in this family are typically between 180 and
214 amino acids in length.
Length = 125
Score = 37.2 bits (87), Expect = 0.004
Identities = 24/78 (30%), Positives = 38/78 (48%), Gaps = 1/78 (1%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+ E+EK DKE DK+ K K + K K+ E ++ K S+ EK + + +K +KE
Sbjct: 1 EPEEEKTFTDKEVDKAIAKEKAKWEKKQEEKKSEAEKLA-KMSAEEKAEYELEKLEKELE 59
Query: 117 HKHKDKDRERDKDEKKEQ 134
+ R K E K+
Sbjct: 60 ELEAELARRELKAEAKKM 77
Score = 31.8 bits (73), Expect = 0.34
Identities = 16/59 (27%), Positives = 33/59 (55%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
+KE DK +KEK + E K +E SE EK + +++ ++ + ++E ++ E + +
Sbjct: 10 DKEVDKAIAKEKAKWEKKQEEKKSEAEKLAKMSAEEKAEYELEKLEKELEELEAELARR 68
Score = 29.5 bits (67), Expect = 2.3
Identities = 17/78 (21%), Positives = 35/78 (44%), Gaps = 12/78 (15%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
E E+EK +K+ +KEK K ++++E K E+ + E+K + E
Sbjct: 1 EPEEEKTFTDKEVDKAIAKEKAK------WEKKQEEKKSEAEKLAKMSAEEKAEYEL--- 51
Query: 119 HKDKDRERDKDEKKEQKE 136
+ E++ +E + +
Sbjct: 52 ---EKLEKELEELEAELA 66
>gnl|CDD|218684 pfam05672, MAP7, MAP7 (E-MAP-115) family. The organisation of
microtubules varies with the cell type and is presumably
controlled by tissue-specific microtubule-associated
proteins (MAPs). The 115-kDa epithelial MAP
(E-MAP-115/MAP7) has been identified as a
microtubule-stabilising protein predominantly expressed
in cell lines of epithelial origin. The binding of this
microtubule associated protein is nucleotide
independent.
Length = 171
Score = 37.8 bits (87), Expect = 0.005
Identities = 18/118 (15%), Positives = 64/118 (54%), Gaps = 1/118 (0%)
Query: 26 SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS 85
A S + T+ T++ + + +K R +++E++++E+ + + + ++
Sbjct: 3 GKAENSAALGKPTAGTTDAEEATRLLAEKRRQAREQREQEEQERREQEEQDRLEREELKR 62
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
+ER + + E+E+ +EK++K ++ + ++K +E+++ E+ ++++ ++ ++
Sbjct: 63 RAAEERLRREEEARRQEEERAREKEEKAKRKAEEEEK-QEQEEQERIQKQKEEAEARA 119
Score = 29.7 bits (66), Expect = 2.7
Identities = 17/89 (19%), Positives = 47/89 (52%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
++K++K + K +E+EK+++E+ + KE+ + + + + + + K ++++
Sbjct: 83 AREKEEKAKRKAEEEEKQEQEEQERIQKQKEEAEARAREEAERMRLEREKHFQQIEQERL 142
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E+K + E+ K K + +K++ K
Sbjct: 143 ERKKRLEEIMKRTRKSEVSPQVKKEDPKV 171
>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
Length = 434
Score = 39.0 bits (91), Expect = 0.005
Identities = 18/85 (21%), Positives = 39/85 (45%), Gaps = 1/85 (1%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
K ++ K +K + + K+ + K E K+ +E + K+EK ++
Sbjct: 351 KLYEEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDASEEAEAKAKEEKLKQE 410
Query: 114 EKSHKHKDKDRERDKDEKKEQKESK 138
E K K++ E DK+++++ + K
Sbjct: 411 ENEKKQKEQADE-DKEKRQKDERKK 434
Score = 39.0 bits (91), Expect = 0.006
Identities = 18/79 (22%), Positives = 40/79 (50%), Gaps = 5/79 (6%)
Query: 38 TSNPTNSSSSKKDKKDKDRDKE-----KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
+ T+ S K+ + K+ +K+ K+ + E D S + + +++K+ +E E+K
Sbjct: 356 VKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDASEEAEAKAKEEKLKQEENEKK 415
Query: 93 ESKPKESSSEKEKKKEKKD 111
+ + + EK +K E+K
Sbjct: 416 QKEQADEDKEKRQKDERKK 434
Score = 38.2 bits (89), Expect = 0.009
Identities = 23/84 (27%), Positives = 38/84 (45%), Gaps = 3/84 (3%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
S++ K ++ KE K+ +D K V E D E + KE K K+
Sbjct: 354 EEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVK---DETDASEEAEAKAKEEKLKQE 410
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKD 123
+EK++K++ + KEK K + K
Sbjct: 411 ENEKKQKEQADEDKEKRQKDERKK 434
Score = 30.5 bits (69), Expect = 2.8
Identities = 14/75 (18%), Positives = 31/75 (41%), Gaps = 2/75 (2%)
Query: 69 KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
K V S + + K ++ + + + K++ D + + K K+ + +
Sbjct: 351 KLYEEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDA-SEEAEAKAKEEKLKQ 409
Query: 129 DE-KKEQKESKSSSK 142
+E +K+QKE K
Sbjct: 410 EENEKKQKEQADEDK 424
>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
Length = 434
Score = 39.2 bits (92), Expect = 0.006
Identities = 15/54 (27%), Positives = 28/54 (51%), Gaps = 5/54 (9%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
EK+ + K + EKK+++K+K + +H+D K+ K +K S +S
Sbjct: 384 SEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDT-----KNIGKRRKPSGTS 432
Score = 36.8 bits (86), Expect = 0.029
Identities = 19/61 (31%), Positives = 28/61 (45%), Gaps = 6/61 (9%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
T + S KK K + K EKK+KEK+K V + ++ + K KP +S
Sbjct: 380 TKAPSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIG------KRRKPSGTSE 433
Query: 102 E 102
E
Sbjct: 434 E 434
Score = 35.7 bits (83), Expect = 0.058
Identities = 15/55 (27%), Positives = 23/55 (41%), Gaps = 3/55 (5%)
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
K K EK S K K ++KE+++ KPK ++ K K +K
Sbjct: 379 KTKAPSEKKTGKPSKKVLAKRA---EKKEKEKEKPKVKKRHRDTKNIGKRRKPSG 430
Score = 33.4 bits (77), Expect = 0.32
Identities = 18/58 (31%), Positives = 24/58 (41%), Gaps = 2/58 (3%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR--ERDKDEKKEQKESKSSSK 142
E K P E + K KK + EK K K+K + +R +D K K K S
Sbjct: 374 DELRPKTKAPSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGT 431
Score = 32.6 bits (75), Expect = 0.52
Identities = 15/59 (25%), Positives = 28/59 (47%), Gaps = 1/59 (1%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
R K K +K K V +K EK K KEK + + + +++ + +++K +E
Sbjct: 377 RPKTKAPSEKKTGKPSKKVLAKRAEK-KEKEKEKPKVKKRHRDTKNIGKRRKPSGTSEE 434
Score = 31.8 bits (73), Expect = 1.1
Identities = 11/53 (20%), Positives = 17/53 (32%)
Query: 165 PPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAK 217
P+ T K K K EK+ K +H+ G + P +
Sbjct: 382 APSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGTSEE 434
Score = 30.7 bits (70), Expect = 2.4
Identities = 14/60 (23%), Positives = 22/60 (36%), Gaps = 9/60 (15%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
EK+ K A +++KEK+K K K+R K K+ K +
Sbjct: 382 APSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDT---------KNIGKRRKPSGTS 432
Score = 29.9 bits (68), Expect = 4.6
Identities = 15/59 (25%), Positives = 26/59 (44%), Gaps = 3/59 (5%)
Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
P K+P EK+ K S K ++ K K+K+K K ++ K ++ +S
Sbjct: 378 PKTKAP---SEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGTSE 433
Score = 29.5 bits (67), Expect = 5.1
Identities = 15/61 (24%), Positives = 21/61 (34%), Gaps = 8/61 (13%)
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
P T V K EK+++ K K K K +H D N ++ S E
Sbjct: 382 APSEKKTGKPSKKVLAKRAEKKEK--------EKEKPKVKKRHRDTKNIGKRRKPSGTSE 433
Query: 223 S 223
Sbjct: 434 E 434
>gnl|CDD|220684 pfam10310, DUF2413, Protein of unknown function (DUF2413). This is
a family of proteins conserved in fungi. The function is
not known.
Length = 436
Score = 39.0 bits (91), Expect = 0.006
Identities = 25/124 (20%), Positives = 48/124 (38%), Gaps = 16/124 (12%)
Query: 34 TSSSTSNPTNSSSSKKDKKDKDRD-------KEKEKEKKDKEKDKSAVSSKEKEKDKVSS 86
+ + + KD + D D E+ ++ K +K KE + +
Sbjct: 10 DEKAPTKKPKKGDASKDSTEDDEDILEFLDELEQSEKAKPPKKP--------KEASRPGT 61
Query: 87 KEKERKESKPKESS-SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
+K SKP ESS + E+K K K +S + + E +E++E + + ++
Sbjct: 62 PRNPKKSSKPTESSAASSEEKPAKPRKSAESTRSSHPKSKAPSTESEEEEEPEETPDPIA 121
Query: 146 SSHN 149
S
Sbjct: 122 SIGG 125
Score = 33.6 bits (77), Expect = 0.33
Identities = 21/118 (17%), Positives = 36/118 (30%), Gaps = 12/118 (10%)
Query: 61 EKEKKDKEKDKSAVSSKEKEKDK-----VSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
+++ K+ K S E D+ + E+ K PK+ + KK
Sbjct: 10 DEKAPTKKPKKGDASKDSTEDDEDILEFLDELEQSEKAKPPKKPKEASRPGTPRNPKK-- 67
Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS-HPPPPAPTPTQ 172
K E +E+ S + S + K A ++ P P P
Sbjct: 68 ----SSKPTESSAASSEEKPAKPRKSAESTRSSHPKSKAPSTESEEEEEPEETPDPIA 121
Score = 30.9 bits (70), Expect = 2.0
Identities = 38/186 (20%), Positives = 64/186 (34%), Gaps = 26/186 (13%)
Query: 105 KKKEKKDKKEKSHKHKDKD--RERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
KK KK K D+D D+ E+ E+ + K S + P S+
Sbjct: 15 TKKPKKGDASKDSTEDDEDILEFLDELEQSEKAKPPKKPKEASRPGTPRNPKKSSK---- 70
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
P + S K + K ES+ + SK T +E++ + +
Sbjct: 71 ---PTESSAASSEEKPAKPRKSAESTRSSHPKSK---------APSTESEEEEEPEETPD 118
Query: 223 SHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHII 282
S G + G I S + V+ AEQ +E ++ E+A + +V
Sbjct: 119 PIASIGG--WWSLWGSITSTATSTASAAVK--QAEQAVNE----IQQEEAQLWAEQVRGN 170
Query: 283 SGDISQ 288
G +
Sbjct: 171 VGALRD 176
>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
domain. This domain is found in a number of different
types of plant proteins including NAM-like proteins.
Length = 147
Score = 37.4 bits (87), Expect = 0.006
Identities = 30/99 (30%), Positives = 52/99 (52%), Gaps = 3/99 (3%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNP-TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
+K K + S ++S+ N + S+ + K+ + R K KEK ++DK K K +
Sbjct: 26 KKASKKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKKAKEKLRRDKLKAKKEEAE 85
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
KEKEK++ K E++ + + EK+K + K K+EK
Sbjct: 86 KEKEKEERF--MKALAEAEKERAELEKKKAEAKLMKEEK 122
Score = 36.6 bits (85), Expect = 0.010
Identities = 22/91 (24%), Positives = 40/91 (43%), Gaps = 3/91 (3%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE---KEKKKE 108
K++ + K K E K K K S+ E E ES + E K K+K
Sbjct: 13 KNEPKWKSKRSELKKASKKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKKAKEKL 72
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
++DK + + +K++E+++ K E++
Sbjct: 73 RRDKLKAKKEEAEKEKEKEERFMKALAEAEK 103
>gnl|CDD|236944 PRK11642, PRK11642, exoribonuclease R; Provisional.
Length = 813
Score = 39.0 bits (91), Expect = 0.007
Identities = 21/92 (22%), Positives = 39/92 (42%), Gaps = 8/92 (8%)
Query: 36 SSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS----AVSSKEKEKDKVSSKEKER 91
SS P N + ++K K +K +++ K + + EK+ ++K+ R
Sbjct: 726 SSERAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKAKPKAAKKDAR 785
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
K KP S K +K K +++ K K +
Sbjct: 786 KAKKP----SAKTQKIAAATKAKRAAKKKVAE 813
Score = 32.0 bits (73), Expect = 1.0
Identities = 13/73 (17%), Positives = 33/73 (45%)
Query: 70 DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
D S +SS+ ++ + ++ K+ + ++ + +K + + S +K +
Sbjct: 721 DFSLISSERAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKAKPKAA 780
Query: 130 EKKEQKESKSSSK 142
+K +K K S+K
Sbjct: 781 KKDARKAKKPSAK 793
>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
Length = 880
Score = 39.3 bits (92), Expect = 0.007
Identities = 25/85 (29%), Positives = 48/85 (56%), Gaps = 2/85 (2%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEK-DKVSSKEKERKE-SKPKESSSEKEKKKEKKD 111
++ KEKEKE ++ ++ + +SS+ E +++ EKE KE + KE E EK+ E +
Sbjct: 192 EELIKEKEKELEEVLREINEISSELPELREELEKLEKEVKELEELKEEIEELEKELESLE 251
Query: 112 KKEKSHKHKDKDRERDKDEKKEQKE 136
++ + K ++ E +E K++ E
Sbjct: 252 GSKRKLEEKIRELEERIEELKKEIE 276
Score = 37.7 bits (88), Expect = 0.018
Identities = 17/88 (19%), Positives = 46/88 (52%), Gaps = 2/88 (2%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKE--KEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
KD +++ E+E+++ K +++ + +E + + ++ KE +E + K S E E+ +E+
Sbjct: 608 KDAEKELEREEKELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEELREE 667
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKES 137
+ + + E + ++E K++
Sbjct: 668 YLELSRELAGLRAELEELEKRREEIKKT 695
Score = 35.4 bits (82), Expect = 0.11
Identities = 17/95 (17%), Positives = 49/95 (51%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K+ ++ ++R +E +K+ K+ EK + + + ++ +K++E + K + + EK +
Sbjct: 331 KELEEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKKRLTGLTPEKLE 390
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
++ ++ EK+ + +++ + E K+ K
Sbjct: 391 KELEELEKAKEEIEEEISKITARIGELKKEIKELK 425
Score = 35.0 bits (81), Expect = 0.12
Identities = 22/96 (22%), Positives = 48/96 (50%), Gaps = 1/96 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK-ERKESKPKESSSEKEKK 106
K+ KK ++ + +E + EK + + +E +K S+E+ E + E S E
Sbjct: 619 KELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEELREEYLELSRELAGL 678
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
+ + ++ EK + K E+ K+E +E++++K +
Sbjct: 679 RAELEELEKRREEIKKTLEKLKEELEEREKAKKELE 714
Score = 32.3 bits (74), Expect = 0.87
Identities = 20/92 (21%), Positives = 43/92 (46%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+K+++ ++ K+ ++ +K E+ + E+ K K E+ +K +++
Sbjct: 334 EEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKKRLTGLTPEKLEKEL 393
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+E + KE+ + K R + KKE KE K
Sbjct: 394 EELEKAKEEIEEEISKITARIGELKKEIKELK 425
Score = 31.2 bits (71), Expect = 2.1
Identities = 18/96 (18%), Positives = 40/96 (41%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+ +K+ + + K K ++ + + + +KE +++ K KE KE K K K +
Sbjct: 241 EELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEKVKELKELKEKAEEYIKLSE 300
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
++ E K R ++ E++ + K
Sbjct: 301 FYEEYLDELREIEKRLSRLEEEINGIEERIKELEEK 336
Score = 31.2 bits (71), Expect = 2.1
Identities = 16/55 (29%), Positives = 30/55 (54%)
Query: 61 EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
EKE K+ E+ K + EKE + + +++ +E + +E KKE ++ +EK
Sbjct: 227 EKEVKELEELKEEIEELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEK 281
Score = 30.4 bits (69), Expect = 3.5
Identities = 21/95 (22%), Positives = 44/95 (46%), Gaps = 7/95 (7%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K+ +K+ + + E + + +E +K+ + KE +E K + EKE +
Sbjct: 196 KEKEKELEEVLREINEISSELPE------LREELEKLEKEVKELEELKEEIEELEKELES 249
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
+ K++ K ++ ER ++ KKE +E + K
Sbjct: 250 LEGSKRKLEEKIREL-EERIEELKKEIEELEEKVK 283
Score = 29.6 bits (67), Expect = 5.5
Identities = 19/102 (18%), Positives = 49/102 (48%), Gaps = 5/102 (4%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSA-----VSSKEKEKDKVSSKEKERKESKPKESS 100
K+++ ++ + K KE EK+ +E ++ +K++E +++ + K ++
Sbjct: 334 EEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKKRLTGLTPEKLEKEL 393
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
E EK KE+ +++ + + +++ E K+ E +K
Sbjct: 394 EELEKAKEEIEEEISKITARIGELKKEIKELKKAIEELKKAK 435
>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62.
Length = 217
Score = 37.9 bits (88), Expect = 0.007
Identities = 21/65 (32%), Positives = 34/65 (52%), Gaps = 1/65 (1%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
R E EK K +K+K + +K +DK K K +++ E +K +EKK++K+K
Sbjct: 14 RALESEKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAE-RVKKLHSQEKKEEKKK 72
Query: 116 SHKHK 120
K K
Sbjct: 73 PKKKK 77
Score = 36.7 bits (85), Expect = 0.019
Identities = 19/77 (24%), Positives = 32/77 (41%), Gaps = 7/77 (9%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
S K KDK E K +DK + EK K + ++ ER + + E++
Sbjct: 18 SEKYKANKDK---GNPEIYNKINSQDK----AIEKFKLLIKAQMAERVKKLHSQEKKEEK 70
Query: 105 KKKEKKDKKEKSHKHKD 121
KK +KK + + +
Sbjct: 71 KKPKKKKVPLQVNPAQL 87
Score = 35.5 bits (82), Expect = 0.047
Identities = 25/85 (29%), Positives = 40/85 (47%), Gaps = 12/85 (14%)
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
+++D + K V + E EK K ++K+K E K +S +K +K K K +
Sbjct: 2 KRQDFFRAKRVVRALESEKYK-ANKDKGNPEIYNKINSQDKAIEKFKLLI-------KAQ 53
Query: 123 DRERDK----DEKKEQKESKSSSKI 143
ER K EKKE+K+ K+
Sbjct: 54 MAERVKKLHSQEKKEEKKKPKKKKV 78
>gnl|CDD|184900 PRK14907, rplD, 50S ribosomal protein L4; Provisional.
Length = 295
Score = 38.4 bits (89), Expect = 0.008
Identities = 24/109 (22%), Positives = 44/109 (40%), Gaps = 3/109 (2%)
Query: 1 MAYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEK 60
MA + K++ ++ P K S T ++ T++ + + K KK K
Sbjct: 1 MAETKKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTT 60
Query: 61 EKEKKDKEKD---KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+K EK K +K+ K + S E +K +++S+ KK
Sbjct: 61 KKVTVKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSKLPKK 109
Score = 35.3 bits (81), Expect = 0.076
Identities = 24/105 (22%), Positives = 45/105 (42%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
KK + +EK+ K+ S ++K K+ K +S + +K +K K++ S K K+
Sbjct: 5 KKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKVT 64
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
K EK+ K + + +K+ + + SK P
Sbjct: 65 VKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSKLPKK 109
Score = 34.9 bits (80), Expect = 0.082
Identities = 23/112 (20%), Positives = 40/112 (35%), Gaps = 2/112 (1%)
Query: 32 TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER 91
T+ +T ++ K+ + K+ K K K+A K K + K +
Sbjct: 7 TTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKVTVK 66
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
E KK K K+ S + + + K+ K K+ +S KI
Sbjct: 67 FEKTESVKKESVAKKTVK--KEAVSAEVFEASNKLFKNTSKLPKKLFASEKI 116
Score = 31.8 bits (72), Expect = 0.84
Identities = 19/102 (18%), Positives = 40/102 (39%), Gaps = 2/102 (1%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDK-SAVSSKEKEKDKVSSKEKERKESKP-KES 99
T ++ KK ++K +K K+ K K +A ++ K K + +K + K+
Sbjct: 4 TKKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKV 63
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
+ + EK + K + K + + E + +S
Sbjct: 64 TVKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSK 105
Score = 29.9 bits (67), Expect = 3.7
Identities = 21/107 (19%), Positives = 34/107 (31%), Gaps = 6/107 (5%)
Query: 83 KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
K ++K+K +E KP + K+ K K K+ K K K K++KS
Sbjct: 5 KKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAK----KAAKV--KKTKSVKT 58
Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESST 189
E + S + V + K +S
Sbjct: 59 TTKKVTVKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSK 105
>gnl|CDD|215565 PLN03083, PLN03083, E3 UFM1-protein ligase 1 homolog; Provisional.
Length = 803
Score = 38.6 bits (90), Expect = 0.009
Identities = 22/96 (22%), Positives = 41/96 (42%)
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
++ +KE D ++ + S K ES P S+S+K KK+K +
Sbjct: 378 DQIEKEMDAFSIQASSAGLIGSSEKSLGSNESSPAASNSDKGSKKKKGKSTSTKGGTAES 437
Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+ ++D K+ K+++ + SS S A G +
Sbjct: 438 IPDDEEDAPKKGKKNQKKGRDKSSKVPSDSKAGGKK 473
Score = 37.9 bits (88), Expect = 0.016
Identities = 26/110 (23%), Positives = 45/110 (40%), Gaps = 2/110 (1%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE 61
A+S+++SS+ +SS S S S S+S+K + D E++
Sbjct: 386 AFSIQASSAGLIGSSEKSLG-SNESSPAASNSDKGSKKKKGKSTSTKGGTAESIPDDEED 444
Query: 62 KEKKDKEKDKSAVSSKEKEK-DKVSSKEKERKESKPKESSSEKEKKKEKK 110
KK K+ K K D + +KE +S+ ++ E+ KK
Sbjct: 445 APKKGKKNQKKGRDKSSKVPSDSKAGGKKESVKSQEDNNNIPPEEWVMKK 494
Score = 34.0 bits (78), Expect = 0.28
Identities = 25/112 (22%), Positives = 38/112 (33%), Gaps = 15/112 (13%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
S S SN ++ ++S DK K K+K KS + + +E K
Sbjct: 400 SEKSLGSNESSPAASNSDKGSK------------KKKGKSTSTKGGTAESIPDDEEDAPK 447
Query: 93 ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
+ K + + K D K K K +E D E KI+
Sbjct: 448 KGKKNQKKGRDKSSKVPSDSKAGGKKESVKSQE---DNNNIPPEEWVMKKIL 496
>gnl|CDD|237629 PRK14160, PRK14160, heat shock protein GrpE; Provisional.
Length = 211
Score = 37.4 bits (87), Expect = 0.010
Identities = 14/67 (20%), Positives = 33/67 (49%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K+ K K + E++ K+++ K++ ++ E +++ +E + E E+ K +
Sbjct: 3 KECKDAKHENMEEDCCKENENKEEDKGKEEDLEFEEIEKEEIIEDSEESNEVKIEELKDE 62
Query: 108 EKKDKKE 114
K K+E
Sbjct: 63 NNKLKEE 69
Score = 29.0 bits (65), Expect = 6.3
Identities = 11/80 (13%), Positives = 36/80 (45%), Gaps = 5/80 (6%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+++ K+ K + ++ E +++ +K ++E E ++E ++ ++ E
Sbjct: 1 MEKECKDAKHENMEEDCCKENENKEE-----DKGKEEDLEFEEIEKEEIIEDSEESNEVK 55
Query: 117 HKHKDKDRERDKDEKKEQKE 136
+ + + K+E K+ +
Sbjct: 56 IEELKDENNKLKEENKKLEN 75
>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19.
Med19 represents a family of conserved proteins which
are members of the multi-protein co-activator Mediator
complex. Mediator is required for activation of RNA
polymerase II transcription by DNA binding
transactivators.
Length = 178
Score = 37.1 bits (86), Expect = 0.010
Identities = 20/61 (32%), Positives = 35/61 (57%)
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
K+K K K+ ++ P+E+ S+ E K + K +K DK+R++ K EKK++K+
Sbjct: 110 KKKHKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRH 169
Query: 139 S 139
S
Sbjct: 170 S 170
Score = 34.0 bits (78), Expect = 0.11
Identities = 28/100 (28%), Positives = 35/100 (35%), Gaps = 17/100 (17%)
Query: 148 HNSKEPASGSQLISHPPPPAPTPTQ-----KSPVKTKEKEKEKE------------SSTT 190
P + QL P P P Q P K K K K K+ S +
Sbjct: 76 GKELLPLTSVQLAGFRLHPGPLPEQYRLMHIQPPKKKHKHKHKKHRTQDPLPEETPSDSE 135
Query: 191 HDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGP 230
K + KHKKK DK K+K K K+K+ H
Sbjct: 136 GLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRHSPEHPG 175
Score = 32.9 bits (75), Expect = 0.29
Identities = 23/66 (34%), Positives = 35/66 (53%), Gaps = 4/66 (6%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
K K K KK + +D +E D K E+K K K+ +KE+KK+KK+KK+K
Sbjct: 112 KHKHKHKKHRTQDPL---PEETPSDSEGLKGHEKKHKK-KKHEDDKERKKKKKEKKKKKK 167
Query: 118 KHKDKD 123
+H +
Sbjct: 168 RHSPEH 173
Score = 32.1 bits (73), Expect = 0.41
Identities = 22/70 (31%), Positives = 32/70 (45%), Gaps = 2/70 (2%)
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ K K K K+ + ++ E E + + K KHK K E DK+ KK++KE K
Sbjct: 106 IQPPKKKHKHKH-KKHRTQDPLPE-ETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKK 163
Query: 139 SSSKIVSSSH 148
K S H
Sbjct: 164 KKKKRHSPEH 173
Score = 31.3 bits (71), Expect = 0.88
Identities = 19/59 (32%), Positives = 30/59 (50%), Gaps = 4/59 (6%)
Query: 19 HKNKDKDS---SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK-EKKDKEKDKSA 73
HK+K K +P + S S + KK K + D++++K+K EKK K+K S
Sbjct: 113 HKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRHSP 171
>gnl|CDD|218188 pfam04641, Rtf2, Replication termination factor 2. It is vital for
effective cell-replication that replication is not
stalled at any point by, for instance, damaged bases.
Rtf2 stabilizes the replication fork stalled at the
site-specific replication barrier RTS1 by preventing
replication restart until completion of DNA synthesis by
a converging replication fork initiated at a flanking
origin. The RTS1 element terminates replication forks
that are moving in the cen2-distal direction while
allowing forks moving in the cen2-proximal direction to
pass through the region. Rtf2 contains a C2HC2 motif
related to the C3HC4 RING-finger motif, and would appear
to fold up, creating a RING finger-like structure but
forming only one functional Zn2+ ion-binding site.
Length = 254
Score = 37.7 bits (88), Expect = 0.010
Identities = 22/89 (24%), Positives = 39/89 (43%), Gaps = 7/89 (7%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
NPT + ++ + K+K+KK K+K K ++ + VSS + S
Sbjct: 163 NPTEEEVELLKARLEEE-RAKKKKKKKKKKTKKNNATGSSAEATVSSA------VPTELS 215
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
S + + KK KK++S ++ E K
Sbjct: 216 SGAGQVGEAKKLKKKRSIAPDNEKSEVYK 244
Score = 33.9 bits (78), Expect = 0.17
Identities = 25/106 (23%), Positives = 39/106 (36%), Gaps = 7/106 (6%)
Query: 79 KEKDKVS-SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
E+D V + +E E E+ KKK+KK KK K K + S
Sbjct: 155 SEEDVVPLNPTEEEVELLKARLEEERAKKKKKKKKK------KTKKNNATGSSAEATVSS 208
Query: 138 KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEK 183
+++ S + E + S P + KS + +KEK
Sbjct: 209 AVPTELSSGAGQVGEAKKLKKKRSIAPDNEKSEVYKSLFTSHKKEK 254
Score = 33.1 bits (76), Expect = 0.33
Identities = 13/80 (16%), Positives = 27/80 (33%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
++E E K E++++ K+K+K + S+ E E
Sbjct: 165 TEEEVELLKARLEEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEA 224
Query: 116 SHKHKDKDRERDKDEKKEQK 135
K + D ++ + K
Sbjct: 225 KKLKKKRSIAPDNEKSEVYK 244
Score = 33.1 bits (76), Expect = 0.35
Identities = 19/75 (25%), Positives = 30/75 (40%), Gaps = 7/75 (9%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+KK KK K + +K + + S+ E + + E K+ K K S + +K
Sbjct: 181 AKKKKKKKKKKTKKNNATGSS-AEATVSSAVPTELSSGAGQVGEAKKLKKKRSIAPDNEK 239
Query: 107 KE------KKDKKEK 115
E KKEK
Sbjct: 240 SEVYKSLFTSHKKEK 254
Score = 32.7 bits (75), Expect = 0.46
Identities = 19/93 (20%), Positives = 31/93 (33%), Gaps = 4/93 (4%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
E+E E ++ K+K+K K + K S SS + + K
Sbjct: 166 EEEVELLKARLEEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAK 225
Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
K R D +K S+ + +S K
Sbjct: 226 KLKKKRSIAPDNEK----SEVYKSLFTSHKKEK 254
Score = 29.2 bits (66), Expect = 4.8
Identities = 16/78 (20%), Positives = 33/78 (42%), Gaps = 1/78 (1%)
Query: 122 KDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP-PPPAPTPTQKSPVKTKE 180
++ K +KK++K++K ++ SS+ + A ++L S +K +
Sbjct: 177 EEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAKKLKKKRSIAPD 236
Query: 181 KEKEKESSTTHDKHSKHK 198
EK + + H K K
Sbjct: 237 NEKSEVYKSLFTSHKKEK 254
>gnl|CDD|191249 pfam05279, Asp-B-Hydro_N, Aspartyl beta-hydroxylase N-terminal
region. This family includes the N-terminal regions of
the junctin, junctate and aspartyl beta-hydroxylase
proteins. Junctate is an integral ER/SR membrane calcium
binding protein, which comes from an alternatively
spliced form of the same gene that generates aspartyl
beta-hydroxylase and junctin. Aspartyl beta-hydroxylase
catalyzes the post-translational hydroxylation of
aspartic acid or asparagine residues contained within
epidermal growth factor (EGF) domains of proteins.
Length = 240
Score = 37.6 bits (87), Expect = 0.011
Identities = 22/98 (22%), Positives = 40/98 (40%), Gaps = 4/98 (4%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP----KESSSEKE 104
K+ + + ++ D+++ A E+ +D +E ++ K K S E E
Sbjct: 128 KEPQLDEDKFLLAEDSDDRQETLEAGKVHEETEDSYHVEETASEQYKQDMKEKASEQENE 187
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
KE +K E++ D E D DE+ E + K
Sbjct: 188 DSKEPVEKAERTKAETDDVTEEDYDEEDNPVEDSKAIK 225
Score = 35.7 bits (82), Expect = 0.051
Identities = 31/176 (17%), Positives = 71/176 (40%), Gaps = 7/176 (3%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K+ + +E E +E+ + AV K K+K + KE+ + + S ++E
Sbjct: 68 KEKSTSEPTVPPEEAEPHAEEEGQLAVR-KTKQKVEEEVKEQLQSLLEKIVVSKQEEDGP 126
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS--GSQLISHPPP 165
K+ + ++ D D+++E E+ + S++ +E AS Q +
Sbjct: 127 GKEPQLDEDKFLL----AEDSDDRQETLEAGKVHEETEDSYHVEETASEQYKQDMKEKAS 182
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
K PV+ E+ K + T + + + + +D K ++ + +++
Sbjct: 183 EQENEDSKEPVEKAERTKAETDDVTEEDYDEEDNPVEDSKAIKEELAKEPVEEQQE 238
Score = 32.6 bits (74), Expect = 0.46
Identities = 23/159 (14%), Positives = 55/159 (34%), Gaps = 9/159 (5%)
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
KEK S + +E + + +E + K K+ E+ K++ + ++ +++D
Sbjct: 68 KEKSTSEPTVPPEEAEPHAEEEGQLAVRKTKQKVEEEVKEQLQSLLEKIVVSKQEEDGPG 127
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
+ + E K + + E + T +++ KEK E+E
Sbjct: 128 KEPQLDEDKFLLAED--SDDRQETLEAGKVHEETEDSYHVEETASEQYKQDMKEKASEQE 185
Query: 187 SSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
+ K + K + + ++E +
Sbjct: 186 N-------EDSKEPVEKAERTKAETDDVTEEDYDEEDNP 217
Score = 32.2 bits (73), Expect = 0.56
Identities = 17/73 (23%), Positives = 40/73 (54%), Gaps = 4/73 (5%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
T S K+D K+K ++E E K+ EK + + E D V+ ++ + +++ ++S +
Sbjct: 168 TASEQYKQDMKEKASEQENEDSKEPVEKAERTKA----ETDDVTEEDYDEEDNPVEDSKA 223
Query: 102 EKEKKKEKKDKKE 114
KE+ ++ +++
Sbjct: 224 IKEELAKEPVEEQ 236
>gnl|CDD|227446 COG5116, RPN2, 26S proteasome regulatory complex component
[Posttranslational modification, protein turnover,
chaperones].
Length = 926
Score = 38.4 bits (89), Expect = 0.011
Identities = 19/96 (19%), Positives = 39/96 (40%), Gaps = 4/96 (4%)
Query: 21 NKDKDSSAIP--STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
+ +++ P S + N++ K R K+K KEK +K+ + S
Sbjct: 759 DLEEEEFEYPRMYEEASGKSVRKVNTAVLSTTIKAAARAKQKPKEKGPNDKEI-KIESPS 817
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
E + +++E K ++ + KK+K K +
Sbjct: 818 VETEG-ERCTIKQREEKGIDAPAILNVKKKKPYKVD 852
Score = 33.8 bits (77), Expect = 0.27
Identities = 20/82 (24%), Positives = 37/82 (45%), Gaps = 11/82 (13%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSS--KEKERKESKPKES-SSEKEKKKEKKDKKE 114
+E+E E ++ S S ++ +S+ K R + KPKE ++KE K E
Sbjct: 761 EEEEFEYPRMYEEASGKSVRKVNTAVLSTTIKAAARAKQKPKEKGPNDKEIKIE------ 814
Query: 115 KSHKHKDKDRERDKDEKKEQKE 136
+ + ER +++E+K
Sbjct: 815 --SPSVETEGERCTIKQREEKG 834
Score = 31.1 bits (70), Expect = 2.0
Identities = 15/67 (22%), Positives = 32/67 (47%), Gaps = 8/67 (11%)
Query: 71 KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
K+A +K+K K+K + ++ + ES E+ E+ K++++ K D +
Sbjct: 792 KAAARAKQKPKEKGPNDKEIKIESPSVETEGERCTIKQREE--------KGIDAPAILNV 843
Query: 131 KKEQKES 137
KK++
Sbjct: 844 KKKKPYK 850
>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
Length = 943
Score = 38.5 bits (89), Expect = 0.011
Identities = 48/209 (22%), Positives = 76/209 (36%), Gaps = 23/209 (11%)
Query: 17 SPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
+P + +D D P +S P DK+ ++ + E KE + ++ +
Sbjct: 497 APIEEEDSDKHDEPPEGPEASGLPPKAPG----DKEGEEGEHEDSKESDEPKEGGKPGET 552
Query: 77 KEKEKDKVSSKEKERKESK----------PKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
KE E K KE K SK PK+ K+ ++ KK K+ +S + +
Sbjct: 553 KEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPTR---P 609
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS--HPPPPAPTPTQKSPVKTK----E 180
+ E + S K S + K P + S P P + K P K
Sbjct: 610 KSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPFDP 669
Query: 181 KEKEKESSTTHDKHSKHKHKKKDKHGDKT 209
K KEK D +K K K D++
Sbjct: 670 KFKEKFYDDYLDAAAKSKETKTTVVLDES 698
Score = 31.6 bits (71), Expect = 1.6
Identities = 43/202 (21%), Positives = 70/202 (34%), Gaps = 28/202 (13%)
Query: 48 KKDKKDKDRDKEKEKEKKDKE-KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
KK KK +E++ +K D+ + A K +E E ++SK ES KE
Sbjct: 490 KKSKKKLAPIEEEDSDKHDEPPEGPEASGLPPKAPGDKEGEEGEHEDSK--ESDEPKEGG 547
Query: 107 KEKKDKKEKSHKHKDKDRE----------------RDKDEKKEQKESKSSSKIVSSSH-- 148
K + K+ + K +E +D K+ +E K + S+
Sbjct: 548 KPGETKEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPT 607
Query: 149 ---NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK--- 202
+ K P S P +P ++ P + E+ K K K
Sbjct: 608 RPKSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPF 667
Query: 203 -DKHGDKTNPKEKDAKSKEKES 223
K +K DA +K KE+
Sbjct: 668 DPKFKEKFYDDYLDAAAKSKET 689
Score = 29.3 bits (65), Expect = 6.7
Identities = 41/243 (16%), Positives = 65/243 (26%), Gaps = 13/243 (5%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
S K HP + K S + +P K K E K
Sbjct: 575 LSKKPEFPKDPKHPKDPEEPKKPKRPR-SAQRPTRPKSPKLPELLDIPKSPKR--PESPK 631
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
K + S + E K+ K K KP KEK + K
Sbjct: 632 SPKRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPFDPKFKEKFYDDYLDAAAKSKETKT 691
Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKS-------- 174
D+ + KE+ + + P + P P P +
Sbjct: 692 TVVLDESFESILKETLPETPGTPFTTPRPLPPKLPRDEEFPFEPIGDPDAEQPDDIEFFT 751
Query: 175 -PVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCY 233
P + + E + T + K++D H + P E + H P +
Sbjct: 752 PPEEERTFFHETPADTPLPDILAEEFKEEDIHAETGEPDEAMKRPDSPSEH-EDKPPGDH 810
Query: 234 PEV 236
P +
Sbjct: 811 PSL 813
>gnl|CDD|237035 PRK12280, rplW, 50S ribosomal protein L23; Reviewed.
Length = 158
Score = 36.7 bits (85), Expect = 0.012
Identities = 20/64 (31%), Positives = 36/64 (56%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
+ KE KE ++KE K+ KEK++ KV+ K ++K +K +++++K KK K
Sbjct: 95 SEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTTKK 154
Query: 113 KEKS 116
+E
Sbjct: 155 EEGK 158
Score = 33.2 bits (76), Expect = 0.17
Identities = 23/70 (32%), Positives = 36/70 (51%), Gaps = 3/70 (4%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
+E EK+ KE K +EKE K ++KE+KE K E ++K+ K K+ +K+ K
Sbjct: 92 PEESEKEQKEVSKET---EEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATK 148
Query: 119 HKDKDRERDK 128
+E K
Sbjct: 149 KTTTKKEEGK 158
Score = 32.0 bits (73), Expect = 0.40
Identities = 18/65 (27%), Positives = 34/65 (52%), Gaps = 1/65 (1%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
S KE+++ ++EKE ++K ++ +KEKK +K K+KS K ++ + +
Sbjct: 95 SEKEQKEVSKETEEKEAIKAKKEK-KEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTTK 153
Query: 135 KESKS 139
KE
Sbjct: 154 KEEGK 158
Score = 32.0 bits (73), Expect = 0.46
Identities = 17/67 (25%), Positives = 34/67 (50%), Gaps = 1/67 (1%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSA-VSSKEKEKDKVSSKEKERKESKPKES 99
P S +K+ + +KE K KK+K++ K V+ K +K + + K++ K +
Sbjct: 92 PEESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTT 151
Query: 100 SSEKEKK 106
+ ++E K
Sbjct: 152 TKKEEGK 158
Score = 30.1 bits (68), Expect = 2.0
Identities = 16/66 (24%), Positives = 37/66 (56%), Gaps = 2/66 (3%)
Query: 55 DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK--EKERKESKPKESSSEKEKKKEKKDK 112
+ ++++KE + ++K A+ +K+++K+K K EK K+ K + + +K +K
Sbjct: 93 EESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTT 152
Query: 113 KEKSHK 118
K++ K
Sbjct: 153 KKEEGK 158
Score = 29.0 bits (65), Expect = 3.9
Identities = 20/66 (30%), Positives = 30/66 (45%)
Query: 90 ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
E E + KE S E E+K+ K KKEK K + K E+ +K + ++ K +
Sbjct: 93 EESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTT 152
Query: 150 SKEPAS 155
KE
Sbjct: 153 KKEEGK 158
>gnl|CDD|240246 PTZ00053, PTZ00053, methionine aminopeptidase 2; Provisional.
Length = 470
Score = 38.2 bits (89), Expect = 0.013
Identities = 25/113 (22%), Positives = 46/113 (40%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+ + + K+ K+++K + K+ +K K + + ++ + E E K+ K KK
Sbjct: 1 AMNENGENEVKQQKQQNKQKGTKKKNKKSKKDVDDDDAFLAELISENQEAENKQNNKKKK 60
Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
+K K K K+ D + SS+ +SH K Q PP
Sbjct: 61 KKKKKKKKKNLGEAYDLAYDLPVVWSSAAFQDNSHIRKLGNWPEQEWKQTQPP 113
Score = 37.8 bits (88), Expect = 0.016
Identities = 24/102 (23%), Positives = 47/102 (46%), Gaps = 6/102 (5%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
+ + + + K + K++ K+K K+K+K SK+ D + + E++ E+
Sbjct: 1 AMNENGENEVKQQ---KQQNKQKGTKKKNKK---SKKDVDDDDAFLAELISENQEAENKQ 54
Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
+KKK+KK KK+K + + D D + +S I
Sbjct: 55 NNKKKKKKKKKKKKKNLGEAYDLAYDLPVVWSSAAFQDNSHI 96
Score = 33.9 bits (78), Expect = 0.23
Identities = 16/89 (17%), Positives = 34/89 (38%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
N N +K + + K+K K+ K D A ++ +++ + ++ K+ K K+
Sbjct: 4 ENGENEVKQQKQQNKQKGTKKKNKKSKKDVDDDDAFLAELISENQEAENKQNNKKKKKKK 63
Query: 99 SSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
+K+ E D +D
Sbjct: 64 KKKKKKNLGEAYDLAYDLPVVWSSAAFQD 92
>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
chromosome partitioning].
Length = 1163
Score = 38.2 bits (89), Expect = 0.013
Identities = 27/239 (11%), Positives = 90/239 (37%), Gaps = 24/239 (10%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+++ ++ +++ E+ K + ++ +++ +E + K +E E + S +E E E +
Sbjct: 259 QEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLLRERLEELENEL 318
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
E+ +++ + K K + + + +E++ E + S L
Sbjct: 319 EELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLSAL-------- 370
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK---------KDKHGDKTNPKEKDAKS 218
++ + +E + + ++ +++ ++ ++ + + +D K
Sbjct: 371 -----LEELEELFEALREELAELEAELAEIRNELEELKREIESLEERLERLSERLEDLKE 425
Query: 219 KEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQR 277
+ KE + E+ + L + + + R + + L+ E + +
Sbjct: 426 ELKELEAELEELQ--TELEELNEELEELEEQLEELRDRLKELERELAELQEELQRLEKE 482
Score = 31.6 bits (72), Expect = 1.5
Identities = 33/242 (13%), Positives = 84/242 (34%), Gaps = 53/242 (21%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD------KVSSKEKERKESKPKESSS 101
++ + +R +E + E ++ E KE K+ ++S E+E +E
Sbjct: 206 ERQAEKAERYQELKAELRELELALLLAKLKELRKELEELEEELSRLEEELEE-------L 258
Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS 161
++E ++ +K+ +E + ++ E ++ +++ + + ++ +E
Sbjct: 259 QEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLLRE--------- 309
Query: 162 HPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
+ +E E E E + K K + + ++ ++ +
Sbjct: 310 ---------------RLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLA 354
Query: 222 ESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKD--ELFDRLKNEQADILQRKV 279
E ++ +K + E L E F+ E L+ E A+I
Sbjct: 355 ELEEAKE--------------ELEEKLSALLEELEELFEALREELAELEAELAEIRNELE 400
Query: 280 HI 281
+
Sbjct: 401 EL 402
>gnl|CDD|214395 CHL00204, ycf1, Ycf1; Provisional.
Length = 1832
Score = 38.2 bits (89), Expect = 0.016
Identities = 24/89 (26%), Positives = 42/89 (47%), Gaps = 6/89 (6%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+ + KK +K K + S EK+ ++ ++ +E KE + E E KEKK E
Sbjct: 1493 NGNENVNKKINQKKKGFIPSNEKKSIEIENRNQEEKEPAGQG---ELESDKEKKGNLESV 1549
Query: 117 HKHKDKDRERDKDE---KKEQKESKSSSK 142
+++K+ E D E KK + + + S
Sbjct: 1550 LSNQEKNIEEDYAESDIKKRKNKKQYKSN 1578
Score = 35.8 bits (83), Expect = 0.076
Identities = 22/80 (27%), Positives = 42/80 (52%), Gaps = 6/80 (7%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S+ KK + ++R++E EKE + + +S KEK+ + S + K + + S+
Sbjct: 1511 PSNEKKSIEIENRNQE-EKEPAGQGELES---DKEKKGNLESVLSNQEKNIEEDYAESDI 1566
Query: 104 EKKKEKKDKKEKSHKHKDKD 123
+K+K K K+ KS+ + D
Sbjct: 1567 KKRKNK--KQYKSNTEAELD 1584
Score = 30.1 bits (68), Expect = 4.6
Identities = 17/38 (44%), Positives = 25/38 (65%), Gaps = 2/38 (5%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
K+ E K S E ++K+KKKEK KKE+ +K ++K R
Sbjct: 731 KDAEFKISDSVEEKTKKKKKKEK--KKEEEYKREEKAR 766
>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207). This
family is found in eukaryotes; it has several conserved
tryptophan residues. The function is not known.
Length = 261
Score = 37.0 bits (86), Expect = 0.018
Identities = 34/179 (18%), Positives = 76/179 (42%), Gaps = 12/179 (6%)
Query: 26 SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS 85
S S S S + + T SS S + + K + + +E ++ +S+K+ ++ K
Sbjct: 46 DSESSSNSVPSLSLSSTASSLSDSSTYSRSLKEVKLERQA-QEAYENWLSAKQAQRQKKL 104
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
K E K+ + E+EK++E+ + +++ K K ++ R K +Q + + K
Sbjct: 105 QKLLEEKQKQ------EREKEREEAELRQRLAKEKYEEWCRQ---KAQQAAKQRTPKHKK 155
Query: 146 SSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
+ S + + P K ++ E +K K+ ++ + + KK+ +
Sbjct: 156 EAAESASSSLSGS--AKPERNVSQEEAKKRLQEWELKKLKQQQQKREEERRKQRKKQQE 212
Score = 30.9 bits (70), Expect = 1.5
Identities = 18/85 (21%), Positives = 42/85 (49%), Gaps = 6/85 (7%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES----SSEKEKKKEKKD 111
R K ++ K+ K K + S + ER S+ + E +K K+++
Sbjct: 139 RQKAQQAAKQRTPKHKKEAAESASSSLS-GSAKPERNVSQEEAKKRLQEWELKKLKQQQQ 197
Query: 112 KKEKSHKHKDKDRERDKDEKKEQKE 136
K+E+ + K + ++++++E+K++ E
Sbjct: 198 KREEE-RRKQRKKQQEEEERKQKAE 221
>gnl|CDD|218026 pfam04321, RmlD_sub_bind, RmlD substrate binding domain.
L-rhamnose is a saccharide required for the virulence of
some bacteria. Its precursor, dTDP-L-rhamnose, is
synthesised by four different enzymes, the final one of
which is RmlD. The RmlD substrate binding domain is
responsible for binding a sugar nucleotide.
Length = 284
Score = 37.2 bits (87), Expect = 0.019
Identities = 21/67 (31%), Positives = 29/67 (43%), Gaps = 13/67 (19%)
Query: 306 HVIIHAAASLRFD--ELIQD-AFTLNIQATRELLDLATRCSQLKAIL-HVSTLY------ 355
V+++AAA D E + A+ +N LA C+ A L H+ST Y
Sbjct: 51 DVVVNAAAYTAVDKAESEPELAYAVNALGPGN---LAEACAARGAPLIHISTDYVFDGAK 107
Query: 356 THSYRED 362
YRED
Sbjct: 108 GGPYRED 114
>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
subunit. This is a family of proteins which are
subunits of the eukaryotic translation initiation factor
3 (eIF3). In yeast it is called Hcr1. The Saccharomyces
cerevisiae protein eIF3j (HCR1) has been shown to be
required for processing of 20S pre-rRNA and binds to 18S
rRNA and eIF3 subunits Rpg1p and Prt1p.
Length = 242
Score = 36.9 bits (86), Expect = 0.020
Identities = 25/97 (25%), Positives = 51/97 (52%), Gaps = 3/97 (3%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
+ + KDK D + + + K+ D+E+D+ +EK K +K K+ ++K +E
Sbjct: 13 APPAKAVVKDKWDDEDEDDDVKDSWDEEEDEEK--EEEKAKVAAKAKAKKALKAKIEEKE 70
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
K +K+EK ++ + +D+ E+ + +K Q+ES
Sbjct: 71 KAKREKEEKGLRELEEDTPEDELAEKLR-LRKLQEES 106
Score = 30.4 bits (69), Expect = 2.0
Identities = 13/58 (22%), Positives = 32/58 (55%)
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
+++E++E K K ++ K KK K +EK ++K+ + ++ +++ E + + K
Sbjct: 39 EEEDEEKEEEKAKVAAKAKAKKALKAKIEEKEKAKREKEEKGLRELEEDTPEDELAEK 96
>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
TolA; Provisional.
Length = 387
Score = 37.1 bits (86), Expect = 0.022
Identities = 19/98 (19%), Positives = 42/98 (42%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
K+ E K K + ++A + + K K ++ + ++ K+ + + KKK + K+K+
Sbjct: 161 KKAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAA 220
Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
+ E K E +++K + +K A
Sbjct: 221 AEAKAAAAKAAAEAKAAAEKAAAAKAAEKAAAAKAAAE 258
Score = 36.7 bits (85), Expect = 0.032
Identities = 21/89 (23%), Positives = 35/89 (39%), Gaps = 1/89 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK-K 106
KK + + E E KK + K ++ K +K+K E+K K ++ K+K
Sbjct: 161 KKAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAA 220
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
E K K+ E+ K +K
Sbjct: 221 AEAKAAAAKAAAEAKAAAEKAAAAKAAEK 249
Score = 35.6 bits (82), Expect = 0.079
Identities = 17/90 (18%), Positives = 37/90 (41%), Gaps = 1/90 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK-ESSSEKEKK 106
KK + + + E +KK + + + +++ K+K + +K+K E+K K + ++
Sbjct: 169 KKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAAAEAKAAAA 228
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
K + K + K K E
Sbjct: 229 KAAAEAKAAAEKAAAAKAAEKAAAAKAAAE 258
Score = 34.8 bits (80), Expect = 0.11
Identities = 12/91 (13%), Positives = 44/91 (48%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+K K + ++K+++++ +E + + +E+ K + +++ K E ++++ K
Sbjct: 71 QKSAKRAEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAALK 130
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+K+ ++ + + + + K+ +K
Sbjct: 131 QKQAEEAAAKAAAAAKAKAEAEAKRAAAAAK 161
Score = 34.8 bits (80), Expect = 0.14
Identities = 28/181 (15%), Positives = 75/181 (41%), Gaps = 1/181 (0%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER-KESKPKESSSEKEKKK 107
+ ++ + E++++KK++++ + + E++++ EKER + K+ + E K+
Sbjct: 68 QQQQKSAKRAEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
K K+ + K + K E + ++ + ++ K + + E + + + A
Sbjct: 128 ALKQKQAEEAAAKAAAAAKAKAEAEAKRAAAAAKKAAAEAKKKAEAEAAKKAAAEAKKKA 187
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+K+ E E+ +K K + K E A +++ + K++
Sbjct: 188 EAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAAAEAKAAAAKAAAEAKAAAEKAAAAKAA 247
Query: 228 A 228
Sbjct: 248 E 248
>gnl|CDD|187564 cd05254, dTDP_HR_like_SDR_e, dTDP-6-deoxy-L-lyxo-4-hexulose
reductase and related proteins, extended (e) SDRs.
dTDP-6-deoxy-L-lyxo-4-hexulose reductase, an extended
SDR, synthesizes dTDP-L-rhamnose from
alpha-D-glucose-1-phosphate, providing the precursor of
L-rhamnose, an essential cell wall component of many
pathogenic bacteria. This subgroup has the
characteristic active site tetrad and NADP-binding
motif. This subgroup also contains human MAT2B, the
regulatory subunit of methionine adenosyltransferase
(MAT); MAT catalyzes S-adenosylmethionine synthesis. The
human gene encoding MAT2B encodes two major splicing
variants which are induced in human cell liver cancer
and regulate HuR, an mRNA-binding protein which
stabilizes the mRNA of several cyclins, to affect cell
proliferation. Both MAT2B variants include this extended
SDR domain. Extended SDRs are distinct from classical
SDRs. In addition to the Rossmann fold (alpha/beta
folding pattern with a central beta-sheet) core region
typical of all SDRs, extended SDRs have a less conserved
C-terminal extension of approximately 100 amino acids.
Extended SDRs are a diverse collection of proteins, and
include isomerases, epimerases, oxidoreductases, and
lyases; they typically have a TGXXGXXG cofactor binding
motif. SDRs are a functionally diverse family of
oxidoreductases that have a single domain with a
structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn contributing to the active site,
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 280
Score = 36.8 bits (86), Expect = 0.022
Identities = 31/148 (20%), Positives = 49/148 (33%), Gaps = 48/148 (32%)
Query: 300 FIQHHIHVIIHAAASLRFDELIQD---AFTLNIQATRELLDLATRCSQLKAIL-HVSTLY 355
+ VII+ AA R D+ D A+ +N+ A L ++ A L H+ST Y
Sbjct: 51 IRDYKPDVIINCAAYTRVDKCESDPELAYRVNVLAPENLARA---AKEVGARLIHISTDY 107
Query: 356 -----THSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKA 410
Y+E ED + + N Y +K
Sbjct: 108 VFDGKKGPYKE-------------EDAPNPL---------------------NVYGKSKL 133
Query: 411 IGESVVEKYLYKLPLAMVRPSIVVSTWK 438
+GE V + ++R S + K
Sbjct: 134 LGEVAVLNANPR--YLILRTSWLYGELK 159
>gnl|CDD|191187 pfam05087, Rota_VP2, Rotavirus VP2 protein. Rotavirus particles
consist of three concentric proteinaceous capsid layers.
The innermost capsid (core) is made of VP2. The genomic
RNA and the two minor proteins VP1 and VP3 are
encapsidated within this layer. The N-terminus of
rotavirus VP2 is necessary for the encapsidation of VP1
and VP3.
Length = 887
Score = 37.6 bits (87), Expect = 0.023
Identities = 21/90 (23%), Positives = 45/90 (50%), Gaps = 2/90 (2%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ + + DR +EK+ EK+D +K++ + K +K + + K + S +
Sbjct: 8 EANINNNDRMQEKDDEKQD-QKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQENLKIAD 66
Query: 108 E-KKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E KK KE+S + + + +++ +K+ Q E
Sbjct: 67 EVKKSTKEESKQLLEVLKTKEEHQKEIQYE 96
Score = 31.4 bits (71), Expect = 1.5
Identities = 19/91 (20%), Positives = 32/91 (35%), Gaps = 14/91 (15%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
+E D +EK+ +K +K R E K EK +KK++ +
Sbjct: 5 NRREANINNND----RMQEKDDEKQD--QKNRMELK--------EKVLDKKEEVVTDNVD 50
Query: 120 KDKDRERDKDEKKEQKESKSSSKIVSSSHNS 150
+ ++ K E K S+K S
Sbjct: 51 SPVKEQSSQENLKIADEVKKSTKEESKQLLE 81
Score = 30.3 bits (68), Expect = 3.6
Identities = 17/80 (21%), Positives = 40/80 (50%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
N + K D+K +++ + KEK +K++ + + + SS+E + + K+
Sbjct: 11 INNNDRMQEKDDEKQDQKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQENLKIADEVKK 70
Query: 99 SSSEKEKKKEKKDKKEKSHK 118
S+ E+ K+ + K ++ H+
Sbjct: 71 STKEESKQLLEVLKTKEEHQ 90
Score = 29.9 bits (67), Expect = 4.7
Identities = 12/59 (20%), Positives = 26/59 (44%), Gaps = 1/59 (1%)
Query: 84 VSSKEKERKESKPKESSSEKE-KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
++ + + + EK+ +K+++K++ E K DK E D + +SS
Sbjct: 1 MAYRNRREANINNNDRMQEKDDEKQDQKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQ 59
>gnl|CDD|217503 pfam03344, Daxx, Daxx Family. The Daxx protein (also known as the
Fas-binding protein) is thought to play a role in
apoptosis, but the precise role played by Daxx remains to be
determined. Daxx forms a complex with Axin.
Length = 715
Score = 37.2 bits (86), Expect = 0.023
Identities = 21/88 (23%), Positives = 50/88 (56%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
K D E+E+ +K +E+++ SS+ + K SS E +ES E+ ++E++++
Sbjct: 393 MKQDDTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMASQESEEEESVEEEEEEE 452
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSS 140
+E+ + ++ + E +DE++E++ +
Sbjct: 453 EEEEEEEQESEEEEGEDEEEEEEVEADN 480
Score = 31.8 bits (72), Expect = 1.2
Identities = 31/157 (19%), Positives = 69/157 (43%), Gaps = 16/157 (10%)
Query: 50 DKKDKDRDKEKEKEKKDKEK---DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
D ++++R K +E+E++ D S SS E ++S+E E +ES +E E+E++
Sbjct: 397 DTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMASQESEEEESVEEEEEEEEEEE 456
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKE---SKSSSKIVSSSHNSKEPASGSQLISHP 163
+E+++ +E+ + ++++ E + D E++ S+ +++ S IS
Sbjct: 457 EEEQESEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEEDAERRNSEMAGISRM 516
Query: 164 PP----------PAPTPTQKSPVKTKEKEKEKESSTT 190
P + ++ + E E S
Sbjct: 517 SEGQQPRGSSVQPESPQEEPLQPESMDAESVGEESDE 553
>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein. Function of MutS2 is
unknown. It should not be considered a DNA mismatch
repair protein. It is likely a DNA mismatch binding
protein of unknown cellular function [DNA metabolism,
Other].
Length = 771
Score = 37.1 bits (86), Expect = 0.026
Identities = 41/145 (28%), Positives = 66/145 (45%), Gaps = 13/145 (8%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
++ K+++R+K+ E EK+ +E K+ KE E KEK+ ++K +S + K KE
Sbjct: 553 EELKERERNKKLELEKEAQEALKALK--KEVESIIRELKEKKIHKAKEIKSIEDLVKLKE 610
Query: 109 KKDK------KEKSHKHKDKDRERDKDEKKEQKESKSSSKI-VSSSH-NSKEPASGSQLI 160
K K ++ K DK R R +K + + +K V+ K S + I
Sbjct: 611 TKQKIPQKPTNFQADKIGDKVRIRYFGQKGKIVQILGGNKWNVTVGGMRMKVHGSELEKI 670
Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEK 185
+ PPP K P TK + KE
Sbjct: 671 NKAPPPKKF---KVPKTTKPEPKEA 692
Score = 36.3 bits (84), Expect = 0.051
Identities = 30/96 (31%), Positives = 48/96 (50%), Gaps = 5/96 (5%)
Query: 43 NSSSSKKDKKDKDRDKE---KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
S+ +K+ + K+ E KE+EK KE ++ KE+E++K EKE +E K
Sbjct: 519 KLSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERERNKKLELEKEAQE-ALKAL 577
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
E E ++ K++K HK K+ D + KE K
Sbjct: 578 KKEVE-SIIRELKEKKIHKAKEIKSIEDLVKLKETK 612
Score = 30.6 bits (69), Expect = 2.9
Identities = 25/127 (19%), Positives = 53/127 (41%), Gaps = 9/127 (7%)
Query: 68 EKDKSAVSSKEKEKDKV----SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
E+ K+ ++E + + S+ EKE ++ KE++K KK+ +++ + K+++
Sbjct: 500 EQAKTFYGEFKEEINVLIEKLSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERE 559
Query: 124 RERDKDEKKEQKES-----KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
R + + +KE +E+ K I+ K + QK P K
Sbjct: 560 RNKKLELEKEAQEALKALKKEVESIIRELKEKKIHKAKEIKSIEDLVKLKETKQKIPQKP 619
Query: 179 KEKEKEK 185
+ +K
Sbjct: 620 TNFQADK 626
>gnl|CDD|176924 cd09071, FAR_C, C-terminal domain of fatty acyl CoA reductases.
C-terminal domain of fatty acyl CoA reductases, a family
of SDR-like proteins. SDRs or short-chain
dehydrogenases/reductases are Rossmann-fold
NAD(P)H-binding proteins. Many proteins in this FAR_C
family may function as fatty acyl-CoA reductases (FARs),
acting on medium and long chain fatty acids, and have
been reported to be involved in diverse processes such
as the biosynthesis of insect pheromones, plant
cuticular wax production, and mammalian wax
biosynthesis. In Arabidopsis thaliana, proteins with
this particular architecture have also been identified
as the MALE STERILITY 2 (MS2) gene product, which is
implicated in male gametogenesis. Mutations in MS2
inhibit the synthesis of exine (sporopollenin),
rendering plants unable to reduce pollen wall fatty
acids to corresponding alcohols. The function of this
C-terminal domain is unclear.
Length = 92
Score = 34.1 bits (79), Expect = 0.026
Identities = 10/23 (43%), Positives = 17/23 (73%)
Query: 574 FLHMIPGMIMDTVLRCLNKPPRI 596
FLH++P ++D +LR L + PR+
Sbjct: 1 FLHLLPAYLLDLLLRLLGRKPRL 23
>gnl|CDD|240433 PTZ00482, PTZ00482, membrane-attack complex/perforin (MACPF)
Superfamily; Provisional.
Length = 844
Score = 37.2 bits (86), Expect = 0.028
Identities = 28/146 (19%), Positives = 55/146 (37%), Gaps = 14/146 (9%)
Query: 23 DKDSSAIPSTSTSSSTSNPTNSSSSKKDK-KDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
D + A +TS SST + + +D+ D + ++ + + S K+
Sbjct: 100 DDEDDAGNATSGESSTDDDSLLELPDRDEDADTQANNDQTNDFDQDDSSNSQTDQGLKQS 159
Query: 82 DKVSSKEKERKESKPKESS--------SEKEKKKEKKDKKEKSHKHK-----DKDRERDK 128
+SS EK +E K + + ++ E+ K K KS D +
Sbjct: 160 VNLSSAEKLIEEKKGQTENTFKFYNFGNDGEEAAAKDGGKSKSSDPGPLNDSDGQGDDGD 219
Query: 129 DEKKEQKESKSSSKIVSSSHNSKEPA 154
E E+ ++ S+++ + S P
Sbjct: 220 PESAEEDKAASNTRAAYTKATSVFPG 245
Score = 32.5 bits (74), Expect = 0.78
Identities = 26/164 (15%), Positives = 60/164 (36%), Gaps = 6/164 (3%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
++ N S D D + D E ++ D S SS + + E +++
Sbjct: 77 ASFLNQRKSLDDDDDDEFDFLYEDDEDDAGNATSGESSTDDDSLLELPDRDEDADTQANN 136
Query: 99 SSSEKEKKKEKK-DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
+ + + + ++ K +K ++++ +++++ K + N E A+
Sbjct: 137 DQTNDFDQDDSSNSQTDQGLKQSVNLSSAEKLIEEKKGQTENTFKF-YNFGNDGEEAAAK 195
Query: 158 QL-ISHPPPPAPTPT---QKSPVKTKEKEKEKESSTTHDKHSKH 197
S P P Q + E++K +S T ++K
Sbjct: 196 DGGKSKSSDPGPLNDSDGQGDDGDPESAEEDKAASNTRAAYTKA 239
>gnl|CDD|218734 pfam05758, Ycf1, Ycf1. The chloroplast genomes of most higher
plants contain two giant open reading frames designated
ycf1 and ycf2. Although the function of Ycf1 is unknown,
it is known to be an essential gene.
Length = 832
Score = 37.3 bits (87), Expect = 0.028
Identities = 21/71 (29%), Positives = 33/71 (46%), Gaps = 2/71 (2%)
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
KK KE S +E+E D E K +K ++ S +E ++KE K +D D
Sbjct: 224 KKLKET--SETEEREEETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDPDKTEDLD 281
Query: 124 RERDKDEKKEQ 134
+ EKK++
Sbjct: 282 KLEILKEKKDE 292
Score = 31.5 bits (72), Expect = 1.5
Identities = 20/79 (25%), Positives = 36/79 (45%), Gaps = 9/79 (11%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
KK K + E E++++E D E E + K+ +E +E S ++KE
Sbjct: 224 KKLK---ETSETEEREEETD------VEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDP 274
Query: 111 DKKEKSHKHKDKDRERDKD 129
DK E K + ++D++
Sbjct: 275 DKTEDLDKLEILKEKKDEE 293
Score = 31.1 bits (71), Expect = 2.0
Identities = 22/87 (25%), Positives = 35/87 (40%), Gaps = 4/87 (4%)
Query: 29 IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
IPS T +S +++ +++ D + E E K K + S++E KE
Sbjct: 217 IPS---PFFTKKLKETSETEEREEETDVEIETTSETK-GTKQEQEGSTEEDPSLFSEEKE 272
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEK 115
K + KEKK E+ EK
Sbjct: 273 DPDKTEDLDKLEILKEKKDEELFWFEK 299
Score = 30.4 bits (69), Expect = 3.4
Identities = 17/72 (23%), Positives = 33/72 (45%), Gaps = 4/72 (5%)
Query: 81 KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
K K +S+ +ER+E E+ E E E K K++ ++D +EK++ +++
Sbjct: 225 KLKETSETEEREE----ETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDPDKTEDL 280
Query: 141 SKIVSSSHNSKE 152
K+ E
Sbjct: 281 DKLEILKEKKDE 292
>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis). This nucleolar
family of proteins is involved in 60S ribosomal
biogenesis. They are specifically involved in the
processing beyond the 27S stage of 25S rRNA maturation.
This family contains sequences that bear similarity to
the glioma tumour suppressor candidate region gene 2
protein (p60). This protein has been found to interact
with herpes simplex type 1 regulatory proteins.
Length = 387
Score = 36.6 bits (85), Expect = 0.029
Identities = 23/92 (25%), Positives = 43/92 (46%), Gaps = 1/92 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS-SKEKERKESKPKESSSEKEKK 106
D +++ D E E + E + + K K K +KEK RKE + + ++ KK
Sbjct: 245 SDDDGEEESDDESAWEGFESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQLKK 304
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
K + + K + +E+ + KKEQ++ +
Sbjct: 305 KLAQLARLKEIAKEVAQKEKARARKKEQRKER 336
Score = 31.2 bits (71), Expect = 1.6
Identities = 21/94 (22%), Positives = 42/94 (44%), Gaps = 6/94 (6%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKE------KEKDKVSSKEKERKESKPKESSSEKE 104
K +K R + + E+K EK S + E+ +E+ ES + SE E
Sbjct: 208 KAEKKRQELERVEEKKLEKMAPEASRLDEMSEGLLEESDDDGEEESDDESAWEGFESEYE 267
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ K K+ ++K++ R + E++ ++E +
Sbjct: 268 PINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQ 301
Score = 30.1 bits (68), Expect = 3.3
Identities = 24/100 (24%), Positives = 42/100 (42%), Gaps = 14/100 (14%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
S S + R K K K +++KEK + KE ++ + KE K+ K K
Sbjct: 257 SAWEGFESEYEPINKPVRPKRKTKAQRNKEK-------RRKELER---EAKEEKQLKKKL 306
Query: 99 SSSEK----EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+ + K+ +K+K K + K+R K K+ +
Sbjct: 307 AQLARLKEIAKEVAQKEKARARKKEQRKERGEKKKLKRRK 346
>gnl|CDD|220839 pfam10661, EssA, WXG100 protein secretion system (Wss), protein
EssA. The WXG100 protein secretion system (Wss) is
responsible for the secretion of WXG100 proteins
(pfam06013) such as ESAT-6 and CFP-10 in Mycobacterium
tuberculosis or EsxA and EsxB in Staphylococcus aureus.
In S. aureus, the Wss seems to be encoded by a locus of
eight CDS, called ess (eSAT-6 secretion system). This
locus encodes, amongst several other proteins, EssA, a
protein predicted to possess one transmembrane domain.
Due to its predicted membrane location and its absolute
requirement for WXG100 protein secretion, it has been
speculated that EssA could form a secretion apparatus in
conjunction with the polytopic membrane protein EsaA,
YukC (pfam10140) and YukAB, which is a membrane-bound
ATPase containing FtsK/SpoIIIE domains (pfam01580)
called EssC in S. aureus and Snm1/Snm2 in Mycobacterium
tuberculosis. Proteins homologous to EssA, YukC, EsaA
and YukD seem absent from mycobacteria.
Length = 145
Score = 35.2 bits (81), Expect = 0.033
Identities = 25/119 (21%), Positives = 50/119 (42%), Gaps = 7/119 (5%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVS-----SKEKEKDKVSSK 87
S + S K D+ K ++ EK+ ++ E DK + ++E+ K +++
Sbjct: 3 SAADSYLEDDGKMQFKVDRLQKTDQEKNEKKLRETELDKLGIELFTTETEEEINKKKNAE 62
Query: 88 EKERKESKPKESSSEKEKKKEKKDKKEK--SHKHKDKDRERDKDEKKEQKESKSSSKIV 144
+KE ++ + S +KE K+ K+ S +++ E E S S S +
Sbjct: 63 QKEMEDIENSLFSEDKEGNVAVKETKDSLFSSEYEVTSNEAASSGNAETSTSSSISNTI 121
>gnl|CDD|227507 COG5180, PBP1, Protein interacting with poly(A)-binding protein
[RNA processing and modification].
Length = 654
Score = 36.6 bits (84), Expect = 0.036
Identities = 47/194 (24%), Positives = 72/194 (37%), Gaps = 18/194 (9%)
Query: 34 TSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
SS+ SN N K + ++K + + + ++ + VS+ E +
Sbjct: 240 ESSNASNKENRQEKPAAAKQPHHMDDDGTKRKMVIEIEGLSLLENRKPEAVSAPEAVSPQ 299
Query: 94 SKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP 153
SK + SS +EK+K+ K+KK S+ K D K E S +S E
Sbjct: 300 SKSEGPSSGQEKEKQIKEKKSFSYGWKHTKF----DSSKNLLEVIKSKFKSLFDISSGEL 355
Query: 154 ASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK-------KDKHG 206
GS+ PP A + + V +KE + S K KH + HG
Sbjct: 356 KWGSK----PPWEAKAVSIATKVSKPKKESVRSGSKAAKKSPSTKHTTRSSTSLRRRNHG 411
Query: 207 DK---TNPKEKDAK 217
NP DAK
Sbjct: 412 SFFGAKNPHTNDAK 425
Score = 29.7 bits (66), Expect = 5.9
Identities = 39/196 (19%), Positives = 61/196 (31%), Gaps = 16/196 (8%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
K +++ H K K I S + S+ + K +EK
Sbjct: 252 EKPAAAKQPHHMDDDGTKRKMVIEIEGLSLLENRKPEAVSAPEAVSPQSKSEGPSSGQEK 311
Query: 65 KDKEKDKSAVSSKEKEKDKVSSKE----KERKESKPKESSSEKEKKKEKKDKKEK----- 115
+ + K+K + S K SSK + K + SS + K K + K
Sbjct: 312 EKQIKEKKSFSYGWKHTKFDSSKNLLEVIKSKFKSLFDISSGELKWGSKPPWEAKAVSIA 371
Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQL---ISHPPPPAPTPTQ 172
+ K K K +K + SS+ + GS H
Sbjct: 372 TKVSKPKKESVRSGSKAAKKSPSTKHTTRSSTSLRRR-NHGSFFGAKNPHTNDAKRVLFG 430
Query: 173 KS---PVKTKEKEKEK 185
KS +K+KE EK
Sbjct: 431 KSFNMFIKSKEAHDEK 446
>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein.
Length = 529
Score = 36.7 bits (85), Expect = 0.036
Identities = 19/63 (30%), Positives = 34/63 (53%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
KE EKE D+E+++ KE+E+ +E+ +E + +E + +K KE + E +
Sbjct: 29 KEVEKEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTTEWELLN 88
Query: 118 KHK 120
K K
Sbjct: 89 KTK 91
Score = 33.2 bits (76), Expect = 0.38
Identities = 15/52 (28%), Positives = 30/52 (57%), Gaps = 3/52 (5%)
Query: 84 VSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
KE E++ +E ++EKK+E++ +K ++ D E +K+EKK++
Sbjct: 26 WVEKEVEKEVPDEEEEEEKEEKKEEEEKTTDKE---EEVDEEEEKEEKKKKT 74
Score = 32.0 bits (73), Expect = 0.93
Identities = 18/56 (32%), Positives = 38/56 (67%), Gaps = 3/56 (5%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
K+ D++ ++EKE++K+++EK ++E++K +EK++K K KE+++E E
Sbjct: 33 KEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEK---EEKKKKTKKVKETTTEWE 85
Score = 31.3 bits (71), Expect = 1.6
Identities = 16/71 (22%), Positives = 39/71 (54%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
K++KK+++ ++E+ D+E++K K K+ + +++ + ++KP + + K+
Sbjct: 42 EEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTTEWELLNKTKPIWTRNPKDV 101
Query: 106 KKEKKDKKEKS 116
KE+ KS
Sbjct: 102 TKEEYAAFYKS 112
Score = 30.5 bits (69), Expect = 2.9
Identities = 18/53 (33%), Positives = 34/53 (64%), Gaps = 4/53 (7%)
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
KE + + E+E++KE+K ++E+ K D+E + DE++E++E K +K V
Sbjct: 29 KEVEKEVPDEEEEEEKEEKKEEEE----KTTDKEEEVDEEEEKEEKKKKTKKV 77
>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing. This is a family of
proteins that are involved in rRNA processing. In a
localisation study they were found to localise to the
nucleus and nucleolus. The family also includes other
eukaryotic members, from plants to mammals, where the protein
has been named BR22 and is associated with TTF-1,
thyroid transcription factor 1. In the lungs, the family
binds TTF-1 to form a complex which influences the
expression of the key lung surfactant protein-B (SP-B)
and -C (SP-C), the small hydrophobic surfactant proteins
that maintain surface tension in alveoli.
Length = 150
Score = 34.9 bits (80), Expect = 0.040
Identities = 17/76 (22%), Positives = 41/76 (53%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+ +++ +++ KS+ ++ EK K ++KE + + +E ++ K++K+ +K
Sbjct: 43 EKEGYAVPEKESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKRQKELEK 102
Query: 114 EKSHKHKDKDRERDKD 129
+ K K K+RER +
Sbjct: 103 IELSKKKQKERERRRK 118
Score = 32.9 bits (75), Expect = 0.22
Identities = 27/97 (27%), Positives = 50/97 (51%), Gaps = 1/97 (1%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
D+ K K+ +E K KE ++ +K+ + EKE + P++ S+EK+
Sbjct: 1 MGSVDQNQKKNGKKFTREYKVKEIQRNLTKKARLKKEYLKLLEKE-GYAVPEKESAEKQV 59
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
K K+D+K + K D+ +E K K+EQ+E + + +
Sbjct: 60 KSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKR 96
Score = 32.2 bits (73), Expect = 0.31
Identities = 24/83 (28%), Positives = 44/83 (53%)
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
+ EK SSKE + E K K ++ K+ K++++EK + K+ E+ + KK+QKE
Sbjct: 53 ESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKRQKELEKIELSKKKQKE 112
Query: 137 SKSSSKIVSSSHNSKEPASGSQL 159
+ K ++ S +P G ++
Sbjct: 113 RERRRKKLTKKTKSGQPLMGPRI 135
>gnl|CDD|220818 pfam10595, UPF0564, Uncharacterized protein family UPF0564. This
family of proteins has no known function. However, one
of the members is annotated as an EF-hand family
protein.
Length = 349
Score = 36.3 bits (84), Expect = 0.042
Identities = 42/209 (20%), Positives = 72/209 (34%), Gaps = 16/209 (7%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKE-KEKEKKDKEKDKSAVSSKE 78
K S+ + P S KDK +++E K K + + SS+
Sbjct: 104 ILPRKLRSSTSEREPKKFKAKPVPKSIYIPLLKDKMQEEELKRKIRVQMRAQELLQSSRL 163
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES- 137
+ ++ + K + K KKK K K+ KS +K E+ + + E+K+S
Sbjct: 164 PPRMAKHEAQERLTKKKKRGQKKSKYKKKTFKPKRAKSIPDFEKLHEKFQKQLAEKKKSK 223
Query: 138 --------------KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEK 183
KSSS+ N + PP+ + + + K
Sbjct: 224 RPTVPEPFNFQESHKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERK 283
Query: 184 EKESSTTHDKHSKHKHKKKDKHGDKTNPK 212
EKE+ +K + KKK K +
Sbjct: 284 EKEAKEQQEKKELEQRKKKKKEMAPKVKQ 312
Score = 34.0 bits (78), Expect = 0.20
Identities = 26/105 (24%), Positives = 45/105 (42%), Gaps = 13/105 (12%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
S + D++ E+ K + S +K + V ERKE + KE +KE
Sbjct: 236 SHKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERKEKEAKEQQEKKE 295
Query: 105 KKKEKKDKKEKSH-------------KHKDKDRERDKDEKKEQKE 136
++ KK KKE + K +++ +E+ +KE+KE
Sbjct: 296 LEQRKKKKKEMAPKVKQRFEANDPAQKLQEERKEQLAKLRKEEKE 340
Score = 32.1 bits (73), Expect = 0.78
Identities = 19/82 (23%), Positives = 43/82 (52%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
S+KK + + + KEK+ KE+ + + K+K K + + +++ + +E+
Sbjct: 267 STKKWESLVKFLRTERKEKEAKEQQEKKELEQRKKKKKEMAPKVKQRFEANDPAQKLQEE 326
Query: 106 KKEKKDKKEKSHKHKDKDRERD 127
+KE+ K K K ++K+ E++
Sbjct: 327 RKEQLAKLRKEEKEREKEYEQE 348
Score = 30.1 bits (68), Expect = 3.2
Identities = 24/116 (20%), Positives = 52/116 (44%), Gaps = 3/116 (2%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
HK+ + + S PT KK + K E+K+KE + +
Sbjct: 236 SHKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERKEKEAKEQQEKKE 295
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
+++ K K+KE + + +K ++++KE+ K + +++ER+K+ ++E
Sbjct: 296 LEQRKK---KKKEMAPKVKQRFEANDPAQKLQEERKEQLAKLRKEEKEREKEYEQE 348
Score = 29.7 bits (67), Expect = 3.9
Identities = 27/174 (15%), Positives = 71/174 (40%), Gaps = 22/174 (12%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK--- 115
E+ +E++++ ++KS +K + +E+K++ ++E K K K
Sbjct: 68 EQNEERREEVREKSKAILLSSQKPFSFYEREEQKKAILPRKLRSSTSEREPKKFKAKPVP 127
Query: 116 --SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQK 173
+ KD+ ++++ K++ + + +++ SS A
Sbjct: 128 KSIYIPLLKDKMQEEELKRKIRVQMRAQELLQSSRLPPRMAKHEA--------------- 172
Query: 174 SPVKTKEKEKEKESSTTHDKHS-KHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
+ +K+K + + + K + K K K +K + K + +++K+S +
Sbjct: 173 -QERLTKKKKRGQKKSKYKKKTFKPKRAKSIPDFEKLHEKFQKQLAEKKKSKRP 225
>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1). This family
consists of several mammalian dentin matrix protein 1
(DMP1) sequences. The dentin matrix acidic
phosphoprotein 1 (DMP1) gene has been mapped to human
chromosome 4q21. DMP1 is a bone and teeth specific
protein initially identified from mineralised dentin.
DMP1 is primarily localised in the nuclear compartment
of undifferentiated osteoblasts. In the nucleus, DMP1
acts as a transcriptional component for activation of
osteoblast-specific genes like osteocalcin. During the
early phase of osteoblast maturation, Ca(2+) surges into
the nucleus from the cytoplasm, triggering the
phosphorylation of DMP1 by a nuclear isoform of casein
kinase II. This phosphorylated DMP1 is then exported out
into the extracellular matrix, where it regulates
nucleation of hydroxyapatite. DMP1 is a unique molecule
that initiates osteoblast differentiation by
transcription in the nucleus and orchestrates
mineralised matrix formation extracellularly, at later
stages of osteoblast maturation. The DMP1 gene has been
found to be ectopically expressed in lung cancer
although the reason for this is unknown.
Length = 514
Score = 36.6 bits (84), Expect = 0.042
Identities = 32/149 (21%), Positives = 66/149 (44%), Gaps = 4/149 (2%)
Query: 10 SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEK 69
SS S+ + +++ S + + S NP N++S +D++D + +E + +
Sbjct: 332 SSESSQEADLPSQENSSESQEEVVSESRGDNPDNTTSHSEDQEDSESSEEDSLDTPSSSE 391
Query: 70 DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE----KKKEKKDKKEKSHKHKDKDRE 125
+S + E ++ S +E ES E+SS +E + + ++S +D E
Sbjct: 392 SQSTEEQADSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSE 451
Query: 126 RDKDEKKEQKESKSSSKIVSSSHNSKEPA 154
D + ++ SK S S+ +S+E
Sbjct: 452 EDDSDSQDSSRSKEDSNSTESASSSEEDG 480
Score = 29.6 bits (66), Expect = 4.7
Identities = 32/212 (15%), Positives = 76/212 (35%), Gaps = 25/212 (11%)
Query: 31 STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
+ S ++ S S++ + + ++ +E + ++ ++ SS+ ++ + S+E
Sbjct: 288 TMEVKSDSTENAGLSQSREHSRSESQEDSEENQSQEDSQEVQDPSSESSQEADLPSQENS 347
Query: 91 RKESKPKESSSEKE-----------------KKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
S S++E + +E + E+ E E++
Sbjct: 348 --------SESQEEVVSESRGDNPDNTTSHSEDQEDSESSEEDSLDTPSSSESQSTEEQA 399
Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
ES S S S E + S A T ++ ++++ + +E +
Sbjct: 400 DSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSEEDDSDSQD 459
Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
S+ K ++ ++ K+ E ES K
Sbjct: 460 SSRSKEDSNSTESASSSEEDGQPKNTEIESRK 491
Score = 29.6 bits (66), Expect = 5.5
Identities = 43/234 (18%), Positives = 89/234 (38%), Gaps = 9/234 (3%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
V S S+H + D+ + ST S N SS+ K K+ K D+E+ +
Sbjct: 194 VGGGSEGESSHGDGSEFDDEGMQSDDPESTRSERGNSRMSSAGLKSKESKGEDEEQASTQ 253
Query: 65 KDKEKDKSAVSSKE-------KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
E S++ E+D + +S+ + ++ + +S
Sbjct: 254 DSGESQSVEYPSRKFFRKSRISEEDGRGELDDSNTMEVKSDSTENAGLSQSREHSRSESQ 313
Query: 118 KHKDKDRERDKDEKKEQKESKSSSKI-VSSSHNSKEPASGSQLISHPPPPAPTPTQKSPV 176
+ ++++ ++ ++ + S+SS + + S NS E S P T +
Sbjct: 314 EDSEENQSQEDSQEVQDPSSESSQEADLPSQENSSESQEEVVSESRGDNPDNTTSHSEDQ 373
Query: 177 KTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKT-NPKEKDAKSKEKESHKSSAG 229
+ E +E T S+ ++ D +++ + E+ +S E E+ S G
Sbjct: 374 EDSESSEEDSLDTPSSSESQSTEEQADSESNESLSSSEESPESTEDENSSSQEG 427
Score = 28.9 bits (64), Expect = 9.3
Identities = 23/78 (29%), Positives = 43/78 (55%), Gaps = 1/78 (1%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPS-TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
+S+ S SS+ SP +D++SS+ S S+ST + + S S++D + ++ D + +
Sbjct: 402 ESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSEEDDSDSQDSS 461
Query: 65 KDKEKDKSAVSSKEKEKD 82
+ KE S S+ E+D
Sbjct: 462 RSKEDSNSTESASSSEED 479
>gnl|CDD|227600 COG5275, COG5275, BRCT domain type II [General function prediction
only].
Length = 276
Score = 35.9 bits (82), Expect = 0.044
Identities = 18/80 (22%), Positives = 27/80 (33%), Gaps = 1/80 (1%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
ST+ S+ S+ K +KE K K+ + + K V RK
Sbjct: 14 STTPDEYFEQQSTRSRS-KPRIISNKETTTSKDVVHPVKTELDTTSDSKPVVHQTRATRK 72
Query: 93 ESKPKESSSEKEKKKEKKDK 112
++PK S K K
Sbjct: 73 PAQPKAEKSTTSKSKSHTTT 92
>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
recombination, and repair].
Length = 908
Score = 36.3 bits (84), Expect = 0.050
Identities = 38/301 (12%), Positives = 103/301 (34%), Gaps = 41/301 (13%)
Query: 50 DKKDKDRDKEKEKEKKDKEK----DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+K +K + KE K+ K K + E +D + + E+E KE K E E+++
Sbjct: 167 EKYEKLSELLKEVIKEAKAKIEELEGQLSELLEDIEDLLEALEEELKELKKLEEIQEEQE 226
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
++E + + E + + E + ++ + + + +E
Sbjct: 227 EEELEQEIEALEERLAELEEEKERLEELKARLLEIESLELEALKIRE------------- 273
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
+ ++ +E E++ + + + +++ G + + E+ K
Sbjct: 274 -----EELRELERLLEELEEKIERLEELEREIEELEEELEGLR-----ALLEELEELLEK 323
Query: 226 SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGD 285
+ L + K ++ + + E KNE A +L+ ++ +
Sbjct: 324 LKS--------------LEERLEKLEEKLEKLESELEELAEEKNELAKLLEERLKELEER 369
Query: 286 ISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQL 345
+ + + ++ Q + +++ + +EL +L +L
Sbjct: 370 LEELEKELEKALERLKQLEEAIQELKEELAELSAALEEIQEELEELEKELEELERELEEL 429
Query: 346 K 346
+
Sbjct: 430 E 430
Score = 33.2 bits (76), Expect = 0.47
Identities = 15/88 (17%), Positives = 42/88 (47%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
+ + + +E+ +K++ + + + EKE ++ + E E + +EK ++ +
Sbjct: 480 ELELEELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKEELEEKLEKLE 539
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ E+ + K+K + + E+ Q E +
Sbjct: 540 NLLEELEELKEKLQLQQLKEELRQLEDR 567
Score = 33.2 bits (76), Expect = 0.53
Identities = 14/92 (15%), Positives = 43/92 (46%), Gaps = 1/92 (1%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
++ + + E++ E+ + E + + +E+ + +E E+ E + ++ E E+
Sbjct: 649 EELLQAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEELEQLEEELEQLREELEELL 708
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
+K + + + + R+ + +E K++ E
Sbjct: 709 KKL-GEIEQLIEELESRKAELEELKKELEKLE 739
Score = 29.3 bits (66), Expect = 7.8
Identities = 16/88 (18%), Positives = 42/88 (47%), Gaps = 3/88 (3%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDK--SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
++ +++ +EKE+ + +E ++ + E+E ++ E+ KE ++ + +
Sbjct: 484 EELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKEELEEKLEKLENLLE 543
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
E ++ KEK + E + E + Q+
Sbjct: 544 ELEELKEKLQL-QQLKEELRQLEDRLQE 570
>gnl|CDD|240403 PTZ00400, PTZ00400, DnaK-type molecular chaperone; Provisional.
Length = 663
Score = 36.3 bits (84), Expect = 0.051
Identities = 26/100 (26%), Positives = 51/100 (51%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
KE E+ K+ EK K V +K + + + S EK+ + K K S ++K++ K+K K +
Sbjct: 551 KEAEEYKEQDEKKKELVDAKNEAETLIYSVEKQLSDLKDKISDADKDELKQKITKLRSTL 610
Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
+D D +DK ++ ++ K S + ++ + + S
Sbjct: 611 SSEDVDSIKDKTKQLQEASWKISQQAYKQGNSDNQQSEQS 650
>gnl|CDD|200340 TIGR03927, T7SS_EssA_Firm, type VII secretion protein EssA.
Members of this family are associated with type VII
secretion of WXG100 family targets in the Firmicutes,
but not in the Actinobacteria. This highly divergent
protein family consists largely of a central region of
highly polar low-complexity sequence containing
occasional LF motifs in weak repeats about 17 residues
in length, flanked by hydrophobic N- and C-terminal
regions [Protein fate, Protein and peptide secretion and
trafficking].
Length = 150
Score = 34.7 bits (80), Expect = 0.054
Identities = 18/114 (15%), Positives = 52/114 (45%), Gaps = 2/114 (1%)
Query: 28 AIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK 87
+PS +++ ++ +KKD + + + +E+ + +K+ ++K + +
Sbjct: 10 FLPSADSAAENDGKLQVKPNRYEKKDIEINTDYLQEETELDKELFTPEEQKKITFQKHKE 69
Query: 88 EKERKESKPKESSSEKEKKKEKKDKKEK--SHKHKDKDRERDKDEKKEQKESKS 139
+ E++E K + S + K K++ S +++ + ++E K++ S
Sbjct: 70 KPEQEELKNQLFSENATENNTVKATKKQLFSSEYEQTSSSSESTSEEETKKTSS 123
>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein. The YqfQ-like protein family
includes the B. subtilis YqfQ protein, also known as
VrrA, which is functionally uncharacterized. This family
of proteins is found in bacteria. Proteins in this
family are typically between 146 and 237 amino acids in
length. There are two conserved sequence motifs: QYGP
and PKLY.
Length = 155
Score = 34.7 bits (80), Expect = 0.055
Identities = 21/69 (30%), Positives = 33/69 (47%), Gaps = 3/69 (4%)
Query: 29 IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
I +SS S D+ +++ E + E K+K+K + EKEK K ++
Sbjct: 87 IFRELSSSDDEEEETEEEST-DETEQEDPPETKTESKEKKKREVPKPKTEKEKPK--TEP 143
Query: 89 KERKESKPK 97
K+ K SKPK
Sbjct: 144 KKPKPSKPK 152
Score = 32.8 bits (75), Expect = 0.20
Identities = 15/63 (23%), Positives = 30/63 (47%), Gaps = 4/63 (6%)
Query: 76 SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
S ++++ + +E + + E K E K+KK++ ++E+ K E K+ K
Sbjct: 92 SSSDDEEEETEEESTDETEQEDPP----ETKTESKEKKKREVPKPKTEKEKPKTEPKKPK 147
Query: 136 ESK 138
SK
Sbjct: 148 PSK 150
Score = 32.8 bits (75), Expect = 0.22
Identities = 12/59 (20%), Positives = 25/59 (42%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
++E+E +++ D++ + K + K+K E K + K+ K K K
Sbjct: 94 SDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSKPK 152
Score = 31.6 bits (72), Expect = 0.48
Identities = 12/52 (23%), Positives = 24/52 (46%)
Query: 91 RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
R+ S + E E++ + ++E + K + +E+ K E + K K K
Sbjct: 89 RELSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPK 140
Score = 30.9 bits (70), Expect = 0.99
Identities = 14/57 (24%), Positives = 27/57 (47%), Gaps = 1/57 (1%)
Query: 86 SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
S + +E +ES+ E E++ + K E + K ++ + K EK++ K K
Sbjct: 92 SSSDDEEEETEEESTDETEQEDPPETKTESK-EKKKREVPKPKTEKEKPKTEPKKPK 147
Score = 30.9 bits (70), Expect = 1.1
Identities = 17/65 (26%), Positives = 31/65 (47%), Gaps = 4/65 (6%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
SSS ++++ + + E E++D + K+ K+K + EKE KPK + +
Sbjct: 92 SSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKE----KPKTEPKKPK 147
Query: 105 KKKEK 109
K K
Sbjct: 148 PSKPK 152
Score = 28.2 bits (63), Expect = 7.5
Identities = 16/63 (25%), Positives = 27/63 (42%), Gaps = 3/63 (4%)
Query: 174 SPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKT---NPKEKDAKSKEKESHKSSAGP 230
S +E+E E+ES+ ++ + K + K K PK + K K + + P
Sbjct: 92 SSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSKP 151
Query: 231 KCY 233
K Y
Sbjct: 152 KLY 154
>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex. This
entry is characterized by proteins with alternating
conserved and low-complexity regions. Bud13 together
with Snu17p and a newly identified factor,
Pml1p/Ylr016c, form a novel trimeric complex called the
RES complex (pre-mRNA retention and splicing complex).
Subunits of this complex are not essential for viability
of yeasts but they are required for efficient splicing
in vitro and in vivo. Furthermore, inactivation of this
complex causes pre-mRNA leakage from the nucleus. Bud13
contains a unique, phylogenetically conserved C-terminal
region of unknown function.
Length = 141
Score = 34.2 bits (79), Expect = 0.063
Identities = 29/91 (31%), Positives = 47/91 (51%), Gaps = 20/91 (21%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
+DK + D E+++E+K++EK +EKERKE K KE +K+E
Sbjct: 5 RDKSGRIIDIEEKREEKEREK-----------------EEKERKEEKEKEWGKGLVQKEE 47
Query: 109 KKDKKEKSHKHKDKDRER---DKDEKKEQKE 136
++ + E+ K K+K R D+D +E KE
Sbjct: 48 REKRLEELEKAKNKPLARYADDEDYDEELKE 78
>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y. Members of this family are
RNase Y, an endoribonuclease. The member from Bacillus
subtilis, YmdA, has been shown to be involved in
turnover of yitJ riboswitch [Transcription, Degradation
of RNA].
Length = 514
Score = 35.7 bits (83), Expect = 0.063
Identities = 28/106 (26%), Positives = 53/106 (50%), Gaps = 15/106 (14%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK- 103
S+++ K + +KE E KE + +KE+ + E+E KE + + E+
Sbjct: 28 GSAEELAKRIIEEAKKEAETLKKEA---LLEAKEEVHKLRAELERELKERRNELQRLERR 84
Query: 104 --------EKKKEKKDKKEKSHKHKDK---DRERDKDEKKEQKESK 138
++K E DKKE++ + K+K ++E++ DEK+E+ E
Sbjct: 85 LLQREETLDRKMESLDKKEENLEKKEKELSNKEKNLDEKEEELEEL 130
>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
biogenesis [Translation, ribosomal structure and
biogenesis].
Length = 1077
Score = 35.9 bits (82), Expect = 0.077
Identities = 22/72 (30%), Positives = 40/72 (55%), Gaps = 7/72 (9%)
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE---KEKKKEKKDKKEKSHKH 119
E ++K + K + KE+ KD +EKER ES + E KEK++E++ +K +
Sbjct: 1010 ECREKHEIKDRIV-KERIKD---QEEKERMESLQRAKEEEIGKKEKEREQRIRKTIHDNY 1065
Query: 120 KDKDRERDKDEK 131
K+ ++R K ++
Sbjct: 1066 KEMAKKRLKKKR 1077
Score = 32.8 bits (74), Expect = 0.68
Identities = 15/68 (22%), Positives = 32/68 (47%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+ ++ E KD+ + +EKE+ + + KE + K ++ ++ +K + KE +
Sbjct: 1010 ECREKHEIKDRIVKERIKDQEEKERMESLQRAKEEEIGKKEKEREQRIRKTIHDNYKEMA 1069
Query: 117 HKHKDKDR 124
K K R
Sbjct: 1070 KKRLKKKR 1077
>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM). This
family consists of several Plasmodium falciparum SPAM
(secreted polymorphic antigen associated with
merozoites) proteins. Variation among SPAM alleles is
the result of deletions and amino acid substitutions in
non-repetitive sequences within and flanking the alanine
heptad-repeat domain. Heptad repeats in which the a and
d positions contain hydrophobic residues generate
amphipathic alpha-helices which give rise to helical
bundles or coiled-coil structures in proteins. SPAM is
an example of a P. falciparum antigen in which a
repetitive sequence has features characteristic of a
well-defined structural element.
Length = 164
Score = 34.4 bits (79), Expect = 0.077
Identities = 18/82 (21%), Positives = 32/82 (39%), Gaps = 3/82 (3%)
Query: 57 DKE---KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
DKE KE E EK + +E++++++ E E + E E+E+ +E
Sbjct: 32 DKEDIIKENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDL 91
Query: 114 EKSHKHKDKDRERDKDEKKEQK 135
+ K D + Q
Sbjct: 92 KDIEKKNINDIFNSTQDDNAQN 113
Score = 31.0 bits (70), Expect = 1.1
Identities = 22/97 (22%), Positives = 51/97 (52%), Gaps = 3/97 (3%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
+ + +D+++E +++++E+D+ + E +D+ E E +E + E + K EK
Sbjct: 38 KENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEE-DEEDNVDLKDIEK 96
Query: 110 KDKKEKSHKHKDKDRER--DKDEKKEQKESKSSSKIV 144
K+ + + +D + + K+ KK +K K++ IV
Sbjct: 97 KNINDIFNSTQDDNAQNLISKNYKKNEKSKKTAEDIV 133
>gnl|CDD|185618 PTZ00438, PTZ00438, gamete antigen 27/25-like protein; Provisional.
Length = 374
Score = 35.4 bits (81), Expect = 0.077
Identities = 19/78 (24%), Positives = 37/78 (47%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
+K + D E + + K +E+ +E+E ++ + E E E +E+ D E S K
Sbjct: 80 DKSDNENDVELEGLNIIVKNEEERGTQKEEEEDEDVEEIEEVEEVEVVEEEYDDDEDSEK 139
Query: 119 HKDKDRERDKDEKKEQKE 136
+K+ + + DE + E
Sbjct: 140 DDEKESDAEGDENELAGE 157
>gnl|CDD|223880 COG0810, TonB, Periplasmic protein TonB, links inner and outer
membranes [Cell envelope biogenesis, outer membrane].
Length = 244
Score = 34.8 bits (80), Expect = 0.082
Identities = 24/147 (16%), Positives = 47/147 (31%), Gaps = 5/147 (3%)
Query: 24 KDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK 83
+D I + + ++ + + + E++ K + ++ + +
Sbjct: 29 EDFVGIELVPLAVFLLAAKVLEAPTEEPQPEP--EPPEEQPKPPTEPETPPEPTPPKPKE 86
Query: 84 VSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK-DKDRERDKDEKKEQKESKSSSK 142
EK+ K+ KPK K K K K K K K ++ + S+S
Sbjct: 87 KPKPEKKPKKPKPKPKPKPKPKPKVKPQPKPKKPPSKTAAKAPAAPNQPARPPSAASASG 146
Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPT 169
+ S SG + P P
Sbjct: 147 AATG--PSASYLSGLRRAIRRAPRYPA 171
Score = 31.7 bits (72), Expect = 0.80
Identities = 21/67 (31%), Positives = 22/67 (32%)
Query: 160 ISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
P PP T P K KEK K K K K K K K PK K SK
Sbjct: 64 EEQPKPPTEPETPPEPTPPKPKEKPKPEKKPKKPKPKPKPKPKPKPKVKPQPKPKKPPSK 123
Query: 220 EKESHKS 226
+
Sbjct: 124 TAAKAPA 130
Score = 28.6 bits (64), Expect = 7.4
Identities = 22/66 (33%), Positives = 29/66 (43%), Gaps = 8/66 (12%)
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
PP TP + +P K KEK K ++ K K K K K K PK K +K
Sbjct: 69 PPTEPETPPEPTPPKPKEKPKPEKKP----KKPKPKPKPKPK----PKPKVKPQPKPKKP 120
Query: 223 SHKSSA 228
K++A
Sbjct: 121 PSKTAA 126
>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
Length = 1463
Score = 35.6 bits (81), Expect = 0.082
Identities = 18/50 (36%), Positives = 33/50 (66%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDK 67
P + + SS + S+S+SSS+S ++SSSS ++D D++ EKE +++
Sbjct: 1240 PCPDLSESSSTMHSSSSSSSSSCSSSSSSSDSSSSEEDGDEKNEKEDRER 1289
Score = 34.0 bits (77), Expect = 0.26
Identities = 20/67 (29%), Positives = 35/67 (52%), Gaps = 4/67 (5%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
Y + + + P P D S+ S+SSS+S+ +SSSS D + D +++
Sbjct: 1227 YMAQPHVGAGAMPPCP----DLSESSSTMHSSSSSSSSSCSSSSSSSDSSSSEEDGDEKN 1282
Query: 63 EKKDKEK 69
EK+D+E+
Sbjct: 1283 EKEDRER 1289
Score = 31.7 bits (71), Expect = 1.5
Identities = 20/70 (28%), Positives = 38/70 (54%), Gaps = 8/70 (11%)
Query: 8 SSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDK 67
S SSS+ H S SS+ S+ +SSS+S+ ++SS D+K++ D+E+ K +
Sbjct: 1245 SESSSTMHSS--------SSSSSSSCSSSSSSSDSSSSEEDGDEKNEKEDRERAGGGKRR 1296
Query: 68 EKDKSAVSSK 77
+ + + +
Sbjct: 1297 GRQRLPIRDR 1306
>gnl|CDD|205448 pfam13268, DUF4059, Protein of unknown function (DUF4059). This
family of proteins is functionally uncharacterized. This
family of proteins is found in bacteria. Proteins in
this family are approximately 70 amino acids in length.
There is a conserved DKT sequence motif.
Length = 72
Score = 32.3 bits (74), Expect = 0.084
Identities = 14/34 (41%), Positives = 19/34 (55%), Gaps = 7/34 (20%)
Query: 236 VGGIYILLR--SKKNKTVQERLAEQFKDELFDRL 267
+ I+IL R KK+KT +ER A L+D L
Sbjct: 23 ISLIWILWRAIRKKDKTAKERQA-----FLYDVL 51
>gnl|CDD|221733 pfam12720, DUF3807, Protein of unknown function (DUF3807). This is
a family of conserved fungal proteins of unknown
function.
Length = 169
Score = 34.3 bits (79), Expect = 0.087
Identities = 15/77 (19%), Positives = 30/77 (38%), Gaps = 1/77 (1%)
Query: 61 EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
E+E K +E + + D + + + K E +K+K +DK+ KS K
Sbjct: 68 ERELK-EEAEAEEEGEVDASPDAGAVAGESSADRKEAEQQGAAQKRKSCRDKERKSAKDP 126
Query: 121 DKDRERDKDEKKEQKES 137
+ D+ + +
Sbjct: 127 RGGTQDVVDKSQASLDY 143
Score = 32.8 bits (75), Expect = 0.28
Identities = 20/99 (20%), Positives = 44/99 (44%), Gaps = 8/99 (8%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
++ +E E E++ + + E KE E++ + K S +++K KD +
Sbjct: 69 RELKEEAEAEEEGEVDASPDAGAVAGESSA-DRKEAEQQGAAQKRKSCRDKERKSAKDPR 127
Query: 114 EKSHKHKDKDRERDK--DEKKEQKESKSSS-----KIVS 145
+ DK + +E+ +Q+E++S +I+S
Sbjct: 128 GGTQDVVDKSQASLDYGEEETQQQEAQSGPNNFGRRIIS 166
Score = 28.9 bits (65), Expect = 5.1
Identities = 14/66 (21%), Positives = 24/66 (36%), Gaps = 7/66 (10%)
Query: 88 EKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSS 147
E+E KE E E + + +S DR+ + + QK K
Sbjct: 68 ERELKEEAEAEEEGEVDASPDAGAVAGESSA----DRKEAEQQGAAQKRKSCRDK---ER 120
Query: 148 HNSKEP 153
++K+P
Sbjct: 121 KSAKDP 126
>gnl|CDD|221952 pfam13166, AAA_13, AAA domain. This family of domains contains a
P-loop motif that is characteristic of the AAA
superfamily. Many of the proteins in this family are
conjugative transfer proteins. This family includes the
PrrC protein that is thought to be the active component
of the anticodon nuclease.
Length = 713
Score = 35.4 bits (82), Expect = 0.088
Identities = 25/94 (26%), Positives = 45/94 (47%), Gaps = 1/94 (1%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
K ++K E E EKK++E +K+ +K K+ +K+ + S+ + + K+ KE
Sbjct: 105 KKLEEKIEQLEAEIEKKEEELEKAKNKFLDKAWKKL-AKKYDSNLSEALKGLNYKKNFKE 163
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
K K+ KS ++ K + K SS+K
Sbjct: 164 KLLKELKSVILNASSLLSLEELKAKIKTLFSSNK 197
Score = 30.7 bits (70), Expect = 2.8
Identities = 20/130 (15%), Positives = 45/130 (34%), Gaps = 12/130 (9%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+++ + + + +E +KE K E+ + ++ ++K++ EK + + + KK
Sbjct: 87 GEENIEIEAQIEELKKELKKLEEKIEQLEAEIEKKEE--ELEKAKNKFL-----DKAWKK 139
Query: 107 KEKKDKKE-----KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS 161
KK K +K +E+ E K + SS + + S
Sbjct: 140 LAKKYDSNLSEALKGLNYKKNFKEKLLKELKSVILNASSLLSLEELKAKIKTLFSSNKPE 199
Query: 162 HPPPPAPTPT 171
Sbjct: 200 LALLTLSVID 209
>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
This family represents Cwf15/Cwc15 (from
Schizosaccharomyces pombe and Saccharomyces cerevisiae
respectively) and their homologues. The function of
these proteins is unknown, but they form part of the
spliceosome and are thus thought to be involved in mRNA
splicing.
Length = 241
Score = 34.7 bits (80), Expect = 0.090
Identities = 30/120 (25%), Positives = 59/120 (49%), Gaps = 10/120 (8%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
HK+K ++ AI + S+ + +N D++D+ + K E++ ++ + D S SS
Sbjct: 69 AHKSKKENKLAI-EDADKSTNLDASNEGDEDDDEEDEIKRKRIEEDARNSDADDSDSSSD 127
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEK-KKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
D S + E+ E EK KKE+ ++KE+ ++ E+ +E+K ++E
Sbjct: 128 SDSSDDDSDDDDSEDETA--ALLRELEKIKKERAEEKER------EEEEKAAEEEKAREE 179
Score = 28.5 bits (64), Expect = 9.8
Identities = 19/102 (18%), Positives = 39/102 (38%), Gaps = 4/102 (3%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
+ S++ + D + I +S + D D + D
Sbjct: 83 ADKSTNLDASNEGDEDDDEEDEIKRKRIEE----DARNSDADDSDSSSDSDSSDDDSDDD 138
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
+D++A +E EK K E++ +E + K + EK +++E
Sbjct: 139 DSEDETAALLRELEKIKKERAEEKEREEEEKAAEEEKAREEE 180
>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1. This domain
family is found in eukaryotes, and is approximately 40
amino acids in length. The family is found in
association with pfam07719, pfam00515. There is a single
completely conserved residue L that may be functionally
important. NARP1 is the mammalian homologue of a yeast
N-terminal acetyltransferase that regulates entry into
the G(0) phase of the cell cycle.
Length = 516
Score = 35.3 bits (82), Expect = 0.095
Identities = 21/81 (25%), Positives = 38/81 (46%), Gaps = 1/81 (1%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
DK E +++E + +S E++K + ++ E+K K + + +KK E KK K
Sbjct: 389 DKPLLAEGEEEEGENGNLSPAERKKLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKG 448
Query: 117 HKHKDKDRERDKD-EKKEQKE 136
+ K + D EK + E
Sbjct: 449 PDGETKKVDPDPLGEKLARTE 469
Score = 34.9 bits (81), Expect = 0.13
Identities = 19/89 (21%), Positives = 40/89 (44%), Gaps = 2/89 (2%)
Query: 30 PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
P + N + S ++K K R K+++ EKK ++++ ++K+K + +
Sbjct: 391 PLLAEGEE-EEGENGNLSPAERK-KLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKG 448
Query: 90 ERKESKPKESSSEKEKKKEKKDKKEKSHK 118
E+K + EK +D E++ K
Sbjct: 449 PDGETKKVDPDPLGEKLARTEDPLEEAMK 477
Score = 32.6 bits (75), Expect = 0.62
Identities = 14/71 (19%), Positives = 31/71 (43%), Gaps = 5/71 (7%)
Query: 68 EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
+ + +E E +S E+++ K + K +KK +K++ EK+ K +
Sbjct: 390 KPLLAEGEEEEGENGNLSPAERKKLRKKQR-----KAEKKAEKEEAEKAAAKKKAEAAAK 444
Query: 128 KDEKKEQKESK 138
K + + + K
Sbjct: 445 KAKGPDGETKK 455
Score = 29.5 bits (67), Expect = 5.3
Identities = 20/99 (20%), Positives = 36/99 (36%), Gaps = 17/99 (17%)
Query: 82 DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
DK E E +E + S + KK KK +K + K +KE+ E ++
Sbjct: 389 DKPLLAEGEEEEGENGNLSPAERKKLRKKQRKAE------------KKAEKEEAEKAAAK 436
Query: 142 KIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKE 180
K ++ + G P P + +T++
Sbjct: 437 KKAEAAAKKAKGPDG-----ETKKVDPDPLGEKLARTED 470
>gnl|CDD|227463 COG5134, COG5134, Uncharacterized conserved protein [Function
unknown].
Length = 272
Score = 34.7 bits (79), Expect = 0.098
Identities = 25/139 (17%), Positives = 53/139 (38%), Gaps = 10/139 (7%)
Query: 22 KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKD--------RDKEKEKEKKDKEKDKSA 73
DK S +N + D K+ + R E + +D K ++
Sbjct: 66 GDKSYYTTKIYRFSIKCHLCSNPIDVRTDPKNTEYVVESGGRRKIEPQDINEDPAKAENV 125
Query: 74 VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
K E D + + EK+ + K ++ +S ++ +K+ S R R + +++
Sbjct: 126 --EKVPESDAIEALEKQLTQQKSEKHNSSAINFIDELNKRLWSDPFVSSQRLRKQFRERK 183
Query: 134 QKESKSSSKIVSSSHNSKE 152
+ E K +K +S + +
Sbjct: 184 KIEKKQEAKDLSLKNRAAL 202
>gnl|CDD|237867 PRK14953, PRK14953, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 486
Score = 35.2 bits (81), Expect = 0.098
Identities = 26/95 (27%), Positives = 47/95 (49%), Gaps = 2/95 (2%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
SK++ K++ KE+EKE+ D +K + E + K E KE + K + +
Sbjct: 365 KEGSKQETKEQPEKKEEEKEELDIDKIILQIIKNEGKIISAILKNAEIKEEEGKITIKVE 424
Query: 104 EKKKEKKDKKEKSHKHK--DKDRERDKDEKKEQKE 136
+ +++ D + KS K + E K EK+++KE
Sbjct: 425 KSEEDTLDLEIKSIKKYFPFIEFEEVKKEKEKEKE 459
>gnl|CDD|235219 PRK04098, PRK04098, sec-independent translocase; Provisional.
Length = 158
Score = 33.9 bits (78), Expect = 0.098
Identities = 15/69 (21%), Positives = 28/69 (40%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
++ K + +D +K + ++VS++ K +ES ES E + +
Sbjct: 90 KITAENEIKSIQDLLQDYKKSLEEDTIPNHLNEEVSNETKLTQESSSDESPKEVKLATKN 149
Query: 110 KDKKEKSHK 118
K KK K
Sbjct: 150 KTKKHDKEK 158
Score = 31.6 bits (72), Expect = 0.54
Identities = 19/87 (21%), Positives = 34/87 (39%), Gaps = 3/87 (3%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK---ESK 95
+ S K ++ D K + + +D K E+D + + E
Sbjct: 71 ESAVESLKKKLKFEELDDLKITAENEIKSIQDLLQDYKKSLEEDTIPNHLNEEVSNETKL 130
Query: 96 PKESSSEKEKKKEKKDKKEKSHKHKDK 122
+ESSS++ K+ K K K+ KH +
Sbjct: 131 TQESSSDESPKEVKLATKNKTKKHDKE 157
>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381). This
domain is functionally uncharacterized. This domain is
found in eukaryotes. This presumed domain is typically
between 156 and 174 amino acids in length. This domain is
found associated with pfam07780, pfam01728.
Length = 154
Score = 33.8 bits (78), Expect = 0.099
Identities = 12/64 (18%), Positives = 33/64 (51%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
R +K+ +E+++ V +E ++++ + E++ +K K + ++K+K+ KE+
Sbjct: 91 RKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERKQKEILKEQ 150
Query: 116 SHKH 119
Sbjct: 151 MKML 154
Score = 31.5 bits (72), Expect = 0.60
Identities = 14/47 (29%), Positives = 27/47 (57%)
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ K KE E+E + E+ D++E+ + +K+ + K EK+ + E K
Sbjct: 96 LDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERK 142
Score = 29.6 bits (67), Expect = 2.6
Identities = 16/71 (22%), Positives = 35/71 (49%), Gaps = 9/71 (12%)
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKES-SSEKEKKKEKKDKKEKSHKHKDKDRE 125
++K + + +KEK++ +E E +E +E EK+ K ++++ RE
Sbjct: 87 RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKR--------RE 138
Query: 126 RDKDEKKEQKE 136
++ +K+ KE
Sbjct: 139 NERKQKEILKE 149
Score = 28.8 bits (65), Expect = 4.4
Identities = 14/62 (22%), Positives = 36/62 (58%)
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
++K + + +KE++E + +E E+ ++E+ D+ + K K +R ++E+K+++
Sbjct: 87 RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERKQKEI 146
Query: 137 SK 138
K
Sbjct: 147 LK 148
Score = 28.0 bits (63), Expect = 7.6
Identities = 14/67 (20%), Positives = 27/67 (40%), Gaps = 6/67 (8%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
K KEK++ E E+ + E E + + K+EK+ + E+ K
Sbjct: 92 KLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKE------LAKLKREKRRENERKQKE 145
Query: 120 KDKDRER 126
K++ +
Sbjct: 146 ILKEQMK 152
>gnl|CDD|233830 TIGR02350, prok_dnaK, chaperone protein DnaK. Members of this
family are the chaperone DnaK, of the DnaK-DnaJ-GrpE
chaperone system. All members of the seed alignment were
taken from completely sequenced bacterial or archaeal
genomes and (except for Mycoplasma sequence) found
clustered with other genes of this system. This model
excludes DnaK homologs that are not DnaK itself, such as
the heat shock cognate protein HscA (TIGR01991).
However, it is not designed to distinguish among DnaK
paralogs in eukaryotes. Note that a number of dnaK genes
have shadow ORFs in the same reverse (relative to dnaK)
reading frame, a few of which have been assigned
glutamate dehydrogenase activity. The significance of
this observation is unclear; lengths of such shadow ORFs
are highly variable as if the presumptive protein
product is not conserved [Protein fate, Protein folding
and stabilization].
Length = 595
Score = 35.4 bits (82), Expect = 0.10
Identities = 26/121 (21%), Positives = 54/121 (44%), Gaps = 8/121 (6%)
Query: 22 KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
KDK + S + + + S + ++ K+ + E++KK KE + ++
Sbjct: 480 KDKGTG----KEQSITITASSGLSEEEIERMVKEAEANAEEDKKRKE----EIEARNNAD 531
Query: 82 DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
EK KE+ K + EKEK ++ + +++ K +D + + K E+ +Q K +
Sbjct: 532 SLAYQAEKTLKEAGDKLPAEEKEKIEKAVAELKEALKGEDVEEIKAKTEELQQALQKLAE 591
Query: 142 K 142
Sbjct: 592 A 592
>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
envelope biogenesis, outer membrane].
Length = 387
Score = 34.9 bits (80), Expect = 0.10
Identities = 22/117 (18%), Positives = 48/117 (41%), Gaps = 8/117 (6%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKE----KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
+ K K + K+ E+ K E K ++A + K+ E + ++ EK + E++ K + +
Sbjct: 160 AAKLKAAAEAKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEAKAAAEKAKAEAEAKAKAEK 219
Query: 103 KEK----KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
K + +K +KK+ + K K + + + + I + K
Sbjct: 220 KAEAAAEEKAAAEKKKAAAKAKADKAAAAAKAAERKAAAAALDDIFGGLSSGKNAPK 276
Score = 33.8 bits (77), Expect = 0.24
Identities = 23/100 (23%), Positives = 46/100 (46%), Gaps = 5/100 (5%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK---- 103
+K ++++ R E++KK + A + K K +K+K + +K E + K
Sbjct: 131 QKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKAEAA 190
Query: 104 -EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
KKK + + K + K K + + K EKK + ++ +
Sbjct: 191 AAKKKAEAEAKAAAEKAKAEAEAKAKAEKKAEAAAEEKAA 230
Score = 33.4 bits (76), Expect = 0.34
Identities = 30/181 (16%), Positives = 85/181 (46%), Gaps = 1/181 (0%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER-KESKPKESSSEKEKKK 107
+ ++ + E++++KK+++ + + E++++ EKER K + ++ + E EK+
Sbjct: 68 QSQQSSAKKGEQQRKKKEEQVAEELKPKQAAEQERLKQLEKERLKAQEQQKQAEEAEKQA 127
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
+ + K+++ K ++ K E + K + ++K+ +++ K+ ++ A
Sbjct: 128 QLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKA 187
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
K + + K +++ + +K + K + +K ++K A +K K ++
Sbjct: 188 EAAAAKKKAEAEAKAAAEKAKAEAEAKAKAEKKAEAAAEEKAAAEKKKAAAKAKADKAAA 247
Query: 228 A 228
A
Sbjct: 248 A 248
Score = 32.6 bits (74), Expect = 0.66
Identities = 30/181 (16%), Positives = 79/181 (43%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+K K+++ ++ K K+ ++E+ K + K +++ E+ K+++ ++ E++ +K
Sbjct: 81 RKKKEEQVAEELKPKQAAEQERLKQLEKERLKAQEQQKQAEEAEKQAQLEQKQQEEQARK 140
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
++K+K+ K K K + K + + ++ +K A + A
Sbjct: 141 AAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEA 200
Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+K+ + + K K ++ + + KKK K + AK+ E+++ ++
Sbjct: 201 KAAAEKAKAEAEAKAKAEKKAEAAAEEKAAAEKKKAAAKAKADKAAAAAKAAERKAAAAA 260
Query: 228 A 228
Sbjct: 261 L 261
>gnl|CDD|227447 COG5117, NOC3, Protein involved in the nuclear export of
pre-ribosomes [Translation, ribosomal structure and
biogenesis / Intracellular trafficking and secretion].
Length = 657
Score = 35.4 bits (81), Expect = 0.11
Identities = 23/106 (21%), Positives = 44/106 (41%), Gaps = 11/106 (10%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSK-KDKKDKDRDKEKE 61
Y +K SSS +++D P S SS + N K KD D + +E
Sbjct: 44 YDLKKSSSDE---------EEQDYELRPRVS-SSWNNESYNRLPIKTKDNVVADVNNGEE 93
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ + + + S K++ + S +E++ P + + EK++
Sbjct: 94 FLSESESEASLEIDSDIKDEKQKSLEEQKIAPEIPVKQQIDSEKER 139
>gnl|CDD|213844 TIGR03657, IsdB, heme uptake protein IsdB. Isd proteins are
iron-regulated surface proteins found in Bacillus,
Staphylococcus and Listeria species and are responsible
for heme scavenging from hemoproteins. The IsdB protein
is only observed in Staphylococcus and consists of an
N-terminal hydrophobic signal sequence, a pair of tandem
NEAT (NEAr Transporter, pfam05031) domains which confer
the ability to bind heme and a C-terminal sortase
processing signal which targets the protein to the cell
wall. IsdB is believed to make a direct contact with
methemoglobin facilitating transfer of heme to IsdB. The
heme is then transferred to other cell wall-bound NEAT
domain proteins such as IsdA and IsdC.
Length = 644
Score = 34.9 bits (79), Expect = 0.12
Identities = 35/169 (20%), Positives = 73/169 (43%), Gaps = 4/169 (2%)
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKKE---KKDKKEKSHK 118
+K+ K + ++K++++D + KE SKP EKE +K+ K D K+
Sbjct: 449 DKEAFTKANADKTNKKEQQDNSAKKETTPATPSKPTTPPVEKESQKQDSQKDDNKQSPSV 508
Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
K+ D + + K + ++ SSS + S +Q ++ P + T+ +
Sbjct: 509 EKENDASSESGKDKTPATKPAKGEVESSSTTPTKVVSTTQNVAKPTTASSETTKDVVQTS 568
Query: 179 KEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
+ K+S+ + K+ + + + N +E AKS + +S+
Sbjct: 569 AGSSEAKDSAPLQKANIKNTNDGHTQSQNNKNTQENKAKSLPQTGEESN 617
Score = 31.4 bits (70), Expect = 1.4
Identities = 28/141 (19%), Positives = 57/141 (40%), Gaps = 1/141 (0%)
Query: 19 HKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
+K + +D+SA T+ ++ + T + K+D +D K+ +KE D S+ S K+
Sbjct: 462 NKKEQQDNSAKKETTPATPSKPTTPPVEKESQKQDSQKDDNKQSPSVEKENDASSESGKD 521
Query: 79 KE-KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
K K + E E + P + S + + ++ K + + K
Sbjct: 522 KTPATKPAKGEVESSSTTPTKVVSTTQNVAKPTTASSETTKDVVQTSAGSSEAKDSAPLQ 581
Query: 138 KSSSKIVSSSHNSKEPASGSQ 158
K++ K + H + +Q
Sbjct: 582 KANIKNTNDGHTQSQNNKNTQ 602
>gnl|CDD|206039 pfam13868, Trichoplein, Tumour suppressor, Mitostatin. Trichoplein
or mitostatin was first defined as a meiosis-specific
nuclear structural protein. It has since been linked
with mitochondrial movement. It is associated with the
mitochondrial outer membrane, and over-expression leads
to reduction in mitochondrial motility whereas lack of
it enhances mitochondrial movement. The activity appears
to be mediated through binding the mitochondria to the
actin intermediate filaments (IFs).
Length = 349
Score = 34.5 bits (80), Expect = 0.13
Identities = 20/95 (21%), Positives = 55/95 (57%), Gaps = 8/95 (8%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+K +++++R+ E+ + K++KE++ + + ++++E + + +E E + E E+K
Sbjct: 160 REKAEREEEREAERRERKEEKEREVARLRAQQEEAED---EREELDELRADLYQEEYERK 216
Query: 107 KEKKDKKEKSHKHK-----DKDRERDKDEKKEQKE 136
+ +K+K+E + + + RE +EK+E+ +
Sbjct: 217 ERQKEKEEAEKRRRQKQELQRAREEQIEEKEERLQ 251
Score = 33.7 bits (78), Expect = 0.23
Identities = 19/99 (19%), Positives = 60/99 (60%), Gaps = 11/99 (11%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
+++ + +++EKE+E++++ K K + +++ ++ +ERKE K +E + + +++E
Sbjct: 134 NEERIERKEEEKEREREEELKILEYQREKAEREEEREAERRERKEEKEREVARLRAQQEE 193
Query: 109 KKDKKEKS-----------HKHKDKDRERDKDEKKEQKE 136
+D++E+ ++ K++ +E+++ EK+ +++
Sbjct: 194 AEDEREELDELRADLYQEEYERKERQKEKEEAEKRRRQK 232
Score = 32.6 bits (75), Expect = 0.62
Identities = 21/100 (21%), Positives = 58/100 (58%), Gaps = 13/100 (13%)
Query: 52 KDKDRDKEKEKEKKDKEKDK-SAVSSKEKEKDKVSSKEKERKESKP----KESSSEKEKK 106
+++++ +++E E++ +E+++ + + +E+D+ ++EK K+ K E + E+ ++
Sbjct: 81 EEREKRRQEEYEERLQEREQMDEIIERIQEEDEAEAQEKREKQKKLREEIDEFNEERIER 140
Query: 107 KEKKDKKEKS--------HKHKDKDRERDKDEKKEQKESK 138
KE++ ++E+ + K + E + E++E+KE K
Sbjct: 141 KEEEKEREREEELKILEYQREKAEREEEREAERRERKEEK 180
>gnl|CDD|218328 pfam04921, XAP5, XAP5, circadian clock regulator. This protein is
found in a wide range of eukaryotes. It is a nuclear
protein and is suggested to be DNA binding. In plants,
this family is essential for correct circadian clock
functioning by acting as a light-quality regulator
coordinating the activities of blue and red light
signalling pathways during plant growth - inhibiting
growth in red light but promoting growth in blue light.
Length = 233
Score = 34.2 bits (79), Expect = 0.13
Identities = 23/76 (30%), Positives = 38/76 (50%), Gaps = 5/76 (6%)
Query: 64 KKDKEKDKSAVS-----SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
KK K+K KS +S +E E + K+ ++ S+P E++ KKK K+ +
Sbjct: 1 KKKKKKKKSKLSFGDDDEEEDEDEGEDEKKVPKESSEPDEANVNPNKKKIGKNPSVDTSF 60
Query: 119 HKDKDRERDKDEKKEQ 134
DK RE + E +E+
Sbjct: 61 LPDKAREEKEAELREE 76
>gnl|CDD|227880 COG5593, COG5593, Nucleic-acid-binding protein possibly involved in
ribosomal biogenesis [Translation, ribosomal structure
and biogenesis].
Length = 821
Score = 35.0 bits (80), Expect = 0.14
Identities = 24/112 (21%), Positives = 41/112 (36%), Gaps = 5/112 (4%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
P D D S + S S T+ K D D + K + ++ D+E+ +
Sbjct: 699 PDVEDDSDDSELDFAEDDFSDS--TSDDEPKLDAIDDEDAKSEGSQESDQEEGLDEIFYS 756
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
+ S E E SSE+EK++E+ + K + + K
Sbjct: 757 FDGEQDNSDSFAESSEEDE---SSEEEKEEEENKEVSAKRAKKKQRKNMLKS 805
Score = 31.6 bits (71), Expect = 1.6
Identities = 34/178 (19%), Positives = 66/178 (37%), Gaps = 27/178 (15%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
T ++ K KK + + E + E + V S+ +D E + E +S+S
Sbjct: 663 TKKTADGKGKKSNKASFDSDDEMDENEIWSALVKSRPDVEDDSDDSELDFAEDDFSDSTS 722
Query: 102 EKEKKKEKKDKKE-KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
+ E K + D ++ KS ++ D+E DE + + + + + ++ +S
Sbjct: 723 DDEPKLDAIDDEDAKSEGSQESDQEEGLDEIFYSFDGEQDNSDSFAESSEEDESS----- 777
Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
E+EKE + +K KK+ K+ K+ P A
Sbjct: 778 ---------------------EEEKEEEENKEVSAKRAKKKQRKNMLKSLPVFASADD 814
>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
Length = 651
Score = 34.6 bits (80), Expect = 0.16
Identities = 18/51 (35%), Positives = 27/51 (52%), Gaps = 3/51 (5%)
Query: 66 DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
DK S +KE+ + +EKE KE+ ++ K KK+E+K KKE
Sbjct: 537 DKPDGPSVWKLDDKEELQ---REKEEKEALKEQKRLRKLKKQEEKKKKELE 584
Score = 29.6 bits (67), Expect = 5.3
Identities = 22/98 (22%), Positives = 42/98 (42%), Gaps = 9/98 (9%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK---------E 104
+D K + K +K K + +K +D+ R E KP S K E
Sbjct: 497 RDAAKAEMKLISLDKKKKQLLQLCDKLRDEWLPNLGIRIEDKPDGPSVWKLDDKEELQRE 556
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
K++++ K++K + K E+ K E ++ +++K
Sbjct: 557 KEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAKIPPA 594
Score = 29.6 bits (67), Expect = 5.8
Identities = 21/79 (26%), Positives = 37/79 (46%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
+EKE+++ KE+ + K++EK K ++ E+ + P E +E K D+
Sbjct: 555 REKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAKIPPAEFFKRQEDKYSAFDETGLPT 614
Query: 118 KHKDKDRERDKDEKKEQKE 136
D + K+ KK KE
Sbjct: 615 HDADGEEISKKERKKLSKE 633
>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
subunit [Translation, ribosomal structure and
biogenesis].
Length = 591
Score = 34.7 bits (79), Expect = 0.16
Identities = 22/76 (28%), Positives = 43/76 (56%), Gaps = 4/76 (5%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S + K K K++ ++ ++E+++K+ +S+K+K K+ K K K +++ + K
Sbjct: 519 SEADKDVNKSKNKKRKVDEEEEEKKLKMIMMSNKQK---KLYKKMKYSNAKKEEQAENLK 575
Query: 104 EKKKE-KKDKKEKSHK 118
+KKK+ K KK S K
Sbjct: 576 KKKKQIAKQKKLDSKK 591
Score = 30.4 bits (68), Expect = 3.5
Identities = 24/92 (26%), Positives = 44/92 (47%), Gaps = 9/92 (9%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS-------SEKEK 105
D D + + +KE + + + + E +KD SK K+RK + +E S K+K
Sbjct: 495 DDDEELQAQKELELEAQGIKYSETSEADKDVNKSKNKKRKVDEEEEEKKLKMIMMSNKQK 554
Query: 106 KKEKKDKKEKSHKHKDKD--RERDKDEKKEQK 135
K KK K + K + + +++ K K++K
Sbjct: 555 KLYKKMKYSNAKKEEQAENLKKKKKQIAKQKK 586
Score = 28.9 bits (64), Expect = 8.4
Identities = 17/82 (20%), Positives = 40/82 (48%), Gaps = 4/82 (4%)
Query: 61 EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
E+ + ++ V+ E + + + E++ + + ++ E E + + E S K
Sbjct: 464 TMEETQRHSEEDLVNRFEDVRYEHVAGEEDDDDDEELQAQKELELEAQGIKYSETSEADK 523
Query: 121 D----KDRERDKDEKKEQKESK 138
D K+++R DE++E+K+ K
Sbjct: 524 DVNKSKNKKRKVDEEEEEKKLK 545
>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572). Family of
eukaryotic proteins with undetermined function.
Length = 321
Score = 34.3 bits (79), Expect = 0.16
Identities = 30/195 (15%), Positives = 66/195 (33%), Gaps = 25/195 (12%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK---E 104
++++ D K+ E D +++ + E+ K+ + R+ S E
Sbjct: 125 REEELAGDAMKKLENRTADSKREMEVLERLEELKEL-----QSRRADVDVNSMLEALFRR 179
Query: 105 KKKEKKDKKEKSHK---------HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
+KKE+++++E+ ++DR R DE E E + + S +S
Sbjct: 180 EKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSPKSGSSSPAKP 239
Query: 156 GSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKD 215
S L + P+ K + + + K K + +
Sbjct: 240 TSILKKSAAKRSEAPSSSKAKKNSRGIPKPRDALSSLVVRK---KAAP-----ESTSQSP 291
Query: 216 AKSKEKESHKSSAGP 230
+ ++ +AG
Sbjct: 292 SSAEPTSESPQTAGN 306
Score = 33.6 bits (77), Expect = 0.24
Identities = 21/140 (15%), Positives = 52/140 (37%), Gaps = 7/140 (5%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
+++ +E+E+E++D+ KS E E+D+ + +++ ++ + ++ K
Sbjct: 175 ALFRREKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSPKSGSS 234
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
K + K+ + S S +K S A S ++ P T
Sbjct: 235 SPAKP-------TSILKKSAAKRSEAPSSSKAKKNSRGIPKPRDALSSLVVRKKAAPEST 287
Query: 170 PTQKSPVKTKEKEKEKESST 189
S + + + ++
Sbjct: 288 SQSPSSAEPTSESPQTAGNS 307
>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
Rpc31. RNA polymerase III contains seventeen subunits
in yeasts and in human cells. Twelve of these are akin
to RNA polymerase I or II and the other five are RNA pol
III-specific, and form the functionally distinct groups
(i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
Rpc34 and Rpc82 form a cluster of enzyme-specific
subunits that contribute to transcription initiation in
S.cerevisiae and H.sapiens. There is evidence that these
subunits are anchored at or near the N-terminal Zn-fold
of Rpc1, itself prolonged by a highly conserved but RNA
polymerase III-specific domain.
Length = 221
Score = 34.0 bits (78), Expect = 0.17
Identities = 16/74 (21%), Positives = 38/74 (51%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
K +K K K K + ++E+E E+K + + ++E +K++++++E+
Sbjct: 124 KKAGKKLALSKFKRKVGLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEE 183
Query: 117 HKHKDKDRERDKDE 130
+ +D D + D D+
Sbjct: 184 EEDEDFDDDDDDDD 197
Score = 29.3 bits (66), Expect = 4.8
Identities = 28/113 (24%), Positives = 48/113 (42%), Gaps = 16/113 (14%)
Query: 41 PTNSSSSKKDKKDKDR---DKEKEKEKKDKEKD-------------KSAVSSKEKEKDKV 84
T S S +D KD DK ++K K S + +K K+
Sbjct: 71 YTGSESLSQDPKDGIERYSDKYQKKRKIGPSIKEHPFDLELFPKELYSVMGINKKAGKKL 130
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
+ + +RK E + ++K +KK K + +D D E +KDE++E++E
Sbjct: 131 ALSKFKRKVGLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEE 183
Score = 28.6 bits (64), Expect = 7.7
Identities = 17/70 (24%), Positives = 34/70 (48%), Gaps = 3/70 (4%)
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
+K+ K SK K K + ++E+E + K ++K KE + + KD++
Sbjct: 121 GINKKAGKKLALSKFKRKVGLFTEEEEDIDEKLSM---LEKKLKELEAEDVDEEDEKDEE 177
Query: 124 RERDKDEKKE 133
E +++E+ E
Sbjct: 178 EEEEEEEEDE 187
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 34.8 bits (80), Expect = 0.17
Identities = 27/176 (15%), Positives = 50/176 (28%), Gaps = 14/176 (7%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
+ S P+ + S+ S S SS P S++ D +
Sbjct: 189 PPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCG 248
Query: 66 DKEKDK------SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
+++ + ++ + + + SSS +E+ S
Sbjct: 249 WGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPA 308
Query: 120 KDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSP 175
SSS SSS +S+ A P P+ +P+ P
Sbjct: 309 ---PSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSP-----GPSPSRSPSPSRP 356
Score = 33.2 bits (76), Expect = 0.48
Identities = 17/45 (37%), Positives = 20/45 (44%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSK 48
SSSS PSP + A S SSS+S+ SSSS
Sbjct: 284 PASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSS 328
>gnl|CDD|215146 PLN02260, PLN02260, probable rhamnose biosynthetic enzyme.
Length = 668
Score = 34.7 bits (80), Expect = 0.17
Identities = 43/156 (27%), Positives = 58/156 (37%), Gaps = 45/156 (28%)
Query: 301 IQHHIHVIIHAAASLRFDELIQDAFTL---NIQATRELLDLATRCSQLKAILHVSTLYTH 357
I I I+H AA D ++F NI T LL+ Q++ +HVST
Sbjct: 77 ITEGIDTIMHFAAQTHVDNSFGNSFEFTKNNIYGTHVLLEACKVTGQIRRFIHVST---- 132
Query: 358 SYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVE 417
+E Y + ED N E ++L + N YS TKA E +V
Sbjct: 133 -------DEVYGE--TDEDAD----VGNHEASQLLPT--------NPYSATKAGAEMLVM 171
Query: 418 KY--LYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGP 451
Y Y LP+ R NN+YGP
Sbjct: 172 AYGRSYGLPVITTR---------------GNNVYGP 192
>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain. This is a family of proteins of
approximately 300 residues, found in plants and
vertebrates. They contain a highly conserved DDRGK
motif.
Length = 189
Score = 33.5 bits (77), Expect = 0.17
Identities = 23/83 (27%), Positives = 50/83 (60%), Gaps = 2/83 (2%)
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
KK K ++ + K+ + + ++E+ER+E K E E E+K E+++ +E+ K K+++
Sbjct: 1 KKIGAKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERK-EEEELEEEREKKKEEE 59
Query: 124 RERDKDEKKEQKESKSSSKIVSS 146
+++ E++ +KE + K+ SS
Sbjct: 60 ERKER-EEQARKEQEEYEKLKSS 81
Score = 32.7 bits (75), Expect = 0.35
Identities = 18/104 (17%), Positives = 52/104 (50%), Gaps = 7/104 (6%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEK---EKDKVSSKEKERKESKPKESSSEKEKKK 107
K + + +EK+ ++ +E ++ ++K +++ +E+E +E + K+ E+ K++
Sbjct: 5 AKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKER 64
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
E++ +KE ++ ++ + +E+ K S+ S+
Sbjct: 65 EEQARKE----QEEYEKLKSSFVVEEEGTDKLSADEESNELLED 104
Score = 31.2 bits (71), Expect = 0.98
Identities = 19/72 (26%), Positives = 42/72 (58%), Gaps = 5/72 (6%)
Query: 48 KKDKKDKDRDKE--KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+K + + R+ E + +E+K E+ + +E+E ++ K+KE +E K +E E+ +
Sbjct: 13 EKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKERE---EQAR 69
Query: 106 KKEKKDKKEKSH 117
K++++ +K KS
Sbjct: 70 KEQEEYEKLKSS 81
>gnl|CDD|219405 pfam07418, PCEMA1, Acidic phosphoprotein precursor PCEMA1. This
family consists of several acidic phosphoprotein
precursor PCEMA1 sequences which appear to be found
exclusively in Plasmodium chabaudi. PCEMA1 is an antigen
that is associated with the membrane of the infected
erythrocyte throughout the entire intraerythrocytic
cycle. The exact function of this family is unclear.
Length = 286
Score = 34.1 bits (78), Expect = 0.18
Identities = 20/87 (22%), Positives = 41/87 (47%), Gaps = 4/87 (4%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSS-KEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
D + + E E+ D E + +S K+ E D KEK R+E + + + E
Sbjct: 203 DDEVTSYFNDGENEENDDELEAEVISYLKDGENDNE-VKEKIRREYREWKGDKANTNETE 261
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQK 135
+D+ E +++++ E ++E K ++
Sbjct: 262 IEDESED--EYEEEAGEEQENEDKGEE 286
>gnl|CDD|227448 COG5118, BDP1, Transcription initiation factor TFIIIB, Bdp1 subunit
[Transcription].
Length = 507
Score = 34.3 bits (78), Expect = 0.18
Identities = 43/223 (19%), Positives = 77/223 (34%), Gaps = 22/223 (9%)
Query: 30 PSTSTSSSTSNPT----NSSSSKKDKK-------DKDRDKEKEKEKKDKEKDKSAVSSKE 78
PS SSS SN T ++ S+K KK + D + + K + K SA +
Sbjct: 111 PSFLDSSSNSNGTARRLSTISNKLPKKIRLGSITENDMNLKTFKRHRVLGKPSSAKKPAK 170
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ +R S +S E ++ + K K K +D E K E+ +
Sbjct: 171 ISPPTAMTDSLDRNFSSETSTSREADENENYVISKVKDIPKKVRDGESAKYFIDEENFTM 230
Query: 139 SSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHK 198
+ + E S++ K+ ++ + K E S TH+ K
Sbjct: 231 AELCKPNFPIQISENFEKSKMAK-----------KAKLEKRRHVKFLEGSNTHEMDQLLK 279
Query: 199 HKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYI 241
H + + + K S ++ +A + G I +
Sbjct: 280 HFLDNSNFRQDRRSRKKKASASRDISDQNAEEILMIKNGHIVV 322
>gnl|CDD|240578 cd12951, RRP7_Rrp7A, RRP7 domain ribosomal RNA-processing protein 7
homolog A (Rrp7A) and similar proteins. The family
corresponds to the RRP7 domain of Rrp7A, also termed
gastric cancer antigen Zg14, and similar proteins which
are yeast ribosomal RNA-processing protein 7 (Rrp7p)
homologs mainly found in Metazoans. The cellular
function of Rrp7A currently remains unclear. Rrp7A
harbors an N-terminal RNA recognition motif (RRM), also
termed RBD (RNA binding domain) or RNP
(ribonucleoprotein domain), and a C-terminal RRP7
domain.
Length = 129
Score = 32.6 bits (75), Expect = 0.18
Identities = 26/66 (39%), Positives = 37/66 (56%), Gaps = 12/66 (18%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKD----KVSSKEKERKESKPKESSSEKEKKKEKKDK 112
DKE+E+EK++KEK E E D +K+ R ++ KES + K +KEKK K
Sbjct: 31 DKEEEEEKEEKEK--------EAEPDEDGWVTVTKKGRRPKTARKESVAAKAAEKEKKKK 82
Query: 113 KEKSHK 118
K+K K
Sbjct: 83 KKKELK 88
>gnl|CDD|227458 COG5129, MAK16, Nuclear protein with HMG-like acidic region
[General function prediction only].
Length = 303
Score = 33.9 bits (77), Expect = 0.18
Identities = 29/114 (25%), Positives = 54/114 (47%), Gaps = 2/114 (1%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS-SEKE 104
+ ++ ++D+ +E+E+ D E + S EKEK K EK + E+S SE+E
Sbjct: 191 TEREKRQDEKERYVEEEEESDTELEAVTDDS-EKEKTKKKDLEKWLGSDQSMETSESEEE 249
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+ E + +++ +K K R+R D+ K+ ++ + N K PA
Sbjct: 250 ESSESESDEDEDEDNKGKIRKRKTDDAKKSRKPHIHIEYEQERENEKIPAVQHS 303
Score = 30.8 bits (69), Expect = 1.8
Identities = 24/109 (22%), Positives = 49/109 (44%), Gaps = 5/109 (4%)
Query: 43 NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
K+ +++ + + E E + +K K+ EK S + E ES+ +ESS
Sbjct: 195 KRQDEKERYVEEEEESDTELEAVTDDSEKEKTKKKDLEKWLGSDQSMETSESEEEESSES 254
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKD-----EKKEQKESKSSSKIVSS 146
+ + E +D K K K K D ++ + E ++++E++ + S
Sbjct: 255 ESDEDEDEDNKGKIRKRKTDDAKKSRKPHIHIEYEQERENEKIPAVQHS 303
>gnl|CDD|218883 pfam06075, DUF936, Plant protein of unknown function (DUF936).
This family consists of several hypothetical proteins
from Arabidopsis thaliana and Oryza sativa. The function
of this family is unknown.
Length = 564
Score = 34.4 bits (79), Expect = 0.19
Identities = 27/122 (22%), Positives = 40/122 (32%), Gaps = 8/122 (6%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKD--------KKDKDRDK 58
SS S PSP SS+ S+ S ++S KK + D
Sbjct: 183 RSSRSELGAPSPSGGTSCPSSSGGRRSSIGSRRLRGSASLRKKVAVLSAPRKPGSRSSDC 242
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
+ + SS +++ K SK R K SS+ E KK + +
Sbjct: 243 KSSPRARSSSAKSPFKSSIQRKATKALSKLSLRASPKDTSKSSKSEVAPPKKSEAKVPSS 302
Query: 119 HK 120
K
Sbjct: 303 SK 304
>gnl|CDD|224012 COG1087, GalE, UDP-glucose 4-epimerase [Cell envelope biogenesis,
outer membrane].
Length = 329
Score = 34.1 bits (79), Expect = 0.19
Identities = 22/95 (23%), Positives = 39/95 (41%), Gaps = 11/95 (11%)
Query: 264 FDRLKNEQADILQRK-VHIISGDIS-QPSLGISSHDQQFIQHHIHVIIHAAASLRFDELI 321
D L N L + GD+ + L F ++ I ++H AAS+ E +
Sbjct: 30 LDNLSNGHKIALLKLQFKFYEGDLLDRALL-----TAVFEENKIDAVVHFAASISVGESV 84
Query: 322 QDA---FTLNIQATRELLDLATRCSQLKAILHVST 353
Q+ + N+ T L++ A + +K + ST
Sbjct: 85 QNPLKYYDNNVVGTLNLIE-AMLQTGVKKFIFSST 118
>gnl|CDD|114268 pfam05537, DUF759, Borrelia burgdorferi protein of unknown function
(DUF759). This family consists of several
uncharacterized proteins from the Lyme disease
spirochete Borrelia burgdorferi.
Length = 439
Score = 34.3 bits (78), Expect = 0.19
Identities = 39/113 (34%), Positives = 59/113 (52%), Gaps = 16/113 (14%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK-EKKKEK 109
KK ++D K EK K K S S+K+ K+ +S K+KE + ES E+ EK +
Sbjct: 19 KKAIEQDISK-MEKYLKPKKSSLGSTKDIVKNNLSDKKKELSKQSKFESLRERVEKYRLT 77
Query: 110 KDKK--------EKSHKHKDK-----DRERDKDEKKE-QKESKSSSKIVSSSH 148
+ KK EK+ K K DR++ + E KE KESK+ SK++++S
Sbjct: 78 QTKKLIKQGMGFEKARKEAFKRSLMSDRDKRRLEYKELAKESKAKSKMLAASQ 130
>gnl|CDD|113413 pfam04642, DUF601, Protein of unknown function, DUF601. This
family represents a conserved region found in several
uncharacterized plant proteins.
Length = 311
Score = 33.9 bits (77), Expect = 0.19
Identities = 30/128 (23%), Positives = 56/128 (43%), Gaps = 10/128 (7%)
Query: 49 KDKKDKDRDKEKE--KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS-EKEK 105
+D K + +K K+ E+ + K + + + S K + K+ S +K
Sbjct: 6 EDSKLRAAEKAKQPQAEEDSGSRQKPSTLAGKNPDAPTSESRTPSKATSSKDPSKRYADK 65
Query: 106 KKEKKDKKEKSHKHKDKDRERDKD-----EKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
K+++ +K +S + R +KD +K++ K+ S +V SS S+ S+
Sbjct: 66 KRKQSEKDARSPPRSSRPRTEEKDAGPSQQKEKGKKGDSQDLVVLSSRESE--RRTSERR 123
Query: 161 SHPPPPAP 168
S P PAP
Sbjct: 124 STGPLPAP 131
Score = 32.7 bits (74), Expect = 0.48
Identities = 27/124 (21%), Positives = 52/124 (41%), Gaps = 8/124 (6%)
Query: 9 SSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKE 68
S +A + ++DS + ST + + +S S+ K K+ K DK+
Sbjct: 8 SKLRAAEKAKQPQAEEDSGSRQKPSTLAGKNPDAPTSESRTPSKAT-SSKDPSKRYADKK 66
Query: 69 KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
+ K+ EKD S R ++ K++ ++K+K KK + ++ ER
Sbjct: 67 R-------KQSEKDARSPPRSSRPRTEEKDAGPSQQKEKGKKGDSQDLVVLSSRESERRT 119
Query: 129 DEKK 132
E++
Sbjct: 120 SERR 123
>gnl|CDD|234525 TIGR04259, oxa_formateAnti, oxalate/formate antiporter. This model
represents a subgroup of the more broadly defined model
TIGR00890, which in turn belongs to the Major
Facilitator transporter family. Seed members for this
family include the known oxalate/formate antiporter of
Oxalobacter formigenes, as well as transporter subunits
co-clustered with the two genes of a system that
decarboxylates oxalate into formate. In many of these
cassettes, two subunits are found rather than one,
suggesting the antiporter is sometimes homodimeric,
sometimes heterodimeric.
Length = 405
Score = 34.0 bits (78), Expect = 0.23
Identities = 13/35 (37%), Positives = 15/35 (42%), Gaps = 6/35 (17%)
Query: 431 SIVVSTWKEPIVGWSNNLYGP------GGAAAGAA 459
+V TW PI GW + YGP GG G
Sbjct: 47 FVVTETWLVPIEGWFVDKYGPRIVVMFGGIMCGLG 81
>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34. This family represents
herpes virus protein U79 and cytomegalovirus early
phosphoprotein P34 (UL112).
Length = 238
Score = 33.3 bits (76), Expect = 0.24
Identities = 20/106 (18%), Positives = 42/106 (39%), Gaps = 1/106 (0%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK-KDKEKDKSAVSSKE 78
+ + + + + S KK + + K+KEK + ++ K ++
Sbjct: 125 VAHEAEIRNLGDVKNAEKFEKECRALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRK 184
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
K+++K + E +R S + + + KEK KH D +R
Sbjct: 185 KQEEKRRNDEDKRPGGGGGSSGGQSGLSTKDEPPKEKRQKHHDPER 230
Score = 33.3 bits (76), Expect = 0.28
Identities = 20/101 (19%), Positives = 38/101 (37%), Gaps = 11/101 (10%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
K EK +KE + + E K S K+KE++ + + E +KK+++ ++ K
Sbjct: 138 KNAEKFEKECRALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRKKQEEKRRNDEDKR 197
Query: 120 KDKDRERDK-----------DEKKEQKESKSSSKIVSSSHN 149
++K QK ++ SH
Sbjct: 198 PGGGGGSSGGQSGLSTKDEPPKEKRQKHHDPERRLEPQSHE 238
Score = 32.5 bits (74), Expect = 0.50
Identities = 21/92 (22%), Positives = 37/92 (40%), Gaps = 3/92 (3%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK--ERKESKPKESSSE 102
+ + K D + K K+K +K + + + KE + K K + E K SS
Sbjct: 148 RALSRKKSDDEHRKRSGKQK-EKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSSG 206
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+ KD+ K + K D ER + + +
Sbjct: 207 GQSGLSTKDEPPKEKRQKHHDPERRLEPQSHE 238
Score = 28.7 bits (64), Expect = 7.5
Identities = 19/83 (22%), Positives = 32/83 (38%), Gaps = 3/83 (3%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+ E E + + K K + S +KK + +++S K K+K R D + KE
Sbjct: 125 VAHEAEIRNLGDVKNAEKFEKECRALS---RKKSDDEHRKRSGKQKEKRRVEDSQKHKED 181
Query: 135 KESKSSSKIVSSSHNSKEPASGS 157
+ K K + GS
Sbjct: 182 RRKKQEEKRRNDEDKRPGGGGGS 204
>gnl|CDD|227693 COG5406, COG5406, Nucleosome binding factor SPN, SPT16 subunit
[Transcription / DNA replication, recombination, and
repair / Chromatin structure and dynamics].
Length = 1001
Score = 34.2 bits (78), Expect = 0.24
Identities = 35/160 (21%), Positives = 60/160 (37%), Gaps = 11/160 (6%)
Query: 37 STSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP 96
S SNP + S K + D ++ E + + +K S + K R E++
Sbjct: 408 SLSNPIVFTDSPKAQGDISFLFGEDDETPEYLTLQDKAPDF-LDKTISSHRSKFRDETRE 466
Query: 97 KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASG 156
E ++ K++ + +K+ +K + + D + E KS +I S S +S+ P
Sbjct: 467 HELNARKKRVEHQKELLDKIIEEGLERFRNASDAGPDSIEEKSEKRIESYSRDSQLPRQI 526
Query: 157 ----------SQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
Q I P P P S +K K E
Sbjct: 527 GELRIIVDFARQSIILPIGGRPVPFHISSIKNASKNDEGN 566
>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
protein; Reviewed.
Length = 782
Score = 34.0 bits (79), Expect = 0.24
Identities = 21/89 (23%), Positives = 49/89 (55%), Gaps = 13/89 (14%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV-------------SSKEKERKESK 95
++KK+K +++E + ++ +++ + A+ +KE D++ S K E E++
Sbjct: 554 EEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEADEIIKELRQLQKGGYASVKAHELIEAR 613
Query: 96 PKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
+ + + ++K+K+KK +KEK + K D
Sbjct: 614 KRLNKANEKKEKKKKKQKEKQEELKVGDE 642
Score = 29.0 bits (66), Expect = 9.5
Identities = 33/147 (22%), Positives = 65/147 (44%), Gaps = 30/147 (20%)
Query: 80 EKDKV-----SSKEKERK-ESKPKESSSEKEK----KKEKKDKKEKSHKHKDKDRERDKD 129
+K+K+ S +E ER+ E K +E+ + ++ K+E ++KKEK + +DK E +
Sbjct: 514 DKEKLNELIASLEELERELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEK 573
Query: 130 E-KKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESS 188
E ++ KE+K + + + + + +H + E K +
Sbjct: 574 EAQQAIKEAKKEADEIIKELRQLQKGGYASVKAH----------------ELIEARKRLN 617
Query: 189 TTHDKHSKHKHKKKDKHGDKTNPKEKD 215
++K K K K+K+K + K D
Sbjct: 618 KANEKKEKKKKKQKEK---QEELKVGD 641
>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
Length = 520
Score = 34.0 bits (79), Expect = 0.24
Identities = 15/81 (18%), Positives = 40/81 (49%), Gaps = 7/81 (8%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEK 115
KE E K++ + + + ++ + E + E + + ++K E +K+E+
Sbjct: 56 KEALLEAKEEIHKL-----RNEFEKELRERRNELQKLEKRLLQKEENLDRKLELLEKREE 110
Query: 116 SHKHKDKDRERDKDEKKEQKE 136
+ K+K+ E+ + E ++++E
Sbjct: 111 ELEKKEKELEQKQQELEKKEE 131
Score = 30.5 bits (70), Expect = 3.2
Identities = 21/97 (21%), Positives = 46/97 (47%), Gaps = 12/97 (12%)
Query: 47 SKKDKKDKDRDKE---KEKEKKDKEKDKSAVSSKEKEKDKVSSKEK--ERKESKPKESSS 101
+KK+ + ++ KE+ K + + + + + E + EK +KE
Sbjct: 47 AKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNE---LQKLEKRLLQKEENLDRKLE 103
Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
EK++E+ +KKEK + K +++ ++K+E+ E
Sbjct: 104 LLEKREEELEKKEKELEQK----QQELEKKEEELEEL 136
Score = 29.4 bits (67), Expect = 7.0
Identities = 22/94 (23%), Positives = 52/94 (55%), Gaps = 4/94 (4%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE-KE- 104
+K + + R++E EK++K+ E+ + + KE+E +++ ++ + E ++ E KE
Sbjct: 99 DRKLELLEKREEELEKKEKELEQKQQELEKKEEELEELIEEQLQELERISGLTAEEAKEI 158
Query: 105 --KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
+K E++ + E + K+ + E ++ K+ KE
Sbjct: 159 LLEKVEEEARHEAAVLIKEIEEEAKEEADKKAKE 192
>gnl|CDD|191179 pfam05053, Menin, Menin. MEN1, the gene responsible for multiple
endocrine neoplasia type 1, is a tumour suppressor gene
that encodes a protein called Menin which may be an
atypical GTPase stimulated by nm23.
Length = 618
Score = 33.8 bits (77), Expect = 0.25
Identities = 26/118 (22%), Positives = 46/118 (38%), Gaps = 13/118 (11%)
Query: 79 KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
++K + EKE KESK +E ++ ++ KS + E E +
Sbjct: 452 RQKVVIKLPEKEAKESKEAAGEEAREGRRRGPRRESKS--QEPSGGESPNPELPANNNNS 509
Query: 139 SSSKIVSSSHNSKEPAS------GSQLISHPPPPAPTPT-----QKSPVKTKEKEKEK 185
+S+ ++ + KE A+ + S P P + ++ PV T EK K
Sbjct: 510 NSNNNNNNGADRKEAAATTGNATTTSNGSGTSVPLPVSSEPPQHKEGPVITFYSEKMK 567
>gnl|CDD|227594 COG5269, ZUO1, Ribosome-associated chaperone zuotin [Translation,
ribosomal structure and biogenesis / Posttranslational
modification, protein turnover, chaperones].
Length = 379
Score = 33.9 bits (77), Expect = 0.25
Identities = 28/112 (25%), Positives = 52/112 (46%), Gaps = 2/112 (1%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
K K++D++ + + +P S +++K+ K K E+E + K +A+ K +
Sbjct: 209 KLKNQDNARLKRLVQIAKKRDPRIKSFKEQEKEMKKIRKW-EREAGARLKALAALKGKAE 267
Query: 80 EKDKVSSK-EKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
K+K + E + K+ + E KK K +KK + KD D D D+
Sbjct: 268 AKNKAEIEAEALASATAVKKKAKEVMKKALKMEKKAIKNAAKDADYFGDADK 319
>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
selection and in elongation by RNA polymerase II
[Transcription].
Length = 521
Score = 33.9 bits (77), Expect = 0.25
Identities = 15/71 (21%), Positives = 32/71 (45%), Gaps = 2/71 (2%)
Query: 68 EKDKSAVSSKEKEKDKVSS--KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
+ + S+ E++K K+ K +ER+E E E ++ K+ K+ +E +++
Sbjct: 132 TRYEPLTSAAEEKKKKLLELKKTREREERLYSERHIELQRFKDYKELEESEQGLQEEYTP 191
Query: 126 RDKDEKKEQKE 136
+E E
Sbjct: 192 SYAEEAVEDIS 202
>gnl|CDD|111859 pfam03015, Sterile, Male sterility protein. This family represents
the C-terminal region of the male sterility protein in a
number of Arabidopsis and Drosophila. A sequence-related
jojoba acyl CoA reductase is also included.
Length = 94
Score = 31.5 bits (72), Expect = 0.26
Identities = 8/22 (36%), Positives = 12/22 (54%)
Query: 574 FLHMIPGMIMDTVLRCLNKPPR 595
F H +P +D +LR + PR
Sbjct: 1 FYHTLPAYFLDLLLRLYGQKPR 22
>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein. The proteins in this family
are designated YL1. These proteins have been shown to be
DNA-binding and may be a transcription factor.
Length = 238
Score = 33.1 bits (76), Expect = 0.26
Identities = 36/152 (23%), Positives = 54/152 (35%), Gaps = 27/152 (17%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
D D ++ E E D+E E EK+ + ++K+ ++ E KKK+KKD
Sbjct: 57 DFDDSEDDEPESDDEE---------EGEKELQREERLKKKKRVKTKAYKEPTKKKKKKDP 107
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
R + K E+ + S SS +S T
Sbjct: 108 TAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSS------------------TVQN 149
Query: 173 KSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
K + KE+E K K K KKK+K
Sbjct: 150 KEATHERLKEREIRRKKIQAKARKRKEKKKEK 181
Score = 29.7 bits (67), Expect = 4.1
Identities = 34/123 (27%), Positives = 59/123 (47%), Gaps = 14/123 (11%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSS--SKKDKKDKDRDKEKEKE 63
++ S +A P P K K + S P+ +P SS S K+ ++ KE+E
Sbjct: 108 TAAKSPKAAAPRP-KKKSERISWAPTL-----LDSPRRKSSRSSTVQNKEATHERLKERE 161
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK------EKKKEKKDKKEKSH 117
+ K+ A KEK+K+K ++E+ E+K E + K E+++EKK K ++
Sbjct: 162 IRRKKIQAKARKRKEKKKEKELTQEERLAEAKETERINLKSLERYEEQEEEKKKAKIQAL 221
Query: 118 KHK 120
K +
Sbjct: 222 KKR 224
Score = 28.9 bits (65), Expect = 6.4
Identities = 21/77 (27%), Positives = 37/77 (48%), Gaps = 7/77 (9%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKD------KSAVSSKEKEKDKVSSKEKERKESKPK 97
+++K K R K+K + KS+ SS + K+ + KER+ + K
Sbjct: 107 PTAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSSTVQNKEATHERLKEREI-RRK 165
Query: 98 ESSSEKEKKKEKKDKKE 114
+ ++ K+KEKK +KE
Sbjct: 166 KIQAKARKRKEKKKEKE 182
>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon.
Length = 431
Score = 33.9 bits (77), Expect = 0.28
Identities = 27/184 (14%), Positives = 79/184 (42%), Gaps = 3/184 (1%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
T+ S S+ ++ ++ + + +++EK++S +E E+ + +K +++ + + E
Sbjct: 89 TDQSLSEPSRRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQKNDWRDAEECQ 148
Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS 161
++EK+ E +++++ +++ K + E+ S + E + +
Sbjct: 149 KEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSRG--GAEGAQVEAGKEFEKLK 206
Query: 162 HPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
A ++ K +E+ K E + + +K + +K KE+ + + +
Sbjct: 207 QKQQEAALELEELKKKREERRKVLEEE-EQRRKQEEADRKSREEEEKRRLKEEIERRRAE 265
Query: 222 ESHK 225
+ K
Sbjct: 266 AAEK 269
Score = 31.5 bits (71), Expect = 1.4
Identities = 16/107 (14%), Positives = 50/107 (46%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+ K K ++ E E K K+K+++ E+ +K+ + + E+E+
Sbjct: 175 MTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEE 234
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
++ K+++ ++ + +++ R ++ ++ + E+ + V S++
Sbjct: 235 QRRKQEEADRKSREEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSED 281
Score = 30.0 bits (67), Expect = 4.2
Identities = 18/97 (18%), Positives = 42/97 (43%)
Query: 17 SPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
S + A + KK ++R K E+E++ ++++++ S
Sbjct: 187 SRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADRKS 246
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+E+E+ + +E ER+ ++ E + + +DKK
Sbjct: 247 REEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSEDKK 283
Score = 30.0 bits (67), Expect = 4.5
Identities = 21/137 (15%), Positives = 57/137 (41%), Gaps = 1/137 (0%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE 61
+ S S + ++++ + S K D +D + +++E
Sbjct: 92 SLSEPSRRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQKNDWRDAEECQKEE 151
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
KE + +E++K S E+ + + + + E+ +E + + K+ ++ K ++
Sbjct: 152 KEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQE 211
Query: 122 KDRERDKDEKKEQKESK 138
E + + KK+++E +
Sbjct: 212 AALELE-ELKKKREERR 227
Score = 28.8 bits (64), Expect = 7.8
Identities = 30/169 (17%), Positives = 68/169 (40%), Gaps = 6/169 (3%)
Query: 63 EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
E++ + K S S + ++ E+ +E E +++E+ ++ E K + K
Sbjct: 79 ERQKEFKPTSTDQSLSEPSRRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQK 138
Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKE 182
+ RD +E +++++ + S E +G + ++H + + + E
Sbjct: 139 NDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNG-EFMTHKLKHTENTFSRGGAEGAQVE 197
Query: 183 KEKESSTTHDKHSK-----HKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
KE K + + KKK + K +E+ + +E+ KS
Sbjct: 198 AGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADRKS 246
>gnl|CDD|215406 PLN02761, PLN02761, lipase class 3 family protein.
Length = 527
Score = 33.9 bits (77), Expect = 0.28
Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 12/69 (17%)
Query: 16 PSPHK------------NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
SP+K K K S I S+S +S +S+ T S K D +E+E E
Sbjct: 18 SSPNKIFKTQPQTLILTTKFKTCSIICSSSCTSISSSTTQQKQSNKQTHVSDNKREEEPE 77
Query: 64 KKDKEKDKS 72
++ +EK+ S
Sbjct: 78 EELEEKEVS 86
>gnl|CDD|217884 pfam04086, SRP-alpha_N, Signal recognition particle, alpha subunit,
N-terminal. SRP is a complex of six distinct
polypeptides and a 7S RNA that is essential for
transferring nascent polypeptide chains that are
destined for export from the cell to the translocation
apparatus of the endoplasmic reticulum (ER) membrane.
SRP binds hydrophobic signal sequences as they emerge
from the ribosome, and arrests translation.
Length = 272
Score = 33.2 bits (76), Expect = 0.29
Identities = 22/95 (23%), Positives = 35/95 (36%), Gaps = 3/95 (3%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
S K+ S P K K ++TS +S + +SS+ + KE
Sbjct: 120 SKKTVDSMIERKPKEPGLKRKQRKKAQESATSPESSPSSTPNSSRPSTPHLLKAKEGPSR 179
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
+ K S+ +S EK S K K + K+
Sbjct: 180 RAKKAAKLSSTASSGDEK---SPKSKAAPKKAGKK 211
Score = 32.0 bits (73), Expect = 0.79
Identities = 26/151 (17%), Positives = 47/151 (31%), Gaps = 3/151 (1%)
Query: 10 SSSSAHPSPHKNKDKDSSAIPS-TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKE 68
S SP + + S T S P +K +K +
Sbjct: 100 ESKKQAKSPKAMRTFEESKKSKKTVDSMIERKPKEPGLKRKQRKKAQESATSPESSPSST 159
Query: 69 KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
+ S S+ K K + +K +K SS+ ++ K K K R+ D
Sbjct: 160 PNSSRPSTPHLLKAKEGPSRRAKKAAK--LSSTASSGDEKSPKSKAAPKKAGKKMRKWDL 217
Query: 129 DEKKEQKESKSSSKIVSSSHNSKEPASGSQL 159
D ++ S ++ N+ P ++
Sbjct: 218 DGDEDDDAVLDYSAPDANDENADAPEDVEEV 248
Score = 29.7 bits (67), Expect = 4.5
Identities = 33/145 (22%), Positives = 58/145 (40%), Gaps = 10/145 (6%)
Query: 65 KDKEKDKSAVSSKEKEKDK-----VSSKEKERKES--KPKESSSEKEKKKEKKDKKEKSH 117
K++ + + A ++ ++E D+ + EKE K+ PK + +E KK KK
Sbjct: 70 KNQLRQEKARTTYDEEFDEYFDQQLRELEKESKKQAKSPKAMRTFEESKKSKKTVDSMIE 129
Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVK 177
+ + + K KK Q+ + S SS+ NS P++ L P+ K K
Sbjct: 130 RKPKEPGLKRKQRKKAQESATSPESSPSSTPNSSRPSTPHLL---KAKEGPSRRAKKAAK 186
Query: 178 TKEKEKEKESSTTHDKHSKHKHKKK 202
+ + K + K KK
Sbjct: 187 LSSTASSGDEKSPKSKAAPKKAGKK 211
>gnl|CDD|215591 PLN03124, PLN03124, poly [ADP-ribose] polymerase; Provisional.
Length = 643
Score = 33.7 bits (77), Expect = 0.29
Identities = 23/110 (20%), Positives = 49/110 (44%), Gaps = 5/110 (4%)
Query: 27 SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK----DKEKDKSAVSSKEKEKD 82
AI + ++S S +S+ KK ++ +D ++ K D+ K + +E +
Sbjct: 34 DAIAEDAKTASKSGTKSSAGRKKRRERQDDGDDEPVSPKRIAIDEVKGMTVRELREAASE 93
Query: 83 KVSSKEKERKESKPK-ESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
+ + +K+ + ++ E + K + + K K D ER+K+EK
Sbjct: 94 RGLATTGRKKDLLERLCAALESDVKVGSANGTGEDEKEKGGDEEREKEEK 143
>gnl|CDD|226889 COG4487, COG4487, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 438
Score = 33.6 bits (77), Expect = 0.29
Identities = 13/98 (13%), Positives = 41/98 (41%), Gaps = 2/98 (2%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK--ESKPKESSSEKEK 105
KK+ + +K+++ ++ + +D+++ E K KE +++
Sbjct: 63 KKELSQLEEQLINQKKEQKNLFNEQIKQFELALQDEIAKLEALELLNLEKDKELELLEKE 122
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
E + +K ++ + E+ ++ K ++ K ++
Sbjct: 123 LDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEK 160
Score = 33.6 bits (77), Expect = 0.31
Identities = 19/105 (18%), Positives = 48/105 (45%), Gaps = 3/105 (2%)
Query: 50 DKKDKDRDK--EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+++D+ R +E EK+ EK S+K+KE ++ + +K+ + + + ++ +
Sbjct: 33 EQEDQSRILNTLEEFEKEANEKRAQYRSAKKKELSQLEEQLINQKKEQKNLFNEQIKQFE 92
Query: 108 EKKDKKEKSHKHKDK-DRERDKDEKKEQKESKSSSKIVSSSHNSK 151
+ + + + E+DK+ + +KE SK + +
Sbjct: 93 LALQDEIAKLEALELLNLEKDKELELLEKELDELSKELQKQLQNT 137
Score = 29.8 bits (67), Expect = 4.4
Identities = 21/96 (21%), Positives = 43/96 (44%), Gaps = 3/96 (3%)
Query: 43 NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
+K + EKE ++ KE K ++ E + K + + E + E E
Sbjct: 104 ALELLNLEKDKELELLEKELDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEKKLE 163
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ + E++ +E+ H + + + + E +EQ+ESK
Sbjct: 164 ESLELEREKFEEQLH---EANLDLEFKENEEQRESK 196
Score = 29.0 bits (65), Expect = 7.6
Identities = 19/101 (18%), Positives = 49/101 (48%), Gaps = 7/101 (6%)
Query: 48 KKDKKDKDRDK------EKEKEKKDKEKDKSAVSSKEKEKDKVSSK-EKERKESKPKESS 100
+ D+ K+ K E ++K++ K++ + + ++K + S + E+E+ E + E++
Sbjct: 122 ELDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEKKLEESLELEREKFEEQLHEAN 181
Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
+ E K+ ++ ++ K K R + ++ Q E+
Sbjct: 182 LDLEFKENEEQRESKWAILKKLKRRAELGSQQVQGEALELP 222
>gnl|CDD|224415 COG1498, SIK1, Protein implicated in ribosomal biogenesis, Nop56p
homolog [Translation, ribosomal structure and
biogenesis].
Length = 395
Score = 33.5 bits (77), Expect = 0.30
Identities = 12/63 (19%), Positives = 28/63 (44%), Gaps = 2/63 (3%)
Query: 69 KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
+ +S +E+ + ++ ++ + K KP + + KKE+ + + K K ER
Sbjct: 335 GEPDGISLREELEKRI--EKLKEKPPKPPTKAKPERDKKERPGRYRRKKKEKKAKSERRG 392
Query: 129 DEK 131
+
Sbjct: 393 LQN 395
Score = 32.4 bits (74), Expect = 0.62
Identities = 16/61 (26%), Positives = 34/61 (55%), Gaps = 2/61 (3%)
Query: 79 KEKDKVSSKEK-ERKESKPKESSSEKE-KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E D +S +E+ E++ K KE + K K ++DKKE+ +++ K +E+ ++ +
Sbjct: 335 GEPDGISLREELEKRIEKLKEKPPKPPTKAKPERDKKERPGRYRRKKKEKKAKSERRGLQ 394
Query: 137 S 137
+
Sbjct: 395 N 395
>gnl|CDD|219569 pfam07777, MFMR, G-box binding protein MFMR. This region is found
to the N-terminus of the pfam00170 transcription factor
domain. It is between 150 and 200 amino acids in length.
The N-terminal half is rather rich in proline residues
and has been termed the PRD (proline rich domain),
whereas the C-terminal half is more polar and has been
called the MFMR (multifunctional mosaic region). It has
been suggested that this family is composed of three
sub-families called A, B and C, classified according to
motif composition. It has been suggested that some of
these motifs may be involved in mediating
protein-protein interactions. The MFMR region contains a
nuclear localisation signal in bZIP opaque and GBF-2.
The MFMR also contains a transregulatory activity in
TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention
signals.
Length = 189
Score = 32.9 bits (75), Expect = 0.31
Identities = 29/93 (31%), Positives = 38/93 (40%), Gaps = 10/93 (10%)
Query: 14 AHPS-PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS 72
AHPS P + A+PS ST P + + K +KD+ K K K D S
Sbjct: 89 AHPSMPPGSHPFSPYAMPSAEVPGST--PLSMETDAKSSDNKDKGSIK----KSKGSDGS 142
Query: 73 ---AVSSKEKEKDKVSSKEKERKESKPKESSSE 102
A+S K E K S S+ ES S+
Sbjct: 143 LGLAMSGKNGESGKASGSSANGGSSQSSESGSD 175
>gnl|CDD|145900 pfam02994, Transposase_22, L1 transposable element.
Length = 370
Score = 33.5 bits (76), Expect = 0.31
Identities = 25/91 (27%), Positives = 42/91 (46%), Gaps = 1/91 (1%)
Query: 20 KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
+N+D SS+ P TSSS + P ++ D+K E+ +K + + + +K K
Sbjct: 10 RNQDTQSSSEPPKPTSSSPATPNTWENNDLDEKSYLIMMEEGFKKDNYSSLREDIETKGK 69
Query: 80 EKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
E KE E +K E+ E +K K+
Sbjct: 70 EVQNF-LKELEECITKQVEAHIENTEKCLKE 99
>gnl|CDD|220376 pfam09745, DUF2040, Coiled-coil domain-containing protein 55
(DUF2040). This entry is a conserved domain of
approximately 130 residues of proteins conserved from
fungi to humans. The proteins do contain a coiled-coil
domain, but the function is unknown.
Length = 128
Score = 32.0 bits (73), Expect = 0.32
Identities = 19/100 (19%), Positives = 53/100 (53%), Gaps = 8/100 (8%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+ K++K + + E+++ +K K +++ +++++ + +ERK K +E ++
Sbjct: 29 AAKEEKKQAKLSERKENRKPKYIGSLLEAAERRKREREIA--EERKLQKEREKEGDEFAD 86
Query: 107 KEK------KDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
KEK K + E++ K +++++ER++ E++
Sbjct: 87 KEKFVTSAYKKQLEENRKLEEEEKEREELEEENDVTKGKD 126
>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal. This
domain is found to the N-terminus of bacterial signal
peptidases of the S49 family (pfam01343).
Length = 154
Score = 32.1 bits (74), Expect = 0.34
Identities = 20/78 (25%), Positives = 37/78 (47%), Gaps = 15/78 (19%)
Query: 73 AVSSKEKEK------DKVSSKEKERKESKPKESSSEKEKKK-EKKDKKEKSHKHKDKDRE 125
A++ K+K K ++ + K+ KES +KE K EK +KK ++
Sbjct: 30 ALAQKKKGKKGELEITDLNEEYKDLKESLEAALLDKKELKAWEKAEKK--------AEKA 81
Query: 126 RDKDEKKEQKESKSSSKI 143
+ K EKK+ K+ + ++
Sbjct: 82 KAKAEKKKAKKEEPKPRL 99
>gnl|CDD|235943 PRK07133, PRK07133, DNA polymerase III subunits gamma and tau;
Validated.
Length = 725
Score = 33.6 bits (77), Expect = 0.35
Identities = 16/96 (16%), Positives = 38/96 (39%), Gaps = 6/96 (6%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
+ + + E E + K K + K+K K + + + K + E+ E
Sbjct: 354 ALSELEEEDENEIKFK---KIEENSIDNLDIKEK---KIENENDIEGKSDTKNLEEGFET 407
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
KD K K+ +K + + + + +++I++
Sbjct: 408 KDNKNKNSSFINKTENILTNSPLKDELLEKTTEIIN 443
>gnl|CDD|130249 TIGR01181, dTDP_gluc_dehyt, dTDP-glucose 4,6-dehydratase. This
protein is related to UDP-glucose 4-epimerase (GalE) and
likewise has an NAD cofactor [Cell envelope,
Biosynthesis and degradation of surface polysaccharides
and lipopolysaccharides].
Length = 317
Score = 33.1 bits (76), Expect = 0.36
Identities = 42/175 (24%), Positives = 58/175 (33%), Gaps = 51/175 (29%)
Query: 282 ISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQD--AFT-LNIQATRELLDL 338
+ GDI L + F +H ++H AA D I AF N+ T LL+
Sbjct: 55 VKGDIGDREL----VSRLFTEHQPDAVVHFAAESHVDRSISGPAAFIETNVVGTYTLLEA 110
Query: 339 ATRCSQLKAILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFG 398
+ H+ST +E Y DL T L
Sbjct: 111 VRKYWHEFRFHHIST-----------DEV------YGDLEKGDAFTETTPLAP------- 146
Query: 399 GIYNNSYSFTKAIGESVVEKYL--YKLPLAMVRPSIVVSTWKEPIVGWSNNLYGP 451
++ YS +KA + +V Y Y LP + R SNN YGP
Sbjct: 147 ---SSPYSASKAASDHLVRAYHRTYGLPALITRC--------------SNN-YGP 183
>gnl|CDD|114629 pfam05917, DUF874, Helicobacter pylori protein of unknown function
(DUF874). This family consists of several hypothetical
proteins specific to Helicobacter pylori. The function
of this family is unknown.
Length = 417
Score = 33.3 bits (75), Expect = 0.37
Identities = 30/168 (17%), Positives = 78/168 (46%), Gaps = 3/168 (1%)
Query: 21 NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE 80
++DK + + + + N S + +++++ ++EK+K +K+ + ++ E+E
Sbjct: 123 DQDKKIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELANSQIKAEQE 182
Query: 81 KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD--EKKEQKESK 138
K K ++++ ++ K K S+ + E + +K+K+ K + KD ++ EQ +
Sbjct: 183 KQKTEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQDLIKEQKDFIKEAEQNCQE 242
Query: 139 SSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
+ ++ K ++ + P P T ++P++ K K+
Sbjct: 243 NHNQFFIKKLGIK-AGIAIEIEAECKTPKPAKTNQTPIQPKHLPNSKQ 289
Score = 32.2 bits (72), Expect = 0.84
Identities = 21/105 (20%), Positives = 48/105 (45%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
+ +DKK + +KE E +KS + +++E+ K+K KE +S K
Sbjct: 120 FADDQDKKIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELANSQIKA 179
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
+++++K ++EK ++K + + K + + K + +
Sbjct: 180 EQEKQKTEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQD 224
>gnl|CDD|118064 pfam09528, Ehrlichia_rpt, Ehrlichia tandem repeat (Ehrlichia_rpt).
This entry represents 77 residues of an 80 amino acid
(240 nucleotide) tandem repeat, found in a variable
number of copies in an immunodominant outer membrane
protein of Ehrlichia chaffeensis, a tick-borne obligate
intracellular pathogen.
Length = 707
Score = 33.5 bits (75), Expect = 0.37
Identities = 32/275 (11%), Positives = 94/275 (34%), Gaps = 8/275 (2%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS-KKDKKDKDRDKEKEKE 63
++ + ++ +D +S P + + +K+++ + ++
Sbjct: 274 IEEHQGETEKEEGIPESHAEDLQPAVDDIVEHPSSEPFVAEEEVSETEKEENNPEVLAED 333
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
+D +S VS + + + E E + + ++ E E +
Sbjct: 334 LQDAADGESGVSDQPAQVVEERESEIEEHQGETEKEEGIPESHAEDDEIASDPSIEHFSA 393
Query: 124 RERDKDEKKEQKESKSSSKIVS-SSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKE 182
+ + E++ES K + A + P + ++ ++ E
Sbjct: 394 EVGKEVSETEKEESNPEVKAEDLQPAVDGDVAHHESEVGDKPAETSKEEESPEIEAEDGE 453
Query: 183 KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYIL 242
K+ H+++D+ + + +E A+ K ++ + G + +
Sbjct: 454 PAKDGGIEE------SHQEEDEIVSEPSKEEFTAEVKAEDLQPAVDGSVEHSSSEVGEEV 507
Query: 243 LRSKKNKTVQERLAEQFKDELFDRLKNEQADILQR 277
++K ++ E AE + D L++ ++ ++
Sbjct: 508 SETEKEESNPEIKAEDLPPAVDDSLEHSIPEVGEK 542
Score = 33.1 bits (74), Expect = 0.50
Identities = 39/192 (20%), Positives = 73/192 (38%), Gaps = 12/192 (6%)
Query: 7 SSSSSSSAHPSPHKNKDK---DSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
+ P H D+ D S ++ + T S + K +D + +
Sbjct: 364 GETEKEEGIPESHAEDDEIASDPSIEHFSAEVGKEVSETEKEESNPEVKAEDLQPAVDGD 423
Query: 64 KKDKEK---DKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKKEKKDKKEKSHKH 119
E DK A +SKE+E ++ +++ E K+ +ES E+++ + K+E + +
Sbjct: 424 VAHHESEVGDKPAETSKEEESPEIEAEDGEPAKDGGIEESHQEEDEIVSEPSKEEFTAEV 483
Query: 120 KDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTK 179
K +D + D E S+ ++ S KE ++ PP + S +
Sbjct: 484 KAEDLQPAVDGSVEHSSSEVGEEV---SETEKEESNPEIKAEDLPPAVDDSLEHSIPEVG 540
Query: 180 EKEKE--KESST 189
EK E E
Sbjct: 541 EKVDEMFAEEFN 552
Score = 32.7 bits (73), Expect = 0.71
Identities = 31/163 (19%), Positives = 69/163 (42%), Gaps = 10/163 (6%)
Query: 70 DKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
DK A +SKE+E ++ +++ E K+ +ES E+++ + K+E + + K +D +
Sbjct: 132 DKPAKTSKEEENPEIEAEDGEPAKDDGIEESHQEEDEIVSESSKEEFTAEVKAEDLQPAV 191
Query: 129 DEKKEQKESKSSSKIVSSSHNSK---------EPASGSQLISHPPPPAPTPTQKSPVKTK 179
D E S+ ++ + +PA + H P + S +
Sbjct: 192 DGSIEHSSSEVGEEVSKTEKEESNPEVKAEDLQPAVDDDVAHHESEVGDKPAETSKEEET 251
Query: 180 EKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
+ K ++ D +H + ++H +T +E +S ++
Sbjct: 252 PEVKAEDLQPAVDGSVEHSSSEIEEHQGETEKEEGIPESHAED 294
>gnl|CDD|214661 smart00435, TOPEUc, DNA Topoisomerase I (eukaryota). DNA
Topoisomerase I (eukaryota), DNA topoisomerase V,
Vaccinia virus topoisomerase, Variola virus
topoisomerase, Shope fibroma virus topoisomerase.
Length = 391
Score = 33.1 bits (76), Expect = 0.39
Identities = 23/91 (25%), Positives = 40/91 (43%), Gaps = 13/91 (14%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
K +++ K + + K +K K K+ SK + E +E ++KK++K
Sbjct: 281 KLQEKIKALKYQLKRLKKMILLFEMISDLKRKLKSKFERDNEKL----DAEVKEKKKEKK 336
Query: 112 KKEKSHKHKDKDRER---------DKDEKKE 133
K+EK K ++ ER DK+E K
Sbjct: 337 KEEKKKKQIERLEERIEKLEVQATDKEENKT 367
Score = 30.0 bits (68), Expect = 4.0
Identities = 16/55 (29%), Positives = 30/55 (54%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESK 95
S + ++K KEK+KEKK +EK K + E+ +K+ + +++E+K
Sbjct: 312 KLKSKFERDNEKLDAEVKEKKKEKKKEEKKKKQIERLEERIEKLEVQATDKEENK 366
Score = 29.6 bits (67), Expect = 5.1
Identities = 21/67 (31%), Positives = 33/67 (49%), Gaps = 8/67 (11%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK--ERKESKPKESSSEKE 104
S +K K + E++ EK D E + K+KEK K K+K ER E + ++ +
Sbjct: 307 SDLKRKLKSKF-ERDNEKLDAEVKE-----KKKEKKKEEKKKKQIERLEERIEKLEVQAT 360
Query: 105 KKKEKKD 111
K+E K
Sbjct: 361 DKEENKT 367
>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein. This family includes proteins
related to Mpp10 (M phase phosphoprotein 10). The U3
small nucleolar ribonucleoprotein (snoRNP) is required
for three cleavage events that generate the mature 18S
rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
depletion of Mpp10, a U3 snoRNP-specific protein, halts
18S rRNA production and impairs cleavage at the three U3
snoRNP-dependent sites.
Length = 613
Score = 33.4 bits (76), Expect = 0.40
Identities = 11/102 (10%), Positives = 43/102 (42%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
S +D++D + + ++ D ++ + + + + +KE + + E++++
Sbjct: 240 SGEDEEDDEEGNIEYEDFFDPKEKDKKKDAGDDAELEDDEPDKEAVKKEADSKPEEEDEE 299
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
++++ + + + ++ K ++ + S SS
Sbjct: 300 DDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDLESPKELSSF 341
Score = 32.7 bits (74), Expect = 0.56
Identities = 27/134 (20%), Positives = 52/134 (38%), Gaps = 4/134 (2%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
D+D ++ ++ + KD S E E+D E+ E + EK+KKK+ D
Sbjct: 217 DEDDFEDYFQDDSEDGKDDEDFGSGEDEEDD----EEGNIEYEDFFDPKEKDKKKDAGDD 272
Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
E DK+ + + + K ++E + + + P + + P
Sbjct: 273 AELEDDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDL 332
Query: 173 KSPVKTKEKEKEKE 186
+SP + EK +
Sbjct: 333 ESPKELSSFEKRQA 346
Score = 31.5 bits (71), Expect = 1.4
Identities = 24/125 (19%), Positives = 52/125 (41%), Gaps = 9/125 (7%)
Query: 47 SKKDKKDKDRDK-------EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER--KESKPK 97
+ D +D +D E +D+E D+ E D +K+ +++ +
Sbjct: 217 DEDDFEDYFQDDSEDGKDDEDFGSGEDEEDDEEGNIEYEDFFDPKEKDKKKDAGDDAELE 276
Query: 98 ESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
+ +KE K++ D K + +D ++E D+DE++ + + K+ + S
Sbjct: 277 DDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDLESPK 336
Query: 158 QLISH 162
+L S
Sbjct: 337 ELSSF 341
Score = 30.7 bits (69), Expect = 2.8
Identities = 24/95 (25%), Positives = 43/95 (45%), Gaps = 6/95 (6%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-----KPKESSSEK 103
+KDK +D + E +D E DK AV + K + +E + +E +P E++ +K
Sbjct: 260 PKEKDKKKDAGDDAELEDDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDK 319
Query: 104 EKKKEKKDKKEKSHKHKDKDR-ERDKDEKKEQKES 137
K E + K+ E+ + + K+Q E
Sbjct: 320 VKLDEPVLEGVDLESPKELSSFEKRQAKLKQQIEQ 354
Score = 30.0 bits (67), Expect = 4.2
Identities = 16/63 (25%), Positives = 35/63 (55%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+ D K ++ D+E ++++ D+++++ ++ +K K E ES + SS EK + K
Sbjct: 288 EADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDLESPKELSSFEKRQAK 347
Query: 108 EKK 110
K+
Sbjct: 348 LKQ 350
>gnl|CDD|115579 pfam06933, SSP160, Special lobe-specific silk protein SSP160. This
family consists of several special lobe-specific silk
protein SSP160 sequences which appear to be specific to
Chironomus (Midge) species.
Length = 758
Score = 33.2 bits (75), Expect = 0.41
Identities = 15/41 (36%), Positives = 30/41 (73%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS 47
S +SSSSA+ + + N +++ S++++++TSN T+SS+S
Sbjct: 106 SGNSSSSANSTSNSNSTTSNNSTTSSNSTTTTSNSTSSSNS 146
>gnl|CDD|219668 pfam07964, Red1, Rec10 / Red1. Rec10 / Red1 is involved in meiotic
recombination and chromosome segregation during
homologous chromosome formation. This protein localises
to the synaptonemal complex in S. cerevisiae and the
analogous structures (linear elements) in S. pombe. This
family is currently only found in fungi.
Length = 706
Score = 33.3 bits (76), Expect = 0.41
Identities = 36/208 (17%), Positives = 66/208 (31%), Gaps = 20/208 (9%)
Query: 22 KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
KD A ++ N K K K EK+++ + ++ SKE K
Sbjct: 405 KDPTIIAGKKLMNKLTSEKINNPVKVVKVSKYKGNKSEKKRDINVLDTIFASPVSKELRK 464
Query: 82 DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
SK+ + K KP + S+K+ K K + + ++ Q + SS
Sbjct: 465 KVGKSKQTKLKNFKPVPNKSKKQLANNNSQNI----KSKKVVKAKTNNKANLQDVGECSS 520
Query: 142 KIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK----EKEKESSTTHDKHSKH 197
+ N K+ ++ S KS + E K+ T
Sbjct: 521 PPNNKEKNDKQTSTSSS------------VLKSDRSSIEVRNPNANVKKLEDTTYNAKFP 568
Query: 198 KHKKKDKHGDKTNPKEKDAKSKEKESHK 225
K + + +DA + ++
Sbjct: 569 TVSKNNAYTLVDISTSEDAVNSADDTRS 596
Score = 29.5 bits (66), Expect = 6.3
Identities = 36/255 (14%), Positives = 80/255 (31%), Gaps = 32/255 (12%)
Query: 10 SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPT-----NSSSSKKDKKDKDRDKEKEKEK 64
SS + + + I + + S + N+ ++ K + + ++
Sbjct: 250 ESSVQDEECSREANVPTQDIEANTKDSLHMSAQDNHYDNTQLQTPERSTKRKSPIWDLKE 309
Query: 65 KDKEKDKSAVSSKEKEKDKVSSKEKER-----KESKP----------KESSSEKEKKKEK 109
KE + ++ + +K S E +E P + +S E KE
Sbjct: 310 DQKESKIKSGTNLKLSSEKESIPETSYVNVLEEEQSPLVRLQKRKLARSTSKTLESLKEV 369
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
+ + S K+K E + +E + + + + ++L S
Sbjct: 370 FEDQASSVKNKQAQSEENLNESPKTPIAVTGDPHLKDPTIIAGKKLMNKLTSEKINNPVK 429
Query: 170 PTQKSPVKTKEKEKEKE---------SSTTHDKHSKHKHKKKDKHGDKTNPKEKD---AK 217
+ S K + EK+++ S + + K K+ K + K
Sbjct: 430 VVKVSKYKGNKSEKKRDINVLDTIFASPVSKELRKKVGKSKQTKLKNFKPVPNKSKKQLA 489
Query: 218 SKEKESHKSSAGPKC 232
+ ++ KS K
Sbjct: 490 NNNSQNIKSKKVVKA 504
>gnl|CDD|225887 COG3351, FlaD, Putative archaeal flagellar protein D/E [Cell
motility and secretion].
Length = 214
Score = 32.5 bits (74), Expect = 0.41
Identities = 22/110 (20%), Positives = 46/110 (41%), Gaps = 15/110 (13%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
K +EK +K + +A + +E ++ + +K S+ E +E ++ E++
Sbjct: 2 KPYLEEKIEKAEKVTAFALEELKEKIEELPIQAKK--------SDDELVEELPERYEQTK 53
Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
++ ++ +E+ +KE S K+ KEPA S A
Sbjct: 54 ENSLIEKVDSIEEEISEKEKVMSEKL-------KEPAQMSSTSEEEEKKA 96
>gnl|CDD|220431 pfam09831, DUF2058, Uncharacterized protein conserved in bacteria
(DUF2058). This domain, found in various prokaryotic
proteins, has no known function.
Length = 177
Score = 32.2 bits (74), Expect = 0.41
Identities = 19/71 (26%), Positives = 34/71 (47%), Gaps = 11/71 (15%)
Query: 66 DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
DK+K K A KEK K + + RK + + ++ ++ K +K E +D++
Sbjct: 13 DKKKAKKA--KKEKRKQRK----QARKGADDGDDELKQAAEEAKAEKAE-----RDRELN 61
Query: 126 RDKDEKKEQKE 136
R + + EQK
Sbjct: 62 RQRQAEAEQKA 72
Score = 28.3 bits (64), Expect = 7.7
Identities = 14/63 (22%), Positives = 35/63 (55%), Gaps = 9/63 (14%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
KK KK+K + +++ ++ D D+ +++E + +K E++R E + +++ +
Sbjct: 18 KKAKKEKRKQRKQARKGADDGDDELKQAAEEAKAEK---AERDR------ELNRQRQAEA 68
Query: 108 EKK 110
E+K
Sbjct: 69 EQK 71
>gnl|CDD|238356 cd00660, Topoisomer_IB_N, Topoisomer_IB_N: N-terminal DNA binding
fragment found in eukaryotic DNA topoisomerase (topo) IB
proteins similar to the monomeric yeast and human topo I
and heterodimeric topo I from Leishmania donovani. Topo
I enzymes are divided into: topo type IA (bacterial)
and type IB (eukaryotic). Topo I relaxes superhelical
tension in duplex DNA by creating a single-strand nick;
the broken strand can then rotate around the unbroken
strand to remove DNA supercoils, and the nick is
religated, liberating topo I. These enzymes regulate the
topological changes that accompany DNA replication,
transcription and other nuclear processes. Human topo I
is the target of a diverse set of anticancer drugs
including camptothecins (CPTs). CPTs bind to the topo
I-DNA complex and inhibit re-ligation of the
single-strand nick, resulting in the accumulation of
topo I-DNA adducts. In addition to differences in
structure and some biochemical properties,
Trypanosomatid parasite topo I differ from human topo I
in their sensitivity to CPTs and other classical topo I
inhibitors. Trypanosomatid topos I play putative roles
in organizing the kinetoplast DNA network unique to
these parasites. This family may represent more than
one structural domain.
Length = 215
Score = 32.6 bits (75), Expect = 0.42
Identities = 13/32 (40%), Positives = 22/32 (68%), Gaps = 3/32 (9%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
+EKE+K++ KE EK+ KE+K+K E+ +
Sbjct: 96 EEKEKKKAMSKE---EKKAIKEEKEKLEEPYG 124
>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
Length = 619
Score = 33.2 bits (77), Expect = 0.43
Identities = 21/103 (20%), Positives = 37/103 (35%), Gaps = 2/103 (1%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
D + E++ E ++ E+E++ + ES+ E EK K K
Sbjct: 171 DGFVDPNAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAADESELPEKVLEKFKALA-K 229
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH-NSKE 152
K+ + K R KK K + + + S SK+
Sbjct: 230 QYKKLRKAQEKKVEGRLAQHKKYAKLREKLKEELKSLRLTSKQ 272
>gnl|CDD|227925 COG5638, COG5638, Uncharacterized conserved protein [Function
unknown].
Length = 622
Score = 33.2 bits (75), Expect = 0.45
Identities = 38/209 (18%), Positives = 67/209 (32%), Gaps = 30/209 (14%)
Query: 42 TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE----------- 90
T S S +D K +K +KE D S D + E E
Sbjct: 364 TASKLSDEDDDSVMESK-MQKLFSEKEIDFGLNSELVDMSDDGENGEMEDTFTSHLPASN 422
Query: 91 RKESKPKESSS-------EKEKKKEKKDKKEKSHKH------KDKDRERDKDEKKEQKES 137
ES K ++ +E+++ +K+++ K K KDK +K KK +
Sbjct: 423 ESESDDKLETTIEKLDRKLRERQENRKERQLKKTKDDSDVDLKDKKESINKKNKKGKHAI 482
Query: 138 KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTT-----HD 192
+ ++ K + + H + +K K K+K D
Sbjct: 483 ERTAASKEELELIKADDEDDEQLDHFDMKSILKAEKFKKNRKLKKKASNLEEGFVFDPKD 542
Query: 193 KHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
+ + D T+P+ K +K
Sbjct: 543 PRFVAIFEDHNFAIDPTHPEFKKTGGMKK 571
Score = 32.4 bits (73), Expect = 0.81
Identities = 30/142 (21%), Positives = 53/142 (37%), Gaps = 7/142 (4%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
D + + +S S S K + ++ K +E+++ K++
Sbjct: 398 LVDMSDDGENGEMEDTFTSHLPASNESESDDKLETTIEKLDRKLRERQENRKERQL---- 453
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
+K KD K++KES K++ K + KE+ K D + DE+ + +
Sbjct: 454 KKTKDDSDVDLKDKKESINKKNKKGKHAIERTAASKEELELIKADD---EDDEQLDHFDM 510
Query: 138 KSSSKIVSSSHNSKEPASGSQL 159
KS K N K S L
Sbjct: 511 KSILKAEKFKKNRKLKKKASNL 532
>gnl|CDD|218581 pfam05416, Peptidase_C37, Southampton virus-type processing
peptidase. Corresponds to Merops family C37.
Norwalk-like viruses (NLVs), including the Southampton
virus, cause acute non-bacterial gastroenteritis in
humans. The NLV genome encodes three open reading frames
(ORFs). ORF1 encodes a polyprotein, which is processed
by the viral protease into six proteins.
Length = 535
Score = 32.9 bits (75), Expect = 0.46
Identities = 26/139 (18%), Positives = 52/139 (37%), Gaps = 17/139 (12%)
Query: 10 SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDK-----------DRDK 58
A P K K+K + S + KK ++++ DR++
Sbjct: 224 EPQDATPEGKKGKNKKGRGKKHNAFSRRGLSDEEYDEYKKIREERGGKYSIQEYLEDRER 283
Query: 59 -EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES-----SSEKEKKKEKKDK 112
E+E ++ + + K + ++ K RK+ K + + + +K++ D
Sbjct: 284 YEEELAERQATEADFCEEEEAKIRQRIFGLRKTRKQRKEERAKLGLVTGSDIRKRKPIDW 343
Query: 113 KEKSHKHKDKDRERDKDEK 131
K D DR+ D +EK
Sbjct: 344 NPKGPLWADDDRQVDYNEK 362
>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168). This
family consists of several hypothetical eukaryotic
proteins of unknown function.
Length = 142
Score = 31.6 bits (72), Expect = 0.46
Identities = 21/95 (22%), Positives = 50/95 (52%), Gaps = 8/95 (8%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
D++ +KE +D+E + K K+++K ++K++ +++ K K+KKK+KK K+ +
Sbjct: 56 DEKWKKETEDEEFQQKREEKKRKDEEK-TAKKRAKRQKK-------KQKKKKKKKAKKGN 107
Query: 117 HKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
K + + + ++ E++E + + K
Sbjct: 108 KKEEKEGSKSSEESSDEEEEGEEDKQEEPVEIMEK 142
Score = 29.2 bits (66), Expect = 3.2
Identities = 22/75 (29%), Positives = 41/75 (54%), Gaps = 4/75 (5%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
+++KK KD +K +K K ++K K K+K+K +KE KE S E+++
Sbjct: 72 REEKKRKDEEKTAKKRAK-RQKKKQK---KKKKKKAKKGNKKEEKEGSKSSEESSDEEEE 127
Query: 108 EKKDKKEKSHKHKDK 122
++DK+E+ + +K
Sbjct: 128 GEEDKQEEPVEIMEK 142
>gnl|CDD|221121 pfam11489, DUF3210, Protein of unknown function (DUF3210). This is
a family of proteins conserved in yeasts. The function
is not known. The Schizosaccharomyces pombe member is
SPBC18E5.07 and the Saccharomyces cerevisiae member is
AIM21.
Length = 671
Score = 33.0 bits (75), Expect = 0.49
Identities = 30/150 (20%), Positives = 58/150 (38%), Gaps = 5/150 (3%)
Query: 30 PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE-KKDKEKDKSAVSSKEKEKDKVSSKE 88
P + + PT SS + +++ E E +D + VS+ + +S+
Sbjct: 384 PEDESEIAVKPPTEESSRRPEEEKHRFPSEDVWEDSPSSLQDTATVSTPSNPPPR-ASET 442
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
E++ S+ S + E K +K+K+ K R +D ++ ES+ +
Sbjct: 443 PEQETSRSSSEVSLDPHQSELKSEKKKARPEVSKQRFPSRDVWEDAPESQELVTTEETPE 502
Query: 149 NSKEPASGSQ---LISHPPPPAPTPTQKSP 175
K + G + S P PT ++ P
Sbjct: 503 EVKSSSPGVTKPAIPSRPKKGKPTSEKRKP 532
Score = 29.9 bits (67), Expect = 5.0
Identities = 34/201 (16%), Positives = 71/201 (35%), Gaps = 15/201 (7%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
S++ +++ S+ P + + +S+ S + S+K K + K++
Sbjct: 422 SLQDTATVSTPSNPPPRASETPEQETSRSSSEVSLDPHQSELKSEKKKARPEVSKQRFPS 481
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
+ E + E+ K +KP S K+ K + +K K K
Sbjct: 482 RDVWEDAPESQELVTTEETPEEVKSSSPGVTKPAIPSRPKKGKPTSEKRKPPPVPKKPKP 541
Query: 124 R---------ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS------HPPPPAP 168
+ ++ E+ K ++ + SK A + S P AP
Sbjct: 542 QIPARPAKLQKQQAGEEANSSAFKPKPRVPARPGGSKIAALKAGFASDLNGRLALGPQAP 601
Query: 169 TPTQKSPVKTKEKEKEKESST 189
+SP + +++KE++ T
Sbjct: 602 KKVLESPKEPSKEKKEEDEDT 622
Score = 29.1 bits (65), Expect = 9.1
Identities = 44/228 (19%), Positives = 75/228 (32%), Gaps = 20/228 (8%)
Query: 19 HKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
K++ + P + +S ++ S + +E E + E + AV
Sbjct: 338 EKSEKSRHESDPKSRENSKPASIYGSVPDLIRHTPLEDVEEYEPLFPEDES-EIAVKPPT 396
Query: 79 KEKDKVSSKEKERK------ESKPKES---------SSEKEKKKEKKDKKEKSHKHKDKD 123
+E + +EK R E P S+ + E +++ +
Sbjct: 397 EESSRRPEEEKHRFPSEDVWEDSPSSLQDTATVSTPSNPPPRASETPEQETSRSSSEVSL 456
Query: 124 RERDKDEKKEQKESKSS-SKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK- 181
+ K E+K+++ SK S + E A SQ + SP TK
Sbjct: 457 DPHQSELKSEKKKARPEVSKQRFPSRDVWEDAPESQELVTTEETPEEVKSSSPGVTKPAI 516
Query: 182 -EKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
+ K+ T +K KK K P K K + E SSA
Sbjct: 517 PSRPKKGKPTSEKRKPPPVPKKPKPQIPARP-AKLQKQQAGEEANSSA 563
>gnl|CDD|222011 pfam13257, DUF4048, Domain of unknown function (DUF4048). This
presumed domain is functionally uncharacterized. This
domain family is found in eukaryotes, and is typically
between 228 and 257 amino acids in length.
Length = 242
Score = 32.4 bits (74), Expect = 0.49
Identities = 17/51 (33%), Positives = 21/51 (41%), Gaps = 3/51 (5%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD 57
S S S S + + + S S S+ SSTS S KD K D D
Sbjct: 126 RSRRSGSRSTSRSRLRLQGGSLSSSRSSRSSTSKGATSG---KDSKSADID 173
>gnl|CDD|217502 pfam03343, SART-1, SART-1 family. SART-1 is a protein involved in
cell cycle arrest and pre-mRNA splicing. It has been
shown to be a component of U4/U6 x U5 tri-snRNP complex
in human, Schizosaccharomyces pombe and Saccharomyces
cerevisiae. SART-1 is a known tumour antigen in a range
of cancers recognised by T cells.
Length = 603
Score = 32.8 bits (75), Expect = 0.50
Identities = 34/155 (21%), Positives = 56/155 (36%), Gaps = 7/155 (4%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
K KK + D E +K K+K K S + + E+ + E K
Sbjct: 216 KKKKSDNLFTLDSGGSTDDEAEKKRQEVKKKLKINNVSLDDDSTETPASDYYDVSEMVKF 275
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
KK KK+K K K R +D DE + + E++ S S E + P
Sbjct: 276 KKPKKKKKKK---KKRRKDLDEDELEPEAEGLGSSDSGSRKDVE----EENARLEDSPKK 328
Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKD 203
++ E + + ++S + K +KK
Sbjct: 329 RKEEQEDDDFVEDDDDLQASLAKQRRLAQKKRKKL 363
Score = 32.4 bits (74), Expect = 0.78
Identities = 46/209 (22%), Positives = 81/209 (38%), Gaps = 39/209 (18%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAV-SSKEKEKDKVSSKEKERKESKP-----KESS 100
SKK +K K+ +++K +KEK+++A +S++ KV K +E +E + K++
Sbjct: 98 SKKRQKKKEAERKKALLLDEKEKERAAEYTSEDLAGLKVGHKVEEFEEGEDVILTLKDTG 157
Query: 101 ----------------SEKEKKKEK-KDKKEKSHKHKDKDRER---------DKDEKKEQ 134
EKEK K+ + KK+K D D + D++ + ++
Sbjct: 158 VLEDEDEGDELENVELVEKEKDKKNLELKKKKPDYDPDDDDKFNKRSILSKYDEEIEGKK 217
Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKH 194
K+S + + S E Q + + V + E +S +D
Sbjct: 218 KKSDNLFTLDSGGSTDDEAEKKRQ-------EVKKKLKINNVSLDDDSTETPASDYYDVS 270
Query: 195 SKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
K KK K K + KD E E
Sbjct: 271 EMVKFKKPKKKKKKKKKRRKDLDEDELEP 299
Score = 31.7 bits (72), Expect = 1.4
Identities = 21/82 (25%), Positives = 42/82 (51%)
Query: 31 STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
ST + S+ + S K KK K + K+K+K +KD ++D+ ++ S++
Sbjct: 256 DDSTETPASDYYDVSEMVKFKKPKKKKKKKKKRRKDLDEDELEPEAEGLGSSDSGSRKDV 315
Query: 91 RKESKPKESSSEKEKKKEKKDK 112
+E+ E S +K K++++ D
Sbjct: 316 EEENARLEDSPKKRKEEQEDDD 337
Score = 30.5 bits (69), Expect = 2.7
Identities = 24/108 (22%), Positives = 50/108 (46%), Gaps = 2/108 (1%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
K +++ + K K+++ ++K A + +++E++ K E + ++ KK KK +K
Sbjct: 44 KRQEEAEAKRKREELREKIAKAREKRERNSKLGGIKTLGEDDDDDDDTKAWLKKSKKRQK 103
Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSS-SKIVSSSHNSKEPASGSQLI 160
+K + K D+ EK+ E S + H +E G +I
Sbjct: 104 KKE-AERKKALLLDEKEKERAAEYTSEDLAGLKVGHKVEEFEEGEDVI 150
>gnl|CDD|218332 pfam04929, Herpes_DNAp_acc, Herpes DNA replication accessory
factor. Replicative DNA polymerases are capable of
polymerising tens of thousands of nucleotides without
dissociating from their DNA templates. The high
processivity of these polymerases is dependent upon
accessory proteins that bind to the catalytic subunit of
the polymerase or to the substrate. The Epstein-Barr
virus (EBV) BMRF1 protein is an essential component of
the viral DNA polymerase and is absolutely required for
lytic virus replication. BMRF1 is also a transactivator.
This family is predicted to have a UL42 like structure.
Length = 381
Score = 32.7 bits (75), Expect = 0.51
Identities = 20/95 (21%), Positives = 29/95 (30%), Gaps = 12/95 (12%)
Query: 23 DKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD 82
+ + T + S +S SS D D E S E+
Sbjct: 294 EANGVEPEPTGSVSDRPRHLSSDSSPSPPDTSDSDPSTETPP---PASLSHSPPAAFERP 350
Query: 83 KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
S K ++E K +K+KK KK K
Sbjct: 351 LALS-PKRKREGDKK--------QKKKKSKKLKLT 376
Score = 31.9 bits (73), Expect = 0.89
Identities = 20/63 (31%), Positives = 28/63 (44%), Gaps = 6/63 (9%)
Query: 9 SSSSSAHPSPHKNKDKDSS------AIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
SS + PSP D D S A S S ++ P S +K + DK + K+K K
Sbjct: 312 HLSSDSSPSPPDTSDSDPSTETPPPASLSHSPPAAFERPLALSPKRKREGDKKQKKKKSK 371
Query: 63 EKK 65
+ K
Sbjct: 372 KLK 374
>gnl|CDD|146486 pfam03879, Cgr1, Cgr1 family. Members of this family are
coiled-coil proteins that are involved in pre-rRNA
processing.
Length = 105
Score = 30.8 bits (70), Expect = 0.52
Identities = 14/47 (29%), Positives = 26/47 (55%), Gaps = 1/47 (2%)
Query: 90 ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E++ K E + K ++KE KD+KE + + +++ KE+KE
Sbjct: 31 EKRMEKRLEQQAIKAREKELKDEKEAERQRR-IQAIKERRAAKEEKE 76
>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
complex aNOP56 subunit; Provisional.
Length = 414
Score = 32.6 bits (75), Expect = 0.53
Identities = 17/53 (32%), Positives = 33/53 (62%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
++ K++++ + +E KE PK ++E+KK +K KK+K K K K R++ +
Sbjct: 362 DELKEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414
Score = 31.1 bits (71), Expect = 1.9
Identities = 15/53 (28%), Positives = 30/53 (56%), Gaps = 1/53 (1%)
Query: 82 DKVSSKEKER-KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
D++ + +R +E K K K+K++EKK +K K K + K ++ K + ++
Sbjct: 362 DELKEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414
Score = 30.3 bits (69), Expect = 3.4
Identities = 11/50 (22%), Positives = 27/50 (54%)
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
KE+ ++ +++ + K+ +K +K++KK K++K K + K +
Sbjct: 365 KEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414
Score = 29.6 bits (67), Expect = 5.5
Identities = 12/34 (35%), Positives = 20/34 (58%)
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
K+K K K+K + K + R++ K KK+ K+ K
Sbjct: 376 KEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRK 409
>gnl|CDD|233787 TIGR02223, ftsN, cell division protein FtsN. FtsN is a poorly
conserved protein active in cell division in a number of
Proteobacteria. The N-terminal 30 residue region tends
to be Lys/Arg-rich, and is followed by a
membrane-spanning region. This is followed by an acidic
low-complexity region of variable length and a
well-conserved C-terminal domain of two tandem regions
matched by pfam05036 (Sporulation related repeat), found
in several cell division and sporulation proteins. The
role of FtsN as a suppressor for other cell division
mutations is poorly understood; it may involve cell wall
hydrolysis [Cellular processes, Cell division].
Length = 298
Score = 32.4 bits (73), Expect = 0.54
Identities = 29/181 (16%), Positives = 64/181 (35%), Gaps = 24/181 (13%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
+ + +A P K +++ S + ++P S+ ++ E+ + +
Sbjct: 64 NQTENGETAADLPPKPEERWSYIEELEAREVLINDPEEPSNGGGVEESAQLTAEQRQLLE 123
Query: 66 DKEKDKSAVSSKEKE---KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
+ D A + V+ + +++ K + + E +K + EK +
Sbjct: 124 QMQADMRAAEKVLATAPSEQTVAVEARKQTAEKKPQKARTAEAQKTPVET-EKIASKVKE 182
Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKE 182
+++ K K+ E++S+SK P AP + K K KE
Sbjct: 183 AKQKQKALPKQTAETQSNSK--------------------PIETAPKADKADKTKPKPKE 222
Query: 183 K 183
K
Sbjct: 223 K 223
Score = 30.0 bits (67), Expect = 3.7
Identities = 27/127 (21%), Positives = 53/127 (41%), Gaps = 8/127 (6%)
Query: 32 TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAV-SSKEKEKDKVSSKEKE 90
T+ S T + + K K R E +K + EK S V +K+K+K +
Sbjct: 138 TAPSEQTVAVEARKQTAEKKPQKARTAEAQKTPVETEKIASKVKEAKQKQKALPKQTAET 197
Query: 91 RKESKPKESSSEKEKKKEKKDKKEKSHKHKD-------KDRERDKDEKKEQKESKSSSKI 143
+ SKP E++ + +K + K K ++ + ++E+ + + + SSKI
Sbjct: 198 QSNSKPIETAPKADKADKTKPKPKEKAERAAALQCGAYANKEQAESVRAKLAFLGISSKI 257
Query: 144 VSSSHNS 150
++
Sbjct: 258 TTTDGGK 264
Score = 29.7 bits (66), Expect = 4.5
Identities = 25/171 (14%), Positives = 52/171 (30%), Gaps = 15/171 (8%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESK-----------PKESSSEKEKKKE 108
K+ + + + K+ + E D E+ + P+E S+ ++
Sbjct: 52 KQANEPETLQPKNQTENGETAADLPPKPEERWSYIEELEAREVLINDPEEPSNGGGVEES 111
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH--PPPP 166
+ E+ + + EK + V + + E + P
Sbjct: 112 AQLTAEQRQLLEQMQADMRAAEKVLATAPSEQTVAVEARKQTAEKKPQKARTAEAQKTPV 171
Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK--KDKHGDKTNPKEKD 215
+ K+K+K T + + + K DKT PK K+
Sbjct: 172 ETEKIASKVKEAKQKQKALPKQTAETQSNSKPIETAPKADKADKTKPKPKE 222
>gnl|CDD|233044 TIGR00600, rad2, DNA excision repair protein (rad2). All proteins
in this family for which functions are known are flap
endonucleases that generate the 3' incision next to DNA
damage as part of nucleotide excision repair. This
family is related to many other flap endonuclease
families including the fen1 family. This family is based
on the phylogenomic analysis of JA Eisen (1999, Ph.D.
Thesis, Stanford University) [DNA metabolism, DNA
replication, recombination, and repair].
Length = 1034
Score = 32.9 bits (75), Expect = 0.54
Identities = 17/111 (15%), Positives = 43/111 (38%), Gaps = 1/111 (0%)
Query: 21 NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE 80
+ ++ ST S + S+++ ++K E E K+ ++
Sbjct: 658 GSFIEVDSVSSTLELQVPSKSQPTDESEENAENKVASIEGEHRKEIEDLLFDESEEDNIV 717
Query: 81 KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK-SHKHKDKDRERDKDE 130
K+ + +++ ++ S E+ + E E+ S K + + ++R E
Sbjct: 718 GMIEEEKDADDFKNEWQDISLEELEALEANLLAEQNSLKAQKQQQKRIAAE 768
Score = 31.8 bits (72), Expect = 1.2
Identities = 20/85 (23%), Positives = 34/85 (40%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
E E + EK++S E D VSS + + SK + + +E + K E H
Sbjct: 640 NPMEVEPMESEKEESESDGSFIEVDSVSSTLELQVPSKSQPTDESEENAENKVASIEGEH 699
Query: 118 KHKDKDRERDKDEKKEQKESKSSSK 142
+ + +D D+ E+ K
Sbjct: 700 RKEIEDLLFDESEEDNIVGMIEEEK 724
>gnl|CDD|217434 pfam03224, V-ATPase_H_N, V-ATPase subunit H. The yeast
Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase)
is a multisubunit complex responsible for acidifying
organelles. It functions as an ATP dependent proton pump
that transports protons across a lipid bilayer. This
domain corresponds to the N terminal domain of the H
subunit of V-ATPase. The N-terminal domain is required
for the activation of the complex whereas the C-terminal
domain is required for coupling ATP hydrolysis to proton
translocation.
Length = 312
Score = 32.6 bits (75), Expect = 0.54
Identities = 20/91 (21%), Positives = 36/91 (39%), Gaps = 5/91 (5%)
Query: 288 QPSLGISSHDQQFIQH---HIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQ 344
P L + S+ FI I + A + + ++L+++A L + LL +T Q
Sbjct: 108 SPFLKLLSNQDDFIVLLALFILAKLLAYSKKKSNKLVEEALPLLLSLLSSLLQSSTLGLQ 167
Query: 345 LKAILHVSTLYTHS-YREDIQE-EFYPPLFS 373
A+ + L YR+ E + L
Sbjct: 168 YIAVRCLQELLRVKEYRKLFWESDGVSTLID 198
>gnl|CDD|237258 PRK12903, secA, preprotein translocase subunit SecA; Reviewed.
Length = 925
Score = 32.7 bits (75), Expect = 0.57
Identities = 16/96 (16%), Positives = 41/96 (42%), Gaps = 8/96 (8%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVS-SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
+ ++ E+ + E+ K+ K + + ++E+ + ++ K E + +K
Sbjct: 824 IQREEMLMRPEELELINEEQKNLKQEIKLELSEIQEAEEEIQNINENKNEFVEFKNDPKK 883
Query: 110 -------KDKKEKSHKHKDKDRERDKDEKKEQKESK 138
KD K D+ ++ +K KK++K+ +
Sbjct: 884 LNKLIIAKDVLIKLVISSDEIKQDEKTTKKKKKDLE 919
Score = 28.9 bits (65), Expect = 8.9
Identities = 13/98 (13%), Positives = 42/98 (42%), Gaps = 5/98 (5%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
P + + + + + + ++++++ E + K+ +K + +K
Sbjct: 833 PEELELINEEQKNLKQEIKLELSEIQEAEEEIQNINENKNEFVEFKNDPKKLNK-LIIAK 891
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
+ V S ++ +++ E +++K+KK +K +E
Sbjct: 892 DVLIKLVISSDEIKQD----EKTTKKKKKDLEKTDEEA 925
>gnl|CDD|237799 PRK14715, PRK14715, DNA polymerase II large subunit; Provisional.
Length = 1627
Score = 32.9 bits (75), Expect = 0.59
Identities = 19/61 (31%), Positives = 33/61 (54%)
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
KEKK+EK ++K + K ++ D E +++EK E I ++ KE +G + +H
Sbjct: 280 KEKKEEKDEEKSEEVKTEEVDEEFEEEEKGFYYELYEKVNIEANKKFIKEVIAGRPVFAH 339
Query: 163 P 163
P
Sbjct: 340 P 340
>gnl|CDD|220252 pfam09468, RNase_H2-Ydr279, Ydr279p protein family (RNase H2
complex component). RNases H are enzymes that
specifically hydrolyse RNA when annealed to a
complementary DNA and are present in all living
organisms. In yeast RNase H2 is composed of a complex of
three proteins (Rnh2Ap, Ydr279p and Ylr154p); this
family represents the homologues of Ydr279p. It is not
known whether non-yeast proteins in this family fulfil
the same function.
Length = 287
Score = 32.3 bits (74), Expect = 0.61
Identities = 19/99 (19%), Positives = 30/99 (30%), Gaps = 3/99 (3%)
Query: 12 SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
P N + A S P K + + K+ K+K +
Sbjct: 179 ELDIPDDILNLLRLRYAC---DLLCSYLPPDLYKELLKSLLIPEFKPLDKYLKESKKKKR 235
Query: 72 SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
E + + K K ++E K K+ K K KK
Sbjct: 236 ETEEDVEAAESRAEKKRKSKEEIKKKKPKESKGVKALKK 274
Score = 31.9 bits (73), Expect = 0.85
Identities = 21/69 (30%), Positives = 34/69 (49%), Gaps = 5/69 (7%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE-KDKVSSKEKERKESKPKESSSEKE 104
+ DK + K+K+++ +E ++A S EK+ K K K+K+ KESK +
Sbjct: 217 IPEFKPLDKYLKESKKKKRETEEDVEAAESRAEKKRKSKEEIKKKKPKESKGVK----AL 272
Query: 105 KKKEKKDKK 113
KK K K
Sbjct: 273 KKVVAKGMK 281
Score = 31.1 bits (71), Expect = 1.5
Identities = 16/67 (23%), Positives = 31/67 (46%), Gaps = 1/67 (1%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
++ K + K + K+K + + ++ E ES+ ++ KE+ K+KK K+ K
Sbjct: 209 KELLKSLLIPE-FKPLDKYLKESKKKKRETEEDVEAAESRAEKKRKSKEEIKKKKPKESK 267
Query: 116 SHKHKDK 122
K K
Sbjct: 268 GVKALKK 274
Score = 29.2 bits (66), Expect = 6.1
Identities = 17/79 (21%), Positives = 32/79 (40%), Gaps = 2/79 (2%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S K+ + + K + K + K + ++ V + E + K ++S E
Sbjct: 201 SYLPPDLYKELLKSLLIPEFKPLDKYLKESKKKKRETEEDVEAAES--RAEKKRKSKEEI 258
Query: 104 EKKKEKKDKKEKSHKHKDK 122
+KKK K+ K K+ K
Sbjct: 259 KKKKPKESKGVKALKKVVA 277
>gnl|CDD|218597 pfam05466, BASP1, Brain acid soluble protein 1 (BASP1 protein).
This family consists of several brain acid soluble
protein 1 (BASP1) or neuronal axonal membrane protein
NAP-22. BASP1 is a neuron-enriched Ca(2+)-dependent
calmodulin-binding protein of unknown function.
Length = 233
Score = 32.1 bits (72), Expect = 0.61
Identities = 38/189 (20%), Positives = 84/189 (44%), Gaps = 6/189 (3%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
+K +DK+K+ E E++ + ++E + +++ KE KE KP + + + K E+K+
Sbjct: 16 EKAKDKDKKAEGAATEEEGTPKENEEAQAAAETTEVKEAKEEKPDKDAQDTANKTEEKEG 75
Query: 113 KEKSHKHKDK----DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
++++ K++ + E+ + + + E +S + PA+G + +
Sbjct: 76 EKEAAAAKEEAPKAEPEKTEGAAEAKAEPPKASDPEQEPAAAPGPAAGGEAPKASEASSQ 135
Query: 169 TPTQKSPVKTKEKEKEK-ESSTTHDKHSKHKHKKKD-KHGDKTNPKEKDAKSKEKESHKS 226
+P K +EK KE+ E+ T + + K D + P +A KE+ +
Sbjct: 136 PAESAAPAKEEEKSKEEGEAKKTEAPAAAAQETKSDAAPASDSKPSSSEAAPSSKETPAA 195
Query: 227 SAGPKCYPE 235
+ P +
Sbjct: 196 TEAPSSTAK 204
Score = 31.0 bits (69), Expect = 1.3
Identities = 36/197 (18%), Positives = 85/197 (43%), Gaps = 13/197 (6%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPT---NSSSSKKDKKDKDRDKEKEK 62
K+ +++ +P +N++ ++A + + P +++K ++K+ +++ K
Sbjct: 24 KAEGAATEEEGTPKENEEAQAAAETTEVKEAKEEKPDKDAQDTANKTEEKEGEKEAAAAK 83
Query: 63 EKKDK-EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
E+ K E +K+ +++ K + +S ++ + P ++ + K + +
Sbjct: 84 EEAPKAEPEKTEGAAEAKAEPPKASDPEQEPAAAPGPAAGGEAPKASEASSQPAESAAPA 143
Query: 122 KDRERDKDEKKEQK---------ESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
K+ E+ K+E + +K E+KS + S S S A+ S + AP+ T
Sbjct: 144 KEEEKSKEEGEAKKTEAPAAAAQETKSDAAPASDSKPSSSEAAPSSKETPAATEAPSSTA 203
Query: 173 KSPVKTKEKEKEKESST 189
K+ E+ K S
Sbjct: 204 KASAPAAPAEEVKPSEA 220
>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1. All
proteins in this family for which functions are known
are cyclin dependent protein kinases that are components
of TFIIH, a complex that is involved in nucleotide
excision repair and transcription initiation. Also known
as MAT1 (menage a trois 1). This family is based on the
phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
Stanford University) [DNA metabolism, DNA replication,
recombination, and repair].
Length = 309
Score = 32.5 bits (74), Expect = 0.62
Identities = 24/131 (18%), Positives = 54/131 (41%), Gaps = 14/131 (10%)
Query: 53 DKDRDKEKEKEKKDKE---KDKSAVSSKEKEKDKVSSKEKERKESKPK--ESSSEKEKKK 107
+ + K + +K++K+ K+K + +++E ++ EKE +E + + E+++
Sbjct: 116 ENTKKKIETYQKENKDVIQKNKEKSTREQEELEEALEFEKEEEEQRRLLLQKEEEEQQMN 175
Query: 108 EKKDKK----EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
++K+K+ E + +K K K ++P + S I
Sbjct: 176 KRKNKQALLDELETSTLPAAELIAQHKKNSVKLEMQVEKP-----KPEKPNTFSTGIKMG 230
Query: 164 PPPAPTPTQKS 174
+ P QKS
Sbjct: 231 YQISLVPVQKS 241
>gnl|CDD|184860 PRK14858, tatA, twin arginine translocase protein A; Provisional.
Length = 108
Score = 30.6 bits (69), Expect = 0.63
Identities = 14/62 (22%), Positives = 26/62 (41%), Gaps = 2/62 (3%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKD--KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
K ++ +EKEK +K + K A + + K ++ + K K E +S +
Sbjct: 47 KQSMQEESRTAEEKEKAEKLAETKKEAEAPEAKAEEDQAPKPKGAGEPPATVASKAGDGA 106
Query: 107 KE 108
K
Sbjct: 107 KA 108
>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
hydrophilic C-term. This domain is a hydrophilic region
found at the C-terminus of plant and metazoan
pre-mRNA-splicing factor 38 proteins. The function is
not known.
Length = 97
Score = 30.5 bits (69), Expect = 0.63
Identities = 11/68 (16%), Positives = 37/68 (54%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
++E + + ++ +R P+ + + +++++ K+ + + +D+ R RD+D++
Sbjct: 19 EEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRD 78
Query: 135 KESKSSSK 142
+ +S S+
Sbjct: 79 RYDRSRSR 86
Score = 30.1 bits (68), Expect = 0.71
Identities = 7/83 (8%), Positives = 31/83 (37%), Gaps = 7/83 (8%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+ ++ +E+E ++ + K+ + + + K + + +++ +
Sbjct: 11 DEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYR 70
Query: 114 EKSH-------KHKDKDRERDKD 129
++ + + + R R +D
Sbjct: 71 DRDDRDRDRYDRSRSRSRSRSRD 93
Score = 29.8 bits (67), Expect = 1.2
Identities = 15/81 (18%), Positives = 34/81 (41%), Gaps = 1/81 (1%)
Query: 68 EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
E+D E+E+D + K ++ S + ++ + +K S K + + R+RD
Sbjct: 7 EEDLDEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKR-SRKRRRRRRDRD 65
Query: 128 KDEKKEQKESKSSSKIVSSSH 148
+ +++ + S S
Sbjct: 66 RARYRDRDDRDRDRYDRSRSR 86
Score = 29.4 bits (66), Expect = 1.4
Identities = 9/74 (12%), Positives = 32/74 (43%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
E+E+ +++++ ++ ++ S + + + S K +++ + + +
Sbjct: 11 DEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYR 70
Query: 118 KHKDKDRERDKDEK 131
D+DR+R +
Sbjct: 71 DRDDRDRDRYDRSR 84
Score = 29.0 bits (65), Expect = 1.9
Identities = 10/84 (11%), Positives = 32/84 (38%), Gaps = 1/84 (1%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S ++D ++ R E++ ++ + + + K + + R + + +
Sbjct: 15 ESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74
Query: 104 EKKKEKKDKKEKSHKHKDKDRERD 127
+ + +S + + +DR R
Sbjct: 75 RDRDRYDRSRSRS-RSRSRDRRRR 97
>gnl|CDD|221643 pfam12572, DUF3752, Protein of unknown function (DUF3752). This
domain family is found in eukaryotes, and is typically
between 140 and 163 amino acids in length.
Length = 148
Score = 31.2 bits (71), Expect = 0.63
Identities = 28/123 (22%), Positives = 50/123 (40%), Gaps = 6/123 (4%)
Query: 6 KSSSSSSSAHPSPHKN-KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
+ S +S P+ +N K + SS T P K K+ +D E
Sbjct: 3 ERSDLNSRVDPTKLRNRKFSTGTKSARGDDSSWTETPEE-----KAKRLQDEVLGVEAGA 57
Query: 65 KDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
+ S ++KE + + E+K K +K++KK+KK+++ + DR
Sbjct: 58 SAPAAASAKASKRDKEMARKVKEYNEKKRGKSLVEQHQKKQKKKKKEEENDDPSRRPFDR 117
Query: 125 ERD 127
E+D
Sbjct: 118 EKD 120
>gnl|CDD|218215 pfam04696, Pinin_SDK_memA, pinin/SDK/memA/ protein conserved
region. Members of this family have very varied
localisations within the eukaryotic cell. pinin is known
to localise at the desmosomes and is implicated in
anchoring intermediate filaments to the desmosomal
plaque. SDK2/3 is a dynamically localised nuclear
protein thought to be involved in modulation of
alternative pre-mRNA splicing. memA is a tumour marker
preferentially expressed in human melanoma cell lines. A
common feature of the members of this family is that
they may all participate in regulating protein-protein
interactions.
Length = 131
Score = 30.9 bits (70), Expect = 0.64
Identities = 21/75 (28%), Positives = 40/75 (53%), Gaps = 2/75 (2%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+E+ +++SKEK R E + K E+EK++ ++ +KEK +++ R++ + K EQ
Sbjct: 21 QKFSQEESRLTSKEKRRAEIEQKLE--EQEKQEREELRKEKRELFEERRRKQLELRKLEQ 78
Query: 135 KESKSSSKIVSSSHN 149
K + HN
Sbjct: 79 KMEDEKLQETWHEHN 93
>gnl|CDD|217286 pfam02919, Topoisom_I_N, Eukaryotic DNA topoisomerase I, DNA
binding fragment. Topoisomerase I promotes the
relaxation of DNA superhelical tension by introducing a
transient single-stranded break in duplex DNA and is
vital for the processes of replication, transcription,
and recombination. This family may be more than one
structural domain.
Length = 215
Score = 31.8 bits (73), Expect = 0.64
Identities = 14/32 (43%), Positives = 21/32 (65%), Gaps = 3/32 (9%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
EKE+K++ KE EK+ KE+KDK E+ +
Sbjct: 97 AEKEKKKAMSKE---EKKAIKEEKDKLEEPYG 125
>gnl|CDD|217450 pfam03247, Prothymosin, Prothymosin/parathymosin family.
Prothymosin alpha and parathymosin are two ubiquitous
small acidic nuclear proteins that are thought to be
involved in cell cycle progression, proliferation, and
cell differentiation.
Length = 106
Score = 30.7 bits (69), Expect = 0.67
Identities = 19/96 (19%), Positives = 47/96 (48%), Gaps = 1/96 (1%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKD-KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
++++ KD KE +EK++ K + ++E + + +E +E + E
Sbjct: 7 AAAELSAKDLKEKKEVVEEKENGKNAPANGNENEENGAQEGDDEMEEEEEVDEDDEEEEG 66
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
E ++E+ +++E++ K D+++ E K+ K+
Sbjct: 67 EGEEEEGEEEEETEGATGKRAAEDEEDDAETKKQKT 102
Score = 29.5 bits (66), Expect = 1.6
Identities = 18/85 (21%), Positives = 37/85 (43%), Gaps = 1/85 (1%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
D D E KD ++ K V KE K+ + E +E+ +E E E+++E +
Sbjct: 1 SDTKVDAAAELSAKDLKEKKEVVEEKENGKNA-PANGNENEENGAQEGDDEMEEEEEVDE 59
Query: 112 KKEKSHKHKDKDRERDKDEKKEQKE 136
E+ +++ +++E +
Sbjct: 60 DDEEEEGEGEEEEGEEEEETEGATG 84
>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
Mitofilin controls mitochondrial cristae morphology.
Mitofilin is enriched in the narrow space between the
inner boundary and the outer membranes, where it forms a
homotypic interaction and assembles into a large
multimeric protein complex. The first 78 amino acids
contain a typical amino-terminal-cleavable mitochondrial
presequence rich in positive-charged and hydroxylated
residues and a membrane anchor domain. In addition, it
has three centrally located coiled coil domains.
Length = 493
Score = 32.7 bits (75), Expect = 0.67
Identities = 28/153 (18%), Positives = 60/153 (39%), Gaps = 20/153 (13%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD-KEK 60
++ ++S +A + K+ + A+ T S ++ D +
Sbjct: 94 VAEAEAKATSVAAEATTPKSIQELVEALEELLEELLKE--TASDPVVQELVSIFNDLIDS 151
Query: 61 EKEKKDKEKDKSAVSS---------------KEKEKDKVSSKEKERKESKPKESSSEKEK 105
KE K+ +S ++S K +E++++ KE++E S E+E
Sbjct: 152 IKEDNLKDDLESLIASAKEELDQLSKKLAELKAEEEEELERALKEKREE--LLSKLEEEL 209
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
+ K+ K + ER+K+E +++ E K
Sbjct: 210 LARLESKEAALEKQLRLEFEREKEELRKKYEEK 242
>gnl|CDD|224013 COG1088, RfbB, dTDP-D-glucose 4,6-dehydratase [Cell envelope
biogenesis, outer membrane].
Length = 340
Score = 32.2 bits (74), Expect = 0.69
Identities = 45/177 (25%), Positives = 63/177 (35%), Gaps = 54/177 (30%)
Query: 282 ISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQD--AFT-LNIQATRELLDL 338
+ GDI L D+ F ++ ++H AA D I F N+ T LL+
Sbjct: 56 VQGDICDREL----VDRLFKEYQPDAVVHFAAESHVDRSIDGPAPFIQTNVVGTYTLLEA 111
Query: 339 ATRCSQLKAILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFG 398
A + H+ST +E Y L +D +TT
Sbjct: 112 ARKYWGKFRFHHIST-----------DEVYGDLGLDDDA--FTETTP------------- 145
Query: 399 GIYNNS--YSFTKAIGESVVEKYL--YKLPLAMVRPSIVVSTWKEPIVGWSNNLYGP 451
YN S YS +KA + +V Y+ Y LP + R SNN YGP
Sbjct: 146 --YNPSSPYSASKAASDLLVRAYVRTYGLPATITRC--------------SNN-YGP 185
>gnl|CDD|233223 TIGR00990, 3a0801s09, mitochondrial precursor proteins import
receptor (72 kDa mitochondrial outermembrane protein)
(mitochondrial import receptor for the ADP/ATP carrier)
(translocase of outermembrane tom70). [Transport and
binding proteins, Amino acids, peptides and amines].
Length = 615
Score = 32.7 bits (74), Expect = 0.71
Identities = 21/81 (25%), Positives = 42/81 (51%), Gaps = 1/81 (1%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
S K KK++ + K+ EKE + K ++K + + K + + + S S E++
Sbjct: 65 SKPKISKKERRKRKQAEKETEGKTEEKKSTAPKNAPVEPADELPEIDESSVANLSEEERK 124
Query: 105 KKKEK-KDKKEKSHKHKDKDR 124
K K K+K K++++KD ++
Sbjct: 125 KYAAKLKEKGNKAYRNKDFNK 145
Score = 30.7 bits (69), Expect = 2.2
Identities = 17/85 (20%), Positives = 34/85 (40%)
Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVK 177
K + + + K KKE+++ K + K K+ + P P + S
Sbjct: 58 KGQQQRESKPKISKKERRKRKQAEKETEGKTEEKKSTAPKNAPVEPADELPEIDESSVAN 117
Query: 178 TKEKEKEKESSTTHDKHSKHKHKKK 202
E+E++K ++ +K +K K
Sbjct: 118 LSEEERKKYAAKLKEKGNKAYRNKD 142
>gnl|CDD|235322 PRK04950, PRK04950, ProP expression regulator; Provisional.
Length = 213
Score = 31.8 bits (73), Expect = 0.72
Identities = 13/62 (20%), Positives = 27/62 (43%)
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
+E K K + + A +K + ++ R+E KPK + K++K + + + D
Sbjct: 104 EEAKAKVQAQRAEQQAKKREAAGEKEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSD 163
Query: 122 KD 123
Sbjct: 164 IS 165
Score = 28.7 bits (65), Expect = 6.2
Identities = 10/59 (16%), Positives = 27/59 (45%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
K K + ++ +++ K ++ EKEK ++ + K + K ++ + + + S
Sbjct: 107 KAKVQAQRAEQQAKKREAAGEKEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSDIS 165
Score = 28.7 bits (65), Expect = 7.0
Identities = 16/72 (22%), Positives = 33/72 (45%), Gaps = 3/72 (4%)
Query: 81 KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER-DKDEKKEQKESKS 139
K KV ++ E++ K + + +++K ++++K K + K + R K E + S
Sbjct: 107 KAKVQAQRAEQQAKKREAA--GEKEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSDI 164
Query: 140 SSKIVSSSHNSK 151
S V + K
Sbjct: 165 SELTVGQAVKVK 176
>gnl|CDD|227371 COG5038, COG5038, Ca2+-dependent lipid-binding protein, contains C2
domain [General function prediction only].
Length = 1227
Score = 32.8 bits (75), Expect = 0.72
Identities = 17/139 (12%), Positives = 40/139 (28%), Gaps = 27/139 (19%)
Query: 11 SSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKD 70
S+ S+ + + P S + + K+ ++E + E +
Sbjct: 2 STKQQHYR--------SSDNYSGNRPIPTIPKFFRSRGQRAEKKEEEQEMQPEDEKLFAP 53
Query: 71 KSAVSS-------------------KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
+ + K+ + SS EK ++ S +E K+
Sbjct: 54 IAQRTVQIADVNFQGAKGIDDLSFTVPKQSIESSSPEKSDVDTSNTRPSVSRELHKDDYV 113
Query: 112 KKEKSHKHKDKDRERDKDE 130
++ + K + E
Sbjct: 114 GPDQDGGWQRKVELSSEQE 132
Score = 30.5 bits (69), Expect = 3.0
Identities = 15/88 (17%), Positives = 33/88 (37%), Gaps = 9/88 (10%)
Query: 153 PASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGD----- 207
SG++ I P + Q++ K +E+E + E ++ + D +
Sbjct: 13 NYSGNRPIPTIPKFFRSRGQRAEKKEEEQEMQPEDEKLFAPIAQRTVQIADVNFQGAKGI 72
Query: 208 ----KTNPKEKDAKSKEKESHKSSAGPK 231
T PK+ S ++S ++ +
Sbjct: 73 DDLSFTVPKQSIESSSPEKSDVDTSNTR 100
>gnl|CDD|225381 COG2825, HlpA, Outer membrane protein [Cell envelope biogenesis,
outer membrane].
Length = 170
Score = 31.2 bits (71), Expect = 0.74
Identities = 23/95 (24%), Positives = 39/95 (41%), Gaps = 6/95 (6%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
K D E E +K+ KE K K KE + + S K + +
Sbjct: 40 PQAKKVSADLESEFKKRQKELQKMQKELKAKE------AKLQDDGKMEALSDRAKAEAEI 93
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
KK+K + K ++ E+D + ++ ++E K KI
Sbjct: 94 KKEKLVNAFNKKQQEYEKDLNRREAEEEQKLLEKI 128
>gnl|CDD|227468 COG5139, COG5139, Uncharacterized conserved protein [Function
unknown].
Length = 397
Score = 32.4 bits (73), Expect = 0.80
Identities = 36/181 (19%), Positives = 70/181 (38%), Gaps = 13/181 (7%)
Query: 99 SSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE--KKEQKESKSSSKIVSSSHNSKEPASG 156
S++++E+ K + E K ++E K+ Q + +S+++K+P +G
Sbjct: 2 STADQEQPKVVEATPEDGTASSQKSTINAENENTKQNQSMEPQETSK-GTSNDTKDPDNG 60
Query: 157 SQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
+ + S V+ E+ K K ST S + +K D+ P +
Sbjct: 61 EK------NEEAAIDENSNVEAAER-KRKHISTDFSDMSLLRKRKNDQ---SLQPTREPM 110
Query: 217 KSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQ 276
S++ + A + G + + +E L EQ DE+ RLK D +
Sbjct: 111 DSRDSGQDFTEAQSGELGDTGDRQLKAPAASRARRKEDLLEQTVDEISLRLKKRMQDAAK 170
Query: 277 R 277
+
Sbjct: 171 K 171
>gnl|CDD|235582 PRK05729, valS, valyl-tRNA synthetase; Reviewed.
Length = 874
Score = 32.4 bits (75), Expect = 0.80
Identities = 18/62 (29%), Positives = 28/62 (45%), Gaps = 8/62 (12%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK---EKERKESKPKESSSEKEKKKEKKDKK 113
D E E + +KE K EKE ++V K E ++ + E+EK E ++K
Sbjct: 808 DVEAELARLEKELAKL-----EKEIERVEKKLSNEGFVAKAPEEVVEKEREKLAEYEEKL 862
Query: 114 EK 115
K
Sbjct: 863 AK 864
>gnl|CDD|149343 pfam08229, SHR3_chaperone, ER membrane protein SH3. This family of
proteins are membrane localised chaperones that are
required for correct plasma membrane localisation of
amino acid permeases (AAPs). SH3 prevents AAPs
from aggregating and assists in their correct folding.
In the absence of SH3, AAPs are retained in the ER.
Length = 196
Score = 31.5 bits (72), Expect = 0.81
Identities = 10/38 (26%), Positives = 21/38 (55%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
K E+ E+ ++A ++KE+E + KE ++K+
Sbjct: 159 WKDAKLLEEFAAEEAEAAAAAKEEESAEGEKKESKKKK 196
Score = 30.4 bits (69), Expect = 2.1
Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 10/49 (20%)
Query: 90 ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
E K++K E + +E + K+E+S + EKKE K+ K
Sbjct: 158 EWKDAKLLEEFAAEEAEAAAAAKEEESAE----------GEKKESKKKK 196
>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region. The myc family
belongs to the basic helix-loop-helix leucine zipper
class of transcription factors, see pfam00010. Myc forms
a heterodimer with Max, and this complex regulates cell
growth through direct activation of genes involved in
cell replication. Mutations in the C-terminal 20
residues of this domain cause unique changes in the
induction of apoptosis, transformation, and G2 arrest.
Length = 329
Score = 32.2 bits (73), Expect = 0.81
Identities = 26/91 (28%), Positives = 42/91 (46%), Gaps = 13/91 (14%)
Query: 12 SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
S P P + K S T + P +SSSS D + ++ ++E+E+E++++E D
Sbjct: 187 SVVFPYPLNERSKSSKVASPTPRLGLRTPPNSSSSSGSDSESEEDEEEEEEEEEEEEID- 245
Query: 72 SAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
V + EK R S K S+SE
Sbjct: 246 ------------VVTVEKRRSSSNRKASTSE 264
>gnl|CDD|150392 pfam09710, Trep_dent_lipo, Treponema clustered lipoprotein
(Trep_dent_lipo). This entry represents a family of six
predicted lipoproteins from a region of about 20
tandemly arranged genes in the Treponema denticola
genome. Two other neighboring genes share the
lipoprotein signal peptide region but do not show more
extensive homology. The function of this locus is
unknown.
Length = 394
Score = 32.2 bits (73), Expect = 0.82
Identities = 22/93 (23%), Positives = 41/93 (44%), Gaps = 5/93 (5%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDK-SAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
S K+ K++ R+ + E K + K + SK + + V + E + KE + K++ E
Sbjct: 17 SCSKEVKEQ-REMRIKVESSMKIEPKENEFLSKPEYDEHVKTPE-QIKELEEKKAYEESL 74
Query: 105 KKKEKK-DKKEKSHKHKDKDRER-DKDEKKEQK 135
K+ + + DK + K D +QK
Sbjct: 75 KQLQFELDKYDLVLIQAYKTPTNIGIDNLAQQK 107
>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
(TAF4) is one of several TAFs that bind TBP and is
involved in forming Transcription Factor IID (TFIID)
complex. The TATA Binding Protein (TBP) Associated
Factor 4 (TAF4) is one of several TAFs that bind TBP and
are involved in forming the Transcription Factor IID
(TFIID) complex. TFIID is one of seven General
Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE,
TFIIF, and TFIIH) that are involved in accurate
initiation of transcription by RNA polymerase II in
eukaryotes. TFIID plays an important role in the
recognition of promoter DNA and assembly of the
pre-initiation complex. TFIID complex is composed of the
TBP and at least 13 TAFs. TAFs from various species were
originally named by their predicted molecular weight or
their electrophoretic mobility in polyacrylamide gels. A
new, unified nomenclature for the pol II TAFs has been
suggested to show the relationship between TAF orthologs
and paralogs. Several hypotheses are proposed for TAF
functions, such as serving as activator-binding sites,
core-promoter recognition or a role in essential
catalytic activity. Each TAF, with the help of a
specific activator, is required only for the expression
of a subset of genes and is not universally involved in
transcription as are GTFs. In yeast and human cells,
TAFs have been found as components of other complexes
besides TFIID. Several TAFs interact via histone-fold
(HFD) motifs; HFD is the interaction motif involved in
heterodimerization of the core histones and their
assembly into nucleosome octamers. The minimal HFD
contains three alpha-helices linked by two loops and is
found in core histones, TAFs and many other
transcription factors. TFIID has a histone octamer-like
substructure. The TAF4 domain interacts with TAF12 to form
a novel histone-like heterodimer that binds DNA and has
a core promoter function for a subset of genes.
Length = 212
Score = 31.5 bits (72), Expect = 0.83
Identities = 9/45 (20%), Positives = 27/45 (60%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
+ +++++++ E+E+E+ + + S+ K+K K KE++ +
Sbjct: 121 QLEREEEEKRDEEERERLLRAAKSRSEQSRLKQKAKEMQKEEDEE 165
>gnl|CDD|217476 pfam03286, Pox_Ag35, Pox virus Ag35 surface protein.
Length = 198
Score = 31.7 bits (72), Expect = 0.83
Identities = 21/78 (26%), Positives = 37/78 (47%), Gaps = 3/78 (3%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
S K+ +K + ++ K K K+ EK +++KK +S K ++ E D D +E
Sbjct: 47 SPKQPKKKRPTTPRKPATTKKSKKKDKEKL---TEEEKKPESDDDKTEENENDPDNNEES 103
Query: 135 KESKSSSKIVSSSHNSKE 152
+S+ S+ S S E
Sbjct: 104 GDSQESASANSLSDIDNE 121
Score = 29.0 bits (65), Expect = 5.0
Identities = 16/65 (24%), Positives = 27/65 (41%), Gaps = 5/65 (7%)
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
P PT +K K K+K+KE T + K + D + N + D + +
Sbjct: 51 PKKKRPTTPRKPATTKKSKKKDKEKLTEEE-----KKPESDDDKTEENENDPDNNEESGD 105
Query: 223 SHKSS 227
S +S+
Sbjct: 106 SQESA 110
Score = 29.0 bits (65), Expect = 5.3
Identities = 14/65 (21%), Positives = 25/65 (38%), Gaps = 4/65 (6%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
KK + R K+ K K+K+K K+ E D +K + +++ E
Sbjct: 48 PKQPKKKRPTTPRKPATTKKSKKKDKEKLTEEEKKPESD----DDKTEENENDPDNNEES 103
Query: 104 EKKKE 108
+E
Sbjct: 104 GDSQE 108
>gnl|CDD|219621 pfam07890, Rrp15p, Rrp15p. Rrp15p is required for the formation of
60S ribosomal subunits.
Length = 132
Score = 30.8 bits (70), Expect = 0.83
Identities = 16/64 (25%), Positives = 27/64 (42%)
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
K K K + S+ K+ K K K K K + EK++ + + K +D +
Sbjct: 13 KLPASKRKDPILSRSKKLLKAKKKLKSEKLEKKAKRQLRAEKRQALEKGRVKPVLPEDLE 72
Query: 124 RERD 127
+ER
Sbjct: 73 KERR 76
Score = 28.1 bits (63), Expect = 7.0
Identities = 18/73 (24%), Positives = 31/73 (42%), Gaps = 10/73 (13%)
Query: 27 SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK--- 83
+ I ++ +S S SKK K K K+ + EK +K+ + + K + +K
Sbjct: 7 AKILASKLPASKRKDPILSRSKKLLKAK---KKLKSEKLEKKAKRQLRAEKRQALEKGRV 63
Query: 84 ----VSSKEKERK 92
EKER+
Sbjct: 64 KPVLPEDLEKERR 76
>gnl|CDD|240254 PTZ00069, PTZ00069, 60S ribosomal protein L5; Provisional.
Length = 300
Score = 32.0 bits (73), Expect = 0.87
Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 6/90 (6%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE------SKPKESSSEKEKKKEKKDKK 113
K+ +++D +K K S K S E K+ + P + +K+KKK+ KK
Sbjct: 210 KQLKEEDPDKYKKQFSKYIKAGVGPDSLEDMYKKAHAAIRANPSKVKKKKKKKKKVVHKK 269
Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
K+ K K R+ KK Q+ + KI
Sbjct: 270 YKTKKLTGKQRKARVKAKKAQRRERLQKKI 299
>gnl|CDD|215579 PLN03106, TCP2, Protein TCP2; Provisional.
Length = 447
Score = 32.0 bits (72), Expect = 0.92
Identities = 17/58 (29%), Positives = 35/58 (60%), Gaps = 1/58 (1%)
Query: 15 HPSPHKNKDKDSSAIPST-STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
+ + +N+ + +S S S++S TS + S S+ + +DK R++ +E+ K+KEK+
Sbjct: 169 NQTLTQNQAQHNSLSKSACSSTSDTSKGSGLSLSRSELRDKARERARERTAKEKEKED 226
>gnl|CDD|221185 pfam11719, Drc1-Sld2, DNA replication and checkpoint protein.
Genome duplication is precisely regulated by
cyclin-dependent kinases CDKs, which bring about the
onset of S phase by activating replication origins and
then prevent relicensing of origins until mitosis is
completed. The optimum sequence motif for CDK
phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found
to have at least 11 potential phosphorylation sites.
Drc1 is required for DNA synthesis and S-M replication
checkpoint control. Drc1 associates with Cdc2 and is
phosphorylated at the onset of S phase when Cdc2 is
activated. Thus Cdc2 promotes DNA replication by
phosphorylating Drc1 and regulating its association with
Cut5. Sld2 and Sld3 represent the minimal set of S-CDK
substrates required for DNA replication.
Length = 397
Score = 32.1 bits (73), Expect = 0.95
Identities = 12/70 (17%), Positives = 28/70 (40%), Gaps = 4/70 (5%)
Query: 38 TSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
+ P++ S + ++ K EK + E D+ E+ ++E + K
Sbjct: 302 RAKPSDEPSLPESDIHEEIPKLDEKSLSEFLGY----MGGIDEDDEDEDDEESKEEVEKK 357
Query: 98 ESSSEKEKKK 107
+ +K +K+
Sbjct: 358 QKVKKKPRKR 367
Score = 29.4 bits (66), Expect = 5.3
Identities = 18/113 (15%), Positives = 41/113 (36%), Gaps = 13/113 (11%)
Query: 37 STSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP 96
+ P K K+ R K + K ++ S +E K+ EK E
Sbjct: 276 ANDEPRRVFKKKGQKRTTRRVKMRPVRAKPSDEPSLPESDIHEEIPKL--DEKSLSEFLG 333
Query: 97 KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
+++ + E ++ ++ + ++K++ K+ K+ S+N
Sbjct: 334 YMGGIDEDDEDEDDEESKE-----------EVEKKQKVKKKPRKRKVNPVSNN 375
>gnl|CDD|222571 pfam14153, Spore_coat_CotO, Spore coat protein CotO. Bacillus
spores are protected by a protein shell consisting of
over 50 different polypeptides, known as the coat. This
family of proteins has an important morphogenetic role
in coat assembly, it is involved in the assembly of at
least 5 different coat proteins including CotB, CotG,
CotS, CotSA and CotW. It is likely to act at a late
stage of coat assembly.
Length = 185
Score = 31.3 bits (71), Expect = 0.96
Identities = 19/80 (23%), Positives = 37/80 (46%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
K +KE+EKE D+ K + ++ KE E + EKE+ ++++
Sbjct: 33 IKKADEKEEEKENSDEHVKSKEEEQKIEYEEAEKEKEAGEPEREDIAEQQEKEEIAQEEE 92
Query: 112 KKEKSHKHKDKDRERDKDEK 131
K+E++ K ++ K +K
Sbjct: 93 KEEEAEDVKQQEVFSFKRKK 112
>gnl|CDD|237744 PRK14521, rpsP, 30S ribosomal protein S16; Provisional.
Length = 186
Score = 31.3 bits (71), Expect = 1.0
Identities = 17/78 (21%), Positives = 32/78 (41%), Gaps = 6/78 (7%)
Query: 39 SNPTNSSSSKKDKKDKDRDKEKEK----EKKDKEKDKSAVSSKEK--EKDKVSSKEKERK 92
++KKDK K + K+ EKK E AV+ K+ + + +
Sbjct: 108 EEKEGKVNAKKDKLSKAKKAAKKAALEAEKKVNEARAEAVAEKKAAEAAAVAAEEAAAAE 167
Query: 93 ESKPKESSSEKEKKKEKK 110
E + +E+ +E+ +E
Sbjct: 168 EEEAEEAPAEEAPAEESA 185
>gnl|CDD|219901 pfam08555, DUF1754, Eukaryotic family of unknown function
(DUF1754). This is a eukaryotic protein family of
unknown function.
Length = 90
Score = 29.7 bits (67), Expect = 1.1
Identities = 19/71 (26%), Positives = 36/71 (50%), Gaps = 1/71 (1%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKK 107
K K K K+K+K+KK K K K V ++++E++K S++ E E+E+
Sbjct: 11 KLKGKKIDVKKKKKKKKKKNKSKEEVVTEKEEEEKSSAESDLKEGEEDEDNEKIEQEEDG 70
Query: 108 EKKDKKEKSHK 118
+ E++ +
Sbjct: 71 MNLTEAERAFE 81
>gnl|CDD|220223 pfam09405, Btz, CASC3/Barentsz eIF4AIII binding. This domain is
found on CASC3 (cancer susceptibility candidate gene 3
protein) which is also known as Barentsz (Btz). CASC3
is a component of the EJC (exon junction complex) which
is a complex that is involved in post-transcriptional
regulation of mRNA in metazoa. The complex is formed by
the association of four proteins (eIF4AIII, Barentsz,
Mago, and Y14), mRNA, and ATP. This domain wraps around
eIF4AIII and stacks against the 5' nucleotide.
Length = 116
Score = 30.1 bits (68), Expect = 1.1
Identities = 15/40 (37%), Positives = 21/40 (52%), Gaps = 1/40 (2%)
Query: 32 TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
S S T S+ + K+DK+R K +E EK D E D+
Sbjct: 1 KVESERQSGRTPSAEPTEPKEDKER-KRREHEKYDDEDDE 39
>gnl|CDD|147051 pfam04698, MOBP_C-Myrip, Myelin-associated oligodendrocytic basic
protein (MOBP). MOBP is abundantly expressed in central
nervous system myelin, and shares several
characteristics with myelin basic protein (MBP), in
terms of regional distribution and function. This family
is the middle and C-terminal regions of MOBP which has
been shown to be essential for normal arrangement of the
radial component in central nervous system myelin. Most
member-proteins carry a FVHE-PHD type zinc-finger at
their N-terminus.
Length = 710
Score = 31.7 bits (71), Expect = 1.2
Identities = 23/100 (23%), Positives = 41/100 (41%), Gaps = 6/100 (6%)
Query: 10 SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDK-KDKDRDKEKEKEKKDKE 68
S AH + D++ + + + SN S D ++K R++ E K E
Sbjct: 393 PSPGAHL-----RALDTAQVSDDLSETDISNEAQDPQSLTDSTEEKLRNRLYELAMKMSE 447
Query: 69 KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
K+ S+ +E E +KE S+ ++E KK+
Sbjct: 448 KETSSGEDQESEPKAEPENQKESLSSEDNNQGVQEELKKK 487
>gnl|CDD|227499 COG5171, YRB1, Ran GTPase-activating protein (Ran-binding protein)
[Intracellular trafficking and secretion].
Length = 211
Score = 31.1 bits (70), Expect = 1.2
Identities = 12/89 (13%), Positives = 34/89 (38%), Gaps = 1/89 (1%)
Query: 55 DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
+R K++ K +K++ + K + D + +E K ++S + E + K
Sbjct: 4 ERKKKQAKIEKEENEQKERSLDVVSKGDAFGDGKAGGEEKKVQQSPFLENAVPEGDEGKG 63
Query: 115 KSHKHKDKDRERDKDEKKEQKESKSSSKI 143
+ + + ++ K ++ +
Sbjct: 64 PESPNIHFE-PVVELQRVHLKTNEEDETV 91
>gnl|CDD|222977 PHA03089, PHA03089, late transcription factor VLTF-4; Provisional.
Length = 191
Score = 30.9 bits (70), Expect = 1.2
Identities = 21/101 (20%), Positives = 42/101 (41%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
S++ DK + D E + K K + ++K+ K + K+K+ KE P+ ++ E
Sbjct: 26 STTESVDKVNDDIFPEDVEIPSKKTSKKKKTTPRKKKTTKKTKKKKKEKEEVPELAAEEL 85
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
+E ++ +K K + + E S K+
Sbjct: 86 SDSEENEENDKKVDYELPKVQNTAAEVNHEDVIDLSDLKLA 126
Score = 30.5 bits (69), Expect = 1.6
Identities = 16/74 (21%), Positives = 29/74 (39%), Gaps = 2/74 (2%)
Query: 29 IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
IPS TS + K K K ++KE+ E +E S E+ KV +
Sbjct: 45 IPSKKTSKKKKTTPRKKKTTKKTKKKKKEKEEVPELAAEELSDS--EENEENDKKVDYEL 102
Query: 89 KERKESKPKESSSE 102
+ + + + + +
Sbjct: 103 PKVQNTAAEVNHED 116
>gnl|CDD|147580 pfam05474, Semenogelin, Semenogelin. This family consists of
several mammalian semenogelin (I and II) proteins.
Freshly ejaculated human semen has the appearance of a
loose gel in which the predominant structural protein
components are the seminal vesicle secreted semenogelins
(Sg).
Length = 450
Score = 31.6 bits (71), Expect = 1.2
Identities = 39/251 (15%), Positives = 90/251 (35%), Gaps = 30/251 (11%)
Query: 38 TSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
T NP+ + K K KE+ KE+D VS ++ R + +
Sbjct: 139 TQNPSQDRGNSTSGKGKSSQDSNTKERL-------LARGLGKEQDSVSGAQRNRTQGGSQ 191
Query: 98 ESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS-----SHNSKE 152
S + + ++E + ++K + E K++ SK + + + H SK+
Sbjct: 192 SSYVLQTEDLVANKQQETQNSLQNKGSYPNVYEVKQKHSSKVQTSLHPAHQHRLQHGSKD 251
Query: 153 PASGSQ-------------LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKH 199
+ +Q +H + T++ + E +K+ S +
Sbjct: 252 IFTKNQHQTKNLNQDQEHGQKAHKISYQSSSTEERRLNHGENGVQKDVSKGSISRQTEEK 311
Query: 200 KKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQF 259
T P ++ + K S++SS+ + G I + + +++
Sbjct: 312 IMGKSQKQVTTPSQEQGQKANKISYQSSSTEERRLNHGEKGI-----QKDVSKGSTSKKT 366
Query: 260 KDELFDRLKNE 270
++++ D+ +N+
Sbjct: 367 EEKIHDKSQNQ 377
>gnl|CDD|219355 pfam07267, Nucleo_P87, Nucleopolyhedrovirus capsid protein P87.
This family consists of several Nucleopolyhedrovirus
capsid protein P87 sequences. P87 is expressed late in
infection and concentrated in infected cell nuclei.
Length = 606
Score = 31.8 bits (72), Expect = 1.2
Identities = 22/117 (18%), Positives = 37/117 (31%), Gaps = 12/117 (10%)
Query: 16 PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDR---DKEKEKEKKDKEKDKS 72
P + K + SS P ++ ++R D D + + S
Sbjct: 291 PMTEEIKSWQTPLQTPAMYSSDYQAPKPEPIYTWEELLRERFPSDLFAISSLPDSDSEAS 350
Query: 73 AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
K K + E + ++ S E E EK+ K+ RE DK+
Sbjct: 351 DSGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRR---------REEDKN 398
Score = 29.9 bits (67), Expect = 5.3
Identities = 18/103 (17%), Positives = 40/103 (38%), Gaps = 9/103 (8%)
Query: 36 SSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEK---DKSAVSSKEKEKDKVSS---KEK 89
S + + D + + E+ +E+ D A+SS + S K
Sbjct: 298 SWQTPLQTPAMYSSDYQAPKPEPIYTWEELLRERFPSDLFAISSLPDSDSEASDSGPTRK 357
Query: 90 ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
++ P ++ ++ D+ E + +K+R+R ++E K
Sbjct: 358 RKRRRVPPLPEYSSDEDEDDSDEDEVDY---EKERKRRREEDK 397
Score = 29.1 bits (65), Expect = 7.6
Identities = 25/86 (29%), Positives = 38/86 (44%), Gaps = 3/86 (3%)
Query: 10 SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD-KEKEKE-KKDK 67
S A S + + S + P+ P SS +D+ D D D + EKE K+ +
Sbjct: 334 SDLFAISSLPDSDSEASDSGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRRR 393
Query: 68 EKDKSAVSSKEKEKDKVSSKEKERKE 93
E+DK+ + K E K + ER E
Sbjct: 394 EEDKNFLRLKALELSKYAGV-NERME 418
Score = 29.1 bits (65), Expect = 7.7
Identities = 18/80 (22%), Positives = 34/80 (42%), Gaps = 1/80 (1%)
Query: 27 SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVS 85
S++P + + +S S PT ++ + E E + + E D + +E+DK
Sbjct: 340 SSLPDSDSEASDSGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRRREEDKNF 399
Query: 86 SKEKERKESKPKESSSEKEK 105
+ K + SK + EK
Sbjct: 400 LRLKALELSKYAGVNERMEK 419
>gnl|CDD|235971 PRK07219, PRK07219, DNA topoisomerase I; Validated.
Length = 822
Score = 31.9 bits (73), Expect = 1.2
Identities = 13/79 (16%), Positives = 26/79 (32%), Gaps = 8/79 (10%)
Query: 49 KDKKDKDRD-----KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
K + K EK+ KEK+ S+ + V +K E+ + E+ ++
Sbjct: 747 KGGFGDELGCCNNPKCNYTEKQKKEKES---KSELEALKGVGAKTAEKLKDAGVETVTDL 803
Query: 104 EKKKEKKDKKEKSHKHKDK 122
+ D+
Sbjct: 804 TAADPDAVAAKVDGVSADR 822
Score = 30.0 bits (68), Expect = 4.4
Identities = 11/66 (16%), Positives = 20/66 (30%), Gaps = 6/66 (9%)
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
EK K + K E+ + EK K+ + D D + +
Sbjct: 763 NYTEKQKKEKESKSELEALKGVGAKTAEKLKDAGVET------VTDLTAADPDAVAAKVD 816
Query: 137 SKSSSK 142
S+ +
Sbjct: 817 GVSADR 822
>gnl|CDD|220710 pfam10351, Apt1, Golgi-body localisation protein domain. This is
the C-terminus of a family of proteins conserved from
plants to humans. The plant members are localised to the
Golgi and appear to regulate membrane
trafficking, as they are required for rapid vesicle
accumulation at the tip of the pollen tube. The
C-terminus probably contains the Golgi localisation
signal and it is well-conserved.
Length = 451
Score = 31.5 bits (72), Expect = 1.2
Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 1/68 (1%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD-KEKEKEK 64
+ S SSS+ ++ S+ S+S SS+S + SS KDK+ + K +++E
Sbjct: 312 EDSDISSSSSSGSRRSSSTSRSSSSSSSLLSSSSILSKSSDKSKDKRFSLKLSKSEKEES 371
Query: 65 KDKEKDKS 72
D E+ S
Sbjct: 372 DDLEEMIS 379
Score = 30.7 bits (70), Expect = 2.6
Identities = 18/94 (19%), Positives = 30/94 (31%), Gaps = 1/94 (1%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
+D +E K +E + SS + S+ S SSS K K
Sbjct: 295 GRDSSLSEEDSDSSKREEDSDISSSSSSGSRRSSSTSRSSSSSSSLL-SSSSILSKSSDK 353
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
K ++ K + + D+ +E S
Sbjct: 354 SKDKRFSLKLSKSEKEESDDLEEMISRSSKYMSF 387
>gnl|CDD|116627 pfam08017, Fibrinogen_BP, Fibrinogen binding protein. Proteins in
this family bind to fibrinogen. Members of this family
include the fibrinogen receptor, FbsA, which mediates
platelet aggregation.
Length = 393
Score = 31.4 bits (70), Expect = 1.3
Identities = 34/187 (18%), Positives = 83/187 (44%), Gaps = 13/187 (6%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK----ERKESKPKESSSEK 103
++D ++K + E+ ++D E ++S + E+ + V +K + ER++ + S
Sbjct: 147 QRDAENKSQGNVLERRQRDAE-NRSQGNVLERRQRDVENKSQGNVLERRQRDVENKSQGN 205
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
++ ++D + +S + + R+RD E+KS ++ E S ++
Sbjct: 206 VLERRQRDAENRSQGNVLERRQRDV-------ENKSQGNVLERRQRDVENKSQGNVLERR 258
Query: 164 PPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK-EKE 222
A +Q + ++ ++++ E +S + + + K + G +KS +E
Sbjct: 259 QRDAENRSQGNVLERRQRDVENKSQGNVLERRQRDAENKSQVGQLIGKNPLLSKSIISRE 318
Query: 223 SHKSSAG 229
++ SS G
Sbjct: 319 NNHSSQG 325
Score = 29.4 bits (65), Expect = 5.4
Identities = 30/208 (14%), Positives = 88/208 (42%), Gaps = 13/208 (6%)
Query: 26 SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD--- 82
SS + + + + S ++D +++ + E+ ++D E + +++D
Sbjct: 13 SSPVSAMDSVGNQSQGNVLERRQRDAENRSQGNVLERRQRDAENRSQGNVLERRQRDAEN 72
Query: 83 KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE--------- 133
+ ER++ + S ++ ++D + KS + + R+RD + K +
Sbjct: 73 RSQGNVLERRQRDAENRSQGNVLERRQRDVENKSQGNVLERRQRDVENKSQGNVLERRQR 132
Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
E++S ++ E S ++ A +Q + ++ ++++ E +S +
Sbjct: 133 DAENRSQGNVLERRQRDAENKSQGNVLERRQRDAENRSQGNVLERRQRDVENKSQGNVLE 192
Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
+ + K + G+ +++DA+++ +
Sbjct: 193 RRQRDVENKSQ-GNVLERRQRDAENRSQ 219
>gnl|CDD|152107 pfam11671, Apis_Csd, Complementary sex determiner protein. This
family of proteins represents the complementary sex
determiner in the honeybee. In the honeybee, the
mechanism of sex determination depends on the csd gene
which produces an SR-type protein. Males are homozygous
while females are heterozygous for the csd gene.
Heterozygosity generates an active protein which
initiates female development.
Length = 146
Score = 30.5 bits (68), Expect = 1.3
Identities = 14/50 (28%), Positives = 29/50 (58%)
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
S E+E+K K + + ++ ++R RD+ E++ +E K S + + S+N
Sbjct: 9 SREREQKSYKNENSYREYRETSRERSRDRTERERSREHKIISSLSNLSNN 58
>gnl|CDD|224212 COG1293, COG1293, Predicted RNA-binding protein homologous to
eukaryotic snRNP [Transcription].
Length = 564
Score = 31.6 bits (72), Expect = 1.3
Identities = 17/91 (18%), Positives = 33/91 (36%), Gaps = 1/91 (1%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
E+ K + +K K+ + ++ K K K K + ++ S KE + K
Sbjct: 342 LADFYGNEEIKIELDKSKTPSENAQRYF-KKYKKLKGAKVNLDRQLSELKEAIAYYESAK 400
Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
K + K + E+ ++ S K
Sbjct: 401 TALEKAEGKKAIEEIREELIEEGLLKSKKKK 431
Score = 30.1 bits (68), Expect = 3.8
Identities = 22/77 (28%), Positives = 34/77 (44%), Gaps = 5/77 (6%)
Query: 48 KKDKKDKDRDKEKEKEKKDK-EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
K K + DR + KE E K+A+ E +K + +E E +S K+KK
Sbjct: 376 KGAKVNLDRQLSELKEAIAYYESAKTALEKAEGKKA-IEEIREELIEEGLLKS---KKKK 431
Query: 107 KEKKDKKEKSHKHKDKD 123
++KK+ EK D
Sbjct: 432 RKKKEWFEKFRWFVSSD 448
Score = 28.9 bits (65), Expect = 9.2
Identities = 17/75 (22%), Positives = 27/75 (36%), Gaps = 11/75 (14%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKV----SSKEKERKESKPK-------ESSSEKEK 105
+ +K KK K + + K+ + S+K K K E E
Sbjct: 366 QRYFKKYKKLKGAKVNLDRQLSELKEAIAYYESAKTALEKAEGKKAIEEIREELIEEGLL 425
Query: 106 KKEKKDKKEKSHKHK 120
K +KK +K+K K
Sbjct: 426 KSKKKKRKKKEWFEK 440
>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
Length = 330
Score = 31.4 bits (72), Expect = 1.3
Identities = 7/43 (16%), Positives = 24/43 (55%)
Query: 72 SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
+A++ K+ +++ + ++ E E+E+++E+++ E
Sbjct: 276 AALADKDALDEELKEVLSAQAQAAAAEEEEEEEEEEEEEEPSE 318
>gnl|CDD|218738 pfam05766, NinG, Bacteriophage Lambda NinG protein. NinG or Rap is
involved in recombination. Rap (recombination adept with
plasmid) increases lambda-by-plasmid recombination
catalyzed by Escherichia coli's RecBCD pathway.
Length = 188
Score = 30.8 bits (70), Expect = 1.4
Identities = 12/55 (21%), Positives = 21/55 (38%), Gaps = 11/55 (20%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKE--KSHKHKDKD---------RERDKDE 130
K K + K + + +++E K +KE K+ K+ R RD
Sbjct: 33 ALKREKAQEKKRKAEAQAERRELKARKEKLKTRSDWLKEAQAAVNKYIRLRDAGL 87
>gnl|CDD|217834 pfam03998, Utp11, Utp11 protein. This protein is found to be part
of a large ribonucleoprotein complex containing the U3
snoRNA. Depletion of the Utp proteins impedes production
of the 18S rRNA, indicating that they are part of the
active pre-rRNA processing complex. This large RNP
complex has been termed the small subunit (SSU)
processome.
Length = 239
Score = 31.2 bits (71), Expect = 1.4
Identities = 20/95 (21%), Positives = 40/95 (42%), Gaps = 2/95 (2%)
Query: 34 TSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS--AVSSKEKEKDKVSSKEKER 91
T+ + + + EK+K+K K+K K + +++ + K+ E+
Sbjct: 145 TTPELLDRRENRPRISQLEKTSLVDEKQKKKSAKKKRKLYKELKERKEREKKLKKVEQRL 204
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
+ + + +KKK KDK K K+R+R
Sbjct: 205 ELQRELMKKGKGKKKKIVKDKDGKVVYKWKKERKR 239
Score = 28.5 bits (64), Expect = 9.9
Identities = 17/94 (18%), Positives = 41/94 (43%), Gaps = 2/94 (2%)
Query: 53 DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK--K 110
D + +++ + + + +E ++ + K K+ S++K++K K K
Sbjct: 129 DDEEEQKSFDPAEYFDTTPELLDRRENRPRISQLEKTSLVDEKQKKKSAKKKRKLYKELK 188
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
++KE+ K K ++ + + +K KIV
Sbjct: 189 ERKEREKKLKKVEQRLELQRELMKKGKGKKKKIV 222
>gnl|CDD|234428 TIGR03979, His_Ser_Rich, His-Xaa-Ser repeat protein HxsA. Members
of this protein share two defining regions. One is a
histidine/serine-rich cluster, typically
H-R-S-H-S-S-H-R-S-H-S-S-H. Members are found always in
the context of a pair of radical SAM proteins, HxsB and
HxsC, and a fourth protein HxsD. The system is predicted
to perform peptide modifications, likely in the
His-Xaa-Ser region, to produce some uncharacterized
natural product.
Length = 186
Score = 30.6 bits (69), Expect = 1.4
Identities = 15/63 (23%), Positives = 24/63 (38%), Gaps = 2/63 (3%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
S S + S + PS + S +PS S S S + S S + + +
Sbjct: 66 SSHYSGAGGSYSVPS--GDTSTYSYPVPSPSYSPSPGSSIQSLPSTTGVRPQSSAENANS 123
Query: 63 EKK 65
EK+
Sbjct: 124 EKR 126
Score = 29.5 bits (66), Expect = 3.3
Identities = 14/57 (24%), Positives = 23/57 (40%), Gaps = 4/57 (7%)
Query: 3 YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKE 59
YSV S +S+ ++P P S P +S S S S + + ++ K
Sbjct: 76 YSVPSGDTSTYSYPVP----SPSYSPSPGSSIQSLPSTTGVRPQSSAENANSEKRKL 128
>gnl|CDD|220624 pfam10187, Nefa_Nip30_N, N-terminal domain of NEFA-interacting
nuclear protein NIP30. This is the N-terminal 100
amino acids of a family of proteins conserved from
plants to humans. The full-length protein has putatively
been called NEFA-interacting nuclear protein NIP30,
however no reference could be found to confirm this.
Length = 99
Score = 29.3 bits (66), Expect = 1.5
Identities = 21/68 (30%), Positives = 33/68 (48%), Gaps = 11/68 (16%)
Query: 74 VSSKEKEKDKVSSKEKERKESKPKESSSEK-------EKKKEKKDKK----EKSHKHKDK 122
VS E ++ + +E+ R PK E+ E+ +E KDKK E+ K K++
Sbjct: 2 VSESELDEARKRRQEEVRAPRDPKAEPEEEYDGRSLYERLQENKDKKQEEFEEKFKLKNQ 61
Query: 123 DRERDKDE 130
R D+DE
Sbjct: 62 FRGLDEDE 69
>gnl|CDD|220838 pfam10659, Trypan_glycop_C, Trypanosome variant surface
glycoprotein C-terminal domain. The trypanosome
parasite expresses these proteins to evade the immune
response.
Length = 98
Score = 29.3 bits (66), Expect = 1.5
Identities = 23/98 (23%), Positives = 31/98 (31%), Gaps = 17/98 (17%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K K K KEK + KE D + K K + + E K+ KK
Sbjct: 2 NKKNKTKTECKEKGCKWDKKEDDGKCKPKEGKAKKNGAPVTQTAGTETTTEKCKGKKDKK 61
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
+ K K K E K SS +V+
Sbjct: 62 DCK-----------------KGCKWEGNTCKDSSFLVN 82
>gnl|CDD|227612 COG5293, COG5293, Predicted ATPase [General function prediction
only].
Length = 591
Score = 31.4 bits (71), Expect = 1.5
Identities = 49/304 (16%), Positives = 86/304 (28%), Gaps = 36/304 (11%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESK-----------PKE 98
DK + K+K E K S S +E E ++ +++ K+ E
Sbjct: 200 DKIQELESKKKLAELLRKTWIGSLDSLEEIETTELRKQDEVNKKQATLNTFDFHAQDYAE 259
Query: 99 SSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+ E+ + +R K KEQ P
Sbjct: 260 TEELVNTVDERIAELNNRRISMQSHWKRVKTSLKEQIL--------------FCPDEIQV 305
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGD-KTNPKEKDAK 217
L P K E + T ++H + + + GD K E D
Sbjct: 306 LYEEVGVLFPGQV----KKDFEHVIAFNRAITEERHDYLQEEIAEIEGDLKEVNAELDDL 361
Query: 218 SKEKESH----KSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQAD 273
K + K+ + Y + I LR + + + L + + +
Sbjct: 362 GKRRAEGLAFLKNRGVFEKYQTLCEEIIALRGELAELEYRIEPLRKLHALDQYIGTLKHE 421
Query: 274 ILQRKVHIISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATR 333
L + I ++ Q +S + F + I + SLR T + T
Sbjct: 422 CLDLE-ERIYTEVQQQCSLFASIGRLF-KEMIREVYDCYGSLRVTTNKNGHLTFGAEITD 479
Query: 334 ELLD 337
D
Sbjct: 480 AAPD 483
>gnl|CDD|202427 pfam02841, GBP_C, Guanylate-binding protein, C-terminal domain.
Transcription of the anti-viral guanylate-binding
protein (GBP) is induced by interferon-gamma during
macrophage induction. This family contains GBP1 and
GBP2, both GTPases capable of binding GTP, GDP and GMP.
Length = 297
Score = 31.1 bits (71), Expect = 1.5
Identities = 14/59 (23%), Positives = 31/59 (52%), Gaps = 4/59 (6%)
Query: 80 EKDKVSSKEKERKESKPKESSSEKEKKKE---KKDKKEKSHK-HKDKDRERDKDEKKEQ 134
K+K E+ + E+ E +EK+KE + +E+S++ H + E+ + E+++
Sbjct: 201 AKEKAIEAERAKAEAAEAEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKL 259
Score = 28.4 bits (64), Expect = 9.9
Identities = 19/78 (24%), Positives = 39/78 (50%), Gaps = 2/78 (2%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
KEK + E+ K+ + E+E + KE+E+ + S E K+ +K + E+
Sbjct: 201 AKEKAIE-AERAKAEAAEAEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKL 259
Query: 119 HKDKDRERDKDEKKEQKE 136
+++R + + +EQ+E
Sbjct: 260 LAEQERMLEH-KLQEQEE 276
>gnl|CDD|227931 COG5644, COG5644, Uncharacterized conserved protein [Function
unknown].
Length = 869
Score = 31.6 bits (71), Expect = 1.7
Identities = 17/115 (14%), Positives = 42/115 (36%)
Query: 17 SPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
+P K K+ + S S ++ + + E ++ +
Sbjct: 587 APRKRKEDFVTPSTSLEKSMDRILHGQKKRAEGAVVFEKPLEATENFNPWLDRKMRRIKR 646
Query: 77 KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
+K+ + ++K K+ P+E ++++ +K + K+ E + DEK
Sbjct: 647 IKKKAYRRIRRDKRLKKKMPEEENTQENHLGSEKKRHGGVPDILLKEIEVEDDEK 701
Score = 30.4 bits (68), Expect = 2.9
Identities = 25/97 (25%), Positives = 44/97 (45%), Gaps = 6/97 (6%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-----KPKESSSEKEKK 106
+ K +K KE + +S EK D++ +K+R E KP E++
Sbjct: 577 FPVVEQRRKLAPRKRKEDFVTPSTSLEKSMDRILHGQKKRAEGAVVFEKPLEATENFNPW 636
Query: 107 KEKKDKKEKSHKHKDKDR-ERDKDEKKEQKESKSSSK 142
++K ++ K K K R RDK KK+ E +++ +
Sbjct: 637 LDRKMRRIKRIKKKAYRRIRRDKRLKKKMPEEENTQE 673
>gnl|CDD|218049 pfam04373, DUF511, Protein of unknown function (DUF511). Bacterial
protein of unknown function.
Length = 310
Score = 31.2 bits (71), Expect = 1.7
Identities = 19/88 (21%), Positives = 37/88 (42%), Gaps = 10/88 (11%)
Query: 50 DKKDKDRDKEKE----------KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
DKK R+K K ++K+ K + ++EK + K +E +
Sbjct: 27 DKKLNSREKGKTPIAQLGAEIGSDRKELAKKSPFIKTQEKPPRRYYLKSREDELELKALD 86
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERD 127
+ E+ +E+ + + + K K+ ERD
Sbjct: 87 EIKSEEDEEQSEAPKANKKQKNSFHERD 114
>gnl|CDD|222613 pfam14235, DUF4337, Domain of unknown function (DUF4337). This
family of proteins is functionally uncharacterized. This
family of proteins is found in bacteria. Proteins in
this family are typically between 187 and 201 amino
acids in length. There is a single completely conserved
residue Q that may be functionally important.
Length = 158
Score = 30.2 bits (69), Expect = 1.7
Identities = 12/46 (26%), Positives = 21/46 (45%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
R E + K + +KEK + + + KE K K+ + E D +
Sbjct: 65 AAAPRAELQAKIARYKKEKARYRSEAKELEAKAKEAEAESDHALHQ 110
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 31.5 bits (71), Expect = 1.7
Identities = 35/209 (16%), Positives = 72/209 (34%), Gaps = 39/209 (18%)
Query: 50 DKKDKD-RDKEKEKEKKDKEKDKSAVSS----KEKEKDKVSSKEKERKES---------- 94
+++D + E E ++ E D + V+ E E + ++ E
Sbjct: 3842 NEEDTANQSDLDESEARELESDMNGVTKDSVVSENENSDSEEENQDLDEEVNDIPEDLSN 3901
Query: 95 ---------KPKESSSEKEKKKEKK----DKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
+E E E+K ++ ++ + K D DKD ++++ E + S
Sbjct: 3902 SLNEKLWDEPNEEDLLETEQKSNEQSAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSD 3961
Query: 142 KIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESST--------THDK 193
+ + +P S PPP +K EKE + + D+
Sbjct: 3962 DV--GIDDEIQPDIQENN-SQPPPENEDLDLPEDLKLDEKEGDVSKDSDLEDMDMEAADE 4018
Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
+ + +KD+ +P E++ E
Sbjct: 4019 NKEEADAEKDEPMQDEDPLEENNTLDEDI 4047
Score = 29.6 bits (66), Expect = 6.1
Identities = 21/106 (19%), Positives = 42/106 (39%), Gaps = 6/106 (5%)
Query: 38 TSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
+N + + D D D EK E +E ++E +D V S E+ + P+
Sbjct: 4039 ENNTLDEDIQQDDFSDLAEDDEKMNEDGFEEN---VQENEESTEDGVKSDEELEQGEVPE 4095
Query: 98 ESSSEKEKKKEKKD---KKEKSHKHKDKDRERDKDEKKEQKESKSS 140
+ + + K + K E ++ DK + +E E+ + +
Sbjct: 4096 DQAIDNHPKMDAKSTFASAEADEENTDKGIVGENEELGEEDGVRGN 4141
Score = 29.2 bits (65), Expect = 9.3
Identities = 28/164 (17%), Positives = 64/164 (39%), Gaps = 13/164 (7%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS-SKEKERKESKPKESSSEKEKKKEK 109
+D D ++ +E + D ++E E D +K+ E++ +S E + E+
Sbjct: 3832 NEDDDLEELANEEDTANQSDLDESEARELESDMNGVTKDSVVSENENSDSEEENQDLDEE 3891
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
+ + + ++ D+ +++ E++ S S+++N S L+S
Sbjct: 3892 VNDIPEDLSNSLNEKLWDEPNEEDLLETEQKSNEQSAANNE------SDLVSKEDDN--- 3942
Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE 213
K+ +EKE E + D + + + + P E
Sbjct: 3943 ---KALEDKDRQEKEDEEEMSDDVGIDDEIQPDIQENNSQPPPE 3983
>gnl|CDD|177433 PHA02608, 67, prohead core protein; Provisional.
Length = 80
Score = 28.6 bits (64), Expect = 1.8
Identities = 7/31 (22%), Positives = 15/31 (48%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD 82
+ +D D +++ + D + DK + E D
Sbjct: 49 EPEDDDDDEDDDDDDDKDDKDDDDDDDDEDD 79
>gnl|CDD|234767 PRK00448, polC, DNA polymerase III PolC; Validated.
Length = 1437
Score = 31.3 bits (72), Expect = 1.9
Identities = 26/103 (25%), Positives = 44/103 (42%), Gaps = 13/103 (12%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
D K++ E +KE++D++ K A+ + +K E E+K+ E +
Sbjct: 165 IDDSKEELEKFEAQKEEEDEKLAKEALEAMKK-------LEAEKKKQSKNFDPKEGPVQI 217
Query: 108 EKKDKKEKSHKHKDKDRERDKDE------KKEQKESKSSSKIV 144
KK KE+ K+ + E + K E KE KS I+
Sbjct: 218 GKKIDKEEITPMKEINEEERRVVVEGYVFKVEIKELKSGRHIL 260
>gnl|CDD|218899 pfam06102, DUF947, Domain of unknown function (DUF947). Family of
eukaryotic proteins with unknown function.
Length = 168
Score = 29.9 bits (68), Expect = 2.1
Identities = 24/92 (26%), Positives = 40/92 (43%), Gaps = 2/92 (2%)
Query: 53 DKDRDKEKEKEKK--DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
D R+KE E+ +K K KD ++ + S+ K K + ++ KK+EK+
Sbjct: 58 DDYREKEIEELEKALKKTKDSEEKEELKRTLQSMKSRLKTLKNKDREREILKEHKKQEKE 117
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
KE + K E K K++ + SK
Sbjct: 118 LIKEGKKPYYLKKSEIKKLVLKKKFDELKKSK 149
Score = 29.9 bits (68), Expect = 2.3
Identities = 26/97 (26%), Positives = 47/97 (48%), Gaps = 7/97 (7%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-----KPKESSSE 102
KK K +++++ K + K + K + +K++E++ + +K+ KE KP
Sbjct: 73 KKTKDSEEKEELKRTLQSMKSRLK-TLKNKDREREILKEHKKQEKELIKEGKKPYYLKKS 131
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKE-QKESK 138
+ KK K K ++ K K D+ +K KK KE K
Sbjct: 132 EIKKLVLKKKFDELKKSKQLDKALEKKRKKNAGKEKK 168
Score = 29.9 bits (68), Expect = 2.5
Identities = 21/77 (27%), Positives = 35/77 (45%), Gaps = 9/77 (11%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
D +EKE ++ EK K S+EKE + + S + K K ++E
Sbjct: 57 LDDYREKEIEELEK---------ALKKTKDSEEKEELKRTLQSMKSRLKTLKNKDREREI 107
Query: 116 SHKHKDKDRERDKDEKK 132
+HK +++E K+ KK
Sbjct: 108 LKEHKKQEKELIKEGKK 124
>gnl|CDD|227360 COG5027, SAS2, Histone acetyltransferase (MYST family) [Chromatin
structure and dynamics].
Length = 395
Score = 30.9 bits (70), Expect = 2.1
Identities = 10/65 (15%), Positives = 25/65 (38%)
Query: 55 DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
+ K+ K+ +K K K D++ + + S+E+E ++ + +
Sbjct: 58 NLGAAISIPKRKKQTEKGKKEKKPKVSDRMDLDNENVQLEMLYSISNEREIRQLRFGGSK 117
Query: 115 KSHKH 119
+ H
Sbjct: 118 VQNPH 122
>gnl|CDD|220135 pfam09184, PPP4R2, PPP4R2. PPP4R2 (protein phosphatase 4 core
regulatory subunit R2) is the regulatory subunit of the
histone H2A phosphatase complex. It has been shown to
confer resistance to the anticancer drug cisplatin in
yeast, and may confer resistance in higher eukaryotes.
Length = 285
Score = 30.6 bits (69), Expect = 2.1
Identities = 21/112 (18%), Positives = 50/112 (44%), Gaps = 2/112 (1%)
Query: 2 AYSVKSSSSSSSAHPSPHKN-KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDK-E 59
+ + S + P P + KD + + S+ + S +K+ D +
Sbjct: 174 PFIERIDSVNGPGEPEPEDDPKDSLGNGSSTNGLPDSSQDKNKSLEEYYEKESSDAAASQ 233
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
+ K K+K + ++ ++D +EKE KE + +E + E+E+++++ +
Sbjct: 234 DDGPKGSDVKNKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEEEEDEDE 285
Score = 30.2 bits (68), Expect = 3.1
Identities = 23/113 (20%), Positives = 44/113 (38%), Gaps = 12/113 (10%)
Query: 22 KDKDSSAIPSTST-------SSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAV 74
+ DS P S + TN K+K ++ EKE D +A
Sbjct: 177 ERIDSVNGPGEPEPEDDPKDSLGNGSSTNGLPDSSQDKNKSLEEYYEKESSD-----AAA 231
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
S + K +K E + E+K+ K+D++E+ + ++++ + D
Sbjct: 232 SQDDGPKGSDVKNKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEEEEDED 284
>gnl|CDD|236081 PRK07735, PRK07735, NADH dehydrogenase subunit C; Validated.
Length = 430
Score = 30.7 bits (69), Expect = 2.1
Identities = 34/182 (18%), Positives = 72/182 (39%), Gaps = 10/182 (5%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
+KD + K++ + +E K V+ E K+ + +E++++ PK E+ K +
Sbjct: 3 PEKDLEDLKKEAARRAKEEARKRLVAKHGAEISKLEEENREKEKALPKNDDMTIEEAKRR 62
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
K+ ++R+ E+ ++E + +++ +K A Q
Sbjct: 63 AAAAAKAKAAALAKQKREGTEEVTEEEKAKAKAKAAAAAKAKAAALAKQKREGTE----- 117
Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
V +EK K + K K+ + G + +E++ KEK K++A
Sbjct: 118 -----EVTEEEKAAAKAKAAAAAKAKAAALAKQKREGTEEVTEEEEETDKEKAKAKAAAA 172
Query: 230 PK 231
K
Sbjct: 173 AK 174
Score = 30.7 bits (69), Expect = 2.2
Identities = 23/94 (24%), Positives = 45/94 (47%), Gaps = 5/94 (5%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH- 117
E +E+K K K+A ++K K K + +E +E ++KEK K K K+
Sbjct: 118 EVTEEEKAAAKAKAAAAAKAKAAALAKQKREGTEEVTEEEEETDKEKAKAKAAAAAKAKA 177
Query: 118 ----KHKDKDRERDKDEKKEQKESKSSSKIVSSS 147
K K + +E E++++K+ +K +++
Sbjct: 178 AALAKQKAAEAGEGTEEVTEEEKAKAKAKAAAAA 211
Score = 28.8 bits (64), Expect = 9.2
Identities = 31/132 (23%), Positives = 59/132 (44%), Gaps = 2/132 (1%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK-KKEK 109
K+ ++ +E +E+K K K K+A ++K K K + +E +E ++ K K
Sbjct: 76 KQKREGTEEVTEEEKAKAKAKAAAAAKAKAAALAKQKREGTEEVTEEEKAAAKAKAAAAA 135
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
K K K K + E +E++E + K+ +K +++ +K A Q +
Sbjct: 136 KAKAAALAKQKREGTEEVTEEEEETDKEKAKAKAAAAA-KAKAAALAKQKAAEAGEGTEE 194
Query: 170 PTQKSPVKTKEK 181
T++ K K K
Sbjct: 195 VTEEEKAKAKAK 206
>gnl|CDD|226400 COG3883, COG3883, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 265
Score = 30.5 bits (69), Expect = 2.2
Identities = 23/94 (24%), Positives = 46/94 (48%), Gaps = 2/94 (2%)
Query: 32 TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK-EKEKDKVSSKEKE 90
ST+ T+ S K +D E +KEKK+ + + ++ ++ E+ + K+ +KE
Sbjct: 16 ISTAFLTTVFAALLSDKIQNQDSKL-SELQKEKKNIQNEIESLDNQIEEIQSKIDELQKE 74
Query: 91 RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
+SK + +KE + K++ E+ K + R
Sbjct: 75 IDQSKAEIKKLQKEIAELKENIVERQELLKKRAR 108
Score = 28.9 bits (65), Expect = 6.2
Identities = 23/112 (20%), Positives = 50/112 (44%), Gaps = 7/112 (6%)
Query: 51 KKDKDRDKEKEKEKKDK-EKDKSAVSSKEKEKDKVSSKEKERK----ESKPKESSSEKEK 105
K+DK +EK+ +DK E + + E + + ++S++ E+ KE+S+ EK
Sbjct: 154 KEDKKSLEEKQAALEDKLETLVALQNELETQLNSLNSQKAEKNALIAALAAKEASALGEK 213
Query: 106 KKEKKDKK--EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
++ K E + K K +EQ ++++ S ++ ++
Sbjct: 214 AALEEQKALAEAAAAEAAKQEAAAKAAAQEQAALQAAATAAQPSAVTESASA 265
>gnl|CDD|217667 pfam03666, NPR3, Nitrogen Permease regulator of amino acid
transport activity 3. This family, also known in yeasts
as Rmd11, complexes with NPR2, pfam06218. This complex
heterodimer is responsible for inactivating TORC1, an
evolutionarily conserved protein complex that controls
cell size via nutritional input signals, specifically,
in response to amino acid starvation.
Length = 424
Score = 30.8 bits (70), Expect = 2.2
Identities = 19/52 (36%), Positives = 27/52 (51%)
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
++KK KK+K D + +R+ E SKSSSK S S + +PAS
Sbjct: 45 RKKKKKKQKKSDRADPNDDREPSVDSEDSSSKSSSKSESGSLANSDPASDPS 96
>gnl|CDD|218435 pfam05104, Rib_recp_KP_reg, Ribosome receptor lysine/proline rich
region. This highly conserved region is found towards
the C-terminus of the transmembrane domain. The function
is unclear.
Length = 151
Score = 29.9 bits (67), Expect = 2.2
Identities = 30/116 (25%), Positives = 47/116 (40%)
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
K+RKES +S +KKKEK +K+ K K++ E + +E I+
Sbjct: 12 KQRKESGKTQSQKSDKKKKEKVSEKKGKSKKKEEKPNGKIPEHEPNQEVTEVEVIIEKEP 71
Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
+ + P AP P + PV ++EK + S + K KK K
Sbjct: 72 VPAVAVAPVPVAVVAPVVAPKPKKSQPVMSQEKTASPQKSVPAPSPKEKKKKKVAK 127
>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6. The surfeit locus
protein SURF-6 is shown to be a component of the
nucleolar matrix and has a strong binding capacity for
nucleic acids.
Length = 206
Score = 30.0 bits (68), Expect = 2.3
Identities = 22/59 (37%), Positives = 35/59 (59%), Gaps = 4/59 (6%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
KEK+K+K KE + KEK + K + ++K+R+E+ K +K KKK+K KK +
Sbjct: 151 KEKQKKKSKKEWKER----KEKVEKKKAERQKKREENLKKRKDDKKNKKKKKAKKKGRI 205
Score = 29.2 bits (66), Expect = 4.7
Identities = 32/85 (37%), Positives = 48/85 (56%), Gaps = 7/85 (8%)
Query: 48 KKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+K+K K K E K K D++ K A+ KEK+K K + KERKE K ++ +E++KK
Sbjct: 121 EKEKWTKALAKAEGVKVKDDEKLLKKALKRKEKQKKKSKKEWKERKE-KVEKKKAERQKK 179
Query: 107 -----KEKKDKKEKSHKHKDKDRER 126
K++KD K+ K K K + R
Sbjct: 180 REENLKKRKDDKKNKKKKKAKKKGR 204
>gnl|CDD|237869 PRK14962, PRK14962, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 472
Score = 30.9 bits (70), Expect = 2.4
Identities = 17/57 (29%), Positives = 28/57 (49%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
++ D E+ + ++KEK K SK KE K+ + KE +E K+K + S
Sbjct: 337 PNVQENDVEEKNDNSNVQQKEKKKEESKAKEEKQEDIEFEKRFKELMEELKEKGDLS 393
>gnl|CDD|234941 PRK01315, PRK01315, putative inner membrane protein translocase
component YidC; Provisional.
Length = 329
Score = 30.5 bits (69), Expect = 2.4
Identities = 12/71 (16%), Positives = 30/71 (42%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
NPT S + ++++ K K+ + + + + + + + +K + + +PK
Sbjct: 258 NPTPGSPAAIAREERLAKKGKDHGESEGKVVAPEGAVAQTTEVREQTKRQTVQRQQPKRQ 317
Query: 100 SSEKEKKKEKK 110
S K +K
Sbjct: 318 SRAKRQKGGAA 328
>gnl|CDD|237554 PRK13909, PRK13909, putative recombination protein RecB;
Provisional.
Length = 910
Score = 31.1 bits (71), Expect = 2.4
Identities = 25/101 (24%), Positives = 38/101 (37%), Gaps = 32/101 (31%)
Query: 53 DKD--RDKEKEKEKKDKE--------------------KDKSAVSSKEK------EKDKV 84
DKD R EKEK K +E KD+S+ S E E+ ++
Sbjct: 663 DKDYARALEKEKALKYEEEINVLYVAFTRAKNSLIVVKKDESSGSMFEILDLKPLERGEI 722
Query: 85 SSKEKERKESKPKESSSEKEKKK----EKKDKKEKSHKHKD 121
KE + K +S K K + K+ +E+ + D
Sbjct: 723 EIKEPKISPKKESLITSVKLKPHGYQEQVKEIEEEPKEDND 763
>gnl|CDD|219913 pfam08576, DUF1764, Eukaryotic protein of unknown function
(DUF1764). This is a family of eukaryotic proteins of
unknown function. This family contains many hypothetical
proteins.
Length = 98
Score = 29.0 bits (65), Expect = 2.4
Identities = 16/59 (27%), Positives = 33/59 (55%), Gaps = 2/59 (3%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
KE++K +K ++D + S K++ K K++ K ++PK + ++K K+K + E
Sbjct: 1 KEEKKNEKTDKRDIDDIFSNIKKRKKK--KKRTAKTARPKATKKGQKKDKKKDEFPEFP 57
>gnl|CDD|222010 pfam13254, DUF4045, Domain of unknown function (DUF4045). This
presumed domain is functionally uncharacterized. This
domain family is found in bacteria and eukaryotes, and
is typically between 384 and 430 amino acids in length.
Length = 414
Score = 30.6 bits (69), Expect = 2.4
Identities = 37/228 (16%), Positives = 73/228 (32%), Gaps = 20/228 (8%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSS--SSKKDKKDKDRDKE 59
+ S S+S S ++ D PS + +PT ++ S +K + + K
Sbjct: 109 SLPSHPRSRSASVSNSKDGDRPSDLPPSPSKTMDPRRWSPTKATWLESALNKPESPKHKP 168
Query: 60 KEKE----KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
+ + KKD + + + +S + + +S ++ + K K
Sbjct: 169 QPPQQPEWKKDLSRLRQSRASVDLGR--TNSFKEVTPVGLMRTPPPGSHSKSPSKSGIPD 226
Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSP 175
+D E+ K EK +Q+ S + E +S + P +P
Sbjct: 227 LPSSRDS--EKTKPEKPQQETSSMDT----------EKSSAPKPRETLDPKSPEKAPPID 274
Query: 176 VKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
+E + + S ++ S K K S K
Sbjct: 275 TTEEELKSPEASPKESEEASARKRSPSLLSPSPKAESPKPLASPGKSP 322
>gnl|CDD|240339 PTZ00265, PTZ00265, multidrug resistance protein (mdr1);
Provisional.
Length = 1466
Score = 30.8 bits (69), Expect = 2.4
Identities = 30/128 (23%), Positives = 51/128 (39%), Gaps = 8/128 (6%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK---ERKESKP 96
+PT + +K +KD + +K + + ++ D + + +
Sbjct: 669 DPTKDNKENNNKNNKDDNNNNNNNNNNKINNAGSYIIEQGTHDALMKNKNGIYYTMINNQ 728
Query: 97 KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE----KKEQKESKSSSKIVS-SSHNSK 151
K SS + KD KS +KD +R D DE K + ES S+ K S N+
Sbjct: 729 KVSSKKSSNNDNDKDSDMKSSAYKDSERGYDPDEMNGNSKHENESASNKKSCKMSDENAS 788
Query: 152 EPASGSQL 159
E +G +L
Sbjct: 789 ENNAGGKL 796
>gnl|CDD|221818 pfam12868, DUF3824, Domain of unknown function (DUF3824). This is
a repeating domain found in fungal proteins. It is
proline-rich, and the function is not known.
Length = 135
Score = 29.5 bits (66), Expect = 2.5
Identities = 16/69 (23%), Positives = 24/69 (34%), Gaps = 6/69 (8%)
Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
++++ KK ++E+ D D D E+ S VS
Sbjct: 24 SQRRERKKAERERERYRHDHDAYSDSYEEPYDPTPYPPSPPVSDPR------YYPNSNYF 77
Query: 163 PPPPAPTPT 171
PPPP TP
Sbjct: 78 PPPPGSTPV 86
>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger. [Transport and
binding proteins, Cations and iron carrying compounds].
Length = 1096
Score = 30.7 bits (69), Expect = 2.5
Identities = 19/92 (20%), Positives = 39/92 (42%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
SK D + + E+ E+ ++ + + +E + E E K E E+K
Sbjct: 631 SKGDVAEAEHTGERTGEEGERPTEAEGENGEESGGEAEQEGETETKGENESEGEIPAERK 690
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
E++ + E K D E + +E + + E++
Sbjct: 691 GEQEGEGEIEAKEADHKGETEAEEVEHEGETE 722
>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
primarily archaeal type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. It is found
in a single copy and is homodimeric in prokaryotes, but
six paralogs (excluded from this family) are found in
eukaryotes, where SMC proteins are heterodimeric. This
family represents the SMC protein of archaea and a few
bacteria (Aquifex, Synechocystis, etc); the SMC of other
bacteria is described by TIGR02168. The N- and
C-terminal domains of this protein are well conserved,
but the central hinge region is skewed in composition
and highly divergent [Cellular processes, Cell division,
DNA metabolism, Chromosome-associated proteins].
Length = 1164
Score = 30.8 bits (70), Expect = 2.6
Identities = 18/102 (17%), Positives = 41/102 (40%), Gaps = 8/102 (7%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK------EKERKESKPKESSSEKEKKK 107
KD ++ EK K++ + K + ++E ++S + E+K E EKE K
Sbjct: 388 KDYREKLEKLKREINELKRELDRLQEELQRLSEELADLNAAIAGIEAKINELEEEKEDKA 447
Query: 108 EKKDKKEKSHKH--KDKDRERDKDEKKEQKESKSSSKIVSSS 147
+ K+E + D + + +++ + ++
Sbjct: 448 LEIKKQEWKLEQLAADLSKYEQELYDLKEEYDRVEKELSKLQ 489
Score = 30.4 bits (69), Expect = 3.7
Identities = 14/97 (14%), Positives = 47/97 (48%), Gaps = 5/97 (5%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK--EKERKESKPKESSSEKE 104
K + ++ ++E E+E+K ++K + ++E + + ++ E +++ ++ ++ +
Sbjct: 332 DKLLAEIEELEREIEEERKRRDKLTEEYAELKEELEDLRAELEEVDKEFAETRDELKDYR 391
Query: 105 KKKEK-KDKKEKSHKHKDK--DRERDKDEKKEQKESK 138
+K EK K + + + D+ + + E+ +
Sbjct: 392 EKLEKLKREINELKRELDRLQEELQRLSEELADLNAA 428
Score = 30.0 bits (68), Expect = 4.6
Identities = 23/153 (15%), Positives = 56/153 (36%), Gaps = 18/153 (11%)
Query: 50 DKKDKDRDKEK---EKEKKDKEKDKSAVSSKEKE--------KDKVSSKEKERKESKPKE 98
++K EK EKE ++ ++ + + + K K E+E +E +
Sbjct: 818 EQKLNRLTLEKEYLEKEIQELQEQRIDLKEQIKSIEKEIENLNGKKEELEEELEELEAAL 877
Query: 99 SSSEKEKKKEKKDKKE-KSHKHKDKDRERDKDEKKEQKES-----KSSSKIVSSSHNSKE 152
E KK++ E ++ + + + + + + E+K K+ + + + E
Sbjct: 878 RDLESRLGDLKKERDELEAQLRELERKIEELEAQIEKKRKRLSELKAKLEALEEELSEIE 937
Query: 153 PASGSQLISHPPPPAPTPTQKSPVKTKEKEKEK 185
G + P ++ ++ E+E
Sbjct: 938 DPKG-EDEEIPEEELSLEDVQAELQRVEEEIRA 969
Score = 29.7 bits (67), Expect = 5.3
Identities = 19/101 (18%), Positives = 47/101 (46%), Gaps = 12/101 (11%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK----------ESKPKES 99
+++ + E +K + E+ + + + K +DK++ + E K E KE
Sbjct: 321 EERLAKLEAEIDKLLAEIEELEREIEEERKRRDKLTEEYAELKEELEDLRAELEEVDKEF 380
Query: 100 SSEKEKKKEKKDKKEK-SHKHKDKDRERDK-DEKKEQKESK 138
+ +++ K+ ++K EK + + RE D+ E+ ++ +
Sbjct: 381 AETRDELKDYREKLEKLKREINELKRELDRLQEELQRLSEE 421
>gnl|CDD|218148 pfam04557, tRNA_synt_1c_R2, Glutaminyl-tRNA synthetase,
non-specific RNA binding region part 2. This is a
region found N terminal to the catalytic domain of
glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes
but not in Escherichia coli. This region is thought to
bind RNA in a non-specific manner, enhancing
interactions between the tRNA and enzyme, but is not
essential for enzyme function.
Length = 83
Score = 28.5 bits (64), Expect = 2.6
Identities = 10/40 (25%), Positives = 16/40 (40%), Gaps = 1/40 (2%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
+ D K K+KK K+K ++ K K + E
Sbjct: 18 TEADLVK-KKKKKKKKKAEDTAATAKAKKATAEDVSEGAM 56
>gnl|CDD|114337 pfam05609, LAP1C, Lamina-associated polypeptide 1C (LAP1C). This
family contains rat LAP1C proteins and several
uncharacterized highly related sequences from both mice
and humans. LAP1s (lamina-associated polypeptide 1s) are
type 2 integral membrane proteins with a single
membrane-spanning region of the inner nuclear membrane.
LAP1s bind to both A- and B-type lamins and have a
putative role in the membrane attachment and assembly of
the nuclear lamina.
Length = 465
Score = 30.4 bits (68), Expect = 2.7
Identities = 32/149 (21%), Positives = 58/149 (38%), Gaps = 4/149 (2%)
Query: 30 PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEK-DKSAVSSKEKEKDKVSSKE 88
P+ S ++D+ + + KK D++ VS K+K +
Sbjct: 22 PAEEARGLRDACGLSKDHQEDETSSQPESSQTGSKKTVRSPDEANVSEDPKDKLRRPPLR 81
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKH--KDKDRERDKDEKKEQKESKSSSKIVSS 146
R E+ ++ ++ E +D S + K+ R RD E + K ++ + + SS
Sbjct: 82 YPRYEATEVQNKQSFLEEGETEDDHHSSSSNVTKEPLRSRDSHESSD-KVGRADAHLGSS 140
Query: 147 SHNSKEPASGSQLISHPPPPAPTPTQKSP 175
S + AS S P T +QK+P
Sbjct: 141 SWALPKSASDFTAHSQQPSVLTTGSQKAP 169
>gnl|CDD|237496 PRK13766, PRK13766, Hef nuclease; Provisional.
Length = 773
Score = 30.6 bits (70), Expect = 2.7
Identities = 16/73 (21%), Positives = 36/73 (49%), Gaps = 5/73 (6%)
Query: 75 SSKEKEKD-----KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
SS+ KEK K +K + E +E++K+++ + K K K+ E +++
Sbjct: 488 SSRRKEKKMKEELKNLKGILNKKLQELDEEQKGEEEEKDEQLSLDDFVKSKGKEEEEEEE 547
Query: 130 EKKEQKESKSSSK 142
++++ KE++
Sbjct: 548 KEEKDKETEEDEP 560
>gnl|CDD|215581 PLN03109, PLN03109, ETHYLENE-INSENSITIVE3-like3 protein;
Provisional.
Length = 599
Score = 30.6 bits (69), Expect = 2.8
Identities = 26/150 (17%), Positives = 59/150 (39%), Gaps = 1/150 (0%)
Query: 24 KDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS-AVSSKEKEKD 82
++ S I S+ + TS T + + ++KD D +D +VSSK+ ++
Sbjct: 284 REESLIRQPSSDNGTSGITETPRGGHEDRNKDAISSDSDYDVDGLEDAPGSVSSKDDRRN 343
Query: 83 KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
++ + + +K+K KK +K K + + E++ + +E ++S +
Sbjct: 344 LQPVAQEPERARDDAPNQVVPDKEKTKKPRKRKRPRGRSTVAEQEVEVTQEHPPAESRNA 403
Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
+ +H + + T Q
Sbjct: 404 LPDMNHVDAQGMEYQITGTSHENDTVTALQ 433
>gnl|CDD|227891 COG5604, COG5604, Uncharacterized conserved protein [Function
unknown].
Length = 523
Score = 30.6 bits (69), Expect = 2.8
Identities = 23/83 (27%), Positives = 38/83 (45%), Gaps = 6/83 (7%)
Query: 83 KVSSKEKERKESKPKES-SSEKEKKKEKK---DKKEKSHKHKDKDRERDKDEKKEQKESK 138
K S K+ ++ K + K+ + KK K SH + + + K ++K SK
Sbjct: 3 KASKATKKFTKNHLKNTIDRRKQLARSKKVYGTKNRNSHTENKMESGTNDNNKNKEKLSK 62
Query: 139 SSSKIVSSSHNSKEPASGSQLIS 161
S + SS +S+E GS+ IS
Sbjct: 63 LYSDVDSS--SSEEEEDGSESIS 83
Score = 30.2 bits (68), Expect = 3.1
Identities = 19/71 (26%), Positives = 28/71 (39%), Gaps = 1/71 (1%)
Query: 40 NPTNSSSSKKDKKDKDRDKE-KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
N +S ++K + + K KEK K SS E+E+D S K SK
Sbjct: 34 GTKNRNSHTENKMESGTNDNNKNKEKLSKLYSDVDSSSSEEEEDGSESISKLNVNSKKIS 93
Query: 99 SSSEKEKKKEK 109
+ +K K
Sbjct: 94 LNQVSTQKWRK 104
>gnl|CDD|216095 pfam00748, Calpain_inhib, Calpain inhibitor. This region is found
multiple times in calpain inhibitor proteins.
Length = 131
Score = 29.4 bits (66), Expect = 2.8
Identities = 30/123 (24%), Positives = 50/123 (40%), Gaps = 12/123 (9%)
Query: 9 SSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE------- 61
+ S+S PSP K K+ + + S ++ S S +K RDK +
Sbjct: 9 TCSASPPPSPTAKKKKEEAEKTAASGEVVSAQSAPSVRSAAPPPEKKRDKMSDDALDALS 68
Query: 62 ----KEKKDKEKDKSAVS-SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+ + D E+ K KEK K++ K ER+++ P E + K K+ K K
Sbjct: 69 DSLGQREPDPEEKKPVEDKVKEKAKEEKLEKLGEREDTIPPEYRLLEAKDKDGKPLLPKP 128
Query: 117 HKH 119
+
Sbjct: 129 EEE 131
>gnl|CDD|218274 pfam04801, Sin_N, Sin-like protein conserved region. Family of
higher eukaryotic proteins. SIN was identified as a
protein that interacts specifically with SXL (sex
lethal) in a yeast two-hybrid assay. The interaction is
mediated by one of the SXL RNA binding domains.
Length = 422
Score = 30.5 bits (69), Expect = 2.9
Identities = 24/83 (28%), Positives = 41/83 (49%), Gaps = 6/83 (7%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
DKKDK + KE++ +D++ D+ A ++ K S E E++ + +E S +KK
Sbjct: 140 DKKDKRK-KEEDTADEDEDPDEEAEEELKQVTVKFSRPETEKQRKR-REQSYNFLQKKIA 197
Query: 110 KDKKEKSHKHKDKD----RERDK 128
++ + H KD ER K
Sbjct: 198 EEPWIELKYHGKKDSESELERQK 220
>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
bacterial type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. This family
represents the SMC protein of most bacteria. The smc
gene is often associated with scpB (TIGR00281) and scpA
genes, where scp stands for segregation and condensation
protein. SMC was shown (in Caulobacter crescentus) to be
induced early in S phase but present and bound to DNA
throughout the cell cycle [Cellular processes, Cell
division, DNA metabolism, Chromosome-associated
proteins].
Length = 1179
Score = 30.8 bits (70), Expect = 3.0
Identities = 16/88 (18%), Positives = 41/88 (46%), Gaps = 3/88 (3%)
Query: 51 KKDKDRDKEKEKEKKDKEKD--KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
+ + + E+E E+ KE + +S E++K ++ + E + +E ++ E+ +
Sbjct: 272 LRLEVSELEEEIEELQKELYALANEISRLEQQK-QILRERLANLERQLEELEAQLEELES 330
Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKE 136
K D+ + ++ E K+E + +
Sbjct: 331 KLDELAEELAELEEKLEELKEELESLEA 358
>gnl|CDD|237855 PRK14900, valS, valyl-tRNA synthetase; Provisional.
Length = 1052
Score = 30.7 bits (69), Expect = 3.1
Identities = 26/173 (15%), Positives = 59/173 (34%), Gaps = 11/173 (6%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKE-KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
NP+ ++ +KDR + +E +EK+ K + A+ S + + + E KP +
Sbjct: 867 NPSFVQNAPPAVVEKDRARAEELREKRGKLEAHRAMLSGSEANSARRDTMEIQNEQKPTQ 926
Query: 99 SSSEKEKKKE---------KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
E + +K S + +K + + + +
Sbjct: 927 DGPAAEAQPAQENTVVESAEKAVAAVSEAAQQAATAVASGIEKVAEAVRKTVRRSVKKAA 986
Query: 150 SKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK 202
+ A+ + ++ P +K+ K +K+ K ++ KK
Sbjct: 987 AT-RAAMKKKVAKKAPAKKAAAKKAAAKKAAAKKKVAKKAPAKKVARKPAAKK 1038
>gnl|CDD|222918 PHA02687, PHA02687, ORF061 late transcription factor VLTF-4;
Provisional.
Length = 231
Score = 30.0 bits (67), Expect = 3.2
Identities = 25/80 (31%), Positives = 39/80 (48%), Gaps = 1/80 (1%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE-RDKDEKKEQKE 136
E+E + E P ++ + KKK K DK EKS K +K D+D+K E+KE
Sbjct: 64 EQECQQEQLPVPESVPPAPVKTPKRRTKKKAKADKPEKSPKAVEKLCPPDDRDDKNEEKE 123
Query: 137 SKSSSKIVSSSHNSKEPASG 156
++ S +++ ASG
Sbjct: 124 PTEEAQRNEESGDAEGGASG 143
>gnl|CDD|197310 cd09076, L1-EN, Endonuclease domain (L1-EN) of the non-LTR
retrotransposon LINE-1 (L1), and related domains. This
family contains the endonuclease domain (L1-EN) of the
non-LTR retrotransposon LINE-1 (L1), and related
domains, including the endonuclease of Xenopus laevis
Tx1. These retrotransposons belong to the subtype 2,
L1-clade. LINES can be classified into two subtypes.
Subtype 2 has two ORFs: the second (ORF2) encodes a
modular protein consisting of an N-terminal
apurine/apyrimidine endonuclease domain (EN), a central
reverse transcriptase, and a zinc-finger-like domain at
the C-terminus. LINE-1/L1 elements (full length and
truncated) comprise about 17% of the human genome. This
endonuclease nicks the genomic DNA at the consensus
target sequence 5'TTTT-AA3' producing a ribose
3'-hydroxyl end as a primer for reverse transcription of
associated template RNA. This subgroup also includes the
endonuclease of Xenopus laevis Tx1, another member of
the L1-clade. This family belongs to the large EEP
(exonuclease/endonuclease/phosphatase) superfamily that
contains functionally diverse enzymes that share a
common catalytic mechanism of cleaving phosphodiester
bonds.
Length = 236
Score = 30.0 bits (68), Expect = 3.2
Identities = 9/30 (30%), Positives = 17/30 (56%)
Query: 256 AEQFKDELFDRLKNEQADILQRKVHIISGD 285
E+ K+E +D+L++ + + II GD
Sbjct: 112 DEEEKEEFYDQLQDVLDKVPRHDTLIIGGD 141
>gnl|CDD|218439 pfam05109, Herpes_BLLF1, Herpes virus major outer envelope
glycoprotein (BLLF1). This family consists of the BLLF1
viral late glycoprotein, also termed gp350/220. It is
the most abundantly expressed glycoprotein in the viral
envelope of the Herpesviruses and is the major antigen
responsible for stimulating the production of
neutralising antibodies in vivo.
Length = 830
Score = 30.5 bits (68), Expect = 3.2
Identities = 25/221 (11%), Positives = 59/221 (26%), Gaps = 22/221 (9%)
Query: 8 SSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE---- 63
++ ++ P+ K D ++ P+ T+ T+ + + + E+
Sbjct: 506 VTTPNATSPTTQKTSDTPNATSPTPIVIGVTTTATSPPTGTTSVPNATSPQVTEESPVNN 565
Query: 64 ---KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
S+ + S ++ P S S + + +
Sbjct: 566 TNTPVVTSAPSVLTSAVTTGQHGTGSSPTSQQPGIPSSSHSTP-----RSNSTSTTPLLT 620
Query: 121 DKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKE 180
++ +E S++ + + S P G S P + T + P +
Sbjct: 621 SAHPTGGENITEETPSVPSTTHVSTLS-----PGPGPGTTSQVSGPGNSSTSRYPGEVHV 675
Query: 181 KE-----KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
E S + + + KE
Sbjct: 676 TEGMPNPNATSPSAPSGQKTAVPTVTSTGGKANSTTKETSG 716
Score = 29.8 bits (66), Expect = 4.7
Identities = 19/64 (29%), Positives = 34/64 (53%), Gaps = 17/64 (26%)
Query: 3 YSVKSSSSS-----SSAHPSPHKNKDKDSSAIPST------------STSSSTSNPTNSS 45
+ +S+S+S +SAHP+ +N +++ ++PST T+S S P NSS
Sbjct: 606 STPRSNSTSTTPLLTSAHPTGGENITEETPSVPSTTHVSTLSPGPGPGTTSQVSGPGNSS 665
Query: 46 SSKK 49
+S+
Sbjct: 666 TSRY 669
>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
unknown].
Length = 652
Score = 30.4 bits (69), Expect = 3.3
Identities = 20/80 (25%), Positives = 40/80 (50%), Gaps = 1/80 (1%)
Query: 58 KEKEKEK-KDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
K KE+E+ ++KE + + +K K +E E +E+S K + +E K + EK
Sbjct: 396 KVKEEERPREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKREIEKL 455
Query: 117 HKHKDKDRERDKDEKKEQKE 136
++ R +D+ ++ +E
Sbjct: 456 ESELERFRREVRDKVRKDRE 475
>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II). Bone
sialoprotein (BSP) is a major structural protein of the
bone matrix that is specifically expressed by
fully-differentiated osteoblasts. The expression of bone
sialoprotein (BSP) is normally restricted to mineralised
connective tissues of bones and teeth where it has been
associated with mineral crystal formation. However, it
has been found that ectopic expression of BSP occurs in
various lesions, including oral and extraoral
carcinomas, in which it has been associated with the
formation of microcrystalline deposits and the
metastasis of cancer cells to bone.
Length = 291
Score = 30.0 bits (67), Expect = 3.4
Identities = 23/129 (17%), Positives = 48/129 (37%), Gaps = 5/129 (3%)
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
K + ++KE E D+ +E+E +E + + +E+ + E H + +
Sbjct: 121 KAGNAGKKATKEDESDEDEEEEEEEEEEEAEVEENEQGTNGTSTNSTEVDHGNGSSGGDN 180
Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP---APTPTQKSPVKTKEKEK 183
++ ++E + + + P G Q PP T V T E +
Sbjct: 181 GEEGEEESVTEAEAEGTTVAGPTTTSPNGGFQ--PTTPPQEVYGTTDPPFGKVTTPEYQG 238
Query: 184 EKESSTTHD 192
E E + ++
Sbjct: 239 EYEQTGANE 247
>gnl|CDD|237875 PRK14974, PRK14974, cell division protein FtsY; Provisional.
Length = 336
Score = 29.9 bits (68), Expect = 3.4
Identities = 17/59 (28%), Positives = 29/59 (49%)
Query: 67 KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
KEK V E++ ++ +E E + +E E++K+K K K + K+KD E
Sbjct: 6 KEKLSKFVEKVEEKIEEEEEEEAPEAEEEEEEEDEEEKKEKPGFFDKAKITEIKEKDIE 64
>gnl|CDD|237631 PRK14162, PRK14162, heat shock protein GrpE; Provisional.
Length = 194
Score = 29.4 bits (66), Expect = 3.5
Identities = 27/106 (25%), Positives = 48/106 (45%), Gaps = 5/106 (4%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
K++ +K+ +E+ E K KE+D+ E E KE + K K K+ +
Sbjct: 3 KEEFPSEKDLPQEETTDEAPKKEAKEAPKEEDQEKQNPVEDLE---KEIADLKAKNKDLE 59
Query: 111 DKKEKSHKHKDKDRERDKDEKKE--QKESKSSSKIVSSSHNSKEPA 154
DK +S + R E+ + + ES+S +K V + ++ E A
Sbjct: 60 DKYLRSQAEIQNMQNRYAKERAQLIKYESQSLAKDVLPAMDNLERA 105
>gnl|CDD|234715 PRK00290, dnaK, molecular chaperone DnaK; Provisional.
Length = 627
Score = 30.1 bits (69), Expect = 3.6
Identities = 26/112 (23%), Positives = 48/112 (42%), Gaps = 13/112 (11%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK----------EKERKESKPKESSSEKEKK 106
D+E E+ KD E + +K K+ V ++ EK KE K + EKEK
Sbjct: 502 DEEIERMVKDAEANAEE---DKKRKELVEARNQADSLIYQTEKTLKELGDKVPADEKEKI 558
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
+ + +++ K +DK+ + K E+ Q K + + ++ A +
Sbjct: 559 EAAIKELKEALKGEDKEAIKAKTEELTQASQKLGEAMYQQAQAAQGAAGAAA 610
>gnl|CDD|148635 pfam07139, DUF1387, Protein of unknown function (DUF1387). This
family represents a conserved region approximately 300
residues long within a number of hypothetical proteins
of unknown function that seem to be restricted to
mammals.
Length = 301
Score = 30.0 bits (67), Expect = 3.6
Identities = 28/148 (18%), Positives = 58/148 (39%), Gaps = 6/148 (4%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKS----SSK 142
K+ ++K+SKPK + K KE+ +E++ +KD +++S S
Sbjct: 7 KKNKKKKSKPKPEAPAKSASKEETTPEEQAAPGDEKDEVNGFHANGSADDTESVDSLSEG 66
Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK 202
+ S+S +++EP + + PP P+ + T + E + ++ H K
Sbjct: 67 LDSASLDAREPEAVTLDA--PPSPSSSLTNGLSDLQSKLELQSSPHSSAKPHPSSDQHKN 124
Query: 203 DKHGDKTNPKEKDAKSKEKESHKSSAGP 230
K + + ++ GP
Sbjct: 125 AKKYVSKPSQPVTPNNSAHHDAPAALGP 152
>gnl|CDD|223683 COG0610, COG0610, Type I site-specific restriction-modification
system, R (restriction) subunit and related helicases
[Defense mechanisms].
Length = 962
Score = 30.5 bits (69), Expect = 3.6
Identities = 15/88 (17%), Positives = 35/88 (39%), Gaps = 5/88 (5%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
K+ + + E+ K+ +D ++K+K E + K ++EK ++
Sbjct: 827 DKNGAYESLKELIERIIKEWIED-----LRQKKKLIERLIEAINQYRAKKLDTAEKLEEL 881
Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
KKE+ K ++ +++E
Sbjct: 882 YILAKKEEEFKQFAEEEGLNEEELAFYD 909
>gnl|CDD|206034 pfam13863, DUF4200, Domain of unknown function (DUF4200). This
family is found in eukaryotes. It is a coiled-coil
domain of unknown function.
Length = 126
Score = 28.7 bits (65), Expect = 3.7
Identities = 19/80 (23%), Positives = 48/80 (60%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
R++ + +E+ K++++ +E+ ++ + +K KE++ K +EK+ ++EKK +KEK
Sbjct: 20 REEFERREELLKQREEELEKKEEELQESLIKFDKFLKENEAKRRRAEKKAEEEKKLRKEK 79
Query: 116 SHKHKDKDRERDKDEKKEQK 135
+ K+ E ++ + + +K
Sbjct: 80 EEEIKELKAELEELKAEIEK 99
>gnl|CDD|215656 pfam00012, HSP70, Hsp70 protein. Hsp70 chaperones help to fold
many proteins. Hsp70 assisted folding involves repeated
cycles of substrate binding and release. Hsp70 activity
is ATP dependent. Hsp70 proteins are made up of two
regions: the amino terminus is the ATPase domain and the
carboxyl terminus is the substrate binding region.
Length = 598
Score = 30.3 bits (69), Expect = 3.7
Identities = 22/104 (21%), Positives = 50/104 (48%), Gaps = 4/104 (3%)
Query: 40 NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
+ S + ++ KD ++ ++KK KE + +K + ++ V S EK KE K
Sbjct: 497 ASSGLSDDEIERMVKDAEEYAAEDKKRKE----RIEAKNEAEEYVYSLEKSLKEEGDKLP 552
Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
++K+K +E + ++ + +DK+ K E+ ++ ++
Sbjct: 553 EADKKKVEEAIEWLKEELEGEDKEEIEAKTEELQKVVQPIGERM 596
>gnl|CDD|221028 pfam11208, DUF2992, Protein of unknown function (DUF2992). This
bacterial family of proteins has no known function.
However, the cis-regulatory yjdF motif, just upstream
from the gene encoding the proteins for this family, is
a small non-coding RNA, Rfam:RF01764. The yjdF motif is
found in many Firmicutes, including Bacillus subtilis.
In most cases, it resides in potential 5' UTRs of
homologues of the yjdF gene whose function is unknown.
However, in Streptococcus thermophilus, a yjdF RNA motif
is associated with an operon whose protein products
synthesise nicotinamide adenine dinucleotide (NAD+).
Also, the S. thermophilus yjdF RNA lacks typical yjdF
motif consensus features downstream of and including the
P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the
S. thermophilus RNAs might sense a distinct compound
that structurally resembles the ligand bound by other
yjdF RNAs. On the other hand, perhaps these RNAs have an
alternative solution forming a similar binding site, as
is observed with some SAM riboswitches.
Length = 132
Score = 28.8 bits (65), Expect = 3.7
Identities = 17/62 (27%), Positives = 35/62 (56%), Gaps = 9/62 (14%)
Query: 76 SKEKEKDKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
+KE +K +S+K ++ + E E+ K+++KK KEK + K++ R+ + +KK
Sbjct: 75 AKEVKKPGISTKAQQALKLEH-------ERNKQEKKKRSKEKKEEEKERKRQLKQQKKKA 127
Query: 134 QK 135
+
Sbjct: 128 KH 129
>gnl|CDD|185429 PTZ00074, PTZ00074, 60S ribosomal protein L34; Provisional.
Length = 135
Score = 28.9 bits (65), Expect = 3.8
Identities = 12/30 (40%), Positives = 15/30 (50%)
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSK 87
KEK K+KK K+K K K +K K
Sbjct: 106 KEKAKQKKQKKKKKKKKKKKTSKKAAKKKK 135
Score = 28.5 bits (64), Expect = 4.8
Identities = 11/30 (36%), Positives = 19/30 (63%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKS 116
KEK +++ + K+ +K+KK KK K+K
Sbjct: 106 KEKAKQKKQKKKKKKKKKKKTSKKAAKKKK 135
>gnl|CDD|215412 PLN02769, PLN02769, Probable galacturonosyltransferase.
Length = 629
Score = 30.0 bits (68), Expect = 3.8
Identities = 32/155 (20%), Positives = 55/155 (35%), Gaps = 27/155 (17%)
Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTP 170
+ ++ K + E ++ +S + SS + +S S+L P P P
Sbjct: 63 SHVGSARENGTKKTQNQVSEGVDEILKESG--LTSSKPSDIVISSRSKLKKVFPDPKLNP 120
Query: 171 -TQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
K K ST DK + K K+ E E+ KS
Sbjct: 121 LPVKPHSVPVPSSDTKNKSTAIDK------------------ENKGQKADEDENEKS--- 159
Query: 230 PKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELF 264
C E G Y L + + +++ + ++ KD+LF
Sbjct: 160 --CELEFGS-YCLWSEEHKEVMKDSIVKRLKDQLF 191
>gnl|CDD|241486 cd13332, FERM_C_JAK1, Janus kinase 1 FERM domain C-lobe. JAK1 is a
tyrosine kinase protein essential in signaling type I
and type II cytokines. It interacts with the gamma chain
of type I cytokine receptors to elicit signals from the
IL-2 receptor family, the IL-4 receptor family, the
gp130 receptor family, ciliary neurotrophic factor
receptor (CNTF-R), neurotrophin-1 receptor (NNT-1R) and
Leptin-R). It also is involved in transducing a signal
by type I (IFN-alpha/beta) and type II (IFN-gamma)
interferons, and members of the IL-10 family via type II
cytokine receptors. JAK (also called Just Another
Kinase) is a family of intracellular, non-receptor
tyrosine kinases that transduce cytokine-mediated
signals via the JAK-STAT pathway. The JAK family in
mammals consists of 4 members: JAK1, JAK2, JAK3 and
TYK2. JAKs are composed of seven JAK homology (JH)
domains (JH1-JH7). The C-terminal JH1 domain is the
main catalytic domain, followed by JH2, which is often
referred to as a pseudokinase domain, followed by
JH3-JH4 which is homologous to the SH2 domain, and
lastly JH5-JH7 which is a FERM domain. Named after
Janus, the two-faced Roman god of doorways, JAKs possess
two near-identical phosphate-transferring domains; one
which displays the kinase activity (JH1), while the
other negatively regulates the kinase activity of the
first (JH2). The FERM domain has a cloverleaf tripartite
structure (FERM_N, FERM_M, FERM_C/N, alpha-, and
C-lobe/A-lobe,A-lobe, B-lobe, C-lobe/F1, F2, F3). The
C-lobe/F3 within the FERM domain is part of the PH
domain family. The FERM domain is found in the
cytoskeletal-associated proteins such as ezrin, moesin,
radixin, 4.1R, and merlin. These proteins provide a link
between the membrane and cytoskeleton and are involved
in signal transduction pathways. The FERM domain is also
found in protein tyrosine phosphatases (PTPs), the
tyrosine kinases FAK and JAK, in addition to other
proteins involved in signaling. This domain is
structurally similar to the PH and PTB domains and
consequently is capable of binding to both peptides and
phospholipids at different sites.
Length = 198
Score = 29.4 bits (66), Expect = 3.9
Identities = 13/33 (39%), Positives = 20/33 (60%)
Query: 95 KPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
KP ++ EK+KK + K K K K +DK + R+
Sbjct: 83 KPATTAVEKKKKGKSKKNKLKGKKDEDKKKARE 115
>gnl|CDD|215104 PLN00207, PLN00207, polyribonucleotide nucleotidyltransferase;
Provisional.
Length = 891
Score = 30.2 bits (68), Expect = 4.0
Identities = 15/63 (23%), Positives = 32/63 (50%)
Query: 59 EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
E EK +++ + K +K V++ + R+ ++ +++S+E +KKD K +
Sbjct: 828 EANSEKSSQKQQGGSTKDKAPQKKYVNTSSRPRRAAQAEKNSAENAAVPKKKDYKRATSG 887
Query: 119 HKD 121
KD
Sbjct: 888 SKD 890
>gnl|CDD|203489 pfam06644, ATP11, ATP11 protein. This family consists of several
eukaryotic ATP11 proteins. In Saccharomyces cerevisiae,
expression of functional F1-ATPase requires two proteins
encoded by the ATP11 and ATP12 genes. Atp11p is a
molecular chaperone of the mitochondrial matrix that
participates in the biogenesis pathway to form F1, the
catalytic unit of the ATP synthase.
Length = 250
Score = 29.6 bits (67), Expect = 4.0
Identities = 16/75 (21%), Positives = 28/75 (37%), Gaps = 3/75 (4%)
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
EK + K +K K + +R + + K +K+ S+ K + AS + + P
Sbjct: 2 EKYRSKLLQKAKESGLEFIERLKKALKDKIEKKEFSAKK---PPTGPSKQASKFKTLKPP 58
Query: 164 PPPAPTPTQKSPVKT 178
P P K
Sbjct: 59 KPADKKKPFDKPFKP 73
>gnl|CDD|239570 cd03488, Topoisomer_IB_N_htopoI_like, Topoisomer_IB_N_htopoI_like :
N-terminal DNA binding fragment found in eukaryotic DNA
topoisomerase (topo) IB proteins similar to the
monomeric yeast and human topo I. Topo I enzymes are
divided into: topo type IA (bacterial) and type IB
(eukaryotic). Topo I relaxes superhelical tension in
duplex DNA by creating a single-strand nick, the broken
strand can then rotate around the unbroken strand to
remove DNA supercoils, and the nick is religated,
liberating topo I. These enzymes regulate the
topological changes that accompany DNA replication,
transcription and other nuclear processes. Human topo I
is the target of a diverse set of anticancer drugs
including camptothecins (CPTs). CPTs bind to the topo
I-DNA complex and inhibit religation of the
single-strand nick, resulting in the accumulation of
topo I-DNA adducts. This family may represent more than
one structural domain.
Length = 215
Score = 29.6 bits (67), Expect = 4.0
Identities = 14/42 (33%), Positives = 23/42 (54%), Gaps = 9/42 (21%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHK------HKDK 122
+KE K++ KE EK+ K +K+K E+ + HK+K
Sbjct: 96 AQKEEKKAMSKE---EKKAIKAEKEKLEEEYGFCILDGHKEK 134
>gnl|CDD|216257 pfam01034, Syndecan, Syndecan domain. Syndecans are transmembrane
heparan sulfate proteoglycans which are implicated in
the binding of extracellular matrix components and
growth factors.
Length = 207
Score = 29.3 bits (66), Expect = 4.0
Identities = 23/99 (23%), Positives = 35/99 (35%), Gaps = 12/99 (12%)
Query: 7 SSSSSSSAHPSPHKNKDKD--SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
S S S A PS ++ + S+ P +T+SS+ + +++S K
Sbjct: 53 YSGSGSGATPSDDEDSEPVTTSATPPKLTTTSSSPSNDTTTASTSTKTSPTVSTTVTTTT 112
Query: 65 KDKEKDKSAV---SSKEKEKDKVSSKEK-------ERKE 93
E D S E + SS ERKE
Sbjct: 113 SPSETDTEEATTTVSTETPTEGGSSAATDPSKNLLERKE 151
>gnl|CDD|215544 PLN03029, PLN03029, type-a response regulator protein; Provisional.
Length = 222
Score = 29.6 bits (66), Expect = 4.0
Identities = 11/58 (18%), Positives = 37/58 (63%)
Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
K K +++++E ++K ++ ++S + S+++E+ + + + + +P++ ++ K K E+
Sbjct: 147 KTKSKNQKQENQEKQEKLEESEIQSEKQEQPSQQPQSQPQPQQQPQQPNNNKRKAMEE 204
>gnl|CDD|219500 pfam07655, Secretin_N_2, Secretin N-terminal domain. This is a
short domain found in bacterial type II/III secretory
system proteins. The architecture of these proteins
suggests that this family may be functionally analogous
to pfam03958.
Length = 95
Score = 28.1 bits (63), Expect = 4.1
Identities = 17/44 (38%), Positives = 22/44 (50%), Gaps = 6/44 (13%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS 47
SV S S SSS + SS+ S SSS+S+ +SSS
Sbjct: 20 SVTSGSVSSSG------SNSSSSSSNSSNGGSSSSSSSGDSSSG 57
>gnl|CDD|220614 pfam10174, Cast, RIM-binding protein of the cytomatrix active zone.
This is a family of proteins that form part of the CAZ
(cytomatrix at the active zone) complex which is
involved in determining the site of synaptic vesicle
fusion. The C-terminus is a PDZ-binding motif that binds
directly to RIM (a small G protein Rab-3A effector). The
family also contains four coiled-coil domains.
Length = 774
Score = 30.0 bits (67), Expect = 4.1
Identities = 18/88 (20%), Positives = 42/88 (47%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
D +D+ E++ K+ + + + KE+ KE+ R + + EK ++
Sbjct: 382 DMRDRYEKTERKLRVLQKKIENLQETFRRKERRLKEEKERLRSLQTDTNTDTALEKLEKA 441
Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKES 137
+KE+ + + R+RD+ ++E+ E+
Sbjct: 442 LAEKERIIERLKEQRDRDERYEQEEFET 469
>gnl|CDD|223649 COG0576, GrpE, Molecular chaperone GrpE (heat shock protein)
[Posttranslational modification, protein turnover,
chaperones].
Length = 193
Score = 29.2 bits (66), Expect = 4.2
Identities = 13/78 (16%), Positives = 36/78 (46%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
DK+ K + + E+ ++ ++ ++ +E E++ +E++ + K+K
Sbjct: 1 MSDKEQKTEEPDAEETEEAEKSEEEEAEEEEPEEENELEEEQQEIAELEAQLEELKDKYL 60
Query: 108 EKKDKKEKSHKHKDKDRE 125
+ + E K +++RE
Sbjct: 61 RAQAEFENLRKRTERERE 78
>gnl|CDD|223599 COG0525, ValS, Valyl-tRNA synthetase [Translation, ribosomal
structure and biogenesis].
Length = 877
Score = 29.9 bits (68), Expect = 4.2
Identities = 18/69 (26%), Positives = 28/69 (40%), Gaps = 8/69 (11%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK-EKERKESKPKESSSEKEKK 106
D E + +K+ EK EKE D++ K E +K E EKEK+
Sbjct: 804 LPLAGLIDLAAELARLEKELEK-------LEKEIDRIEKKLSNEGFVAKAPEEVVEKEKE 856
Query: 107 KEKKDKKEK 115
K + + +
Sbjct: 857 KLAEYQVKL 865
>gnl|CDD|204935 pfam12474, PKK, Polo kinase kinase. This domain family is found in
eukaryotes, and is approximately 140 amino acids in
length. The family is found in association with
pfam00069. Polo-like kinase 1 (Plx1) is essential during
mitosis for the activation of Cdc25C, for spindle
assembly, and for cyclin B degradation. This family is
Polo kinase kinase (PKK) which phosphorylates Polo
kinase and Polo-like kinase to activate them. PKK is a
serine/threonine kinase.
Length = 142
Score = 28.8 bits (65), Expect = 4.3
Identities = 22/92 (23%), Positives = 50/92 (54%), Gaps = 2/92 (2%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
+ +K+ ++ + ++K+ EK + + + + K E++ + KE S K +K
Sbjct: 19 QLLKRHEKELEQLERQQKRTIEKLEQRQTQELRRLPKRIRAEQKTRLKMFKE--SLKIEK 76
Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
KE K + EK + ++++++R K EK+EQ++
Sbjct: 77 KELKQEVEKLPRFQEQEKKRMKAEKEEQEQKH 108
>gnl|CDD|223624 COG0550, TopA, Topoisomerase IA [DNA replication, recombination,
and repair].
Length = 570
Score = 29.9 bits (68), Expect = 4.3
Identities = 13/44 (29%), Positives = 24/44 (54%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
+ + +KE KDK++ + V+ + + KV S EK+ K+ P
Sbjct: 214 TEIEGKKEGRLKDKDEAEEIVNKLKGKPAKVVSVEKKPKKRSPP 257
>gnl|CDD|178744 PLN03205, PLN03205, ATR interacting protein; Provisional.
Length = 652
Score = 30.1 bits (67), Expect = 4.5
Identities = 31/131 (23%), Positives = 63/131 (48%), Gaps = 10/131 (7%)
Query: 31 STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV---SSK 87
S ST + + P + +SS + D ++D E ++ KK+ E+ + E+E ++ +K
Sbjct: 108 SNSTVVTAAKPISPNSSNR-CCDSEKDLEIDRLKKELERVSKQLLDVEQECSQLKKGKNK 166
Query: 88 EKERK-----ESKPKESSSEKEKKKE-KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
E E K ++K + S+ K+ + + D S H++ D D+KK K + +
Sbjct: 167 EMESKNLCADDNKGQCSTVHASKRIDLEPDVATSSVIHRENDSRMALDDKKSFKTAGVQA 226
Query: 142 KIVSSSHNSKE 152
+ + + SK+
Sbjct: 227 DLANHADLSKK 237
>gnl|CDD|240235 PTZ00032, PTZ00032, 60S ribosomal protein L18; Provisional.
Length = 211
Score = 29.4 bits (66), Expect = 4.5
Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 12/92 (13%)
Query: 32 TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER 91
T+ + NP + SS KK K K K + K+ K R
Sbjct: 19 TNKAVYPPNPLSLFSSPNRKKSAPEQVPTGKNKLLLTK-----------RSKLKGIPKPR 67
Query: 92 KESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
K K +E ++K ++++ K DKD
Sbjct: 68 KLHKHGF-WAEIFEEKVEREELGNPCKDLDKD 98
>gnl|CDD|181632 PRK09060, PRK09060, dihydroorotase; Validated.
Length = 444
Score = 29.9 bits (68), Expect = 4.6
Identities = 11/25 (44%), Positives = 15/25 (60%)
Query: 329 IQATRELLDLATRCSQLKAILHVST 353
+ ATR L+ LA + +LHVST
Sbjct: 213 LLATRRLVRLARETGRRIHVLHVST 237
>gnl|CDD|235132 PRK03577, PRK03577, acid shock protein precursor; Provisional.
Length = 102
Score = 28.1 bits (62), Expect = 4.6
Identities = 22/74 (29%), Positives = 31/74 (41%), Gaps = 2/74 (2%)
Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK--E 220
PA T T +P KT K++ K K K+K K ++ A K +
Sbjct: 27 TAAPAATTTTAAPAKTTHHHKKQHKKAPEQKAQAAKKHHKNKKEQKAPEQKAQAAKKHAK 86
Query: 221 KESHKSSAGPKCYP 234
K SHK++A P P
Sbjct: 87 KHSHKTAAKPAAQP 100
>gnl|CDD|202833 pfam03962, Mnd1, Mnd1 family. This family of proteins includes
MND1 from S. cerevisiae. The mnd1 protein forms a
complex with hop2 to promote homologous chromosome
pairing and meiotic double-strand break repair.
Length = 188
Score = 29.1 bits (66), Expect = 4.8
Identities = 22/96 (22%), Positives = 49/96 (51%), Gaps = 8/96 (8%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK--E 104
S+ K K R ++ +KE ++ ++ + + ++ +K R+E++ + E+ +
Sbjct: 61 SQALNKLKTRLEKLKKELEELKQRI------AELQAQIEKLKKGREETEERTELLEELKQ 114
Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
+KE K K + K++ D ER + K+E K +K +
Sbjct: 115 LEKELKKLKAELEKYEKNDPERIEKLKEETKVAKEA 150
>gnl|CDD|215584 PLN03113, PLN03113, DNA ligase 1; Provisional.
Length = 744
Score = 30.0 bits (67), Expect = 4.8
Identities = 17/105 (16%), Positives = 38/105 (36%), Gaps = 6/105 (5%)
Query: 2 AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSS------TSNPTNSSSSKKDKKDKD 55
A + K + S SP K K ++ T+ S T + S +
Sbjct: 17 AAAKKKQPQTQSQSSSPKKRKIGETQDANLGKTNVSEGTLPKTEDTIEPKSDSAKPRSST 76
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
++ + K+ + K++ K K+ +K+ + P++ +
Sbjct: 77 SSIAEDSKTGTKKAQTLSKPKKDEMKSKIGLLKKKPNDFDPEKVA 121
>gnl|CDD|152468 pfam12033, DUF3519, Protein of unknown function (DUF3519). This
family of proteins is functionally uncharacterized. This
protein is found in bacteria. Proteins in this family
are typically between 117 and 1154 amino acids in length.
This protein has a single completely conserved residue Q
that may be functionally important.
Length = 104
Score = 28.0 bits (62), Expect = 4.9
Identities = 24/86 (27%), Positives = 35/86 (40%), Gaps = 10/86 (11%)
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
K KK +K K+H + E+ D + S +K +S NS EP P
Sbjct: 26 KDNKKGEKLKNH-YVITGFEKRLDNSESLYTSPIITKHETSPLNSNEPN---------PT 75
Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTH 191
P P +Q+ +KT E E T+
Sbjct: 76 PKPLTSQEDLLKTSENLNETTPEPTN 101
>gnl|CDD|217829 pfam03985, Paf1, Paf1. Members of this family are components of
the RNA polymerase II associated Paf1 complex. The Paf1
complex functions during the elongation phase of
transcription in conjunction with Spt4-Spt5 and
Spt16-Pob3.
Length = 431
Score = 29.7 bits (67), Expect = 5.0
Identities = 20/80 (25%), Positives = 38/80 (47%), Gaps = 4/80 (5%)
Query: 41 PTNSSSSKKDKKDKDR----DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP 96
++ SK K + R D E+ E +D+E+++ + +E+E + + + +E
Sbjct: 346 NPSTKESKMRDKRRARLDPIDFEEVDEDEDEEEEQRSDEHEEEEGEDSEEEGSQSREDGS 405
Query: 97 KESSSEKEKKKEKKDKKEKS 116
ESSS+ E K KE +
Sbjct: 406 SESSSDVGSDSESKADKESA 425
>gnl|CDD|220178 pfam09321, DUF1978, Domain of unknown function (DUF1978). Members
of this family are found in various hypothetical
proteins produced by the bacterium Chlamydia pneumoniae.
Their exact function has not, as yet, been identified.
Length = 241
Score = 29.4 bits (66), Expect = 5.1
Identities = 15/66 (22%), Positives = 30/66 (45%)
Query: 71 KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
S V E+ + K S K ER K + ++ K+ KK ++ + + + ++ E
Sbjct: 55 FSEVDRDEQWEKKTSLKHLERTYEKALDRLEKQSSKENKKVLQDAQREFERQSQDFYDKE 114
Query: 131 KKEQKE 136
+E +E
Sbjct: 115 IEEVEE 120
>gnl|CDD|236267 PRK08451, PRK08451, DNA polymerase III subunits gamma and tau;
Validated.
Length = 535
Score = 29.6 bits (67), Expect = 5.1
Identities = 16/73 (21%), Positives = 34/73 (46%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
K K K + K E + + S+++K+ D S+ E +E+K + ++ + KE
Sbjct: 446 KGAKIKIQKALKSAENPLQSLKEFKPSNEKKKIDTESTAEMLEEEAKKDDEEVQETQLKE 505
Query: 109 KKDKKEKSHKHKD 121
+ +E ++D
Sbjct: 506 ATELQEFMINNED 518
Score = 28.8 bits (65), Expect = 8.8
Identities = 13/67 (19%), Positives = 25/67 (37%)
Query: 160 ISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
I A P Q EK+K + + + + + KK D+ +T KE +
Sbjct: 452 IQKALKSAENPLQSLKEFKPSNEKKKIDTESTAEMLEEEAKKDDEEVQETQLKEATELQE 511
Query: 220 EKESHKS 226
+++
Sbjct: 512 FMINNED 518
>gnl|CDD|112890 pfam04094, DUF390, Protein of unknown function (DUF390). This is a
family of long proteins currently only found in the rice
genome. They have no known function. However they may be
some kind of transposable element.
Length = 843
Score = 29.8 bits (66), Expect = 5.1
Identities = 27/166 (16%), Positives = 67/166 (40%), Gaps = 20/166 (12%)
Query: 30 PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
PS + S S + ++++ +++ DR + ++ ++ +E + A +++ E+ +++E+
Sbjct: 216 PSRHSKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAEE---AAREE 272
Query: 90 ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS---- 145
+ + +E++ E E + S +D+ + ++S+
Sbjct: 273 AARARQAEEAAREAEAAFRADEAAATSEAARDEAAGAQLAPDPSGDAAATTSEAAGDEAA 332
Query: 146 --------SSHNSKEPASGS-----QLISHPPPPAPTPTQKSPVKT 178
S EPA G I P AP+P + P+ +
Sbjct: 333 GALLGPDPSGDAQDEPAPGGAPDSGTSIGGPSRAAPSPRRLFPLPS 378
>gnl|CDD|206228 pfam14058, PcfK, PcfK-like protein. The PcfK-like protein family
includes the Enterococcus faecalis PcfK protein, which
is functionally uncharacterized. This family of proteins
is found in bacteria and viruses. Proteins in this
family are typically between 137 and 257 amino acids in
length. There are two completely conserved residues (D
and L) that may be functionally important.
Length = 136
Score = 28.5 bits (64), Expect = 5.1
Identities = 12/60 (20%), Positives = 23/60 (38%), Gaps = 3/60 (5%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
D++ K K V+ + K + RKE+ E K +++ K +K+
Sbjct: 68 DEDDIKVGKPINCR---VTVNHTVELTEEEKAEARKEALKAYQQEELRKIQKRSKKSKKA 124
>gnl|CDD|220267 pfam09494, Slx4, Slx4 endonuclease. The Slx4 protein is a
heteromeric structure-specific endonuclease found in
fungi. Slx4 with Slx1 acts as a nuclease on branched DNA
substrates, particularly simple-Y, 5'-flap, or
replication fork structures by cleaving the strand
bearing the 5' non-homologous arm at the branch junction
and thus generating ligatable nicked products from
5'-flap or replication fork substrates.
Length = 627
Score = 29.6 bits (66), Expect = 5.2
Identities = 29/130 (22%), Positives = 48/130 (36%), Gaps = 13/130 (10%)
Query: 73 AVSSKEKEKDKVSSKEKERKESKPK----ESSSEKEKKKEKKDKKE--------KSHKHK 120
VS K K K K K RK +K K +S + ++ + D+ K K
Sbjct: 65 NVSGKRVPKKKKIKKPKLRKRTKRKNKKIKSLTAFNEENFETDRAPSLLSYLSGKQSKVN 124
Query: 121 DKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKE 180
D + + +K + S S+ S+ ++ E +LI P KS +K
Sbjct: 125 DILKRLESSKKIKNSRSSESTFETSALYSEDEWIDIVKLIRLRFPKLSESDLKS-LKNYI 183
Query: 181 KEKEKESSTT 190
EK+ +
Sbjct: 184 YGAEKQEESE 193
>gnl|CDD|235370 PRK05244, PRK05244, Der GTPase activator; Provisional.
Length = 177
Score = 28.7 bits (65), Expect = 5.2
Identities = 26/114 (22%), Positives = 45/114 (39%), Gaps = 15/114 (13%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS-HKHKDKDRERDKDEKKE 133
K+ + K +K+++ + + +E+K++KK K KS +H E
Sbjct: 1 KKKKSSPKRSKGMAKSKKKTREELDAEARERKRKKKHKGLKSGSRHN---------EGNT 51
Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLI--SHPPPPAPTPTQKSPVKTKEKEKEK 185
Q + K ++ SK+P L P P K P + E+E EK
Sbjct: 52 QSKGKGQAQKKDPRIGSKKPI---PLGVEEKVKPKKKKPKSKKPKLSPEQELEK 102
>gnl|CDD|218391 pfam05029, TIMELESS_C, Timeless protein C terminal region. The
timeless (tim) gene is essential for circadian function
in Drosophila. Putative homologues of Drosophila tim
have been identified in both mice and humans (mTim and
hTIM, respectively). Mammalian TIM is not the true
orthologue of Drosophila TIM, but is the likely
orthologue of a fly gene, timeout (also called tim-2).
mTim has been shown to be essential for embryonic
development, but does not have substantiated circadian
function. Some family members contain a SANT domain in
this region.
Length = 507
Score = 29.7 bits (66), Expect = 5.3
Identities = 26/192 (13%), Positives = 71/192 (36%), Gaps = 11/192 (5%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS---SEKEK 105
K ++ K + K+ E+ ++ + +E+ + + ++R + E S S +
Sbjct: 227 KKRRKKLKPKQPNGEESGEDDFQEDPEEEEQLPESKPEETEKRVSAFQVEGSTLISAENL 286
Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ------- 158
+++ K +K + + +E+ E + +V + ++E Q
Sbjct: 287 RQQLKQEKTSWPLLWLQSCLIRAADDREEDECDQAVPLVPLTEENEEAMENEQFQRLLKA 346
Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKD-AK 217
L PP + P K + + +++ + + + + KD + + +
Sbjct: 347 LGLRPPRSGQEGFWRIPAKLSSTQLRRRAASLSGEEEEPEDELKDDVDGEQADESEHETL 406
Query: 218 SKEKESHKSSAG 229
+ K + + AG
Sbjct: 407 ALRKNARQRKAG 418
>gnl|CDD|234368 TIGR03835, termin_org_DnaJ, terminal organelle assembly protein
TopJ. This model describes TopJ (MG_200, CbpA), a DnaJ
homolog and probable assembly protein of the Mycoplasma
terminal organelle. The terminal organelle is involved
in both cytadherence and gliding motility [Cellular
processes, Chemotaxis and motility].
Length = 871
Score = 29.8 bits (66), Expect = 5.5
Identities = 23/96 (23%), Positives = 35/96 (36%), Gaps = 7/96 (7%)
Query: 177 KTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEV 236
+ K E ++ T D SK K KKK K K++ + +E A E
Sbjct: 89 EEINKSGEFDNITDDDTPSKKKKKKKKKGWFWAKSKQESKTIETEEIIDVGASVNQANET 148
Query: 237 GGIYILLRSKKNKTVQ-------ERLAEQFKDELFD 265
L + ++V ERL +Q K+ F
Sbjct: 149 RLFDDTLDDQLEESVSTQSTDDGERLFDQNKEPSFT 184
>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
This is a family of fungal proteins whose function is
unknown.
Length = 130
Score = 28.4 bits (64), Expect = 5.5
Identities = 13/59 (22%), Positives = 29/59 (49%)
Query: 68 EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
EK+ K+K++ + K + + + + EK+K + +EK K + K++E+
Sbjct: 72 EKELLREKEKKKKRKRPGKKRRIALRLRRERTKERAEKEKRTRKNREKKFKRRQKEKEK 130
>gnl|CDD|178307 PLN02705, PLN02705, beta-amylase.
Length = 681
Score = 29.5 bits (66), Expect = 5.5
Identities = 15/74 (20%), Positives = 34/74 (45%), Gaps = 8/74 (10%)
Query: 11 SSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKD-------KKDKDRDKEKEKE 63
+ P + + +A + + + T N N+ + K ++R+KEKE+
Sbjct: 29 PNRNRNQPQSRRPRGFAATAAAAAIAPTENDVNNGNISSGGGGGGGGKGKREREKEKERT 88
Query: 64 KKDKEKDKSAVSSK 77
K +E+ + A++S+
Sbjct: 89 KL-RERHRRAITSR 101
>gnl|CDD|220774 pfam10477, EIF4E-T, Nucleocytoplasmic shuttling protein for mRNA
cap-binding EIF4E. EIF4E-T is the transporter protein
for shuttling the mRNA cap-binding protein EIF4E
protein, targeting it for nuclear import. EIF4E-T
contains several key binding domains including two
functional leucine-rich NESs (nuclear export signals)
between residues 438-447 and 613-638 in the human
protein. The other two binding domains are an
EIF4E-binding site, between residues 27-42 in Q9EST3,
and a bipartite NLS (nuclear localisation signals)
between 194-211, and these lie in family EIF4E-T_N.
EIF4E is the eukaryotic translation initiation factor 4E
that is the rate-limiting factor for cap-dependent
translation initiation.
Length = 520
Score = 29.5 bits (65), Expect = 5.6
Identities = 42/229 (18%), Positives = 76/229 (33%), Gaps = 23/229 (10%)
Query: 16 PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK------KDKEK 69
P ++ D P SS + +SS S++ KKD D D+ K + + KE
Sbjct: 39 PPSYRRGKSDGVWDPEKWNSSLYPSSGSSSPSERLKKDSDTDRGSLKRRIPDPRERVKED 98
Query: 70 DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH---------K 120
D V S ++ R S+ S ++ + + K
Sbjct: 99 DLDVVLSPQRRSFGGGCHVTARASSENDNESLRLLGERRIGSGRIMPSRGFERDFRGPRK 158
Query: 121 DKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT-- 178
D++ ER +D +++ K+ + + S E P + S ++T
Sbjct: 159 DRNPERSRDRERDYKDKRFRREFGDSKRVFSESRRNDSYTIEEEPEWFSAGPTSQLETIE 218
Query: 179 ------KEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
K E++ ++S K K KK K E + +
Sbjct: 219 LIGFDDKILEEDDKTSNGDGKQKGRKRTKKRTASVKEGHVECNGGVSLE 267
>gnl|CDD|219461 pfam07543, PGA2, Protein trafficking PGA2. A Saccharomyces
cerevisiae member of this family (PGA2) is an ER protein
which has been implicated in protein trafficking.
Length = 139
Score = 28.2 bits (63), Expect = 5.8
Identities = 17/96 (17%), Positives = 39/96 (40%), Gaps = 13/96 (13%)
Query: 54 KDRDKEKEKEKKDKEKDKSA---VSSKE-----KEKDKVSSKEKERKESKPKESSSE--- 102
K ++KE EKE+ ++E+ + +S + E E S+
Sbjct: 40 KAQEKEHEKERAEREEAREKKAKISPNALRGGATAGHGEEDTDDEEDEEDFATPSAVPQW 99
Query: 103 --KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
K +K+++K ++ + ++ DE ++ +E
Sbjct: 100 GKKARKRQRKVIRKLLEAEEQLREDQYDDEDEDIEE 135
>gnl|CDD|218941 pfam06217, GAGA_bind, GAGA binding protein-like family. This
family includes gbp a protein from Soybean that binds to
GAGA element dinucleotide repeat DNA. It seems likely
that this domain mediates DNA binding. This putative
domain contains several conserved cysteines and a
histidine suggesting this may be a zinc-binding DNA
interaction domain.
Length = 301
Score = 29.1 bits (65), Expect = 5.9
Identities = 13/50 (26%), Positives = 20/50 (40%)
Query: 88 EKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
KE K+ K +S + K KK KK+ S ++ K +S
Sbjct: 144 AKEVKKPKKGQSPKVPKAPKPKKPKKKGSVSNRSVKMPGIDPRSKPDWKS 193
>gnl|CDD|223061 PHA03369, PHA03369, capsid maturational protease; Provisional.
Length = 663
Score = 29.6 bits (66), Expect = 6.0
Identities = 22/99 (22%), Positives = 34/99 (34%), Gaps = 12/99 (12%)
Query: 89 KERKESKPKESSSE--KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS 146
ERK + E E + K KK K+E+ K+ + K E K+ ES+
Sbjct: 464 HERKRKRGGELKEELIETLKLVKKLKEEQESLAKELEATAHKSEIKKIAESEFK------ 517
Query: 147 SHNSKEPASGSQLISHPP-PPAPTPTQKSPVKTKEKEKE 184
+ + + I A P K + E E
Sbjct: 518 ---NAGAKTAAANIEPNCSADAAAPATKRARPETKTELE 553
>gnl|CDD|218549 pfam05308, Mito_fiss_reg, Mitochondrial fission regulator. In
eukaryotes, this family of proteins induces
mitochondrial fission.
Length = 248
Score = 28.9 bits (65), Expect = 6.0
Identities = 20/87 (22%), Positives = 31/87 (35%), Gaps = 17/87 (19%)
Query: 133 EQKESKSSSKIVSSSHNSKEPASGSQLIS----------HPPPPAPTPTQKSPVKTKE-- 180
EQ S +S + SS + ++ S IS PPPP P P ++
Sbjct: 146 EQSNSTTSDLL-SSDESVPSSSTTSFPISPPTEEPVLEVPPPPPPPPPPPPPSLQQSTSA 204
Query: 181 ----KEKEKESSTTHDKHSKHKHKKKD 203
KE++ + S K K +
Sbjct: 205 IDLIKERKGQRSAAGKTLVLSKPKSPE 231
>gnl|CDD|191382 pfam05837, CENP-H, Centromere protein H (CENP-H). This family
consists of several eukaryotic centromere protein H
(CENP-H) sequences. Macromolecular
centromere-kinetochore complex plays a critical role in
sister chromatid separation, but its complete protein
composition as well as its precise dynamic function
during mitosis has not yet been clearly determined.
CENP-H contains a coiled-coil structure and a nuclear
localisation signal. CENP-H is specifically and
constitutively localised in kinetochores throughout the
cell cycle. CENP-H may play a role in kinetochore
organisation and function throughout the cell cycle.
This is the C-terminus of the region, which is conserved
from fungi to humans.
Length = 106
Score = 27.7 bits (62), Expect = 6.1
Identities = 15/65 (23%), Positives = 34/65 (52%), Gaps = 2/65 (3%)
Query: 82 DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
+++S EKER + K K E + K K+ S + + +E+ + + + K+SK+
Sbjct: 17 EELSDLEKERLQLKQKNVELALELLELTKKKE--SWREDMELKEQLEKLEADLKKSKAKW 74
Query: 142 KIVSS 146
+++ +
Sbjct: 75 EVMKN 79
>gnl|CDD|224510 COG1594, RPB9, DNA-directed RNA polymerase, subunit
M/Transcription elongation factor TFIIS
[Transcription].
Length = 113
Score = 27.8 bits (62), Expect = 6.4
Identities = 12/53 (22%), Positives = 17/53 (32%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
+ K R KE +K KE + K ++KEK K
Sbjct: 26 RKCGYEEEASNKKVYRYSVKEAVEKKKEVVLVVEDETQGAKTLPTAKEKCPKC 78
>gnl|CDD|185616 PTZ00436, PTZ00436, 60S ribosomal protein L19-like protein;
Provisional.
Length = 357
Score = 29.1 bits (64), Expect = 6.4
Identities = 21/90 (23%), Positives = 42/90 (46%), Gaps = 1/90 (1%)
Query: 89 KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
K + E K + +E+ K KD++ + K + R+R+KD ++ ++E +++
Sbjct: 144 KVKNEKKKERQLAEQLAAKRLKDEQHRHKARKQELRKREKDRERARREDAAAAAAAKQKA 203
Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
+K+ A+ S S AP +P K
Sbjct: 204 AAKKAAAPSGKKS-AKAAAPAKAAAAPAKA 232
>gnl|CDD|240413 PTZ00423, PTZ00423, glideosome-associated protein 45; Provisional.
Length = 193
Score = 28.9 bits (64), Expect = 6.5
Identities = 34/120 (28%), Positives = 55/120 (45%), Gaps = 15/120 (12%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
+ K+ K +D D+ E+EK KE V ++K + +E E + +P E E
Sbjct: 6 RKNKAKEPKRRDIDELAEREKLKKE-----VEEIPEQKPEDIVEELEDQPEEPPEQEEEN 60
Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH----NSKEPASGSQL 159
E++K K++ ++K DEK +S+S I S SH S + +GSQL
Sbjct: 61 EEQKPKEEIDYPIQENK------SFDEKNLDDLERSNSDIYSESHKYDNASDKLETGSQL 114
>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
Length = 1021
Score = 29.3 bits (65), Expect = 6.6
Identities = 36/174 (20%), Positives = 72/174 (41%), Gaps = 37/174 (21%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKK------------- 52
+SSS +S + N +S S +S S P SK D+K
Sbjct: 369 RSSSCASRQSANNVTNITSITSVTSVASVASVASVP-----SKDDRKYPQDGATHCHAVN 423
Query: 53 -------DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
DKD + EK++ + A+ K EK ++ E+E +E +E E+
Sbjct: 424 GHYGGRVDKDHAERARIEKENAHR--KALEMKILEKKRIERLEREERERLERERMERIER 481
Query: 106 KKEKKDKKEKSHKHKDK-DRER---------DKDEKKEQKESKSSSKIVSSSHN 149
++ ++++ E+ +D+ +R+R D+ E+ ++++ +S + N
Sbjct: 482 ERLERERLERERLERDRLERDRLDRLERERVDRLERDRLEKARRNSYFLKGMEN 535
>gnl|CDD|215590 PLN03123, PLN03123, poly [ADP-ribose] polymerase; Provisional.
Length = 981
Score = 29.4 bits (66), Expect = 6.6
Identities = 17/64 (26%), Positives = 26/64 (40%)
Query: 30 PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
+ S K KKD D + +K K D++ S +S++K D S E
Sbjct: 188 SEAKEEKAEERKQESKKGAKRKKDASGDDKSKKAKTDRDVSTSTAASQKKSSDLESKLEA 247
Query: 90 ERKE 93
+ KE
Sbjct: 248 QSKE 251
>gnl|CDD|147982 pfam06112, Herpes_capsid, Gammaherpesvirus capsid protein. This
family consists of several Gammaherpesvirus capsid
proteins. The exact function of this family is unknown.
Length = 148
Score = 28.3 bits (63), Expect = 6.6
Identities = 17/59 (28%), Positives = 26/59 (44%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
S S+SSSS++ N+ SS +S S S+ ++ S S D K+K
Sbjct: 90 SALSASSSSASGVPGGANQLSGSSGSALSSGPGSLSSSSSLSGSGAGAGDTAPSSSKKK 148
>gnl|CDD|220654 pfam10254, Pacs-1, PACS-1 cytosolic sorting protein. PACS-1 is a
cytosolic sorting protein that directs the localisation
of membrane proteins in the trans-Golgi network
(TGN)/endosomal system. PACS-1 connects the clathrin
adaptor AP-1 to acidic cluster sorting motifs contained
in the cytoplasmic domain of cargo proteins such as
furin, the cation-independent mannose-6-phosphate
receptor and in viral proteins such as human
immunodeficiency virus type 1 Nef.
Length = 413
Score = 29.4 bits (66), Expect = 6.6
Identities = 33/145 (22%), Positives = 50/145 (34%), Gaps = 25/145 (17%)
Query: 7 SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD--------- 57
S SS+ PS + S+ P S S S+ +++ S D D
Sbjct: 228 SLFVLSSSPPSSSGASKEASATPPP---SPSMSSSLSAAGSPVDAIGLQVDYWPAARPGE 284
Query: 58 KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE-----KKKEKKDK 112
+ KE K+D S K K S + R S +E+ KEK K
Sbjct: 285 RRKEGSKRDA-------SGKNTLKSTFRSLQVSRLPSSGQEAQMTNTMSMTVVTKEKNKK 337
Query: 113 KEKSHKHKDKDRERDKDEKKEQKES 137
K K +E++ + K + E
Sbjct: 338 VPVMFLGK-KPKEKEVESKSQCIEG 361
>gnl|CDD|224016 COG1091, RfbD, dTDP-4-dehydrorhamnose reductase [Cell envelope
biogenesis, outer membrane].
Length = 281
Score = 29.2 bits (66), Expect = 6.6
Identities = 24/108 (22%), Positives = 40/108 (37%), Gaps = 16/108 (14%)
Query: 267 LKNEQADILQRKVHIISGDISQPSLGISSHD---QQFIQHHIHVIIHAAASLRFDELIQD 323
L E L + +I+ + L I+ D + + V+I+AAA D+ +
Sbjct: 12 LGTELRRALPGEFEVIA--TDRAELDITDPDAVLEVIRETRPDVVINAAAYTAVDKAESE 69
Query: 324 ---AFTLNIQATRELLDLATRCSQLKAILHVSTLYT------HSYRED 362
AF +N L A ++H+ST Y Y+E
Sbjct: 70 PELAFAVNATGAENLARAAAEVGAR--LVHISTDYVFDGEKGGPYKET 115
>gnl|CDD|167284 PRK01833, tatA, twin arginine translocase protein A; Provisional.
Length = 74
Score = 27.2 bits (60), Expect = 6.7
Identities = 15/38 (39%), Positives = 22/38 (57%), Gaps = 1/38 (2%)
Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK 83
S K KK DK K+ E + K + K A S+++K K+K
Sbjct: 35 SVKGFKKAMADDKPKDAEFE-KVEAKEAASTEQKAKEK 71
>gnl|CDD|216860 pfam02063, MARCKS, MARCKS family.
Length = 296
Score = 29.0 bits (64), Expect = 6.7
Identities = 46/216 (21%), Positives = 92/216 (42%), Gaps = 6/216 (2%)
Query: 14 AHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSA 73
A P+ + K+ ++ + T +S++ ++K+ E +KE + E + A
Sbjct: 45 ASPAAAEAGAKEELQANGSAPAEETGKEEAASAAAAEEKEAAASTEPDKEPAEAEPAEPA 104
Query: 74 VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
S E E + +S EK + P SSE KKK+K+ +KS K +++K E E
Sbjct: 105 -SPAEAEGEAATSTEKAEDGATP-SPSSETPKKKKKRFSFKKSFKLSGFSFKKNKKEAGE 162
Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
E++ + + +KE A+ + + A P +++ E E +E + +
Sbjct: 163 GAEAEGA---AAEKEGAKEEAAAAAPEAGSGEEAAAPGEEAGAAGAEGEAGEEPAADAEP 219
Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
+ K ++ +K +E A ++K K +
Sbjct: 220 EQP-EAKPEEAAPEKPQAEEAKAAEEQKAEEKPAEE 254
>gnl|CDD|217443 pfam03234, CDC37_N, Cdc37 N terminal kinase binding. Cdc37 is a
molecular chaperone required for the activity of
numerous eukaryotic protein kinases. This domain
corresponds to the N terminal domain which binds
predominantly to protein kinases and is found N terminal
to the Hsp (Heat shock protein) 90-binding domain
pfam08565. Expression of a construct consisting of only
the N-terminal domain of Schizosaccharomyces pombe Cdc37
results in cellular viability. This indicates that
interactions with the cochaperone Hsp90 may not be
essential for Cdc37 function.
Length = 172
Score = 28.6 bits (64), Expect = 6.8
Identities = 22/104 (21%), Positives = 45/104 (43%), Gaps = 8/104 (7%)
Query: 50 DKKDKDRDKEKEKEKKDKEKDKSAV--SSKEKEKDKVSSKEKERKESKPKESSSE--KEK 105
D+ + DK + K++ AV S E DK + + ++ ++ E + K++
Sbjct: 59 DRLLERVDKLLSELKEESLDSSQAVMKSLNENFTDKENVEPEQPTYNEMVEDLFDQVKDE 118
Query: 106 KKEKKDKKE----KSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
EK + H+ K K +++ +K ++ E + KI S
Sbjct: 119 VDEKNGAALIEELQKHRDKLKKEQKELLKKLDELEKEEKKKIWS 162
>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
Length = 1036
Score = 29.5 bits (66), Expect = 6.8
Identities = 13/50 (26%), Positives = 31/50 (62%)
Query: 87 KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
+EK R+ K + +E+E++ E++ ++E+ + DR + K E ++++E
Sbjct: 252 EEKRRELEKLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRRE 301
Score = 29.1 bits (65), Expect = 8.4
Identities = 15/61 (24%), Positives = 30/61 (49%), Gaps = 9/61 (14%)
Query: 57 DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
+K +E EK KE E E+++ + +++ R+E K + + K E + ++EK
Sbjct: 253 EKRRELEKLAKE---------EAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKL 303
Query: 117 H 117
Sbjct: 304 Q 304
>gnl|CDD|183377 PRK11910, PRK11910, amidase; Provisional.
Length = 615
Score = 29.2 bits (65), Expect = 6.8
Identities = 22/110 (20%), Positives = 39/110 (35%), Gaps = 9/110 (8%)
Query: 83 KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
+ ++ E + SK K+ + +KEK K+ + ++ E+ K+ Q E
Sbjct: 29 QTTTSENKSAVSKKKKPTVKKEKPKQSSNNLTLGKNKENFHLEKGFGNKQLQVERIIDRI 88
Query: 143 IVSSSHNSKE---------PASGSQLISHPPPPAPTPTQKSPVKTKEKEK 183
SS N E + P P P + SP +K +
Sbjct: 89 FQSSLKNRTEIKVKPKNNPQKKQNIKPVKPIPSKPEKPEDSPSPFYDKAR 138
>gnl|CDD|224259 COG1340, COG1340, Uncharacterized archaeal coiled-coil protein
[Function unknown].
Length = 294
Score = 28.9 bits (65), Expect = 6.9
Identities = 18/95 (18%), Positives = 43/95 (45%), Gaps = 6/95 (6%)
Query: 47 SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER-----KESKPKESSS 101
+K + K+ + KEK + +S + S E+E +++ K++ +E + +
Sbjct: 83 AKLQELRKEYRELKEKRNEFNLGGRS-IKSLEREIERLEKKQQTSVLTPEEERELVQKIK 141
Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E K+ E K + ++ + + + KK+ +E
Sbjct: 142 ELRKELEDAKKALEENEKLKELKAEIDELKKKARE 176
>gnl|CDD|224495 COG1579, COG1579, Zn-ribbon protein, possibly nucleic acid-binding
[General function prediction only].
Length = 239
Score = 28.9 bits (65), Expect = 6.9
Identities = 19/102 (18%), Positives = 36/102 (35%), Gaps = 15/102 (14%)
Query: 48 KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK- 106
+ E E E + + VS E E ++ + +R E K E+E +
Sbjct: 40 LEALNKALEALEIELEDLENQ-----VSQLESEIQEIRER-IKRAEEKLSAVKDERELRA 93
Query: 107 --------KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
KE+ + E ++ E+ + E ++ KE
Sbjct: 94 LNIEIQIAKERINSLEDELAELMEEIEKLEKEIEDLKERLER 135
>gnl|CDD|216420 pfam01298, Lipoprotein_5, Transferrin binding protein-like solute
binding protein. This family of proteins is distantly
related to other families of solute binding proteins.
Length = 554
Score = 29.4 bits (66), Expect = 6.9
Identities = 24/153 (15%), Positives = 50/153 (32%), Gaps = 9/153 (5%)
Query: 33 STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
S + TS P + +D +K + + A + K + E K
Sbjct: 5 SPKTDTSAPKAEAPKYQDVPSAKPEKAELAK-------LDAPALGFAMKLPRRNWGPEEK 57
Query: 93 ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
++ + + +++ EK++ + + K+ + K +S +S KE
Sbjct: 58 TELSEKDWIKTSSLSKIENEVEKNNGEDETHDKNRKEGAHDFKYVRSGYVYISGGSLEKE 117
Query: 153 PASGSQLISHPP--PPAPTPTQKSPVKTKEKEK 183
G++ P + PV K K
Sbjct: 118 DNKGAKSGYDGYVYYKGKQPAKNLPVSGKVTYK 150
>gnl|CDD|236048 PRK07561, PRK07561, DNA topoisomerase I subunit omega; Validated.
Length = 859
Score = 29.4 bits (67), Expect = 7.0
Identities = 11/46 (23%), Positives = 19/46 (41%)
Query: 49 KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
+K + + KD K+AV K K + + EK+ K +
Sbjct: 814 LAEKPEKLRYLADAPAKDPAGKKAAVKFSRKTKQQYVASEKDGKAT 859
>gnl|CDD|236343 PRK08868, PRK08868, flagellar protein FlaG; Provisional.
Length = 144
Score = 28.2 bits (63), Expect = 7.0
Identities = 16/88 (18%), Positives = 34/88 (38%), Gaps = 4/88 (4%)
Query: 5 VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKE---KE 61
+ S +S+ + S K + ++S S + SS K +K +++ E
Sbjct: 3 ISSYASNIQPYGSNSGTKFASENGNGTSSVLVSDKTR-SVSSEKVEKTEQELSVEAAVAM 61
Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
E++ + + E+ + V S K
Sbjct: 62 AEQRQELNREELEKMVEQMNEFVKSINK 89
>gnl|CDD|227862 COG5575, ORC2, Origin recognition complex, subunit 2 [DNA
replication, recombination, and repair].
Length = 535
Score = 29.3 bits (65), Expect = 7.2
Identities = 23/146 (15%), Positives = 43/146 (29%)
Query: 75 SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
+ K E S PK+ E + ++ H+ + + +
Sbjct: 24 LVFANSHESNDLKMVENVSSTPKKGVLEDPSTLTPEVVTPRTPGHRIIKAKGAYTKDRSA 83
Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKH 194
K + +I + GS + P + SP E E + + +
Sbjct: 84 KRRRGRIEIERHLLGEFDDNVGSNSLLDVPLYSLEAEPLSPSVMLEDESMEGINQSPQGI 143
Query: 195 SKHKHKKKDKHGDKTNPKEKDAKSKE 220
S K K+D + P +S E
Sbjct: 144 SVEKLGKEDNRSRSSTPASPSLESHE 169
>gnl|CDD|115046 pfam06364, DUF1068, Protein of unknown function (DUF1068). This
family consists of several hypothetical plant proteins
from Arabidopsis thaliana and Oryza sativa. The function
of this family is unknown.
Length = 176
Score = 28.5 bits (63), Expect = 7.4
Identities = 29/105 (27%), Positives = 48/105 (45%), Gaps = 20/105 (19%)
Query: 23 DKDSSAIPSTSTSSSTSNPTNSSSSKKDKK-DKDRDKE---------KEKEKKDKEKDKS 72
D D SA P + SN + +K+D + ++D +K K++E + EK K
Sbjct: 48 DCDCSARPLLTIPKELSNASFEDCAKQDPEVNEDTEKNYAELLTEELKQREAESTEKHKR 107
Query: 73 A----------VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
A SS +KE DK +S + +E++ K + E+KK
Sbjct: 108 ADVGLLEAKKLTSSYQKEADKCNSGMETCEEAREKAEEALVEQKK 152
>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
Length = 330
Score = 29.0 bits (66), Expect = 7.4
Identities = 14/54 (25%), Positives = 22/54 (40%), Gaps = 9/54 (16%)
Query: 83 KVSSKEKERKESKPKESSSEKE-KKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
++ + KE KE +KE K K KK K+++ K K + K
Sbjct: 44 NLNEQYKEMKEELKAALLDKKELKAWHKAQKK--------KEKQEAKAAKAKSK 89
>gnl|CDD|198139 smart01071, CDC37_N, Cdc37 N terminal kinase binding. Cdc37 is a
molecular chaperone required for the activity of
numerous eukaryotic protein kinases. This domain
corresponds to the N terminal domain which binds
predominantly to protein kinases and is found N terminal
to the Hsp (Heat shock protein) 90-binding domain.
Expression of a construct consisting of only the
N-terminal domain of Saccharomyces pombe Cdc37 results
in cellular viability. This indicates that interactions
with the cochaperone Hsp90 may not be essential for
Cdc37 function.
Length = 154
Score = 28.2 bits (63), Expect = 7.6
Identities = 21/105 (20%), Positives = 41/105 (39%), Gaps = 16/105 (15%)
Query: 56 RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK------------ 103
R + E E K+ + + K DK+ +E + S + +E
Sbjct: 41 RVERME-EIKNLKYELIMNDHLNKRIDKLLKGLREEELSPETPTYNEMLAELQDQLKKEL 99
Query: 104 -EKKKEKKDKKEKSHKHKDK--DRERDKDEKKEQKESKSSSKIVS 145
E + + E+ KH+DK +++ +K ++ E + KI S
Sbjct: 100 EEANGDSEGLLEELKKHRDKLKKEQKELRKKLDELEKEEKKKIWS 144
>gnl|CDD|240419 PTZ00440, PTZ00440, reticulocyte binding protein 2-like protein;
Provisional.
Length = 2722
Score = 29.4 bits (66), Expect = 7.6
Identities = 31/96 (32%), Positives = 42/96 (43%), Gaps = 13/96 (13%)
Query: 60 KEKEKKDKEKDKSAVSSKEKEKDKVSSKE-----KERKESKPKESSSEKEKKKEK----- 109
KEK K+ +EK +S EK K K+SS K+ K K KE E+K E
Sbjct: 1034 KEKGKEIEEKVDQYISLLEKMKTKLSSFHFNIDIKKYKNPKIKEEIKLLEEKVEALLKKI 1093
Query: 110 KDKKEKSHKHKDKDRER---DKDEKKEQKESKSSSK 142
+ K K + K+K E EK +Q E + K
Sbjct: 1094 DENKNKLIEIKNKSHEHVVNADKEKNKQTEHYNKKK 1129
>gnl|CDD|221937 pfam13148, DUF3987, Protein of unknown function (DUF3987). A
family of uncharacterized proteins found by clustering
human gut metagenomic sequences.
Length = 379
Score = 29.2 bits (66), Expect = 7.7
Identities = 13/51 (25%), Positives = 24/51 (47%), Gaps = 2/51 (3%)
Query: 88 EKERKESKPKESSSEKEKK--KEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
E+ R+E + + E EK+ + +K EK K K + ++ +E E
Sbjct: 68 EELREEYEEELKEYEAEKEIWEAEKKGLEKKAKKAIKKGKDEEALAEELLE 118
>gnl|CDD|227496 COG5167, VID27, Protein involved in vacuole import and degradation
[Intracellular trafficking and secretion].
Length = 776
Score = 29.2 bits (65), Expect = 7.8
Identities = 21/87 (24%), Positives = 39/87 (44%), Gaps = 5/87 (5%)
Query: 57 DKEKEKEKKDKEKDKSAVSS---KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
+ EK ++ + KD SS EK+ D + EK E++ E S +E+ ++ +D
Sbjct: 337 NNEKWGNEEAERKDYILDSSSVPLEKQFDDILYFEKMEIENRNPEESEHEEEVEDYED-- 394
Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSS 140
E H + D + ++ + E S
Sbjct: 395 ENDHSKRICDDDELENHFRAADEKNSH 421
>gnl|CDD|130680 TIGR01619, hyp_HI0040, TIGR01619 family protein. This model
represents a hypothetical equivalog of gamma
proteobacteria, includes HI0040. These sequences do not
have any similarity to known proteins by PSI-BLAST.
Length = 249
Score = 28.8 bits (64), Expect = 7.8
Identities = 16/55 (29%), Positives = 26/55 (47%), Gaps = 7/55 (12%)
Query: 317 FDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYT--HSYREDIQEEFYP 369
FD L+ + I AT ELLDL + + ++ LY HS+ D + + +
Sbjct: 127 FDFLLASPLEIKIHATEELLDLLKKKGR-----DLAALYLIEHSFHFDEEAKMFA 176
>gnl|CDD|222466 pfam13945, NST1, Splicing factor, salt tolerance regulator. NST1
is a family of proteins that seem to be involved,
directly or indirectly, in the salt sensitivity of some
cellular functions in yeast. These proteins also
interact with the splicing factor Msl1p.
Length = 189
Score = 28.4 bits (63), Expect = 7.8
Identities = 16/79 (20%), Positives = 38/79 (48%), Gaps = 4/79 (5%)
Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPP----PPAPTPTQKS 174
H D + K +KK++ ++ S S S + ++ S +++ P P +Q++
Sbjct: 25 HNDSSSSKSKKKKKKRSKATSPSHNASDQSTNNVMSTPSAILARPQPLSYPFGSQQSQQN 84
Query: 175 PVKTKEKEKEKESSTTHDK 193
VK ++++ +ST ++
Sbjct: 85 AVKNSKEKRIWNTSTQEER 103
>gnl|CDD|221825 pfam12877, DUF3827, Domain of unknown function (DUF3827). This
family contains the human KIAA1549 protein which has
been found to be fused to the BRAF gene in many cases
of pilocytic astrocytomas. The fusion is due mainly to a
tandem duplication of 2 Mb at 7q34. Although nothing is
known about the function of KIAA1549 protein, the BRAF
protein is a well characterized oncoprotein. It is a
serine/threonine protein kinase which is implicated in
MAP/ERK signalling, a critical pathway for the
regulation of cell division, differentiation and
secretion.
Length = 684
Score = 29.1 bits (65), Expect = 7.9
Identities = 31/181 (17%), Positives = 58/181 (32%), Gaps = 25/181 (13%)
Query: 18 PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD--KEKDKSAVS 75
P +D A + + S + ++K R + K K D +D +
Sbjct: 482 PRFEPSRDDRAAENGKVNKEIQVALRHKSEIEHHRNKIRLRAKRKGHYDFPSVEDSNNGH 541
Query: 76 SKEKEKDKVSSKEKERKESKP----KESSSEKEKKKEKKDKKEKSHKHK----------- 120
KE+++V + + + + E S+ E KK + ++ +
Sbjct: 542 GDPKEQERVYQRAQMQIDKILLPPDSEPSTFTEPKKSSRGQRSPKARRSRQSLNGPSTEM 601
Query: 121 DKDR--ERDKDEKKEQKESKSSSKIVSSS-----HNSKEPASGSQLISHPPPPAPTPTQK 173
D DR ERD+D + ++ S P G + P +Q
Sbjct: 602 DLDRLIERDRDGTYRSGPGVENEAYEETNDRLPESRSYSPTRGPKGHDPSEPSY-LSSQP 660
Query: 174 S 174
S
Sbjct: 661 S 661
>gnl|CDD|233605 TIGR01865, cas_Csn1, CRISPR-associated protein Cas9/Csn1, subtype
II/NMENI. CRISPR loci appear to be mobile elements with
a wide host range. This model represents a protein found
only in CRISPR-containing species, near other
CRISPR-associated proteins (cas), as part of the NMENI
subtype of CRISPR/Cas locus. The species range so far
for this protein is animal pathogens and commensals only
[Mobile and extrachromosomal element functions, Other].
Length = 805
Score = 29.3 bits (66), Expect = 7.9
Identities = 12/44 (27%), Positives = 19/44 (43%)
Query: 51 KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
+ + KE+ KK+++K K S+ KE K E K
Sbjct: 528 GTNFGKRNSKERYKKNEDKIKEFASALGKEILKEEPTENSSKNI 571
>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT. This family
consists of several bacterial cobalamin biosynthesis
(CobT) proteins. CobT is involved in the transformation
of precorrin-3 into cobyrinic acid.
Length = 282
Score = 28.6 bits (64), Expect = 8.0
Identities = 17/60 (28%), Positives = 34/60 (56%), Gaps = 1/60 (1%)
Query: 45 SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
S+ +D +D+D KE E + + +E++ + S ++ D SS+E E E + E+S++
Sbjct: 218 SADSEDNEDEDDPKEDEDDDQGEEEESGSSDSLSEDSDA-SSEEMESGEMEAAEASADDT 276
>gnl|CDD|129694 TIGR00606, rad50, rad50. All proteins in this family for which
functions are known are involved in recombination,
recombinational repair, and/or non-homologous end
joining. They are components of an exonuclease complex
with MRE11 homologs. This family is distantly related to
the SbcC family of bacterial proteins. This family is
based on the phylogenomic analysis of JA Eisen (1999,
Ph.D. Thesis, Stanford University).
Length = 1311
Score = 29.2 bits (65), Expect = 8.0
Identities = 23/98 (23%), Positives = 40/98 (40%), Gaps = 6/98 (6%)
Query: 23 DKDSSAIPSTSTSSSTSNPTNSSSSKKDKK-----DKDRDKEKEKEKKDKEKDKSAVSSK 77
D++ S P T S K DK + E E +KK+K +D+ +
Sbjct: 674 DENQSCCPVCQRVFQTEAELQEFISDLQSKLRLAPDKLKSTESELKKKEKRRDEMLGLA- 732
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
+ + KEKE E + K ++ ++ K D +E+
Sbjct: 733 PGRQSIIDLKEKEIPELRNKLQKVNRDIQRLKNDIEEQ 770
>gnl|CDD|130658 TIGR01597, PYST-B, Plasmodium yoelii subtelomeric family PYST-B.
This model represents a paralogous family of Plasmodium
yoelii genes preferentially located in the subtelomeric
regions of the chromosomes. There are no obvious
homologs to these genes in any other organism.
Length = 255
Score = 28.7 bits (64), Expect = 8.2
Identities = 24/108 (22%), Positives = 45/108 (41%), Gaps = 13/108 (12%)
Query: 19 HKNKDKDSSAIPS-TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE--------KKDKEK 69
H K K+ + +P + T N + ++ K+ D E E K +K
Sbjct: 91 HIKKHKERNTLPDLNNVDKKTKKLINKLQKELEELKKELDNEMNDELTIQPIHDKIIIKK 150
Query: 70 DKSAVSSKEKE----KDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
D++ S+ ++ +++ +S E +E S + K K+K KK K
Sbjct: 151 DENNSVSEHEDFKQLENEKNSSVSEHEEFDIASSDNLKIKRKLKKLVK 198
>gnl|CDD|227577 COG5252, COG5252, Uncharacterized conserved protein, contains
CCCH-type Zn-finger protein [General function prediction
only].
Length = 299
Score = 28.9 bits (64), Expect = 8.2
Identities = 15/71 (21%), Positives = 34/71 (47%)
Query: 44 SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
K KK ++ K+ ++ + + +DK+ + KV + K+ + KE +K
Sbjct: 1 MPPKKMAKKQQESGKKATRDMRKELEDKTFGLKNKNRSTKVQAIIKQIETLNLKEQLEKK 60
Query: 104 EKKKEKKDKKE 114
EK + ++ ++E
Sbjct: 61 EKMRMEEKRRE 71
>gnl|CDD|146016 pfam03179, V-ATPase_G, Vacuolar (H+)-ATPase G subunit. This family
represents the eukaryotic vacuolar (H+)-ATPase
(V-ATPase) G subunit. V-ATPases generate an acidic
environment in several intracellular compartments.
Correspondingly, they are found as membrane-attached
proteins in several organelles. They are also found in
the plasma membranes of some specialised cells.
V-ATPases consist of peripheral (V1) and membrane
integral (V0) heteromultimeric complexes. The G subunit
is part of the V1 subunit, but is also thought to be
strongly attached to the V0 complex. It may be involved
in the coupling of ATP degradation to H+ translocation.
Length = 105
Score = 27.5 bits (62), Expect = 8.3
Identities = 22/88 (25%), Positives = 45/88 (51%), Gaps = 3/88 (3%)
Query: 78 EKEKDKVSSKEKERKESKPKESSSEKEKKKEK-KDKKEKSHK--HKDKDRERDKDEKKEQ 134
EKE ++ ++ ++R+ + K++ E EK+ E+ + ++E K + R + EKK +
Sbjct: 13 EKEAAEIVNEARKRRAKRLKQAKEEAEKEIEEYRAQREAEFKEFEAEHSGSRGELEKKIE 72
Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISH 162
KE++ + S N + A L+S
Sbjct: 73 KETEEKIDELKRSFNKNKEAVVQMLLSK 100
>gnl|CDD|237015 PRK11901, PRK11901, hypothetical protein; Reviewed.
Length = 327
Score = 28.9 bits (65), Expect = 8.3
Identities = 13/45 (28%), Positives = 22/45 (48%), Gaps = 1/45 (2%)
Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
S +++ + + + Q IS PP +PTPTQ +P +T
Sbjct: 93 SSPSAANNTSDGHDASGVKNTAPPQDIS-APPISPTPTQAAPPQT 136
>gnl|CDD|213592 TIGR01179, galE, UDP-glucose-4-epimerase GalE. Alternate name:
UDP-galactose 4-epimerase. This enzyme interconverts
UDP-glucose and UDP-galactose. A set of related
proteins, some of which are tentatively identified as
UDP-glucose-4-epimerase in Thermotoga maritima, Bacillus
halodurans, and several archaea, but deeply branched
from this set and lacking experimental evidence, are
excluded from this model and described by a separate
model [Energy metabolism, Sugars].
Length = 328
Score = 28.8 bits (65), Expect = 8.4
Identities = 25/97 (25%), Positives = 40/97 (41%), Gaps = 12/97 (12%)
Query: 250 TVQERLAEQFKDELFDRLKNEQADILQRKVHI-----ISGDISQPSLGISSHDQQFIQHH 304
TV++ L + + D L N + L R I + GD+ L D+ F +H
Sbjct: 15 TVRQLLESGHEVVILDNLSNGSREALPRGERITPVTFVEGDLRDREL----LDRLFEEHK 70
Query: 305 IHVIIHAAASLRFDELIQDA---FTLNIQATRELLDL 338
I +IH A + E +Q + N+ T LL+
Sbjct: 71 IDAVIHFAGLIAVGESVQKPLKYYRNNVVGTLNLLEA 107
>gnl|CDD|187567 cd05257, Arna_like_SDR_e, Arna decarboxylase_like, extended (e)
SDRs. Decarboxylase domain of ArnA. ArnA is an enzyme
involved in the modification of outer membrane protein
lipid A of gram-negative bacteria. It is a bifunctional
enzyme that catalyzes the NAD-dependent decarboxylation
of UDP-glucuronic acid and
N-10-formyltetrahydrofolate-dependent formylation of
UDP-4-amino-4-deoxy-l-arabinose; its NAD-dependent
decarboxylating activity is in the C-terminal 360
residues. This subgroup belongs to the extended SDR
family, however the NAD binding motif is not a perfect
match and the upstream Asn of the canonical active site
tetrad is not conserved. Extended SDRs are distinct from
classical SDRs. In addition to the Rossmann fold
(alpha/beta folding pattern with a central beta-sheet)
core region typical of all SDRs, extended SDRs have a
less conserved C-terminal extension of approximately 100
amino acids. Extended SDRs are a diverse collection of
proteins, and include isomerases, epimerases,
oxidoreductases, and lyases; they typically have a
TGXXGXXG cofactor binding motif. SDRs are a functionally
diverse family of oxidoreductases that have a single
domain with a structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 316
Score = 28.8 bits (65), Expect = 8.5
Identities = 17/82 (20%), Positives = 31/82 (37%), Gaps = 16/82 (19%)
Query: 278 KVHIISGDISQPSLGISSHDQQFIQHHI---HVIIHAAASLRFDELIQDAFTL---NIQA 331
+ H ISGD+ +++ + V+ H AA + + N+
Sbjct: 48 RFHFISGDVRDA---------SEVEYLVKKCDVVFHLAALIAIPYSYTAPLSYVETNVFG 98
Query: 332 TRELLDLATRCSQLKAILHVST 353
T +L+ A K ++H ST
Sbjct: 99 TLNVLE-AACVLYRKRVVHTST 119
>gnl|CDD|220719 pfam10368, YkyA, Putative cell-wall binding lipoprotein. YkyA is a
family of proteins containing a lipoprotein signal and a
hydrolase domain. It is similar to cell wall binding
proteins and might also be recognisable by a host immune
defence system. It is thus likely to belong to pathways
important for pathogenicity.
Length = 205
Score = 28.5 bits (64), Expect = 8.5
Identities = 26/95 (27%), Positives = 45/95 (47%), Gaps = 1/95 (1%)
Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
R+K +KEK+ EK + S +K +K+ K+ ++K + + E+ K +K K
Sbjct: 79 DKREKLLKKEKESIEKSEEEFKSAKKYIEKIEDKKLKKKAKQLVKVMKERYKSYDKLYKA 138
Query: 114 EKSHKHKDKD-RERDKDEKKEQKESKSSSKIVSSS 147
K + +K+ E KD+ KE K V+ S
Sbjct: 139 YKKALNLEKELYEYLKDKDLTLKELDEKIKAVNQS 173
>gnl|CDD|219564 pfam07771, TSGP1, Tick salivary peptide group 1. This contains a
group of peptides derived from a salivary gland cDNA
library of the tick Ixodes scapularis. Also present are
peptides from a related tick species, Ixodes ricinus.
They are characterized by a putative signal peptide
indicative of secretion and conserved cysteine residues.
Length = 120
Score = 27.5 bits (61), Expect = 8.6
Identities = 13/41 (31%), Positives = 22/41 (53%), Gaps = 3/41 (7%)
Query: 33 STSSSTSNPTN---SSSSKKDKKDKDRDKEKEKEKKDKEKD 70
+TSS + + ++K KK K + K+ +K KK +KD
Sbjct: 80 TTSSGEPSHPDDHPPEPTEKPKKKKKKSKKTKKPKKSSKKD 120
>gnl|CDD|227625 COG5309, COG5309, Exo-beta-1,3-glucanase [Carbohydrate transport
and metabolism].
Length = 305
Score = 28.7 bits (64), Expect = 8.7
Identities = 16/66 (24%), Positives = 26/66 (39%), Gaps = 7/66 (10%)
Query: 1 MAYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTS-------NPTNSSSSKKDKKD 53
M +S SSS++ + S +S + S+S+ +S S P N + K
Sbjct: 5 MQFSSTSSSAALATLSSSSSALSSSASEVSSSSSRASASGFLAFTLGPYNDDGTCKSADQ 64
Query: 54 KDRDKE 59
D E
Sbjct: 65 VASDLE 70
>gnl|CDD|215541 PLN03020, PLN03020, low-temperature-induced protein; Provisional.
Length = 556
Score = 28.9 bits (64), Expect = 8.8
Identities = 19/80 (23%), Positives = 30/80 (37%), Gaps = 2/80 (2%)
Query: 133 EQKESKSSSKIVSSSHNSKEPASGS--QLISHPPPPAPTPTQKSPVKTKEKEKEKESSTT 190
E ++ ++ + + H+ +GS Q P PA T + + P T EK T
Sbjct: 256 EPEQPFATEDLPTRPHDKSIFPTGSHDQFSPEPLSPAETKSTEDPQSTSNAEKPSNQKTY 315
Query: 191 HDKHSKHKHKKKDKHGDKTN 210
+K S DK N
Sbjct: 316 TEKISSATSAIADKAISAKN 335
>gnl|CDD|187548 cd05237, UDP_invert_4-6DH_SDR_e, UDP-Glcnac (UDP-linked
N-acetylglucosamine) inverting 4,6-dehydratase, extended
(e) SDRs. UDP-Glcnac inverting 4,6-dehydratase was
identified in Helicobacter pylori as the hexameric flaA1
gene product (FlaA1). FlaA1 is hexameric, possesses
UDP-GlcNAc-inverting 4,6-dehydratase activity, and
catalyzes the first step in the creation of a
pseudaminic acid derivative in protein glycosylation.
Although this subgroup has the NADP-binding motif
characteristic of extended SDRs, its members tend to
have a Met substituted for the active site Tyr found in
most SDR families. Extended SDRs are distinct from
classical SDRs. In addition to the Rossmann fold
(alpha/beta folding pattern with a central beta-sheet)
core region typical of all SDRs, extended SDRs have a
less conserved C-terminal extension of approximately 100
amino acids. Extended SDRs are a diverse collection of
proteins, and include isomerases, epimerases,
oxidoreductases, and lyases; they typically have a
TGXXGXXG cofactor binding motif. SDRs are a functionally
diverse family of oxidoreductases that have a single
domain with a structurally conserved Rossmann fold, an
NAD(P)(H)-binding region, and a structurally diverse
C-terminal region. Sequence identity between different
SDR enzymes is typically in the 15-30% range; they
catalyze a wide range of activities including the
metabolism of steroids, cofactors, carbohydrates,
lipids, aromatic compounds, and amino acids, and act in
redox sensing. Classical SDRs have a TGXXX[AG]XG
cofactor binding motif and a YXXXK active site motif,
with the Tyr residue of the active site motif serving as
a critical catalytic residue (Tyr-151, human
15-hydroxyprostaglandin dehydrogenase numbering). In
addition to the Tyr and Lys, there is often an upstream
Ser and/or an Asn, contributing to the active site;
while substrate binding is in the C-terminal region,
which determines specificity. The standard reaction
mechanism is a 4-pro-S hydride transfer and proton relay
involving the conserved Tyr and Lys, a water molecule
stabilized by Asn, and nicotinamide. Atypical SDRs
generally lack the catalytic residues characteristic of
the SDRs, and their glycine-rich NAD(P)-binding motif is
often different from the forms normally seen in
classical or extended SDRs. Complex (multidomain) SDRs
such as ketoreductase domains of fatty acid synthase
have a GGXGXXG NAD(P)-binding motif and an altered
active site motif (YXXXN). Fungal type ketoacyl
reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
Length = 287
Score = 28.7 bits (65), Expect = 8.9
Identities = 22/106 (20%), Positives = 43/106 (40%), Gaps = 24/106 (22%)
Query: 263 LFDRLKNEQADILQR--------KVHIISGDISQPSLGISSHDQQFIQHHIHVIIHAAA- 313
+FDR +N+ ++++ K+ I GD+ + F + ++ HAAA
Sbjct: 32 VFDRDENKLHELVRELRSRFPHDKLRFIIGDVRDKERLRRA----FKERGPDIVFHAAAL 87
Query: 314 ------SLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVST 353
+E I+ N+ T+ ++D A K + +ST
Sbjct: 88 KHVPSMEDNPEEAIKT----NVLGTKNVIDAAIENGVEKFVC-IST 128
>gnl|CDD|165564 PHA03309, PHA03309, transcriptional regulator ICP4; Provisional.
Length = 2033
Score = 29.1 bits (64), Expect = 8.9
Identities = 21/77 (27%), Positives = 35/77 (45%), Gaps = 4/77 (5%)
Query: 139 SSSKIVSSSHNSKEPASGSQLISHPP-PPAPTPTQKSPVKTKEKEKEKES---STTHDKH 194
SSS SSS +S P+S + P P+P+P +++PV + +E S +
Sbjct: 1817 SSSSSSSSSSSSSSPSSRPSRSATPSLSPSPSPPRRAPVDRSRSGRRRERDRPSANPFRW 1876
Query: 195 SKHKHKKKDKHGDKTNP 211
+ + + D D T P
Sbjct: 1877 APRQRSRADHSPDGTAP 1893
>gnl|CDD|221144 pfam11595, DUF3245, Protein of unknown function (DUF3245). This is
a family of proteins conserved in fungi. The function is
not known, and there is no S. cerevisiae member.
Length = 145
Score = 28.0 bits (62), Expect = 9.0
Identities = 24/130 (18%), Positives = 45/130 (34%), Gaps = 14/130 (10%)
Query: 35 SSSTSNPTNSSSSKKDKKDKDRDKEKEKEK-------------KDKEKDKSAVSSKEKEK 81
+S T S S K D++ KE+++ + D S ++
Sbjct: 14 ASWLPPMTASEQSNPKKTDEELQKEEDEIFTAVPETLGLGAPLPTQAADGSWNRTELDSN 73
Query: 82 DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
DK+ ++ K K + EK + K K K + + D+ + E +S S
Sbjct: 74 DKLR-RQLLGKNYKKVMAEKEKAEGGPVKRKAAVVAKEAKQSSKGVGDDDDDDDEDESRS 132
Query: 142 KIVSSSHNSK 151
++K
Sbjct: 133 AAFGKKGSNK 142
>gnl|CDD|185272 PRK15374, PRK15374, pathogenicity island 1 effector protein SipB;
Provisional.
Length = 593
Score = 28.8 bits (64), Expect = 9.1
Identities = 21/104 (20%), Positives = 43/104 (41%), Gaps = 6/104 (5%)
Query: 61 EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP---KESSSEKEKKKEKKDKKEKSH 117
E K + KS + EK+ + +K + + P + ++ ++ KE + KE
Sbjct: 144 EASIKKTDTAKSVYDAAEKKLTQAQNKLQSLDPADPGYAQAEAAVEQAGKEATEAKEALD 203
Query: 118 KHKDKDRERDKDEK-KEQKESKSSSKI--VSSSHNSKEPASGSQ 158
K D + D K K +K +K +++ + + + G Q
Sbjct: 204 KATDATVKAGTDAKAKAEKADNILTKFQGTANAASQNQVSQGEQ 247
>gnl|CDD|150531 pfam09871, DUF2098, Uncharacterized protein conserved in archaea
(DUF2098). This domain, found in various hypothetical
prokaryotic proteins, has no known function.
Length = 91
Score = 26.9 bits (60), Expect = 9.7
Identities = 8/38 (21%), Positives = 21/38 (55%)
Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
T+ KK+++++D+++ E+ KK++E +
Sbjct: 48 VTDKVKEKKEEREEDKEELIERIKKEEETFEDVDLGSA 85
>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32. This family
consists of several mammalian specific proacrosin
binding protein sp32 sequences. sp32 is a sperm specific
protein which is known to bind with 55- and 53-kDa
proacrosins and the 49-kDa acrosin intermediate. The
exact function of sp32 is unclear; it is thought, however,
that the binding of sp32 to proacrosin may be involved
in packaging the acrosin zymogen into the acrosomal
matrix.
Length = 243
Score = 28.5 bits (63), Expect = 9.8
Identities = 14/102 (13%), Positives = 36/102 (35%)
Query: 4 SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
S + ++ + + H ++ S P + SS + + + ++E
Sbjct: 141 SAEVQPTTMTLPIAEHPTITENQSFQPWPERLHNNVEELLQSSLSLGGSVQVKAPKPKQE 200
Query: 64 KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
+ + + K +EK +E+E E + K+ +
Sbjct: 201 QLLSKLQEYLQEHKTEEKQPQEEQEEEEVEEEAKQEEGQGTD 242
>gnl|CDD|227578 COG5253, MSS4, Phosphatidylinositol-4-phosphate 5-kinase [Signal
transduction mechanisms].
Length = 612
Score = 28.8 bits (64), Expect = 9.9
Identities = 49/304 (16%), Positives = 88/304 (28%), Gaps = 46/304 (15%)
Query: 6 KSSSSSSSAHPSPHKNKDKDSS-----AIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEK 60
S + S P+ + S A + S +N + + D+ KE
Sbjct: 14 ISMTHDKSTRPNDRSMSNDSSLCGLNQASDANGNEYSPNNKVSKKDTFSDQLHDALSKEF 73
Query: 61 EKE--------KKDKEKDKSAVSSK---EKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
E K K + +S E K+ + + + S K
Sbjct: 74 TLERERDRLQLNKRKYQAIRLQTSTPIVEIFKNNKDAVDPPNHTRSSGNNLSNANVKTLS 133
Query: 110 ------KDKKEKSHKHKDKDRERDKDEK--------KEQKESKSSSKIVSSSHNSKEPAS 155
+ ++ D E + K S +S N K +
Sbjct: 134 APVGEHSRSNNPPNLDQNLDTEPESSISQWGELQLNPSGKTLSSQPSRKPTSENPKSESD 193
Query: 156 GSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE-- 213
S+L + SP+ K K S+ +++S + + P E
Sbjct: 194 NSKL---------PTSVNSPLPDKSLLKRTLSNFWAERNSYN----WKPLVYPSCPSEHI 240
Query: 214 -KDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQA 272
D+ +E SS C ++R + ++T+ ERL E R E
Sbjct: 241 FSDSDVIIREDEPSSLIAFCLSTSDYRNKMMRLRDSETMDERLLNGMPLEGGHRNPQESY 300
Query: 273 DILQ 276
++L
Sbjct: 301 NMLT 304
>gnl|CDD|227818 COG5531, COG5531, SWIB-domain-containing proteins implicated in
chromatin remodeling [Chromatin structure and dynamics].
Length = 237
Score = 28.2 bits (63), Expect = 9.9
Identities = 20/63 (31%), Positives = 27/63 (42%)
Query: 85 SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
S E+ R K K + + K K K+E S K+ E E KE + K SS I
Sbjct: 57 SLAEEPRVLRKEKYNITRKTTGKNDLPKEEDSSLPSSKETENGDTEGKETDKKKKSSTIS 116
Query: 145 SSS 147
+S
Sbjct: 117 KNS 119
>gnl|CDD|219589 pfam07808, RED_N, RED-like protein N-terminal region. This family
contains sequences that are similar to the N-terminal
region of Red protein. This and related proteins contain
a RED repeat which consists of a number of RE and RD
sequence elements. The region in question has several
conserved NLS sequences and a putative trimeric
coiled-coil region, suggesting that these proteins are
expressed in the nucleus. The function of Red protein is
unknown, but efficient sequestration to nuclear bodies
suggests that its expression may be tightly regulated or
that the protein self-aggregates extremely efficiently.
Length = 238
Score = 28.3 bits (63), Expect = 10.0
Identities = 13/45 (28%), Positives = 25/45 (55%)
Query: 97 KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
K+ +K+E+ +KE + K++D+ RER K K+ S ++
Sbjct: 2 KKKKYAYLRKQEENAEKEINPKYRDRARERRKGINKDYDPSSLAA 46
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.309 0.125 0.353
Gapped
Lambda K H
0.267 0.0677 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 28,624,648
Number of extensions: 2746148
Number of successful extensions: 13800
Number of sequences better than 10.0: 1
Number of HSP's gapped: 8758
Number of HSP's successfully gapped: 1629
Length of query: 596
Length of database: 10,937,602
Length adjustment: 102
Effective length of query: 494
Effective length of database: 6,413,494
Effective search space: 3168266036
Effective search space used: 3168266036
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.7 bits)
S2: 62 (27.8 bits)
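
The figures in the statistics footer above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the RPS-BLAST output: the function names are hypothetical, and it assumes the usual Karlin-Altschul conversion from raw score to bit score, S' = (Lambda * S - ln K) / ln 2. The pairing of S1 with the ungapped Lambda/K and S2 with the gapped Lambda/K is inferred from the fact that the reported bit values are reproduced that way. Per-hit Expect values are not recomputed here, since the report does not show how they were derived for each domain model.

```python
import math


def effective_lengths(query_len, db_letters, db_seqs, length_adjustment):
    """Return (effective query length, effective database length, search space).

    Assumes the length adjustment is subtracted once from the query and
    once per database sequence, which reproduces the footer values.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_letters - db_seqs * length_adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Values copied from the footer: query length 596, 10,937,602 letters,
# 44,354 sequences, length adjustment 102.
eff_query, eff_db, search_space = effective_lengths(596, 10_937_602, 44_354, 102)
print(eff_query, eff_db, search_space)  # 494 6413494 3168266036

# S2 (raw 62) with the gapped parameters, S1 (raw 42) with the ungapped ones.
print(round(bit_score(62, 0.267, 0.0677), 1))  # 27.8
print(round(bit_score(42, 0.309, 0.125), 1))   # 21.7
```

The three computed lengths agree with the "Effective length of query", "Effective length of database" and "Effective search space" lines, and the two bit scores agree with the S2 and S1 lines.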