RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11496
(607 letters)
>gnl|CDD|219749 pfam08216, DUF1716, Eukaryotic domain of unknown function
(DUF1716). This domain is found in eukaryotic proteins.
A human nuclear protein with this domain is thought to
have a role in apoptosis.
Length = 108
Score = 127 bits (322), Expect = 3e-35
Identities = 59/94 (62%), Positives = 74/94 (78%)
Query: 81 EETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAEKFMESEIELHTTIQELHAIA 140
E + EVL+E LKK++L+FEKR KN+E+RIKFPD+ EKFMESE++L IQEL +A
Sbjct: 15 AEEDGVEVLDESSLKKLVLVFEKRIRKNQELRIKFPDDPEKFMESEVDLDDIIQELKVLA 74
Query: 141 TVPDLYPLLVQLKAVSSMLELVLHENTDIAVAVV 174
T PDLYP LV+L VSS+L L+ HENTDIA+AVV
Sbjct: 75 TCPDLYPSLVELNGVSSLLSLLNHENTDIAIAVV 108
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 39.7 bits (92), Expect = 0.006
Identities = 22/77 (28%), Positives = 43/77 (55%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
K+ + K + EE AE+AR + KMK +E ++ E+ KI+ EEL++
Sbjct: 1569 AKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKA 1628
Query: 70 KQEEEKIKTLIEETEEK 86
++E++K++ L ++ E+
Sbjct: 1629 EEEKKKVEQLKKKEAEE 1645
Score = 35.9 bits (82), Expect = 0.072
Identities = 21/87 (24%), Positives = 43/87 (49%), Gaps = 1/87 (1%)
Query: 10 KPKSPKRKALDDTVEEEETS-AEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELER 68
K K+KA +D + +E A A+KK + + K K E ++ +E + +E ++
Sbjct: 1392 KADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKK 1451
Query: 69 EKQEEEKIKTLIEETEEKEVLNELMLK 95
+ +E +K + ++ EE + +E K
Sbjct: 1452 KAEEAKKAEEAKKKAEEAKKADEAKKK 1478
Score = 35.5 bits (81), Expect = 0.11
Identities = 21/83 (25%), Positives = 41/83 (49%), Gaps = 5/83 (6%)
Query: 5 ELLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQE 64
+ K + K+KA EE + + ED +K + + K +E+++ E E+ ++
Sbjct: 1664 AEEAKKAEEDKKKA-----EEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKA 1718
Query: 65 ELEREKQEEEKIKTLIEETEEKE 87
E ++ +EE KIK + E +E
Sbjct: 1719 EELKKAEEENKIKAEEAKKEAEE 1741
Score = 35.1 bits (80), Expect = 0.12
Identities = 18/86 (20%), Positives = 41/86 (47%), Gaps = 5/86 (5%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
K + K+KA + + AE+ +K + + K K E+++ K + +E +++
Sbjct: 1372 KKEEAKKKA-----DAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKK 1426
Query: 70 KQEEEKIKTLIEETEEKEVLNELMLK 95
+E++K ++ EE + +E K
Sbjct: 1427 AEEKKKADEAKKKAEEAKKADEAKKK 1452
Score = 33.6 bits (76), Expect = 0.41
Identities = 25/122 (20%), Positives = 56/122 (45%), Gaps = 11/122 (9%)
Query: 15 KRKALDDTVEEEETSAEDARK--------KRKYSSSSSSSKMKYQEMERLEQEKIRQEEL 66
K + L EE+ AE+ +K + + + K K +E ++ E+++ + E
Sbjct: 1634 KVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEA 1693
Query: 67 EREKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAEKFMESE 126
+++ EE K +++ E +E LKK E+ +K E + + ++ +K E++
Sbjct: 1694 LKKEAEEAKKAEELKKKEAEEKKKAEELKKA---EEENKIKAEEAKKEAEEDKKKAEEAK 1750
Query: 127 IE 128
+
Sbjct: 1751 KD 1752
Score = 32.4 bits (73), Expect = 0.84
Identities = 26/129 (20%), Positives = 56/129 (43%), Gaps = 7/129 (5%)
Query: 5 ELLSYKPKSPKRKALD-DTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQ----- 58
+ K + K+KA + + A++A+KK + + +K K +E ++ ++
Sbjct: 1394 DEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKA 1453
Query: 59 -EKIRQEELEREKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPD 117
E + EE +++ +E +K ++ EE + +E K + K K D
Sbjct: 1454 EEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKAD 1513
Query: 118 NAEKFMESE 126
A+K E++
Sbjct: 1514 EAKKAEEAK 1522
Score = 32.4 bits (73), Expect = 0.99
Identities = 22/88 (25%), Positives = 48/88 (54%), Gaps = 11/88 (12%)
Query: 10 KPKSPKRKALDDTVEEEE----------TSAEDARKKRKYSSSSSSSKMKYQEMER-LEQ 58
K + ++KA + +E E AE+ +K + + +K+K +E ++ E+
Sbjct: 1682 KAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEE 1741
Query: 59 EKIRQEELEREKQEEEKIKTLIEETEEK 86
+K + EE +++++E++KI L +E E+K
Sbjct: 1742 DKKKAEEAKKDEEEKKKIAHLKKEEEKK 1769
Score = 32.0 bits (72), Expect = 1.2
Identities = 20/83 (24%), Positives = 38/83 (45%), Gaps = 7/83 (8%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
K +KA +E + AE+A+K + + +K K E ++ + K + +E +
Sbjct: 1464 KKAEEAKKA-----DEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAK-- 1516
Query: 70 KQEEEKIKTLIEETEEKEVLNEL 92
K EE K ++ EE + +E
Sbjct: 1517 KAEEAKKADEAKKAEEAKKADEA 1539
Score = 32.0 bits (72), Expect = 1.3
Identities = 20/87 (22%), Positives = 44/87 (50%), Gaps = 1/87 (1%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMER-LEQEKIRQEELER 68
+ + + +A ++ E E E+A+KK + + K K E ++ E++K + +EL++
Sbjct: 1353 EAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKK 1412
Query: 69 EKQEEEKIKTLIEETEEKEVLNELMLK 95
++K ++ EEK+ +E K
Sbjct: 1413 AAAAKKKADEAKKKAEEKKKADEAKKK 1439
Score = 30.9 bits (69), Expect = 2.7
Identities = 17/70 (24%), Positives = 28/70 (40%), Gaps = 2/70 (2%)
Query: 23 VEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEE 82
+E + AE+A+K + + +K K ++ +E + E K E E E
Sbjct: 1304 ADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEA--AKAEAEAAADEAEA 1361
Query: 83 TEEKEVLNEL 92
EEK E
Sbjct: 1362 AEEKAEAAEK 1371
Score = 30.9 bits (69), Expect = 2.8
Identities = 20/81 (24%), Positives = 35/81 (43%), Gaps = 8/81 (9%)
Query: 24 EEEETSAEDARKKRKYSSSSSSSKMKYQEMERL--------EQEKIRQEELEREKQEEEK 75
EE + AE+ +K + K K E ++ E+ KI+ E ++ +E++K
Sbjct: 1616 EEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKK 1675
Query: 76 IKTLIEETEEKEVLNELMLKK 96
++ EE E LKK
Sbjct: 1676 KAEEAKKAEEDEKKAAEALKK 1696
Score = 30.5 bits (68), Expect = 3.7
Identities = 27/97 (27%), Positives = 44/97 (45%), Gaps = 8/97 (8%)
Query: 16 RKALDDTVEEEETSAEDARK---KRKYSSSSSSSKM-KYQEMERLEQ----EKIRQEELE 67
RKA + EE AEDARK RK + + K ++ ++ E E+ +++ E
Sbjct: 1182 RKAEEVRKAEELRKAEDARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEE 1241
Query: 68 REKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKR 104
+K EEE+ I + EE + + + I E R
Sbjct: 1242 AKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEAR 1278
Score = 30.1 bits (67), Expect = 4.1
Identities = 18/74 (24%), Positives = 38/74 (51%), Gaps = 6/74 (8%)
Query: 25 EEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELE------REKQEEEKIKT 78
EE AE+A+KK + + + +K K +E ++ ++ K + EE + ++ E +K
Sbjct: 1454 EEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKAD 1513
Query: 79 LIEETEEKEVLNEL 92
++ EE + +E
Sbjct: 1514 EAKKAEEAKKADEA 1527
>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27. This protein forms
the C subunit of DNA polymerase delta. It carries the
essential residues for binding to the Pol1 subunit of
polymerase alpha, from residues 293-332, which are
characterized by the motif D--G--VT, referred to as the
DPIM motif. The first 160 residues of the protein form
the minimal domain for binding to the B subunit, Cdc1,
of polymerase delta, the final 10 C-terminal residues,
362-372, being the DNA sliding clamp, PCNA, binding
motif.
Length = 427
Score = 39.0 bits (91), Expect = 0.007
Identities = 20/86 (23%), Positives = 37/86 (43%), Gaps = 7/86 (8%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMER-------LEQEKIR 62
K K K K +E S E++ K+ S+ E E E+
Sbjct: 225 KTKEKKEKKEASESTVKEESEEESGKRDVILEDESAEPTGLDEDEDEDEPKPSGERSDSE 284
Query: 63 QEELEREKQEEEKIKTLIEETEEKEV 88
+E E+EK++ +++K ++E+ +E E
Sbjct: 285 EETEEKEKEKRKRLKKMMEDEDEDEE 310
Score = 29.8 bits (67), Expect = 4.5
Identities = 24/92 (26%), Positives = 33/92 (35%), Gaps = 8/92 (8%)
Query: 6 LLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEE 65
+ S+ K K K E E + K + E E E + ++E
Sbjct: 218 MSSFFKKKTKEKKEKKEASESTVKEESEEESGK--------RDVILEDESAEPTGLDEDE 269
Query: 66 LEREKQEEEKIKTLIEETEEKEVLNELMLKKM 97
E E + + EETEEKE LKKM
Sbjct: 270 DEDEPKPSGERSDSEEETEEKEKEKRKRLKKM 301
>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
Length = 1021
Score = 37.8 bits (87), Expect = 0.019
Identities = 21/81 (25%), Positives = 48/81 (59%), Gaps = 8/81 (9%)
Query: 12 KSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQ 71
++ RKAL+ + E++ R++R+ +++ + MER+E+E++ +E LERE+
Sbjct: 443 ENAHRKALEMKILEKKRIERLEREERE--------RLERERMERIERERLERERLERERL 494
Query: 72 EEEKIKTLIEETEEKEVLNEL 92
E ++++ + E+E ++ L
Sbjct: 495 ERDRLERDRLDRLERERVDRL 515
Score = 31.6 bits (71), Expect = 1.5
Identities = 25/94 (26%), Positives = 43/94 (45%), Gaps = 5/94 (5%)
Query: 21 DTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKI---RQEELEREKQEEEKI- 76
D E E RK K + + +ER E+E++ R E +ERE+ E E++
Sbjct: 431 DKDHAERARIEKENAHRKALEMKILEKKRIERLEREERERLERERMERIERERLERERLE 490
Query: 77 -KTLIEETEEKEVLNELMLKKMILLFEKRTLKNR 109
+ L + E++ L+ L +++ L R K R
Sbjct: 491 RERLERDRLERDRLDRLERERVDRLERDRLEKAR 524
>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
This family represents Cwf15/Cwc15 (from
Schizosaccharomyces pombe and Saccharomyces cerevisiae
respectively) and their homologues. The function of
these proteins is unknown, but they form part of the
spliceosome and are thus thought to be involved in mRNA
splicing.
Length = 241
Score = 35.1 bits (81), Expect = 0.063
Identities = 19/76 (25%), Positives = 33/76 (43%), Gaps = 9/76 (11%)
Query: 20 DDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTL 79
DD+ ++ + D S E EKI++E E +++EEE+
Sbjct: 120 DDSDSSSDSDSSD-------DDSDDDDSEDETAALLRELEKIKKERAEEKEREEEEKAAE 172
Query: 80 IEETEEKEVL--NELM 93
E+ E+E+L N L+
Sbjct: 173 EEKAREEEILTGNPLL 188
>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207). This
family is found in eukaryotes; it has several conserved
tryptophan residues. The function is not known.
Length = 261
Score = 35.1 bits (81), Expect = 0.070
Identities = 17/76 (22%), Positives = 35/76 (46%), Gaps = 1/76 (1%)
Query: 24 EEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQE-KIRQEELEREKQEEEKIKTLIEE 82
+ + +K+ S+SSS S E ++E K R +E E +K ++++ K E
Sbjct: 144 QAAKQRTPKHKKEAAESASSSLSGSAKPERNVSQEEAKKRLQEWELKKLKQQQQKREEER 203
Query: 83 TEEKEVLNELMLKKMI 98
++++ E +K
Sbjct: 204 RKQRKKQQEEEERKQK 219
Score = 31.6 bits (72), Expect = 1.1
Identities = 15/88 (17%), Positives = 36/88 (40%), Gaps = 2/88 (2%)
Query: 12 KSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQ 71
S + ++ + T + ++ + + + + + Q+K+++ E++KQ
Sbjct: 55 PSLSLSSTASSLSDSSTYSRSLKEVKLERQAQEAYENWLSAKQAQRQKKLQKLLEEKQKQ 114
Query: 72 EEEKIKTLIEETEE--KEVLNELMLKKM 97
E EK + E + KE E +K
Sbjct: 115 EREKEREEAELRQRLAKEKYEEWCRQKA 142
Score = 28.9 bits (65), Expect = 7.4
Identities = 25/122 (20%), Positives = 53/122 (43%), Gaps = 16/122 (13%)
Query: 16 RKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEK 75
+K L++ ++E + + R+ +K KY+E R + ++ ++ + K+E +
Sbjct: 105 QKLLEEKQKQEREKEREEAELRQ-----RLAKEKYEEWCRQKAQQAAKQRTPKHKKEAAE 159
Query: 76 IKTLIEET-----------EEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAEKFME 124
+ E K+ L E LKK+ +KR + R+ R K + E+ +
Sbjct: 160 SASSSLSGSAKPERNVSQEEAKKRLQEWELKKLKQQQQKREEERRKQRKKQQEEEERKQK 219
Query: 125 SE 126
+E
Sbjct: 220 AE 221
>gnl|CDD|237987 cd00020, ARM, Armadillo/beta-catenin-like repeats. An approximately
40 amino acid long tandemly repeated sequence motif
first identified in the Drosophila segment polarity gene
armadillo; these repeats were also found in the
mammalian armadillo homolog beta-catenin, the junctional
plaque protein plakoglobin, the adenomatous polyposis
coli (APC) tumor suppressor protein, and a number of
other proteins. ARM has been implicated in mediating
protein-protein interactions, but no common features
among the target proteins recognized by the ARM repeats
have been identified; related to the HEAT domain; three
consecutive copies of the repeat are represented by this
alignment model.
Length = 120
Score = 33.4 bits (77), Expect = 0.077
Identities = 18/103 (17%), Positives = 40/103 (38%), Gaps = 15/103 (14%)
Query: 137 HAIATVPDLYPLLVQLKAVSSMLELVLHENTDIAVAVVDLLQELTDVDVLNESEEGTESL 196
+ A D +V+ + ++++L+ E+ ++ A + L+ L N+
Sbjct: 33 NLSAGNNDNIQAVVEAGGLPALVQLLKSEDEEVVKAALWALRNLAAGPEDNKL------- 85
Query: 197 LTALLDQQVCALLVQNLERLDETVKEESDGVHNTLGIFENLCE 239
+L+ LV L+ +E +++ N G NL
Sbjct: 86 --IVLEAGGVPKLVNLLDSSNEDIQK------NATGALSNLAS 120
Score = 30.7 bits (70), Expect = 0.75
Identities = 21/109 (19%), Positives = 40/109 (36%), Gaps = 12/109 (11%)
Query: 199 ALLDQQVCALLVQNLERLDETVKEESDGVHNTLGIFENLCELKPDVIHDIGKQGIIQWSL 258
A++ LV L DE V+ E NL D I + + G + +
Sbjct: 2 AVIQAGGLPALVSLLSSSDENVQRE------AAWALSNLSAGNNDNIQAVVEAGGLPALV 55
Query: 259 KRLKAKIPFDGNKL--YTSEILSILCQKNNENRKLLGDLDGIDILLQQL 305
+ LK++ ++ L L +N+ ++ + G+ L+ L
Sbjct: 56 QLLKSEDE----EVVKAALWALRNLAAGPEDNKLIVLEAGGVPKLVNLL 100
>gnl|CDD|218896 pfam06098, Radial_spoke_3, Radial spoke protein 3. This family
consists of several radial spoke protein 3 (RSP3)
sequences. Eukaryotic cilia and flagella present in
diverse types of cells perform motile, sensory, and
developmental functions in organisms from protists to
humans. They are centred by precisely organised,
microtubule-based structures, the axonemes. The axoneme
consists of two central singlet microtubules, called the
central pair, and nine outer doublet microtubules. These
structures are well-conserved during evolution. The
outer doublet microtubules, each composed of A and B
sub-fibres, are connected to each other by nexin links,
while the central pair is held at the centre of the
axoneme by radial spokes. The radial spokes are T-shaped
structures extending from the A-tubule of each outer
doublet microtubule to the centre of the axoneme. Radial
spoke protein 3 (RSP3), is present at the proximal end
of the spoke stalk and helps in anchoring the radial
spoke to the outer doublet. It is thought that radial
spokes regulate the activity of inner arm dynein through
protein phosphorylation and dephosphorylation.
Length = 288
Score = 35.0 bits (81), Expect = 0.098
Identities = 22/64 (34%), Positives = 39/64 (60%), Gaps = 2/64 (3%)
Query: 23 VEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEE 82
V EEE AE +++R++ ++ + Q +E E E+ R+EE ER K+++++ K +E
Sbjct: 148 VLEEEELAELRQQQRQFEQRRNAELAETQRLE--EAERRRREEKERRKKQDKERKQREKE 205
Query: 83 TEEK 86
T EK
Sbjct: 206 TAEK 209
>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
envelope biogenesis, outer membrane].
Length = 387
Score = 34.9 bits (80), Expect = 0.11
Identities = 19/65 (29%), Positives = 31/65 (47%), Gaps = 1/65 (1%)
Query: 24 EEEETSAEDARKKRKYSSSSSSSKMKYQEMERL-EQEKIRQEELEREKQEEEKIKTLIEE 82
+ E RKK++ + + E ERL + EK R + E++KQ EE K E
Sbjct: 71 QSSAKKGEQQRKKKEEQVAEELKPKQAAEQERLKQLEKERLKAQEQQKQAEEAEKQAQLE 130
Query: 83 TEEKE 87
+++E
Sbjct: 131 QKQQE 135
>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon.
Length = 431
Score = 33.5 bits (76), Expect = 0.33
Identities = 28/101 (27%), Positives = 49/101 (48%), Gaps = 2/101 (1%)
Query: 4 GELLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQ 63
GE +++K K + E + A +K K ++ +++ + +R E+ K+ +
Sbjct: 172 GEFMTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLE 231
Query: 64 EELEREKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKR 104
EE +R KQEE K+ E EEK L E + ++ EKR
Sbjct: 232 EEEQRRKQEEADRKS--REEEEKRRLKEEIERRRAEAAEKR 270
>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain. This
family represents the C-terminus (approximately 300
residues) of proteins that are involved as binding
partners for Prp19 as part of the nuclear pore complex.
The family in Drosophila is necessary for pre-mRNA
splicing, and the human protein has been found in
purifications of the spliceosome. In the past this
family was thought, erroneously, to be associated with
microfibrillin.
Length = 277
Score = 32.6 bits (74), Expect = 0.49
Identities = 22/85 (25%), Positives = 41/85 (48%), Gaps = 12/85 (14%)
Query: 5 ELLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQE 64
EL K + +DD ++E E+ K +E++R+++++ +E
Sbjct: 89 ELELKKRNTLLEANIDDVDTDDENEEEE------------YEAWKLRELKRIKRDREERE 136
Query: 65 ELEREKQEEEKIKTLIEETEEKEVL 89
E+EREK E EK++ + EE E+
Sbjct: 137 EMEREKAEIEKMRNMTEEERRAELR 161
Score = 31.4 bits (71), Expect = 1.2
Identities = 31/80 (38%), Positives = 42/80 (52%), Gaps = 3/80 (3%)
Query: 24 EEEETSAEDARKKR-KYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEE 82
EEEET +ED + R K + ++ QE ER ++ EE + K EE K +TL +
Sbjct: 23 EEEETDSEDDMEPRLKPVFTRKKDRITIQEREREAAKEKALEEEAKRKAEERKRETL--K 80
Query: 83 TEEKEVLNELMLKKMILLFE 102
E+EV EL LKK L E
Sbjct: 81 IVEEEVKKELELKKRNTLLE 100
>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
domain. This domain is found in a number of different
types of plant proteins including NAM-like proteins.
Length = 147
Score = 31.2 bits (71), Expect = 0.67
Identities = 19/94 (20%), Positives = 34/94 (36%), Gaps = 13/94 (13%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDA----------RKKRKYSSSSSSSKMKYQEMERLEQE 59
K +S E E+ E + K K +K + E E+ ++E
Sbjct: 33 KKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKKAKEKLRRDKLKAKKEEAEKEKEKEE 92
Query: 60 KIRQEELEREKQE---EEKIKTLIEETEEKEVLN 90
+ + E EK+ E+K EEK+++
Sbjct: 93 RFMKALAEAEKERAELEKKKAEAKLMKEEKKIMF 126
>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
TolA; Provisional.
Length = 387
Score = 32.5 bits (74), Expect = 0.70
Identities = 15/52 (28%), Positives = 32/52 (61%), Gaps = 2/52 (3%)
Query: 24 EEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEK 75
++++ SA+ A ++RK + ++ Q+ + EQE+++Q E ER +E+K
Sbjct: 68 QQQQKSAKRAEEQRKKKEQQQAEEL--QQKQAAEQERLKQLEKERLAAQEQK 117
Score = 29.8 bits (67), Expect = 4.7
Identities = 20/78 (25%), Positives = 34/78 (43%), Gaps = 6/78 (7%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
+ KS KR +E++ + E +K+ ++K E ERL ++ +++ E
Sbjct: 70 QQKSAKRAEEQRKKKEQQQAEELQQKQAAEQE-----RLKQLEKERLAAQEQKKQAEEAA 124
Query: 70 KQEEEKIKTLIEETEEKE 87
KQ K K EE K
Sbjct: 125 KQAALKQK-QAEEAAAKA 141
>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1. All
proteins in this family for which functions are known
are cyclin dependent protein kinases that are components
of TFIIH, a complex that is involved in nucleotide
excision repair and transcription initiation. Also known
as MAT1 (menage a trois 1). This family is based on the
phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
Stanford University) [DNA metabolism, DNA replication,
recombination, and repair].
Length = 309
Score = 32.1 bits (73), Expect = 0.84
Identities = 18/75 (24%), Positives = 41/75 (54%), Gaps = 1/75 (1%)
Query: 19 LDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKT 78
L++T ++ ET ++ + + + S+ + + E E LE EK +E+ Q+EE+ +
Sbjct: 115 LENTKKKIETYQKENKDVIQKNKEKSTREQEELE-EALEFEKEEEEQRRLLLQKEEEEQQ 173
Query: 79 LIEETEEKEVLNELM 93
+ + ++ +L+EL
Sbjct: 174 MNKRKNKQALLDELE 188
>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain. This domain is
found at the N terminus of SMC proteins. The SMC
(structural maintenance of chromosomes) superfamily
proteins have ATP-binding domains at the N- and
C-termini, and two extended coiled-coil domains
separated by a hinge in the middle. The eukaryotic SMC
proteins form two kinds of heterodimers: the SMC1/SMC3
and the SMC2/SMC4 types. These heterodimers constitute
an essential part of higher order complexes, which are
involved in chromatin and DNA dynamics. This family also
includes the RecF and RecN proteins that are involved in
DNA metabolism and recombination.
Length = 1162
Score = 32.2 bits (73), Expect = 0.86
Identities = 25/136 (18%), Positives = 54/136 (39%)
Query: 1 MDIGELLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEK 60
+ E K + + + + +++E+E + E+ + ++ K+K QE E E+
Sbjct: 750 EEEEEKSRLKKEEEEEEKSELSLKEKELAEEEEKTEKLKVEEEKEEKLKAQEEELRALEE 809
Query: 61 IRQEELEREKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAE 120
+EE E ++E+ I+ + EE+ L LK+ L + + + +
Sbjct: 810 ELKEEAELLEEEQLLIEQEEKIKEEELEELALELKEEQKLEKLAEEELERLEEEITKEEL 869
Query: 121 KFMESEIELHTTIQEL 136
E Q+L
Sbjct: 870 LQELLLKEEELEEQKL 885
>gnl|CDD|225087 COG2176, PolC, DNA polymerase III, alpha subunit (gram-positive
type) [DNA replication, recombination, and repair].
Length = 1444
Score = 32.3 bits (74), Expect = 0.93
Identities = 16/89 (17%), Positives = 39/89 (43%), Gaps = 3/89 (3%)
Query: 27 ETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEETEEK 86
E + + +K + + K+K + + + + + + R+ + E+IK LI+ EE+
Sbjct: 180 EEAINEEVEKAAQEALEAEKKLKAESPKVEKPKPLFDGQKGRKIKSTEEIKPLIKINEEE 239
Query: 87 EVLNELMLKKMILLFEKRTLKNREMRIKF 115
+ ++ I E + LK+ +
Sbjct: 240 ---TRVKVEGYIFKIEIKELKSGRTLLNI 265
>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein. This entry is a highly
conserved protein present in eukaryotes.
Length = 680
Score = 32.2 bits (73), Expect = 0.93
Identities = 44/176 (25%), Positives = 74/176 (42%), Gaps = 19/176 (10%)
Query: 14 PKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEE 73
+K +D ++ + S A++K K S S ++K + R+ EK EE +R+K+EE
Sbjct: 452 QLKKE-NDMLQTKLNSMVSAKQKDKQSMQSMEKRLKSEADSRVNAEKQLAEEKKRKKEEE 510
Query: 74 EKIK--TLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAEKFMESEIELHT 131
E +E L K L E + L++ ++++K E L
Sbjct: 511 ETAARAAAQAAASREECAESLKQAKQDLEMEIKKLEH-DLKLK--------EEECRMLEK 561
Query: 132 TIQELHAI-ATVPDLYPLLVQLKAV---SSMLELVLHENTDIAVAVVDLLQELTDV 183
QEL + + L+ L+A+ + MLE L T + +DL L DV
Sbjct: 562 EAQELRKYQESEKETEVLMSALQAMQDKNLMLENSLSAETRLK---LDLFSALGDV 614
>gnl|CDD|112890 pfam04094, DUF390, Protein of unknown function (DUF390). This is a
family of long proteins currently only found in the rice
genome. They have no known function. However they may be
some kind of transposable element.
Length = 843
Score = 32.1 bits (72), Expect = 1.2
Identities = 21/76 (27%), Positives = 36/76 (47%), Gaps = 5/76 (6%)
Query: 7 LSYKPKSPKRKALDDTVEEEETSAEDARK----KRKYSSSSSSSKMKYQEMERLEQ-EKI 61
LS P P R + E E+ +A +AR+ +R+ + ++ Q+ R Q E+
Sbjct: 209 LSEIPSRPSRHSKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAEEA 268
Query: 62 RQEELEREKQEEEKIK 77
+EE R +Q EE +
Sbjct: 269 AREEAARARQAEEAAR 284
>gnl|CDD|219555 pfam07753, DUF1609, Protein of unknown function (DUF1609). This
region is found in a number of hypothetical proteins
thought to be expressed by the eukaryote
Encephalitozoon cuniculi, an obligate intracellular
microsporidial parasite. It is approximately 200
residues long.
Length = 230
Score = 31.3 bits (71), Expect = 1.2
Identities = 21/74 (28%), Positives = 39/74 (52%), Gaps = 8/74 (10%)
Query: 26 EETSAEDARKKRKYSSSSSSSKMKYQEMERL------EQEKIRQEELEREKQEEEKIKTL 79
EE + AR K+K S ++ + EKI+ EEL++ +E+ K ++
Sbjct: 13 EEMAVGGARAKKKGGKKKSKGGRHCYKIHKRVLRWRKSPEKIK-EELDKGSEEKWKGRS- 70
Query: 80 IEETEEKEVLNELM 93
IEE +E++VL+++
Sbjct: 71 IEEIKEQKVLHDIT 84
>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
unknown].
Length = 652
Score = 31.6 bits (72), Expect = 1.5
Identities = 11/34 (32%), Positives = 19/34 (55%)
Query: 49 KYQEMERLEQEKIRQEELEREKQEEEKIKTLIEE 82
+E+ R +E ++ E +Q+EE I +IEE
Sbjct: 611 DSEELRRAIEEWKKRFEERERRQKEEDILRIIEE 644
>gnl|CDD|217902 pfam04111, APG6, Autophagy protein Apg6. In yeast, 15 Apg proteins
coordinate the formation of autophagosomes. Autophagy is
a bulk degradation process induced by starvation in
eukaryotic cells. Apg6/Vps30p has two distinct functions
in the autophagic process, either associated with the
membrane or in a retrieval step of the carboxypeptidase
Y sorting pathway.
Length = 356
Score = 31.4 bits (71), Expect = 1.5
Identities = 19/89 (21%), Positives = 31/89 (34%), Gaps = 7/89 (7%)
Query: 22 TVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQE-KIRQEELEREKQEEEKIKTLI 80
VE A D+ + E+E LE+E EL +EEK +
Sbjct: 59 NVEISNYEALDSELD----ELKKEEERLLDELEELEKEDDDLDGELVEL--QEEKEQLEN 112
Query: 81 EETEEKEVLNELMLKKMILLFEKRTLKNR 109
EE + N + L ++L+ +
Sbjct: 113 EELQYLREYNLFDRNNLQLEDNLQSLELQ 141
>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
Provisional.
Length = 695
Score = 31.1 bits (71), Expect = 1.9
Identities = 13/32 (40%), Positives = 17/32 (53%), Gaps = 5/32 (15%)
Query: 51 QEMERLEQEKIR----QEELEREKQE-EEKIK 77
QE ++ E+ K R Q LEREK E + K
Sbjct: 443 QEKKKAEEAKARFEARQARLEREKAAREARHK 474
>gnl|CDD|218734 pfam05758, Ycf1, Ycf1. The chloroplast genomes of most higher
plants contain two giant open reading frames designated
ycf1 and ycf2. Although the function of Ycf1 is unknown,
it is known to be an essential gene.
Length = 832
Score = 31.1 bits (71), Expect = 1.9
Identities = 20/88 (22%), Positives = 31/88 (35%), Gaps = 11/88 (12%)
Query: 16 RKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEK 75
K L +T E EE E + S E + +QE+ E + EEK
Sbjct: 223 TKKLKETSETEEREEETDVEIETTS-----------ETKGTKQEQEGSTEEDPSLFSEEK 271
Query: 76 IKTLIEETEEKEVLNELMLKKMILLFEK 103
E +K + + + + FEK
Sbjct: 272 EDPDKTEDLDKLEILKEKKDEELFWFEK 299
>gnl|CDD|234352 TIGR03779, Bac_Flav_CT_M, Bacteroides conjugative transposon TraM
protein. Members of this protein family are designated
TraM and are found in a proposed transfer region of a
class of conjugative transposon found in the Bacteroides
lineage [Cellular processes, DNA transformation].
Length = 410
Score = 30.9 bits (70), Expect = 2.1
Identities = 18/70 (25%), Positives = 29/70 (41%)
Query: 24 EEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEET 83
EE + AE A R SS+++ + +E+ + EE E ++ EE L E
Sbjct: 104 EEPDEPAETAGSLRPIRSSAAAYRDINRELGSFYEYPKTDEEKELLREVEELESRLATEP 163
Query: 84 EEKEVLNELM 93
L E +
Sbjct: 164 SPAPELEEQL 173
>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein. The proteins in this family
are designated YL1. These proteins have been shown to be
DNA-binding and may be a transcription factor.
Length = 238
Score = 30.4 bits (69), Expect = 2.3
Identities = 19/92 (20%), Positives = 45/92 (48%), Gaps = 5/92 (5%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
K +P+ K + + T + R+K SS+ + + ++ ++ E + + + R+
Sbjct: 114 KAAAPRPKKKSERISWAPTLLDSPRRKSSRSSTVQNKEATHERLKEREIRRKKIQAKARK 173
Query: 70 KQEEEKIKTL-----IEETEEKEVLNELMLKK 96
++E++K K L + E +E E +N L++
Sbjct: 174 RKEKKKEKELTQEERLAEAKETERINLKSLER 205
>gnl|CDD|173611 PTZ00421, PTZ00421, coronin; Provisional.
Length = 493
Score = 30.6 bits (69), Expect = 2.7
Identities = 12/76 (15%), Positives = 26/76 (34%)
Query: 12 KSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQ 71
P+ + + K++ + + E E+ Q E+ +
Sbjct: 405 GKPRHSGVSVPASTSAMTHSFDDNTSKHADPCAMGVKRMDEGILDERLGRLQALSEKLRT 464
Query: 72 EEEKIKTLIEETEEKE 87
+ E+IK E ++KE
Sbjct: 465 QHEEIKRCREALQKKE 480
>gnl|CDD|116082 pfam07461, NADase_NGA, Nicotine adenine dinucleotide glycohydrolase
(NADase). This family consists of several bacterial
nicotine adenine dinucleotide glycohydrolase (NGA)
proteins which appear to be specific to Streptococcus
pyogenes. NAD glycohydrolase (NADase) is a potential
virulence factor. Streptococcal NADase may contribute to
virulence by its ability to cleave beta-NAD at the
ribose-nicotinamide bond, depleting intracellular NAD
pools and producing the potent vasoactive compound
nicotinamide.
Length = 446
Score = 30.5 bits (68), Expect = 2.9
Identities = 34/165 (20%), Positives = 79/165 (47%), Gaps = 17/165 (10%)
Query: 3 IGELLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSS-----SSSSKMKYQEMERLE 57
+GE+LSYK SP V + +S E+ KK + + + + + + + ++E +
Sbjct: 90 VGEVLSYKFASPMHIGRILIVNGDTSSKENYYKKNRIAKADVKYYNGNKLVLFHKIELGD 149
Query: 58 QEKIRQEELEREKQ-EEEKIKTLIEETEEKEVLNELMLKKMIL------LFEKRTLKNRE 110
+ +E +K+ + ++I + E + + + L L ++ +FEK K +E
Sbjct: 150 TYTKKPHHIEIDKKLDVDRIDIEVTEVHQGQNKDILALSEVTFGNIERDIFEK---KFKE 206
Query: 111 MRIKFPDN--AEKFMESEIELHTTIQELHAIATVPDLYPLLVQLK 153
++ K+ + A++F+E+ + ++ A+A+ + Y + V K
Sbjct: 207 IKDKWVTDKQADEFIETADKYADKAIQMSAVASRAEYYRMYVSRK 251
>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510). This
family consists of several hypothetical bacterial
proteins of around 200 residues in length. The function
of this family is unknown.
Length = 214
Score = 30.1 bits (68), Expect = 2.9
Identities = 22/88 (25%), Positives = 39/88 (44%), Gaps = 11/88 (12%)
Query: 10 KPKSPKRKALDDTVE------EEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQ 63
P SP +A D E +E E+ +++ K +++S + K + E+ +
Sbjct: 36 FPSSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEEN 95
Query: 64 EELEREKQEEEKIKTLIEETEEKEVLNE 91
EE + E +E + +ETEEK N
Sbjct: 96 EEEDEESSDENE-----KETEEKTESNV 118
>gnl|CDD|220184 pfam09332, Mcm10, Mcm10 replication factor. Mcm10 is a
eukaryotic DNA replication factor that regulates the
stability and chromatin association of DNA polymerase
alpha.
Length = 346
Score = 30.2 bits (68), Expect = 3.3
Identities = 17/58 (29%), Positives = 29/58 (50%), Gaps = 2/58 (3%)
Query: 14 PKRKALDDTVEEEETSAEDARKKRKYSSSSS--SSKMKYQEMERLEQEKIRQEELERE 69
P AL+ E+++ +A K S S+ + K Q++ERL K R EE+++
Sbjct: 4 PTPGALNLKKHLEKSALAEAGGPPKQSISAVELLKQQKQQDLERLRARKKRAEEIQKR 61
>gnl|CDD|130078 TIGR01005, eps_transp_fam, exopolysaccharide transport protein
family. The model describes the exopolysaccharide
transport protein family in bacteria. The transport
protein is part of a large genetic locus which is
associated with exopolysaccharide (EPS) biosynthesis.
Detailed molecular characterization and gene fusion
analysis revealed at least seven gene products are
involved in the overall regulation, which among other
things, include exopolysaccharide biosynthesis, property
of conferring virulence and exopolysaccharide export
[Transport and binding proteins, Carbohydrates, organic
alcohols, and acids].
Length = 754
Score = 30.4 bits (68), Expect = 3.5
Identities = 21/152 (13%), Positives = 51/152 (33%), Gaps = 18/152 (11%)
Query: 74 EKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIK-----FPDNAEKFMESEIE 128
+ +K ++ +VL E++ ++ L + L+ R+ ++ +
Sbjct: 261 DSVKKALQNGGSLDVLPEVLSSQLKLEDLIQRLRERQAELRATIADLSTTMLANHPRVVA 320
Query: 129 LHTTIQELHAIATVPDLYPLLVQL-------KAVSSMLELVLHENTD-IAVAVVDLLQEL 180
+++ +L A + L ++ + E L + + + A ++
Sbjct: 321 AKSSLADLDA-----QIRSELQKITKSLLMQADAAQARESQLVSDVNQLKAASAQAGEQQ 375
Query: 181 TDVDVLNESEEGTESLLTALLDQQVCALLVQN 212
D+D L L + L A QN
Sbjct: 376 VDLDALQRDAAAKRQLYESYLTNYRQAASRQN 407
>gnl|CDD|153365 cd07681, F-BAR_PACSIN3, The F-BAR (FES-CIP4 Homology and
Bin/Amphiphysin/Rvs) domain of Protein kinase C and
Casein kinase Substrate in Neurons 3 (PACSIN3). F-BAR
domains are dimerization modules that bind and bend
membranes and are found in proteins involved in membrane
dynamics and actin reorganization. Protein kinase C and
Casein kinase Substrate in Neurons (PACSIN) proteins,
also called Synaptic dynamin-associated proteins
(Syndapins), act as regulators of cytoskeletal and
membrane dynamics. Vertebrates harbor three isoforms with
distinct expression patterns and specific functions.
PACSIN 3 or Syndapin III is expressed ubiquitously and
regulates glucose uptake in adipocytes through its role
in GLUT1 trafficking. It also modulates the subcellular
localization and stimulus-specific function of the
cation channel TRPV4. PACSIN 3 contains an N-terminal
F-BAR domain and a C-terminal SH3 domain. F-BAR domains
form banana-shaped dimers with a positively-charged
concave surface that binds to negatively-charged lipid
membranes. They can induce membrane deformation in the
form of long tubules.
Length = 258
Score = 29.9 bits (67), Expect = 3.7
Identities = 35/138 (25%), Positives = 56/138 (40%), Gaps = 15/138 (10%)
Query: 10 KPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
K + P K L + VE + ARK + ++ + + K EQ + Q+ +E+
Sbjct: 123 KAQKPWVKKLKE-VESSKKGYHAARKDER-TAQTRETHAKADSTVSQEQLRKLQDRVEKC 180
Query: 70 KQEEEKIKTLIEET-EEKEVLNELMLKKMILLFE-------KRTLKNREMRIKFP----- 116
QE EK K E+ EE N ++ M FE KR +EM +
Sbjct: 181 TQEAEKAKEQYEKALEELNRYNPRYMEDMEQAFEICQEAERKRLCFFKEMLLDLHQHLDL 240
Query: 117 DNAEKFMESEIELHTTIQ 134
+++ F +LH TI
Sbjct: 241 SSSDSFHALYRDLHQTIS 258
>gnl|CDD|236912 PRK11448, hsdR, type I restriction enzyme EcoKI subunit R;
Provisional.
Length = 1123
Score = 30.3 bits (69), Expect = 3.7
Identities = 22/80 (27%), Positives = 39/80 (48%), Gaps = 5/80 (6%)
Query: 51 QEMERLEQEKI-RQEELEREKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNR 109
Q++E +EK Q E ++QE ++ L E EEK+ E L++ L EK ++
Sbjct: 156 QQLELQAREKAQSQALAEAQQQELVALEGLAAELEEKQQELEAQLEQ---LQEKAAETSQ 212
Query: 110 EMRIKFPDNAEKFMESEIEL 129
E + K + ++ +EL
Sbjct: 213 ERKQKRKEITDQA-AKRLEL 231
>gnl|CDD|234055 TIGR02907, spore_VI_D, stage VI sporulation protein D. SpoVID, the
stage VI sporulation protein D, is restricted to
endospore-forming members of the bacteria, all of which
are found among the Firmicutes. It is widely distributed
but not quite universal in this group. Between
well-conserved N-terminal and C-terminal domains is a
poorly conserved, low-complexity region of variable
length, rich enough in glutamic acid to cause spurious
BLAST search results unless a filter is used. The seed
alignment for this model was trimmed, in effect, by
choosing member sequences in which these regions are
relatively short. SpoVID is involved in spore coat
assembly by the mother cell compartment late in the
process of sporulation [Cellular processes, Sporulation
and germination].
Length = 338
Score = 29.9 bits (67), Expect = 3.8
Identities = 35/150 (23%), Positives = 68/150 (45%), Gaps = 16/150 (10%)
Query: 11 PKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREK 70
++ + K + E+ E A+D + K S+S ++ + E+E + E E E
Sbjct: 174 ERTDEPKVEHEAHEQHEQPADDDPDEWKISASEPFQ-LESEVEASPEEENYEEYEDETEL 232
Query: 71 QEEEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNA----EKFMESE 126
+ E++ K L E+TE+ + + + K+ L+ E + + P+NA + F ++E
Sbjct: 233 EVEDEEKALDEQTEDPQ------QEDALAGDAKKALEEEEEKGERPENATYLTKLFRKAE 286
Query: 127 IELHTT-----IQELHAIATVPDLYPLLVQ 151
E T +QE I T+ + Y + V
Sbjct: 287 EEQFTKLRMCIVQEGDTIETIAERYEISVS 316
>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
Length = 1036
Score = 30.2 bits (68), Expect = 3.9
Identities = 20/58 (34%), Positives = 29/58 (50%), Gaps = 14/58 (24%)
Query: 52 EMERLEQEKIRQEELEREKQEEEKIKTLIE-------------ETEE-KEVLNELMLK 95
E +R E EK+ +EE ERE+Q EE+ + E E E+ +E L L+ K
Sbjct: 252 EEKRRELEKLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKLQNLLKK 309
>gnl|CDD|221794 pfam12825, DUF3818, Domain of unknown function in PX-proteins
(DUF3818). This domain is found on proteins carrying a
PX domain. Its function is unknown.
Length = 340
Score = 30.0 bits (68), Expect = 4.0
Identities = 32/135 (23%), Positives = 58/135 (42%), Gaps = 16/135 (11%)
Query: 21 DTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLI 80
V E T+ ++A + S Y +++L Q +R+ R+K +K L
Sbjct: 178 QEVIESYTAWKNAVESEPVDEEESEEAELYSNLKQLLQLYLRE----RDKDL---MKKLW 230
Query: 81 EETEEKEVLNELMLKKMILLFEK---RTLKNREMRIKFPDNAEKFMESEIELHTTIQELH 137
+E E L +L LK ++ +F + R K ++ + D EKFM+ I+L +
Sbjct: 231 QEPE----LTQL-LKDLVTIFYEPLVRVFKVADVDVALKD-FEKFMDDLIKLLEKVINQL 284
Query: 138 AIATVPDLYPLLVQL 152
I+ ++ V L
Sbjct: 285 YISDPFNVVQAFVDL 299
>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
This is a family of fungal proteins of unknown function.
Length = 182
Score = 29.3 bits (66), Expect = 4.1
Identities = 25/113 (22%), Positives = 47/113 (41%), Gaps = 12/113 (10%)
Query: 8 SYKPKSPKRKALDDTVE--EEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEE 65
Y K+K L + +E ++E + K +K S K K ++ ++ + + E
Sbjct: 56 EYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDK----SE 111
Query: 66 LEREKQEEEKIKTLIEETEE-KEVLNELM-----LKKMILLFEKRTLKNREMR 112
+ EK+ E+K++ L + E L+EL L K I + E+
Sbjct: 112 KKDEKEAEDKLEDLTKSYSETLSTLSELKPRKYALHKDIYQSRLDRKRRAEVA 164
>gnl|CDD|146486 pfam03879, Cgr1, Cgr1 family. Members of this family are
coiled-coil proteins that are involved in pre-rRNA
processing.
Length = 105
Score = 28.1 bits (63), Expect = 4.5
Identities = 19/63 (30%), Positives = 30/63 (47%), Gaps = 6/63 (9%)
Query: 40 SSSSSSSKMKYQEMERLEQEKIRQEELEREKQEE-----EKIKTLIEETEEKEVLNELML 94
S +S + + ++ + K R++EL+ EK+ E + IK EEKE E M
Sbjct: 24 KSKLTSWEKRMEKRLEQQAIKAREKELKDEKEAERQRRIQAIKERRAAKEEKERY-EKMA 82
Query: 95 KKM 97
KM
Sbjct: 83 AKM 85
>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein. Function of MutS2 is
unknown. It should not be considered a DNA mismatch
repair protein. It is likely a DNA mismatch binding
protein of unknown cellular function [DNA metabolism,
Other].
Length = 771
Score = 29.8 bits (67), Expect = 4.7
Identities = 23/83 (27%), Positives = 35/83 (42%), Gaps = 10/83 (12%)
Query: 29 SAEDARKKRKYSSSSSSSKMKYQEMER--LEQEKIRQEELEREKQEE------EKIKTLI 80
K+ + + +K QE + LEQE +E ER K+ E E +K L
Sbjct: 519 KLSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERERNKKLELEKEAQEALKALK 578
Query: 81 EETEEKEVLNELMLKKMILLFEK 103
+E E ++ EL KK+ E
Sbjct: 579 KEVE--SIIRELKEKKIHKAKEI 599
>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355). This
family of proteins is found in bacteria and viruses.
Proteins in this family are typically between 180 and
214 amino acids in length.
Length = 125
Score = 28.4 bits (64), Expect = 5.0
Identities = 15/66 (22%), Positives = 31/66 (46%), Gaps = 1/66 (1%)
Query: 17 KALDDTVEEEETSAEDAR-KKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEK 75
KA+ + E E+ + + K + S+ K +Y+ + ++ + + EL R + + E
Sbjct: 15 KAIAKEKAKWEKKQEEKKSEAEKLAKMSAEEKAEYELEKLEKELEELEAELARRELKAEA 74
Query: 76 IKTLIE 81
K L E
Sbjct: 75 KKMLSE 80
>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein. TolA couples the inner
membrane complex of itself with TolQ and TolR to the
outer membrane complex of TolB and OprL (also called
Pal). Most of the length of the protein consists of
low-complexity sequence that may differ in both length
and composition from one species to another,
complicating efforts to discriminate TolA (the most
divergent gene in the tol-pal system) from paralogs such
as TonB. Selection of members of the seed alignment and
criteria for setting scoring cutoffs are based largely on
conserved operon structure. The Tol-Pal complex is
required for maintaining outer membrane integrity. Also
involved in transport (uptake) of colicins and
filamentous DNA, and implicated in pathogenesis.
Transport is energized by the proton motive force. TolA
is an inner membrane protein that interacts with
periplasmic TolB and with outer membrane porins ompC,
phoE and lamB [Transport and binding proteins, Other,
Cellular processes, Pathogenesis].
Length = 346
Score = 29.4 bits (66), Expect = 5.0
Identities = 19/79 (24%), Positives = 35/79 (44%), Gaps = 2/79 (2%)
Query: 10 KPKSPKRKALDDTVEEEET-SAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELE- 67
K + ++K E E+ +AE AR+K +++ K E + E+ +++ E
Sbjct: 65 KEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEA 124
Query: 68 REKQEEEKIKTLIEETEEK 86
+ KQ E E E+K
Sbjct: 125 KAKQAAEAKAKAEAEAEKK 143
>gnl|CDD|224259 COG1340, COG1340, Uncharacterized archaeal coiled-coil protein
[Function unknown].
Length = 294
Score = 29.3 bits (66), Expect = 5.1
Identities = 21/96 (21%), Positives = 38/96 (39%), Gaps = 2/96 (2%)
Query: 15 KRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKI--RQEELEREKQE 72
KR ++ ++E + ++KR + S + ++K E E++
Sbjct: 77 KRDEINAKLQELRKEYRELKEKRNEFNLGGRSIKSLEREIERLEKKQQTSVLTPEEEREL 136
Query: 73 EEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKN 108
+KIK L +E E+ + E K L E LK
Sbjct: 137 VQKIKELRKELEDAKKALEENEKLKELKAEIDELKK 172
>gnl|CDD|153306 cd07622, BAR_SNX4, The Bin/Amphiphysin/Rvs (BAR) domain of Sorting
Nexin 4. BAR domains are dimerization, lipid binding
and curvature sensing modules found in many different
proteins with diverse functions. Sorting nexins (SNXs)
are Phox homology (PX) domain containing proteins that
are involved in regulating membrane traffic and protein
sorting in the endosomal system. SNXs differ from each
other in their lipid-binding specificity, subcellular
localization and specific function in the endocytic
pathway. A subset of SNXs also contain BAR domains. The
PX-BAR structural unit determines the specific membrane
targeting of SNXs. SNX4 is involved in recycling traffic
from the sorting endosome (post-Golgi endosome) back to
the late Golgi. It is also implicated in the regulation
of plasma membrane receptor trafficking and interacts
with receptors for EGF, insulin, platelet-derived growth
factor and leptin. BAR domains form dimers that bind to
membranes, induce membrane bending and curvature, and
may also be involved in protein-protein interactions.
Length = 201
Score = 28.9 bits (65), Expect = 5.2
Identities = 26/97 (26%), Positives = 51/97 (52%), Gaps = 11/97 (11%)
Query: 18 ALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIK 77
++D+ +E+EE A D K+ + + S + K E+ + + EK +++Q EE +K
Sbjct: 85 SIDNGLEDEELIA-DQLKEYLFFADSLRAVCKKHELLQYDLEKAEDALANKKQQGEEAVK 143
Query: 78 TLIEETEEKEVLNELMLKKM--ILLFEKRTLKNREMR 112
E K+ LNE + K + + F+K+ K R+++
Sbjct: 144 ------EAKDELNEFVKKALEDVERFKKQ--KVRDLK 172
>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
subunit. This is a family of proteins which are
subunits of the eukaryotic translation initiation factor
3 (eIF3). In yeast it is called Hcr1. The Saccharomyces
cerevisiae protein eIF3j (HCR1) has been shown to be
required for processing of 20S pre-rRNA and binds to 18S
rRNA and eIF3 subunits Rpg1p and Prt1p.
Length = 242
Score = 29.2 bits (66), Expect = 5.4
Identities = 22/93 (23%), Positives = 48/93 (51%), Gaps = 10/93 (10%)
Query: 10 KPKSPKRKALDDTVEEEET-----SAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQE 64
K+ + DD E+++ ED K+ + + ++ +K K + + KI ++
Sbjct: 15 PAKAVVKDKWDDEDEDDDVKDSWDEEEDEEKEEEKAKVAAKAKAK-----KALKAKIEEK 69
Query: 65 ELEREKQEEEKIKTLIEETEEKEVLNELMLKKM 97
E + ++EE+ ++ L E+T E E+ +L L+K+
Sbjct: 70 EKAKREKEEKGLRELEEDTPEDELAEKLRLRKL 102
>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
Length = 520
Score = 29.4 bits (67), Expect = 6.4
Identities = 23/114 (20%), Positives = 49/114 (42%), Gaps = 12/114 (10%)
Query: 37 RKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEE--EKIKTLIEETEE--------- 85
RK + + + + + LE+ K E +++E E E+I L E E+
Sbjct: 25 RKKIAEAKIKEAEEEAKRILEEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNEL 84
Query: 86 KEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAEKFMESEIE-LHTTIQELHA 138
+++ L+ K+ L + L+ RE ++ + + + E+E ++EL
Sbjct: 85 QKLEKRLLQKEENLDRKLELLEKREEELEKKEKELEQKQQELEKKEEELEELIE 138
>gnl|CDD|224212 COG1293, COG1293, Predicted RNA-binding protein homologous to
eukaryotic snRNP [Transcription].
Length = 564
Score = 29.3 bits (66), Expect = 6.5
Identities = 18/99 (18%), Positives = 44/99 (44%), Gaps = 4/99 (4%)
Query: 22 TVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEE----EKIK 77
+ + + E + K + S +++ +++ ++L+ K+ + E +E E K
Sbjct: 341 RLADFYGNEEIKIELDKSKTPSENAQRYFKKYKKLKGAKVNLDRQLSELKEAIAYYESAK 400
Query: 78 TLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFP 116
T +E+ E K+ + E+ + + K K R+ + F
Sbjct: 401 TALEKAEGKKAIEEIREELIEEGLLKSKKKKRKKKEWFE 439
Score = 28.9 bits (65), Expect = 8.2
Identities = 26/105 (24%), Positives = 39/105 (37%), Gaps = 13/105 (12%)
Query: 19 LDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMER--------LEQEKIRQEELEREK 70
L D EE E K + S ++ KY++++ L + K E K
Sbjct: 342 LADFYGNEEIKIE-LDKSKTPSENAQRYFKKYKKLKGAKVNLDRQLSELKEAIAYYESAK 400
Query: 71 QEEEKIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKF 115
EK + E +E L E L K +K+ K +E KF
Sbjct: 401 TALEKAEGKKAIEEIREELIEEGLLKS----KKKKRKKKEWFEKF 441
>gnl|CDD|236080 PRK07734, motB, flagellar motor protein MotB; Reviewed.
Length = 259
Score = 28.9 bits (65), Expect = 7.1
Identities = 14/60 (23%), Positives = 30/60 (50%), Gaps = 1/60 (1%)
Query: 24 EEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKT-LIEE 82
E+E+ + + + + + K +EME L+ + + ++ +EKQ ++T L EE
Sbjct: 75 EDEKELSASSLEAEQAKKKEEAEAKKKKEMEELKAVQKKIDQYIKEKQLSSSLQTKLTEE 134
>gnl|CDD|178439 PLN02847, PLN02847, triacylglycerol lipase.
Length = 633
Score = 29.5 bits (66), Expect = 7.1
Identities = 27/72 (37%), Positives = 38/72 (52%), Gaps = 8/72 (11%)
Query: 20 DDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEM-ERLEQEKIRQE-ELEREKQEEEKIK 77
DD EEE +ED + +SS ++ E+ LE+E RQE E++ + QEEE
Sbjct: 471 DDEEEEEPLLSED-----RVITSSVEEEVTEGELWYELEKELQRQETEVDAQAQEEEA-A 524
Query: 78 TLIEETEEKEVL 89
E TEE+ VL
Sbjct: 525 AAKEITEEENVL 536
>gnl|CDD|224768 COG1855, COG1855, ATPase (PilT family) [General function prediction
only].
Length = 604
Score = 29.3 bits (66), Expect = 7.4
Identities = 23/80 (28%), Positives = 35/80 (43%), Gaps = 8/80 (10%)
Query: 16 RKALDDTVEEEETSAEDARKK--RKYSSSSSSSKMKYQEMERLEQE---KIRQEELEREK 70
++ L VE E A K KY K ++ +E++ KI + LE E+
Sbjct: 471 KRYLPGDVEVEVVGDGRAVVKVPEKYIPKVIGKGGK--RIKEIEKKLGIKIDVKPLE-EE 527
Query: 71 QEEEKIKTLIEETEEKEVLN 90
+E EK+ IEE + VL
Sbjct: 528 EEGEKVPVEIEEKGKHIVLY 547
>gnl|CDD|233044 TIGR00600, rad2, DNA excision repair protein (rad2). All proteins
in this family for which functions are known are flap
endonucleases that generate the 3' incision next to DNA
damage as part of nucleotide excision repair. This
family is related to many other flap endonuclease
families including the fen1 family. This family is based
on the phylogenomic analysis of JA Eisen (1999, Ph.D.
Thesis, Stanford University) [DNA metabolism, DNA
replication, recombination, and repair].
Length = 1034
Score = 29.5 bits (66), Expect = 7.5
Identities = 29/139 (20%), Positives = 56/139 (40%), Gaps = 13/139 (9%)
Query: 1 MDIGELLSYKPKSP---KRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLE 57
M++ + S K +S +D E + S ++ +K+ +E
Sbjct: 642 MEVEPMESEKEESESDGSFIEVDSVSSTLELQVPSKSQPTDESEENAENKV-----ASIE 696
Query: 58 QEKIRQ-EELEREKQEEEKIKTLIEETEE-KEVLNELMLKKMILLFEKRTLKNREMRIKF 115
E ++ E+L ++ EE+ I +IEE ++ + NE + I L E L+ + +
Sbjct: 697 GEHRKEIEDLLFDESEEDNIVGMIEEEKDADDFKNEW---QDISLEELEALEANLLAEQN 753
Query: 116 PDNAEKFMESEIELHTTIQ 134
A+K + I T Q
Sbjct: 754 SLKAQKQQQKRIAAEVTGQ 772
>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein. This family consists of
AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
retardation syndrome) nuclear proteins. These proteins
have been linked to human diseases such as acute
lymphoblastic leukaemia and mental retardation. The
family also contains a Drosophila AF4 protein homologue
Lilliputian which contains an AT-hook domain.
Lilliputian represents a novel pair-rule gene that acts
in cytoskeleton regulation, segmentation and
morphogenesis in Drosophila.
Length = 1154
Score = 29.1 bits (65), Expect = 7.7
Identities = 18/73 (24%), Positives = 33/73 (45%), Gaps = 10/73 (13%)
Query: 6 LLSYKPKSPKRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEE 65
LLS P P +K + E+++ + ++ K +S SSSK K + + +
Sbjct: 706 LLSRIPGHPYKKGVPPKPAEKDSLSAPKKQTSKTASEKSSSKGK----------RKHKND 755
Query: 66 LEREKQEEEKIKT 78
E +K E +K +
Sbjct: 756 EEADKIESKKQRL 768
>gnl|CDD|221245 pfam11822, DUF3342, Domain of unknown function (DUF3342). This
family of proteins is functionally uncharacterized.
This family is found in bacteria. This presumed domain
is typically between 170 and 303 amino acids in length.
The N-terminal half of this family is a BTB-like domain.
Length = 302
Score = 28.5 bits (64), Expect = 8.7
Identities = 27/121 (22%), Positives = 50/121 (41%), Gaps = 16/121 (13%)
Query: 449 KVDRLLELHFKYLSKHVVSIIA---NMLRNC-SGPQRQRLLSKFTENDHEKV-DRLLELH 503
++++L++ Y H+ I++ NM NC + RL FT N+ +V D+ +
Sbjct: 91 QMEQLVDECLMYCHAHLSEIVSSSCNM--NCLNDELVTRLAHMFTHNELARVKDKKDKFK 148
Query: 504 FKYLSK----VDEADKEQRDEDEDENYLRR--LEAGLFTLQLVDYIIVETCAAGAATIKQ 557
+ +K + + E + LRR L LFT + + C G ++
Sbjct: 149 PRLFTKLIQHLCDPLGEAPASHGRASGLRRCGLCGTLFTQGELKRL---ECCPGKISVGH 205
Query: 558 R 558
R
Sbjct: 206 R 206
>gnl|CDD|220818 pfam10595, UPF0564, Uncharacterized protein family UPF0564. This
family of proteins has no known function. However, one
of the members is annotated as an EF-hand family
protein.
Length = 349
Score = 29.0 bits (65), Expect = 8.7
Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 1/68 (1%)
Query: 11 PKSPKRKALD-DTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELERE 69
KS R LD + + E + + R+K S+ S +K+ ER E+E Q+E +
Sbjct: 237 HKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERKEKEAKEQQEKKEL 296
Query: 70 KQEEEKIK 77
+Q ++K K
Sbjct: 297 EQRKKKKK 304
>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
protein; Reviewed.
Length = 782
Score = 29.0 bits (66), Expect = 8.9
Identities = 13/49 (26%), Positives = 25/49 (51%), Gaps = 1/49 (2%)
Query: 42 SSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEETEEKEVLN 90
+S + E + E E + +E E+ K+E E+ K ++E E+K +
Sbjct: 523 ASLEELERELEQKAEEAEALLKE-AEKLKEELEEKKEKLQEEEDKLLEE 570
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 28.8 bits (65), Expect = 9.6
Identities = 18/62 (29%), Positives = 29/62 (46%), Gaps = 6/62 (9%)
Query: 31 EDARKK-RKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEET----EE 85
+ +K K +K E ER E+ + ++EE ++E+ E K+ L E EE
Sbjct: 255 PEVLRKVDKTREEEEEKILKAAEEERQEEAQEKKEEKKKEE-REAKLAKLSPEEQRKLEE 313
Query: 86 KE 87
KE
Sbjct: 314 KE 315
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 29.1 bits (65), Expect = 9.8
Identities = 20/79 (25%), Positives = 37/79 (46%), Gaps = 5/79 (6%)
Query: 53 MERLEQEKIRQEELEREKQEEEKIKTLIE---ETEEKEVLNELMLK--KMILLFEKRTLK 107
E++ + +EELER+K++EEK K + +KE +L + K++ K
Sbjct: 5 ESEAEKKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEK 64
Query: 108 NREMRIKFPDNAEKFMESE 126
R +N E F++ +
Sbjct: 65 KSRKRDVEDENPEDFIDPD 83
>gnl|CDD|225468 COG2916, Hns, DNA-binding protein H-NS [General function
prediction only].
Length = 128
Score = 27.4 bits (61), Expect = 9.8
Identities = 16/47 (34%), Positives = 25/47 (53%), Gaps = 2/47 (4%)
Query: 54 ERLEQ--EKIRQEELEREKQEEEKIKTLIEETEEKEVLNELMLKKMI 98
E LE+ EK Q ER+++E I + E E+ + EL++K I
Sbjct: 18 ELLEEMLEKEEQVVQERQEEEAAAIAEIEERQEKYGTIRELLIKDGI 64
>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex. This
entry is characterized by proteins with alternating
conserved and low-complexity regions. Bud13, together
with Snu17p and a newly identified factor,
Pml1p/Ylr016c, forms a novel trimeric complex called the
RES complex (pre-mRNA retention and splicing complex).
Subunits of this complex are not essential for viability
of yeasts but they are required for efficient splicing
in vitro and in vivo. Furthermore, inactivation of this
complex causes pre-mRNA leakage from the nucleus. Bud13
contains a unique, phylogenetically conserved C-terminal
region of unknown function.
Length = 141
Score = 27.7 bits (62), Expect = 9.9
Identities = 20/84 (23%), Positives = 39/84 (46%), Gaps = 5/84 (5%)
Query: 45 SSKMKYQEMERLEQEKIRQEELEREKQEEEKIKTLIEETEEKEVLNELMLKKMILLFEKR 104
+ K +E ER ++EK R+EE E+E + K EE E++ E K + +
Sbjct: 13 DIEEKREEKEREKEEKERKEEKEKEWGKGLVQK---EEREKRLEELEKAKNKPLARYADD 69
Query: 105 TLKNREM--RIKFPDNAEKFMESE 126
+ E+ + ++ D +F+ +
Sbjct: 70 EDYDEELKEQERWDDPMAQFLRKK 93
>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
chromosome partitioning].
Length = 1163
Score = 28.9 bits (65), Expect = 9.9
Identities = 21/114 (18%), Positives = 52/114 (45%), Gaps = 6/114 (5%)
Query: 15 KRKALDDTVEEEETSAEDARKKRKYSSSSSSSKMKYQEMERLEQEKIRQEELEREKQEEE 74
+ L+ +E+ E + + + + + E + E+++ EL ++E E
Sbjct: 675 ELAELEAQLEKLEEELKSLKNELRSLEDLLEELRRQLEELERQLEELK-RELAALEEELE 733
Query: 75 KIKTLIEETEEKEVLNELMLKKMILLFEKRTLKNREMRIKFPDNAEKFMESEIE 128
++++ +EE EE+ E L+++ + L+ E ++ + A ++ EIE
Sbjct: 734 QLQSRLEELEEELEELEEELEEL-----QERLEELEEELESLEEALAKLKEEIE 782
>gnl|CDD|202427 pfam02841, GBP_C, Guanylate-binding protein, C-terminal domain.
Transcription of the anti-viral guanylate-binding
protein (GBP) is induced by interferon-gamma during
macrophage induction. This family contains GBP1 and
GBP2, both GTPases capable of binding GTP, GDP and GMP.
Length = 297
Score = 28.4 bits (64), Expect = 10.0
Identities = 15/55 (27%), Positives = 31/55 (56%), Gaps = 7/55 (12%)
Query: 51 QEMERLEQEKIRQEEL--EREKQEEEKIKTLIE--ETEEKEVLNE---LMLKKMI 98
E E L +++ +E++ +E+ +E +K LIE E E +++L E ++ K+
Sbjct: 218 AEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKLLAEQERMLEHKLQ 272
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.316 0.134 0.368
Gapped
Lambda K H
0.267 0.0677 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 31,159,680
Number of extensions: 3192724
Number of successful extensions: 6902
Number of sequences better than 10.0: 1
Number of HSP's gapped: 5798
Number of HSP's successfully gapped: 507
Length of query: 607
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 504
Effective length of database: 6,369,140
Effective search space: 3210046560
Effective search space used: 3210046560
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 62 (27.8 bits)
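
The summary statistics above are internally consistent, and the arithmetic can be reproduced by hand. The sketch below is not part of the RPS-BLAST output. It assumes the usual conventions: the length adjustment is subtracted once from the query and once per database sequence, and raw scores are converted to bits with the Karlin-Altschul formula (lambda * S - ln K) / ln 2. The function names are hypothetical and chosen only for illustration. All numeric inputs are copied from the report.

```python
import math

# Values copied from the statistics block of the report.
QUERY_LENGTH = 607
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 103


def effective_lengths(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective database length).

    Assumes the adjustment is removed once from the query and once
    from every database sequence.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


eff_query, eff_db = effective_lengths(
    QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
)
search_space = eff_query * eff_db

# S1 uses the ungapped parameters, S2 the gapped ones.
s1_bits = bit_score(41, 0.316, 0.134)
s2_bits = bit_score(62, 0.267, 0.0677)

print(eff_query, eff_db, search_space)
print(round(s1_bits, 1), round(s2_bits, 1))
```

Because the report prints Lambda and K rounded to three significant figures, the bit scores agree with the reported values only to the first decimal place. The per-hit Expect values are not recomputed here, since this sketch makes no assumption about how the program adjusts the search space for individual database entries.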