RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy10214
(415 letters)
>gnl|CDD|192780 pfam11594, Med28, Mediator complex subunit 28. Mediator is a large
complex of up to 33 proteins that is conserved from
plants to fungi to humans - the number and
representation of individual subunits varying with
species. It is arranged into four different sections, a
core, a head, a tail and a kinase-activity part, and the
number of subunits within each of these is what varies
with species. Overall, Mediator regulates the
transcriptional activity of RNA polymerase II but it
would appear that each of the four different sections
has a slightly different function. Subunit Med28 of the
Mediator may function as a scaffolding protein within
Mediator by maintaining the stability of a submodule
within the head module, and components of this submodule
act together in a gene-regulatory programme to suppress
smooth muscle cell differentiation. Thus, mammalian
Mediator subunit Med28 functions as a repressor of
smooth muscle-cell differentiation, which could have
implications for disorders associated with abnormalities
in smooth muscle cell growth and differentiation,
including atherosclerosis, asthma, hypertension, and
smooth muscle tumours.
Length = 106
Score = 115 bits (289), Expect = 2e-31
Identities = 41/99 (41%), Positives = 60/99 (60%), Gaps = 3/99 (3%)
Query: 175 EIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKR 234
EI+ +DQ + KFLD+ARQ E FFLQKR LS KP+ +KE+ L+ ++ RK++L +
Sbjct: 1 EIRNYVDQLSQKFLDIARQKETFFLQKRNELSVFKPKKTLKEEAQKLKEEMQRKDQLQTK 60
Query: 235 HYDKIAVWQNLLSDLQGWAKSPA---HQGSTSSASGTTP 270
H KI W+NLL+D + K ++G A +TP
Sbjct: 61 HDSKIDYWENLLTDAEDVYKVRDEVPNEGRQRIAELSTP 99
Score = 105 bits (263), Expect = 9e-28
Identities = 36/82 (43%), Positives = 52/82 (63%), Gaps = 6/82 (7%)
Query: 74 EIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLARK 133
EI+ +DQ + KFLD+ARQ E FFLQKR LS KP+ +KE + L+ ++ RK
Sbjct: 1 EIRNYVDQLSQKFLDIARQKETFFLQKRNELSVFKPKKTLKE------EAQKLKEEMQRK 54
Query: 134 EELIKRHYDKIAVWQNLLSDLQ 155
++L +H KI W+NLL+D +
Sbjct: 55 DQLQTKHDSKIDYWENLLTDAE 76
>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1. Members
of this family are necessary for accurate chromosome
transmission during cell division.
Length = 804
Score = 46.3 bits (110), Expect = 3e-05
Identities = 24/131 (18%), Positives = 32/131 (24%), Gaps = 4/131 (3%)
Query: 265 ASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQI 324
G P + + P P P+ QQ Q + QQ+
Sbjct: 200 PPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPL----QQPQFPGLSQQM 255
Query: 325 HMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLG 384
Q P P PG P+G PPP + P
Sbjct: 256 PPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQ 315
Query: 385 PGGMGPGGLLQ 395
G L+Q
Sbjct: 316 RGPQFREQLVQ 326
Score = 40.5 bits (95), Expect = 0.001
Identities = 29/134 (21%), Positives = 34/134 (25%), Gaps = 7/134 (5%)
Query: 267 GTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHM 326
P P P +A P P P + + Q Q +
Sbjct: 173 PQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPL 232
Query: 327 QHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG---NL 383
Q P P P G S M P P+ PP P A P P L
Sbjct: 233 PPQLPQQPPPLQQPQFP-GLSQQMP---PPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGL 288
Query: 384 GPGGMGPGGLLQGP 397
G P Q P
Sbjct: 289 PQGQNAPLPPPQQP 302
Score = 38.2 bits (89), Expect = 0.008
Identities = 19/93 (20%), Positives = 21/93 (22%), Gaps = 1/93 (1%)
Query: 305 MQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNA 364
+QQ Q + P GP G G P
Sbjct: 158 EVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQ 217
Query: 365 GPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
P PS P P L P P Q P
Sbjct: 218 QFLPAPSQAPAQPPLPPQL-PQQPPPLQQPQFP 249
Score = 36.7 bits (85), Expect = 0.026
Identities = 19/107 (17%), Positives = 24/107 (22%), Gaps = 7/107 (6%)
Query: 270 PPNSTPTQSGPGISAMG-GPLPGMMGGMAP-IVPGSTMQPMSGMPQQQQQVQMQQQIHMQ 327
P Q + PG+ M P Q PQ Q Q Q H
Sbjct: 228 AQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPH-P 286
Query: 328 HMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGP 374
+ Q P PP P + + G
Sbjct: 287 GLPQGQNAPLPPPQQPQLLPL----VQQPQGQQRGPQFREQLVQLSQ 329
Score = 30.5 bits (69), Expect = 2.0
Identities = 16/76 (21%), Positives = 22/76 (28%)
Query: 256 PAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQ 315
P Q PP + PT PLP + G ++
Sbjct: 263 PQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQRGPQFRE 322
Query: 316 QQVQMQQQIHMQHMQQ 331
Q VQ+ QQ Q+
Sbjct: 323 QLVQLSQQQREALSQE 338
>gnl|CDD|219971 pfam08690, GET2, GET complex subunit GET2. This family corresponds
to the GET complex subunit GET2. The GET complex is
involved in the retrieval of ER resident proteins from
the Golgi.
Length = 298
Score = 43.6 bits (103), Expect = 9e-05
Identities = 23/121 (19%), Positives = 39/121 (32%), Gaps = 6/121 (4%)
Query: 259 QGSTSSASGTTPPNSTPT-QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQ 317
QGS+ + ++ P +G SA P + + I P + S +
Sbjct: 34 QGSSVKLVSKSVLDAKPEDNTGSTTSAHDQSTPEIQDILEAIDP-PKDESESPAENIDPE 92
Query: 318 VQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGM 377
V+M QQ+ + Q G G ++ + M G G P + P
Sbjct: 93 VEMFQQL----AKMQQQGNGSDNPPADDSTADLFSMLLQMGGGDGPDSESPASAQEPQEA 148
Query: 378 G 378
Sbjct: 149 P 149
>gnl|CDD|238103 cd00176, SPEC, Spectrin repeats, found in several proteins involved
in cytoskeletal structure; family members include
spectrin, alpha-actinin and dystrophin; the spectrin
repeat forms a three helix bundle with the second helix
interrupted by proline in some sequences; the repeats
are independent folding units; tandem repeats are found
in differing numbers and arrange in an antiparallel
manner to form dimers; the repeats are defined by a
characteristic tryptophan (W) residue in helix A and a
leucine (L) at the carboxyl end of helix C and separated
by a linker of 5 residues; two copies of the repeat are
present here.
Length = 213
Score = 42.0 bits (99), Expect = 2e-04
Identities = 40/177 (22%), Positives = 77/177 (43%), Gaps = 25/177 (14%)
Query: 85 KFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKI 144
+FL A ++EA+ +K LLS+ ++ V + K L +LA EE ++ +++
Sbjct: 4 QFLRDADELEAWLSEKEELLSSTDYGDDLESVEALLKKHEALEAELAAHEERVEAL-NEL 62
Query: 145 AVWQNLLS-------DLQSCLQVLTKE-DEVSTTLEKDEIKLEIDQATLKFLDLARQMEA 196
+ L+ ++Q L+ L + +E+ E+ +LE +F A +E
Sbjct: 63 G--EQLIEEGHPDAEEIQERLEELNQRWEELRELAEERRQRLEEALDLQQFFRDADDLEQ 120
Query: 197 FFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKRH---YDKIAVWQNLLSDLQ 250
+ +K L++ DL DL EEL+K+H +++ + L L
Sbjct: 121 WLEEKEAALASE-----------DLGKDLESVEELLKKHKELEEELEAHEPRLKSLN 166
>gnl|CDD|224259 COG1340, COG1340, Uncharacterized archaeal coiled-coil protein
[Function unknown].
Length = 294
Score = 42.0 bits (99), Expect = 4e-04
Identities = 36/189 (19%), Positives = 79/189 (41%), Gaps = 37/189 (19%)
Query: 63 DGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKD 122
+ +L ++ EI++ L+K+ S L PE +E +V +
Sbjct: 100 NEFNLGGRSIKSLEREIER----------------LEKKQQTSVLTPE---EERELV-QK 139
Query: 123 IVDLRHDLARKEELIKRH------YDKIAVWQNLLSDLQSCLQVLTKE-----DEVSTTL 171
I +LR +L ++ ++ + +I + ++ +Q L E +E+
Sbjct: 140 IKELRKELEDAKKALEENEKLKELKAEIDELKKKAREIHEKIQELANEAQEYHEEMIKLF 199
Query: 172 EK-DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELI-VKEDIVDLRHDLARKE 229
E+ DE++ E D+ +F++L+++++ + L+ EL +++ I LR +
Sbjct: 200 EEADELRKEADELHEEFVELSKKID----ELHEEFRNLQNELRELEKKIKALRAKEKAAK 255
Query: 230 ELIKRHYDK 238
KR K
Sbjct: 256 RREKREELK 264
>gnl|CDD|224495 COG1579, COG1579, Zn-ribbon protein, possibly nucleic acid-binding
[General function prediction only].
Length = 239
Score = 41.2 bits (97), Expect = 4e-04
Identities = 27/115 (23%), Positives = 46/115 (40%), Gaps = 4/115 (3%)
Query: 126 LRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATL 185
L + R E IK + + L L L+ L E E + +++ EI +
Sbjct: 15 LDLEKDRLEPRIKEIRKALKKAKAELEALNKALEALEIELE-DLENQVSQLESEIQEIRE 73
Query: 186 KFLDLARQMEAFFLQKRFLLSALKPEL-IVKEDIVDLRHDLARKEELIKRHYDKI 239
+ ++ A ++ AL E+ I KE I L +LA E I++ +I
Sbjct: 74 RIKRAEEKLSAVKDEREL--RALNIEIQIAKERINSLEDELAELMEEIEKLEKEI 126
Score = 38.9 bits (91), Expect = 0.003
Identities = 22/137 (16%), Positives = 51/137 (37%), Gaps = 19/137 (13%)
Query: 71 EKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDL 130
+ +++ EI + + ++ A ++ AL E+ + + + I L +L
Sbjct: 60 QVSQLESEIQEIRERIKRAEEKLSAVKDEREL--RALNIEIQIAK-----ERINSLEDEL 112
Query: 131 ARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEV----------STTLEKDEIKLEI 180
A E I++ +I + L L+ L E + +++E+K ++
Sbjct: 113 AELMEEIEKLEKEIEDLKERLERLEKNLAEAEARLEEEVAEIREEGQELSSKREELKEKL 172
Query: 181 DQATLKFLDLARQMEAF 197
D L + R +
Sbjct: 173 DPELLSEYE--RIRKNK 187
>gnl|CDD|220309 pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex
non-fungal. The approx. 70 residue Med15 domain of the
ARC-Mediator co-activator is a three-helix bundle with
marked similarity to the KIX domain. The sterol
regulatory element binding protein (SREBP) family of
transcription activators use the ARC105 subunit to
activate target genes in the regulation of cholesterol
and fatty acid homeostasis. In addition, Med15 is a
critical transducer of gene activation signals that
control early metazoan development.
Length = 768
Score = 41.1 bits (96), Expect = 0.001
Identities = 26/119 (21%), Positives = 30/119 (25%), Gaps = 1/119 (0%)
Query: 277 QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGP 336
G P M P G G P QQQ Q Q +Q+ QQQ M
Sbjct: 181 NQGQQGPVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGGQQQQNPQMQQQLQNQQQQQMDQ 240
Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQ 395
P+ G G P P GM P + Q
Sbjct: 241 QQGPADAQAQMGQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQ-SQLGMLPNQMQQ 298
Score = 40.8 bits (95), Expect = 0.001
Identities = 28/111 (25%), Positives = 34/111 (30%), Gaps = 4/111 (3%)
Query: 255 SPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQ 314
G +AS + Q G + MG P M + + PG M
Sbjct: 100 MGQQMGGPGTASNLLQSLNVRGQMPMGAAGMG---PHQMSRVGTMQPGGQAGGMMQQSSG 156
Query: 315 QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAG 365
Q Q Q Q+ Q Q QG G G GP G P G G
Sbjct: 157 QPQSQQPNQMGPQQGQAQGQAG-GMNQGQQGPVGQQQPPQMGQPGMPGGGG 206
Score = 40.0 bits (93), Expect = 0.002
Identities = 28/150 (18%), Positives = 36/150 (24%), Gaps = 12/150 (8%)
Query: 257 AHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGM-MGGMAPIVPGSTMQPMSGMPQQQ 315
A Q P Q+ G G + M G P G M
Sbjct: 57 AAQQQVLQGGQGMPDPINALQNLTGQGTRGPQMGPMGPGPGRP--MGQQMGGPGTASNLL 114
Query: 316 QQVQMQQQIHMQH-----MQQQGMGPGGPPSGPGGPSSGMMFMG----PGGPRGGGNAGP 366
Q + ++ Q+ M Q +G P GG P
Sbjct: 115 QSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQSSGQPQSQQPNQMGPQQGQAQ 174
Query: 367 PPFPSAGPGGMGGPGNLGPGGMGPGGLLQG 396
G G G P MG G+ G
Sbjct: 175 GQAGGMNQGQQGPVGQQQPPQMGQPGMPGG 204
Score = 38.4 bits (89), Expect = 0.007
Identities = 24/91 (26%), Positives = 26/91 (28%), Gaps = 5/91 (5%)
Query: 277 QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGP 336
Q G GG P M G VP P Q Q + Q MQ M G
Sbjct: 250 QMGQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQSQLGMLPNQ---MQQMPGGGQ-- 304
Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP 367
GGP G P + GG
Sbjct: 305 GGPGQPMGPPPQRPGAVPQGGQAVQQGVMSA 335
Score = 37.7 bits (87), Expect = 0.011
Identities = 30/99 (30%), Positives = 33/99 (33%), Gaps = 16/99 (16%)
Query: 302 GSTMQPMSGMPQQQQ-----QVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
G Q GM QQ QV MQQQ Q QQ +G M M G
Sbjct: 252 GQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQSQLGMLPNQ---------MQQMPGG 302
Query: 357 GPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQ 395
G G G P P PG + G G+ G Q
Sbjct: 303 GQGGPGQPMGP--PPQRPGAVPQGGQAVQQGVMSAGQQQ 339
Score = 36.1 bits (83), Expect = 0.031
Identities = 31/144 (21%), Positives = 41/144 (28%), Gaps = 13/144 (9%)
Query: 256 PAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQ 315
P Q S G P N G G G P+ + G M Q
Sbjct: 279 PPQQQPQQSQLGMLP-NQMQQMPGGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQ 337
Query: 316 QQVQMQQQIHMQHM-----QQQGMGPGGPPSGPGGPSSGMMFMGPGG----PRGGGNAGP 366
QQ++ + +M+ QQQ G P + +G GG G
Sbjct: 338 QQLKQMKLRNMRGQQQTQQQQQQQGGNHPAAHQQQM---NQQVGQGGQMVALGYLNIQGN 394
Query: 367 PPFPSAGPGGMGGPGNLGPGGMGP 390
A P G PG + P
Sbjct: 395 QGGLGANPMQQGQPGMMSSPSPVP 418
Score = 35.7 bits (82), Expect = 0.048
Identities = 34/161 (21%), Positives = 45/161 (27%), Gaps = 18/161 (11%)
Query: 258 HQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQ 317
G PP P G A+ + + M+ QQQQQ
Sbjct: 301 GGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQQLKQMKLRNMRGQQQTQQQQQQ 360
Query: 318 VQMQQQIHMQHMQQQGMGPGGPPSGPGGPSS-----------------GMMFMGPGGPRG 360
Q Q +G GG G + GMM P+
Sbjct: 361 QGGNHPAAHQQQMNQQVGQGGQMVALGYLNIQGNQGGLGANPMQQGQPGMMSSPSPVPQV 420
Query: 361 GGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPLAYL 401
N P P GGPG+ P GG++ P A +
Sbjct: 421 QTNQSMPQPPQPSVPSPGGPGSQ-PPQSVSGGMIPSPPALM 460
Score = 35.7 bits (82), Expect = 0.049
Identities = 37/177 (20%), Positives = 43/177 (24%), Gaps = 35/177 (19%)
Query: 250 QGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAM--------GGPLPGMMGGMAPIVP 301
G S S +G G M GG GMM + P
Sbjct: 100 MGQQMGGPGTASNLLQSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQSSG-QP 158
Query: 302 GSTM-----------QPMSGMPQQQQQVQMQQQIHMQ--------HMQQQGMGPGGPPSG 342
S Q +G Q QQ + QQ Q Q M G P G
Sbjct: 159 QSQQPNQMGPQQGQAQGQAGGMNQGQQGPVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGG 218
Query: 343 P--GGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
P + GP A G G GGM P + G
Sbjct: 219 QQQQNPQMQQQLQNQQQQQMDQQQGP-----ADAQAQMGQQQQGQGGMQPQQMQGGQ 270
Score = 33.8 bits (77), Expect = 0.19
Identities = 27/113 (23%), Positives = 32/113 (28%), Gaps = 9/113 (7%)
Query: 271 PNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQ 330
N Q G G + M PGMM +P +Q MPQ Q
Sbjct: 389 LNIQGNQGGLGANPMQQGQPGMMSSPSP---VPQVQTNQSMPQPPQPSVPSPGGPGSQPP 445
Query: 331 QQGMGPGGPPSGPGGPS-SGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGN 382
Q G P PS S M P R + G + PG
Sbjct: 446 QSVSGGMIPSPPALMPSPSPQMSQSPASQR-----TIQQDMVSPGGPLNTPGQ 493
Score = 33.8 bits (77), Expect = 0.19
Identities = 27/97 (27%), Positives = 29/97 (29%), Gaps = 15/97 (15%)
Query: 314 QQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGP------- 366
QQQV Q + G GP G M GPG P G GP
Sbjct: 58 AQQQVLQGGQGMPDPINALQNLTGQGTRGPQM---GPMGPGPGRPMGQQMGGPGTASNLL 114
Query: 367 -----PPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPL 398
G GMG G M PGG G +
Sbjct: 115 QSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMM 151
Score = 32.3 bits (73), Expect = 0.51
Identities = 28/96 (29%), Positives = 35/96 (36%), Gaps = 14/96 (14%)
Query: 254 KSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPL----PGMMGGMAPIVPGSTMQPMS 309
P S S S P P + + GGP + GGM P P +
Sbjct: 406 GQPGMMSSPSPVPQVQTNQSMPQPPQPSVPSPGGPGSQPPQSVSGGMIP-------SPPA 458
Query: 310 GMPQQQQQVQMQQQIHMQH-MQQQGMGPGGPPSGPG 344
MP QM Q Q +QQ + PGGP + PG
Sbjct: 459 LMPSPSP--QMSQSPASQRTIQQDMVSPGGPLNTPG 492
Score = 32.3 bits (73), Expect = 0.58
Identities = 23/123 (18%), Positives = 28/123 (22%)
Query: 270 PPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHM 329
P Q G L + + + M P Q Q MQ
Sbjct: 95 GPGRPMGQQMGGPGTASNLLQSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQS 154
Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
Q GP + G + G P PG GG G G
Sbjct: 155 SGQPQSQQPNQMGPQQGQAQGQAGGMNQGQQGPVGQQQPPQMGQPGMPGGGGQGQMQQQG 214
Query: 390 PGG 392
G
Sbjct: 215 QPG 217
Score = 30.7 bits (69), Expect = 1.6
Identities = 17/116 (14%), Positives = 26/116 (22%), Gaps = 1/116 (0%)
Query: 243 QNLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPG 302
+ LQ + Q + + Q G M G + P
Sbjct: 224 PQMQQQLQNQQQQQMDQQQGPADAQAQMGQQQQGQGGMQPQQMQGGQMQVPMQQQPPQQQ 283
Query: 303 STMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGP 358
+ +P Q QQ+ Q GG + M G
Sbjct: 284 PQQSQLGMLPNQMQQMPGGGQ-GGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQ 338
>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
chromosome partitioning].
Length = 1163
Score = 40.1 bits (94), Expect = 0.002
Identities = 36/209 (17%), Positives = 86/209 (41%), Gaps = 40/209 (19%)
Query: 71 EKDEIKLEIDQATLKFLDLARQ-----MEAFFLQKRFLLSALKPELIVKEVNMVTKDIVD 125
E +E++ E+++ + L+L + E L++R + E + + + + + I
Sbjct: 275 ELEELREELEELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEA 334
Query: 126 LRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVSTTLEKD---------- 174
L+ +L +E L++ +A + +L+ L L +E +E+ L ++
Sbjct: 335 LKEELEERETLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREELAELEAELAE 394
Query: 175 ------EIKLEIDQATLKFLDLARQMEAF----------FLQKRFLLSALKPELI----- 213
E+K EI+ + L+ ++E + + L L EL
Sbjct: 395 IRNELEELKREIESLEERLERLSERLEDLKEELKELEAELEELQTELEELNEELEELEEQ 454
Query: 214 ---VKEDIVDLRHDLARKEELIKRHYDKI 239
+++ + +L +LA +E ++R ++
Sbjct: 455 LEELRDRLKELERELAELQEELQRLEKEL 483
>gnl|CDD|219791 pfam08317, Spc7, Spc7 kinetochore protein. This domain is found in
cell division proteins which are required for
kinetochore-spindle association.
Length = 321
Score = 38.9 bits (91), Expect = 0.003
Identities = 24/133 (18%), Positives = 48/133 (36%), Gaps = 19/133 (14%)
Query: 120 TKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLE 179
+ + L+ L E +KR + + NL++ ++ L+ +K E
Sbjct: 142 MQLLEGLKEGLEENLEGMKRDEELLNKDLNLINSIKPKLRKK-----------LQALKEE 190
Query: 180 IDQATLKFLDLARQMEAFFLQKRFLLSALKPELI-VKEDIVDLRHDLARKEELIKRHYDK 238
I L+ LA ++ + L + EL + I + R L ++ ++
Sbjct: 191 IAS--LR--QLADELNLCDPLE---LEKARQELRSLSVKISEKRKQLEELQQELQELTIA 243
Query: 239 IAVWQNLLSDLQG 251
I N S+L
Sbjct: 244 IEALTNKKSELLE 256
Score = 33.1 bits (76), Expect = 0.24
Identities = 25/133 (18%), Positives = 47/133 (35%), Gaps = 17/133 (12%)
Query: 60 GLPDG--RSLSPLEKDEIKLEIDQATLKFLDLARQMEAFF--LQKRF-LLSALKPEL--- 111
GL +G +L +++DE L D + + ++ L++ L L EL
Sbjct: 147 GLKEGLEENLEGMKRDEELLNKDLNLIN--SIKPKLRKKLQALKEEIASLRQLADELNLC 204
Query: 112 -------IVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE 164
+E+ ++ I + R L ++ ++ I N S+L + K
Sbjct: 205 DPLELEKARQELRSLSVKISEKRKQLEELQQELQELTIAIEALTNKKSELLEEIAEAEKI 264
Query: 165 DEVSTTLEKDEIK 177
E EI
Sbjct: 265 REECRGWSAKEIS 277
>gnl|CDD|227361 COG5028, COG5028, Vesicle coat complex COPII, subunit SEC24/subunit
SFB2/subunit SFB3 [Intracellular trafficking and
secretion].
Length = 861
Score = 39.0 bits (91), Expect = 0.004
Identities = 19/92 (20%), Positives = 22/92 (23%), Gaps = 1/92 (1%)
Query: 277 QSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGP 336
Q G ++ A G P P QQQ + Q M G
Sbjct: 15 QVHTGAASSKKS-ARPHRAYANFSAGQMGMPPYTTPPLQQQSRRQIDQAATAMHNTGANN 73
Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPP 368
P S F P G P P
Sbjct: 74 PAPSVMSPAFQSQQKFSSPYGGSMADGTAPKP 105
>gnl|CDD|218108 pfam04487, CITED, CITED. CITED, CBP/p300-interacting
transactivator with ED-rich tail, are characterized by a
conserved 32-amino acid sequence at the C-terminus.
CITED proteins do not bind DNA directly and are thought
to function as transcriptional co-activators.
Length = 206
Score = 37.6 bits (87), Expect = 0.005
Identities = 27/115 (23%), Positives = 33/115 (28%), Gaps = 13/115 (11%)
Query: 278 SGPGISAMGGPLPGMMGGMAPIVPGSTMQP--MSGMPQQQQQVQMQQQIH-MQHMQQQGM 334
G G+ A G P M G M P +M M + Q + M MQ Q +
Sbjct: 49 PGGGMDASGRPRSAMSGPMGGGHPHQSMPAYMMFNPSSKPQPFMLVPGPQLMASMQLQKL 108
Query: 335 GPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGG-MGGPGNLGPGGM 388
G G P GGG P PG L P +
Sbjct: 109 N---------TQYQGHAGAPAGHPGGGGPQQFRPGAGQPPGMQHMPAPALPPNVI 154
>gnl|CDD|218704 pfam05701, DUF827, Plant protein of unknown function (DUF827).
This family consists of several plant proteins of
unknown function. Several sequences in this family are
described as being "myosin heavy chain-like".
Length = 484
Score = 38.4 bits (89), Expect = 0.006
Identities = 29/137 (21%), Positives = 61/137 (44%), Gaps = 12/137 (8%)
Query: 107 LKPELIVKEVNMV--------TKDIV-DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSC 157
LK EL V E + TK V DL+ L + E+ ++ + + +L+
Sbjct: 48 LKKELEVAEKEKLQVLKELESTKRTVEDLKLKLEKAEKEEQQAKQDSELAKLRAEELEQG 107
Query: 158 LQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKED 217
+Q L E ++ T E D +K E+ + ++ L + +A + + A K + ++
Sbjct: 108 IQELEVERYITATAELDSVKEELRKIRQEYDALVEERDAALKRAEEAICASK---VNEKK 164
Query: 218 IVDLRHDLARKEELIKR 234
+ +L ++ +E ++R
Sbjct: 165 VEELTKEIIAMKESLER 181
>gnl|CDD|227507 COG5180, PBP1, Protein interacting with poly(A)-binding protein
[RNA processing and modification].
Length = 654
Score = 38.2 bits (88), Expect = 0.007
Identities = 36/136 (26%), Positives = 46/136 (33%), Gaps = 18/136 (13%)
Query: 259 QGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSG-MPQQQQQ 317
Q ++ G P P MG P+ G P++ G M MP Q Q
Sbjct: 498 QQRQLNSMGNAVPGMNPAMGMNMGGMMGFPMGGPSASPNPMMNGFAAGSMGMYMPFQPQP 557
Query: 318 VQMQQQIHMQHMQQQGMGPGGPPSGPGGPS----SGMMFMGPGGPRGGGNAGPPPFPSAG 373
+ M + MG G G G S +G M GPG P G
Sbjct: 558 MFYHPSPQMMPV----MGSNGAEEGGGNISPHVPAGFMAAGPGAPMGA---------FGY 604
Query: 374 PGGMGGPGNLGPGGMG 389
PGG+ G +G G G
Sbjct: 605 PGGIPFQGMMGSGPSG 620
>gnl|CDD|218350 pfam04959, ARS2, Arsenite-resistance protein 2. Arsenite is a
carcinogenic compound which can act as a co-mutagen by
inhibiting DNA repair. Arsenite-resistance protein 2 is
thought to play a role in arsenite resistance.
Length = 211
Score = 37.5 bits (87), Expect = 0.007
Identities = 24/71 (33%), Positives = 27/71 (38%), Gaps = 5/71 (7%)
Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGP-----GNLGPG 386
G+ PG P P P + M + P P G G PPFP GG G G G
Sbjct: 141 GGLAPGLPGYPPQTPQALMPYGQPRPPMMGYGRGGPPFPPNQYGGGRGNYDEFRGQGGYY 200
Query: 387 GMGPGGLLQGP 397
G L GP
Sbjct: 201 GKPRNRDLDGP 211
Score = 27.8 bits (62), Expect = 8.1
Identities = 12/37 (32%), Positives = 14/37 (37%)
Query: 6 PGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 42
P GG G P +P P + G P MG G
Sbjct: 136 PKPDPGGLAPGLPGYPPQTPQALMPYGQPRPPMMGYG 172
Score = 27.8 bits (62), Expect = 8.1
Identities = 12/37 (32%), Positives = 14/37 (37%)
Query: 355 PGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 391
P GG G P +P P + G P MG G
Sbjct: 136 PKPDPGGLAPGLPGYPPQTPQALMPYGQPRPPMMGYG 172
>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421). This
family represents a conserved region approximately 350
residues long within a number of plant proteins of
unknown function.
Length = 357
Score = 37.6 bits (87), Expect = 0.008
Identities = 28/123 (22%), Positives = 35/123 (28%), Gaps = 14/123 (11%)
Query: 270 PPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHM 329
+ + P G P P P P SG P QQ Q Q
Sbjct: 202 AMQPPYSGAPPSQQFYGPPQPSPYMYGGPGGR-----PNSGFPSGQQPPPSQGQ------ 250
Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
+G G GPP G + P G + P P+A P + P G
Sbjct: 251 --EGYGYSGPPP-SKGNHGSVASYAPQGSSQSYSTAYPSLPAATVLPQALPMSSAPMSGG 307
Query: 390 PGG 392
G
Sbjct: 308 GSG 310
>gnl|CDD|130689 TIGR01628, PABP-1234, polyadenylate binding protein, human types 1,
2, 3, 4 family. These eukaryotic proteins recognize the
poly-A of mRNA and consist of four tandem RNA
recognition domains at the N-terminus (rrm: pfam00076)
followed by a PABP-specific domain (pfam00658) at the
C-terminus. The protein is involved in the transport of
mRNAs from the nucleus to the cytoplasm. There are four
paralogs in Homo sapiens which are expressed in testis
(GP:11610605_PABP3 ), platelets (SP:Q13310_PABP4 ),
broadly expressed (SP:P11940_PABP1) and of unknown
tissue range (SP:Q15097_PABP2).
Length = 562
Score = 37.1 bits (86), Expect = 0.015
Identities = 20/95 (21%), Positives = 22/95 (23%), Gaps = 17/95 (17%)
Query: 283 SAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSG 342
MG P+ G MG QP QQQ Q M P G
Sbjct: 385 LPMGSPMGGAMG-----------QPPYYGQGPQQQFNGQPLGW------PRMSMMPTPMG 427
Query: 343 PGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGM 377
PGGP R +
Sbjct: 428 PGGPLRPNGLAPMNAVRAPSRNAQNAAQKPPMQPV 462
Score = 32.1 bits (73), Expect = 0.63
Identities = 10/80 (12%), Positives = 17/80 (21%), Gaps = 3/80 (3%)
Query: 279 GPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGG 338
G G P M+ + + M P + M Q
Sbjct: 402 GQGPQQQFNGQPLGWPRMSMM--PTPMGPGGP-LRPNGLAPMNAVRAPSRNAQNAAQKPP 458
Query: 339 PPSGPGGPSSGMMFMGPGGP 358
P+ + + P
Sbjct: 459 MQPVMYPPNYQSLPLSQDLP 478
Score = 30.5 bits (69), Expect = 1.8
Identities = 24/89 (26%), Positives = 33/89 (37%), Gaps = 10/89 (11%)
Query: 315 QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSG---MMFMGPG---GPRGGGNAGPPP 368
Q++ Q + + Q MQ Q P P G + G GP + G
Sbjct: 362 QRKEQRRAHLQDQFMQLQPRMRQLPMGSPMGGAMGQPPYYGQGPQQQFNGQPLGWPRMSM 421
Query: 369 FPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
P P G GGP L P G+ P ++ P
Sbjct: 422 MP--TPMGPGGP--LRPNGLAPMNAVRAP 446
Score = 30.2 bits (68), Expect = 2.3
Identities = 17/84 (20%), Positives = 18/84 (21%), Gaps = 12/84 (14%)
Query: 307 PMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGP 366
PM G Q Q Q P MGPGGP G
Sbjct: 390 PMGGAMGQPPYYGQGP--QQQFNGQPLGWPRMSMMPTP--------MGPGGPLRP--NGL 437
Query: 367 PPFPSAGPGGMGGPGNLGPGGMGP 390
P + M P
Sbjct: 438 APMNAVRAPSRNAQNAAQKPPMQP 461
>gnl|CDD|216868 pfam02084, Bindin, Bindin.
Length = 239
Score = 36.4 bits (84), Expect = 0.015
Identities = 26/66 (39%), Positives = 28/66 (42%), Gaps = 2/66 (3%)
Query: 327 QHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGN-AGPPPFPSAGPGGMGGPGNLGP 385
Q M Q MG G P+ G G GGP GGG G GP G GG G+ GP
Sbjct: 9 QAMNPQ-MGGGNYPAPGQPAQQGYANQGMGGPVGGGGGPGAGGGAPGGPVGGGGGGSGGP 67
Query: 386 GGMGPG 391
G G
Sbjct: 68 PGGGEV 73
Score = 31.8 bits (72), Expect = 0.52
Identities = 21/87 (24%), Positives = 23/87 (26%), Gaps = 8/87 (9%)
Query: 307 PMSGMPQQQ-QQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAG 365
M PQ Q+ QQG G GG G G GG G
Sbjct: 3 NMQNYPQAMNPQMGGGNYPAPGQPAQQGYANQGMGGPVGG-------GGGPGAGGGAPGG 55
Query: 366 PPPFPSAGPGGMGGPGNLGPGGMGPGG 392
P G GG G G +
Sbjct: 56 PVGGGGGGSGGPPGGGEVAGEAEDAMS 82
>gnl|CDD|219837 pfam08430, Fork_head_N, Forkhead N-terminal region. The region
described in this family is found towards the N-terminus
of various eukaryotic fork head/HNF-3-related
transcription factors (which contain the pfam00250
domain). These proteins play key roles in embryogenesis,
maintenance of differentiated cell states, and
tumorigenesis.
Length = 137
Score = 35.3 bits (81), Expect = 0.017
Identities = 21/100 (21%), Positives = 28/100 (28%), Gaps = 2/100 (2%)
Query: 283 SAMGGPLPGMMGGMAPIVPGSTM-QPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPS 341
S++ G + M M P +T + M GM PG +
Sbjct: 11 SSVSGGMVYSMNSMNTYGPMNTSQGSANSSMNMSGYAGPGAMNGMSSSSMNGMSPGYGGA 70
Query: 342 GPGGPSSGMMFMGPG-GPRGGGNAGPPPFPSAGPGGMGGP 380
G GM MG P G A P +G
Sbjct: 71 GSPMGMMGMSSMGTSLSPSGTMGAMGPMPAGSGGSLSPNM 110
Score = 32.6 bits (74), Expect = 0.13
Identities = 26/116 (22%), Positives = 36/116 (31%), Gaps = 17/116 (14%)
Query: 260 GSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQ 319
S +S + P N++ + ++ G PG M GM+ S+M MS
Sbjct: 19 YSMNSMNTYGPMNTSQGSANSSMNMSGYAGPGAMNGMSS----SSMNGMSPGY------- 67
Query: 320 MQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPG 375
GM PS M MGP GG+ P S
Sbjct: 68 ------GGAGSPMGMMGMSSMGTSLSPSGTMGAMGPMPAGSGGSLSPNMSMSRASS 117
Score = 31.9 bits (72), Expect = 0.25
Identities = 16/95 (16%), Positives = 25/95 (26%), Gaps = 6/95 (6%)
Query: 260 GSTSSASGTTPPNSTPTQSGPGISAMG-GPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQV 318
+TS S + N + ++ M + GM G M MS M +
Sbjct: 30 MNTSQGSANSSMNMSGYAGPGAMNGMSSSSMNGMSPGYGGAGSPMGMMGMSSMG---TSL 86
Query: 319 QMQQQIHMQHMQQQGMGPGG--PPSGPGGPSSGMM 351
+ G G S S +
Sbjct: 87 SPSGTMGAMGPMPAGSGGSLSPNMSMSRASSQNNL 121
>gnl|CDD|240419 PTZ00440, PTZ00440, reticulocyte binding protein 2-like protein;
Provisional.
Length = 2722
Score = 37.1 bits (86), Expect = 0.018
Identities = 29/137 (21%), Positives = 60/137 (43%), Gaps = 27/137 (19%)
Query: 49 LAYLEKTTSNIGLPDGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALK 108
L Y +K+ NI DG L L+K++ + E ++ + L++ + L K+
Sbjct: 965 LEYYDKSKENINGNDGTHLEKLDKEKDEWEHFKSEIDKLNVNYNI----LNKKI------ 1014
Query: 109 PELIVKE----VNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE 164
+LI K+ + ++ K I + ++ EE + ++ +LL +++ L
Sbjct: 1015 DDLIKKQHDDIIELIDKLIKEKGKEI---EEKVDQYI-------SLLEKMKTKLSSFHFN 1064
Query: 165 DEVSTT---LEKDEIKL 178
++ K+EIKL
Sbjct: 1065 IDIKKYKNPKIKEEIKL 1081
>gnl|CDD|222878 PHA02562, 46, endonuclease subunit; Provisional.
Length = 562
Score = 36.9 bits (86), Expect = 0.019
Identities = 40/170 (23%), Positives = 68/170 (40%), Gaps = 25/170 (14%)
Query: 71 EKDEIKLEIDQATLKFLDLARQME----AF--FLQKRFLLSALKPELIVKEVNMVTKDIV 124
E IK EI++ T + L+L +E A + + K E K + M K V
Sbjct: 228 EAKTIKAEIEELTDELLNLVMDIEDPSAALNKLNTAAAKIKS-KIEQFQKVIKMYEKGGV 286
Query: 125 --DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQ 182
++ + I + DK+ Q+ L L T DE+ EI E ++
Sbjct: 287 CPTCTQQISEGPDRITKIKDKLKELQHSLEKLD------TAIDELE------EIMDEFNE 334
Query: 183 ATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLA-RKEEL 231
+ K L+L ++ K+ L++ + VK I +L+ + EEL
Sbjct: 335 QSKKLLELKNKISTN---KQSLITLVDKAKKVKAAIEELQAEFVDNAEEL 381
Score = 30.0 bits (68), Expect = 2.5
Identities = 14/89 (15%), Positives = 42/89 (47%), Gaps = 7/89 (7%)
Query: 110 ELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVST 169
E I+ E N +K +++L++ ++ ++ + DK + + +LQ+ + + +E++
Sbjct: 326 EEIMDEFNEQSKKLLELKNKISTNKQSLITLVDKAKKVKAAIEELQA--EFVDNAEELA- 382
Query: 170 TLEKDEIKLEIDQATLK----FLDLARQM 194
L+ + K+ ++ L + +
Sbjct: 383 KLQDELDKIVKTKSELVKEKYHRGIVTDL 411
>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
Transcription initiation factor IIA (TFIIA) is a
heterotrimer, the three subunits being known as alpha,
beta, and gamma, in order of molecular weight. The N and
C-terminal domains of the gamma subunit are represented
in pfam02268 and pfam02751, respectively. This family
represents the precursor that yields both the alpha and
beta subunits. The TFIIA heterotrimer is an essential
general transcription initiation factor for the
expression of genes transcribed by RNA polymerase II.
Together with TFIID, TFIIA binds to the promoter region;
this is the first step in the formation of a
pre-initiation complex (PIC). Binding of the rest of the
transcription machinery follows this step. After
initiation, the PIC does not completely dissociate from
the promoter. Some components, including TFIIA, remain
attached and re-initiate a subsequent round of
transcription.
Length = 332
Score = 36.6 bits (85), Expect = 0.020
Identities = 28/130 (21%), Positives = 39/130 (30%), Gaps = 12/130 (9%)
Query: 256 PAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGM----APIVPGSTMQPM--S 309
P Q + +G ++TPT S LP G P P+ +
Sbjct: 70 PPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAGPAGPTIQTEPGQLYPVQVPVMVT 129
Query: 310 GMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMG--PGGPRGGGNAGPP 367
P Q QQ +Q +QQ G P+ PS + N P
Sbjct: 130 QNPANSPLDQPAQQRALQQLQQ----RYGAPASGQLPSQQQSAQKNDESQLQQQPNGETP 185
Query: 368 PFPSAGPGGM 377
P + G G
Sbjct: 186 PQQTDGAGDD 195
Score = 32.4 bits (74), Expect = 0.40
Identities = 20/92 (21%), Positives = 23/92 (25%), Gaps = 9/92 (9%)
Query: 297 APIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
AP QP+ P Q + QH G PP+ P GP
Sbjct: 55 APPPVAQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALP------AGPA 108
Query: 357 GPR---GGGNAGPPPFPSAGPGGMGGPGNLGP 385
GP G P P P
Sbjct: 109 GPTIQTEPGQLYPVQVPVMVTQNPANSPLDQP 140
>gnl|CDD|240291 PTZ00146, PTZ00146, fibrillarin; Provisional.
Length = 293
Score = 35.9 bits (83), Expect = 0.027
Identities = 21/49 (42%), Positives = 21/49 (42%), Gaps = 2/49 (4%)
Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG 381
G G GG G GG G G GG RGGG G GG GG G
Sbjct: 7 GGGRGGGRGGGGGGGRG--GGGRGGGRGGGRGRGRGGGGGGRGGGGGGG 53
Score = 35.1 bits (81), Expect = 0.048
Identities = 18/39 (46%), Positives = 18/39 (46%), Gaps = 1/39 (2%)
Query: 5 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 43
G GG RGGG G G GG GG G GG G G
Sbjct: 9 GRGGGRGGGGGGGRGGGGRG-GGRGGGRGRGRGGGGGGR 46
Score = 35.1 bits (81), Expect = 0.048
Identities = 18/39 (46%), Positives = 18/39 (46%), Gaps = 1/39 (2%)
Query: 354 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
G GG RGGG G G GG GG G GG G G
Sbjct: 9 GRGGGRGGGGGGGRGGGGRG-GGRGGGRGRGRGGGGGGR 46
Score = 32.8 bits (75), Expect = 0.26
Identities = 24/60 (40%), Positives = 24/60 (40%), Gaps = 4/60 (6%)
Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
GMG G GG G G GG GGG G G GG G G GG GPG
Sbjct: 1 GMGGGFGGGRGGGRGGG----GGGGRGGGGRGGGRGGGRGRGRGGGGGGRGGGGGGGPGK 56
Score = 30.1 bits (68), Expect = 1.8
Identities = 18/42 (42%), Positives = 18/42 (42%), Gaps = 2/42 (4%)
Query: 3 FMGPGGPRGGGNAGPPPFPSAGPGGMG-GPGNLGPGGMGPGG 43
MG GG GG G G GG G G G G G G GG
Sbjct: 1 GMG-GGFGGGRGGGRGGGGGGGRGGGGRGGGRGGGRGRGRGG 41
Score = 29.7 bits (67), Expect = 2.8
Identities = 17/46 (36%), Positives = 18/46 (39%), Gaps = 7/46 (15%)
Query: 352 FMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
MG G G G G GG GG G G GG GG +G
Sbjct: 1 GMGGGFGGGRGGGR-------GGGGGGGRGGGGRGGGRGGGRGRGR 39
Score = 29.3 bits (66), Expect = 3.4
Identities = 17/39 (43%), Positives = 17/39 (43%)
Query: 5 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 43
G GG GGG G G GG G G GG GPG
Sbjct: 18 GGGGRGGGGRGGGRGGGRGRGRGGGGGGRGGGGGGGPGK 56
>gnl|CDD|133051 cd06429, GT8_like_1, GT8_like_1 represents a subfamily of GT8 with
unknown function. A subfamily of glycosyltransferase
family 8 with unknown function: Glycosyltransferase
family 8 comprises enzymes with a number of known
activities; lipopolysaccharide galactosyltransferase,
lipopolysaccharide glucosyltransferase 1, glycogenin
glucosyltransferase and inositol
1-alpha-galactosyltransferase. It is classified as a
retaining glycosyltransferase, based on the relative
anomeric stereochemistry of the substrate and product in
the reaction catalyzed.
Length = 257
Score = 35.4 bits (82), Expect = 0.038
Identities = 34/160 (21%), Positives = 56/160 (35%), Gaps = 30/160 (18%)
Query: 138 KRHYDKIAVWQNLLSDLQSCLQVLTKEDEV-STTLEKDEIKLEIDQATLKFLDLARQMEA 196
++Y + W +L + ++VL +D ++ D + +A L R+ E
Sbjct: 38 NQNYGAMRSWFDLNPLKIATVKVLNFDDFKLLGKVKVDSLMQLESEADTSNLK-QRKPEY 96
Query: 197 FFL--QKRFLLSALKPEL---IVKEDIVDLRHDLARKEELIKRHYD-KIA-----VW--- 242
L RF L L P+L I +D V ++ DL EL +A W
Sbjct: 97 ISLLNFARFYLPELFPKLEKVIYLDDDVVVQKDL---TELWNTDLGGGVAGAVETSWNPG 153
Query: 243 -----------QNLLSDLQGWAKSPAHQGSTSSASGTTPP 271
QN+ + W + + T T PP
Sbjct: 154 VNVVNLTEWRRQNVTETYEKWMELNQEEEVTLWKLITLPP 193
>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
bacterial type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. This family
represents the SMC protein of most bacteria. The smc
gene is often associated with scpB (TIGR00281) and scpA
genes, where scp stands for segregation and condensation
protein. SMC was shown (in Caulobacter crescentus) to be
induced early in S phase but present and bound to DNA
throughout the cell cycle [Cellular processes, Cell
division, DNA metabolism, Chromosome-associated
proteins].
Length = 1179
Score = 35.8 bits (83), Expect = 0.043
Identities = 33/186 (17%), Positives = 74/186 (39%), Gaps = 32/186 (17%)
Query: 64 GRSLSPLE------------KDEIK-LEIDQATLKFLDLARQMEAFFLQKRFLLSALKPE 110
R L LE K E++ LE+ L+ +L ++E
Sbjct: 199 ERQLKSLERQAEKAERYKELKAELRELELALLVLRLEELREELEELQ------------- 245
Query: 111 LIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVST 169
+E+ +++ +L +L EE ++ +++ + + +LQ L L E +
Sbjct: 246 ---EELKEAEEELEELTAELQELEEKLEELRLEVSELEEEIEELQKELYALANEISRLEQ 302
Query: 170 TLEKDEIKLEIDQATLKFLDLAR-QMEAFFLQKRFLLSALKPEL-IVKEDIVDLRHDLAR 227
+ +L + L+ L+ ++E+ + L+ L+ +L +KE++ L +L
Sbjct: 303 QKQILRERLANLERQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEE 362
Query: 228 KEELIK 233
E ++
Sbjct: 363 LEAELE 368
>gnl|CDD|214826 smart00806, AIP3, Actin interacting protein 3. Aip3p/Bud6p is a
regulator of cell and cytoskeletal polarity in
Saccharomyces cerevisiae that was previously identified
as an actin-interacting protein. Actin-interacting
protein 3 (Aip3p) localizes at the cell cortex where
cytoskeleton assembly must be achieved to execute
polarized cell growth, and deletion of AIP3 causes gross
defects in cell and cytoskeletal polarity. Aip3p
localization is mediated by the secretory pathway;
mutations in early- or late-acting components of the
secretory apparatus lead to Aip3p mislocalization.
Length = 426
Score = 35.4 bits (82), Expect = 0.053
Identities = 31/129 (24%), Positives = 56/129 (43%), Gaps = 14/129 (10%)
Query: 64 GRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDI 123
R+ K ++ + D K DL +EA L+K ++P K++ V K++
Sbjct: 204 NRAYVESSKKKLSEDSDSLLTKVDDLQDIIEA--LRKDVAQRGVRPSK--KQLETVQKEL 259
Query: 124 VDLRHDLARKEELIKR---HYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLE- 179
R +L + EE I + KI W+ L + Q LT ++++ L++D K E
Sbjct: 260 ETARKELKKMEEYIDIEKPIWKKI--WEAELDKVCEEQQFLTLQEDLIADLKEDLEKAEE 317
Query: 180 ----IDQAT 184
++Q
Sbjct: 318 TFDLVEQCC 326
Score = 30.0 bits (68), Expect = 2.3
Identities = 30/142 (21%), Positives = 59/142 (41%), Gaps = 30/142 (21%)
Query: 121 KDIVDLRHDLARKEELIKR-HYDKIAVWQNLLSDLQSCLQVLTKEDEVSTT--------- 170
++ L+ +LA ++++ H + + D+ L+ + K S +
Sbjct: 155 AELKSLQRELA----VLRQTHNSFFTEIKESIKDI---LEKIDKFKSSSLSASGSSNRAY 207
Query: 171 LEKDEIKLEIDQATL--KFLDLARQMEAFFLQKRFLLSALKPEL----IVKEDIVDLRHD 224
+E + KL D +L K DL +EA L+K ++P V++++ R +
Sbjct: 208 VESSKKKLSEDSDSLLTKVDDLQDIIEA--LRKDVAQRGVRPSKKQLETVQKELETARKE 265
Query: 225 LARKEELIKR---HYDKIAVWQ 243
L + EE I + KI W+
Sbjct: 266 LKKMEEYIDIEKPIWKKI--WE 285
>gnl|CDD|222374 pfam13779, DUF4175, Domain of unknown function (DUF4175).
Length = 820
Score = 35.3 bits (82), Expect = 0.068
Identities = 14/45 (31%), Positives = 14/45 (31%), Gaps = 3/45 (6%)
Query: 301 PGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGG 345
Q G Q Q Q QQ QQQG G G G
Sbjct: 620 GEQQGQQGQGGQGQGQPGQQGQQ---GQGQQQGQQGQGGQGGQGS 661
Score = 34.9 bits (81), Expect = 0.095
Identities = 19/108 (17%), Positives = 25/108 (23%), Gaps = 36/108 (33%)
Query: 305 MQPMSGMPQQQQQVQMQQQIH-----MQHMQ----------QQGMGPGGPPSGPGGPSSG 349
+Q Q Q +MQQ + ++ Q Q+
Sbjct: 573 LQV--TQGGQGGQSEMQQAMEGLGETLREQQGLSDETFRDLQEQFNAQRGEQQGQQ---- 626
Query: 350 MMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 397
G GG G PG G G G G G
Sbjct: 627 ----GQGGQGQGQ-----------PGQQGQQGQGQQQGQQGQGGQGGQ 659
Score = 29.5 bits (67), Expect = 4.1
Identities = 10/48 (20%), Positives = 12/48 (25%)
Query: 310 GMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGG 357
Q+ + Q QQ Q Q G G G G
Sbjct: 614 QFNAQRGEQQGQQGQGGQGQGQPGQQGQQGQGQQQGQQGQGGQGGQGS 661
>gnl|CDD|240227 PTZ00009, PTZ00009, heat shock 70 kDa protein; Provisional.
Length = 653
Score = 35.2 bits (81), Expect = 0.072
Identities = 12/35 (34%), Positives = 13/35 (37%), Gaps = 1/35 (2%)
Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGP 366
Q G G P PGG GM G + GP
Sbjct: 614 QAAGGGMPGGMPGGMPGGMPGGAGPAGAGASS-GP 647
Score = 30.1 bits (68), Expect = 2.3
Identities = 12/41 (29%), Positives = 12/41 (29%), Gaps = 2/41 (4%)
Query: 1 MMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 41
M M G P P PGG G G GP
Sbjct: 609 MTKMYQAAGGGMPGGMPGGMPGGMPGGAGPAG--AGASSGP 647
Score = 30.1 bits (68), Expect = 2.3
Identities = 12/41 (29%), Positives = 12/41 (29%), Gaps = 2/41 (4%)
Query: 350 MMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 390
M M G P P PGG G G GP
Sbjct: 609 MTKMYQAAGGGMPGGMPGGMPGGMPGGAGPAG--AGASSGP 647
Score = 29.4 bits (66), Expect = 4.4
Identities = 13/41 (31%), Positives = 14/41 (34%), Gaps = 6/41 (14%)
Query: 2 MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 42
M+ GG GG P PGGM G G G
Sbjct: 612 MYQAAGGGMPGG------MPGGMPGGMPGGAGPAGAGASSG 646
Score = 29.4 bits (66), Expect = 4.4
Identities = 13/41 (31%), Positives = 14/41 (34%), Gaps = 6/41 (14%)
Query: 351 MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPG 391
M+ GG GG P PGGM G G G
Sbjct: 612 MYQAAGGGMPGG------MPGGMPGGMPGGAGPAGAGASSG 646
>gnl|CDD|144972 pfam01576, Myosin_tail_1, Myosin tail. The myosin molecule is a
multi-subunit complex made up of two heavy chains and
four light chains; it is a fundamental contractile
protein found in all eukaryote cell types. This family
consists of the coiled-coil myosin heavy chain tail
region. The coiled-coil is composed of the tail from two
molecules of myosin. These can then assemble into the
macromolecular thick filament. The coiled-coil region
provides the structural backbone of the thick filament.
Length = 859
Score = 35.0 bits (81), Expect = 0.085
Identities = 38/151 (25%), Positives = 63/151 (41%), Gaps = 25/151 (16%)
Query: 118 MVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVL-TKEDEVSTTLEKDE- 175
+ + +L + L RKE + + K+ Q L++ LQ ++ L + E+ LE +
Sbjct: 1 ELERQKRELENQLYRKESELSQLSSKLEDEQALVAQLQKKIKELEARIRELEEELEAERA 60
Query: 176 --IKLEIDQATLKFLDLARQMEAFFLQKRFL----LSALKPELIVKED--IVDLRHDL-- 225
K E +A DL+R++E L +R +A + EL K + + LR DL
Sbjct: 61 ARAKAEKARA-----DLSRELEE--LSERLEEAGGATAAQIELNKKREAELAKLRKDLEE 113
Query: 226 ------ARKEELIKRHYDKIAVWQNLLSDLQ 250
L K+H D I + LQ
Sbjct: 114 ANLQHEEALATLRKKHQDAINELSEQIEQLQ 144
>gnl|CDD|218621 pfam05518, Totivirus_coat, Totivirus coat protein.
Length = 753
Score = 34.8 bits (80), Expect = 0.086
Identities = 25/121 (20%), Positives = 29/121 (23%), Gaps = 20/121 (16%)
Query: 275 PTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIH-----MQHM 329
P + P P G PG GMP + + H
Sbjct: 631 IISGFPPVFKTALPRPDYNRGGEAGGPGVPGPVPVGMPAHTARPSRVARGDPVRPTAHHA 690
Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
PGGP P GGG PPP A G +L
Sbjct: 691 AL----RAPQAPRPGGP-----------PGGGGGLPPPPDLPAAAGPAPCGSSLIASPTA 735
Query: 390 P 390
P
Sbjct: 736 P 736
Score = 32.8 bits (75), Expect = 0.39
Identities = 12/37 (32%), Positives = 13/37 (35%)
Query: 5 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 41
G P GGG PPP A G +L P
Sbjct: 700 PGGPPGGGGGLPPPPDLPAAAGPAPCGSSLIASPTAP 736
Score = 31.3 bits (71), Expect = 1.2
Identities = 20/111 (18%), Positives = 25/111 (22%), Gaps = 4/111 (3%)
Query: 271 PNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQ 330
P+ G G GM A + P+ P Q
Sbjct: 646 PDYNRGGEAGGPGVPGPVPVGMPAHTARPSRVARGDPVR--PTAHHAALRAPQAPRP--G 701
Query: 331 QQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG 381
G GG P P P++ A P P P G
Sbjct: 702 GPPGGGGGLPPPPDLPAAAGPAPCGSSLIASPTAPPEPEPPGAEQADGAEN 752
>gnl|CDD|227568 COG5243, HRD1, HRD ubiquitin ligase complex, ER membrane component
[Posttranslational modification, protein turnover,
chaperones].
Length = 491
Score = 34.6 bits (79), Expect = 0.091
Identities = 24/68 (35%), Positives = 31/68 (45%), Gaps = 9/68 (13%)
Query: 243 QNLLSDLQGWAKSPAHQ-GSTSSASGTTPPNSTPTQSGPGISAMGGPL--------PGMM 293
Q+L S + GW P S ++ TT P++TPT P S GGP P
Sbjct: 412 QDLSSVIPGWTMLPIPGTRRISQSTSTTNPSATPTTGDPSNSTYGGPQTFPNSGNNPNFN 471
Query: 294 GGMAPIVP 301
G+A IVP
Sbjct: 472 RGIAGIVP 479
>gnl|CDD|219133 pfam06682, DUF1183, Protein of unknown function (DUF1183). This
family consists of several eukaryotic proteins of around
360 residues in length. The function of this family is
unknown.
Length = 317
Score = 33.9 bits (78), Expect = 0.11
Identities = 21/67 (31%), Positives = 23/67 (34%)
Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
G+ G P G G G GG G G PPP + GPG G G
Sbjct: 179 SCGGVRGGPRPERAGYGGGGGGGGGGGGGGGSGPGPPPPGFKSSFPPPYGPGAGPSSGYG 238
Query: 390 PGGLLQG 396
GG G
Sbjct: 239 SGGTRSG 245
Score = 28.9 bits (65), Expect = 5.1
Identities = 14/36 (38%), Positives = 16/36 (44%)
Query: 2 MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPG 37
F+ GG RGG + G GG GG G G G
Sbjct: 176 FFLSCGGVRGGPRPERAGYGGGGGGGGGGGGGGGSG 211
Score = 28.9 bits (65), Expect = 5.1
Identities = 14/36 (38%), Positives = 16/36 (44%)
Query: 351 MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPG 386
F+ GG RGG + G GG GG G G G
Sbjct: 176 FFLSCGGVRGGPRPERAGYGGGGGGGGGGGGGGGSG 211
>gnl|CDD|215618 PLN03184, PLN03184, chloroplast Hsp70; Provisional.
Length = 673
Score = 34.4 bits (79), Expect = 0.12
Identities = 19/60 (31%), Positives = 24/60 (40%), Gaps = 5/60 (8%)
Query: 299 IVPGSTMQPMSGMPQQQQQV-QMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGG 357
I GST + M Q+V Q+ Q ++ Q G G GP G SS G G
Sbjct: 608 IASGSTQKMKDAMAALNQEVMQIGQSLY----NQPGAGGAGPAPGGEAGSSSSSSSGGDG 663
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 34.5 bits (79), Expect = 0.13
Identities = 26/143 (18%), Positives = 38/143 (26%), Gaps = 6/143 (4%)
Query: 255 SPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLP--GMMGGMAPIVPGSTMQPMSGMP 312
P + +A P P+ G A GG + A P+ +
Sbjct: 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLA 2887
Query: 313 QQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
+ + ++ P PP P P PP P+
Sbjct: 2888 RPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTT 2947
Query: 373 GPGGMGGPGNLGP----GGMGPG 391
P G G P P G + PG
Sbjct: 2948 DPAGAGEPSGAVPQPWLGALVPG 2970
>gnl|CDD|214710 smart00533, MUTSd, DNA-binding domain of DNA mismatch repair MUTS
family.
Length = 308
Score = 33.0 bits (76), Expect = 0.21
Identities = 29/187 (15%), Positives = 68/187 (36%), Gaps = 27/187 (14%)
Query: 66 SLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKR-FLLSALKPELIVKEVNMVTKDIV 124
++ L + + ++ + +E+ LL + L+ ++ +
Sbjct: 72 ERGRASPRDL-LRLYDSLEGLKEIRQLLESLDGPLLGLLLKVILEPLLELLELLLEL-LN 129
Query: 125 DLRHDLARKEELIKRHYD-KIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQA 183
D LIK +D ++ + L +L E+E+ L+K+ +L ID
Sbjct: 130 DDDPLEVNDGGLIKDGFDPELDELREKLEEL---------EEELEELLKKEREELGID-- 178
Query: 184 TLKFLDLARQMEAFFLQKRFLLSALKPELIVK-----------EDIVDLRHDLARKEELI 232
+LK L + + + + + I + ++ +L ++L +E I
Sbjct: 179 SLK-LGYNKVHGYYIEVTKSEAKKVPKDFIRRSSLKNTERFTTPELKELENELLEAKEEI 237
Query: 233 KRHYDKI 239
+R +I
Sbjct: 238 ERLEKEI 244
Score = 31.1 bits (71), Expect = 0.94
Identities = 35/188 (18%), Positives = 59/188 (31%), Gaps = 55/188 (29%)
Query: 67 LSPL-EKDEIKLEIDQATLKFLDLARQ-MEAFFLQKRFLLSALKPELIVKEVNMVTKDIV 124
L PL + EI R ++ L L+ L K I
Sbjct: 25 LQPLLDLKEIN-------------ERLDAVEELVENPELRQKLRQLL---------KRIP 62
Query: 125 DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEK-DEIKLEIDQA 183
DL E L+ R + + + L++ +LE EI+ ++
Sbjct: 63 DL-------ERLLSR-------IERGRASPRDLLRLYD-------SLEGLKEIRQLLESL 101
Query: 184 TLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKRHYD-KIAVW 242
L L L K L L+ ++ E + D LIK +D ++
Sbjct: 102 DGPLLGL--------LLKVILEPLLELLELLLELLNDDDPLEVNDGGLIKDGFDPELDEL 153
Query: 243 QNLLSDLQ 250
+ L +L+
Sbjct: 154 REKLEELE 161
>gnl|CDD|116042 pfam07421, Pro-NT_NN, Neurotensin/neuromedin N precursor. This
family contains the precursor of bacterial
neurotensin/neuromedin N (approximately 170 residues
long). This is the common precursor of two biologically
active related peptides, neurotensin and neuromedin N.
It undergoes tissue-specific processing leading to the
formation in some tissues and cancer cell lines of large
peptides ending with the neurotensin or neuromedin N
sequence.
Length = 169
Score = 32.3 bits (73), Expect = 0.21
Identities = 35/123 (28%), Positives = 54/123 (43%), Gaps = 17/123 (13%)
Query: 121 KDIVDLRHDLARKEELIKRHYDKIA-----VWQNLLSDLQSCLQVLTKEDEVSTTLEKDE 175
+D+ L DL L H KI+ W+ L ++ S + L + E + + D+
Sbjct: 27 EDVRALEADL-----LTNMHTSKISKASPPSWKMTLLNVCSLINNLNSQAEEAGEMHDDD 81
Query: 176 I----KLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEEL 231
+ KL L L + F LQK A + I++EDI+D +D KEE+
Sbjct: 82 LVTKRKLP---LVLDGFSLEAMLTIFQLQKICRSRAFQHWEIIQEDILDAGNDKNEKEEV 138
Query: 232 IKR 234
IKR
Sbjct: 139 IKR 141
>gnl|CDD|221930 pfam13135, DUF3947, Protein of unknown function (DUF3947). This
family of proteins is functionally uncharacterized. This
family of proteins is found in bacteria. Proteins in
this family are approximately 80 amino acids in length.
Length = 76
Score = 30.6 bits (69), Expect = 0.23
Identities = 16/40 (40%), Positives = 20/40 (50%), Gaps = 7/40 (17%)
Query: 303 STMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSG 342
ST+Q + Q +QMQQQ+ MQQQG P P
Sbjct: 22 STIQAV------HQAMQMQQQMQ-PAMQQQGQQPYYPSVE 54
>gnl|CDD|236802 PRK10942, PRK10942, serine endoprotease; Provisional.
Length = 473
Score = 33.2 bits (76), Expect = 0.24
Identities = 28/100 (28%), Positives = 37/100 (37%), Gaps = 7/100 (7%)
Query: 261 STSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPI-VPGSTMQPMSGMPQQQQQVQ 319
S SA+ ++T Q P ++ M L +M + I V GST MP+Q QQ
Sbjct: 19 SPLSATAAETSSATTAQQMPSLAPM---LEKVMPSVVSINVEGSTTVNTPRMPRQFQQFF 75
Query: 320 MQQQIHMQH---MQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
Q Q GG GG M +G G
Sbjct: 76 GDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSG 115
>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
Mitofilin controls mitochondrial cristae morphology.
Mitofilin is enriched in the narrow space between the
inner boundary and the outer membranes, where it forms a
homotypic interaction and assembles into a large
multimeric protein complex. The first 78 amino acids
contain a typical amino-terminal-cleavable mitochondrial
presequence rich in positive-charged and hydroxylated
residues and a membrane anchor domain. In addition, it
has three centrally located coiled coil domains.
Length = 493
Score = 33.1 bits (76), Expect = 0.29
Identities = 22/129 (17%), Positives = 48/129 (37%), Gaps = 20/129 (15%)
Query: 119 VTKDIVDLRHDLARKEELIKRHYDKIAVW--QNLLSDL-------QSCLQVLTKEDEVST 169
+ ++++ +EL+ D I NL DL + L L+K+
Sbjct: 124 LLEELLKETASDPVVQELVSIFNDLIDSIKEDNLKDDLESLIASAKEELDQLSKKLAELK 183
Query: 170 TLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKE 229
E++E++ + + + L + L A E LR + R++
Sbjct: 184 AEEEEELERALKEKREELLSKLEE----------ELLARL-ESKEAALEKQLRLEFEREK 232
Query: 230 ELIKRHYDK 238
E +++ Y++
Sbjct: 233 EELRKKYEE 241
>gnl|CDD|179382 PRK02195, PRK02195, V-type ATP synthase subunit D; Provisional.
Length = 201
Score = 32.2 bits (74), Expect = 0.30
Identities = 32/165 (19%), Positives = 59/165 (35%), Gaps = 46/165 (27%)
Query: 70 LEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLS-ALKPELIVKEVNMVTKDIVDLRH 128
L K+ +K + Q LK L +R+L + LK + EV + +L
Sbjct: 7 LTKNSLKKQKKQ--LKML------------ERYLPTLKLKKAQLQAEVRRAKAEAAELE- 51
Query: 129 DLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEK---------DEIKLE 179
++L + I+++ L + ++V +V E D I+ E
Sbjct: 52 --QEYQKLRQAIEAWISLFSEPLYFDEDLIKV----KKVEKDYENIAGVEVPILDSIEFE 105
Query: 180 ------------IDQATLKFLDLAR-QMEAFFLQKR--FLLSALK 209
+D +L + ++EA LQ+R L L+
Sbjct: 106 IIEYSLLNTPIWVDTGIELLKELVQLKIEAEVLQERLLLLEEELR 150
>gnl|CDD|192930 pfam12066, DUF3546, Domain of unknown function (DUF3546). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 93 to 114 amino acids in length. This domain has
two completely conserved Y residues that may be
functionally important.
Length = 110
Score = 31.2 bits (71), Expect = 0.32
Identities = 17/96 (17%), Positives = 38/96 (39%), Gaps = 22/96 (22%)
Query: 160 VLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKR---FLLSALKPELIVKE 216
+ +++D++S E + + +F +Q++ FF Q + + PE + K
Sbjct: 6 LESQDDDISPA----EAEKRYQEYKTEFR--RKQLQDFFDQHKDEEWFREKYHPEELAK- 58
Query: 217 DIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQGW 252
+ R +L + ++ V+ LL G
Sbjct: 59 -RREERRELRKN---------RLNVFLELLES--GT 82
>gnl|CDD|173957 cd08198, DHQS-like2, Dehydroquinate synthase (DHQS)-like. DHQS
catalyzes the conversion of DAHP to DHQ in the shikimate
pathway for aromatic compound synthesis.
Dehydroquinate synthase-like proteins. Dehydroquinate
synthase (DHQS) catalyzes the conversion of
3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to
dehydroquinate (DHQ) in the second step of the shikimate
pathway. This pathway involves seven sequential
enzymatic steps in the conversion of erythrose
4-phosphate and phosphoenolpyruvate into chorismate for
subsequent synthesis of aromatic compounds. The activity
of DHQS requires NAD as cofactor. Proteins of this
family share sequence similarity and functional motifs
with that of dehydroquinate synthase, but the specific
function has not been characterized.
Length = 369
Score = 32.6 bits (75), Expect = 0.35
Identities = 23/89 (25%), Positives = 34/89 (38%), Gaps = 8/89 (8%)
Query: 180 IDQATLKFL-DLARQMEAFFLQKRFLLSALKPELIV------KEDIVDLRHDLARKEEL- 231
ID + LA ++A+ L + P IV K D + A
Sbjct: 37 IDSGVAQANPQLASDIQAYAAAHADALRLVAPPHIVPGGEACKNDPDLVEALHAAINRHG 96
Query: 232 IKRHYDKIAVWQNLLSDLQGWAKSPAHQG 260
I RH IA+ + D G+A + AH+G
Sbjct: 97 IDRHSYVIAIGGGAVLDAVGYAAATAHRG 125
>gnl|CDD|205922 pfam13748, ABC_membrane_3, ABC transporter transmembrane region.
This family represents a unit of six transmembrane
helices.
Length = 237
Score = 32.2 bits (74), Expect = 0.35
Identities = 15/61 (24%), Positives = 27/61 (44%), Gaps = 12/61 (19%)
Query: 97 FLQKRFLL-SALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQ 155
F ++ L L L KEV ++ + RK ++RHY ++ + LSD +
Sbjct: 158 FARRNERLNGRLNNRL-EKEVGLIER----------RKPSALRRHYRALSRLRIRLSDRE 206
Query: 156 S 156
+
Sbjct: 207 A 207
>gnl|CDD|217899 pfam04108, APG17, Autophagy protein Apg17. Apg17 is required for
activating Apg1 protein kinases.
Length = 408
Score = 32.7 bits (75), Expect = 0.38
Identities = 26/161 (16%), Positives = 54/161 (33%), Gaps = 34/161 (21%)
Query: 121 KDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSC-------LQVLTKED-------- 165
++ L H+LA E + H+D+ L+VL +
Sbjct: 195 SELNSLEHELADLLESLTNHFDQCVTAVKHTEGDPLDDAEYDELLEVLKNDAAELPDVVK 254
Query: 166 EVSTTLEKDEIKLEIDQATLKFLDLARQMEAFF---------LQK------RFLLSALKP 210
E+ T + DEI+ + ++E L+K R+L
Sbjct: 255 ELHTVI--DEIENNEKRVKKFLSSHMSKIEELHSATKELLEELEKYKERLPRYLAIFADI 312
Query: 211 ELIVKEDIVDLRHDLARKEELIKRHYDK-IAVWQNLLSDLQ 250
+ ++ ++ + EL YD + ++ LL +++
Sbjct: 313 RALWEDFKEPIQQYIQELSEL-CEFYDNFLNSYKGLLLEVE 352
>gnl|CDD|237015 PRK11901, PRK11901, hypothetical protein; Reviewed.
Length = 327
Score = 32.3 bits (74), Expect = 0.38
Identities = 25/96 (26%), Positives = 34/96 (35%), Gaps = 16/96 (16%)
Query: 247 SDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQ 306
S+ G K+ GS+S +SG S + G A G I +
Sbjct: 70 SNNAGAEKNIDLSGSSSLSSGNQSSPSAANNTSDGHDASG---VKNTAPPQDI----SAP 122
Query: 307 PMSGMPQQQQQVQM---QQQIHMQH------MQQQG 333
P+S P Q Q QQ+I + QQQG
Sbjct: 123 PISPTPTQAAPPQTPNGQQRIELPGNISDALSQQQG 158
>gnl|CDD|197874 smart00787, Spc7, Spc7 kinetochore protein. This domain is found
in cell division proteins which are required for
kinetochore-spindle association.
Length = 312
Score = 32.3 bits (74), Expect = 0.40
Identities = 21/128 (16%), Positives = 50/128 (39%), Gaps = 3/128 (2%)
Query: 125 DLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQAT 184
L+ L E +K Y + LL+ ++ ++ ++D + L + +LE +
Sbjct: 144 GLKEGLDENLEGLKEDYKLLMKELELLNSIK--PKLRDRKDALEEELRQ-LKQLEDELED 200
Query: 185 LKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEELIKRHYDKIAVWQN 244
+L R E + ++ +K ++E++ +L + +IA +
Sbjct: 201 CDPTELDRAKEKLKKLLQEIMIKVKKLEELEEELQELESKIEDLTNKKSELNTEIAEAEK 260
Query: 245 LLSDLQGW 252
L +G+
Sbjct: 261 KLEQCRGF 268
>gnl|CDD|191111 pfam04849, HAP1_N, HAP1 N-terminal conserved region. This family
represents an N-terminal conserved region found in
several huntingtin-associated protein 1 (HAP1)
homologues. HAP1 binds to huntingtin in a polyglutamine
repeat-length-dependent manner. However, its possible
role in the pathogenesis of Huntington's disease is
unclear. This family also includes a similar N-terminal
conserved region from hypothetical protein products of
ALS2CR3 genes found in the human juvenile amyotrophic
lateral sclerosis critical region 2q33-2q34.
Length = 307
Score = 32.1 bits (73), Expect = 0.41
Identities = 42/179 (23%), Positives = 65/179 (36%), Gaps = 55/179 (30%)
Query: 108 KPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEV 167
K E + +++ +I+ LRH+L K+EL LQ + DE
Sbjct: 99 KNEKLEEQLGKARDEILQLRHELNLKDEL---------------------LQFYSDADEE 137
Query: 168 STTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLL------------SALKPELIVK 215
S E E Q + Q+EA LQ++ L S LK E +
Sbjct: 138 SED-ESSESTPLRPQESSSSSHGCFQLEA--LQEKLKLLEEENEHLRSEASHLKTETVTY 194
Query: 216 ED-------------------IVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQGWAKS 255
E+ I L +LA+K E ++R ++I + + DLQ KS
Sbjct: 195 EEKEQQLVNDCVKQLREANDQIASLSEELAKKTEDLERQQEEITHLLSQIVDLQKKCKS 253
>gnl|CDD|218112 pfam04497, Pox_E2-like, Poxviridae protein. This family of
proteins is restricted to Poxviridae. It contains a
number of differently named uncharacterized proteins.
Length = 727
Score = 32.8 bits (75), Expect = 0.43
Identities = 11/46 (23%), Positives = 19/46 (41%), Gaps = 1/46 (2%)
Query: 100 KRFLLSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIA 145
+ L ++V M +DIVD D + L+K+ D +
Sbjct: 344 RIKSLPIHSRLVMVMCEEMGYEDIVDFL-DNLDVDTLVKKGADPLT 388
>gnl|CDD|236598 PRK09631, PRK09631, DNA topoisomerase IV subunit A; Provisional.
Length = 635
Score = 32.4 bits (74), Expect = 0.50
Identities = 32/150 (21%), Positives = 60/150 (40%), Gaps = 57/150 (38%)
Query: 135 ELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQM 194
++IK H + LQ +VL E E LE+ ++ +I A+ +
Sbjct: 311 DIIKFHAEH----------LQ---KVLKMELE----LERAKLLEKI---------FAKTL 344
Query: 195 EAFFLQKRF----------------LLSALKP---EL---IVKEDIVDL---------RH 223
E F+++R +LS LKP EL + +EDI +L
Sbjct: 345 EQIFIEERIYKRIETISSEEDVISIVLSELKPFKEELSRDVTEEDIENLLKIPIRRISLF 404
Query: 224 DLARKEELIKRHYDKIAVWQNLLSDLQGWA 253
D+ + ++ I+ ++ + L ++G+A
Sbjct: 405 DIDKNQKEIRILNKELKSVEKNLKSIKGYA 434
>gnl|CDD|219420 pfam07466, DUF1517, Protein of unknown function (DUF1517). This
family consists of several hypothetical glycine rich
plant and bacterial proteins of around 300 residues in
length. The function of this family is unknown.
Length = 280
Score = 31.9 bits (73), Expect = 0.56
Identities = 18/49 (36%), Positives = 18/49 (36%), Gaps = 2/49 (4%)
Query: 333 GMGPGGPPSG--PGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGG 379
G G PS SS G G GGG P P G GG GG
Sbjct: 9 GGGSFRAPSRSSSSPRSSSPGGGGYYGSPGGGFGFPFLIPFFGFGGGGG 57
Score = 28.4 bits (64), Expect = 6.4
Identities = 17/52 (32%), Positives = 19/52 (36%), Gaps = 2/52 (3%)
Query: 344 GGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP-GGMGPGGLL 394
GG S PR G + S G GG G P + G G GGL
Sbjct: 9 GGGSFRAPSRSSSSPRSSSPGGGGYYGSPG-GGFGFPFLIPFFGFGGGGGLF 59
>gnl|CDD|221868 pfam12938, M_domain, M domain of GW182.
Length = 238
Score = 31.8 bits (72), Expect = 0.58
Identities = 24/110 (21%), Positives = 31/110 (28%), Gaps = 15/110 (13%)
Query: 294 GGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGM--- 350
GM P + SG Q ++ G GPGG G + +
Sbjct: 3 SGMGFAGPFGGDRFPSG-GSSVNSPPFSQNNLPNNLGGGGGGPGGGGGGNNPNLASLSSL 61
Query: 351 -----------MFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 389
+ P G GG AG P G G P N+ P
Sbjct: 62 TSQGLGKILSGLQPPPLGNGGGSGAGGPGPVGGGGGPGVAPNNIQPNAQA 111
Score = 27.9 bits (62), Expect = 8.1
Identities = 12/37 (32%), Positives = 13/37 (35%)
Query: 4 MGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMG 40
P G GG AG P G G P N+ P
Sbjct: 75 PPPLGNGGGSGAGGPGPVGGGGGPGVAPNNIQPNAQA 111
>gnl|CDD|232933 TIGR00348, hsdR, type I site-specific deoxyribonuclease, HsdR
family. This gene is part of the type I restriction and
modification system which is composed of three
polypeptides R (restriction endonuclease), M
(modification) and S (specificity). This group of
enzymes recognize specific short DNA sequences and have
an absolute requirement for ATP (or dATP) and
S-adenosyl-L-methionine. They also catalyse the
reactions of EC 2.1.1.72 and EC 2.1.1.73, with similar
site specificity. (J. Mol. Biol. 271 (3), 342-348
(1997)). Members of this family are assumed to differ
from each other in DNA site specificity [DNA metabolism,
Restriction/modification].
Length = 667
Score = 32.0 bits (73), Expect = 0.66
Identities = 25/107 (23%), Positives = 45/107 (42%), Gaps = 11/107 (10%)
Query: 48 PLAYLEKTTSN-IGLPDGRSLS--PLE---KDEIKLEID-QATLK-FLDLARQMEAFFLQ 99
P+ ++ TS GR L + +D + ++ID + L ++++AFF +
Sbjct: 402 PIFKKDRDTSLTFAYVFGRYLHRYFITDAIRDGLTVKIDYEDRLPEDHLDKKKLDAFFDE 461
Query: 100 KRFLLS-ALKP--ELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDK 143
LL ++ + +KE TK I+ L + I HY K
Sbjct: 462 IFELLPERIREITKESLKEKLQKTKKILFNEDRLESIAKDIAEHYAK 508
>gnl|CDD|153280 cd07596, BAR_SNX, The Bin/Amphiphysin/Rvs (BAR) domain of Sorting
Nexins. BAR domains are dimerization, lipid binding and
curvature sensing modules found in many different
proteins with diverse functions. Sorting nexins (SNXs)
are Phox homology (PX) domain containing proteins that
are involved in regulating membrane traffic and protein
sorting in the endosomal system. SNXs differ from each
other in their lipid-binding specificity, subcellular
localization and specific function in the endocytic
pathway. A subset of SNXs also contain BAR domains. The
PX-BAR structural unit determines the specific membrane
targeting of SNXs. BAR domains form dimers that bind to
membranes, induce membrane bending and curvature, and
may also be involved in protein-protein interactions.
Length = 218
Score = 31.2 bits (71), Expect = 0.69
Identities = 15/94 (15%), Positives = 38/94 (40%), Gaps = 11/94 (11%)
Query: 112 IVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVSTT 170
+ + + KD+ + L + + K+ + L + +S L+ K +E+S
Sbjct: 115 ALLTLQSLKKDLASKKAQLEKLKAAPGIKPAKVEELEEELEEAESALEEARKRYEEISER 174
Query: 171 LEKDEIKLEID-----QATLK-----FLDLARQM 194
L+++ + + +A LK + A ++
Sbjct: 175 LKEELKRFHEERARDLKAALKEFARLQVQYAEKI 208
>gnl|CDD|218116 pfam04503, SSDP, Single-stranded DNA binding protein, SSDP. This
is a family of eukaryotic single-stranded DNA binding
proteins with specificity to a pyrimidine-rich element
found in the promoter region of the alpha2(I) collagen
gene.
Length = 293
Score = 31.6 bits (71), Expect = 0.70
Identities = 43/135 (31%), Positives = 53/135 (39%), Gaps = 12/135 (8%)
Query: 261 STSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQM 320
S G PP P Q G+ G P + GGM P V M G Q+
Sbjct: 73 SPRYPGGPRPPLRMPNQPPGGVP---GSQPLLPGGMDPTVRQQGHPNMGG--PMQRMTPP 127
Query: 321 QQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPR---GGGNAGPPPFPSAGPGGM 377
+ + Q G G PP+ GP+ M MGPG R +A P+ S+ PG
Sbjct: 128 RGMKSLDGPQNYGGGMRPPPNSLLGPAMPGMNMGPGLGRPWPNPISANSIPYSSSSPGEY 187
Query: 378 GGPGNLGPGGMGPGG 392
GP PGG GP G
Sbjct: 188 TGP----PGGGGPPG 198
>gnl|CDD|219419 pfam07462, MSP1_C, Merozoite surface protein 1 (MSP1) C-terminus.
This family represents the C-terminal region of
merozoite surface protein 1 (MSP1) which are found in a
number of Plasmodium species. MSP-1 is a 200-kDa protein
expressed on the surface of the P. vivax merozoite.
MSP-1 of Plasmodium species is synthesised as a
high-molecular-weight precursor and then processed into
several fragments. At the time of red cell invasion by
the merozoite, only the 19-kDa C-terminal fragment
(MSP-119), which contains two epidermal growth
factor-like domains, remains on the surface. Antibodies
against MSP-119 inhibit merozoite entry into red cells,
and immunisation with MSP-119 protects monkeys from
challenging infections. Hence, MSP-119 is considered a
promising vaccine candidate.
Length = 574
Score = 31.8 bits (72), Expect = 0.72
Identities = 59/293 (20%), Positives = 104/293 (35%), Gaps = 57/293 (19%)
Query: 67 LSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDL 126
L+ + + + LEI L ++ + R+ LK E + + + + + +
Sbjct: 51 LTETKVNALYLEIAH-------LKELLQHSY--DRYYKYKLKLERLYNKKEQIGQSKMQI 101
Query: 127 RHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLT------KEDE---VSTTLEKDEIK 177
+ KE L KR +N L++ L + +E E V TL+ +I
Sbjct: 102 KKLTLLKERLEKR--------KNSLNNPFYVLSNFSNFFNKKREAEKQEVENTLKNTDIL 153
Query: 178 LEIDQATLKFL-----------DLARQMEAFFL--QKRFLLSALKPEL-----IVKEDIV 219
L+ +A +K+ +++ Q E +L +K +LS L+ L + KE I
Sbjct: 154 LKYYKARVKYYTGEAVPLKTLSEVSIQREDNYLNLEKFRVLSRLEGRLKKNINLGKEKIS 213
Query: 220 ----DLRHDLARKEELIKR-------HYDKIAVWQNLLSDLQGWAKSPAHQGSTSSASGT 268
L H +ELIK + + L + Q +
Sbjct: 214 YLSSGLHHVFTELKELIKNKNYTGNTNPENNPEVNEALEQYKELLPKGTTQ-EAKVTTVV 272
Query: 269 TPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQ 321
TPP + S + G P GS + P + + QQ VQ+Q
Sbjct: 273 TPPQADAAPSPLSVRPAGSSGSASGSTQIP-TSGSVLGPGAAATELQQVVQLQ 324
>gnl|CDD|236092 PRK07772, PRK07772, single-stranded DNA-binding protein;
Provisional.
Length = 186
Score = 30.8 bits (70), Expect = 0.75
Identities = 18/56 (32%), Positives = 20/56 (35%)
Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
GG G GG G G GG GGG P + P + P G GG
Sbjct: 124 GGGGGGGGGFGGGGGGSGGGGGGGGGGGAPGGGGAQASAPADDPWSSAPASGGFGG 179
Score = 28.8 bits (65), Expect = 3.2
Identities = 17/51 (33%), Positives = 20/51 (39%), Gaps = 1/51 (1%)
Query: 335 GPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP 385
G GG G GG G G GG + P+ SA G G G+ P
Sbjct: 135 GGGGGSGGGGGGGGGGGAPGGGGAQASA-PADDPWSSAPASGGFGGGDDEP 184
>gnl|CDD|217789 pfam03915, AIP3, Actin interacting protein 3.
Length = 424
Score = 31.5 bits (72), Expect = 0.79
Identities = 32/122 (26%), Positives = 56/122 (45%), Gaps = 14/122 (11%)
Query: 72 KDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLA 131
K ++ + D K DL +EA L+K ++P K++ V K+I +L
Sbjct: 208 KKKLSEDSDSLLTKVDDLQDIIEA--LRKDVAQRGVRPGP--KQLETVQKEIQKAEKELK 263
Query: 132 RKEELIKR---HYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLE-----IDQA 183
+ EE IKR + KI W++ L + Q LT ++++ L+ D K E ++Q
Sbjct: 264 KMEEYIKREKPVWKKI--WESELDKVCEEQQFLTLQEDLIADLQDDLEKAEETFDLVEQC 321
Query: 184 TL 185
+
Sbjct: 322 SE 323
>gnl|CDD|221143 pfam11593, Med3, Mediator complex subunit 3 fungal. Mediator is a
large complex of up to 33 proteins that is conserved
from plants to fungi to humans - the number and
representation of individual subunits varying with
species. It is arranged into four different sections, a
core, a head, a tail and a kinase-activity part, and the
number of subunits within each of these is what varies
with species. Overall, Mediator regulates the
transcriptional activity of RNA polymerase II but it
would appear that each of the four different sections
has a slightly different function. Mediator subunit
Hrs1/Med3 is a physical target for Cyc8-Tup1, a yeast
transcriptional co-repressor.
Length = 381
Score = 31.5 bits (71), Expect = 0.83
Identities = 20/101 (19%), Positives = 33/101 (32%), Gaps = 13/101 (12%)
Query: 250 QGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAM------GGPLPGMMGGMAPI---- 299
G A + Q S + + + N + P ++M PL ++ G++P
Sbjct: 203 TGPAAAAKAQASAQAQAQASAYNQMGSLGVPQNTSMLAQIPNPTPLMQLLNGVSPNNAMA 262
Query: 300 VPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPP 340
P + M PM + Q Q Q M G
Sbjct: 263 SPLNNMSPMRNLNQMGNQNNGGQ---MTPSANNGNMNNQSR 300
>gnl|CDD|236941 PRK11634, PRK11634, ATP-dependent RNA helicase DeaD; Provisional.
Length = 629
Score = 31.4 bits (71), Expect = 0.91
Identities = 22/88 (25%), Positives = 30/88 (34%), Gaps = 2/88 (2%)
Query: 303 STMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGG 362
ST++ GMP + Q + +I + M Q +G P +G G F G R GG
Sbjct: 529 STIELPKGMPGEVLQHFTRTRILNKPMNMQLLGDAQPHTGGERRGGGRGFGGER--REGG 586
Query: 363 NAGPPPFPSAGPGGMGGPGNLGPGGMGP 390
G G G P
Sbjct: 587 RNFSGERREGGRGDGRRFSGERREGRAP 614
>gnl|CDD|240614 cd12794, Hsm3_like, Hsm3 is a yeast Proteasome chaperone of the
19S regulatory particle and related proteins. This
group contains proteins related to the Hsm3 protein
(Yeast Proteasome Interacting Protein) of Saccharomyces
cerevisiae. S. cerevisiae Hsm3 is a chaperone of
regulatory particles involved in proteasome assembly.
The 26S Proteasome is a large, 2.5 MDa complex comprised
of at least 33 subunits, and relies on chaperones to
facilitate correct assembly. The proteasome contains a
cylindrical 20S core particle and 1-2 19S regulatory
particles, comprised of AAA-ATPase and non-ATPase
subunits. The proteasome acts in ubiquitin-dependent
proteolysis. The 19S RP targets and opens the
ubiquitin-tagged substrate and releases ubiquitin. Hsm3
acts as a 19S chaperone, binding to the C-terminal
domain of Rpt1 (the 6 ATPase subunits of the 19 S
regulatory particle(s). Hsm3 has a C-shape composed of
11 HEAT repeats. Mutations in the Hsm3-Rpt interface
disrupt formation of the 26 S Proteasome complex.
Length = 455
Score = 31.5 bits (72), Expect = 0.93
Identities = 17/74 (22%), Positives = 33/74 (44%), Gaps = 5/74 (6%)
Query: 148 QNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSA 207
QNLL L + L+ ++ ++K L + T +DL + + A K LL
Sbjct: 2 QNLLDHLNTALETDPLPPVINKLIDK--CSLNLKTITSLPVDLKQLLPAI---KSILLDN 56
Query: 208 LKPELIVKEDIVDL 221
E++ + +++L
Sbjct: 57 ESYEILDYDLLLEL 70
>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
Validated.
Length = 824
Score = 31.5 bits (72), Expect = 0.96
Identities = 18/124 (14%), Positives = 24/124 (19%), Gaps = 3/124 (2%)
Query: 253 AKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMMGG-MAPIVPGSTMQPMSGM 311
K A ++ G + P + G A P P +G
Sbjct: 654 PKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAG- 712
Query: 312 PQQQQQVQMQQQIHMQHMQQQGMGPGGP-PSGPGGPSSGMMFMGPGGPRGGGNAGPPPFP 370
Q Q P P P P P P
Sbjct: 713 QADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAA 772
Query: 371 SAGP 374
+ P
Sbjct: 773 APPP 776
>gnl|CDD|214832 smart00817, Amelin, Ameloblastin precursor (Amelin). This family
consists of several mammalian Ameloblastin precursor
(Amelin) proteins. Matrix proteins of tooth enamel
consist mainly of amelogenin but also of non-amelogenin
proteins, which, although their volumetric percentage is
low, have an important role in enamel mineralisation.
One of the non-amelogenin proteins is ameloblastin, also
known as amelin and sheathlin. Ameloblastin (AMBN) is
one of the enamel sheath proteins which is thought to
have a role in determining the prismatic structure of
growing enamel crystals.
Length = 411
Score = 31.4 bits (71), Expect = 0.98
Identities = 37/151 (24%), Positives = 49/151 (32%), Gaps = 10/151 (6%)
Query: 240 AVWQNLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMM--GGMA 297
A+ N + + + P H G P P Q P LP
Sbjct: 121 ALPTNQATPQKNGPQPPMHLGQPPLQQAELPM--IPPQVAPSDKPPQTELPLYDFADPQN 178
Query: 298 PIV-PGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPG 356
P++ + + MPQ +QQ +M + G G P+ G SS M G G
Sbjct: 179 PLLFQIAHLMSRGPMPQNKQQHLYPGLFYMSY----GANQLGAPARLGAMSSEEMTGGRG 234
Query: 357 GPRGGGNAGPPPFPSAGPGGMGGPGNLGPGG 387
P G A P PG G P N G
Sbjct: 235 APHAYG-ALFPGLGGMRPGLRGMPQNPAMQG 264
>gnl|CDD|236382 PRK09111, PRK09111, DNA polymerase III subunits gamma and tau;
Validated.
Length = 598
Score = 31.4 bits (72), Expect = 1.1
Identities = 14/76 (18%), Positives = 15/76 (19%)
Query: 327 QHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPG 386
Q G GG P G GG G A P + P
Sbjct: 388 QEGPPSPGGGGGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAAALAAVPDAAAAAAAPP 447
Query: 387 GMGPGGLLQGPLAYLE 402
L E
Sbjct: 448 APAAAPQPAVRLNSFE 463
Score = 29.1 bits (66), Expect = 5.8
Identities = 11/42 (26%), Positives = 11/42 (26%)
Query: 7 GGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGP 48
G P GG G PP PG G P
Sbjct: 390 GPPSPGGGGGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAA 431
Score = 28.3 bits (64), Expect = 9.8
Identities = 11/42 (26%), Positives = 12/42 (28%)
Query: 5 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQ 46
PGG GG G + G P GP L
Sbjct: 393 SPGGGGGGPPGGGGAPGAPAAAAAPGAAAAAPAAGGPAAALA 434
>gnl|CDD|220368 pfam09730, BicD, Microtubule-associated protein Bicaudal-D. BicD
proteins consist of three coiled-coil domains and are
involved in dynein-mediated minus end-directed transport
from the Golgi apparatus to the endoplasmic reticulum
(ER). For full functioning they bind with GSK-3beta
pfam05350 to maintain the anchoring of microtubules to
the centromere. It appears that amino-acid residues
437-617 of BicD and the kinase activity of GSK-3 are
necessary for the formation of a complex between BicD
and GSK-3beta in intact cells.
Length = 711
Score = 31.3 bits (71), Expect = 1.2
Identities = 27/125 (21%), Positives = 51/125 (40%), Gaps = 25/125 (20%)
Query: 114 KEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEK 173
KE + + I++L+ +L + A N+ ++ + + + E + LE
Sbjct: 28 KEAYYLQR-ILELQAELKQLR----------AELSNVQAENERLSSLSQELKEENEMLEL 76
Query: 174 DEIKLEIDQATLKFLDLARQM--------EAFFLQKRFLLSALKPELIVKEDIVDLRHDL 225
+L + KF + AR + E LQK +S L+ + E L+H++
Sbjct: 77 QRGRLRDEIKEYKFRE-ARLLQDYSELEEENISLQK--QVSVLRQSQVEFE---GLKHEI 130
Query: 226 ARKEE 230
R EE
Sbjct: 131 RRLEE 135
>gnl|CDD|236722 PRK10590, PRK10590, ATP-dependent RNA helicase RhlE; Provisional.
Length = 456
Score = 30.9 bits (70), Expect = 1.3
Identities = 11/52 (21%), Positives = 12/52 (23%)
Query: 330 QQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPG 381
QQ+G G G G G G P G G
Sbjct: 392 QQRGGGGRGQGGGRGQQQGQPRRGEGGAKSASAKPAEKPSRRLGDAKPAGEQ 443
>gnl|CDD|227623 COG5307, COG5307, SEC7 domain proteins [General function prediction
only].
Length = 1024
Score = 31.2 bits (71), Expect = 1.3
Identities = 19/121 (15%), Positives = 34/121 (28%), Gaps = 6/121 (4%)
Query: 151 LSDLQSCLQVLTKEDEVSTTLEK----DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLS 206
+ VL K E + + Q + +FL+ ++ F Q LL
Sbjct: 703 IKSESKISNVLFKNSEGLSPDLNKTLLESALDSKSQLSSRFLE-IEELSDFGFQIALLLP 761
Query: 207 ALKPELIVKEDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQGWAKSPAHQGSTSSAS 266
V + + + L+ IA + + S + SS S
Sbjct: 762 FEYSV-EVSLVVAVKELVIGCSDNLLTEAASSIASGKTIFEISAYEDLSSTLRYILSSLS 820
Query: 267 G 267
Sbjct: 821 N 821
Score = 30.9 bits (70), Expect = 1.6
Identities = 18/144 (12%), Positives = 40/144 (27%), Gaps = 14/144 (9%)
Query: 67 LSPLEK----DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKD 122
LSP + Q + +FL+ ++ F Q LL V+
Sbjct: 720 LSPDLNKTLLESALDSKSQLSSRFLE-IEELSDFGFQIALLLPFEYSV-------EVSLV 771
Query: 123 IVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQ 182
+ + + L+ IA + + T +S+ + + + +
Sbjct: 772 VAVKELVIGCSDNLLTEAASSIASGKTIFEISAYEDLSSTLRYILSSLSNDELVLSQENL 831
Query: 183 ATLKFLDLARQMEAFFLQKRFLLS 206
L ++ + L
Sbjct: 832 --FIELLSSKNEGKQNDKNLELRL 853
>gnl|CDD|226959 COG4594, FecB, ABC-type Fe3+-citrate transport system, periplasmic
component [Inorganic ion transport and metabolism].
Length = 310
Score = 30.5 bits (69), Expect = 1.3
Identities = 42/206 (20%), Positives = 75/206 (36%), Gaps = 58/206 (28%)
Query: 104 LSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTK 163
+SALKP+LI+ + + K + +A L R+ D +Q + ++ + + K
Sbjct: 107 ISALKPDLIIADSSR-HKKVYKELKKIAPTIALKSRNED----YQENIDSFKTIAKAVGK 161
Query: 164 EDEVSTTLEK-----DEIKLEIDQATLKFL---DLARQMEAF---FLQKRFL-------- 204
E E+ L K EIK ++ + T A Q + L
Sbjct: 162 EKEMEKRLAKHKKKIAEIKKKLPKGTNSLAIGVSRATQFNLHTEESYTGQLLTQLGYQVP 221
Query: 205 ----------------LSALKPELIVKEDIVDLRHDLARKEELIKRHYDKIAVWQNL--- 245
L+A+ P++++ L ++E I R ++K A+W+ L
Sbjct: 222 AASSDGGPYMSVGLEQLAAINPDVMI------LATY---RDESIVRKWEKNALWKKLKAV 272
Query: 246 ------LSDLQGWAKSPAHQGSTSSA 265
D WA+S + S A
Sbjct: 273 KNGQVYDVDRNTWARSRGIDAAESMA 298
>gnl|CDD|115579 pfam06933, SSP160, Special lobe-specific silk protein SSP160. This
family consists of several special lobe-specific silk
protein SSP160 sequences which appear to be specific to
Chironomus (Midge) species.
Length = 758
Score = 30.9 bits (69), Expect = 1.3
Identities = 10/40 (25%), Positives = 22/40 (55%)
Query: 239 IAVWQNLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQS 278
+A W+ +L+ L+ +A A STS+++ T+ + +
Sbjct: 263 VAEWEAILAALEAFANGSASANSTSNSNSTSNSTTNSNST 302
>gnl|CDD|197891 smart00818, Amelogenin, Amelogenins, cell adhesion proteins, play a
role in the biomineralisation of teeth. They seem to
regulate formation of crystallites during the secretory
stage of tooth enamel development and are thought to
play a major role in the structural organisation and
mineralisation of developing enamel. The extracellular
matrix of the developing enamel comprises two major
classes of protein: the hydrophobic amelogenins and the
acidic enamelins. Circular dichroism studies of porcine
amelogenin have shown that the protein consists of 3
discrete folding units: the N-terminal region appears to
contain beta-strand structures, while the C-terminal
region displays characteristics of a random coil
conformation. Subsequent studies on the bovine protein
have indicated the amelogenin structure to contain a
repetitive beta-turn segment and a "beta-spiral" between
Gln112 and Leu138, which sequester a (Pro, Leu, Gln)
rich region. The beta-spiral offers a probable site for
interactions with Ca2+ ions. Mutations in the human
amelogenin gene (AMGX) cause X-linked hypoplastic
amelogenesis imperfecta, a disease characterised by
defective enamel. A 9bp deletion in exon 2 of AMGX
results in the loss of codons for Ile5, Leu6, Phe7 and
Ala8, and replacement by a new threonine codon,
disrupting the 16-residue (Met1-Ala16) amelogenin signal
peptide.
Length = 165
Score = 30.1 bits (68), Expect = 1.3
Identities = 21/68 (30%), Positives = 24/68 (35%), Gaps = 2/68 (2%)
Query: 293 MGGMAPIVPGSTMQPMSGMPQQQ--QQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGM 350
+ G + P QP P QQ Q +Q Q MQ Q PP P P M
Sbjct: 76 VPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLPPM 135
Query: 351 MFMGPGGP 358
M P P
Sbjct: 136 FPMQPLPP 143
>gnl|CDD|237539 PRK13878, PRK13878, conjugal transfer relaxase TraI; Provisional.
Length = 746
Score = 30.9 bits (70), Expect = 1.5
Identities = 16/91 (17%), Positives = 27/91 (29%), Gaps = 12/91 (13%)
Query: 310 GMPQQQQQVQMQQQIHMQHMQQQ-----GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNA 364
+ ++Q + ++ H + + + G G GGP P R G
Sbjct: 501 ALESRRQALLNKENTHERTERPEHRGRTGRGRGGPGQRPAADQHA--AGAAAVARAGDGR 558
Query: 365 GPPPFPSAGPGGMGGPG-----NLGPGGMGP 390
G+ G N+G G P
Sbjct: 559 PAAGRGDRAGAGVHAAGVHRKPNVGRIGRKP 589
>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
Length = 520
Score = 30.5 bits (70), Expect = 1.6
Identities = 19/79 (24%), Positives = 37/79 (46%), Gaps = 4/79 (5%)
Query: 173 KDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHD-LARKEEL 231
K E LE A + L + E ++R L L+ L+ KE+ +D + + L ++EE
Sbjct: 55 KKEALLE---AKEEIHKLRNEFEKELRERRNELQKLEKRLLQKEENLDRKLELLEKREEE 111
Query: 232 IKRHYDKIAVWQNLLSDLQ 250
+++ ++ Q L +
Sbjct: 112 LEKKEKELEQKQQELEKKE 130
>gnl|CDD|227606 COG5281, COG5281, Phage-related minor tail protein [Function
unknown].
Length = 833
Score = 30.8 bits (69), Expect = 1.7
Identities = 47/293 (16%), Positives = 68/293 (23%), Gaps = 52/293 (17%)
Query: 151 LSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKP 210
L L + Q + + L + L + QK LL K
Sbjct: 491 LKALLAFQQQIADLSGAKEKASDQKSLLWKAEEQYALLKEEAKQRQLQEQK-ALLEHKKE 549
Query: 211 ELIVKEDIVDLRHDLARKEELIKRHY----------------DKIAVWQNLLSDLQ-GWA 253
L + +L A + EL + A L++L W+
Sbjct: 550 TLEYTSQLAELLDQQADRFELSAQAAGSQKERGSDLYREALAQNAAALNKALNELAAYWS 609
Query: 254 KSPAHQGS----TSSASGTTPPNSTPTQSG------PGISAMGGPLPGMMGGMAPIVPGS 303
QG SA ++T S M
Sbjct: 610 ALDLLQGDWKAGALSALANYRDSATDVASQAAQLFTNAFDGMANNAAKFATTGKLSFKSF 669
Query: 304 TMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGP----------------- 346
T +S + Q +Q + + G GG + G
Sbjct: 670 TRSVLSDLAGILLQAALQIIVGLVGSAFGGALSGGGSASTGAGSVFHFAAGGVYGSGGLP 729
Query: 347 -------SSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
SS +F G G AGP G G G G G
Sbjct: 730 EYAGGVVSSPTVFTKAAGLGLMGEAGPEAILPLDRGSDGKLGVAAGMGGGGAA 782
>gnl|CDD|233255 TIGR01061, parC_Gpos, DNA topoisomerase IV, A subunit,
Gram-positive. Operationally, topoisomerase IV is a
type II topoisomerase required for the decatenation of
chromosome segregation. Not every bacterium has both a
topo II and a topo IV. The topo IV families of the
Gram-positive bacteria and the Gram-negative bacteria
appear not to represent a single clade among the type II
topoisomerases, and are represented by separate models
for this reason [DNA metabolism, DNA replication,
recombination, and repair].
Length = 738
Score = 30.5 bits (69), Expect = 1.8
Identities = 26/144 (18%), Positives = 59/144 (40%), Gaps = 30/144 (20%)
Query: 65 RSLSPLEKDEIKLEIDQATLKFLDLARQMEAFF------------LQKRFLLSALKPELI 112
RS LEK +LEI + +K + + ++ L F + + E I
Sbjct: 357 RSKYELEKASKRLEIVEGLIKAISIIDEIIKLIRSSEDKSDAKENLIDNFKFTENQAEAI 416
Query: 113 V--KEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE------ 164
V + + DI +L+ + + EL KI + +++ ++ ++L K+
Sbjct: 417 VSLRLYRLTNTDIFELKEE---QNEL----EKKIISLEQIIASEKARNKLLKKQLEEYKK 469
Query: 165 ---DEVSTTLEKDEIKLEIDQATL 185
+ + +E +++I+++ L
Sbjct: 470 QFAQQRRSQIEDFINQIKINESEL 493
>gnl|CDD|224264 COG1345, FliD, Flagellar capping protein [Cell motility and
secretion].
Length = 483
Score = 30.5 bits (69), Expect = 1.8
Identities = 12/55 (21%), Positives = 25/55 (45%), Gaps = 3/55 (5%)
Query: 112 IVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDE 166
+ K++ + KDI L L EE K ++ ++++ + S LT++
Sbjct: 427 LNKQIKSLDKDIKSLDKRLEAAEERYKTQFNT---LDDMMTQMNSQSSYLTQQLV 478
>gnl|CDD|226074 COG3544, COG3544, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 190
Score = 29.8 bits (67), Expect = 1.9
Identities = 11/44 (25%), Positives = 14/44 (31%), Gaps = 3/44 (6%)
Query: 312 PQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGP 355
Q ++ Q M+ G GG P GMM M
Sbjct: 80 IILAQNQEIAQ---MKTWLATWGGKGGEPPSLEMAKMGMMEMHE 120
>gnl|CDD|218636 pfam05557, MAD, Mitotic checkpoint protein. This family consists
of several eukaryotic mitotic checkpoint (Mitotic arrest
deficient or MAD) proteins. The mitotic spindle
checkpoint monitors proper attachment of the bipolar
spindle to the kinetochores of aligned sister chromatids
and causes a cell cycle arrest in prometaphase when
failures occur. Multiple components of the mitotic
spindle checkpoint have been identified in yeast and
higher eukaryotes. In S.cerevisiae, the existence of a
Mad1-dependent complex containing Mad2, Mad3, Bub3 and
Cdc20 has been demonstrated.
Length = 722
Score = 30.3 bits (68), Expect = 2.0
Identities = 43/206 (20%), Positives = 84/206 (40%), Gaps = 23/206 (11%)
Query: 71 EKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKD--IVDLRH 128
E +K ++D +LK L + E + + +S +K +L + D + L
Sbjct: 136 EAKLLKDKLDAESLK---LQNEKEDQLKEAKESISRIKNDLSEMQCRAQNADTELKLLES 192
Query: 129 DLARKEELIKRHYDKIAVWQNLLSDLQSCL------QVLTKEDEVSTTLEKDEIKLEIDQ 182
+L E ++ ++A + L L S V K E + + ++
Sbjct: 193 ELEELREQLEECQKELAEAEKKLQSLTSEQASSADNSVKIKHLEEELKRYEQDAEVVKSM 252
Query: 183 AT--LKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKE---------EL 231
L+ +L R++ A + R L S + ++KE++ DL+ L R E EL
Sbjct: 253 KEQLLQIPELERELAALREENRKLRSMKEDNELLKEELEDLQSRLERFEKMREKLADLEL 312
Query: 232 IKRHYD-KIAVWQNLLSDLQGWAKSP 256
K + ++ W++LL D+ ++P
Sbjct: 313 EKEKLENELKSWKSLLQDIGLNLRTP 338
>gnl|CDD|219934 pfam08614, ATG16, Autophagy protein 16 (ATG16). Autophagy is a
ubiquitous intracellular degradation system for
eukaryotic cells. During autophagy, cytoplasmic
components are enclosed in autophagosomes and delivered
to lysosomes/vacuoles. ATG16 (also known as Apg16) has
been shown to bind to Apg5 and is required for the
function of the Apg12p-Apg5p conjugate in the yeast
autophagy pathway.
Length = 194
Score = 29.8 bits (67), Expect = 2.0
Identities = 33/152 (21%), Positives = 59/152 (38%), Gaps = 11/152 (7%)
Query: 44 LLQGPLAYLEKTTSNIGLPDGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFL 103
L +S+ S + + + E KL + L L ++ E L +R L
Sbjct: 43 LQAEKYEQQSSHSSSPSADGPGSDAAIAEMEQKLAKLREELTEL-HKKRGE---LAQRLL 98
Query: 104 LSALKPELIVKEVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTK 163
L + E + +E+ + K I +LR ++ E I+ +++ + LQ L L
Sbjct: 99 LLNDELEQLRREIQQLEKTIAELRSEITSLETEIRDLREELQEKEKDNETLQDELISLNI 158
Query: 164 EDEVSTTLEKDEIKLEIDQATLKFLDLARQME 195
E LE+ KL+ + L + R M
Sbjct: 159 ELNA---LEEKLRKLQKENQEL----VERWMA 183
>gnl|CDD|220441 pfam09849, DUF2076, Uncharacterized protein conserved in bacteria
(DUF2076). This domain, found in various hypothetical
prokaryotic proteins, has no known function. The domain,
however, is found in various periplasmic ligand-binding
sensor proteins.
Length = 234
Score = 30.0 bits (68), Expect = 2.1
Identities = 26/94 (27%), Positives = 32/94 (34%), Gaps = 11/94 (11%)
Query: 313 QQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
Q+ Q +I + ++ Q P SG G SGM G P A PP P A
Sbjct: 53 QEAALKQANARI--EELEAQAQHPQSQSSG--GFLSGMFGGGAPRPPPAAPAVQPPAPPA 108
Query: 373 GPGGMGGPGNLGPGGMGP-------GGLLQGPLA 399
PG G + G P G L G
Sbjct: 109 RPGWGSGGPSQQGAGQQPGYAQPGPGSFLGGAAQ 142
>gnl|CDD|234252 TIGR03545, TIGR03545, TIGR03545 family protein. This model
represents a relatively rare but broadly distributed
uncharacterized protein family, distributed in 1-2
percent of bacterial genomes, all of which have outer
membranes. In many of these genomes, it is part of a
two-gene pair.
Length = 555
Score = 30.1 bits (68), Expect = 2.3
Identities = 20/87 (22%), Positives = 36/87 (41%), Gaps = 7/87 (8%)
Query: 61 LPDGRSLSPLEKDEIKLE-IDQATLK-FLDLARQMEAFFLQKRFLLSALKPELIVKEVNM 118
LP+ + L +K +LE I + +K L+L + E F K + I N
Sbjct: 187 LPNKQDLEEYKK---RLEAIKKKDIKNPLELQKIKEEF--DKLKKEGKADKQKIKSAKND 241
Query: 119 VTKDIVDLRHDLARKEELIKRHYDKIA 145
+ D L+ DLA ++ + ++
Sbjct: 242 LQNDKKQLKADLAELKKAPQNDLKRLE 268
>gnl|CDD|237783 PRK14667, uvrC, excinuclease ABC subunit C; Provisional.
Length = 567
Score = 30.1 bits (68), Expect = 2.3
Identities = 27/98 (27%), Positives = 45/98 (45%), Gaps = 12/98 (12%)
Query: 144 IAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRF 203
I V L +++ L K++E+ T + EI L+ + K L R EA RF
Sbjct: 446 IEVRDRLGLNIKVF--SLAKKEEILYTEDGKEIPLKENPILYKVFGLIRD-EA----HRF 498
Query: 204 LLSALKPELIVKEDIVDLRHDLA----RKEELIKRHYD 237
LS + +L KE + D+ + K+E+I R++
Sbjct: 499 ALSYNR-KLREKEGLKDILDKIKGIGEVKKEIIYRNFK 535
>gnl|CDD|163064 TIGR02894, DNA_bind_RsfA, transcription factor, RsfA family. In a
subset of endospore-forming members of the Firmicutes,
members of this protein family are found, several to a
genome. Two very strongly conserved sequence regions
are separated by a highly variable linker region. Much
of the linker region was excised from the seed alignment
for this model. A characterized member is the
prespore-specific transcription RsfA from Bacillus
subtilis, previously called YwfN, which is controlled by
sigma factor F and seems to fine-tune expression of some
genes in the sigma-F regulon. A paralog in Bacillus
subtilis is designated YlbO [Regulatory functions, DNA
interactions, Cellular processes, Sporulation and
germination].
Length = 161
Score = 29.3 bits (66), Expect = 2.3
Identities = 11/53 (20%), Positives = 24/53 (45%), Gaps = 9/53 (16%)
Query: 142 DKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQM 194
++ Q +L+ L+ L + +T+E+D Q + +D AR++
Sbjct: 111 NQNESLQKRNEELEKELEKLRQR---LSTIEEDY------QTLIDIMDRARKL 154
>gnl|CDD|214660 smart00434, TOP4c, DNA Topoisomerase IV. Bacterial DNA
topoisomerase IV, GyrA, ParC.
Length = 444
Score = 30.2 bits (69), Expect = 2.4
Identities = 24/119 (20%), Positives = 49/119 (41%), Gaps = 19/119 (15%)
Query: 76 KLEIDQATLKFLDLARQMEAFFLQKR--FLLSALKPELIVK-EVNMVTKDIVD-----LR 127
K + + +FLD + +R +LL L+ E + E + I+D +R
Sbjct: 321 KYNLKEILKEFLDHRLE----VYTRRKEYLLGKLEAERLHILEGLFIALSIIDEIIVLIR 376
Query: 128 HDLARKEELIKRHYDKIAV----WQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEIDQ 182
+E ++ ++ + +L D++ L+ LTK + E E++ EI+
Sbjct: 377 SSKDLAKEAKEKLMERFELSEIQADAIL-DMR--LRRLTKLEVEKLEKELKELEKEIED 432
>gnl|CDD|238163 cd00261, AAI_SS, AAI_SS: Alpha-Amylase Inhibitors (AAIs) and Seed
Storage (SS) Protein subfamily; composed of cereal-type
AAIs and SS proteins. They are mainly present in the
seeds of a variety of plants. AAIs play an important
role in the natural defenses of plants against insects
and pathogens such as fungi, bacteria and viruses. AAIs
impede the digestion of plant starch and proteins by
inhibiting digestive alpha-amylases and proteinases.
Also included in this subfamily are SS proteins such as
2S albumin, gamma-gliadin, napin, and prolamin. These
AAIs and SS proteins are also known allergens in humans.
Length = 110
Score = 28.5 bits (64), Expect = 2.4
Identities = 14/43 (32%), Positives = 19/43 (44%), Gaps = 5/43 (11%)
Query: 313 QQQQQVQMQQQIHM-----QHMQQQGMGPGGPPSGPGGPSSGM 350
QQQ Q QQ ++++QQ G GGPP P +
Sbjct: 1 QQQCQPGQQQPQQPLNSCREYLRQQCSGVGGPPVWPQQSCEVL 43
>gnl|CDD|218745 pfam05783, DLIC, Dynein light intermediate chain (DLIC). This
family consists of several eukaryotic dynein light
intermediate chain proteins. The light intermediate
chains (LICs) of cytoplasmic dynein consist of multiple
isoforms, which undergo post-translational modification
to produce a large number of species. DLIC1 is known to
be involved in assembly, organisation, and function of
centrosomes and mitotic spindles when bound to
pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2
that may play a role in maintaining Golgi organisation
by binding cytoplasmic dynein 2 to its Golgi-associated
cargo.
Length = 490
Score = 30.2 bits (68), Expect = 2.4
Identities = 27/98 (27%), Positives = 33/98 (33%), Gaps = 19/98 (19%)
Query: 321 QQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGP 380
QQ + + G P PGG PR +GP S P M
Sbjct: 357 QQSLLAKQPATPTRGVESPARSPGG-----------SPRTTNRSGPRNVASVSP--MTSV 403
Query: 381 GNLGPGGMGPGGLLQGPLA-----YLEKTTSNIGLPDG 413
+ P M PG +G LA L K T + G P G
Sbjct: 404 KKIDP-NMKPGAASEGVLANFFNSLLSKKTGSPGSPGG 440
>gnl|CDD|193258 pfam12782, Innate_immun, Invertebrate innate immunity transcript
family. The immune response of the purple sea urchin
appears to be more complex than previously believed in
that it uses immune-related gene families homologous to
vertebrate Toll-like and NOD/NALP-like receptor families
as well as C-type lectins and a rudimentary complement
system. In addition, the species also produces this
unusual family of mRNAs, also known as 185/333, which is
strongly upregulated in response to pathogen challenge.
Length = 312
Score = 30.0 bits (67), Expect = 2.4
Identities = 26/68 (38%), Positives = 27/68 (39%), Gaps = 14/68 (20%)
Query: 333 GMGPGGPPSGPGGPSSGMMFMGP-------------GGPRGGGNAGPPPFPSAGPGGMGG 379
GM GGP GGP G F GP GGP GG P F + P G GG
Sbjct: 48 GMQMGGPRQD-GGPMGGRRFDGPGSGAPQMDGRRQNGGPMGGRRFDGPRFGGSRPDGAGG 106
Query: 380 PGNLGPGG 387
G GG
Sbjct: 107 RPFFGQGG 114
Score = 28.1 bits (62), Expect = 10.0
Identities = 42/121 (34%), Positives = 44/121 (36%), Gaps = 16/121 (13%)
Query: 280 PGISAMGGPLP--GMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPG 337
PG MGGP G MGG PGS M G Q M + G G
Sbjct: 46 PGGMQMGGPRQDGGPMGGRRFDGPGSGAPQMDGRRQNGGP--------MGGRRFDGPRFG 97
Query: 338 GP-PSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQG 396
G P G GG F G GG RG G G G+GGPG G G QG
Sbjct: 98 GSRPDGAGGRP----FFGQGGRRGDGEEETDAAQQIG-DGLGGPGQFDGPGRRHHGHRQG 152
Query: 397 P 397
P
Sbjct: 153 P 153
>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
primarily archaeal type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. It is found
in a single copy and is homodimeric in prokaryotes, but
six paralogs (excluded from this family) are found in
eukaryotes, where SMC proteins are heterodimeric. This
family represents the SMC protein of archaea and a few
bacteria (Aquifex, Synechocystis, etc); the SMC of other
bacteria is described by TIGR02168. The N- and
C-terminal domains of this protein are well conserved,
but the central hinge region is skewed in composition
and highly divergent [Cellular processes, Cell division,
DNA metabolism, Chromosome-associated proteins].
Length = 1164
Score = 30.0 bits (68), Expect = 2.6
Identities = 24/124 (19%), Positives = 52/124 (41%), Gaps = 13/124 (10%)
Query: 117 NMVTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKE-DEVSTTLEK-- 173
+ K+I +L EE ++ + L DL+S L L KE DE+ L +
Sbjct: 850 KSIEKEIENLNGKKEELEEELEEL-------EAALRDLESRLGDLKKERDELEAQLRELE 902
Query: 174 ---DEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEDIVDLRHDLARKEE 230
+E++ +I++ + +L ++EA + + + + E+ + L A +
Sbjct: 903 RKIEELEAQIEKKRKRLSELKAKLEALEEELSEIEDPKGEDEEIPEEELSLEDVQAELQR 962
Query: 231 LIKR 234
+ +
Sbjct: 963 VEEE 966
>gnl|CDD|236729 PRK10636, PRK10636, putative ABC transporter ATP-binding protein;
Provisional.
Length = 638
Score = 30.1 bits (68), Expect = 2.6
Identities = 16/73 (21%), Positives = 39/73 (53%), Gaps = 3/73 (4%)
Query: 126 LRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKD-EIKLEIDQAT 184
LR ++AR E+ +++ ++A + L D S L +++ E++ L++ K +++
Sbjct: 561 LRKEIARLEKEMEKLNAQLAQAEEKLGD--SELYDQSRKAELTACLQQQASAKSGLEECE 618
Query: 185 LKFLDLARQMEAF 197
+ +L+ Q+E
Sbjct: 619 MAWLEAQEQLEQM 631
>gnl|CDD|227938 COG5651, COG5651, PPE-repeat proteins [Cell motility and
secretion].
Length = 490
Score = 29.9 bits (67), Expect = 2.7
Identities = 26/152 (17%), Positives = 38/152 (25%), Gaps = 29/152 (19%)
Query: 263 SSASGTTPPNSTPTQSGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQ 322
+ + + +G +A+GG G PG +
Sbjct: 339 LGVANSGSAAAPFGIAGANQAALGGANSGAGNFGLGNNPGGGLGGKPLG----------- 387
Query: 323 QIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPR----GGGNAGPPPFPSAGPGGMG 378
G G GG G ++G G G NAG +A G G
Sbjct: 388 ----------GTGNGGI-GASGIGNTGYGNSGIANAGLSNAGSNNAGGENAGNANNTGGG 436
Query: 379 GPGNLGPGGMGPGGLLQGPLAYLEKTTSNIGL 410
G G G + + N G
Sbjct: 437 NVGLWNAGDFNAGAA---GTGFTNNGSYNTGF 465
Score = 29.9 bits (67), Expect = 2.8
Identities = 22/79 (27%), Positives = 26/79 (32%), Gaps = 3/79 (3%)
Query: 335 GPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAG--PGGMGGPGNLGPGGMGPGG 392
G SG G+ GG N+G F GG+GG G G G G
Sbjct: 338 NLGVANSGSAAAPFGIAGANQAAL-GGANSGAGNFGLGNNPGGGLGGKPLGGTGNGGIGA 396
Query: 393 LLQGPLAYLEKTTSNIGLP 411
G Y +N GL
Sbjct: 397 SGIGNTGYGNSGIANAGLS 415
Score = 29.5 bits (66), Expect = 3.5
Identities = 25/154 (16%), Positives = 33/154 (21%), Gaps = 6/154 (3%)
Query: 253 AKSPAHQGSTSSASGTTPPNSTPTQSGPGISAMGGPLPGMM--GGMAPIVPGSTMQPMSG 310
+ A G+ S + S + S + GG A +
Sbjct: 287 GLAAAGTGNIGSGNAVDSGGSALVGAIGQTSQATANAGSVNATGGAAAGSGNLGVANSGS 346
Query: 311 MPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFP 370
Q G G G + PGG G G G G G +G
Sbjct: 347 AAAPFGIAGANQAALGG--ANSGAGNFGLGNNPGGGLGGKPLGG-TGNGGIGASGIGNTG 403
Query: 371 SA-GPGGMGGPGNLGPGGMGPGGLLQGPLAYLEK 403
G N G G
Sbjct: 404 YGNSGIANAGLSNAGSNNAGGENAGNANNTGGGN 437
>gnl|CDD|224273 COG1354, scpA, Rec8/ScpA/Scc1-like protein (kleisin family)
[Replication, recombination, and repair].
Length = 248
Score = 29.6 bits (67), Expect = 2.8
Identities = 37/234 (15%), Positives = 82/234 (35%), Gaps = 44/234 (18%)
Query: 46 QGPLAYLEKTTSNIGLPDGRSLSPLEKDEIKL-EID--QATLKFLDLARQMEAFFLQK-- 100
+GPL L L + K +I +I + T ++L +++ L+
Sbjct: 12 EGPLDLL--------------LHLIRKGKIDPWDIPIVELTDQYLAYIEELKKLDLEVAA 57
Query: 101 RFLLSA-----LKPELIVKEVNMVTKDIV--DLRHDLARKEELIKRHYDKIAVWQNLLSD 153
+L+ A +K +++ + +D + R +L + E +R+ + + L +
Sbjct: 58 DYLVMAAILLRIKSRMLLPKEEEEAEDEELEEPRDELVARLEEYERYKEAAELLAELEEE 117
Query: 154 LQSCLQVLTKEDEVSTTLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELI 213
+ V +K ++ EI L F + + K+ L ++ ++
Sbjct: 118 RR---DVFSKIKPEIKIKKERRPVEEISLIDL-FRAYQKILR---RVKQEELVEIERIVL 170
Query: 214 VKEDIVDLRHDLARKEELIKRHYDKIAVWQNL-LSDLQGWAKSPAHQGSTSSAS 266
+ + EE ++ ++ L SDL + ST A
Sbjct: 171 EELSV----------EEQLEELLARLEARGVLRFSDLFSPEERKDEVVSTFLAL 214
>gnl|CDD|222095 pfam13388, DUF4106, Protein of unknown function (DUF4106). This
family of proteins is found in large numbers in the
Trichomonas vaginalis proteome. The function of this
protein is unknown.
Length = 422
Score = 29.6 bits (66), Expect = 2.8
Identities = 29/92 (31%), Positives = 32/92 (34%), Gaps = 15/92 (16%)
Query: 260 GSTSSASGT-TPPNSTPTQSGPGI-----SAMG-----GPLPGMMGGMAPIVPGSTMQPM 308
G+ ASGT PPN PG+ S+ G P P P V QP
Sbjct: 162 GTYILASGTYIPPNPPREAPAPGLPKTFTSSHGHRHRHAPKPTQQ----PTVQNPAQQPT 217
Query: 309 SGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPP 340
P QQ Q Q QQQ Q P P
Sbjct: 218 VQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQP 249
>gnl|CDD|226808 COG4371, COG4371, Predicted membrane protein [Function unknown].
Length = 334
Score = 29.5 bits (66), Expect = 2.9
Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 5/55 (9%)
Query: 344 GGPSSGMMFMGPGGPRGGGNAGPPP-----FPSAGPGGMGGPGNLGPGGMGPGGL 393
GG G F P G G + G P GG G P + GG G G
Sbjct: 50 GGRIGGGSFRAPSGYSRGYSGGGPSGGGYSGGGYSGGGFGFPFIIPGGGGGGGFG 104
Score = 28.7 bits (64), Expect = 5.0
Identities = 22/54 (40%), Positives = 23/54 (42%), Gaps = 9/54 (16%)
Query: 337 GGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP------PFPSAGPGGMGGPGNLG 384
GG P G S G GGP GGG +G FP PGG GG G G
Sbjct: 55 GGSFRAPSGYSRGY---SGGGPSGGGYSGGGYSGGGFGFPFIIPGGGGGGGFGG 105
>gnl|CDD|220161 pfam09273, Rubis-subs-bind, Rubisco LSMT substrate-binding.
Members of this family adopt a multihelical structure,
with an irregular array of long and short alpha-helices.
They allow binding of the protein to substrate, such as
the N-terminal tails of histones H3 and H4 and the large
subunit of the Rubisco holoenzyme complex.
Length = 128
Score = 28.5 bits (64), Expect = 2.9
Identities = 11/37 (29%), Positives = 14/37 (37%), Gaps = 6/37 (16%)
Query: 144 IAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEI 180
Q L + L E TTLE+DE L+
Sbjct: 77 EKALQFLEKLCKLLL------SEYPTTLEEDEALLKK 107
>gnl|CDD|180777 PRK06958, PRK06958, single-stranded DNA-binding protein;
Provisional.
Length = 182
Score = 29.0 bits (65), Expect = 3.0
Identities = 27/79 (34%), Positives = 29/79 (36%), Gaps = 6/79 (7%)
Query: 314 QQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAG 373
Q Q + +I MQ G G GG G GG G G GG GG S
Sbjct: 90 QDGQDRYSTEIVADQMQMLG-GRGGSGGGGGGGDEGGYGGGGGGGGGGYGGE-----SRS 143
Query: 374 PGGMGGPGNLGPGGMGPGG 392
GG G G GG G G
Sbjct: 144 GGGGGRASGGGGGGAGGGA 162
>gnl|CDD|222706 pfam14356, DUF4403, Domain of unknown function (DUF4403). This
family of proteins is functionally uncharacterized. This
family of proteins is found in bacteria. Proteins in
this family are typically between 455 and 518 amino
acids in length. There is a single completely conserved
residue W that may be functionally important.
Length = 425
Score = 29.5 bits (67), Expect = 3.1
Identities = 17/75 (22%), Positives = 32/75 (42%), Gaps = 8/75 (10%)
Query: 61 LPDGRSLSPLEKDEIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVT 120
LP+ + L + D ++ + A + + +L R + K F LS ++ VK V++
Sbjct: 236 LPNLKILPSIS-DGFRINL-PADIPYAELNRLLNRQLAGKTFPLSG-GRKVTVKSVSVYG 292
Query: 121 KD-----IVDLRHDL 130
VD+ L
Sbjct: 293 SGDRLVIAVDVDGSL 307
>gnl|CDD|218635 pfam05556, Calsarcin, Calcineurin-binding protein (Calsarcin).
This family consists of several mammalian
calcineurin-binding proteins. The calcium- and
calmodulin-dependent protein phosphatase calcineurin has
been implicated in the transduction of signals that
control the hypertrophy of cardiac muscle and slow fibre
gene expression in skeletal muscle. Calsarcin-1 and
calsarcin-2 are expressed in developing cardiac and
skeletal muscle during embryogenesis, but calsarcin-1 is
expressed specifically in adult cardiac and slow-twitch
skeletal muscle, whereas calsarcin-2 is restricted to
fast skeletal muscle. Calsarcins represent a novel
family of sarcomeric proteins that link calcineurin with
the contractile apparatus, thereby potentially coupling
muscle activity to calcineurin activation. Calsarcin-3,
is expressed specifically in skeletal muscle and is
enriched in fast-twitch muscle fibres. Like calsarcin-1
and calsarcin-2, calsarcin-3 interacts with calcineurin,
and the Z-disc proteins alpha-actinin, gamma-filamin,
and telethonin.
Length = 273
Score = 29.4 bits (66), Expect = 3.3
Identities = 15/66 (22%), Positives = 22/66 (33%), Gaps = 9/66 (13%)
Query: 326 MQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP 385
++Q+ GG GG S G + G G + G + P + P
Sbjct: 86 KDNLQKFVPSQGGQ----GGNSEGSIPQGDSHQPGQT-----QPNTPDLGSVYNPEAIAP 136
Query: 386 GGMGPG 391
G GP
Sbjct: 137 GYGGPL 142
>gnl|CDD|233432 TIGR01480, copper_res_A, copper-resistance protein, CopA family.
This model represents the CopA copper resistance protein
family. CopA is related to laccase (benzenediol:oxygen
oxidoreductase) and L-ascorbate oxidase, both
copper-containing enzymes. Most members have a typical
TAT (twin-arginine translocation) signal sequence with
an Arg-Arg pair. Twin-arginine translocation is observed
for a large number of periplasmic proteins that cross
the inner membrane with metal-containing cofactors
already bound. The combination of copper-binding sites
and TAT translocation motif suggests a mechanism of
resistance by packaging and export [Cellular processes,
Detoxification, Transport and binding proteins, Cations
and iron carrying compounds].
Length = 587
Score = 29.8 bits (67), Expect = 3.3
Identities = 21/113 (18%), Positives = 32/113 (28%), Gaps = 7/113 (6%)
Query: 304 TMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGN 363
T+ G+ + + + M+ M GM G M M PG
Sbjct: 346 TLAVRLGLTAPVPALDPRPLLTMKDMGMGGMH-----HGMDHSKMSMGGM-PGMDMSMRA 399
Query: 364 AGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPLAYLEKTTSNIGLPD-GRR 415
P + P + + P + + IGL D GRR
Sbjct: 400 QSNAPMDHSQMAMDASPKHPASEPLNPLVDMIVDMPMDRMDDPGIGLRDNGRR 452
>gnl|CDD|221510 pfam12288, CsoS2_M, Carboxysome shell peptide mid-region. This
domain family is found in bacteria and eukaryotes, and
is approximately 430 amino acids in length. This family
is annotated frequently as a carboxysome shell peptide,
however, there is little published evidence to confirm this.
Length = 424
Score = 29.7 bits (67), Expect = 3.4
Identities = 21/94 (22%), Positives = 35/94 (37%), Gaps = 13/94 (13%)
Query: 269 TPPNSTPTQSGPGISAMGGPLPGMMGGMAPIV----PGS----TMQPMSGMPQQQQQVQM 320
+ P + G ++ G + G G + V PG+ T P +G+ Q QQ +
Sbjct: 307 SRPKPEAAKVGFSLTNKGQKVSGTRTGRSEGVTGDEPGTCKAVTGTPYAGLEQAQQFCSV 366
Query: 321 QQQIHMQHMQQQGMGPGGPP-----SGPGGPSSG 349
++ + G GP G GG +G
Sbjct: 367 DAVNEIKVRTPRRAGTPGPRLTGQQPGIGGVMTG 400
>gnl|CDD|213398 cd12191, gal11_coact, Gal11 coactivator domain. Gal11/MED15 acts
in the general regulation of GAL structural genes and is
required for full expression for several genes in this
pathway, including GALs 1,7, and 10 in Saccharomyces
cerevisiae. GAL11 function is dependent on GCN4
functionality and binds GCN4 in a degenerate manner with
multiple orientations found at the GCN4-Gal11 interface.
Length = 90
Score = 27.7 bits (62), Expect = 3.5
Identities = 10/26 (38%), Positives = 13/26 (50%), Gaps = 4/26 (15%)
Query: 311 MPQQQQQV----QMQQQIHMQHMQQQ 332
PQ +Q+ Q Q+ MQ QQQ
Sbjct: 61 PPQAMEQIKEVQQTHFQLLMQRRQQQ 86
>gnl|CDD|218161 pfam04589, RFX1_trans_act, RFX1 transcription activation region.
The RFX family is a family of winged-helix DNA binding
proteins. RFX1 is a regulatory factor essential for
expression of MHC class II genes. This region is
found N-terminal to the RFX DNA binding region
(pfam02257) in some mammalian RFX proteins, and is
thought to activate transcription when associated with
DNA. Deletion analysis has identified the region 233-351
in human RFX1 as being required for maximal activation.
Length = 150
Score = 28.4 bits (63), Expect = 3.9
Identities = 17/73 (23%), Positives = 26/73 (35%), Gaps = 1/73 (1%)
Query: 261 STSSASGTTPPNSTPTQ-SGPGISAMGGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQ 319
+S G+ P S Q S P + + + G +Q + QQ Q
Sbjct: 1 MQTSEGGSDSPASVALQTSVPAQAPVPASQQRSVVQATSQTKGGPVQQLPVHRVQQVPQQ 60
Query: 320 MQQQIHMQHMQQQ 332
+QQ H+ Q Q
Sbjct: 61 VQQVQHVYPAQVQ 73
>gnl|CDD|197548 smart00157, PRP, Major prion protein. The prion protein is a major
component of scrapie-associated fibrils in
Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler
syndrome and bovine spongiform encephalopathy.
Length = 218
Score = 29.1 bits (65), Expect = 3.9
Identities = 19/51 (37%), Positives = 19/51 (37%), Gaps = 6/51 (11%)
Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGN 382
QG G G P G G G G G GG G P G G GG N
Sbjct: 31 QGGGWGQPHGGGWGQPHG----GGWGQPHGGGWGQP--HGGGWGQGGGTHN 75
Score = 28.7 bits (64), Expect = 5.1
Identities = 20/60 (33%), Positives = 20/60 (33%), Gaps = 13/60 (21%)
Query: 336 PGG---PPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGG 392
PGG PP G G G P GGG P P G G G G GG
Sbjct: 23 PGGNRYPPQGGGW----------GQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGG 72
>gnl|CDD|220871 pfam10759, DUF2587, Protein of unknown function (DUF2587). This is
a bacterial family of proteins with no known function.
Length = 168
Score = 28.5 bits (64), Expect = 4.0
Identities = 12/44 (27%), Positives = 20/44 (45%), Gaps = 5/44 (11%)
Query: 319 QMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGG 362
QM + ++ M+++ + PG + PG P GGP G
Sbjct: 126 QMAARAQLEQMRRRALPPGVGIAPPGQPQ-----GARGGPPPGT 164
>gnl|CDD|214360 CHL00094, dnaK, heat shock protein 70.
Length = 621
Score = 29.3 bits (66), Expect = 4.1
Identities = 29/123 (23%), Positives = 50/123 (40%), Gaps = 20/123 (16%)
Query: 68 SPLEKDEIK---------LEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNM 118
S L KDE++ D+ + +DL Q E+ Q L LK ++ ++
Sbjct: 500 STLPKDEVERMVKEAEKNAAEDKEKREKIDLKNQAESLCYQAEKQLKELKDKISEEKKEK 559
Query: 119 VTKDIVDLRHDLARKEELIKRHYDKIAVWQNLLSDLQSCLQVLTKEDEVSTTLEKDEIKL 178
+ I LR + L +Y+ I ++LL +LQ L + K EV ++ +
Sbjct: 560 IENLIKKLR------QALQNDNYESI---KSLLEELQKALMEIGK--EVYSSTSTTDPAS 608
Query: 179 EID 181
D
Sbjct: 609 NDD 611
>gnl|CDD|218237 pfam04738, Lant_dehyd_C, Lantibiotic dehydratase, C terminus.
Lantibiotics are ribosomally synthesised antimicrobial
agents derived from ribosomally synthesised peptides.
They are produced by bacteria of the Firmicutes phylum,
and include mutacin, subtilin, and nisin. Lantibiotic
peptides contain thioether bridges termed lanthionines
that are thought to be generated by dehydration of
serine and threonine residues followed by addition of
cysteine residues. This family constitutes the
C-terminus of the enzyme proposed to catalyze the
dehydration step.
Length = 500
Score = 29.3 bits (66), Expect = 4.6
Identities = 18/86 (20%), Positives = 38/86 (44%), Gaps = 8/86 (9%)
Query: 180 IDQATLKFLDL-ARQMEAF---FLQKRFLLSALKPELIVKEDIVDLRHDLARKE-ELIKR 234
++ +F D A ++E + ++K FL+S L+P V + + L L +
Sbjct: 7 VELLATEFPDAPAEKVEEYLAKLIEKGFLISELRPPSTVADPLDYLIEKLEALDVPEANE 66
Query: 235 HYDKIAVWQNLLSDLQGWAKSPAHQG 260
+ Q L+++ +A+ P +G
Sbjct: 67 LLAALREIQKLIAE---YAELPIGEG 89
>gnl|CDD|220915 pfam10961, DUF2763, Protein of unknown function (DUF2763). This
eukaryotic family of proteins has no known function.
Length = 91
Score = 27.4 bits (61), Expect = 4.8
Identities = 12/35 (34%), Positives = 13/35 (37%), Gaps = 7/35 (20%)
Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP 367
G G GPP G MG G GG + P
Sbjct: 61 GRGGPGPPGGGRR-------MGRIGGGGGPSRPPM 88
>gnl|CDD|188414 TIGR03899, TIGR03899, TIGR03899 family protein. Members of this
protein family are conserved hypothetical proteins with
a limited species distribution within the
Gammaproteobacteria. It is common in the genera Vibrio
and Shewanella, and in this resembles the C-terminal
domain and putative protein sorting motif TIGR03501.
This model, by design, does not extend to all
homologs, but rather represents a particular clade.
Length = 250
Score = 28.7 bits (65), Expect = 4.9
Identities = 6/14 (42%), Positives = 9/14 (64%)
Query: 318 VQMQQQIHMQHMQQ 331
QM + IH + MQ+
Sbjct: 67 FQMAEDIHNRSMQE 80
>gnl|CDD|226513 COG4026, COG4026, Uncharacterized protein containing TOPRIM domain,
potential nuclease [General function prediction only].
Length = 290
Score = 28.7 bits (64), Expect = 5.0
Identities = 28/153 (18%), Positives = 69/153 (45%), Gaps = 22/153 (14%)
Query: 45 LQGPLAYLEKTTSNIGLPDGRSLSPLEKDEIKLEIDQATLK--FLDLARQMEAFFLQKRF 102
L+G + ++E+ + +P G + ++ + ++ E+ A ++ L R E L++ +
Sbjct: 82 LRGMVGHIER----MKIPIGHDVEHIDVELVRKELKNALVRAGLKTLQRVPEYMDLKEDY 137
Query: 103 L--------LSALKPELIVK------EVNMVTKDIVDLRHDLARKEELIKRHYDKIAVWQ 148
L K EL+ + E V + + L + +R EE++K+ ++ +
Sbjct: 138 EELKEKLEELQKEKEELLKELEELEAEYEEVQERLKRLEVENSRLEEMLKKLPGEVYDLK 197
Query: 149 NLLSDLQSCLQVLTKEDEVSTTLEKDEIKLEID 181
+L+ +++ E+E+ + L K+ + L
Sbjct: 198 KRWDELEPGVELP--EEELISDLVKETLNLAPK 228
>gnl|CDD|234354 TIGR03789, pdsO, proteobacterial sortase system OmpA family
protein. A newly defined histidine kinase (TIGR03785)
and response regulator (TIGR03787) gene pair occurs
exclusively in Proteobacteria, mostly of marine origin,
nearly all of which contain a subfamily 6 sortase
(TIGR03784) and its single dedicated target protein
(TIGR03788) adjacent to the sortase. This protein
family shows up only in those species with the
histidine kinase/response regulator gene pair, and often
adjacent to that pair. It belongs to the OmpA protein
family (pfam00691). Its function is unknown. We assign
the gene symbol pdsO, for Proteobacterial Dedicated
Sortase system OmpA family protein.
Length = 239
Score = 28.6 bits (64), Expect = 5.1
Identities = 18/95 (18%), Positives = 38/95 (40%), Gaps = 17/95 (17%)
Query: 255 SPAHQGSTSSASGTTPPNSTPTQS---------GPGI---SAMGGPLPGMMGGMAPIVPG 302
S + + P T Q G G + +GGP+ ++GG+ + G
Sbjct: 15 SSVAATTYQNQPHLQTPQETSQQEADQEALIGLGSGALLGALVGGPVGAIIGGITGGLIG 74
Query: 303 STM---QPMSGMPQQQQQVQM--QQQIHMQHMQQQ 332
+ + + QQ+QQ+ Q+Q ++ ++ +
Sbjct: 75 QAVNNDEQQQHIAQQRQQMVALTQKQQALEQLEAE 109
>gnl|CDD|215969 pfam00521, DNA_topoisoIV, DNA gyrase/topoisomerase IV, subunit A.
Length = 427
Score = 29.0 bits (66), Expect = 5.2
Identities = 23/89 (25%), Positives = 40/89 (44%), Gaps = 16/89 (17%)
Query: 98 LQKRF-LLSALKPEL-IVKEVNMV---TKDIVDLRHDLARKEELIKRHYDKIAVWQNLLS 152
L++R +L L L + V V + D+ + +L EEL + D +
Sbjct: 332 LEERLHILEGLLKALNKIDFVIEVIRGSIDLKKAKKELI--EELSEIQADYLL------- 382
Query: 153 DLQSCLQVLTKEDEVSTTLEKDEIKLEID 181
D++ L+ LTKE+ E +E++ EI
Sbjct: 383 DMR--LRRLTKEEIEKLEKEIEELEKEIA 409
>gnl|CDD|233508 TIGR01649, hnRNP-L_PTB, hnRNP-L/PTB/hephaestus splicing factor
family. Included in this family of heterogeneous
ribonucleoproteins are PTB (polypyrimidine tract binding
protein) and hnRNP-L. These proteins contain four RNA
recognition motifs (rrm: pfam00067).
Length = 481
Score = 29.0 bits (65), Expect = 5.5
Identities = 19/88 (21%), Positives = 25/88 (28%), Gaps = 5/88 (5%)
Query: 298 PIVPGSTMQPMSGMPQQQQQVQMQQQIHMQHMQQQGMG--PGGPPSGPGGPSSGMMFMGP 355
P +PG + +Q+ H G G GG G P
Sbjct: 191 PDLPGR--RDPGLDQTHRQRQPALLGQHPSSYGHDGYSSHGGPLAPLAGGDRMGPPHGPP 248
Query: 356 GGPRGGGNAGPPPFPSAGPG-GMGGPGN 382
R A P + G GGPG+
Sbjct: 249 SRYRPAYEAAPLAPAISSYGPAGGGPGS 276
Score = 28.2 bits (63), Expect = 8.0
Identities = 18/83 (21%), Positives = 25/83 (30%), Gaps = 3/83 (3%)
Query: 315 QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGP 374
++ + Q + G P G + GG R G GPP
Sbjct: 196 RRDPGLDQTHRQRQPALLGQHPSSYGHDGYSSHGGPLAPLAGGDRMGPPHGPPSRYRPAY 255
Query: 375 GGMGGPGNL---GPGGMGPGGLL 394
+ GP G GPG +L
Sbjct: 256 EAAPLAPAISSYGPAGGGPGSVL 278
>gnl|CDD|227911 COG5624, TAF61, Transcription initiation factor TFIID, subunit
TAF12 (also component of histone acetyltransferase SAGA)
[Transcription].
Length = 505
Score = 28.9 bits (64), Expect = 5.5
Identities = 20/82 (24%), Positives = 31/82 (37%), Gaps = 6/82 (7%)
Query: 292 MMGGMAPIVPG-STMQPMSGMPQQQQ--QVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSS 348
M GG+ + G S + + PQ QQ + + Q H ++ G PP S+
Sbjct: 214 MAGGVYGVHDGRSKRRLVDRYPQFQQGQKQVLSPQQRFLHGMERYEASGMPPPAEWAGSN 273
Query: 349 GMMFMG---PGGPRGGGNAGPP 367
G+ + PRG P
Sbjct: 274 GLHVLPGRREEVPRGIFRCPSP 295
>gnl|CDD|184281 PRK13729, PRK13729, conjugal transfer pilus assembly protein TraB;
Provisional.
Length = 475
Score = 29.0 bits (65), Expect = 5.6
Identities = 22/89 (24%), Positives = 31/89 (34%), Gaps = 8/89 (8%)
Query: 315 QQQVQMQQQIH-----MQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPP- 368
+Q+ Q++I + +Q G P G M P GP G G P
Sbjct: 97 KQRGDDQRRIEKLGQDNAALAEQVKALGANPVTATGEPVPQMPASPPGPEGEPQPGNTPV 156
Query: 369 -FPSAGPGGMGGPGNLGPG-GMGPGGLLQ 395
FP G + P PG G+ P +
Sbjct: 157 SFPPQGSVAVPPPTAFYPGNGVTPPPQVT 185
>gnl|CDD|182398 PRK10350, PRK10350, hypothetical protein; Provisional.
Length = 145
Score = 28.1 bits (62), Expect = 5.7
Identities = 15/73 (20%), Positives = 21/73 (28%)
Query: 314 QQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAG 373
QQQ +Q Q + Q +QQ G ++G M P N
Sbjct: 65 QQQHLQNQINNNSQRVQQGQPGNNPARQQMLPNTNGGMLNSNRNPDSSLNQQHMLPERRN 124
Query: 374 PGGMGGPGNLGPG 386
+ P P
Sbjct: 125 GDMLNQPSTPQPD 137
>gnl|CDD|226711 COG4260, COG4260, Membrane protease subunit, stomatin/prohibitin
family [Amino acid transport and metabolism].
Length = 345
Score = 28.7 bits (64), Expect = 6.0
Identities = 10/38 (26%), Positives = 11/38 (28%)
Query: 286 GGPLPGMMGGMAPIVPGSTMQPMSGMPQQQQQVQMQQQ 323
GG G + G M Q Q Q Q Q
Sbjct: 262 GGAAGTFAGMAMGMQMGQGMMESLRTSLQGNQGQAQAQ 299
>gnl|CDD|221321 pfam11928, DUF3446, Domain of unknown function (DUF3446). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 80 and 99 amino acids in length. This domain is
found associated with pfam00096. This domain has a
single completely conserved residue P that may be
functionally important.
Length = 84
Score = 26.7 bits (59), Expect = 6.5
Identities = 14/62 (22%), Positives = 24/62 (38%), Gaps = 4/62 (6%)
Query: 244 NLLSDLQGWAKSPAHQGSTSSASGTTPPNSTPTQSG----PGISAMGGPLPGMMGGMAPI 299
+L+S L G + P +SS+S ++ + +P S S + P I
Sbjct: 22 SLVSGLVGMSNPPPSSSPSSSSSSSSSSSQSPPLSCSVHQSEPSPIYSAAPPYSSACGDI 81
Query: 300 VP 301
P
Sbjct: 82 YP 83
>gnl|CDD|147458 pfam05268, GP38, Phage tail fibre adhesin Gp38. This family
contains several Gp38 proteins from T-even-like phages.
Gp38, together with a second phage protein, gp57,
catalyzes the organisation of gp37 but is absent from
the phage particle. Gp37 is responsible for receptor
recognition.
Length = 261
Score = 28.2 bits (63), Expect = 6.5
Identities = 24/73 (32%), Positives = 24/73 (32%), Gaps = 22/73 (30%)
Query: 335 GPGGPPSGPGGPSSGMM------FMGPGGPRG---------GGNAGPPPFPSAGPGGMGG 379
G GG P G GG S M PGG G GGN G GG
Sbjct: 175 GGGGRPFGAGGKSGSHMSGGNASLTAPGGGSGTGSAYGGGNGGNVG-------AAGGRAW 227
Query: 380 PGNLGPGGMGPGG 392
GN G G G
Sbjct: 228 GGNGYEYGGGAAG 240
>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen. This
family consists of several Theileria P67 surface
antigens. A stage specific surface antigen of Theileria
parva, p67, is the basis for the development of an
anti-sporozoite vaccine for the control of East Coast
fever (ECF) in cattle. The antigen has been shown to
contain five distinct linear peptide sequences
recognised by sporozoite-neutralising murine monoclonal
antibodies.
Length = 727
Score = 28.9 bits (64), Expect = 6.8
Identities = 20/63 (31%), Positives = 23/63 (36%), Gaps = 4/63 (6%)
Query: 335 GPGGPPS-GPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGP-GGMGPGG 392
P P P PS+ + P +G G AG PSA G G G G G
Sbjct: 638 VPSDPTKVTPTQPSN--LPQVPTSGQGNGTAGGEQPPSAPNGTGNGEGGKDLKEGEKKEG 695
Query: 393 LLQ 395
L Q
Sbjct: 696 LFQ 698
>gnl|CDD|151935 pfam11498, Activator_LAG-3, Transcriptional activator LAG-3. The
C.elegans Notch pathway, involved in the control of
growth, differentiation and patterning in animal
development, relies on either of the receptors GLP-1 or
LIN-12. Both these receptors promote signalling by the
recruitment of LAG-3 to target promoters, where it then
acts as a transcriptional activator. LAG-3 works as a
ternary complex together with the DNA binding protein,
LAG-1.
Length = 476
Score = 28.8 bits (63), Expect = 6.9
Identities = 19/57 (33%), Positives = 23/57 (40%), Gaps = 8/57 (14%)
Query: 305 MQPMSGMPQQQQQVQMQQQIHMQHMQQQGMG------PGGPPSGPGG--PSSGMMFM 353
M+ + QQQQ Q QQ QH Q G P G P+ G P+ G M
Sbjct: 410 MRLQEQIQHQQQQAQHHQQAQQQHQQPAQHGQMGYGIPNGYPAHMHGHAPAYGAHHM 466
Score = 28.4 bits (62), Expect = 8.5
Identities = 13/33 (39%), Positives = 16/33 (48%)
Query: 306 QPMSGMPQQQQQVQMQQQIHMQHMQQQGMGPGG 338
Q QQQQ+ +QQQ M +QQ GG
Sbjct: 358 QQQQQQEHQQQQMLLQQQQQMHQLQQHHQMNGG 390
>gnl|CDD|182745 PRK10803, PRK10803, tol-pal system protein YbgF; Provisional.
Length = 263
Score = 28.2 bits (63), Expect = 7.2
Identities = 16/55 (29%), Positives = 19/55 (34%), Gaps = 1/55 (1%)
Query: 313 QQQQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPP 367
Q Q V+ Q+QI + G S G S P G NAG P
Sbjct: 83 QLNQVVERQKQI-YLQIDSLSSGGAAAQSTSGDQSGAAASATPAADAGTANAGAP 136
>gnl|CDD|221581 pfam12446, DUF3682, Protein of unknown function (DUF3682). This
domain family is found in eukaryotes, and is typically
between 125 and 136 amino acids in length.
Length = 133
Score = 27.5 bits (61), Expect = 7.3
Identities = 15/40 (37%), Positives = 16/40 (40%), Gaps = 6/40 (15%)
Query: 333 GMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
G G GG SG P+ P GP G NA P P
Sbjct: 4 GDGTGGVSSGSSAPA------PPAGPGPGPNAPPAPAAPG 37
>gnl|CDD|233667 TIGR01982, UbiB, 2-polyprenylphenol 6-hydroxylase. This model
represents the enzyme (UbiB) which catalyzes the first
hydroxylation step in the ubiquinone biosynthetic
pathway in bacteria. It is believed that the reaction is
2-polyprenylphenol -> 6-hydroxy-2-polyprenylphenol. This
model finds hits primarily in the proteobacteria. The
gene is also known as AarF in certain species
[Biosynthesis of cofactors, prosthetic groups, and
carriers, Menaquinone and ubiquinone].
Length = 437
Score = 28.4 bits (64), Expect = 8.7
Identities = 18/63 (28%), Positives = 25/63 (39%), Gaps = 4/63 (6%)
Query: 74 EIKLEIDQATLKFLDLARQMEAFFLQKRFLLSALKPELIVKEVNMVTKDIVDLRHDLARK 133
I+ I LAR +E R L+P +VKE + +DLR + A
Sbjct: 152 GIEKTIAADIALLYRLARIVERLSPDSR----RLRPTEVVKEFEKTLRRELDLRREAANA 207
Query: 134 EEL 136
EL
Sbjct: 208 SEL 210
>gnl|CDD|222449 pfam13908, Shisa, Wnt and FGF inhibitory regulator. Shisa is a
transcription factor-type molecule that physically
interacts with immature forms of the Wnt receptor
Frizzled and the FGF receptor within the endoplasmic
reticulum to inhibit their post-translational maturation
and trafficking to the cell surface.
Length = 177
Score = 27.9 bits (62), Expect = 8.7
Identities = 12/59 (20%), Positives = 13/59 (22%), Gaps = 6/59 (10%)
Query: 319 QMQQQIHMQHMQQQGMGPGGPPSG-PGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGG 376
+Q Q PG G P M P PPP G
Sbjct: 121 TVQTTPLPQPPSTAPSYPGPQYQGYHPMPPQPGMPAPP-----YSLQYPPPGLLQPQGP 174
>gnl|CDD|227690 COG5403, COG5403, Uncharacterized conserved protein [Function
unknown].
Length = 285
Score = 28.0 bits (62), Expect = 9.1
Identities = 24/91 (26%), Positives = 31/91 (34%), Gaps = 3/91 (3%)
Query: 308 MSGMPQQ--QQQVQMQQQIHMQHMQQQGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAG 365
+G+ QQ Q + M + M + +Q G G G + P G GGG
Sbjct: 111 RAGIDQQIAMQMLPMVASLIMGGLFKQTTAQMGQMGGNMGGQNPGGMSLPQGMGGGGGGA 170
Query: 366 PPPFPSAGPGG-MGGPGNLGPGGMGPGGLLQ 395
P GG P GM GG Q
Sbjct: 171 LGPILGPQLGGPADNPLGSVLQGMFGGGQAQ 201
>gnl|CDD|221759 pfam12764, Gly-rich_Ago1, Glycine-rich region of argonaut. This
domain is often found at the very N-terminal of
argonaut-like proteins.
Length = 102
Score = 26.9 bits (59), Expect = 9.3
Identities = 20/41 (48%), Positives = 21/41 (51%), Gaps = 7/41 (17%)
Query: 332 QGMGPGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSA 372
QG G GGPP G G GG RGGG+ G PP PS
Sbjct: 7 QGRGRGGPPQQGGRG-------GGGGGRGGGSTGGPPRPSV 40
>gnl|CDD|164795 PHA00370, III, attachment protein.
Length = 297
Score = 28.0 bits (62), Expect = 9.3
Identities = 24/81 (29%), Positives = 30/81 (37%), Gaps = 6/81 (7%)
Query: 5 GPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGPGGLLQGPLAYLEKTTSNIGLPDG 64
GG GGGN G G GG G+ G G G G L K G D
Sbjct: 95 DTGGDTGGGNTG------GGSGGGDTGGSGGGGSDGGGSEGGSTGKSLTKEGVGAGDFDY 148
Query: 65 RSLSPLEKDEIKLEIDQATLK 85
++ KD + + DQ L+
Sbjct: 149 PKMANANKDALTEDNDQNALQ 169
>gnl|CDD|223029 PHA03264, PHA03264, envelope glycoprotein D; Provisional.
Length = 416
Score = 28.0 bits (62), Expect = 9.7
Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 1/55 (1%)
Query: 336 PGGPPSGPGGPSSGMMFMGPGGPRGGGNAGPPPFPSAGPGGMGGPGNLGPGGMGP 390
P G P G + G G R G G P P+ G GG GP P
Sbjct: 277 PPGDDRPEAKPEPGPVEDGAPG-RETGGEGEGPEPAGRDGAAGGEPKPGPPRPAP 330
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.315 0.138 0.415
Gapped
Lambda K H
0.267 0.0647 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 22,430,036
Number of extensions: 2305659
Number of successful extensions: 4496
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3340
Number of HSP's successfully gapped: 436
Length of query: 415
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 316
Effective length of database: 6,546,556
Effective search space: 2068711696
Effective search space used: 2068711696
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 60 (27.1 bits)
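
The statistics block above can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the RPS-BLAST output: it assumes the standard Karlin-Altschul conversion from raw score to bits, bits = (Lambda * S - ln K) / ln 2, and assumes that the drop-off values X1-X3 are scaled by Lambda / ln 2 only. The function names are illustrative. All numeric inputs are copied from the report; X1 and S1 use the ungapped Lambda and K, while X2, X3 and S2 use the gapped ones. Per-hit bit scores and E-values in the alignments are not reproduced here, because they may depend on profile-specific scaling.

```python
import math

# Karlin-Altschul parameters copied from the report trailer.
UNGAPPED = {"lam": 0.315, "K": 0.138}
GAPPED = {"lam": 0.267, "K": 0.0647}


def bit_score(raw, lam, K):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(K)) / math.log(2)


def dropoff_bits(raw, lam):
    """Convert a raw X-dropoff value to bits (assumed to use lambda only)."""
    return lam * raw / math.log(2)


def effective_search_space(query_len, db_len, n_seqs, length_adj):
    """Apply the length adjustment to the query and to every database sequence."""
    eff_query = query_len - length_adj
    eff_db = db_len - n_seqs * length_adj
    return eff_query, eff_db, eff_query * eff_db


eff_query, eff_db, space = effective_search_space(415, 10_937_602, 44_354, 99)
print("Effective length of query:   ", eff_query)
print("Effective length of database:", eff_db)
print("Effective search space:      ", space)
print("S1 bits:", round(bit_score(42, **UNGAPPED), 1))
print("S2 bits:", round(bit_score(60, **GAPPED), 1))
print("X1 bits:", round(dropoff_bits(16, UNGAPPED["lam"]), 1))
print("X2 bits:", round(dropoff_bits(38, GAPPED["lam"]), 1))
print("X3 bits:", round(dropoff_bits(64, GAPPED["lam"]), 1))
```

Under these assumptions the computed values agree with the figures printed in the trailer, which suggests the length adjustment of 99 is subtracted once from the query and once per database sequence.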