RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy10643
(769 letters)
>gnl|CDD|221250 pfam11831, Myb_Cef, pre-mRNA splicing factor component. This
family is a region of the Myb-Related Cdc5p/Cef1
proteins, in fungi, and is part of the pre-mRNA splicing
factor complex.
Length = 363
Score = 302 bits (775), Expect = 7e-96
Identities = 155/382 (40%), Positives = 212/382 (55%), Gaps = 38/382 (9%)
Query: 328 DYS-IGTGAAMKTPRTPAPQTDRILQEAQNMMALTHVDTPLKGGLNTPLLAPDFSGVTPS 386
+YS ++TPRTP + D I+ EA+N+ ALT TPL GG NTPL DF GVTP
Sbjct: 2 NYSQTNNNTPIRTPRTP-AEEDAIMNEARNLRALTETQTPLLGGENTPLHETDFDGVTPR 60
Query: 387 KDHLATPNTVLTTPFSQRSVHDGGPGSTPGGFSTPGVRDSVRGGATP--TPIRDRLNINP 444
K + TPN + T PF G G+TP G P TP RD+L+IN
Sbjct: 61 KQQIQTPNPLATPPFRS----GNGIGATPLRGG---------SGYGPLRTPNRDKLSIND 107
Query: 445 EDNMLLEAGDTPAAFKSFQTE---QLRAGLSSLPLPKNDYEIVVPENEEMEEKASGDVDM 501
E M E G+TP K + E L++GL+SLP PKN++E+ +PE EE E + + +
Sbjct: 108 EAAM--EVGETPREEKLREDEAKLSLKSGLASLPKPKNEFELELPEEEEEEPEEMEEE-L 164
Query: 502 LEDQADVDAAAIARMKAQREHEMRLRSQVIQKNLPRPFDINIVLRPSNSDPPLSELQKAE 561
ED AD DA A +A+ + E+R RSQVIQ+NLPRP +++++ + + PL+EL AE
Sbjct: 165 EEDAADRDARKRAAEEAKEQEELRRRSQVIQRNLPRPSVLDLIVLRPSVNVPLTELDPAE 224
Query: 562 ELIKQEMITMLHYDALETPLSVDKKAAKQSNILTDEEHYNFLKHRPYRNFSLEELEAADD 621
+LI +EM ++ +DAL+ PL K K PY +F EELE A
Sbjct: 225 KLINKEMALLIAHDALKYPLPGGKPK---------------GKAVPYEDFDDEELEEARK 269
Query: 622 LLKREMDLVKTGMGHGDLSLESFTQVWEECLSQVLFLANQNRYTRASLASKKDRADSLAK 681
L++ E++ +K MGH + SLE F + W E QVLFL N YT AS++DR ++
Sbjct: 270 LIEAELEKLKGEMGHQEESLEEFDEAWSELNEQVLFLPGLNAYTDIEDASEEDRIEAYKA 329
Query: 682 RLEQNRKHMSLEAKKATKMENK 703
LE RK M EA+KA K+E K
Sbjct: 330 ALENVRKKMEKEAEKANKLEKK 351
>gnl|CDD|227476 COG5147, REB1, Myb superfamily proteins, including transcription
factors and mRNA splicing factors [Transcription / RNA
processing and modification / Cell division and
chromosome partitioning].
Length = 512
Score = 163 bits (414), Expect = 4e-43
Identities = 91/289 (31%), Positives = 128/289 (44%), Gaps = 23/289 (7%)
Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWLDPSIKKTEWSREEDEKLLHLA 76
DE LKA V K G N WS++ASLL + KQ RW L+P +KK WS EEDE+L+ L
Sbjct: 28 DEDLKALVKKLGPNNWSKVASLLISSTGKQSSNRWNNHLNPQLKKKNWSEEEDEQLIDLD 87
Query: 77 KLMPTQWRTIAPII-GRTAAQCLERYEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPET 135
K + TQW TIA RTA QC+ERY L+ + R+ + +IDP E
Sbjct: 88 KELGTQWSTIADYKDRRTAQQCVERYVNTLEDLSSTH----DSKLQRRNEFDKIDPFNEN 143
Query: 136 KPARPDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELRAA 195
RPD + + E E+ EA RL + KA K REK E + LQ+ +EL++A
Sbjct: 144 SARRPDIYEDELLEREVNREASYRLRVPRVSKADVKPREKGEENNPDIEDLQEMKELKSA 203
Query: 196 GIEVAPRQKKKRGIDYNAEIPFEKRPAPGFYDTSKEER---LRQQHLDGE-LRSEKEERE 251
I K I+ K G ++E ++ L + +
Sbjct: 204 SITRHLILPSKSEIN--------KAFKKGETLALEQEINEYKEKKGLSRKQFCERIWSTD 255
Query: 252 RKKDK------QKLKQRKENDIPTAMLQNLEPEKKRSKLVLPEPQISDM 294
R +DK +KL R + I + + ++R K E Q
Sbjct: 256 RDEDKFWPNIYKKLPYRDKKSIYKHLRRKYNIFEQRGKWTKEEEQELAK 304
>gnl|CDD|212557 cd11659, SANT_CDC5_II, SANT/myb-like DNA-binding domain of Cell
Division Cycle 5-Like Protein repeat II. In humans,
cell division cycle 5-like protein (CDC5) functions in
pre-mRNA splicing in cell cycle control. The
DNA-binding, myb-like domain of CDC5 is a member of the
SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR
and TFIIIB', several factors that share this domain. The
SANT domain resembles the 3 alpha-helix bundle of
DNA-binding Myb domains and is found in a diverse set of
proteins.
Length = 53
Score = 112 bits (281), Expect = 5e-30
Identities = 46/53 (86%), Positives = 50/53 (94%)
Query: 57 PSIKKTEWSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLERYEFLLDQAQ 109
PSIKKTEW+REEDEKLLHLAKL+PTQWRTIAPI+GRTA QCLERY LLD+AQ
Sbjct: 1 PSIKKTEWTREEDEKLLHLAKLLPTQWRTIAPIVGRTAQQCLERYNKLLDEAQ 53
>gnl|CDD|215818 pfam00249, Myb_DNA-binding, Myb-like DNA-binding domain. This
family contains the DNA binding domains from Myb
proteins, as well as the SANT domain family.
Length = 47
Score = 56.0 bits (136), Expect = 2e-10
Identities = 18/39 (46%), Positives = 24/39 (61%)
Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWL 55
DE+L AV K+G WS+IA L ++ QCK RW +L
Sbjct: 9 DELLIEAVKKHGNGNWSKIAKHLPGRTDNQCKNRWNNYL 47
Score = 41.7 bits (99), Expect = 3e-05
Identities = 16/43 (37%), Positives = 21/43 (48%), Gaps = 2/43 (4%)
Query: 61 KTEWSREEDEKLLHLAKLMPT-QWRTIAPII-GRTAAQCLERY 101
+ W+ EEDE L+ K W IA + GRT QC R+
Sbjct: 1 RGPWTPEEDELLIEAVKKHGNGNWSKIAKHLPGRTDNQCKNRW 43
>gnl|CDD|206092 pfam13921, Myb_DNA-bind_6, Myb-like DNA-binding domain. This
family contains the DNA binding domains from Myb
proteins, as well as the SANT domain family.
Length = 59
Score = 56.6 bits (137), Expect = 2e-10
Identities = 23/57 (40%), Positives = 32/57 (56%), Gaps = 2/57 (3%)
Query: 16 QDEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWLDPSIKKTEWSREEDEKL 72
+DE L V KYG N W +IA L R + C+ RW L P + W++EED++L
Sbjct: 5 EDEKLLKLVEKYG-NDWKQIAEELGR-TPSACRDRWRRKLRPKRSRGPWTKEEDQRL 59
Score = 42.3 bits (100), Expect = 2e-05
Identities = 17/39 (43%), Positives = 24/39 (61%)
Query: 64 WSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLERYE 102
W+ EEDEKLL L + W+ IA +GRT + C +R+
Sbjct: 1 WTEEEDEKLLKLVEKYGNDWKQIAEELGRTPSACRDRWR 39
>gnl|CDD|197842 smart00717, SANT, SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding
domains.
Length = 49
Score = 52.2 bits (126), Expect = 5e-09
Identities = 20/41 (48%), Positives = 26/41 (63%)
Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWLDP 57
DE+L V KYGKN W +IA L ++A+QC+ RW L P
Sbjct: 9 DELLIELVKKYGKNNWEKIAKELPGRTAEQCRERWRNLLKP 49
Score = 44.1 bits (105), Expect = 4e-06
Identities = 22/47 (46%), Positives = 26/47 (55%), Gaps = 2/47 (4%)
Query: 61 KTEWSREEDEKLLHLAKLMPT-QWRTIAPIIG-RTAAQCLERYEFLL 105
K EW+ EEDE L+ L K W IA + RTA QC ER+ LL
Sbjct: 1 KGEWTEEEDELLIELVKKYGKNNWEKIAKELPGRTAEQCRERWRNLL 47
>gnl|CDD|238096 cd00167, SANT, 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding
domains. Tandem copies of the domain bind telomeric DNA
tandem repeats as part of the capping complex. Binding
is sequence dependent for repeats which contain the G/C
rich motif [C2-3 A (CA)1-6]. The domain is also found
in regulatory transcriptional repressor complexes where
it also binds DNA.
Length = 45
Score = 51.8 bits (125), Expect = 6e-09
Identities = 20/39 (51%), Positives = 25/39 (64%)
Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWL 55
DE+L AV KYGKN W +IA L ++ KQC+ RW L
Sbjct: 7 DELLLEAVKKYGKNNWEKIAKELPGRTPKQCRERWRNLL 45
Score = 41.4 bits (98), Expect = 3e-05
Identities = 19/45 (42%), Positives = 22/45 (48%), Gaps = 2/45 (4%)
Query: 63 EWSREEDEKLLHL-AKLMPTQWRTIAPIIG-RTAAQCLERYEFLL 105
W+ EEDE LL K W IA + RT QC ER+ LL
Sbjct: 1 PWTEEEDELLLEAVKKYGKNNWEKIAKELPGRTPKQCRERWRNLL 45
>gnl|CDD|178751 PLN03212, PLN03212, Transcription repressor MYB5; Provisional.
Length = 249
Score = 52.4 bits (125), Expect = 2e-07
Identities = 42/137 (30%), Positives = 69/137 (50%), Gaps = 13/137 (9%)
Query: 5 MILKRWVIFVFQDEILKAAVMKYGKNQWSRI---ASLLHRKSAKQCKARWFEWLDPSIKK 61
M +KR V +DEIL + + K G+ +W + A LL + K C+ RW +L PS+K+
Sbjct: 21 MGMKRGPWTVEEDEILVSFIKKEGEGRWRSLPKRAGLL--RCGKSCRLRWMNYLRPSVKR 78
Query: 62 TEWSREEDEKLLHLAKLMPTQWRTIA-PIIGRTAAQCLERYEFLLDQAQKKEEGEDVADD 120
+ +E++ +L L +L+ +W IA I GRT + + L + ++ D
Sbjct: 79 GGITSDEEDLILRLHRLLGNRWSLIAGRIPGRTDNEIKNYWNTHLRKKLLRQ-----GID 133
Query: 121 PRKLKPGEIDPNPETKP 137
P+ KP +D N KP
Sbjct: 134 PQTHKP--LDANNIHKP 148
>gnl|CDD|215570 PLN03091, PLN03091, hypothetical protein; Provisional.
Length = 459
Score = 44.2 bits (104), Expect = 2e-04
Identities = 26/81 (32%), Positives = 43/81 (53%), Gaps = 4/81 (4%)
Query: 16 QDEILKAAVMKYGKNQWSRIASL--LHRKSAKQCKARWFEWLDPSIKKTEWSREEDEKLL 73
+DE L + KYG WS + L R K C+ RW +L P +K+ +S++E+ ++
Sbjct: 21 EDEKLLRHITKYGHGCWSSVPKQAGLQR-CGKSCRLRWINYLRPDLKRGTFSQQEENLII 79
Query: 74 HLAKLMPTQWRTIAP-IIGRT 93
L ++ +W IA + GRT
Sbjct: 80 ELHAVLGNRWSQIAAQLPGRT 100
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 44.4 bits (104), Expect = 3e-04
Identities = 45/186 (24%), Positives = 76/186 (40%), Gaps = 20/186 (10%)
Query: 106 DQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
D+A+K E + AD+ +K + E E K A K + + E +A +
Sbjct: 1500 DEAKKAAEAKKKADEAKKAE--EAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEEL 1557
Query: 166 KKA--KRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAP 223
KKA K+KA E + E + AL+K E + A ++ R + EK+
Sbjct: 1558 KKAEEKKKAEEAKKAEEDKNMALRKAEEAKKA--------EEARIEEVMKLYEEEKKMKA 1609
Query: 224 GFYDTSKEERLRQQHL--------DGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNL 275
++E +++ + L E +KE E+KK ++ K +EN I A
Sbjct: 1610 EEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKK 1669
Query: 276 EPEKKR 281
E K+
Sbjct: 1670 AEEDKK 1675
Score = 36.7 bits (84), Expect = 0.053
Identities = 34/204 (16%), Positives = 83/204 (40%), Gaps = 10/204 (4%)
Query: 106 DQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
++A+K +E + A++ +K E E + D + + EA+ +
Sbjct: 1467 EEAKKADEAKKKAEEAKKAD--EAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKA 1524
Query: 166 KKAKRKAREKQLEEARRLAALQKRRELRAA-----GIEVAPRQKKKRGIDYNAEIPFEKR 220
+AK+ K+ +EA++ +K EL+ A E ++ K+ + +
Sbjct: 1525 DEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAE 1584
Query: 221 PAPGFYDTSKEERLRQQHLDGELRSE---KEERERKKDKQKLKQRKENDIPTAMLQNLEP 277
A + EE ++ + ++++E K E + K ++ K +E + +
Sbjct: 1585 EAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAE 1644
Query: 278 EKKRSKLVLPEPQISDMELEQVVK 301
EKK+++ + + + ++ + K
Sbjct: 1645 EKKKAEELKKAEEENKIKAAEEAK 1668
Score = 34.3 bits (78), Expect = 0.34
Identities = 54/278 (19%), Positives = 108/278 (38%), Gaps = 22/278 (7%)
Query: 41 RKSAKQCKARWFEWLDPSIKKTEWSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLER 100
+K+ + KA + D + KK E +++ DE AK + + A + A + +
Sbjct: 1290 KKADEAKKAEEKKKADEAKKKAEEAKKADE-----AKKKAEEAKKKADAAKKKAEEAKKA 1344
Query: 101 YEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDE----DELEMLSEA 156
E +A+ + + A+ K + E K A K +E DE + +E
Sbjct: 1345 AEAAKAEAEAAADEAEAAE--EKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEE 1402
Query: 157 RARLANTQGKKA--KRKARE--KQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYN 212
+ A+ K A K+KA E K+ EE ++ +K+ E E + ++ + +
Sbjct: 1403 DKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEA 1462
Query: 213 AEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAML 272
+ E + A ++E + + ++K+ E KK + K+ E
Sbjct: 1463 KKKAEEAKKADEAKKKAEEAKKADE-------AKKKAEEAKKKADEAKKAAEAKKKADEA 1515
Query: 273 QNLEPEKKRSKLVLPEPQISDMELEQVVKLGRATEVAR 310
+ E KK + E E ++ + +A E+ +
Sbjct: 1516 KKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKK 1553
Score = 34.0 bits (77), Expect = 0.46
Identities = 41/186 (22%), Positives = 73/186 (39%), Gaps = 22/186 (11%)
Query: 109 QKKEEGEDVADDPRK---LKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
+ K+ ED RK K E E + K M +E + EA+ + +
Sbjct: 1568 EAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKK 1627
Query: 166 KKAKRKARE----KQLEEARRLAALQKRRE---LRAAGIEVAPRQKKKRGIDYNAEIPFE 218
+ ++K E K+ EE ++ L+K E ++AA + KK+ + E
Sbjct: 1628 AEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDE 1687
Query: 219 KRPAPGFYDTSKEERLRQQHLDG---ELRSEKEERERKKDKQKLKQRKENDIPTAMLQNL 275
K+ E L+++ + E +KE E+KK ++ K +EN I +
Sbjct: 1688 KK---------AAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKE 1738
Query: 276 EPEKKR 281
E K+
Sbjct: 1739 AEEDKK 1744
Score = 33.6 bits (76), Expect = 0.53
Identities = 49/209 (23%), Positives = 87/209 (41%), Gaps = 19/209 (9%)
Query: 106 DQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
D+A+KK E AD+ +K K E E K + K DE + + +A A +
Sbjct: 1434 DEAKKKAEEAKKADEAKK-KAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKA 1492
Query: 166 KKAKRKARE--KQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAP 223
++AK+KA E K E ++ +K E + A + KK AE EK+ A
Sbjct: 1493 EEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAE---EKKKAD 1549
Query: 224 GFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSK 283
+ K E L++ EE+++ ++ +K ++ K + A E + +
Sbjct: 1550 ---ELKKAEELKK----------AEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEE 1596
Query: 284 LVLPEPQISDMELEQVVKLGRATEVAREV 312
++ + M+ E+ K A A E+
Sbjct: 1597 VMKLYEEEKKMKAEEAKKAEEAKIKAEEL 1625
Score = 32.8 bits (74), Expect = 0.90
Identities = 46/267 (17%), Positives = 105/267 (39%), Gaps = 17/267 (6%)
Query: 41 RKSAKQCKARWFEWLDPSIKKTEWSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLER 100
+K+ + KA + + K E + E++K + L K A + +E
Sbjct: 1546 KKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRK---------AEEAKKAEEARIEE 1596
Query: 101 YEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARL 160
L ++ +K + E + K+K E+ E K K + +E + E +
Sbjct: 1597 VMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAE 1656
Query: 161 ANTQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPR--QKKKRGIDYNAEIPFE 218
+ K A+ + + E+ ++ +K E E + ++ K+ + + E
Sbjct: 1657 EENKIKAAEEAKKAE--EDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEE 1714
Query: 219 KRPAPGFYDTSKEERLRQQHLDGELRSEK----EERERKKDKQKLKQRKENDIPTAMLQN 274
K+ A +E +++ + E +K E ++ +++K+K+ K+ + A
Sbjct: 1715 KKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIR 1774
Query: 275 LEPEKKRSKLVLPEPQISDMELEQVVK 301
E E + + E + ME+++ +K
Sbjct: 1775 KEKEAVIEEELDEEDEKRRMEVDKKIK 1801
Score = 30.5 bits (68), Expect = 4.4
Identities = 33/214 (15%), Positives = 91/214 (42%), Gaps = 9/214 (4%)
Query: 105 LDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPA---RPDPKDMDEDELEMLSEARARLA 161
+A++ ++ ++ K K E+ E K A + + +E + ++ +A A
Sbjct: 1527 AKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEA 1586
Query: 162 NTQGKKAKRKAREKQLEEARRLAALQKRRE----LRAAGIEVAPRQKKKRGIDYNAEIPF 217
+ ++A+ + K EE +++ A + ++ ++A ++ A +KKK E
Sbjct: 1587 K-KAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAE- 1644
Query: 218 EKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEP 277
EK+ A +E +++ + +K++ E K ++ +++ + + +
Sbjct: 1645 EKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKA 1704
Query: 278 EKKRSKLVLPEPQISDMELEQVVKLGRATEVARE 311
E+ + K + + +++ + +A E +E
Sbjct: 1705 EELKKKEAEEKKKAEELKKAEEENKIKAEEAKKE 1738
>gnl|CDD|212558 cd11660, SANT_TRF, Telomere repeat binding factor-like
DNA-binding domains of the SANT/myb-like family. Human
telomere repeat binding factors, TRF1 and TRF2,
function as part of the 6 component shelterin complex.
TRF2 binds DNA and recruits RAP1 (via binding to the
RAP1 protein c-terminal (RCT)) and TIN2 in the
protection of telomeres from DNA repair machinery.
Metazoan shelterin consists of 3 DNA binding proteins
(TRF2, TRF1, and POT1) and 3 recruited proteins that
bind to one or more of these DNA-binding proteins
(RAP1, TIN2, TPP1). Schizosaccharomyces pombe TAZ1 is
an ortholog and binds RAP1. Human TRF1 and TRF2 bind
double-stranded DNA. hTRF2 consists of a basic
N-terminus, a TRF homology domain, the RAP1 binding
motif (RBM), the TIN2 binding motif (TBM) and a
myb-like DNA binding domain, SANT, named after 'SWI3,
ADA2, N-CoR and TFIIIB', several factors that share
this domain. Tandem copies of the domain bind telomeric
DNA tandem repeats as part of the capping complex. The
single myb-like domain of TRF-type proteins is similar
to the tandem myb_like domains found in yeast RAP1.
Length = 50
Score = 35.2 bits (82), Expect = 0.005
Identities = 11/38 (28%), Positives = 19/38 (50%), Gaps = 3/38 (7%)
Query: 17 DEILKAAVMKYGKNQWSRI---ASLLHRKSAKQCKARW 51
DE L V KYG W++I ++ +++ K +W
Sbjct: 8 DEALVEGVEKYGVGNWAKILKDYFFVNNRTSVDLKDKW 45
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 39.1 bits (91), Expect = 0.009
Identities = 39/215 (18%), Positives = 67/215 (31%), Gaps = 18/215 (8%)
Query: 108 AQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQGKK 167
A++ + ++ K + E + KP ++E + K
Sbjct: 95 AKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRP----------PK 144
Query: 168 AKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAPGFYD 227
K K +EK++EE R +KR +RA P +KK ++R A
Sbjct: 145 EKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQAAREAV 204
Query: 228 TSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSKLVLP 287
K E +E+ E+E K + + + + + S L P
Sbjct: 205 KGKPEE--------PDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSSSSLKKP 256
Query: 288 EPQISDMELEQVVKLGRATEVAREVAIESGSGPTS 322
+P S E R R + P S
Sbjct: 257 DPSPSMASPETRESSKRTETRPRTSLRPPSARPAS 291
>gnl|CDD|222688 pfam14334, DUF4390, Domain of unknown function (DUF4390). This
family of proteins is functionally uncharacterized. This
family of proteins is found in bacteria and eukaryotes.
Proteins in this family are typically between 192 and
203 amino acids in length.
Length = 165
Score = 34.5 bits (80), Expect = 0.087
Identities = 17/91 (18%), Positives = 34/91 (37%), Gaps = 3/91 (3%)
Query: 460 KSFQTEQLRAGLSSLPLPKNDYEIVVPENEEMEEKASGD--VDMLEDQADVDAAAIARMK 517
+ + LS PL Y + + + A+ D + L + A ++
Sbjct: 62 EKVASATRTYRLSYDPL-TRRYRVTDGGSGLSQSFATLDEALRALGRIRNWPVADAGDLE 120
Query: 518 AQREHEMRLRSQVIQKNLPRPFDINIVLRPS 548
++ +RLR ++ LP+P IN +
Sbjct: 121 PGEDYRVRLRVRLDTSQLPKPLQINALFSSD 151
>gnl|CDD|212556 cd11658, SANT_DMAP1_like, SANT/myb-like domain of Human Dna
Methyltransferase 1 Associated Protein 1-like. These
proteins are members of the SANT/myb group. SANT is
named after 'SWI3, ADA2, N-CoR and TFIIIB', several
factors that share this domain. The SANT domain
resembles the 3 alpha-helix bundle of the DNA-binding
Myb domains and is found in a diverse set of proteins.
Length = 46
Score = 29.7 bits (67), Expect = 0.41
Identities = 12/42 (28%), Positives = 18/42 (42%), Gaps = 4/42 (9%)
Query: 64 WSREEDEKLLHLAKLMPTQWRTIA----PIIGRTAAQCLERY 101
W++EE + L L K +W I GR+ E+Y
Sbjct: 1 WTKEETDYLFDLVKRFDLRWNVILDRYPFQKGRSVEDLKEKY 42
>gnl|CDD|130283 TIGR01216, ATP_synt_epsi, ATP synthase, F1 epsilon subunit (delta
in mitochondria). This model describes one of the five
types of subunits in the F1 part of F1/F0 ATP synthases.
Members of this family are designated epsilon in
bacterial and chloroplast systems but designated delta
in mitochondria, where the counterpart of the bacterial
delta subunit is designated OSCP. In a few cases
(Propionigenium modestum, Acetobacterium woodii) scoring
above the trusted cutoff and designated here as
exceptions, Na+ replaces H+ for translocation [Energy
metabolism, ATP-proton motive force interconversion].
Length = 130
Score = 31.8 bits (73), Expect = 0.49
Identities = 15/47 (31%), Positives = 25/47 (53%), Gaps = 3/47 (6%)
Query: 143 KDMDEDEL-EMLSEARARLANTQGKKAKRKAREKQLEEAR-RLAALQ 187
D+DE E + L A L + + K +A +L++AR +L AL+
Sbjct: 85 DDIDEAEAEKALEAAEKLLESAEDDKDLAEA-LLKLKKARAQLEALE 130
>gnl|CDD|213402 cd12203, GT1, GT1, myb-like, SANT family. GT-1, a myb-like
protein, is one of the GT trihelix transcription
factors. GT-1 binds the GT cis-element of rbcS-3A, a
light-induced gene, as a dimer. Arabidopsis GT-1 is a
trans-activator and acts in the stabilization of
components of the transcription pre-initiation complex
comprised of TFIIA-TBP-TATA. The isolated GT-1
DNA-binding domain is sufficient to bind DNA. This
region closely resembles the myb domain, but with longer
helices. It has been proposed that GT-1 may respond to
light signals via calcium-dependent phosphorylation to
create a light-modulated molecular switch. These
proteins are members of the SANT/myb group. SANT is
named after 'SWI3, ADA2, N-CoR and TFIIIB', several
factors that share this domain. The SANT domain
resembles the 3 alpha-helix bundle of the DNA-binding
Myb domains and is found in a diverse set of proteins.
Length = 66
Score = 30.3 bits (69), Expect = 0.52
Identities = 15/39 (38%), Positives = 20/39 (51%), Gaps = 5/39 (12%)
Query: 29 KNQWSRIASLLHRK----SAKQCKARWFEWLDPSIKKTE 63
K W IA+ + SAKQCK +W E L+ KK +
Sbjct: 29 KALWEEIAAKMRELGYNRSAKQCKEKW-ENLNKYYKKVK 66
>gnl|CDD|148065 pfam06234, TmoB, Toluene-4-monooxygenase system protein B (TmoB).
This family consists of several Toluene-4-monooxygenase
system protein B (TmoB) sequences. Pseudomonas mendocina
KR1 metabolises toluene as a carbon source. The initial
step of the pathway is hydroxylation of toluene to form
p-cresol by a multicomponent toluene-4-monooxygenase
(T4MO) system. TmoB adopts a ubiquitin fold. Although
TmoB is a component of the T4MO system, its precise role
remains unclear.
Length = 85
Score = 30.4 bits (69), Expect = 0.71
Identities = 14/45 (31%), Positives = 19/45 (42%), Gaps = 7/45 (15%)
Query: 499 VDMLEDQADVDAAAIA------RMKAQREHEMRLRSQVIQKNLPR 537
VD ED D A A R+ + H +R+R Q + PR
Sbjct: 21 VD-TEDTMDQVAEKAAHHSVGRRVPPRPGHVVRVRKQGSTELFPR 64
>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19.
Med19 represents a family of conserved proteins which
are members of the multi-protein co-activator Mediator
complex. Mediator is required for activation of RNA
polymerase II transcription by DNA binding
transactivators.
Length = 178
Score = 31.7 bits (72), Expect = 0.86
Identities = 15/69 (21%), Positives = 30/69 (43%)
Query: 204 KKKRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRK 263
KKK + + P D+ + ++H + +KE +++KK+K+K K+R
Sbjct: 110 KKKHKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRH 169
Query: 264 ENDIPTAML 272
+ P
Sbjct: 170 SPEHPGVGF 178
>gnl|CDD|193581 cd09892, NGN_SP_RfaH, N-Utilization Substance G (NusG) N-terminal
domain in the NusG Specialized Paralog (SP), RfaH. RfaH
is an operon-specific virulence regulator, thought to
have arisen from an early duplication of N-Utilization
Substance G (NusG). Paralogs of eubacterial NusG, NusG
SP (Specialized Paralog of NusG), are more diverse and
often found as the first ORF in operons encoding
secreted proteins and LPS biosynthesis genes. NusG SP
family members are operon-specific transcriptional
antitermination factors. NusG is essential in
Escherichia coli and is associated with RNA polymerase
elongation and Rho-termination in bacteria. In contrast,
RfaH is a non-essential protein that controls expression
of operons containing an ops (operon polarity
suppressor) element in their transcribed DNA. RfaH and
NusG are different in their response to Rho-dependent
terminators and regulatory targets. The NusG N-terminal
(NGN) domain is quite similar in all NusG orthologs, but
its C-terminal domains and the linker that separate
these two domains are different. The domain organization
of NusG and its homologs suggest that the common
properties of NusG and RfaH are due to their similar NGN
domains.
Length = 96
Score = 30.2 bits (69), Expect = 0.86
Identities = 11/22 (50%), Positives = 12/22 (54%)
Query: 419 STPGVRDSVRGGATPTPIRDRL 440
ST GV VR G P P+ D L
Sbjct: 69 STRGVSRLVRFGGEPAPVPDAL 90
>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
Length = 330
Score = 32.1 bits (74), Expect = 1.1
Identities = 11/39 (28%), Positives = 19/39 (48%)
Query: 231 EERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPT 269
+E L+ LD + + ++KK+KQ+ K K P
Sbjct: 53 KEELKAALLDKKELKAWHKAQKKKEKQEAKAAKAKSKPR 91
>gnl|CDD|234796 PRK00571, atpC, F0F1 ATP synthase subunit epsilon; Validated.
Length = 135
Score = 30.5 bits (70), Expect = 1.5
Identities = 14/50 (28%), Positives = 18/50 (36%), Gaps = 3/50 (6%)
Query: 143 KDMDEDELE-MLSEARARLANTQGKKAKRKAREKQLEEAR-RLAALQKRR 190
D+DE E A L N +A+ L A RL +K R
Sbjct: 87 DDIDEARAEEAKERAEEALENKHDDVDYARAQAA-LARAIARLRVAEKLR 135
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 31.5 bits (72), Expect = 1.5
Identities = 14/53 (26%), Positives = 27/53 (50%), Gaps = 7/53 (13%)
Query: 143 KDMDEDELEMLSEARARLANTQGKKAKRKAREKQL--EEARRLAALQKRRELR 193
K +E+ E E + KK +R+A+ +L EE R+L +++++ R
Sbjct: 274 KAAEEERQEEAQEKKEEK-----KKEEREAKLAKLSPEEQRKLEEKERKKQAR 321
>gnl|CDD|221313 pfam11917, DUF3435, Protein of unknown function (DUF3435). This
family of proteins is functionally uncharacterized.
This protein is found in eukaryotes. Proteins in this
family are typically between 435 and 791 amino acids in
length. This family is related to pfam00589 suggesting
it may be an integrase enzyme.
Length = 418
Score = 31.6 bits (72), Expect = 1.7
Identities = 26/118 (22%), Positives = 49/118 (41%), Gaps = 4/118 (3%)
Query: 502 LEDQADVDAAAIARMKAQREHEMRLRSQVIQK-NLPRPFDINIVLRPS-NSDPPLSELQK 559
L + D D A+ +E +R +++ + + RP D+ + S DP L EL +
Sbjct: 236 LPRRVDRDVQAVVLGLPPQEALIRAATRMSRTRDPRRPRDLTDEQKASVEEDPELQELIR 295
Query: 560 AEELIKQEMITMLH--YDALETPLSVDKKAAKQSNILTDEEHYNFLKHRPYRNFSLEE 615
+ +K+E+I + A TPL + ++ + LK + F E+
Sbjct: 296 KRDHLKKEIIALYGQVAKAKGTPLYERLEKRRREVRNERQRLRRELKKKIREEFDEEQ 353
>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain. This is a family of proteins of
approximately 300 residues, found in plants and
vertebrates. They contain a highly conserved DDRGK
motif.
Length = 189
Score = 30.8 bits (70), Expect = 1.7
Identities = 21/86 (24%), Positives = 36/86 (41%), Gaps = 11/86 (12%)
Query: 179 EARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQH 238
+ +L Q RR+ R A E ++KK E E+ E R++
Sbjct: 7 KRAKLEEKQARRQQREA-EEEEREERKKLEEKREGERKEEEE----------LEEEREKK 55
Query: 239 LDGELRSEKEERERKKDKQKLKQRKE 264
+ E R E+EE+ RK+ ++ K +
Sbjct: 56 KEEEERKEREEQARKEQEEYEKLKSS 81
>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon.
Length = 431
Score = 31.2 bits (70), Expect = 2.1
Identities = 42/172 (24%), Positives = 67/172 (38%), Gaps = 17/172 (9%)
Query: 99 ERYEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEM------ 152
ER E + K E ++ D + + E +P PE + + + M
Sbjct: 122 EREEVEETEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKH 181
Query: 153 ----LSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRG 208
S A A + K K ++KQ E A L L+K+RE R +E +++K+
Sbjct: 182 TENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEE 241
Query: 209 IDYNAEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLK 260
D + EKR KEE R++ E R + E +DK+ K
Sbjct: 242 ADRKSREEEEKR-------RLKEEIERRRAEAAEKRQKVPEDGLSEDKKPFK 286
>gnl|CDD|150406 pfam09727, CortBP2, Cortactin-binding protein-2. This entry is the
first approximately 250 residues of cortactin-binding
protein 2. In addition to being a positional candidate
for autism this protein is expressed at highest levels
in the brain in humans. The human protein has six
associated ankyrin repeat domains pfam00023 towards the
C-terminus which act as protein-protein interaction
domains.
Length = 193
Score = 30.6 bits (69), Expect = 2.2
Identities = 17/107 (15%), Positives = 44/107 (41%), Gaps = 6/107 (5%)
Query: 163 TQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGI-----DYNAEIPF 217
+ EK + E ++ A QK + R +A +++++ + + I +
Sbjct: 70 GAEDPEQEDIYEKPMSELDKVMAKQKETQRRMLAQLLAAEKRQRKTVLELEEEKRKHIRY 129
Query: 218 EKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKE 264
K+ +E ++ L+ E +S++ ++E++ K +E
Sbjct: 130 MKKSDDFTNLLEQERERLKKLLEQE-KSQQAKKEQEHRKLLATLEEE 175
>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal. This
domain is found to the N-terminus of bacterial signal
peptidases of the S49 family (pfam01343).
Length = 154
Score = 30.2 bits (69), Expect = 2.3
Identities = 14/50 (28%), Positives = 19/50 (38%), Gaps = 7/50 (14%)
Query: 232 ERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKR 281
E L LD + E+ E+K +K K K K+ E K R
Sbjct: 56 ESLEAALLDKKELKAWEKAEKKAEKAKAKAEKKK-------AKKEEPKPR 98
>gnl|CDD|233254 TIGR01059, gyrB, DNA gyrase, B subunit. This model describes the
common type II DNA topoisomerase (DNA gyrase). Two
apparently independently arising families, one in the
Proteobacteria and one in Gram-positive lineages, are
both designated topoisomerase IV. Proteins scoring
above the noise cutoff for this model and below the
trusted cutoff for topoisomerase IV models probably
should be designated GyrB [DNA metabolism, DNA
replication, recombination, and repair].
Length = 654
Score = 31.2 bits (71), Expect = 2.4
Identities = 19/68 (27%), Positives = 29/68 (42%), Gaps = 3/68 (4%)
Query: 245 SEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSKLV--LPEPQI-SDMELEQVVK 301
SE E +D + +Q E IP L+ + KK V P+P+I E + +
Sbjct: 122 SEWLEVTVFRDGKIYRQEFERGIPLGPLEVVGETKKTGTTVRFWPDPEIFETTEFDFDIL 181
Query: 302 LGRATEVA 309
R E+A
Sbjct: 182 AKRLRELA 189
>gnl|CDD|216991 pfam02357, NusG, Transcription termination factor nusG.
Length = 90
Score = 28.8 bits (65), Expect = 2.4
Identities = 10/21 (47%), Positives = 11/21 (52%)
Query: 419 STPGVRDSVRGGATPTPIRDR 439
STPGV V G P P+ D
Sbjct: 70 STPGVTGFVGFGGKPAPVPDE 90
>gnl|CDD|213874 TIGR03857, F420_MSMEG_2249, probable F420-dependent oxidoreductase,
MSMEG_2249 family. Coenzyme F420 has a limited
phylogenetic distribution, including methanogenic
archaea, Mycobacterium tuberculosis and related species,
Colwellia psychrerythraea 34H, Rhodopseudomonas
palustris HaA2, and others. Partial phylogenetic
profiling identifies protein subfamilies, within the
larger family called luciferase-like monooxygenases
(pfam00296), that appear only in F420-positive genomes
and are likely to be F420-dependent. This model
describes a distinctive subfamily, found only in
F420-biosynthesizing members of the Actinobacteria of
the bacterial luciferase-like monooxygenase (LLM)
superfamily [Unknown function, Enzymes of unknown
specificity].
Length = 329
Score = 30.9 bits (70), Expect = 2.5
Identities = 13/31 (41%), Positives = 20/31 (64%), Gaps = 1/31 (3%)
Query: 70 EKLLHLAKLMPTQWRTIAPIIGRTAAQCLER 100
E+L+ +A L+P +W + IG +AAQC R
Sbjct: 278 EQLVDVADLIPDEWLEASAAIG-SAAQCARR 307
>gnl|CDD|178307 PLN02705, PLN02705, beta-amylase.
Length = 681
Score = 30.7 bits (69), Expect = 3.6
Identities = 17/48 (35%), Positives = 25/48 (52%), Gaps = 5/48 (10%)
Query: 247 KEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSKLVLPEPQISDM 294
K ERE++K++ KL++R I + ML L R P P +DM
Sbjct: 78 KREREKEKERTKLRERHRRAITSRMLAGL-----RQYGNFPLPARADM 120
>gnl|CDD|178635 PLN03086, PLN03086, PRLI-interacting factor K; Provisional.
Length = 567
Score = 30.6 bits (69), Expect = 3.8
Identities = 22/74 (29%), Positives = 41/74 (55%), Gaps = 8/74 (10%)
Query: 239 LDGELRS--EKEERERKKDKQKLK-----QRKENDIPTAMLQNLEPEKKRSKLVLPEPQI 291
+D ELR EK ERE+++ KQ+ K +RK + + +E ++ +L E QI
Sbjct: 1 MDFELRRAREKLEREQRERKQRAKLKLERERKAKEEAAKQREAIEAAQRSRRLDAIEAQI 60
Query: 292 -SDMELEQVVKLGR 304
+D ++++ ++ GR
Sbjct: 61 KADQQMQESLQAGR 74
>gnl|CDD|167649 PRK03963, PRK03963, V-type ATP synthase subunit E; Provisional.
Length = 198
Score = 29.7 bits (67), Expect = 3.9
Identities = 18/58 (31%), Positives = 34/58 (58%), Gaps = 6/58 (10%)
Query: 151 EMLSEARARLANTQGKKAKRKAREKQ---LEEARRLAALQKRRELRAAGIEVAPRQKK 205
+L EA+ + A ++A+++A K L +A+ A L+K+R + A +EV R+K+
Sbjct: 21 YILEEAQ-KEAEKIKEEARKRAESKAEWILRKAKTQAELEKQRIIANAKLEV--RRKR 75
>gnl|CDD|204055 pfam08764, Coagulase, Staphylococcus aureus coagulase.
Staphylococcus aureus secretes a cofactor called
coagulase. Coagulase is an extracellular protein that
forms a complex with human prothrombin, and activates it
without the usual proteolytic cleavages. The resulting
complex directly initiates blood clotting.
Length = 282
Score = 30.1 bits (68), Expect = 4.4
Identities = 16/45 (35%), Positives = 23/45 (51%)
Query: 486 PENEEMEEKASGDVDMLEDQADVDAAAIARMKAQREHEMRLRSQV 530
P +EE EEKA+ +V L + D AA K H LR+++
Sbjct: 146 PYSEEEEEKATDEVYDLVSEIDTLYAAYYGDKQHGTHAKELRAKL 190
>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family. Emg1 and Nop14 are novel
proteins whose interaction is required for the
maturation of the 18S rRNA and for 40S ribosome
production.
Length = 809
Score = 30.4 bits (69), Expect = 4.5
Identities = 17/69 (24%), Positives = 25/69 (36%), Gaps = 5/69 (7%)
Query: 131 PNPETKPARPDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRR 190
P + D+ E+ + RA+ + + K E EEA RL L+ R
Sbjct: 224 PPKPPMTPEEKDDEYDQRVRELTFDRRAQPTD----RTK-TEEELAKEEAERLKKLEAER 278
Query: 191 ELRAAGIEV 199
R G E
Sbjct: 279 LRRMRGEEE 287
>gnl|CDD|236709 PRK10531, PRK10531, acyl-CoA thioesterase; Provisional.
Length = 422
Score = 30.1 bits (68), Expect = 4.6
Identities = 15/54 (27%), Positives = 25/54 (46%), Gaps = 1/54 (1%)
Query: 297 EQVVKLGRATEV-AREVAIESGSGPTSDALLTDYSIGTGAAMKTPRTPAPQTDR 349
+ V +LGRA +V A A +G+ ++ D +I G P A ++R
Sbjct: 338 DGVAQLGRAWDVDAGLSAYVNGANTNGQVVIRDSAINEGFNTAKPWADAVTSNR 391
>gnl|CDD|204335 pfam09905, DUF2132, Uncharacterized conserved protein (DUF2132).
This domain, found in various hypothetical prokaryotic
proteins, has no known function.
Length = 64
Score = 27.1 bits (61), Expect = 5.5
Identities = 11/24 (45%), Positives = 14/24 (58%), Gaps = 7/24 (29%)
Query: 56 DPSIK-------KTEWSREEDEKL 72
+PSIK KT W+RE+ E L
Sbjct: 39 NPSIKSSLKFLRKTPWAREKVENL 62
>gnl|CDD|236973 PRK11767, PRK11767, SpoVR family protein; Provisional.
Length = 498
Score = 29.8 bits (68), Expect = 5.6
Identities = 14/28 (50%), Positives = 18/28 (64%), Gaps = 1/28 (3%)
Query: 730 PRRIASLTEDVNRQKEREAVLQERFGAL 757
P++I SL E+ RQKERE LQ + L
Sbjct: 182 PQKI-SLQEEKARQKEREEYLQSQVNDL 208
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 30.3 bits (68), Expect = 5.6
Identities = 19/73 (26%), Positives = 34/73 (46%), Gaps = 20/73 (27%)
Query: 140 PDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELR-----A 194
+ K + E+ELE + KK + KA+EK E ++L A QK + + A
Sbjct: 8 AEKKILTEEELE------------RKKKKEEKAKEK---ELKKLKAAQKEAKAKLQAQQA 52
Query: 195 AGIEVAPRQKKKR 207
+ P++ +K+
Sbjct: 53 SDGTNVPKKSEKK 65
>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
Length = 1036
Score = 29.8 bits (67), Expect = 6.2
Identities = 28/80 (35%), Positives = 37/80 (46%), Gaps = 9/80 (11%)
Query: 144 DMDEDELE--MLSEARARLANTQGKKAKRKA-REKQLEEARRLAALQKRREL-RA-AGIE 198
MDE E +L E R L K AK +A RE+Q EE RR + E RA A E
Sbjct: 240 GMDEHSFEDFLLEEKRRELE----KLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAE 295
Query: 199 VAPRQKKKRGIDYNAEIPFE 218
V R++K + + A +
Sbjct: 296 VEKRREKLQNLLKKASRSAD 315
>gnl|CDD|222361 pfam13751, DDE_Tnp_1_6, Transposase DDE domain. Transposase
proteins are necessary for efficient DNA transposition.
This domain is a member of the DDE superfamily, which
contains three carboxylate residues that are believed to
be responsible for coordinating metal ions needed for
catalysis.
Length = 125
Score = 28.5 bits (64), Expect = 6.5
Identities = 13/51 (25%), Positives = 24/51 (47%), Gaps = 6/51 (11%)
Query: 162 NTQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQ-KKKRGIDY 211
+ ++A+RKARE+ E + + R+ G+E Q K+ G+
Sbjct: 54 RPELEEARRKARERLKSEEGK-----ALYKKRSIGVEGVFGQIKRNLGLRR 99
>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
recombination, and repair].
Length = 908
Score = 29.7 bits (67), Expect = 7.1
Identities = 32/156 (20%), Positives = 69/156 (44%), Gaps = 9/156 (5%)
Query: 147 EDELEMLSEARARLANTQGKKAKRKAREKQLEEARR-LAALQKRRELRAAGIEVAPRQKK 205
E+++E L E + + + +A ++LEE L +L++R E +E + +
Sbjct: 287 EEKIERLEELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLESELE 346
Query: 206 KRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKEN 265
+ + N + KE R + L+ EL + ER ++ ++ + ++E
Sbjct: 347 ELAEEKNELAKLLEE-------RLKELEERLEELEKELE-KALERLKQLEEAIQELKEEL 398
Query: 266 DIPTAMLQNLEPEKKRSKLVLPEPQISDMELEQVVK 301
+A L+ ++ E + + L E + ELE+ +K
Sbjct: 399 AELSAALEEIQEELEELEKELEELERELEELEEEIK 434
>gnl|CDD|237546 PRK13889, PRK13889, conjugal transfer relaxase TraA; Provisional.
Length = 988
Score = 29.7 bits (67), Expect = 7.1
Identities = 44/219 (20%), Positives = 78/219 (35%), Gaps = 35/219 (15%)
Query: 133 PETKPARPDPKDM-DEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRE 191
P+ + + ++A AR + A R+AR + L R A+
Sbjct: 770 LPDPVPGPEAGRRPERESAAATTDAPARTVAADPEAALRQARTRALV--RHARAVDAIFR 827
Query: 192 LRAAGIEVAPRQKK---KRGIDYNAEIPFEKRPAPGFYDTSKEER-----------LRQQ 237
++ G V P Q K + + P+ A Y + E +R
Sbjct: 828 MQEQGGPVLPHQVKELQEARKAFEEVRPYGSHDAEAAYKKNPELAAEAASGRPARAIRAL 887
Query: 238 HLDGELRSEKE-------ERERKKDKQKLKQRKENDIPTA---------MLQNLEPEKKR 281
L+ ELR++ ER +K D+ +Q + D+ M ++LE + +
Sbjct: 888 QLETELRTDPARRADRFVERWQKLDRASQRQYQAGDMSGYKATRAAMGDMAKSLERDPQL 947
Query: 282 SKLVLPEPQISDMELEQVVKLGRATEVAREVAIESGSGP 320
L+ + + E +LGR E+A I+ G G
Sbjct: 948 ESLLAGRKRELGIGFESGRRLGR--ELAFSHGIDLGRGR 984
>gnl|CDD|238570 cd01164, FruK_PfkB_like, 1-phosphofructokinase (FruK), minor
6-phosphofructokinase (pfkB) and related sugar kinases.
FruK plays an important role in the predominant pathway
for fructose utilisation. This group also contains
tagatose-6-phosphate kinase, an enzyme of the tagatose
6-phosphate pathway, which is responsible for breakdown of
the galactose moiety during lactose metabolism by
bacteria such as L. lactis.
Length = 289
Score = 29.4 bits (67), Expect = 7.5
Identities = 14/47 (29%), Positives = 25/47 (53%), Gaps = 2/47 (4%)
Query: 275 LEPEKKRSKLVLPEPQISDMELEQVVK-LGRATEVAREVAIESGSGP 320
E + +++ P P+IS+ ELE +++ L + V + SGS P
Sbjct: 94 KEEDGTETEINEPGPEISEEELEALLEKLKALLKKGDIVVL-SGSLP 139
>gnl|CDD|198428 cd10030, UDG_F4_TTUDGA_like, Family 4 Uracil-DNA glycosylase (UDG),
found exclusively in thermophilic organisms. The
enzymes of Family 4 Uracil-DNA glycosylase (UDG), found
only in thermophilic organisms, are thermostable
enzymes. Uracil-DNA glycosylases (UDGs) are DNA repair
enzymes that catalyze the removal of mismatched uracil
from DNA to initiate the DNA base excision repair pathway.
The Thermus thermophilus enzyme TTUDGA removes uracil
from both ssDNA and dsDNA, but not thymine from a G:T
mismatch. These details suggest that the mechanism by
which Family 4 UDGs remove uracils from DNA is similar
to that of Family 1 enzymes. The thermostability of the
enzyme may be linked to the presence of an iron-sulfur
cluster, salt-bridges and ion pairs on the molecular
surface as well as prolines on loops and turns, as
commonly found in the Family 4 enzymes. Uracil in DNA
can arise as a result of mis-incorporation of dUMP
residues by DNA polymerase or deamination of cytosine.
Uracil mispaired with guanine in DNA is one of the major
pro-mutagenic events, causing G:C->A:T mutations.
Length = 164
Score = 28.6 bits (65), Expect = 7.6
Identities = 11/33 (33%), Positives = 20/33 (60%), Gaps = 2/33 (6%)
Query: 601 NFLKHRPY--RNFSLEELEAADDLLKREMDLVK 631
N +K RP R + EE+ A L+R+++L++
Sbjct: 68 NVVKCRPPGNRTPTPEEIAACRPFLERQIELIR 100
>gnl|CDD|222409 pfam13837, Myb_DNA-bind_4, Myb/SANT-like DNA-binding domain.
This presumed domain appears to be related to other
Myb/SANT-like DNA binding domains. In particular
pfam10545 seems most related. This family is greatly
expanded in plants and appears in several proteins
annotated as transposon proteins.
Length = 84
Score = 27.2 bits (61), Expect = 7.7
Identities = 10/28 (35%), Positives = 15/28 (53%), Gaps = 4/28 (14%)
Query: 28 GKNQWSRIASLLHR----KSAKQCKARW 51
K+ W IA + +SA+QCK +W
Sbjct: 31 NKHVWEEIAEKMAERGYNRSAEQCKEKW 58
>gnl|CDD|215901 pfam00401, ATP-synt_DE, ATP synthase, Delta/Epsilon chain, long
alpha-helix domain. Part of the ATP synthase CF(1).
These subunits are part of the head unit of the ATP
synthase. This subunit is called epsilon in bacteria and
delta in mitochondria. In bacteria the delta (D) subunit
is equivalent to the mitochondrial Oligomycin sensitive
subunit, OSCP (pfam00213).
Length = 48
Score = 26.3 bits (59), Expect = 8.0
Identities = 16/48 (33%), Positives = 23/48 (47%), Gaps = 3/48 (6%)
Query: 143 KDMDEDE-LEMLSEARARLANTQGKKAKRKAREKQLEEAR-RLAALQK 188
+D+D + E A LA +G K +A E L+ AR RL A +
Sbjct: 1 EDIDLERAEEAKERAEEALAKAEGDKEYIRA-EAALKRARARLRAAKL 47
>gnl|CDD|235766 PRK06276, PRK06276, acetolactate synthase catalytic subunit;
Reviewed.
Length = 586
Score = 29.3 bits (66), Expect = 8.1
Identities = 15/55 (27%), Positives = 24/55 (43%)
Query: 142 PKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELRAAG 196
PKD+ E EL++ + G K Q+++A L A +R + A G
Sbjct: 158 PKDVQEGELDLEKYPIPAKIDLPGYKPTTFGHPLQIKKAAELIAEAERPVILAGG 212
>gnl|CDD|182521 PRK10528, PRK10528, multifunctional acyl-CoA thioesterase I and
protease I and lysophospholipase L1; Provisional.
Length = 191
Score = 28.6 bits (64), Expect = 9.0
Identities = 11/39 (28%), Positives = 19/39 (48%), Gaps = 5/39 (12%)
Query: 78 LMPTQWRTIAPII-----GRTAAQCLERYEFLLDQAQKK 111
L+ +W++ ++ G T+ Q L R LL Q Q +
Sbjct: 35 LLNDKWQSKTSVVNASISGDTSQQGLARLPALLKQHQPR 73
>gnl|CDD|217512 pfam03359, GKAP, Guanylate-kinase-associated protein (GKAP)
protein.
Length = 342
Score = 29.0 bits (65), Expect = 9.0
Identities = 18/62 (29%), Positives = 22/62 (35%), Gaps = 11/62 (17%)
Query: 121 PRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEA 180
P+K G I ++ D D EAR+RLA AKR A KQ
Sbjct: 277 PKKPAKGPIVKPAISREKSLDSSDRQR------QEARSRLA-----AAKRAASFKQNSAT 325
Query: 181 RR 182
Sbjct: 326 ES 327
>gnl|CDD|191489 pfam06297, PET, PET Domain. This domain is suggested to be
involved in protein-protein interactions. The family is
found in conjunction with pfam00412.
Length = 106
Score = 27.7 bits (62), Expect = 9.9
Identities = 22/76 (28%), Positives = 32/76 (42%), Gaps = 13/76 (17%)
Query: 199 VAPRQKKKRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQHL-------DGELR--SEKEE 249
V P + Y +P EK P G S+ E+ R++ L D + R E
Sbjct: 24 VPPGLTPELVHRYMELLPEEKVPVVG----SEGEKYRRRQLLHQLPPHDQDPRYCHGLSE 79
Query: 250 RERKKDKQKLKQRKEN 265
E K+ + +KQRKE
Sbjct: 80 EEVKELEDFVKQRKEE 95
>gnl|CDD|218538 pfam05285, SDA1, SDA1. This family consists of several SDA1
protein homologues. SDA1 is a Saccharomyces cerevisiae
protein which is involved in the control of the actin
cytoskeleton. The protein is essential for cell
viability and is localised in the nucleus.
Length = 317
Score = 28.9 bits (65), Expect = 9.9
Identities = 38/168 (22%), Positives = 67/168 (39%), Gaps = 36/168 (21%)
Query: 112 EEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEAR-------ARLANTQ 164
EE ++ A ++ E+ E + A + + ++++ L+ R A++ +
Sbjct: 136 EEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASELATTRILTPADFAKIQELR 195
Query: 165 GKKAKRKAREKQLEEARRLAALQKRRE-LRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAP 223
+K KA +L+ + A + E + A IE P +KKK
Sbjct: 196 LEKGVDKALGGKLKRRDKDAPERHSDELVDADDIE-GPAKKKK----------------- 237
Query: 224 GFYDTSKEERLRQQHLDGELRSEKEERERKKDKQ------KLKQRKEN 265
+KEER+ E R + R+ KKDK+ K K RK+N
Sbjct: 238 ----QTKEERIATAKEGREDREKFGSRKGKKDKEGKSTTNKEKARKKN 281
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.312 0.130 0.364
Gapped
Lambda K H
0.267 0.0696 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 40,259,013
Number of extensions: 4088732
Number of successful extensions: 5264
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4831
Number of HSP's successfully gapped: 307
Length of query: 769
Length of database: 10,937,602
Length adjustment: 104
Effective length of query: 665
Effective length of database: 6,324,786
Effective search space: 4205982690
Effective search space used: 4205982690
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 63 (28.1 bits)
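
The footer values above can be cross-checked against each other. The following is a minimal Python sketch, not part of the RPS-BLAST output, assuming the standard Karlin-Altschul conversions: bits = (Lambda * S - ln K) / ln 2 for the score thresholds S1 and S2, and bits = Lambda * X / ln 2 for the X-drop values. The function and variable names are illustrative only. Because the report prints Lambda and K rounded to three significant figures, the results should agree with the printed bit values only to within about 0.1 bit.

```python
import math

LN2 = math.log(2.0)

def raw_to_bits(raw, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw - math.log(k)) / LN2

def xdrop_to_bits(raw, lam):
    """Convert a raw X-drop value to bits (no K term)."""
    return lam * raw / LN2

# (Lambda, K) as printed in the report footer; both are rounded.
UNGAPPED = (0.312, 0.130)
GAPPED = (0.267, 0.0696)

# S1 and X1 use the ungapped parameters; S2, X2, X3 the gapped ones.
s1_bits = raw_to_bits(42, *UNGAPPED)       # report prints 21.9
s2_bits = raw_to_bits(63, *GAPPED)         # report prints 28.1
x1_bits = xdrop_to_bits(16, UNGAPPED[0])   # report prints 7.2
x2_bits = xdrop_to_bits(38, GAPPED[0])     # report prints 14.6
x3_bits = xdrop_to_bits(64, GAPPED[0])     # report prints 24.7

# Effective lengths: subtract the length adjustment once from the
# query and once per database sequence.
query_len, db_len, n_seqs, adjustment = 769, 10_937_602, 44_354, 104
eff_query = query_len - adjustment
eff_db = db_len - n_seqs * adjustment
search_space = eff_query * eff_db

if __name__ == "__main__":
    for name, value in [("S1", s1_bits), ("S2", s2_bits), ("X1", x1_bits),
                        ("X2", x2_bits), ("X3", x3_bits)]:
        print(f"{name}: {value:.2f} bits")
    print("Effective query length:", eff_query)
    print("Effective database length:", eff_db)
    print("Effective search space:", search_space)
```

The per-hit bit scores in the alignments do not follow from these global constants alone: a raw score of 69 is reported as 30.6 bits for PLN03086 and as 30.4 bits for pfam04147. The sketch should therefore be read as a check of the footer only, not as a way to recompute individual hit scores or E-values.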