RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy12365
(1324 letters)
>gnl|CDD|225201 COG2319, COG2319, FOG: WD40 repeat [General function prediction
only].
Length = 466
Score = 51.2 bits (121), Expect = 3e-06
Identities = 52/224 (23%), Positives = 96/224 (42%), Gaps = 25/224 (11%)
Query: 2 DKLIFTLEEPHGPGDVYVCWQRRTGE--LLATTGRDSSVSIYNKHGKLIDKITLPGL--- 56
+KLI +LE H + G LLA++ D +V +++ TL G
Sbjct: 98 EKLIKSLEGLHDSSVSKLALSSPDGNSILLASSSLDGTVKLWDLSTPGKLIRTLEGHSES 157
Query: 57 CIVMDWDSEGDLLGIISSNSSAVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSML---- 112
+ + +G LL SS + +W+ T K +G DP++ L + +L
Sbjct: 158 VTSLAFSPDGKLLASGSSLDGTIKLWDLRTGKPLSTLAGHTDPVSSLAFSPDGGLLIASG 217
Query: 113 --QLSVSIYNKHGKLIDKITLPGL--CIVMDWDSEGDLLGIISSNSSAVNVWTL-----L 163
++ +++ + + TL G +V + +G LL S+ + +W L L
Sbjct: 218 SSDGTIRLWDLSTGKLLRSTLSGHSDSVVSSFSPDGSLL-ASGSSDGTIRLWDLRSSSSL 276
Query: 164 TYTLG------ERISWSDDGQLLAVTTSGGSVKIYLSKLPKLVV 201
TL +++S DG+LLA +S G+V+++ + KL+
Sbjct: 277 LRTLSGHSSSVLSVAFSPDGKLLASGSSDGTVRLWDLETGKLLS 320
Score = 47.8 bits (112), Expect = 4e-05
Identities = 43/239 (17%), Positives = 95/239 (39%), Gaps = 28/239 (11%)
Query: 3 KLIFTLEEPHGPGDVYVCWQRRTGELLATTGRDSSVSIYN--KHGKLIDKITLPGLCIV- 59
KL+ + H V G LLA+ D ++ +++ L+ ++ ++
Sbjct: 232 KLLRSTLSGHSDSVVSSFS--PDGSLLASGSSDGTIRLWDLRSSSSLLRTLSGHSSSVLS 289
Query: 60 MDWDSEGDLLGIISSNSSAVNVWNTYTKKRTIVDS--GLRDPLTCLVWCKQCSMLQLSVS 117
+ + +G LL S+ V +W+ T K + G P++ L + S+L S
Sbjct: 290 VAFSPDGKLL-ASGSSDGTVRLWDLETGKLLSSLTLKGHEGPVSSLSFSPDGSLLVSGGS 348
Query: 118 IYNKH-------GKLIDKITLPGLCIVMDWDSEGDLLGIISSNSSAVNVWTLLTYTLG-- 168
GK + + + + + +G ++ S++ + V +W L T +L
Sbjct: 349 DDGTIRLWDLRTGKPLKTLEGHSNVLSVSFSPDGRVVSSGSTDGT-VRLWDLSTGSLLRN 407
Query: 169 --------ERISWSDDGQLLAVTTSGGSVKIY--LSKLPKLVVANNGKIAILSSLNQVS 217
+ +S DG+ LA +S +++++ + L + + +GK+ S +
Sbjct: 408 LDGHTSRVTSLDFSPDGKSLASGSSDNTIRLWDLKTSLKSVSFSPDGKVLASKSSDLSV 466
Score = 47.4 bits (111), Expect = 4e-05
Identities = 38/210 (18%), Positives = 90/210 (42%), Gaps = 22/210 (10%)
Query: 19 VCWQRRTGELLATTGRDSSVSIYNKHGKLIDKITLPGL--CIVMDWDSEGDLLGIISSNS 76
+ + G L+A+ D ++ +++ + + TL G +V + +G LL S+
Sbjct: 204 LAFSPDGGLLIASGSSDGTIRLWDLSTGKLLRSTLSGHSDSVVSSFSPDGSLL-ASGSSD 262
Query: 77 SAVNVWNT-YTKKRTIVDSGLRDPLTCLVWCKQCSML-----QLSVSIYN-KHGKLIDKI 129
+ +W+ + SG + + + +L +V +++ + GKL+ +
Sbjct: 263 GTIRLWDLRSSSSLLRTLSGHSSSVLSVAFSPDGKLLASGSSDGTVRLWDLETGKLLSSL 322
Query: 130 TLPGLCIV---MDWDSEGDLLGIISSNSSAVNVWTLLT---------YTLGERISWSDDG 177
TL G + + +G LL S+ + +W L T ++ +S+S DG
Sbjct: 323 TLKGHEGPVSSLSFSPDGSLLVSGGSDDGTIRLWDLRTGKPLKTLEGHSNVLSVSFSPDG 382
Query: 178 QLLAVTTSGGSVKIYLSKLPKLVVANNGKI 207
++++ ++ G+V+++ L+ +G
Sbjct: 383 RVVSSGSTDGTVRLWDLSTGSLLRNLDGHT 412
>gnl|CDD|238121 cd00200, WD40, WD40 domain, found in a number of eukaryotic
proteins that cover a wide variety of functions
including adaptor/regulatory modules in signal
transduction, pre-mRNA processing and cytoskeleton
assembly; typically contains a GH dipeptide 11-24
residues from its N-terminus and the WD dipeptide at its
C-terminus and is 40 residues long, hence the name WD40;
between GH and WD lies a conserved core; serves as a
stable propeller-like platform to which proteins can
bind either stably or reversibly; forms a propeller-like
structure with several blades where each blade is
composed of a four-stranded anti-parallel b-sheet;
instances with few detectable copies are hypothesized to
form larger structures by dimerization; each WD40
sequence repeat forms the first three strands of one
blade and the last strand in the next blade; the last
C-terminal WD40 repeat completes the blade structure of
the first WD40 repeat to create the closed ring
propeller-structure; residues on the top and bottom
surface of the propeller are proposed to coordinate
interactions with other proteins and/or small ligands; 7
copies of the repeat are present in this alignment.
Length = 289
Score = 43.5 bits (103), Expect = 4e-04
Identities = 41/213 (19%), Positives = 88/213 (41%), Gaps = 33/213 (15%)
Query: 4 LIFTLEEPHGPGDVY-VCWQRRTGELLATTGRDSSVSIYN-KHGKLIDKITLPGLCIV-M 60
L TL+ H G V V + G+LLAT D ++ +++ + G+L+ + + +
Sbjct: 1 LRRTLKG-H-TGGVTCVAFSP-DGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDV 57
Query: 61 DWDSEGDLLGIISSNSSAVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSML-----QLS 115
++G L S+ + +W+ T + +G ++ + + +L +
Sbjct: 58 AASADGTYL-ASGSSDKTIRLWDLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKT 116
Query: 116 VSIYN-KHGKLIDKITLPG-----LCIVMDWDSEGDLLGIISSNSSAVNVWTL----LTY 165
+ +++ + GK + TL G + D SS + +W L
Sbjct: 117 IKVWDVETGKCL--TTLRGHTDWVNSVAFSPD---GTFVASSSQDGTIKLWDLRTGKCVA 171
Query: 166 TL------GERISWSDDGQLLAVTTSGGSVKIY 192
TL +++S DG+ L ++S G++K++
Sbjct: 172 TLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204
Score = 37.3 bits (87), Expect = 0.036
Identities = 42/194 (21%), Positives = 78/194 (40%), Gaps = 33/194 (17%)
Query: 24 RTGELLATTGRDSSVSIYN-KHGKLIDKITLPG-----LCIVMDWDSEGDLLGIISSNSS 77
G +L+++ RD ++ +++ + GK + TL G + D SS
Sbjct: 103 PDGRILSSSSRDKTIKVWDVETGKCL--TTLRGHTDWVNSVAFSPD---GTFVASSSQDG 157
Query: 78 AVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSMLQLSVS-----IYN-KHGKLIDKITL 131
+ +W+ T K +G + + + L S S +++ GK + TL
Sbjct: 158 TIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTGKCLG--TL 215
Query: 132 PG---LCIVMDWDSEGDLLGIISSNSSAVNVWTLLT----YTLGE------RISWSDDGQ 178
G + + +G LL S + VW L T TL ++WS DG+
Sbjct: 216 RGHENGVNSVAFSPDGYLL-ASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGK 274
Query: 179 LLAVTTSGGSVKIY 192
LA ++ G+++I+
Sbjct: 275 RLASGSADGTIRIW 288
>gnl|CDD|205602 pfam13424, TPR_12, Tetratricopeptide repeat.
Length = 78
Score = 34.7 bits (80), Expect = 0.028
Identities = 16/69 (23%), Positives = 33/69 (47%), Gaps = 3/69 (4%)
Query: 834 LNDAVTLYESAGNYEKAATCYIQ-LKNWTKIGQLLPHIKSATTFIQYAKAKEAMGSYRES 892
LN+ + G+Y++A + L+ ++G+ H ++A A+ A+G Y E+
Sbjct: 8 LNNLALVLRRLGDYDEALELLEKALELARELGED--HPETARALNNLARLYLALGDYDEA 65
Query: 893 VGAYERAED 901
+ E+A
Sbjct: 66 LEYLEKALA 74
Score = 31.6 bits (72), Expect = 0.39
Identities = 18/84 (21%), Positives = 32/84 (38%), Gaps = 21/84 (25%)
Query: 697 GHVAALLGNHDTAQQRYLTSDIPTMALTLRRDLRQWREALALATSLGSN--QTPIISCDY 754
V LG++D A + +AL LA LG + +T +
Sbjct: 12 ALVLRRLGDYDEALELL-------------------EKALELARELGEDHPETARALNNL 52
Query: 755 AQQLEMTGQHAQALSFYQKSMELA 778
A+ G + +AL + +K++ L
Sbjct: 53 ARLYLALGDYDEALEYLEKALALR 76
>gnl|CDD|216037 pfam00637, Clathrin, Region in Clathrin and VPS. Each region is
about 140 amino acids long. The regions are composed of
multiple alpha helical repeats. They occur in the arm
region of the Clathrin heavy chain.
Length = 143
Score = 35.3 bits (82), Expect = 0.071
Identities = 20/135 (14%), Positives = 47/135 (34%), Gaps = 15/135 (11%)
Query: 822 NECADILQQFNKLNDAVTLYESAGNYEKAAT---------CYIQLKNWTKIGQLLPHIKS 872
+ + ++ L + + ESA Y + ++ K L +K
Sbjct: 11 SRVVKLFEKRGLLEELIPYLESALKENSRENPALQTALLELYAKYEDPEK---LEEFLKK 67
Query: 873 ATTF-IQYAKAK-EAMGSYRESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTE 930
+ ++ E Y E+V Y++ +Y + + L L + A++ E
Sbjct: 68 NNNYDLEKVAKLCEKADLYEEAVILYKKNGNYKEAISL-LKKLKLYKDAIEYAVKSNDPE 126
Query: 931 GAKRIADYCNKHGDF 945
+++ + +G F
Sbjct: 127 LWEKLLNALLDNGRF 141
>gnl|CDD|238112 cd00189, TPR, Tetratricopeptide repeat domain; typically contains
34 amino acids
[WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-
X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found
in a variety of organisms including bacteria,
cyanobacteria, yeast, fungi, plants, and humans in
various subcellular locations; involved in a variety of
functions including protein-protein interactions, but
common features in the interaction partners have not
been defined; involved in chaperone, cell-cycle,
transcription, and protein transport complexes; the
number of TPR motifs varies among proteins (1,3-11,13
15,16,19); 5-6 tandem repeats generate a right-handed
helical structure with an amphipathic channel that is
thought to accommodate an alpha-helix of a target
protein; it has been proposed that TPR proteins
preferably interact with WD-40 repeat proteins, but in
many instances several TPR-proteins seem to aggregate to
multi-protein complexes; examples of TPR-proteins
include, Cdc16p, Cdc23p and Cdc27p components of the
cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal
targeting signals, the Tom70p co-receptor for
mitochondrial targeting signals, Ser/Thr phosphatase 5C
and the p110 subunit of O-GlcNAc transferase; three
copies of the repeat are present here.
Length = 100
Score = 30.8 bits (70), Expect = 1.1
Identities = 19/65 (29%), Positives = 30/65 (46%), Gaps = 8/65 (12%)
Query: 840 LYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAMGSYRESVGAYERA 899
LY G+Y++A Y K +L P +A + A A +G Y E++ YE+A
Sbjct: 9 LYYKLGDYDEALEYY------EKALELDP--DNADAYYNLAAAYYKLGKYEEALEDYEKA 60
Query: 900 EDYDN 904
+ D
Sbjct: 61 LELDP 65
Score = 28.9 bits (65), Expect = 5.5
Identities = 18/66 (27%), Positives = 27/66 (40%), Gaps = 8/66 (12%)
Query: 834 LNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAMGSYRESV 893
+ Y G YE+A Y K +L P +A + A +G Y E++
Sbjct: 37 YYNLAAAYYKLGKYEEALEDY------EKALELDP--DNAKAYYNLGLAYYKLGKYEEAL 88
Query: 894 GAYERA 899
AYE+A
Sbjct: 89 EAYEKA 94
>gnl|CDD|223533 COG0457, NrfG, FOG: TPR repeat [General function prediction only].
Length = 291
Score = 32.5 bits (72), Expect = 1.5
Identities = 35/193 (18%), Positives = 60/193 (31%), Gaps = 10/193 (5%)
Query: 824 CADILQQFNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAK 883
+ L+ + A L A K L+ K +L A +
Sbjct: 46 LEEALELLPNSDLAGLLLLLALALLKLGRLEEALELLEKALELELLPNLAEALLNLGLLL 105
Query: 884 EAMGSYRESVGAYERAEDYDNVVRVDLDHLNDI--RHAVDIVKAKKCTEGAKRIADYCNK 941
EA+G Y E++ E+A D + L D +A + E A + N+
Sbjct: 106 EALGKYEEALELLEKALALDPDPDLAEALLALGALYELGDYEEALELYEKALELDPELNE 165
Query: 942 HGDFGAAIHFLILSKCYQDAFNLSQQHKKLHEFGKFLLEEDEPNPVELKRLAIHFEEDKD 1001
+ A+ L + + L K L + + L L + + +
Sbjct: 166 LAEALLAL--------GALLEALGRYEEALELLEKALKLNPDDDAEALLNLGLLYLKLGK 217
Query: 1002 MFRAAQYYYHAKE 1014
A +YY A E
Sbjct: 218 YEEALEYYEKALE 230
>gnl|CDD|222858 PHA02533, 17, large terminase protein; Provisional.
Length = 534
Score = 32.7 bits (75), Expect = 1.5
Identities = 26/84 (30%), Positives = 34/84 (40%), Gaps = 25/84 (29%)
Query: 445 PESGEKYTIMTQAMSE-------EFLFFATSEYELKIFSLSEWKFVSGYKHSDKIKS-IY 496
P G KY I T +SE +EY +K V+ Y H++ I I
Sbjct: 310 PVEGHKY-IATLDVSEGRGQDYSALHIIDITEYP--------YKQVAVY-HNNTISPLIL 359
Query: 497 PDI-------YGICLVLIEMNNTG 513
PDI Y V IE+N+TG
Sbjct: 360 PDIIVDYLMEYNEAPVYIELNSTG 383
>gnl|CDD|146707 pfam04212, MIT, MIT (microtubule interacting and transport) domain.
The MIT domain forms an asymmetric three-helix bundle
and binds ESCRT-III (endosomal sorting complexes
required for transport) substrates.
Length = 69
Score = 29.5 bits (67), Expect = 1.8
Identities = 21/76 (27%), Positives = 33/76 (43%), Gaps = 14/76 (18%)
Query: 827 ILQQFNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAM 886
+ + + AV + AGNYE+A Y + I LL +K K +EA
Sbjct: 2 LEKALELVKKAVEA-DEAGNYEEALELYKE-----AIEYLLQALKYEPD----PKRREA- 50
Query: 887 GSYRESVGAY-ERAED 901
R+ + Y +RAE+
Sbjct: 51 --LRQKIAEYLDRAEE 64
>gnl|CDD|225504 COG2956, COG2956, Predicted N-acetylglucosaminyl transferase
[Carbohydrate transport and metabolism].
Length = 389
Score = 32.0 bits (73), Expect = 2.3
Identities = 35/159 (22%), Positives = 58/159 (36%), Gaps = 36/159 (22%)
Query: 666 AIRAYQSLDEAGMVWCLESLVEEEEDTSIL-CGHVAALLGNHDTAQQRY-LTSDIPTMAL 723
AIR +Q+L E+ L E+ ++ G G D A+ + D A
Sbjct: 88 AIRIHQTLLES------PDLTFEQRLLALQQLGRDYMAAGLLDRAEDIFNQLVDEGEFAE 141
Query: 724 TLRRDL-------RQWREALALATSL----GSNQTPIIS---CDYAQQLEMTGQHAQALS 769
+ L R+W +A+ +A L G I+ C+ AQQ + +A
Sbjct: 142 GALQQLLNIYQATREWEKAIDVAERLVKLGGQTYRVEIAQFYCELAQQALASSDVDRARE 201
Query: 770 FYQKSMELATPDIQDPECQRKCKEGIARTSIRVGDFRLG 808
+K+++ D +C R SI +G L
Sbjct: 202 LLKKALQ------ADKKC--------VRASIILGRVELA 226
>gnl|CDD|233895 TIGR02494, PFLE_PFLC, glycyl-radical enzyme activating protein
family. This subset of the radical-SAM family
(pfam04055) includes a number of probable activating
proteins acting on different enzymes all requiring an
amino-acid-centered radical. The closest relatives to
this family are the pyruvate-formate lyase activating
enzyme (PflA, 1.97.1.4, TIGR02493) and the anaerobic
ribonucleotide reductase activating enzyme (TIGR02491).
Included within this subfamily are activators of
hydroxyphenyl acetate decarboxylase (HdpA, ),
benzylsuccinate synthase (BssD, ), glycerol dehydratase
(DhaB2,) as well as enzymes annotated in E. coli as
activators of different isozymes of pyruvate-formate
lyase (PFLC and PFLE) however, these appear to lack
characterization and may activate enzymes with
distinctive functions. Most of the sequence-level
variability between these forms is concentrated within an
N-terminal domain which follows a conserved group of
three cysteines and contains a variable pattern of 0 to 8
additional cysteines.
Length = 295
Score = 31.5 bits (72), Expect = 2.4
Identities = 13/62 (20%), Positives = 20/62 (32%), Gaps = 17/62 (27%)
Query: 1236 LPCPYCD-----TMVPDMM------LHCASCARIIPFCIA------SGKHITRNELTKCL 1278
L C +C P+++ L C C + P A G++ KC
Sbjct: 26 LRCKWCSNPESQRKSPELLFKENRCLGCGKCVEVCPAGTARLSELADGRNRIIIRREKCT 85
Query: 1279 EC 1280
C
Sbjct: 86 HC 87
>gnl|CDD|184804 PRK14720, PRK14720, transcript cleavage factor/unknown domain fusion
protein; Provisional.
Length = 906
Score = 32.0 bits (73), Expect = 2.6
Identities = 22/75 (29%), Positives = 36/75 (48%), Gaps = 6/75 (8%)
Query: 939 CNKHGDFGAAIHFL-ILSKCYQDAFNLSQQHKKLHEFGKFLLEEDEPNPVELKRLAIHFE 997
C+K +G L L++ Y ++KKL + L++ D NP +K+LA +E
Sbjct: 106 CDKILLYGENKLALRTLAEAYAKL----NENKKLKGVWERLVKADRDNPEIVKKLATSYE 161
Query: 998 EDKDMFRAAQYYYHA 1012
E+ D +A Y A
Sbjct: 162 EE-DKEKAITYLKKA 175
>gnl|CDD|237756 PRK14559, PRK14559, putative protein serine/threonine phosphatase;
Provisional.
Length = 645
Score = 31.6 bits (72), Expect = 3.1
Identities = 10/19 (52%), Positives = 11/19 (57%)
Query: 1237 PCPYCDTMVPDMMLHCASC 1255
PCP C T VP HC +C
Sbjct: 29 PCPQCGTEVPVDEAHCPNC 47
>gnl|CDD|237953 PRK15378, PRK15378, inositol phosphate phosphatase SopB;
Provisional.
Length = 564
Score = 31.8 bits (72), Expect = 3.1
Identities = 20/86 (23%), Positives = 29/86 (33%)
Query: 812 AAESNSSVLKNECADILQQFNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIK 871
AA++ L A QQ L +A + A + + W I L H
Sbjct: 126 AAKALKKNLIELIAARTQQQLGLPAKEAHRFAALAFSDAQVKQLNNQPWQTIKNTLSHNG 185
Query: 872 SATTFIQYAKAKEAMGSYRESVGAYE 897
T Q A+ +G+ AYE
Sbjct: 186 HHYTNTQLPAAEMKIGAKDIFPKAYE 211
>gnl|CDD|221693 pfam12657, TFIIIC_delta, Transcription factor IIIC subunit delta
N-term. In humans there are six subunits of
transcription factor IIIC, and this one is the 90 kDa
subunit; whereas in fungi the complex resolves into nine
different subunits and this is No. 9 in yeasts. The
whole subunit is involved in RNA polymerase III-mediated
transcription. It is possible that this N-terminal
domain interacts with TFIIIC subunit 8.
Length = 167
Score = 30.5 bits (69), Expect = 3.2
Identities = 33/115 (28%), Positives = 45/115 (39%), Gaps = 21/115 (18%)
Query: 171 ISWSDDGQLLAVTTSGGSVKIYLSKLPKLVVANNGKIAILSSLNQVSVYLRSIERKGTPW 230
+SWS+DGQL T G +V I K S++ LR G
Sbjct: 10 LSWSEDGQLAVAT--GETVHILNPKSLAKSFIPTPSTLPASAIQWDITKLRGNLFTGQEL 67
Query: 231 --------TNFIIETEIEPS------WREYHGLVVANNGK--IAILSSLNQVSVY 269
F I EI PS W GL A NG+ +A+L+S ++S+Y
Sbjct: 68 PSILPQSRDLFSIGEEISPSHVRAVAWSP-PGL--AKNGRCLLAVLTSNLRLSLY 119
>gnl|CDD|197651 smart00320, WD40, WD40 repeats. Note that these repeats are
permuted with respect to the structural repeats (blades)
of the beta propeller domain.
Length = 40
Score = 27.7 bits (62), Expect = 3.3
Identities = 7/22 (31%), Positives = 16/22 (72%)
Query: 171 ISWSDDGQLLAVTTSGGSVKIY 192
+++S DG+ LA + G++K++
Sbjct: 18 VAFSPDGKYLASGSDDGTIKLW 39
>gnl|CDD|236058 PRK07579, PRK07579, hypothetical protein; Provisional.
Length = 245
Score = 30.6 bits (69), Expect = 4.2
Identities = 11/53 (20%), Positives = 24/53 (45%), Gaps = 2/53 (3%)
Query: 891 ESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTEGAKRIADYCNKHG 943
+ G +D+ + +DLD RH ++ ++A T + A + ++ G
Sbjct: 178 ATEGNLNSKKDFKQLREIDLDERGTFRHFINRLRA--LTHDDYKNAYFVDESG 228
>gnl|CDD|239145 cd02682, MIT_AAA_Arch, MIT: domain contained within Microtubule
Interacting and Trafficking molecules. This sub-family of
MIT domains is found in mostly archaebacterial
AAA-ATPases. The molecular function of the MIT domain is
unclear.
Length = 75
Score = 28.6 bits (64), Expect = 4.2
Identities = 13/36 (36%), Positives = 20/36 (55%), Gaps = 3/36 (8%)
Query: 1186 NLQESALKFATLLLRPEYRSNLEDK---YRKQIELI 1218
L+E A K+A ++ E N ED Y+K IE++
Sbjct: 1 MLEEMARKYAINAVKAEKEGNAEDAITNYKKAIEVL 36
>gnl|CDD|211334 cd02566, PseudoU_synth_RluE, Pseudouridine synthase, Escherichia
coli RluE. This group is comprised of bacterial
proteins similar to E. coli RluE. Pseudouridine
synthases catalyze the isomerization of specific
uridines in an RNA molecule to pseudouridines
(5-ribosyluracil, psi). No cofactors are required.
Escherichia coli RluE makes psi2457 in 23S RNA. psi2457
is not universally conserved.
Length = 168
Score = 30.0 bits (68), Expect = 4.4
Identities = 14/30 (46%), Positives = 16/30 (53%), Gaps = 2/30 (6%)
Query: 42 NKHGKLIDKITLPGLCIV--MDWDSEGDLL 69
KH L D I PG+ +D DSEG LL
Sbjct: 19 EKHKTLKDYIDDPGVYAAGRLDRDSEGLLL 48
Score = 30.0 bits (68), Expect = 4.4
Identities = 14/30 (46%), Positives = 16/30 (53%), Gaps = 2/30 (6%)
Query: 120 NKHGKLIDKITLPGLCIV--MDWDSEGDLL 147
KH L D I PG+ +D DSEG LL
Sbjct: 19 EKHKTLKDYIDDPGVYAAGRLDRDSEGLLL 48
>gnl|CDD|212567 cd11694, DHR2_DOCK_D, Dock Homology Region 2, a GEF domain, of
Class D Dedicator of Cytokinesis proteins. DOCK
proteins are atypical guanine nucleotide exchange
factors (GEFs) that lack the conventional Dbl homology
(DH) domain. As GEFs, they activate small GTPases by
exchanging bound GDP for free GTP. They are divided into
four classes (A-D) based on sequence similarity and
domain architecture; class D, also called the Zizimin
subfamily, includes Dock9, 10 and 11. Class D Docks are
specific GEFs for Cdc42. Dock9 plays important roles in
spine formation and dendritic growth. Dock10 and Dock11
are preferentially expressed in lymphocytes. All DOCKs
contain two homology domains: the DHR-1 (Dock homology
region-1), also called CZH1 (CED-5, Dock180, and
MBC-zizimin homology 1), and DHR-2 (also called CZH2 or
Docker). The DHR-1 domain binds
phosphatidylinositol-3,4,5-triphosphate. This alignment
model represents the DHR-2 domain of class D DOCKs,
which contains the catalytic GEF activity for Cdc42.
Class D DOCKs also contain a Pleckstrin homology (PH)
domain at the N-terminus.
Length = 376
Score = 31.2 bits (71), Expect = 4.5
Identities = 20/88 (22%), Positives = 29/88 (32%), Gaps = 29/88 (32%)
Query: 824 CADILQQ---------FNKLNDAVTLYESAGNYEKAATCYIQLKNWTKIGQLLPHIKSAT 874
C + L + KL + +YE ++E+ A CY L
Sbjct: 51 CVEGLWKAERYELLGELYKL--IIPIYEKRRDFEQLADCYRTLH---------------- 92
Query: 875 TFIQYAKAKEAMGSYRESVGAYERAEDY 902
Y K E M S + +G Y R Y
Sbjct: 93 --RAYEKVVEVMESGKRLLGTYYRVAFY 118
>gnl|CDD|128594 smart00299, CLH, Clathrin heavy chain repeat homology.
Length = 140
Score = 29.9 bits (68), Expect = 4.7
Identities = 10/63 (15%), Positives = 24/63 (38%)
Query: 880 AKAKEAMGSYRESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTEGAKRIADYC 939
K E Y E+V Y++ ++ + + ++HL + A++ + E +
Sbjct: 76 GKLCEKAKLYEEAVELYKKDGNFKDAIVTLIEHLGNYEKAIEYFVKQNNPELWAEVLKAL 135
Query: 940 NKH 942
Sbjct: 136 LDK 138
>gnl|CDD|132218 TIGR03174, cas_Csc3, CRISPR type I-D/CYANO-associated protein
Csc3/Cas10d. CRISPR (Clustered Regularly Interspaced
Short Palindromic Repeats) is a widespread family of
prokaryotic direct repeats with spacers of unique
sequence between consecutive repeats. This protein
family is a CRISPR-associated (Cas) family strictly
associated with the Cyano subtype of CRISPR/Cas locus,
found in several species of Cyanobacteria and several
archaeal species. This family is designated Csc3 for
CRISPR/Cas Subtype Cyano protein 3, as it is often the
third gene upstream of the core cas genes,
cas3-cas4-cas1-cas2 [Mobile and extrachromosomal element
functions, Other].
Length = 953
Score = 31.0 bits (70), Expect = 5.6
Identities = 29/129 (22%), Positives = 39/129 (30%), Gaps = 23/129 (17%)
Query: 623 ISLHKVLKLLNWKEAWNICAVLNQSETWRSFAEACLQNL---EFSWAIRAYQSLDEAGMV 679
+LH K + + N LN E A + L EF + D +
Sbjct: 45 FTLHDYHKHCHADDMPNDEFDLNIQEI-IPIILALGKRLGLDEFWAPENSEDWRDYIAEI 103
Query: 680 WCLESLVEEEEDTSILCGHVAALLGNHDTAQQRYLTSDIPTMALTLRRDLRQWREALALA 739
L V ++ S NH TA + R LR R L LA
Sbjct: 104 SFLAQNVHGKQHIS----------SNHSTAG---YNFTLKE------RTLRPLRHLLLLA 144
Query: 740 TSLGSNQTP 748
S S +P
Sbjct: 145 DSAASLSSP 153
>gnl|CDD|218251 pfam04762, IKI3, IKI3 family. Members of this family are
components of the elongator multi-subunit component of a
novel RNA polymerase II holoenzyme for transcriptional
elongation. This region contains WD40 like repeats.
Length = 903
Score = 31.1 bits (71), Expect = 5.6
Identities = 9/27 (33%), Positives = 17/27 (62%), Gaps = 1/27 (3%)
Query: 169 ERISWSDDGQLLAVTTSGGSVKIYLSK 195
+WS D +LLA+TT +V + +++
Sbjct: 120 SAAAWSPDEELLALTTGENTV-LLMTR 145
>gnl|CDD|239956 cd04583, CBS_pair_ABC_OpuCA_assoc2, This cd contains two tandem
repeats of the cystathionine beta-synthase (CBS pair)
domains in association with the ABC transporter OpuCA.
OpuCA is the ATP binding component of a bacterial solute
transporter that serves a protective role to cells
growing in a hyperosmolar environment but the function
of the CBS domains in OpuCA remains unknown. In the
related ABC transporter, OpuA, the tandem CBS domains
have been shown to function as sensors for ionic
strength, whereby they control the transport activity
through an electronic switching mechanism. ABC
transporters are a large family of proteins involved in
the transport of a wide variety of different compounds,
like sugars, ions, peptides, and more complex organic
molecules. They are a subset of nucleotide hydrolases
that contain a signature motif, Q-loop, and
H-loop/switch region, in addition to the Walker A
motif/P-loop and Walker B motif commonly found in a
number of ATP- and GTP-binding and hydrolyzing proteins.
CBS is a small domain originally identified in
cystathionine beta-synthase and subsequently found in a
wide range of different proteins. CBS domains usually
come in tandem repeats, which associate to form a
so-called Bateman domain or a CBS pair which is
reflected in this model. The interface between the two
CBS domains forms a cleft that is a potential ligand
binding site. The CBS pair coexists with a variety of
other functional domains. It has been proposed that the
CBS domain may play a regulatory role, although its
exact function is unknown.
Length = 109
Score = 29.1 bits (66), Expect = 5.7
Identities = 22/92 (23%), Positives = 40/92 (43%), Gaps = 18/92 (19%)
Query: 63 DSEGDLLGIISSNSSAVNVWNTYTKKRTIVDSGLRDPLTCLVWCKQCSMLQLSVSIYNKH 122
D + LLGI+S S + Y + +++ D L D T +Q S+ +
Sbjct: 32 DKDNKLLGIVSLES----LEQAYKEAKSLEDIMLEDVFT----------VQPDASLRD-- 75
Query: 123 GKLIDKITLPGLCIVMDWDSEGDLLGIISSNS 154
++ + G V D +G L+G+I+ +S
Sbjct: 76 --VLGLVLKRGPKYVPVVDEDGKLVGLITRSS 105
>gnl|CDD|239057 cd02143, NADH_nitroreductase, Nitroreductase family. Members of this
family utilize FMN as a cofactor. This family is involved
in the reduction of flavin or nitroaromatic compounds by
using NAD(P)H as electron donor in an obligatory
two-electron transfer. Nitrogenase is a homodimer. Each
subunit contains one FMN molecule. Members of this family
are also called NADH dehydrogenase, oxygen-insensitive
NAD(P)H nitrogenase or dihydropteridine reductase.
Length = 147
Score = 29.5 bits (67), Expect = 6.5
Identities = 12/51 (23%), Positives = 20/51 (39%), Gaps = 4/51 (7%)
Query: 1207 LEDKYRKQIELIVRKAPRKDIASPEEAHVLPCPYCDTMVP--DMMLHCASC 1255
++D + K I+ I R AP +AS P D ++ L +
Sbjct: 45 VDDPWEKGIDPIFRGAPHLLLASAPRDF--PTAQVDAIIALTYFELAAQAL 93
>gnl|CDD|129168 TIGR00058, Hemerythrin, hemerythrin family non-heme iron protein.
This family includes oxygen carrier proteins of various
oligomeric states from the vascular fluid (hemerythrin)
and muscle (myohemerythrin) of some marine invertebrates.
Each unit binds 2 non-heme Fe using 5 H, one E and one D.
One member of this family, from the sandworm Nereis
diversicolor, is an unusual (non-metallothionein)
cadmium-binding protein. Homologous proteins, excluded
from this narrowly defined family, are found in archaea
and bacteria (see pfam01814).
Length = 115
Score = 29.0 bits (65), Expect = 6.8
Identities = 18/50 (36%), Positives = 25/50 (50%), Gaps = 6/50 (12%)
Query: 963 NLSQQHKKLHEFGKFLLEEDEPNPVELKRL----AIHFEEDKDMFRAAQY 1008
NL ++HK L G F L D + LK L +HF +++ M AA Y
Sbjct: 17 NLDEEHKTLFN-GIFALAAD-NSATALKELIDVTVLHFLDEEAMMIAANY 64
>gnl|CDD|237602 PRK14081, PRK14081, triple tyrosine motif-containing protein;
Provisional.
Length = 667
Score = 30.8 bits (70), Expect = 7.0
Identities = 19/82 (23%), Positives = 33/82 (40%), Gaps = 23/82 (28%)
Query: 445 PESGEKYTIMTQAMSEE----FLFFATSEYELKIFSLSEWKFVSGYKHSDK-IKSIYPDI 499
P+ +YTIM QA E+ F + + +Y + K +K IK+IY D
Sbjct: 60 PKEEGEYTIMVQAKKEDSNKPFDYVSKEDYVIG-------------KAEEKLIKNIYLDK 106
Query: 500 YGI-----CLVLIEMNNTGFLY 516
+ + ++ N +Y
Sbjct: 107 DTLNVGEKIEIKVDSNKEPLMY 128
>gnl|CDD|234059 TIGR02917, PEP_TPR_lipo, putative PEP-CTERM system TPR-repeat
lipoprotein. This protein family occurs strictly
within a subset of Gram-negative bacterial species with
the proposed PEP-CTERM/exosortase system, analogous to
the LPXTG/sortase system common in Gram-positive
bacteria. This protein occurs in a species if and only if
a transmembrane histidine kinase (TIGR02916) and a
DNA-binding response regulator (TIGR02915) also occur.
The presence of tetratricopeptide repeats (TPR) suggests
protein-protein interaction, possibly for the regulation
of PEP-CTERM protein expression, since many PEP-CTERM
proteins in these genomes are preceded by a proposed DNA
binding site for the response regulator.
Length = 899
Score = 30.4 bits (69), Expect = 7.2
Identities = 60/345 (17%), Positives = 105/345 (30%), Gaps = 90/345 (26%)
Query: 711 QRYLTSDIPTMALTLRRDLRQW-REALALATSLGSNQTPIISCDYAQQLEMTGQHAQALS 769
Q YL AL + + ++ LG Q G +A+S
Sbjct: 575 QYYLGKGQLKKALAILNEAADAAPDSPEAWLMLGRAQL------------AAGDLNKAVS 622
Query: 770 FYQKSMELA---------TPDIQDPECQRKCKEGIARTSIRVGDFRLGIRLAAESNSSVL 820
++K + L D + + A TS++ L + +++
Sbjct: 623 SFKKLLALQPDSALALLLLADAY--AVMKNYAK--AITSLKRA-------LELKPDNTEA 671
Query: 821 KNECADILQQFNKLNDAVTLYES-AGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQY 879
+ A +L + A + +S + KAA + G L K IQ
Sbjct: 672 QIGLAQLLLAAKRTESAKKIAKSLQKQHPKAALGFELE------GDLYLRQKDYPAAIQA 725
Query: 880 AKAKEAMGSYRESVGAYERAEDYDNVVRVDLDHLNDIRHAVDIVKAKKCTEGAKRIADYC 939
YR+ A +RA N +++ L + E K + +
Sbjct: 726 ---------YRK---ALKRAPSSQNAIKLHRALL----------ASGNTAEAVKTLEAWL 763
Query: 940 NKHGDFGAAIHFLILSKCYQDAFNLSQQHKKLHEFGKFLLEEDEPNPVELKRLAIHFEED 999
H + A + L++ Y + + K ++++ N V L LA + E
Sbjct: 764 KTHPN-DAVLRTA-LAELYLAQKDYDKAIKHYQT----VVKKAPDNAVVLNNLAWLYLEL 817
Query: 1000 KDMFRAAQY---------------------YYHAKEYGRAMKLLL 1023
KD RA +Y E RA+ LL
Sbjct: 818 KDP-RALEYAERALKLAPNIPAILDTLGWLLVEKGEADRALPLLR 861
>gnl|CDD|238917 cd01942, ribokinase_group_A, Ribokinase-like subgroup A. Found in
bacteria and archaea, this subgroup is part of the
ribokinase/pfkB superfamily. Its oligomerization state
is unknown at this time.
Length = 279
Score = 30.0 bits (68), Expect = 8.3
Identities = 8/25 (32%), Positives = 11/25 (44%)
Query: 517 HTAMDYLLPIPEFPPATEEVLWDTV 541
H D +L + FP E VL +
Sbjct: 7 HLNYDIILKVESFPGPFESVLVKDL 31
>gnl|CDD|177301 PHA00733, PHA00733, hypothetical protein.
Length = 128
Score = 28.7 bits (64), Expect = 8.7
Identities = 15/38 (39%), Positives = 19/38 (50%), Gaps = 7/38 (18%)
Query: 362 HYLGDNQKTIPDAFIKNRNYFAHTLVYKPQLLGDKEYL 399
H L QK + A +K TL+Y PQLL + YL
Sbjct: 32 HSLTPEQKRLIRAVVK-------TLIYNPQLLDESSYL 62
>gnl|CDD|222112 pfam13414, TPR_11, TPR repeat.
Length = 69
Score = 27.3 bits (61), Expect = 9.7
Identities = 16/64 (25%), Positives = 27/64 (42%), Gaps = 9/64 (14%)
Query: 841 YESAGNYEKAATCYIQLKNWTKIGQLLPHIKSATTFIQYAKAKEAMGS-YRESVGAYERA 899
G+Y++A Y K +L P +A + A A +G Y E++ E+A
Sbjct: 13 LFKLGDYDEAIEAY------EKALELDP--DNAEAYYNLALAYLKLGKDYEEALEDLEKA 64
Query: 900 EDYD 903
+ D
Sbjct: 65 LELD 68
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.135 0.405
Gapped
Lambda K H
0.267 0.0728 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 66,463,601
Number of extensions: 6555846
Number of successful extensions: 5515
Number of sequences better than 10.0: 1
Number of HSP's gapped: 5489
Number of HSP's successfully gapped: 51
Length of query: 1324
Length of database: 10,937,602
Length adjustment: 108
Effective length of query: 1216
Effective length of database: 6,147,370
Effective search space: 7475201920
Effective search space used: 7475201920
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 65 (28.8 bits)