RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= 015899
(398 letters)
>gnl|CDD|215254 PLN02460, PLN02460, indole-3-glycerol-phosphate synthase.
Length = 338
Score = 652 bits (1685), Expect = 0.0
Identities = 281/342 (82%), Positives = 314/342 (91%), Gaps = 4/342 (1%)
Query: 55 ETKDGSATISSVMEDAETALKAKEWEVGMLINEVAASQGIKIRRRPPTGPPLHYVGPFQF 114
+K GSA + +D AL+ KEWEVGM NE+AASQGI+IRRRPPTGPPLHYVGPFQF
Sbjct: 1 ASKSGSA----LYDDMFNALEVKEWEVGMSQNEIAASQGIRIRRRPPTGPPLHYVGPFQF 56
Query: 115 RIQNEGNTPRNILEEIVWHKDVEVTQLKQRRPLSMLKNALDNAPPARDFIGALMAANQRT 174
R+QNEGNTPRNILEEIVW+KDVEV Q+K+R+PL +LK AL NAPPARDF+GAL AA++RT
Sbjct: 57 RLQNEGNTPRNILEEIVWYKDVEVAQMKERKPLYLLKKALQNAPPARDFVGALRAAHKRT 116
Query: 175 GLPALIAEVKKASPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVR 234
G P LIAEVKKASPSRG+LRE+FDPVEIA++YEKGGAACLS+LTDEKYF+GSFENLEA+R
Sbjct: 117 GQPGLIAEVKKASPSRGVLRENFDPVEIAQAYEKGGAACLSVLTDEKYFQGSFENLEAIR 176
Query: 235 SAGVKCPLLCKEFIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVE 294
+AGVKCPLLCKEFIVDAWQIYYAR+KGADA+LLIAAVLPDLDI+YM KICK LG+ AL+E
Sbjct: 177 NAGVKCPLLCKEFIVDAWQIYYARSKGADAILLIAAVLPDLDIKYMLKICKSLGMAALIE 236
Query: 295 VHDEREMDRVLGIEGIELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESG 354
VHDEREMDRVLGIEG+ELIGINNR+LETFEVD SNTKKLLEGERGE IR+K IIVVGESG
Sbjct: 237 VHDEREMDRVLGIEGVELIGINNRSLETFEVDISNTKKLLEGERGEQIREKGIIVVGESG 296
Query: 355 LFTPDDIAYVQEAGVKAVLVGESIVKQDDPGKGITGLFGKDI 396
LFTPDD+AYVQ AGVKAVLVGES+VKQDDPGKGI GLFGKDI
Sbjct: 297 LFTPDDVAYVQNAGVKAVLVGESLVKQDDPGKGIAGLFGKDI 338
>gnl|CDD|234710 PRK00278, trpC, indole-3-glycerol-phosphate synthase; Reviewed.
Length = 260
Score = 372 bits (957), Expect = e-129
Identities = 142/272 (52%), Positives = 183/272 (67%), Gaps = 13/272 (4%)
Query: 123 PRNILEEIVWHKDVEVTQLKQRRPLSMLKNALDNAPPARDFIGALMAANQRTGLPALIAE 182
+IL++IV +K EV K + PL+ LK APP RDF AL R G PA+IAE
Sbjct: 1 MMDILDKIVAYKREEVAARKAQVPLAELKARAAAAPPPRDFAAAL-----RAGKPAVIAE 55
Query: 183 VKKASPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVRSAGVKCPL 242
VKKASPS+G++REDFDPVEIA++YE GGAACLS+LTDE++F+GS E L A R+A V P+
Sbjct: 56 VKKASPSKGVIREDFDPVEIAKAYEAGGAACLSVLTDERFFQGSLEYLRAARAA-VSLPV 114
Query: 243 LCKEFIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMD 302
L K+FI+D +QIY AR GADA+LLI A L D ++ + LGL LVEVHDE E++
Sbjct: 115 LRKDFIIDPYQIYEARAAGADAILLIVAALDDEQLKELLDYAHSLGLDVLVEVHDEEELE 174
Query: 303 RVLGIEGIELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIA 362
R L G LIGINNRNL+TFEVD T++L + +VV ESG+FTP+D+
Sbjct: 175 RAL-KLGAPLIGINNRNLKTFEVDLETTERLAPL------IPSDRLVVSESGIFTPEDLK 227
Query: 363 YVQEAGVKAVLVGESIVKQDDPGKGITGLFGK 394
+ +AG AVLVGES+++ DDPG + L G
Sbjct: 228 RLAKAGADAVLVGESLMRADDPGAALRELLGA 259
>gnl|CDD|223212 COG0134, TrpC, Indole-3-glycerol phosphate synthase [Amino acid
transport and metabolism].
Length = 254
Score = 308 bits (791), Expect = e-104
Identities = 129/262 (49%), Positives = 180/262 (68%), Gaps = 13/262 (4%)
Query: 127 LEEIVWHKDVEVTQLKQRRPLSMLKNALDNAPPARDFIGALMAANQRTGLPALIAEVKKA 186
LE+I+ K EV K + PL+ L+ + +A RDF AL A+ + PA+IAEVKKA
Sbjct: 1 LEKILADKKEEVAARKAKLPLAELRAKIRSAD--RDFYAALKEASGK---PAVIAEVKKA 55
Query: 187 SPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVRSAGVKCPLLCKE 246
SPS+G++REDFDPVEIA++YE+GGAA +S+LTD KYF+GSFE+L AVR+A V P+L K+
Sbjct: 56 SPSKGLIREDFDPVEIAKAYEEGGAAAISVLTDPKYFQGSFEDLRAVRAA-VDLPVLRKD 114
Query: 247 FIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRVLG 306
FI+D +QIY AR GADAVLLI A L D + + LG+ LVEVH+E E++R L
Sbjct: 115 FIIDPYQIYEARAAGADAVLLIVAALDDEQLEELVDRAHELGMEVLVEVHNEEELERALK 174
Query: 307 IEGIELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIAYVQE 366
+ G ++IGINNR+L T EVD T+KL K++I++ ESG+ TP+D+ + +
Sbjct: 175 L-GAKIIGINNRDLTTLEVDLETTEKLAPL------IPKDVILISESGISTPEDVRRLAK 227
Query: 367 AGVKAVLVGESIVKQDDPGKGI 388
AG A LVGE++++ DDP + +
Sbjct: 228 AGADAFLVGEALMRADDPEEAL 249
>gnl|CDD|238203 cd00331, IGPS, Indole-3-glycerol phosphate synthase (IGPS); an
enzyme in the tryptophan biosynthetic pathway,
catalyzing the ring closure reaction of
1-(o-carboxyphenylamino)-1-deoxyribulose-5-phosphate
(CdRP) to indole-3-glycerol phosphate (IGP), accompanied
by the release of carbon dioxide and water. IGPS is
active as a separate monomer in most organisms, but is
also found fused to other enzymes as part of a
bifunctional or multifunctional enzyme involved in
tryptophan biosynthesis.
Length = 217
Score = 302 bits (777), Expect = e-102
Identities = 117/229 (51%), Positives = 164/229 (71%), Gaps = 12/229 (5%)
Query: 163 FIGALMAANQRTGLPALIAEVKKASPSRGILREDFDPVEIARSYEKGGAACLSILTDEKY 222
F AL +R G +IAEVK+ASPS+G++REDFDPVEIA++YEK GAA +S+LT+ KY
Sbjct: 1 FKAAL----KRPGGLGVIAEVKRASPSKGLIREDFDPVEIAKAYEKAGAAAISVLTEPKY 56
Query: 223 FKGSFENLEAVRSAGVKCPLLCKEFIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTK 282
F+GS E+L AVR A V P+L K+FI+D +QIY AR GADAVLLI A L D ++ + +
Sbjct: 57 FQGSLEDLRAVREA-VSLPVLRKDFIIDPYQIYEARAAGADAVLLIVAALDDEQLKELYE 115
Query: 283 ICKLLGLTALVEVHDEREMDRVLGIEGIELIGINNRNLETFEVDNSNTKKLLEGERGEII 342
+ + LG+ LVEVHDE E++R L + G ++IGINNR+L+TFEVD + T++L +
Sbjct: 116 LARELGMEVLVEVHDEEELERALAL-GAKIIGINNRDLKTFEVDLNTTERLAP------L 168
Query: 343 RQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDPGKGITGL 391
K++I+V ESG+ TP+D+ + EAG AVL+GES+++ DPG + L
Sbjct: 169 IPKDVILVSESGISTPEDVKRLAEAGADAVLIGESLMRAPDPGAALREL 217
>gnl|CDD|215798 pfam00218, IGPS, Indole-3-glycerol phosphate synthase.
Length = 254
Score = 283 bits (726), Expect = 1e-94
Identities = 128/265 (48%), Positives = 175/265 (66%), Gaps = 11/265 (4%)
Query: 127 LEEIVWHKDVEVTQLKQRRPLSMLKNALDNAPPARDFIGALMAANQRTGLPALIAEVKKA 186
LE+IV K EV K R PL+ L+ APP R F AL + R PALIAEVKKA
Sbjct: 1 LEKIVADKREEVAAAKARPPLADLQADARLAPPTRSFYDALRESRGR---PALIAEVKKA 57
Query: 187 SPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVRSAGVKCPLLCKE 246
SPS+G++REDFDP EIAR YE GA+ +S+LT+ KYF+GS E L VR A V P+L K+
Sbjct: 58 SPSKGLIREDFDPAEIARVYEAAGASAISVLTEPKYFQGSLEYLREVREA-VSLPVLRKD 116
Query: 247 FIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRVLG 306
FI+D +QIY AR GAD VLLI AVL D + + + + LG+ LVEVH+E E++R L
Sbjct: 117 FIIDEYQIYEARAYGADTVLLIVAVLSDELLEELYEYARSLGMEPLVEVHNEEELERALA 176
Query: 307 IEGIELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIAYVQE 366
+ G +LIG+NNRNL+TFEVD + T++L ++ +++++V ESG+ TP+D+ + +
Sbjct: 177 L-GAKLIGVNNRNLKTFEVDLNTTRRLA-----PMVP-EDVLLVAESGISTPEDVEKLAK 229
Query: 367 AGVKAVLVGESIVKQDDPGKGITGL 391
G A LVGES+++ D + I L
Sbjct: 230 HGANAFLVGESLMRAPDVRQAIREL 254
>gnl|CDD|236509 PRK09427, PRK09427, bifunctional indole-3-glycerol phosphate
synthase/phosphoribosylanthranilate isomerase;
Provisional.
Length = 454
Score = 237 bits (606), Expect = 5e-74
Identities = 105/259 (40%), Positives = 156/259 (60%), Gaps = 17/259 (6%)
Query: 125 NILEEIVWHKDVEVTQLKQRRPLSMLKNALDNAPPARDFIGALMAANQRTGLPALIAEVK 184
+L +IV K + V KQ++PL+ +N + P R F AL A I E K
Sbjct: 5 TVLAKIVADKAIWVAARKQQQPLASFQNEI--QPSDRSFYDALKGPK-----TAFILECK 57
Query: 185 KASPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVRSAGVKCPLLC 244
KASPS+G++R+DFDP EIAR Y K A+ +S+LTDEKYF+GSF+ L VR+ V P+LC
Sbjct: 58 KASPSKGLIRDDFDPAEIARVY-KHYASAISVLTDEKYFQGSFDFLPIVRAI-VTQPILC 115
Query: 245 KEFIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRV 304
K+FI+D +QIY AR GADA+LL+ +VL D R + + L + L EV +E E++R
Sbjct: 116 KDFIIDPYQIYLARYYGADAILLMLSVLDDEQYRQLAAVAHSLNMGVLTEVSNEEELERA 175
Query: 305 LGIEGIELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIAYV 364
+ + G ++IGINNRNL +D + T++L +I ++IV+ ESG++T + +
Sbjct: 176 IAL-GAKVIGINNRNLRDLSIDLNRTREL-----APLIP-ADVIVISESGIYTHAQVREL 228
Query: 365 QEAGVKAVLVGESIVKQDD 383
L+G S++ +DD
Sbjct: 229 SP-FANGFLIGSSLMAEDD 246
>gnl|CDD|140013 PRK13957, PRK13957, indole-3-glycerol-phosphate synthase;
Provisional.
Length = 247
Score = 153 bits (389), Expect = 2e-44
Identities = 100/270 (37%), Positives = 150/270 (55%), Gaps = 23/270 (8%)
Query: 123 PRNILEEIVWHKDVEVTQLKQRRPLSMLKNALDNAPPARDFIGALMAANQRTGLPALIAE 182
+L EI+ K E+ ++ + PL D P RD + ++ ++IAE
Sbjct: 1 MHRVLREIIETKQNEIEKISRWDPLP------DRGLPLRD--------SLKSRSFSIIAE 46
Query: 183 VKKASPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVRSAGVKCPL 242
K+ SPS G LR D+ PV+IA++YE GA+ +S+LTD+ YF GS E+L++V S+ +K P+
Sbjct: 47 CKRKSPSAGELRADYHPVQIAKTYETLGASAISVLTDQSYFGGSLEDLKSV-SSELKIPV 105
Query: 243 LCKEFIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMD 302
L K+FI+D QI AR GA A+LLI +L I+ K LG+ LVEVH E E
Sbjct: 106 LRKDFILDEIQIREARAFGASAILLIVRILTPSQIKSFLKHASSLGMDVLVEVHTEDEAK 165
Query: 303 RVLGIEGIELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIA 362
L G E+IGIN R+L+TF++ + L+E + NI+ VGESG+ + D+
Sbjct: 166 LALDC-GAEIIGINTRDLDTFQIH----QNLVEEVAAFLPP--NIVKVGESGIESRSDLD 218
Query: 363 YVQEAGVKAVLVGESIVKQDDPGKGITGLF 392
++ V A L+G +++ D K LF
Sbjct: 219 KFRKL-VDAALIGTYFMEKKDIRKAWLSLF 247
>gnl|CDD|184335 PRK13802, PRK13802, bifunctional indole-3-glycerol phosphate
synthase/tryptophan synthase subunit beta; Provisional.
Length = 695
Score = 159 bits (403), Expect = 4e-43
Identities = 96/246 (39%), Positives = 142/246 (57%), Gaps = 12/246 (4%)
Query: 146 PLSMLKNALDNAPPARDFIGALMAANQRTGLPALIAEVKKASPSRGILREDFDPVEIARS 205
L +K A AP D L A+ G+P +IAE+K+ASPS+G L + DP +AR
Sbjct: 23 SLEEVKKAAAAAPAPIDATRWLKRAD---GIP-VIAEIKRASPSKGHLSDIPDPAALARE 78
Query: 206 YEKGGAACLSILTDEKYFKGSFENLEAVRSAGVKCPLLCKEFIVDAWQIYYARTKGADAV 265
YE+GGA+ +S+LT+ + F GS ++ + VR+A V P+L K+FIV +QI+ AR GAD V
Sbjct: 79 YEQGGASAISVLTEGRRFLGSLDDFDKVRAA-VHIPVLRKDFIVTDYQIWEARAHGADLV 137
Query: 266 LLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRVLGIEGIELIGINNRNLETFEV 325
LLI A L D ++++ + LG+T LVE H E++R + G ++IGIN RNL+ +V
Sbjct: 138 LLIVAALDDAQLKHLLDLAHELGMTVLVETHTREEIERAIA-AGAKVIGINARNLKDLKV 196
Query: 326 DNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDPG 385
D + +L ++I+ V ESG+F ++ AG AVLVGE + DD
Sbjct: 197 DVNKYNELAADLPDDVIK------VAESGVFGAVEVEDYARAGADAVLVGEGVATADDHE 250
Query: 386 KGITGL 391
+ L
Sbjct: 251 LAVERL 256
>gnl|CDD|240073 cd04722, TIM_phosphate_binding, TIM barrel proteins share a
structurally conserved phosphate binding motif and in
general share an eight beta/alpha closed barrel
structure. Specific for this family is the conserved
phosphate binding site at the edges of strands 7 and 8.
The phosphate comes either from the substrate, as in the
case of inosine monophosphate dehydrogenase (IMPDH), or
from ribulose-5-phosphate 3-epimerase (RPE) or from
cofactors, like FMN.
Length = 200
Score = 38.3 bits (89), Expect = 0.003
Identities = 41/193 (21%), Positives = 74/193 (38%), Gaps = 17/193 (8%)
Query: 196 DFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVR---SAGVKCPLLCKEFIVDAW 252
DPVE+A++ + GA + + T + + + + V +A PL + I DA
Sbjct: 11 SGDPVELAKAAAEAGADAIIVGTRSSDPEEAETDDKEVLKEVAAETDLPLGVQLAINDAA 70
Query: 253 QIYYARTK-----GADAVLLIAAV--LPDLDIRYMTKICK-LLGLTALVEVHDEREMDRV 304
GAD V + AV L D+ + ++ + + + +V++ E+
Sbjct: 71 AAVDIAAAAARAAGADGVEIHGAVGYLAREDLELIRELREAVPDVKVVVKLSPTGELAAA 130
Query: 305 LGIE-GIELIGINNRNLET-FEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIA 362
E G++ +G+ N LL R + V+ G+ P+D A
Sbjct: 131 AAEEAGVDEVGLGNGGGGGGGRDAVPIADLLLI----LAKRGSKVPVIAGGGINDPEDAA 186
Query: 363 YVQEAGVKAVLVG 375
G V+VG
Sbjct: 187 EALALGADGVIVG 199
>gnl|CDD|240079 cd04728, ThiG, Thiazole synthase (ThiG) is the tetrameric enzyme
that is involved in the formation of the thiazole moiety
of thiamin pyrophosphate, an essential ubiquitous
cofactor that plays an important role in carbohydrate
and amino acid metabolism. ThiG catalyzes the formation
of thiazole from 1-deoxy-D-xylulose 5-phosphate (DXP)
and dehydroglycine, with the help of the sulfur carrier
protein ThiS that carries the sulfur needed for thiazole
assembly on its carboxy terminus (ThiS-COSH).
Length = 248
Score = 34.8 bits (81), Expect = 0.056
Identities = 17/44 (38%), Positives = 26/44 (59%)
Query: 341 IIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDP 384
II + ++ V+ ++G+ TP D A E G AVL+ +I K DP
Sbjct: 170 IIERADVPVIVDAGIGTPSDAAQAMELGADAVLLNTAIAKAKDP 213
>gnl|CDD|172030 PRK13397, PRK13397, 3-deoxy-7-phosphoheptulonate synthase;
Provisional.
Length = 250
Score = 34.6 bits (79), Expect = 0.063
Identities = 25/107 (23%), Positives = 55/107 (51%), Gaps = 10/107 (9%)
Query: 277 IRYMTKICKLLGLTALVEVHDEREMDRVLGIEGIELIGINNRNLETFEVDNSNTKKLLEG 336
IRY+ ++C+ GL ++ E+ ER+++ + +++I + RN++ FE K L
Sbjct: 68 IRYLHEVCQEFGLLSVSEIMSERQLEE--AYDYLDVIQVGARNMQNFEF-----LKTLSH 120
Query: 337 ERGEIIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDD 383
I+ ++ ++ E L ++Y+Q+ G +++ E V+ D
Sbjct: 121 IDKPILFKRGLMATIEEYL---GALSYLQDTGKSNIILCERGVRGYD 164
>gnl|CDD|224933 COG2022, ThiG, Uncharacterized enzyme of thiazole biosynthesis
[Nucleotide transport and metabolism].
Length = 262
Score = 33.8 bits (78), Expect = 0.12
Identities = 16/44 (36%), Positives = 26/44 (59%)
Query: 341 IIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDP 384
II + ++ V+ ++G+ TP D A E G AVL+ +I + DP
Sbjct: 177 IIEEADVPVIVDAGIGTPSDAAQAMELGADAVLLNTAIARAKDP 220
>gnl|CDD|215192 PLN02334, PLN02334, ribulose-phosphate 3-epimerase.
Length = 229
Score = 33.1 bits (76), Expect = 0.17
Identities = 14/48 (29%), Positives = 23/48 (47%), Gaps = 3/48 (6%)
Query: 344 QKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDPGKGITGL 391
+ +I V G G+ P I EAG ++ G ++ D + I+GL
Sbjct: 174 ELDIEVDG--GV-GPSTIDKAAEAGANVIVAGSAVFGAPDYAEVISGL 218
>gnl|CDD|147701 pfam05690, ThiG, Thiazole biosynthesis protein ThiG. This family
consists of several bacterial thiazole biosynthesis
protein G sequences. ThiG, together with ThiF and ThiH,
is proposed to be involved in the synthesis of
4-methyl-5-(b-hydroxyethyl)thiazole (THZ) which is an
intermediate in the thiazole production pathway.
Length = 246
Score = 32.9 bits (76), Expect = 0.21
Identities = 16/44 (36%), Positives = 26/44 (59%)
Query: 341 IIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDP 384
II + ++ V+ ++G+ TP D A E G AVL+ +I + DP
Sbjct: 169 IIEEADVPVIVDAGIGTPSDAAQAMELGADAVLLNTAIARAKDP 212
>gnl|CDD|234003 TIGR02768, TraA_Ti, Ti-type conjugative transfer relaxase TraA.
This protein contains domains distinctive of a single
strand exonuclease (N-terminus, MobA/MobL, pfam03389) as
well as a helicase domain (central region, homologous to
the corresponding region of the F-type relaxase TraI,
TIGR02760). This protein likely fills the same role as
TraI(F), nicking (at the oriT site) and unwinding the
coiled plasmid prior to conjugative transfer.
Length = 744
Score = 30.9 bits (70), Expect = 1.3
Identities = 22/67 (32%), Positives = 35/67 (52%), Gaps = 12/67 (17%)
Query: 314 GINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIAYV----QEAGV 369
GI +R L + E +N + LL K+++V+ E+G+ +A V +EAG
Sbjct: 417 GIESRTLASLEYAWANGRDLLS--------DKDVLVIDEAGMVGSRQMARVLKEAEEAGA 468
Query: 370 KAVLVGE 376
K VLVG+
Sbjct: 469 KVVLVGD 475
>gnl|CDD|184165 PRK13585, PRK13585,
1-(5-phosphoribosyl)-5-[(5-
phosphoribosylamino)methylideneamino]
imidazole-4-carboxamide isomerase; Provisional.
Length = 241
Score = 30.3 bits (69), Expect = 1.4
Identities = 19/58 (32%), Positives = 33/58 (56%), Gaps = 4/58 (6%)
Query: 327 NSNTKKLLEGERGEIIRQ----KNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVK 380
N + + LLEG E +++ +I V+ G+ T DD+ ++EAG V+VG ++ K
Sbjct: 170 NVDVEGLLEGVNTEPVKELVDSVDIPVIASGGVTTLDDLRALKEAGAAGVVVGSALYK 227
>gnl|CDD|238317 cd00564, TMP_TenI, Thiamine monophosphate synthase (TMP
synthase)/TenI. TMP synthase catalyzes an important step
in the thiamine biosynthesis pathway, the substitution
of the pyrophosphate of 2-methyl-4-amino-5-
hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-
(beta-hydroxyethyl) thiazole phosphate to yield thiamine
phosphate. TenI is an enzymatically inactive regulatory
protein involved in the regulation of several
extracellular enzymes. This superfamily also contains
other enzymatically inactive proteins with unknown
functions.
Length = 196
Score = 30.2 bits (69), Expect = 1.5
Identities = 49/197 (24%), Positives = 73/197 (37%), Gaps = 32/197 (16%)
Query: 198 DPVEIARSYEKGGAACLSI----LTDEKYFKGSFENLEAVRSAGVKCPLLCKEFIV-DAW 252
D +E+ + KGG + + L+ + + + E R GV I+ D
Sbjct: 13 DLLEVVEAALKGGVTLVQLREKDLSARELLELARALRELCRKYGVP-------LIINDRV 65
Query: 253 QIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRVLGIEGIEL 312
+ A GAD V L LP + R + ++G++ H E R G +
Sbjct: 66 DL--ALAVGADGVHLGQDDLPVAEARALLGPDLIIGVS----THSLEEALR-AEELGADY 118
Query: 313 IGINNRNLETFEVDNSNTKKLLEGERG-----EIIRQKNIIVVGESGLFTPDDIAYVQEA 367
+G V + TK G EI I VV G+ TP++ A V A
Sbjct: 119 VGFG-------PVFPTPTKPGAGPPLGLELLREIAELVEIPVVAIGGI-TPENAAEVLAA 170
Query: 368 GVKAVLVGESIVKQDDP 384
G V V +I DDP
Sbjct: 171 GADGVAVISAITGADDP 187
>gnl|CDD|214380 CHL00162, thiG, thiamin biosynthesis protein G; Validated.
Length = 267
Score = 30.4 bits (69), Expect = 1.5
Identities = 12/44 (27%), Positives = 23/44 (52%)
Query: 341 IIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDP 384
II I V+ ++G+ TP + + E G VL+ ++ + +P
Sbjct: 184 IIENAKIPVIIDAGIGTPSEASQAMELGASGVLLNTAVAQAKNP 227
>gnl|CDD|234687 PRK00208, thiG, thiazole synthase; Reviewed.
Length = 250
Score = 30.0 bits (69), Expect = 2.1
Identities = 17/44 (38%), Positives = 25/44 (56%)
Query: 341 IIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDP 384
II Q ++ V+ ++G+ TP D A E G AVL+ +I DP
Sbjct: 170 IIEQADVPVIVDAGIGTPSDAAQAMELGADAVLLNTAIAVAGDP 213
>gnl|CDD|222258 pfam13604, AAA_30, AAA domain. This family of domains contain a
P-loop motif that is characteristic of the AAA
superfamily. Many of the proteins in this family are
conjugative transfer proteins. There is a Walker A and
Walker B.
Length = 195
Score = 29.5 bits (67), Expect = 2.2
Identities = 18/69 (26%), Positives = 31/69 (44%), Gaps = 9/69 (13%)
Query: 311 ELIGINNRNLETFEVDNSNTKKLLEGERGEIIRQKNIIVVGESGLFTPDDIA----YVQE 366
E +GI R L + + + G ++ ++VV E+G+ +A ++
Sbjct: 64 EELGIEARTLASLLHRWDKGE-----DPGRVLDAGTLLVVDEAGMVGTRQMARLLRLAEK 118
Query: 367 AGVKAVLVG 375
AG K VLVG
Sbjct: 119 AGAKVVLVG 127
>gnl|CDD|223347 COG0269, SgbH, 3-hexulose-6-phosphate synthase and related proteins
[Carbohydrate transport and metabolism].
Length = 217
Score = 29.5 bits (67), Expect = 2.4
Identities = 29/128 (22%), Positives = 54/128 (42%), Gaps = 6/128 (4%)
Query: 261 GADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRVLGIEGIELIGINNRNL 320
GAD V ++ A D I+ K+ K G +++ + ++ + ++ +G++ L
Sbjct: 80 GADWVTVLGAA-DDATIKKAIKVAKEYGKEVQIDLIGVWDPEQRA--KWLKELGVDQVIL 136
Query: 321 ETFEVDNSNTKKLLEGERGEIIRQ--KNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESI 378
D K + E I++ V +G TP+DI + G V+VG +I
Sbjct: 137 HR-GRDAQAAGKSWGEDDLEKIKKLSDLGAKVAVAGGITPEDIPLFKGIGADIVIVGRAI 195
Query: 379 VKQDDPGK 386
DP +
Sbjct: 196 TGAKDPAE 203
>gnl|CDD|218092 pfam04452, Methyltrans_RNA, RNA methyltransferase. RNA
methyltransferases modify nucleotides during ribosomal
RNA maturation in a site-specific manner. The
Escherichia coli member is specific for U1498
methylation.
Length = 224
Score = 29.1 bits (66), Expect = 3.3
Identities = 12/33 (36%), Positives = 21/33 (63%), Gaps = 2/33 (6%)
Query: 347 IIVVG-ESGLFTPDDIAYVQEAGVKAVLVGESI 378
++++G E G F+P +I ++EAG V +G I
Sbjct: 177 LLIIGPEGG-FSPKEIELLKEAGFTPVSLGPRI 208
>gnl|CDD|238796 cd01555, UdpNAET, UDP-N-acetylglucosamine enolpyruvyl transferase
catalyzes enolpyruvyl transfer as part of the first step
in the biosynthesis of peptidoglycan, a component of the
bacterial cell wall. The reaction is phosphoenolpyruvate
+ UDP-N-acetyl-D-glucosamine = phosphate +
UDP-N-acetyl-3-(1-carboxyvinyl)-D-glucosamine. This
enzyme is of interest as a potential target for
anti-bacterial agents. The only other known enolpyruvyl
transferase is the related
5-enolpyruvylshikimate-3-phosphate synthase.
Length = 400
Score = 29.4 bits (67), Expect = 3.4
Identities = 23/90 (25%), Positives = 36/90 (40%), Gaps = 28/90 (31%)
Query: 258 RTKGA-DAVL-LIAAVL-----------PDL-DIRYMTKICKLLGLTALVEVHDEREMDR 303
R GA +A L ++AA L PDL D+ M ++ + LG A VE E +
Sbjct: 6 RISGAKNAALPILAAALLTDEPVTLRNVPDLLDVETMIELLRSLG--AKVEFEGENTLV- 62
Query: 304 VLGIEGIELIGINNRNLETFEVDNSNTKKL 333
I+ N+ + E +K+
Sbjct: 63 -----------IDASNINSTEAPYELVRKM 81
>gnl|CDD|181392 PRK08332, PRK08332, ribonucleotide-diphosphate reductase subunit
alpha; Validated.
Length = 1740
Score = 29.7 bits (66), Expect = 3.8
Identities = 18/66 (27%), Positives = 32/66 (48%), Gaps = 7/66 (10%)
Query: 326 DNSNTKKLLEGERGEIIRQKNIIVVGESGLFTP-------DDIAYVQEAGVKAVLVGESI 378
D N + +L+ +G IR N VVG++ + TP + + +E G K + E I
Sbjct: 893 DVINRRNVLKEAKGGPIRATNPCVVGDTRILTPEGYIKAEELFSLAKERGKKEAVAVEGI 952
Query: 379 VKQDDP 384
++ +P
Sbjct: 953 AEEGEP 958
>gnl|CDD|129443 TIGR00343, TIGR00343, pyridoxal 5'-phosphate synthase, synthase
subunit Pdx1. This protein had been believed to be a
singlet oxygen resistance protein. Subsequent work
showed that it is a protein of pyridoxine (vitamin B6)
biosynthesis, and that pyridoxine quenches the highly
toxic singlet form of oxygen produced by light in the
presence of certain chemicals [Biosynthesis of
cofactors, prosthetic groups, and carriers, Pyridoxine].
Length = 287
Score = 29.0 bits (65), Expect = 3.8
Identities = 13/33 (39%), Positives = 17/33 (51%)
Query: 354 GLFTPDDIAYVQEAGVKAVLVGESIVKQDDPGK 386
G+ TP D A + + G V VG I K +P K
Sbjct: 207 GVATPADAALMMQLGADGVFVGSGIFKSSNPEK 239
>gnl|CDD|238189 cd00309, chaperonin_type_I_II, chaperonin families, type I and type
II. Chaperonins are involved in productive folding of
proteins. They share a common general morphology, a
double toroid of 2 stacked rings, each composed of 7-9
subunits. There are 2 main chaperonin groups. The
symmetry of type I is seven-fold and they are found in
eubacteria (GroEL) and in organelles of eubacterial
descent (hsp60 and RBP). The symmetry of type II is
eight- or nine-fold and they are found in archaea
(thermosome), thermophilic bacteria (TF55) and in the
eukaryotic cytosol (CTT). Their common function is to
sequester nonnative proteins inside their central cavity
and promote folding by using energy derived from ATP
hydrolysis.
Length = 464
Score = 29.3 bits (67), Expect = 4.1
Identities = 28/120 (23%), Positives = 43/120 (35%), Gaps = 22/120 (18%)
Query: 178 ALIAEVKKASPSRGILREDFDPVEIARSYEKGGAACLSILTDEKYFKGSFENLEAVR--- 234
L+ E +K L P EI R YEK L IL E E+ E +
Sbjct: 91 ELLKEAEKL------LAAGIHPTEIIRGYEKAVEKALEIL-KEIAVPIDVEDREELLKVA 143
Query: 235 --SAGVKCPLLCKEFIVDAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTAL 292
S K +F+ + + DAVL + D+D+ + ++ K G +
Sbjct: 144 TTSLNSKLVSGGDDFLGE--LV-------VDAVLKVGKENGDVDLG-VIRVEKKKGGSLE 193
>gnl|CDD|240077 cd04726, KGPDC_HPS, 3-Keto-L-gulonate 6-phosphate decarboxylase
(KGPDC) and D-arabino-3-hexulose-6-phosphate synthase
(HPS). KGPDC catalyzes the formation of L-xylulose
5-phosphate and carbon dioxide from 3-keto-L-gulonate
6-phosphate as part of the anaerobic pathway for
L-ascorbate utilization in some eubacteria. HPS
catalyzes the formation of
D-arabino-3-hexulose-6-phosphate from D-ribulose
5-phosphate and formaldehyde in microorganisms that can
use formaldehyde as a carbon source. Both catalyze
reactions that involve the Mg2+-assisted formation and
stabilization of 1,2-enediolate reaction intermediates.
Length = 202
Score = 28.7 bits (65), Expect = 4.2
Identities = 11/28 (39%), Positives = 16/28 (57%)
Query: 357 TPDDIAYVQEAGVKAVLVGESIVKQDDP 384
TPD + ++AG V+VG +I DP
Sbjct: 168 TPDTLPEFKKAGADIVIVGRAITGAADP 195
>gnl|CDD|181086 PRK07695, PRK07695, transcriptional regulator TenI; Provisional.
Length = 201
Score = 28.8 bits (65), Expect = 4.2
Identities = 18/62 (29%), Positives = 29/62 (46%), Gaps = 7/62 (11%)
Query: 328 SNTKKLLEGERG-----EIIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQD 382
++ KK + RG +I R +I V+ G+ TP++ V AGV + V I
Sbjct: 127 TDCKKGVPA-RGLEELSDIARALSIPVIAIGGI-TPENTRDVLAAGVSGIAVMSGIFSSA 184
Query: 383 DP 384
+P
Sbjct: 185 NP 186
>gnl|CDD|240078 cd04727, pdxS, PdxS is a subunit of the pyridoxal 5'-phosphate
(PLP) synthase, an important enzyme in deoxyxylulose
5-phosphate (DXP)-independent pathway for de novo
biosynthesis of PLP, present in some eubacteria, in
archaea, fungi, plants, plasmodia, and some metazoa.
Together with PdxT, PdxS forms the PLP synthase, a
heteromeric glutamine amidotransferase (GATase), whereby
PdxT produces ammonia from glutamine and PdxS combines
ammonia with five- and three-carbon phosphosugars to
form PLP. PLP is the biologically active form of vitamin
B6, an essential cofactor in many biochemical processes.
PdxS subunits form two hexameric rings.
Length = 283
Score = 28.8 bits (65), Expect = 4.4
Identities = 13/33 (39%), Positives = 18/33 (54%)
Query: 354 GLFTPDDIAYVQEAGVKAVLVGESIVKQDDPGK 386
G+ TP D A + + G V VG I K ++P K
Sbjct: 204 GVATPADAALMMQLGADGVFVGSGIFKSENPEK 236
>gnl|CDD|234907 PRK01130, PRK01130, N-acetylmannosamine-6-phosphate 2-epimerase;
Provisional.
Length = 221
Score = 28.6 bits (65), Expect = 4.8
Identities = 12/39 (30%), Positives = 20/39 (51%)
Query: 340 EIIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESI 378
E+++ V+ E + TP+ E G AV+VG +I
Sbjct: 167 ELLKAVGCPVIAEGRINTPEQAKKALELGAHAVVVGGAI 205
>gnl|CDD|235515 PRK05581, PRK05581, ribulose-phosphate 3-epimerase; Validated.
Length = 220
Score = 28.2 bits (64), Expect = 6.0
Identities = 9/37 (24%), Positives = 15/37 (40%)
Query: 358 PDDIAYVQEAGVKAVLVGESIVKQDDPGKGITGLFGK 394
D+I EAG + G ++ D + I L +
Sbjct: 181 ADNIKECAEAGADVFVAGSAVFGAPDYKEAIDSLRAE 217
>gnl|CDD|240303 PTZ00170, PTZ00170, D-ribulose-5-phosphate 3-epimerase;
Provisional.
Length = 228
Score = 28.0 bits (63), Expect = 6.8
Identities = 16/56 (28%), Positives = 25/56 (44%), Gaps = 7/56 (12%)
Query: 340 EIIRQK----NIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDDPGKGITGL 391
+R++ NI V G G+ + I +AG ++ G SI K D + I L
Sbjct: 165 RELRKRYPHLNIQVDG--GI-NLETIDIAADAGANVIVAGSSIFKAKDRKQAIELL 217
>gnl|CDD|224303 COG1385, COG1385, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 246
Score = 28.4 bits (64), Expect = 6.9
Identities = 12/42 (28%), Positives = 23/42 (54%), Gaps = 1/42 (2%)
Query: 338 RGEIIRQKNI-IVVGESGLFTPDDIAYVQEAGVKAVLVGESI 378
E + + + +++G G F+ D+I ++EAG V +G I
Sbjct: 185 LLEALPEGKVLLIIGPEGGFSEDEIELLREAGFTPVSLGPRI 226
>gnl|CDD|240080 cd04729, NanE, N-acetylmannosamine-6-phosphate epimerase (NanE)
converts N-acetylmannosamine-6-phosphate to
N-acetylglucosamine-6-phosphate. This reaction is part
of the pathway that allows the usage of sialic acid as a
carbohydrate source. Sialic acids are a family of
related sugars that are found as a component of
glycoproteins, gangliosides, and other
sialoglycoconjugates.
Length = 219
Score = 27.9 bits (63), Expect = 7.2
Identities = 13/40 (32%), Positives = 21/40 (52%)
Query: 339 GEIIRQKNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESI 378
E+ + I V+ E + +P+ A E G AV+VG +I
Sbjct: 170 KELRKALGIPVIAEGRINSPEQAAKALELGADAVVVGSAI 209
>gnl|CDD|237513 PRK13803, PRK13803, bifunctional phosphoribosylanthranilate
isomerase/tryptophan synthase subunit beta; Provisional.
Length = 610
Score = 28.6 bits (64), Expect = 7.8
Identities = 7/57 (12%), Positives = 16/57 (28%), Gaps = 2/57 (3%)
Query: 159 PARDFIGALMAANQRTGLPALIAEVKKASPSRGILREDFDPVEIARSYEKGGAACLS 215
A + A+ A + + + + D V++ + K A
Sbjct: 44 LAPNLEKAIRKA--GGRPVGVFVNESAKAMLKFSKKNGIDFVQLHGAESKAEPAYCQ 98
>gnl|CDD|225431 COG2876, AroA, 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP)
synthase [Amino acid transport and metabolism].
Length = 286
Score = 28.1 bits (63), Expect = 8.2
Identities = 10/49 (20%), Positives = 25/49 (51%), Gaps = 2/49 (4%)
Query: 277 IRYMTKICKLLGLTALVEVHDEREMDRVLGIEGIELIGINNRNLETFEV 325
++ + + GL + EV D R+++ E +++ + RN++ F +
Sbjct: 98 LKLLKRAADETGLPVVTEVMDVRDVEAAA--EYADILQVGARNMQNFAL 144
>gnl|CDD|215730 pfam00118, Cpn60_TCP1, TCP-1/cpn60 chaperonin family. This family
includes members from the HSP60 chaperone family and the
TCP-1 (T-complex protein) family.
Length = 481
Score = 28.3 bits (64), Expect = 8.3
Identities = 42/225 (18%), Positives = 72/225 (32%), Gaps = 63/225 (28%)
Query: 193 LREDFDPVEIARSYEKGGAACLSILTD-EKYFKGSFENLEAV--RSAGVKCPLLCKEFIV 249
+ P +I R YE L L + E+L V S K E +
Sbjct: 81 IEAGIHPTDIIRGYELALEIALKALEELSIPVSDDDEDLLNVARTSLNSKISSRESELL- 139
Query: 250 DAWQIYYARTKGADAVLLIAAVLPDLDIRYMTKICKLLGLTALVEVHDEREMDRVLGIEG 309
+ DAVLLI D+ + K+ G D L IEG
Sbjct: 140 -------GKLV-VDAVLLIIEKF-DVG---NIGVIKIEG---------GSLEDSEL-IEG 177
Query: 310 IEL-----------------IGINNRNLETFE------VDNSNTKKLLEGERGEIIRQ-- 344
I L I + + LE + ++LLE E +++
Sbjct: 178 IVLDKGYLSPDMPKRLENPKILLLDCPLEYEKTEKVIISTAEELERLLEAEEKQLLPLLE 237
Query: 345 ------KNIIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVKQDD 383
N++++ + + ++ + G+ A+ VK++D
Sbjct: 238 KIVDAGVNLVIIQKG--IDDLALHFLAKNGILALR----RVKKED 276
>gnl|CDD|223292 COG0214, SNZ1, Pyridoxine biosynthesis enzyme [Coenzyme
metabolism].
Length = 296
Score = 28.0 bits (63), Expect = 8.5
Identities = 13/33 (39%), Positives = 17/33 (51%)
Query: 354 GLFTPDDIAYVQEAGVKAVLVGESIVKQDDPGK 386
G+ TP D A + + G V VG I K +P K
Sbjct: 216 GVATPADAALMMQLGADGVFVGSGIFKSSNPEK 248
>gnl|CDD|146925 pfam04529, Herpes_U59, Herpesvirus U59 protein. The proteins in
this family have no known function. Cytomegalovirus UL88
is also a member of this family.
Length = 365
Score = 28.2 bits (63), Expect = 9.0
Identities = 12/56 (21%), Positives = 24/56 (42%), Gaps = 5/56 (8%)
Query: 98 RRPPTGPPLHYVGP--FQFRIQNEGNTPRNI---LEEIVWHKDVEVTQLKQRRPLS 148
+ G Y P + R+++ PR + ++W DV + +K+R P +
Sbjct: 95 QMSRDGRDELYEVPKVYLIRVRDGNGGPREVSWPKTSVLWAPDVGIKTVKRRSPAA 150
>gnl|CDD|233457 TIGR01539, portal_lambda, phage portal protein, lambda family.
This model represents one of several distantly related
families of phage portal protein. This protein forms a
hole, or portal, that enables DNA passage during
packaging and ejection. It also forms the junction
between the phage head (capsid) and the tail proteins.
It functions as a dodecamer of a single polypeptide of
average mol. wt. of 40-90 KDa [Mobile and
extrachromosomal element functions, Prophage
functions].
Length = 458
Score = 28.3 bits (63), Expect = 9.2
Identities = 16/67 (23%), Positives = 26/67 (38%), Gaps = 1/67 (1%)
Query: 16 PTLNRRPSKFSIVRSVPASAMDAHRRNGVGFTPIRAQKAETKDGSATISSVMEDAETALK 75
L R + A+A+D H+ N VG I + + + T++ D A
Sbjct: 27 RLLRARARDLVRNNDIVANAIDLHKDNIVGHMGIISYRPQWLGRRGTLAKSFVDKIEAAW 86
Query: 76 AKEWEVG 82
+ EW G
Sbjct: 87 S-EWAEG 92
>gnl|CDD|238805 cd01571, NAPRTase_B, Nicotinate phosphoribosyltransferase
(NAPRTase), subgroup B. Nicotinate
phosphoribosyltransferase catalyses the formation of
NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate
(PRPP) and nicotinic acid, this is the first, and also
rate limiting, reaction in the NAD salvage synthesis.
This salvage pathway serves to recycle NAD degradation
products.
Length = 302
Score = 28.0 bits (63), Expect = 9.8
Identities = 27/98 (27%), Positives = 42/98 (42%), Gaps = 12/98 (12%)
Query: 289 LTALVEVH-DEREMD-RVLGIEGIELIGINNRNLETFEVDNSNTKKLLEGERGEI-IRQK 345
AL++ DE+E + G +L G+ L+T + L+ R + IR
Sbjct: 187 RIALIDTFNDEKEEALKAAKALGDKLDGVR---LDTPSSRRGVFRYLIREVRWALDIRGY 243
Query: 346 N---IIVVGESGLFTPDDIAYVQEAGVKAVLVGESIVK 380
I V SG +DI +++ GV A VG +I K
Sbjct: 244 KHVKIFV---SGGLDEEDIKELEDVGVDAFGVGTAISK 278
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.318 0.136 0.390
Gapped
Lambda K H
0.267 0.0797 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 20,927,470
Number of extensions: 2098880
Number of successful extensions: 2138
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2105
Number of HSP's successfully gapped: 68
Length of query: 398
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 299
Effective length of database: 6,546,556
Effective search space: 1957420244
Effective search space used: 1957420244
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 60 (26.8 bits)
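
Note on the statistics above: the effective lengths and the search space can be re-derived from the reported query length, database size, and length adjustment. The sketch below is not part of the RPS-BLAST output. It assumes the usual BLAST convention (effective length = actual length minus the length adjustment, applied once per database sequence) and the standard Karlin-Altschul bit-score conversion. All variable and function names are illustrative.

```python
import math

# Values copied from the report above.
QUERY_LEN = 398            # "Length of query"
DB_LETTERS = 10_937_602    # "Length of database"
DB_SEQS = 44_354           # "Number of Sequences"
LENGTH_ADJ = 99            # "Length adjustment"

# Effective lengths: subtract the length adjustment from the query,
# and once per sequence from the database (assumed convention).
eff_query = QUERY_LEN - LENGTH_ADJ
eff_db = DB_LETTERS - DB_SEQS * LENGTH_ADJ
search_space = eff_query * eff_db


def bit_score(raw, lam, k):
    """Karlin-Altschul conversion of a raw score to bits."""
    return (lam * raw - math.log(k)) / math.log(2)


# Gapped Lambda and K as printed in the report (rounded values).
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0797

# Raw score 1685 belongs to the top hit (PLN02460).
top_hit_bits = bit_score(1685, GAPPED_LAMBDA, GAPPED_K)

print(eff_query, eff_db, search_space)
print(round(top_hit_bits, 1))
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees with the reported 652 bits only approximately. The integer quantities reproduce the reported values exactly.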