Fix tagged template literal with unicode #15047
Draft
+188
−172
What does this PR do?
Fixes #8745, fixes #15492, fixes #15929, fixes #16763, fixes #18115
Changes ascii_only to prefers_ascii. It will try to emit mostly ASCII, but if a non-ASCII character is encountered in a tagged template, it will emit it as-is. Updates execution to scan the file for non-ASCII characters and, if any are found, load it as UTF-8 instead.
This should be benchmarked to see what the performance cost is.
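Tagged templates need this special case because a tag function can observe the raw source text of the template, so rewriting a non-ASCII character as a \u escape changes program behavior. A minimal sketch of the difference follows; the inspect tag is a hypothetical helper written for illustration and is not part of this PR:

```javascript
// Hypothetical tag function: returns both views of the first template chunk.
function inspect(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

// Same character, written literally and as a \u escape.
const literal = inspect`é`;
const escaped = inspect`\u00e9`;

// The cooked values are identical...
console.log(literal.cooked === escaped.cooked); // true
// ...but the raw values differ, so the escape is visible to the tag.
console.log(literal.raw === escaped.raw); // false
// The raw string keeps the six characters of the escape sequence itself.
console.log(escaped.raw.length); // 6
```

For untagged templates and ordinary string literals only the cooked value is visible, which is why escaping is safe there.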
TODO:
bun --compile
An alternate solution is to add a polyfill in js_parser.zig or do some injection into JSC's parsing logic.
Benchmarks
hyperfine "bun-latest a.js" "bun a.js" "bun-latest no_unicode.js" "bun no_unicode.js" --warmup=5
An ASCII-only // @bun file is slowed down by 6ms (out of 1000ms), about a 0.6% slowdown.
A // @bun file containing non-ASCII characters is slowed down by ~1.21x because it has to be converted from UTF-8 to UTF-16 before JSC can parse it.
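The two costs measured above correspond to two steps in the load path: a byte scan that every file pays, and a UTF-8 to UTF-16 decode that only files with non-ASCII bytes pay. The sketch below illustrates that shape in JavaScript. It is not the PR's actual implementation, and containsNonAscii and loadSource are hypothetical names used only for illustration:

```javascript
// Hypothetical helper: true if any byte is outside the 7-bit ASCII range.
function containsNonAscii(bytes) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] > 0x7f) return true;
  }
  return false;
}

// Hypothetical loader: picks the cheap path when the file is pure ASCII.
function loadSource(bytes) {
  if (!containsNonAscii(bytes)) {
    // ASCII fast path: every byte maps directly to one code unit.
    let text = "";
    for (let i = 0; i < bytes.length; i++) {
      text += String.fromCharCode(bytes[i]);
    }
    return { encoding: "ascii", text };
  }
  // Slow path: multi-byte UTF-8 sequences must be decoded into UTF-16.
  return { encoding: "utf-8", text: new TextDecoder("utf-8").decode(bytes) };
}

const encoder = new TextEncoder();
const asciiResult = loadSource(encoder.encode("tag`abc`"));
const unicodeResult = loadSource(encoder.encode("tag`é`"));

console.log(asciiResult.encoding); // ascii
console.log(unicodeResult.encoding); // utf-8
```

The scan is a single linear pass, which is consistent with the small overhead reported for ASCII-only files, while the decode step touches and rewrites every byte.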