This is intended behavior, not a bug. If you don't pass the encoding you want as the second parameter of Buffer.from(), the string is encoded as UTF-8, and the result you see follows from how UTF-8 works. Digging into the encodings and the spec, we get the following code point ranges and their encoded sizes:
In UTF-8:
1 byte: 0-7F (ASCII)
2 bytes: 80-7FF (all European plus some Middle Eastern)
3 bytes: 800-FFFF (multilingual plane incl. the top 1792 and private-use)
4 bytes: 10000-10FFFF
In UTF-16:
2 bytes: 0-D7FF and E000-FFFF (multilingual plane except the surrogate range D800-DFFF)
4 bytes: 10000-10FFFF (encoded as surrogate pairs)
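The sizes can be checked directly in Node.js. This is only a minimal sketch: `sizes` is a hypothetical helper name introduced for illustration, and the sample code points are picked to sit on the range boundaries.

```javascript
// Byte length of a single code point in UTF-8 and in UTF-16 (little-endian).
// `sizes` is a hypothetical helper used only for this illustration.
function sizes(codePoint) {
  const s = String.fromCodePoint(codePoint);
  return {
    utf8: Buffer.byteLength(s, 'utf8'),
    utf16: Buffer.byteLength(s, 'utf16le'),
  };
}

// Sample code points on the boundaries of the ranges.
const samples = [0x7f, 0x80, 0x7ff, 0x800, 0xd7ff, 0x10000, 0x10ffff];

for (const cp of samples) {
  const { utf8, utf16 } = sizes(cp);
  const label = 'U+' + cp.toString(16).toUpperCase().padStart(4, '0');
  console.log(`${label}: UTF-8 ${utf8} byte(s), UTF-16 ${utf16} byte(s)`);
}
```

U+0080 is the first code point past ASCII, so it is the first one that needs two bytes in UTF-8.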
Version
14.17.3
Platform
Linux 5.11.0-31-generic #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Subsystem
Buffer
What steps will reproduce the bug?
How often does it reproduce? Is there a required condition?
Always on both Debian Buster and Ubuntu Hirsute Hippo
Buster:
Linux 4.19.0-17-amd64 #1 SMP Debian 4.19.194-3 (2021-07-18) x86_64 GNU/Linux
Hippo:
Linux 5.11.0-31-generic #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
What is the expected behavior?
For Buffer.from('\x80') to create a single-byte buffer containing 0x80.
What do you see instead?
Buffer.from('\x80') creates a two-byte buffer containing 0xc2 0x80.
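For reference, a minimal sketch that puts the observed result next to two ways of getting a single 0x80 byte. It assumes the goal is one raw byte per character; the variable names are illustrative only.

```javascript
// With no encoding argument, Buffer.from() encodes the string as UTF-8,
// so U+0080 becomes the two bytes c2 80.
const asUtf8 = Buffer.from('\x80');

// With 'latin1', each code point in 0x00-0xFF maps to exactly one byte.
const asLatin1 = Buffer.from('\x80', 'latin1');

// Building from an array of byte values skips string encoding entirely.
const fromBytes = Buffer.from([0x80]);

console.log(asUtf8);    // <Buffer c2 80>
console.log(asLatin1);  // <Buffer 80>
console.log(fromBytes); // <Buffer 80>
```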
Additional information
No response