fs.readFileSync can't return a string for a big file #9489
I've found the de facto limit for the current V8: 268,435,440 characters (i.e. 2^28 - 16). This test code is OK:

const fs = require('fs');
fs.writeFileSync('bigfile.txt', `\uFEFF${'*'.repeat(Math.pow(2, 28) - 16 - 1)}`, 'utf16le');
console.log(fs.readFileSync('bigfile.txt', 'utf16le').length);

If I add just one character, it throws the error. Should it be documented somewhere? |
FWIW the limit comes from here. ChakraCore uses a much different value that is dependent upon the value of |
I think the docs recommendation should be (if it is not already) to use raw Buffers for "any very large data". |
(iirc |
buffer.toString throws an Error when the resulting string would be bigger than `2^28 - 16`. Fixes: nodejs#9489
Just in case anyone else finds themselves at this issue from Google: I ran into this while trying to synchronously (no readline, streams, etc.) read a 400 MB JSON file line by line. As suggested, I used raw Buffers to solve it, aided by the buffer-split package.

const fs = require('fs');
var bsplit = require('buffer-split');

function readLineJSON(path) {
  const buf = fs.readFileSync(path); // omitting the encoding returns a Buffer
  const delim = Buffer.from('\n');
  const result = bsplit(buf, delim);
  return result
    .map(x => x.toString())
    .filter(x => x !== "")
    .map(JSON.parse);
} |
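If you would rather avoid the extra dependency, the same idea works with only built-in Buffer methods. A rough sketch along those lines (the function name readLinesJSON is just illustrative, not from this thread):

const fs = require('fs');

function readLinesJSON(path) {
  const buf = fs.readFileSync(path); // still a Buffer, so no string-length limit on the whole file
  const delim = Buffer.from('\n');
  const chunks = [];
  let start = 0;
  let idx;
  // Walk the buffer and slice out one line at a time.
  while ((idx = buf.indexOf(delim, start)) !== -1) {
    chunks.push(buf.subarray(start, idx));
    start = idx + delim.length;
  }
  chunks.push(buf.subarray(start)); // whatever is left after the last newline
  return chunks
    .map(x => x.toString()) // each individual line is small enough to convert
    .filter(x => x !== '')
    .map(JSON.parse);
}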
@vsemozhetbyt … is there anything here you’d like to see? Would you want to open a docs PR yourself? |
I have no definite opinion on what should be added or how. It seems there is no consensus on whether we should document engine-specific limits, so feel free to close this until a new decision is made. |
We should certainly improve the error messages:

#define SB_STRING_TOO_LONG_ERROR \
  v8::Exception::Error(OneByteString(isolate, "\"toString()\" failed"))

Edit: @addaleax Just noticed your comment in the code. I could not find an open issue for this, is there? Any specific reason this has not been changed yet? |
@tniessen No, not beyond the discussion in #12765. The reason this has not been changed yet is that since it’s semver-major it would target Node 9, which gives plenty of time, and the fact that at some point we’re going to have to go through our native errors anyway to upgrade them to the new error code system. (Also, most of the ToDos from that PR might be suitable for first-time contributions from people with a C++ background.) |
readFileSync does not read large files nodejs/node#9489
PR-URL: #19739 Fixes: #3175 Fixes: #9489 Reviewed-By: Gus Caplan <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Anna Henningsen <[email protected]>
In which version of Node should we expect huge files to be supported? |
@Extarys As per this blog post, the max string length was increased in V8 6.2, i.e. the latest Node.js LTS version (8.11.3) already supports them. |
To be more exact: the new limit is mentioned in the "Increased max string length" section, i.e. |
Thanks for this update! I love that when importing big logs or something. |
Sorry to ping on a closed thread, but I can't Google my way out of asking this: How do I set the
but I'm not familiar with the UTF-16 code units or how to use them. Do I just write:
I found this blog post, which uses the |
You can't change |
Ah, thank you for clarifying. Just |
Or |
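For later readers: the maximum string length is an engine (V8) constant, so it cannot be raised from Node.js, but it can be inspected at runtime via the documented buffer.constants.MAX_STRING_LENGTH. A minimal sketch:

const { constants } = require('buffer');

// Largest string (in UTF-16 code units) the JavaScript engine can create.
// This is fixed by V8; it cannot be changed from Node.js.
console.log(constants.MAX_STRING_LENGTH);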
If I try to read a big file (582,170,692 bytes, ~ 555 MB) into a buffer, it is OK. If I add an encoding and try to get a string, I get an error.
It seems the string does not exceed the spec limit. Are there any other undocumented (or documented elsewhere) limits for fs.readFileSync() or Buffer.toString()?
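For reference, a minimal sketch of the behaviour described above (the file name is hypothetical):

const fs = require('fs');

// Without an encoding the call returns a Buffer; this works even when the
// file is larger than the engine's maximum string length.
const buf = fs.readFileSync('big.json');
console.log(buf.length);

// Passing an encoding forces a Buffer-to-string conversion, which throws
// ("toString()" failed) once the decoded string would exceed that limit.
const str = fs.readFileSync('big.json', 'utf8');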