Commit 89f2074

jalafel authored and BridgeAR committed
doc: document bytes to chars after setEncoding
This commit documents an edge-case behavior in readable streams. It is expected that non-object streams are measured in bytes against the highWaterMark. However, it was discovered in the referenced issue that, after `setEncoding()` is called, such streams thereafter begin to measure the buffer's length in characters.

PR-URL: #13442
Refs: #6798
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
1 parent 1cdb41f commit 89f2074

1 file changed: +20 -5 lines changed


doc/api/stream.md

@@ -68,8 +68,8 @@ buffer that can be retrieved using `writable._writableState.getBuffer()` or

 The amount of data potentially buffered depends on the `highWaterMark` option
 passed into the streams constructor. For normal streams, the `highWaterMark`
-option specifies a total number of bytes. For streams operating in object mode,
-the `highWaterMark` specifies a total number of objects.
+option specifies a [total number of bytes][hwm-gotcha]. For streams operating
+in object mode, the `highWaterMark` specifies a total number of objects.

 Data is buffered in Readable streams when the implementation calls
 [`stream.push(chunk)`][stream-push]. If the consumer of the Stream does not
@@ -1517,9 +1517,9 @@ constructor and implement the `readable._read()` method.
 #### new stream.Readable([options])

 * `options` {Object}
-  * `highWaterMark` {number} The maximum number of bytes to store in
-    the internal buffer before ceasing to read from the underlying
-    resource. Defaults to `16384` (16kb), or `16` for `objectMode` streams
+  * `highWaterMark` {number} The maximum [number of bytes][hwm-gotcha] to store
+    in the internal buffer before ceasing to read from the underlying resource.
+    Defaults to `16384` (16kb), or `16` for `objectMode` streams
   * `encoding` {string} If specified, then buffers will be decoded to
     strings using the specified encoding. Defaults to `null`
   * `objectMode` {boolean} Whether this stream should behave
@@ -2157,6 +2157,19 @@ object mode has an interesting side effect. Because it *is* a call to
 However, because the argument is an empty string, no data is added to the
 readable buffer so there is nothing for a user to consume.

+### `highWaterMark` discrepency after calling `readable.setEncoding()`
+
+The use of `readable.setEncoding()` will change the behavior of how the
+`highWaterMark` operates in non-object mode.
+
+Typically, the size of the current buffer is measured against the
+`highWaterMark` in _bytes_. However, after `setEncoding()` is called, the
+comparison function will begin to measure the buffer's size in _characters_.
+
+This is not a problem in common cases with `latin1` or `ascii`. But it is
+advised to be mindful about this behavior when working with strings that could
+contain multi-byte characters.
+
 [`'data'`]: #stream_event_data
 [`'drain'`]: #stream_event_drain
 [`'end'`]: #stream_event_end
@@ -2195,6 +2208,8 @@ readable buffer so there is nothing for a user to consume.
 [fs write streams]: fs.html#fs_class_fs_writestream
 [http-incoming-message]: http.html#http_class_http_incomingmessage
 [zlib]: zlib.html
+[hwm-gotcha]: #stream_highWaterMark_discrepency_after_calling_readable_setencoding
+[Readable]: #stream_class_stream_readable
 [stream-_flush]: #stream_transform_flush_callback
 [stream-_read]: #stream_readable_read_size_1
 [stream-_transform]: #stream_transform_transform_chunk_encoding_callback
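
Note on the documented behavior: the bytes-versus-characters discrepancy that this diff describes can be observed with a short script. The following is a minimal sketch, not part of the commit. The `measure` helper is hypothetical, the `highWaterMark` of 8 is chosen only to make the difference visible, and `readable.readableLength` is assumed to be available (it was added in a Node.js release later than this commit).

```javascript
'use strict';
const { Readable } = require('stream');

// Hypothetical helper (not part of the commit): pushes one chunk into a
// fresh Readable and reports how the stream accounted for it.
function measure(useEncoding) {
  const readable = new Readable({
    highWaterMark: 8, // small threshold so the difference is visible
    read() {}         // no-op; data is pushed manually below
  });
  if (useEncoding) readable.setEncoding('utf8');

  // '€' is one character but three bytes in UTF-8: 4 characters, 12 bytes.
  const chunk = Buffer.from('€€€€', 'utf8');
  const wantsMore = readable.push(chunk);

  return {
    bytesPushed: chunk.length,
    buffered: readable.readableLength, // assumes a Node.js version that has it
    wantsMore
  };
}

const withoutEncoding = measure(false);
const withEncoding = measure(true);
console.log(withoutEncoding); // buffered is counted in bytes
console.log(withEncoding);    // buffered is counted in characters
```

Without `setEncoding()`, the 12-byte chunk exceeds the `highWaterMark` of 8, so `push()` returns `false`. With `setEncoding('utf8')`, the same chunk is counted as 4 characters, so `push()` still returns `true` even though more bytes are held than the threshold suggests.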
