Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: write syscall optimization and a bytesWritten bug fix #2960

Closed

Conversation

brendanashworth
Copy link
Contributor

This pull request contains two changes to net: one that uses .cork()/.uncork() on a socket as it is connecting, and the other to fix the value of .bytesWritten while a socket is connecting and a ._writev call is issued.

.cork() / .uncork()

By using these two values on the Writable stream of the socket while it is connecting, we can avoid the common stream-ism of a single buffered ._write, then a larger ._writev following it. This change merges multiple writes to a connecting stream into a single writev call, rather than two calls as before. Here is an explanation gist.

.bytesWritten fix

As found by reading over the code base, if a writev call is issued and cached (while connecting), _pendingData will be an array. Because there was no special functionality for this, Buffer.byteLength coerced the array to a string and returned the byte length of that. This fixes that bug and adds a regression test.

@brendanashworth brendanashworth added the net Issues and PRs related to the net subsystem. label Sep 19, 2015
@ronkorving
Copy link
Contributor

Brilliant. We could apply the same cork trick for fs.WriteStream and gain some benefits there.

@brendanashworth
Copy link
Contributor Author

Perhaps r= @trevnorris or @bnoordhuis ?

this.cork();
this.once('connect', function() {
this.uncork();
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm confused. What does this gain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is supposed to keep the initial write (if written to multiple times when connecting) inside the stream buffer - then, when it connects (and is writable), all the writes can be flushed at once (with writev), rather than a single write then a writev with the rest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm. wonder how this would have worked before. I assume writes to an unconnected socket would have either errored or been lost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before they were kept in the _pendingData, _pendingEncoding things, and the callback wouldn't be called, so the streams would buffer the rest. This still happens after this commit though (on a smaller scale with _pendingBuffer) because .end() uncorks completely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you cork eagerly here instead of lazily in Socket#_writeGeneric()? The change LGTM, just wondering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By corking eagerly, we make sure (as often as possible) that the writes stay in the stream infrastructure rather than like before, where one sat in net.js.

@trevnorris
Copy link
Contributor

I feel like this will have some interactions with the automatic cork/uncork in the http module. Could those be potentially removed and rely on this instead?

@brendanashworth
Copy link
Contributor Author

@trevnorris I'll take a look - but this can only affect the HTTP client, not the server, so that may limit its usefulness. :/

CI: https://ci.nodejs.org/job/node-test-commit/835/

@trevnorris
Copy link
Contributor

Don't see any test failures related to this PR.

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Mar 22, 2016
@jasnell
Copy link
Member

jasnell commented Mar 22, 2016

Is this still an issue?

Instead of allowing the socket to lazily buffer up a write while
connecting, proactively .cork() the stream and .uncork() when we have
connected. This allows the stream to buffer together all writes while
connecting into a single writev when connected, rather than an initial
write and a follow-up writev.

This only leads us to a small caveat: if the stream is uncorked while
connecting, writes will begin to be sent to the socket unintentionally.
Work around this by preserving a smaller subset of the _pendingData and
_pendingEncoding stuff, which can be used in this case. (This also
happens when .end() is called on a socket which is still connecting.)

This means this will be turned into a single write call:

  var socket = net.connect(...);
  socket.write('hello, wor');
  socket.write('ld!');

Improving performance when this is the case.
When a writev is caused on a socket (sometimes through corking and
uncorking), previously net would call Buffer.byteLength on the array of
buffers and chunks, giving a completely erroneous byte length for the
bulk of them (this is because byteLength is weird and coerces to
strings).

This commit fixes this bug by iterating over each chunk in the pending
stack and calculating the length individually. Also adds a regression
test.
@brendanashworth
Copy link
Contributor Author

@jasnell it could still be fixed, it just needs someone to sign off on it. I've rebased the branch.

@jasnell
Copy link
Member

jasnell commented Mar 26, 2016

/cc @nodejs/ctc

@indutny
Copy link
Member

indutny commented Mar 26, 2016

A question, will it crash if someone will .uncork() socket right after creation, and will write data to it?

@brendanashworth
Copy link
Contributor Author

@indutny it will not — instead, it will function as it did before, buffering up the data separately, leading to two write calls. So, if it is uncorked, you lose the speed bonus.

@indutny
Copy link
Member

indutny commented Mar 26, 2016

Ok, good. May I ask you to add a test for this case too? Otherwise LGTM!

@jasnell
Copy link
Member

jasnell commented Feb 28, 2017

@brendanashworth ... is this still something you'd like to pursue?

@fhinkel
Copy link
Member

fhinkel commented Mar 26, 2017

Feel free to reopen this PR if you get back to working on it.

@fhinkel fhinkel closed this Mar 26, 2017
brendanashworth added a commit to brendanashworth/io.js that referenced this pull request Jul 14, 2017
When a writev is caused on a socket (sometimes through corking and
uncorking), previously net would call Buffer.byteLength on the array of
buffers and chunks. This throws a TypeError, because Buffer.byteLength
throws when passed a non-string.

In dbfe8c4, behavior changed to throw when passed a non-string. This is
correct behavior. Previously, it would cast the argument to a string, so
before this commit, bytesWritten would give an erroneous value. This
commit corrects the behavior equally both before and after dbfe8c4.

This commit fixes this bug by iterating over each chunk in the pending
stack and calculating the length individually. Also adds a regression
test.

Refs: nodejs#2960
brendanashworth added a commit that referenced this pull request Jul 29, 2017
When a writev is caused on a socket (sometimes through corking and
uncorking), previously net would call Buffer.byteLength on the array of
buffers and chunks. This throws a TypeError, because Buffer.byteLength
throws when passed a non-string.

In dbfe8c4, behavior changed to throw when passed a non-string. This is
correct behavior. Previously, it would cast the argument to a string, so
before this commit, bytesWritten would give an erroneous value. This
commit corrects the behavior equally both before and after dbfe8c4.

This commit fixes this bug by iterating over each chunk in the pending
stack and calculating the length individually. Also adds a regression
test. This additionally changes an `instanceof Buffer` check to `typeof
!== 'string'`, which should be equivalent.

PR-URL: #14236
Reviewed-By: Brian White <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Refs: #2960
addaleax pushed a commit that referenced this pull request Aug 1, 2017
When a writev is caused on a socket (sometimes through corking and
uncorking), previously net would call Buffer.byteLength on the array of
buffers and chunks. This throws a TypeError, because Buffer.byteLength
throws when passed a non-string.

In dbfe8c4, behavior changed to throw when passed a non-string. This is
correct behavior. Previously, it would cast the argument to a string, so
before this commit, bytesWritten would give an erroneous value. This
commit corrects the behavior equally both before and after dbfe8c4.

This commit fixes this bug by iterating over each chunk in the pending
stack and calculating the length individually. Also adds a regression
test. This additionally changes an `instanceof Buffer` check to `typeof
!== 'string'`, which should be equivalent.

PR-URL: #14236
Reviewed-By: Brian White <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Refs: #2960
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
net Issues and PRs related to the net subsystem. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants