doc: add topic - event loop, timers, nextTick()
#4936
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,467 @@ | ||
# The Node.js Event Loop, Timers, and `process.nextTick()` | ||
|
||
## What is the Event Loop? | ||
|
||
The event loop is what allows Node.js to perform non-blocking I/O | ||
operations, despite the fact that JavaScript is single-threaded, by | ||
offloading operations to the system kernel whenever possible. | ||
|
||
Since most modern kernels are multi-threaded, they can handle multiple | ||
operations executing in the background. When one of these operations | ||
completes, the kernel tells Node.js so that the appropriate callback | ||
may be added to the `poll` queue to eventually be executed. We'll explain | ||
this in further detail later in this topic. | ||
|
||
## Event Loop Explained | ||
|
||
When Node.js starts, it initializes the event loop, processes the | ||
provided input script (or drops into the REPL, which is not covered in | ||
this document) which may make async API calls, schedule timers, or call | ||
`process.nextTick()`, then begins processing the event loop. | ||
|
||
The following diagram shows a simplified overview of the event loop's | ||
order of operations. | ||
|
||
┌───────────────────────┐ | ||
┌─>│ timers │ | ||
│ └──────────┬────────────┘ | ||
│ ┌──────────┴────────────┐ | ||
│ │ I/O callbacks │ | ||
│ └──────────┬────────────┘ | ||
│ ┌──────────┴────────────┐ | ||
│ │ idle, prepare │ | ||
│ └──────────┬────────────┘ ┌───────────────┐ | ||
│ ┌──────────┴────────────┐ │ incoming: │ | ||
│ │ poll │<─────┤ connections, │ | ||
│ └──────────┬────────────┘ │ data, etc. │ | ||
│ ┌──────────┴────────────┐ └───────────────┘ | ||
│ │ check │ | ||
│ └──────────┬────────────┘ | ||
│ ┌──────────┴────────────┐ | ||
└──┤ close callbacks │ | ||
└───────────────────────┘ | ||
|
||
*note: each box will be referred to as a "phase" of the event loop.* | ||
|
||
Each phase has a FIFO queue of callbacks to execute. While each phase is | ||
special in its own way, generally, when the event loop enters a given | ||
phase, it will perform any operations specific to that phase, then | ||
execute callbacks in that phase's queue until the queue has been | ||
exhausted or the maximum number of callbacks have executed. When the | ||
queue has been exhausted or the callback limit is reached, the event | ||
loop will move to the next phase, and so on. | ||
|
||
Since any of these operations may schedule _more_ operations and new | ||
events processed in the `poll` phase are queued by the kernel, poll | ||
events can be queued while polling events are being processed. As a | ||
result, long running callbacks can allow the poll phase to run much | ||
longer than a timer's threshold. See the [`timers`](#timers) and | ||
[`poll`](#poll) sections for more details. | ||
|
||
_**NOTE:** There is a slight discrepancy between the Windows and the | ||
Unix/Linux implementation, but that's not important for this | ||
demonstration. The most important parts are here. There are actually | ||
seven or eight steps, but the ones we care about, the ones that Node.js | ||
actually uses, are those above._ | ||
|
||
|
||
## Phases Overview: | ||
|
||
* `timers`: this phase executes callbacks scheduled by `setTimeout()` | ||
and `setInterval()`. | ||
* `I/O callbacks`: most types of callbacks except timers, `setImmediate()`, and close callbacks | ||
* `idle, prepare`: only used internally | ||
* `poll`: retrieve new I/O events; Node.js will block here when appropriate | ||
* `check`: `setImmediate()` callbacks are invoked here | ||
* `close callbacks`: e.g. `socket.on('close', ...)` | ||
|
||
Between each run of the event loop, Node.js checks if it is waiting for | ||
any asynchronous I/O or timers and shuts down cleanly if there are not | ||
any. | ||
|
||
## Phases in Detail | ||
|
||
### timers | ||
|
||
A timer specifies the **threshold** _after which_ a provided callback | ||
_may be executed_ rather than the **exact** time a person _wants it to | ||
be executed_. Timer callbacks will run as early as they can be | ||
scheduled after the specified amount of time has passed; however, | ||
operating system scheduling or the running of other callbacks may delay | ||
them. | ||
|
||
_**Note**: Technically, the [`poll` phase](#poll) controls when timers | ||
are executed._ | ||
|
||
For example, say you schedule a timeout to execute after a 100 ms | ||
threshold, then your script starts asynchronously reading a file which | ||
takes 95 ms: | ||
|
||
```js | ||
|
||
var fs = require('fs'); | ||
|
||
function someAsyncOperation (callback) { | ||
|
||
// let's assume this takes 95ms to complete | ||
fs.readFile('/path/to/file', callback); | ||
|
||
} | ||
|
||
var timeoutScheduled = Date.now(); | ||
|
||
setTimeout(function () { | ||
|
||
var delay = Date.now() - timeoutScheduled; | ||
|
||
console.log(delay + "ms have passed since I was scheduled"); | ||
}, 100); | ||
|
||
|
||
// do someAsyncOperation which takes 95 ms to complete | ||
someAsyncOperation(function () { | ||
|
||
var startCallback = Date.now(); | ||
|
||
// do something that will take 10ms... | ||
while (Date.now() - startCallback < 10) { | ||
; // do nothing | ||
} | ||
|
||
}); | ||
``` | ||
|
||
When the event loop enters the `poll` phase, it has an empty queue | ||
(`fs.readFile()` has not completed), so it will wait for the number of ms | ||
remaining until the soonest timer's threshold is reached. While it is | ||
waiting, 95 ms pass, `fs.readFile()` finishes reading the file, and its | ||
callback, which takes 10 ms to complete, is added to the `poll` queue and | ||
executed. When the callback finishes, there are no more callbacks in the | ||
queue, so the event loop will see that the threshold of the soonest | ||
timer has been reached, then wrap back to the `timers` phase to execute | ||
the timer's callback. In this example, you will see that the total delay | ||
between the timer being scheduled and its callback being executed will | ||
be 105 ms. | ||
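The same effect can be reproduced without a file read. The following is a minimal sketch, assuming a synchronous busy-wait stands in for the long-running callback; the names `scheduledAt` and `observedDelay` and the 10 ms / 50 ms values are illustrative and not part of the original example.

```js
// Hypothetical variant: a 10 ms timer delayed by 50 ms of blocking work.
const scheduledAt = Date.now();
let observedDelay;

setTimeout(() => {
  observedDelay = Date.now() - scheduledAt;
  console.log(`${observedDelay}ms have passed since I was scheduled`);
}, 10);

// Block the thread for 50 ms. The event loop cannot reach the
// timers phase until this loop returns.
const blockStart = Date.now();
while (Date.now() - blockStart < 50) {
  // do nothing
}
```

The reported delay should be at least 50 ms even though the threshold was 10 ms, because the threshold is only a lower bound.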
|
||
Note: To prevent the `poll` phase from starving the event loop, libuv | ||
also has a hard maximum (system dependent) before it stops `poll`ing for | ||
more events. | ||
|
||
### I/O callbacks: | ||
|
||
This phase executes callbacks for some system operations, such as certain | ||
types of TCP errors. For example, if a TCP socket receives `ECONNREFUSED` | ||
when attempting to connect, some \*nix systems want to wait to report the | ||
error. This will be queued to execute in the `I/O callbacks` phase. | ||
|
||
### poll: | ||
|
||
The poll phase has two main functions: | ||
|
||
1. Executing scripts for timers whose threshold has elapsed, then | ||
2. Processing events in the `poll` queue. | ||
|
||
|
||
When the event loop enters the `poll` phase _and there are no timers | ||
scheduled_, one of two things will happen: | ||
|
||
* _If the `poll` queue **is not empty**_, the event loop will iterate | ||
through its queue of callbacks executing them synchronously until | ||
either the queue has been exhausted, or the system-dependent hard limit | ||
is reached. | ||
|
||
* _If the `poll` queue is **empty**_, one of two more things will | ||
happen: | ||
* If scripts have been scheduled by `setImmediate()`, the event loop | ||
will end the `poll` phase and continue to the `check` phase to | ||
execute those scheduled scripts. | ||
|
||
* If scripts **have not** been scheduled by `setImmediate()`, the | ||
event loop will wait for callbacks to be added to the queue, then | ||
execute them immediately. | ||
|
||
Once the `poll` queue is empty the event loop will check for timers | ||
_whose time thresholds have been reached_. If one or more timers are | ||
ready, the event loop will wrap back to the timers phase to execute | ||
those timers' callbacks. | ||
|
||
### `check`: | ||
|
||
This phase allows a person to execute callbacks immediately after the | ||
`poll` phase has completed. If the `poll` phase becomes idle and | ||
scripts have been queued with `setImmediate()`, the event loop may | ||
continue to the `check` phase rather than waiting. | ||
|
||
`setImmediate()` is actually a special timer that runs in a separate | ||
phase of the event loop. It uses a libuv API that schedules callbacks to | ||
execute after the `poll` phase has completed. | ||
|
||
Generally, as the code is executed, the event loop will eventually hit | ||
the `poll` phase where it will wait for an incoming connection, request, | ||
etc. However, if a callback has been scheduled with `setImmediate()` | ||
and the `poll` phase becomes idle, it will end and continue to the | ||
`check` phase rather than waiting for `poll` events. | ||
|
||
### `close callbacks`: | ||
|
||
If a socket or handle is closed abruptly (e.g. `socket.destroy()`), the | ||
`'close'` event will be emitted in this phase. Otherwise it will be | ||
emitted via `process.nextTick()`. | ||
|
||
## `setImmediate()` vs `setTimeout()` | ||
|
||
`setImmediate()` and `setTimeout()` are similar, but behave in different | ||
ways depending on when they are called. | ||
|
||
* `setImmediate()` is designed to execute a script once the current | ||
`poll` phase completes. | ||
* `setTimeout()` schedules a script to be run | ||
after a minimum threshold in ms has elapsed. | ||
|
||
The order in which they are executed varies depending on the context in | ||
which they are called. If both are called from the main module, then the | ||
order is bound by how fast your process runs, which is impacted by other | ||
programs running on your machine. | ||
|
||
For example, if we run the following script which is not within an I/O | ||
cycle (i.e. the main module), the order in which the two functions are | ||
executed is non-deterministic, as it is based upon how fast your process | ||
runs (which is impacted by other programs running on your machine): | ||
|
||
|
||
```js | ||
// timeout_vs_immediate.js | ||
setTimeout(function timeout () { | ||
console.log('timeout'); | ||
}, 0); | ||
|
||
setImmediate(function immediate () { | ||
console.log('immediate'); | ||
}); | ||
``` | ||
|
||
$ node timeout_vs_immediate.js | ||
timeout | ||
immediate | ||
|
||
$ node timeout_vs_immediate.js | ||
immediate | ||
timeout | ||
|
||
|
||
However, if you move the two calls within an I/O cycle, the immediate | ||
callback is always executed first: | ||
|
||
```js | ||
// timeout_vs_immediate.js | ||
var fs = require('fs') | ||
|
||
fs.readFile(__filename, () => { | ||
setTimeout(() => { | ||
console.log('timeout') | ||
}, 0) | ||
setImmediate(() => { | ||
console.log('immediate') | ||
}) | ||
}) | ||
``` | ||
|
||
$ node timeout_vs_immediate.js | ||
immediate | ||
timeout | ||
|
||
$ node timeout_vs_immediate.js | ||
immediate | ||
timeout | ||
|
||
The main advantage to using `setImmediate()` over `setTimeout()` is | ||
`setImmediate()` will always be executed before any timers if scheduled | ||
within an I/O cycle, independently of how many timers are present. | ||
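To illustrate the "independently of how many timers are present" point, here is a small sketch that schedules several zero-delay timers alongside one immediate from inside an I/O callback. The `order` array, the `record()` helper, and the use of `fs.stat(process.execPath, ...)` as the I/O operation are assumptions made for this illustration.

```js
const fs = require('fs');

// Hypothetical helper: remember and print the order callbacks run in.
const order = [];
function record(label) {
  order.push(label);
  console.log(label);
}

// Any asynchronous fs call puts us inside an I/O cycle; stat-ing the
// node binary is used here only because that path always exists.
fs.stat(process.execPath, () => {
  for (let i = 0; i < 3; i++) {
    setTimeout(() => record('timeout ' + i), 0);
  }
  setImmediate(() => record('immediate'));
});
```

Because the callback runs during the `poll` phase, the `check` phase comes next, so `immediate` is recorded before any of the three timeouts.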
|
||
## `process.nextTick()`: | ||
|
||
### Understanding `process.nextTick()` | ||
|
||
You may have noticed that `process.nextTick()` was not displayed in the | ||
diagram, even though it's a part of the asynchronous API. This is because | ||
`process.nextTick()` is not technically part of the event loop. Instead, | ||
the nextTickQueue will be processed after the current operation | ||
completes, regardless of the current `phase` of the event loop. | ||
|
||
Looking back at our diagram, any time you call `process.nextTick()` in a | ||
given phase, all callbacks passed to `process.nextTick()` will be | ||
resolved before the event loop continues. This can create some bad | ||
situations because **it allows you to "starve" your I/O by making | ||
recursive `process.nextTick()` calls**, which prevents the event loop | ||
from reaching the `poll` phase. | ||
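A bounded sketch of this starvation effect follows. The chain is limited to 1000 re-queues so the script terminates; an unbounded chain would never let the timer or the immediate run. The names `order`, `remaining`, and `spin` are illustrative.

```js
// Hypothetical demo: a nextTick chain holds back timers and immediates.
const order = [];
let remaining = 1000;

function spin() {
  if (remaining-- > 0) {
    // Re-queue before the event loop has a chance to advance.
    process.nextTick(spin);
  } else {
    order.push('nextTick chain done');
  }
}

setTimeout(() => order.push('timeout'), 0);
setImmediate(() => order.push('immediate'));
process.nextTick(spin);
```

The first entry in `order` is always `'nextTick chain done'`: neither the timer nor the immediate can run until the nextTickQueue is empty.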
|
||
### Why would that be allowed? | ||
|
||
Why would something like this be included in Node.js? Part of it is a | ||
design philosophy where an API should always be asynchronous even where | ||
it doesn't have to be. Take this code snippet for example: | ||
|
||
```js | ||
function apiCall (arg, callback) { | ||
if (typeof arg !== 'string') | ||
return process.nextTick(callback, | ||
new TypeError('argument should be string')); | ||
} | ||
``` | ||
|
||
The snippet does an argument check and, if it's not correct, it will pass | ||
the error to the callback. The API was updated fairly recently to allow | ||
passing arguments to `process.nextTick()`: any arguments passed after | ||
the callback are propagated as the arguments to the callback, so you | ||
don't have to nest functions. | ||
|
||
What we're doing is passing an error back to the user but only *after* | ||
we have allowed the rest of the user's code to execute. By using | ||
`process.nextTick()` we guarantee that `apiCall()` always runs its | ||
callback *after* the rest of the user's code and *before* the event loop | ||
is allowed to proceed. To achieve this, the JS call stack is allowed to | ||
unwind, then immediately execute the provided callback, which allows a | ||
person to make recursive calls to `process.nextTick()` without reaching a | ||
`RangeError: Maximum call stack size exceeded` from v8. | ||
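As a runnable sketch of this ordering guarantee, the snippet below calls `apiCall()` with a bad argument and records what happens. The success branch and the `order` array are hypothetical additions for illustration only.

```js
const order = [];

function apiCall(arg, callback) {
  if (typeof arg !== 'string') {
    // Arguments after the callback are forwarded to it.
    return process.nextTick(callback,
                            new TypeError('argument should be string'));
  }
  // Hypothetical success path, deferred the same way for consistency.
  process.nextTick(callback, null, arg);
}

apiCall(42, (err) => {
  order.push('callback: ' + err.message);
});

// This line runs first, even though it appears after the call.
order.push('rest of user code');
```

Here `order` ends up as `['rest of user code', 'callback: argument should be string']`, showing that the callback waits for the calling code to finish.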
|
||
This philosophy can lead to some potentially problematic situations. | ||
Take this snippet for example: | ||
|
||
```js | ||
// this has an asynchronous signature, but calls callback synchronously | ||
function someAsyncApiCall (callback) { callback(); }; | ||
|
||
// the callback is called before `someAsyncApiCall` completes. | ||
someAsyncApiCall(() => { | ||
|
||
// since someAsyncApiCall hasn't completed, bar hasn't been assigned any value | ||
console.log('bar', bar); // undefined | ||
|
||
}); | ||
|
||
var bar = 1; | ||
``` | ||
A real world example in node would be: `const server = net.createServer(() => {}).listen(8080); server.on('listening', () => {});` When only a port is passed the port is bound immediately. So the
I added your real world example in techjeffharris@f80d7cc |
||
|
||
The user defines `someAsyncApiCall()` to have an asynchronous signature, | ||
but it actually operates synchronously. When it is called, the callback | ||
provided to `someAsyncApiCall()` is called in the same phase of the | ||
event loop because `someAsyncApiCall()` doesn't actually do anything | ||
asynchronously. As a result, the callback tries to reference `bar` even | ||
though it may not have been assigned a value yet, because the script has not | ||
been able to run to completion. | ||
|
||
By placing the callback in a `process.nextTick()`, the script still has the | ||
ability to run to completion, allowing all the variables, functions, | ||
etc., to be initialized prior to the callback being called. It also has | ||
the advantage of not allowing the event loop to continue. It may be | ||
useful that the user be alerted to an error before the event loop is | ||
allowed to continue. | ||
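A minimal sketch of that fix, reusing the names from the snippet above; the `seen` array is an assumption added only so the observed value can be inspected afterwards.

```js
// Hypothetical record of what the callback observed.
const seen = [];

// Same signature as before, but the callback is now deferred.
function someAsyncApiCall(callback) {
  process.nextTick(callback);
}

someAsyncApiCall(() => {
  seen.push(bar);
  console.log('bar', bar); // 1
});

var bar = 1;
```

The callback now runs after `var bar = 1` has executed, so it sees the assigned value rather than `undefined`.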
|
||
A real world example in Node.js would be: | ||
|
||
```js | ||
const net = require('net'); | ||
const server = net.createServer(() => {}).listen(8080); | ||
|
||
server.on('listening', () => {}); | ||
``` | ||
|
||
When only a port is passed, the port is bound immediately. So the | ||
`'listening'` callback could be called immediately. The problem is that the | ||
`.on('listening')` handler will not have been set by that time. | ||
|
||
To get around this, the `'listening'` event is queued in a `nextTick()` | ||
to allow the script to run to completion. This allows the user to set | ||
any event handlers they want. | ||
|
||
## `process.nextTick()` vs `setImmediate()` | ||
|
||
We have two calls that are similar as far as users are concerned, but | ||
their names are confusing. | ||
|
||
* `process.nextTick()` fires immediately on the same phase | ||
* `setImmediate()` fires on the following iteration or 'tick' of the | ||
event loop | ||
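The difference can be observed with a short sketch; the `order` array and `record()` helper are illustrative names, not part of any API.

```js
// Hypothetical helper: remember and print the order callbacks run in.
const order = [];
function record(label) {
  order.push(label);
  console.log(label);
}

// Scheduled first, but runs last: it waits for the check phase.
setImmediate(() => record('setImmediate'));

// Runs as soon as the current operation (the main script) completes.
process.nextTick(() => record('nextTick'));

record('synchronous');
```

The recorded order is `synchronous`, `nextTick`, `setImmediate`, regardless of the order in which the two deferred calls were made.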
|
||
In essence, the names should be swapped. `process.nextTick()` fires more | ||
immediately than `setImmediate()` but this is an artifact of the past | ||
which is unlikely to change. Making this switch would break a large | ||
percentage of the packages on npm. Every day more new modules are being | ||
added, which means every day we wait, more potential breakages occur. | ||
While they are confusing, the names themselves won't change. | ||
|
||
*We recommend developers use `setImmediate()` in all cases because it's | ||
easier to reason about (and it leads to code that's compatible with a | ||
wider variety of environments, like browser JS).* | ||
|
||
## Why use `process.nextTick()`? | ||
|
||
There are two main reasons: | ||
|
||
1. Allow users to handle errors, clean up any then-unneeded resources, or | ||
perhaps try the request again before the event loop continues. | ||
|
||
2. At times it's necessary to allow a callback to run after the call | ||
stack has unwound but before the event loop continues. | ||
|
||
One example is to match the user's expectations. Simple example: | ||
|
||
```js | ||
var net = require('net'); | ||
var server = net.createServer(); | ||
server.on('connection', function(conn) { }); | ||
|
||
server.listen(8080); | ||
server.on('listening', function() { }); | ||
``` | ||
|
||
Say that `listen()` is run at the beginning of the event loop, but the | ||
listening callback is placed in a `setImmediate()`. Unless a hostname is | ||
passed, binding to the port will happen immediately. For the event loop | ||
to proceed, it must hit the `poll` phase, which means there is a non-zero | ||
chance that a connection could have been received, allowing the | ||
connection event to be fired before the listening event. | ||
|
||
Another example is running a function constructor that was to, say, | ||
inherit from `EventEmitter` and that wanted to emit an event within the | ||
constructor: | ||
|
||
```js | ||
const EventEmitter = require('events'); | ||
const util = require('util'); | ||
|
||
function MyEmitter() { | ||
EventEmitter.call(this); | ||
this.emit('event'); | ||
} | ||
util.inherits(MyEmitter, EventEmitter); | ||
|
||
const myEmitter = new MyEmitter(); | ||
myEmitter.on('event', function() { | ||
console.log('an event occurred!'); | ||
}); | ||
``` | ||
|
||
You can't emit an event from the constructor immediately | ||
because the script will not have processed to the point where the user | ||
assigns a callback to that event. So, within the constructor itself, | ||
you can use `process.nextTick()` to set a callback to emit the event | ||
after the constructor has finished, which provides the expected results: | ||
|
||
```js | ||
const EventEmitter = require('events'); | ||
const util = require('util'); | ||
|
||
function MyEmitter() { | ||
EventEmitter.call(this); | ||
|
||
// use nextTick to emit the event once a handler is assigned | ||
process.nextTick(function () { | ||
this.emit('event'); | ||
}.bind(this)); | ||
Is the
@evanlucas good catch. Arrow functions can't have their
Thanks @evanlucas ! replaced the arrow function with a plain-old anonymous function. |
||
} | ||
util.inherits(MyEmitter, EventEmitter); | ||
|
||
const myEmitter = new MyEmitter(); | ||
myEmitter.on('event', function() { | ||
console.log('an event occurred!'); | ||
}); | ||
``` |
/cc @piscisaureus Does this model adequately reflect reality?
The order in which certain types of callbacks are invoked is mostly determined by libuv, see https://github.com/libuv/libuv/blob/b015b4d0ee03b825d61b0b54e36aa405a8e1cf03/src/unix/core.c#L340-L366.
My take would be:
`socket.on('close', ...)`
process.nextTick and microtasks are processed after every invocation of a javascript function, so they essentially run as a "loop within a loop" with a higher priority.
There is also some little activities (handling the gracefully close) done between 7 and 1 in node: https://github.com/nodejs/node/blob/master/src/node.cc#L4207-L4221.
@sam-github @piscisaureus @mcollina
Again, the order of these steps is based upon this presentation by @trevnorris https://nodesource.com/resources
Should these refinements be considered show-stoppers? I'm leaning toward getting this landed and allowing those with more intimate knowledge and understanding to make some of the more specific adjustments. Thoughts?
I consider them show-stoppers.
The order of steps is highly important, because otherwise some behaviors cannot be explained, creating more confusion.
IMHO it is acceptable to simplify things in a presentation (you have a limited time), but this should be the 'ultimate' docs on the event loop, so nobody would have to dig through all the source to understand how things works. So, let's aim for correctness first, and then prepare a "newbie friendly" doc after (or leave it to 3rd-party blog posts).
@piscisaureus
That's not always true for some odd reason. The only time the `'close'` event is emitted in `uv__run_closing_handles()` is on `Socket#_destroy()`. Otherwise it simply runs in `nextTick()` (example `Server#_emitCloseIfDrained()` and subsequent call to `emitCloseNT()`).
@mcollina If you want more of an "ultimate" doc, here's one for reference: https://gist.github.com/trevnorris/05531d8339f8e265bd49
There are a couple updates needed (e.g. we run timers twice), but overall that represents the technical running of the event loop.
@trevnorris we are running timers twice to have more precise timings (as there might be some stuff happening between one run and the other) or is there another reason?
@mcollina It's because we run `uv_run()` using `UV_RUN_ONCE`. Which we started using for `'beforeExit'` support. The reason libuv does this is documented in `deps/uv/src/unix/core.c`