This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit 9e6a4d6

TimothyGu authored and addaleax committed on Jul 18, 2017
doc: add documentation on ICU
PR-URL: #13916
Refs: #13644 (comment)
Reviewed-By: Vse Mozhet Byt <[email protected]>
1 parent 2a91d59 commit 9e6a4d6

3 files changed

+211
-0
lines changed


‎doc/api/_toc.md

+1
@@ -26,6 +26,7 @@
 * [HTTP](http.html)
 * [HTTPS](https.html)
 * [Inspector](inspector.html)
+* [Internationalization](intl.html)
 * [Modules](modules.html)
 * [Net](net.html)
 * [OS](os.html)

‎doc/api/all.md

+1
@@ -21,6 +21,7 @@
 @include http
 @include https
 @include inspector
+@include intl
 @include modules
 @include net
 @include os

‎doc/api/intl.md

+209
@@ -0,0 +1,209 @@
# Internationalization Support

Node.js has many features that make it easier to write internationalized
programs. Some of them are:

- Locale-sensitive or Unicode-aware functions in the [ECMAScript Language
  Specification][ECMA-262]:
  - [`String.prototype.normalize()`][]
  - [`String.prototype.toLowerCase()`][]
  - [`String.prototype.toUpperCase()`][]
- All functionality described in the [ECMAScript Internationalization API
  Specification][ECMA-402] (aka ECMA-402):
  - [`Intl`][] object
  - Locale-sensitive methods like [`String.prototype.localeCompare()`][] and
    [`Date.prototype.toLocaleString()`][]
- The [WHATWG URL parser][]'s [internationalized domain names][] (IDNs) support
- [`require('buffer').transcode()`][]
- More accurate [REPL][] line editing

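As a rough illustration of a few items in the list above, the following sketch exercises normalization, case mapping, and transcoding. The sample strings are chosen only for demonstration, and the normalization and transcoding results assume a `node` binary built with ICU.

```js
const { transcode } = require('buffer');

// 'A' followed by U+030A (combining ring above) is canonically equivalent
// to the single precomposed code point U+00C5.
const decomposed = 'A\u030A';
const composed = decomposed.normalize('NFC');
console.log(decomposed.length, composed.length);

// Case mapping is Unicode-aware: the German sharp s uppercases to 'SS'.
const upper = 'stra\u00DFe'.toUpperCase();
console.log(upper);

// transcode() is only defined when Node.js is built with ICU, so guard it.
const euroUtf16 = typeof transcode === 'function' ?
  transcode(Buffer.from('\u20AC'), 'utf8', 'ucs2') :
  null;
console.log(euroUtf16);
```

The guard around `transcode()` keeps the snippet from throwing on a binary built without internationalization support.
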
Node.js (and its underlying V8 engine) uses [ICU][] to implement these features
in native C/C++ code. However, some of them require a very large ICU data file
in order to support all locales of the world. Because it is expected that most
Node.js users will make use of only a small portion of ICU functionality, only
a subset of the full ICU data set is provided by Node.js by default. Several
options are provided for customizing and expanding the ICU data set either when
building or running Node.js.

## Options for building Node.js

To control how ICU is used in Node.js, four `configure` options are available
during compilation. Additional details on how to compile Node.js are documented
in [BUILDING.md][].

- `--with-intl=none` / `--without-intl`
- `--with-intl=system-icu`
- `--with-intl=small-icu` (default)
- `--with-intl=full-icu`

An overview of available Node.js and JavaScript features for each `configure`
option:

|                                         | `none`                            | `system-icu`                 | `small-icu`            | `full-icu`
|-----------------------------------------|-----------------------------------|------------------------------|------------------------|------------
| [`String.prototype.normalize()`][]      | none (function is no-op)          | full                         | full                   | full
| `String.prototype.to*Case()`            | full                              | full                         | full                   | full
| [`Intl`][]                              | none (object does not exist)      | partial/full (depends on OS) | partial (English-only) | full
| [`String.prototype.localeCompare()`][]  | partial (not locale-aware)        | full                         | full                   | full
| `String.prototype.toLocale*Case()`      | partial (not locale-aware)        | full                         | full                   | full
| [`Number.prototype.toLocaleString()`][] | partial (not locale-aware)        | partial/full (depends on OS) | partial (English-only) | full
| `Date.prototype.toLocale*String()`      | partial (not locale-aware)        | partial/full (depends on OS) | partial (English-only) | full
| [WHATWG URL Parser][]                   | partial (no IDN support)          | full                         | full                   | full
| [`require('buffer').transcode()`][]     | none (function does not exist)    | full                         | full                   | full
| [REPL][]                                | partial (inaccurate line editing) | full                         | full                   | full

*Note*: The "(not locale-aware)" designation denotes that the function carries
out its operation just like the non-`Locale` version of the function, if one
exists. For example, under `none` mode, `Date.prototype.toLocaleString()`'s
operation is identical to that of `Date.prototype.toString()`.

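The difference described in the note can be observed with a small comparison. This is only a sketch: the `en-US` locale and the `UTC` time zone are arbitrary choices, and the interpretation of the result relies on the behavior stated in the note.

```js
const date = new Date(9e8);

// With ICU, the locale-aware method formats according to the requested
// locale. Without ICU, the note above says it behaves like toString().
const localized = date.toLocaleString('en-US', { timeZone: 'UTC' });
const plain = date.toString();

const looksLocaleAware = localized !== plain;
console.log(looksLocaleAware ?
  'toLocaleString() is locale-aware' :
  'toLocaleString() matches toString() (likely built without ICU)');
```
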
### Disable all internationalization features (`none`)

If this option is chosen, most internationalization features mentioned above
will be **unavailable** in the resulting `node` binary.

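Code that may run on such a binary can guard its use of `Intl`. The helper below is a hypothetical sketch (the name `formatNumber` is not part of any Node.js API) that falls back to plain string conversion when `Intl` does not exist.

```js
// Hypothetical helper: formats a number with Intl when it is available and
// falls back to plain string conversion otherwise.
function formatNumber(value, locale) {
  if (typeof Intl === 'object' && typeof Intl.NumberFormat === 'function') {
    return new Intl.NumberFormat(locale).format(value);
  }
  return String(value);
}

console.log(formatNumber(1234.5, 'en-US'));
// Prints "1,234.5" when Intl is available, "1234.5" otherwise
```
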
### Build with a pre-installed ICU (`system-icu`)

Node.js can link against an ICU build already installed on the system. In fact,
most Linux distributions already come with ICU installed, and this option would
make it possible to reuse the same set of data used by other components in the
OS.

Features that only require the ICU library itself, such as
[`String.prototype.normalize()`][] and the [WHATWG URL parser][], are fully
supported under `system-icu`. Features that require ICU locale data in
addition, such as [`Intl.DateTimeFormat`][], *may* be fully or partially
supported, depending on the completeness of the ICU data installed on the
system.

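One way to see how complete the available locale data is would be to ask `Intl.DateTimeFormat.supportedLocalesOf()`, a standard ECMA-402 method. The helper name and the locale list below are illustrative assumptions, not part of Node.js.

```js
// Hypothetical helper: reports which of the requested locales the current
// ICU data claims to support for date and time formatting.
function localesWithDateData(locales) {
  if (typeof Intl !== 'object') return [];
  return Intl.DateTimeFormat.supportedLocalesOf(locales);
}

console.log(localesWithDateData(['en', 'es', 'ja']));
```

On a binary with complete ICU data, all three locales are expected to be reported. With less complete data, the returned list should be shorter.
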
### Embed a limited set of ICU data (`small-icu`)

This option makes the resulting binary link against the ICU library statically,
and includes a subset of ICU data (typically only the English locale) within
the `node` executable.

Features that only require the ICU library itself, such as
[`String.prototype.normalize()`][] and the [WHATWG URL parser][], are fully
supported under `small-icu`. Features that require ICU locale data in addition,
such as [`Intl.DateTimeFormat`][], generally only work with the English locale:

```js
const january = new Date(9e8);
const english = new Intl.DateTimeFormat('en', { month: 'long' });
const spanish = new Intl.DateTimeFormat('es', { month: 'long' });

console.log(english.format(january));
// Prints "January"
console.log(spanish.format(january));
// Prints "M01" on small-icu
// Should print "enero"
```

This mode provides a good balance between features and binary size, and it is
the default behavior if no `--with-intl` flag is passed. The official binaries
are also built in this mode.

#### Providing ICU data at runtime

If the `small-icu` option is used, one can still provide additional locale data
at runtime so that the JS methods work for all ICU locales. Assuming the
data file is stored at `/some/directory`, it can be made available to ICU
through either:

* The [`NODE_ICU_DATA`][] environment variable:

  ```shell
  env NODE_ICU_DATA=/some/directory node
  ```

* The [`--icu-data-dir`][] CLI parameter:

  ```shell
  node --icu-data-dir=/some/directory
  ```

(If both are specified, the `--icu-data-dir` CLI parameter takes precedence.)

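The same parameter can be passed when starting a child `node` process from JavaScript. The following is a minimal sketch: `buildIcuArgs` and `runWithIcuData` are hypothetical helper names, and what the child prints depends entirely on the data actually present in the given directory.

```js
const { spawnSync } = require('child_process');

// Builds the command-line arguments for a child `node` process that reads
// ICU data from `dataDir` and prints the Spanish long name of January.
function buildIcuArgs(dataDir) {
  const script =
    "console.log(new Intl.DateTimeFormat('es', { month: 'long' })" +
    '.format(new Date(9e8)))';
  return [`--icu-data-dir=${dataDir}`, '-e', script];
}

// Runs the child process synchronously and returns the spawnSync() result.
function runWithIcuData(dataDir) {
  return spawnSync(process.execPath, buildIcuArgs(dataDir), {
    encoding: 'utf8',
  });
}

// Example (not executed here):
// console.log(runWithIcuData('/some/directory').stdout);
```
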
ICU is able to automatically find and load a variety of data formats, but the
data must be appropriate for the ICU version, and the file must be correctly
named. The most common name for the data file is `icudt5X[bl].dat`, where `5X`
denotes the intended ICU version, and `b` or `l` indicates the system's
endianness. See the ["ICU Data"][] article in the ICU User Guide for other
supported formats and more details on ICU data in general.

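The naming convention described above can be sketched as a small function. `icuDataFileName` is a hypothetical helper, and it covers only the `b` and `l` variants mentioned in the text, under the assumption that the major component of `process.versions.icu` is the version used in the file name.

```js
const os = require('os');

// Hypothetical helper: derives the conventional data file name from an ICU
// version string (such as process.versions.icu) and an endianness marker
// as returned by os.endianness() ('LE' or 'BE').
function icuDataFileName(icuVersion, endianness) {
  const major = String(icuVersion).split('.')[0];
  const suffix = endianness === 'BE' ? 'b' : 'l';
  return `icudt${major}${suffix}.dat`;
}

if (process.versions.icu) {
  // Name expected for the running binary on this machine.
  console.log(icuDataFileName(process.versions.icu, os.endianness()));
}
console.log(icuDataFileName('59.1', 'LE'));
// Prints "icudt59l.dat"
```
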
The [full-icu][] npm module can greatly simplify ICU data installation by
detecting the ICU version of the running `node` executable and downloading the
appropriate data file. After installing the module through `npm i full-icu`,
the data file will be available at `./node_modules/full-icu`. This path can
then be passed either to `NODE_ICU_DATA` or `--icu-data-dir` as shown above to
enable full `Intl` support.

### Embed the entire ICU (`full-icu`)

This option makes the resulting binary link against ICU statically and include
a full set of ICU data. A binary created this way has no further external
dependencies and supports all locales, but might be rather large. See
[BUILDING.md][BUILDING.md#full-icu] on how to compile a binary using this mode.

## Detecting internationalization support

To verify that ICU is enabled at all (`system-icu`, `small-icu`, or
`full-icu`), simply checking the existence of `Intl` should suffice:

```js
const hasICU = typeof Intl === 'object';
```

Alternatively, checking for `process.versions.icu`, a property defined only
when ICU is enabled, works too:

```js
const hasICU = typeof process.versions.icu === 'string';
```

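Beyond `icu`, recent Node.js binaries built with ICU are understood to expose related entries in `process.versions`, such as `unicode`, `cldr`, and `tz`. The sketch below collects them, and treats their presence as an assumption rather than a guarantee. `icuVersions` is a hypothetical helper name.

```js
// Collects the ICU-related entries of process.versions. Each value is
// expected to be undefined on a binary built without ICU.
function icuVersions() {
  const { icu, unicode, cldr, tz } = process.versions;
  return { icu, unicode, cldr, tz };
}

console.log(icuVersions());
```
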
To check for support for a non-English locale (i.e. `full-icu` or
`system-icu`), [`Intl.DateTimeFormat`][] can be a good distinguishing factor:

```js
const hasFullICU = (() => {
  try {
    const january = new Date(9e8);
    const spanish = new Intl.DateTimeFormat('es', { month: 'long' });
    return spanish.format(january) === 'enero';
  } catch (err) {
    return false;
  }
})();
```

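The same idea extends to several locales at once. The following sketch is an assumption-laden generalization of the check above: the helper name `checkLocaleData` and the chosen locales are illustrative only.

```js
// Expected long month names for January. The locale list is illustrative.
const expectedJanuary = {
  en: 'January',
  es: 'enero',
  de: 'Januar',
  fr: 'janvier',
};

// Returns an object mapping each locale to whether formatting January
// produced the expected month name.
function checkLocaleData(expected) {
  const january = new Date(9e8);
  const result = {};
  for (const [locale, name] of Object.entries(expected)) {
    try {
      const formatter = new Intl.DateTimeFormat(locale, { month: 'long' });
      result[locale] = formatter.format(january) === name;
    } catch (err) {
      // Intl is missing or the locale could not be used.
      result[locale] = false;
    }
  }
  return result;
}

console.log(checkLocaleData(expectedJanuary));
```
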
For more verbose tests for `Intl` support, the following resources may be
helpful:

- [btest402][]: Generally used to check whether Node.js with `Intl` support is
  built correctly.
- [Test262][]: ECMAScript's official conformance test suite includes a section
  dedicated to ECMA-402.

[btest402]: https://github.com/srl295/btest402
[BUILDING.md]: https://github.com/nodejs/node/blob/master/BUILDING.md
[BUILDING.md#full-icu]: https://github.com/nodejs/node/blob/master/BUILDING.md#build-with-full-icu-support-all-locales-supported-by-icu
[`Date.prototype.toLocaleString()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Date/toLocaleString
[ECMA-262]: https://tc39.github.io/ecma262/
[ECMA-402]: https://tc39.github.io/ecma402/
[full-icu]: https://www.npmjs.com/package/full-icu
[ICU]: http://icu-project.org/
["ICU Data"]: http://userguide.icu-project.org/icudata
[`--icu-data-dir`]: cli.html#cli_icu_data_dir_file
[internationalized domain names]: https://en.wikipedia.org/wiki/Internationalized_domain_name
[`Intl`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Intl
[`Intl.DateTimeFormat`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/DateTimeFormat
[`NODE_ICU_DATA`]: cli.html#cli_node_icu_data_file
[`Number.prototype.toLocaleString()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/toLocaleString
[REPL]: repl.html#repl_repl
[`require('buffer').transcode()`]: buffer.html#buffer_buffer_transcode_source_fromenc_toenc
[`String.prototype.localeCompare()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare
[`String.prototype.normalize()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
[`String.prototype.toLowerCase()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/toLowerCase
[`String.prototype.toUpperCase()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/toUpperCase
[Test262]: https://github.com/tc39/test262/tree/master/test/intl402
[WHATWG URL parser]: url.html#url_the_whatwg_url_api
