|
| 1 | +# Internationalization Support |
| 2 | + |
| 3 | +Node.js has many features that make it easier to write internationalized |
| 4 | +programs. Some of them are: |
| 5 | + |
| 6 | +- Locale-sensitive or Unicode-aware functions in the [ECMAScript Language |
| 7 | + Specification][ECMA-262]: |
| 8 | + - [`String.prototype.normalize()`][] |
| 9 | + - [`String.prototype.toLowerCase()`][] |
| 10 | + - [`String.prototype.toUpperCase()`][] |
| 11 | +- All functionality described in the [ECMAScript Internationalization API |
| 12 | + Specification][ECMA-402] (aka ECMA-402): |
| 13 | + - [`Intl`][] object |
| 14 | + - Locale-sensitive methods like [`String.prototype.localeCompare()`][] and |
| 15 | + [`Date.prototype.toLocaleString()`][] |
| 16 | +- The [WHATWG URL parser][]'s [internationalized domain names][] (IDNs) support |
| 17 | +- [`require('buffer').transcode()`][] |
| 18 | +- More accurate [REPL][] line editing |
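
As a rough illustration of a few of the features listed above, the following sketch exercises `String.prototype.normalize()`, `String.prototype.localeCompare()`, and `require('buffer').transcode()`. The sample strings are arbitrary, and the sketch assumes a `node` binary built with ICU enabled; on a build without ICU, `transcode()` would not exist.

```javascript
const { transcode } = require('buffer');

// Two spellings of the same character: one precomposed code point,
// or a plain "n" followed by a combining tilde.
const precomposed = '\u00F1';
const combining = 'n\u0303';

// The raw strings are different sequences of code units.
console.log(precomposed === combining);
// After NFC normalization they compare equal (assumes ICU is available).
console.log(precomposed === combining.normalize('NFC'));

// Sort with the locale-sensitive comparison function.
const words = ['b', 'a', 'c'];
words.sort((x, y) => x.localeCompare(y));
console.log(words);

// Re-encode the euro sign (U+20AC) from UTF-8 to UCS-2.
const utf8Euro = Buffer.from('\u20AC', 'utf8');
const ucs2Euro = transcode(utf8Euro, 'utf8', 'ucs2');
console.log(utf8Euro.length, ucs2Euro.length);
```
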
| 19 | + |
| 20 | +Node.js (and its underlying V8 engine) uses [ICU][] to implement these features |
| 21 | +in native C/C++ code. However, some of them require a very large ICU data file |
| 22 | +in order to support all locales of the world. Because most Node.js users are |
| 23 | +expected to use only a small portion of ICU functionality, Node.js provides |
| 24 | +only a subset of the full ICU data set by default. Several |
| 25 | +options are provided for customizing and expanding the ICU data set either when |
| 26 | +building or running Node.js. |
| 27 | + |
| 28 | +## Options for building Node.js |
| 29 | + |
| 30 | +To control how ICU is used in Node.js, four `configure` options are available |
| 31 | +during compilation. Additional details on how to compile Node.js are documented |
| 32 | +in [BUILDING.md][]. |
| 33 | + |
| 34 | +- `--with-intl=none` / `--without-intl` |
| 35 | +- `--with-intl=system-icu` |
| 36 | +- `--with-intl=small-icu` (default) |
| 37 | +- `--with-intl=full-icu` |
| 38 | + |
| 39 | +An overview of available Node.js and JavaScript features for each `configure` |
| 40 | +option: |
| 41 | + |
| 42 | +| | `none` | `system-icu` | `small-icu` | `full-icu` |
| 43 | +|-----------------------------------------|-----------------------------------|------------------------------|------------------------|------------ |
| 44 | +| [`String.prototype.normalize()`][] | none (function is no-op) | full | full | full |
| 45 | +| `String.prototype.to*Case()` | full | full | full | full |
| 46 | +| [`Intl`][] | none (object does not exist) | partial/full (depends on OS) | partial (English-only) | full |
| 47 | +| [`String.prototype.localeCompare()`][] | partial (not locale-aware) | full | full | full |
| 48 | +| `String.prototype.toLocale*Case()` | partial (not locale-aware) | full | full | full |
| 49 | +| [`Number.prototype.toLocaleString()`][] | partial (not locale-aware) | partial/full (depends on OS) | partial (English-only) | full |
| 50 | +| `Date.prototype.toLocale*String()` | partial (not locale-aware) | partial/full (depends on OS) | partial (English-only) | full |
| 51 | +| [WHATWG URL Parser][] | partial (no IDN support) | full | full | full |
| 52 | +| [`require('buffer').transcode()`][] | none (function does not exist) | full | full | full |
| 53 | +| [REPL][] | partial (inaccurate line editing) | full | full | full |
| 54 | + |
| 55 | +*Note*: The "(not locale-aware)" designation denotes that the function carries |
| 56 | +out its operation just like the non-`Locale` version of the function, if one |
| 57 | +exists. For example, under `none` mode, `Date.prototype.toLocaleString()`'s |
| 58 | +operation is identical to that of `Date.prototype.toString()`. |
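
Several rows of the table can be probed at runtime. The sketch below is one possible way to do so, not an official check: the `probes` object, the `idnSupported()` helper, and the `.example` host name are hypothetical names chosen for illustration. It also assumes that a parser with IDN support converts a non-ASCII host name to its `xn--` (Punycode) form.

```javascript
const buffer = require('buffer');
const url = require('url');

// Hypothetical helper: reports whether the WHATWG URL parser converts a
// non-ASCII host name to its ASCII (Punycode) form.
function idnSupported() {
  try {
    const parsed = new url.URL('http://espa\u00F1ol.example/');
    return parsed.hostname.startsWith('xn--');
  } catch (err) {
    return false;
  }
}

const probes = {
  // Under `none`, normalize() is a no-op, so the strings would stay different.
  'String.prototype.normalize()': 'n\u0303'.normalize('NFC') === '\u00F1',
  // Under `none`, the Intl object does not exist.
  'Intl': typeof Intl === 'object',
  // Under `none`, the transcode() function does not exist.
  "require('buffer').transcode()": typeof buffer.transcode === 'function',
  // Under `none`, the URL parser has no IDN support.
  'WHATWG URL parser (IDN)': idnSupported(),
};

console.log(probes);
```
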
| 59 | + |
| 60 | +### Disable all internationalization features (`none`) |
| 61 | + |
| 62 | +If this option is chosen, most internationalization features mentioned above |
| 63 | +will be **unavailable** in the resulting `node` binary. |
| 64 | + |
| 65 | +### Build with a pre-installed ICU (`system-icu`) |
| 66 | + |
| 67 | +Node.js can link against an ICU build already installed on the system. In fact, |
| 68 | +most Linux distributions already come with ICU installed, and this option would |
| 69 | +make it possible to reuse the same set of data used by other components in the |
| 70 | +OS. |
| 71 | + |
| 72 | +Features that require only the ICU library itself, such as |
| 73 | +[`String.prototype.normalize()`][] and the [WHATWG URL parser][], are fully |
| 74 | +supported under `system-icu`. Features that also require ICU locale data, |
| 75 | +such as [`Intl.DateTimeFormat`][], *may* be fully or partially |
| 76 | +supported, depending on the completeness of the ICU data installed on the |
| 77 | +system. |
| 78 | + |
| 79 | +### Embed a limited set of ICU data (`small-icu`) |
| 80 | + |
| 81 | +This option makes the resulting binary link against the ICU library statically, |
| 82 | +and includes a subset of ICU data (typically only the English locale) within |
| 83 | +the `node` executable. |
| 84 | + |
| 85 | +Features that require only the ICU library itself, such as |
| 86 | +[`String.prototype.normalize()`][] and the [WHATWG URL parser][], are fully |
| 87 | +supported under `small-icu`. Features that also require ICU locale data, |
| 88 | +such as [`Intl.DateTimeFormat`][], generally only work with the English locale: |
| 89 | + |
| 90 | +```js |
| 91 | +const january = new Date(9e8); |
| 92 | +const english = new Intl.DateTimeFormat('en', { month: 'long' }); |
| 93 | +const spanish = new Intl.DateTimeFormat('es', { month: 'long' }); |
| 94 | + |
| 95 | +console.log(english.format(january)); |
| 96 | +// Prints "January" |
| 97 | +console.log(spanish.format(january)); |
| 98 | +// Prints "M01" on small-icu |
| 99 | +// Should print "enero" |
| 100 | +``` |
| 101 | + |
| 102 | +This mode provides a good balance between features and binary size, and it is |
| 103 | +the default behavior if no `--with-intl` flag is passed. The official binaries |
| 104 | +are also built in this mode. |
| 105 | + |
| 106 | +#### Providing ICU data at runtime |
| 107 | + |
| 108 | +If the `small-icu` option is used, additional locale data can still be |
| 109 | +provided at runtime so that the JS methods work for all ICU locales. Assuming |
| 110 | +the data file is stored at `/some/directory`, it can be made available to ICU |
| 111 | +through either: |
| 112 | + |
| 113 | +* The [`NODE_ICU_DATA`][] environment variable: |
| 114 | + |
| 115 | + ```shell |
| 116 | + env NODE_ICU_DATA=/some/directory node |
| 117 | + ``` |
| 118 | + |
| 119 | +* The [`--icu-data-dir`][] CLI parameter: |
| 120 | + |
| 121 | + ```shell |
| 122 | + node --icu-data-dir=/some/directory |
| 123 | + ``` |
| 124 | + |
| 125 | +(If both are specified, the `--icu-data-dir` CLI parameter takes precedence.) |
| 126 | + |
| 127 | +ICU is able to automatically find and load a variety of data formats, but the |
| 128 | +data must be appropriate for the ICU version, and the file must be correctly named. |
| 129 | +The most common name for the data file is `icudt5X[bl].dat`, where `5X` denotes |
| 130 | +the intended ICU version, and `b` or `l` indicates the system's endianness. |
| 131 | +See the ["ICU Data"][] article in the ICU User Guide for other supported formats |
| 132 | +and more details on ICU data in general. |
| 133 | + |
| 134 | +The [full-icu][] npm module can greatly simplify ICU data installation by |
| 135 | +detecting the ICU version of the running `node` executable and downloading the |
| 136 | +appropriate data file. After installing the module through `npm i full-icu`, |
| 137 | +the data file will be available at `./node_modules/full-icu`. This path can |
| 138 | +then be passed either to `NODE_ICU_DATA` or `--icu-data-dir` as shown above to |
| 139 | +enable full `Intl` support. |
| 140 | + |
| 141 | +### Embed the entire ICU (`full-icu`) |
| 142 | + |
| 143 | +This option makes the resulting binary link against ICU statically and include |
| 144 | +a full set of ICU data. A binary created this way has no further external |
| 145 | +dependencies and supports all locales, but might be rather large. See |
| 146 | +[BUILDING.md][BUILDING.md#full-icu] on how to compile a binary using this mode. |
| 147 | + |
| 148 | +## Detecting internationalization support |
| 149 | + |
| 150 | +To verify that ICU is enabled at all (`system-icu`, `small-icu`, or |
| 151 | +`full-icu`), checking for the existence of `Intl` should suffice: |
| 152 | + |
| 153 | +```js |
| 154 | +const hasICU = typeof Intl === 'object'; |
| 155 | +``` |
| 156 | + |
| 157 | +Alternatively, checking for `process.versions.icu`, a property defined only |
| 158 | +when ICU is enabled, works too: |
| 159 | + |
| 160 | +```js |
| 161 | +const hasICU = typeof process.versions.icu === 'string'; |
| 162 | +``` |
| 163 | + |
| 164 | +To check for support for a non-English locale (i.e. `full-icu` or |
| 165 | +`system-icu`), [`Intl.DateTimeFormat`][] can be a good distinguishing factor: |
| 166 | + |
| 167 | +```js |
| 168 | +const hasFullICU = (() => { |
| 169 | + try { |
| 170 | + const january = new Date(9e8); |
| 171 | + const spanish = new Intl.DateTimeFormat('es', { month: 'long' }); |
| 172 | + return spanish.format(january) === 'enero'; |
| 173 | + } catch (err) { |
| 174 | + return false; |
| 175 | + } |
| 176 | +})(); |
| 177 | +``` |
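
The individual checks above can be combined into a single helper. The following is a minimal sketch under that assumption; the function name `describeIntlSupport()` and its three return labels are hypothetical and not part of any Node.js API.

```javascript
// Hypothetical helper that combines the checks shown above.
function describeIntlSupport() {
  // Neither `Intl` nor `process.versions.icu` is defined without ICU.
  if (typeof Intl !== 'object' || typeof process.versions.icu !== 'string') {
    return 'none';
  }
  try {
    // With only English data, the Spanish month name is not available.
    const january = new Date(9e8);
    const spanish = new Intl.DateTimeFormat('es', { month: 'long' });
    return spanish.format(january) === 'enero' ?
      'full locale data' :
      'English-only data';
  } catch (err) {
    return 'English-only data';
  }
}

console.log(describeIntlSupport());
```

A result of `'full locale data'` does not by itself distinguish `full-icu` from `system-icu` or from `small-icu` with data supplied at runtime; it only indicates that the Spanish locale data could be loaded.
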
| 178 | + |
| 179 | +For more verbose tests for `Intl` support, the following resources may be |
| 180 | +helpful: |
| 181 | + |
| 182 | +- [btest402][]: Generally used to check whether Node.js with `Intl` support is |
| 183 | + built correctly. |
| 184 | +- [Test262][]: ECMAScript's official conformance test suite includes a section |
| 185 | + dedicated to ECMA-402. |
| 186 | + |
| 187 | +[btest402]: https://github.com/srl295/btest402 |
| 188 | +[BUILDING.md]: https://github.com/nodejs/node/blob/master/BUILDING.md |
| 189 | +[BUILDING.md#full-icu]: https://github.com/nodejs/node/blob/master/BUILDING.md#build-with-full-icu-support-all-locales-supported-by-icu |
| 190 | +[`Date.prototype.toLocaleString()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Date/toLocaleString |
| 191 | +[ECMA-262]: https://tc39.github.io/ecma262/ |
| 192 | +[ECMA-402]: https://tc39.github.io/ecma402/ |
| 193 | +[full-icu]: https://www.npmjs.com/package/full-icu |
| 194 | +[ICU]: http://icu-project.org/ |
| 195 | +["ICU Data"]: http://userguide.icu-project.org/icudata |
| 196 | +[`--icu-data-dir`]: cli.html#cli_icu_data_dir_file |
| 197 | +[internationalized domain names]: https://en.wikipedia.org/wiki/Internationalized_domain_name |
| 198 | +[`Intl`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Intl |
| 199 | +[`Intl.DateTimeFormat`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/DateTimeFormat |
| 200 | +[`NODE_ICU_DATA`]: cli.html#cli_node_icu_data_file |
| 201 | +[`Number.prototype.toLocaleString()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/toLocaleString |
| 202 | +[REPL]: repl.html#repl_repl |
| 203 | +[`require('buffer').transcode()`]: buffer.html#buffer_buffer_transcode_source_fromenc_toenc |
| 204 | +[`String.prototype.localeCompare()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare |
| 205 | +[`String.prototype.normalize()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/normalize |
| 206 | +[`String.prototype.toLowerCase()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/toLowerCase |
| 207 | +[`String.prototype.toUpperCase()`]: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/toUpperCase |
| 208 | +[Test262]: https://github.com/tc39/test262/tree/master/test/intl402 |
| 209 | +[WHATWG URL parser]: url.html#url_the_whatwg_url_api |