Commit a5e3edd

jasnell authored and rvagg committed
doc: general improvements to url.md copy
General cleanup and restructuring of the doc. Added additional detail to how URLs are serialized.

PR-URL: #6904
Reviewed-By: Robert Jefe Lindstaedt <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Sakthipriyan Vairamani <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Brian White <[email protected]>
1 parent b7ca0a2 commit a5e3edd

1 file changed

+191
-82
lines changed

doc/api/url.md

@@ -2,139 +2,248 @@
22

33
Stability: 2 - Stable
44

5-
This module has utilities for URL resolution and parsing.
6-
Call `require('url')` to use it.
5+
The `url` module provides utilities for URL resolution and parsing. It can be
6+
accessed using:
77

8-
## URL Parsing
8+
```js
9+
const url = require('url');
10+
```
911

10-
Parsed URL objects have some or all of the following fields, depending on
11-
whether or not they exist in the URL string. Any parts that are not in the URL
12-
string will not be in the parsed object. Examples are shown for the URL
12+
## URL Strings and URL Objects
1313

14-
`'http://user:pass@host.com:8080/p/a/t/h?query=string#hash'`
14+
A URL string is a structured string containing multiple meaningful components.
15+
When parsed, a URL object is returned containing properties for each of these
16+
components.
1517

16-
* `href`: The full URL that was originally parsed. Both the protocol and host are lowercased.
18+
The following details each of the components of a parsed URL. The example
19+
`'http://user:pass@host.com:8080/p/a/t/h?query=string#hash'` is used to
20+
illustrate each.
1721

18-
Example: `'http://user:pass@host.com:8080/p/a/t/h?query=string#hash'`
22+
```
+---------------------------------------------------------------------------+
|                                   href                                    |
+----------++-----------+-----------------+-------------------------+-------+
| protocol ||   auth    |      host       |          path           | hash  |
|          ||           +----------+------+----------+--------------+       |
|          ||           | hostname | port | pathname |    search    |       |
|          ||           |          |      |          +-+------------+       |
|          ||           |          |      |          | |   query    |       |
"  http:   // user:pass @ host.com : 8080   /p/a/t/h  ? query=string  #hash "
|          ||           |          |      |          | |            |       |
+----------++-----------+----------+------+----------+-+------------+-------+
(all spaces in the "" line should be ignored -- they're purely for formatting)
35+
```
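
As a rough illustration of the component breakdown, the following sketch parses the example URL with `url.parse()` and reads back the properties described in this section. The variable name `parsed` is only for illustration; the expected values are the ones given in the property descriptions.

```js
const url = require('url');

// Parse the example URL used throughout this section.
const parsed = url.parse(
  'http://user:pass@host.com:8080/p/a/t/h?query=string#hash'
);

console.log(parsed.protocol); // 'http:'
console.log(parsed.slashes);  // true
console.log(parsed.auth);     // 'user:pass'
console.log(parsed.host);     // 'host.com:8080'
console.log(parsed.hostname); // 'host.com'
console.log(parsed.port);     // '8080' (a string, not a number)
console.log(parsed.pathname); // '/p/a/t/h'
console.log(parsed.search);   // '?query=string'
console.log(parsed.path);     // '/p/a/t/h?query=string'
console.log(parsed.query);    // 'query=string'
console.log(parsed.hash);     // '#hash'
```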
1936

20-
* `protocol`: The request protocol, lowercased.
37+
### urlObject.href
2138

22-
Example: `'http:'`
39+
The `href` property is the full URL string that was parsed with both the
40+
`protocol` and `host` components converted to lower-case.
2341

24-
* `slashes`: The protocol requires slashes after the colon.
42+
For example: `'http://user:pass@host.com:8080/p/a/t/h?query=string#hash'`
2543

26-
Example: true or false
44+
### urlObject.protocol
2745

28-
* `host`: The full lowercased host portion of the URL, including port
29-
information.
46+
The `protocol` property identifies the URL's lower-cased protocol scheme.
3047

31-
Example: `'host.com:8080'`
48+
For example: `'http:'`
3249

33-
* `auth`: The authentication information portion of a URL.
50+
### urlObject.slashes
3451

35-
Example: `'user:pass'`
52+
The `slashes` property is a `boolean` with a value of `true` if two ASCII
53+
forward-slash characters (`/`) are required following the colon in the
54+
`protocol`.
3655

37-
* `hostname`: Just the lowercased hostname portion of the host.
56+
### urlObject.host
3857

39-
Example: `'host.com'`
58+
The `host` property is the full lower-cased host portion of the URL, including
59+
the `port` if specified.
4060

41-
* `port`: The port number portion of the host.
61+
For example: `'host.com:8080'`
4262

43-
Example: `'8080'`
63+
### urlObject.auth
4464

45-
* `pathname`: The path section of the URL, that comes after the host and
46-
before the query, including the initial slash if present. No decoding is
47-
performed.
65+
The `auth` property is the username and password portion of the URL, also
66+
referred to as "userinfo". This string subset follows the `protocol` and
67+
double slashes (if present) and precedes the `host` component, delimited by an
68+
ASCII "at sign" (`@`). The format of the string is `{username}[:{password}]`,
69+
with the `[:{password}]` portion being optional.
4870

49-
Example: `'/p/a/t/h'`
71+
For example: `'user:pass'`
5072

51-
* `search`: The 'query string' portion of the URL, including the leading
52-
question mark.
73+
### urlObject.hostname
5374

54-
Example: `'?query=string'`
75+
The `hostname` property is the lower-cased host name portion of the `host`
76+
component *without* the `port` included.
5577

56-
* `path`: Concatenation of `pathname` and `search`. No decoding is performed.
78+
For example: `'host.com'`
5779

58-
Example: `'/p/a/t/h?query=string'`
80+
### urlObject.port
5981

60-
* `query`: Either the 'params' portion of the query string, or a
61-
querystring-parsed object.
82+
The `port` property is the numeric port portion of the `host` component.
6283

63-
Example: `'query=string'` or `{'query':'string'}`
84+
For example: `'8080'`
6485

65-
* `hash`: The 'fragment' portion of the URL including the pound-sign.
86+
### urlObject.pathname
6687

67-
Example: `'#hash'`
88+
The `pathname` property consists of the entire path section of the URL. This
89+
is everything following the `host` (including the `port`) and before the start
90+
of the `query` or `hash` components, delimited by either the ASCII question
91+
mark (`?`) or hash (`#`) characters.
6892

69-
### Escaped Characters
93+
For example: `'/p/a/t/h'`
7094

71-
Spaces (`' '`) and the following characters will be automatically escaped in the
72-
properties of URL objects:
95+
No decoding of the path string is performed.
7396

74-
```
75-
< > " ` \r \n \t { } | \ ^ '
76-
```
97+
### urlObject.search
98+
99+
The `search` property consists of the entire "query string" portion of the
100+
URL, including the leading ASCII question mark (`?`) character.
101+
102+
For example: `'?query=string'`
103+
104+
No decoding of the query string is performed.
105+
106+
### urlObject.path
107+
108+
The `path` property is a concatenation of the `pathname` and `search`
109+
components.
110+
111+
For example: `'/p/a/t/h?query=string'`
112+
113+
No decoding of the `path` is performed.
114+
115+
### urlObject.query
116+
117+
The `query` property is either the "params" portion of the query string
118+
(everything *except* the leading ASCII question mark (`?`)), or an object
119+
returned by the [`querystring`][] module's `parse()` method:
77120

78-
---
121+
For example: `'query=string'` or `{'query': 'string'}`
79122

80-
The following methods are provided by the URL module:
123+
If returned as a string, no decoding of the query string is performed. If
124+
returned as an object, both keys and values are decoded.
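
The difference between the two forms of `query` can be sketched as follows. The input URL and the variable names are made up for this illustration; the second argument to `url.parse()` is the `parseQueryString` flag.

```js
const url = require('url');

// Hypothetical input with a percent-encoded space in the value.
const input = 'http://host.com/p?name=a%20b';

// Without parseQueryString, query is the raw, undecoded string.
const asString = url.parse(input).query;
console.log(asString); // 'name=a%20b'

// With parseQueryString set to true, query is an object whose
// keys and values have been decoded by querystring.parse().
const asObject = url.parse(input, true).query;
console.log(asObject.name); // 'a b'
```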
81125

82-
## url.format(urlObj)
126+
### urlObject.hash
127+
128+
The `hash` property consists of the "fragment" portion of the URL including
129+
the leading ASCII hash (`#`) character.
130+
131+
For example: `'#hash'`
132+
133+
## url.format(urlObject)
83134
<!-- YAML
84135
added: v0.1.25
85136
-->
86137

87-
Take a parsed URL object, and return a formatted URL string.
88-
89-
Here's how the formatting process works:
90-
91-
* `href` will be ignored.
92-
* `path` will be ignored.
93-
* `protocol` is treated the same with or without the trailing `:` (colon).
94-
* The protocols `http`, `https`, `ftp`, `gopher`, `file` will be
95-
postfixed with `://` (colon-slash-slash) as long as `host`/`hostname` are present.
96-
* All other protocols `mailto`, `xmpp`, `aim`, `sftp`, `foo`, etc will
97-
be postfixed with `:` (colon).
98-
* `slashes` set to `true` if the protocol requires `://` (colon-slash-slash)
99-
* Only needs to be set for protocols not previously listed as requiring
100-
slashes, such as `mongodb://localhost:8000/`, or if `host`/`hostname` are absent.
101-
* `auth` will be used if present.
102-
* `hostname` will only be used if `host` is absent.
103-
* `port` will only be used if `host` is absent.
104-
* `host` will be used in place of `hostname` and `port`.
105-
* `pathname` is treated the same with or without the leading `/` (slash).
106-
* `query` (object; see `querystring`) will only be used if `search` is absent.
107-
* `search` will be used in place of `query`.
108-
* It is treated the same with or without the leading `?` (question mark).
109-
* `hash` is treated the same with or without the leading `#` (pound sign, anchor).
110-
111-
## url.parse(urlStr[, parseQueryString][, slashesDenoteHost])
138+
* `urlObject` {Object} A URL object (either as returned by `url.parse()` or
139+
constructed otherwise).
140+
141+
The `url.format()` method processes the given URL object and returns a formatted
142+
URL string.
143+
144+
The formatting process essentially operates as follows:
145+
146+
* A new empty string `result` is created.
147+
* If `urlObject.protocol` is a string, it is appended as-is to `result`.
148+
* Otherwise, if `urlObject.protocol` is not `undefined` and is not a string, an
149+
[`Error`][] is thrown.
150+
* For all string values of `urlObject.protocol` that *do not end* with an ASCII
151+
colon (`:`) character, the literal string `:` will be appended to `result`.
152+
* If either the `urlObject.slashes` property is true, `urlObject.protocol`
153+
begins with one of `http`, `https`, `ftp`, `gopher`, or `file`, or
154+
`urlObject.protocol` is `undefined`, the literal string `//` will be appended
155+
to `result`.
156+
* If the value of the `urlObject.auth` property is truthy, and either
157+
`urlObject.host` or `urlObject.hostname` are not `undefined`, the value of
158+
`urlObject.auth` will be coerced into a string and appended to `result`
159+
followed by the literal string `@`.
160+
* If the `urlObject.host` property is `undefined` then:
161+
* If the `urlObject.hostname` is a string, it is appended to `result`.
162+
* Otherwise, if `urlObject.hostname` is not `undefined` and is not a string,
163+
an [`Error`][] is thrown.
164+
* If the `urlObject.port` property value is truthy, and `urlObject.hostname`
165+
is not `undefined`:
166+
* The literal string `:` is appended to `result`, and
167+
* The value of `urlObject.port` is coerced to a string and appended to
168+
`result`.
169+
* Otherwise, if the `urlObject.host` property value is truthy, the value of
170+
`urlObject.host` is coerced to a string and appended to `result`.
171+
* If the `urlObject.pathname` property is a string that is not an empty string:
172+
* If the `urlObject.pathname` *does not start* with an ASCII forward slash
173+
(`/`), then the literal string `/` is appended to `result`.
174+
* The value of `urlObject.pathname` is appended to `result`.
175+
* Otherwise, if `urlObject.pathname` is not `undefined` and is not a string, an
176+
[`Error`][] is thrown.
177+
* If the `urlObject.search` property is `undefined` and if the `urlObject.query`
178+
property is an `Object`, the literal string `?` is appended to `result`
179+
followed by the output of calling the [`querystring`][] module's `stringify()`
180+
method passing the value of `urlObject.query`.
181+
* Otherwise, if `urlObject.search` is a string:
182+
* If the value of `urlObject.search` *does not start* with the ASCII question
183+
mark (`?`) character, the literal string `?` is appended to `result`.
184+
* The value of `urlObject.search` is appended to `result`.
185+
* Otherwise, if `urlObject.search` is not `undefined` and is not a string, an
186+
[`Error`][] is thrown.
187+
* If the `urlObject.hash` property is a string:
188+
* If the value of `urlObject.hash` *does not start* with the ASCII hash (`#`)
189+
character, the literal string `#` is appended to `result`.
190+
* The value of `urlObject.hash` is appended to `result`.
191+
* Otherwise, if the `urlObject.hash` property is not `undefined` and is not a
192+
string, an [`Error`][] is thrown.
193+
* `result` is returned.
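
The steps above can be sketched with a small example. The object below is invented for illustration and deliberately omits the trailing colon on `protocol`, the leading slash on `pathname`, and the leading `#` on `hash`, so that the corresponding steps are exercised.

```js
const url = require('url');

// A hand-built URL object (not produced by url.parse()).
const formatted = url.format({
  protocol: 'https',             // no trailing ':'; one is appended
  hostname: 'example.com',       // used because `host` is undefined
  port: 8080,                    // coerced to a string after ':'
  pathname: 'some/path',         // no leading '/'; one is prepended
  query: { page: 1, q: 'node' }, // used because `search` is undefined
  hash: 'top'                    // no leading '#'; one is prepended
});

console.log(formatted);
// 'https://example.com:8080/some/path?page=1&q=node#top'
```

Because `https` is one of the listed protocols, `//` is appended even though `slashes` is not set.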
194+
195+
196+
## url.parse(urlString[, parseQueryString[, slashesDenoteHost]])
112197
<!-- YAML
113198
added: v0.1.25
114199
-->
115200

116-
Take a URL string, and return an object.
117-
118-
Pass `true` as the second argument to also parse the query string using the
119-
`querystring` module. If `true` then the `query` property will always be
120-
assigned an object, and the `search` property will always be a (possibly
121-
empty) string. If `false` then the `query` property will not be parsed or
122-
decoded. Defaults to `false`.
201+
* `urlString` {string} The URL string to parse.
202+
* `parseQueryString` {boolean} If `true`, the `query` property will always
203+
be set to an object returned by the [`querystring`][] module's `parse()`
204+
method. If `false`, the `query` property on the returned URL object will be an
205+
unparsed, undecoded string. Defaults to `false`.
206+
* `slashesDenoteHost` {boolean} If `true`, the first token after the literal
207+
string `//` and preceding the next `/` will be interpreted as the `host`.
208+
For instance, given `//foo/bar`, the result would be
209+
`{host: 'foo', pathname: '/bar'}` rather than `{pathname: '//foo/bar'}`.
210+
Defaults to `false`.
123211

124-
Pass `true` as the third argument to treat `//foo/bar` as
125-
`{ host: 'foo', pathname: '/bar' }` rather than
126-
`{ pathname: '//foo/bar' }`. Defaults to `false`.
212+
The `url.parse()` method takes a URL string, parses it, and returns a URL
213+
object.
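
A minimal sketch of the `slashesDenoteHost` argument, using the `//foo/bar` input from the parameter description (the variable names are illustrative only):

```js
const url = require('url');

// Third argument true: the token after '//' is treated as the host.
const withHost = url.parse('//foo/bar', false, true);
console.log(withHost.host);     // 'foo'
console.log(withHost.pathname); // '/bar'

// Default behavior: the whole string is treated as the pathname.
const withoutHost = url.parse('//foo/bar');
console.log(withoutHost.pathname); // '//foo/bar'
```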
127214

128215
## url.resolve(from, to)
129216
<!-- YAML
130217
added: v0.1.25
131218
-->
132219

133-
Take a base URL, and a href URL, and resolve them as a browser would for
134-
an anchor tag. Examples:
220+
* `from` {string} The Base URL being resolved against.
221+
* `to` {string} The HREF URL being resolved.
222+
223+
The `url.resolve()` method resolves a target URL relative to a base URL in a
224+
manner similar to that of a Web browser resolving an anchor tag HREF.
225+
226+
For example:
135227

136228
```js
137229
url.resolve('/one/two/three', 'four') // '/one/two/four'
138230
url.resolve('http://example.com/', '/one') // 'http://example.com/one'
139231
url.resolve('http://example.com/one', '/two') // 'http://example.com/two'
140232
```
233+
234+
## Escaped Characters
235+
236+
URLs are only permitted to contain a certain range of characters. Spaces (`' '`)
237+
and the following characters will be automatically escaped in the
238+
properties of URL objects:
239+
240+
```
241+
< > " ` \r \n \t { } | \ ^ '
242+
```
243+
244+
For example, the ASCII space character (`' '`) is encoded as `%20`. The ASCII
245+
less-than (`<`) character is encoded as `%3C`.
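
As a small sketch of this escaping, the following parses a made-up URL whose path contains a space and a less-than sign and inspects the resulting properties:

```js
const url = require('url');

// Hypothetical input containing two characters from the list above.
const escaped = url.parse('http://example.com/a b<c');

console.log(escaped.pathname); // '/a%20b%3Cc'
console.log(escaped.href);     // 'http://example.com/a%20b%3Cc'
```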
246+
247+
248+
[`Error`]: errors.html#errors_class_error
249+
[`querystring`]: querystring.html
