Commit d5fd88c

Rollup merge of #118194 - notriddle:notriddle/tuple-unit, r=GuillaumeGomez
rustdoc: search for tuples and unit by type with `()`

This feature extends rustdoc to support the syntax that most users will naturally attempt to use to search for tuples. Part of #60485

Function signature searches already support tuples and unit. The explicit names `primitive:tuple` and `primitive:unit` can be used to match a tuple or unit, while `()` will match either one. It also follows the direction set by the actual language for parens as a group, so `(u8,)` will only match a tuple, while `(u8)` will match a plain, unwrapped byte. Thanks to loose search semantics, `(u8)` will also match the tuple.

## Preview

* [`option<t>, option<u> -> (t, u)`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=option%3Ct%3E%2C option%3Cu%3E -%3E (t%2C u)>)
* [`[t] -> (t,)`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=[t] -%3E (t%2C)>)
* [`(ipaddr,) -> socketaddr`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=(ipaddr%2C) -%3E socketaddr>)

## Motivation

When type-based search was first landed, it was directly [described as incomplete][a comment].

[a comment]: #23289 (comment)

Filling out the missing functionality is going to mean adding support for more of Rust's [type expression] syntax, such as tuples (in this PR), references, raw pointers, function pointers, and closures.

[type expression]: https://doc.rust-lang.org/reference/types.html#type-expressions

There does seem to be demand for this sort of thing, such as [this Discord message](https://discord.com/channels/442252698964721669/443150878111694848/1042145740065099796) expressing regret at rustdoc not supporting tuples in search queries.
## Reference description (from the Rustdoc book)

<table>
<thead>
<tr> <th>Shorthand</th> <th>Explicit names</th> </tr>
</thead>
<tbody>
<tr><td colspan="2">Before this PR</td></tr>
<tr> <td><code>[]</code></td> <td><code>primitive:slice</code> and/or <code>primitive:array</code></td> </tr>
<tr> <td><code>[T]</code></td> <td><code>primitive:slice&lt;T&gt;</code> and/or <code>primitive:array&lt;T&gt;</code></td> </tr>
<tr> <td><code>!</code></td> <td><code>primitive:never</code></td> </tr>
<tr><td colspan="2">After this PR</td></tr>
<tr> <td><code>()</code></td> <td><code>primitive:unit</code> and/or <code>primitive:tuple</code></td> </tr>
<tr> <td><code>(T)</code></td> <td><code>T</code></td> </tr>
<tr> <td><code>(T,)</code></td> <td><code>primitive:tuple&lt;T&gt;</code></td> </tr>
</tbody>
</table>

A single type expression wrapped in parens is the same as that type expression, since parens act as the grouping operator. If they're empty, though, they will match both `unit` and `tuple`, and if there's more than one type (or a trailing or leading comma) it is the same as `primitive:tuple<...>`.

However, since items can be left out of the query, `(T)` will still return results for types that match tuples, even though it also matches the type on its own. That is, `(u32)` matches `(u32,)` for the exact same reason that it also matches `Result<u32, Error>`.
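The three paren forms in the table can be illustrated with a minimal sketch. `desugarParens` is a hypothetical helper, not a function in rustdoc's `search.js`; it assumes the parser has already split the group into items and recorded whether a top-level comma was seen, which is roughly what the `foundSeparator` flag in this PR's diff tracks.

```javascript
// Hypothetical helper (not part of rustdoc): map a parenthesized group to the
// explicit names from the table above.
//   items          - the type names found between the parens
//   foundSeparator - true if a comma appeared at the top level of the group
function desugarParens(items, foundSeparator) {
    if (items.length === 0) {
        // `()` matches either unit or any tuple.
        return ["primitive:unit", "primitive:tuple"];
    }
    if (items.length === 1 && !foundSeparator) {
        // `(T)` is plain grouping, so it is the same query as `T`.
        return [items[0]];
    }
    // `(T,)` or `(T, U)`: an actual tuple.
    return [`primitive:tuple<${items.join(", ")}>`];
}

console.log(desugarParens([], false));        // unit or tuple
console.log(desugarParens(["u8"], false));    // just u8
console.log(desugarParens(["u8"], true));     // a one-element tuple
console.log(desugarParens(["t", "u"], true)); // a two-element tuple
```

The sketch only covers the syntactic mapping. The loose matching described above, where `(u32)` also finds `(u32,)`, comes from the search algorithm rather than from the parser.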
## Future direction

The [type expression grammar](https://doc.rust-lang.org/reference/types.html#type-expressions) from the Reference is given below:

<pre><code>Syntax
Type :
      TypeNoBounds
   | <a href="https://doc.rust-lang.org/reference/types/impl-trait.html">ImplTraitType</a>
   | <a href="https://doc.rust-lang.org/reference/types/trait-object.html">TraitObjectType</a>
<br>
TypeNoBounds :
      <a href="https://doc.rust-lang.org/reference/types.html#parenthesized-types">ParenthesizedType</a>
   | <a href="https://doc.rust-lang.org/reference/types/impl-trait.html">ImplTraitTypeOneBound</a>
   | <a href="https://doc.rust-lang.org/reference/types/trait-object.html">TraitObjectTypeOneBound</a>
   | <a href="https://doc.rust-lang.org/reference/paths.html#paths-in-types">TypePath</a>
   | <a href="https://doc.rust-lang.org/reference/types/tuple.html#tuple-types">TupleType</a>
   | <a href="https://doc.rust-lang.org/reference/types/never.html">NeverType</a>
   | <a href="https://doc.rust-lang.org/reference/types/pointer.html#raw-pointers-const-and-mut">RawPointerType</a>
   | <a href="https://doc.rust-lang.org/reference/types/pointer.html#shared-references-">ReferenceType</a>
   | <a href="https://doc.rust-lang.org/reference/types/array.html">ArrayType</a>
   | <a href="https://doc.rust-lang.org/reference/types/slice.html">SliceType</a>
   | <a href="https://doc.rust-lang.org/reference/types/inferred.html">InferredType</a>
   | <a href="https://doc.rust-lang.org/reference/paths.html#qualified-paths">QualifiedPathInType</a>
   | <a href="https://doc.rust-lang.org/reference/types/function-pointer.html">BareFunctionType</a>
   | <a href="https://doc.rust-lang.org/reference/macros.html#macro-invocation">MacroInvocation</a>
</code></pre>

ImplTraitType and TraitObjectType (and ImplTraitTypeOneBound and TraitObjectTypeOneBound) are not yet implemented. They would mostly desugar to `trait:`, similarly to how `!` desugars to `primitive:never`.

ParenthesizedType and TupleType are added in this PR.
TypePath is already implemented (except const generics, which is not planned, and function-like trait syntax, which is planned as part of closure support).

NeverType is already implemented.

RawPointerType and ReferenceType require parsing and fixes to the search index to store this information, but otherwise their behavior seems simple enough. Just like tuples and slices, `&T` would be equivalent to `primitive:reference<T>`, `&mut T` would be equivalent to `primitive:reference<keyword:mut, T>`, `*T` would be equivalent to `primitive:pointer<T>`, `*mut T` would be equivalent to `primitive:pointer<keyword:mut, T>`, and `*const T` would be equivalent to `primitive:pointer<keyword:const, T>`. Lifetime generics support is not planned, because lifetime subtyping seems too complicated.

ArrayType is subsumed by SliceType right now. Implementing const generics is not planned, because it seems like it would require a lot of implementation complexity for not much gain.

InferredType isn't really covered right now. Its semantics in a search context are not obvious.

QualifiedPathInType is not implemented, and it is not planned. I would need a use case to justify it, and act as a guide for what the exact semantics should be.

BareFunctionType is not implemented. Along with function-like trait syntax, which is formally considered a TypePath, it's the biggest missing feature to be able to do structured searches over generic APIs like `Option`.

MacroInvocation is not parsed (macro names are, but they don't mean the same thing here at all). Those are gone by the time Rustdoc sees the source code.
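The proposed reference and pointer desugaring can be written down as a small lookup, purely as an illustration. None of this is implemented by this PR. `proposedExplicitName` and `PROPOSED_PREFIXES` are hypothetical names, and the sketch handles only one outermost sigil on a plain string rather than working on a parsed query.

```javascript
// Hypothetical illustration of the proposed (not implemented) desugaring.
// Longer prefixes come first so that `&mut T` is not mistaken for `&` + `mut T`.
const PROPOSED_PREFIXES = [
    ["&mut ", "primitive:reference<keyword:mut, "],
    ["&", "primitive:reference<"],
    ["*mut ", "primitive:pointer<keyword:mut, "],
    ["*const ", "primitive:pointer<keyword:const, "],
    ["*", "primitive:pointer<"],
];

function proposedExplicitName(typeExpr) {
    const expr = typeExpr.trim();
    for (const [prefix, open] of PROPOSED_PREFIXES) {
        if (expr.startsWith(prefix)) {
            // Wrap whatever follows the sigil; nested sigils are left untouched.
            return open + expr.slice(prefix.length).trim() + ">";
        }
    }
    // No reference or pointer sigil: the expression is already explicit.
    return expr;
}

console.log(proposedExplicitName("&mut T"));
console.log(proposedExplicitName("*const T"));
```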
2 parents efb3f11 + e74339b commit d5fd88c

9 files changed (+596 -54 lines changed)

src/doc/rustdoc/src/read-documentation/search.md

+36 -11
@@ -147,15 +147,38 @@ will match these queries:
 * `Read -> Result<Vec<u8>, Error>`
 * `Read -> Result<Error, Vec>`
 * `Read -> Result<Vec<u8>>`
+* `Read -> u8`

 But it *does not* match `Result<Vec, u8>` or `Result<u8<Vec>>`.

-Function signature searches also support arrays and slices. The explicit name
-`primitive:slice<u8>` and `primitive:array<u8>` can be used to match a slice
-or array of bytes, while square brackets `[u8]` will match either one. Empty
-square brackets, `[]`, will match any slice or array regardless of what
-it contains, while a slice with a type parameter, like `[T]`, will only match
-functions that actually operate on generic slices.
+### Primitives with Special Syntax
+
+| Shorthand | Explicit names |
+| --------- | ------------------------------------------------ |
+| `[]` | `primitive:slice` and/or `primitive:array` |
+| `[T]` | `primitive:slice<T>` and/or `primitive:array<T>` |
+| `()` | `primitive:unit` and/or `primitive:tuple` |
+| `(T)` | `T` |
+| `(T,)` | `primitive:tuple<T>` |
+| `!` | `primitive:never` |
+
+When searching for `[]`, Rustdoc will return search results with either slices
+or arrays. If you know which one you want, you can force it to return results
+for `primitive:slice` or `primitive:array` using the explicit name syntax.
+Empty square brackets, `[]`, will match any slice or array regardless of what
+it contains, or an item type can be provided, such as `[u8]` or `[T]`, to
+explicitly find functions that operate on byte slices or generic slices,
+respectively.
+
+A single type expression wrapped in parens is the same as that type expression,
+since parens act as the grouping operator. If they're empty, though, they will
+match both `unit` and `tuple`, and if there's more than one type (or a trailing
+or leading comma) it is the same as `primitive:tuple<...>`.
+
+However, since items can be left out of the query, `(T)` will still return
+results for types that match tuples, even though it also matches the type on
+its own. That is, `(u32)` matches `(u32,)` for the exact same reason that it
+also matches `Result<u32, Error>`.

 ### Limitations and quirks of type-based search

@@ -188,11 +211,10 @@ Most of these limitations should be addressed in future version of Rustdoc.
 that you don't want a type parameter, you can force it to match
 something else by giving it a different prefix like `struct:T`.

-* It's impossible to search for references, pointers, or tuples. The
+* It's impossible to search for references or pointers. The
 wrapped types can be searched for, so a function that takes `&File` can
 be found with `File`, but you'll get a parse error when typing an `&`
-into the search field. Similarly, `Option<(T, U)>` can be matched with
-`Option<T, U>`, but `(` will give a parse error.
+into the search field.

 * Searching for lifetimes is not supported.

@@ -216,8 +238,9 @@ Item filters can be used in both name-based and type signature-based searches.
 ```text
 ident = *(ALPHA / DIGIT / "_")
 path = ident *(DOUBLE-COLON ident) [!]
-slice = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
-arg = [type-filter *WS COLON *WS] (path [generics] / slice / [!])
+slice-like = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
+tuple-like = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN
+arg = [type-filter *WS COLON *WS] (path [generics] / slice-like / tuple-like / [!])
 type-sep = COMMA/WS *(COMMA/WS)
 nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep)
 generic-arg-list = *(type-sep) arg [ EQUAL arg ] *(type-sep arg [ EQUAL arg ]) *(type-sep)
@@ -263,6 +286,8 @@ OPEN-ANGLE-BRACKET = "<"
 CLOSE-ANGLE-BRACKET = ">"
 OPEN-SQUARE-BRACKET = "["
 CLOSE-SQUARE-BRACKET = "]"
+OPEN-PAREN = "("
+CLOSE-PAREN = ")"
 COLON = ":"
 DOUBLE-COLON = "::"
 QUOTE = %x22

src/librustdoc/html/render/search_index.rs

+3
@@ -566,6 +566,9 @@ fn get_index_type_id(
 // The type parameters are converted to generics in `simplify_fn_type`
 clean::Slice(_) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Slice)),
 clean::Array(_, _) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Array)),
+clean::Tuple(ref n) if n.is_empty() => {
+Some(RenderTypeId::Primitive(clean::PrimitiveType::Unit))
+}
 clean::Tuple(_) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Tuple)),
 clean::QPath(ref data) => {
 if data.self_type.is_self_type()

src/librustdoc/html/static/js/search.js

+71 -29
@@ -260,6 +260,18 @@ function initSearch(rawSearchIndex) {
 * Special type name IDs for searching by both array and slice (`[]` syntax).
 */
 let typeNameIdOfArrayOrSlice;
+/**
+* Special type name IDs for searching by tuple.
+*/
+let typeNameIdOfTuple;
+/**
+* Special type name IDs for searching by unit.
+*/
+let typeNameIdOfUnit;
+/**
+* Special type name IDs for searching by both tuple and unit (`()` syntax).
+*/
+let typeNameIdOfTupleOrUnit;

 /**
 * Add an item to the type Name->ID map, or, if one already exists, use it.
@@ -295,11 +307,7 @@ function initSearch(rawSearchIndex) {
 }

 function isEndCharacter(c) {
-return "=,>-]".indexOf(c) !== -1;
-}
-
-function isErrorCharacter(c) {
-return "()".indexOf(c) !== -1;
+return "=,>-])".indexOf(c) !== -1;
 }

 function itemTypeFromName(typename) {
@@ -585,8 +593,6 @@ function initSearch(rawSearchIndex) {
 throw ["Unexpected ", "!", ": it can only be at the end of an ident"];
 }
 foundExclamation = parserState.pos;
-} else if (isErrorCharacter(c)) {
-throw ["Unexpected ", c];
 } else if (isPathSeparator(c)) {
 if (c === ":") {
 if (!isPathStart(parserState)) {
@@ -616,11 +622,14 @@ function initSearch(rawSearchIndex) {
 }
 } else if (
 c === "[" ||
+c === "(" ||
 isEndCharacter(c) ||
 isSpecialStartCharacter(c) ||
 isSeparatorCharacter(c)
 ) {
 break;
+} else if (parserState.pos > 0) {
+throw ["Unexpected ", c, " after ", parserState.userQuery[parserState.pos - 1]];
 } else {
 throw ["Unexpected ", c];
 }
@@ -661,43 +670,56 @@ function initSearch(rawSearchIndex) {
 skipWhitespace(parserState);
 let start = parserState.pos;
 let end;
-if (parserState.userQuery[parserState.pos] === "[") {
+if ("[(".indexOf(parserState.userQuery[parserState.pos]) !== -1) {
+let endChar = ")";
+let name = "()";
+let friendlyName = "tuple";
+
+if (parserState.userQuery[parserState.pos] === "[") {
+endChar = "]";
+name = "[]";
+friendlyName = "slice";
+}
 parserState.pos += 1;
-getItemsBefore(query, parserState, generics, "]");
+const { foundSeparator } = getItemsBefore(query, parserState, generics, endChar);
 const typeFilter = parserState.typeFilter;
 const isInBinding = parserState.isInBinding;
 if (typeFilter !== null && typeFilter !== "primitive") {
 throw [
 "Invalid search type: primitive ",
-"[]",
+name,
 " and ",
 typeFilter,
 " both specified",
 ];
 }
 parserState.typeFilter = null;
 parserState.isInBinding = null;
-parserState.totalElems += 1;
-if (isInGenerics) {
-parserState.genericsElems += 1;
-}
 for (const gen of generics) {
 if (gen.bindingName !== null) {
-throw ["Type parameter ", "=", " cannot be within slice ", "[]"];
+throw ["Type parameter ", "=", ` cannot be within ${friendlyName} `, name];
 }
 }
-elems.push({
-name: "[]",
-id: null,
-fullPath: ["[]"],
-pathWithoutLast: [],
-pathLast: "[]",
-normalizedPathLast: "[]",
-generics,
-typeFilter: "primitive",
-bindingName: isInBinding,
-bindings: new Map(),
-});
+if (name === "()" && !foundSeparator && generics.length === 1 && typeFilter === null) {
+elems.push(generics[0]);
+} else {
+parserState.totalElems += 1;
+if (isInGenerics) {
+parserState.genericsElems += 1;
+}
+elems.push({
+name: name,
+id: null,
+fullPath: [name],
+pathWithoutLast: [],
+pathLast: name,
+normalizedPathLast: name,
+generics,
+bindings: new Map(),
+typeFilter: "primitive",
+bindingName: isInBinding,
+});
+}
 } else {
 const isStringElem = parserState.userQuery[start] === "\"";
 // We handle the strings on their own mostly to make code easier to follow.
@@ -770,9 +792,11 @@ function initSearch(rawSearchIndex) {
 * @param {Array<QueryElement>} elems - This is where the new {QueryElement} will be added.
 * @param {string} endChar - This function will stop when it'll encounter this
 * character.
+* @returns {{foundSeparator: bool}}
 */
 function getItemsBefore(query, parserState, elems, endChar) {
 let foundStopChar = true;
+let foundSeparator = false;
 let start = parserState.pos;

 // If this is a generic, keep the outer item's type filter around.
@@ -786,6 +810,8 @@ function initSearch(rawSearchIndex) {
 extra = "<";
 } else if (endChar === "]") {
 extra = "[";
+} else if (endChar === ")") {
+extra = "(";
 } else if (endChar === "") {
 extra = "->";
 } else {
@@ -802,6 +828,7 @@ function initSearch(rawSearchIndex) {
 } else if (isSeparatorCharacter(c)) {
 parserState.pos += 1;
 foundStopChar = true;
+foundSeparator = true;
 continue;
 } else if (c === ":" && isPathStart(parserState)) {
 throw ["Unexpected ", "::", ": paths cannot start with ", "::"];
@@ -879,6 +906,8 @@ function initSearch(rawSearchIndex) {

 parserState.typeFilter = oldTypeFilter;
 parserState.isInBinding = oldIsInBinding;
+
+return { foundSeparator };
 }

 /**
@@ -926,6 +955,8 @@ function initSearch(rawSearchIndex) {
 break;
 }
 throw ["Unexpected ", c, " (did you mean ", "->", "?)"];
+} else if (parserState.pos > 0) {
+throw ["Unexpected ", c, " after ", parserState.userQuery[parserState.pos - 1]];
 }
 throw ["Unexpected ", c];
 } else if (c === ":" && !isPathStart(parserState)) {
@@ -1599,6 +1630,11 @@ function initSearch(rawSearchIndex) {
 ) {
 // [] matches primitive:array or primitive:slice
 // if it matches, then we're fine, and this is an appropriate match candidate
+} else if (queryElem.id === typeNameIdOfTupleOrUnit &&
+(fnType.id === typeNameIdOfTuple || fnType.id === typeNameIdOfUnit)
+) {
+// () matches primitive:tuple or primitive:unit
+// if it matches, then we're fine, and this is an appropriate match candidate
 } else if (fnType.id !== queryElem.id || queryElem.id === null) {
 return false;
 }
@@ -1792,7 +1828,7 @@ function initSearch(rawSearchIndex) {
 if (row.id > 0 && elem.id > 0 && elem.pathWithoutLast.length === 0 &&
 typePassesFilter(elem.typeFilter, row.ty) && elem.generics.length === 0 &&
 // special case
-elem.id !== typeNameIdOfArrayOrSlice
+elem.id !== typeNameIdOfArrayOrSlice && elem.id !== typeNameIdOfTupleOrUnit
 ) {
 return row.id === elem.id || checkIfInList(
 row.generics,
@@ -2886,12 +2922,15 @@ ${item.displayPath}<span class="${type}">${name}</span>\
 */
 function buildFunctionTypeFingerprint(type, output, fps) {
 let input = type.id;
-// All forms of `[]` get collapsed down to one thing in the bloom filter.
+// All forms of `[]`/`()` get collapsed down to one thing in the bloom filter.
 // Differentiating between arrays and slices, if the user asks for it, is
 // still done in the matching algorithm.
 if (input === typeNameIdOfArray || input === typeNameIdOfSlice) {
 input = typeNameIdOfArrayOrSlice;
 }
+if (input === typeNameIdOfTuple || input === typeNameIdOfUnit) {
+input = typeNameIdOfTupleOrUnit;
+}
 // http://burtleburtle.net/bob/hash/integer.html
 // ~~ is toInt32. It's used before adding, so
 // the number stays in safe integer range.
@@ -2991,7 +3030,10 @@ ${item.displayPath}<span class="${type}">${name}</span>\
 // that can be searched using `[]` syntax.
 typeNameIdOfArray = buildTypeMapIndex("array");
 typeNameIdOfSlice = buildTypeMapIndex("slice");
+typeNameIdOfTuple = buildTypeMapIndex("tuple");
+typeNameIdOfUnit = buildTypeMapIndex("unit");
 typeNameIdOfArrayOrSlice = buildTypeMapIndex("[]");
+typeNameIdOfTupleOrUnit = buildTypeMapIndex("()");

 // Function type fingerprints are 128-bit bloom filters that are used to
 // estimate the distance between function and query.
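
The two places where `search.js` treats `()` specially (the ID comparison during matching and the collapse before fingerprinting) can be condensed into a standalone sketch. The numeric IDs and the names `ids`, `idMatches`, and `fingerprintInput` are hypothetical stand-ins. In the real code the IDs come from `buildTypeMapIndex`, and the comparison is one step inside a larger unification routine.

```javascript
// Hypothetical stand-ins for the IDs that buildTypeMapIndex would hand out.
const ids = { array: 1, slice: 2, arrayOrSlice: 3, tuple: 4, unit: 5, tupleOrUnit: 6 };

// Sketch of the ID comparison: the "either" IDs accept both concrete kinds,
// and every other query ID must match the function's type ID exactly.
function idMatches(queryId, fnTypeId) {
    if (queryId === ids.arrayOrSlice) {
        return fnTypeId === ids.array || fnTypeId === ids.slice;
    }
    if (queryId === ids.tupleOrUnit) {
        return fnTypeId === ids.tuple || fnTypeId === ids.unit;
    }
    return queryId !== null && queryId === fnTypeId;
}

// Sketch of the collapse applied before hashing into the bloom filter, so
// that a `()` query and a tuple-returning function share the same bits.
function fingerprintInput(id) {
    if (id === ids.array || id === ids.slice) {
        return ids.arrayOrSlice;
    }
    if (id === ids.tuple || id === ids.unit) {
        return ids.tupleOrUnit;
    }
    return id;
}

console.log(idMatches(ids.tupleOrUnit, ids.unit)); // `()` accepts unit
console.log(idMatches(ids.tuple, ids.unit));       // `primitive:tuple` does not
```

Collapsing in the fingerprint keeps the filter from rejecting a candidate too early, while the exact comparison still lets `primitive:tuple` and `primitive:unit` be told apart when the user asks for one specifically.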

tests/rustdoc-js-std/parser-errors.js

+4 -13
@@ -24,7 +24,7 @@ const PARSED = [
 original: "-> *",
 returned: [],
 userQuery: "-> *",
-error: "Unexpected `*`",
+error: "Unexpected `*` after ` `",
 },
 {
 query: 'a<"P">',
@@ -107,23 +107,14 @@ const PARSED = [
 userQuery: "a<::a>",
 error: "Unexpected `::`: paths cannot start with `::`",
 },
-{
-query: "((a))",
-elems: [],
-foundElems: 0,
-original: "((a))",
-returned: [],
-userQuery: "((a))",
-error: "Unexpected `(`",
-},
 {
 query: "(p -> p",
 elems: [],
 foundElems: 0,
 original: "(p -> p",
 returned: [],
 userQuery: "(p -> p",
-error: "Unexpected `(`",
+error: "Unexpected `-` after `(`",
 },
 {
 query: "::a::b",
@@ -204,7 +195,7 @@ const PARSED = [
 original: "a (b:",
 returned: [],
 userQuery: "a (b:",
-error: "Unexpected `(`",
+error: "Expected `,`, `:` or `->`, found `(`",
 },
 {
 query: "_:",
@@ -249,7 +240,7 @@ const PARSED = [
 original: "ab'",
 returned: [],
 userQuery: "ab'",
-error: "Unexpected `'`",
+error: "Unexpected `'` after `b`",
 },
 {
 query: "a->",
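
The new `Unexpected X after Y` messages in these expectations follow from the extra branch added to the parser. The sketch below reproduces that shape outside of rustdoc. `unexpectedChar` and `renderError` are hypothetical helpers, and the rule that odd-indexed parts are wrapped in backticks is an assumption inferred from the expected strings in the tests above.

```javascript
// Hypothetical helper mirroring the error arrays thrown in the diff: include
// the preceding character whenever the offending one is not at position 0.
function unexpectedChar(userQuery, pos) {
    const c = userQuery[pos];
    if (pos > 0) {
        return ["Unexpected ", c, " after ", userQuery[pos - 1]];
    }
    return ["Unexpected ", c];
}

// Assumption: parts at odd indices are user input and get rendered as code.
function renderError(parts) {
    return parts.map((part, i) => (i % 2 === 1 ? "`" + part + "`" : part)).join("");
}

console.log(renderError(unexpectedChar("ab'", 2)));  // Unexpected `'` after `b`
console.log(renderError(unexpectedChar("-> *", 3))); // Unexpected `*` after ` `
```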

tests/rustdoc-js-std/parser-slice-array.js

+18
@@ -266,6 +266,24 @@ const PARSED = [
 userQuery: "]",
 error: "Unexpected `]`",
 },
+{
+query: '[a<b>',
+elems: [],
+foundElems: 0,
+original: "[a<b>",
+returned: [],
+userQuery: "[a<b>",
+error: "Unclosed `[`",
+},
+{
+query: 'a<b>]',
+elems: [],
+foundElems: 0,
+original: "a<b>]",
+returned: [],
+userQuery: "a<b>]",
+error: "Unexpected `]` after `>`",
+},
 {
 query: 'primitive:[u8]',
 elems: [
