Use version-sorting for all sorting #115046
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -99,6 +99,42 @@ fn bar() {} | |
fn baz() {} | ||
``` | ||
|
||
### Sorting | ||
|
||
In various cases, the default Rust style requires sorting items. If not | |
otherwise specified, such sorting should be "version sorting", which ensures | |
that (for instance) `x8` comes before `x16` even though the character `1` comes | |
before the character `8`. (Apart from its handling of sequences of ASCII | |
digits, version-sorting is lexicographical.) | |
|
||
For the purposes of the Rust style, to compare two strings for version-sorting: | ||
|
||
- Compare the strings by (Unicode) character as normal, finding the index of | ||
the first differing character. (If the two strings do not have the same | ||
length, this may be the end of the shorter string.) | ||
- For both strings, determine the sequence of ASCII digits containing either | ||
that character or the character before. (If either string doesn't have such a | ||
sequence of ASCII digits, fall back to comparing the strings as normal.) | ||
- Compare the numeric values of the numbers specified by the two sequences of | |
digits. (Note that an implementation of this algorithm can easily check this | |
without accumulating copies of the digits or converting to a number: after | |
skipping leading zeroes, a longer sequence of digits is a larger number, and | |
equal-length sequences can be compared lexicographically.) | |
- If the numbers have the same numeric value, the one with more leading zeroes | ||
comes first. | ||
Is there a deep reason for this? I think I would have expected the one with more leading zeroes to come later, similar to how with usual lexicographic ordering the longer string (in that case with more trailing characters) comes later.
Roughly, consistency with numeric sorting. Suppose you have 000, 001, 002, ..., 010, 011, ..., 100, 101. We sort those in that order, not for any reason involving the leading zeroes, but because of their numeric value. But the net effect is also that numbers with three leading zeroes come first, then two leading zeroes, then one leading zero, then no leading zeroes. So it seemed consistent, to me, to put "more leading zeroes" first in this case too.
||
|
||
Note that there exist various algorithms called "version sorting", which differ | ||
most commonly in their handling of numbers with leading zeroes. This algorithm | ||
does not purport to precisely match the behavior of any particular other | ||
algorithm, only to produce a simple and satisfying result for Rust formatting. | ||
(In particular, this algorithm aims to produce a satisfying result for a set of | ||
symbols that have the same number of leading zeroes, and an acceptable and | ||
easily understandable result for a set of symbols that has varying numbers of | ||
leading zeroes.) | ||
|
||
As an example, version-sorting will sort the following symbols in the order | ||
given: `x000`, `x00`, `x0`, `x01`, `x1`, `x09`, `x9`, `x010`, `x10`. | ||
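
The comparison steps above can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not rustfmt's implementation: the names `version_cmp`, `digit_run`, and `split_zeroes` are hypothetical, and the final tie-break (falling back to ordinary comparison when both digit sequences are identical) is an assumption, since the text does not say what to do in that case.

```rust
use std::cmp::Ordering;

/// Returns the maximal run of ASCII digits that contains the character at
/// index `i` or the character just before it, if there is such a run.
fn digit_run(s: &[char], i: usize) -> Option<&[char]> {
    let is_digit = |k: usize| s.get(k).map_or(false, |c| c.is_ascii_digit());
    // Pick one position known to lie inside the run.
    let pos = if is_digit(i) {
        i
    } else if i > 0 && is_digit(i - 1) {
        i - 1
    } else {
        return None;
    };
    // Extend the run as far as possible in both directions.
    let mut start = pos;
    while start > 0 && s[start - 1].is_ascii_digit() {
        start -= 1;
    }
    let mut end = pos + 1;
    while end < s.len() && s[end].is_ascii_digit() {
        end += 1;
    }
    Some(&s[start..end])
}

/// Splits a digit run into (number of leading zeroes, remaining digits).
fn split_zeroes(run: &[char]) -> (usize, &[char]) {
    let zeroes = run.iter().take_while(|c| **c == '0').count();
    (zeroes, &run[zeroes..])
}

/// Hypothetical comparison function following the steps described above.
fn version_cmp(a: &str, b: &str) -> Ordering {
    let ac: Vec<char> = a.chars().collect();
    let bc: Vec<char> = b.chars().collect();

    // Step 1: index of the first differing character (possibly the end of
    // the shorter string).
    let diff = ac.iter().zip(bc.iter()).take_while(|(x, y)| x == y).count();
    if diff == ac.len() && diff == bc.len() {
        return Ordering::Equal;
    }

    // Step 2: find the digit run around that index in both strings, or
    // fall back to ordinary comparison if either string has none.
    let (ra, rb) = match (digit_run(&ac, diff), digit_run(&bc, diff)) {
        (Some(ra), Some(rb)) => (ra, rb),
        _ => return ac.cmp(&bc),
    };

    // Step 3: compare numeric values without parsing. After skipping
    // leading zeroes, a longer run is a larger number, and equal-length
    // runs compare lexicographically.
    let (za, sa) = split_zeroes(ra);
    let (zb, sb) = split_zeroes(rb);
    sa.len()
        .cmp(&sb.len())
        .then_with(|| sa.cmp(sb))
        // Step 4: on equal values, more leading zeroes comes first.
        .then_with(|| zb.cmp(&za))
        // Assumption: identical runs fall back to ordinary comparison.
        .then_with(|| ac.cmp(&bc))
}
```

Passing this function to `sort_by`, as in `symbols.sort_by(|a, b| version_cmp(a, b))`, yields the order listed above for those nine symbols. Working on `char` slices keeps the sketch short; an implementation concerned with allocation could walk the two strings with iterators instead.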
|
||
|
||
### [Module-level items](items.md) | ||
|
||
### [Statements](statements.md) | ||
|
Do I presume correctly that the term "strings" here does not restrict the following prescriptions to literal strings? I.e., we want the same algorithm to apply in all other sorting contexts as well (e.g. idents in imports).
Right, the intention of "strings" in "compare two strings for version-sorting" here is in the sense of it being a string to the tool parsing it, not that it's a literal string in the code being parsed.
Suggestions for clearer, more unambiguous wording welcome.